[1] Ronald D. Peters, Yimingdong Cao, and Mahesh Yadav, “Meta-Reflective Reinforcement Learning for Adaptive Decision-Making in Tool-Using LLM Systems”, CSET, vol. 4, no. 1, May 2026.