REMA: Learning to Meta-Think for LLMS with Multi-Agent Reinforcement Learning

Published in Arxiv, 2025