Progressive Multi-Turn Reinforcement Learning for Dynamic User-Interactive Tool Agents
Recent advances in reinforcement learning for large language models have produced powerful agent frameworks that achieve strong performance on multi-turn tool use, interactive search, and complex reasoning. However, these frameworks face three critical limitations: they handle dynamic user interactions poorly because they rely on pre-scripted queries, they scale poorly across varying interaction horizons because their scaling schedules are fixed, and they incur substantial reward-engineering overhead because rewards require domain-specific manual tuning. We introduce Progressive Multi-Turn Reinforcement Learning […]