[2605.31584] LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards