[2605.00365] Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity