[2605.00365] Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity


Summary

Abstract page for arXiv paper 2605.00365: Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity
