[2605.30789] Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO