[2605.30789] Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

Read full story on arxiv.org

Summary

Abstract page for arXiv paper 2605.30789: Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO
