[2605.00155] Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback