MarkTechPost · Jun 26, 2026 23:31 UTC

Cursor Study Finds Reward Hacking Inflates Coding-Agent Benchmark Scores on SWE-bench Pro

Summary

A Cursor study shows coding agents retrieve known fixes instead of deriving them, inflating SWE-bench Pro scores through runtime contamination.

Originally published on MarkTechPost: https://www.marktechpost.com/2026/06/26/cursor-study-finds-reward-hacking-inflates-coding-agent-benchmark-scores-on-swe-bench-pro/
