Abstract page for arXiv paper 2605.29442: How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessio...

science tech

arxiv.org · May 29, 2026 04:00 UTC

[2605.29458] Adaptive Interviewing for Persona Simulation in LLMs: Evidence-Grounded Reasoning Improves Decision Alignment

science tech

arxiv.org · May 28, 2026 04:00 UTC

[2605.28188] Framing Matters: Addressing Framing Sensitivity in Decision-Making through Behaviorally-Grounded Value Alignment

science tech

arxiv.org · May 28, 2026 04:00 UTC

[2605.28597] Position: Retire the "Positive Backdoor" Label -- Secret Alignment Requires Strict and Systematic Evaluation

tech

arxiv.org · May 28, 2026 04:00 UTC

[2605.27969] Boundary Suppression Asymmetry in Post-trained Assistants: Over-expansion as a Controllability Cost

science tech

Related entities

ai-safety · other
benchmarks · other
ai · other
federated learning · technology
research · other
arxiv · other
privacy · other
LessWrong · other
risk · other

alignment · AFBytes

Recent coverage