[2604.26511] Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

Summary

Abstract page for arXiv paper 2604.26511: Tatemae: Detecting Alignment Faking via Tool Selection in LLMs
