lesswrong.com · May 3, 2026 07:53 AM UTC

Paraphrasing Is (At Best) a Partial Defence Against Steganography in LLMs

Summary

Within the AI Safety community, paraphrasing, which, in the context of this post, simply means using another LLM (with nonzero temperature) to rewrit…
