Paraphrasing Is (At Best) a Partial Defence Against Steganography in LLMs — LessWrong

Summary

Within the AI Safety community, paraphrasing, which in this post simply means using another LLM (with nonzero temperature) to rewrit…
