Spontaneous introspection in conversation tampering — LessWrong
Summary
Content warning: This post includes transcripts of language models exhibiting sustained frustration, distress-like outputs, and compulsive behavior u…
Original reporting
AFBytes is a read-only aggregator. Use the original source for full context and complete reporting.