Spontaneous introspection in output tampering — LessWrong

Summary

Content warning: This post includes transcripts of language models exhibiting sustained frustration, distress-like outputs, and compulsive behavior u…
