Trained steering vectors may work as activation oracles — LessWrong

Summary

Inspired by @Eriskii's recent finding that trained steering vectors can teach a base model to act as an assistant, I replaced the Activation Oracle p…
