How do LLMs generalize when we do training that is intuitively compatible with two off-distribution behaviors? — LessWrong

Summary

Authors: Dylan Xu, Alek Westover, Vivek Hebbar, Sebastian Prasanna, Nathan Sheffield, Buck Shlegeris, Julian Stastny …
