Throw Fence

Comments

An epistemic advantage of working as a moderate
Throw Fence2mo10

This comment articulates the main thought I had while reading this post. I wonder how Buck is avoiding this very trap, and whether there is any hope at all of the Moderate strategy overcoming this problem.

Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data
Throw Fence2mo10

Am I interpreting you correctly that the responses of both Opus 4 and o3 here are wrong according to the theorem? 

Also, would the following restatement of the theorem be a correct understanding? The student model can never become worse (according to the teacher) when fine-tuned on any outputs from the teacher, on any distribution.

Don’t ignore bad vibes you get from people
Throw Fence9mo1515

If the strategy is vibes-invariant, it's also ignoring useful information. It's not sensible to use an X-invariant strategy unless you believe X carries no information whatsoever. And that's kind of what the OP is arguing, that vibes do carry information. If you disagree with that, argue that directly! Arguing that you can adopt an invariant strategy without tossing away information is not correct or useful. 

Information vs Assurance
Throw Fence11mo30

I wonder if that's why your friend might say something like "Cool, can you pick me up on the way? Any time is good for me."

As in, to release you from the implicit assurance of the specific time.
