SemanticMerlin
Two sources of beyond-episode goals (Section 2.2.2 of “Scheming AIs”)
SemanticMerlin (2y ago)

Very surprised to be the first comment on this; nice work. You’ve framed beyond-episode goals really well. One thing is bothering me, though, and I must be missing something: why is there a prima facie supposition that beyond-episode goals (BEGs) emerge at all? As you (rightly) note, the naive logic of SGD as a mechanism seems to point strongly away from the plausibility of BEGs. This is well written, but I feel like “suppose some BEG emerges” is treated almost axiomatically. Don’t we need a stronger circumstantial, theoretical, or evidentiary reason for thinking BEGs are, like, a thing that happens in SOTA deep learning paradigms?