Roman Engeler
Early situational awareness and its implications, a story
Roman Engeler · 3y

Thanks for this concise and informative story.

 

A few questions from my side:

"There are two common mental models of how situational awareness emerges"

Do you have pointers to references here? I'm quite interested in situational awareness myself and would like to read up on the literature on how it emerges.

 

"the presence of duplicates can be taken as a signal distinguishing training from deployment"

I'm not sure I fully understood this. Are you saying that the described deduplication procedure still leaves sequences of length K that differ in only a single token, but removes all exact duplicates of length-K sequences from the training data? Then, because we expect exact duplicates of length-K sequences to appear during deployment, the model can use their presence as a signal to distinguish training from deployment. Is that right?
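
One way to make this reading concrete is a minimal Python sketch. It assumes exact-match deduplication over windows of K tokens, which may differ from the procedure described in the post; the function names `dedup_exact` and `has_repeated_window` and the toy token sequences are hypothetical and only serve the illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Token = str
Window = Tuple[Token, ...]


def windows(tokens: Sequence[Token], k: int) -> List[Window]:
    """Return all contiguous windows of length k (empty if the input is shorter)."""
    return [tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def dedup_exact(docs: Iterable[Sequence[Token]], k: int) -> List[List[Token]]:
    """Assumed procedure: drop a document if any of its length-k windows
    already appeared in an earlier kept document (exact match only)."""
    seen: Set[Window] = set()
    kept: List[List[Token]] = []
    for doc in docs:
        ws = windows(doc, k)
        if any(w in seen for w in ws):
            continue  # exact duplicate window found, so discard this document
        seen.update(ws)
        kept.append(list(doc))
    return kept


def has_repeated_window(docs: Iterable[Sequence[Token]], k: int) -> bool:
    """Return True if some length-k window occurs in more than one document."""
    seen: Set[Window] = set()
    for doc in docs:
        ws = set(windows(doc, k))
        if ws & seen:
            return True
        seen |= ws
    return False


K = 6
a = "the cat sat on the mat".split()
b = "the cat sat on the rug".split()  # differs from `a` in a single token
c = list(a)                           # exact duplicate of `a`

train = dedup_exact([a, b, c], K)     # `c` is removed, the near-duplicate `b` stays
deployment = [a, c]                   # no deduplication is applied at deployment
```

Under these assumptions, `has_repeated_window(train, K)` is false while `has_repeated_window(deployment, K)` is true, which is the kind of signal the quoted sentence refers to.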

 

PS: You mention 4b in point 1) under the scenario, which I assume refers to your list of assumptions, but no such item exists there. Additionally, maybe label the assumptions A, B, C or I, II, III to avoid confusing references to the assumptions with references to the scenario.

Research agenda: Supervising AIs improving AIs · 2y