Thanks for this concise and informative story.

 

A few questions from my side:

There are two common mental models of how situational awareness emerges

Do you have pointers to references here? I'm quite interested in situational awareness myself and would like to read up on the literature on how it emerges.

 

the presence of duplicates can be taken as a signal distinguishing training from deployment

I'm not sure I fully understood this. Are you saying that the described deduplication procedure leaves sequences of length K that differ in only a single token, but removes all exact duplicates of length-K sequences from the training data? Since we expect exact duplicates of length-K sequences to appear during deployment, the model can use their presence as a signal to distinguish training from deployment. Is that right?
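
To make my reading concrete, here is a minimal sketch of what I think is being described. It assumes exact-match deduplication over length-K token sequences; the function names and toy data are hypothetical and are not taken from the post, so please correct me if the actual procedure differs.

```python
from collections import Counter

def dedup_exact(sequences):
    """Keep only the first occurrence of each exact token sequence (assumed procedure)."""
    seen = set()
    kept = []
    for seq in sequences:
        key = tuple(seq)
        if key not in seen:
            seen.add(key)
            kept.append(seq)
    return kept

def count_exact_duplicates(sequences):
    """Count how many sequences repeat an earlier, identical sequence."""
    counts = Counter(tuple(seq) for seq in sequences)
    return sum(c - 1 for c in counts.values())

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy data with K = 4 (hypothetical token IDs).
raw_corpus = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 5], [6, 7, 8, 9]]
train = dedup_exact(raw_corpus)

# A deployment stream is not deduplicated, so exact repeats can occur.
deploy = [[1, 2, 3, 4], [6, 7, 8, 9], [1, 2, 3, 4]]

train_dupes = count_exact_duplicates(train)
deploy_dupes = count_exact_duplicates(deploy)
near_dup_distance = hamming(train[0], train[1])
```

In this toy case the deduplicated training set contains no exact repeats but still holds two sequences that differ in one token, while the deployment stream contains an exact repeat. If that matches your argument, the presence of any exact repeat is the signal the model could pick up on.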

 

PS: You mention 4b in point 1) under the scenario, which I assume refers to your list of assumptions, but that list has no item 4b. Also, consider labeling the assumptions A, B, C or I, II, III so that references to them are not confused with the numbering in the scenario.