LESSWRONG
Perusha Moodley

Posts

Comments

Induction heads - illustrated
Perusha Moodley · 2y

I'm at the beginning of the MI journey: I have read the paper, watched a video, and am working through the notebooks. I had seen the single-diagram version of this before, but I needed this post to really get a feel for how the subspaces and composition work. I think it works well as a stand-alone document, and it has helped me set up some mental scaffolding for the next, more detailed steps I need to take. Thank you for this!

17 · Vulnerability in Trusted Monitoring and Mitigations · 5mo · 1
10 · Feature-Based Analysis of Safety-Relevant Multi-Agent Behavior · 7mo · 0