LESSWRONG

Perusha Moodley

Posts

Comments

Induction heads - illustrated
Perusha Moodley, 2y

I'm at the beginning of the MI journey: I read the paper, watched a video, and I am working through the notebooks. I have seen the single-diagram version of this before, but I needed this post to really help me get a feel for how the subspaces and composition work. I think it works well as a stand-alone document, and I feel like it has helped set up some mental scaffolding for the next, more detailed steps I need to take. Thank you for this!

Vulnerability in Trusted Monitoring and Mitigations (15 karma, 3mo, 1 comment)
Feature-Based Analysis of Safety-Relevant Multi-Agent Behavior (9 karma, 4mo, 0 comments)