spacecat
4
I'm an independent researcher with a background in information security and video/content creation. I built Striatica, an interactive 3D visualization tool for exploring SAE feature spaces, and used it to identify several geometric properties in GPT-2 small that aren't visible in existing 2D tools. Paper and tool are both public.
I enjoy building software, which I've been doing a lot of over the past year. I'm also an established cat whisperer and pattern recognizer.
Absolutely fascinating. While I have a list of competing priorities, this is what my mind has kept drifting back to ever since I watched the 80,000 Hours episode with Kyle Fish discussing various instances of Claude behavior. I'm working on getting my 3D interpretability visualization tool running in real time first, and then this "attractor state" area is #1 on my list. (I'd provide a link, but since I'm new, any links lead to auto-rejection, which has been frustrating.)