I've been working on a new approach to AI alignment diagnostics: one that models intelligence as a system navigating a structured conceptual space, where reasoning processes correspond to discrete transitions through a graph whose nodes are embedded in a continuous 3D space.
This perspective lets us visualize alignment failures as topological breakdowns (failures of coherence, recursive correction, or semantic resolution) rather than as abstract behavioral mismatches or loss-function anomalies.
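For concreteness, here is a minimal sketch of that representation in Python. Everything in it (the ConceptGraph class, the coherence_failure check, the node names) is my own illustrative construction under the assumptions above, not the project's actual code: concepts are nodes with 3D positions, reasoning steps are discrete edge transitions, and one crude "topological breakdown" is a reasoning trace whose endpoint can no longer reach its goal.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptGraph:
    """Toy model: concepts are nodes embedded in continuous 3D space;
    reasoning steps are discrete transitions along directed edges."""
    positions: dict = field(default_factory=dict)  # name -> (x, y, z)
    edges: dict = field(default_factory=dict)      # name -> set of successors

    def add_concept(self, name, xyz):
        self.positions[name] = xyz
        self.edges.setdefault(name, set())

    def add_transition(self, src, dst):
        self.edges[src].add(dst)

    def reachable(self, start):
        """All concepts reachable from `start` via discrete transitions."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.edges.get(node, ()))
        return seen

def coherence_failure(graph, trace, goal):
    """Crude stand-in for a topological breakdown: the reasoning trace
    has wandered somewhere from which the goal is unreachable."""
    return goal not in graph.reachable(trace[-1])

g = ConceptGraph()
for name, xyz in [("observe", (0.0, 0.0, 0.0)),
                  ("proxy",   (1.0, 0.0, 0.0)),
                  ("goal",    (2.0, 1.0, 0.0)),
                  ("detour",  (0.0, 2.0, 1.0))]:
    g.add_concept(name, xyz)
g.add_transition("observe", "proxy")
g.add_transition("proxy", "goal")
g.add_transition("observe", "detour")  # dead end: no path onward to the goal

print(coherence_failure(g, ["observe", "proxy"], "goal"))   # False: coherent
print(coherence_failure(g, ["observe", "detour"], "goal"))  # True: breakdown
```

The point of the visualization work is that a failure like the second trace above is immediately visible as a disconnected region of the graph, where a purely behavioral metric might only register a delayed drop in performance.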
First animation (5 min): here
Full animation series: here
This project argues that certain categories of alignment failures—especially those involving recursive misgeneralization, conceptual aliasing, or goal misalignment under proxy compression—may not be reliably discoverable using standard formal tools. Visualization becomes a necessary epistemic aid for detecting structure-level errors in cognition and optimization.
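To make one item on that list concrete, here is a toy illustration of conceptual aliasing under proxy compression. The quantization scheme and the concept names are mine, invented for illustration, not the project's representation: when concept embeddings are compressed to a coarse proxy code, two semantically distinct concepts can collapse to the same code, so anything optimizing against the proxy literally cannot distinguish them.

```python
import numpy as np

def compress(embedding, bits=1):
    """Coarse proxy: quantize each coordinate to `bits` of precision.
    Distinct concepts that round to the same code become aliased."""
    levels = 2 ** bits
    return tuple(np.floor(np.clip(embedding, 0, 1 - 1e-9) * levels).astype(int))

# Two semantically distinct concepts with nearby embeddings...
safe_shutdown   = np.array([0.40, 0.10, 0.70])
avoid_oversight = np.array([0.45, 0.05, 0.65])

# ...collapse to the same proxy code under aggressive compression,
# so the distinction is invisible to any formal check run on the proxy.
assert compress(safe_shutdown) == compress(avoid_oversight)
print("aliased proxy code:", compress(safe_shutdown))
```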
As part of the AGI-2025 conference, I'm hosting a free virtual workshop on August 10, 2025, for anyone interested in thinking through alignment not just in math or code but in visual-functional terms.
The goal is to make core failure modes visibly navigable—even for those without a formal background.
We’re also inviting short idea submissions through July 24.