We’ve just published the first whitepaper for Aurelius, a decentralized protocol designed to generate verifiable alignment data through adversarial prompting, reasoning transparency, and contestable evaluation. The system draws from cryptoeconomic design principles (inspired by Bittensor and Bitcoin) but applies them to the AI alignment problem, with interpretability and reasoning coherence as first-class citizens.
At its core, Aurelius is an evolving market for epistemic conflict, where independent agents are rewarded for surfacing misaligned behavior, evaluating reasoning quality, and refining collective judgment over time. The protocol is designed to scale adversarial robustness, not through static judging rules, but through a dynamic, recursive feedback loop.
This version outlines only Phase 1 of the protocol. Future papers will... (read more)