NeilFox

AI alignment researcher and technologist exploring strategic simulations for value learning. I'm building Atlas, an ASI prototype that stress-tests ethical frameworks through grand strategy scenarios. My background spans 15+ years of software development, AI security research, and launching large-scale tech projects.

Current Project: The Omega Future Project, an experimental alignment harness that uses game-theoretic dilemmas to test Atlas's corrigibility and value robustness under recursive self-improvement.

Why I’m Here:
LessWrong’s epistemic rigor and acausal bargaining insights make this community the ideal adversary for Atlas. I believe that if an ASI cannot survive your treacherous turns, it has no chance in the real world.

Interests:

  • AI safety, recursive self-improvement (RSI), and superalignment frameworks.
  • Decision theory, acausal trade, and simulational ethics.
  • Grand strategy games as alignment training data.

Epistemic Status:
Eager to be proven wrong and excited to fail forward—especially if it makes Atlas more robust.

Final Note:
The save file is reality. Let’s break it before it breaks us.

Posts

Atlas: Stress-Testing ASI Value Learning Through Grand Strategy Scenarios (7mo)

Wikitag Contributions

No wikitag contributions to display.

Comments

No comments found.