mfatt

Trans-Humeanism. The Problem of Induction Revisited

I'm writing this up as a quick sketch of an argument that I don't think anyone has explicitly made yet. I am about to start the PIBBSS Fellowship so won't have time to develop it fully, but I believe it could give a useful perspective on why alignment is a...

May 28

ARC's "Outperforming Random Sampling" explained

Written as part of a FIG Fellowship under Eleni Angelou's supervision. I've spent some time with ARC's recent blog post, Competing with Random Sampling. I think it contains some interesting ideas. Unfortunately, those ideas are captured in formalisms that might intimidate anyone without the patience for some mathematics. So here's...

May 28

Substrate: Formalism

by Vardhan and mfatt

This is the third post in a sequence on substrates - the layers of computational context that allow AI to be implemented in real systems. The sequence expands on the concept of substrates as described in this paper and was written as part of the AI Safety Camp project "MoSSAIC:...

Apr 26

Substrate-Sensitivity

This is the second post in a sequence that expands upon the concept of substrates as described in this paper. It was written as part of the AI Safety Camp project "MoSSAIC: Scoping out Substrate Flexible Risks," one of the three projects associated with Groundless. We now argue that the...

Apr 25

Substrate: Intuitions

by Vardhan and mfatt

This post and the related sequence were written as part of the AI Safety Camp project "MoSSAIC: Scoping out Substrate Flexible Risks." This was one of the three projects supported by, and continuing the work of, Groundless. Specifically, it develops one of the key concepts referred to in the original...

Apr 25

Management of Substrate-Sensitive AI Capabilities (MoSSAIC) Part 3: Resolution

The previous two posts have emphasized some problematic scenarios for mech-interp. Mech-interp is our example of a more general problem in AI safety. In this post we zoom out to that more general problem, before proposing our solution. We can characterize the more general problem, inherent in the causal–mechanistic paradigm,...

Dec 5, 2025

Management of Substrate-Sensitive AI Capabilities (MoSSAIC) Part 2: Conflict

The previous post highlighted some salient problems for the causal–mechanistic paradigm we sketched out. Here, we'll expand on this with some plausible future scenarios that further weaken the paradigm's reliability in safety applications. We first briefly refine our critique and outline the scenario progression. Outline: We contend that the causal–mechanistic...

Dec 4, 2025

Management of Substrate-Sensitive AI Capabilities (MoSSAIC) Part 0: Overture

Management of Substrate-Sensitive AI Capabilities (MoSSAIC) Part 1: Exposition