teo456

2

3

1y

My AGI safety research—2025 review, ’26 plans

teo456 6mo 1-2

Plan type 3: Does the brain have any self-aware course-correcting mechanism? The Anterior Cingulate Cortex?

Perhaps we find a way of proving the inner goal of small NNs, then train these NNs to evaluate and align the emergent goals and subgoals of other AI systems. We could then deploy them as modular 'safety workers' that are integrated into AGI systems to monitor, evaluate, and align the goals of the entire system.

But then I guess you'd really have to build these safety workers into the larger system from the very beginning, and in a way that gives them some kind of architectural advantage.

Excerpts from my neuroscience to-do list

teo456 7mo 30

Worth posting, I think! I don't yet have the skills to carry out any of these effectively, but I can see a couple of them being useful projects for me to try my hand at eventually.