Spencer Ericson
Spencer Ericson has not written any posts yet.

It sounds like you're asking why inner alignment is hard (or maybe why it's harder than outer alignment?). I'm pretty new here -- I don't think I can explain that any better than the top posts in the tag.
Re: o1, it's not clear to me that o1 is an instantiation of a creator's highly specific vision. It seems more like we tried something, didn't know exactly where it would end up, and it happened to end up somewhere useful. Nobody planned in advance exactly what o1 would be good or bad at, or to what extent. By contrast, if you were copying a human, you'd have to be far more careful to consider and reproduce a lot of specific details.
I would mostly disagree with the implication here:
IF you can make a machine that constructs human-imitator-AI systems,
THEN AI alignment in the technical sense is mostly trivialized and you just have the usual political human-politics problems plus the problem of preventing anyone else from making superintelligent black box systems.
I would say sure, it seems possible to make a machine that imitates a given human well enough that I couldn't tell them apart -- maybe forever! But just because it's possible in principle doesn't mean we are anywhere close to doing it, to knowing how to do it, or even to knowing how to figure out how to do it.
Maybe an aside: If we could align an AI... (read more)
- even if we only take people with bipolar disorder: how the hell can they go on so few hours of sleep a night, with their brain being manic, without simply breaking down?
Just wanted to tune in on this from anecdotal experience:
My last ever (non-iatrogenic) hypomanic episode started unprompted. But I was terrified of falling back into depression! My solution was to try to stave off the depression by extending my hypomania as long as possible.
How did I do this? By intentionally not sleeping and by drinking more coffee (essentially doing the opposite of whatever the internet said stabilized hypomanic patients). I had a strong intuition that this would work. (I also... (read more)
Hi Alistair! You might want to look into more strategic ways of planning activism work. It's true that many social movements first become visible through protests, but a lot of background work goes into a protest, or into any activism.
It looks like your goal is to slow down AI development.
First, you'll want a small working group who can help you develop your message, analyses, and tactics. A few of your colleagues who are deeply concerned about AI risk would work. When planning most things, it's helpful to have people who can temper your impulses and give you more ideas.
I see that you want to "Develop clear message, and demands, and best... (read 420 more words →)
Right, I agree on that. The problem is: "behaves indistinguishably" for how long? You can't guarantee that it won't stop behaving that way in the future, which is exactly what deceptive alignment predicts.