Posts

Sorted by New

16Meta Alignment: Communication Wack-a-Mole

1mo

5Meta Alignment: Education

9mo

37My Weirdest Experience

Wiki Contributions

Comments

Meta Alignment: Communication Wack-a-Mole

Bridgett Kay1mo10

Thank you.

[April Fools' Day] Introducing Open Asteroid Impact

Bridgett Kay4mo319

We don't know how to align asteroids' trajectories, so it's important to use smaller asteroids to align larger ones- like a very large game of amateur billiards.

LessWrong's (first) album: I Have Been A Good Bing

Bridgett Kay4mo129

I love this! But I find myself a little disappointed there's not a musical rendition of the "I have been a good bing" dialogue.

Can we get an AI to do our alignment homework for us?

Answer by Bridgett KayFeb 28, 202410

As one scales up a system, any small misalignment within that system will become more apparent- more skewed. I use shooting an arrow as an example. Say you shoot an arrow at a target from only a few feet away. If you are only a few degrees off from being lined up with the bullseye, when you shoot the close target your arrow will land very close to the bullseye. However, if you shoot a target many yards away with the same degree of error, your arrow will land much, much farther from the bullseye.

So if you get a less powerful AI aligned with your goals to a degree where everything looks fine, and then assign it the task of aligning a much more powerful AI, then any small flaw in the alignment of the less powerful AI will go askew far worse in the more powerful AI. What's worse- since you assigned the less powerful AI the task aligning the larger AI, you won't be able to see exactly what the flaw was until it's too late, because if you'd been able to see the flaw, you would have aligned the larger AI yourself.

My Weirdest Experience

Bridgett Kay1y40

That seems fairly consistent with what happened to me. I did not experience my entire life in the dream- just the swim meet and the aftermath, and my memories were things I just summoned in the moment, like just coming up with small pieces of a story in real time. The thing that disturbed me the most wasn't living another life- though that was disturbing enough- but the fact that a character in the dream knew a truth that "I" did not.

My Weirdest Experience

Bridgett Kay1y10

I have a similar trick I use with pirouettes- if I can turn and turn without stopping, then it is a dream. Of course, in this dream, I was not a dancer and had never danced, so I didn't even think of it.

Ways I Expect AI Regulation To Increase Extinction Risk

Bridgett Kay1y64

Lately I've been appreciating, more and more, something I'm starting to call "Meta-Alignment." Like, with everything that touches AI, we have to make sure that thing is aligned just enough to where it won't mess up or "misalign" the alignment project. For example, we need to be careful about the discourse surrounding alignment, because we might give the wrong idea to people who will vote on policy or work on AI/AI adjacent fields themselves. Or policy needs to be carefully aligned, so it doesn't create misaligned incentives that mess up the alignment project; the same goes for policies in companies that work with AI. This is probably a statement of the obvious, but it is really a daunting prospect the more I think about it.

The LessWrong 2019 Review

Bridgett Kay4y30

I was just wondering, on the subject of research debt, if there was any sort of system so that people could "adopt" the posts of others. Like say, if someone posts an interesting idea that they don't have the time to polish or expand upon, they could post is somewhere for people who can.

My Weirdest Experience

Bridgett Kay4y20

Yeah- the experience really shook me. I'm prone to fairly vivid and interesting dreams, but this was definitely the strangest.