There's a well-worn story you've probably heard before about looking for keys under a light post. If you've not heard it before:
A man leaves the bar after a long night of drinking. He stumbles all the way home, only to find when he gets there that he doesn't have his keys. He knows he had them when he left the bar, so he must have dropped them on the way home. He goes out to look for them, but it's the night of a new moon, far too dark to see anything except where there's artificial lighting, so he spends all his effort searching under the sole light post between his home and the bar. After a while a police officer stops to question him.
"Sir. Sir! What are you doing?" shouts the officer.
"Looking for my keys," says the drunk man.
"Did you drop them around here?"
"I don't know."
"Then why are you looking here?"
"Because this is where the light is."
The "broke" interpretation of the story is that the drunk man is stupid to look for his keys under the light post, because the location of his lost keys is unlikely to be correlated with where the light is good enough to search. The "woke" interpretation is that he's right to spend his time looking where the light is: he has only a very small chance of finding his keys in the dark, so even though searching the lit area makes success conditional on his having dropped his keys there, it still maximizes his expected probability of finding them.
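The "woke" interpretation is just a comparison of expected values. A minimal sketch with made-up numbers (all probabilities here are purely hypothetical, chosen only to illustrate the shape of the argument):

```python
# Toy expected-value comparison for the streetlight problem.
# All probabilities below are hypothetical illustration values.

p_keys_under_light = 0.1   # keys happen to lie in the small lit area
p_keys_in_dark = 0.9       # keys lie somewhere along the dark stretch

p_find_if_lit = 0.9        # chance of spotting keys where you can see
p_find_if_dark = 0.02      # chance of stumbling on keys while blind

# Expected probability of success under each search strategy.
ev_search_light = p_keys_under_light * p_find_if_lit   # 0.1 * 0.9  = 0.09
ev_search_dark = p_keys_in_dark * p_find_if_dark       # 0.9 * 0.02 = 0.018

print(ev_search_light, ev_search_dark)  # 0.09 beats 0.018
```

With these numbers, searching the lit area wins even though the keys are probably somewhere else, because the search is so much more effective where you can see.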
I'm here to stretch the metaphor thin and give you the "bespoke" interpretation: many of us are searching for answers that are probably out in the dark, but we should look under the light post anyway, both because that's where the light is and because different people have different light posts to look under, and they can call out to us if they find the answer.
We often face situations in real life analogous to looking for our keys in the dark while drunk. For example, maybe we want to know where to donate our money, what to do on vacation, or how to prevent existential catastrophe. In each case we're trying to find something (a charity, a decision, a solution) and have some ideas about where to look but don't have time or the ability to explore the entire solution space. A typical strategy is to try to "shed light" on the problem and its solutions so we have more information to lead us to a solution.
But shedding light can only take us so far, because the solution space is large and we're small. In the case of the man looking for his keys, it might be 8 blocks from the bar to his house, and that's a lot of ground to cover. Even if he can make the world a little brighter, say by carrying a flashlight or being lucky enough to be drunk on the night of a full moon, he will only slightly improve his ability to look for his keys over the full 8 blocks and the task will still be easiest if his keys are under the light post.
Luckily for us and unluckily for our protagonist, we're not alone in our endeavors while he is. Everyone has their own metaphorical light post to look under created by their interests and abilities, and we can call out to others if we find something under ours. For example, by choice and accident I know a lot about (compared to the average person):
and many other things besides. In contrast, I live with Sarah Constantin (among other folks), who knows comparatively more than me or the average person about:
- analysis (in a mathematical sense)
- literature reviews
and many other things besides. So if Sarah and I go out to search for the metaphorical key to the best vacation spot, we have very different places to look based on where the light is best for each of us. I'm not sure either of us is especially well equipped to find the answer given the skills that are probably necessary, but no matter: if it's important enough for us to look, then we'll look where the light is and do our best within the area we can see. Maybe we'll decide pretty quickly that we don't see the answer anywhere we have enough light to look, and we'll ask someone else. But that's merely asking the other person to look where they have enough light, based on where we expect the answer to be; the situation is no different, except that the search space is now better illuminated by the lights of others.
So what about when we don't know where the answer might be and don't know who to ask to look for us? That's the situation I find interesting, because it seems to be the situation we're in with AI alignment.
We sort of know what we mean when we say we want to solve AI alignment, but we don't know how to do it. It feels like the answer is somewhere out there in the dark, and some folks have ideas about where in the dark it might be, so they are building light posts to help us search the space. But you can only build new light posts where there's already enough light to work, so we are limited by where the people looking have light now. This suggests a reasonable strategy given the high stakes (we really want to find a solution) and the uncertainty about what will work: get more people to look under their own light posts for answers. If enough of us work together to cover enough ground, no one of us is especially likely to find the solution, but it becomes much more likely that someone will.
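The claim that many searchers beat any individual searcher is just the standard "at least one success" calculation. As a sketch, assuming (hypothetically) that each searcher has a small, independent chance of success:

```python
# Probability that at least one of n independent searchers succeeds,
# given each has individual success probability p. The value of p
# below is a hypothetical stand-in, not an estimate of anything real.

def p_someone_finds(p: float, n: int) -> float:
    """Return 1 - (1 - p)**n: the chance at least one searcher succeeds."""
    return 1 - (1 - p) ** n

p = 0.01  # any one person's chance: small
print(p_someone_finds(p, 1))    # 0.01
print(p_someone_finds(p, 100))  # ~0.63
print(p_someone_finds(p, 500))  # ~0.99
```

Independence is a strong assumption here (people's light posts overlap), but the qualitative point survives: adding searchers with different vantage points raises the collective odds far above any individual's.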
There are of course caveats, because I've stretched this light post metaphor farther than I should have: AI alignment probably doesn't have a simple solution one person can find; it will probably take many people working on many subproblems, combining their efforts into a complete solution; there may be no solution at all; and there may be a solution we already know about that we just haven't figured out how to assemble yet. These aren't really problems with the point I want to make, but with the metaphor; maybe if I didn't find the light post so evocative of the situation I would have thought of a better one.
So, to conclude in less poetic language in case I've lost you with all this talk of drunk men searching for keys in the night, my point is this: if you want to work on AI alignment, work in the places you are well suited to work, on the problems you are well suited to work on. Not just any problem you can rationalize as possibly being relevant, but the problems you believe are relevant to solving AI alignment, and especially the relevant problems others are currently neglecting. You'll probably end up doing something that won't matter, either because it's not actually relevant or because it's superseded by something else, but so will everyone else, and that's okay: if enough of us look, we're more likely to find something that matters.