I think this is a simple but evocative post that does a solid job of illustrating why one should be worried about near-term transformative AGI. I think some of the conclusions aren't entirely accurate and more thorough analysis would paint a less dire (but still dire!) picture, but I still really like this post, and I think it's probably very effective at getting people into the "transformative AGI is close" camp.
Part of my frustration with you guys is that I'm a layperson. If you were to post your proof of alignment impossibility, I probably wouldn't have the relevant skillset to understand it. So the only thing I can do is rely on others whose expertise I trust to analyze it for me.
Paul is one of those people. He posted multiple times that he would engage with a proof in the previous thread, and I was disappointed that no one replied to him with one. Based on all of his past writing that I have read, I disagree with your assertion that he wouldn't handle it faithfully. One thing Pa...
Being blunt here, you guys triggered a bunch of crank alarms and someone called you and your partner cranks. They called the quacking animal a duck.
The response to this is to let your work do the talking for you, to simply present the proof of your "outlandish" theory and nothing else. If you have a proof that P=NP or a slam-dunk theory of everything, then that should be able to stand on its own. You can't complain about people engaging with things other than your idea if the only thing they can even engage with is your idea.
Instead you have decided to m...
If there is an easy way of fixing it, sure, but I wouldn't devote more than a small amount of mental effort towards thinking up solutions, unless you don't see anywhere else you can plausibly help with alignment. Again, it's just a weird worst-case scenario that could only happen by a combination of incredible incompetence and astronomically bad luck; it's not even close to the default failure scenario.
I get you have anxiety and distress about this idea, but I don't think hyper-focusing on it will help you. Nobody else is going to hyper-focus on it, they'll...
We'll just have to agree to disagree here then. I just don't find this particular worst-case scenario likely enough to worry about. Any AGI with a robust way of dealing with bugs will deal with this, and any AGI without that will far more likely just break or paperclip us.
You seem to have a somewhat general argument against any solution that involves adding onto the utility function: "What if that added solution was bugged instead?" While maybe this can be resolved, I think it's better to move on from trying to directly target the sign-flip problem and instead deal with bugs/accidents in general. After all, the sign-flip is just the very unlikely worst-case-scenario version of that, and any solution to dealing with bugs/accidents in an AGI will also deal with it.
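To make the "deal with bugs in general" point a bit more concrete, here is a toy sketch. Everything in it is hypothetical (`utility`, `buggy_utility`, and `passes_sanity_check` are made-up names, and a real system would look nothing like this); it only shows how a generic regression check that knows nothing about sign-flips in particular would still catch one.

```python
def utility(outcome_quality: float) -> float:
    """Intended (hypothetical) utility: better outcomes score higher."""
    return outcome_quality


def buggy_utility(outcome_quality: float) -> float:
    """The same function with an assumed sign-flip bug."""
    return -outcome_quality


def passes_sanity_check(u) -> bool:
    """Generic check: a known-good outcome must outscore a known-bad one.

    The check is not written with sign-flips in mind; it just compares
    two reference outcomes whose ordering we are confident about.
    """
    known_good, known_bad = 1.0, -1.0
    return u(known_good) > u(known_bad)


if __name__ == "__main__":
    print("intended utility passes:", passes_sanity_check(utility))
    print("sign-flipped utility passes:", passes_sanity_check(buggy_utility))
```

Of course, this says nothing about whether such checks are feasible for an actual AGI; it only illustrates why a general bug-catching process covers the sign-flip case for free.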
Another small silver lining is that we don't have to worry about making sure our alignment tools and processes can generalize, just that they scale. So they can be as tailor-made to GPT as we want. I don't think this buys us much, as making an effective scalable tool in the first place seems like a much harder part than generalizing it.
Agreed, GPT is very alien under the hood even though it's mimicking us, and that poses some problems. I'm curious, however, just how good its mimicry of us is/going to be, more specifically it's mo...
You've given me a lot to think about (and may have even lowered my confidence in some of my assertions). Kudos!
I do still have some thoughts to give in response, but they don't really function as very in-depth responses to your points, as I'm still in the process of ruminating:
I agree with you that GPT-3 probably hasn't memorized the prompts given in the OP, it's too rare for that to be worth it. I just think it's so big and has access to so much data it really doesn't need to solve prompts like that. Take the Navy Seal Copypasta prompts Gwern di
I recognize the points you are making, and I agree, I don't want to be a person who sets an unfeasibly high bar, but with how GPT-3 was developed it's really difficult to put one that isn't near that height. If GPT-3 had instead been made mostly through algorithmic advances rather than mostly through scaling, I'd be a lot more comfortable placing said bar and a lot less skeptical, but it wasn't, and the sheer size of all this is in a sense intimidating.
The source of a lot of my skepticism is GPT-3's inherent inconsistency. It can range wildly from its high-quality output t...
I think you were pretty clear on your thoughts, actually. So, the easy, low-level response to some of your skeptical thoughts would be technical details, so I'm going to start with that and then follow it with a higher-level, more conceptual response.
The source of a lot of my skepticism is GPT-3's inherent inconsistency. It can range wildly from its high-quality output to gibberish, repetition, regurgitation etc. If it did have some reasoning process, I wouldn't expect such inconsistency. Even when it is performing so well people call it &
In a very loosely similar sense (though not at all an architecturally accurate one) to how AlphaGo knows which moves are relevant for playing Go. I wouldn't say it was reasoning. It was just recognizing and predicting.
To give an example: If I were to ask various levels of GPT (perhaps just 2 and 3, as I'm not very familiar with the capabilities of the first version off the top of my head) "What color is a bloody apple?", it would have a list of facts in its "head" about the words "bloody" and "apple", like one can be red or green, one is depicted as various sh...
Great, but the terms you're operating with here are kind of vague. What problems could you give to GPT-3 that would tell you whether it was reasoning, versus "recognising and predicting", passive "pattern-matching" or presenting an "illusion of reasoning"? This was a position I subscribed to until recently, when I realised that every time I saw GPT-3 perform a reasoning-related task, I automatically went "oh, but that's not real reasoning, it could do that just by pattern-matching", and when I saw it do some...
GPT-3 was trained on an astronomical amount of data from the internet, and asking weird hypotheticals is one of the internet's favorite pastimes. I would find it surprising if it was trained on no data resembling your prompts.
There's also the fact that its representations are staggeringly complex. It knows an utterly absurd amount of facts "off the top of its head", including the mentioned facts about muzzle velocity, gravity, etc., and its recognition abilities are great enough to recognize which of the facts it knows are the relevant ones based on the...
Yeah, this sampling stuff brings up arguments about "curating" or "If you rephrase the same question and get a different answer then there is no reasoning/understanding here" which I'm sympathetic to.
I also think categorizing GPT-3's evasiveness, tendency to take serious prompts as joke prompts, etc. as solely the fault of the human is unfair. GPT-3 also shares the blame for failing to interpret the prompt correctly. This is a hard task, obviously, but that just means we have further to go, despite the machine's impressiveness already.
I still haven't been convinced GPT-3 is capable of reasoning, but I'm also starting to wonder if it's even that important. Roughly, all GPT-3 does is examine text, try to find a pattern, and continue it. But it is so massive, and trained on so much data, that the patterns it can "see" and connections it can make are far more expansive than we'd expect. What this means is that while it doesn't try to comprehend any logical questions and then apply some kind of reasoning to answer them, its ability to see patterns combined with its staggeringly huge amount of dat...
Yeah, that part was very impressive. Personally, I'm not sure I get why it requires reasoning; it seems more like just very advanced and incredible recognition and mimicry abilities to me. But I'm just a casual observer of this stuff, so I could easily be missing something. Hopefully, since we can all play around with GPT-3, people continue to push it to its limits, and we can get an accurate picture of what's really going on under the hood, and whether it's really developed some form of reasoning.
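On the sampling point above, a minimal sketch of why the same prompt can produce different answers. This is a generic temperature-sampling toy, not GPT-3's actual decoding code; the `logits` values and the `sample_answer` helper are made up for illustration.

```python
import math
import random


def sample_answer(logits: dict[str, float], temperature: float,
                  rng: random.Random) -> str:
    """Pick one candidate answer from a dict of (hypothetical) scores."""
    if temperature <= 0:
        # Greedy decoding: always return the highest-scoring candidate.
        return max(logits, key=logits.get)
    # Softmax with temperature; subtracting the max keeps exp() stable.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    weights = [math.exp(value - top) for value in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Made-up scores for three candidate answers to the same prompt.
logits = {"red": 2.0, "green": 1.5, "brown": 0.5}
rng = random.Random(0)

greedy = {sample_answer(logits, 0.0, rng) for _ in range(20)}
sampled = {sample_answer(logits, 1.0, rng) for _ in range(200)}

if __name__ == "__main__":
    print("greedy answers:", greedy)
    print("sampled answers:", sampled)
```

With greedy decoding the answer never changes, while sampling at a nonzero temperature returns several different answers from the exact same scores. So answer variation across reruns can come from the decoding procedure alone, which is worth keeping in mind before reading too much (or too little) into it.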
I'd love to know your reasoning here. I've been very impressed with GPT-3, but not to the extent I'd majorly update my timelines.
That's a fair point, but I don't think you need to have a livestreamed event to gain access to professional Starcraft players beyond consultation. I'm sure many would be willing to be flown to DeepMind HQ and test AlphaStar in private.
The "it" I was referring to were these showmatches; I worded that poorly, my bad. I just don't see the point in having these uneven premature showmatches, instead of doing the extra work (which might be a lot) and then having fair showmatches. All the current ones have brought is confusion, and people distrusting DeepMind (due to misleading comments/graphs), and whenever they do finally get it right, it won't get the same buzz as before, because the media and general public will think it's old news that's already been done. Having them now instead of later just seems like a bad idea all around.
There were definitely some complaints around OAI5's capabilities. Besides criticism over its superhuman reaction speed, the restrictions placed upon the game to allow the AI to learn it essentially formed their own weird meta the humans were unfamiliar with, so the AI was using tactics built for an entirely different game than the humans.
Honestly, I haven't been very impressed with either AI through these showmatches, because it's so hard to tell what their "intelligence" level is, as it's influenced heavily by their unfair capabilities. They need to both
I'm a relative layperson, so I honestly don't know. Maybe no new tricks are needed. But if that's the case, why not just do it and not have all this confusion flying about?
These big uneven showmatches do a very poor job of highlighting the state of the art in AI tactics, as the tactics the AI uses seem to be heavily influenced by its "unfair" capabilities. I can't really tell if these agents are generally smart enough to play the full game evenly but use unfair strategies because they're allowed to, or if they're dumber agents who couldn't play the full game evenly so they're propped up by their unfair capabilities and earn victories by exploiting them.
After both Dota and this, I'm starting to come to the conclusion that video games are a poorer testbed for examining the "thinking" capabilities of modern AI. Unless a substantial amount of effort is put in to restrict the AI to being roughly equal "physically" to humans, it becomes very difficult to determine how much of the AI's victories were due to "thinking" or due to being so "physically" superior it pulls off moves humans are simply incapable of doing even if they thought of them (or having other advantages like being able to see all of the visible
Even if something like an electron has some weird degraded form of consciousness, I don't see why I should worry about it suffering. It doesn't feel physical pain or emotional pain, because it lacks the physical processes for both of them, and any talk of it having "goals", and of its failing to reach them meaning it suffers, just reeks of anthropomorphism. I just don't buy it.
I agree that researchers can take shortcuts and develop tricks, but I don't see how that shortens it to something as incredibly short as 1 year, especially since we will be starting with parts that are far worse than their equivalent in the human brain.
"We could assume—by analogy with human brain training in childhood—that to train one model of human mind, at least 1 year of training time is needed (if the computer is running on the same speed as human mind)."
Could you clarify here? I'm no expert, but I'm pretty sure human brains in childhood take a lot longer than a year to learn everything they need to survive and thrive in the real world. And they have a lot more going for them than anything we'll build for the foreseeable future (better learning algorithm, better architecture built by evolution, etc.)