When choosing between two moves that are both judged to win the game with probability 0.9999999, AlphaGo's not choosing the move that maximizes points suggests that it does not use patterns about what optimal moves are in certain local situations to make its judgements.
I nitpick/object to your use of "optimal moves" here. The move that maximizes points is NOT the optimal move; the optimal move is the move that maximizes win probability. In a situation where you are many points ahead, plausibly the way to maximize win probability is not to try to get more points, but rather to try to anticipate and defend against weird crazy high-variance strategies your opponent might try.
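To make the distinction concrete, here's a toy sketch (my own illustration, with made-up move names and numbers) of how the two objectives can disagree: the move that maximizes expected points is not the move that maximizes win probability.

```python
# hypothetical move evaluations: (expected point margin, win probability)
moves = {
    "grab_points": (10.0, 0.9999990),
    "play_safe":   (1.0,  0.9999999),
}

# a points-maximizer and a win-probability-maximizer pick different moves
best_by_points = max(moves, key=lambda m: moves[m][0])
best_by_winprob = max(moves, key=lambda m: moves[m][1])
print(best_by_points, best_by_winprob)  # grab_points play_safe
```

On this view "play_safe" is the optimal move despite scoring fewer points, because it better guards against high-variance opponent strategies.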
It would be interesting to see this post updated, e.g. to describe the situation today or (even better) how it evolved over the course of 2020-2021.
I think I get this distinction; I realize the NN papers show the latter; I guess our disagreement is about how big a deal / how surprising this is.
Nice post! You may be interested in this related post and discussion.
I think you may have forgotten to put a link in "See Mesa-Search vs Mesa-Control for discussion."
Ah, OK. Interesting, thanks. Would you agree with the following view:
"The NTK/GP stuff has neural nets implementing a "pseudosimplicity prior" which is maybe also a simplicity prior but might not be, the evidence is unclear. A pseudosimplicity prior is like a simplicity prior except that there are some important classes of Kolmogorov-simple functions that don't get high prior / high measure."
Which would you say is more likely: (a) The NTK/GP stuff is indeed not universally data efficient, and thus modern neural nets aren't either, or (b) the NTK/GP stuff is indeed not universally data efficient, and thus modern neural nets aren't well-characterized by the NTK/GP stuff.
Feature learning requires the intermediate neurons to adapt to structures in the data that are relevant to the task being learned, but in the NTK limit the intermediate neurons' functions don't change at all. Any meaningful function like a 'car detector' would need to be there at initialization -- extremely unlikely for functions of any complexity.
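A toy experiment (my own sketch, not from the NTK papers) illustrates this "lazy training" point: under NTK-style 1/√width scaling, the relative change in the hidden-layer weights over training shrinks roughly like 1/√width, so for very wide nets the intermediate neurons' functions barely move from initialization.

```python
import numpy as np

def relative_weight_change(width, steps=100, lr=0.1, seed=0):
    """Train f(x) = a . relu(Wx) / sqrt(width) on a tiny regression task
    and return ||W_final - W_init|| / ||W_init|| (hidden-layer movement)."""
    rng = np.random.default_rng(seed)
    d, n = 5, 10
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)
    W = rng.standard_normal((width, d))
    a = rng.choice([-1.0, 1.0], size=width)  # output weights, frozen
    W0 = W.copy()
    for _ in range(steps):
        h = X @ W.T                           # (n, width) pre-activations
        pred = (np.maximum(h, 0) @ a) / np.sqrt(width)
        err = pred - y
        # gradient of mean squared error w.r.t. the hidden weights W
        grad = ((err[:, None] * (h > 0)) * a).T @ X / (n * np.sqrt(width))
        W -= lr * grad
    return np.linalg.norm(W - W0) / np.linalg.norm(W0)

narrow = relative_weight_change(width=50)
wide = relative_weight_change(width=5000)
print(narrow, wide)  # the wide net's hidden weights move far less, relatively
```

In the infinite-width limit this relative movement goes to zero, which is why any "car detector"-like feature would already have to be present at initialization.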
I used to think it would be extremely unlikely for a randomly initialized neural net to contain a subnetwork that performs just as well as the entire neural net does after training. But the mu... (read more)
Sorry I didn't notice this earlier! What do you think about the argument that Joar gave?
If a function is small-volume, it's complex, because it takes a lot of parameters to specify.
If a function is large-volume, it's simple, because it can be compressed a lot since most parameters are redundant.
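The volume picture above can be probed directly with a sampling sketch (my own toy illustration, not from the papers under discussion): estimate the parameter-space "volume" of each boolean function on 3 inputs by sampling random weights for a tiny net and counting how often each function appears. Large-volume (simple) functions show up far more often than the uniform baseline of 1/256.

```python
import itertools
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
# all 8 inputs in {-1, +1}^3
X = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))

def sample_function():
    """Random tiny net sign(a . tanh(Wx + b)); return its truth table."""
    W = rng.standard_normal((4, 3))
    b = rng.standard_normal(4)
    a = rng.standard_normal(4)
    out = np.sign(np.tanh(X @ W.T + b) @ a)
    return tuple(out.astype(int))

counts = Counter(sample_function() for _ in range(20000))
top_fn, top_count = counts.most_common(1)[0]
# estimated volume (prior probability) of the most common function,
# vs. the 1/256 it would get if all truth tables were equally likely
print(top_count / 20000)
```

The frequency of a function here is exactly its volume under the initialization measure, which is the sense in which "large-volume" and "high prior" coincide.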
It sounds like you are saying: Some small-volume functions are actually simple, or at least this might be the case for all we know, because maybe it's just really hard for neural networks to efficiently represent that function. This is especially... (read more)
The funnest one off the top of my head is how Yudkowsky used to think that the best thing for altruists to do was build AGI as soon as possible, because that's the quickest way to solve poverty, disease, etc. and achieve a glorious transhuman future. Then he thought more (and talked to Bostrom, I was told) and realized that that's pretty much the exact opposite of what we should be doing. When MIRI was founded its mission was to build AGI as soon as possible.
(Disclaimer: This is the story as I remember it being told, it's entirely possible I'm wrong)
My counterfactual attempts to get at the question "Holding ideas constant, how much would we need to increase compute until we'd have enough to build TAI/AGI/etc. in a few years?" This is (I think) what Ajeya is talking about with her timelines framework. Her median is +12 OOMs. I think +12 OOMs is much more than 50% likely to be enough; I think it's more like 80% and that's after having talked to a bunch of skeptics, attempted to account for unknown unknowns, etc. She mentioned to me that 80% seems plausible to her too but that sh... (read more)
Hmm, I don't count "It may work but we'll do something smarter instead" as "it won't work" for my purposes.
I totally agree that noise will start to dominate eventually... but the thing I'm especially interested in with Amp(GPT-7) is not the "7" part but the "Amp" part. Using prompt programming, fine-tuning on its own library, fine-tuning with RL, making chinese-room-bureaucracies, training/evolving those bureaucracies... what do you think about that? Naively the scaling laws would predict that we... (read more)
When you say hardware progress, do you just mean compute getting cheaper or do you include people spending more on compute? So you are saying, you guess that if we had 10 OOMs of compute today that would have a 50% chance of leading to human-level AI without any further software progress, but realistically you expect that what'll happen is we get +5 OOMs from increased spending and cheaper hardware, and then +5 "virtual OOMs" from better software?
Thanks for the thoughtful reply. Here are my answers to your questions:
Here is what you say in support of your probability judgment of 10% on "Conditional on it being both possible and strongly incentivized to build APS systems, APS systems will end up disempowering approximately all of humanity."
Beyond this, though, I’m also unsure about the relative difficulty of creating practically PS-aligned systems, vs. creating systems that would be practically PS-misaligned, if deployed, but which are still superficially attractive to deploy. One comm
I agree with Zach above about the main point of the paper. One other thing I’d note is that SGD can’t have literally the same outcomes as random sampling, since random sampling wouldn’t display phenomena like double descent (AN #77).
Would you mind explaining why this is? It seems to me like random sampling would display double descent. For example, as you increase model size, at first you get more and more parameters that let you approximate the data better... but then you get too many parameters and just start memorizing the data... ... (read more)
I'll confess that I would personally find it kind of disappointing if neural nets were mostly just an efficient way to implement some fixed kernels, when it seems possible that they could be doing something much more interesting -- perhaps even implementing something like a simplicity prior over a large class of functions, which I'm pretty sure NTK/GP can't be doing.
Wait, why can't NTK/GP be implementing a simplicity prior over a large class of functions? They totally are, it's just that the prior comes from the measure in random initia... (read more)
Well, it seems to be saying that the training process basically just throws away all the tickets that score less than perfectly, and randomly selects one of the rest. This means that tickets which are deceptive agents and whatnot are in there from the beginning, and if they score well, then they have as much chance of being selected at the end as anything else that scores well. And since we should expect deceptive agents that score well to outnumber aligned agents that score well... we should expect deception.
I'm working on a much more fleshed out and expanded version of this argument right now.
Pinging you to see what your current thoughts are! I think that if "SGD is basically equivalent to random search" then that has huge, huge implications.
I think Abram's concern about the lottery ticket hypothesis wasn't about the "vanilla" LTH that you discuss, but rather the scarier "tangent space hypothesis." See this comment thread.
I think universal paywalls would be much better. Consider how video games typically work: You pay for the game, then you can play it as much as you like. Video games sometimes try to sell you things (e.g. political ideologies, products) but there is vastly less of that than e.g. youtube or facebook, what with all the ads, propaganda, promoted content, etc. Imagine if instead all video games were free, but to make money the video game companies accepted bribes to fill their games with product placement and propaganda. I would not prefer that world, even tho... (read more)
Is it really true that most people sympathetic to short timelines are so mainly due to a social proof cascade? I don't know any such person myself; the short-timelines people I know either have thought about it a ton and developed detailed models, or just got super excited about GPT-3 and recent AI progress. The people who like to defer to others pretty much all have medium or long timelines, in my opinion, because that's the respectable/normal thing to think.
Welcome! I recognize your username, we must have crossed paths before. Maybe something to do with SpaceX?
My guess is: Regulation. It would be illegal to build and rent out nano-apartments. (Evidence: In many places in the USA, it's illegal for more than X people not from the same family to live together, for X = 4 or something ridiculously small like that.)
To add a bit more detail to your comment, this form of housing used to exist in the form of single room occupancy (SRO) buildings, where people would rent a single room and share bathroom and kitchen spaces. Reformers and planners started efforts to ban this form of housing starting around the early 20th century. From Wikipedia:
By the 1880s, urban reformers began working on modernizing cities; their efforts to create "uniformity within areas, less mixture of social classes, maximum privacy for each family, much lower density for many activities, buildings
Welcome! It's people like you (and perhaps literally you) on which the future of the world depends. :)
Wait... you started using the internet in 2006? Like, when you were 5???
Thanks! 2006 is what I remember, and what my older brother says too. I was 5 though, so the most I got out of it was learning how to torrent movies and Pokemon ROMs until like 2008, when I joined Facebook (at the time to play an old game called FarmVille).
I'd be interested to see naturalism spelled out more and defended against the alternative view that (I think) prevails in this community. That alternative view is something like: "Look, different agents have different goals/values. I have mine and will pursue mine, and you have yours and pursue yours. Also, there are rules and norms that we come up with to help each other get along, analogous to laws and rules of etiquette. Also, there are game-theoretic principles like fairness, retribution, and bullying-resistance that are basically just good ... (read more)
From another point of view: some philosophers are convinced that caring about conscious experiences is the rational thing to do. If it's possible to write an algorithm that works in a similar way to how their mind works, we already have an (imperfect, biased, etc.) agent that is somewhat aligned, and is likely to stay aligned after further reflection.
I think this is an interesting point -- but I don't conclude optimism from it as you do. Humans engage in explicit reasoning about what they should do, and they theorize and systematize, and some o... (read more)
Thanks for this! This definitely does intersect with my interests; it's relevant to artificial intelligence and to ethics. It does mostly just confirm what I already thought though, so my reaction is mostly just to pay attention to this sort of thing going forward.
I'm very glad to hear that! Can you say more about why?
Natural language has both noise (that you can never model) and signal (that you could model if you were just smart enough). GPT-3 is in the regime where it's mostly signal (as evidenced by the fact that the loss keeps going down smoothly rather than approaching an asymptote). But it will soon get to the regime where there is a lot of noise, and by the time the model is 9 OOMs bigger I would guess (based on theory) that it will be overwhelmingly noise and training will be very expensive.
So it may or may not work in the sense of meeting some absolute performance threshold, but it will certainly be a very bad way to get there and we'll do something smarter instead.
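The "signal until you hit the noise floor" picture corresponds to loss curves of the form L(N) = (N_c/N)^α + L_inf, where L_inf is the irreducible (noise) entropy. A quick sketch of the shape (the constants below are illustrative placeholders, not fitted values):

```python
def loss(n_params, n_c=8.8e13, alpha=0.076, l_inf=1.69):
    """Power-law-plus-floor scaling curve: (N_c / N)^alpha + L_inf.
    The constants are illustrative placeholders, not fitted values."""
    return (n_c / n_params) ** alpha + l_inf

# each +3 OOMs of model size buys a shrinking absolute improvement,
# and the loss approaches the irreducible noise floor l_inf
for oom in range(0, 10, 3):
    n = 1e9 * 10 ** oom
    print(f"N = 1e{9 + oom}: loss = {loss(n):.3f}")
```

Early on, most of the gap to L_inf is learnable signal and scaling pays off smoothly; once the curve nears the floor, further OOMs mostly buy modeling of noise, which is the sense in which training becomes "very expensive" per unit of useful improvement.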
Probably, when we reach an AI-induced point of no return, AI systems will still be "brittle" and "narrow" in the sense used in arguments against short timelines.
Argument: Consider AI Impacts' excellent point that "human-level" is superhuman (bottom of this page).
The point of no return, if caused by AI, could come in a variety of ways that don't involve human-level AI in this sense. See this post for more. The general idea is that being superhuman at some skills can compensate for being subhuman at others. We should expect the point of no return to be reache... (read more)
Thanks for this! I like your concept of APS systems; I think I might use that going forward. I think this document works as a good "conservative" (i.e. optimistic) case for worrying about AI risk. As you might expect, I think the real chances of disaster are higher. For more on why I think this, well, there are the sequences of posts I wrote and of course I'd love to chat with you anytime and run some additional arguments by you.
For now I'll just say: 5% total APS risk (seems to me to) fail a sanity check, as follows:
1. There's at... (read more)
Thanks for this reply!
--I thought the paper about the methods of neuroscience applied to computers was cute, and valuable, but I don't think it's fair to conclude "methods are not up to the task." But you later said "It makes a lot of sense to me that the brain does something resembling belief propagation on bayes nets. (I take this to be the core idea of predictive coding.)" So you aren't a radical skeptic about what we can know about the brain; maybe we don't disagree after all.
1 - 3: OK, I think I'll ... (read more)
Part of my idea for this post was to go over different versions of the lottery ticket hypothesis, as well, and examine which ones imply something like this. However, this post is long enough as it is.
I'd love to see you do this!
Re: The Treacherous Turn argument: What do you think of the following spitball objections:
(a) Maybe the deceptive ticket that makes T' work is indeed there from the beginning, but maybe it's outnumbered by 'benign' tickets, so that the overall behavior of the network is benign. This is an argument against ... (read more)
In this post I argued that an AI-induced point of no return would probably happen before world GDP starts to noticeably accelerate. You gave me some good pushback about the historical precedent I cited, but what is your overall view? If you can spare the time, what is your credence in each of the following PONR-before-GDP-acceleration scenarios, and why?
1. Fast takeoff
2. The sorts of skills needed to succeed in politics or war are easier to develop in AI than the sorts needed to accelerate the entire world economy, and/or have less deployment lag. (Maybe ... (read more)
I don't know if we ever cleared up ambiguity about the concept of PONR. It seems like it depends critically on who is returning, i.e. what is the counterfactual we are considering when asking if we "could" return. If we don't do any magical intervention, then it seems like the PONR could be well before AI since the conclusion was always inevitable. If we do a maximally magical intervention, of creating unprecedented political will, then I think it's most likely that we'd see 100%+ annual growth (even of say energy capture) before PONR. I don't think there ... (read more)
1. What credence would you assign to "+12 OOMs of compute would be enough for us to achieve AGI / TAI / AI-induced Point of No Return within five years or so." (This is basically the same, though not identical, as this poll question)
2. Can you say a bit about where your number comes from? E.g. maybe 25% chance of scaling laws not continuing such that OmegaStar, Amp(GPT-7), etc. don't work, 25% chance that they happen but don't count as AGI / TAI / AI-PONR, for a total of about 60%? The more you say the better, this is my biggest crux! ... (read more)
I'd say 70% for TAI in 5 years if you gave +12 OOM.
I think the single biggest uncertainty is about whether we will be able to adapt sufficiently quickly to the new larger compute budgets (i.e. how much do we need to change algorithms to scale reasonably? it's a very unusual situation and it's hard to scale up fast and depends on exactly how far that goes). Maybe I think that there's a 90% chance that TAI is in some sense possible (maybe: if you'd gotten to that much compute while remaining as well-adapted as we are now to our current levels of compute) an... (read more)
I love your health points analogy. Extending it, imagine that someone came up with "coherence arguments" that showed that for a rational doctor doing triage on patients, and/or for a group deciding who should do a risky thing that might result in damage, the optimal strategy involves a construct called "health points" such that:
--Each person at any given time has some number of health points
--Whenever someone reaches 0 health points, they (very probably) die
--Similar afflictions/disasters tend to cause similar amounts of decrease in hea... (read more)
Wouldn't these coherence arguments be pretty awesome? Wouldn't this be a massive step forward in our understanding (both theoretical and practical) of health, damage, triage, and risk allocation?
Insofar as such a system could practically help doctors prioritise, then that would be great. (This seems analogous to how utilities are used in economics.)
But if doctors use this concept to figure out how to treat patients, or use it when designing prostheses for their patients, then I expect things to go badly. If you take HP as a guiding principle - for exampl... (read more)
Ah! You are right, I misread the graph. *embarrassed* Thanks for the correction!
OH this indeed changes everything (about what I had been thinking) thank you! I shall have to puzzle over these ideas some more then, and probably read the multi-prize paper more closely (I only skimmed it earlier)
OH ok thanks! Glad to hear that. I'll edit.
There's another explanation for why the history books display that progression you mapped out: They are Dutch history books, so naturally they want to focus on the bits of history that are especially relevant to the Dutch. One should expect that the "center of action" of these books drifts towards the Netherlands over time, just as it drifts towards the USA over time in the USA, and (I would predict) towards Indonesia over time in Indonesia, towards Japan over time in Japan, etc.
The International Energy Agency releases regular reports in which it forecasts the growth of various energy technologies for the next few decades. It's been astoundingly terrible at forecasting solar energy for some reason. Marvel at this chart:
This is from an article criticizing the IEA's terrible track record of predictions. The article goes on to say that there should be about 500GW of installed capacity by 2020. This article was published in 2020; a year later, the 2020 data is in, and it's actually 714 GW. Even the article criticizing the IEA for thei... (read more)
Whoa, the thing you are arguing against is not at all what I had been saying -- but maybe it was implied by what I was saying and I just didn't realize it? I totally agree that there are many optima, not just one. Maybe we are talking past each other?
(Part of why I think the two tickets are the same is that the at-initialization ticket is found by taking the after-training ticket and rewinding it to the beginning! So for them not to be the same, the training process would need to kill the first ticket and then build a new ticket on exactly the same spot!)
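For concreteness, the rewind procedure I have in mind is the standard prune-and-rewind recipe from the lottery-ticket papers; here's a minimal numpy sketch of one round (a toy simplification of mine, with a stand-in for actual training):

```python
import numpy as np

rng = np.random.default_rng(0)

w_init = rng.standard_normal(1000)  # weights at initialization
# stand-in for a trained network (real training would replace this line)
w_trained = w_init + 0.1 * rng.standard_normal(1000)

# 1. find the ticket: keep the top 20% of TRAINED weights by magnitude
k = int(0.2 * w_trained.size)
threshold = np.sort(np.abs(w_trained))[-k]
mask = (np.abs(w_trained) >= threshold).astype(float)

# 2. rewind: the ticket's weights are the INITIAL values under that mask,
#    so the at-initialization ticket and the after-training ticket occupy
#    exactly the same coordinates of the network by construction
ticket = mask * w_init
print(int(mask.sum()))  # number of surviving weights
```

This is why "the two tickets are the same" seems like the default reading: the at-initialization ticket is literally defined by rewinding the after-training ticket's coordinates back to their initial values.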
Hmmm, ok. Can you say more about why? Isn't the simplest explanation that the two tickets are the same?
I definitely agree that our timelines forecasts should take into account the three phenomena you mention, and I also agree that e.g. Ajeya's doesn't talk about this much. I disagree that the effect size of these phenomena is enough to get us to 50 years rather than, say, +5 years to whatever our opinion sans these phenomena was. I also disagree that overall Ajeya's model is an underestimate of timelines, because while indeed the phenomena you mention should cause us to shade timelines upward, there is a long list of other phenomena I could m... (read more)
Thanks for this post! I'll write a fuller response later, but for now I'll say: These arguments prove too much; you could apply them to pretty much any technology (e.g. self-driving cars, 3D printing, reusable rockets, smart phones, VR headsets...). There doesn't seem to be any justification for the 50-year number; it's not like you'd give the same number for those other techs, and you could have made exactly this argument about AI 40 years ago, which would lead to 10-year timelines now. You are just pointing out three reasons in f... (read more)
These arguments prove too much; you could apply them to pretty much any technology (e.g. self-driving cars, 3D printing, reusable rockets, smart phones, VR headsets...).
I suppose my argument has an implicit, "current forecasts are not taking these arguments into account." If people actually were taking my arguments into account, and still concluding that we should have short timelines, then this would make sense. But, I made these arguments because I haven't seen people talk about these considerations much. For example, I deliberately avoided the argument ... (read more)
Yeah, fair enough. I should amend the title of the question. Re: reinforcing the winning tickets: Isn't that implied? If it's not implied, would you not agree that it is happening? Plausibly, if there is a ticket at the beginning that does well at the task, and a ticket at the end that does well at the task, it's reasonable to think that it's the same ticket? Idk, I'm open to alternative suggestions now that you mention it...
The original paper doesn't demonstrate this but later papers do, or at least claim to. Here are several papers with quotes:
https://arxiv.org/abs/2103.09377 "In this paper, we propose (and prove) a stronger Multi-Prize Lottery Ticket Hypothesis: A sufficiently over-parameterized neural network with random weights contains several subnetworks (winning tickets) that (a) have comparable accuracy to a dense target network with learned weights (prize 1), (b) do not require any further training to achieve prize 1 (prize 2), and (c) is robust to extreme ... (read more)
Update: It seems that GPT-3 can actually do quite well (maybe SOTA? Human-level-ish it seems) at SuperGLUE with the right prompt (which I suppose you can say is a kind of fine-tuning, but it's importantly different from what everyone meant by fine-tuning at the time this article was written!) What do you think of this?
This is also a reply to your passage in the OP:
The transformer was such an advance that it made the community create a new benchmark, “SuperGLUE,” because the previous gold standard benchmark (GLUE) was now too easy.
Thanks! I'm afraid I don't understand the math yet but I'll keep trying. In the meantime:
I doubt that today’s neural networks already contain dog-recognizing subcircuits at initialization. Modern neural networks are big, but not that big.
Can you say more about why? It's not obvious to me that they are not big enough. Would you agree they probably contain edge detectors, circle detectors, etc. at initialization? Also, it seems that some subnetworks/tickets are already decent at the task at initialization, see e.g. this paper. Is that not "dog-recognizing subcircuits at initialization?" Or something similar?
It is irrelevant to this post, because this post is about what our probability distribution over orders of magnitude of compute should be like. Once we have said distribution, then we can ask: How quickly (in clock time) will we progress through the distribution / explore more OOMs of compute? Then the AI and compute trend, and the update to it, become relevant.
But not super relevant IMO. The AI and Compute trend was way too fast to be sustained, people at the time even said so. This recent halt in the trend is not surprising. What matters is what the tren... (read more)
What Gwern said. :) But I don't know for sure what the person I talked to had in mind.