No plans so far. I'm a little unhappy with the experimental design from last time. If I ever come back to this, I'll change the experiments up anyways.
Could you elaborate a bit more on the strategic assumptions of the agenda? For example,
1. Do you think your system is competitive with end-to-end Deep Learning approaches?
1.1. Assuming the answer is yes, do you expect CoEm to be preferable to users?
1.2. Assuming the answer is no, how do you expect it to get traction? Is the path through lawmakers understanding the alignment problem and banning everything that is end-to-end and doesn't have the benefits of CoEm?
2. Do you think this is clearly the best possible path for everyone to take right now o...
Fair. You convinced me that the effect is determined more by layer norm than by cross-entropy.
I agree that the layer norm does some work here, but I think some parts of the explanation can be attributed to the inductive bias of the cross-entropy loss. I have been playing around with small toy transformers without layer norm, and they show behavior roughly similar to what's described in this post (I ran different experiments, so I'm not confident in this claim).
My intuition was roughly:
- the softmax doesn't care about absolute size, only about the relative differences between the logits (see the sketch after this list).
- thus, the network merely has to make the correct logits really big an...
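For illustration, here is a minimal numpy sketch of the first bullet (the numbers are arbitrary): shifting every logit by a constant leaves the softmax output unchanged, while scaling the logits up just makes the distribution sharper.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability; output is unchanged
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5])

print(softmax(logits))          # baseline distribution
print(softmax(logits + 100.0))  # shifting every logit by a constant: identical output
print(softmax(logits * 10.0))   # scaling all logits up: same ranking, much more peaked
```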
I don't think there is a general answer here. But here are a couple of considerations:
- networks can get stuck in local optima, so if you initialize one to memorize, it might never find a general solution.
- grokking has shown that with high weight regularization, networks can transition from memorized to general solutions, so it is possible to move from one to the other.
- it probably depends a bit on how exactly you initialize the memorized solution. You can represent lookup tables in different ways and some are much more liked by NNs than others. For examp...
I agree with everything you're saying. I just want to note that as soon as someone starts training networks in a way where not all weights are updated simultaneously, e.g. because the weights are updated only for specific parts of the network, or when the network has an external memory that is not changed every training step, gradient hacking seems immediately much more likely and much scarier.
And there are probably hundreds of researchers out there working on modular networks with memory, so it probably won't take that long until we have models that...
This criticism has been made for the last 40 years, and people have usually come up with new ideas and been able to execute them. Thus, on priors, we think this trend will continue even if we don't know exactly what kinds of ideas those will be.
In fact, through our post we became aware of a couple of interesting ideas about chip improvements that we hadn't considered before and that might change our predictions (towards later limits), but we haven't included them in the model yet.
Hmmm interesting.
Can you provide some of your reasons or intuitions for this fast FOOM?
My intuition against it is mostly like "intelligence just seems to be compute bound and thus extremely fast takeoffs (hours to weeks) are unlikely". But I feel very uncertain about this take and would like to refine it. So just understanding your intuitions better would probably already help a lot.
I think it's mostly my skepticism about extremely fast economic transformations.
For example, GPT-3 could probably automate more parts of the economy today, but somehow it just takes a while for people to understand that and get it to work in practice. I also expect a gap of a couple of years between demonstrating the capabilities of new AI systems in the lab and widespread economic impact, just because humans take a while to adapt (at least with narrow systems).
At some point (maybe in 2030) we will reach a level where AI is as capable as humans in man...
Well maybe. I still think it's easier to build AGI than to understand the brain, so even the smartest narrow AIs might not be able to build a consistent theory before someone else builds AGI.
I'm not very bullish on HMI. I think the progress humanity makes in understanding the brain is extremely slow, and because it's so hard to do research on the brain, I don't expect us to get much faster.
Basically, I expect humanity to build AGI way before we are even close to understanding the brain.
I know your reasoning and I think it's a plausible possibility. I'd be interested in what AI's disruption of society looks like in your scenario.
Is it more like one or a few companies have AGIs while the rest of the world is still kinda normal, or is it roughly like my story, just 2x as fast?
Thanks for pointing this out. I made a clarification in the text.
Similar to Daniel, I'd also be interested in what public opinion was at the time or what the consensus view among experts was if there was one.
Also, it seems like the timeframe for mobile phones is 1993 to 2020 if you can trust this statistic.
Definitely could be but don't have to be. We looked a bit into cooling and heat and did not find any clear consensus on the issue.
We did consider modeling it explicitly. However, most estimates of the Landauer limit give very similar predictions to the size limits. So we decided against adding it explicitly to the model; it is "implicitly" modeled via the physical size. We intend to look into Landauer's limit at some point, but it's not a high priority right now.
We originally wanted to forecast FLOP/s/$ instead of just FLOP/s but we found it hard to make estimates about price developments. We might look into this in the future.
Well, depending on who you ask, you'll get numbers between 1e13 and 1e18 for the human brain FLOP/s equivalent. So I think there is lots of uncertainty about it.
However, I do agree that if it were at 1e16, your reasoning would sound plausible to me. What a wild imagination.
Yeah, I also expect that there are some ways of compensating for the lack of miniaturization with other tech. I don't think progress will literally come to a halt.
We looked more into this because we wanted to get a better understanding of Ajeya's estimate of price-performance doubling every 2.5 years. Originally, some people I talked to were skeptical and thought that 2.5 years was too conservative. I now think that 2.5 years is probably not conservative enough in the long run.
However, I also want to note that there are still reasons to believe a doubling time of 2 years or less could be realistic due to progress in specialization or other breakthroughs. I still have large uncertainty about the doubling tim...
By uncertainty I mean, I really don't know, i.e. I could imagine both very high and very low gains. I didn't want to express that I'm skeptical.
For the third paragraph, I guess it depends on what you think of as specialized hardware. If you think GPUs are specialized hardware, then a gain of 1000x from CPUs to GPUs sounds very plausible to me. If you think GPUs are the baseline and specialized hardware means e.g. TPUs, then a 1000x gain sounds implausible to me.
My original answer wasn't that clear. Does this make more sense to you?
As a follow-up to building the model, I was looking into specialized AI hardware and I have to say that I'm very uncertain about the claimed efficiency gains. There are some parts of the AI training pipeline that could be improved with specialized hardware but others seem to be pretty close to their limits.
We intend to understand this better and publish a piece in the future but it's currently not high on the priority list.
Also, when the baseline is CPUs, it's no wonder that any parallelized hardware is 1000x more efficient. So it really depends on exactly which comparison the authors used.
Just send in a new application. We have a couple of new mentors but they are quite busy at the moment so I can't promise that we'll find a match soon :( Sorry for that
I'm not sure I actually understand the distinction between forecasting and foresight. For me, most of the problems you describe sound either like forecasting questions or AI strategy questions that rely on some forecast.
Your two arguments for why foresight is different from forecasting are
a) some people think forecasting means only long-term predictions and some people think it means only short-term predictions.
My understanding of forecasting is that it is not time-dependent, e.g. I can make forecasts about an hour from now or for a milli...
I have not tested it since then. I think there were multiple projects that tried to improve profilers for PyTorch. I don't know how they went.
If it's easier for you, we can already facilitate that through M&M. Like we said, as long as both parties agree, you can do whatever makes sense for you :) But the program might make finding other people easier.
AI policy does count and we actively look for mentors in policy :)
I'm honestly not even sure whether this comment is in support of or against my disagreements.
I'm skeptical of the "recursive self-improvement leads to enormous gains in intelligence over days" story, but I support the "more automation leads to faster R&D leads to more automation, etc." story, which is also a form of recursive self-improvement, just over the span of years rather than days.
I think "dangerous" doesn't have to imply that it can replicate itself. Doing zero-shot shenanigans on the stock market, causing a recession, and then being turned off also counts as dangerous in my book.
Updated the Slack link. Thanks for spotting it.
I'm not certain about it either but I'm less skeptical. However, I agree with you that some of this could be capabilities work and has to be treated with caution.
However, I think that to answer some of the important questions around Deep Learning, e.g. which concepts networks learn and under which conditions, we just need to get a better understanding of the entire pipeline. I think it's plausible that this is very hard and progress is much slower than one would hope.
Just to be clear, I also think that your grokking work increases alignment much more than capabilities on balance.
I think the way in which it increases capabilities would roughly look like this: "your insight on grokking is a key to understanding fast generalization better; other people build on this insight and then modify training; this improves the speed of learning and thus capabilities".
I think your work is clearly net positive, I just wanted to use a concrete example in the post to show that there are trade-offs worth taking.
Maybe I should have stated this differently in the post. Many conversations end up on X-risks at some point, but usually only after they have gone through the other stuff. I think my main takeaway was just that starting with X-risk as the motivation did not seem very convincing.
Also, there is a big difference in how you talk about X-risk. You could say something like "there are plausible arguments for why AI could lead to extinction, but even experts are highly uncertain about this" or "We're all gonna die", and the more moderate version seems clearly more persuasive.
I don't think these conversations had as much impact as you suggest and I think most of the stuff funded by EA funders has decent EV, i.e. I have more trust in the funding process than you seem to have.
I think one nice side-effect of this is that I'm now widely known as "the AI safety guy" in parts of the European AIS community and some people have just randomly dropped me a message or started a conversation about it because they were curious.
I was working on different grants in the past but this particular work was not funded.
I think taking safety seriously was strongly correlated with whether or not they believed AI will be transformative in the near term (as opposed to being just another hype cycle). But I'm not sure. My sample is too small to make any general inferences.
This seems to be what https://www.adept.ai/act is working on, if I understood their website correctly. They probably don't have a million users yet, though, so I agree that it is not accurate at the moment.
Also, https://openai.com/blog/webgpt/ is an LLM with access to the internet, right?
But yeah, probably an overstatement.
Firstly, I don't think the term matters that much. Whether you use AGI safety, AI safety, ML safety, etc. doesn't seem to have as much of an effect as the actual arguments you make during the conversation (at least, that was my impression).
Secondly, I'm not saying you should never talk about X-risk. I'm mostly saying you shouldn't start with it. Many of my conversations ended up in discussions of X-risk, but only after 30 minutes of back and forth.
Not really. You can point out that this is your reasoning, but whenever you talk about short timelines you can bet that most people will think you're crazy. To be fair, even most people in the alignment community are more confident than not that we will make it to 2040, so this is a controversial statement even within AI safety.
Thanks for all the clarifications and the notebook. I'll definitely play around with this :)
Yeah, our impression was that a) there is a large body of relevant and related work in the existing social science literature, and b) taking 90% of the existing setup and adding AI would probably already yield lots of interesting studies. In general, it seems like there is a lot of room for people interested in the intersection of AI+ethics+social sciences.
Also, Positly+Guidedtrack makes running these studies really simple and turned out to be much smoother than I would have expected. So even when people without a social science background "just want to get a rough understanding of what the rest of the world thinks", they can quickly do so with the existing tools.
I think this is very similar to the hypothesis they have as well. But I'm not sure I understood it correctly; I think some parts of the paper are not as clear as they could be.
Update: it's published now and you can find it here: https://chip-dataset.vercel.app/
Thanks for the post. I'm voting for the SGD inductive biases for the next one.
That's fair. I think my statement as presented is much stronger than I intended it to be. I'll add some updates/clarifications at the top of the post later.
Thanks for the feedback!
Thanks for the comment. It definitely pointed out some things that weren't clear in my post and head. Some comments:
1. I think your section on psychologizing is fairly accurate. I previously didn't spend a lot of time thinking about how my research would reduce the risks I care about and my theories of change were pretty vague. I plan to change that now.
2. I am aware of other failure modes such as fast takeoffs, capability gains in deployment, getting what we measure, etc. However, I feel like all of these scenarios get much worse/harder when decepti...
Thanks for the clarifications. They helped me :)
I think the main difference is that it has access to all of the user's computers (or at least browsers), right? This should imply way more opportunities for malicious actions, right?
Yeah, the phrasing might not be as precise as we intended it to be.
My tentative heuristic for whether you should publish a potentially infohazardy post is "Has company-X-who-cares-mostly-about-capabilities likely thought about this already?". It's obviously non-trivial to answer that question, but I'm pretty sure most companies who build LLMs have looked at Chinchilla and come to conclusions similar to this post's. In case you're unsure, write the post up in a Google Doc and ask someone who has thought more about infohazards whether they would publish it or not.
Also, I think Leon underestimates how fast a post can spread even if it is just intended for an alignment audience on LW.
How confident are you that the model is literally doing gradient descent from these papers? My understanding was that the evidence in these papers is not very conclusive and I treated it more as an initial hypothesis than an actual finding.
Even if you have the redundancy at every layer, you are still running copies of the same layer, right? Intuitively I would say this is not likely to be more space-efficient than not copying a layer and doing something else but I'm very uncertain about this argument.
I intend to look into the Knapsack + DP algorithm problem at some point. If I were to find that the model implements the DP algorithm, it would change my view on mesa optimization quite a bit.
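For reference, by "the DP algorithm" I mean the textbook dynamic program for 0/1 knapsack; a minimal sketch of the standard algorithm (not a claim about what the model actually implements) looks like this:

```python
def knapsack(values, weights, capacity):
    # dp[c] = best total value achievable with capacity c
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # iterate capacity downwards so each item is used at most once
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

print(knapsack(values=[6, 10, 12], weights=[1, 2, 3], capacity=5))  # -> 22
```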