List of Lethalities is by no means a "one-stop shop". If you don't agree with Eliezer on 90% of the relevant issues, it's completely unconvincing. For example, in that article he takes as an assumption that an AGI will be godlike-level omnipotent, and that it will default to murderism.
Building a bacterium that eats all metals would be world-ending: most elements on the periodic table are metals. If you engineer a bacterium that eats all metals, it will also eat metals that are essential for life (iron, calcium, potassium, etc.) and kill us all.
Okay, what about a bacterium that only eats "stereotypical" metals, like steel or iron? I beg you to understand that you can't just sub in different periodic-table elements and expect the bacterium to work the same. There will always be some material the bacterium doesn't touch that computers could still be made with. And even making a bacterium that works on only one material, yet is able to spread over the entire planet, is well beyond our abilities.
I think List of Lethalities is nonsense for other reasons, but he is correct that trying to do a "pivotal act" is a really stupid plan.
Under peer review, this never would have reached the public. That would have incentivized CAIS to actually think through the potential flaws in their work before blasting it out.
I asked the forecasting AI three questions:
Will Iran possess a nuclear weapon before 2030?
539's answer: 35%
Will Iran possess a nuclear weapon before 2040?
539's answer: 30%
Will Iran possess a nuclear weapon before 2050?
539's answer: 30%
Given that the AI apparently doesn't understand that an event can't become less likely to happen when it's given more time to happen, I'm somewhat skeptical that it will perform well in real forecasts.
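For what it's worth, this failure is mechanically checkable: "before 2030" is a subset of "before 2040", so the probabilities can never decrease as the deadline extends. A quick sketch of that sanity check, using the answers above:

```python
# Sanity check: P(event before year Y) can never decrease as Y grows,
# since "before 2030" is contained in "before 2040" and "before 2050".
forecasts = {2030: 0.35, 2040: 0.30, 2050: 0.30}  # 539's answers above

def is_coherent(forecasts):
    """True iff the probabilities are non-decreasing in the deadline."""
    probs = [p for _, p in sorted(forecasts.items())]
    return all(a <= b for a, b in zip(probs, probs[1:]))

print(is_coherent(forecasts))  # False: 0.35 for 2030 vs 0.30 for 2040
```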
The actual determinant here is whether or not you enjoy gambling.
Person A, who regularly goes to a casino and bets 100 bucks on roulette for the fun of it, will obviously go for bet 1. In addition to the expected 5 bucks of profit, they get the extra fun of gambling, making it a no-brainer. Similarly, bet 2 is a no-brainer.
Person B, who hates gambling and gets super upset when they lose, will probably reject bet 1. The expected profit of 5 bucks is outweighed by the emotional cost of gambling, a thing that upsets them.
When it comes to bet 2, person B still hates gambling, but the expected profit is so ridiculously high that it exceeds the emotional cost of gambling, so they take the bet.
Nobody here is necessarily being irrational once you account for non-monetary costs.
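To put rough numbers on this (only the 5-buck expected profit comes from the example above; the other figures are made up purely for illustration), here's a toy sketch:

```python
# Toy decision rule: take a bet iff expected monetary profit minus the
# emotional cost of gambling is positive. All numbers except the $5
# expected profit are illustrative assumptions.
def takes_bet(expected_profit, gambling_cost):
    return expected_profit - gambling_cost > 0

bet1_ev = 5          # small positive-EV bet from the example above
bet2_ev = 100_000    # "ridiculously high" expected profit (assumed figure)

person_a_cost = -10  # Person A enjoys gambling, so the "cost" is negative
person_b_cost = 50   # Person B hates gambling (assumed emotional cost)

print(takes_bet(bet1_ev, person_a_cost))  # True:  fun plus profit
print(takes_bet(bet2_ev, person_a_cost))  # True
print(takes_bet(bet1_ev, person_b_cost))  # False: $5 EV < emotional cost
print(takes_bet(bet2_ev, person_b_cost))  # True:  huge EV swamps the cost
```

Both people are maximizing the same kind of utility function; they just weigh the non-monetary term differently.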
I believe this is important because we should epistemically lower our trust in published media from here onwards.
From here onwards? Most of the tweets that ChatGPT generated are not noticeably different from the background noise of political Twitter (which is what it was trained on anyway). Also, Twitter is not published media, so I'm not sure where this statement comes from.
You should be willing to absorb information from published media with healthy skepticism, based on the source and an awareness of potential bias. This was true before ChatGPT, and it will still be true in the future.
No, I don't believe he did, but I'll save the critique of that paper for my upcoming "why MWI is flawed" post.
I'm not talking about the implications of the hypothesis, I'm pointing out that the hypothesis itself is incomplete. To simplify: if you observe an electron that has a 25% chance of spin up and a 75% chance of spin down, naive MWI predicts that one version of you sees spin up and one version of you sees spin down. It does not explain where the 25% and 75% numbers come from. Until we have a solution to that problem (and people are trying), you don't have a full theory that gives predictions, so how can you estimate its Kolmogorov complexity?
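To spell out the gap (a sketch, with amplitudes chosen to match the 25/75 example):

```latex
% Standard quantum mechanics assigns amplitudes and gets the probabilities
% from the Born rule:
\[
|\psi\rangle = \tfrac{1}{2}|\uparrow\rangle + \tfrac{\sqrt{3}}{2}|\downarrow\rangle,
\qquad
P(\uparrow) = \left|\tfrac{1}{2}\right|^{2} = 0.25,
\quad
P(\downarrow) = \left|\tfrac{\sqrt{3}}{2}\right|^{2} = 0.75 .
\]
% Bare "wavefunction + branching" gives two branches with one observer each;
% nothing in that picture yet says why you should expect 3:1 odds rather than
% 50/50 branch counting. Supplying that extra ingredient is the open problem.
```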
I am a physicist who works in a quantum-related field, if that helps you take my objections seriously.
It’s the simplest explanation (in terms of Kolmogorov complexity).
Do you have proof of this? I see this stated a lot, but I don't see how you could know this when certain aspects of MWI theory (like how you actually get the Born probabilities) are unresolved.
I'm leaving the same comment both here and in reply to Daniel on my blog.
First, thank you for engaging in good faith and rewarding deep critique. Hopefully this dialogue will help people understand the disagreements over AI development and modelling better, so they can make their own judgements.
I think I’ll hold off on replying to most of the points there, and make my judgement after Eli does an in-depth writeup of the new model. However, I did see that there was more argumentation over the superexponential curve, so I’ll try out some more critiques here: I’m not as confident about these, but hopefully they spark discussion.
The impressive achievements in LLM capabilities since GPT-2 have been driven by many factors: drastically increased compute, drastically increased training data, algorithmic innovations such as chain-of-thought, increases in the AI workforce, etc. The extent to which each contributes is a matter of debate, which we can save for when you properly write up your new model.
Now, let’s look for a second at what happens when the curve goes extreme: using median parameters and starting the superexponential today, the time horizon of AI would improve from one thousand work-years to ten thousand work-years in around five weeks. So you release a model, and it scores 80% on 1,000-work-year tasks but only around 40% on 10,000-work-year tasks (the current ratio of 50% to 80% time horizons is roughly 4:1). Then five weeks later you release a new model, and now the reliability on the much harder tasks has doubled to 80%.
Why? What causes the reliability to shoot up in five weeks? The amount of available compute, reference data, or labor will not change significantly in that time, and algorithmic breakthroughs do not arrive with regularity. It can’t be due to algorithmic speedups from AI development, because that’s in a different part of the model: we’re talking about five weeks of normal AI development, as it’s currently done by OpenAI. If the AI is only 30x faster than humans, then the time required for the AI to do the thousand-year task is still 33 years! So where does the improvement come from? Will we have developed the perfect algorithm, such that AI no longer needs retraining?
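To illustrate the mechanics (not the model’s actual median parameters; the starting doubling time and shrink factor below are made-up numbers, just the same qualitative shape): a superexponential where each doubling of the time horizon takes a fixed fraction less calendar time than the previous one ends up cramming the last few doublings into a handful of weeks.

```python
# Toy superexponential time-horizon curve: each doubling of the horizon
# takes 10% less calendar time than the one before. The starting doubling
# time (20 weeks) and the 10% shrink are illustrative assumptions only.
WORK_YEAR_HOURS = 2000

def weeks_until(target_hours, horizon_hours=1.0, doubling_weeks=20.0, shrink=0.9):
    """Calendar weeks for the time horizon to grow from horizon_hours to target_hours."""
    weeks = 0.0
    while horizon_hours < target_hours:
        horizon_hours *= 2
        weeks += doubling_weeks
        doubling_weeks *= shrink
    return weeks

to_1k  = weeks_until(1_000 * WORK_YEAR_HOURS)   # ~178 weeks from a 1-hour horizon
to_10k = weeks_until(10_000 * WORK_YEAR_HOURS)  # ~186 weeks
print(round(to_10k - to_1k, 1))                 # ~7.5 weeks for the 1,000 -> 10,000 jump
```

The exact numbers don’t matter; the point is that once you’re twenty-odd doublings in, the curve mechanically compresses the next few doublings into weeks, and the question above is what real-world process those weeks are supposed to correspond to.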
I think a mistake could be made in trying to transfer intuition about humans to AI here: perhaps the intuition is “hey, a human who is good enough to do a 1-year task well can probably be trusted to do a 10-year task”.
However, if a human is trying to reliably do a “100-year” task (a task that would take a team of a hundred about a year to do), this might involve spending several years getting an extra degree in the subject, reading a ton of literature, improving their productivity, getting mentored by an expert in the subject, etc. While they work on it, they learn new things and their actual neurons get rewired.
But the AI equivalent of this would be getting new algorithms, new data, new computing power, new training: i.e., becoming an entirely new model, which would take significantly more than a few weeks to build. I think there may be some double-counting going on between this superexponential and the superexponential from algorithmic speedups.