Comments on “The Singularity is Nowhere Near”

[-]Said Achmiz5yΩ10190

I haven’t read the linked post/comment yet, and perhaps I am missing something very obvious, but: we have exaflop computing (that’s 10^18) right now. Is Tim Dettmers really saying that we’re not going to see a 1000x speed-up, in a century or possibly ever? That seems like a shocking claim, and I struggle to imagine what could justify it.

EDIT: I have now read the linked comment; it speaks of fundamental physical limitations such as speed of light, heat dissipation, etc., and says:

These are all hard physical boundaries that we cannot alter. Yet, all these physical boundaries will be hit within a couple of years and we will fall very, very far short of human processing capabilities and our models will not improve much further. Two orders of magnitude of additional capability are realistic, but anything beyond that is just wishful thinking.

I do not find this convincing. Taking the outside view, we can see all sorts of similar predictions of limitations having been made over the course of computing history, and yet Moore’s Law is still going strong despite quite a few years of predictions of imminent trend-crashing. (Take a look at the “Recent trends” and “Alternative materials research” sections of the Wikipedia page; do you really see any indication that we’re about to hit a hard barrier? I don’t…)
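For scale, here is a rough sketch of what a 1000x speed-up means under a steady doubling trend. The doubling periods are hypothetical inputs chosen for illustration, not forecasts; only the 10^18 and 10^21 figures come from the discussion above.

```python
import math

CURRENT_FLOPS = 1e18  # exaflop computing, as stated above
TARGET_FLOPS = 1e21   # 1000x more, the figure under discussion


def years_to_close_gap(current, target, doubling_period_years):
    """Years needed if throughput doubles once per doubling period."""
    doublings = math.log2(target / current)  # about 10 doublings for 1000x
    return doublings * doubling_period_years


# Hypothetical doubling periods in years (assumptions, not data).
for period in (2, 3, 5):
    years = years_to_close_gap(CURRENT_FLOPS, TARGET_FLOPS, period)
    print(f"doubling every {period} years -> about {years:.0f} years")
```

Even the slowest of these assumed rates closes the gap in well under a century, which is why the "possibly ever" reading seems so strong.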

[-]Daniel_Eth4yΩ110

Also, these physical limits – insofar as they are hard limits – are limits on various aspects of the impressiveness of the technology, but not on the cost of producing the technology. Learning-by-doing, economies of scale, process-engineering R&D, and spillover effects should still allow for costs to come down, even if the technology itself can hardly be improved.

[-]moridinamael5y180

It is fun to note that Metaculus is extremely uncertain about how many FLOPS will be required for AGI. The community lower 25% bound is 3.9x10^15 FLOPS and the upper 75% bound is 4.1x10^20 FLOPS with very flattish tails extending well beyond these bounds. (The median is 6.2e17.)

I mention this mainly to point out that his estimate of 10^21 FLOPS is simply overconfident in his particular model. There are simple objections that should reduce confidence in that kind of extremely high estimate at least somewhat.
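To make the spread concrete, a small sketch that expresses the figures quoted above in orders of magnitude. The numbers are the ones in this comment; the helper name is just for illustration.

```python
import math

Q25 = 3.9e15     # community lower 25% bound, FLOPS
MEDIAN = 6.2e17  # community median, FLOPS
Q75 = 4.1e20     # community upper 75% bound, FLOPS
DETTMERS = 1e21  # Tim Dettmers' estimate, FLOPS


def orders_of_magnitude(low, high):
    """How many powers of ten separate two positive quantities."""
    return math.log10(high / low)


iqr_span = orders_of_magnitude(Q25, Q75)
above_median = orders_of_magnitude(MEDIAN, DETTMERS)

print(f"25%-75% range spans about {iqr_span:.1f} orders of magnitude")
print(f"10^21 sits about {above_median:.1f} orders of magnitude above the median")
print(f"10^21 is above the 75% bound: {DETTMERS > Q75}")
```

The middle half of the community distribution alone covers roughly five orders of magnitude, and the 10^21 figure lies above the 75% bound.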

For example, the human brain runs on 20 watts of glucose-derived power, and is optimized to fit through a birth canal. These design constraints alone suggest that much of its architectural weirdness arises due to energy and size restrictions, not due to optimization on intelligence. Actually optimizing for intelligence with no power or size restrictions will yield intelligent structures that look very different, so different that it is almost pointless to use brains as a reference object.

Again, I think a healthy stance to take here isn't "Tim Dettmers is WRONG" but rather "Tim Dettmers is overconfident."

[-]Donald Hobson5yΩ360

Tim Dettmers' whole approach seems to assume that there are no computational shortcuts: no tricks that programmers can use for speed where evolution brute-forced it. For example, maybe a part of the brain is doing a convolution by the straightforward brute-force algorithm, and programmers can use fast Fourier transform based convolutions instead. Maybe some neurons are discrete enough for us to use single bits. Maybe we can analyse the dimensions of the system and find that some are strongly attractive, and so just work in that subspace.
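As an illustration of the convolution example only (not a claim about what any brain region actually computes), here is a minimal pure-Python sketch. The function names are made up for this example. Both functions return the same linear convolution; the direct version does O(n·m) multiplications, while the FFT version does O(N log N) work for padded length N.

```python
import cmath


def direct_convolve(a, b):
    """Brute-force linear convolution: len(a) * len(b) multiplications."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def fft(values, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two.

    With invert=True this computes the unnormalised inverse transform
    (the caller divides by the length).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def fft_convolve(a, b):
    """Linear convolution via zero-padded FFTs and pointwise multiplication."""
    size = len(a) + len(b) - 1
    n = 1
    while n < size:  # pad up to the next power of two
        n *= 2
    fa = fft([complex(x) for x in a] + [0j] * (n - len(a)))
    fb = fft([complex(x) for x in b] + [0j] * (n - len(b)))
    product = [x * y for x, y in zip(fa, fb)]
    inverse = fft(product, invert=True)
    return [(v / n).real for v in inverse[:size]]


signal = [1.0, 2.0, 3.0]
kernel = [0.0, 1.0, 0.5]
print(direct_convolve(signal, kernel))
print([round(v, 6) for v in fft_convolve(signal, kernel)])
```

Same answer, very different operation counts at large sizes. An estimate that counts the brute-force operations would overstate what an engineered system needs for this step.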

Of course, all this is providing an upper bound on the amount of compute needed to make a human level AI. Tim Dettmers is trying to prove it can't be done. This needs a lower bound. To get a lower bound, don't look at how long it takes a computer to simulate a human. Look at how long it takes a human to simulate a computer. This bound is really rather useless, compared to modern levels of compute. However, it might give us some rough idea how bad overhead can be. Suppose we thought "Compute needed to be at least as smart as a human" was uniformly distributed somewhere between "compute needed to simulate a human" and "compute a human can simulate".

Well actually, it depends on what intelligence test we give. Human brains have been optimised towards (human stuff) so it probably takes more compute to socialize to a human level than it takes to solve integrals to a human level.

Interesting but probably irrelevant note.

There are subtleties in even the very loose lower bound of a human simulating a CPU. Suppose there was some currently unknown magic algorithm. This algorithm can hypothetically solve all sorts of really tricky problems in a handful of CPU cycles. It is so fast that a human mentally simulating a CPU running this algorithm will still beat current humans on a lot of important problems. (Not problems humans can already solve very quickly, because no algorithm can do much in <1 clock cycle.) If such a magic algorithm exists, then it's possible that even an AI running on a 1 operation per day computer could be arguably superhuman. Of course, I am somewhat doubtful that an algorithm that magic exists (although I have no strong evidence of nonexistence, only some weak evidence, namely that evolution didn't find it and we haven't found it yet). Either way, we are far into the realm of instant takeoff on any computer.

[-][anonymous]5y40

If you swapped out "AGI" for "Whole Brain Emulation" then Tim Dettmers' analysis becomes a lot more reasonable.

[-][anonymous]5y20

Tim is simply neglecting the obvious brute force solution to achieve brain-like capabilities. This is yet another startup and I'm not saying this approach will commercially succeed, but: [singularity hub]

The linked article is about a startup called Cerebras, which has gotten a 'wafer scale engine' to at least run in demos. This is where an entire silicon wafer is made into one large chip.

Enough of these, connected by hollow core optical fiber, would be what you need to hit that 10^21 threshold.

Also note that AI systems get a bunch of advantages that humans don't have. Each system is immortal and is always doing its best. Human beings trivially make mistakes on simple tasks at high error rates - we do not "do our best" consistently 24/7/365. What does it mean to achieve human-like performance? Did you mean average performance or performance of the best human alive who is well rested?

Do you want broad spectrum capabilities or just the objects in imagenet? Because, again, it's harder than it sounds for a human to do better.

AI systems in applications like autonomous cars get to learn from the experiences of their peers in way that is not biased. Think about how biased the information you get from your peers is - for one thing, humans tend to only tell each other about successes, which can cause you to overestimate your chance of success for a risky venture like a startup.

A peer autonomous vehicle, by contrast, can report each (novel situation, true outcome) pair in an unbiased way to a cloud farm that pushes the updated learning out to the fleet. That learning is something each individual car doesn't have to do - each vehicle doesn't need to learn by itself.

In fact, here's another flaw of Tim's reasoning. He's assuming we must have an AI system that learns in real time like a human does. This is not true - humans don't learn in real time either; that's why we need 16-20 years of education to be useful.

Each AI system used in a field can give answers to questions in real time, while recording the results where its prediction error was high. This is sorta how OpenAI's current algorithms already do it, though I am neglecting details.

For a useful AI system used in a field, therefore, you need a tiny fraction of all the neurons a human uses - most are never going to contribute in any single task you might do as a human. And if a rare edge case shows up that needs more capability than a pared down, 'sparse' system used in a real application has, you would have the field AI system pause its robotics and query a larger version of itself for the answer.

The more I type the more I realize how bullshit everything in this argument was. And there are efforts to make a silicon chip with more of the tradeoffs of the human brain. If you think you need power efficiency and breadth of capabilities more than accuracy, you can just do this. [an article on a startup that has built analog computers for neural network convolution. ]

So for Tim to be correct he needs to take into account a 'best effort' example of a large array of analog silicon processors, filling a whole warehouse, and conclude you cannot hit the computational needs required.

That startup is at about 300 TOPS for a single chip. For a quick napkin estimate, call that 10^14 operations per second. It's a startup making some of the first analog computers used in decades, so let's assume there's at least a power of 10 of "easy gains" left over if this became a commercial technology. So 10^15.

10^21 / 10^15 = 10^6, or 1 million chips in a warehouse. Go to a 'chiplet' architecture to cram them into fewer packages, cram 10 per package, and you have 100,000 packages.

Current number 1 supercomputer is Fugaku with 158,976 48-core CPUs.
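The napkin math above, spelled out as a sketch. Every input is one of the rough figures from this comment, and the 10x "easy gains" factor and 10 chips per package are assumptions, not measurements. Like the estimate itself, it treats TOPS and FLOPS as interchangeable.

```python
TARGET_OPS_PER_SEC = 10**21   # Tim's estimate for brain-level compute
CHIP_OPS_TODAY = 10**14       # ~300 TOPS per chip, rounded to a power of ten
ASSUMED_EASY_GAIN = 10        # assumption: one order of magnitude of easy gains
CHIPS_PER_PACKAGE = 10        # assumption: chiplet packaging, 10 per package
FUGAKU_CPUS = 158_976         # Fugaku's CPU count, for scale

chip_ops = CHIP_OPS_TODAY * ASSUMED_EASY_GAIN         # 10^15 per chip
chips_needed = TARGET_OPS_PER_SEC // chip_ops         # 10^6 chips
packages_needed = chips_needed // CHIPS_PER_PACKAGE   # 10^5 packages

print(f"chips needed:    {chips_needed:,}")
print(f"packages needed: {packages_needed:,}")
print(f"packages relative to Fugaku's CPU count: {packages_needed / FUGAKU_CPUS:.2f}x")
```

So under these assumptions the package count is smaller than the number of CPUs already installed in one existing supercomputer.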

Cheap and easy if you had to do this next week? No, but it sounds like if enough resources were available you could solve the problem even if we never get another improvement in silicon.
