Disclaimer

I am very ignorant about machine learning.

 

Introduction

I've frequently heard suggestions that a superintelligence could dominate humans by thinking a thousand or million times faster than a human. Is this actually a feasible outcome for prosaic ML systems?
 

Why I Doubt Speed Superintelligence

One reason I think this might not be the case is that the "superpower" of speed superintelligences is faster serial thought. However, I'm under the impression that we're already running into fundamental limits on serial processing speed and can't really make processors go much faster:

In 2002, an Intel Pentium 4 model was introduced as the first CPU with a clock rate of 3 GHz (three billion cycles per second corresponding to ~ 0.33 nanoseconds per cycle). Since then, the clock rate of production processors has increased much more slowly, with performance improvements coming from other design changes.

Set in 2011, the Guinness World Record for the highest CPU clock rate is 8.42938 GHz with an overclocked AMD FX-8150 Bulldozer-based chip in an LHe/LN2 cryobath, 5 GHz on air.[4][5] This is surpassed by the CPU-Z overclocking record for the highest CPU clock rate at 8.79433 GHz with an AMD FX-8350 Piledriver-based chip bathed in LN2, achieved in November 2012.[6][7] It is also surpassed by the slightly slower AMD FX-8370 overclocked to 8.72 GHz, which tops the HWBOT frequency rankings.[8][9]

The highest base clock rate on a production processor is the IBM zEC12, clocked at 5.5 GHz, which was released in August 2012.

 

Of course, the "clock rate" of the human brain is much slower, but it's not as if ML models will ever run on processors with significantly faster clock rates. Even in 2062, we probably will not have any production processors with a base clock rate above 50 GHz (it may well be considerably slower). Rising compute availability for ML will continue to be driven by parallel processing techniques.

GPT-30 would not have considerably faster serial processing than GPT-3. And I'm under the impression that "thinking speed" is mostly a function of serial processing speed?

 

Questions

The above said, my questions:

  1. Can we actually speed up the "thinking" of fully trained ML models by K times during inference if we run them on processors that are K times faster?
  2. How does thinking/inference speed scale with compute?
    1. Faster serial processors
    2. More parallel processors
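The scaling in question 2 can be made concrete with a deliberately crude latency model (my own illustrative assumption, not a benchmark): a forward pass is serial in depth, while the work inside each layer divides across processing units.

```python
# Toy latency model (illustrative assumptions only): a forward pass
# through D layers is inherently serial in depth, while each layer's
# FLOPs parallelize across processing units.

def inference_latency_s(depth, flops_per_layer, clock_hz, flops_per_cycle, n_parallel):
    """Estimated seconds for one forward pass: layers run one after
    another, but a layer's work divides across n_parallel units."""
    time_per_layer = flops_per_layer / (clock_hz * flops_per_cycle * n_parallel)
    return depth * time_per_layer

base = inference_latency_s(96, 1e9, 3e9, 16, 1000)

# Question 1: a K-times-faster serial processor cuts latency by K.
fast_clock = inference_latency_s(96, 1e9, 10 * 3e9, 16, 1000)
assert abs(base / fast_clock - 10.0) < 1e-9

# Question 2b: K times more parallel units also cut latency by K in
# this idealized model -- real hardware hits communication overhead
# long before the ideal.
more_units = inference_latency_s(96, 1e9, 3e9, 16, 10 * 1000)
assert abs(base / more_units - 10.0) < 1e-9
```

In this idealized picture both knobs look interchangeable; in practice, parallelism stops reducing latency once communication overhead dominates.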



Answers

The reason the human brain can get away with such a low "clock speed" is that intelligence is an embarrassingly parallel problem. Realtime constraints and the clock speed of a chip put a limit on how deep the stack of neural net layers can be, but no limit on how wide the neural net can be, and according to deep learning theory, a sufficiently wide network can represent essentially any function.
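To illustrate the "wide, not deep" point with a minimal sketch (toy sizes of my own choosing): within one layer, every neuron depends only on the previous layer's activations, so all of a layer's neurons can be computed independently, while successive layers cannot be.

```python
import numpy as np

# One wide linear layer: 128 output neurons reading 64 inputs.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
W = rng.standard_normal((128, 64))

# The whole layer at once...
full = W @ x

# ...gives the same answer as computing each neuron separately,
# because no neuron in a layer depends on any other: this is the
# embarrassingly parallel width. Depth has no such decomposition;
# layer i+1 cannot start until layer i has finished.
per_neuron = np.array([W[i] @ x for i in range(W.shape[0])])
assert np.allclose(full, per_neuron)
```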

We also haven't seen how big an impact neuromorphic architectures could have. It could be several orders of magnitude. Add in the ability of multiple intelligent units to work together just like humans do (but with less in-fighting), and it's hard to say how much effective collective intelligence they could express.

Thanks for the reply. Do you have any position or intuitions on question 1 or 2?

Does more inference compute speed up inference time?

Lone Pine:
1. Can we actually speed up the "thinking" of fully trained ML models by K times during inference if we run them on processors that are K times faster? Yes, definitely. 2. (a) Yes. (b) Yes. This is all with the caveat that doing things faster doesn't mean the model can solve bigger, more difficult problems, or that its solutions will be of higher quality.
Comments

Meant to comment on this a while back but forgot. I have thought about this also and broadly agree that early AGI with 'thoughts' at GHz rates is highly unlikely. Originally this idea arose because, pre-ML, EY and the community broadly associated thoughts with CPU ops, but in practice thoughts are more like forward passes through the model.

As Connor Sullivan says, the reason brains can get away with low clock rates is that our intelligence algorithms are embarrassingly parallel, as is current ML. Funnily enough, for large models (and definitely if we were to run forward passes through NNs as large as the brain), inference latency is already within an OOM or so of the brain's (~100 ms). Due to parallelisation, you can distribute your forward pass across many GPUs to potentially decrease latency, but eventually you get throttled by the networking overhead.

The brain, interestingly, achieves its relatively low latency by being highly parallel and shallow. The brain is not that many 'layers' deep. Even though each neuron is slow, the brain can perform core object recognition in <300 ms at about 10 synaptic transmissions from retina -> IT. This is compared to current resnets, which are >>10 layers. It does this through some combination of better architecture, a better inference algorithm, and adaptive compute which trades space for time, i.e. you don't have to do all your thinking in a forward pass, but instead have recurrent connections so you can keep pondering and improving your estimates through multiple 'passes'.
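The adaptive-compute idea can be sketched with a stand-in for a recurrent block. Here the "block" is a Newton step converging on sqrt(2), purely as an illustration of "more passes, better answer"; a real network would refine activations, not a scalar.

```python
# A recurrent "ponder" loop: reuse one cheap step many times instead
# of stacking more layers, trading wall-clock time for effective depth.

def ponder(x0, n_passes):
    x = x0
    for _ in range(n_passes):
        x = 0.5 * (x + 2.0 / x)  # one recurrent pass (a Newton step)
    return x

# Each extra pass tightens the estimate of sqrt(2).
errors = [abs(ponder(1.0, k) - 2 ** 0.5) for k in (1, 3, 6)]
assert errors[0] > errors[1] > errors[2]
```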

Neuromorphic hardware can ameliorate some of these issues but not others. Potentially, it allows for much more efficient parallel processing and lets you replace a multi-GPU cluster with a really big neuromorphic chip. Theoretically this could enable forward passes to occur at GHz speed but probably not within the next decade (technically if you use pure analog or optical chips you can get even faster forward passes!). Downsides are unknown hardware difficulty for more exotic designs and general data movement costs on chip. Also energy intensity will be huge at these speeds. Another bottleneck you end up with in practice is simply speed of encoding/decoding data at the analog-digital interface.

Even based on GPU clusters, early AGI can probably improve inference latency by a few OOMs, to 100-1000s of forward passes per second, just from low-hanging hardware/software improvements. Additional benefits AGI could have are:

1.) Batching. GPUs are great at handling batches rapidly. The AGI can 'think' about 1000 things in parallel; the brain has to operate at batch size 1. Interestingly, this is also a potential limitation of a lot of neuromorphic hardware.

2.) Direct internal access to serial compute. Imagine you had a python repl in your brain you could query and instantly get responses. Same with instant internal database lookup. 
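The batching point (1) above can be demonstrated with a toy linear "model" (sizes are arbitrary): a single matrix multiply over a batch of 1000 inputs produces exactly the same results as 1000 sequential passes, which is why batch-friendly hardware gets such high throughput.

```python
import numpy as np

# Toy "model": one linear layer with weight matrix W.
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 16))
batch = rng.standard_normal((1000, 16))  # 1000 things to think about

# Batch size 1, brain-style: 1000 sequential forward passes.
one_at_a_time = np.stack([W @ x for x in batch])

# Batch size 1000, GPU-style: one matrix multiply covers every input.
all_at_once = batch @ W.T

assert np.allclose(one_at_a_time, all_at_once)
```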

Strongly upvoted, I found this very valuable/enlightening. I think you should make this a top level answer.

"Just like the smartest humans alive only a thousand times faster" is actually presented as a conservative scenario to pump intuition in the right direction. It's almost certainly achievable by known physics, even if it would be very expensive and difficult for us to achieve directly.

An actual superintelligence will be strictly better than that, because its early iterations will design systems better than we can, and later iterations running on those systems will be able to design systems more effective than we can possibly imagine, or even properly comprehend if they were explained to us. They might not be strictly faster, but speed is much easier to extrapolate and communicate than actual superhuman intelligence. People can understand what it means to think much faster in a way that they fundamentally cannot understand what it means to actually be smarter.

So in a way, the actual question asked here is irrelevant. A speedup is just an analogy to try to extrapolate to something - anything - that is vastly more capable than our thought processes. The reality would be far more powerful still in ways that we can't comprehend.

The reality would be far more powerful still in ways that we can't comprehend.

I am unconvinced by this.

I get your broader point though.

 

That said, I am still curious about how pragmatic speed superintelligences are in practice. I don't think it's an irrelevant question.
