Donald Hobson

MMath Cambridge. Currently studying postgrad at Edinburgh.

Comments

The Industrial Explosion

Donald Hobson6d20

Compute could be a bottleneck, not just for AI but also for simulations of physical world systems that are good enough to avoid too many real experiments and thus dramatically speed up progress in designing things that will actually do what they need to do.

Imagine you have clunky nanotech. Sure, it has its downsides. It needs to run at liquid nitrogen temperatures and/or in high vacuum. It needs high-purity lab supplies. It's energy inefficient. It is full of rare elements. But, being nanotech, it can make a wide range of molecularly precise designs in a day or less and, having self-replicated to fill the beaker, can try ~10^9 different experiments at once. With experimental power like that, you don't really need compute.

So I suspect any compute bottleneck needs to happen before even clunky nanotech. And that would require even clunky nanotech to be really hard to design.

The Industrial Explosion

Donald Hobson7d40

This scenario is post superhuman AI. So rainforest exists iff the AI likes rainforest. Same goes for humans.

The Industrial Explosion

Donald Hobson7d3-3

So, in this world, you have a post-FOOM superintelligent AI.

What does it take for such an AI to bootstrap nanotech? If, as I suspect, the answer is 1 lab and a few days, then the rest of this analysis is mostly irrelevant.

The doubling time of nanotech is so fast that the AI only wants macroscopic robots to the extent that they speed up the nanotech, or fulfill the AI's terminal values.

Thus the AI's strategy, if it somehow can't make nanotech quickly, will depend on what the bottleneck is. Time? Compute? Lab equipment?

The Reductionist Trap

Donald Hobson8d20

This is a way to gain understanding of the system without using the standard "reductionist" techniques, so non-reductionism isn't a nonapple in this case!

It is a non-apple. There are many different ways to understand a system without breaking it down into parts.

Imagine you're in a shop and more than half your sales are apples. While non-apples aren't a compact cluster in thingspace, within the context of this shop, the concept "non-apple" seems useful.

POC || GTFO culture as partial antidote to alignment wordcelism

Donald Hobson1mo42

Your bad example takes 5 minutes at a party. Your "good" example takes 8 weeks of work. It is not hard, in general, to get a better answer by investing more effort.

A specific example, worked out in full detail, can exhibit the presence of security holes, but not their absence. If the system is a complicated mess, it can be very hard to find a security hole, but also very hard to prove it doesn't have one. (And it's quite likely it does have one.)

When speculating about the risks of future AI, the easiest proofs of concept will be rather toy and of arguable relevance. More sophisticated proofs of concept on less toy examples might be dangerous to create.

If you see a bunch of potential threats, it's not guaranteed that all those threats are real. But they are all likely enough to be real that you have to plan for them. The list of speculations will contain some false positives. The list of fully worked-out exploits will contain false negatives.

The Best Way to Align an LLM: Is Inner Alignment Now a Solved Problem?

Donald Hobson1mo20

I don't think this works in the infinite limit. With a truly unlimited amount of compute, insane things happen. I wouldn't trust that a randomly initialized network wasn't already a threat.

For example, bulk randomness can produce deterministic-seeming laws over the distribution (as in statistical mechanics). These laws can in turn support the formation and evolution of life.

That, or a sufficiently large neural net could just have all sorts of things hiding in it by sheer probability.

The win scenario here is that these techniques work well enough that we get LLMs that can just tell us how to solve alignment properly.

a confusion about preference orderings

Donald Hobson1mo20

If you really have preferences over all possible histories of the universe, then technically you can do anything.

Money pumping thus only makes sense in a context where your preferences are over a limited subset of reality.

Suppose you go to a pizza place. The only 2 things you care about are which kind of pizza you end up eating, and how much money you leave with. And you have cyclic preferences about pizza flavor A<B<C<A.

Your waiter offers you a 3-way choice between pizza flavors A, B, C. Then they offer to let you change your choice for $1, repeating this offer N times. Then they make your pizza.

Without loss of generality, you originally choose A.

For N=1, you change your choice to B, having been money pumped for $1. For N=2, you know that if you change to B the first time, you will then change to C, so you refuse the first offer, and then change to B. The same goes for any constant, known N>2: repeatedly refuse, then switch at the last minute.

Suppose the waiter will keep asking and keep collecting $1 until you refuse to switch. Then you will wish you could commit yourself to paying $1 exactly once, and then stopping. But if you are the sort of agent that switches, you will keep switching forever, paying an infinite amount of money and never getting any pizza.

Suppose the waiter rolls a die. If they get a 6, they let you change your pizza choice for $1, and roll the die again. As soon as they get some other number, they stop rolling and make your pizza. Under a slight strengthening of the idea of cyclical preferences to cover decisions under uncertainty, you will keep going around in a cycle until the die stops rolling 6s.

So there is some small chance of being money pumped.
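As a rough illustration of how small that chance is, here is a minimal Python sketch of the die-rolling waiter. It assumes a fair six-sided die and an agent that accepts every offer it gets (the "strengthened" cyclic preferences above); the function names and the $1 price parameter are illustrative choices, not part of the original argument. Under those assumptions the number of paid switches is geometric, so the expected total paid is p/(1-p) with p = 1/6, i.e. $0.20.

```python
import random


def expected_payment(p_offer: float = 1 / 6, price: float = 1.0) -> float:
    """Exact expected total paid if every offer is accepted.

    The number of offers K satisfies P(K >= k) = p_offer**k, so
    E[K] = p_offer / (1 - p_offer).
    """
    return price * p_offer / (1 - p_offer)


def simulate_payment(rng: random.Random, price: float = 1.0) -> float:
    """Simulate one visit: pay `price` each time the waiter rolls a 6."""
    paid = 0.0
    while rng.randint(1, 6) == 6:  # a 6 means another offer, which the agent accepts
        paid += price
    return paid


def average_payment(trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected payment over many visits."""
    rng = random.Random(seed)
    return sum(simulate_payment(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("exact expectation:", expected_payment())
    print("simulated average:", average_payment())
```

The simulated average should land close to the exact value; the pump extracts money only on the one-in-six visits where at least one 6 is rolled.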

Money pumping an agent with irrational cyclic preferences is quite tricky if the agent isn't looking myopically 1 step ahead, but can foresee how the money pumping ends in the long term.

Could we go another route with computers?

Answer by Donald HobsonMay 31, 202574

but my intuition suggests that giant amounts of transistors shouldn't be the fastest way to compute almost everything,

You really want your computation devices to be small, fast, cheap and reliable. And transistors are very small and fast and reliable. Also, binary has a lot of advantages, and transistors can do arbitrary logic gates.

Also, special purpose components have a lot of limitations.

Your gravity sort only works if the computer is the right way up.

Analogue processes in general are hard to do to reasonable precision. The sort of optics-based Fourier transform hardware would not only be low precision and probably nondeterministic, it would also be fixed size. If you want to do bigger or smaller Fourier transforms, tough.

It's hard to replace general purpose components with special purpose ones because there are so many different things you might want to compute. Modern computers can do loads of tasks well enough. A device that could do all sorting magically and instantly, and made your computer 10% more expensive, would still probably not be worth it. How much of your computer's time is actually spent on sorting?

A lot of the code currently run isn't a neat maths thing like sorting or a Fourier transform. It's full of 1000's of messy details (e.g. the Linux kernel, Firefox, most other packages). It would be possible to make specialized hardware with a specific version of all the details built into it. But other than that, what you want is a general instruction-following machine.

What's So Bad About Ad-Hoc Mathematical Definitions?

Donald Hobson1mo42

Now, Bell could patch over this problem. For instance, they could pick a bunch of functions like $\sin(Y)$, $e^{X} + 2X - 1$, etc, and require that those also be uncorrelated.

One neat thing: if you require ALL functions of $X$ to be uncorrelated with ALL functions of $Y$, then this is equivalent to saying the mutual information is 0.
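A sketch of why, reading "all functions" as all bounded measurable $f$ and $g$ (that reading is an assumption, not something stated above):

$$
\operatorname{Cov}\big(f(X),\, g(Y)\big) = 0 \ \text{ for all } f, g
\;\iff\; P_{XY} = P_X \otimes P_Y
\;\iff\; I(X;Y) = D_{\mathrm{KL}}\big(P_{XY} \,\|\, P_X \otimes P_Y\big) = 0 .
$$

For the first step, take indicator functions $f = \mathbf{1}_A$ and $g = \mathbf{1}_B$: zero covariance then says $P(X \in A,\, Y \in B) = P(X \in A)\,P(Y \in B)$ for all sets $A, B$, which is independence. The converse holds because independence of $X$ and $Y$ implies independence of $f(X)$ and $g(Y)$. The second step is the standard fact that the KL divergence is zero exactly when the two distributions coincide.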

If I ran the zoo

Donald Hobson1mo20

One thing that makes this more complicated is that you seem to be talking about omnipresent simulated clones. But in such a scenario, a large fraction of my utility would concern the clones. So any task that requires too much boring manual detail work is likely to just not get done. Or are they hypothetical clones in some way? Is this about what the clones could do, not about what they would do?