Underconstrained Abstractions




Eliezer_Yudkowsky

Followup to: The Weak Inside View

Saith Robin:

"It is easy, way too easy, to generate new mechanisms, accounts, theories, and abstractions.  To see if such things are useful, we need to vet them, and that is easiest 'nearby', where we know a lot.  When we want to deal with or understand things 'far', where we know little, we have little choice other than to rely on mechanisms, theories, and concepts that have worked well near.  Far is just the wrong place to try new things."

Well... I understand why one would have that reaction.  But I'm not sure we can really get away with that.

When possible, I try to talk in concepts that can be verified with respect to existing history.  When I talk about natural selection not running into a law of diminishing returns on genetic complexity or brain size, I'm talking about something that we can try to verify by looking at the capabilities of other organisms with brains big and small.  When I talk about the boundaries to sharing cognitive content between AI programs, you can look at the field of AI the way it works today and see that, lo and behold, there isn't a lot of cognitive content shared.

But in my book this is just one trick in a library of methodologies for dealing with the Future, which is, in general, a hard thing to predict.

Let's say that instead of using my complicated-sounding disjunction (many different reasons why the growth trajectory might contain an upward cliff, which don't all have to be true), I instead staked my whole story on the critical threshold of human intelligence.  Saying, "Look how sharp the slope is here!" - well, it would sound like a simpler story.  It would be closer to fitting on a T-Shirt.  And by talking about just that one abstraction and no others, I could make it sound like I was dealing in verified historical facts - humanity's evolutionary history is something that has already happened.

But speaking of an abstraction being "verified" by previous history is a tricky thing.  There is this little problem of underconstraint - of there being more than one possible abstraction that the data "verifies".

In "Cascades, Cycles, Insight" I said that economics does not seem to me to deal much in the origins of novel knowledge and novel designs, and said, "If I underestimate your power and merely parody your field, by all means inform me what kind of economic study has been done of such things."  This challenge was answered by comments directing me to some papers on "endogenous growth", which happens to be the name of theories that don't take productivity improvements as exogenous forces.

I've looked at some literature on endogenous growth.  And don't get me wrong, it's probably not too bad as economics.  However, the seminal literature talks about ideas being generated by combining other ideas, so that if you've got N ideas already and you're combining them three at a time, that's a potential N!/((3!)(N - 3)!) new ideas to explore.  It then goes on to note that, in this case, there will be vastly more ideas than anyone can explore, so that the rate at which ideas are exploited will depend more on a paucity of explorers than a paucity of ideas.
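To put numbers on that counting argument, here is a minimal Python sketch of the three-at-a-time count.  The explorer capacity is a hypothetical figure I picked for illustration; it does not come from the endogenous growth literature.

```python
from math import comb

def candidate_ideas(n_existing: int, k: int = 3) -> int:
    """Ways to combine n_existing ideas k at a time: N! / (k! (N - k)!)."""
    return comb(n_existing, k)

# Hypothetical exploration capacity, assumed purely for illustration.
EXPLORERS = 1_000
IDEAS_PER_EXPLORER = 10
CAPACITY = EXPLORERS * IDEAS_PER_EXPLORER

for n in (10, 100, 1_000, 10_000):
    candidates = candidate_ideas(n)
    # Once candidates exceed capacity, explorers are the bottleneck, not ideas.
    print(n, candidates, candidates > CAPACITY)
```

Under these assumed numbers the candidate count passes the exploration capacity somewhere between 10 and 100 existing ideas, and after that only the capacity matters.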

Well... first of all, the notion that "ideas are generated by combining other ideas N at a time" is not exactly an amazing AI theory; it is an economist looking at, essentially, the whole problem of AI, and trying to solve it in 5 seconds or less.  It's not as if any experiment was performed to actually watch ideas recombining.  Try to build an AI around this theory and you will find out in very short order how useless it is as an account of where ideas come from...

But more importantly, if the only proposition you actually use in your theory is that there are more ideas than people to exploit them, then this is the only proposition that can even be partially verified by testing your theory.

Even if a recombinant growth theory can be fit to the data, the historical data still underconstrains the many possible abstractions that might describe the number of possible ideas available - any hypothesis that yields "more ideas than people to exploit them" will fit the same data equally well.  You should simply say, "I assume there are more ideas than people to exploit them", not go so far into mathematical detail as to talk about N choose 3 ideas.  It's not just that the dangling math here is underconstrained by the previous data, but that you're not even using it going forward.
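A toy sketch of that underconstraint, in Python.  The three idea-count hypotheses, the explorer count, and the "one idea per explorer per year" rule are all stand-ins I made up for illustration, not models taken from the literature.

```python
from math import comb

# Three rival hypotheses about how many candidate ideas exist given n known ideas.
# All are illustrative stand-ins.
HYPOTHESES = {
    "n choose 3": lambda n: comb(n, 3),
    "n squared": lambda n: n * n,
    "2 to the n": lambda n: 2 ** n,
}

def simulate(available, explorers=50, start=20, years=10):
    """Each year every explorer exploits one idea, capped by the candidate count."""
    n = start
    history = [n]
    for _ in range(years):
        n += min(explorers, available(n))
        history.append(n)
    return history

trajectories = {name: simulate(f) for name, f in HYPOTHESES.items()}
for name, history in trajectories.items():
    print(name, history)
```

With these assumed parameters all three hypotheses produce the identical trajectory, because the explorers are the binding constraint in every case.  The history cannot tell the hypotheses apart, and the model never uses the difference between them.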

(And does it even fit the data?  I have friends in venture capital who would laugh like hell at the notion that there's an unlimited number of really good ideas out there.  Some kind of Gaussian or power-law or something distribution for the goodness of available ideas seems more in order...  I don't object to "endogenous growth" simplifying things for the sake of having one simplified abstraction and seeing if it fits the data well; we all have to do that.  Claiming that the underlying math doesn't just let you build a useful model, but also has a fairly direct correspondence to reality, ought to be a whole 'nother story, in economics - or so it seems to me.)

(If I merely misinterpret the endogenous growth literature or underestimate its sophistication, by all means correct me.)

The further away you get from highly regular things like atoms, and the closer you get to surface phenomena that are the final products of many moving parts, the more history underconstrains the abstractions that you use.  This is part of what makes futurism difficult.  If there were obviously only one story that fit the data, who would bother to use anything else?

Is Moore's Law a story about the increase in computing power over time - the number of transistors on a chip, as a function of how far the planets have spun in their orbits, or how many times a light wave emitted from a cesium atom has changed phase?

Or does the same data equally verify a hypothesis about exponential increases in investment in manufacturing facilities and R&D, with an even higher exponent, showing a law of diminishing returns?

Or is Moore's Law showing the increase in computing power, as a function of some kind of optimization pressure applied by human researchers, themselves thinking at a certain rate?
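To see how one series can "verify" more than one of these readings, consider a small Python sketch.  The series below is synthetic and invented for illustration; it is not real industry data, and the doubling times are assumptions.

```python
# Synthetic, invented series: NOT real industry data.
YEARS = range(0, 41, 2)
T0, I0 = 2_000.0, 1.0              # assumed starting transistor count and investment
T_DOUBLING, I_DOUBLING = 2.0, 1.5  # assumed: investment doubles faster than transistors

transistors = [T0 * 2 ** (t / T_DOUBLING) for t in YEARS]
investment = [I0 * 2 ** (t / I_DOUBLING) for t in YEARS]

def predict_from_time(t):
    """Reading 1: transistor count is a function of calendar time."""
    return T0 * 2 ** (t / T_DOUBLING)

def predict_from_investment(i):
    """Reading 2: a power law in investment with exponent below 1 (diminishing returns)."""
    return T0 * (i / I0) ** (I_DOUBLING / T_DOUBLING)

# Both readings reproduce the historical series essentially exactly.
max_err_time = max(abs(predict_from_time(t) - a) / a
                   for t, a in zip(YEARS, transistors))
max_err_invest = max(abs(predict_from_investment(i) - a) / a
                     for i, a in zip(investment, transistors))
print(max_err_time, max_err_invest)

# Going forward they come apart: suppose investment stops growing for two years.
by_time = predict_from_time(YEARS[-1] + 2)
by_investment = predict_from_investment(investment[-1])
ratio = by_time / by_investment
print(ratio)
```

On the invented history the two readings are indistinguishable.  They disagree by about a factor of two after just two years of flat investment, which is the sort of question the past data alone cannot settle.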

That last one might seem hard to verify, since we've never watched what happens when a chimpanzee tries to work in a chip R&D lab.  But on some raw, elemental level - would the history of the world really be just the same, proceeding on just exactly the same timeline as the planets move in their orbits, if, for these last fifty years, the researchers themselves had been running on the latest generation of computer chip at any given point?  That sounds to me even sillier than having a financial model in which there's no way to ask what happens if real estate prices go down.
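A deliberately crude Python sketch of that counterfactual.  The assumption that chip speed doubles per fixed amount of researcher thinking time, the two-year doubling figure, and the thousandfold target are all mine, chosen only to make the contrast visible; this is a toy, not a forecast.

```python
DOUBLING_TIME = 2.0  # assumed: chip speed doubles every 2 years of researcher thinking time

def years_to_reach(target_speedup, researchers_on_chips, dt=0.001):
    """Calendar years until chips are target_speedup times faster (simple Euler stepping)."""
    speed = 1.0      # chip speed relative to the starting generation
    calendar = 0.0
    while speed < target_speedup:
        # Thinking time the researchers experience during this calendar step.
        thinking = dt * (speed if researchers_on_chips else 1.0)
        speed *= 2 ** (thinking / DOUBLING_TIME)
        calendar += dt
    return calendar

baseline = years_to_reach(1000, researchers_on_chips=False)
on_chips = years_to_reach(1000, researchers_on_chips=True)
print(baseline, on_chips)
```

With researchers thinking at a fixed rate, the thousandfold speedup takes roughly twenty calendar years in this toy.  With researchers running on the latest chip it arrives far sooner, so a model that ties the curve to the orbits of the planets cannot even pose the question.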

And then, when you apply the abstraction going forward, there's the question of whether there's more than one way to apply it - which is one reason why a lot of futurists tend to dwell in great gory detail on the past events that seem to support their abstractions, but just assume a single application forward.

E.g. Moravec in '88, spending a lot of time talking about how much "computing power" the human brain seems to use - but much less time talking about whether an AI would use the same amount of computing power, or whether using Moore's Law to extrapolate the first supercomputer of this size is the right way to time the arrival of AI. (Moravec thought we were supposed to have AI around now, based on his calculations - and he underestimated the size of the supercomputers we'd actually have in 2008.)

That's another part of what makes futurism difficult - after you've told your story about the past, even if it seems like an abstraction that can be "verified" with respect to the past (but what if you overlooked an alternative story for the same evidence?), that often leaves a lot of slack with regard to exactly what will happen with respect to that abstraction, going forward.

So if it's not as simple as just using the one trick of finding abstractions you can easily verify on available data...

...what are some other tricks to use?