A suggested solution to AGI safety is to ensure intelligent systems only work on abstract mathematical problems that have, as far as we know, little to no connection with the "real world". I'll sidestep some obvious problems (limited usefulness, the temptation to formulate real-world problems in mathematical terms) to explore some implications of this idea, and then tie it back to modelling physical reality efficiently.

Ideally, these powerful "math oracles" would be handed datasets related to purely theoretical problems, model the data, and send back some form of domain-relevant answer. These AIs would live solely in a universe of esoteric mathematical objects and their relations. They would neither know nor care about humans and all their machinations, the births and deaths of stars, the decay of bromine atoms, the combining of simple cells into wondrously complex structures, or the rise and fall of galactic empires. Indeed, they wouldn't even know or care about their own existence. There's no concept of "math oracle running on a planet named Earth" that can be derived from the data. The whole idea of physical reality gets thrown out here.

Or such is the hope.

Let's imagine one of these superintelligent math oracles that's tasked with modelling a single (carefully selected) integer sequence. An oracle that learns an efficient, predictive model of the primes (specifically, their distribution among the natural numbers) can be said to fully "understand" that sequence. But is that all it understands? Due to the fundamental nature of primes in mathematics, an AI that finds their underlying pattern would arguably "know" number theory (and related mathematical fields) on a far deeper level than any human. A task that seemed constrained in its implications ("Predict the next n primes") with a simple, associated dataset (e.g. the first million primes) gave the AI a far wider insight into hidden mathematical reality than was perhaps imagined at the outset.
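
As a rough illustration of what "modelling the distribution of the primes" means at its very crudest, the sketch below compares the actual count of primes up to n with the classical n / ln(n) estimate from the prime number theorem. The function names are purely illustrative, and the oracle imagined here would need a model far sharper than this one.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def pnt_estimate(n):
    """Prime number theorem estimate of how many primes are <= n."""
    return n / math.log(n)

if __name__ == "__main__":
    # Print n, the true prime count, and the crude estimate side by side.
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, len(primes_up_to(n)), round(pnt_estimate(n)))
```

The estimate gets the overall density right but misses the fine structure, and that gap is exactly what a model that truly "understood" the sequence would have to close.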

Note that this doesn't necessarily have to be viewed negatively.

Added to that, this prime-predicting oracle might know more than the secret workings of numbers. The primes seem to bleed into our physical reality. For instance, patterns in primes are mirrored in certain forms of matter interactions. Periodical cicadas emerge on 13- and 17-year cycles, both prime. Whether these are coincidences or hint at some deeper connection between the primes and the real world, they raise the possibility that modelling abstract objects can still give an entity insights into the physical world. Insights, needless to say, humans don't currently possess.

Of course, the fact that the alien abstractions the AI has learned over the data overlap with some other structures we humans call "the real world" will probably be of zero interest to it. And as I previously mentioned, this potential phenomenon isn't necessarily a negative. In fact, I think it could be advantageous to humans. It hints, I think, at the possibility of radical data efficiency given the task of modelling a world. That is, the real world.

What kind of data would we feed a superintelligent oracle whose task was modelling known reality, and not just abstract mathematical objects? Well, there's Wikipedia. Marcus Hutter uses it as the basis of his contest for lossless compression in the hopes that this will lead to AGI. But Wikipedia is huge. Modelling the entire thing is grossly inefficient. Perhaps we could just use Vital Level V, a collection of the ~50,000 most important Wiki articles? That's still a huge amount of data though. Point a webcam at a busy town square? Way too much data.

What would be ideal is a small and simple (but probably very hard to model) dataset that gives an AI profound insights into our physical world, similar to how the prime-predictor uses its own simple dataset (a list of primes) to glean insights into mathematical reality.

Perhaps one such dataset is the VIX, a market volatility index that tracks the expected volatility of the S&P 500 over the next 30 days, as implied by option prices. It could be said to capture people's sense of foreboding about the near future at any given time. How much information is encoded in its movements? It would seem to me that humans are in there, with all their fears, hopes and ambitions. The current state of the world? The hidden workings of the world? Could reality as we know it be derived from a sequence of VIX values?

That's just one idea for an ultra-basic world-modelling dataset. I'd be happy to hear more suggestions.

One problem here is that there is a tradeoff between data and compute, and we are already at a bad place for compute.

People plausibly estimate the current universe with all its contents at kilobits (Standard Model equations + initial Big Bang conditions); that's radical data efficiency, hard to do better than that. So this minimal dataset problem is already solved. But even if someone handed the Turing machine of the universe to you, you can't compute it, it's too expensive. That's the problem with Kolmogorov complexity: it is the shortest program given unlimited compute. And it spends any amount of compute for a shorter program, implying that longer programs (up to simply encoding the answer literally) can be much more compute-efficient.
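
The tradeoff can be seen in miniature with two programs that produce the same output. This is only a toy sketch: true Kolmogorov complexity is uncomputable, and the length of a Python source string is just a stand-in for program length. The names `short_program`, `long_program` and the size `N` are illustrative choices.

```python
N = 2000  # illustrative size, not a figure from the discussion

# Short program: a few dozen characters, but running it performs
# on the order of N^2 trial divisions.
short_program = "[n for n in range(2, N) if all(n % d for d in range(2, n))]"

# Long program: the answer written out literally. Far more characters,
# but "running" it amounts to parsing a list.
answer = eval(short_program, {"N": N})
long_program = repr(answer)

if __name__ == "__main__":
    print("short program length:", len(short_program))
    print("long program length: ", len(long_program))
    print("same output:", eval(long_program) == answer)
```

Shrinking the description costs compute and spending more description saves it. The same trade is at work, at vastly larger scale, between a kilobit-sized description of the universe and a large text corpus.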

So if they hand you megabytes of text for free, that should make it way easier to run a cheap program... but as the Hutter Prize shows, we can't even run the programs which would solve that either! So we definitely can't run the programs which are more sample-efficient than the Hutter Prize.

And if, even if we had the minimal datasets, we couldn't do anything with them, what's the point?

That’s the problem with Kolmogorov complexity: it is the shortest program given unlimited compute. And it spends any amount of compute for a shorter program

I don't see why it's assumed that we'd necessarily be searching for the most concise models rather than, say, optimizing for CPU cycles or memory consumption. I'm thinking of something like Charles Bennett's Logical Depth.

These types of approaches also take it for granted that we're conducting an exhaustive search of model-space, which yes, is ludicrous. Of course we'd burn through our limited compute trying to brute-force the space. There's plenty of room for improvement in a stochastic search of models which, while still expensive, at least has us in the realm of the physically possible. There might be something to be said for working primarily on the problem of probabilistic search in large, discrete spaces before we even turn to the problem of trying to model reality.

(Standard Model equations + initial Big Bang conditions); that’s radical data efficiency,

Allow me to indulge in a bit of goal-post shifting.

A dataset like that gives us the entire Universe, i.e. Earth and a vast amount of stuff we probably don't care about. There might come a point where I care about the social habits of a particular species in the Whirlpool Galaxy, but right now I'm much more concerned about the human world. I'm far more interested in datasets that primarily give us our world, and through which the fundamental workings of the Universe can be surmised. That's why I nominated the VIX as a simple, human/Earth-centric dataset that perhaps holds a great amount of extractible information.

rather than, say, optimizing for CPU cycles or memory consumption

As I already pointed out, we already do. And turns out that you need to optimize more for CPU/memory, past the kilobytes of samples which are already flabby and unnecessary from the point of view of KC. And more. And more. Go right past 'megabyte' without even stopping. Still way too small, way too compute/memory-hungry. And a whole bunch more beyond that. And then you hit the Hutter Prize size, and that's still too optimized for sample-efficiency, and we need to keep going. Yes, blow through 'gigabyte', and then more, more, and some more - and eventually, a few orders of magnitude sample-inefficiency later, you begin to hit projects like GPT-3 which are finally getting somewhere, having traded off enough sample-inefficiency (hundreds of gigabytes) to bring the compute requirements down into the merely mortal realm.

A dataset like that gives us the entire Universe, i.e. Earth and a vast amount of stuff we probably don't care about.

You can locate the Earth in relatively few bits of information. Off the top of my head: the observable universe is only 45 billion lightyears radius; how many bits could an index into that possibly take? About 36 bits to encode distance from origin in lightyears out of 45b, maybe another ~40 bits for each of two angles? <120 bits for such a crude encoding, giving an upper bound. You need to locate the Earth in time as well? Another ~33 bits or so to pin down which year out of ~4.5b years. If you can do KC at all, another <150 bits or so shouldn't be a big deal...
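
The estimate is just a base-2 logarithm, so it is easy to check. A sketch, assuming one-lightyear spatial resolution, one-year time resolution, and a naive spherical-coordinate index (all simplifications; `bits_to_index` is a hypothetical helper). Under these choices the total stays well under 200 bits, which is negligible next to any real dataset.

```python
import math

def bits_to_index(count):
    """Bits needed to pick one item out of `count` equally likely options."""
    return math.ceil(math.log2(count))

RADIUS_LY = 45e9   # radius of the observable universe used above, in lightyears
SPAN_YEARS = 4.5e9  # span of years used above

distance_bits = bits_to_index(RADIUS_LY)
# Assumption: one-lightyear resolution even at the outer edge, so each
# angle needs roughly (arc length in lightyears) distinct values.
azimuth_bits = bits_to_index(2 * math.pi * RADIUS_LY)
polar_bits = bits_to_index(math.pi * RADIUS_LY)
time_bits = bits_to_index(SPAN_YEARS)

total_bits = distance_bits + azimuth_bits + polar_bits + time_bits

if __name__ == "__main__":
    print("distance:", distance_bits)
    print("angles:  ", azimuth_bits + polar_bits)
    print("time:    ", time_bits)
    print("total:   ", total_bits)
```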

I'm not so sure that knowing primes tells you anything useful about cicadas, for the simple reason that prime numbers function just as well on planets with or without cicadas. Because of this, learning about primes doesn't tell you much about whether or not cicadas are real (in the best case).

I think this applies similarly to much of the middle part of your thesis, and I think the end kind of just puts your theoretical AI into already-existing "machine learning" territory (unless I'm misunderstanding something here).