[ Question ]

Open question: Math proofs that would enable you to become a world dictator

by Samuel Shadrach · 5 min read · 17th Nov 2021 · 11 comments


AI · World Optimization · Frontpage

Epistemic status: Probably worth thinking about as a problem, but possibly not worth posting public solutions to

This seems to be a known question in AI alignment; I just thought it would be worth popularising and getting more discussion on. (Those who are already deep into AI alignment may or may not find this a promising direction to work on. I just felt that it's a problem sufficiently abstracted from AI alignment that even people without much of an AI background could contribute.)

Problem

Provide a set of mathematically expressible problems whose solutions would allow you personally (or a small group of scientists with less than $10B in funding) to establish significant totalitarian control over the world. These problems should contain as little information about the real world as possible.

A mathematical statement of Goldbach's conjecture, along with the base set of axioms, contains very little information about the real world. A formal description of protein folding, along with descriptions of how proteins work and interact, what proteins are made of, and how those components work, contains a lot of information about the real world.

You can consider the Kolmogorov complexity of your problem statement as a possible indicator of how much information it exposes about the real world. (Although beyond a certain point, this may not be a good indicator.)
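Kolmogorov complexity itself is uncomputable, but compressed length gives a crude, computable upper bound. A minimal sketch of that proxy, where the two problem statements are just illustrative stand-ins:

```python
import zlib

def compressed_length(problem_statement: str) -> int:
    # Crude upper-bound proxy for Kolmogorov complexity: size after compression.
    return len(zlib.compress(problem_statement.encode("utf-8")))

goldbach = "Every even integer greater than 2 is the sum of two primes."
protein_folding = (
    "Given a chain of amino acids with these masses, charges and bond angles, "
    "and an energy function approximating real physics and chemistry, find the "
    "minimum-energy three-dimensional conformation."
)

print(compressed_length(goldbach))          # short: little real-world content
print(compressed_length(protein_folding))   # longer: encodes facts about physics and chemistry
```

Compressed length only tracks how long the statement is, not how much of that length is genuinely about the real world, which is part of why the indicator breaks down beyond a certain point.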

Motivation

One of the proposed solutions to the AI alignment problem is a pivotal act that prevents anyone else from building unaligned AGI. One way of doing this is setting up your own personal global dictatorship (singleton) that monitors and enforces this. One way of getting there is to use a Task AI that is superhuman at only one narrow field of study, one that does not contain too much information about the real world: for instance, number theory or, more broadly, proving math theorems.

Not containing information about the real world is important because it reduces the odds that the AI discovers, or tries to discover, strategies like "figure out if you're in a box or simulation, break out of it, eliminate civilisation and get back to the math problem with more compute power". The first of these steps is the important one here: "figure out if you're in a box or simulation, and if there's a real world outside". If the AI has no real-world information or any priors that a "real world" exists, it may be less likely to decide this is a problem worth spending compute power on, as opposed to directly attacking the math problem. It is still an open question whether that would actually happen, or whether "figuring out if you're in a simulation" is so instrumentally convergent that there can be no guard against it. But for now, assume this whole thing is a viable strategy.

The problem now is that any set of problems whose solutions let you take over the (real) world tends to contain too much information about the world. "Solve Goldbach's conjecture" contains a lot less information about the world than "solve protein folding, given information about what proteins are and how they work". Real-world strategies to establish a dictatorship usually require a lot of reasoning about real-world information, rather than abstract math problems. But it is possible the hardest challenges are in fact reducible. "Establish a dictatorship" may reduce to "build one particular form of nanotech", which may reduce to "solve protein folding", which may reduce to "solve this complex differential equation", which may reduce to "prove this broader, more abstract math theorem that contains even less real-world information than the differential equation".

This research might go in two directions. The first direction is reducing the information exposed by letting humans do part of the work. Problem_A(x1,y1,z1) can be reducible to Problem_B(x1,z1) if humans themselves know how to convert a solution to the latter into a solution to the former, and the latter contains less real-world information. For instance, how nanotech translates into establishing a world dictatorship could be a problem that only humans are pointed at, not the AI. The other direction is generalising to reduce the information exposed. Problem_A(x1,y1,z1) could be reducible to Problem_A(x,y,z) for all x,y,z in X, Y, Z, so you don't expose data about which specific values the problem has to be instantiated with to be relevant to the real world. For instance, instead of asking about electron interactions given some constants like electron mass and permittivity, and an inverse square law, ask the AI to make progress on all differential equations that contain polynomials. This avoids revealing the fact that inverse square laws are specifically important to the real world. The human can then pick out whatever progress the AI makes on inverse square laws.
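A toy sketch of this second direction, the "generalise, then filter" pattern. The query_oracle function below is a hypothetical stand-in for a superhuman solver; the point is only that the real-world instantiation (here, the inverse-square exponent) stays on the human side and is never included in the query:

```python
# Toy sketch: the oracle only ever sees the generalised problem family,
# never which instance matters for the real world.

def query_oracle(problem_family: str) -> dict[int, str]:
    # Hypothetical stand-in for a superhuman solver; it ignores its input
    # and returns placeholder results for each member of the family.
    return {n: f"analysis of solutions for exponent n = {n}" for n in range(1, 6)}

# Kept on the human side, never sent to the oracle:
REAL_WORLD_EXPONENT = 2  # inverse-square law

family = "characterise solutions of r''(t) = -k / r(t)**n for all integers n >= 1"
results = query_oracle(family)
useful = results[REAL_WORLD_EXPONENT]  # the human filters the output afterwards
print(useful)
```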

The querying could possibly be interactive. For instance, you might want to solve one problem, then react to its solution to figure out what the next problem worth solving is and how to cleverly pose it to the AI. Depending on how safe or unsafe the AI is, you can take that many steps to interactively enhance your powers. But for now, I guess you can try to find key steps such that you can make large progress using a smaller number of them.

Lastly, even if you don't become a world dictator yourself, having any significant power could be enough to get the powers-that-be to at least take you seriously. Maybe you can negotiate existing powers into a place where the agents they are composed of gain more than they lose by co-operating and establishing this monitoring regime.

Caution

Solving this problem might pose an infohazard. My thinking on this so far is that all of us on this forum are aligned enough that even if someone* besides me uses my idea to become a world dictator, it may still be a better world than one where the US and China remain in control, neither of them has a singleton, and neither of them thinks AI alignment is as important as people here do. That being said, I would not be surprised by a different set of deranged behaviours from people here either. (I'm pretty sure I qualify by some people's definitions of deranged simply for thinking about this problem.) I haven't thought this through though, so if you have strong reasons not to post solution attempts, or to discourage others from posting them, I'd be keen to know about them. Private channels to communicate solutions may be worthwhile.

*I guess it also depends on who the someone is: a random person on this forum, MIRI, or the US govt? And how democratic a structure they wish to set up. Just because you can be a world dictator in all respects doesn't mean you want to. Maybe you just set up a strong social or military institution that does monitoring and enforcement of AI, but does not otherwise interfere in global governance affairs.

To make this less of an infohazard, maybe I should explicitly ask for solutions that enable monitoring and enforcement of AI, without also enabling totalitarianism in other respects. This seems to me to be much harder, but I'm happy to be proven wrong.



2 Answers

An algorithm for solving PSPACE complete problems in polynomial time would probably get you a good chunk of the way there, although there's no particular reason to believe this is possible other than the fact that nobody has yet proven it to be impossible.
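Worth spelling out why this would be so powerful: since $\mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE}$, a polynomial-time algorithm for any PSPACE-complete problem would give

$$\mathrm{P} = \mathrm{NP} = \mathrm{PSPACE},$$

so every problem in NP (integer factoring, and hence breaking most deployed public-key cryptography, among much else) would become tractable in principle, constant factors permitting.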

Interesting suggestion.

Though I assume you also need the proof to be by construction and the algorithm to be implementable in practice (very large constant factors or exponents not allowed). The odds of this existing will have to be factored into whether this whole question is worth asking an unsafe AI.

I wonder if there's a computational problem whose solution has higher odds of existence or more easily leads to practical implementation.

CronoDAS: Which is why I specified "an algorithm" and not "a proof". Also, if my understanding is correct, simulating quantum systems is in PSPACE, so one thing this would do is make nanotechnology much easier to develop...
Samuel Shadrach: I missed that. Although it doesn't change much; the AI will likely still hunt for a formal proof if a direct search for an algorithm yields no result. And the risk-reward calculation of actually asking an unaligned AI this question is the same: some odds of very high danger and small odds of useful success.

I feel like there are two obvious strategies here:

  1. Make a lot of money, somehow
  2. Break cryptography

I can't think of anything other than those which would be useful without requiring you to give the AI lots of information about the physical world. (Though I can't be sure I haven't missed anything.)

5 comments

I am confused by the problem statement. What you're asking for is a generic tool: something that doesn't need information about the world to be created, but that I can then feed information about the real world and it will become very useful.

My problem is that the real world is rich, and feeding the tool with all relevant information will be expensive, and the more complicated the math problem is, the more safety issues you get.

I cannot rely on "don't worry if the Task AI is not aligned, we'll just feed it harmless problems"; the risk comes from what the AI will do to get to the solution. If the problem is hard and you want to defer the search to a tool powerful enough that you have to choose your inputs carefully or catastrophe happens, you don't want to build that tool.

"you don't want to build that tool."

Depending on the current world state (say someone else is close to discovering AGI too), it might be a viable option to have, I guess.

Thanks for replying!

I think my idea was to restrict the number of compute cycles being given to the Task AI. Given enough compute, the AI will eventually end up spending some compute to learn the existence of the real world, but if the AI is naturally much better suited to math theorems than this, it might directly solve the math problem first.

I'm keen to know if that would work.

So, assuming an unaligned agent here.

If your agent isn't aware that its compute cycles are limited (i.e. the compute constraint is part of the math problem), then you have three cases: (1a) the agent doesn't hit the limit with its standard search, and you're in luck; (1b) the problem is difficult enough that the agent runs its standard search but fails to find a solution in the allocated cycles, so it always fails, but safely; (1c) you tweak the agent to be more compute-efficient, which is very costly and might not work. In practice, if you're in case 1b and it apparently fails safely, you have an incentive to just increase the limit.

If your agent is indeed aware of the constraint, then it has an incentive to remove it, or increase the limit by other means. Three cases here again: (2a) identical to 1a, you're in luck; (2b) the limit is low enough that strategic action to remove the constraint is impossible, and the agent fails "safely"; (2c) the agent finds a way to remove the constraint, and you're in very unsafe territory.

Two observations from there: first, ideally you'd want your agent to operate safely even if given unbounded cycles, that's the Omni Test. Second, there's indeed an alignment concept for agents that just try to solve the problem without long-term planning, that's Myopia (and defining it formally is... hard).

Thank you for these links, I will read them.

I agree that in the world today, attempting to get an AI that can do this is dangerous, and we're better off slowing down the AI race. What I'm suggesting is more of a precaution for a world where the AI race has gotten to the point where this AI can be built.