I'm interested in thinking formally about AI risk. I believe that a proper mathematization of the problem is important to making intellectual progress in that area.

I have been trying to understand the rather critical notion of *optimization power*. I was hoping that I could find a clear definition in Bostrom's *Superintelligence*. But having looked in the index at all the references to *optimization power* that it mentions, as far as I can tell he defines it *nowhere*. The closest he gets is defining it in terms of rate of change and recalcitrance (pp.62-77). This is an empty definition--just tautologically defining it in terms of other equally vague terms.

Looking around, this post by Yudkowksy, "Measuring Optimization Power" doesn't directly formalize optimization power. He does discuss how one would predict or identify if a system were the result of an optimization process in a Bayesian way:

The quantity we're measuring tells us *how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities.* To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.

This is not, however, a definition that can be used to help identify the pace of AI development, for example. Rather, it is just an expression of how one would infer anything in a Bayesian way, applied to the vague 'optimization process' phenomenon.

Alex Altair has a promising attempt at formalization here but it looks inconclusive. He points out the difficulty of identifying optimization power with just the shift in the probability mass of utility according to some utility function. I may be misunderstanding, but my gloss on this is that defining optimization power purely in terms of differences in probability of utility doesn't say anything substantive about how a process has *power*. Which is important it is going to be related to some other concept like recalcitrance in a useful way.

Has there been any further progress in this area?

It's notable that this discussion makes zero references to computational complexity, formally or otherwise. That's notable because the informal discussion about 'optimization power' is about speed and capacity to compute--whether it be brains, chips, or whatever. There is a very well-developed formal theory of computational complexity that's at the heart of contemporary statistical learning theory. I would think that the tools for specifying optimization power would be in there somewhere.

Those of you interested in the historical literature on this sort of thing may be interested in cyberneticist's Rosenblueth, Weiner, and Bigelow's 1943 paper "Behavior, Purpose and Teleology", one of the first papers to discuss machine 'purpose', which they associate with optimization but in the particular sense of a process that is driven by a negative feedback loop as it approaches its goal. That does not exactly square with an 'explosively' teleology. This is one indicator that explosively purposeful machines might be quite rare or bizarre. In general, the 20th century cybernetics movement has a lot in common with contemporary AI research community. Which is interesting, because its literature is rarely directly referenced. I wonder why.

I see two main ways to deal mathematically with these optimization processes:

1)The first is an 'whatever-it-takes' process that realizes a goal function ideally (in the limit). To get a feel how the mathematics looks I suggest a look at the comparable mathematics of the operational amplifier (short op-amp).An ideal op-amp also does whatever it takes to realize the transfer function applied to the input. Non-ideal i.e. real op-amps fail this goal but one can give operating ranges by comparing the parameters of the tranfer function elements with the prameters (mostly the A_OL) of the op-amp.

I think this is a good model for the limiting case because we abstract the 'optmization process' as a black box and look at what it does to its goal function - namely realize it. We just can make this mathematcally precise.

2)My second model tries to model the differential equations following from EYs description of Recursive Self-Improvement (RSI) namely the PDEs relating "Optimization slope", "Optimization resources", "Optimization efficiency" with actual physical quantities. I started to write the equations down and put a few into Wolfram Alpha but didn't have time to do a comprehensive analysis. But I'd think that the resulting equations form classes of functions which could be classified by their associated complexity and risk.And when searching for RSI look what I found:

Mathematical Measures of Optimization Power

1) This is an interesting approach. It looks very similar to the approach taken by the mid-20th century cybernetics movement--namely, modeling social and cognitive feedback processes with the metaphors of electrical engineering. Based on this response, you in particular might be interested in the history of that intellectual movement.

My problem with this approach is that it considers the optimization process as a black box. That seems particularly unhelpful when we are talking about the optimization process acting on itself as a cognitive process. It's easy to imagine that such a thing could just turn itself into a superoptimizer, but that would not be taking into account what we know about computational complexity.

I think that it's this kind of metaphor that is responsible for "foom" intuitions, but I think those are misplaced.

2) Partial differential equations assume continuous functions, no? But in computation, we are dealing almost always with discrete math. What do you think about using concepts from combinatorial optimization theory, since those are already designed to deal with things like optimization resources and optimization efficiency?

Interesting. I know a bit about cybernetics but wasn't consciously aware of a clear analog between cognitive and electrical processes. Maybe I'm missing some background. Could you give a reference I could follow up on?

That is a plausible interpretation. Fooming is actually the only valid interpretation given an ideal black-box AI modelled this way. We have to look into the box which is comparable to looking at non-ideal op-amps. Fooming (on human time-scales) may still be be possible, but to determine that we have to get a handle on the math going on inside the box(es).

One could formulate discrete analogs to the continuous equations relating self-optimization steps. But I don't think this gains much as we are not interested in the specific efficiency of a specific optimization step. That wouldn't work anyway simply because the effect of each optimization step isn't known precisely, not even its timing.

But maybe your proposal to use complexity results from combinatorial optimization theory for

specificfeedback types (between the optimization stages outlined by EY) could provide better approximations to possible speedups.Maybe we can approximate the black-box as a set of nested interrelated boxes.

Norbert Wiener is where it all starts. This book has a lot of essays. It's interesting--he's talking about learning machines before "machine learning" was a household word, but envisioning it as electrical circuits.

http://www.amazon.com/Cybernetics-Second-Edition-Control-Communication/dp/026273009X

I think that it's important to look inside the boxes. We know a lot about the mathematical limits of boxes which could help us understand whether and how they might go foom.

Thank you for introducing me to that Concrete Mathematics book. That looks cool.

I would be really interested to see how you model this problem. I'm afraid that op-amps are not something I'm familiar with but it sounds like you are onto something.

Thank you for the book. Just ordered it.

You're welcome. It is the most fun math book I ever read.

Currently it is just a bunch of PDEs on paper. But I really want to write a post on this as this could provide some mathematical footing for many of the fooming debates.

One problem I'm stumbling with is the modelling of hard practical physical limits on computational processes. And I mean really practical limits that take thermodynamic into account, not these computronium bounds that are much too high. Something that takes entropic cost of replication and message transfer into account.

It's not much, but: see our brief footnote #3 in IE:EI and the comments and sources I give in What is intelligence?

Thanks. That's very helpful.

I've been thinking about Stuart Russell lately, which reminds me...bounded rationality. Isn't there a bunch of literature on that?

http://en.wikipedia.org/wiki/Bounded_rationality

Have you ever looked into any connections there? Any luck with that?

You might say bounded rationality is our primary framework for thinking about AI agents, just like it is in AI textbooks like Russell & Norvig's. So that question sounds to me like it might sound to a biologist if she was asked whether her sub-area had any connections to that "Neo-Darwinism" thing. :)

That makes sense. I'm surprised that I haven't found any explicit reference to that in the literature I've been looking at. Is that because it is considered to be implicitly understood?

One way to talk about optimization power, maybe, would be to consider a spectrum between unbounded, LaPlacean rationality and the dumbest things around. There seems to be a move away from this though, because it's too tied to notions of intelligence and doesn't look enough at outcomes?

It's this move that I find confusing.

Generally speaking, given a decision problem and a strategy to solve it, one way to measure it''s quality is the "regret"): the difference (or the ratio) between the payoff of the theoretically optimal strategy and the payoff of the strategy under consideration.

If the strategies are algorithms, then you can further refine the concept by including resource constraints (e.g. running in polynomial time, or running within X seconds).

In general, I don't think there is really a definition that fits well all cases in a non-trivial way: a clock optimizes keeping time, a rock optimizes laying around and occasionally falling, but we don't usually think of these things as agents trying to optimize an utility function.

See some of my criticisms of the concept: http://lesswrong.com/lw/cxk/thoughts_and_problems_with_eliezers_measure_of/

Thanks. That criticism makes sense to me. You put the point very concretely.

What do you think of the use of optimization power in arguments about takeoff speed and x-risk?

Or do you have a different research agenda altogether?

As an informal concept, it's a good one (and better than "intelligence"). Just as long as its not taken too literally.

Thanks for the references, though the link doesn't work (and is missing a space after it). I think it should point here.

EDIT: This post is old, so the link was probably broken in the move to LW 2.0.

String A and B are descriptions of worldstates. A is the received string, B is the desired string. It is not clear that there is a simple operation that generates a string "between" A and B. It seems intuitively like we should look to the VNM-axioms for a simple formal model of "betweeness" I think. Not sure.

A powerful optimizer can perform transforms between more disparate As and Bs maybe.

You could compare the structure of the compression (minimal program to represent) of the two strings. Though this asks the same question only on another level.