One of the primary conceptual challenges of UDT is that, if future-you is going to be deferring to past-you about what to do in various circumstances, and past-you hasn't exhaustively thought through every possible circumstance ahead of time, that causes a tension. In order for deferring to past-you to produce acceptable results, past-you's beliefs can't be an explicit list. Rather, the beliefs of past-you have to be lazily-computed/implicit/queryable. Past-you needs to be the sort of thing that can be fed a wide range of (nontrivial) queries, and respond with "hang on, lemme think about that", and return vaguely sensible answers.

And so, we run into the question of "if we've only done a finite amount of thinking, how do we sensibly extrapolate that finite amount of information out to the complicated long questions we might be asked?" There's another question which shows up. Namely, "if we permit more computation resources to be spent on answering long questions, doesn't that let us smuggle increased amounts of computation and thinking time into our "fixed" belief state, which isn't supposed to incorporate additional information?". Abram and Martin addressed these questions from a different angle, though my preferred solution is a bit different from theirs, and I don't think my solution is the last word on this topic.

Logical inductors, as specified in the original paper, fail this desideratum that their beliefs can operate in query mode. When deciding on how many shares to buy/sell in a sentence, the traders can peek at the prices/probabilities of any other piece of math they can name. Because of this, a logical inductor's beliefs about one piece of math are entangled with their beliefs about all the rest of math. You can't ask about the probability of a sentence in isolation, you have to work out the entire belief state in one shot. Also, at any given day n, there are poly(n) traders, which can only write down a poly(n)-length bet, so explicitly enumerating our beliefs would result in us having beliefs about ~polynomially many statements. However, an algorithm that accepts queries is, in principle, the sort of thing which could have opinions about exponentially many different questions. For instance, there are exponentially many questions of the form "If I take action A in situation h, what effect does that have on expected utility now?"

Another obstacle with logical inductors is budgeting. The fundamental trick which logical inductors use is to have a finite amount of money sloshing around in the market, and going "if the beliefs kept being bad forever, someone could extract infinite money, which contradicts the finite amount of money sloshing around in the market". This fundamental trick breaks if there's a way of injecting money into the market. Accordingly, when a trader makes a bet, the money it put into that bet is treated as lost until proven otherwise. So, in order to do budgeting properly, you'd have to keep track of how much money the traders would spend on all of the exponentially-many queries which could show up, to be sure they aren't overdrawing their budget. Again, this assumes an unrealistic level of computational prowess.

Sequence Prediction as a Toy Problem

It's complicated to think about all of mathematics. Lets think about a smaller problem, sequence prediction with finite computing resources.

There's some infinite sequence of bits. Lets say the statements we can bet on are of the form "the nth bit will be a 1 (or 0)". If we threw the logical inductor from the original LI paper at this problem, then on day n, we'd tend to have decent probabilities on the first poly(n) bits, and decent probabilities on the bits that are part of easily-computable patterns of numbers (like, bit 10, bit 100, bit 1000, bit 10000, etc). The rest of our beliefs about other bits we haven't put much thought into might be garbage, which causes a problem if an agent (at some late time) is deferring to its early beliefs about the probabilities of some of those bits.

And what does it even mean to have "beliefs about bits that operate in query mode"?

Well, that question seems to have a simple answer. Any algorithm works like that. You feed in a number, you get a probability out. We could restrict to algorithms which are linear time in the length of the number (so asking about the probabilities on the 145253846'th bit permits spending time thinking about it), or quadratic time, or exponential time, ie, linear in the number (so asking about the probabilities on the 145253846'th bit permits spending time to think about it), or quadratic time in the number... and so on.

So, instead of having lookup-table style beliefs (and with each day, the lookup table gets bigger) a "logical inductor in query mode" would be the sort of thing that has algorithm-style beliefs to handle the (possibly complicated) queries it may be fed by its future self (and with each day, the algorithm gets better).

Lets take a digression to counterlogical mugging. Let's say the nth bit in the sequence is whether the nth digit of is even or odd.

Counterlogical Mugging and Computation Smuggling

Counterlogical Mugging is the decision theory problem where Omega is like "I didn't know the parity of the trillionth digit of pi. I committed that if the trillionth digit was even, I'd ask you for 10 dollars, and if it was odd, I'd give you 100 dollars if you'd give me 10 dollars if it was even. By the way, the trillionth digit of pi is even. Can I have ten bucks?"

This is tricky because, the world where you get the money is straight-up logically impossible. Still, it intuitively feels like you should pay up here and that the situation isn't essentially different from the normal framing of Counterfactual Mugging.

But what do you do if Omega is like "I didn't know the parity of the third digit of pi. I committed blahblahblah. By the way, the third digit of pi is a 4. Can I have ten bucks?"

Intuitively, you might as well save your money there. But where's the dividing line? Wouldn't you want to commit now that if you grew up and memorized more digits of pi (than the six digits you already know, 3.14159), that you'd still pay up if faced with a counterlogical mugging on those more distant digits? The desideratum of dynamic consistency (where the past agent shouldn't want to bind the decision-making of the future agent) would indicate that future-you should defer to your early beliefs about math. If future-you wouldn't defer to your early math beliefs in problems like this, then your performance in Counterlogical Mugging would suffer as you get smarter, and you shouldn't ever wish you were dumber on problems like that.

It's not just Counterlogical Mugging. If you're a really big powerful agent and you're trying to coordinate with a really small agent and it goes "I'll skip on the cooperation unless it's legible to me that you'll use quick simple reasoning to decide that you won't betray me with your big brain", then having the ability to fully simulate the small agent and see that it'll defect (so you might as well defect too), hurts you. From the perspective of early-you, late-you could have gotten mutual cooperation if it had just gone "alright, deferring to my early simple beliefs about math and game theory here" instead.

And so, for problems like these, it seems worrying if you smuggle in extra computational power when you get asked about longer and more complex problems. However, how could you avoid using extra computational resources if you get asked about longer math sentences and more complex problems? I mean, at minimum, you need more computation time just to read your input and comprehend the question you're being asked, right?

Yup, there's a tension here. However, the "increase in computation abilities from being asked longer questions" can be extremely modest, all the way down to "linear in the length of the question you're being asked". If, when asked about the trillionth digit of pi, you can only spend seconds thinking about its parity, and when asked about the quadrillionth digit, you can only spend seconds, then that "increased thinking time that's smuggling in additional computational resources" is linear in the length of time it takes for Omega to state the numbers and for you to comprehend the problem, and that's not remotely sufficient to threaten your performance on Counterlogical Mugging. Asking for any more updatelessness than that seems unfair. How are you expected to make good decisions with too little thinking ability to even comprehend the problem?

This behavior, where early-you might have questions referred back to it, and early-you can spend an additional (linear in length of the question) amount of thinking time on those distant questions, has some neat implications. First, it probably gets counterlogical mugging right. Second, it probably gets "legible coordination with small simple things" right. Third, it sorta makes sense. If postsingularity god-me is in a situation where it has to make really legible decisions for simple reasons and does this by punting the reasoning back to current-me, it feels vaguely appropriate to spend, say, 7x as much time on deciding what to do as it takes to understand future-me's situation. If it takes me a full day to comprehend the necessary context, that problem is complicated enough that I should be able to take a week to ponder my response without getting penalized for complicated and illegible thoughts. Asking me to make a decision in 10 seconds after comprehending the problem feels unfair, and giving me a hundred years to think about the problem feels like worrying behavior on legible-reasoning problems.

Fourth, this approach says that on the sequence prediction problem where the bits are 0 on the even numbers and 1 on the odd numbers, it's a simple enough problem that you shouldn't go along with Counterlogical Mugging. The appropriate response to Omega going "I didn't know the parity of 206892346. I committed that if it was even..." is "I might be updateless, but I'm not that updateless. It's obviously even, I can figure that out almost immediately after the question gets asked. I'm not paying you. Try again with digits of pi".

Fifth, there's a nifty thing going on where this approach (thinking time is linear in the length of the question) can emulate approaches that allow you to think longer (like, thinking time is quadratic in the length of the question), by encoding problems inefficiently. As an extreme example of this, we can take Counterlogical Mugging with unary encodings. A unary encoding of numbers is one where five gets encoded as 11111 and ten gets encoded as 1111111111. Let's take the sequence prediction problem where most bits are 0, but on numbers that are valid unary encodings, that bit is "parity of that digit of pi". So, the 111111111111'th bit would be 0 if the 10th digit of pi is even, and 1 if the 10th digit of pi is odd. Since the length of the problem is so disproportionate to its difficulty, you can throw a lot more effort into computing digits of pi. If Omega insists on running a Counterlogical Mugging with digits of pi, and also insists on representing "the billionth digit" by saying "pi's 1+1+1+1... (several months pass)... +1+1'th digit", this principle of "allocate thinking time that's linear in the amount of time needed to comprehend the problem" would go "I've computed the digit in comparable time that it took for you to state the problem, I'm keeping my money".

Hilbert Hotel's Room Relay

So, if the sort of thing we're after is a logical inductor in "algorithm mode", where the day n version of the inductor isn't a lookup table, but is rather a process that you can direct queries to about arbitrary sentences, which spends more time to think about longer sentences, how do you do that?

One of the really tricky parts of this is that you can't get every nice-sounding property at once. My solution gets past the fundamental obstacle of how to make "large bets" on unboundedly many sentences, but there are a number of additional technical obstacles that arise.

I can't prove, but have a vague sense, that pretty much any approach to the fundamental problem of "how do we get an early belief state to have opinions about all the information that could show up later" would run into those same obstacles in some other guise. I spent a whole lot of time bashing my head against the problem of getting a logical inductor to behave "just so" in order to make the UDT1.01 algorithm behave vaguely sanely when run, and got lost in swamps of fiddly technical details when I did so. However, it does feel like a problem that it's possible to make considerable forward progress on, and I'd strongly encourage someone else to try to make progress on it. No hero license needed for this one, you'll learn a lot from the obstacles you run into, and very likely get further than I did!

Anyways, without further ado, here's my stab at solving the problem.

The fundamental problem that needs to get solved in any attempt at getting an early logical inductor state to have opinions about late sentences, is how to get a trader to make "large bets", on arbitrarily many sentences. The standard formulation of a trader, as some algorithm which spits out a circuit mapping prices of some sentences to buy/sell orders on other sentences, won't cut it here. It's too lookup-table-like. Explicitly naming, in advance, all the things you're going to have opinions on, doesn't mesh with the ability to recieve a query and think about it. So, my first attempt to get past this, would probably be something like...

A trader is an algorithm where, when you feed it a math sentence and a day n, it spends time writing a circuit mapping prices of other sentences to buy/sell orders on . Instead of just recieving the day as input, a trader can also recieve a sentence as input.

This immediately runs into two fatal issues. The first fatal issue is, how do you make sure the trader stays within its budget the whole time? You'd have to work out its attempted trades on all the other sentences (there are infinitely many of them) to figure out whether the trader runs out of money or not, and whether to eliminate it.

The second fatal issue is, if all the prices/probabilities on the math sentences are worked out by solving a really big fixpoint problem, and the prices for everything depend on the prices of everything else, you're actually solving an infinite-dimensional fixpoint problem, where all the variables depend on all the other variables. This might be kinda hard. And by hard I mean uncomputable.

There's a full-generality solution further down, which I was originally going to present first, but it's harder to understand than the original motivating example. So here's the original motivating model.

Let's say we've got the Hilbert Hotel, with infinitely many floors. All the sentences have a room on floor . Ie, all the n-length sentences have a room on floor n. In particular, note that each floor has finitely many inhabitants. The day will be denoted by .

The rules are as follows. On day , you are allowed to take some money from the bank, and run up the hotel with that money to make bets on various sentences. But you're limited in the information you can use to make those bets. If you're on floor on day , and want to base your buy/sell decisions on the price of sentence on day , you can only do it if and . Ie, no peeking at the price of sentences on higher floors, and no using information from the future. On any particular floor, if you've got money left over, you can leave some of it on the ground (to pick up when you pass that floor tomorrow), and take the rest of your money to the next floor.

This stratification actually lets you have a sensible notion of "large bets on infinitely many statements" that solves both the infinite fixpoint problem, and the budgeting problem!

The fixpoint problem, first. Note that, if you're trying to work out the prices of the various sentences on floor n, day m, you don't need to solve an infinite fixpoint problem to do that, since you can only ask about the prices of sentences of equal or shorter length, on equal or previous days. You only need to solve finitely many fixpoint problems, of finite size (one for each of the finitely many floor, day pairs you're allowed to fetch information from), to work out the price of a sentence!

As for the budgeting problem, there's no danger of overdrawing your budget with your infinitely many bets, because you have a local budget, where you specify ahead of time, how much money you're willing to spend overall. Your local budget on a floor is the amount of money you brought up from the floor below, plus the amount of money you left on that floor yesterday. You bet on finitely many sentences, which, by usual logical inductor tricks, lets you figure out how much money was spent. Then you decide how much money to leave behind on that floor, and how much money to take up to the next floor. If you spend your whole budget, well then, no betting on long sentences/higher floors for you. When a new proof arrives, the bank goes up, looks at your buying/selling history for that sentence, and goes "here's your new batch of money that we just realized you have" at the ground floor the next day.

This is how a logical inductor can have beliefs about far-distant sentences in query mode, where it retroactively computes what its beliefs would have been at some earlier time. Our earlier (failed) attempt was "a trader is an algorithm where, when you feed it a math sentence and a day n, it spends time writing a circuit mapping prices of other sentences to the number of shares of to buy/sell". This was almost correct. We just need to add a "can only look at shorter sentences" restriction to let you solve the fixpoint problems one at a time, and also add an extra "budgeter" algorithm, that, given a day and floor , tells you how to split your money between "leave it on the floor" and "take it up to the next floor".

Generalized Room Relay

So, if the key bit is having a stratification where the task of "having beliefs about all of math" gets divided into a bunch of finite-sized fixpoint problems that aren't all entangled with each other, and where there's a local budget at each "batch of sentences" that can be relayed upwards, can we get a nice formalization of that general pattern without the fiddly details like "you can't bet on sentences of length 10 by looking at the prices on sentences of length 20"? Yes. Let be the (countably infinite) set of all statements you can bet on.

Definition 1: Dependency Structure

A dependency structure , is some countably infinite poset , and a function , with the following five properties.

1 (Finite Below):

2 (Unbounded Upwards):

3 (Finitely Many Sentences Per Node):

4 (No Time Travel):

5 (Each Day Has a Ground Floor):

For our Hilbert Hotel example, would be with the usual order you'd expect, and .

In a dependency structure, the intended interpretation of c is "if , then when the trader is computing how much to buy/sell of on day , its circuit is allowed to reference the price of on day ".*

So, property 1 basically says "for any bundle of (sentence, computational effort) pairs, there are finitely many bundles below it which the prices are allowed to depend on".

Property 2 says "for any bundle, there's a bundle strictly above it".

Property 3 says "every bundle contains finitely many sentences and is nonempty".

Property 4 says "you can't have prices on an earlier day depend on prices on a later day".

For Property 5, the key bit in understanding it is that property 4 implies . So, every node in is associated with some time, and can be interpreted as "the prices of this finite set of sentences on such-and-such day". And so, is basically "the subset of which corresponds to the batches of sentences on day ". Property 5 is effectively saying that no matter the day, the poset of "batches of sentences on this day" will have a bottom node/"ground floor".

Basically, our choice of and has the effect of sorting the set of all (sentence, computational effort) pairs into bundles, which are then partially ordered by which bundles get to look at price data from which other bundles.

It's a theorem that, for every node , there's a unique nonempty set of children nodes above it. The rough proof sketch is to pick an arbitrary node (property 2), use property 1 to get that is finite, pick out the minimal nodes in that subset, and repeat, to find all of x's children.

And so, a trader consistent with the dependency structure , is a pair of a trading algorithm and a budgeting algorithm. The trading algorithm takes a pair , and outputs a circuit for how to bet on that sentence, which can only depend on prices (price of on day ) with the property that . The budgeting algorithm, when given a node , outputs a probability distribution over 's children, which is interpreted as "where to send excess money to".

The infinite fixpoint problem is solved because each bundle only contains finitely many sentences, so you've only got a finite-dimensional fixpoint problem to solve (if prices are known for bundles further down in P), and property 1 implies there's only finitely many other fixpoint problems you need to solve first, to fill in that missing data.

The budgeting problem is solved because we need four components. The first component is "money flowing in". Just use property 1, and find the incoming flow of money from each of your "parent" nodes that are one step further down, and there are finitely many of them. The second component is computing "money spent". Well, there's only finitely many sentences to spend money on, and by usual logical inductor tricks, you can find how much you spent and how much money is left over. The third component is computing "money relayed upwards", which the budgeter algorithm solves. And the fourth is recieving money from finding out that you bought shares in a sentence that has been revealed as true. This can be computably done (since there's only finitely many nodes where you could have bought shares in the sentence of interest, and only finitely many fixpoint problems need to be solved to figure out how many shares you bought), and property 5 says that there's a natural place for you to recieve that money on the next day (the "ground floor", where the bank is, and money can be relayed up to everywhere else from there). If you want to send money down to the ground floor from higher floors, just buy shares in a long tautology. If you spend 5 dollars buying 5 shares of an obviously true sentence, then when that sentence is revealed as provably true, you recieve 5 dollars on the ground floor.

So, given a dependency structure of your choice, you can make a logical inductor have early beliefs about arbitrary late sentences.

Properties 1 and 3 are the really key ones. Property 2 says that excess money never gets stuck, there's always somewhere to send it to. Property 4 is a basic sanity condition which ensures compatibility with the linear order of time, and property 5 says that there's a start point where we can get money and use it however we want.

Nonobvious Tail Problems

The first nonobvious problem is that this approach seriously limits the sorts of Dutch book arguments you can pull off. Let's take the concrete example where expected utility should equal the expectation of expected utility. The Dutch book argument is pretty simple. If your expected utility is below what you think expected utility will (on average) be in the future, then you can buy a share of expected utility now (it's cheap, you only spend a little bit of money), and arrange some sequence of conditional bets that's guaranteed to lose you one share of expected utility in the future (you recieve more money upfront, because you're selling a high-priced thing), but gives you a bunch of money up front. When the future comes around, you have guaranteed money, no matter what happens.

Except that you can't actually do this, because that Dutch book relies on looking at the price of future expected utility, and using that to place bets on current expected utility. You just looked at more complex/longer sentences to decide whether to buy/sell shares in a short sentence. If we want our fixpoint finding to be doable, this can't be done.

Or, well, it can sorta be done if you let all sufficiently short sentences depend on each other's prices. Put another way, if the early floors in the Hilbert Hotel are all merged together, you have some limited ability to go "future expected utility is high, current expected utility is low, I'm gonna Dutch-book this". However, there has to be some finite threshold where you stop trying to place bets on short sentences from looking at the prices on long sentences. Put another way, to have a sensible notion of "limited computation", we can only look finitely far into the future for expected utility calculations. It's perfectly fine to expect that things will be bad, and if queried about some specific scenario, go "hang on, actually I just realized that it's probable that things will be good". However, that sort of belated realization can't get propagated back to your initial guess.

It sounds pretty obvious when stated that way, but this limitation does get in the way of some inexploitability arguments. For instance, I have a Dutch book argument of that form, which goes "if the change in expected utility from playing algorithm A in situation h doesn't equal the expectation of (complicated equation), I can extract guaranteed money". I used this Dutch book argument to argue that an agent would want to rewrite into playing the algorithm that optimized (complicated equation). However, the Dutch book which does that is of the forbidden form, where you look at late information to go "well, guess I should buy/sell shares in expected utility right now". I haven't resolved this problem.

The second nonobvious problem has to do with influence measures. To get sane actions in reality when running a UDT algorithm, the influence measures have to be sufficiently bounded. However, logical inductors can drive sentences to extreme prices with very little trading done. If people are spending fractions of a cent betting with each other about some event, inferring the probability of that event can be done no matter how little money they spend on it. So, the inductor market can say, for lots of situations, "yup I expect a notable change in expected utility if we do this action in this situation!". And this is the exact sort of behavior which (by influence measure arguments), tends to permanently lock in terrible decisions. But the traders don't get burned too much if they're wrong about influence because they put negligible money down.

Interestingly enough, auctions have the desired behavior that fixes this. In an auction, to say something has high expected value, you actually have to plop down a pile of money, and only find out later whether you won the auction. If influence measures are decided by auction mechanisms, you automatically get nice boundedness properties. This is because a mischevious trader only has finite money to throw around, so if they try to go "all these situations are really important to act properly in!", on a given day, they go broke, since the refund only comes on the next day. Figuring out exactly how to implement this this is quite tricky and currently unsolved, but it doesn't seem impossible, just annoying.

The third nonobvious problem has to do with tail behavior. With our "Hilbert Hotel" example, what happens if the prices on all the sentences look good, and you keep relaying money further up, to further floors, and never spend it on any shares, and never leave the money on the floor for later pickup? Well then, you lose your money to infinity. The bank goes "as far as we can see with our finite computational powers, you relayed money up to further floors we can't see yet. Maybe you lost it on a distant floor we can't yet see, by making stupid purchasing decisions. We assume the worst-case here, so no I won't give you a loan". In a certain sense, you "leaked money to infinity" and it got lost on the roof of the Hilbert Hotel.

For an agent to go all the way out to infinity and bet on all the sentences, and not lose its money to infinity if the sentences are well-priced, it must leave a little bit of money behind each time. The task of "taking your money up to bet on distant floors without losing it to infinity" is closely akin to specifying a probability distribution on by going "at each number, I have a pile of probability-mass/money, and I can leave a little bit behind on the number/floor I'm on, and relay the rest of the probability-mass/money forward." With this setup, you can specify some really spread-out probability measures on , which extend out amazingly far. However, it's impossible to computationally distinguish "a money-dumping strategy that produces a probability distribution on " and "a money-dumping strategy that leaks some finite amount of money to infinity".

And, if a trader is immune to "losing money to infinity" (ie, the pattern of leaving excess money behind always makes a probability distribution on ), then for every and every day m, there's some floor n (might be googologically big) where, past that, you only have money to spend. Which means there could be some more patient trader that just relays its money up that far and sets the prices on the tail of infinitely many sentences to really bad values, because it's got enough money to overpower your bets. Any trader that doesn't leak money to infinity is also unable to guarantee good tail behavior.

There's even a finite analogue of this problem. Let's take a trader that, on day m, if it has any money, will ascend up the Hilbert Hotel making trades in mispriced sentences. If it gets up to the floor with any money left, it buys a long tautology, for the bank to find and go "ah, this share you bought is worth money!" and give the remaining money back on day . This trader does indeed accumulate money. If it makes good trades, it will eventually be able to enforce that the sentences up to length have sensible probabilities when it's active. However, most of the time, this agent is waiting around to get its money back, and it will tend towards only being active on a really sparse subsequence of days like , because making good bets on really distant sentences means you're doing a lot of waiting to get your money back. The extreme power of this trader at making sure distant sentences are priced sensibly comes at the cost of only being active on an incredibly sparse subsequence of days.

In general, you can't get "make sure that sentences are priced sensibly out to incredibly far out", and "this nice behavior happens on a typical day" to coexist, because any trader that's taking appreciable amounts of money out to incredibly distant sentences has to wait a really long time to get its money back, and will, on average, not have money to spend on enforcing its sane prices.

The full tradeoff between good behavior, and subsequence sparsity, is pretty complicated, and I don't remember the exact theorem which governs it. I'm pretty much positive that you can get something of the form "for any c>0, in the limit, on day n, sentences of length up to cn will be well-priced". I also think this can be boosted to any polynomial, if I'm correctly remembering what past-me showed. But past that, you get nontrivial tradeoffs between the sparsity of the subsequence you're enforcing good behavior on, and how far you're trying to go in establishing good behavior. It's sort of like having a time horizon of "sensible beliefs" that isn't a constant distance ahead of you (like geometric time discount), but runs ahead of you at a superlinear (or superpolynomial?) rate.

The general takeaway is that we unavoidably have to plan for our early beliefs about really really really late sentences to be total garbage, because we don't have guarantees about tail behavior. That's yet another reason to enforce that influence measures decay hard enough, so the super-distant future us goes "ok, all the early me's have totally garbage beliefs about my situation, but they also think that whatever I do doesn't affect them, so I'm gonna ignore them and consult a later version of me".

New Comment
2 comments, sorted by Click to highlight new comments since:

It seems to me that the problem in the counterlogical mugging isn't about how much computation is required for getting the answer. It's about whether you trust Omega to have not done the computation beforehand, and whether you believe they actually would have paid you, no matter how hard or easy the computation is. Next to that, all the other discussion in that section seems irrelevant.

Note that I'm waiting for the entire sequence to be published before I read it (past the first post), so here's a heads up that I'm looking forward to seeing more of this sequence!