It's sort of like the difference between a programmable computer vs an arbitrary blob of matter. 

This is close to what I meant: my neurons keep doing something like reinforcement learning, whether or not I theoretically believe that's valid. "I in fact cannot think outside this" does address the worry about a merely rational constraint.

On the other hand, we do want AI to eventually consider other hardware, and that might even be necessary for normal embedded agency, since we don't fully trust our hardware even when we don't want to normal-sense-change it.

To sum up, meaning in this view is broadly more inferentialist and less correspondence-based: the meaning of a thing is more closely tied with the inferences around that thing than with how that thing corresponds to a territory. 

I broadly agree with inferentialism, but I don't think that entirely addresses it. The mark of confused, rather than merely wrong, beliefs is that they don't really have a coherent use. So for example it might be that there's a path through possible scenarios leading back to the starting point, where if at every step I adjust my reaction in a way that seems appropriate to me, I end up with a different reaction when I'm back at the start. If you tried to describe my practices here, you would just explicitly account for the framing dependence. But then it wouldn't be confused! That framing-dependent concept you described also exists, but it seems quite different from the confused one. For the confused concept, it's essential that I consider it not dependent in this way. But if you try to include that in your description, by also describing the practices around my meta-beliefs about the concept, and the meta-meta-beliefs, and so on, then you'd end up also describing the process by which I recognized it as confused and revised it. And then we're back in the position of already having recognized that it's bullshit.

When you were only going up individual meta-levels, the propositions logical induction worked with could be meaningful even if they were wrong, because they were part of processes outside the logical induction process, and those were sufficient to give them truth-conditions. Now you want to determine both what to believe and how those beliefs are to be used in one go, and it's undermining that, because the "how beliefs are to be used" is what foundationalism kept fixed, and which gave them their truth conditions.

I'm not seeing that implication at all!

Well, this is a bit analogy-y, but I'll try to explain. I think there's a semantic issue with anthropics (indeed, under inferentialism all confusion can be expressed as a semantic issue). Things like "the probability that I will have existed if I do X now" are unclear. For example, a descriptivist way of understanding conditional probabilities is something like "The token C means conditional probability iff whenever you believe xCy = p, then you would believe P(y) = p if you came to believe x". But that assumes not only that you are logically perfect but that you are there to have beliefs and answer for them. Now, most of the time it's not a problem if you're not actually there, because we can just ask what you would believe if you were there (and you somehow got oxygen and glucose despite not touching anything, and you could see without blocking photons, etc., but let's ignore that for now), even if you aren't actually. But this can be a problem in anthropic situations. Normally, when a hypothetical involves you, you can just imagine it from your perspective, and when it doesn't involve you, you can imagine you were there. But if you're trying to imagine a scenario that involves you but you can't imagine it from your perspective, because you come into existence in it, or you have a mental defect in it, or something, then you have to imagine it from the third person. So you're not really thinking about yourself; you're thinking about a copy, which may be in quite a different epistemic situation. So if you can conceptually explain how to have semantics that accounts for my making mistakes, then I think that would probably be able to account for my not being there as well (in both cases, it's just the virtuous epistemic process that is missing). And that would tell us how to have anthropic beliefs, and that would unknot the area.


we have an updating process which can change its mind about any particular thing; and that updating process itself is not the ground truth, but rather has beliefs (which can change) about what makes an updating process legitimate.

This should still be a strong formal theory, but one which requires weaker assumptions than usual

There seems to be a bit of a tension here. What you're outlining for most of the post still requires a formal system with assumptions within which to take the fixed point, but then that would mean that it can't change its mind about any particular thing. Indeed it's not clear how such a totally self-revising system could ever be a fixed point of constraints of rationality: since it can revise anything, it could only be limited by the physically possible.

This is most directly an approach for solving meta-philosophy.

Related to the last point, your project would seem to imply some interesting semantics. Usually our semantics are based on the correspondence theory: we start by thinking about what the world can be like, and then expressions get their meaning from their relations to these ways the world can be like, particularly through our use. (We can already see how this would take us down a descriptive path.) This leads to problems where you can't explain the content of confused beliefs. For example, most children believe for some time that their language's words for things are the real words for them. If you're human, you probably understand what I was talking about, but if some alien was puzzled about what the "real" there was supposed to mean, I don't think I could explain it. Basically, once you've unconfused yourself, you become unable to say what you believed.

Now if we're foundationalists, we say that that's because you didn't actually believe anything, and that it was just a linguistic token passed around your head while failing to be meaningful, because you didn't implement The Laws correctly. But if we want to have a theory like yours, it treats this cognitively, and so such beliefs must be meaningful in some sense. I'm very curious what this would look like.

More generally this is a massively ambitious undertaking. If you succeeded it would solve a bunch of other issues, not even from running it but just from the conceptual understanding of how it would work. For example in your last post on signalling you mentioned:

First of all, it might be difficult to define the hypothetical scenario in which all interests are aligned, so that communication is honest. Taking an extreme example, how would we then assign meaning to statements such as "our interests are not aligned"?

I think a large part of embedded agency has a similar problem, where we try to build our semantics on "If I was there, I would think", and apply this to scenarios where we are essentially not there, because we're thinking about our non-existence, or about bugs that would make us not think that way, or some such. So if you solved this, it would probably just solve anthropics as well. On the one hand this is exciting; on the other, it's a reason to be sceptical. And all of this eerily reminds me of German Idealism. In any case, I think this is very good as a post.

Signalling & Simulacra Level 3

Where do these crisp ontologies come from, if (under the signalling theory of meaning) symbols only have probabilistic meanings?

There are two things here which are at least potentially distinct: the meaning of symbols in thinking, and their meaning in communication. I'd expect these mechanisms to have a fair bit in common, but specifically the problem of alignment of the speakers which is addressed here would not seem to apply to the former. So I don't think we need to wonder here where those crisp ontologies came from.

This is the type of thinking that can't tell the difference between "a implies b" and "a, and also b" -- because people almost always endorse both "a" and "b" when they say "a implies b". 

One way to eliminate this particular problem is to focus on whether the speaker agrees with a sentence if asked, rather than on spontaneous assertions. This fails when the speaker is systematically wrong about something, or when Cartesian boundaries are broken, but other than that it seems to take out a lot of the "association" problems.

None of this is literally said, but a cloud of conversational implicature surrounds the literal text. The signalling analysis can't distinguish this cloud from the literal meaning.

Even what we would consider literal speech can depend on implicature. Consider: "Why don't we have bacon?" "The cat stole it." Determining which cat "the cat" refers to requires Gricean reasoning, and the phrase isn't compositional, either.

To hint at my opinion, I think it relates to learning normativity.

I think one criterion of adequacy for explanations of level 1 is to explain why it is sometimes rational to interpret people literally. Why would you throw away all that associated information? Your proposal in that post is quite abstract; could you outline how it would address this?

Interestingly, I did think of norms when you drew up the problem, but in a different way, related to enforcement. We hold each other responsible for our assertions, and this means we need an idea of when a sentence is properly said. Now, such norms can't require speakers to be faithful to all the probabilistic associations of a sentence. That would leave us with too few sentences to describe all situations, and if the norms are to be responsive to changing expectations, they could never reach equilibrium. So we have to pick some subset of the associations to enforce, and that would then be the "literal meaning". We can see why it would be useful for this to incorporate some compositionality: assertions are much more useful when you can combine multiple, possibly from different sources, into one chain of reasoning.

Weird Things About Money

However, how I assign value to divergent sums is subjective -- it cannot be determined precisely from how I assign value to each of the elements of the sum, because I'm not trying to assume anything like countable additivity.

This implies that you believe in the existence of countably infinite bets but not countably infinite Dutch-booking processes. That seems like a strange/unphysical position to be in; if that were the best treatment of infinity possible, I think infinity is better abandoned. I'm not even sure the framework in your linked post can really be said to contain infinite bets: the only way a bet ever gets evaluated is in a bookie strategy, and no single bookie strategy can be guaranteed to fully evaluate an infinite bet. Is there a single bookie strategy that differentiates the St. Petersburg bet from any finite bet? Because if not, then the agent at least can't distinguish them, which is very close to their not existing at all here.
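To make the finite-evaluation point concrete, here is a minimal sketch (numbers illustrative): every truncation of the St. Petersburg bet is an ordinary finite bet whose expectation equals the truncation depth, so a bookie strategy that inspects only finitely many outcomes never sees more than one of these finite bets.

```python
# Truncated St. Petersburg bets: payoff 2^k with probability 2^-k for
# k = 1..n. Every truncation is an ordinary finite bet, and since each
# term contributes exactly 1, the truncated expectation is simply n.
def truncated_expectation(n):
    return sum((2 ** -k) * (2 ** k) for k in range(1, n + 1))

for n in (1, 10, 100):
    print(n, truncated_expectation(n))
# A bookie strategy that inspects only finitely many outcomes sees, at
# most, one of these finite bets, never the full divergent sum.
```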

In a case like the St Petersburg Lottery, I believe I'm required to have some infinite expectation. 

Why? I haven't found any finite dutch books against not doing so.

Perhaps you can try to problematize this example for me given what I've written above -- not sure if I've already addressed your essential worry here or not.

I don't think you have. That example doesn't involve any uncertainty or infinite sums. The problem is that for any finite n, waiting n+1 is better than waiting n, but waiting indefinitely is worse than any finite wait. Formally, the problem is that I have a complete and transitive preference between actions, but no unique best action, just a series that keeps getting better.

Note that you talk about something related in your linked post:

I'm representing preferences on sets only so that I can argue that this reduces to binary preference.

But the proof for that reduction only goes one way: for any preference relation on sets, there's a binary one. My problem is that the converse does not hold.

Weird Things About Money

I'm generally OK with dropping continuity-type axioms, though, in which case you can have hyperreal/surreal utility to deal with expectations which would otherwise be problematic (the divergent sums which unbounded utility allows).

Have you worked this out somewhere? I'd be interested to see it, but I think there are some divergences it can't address. For one, there is the Pasadena paradox, which is also a divergent sum, but one which doesn't stably lead anywhere, not even to infinity. The second is an apparently circular dominance relation: imagine you are linear in monetary consumption. You start with $1, which you can either spend or leave in the bank, which doubles it every year even after accounting for your time preference/uncertainty/other finite discounting. Now for every n, leaving it in the bank for n+1 years dominates leaving it for n years, but leaving it in the bank forever gets 0 utility. Note that if we replace money with energy here, this could actually happen in universes not too different from ours.
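The circular-dominance example can be sketched in a few lines, under its stated assumptions (utility linear in money spent, doubling already net of discounting):

```python
# Circular dominance sketch: $1 doubles each year (already net of
# discounting), and utility is linear in money spent.
def utility_of_waiting(n, start=1.0):
    return start * 2 ** n  # spend after n years of doubling

# Waiting one more year always dominates...
assert all(utility_of_waiting(n + 1) > utility_of_waiting(n) for n in range(50))
# ...but the limiting policy "never spend" receives utility 0,
# which is worse than waiting any finite number of years.
```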

What is the expectation of the self-referential quantity "one greater than your expectation for this value"?

What is the expectation of the self-referential quantity "one greater than your expectation for this value, except when that would go over the maximum, in which case it's one lower than expectation instead"? Insofar as there is an answer it would have to be "one less than maximum", but that would seem to require uncertainty about what your expectations are.
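A brute-force check of the claim that no deterministic expectation is self-consistent here, using the cap rule as described (the range size is illustrative):

```python
# The quantity: "E + 1, unless that would exceed the maximum, in which
# case E - 1 instead". On a bounded integer range, no deterministic
# expectation E equals the value it induces.
MAXIMUM = 10  # illustrative bound

def value_given_expectation(e):
    return e + 1 if e + 1 <= MAXIMUM else e - 1

consistent = [e for e in range(MAXIMUM + 1) if value_given_expectation(e) == e]
print(consistent)  # empty: self-consistency requires uncertainty about E
```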

Weird Things About Money

But I still think it's important to point out that the behavioral recommendations of Kelly do not violate the VNM axioms in any way, so the incompatibility is not as great as it may seem.

I think the interesting question is what to do when you expect many more, but only finitely many, rounds. It seems like Kelly should somehow gradually transition, until it recommends normal utility maximization in the case of only a single round ever happening. Log utility doesn't do this. I'm not sure I have anything that does, though, so maybe it's unfair to ask it from you, but still it seems like a core part of the idea, that the Kelly strategy comes from the compounding, is lost. 

And yet it doesn't violate VNM, which means the classic argument for maximizing expected utility goes through. How can this paradox be resolved? By noting that utility is just whatever quantity expectation maximization does go through for, "by definition".

This is the sort of argument you want to be very suspicious of if you're confused, as I suspect we are. For example, you can now just apply all the arguments that made Kelly seem compelling again, but this time with respect to the new, logarithmic utility function. Do they actually seem less compelling now? A little bit, yes, because I think we really are sublinear in money, and the intuitions related to that went away. But no matter what the utility function, we can always construct bets that are compounding in utility, and then bettors which are Kelly with respect to that utility function will come to dominate the market. So if you do this reverse inference of utility, the utility function of Kelly bettors will seem to change based on the bets offered.

I'm curious if you're taking a side, here, wrt which limit one should take.

Not really; I think we're too confused to say yet. I do think I understand decisions with bounded utility (all the classical foundations imply bounded utilities, including VNM; this doesn't seem to be well known here). Bounded utility makes maximization a lot more Kelly: it means that the maximizers can no longer have the arbitrarily high pay-offs that are needed to balance the near-certainty of elimination. I also think it should make it not matter which limit you take first, but I don't think that leads to Kelly either, because the betting structure that leads to Kelly assumes unbounded utility. Perhaps it would end up as a local approximation somewhere.

Now, I also think that bounded decision theory is inadequate. I think a decision theory should be able to implement a paperclip maximizer, and it should work in worlds that last infinitely long. But I don't have something that fulfills that. I think there's a good chance the solution doesn't look like utility at all: a theorem that needs its problem to be finite probably won't do well in embedded problems.

Weird Things About Money

Money wants to be linear, but wants even more to be algorithmic

I think this is mixing up two things. The first is diminishing marginal utility in consumption measured in money. This can lead to risk-averse behaviour, but it could be any sublinear function, not just logarithmic, and I have seen no reason to think it's logarithmic in actually existing humans.

if you have risk-averse behavior, other agents can exploit you by selling you insurance.

I wouldn't call it "exploit". It's not a money pump that can be repeated arbitrarily often; it's simply a price you pay for stability.

This "money" acts very much like utility, suggesting that utility is supposed to be linear in money.

Only the utility of the agent in question is supposed to be linear in this "money", and that can always be achieved by a monotone transformation. This is quite different from suggesting there's a resource everyone should be linear in under the same scaling.

The second thing is the Kelly criterion. The Kelly criterion exists because money can compound. This is also why it produces specifically a logarithmic structure. Kelly theory recommends using the criterion regardless of the shape of your utility in consumption, if you expect many more games after this one; it is much more like a convergent instrumental goal. So this:

Kelly betting is fully compatible with expected utility maximization, since we can maximize the expectation of the logarithm of money.

is just wrong AFAICT. This is compatible from the side of utility maximization, but not from the side of Kelly as theory. Of course you can always construct a utility function that will behave in a specific way - this isn't saying much.
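The utility-side construction conceded here does go through mechanically; a quick numerical check with illustrative numbers shows that maximizing expected log wealth for a binary even-odds bet recovers the standard Kelly fraction f* = 2p - 1:

```python
from math import log

# Kelly fraction for a binary even-odds bet with win probability p is
# f* = 2p - 1; maximizing expected log wealth over f recovers it.
def expected_log_wealth(f, p):
    return p * log(1 + f) + (1 - p) * log(1 - f)

p = 0.6  # illustrative win probability
grid = [i / 1000 for i in range(999)]  # candidate fractions in [0, 0.998]
f_star = max(grid, key=lambda f: expected_log_wealth(f, p))
print(f_star)  # matches 2*p - 1 = 0.2
```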

This means the previous counterpoint was wrong: expected-money bettors profit in expectation from selling insurance to Kelly bettors, but the Kelly bettors eventually dominate the market

Depends on how you define "dominate the market". In most worlds, most (by headcount) of the bettors still around will be Kelly bettors. I even think that, weighing by money, in most worlds Kelly bettors would outweigh expectation maximizers. But weighing by money across all worlds, the expectation maximizers win, by definition. The Kelly criterion "almost surely" beats any other strategy when played sufficiently long, but it only wins by some amount in the cases where it wins, and it's infinitely behind in the infinitely unlikely case that it doesn't win.
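This "most worlds vs. all worlds" split can be simulated with a toy market (all parameters illustrative): the all-in expectation maximizer has the higher analytic expectation (1.2^100 vs. 1.04^100 per unit staked), yet in essentially every sampled world the Kelly bettor ends up ahead.

```python
import random

# Toy market: repeated even-odds bets with win probability 0.6.
# The Kelly bettor stakes the Kelly fraction 0.2 each round;
# the expectation maximizer stakes everything each round.
def run(fraction, rounds=100):
    wealth = 1.0
    for _ in range(rounds):
        win = random.random() < 0.6
        wealth *= (1 + fraction) if win else (1 - fraction)
    return wealth

random.seed(0)
kelly = [run(0.2) for _ in range(1000)]
random.seed(0)  # same luck in each world, for a fair comparison
all_in = [run(1.0) for _ in range(1000)]

# Analytic expectation favors all-in, but almost every world ends with
# the all-in bettor ruined and the Kelly bettor ahead.
kelly_wins = sum(k > a for k, a in zip(kelly, all_in))
print(kelly_wins, "of 1000 worlds")
```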

Kelly betting really is incompatible with expectation maximization. It deliberately takes a lower average. The conflict is essentially over two conflicting infinities: Kelly notes that for any sample size, if there's a long enough duration, Kelly wins. And maximization notes that for any duration, if there's a big enough sample size, maximization wins.

Money wants to go negative, but can't.

A lot of what you say here goes into monetary economics, and you should ask someone in the field, or at least read up on it, before relying on any of this. Probably you shouldn't rely on it even then, if at all avoidable.

What is the interpretation of the do() operator?
But in a newly born child or blank AI system, how does it acquire causal models?

I see no problem with assuming that you start out with a prior over causal models; we do the same for probabilistic models, after all. The question is how the updating works, and whether, assuming the world has a causal structure, this way of updating can identify it.

I myself think (but I haven't given it enough thought) that there might be a bridge from data to causal models through falsification. Take a list of possible causal models for a given problem and search through your data. You might not be able to prove your assumptions, but you might be able to rule causal models out, if they suppose there is a causal relation between two variables that show no correlation at all.

This can never distinguish between different causal models that predict the same probability distribution; all the advantage this would have over purely probabilistic updating would already be included in the prior.

To update in a way that distinguishes between causal models, you need to update on information that is true for some event. Now, in this case you could allow each causal model to decide when that is true, for the purposes of its own updating, so you are now allowed to define it in causal terms. This would still need some work beyond what I wrote in the question: you can't really change something independently of its causal antecedents, at least not when we're talking about the whole world, which includes you, but perhaps some notion of independence would suffice. And then you would have to show that this really does converge on the true causal structure, if there is one.
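The falsification idea, and its limitation, can be sketched as follows (data-generating process and parameters are illustrative): a "causally unrelated" model is ruled out by a clearly nonzero correlation, but this test cannot separate X → Y from Y → X, since they imply the same correlation.

```python
import random

# Sketch: rule out candidate causal models by the independences they imply.
# Data is generated here (by assumption) from X -> Y; the candidate model
# "X and Y causally unrelated" predicts corr(X, Y) ~ 0 and gets falsified.
random.seed(1)
xs = [random.gauss(0, 1) for _ in range(10000)]
ys = [0.8 * x + random.gauss(0, 1) for x in xs]
zs = [random.gauss(0, 1) for _ in range(10000)]  # genuinely unrelated

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n
    va = sum((ai - ma) ** 2 for ai in a) / n
    vb = sum((bi - mb) ** 2 for bi in b) / n
    return cov / (va * vb) ** 0.5

print(round(corr(xs, ys), 2))  # clearly nonzero: "unrelated" is ruled out
print(round(corr(xs, zs), 2))  # near zero: "unrelated" survives
```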

What is the interpretation of the do() operator?
If Markov models are simple explanations of our observations, then what's the problem with using them?

To be clear, by total probability distribution I mean a distribution over all possible conjunctions of events. A Markov model also creates a total probability distribution, but there are multiple Markov models with the same probability distribution. Believing in a Markov model is more specific, and so if we could do the same work with just probability distributions, then Occam would seem to demand we do.
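A concrete instance of "multiple Markov models with the same probability distribution" (parameters are illustrative): the graphs X → Y and Y → X below induce exactly the same joint distribution over two binary variables, so no amount of observational data separates them.

```python
from itertools import product

# Model A: X -> Y with P(X=1) = 0.5, P(Y=1|X=1) = 0.8, P(Y=1|X=0) = 0.2
def p_a(x, y):
    py_given = {1: 0.8, 0: 0.2}[x]
    return 0.5 * (py_given if y else 1 - py_given)

# Model B: Y -> X with P(Y=1) = 0.5, P(X=1|Y=1) = 0.8, P(X=1|Y=0) = 0.2
def p_b(x, y):
    px_given = {1: 0.8, 0: 0.2}[y]
    return 0.5 * (px_given if x else 1 - px_given)

# Distinct causal structures, identical joint distribution.
assert all(abs(p_a(x, y) - p_b(x, y)) < 1e-12
           for x, y in product([0, 1], repeat=2))
```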

The surface-level answer to your question would be to talk about how to interconvert between causal graphs and probabilities... But you can google this or find it in Pearl's book Causality.

My understanding is that you can't infer a causal graph from just a probability distribution. You need either causal assumptions or experiments to do that, and experimenting involves do()ing, so I'm asking whether it can be explained what do()ing is in non-causal terms.

I'd just like you to think more about what you want from an "explanation." What is it you want to know that would make things feel explained?

If there were a way to infer causal structure from just probability distributions, that would be an explanation. Inferring it from something else might also work, but it depends on what the something is, and I don't think I can give you a list of viable options in advance.

Alternatively, you might say that causality can't be reduced to something else. In that case, I would like to know how I come to have beliefs about causality, and why this gives true answers. I have something like that for probability distributions: I have a prior and a rule to update it (how I come to believe it), and a theorem saying that if I do that, in the limit I'll always do at least as well as my best hypothesis with probability ≠ 0 in the prior (why it works).
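The prior-plus-update guarantee mentioned here can be sketched concretely: a Bayesian mixture's log score trails its best component by at most the negative log of that component's prior weight. (The coin hypotheses and data below are illustrative.)

```python
from math import log

# Two hypotheses about a coin: P(heads) = 0.7 vs 0.3, each with prior 1/2.
prior = {0.7: 0.5, 0.3: 0.5}
data = [1, 1, 0, 1, 1, 1, 0, 1]  # observed flips, 1 = heads

def log_score(p_heads, seq):
    # log probability a single hypothesis assigns to the sequence
    return sum(log(p_heads if x else 1 - p_heads) for x in seq)

def mixture_log_score(prior, seq):
    # log probability the Bayesian mixture assigns, updating as it goes
    total, post = 0.0, dict(prior)
    for x in seq:
        p = sum(w * (h if x else 1 - h) for h, w in post.items())
        total += log(p)
        post = {h: w * (h if x else 1 - h) / p for h, w in post.items()}
    return total

best = max(log_score(h, data) for h in prior)
gap = best - mixture_log_score(prior, data)
# The mixture never falls behind the best hypothesis by more than
# -log(prior weight) = log 2, no matter how long the sequence is.
assert 0 <= gap <= log(2) + 1e-9
```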

Towards a Formalisation of Logical Counterfactuals

Hello Darmani,

I'm indeed talking about a kind of counterfactual conditional, one that could apply to logical rather than causal dependencies.

Avoiding multiple antecedents isn't just a matter of what my conceptual toolkit can handle; I could well have done two different types of nodes, which would have represented it just fine. However, restricting inferences this way makes a lot of things easier. For example, it means that all inferences to any point "after" come only from propositions that have come through . If an inference could have multiple antecedents, then there would be inferences that combine a premise derived from with a premise in , and it's not clear whether the separation can be kept here.

From the paper you linked: their first definition of the counterfactual (the one where the consequent can only be a simple formula) describes the causal counterfactual (well, the indeterministic-but-not-probabilistic protocols throw things off a bit, but we can ignore those), and the whole "tremble" analysis closely resembles causal decision theory. Now, their concept of knowledge is interesting, because it's defined with respect to the ecosystem, and so is implicitly knowledge of the other agents' strategies, but the subsequent definition of belief rather kneecaps this. The problem that belief is supposed to solve is very similar to the rederivation problem I'm trying to get around, but it's formulated in terms of model theory. This seems like a bad way to formulate it, because having a model is a holistic property of the formal system, and our counterfactual surgery is basically trying to break a few things in that system without destroying the whole. And indeed, the way belief is defined basically always assumes that no further deviation from the known strategy will occur, so it's impossible to use the counterfactuals based on it to evaluate different strategies. Or, applying it to "strategies" of logical derivation: if you try to evaluate "what if X wasn't derived by the system", it'll do normal logical derivations, then the first time it would derive X it derives non-X instead, and then it continues doing correct derivations and soon derives X, and therefore a contradiction.

PM is sent.
