Quinn

Let `FairBot` be the player that sends an opponent to `Cooperate` (`C`) if it is provable that they cooperate with `FairBot`, and sends them to `Defect` (`D`) otherwise.

Let `FairBot_k` be the player that searches for proofs of length `<= k` that its input cooperates with `FairBot_k`, cooperating if it finds one and defecting if all the proofs of length `<= k` are exhausted without one being valid.

Critch writes that "100%" of the time, mathematicians and computer scientists report believing that `FairBot_k(FairBot_k) = D`, owing to the basic vision of a stack overflow exceeding the value `k` (spoiler in the footnote^{[1]} for how it actually shakes out, in what is now a traditional result in open source game theory).

I am one of these people who believe that `FairBot_k(FairBot_k) = D`, because I don't understand Löb, nor do I understand parametric Löb. But I was talking about this on two separate occasions with friends Ben and Stephen, both of whom made the same remark, a remark which I have not seen discussed.

One shorter way of writing `FairBot` is `FairBot(x) = x(FairBot)`, because when `x(FairBot)` lands in `C`, `FairBot(x)` collapses to `C` (and likewise for `D`).

Here, I'm being sloppy about *evaluation* vs. *provability*. I'm taking what was originally "`x(FairBot) = C` is provable" and replacing it with "`x` is evaluable at `FairBot`", and assuming decidability so that I can reflect into `bool` for testing in an `if`. Then I'm actually performing the evaluation.
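As a sanity check on the evaluation-based reading, here's a minimal Python sketch of my own (not from any MIRI code), where a depth budget `k` stands in for proof length. Unlike the provability-based `FairBot_k`, this evaluation-based version really does bottom out at `D`:

```python
C, D = "C", "D"

def fairbot(opponent, k):
    """Evaluation-based FairBot: cooperate iff the opponent, run with a
    smaller budget, cooperates with us; defect when the budget runs out."""
    if k == 0:
        return D  # budget exhausted: the "stack overflow" intuition
    return C if opponent(fairbot, k - 1) == C else D

def defectbot(opponent, k):
    """Always defects, ignoring the opponent."""
    return D

print(fairbot(fairbot, 10))    # D: evaluation unwinds all the way to k == 0
print(fairbot(defectbot, 10))  # D: DefectBot never cooperates
```

This is exactly the intuition pump behind `FairBot_k(FairBot_k) = D`; the surprise in the footnote is that bounded *proof search*, unlike bounded *evaluation*, comes out differently.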

Stepping back, if you can write down `FairBot(x) = x(FairBot)` and you know that `FairBot` and `x` share a codomain (the moves of the game, in this case `{C, D}`), then the **solution set** of this equation is all of `{C, D}`. In other words, the equation is consistent at `C` and consistent at `D`, and there may not be a principled way of choosing one particular item of `{C, D}` in general. In other words, the proofs of the type are not unique.
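Reading the shorter form as the fixed-point equation `FairBot(x) = x(FairBot)` (my reading), a tiny Python check confirms that the equation is solvable at either move:

```python
C, D = "C", "D"

def solves(move):
    """Resolve FairBot as the agent that constantly answers `move`, then
    check the defining equation FairBot(x) = x(FairBot) at x = FairBot."""
    fairbot = lambda opponent: move
    x = fairbot
    return fairbot(x) == x(fairbot)

print([m for m in (C, D) if solves(m)])  # ['C', 'D']: both moves are consistent
```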

I'm guessing there's some solution to this problem in MIRI's old Haskell repo, but I haven't been able to find it reading the code yet.

I can't think of a typey way to justify `x(FairBot)`. It's simply nonsensical, or I'm missing something about a Curry-Howard correspondence with arithmoquining. In other words, agents like `FairBot` that take other agents as input and return moves are a lispy/pythonic warcrime, in terms of type-driven ideology.

- Am I confused because of a subtlety distinguishing evaluability and provability?
- Am I confused because of some nuance about how recursion really works?

It turns out that Löb's theorem implies that `FairBot` cooperates with `FairBot`, and a proof-length-aware variant of Löb's theorem implies that `FairBot_k` cooperates with `FairBot_k`. ↩︎

For me the scary part was Meta's willingness to do things that are minimally/arguably torment-nexusy, then put it in PR language like "cooperation", and with a straight face sweep the deceptive capability under the rug.

This is different from believing that the deceptive capability in question is on its own dangerous or surprising.

My update from cicero is almost entirely on the social reality level: I now more strongly than before believe that in the social reality, rationalization for torment-nexus-ing will be extremely viable and accessible to careless actors.

(that said, I think I may have forecasted 30-45% chance of full-press diplomacy success if you had asked me a few weeks ago, so maybe I'm not *that* unsurprised on the technical level)

We can say "a monotonic map `f : P -> Q` is a phenomenon of `P` as observed by `Q`"; then, emergence is simply the impreservation of joins.

Given preorders `P` and `Q`, we say a monotone map `f : P -> Q` "preserves" joins (which, recall, are least upper bounds) iff `f(a ∨ b) ≅ f(a) ∨ f(b)`, where by "`x ≅ y`" we mean `x ≤ y` and `y ≤ x`.

Suppose `f(a)` is a measurement taken from a particle `a`. We would like for our measurement system to be *robust against emergence*, which is literally operationalized by measuring one particle, measuring another, then doing some operation on the two results and getting the exact same thing as you would have gotten if you smashed the particles together somehow before taking the (now, single) measurement. But we don't always get what we want.

Indeed, for arbitrary preorders and monotone arrows, you can prove `f(a) ∨ f(b) ≤ f(a ∨ b)`, which we interpret as saying "smashing things together before measuring gives you more information than measuring two things then somehow combining them".
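A concrete instance (my construction, in the spirit of the book's examples): map the powerset of `{1, 2}`, ordered by inclusion with join = union, into the booleans with `False ≤ True` and join = `or`. The map below is monotone, but the inequality is strict:

```python
def phi(s):
    """A monotone 'measurement' on subsets of {1, 2}: True iff s is the whole set.
    Monotone: enlarging s can only flip the answer from False to True."""
    return s == {1, 2}

a, b = {1}, {2}
measure_then_join = phi(a) or phi(b)  # phi(a) v phi(b) = False
join_then_measure = phi(a | b)        # phi(a v b)      = True
assert measure_then_join <= join_then_measure  # the always-provable inequality
print(measure_then_join, join_then_measure)    # False True: joins not preserved
```

Measuring the parts separately misses the fact that together they make up the whole set, which is the "emergent" information.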

In the sequences community, emergence is a post-it note that says "you're confused or uncertain, come back here to finish working later" (Eliezer, 2008 or whatever). In the applied category theory community, emergence is also a failure of understanding but the antidote, namely reductions to composition, is prescribed.

This is all in chapter 1 of *Seven Sketches in Compositionality* by Fong and Spivak, citing a thesis by someone called Adam.

Can someone explain to me how the Giry monad factors in? For some monadic value `m a`, executing it to get an `a` would destroy information: what information, and why not destroy it? (Am I being too hasty comparing the probability monad to a Haskell monad?)
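For what it's worth, here is my (possibly too hasty) operational picture of the information loss, sketched in Python with a finite distribution standing in for a Giry-monad value; the names here are mine:

```python
import random

dist = {"C": 0.7, "D": 0.3}  # the monadic value: a whole distribution over moves

def run(d):
    """'Execute' the distribution by sampling a single point from it."""
    outcomes = list(d)
    return random.choices(outcomes, weights=[d[o] for o in outcomes])[0]

sample = run(dist)      # a single move; the 0.7 / 0.3 structure is discarded
print(sample in dist)   # True, but nothing about the weights survives
```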

The hyperlink for "Florian Brandl, Felix Brandt, and Hans Georg Seedig" 404s.

Why is a codomain of `[0,1]` more general than a preorder?

The function only uses the preorders on candidates implied by the utility functions.

The (implicit, as OP says "obvious/not matter") measurability of a compact set seems like *more* structure than a preorder, to me, and I'm not thinking of "generalization" as imposing more structure.

lol, I filed the same market on manifold before scrolling down and seeing you already did.

oof, good catch, fixed.

There are two different ways of reading Scott's kickoff of the type signature, though.

You took it in the direction of: a term `j : A → B` is a sort of *belief* about how actions turn into outcomes.

But it's plausible to me that Scott meant "the item `j` from `A → B` is actually the underlying reality". The idea there would be that `(A → B) → A` isn't a comment that directly concerns the implementation, but a philosophical statement about embedded agency. Items `j` needn't be models; they could be the actual configurations of reality. Agents are these terms that we can, either through proper prediction or post-hoc explanation, understand as turning physical configurations of causal arrows from actions to the world into actions. Like, it's a thing we observe them doing. Then to implement an `(A → B) → A` you would need a whole bunch of machinery, including subjective epistemic rationality `j′` (which might look a little like `A → ΔB`) which one would hope converges on the actual `j` with learning, and of course utility so you know what you want to do with the second `→` in the signature. But the real-world implementation's type signature would be a bit more intricate than the philosophy's type signature.
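For concreteness, here's a small Python sketch of the first reading, where `j : A → B` is a belief and the agent's second arrow is implemented by utility maximization; the action space and utility are toy assumptions of mine:

```python
from typing import Callable, TypeVar

A = TypeVar("A")  # actions
B = TypeVar("B")  # outcomes
# An agent, under the "belief" reading: (A -> B) -> A
Agent = Callable[[Callable[[A], B]], A]

def make_agent(actions, utility):
    """Build an (A -> B) -> A term: given a belief j about how actions
    turn into outcomes, return the action whose believed outcome is best."""
    def agent(j):
        return max(actions, key=lambda a: utility(j(a)))
    return agent

agent = make_agent(actions=["hedge", "bet"], utility=len)  # toy utility on outcome strings
print(agent(lambda a: a + " pays off"))  # 'hedge': its believed outcome is the longer string
```

Under the second reading, `j` wouldn't be an argument the agent consumes at all; the signature would just be our description of what the agent-in-the-world does.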

I know philosophers of science probably duel-at-dawn about the idea that you can give a type signature to underlying reality ("physical configuration of causal arrows") rather than to a subjective model of it.

(I have a post about all this coming out soon, will edit this comment with a link to it) EDIT: the post is out