Much is owed to Diffractor for Giry-pilling me at Alignable Structures, I had been struggling with type-driven expected value previously.

Epistemic status: took a couple days off from my master plan to think about John's selection theorems call to action.

We would like a type signature of agency. Scott Garrabrant provides as a first approximation. You can choose one of two ideas here: 1. that an agent simply takes a belief about how actions turn into outcomes and returns a recommended action, or 2. that an agent takes underlying configurations of reality (containing information about how actions lead to outcomes) and tends to perform certain actions. Notice that happens to be for "outcome", "observation", and even "ontology", which is nice. This signature is widely discussed in the monad literature.

Scott wrote that primarily means causal influence and secondarily means functions. I will be mostly ignoring the causal influence idea, and I think instead of thinking of the signature from an objective perspective of it being a transcription of the underlying reality, I want to think of it from a subjective perspective of it being an assistant for implementation engineers. I think we should take a swing at being incredibly straightforward about what we mean by the type signature of agency: when I say that a type is the type signature of agency, I mean that if we have programs that are admitted by then those programs are doing all the things that interest me about agents (i.e., at , if we instantiate particular non-atomic propositions and that interact with the outside world in such a way that we can obtain proofs of (which we can't do in general) in some way, then those proofs are doing all the things that interest me about agents).

In my view, the first idea (involving "belief") can be called a subjective interpretation of the type signature, and we shall explore some adjustments to make this story better, while the second idea (involving "base reality") can be called an objective interpretation of the type signature, and we shall not explore philosophical controversies around saying that a type is "in" reality rather than in a model.

I will ultimately conclude that I am not equipped to flesh out the objective interpretation, and give a subjective interpretation such that an agent is not one selection function, but a pair of selection functions. In particular, (an agent is made up of an instrumental part and an epistemic part).

In the post, heading number one is an infodump about stuff I've been reading, and setup of some tooling. Heading number two is applications to agency.


  • denotes implementation of terms and denotes signature of types. is an alternative to .
  • is the type of types, which you can define via structural induction like propositional logic; the only important part for us today is , and I'm handwaving the equipping of preorders to arbitrary members of very fast and loose, and I'll handwave some other stuff implying that a properly structural induction story would be really messy if I actually worked out the details.
  • In addition to being a -constructor, is also a -constructor (for the type of propositions ).
  • Arrows associate to the right (see currying). .
  • We sometimes write instead of (alluding to the arrow type's associated counting problem).
  • When we say a function is a monad we mean that it comes to us equipped with a function (called a return) and a function (called a flatten) which agree to some laws.
    • Example: let setting to the construction of a singleton and to the removal of nesting information.
    • Exercise: fixing an , one of or forms a monad. Pick one and set its and (Solution[1]).

Selection and continuation

The agent type is widely discussed in the monad literature.

Fixing an outcome type , is called the selection monad, and its friend is called the continuation monad.

Remark: quantifiers are continuations


  • , or the type of two nullary construcors.
  • is the complete ordered field.

The story of -valued or -interpreted logics goes like this. For any ,

In other words, a quantifier takes a predicate (typed ) and returns a valuation of the predicate under different conditions. is the element of that says "the predicate is true all over ", (or we may write it point-free as ) is if and only if is always regardless of . is the element of that says "the predicate is true at least once over ", the point-free is if and only if you can provide at least one such that is .

The literature likes to call continuations generalized quantifiers, where your "truth values" can take on arbitrary type. The story of quantifiers can be updated to for a richer type of propositions such that not everything is decidable.


  • Think of distinguished primitives in reinforcement learning theory; is there either a selection or a continuation story one of them? (Solution[2]).
  • Name a distinguished primitives from calculus or analysis; is there a selection or continuation story of it? (Solution[3]).

Remark: distributions are a special case of generalized quantifiers


  • Recall that for each , you can construct a constant function by throwing out the .
  • A is a reflexive and transitive relation.
  • Recall that an is monotonic when, having a and a , .
  • Let .
  • A map , when and are drawn from some underlying field , is linear whenever .

Consume a valuation and produce an expectation

A particular way of strengthening or filtering (quantifiers generalized to valuations in ) is to require linearity, monotonicity, and the sending of constant functions to a neutral scalar. For arbitrary types and for types equipped with some multiplicative structure involving a neutral, we will write to describe the functions but only keeping the ones that are monotonic, linear, and that send constants in to the multiplicative neutral in (conventionally, pronounced "lollipop" or "lolli" denotes linearity). Letting play the roles of and , define

In other words, a distribution is just a continuation term that knows how to turn a valuation (an , i.e. a random variable) into an expectation (where the expectation abides linearity, monotonicity, and the sending of constants to ).

where I'm being lazy about the measure theory needed to actually compute terms, however, we see that measure theory doesn't really emerge at the type level.

I'm thinking of distributions as a subset of these -valued quantifiers because I want to eventually think about utilities, and I'm still pretty sure the utility codomain is going to be all the time.

forms a monad

The settings of and along with the lawfulness proofs are in this coq file, written a few weeks ago before I knew anything about the selection and continuation literature. (This is not surprising, as we knew that forms a monad, and the substitution of the second for only deletes maps and doesn't add any potential violators).

Remark: convert selections into continuations/quantifiers

In other words, if is a selection then is a continuation.


Presume a . Suppose I have a . is called attainable when it's preimage under is nonempty. In other words, is attainable if and only if . In that case, we may say " attains ".

Notice that from the existence half of the functionality predicate, we get a free existence proof of a continuation/quantifier for every selection. To believe that some continuations are unattainable is to believe that is not surjective.


  • Recall the solutions to previous exercises 1 and 2. What is the attainability relationship between them, if any?[4]

Wrapping the codomain

Fix a and a . Define

Example: powerset

Denote as the function that confiscates a type and rewards the powerset of that type. In other words, (where an is interpreted if and only if ).

We call the items of multi-valued selections and items of multi-valued quantifiers.

Exercise (harder than previous)

  • can you re-obtain monadicity for multi-valued selection?
  • can you re-obtain monadicity for multi-valued continuation?
  • write down multi-valued attainment[5]

Wrapping the codomain of the domain

We may additionally like to use maps to goof off with transforming the codomain of the input map.


  • again, can you re-obtain monadicity for ? For ?

Wrapping the whole domain

Having maps , and since , we also might enjoy transforming the whole input map type.

Modifying the agent signature

Recall the agent interpretation of selection. We fix an outcome type and an action type and we reason about . Recall that there are two cases: a subjective case in which items are beliefs, and an objective case in which items are actual configurations of reality. In the subjective case, an agent turns a model of reality into a recommended action (the term hardcodes its notion of utility or whatever). In the objective case, the world has configurations, and an agent can be trusted to tend toward the actual configuration over time, using it to (again relying on hardcoded utility data) select actions.

Investigation: continuation is to as selection is to what?

We obtained by replacing the rightmost in the definition of with my custom . Let's goof around with performing the same replacement in .

Recall that implies that it's codomain supports linearity, monotonicity, and multiplicative neutrality, so we know that the domain of isn't "really" just (hence the scare quotes), whereas the domain of was truly the unconstrained type . So it may be difficult now to be sure of the preservation of monadicity.


  • A monoidal preorder is a preorder with a monoid attached. If you start with such that is reflexive and transitive, and you find an associative that has a distinguished neutral element , and you know , then you have the monoidal preorder .
    • From any set you can construct a monoidal preorder where and are from set theory. Validate this, if you like.

Rambling about

How do we interpret this? In the agent case, actions are playing the role of , which immediately suggests that we'll only have the class of continuous action spaces, so we can try . But , which feels maybe problematic or vacuous. Possibly problematic, because I don't know how the theory of random variables adjusts to the bare real line (as opposed to a collection of subsets). Possibly vacuous, because I don't know any particular terms typed (other than or ones with fairly strong conditions like increasingness) that I would expect to correspond with some foggy coherence notion for valuations in the back of my mind. Moreover, what should we think of collapsing the very distinction between selection and continuation, by setting ? isn't provable in the logic interpretation (unless I'm missing some coinductive black magic resolving loops), which is a hint that we're barking up the wrong tree. My gut isn't telling me would be any better.

We could of course support the requirements on the codomain by putting a monoidal preorder on (namely setting , , , and ), which wouldn't work for entirely arbitrary but would work if you could interpret the scaling of a subset (like is a single suit out of a deck of cards, the valuation of a subset is the total number of pips across all the cards in the subset, and scalar hits it by doing some operation on that valuation, like ). Fix an that you can interpret in this way. Then, try . In other words, if I have an -generated multi-valued "selection distribution" , then for every valuation of a subset , is a kind of expected subset, or it's something the agent can proactively search for like or . Perhaps you could even interpret/implement it like "if is my complete account of what a subset is worth to me, then fixes an amount of optimization power I'm going to throw at steering the future into particular subsets over others, and denotes the sort of place I would end up if I applied that much optimization to my values (insofar as landing at an actual optima implies that possibly unbounded optimization power was deployed)".


  • Check that monotonicity, linearity, and the sending of constants to (in this case because it's the monoidal neutral) works with something like my deck-of-cards choice of .

What about for metric spaces ?

Loosening up the pedantry a little, because the actual type-driven story would get too hairy, let's by fiat admit , so we can take the subset of that just has continuous functions in it. You shall indulge me if I utilize without properly saying that the domain is just the types interpretable as or isomorphic to uninterrupted intervals, whatever.

A modus ponens with a little decoration with conditions like the linearity/monotonicity/sending constants to and continuity. What does it mean?

It could mean the environment actually giving the agent a reward for taking action , though it's a simpler story than the one in standard reinforcement learning theory, especially e.g. POMDPs.


The idea of rigging "scalar multiplication" to my deck of cards semantics was uncomfortable. The following, however, has a perfectly natural notion of linearity (alongside order and the idea of a ).

Selections over continuous functions (taking valuations of continuous functions as inputs and returning continuous functions as output) sounds like a kind of learning over "metavalues", when the continuous functions are interpreted as utilities, then the knows how to take the utility of a utility function (which is metautility) and choose the one that maximizes metautility.

This of course restricts that the action type be equipped with a metric.

Conjecture: attainability survives the transportation to the custom

should be provable. Indeed, it's just a domain restriction on the original , so this conjecture is in the bag.


This isn't quite the subjective approach I'm looking for. Mapping from uncertainty over valuations to actions seems kinda from the perspective of social choice theory, where the difference in opinion across the population is captured by not being able to know a precise point estimate of a valuation, having to turn a distribution over valuations of actions into an action.


This looks to me the most like "the agent turns models/beliefs into actions".

Let's unfold .

The general pattern of "terms such that the input is into quantifiers and the output is " might mean that terms are hardcoded predicates which can select values of to get a desired result depending on whichever quantifier shows up. We will not work with the unfolded version in what follows.

Rescue attempt: the objective interpretation

In the objective interpretation of the type signature of agency, an agent is a term that turns a configuration reality could be in (specifically the information about how actions lead to outcomes) into an action.

In my rescue operation, objectivity is not pure: we will see that I've installed a subjectivity (i.e. learning) layer as an implementation detail. Think of it like the difference between a lemma and a theorem; at the lemma level, there's subjectivity, while if the theorem level doesn't open up black boxes it may not notice subjectivity. Put another way, the challenge of the rescue operation is to tell a compellingly full story (which ought to oblige the term to empiricism under uncertainty) without resorting to .

The "lemma" will be a term . Its inputs are loss functions which come equipped with real-world data hardcoded into them. These loss functions make sense of the gap between a map and a territory, here focusing on action-output relations, i.e. they take a notion of how actions turn into outcomes and they score how accurate it is. Then for such a loss function , . (And if you want the function that constructs s, you need ontology to describe that function's domain). Since this is the objective point of view, we interpret 's codomain as the literal outcomes in the world, indeed is the gears by which perturbations from agents effect things. (Warning: here be monsters) if we say that in order to implement an agent you need to provide a , and describes the literal gears of the world and isn't a conditional forecast (like "our best guess at time is that action will transition the world into state "), then I don't see how an agent is remotely computational.

Equipped like so, with any proof hardcoded by some humans or learned by some ML model, and some provided by a stakeholder/principal, is the action the proof would like to take. But since is a blackbox, is as good as an axiom, i.e., the blackboxness propagates out and it can't be actually written at the low level. A configuration of the world has a similar problem.

It's plausible to me that infrabayesian physicalism or factored sets provide a way forward, but I'm not going to grok either of those today. (The first time I read "Saving Time" just as now, I was confused about "the future effects the past" because of the determinism/nondeterminism question, i.e. I get that forecasts of or distributions over the future effect the past, but I don't get how the actual future effects the past).

I'm marking this rescue operation as a failure, owing to the restriction against invoking .

Rescue: the subjective interpretation


  • We will use pair types, of one constructor (namely ) and two destructors (namely and ), and we assume associativity of -ary or nested products.

A subjective agent is a tuple with a protocol

In the subjective interpretation of the type signature of agency, an agent is a term that knows how to turn a belief about how actions turn into outcomes into an action. Since beliefs are emphasized, and beliefs are uncertain, we will allow ourselves liberal use of the operator. The following approach is based on the failed rescue of the objective interpretation.

Fix a type of actions and a type of outcomes. We consider proofs where items are conditional forecasts that accept an action and report, with uncertainty, a belief about what will happen if it does . As the domain of above, for any the domain of is loss functions, each of which considers a conditional forecast and scores it for calibration, accuracy, whatever. We write instead of .

A stakeholder or principal encodes observations at time of the world by hardcoding data into their choice of loss function by using the function , writing instead of .

Then implement the uncertain selection . Notice that it's domain is conditional forecasts.

Then an agent is none other than with sensors and actuators, which interacts with the world via a protocol which runs as follows

  1. At time , a stakeholder or principal sets and hands it to .
  2. is a conditional forecast that turns actions into uncertainty in . is the action taken by at time .
  3. Observe world and score the term against it, using the score to power some search process that informs .
  4. Increment and repeat.

In other words, the agent calculates an action because it can turn loss functions which score conditional forecasts into a handpicked conditional forecast, and it can also turn conditional forecasts into handpicked actions. hardcodes the procedure for doing bayesian updates, i.e. it has opinions about some beliefs being better than others. hardcodes (and hides) a utility function, i.e. it has opinions about some outcomes being better than others. Echoing a complex number, which is a real part and an imaginary part, we can view an agent as an instrumental part and an epistemic part. While the complex numbers are equipped with some notion of "" such that (real part plus imaginary part), I can make up a notion of "" such that

(epistemic part "plus" instrumental part).

I played fast and loose with the mutability and implicit differentiability of . I think this is appropriate: any intuition about agents is that their beliefs change over time, even if corrigibility remains an open problem (in other words, epistemic part ought to be mutable even if the instrumental part (where the utility function is) is not).

The abstract type signature of is , where here when we say a codomain is outcomes we mean that it's the literal world, not an implementational model of it, hence the signature being "abstract".

Selection product?

In the literature there's a function . It's only defined between selections that share an inner target , though, so it doesn't apply to . Still, there might be some cleverness I haven't considered.


We need more candidates for the type signature of agency. An obvious way to explore is to take the first candidate someone wrote down, make an incision, and poke its guts with various functions .

A more complete story of agency, together with a protocol describing interactions with the world, is not a single selection but a pair of selections. The pair can be understood as an epistemic part and an instrumental part.

I'm aware that I at least partially took some steps toward reinventing the reinforcement learning theory wheel when I gave the protocol , an alternative approach to this post would be to start with RL theory and see what notions of selection function are hanging around.

If we hammer out the dents in we get a really pretty notion of "turning agency into probability" (in the form of the function on a restricted domain), and plausibly also a characterization of the unreliability or impossibility of turning probability into agency (via the insurjectivity of ).

What about interp? I think something like the searching for search could, if we're not totally and completely wrong about the pillars of the agency type signature direction, show us a ton about how ML naturally implements terms/proofs of things like . A dope UX would be something like tactical programming not for creating terms/proofs, but for parsing out / identifying them in a big pile/soup of linear algebra. A fantasy world I'd like to live in is one where a prosaic/interp shop ships neural-hoogle or transformer-hoogle, a search engine that accepts type signatures and finds configurations of matrices and weights which, if you squint, count as proofs/terms of those types. To be abundantly clear, you can think of the following proof of as the dumbest possible search

def argmax(f: Callable[[A], float]) -> A:
  ret = None
  curr_y = - 2 ** 100
  for x in A:
    y = f(x)
    if y > curr_y:
      curr_y = y
      ret = x
  return ret

Insofar as the type A is enumerable. The hypothesis advanced by this post is that arbitrarily not-dumb search is constrained by the same type information as dumb search. Search is literally a significant class of proofs of selection.

The objective interpretation of the project of giving a type signature for agents seems a little borked right now, but that could change with increased understanding of factored sets or maybe infrabayesian physicalism.

Selections and continuations play a huge role in compositional game theory, which I'm starting to think provides a mean embedded agency story, though I haven't grokked it quite at the level of writing a post about it just yet.


  1. not super useful without an interactive session, nevertheless. ↩︎

  2. ↩︎

  3. ↩︎

  4. attains , or . ↩︎

  5. Presume some and some . is attainable if and only if . For multi-valued variants of and , we can check that the solution to exercise 2 transfers over to this setting. ↩︎

New to LessWrong?

New Comment
3 comments, sorted by Click to highlight new comments since: Today at 9:14 AM

I read the first past of this post. It is quite interesting. Always thought Hedges' work should be more known on LessWrong.

Have you since thought about these topics? I'd be curious what your current take is. 

  1. I've been very distressed thinking that instrumental and epistemic parts are not cleanly separable, and that entire is-ought gap or humean facts-values is a grade school story or pedagogically noble lie
  2. I got severely burnt out from exhaustion not long after writing this, and one of the reasons was the open games literature lol. But good news! I was cleaning out old tabs on my browser and I landed on one of those papers, and it all made perfect sense instantly! I'm more convinced than I initially suspected that the open games community has a ton to offer.

That's nice to hear. Could you say more on your update towards open games ?