(A -> B) -> A

MrMind (7y)

Interestingly, Hoogle (the Haskell API search engine) lists nothing with that type, so it's not the type of any common utility.

On the other hand, you can still say quite a bit about functions of that type, drawing from type and set theory.

First, let's name a generic function of that type: $k : (A \to B) \to A$. One can show that $k$ cannot be parametric in both types. If it were, $(0 \to 0) \to 0$ would be inhabited, which is absurd: $0 \to 0$ has an element (the identity), so $k$ would produce an element of the empty type $0$. One can also show that if $k$ is not parametric in one of the types, it must have access to at least one element of that type (think about $(A \to 0) \to A$ and $(0 \to B) \to 0$).

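A simple cardinality argument also shows that $k$ must be many-to-one (that is, non-injective) as soon as $B$ has at least two elements, since then $|B^{A}| > |A|$. (When $B$ is $1$, the one-element type, $B^{A}$ has a single element and the argument does not apply.)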
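The counting argument can be checked by brute force on small finite sets. This is only a sketch: the helper `functions`, the sets `A` and `B`, and the chooser `k` are made up for illustration, and a prediction is represented as a dict from actions to consequences.

```python
from itertools import product

def functions(A, B):
    """Enumerate every function A -> B as a dict (one dict per element of B^A)."""
    return [dict(zip(A, outputs)) for outputs in product(B, repeat=len(A))]

A = ["a0", "a1", "a2"]
B = [False, True]

predictions = functions(A, B)
print(len(predictions), len(A))  # 8 3

def k(f):
    # Hypothetical chooser: the first action predicted to yield True, else "a0".
    return next((a for a in A if f[a]), "a0")

chosen = [k(f) for f in predictions]

# 8 predictions are mapped onto at most 3 actions, so some action is
# chosen for several different predictions: k is not injective.
collisions = len(chosen) - len(set(chosen))
print(collisions > 0)  # True
```

Whatever chooser is substituted for `k`, the pigeonhole principle forces collisions once `len(B) >= 2`.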

There is an interesting operator that uses k, which I call interleave:

$\mathrm{interleave} : ((A \to B) \to A) \to (A \to B) \to B$

Trivially, $\mathrm{interleave} = \lambda k\, f .\, f(k(f))$

It's interesting because partially applying interleave to some $k$ gives a function of type $(A \to B) \to B$, which is the type of continuations, and I suspect that this is what underlies the common usage of such operators.
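As a minimal sketch of the operator in Python (type hints stand in for the type signature; the chooser `always_seven` is a hypothetical example, not anything from the post):

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def interleave(
    k: Callable[[Callable[[A], B]], A]
) -> Callable[[Callable[[A], B]], B]:
    """Curried interleave: partially applying it to k yields (A -> B) -> B."""
    def outcome(f: Callable[[A], B]) -> B:
        # f is used twice: k consults it to pick an action,
        # then f is applied to the chosen action.
        return f(k(f))
    return outcome

# Hypothetical chooser that knows one element of A and always returns it.
def always_seven(f: Callable[[int], str]) -> int:
    return 7

q = interleave(always_seven)   # q : (int -> str) -> str
print(q(str))                  # 7
print(q(lambda n: "x" * n))    # xxxxxxx
```

Here `q` has the continuation-like shape: it takes a function out of `int` and returns that function's result.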

Gurkenglas (7y)

$(A \to 0) \to A$ usually has one element, so $B$ need not have an element.

That Hoogle doesn't list a result essentially follows from k not being parametric in all types. (Except that it lists unsafeCoerce :: $A \to B$ - they'd rather have the type system inconsistent than incomplete...)

$(A \to B) \to B$ stands for what the agent ends up making happen, and may be easier to implement - just like predicting that Kasparov will win a chess match, without knowing how. Interesting $(A \to B) \to A$ should tend to have the property that interleave turns them into a particular kind of $(A \to B) \to B$ . Why would you call it interleave?

MrMind (7y)

Let's say that $A$ is the set of available actions and $B$ is the set of consequences. $A \to B$ is then the set of predictions, where a single prediction associates to every possible action a consequence. $(A \to B) \to A$ is then a choice operator, that selects for each prediction an action to take.

What we have seen so far:

1. There's no 'general' or 'natural' choice operator: every choice operator must be based on at least partial knowledge of the domain or the codomain.
2. Unless the possible consequences are trivial, a choice operator will choose the same action for many different predictions. That is, a choice operator only uses certain features of the space of predictions and is indifferent to everything else [1].
3. A choice operator naturally defines a 'preferred outcome' operator, which is simply the predicted outcome of the chosen action, and is defined by 'sandwiching' the choice operator between two predictions. I just thought interleave is a better name than sandwich. It's of type $(A \to B) \to B$.

[1] To show this, note that the fibers $k^{-1}(a)$, for $a$ in the image of $k$, form a partition of $A \to B$; let $R_{k}$ be the equivalence relation generated by that partition. Then $k$ factors through the quotient: $k(A \to B) \equiv k((A \to B) / R_{k})$

Gurkenglas (7y)

1. Knowledge that there is an action to select, in the form of having an action in hand, allows the implementation of exactly one chooser: the one that always selects that action.
2. $[1]$ holds for any function $k$ / partition $k^{-1}$ between any two sets. The proof you want may be that $A \to B$ is an exponential space and therefore usually larger than $A$.
3. interleave/sandwich should then take two predictions as parameters. This suggests that we could define a metric on the space of predictions, and then sandwich the chooser between two nearby predictions, to measure its response to inaccurate predictions.

MrMind (7y)

Re: the third point, I think it's important to differentiate between $f(k(f))$ and $t(k(f))$, where $t$ is the true prediction, that is, what actually happens when the agent performs an action in $A$.

$f(k(f))$ is simply the outcome the agent is aiming at, while $t(k(f))$ is the outcome the agent eventually gets. So maybe a measure of similarity on $B$, with which you can compare the two, is the more interesting thing to look at.
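A small sketch of that distinction, under made-up assumptions: actions are integers, consequences are numbers compared by absolute difference, and the chooser `k` is an argmax over a fixed list. None of these choices come from the thread; they are only there to make the two quantities concrete.

```python
ACTIONS = [0, 1, 2, 3, 4]

def k(f):
    """Hypothetical chooser: the action with the greatest predicted consequence."""
    return max(ACTIONS, key=f)

def t(a):
    """The 'true prediction': what actually happens (best action is 2)."""
    return -(a - 2) ** 2

def f(a):
    """The agent's slightly wrong prediction (it believes 3 is best)."""
    return -(a - 3) ** 2

aimed = f(k(f))    # outcome the agent is aiming at
actual = t(k(f))   # outcome the agent eventually gets
gap = abs(aimed - actual)  # one possible similarity measure on B

print(k(f), aimed, actual, gap)  # 3 0 -1 1
```

With an accurate prediction (`f = t`) the two outcomes coincide and the gap is zero.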

Diffractor (6y)

I found a paper about this exact sort of thing. Escardó and Oliva call that type signature a "selection functional", and the type signature $(A \to B) \to B$ a "quantification functional". There are several interesting things you can do with them, like combining multiple selection functionals into one in a way that is reminiscent of game theory: if $\epsilon$ has type signature $(A \to C) \to A$ and $\delta$ has type signature $(B \to C) \to B$, then $\epsilon \otimes \delta$ has type signature $((A \times B) \to C) \to (A \times B)$.
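For concreteness, here is a sketch of one way to build such a product, following what I understand to be the usual binary product of selection functions: the second selector replies to each candidate first move, and the first selector chooses knowing those replies. The helper names (`product_selection`, `argmax_over`, `argmin_over`) and the payoff table are assumptions made up for this example.

```python
def product_selection(eps, delta):
    """Combine eps : (A -> C) -> A and delta : (B -> C) -> B
    into a selector of type ((A x B) -> C) -> (A x B)."""
    def combined(p):
        def reply(x):
            # What delta picks once the first component is fixed to x.
            return delta(lambda y: p((x, y)))
        # eps chooses x while anticipating delta's reply to it.
        a = eps(lambda x: p((x, reply(x))))
        return (a, reply(a))
    return combined

def argmax_over(xs):
    return lambda f: max(xs, key=f)

def argmin_over(xs):
    return lambda f: min(xs, key=f)

# Hypothetical zero-sum game: the row player maximizes, the column player minimizes.
PAYOFF = [
    [3, 1, 4],
    [2, 2, 5],
    [6, 0, 1],
]

def payoff(move):
    row, col = move
    return PAYOFF[row][col]

play = product_selection(argmax_over([0, 1, 2]), argmin_over([0, 1, 2]))
print(play(payoff))  # (1, 0)
```

The row minima are 1, 2 and 0, so the maximizing player picks row 1 and the minimizing player answers with column 0, which is the backward-induction (maximin) play for this table.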

cousin_it (7y)

My first instinct is to translate this post to logic. Obviously A -> B doesn't imply A, because A -> B is true when A is false. So we need to expand the problem: imagine we have A -> B and some additional knowledge K, and together they imply A. Then it seems to me that K alone, without A -> B, would also imply A.

Proof: by definition, (not (A -> B)) = (A and not B). Therefore (not (A -> B)) -> A. Therefore (K and not (A -> B)) -> A. But we also have (K and (A -> B)) -> A by assumption. Therefore K -> A by case analysis.
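The classical argument can be double-checked with a truth table. This sketch simply enumerates all eight assignments to K, A and B; the helper `implies` is the usual material implication.

```python
from itertools import product

def implies(p, q):
    """Classical (material) implication."""
    return (not p) or q

# If (K and (A -> B)) -> A holds, then K -> A holds, for every assignment.
case_analysis_holds = all(
    implies(implies(K and implies(A, B), A), implies(K, A))
    for K, A, B in product([False, True], repeat=3)
)

# The first step of the proof: (not (A -> B)) -> A is a classical tautology.
first_step_holds = all(
    implies(not implies(A, B), A)
    for A, B in product([False, True], repeat=2)
)

print(case_analysis_holds, first_step_holds)  # True True
```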

So the only thing we can say about K is that it must imply A, and any such K would suffice. The proof only works in classical logic though. What can we say about K if the logic is intuitionistic?

We can see the difference by setting K = ((A -> B) -> A). Then in intuitionistic logic, K alone doesn't imply A because Peirce's law doesn't hold, but (K and (A -> B)) implies A by modus ponens. Moreover, it's obvious that any other suitable K must imply this one. That wraps up the intuitionistic case: K must imply (A -> B) -> A, any such K would suffice, and there's no shorter answer.

Can we exhibit specific A and B for which (A -> B) -> A holds intuitionistically, but A doesn't? Yes we can: this stackexchange answer says A = (((P -> Q) -> P) -> P) and B = (P -> Q) work. Of course this A still holds classically due to Peirce's law, but that's unavoidable.
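One way to see the gap between the two formulas is to evaluate them in a small Heyting algebra. The sketch below uses the three-element chain 0 < 1 < 2 (my choice of model, not something from the linked answer), where implication is top when the antecedent is below the consequent and equals the consequent otherwise. A formula that fails to evaluate to top somewhere is not intuitionistically provable; evaluating to top everywhere in this one algebra is merely consistent with provability, not a proof of it.

```python
from itertools import product

TOP = 2
VALUES = [0, 1, 2]

def imp(a, b):
    """Heyting implication on the chain 0 < 1 < 2."""
    return TOP if a <= b else b

def peirce(p, q):
    """A = ((P -> Q) -> P) -> P"""
    return imp(imp(imp(p, q), p), p)

def goal(p, q):
    """(A -> B) -> A with A = Peirce's law and B = (P -> Q)."""
    a = peirce(p, q)
    b = imp(p, q)
    return imp(imp(a, b), a)

valuations = list(product(VALUES, repeat=2))
a_is_valid = all(peirce(p, q) == TOP for p, q in valuations)
goal_is_valid = all(goal(p, q) == TOP for p, q in valuations)

print(peirce(1, 0))               # 1, so A is not valid in this algebra
print(a_is_valid, goal_is_valid)  # False True
```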

rk (7y)

I wonder if there are any plausible examples of this type where the constraints don't look like ordering on B and search on A.

To be clear about what I mean about those constraints, here's an example. One way you might be able to implement this function is if you can enumerate all the values of A and then pick the maximum B according to some ordering. If you can't enumerate A, you might have some strategy for searching through it.
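A minimal sketch of that first strategy, assuming A is a finite list that can be enumerated and the ordering on B is supplied as a key function (`argmax_chooser`, `rank` and the commute example are hypothetical names and data):

```python
def argmax_chooser(actions, rank=lambda b: b):
    """Build k : (A -> B) -> A that enumerates `actions` and returns one
    whose predicted consequence is greatest under the ordering `rank`."""
    def k(f):
        return max(actions, key=lambda a: rank(f(a)))
    return k

# Hypothetical example: actions are ways to commute, consequences are minutes.
minutes = {"walk": 40, "bike": 15, "bus": 25}

def prediction(action):
    return minutes[action]

# Order B so that fewer minutes counts as greater.
k = argmax_chooser(["walk", "bike", "bus"], rank=lambda m: -m)
print(k(prediction))  # bike
```

The chooser needs exactly the two ingredients named above: a way to walk through A and an ordering on B.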

But that's not the only feasible strategy. For example, if you can order B, take two elements of B to C and order C, you might do something like taking the element of B that, together with the value less than it, takes you to the greatest C.

My question is whether these weirder functions are of any interest.

MrMind (7y)

"I wonder if there are any plausible examples of this type where the constraints don't look like ordering on B and search on A."

Yes. As I showed in my comment, such operators must know at least one element of the domain or the codomain of the function. If the operator knows at least one element of $A$, the constant function returning that element has the right type. Unfortunately, it's not very interesting.

jmh (7y)

"Bringing in the agency (Ai→Ui)→Ai of both players leads to cycle. This cycle does not make sense unless the agency arrows are lossy in some way, so as to not be able to create a contradiction. "

I'm definitely missing something here, and may be thinking of things incorrectly. Isn't a contradiction inherent in cyclic behavior? I'm thinking about things like voting cycles, where multi-peaked preferences result in shifting majorities.

Is the "lossy" point just saying in such a cycle we have rules about pairing the alternatives that are then voted for and once one alternative has lost then it's out of the set for future votes?

Am I thinking of this the right way (even if putting it in a bit of a different context)?

Oliver Sourbut (4y)

I think the gradient descent bit is spot on. That also looks like the flavour of natural selection, with non-infinitesimal (but really small) deltas. Natural selection consumes a proof that a particular $\delta x$ (mutation) produces $\delta f$ (fitness) to generate/propagate/multiply $\delta x$.

I recently did some thinking about this and found an equivalence proof under certain conditions for the natural selection case and the gradient descent case.

In general, I think the type signature here can indeed be soft or fuzzy or lossy and you still get consequentialism, and the 'better' the fidelity, the 'better' the consequentialism.

This post has also inspired some further thinking and conversations and refinement about the type of agency/consequentialism which I'm hoping to write up soon. A succinct intuitionistic-logic-flavoured summary is something like $(\exists A . A \to B) \to A$ but there's obviously more to it than that.
