Yet more pretentious poetry
To understand the theory of decisions,
We must enquire into the nature of a decision,
A thing that does not truly exist.
Everything is determined.
The path of the future is fixed.
So when you decide,
There is nothing to decide.
You simply discover what was determined in advance.
A decision only exists,
Within a model.
If there are two options,
Then there are two possible worlds.
Two possible versions of you.
A decision does not exist,
Unless a model has been built.
But a model cannot be built,
Without defining a why.
We go and build models.
To tell us what to do.
To aid in decisions.
Which don't exist at all.
A model it must,
To reality match.
Our reality cannot be spoken.
The cycle of models can't be broken.
Upwards we must go.
A decision does need,
A counterfactual indeed.
But a counterfactual does need,
A decision indeed.
This is not a cycle.
A decision follows,
Given the 'factuals.
The 'factuals follow,
From a method to construct,
This decision's ours to make.
But first we need,
The 'factuals for the choice,
Of how to construct,
The 'factuals for the start.
A never ending cycle,
Here we are again.
Well, taking the simpler case of exactly reproducing a certain string, you could find the simplest program that produces the string, similar to Kolmogorov complexity, and use that as a measure of complexity.
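To make that concrete, here's a toy sketch (everything in it illustrative): a brute-force search over a tiny made-up "repeat" language, returning the shortest program that reproduces a target string. Real Kolmogorov complexity is uncomputable, so any practical version is an approximation over a restricted program space like this one.

```python
import itertools

# Toy "programming language": a program is a string like "ab*3", which
# expands to "ababab". Program length serves as a crude complexity measure.
# The search is exponential in program length, so this is only viable for
# tiny, highly compressible targets.

def run(program: str) -> str:
    base, _, reps = program.partition("*")
    return base * int(reps) if reps else base

def shortest_program(target: str, alphabet: str = "ab") -> str:
    # Enumerate programs in order of increasing length and return the
    # first one that reproduces the target exactly.
    for length in range(1, len(target) + 3):
        for chars in itertools.product(alphabet + "*0123456789", repeat=length):
            prog = "".join(chars)
            try:
                if run(prog) == target:
                    return prog
            except ValueError:  # malformed repeat count, e.g. "a**3"
                continue
    return target  # fall back: the string serves as its own program

print(shortest_program("ababab"))  # → "ab*3"
```

The shortest program found ("ab*3", length 4) is shorter than the string itself (length 6), so the string counts as compressible under this measure.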
A slightly more useful way of modelling things may be to have a bunch of different strings, each assigned a number of points representing its level of importance. Perhaps we could then produce a metric combining the Kolmogorov complexity of a decoder with the sum of the points produced, where points are obtained by outputting the desired strings concatenated with a predefined separator. For example, we might take the quotient of the two.
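As a rough sketch of the kind of metric I mean, with zlib-compressed length standing in for the uncomputable Kolmogorov complexity, and the strings and point values entirely made up:

```python
import zlib

# Hypothetical scoring metric: concatenate the produced strings with a
# predefined separator, proxy "decoder complexity" with the length of the
# zlib-compressed blob (a computable stand-in for Kolmogorov complexity),
# and take the quotient of points earned over that complexity.

SEP = "\x00"

def score(produced: list[str], points: dict[str, float]) -> float:
    earned = sum(points.get(s, 0.0) for s in produced)
    blob = SEP.join(produced).encode()
    complexity = len(zlib.compress(blob))  # proxy for decoder complexity
    return earned / complexity

points = {"important fact": 10.0, "minor detail": 1.0}
print(score(["important fact"], points))
print(score(["important fact", "minor detail"], points))
```

Producing a low-value string still adds to the complexity in the denominator, so padding the output with unimportant strings can lower the score.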
One immediate issue with this is that some of the strings may contain overlapping information, and we'd still have to produce a metric that assigns importances to the strings. Perhaps a simpler case would be where the strings represent patterns in a stream by encoding Turing machines, with the Turing machines able to output sets of symbols (rather than single symbols) representing the possible symbols at each location. The number of points a pattern provides would then equal how much of the stream it allows you to predict. (This would still require producing a representation of the universe in which the amount of the stream predicted is roughly equivalent to how useful the predictions are.)
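Here's roughly how that scoring could work, with ordinary Python functions standing in for encoded Turing machines:

```python
# A "pattern" predicts, for each position in a stream, a set of symbols
# that could appear there. Its score is the number of positions where the
# actual symbol falls inside a predicted set that is a proper subset of
# the full alphabet, i.e. positions where the pattern genuinely narrows
# things down rather than predicting "anything goes".

def score_pattern(predict, stream: str, alphabet: set[str]) -> int:
    score = 0
    for i, symbol in enumerate(stream):
        predicted = predict(i)  # set of symbols deemed possible at position i
        if symbol in predicted and predicted < alphabet:  # proper subset
            score += 1
    return score

# Example pattern: "even positions are 'a'", agnostic everywhere else.
alphabet = {"a", "b"}
pattern = lambda i: {"a"} if i % 2 == 0 else alphabet
print(score_pattern(pattern, "ababab", alphabet))  # → 3 (positions 0, 2, 4)
```

A pattern that predicted every position exactly would score the full length of the stream, while one that always output the whole alphabet would score zero, which matches the intuition that vacuous predictions shouldn't earn points.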
Any thoughts on this general approach?
I think that part of the problem is that talking about knowledge requires adopting an interpretative frame. We can only really say whether a collection of particles represents some particular knowledge from within such a frame, although it would be possible to determine the frame of minimum complexity that interprets a system as representing certain facts. In practice, though, whether or not a particular piece of storage contains knowledge will depend on the interpretative frames in the environment, although we need to remember that interpretative frames can emulate other interpretative frames, e.g. a human experimenting with multiple codes in order to decode a message.
Regarding the topic of partial knowledge, it seems that the importance of various facts will vary wildly from context to context and also depending on the goal. I'm somewhat skeptical that goal-independent knowledge will have a nice definition.
"And ability to handle counterfactuals is basically free if you have anything resembling a predictive model of the world" - ah, but a predictive model also requires counterfactuals.
I think that this might be more likely if people who wrote on similar topics were to have conversations and record them, as writing an article takes a lot more time than just having a discussion.
This is a cool feature, but I'd honestly be surprised if it gets much use.
I believe that we need to take a Conceptual Engineering approach here. That is, I don't see counterfactuals as intrinsically part of the world, but rather as something we construct. The question to answer is: what purpose are we constructing these for? Once we've answered this question, we'll be 90% of the way towards constructing them.
As far as I can see, the answer is that we imagine a set of possible worlds and we notice that agents that use certain notions of counterfactuals tend to perform better than agents that don't. Of course, this raises the question of which possible worlds to consider, at which point we notice that this whole thing is somewhat circular.
However, this is less problematic than people think. Just as we can only talk about what things are true after already having taken some assumptions to be true (see Where Recursive Justification hits Bottom), it seems plausible that we might only be able to talk about possibility after having already taken some things to be possible.
I'm still really keen for footnotes. They allow people to make more nuanced arguments by addressing objections or misunderstandings that people may have without breaking the flow.
"The problem is that principle F elides" - Yeah, I was noting that principle F doesn't actually get us there and I'd have to assume a principle of independence as well. I'm still trying to think that through.
Hmm... that's a fascinating argument. I've been having trouble figuring out how to respond to you, so I'm thinking that I need to make my argument more precise and then perhaps that'll help us understand the situation.
Let's start from an objection I've heard against Counterfactual Mugging. Someone might say: well, I understand that if I don't pay, then I would have lost out if the coin had come up heads, but since I know it didn't come up heads, I don't care. Making this more precise: when constructing counterfactuals for a decision, if we know fact F about the world before we've made our decision, then F must be true in every counterfactual we construct (call this Principle F).
Now let's consider Counterfactual Prisoner's Dilemma. If the coin comes up HEADS, then principle F tells us that the counterfactuals need to have the COIN coming up HEADS as well. However, it doesn't tell us how to handle the impact of the agent's policy if they had seen TAILS. I think we should construct counterfactuals where the agent's TAILS policy is independent of its HEADS policy, whilst you think we should construct counterfactuals where they are linked.
You justify your construction by noting that the agent can figure out that it will make the same decision in both the HEADS and TAILS cases. In contrast, my tendency is to exclude information about the agent's own decision-making procedure: if you knew you were a utility maximiser, this would typically exclude all but one counterfactual and prevent us from saying that choice A is better than choice B. Similarly, my tendency here is to suggest that we should erase the agent's self-knowledge of how it decides, so that we can imagine the possibility of the agent choosing PAY/NOT PAY or NOT PAY/PAY.
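To make the disagreement concrete, here's a sketch of the two constructions. The payoff numbers ($100 cost, $10,000 reward) are my own illustrative assumption, not taken from the discussion above.

```python
# Counterfactual Prisoner's Dilemma from the HEADS perspective: the agent
# is asked to pay, and is rewarded iff its TAILS policy would have been to
# pay. "linked" constructs counterfactuals where the TAILS policy mirrors
# the HEADS choice; "independent" holds the TAILS policy fixed.

COST, REWARD = 100, 10_000

def heads_payoff(pay_heads: bool, pay_tails: bool) -> int:
    return -COST * pay_heads + REWARD * pay_tails

def best_heads_action(linked: bool) -> bool:
    if linked:
        # TAILS policy assumed to mirror whatever HEADS choice we evaluate.
        options = {h: heads_payoff(h, pay_tails=h) for h in (False, True)}
    else:
        # TAILS policy held fixed independently (to "not pay", for concreteness).
        options = {h: heads_payoff(h, pay_tails=False) for h in (False, True)}
    return max(options, key=options.get)

print(best_heads_action(linked=True))   # → True: paying nets -100 + 10000
print(best_heads_action(linked=False))  # → False: paying just costs 100
```

So the two constructions recommend opposite actions from the very same epistemic state, which is exactly where our intuitions diverge.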
But I still feel somewhat confused about this situation.