This is an interesting direction to explore, but as it stands I have no idea what you mean by "understand the Go bot," and I fear that figuring that out would itself require answering more than you want to ask.
For instance, what if I just memorize the source code? I can slowly apply each step on paper, and since the adversarial training process uses no training data or human expert input, if I know the rules of Go I can, Chinese room style, fully replicate the best Go bot from my own knowledge, given enough time.
But if that doesn't count, and you don't just mean being better than the bot at Go, then you must have in mind that I'd somehow have the same 'insights' as the program. But now, to state the challenge, we need a precise (mathematical) definition that specifies the insights contained in a trained ML model, which means we've already solved the problem.
At a conceptual level I'm completely on board. At a practical level I fear a disaster. Right now you at least need to find a word which you can claim to be analyzing, and that fact encourages a certain degree of contact and disagreement, even if a hard subject like philosophy should really have five specific rebuttal papers (the kind journals won't publish) for each positive proposal rather than the reverse, as it does now.
The problem with conceptual engineering for philosophy is that philosophers aren't really going to start going out and doing tough empirical work the way a UI designer might. All they are going to do is basically assert that their concepts are useful/good, and the underlying sociology of philosophy means it's seen as bad form to mercilessly come after them insisting that no, that's a stupid and useless concept. Disagreements over the adequacy of a conceptual analysis or the coherence of a certain view are considered acceptable to push to a degree (not enough, imo), but going after someone overtly (rather than via rumor) because their work isn't sufficiently interesting is a big no-no. So I fear the end result would be to turn philosophy into a thousand little islands, each just gazing at its own navel, with no one willing to argue that your concepts aren't useful enough.
I'd argue that this argument doesn't work because the places where CDT, EDT, or some new system diverge from each other lie outside the set of situations in which decision theory is a useful way to think about the problems. After all, it is always possible to simply take the outside perspective and merely describe facts of the form: under such and such situations, algorithm A performs better than B.
What makes decision theory useful is that it implicitly accommodates the very common (for humans) situation in which the world doesn't depend in noticeable ways on the details of the algorithm we've adopted to make future choices (i.e., the causal relationship is so lacking in simple patterns that it looks random to our eyes). The second we get into situations like Newcomb problems, where variants of decision theory might say something else, there is simply no reason to model the scenario in terms of decisions at all anymore.
Once you have meaningful feedback between the algorithm adopted to make choices and other agents' choices, it's time to do the kind of analysis we do for fixed points in CS/math, not apply decision theory, given that the fundamental abstraction of a decision doesn't really make sense anymore when we get feedback based on our choice algorithm.
Moreover, it's plausible that decision theory is only useful from an internal perspective and not from the perspective of someone designing an algorithm to make choices. Indeed, one of the reasons decision theory is useful is the kind of limited access we have to our own internal behavioral algorithms. If we are considering a computer program, it seems strictly preferable to just reason about decision algorithms directly so we need not stretch the agent idealization too far.
Seems like phrasing it in terms of decision theory only makes the situation more confusing. Why not just state the results in terms of: assuming there are a large number of copies of some algorithm A, there is more utility if A has such and such properties?
This works more generally. Instead of burying ourselves in the confusions of decision theory we can simply state results about what kind of outcomes various algorithms give rise to under various conditions.
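To illustrate what stating a result this way might look like, here is a minimal sketch. Everything specific in it is my own assumption for the sake of the example: the usual Newcomb payoffs (1,000,000 and 1,000), a predictor that guesses an algorithm's output correctly with some fixed accuracy, and the hypothetical names `one_box`, `two_box` and `average_payoff`. It makes no claim about which action is "rational"; it only reports how copies of each algorithm fare under the stated conditions.

```python
from fractions import Fraction

# Assumed payoffs for a Newcomb-style setup (illustrative numbers only).
BIG = 1_000_000   # placed in the opaque box if "one" was predicted
SMALL = 1_000     # always in the transparent box


def one_box():
    """Hypothetical algorithm that always takes only the opaque box."""
    return "one"


def two_box():
    """Hypothetical algorithm that always takes both boxes."""
    return "two"


def payoff(action, predicted):
    """Payoff for taking `action` when the predictor guessed `predicted`."""
    opaque = BIG if predicted == "one" else 0
    return opaque + (SMALL if action == "two" else 0)


def average_payoff(algorithm, accuracy):
    """Average payoff per copy of `algorithm`, assuming the predictor
    guesses the algorithm's output correctly with probability `accuracy`."""
    action = algorithm()
    wrong_guess = "two" if action == "one" else "one"
    return (accuracy * payoff(action, action)
            + (1 - accuracy) * payoff(action, wrong_guess))


if __name__ == "__main__":
    for acc in (Fraction(1, 2), Fraction(9, 10)):
        print(acc, average_payoff(one_box, acc), average_payoff(two_box, acc))
```

Under these assumptions the ordering of the two algorithms flips with the predictor's accuracy, which is exactly the kind of conditional fact about algorithms that can be stated without ever deciding what counts as a "choice."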
I think we need to be careful here about what constitutes a computation which might give rise to an experience. For instance suppose a chunk of brain pops into existence but with all momentum vectors flipped (for non-nuclear processes we can assume temporal symmetry) so the brain is running in reverse.
Seems right to say that this could just as easily give rise to the experience of being a thinking human brain. After all, we think the arrow of time is determined by the direction of increasing entropy, not by some weird fact that only computations which proceed in one direction give rise to experiences.
OK, so far no biggie, but why insist that computations be embedded temporally? One can reformulate the laws of physics to constrain events to the left given the complete (future and past) set of events to the right, so why can't the computation be embedded from left to right (i.e., the arrow of time points right) or in some completely different way we haven't thought of?
More generally, once we accept the possibility that the laws of physics can give rise to computations that don't run in what we would view as a causal fashion, it's no longer clear that the only kinds of things which count as computations are those the above analysis considered.
You are making some unjustified assumptions about the way computations can be embedded in a physical process. In particular we shouldn't presume that the only way to instantiate a computation giving rise to an experience is via the forward evolution of time. See comment below.
That won't fix the issue. Just redo the analysis at whatever size is able to merely do a few seconds of brain simulation.
Of course, no actual individual or program is a pure Bayesian. Pure Bayesian updating presumes logical omniscience, after all. Rather, when we talk about Bayesian reasoning we idealize individuals as abstract agents whose choices (potentially none) have a certain probabilistic effect on the world, i.e., we basically idealize the situation as a one-person game.
You basically raise the question of what happens in Newcomb-like cases where we allow the agent's internal deliberative state to affect outcomes independent of the explicit choices made. But the whole model breaks down the moment you do this. It no longer even makes sense to idealize a human as this kind of agent and ask what should be done, because the moment you bring the agent's internal deliberative state into play it no longer makes sense to idealize the situation as one in which there is a choice to be made. At that point you might as well just shrug and say 'you'll choose whatever the laws of physics say you'll choose.'
Now, one can work around this problem by instead posing a question for a different agent who might idealize a past self, e.g., if I imagine I have a free choice about which belief to commit to having in these sorts of situations, which belief/belief function should I presume?
As an aside, I would argue that, while the mathematics is perfectly valid, there is something wrong in advocating for timeless decision theory or any other particular decision theory as the correct way to make choices in these Newcomb-type scenarios. The model of choice making doesn't even really make sense in such situations, so any argument over which is the true/correct decision theory must ultimately be a pragmatic one (when we suggest actual people use X versus Y, they do better with X), but that's never the sense of correctness that is being claimed.
While I agree with your conclusion in some sense, you are using the wrong notion of probability. The people who feel there is a right answer to the Sleeping Beauty case aren't talking about the kind of formally defined count over situations in some formal model. If that's the only notion of probability, then you can't even talk about the probabilities of different physical theories being true.
The people who think there is a Sleeping Beauty paradox believe there is something like the rational credence one should have in a proposition given one's evidence. If you believe this, then you have a question to answer: what credence should Sleeping Beauty have in the coin landing heads, given that she has the evidence of remembering being enrolled in the experiment and of waking up this morning?
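For contrast, here is a minimal sketch of the purely formal "count over situations" notion. It assumes the standard protocol (a fair coin, one awakening on heads, two on tails), and the names `heads_share_per_experiment` and `heads_share_per_awakening` are hypothetical ones I made up for the example. Both counts are perfectly well defined; neither, on its own, says which one deserves to be called her rational credence.

```python
from fractions import Fraction

# Assumed standard protocol: fair coin, heads -> 1 awakening, tails -> 2.
COIN = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
AWAKENINGS = {"heads": 1, "tails": 2}


def heads_share_per_experiment():
    """Share of experiments (runs of the protocol) in which the coin is heads."""
    return COIN["heads"]


def heads_share_per_awakening():
    """Share of awakenings, weighted by coin probability, that follow heads."""
    heads_awakenings = COIN["heads"] * AWAKENINGS["heads"]
    all_awakenings = sum(COIN[side] * AWAKENINGS[side] for side in COIN)
    return heads_awakenings / all_awakenings


if __name__ == "__main__":
    print(heads_share_per_experiment())  # counting experiments
    print(heads_share_per_awakening())   # counting awakenings
```

Counting experiments gives 1/2 and counting awakenings gives 1/3, so the formal model by itself just hands back whichever answer matches the counting convention you fed in.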
In my analysis of the issue I ultimately come to essentially the same conclusion that you do (it's an ill-posed problem) but an important feature of this account is that it requires **denying** that there is a well-defined notion that we refer to when we talk about rational credence in a belief.
This is a conclusion that I feel many rationalists will have a hard time swallowing. Not the abstract view that we should shut up about probability and just look at decisions. Rather, the conclusion that we can't insist that the person who (despite strong evidence to the contrary) believes it's super likely that god exists is being somehow irrational, because there isn't even necessarily a common notion of what kind of possible outcomes count for making decisions. E.g., if they only value being correct in worlds where there is a deity, they get an equally valid notion of rational credence which makes their belief perfectly rational.
Also, I think there is a fair bit of tension between your suggestion that we should be taking advice from others about how much things should hurt and the idea that we should use the degree of pain we feel as a way to identify abusive/harmful communities/relationships. The more we allow the advice from those communities to determine whether we listen to those pain signals, the less useful those signals are to us.