Johannes C. Mayer

ARC's first technical report: Eliciting Latent Knowledge

Ah ok, thank you. Now I get it. I was confused by (i) "Imagine the reporter could do perfect inference" and (ii) "the reporter could simply do the best inference it can in the human Bayes net (given its predicted video)".

(i) I read this as saying that the reporter can do it on its own, but what is actually meant is that it can do it with the help of the predictor model.

(ii) Somehow I thought that "given its predicted video" was the important modification here, when in fact the only change is going from the reporter doing perfect inference to it doing the best inference that it can.

ARC's first technical report: Eliciting Latent Knowledge

In the section "New counterexample: better inference in the human Bayes net", what does it mean that the reporter does perfect inference in the human Bayes net? I am also unclear on how the modified counterexample is different.

My current understanding: The reporter is doing inference using and the action sequence and does not use to do inference ( is inferred). The reporter has an exact copy of the human Bayes net and now fixes the nodes for and the action sequence. Then it infers the probability for all possible combinations of values each node can have (including ) (i.e. the joint probability distribution).

I am not sure here. Is the reporter not using ? The graphic in that section shows a red arrow from in the predictor, to in the human Bayes net model that the reporter uses. But that could be about the better counterexample already.

Now we assume that the model knows how to map a question in natural language onto nodes in the Bayes net, and that it can then translate the values of nodes into answers to questions. The model can then use the joint probability distribution and the law of total probability to calculate the probabilities of nodes/events occurring, which can then be used to answer questions.

The only difference in the better counterexample is that we now also fix the value of to whatever our predictor part of the model said would happen. And we do not assume that our predictor works perfectly, hence our reporter can give wrong answers because of that.
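
The two cases above (only the action fixed, versus the predicted video fixed as well) can be sketched on a toy net. This is only an illustration under made-up assumptions: the three binary nodes (`action`, `diamond`, `video`), the table values, and the helper names are all hypothetical and not taken from the report, and the brute-force enumeration used here says nothing about whether inference is computationally feasible in a realistically sized net.

```python
from itertools import product

# Hypothetical conditional probability tables; none of these numbers are from the report.
P_DIAMOND_GIVEN_ACTION = {"guard": 0.99, "idle": 0.60}  # P(diamond=True | action)
P_VIDEO_GIVEN_DIAMOND = {True: 0.95, False: 0.20}       # P(video=True | diamond)

def bernoulli(p_true, value):
    """Probability that a binary node takes `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint_given_action(action):
    """Joint distribution over (diamond, video) with the action node fixed."""
    return {
        (d, v): bernoulli(P_DIAMOND_GIVEN_ACTION[action], d)
        * bernoulli(P_VIDEO_GIVEN_DIAMOND[d], v)
        for d, v in product((True, False), repeat=2)
    }

def marginal(table, index, value):
    """Law of total probability: sum every joint entry where node `index` equals `value`."""
    return sum(p for assignment, p in table.items() if assignment[index] == value)

def condition(table, index, value):
    """Additionally fix node `index` to `value` and renormalise the remaining entries."""
    kept = {a: p for a, p in table.items() if a[index] == value}
    total = sum(kept.values())
    return {a: p / total for a, p in kept.items()}

DIAMOND, VIDEO = 0, 1  # positions of the nodes inside each assignment tuple
table = joint_given_action("idle")

# Case 1: only the action is fixed; the video node is inferred along with the rest.
p_diamond = marginal(table, DIAMOND, True)

# Case 2: the video node is also fixed to what the predictor said would happen.
p_diamond_given_video = marginal(condition(table, VIDEO, True), DIAMOND, True)
```

In this sketch the answer to "is the diamond there?" changes once the video node is clamped, and if the clamped video value were wrong (an imperfect predictor), the conditioned answer would inherit that error.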

And now when we have , then calculating the joint probability distribution becomes computationally feasible? Are we still assuming that the reporter does perfect inference in the human Bayes net, given that our predictor predicted correctly?

Johannes C. Mayer's Shortform

**The "Fu*k it" justification**:
Sometimes people seem to say "fu*k it" towards some particular thing. I think this is a way to justify one's intuitions. You intuitively feel like you should not care about something, but you actually can't put your intuition into words. Except you can say "fu*k it" to convey your conclusion, without any justification.

Johannes C. Mayer's Shortform

There could be, but there does not need to be, I would say. Or maybe I really do not get what you are talking about. It could really be that, if the cryptographic lock were not in place, you could take the box, and that nothing else prevents you from doing this. I guess I have an implicit model where I look at the world from a Cartesian perspective. So are you saying that this is about counterfactuals, that I am using them in a way that is not valid, and that I do not acknowledge this?

Johannes C. Mayer's Shortform

I don't really get that. For example, you could put a cryptographic lock on the box (let's assume there is no way around it without the key), and then throw away the key. It seems that now you actually are not able to access the box, because you do not have the key. And you can also at the same time know that this is the case.

Not sure why this should be impossible to say.

Johannes C. Mayer's Shortform

The interesting thing is that you can end up in a scenario where you actually know for sure that the other box, the one that you did not pick, contains 1,000,000$. You can't take it, though, because of the pre-commitment mechanism, and this pre-commitment mechanism is the only thing that prevents you from taking it. What I found interesting is that such a situation can arise at all.

> You have a system, that can predict perfectly what you will do in the future.

> In fact, I do not. This (like Newcomb) doesn't tell me anything about the world.

Also of course there is no system in reality that can predict you perfectly, but this is about an idealised scenario that is relevant because there are systems that can predict you with more than 50% accuracy.

Johannes C. Mayer's Shortform

You have a system that can perfectly predict what you will do in the future. It presents you with two opaque boxes. If you take both boxes, then it will place 10$ in one box and 0$ in the other. If you take only one box, then it will place 10$ in one box and 1,000,000$ in the other. The system does not use its predictive power to predict which box you will choose, but only to determine whether you will choose one or two boxes. It uses a random number generator to determine which amount goes into which box.

This is a modified version of Newcomb's problem.

Imagine that you are an agent that can reliably pre-commit to an action. Now imagine you pre-commit to taking only one box, in a way that makes it impossible for you not to uphold that commitment. If you then choose a box and get 10$, you know for sure that the other box contains 1,000,000$.
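
The setup can be checked with a small simulation. This is a minimal sketch under simplifying assumptions: the perfect predictor is modelled by simply handing the agent's fixed policy (`"one"` or `"two"`) to the box-filling function, and all function names are made up for illustration.

```python
import random

def fill_boxes(policy, rng):
    """The predictor knows the policy perfectly; a random shuffle decides which box gets what."""
    amounts = [10, 0] if policy == "two" else [10, 1_000_000]
    rng.shuffle(amounts)
    return amounts

def play(policy, rng):
    """Return (payout, content of the box left behind, or None if both boxes were taken)."""
    boxes = fill_boxes(policy, rng)
    if policy == "two":
        return sum(boxes), None
    pick = rng.randrange(2)  # a committed one-boxer picks one of the two boxes
    return boxes[pick], boxes[1 - pick]

rng = random.Random(0)
one_box_rounds = [play("one", rng) for _ in range(10_000)]
two_box_rounds = [play("two", rng) for _ in range(10_000)]

# Every one-boxer who opened the 10$ box left 1,000,000$ behind.
unlucky = [left for payout, left in one_box_rounds if payout == 10]
print(all(left == 1_000_000 for left in unlucky))
```

Because the predictor only fills the boxes this way for agents who really do take one box, seeing 10$ in your box is conclusive evidence about the other one, while a two-boxer always walks away with exactly 10$.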

Arguing "By Definition"

Sometimes people say: "Too much of X is bad for you". Well, that is true by the definition of "too much". You can use this to argue that the important point the person is actually trying to convey is that it is possible, probably not too hard, and quite likely if you are not careful, to get so much of X that it is bad for you.

From what I have heard (I have not researched any of this very thoroughly), palatability is not the problem directly, but something closely related is. It is not the case that someone would eat a lot just because the food is so tasty. Rather, the composition of processed food is often very different from that of unprocessed food, and this affects how our body responds, for example when we feel full. Eating some Froot Loops is very different from eating a mango.

This might not be the only effect, or even the main one, but I would guess that it is a significant factor. Intuitively, it seems much less likely to me that somebody would become obese if you put them on a diet of vegetables, fresh fruit, legumes, and whole grains (probably also making it low-sodium, to bring palatability down to normal levels).