viktor.rehnberg

Some blindspots in rationality and effective altruism

Good catch

This seems to ask the question 'Is a change in the quality of x like colour actually causal to outcomes y?'

Yes, I think you are right. When modeling, you can usually learn correlations that are useful for prediction, but if the correlations are spurious they might disappear when the distribution changes. So to know whether p(y|x) changes from observing x alone, we would probably need x to capture all causal relationships to y?
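A toy simulation makes the worry concrete. This is only a sketch under assumptions of my own: a hidden cause `z` drives `y`, and `x` merely tracks `z` during training but not after the shift. The function name and noise levels are hypothetical.

```python
import numpy as np

def simulate_shift(n=5000, seed=0):
    """Fit y ~ x where x is only spuriously related to y, then shift the regime."""
    rng = np.random.default_rng(seed)

    # Training regime: z causes y, and x happens to track z closely.
    z_train = rng.normal(size=n)
    x_train = z_train + 0.1 * rng.normal(size=n)
    y_train = z_train + 0.1 * rng.normal(size=n)

    # Shifted regime: z still causes y, but x no longer tracks z.
    z_test = rng.normal(size=n)
    x_test = rng.normal(size=n)
    y_test = z_test + 0.1 * rng.normal(size=n)

    # Linear model learned only from (x, y) pairs in the training regime.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)

    def mse(x, y):
        return float(np.mean((slope * x + intercept - y) ** 2))

    return {
        "train_corr": float(np.corrcoef(x_train, y_train)[0, 1]),
        "test_corr": float(np.corrcoef(x_test, y_test)[0, 1]),
        "train_mse": mse(x_train, y_train),
        "test_mse": mse(x_test, y_test),
    }

if __name__ == "__main__":
    for name, value in simulate_shift().items():
        print(f"{name}: {value:.3f}")
```

Nothing in the marginal distribution of `x` changes between the two regimes here, which is the point: observing x alone would not reveal that p(y|x) has changed.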

Some blindspots in rationality and effective altruism

Good point, my example with the figure is lacking with regard to 1, simply because we are assuming that x is known completely and that the observed y are true instances of what we want to measure. From this I realize that I am confused about when some uncertainties should be called aleatoric or epistemic.

When I think I can correctly point out epistemic uncertainty:

- If the y that are observed are not the ones that we actually want then I'd call this uncertainty epistemic. This could be if we are using tired undergrads to count the number of pips of each rolled die and they miscount for some fraction of the dice.
- If you haven't seen similar x before then you have epistemic uncertainty because you have uncertainty about which model or model parameters to use when estimating y. (This is the one I wrote about previously and the one shown in the figure)

My confusion from 1:

- If the conditions of the experiment change. Our undergrads start to pull dice from another bag with an entirely different distribution p(y|x), then we have insufficient knowledge to estimate y and I would call this epistemic uncertainty.
- If x is lacking in some information to do good estimates of y. x is the color of the die and when we have thrown enough dice from our experimental distribution we get a good estimate of p(y|x) and our uncertainty doesn't increase with more rolls, which makes me think that it is aleatoric uncertainty. But on the other hand x is not sufficient to spot when we have a new type of die (see previous point) and if we knew more about the dice we could do better estimates which makes me think that it is epistemic uncertainty.

You bring up a good point in 1 and I agree that this feels like it should be epistemic uncertainty, but at some point the boundary between inherent uncertainty in the process and uncertainty from knowing too little about the process becomes vague to me and I can't really tell when a process is aleatoric or epistemic.

Some blindspots in rationality and effective altruism

Thanks, I realised that I provided zero context for the figure. I added some.

- is x like the input data?

- could y correspond to something like the supervised (continuous) labels of a neural network, which inputs are matched too?

Yes. The example is about estimating y *given* x where x is assumed to be known.

- does epistemic uncertainty here refer to that inputs for x could be much different from the current training dataset if sampled again (where new samples could turn out be outside of the current distribution)?

Not quite, we are still thinking of uncertainty only as applied to y. Epistemic uncertainty here refers to regions where the knowledge and data are insufficient to give a good estimate of y given x from these regions.

To compare it with your dice example, consider x to be some quality of the die such that you think dice with similar x will give similar rolls y. Then aleatoric uncertainty is high for dice where you remain uncertain about the values of new rolls even after having rolled several similar dice, and rolling more similar dice will not help. Epistemic uncertainty, in contrast, is high for dice with qualities you haven't seen enough of.
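One common way to put numbers on this split is to keep a posterior over the die's face probabilities and decompose the predictive entropy. The sketch below is my own illustration, not something from the post: it assumes a uniform Dirichlet prior, and the function names are hypothetical. Aleatoric uncertainty is taken as the expected entropy of a roll under the posterior, and epistemic uncertainty as the remainder of the total predictive entropy.

```python
import numpy as np

def entropy_bits(p, axis=-1):
    """Shannon entropy in bits, clipping to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log2(p)).sum(axis=axis)

def uncertainty_split(counts, n_samples=20000, seed=0):
    """Return (aleatoric, epistemic) uncertainty in bits for the next roll.

    counts: observed number of times each face came up for similar dice.
    Assumes a uniform Dirichlet prior over the face probabilities.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(counts, dtype=float) + 1.0
    probs = rng.dirichlet(alpha, size=n_samples)   # posterior samples

    total = entropy_bits(probs.mean(axis=0))       # entropy of the predictive
    aleatoric = entropy_bits(probs).mean()         # expected entropy of a roll
    epistemic = total - aleatoric                  # what more rolls could remove
    return float(aleatoric), float(epistemic)

if __name__ == "__main__":
    print("two rolls seen:  ", uncertainty_split([1, 0, 0, 1, 0, 0]))
    print("6000 rolls seen: ", uncertainty_split([1000] * 6))
```

With only two rolls the epistemic term is sizeable. After thousands of rolls of a fair-looking die it shrinks towards zero, while the aleatoric term stays near log2(6) because more rolls cannot make the next roll predictable.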

Some blindspots in rationality and effective altruism

This is my go-to figure when thinking about aleatoric vs epistemic uncertainty.

Edit: In the context of the figure, the *aleatoric* uncertainty is high in the left cluster because the uncertainty about where a new data point will land is high and is not reduced by more training examples. The *epistemic* uncertainty is high in regions where there is insufficient data or knowledge to produce an accurate estimate of the output; it would go down with more training data in these regions.

Environments as a bottleneck in AGI development

This seems related to a thought I had when reading An overview of 11 proposals for building safe advanced AI. How much harder is it to find an environment that promotes aligned AGI compared to any AGI?

It seems that a lot of the proposals for AGI under the current ML paradigm either utilize oversight to get a second chance or add an extra term to the loss function to promote alignedness. How well either of these types of methods works seems to depend on the base rate of aligned AGI to any AGI that can emerge from a particular model and training environment. I'm thinking of it as roughly

$$\frac{P(\text{aligned AGI} \mid M, E)}{P(\text{AGI} \mid M, E)}$$

where $M$ is some model and $E$ is the training environment without safeguards to detect deceptive or otherwise catastrophic behavior.

This post seems to concern

how much does the environment compared to the model influence the emergence of AGI?

What I'm trying to get at is that I think a related important question is

how much does the alignedness of an emerging AGI depend on its environment compared to the model?

Limits of Current US Prediction Markets (PredictIt Case Study)

In around half of the equations there is an extra right parenthesis. It makes reading the equations a bit more work, as it changes the interpretation somewhat.

In most of the equations with an extra right parenthesis, I believe it is the leftmost one (of the right parentheses) that should be removed.

Occam's Razor

My own way of thinking of Occam's Razor is through model selection. Suppose you have two competing statements $H_1$ (the witch did it) and $H_2$ (it was chance, or possibly something other than a witch caused it) and some observations $D$ (the sequence came up 0101010101). Then the preferred statement is whichever is more probable, calculated as

$$p(H_i \mid D) = \frac{p(D \mid H_i)\, p(H_i)}{p(D)}$$

this is simply Bayes' rule, where

$$p(D \mid H_i) = \int p(D \mid \theta, H_i)\, p(\theta \mid H_i)\, d\theta$$

and the model is parametrized by some parameters $\theta$.

Now all this is just the mathematical way of writing that a hypothesis that has more parameters (or, more specifically, more possible outcomes that it predicts) will not be as strong a statement as one that predicts a smaller set of outcomes.

In the witch example this would be:

- $H_1$: There exists an advanced intelligent being (not much less intelligent than a human) that can do things beyond what has ever been reproduced scientifically, that for some reason chooses to live on our street and act mostly as a human, and that will choose to influence my sequence of coin tosses so that it ends up in some seeming pattern
- $H_2$: The coin toss is ruled by chance and might end up in the set of possible outcomes that seem to form a pattern
- $D$: The coin toss ended up as 0101010101
- $\theta$: The way I stated the hypotheses

Now what remains is to estimate the priors $p(H_i)$ and the fraction of outcomes that look like a pattern. We can skip $p(D)$ as we are interested in the ratio $p(H_1 \mid D) / p(H_2 \mid D)$.

Now, comparing the number of conditionals in the hypotheses and how surprised I am by them, I would roughly estimate the ratio of the priors as something like $2^{100}:1$ in favor of chance: the witch hypothesis goes against many beliefs about the world that I have formed over many years, it includes weird choices of living for this hypothetical alien entity, it picks out me as one possible agent among many in the neighborhood, and it singles out an arbitrary action of mine and an arbitrary set of outcomes.

For the sake of completeness: the fraction of outcomes that look like a pattern is hard to estimate exactly. My way of thinking about it is to ask how early in the sequence I would have postulated the specific sequence that it ended up as. After 0101, I think that 0101010101 is the most obvious way to continue the pattern. So roughly this is six bits of evidence.

In conclusion, I would say that the witch hypothesis is lacking around 94 bits of evidence for me to believe it as much as the chance hypothesis.
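As a sanity check on the arithmetic, here is a minimal Python sketch of the bookkeeping in log-odds (bits). The function names are my own, and the prior of 100 bits against the witch is an assumption: it is the value implied by the six bits of evidence and the 94-bit shortfall stated above.

```python
def posterior_bits(prior_bits, evidence_bits):
    """Posterior log-odds in bits: prior log-odds plus the log likelihood ratio."""
    return prior_bits + evidence_bits

def odds_bits_to_probability(log_odds_bits):
    """Convert log-odds in bits (witch vs. chance) to a probability for the witch."""
    odds = 2.0 ** log_odds_bits
    return odds / (1.0 + odds)

if __name__ == "__main__":
    prior = -100    # assumed: 2^100 : 1 against the witch
    evidence = 6    # the pattern is worth roughly six bits for the witch
    post = posterior_bits(prior, evidence)
    print("posterior log-odds (bits):", post)
    print("posterior probability of witch:", odds_bits_to_probability(post))
```

Working in bits keeps the update additive, so the gap to even odds can be read off directly as the number of further bits of evidence needed.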

The downside of this approach compared to Solomonoff induction and minimum message length is that it is clunkier to use, and it might be easy to forget to include conditionals or complexity in the priors, the same way they can be lost in the English language. The upside is that as a model it is simpler and less ad hoc, and it builds directly on the product rule of probability and the fact that probabilities sum to one, and should thus be preferred by Occam's Razor ;).

Simulacra Levels and their Interactions

Nice write up, I believe I have a better grasp of simulacra levels after this post.

(Think this is missing 1-3 additional roles. Discussion question, what is The Idealist?)

I'll take this as an exercise for the readers.

I'll start with a definition of how I see *The Idealist*: The Idealist is someone with an ideal of how people should act to have the best consequences in the world. In the simplest case this could be someone who believes the truth to be most important and that everyone should stay in Level 1. However, this type of idealist could ironically be seen as making a Level 2 move: "I'm telling the truth so that you will stay on Level 1 and tell the truth as well".

It becomes more complicated when the implications of the ideal affect higher simulacra levels. Consider an idealist whose ideal is that intelligence is the most important thing and that advanced communication and progressively higher-level reasoning are the way to achieve it. Such an idealist values the complexity arising in Level 4 and wants people to see their group as good and to move to a higher level.

What the types of idealist I can imagine have in common is that they want to affect the map so that people act in accordance with their ideal (Level 2 move), but they also want their group to be perceived as the cool group and will say things that make undecided people move towards their group (Level 4 move). Therefore I would say that The Idealist is a Level 2 + Level 4 player. Furthermore, The Idealist only says something if it improves both how others perceive the map and how others value the in-group (relative to the out-group).

I don't know if this is roughly what you had in mind when you thought about The Idealist, but this is my stab at the problem.

I'm leaving AI alignment – you better stay

As someone just starting out on the path towards becoming an AI safety researcher, I appreciate this post a lot. I have started worrying about not having enough impact in the field if I could not become established fast enough. However, reading this, I think it might serve me (and the field) better if I take my time and only enter the field properly if I find that my personal fit seems good and that I can stay in the field for a long time.

Furthermore, this post has helped me find possible worthwhile early projects that could increase my understanding of and personal fit for the field.

To disentangle the confusion I looked around for a few different definitions of the concepts. The definitions were mostly the same kind of vague statement of the type:

However, I found some useful tidbits

With this my updated view is that our confusion is probably because there is a free parameter in where to draw the line between aleatoric and epistemic uncertainty.

This seems reasonable, as more information can always lead to better estimates (at least down to considering wavefunctions, I suppose). But in most cases having and using this kind of information is infeasible, so it makes sense for the distinction between aleatoric and epistemic to depend on the problem at hand.
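To make the free parameter concrete, here is a small simulation sketch using the dice setup from earlier in the thread. Everything specific in it is an assumption for illustration: the bag holds fair dice and a hypothetical loaded die that only rolls 5 or 6, and both look the same from x (the color). The spread of y given x alone then splits, by the law of total variance, into a part that remains even if we knew the die type and a part that knowing the type would remove.

```python
import numpy as np

def variance_split(n=100_000, seed=0):
    """Split Var(y | color) into within-type and between-type parts."""
    rng = np.random.default_rng(seed)

    is_loaded = rng.random(n) < 0.5        # hidden die type, not visible in x
    fair = rng.integers(1, 7, size=n)      # fair die: faces 1..6
    loaded = rng.integers(5, 7, size=n)    # hypothetical loaded die: only 5 or 6
    y = np.where(is_loaded, loaded, fair)

    pooled = y.var()                       # spread seen when only color is known
    w = is_loaded.mean()
    # Average spread that remains once the die type is also known.
    within = w * y[is_loaded].var() + (1 - w) * y[~is_loaded].var()
    # Spread that knowing the die type would remove.
    between = pooled - within
    return float(pooled), float(within), float(between)

if __name__ == "__main__":
    pooled, within, between = variance_split()
    print(f"pooled: {pooled:.3f}  within type: {within:.3f}  between types: {between:.3f}")
```

If die type is treated as unobservable for the problem at hand, the whole pooled variance behaves like aleatoric uncertainty. If type could in principle be learned, the between-type part is better described as epistemic. Which description applies depends on where the line is drawn.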