
The author is associated with the Foundational Research Institute, which has a variety of interests closely connected to those of LessWrong, yet a few casual searches suggest they have not been mentioned here.

Briefly, they seem to be focused on averting suffering, approaching it from several angles: effective altruism outreach, animal suffering, and AI risk as a cause of great suffering.

Regarding the psychology of why people overestimate the correlation-causation link, I was just recently reading this, and something vaguely relevant caught my eye:

Later, Johnson-Laird put forward the theory that individuals reason by carrying out three fundamental steps [21]:

  1. They imagine a state of affairs in which the premises are true – i.e. they construct a mental model of them.

  2. They formulate, if possible, an informative conclusion true in the model.

  3. They check for an alternative model of the premises in which the putative conclusion is false.

If there is no such model, then the conclusion is a valid inference from the premises.
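The three steps quoted above can be sketched as a brute-force search over truth assignments. This is only a propositional toy, not the Johnson-Laird and Steedman program (which handled singly-quantified assertions); the function names and the rain/wet example are assumptions made up for illustration.

```python
from itertools import product

def models(variables, premises):
    """Step 1: yield every truth assignment in which all premises are true."""
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(premise(world) for premise in premises):
            yield world

def is_valid(variables, premises, conclusion):
    """Step 3: the conclusion is valid iff no model of the premises falsifies it."""
    return all(conclusion(world) for world in models(variables, premises))

def count_models(variables, premises):
    """Number of models to consider, a crude stand-in for task difficulty."""
    return sum(1 for _ in models(variables, premises))

variables = ["rain", "wet"]
if_rain_then_wet = lambda w: (not w["rain"]) or w["wet"]

# Modus ponens: "if rain then wet" and "rain", so "wet".
modus_ponens = [if_rain_then_wet, lambda w: w["rain"]]
# Affirming the consequent: "if rain then wet" and "wet", so "rain".
affirming_consequent = [if_rain_then_wet, lambda w: w["wet"]]

mp_valid = is_valid(variables, modus_ponens, lambda w: w["wet"])
ac_valid = is_valid(variables, affirming_consequent, lambda w: w["rain"])
print(mp_valid, count_models(variables, modus_ponens))
print(ac_valid, count_models(variables, affirming_consequent))
```

The invalid inference is the one where a second model exists (wet without rain), which is the alternative model step 3 asks the reasoner to look for.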

Johnson-Laird and Steedman implemented the theory in a computer program that made deductions from singly-quantified assertions, and its predictions about the relative difficulty of such problems were strikingly confirmed: the greater the number of models that have to be constructed in order to draw the correct conclusion, the harder the task [25]. Johnson-Laird concluded [22] that comprehension is a process of constructing a mental model, and set out his theory in an influential book [23]. Since then he has applied the idea to reasoning about Boolean circuitry [3] and to reasoning in modal logic [24].

I would be very surprised if this was not the case. Different fields already use different cutoffs for statistical significance (you might get away with p<0.05 in psychology, but particle physics likes its five-sigmas, and in genomics the cutoff will be hundreds or thousands of times smaller and vary heavily based on what exactly you're analyzing) and likewise have different expectations for effect sizes (psychology expects large effects, medicine expects medium effects, and genomics expects very small effects; e.g. for genetic influence on IQ, any claim of an allele with an effect larger than d=0.06 should be greeted with surprise and alarm).
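To put the sigma and p-value cutoffs on one scale, here is a minimal sketch that converts between them under a standard normal. It assumes the usual reading that a "five sigma" claim refers to a one-sided tail; the helper names are made up for illustration.

```python
import math
from statistics import NormalDist

def one_sided_p(sigma):
    """Probability of a standard normal exceeding `sigma` in one tail."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

def two_sided_p(sigma):
    """Probability of a standard normal falling beyond +/- `sigma`."""
    return math.erfc(sigma / math.sqrt(2))

def sigma_for_two_sided_p(p):
    """Inverse of two_sided_p: the z-score matching a two-sided cutoff p."""
    return NormalDist().inv_cdf(1 - p / 2)

for sigma in (2, 3, 5):
    print(sigma, one_sided_p(sigma), two_sided_p(sigma))
print(sigma_for_two_sided_p(0.05))
```

On this scale p<0.05 (two-sided) sits just under two sigma, while five sigma corresponds to a one-sided p of roughly 3 in 10 million.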

The existing defaults aren't usually well-justified: for example, why does psychology use 0.05 rather than 0.10 or 0.01?

This is a good point, and leads to what might be an interesting use of the experimental approach of linking correlations to causation: gauging whether the heuristics currently in use in a field are set at a suitable level, i.e. whether they reflect the degree to which correlation is evidence for causation in that field.

If you were to find, for example, that physics is churning out huge sigmas where it doesn't really need to, or psychology really really needs to up its standards of evidence (not that that in itself would be a surprising result), those could be very interesting results.

Of course, to run these experiments you need large samples of well-researched correlations you can easily and objectively test for causality, from all the fields you're looking at, which is no small requirement.

It's hard to say, because how would you measure this other than directly? To measure it directly you need a clear set of correlations which are proposed to be causal and randomized experiments to establish what the true causal relationship is, and both categories need to be sharply delineated in advance to avoid cherrypicking and retroactively confirming a correlation, so that you can say something like '11 out of the 100 proposed A->B causal relationships panned out'. This is pretty rare, although the few examples I've found from medicine tend to indicate under 10%. Not great. And we can't explain all of this away as the result of illusory correlations being thrown up by the standard statistical problems with findings, such as small n/sampling error, selection bias, publication bias, etc.
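A tally like '11 out of the 100' is itself an estimate with sampling error. As a rough sketch, the Wilson score interval below shows how wide the uncertainty on such a rate is; the choice of the Wilson interval and the 95% level are assumptions for illustration, not something proposed in the comment above.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# The illustrative tally from the text: 11 of 100 proposed links confirmed.
low, high = wilson_interval(11, 100)
print(f"{low:.3f} to {high:.3f}")
```

With only 100 pre-registered correlations, the plausible range for the true rate runs from roughly 6% to 19%, so comparing fields would need considerably larger samples.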

Say you do this, and you find that about 10% of all correlations in a dataset are shown to have a causal link. Can you then look for a correlation between certain aspects of a correlation (such as coefficient, field of study) and those correlations which are causal?

Building on this, you might establish something like "correlations at .9 are more likely to be causal than correlations at .7" and establish a causal mechanism for this. Alternatively, you might find that "correlations from the field of farkology are more often causal than correlations from spleen medicine", and find a causal explanation for this.
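The comparison described above amounts to computing the causal rate within each group of correlations. A minimal sketch follows, assuming each correlation is stored as a record with a field, a coefficient, and an experimentally determined causal flag. The four records are invented toy values that reuse the made-up field names from the text and carry no empirical weight.

```python
from collections import defaultdict

def causal_rate_by(records, key):
    """Fraction of correlations confirmed causal within each group given by `key`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [confirmed causal, total]
    for record in records:
        group = key(record)
        counts[group][0] += int(record["causal"])
        counts[group][1] += 1
    return {group: causal / total for group, (causal, total) in counts.items()}

# Invented toy records, purely to exercise the function.
records = [
    {"field": "farkology", "r": 0.9, "causal": True},
    {"field": "farkology", "r": 0.7, "causal": False},
    {"field": "spleen medicine", "r": 0.9, "causal": False},
    {"field": "spleen medicine", "r": 0.7, "causal": False},
]

by_field = causal_rate_by(records, lambda rec: rec["field"])
by_strength = causal_rate_by(
    records, lambda rec: "strong" if rec["r"] >= 0.8 else "weak"
)
print(by_field)
print(by_strength)
```

Any difference between groups found this way is itself only a correlation, so it would need the same scrutiny before being read as an effect of field or coefficient size.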

Part or all of this explanation might involve the size of the causal network. It could well be that both correlation coefficients and field of study are just proxy variables to describe the size of a network, and that's the only important factor in the ratio of correlations to causal links, but it might be the case that there is more to it.

This could lead to quite a bit of trouble in academic literature, as measures of what evidence a correlation is for causation will become dependent on a set of variables about the context you're working in, and this could potentially be gamed. In fact, that could be the case even with gwern's original proposition -- claiming you're working with a small causal net could be enough to lend strong evidence to a causal claim based on correlation, and it's only by having someone point out that your causal net is lacking that this evidence can have its weighting adjusted.

All these thoughts are sketchy outlines of an extension of what gwern's brought up. More considered comment is welcome.

For the basic interaction setup, yes. For a sense of community and for reliable collection of the logs, perhaps not. I'm also not sure how anonymous Omegle makes users to each other and itself.

What I was getting at is that the current setup allows for side-channel methods of getting information on your opponent (digging to find their identity, reading their Facebook page, etc.).

While I accept that this interaction could be one of many between the AI and the researcher, this can be simulated in the anonymous case via an 'I was previously GatekeeperXXX, I'm looking to resume a game with AIYYY' declaration in the public channel while still preserving the player's anonymity.


Prompted by Tuxedage learning to win, and various concerns about the current protocol, I have a plan to enable more AI-Box games whilst preserving the logs for public scrutiny.

See this: http://bæ