magfrump

Mathematician turned software engineer. I like swords and book clubs.

Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Battle of the Sexes

Nitpick:

By the end of the article I felt like I understood what you mean by Battle of the Sexes, but I didn't at the start: there is neither an explanation of the Battle of the Sexes game (even at the beginning of the section titled Battle of the Sexes!) nor a link to a post or article about it.

Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda

GPT-3 can already turn comments into code. We don't expect the reverse case to be fundamentally harder

I **would** expect the reverse case to be harder, possibly fundamentally. In a lot of code, the reader's level of context is very important to quality: if you asked me to write code to follow a specification I would think it was boring, but if you asked me to comment code that someone else wrote I would be very unhappy.

It's possible that it's just an order of magnitude harder, or hard in a way that is bad for human attention systems but that GPT-N would find easy. But I would predict that the project will hit a stumbling block of "it gives comments, but they are painful to parse," and there's at least some chance (10-30%?) that it will require some new insight.

Radical Probabilism

In the case where you get 1 heads, 5 tails, 25 heads, etc., and you are working with the assumption that the flips are independent, the Bayesian hypothesis will never converge, but it will actually give better predictions than the Socratic hypothesis most of the time. In particular, once it's halfway through one of the power-of-five blocks, it will assign P > .5 to the correct prediction every time. And if you expand the hypothesis space to allow each flip to depend on the previous flip, it will reach a hypothesis (the next flip will be the same as the last one) that actually performs VERY well, and does converge.
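A quick sketch of that comparison (hypothetical code; I'm assuming a Laplace-smoothed bias estimate for the independent-flips hypothesis, and counting a prediction as correct when it assigns P > .5 to the flip that actually happens):

```python
# Compare two predictors on the 1-heads, 5-tails, 25-heads, ... sequence.
# Toy sketch, not anyone's canonical model.

def block_sequence(n_blocks=5):
    """Alternating blocks of lengths 1, 5, 25, ... (1 = heads, 0 = tails)."""
    seq, sym = [], 1
    for k in range(n_blocks):
        seq += [sym] * 5**k
        sym = 1 - sym
    return seq

def iid_score(seq):
    """Independent-flips hypothesis with a Laplace-smoothed bias estimate."""
    correct = heads = 0
    for n, flip in enumerate(seq):
        p_heads = (heads + 1) / (n + 2)
        correct += (p_heads > 0.5) == (flip == 1)
        heads += flip
    return correct

def markov_score(seq):
    """'Next flip equals the last flip' hypothesis; scored on transitions."""
    return sum(a == b for a, b in zip(seq, seq[1:]))

seq = block_sequence()
print(len(seq), iid_score(seq), markov_score(seq))
```

On five blocks (781 flips), the "same as last" hypothesis is wrong only at the four block boundaries, while the i.i.d. predictor spends long stretches after each boundary predicting the wrong side.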

By "structured" I mean that I have a principled way of determining P(Evidence|Hypothesis); with the Socratic hypothesis I only have unprincipled ways of determining it.

I'm not sure what you mean by normalized, unless you mean that the Socratic hypothesis always gives probability 1 to the observed evidence, in which case it will dominate even the correct hypothesis if there is uncertainty.

Radical Probabilism

If you don't have statistical tests, then I don't see how you have a principled way to update away from your structured hypotheses, since the structured space will always give strictly better predictions than the Socratic hypothesis.

Introduction To The Infra-Bayesianism Sequence

I mean distinguishing between hypotheses that give very similar predictions--like the difference between a coin coming up heads 50% vs. 51% of the time.

As I said in my other comment, I think the assumption that you have discrete hypotheses is what I was missing.

Though for any countable set of hypotheses, you can expand that set by prepending some finite number of deterministic outcomes for the first several actions. The limit of this expansion is still countable, and the set of hypotheses that assign probability 1 to your observations is the same at every time step. I'm confused in this case about (1) whether or not this set of hypotheses is discrete and (2) whether hypotheses with shorter deterministic prefixes assign enough probability to allow meaningful inference in this case anyway.

I may mostly be confused about more basic statistical inference things that don't have to do with this setting.

Introduction To The Infra-Bayesianism Sequence

I think I see what I was confused about, which is that there is a specific countable family of properties, and these properties are discrete, so you aren't worried about **locally** distinguishing between hypotheses.

Introduction To The Infra-Bayesianism Sequence

I am confused about how the mechanisms and desiderata you lay out here can give meaningful differences of prediction over complete spaces of environments. Maybe it is possible to address this problem separately.

In particular, imagine the following environments:

E1: the outcome is deterministically 0 at even time steps and 1 at odd time steps.

E2: the outcome is deterministically 0 at even time steps up to step 100 and 1 at odd time steps up to step 100, then starts to be drawn randomly based on some uncomputable process.

E3: the outcome is drawn deterministically based on the action taken, in a way which happens to give 0 for the first 100 even step actions and 1 for the first 100 odd step actions.

All of these deterministically predict all of the first 200 observations with probability 1. I have an intuition that if you get that set of 200 observations, you should be favoring E1, but I don't see how your update rule makes that possible without some prior measure over environments or some notion of Occam's Razor.

In the examples you give there are systematic differences between the environments, but it isn't clear to me how the update is handled "locally" for environments that give the same predictions for all observed actions but diverge in the future, which seems sticky to me in practice.

Radical Probabilism

So one of the first thoughts I had when reading this was whether you can model any Radical Probabilist as a Bayesian agent that has some probability mass on "my assumptions are wrong," and lets that probability mass increase so that it questions its assumptions over a "reasonable timeframe," for whatever definition of reasonable.

For the case of coin flips, there is a clear assumption in the naive model that the coin flips are independent of each other, which can be fairly simply expressed as $P(flip_i = H \mid flip_{j} = H) = P(flip_i = H \mid flip_{j} = T) \,\forall j < i$. In the case of the coin that flips 1 heads, 5 tails, 25 heads, 125 tails, just evaluating $j = i - 1$ through the 31st flip gives P(H | last flip heads) = 24/25 and P(H | last flip tails) = 1/5, an outcome that is unlikely under independence at p ≈ 1e-4. That is approximately the difference in Bayesian weight between the hypothesis H1, "the coin flips heads 26/31 of the time" (P(E|H1) ≈ 1e-6), and H0, "the coin flips heads unpredictably, 1/2 the time" (P(E|H0) ≈ 4e-10), where H0 is the better hypothesis in the long run until you expand your hypothesis space.

So in this case, the "I don't have the hypothesis in my space" hypothesis actually wins out right around the 30th-32nd flip, possibly about the same time a human would be identifying the alternate hypothesis. That seems helpful!
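Spelling out that arithmetic as a sanity check (hypothetical code; the figures quoted above are rough, so this just confirms the orders of magnitude):

```python
# First 31 flips: 1 heads, 5 tails, 25 heads (1 = H, 0 = T).
seq = [1] + [0] * 5 + [1] * 25

# Lag-1 conditional frequencies P(H | last flip).
after_h = [b for a, b in zip(seq, seq[1:]) if a == 1]
after_t = [b for a, b in zip(seq, seq[1:]) if a == 0]
p_h_given_h = sum(after_h) / len(after_h)   # 24/25
p_h_given_t = sum(after_t) / len(after_t)   # 1/5

# Likelihood of the data under the two i.i.d. hypotheses.
heads, tails = sum(seq), len(seq) - sum(seq)
p_e_h1 = (26 / 31) ** heads * (5 / 31) ** tails   # best i.i.d. fit, ~1e-6
p_e_h0 = 0.5 ** len(seq)                          # fair coin, ~4e-10
print(p_h_given_h, p_h_given_t, p_e_h1, p_e_h0)
```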

However, this relies on the fact that this specific hypothesis has a single very clear assumption, and there is a single very clear calculation that can be done to test that assumption. Even in this case, though, the "independence of all coin flips" assumption makes a bunch more predictions, like that coin flips two apart are independent, etc. Calculating all of these may be theoretically possible, but it's arduous in practice and would give rise to far too much false evidence. For example, in real life there are often distributions that look a lot like normal distributions in the general sense that over half the data is within one standard deviation of the mean and 90% of the data is within two standard deviations, but where applying an actual hypothesis test of whether the data is normally distributed will point out some ways that it isn't exactly normal ("only 62% of the data is in this region, not 68%!", etc.).
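As a toy illustration of that last point, with a uniform sample standing in for "data that looks roughly normal" (all the numbers here come from this made-up example, not real data):

```python
import random

random.seed(0)
# Uniform data on [-sqrt(3), sqrt(3)] has variance ~1: over half of it is
# within one SD and everything is within two SDs, but the one-SD coverage
# is ~58%, not the ~68% a true normal distribution would give.
data = [random.uniform(-3**0.5, 3**0.5) for _ in range(10_000)]
mean = sum(data) / len(data)
sd = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5
within_1sd = sum(abs(x - mean) < sd for x in data) / len(data)
within_2sd = sum(abs(x - mean) < 2 * sd for x in data) / len(data)
print(within_1sd, within_2sd)
```

So the casual "looks normal" summary passes, while the exact coverage numbers give it away.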

It seems like the idea of having a specific hypothesis in your space labeled "I don't have the right hypothesis in my space" can work okay under the following conditions:

1. You have a clearly stated assumption which defines your current hypothesis space

2. You have a clear statistical test which shows when data doesn't match your hypothesis space

3. You know how much data needs to be present for that test to be valid--both in terms of the minimum for it to distinguish itself so you don't follow conspiracy theories, and something like a maximum (maybe this will naturally emerge from tracking the probability of the data given the null hypothesis, maybe not).
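A rough sketch of how those three conditions might be wired together for the coin-flip case (all the names and thresholds here are my invention):

```python
import random

def misfit_flag(seq, min_pairs=5, threshold=0.3):
    """Flag 'I don't have the right hypothesis in my space' for an i.i.d. coin model.

    Condition 1: the stated assumption is that flips are independent.
    Condition 2: the test compares P(H | last flip H) vs P(H | last flip T).
    Condition 3: min_pairs keeps the test from firing on tiny samples.
    """
    after_h = [b for a, b in zip(seq, seq[1:]) if a == 1]
    after_t = [b for a, b in zip(seq, seq[1:]) if a == 0]
    if min(len(after_h), len(after_t)) < min_pairs:
        return False  # not enough data for the test to be valid yet
    gap = abs(sum(after_h) / len(after_h) - sum(after_t) / len(after_t))
    return gap > threshold

structured = [1] + [0] * 5 + [1] * 25
print(misfit_flag(structured))  # → True: the lag-1 frequencies differ by ~0.76

random.seed(1)
fair = [random.randint(0, 1) for _ in range(1000)]
print(misfit_flag(fair))  # a genuinely fair coin should (almost surely) not trip it
```

This only handles the one assumption with the one test, which is exactly the limitation described above.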

I have no idea whether these conditions are reasonable "in practice" whatever that means, so I'm not really clear whether this framework is useful, but it's what I thought of and I want to share even negative results in case other people had the same thoughts.

Price Gouging and Speculative Costs

So is there a way to run a charitable organization that accomplishes this?

For example, if an org started ordering respirators to be built to expand capacity starting in January, would that be a good fit for Open Phil funding?

Would LW be able to convince people to move money to it at that point?

My primary use case for this would be at parties where whether or not someone is flirting is the core question.