Eric Neyman

I work at the Alignment Research Center (ARC). I write a blog on stuff I'm interested in (such as math, philosophy, puzzles, statistics, and elections): https://ericneyman.wordpress.com/

Sequences

Pseudorandomness Contest

Wiki Contributions

Comments

[Edit: this comment is probably retracted, although I'm still confused; see discussion below.]

I'd like clarification from Paul and Eliezer on how the bet would resolve, if it were about whether an AI could get IMO silver by 2024.

Besides not fitting within the time constraints (which I think is kind of a cop-out, because the process seems pretty parallelizable), I think the main reason that such a bet would resolve "no" is that problems 1, 2, and 6 had the form "find the right answer and prove it right", whereas the DeepMind AI was given the right answer and merely had to prove it right. Often, finding the right answer is a decent part of the challenge of solving an Olympiad problem. Quoting more extensively from Manifold commenter Balasar:

The "translations" to Lean do some pretty substantial work on behalf of the model. For example, in the theorem for problem 6, the Lean translation that the model is asked to prove includes an answer that was not given in the original IMO problem.

theorem imo_2024_p6 (IsAquaesulian : (ℚ → ℚ) → Prop)
    (IsAquaesulian_def : ∀ f, IsAquaesulian f ↔
      ∀ x y, f (x + f y) = f x + y ∨ f (f x + y) = x + f y) :
    IsLeast {(c : ℤ) | ∀ f, IsAquaesulian f →
      {(f r + f (-r)) | (r : ℚ)}.Finite ∧ {(f r + f (-r)) | (r : ℚ)}.ncard ≤ c} 2

The model is supposed to prove that "there exists an integer c such that for any aquaesulian function f there are at most c different rational numbers of the form f(r)+f(−r) for some rational number r, and find the smallest possible value of c".

The original IMO problem does not state that the smallest possible value of c is 2, but the theorem that AlphaProof was given to solve has the number 2 right there in the theorem statement. Figuring out that the answer is 2 is part of the problem.

Link: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/imo-2024-solutions/P6/index.html
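
For contrast, here's a rough sketch (my own restatement, not DeepMind's, and I haven't run it through a Lean compiler) of a formalization where the answer isn't supplied: the literal 2 is replaced by an existentially quantified value that the prover would have to exhibit and then show is least.

-- Sketch only: my own restatement, not DeepMind's, unchecked by Lean.
-- The answer is now a value `answer` to be exhibited, rather than the literal 2.
theorem imo_2024_p6_full (IsAquaesulian : (ℚ → ℚ) → Prop)
    (IsAquaesulian_def : ∀ f, IsAquaesulian f ↔
      ∀ x y, f (x + f y) = f x + y ∨ f (f x + y) = x + f y) :
    ∃ answer : ℤ,
      IsLeast {(c : ℤ) | ∀ f, IsAquaesulian f →
        {(f r + f (-r)) | (r : ℚ)}.Finite ∧ {(f r + f (-r)) | (r : ℚ)}.ncard ≤ c}
        answer :=
  sorry  -- proof omitted in this sketch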

I'm now happy to make this bet about Trump vs. Harris, if you're interested.

Looks like this bet is voided. My take is roughly that:

  • To the extent that our disagreement was rooted in a difference in how much to weight polls vs. priors, I continue to feel good about my side of the bet.
  • I wouldn't have made this bet after the debate. I'm not sure to what extent I should have known that Biden would perform terribly. I was blindsided by how poorly he did, but maybe shouldn't have been.
  • I definitely wouldn't have made this bet after the assassination attempt, which I think increased Trump's chances. But that event didn't update me on how good my side of the bet was when I made it.
  • I think there's like a 75-80% chance that Kamala Harris wins Virginia.

I frequently find myself in the following situation:

Friend: I'm confused about X
Me: Well, I'm not confused about X, but I bet it's because you have more information than me, and if I knew what you knew then I would be confused.

(E.g. my friend who knows more chemistry than me might say "I'm confused about how soap works", and while I have an explanation for why soap works, their confusion is at a deeper level: if I gave them my explanation of how soap works, it wouldn't actually resolve their confusion.)

This is different from the "usual" state of affairs, where you're not confused but you know more than the other person.

I would love to have a succinct word or phrase for this kind of being not-confused!

Yup, sounds good! I've set myself a reminder for November 9th.

I'd have to think more about 4:1 odds, but definitely happy to make this bet at 3:1 odds. How about my $300 to your $100?

(Edit: my proposal is to consider the bet voided if Biden or Trump dies or isn't the nominee.)

I think the FiveThirtyEight model is pretty bad this year. This makes sense to me, because it's a pretty different model: Nate Silver owns the former FiveThirtyEight model IP (and will be publishing it on his Substack later this month), so FiveThirtyEight needed to create a new model from scratch. They hired G. Elliott Morris, whose 2020 forecasts were pretty crazy in my opinion.

Here are some concrete things about FiveThirtyEight's model that don't make sense to me:

  • Their model says there's only a 30% chance that Pennsylvania, Michigan, or Wisconsin will be the tipping-point state. I think that's way too low; I would put this probability around 65%. In general, their probability distribution over which state will be the tipping-point state is way too spread out.
  • They expect Biden to win by 2.5 points; currently he's down by 1 point. I buy that there will be some amount of movement toward Biden in expectation because of the economic fundamentals, but 3.5 points seems like too much as an average case.
  • I think their Voter Power Index (VPI) doesn't make sense. VPI is a measure of how likely a voter in a given state is to flip the entire election. Their VPIs are way too similar. To pick a particularly egregious example, they think that a vote in Delaware is 1/7th as valuable as a vote in Pennsylvania. This is obvious nonsense: a vote in Delaware is less than 1% as valuable as a vote in Pennsylvania. In 2020, Biden won Delaware by 19%. If Biden wins only 50% of the vote in Delaware, he will have lost the election in an almost unprecedented landslide.
    • I claim that the following is a pretty good approximation to VPI: (probability that the state is the tipping-point state) * (number of electoral votes) / (number of voters). If you use their tipping-point state probabilities, you'll find that Pennsylvania's VPI should be roughly 4.3 times larger than New Hampshire's. Instead, FiveThirtyEight has New Hampshire's VPI being (slightly) higher than Pennsylvania's. I retract this: the approximation should instead be (tipping-point state probability) / (number of voters). Their VPI numbers now seem pretty consistent with their tipping-point probabilities to me, although I still think their tipping-point probabilities are wrong. (Both approximations are sketched below.)
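
For concreteness, here's a minimal sketch of the two approximations. The function names and all numeric inputs are mine and purely illustrative; they are not FiveThirtyEight's figures.

# Minimal sketch of the two VPI approximations discussed above.
# All numbers below are made up for illustration; they are NOT FiveThirtyEight's data.

def vpi_original(p_tipping_point: float, electoral_votes: int, num_voters: int) -> float:
    """My original (retracted) approximation: P(tipping point) * EV / voters."""
    return p_tipping_point * electoral_votes / num_voters

def vpi_corrected(p_tipping_point: float, num_voters: int) -> float:
    """The corrected approximation: P(tipping point) / voters."""
    return p_tipping_point / num_voters

# Hypothetical inputs, chosen only to show how the comparison works.
pa = {"p_tip": 0.20, "ev": 19, "voters": 7_000_000}  # Pennsylvania-like
nh = {"p_tip": 0.01, "ev": 4, "voters": 800_000}     # New Hampshire-like

ratio = vpi_corrected(pa["p_tip"], pa["voters"]) / vpi_corrected(nh["p_tip"], nh["voters"])
print(f"Under these made-up inputs, PA's VPI is ~{ratio:.1f}x NH's")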

The Economist also has a model, which gives Trump a 2/3 chance of winning. I think that model is pretty bad too. For example, I think Biden is much more than 70% likely to win Virginia and New Hampshire. I haven't dug into the details of the model to get a better sense of what I think they're doing wrong.


One example of (2) is disapproving of publishing AI alignment research that may advance AI capabilities. That's because you're criticizing the research not on the basis of "this is wrong" but on the basis of "it was bad to say this, even if it's right".

People like to talk about decoupling vs. contextualizing norms. To summarize: decoupling norms encourage arguments to be assessed in isolation from their surrounding context, while contextualizing norms treat the context around an argument as really important.

I think it's worth distinguishing between two kinds of contextualizing:

(1) If someone says X, updating on the fact that they are the sort of person who would say X. (E.g. if most people who say X in fact believe Y, contextualizing norms are fine with assuming that your interlocutor believes Y unless they say otherwise.)

(2) In a discussion where someone says X, considering "is it good for the world to be saying X" to be an importantly relevant question.

I think these are pretty different and it would be nice to have separate terms for them.
