Eric Neyman

I'm a 3rd year PhD student at Columbia. My academic interests lie in mechanism design and algorithms related to the acquisition of knowledge. I write a blog on stuff I'm interested in (such as math, philosophy, puzzles, statistics, and elections): https://ericneyman.wordpress.com/

Sequences

Pseudorandomness Contest

Comments

I'm curious what disagree votes mean here. Are people disagreeing with my first sentence? Or that the particular questions I asked are useful to consider? Or, like, the vibes of the post?

(Edit: I wrote this when the agree-disagree score was -15 or so.)

I think that people who work on AI alignment (including me) have generally not put enough thought into the question of whether a world where we build an aligned AI is better by their values than a world where we build an unaligned AI. I'd be interested in hearing people's answers to this question. Or, if you want more specific questions:

  • By your values, do you think a misaligned AI creates a world that "rounds to zero", or still has substantial positive value?
  • A common story for why aligned AI goes well goes something like: "If we (i.e. humanity) align AI, we can and will use it to figure out what we should use it for, and then we will use it in that way." To what extent is aligned AI going well contingent on something like this happening, and how likely do you think it is to happen? Why?
  • To what extent is your belief that aligned AI would go well contingent on some sort of assumption like: my idealized values are the same as the idealized values of the people or coalition who will control the aligned AI?
  • Do you care about AI welfare? Does your answer depend on whether the AI is aligned? If we build an aligned AI, how likely is it that we will create a world that treats AI welfare as an important consideration? What if we build a misaligned AI?
  • Do you think that, to a first approximation, most of the possible value of the future happens in worlds that are optimized for something that resembles your current or idealized values? How bad is it to mostly sacrifice each of these? (What if the future world's values are similar to yours, but it is only kinda effectual at pursuing them? What if the world is optimized for something that's only slightly correlated with your values?) How likely are these various options under an aligned AI future vs. an unaligned AI future?

Yeah, there's definitely value in experts being allowed to submit multiple times, allowing them to update on other experts' submissions. This is basically the frame taken in Chapter 8, where Alice and Bob update their estimate based on the other's estimate at each step. This is generally the way prediction markets work, and I think it's an understudied perspective (perhaps because it's more difficult to reason about than if you assume that each expert's estimate is static, i.e. does not depend on other experts' estimates).

Thanks! I think the reason I didn't give those expressions is that they're not very enlightening. See here for l = 2 on (0, 1/2] and here for l = 4 on [1/2, 1).

Thanks! Here are some brief responses:

From the high level summary here it sounds like you're offloading the task of aggregation to the forecasters themselves. It's odd to me that you're describing this as arbitrage.

Here's what I say about this anticipated objection in the thesis:

For many reasons, the expert may wish to make arbitrage impossible. First, the principal may wish to know whether the experts are in agreement: if they are not, for instance, the principal may want to elicit opinions from more experts. If the experts collude to report an aggregate value (as in our example), the principal does not find out whether they originally agreed. Second, even if the principal only seeks to act based on some aggregate of the experts' opinions, their method of aggregation may be different from the one that experts use to collude. For instance, the principal may have a private opinion on the trustworthiness of each expert and wishes to average the experts' opinions with corresponding weights. Collusion among the experts denies the principal this opportunity. Third, a principal may wish to track the accuracy of each individual expert (to figure out which experts to trust more in the future, for instance), and collusion makes this impossible. Fourth, the space of collusion strategies that constitute arbitrage is large. In our example above, any report in [0.546, 0.637] would guarantee a profit; and this does not even mention strategies in which experts report different probabilities. As such, the principal may not even be able to recover basic information about the experts' beliefs from their reports.
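
To make the notion of arbitrage concrete, here is a minimal sketch. It is not the example from the thesis (so the numbers will not match the interval quoted above); it assumes the quadratic scoring rule and two hypothetical experts with beliefs 0.5 and 0.7. Under those assumptions, if both experts report the average of their beliefs, their combined score goes up no matter which outcome occurs.

```python
def quadratic_score(report: float, outcome: int) -> float:
    """Quadratic (Brier-style) score; higher is better. Assumed rule for illustration."""
    return 1.0 - (outcome - report) ** 2


def total_score(reports, outcome: int) -> float:
    """Sum of the scores paid out to all experts for a given outcome."""
    return sum(quadratic_score(r, outcome) for r in reports)


# Hypothetical beliefs, chosen only for illustration.
beliefs = [0.5, 0.7]

# Collusion strategy: every expert reports the average belief.
mean_report = sum(beliefs) / len(beliefs)
colluding_reports = [mean_report] * len(beliefs)

# Gain in combined score from colluding, for each possible outcome.
gains = {
    outcome: total_score(colluding_reports, outcome) - total_score(beliefs, outcome)
    for outcome in (0, 1)
}

for outcome, gain in gains.items():
    print(f"outcome={outcome}: combined gain from colluding = {gain:.4f}")
```

The gain is positive for both outcomes because the quadratic score is strictly concave in the report, so the experts can split a guaranteed surplus among themselves.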

For example, when I worked with IARPA on geopolitical forecasting, our forecasters would get financial rewards depending on what percentile they were in relative to other forecasters.

This would indeed be arbitrage-free, but likely not proper: it wouldn't necessarily incentivize each expert to report their true belief; instead, an expert's optimal report is going to be some sort of function of the expert's belief about the joint probability distribution over the experts' beliefs. (I'm not sure how much this matters in practice -- I defer to you on that.)
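
For readers unfamiliar with the term, here is a rough sketch of what "proper" means, using the quadratic scoring rule as an assumed example (the percentile-based scheme above is not modeled here). A rule is proper if an expert maximizes their expected score, computed under their own belief, by reporting that belief. The function names and the grid search are illustrative choices, not anything from the thesis.

```python
def quadratic_score(report: float, outcome: int) -> float:
    """Quadratic (Brier-style) score; higher is better. Assumed rule for illustration."""
    return 1.0 - (outcome - report) ** 2


def expected_score(report: float, belief: float) -> float:
    """Expected score of a report when the expert thinks the event has probability `belief`."""
    return belief * quadratic_score(report, 1) + (1.0 - belief) * quadratic_score(report, 0)


def best_report(belief: float, grid_size: int = 101) -> float:
    """Search a grid of candidate reports for the one with the highest expected score."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda report: expected_score(report, belief))


for belief in (0.25, 0.7):
    print(f"belief={belief}: best report on the grid = {best_report(belief)}")
```

Under a rule that pays by rank against other forecasters, the analogous calculation would also have to account for the expert's beliefs about what the others report, which is why the optimal report can drift away from the expert's true belief.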

It's surprising to me that you could disincentivize forecasters from reporting the aggregate as their individual forecast.

In Chapter 4, we are thinking of experts as having immutable beliefs, rather than beliefs that change upon hearing other experts' beliefs. Is this a silly model? If you want, you can think of these beliefs as each expert's belief after talking to the other experts a bunch. In theory(?) the experts' beliefs should converge (though I'm not actually clear what happens if the experts are computationally bounded); but in practice, experts often don't converge (see e.g. the FRI adversarial collaboration on AI risk).

It seems to me that under sufficiently pessimistic conditions, there would be no good way to aggregate those two forecasts.

Yup -- in my summary I described "robust aggregation" as "finding an aggregation strategy that works as well as possible in the worst case over a broad class of possible information structures." In fact, you can't do anything interesting in the worst case over all information structures. The assumption I make in the chapter in order to get interesting results is, roughly, that experts' information is substitutable rather than complementary (on average over the information structure). The sort of scenario you describe in your example is the type of example where Alice and Bob's information might be complementary.

Great questions!

  1. I didn't work directly on prediction markets. The one place that my thesis touches on prediction markets (outside of general background) is in Chapter 5, page 106, where I give an interpretation of QA pooling in terms of a particular kind of prediction market called a cost function market. This is a type of prediction market where participants trade with a centralized market maker, rather than having an order book. QA pooling might have implications in terms of the right way to structure these markets if you want to allow multiple experts to place trades at the same time, without having the market update in between. (Maybe this is useful in blockchain contexts if market prices can only update every time a new block is created? I'm just spitballing; I don't really understand how blockchains work.)
  2. I think that for most contexts, this question doesn't quite make sense, because there's only one question being forecast. The one exception is where I talk about learning weights for experts over the course of multiple questions (in Chapter 5 and especially 6). Since I talk about competing with the best weighted combination of experts in hindsight, the problem doesn't immediately make sense if some experts don't answer some questions. However, if you specify a "default thing to do" if some expert doesn't participate (e.g. take all the other experts' weights and renormalize them to add to 1), then you can get the question to make sense again. I didn't explore this, but my guess is that there are some nice generalizations in this direction.
  3. I don't! This is Question 4.5.2, on page 94 :) Unfortunately, I would conjecture (70%) that no such contract function exists.
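
As a rough sketch of the "default thing to do" mentioned in point 2: drop the experts who did not answer, rescale the remaining weights so they add to 1, and pool. The weighted average used for pooling here is an assumption made for simplicity (the thesis discusses other pooling methods), and the expert names and numbers are hypothetical.

```python
from typing import Dict, Optional, Set


def renormalized_weights(weights: Dict[str, float], participating: Set[str]) -> Dict[str, float]:
    """Keep only participating experts and rescale their weights to sum to 1."""
    active = {name: w for name, w in weights.items() if name in participating}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("no participating expert has positive weight")
    return {name: w / total for name, w in active.items()}


def pooled_forecast(weights: Dict[str, float], forecasts: Dict[str, Optional[float]]) -> float:
    """Weighted average of the forecasts that were actually submitted (None = no answer)."""
    participating = {name for name, f in forecasts.items() if f is not None}
    active = renormalized_weights(weights, participating)
    return sum(w * forecasts[name] for name, w in active.items())


# Hypothetical learned weights and one question that Bob skipped.
weights = {"alice": 0.5, "bob": 0.3, "carol": 0.2}
forecasts = {"alice": 0.6, "bob": None, "carol": 0.9}

active_weights = renormalized_weights(weights, {"alice", "carol"})
pooled = pooled_forecast(weights, forecasts)
print(active_weights, pooled)
```

Whether the regret guarantees survive under this default is exactly the part that was not explored; the sketch only shows that the aggregation step itself stays well defined.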

(Note: I work with Paul at ARC theory. These views are my own and Paul did not ask me to write this comment.)

I think the following norm of civil discourse is super important: do not accuse someone of acting in bad faith, unless you have really strong evidence. An accusation of bad faith makes it basically impossible to proceed with discussion and seek truth together, because if you're treating someone's words as a calculated move in furtherance of their personal agenda, then you can't take those words at face value.

I believe that this post violates this norm pretty egregiously. It begins by saying that hiding your beliefs "is lying". I'm pretty confident that the sort of belief-hiding being discussed in the post is not something most people would label "lying" (see Ryan's comment), and it definitely isn't a central example of lying. (And so in effect it labels a particular behavior "lying" in an attempt to associate it with behaviors generally considered worse.)

The post then confidently asserts that Paul Christiano hides his beliefs in order to promote RSPs. The post presents very little evidence that this is what's going on, and Paul's account seems consistent with the facts (and I believe him).

So in effect, it accuses Paul and others of lying, cowardice, and bad faith on what I consider to be very little evidence.

Edited to add: What should the authors have done instead? I think they should have engaged in a public dialogue with one or more of the people they call out / believe to be acting dishonestly. The first line of the dialogue should maybe have been: "I believe you have been hiding your beliefs, for [reasons]. I think this is really bad, for [reasons]. I'd like to hear your perspective."

To elaborate on my feelings about the truck:

  • If it is meant as an attack on Paul, then it feels pretty bad/norm-violating to me. I don't know what general principle I endorse that makes it not okay: maybe something like "don't attack people in a really public and flashy way unless they're super high-profile or hold an important public office"? If you'd like I can poke at the feeling more. Seems like some people in the Twitter thread (Alex Lawsen, Neel Nanda) share the feeling.
  • If I'm wrong and it's not an attack, I still think they should have gotten Paul's consent, and I think the fact that it might be interpreted as an attack (by people seeing the truck) is also relevant.

(Obviously, I think the events "this is at least partially an attack on Paul" and "at least one of the authors of this post is connected to Control AI" are positively correlated, since this post is an attack on Paul. My probabilities are roughly 85% and 97%*, respectively.)

*For a broad-ish definition of "connected to"

I don't particularly see a reason to dox the people behind the truck, though I am not totally sure. My bar against doxxing is pretty high, though I do care about people being held accountable for large scale actions they take.

That's fair. I think that it would be better for the world if Control AI were not anonymous, and I judge the group negatively for being anonymous. On the other hand, I don't think I endorse them being doxxed. So perhaps my request to Connor and Gabriel is: please share what connection you have to Control AI, if any, and share what more information you have permission to share.

(Conflict of interest note: I work at ARC, Paul Christiano's org. Paul did not ask me to write this comment. I first heard about the truck (below) from him, though I later ran into it independently online.)

There is an anonymous group of people called Control AI, whose goal is to convince people to be against responsible scaling policies because they insufficiently constrain AI labs' actions. See their Twitter account and website (also anonymous; edit: the website now identifies Andrea Miotti of Conjecture as the director). (I first ran into Control AI via this tweet, which uses color-distorting visual effects to portray Anthropic CEO Dario Amodei in an unflattering light, in a way that's reminiscent of political attack ads.)

Control AI has rented a truck that has been circling London's Parliament Square. The truck plays a video of "Dr. Paul Christiano (Made ChatGPT Possible; Government AI adviser)" saying that there's a 10-20% chance of an AI takeover and an overall 50% chance of doom, and of Sam Altman saying that the "bad case" of AGI is "lights out for all of us". The back of the truck says "Responsible Scaling: No checks, No limits, No control". The video of Paul seems to me to be an attack on Paul (but see Twitter discussion here).

I currently strongly believe that the authors of this post are either in part responsible for Control AI, or at least have been working with or in contact with Control AI. That's because of the focus on RSPs and because both Connor Leahy and Gabriel Alfour have retweeted Control AI (which has a relatively small following).

Connor/Gabriel -- if you are connected with Control AI, I think it's important to make this clear, for a few reasons. First, if you're trying to drive policy change, people should know who you are, at minimum so they can engage with you. Second, I think this is particularly true if the policy campaign involves attacks on people who disagree with you. And third, because I think it's useful context for understanding this post.

Could you clarify if you have any connection (even informal) with Control AI? If you are affiliated with them, could you describe how you're affiliated and who else is involved?

EDIT: This Guardian article confirms that Connor is (among others) responsible for Control AI.


Social graces are not only about polite lies but about social decision procedures on maintaining game theoretic equilibria to maintain cooperation favoring payoff structures.

This sounds interesting. For the sake of concreteness, could you give a couple of central examples of this?
