Wiki Contributions


Christiano, Cotra, and Yudkowsky on AI progress

Thanks, this is helpful! I'd be very curious to see where Paul agrees or disagrees with the summary and implications of his view here.

Christiano, Cotra, and Yudkowsky on AI progress

After reading these two Eliezer <> Paul discussions, I realize I'm confused about what the importance of their disagreement is.

It's very clear to me why Richard & Eliezer's disagreement is important. Alignment being extremely hard suggests AI companies should work a lot harder to avoid accidentally destroying the world, and suggests alignment researchers should be wary of easy-seeming alignment approaches.

But it seems like Paul & Eliezer basically agree about all of that. They disagree about... what the world looks like shortly before the end? Which, sure, does have some strategic implications. You might be able to make a ton of money by betting on AI companies and thus have a lot of power in the few years before the world drastically changes. That does seem important, but it doesn't seem nearly as important as the difficulty of alignment.

I wonder if there are other things Paul & Eliezer disagree about that are more important. Or if I'm underrating the importance of the ways they disagree here. Paul wants Eliezer to bet on things so Paul can have a chance to update to his view in the future if things end up being really different than he thinks. Okay, but what will he do differently in those worlds? Imo he'd just be doing the same things he's trying now if Eliezer were right. And maybe there is something implicit in Paul's "smooth line" forecasting beliefs that makes his prosaic alignment strategy more likely to work in worlds where he's right, but I currently don't see it.

EA Hangout Prisoners' Dilemma

Another way to run this would be to have a period of time before launches are possible for people to negotiate, and then to not allow retracting nukes after that point. And next time I would make the total payout when no one nukes greater than the total when only one side nukes. That said, I did like that this time people had the option of a creative solution that "nuked" a side but led to higher EV for both parties than not nuking.
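The payout constraint I'd want for next time can be stated with made-up numbers (none of these are the actual game's payouts):

```python
# Hypothetical payouts, chosen only to illustrate the design constraint:
# the combined total when no one nukes should exceed the combined total
# when one side nukes.
NO_NUKES = (90, 90)         # (side A, side B) payout if neither launches
ONE_SIDE_NUKES = (100, 60)  # payout if A nukes B

# Under this constraint, full cooperation maximizes the total pot,
# even though the nuking side individually gets more (100 > 90).
assert sum(NO_NUKES) > sum(ONE_SIDE_NUKES)
```

With numbers like these, a one-sided nuke is still individually tempting, but it burns value overall, which is the incentive structure I'd want the game to have.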

EA Hangout Prisoners' Dilemma

I think fungibility is a good point, but the randomizer solution seems strictly better than this. Otherwise one side clearly gets less value, even if they are better off than they would have been had the game not happened. It's still a mixed-motive conflict!
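To illustrate why the randomizer evens things out, here's a sketch with hypothetical payoffs (not the game's actual numbers): whichever side gets "nuked" receives 60 while the other side receives 100.

```python
import random

# Hypothetical payoffs to (A, B) depending on which side gets "nuked".
PAYOFFS = {"A_nuked": (60, 100), "B_nuked": (100, 60)}

def coin_flip_ev(trials: int = 100_000) -> tuple[float, float]:
    """Average payoff to (A, B) when a fair coin decides who gets 'nuked'."""
    total_a = total_b = 0
    for _ in range(trials):
        a, b = PAYOFFS[random.choice(["A_nuked", "B_nuked"])]
        total_a += a
        total_b += b
    return total_a / trials, total_b / trials

ev_a, ev_b = coin_flip_ev()
# Both EVs sit near 80, the midpoint, so neither side is locked into the
# lower payoff the way they are under a fixed one-sided outcome.
```

The total surplus is the same either way; the coin flip just stops one side from predictably getting the short end.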

EA Hangout Prisoners' Dilemma

I'm not sure that anyone exercised restraint in not responding to the last attack, since I don't have any evidence that anyone saw it. It's quite possible people did see it and chose not to respond, but I have no way to know that.

EA Hangout Prisoners' Dilemma

Oh, I should have specified that I would consider the coin flip a cooperative solution! It seems obviously better to me than any other solution.

EA Hangout Prisoners' Dilemma

I think there are a lot of dynamics present here that aren't present in the classic prisoner's dilemma, though some of them do appear in various iterated prisoner's dilemmas. The prize might be different for different actors, since actors place different value on "cooperative" outcomes. And if you can trust people's precommitments, I think there is a race to commit or precommit to an action.

E.g. if I wanted the game to settle with no nukes launched, I could precommit to launching a retaliatory strike against whichever side launched an attack.
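A toy model of that deterrence logic, with hypothetical numbers (the gains and costs here are invented for illustration): nuking gains the attacker 40, but a retaliatory strike costs the attacker 100.

```python
# Invented numbers: the point is only the sign of the attacker's net payoff.
NUKE_GAIN = 40
RETALIATION_COST = 100

def attacker_net_payoff(defender_precommitted: bool) -> int:
    """Attacker's net gain from launching, given the defender's precommitment."""
    return NUKE_GAIN - (RETALIATION_COST if defender_precommitted else 0)

# With a credible precommitment, launching is a losing move (-60), so a
# payoff-motivated attacker stands down; without it, launching nets +40.
```

This is why the precommitment race matters: whoever credibly commits first reshapes everyone else's incentives.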

Petrov Day 2021: Mutually Assured Destruction?

I sort of disagree. Not necessarily that it was the wrong choice to invest your security resources elsewhere--I think your threat model is approximately correct--but I disagree that it's wrong to invest in that part of your stack.

My argument here is that following best practices is a good principle, that you can and should make exceptions sometimes, but that Zack is right to point this out as a vulnerability. Security best practices exist to help you reduce attack surface without having to be aware of every attack vector. You might look at this instance and rightly think "okay, but SHA-256 is very hard to crack with keys this long, and there is lower-hanging fruit." But sometimes you're going to make a wrong assumption when evaluating things like this, and best practices help protect you from the limitations of your ability to model them. Maybe your SHA-256 implementation has a known vulnerability you didn't check for, maybe your secret generation wasn't actually random, etc. I don't think any of these apply in this case, but sometimes you're likely to be mistaken about your assumptions. The question then becomes a more general one about when it makes sense to follow best practices and when it makes sense to make exceptions. In this case I'd think an hour of work would be worth it to follow the best practice; if it were more like five hours, I might agree with you.
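As a concrete illustration of the "secret generation wasn't actually random" failure mode, here is one way the best practice looks in Python (a sketch, not the site's actual implementation; the names are mine):

```python
import hashlib
import secrets

# Use a CSPRNG (the secrets module) rather than the non-cryptographic random
# module, so "the secret wasn't actually random" is one less assumption to audit.
launch_code = secrets.token_hex(32)  # 256 bits of entropy, as 64 hex chars

# Publish only the digest; the code itself stays private.
digest = hashlib.sha256(launch_code.encode()).hexdigest()

def is_valid(submitted: str) -> bool:
    """Check a submitted code against the published digest, not the secret."""
    return hashlib.sha256(submitted.encode()).hexdigest() == digest
```

The point isn't that this particular step was the weak link here; it's that each step done the standard way is one fewer assumption you have to get right.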

I wouldn't discount the signaling value either. Allies and attackers can notice whether you're following best practices and use that info to decide whether they should invest resources in trusting your infrastructure or attacking it.

Comment on the lab leak hypothesis

I don't think it's super clear, but I do think it's the clearest evidence we are likely to get that's more than 10% likely to materialize. I disagree that SARS could have taken 15 years, or at least I think that one could have been called within a year or two. My previous attempt to operationalize a bet had the bet resolve if, within two years, a mutually agreed-upon third party updated to believe that there is >90% probability that an identified intermediate host or bat species was the origin point of the pandemic, and that this was not a lab escape.

Now that I'm writing this out, I think within two years of SARS I wouldn't have been >90% civet-->human origin. I'd guess I would have been 70-80% on civet-->human. But I'm currently <5% on any specific intermediate host for SARS-CoV-2, so something like the civet finding would greatly increase my odds that SARS-CoV-2 is a natural spillover. 

Comment on the lab leak hypothesis

It seems like an interesting hypothesis, but I don't think it's particularly likely. I've never heard of other viruses becoming well adapted to humans within a single host. Though I do think that's the explanation for how several variants evolved (since some of them emerged with a bunch of functional mutations rather than just one or two). I'd be interested to see more research into the evolution of viruses within human hosts, what degree of change is possible, and how this relates to spillover events.
