My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha on Telegram).
Humanity's future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
I have takes on technical AI notkilleveryoneism that seem to me to be only the very obvious, shallow stuff; but many AI Safety researchers have told me our conversations improved their understanding of the alignment problem.
I'm running two small nonprofits: AI Governance and Safety Institute and AI Safety and Governance Fund. Learn more about our results and donate: aisgf.us/fundraising
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, which comes to 63k physical books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. The impact and effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn't achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun (except for being detained by the police). I think it's likely the Russian authorities would imprison me if I ever returned to Russia.]
If you’re considering donating, this might be important context:
Relatedly, here’s Sam Altman at Lighthaven:
Lighthaven, as people took from your comments on the EA Forum, wants to be an impartial event venue. I’m not sure you all want to give money to an impartial event venue, just like you probably wouldn’t want to give money to a random hotel that happens to be a good conference venue (at least, you wouldn’t give from your utilons budget; you might give them money to purchase fuzzies, but be clear with yourself when that is what you’re doing).
(I also think that people shouldn’t give money to the team if they’re enjoying and getting value out of LessWrong, as the value created by LessWrong should largely be attributed to the people who write posts and to the community, not to the team; and that the cost of maintaining the website could be much lower than what the team is spending.)
Perhaps you’re right; I would love for that to be the case, and to have been wrong about all this. But this model (that it’s a “there exists” quantifier) is very surprised by a bunch of things, from “lol, no, […]” to “I might use it that way. Like, I might tell someone who is worried about [third party] that they are planning to move into the space if it seems relevant. Or I might myself come to realize it's important and then actively tell people to maybe do something about it.”
And, like, he didn’t give any examples of when he would not use the information.
His position was pretty clear to me: he thought that the fact that the third party is moving into that space is bad, and that if there were a way to use the information to prevent them from doing it, he would do so (but he didn’t see any ways of doing that and didn’t find it very important overall).
Like, there’s nothing in the messages to suggest otherwise.
He didn’t give an isolated example of when he’d want to share information for different reasons, where it would have a side-effect of hurting the interests of the third party. Instead, it was an example where the reason to share information was specifically that it would lead to hurting the interests of the third party.
He did call the information “strategically relevant”. He did say that he would continue to share the information basically at his sole discretion. He did say he might use it if he realizes it’s strategically important.
I really don’t have a coherent model of an alternative explanation you’re trying to point at.
(If you or someone else is available for that, I would love to jump on a call with someone who has a good model of Oliver and can explain to me the alternative explanation for what generated the messages.)
“Oliver is not a good counterparty” is my judgment of him and his character based on the interaction that we had. How is it a “deceptive attack”?
I did replace it with “Oliver Habryka is a counterparty I regret having; he doesn't follow planecrash!lawful neutral/good norms; be careful when talking to him” to communicate information more directly, but I do think that if you regret having someone as a counterparty, they’re not a good counterparty!
Due to concerns with the validity of
Anthropic wants to stay near the front of the pack at AI capabilities so that their empirical research is relevant, but not at the actual front of the pack to avoid accelerating race-dynamics.
— From an Anthropic employee in a private conversation, early 2023
I decided to remove it from section 0 of the post. (At first, I temporarily added “(approximate recollection)” at the end while checking with Raemon on the details, but decided to delete this entirely once I got the reply.)
I apologize to readers for having had it in the post.
Thanks to @DanielFilan for flagging it and to Raemon for a quick response on the details and the clarification.
Yep, thanks for flagging. That was not intentional. After checking with Raemon, I removed this entirely.
I sent Mikhail the following via DM, in response to his request for "any particular parts of the post [that] unfairly attack Anthropic":
I think that the entire post is optimized to attack Anthropic, in a way where it's very hard to distinguish between evidence you have, things you're inferring, standards you're implicitly holding them to, standards you're explicitly holding them to, etc.
I asked you for any particular example; you replied that “the entire post is optimized in a way where it’s hard to distinguish…”. Could you please give a particular example of where it’s hard to distinguish between evidence that I have and things I’m inferring?
I agree that these are not the two worlds that would be helpful to consider, and your list of reasons is closer to my model than Lucie’s representation of my model.
(I do hope that my post somewhat decreases trust in Jack Clark and Dario Amodei and somewhat increases the incentives for the kind of governance that would not be dependent on trustworthy leadership to work.)
(I do not endorse any of this, except for the last two sentences, though those are not a comprehensive bottom line. The comment is wrong about my points, my view, what I know, my model of the world, the specific hypotheses I’d want people to consider, etc.
If you think there is an important point to make, I’d appreciate it if you could make it without attributing it to others.)
You, in 2024: "I would be surprised if we never end up hosting events for scaling lab teams at Lighthaven. If they pay us a fair price (and maybe somewhat more as a tax), I will rent to them."
I would give you the points here if you acknowledged a change in policy instead of pretending that you always said you'd charge "(potentially enormous) premiums" that would depend on the externalities and would not be cheap, because a year ago you said "a fair price (and maybe somewhat more as a tax)".
It's not really clear to me to what extent you didn't communicate your policy well back then, or changed it on your own in the meantime (what caused it?), or changed it because of the pushback, or what.
If you're claiming your policy was always to charge AI companies a lot more, I'd appreciate any source where you stated it prior to the Progress conference.
The issue I'm concerned with is that your policy, whatever it is now and however you would describe it, is not really immediately visible to people who would want to know whether (to the extent it's the case) they're subsidizing an event venue that sometimes provides value to events aligned with your declared values, but sometimes, so as not to stay unused, impartially rents itself out at a fair price to anyone, including AI companies holding events that will bring the destruction of the world closer.
Like, there's the world in which Lighthaven is basically a separate for-profit entity that rents out the venue to anyone, so as not to stay unused, and maybe otherwise supports events the community/your values largely like, or just spends the profits to support Lightcone Infrastructure, and maybe has a preference for more aligned events when there's a choice, but otherwise follows the "standard practice for practical[ly] all hotels and event" venues; and there's a world in which Lighthaven is a fully nonprofit event venue that does rent itself out to random events to not stay unused and support its operations, but heavily selects events for not causing harm, and would rather stay empty than help accelerate AI capabilities progress.
I would want you to specify, and your potential donors to know, where exactly on this scale Lighthaven falls.
Because people might have reasons to like Asana, or the pre-collapse, according-to-public-knowledge FTX, but they might not necessarily want to donate to them.
(I'm not sure how what I'm saying is already clarified by link #2, and I don't think link #4 is relevant to this specific question?)
I think that your policies regarding providing value to Sam Altman should be transparent to your (potential) donors; I think hosting Manifest, which invites Hanania, is consistent with the image you're presenting to your donors, and passes the onion test, regardless of my views on how okay it is to platform Hanania.