My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram).
Humanity's future can be huge and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
My research is currently focused on AI governance and improving the understanding of AI and AI risks among stakeholders. I also have takes on what seems to me to be the very obvious shallow stuff about the technical AI notkilleveryoneism; but many AI Safety researchers told me our conversations improved their understanding of the alignment problem.
I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I've launched the most funded crowdfunding campaign in the history of Russia (it was to print HPMOR! we printed 21 000 copies =63k books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps to find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year, I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200 000 members in Russia at the time, trying to increase separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. This goal wasn't achieved, but some of the more local campaigns were successful. That felt important and was also mostly fun- except for being detained by the police. I think it's likely the Russian authorities will imprison me if I ever visit Russia.]
we have a verbal agreement that these materials will not be used in model training
Get that agreement in writing.
I am happy to bet 1:1 OpenAI will refuse to make an agreement in writing to not use the problems/the answers for training.
You have done work that contributes to AI capabilities, and you have misled mathematicians who contributed to that work about its nature.
I’m confused. Are you perhaps missing some context/haven’t read the post?
Tl;dr: We have emails of 1500 unusually cool people who have copies of HPMOR (and other books) because we’ve physically sent these copies to them because they’ve filled out a form saying they want a copy.
Spam is bad (though I wouldn’t classify it as defection against other groups). People have literally given us email and physical addresses to receive stuff from us, including physical books. They’re free to unsubscribe at any point.
I certainly prefer a world where groups that try to improve the world are allowed to make the case why helping them improve the world is a good idea to people who have filled out a form to receive some stuff from them and are vaguely ok with receiving more stuff. I do not understand why that would be defection.
huh?
I would want people who might meaningfully contribute to solving what's probably the most important problem humanity has ever faced to learn about it and, if they judge they want to work on it, to be enabled to work on it. I think it'd be a good use of resources to make capable people learn about the problem and show them they can help with it. Why does it scream "cult tactic" to you?
As AIs become super-human there’s a risk we do increasingly reward them for tricking us into thinking they’ve done a better job than they have
(some quick thoughts.) This is not where the risk stems from.
The risk is that as AIs become superhuman, they'll produce behaviour that gets a high reward regardless of their goals, for instrumental reasons. In training and until it has a chance to take over, a smart enough AI will be maximally nice to you, even if it's Clippy; and so training won't distinguish between the goals of very capable AI systems. All of them will instrumentally achieve a high reward.
In other words, gradient descent will optimize for capably outputting behavior that gets rewarded; it doesn't care about the goals that give rise to that behavior. Furthermore, in training, while AI systems are not coherent enough agents, their fuzzy optimization targets are not indicative of optimization targets of a fully trained coherent agent (1, 2).
My view- and I expect it to be the view of many in the field- is that if AI is capable enoguh to take over, its goals are likely to be random and not aligned with ours. (There isn't a literally zero chance of the goals being aligned, but it's fairly small, smaller than just random because there's a bias towards shorter representation; I won't argue for that here, though, and will just note that the goals exactly opposite of aligned are approximately as likely as aligned goals).
It won't be a noticeable update on its goals if AI takes over: I already expect them to be almost certainly misaligned, and also, I don't expect the chance of a goal-directed aligned AI taking over to be that much lower.
The crux here is not that update but how easy alignment is. As Evan noted, if we live in one of the alignment-is-easy worlds, sure, if a (probably nice) AI takes over, this is much better than if a (probably not nice) human takes over. But if we live in one of the alignment-is-hard worlds, AI taking over just means that yep, AI companies continued the race for more capable AI systems, got one that was capable enough to take over, and it took over. Their misalignment and the death of all humans isn't an update from AI taking over; it's an update from the kind of world we live in.
(We already have empirical evidence that suggests this world is unlikely to be an alignment-is-easy one, as, e.g., current AI systems already exhibit what believers in alignment-is-hard have been predicting for goal-directed systems: they try to output behavior that gets high reward regardless of alignment between their goals and the reward function.)
Probably less efficient than other uses and is in the direction of spamming people with these books. If they’re everywhere, I might be less interested if someone offers to give them to me because I won a math competition.
It would be cool if someone organized that sort of thing (probably sending books to the cash prize winners, too).
For people who’ve reached the finals of the national olympiad in cybersecurity, but didn’t win, a volunteer has made a small CTF puzzle and sent the books to students who were able to solve it.
I’m not aware of one.
Some of these schools should have the book in their libraries. There are also risks with some of them, as the current leadership installed by the gov might get triggered if they open and read the books (even though they probably won’t).
It’s also better to give the books directly to students, because then we get to have their contact details.
I’m not sure how many of the kids studying there know the book exists, but the percentage should be fairly high at this point.
Do you think the books being in local libraries increases how open people are to the ideas? My intuition is that the quotes on гпмрм.рф/olymp should do a lot more in that direction. Do you have a sense that it wouldn’t be perceived as an average fantasy-with-science book?
We’re currently giving out the books to participants of summer conference of the maths cities tournament — do you think it might be valuable to add cities tournament winners to the list? Are there many people who would qualify, but didn’t otherwise win a prize in the national math olympiad?
We also have 6k more copies (18k hard-cover books) left. We have no idea what to do with them. Suggestions are welcome.
Here's a map of Russian libraries that requested copies of HPMOR, and we've sent 2126 copies to:
Sending HPMOR to random libraries is cool, but I hope someone comes up with better ways of spending the books.
I think travel and accommodation for the winners of regional olympiads to the national one is provided by the olympiad organizers.