My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha on Telegram).
Humanity's future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
My research is currently focused on AI governance and on improving stakeholders' understanding of AI and AI risks. On the technical side of AI notkilleveryoneism, my takes are mostly on what seems to me like very obvious, shallow stuff; still, many AI safety researchers have told me our conversations improved their understanding of the alignment problem.
I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, i.e., 63k books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. Impact and effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied the relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn't achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun, except for being detained by the police. I think it's likely the Russian authorities would imprison me if I ever returned to Russia.]
What's the context of the ads: will they be used before the release or after it?
Nope. I’m somewhat concerned about unethical uses (e.g., talking to a lot of people without disclosing it’s AI), so I won’t publicly share the context.
If the chatbot answers questions well enough, we could in principle embed it into whatever you want, if that seems useful. We currently have a couple of requests like that. DM me somewhere?
Stampy uses RAG & is worse.
This specific page isn't really optimized for any use by anyone whatsoever; there are maybe five bugs, each solvable with one query to Claude, and none of them a priority. The cool thing I want people to look at is the chatbot (when you give it some plausible context)!
(Also, non-personalized intros to why you should care about ai safety are still better done by people.)
I really wouldn't want to give a random member of the US general public a thing that advocates for AI risk while having a gender drop-down like that.[1]
The kinds of interfaces it would have if we get to scale it[2] would be very dependent on where specific people are coming from. I.e., demographic info can be pre-filled and not necessarily displayed if it's from ads; or maybe we ask one person we're talking to to share it with two other people, and generate unique links with pre-filled info that was provided by the first person; etc.
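The pre-filled-link idea can be sketched in a few lines. This is a minimal illustration, not the site's actual interface: the query parameter names (`bg`, `ref`, `t`) are my own assumptions.

```python
from typing import Optional
from urllib.parse import urlencode
import secrets

BASE = "https://whycare.aisgf.us"  # the chatbot from the post

def prefilled_link(background: str, referrer_id: Optional[str] = None) -> str:
    """Build a unique shareable link with pre-filled demographic context.

    The parameter names (bg, ref, t) are illustrative assumptions,
    not the site's real API.
    """
    token = secrets.token_urlsafe(8)  # makes each generated link unique
    params = {"bg": background, "t": token}
    if referrer_id:
        params["ref"] = referrer_id  # who shared the link (e.g., "ask two friends")
    return f"{BASE}/?{urlencode(params)}"

link = prefilled_link("retired teacher, skeptical of AI hype", referrer_id="user123")
print(link)
```

The same function covers both cases mentioned above: ads can pass a `background` inferred from targeting, and a person sharing with friends can pass their own `referrer_id` so the context they provided travels with the link.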
Voice mode would have a huge latency due to the 200k token context and thinking prior to responding.
Non-binary people are people, but the dropdown creates an unnecessary negative halo effect for a significant portion of the general public.
Also, dropdowns = unnecessary clicks = bad.
Which I really want to! Someone please give us the budget and volunteers!
At the moment, we have only me working full-time (for free), $10k from SFF, and ~$15k from EAs who considered this the most effective nonprofit in this field.
Reach out if you want to donate your time or money. (Donations are tax-deductible in the US.)
Thanks! But, uhm, try not to specify “your mom” as the background and “what the actual fuck is ai alignment” as your question if you want the writing style to not be full of “we’re toast”.
Another example:
What's corrigibility? (asked by an AI safety researcher)
It’s better than Stampy (try asking both some interesting questions!). Stampy is cheaper to run, though.
I wasn’t able to get LLMs to produce valid arguments or answer questions correctly without the context, though that could be a scaffolding/skill issue on my part.
Thanks! I think we’re close to a point where I’d want to put this in front of a lot of people, though we don’t have the budget for this (which seems ridiculous, given the stats we have for our ads results etc.), and also haven’t yet optimized the interface (as in, half the US public won’t like the gender dropdown).
Also, it’s much better at conversations than at producing 5-minute elevator pitches. (It’s hard to make it meet the user where they are while still getting to the point instead of being very sycophantic.)
The end goal is to be able to explain the current situation to people at scale.
Sure! Mostly, it's just that a lot of the things that correlate with specific qualia in humans don't provide any evidence about qualia in other animals. Reinforcement learning (behavior that seeks the things that, when encountered, update the brain to seek more of them, and avoids the things that update the brain to avoid them) doesn't mean there are any circuits in the animal's brain for experiencing these updates from the inside, as qualia, the way humans do when we suffer. If I train a very simple RL agent with the feedback that salmon get via the mechanisms that produce pain in humans, the RL agent will learn to demonstrate the salmon's behavior, while we can be very confident there are no qualia in that RL agent. Basically, almost all of the evidence Rethink and others present is of the kind that RL agents would also exhibit; it doesn't add anything on top of "it's a brain of that size that can do RL and has this evolutionary history".
The reason we know other humans have qualia circuits in their brains is that these circuits have outputs that make humans talk about qualia even if they haven't heard others talk about qualia (it would've been very surprising if that happened randomly).
We don't have anything remotely close to that for any non-human animals.
For many things, we can assume that something like what led to humans having qualia was present in the thing's evolutionary history; or we have tests (such as a correctly run mirror test) that likely correlate with the kinds of things that lead to qualia. But among all the fish species we've done these experiments on, very few have any social dynamics of the kind that might correlate with qualia, or can remotely pass anything like a mirror test, and salmon are not among those species.
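The simple-RL-agent point above can be made concrete with a toy sketch (everything here is illustrative; it is not a model of any real animal): a tabular Q-learning agent given "pain" feedback learns robust avoidance behavior, and nobody would attribute qualia to it.

```python
import random

# Toy Q-learning agent: it learns to avoid a "noxious" state purely via
# reward updates. It exhibits pain-avoidance behavior, yet there is
# clearly nothing in it that experiences the updates from the inside.

N_STATES = 5          # simple 1-D world; state 4 is the "noxious" zone
ACTIONS = [0, 1]      # 0 = retreat, 1 = approach the noxious zone

def step(state, action):
    """Return (reward, next_state); entering state 4 delivers 'pain'."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return (-1.0 if nxt == N_STATES - 1 else 0.0), nxt

def train(episodes=2000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(N_STATES - 1)
        for _ in range(10):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            r, nxt = step(s, a)
            # standard Q-learning update
            q[(s, a)] += alpha * (r + gamma * max(q[(nxt, b)] for b in ACTIONS) - q[(s, a)])
            s = nxt
    return q

q = train()
# Next to the noxious zone, the learned values strongly favor retreating.
print(q[(3, 0)], q[(3, 1)])
```

The agent ends up "avoiding pain" just as reliably as behavioral experiments would show in a fish, which is why that class of evidence adds nothing beyond "it's a brain that can do RL".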
i made a thing!
it is a chatbot with 200k tokens of context about AI safety. it is surprisingly good (better than you'd expect current LLMs to be) at answering questions and counterarguments about AI safety. A third of its dialogues contain genuinely great and valid arguments.
You can try the chatbot at https://whycare.aisgf.us (ignore the interface; it hasn't been optimized yet). Please ask it some hard questions! Especially if you're not convinced of AI x-risk yourself, or can repeat the kinds of questions others ask you.
Send feedback to ms@contact.ms.
A couple of examples of conversations with users:
I know AI will make jobs obsolete. I've read runaway scenarios, but I lack a coherent model of what makes us go from "llms answer our prompts in harmless ways" to "they rebel and annihilate humanity".