Mikhail Samin

My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha on Telegram). 

Humanity's future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.

My research is currently focused on AI governance and improving the understanding of AI and AI risks among stakeholders. My takes on technical AI notkilleveryoneism mostly concern what seems to me to be the very obvious, shallow stuff; still, many AI safety researchers have told me our conversations improved their understanding of the alignment problem.

I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.

I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).

In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies = 63,000 books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.

[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn't achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun, except for being detained by the police. I think it's likely the Russian authorities would imprison me if I ever visited Russia.]

Posts

Wikitag Contributions

Comments
6 · Mikhail Samin's Shortform · 2y · 184
If Anyone Builds It, Everyone Dies: Advertisement design competition
Mikhail Samin · 3d · 51

some ideas for inspiration

If Anyone Builds It, Everyone Dies: Advertisement design competition
Mikhail Samin · 4d · 40

What's the context of the ads: will they be used after the release or prior to it?

Mikhail Samin's Shortform
Mikhail Samin · 5d · 20

Nope, I’m somewhat concerned about unethical uses (e.g., talking to a lot of people without disclosing it’s AI), so I won’t publicly share the context.

If the chatbot answers questions well enough, we could in principle embed it into whatever you want if that seems useful. Currently have a couple of requests like that. DM me somewhere?

Stampy uses RAG & is worse.

Mikhail Samin's Shortform
Mikhail Samin · 5d · 20

This specific page is not really optimized for any use by anyone whatsoever; there are maybe five bugs, each solvable with one query to Claude, and none of them a priority. The cool thing I want people to look at is the chatbot (when you give it some plausible context)!

(Also, non-personalized intros to why you should care about ai safety are still better done by people.)

I really wouldn't want to give a random member of the US general public a thing that advocates for AI risk while having a gender drop-down like that.[1]

The kinds of interfaces it would have if we get to scale it[2] would be very dependent on where specific people are coming from. I.e., demographic info can be pre-filled and not necessarily displayed if it's from ads; or maybe we ask one person we're talking to to share it with two other people, and generate unique links with pre-filled info that was provided by the first person; etc.

Voice mode would have a huge latency due to the 200k token context and thinking prior to responding.

  1. ^

    Non-binary people are people, but the dropdown creates an unnecessary negative halo effect for a significant portion of the general public.

    Also, dropdowns = unnecessary clicks = bad.

  2. ^

    which I really want to! someone please give us the budget and volunteers!

    at the moment, we have only me working full-time (for free), $10k from SFF, and ~$15k from EAs who considered this to be the most effective nonprofit in this field.

    reach out if you want to donate your time or money. (donations are tax-deductible in the us.)

Mikhail Samin's Shortform
Mikhail Samin · 5d · 24

Thanks, but, uhm, try not to specify “your mom” as the background and “what the actual fuck is ai alignment” as your question if you want it to have a writing style that’s not full of “we’re toast”.

Mikhail Samin's Shortform
Mikhail Samin · 6d · 20

Another example:

What's corrigibility? (asked by an AI safety researcher)

Mikhail Samin's Shortform
Mikhail Samin · 6d · 20

It’s better than Stampy (try asking both some interesting questions!). Stampy is cheaper to run, though.

I wasn’t able to get LLMs to produce valid arguments or answer questions correctly without the context, though that could be scaffolding/skill issue on my part.

Mikhail Samin's Shortform
Mikhail Samin · 6d · 41

Thanks! I think we’re close to a point where I’d want to put this in front of a lot of people, though we don’t have the budget for this (which seems ridiculous, given the stats we have for our ads results etc.), and also haven’t yet optimized the interface (as in, half the US public won’t like the gender dropdown).

Also, it’s much better at conversations than at producing 5-minute elevator pitches. (It’s hard to make it meet the user where they are while still getting to the point, instead of being very sycophantic.)

The end goal is to be able to explain the current situation to people at scale.

Don't Eat Honey
Mikhail Samin · 6d · 1 · -6

Sure! Mostly, it's just that a lot of stuff that correlates with specific qualia in humans doesn't provide any evidence about qualia in other animals. Reinforcement learning (behavior that seeks the things that, when encountered, update the brain to seek more of them, and avoids the things that update the brain to avoid them) doesn't mean that there are any circuits in the animal's brain for experiencing these updates from the inside, as qualia, the way humans do when we suffer. If I train a very simple RL agent on the feedback that salmon get via mechanisms that produce pain in humans, the RL agent will learn to demonstrate salmon's behavior, while we can be very confident there's no qualia in that RL agent. Basically, almost all of the evidence Rethink and others present is of a kind that simple RL agents would also exhibit, and it doesn't add anything on top of "it's a brain of that size that can do RL and has this evolutionary history".
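To make the "very simple RL agent" point concrete, here is a minimal sketch of my own (a hypothetical toy example, not anything from the actual analyses being discussed): a tabular agent in a one-state environment learns to avoid an action that delivers a negative "pain" reward, using nothing but numeric value updates.

```python
import random

# Toy illustration: a tabular RL agent in a one-state environment where
# action 1 delivers a negative "pain" signal. The agent learns avoidance
# behavior purely from arithmetic updates to its value table.

def reward(action: int) -> float:
    return -1.0 if action == 1 else 0.0  # action 1 is "painful"

q = [0.0, 0.0]           # value estimates for actions 0 and 1
alpha, epsilon = 0.5, 0.1  # learning rate and exploration rate

random.seed(0)
for _ in range(200):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        a = random.randrange(2)
    else:
        a = max(range(2), key=lambda i: q[i])
    # single-step value update toward the observed reward
    q[a] += alpha * (reward(a) - q[a])

# q[1] has been pushed negative, q[0] stays at 0: learned avoidance
print(q)
```

The agent ends up reliably preferring the non-painful action, which is exactly the kind of behavioral evidence at issue; and there is clearly nothing in this loop that experiences the updates from the inside.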

The reason we know other humans have qualia circuits in their brains is that these circuits have outputs that make humans talk about qualia even if they've never heard others talk about qualia (it would've been very surprising if that had happened randomly).

We don't have anything remotely close to that for any non-human animals.

For many things, we can assume that something like what led to humans having qualia was present in the thing's evolutionary history, or we have tests (such as a correctly administered mirror test) that likely correlate with the kinds of things that lead to qualia; but among all the fish species we've run these experiments on, very few have any social dynamics of the kind that would maybe correlate with qualia or can remotely pass anything like a mirror test, and salmon is not among those species.

Mikhail Samin's Shortform
Mikhail Samin · 6d · 364

i made a thing!

it is a chatbot with 200k tokens of context about AI safety. it is surprisingly good (better than you'd expect current LLMs to be) at answering questions and counterarguments about AI safety. A third of its dialogues contain genuinely great and valid arguments.

You can try the chatbot at https://whycare.aisgf.us (ignore the interface; it hasn't been optimized yet). Please ask it some hard questions! Especially if you're not convinced of AI x-risk yourself, or can repeat the kinds of questions others ask you.

Send feedback to ms@contact.ms.

A couple of examples of conversations with users:

I know AI will make jobs obsolete. I've read runaway scenarios, but I lack a coherent model of what makes us go from "llms answer our prompts in harmless ways" to "they rebel and annihilate humanity".

24 · No, Futarchy Doesn’t Have an EDT Flaw · 11d · 22
6 · Superintelligence's goals are likely to be random · 4mo · 6
80 · No one has the ball on 1500 Russian olympiad winners who've received HPMOR · 6mo · 21
67 · How to Give in to Threats (without incentivizing them) · 10mo · 31
11 · Can agents coordinate on randomness without outside sources? [Question] · 1y · 16
80 · Claude 3 claims it's conscious, doesn't want to die or be modified · 1y · 118
33 · FTX expects to return all customer money; clawbacks may go away · 1y · 1
24 · An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans · 1y · 1
42 · NYT is suing OpenAI & Microsoft for alleged copyright infringement; some quick thoughts · 2y · 17
Decision theory · 4mo · (+142)
Functional Decision Theory · 4mo · (+242)
Translations Into Other Languages · 2y · (+84/-60)