Mikhail Samin

My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram). 

Humanity's future can be huge and bright; losing it is our lightcone (and maybe the universe) losing most of its potential value.

My research is currently focused on AI governance and improving the understanding of AI and AI risks among stakeholders. I have takes on what seems to me to be the very obvious shallow stuff about the technical AI notkilleveryoneism; but many AI Safety researchers told me our conversations improved their understanding of the alignment problem.

I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.

I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).

In the past, I've launched the most funded crowdfunding campaign in the history of Russia (it was to print HPMOR! we printed 21 000 copies =63k books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.

[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps to find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year, I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200 000 members in Russia at the time, trying to increase separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. This goal wasn't achieved, but some of the more local campaigns were successful. That felt important and was also mostly fun- except for being detained by the police. I think it's likely the Russian authorities will imprison me if I ever visit Russia.]

Wiki Contributions

Comments

Sorted by

A smart agent can simply make decisions like a negotiator with restrictions on the kinds of terms it can accept, without having to spawn a "boulder" to do that.

You can just do the correct thing, without having to separate yourself into parts that do things correctly and a part that tries to not look at the world and spawns correct-thing-doers.

In Parfit's Hitchhiker, you can just pay once you're there, without precommiting/rewriting yourself into an agent that pays. You can just do the thing that wins.

Some agents can't do the things that win and would have to rewrite themselves into something better and still lose in some problems, but you can be an agent that wins, and gradient descent probably crystallizes something that wins into what is making the decisions in smart enough things.

Yep! If someone is doing things because it's in their best interests and not to make you do something (and they're not a result of someone else shaping themselves into them to cause you do something, whereas some previous agent wouldn't actually prefer the thing the new one prefers, that you don't want to happen), then this is not a threat.

By a stone, I meant a player with very deterministic behavior in a game with known payoffs, named this way after the idea of cooperate-stones in prisoner’s dilemma (with known payoffs).

I think to the extent there’s no relationship between giving in to a boulder/implemeting some particular decision theory and having this and other boulders thrown at you, UDT and FDT by default swerve (and probably don't consider the boulders to be threatening them, and it’s not very clear in what sense this is “giving in”); to the extent it sends more boulders their way, they don’t swerve.

If making decisions some way incentivizes other agents to become less like LDTs and more like uncooperative boulders, you can simply not make decisions that way. (If some agents actually have an ability to turn into animals and you can’t distinguish the causes behind an animal running at you, you can sometimes probabilistically take out your anti-animal gun and put them to sleep.)

Do you maybe have a realistic example where this would realistically be a problem?

I’d be moderately surprised if UDT/FDT consider something to be a better policy than what’s described in the post.

Edit: to add, LDTs don't swerve to boulders that were created to influence the LDT agent's responses. If you turn into a boulder because you expect some agents among all possible agents to swerve, this is a threat, and LDTs don't give in to those boulders (and it doesn't matter whether or not you tried to predict the behavior of LDTs in particular). If you believed LDT agents or agents in general would swerve against a boulder, and that made you become a boulder, LDT agents obviously don't swerve to that boulder. They might swerve to boulders that are actually natural boulders caused by the very simple physics no one influenced to cause the agents to do something. They also pay their rent- because they'd be evicted otherwise, not for the reason of getting rent from them under the threat of eviction but for the reason of getting rent from someone else, and they're sure there were no self-modifications to make it look this way.

(It is pretty important to very transparently respond with a nuclear strike to a nuclear strike. I think both Russia and the US are not really unpredictable in this question. But yeah, if you have nuclear weapons and your opponents don't, you might want to be unpredictable, so your opponent is more scared of using conventional weapons to destroy you. In real-life cases with potentially dumb agents, it might make sense to do this.)

Your solution works! It's not exploitable, and you get much more than 0 in expectation! Congrats!

Eliezer's solution is better/optimal in the sense that it accepts with the highest probability a strategy can use without becoming exploitable. If offered 4/10, you accept with p=40%; the optimal solution accepts with p=83% (or slightly less than 5/6); if offered 1/10, it's p=10% vs. p=55%. The other player's payout is still maximized at 5, but everyone gets the payout a lot more often!

It's not how the game would be played between dath ilan and true aliens

This is a reference to "Sometimes they accept your offer and then toss a jellychip back to you". Between dath ilan and true aliens, you do the same except for tossing the jellychip when you think you got more than what would've been fair. See True Prisoner's Dilemma.

I guess when criminals and booing bystanders are not as educated as dath ilani children, some real-world situations might get complicated. Possibly, transparent stats about the actions you've taken in similar situations might serve the same purpose even if you don't broadcast throwing your dice on live TV. Or it might make sense to transparently never give in to some kinds of threats in some sorts of real-life situations.

There’s certainly a huge difference in the UV levels between winters and summers. Even during winters, if you go out while the UV index isn’t 0, you should wear sunscreen if you’re on tretinoin. (I’m deferring to a dermatologist and haven’t actually checked the sources though.)

The skin is much more susceptible to sun damage. Epistemic status: heard it’s from a doctor practicing evidence-based medicine, but haven’t looked into the sources myself.

Note that you have to use at least SPF50 sunscreen every day, including during winters, including when it’s cloudy (clouds actually don’t reduce UV that much) if you use tretinoin.

The linked sunscreen is SPF 45, which is not suitable if you’re using tretinoin.

(I used tretinoin for five years and it has had a pretty awful effect on my skin- it made it very shiny/oily. It’s unclear how reversible this is. At some point I learned that it’s a side effect of tretinoin and mostly stopped using it. Some people want their skin to look like that and go for tretinoin to achieve it; for me, it looks pretty unnatural/bad and I am very much not unhappy about it.)

Load More