Why do some societies exhibit more antisocial punishment than others? Martin explores both some of the literature on the subject and his own experience living in a country where "punishment of cooperators" was fairly common.
Yesterday, I had a coronectomy: the top halves of my bottom wisdom teeth were surgically removed. It was my first time being sedated, and I didn’t know what to expect. I was unconscious during the surgery itself, but the hour after surgery turned out to be a fascinating experience, because I was completely lucid but had almost zero short-term memory.
My girlfriend, who had kindly agreed to accompany me to the surgery, was with me during that hour. And so — apparently against the advice of the nurses — I spent that whole hour talking to her and asking her questions.
The biggest reason I find my experience fascinating is that it has mostly answered a question that I’ve had about myself for quite a long time: how deterministic am...
It could be an interesting experiment to build up this list iteratively: every question you ask for the third time gets its answer added at the bottom of the list. How long will the list get, and what will it contain?
Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
Safe & Secure AI Innovation Act also has a description page.
There have been many highly vocal and forceful objections to SB 1047 this week, in reaction to a (disputed and seemingly incorrect) claim that the bill has been ‘fast tracked.’
The bill continues to have a substantial chance of becoming law according to Manifold, where the market has not moved on recent events. The bill has been referred to two policy committees, one of which put out this 38-page analysis.
The purpose of this post is to gather and analyze all...
If your model is not projected to be at least 2024 state of the art and it is not over the 10^26 flops limit?
It's not going to be 2024 forever. In the future, being 2024 state of the art won't be as hard as it is in actual 2024.
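For intuition about where that compute threshold sits, here is a minimal back-of-the-envelope sketch, assuming the common ~6·N·D rule of thumb for dense-model training compute; the parameter and token counts below are made-up illustrations, not figures from the bill or from any real training run:

```python
# Rough check of whether a training run crosses the 10^26 FLOP threshold,
# using the common ~6 * N * D approximation for training compute
# (N = parameter count, D = training tokens). All numbers here are
# illustrative assumptions.

THRESHOLD_FLOP = 1e26

def training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute via the 6*N*D rule of thumb."""
    return 6 * n_params * n_tokens

# Hypothetical example: a 200B-parameter model trained on 10T tokens.
flop = training_flop(2e11, 1e13)
print(f"{flop:.1e} FLOP; over threshold? {flop > THRESHOLD_FLOP}")
# -> 1.2e+25 FLOP; over threshold? False (roughly an order of magnitude under)
```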
That developers risk going to jail for making a mistake on a form.
- This (almost) never happens.
Because prosecuting someone for making a mistake on a form happens when the government wants to go after an otherwise innocent person for unacceptable reasons: they prosecute a crime that otherwise goes unprosecuted 99% of the time.
A couple years ago, I had a great conversation at a research retreat about the cool things we could do if only we had safe, reliable amnesic drugs - i.e. drugs which would allow us to act more-or-less normally for some time, but not remember it at all later on. And then nothing came of that conversation, because as far as any of us knew such drugs were science fiction.
… so yesterday when I read Eric Neyman’s fun post My hour of memoryless lucidity, I was pretty surprised to learn that what sounded like a pretty ideal amnesic drug was used in routine surgery. A little googling suggested that the drug was probably a benzodiazepine (think Valium). Which means it’s not only a great amnesic, it’s also apparently one...
Another class of applications which we discussed at the retreat: person 1 takes the amnesic, person 2 shares private information with them, and then person 1 gives their reaction to the private information. This can be used e.g. for complex negotiations: maybe it is in our mutual best interest to make some deal, but in order for me to know that I'd need some information which you don't want to share with me. So I take the drug, you share the information, and I record some verified record of myself saying "dear future self, you should in fact take this deal".
... which is cool in theory but I would guess not of high immediate value in practice, which is why the post didn't focus on it.
I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions for how it could be better keep coming.
Seriously, Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what it will want.
So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?
Can we really afford to do this while our god software looks like...
May this find you well.
Consider the pressures and incentives. Adding new features can help you sell the software to more users. Fixing bugs... unless the application is practically falling apart, it does not make much of a difference. After all, the bugs will only get noticed by people who already use your application, i.e. they already paid for it.
For the artificial intelligence, I assume the "killer app" will be its integration with SharePoint.
This sort of raises the question of why we don't observe other companies assassinating whistleblowers.
I'm excited to share a project I've been working on that I think many in the LessWrong community will appreciate - converting some rational fiction into high-quality audiobooks using cutting-edge AI voice technology from ElevenLabs, under the name "Askwho Casts AI".
The keystone of this project is an audiobook version of Planecrash (AKA Project Lawful), the epic glowfic authored by Eliezer Yudkowsky and Lintamande. Given the scope and scale of this work, with its large cast of characters, I'm using ElevenLabs to give each character their own distinct voice. Converting this story into audiobook form has been a labor of love, and I hope that if anyone has bounced off it before, this...
Added an embedded audio element for you.
Because[1] for a Bayesian reasoner, there is conservation of expected evidence.
Although I've seen it mentioned that, technically, the beliefs of a Bayesian should follow a martingale, and Brownian motion is a martingale.
I'm not super technically strong on this particular part of the math. Intuitively, it could be that in a bounded reasoner which can only evaluate programs in some complexity class, any pattern in its beliefs that can be described by an algorithm in that class is detected, and the predicted future belief from that pattern is incorporated into the current belief.
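For reference, here is a minimal statement of the identity being invoked (standard textbook form, not taken from the comments above). Conservation of expected evidence says your current belief must equal the expectation of your post-update belief:

$$
P(H) \;=\; P(H \mid E)\,P(E) \;+\; P(H \mid \neg E)\,P(\neg E),
\qquad\text{i.e.}\qquad
\mathbb{E}\big[P(H \mid E)\big] \;=\; P(H).
$$

This is exactly the martingale property: the expected next value of the belief process equals its current value, so any predictable trend in your beliefs would already be a violation of Bayesian updating.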
Crossposted from the AI Optimists blog.
AI doom scenarios often suppose that future AIs will engage in scheming: planning to escape, gain power, and pursue ulterior motives, while deceiving us into thinking they are aligned with our interests. The worry is that if a schemer escapes, it may seek world domination to ensure humans do not interfere with its plans, whatever they may be.
In this essay, we debunk the counting argument, a central reason to think AIs might become schemers, according to a recent report by AI safety researcher Joe Carlsmith.[1] It’s premised on the idea that schemers can have “a wide variety of goals,” while the motivations of a non-schemer must be benign by definition. Since there are “more” possible schemers than non-schemers, the argument goes, we should...
I agree that, overall, counting arguments are weak.
But even if you expect SGD to be used for TAI, generalisation is not a good counterexample, because maybe most counting arguments about SGD do work except for generalisation (which would not be surprising, because we selected SGD precisely because it generalises well).
Many thanks to Spencer Greenberg, Lucius Caviola, Josh Lewis, John Bargh, Ben Pace, Diogo de Lucena, and Philip Gubbins for their valuable ideas and feedback at each stage of this project—as well as the ~375 EAs + alignment researchers who provided the data that made this project possible.
Last month, AE Studio launched two surveys: one for alignment researchers, and another for the broader EA community.
We got some surprisingly interesting results, and we're excited to share them here.
We set out to better explore and compare various population-level dynamics within and across both groups. We examined everything from demographics and personality traits to community views on specific EA/alignment-related topics. We took on this project because the area seemed largely unexplored and rife with potentially very high-value insights. In this post, we’ll present what...
Thank you so much for conducting this survey! I want to share some information on behalf of MATS: