Oliver Sourbut — LessWrong

Autonomous Systems @ UK AI Safety Institute (AISI)
DPhil AI Safety @ Oxford (Hertford college, CS dept, AIMS CDT)
Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently

Ord - The Precipice
Pearl - The Book of Why
Bostrom - Superintelligence
McCall Smith - The No. 1 Ladies' Detective Agency (and series)
Melville - Moby-Dick
Abelson & Sussman - Structure and Interpretation of Computer Programs
Stross - Accelerando
Graeme - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites

Hanabi (can't recommend enough; try it out!)
Pandemic (ironic at time of writing...)
Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
Overcooked (my partner and I enjoy the foody themes and frantic realtime coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

The real danger is less to “humanity” and more to specific humans. Because when multicellularity arose, some cells made it, some didn’t, and those that didn’t probably suffered greatly.

I'd quibble, where this depends strongly on the interpretation of 'suffered', and 'made it'!

Another important question is whether (or under what conditions) it's nice to be part of a greater whole. The kinds of primitive (?) greater wholes that humans currently get themselves into seem to vary in that regard, and one could imagine far greater variety there. Would it be nice to be a mitochondria-ish or gene-ish component of a larger being?

And one should note that, in the examples you're giving, and perhaps in your described (if not conceived) vision, these are exclusive, hierarchical aggregations. (After all, my cells are part, exclusively, of me, and my mitochondria are part, exclusively, of a given cell, etc.)

In sharp contrast, contemporary human super-entities are very overlapping, as well as being porous, in ways which I think are very important and valuable (we often consider it a malaise if an aggregate is totalising and exclusionary, perhaps one of the bad markers of a cult - that fuzzy term).

I appreciate and partly agree with this perspective, but I think you're expressing it quite a bit too strongly.

As salient counterpoints, consider that nature repeatedly (in unrelated phyla) invented ganglia and brains, and that human-invented systems sometimes benefit from distribution and parallelism while other times benefiting from centralisation and vertical scaling.

So I certainly don't think you can say that nature empirically strenuously asserts in either direction.

I do think you're on the money saying

There is little reason to suppose an silicon-based and carbon-based life cannot likewise form greater wholes.

though of course a key question is whether it's competitive (and under what conditions), and if not, what can be done about that (or what can be done to set conditions so that it is).

I think the benchmark is intended to measure performance on an even narrower proxy than this—roughly, the sort of tasks involved in ordinary, everyday software engineering.

Note that METR has also published a subsequent attempt to broaden the class of activities, and has some suggestive results that the qualitative exponentially increasing time horizon phenomenon is somewhat robust, but the growth rate varies between domains.

On climate change, I just think it's a point worth making: people are getting exercised about very minor contributions to resource consumption by current AI firms, which is a bit silly, but it is continuous with the kinds of radical and extinctive activities which might stamp out humans forever!

On persuasion, it looks like you might have gone way too hard, if you've been arguing for arbitrary puppeteering out of the gate! (Though perhaps the quote in your mom's voice is hypothetical?)

I'm also offering an example of a way in to discussion (again most workable one-to-one), pointing out the many and varied ways that humans persuade and coerce each other.

On bio, I'm disagreeing that the mom test criteria override the importance of emphasising important interlinked issues. It's not necessary to mention bio in all conversations, but you shouldn't shy away from it, and it's a very available and reasonable example to turn to of the kinds of vulnerabilities that could wipe out huge populations or even all humans.

On 'totally outclassed', I was just offering some ways that in conversation you can make that point relatable. It's generally far more workable one-to-one, since you're having a conversation. Less likely to work in writing, though maybe.

My comment was actually (perhaps unhelpfully) a series of somewhat independent comments. I'll fork under here. In sum, you could say I'm arguing that the identified heuristics for 'mom test' aren't necessarily well fit, in part by giving reasons and angles-on-reasons which are (in my experience) more effective than those implied by the discussion you give to justify the heuristics. I'm also offering a few angles which I've found to be useful conversation openers.

Then they realize how helpless they are. Normal people will not feel helpless based only on a logical theory.

I'd be interested to know how many people have readily-recalled experiences of being totally outclassed. I've played games against people far my superior, been 'skinned' by very skilled football (soccer) players, same in tennis, and once or twice sparred with people who can actually fight. It's pretty visceral and memorable. (I also often outclass noobs myself in some games and sports.) Maybe 'have you ever played a game against someone who totally outclassed you?' is a good interactive conversation-starter? One issue might be that many people resist the idea that there are levels beyond human - but here there are existence proofs to point to.

A convincing scenario cannot involve any bioweapons.

I don't think it's good to compromise on this one. We have existence proofs all over the place and it really is a major weakness. Same for drones. In 2023 I had success talking to civil servants in the UK and they took this threat seriously. (Civil servants aren't just anyone, but they're not usually technical or philosophical nerds.)

Nevermind that this is false and she would do what it says. It’s not believable to her, nor to most people.

A point of comparison could be, 'you know that political faction you hate? Well, people got persuaded into believing and/or supporting that nonsense by a combination of trickery, self-interest, and delusion.' Although certainly persuasion has a ceiling above the human level, I expect most people can't be puppeteered arbitrarily. Most likely a majority can be subdued or confused by FUD, a large fraction swayed by greed or coercion or emotional attachment, and a small number conned into approximately anything with high effort. It's legit for people to think they'd have some level of resistance. But scenarios don't need everyone (or even most people) to be swayable.

No boiling oceans. The more conventional methods used, the more believable.

I think it's legit to point to climate change broadly construed. That's salient for many people (won't be for all), and many scenarios involving automated processes dispassionately trampling humanity are continuous with extreme climate change. It's habitat destruction, but on humans. 'Increasingly automated and inhumane firms drive super climate change and everyone dies' is both a true description of a plausible scenario and made out of salient, colloquially understandable pieces. For responses that people/government would step in, you can mention lobbying, mercenary/automated defence, regulatory capture, and amplification of all the existing means that insulate harmful industrial activities.

LESSWRONG
LW

LESSWRONG
LW

Sequences

Posts

Wikitag Contributions

Comments