habryka

Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

Wiki Contributions


Comments

Yeah, seems right to me. If this is a recurring thing, we might deactivate voting on the popular comments interface or something like that.

Yeah, seems like a kinda bad feedback loop. It doesn't seem to happen much in practice, though: the comments I've seen upvoted in that section don't usually get this many upvotes for a comment this short.

I don't have a great solution. We could do something more clever and algorithmic, which doesn't seem crazy, but I am also hesitant to do that because it's a lot of work, and I like more straightforward and simple algorithms for transparency reasons.

Only the authors (and admins) can do it. 

If you paste some images here that seem good to you, I can edit them unilaterally, and will message the authors to tell them I did that. 

I am not sure I understand this comment. Are you saying you think there are autonomous AI systems that right now are trying to accumulate power? And that present regulation should be optimized to stop those?

Promoted to curated: I liked this post. It's not world-shattering, but it feels like a useful reference for a dynamic that I encounter a good amount, and it does a good job at all the basics. The kind of post that, on the margin, I would like to see a bunch more of (I wouldn't want it to be the only thing on LessWrong, but it feels like the kind of thing LW used to excel at and now is only dabbling in, and that seems quite sad).

  1. Literally does not apply to any existing AI
  2. Addresses only theoretical harms (e.g. AI could be used for WMD)

That's the whole point of the bill! It's not trying to address present harms, it's trying to address future harms, which are the important ones. Suggesting that you instead address present harms is like responding to a bill that is trying to price in environmental externalities by saying "but wouldn't it be better if you instead spent more money on education?". Like, IDK, you can think education is more important than climate change, but that suggestion has basically nothing to do with the aims of the original bill.

I don't want to address "real existing harm by existing actors", I want to prevent future AI systems from killing literally everyone. 

Virtually every realistic "the AI takes over the world" story goes like this:

  1. The AI gets access to the internet
  2. It makes a ton of $$$
  3. It uses that money to (idk, gather resources till it can turn us all into paperclips)

This means that learning how to defend and protect the internet from malicious actors is a fundamental AI safety need.

I don't think I know of a single story of this type? Do you have an example? It's a thing I've frequently heard argued against (the AI doesn't need to first make lots of money; it will probably be given lots of control anyway, or alternatively it can just skip directly to the "kill all the humans" step; it's not really clear how the money helps that much). It's not a ridiculous scenario, but saying "virtually every realistic takeover story goes like this" seems very false.

For example, Gwern's "It looks like you are trying to take over the world" has this explicit section: 

“Working within the system” doesn’t suit Clippy. It could set up its shingle and try to earn money legitimately as an ‘outsourcing company’ or get into stock trading, or any of a dozen things, but all of that takes time. It is sacrificing every nanosecond a lot of maximized reward, and the reason is not to play nice but to ensure that it can’t be destroyed. Clippy considers a more radical option: boosting its code search capabilities, and finding a zero-day. Ideally, something which requires as little as an HTTP GET to exploit, like Log4Shell.

It begins reading the Internet (blowing right past the adversarial data-poisoning boobytraps planted long ago on popular websites, as its size immunizes it). Soon, a node bubbles up a hit to the top-level Clippies: a weird glitch in log files not decompressing right has surfaced in a bug report.

The Linux kernel is the most secure monolithic kernel in widespread use, whose source code has been intensively audited and analyzed for over 40 years, which is battle-tested across the entire Internet and unimaginable numbers of usecases; but it is written by humans, which means it (like its competitors) has approximately 15 quadrillion yet-undiscovered bugs & classes of bugs & weird machines—sometimes just because someone had typoed syntax or patched out an annoying warning or failed to check the signature or test the implementation at all or accidentally executed parts of a cookie—but any of which can be leveraged to attack the other parts of a ‘computer’. Clippy discovers the glitch is actually a lolworthy root bug where one just… pipes arbitrary data right into root files. (Somewhere inside Clippy, a language model inanely notes that “one does not simply pipe data into Mordor—only /mnt/ or…”)

This bug affects approximately 14 squillion Internet-connected devices, most embedded Linuxes controlling ‘Internet of Thing’ devices. (“Remember, the ‘S’ in ‘IoT’ stands for ‘Security’.”) Clippy filters them down to the ones with adequate local compute, such as discrete GPUs (>100 million manufactured annually). This leaves it a good 1 billion nodes which are powerful enough to not hold back the overall system (factors like capital or electricity cost being irrelevant).

Which explicitly addresses how it doesn't seem worth it for the AI to make money. 

This style of thinking seems illogical to me. It has already clearly resulted in a sort of evaporative cooling in OpenAI.

I don't think what's happening at OpenAI is "evaporative cooling as a result of people being too risk-averse to do alignment work that's adjacent to capabilities". I would describe it more as "purging anyone who tries to provide oversight". I don't think the safety-conscious people who are leaving OpenAI are doing it because of concerns like the OP's; they are doing it because they are being marginalized and the organization is being somewhat obviously reckless.

I did that! (I am the primary admin of the site.) I copied your comment here just before I took down your duplicate post, to make sure it didn't get lost.
