habryka

Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology


Comments

Promoted to curated: I liked this post. It's not world-shattering, but it feels like a useful reference for a dynamic that I encounter a good amount, and it does a good job at all the basics. It's the kind of post that, on the margin, I would like to see a bunch more of (I wouldn't want it to be the only thing on LessWrong, but it feels like the kind of thing LW used to excel at and now only dabbles in, and that seems quite sad). 

  1. Literally does not apply to any existing AI
  2. Addresses only theoretical harms (e.g. AI could be used for WMD)

That's the whole point of the bill! It's not trying to address present harms, it's trying to address future harms, which are the important ones. Suggesting that you instead address present harms is like responding to a bill that is trying to price in environmental externalities by saying "but wouldn't it be better if you instead spent more money on education?", which like, IDK, you can think education is more important than climate change, but your suggestion has basically nothing to do with the aims of the original bill. 

I don't want to address "real existing harm by existing actors", I want to prevent future AI systems from killing literally everyone. 

Virtually every realistic "the AI takes over the world" story goes like this:

  1. The AI gets access to the internet
  2. It makes a ton of $$$
  3. It uses that money to (idk, gather resources till it can turn us all into paperclips)

This means that learning how to defend and protect the internet from malicious actors is a fundamental AI safety need.

I don't think I know of a single story of this type? Do you have an example? It's a thing I've frequently heard argued against (the AI doesn't need to first make lots of money, it will probably be given lots of control anyways, or alternatively it can just directly skip to the "kill all the humans" step, it's not really clear how the money helps that much), and it's not like a ridiculous scenario, but saying "virtually every realistic takeover story goes like this" seems very false. 

For example, Gwern's "It Looks Like You're Trying To Take Over The World" has this explicit section: 

“Working within the system” doesn’t suit Clippy. It could set up its shingle and try to earn money legitimately as an ‘outsourcing company’ or get into stock trading, or any of a dozen things, but all of that takes time. It is sacrificing every nanosecond a lot of maximized reward, and the reason is not to play nice but to ensure that it can’t be destroyed. Clippy considers a more radical option: boosting its code search capabilities, and finding a zero-day. Ideally, something which requires as little as an HTTP GET to exploit, like Log4Shell.

It begins reading the Internet (blowing right past the adversarial data-poisoning boobytraps planted long ago on popular websites, as its size immunizes it). Soon, a node bubbles up a hit to the top-level Clippies: a weird glitch in log files not decompressing right has surfaced in a bug report.

The Linux kernel is the most secure monolithic kernel in widespread use, whose source code has been intensively audited and analyzed for over 40 years, which is battle-tested across the entire Internet and unimaginable numbers of usecases; but it is written by humans, which means it (like its competitors) has approximately 15 quadrillion yet-undiscovered bugs & classes of bugs & weird machines—sometimes just because someone had typoed syntax or patched out an annoying warning or failed to check the signature or test the implementation at all or accidentally executed parts of a cookie—but any of which can be leveraged to attack the other parts of a ‘computer’. Clippy discovers the glitch is actually a lolworthy root bug where one just… pipes arbitrary data right into root files. (Somewhere inside Clippy, a language model inanely notes that “one does not simply pipe data into Mordor—only /mnt/ or…”)

This bug affects approximately 14 squillion Internet-connected devices, most embedded Linuxes controlling ‘Internet of Thing’ devices. (“Remember, the ‘S’ in ‘IoT’ stands for ‘Security’.”) Clippy filters them down to the ones with adequate local compute, such as discrete GPUs (>100 million manufactured annually). This leaves it a good 1 billion nodes which are powerful enough to not hold back the overall system (factors like capital or electricity cost being irrelevant).

Which explicitly addresses how it doesn't seem worth it for the AI to make money. 

This style of thinking seems illogical to me. It has already clearly resulted in a sort of evaporative cooling in OpenAI.

I don't think what's happening at OpenAI is "evaporative cooling as a result of people being too risk-averse to do alignment work that's adjacent to capabilities". I would describe it more as "purging anyone who tries to provide oversight". I don't think the safety-conscious people who are leaving OpenAI are doing it because of concerns like the OP's; they are doing it because they are being marginalized and the organization is acting in somewhat obviously reckless ways.

I did that! (I am the primary admin of the site.) I copied your comment here just before I took down the duplicate post of yours, to make sure it didn't get lost.

@henry (who seems to know Nicky) said on a duplicate link post of this: 

This is an accessible introduction to AI Safety, written by Nicky Case and the teens at Hack Club. So far, part 1/3 is completed, which covers a rough timeline of AI advancement up to this point, and what might come next.

If you've got feedback as to how this can be made more understandable, that'd be appreciated! Reach out to Nicky, or to me and I'll get the message to her.

@jefftk comments on the HN thread on this:

How many people would, if they suddenly died, be reported as a "Boeing whistleblower"? The lower this number is, the more surprising the death.

Another HN commenter says (in a different thread): 

It’s a nice little math problem.

Let’s say both of the whistleblowers were age 50. The probability of a 50-year-old man dying in a year is 0.6%. So, for N such people, the probability of 2 or more dying in a year is 1 - (the probability of exactly zero dying in a year + the probability of exactly one dying in a year), i.e. 1 - (A+B).

A is (1-0.006)^N. B is 0.006N(1-0.006)^(N-1). At N = 60, A is about 70% and B is about 25%, so the chance of two or more deaths is roughly 5%, which is not statistically significant.

But they died in the same 2-month period, so that 0.006 should be 0.001. If you rerun the same calculation, it’s 356.
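For what it's worth, here is a minimal sketch of that calculation in Python, taking the commenter's assumptions at face value (N = 60 people who could plausibly be reported as "Boeing whistleblowers", a 0.6% chance of death per year, and roughly 0.1% per two-month window):

```python
def prob_two_or_more_deaths(n: int, p: float) -> float:
    """Chance that 2+ of n people die, each independently with probability p."""
    a = (1 - p) ** n                  # A: probability of exactly zero deaths
    b = n * p * (1 - p) ** (n - 1)    # B: probability of exactly one death
    return 1 - (a + b)

N = 60
print(prob_two_or_more_deaths(N, 0.006))  # over a full year: ~0.05 (about 5%)
print(prob_two_or_more_deaths(N, 0.001))  # over a two-month window: ~0.0017
```

Under those assumptions the two-month figure comes out to a bit under 0.2%, i.e. on the order of one in several hundred, though the answer is obviously quite sensitive to what you take N to be.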


Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven't followed this in detail, and my guess is it is basically just random chance, but it sure would be a huge deal if a publicly traded company were now performing assassinations of U.S. citizens. 

Curious whether anyone has looked into this, or has thought much about baseline risk of assassinations or other forms of violence from economic actors.
