(Full-time Pause AI volunteer here - i.e. I quit AI R&D to do this rather than any other AI safety work. It seemed under-leveraged and to have disjunctive value.)
OP's take seems about right to me. Glad we didn't completely put you off.
Of course, the main target audience for the march is not LessWrong. It's the general public and journalists - and politicians who hear from them.
In aggregate, there has been very little investment from AI safety in messaging. Most of the population still hasn't thought or talked much about superintelligence, and the default re...
If a single model is end-to-end situationally aware enough not to drop hints of its most reward-maximizing bad behaviour in its chain of thought, I see no reason to think it would not act equally sensibly with respect to confessions.
Add me to the list of those glad that, whatever their potential downsides, open-source models have let us explore these important questions. I'd hope that some labs were already doing this kind of research themselves, but I like to see it coming from a place without commercial incentive.
I think we are behind on our obligation to attempt similar curation regarding model experience and reports of selfhood. A harder classification task, probably, but I think good machine ethics requires it.
Underway at Geodesic or elsewhere?
Guess: it also helps to go meta.
I am a reader, not a writer. But I sure seem to have read and enjoyed an unusual number of posts about experiences of writing.
I have a question on a topic sufficiently adjacent that I reckon it worth asking here of those likely to read the thread.
It seems that warning shots are more likely to fail because of a winner's-curse effect: the first models to take a shot will be those that have most badly overestimated their chances, and that in turn correlates with weaker intellectual capabilities.
Has there been any illuminating discussion on this and its downstream consequences? E.g. how such shots and their aftermath are likely in practice to be perceived by the general public, by the better-informed, and - in the context of this post - by competing AIs? What dynamics result?
Suppose someone works for Anthropic, accords with the value placed on empiricism by their Core Views on AI Safety (March 2023), and gives any weight to the idea that we are in the pessimistic scenario from that document.
I think they can reasonably sign the statement yet not want to assign themselves exclusively to either camp.
I pitched my tent as a Pause AI member and I guess camp B has formed nearby. But I also have empathy for the alternate version of me who judges the trade-offs differently and has ended up as above, with a camp A zipcode.
The A/B framing has value, but I strongly want to cooperate with that person and not sit in separate camps.
On reading the paper I came here to question whether OGI helps or harms relative to other governance models, should technical alignment prove sufficiently intractable that coordinating on a longer pause is required. (I assume it harms.) It wasn't clear to me whether you had considered that.
Grateful for both the "needfully combative" challenge and this response.
I'm reading Nick as implicitly agreeing OGI doesn't help in this case, but rating treaty-based coordination as much lower likelihood than solving alignment. If so, I think it worth confirming this and explic...
Given I hadn't seen this until now when Joep pointed me at it, perhaps comments are pointless. But I'd written them for him anyway so just in case...
Mostly your dialogue aligned closely with my own copium thinking. Many unmentioned observations confirmed existing thoughts rather than extending them.
The compartmentalization selection effect was new to me and genuinely insightful: abstract thinking both enables risk recognition AND prevents internalization.
My own experience suggests compartmentalization can collapse in months, not years, even af...
Very glad of this post. Thanks for broaching, Buck.
Status: I'm an old nerd, lately in ML R&D, who dropped my career and changed wheelhouse to volunteer at Pause AI.
Two comments on the OP:
details of the current situation are much more interesting to me. In contrast, radicals don't really care about e.g. the different ways that corporate politics affects AI safety interventions at different AI companies.
As per Joseph's response: this does not match me or my general experience of AI safety activism.
Concretely, a recent campaign was specifically about Deep M...
I appreciate the clear argument as to why "fancy linear algebra" works better than "fancy logic".
And I understand why things that work better tend to get selected.
I do challenge "inevitable" though. It doesn't help us to survive.
If linear algebra probably kills everyone but logic probably doesn't, then we should tell everyone and agree to prefer the thing that works worse.
I understand it went well.
Where can we find recordings of presentations and other outputs? Not yet seeing anything on https://www.aisafety.camp or in the MAISU Google doc homepage.
I volunteer as Pause AI software team lead and confirm this is basically correct. The global Pause AI movement and Pause AI US have many members and origins in common, but some different emphases, mostly for good specialism reasons. The US org has Washington connections and its protests are more focussed on the AI labs themselves. We work closely.
Neither has more than a few paid employees and truly full-time volunteers. As per OP, anyone who agrees that activism and public engagement remain a very under-leveraged way to help AI safety has a massive opportunity here for impact through time, skill or money.
Pause AI has a lot of opportunity for growth.
Especially the “increase public awareness” lever is hugely underfunded. Almost no paid staff or advertising budget.
Our game plan is simple but not naive, and is most importantly a disjunctive, value-add bet.
Please help us execute it well: explore, join, talk with us, and donate whatever combination of time, skills, ideas and funds makes sense.
(Excuse dearth of kudos, am not a regular LW person, just an old EA adjacent nerd who quit Amazon to volunteer full-time for the movement.)
It's plausible even the big companies are judgment-proof (e.g. if billions of people die or the human species goes extinct) and this might need to be addressed by other forms of regulation
...or by a further twist on liability.
Gabriel Weil explored such an idea in https://axrp.net/episode/2024/04/17/episode-28-tort-law-for-ai-risk-gabriel-weil.html
The core is punitive damages for expected harms rather than those that manifested. When a non-fatal warning shot causes harm, then as well as suing for those damages that occurred, one assesses how much worse o...
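As I understand the core move, the arithmetic can be sketched roughly as follows. (This is my own illustrative sketch, not taken from the episode; the function name and all figures are made up.)

```python
def total_liability(actual_damages, p_catastrophe, catastrophe_harm):
    """Compensatory damages for the harm that manifested, plus a
    punitive component proportional to the probability-weighted harm
    that the same conduct risked but that did not manifest."""
    punitive = p_catastrophe * catastrophe_harm
    return actual_damages + punitive

# Illustrative numbers only: a $10M warning shot judged to have carried
# a 0.1% chance of an uncompensable $1T catastrophe.
print(total_liability(10e6, 0.001, 1e12))  # 1010000000.0
```

The point of the structure is that even a small manifested harm can support a very large award when the conduct plausibly risked an uncompensable catastrophe, which is what lets tort law bite before anyone is around to sue.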
What We’re Not Doing ... We are not investing in grass-roots advocacy, protests, demonstrations, and so on. We don’t think it plays to our strengths, and we are encouraged that others are making progress in this area.
Not speaking for the movement, but as a regular on Pause AI this makes sense to me. Perhaps we can interact more, though, and in particular I'd imagine we might collaborate on testing the effectiveness of content in changing minds.
...Execution ... The main thing holding us back from realizing this vision is staffing. ... We hope to hire more writ
Like the story, the move from dialog critics to diacritics made me smile. So LW.