Dave Orr

Anthropic Safeguards lead; formerly GDM safety work. Foundation board member. 

Comments
Leaving Open Philanthropy, going to Anthropic
Dave Orr · 10d

Welcome to Anthropic! We're lucky to have you. :)

Which side of the AI safety community are you in?
Dave Orr · 21d

I agree that the statement doesn't require direct democracy, but that seems like the most likely way to answer the question "do people want this?"

Here's a brief list of things that were unpopular and broadly opposed that I nonetheless think were clearly good:

  • smallpox vaccine
  • seatbelts, and then seatbelt laws
  • cars
  • unleaded gasoline
  • microwaves (the oven, not the radiation)

Generally I feel like people sometimes oppose things that seem disruptive and can be swayed by demagogues. There's a reason that representative democracy works better than direct democracy. (Though it has obvious issues as well.)

As another whole class of examples, I think people instinctively dislike free speech, immigration, and free markets. We have those things because elites took a strong stance based on a better understanding of the world.

I support democratic input, and especially understanding people's fears and being responsive to them. But I don't support only doing things that people want to happen. If we had followed that rule for the past few centuries, I think the world would be massively worse off.

Which side of the AI safety community are you in?
Dave Orr · 21d

I like the first clause in the 2025 statement. If that were the whole thing, I would happily sign it. However, having lived in California for decades, I'm pretty skeptical that direct democracy is a good way of making decisions, and I would not endorse making a critical decision based on polls or a vote. (See also: Brexit.)

I did sign the 2023 statement.

come work on dangerous capability mitigations at Anthropic
Dave Orr · 21d

I think this is a good idea in certain contexts. Right now we have a weak version of this, where we give people a number to call if they seem like they might be suicidal. Stay tuned for more ideas in this direction later this year.

Omelas Is Perfectly Misread
Dave Orr · 1mo

I'm also reminded of The Dispossessed, in which Le Guin describes two worlds: a capitalist society in which people are rich but have curtailed freedoms and some authoritarian aspects, and a kind of socialist anarchy where people are very poor, claim to be free, and are constrained by culture and custom rather than law.

I find that people who come into the book with a strong prior on capitalism being good or bad will also end up with a clear view on which "utopia" is better. The book itself is probably a critique of the idea that utopia is even possible, and of whether it's a coherent concept at all.

AI #131 Part 2: Various Misaligned Things
Dave Orr · 2mo

> Which makes it rather strange to choose to sell worse, and thus less expensive and less profitable, chips to China rather than instead making better chips to sell to the West.

I think what might be going on here is that different fabs have separate capacities. You can't make more H200s because they have to be made on the new and fancy fabs, but you can make H20s in an older facility. So if you want to sell more chips and you're supply-limited on the H200s, the only thing you can do is make crappier chips and figure out where to sell them.
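
As a toy sketch of that logic (every number below is invented, and the variable names are purely illustrative, not anything from NVIDIA): with separate capacity pools per process node, and H200s already selling out, any positive-margin H20 built on otherwise-idle older fabs is pure additional profit.

```python
# Toy model with made-up numbers: the two chip lines draw on separate
# fab capacity, so selling H20s doesn't displace any H200 sales.
h200_capacity = 100   # units/month on the newest process (hypothetical)
h20_capacity = 80     # units/month on an older process (hypothetical)
h200_margin = 30_000  # profit per unit (hypothetical)
h20_margin = 8_000    # lower margin, but still positive

# H200 demand exceeds supply, so every unit the new fabs make is sold.
profit_h200_only = h200_capacity * h200_margin

# The older fabs can't produce H200s; their alternative is sitting idle,
# so each H20 sold there adds straight to the bottom line.
profit_both = profit_h200_only + h20_capacity * h20_margin

print(profit_h200_only, profit_both)  # 3000000 3640000
```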

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

We certainly plan to!

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

I 100% endorse working on alignment and agree that it's super important. 

We do think that misuse mitigations at Anthropic can help improve things generally through race-to-the-top dynamics, and I can attest that while at GDM I was meaningfully influenced by things that Anthropic did.

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

I'm new to Anthropic myself, leading the Safeguards team. I joined a few weeks ago, inspired by the mission and the opportunity. I'm really worried about the world as AI continues to get more powerful, and I couldn't pass up the chance to help if I could.

I was previously at GDM working on similar problems (miss you all!), but the chance to help drive the safety agenda at Anthropic as we transition to a new scarier world felt too important to miss.

So far everything at Anthropic except my new commute is amazing, but most of all, the sense of mission is intense and awesome. Also, the level of transparency inside the company is astounding for a company this size (not that it's big compared to many others).

Obviously there could be some honeymoon effect here, but I'm having a lot of fun, and I honestly think Safeguards (along with Alignment Science) makes a real difference in safety for the world.

Should you start a for-profit AI safety org?
Dave Orr · 3mo

It depends on what you're trying to do, right? Like, if you build a great eval to detect agent autonomy, but nobody adopts it, you haven't accomplished anything. You need to know how to work with AI labs. In that case, selling widgets (your eval) is highly aligned with AI safety.

IME there are an extremely large number of NGOs with passionate people who do not remotely move the needle on whatever problem they are trying to solve. I think it's the modal outcome for a new nonprofit. I'm not 100% sure that the feedback loop aspect is the reason but I think it plays a very substantial role.

Posts

come work on dangerous capability mitigations at Anthropic · 33 karma · 3mo · 9 comments
Checking in on Scott's composition image bet with imagen 3 · 65 karma · 11mo · 0 comments
Why I think nuclear war triggered by Russian tactical nukes in Ukraine is unlikely · 50 karma · 3y · 7 comments
Playing with DALL·E 2 · 166 karma · 4y · 118 comments
parenting rules · 159 karma · 5y · 9 comments