Dave Orr

Anthropic Safeguards lead; formerly GDM safety work. Foundation board member. 

Comments
Leaving Open Philanthropy, going to Anthropic
Dave Orr · 10d

Welcome to Anthropic! We're lucky to have you. :)

Which side of the AI safety community are you in?
Dave Orr · 21d

I agree that the statement doesn't require direct democracy, but that seems like the most likely way to answer the question "do people want this?"

Here's a brief list of things that were unpopular and broadly opposed that I nonetheless think were clearly good:

  • smallpox vaccine
  • seatbelts, and then seatbelt laws
  • cars
  • unleaded gasoline
  • microwaves (the oven, not the radiation)

Generally I feel like people sometimes oppose things that seem disruptive and can be swayed by demagogues. There's a reason that representative democracy works better than direct democracy. (Though it has obvious issues as well.)

As another whole class of examples, I think people instinctively dislike free speech, immigration, and free markets. We have those things because elites took a strong stance based on a better understanding of the world.

I support democratic input, and especially understanding people's fears and being responsive to them. But I don't support only doing things that people want to happen. If we had followed that rule for the past few centuries, I think the world would be massively worse off.

Which side of the AI safety community are you in?
Dave Orr · 21d

I like the first clause in the 2025 statement. If that were the whole thing, I would happily sign it. However, having lived in California for decades, I'm pretty skeptical that direct democracy is a good way of making decisions, and I would not endorse making a critical decision based on polls or a vote. (See also: Brexit.)

I did sign the 2023 statement.

come work on dangerous capability mitigations at Anthropic
Dave Orr · 21d

I think this is a good idea in certain contexts. Right now we have a weak version of this, where we give people a number to call if they seem like they might be suicidal. Stay tuned for more ideas in this direction later this year.

Omelas Is Perfectly Misread
Dave Orr · 1mo

I'm also reminded of The Dispossessed, in which Le Guin describes two worlds: a capitalist society in which people are rich but have curtailed freedoms and some authoritarian aspects, and a kind of socialist anarchy where people are very poor, claim to be free, and are constrained by culture and custom rather than law.

I find that people who come into the book with a strong prior on capitalism being good or bad will also end up with a clear view on which "utopia" is better. The book itself is probably a critique of the idea that utopia is even possible, and of whether it's a coherent concept at all.

AI #131 Part 2: Various Misaligned Things
Dave Orr · 2mo

> Which makes it rather strange to choose to sell worse, and thus less expensive and less profitable, chips to China rather than instead making better chips to sell to the West.

I think what might be going on here is that different fabs have separate capacities. You can't make more H200s because they have to be made on the new and fancy fabs, but you can make H20s in an older facility. So if you want to sell more chips and you're supply-limited on the H200s, the only thing you can do is make crappier chips and figure out where to sell them.
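
As a toy sketch of that logic (every number below is invented, and the variable names are purely illustrative, not anything from NVIDIA): with separate capacity pools per process node, and H200s already selling out, any positive-margin H20 built on otherwise-idle older fabs is pure additional profit.

```python
# Toy model with made-up numbers: the two chip lines draw on separate
# fab capacity, so selling H20s doesn't displace any H200 sales.
h200_capacity = 100   # units/month on the newest process (hypothetical)
h20_capacity = 80     # units/month on an older process (hypothetical)
h200_margin = 30_000  # profit per unit (hypothetical)
h20_margin = 8_000    # lower margin, but still positive

# H200 demand exceeds supply, so every unit the new fabs make is sold.
profit_h200_only = h200_capacity * h200_margin

# The older fabs can't produce H200s; their alternative is sitting idle,
# so each H20 sold there adds straight to the bottom line.
profit_both = profit_h200_only + h20_capacity * h20_margin

print(profit_h200_only, profit_both)  # 3000000 3640000
```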

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

We certainly plan to!

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

I 100% endorse working on alignment and agree that it's super important. 

We do think that misuse mitigations at Anthropic can help improve things generally through race-to-the-top dynamics, and I can attest that while at GDM I was meaningfully influenced by things that Anthropic did.

come work on dangerous capability mitigations at Anthropic
Dave Orr · 3mo

I'm new to Anthropic myself, leading the Safeguards team. I joined a few weeks ago, inspired by the mission and the opportunity. I'm really worried about the world as AI continues to get more powerful, and I couldn't pass up the chance to help if I could.

I was previously at GDM working on similar problems (miss you all!), but the chance to help drive the safety agenda at Anthropic as we transition to a new scarier world felt too important to miss.

So far everything at Anthropic except my new commute is amazing, but most of all, the sense of mission is intense and awesome. Also, the level of transparency inside the company is astounding for a company this size (not that it's big compared to many others).

Obviously there could be some honeymoon effect here, but I'm having a lot of fun, and I honestly think Safeguards (along with Alignment Science) makes a real difference in safety for the world.

Should you start a for-profit AI safety org?
Dave Orr · 3mo

It depends on what you're trying to do, right? Like, if you build a great eval to detect agent autonomy, but nobody adopts it, you haven't accomplished anything. You need to know how to work with AI labs. In that case, selling widgets (your eval) is highly aligned with AI safety.

IME there are an extremely large number of NGOs with passionate people who do not remotely move the needle on whatever problem they are trying to solve. I think it's the modal outcome for a new nonprofit. I'm not 100% sure that the feedback loop aspect is the reason but I think it plays a very substantial role.

Posts

come work on dangerous capability mitigations at Anthropic · 33 karma · 3mo · 9 comments
Checking in on Scott's composition image bet with imagen 3 · 65 karma · 11mo · 0 comments
Why I think nuclear war triggered by Russian tactical nukes in Ukraine is unlikely · 50 karma · 3y · 7 comments
Playing with DALL·E 2 · 166 karma · 4y · 118 comments
parenting rules · 159 karma · 5y · 9 comments