Dave Orr
DeepMind Gemini Safety lead; Foundation board member

Comments

How AI researchers define AI sentience? Participate in the poll
Dave Orr · 12d

I don't think behavioral evidence is enough; I think LLMs have basically passed the Turing test anyway.

But I also don't see why it would need to have our specific brain structure. Surely experiences are possible with things besides the mammal brain. However, if something did have a brain structure similar to ours, that would probably be sufficient. (It certainly is for other people, and I think most of us believe that e.g. higher mammals have experiences.)

What I think we need is some kind of story about why what we have gives rise to experience, and then we can see if AIs have some similar pathway. Unfortunately this is very hard because we have no idea why what we have gives rise to experience (afaik).

Until we have that I think we just have to be very uncertain about what is going on.

How AI researchers define AI sentience? Participate in the poll
Dave Orr · 16d

We think humans are sentient because of two factors: first, we have internal experience, which tells us that we ourselves are sentient; and second, we rely on testimony from others who say they are sentient. We can rely on the latter because people seem similar. I feel sentient and say I am. You are similar to me and say you are. Probably you are sentient.

With AI, this breaks down because they aren't very similar to us in terms of cognition, brain architecture, or "life" "experience". So unfortunately an AI saying it is sentient does not produce the same kind of evidence as it does for people.

This suggests that any test should try to establish relevant similarity between AIs and humans, or else use an objective definition of what it means to experience something. Given that the latter does not exist, perhaps the former will be more useful. 

The best simple argument for Pausing AI?
Dave Orr · 18d

For that specific example, I would not call it safety critical in the sense that you shouldn't use an unreliable source. Intel involves lots of noisy and untrustworthy data, and indeed the job is making sense of lots of conflicting and noisy signals. It doesn't strike me that adding an LLM to the mix changes things all that much. It's useful, it (presumably) adds signal, but it's also wrong sometimes; that's true of all the inputs an analyst works with.

Where I would say it crosses a line is if there isn't a human analyst. If an LLM analyst were directly providing recommendations for actions that weren't vetted by a human, yikes, that seems super bad and we're not ready for that. But I would be quite surprised if that were happening right now.

The best simple argument for Pausing AI?
Dave Orr · 20d

"Perhaps we should pause widespread rollout of Generative AI in safety-critical domains — unless and until it can be relied on to follow rules with significant greater reliability."

This seems clearly correct to me: LLMs should not be in safety-critical domains until we can make a clear case for why things will go well in that situation. I'm not actually aware of anyone using LLMs in that way yet, mostly because they aren't good enough, but I'm sure that at some point it'll start happening. You could imagine enshrining in regulation a requirement for affirmative safety cases in safety-critical domains, showing that risk is at or below that of the reasonable alternative.

Note that this does not exclude other threats: for instance, misalignment in very capable models could go badly wrong even if those models aren't deployed to critical domains. Lots of threats to consider!

Orphaned Policies (Post 5 of 7 on AI Governance)
Dave Orr · 2mo

This is a great piece! I especially appreciate the concrete list at the end. 

In other areas of advocacy and policy, it's typical practice to have model legislation and available experts ready to go, so that when a window for action opens, progress can be very fast. We need to get AI safety into a similar place.

AI #113: The o3 Era Begins
Dave Orr · 3mo

Formatting is still kind of bad, and it's affecting readability. It's been a couple of posts in a row now with long wall-of-text paragraphs. I feel like you changed something? And you should change it back. :)

LLM-based Fact Checking for Popular Posts?
Dave Orr · 3mo

Are there examples of posts with factual errors you think would be caught by LLMs? 

One thing you could do is fact-check a few likely posts and see if it adds substantial value. That would be more persuasive than abstract arguments.

On Google’s Safety Plan
Dave Orr · 3mo

"There have been some relatively discontinuous jumps already (e.g. GPT-3, 3.5 and 4), at least from the outside perspective."

These are firmly within our definition of continuity; we intend our approach to handle jumps larger than those seen in your examples here.

Possibly a disconnect is that from an end user's perspective a new release can look like a big jump, while from the developer's perspective progress was continuous.

Note also that continuous progress can still be very fast. And of course we could be wrong about discontinuous jumps.

Recent AI model progress feels mostly like bullshit
Dave Orr · 4mo

I don't work directly on pretraining, but when there were allegations of eval set contamination last year, due to detection of a canary string, I looked into it specifically. I read the docs on prevention, talked with the lead engineer, and discussed it with other execs.

So I have pretty detailed knowledge here. Of course GDM is a big complicated place and I certainly don't know everything, but I'm confident that we are trying hard to prevent contamination.

Posts

65 · Checking in on Scott's composition image bet with imagen 3 · 7mo · 0 comments
50 · Why I think nuclear war triggered by Russian tactical nukes in Ukraine is unlikely · 3y · 7 comments
166 · Playing with DALL·E 2 · 3y · 118 comments
158 · parenting rules · 5y · 9 comments