AmberDawn

Comments
Fluent, Cruxy Predictions
AmberDawn · 1y · 40

I got a Fatebook account thanks to this post!

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 10

Thanks, this is helpful!

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 30

Thanks! This is interesting.

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 10

Thanks!

I think my question is deeper - why do machines 'want' or 'have a goal to' follow the algorithm to maximize reward? How can machines 'find stuff rewarding'? 

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 30

This might be a crux, because I'm inclined to think they depend on qualia.

Why does AI 'behave' in that way? How do engineers make it 'want' to do things?

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 30

My comment-box got glitchy, but just to add: this category of intervention might be a good fit for people who care about AI safety and don't have ML/programming skills, but do have people skills, comms skills, political skills, etc.

Maybe lots of people are indeed working on this sort of thing; I've just heard much less discussion of this kind of solution relative to technical solutions.

AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 130
  • Yudkowsky writes in his AGI Ruin post:
         "We can't just "decide not to build AGI" because GPUs are everywhere..." 

    Is anyone thinking seriously about how we might bring about global coordination to not build AGI (at least until we're confident we can do so safely)? If so, who? If not, why not? It seems like something we should at least try, especially if the situation is as dire as Yudkowsky thinks. The sort of thing I'm thinking of includes (and this touches on points others have made in their questions):
     
  • international governance/regulation
  • start a protest movement against building AI
  • do lots of research and thinking about rhetoric and communication and diplomacy, find some extremely charming and charismatic people to work on this, and send them to persuade all actors capable of building AGI to not do it (and to do everything they can to prevent others from doing it)
  • as someone suggested in another question, translate good materials on why people are concerned about AI safety into Mandarin and other languages
  • more popularising of AI concerns in English 

To be clear, I'm not claiming that this will be easy - this is not a "why don't we just-" point.  I agree with the things Yudkowsky says in that paragraph about why it would be difficult. I'm just saying that it's not obvious to me that this is fundamentally intractable or harder than solving the technical alignment problem. Reasons for relative optimism:

  • we seem to have achieved some international cooperation around nuclear weapons - isn't it theoretically possible to do so around AGI? 
  • there are lots of actors who could build AGIs, but it's still a limited number. Larger groups of actors do cooperate. 
  • through negotiation and diplomacy, people successfully persuade others to do things that aren't even in their interest. AI safety should be a much easier sell, because if developing AGI is really dangerous, it's in everyone's interest to stop developing it. There are coordination problems, to be sure, but the fact remains that the AI safety 'message' is fundamentally 'if you stop doing this, we won't all die'.

     
AGI Safety FAQ / all-dumb-questions-allowed thread
AmberDawn · 3y · 121

This is very basic/fundamental compared to many questions in this thread, but I am taking 'all dumb questions allowed' hyper-literally, lol. I have little technical background and though I've absorbed some stuff about AI safety by osmosis, I've only recently been trying to dig deeper into it (and there's lots of basic/fundamental texts I haven't read).

Writers on AGI often talk about AGI in anthropomorphic terms - they talk about it having 'goals', being an 'agent', 'thinking', 'wanting', 'rewards', etc. As I understand it, most AI researchers don't think that AIs will have human-style qualia, sentience, or consciousness. 

But if AIs don't have qualia/sentience, how can they 'want things', 'have goals', 'be rewarded', etc.? (since in humans, these things seem to depend on our qualia, and specifically our ability to feel pleasure and pain). 

I first realised that I was confused about this when reading Richard Ngo's introduction to AI safety and he was talking about reward functions and reinforcement learning. I realised that I don't understand how reinforcement learning works in machines. I understand how it works in humans and other animals - give the animal something pleasant when it does the desired behaviour and/or painful when it does the bad behaviour. But how can you make a machine without qualia "feel" pleasure or pain? 

When I talked to some friends about this, I came to the conclusion that this is just a subset of 'not knowing how computers work', and it might be addressed by me getting more knowledge about how computers work (on a hardware, or software-communicating-with-hardware, level). But I'm interested in people's answers here. 
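(One way to see what 'reward' means mechanically, without settling the qualia question: in code, the reward is just a number, and 'learning' is arithmetic that nudges the agent toward actions that have historically produced larger numbers. Below is a minimal sketch of this idea using a hypothetical two-armed bandit; it's an illustrative toy, not drawn from any of the texts discussed here.)

```python
import random

# Toy "reinforcement learning": a two-armed bandit.
# The "reward" is just a float; "learning" is updating a running average.

def pull(arm):
    # Hypothetical payoffs: arm 1 pays off more often than arm 0.
    return 1.0 if random.random() < (0.8 if arm == 1 else 0.2) else 0.0

random.seed(0)
estimates = [0.0, 0.0]  # the agent's estimated value of each arm
counts = [0, 0]

for step in range(1000):
    # Mostly pick the arm with the higher estimate ("exploit"),
    # occasionally pick at random ("explore").
    if random.random() < 0.1:
        arm = random.randrange(2)
    else:
        arm = 0 if estimates[0] > estimates[1] else 1
    reward = pull(arm)  # a number, nothing felt
    counts[arm] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

# The agent now "prefers" arm 1 only in the sense that
# estimates[1] > estimates[0] steers its future choices.
print(estimates[1] > estimates[0])
```

On this picture, 'wanting' cashes out as: the stored numbers systematically steer behaviour toward higher reward. Whether that deserves the word 'want' is exactly the philosophical question above.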

Posts

1 · Should Effective Altruists be Valuists instead of utilitarians? · 2y · 3
65 · AGI Timelines in Governance: Different Strategies for Different Timeframes · 3y · 28
57 · Two reasons we might be closer to solving alignment than it seems · 3y · 9
57 · How and why to turn everything into audio · 3y · 20
39 · Four reasons I find AI safety emotionally compelling · 3y · 3