LESSWRONG
arete

Comments

arete's Shortform
arete1d1411

The statements "LLMs are a normal technology" and "Advanced AI is a normal technology" are completely different. If you think LLMs are not very advanced, it is perfectly valid to believe both that LLMs are a normal technology and that advanced AI is not.

arete's Shortform
arete5d50

I've found a lighting solution that gets you 24000 lumens, requires no installation, can be placed anywhere with an outlet, and looks tolerable. It relies on a temporary sale, though.

Just combine this sale of 2x 12000 lumen lightbulbs for $30:

 https://www.amazon.com/dp/B0D7VKXF4R


with 2x of these floor stands for $35:

https://www.walmart.com/ip/Mainstays-71-Black-Floor-Lamp-Modern-Design/12173437 

Hopefully others find this useful.

 

A non-review of "If Anyone Builds It, Everyone Dies"
arete26d*21

Upon reflection, I agree that my previous comment describes fragility of value. 

My mental model is that the standard MIRI position[1] claims the following[2]:
1. Because of the way AI systems are trained, δ and δ′ will be large even if we knew humanity's collective utility function and could target it (this is inner misalignment).
2. Even if δ′ were fairly small, this would still result in catastrophic outcomes if M′ is an extremely powerful optimizer (this is fragility of value).
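
One way to write the made-up notation out explicitly, as a sketch and not an official formulation. The symbols U (humanity's collective utility function), d (the unspecified metric), and x*_{ϕ′} (the outcome M′ steers toward) are assumptions introduced here for illustration; they do not appear in the comments themselves.

```latex
% Assumed notation: U = humanity's collective utility function, d = "some metric".
\[
  \delta = d(\phi, U), \qquad \delta' = d(\phi', U)
\]
% Claim (1), inner misalignment: training leaves both gaps large.
\[
  \delta \gg 0, \qquad \delta' \gg 0
\]
% Claim (2), fragility of value: a small gap does not protect the optimized outcome.
\[
  \delta' \text{ small, yet } U\bigl(x^{*}_{\phi'}\bigr) \ll \max_{x} U(x),
  \qquad x^{*}_{\phi'} = \arg\max_{x} \phi'(x)
\]
% Question (4), misgeneralization as read here: the gap grows with capability.
\[
  \delta' \gg \delta
\]
```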

A few questions:
3. Are the claims (1) and (2) accurate representations of inner misalignment and fragility of value?
4. Is the "misgeneralization" claim just "δ′ will be much larger than δ"? 

If the answer to (4) is yes, I am confused as to why the misgeneralization claim is brought up. It seems that (1) and (2) are sufficient to argue for AI risk. By contrast, the misgeneralization claim seems neither sufficient nor necessary to make a case for AI risk. Furthermore, the misgeneralization claim seems less likely to be true than (1) and (2).

Also, let me know if I am thinking about things in a completely wrong framework and should scrap my made-up notation.


 

[1] There's probably a better name for this. Please suggest one!

[2] Non-exhaustive list.

A non-review of "If Anyone Builds It, Everyone Dies"
arete1mo*6-5

Here's my attempted phrasing, which I think avoids some of the common confusions:

Suppose we have a model M with utility function ϕ, where M is not capable of taking over the world. Assume that thanks to a bunch of alignment work, ϕ is within δ (by some metric) of humanity's collective utility function. Then in the process of maximizing ϕ, M ends up doing a bunch of vaguely helpful stuff.

Then someone releases model M′ with utility function ϕ′, where M′ is capable of taking over the world. Suppose that our alignment techniques generalize perfectly. That is, ϕ′ is also within δ′ of humanity's collective utility function, where δ′≤δ. Then in the process of maximizing ϕ′, M′ gets rid of humans and rearranges their molecules to satisfy ϕ′ better.

Does this phrasing seem accurate and helpful?
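
A toy sketch of the scenario above, under loudly stated assumptions: outcomes are a single number x, `true_utility` and `proxy_utility` are made-up functions, and "within δ by some metric" is taken to mean a mean absolute gap over a set of reference outcomes. None of this comes from the phrasing itself. It only shows how the same ϕ can look helpful in a weak optimizer and catastrophic in a strong one while δ′≤δ still holds.

```python
# Toy illustration only: every function and number below is a made-up assumption.

def true_utility(x: float) -> float:
    """Hypothetical stand-in for humanity's utility over a 1-D outcome x."""
    # Rises up to x = 1, then falls off steeply beyond it.
    return x if x <= 1.0 else 1.0 - 10.0 * (x - 1.0)


def proxy_utility(x: float) -> float:
    """Hypothetical learned utility (phi and phi'): matches the truth for x <= 1."""
    return x


def distance(f, g, points) -> float:
    """One possible metric: mean absolute gap over a set of reference outcomes."""
    return sum(abs(f(x) - g(x)) for x in points) / len(points)


def best_outcome(utility, reachable):
    """An optimizer picks the reachable outcome that maximizes its own utility."""
    return max(reachable, key=utility)


# Reference outcomes used to measure delta: mostly ordinary, a few slightly unusual.
reference = [i / 100 for i in range(0, 111)]     # 0.00 .. 1.10

weak_reach = [i / 10 for i in range(0, 11)]      # M can reach 0.0 .. 1.0
strong_reach = [i / 10 for i in range(0, 101)]   # M' can reach 0.0 .. 10.0

# Both models share the same proxy here, so delta' == delta.
delta = distance(true_utility, proxy_utility, reference)
x_weak = best_outcome(proxy_utility, weak_reach)
x_strong = best_outcome(proxy_utility, strong_reach)

print(f"delta (mean gap on reference outcomes): {delta:.4f}")
print(f"M  picks x={x_weak}, true utility {true_utility(x_weak)}")
print(f"M' picks x={x_strong}, true utility {true_utility(x_strong)}")
```

The result depends on the choice of metric. If "within δ′" meant a worst-case gap over every outcome M′ can reach, then maximizing ϕ′ could cost at most 2δ′ of true utility, so the scenario seems to need a weaker, average-case notion of closeness like the one assumed here.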
 
