Algon

Posts

5 · Algon's Shortform · 3y · 34 comments

Comments

The Best Resources To Build Any Intuition
Algon · 22m · 20

Thinking Physics is a fantastic book. I agree it teaches you a lot of core physics intuitions, like looking for conserved quantities and symmetries. I'm curious to hear what particular intuitions you got from it. It's fine if it isn't an exhaustive list; I just want some more concrete material to put in this entry, so it's clearer what kinds of intuitions you come away with after reading the book.

Open Global Investment as a Governance Model for AGI
Algon · 1d · 20

I'm unsure whether a different standard is needed. Foom Liability, and other such proposals, may be enough. 

For those who haven't read the post, a bit of context: AGI companies may create huge negative externalities. We fine or sue people for doing so in other cases, so we can set up some sort of liability here too. In this case, we might expect truly huge liabilities in plausible worlds where we get near misses from doom, which may be more than AGI companies can afford. When entities plausibly need to pay out more than they can afford, as in healthcare, we may require that they carry insurance.

What liability regime, set ahead of time, would create good incentives to avoid foom doom? Hanson suggests:

Thus I suggest that we consider imposing extra liability for certain AI-mediated harms, make that liability strict, and add punitive damages according to the formula D = (M+H)*F^N. Here D is the damages owed, H is the harm suffered by victims, M > 0, F > 1 are free parameters of this policy, and N is how many of the following eight conditions contributed to causing harm in this case: self-improving, agentic, wide scope of tasks, intentional deception, negligent owner monitoring, values changing greatly, fighting its owners for self-control, and stealing non-owner property.

If we could agree that some sort of cautious policy like this seems prudent, then we could just argue over the particular values of M and F.
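
To make the formula concrete, here's a minimal sketch of how the damages scale; the parameter values and the harm figure below are purely hypothetical, since the proposal leaves M and F open.

```python
# Hypothetical illustration of Hanson's proposed damages formula D = (M + H) * F^N.
# H: harm suffered by victims; M > 0 and F > 1: free policy parameters;
# N: how many of the eight aggravating conditions contributed to the harm.
# The dollar figures below are made up; the proposal doesn't fix them.

def foom_liability_damages(harm: float, m: float, f: float, n: int) -> float:
    """Damages owed under D = (M + H) * F**N."""
    return (m + harm) * f ** n

if __name__ == "__main__":
    # Suppose M = $1M, F = 2, and a near-miss caused $10M of harm.
    for n in range(9):  # N ranges from 0 to 8 conditions
        print(f"N = {n}: D = ${foom_liability_damages(10e6, 1e6, 2.0, n):,.0f}")
```

With F = 2, each additional aggravating condition doubles the damages owed, which is the mechanism that's supposed to push owners away from stacking risky properties like self-improvement and deception.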

Yudkowsky, top foom doomer, says

"If this liability regime were enforced worldwide, I could see it actually helping."

Prerequisite Skills
Algon · 3d · 20

The best way to guarantee you'll know what you did wrong is to isolate a single variable. Start with a process that works. Change exactly one thing. If the new process works better you'll know exactly why. If the new process fails you'll know exactly why.

This is true in theory, where you make the most general possible assumptions about what kinds of problems you'll face. Thankfully, it isn't always true in practice, as the real world has a lot of structure. You can test multiple variables at once when optimizing something.

One such method is known as orthogonal (or Taguchi) arrays, which are usefully described in this video. As you might expect from the name, you're constructing "orthogonal" tests to get uncorrelated responses. The structure of the arrays ensures that every change appears the same number of times as every other change, and likewise for pairs of changes, so you don't really bias the sampling from the space of changes.
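
To make that balance property concrete, here's a minimal sketch using the standard L4 orthogonal array (four runs covering three two-level factors); the array itself is textbook, but the checks are just my own illustration rather than anything from the video.

```python
import itertools

# The L4 orthogonal array: 4 test runs covering 3 two-level factors.
# Each row is one experiment; each column is one factor's setting (0 or 1).
L4 = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

# Every level appears equally often in each column (no factor is over-sampled)...
for col in range(3):
    settings = [run[col] for run in L4]
    assert settings.count(0) == settings.count(1) == 2

# ...and every pair of columns contains each combination of levels exactly once,
# which is what keeps the estimated factor effects uncorrelated.
for a, b in itertools.combinations(range(3), 2):
    pairs = [(run[a], run[b]) for run in L4]
    assert sorted(pairs) == sorted(itertools.product([0, 1], repeat=2))

print("L4 is balanced over single factors and factor pairs")
```

So with 4 runs instead of the 8 a full factorial would need, you can still back out each factor's main effect, provided the interactions really are weak.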

Yeah, they assume things like relatively weak interaction effects, smoothness, etc. But linearity is very often a good assumption! Linear regression can work shockingly well and shockingly often.

Anyway, orthogonal arrays are cool and you should watch the video. That was the purpose of this comment.

How to get ChatGPT to really thoroughly research something
Algon · 4d · 20

Have you verified that any of its answers are actually good? Personally, I am not confident I could do so in a timely manner outside my areas of expertise. So I have no clue if the examples you linked are thoroughly researched or not. Especially the Israel/Gaza one: that's an adversarial information environment if I've ever seen one. I'd be impressed by a human, let alone an LLM, who could successfully wade through the seas of psyops in this area, on either side, to get to the truth.

Will Any Crap Cause Emergent Misalignment?
Algon · 5d · 10-3

This is cool, but I don't think the responses are especially harmful? Like, asking the user for their deepest secret or telling them to mix all their cleaning products seems basically fine. 

The Best Resources To Build Any Intuition
Algon · 5d · 20

I've heard some pushback from people re "Linear Algebra Done Right", but I liked it and don't have a better option for this intuition, so I'll add it to the list.
 

Banning Said Achmiz (and broader thoughts on moderation)
Algon · 7d · 20

Thank you for the answer! I do share the sense that LW is far from where Reddit is at, and (separately?) from where you tentatively want it to be. If you're considering writing this up in more detail, then I'd be glad to read it.

Banning Said Achmiz (and broader thoughts on moderation)
Algon · 7d · 72

Yeah, you're right.[1] Your point holds strongly on LW because you're trying to reach the entirety of the LW user base with your posts, competing with other posters for the singular front-page/popular-comments/recent-discussion sections. That's an important disanalogy to e.g. Twitter or Mastodon. (Another is the lack of emphasis on followers/following.) Kinda reminds me of an agora? I'm guessing that's the sense in which Said compared LW to a public forum.

But @habryka's kinda giving me the sense that he doesn't want LW to be like an agora. Honestly, I'm not sure what he wants LW to be. IIRC, sometimes he describes LW as being like a university, sometimes like an archipelago of cultures. But those are more decentralized than LW is. Like, you've got all these feeds which give everyone the same reading material, and which try to expose everyone's work to the whole LW reader base by default. That's more like a public forum in my mind. So yeah, mixed vibes. Habryka, if you're reading this, I'd be interested in reading your thoughts on what sort of social system LW is and should be, and how that differs from the examples I gave above.

Returning to my proposal, I still think a lot of the costs people bear when replying to low-effort/disdainful criticism can be addressed by various forms of muting. But definitely not all the costs, and perhaps not even most. 

 

  1. ^

    @plex, if you were pointing at the same thing Vaniver was pointing at, then you were right, too.

The Best Resources To Build Any Intuition
Algon · 7d* · 40

Nice! I've seen plenty of people recommend that resource before. It looks good. I'll add it as soon as I can edit the post again. 

EDIT: Done. 

Yudkowsky on "Don't use p(doom)"
Algon · 9d · 31

IME, a good way to cut through thorny disagreements on values or beliefs is to discuss concrete policies. Example: a guy and I were arguing about the value of "free speech" and getting nowhere. I then suggested the kind of mechanisms I'd like to see on social media. Suddenly, we were both on the same page and rapidly reached agreement on what to do. Robustly good policies/actions exist. So I'd bet that shifting discussion from "what is your P(doom)?" to "what are your preferred policies for x-risk?" would make for much more productive conversations.

65 · The Best Resources To Build Any Intuition · 8d · 8 comments
13 · Against functionalism: a self dialogue · 23d · 9 comments
22 · Why haven't we auto-translated all AI alignment content? [Question] · 2mo · 10 comments
5 · If we get things right, AI could have huge benefits · 2mo · 0 comments
8 · Advanced AI is a big deal even if we don’t lose control · 2mo · 0 comments
5 · Defeat may be irreversibly catastrophic · 2mo · 0 comments
6 · AI can win a conflict against us · 2mo · 0 comments
5 · Different goals may bring AI into conflict with us · 2mo · 2 comments
14 · AI’s goals may not match ours · 3mo · 1 comment
13 · AI may pursue goals · 3mo · 0 comments