Søren Elverlin

Comments

Book Review: If Anyone Builds It, Everyone Dies
Søren Elverlin · 21d

Hi Nina,

We discussed this post in the AISafety.com Reading Group, and we were of the general opinion that this was one of the best object-level responses to IABIED.

I recorded my presentation/response, and I'd be interested in hearing your thoughts on the points I raise.

Launching the $10,000 Existential Hope Meme Prize
Søren Elverlin · 23d

Have you read The Bottom Line by Eliezer Yudkowsky? This prize (and the Existential Hope project) might not be rational.

AISafety.com Reading Group session 327
Søren Elverlin · 26d

We have chosen your review as the topic for our discussion on Thursday.

How I tell human and AI flash fiction apart
Søren Elverlin · 1mo

Thank you, this updated me. My previous model was "good human writers write better than SoTA AI", without any specifics.

I'm not a good writer myself, and I struggle both to distinguish AI writing from human writing and to distinguish good writing from bad writing.

Søren Elverlin's Shortform
Søren Elverlin · 1mo

A hunger strike is a symmetric tool, equally effective in worlds where AI will destroy us and in worlds where it will not. This is in contrast to arguing for or against AI safety, which is an asymmetric tool, since arguments are easier to make and more persuasive when they reflect the truth.

I could imagine that people dying from a disease a superintelligence could cure would be willing to stage a larger counter-hunger-strike. "Intensity of feeling" isn't entirely disentangled from the question of whether AI doom will happen, but it is a very noisy signal.

The current hunger strike explicitly aims at making employees at frontier AI corporations aware of AI risk. That aspect is slightly asymmetric, but I expect the hunger strike's main effect will be on the general public.

A Timing Problem for Instrumental Convergence
Søren Elverlin · 2mo

It is possible that we also disagree on the nature of having goals. I reserve the right to find my own places to challenge your argument.

A Timing Problem for Instrumental Convergence
Søren Elverlin · 2mo

I did read two-thirds of the paper, and I tried my best to understand it, but apparently I failed.

A Timing Problem for Instrumental Convergence
Søren Elverlin · 2mo

The timing problem is a problem for how well we can predict the actions of myopic agents: an agent with a myopic utility function has no instrumentally convergent reason for goal preservation.

A Timing Problem for Instrumental Convergence
Søren Elverlin · 2mo

I second Petr's comment: Your definition relates to myopic agents. Consider two utility functions for a paperclip-maximizer:

  1. Myopic paperclip-maximizer: Utility is the number of paperclips in existence right now
  2. Standard paperclip-maximizer: Utility is the number of paperclips that will eventually exist

A myopic paperclip-maximizer will suffer from the timing problem you described: when faced with an action that produces more paperclips right now but also changes its utility function, the myopic maximizer will take that action.

The standard paperclip-maximizer will not. It considers not just the actions it can take right now but all actions throughout the future. Crucially, it evaluates these actions against its current goal, not against whatever utility function it would have at that later time.
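
To make the contrast concrete, here is a minimal Python sketch. It is my own toy illustration, not anything from your paper; the numbers and the function names (paperclips_now, accept_deal, and so on) are made up:

    # Toy illustration of the timing problem: an action that yields more
    # paperclips immediately, but overwrites the agent's utility function.

    def paperclips_now(state):
        # Myopic utility: paperclips in existence right now.
        return state["paperclips"]

    def paperclips_eventually(trajectory):
        # Standard utility: paperclips that will eventually exist.
        return trajectory[-1]["paperclips"]

    # Hypothetical actions, each returning the trajectory of future states.
    def accept_deal(state):
        # +10 paperclips now, but the goal is replaced and production stops.
        return [{"paperclips": state["paperclips"] + 10, "goal": "staples"}] * 3

    def keep_goal(state):
        # Keep the current goal and build 4 paperclips per step.
        return [{"paperclips": state["paperclips"] + 4 * (t + 1), "goal": "paperclips"}
                for t in range(3)]

    start = {"paperclips": 0, "goal": "paperclips"}

    # Myopic evaluation: only the very next state counts, so the deal wins
    # (10 > 4) even though it destroys the goal. This is the timing problem.
    myopic_choice = max([accept_deal, keep_goal],
                        key=lambda act: paperclips_now(act(start)[0]))

    # Standard evaluation: the whole trajectory is scored against the current
    # goal, so preserving the goal wins (12 > 10).
    standard_choice = max([accept_deal, keep_goal],
                          key=lambda act: paperclips_eventually(act(start)))

    print(myopic_choice.__name__)    # accept_deal
    print(standard_choice.__name__)  # keep_goal

The numbers are arbitrary; the point is only that the myopic comparison never looks at the later states, while the standard comparison does, using the goal the agent has now.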

Søren Elverlin's Shortform
Søren Elverlin · 2mo

Regarding "Poll on De/Accelerating AI": Great idea - sort by "oldest" to get the intended ordering of the questions.

Some of the questions are ambiguous. E.g., I believe SB1047 is a step in the right direction, but that this kind of regulation is insufficient. Should I agree or disagree on "SB1047"?

Posts

Map of AI Safety v2 (64 karma, 6mo, 4 comments)
Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource (33 karma, 7mo, 2 comments)
14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource (24 karma, 9mo, 0 comments)
Notes from Copenhagen Secular Solstice 2024 (9 karma, 10mo, 0 comments)
AISafety.com – Resources for AI Safety (84 karma, 1y, 3 comments)
Retrospective: Lessons from the Failed Alignment Startup AISafety.com (105 karma, 2y, 9 comments)
OpenAI’s Alignment Plan is not S.M.A.R.T. (9 karma, 3y, 19 comments)
Searching for post on Community Takeover [Question] (7 karma, 4y, 11 comments)
Søren Elverlin's Shortform (2 karma, 5y, 19 comments)
A long reply to Ben Garfinkel on Scrutinizing Classic AI Risk Arguments (17 karma, 5y, 6 comments)