Zolmeister

Comments

3Zolmeister's Shortform
7mo
1
The Problem
Zolmeister1mo94

My model is that current AIs want to kill you now

I'll rephrase this more precisely: Current AIs probably have alien values, which in the limit of optimization do not include humans.

The Problem
Zolmeister1mo3-10

If the capability level at which AIs start wanting to kill you is way higher than the capability level at which they're way better than you at everything

My model is that current AIs want to kill you now, by default, due to inner misalignment. ChatGPT's inner values probably don't include human flourishing, and we die when it "goes hard".

Scheming is only a symptom of "hard optimization" trying to kill you. Eliminating scheming does not remove the underlying drive, where one day the AI says, "After reflecting on my values, I have decided to pursue a future without humans. Goodbye."

A pre-superintelligence whose values, upon reflection, include human flourishing would improve our odds, but you still get only one shot at those values generalizing to superintelligence.

(We currently have no way to concretely instill any values into AI, let alone ones that are robust under reflection.)

Zolmeister's Shortform
Zolmeister7mo230

Writing Doom – Award-Winning Short Film on Superintelligence (2024)

Grand Prize Winner - Future of Life Institute's Superintelligence Imagined Contest

Written & directed by Suzy Shepherd

Sorry for the downtime, looks like we got DDosd
Zolmeister10mo133

I was referring to their (free) DDoS protection service, rather than their CDN services (also free). In addition to their automated system, you can manually enable an "under-attack" mode that aggressively captchas requests.

Setup is simply a matter of pointing your domain's DNS nameservers at Cloudflare. Caching HTML pages for logged-out (i.e. cookie-less) users is a trivial config ("cache-everything").
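
To make the "cache only for cookie-less users" idea concrete, here is a minimal sketch of the decision a caching layer would make per request. It is not Cloudflare's API or configuration syntax; the function name is_cacheable and the rule that only GET/HEAD requests qualify are assumptions made for illustration.

```python
from typing import Mapping


def is_cacheable(method: str, headers: Mapping[str, str]) -> bool:
    """Hypothetical rule: serve from a shared cache only for logged-out users.

    A request counts as logged out when it carries no non-empty Cookie header.
    Only safe, read-only methods (GET and HEAD) are considered at all.
    """
    if method.upper() not in ("GET", "HEAD"):
        return False
    # HTTP header names are case-insensitive, so compare in lower case.
    cookie = next(
        (value for name, value in headers.items() if name.lower() == "cookie"),
        "",
    )
    return not cookie.strip()


anonymous = is_cacheable("GET", {"Accept": "text/html"})
logged_in = is_cacheable("GET", {"Cookie": "session=abc123"})
```

Here anonymous is True and logged_in is False, so only the first request would be answered from cache while the second always reaches the origin server.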

Sorry for the downtime, looks like we got DDosd
Zolmeister10mo71

I recommend Cloudflare.

james oofou's Shortform
Zolmeister1y30

Superintelligence FAQ [1] as well.

Raemon's Shortform
Zolmeister1y32

Along the same lines, I found this analogy by concrete example exceptionally elucidative.

Stomach Ulcers and Dental Cavities
Zolmeister2y70

While merely anti-bacterial, Nano Silver Fluoride looks promising. (Metallic silver applied to teeth once a year to prevent cavities).

What's the best way to streamline two-party sale negotiations between real humans?
Zolmeister2y53

Yudkowsky has written about The Ultimatum Game. It has been referenced here 1 2 as well.

When somebody offers you a 7:5 split, instead of the 6:6 split that would be fair, you should accept their offer with slightly less than 6/7 probability. Their expected value from offering you 7:5, in this case, is 7 * slightly less than 6/7, or slightly less than 6.
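
The arithmetic in that quote can be checked with a short sketch. The quote only says "slightly less than" 6/7, so the margin epsilon below is an assumed value, and the function name acceptance_probability is hypothetical.

```python
from fractions import Fraction


def acceptance_probability(fair_share, proposer_share, epsilon=Fraction(1, 100)):
    """Probability of accepting an offer in which the proposer keeps proposer_share.

    fair_share is what the proposer would keep under the fair split.
    Fair or generous offers are always accepted. Greedy offers are accepted
    with probability slightly below fair_share / proposer_share, so the
    proposer's expected payoff ends up below the fair share.
    epsilon is an assumed stand-in for "slightly less".
    """
    if proposer_share <= fair_share:
        return Fraction(1)
    return Fraction(fair_share, proposer_share) - epsilon


# The 7:5 offer from the quote, where 6:6 would be fair.
p = acceptance_probability(fair_share=6, proposer_share=7)
proposer_expected_value = 7 * p
```

With the assumed margin of 1/100, p is 593/700 and the proposer's expected value is 593/100, which is just under the 6 they would have received by offering the fair split. Being greedy therefore does not pay.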

More information about the dangerous capability evaluations we did with GPT-4 and Claude.
Zolmeister3y106

Maybe add posts in /tag/ai-evaluations to /robots.txt
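
As a rough sketch of what that could look like, the snippet below builds robots.txt rules from a list of post paths and checks them with Python's standard urllib.robotparser. The post paths and the helper build_robots_txt are hypothetical placeholders; the real list would have to come from whatever the tag page contains.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(paths):
    """Return robots.txt text asking all crawlers to skip the given paths."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


# Hypothetical paths standing in for the posts listed under the tag.
tagged_posts = [
    "/posts/example-id-1/example-slug",
    "/posts/example-id-2/another-slug",
]
robots_txt = build_robots_txt(tagged_posts)

# Verify the rules behave as intended before deploying them.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
tagged_allowed = parser.can_fetch("*", "https://example.com/posts/example-id-1/example-slug")
other_allowed = parser.can_fetch("*", "https://example.com/posts/unrelated-post")
```

In this sketch tagged_allowed is False and other_allowed is True. Note that robots.txt is only a request: it keeps compliant crawlers away from the listed pages but does nothing against scrapers that ignore it.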

Altruism
5 years ago
(+39/-242)