Aligned AI Proposals

Posts tagged Aligned AI Proposals

Relevance | Karma | Title | Author | Age | Comments
5 | 64 | How to Control an LLM's Behavior (why my P(DOOM) went down) | | 6mo | 30
5 | 46 | Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor | | 4mo | 8
5 | 35 | Striking Implications for Learning Theory, Interpretability — and Safety? | | 4mo | 4
5 | 22 | Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI? | | 4mo | 4
5 | 21 | Requirements for a Basin of Attraction to Alignment | | 3mo | 6
5 | 4 | Alignment has a Basin of Attraction: Beyond the Orthogonality Thesis | | 4mo | 15
4 | 30 | Interpreting the Learning of Deceit | | 5mo | 10
2 | 161 | A list of core AI safety problems and how I hope to solve them | | 9mo | 26
2 | 113 | AI Alignment Metastrategy | | 5mo | 12
2 | 47 | Safety First: safety before full alignment. The deontic sufficiency hypothesis. | | 5mo | 3
2 | 45 | an Evangelion dialogue explaining the QACI alignment plan | | 1y | 15
2 | 31 | We have promising alignment plans with low taxes | | 6mo | 9
2 | 27 | The (partial) fallacy of dumb superintelligence | | 7mo | 5
2 | 11 | Two paths to win the AGI transition | Nathan Helm-Burger | 10mo | 8
2 | 8 | Desiderata for an AI | Nathan Helm-Burger | 10mo | 0