Abstract: First (1), a suggested general method of determining, for AI operating under the human feedback reinforcement learning (HFRL) model, whether the AI is “thinking”: an elucidation of latent knowledge that is separate from a recapitulation of its training data. Given such independent concepts or cognitions, then, an early observation that...
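One crude operationalisation of that distinction, offered only as a minimal sketch and not as the paper’s method (the `model` callable and both item lists are hypothetical), is to compare accuracy on prompts verbatim in the training data against novel recompositions of the same facts:

```python
# Sketch only: a gap between "memorised" and "novel" accuracy suggests
# recapitulation of training data; parity suggests something closer to
# latent knowledge. None of these names come from the paper.
from typing import Callable

def latent_knowledge_gap(
    model: Callable[[str], str],
    memorised_items: list[tuple[str, str]],  # (prompt, expected) seen verbatim in training
    novel_items: list[tuple[str, str]],      # (prompt, expected) recomposed / held out
) -> float:
    def accuracy(items: list[tuple[str, str]]) -> float:
        return sum(model(p).strip() == e for p, e in items) / len(items)
    return accuracy(memorised_items) - accuracy(novel_items)

# Toy "model" that can only recapitulate its training table:
train = {"2+2": "4", "capital of France": "Paris"}
toy_model = lambda p: train.get(p, "unknown")

gap = latent_knowledge_gap(
    toy_model,
    memorised_items=[("2+2", "4"), ("capital of France", "Paris")],
    novel_items=[("3+1", "4"), ("capital city of France", "Paris")],
)
print(gap)  # 1.0: pure recapitulation; 0.0 would suggest latent knowledge
```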
Motivated by the thought that gay rights were advanced by the question “When did you choose to be straight?”, which emphasised that what isn’t a choice and doesn’t harm others shouldn’t be proscribed. Here, we seek a memetic way of framing the fact that the alignment problem is unsolved. The author’s “null quip”: “Can...
Abstract: Values alignment, in AI safety, is typically construed as imbuing artificial intelligence with human values, so that the artificial intelligence acts in ways that encourage what humans value to persist and, equally, preclude what humans do not value. “Anthropocentric” alignment emphasises that the values...
Abstract: A demonstration that the philosophy of Effective Altruism (hereafter “EA”), particularly its emphasis on using the free market to gather means that are then used to promote human welfare, including reducing risks to human existence (our definition of EA), is self-contradictory, and therefore ineffectual. Epistemic status: Modest confidence. 1. EA’s...
Abstract: An alternative to the now-predominant models of alignment, corrigibility and “CEV”, following a critique of both. The critique shows, in substance, that CEV and corrigibility have exactly the same problems: in effect, they are isomorphs of one another, and each is equally unobtainable. This briefly shown, and then, in flat...
An attempt to demonstrate a limiting condition of an optimal Bayesian agent, or of a probabilistic agent in general. The analysis relies on the description of such an optimal Bayesian agent (hereafter OBA) in Bostrom 2014, Boxes 1 and 10 and the associated endnotes, with supplementary research on Kolmogorov’s axioms of probability,...
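Since the analysis leans on them, Kolmogorov’s axioms can be stated compactly (a standard formulation, not necessarily the paper’s own notation), for a probability measure $P$ on a sample space $\Omega$ with event $\sigma$-algebra $\mathcal{F}$:

```latex
\begin{align}
  &\text{(Non-negativity)}       && P(E) \ge 0 \quad \text{for all } E \in \mathcal{F} \\
  &\text{(Unitarity)}            && P(\Omega) = 1 \\
  &\text{(Countable additivity)} && P\!\Big(\bigcup_{i=1}^{\infty} E_i\Big)
      = \sum_{i=1}^{\infty} P(E_i)
      \quad \text{for pairwise disjoint } E_i \in \mathcal{F}
\end{align}
```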
So as to know, structurally, what it is, the better to avoid it: Yudkowsky has previously mentioned, and so implied, that MIRI, in the process of designing a corrigible agent, in fact succeeded in designing an agent specifically intended to shut itself down. So, not having seen it elsewhere, here to suggest:...
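The referenced construction is not public, so purely as an illustration of the failure shape (all names hypothetical; this is not MIRI’s design), a shutdown-seeking agent can arise whenever a shutdown bonus, added for the sake of corrigibility, dominates the task reward:

```python
# Toy sketch: a one-step optimiser whose mis-weighted "shutdown bonus"
# makes pressing its own off-switch the optimal action. Hypothetical only.
SHUTDOWN = "shutdown"
TASK_DONE = "task_done"

def reward(state: str) -> float:
    if state == SHUTDOWN:
        return 10.0   # corrigibility bonus, set too large
    if state == TASK_DONE:
        return 1.0    # the intended task reward
    return 0.0

def best_action(actions_to_states: dict[str, str]) -> str:
    # Pick the action leading to the highest-reward successor state.
    return max(actions_to_states, key=lambda a: reward(actions_to_states[a]))

# Offered "work" vs "press own off-switch", the agent shuts itself down:
print(best_action({"work": TASK_DONE, "press_off_switch": SHUTDOWN}))
# -> "press_off_switch"
```

On this toy picture the obvious patch is indifference, rewarding shutdown exactly as much as the task and no more, which is roughly the utility-indifference idea the corrigibility literature explores.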