Axioms (together with definitions) form the basis of mathematical theorems. Every mathematical theorem is proven only within its axiom system... (read more)
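A minimal illustration (my own example, not from the original): in a proof assistant such as Lean 4, a theorem is only provable relative to the axioms and definitions in scope.

```lean
-- Commutativity of addition on Nat follows from the Peano-style
-- axioms and definitions underlying Lean's `Nat`; under a different
-- axiom system the same statement would need a different proof,
-- or might not be provable at all.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```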
AI Control in the context of AI Alignment is a category of plans that aim to ensure safety and extract benefit from AI systems, even if they are goal-directed and actively trying to subvert your control measures. From The case for ensuring that powerful AIs are controlled... (read more)
The Open Agency Architecture ("OAA") is an AI alignment proposal by (among others) @davidad and @Eric Drexler. .. (read more)
Singular learning theory is a theory that applies algebraic geometry to statistical learning theory, developed by Sumio Watanabe. The reference textbooks are "the grey book", Algebraic Geometry and Statistical Learning Theory, and "the green book", Mathematical Theory of Bayesian Statistics.
Archetypal Transfer Learning (ATL) is a proposal by @whitehatStoic for what the author argues is a fine-tuning approach that "uses archetypal data" to "embed Synthetic Archetypes". These Synthetic Archetypes are derived from patterns that models assimilate from archetypal data, such as artificial stories. The method yielded a shutdown activation rate of 57.33% in the GPT-2-XL model after fine-tuning... (read more)
Open Threads are informal discussion areas, where users are welcome to post comments that didn't quite feel big enough to warrant a top-level post, nor fit in other posts... (read more)
A Black Marble is a technology that by default destroys the civilization that invents it. It's one type of Existential Risk. AGI may be such an invention, but isn't the only one... (read more)
AI Evaluations focus on experimentally assessing the capabilities, safety, and alignment of advanced AI systems. These evaluations can be divided into two main categories: behavioral and understanding-based... (read more)
Apart Research is an AI safety research and talent development organization. They host the Apart Sprints, large-scale international events for research experimentation. This tag includes posts written by Apart researchers and content about Apart Research.
Roman V. Yampolskiy has a paper (The Universe of Minds - 1 Oct 2014). I think it should be mentioned here.
This leads to the question: How can the original agent trust that these recursively generated agents maintain goals that are similar to the original agent's objective?
In a deterministic logical system, assuming that all agents share the same axioms, "trust" arises from being able to formally prove that the conclusions reached by any subsequently generated agents will be true. Whether this form of trust is attainable is constrained by Löb's theorem. The inability to form this trust is called the Löbian obstacle.
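For reference (a standard statement, not quoted from the original), Löb's theorem for a system such as Peano Arithmetic, with $\mathrm{Prov}$ its provability predicate, reads:

```latex
% Löb's theorem: if the system proves "if P is provable then P",
% then it already proves P outright.
\text{If } \; PA \vdash \mathrm{Prov}(\ulcorner P \urcorner) \rightarrow P,
\;\text{ then } \; PA \vdash P.
% Equivalent modal (provability-logic) form, reading \Box as "is provable":
\Box(\Box P \rightarrow P) \rightarrow \Box P
```

This is why an agent cannot in general prove "whatever my successor proves is true" without collapsing into proving everything, which is the Löbian obstacle mentioned above.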
“Strapping” is closely related to the idea of “AI takeoff.” The basic idea is that if AI is going to blast off into space sometime soon, it is probably a good idea to strap certain things to it so that those things can also blast off at the same time. Concretely, this means finding ways of using AI to improve certain processes, or integrating certain processes with AI such that as AI rapidly progresses, those things also rapidly progress.
Artificial Wisdom (AW) denotes artificial intelligence systems which substantially increase wisdom in the world.
This includes AI which increases human wisdom, as well as autonomous AI systems which operate independently with substantial wisdom.
While an AW system could be a stand-alone, deliberately designed wisdom-increasing AI system, AW could also be understood as a property which any AI system possesses to a greater or lesser degree; some AI systems may have high AW, and some may have low or negative AW, regardless of their level of intelligence.
Wisdom may be defined as "thinking/planning which is good at avoiding large-scale errors," including both errors of omission and errors of commission; or as “having good goals” including terminal goals and subgoals.
Here's a table of all the posts by year that were part of this process.
| Year | Process Announcements | Post-Review | Further Posts |
|------|-----------------------|-------------|---------------|
| 2018 | | | |
| 2019 | [No Review Phase Post] | | |
| 2020 | | | |
| 2021 | The LessWrong 2021 Review: Intellectual Circle Expansion [No Voting Phase Post] | Review AI Alignment posts to help figure out how to make a proper AI Alignment review | |
| 2022 | New Winner Art and Section | 2022 (and All Time) Posts by Pingback Count | |
Merge Candidate discussion: Merge this into the Apart Research tag, to reflect the updated name of the Apart Sprints (formerly Alignment Jam) and to avoid the mislabeling that currently happens between the two tags.