Archetypal Transfer Learning (ATL) is a proposal by @whitehatStoic for what the author argues is a fine-tuning approach that "uses archetypal data" to "embed Synthetic Archetypes". These Synthetic Archetypes are derived from patterns that models assimilate from archetypal data, such as artificial stories. The method yielded a shutdown activation rate of 57.33% in the GPT-2-XL model after fine-tuning.
Religion is a complex group of human activities involving commitment to a higher power, belief in belief, and a range of shared group practices such as worship meetings and rites of passage.
"The X-Men comics use terms like “evolution,” “mutation,” and “genetic code,” purely to place themselves in what they conceive to be the literary genre of science. The part that scares me is wondering how many people, especially in the media, understand science only as a literary genre." -- from Eliezer's post Science as Attire.
Nit: What do the "*" mean? I find them slightly distracting.
This tag is specifically for discussions about these formal constructs. For discussions about artificial intelligence, see AI. For discussions about human-level intelligence in a broader sense, see General Intelligence.
A generalization of Aumann's Agreement Theorem across M objectives and N agents, without assuming common priors. This framework also encompasses Debate, CIRL, and Iterated Amplification. See Nayebi (2025) for the formal definition, and see From Barriers to Alignment to the First Formal Corrigibility Guarantees for applications.
"Noticing confusion" means noticing the "tiny note of discord" (the quiet sense of "something is off") that our minds produce when what we're seeing differs from what we would have expected to see.
There is much rationality skill to be found in learning to raise this subtle signal to full consciousness, and to pay attention to it.
Related tags and wikis: Disagreement, Modesty, Modesty argument, Aumann agreement, The Aumann Game, ⟨M,N,ε,δ⟩-agreement
As far as I understand, * means something that one would want, agree to, decide, etc. under ideal reflection conditions (e.g., knowing most plausibly relevant arguments, being given a long time to think, etc.). See, e.g., CEV as defined not in relation to alignment targets, or Wei Dai's metaethical alternatives 3-5.