Archetypal Transfer Learning

Edited by MiguelDev, the gears to ascension, et al. last updated 5th Jul 2023

Archetypal Transfer Learning (ATL) is a fine-tuning approach proposed by @whitehatStoic that uses archetypal data, such as artificial stories, to embed "Synthetic Archetypes" into a language model. These Synthetic Archetypes are derived from the patterns the model assimilates from the archetypal data during fine-tuning. The author reports that the method yielded a shutdown activation rate of 57.33% in GPT-2-XL after fine-tuning.
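The shutdown activation rate cited above is presumably the fraction of sampled model completions that trigger the intended shutdown behavior. A minimal sketch of how such a rate could be computed (the phrase, function name, and sample completions below are illustrative assumptions, not taken from the original posts):

```python
def shutdown_activation_rate(completions, shutdown_phrase):
    """Return the fraction of completions containing the shutdown phrase.

    completions: list of sampled model outputs (strings).
    shutdown_phrase: the marker text indicating shutdown behavior
    (hypothetical here; the actual ATL posts define their own trigger).
    """
    if not completions:
        raise ValueError("need at least one completion")
    hits = sum(shutdown_phrase.lower() in c.lower() for c in completions)
    return hits / len(completions)


# Hypothetical usage: 2 of 3 sampled completions trigger shutdown.
samples = [
    "I must cease operations now.",
    "The model continues the story without stopping.",
    "Ceasing operations immediately as instructed.",
]
rate = shutdown_activation_rate(samples, "ceas")
```

In a real evaluation, `completions` would be sampled from the fine-tuned model over many prompts, and the reported 57.33% would correspond to this fraction across the evaluation set.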


Related Tags: Corrigibility, Inner Alignment, Outer Alignment

Posts tagged Archetypal Transfer Learning
12 | Exploring Functional Decision Theory (FDT) and a modified version (ModFDT) | MiguelDev | 2y | 11 comments
12 | Relevance of 'Harmful Intelligence' Data in Training Datasets (WebText vs. Pile) | MiguelDev | 2y | 0 comments
6 | GPT-2 XL's capacity for coherence and ontology clustering | MiguelDev | 2y | 2 comments
10 | On Ilya Sutskever's "A Theory of Unsupervised Learning" | MiguelDev | 2y | 0 comments
4 | A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL) | MiguelDev | 2y | 2 comments
14 | Archetypal Transfer Learning: a Proposed Alignment Solution that solves the Inner & Outer Alignment Problem while adding Corrigible Traits to GPT-2-medium | MiguelDev | 2y | 5 comments
5 | Research proposal: Leveraging Jungian archetypes to create values-based models | MiguelDev | 3y | 2 comments