Wiki-Tags in Need of Work

Scalable Oversight

Rational convergence towards certain norms, habits, social contracts, even though agents might have no way of communicating or affecting each other, nor even any direct evidence that the other exists

Just a research idea, not a whole damn agenda

AI alignment & monitoring & control techniques which consist of doing many sometimes-fail tricks rather than a single never-fails method. This tag applies both to posts proposing an entire bag and posts that offer a single trick.

A tag for conversations with, rather than about, digital minds. This tag needs expansion and checking for duplicate tags; I checked shallowly before creating it. Should it be merged with AI Evaluations?

Sparse Autoencoders (SAEs) are an unsupervised technique for decomposing the activations of a neural network into a sum of interpretable components (often referred to as features). Sparse Autoencoders may be useful for interpretability and related alignment agendas... (read more)

Pause AI is an AI advocacy organization with the aim of pausing AGI research due to concerns about future risks from AI. From their website:.. (read more)

A cost-benefit analysis is a method for quantitatively assigning values to different actions... (read more)

Shitposting is the noble art of saying stupid things on purpose in order to awe one's readers with the sheer nonsensicality of it all. It is said to have originated on 4chan, but it has been a part of the human spirit since long before that.

Axioms (together with definitions) form the basis of mathematical theorems. Every mathematical theorem is only proven inside its axiom system... (read more)
