LESSWRONG
Language Models
• Applied to [Linkpost] Faith and Fate: Limits of Transformers on Compositionality by Joe Kwon (5h ago)
• Applied to [Linkpost] Mapping Brains with Language Models: A Survey by Bogdan Ionut Cirstea (10h ago)
• Applied to MetaAI: less is less for alignment. by Cleo Nardo (3d ago)
• Applied to [Linkpost] Large Language Models Converge on Brain-Like Word Representations by Bogdan Ionut Cirstea (5d ago)
• Applied to Exploring Concept-Specific Slices in Weight Matrices for Network Interpretability by Raemon (7d ago)
• Applied to [Linkpost] Scaling laws for language encoding models in fMRI by Bogdan Ionut Cirstea (8d ago)
• Applied to LEAst-squares Concept Erasure (LEACE) by Raemon (9d ago)
• Applied to Unfaithful Explanations in Chain-of-Thought Prompting by miles (14d ago)
• Applied to Open Source LLMs Can Now Actively Lie by Josh Levy (15d ago)
• Applied to "LLMs Don't Have a Coherent Model of the World" - What it Means, Why it Matters by Kaj_Sotala (15d ago)
• Applied to Programming AGI is impossible by Áron Ecsenyi (17d ago)
• Applied to PaLM-2 & GPT-4 in "Extrapolating GPT-N performance" by Lukas Finnveden (17d ago)
• Applied to LIMA: Less Is More for Alignment by Raemon (17d ago)
• Applied to Aligning an H-JEPA agent via training on the outputs of an LLM-based "exemplary actor" by Roman Leventov (18d ago)
• Applied to An LLM-based “exemplary actor” by Roman Leventov (19d ago)
• Applied to Data and "tokens" a 30 year old human "trains" on by Jose Miguel Cruz y Celis (25d ago)
• Applied to Why I Believe LLMs Do Not Have Human-like Emotions by Raemon (25d ago)
• Applied to Transformer Architecture Choice for Resisting Prompt Injection and Jail-Breaking Attacks by RogerDearnaley (1mo ago)