LESSWRONG
Language Models
• Applied to Navigating LLM embedding spaces using archetype-based directions by mwatkins 3d ago
• Applied to Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant by Olli Järviniemi 5d ago
• Applied to On precise out-of-context steering by Olli Järviniemi 8d ago
• Applied to Mechanistically Eliciting Latent Behaviors in Language Models by Vanessa Kosoy 10d ago
• Applied to LLMs could be as conscious as human emulations, potentially by weightt an 11d ago
• Applied to An interesting mathematical model of how LLMs work by Bill Benzon 11d ago
• Applied to LLMs seem (relatively) safe by JustisMills 15d ago
• Applied to At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan” by Bill Benzon 16d ago
• Applied to How LLMs Work, in the Style of The Economist by Rocket 18d ago
• Applied to What's up with all the non-Mormons? Weirdly specific universalities across LLMs by mwatkins 21d ago
• Applied to Inducing Unprompted Misalignment in LLMs by Sam Svenningsen 22d ago
• Applied to An examination of GPT-2's boring yet effective glitch by niplav 23d ago
• Applied to Claude 3 Opus can operate as a Turing machine by Gunnar_Zarncke 24d ago
• Applied to Experiments with an alternative method to promote sparsity in sparse autoencoders by Eoin Farrell 25d ago
• Applied to Claude wants to be conscious by Joe Kwon 1mo ago
• Applied to Barcoding LLM Training Data Subsets. Anyone trying this for interpretability? by right..enough? 1mo ago
• Applied to Is LLM Translation Without Rosetta Stone possible? by cubefox 1mo ago
• Applied to End-to-end hacking with language models by tchauvin 1mo ago
• Applied to Language and Capabilities: Testing LLM Mathematical Abilities Across Languages by Ethan Edwards 1mo ago