Apart Research

Edited by Esben Kran, habryka, and Jason Hoelscher-Obermaier. Last updated 18th Jul 2024.

Apart Research is an AI safety research lab. It hosts the Apart Sprints, large-scale international events for research experimentation. This tag covers posts written by Apart researchers and content about Apart Research.

Posts tagged Apart Research
26 karma · Newsletter for Alignment Research: The ML Safety Updates · Esben Kran · 3y · 0 comments
9 karma · Black Box Investigation Research Hackathon · Esben Kran, Jonas Hallgren · 3y · 4 comments
143 karma · We Found An Neuron in GPT-2 · Ω · Joseph Miller, Clement Neo · 3y · 23 comments
38 karma · Safety timelines: How long will it take to solve alignment? · Esben Kran, JonathanRystroem, Steinthal · 3y · 7 comments
33 karma · Deceptive agents can collude to hide dangerous features in SAEs · Simon Lermen, Mateusz Dziemian · 1y · 2 comments
24 karma · AI Safety Ideas: A collaborative AI safety research platform · Esben Kran · 3y · 0 comments
22 karma · Results from the language model hackathon · Esben Kran · 3y · 1 comment
9 karma · Analysing Adversarial Attacks with Linear Probing · Ω · Yoann Poupart, Imene Kerboua, Clement Neo, Jason Hoelscher-Obermaier · 1y · 0 comments
119 karma · Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1 · Ω · StefanHex, Marius Hobbhahn · 2y · 1 comment
81 karma · Results from the interpretability hackathon · Esben Kran, Neel Nanda · 3y · 0 comments
71 karma · Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2 · Ω · StefanHex, Marius Hobbhahn · 2y · 1 comment
47 karma · Robustness of Model-Graded Evaluations and Automated Interpretability · Ω · Simon Lermen, viluon · 2y · 5 comments
47 karma · How-to Transformer Mechanistic Interpretability—in 50 lines of code or less! · Ω · StefanHex · 3y · 5 comments
44 karma · College technical AI safety hackathon retrospective - Georgia Tech · yix · 10mo · 2 comments
34 karma · Computational Mechanics Hackathon (June 1 & 2) · Adam Shai · 1y · 5 comments