Verification
This page is a stub.
Posts tagged Verification
Validating against a misalignment detector is very different to training against one (Ω; mattmacdermott; 6mo; karma 39; comments 4)
Formal verification, heuristic explanations and surprise accounting (Ω; Jacob_Hilton; 1y; karma 156; comments 11)
Compact Proofs of Model Performance via Mechanistic Interpretability (Ω; LawrenceC, rajashree, Adrià Garriga-alonso, Jason Gross; 1y; karma 104; comments 4)
The V&V method - A step towards safer AGI (Yoav Hollander; 2mo; karma 20; comments 1)
Making it harder for an AGI to "trick" us, with STVs (Ω; Tor Økland Barstad; 3y; karma 15; comments 5)
Alignment with argument-networks and assessment-predictions (Ω; Tor Økland Barstad; 3y; karma 10; comments 5)