Truthful AI

This page is a stub.
Posts tagged Truthful AI

Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses
TurnTrout, 10mo, 65 karma

New, improved multiple-choice TruthfulQA
Owain_Evans, James Chua, Steph Lin, 10mo, 72 karma

A tension between two prosaic alignment subgoals
Alex Lawsen, 3y, 31 karma

How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots
Owain_Evans, 2y, 27 karma

Truthfulness, standards and credibility
Joe Collman, 4y, 12 karma

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak, Sam F. Brown, 2y, 49 karma