erik

Scoping LLMs

Emile Delcourt, David Baek, Adriano Hernandez, Erik Nordby, with advising from Apart Lab Studio. Introduction & Problem Statement: Helpful, Harmless, and Honest ("HHH", Askell 2021) is a framework for aligning large language models (LLMs) with human values and expectations. In this context, "helpful" means the model strives to assist users...

Apr 10, 2025

Introduction & Problem Statement