x

LESSWRONG

LW

4gate — LessWrong

4gate

4gate

Message

8

Ω

1

1

14

4y

4gate

8

Ω

1

4y

Scoping LLMs

by erik, David Baek, emile delcourt, and 4gate

Emile Delcourt, David Baek, Adriano Hernandez, Erik Nordby with advising from Apart Lab Studio Introduction & Problem Statement Helpful, Harmless, and Honest (”HHH”, Askell 2021) is a framework for aligning large language models (LLMs) with human values and expectations. In this context, "helpful" means the model strives to assist users...

Apr 10, 2025•4

What's going on with Per-Component Weight Updates?

Hi all, this is my first post on LW. It's a small one, but I want improve my writing, get into the habit of sharing my work, and maybe exchange some ideas in case anyone has already gotten further along some projection of my trajectory. TLDR: I looked at the...

Aug 22, 2024•1