Would you be a better RLHF labeler than GPT-4?

kache


27th Mar 2023


Have you ever tried to label your own data, or to run an evaluation yourself on a system you built? It's difficult. We humans are imperfect: we are terrible at labeling, expensive, and difficult to scale.

Let's say you wanted to create a high-quality instruct dataset today. Which would produce better output: a high-resource model, or a farm of hired humans?

How many human RLHF labels are incorrect?

How many benchmarks are our machines now superhuman at?

If we aren't already bootstrapping models on model-generated labels, we will be soon.


Vladimir_Nesov (3y):

See Alpaca, Constitutional AI, pre-training with preferences.
