Why GPT wants to mesa-optimize & how we might change this

Oh, I had actually seen that paper. Forgot that they did that though. Thanks!

This makes me wonder: how would Monte Carlo tree search do for GPT? And could you do AlphaGo-style IDA (iterated distillation and amplification)?

You'd need an analogue of the value network (or value head). (Where current GPT seems analogous to the policy network.) And then ideally you'd also want some analogue of winning / losing to ground out the evaluation.

Maybe you could set it up like this --

  1. start with a task description like, "write a poem in the style of e.e. cummings about the romance between cryptographers Alice and Bob"
  2. feed the task description (with some boilerplate) into GPT, and have it start generating continuations
  3. do MCTS on the continuations; use your value network (head) to evaluate the continuations vs the task description; update the policy network based on the evaluations
  4. include an "is done" head and evaluate it to decide when to stop
  5. send completed works to humans to provide feedback; the feedback should include separate scores for "good so far" for the value head, and "is a completed work" for the "is done" head.
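To make the search part of the steps above concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: `policy`, `value`, and `is_done` are hypothetical stubs standing in for GPT, the value head, and the "is done" head, the six-word vocabulary is made up, and the selection rule is a simple PUCT-style formula rather than anything GPT or AlphaGo is confirmed to use. The training parts (updating the policy from the evaluations in step 3, and the human feedback in step 5) are left out.

```python
import math

# Hypothetical toy setup: a tiny vocabulary instead of GPT's token set.
VOCAB = ["alice", "bob", "cipher", "love", "key", "the"]
MAX_LEN = 4  # stand-in for the "is done" head: stop after 4 tokens


def policy(prefix):
    """Stub policy network: a uniform prior over next tokens."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}


def value(prefix, task):
    """Stub value head: fraction of distinct task words the prefix covers."""
    task_words = set(task.split())
    if not task_words:
        return 0.0
    return len(task_words & set(prefix)) / len(task_words)


def is_done(prefix):
    """Stub "is done" head: a fixed length cutoff."""
    return len(prefix) >= MAX_LEN


class Node:
    def __init__(self, prefix, prior):
        self.prefix = prefix      # tuple of tokens generated so far
        self.prior = prior        # policy probability of reaching this node
        self.children = {}        # token -> Node
        self.visits = 0
        self.value_sum = 0.0

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.0):
    """Pick the child maximizing mean value plus a prior-weighted bonus."""
    def score(child):
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.mean_value() + bonus
    return max(node.children.values(), key=score)


def mcts_next_token(prefix, task, simulations=200):
    """Run MCTS from the current prefix and return the most-visited token."""
    root = Node(tuple(prefix), 1.0)
    for _ in range(simulations):
        node, path = root, [root]
        while node.children:                      # selection
            node = select_child(node)
            path.append(node)
        if not is_done(node.prefix):              # expansion
            for tok, p in policy(node.prefix).items():
                node.children[tok] = Node(node.prefix + (tok,), p)
        v = value(node.prefix, task)              # evaluation by the value stub
        for n in path:                            # backup
            n.visits += 1
            n.value_sum += v
    return max(root.children, key=lambda t: root.children[t].visits)


def generate(task):
    """Generate tokens one at a time until the "is done" stub fires."""
    prefix = []
    while not is_done(prefix):
        prefix.append(mcts_next_token(prefix, task))
    return prefix
```

Calling `generate("alice bob love")` returns a list of `MAX_LEN` tokens chosen by search. In a real version, the visit counts at the root could plausibly serve as a training target for the policy, and human scores for finished works would train the value and "is done" heads, but that is speculation on top of the sketch.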

I'd be curious whether this would enable GPT to significantly improve. Specifically, would you be able to generate longer works with less intervention?

Comparing Utilities

G: A 50/50 gamble between A and B

This bit in the first drawing should say "... between A and C", right?

rohinmshah's Shortform

It's totally possible that there's no way to get to reliably correct answers and you instead want decisions that are good regardless of what the answer is.

Good point!

Open & Welcome Thread - September 2020

It seems like teaching that fact, and instilling moral uncertainty in general into children

I would guess that teaching that fact is not enough to instill moral uncertainty. And that instilling moral uncertainty would be very hard.

rohinmshah's Shortform

Sounds like you probably disagree with the (exaggeratedly stated) point made here then, yeah?

(My own take is the cop-out-like, "it depends". I think how much you ought to defer to experts varies a lot based on what the topic is, what the specific question is, details of your own personal characteristics, how much thought you've put into it, etc.)

rohinmshah's Shortform

Options 1 & 2 sound to me a lot like inside view and outside view. Fair?

ESRogs's Shortform

Also asked on Twitter here.

ESRogs's Shortform

If GPT-3 is what you get when you do a massive amount of unsupervised learning on internet text, what do you get when you do a massive amount of unsupervised learning on video data from cars?

(In other words, can we expect anything interesting to come from Tesla's Dojo project, besides just better autopilot?)

The Box Spread Trick: Get rich slightly faster

the FDIC does not directly buy the spread; they allow the CD to exist, which the buyers should have access to

I'm not sure how important FDIC insurance is to the story here, but it's worth noting that coverage is capped at $250k per depositor, per insured bank (for each account ownership category). So I don't think financial institutions would have access to an unlimited amount of it.

