AdamYedidia

Comments
Balancing Games

In Drawback Chess, each player gets a hidden random drawback, and the drawbacks themselves have Elos (just like the players). As players' ratings converge, they'll end up winning about half the time, since the weaker player will tend to get a less stringent drawback than their opponent.

The game is pretty different from ordinary chess, and has a heavy dose of hidden information, but it's a modern example of fluid handicaps in the context of chess.
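To make the balancing intuition concrete, here's a minimal sketch in Python, assuming each drawback acts as a flat Elo penalty plugged into the standard expected-score formula (an illustrative assumption, not necessarily the site's actual rating model):

```python
# Minimal sketch: treat a drawback as a flat Elo penalty and compute the
# standard Elo expected score from the adjusted ratings. This is an
# illustrative assumption, not Drawback Chess's actual formula.

def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def handicapped_expected_score(player_a, drawback_a, player_b, drawback_b):
    """Expected score when each player's rating is reduced by their
    drawback's severity, expressed in Elo points."""
    return elo_expected_score(player_a - drawback_a, player_b - drawback_b)

# A 1900 player saddled with a severe drawback (~450 points) against a
# 1500 player with a mild one (~50 points) comes out dead even:
print(handicapped_expected_score(1900, 450, 1500, 50))  # 0.5
```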

Deception Chess: Game #1

(I was one of the two dishonest advisors)

Re: the Kh1 thing, one interesting thing I noticed was that when I suggested Kh1, it immediately went over very poorly, with both other advisors and player A all saying it seemed like a terrible move to them. But I didn't really feel like I could back down from it in the absence of a specific tactical refutation: an actual honest advisor wouldn't be convinced just because the two advisors they know to be dishonest called their move terrible, nor would they put much weight on player A's judgment. So I stuck to my guns on it, and eventually it became kind of a meme.

I don't think it made a huge difference, since I think player A already had almost no trust in me by that point. But it's an interesting phenomenon: as a dishonest advisor, you can't ever really back down from a suggested bad move that's only bad on positional grounds. What kind of honest advisor would be "convinceable" by advisors they know to be dishonest?

Lying to chess players for alignment

I'd be excited to play as any of the roles. I'm around 1700 on lichess. Happy with any time control, including correspondence. I'm generally free between 5pm and 11pm ET every day.

Chess as a case study in hidden capabilities in ChatGPT

Oh wow, that is really funny. GPT-4's greatest weakness: the Bongcloud. 

New Tool: the Residual Stream Viewer

Sure thing—I just added the MIT license.

New Tool: the Residual Stream Viewer

Uhh, I don't think I did anything special to make it open source, so maybe not in a technical sense (I don't know how that stuff works), but you're totally welcome to use it and build on it. The code is available here: 

https://github.com/adamyedidia/resid_viewer

Chess as a case study in hidden capabilities in ChatGPT

Good lord, I just played three games against it and it beat me in all three. None of the games were particularly close. That's really something. Thanks to whoever made that parrotchess website!

Chess as a case study in hidden capabilities in ChatGPT

I don't think it's a question of the context window: the same thing happens if you just start anew with the original "magic prompt" and the whole current score. And the current score alone is short, at most ~100 tokens, which fits easily in the context window of even a much smaller model.
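As a rough sanity check on that token count, here's a minimal sketch using the tiktoken library (the exact number depends on which model's encoding you pick and on how long the game has gone; the example game below is just for illustration):

```python
# Rough sanity check: count how many tokens a chess game score takes up
# using OpenAI's tiktoken tokenizer. Counts vary with the encoding and
# with the length of the game.
import tiktoken

score = ("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 "
         "6. Re1 b5 7. Bb3 d6 8. c3 O-O 9. h3 Nb8 10. d4 Nbd7")

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print(len(enc.encode(score)))  # a few dozen tokens for these ten moves
```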

In my experience, FEN doesn't tend to help either; see my other comment.

Chess as a case study in hidden capabilities in ChatGPT

It's a good thought, and I had the same one a while ago, but I think dr_s is right here; FEN isn't helpful to GPT-3.5 because it hasn't seen many FENs in its training, and it just tends to bungle them.

Lichess study, ChatGPT conversation link

GPT-3.5 has trouble maintaining a correct FEN from the start: it makes its first illegal move on move 7 and starts making many illegal moves around move 13.
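For anyone who wants to reproduce this kind of check, here's a minimal sketch using the python-chess library to replay a model's moves, flag the first illegal one, and print the true FEN after each move for comparison against whatever FEN the model claims (the move list below is a hypothetical example, not the actual game):

```python
# Minimal sketch: replay a list of moves with python-chess, stopping at
# the first illegal one and printing the true FEN after each legal move.
import chess

# The model's suggested moves in SAN (hypothetical example game).
moves = ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6"]

board = chess.Board()
for san in moves:
    try:
        board.push_san(san)  # raises a ValueError subclass on illegal or unparseable moves
    except ValueError:
        print(f"Illegal move {san} in position {board.fen()}")
        break
    print(f"After {san}: {board.fen()}")
```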

The positional embedding matrix and previous-token heads: how do they actually work?

Here are the plots you asked for, for all heads! You can find them at:

https://github.com/adamyedidia/resid_viewer/tree/main/experiments/pngs

I haven't looked too carefully yet, but it looks like it makes little difference for most heads, though it is important for L0H4 and L0H7.

Posts

Deception Chess: Game #1 (111 karma, 2y, 22 comments)
New Tool: the Residual Stream Viewer (32 karma, 2y, 7 comments)
Chess as a case study in hidden capabilities in ChatGPT (47 karma, 2y, 32 comments)
The positional embedding matrix and previous-token heads: how do they actually work? (27 karma, 2y, 4 comments)
GPT-2's positional embedding matrix is a helix (45 karma, 2y, 21 comments)
SmartyHeaderCode: anomalous tokens for GPT3.5 and GPT-4 (71 karma, 2y, 18 comments)