LESSWRONG
Mario Schlosser
LLMs and computation complexity
Mario Schlosser · 2y

Agree. If GPT-4 can solve 3-dimensional matrix multiplication with chain-of-thought, doesn't that mean you could take the last layer's output (before a single token is generated from it), feed it into other instances of GPT-4, and chain their outputs together? That should by definition be enough "internal state-keeping" that you wouldn't need the "note-keeping" of chain-of-thought. And that's precisely bayesed's point: from the outside, that kind of construct would just look like a bigger LLM. I think this is a clever post, but the bottleneck created by token generation is too arbitrary a way to assess LLM complexity.
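To make the construct concrete, here is a rough toy sketch in Python. Nothing in it is GPT-4 or any real model API: `toy_forward`, `to_token`, `embed`, `chain_hidden`, and `chain_tokens` are all hypothetical names, and the "forward pass" is just a made-up vector update. It only illustrates the difference between passing the full last-layer vector to the next instance and squeezing it through a single token first.

```python
from typing import List

Vector = List[float]


def toy_forward(state: Vector) -> Vector:
    """Hypothetical stand-in for one full forward pass.

    Rotates the vector left by one position and adds 1 to every entry.
    """
    rotated = state[1:] + state[:1]
    return [x + 1.0 for x in rotated]


def to_token(state: Vector) -> int:
    """Token bottleneck: keep only the index of the largest entry."""
    return max(range(len(state)), key=lambda i: state[i])


def embed(token: int, width: int) -> Vector:
    """Turn a token back into a vector (one-hot) for the next pass."""
    return [1.0 if i == token else 0.0 for i in range(width)]


def chain_hidden(state: Vector, steps: int) -> Vector:
    """Chain instances by handing over the full output vector each time."""
    for _ in range(steps):
        state = toy_forward(state)
    return state


def chain_tokens(state: Vector, steps: int) -> Vector:
    """Chain instances, but collapse to one token between passes."""
    width = len(state)
    for _ in range(steps):
        state = embed(to_token(toy_forward(state)), width)
    return state


a = [0.5, 2.0, -1.0]
b = [0.5, 3.0, -1.0]

print(chain_hidden(a, 3))  # [3.5, 5.0, 2.0]
print(chain_hidden(b, 3))  # [3.5, 6.0, 2.0]
print(chain_tokens(a, 3))  # [0.0, 1.0, 0.0]
print(chain_tokens(b, 3))  # [0.0, 1.0, 0.0]
```

In this toy, the two different inputs stay distinguishable when the full vector is handed along, but become identical once each step is forced through a single token. Seen from the outside, `chain_hidden` is just one deeper function, which is the "it would look like a bigger LLM" point.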
