Sophie Y
Comments

How does GPT-3 spend its 175B parameters?
Sophie Y, 2y

The "Not in GPT" label on the architecture diagram seems to be wrong. GPT is decoder-only, yet the part labeled "Not in GPT" is the decoder.
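
To make "decoder-only" concrete, here is a minimal sketch in plain Python. It is not taken from the post: the tuple names and the `causal_attention_weights` helper are illustrative assumptions, and the sublayer lists follow the standard description of the original Transformer figure.

```python
import math

# Sublayers per block in the original encoder-decoder Transformer.
# These names are illustrative labels, not identifiers from the post.
ENCODER_BLOCK = ("self_attention", "feed_forward")
ORIGINAL_DECODER_BLOCK = ("masked_self_attention", "cross_attention", "feed_forward")

# A GPT-style block keeps the causal (masked) self-attention of the decoder
# but has no cross-attention, because there is no encoder output to attend to.
GPT_BLOCK = ("masked_self_attention", "feed_forward")


def causal_attention_weights(scores):
    """Row-wise softmax over scores[i][j], restricted to positions j <= i.

    `scores` is a square list of lists of raw attention scores. Position i
    may only attend to itself and earlier positions; later positions get
    weight 0.0, which is what "masked" means in masked self-attention.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                        # positions 0..i only
        peak = max(visible)                           # subtract max for stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        masked_tail = [0.0] * (len(row) - i - 1)      # future positions
        weights.append([e / total for e in exps] + masked_tail)
    return weights
```

Calling `causal_attention_weights` on any square score matrix returns rows that each sum to 1, with every entry above the diagonal equal to zero.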
