Comments

Apparently a "Sydney" model existed at least as early as 17 Dec 2021.

Why do you think that GPT-3 has untied embeddings?