What are the most important papers/post/resources to read to understand more of GPT-3?
Answer by Juraj Vitko, Aug 04, 2020

Here's a list of resources that may be of use to you. The GPT-3 paper isn't very specific about implementation details, because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look in the Transformer lineage). So the amount of background needed to understand GPT-3 is broader than one might expect.

  • https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/nlp/01_Exploring_Word_Embeddings.ipynb
  • http://www.peterbloem.nl/blog/transformers
  • http://jalammar.github.io/illustrated-transformer/
  • https://amaarora.github.io/2020/02/18/annotatedGPT2.html
  • http://jalammar.github.io/illustrated-gpt2/
  • http://jalammar.github.io/how-gpt3-works-visualizations-animations/
  • https://arxiv.org/pdf/1409.0473.pdf Attention (initial)
  • https://arxiv.org/pdf/1706.03762.pdf Attention Is All You Need
  • http://nlp.seas.harvard.edu/2018/04/03/attention.html (annotated)
  • https://www.arxiv-vanity.com/papers/1904.02679/ Visualizing Attention
  • https://stats.stackexchange.com/questions/421935/what-exactly-are-keys-queries-and-values-in-attention-mechanisms
  • https://arxiv.org/pdf/1807.03819.pdf Universal Transformers
  • https://arxiv.org/pdf/2007.14062.pdf Big Bird (see appendices)
  • https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_breaking_the_quadratic_attention_bottleneck_in/
  • https://www.tensorflow.org/tutorials/text/transformer
  • https://www.tensorflow.org/tutorials/text/nmt_with_attention
  • https://cdn.openai.com/blocksparse/blocksparsepaper.pdf
  • https://openai.com/blog/block-sparse-gpu-kernels/
  • https://github.com/pbloem/former/blob/master/former/transformers.py
  • https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py
  • https://github.com/google/trax/blob/master/trax/models/transformer.py
  • https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_gpt2.py
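
As a rough orientation for the attention material above, here is a minimal sketch of causal scaled dot-product self-attention, the operation from "Attention Is All You Need" as used in decoder-only models like GPT-2. It is not taken from any of the linked implementations. The names `softmax` and `causal_self_attention` are made up for illustration, and the sketch assumes the queries, keys, and values are passed in directly, whereas a real model computes them with learned linear projections and uses multiple heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of vectors (lists of floats), one per position.
    Position i may only attend to positions 0..i.
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Dot product of the query with every key up to and including i,
        # scaled by sqrt(d_k). Later positions are simply left out, which
        # is equivalent to masking them before the softmax.
        scores = [
            sum(qj * kj for qj, kj in zip(q, keys[t])) / math.sqrt(d_k)
            for t in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the visible values.
        out = [
            sum(w * values[t][c] for t, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: three positions with 2-dimensional vectors, reused as Q, K and V.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_self_attention(x, x, x)
```

The first position can only see itself, so its output equals its own value vector. Every later output is a weighted average of the value vectors at or before that position.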