All of jade's Comments + Replies

Thanks! I had actually skimmed this recently but forgot to add it to my reading list. The cherry-picked examples for text generation seem a bit low-information, but it would be interesting to see their technique applied to a larger model.

Hugging Face has a nice guide that covers popular approaches to generation circa 2020. I recently read about tail-free sampling as well. I'm sure other techniques have been developed since then, though I'm not immersed enough in the NLP state of the art to be aware of them.

Thanks, I added a parenthetical sentence to indicate this possibility.
If you're curious, the most interesting pure stochastic sampling variant I've seen lately is: "Contrastive Search Is What You Need For Neural Text Generation", Su & Collier 2022. (Unfortunately, only benchmarked on very small models and AFAIK no one has generated samples from large GPT-3 scale models or provided quantitative/qualitative description.)

For context, I have moderate experience working with LLMs, and I think this is a great summary for laypeople. I've observed humans have a general tendency to anthropomorphize behavior that seems "intelligent", and it seems more productive to resist that tendency and seek better explanations.

At risk of becoming too technical, one topic that could help bridge the "From predictor to generator" and "A better guessing machine" sections is a bit more detail on how outputs are chosen in modern models. The greedy (choose the most likely next word) and random (samp...
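
To make the two strategies named above concrete, here is a rough sketch of greedy selection next to random selection. The comment is cut off, so I'm assuming "random" means sampling the next word in proportion to its predicted probability. The three-word vocabulary, the scores, and the function names are all made up for illustration and don't come from any real model or library.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(tokens, logits):
    """Greedy: always return the single highest-scoring token."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return tokens[best]

def sample_pick(tokens, logits, temperature=1.0, rng=random):
    """Random (assumed meaning): draw a token in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical next-word candidates and scores, invented for this example.
tokens = ["cat", "dog", "pizza"]
logits = [2.0, 1.5, -1.0]

print(greedy_pick(tokens, logits))   # same answer every run
print(sample_pick(tokens, logits))   # can differ from run to run
```

Greedy gives the same continuation every time, while sampling trades some of that predictability for variety. The temperature parameter is one common way to tune that trade-off.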

Thanks! I was not aware of beam search. Any good references to learn about it?