Localizing Finetuned Information in Transformers with Dynamic Weight Grafting
This is a write-up of "Multiple Streams of Knowledge Retrieval: Enriching and Recalling in Transformers", work with David Reber, Sean Richardson & Ari Holtzman. Code is available here. This is cross-posted from https://toddnief.com/articles/dynamic-weight-grafting/

When a new pope is elected and we want an LLM to answer “Who’s the pope?”...