Zijie J. Wang, Robert Turko, Duen Horng Chau, Dodrio: Exploring Transformer Models with Interactive Visualization, arXiv:2103.14625 [cs.CL].

Abstract: Why do large pre-trained transformer-based models perform so well across a wide variety of NLP tasks? Recent research suggests the key may lie in the multi-headed attention mechanism's ability to learn and represent linguistic information. Understanding how these models represent both syntactic and semantic knowledge is vital for investigating why they succeed and fail, what they have learned, and how they can improve. We present Dodrio, an open-source interactive visualization tool that helps NLP researchers and practitioners analyze attention mechanisms in transformer-based models with linguistic knowledge. Dodrio tightly integrates an overview that summarizes the roles of different attention heads with detailed views that help users compare attention weights against the syntactic structure and semantic information in the input text. To facilitate this visual comparison, Dodrio applies different graph visualization techniques to represent attention weights in a way that scales to longer input text. Case studies highlight how Dodrio provides insights into understanding the attention mechanism in transformer-based models. Dodrio is available at this https URL.
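The attention weights Dodrio visualizes are the per-head matrices produced by scaled dot-product attention: for each head, every token gets a probability distribution over all tokens in the sentence. A minimal sketch of how those matrices arise (using random projection matrices as placeholders for the learned parameters, which a trained model would supply):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention_weights(x, n_heads, rng):
    """Return per-head attention weight matrices for one input sequence.

    x: (seq_len, d_model) token embeddings. The query/key projections here
    are random placeholders; in a trained transformer they are learned.
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    weights = []
    for _ in range(n_heads):
        # Hypothetical random projections standing in for learned W_Q, W_K.
        w_q = rng.standard_normal((d_model, d_head))
        w_k = rng.standard_normal((d_model, d_head))
        q, k = x @ w_q, x @ w_k
        # Scaled dot-product attention: each row is a distribution over tokens.
        weights.append(softmax(q @ k.T / np.sqrt(d_head)))
    return np.stack(weights)  # shape: (n_heads, seq_len, seq_len)

rng = np.random.default_rng(0)
attn = multi_head_attention_weights(rng.standard_normal((5, 16)), n_heads=4, rng=rng)
print(attn.shape)  # (4, 5, 5): one 5x5 weight matrix per head
```

Each of these head-level matrices is what Dodrio renders as a graph over the sentence, so that a head's weights can be compared side by side with a dependency parse or semantic annotations.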

From the documentation at the interactive website:

Dodrio addresses the challenges of interpreting attention weights through an interactive visualization system that provides attention head summarization alongside semantic and syntactic knowledge contexts. After identifying the linguistic properties that an attention head attends to in the Attention Head Overview (bottom right), you can click the attention head to explore the semantic and syntactic significance of the sentence at that head. If you are interested in the lexical dependencies in a sentence, you can explore a syntactically important head in the Dependency View and accompanying Comparison View (top), while semantically important heads can be investigated in the Semantic Attention Graph view (bottom left). We encourage you to further investigate the multi-headed attention mechanism across various text instances with interesting linguistic features (e.g., coreferences, word sense) in the Instance Selection View by clicking the appropriate icon in the toolbar at the top of the interface.
