In my previous post I went over some common approaches for updating LMs with fresh knowledge. Here, I detail a specific approach that has gained popularity in recent years - locating and editing factual associations in language models. I do not believe in this approach; still, in this post I try to summarize it fairly and explain why I don't quite like it.
Do LMs Store Facts in Their Weights?
Language models built on the transformer architecture include Feed-Forward Networks (FFNs) as an important subcomponent. Within each transformer layer, the FFN itself consists of two sublayers.
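To make that structure concrete, here is a minimal PyTorch sketch of a transformer FFN block; the dimensions (d_model, d_ff) and the GELU activation are illustrative assumptions, not details of any particular model.

```python
import torch
import torch.nn as nn

class FFN(nn.Module):
    """A minimal transformer feed-forward block: two linear sublayers
    with a nonlinearity in between (dimensions are illustrative)."""

    def __init__(self, d_model: int = 768, d_ff: int = 3072):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)    # first sublayer: project up
        self.w_out = nn.Linear(d_ff, d_model)   # second sublayer: project back down
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: the dense representation arriving at this layer, shape (..., d_model)
        return self.w_out(self.act(self.w_in(x)))


# Example: one token representation passing through the FFN
h = torch.randn(1, 768)
out = FFN()(h)  # shape (1, 768), same as the input
```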
During the forward pass of an LM, the FFN takes as input a dense vector representation from the previous...