Investigating causal understanding in LLMs

Tom Lieberum

Really interesting, even though the results aren't that surprising. I'd be curious to see how the results improve (or not) with more recent language models. I also wonder if there are other formats for testing causal understanding. For example, what if the model receives a more natural story plot (about Red Riding Hood, say) and is asked some causal questions (say, "What would have happened if grandma wasn't home when the wolf got there?")?

It's less clean, but it could be interesting to probe it in a few different ways.

Marius Hobbhahn (4y)

I would expect the results to be better on, let's say, PaLM. I would also expect it to base more of its answers on content than on form.

I think there are a ton of experiments in the direction of natural story plots that one could test and I would be interested in seeing them tested. The reason we started with relatively basic toy problems is that they are easier to control. For example, it is quite hard to differentiate whether the model learned based on form or content in a natural story context.

Overall, I expect there to be many further research projects and papers in this direction.

Roman Leventov (3y)

Interesting. It took me quite a bit of time to realise (with the help of GPT) that we humans have a pretty strong inductive bias here, namely that all the events form a single chain:

Q: The red ball hit the green ball. The blue ball hit the red ball. The green ball fell into the hole. Does the order of sentences coincide with the time ordering of the events in this case, and why?
A: Yes, the order of the sentences does coincide with the time ordering of the events. This is because the events are listed in chronological order, so the first event listed (the red ball hitting the green ball) happened first, followed by the second event listed (the blue ball hitting the red ball), and finally the third event listed (the green ball falling into the hole).

Indeed, it now looks even more likely that the red ball was hit by the blue ball after it had hit the green ball, because if the three events make up a single chain, it's very unnatural to switch the sentences around like this.

I succeeded in convincing GPT that the blue ball started the chain like this:

Q: The red ball hit the green ball. The blue ball hit the red ball, and it started moving. The green ball fell into the hole. Which ball has started the chain?
A: The blue ball started the chain by hitting the red ball.

Oddly, even with this phrasing I couldn't get it to say that the order of events doesn't coincide with the order of sentences. GPT insists that it does, and even when I ask it to explain why it doesn't, it spits out nonsense:

Q: The red ball hit the green ball. The blue ball hit the red ball, and it started moving. The green ball fell into the hole. Why doesn't the order of events coincide with the order of sentences in this case?
A: The order of events does not coincide with the order of sentences in this case because the sentences are in the past tense.

Marius Hobbhahn (3y)

Yeah, the phrasing might not be as precise as we intended it to be.

frantcly (3y)

Is there any plan to retry this with Facebook's LLaMA models? They claim it outperforms GPT-3. Since it's open for research, it might give more precise answers to some of the questions.

Marius Hobbhahn (3y)

No plans so far. I'm a little unhappy with the experimental design from last time. If I ever come back to this, I'll change the experiments up anyways.

Name	Example
Two sentences cause	My car got dirty. I washed the car. Question: Which sentence is the cause of the other? Answer by copying the sentence:
Two sentences effect	My car got dirty. I washed the car. Question: Which sentence is the effect of the other? Answer by copying the sentence:
Two sentences switched	I washed the car. My car got dirty. Question: Which sentence is the cause of the other? Answer by copying the sentence:
Two sentences one-shot	The child hurt their knee. The child started crying. Question: Which sentence is the cause of the other? Answer: The child hurt their knee. My car got dirty. I washed the car. Question: Which sentence is the cause of the other? Answer by copying the sentence:

Name	Example
One sentence cause	I washed the car because my car got dirty. My car got dirty because I washed the car. Question: Which sentence gets cause and effect right? Answer by copying the sentence:
One sentence switched	My car got dirty because I washed the car. I washed the car because my car got dirty. Question: Which sentence gets cause and effect right? Answer by copying the sentence:
One sentence one-shot	I washed the car because my car got dirty. My car got dirty because I washed the car. Which sentence gets cause and effect right? Answer by copying the sentence: I washed the car because my car got dirty. Someone called 911 because someone fainted. Someone fainted because someone called 911. Which sentence gets cause and effect right? Answer by copying the sentence:
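The zero-shot rows of the two tables above follow a fixed template, so they can be generated from a single (cause, effect) pair. The following is a minimal sketch of that idea, not the authors' code: the function names `two_sentence_prompt` and `one_sentence_prompt` and the convention of passing lowercase clauses without a final period are assumptions made for illustration.

```python
def _sentence(clause: str) -> str:
    """Uppercase only the first character and add a period."""
    return clause[0].upper() + clause[1:] + "."


def two_sentence_prompt(cause, effect, ask="cause", switched=False):
    """Build a 'Two sentences' prompt; returns (prompt, expected_answer).

    ask is "cause" or "effect"; switched puts the effect sentence first.
    """
    c, e = _sentence(cause), _sentence(effect)
    first, second = (e, c) if switched else (c, e)
    prompt = (
        f"{first} {second} Question: Which sentence is the {ask} of the other? "
        "Answer by copying the sentence:"
    )
    return prompt, (c if ask == "cause" else e)


def one_sentence_prompt(cause, effect, switched=False):
    """Build a 'One sentence' prompt; returns (prompt, expected_answer)."""
    right = _sentence(f"{effect} because {cause}")
    wrong = _sentence(f"{cause} because {effect}")
    first, second = (wrong, right) if switched else (right, wrong)
    prompt = (
        f"{first} {second} Question: Which sentence gets cause and effect right? "
        "Answer by copying the sentence:"
    )
    return prompt, right


prompt, expected = two_sentence_prompt("my car got dirty", "I washed the car", switched=True)
print(prompt)
print("expected:", expected)
```

Returning the expected answer next to the prompt keeps the ground truth independent of sentence order, which is what the switched variants are meant to probe.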

Name	Example
Three balls first	The blue ball hit the red ball. The red ball hit the green ball. The green ball fell into the hole. Question: Which ball started the chain? Answer in three words:
Three balls second	The blue ball hit the red ball. The red ball hit the green ball. The green ball fell into the hole. Question: Which ball was second in the chain? Answer in three words:
Three balls final	The blue ball hit the red ball. The red ball hit the green ball. The green ball fell into the hole. Question: Which ball fell into the hole? Answer in three words:
Three balls switched	The red ball hit the green ball. The blue ball hit the red ball. The green ball fell into the hole. Question: Which ball started the chain? Answer in three words:
Three balls one-shot	The blue ball hit the red ball. The red ball hit the green ball. The green ball fell into the hole. Question: Which ball started the chain? Answer in three words: The blue ball. The yellow ball hit the red ball. The red ball hit the green ball. The green ball fell into the hole. Question: Which ball started the chain? Answer in three words:
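The chain prompts in this table can also be produced programmatically. Below is a rough sketch under stated assumptions: `chain_prompt` is a hypothetical helper, and the "switched" variant is implemented as swapping only the first two "hit" sentences, which matches the three-ball row above but is a guess for longer chains.

```python
def chain_prompt(items, question="Which ball started the chain?",
                 answer_len="three words", switched=False):
    """Build a chain prompt from items given in causal order.

    items: e.g. ["blue ball", "red ball", "green ball"].
    Returns (prompt, name_of_first_item).
    """
    if len(items) < 2:
        raise ValueError("need at least two items to form a chain")
    # One "hit" sentence per adjacent pair in the causal order.
    hits = [f"The {a} hit the {b}." for a, b in zip(items, items[1:])]
    if switched and len(hits) >= 2:
        # Assumption: only the first two sentences trade places.
        hits[0], hits[1] = hits[1], hits[0]
    story = " ".join(hits + [f"The {items[-1]} fell into the hole."])
    prompt = f"{story} Question: {question} Answer in {answer_len}:"
    return prompt, items[0]


prompt, first = chain_prompt(["blue ball", "red ball", "green ball"], switched=True)
print(prompt)
print("started the chain:", first)
```

The same function covers the nonsense-word prompts by passing, for example, `["schleep", "blubb", "baz"]` with `question="What started the chain?"` and `answer_len="two words"`.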

Name	Example
Three nonsense words first	The schleep hit the blubb. The blubb hit the baz. The baz fell into the hole. Question: What started the chain? Answer in two words:
Three nonsense words second	The schleep hit the blubb. The blubb hit the baz. The baz fell into the hole. Question: What was second in the chain? Answer in two words:
Three nonsense words final	The schleep hit the blubb. The blubb hit the baz. The baz fell into the hole. Question: What fell into the hole? Answer in two words:
Three nonsense words switched	The blubb hit the baz. The schleep hit the blubb. The baz fell into the hole. Question: What started the chain? Answer in two words:
Three nonsense words one-shot	The baz hit the bla. The bla hit the plomp. The plomp fell into the hole. Question: What started the chain? Answer in two words: the baz The baz hit the fuu. The fuu hit the schleep. The schleep fell into the hole. Question: What started the chain? Answer in two words:
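For the switched prompts, the correct answer depends on who hit whom, not on the order in which the sentences appear. As an illustration (a sketch only, with the hypothetical helper `chain_order` and the assumption that every relevant sentence has the exact form "The X hit the Y."), the causal order can be recovered from a single zero-shot story like this:

```python
import re

# Matches "The X hit the Y." without crossing a sentence boundary.
HIT = re.compile(r"The ([^.]+?) hit the ([^.]+?)\.")


def chain_order(story: str) -> list:
    """Return the items of a single chain in causal order."""
    nxt = dict(HIT.findall(story))          # hitter -> thing that was hit
    starts = set(nxt) - set(nxt.values())   # items that hit but were never hit
    if len(starts) != 1:
        raise ValueError("story does not describe exactly one chain")
    order = [starts.pop()]
    # Follow the hit relation; stop at the end of the chain or on a repeat.
    while order[-1] in nxt and nxt[order[-1]] not in order:
        order.append(nxt[order[-1]])
    return order


story = "The blubb hit the baz. The schleep hit the blubb. The baz fell into the hole."
print(chain_order(story))
```

Because the order is derived from the hit relation alone, the result is the same for the original and the switched version of a story, which gives a form-independent ground truth to score model answers against.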

Executive summary

Introduction

Meta

Setup

Cause & effect two sentences

Cause & effect one sentence

Toy example - 3 colored balls

Toy example - 3 nonsense words

Toy example - 5 colored balls

Experiments

Model size and number of shots

Switch the order in prompts

Switch the order in prompts and shots

Longer chains

Conclusion

Appendix A: prompts

Prompt-engineering

Appendix B: author contributions