Someone should probably write a real book review, but to make a brief recommendation: The Book of Why by Judea Pearl and Dana Mackenzie is probably the most interesting general-science book I've read since Thinking, Fast and Slow.
Pearl's goal is to explain and promote causal inference, which you might think of as (allegedly) the next big thing after frequentist and Bayesian statistics. The introduction is probably skippable, since the authors make some rather grand claims that aren't backed up until later. I found myself thinking, "okay, maybe it's great, but explain what it is already".
Chapter 1 introduces the Ladder of Causation, the authors' way of distinguishing the correlations found via a model-free statistical summary of data (level 1, association) from deductions that require a causal model (level 2, intervention, and level 3, counterfactuals).
Chapters 2 and 3 give a partial, "whiggish" history of statistics from a causal perspective, covering frequentist and Bayesian statistics and Pearl's own AI work, during which he invented Bayesian networks. At the end of Chapter 3, the authors describe the three possible junctions in a Bayesian network (the chain, the fork, and the collider) and show how easily they can cause confusion.
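The collider is the junction that most often surprises people, so here is a minimal sketch of it. This is a toy example of my own, not code or numbers from the book: two independent fair coins X and Y both feed into Z, which is 1 when at least one coin shows heads. The function names are made up for illustration. The probabilities are computed exactly by enumeration rather than by simulation.

```python
from fractions import Fraction
from itertools import product


def joint():
    """The four equally likely outcomes (x, y, probability) of two independent fair coins."""
    return [(x, y, Fraction(1, 4)) for x, y in product([0, 1], repeat=2)]


def prob_x_given(outcomes, y=None, require_collider=False):
    """P(X=1), optionally conditioned on Y=y and/or on the collider Z = (X or Y) being 1."""
    kept = []
    for x, y_val, p in outcomes:
        z = 1 if (x or y_val) else 0  # collider: X -> Z <- Y
        if y is not None and y_val != y:
            continue
        if require_collider and z != 1:
            continue
        kept.append((x, p))
    total = sum(p for _, p in kept)
    return sum(p for x, p in kept if x == 1) / total


if __name__ == "__main__":
    outcomes = joint()
    # Without conditioning on Z, learning Y tells us nothing about X.
    print(prob_x_given(outcomes), prob_x_given(outcomes, y=1))
    # Once we only look at cases where Z = 1, Y becomes informative about X.
    print(prob_x_given(outcomes, y=1, require_collider=True),
          prob_x_given(outcomes, y=0, require_collider=True))
```

Unconditionally, P(X=1) is 1/2 whether or not we know Y. Restricted to cases where Z = 1, it is 1/2 when Y = 1 but 1 when Y = 0, so conditioning on the collider creates a dependence between two variables that have no causal connection to each other.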
Chapter 4 uses causal reasoning to explain the logic behind randomized controlled trials and other ways of controlling for confounding variables.
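As a rough illustration of why randomization helps, consider a fork: a confounder Z influences both who gets the treatment X and the outcome Y, while X itself has no effect on Y at all. The sketch below uses invented probabilities and hypothetical names (none of them come from the book) and computes the apparent effect of X on Y exactly, first when treatment uptake depends on Z and then when a coin flip assigns it.

```python
from fractions import Fraction as F

# Assumed toy model: Z is a fair coin, and Y depends on Z only (X has no causal effect).
P_Z = {0: F(1, 2), 1: F(1, 2)}
P_Y_GIVEN_Z = {0: F(1, 4), 1: F(3, 4)}


def outcome_rate(p_x_given_z, x):
    """P(Y=1 | X=x) when P(X=1 | Z=z) is given by p_x_given_z[z]."""
    def p_x(z):
        return p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z]

    numerator = sum(P_Z[z] * p_x(z) * P_Y_GIVEN_Z[z] for z in P_Z)
    denominator = sum(P_Z[z] * p_x(z) for z in P_Z)
    return numerator / denominator


def apparent_effect(p_x_given_z):
    """Difference in outcome rates between the treated and the untreated."""
    return outcome_rate(p_x_given_z, 1) - outcome_rate(p_x_given_z, 0)


# Observational setting: people with Z=1 are more likely to take the treatment.
observational = {0: F(1, 4), 1: F(3, 4)}
# Randomized trial: a fair coin decides treatment, so the arrow Z -> X is cut.
randomized = {0: F(1, 2), 1: F(1, 2)}

if __name__ == "__main__":
    print(apparent_effect(observational))
    print(apparent_effect(randomized))
```

In the observational setting the treated group does better by 1/4 even though the treatment does nothing, purely because Z drives both. With randomized assignment the difference is exactly 0, which matches the true (null) effect in this toy model.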
Chapter 5 covers the scientific debate over cigarette smoking, and how a lack of clarity about causation made that debate drag on years longer than it needed to.
Chapter 6 is a fun chapter showing how to use causal diagrams to shed new light on the Monty Hall problem and Simpson's paradox.
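For readers who want to check the Monty Hall numbers themselves, here is a small sketch that enumerates every case exactly. It assumes the standard rules (the host knows where the car is, always opens a goat door that the player did not pick, and chooses uniformly when two such doors are available); the function name is made up for illustration. In causal-diagram terms, the door the host opens depends on both the player's pick and the car's location, which is the collider structure that makes the opened door informative.

```python
from fractions import Fraction


def monty_hall_win_probabilities():
    """Return (P(win if staying), P(win if switching)) under the standard rules."""
    doors = (0, 1, 2)
    stay = Fraction(0)
    switch = Fraction(0)
    for car in doors:
        for pick in doors:
            # Car location and player's pick are independent and uniform.
            p_world = Fraction(1, 9)
            # The host may only open a door that is neither the pick nor the car.
            host_options = [d for d in doors if d != pick and d != car]
            for opened in host_options:
                p = p_world / len(host_options)
                # The one door that is neither picked nor opened.
                remaining = next(d for d in doors if d not in (pick, opened))
                if pick == car:
                    stay += p
                if remaining == car:
                    switch += p
    return stay, switch


if __name__ == "__main__":
    stay, switch = monty_hall_win_probabilities()
    print("stay:", stay, "switch:", switch)
```

Staying wins with probability 1/3 and switching with probability 2/3, the usual answer.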
And that's as far as I've read, but it's enough to make a strong recommendation.
I did a quick search on Less Wrong and causality has been covered before, though not as clearly. In particular, see Yudkowsky's Causal Diagrams and Causal Models.
(I was confused about one bit, though: Yudkowsky writes that "Causal models (with specific probabilities attached) are sometimes known as 'Bayesian networks' or 'Bayes nets'." But in the book, the authors make a clear distinction: "Unlike the causal diagrams we will deal with throughout the book, a Bayesian network carries no assumption that the arrow has any causal meaning." Though later, they write, "These three junctions [...] are like keyholes through the door that separates the first and second levels of the Ladder of Causation.")