I'll be brief: omit needless words.
Intelligence is prediction is compression because
Compression is finding a code that makes the data shorter
And codeword lengths are negative log probabilities
So codes are probability distributions
And probability distributions are prediction strategies.
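
A minimal sketch of the chain, assuming a toy four-symbol alphabet with power-of-two probabilities. The alphabet and the names `code_lengths`, `implied_probs`, and `cost` are illustrative and not part of the argument above. It shows that a length of l bits corresponds to a probability of 2^-l, and that a worse predictor pays more bits per symbol.

```python
import math

def code_lengths(probs):
    """Shannon code lengths: ceil(-log2 p) bits for each symbol."""
    return {s: math.ceil(-math.log2(p)) for s, p in probs.items()}

def implied_probs(lengths):
    """Read a code as a distribution: p(s) = 2^-length, normalized."""
    kraft = sum(2.0 ** -n for n in lengths.values())
    return {s: 2.0 ** -n / kraft for s, n in lengths.items()}

def cost(truth, model):
    """Expected bits per symbol when data from `truth` is coded with `model`."""
    return -sum(p * math.log2(model[s]) for s, p in truth.items())

# Toy source (an assumption for illustration only).
truth = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
uniform = {s: 0.25 for s in truth}

lengths = code_lengths(truth)      # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
recovered = implied_probs(lengths) # same distribution as `truth`

good = cost(truth, truth)          # 1.75 bits: the predictor matches the source
bad = cost(truth, uniform)         # 2.0 bits: a worse predictor, a longer code
```

Here the code recovers the distribution exactly because every probability is a power of two. In general the rounding in `code_lengths` costs less than one extra bit per symbol.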