Introduction to abstract entropy
This post, and much of the following sequence, was greatly aided by feedback from the following people (among others): Lawrence Chan, Joanna Morningstar, John Wentworth, Samira Nedungadi, Aysja Johnson, Cody Wild, Jeremy Gillen, Ryan Kidd, Justis Mills and Jonathan Mustin.

Introduction & motivation

In the course of researching optimization, I decided that I had to really understand what entropy is.[1] But there are a lot of other reasons why the concept is worth studying:

* Information theory:
  * Entropy tells you about the amount of information in something.
  * It tells us how to design optimal communication protocols.
  * It helps us understand strategies for (and limits on) file compression.
* Statistical mechanics:
  * Entropy tells us how macroscopic physical systems act in practice.
  * It gives us the heat equation.
  * We can use it to improve engine efficiency.
  * It tells us how hot things glow, which led to the discovery of quantum mechanics.
* Epistemics (an important application to me and many others on LessWrong):
  * The concept of entropy yields the maximum entropy principle, which is extremely helpful for doing general Bayesian reasoning.
  * Entropy tells us how "unlikely" something is and how much we would have to fight against nature to get that outcome (i.e. optimize).
* It can be used to explain the arrow of time.
* It is relevant to the fate of the universe.
* And it's also a fun puzzle to figure out!

I didn't intend to write a post about entropy when I started trying to understand it. But I found the existing resources (textbooks, Wikipedia, science explainers) so poor that it actually seems important to have a better one as a prerequisite for understanding optimization! One failure mode I was running into was that other resources tended to be concerned only with the application of the concept in their particular sub-domain. Here, I try to take on the task of synthesizing the abstract concept of entropy, to show what's so