I had difficulty with what I think was that chapter too. I asked about one part on math.stackexchange; it looks like someone answered my immediate question, but judging by my followup there might have been more that I didn't understand. (I no longer remember enough math to really understand the question or answer.)
Yeah, I read about a third of the proof of Cox's theorem until I realized that even if I followed every step I wouldn't gain any intuition from it, then I skipped the rest.
I believe that what Jaynes does is quite standard: start with a minimalistic set of axioms (or principles, or whatever) and work your way to the intuitive results later on. Euclidean geometry is just like that!
I just skimmed over the details of the proofs (and I am a mathematician by training!). I did not read Jaynes for such details. I just guess that if they were wrong, somebody would have already reported them. The meaty part is elsewhere.
Furthermore, if you want the sum and product rules to take their natural forms, you have to pick our units. This is analogous to how the Kelvin scale is defined to make the laws of thermodynamics look natural.
The units of Kelvin are just the units of Celsius, but with a different base point (absolute zero, which is what it sounds like - imagine a gas cooled so much (an impossible amount, in fact) that there is no thermal motion whatsoever). Celsius is based on the phase changes of water at our atmospheric pressure, so aliens probably don't use it.
The actual natural units of temperature are in the units of boltzmann_constant * kelvins, that is, in energy units. The thing that shows up in all the equations is that, not the temperature in kelvins. At least, if you use joules... if you want to use a different energy unit as your fundamental unit, you could easily adapt it to work with that. Finally, there's an implicit "information" unit there. For example, room temperature (~ 300 K or 27 C or 80 F) is about 1/44 nanojoules per gigabyte.
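The "1/44 nanojoules per gigabyte" figure can be checked with a few lines of arithmetic. This is only a sketch of one reading of the comment: it assumes the energy per bit is k_B * T * ln(2) and that a gigabyte means 8e9 bits (decimal, not binary). The variable names are made up for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact in the SI)
T_ROOM = 300.0       # room temperature in kelvins, as in the comment
BITS_PER_GB = 8e9    # ASSUMPTION: decimal gigabyte = 1e9 bytes = 8e9 bits

# Temperature in energy units: joules per nat of information.
joules_per_nat = K_B * T_ROOM

# ASSUMPTION: convert nats to bits with a factor of ln(2).
joules_per_bit = joules_per_nat * math.log(2)

# Scale up to a gigabyte and express in nanojoules.
nanojoules_per_gb = joules_per_bit * BITS_PER_GB * 1e9

print(f"{nanojoules_per_gb:.4f} nJ/GB, i.e. about 1/{1 / nanojoules_per_gb:.1f}")
```

With a binary gigabyte (2^30 bytes) the denominator comes out closer to 41, so the decimal reading seems to be the one that matches the comment.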
Trying to follow every detail is dumb; you're there to learn, not to pass a trivia test on solving the associativity equation. You're here to learn probability, not how to solve functional equations.
Depends on goal. If you are looking to mine some specific things out of your textbook, then I think it's fine to somewhat ruthlessly pursue the parts of the knowledge tree required for that goal. As an example, I learned differential geometry by taking the path through Lee's book on Smooth Manifolds that beelined straight to the generalized Stokes theorem, because that was the original shiny thing I wanted.
However, for more general purposes, getting more general knowledge and practice is useful. For example: I find that the skills involved in solving functional equations are fairly common, and are among a variety of experiences that form your implicit-and-explicit mental databank for "how to apply algebraic manipulations to get interesting results" or "how to get new info from old stuff by applying 'rigid' operations", which comes up a lot in math.
You may also find that you in fact care about the foundations enough that you want to understand this.
Background
I first started reading Jaynes's Probability Theory around 2 years ago. I did not last long: I wasn't able to follow the derivations in Chapter 2, so I gave up. If it would only get harder from there, what was the point in continuing?
This was a big mistake! For several reasons
As I'm finally rectifying my mistake, I figured I'd write an explanation of what Chapter 2 is actually about.
Think "Alien" Not Robot
The reasoning in Chapter 2 shows that any rules of plausible inference must match our own theory of probability after a suitable change of units. The convoluted functional-equation argument is needed to construct a function p:R→[0,1] which translates the alien's plausibility A|B into our probability p(A|B).
This is very different from how p is usually defined. For Jaynes, p is a translation between the plausibilities of the alien and our "nicer" probabilities that obey the sum and product rules. Contrast this with standard probability, where p is a function from subsets of the sample space to real numbers obeying certain axioms.
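To make the "translation" picture concrete, here is a minimal sketch of a hypothetical alien who records plausibilities as log-odds. The log-odds scale and the names to_alien, alien_and and alien_not are my own illustrative assumptions, not anything from Jaynes. The alien's rules look nothing like ours, but pushing them through p recovers the usual product and sum rules.

```python
import math

def to_alien(prob):
    """ASSUMPTION: the alien's scale is log-odds (any monotonic scale would do)."""
    return math.log(prob / (1.0 - prob))

def p(x):
    """Translate an alien plausibility (a real number) into our probability."""
    return 1.0 / (1.0 + math.exp(-x))

def alien_and(x, y):
    """The alien's rule for getting AB|C from x = A|BC and y = B|C."""
    return to_alien(p(x) * p(y))

def alien_not(x):
    """The alien's rule for negation; on the log-odds scale it is a sign flip."""
    return -x

# Fair die: A = "lands five", B = "lands odd".
x = to_alien(1 / 3)   # A | B: one of the three odd faces
y = to_alien(1 / 2)   # B: half the faces are odd
xy = alien_and(x, y)  # AB in the alien's units

# After translation, the product rule takes its familiar form.
assert math.isclose(p(xy), p(x) * p(y))
assert math.isclose(p(xy), 1 / 6)

# The sum rule also appears once we translate.
assert math.isclose(p(x) + p(alien_not(x)), 1.0)

# The alien's own rule is associative, which is the property
# the functional-equation argument starts from.
z = to_alien(0.9)
assert math.isclose(alien_and(alien_and(x, y), z), alien_and(x, alien_and(y, z)))
```

In the alien's own units, alien_and is an ugly expression; the units where it becomes plain multiplication are ours.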
Any alien civilization's concept of "probability the dice lands five" must be a monotonic function of our probability 1/6. That's amazing when you think about it. It shows that probability is discovered, not invented. [1]
Furthermore, if you want the sum and product rules to take their natural forms, you have to pick our units. This is analogous to how the Kelvin scale is defined to make the laws of thermodynamics look natural. [2]
Near the end of Chapter 2 (after we've shown uniqueness), Jaynes switches to a more traditional function P:Prop→[0,1], where Prop is the set of logical propositions, such as "the dice lands five".
Why Jaynes doesn't mention the intuitive explanation until after dragging you through a hard-to-follow argument involving tons of functional equations and calculus, I have no idea. I guess thousand-year-old vampires are bad at pedagogy? [3]
Some study tips
Hopefully I've convinced you Jaynes isn't that scary, and motivated you to start reading. A few quick tips
I feel a popularization of these ideas should be possible; they feel more fundamental (read: philosophically interesting) than most popularized science. Cantor's theorem pales in comparison! Somebody get Numberphile in on this! ↩︎
Or, so I'm told. I don't know thermodynamics lol ↩︎
This has also made me think the world deserves a "Second edition" that corrects some of his egregious teaching. If you know of something like this (other books which take his approach to probability), please let me know. ↩︎