Note, though, that time reversal is still an anti-unitary operator in quantum mechanics, even though the hand-waving argument fails when time reversal isn't a good symmetry. And even when time reversal symmetry fails, there's still CPT symmetry (and CPT is also anti-unitary).
I argue that counting branches does not behave well with respect to the Hilbert space structure and unitary time evolution, and that assigning a measure to branches (the 'dilution' argument) is the proper way to handle this instead. (See Wallace's decision-theory 'proof' of the Born rule for more.)
The quantum state is a vector in a Hilbert space. Hilbert spaces have an inner product structure. That inner product structure is important for a lot of derivations/proofs of the Born rule, but in particular the inner product induces a norm. Norms let us do a lot of things. One of the more important things is we can define continuous functions. The short version is, for a continuous function, arbitrarily small changes to the input should produce arbitrarily small changes to the output. Another thing commonly used for vector spaces is linear operators, which are a kind of function that maps vectors to other vectors in a way that respects scalar multiplication and vector addition. We can combine the notion of continuous functions with linear operators and we get bounded linear operators.
While quantum mechanics contains a lot of unbounded operators representing observables (position, momentum, energy, etc.), bounded operators are still important. In particular, projection operators are bounded, and every self-adjoint operator, whether bounded or unbounded, has an associated projection-valued measure (this is the spectral theorem). Projection-valued measures go hand-in-hand with the Born rule: they are what gives the probability of a measurement result falling in some set of values. There's an analogy with probability distributions. Sampling from an arbitrary distribution can in principle give an arbitrarily large number, and many distributions even lack a finite average. However, the probability of a sample from an arbitrary distribution falling in the interval [a,b] will always be a number between 0 and 1.
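To make the analogy concrete, here's a minimal Python sketch using the standard Cauchy distribution as an example of a distribution with no finite average. The choice of distribution and the function names are mine, purely for illustration. Interval probabilities always land between 0 and 1, while the truncated average of |x| keeps growing as the cutoff is raised.

```python
import math

def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi

def prob_in_interval(a, b):
    """Probability that a sample falls in [a, b]; always between 0 and 1."""
    return cauchy_cdf(b) - cauchy_cdf(a)

def truncated_mean_abs(cutoff):
    """Integral of |x| * pdf(x) over [-cutoff, cutoff].

    The closed form is ln(1 + cutoff^2) / pi, which grows without bound,
    so the average of |x| does not exist.
    """
    return math.log1p(cutoff ** 2) / math.pi

if __name__ == "__main__":
    for a, b in [(-1.0, 1.0), (0.0, 1e6), (-1e9, 1e9)]:
        print(f"P([{a:g}, {b:g}]) = {prob_in_interval(a, b):.6f}")
    for cutoff in (1e1, 1e3, 1e6):
        print(f"cutoff={cutoff:g}: truncated E|x| = {truncated_mean_abs(cutoff):.3f}")
```

The interval probabilities play the role of the bounded projection operators, and the unbounded average plays the role of an unbounded observable.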
If we are careful to ask only about probabilities instead of averages, or to ask about averages only when the quantity is bounded, we can do practically everything in quantum mechanics with bounded linear operators. The expectation values of bounded linear operators are continuous functions of the quantum state. And so now we get to the core issue: arbitrarily small changes to the quantum state produce arbitrarily small changes to the expectation value of any bounded operator, and in particular to any Born rule probability.
So what about branch counting? Let's assume for the sake of discussion that we have a preferred basis to count in, which is its own can of worms. For a toy model, say we count a vector like (1, 0, 0, 0, 0, 0, ....) as having 1 branch and a vector like (1, x, x, x, 0, 0, ....) as having 4 branches whenever x is an arbitrarily small but nonzero number. Then this branch counting is not a continuous function of the state. If you don't know the state with infinite precision, you can't distinguish whether a coefficient is actually zero or just some really small nonzero number. Thus, you can't practically count the branches: there might be 1, there might be 4, there might be infinitely many. On the other hand, the Born rule measure changes continuously with any small change to the state, so knowing the state with finite precision also gives finite precision on any Born rule measure.
In short, arbitrarily small changes to the quantum state can result in arbitrarily large changes to branch counting.
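Here's a minimal Python sketch of the toy model, assuming the preferred basis is just the coordinate basis and truncating the vector to six entries. The function names are made up for illustration. The branch count jumps from 1 to 4 the moment x is nonzero, while the Born probability of the first outcome barely moves.

```python
import math

def normalize(vec):
    """Rescale a vector to unit norm."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def branch_count(vec):
    """Count nonzero coefficients in the (assumed) preferred basis."""
    return sum(1 for c in vec if c != 0)

def born_probabilities(vec):
    """Squared magnitudes of the coefficients of the normalized state."""
    return [abs(c) ** 2 for c in normalize(vec)]

def toy_state(x, dim=6):
    """The vector (1, x, x, x, 0, 0, ...) truncated to `dim` entries."""
    return [1.0, x, x, x] + [0.0] * (dim - 4)

if __name__ == "__main__":
    for x in (0.0, 1e-3, 1e-6, 1e-9):
        state = toy_state(x)
        p_first = born_probabilities(state)[0]
        print(f"x={x:g}: branches={branch_count(state)}, "
              f"P(first outcome)={p_first:.12f}")
```

Any finite-precision description of the state can't tell x = 0 from x = 1e-9, so it can't tell 1 branch from 4, but it pins down the Born probabilities just fine.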
I will amend my statement to be more precise:
Everett's proof that the Born rule measure (amplitude squared for orthogonal states) is the only measure that satisfies the desired properties has no dependence on tensor product structure.
Everett's proof that a "typical" observer sees measurements that agree with the Born rule in the long term uses the tensor product structure and the result of the previous proof.
I kind of get why Hermitian operators make sense here, but then we apply the measurement and the system collapses to one of its eigenfunctions. Why?
If I understand what you mean, this is a consequence of what we defined as a measurement (or what's sometimes called a pre-measurement). Taking the tensor product structure and density matrix formalism as a given, if the interesting subsystem starts in a pure state, the unitary measurement structure implies that the reduced state of the interesting subsystem will generally be a mixed state after measurement. You might find parts of this review informative; it covers pre-measurements and also weak measurements, and in particular talks about how to actually implement measurements with an interaction Hamiltonian.
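As a rough illustration, here's a minimal Python sketch of a pre-measurement on a single qubit, assuming the simplest possible apparatus: a second qubit that starts in |0> and gets the system's basis state copied into it (a CNOT-style interaction). The setup and function names are my own choices for illustration, not something taken from the review. The reduced state of the system goes from pure (purity 1) to mixed (purity below 1).

```python
import math

def before_premeasurement(a, b):
    """Joint amplitudes psi[s][m] for (a|0> + b|1>) tensor apparatus |0>."""
    return [[a, 0.0], [b, 0.0]]

def after_premeasurement(a, b):
    """Joint amplitudes after the apparatus is correlated with the system:
    a|0>|0> + b|1>|1>."""
    return [[a, 0.0], [0.0, b]]

def reduced_system_state(psi):
    """Partial trace over the apparatus:
    rho[s][t] = sum over m of psi[s][m] * conj(psi[t][m])."""
    return [[sum(psi[s][m] * complex(psi[t][m]).conjugate() for m in range(2))
             for t in range(2)]
            for s in range(2)]

def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, smaller for a mixed state."""
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    a = b = 1.0 / math.sqrt(2.0)
    for label, psi in (("before", before_premeasurement(a, b)),
                       ("after", after_premeasurement(a, b))):
        rho = reduced_system_state(psi)
        print(f"{label}: purity of the system's reduced state = {purity(rho):.3f}")
```

After the interaction the off-diagonal terms of the reduced density matrix vanish, and the diagonal entries are the Born rule weights |a|^2 and |b|^2.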
I don't see how that relates to what I said. I was addressing why an amplitude-only measure that respects unitarity and is additive over branches has to use amplitudes for a mutually orthogonal set of states to make sense. Nothing in Everett's proof of the Born rule relies on a tensor product structure.
Why should (2,1) split into one branch of (2,0) and one branch of (0,1), not into one branch of (1,0) and one branch of (1,1)?
Again, it's because of unitarity.
As Everett argues, we need to work with normalized states to unambiguously define the coefficients, so let's define normalized vectors v1=(1,0) and v2=(1,1)/sqrt(2). (1,0) has an amplitude of 1, (1,1) has an amplitude of sqrt(2), and (2,1) has an amplitude of sqrt(5).
(2,1) = v1 + sqrt(2) v2, so we need M[sqrt(5)] = M[1] + M[sqrt(2)] for the additivity of measures. Now let's do a unitary transformation on (2,1) (swapping the two components) to get (1,2) = -1 v1 + 2 sqrt(2) v2, which still has an amplitude of sqrt(5). So now we need M[sqrt(5)] = M[2 sqrt(2)] + M[-1] = M[2 sqrt(2)] + M[1], taking the measure to depend only on the magnitude of the amplitude. This can only work if M[2 sqrt(2)] = M[sqrt(2)]. If one wanted a strictly monotonic dependence on amplitude, that'd be the end. We can keep going instead and look at the vector (a+1, a) = v1 + a sqrt(2) v2, rotate it to (a, a+1) = -v1 + (a+1) sqrt(2) v2, and prove that M[(a+1) sqrt(2)] = M[a sqrt(2)] for all a. Continuing similarly, we're led inevitably to M[x] = 0 for any x. If we want a non-trivial measure with these properties, we have to look at orthogonal states.
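As a quick numerical sanity check of the decompositions above, here's a minimal Python sketch. The helper names are made up for illustration. It recovers the coefficients of (2,1) and (1,2) in the non-orthogonal pair v1, v2, and shows that even the amplitude-squared measure fails to be additive over that pair, while it is additive over the orthogonal basis (1,0), (0,1).

```python
import math

SQRT2 = math.sqrt(2.0)

def coefficients(vec):
    """Solve vec = c1*v1 + c2*v2 with v1 = (1,0) and v2 = (1,1)/sqrt(2)."""
    c2 = vec[1] * SQRT2   # only v2 has a second component
    c1 = vec[0] - vec[1]  # what is left over along v1
    return c1, c2

def born(amplitude):
    """Amplitude-squared measure."""
    return abs(amplitude) ** 2

def nonorthogonal_sum(vec):
    """Sum of the measures of the v1 and v2 'branches'."""
    c1, c2 = coefficients(vec)
    return born(c1) + born(c2)

def orthogonal_sum(vec):
    """Sum of the measures of the (1,0) and (0,1) branches."""
    return born(vec[0]) + born(vec[1])

if __name__ == "__main__":
    for vec in ((2.0, 1.0), (1.0, 2.0)):
        total = born(math.hypot(vec[0], vec[1]))
        print(f"{vec}: coefficients={coefficients(vec)}, total={total:.3f}, "
              f"non-orthogonal sum={nonorthogonal_sum(vec):.3f}, "
              f"orthogonal sum={orthogonal_sum(vec):.3f}")
```

The two vectors are related by a unitary and have the same total measure, but their non-orthogonal "branch" sums disagree with the total and with each other. The orthogonal sums match the total in both cases.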
I guess I don't understand the question. If we accept that mutually exclusive states are represented by orthogonal vectors, and we want to distinguish mutually exclusive states of some interesting subsystem, then what's unreasonable about defining a "measurement" as something that correlates our apparatus with the orthogonal states of the interesting subsystem, or at least taking that as the ideal form of a measurement?
I don't know if it would make things clearer, but questions about why eigenvectors of Hermitian operators are important can basically be recast as one question of why orthogonal states correspond to mutually exclusive 'outcomes'. From that starting point, projection-valued measures let you associate real numbers to various orthogonal outcomes, and that's how you make the operator with the corresponding eigenvectors.
As for why orthogonal states are important in the first place, the natural thing to point to is the unitary dynamics (though there are also various more sophisticated arguments).
Everett argued in his thesis that the unitary dynamics motivated this:
...we demand that the measure assigned to a trajectory at one time shall equal the sum of the measures of its separate branches at a later time.
He made the analogy with Liouville's theorem in classical dynamics, where symplectic dynamics motivated the Lebesgue measure on phase space.
The earlier post has problems of its own: it works with an action with nonstandard units (in particular, mass is missing), its sign is backwards from the typical definition, and it doesn't address how vector potentials should be treated. The Lagrangian doesn't have to be positive, so interpreting it as any sort of temporal velocity will already be troublesome, but the Lagrangian is also not unique. It simply does not make sense in general to interpret a Lagrangian as a temporal velocity, so importing that notion into field theory also does not make sense.
The problem with all these entropic arrows of time is that a time-reversible random walk tends to increase entropy both forward and backward in time. Without touching on time reversibility, fluctuation theorems, Liouville's theorem in classical mechanics and unitarity in quantum mechanics, fine-grained vs coarse-grained entropy, etc., I don't think this makes sense as an explanation of the arrow of time. As a physicist, this doesn't come across as a coherent description.