Hm. I can try (depending on how math-y and/or patient you are I may also fail; this is quite long).
In the spirit of Eliezer's dictum "rather than a 3-vector being made out of an ordered list of 3 scalars, a 3-vector was just a pure mathematical object in a vector algebra" the same thing is going on here - the configuration of the photon is given by a complex 2-vector, which we represent as a pair of complex numbers. This means that we have chosen a basis - to quote the previous post again, "We can represent the polarization of light as a complex amplitude for up-down plus a complex amplitude for left-right". Thus, the basis is a pair of configurations - the first one being "up-down", the second one being "left-right". Any other configuration of a single photon is a linear combination of those - hence we write it as (1 ; 0) or (0 ; -i) or (√.5 ; √.5).
There is an unfortunate complication: two configurations are identified if they differ by a complex scalar factor. So as long as you talk about just one photon, the configurations (0 ; -i) and (0 ; 1) are actually the same.
(Side note: One has to be careful with this issue when talking about multiple photons (as we will shortly); in that case only one overall scalar complex factor for the whole joint system is allowed, as opposed to "one for each photon," which does not even make sense, because "each photon" does not necessarily make sense - due to entanglement. I'll get back to this after I talk about how one describes a joint configuration for multiple systems, which is right after this.)
What about a configuration of two photons? Or in general, a combined quantum system describing joint configurations of (system number 1) and (system number 2)? The technical term here is "tensor product", a term Eliezer is careful to avoid (because math). In concrete terms, you can take all combinations of basis configurations as a basis for the joint configuration space. In the case of two photons and using our basis "up-down"=(1 ; 0) and "left-right"=(0 ; 1) we get 4 basis configurations for the joint system: (1 ; 0) ∧ (1 ; 0), (1 ; 0) ∧ (0 ; 1), (0 ; 1) ∧ (0 ; 1) and (0 ; 1) ∧ (1 ; 0). Every joint configuration can be described as a complex linear combination of these. Hence it makes sense to write for a joint state something like √(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] ).
Some of these are "plaid" or "factored" configurations - notably each of the 4 basis configurations is a "product", but there are others - you can take (√.5 ; √.5) ∧ (1 ; 0) and this is also a product. The rules of the game encoded in the words "tensor product" say that (√.5 ; √.5) ∧ (1 ; 0) = (√.5 ; 0) ∧ (1 ; 0) + (0 ; √.5) ∧ (1 ; 0) = √.5 (1 ; 0) ∧ (1 ; 0) + √.5 (0 ; 1) ∧ (1 ; 0), so this is indeed a linear combination of two "basic" configurations (1 ; 0) ∧ (1 ; 0) and (0 ; 1) ∧ (1 ; 0). Note that written this way it does not look factored, or "plaid", but it is. It is factored as (√.5 ; √.5) ∧ (1 ; 0). Now it is a mathematical fact that there are combinations that cannot be rewritten in a factored form, that are actually "entangled" - and √(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] ) is one such.
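To make the "factored versus entangled" distinction concrete, here is a minimal Python sketch. The helper names `tensor` and `is_product` are my own, and the test it uses is a standard fact rather than anything from the original post: a two-photon state with coefficients (c00, c01, c10, c11) in the four basis configurations factors exactly when c00*c11 - c01*c10 = 0.

```python
from math import sqrt

def tensor(a, b):
    """Joint configuration a ∧ b of two single-photon configurations.

    Coefficients are listed in the basis order
    (ud, ud), (ud, lr), (lr, ud), (lr, lr),
    where ud = "up-down" = (1 ; 0) and lr = "left-right" = (0 ; 1).
    """
    return [x * y for x in a for y in b]

def is_product(state, tol=1e-12):
    """True if the state (c00, c01, c10, c11) can be written as a ∧ b."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

s = sqrt(0.5)
ud, lr = [1, 0], [0, 1]

# (√.5 ; √.5) ∧ (1 ; 0), built directly as a product.
plaid = tensor([s, s], ud)

# The same state written as a sum of two basis configurations.
expanded = [s * x + s * y for x, y in zip(tensor(ud, ud), tensor(lr, ud))]

# √(1/2) * ( (1 ; 0) ∧ (0 ; 1) - (0 ; 1) ∧ (1 ; 0) )
singlet = [s * (x - y) for x, y in zip(tensor(ud, lr), tensor(lr, ud))]

print(is_product(plaid), is_product(expanded), is_product(singlet))
```

The first two states have identical coefficients, so both pass the test even though the expanded form does not look factored. The last one fails it, which is what "entangled" means here.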
Side note #2: Now I can explain that "one complex scalar factor" business. For a factored configuration (a,b)∧(c,d) we can multiply each factor by a scalar: (ka, kb)∧(lc, ld) = k(a,b)∧l(c,d) = (kl) (a,b)∧(c,d), and this differs by a single scalar multiplication by (kl) from (a,b)∧(c,d), so it is the same joint configuration. However, for (a,b)∧(c,d) + (e,f)∧(c,d) we cannot multiply each of the copies of (c,d) by its own scalar - for k ≠ l, (a,b)∧(kc, kd) + (e,f)∧(lc, ld) is in general not equivalent to (a,b)∧(c,d) + (e,f)∧(c,d) in the joint configuration space. Hence the caveat.
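The scalar-factor caveat can be checked numerically. This is only a sketch: the helper names and the particular numbers for (a,b), (e,f), (c,d), k and l are arbitrary choices of mine. Two vectors describe the same configuration when they are proportional, which is tested here by checking that all 2x2 minors of the pair vanish.

```python
def tensor(a, b):
    """Coefficients of a ∧ b in the product basis."""
    return [x * y for x in a for y in b]

def add(x, y):
    return [p + q for p, q in zip(x, y)]

def scale(k, x):
    return [k * p for p in x]

def same_ray(x, y, tol=1e-12):
    """True if nonzero vectors x and y differ only by a complex scalar factor."""
    n = len(x)
    return all(abs(x[i] * y[j] - x[j] * y[i]) < tol
               for i in range(n) for j in range(i + 1, n))

# Arbitrary illustrative values (not from the original comment).
ab, ef, cd = [1, 2j], [3, 1], [1, 1j]
k, l = 2j, -3

# Product state: a separate scalar on each factor is harmless.
product_unchanged = same_ray(tensor(scale(k, ab), scale(l, cd)), tensor(ab, cd))

# Sum of two terms: a different scalar on each copy of (c,d) changes the state.
original = add(tensor(ab, cd), tensor(ef, cd))
rescaled = add(tensor(ab, scale(k, cd)), tensor(ef, scale(l, cd)))
sum_unchanged = same_ray(original, rescaled)

print(product_unchanged, sum_unchanged)
```

With the same scalar on both copies (k equal to l) the sum is just rescaled as a whole, so it stays the same configuration. The trouble only appears when the two scalars differ.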
Returning to our main line, one may ask why √(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1; 0) ] )? Why does this particular combination encode "unknown opposite polarization"?
First of all we need to understand what "same" or "opposite" polarization means. "Same" known state means that the state is factorizable as (a,b)∧(a,b) - "both in the same known configuration (a,b)". This is of course the same as (a,b)∧(ka,kb)=k (a,b)∧(a,b), because of the constant scalar rule. What does "opposite" mean? Well, the opposite of (1 ; 0) is (0 ; 1), and in general the opposite known configuration means that the configuration is factorizable as (a,b)∧(c,d) where (a,b) and (c,d) are (Hermitian) orthogonal. ("Hermitian" because all entries are complex; this is an extension of the usual "orthogonal" property that we know from real vectors.)
√(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] ) is not a factorizable configuration, so it's not a known opposite polarization, but it is a linear combination of two such. This still does not explain why this is the "unknown mixed state". Why the - sign, for example?
One "naive" justification is that it works. Clearly in the decohered blob where you have A=(1 ; 0) you also have B=(0 ; 1) and vice versa. But you can decohere this joint configuration in a different way and still get a similar outcome - this is what Eliezer's post demonstrates. But there is a more high-brow reason.
The more high-brow reason is that the joint configuration space - the span of (1 ; 0) ∧ (1 ; 0), (1 ; 0) ∧ (0 ; 1), (0 ; 1) ∧ (0 ; 1) and (0 ; 1) ∧ (1 ; 0) - has a decomposition into a "symmetric" and an "antisymmetric" part. The antisymmetric part is spanned by u∧v-v∧u for any pair of configurations u, v, and the symmetric part is spanned by v∧v, u∧u and v∧u+u∧v. In particular the antisymmetric part is the one where the configuration is in the "unknown but opposite polarization". The important thing is that this is a 1-dimensional subspace, so all configurations in it are the same, up to a scalar multiple. So you can use any (linearly independent) v and u and you always get the same result. And if you use u=(0 ; 1) and v=(1 ; 0) you get exactly √(1/2) * ( [ A=(1 ; 0) ∧ B=(0 ; 1) ] - [ A=(0 ; 1) ∧ B=(1 ; 0) ] ), where √(1/2) is a scalar chosen to make the norm equal 1. But of course if you decohere u∧v-v∧u "along u" you will get a blob where A=u and B=v and a blob where A=v and B=u. So no matter which way you decohere this one joint configuration you will always get opposite individual configurations! Math works!
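The claim that the antisymmetric part is 1-dimensional can be checked directly. In the sketch below (helper names are mine), u∧v - v∧u is computed for two different choices of u and v and compared with the state written in the up-down/left-right basis. Writing out the coefficients shows why it works: u∧v - v∧u = (u0*v1 - u1*v0) * ( (1 ; 0)∧(0 ; 1) - (0 ; 1)∧(1 ; 0) ), so the choice of u and v only changes the overall scalar.

```python
from math import sqrt

def tensor(a, b):
    """Coefficients of a ∧ b in the order (ud,ud), (ud,lr), (lr,ud), (lr,lr)."""
    return [x * y for x in a for y in b]

def antisym(u, v):
    """u∧v - v∧u."""
    return [p - q for p, q in zip(tensor(u, v), tensor(v, u))]

def same_ray(x, y, tol=1e-12):
    """True if nonzero vectors x and y differ only by a complex scalar factor."""
    n = len(x)
    return all(abs(x[i] * y[j] - x[j] * y[i]) < tol
               for i in range(n) for j in range(i + 1, n))

s = sqrt(0.5)

# √(1/2) * ( (1 ; 0)∧(0 ; 1) - (0 ; 1)∧(1 ; 0) ) in the original basis.
singlet = [0, s, -s, 0]

# Two other pairs of opposite configurations: diagonal polarizations,
# and a pair with complex entries.
diag = antisym([s, s], [s, -s])
circ = antisym([s, 1j * s], [s, -1j * s])

print(same_ray(diag, singlet), same_ray(circ, singlet))
```

If u and v are proportional to each other, the combination is the zero vector, which is not a configuration at all. That is why u and v need to be linearly independent.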
Side note 3: If the photons were not moving in opposite directions we would have a problem - photons are bosonic, so a joint configuration space would actually only contain the symmetric part, not the antisymmetric part (that's sort of what "bosonic" means). However, this whole discussion is actually happening inside a larger joint configuration space where we have tensored with the space of positions/momenta of the photons, in addition to polarizations, and there the two photons have different configurations, so everything should be ok. Say the position configuration of the first photon is described by f, and that of the second by g. If I did everything correctly, the "total unknown opposite polarization configuration" is [(u∧f) ∧ (v∧g) + (v∧g)∧(u∧f)] - [(u∧g) ∧ (v∧f) + (v∧f)∧(u∧g)], which is isomorphic to [(u∧v) - (v∧u)] ∧ [(f∧g)-(g∧f)]. It is symmetric under exchanging the two photons, as it should be, but antisymmetric in u, v and in f, g separately.
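The identity in side note 3 can be sanity-checked numerically. This is a sketch under a simplifying assumption of mine: the position/momentum space is treated as 2-dimensional, and u, v, f, g are arbitrary numbers. The "isomorphism" is just a reordering of tensor factors from (polarization 1, position 1, polarization 2, position 2) to (polarization 1, polarization 2, position 1, position 2).

```python
from itertools import product

def tensor(a, b):
    """Coefficients of a ∧ b; index of (m, n) is len(b) * m + n."""
    return [x * y for x in a for y in b]

def combine(terms):
    """Sum a list of (sign, vector) pairs componentwise."""
    n = len(terms[0][1])
    return [sum(sign * vec[m] for sign, vec in terms) for m in range(n)]

# Arbitrary illustrative values; position space is assumed 2-dimensional.
u, v = [1, 2j], [3, 1]
f, g = [1, 1j], [2, -1]

def photon(pol, pos):
    """Full configuration of one photon: polarization ∧ position."""
    return tensor(pol, pos)

# Component (i, a, j, b) sits at index 8*i + 4*a + 2*j + b, where
# i, j are polarization indices and a, b are position indices.
lhs = combine([
    (+1, tensor(photon(u, f), photon(v, g))),
    (+1, tensor(photon(v, g), photon(u, f))),
    (-1, tensor(photon(u, g), photon(v, f))),
    (-1, tensor(photon(v, f), photon(u, g))),
])

pol = combine([(+1, tensor(u, v)), (-1, tensor(v, u))])  # u∧v - v∧u
pos = combine([(+1, tensor(f, g)), (-1, tensor(g, f))])  # f∧g - g∧f

indices = list(product(range(2), repeat=4))  # (i, a, j, b), b varies fastest
rhs = [pol[2 * i + j] * pos[2 * a + b] for i, a, j, b in indices]

# Exchange the two photons: (i, a, j, b) -> (j, b, i, a).
swapped = [lhs[8 * j + 4 * b + 2 * i + a] for i, a, j, b in indices]

matches = all(abs(x - y) < 1e-9 for x, y in zip(lhs, rhs))
symmetric = all(abs(x - y) < 1e-9 for x, y in zip(lhs, swapped))
print(matches, symmetric)
```

Both checks come out true for these values, which agrees with the claim: the total configuration is symmetric under exchanging the photons even though the polarization part alone is antisymmetric.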
He is a massive crackpot in "pseudohistory", but he is also a decent mathematician. His book on symplectic geometry is probably fine, so unless you are generally depressed by the fact that mathematicians can be crackpots in other fields, I don't think you should be too depressed.
I think this notion of "mathematical maturity" is hard to grasp for a beginning student.
I had a very similar experience. Introduction to (the Russian edition of) Fomenko & Fuchs "Homotopic topology" said that "later chapters require higher level of mathematical culture". I thought that this was just a weasel-y way to say "they are not self-contained", and disliked this way of putting it as deceptive. Now, a few years later I know fairly well what they meant (although, alas, I still have not read those "later chapters").
I wonder if there is a way to explain this phenomenon to those who have not experienced it themselves.
When reading about Transparent Newcomb's problem: Isn't this perfectly general? Suppose Omega says: I give everyone who subscribes to decision theory A $1000, and give those who subscribe to other decision theories nothing. Clearly everyone who subscribes to decision theory A "wins".
It seems that if one lives in a world with many such Omegas, and subscribing to decision theory A (vs subscribing to decision theory B) would otherwise lead to losing at most, say, $100 per day between two successive encounters with such Omegas, then one would win overall by subscribing (or self-modifying to subscribe) to A.
In other words, if subscribing to a certain decision theory changes your subjective experience of the world (not sure what the proper terminology for this is), which decision theory wins will depend on the world you live in. There would simply not be a "universal" winning decision theory.
A similar thing will happen with counterfactual mugging - if you expect to encounter the coin-tossing Omega again many times then you should give up your $100, and if not then not.
Good, now we are talking.
Shall I contribute to charities promoting assassinations of evil foreign leaders (there are still a few left) and backing democratic coups instead of the blanket pro-peace movements?
This argument is based on completely ignoring the analysis of future costs and benefits and the available alternatives. To accept this as an (implicit?) axiom seems unnatural. Imagine a powerful lobby group stopped American involvement in the Korean war and all of South Korea ended up like the North. Imagine NATO did not strike Serbia and Milosevic continued to reign. Even the Iraq war did have some positive effect - Hussein was evil, and potentially the new government in Iraq would lead to less suffering, both internally and because of other - local and global - conflicts avoided. The particulars of these arguments are debatable (the Iraq government may collapse into chaos; even if it does not, we will never know the ultimate costs of keeping or not keeping Saddam in power), but the larger point stands. Other comments mentioned promoting democracy as a means of promoting peace. War can be a radical means of promoting democracy (at least in Serbia it seems to have worked), and this should not be ignored.