Hi. I'm Gareth McCaughan. I've been a consistent reader and occasional commenter since the Overcoming Bias days. My LW username is "gjm" (not "Gjm" despite the wiki software's preference for that capitalization). Elsewhere I generally go by one of "g", "gjm", or "gjm11". The URL listed here is for my website and blog, neither of which has been substantially updated for several years. I live near Cambridge (UK) and work for Hewlett-Packard (who acquired the company that acquired what remained of the small company I used to work for, after they were acquired by someone else). My business cards say "mathematician" but in practice my work is a mixture of simulation, data analysis, algorithm design, software development, problem-solving, and whatever random engineering no one else is doing. I am married and have a daughter born in mid-2006. The best way to contact me is by email: firstname dot lastname at pobox dot com. I am happy to be emailed out of the blue by interesting people. If you are an LW regular you are probably an interesting person in the relevant sense even if you think you aren't.

If you're wondering why some of my very old posts and comments are at surprisingly negative scores, it's because for some time I was the favourite target of old-LW's resident neoreactionary troll, sockpuppeteer and mass-downvoter.

I think this is oversimplified:

High decouplers will notice that, holding preferences constant, offering people an additional choice cannot make them worse off. People will only take the choice if it's better than any of their current options.

This is obviously true if somehow giving a person an additional choice is literally the only change being made, but you don't have to be a low-decoupler to notice that that's very very often not true. For a specific and very common example: often other people have some idea what choices you have (and, in particular, if we're talking about whether it should be legal to do something or not, it is generally fairly widely known what's legal).

Pretty much everyone's standard example of how having an extra choice that others know about can hurt you: threats and blackmail and the like. I might prefer not to have the ability to pay $1M to avoid being shot dead, or to prove I voted for a particular candidate to avoid losing my job.

This is pretty much parallel to a common argument for laws against euthanasia, assisted suicide, etc.: the easier it is for someone with terrible medical conditions to arrange to die, the more opportunities there are for others to put pressure on them to do so, or (this isn't quite parallel, but it seems clearly related) to make it appear that they've done so when actually they were just murdered.


Then it seems unfortunate that you illustrated it with a single example, in which A was a single (uniformly distributed) number between 0 and 1.


I think this claim is both key to OP's argument and importantly wrong:

But a wavefunction is just a way to embed any quantum system into a deterministic system

(the idea being that a wavefunction is just like a probability distribution, and treating the wavefunction as real is like treating the probability distribution of some perhaps-truly-stochastic thing as real).

The wavefunction in quantum mechanics is not like the probability distribution of (say) where a dart lands when you throw it at a dartboard. (In some but not all imaginable Truly Stochastic worlds, perhaps it's like the probability distribution of the whole state of the universe, but OP's intuition-pumping example seems to be imagining a case where A is some small bit of the universe.)

The reason why it's not like that is that the laws describing the evolution of the system explicitly refer to what's in the wavefunction. We don't have any way to understand and describe what a quantum universe does other than in terms of the evolution of the wavefunction or something basically equivalent thereto.
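To make that concrete: the equation governing the evolution takes the wavefunction itself as its argument,

```latex
i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\,\psi
```

whereas nothing in the dynamics of the dart ever mentions the probability distribution of where it lands.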

Which, to my mind, makes it pretty weird to say that postulating that the wavefunction is what's real is "going further away from quantum mechanics". Maybe one day we'll discover some better way to think about quantum mechanics that makes that so, but for now I don't think we have a better notion of what being Truly Quantum means than to say "it's that thing that wavefunctions do".

I have the impression -- which may well be very unfair -- that at some early stage OP imbibed the idea that what "quantum" fundamentally means is something very like "random", so that a system that's deterministic is ipso facto less "quantum" than a system that's stochastic. But that seems wrong to me. We don't presently have any way to distinguish random from deterministic versions of quantum physics; randomness or something very like it shows up in our experience of quantum phenomena, but the fact that a many-worlds interpretation is workable at all means that that doesn't tell us much about whether randomness is essential to quantum-ness.

So I don't buy the claim that treating the wavefunction as real is a sort of deterministicating hack that moves us further away from a Truly Quantum understanding of the universe.

(And, incidentally, if we had a model of Truly Stochastic physics in which the evolution of the system is driven by what's inside those probability distributions -- why, then, I would rather like the idea of claiming that the probability distributions are what's real, rather than just their outcomes.)


I don't know exactly what the LW norms are around plagiarism and plagiarism-ish things, but I think that introducing that basically-copied material with

I learned this by observing how beginners and more experienced people approach improv comedy.

is outright dishonest. OP is claiming to have observed this phenomenon and gleaned insight from it, when in fact he read about it in someone else's book and copied it into his post.

I have strong-downvoted the post for this reason alone (though, full disclosure, I also find the one-sentence-per-paragraph style really annoying and that may have influenced my decision[1]) and will not find it easy to trust anything else I see from this author.

[1] It feels to me as if the dishonest appropriation of someone else's insight and the annoying style may not be completely unrelated. One reason why I find this style annoying is that it gives me the strong impression of someone who is optimizing for sounding good. This sort of style -- punchy sentences, not too much complexity in how they relate to one another, the impression of degree of emphasis on every sentence -- feels like a public speaking style to me, and when I see someone writing this way I can't shake the feeling that someone is trying to manipulate me, to oversimplify things to make them more likely to lodge in the brain, etc. And stealing other people's ideas and pretending they're your own is also a thing people do when they are optimizing for sounding good. (Obviously everything in this footnote is super-handwavy and unfair.)

In case anyone is in doubt about abstractapplic's accusation, I've checked. The relevant passage is near the end of section 3 of the chapter entitled "Spontaneity"; in my copy it's on page 88. I'm not sure "almost verbatim" is quite right, but the overall claim being made is the same, "fried mermaid" and "fish" are both there, and "will desperately try to think up something original" is taken verbatim from Johnstone.


Wow, that was incredibly close.

I think simon and aphyer deserve extra credit for noticing the "implicit age variable" thing.


There are a few things in the list that I would say differently, which I mention not because the versions in the post are _wrong_ but because if you're using a crib-sheet like this then you might get confused when other people say it differently:

  • I say "grad f", "div f", "curl f" for ∇f, ∇·f, ∇×f. I more often say "del" than "nabla", and for the Laplacian ∇²f I would likely say either "del squared f" or "Laplacian of f".
  • I pronounce "cos" as "coss" not as "coz".
  • For derivatives I will say "dash" at least as often as "prime".

The selection of things in the list feels kinda strange (if it was mostly produced by GPT-4 then that may be why) -- if the goal is to teach you how to say various things then some of the entries aren't really pulling their weight (e.g., the one about the z-score, or the example of how to read out loud an explicit matrix transpose, when we've already been told how to say "transpose" and how to read out the numbers in a matrix). It feels as if whoever-or-whatever generated the list sometimes forgot whether they were making a list of bits of mathematical notation that you might not know how to say out loud or a list of things in early undergraduate mathematics that you might not know about.

It always makes me just a little bit sad when I see Heron's formula for the area of a triangle. Not because there's anything wrong with it or because it isn't a beautiful formula -- but because it's a special case of something even nicer. If you have a cyclic quadrilateral with sides a, b, c, d then (writing s = (a+b+c+d)/2) its area is √((s−a)(s−b)(s−c)(s−d)). Heron's formula is just the special case where two vertices coincide so d = 0. The more general formula (due to Brahmagupta) is also more symmetrical and at least as easy to remember.
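A quick numerical sanity check of both formulas (the function names here are mine, purely for illustration):

```python
from math import sqrt

def brahmagupta(a, b, c, d):
    """Area of a cyclic quadrilateral with side lengths a, b, c, d."""
    s = (a + b + c + d) / 2
    return sqrt((s - a) * (s - b) * (s - c) * (s - d))

def heron(a, b, c):
    """Heron's formula: the special case of Brahmagupta with d = 0."""
    return brahmagupta(a, b, c, 0)

print(heron(3, 4, 5))           # 3-4-5 right triangle: area 6.0
print(brahmagupta(1, 1, 1, 1))  # unit square (which is cyclic): area 1.0
```

Note that setting d = 0 turns the factor (s − d) into s, recovering Heron's √(s(s−a)(s−b)(s−c)) exactly.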


With rather little confidence, I estimate for turtles A-J respectively:

22.93, 18.91, 25.47, 21.54, 17.79, 7.24, 30.36, 20.40, 24.25, 20.69 lb

Justification, such as it is:

The first thing I notice on eyeballing some histograms is that we seem to have three different distributions here: one normal-ish with weights < 10lb, one maybe lognormal-ish with weights > 20lb, and a sharp spike at exactly 20.4lb. Looking at some turtles with weight 20.4lb, it becomes apparent that 6-shell-segment turtles are special; they all have no wrinkles, green colour, no fangs, normal nostrils, no misc abnormalities, and a weight of 20.4lb. So that takes care of Harold.

Then the small/large distinction seems to go along with (gray, fangs) versus (not-gray, no fangs). Among the fanged gray turtles, I didn't find any obvious sign of relationships between weight and anything other than number of shell segments, but there's a clear linear relationship. Variability of weight doesn't seem interestingly dependent on anything. Residuals of the model a + b*segs look plausibly normal. So that takes care of Flint.

The other pets are all green or grayish-green so I'll ignore the greenish-gray ones. These look like different populations again, though not so drastically different. Within each population it looks as if there's a plausibly-linear dependence of weight on the various quantitative features; nostrils seem irrelevant; no obvious sign of interactions or nonlinearities. The coefficients of wrinkles and segments are very close to a 1:2 ratio and I was tempted to force that in the name of model simplicity, but I decided not to. The coefficient of misc abs is very close to 1 and I was tempted to force that too but again decided not to.

Given the estimated mean, the residuals now look pretty normally distributed -- the skewness seems to be an artefact of the distribution of parameters -- with stddev plausibly looking like a + b*mean. The same goes for the grayish-green turtles, but with different coefficients everywhere (except that the misc abs coeff looks like 1 lb/abnormality again).
Finally, if we have a normally distributed estimate of a turtle's weight then the expected monetary loss is minimized if we estimate mu + 1.221*sigma.
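Where a constant like 1.221 comes from, assuming (my framing, not stated above) a loss that is linear with some under:over cost ratio r: the expected loss is then minimized at the r/(r+1) quantile, i.e. at mu + z*sigma where Φ(z) = r/(r+1). Solving numerically by bisection:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def optimal_z(ratio, lo=-10.0, hi=10.0):
    """Bisect for the z with Phi(z) = ratio/(ratio+1): the quantile
    minimizing expected linear loss with under:over cost ratio `ratio`."""
    target = ratio / (ratio + 1)
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# An 8:1 cost ratio gives z very close to 1.221, since Phi(1.221) ≈ 8/9.
print(round(optimal_z(8), 3))
```

So 1.221 is consistent with underestimates costing about eight times as much as overestimates; I'm inferring that ratio from the quantile, not asserting it's the scenario's actual rule.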

I assume

that there's a more principled generation process, which on past form will probably involve rolling variable numbers of dice with variable numbers of sides, but I didn't try to identify it.

I will be moderately unsurprised if

it turns out that there are subtle interactions that I completely missed that would enable us to predict some of the turtles' weights with much better accuracy. I haven't looked very hard for such things. In particular, although I found no sign that nostril size relates to anything else it wouldn't be very surprising if it turns out that it does. Though it might not! Not everything you can measure actually turns out to be relevant! Oh, and I also saw some hints of interactions among the green turtles between scar-count and the numbers of wrinkles and shell segments, though my brief attempts to follow that up didn't go anywhere useful.

Tools used: Python, Pandas, statsmodels, matplotlib+seaborn. I haven't so far seen evidence that this would benefit much from

fancier models like random forests etc.


Yes, I know what the middle-child phenomenon is in the more literal context. I just don't have any idea why you're using the term here. I don't see any similarities between the oldest / middle / youngest child relationships in a family and whatever relationships there might be between programmers / lawyers / alignment researchers.

(I think maybe all you actually mean is "these people are more important than we're treating them as". Might be true, but that isn't a phenomenon, it's just a one-off judgement that a particular group of people are being neglected.)

I still don't understand why the distribution of talent/success/whatever among law students is relevant. If your point is that very few of them are going to be in a position to make a difference to AI policy then surely that actually argues against your main claim that law students should be getting more attention from people who care about AI.


Having read this post, I am still not sure what "the Middle Child Phenomenon" actually is, nor why it's called that.

The name suggests something rather general. But most of the post seems like maybe the definition is something like "the fact that there isn't a vigorous effort to get law students informed about artificial intelligence".

Except that there's also all the stuff about the distribution of talent and interests among law students, and another thing I don't understand is what that actually has to do with it. If (as I'm maybe 75% confident) the main point of the post is that it would be valuable to have law students learn something about AI because public policy tends to be strongly influenced by lawyers, then it seems like this point would be equally strong regardless of how your cohort of 1000 lawyers is distributed between dropouts, nobodies, all-rounders, CV-chasers, and "golden children". (I am deeply unconvinced by this classification, by the way, but I am not a lawyer myself and maybe it's more accurate than it sounds.)
