Post will be returning in Main, after a rewrite by the company's writing staff. Citations Galore.

321 comments

Did some research. The claim that the proposals are poorly written leaps out at me as immediately true. Here's a website with successful grant applications, to be used as models to write them:

This is the first grant I evaluated (not literally the first I pulled up, but the first I felt competent to evaluate, since it concerns primarily technology):

First, the horrible spelling, grammar, and punctuation leap out at me immediately. Second, the claim in the post that grant proposals are written to describe what they're doing, rather than what they're intending to achieve, holds up, for this grant at least.

This proposal is the best-written I encountered. It describes the specific problems it intends to resolve and the specific solutions it intends to use. Unfortunately, the only evidence it introduces is the evidence that there is a problem. It doesn't provide any evidence that its solutions work. Its stated "Method of Evaluation", moreover, exactly mirrors the claims made in this post - it evaluates whether or not its solutions are implemented,... (read more)

I just read that grant in its entirety. I noticed one possible typo, but did not find other bad grammar or spelling. They are asking for a grant to get equipment, primarily computers and software, for use in teaching students. It is not really a research project. What is the outcome hoped for from a grant like that? That students will be taught using these computers. They make a feint at claiming it will raise grades or enrollment, but really, if I were a science teacher, my real goal would be to get the stuff and sit students down in front of it and teach them with it. I think that is pretty accurately reflected. I'll look at the iPad grant, and kudos for finding the site and bringing me that much closer to real contact with the kinds of grants under discussion.

First, the horrible spelling, grammar, and punctuation leap out at me immediately.

Me too. Good thing they're not trying to improve writing ability!

I just read that grant in its entirety. I noticed one possible typo, but did not find other bad grammar or spelling.

The VERY FIRST SENTENCE has minor punctuation issues and refers to "Excellence in Leaning (sp) Through Technology" - I refuse to believe that the original Senate bill being referred to failed to spell the word "Learning" correctly in its title. :-)

The second sentence puts a space before the colon for no apparent reason.

"The moneys this school is requesting" => should probably be "money", though I'd accept argument to the contrary. "With request to ..." => should probably be "With RESPECT to"

"This shows community support for improvement and a move forward with the support of a technology plan." => You can tell what the writer is trying to say, but the writer is not actually saying it; the sentence is just broken.

"Teachers will...learn ho to integrate this technology" => should be "learn HOW to integrate..."

That's ju... (read more)

Agree with everything but: If you look in an old enough style guide (the current standard is as you say), it will say to use an apostrophe when you pluralize an acronym. Wikipedia agrees.
Quotes from the .pdf, with my corrections:

- respect, regard, or reference
- how
- hardware, software
- in–depth
- principles
- day–to–day
- after $2,250

Consistently 'moneys' is used where 'money' or 'monies' seems correct to me; I did not count this as an error despite not following a strict style guide. Most other 'errors' are very reasonably scanning errors rather than writing errors; the only error that couldn't plausibly be a scanner error would be 'principal' for 'principle'. Overall, the writing was simplistic, sentences were short and simple, and would pass a technical writing test. Presented as a model for what complexity and intelligence level of grants are approved, that is very informative. Grant proposals (apparently) should be simple, repetitive, and full of Capitalized Buzzwords that are Important to the Right People.
I actually counted the really short sentences heavily against them mentally, probably too much. Owing to the way I parse sentences, reading the grant was like listening to William Shatner at his... not quite hammiest, but pretty close. As for the 2.250 thing, that's actually not that uncommon outside English-speaking nations; there is a long list of countries which use periods as thousands separators and commas as decimal marks. (That may actually help to explain the short sentences, come to think of it.) An alternative explanation is offered here: (Specifically, that the document may have been electronically scanned; this could also account for other apparent spelling mistakes. Handwriting recognition is getting better, but is still far from perfect.)
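For anyone unfamiliar with the two conventions, a minimal illustration of why "2.250" is ambiguous (the function and its name are mine, not from any cited source; it just strips the grouping character and normalizes the decimal mark):

```python
def parse_number(s, decimal_comma=False):
    """Interpret a numeral string under one of two conventions.

    decimal_comma=False: US/UK style, '.' is the decimal mark, ',' groups thousands.
    decimal_comma=True:  continental style, ',' is the decimal mark, '.' groups thousands.
    """
    if decimal_comma:
        # Drop the grouping periods, then turn the decimal comma into a period.
        s = s.replace(".", "").replace(",", ".")
    else:
        # Drop the grouping commas; the period is already the decimal mark.
        s = s.replace(",", "")
    return float(s)

# The same digits mean very different amounts under the two conventions:
parse_number("2.250")                      # US reading: 2.25
parse_number("2.250", decimal_comma=True)  # continental reading: 2250.0
```

So a writer used to the continental convention (or a scanner mangling "$2,250") would naturally produce "2.250" where an American reader expects "2,250".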
How, exactly, is it that these people get hired in the first place?

I'll put it this way: in the average GRE scores by intended field, education ranks below philosophy & STEM in every subtest, and various forms of education rank very low (early childhood education is, out of 50 groups, second from the bottom in 2 subtests and fifth from the bottom in the last subtest).

Average GRE is useless. Elementary teachers have far lower GRE scores than secondary school teachers, and are about average in verbal and below average in math. Secondary school content teachers are above average in verbal and average in math. However, close to half of all secondary school teachers get higher than 600 on the math section, which is more than the number of math and science teachers. While I suppose it's possible that math and science teachers have terrible math scores and the English/history teachers are scoring those 600+ scores, I'm figuring it's far more likely that math and science high school teachers have eminently respectable GRE scores in math, and that English/history teachers have higher than average verbal. Anyone who claims that teachers are stupid is using propaganda instead of ETS data. Cite for SAT scores and for GRE scores
What makes "higher than 600 on ONE section" a cutoff above which counts as an "eminently respectable" score? Would you accept "mediocre"? ;-)
Education even ranks below Religion in every category. Also, Economics is only quantitatively better than Religion.
Not surprising, given my experience. Most religion majors I've met were relatively smart and often made fun of the more fundamentalist/evangelical types, who typically were turned off by their religion classes. Religion majors seemed like philosophy-lite majors (which is consistent with the rankings). Edit: Also, relative to Religion, econ has a bunch of poor English speakers that pull the other two categories down. (Note: the "analytical" section is/was actually a couple of very short essays)
Yes, given that economics is apparently one of the most lucrative fields around going by Caplan's recent post on majors, it's interesting that the econ students aren't ranked even higher.
Eliezer Yudkowsky
Chicken-and-egg problem: Non-economics majors don't think economically enough to choose fields on the basis of their remuneration?
That seems to explain why Econ majors get a premium, but that doesn't seem to explain why econ majors don't rank higher, or am I missing something?
Those who can't do... Another category that jumped out at me - see all the public/education/business administration near the bottom. These are the people running institutions. The world is the way it is for a reason.
I notice that I'm confused: the maximum score on the Quantitative section is 800 (at that time), and Ph.D. econ programs won't even consider you if you're under a 780. The quantitative exam is actually really easy for math types. When you sign up for the GRE, you get a free CD with 2 practice exams. When I took it, I took the first practice exam without studying at all and got a 760 or so on the quantitative section (within 10 pts). After studying I got an 800 on the second practice exam, and on the actual exam I got a 790. The questions were basic algebra for the most part, with a bit of calculus and basic stat at the top end and a tricky question here and there. The exam was easy - really easy. I was a math major at a tiny / terrible liberal arts school, nothing like MIT or any self-respecting state school, so it seems like it should be easy for anyone with a halfway decent mathematics background.

Now you're telling me people intending to major in econ in grad school average a 706, and people intending to major in math average a 733? That's low. Really low relative to my expectations. I would have expected a 730 in econ and maybe a 760 in math. Possible explanations:

1) Tons of applicants who don't want to believe that they aren't cut out for their field create a long tail on the low side, while the high side is capped at 800.

2) Master's programs are, in general, more lenient, and there are a large number of people who only intend to go to them, creating the same sort of long-tail effect as in 1).

3) There are way more low-tier graduate programs than I thought in both fields, willing to accept the average or even below-average student.

4) Weirdness in how these fields are classified (e.g. I don't see statistics there anywhere; is that included in math?)

5) The quantitative section of the standard GRE actually doesn't matter if you're headed to a math or physics program (someone in that field care to comment?).

Note: the quantitative section of the standard
I was actually too lazy to study for my GRE, so I think I got like in the 600s on the math section (it had been a long time since I had studied any of that stuff); I realized while taking it that this was a stupid mistake and I was perfectly capable of answering everything, but the GRE cost too much for me to want to take it a second time. Oh well. Statistics does not seem to be broken out in the latest GRE scores I found: I think statistics is almost always part of the math department. My guess is that there are a lot of grad schools (consider law schools, the standard advice is to not bother unless you can make the top 10, yet there are scores if not hundreds of active law schools), and few actually intend to do a PhD.

Luckily, my firm started collecting data on teacher aptitude some time ago, and basically you can separate all advanced math teachers easily into two categories:

1. Okay with blacks in their classroom. Blacks and whites both end up succeeding at equal rates.

2. Not okay with blacks in their classroom. Whites end up succeeding, blacks end up failing.

There are a number of research groups tracking teachers and student test scores. If such results had been released anywhere, wouldn't they be front page national news? And this seems like something that, e.g. the Gates Foundation would want known: if true, it's a magic bullet.

Why haven't academics and foundations studying teacher quality and value-added metrics reported such results?

A certain principle...remember that as principle of a school

The word is "principal."

The Principal is your pal. Separate is a rat.
Ugh. There are three types of lies in the world: lies, damn lies, and people falsely claiming that their incentives are aligned with yours.

There are three types of lies in the world: lies, damn lies, and people falsely claiming that their incentives are aligned with yours.

There are three types of lies in the world: lies, damn lies, and mnemonics.

Thank you. I know it's not very profound, but since someone did spell it wrong, and another felt the need to correct them, I thought I'd throw it out there.

The specific project I was evaluating had only gotten $800,000 out of the maximum $2m. Its strategy was to purchase iPod Touches for the male students and makeovers, manicures, and pedicures at a local beauty parlor for the female students; all students were offered an additional iPod Touch or makeover, respectively, if they passed the exam at the end of the current year. The grant proposal had specifically listed these actions as being the goal of the proposal. If the iPods and makeovers were purchased, that constituted success.

If true and documentable, I think there's a large section of the Internet which would be very, very interested and very, very loud about this because the males got iPods and the females got makeovers. (And justly so.)

Generating drama over inconsequential bullshit while the world burns is not justifiable. "One can lie by omission just by neglecting to choose a sufficiently important subject matter." — @aristosophy

It wouldn't be a more effective program if it had iPods for both genders. Getting up in arms over the different gender incentives would target an en vogue failure point that isn't relevant, instead of the one relevant criticism: that the program doesn't work.

You know, it might. Most teenagers are interested in iPods; not all female teenagers are interested in makeovers. In addition, an iPod can be sold if you don't want to use it. That said, absolutely correct about that not being the point.
And is incapable of being designed to work by those paid to do so.
Bingo. On the other hand, one possible way to start solving the education-grant problems is to shine press attention on them. And this would be quite a handy hook for that.
If I hadn't recently seen that "students fighting segregated prom" story from credible news sources, I'd have considered this part of the story to be nearly conclusive evidence of trolling. I should be more charitable than that. It's still evidence, though. Who could fail to anticipate the devastatingly bad PR from "iPods vs Makeover/mani/pedis"? For that matter, why didn't the devastatingly bad PR occur? Surely the students and their parents weren't under NDA too. Yet a Google search for 'ipod makeover school -"chic school girls"' doesn't seem to find anything relevant, with or without outraged commentary attached. This random lesswrong page comes up for me in the first couple dozen hits, even on a browser with no Google login or cookies that might trigger personalized rankings. Nobody ever felt it was worth blogging about how their kids were being given these prizes at school?
If that program exists, it was tiny, which increases the odds that there would be no public notice. I suspect we overestimate how much of the world (not just the proportion of people, but the proportion of what's going on) is online.
I can imagine some lesswrong users thinking it would be terribly clever for them to create a sock-puppet and "test" how gullible the lesswrong readerbase is via a post like this. If such a case were ever identified I would like to see the user banned by IP rather than rewarded with status and congratulations.

Banning by IP is useless at best and harmful in most cases (where hapless customers of an ISP get the old IP address of a troll). There is no way to prevent attempts to test how gullible we are; therefore we need to be generally immune to all attempts and not only to specific cases or instigators.

I'd estimate a 30% probability (but with a fairly large variance) that at least some elements of this article are inaccurate and an attempt at trolling. The "best" trolls are 90% truth with one or two outrageous elements using the halo effect to gain belief.

CarlShulman's comment has no satisfactory replies yet. What is the probability that the article's profound accusations are a) completely true, b) otherwise unreported, and c) first reported on lesswrong? It seems more likely that an actual whistle-blower would choose a more widely read media outlet (Wikileaks even?), and probably more than one. b) and c) could just be my inability to find similar information reported elsewhere. It seems to be fairly common knowledge that the education system is broken but not with the specific detail in the article.

The more widely read the media outlet, the higher the probability of the whistle-blower's anonymity getting blown.
Once it's released anywhere the risk of losing anonymity is basically the same in the end. Either it's a troll and will die here or it's true and will be disseminated everywhere. Such a strategy would only make sense if the original poster thought that this forum would replicate the research and publish it with no mention of the original source, but that seems more like the kind of thing an investigative journalist would do.
Huh? Why?
The only people aware that the project happened, as far as I know, are myself, my boss, the man in charge, and the 56 students (who were in 6-8th grade at the time, and all from poor black families). The issuer of the grant was the local government, and they issue so many grants that I seriously doubt there's anyone looking at all of them.
If a student with poor parents in that age group gets a free iPod, his peers and parents are likely to notice. The idea that you can give a school a grant worth $800,000 with only one adult in the school knowing about it also seems strange. Even if a lot of the students got the second iPod or makeover, and even though iPod Touches were more expensive at the time, the whole thing might have cost $20,000. That means $780,000 just disappeared. If that kind of money disappears, people are bound to be interested.
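The arithmetic behind this objection can be made explicit. The student count (56) and grant size ($800,000) come from the thread above; the per-unit price is purely a guess on my part:

```python
# Rough sanity check of the figures in the thread. The student count and
# grant size are from the thread; the iPod price is an assumed round number
# for the era, treated as an upper bound covering a makeover as well.
students = 56
grant = 800_000
unit_price = 300            # assumed cost per iPod Touch or makeover
incentives_per_student = 2  # the initial prize plus the pass-the-exam bonus

max_outlay = students * incentives_per_student * unit_price  # 33,600
unaccounted = grant - max_outlay                             # 766,400
```

Even under generous assumptions the incentives absorb only a few percent of the grant, which is why the question of where the rest went is so pointed.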
Maybe the person applying for the grant was himself black. Then it would be considered acceptable for the same reason rap lyrics are considered acceptable.
I wouldn't be surprised if the description was simplified for brevity, and what actually happened is they were given the choice or something.
And they will be completely uninterested in the (if true) far more important elements of the story. 1938 is not the time to purge the Bolsheviks!

I believe you're posting this because you want the problems fixed; but in order for that to happen, you need to make it happen. There are a variety of escalation strategies to consider: you might go to your state's secretary of education; you might go to your state's attorney general; you might go to the press. Note that NDAs usually do not hold up when you're making accusations of criminal activity, and some of the accusations you have made are criminal in nature. There are "whistleblower protection" laws, which vary from state to state. Also note that if someone is your lawyer, then you have the absolute iron-clad right to tell them everything ("attorney-client privilege") regardless of what you may have signed. And if what you say is true, then there are probably quite a few groups that would be happy to provide you with a lawyer, or lawyers that would work on contingency. In any case, talking to a lawyer should be your next step before taking any major action; it's just a question of which one.

Your career as an auditor that people pay to evaluate themselves is completely doomed and you should sacrifice it before it explodes. ... (read more)

Pulling a bunch of money out of the system with a lawsuit is not a winning outcome if it leaves the existing corrupt power structure in place. Be warned that for many lawyers, the goal will be money, not improvement. Do not use a lawyer whose goals are different than your own.

You should be collecting evidence. If you are ever alone with an incriminating document, photograph every page. You may later wish to allow these documents a chance to go missing (eg, by making a FOIA request) before you reveal that you've copied them. If you are in a state where it is legal to do so without telling anyone, record every interesting conversation. A recorded statement like "the evaluator's job is to collude with the submitter" is a political instakill if given to the press, and the threat of releasing it is significant leverage.

Pay attention to the specific people involved, especially the ones who are higher up, and the ones who are keeping themselves hidden. Try to figure out who's good, who's evil, who's smart, and who's dumb. Try to predict how each will react to scandal. Assume that by default, the evildoers will successfully deflect blame onto the stupid, unless you have specific incriminating evidence.

This is good advice in general for dealing with the tort portion of the US legal system. But we shouldn't forget the general deterrence effect of money damages. And in education law in particular, money damages for being a negligent educator are essentially impossible to get, so the extreme money grubbers aren't in this market. In special education, money damages simply for having a stupid education plan are unavailable under the most important special education statute. That said, the rest of this post is excellent advice.

In particular, if any of the children evaluated are in special education, parents have fairly strong leverage to punish school districts who are doing stuff that no one could think would work. Attorneys and non-attorney advocates in all 50 US states are available to help the parents, who often have no idea what their rights are or that the school district is being foolish.

Separately, the Americans with Disabilities Act prohibits retaliation against anyone who acts to protect a disabled individual's rights. Alas, proving this is a difficult matter. You might consider looking at my post on one way to document verbal statements in writing.

And for clarity, disabled in this context is a broader label than cognitively impaired, blind, or deaf. If a child has a medical condition that impairs their ability to learn, they are a "child with a disability" under the Individuals with Disabilities Education Act.

Disclaimer: This is a general statement of the law, not legal advice. Consult your own attorney, because reading this does not create an attorney-client relationship.


Technical person meets a bureaucracy. Good clean fun, like the Mr. Bill show. I wish I had been there when Thomas Sowell interned for the Department of Labor.

The only things about your story that surprised me were that you weren't shit-canned within a month, and that an actual company exists that would hire you. You, and by extension them, rocked the boat and survived. That's not what anyone is paying you for. You're there to validate that they're doing the right thing. I don't know how you and your company have survived this long, but I'd like to thank you all for saving some students from the regularly scheduled destruction of their lives.

As for your conversations with the bureaucracy, do you really think their confusion was in not understanding your point? I'd guess that any confusion they had was in how you had a job there at all, while you were busy saying things that shouldn't be said. I think you were the one not "getting it".

Every so often someone says something that opens a new world to me. I'll pass on the new world to you.

The purpose of a bureaucracy is to further the interests of the bureaucracy, whatever goals they give lip service to. But even theoretically... (read more)

Not to lower signal-to-noise, but - I really liked this comment. It shows of a fine mind made cynical, a delicate sarcasm born of an impinging upon by a horrific, Cthulhian reality.

"People are crazy, the world is mad."

Thank you. At first I didn't agree with the "horrific, Cthulhian reality", but I think that's one of the Orwellian problems. Bureaucracies are infuriating and frustrating and horrific, but they are the way they are for reasons, like gravity. If you're willing to stare into the abyss, it's not hard at all to see why things are the way they are. An institution is a machine, indifferent to our wants and intentions. Make the machine wrong, and it will be a meat grinder for everyone involved.

But there's the other, more human Orwellian horror. People really are different in the head, running different algorithms. Or more to the point, they aren't aliens, I am. That manager's meeting was a peek behind the curtain to true and habitual Doublethink. In their heads, the epistemic truth algorithm was completely inoperative. The Lie was completely True until I opened my yap. People can be frustrating and infuriating - but they too are what they are for reasons, like gravity, and they're not mysterious at all if you stare into their abyss. The Hitch had a catchy phrase - "just two chromosomes away from a chimpanzee". We aren't so smart, rational, or sane. All sorts of "mysteries" dissolve in the light of that.
As for your conversations with the bureaucracy, do you really think their confusion was in not understanding your point? I'd guess that any confusion they had was in how you had a job there at all, while you were busy saying things that shouldn't be said. I think you were the one not "getting it".

I'm torn myself; I could see it being due to self-interested playing dumb, but also the genuine kind, given that we are talking about gross failings of basic education, after all, which has to have consequences down the line.

I don't know about the rest of the country, but this fails entirely to be surprising [eta: given the region of the country I grew up in, Texas]. My family has seen too much shit go on in schools.

The most egregious case that happened -to us- was a school that put one of my siblings in a fucked-up experiment (paid for by a grant!) without my parents' consent, and indeed told all the children involved not to tell their parents or bad things would happen to them (we grew up being taught very firmly to question authority, so of course my parents found out, and a shitstorm was raised - with nothing ultimately happening. One of several reasons we moved out of that school district.). The experiment involved shit like telling the (extremely young) children to imagine they were in a crashing airplane, and there's nothing they could do, they were going to die, and how did they feel about this?

Some of it seems exaggerated, but the basics - half-assed school grants funding ridiculous shit - ring a little too true to me to outright reject the post. I've seen too many things happen in schools that remain completely unreported on, like prayer in school, to think that the scarcity of information on the internet means anything, as well.

The laughter brought tears to my eyes. Obviously, that's an obscene thing to do to children. Shades of the Milgram Experiment. Martian.

Man, I registered just so I could vote and then it turns out there's something called karma.

This post is almost entirely nonsense. I give it "almost" simply because in certain all-URM school districts the corruption level is high. It's within the realm of possibility that "fake grants" exist for "fake grant programs" that are nothing more than chump change doled out by large employers who can wave the program in front of Jesse Jackson and his ilk ("look! We're providing gravy!"), so I won't call it an outright lie. But it's certainly not the norm. Did you notice that this guy acts like the education world is composed solely of blacks and whites? If any element of his story is true, it's because he lives or works in an all-black school district that is, indeed, corrupt. Detroit, New Jersey somewhere, or the like. And that's a generous interpretation.

The second half of his post is so risible I'm amazed anyone takes it seriously. We live in a world where, as I write this, federal settlements are forced on schools that suspend or expel minorities at a higher rate, never mind the details, and anyone believes that schools assign classes by race? It's not just wrong.... (read more)


I upvoted this comment, because I'm interested in hearing a dissenting view on this, but ... I find this to be pretty poor dissent.

You should tone down your accusations, and especially make them more precise - on the face of it, I'm not sure to what extent the things that you're saying (like "the pressure to integrate classes when the kids are unprepared is huge") actually contradict the OP, as opposed to merely being evidence that supports a different interpretation (and you'll find arguments for both sides on any disagreement).

Mostly, from my French point of view, I'm seeing American politics cloud up issues here, and I would much rather see a dispassionate discussion of the facts rather than flinging accusations back and forth. Too much "THIS IS A LIE AND YOU ARE ALL IDIOTS", not enough "this particular specific statement appears to be false, and here is why".

Likewise upvoted, likewise would prefer higher-quality criticism. Given how incredibly important education is and how few citations there are here (at the moment, from both 'sides'), forgetting to actually think about something for five minutes before updating your beliefs would be a very bad idea.

Is it possible that different parts of the USA have different situations, because of a different state, a different county, or just depending on whether the parents at the specific school are politically savvy, know their rights, and fight for them?

Sometimes the official rules are the same for everyone, and yet what actually happens depends more on the local culture. Maybe the lawsuits get big media attention, but in reality they happen rarely and require a lot of effort on the parents' side (or a coincidence that some political group decides to push this cause), so most parents don't even try.

Is it possible that different parts of the USA have different situations, because of a different state, a different county, or just depending on whether the parents at the specific school are politically savvy, know their rights, and fight for them?

In a country where some school districts have higher college acceptance rates than others have high school graduation rates, I would say this is a near-certainty.


I actually think there's a decent chance this story is a hoax, but not because it is remotely implausible. It sounds exactly like everything I've heard about the NYC school system.

Upvoted because I'd like to see the OP address your questions.

I apologize for the late response. I do not know where you come from, but I have personally reviewed the math placement criteria of hundreds of middle schools and high schools. Teacher recommendations are always on the list, whereas I have never seen a school which used "principal recommendations". Wake County, NC's placement criteria: Alamance County's placement criteria: I will find more if you'd like me to, but teacher recommendations are plainly listed. In my experience, principals generally back their math teachers when it comes to which students get placed where.

The schools do not outright assign math placement based on race; it is slightly more subtle than this. An example would be Wake County, in North Carolina. Wake County used a model called the "effectiveness index". A student is given a score based on:

1) Their previous test scores
2) Their income level (trinary: free lunch, reduced-price lunch, normal)
3) Their race.

If two students with exactly equal grades and test scores were evaluated using the effectiveness index, with one student being a poor black and the other being a middle-class white, the former would be given a lower effectiveness-index score, and therefore would be less likely to be placed into an advanced class.

These scores were also used to determine how well a school is doing at teaching. If the poor black student did as well as the white student, the difference between his actual score and his effectiveness-index prediction would be larger than the white student's, and so the school would be rewarded for overcoming the "risk factors" of being poor and black and managing to instruct him anyway. Wake County is currently doing away with the effectiveness index, replacing it with EVAAS, a system which takes into account nothing but test scores. Source:

Can you point me to a federal settlement forced on a school that suspend
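The mechanism described above can be sketched in a few lines. To be clear: the actual Wake County formula is not given in this thread, and every weight below is invented purely to illustrate how a risk-adjusted expectation can simultaneously lower a disadvantaged student's placement index and raise the school's credit for teaching him:

```python
# Hypothetical sketch of an "effectiveness index" style model.
# All adjustment values are invented for illustration; the real formula
# and its coefficients are not public in this discussion.

def expected_score(prior_score, lunch_status, black):
    """Predicted score after applying demographic 'risk factor' adjustments.

    lunch_status: 0 = free lunch, 1 = reduced-price, 2 = full price.
    """
    adjustment = {0: -8, 1: -4, 2: 0}[lunch_status]
    if black:
        adjustment -= 6
    return prior_score + adjustment

def school_credit(actual_score, prior_score, lunch_status, black):
    """The school's credit for the student: actual minus predicted."""
    return actual_score - expected_score(prior_score, lunch_status, black)

# Two students with identical grades and test scores (85):
poor_black_index = expected_score(85, lunch_status=0, black=True)     # 71
middle_white_index = expected_score(85, lunch_status=2, black=False)  # 85

# With equal performance, the first student's placement index is lower,
# so he is less likely to clear an advanced-class cutoff, while the
# school earns a larger credit for teaching him to the same level:
school_credit(85, 85, lunch_status=0, black=True)    # 14
school_credit(85, 85, lunch_status=2, black=False)   # 0
```

Note the perverse incentive this sketch makes visible: the same adjustment that rewards the school for the disadvantaged student's success also lowers his own odds of placement.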
Do you have citeable evidence that principals are facing lawsuits for using 'teacher recommendations' either to assign students to limited slots, or to discourage students from competing for those slots (e.g. telling students that a teacher recommendation is required, knowing that one group of students will see it as more of a bar to entry than another, resulting in a smaller proportion of that group even competing for the slot)? Because either of those actions is indistinguishable from using race as a factor in determining access to classes.

My top candidates for what is up here are: 1) fabrication as part of a social experiment on how credulous we are 2) fabrication by a sociopath with a very odd idea of self-entertainment 3) incredibly erroneous interpretation of what is going on by a crank

But it is SO full of red flags that I would be surprised if it were not intentional. Call it a 66% chance that it is an intentional hoax.

And it is so far off the mark of a truthful post that I would be very surprised if it had more than a glancing connection to the truth; call it 95% that it is barely connected to actual facts.

I have kids in California public schools. I have read, over the years, many critiques of public schools and public funding generally. As bad as things are, in the schools my kids have gone to and are now going to, they are quite obviously nowhere near as bad as this article suggests. Further, I am quite good friends with a long-time teacher, administrator, and union officer in NYC. I by no means share her respect for the union, and I DO believe documented horror stories of "turkey farms" where truly impossibly bad teachers are stored while being paid rather than following the more expensive process of firing... (read more)

The variation in educational standards and practices between districts in America is too large to make generalizing from one's own experience very useful except insofar as it demonstrates that the critiques given in the article cannot be universal.

When I talk to friends who went to decent schools (which is pretty much all of my friends,) their experiences, cynical though they might be about them, don't reflect the sort of scandal the OP describes. When I talk to acquaintances who work as teachers for seriously disadvantaged schools through programs like Teach For America, the general consensus appears to be "No matter how bad you think it is, it's always worse."


Next, there is a thriving critique of public schools in this country. With the amount of negative attention public education has drawn, is it really plausible that NONE of this critique has discovered the depths of waste and stupidity described as routine by this post? It is not plausible to me.

Every scandal was at some point not yet known. Consider an apropos contemporary news event: the Memphis cheating ring, which implicated an entire school district in cheating far worse than merely sustained incompetence and racism. It apparently may have started as early as 1995, and only began coming out in 2009.

Wait, systematic cheating is far worse than systematic racism? That seems, uh, non-obvious to me.
FWIW I estimate 30% chance something of the sort is going on; if so, my guess is that the OP is actually a ... well, I suppose he might use the term "racial realist" ... who wants to show how those lefties on Less Wrong will believe even the most ridiculous claims if they allow them to blame underperformance by "blacks and poors" on systematic mistreatment rather than natural inferiority.
I have never worked in California, nor New York, and cannot speak for your experience.

Really? I myself went through the advanced classes. In my "Calculus AB" class, there were 28 whites, 1 Hispanic, and 2 blacks. My school was probably around 30% black, 20% Hispanic, 50% white. There are two possibilities. Either whites are 3/5 * 28/2 = 8.4 times more likely than blacks to be prepared for Calculus, or there is some kind of institutionalized racism going on.

Nobody issues grants to help the academically gifted kids who are already doing well. Most grants come as "dropout prevention grants", or are otherwise targeted at students unlikely to end up on Lesswrong.

So I would ask you: in your advanced math classes, were minorities represented as a proportion of the school's population? Or was the ratio of the percentage of minorities in your school's population to the percentage of minorities in your advanced classes higher than 1? Perhaps higher than 2? For me it was about 5.

Eh. I wish citations were easier to find; it's kind of ridiculous, honestly. Just trying to find math placement criteria for any given school system on the internet is impossible, much less a random assortment of school systems such that my location is anonymous.

This post is popular not because it is accurate, but because it repeats the popular misconceptions about the US education system, and tells both left and right what they want to hear:

Of course, the biggest myth that the media reporting of PISA scores propagates is that the American public school system is horrible. The liberal left in the U.S. and in Europe loves this myth, because they get to demand more government spending, and at the same time get to gloat about how much smarter Europeans are than Americans. The right also kind of likes the myth, because they get to blame social problems on the government, and scare the public about Chinese competitiveness. We all know that Asian students beat American students, which "proves" that they must have a better education system. This inference is considered common sense among public intellectuals. Well, except for the fact that Asian kids in the American school system actually score slightly better than Asian kids in Northeast Asia!

American students generally outperform their racial group in other countries. White Americans have higher PISA scores than any European country except statistical outlier Finland. Asian Americans... (read more)

The picture I have of the US education system is that there are a large number of smart, dedicated, people spending a lot of money trying get the best outcomes they can with the students they have to work with. This is all irreconcilable with the claims the OP makes.

Not so irreconcilable, if you don't suppose that "a lot" means "most."

The current average likelihood of a high school freshman in America making it to graduation is about 78%, and that's the best it's been in quite a while.

At the public high school I went to, it was a pretty big deal if a year passed where someone failed to graduate, and students would ask each other, not if they were planning to go to college, but what college they planned to go to. The only student I ever asked or heard asked that question who said they weren't planning to go to college, went to college. And not a two-year or community college, but a pretty decent state college.

That was a good high school, but it wasn't by any means renowned. With schools like that bringing up the national average, consider the state of the schools dragging down the national average.

Just imagine...there are countries where education can be discussed without bringing in race at all...

The U.S. educational system can be better than most other countries' (assuming higher performance is not due to some other factor) and yet have much room for improvement. The U.S. economy has higher GDP per capita than almost all other countries, and yet it keeps growing, and there are many areas where policy is clearly forsaking GDP.

Doesn't the US spend a lot more per pupil?
The U.S. spends much more per manual laborer, nanny, etc. The wage level is higher, and immigration restrictions prevent wages from equalizing across national borders. You have to ask whether the U.S. has more or better teachers, or textbooks/facilities/amenities, as opposed to paying more for similar or lesser inputs. Also, spending reflects other factors to some degree, e.g. it is more labor-intensive, and thus expensive, to educate children with learning disabilities or other serious problems.
Wage levels much higher than Northern European countries? Really? More to the point, teacher's pay much higher?
Here is a chart of teacher salaries as a share of GDP per capita, and here is a table of GDP per capita across countries. The US spends a lower share of GDP than many other countries, but off a higher GDP base.
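The arithmetic behind that last sentence is worth making explicit: absolute salary is the share times the base, so a lower share of a higher base can still be a larger salary. The figures below are invented round numbers, chosen only to illustrate the point, not actual salary data.

```python
# Invented figures: a lower share of a higher base can still be more money.
def salary(share_of_gdp_per_capita, gdp_per_capita):
    """Absolute teacher salary implied by a share of GDP per capita."""
    return share_of_gdp_per_capita * gdp_per_capita

us_salary = salary(1.0, 50_000)      # lower share, higher base: 50,000
other_salary = salary(1.5, 30_000)   # higher share, lower base: 45,000
assert us_salary > other_salary
```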
In states with high Hispanic-American populations, that portion is false.
I don't see that supported by the data, since other countries aren't broken out by race. Also, how biased is the sample? What percentage of Americans take the test? And Europeans?
PISA website.
In all countries, schools are sampled, attempting to control for some variables: around 5000 students and 160 schools in the US. Note that schools may or may not choose to participate, and the same goes for students. PISA has minimum standards for accepting a sample, in an attempt to avoid the obvious bias that countries would have an incentive to produce in their samples. Given the optional sampling, and obvious incentives, I'm skeptical that this is particularly accurate, or an apples-to-apples comparison. Also, homeschoolers, one of the strongest-performing demographics for ethnic and intact-family reasons, are likely missed by this sample, skewing results in the US down. They appear to be about 4% of the total population, disproportionately white and in two-parent households.
They control for schools opting not to participate; see the section under substitute schools. It's standard research ethics that minors (and their guardians) be given the option to refuse to participate in a study. Now I'm confused. The original inference Sailer drew from PISA was that American students outperform their racial group in other countries. You're claiming the study will be biased against white Americans. If anything you should be annoyed at Sailer for trying to support his racial performance narrative using a study that didn't really focus on it. Regarding homeschooling in particular, it'd be nearly impossible to develop an international study on the same scale as PISA (which you already want to reject as too small) merely because homeschooling isn't prevalent in most of the PISA-participating countries.
I'm aware that they tried to control for it, but the bounds are large and open to exploitation by countries that choose to do so. And the homeschool issue should strengthen his conclusion; I was just noting a factor he hadn't controlled for. Both factors I pointed to would tend to mean that the relative rank for the US is in reality better than listed, IMO, but the wide bounds of substitution adjustment make it very hard to be confident in the results.
Given that there are racial gaps in the test scores, it's not fair to compare the average of white Americans against the nation-wide averages of European countries, since they also have significant non-white populations.

If you follow my first link, you can see the author's analysis is demographically neutralized (it excludes 1st and 2nd generation immigrants in European countries, and compares to white Americans). In this ranking, American whites substantially outperform the European average, and only 2 small European countries (Switzerland and Finland) noticeably outrank American whites. US whites are outscoring the EU-15 (basically the core nations of the EU, before it expanded into Eastern Europe), by a substantial amount.

The second image is not demographically neutralized, but European countries have far, far lower non-white percentages than the United States. For example, Germany is about 10% non-white as of 2010.

The image appears to come from Steve Sailer, who is not the most reliable source in the history of reliable sources. Identifying as anti-establishment media seems to correlate with poor epistemic hygiene. On the other hand, I wouldn't be surprised if some version of this is true after correcting for this, as your typical European country has an ethnic majority over 80%.

I would dearly like citations for everything - I would really like to know if I am still terrible at estimating how awful the world is.

Not the world, the US.
1. Sure, in this case. 2. The US is part of the world. 3. I was expressing my thoughts on a general and common failure mode that I have attempted to correct, and this article provides evidence that I have failed to do so.

So what criteria are necessary to apply for these grants? I have a feeling there are a lot of smart people working on startups in the ed tech space. If you could get in contact with them, you might have more competent grant applicants, and those startups would find more revenue to pursue their (potentially workable) ideas for improving education.

Here are some ed tech incubators I found on Google. If you get in contact with the people behind the incubators, they'll probably tell all of their startups about the ease of getting funding this way. Their startups will have to work on one of the problems that there exists a grant for, but there should be a decent number that find this workable.

You might have seen some of those sketchy advertisements, similar to the "Google will Pay YOU!!! To Work From Home!" ads, which say stuff like "Get Grant Money Here!". At least, I associate those two kinds of ads as being similar.

In any case, the process of finding grants to apply for is very simple. The Department of Education grants are all on Pretty much every university's Research and Evaluation Department gives out grants to the local community; check out your local Uni's website. Sometimes large corporations give out grants, sometimes individual people. In general, get in touch with the education department of your county government to find out which grants are being offered nearby and how to apply for them.

Now that I think of it, this is the main request I should have of lesswrongers. I bet anyone on this website could write a damn good proposal for any grant they come across, and I bet their project would be better than the shit I evaluate.

Good info. Are you going to talk to the ed tech incubators and give them an inside contact or shall I email them a link to this thread?

the evaluator's job was to collude with the grant proposal submitter, so that we got more evaluation jobs from them in the future.

The grantee, not the grantor, hires the evaluator? What the hell?

Can you see where the politics might come in? There were rich whites who got upset when the data told the schools to put some blacks in their sons' advanced math classes. There were math teachers who absolutely refused to allow blacks/poors in their classroom, or worse, treated them in such a way as to cause them to fail, thus confirming their worldview.

More what the hell. Was this a long time ago, or am I just really naive about what certain parts of the US outside the bay area look like?

Most things in general are broken to a degree that the average reasonable person would find completely shocking. There are absolutely comic-book levels of incompetence, grift, discrimination, and vice within most bureaucratic organizations, if you know where to look.

Other country, other situation, but I think this meta observation works for both:

If the educational system is broken too badly, the society loses the ability to rationally discuss how to fix it, because the "unbelievable" facts you report are taken as evidence against your sanity or honesty, not against the system. And of course there are some people who benefit from the system remaining as it is, and they are happy to confirm that you are wrong.

On some level, yeah, extraordinary claims require extraordinary evidence. But people have their priors seriously wrong here -- the sanity waterline is generally low, incompetence is not so rare in the absence of feedback, and there is a lot of money to be made by abusing the school system. Also, the prior probabilities are counted repeatedly. (If one teacher complains, he is probably just an incompetent lunatic. If another teacher complains, she is also an incompetent lunatic. If thousands of teachers complain... well, by the same logic, they are all incompetent lunatics. There is never a point where there is enough evidence to start suspecting that they might actually be right. It also does not promote honest communication if it is wid... (read more)

I'm reading The Shadow Scholar, a book by a man who makes his living by writing papers for college students. There's a root and branch attack on Rutgers, a university which is sort of an educational scam which loots the students for the benefit of the parking system and the athletics program. Poking around online, the only people I've found who've said he's unfair to Rutgers have been very marginal about it (they say it's possible to get a good education at Rutgers), and a number who agree that it's really that bad.

I attended Rutgers part time when I was a full-time employee of Bell Labs. The graduate physics classes were excellent, rigorous, well taught, well designed, and hard. I have no particular recollection of any parking fees or athletic activity. Since that time, I proceeded to get a PhD from Caltech and teach for 8 years at the University of Rochester. In my opinion, informed by my experience, Rutgers is categorically NOT an educational scam.

The company I work for has hired many engineers who have been educated at Rutgers. There is no evidence that Rutgers is a scam, either in the interview process for these engineers, or their subsequent performance on the job.

The quality of undergraduate and graduate experiences at the same university can be dramatically different, since their funding sources (and thus their incentive structures) are separate. It's possible that Rutgers is broken as an undergrad institution, but not as a graduate one.

(Rutgers also has a good reputation as a graduate math department.)

It's also possible that there's a division between STEM and everything else. Especially, there aren't many term papers or essays being written for math-heavy courses, and so I can safely assume the Shadow Scholar wouldn't have run across their students.
Thanks. When did you attend Rutgers? One data point of modest value-- a friend of mine graduated from Rutgers as a history major, probably in the 80s. She didn't know that life could be very hard for civilians in war zones. She isn't my smartest friend, but she isn't stupid and she's pretty conscientious.
I attended in 1979. I did not matriculate, but I did take regular graduate physics courses that I had to leave work during the day to attend. My other exposure to Rutgers is through a very good friend who was a Bell Labs department head for years, and who was then a professor at Rutgers for, it seems, about 10 years, until probably 2000. He started the Wireless Information Lab, which has a superb national reputation for research, graduate, and postgraduate work. I visited him and attended Lab events many times over the years, and find it implausible that if the undergrad education were a scam he wouldn't have mentioned it. In fact I'll email him about this, and if he answers I'll post something here.
Having read The Shadow Scholar also, I don't think the author himself would stand behind a claim that Rutgers is an educational scam, although he certainly testified to it having an uncaring and incompetent administration which doesn't show much care for the education of its students. The sort of lost purpose educational aimlessness that allows students to graduate without really learning anything exists in universities all over the country, as do the students who retained his services and those like his. If you haven't gotten that far in the book yet, it's for-profit colleges which he really attacks as educational scams.
"Educational scam" was my language, and may have been too strong. The author does describe Rutgers as a place where it's difficult to learn much (indifferent teaching, and a lot of institutional barriers to spending much time on learning). Oddly, I haven't seen any complaints about something which might plausibly discredit the author-- spending much of his time drunk and/or drugged.
Although he's fairly open about that, I don't think he particularly stands out in terms of substance use among college students. I had plenty of peers in college who graduated with good-to-reasonable grades who I suspect used alcohol and marijuana to similar degrees. In fact, I would have to say that they probably got considerably more out of college than I did, having gotten a lot of valuable networking done socializing while high.
My impression is that heavy alcohol use tends to increase people's level of anger, obsession, and resentment. In other words, he might overestimate the pervasiveness of administrative abuse and neglect at Rutgers. It's also conceivable that his life was going worse than it would have if he were sober more of the time. However, I searched on "rutgers r-u screw" and found this. It looks like, at a minimum, the administrative style at Rutgers is significantly unfriendly to students.
I was almost an RU Screw victim - some data entry clerk recorded my high school class rank as 20 instead of 2, which would have rendered me ineligible for a rather generous state scholarship if my parents hadn't inadvertently discovered this.
I thought every college had something like this.
Conversely, I didn't believe this was a thing until I saw it happen during grad school (to undergrads).
what does this mean?
It's a combination of not enough parking, not enough buses, high parking tickets, and not letting people graduate unless their parking tickets are paid. It's also claimed that the department which handles parking tickets is literally the only part of the university the author dealt with where people seemed to be focused on doing their jobs.
I'm a Rutgers graduate...
Please, I beg of you, tell your story! Now that I've asked, people will believe you, right? So you have no excuse to keep silent. Incompetence is not what I find suspicious in this post.
For extra credit, apply this insight to fast-growing organisations that were initially founded or staffed by a group of unusual people, and don't forget the role of luck or reversion to the mean.

Was this a long time ago, or am I just really naive about what certain parts of the US outside the bay area look like?

Outside view suggests the latter. Also, it's probably more parts than you think. The Bay Area is pretty weird along several dimensions.

If you are a decision maker in education in your area, please, please, please look into the various Bayesian predictive models used for math placement

Seems like you have worse problems than not using Bayesian predictive models. Like racism and corruption in the school system, and inability to tell means from goals.

For comparison, the first two don't seem to be a significant issue up here in BC, Canada, what with more than half of the students being Asian (and often ESL) and a reasonably strong tradition of integrity in the teachers' union. From what I know, there are few issues with assigning children to classes by ability, not by profiling. The main problem is the steadily declining funding, resulting in fewer and weaker programs. Another issue is that there is virtually no way to affect or even get rid of a bad teacher (union, remember?), and some teachers suck big time. I am not even aware of any targeted programs to "raise literacy" except for ESL classes, or to "raise basic math skills". Well, there are some which target the local native population; not sure how successful they are.

We had a creepy social studies teacher with big seniority who eventually got fired. Everyone knew it was coming for years, and then he finally pissed off the wrong parents. It is possible to get really bad teachers fired eventually. (Kits high, btw)

If you could prove this stuff you could become a hero to a lot of people.

Edit: I now think this post is probably a hoax. As EY writes "Your strength as a rationalist is your ability to be more confused by fiction than by reality."

Please look through the comments where I have replied to criticisms; I have tried to find relevant citations.
Your confusion is not (edit: strong) evidence.
My not understanding how something could have happened is evidence for it not having happened.
If you understand how something could happen, how strong is that evidence that it happened?
Some. I tell you that I mixed chemicals X and Y to make Z. You are initially 99% confident that X and Y don't make Z and so think I'm probably lying. Then you read that something in the air in the state in which I live causes X and Y to create Z. Won't your estimate of my having told the truth go up?
I would be even more certain that X and Y don't make Z, and that you were mistaken. I would believe that you mixed X and Y and a, and made Z, where a is the characteristic in the air which I was unaware of prior.

There is the very weak effect that I am more likely to understand how something happens if it is possible than if it is impossible, and things which are possible are more likely to happen than things that are impossible. Therefore I am more likely to be confused in general by things that didn't happen than by things that did; but not more likely to be confused by things that didn't happen but are possible than by things which did happen and are possible.

My biggest doubt comes from the fact that it should be trivial to reference at least one grant which is literally as bad as the example given; this could be done without compromising anonymity, given that FOIA requests can originate from any source. Because the details of one grant as bad as the iPod/makeover grant would be fairly weak evidence that almost all grants are horrible, the absence of any in my research is fairly strong evidence that it is not the case that almost all grants in the nation are horrible. Which is not to say that there couldn't be districts where horrible grants are the norm, or clearly fraudulent grants.

Finally, the biggest inconsistencies I found in the original post were A) That an apparently literate and intelligent person thought that a state-standardized test was an accurate measure of literacy, B) That a school with a test-results problem would still have a 75% pass rate among lowest-class students, and C) That he never mentions being told by his supervisor that his job was specifically not to evaluate whether the goals were appropriate (that being the job of the department issuing the grant, prior to issuing the grant; if they said that giving students iPods was the goal of the grant, it was sufficient), but only to evaluate whether the goals written into the grant were met. Instead the aut
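The X-and-Y update debated above can be made concrete with a toy application of Bayes' rule. All the probabilities below are invented for illustration: learning a plausible mechanism raises the prior that the claim is even possible, which raises the posterior that the claimant told the truth, without requiring that the claim itself be proven.

```python
# Toy Bayes update for the "X and Y make Z" example; all numbers are invented.
def posterior_truth(prior_truth, p_claim_if_true, p_claim_if_false):
    """P(claim is true | the claim was made), via Bayes' rule."""
    numerator = p_claim_if_true * prior_truth
    return numerator / (numerator + p_claim_if_false * (1 - prior_truth))

# Before learning any mechanism: X + Y -> Z seems nearly impossible.
before = posterior_truth(prior_truth=0.01, p_claim_if_true=0.9, p_claim_if_false=0.1)

# After reading that something in the air can make X + Y -> Z, the prior rises,
# so the posterior credibility of the report rises with it.
after = posterior_truth(prior_truth=0.30, p_claim_if_true=0.9, p_claim_if_false=0.1)

assert after > before
```

With these numbers the posterior moves from roughly 8% to roughly 79%, which is the sense in which understanding how something could happen is evidence that it did.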
I think that (A) is true because of Spearman's g. The evidence for g is overwhelming.
Did you just claim that g correlates well enough with two specific abilities (s) to measure one of them (the ability to determine the answer expected by the writer of a test) and report results for a different one (literacy) in the general sense? Because my position is that most standardized tests measure a combination of the intended subject and the ability of the taker to figure out the test writer; part of this comes from my observed ability to consistently outperform people with an equal or better knowledge of the subject being tested on many different tests, and most of it comes from my ability to explicitly recognize the test author's thought patterns in determining which options were available in multiple-choice tests, and to figure out the correct answer to a large number of their questions by looking only at the possible answers.
Yes. I did a huge amount of reading on IQ to write this.
Great. We can use a test of any skill to accurately measure any other skill now, right? It is impossible for someone to have great math skills and poor English literacy, because of general intelligence? Believing that requires more extraordinary evidence than I have seen. What is the most extraordinary citable data that you encountered in your research indicating that specific skills are, in general, interchangeable?
Just do research on IQ. And we're talking correlations, so your use of the word "impossible" is incorrect.
I asked if you were making the claim that you could, in general, measure how well someone could guess the teacher's password (on a test for any subject) and get results which measured their literacy (implied: or calculus) skills. You indicated yes. If that is the case, it is impossible for people in general to have different literacy and calculus skills, because the same test would generally measure both skills. Did you instead want to make the claim that people who are poor at guessing passwords are in general worse at a skill than their performance on a hypothetical test which measured only that skill would indicate, and vice versa for people who are good at guessing passwords? My default case for considering a typical 'official' test is that it measures mostly the ability to guess the password, and only somewhat the mastery of the subject. Based on my experience with IQ tests, there is a significant factor of password guessing in them as well.
No disrespect intended, but I would rather not go over the literature on IQ with you. If you are interested I do discuss this literature in this book in Chapter 7 which is called "What IQ Tells You". Lots of tasks, such as the ability to repeat a sequence of numbers backwards, are highly correlated with IQ.
In general, though, the ability to guess the teacher's password is not a measure of literacy. To return to the point that started this line of discussion: suppose there was a test with a single score that was 50% based on password guessing and 50% based on literacy. It is not the case that, because password guessing and literacy both correlate with intelligence, this test measures literacy. Password guessing is a specific skill, like literacy or calculus. Those skills develop faster in people with high intelligence, but they can be developed to a high level in almost anybody. My experience with education makes me believe that the skill of password guessing is being intentionally taught, to the detriment of the skills of literacy and math, precisely because it results in a greater increase in scores on the tests used to evaluate the teachers.

I'd like to see the citations on the importance of 8th grade Algebra as critical predictor of future success.

Also, since you're professionally aware of the ins and outs of US education systems, can you give us some general advice for contacting local school/school board officials or the like--how we should go about getting involved in ameliorating such issues in our communities?

I found this. (Gated, though)

PDF available here.

Thank you! It's not feasible for me to access that article at the moment, but I've bookmarked it for when it is. I am still hoping to see ThinkOfTheChildren's source, though; I think it'd probably be the most germane evidence.
Edited, with accessible link to paper.
Aha! Excellent. Thank you very much.

I can provide citations for any statements I've made, but since nearly everything I said is NDA'd I'd like to be careful to anonymize my citations, making sure to find national studies instead of state or county specific studies.

It's probably too late, based on what you've said already. Anonymity is hard.

For something like this, security through obscurity is probably good enough, as long as the information isn't presented in a way that shows up in Google searches for the actual context.

Even if anonymity is hard, deniability is useful. Also, the people most likely to complain that this article was written would prefer that their names remain unknown.
The people most likely to complain evidently have very poor reading comprehension to begin with. (Not meaning the students, either.)

I'd be very interested in a citation on

the evidence shows that teacher recommendations have zero correlation with aptitude in a field

Since the OP is clearly venting a bit, I'd give him some charitable leeway and interpret 'zero' as 'so small as to not be relevant'.

Seconded. A relatively low correlation I could believe, but none? As a friend pointed out, this would imply that if there's a math prodigy in the class, the teacher would be just as likely to recommend advanced classes as they would be to recommend the student needing extra help with basic stuff? I could accept prodigies slacking off due to boredom and therefore sometimes getting mistaken for people with bad skills, but 50-50?
See also: Pygmalion in the Classroom. It's entirely reasonable that teachers' ratings of children's academic abilities etc. cause future achievement.
No, it's not. Did you read Carl's comment in this same thread?
It's been demonstrated by controlled research that students whose teachers expect them to perform better than their peers do in fact perform better, even when the teachers' expectations are not founded on fact.
The Jussim et al review of that literature is worth reading. Expectations do seem to have causal impact, but the effect is usually small relative to measures of past performance and ability, and teacher expectations tend to reflect past performance more. The review covers some serious challenges to the effect sizes claimed by Rosenthal and coauthors, such as effect sizes declining with sample size and publication bias. Or, regarding the original Pygmalion/Oak School experiment:

As an aside, Rosenthal pioneered meta-analysis in psychology because the effect only replicated a third of the time in the published literature (despite the presence of publication bias and QRPs). In doing so he promulgated a test for publication bias which implicitly assumed the absence of any publication bias, and so almost always output the conclusion that no publication bias was present. These methods were eagerly adopted by the parapsychology community, as the same methodology that appeared to show strong expectancy effects also appeared to show ESP in the ganzfeld psychic experiment, as Rosenthal (1986) agreed.

Since I think that the ESP literature reflects the scale of apparent effect that can be shown in the absence of a real effect, purely through publication bias, experimenter bias, optional stopping, and other questionable research practices, this makes me suspicious of the stronger claims about expectation effects.
I don't think the sample of experiments reviewed is large enough to evaluate sample size versus effect size; throw out the outliers and there's nothing left. I'm now heavily concerned about the validity of the IQ test used; however, that's more due to the 8-point increase in the control group, when no increase is expected. I'll have to dig further, exclude any of the controls with out-of-band scores, and redo the math.

One result of the meta-analysis, however, is that experimentally induced changes to teacher expectation have a small causal effect on student performance; another is that non-induced teacher expectations correlate well with performance in the same year, and less well with long-term performance. I would rephrase that as 'Teacher expectations of student performance in their class tend to be accurate, but correlate poorly with student performance in other classes.'

In any case, thanks for the link. I'm going to have to spend some time determining how much I should change my mind with this new evidence, but my gut feeling is that the objectively worst possible data (my own experience with performing well when expected to perform well, and performing poorly when expected to perform poorly) will continue to dominate my personal opinion on the matter.
Upvoted for candor.
The first Rosenthal meta-analysis used 345 studies. That is pretty big. And the individual studies listed in table 17.1 have large n, ranging from 79 to 5000+.

No, that's not a problem that should concern you. Children's IQ scores are less stable than older people's scores, test-retest effects will give you a number of IQ points (that's why one uses controls), and children are constantly growing.

What should concern you is that the researchers involved were willing to pass on and champion a result driven solely by obviously impossible, nonsensical, meaningless data. A kid going from 18 IQ to 122? Or 113 to 211? This can't even be explained by incompetence in failing to exclude scores from kids refusing to cooperate, because tests in general (much less the specific test they used!) are never normed from 18 to 211. (How do you get a sample big enough to norm as high as 7.4 standard deviations?)

Worrying about the control's gains and not the actual data is like reading a physics paper reporting that they measured the speed of several neutrinos at 50 hogsheads per millifortnight, and saying 'Hm, yes, but are they sure they properly corrected for GPS clock skew, and did they accurately record the flight time of their control photons?'
Unstable IQ scores should provide a net zero; an average increase of half a standard deviation across the entire population already means that the norms are fucked. Therefore, the IQ test used simply wasn't properly normed; if we assume that it was equally improperly normed for all students in the study, we still see an increase of 4 points based on teachers being told to expect more. Whether an increase of 4 points is statistically significant on that (improperly normed) test is a new question.
Only if you make the very strong assumptions that there is no systematic bias or selection effect or regression to the mean or anything which might cause the instability to favor an increase. Plus you ignored my other points. Plus we already know from the pairs of before-afters that these researchers are either incredibly incompetent or actively dishonest. Plus we already know biases in analysis or design or data collection can be introduced much more subtly; Gould's brain-packing problems are only the latest example.

Which claim and assumption we will make because we are terminally optimistic, and to borrow from the '90s, "I want to believe!" Wow, you still aren't giving up on the Pygmalion study? Just let it go already. You don't even have to give up on your wish for self-fulfilling expectations - there are plenty of followup studies which turned in your desired significant effects.
What effects could cause an increase of 8 points on a properly normed test across the board? Why would there be a significant benefit to being in the control group of this study?

You can rule out that they were using a test which produced the scores that they recorded, perhaps by using raw score rather than normed output. You can rule out every other explanation for why the recorded results aren't valid scores. You can even rule out that they were competently dishonest, since competent dishonesty would be nontrivial to detect; your only possible conclusion is incompetence, which isn't evidence that should change your priors. Incompetence is the social equivalent of the null hypothesis, and there is very rarely any significant evidence against it.

Assuming only incompetence, as you have, the expected result would be equally erratic for all students. You can assign any likelihood to the assumption that the incompetence was the primary factor and that dishonesty doesn't modify it significantly, but you have already concluded systemic incompetent dishonesty across a large number of studies.

As you say, it's been confirmed by other studies. I'm not insisting that a particular study was done correctly; I'm explaining why their conclusions being true is consistent with the errors in their study. (Which means that a study with those flaws would be expected to reach the same conclusions, if those conclusions were true.)
I already gave you three separate explanations for why an increase is possible, even in controls.

I have no idea what you mean by this, and I think that if one accepts their incompetence, the best thing to do is to ignore their data as having been poisoned in unknown ways - maliciousness, ideology, and stupidity often being difficult to tell apart.

Why is that? The competent result, since IQ interventions almost universally fail (our prior for any result like 'we increased IQ by 8 points' ought to be very low, as in, well below 1%, because hundreds of interventions have failed to pan out and 8 points is astounding, practically on the level of iodization) and the followups confirm that there is only a much, much smaller effect, is that there is no or a small effect. Any incompetence is going to lead to an extreme result. Like what they found.

'Confirmed'? Well, this is an active debate as to what counts as a replication. Near the same magnitude, or just having the same sign? If someone publishes a study claiming to find a weight-loss drug that will drop 100 pounds, and exhaustive replications find that the true estimate is actually 1 pound, has the original claim been "confirmed"? After all, both estimates are non-zero and both estimates have the same sign...
So, "systematic bias or selection effect or regression to the mean" can result in average properly normed IQ scores increasing by 8 points? Doesn't the normalizing process (when done properly) force the average score to remain constant?
What normalizing process? You mean the one the paid psychometricians go through years before any specific test is purchased by researchers like the ones doing the Pygmalion study? Yeah, I suppose so, but that's irrelevant to the discussion.
Right - because the entire population going up half an SD in a year isn't unusual at all, and the test purchased for use in this study was normalized the way one would expect it to be, despite the fact that it had results that are impossible if it was normalized in that manner.
...'entire population'? Alright, I have to admit I have no idea what test you are now referring to. I thought we were discussing the Pygmalion results, in which a small sample of elementary school students turned in increased IQ scores, which could be explained by a number of well-known and perfectly ordinary processes. But it seems like you're talking about something else entirely and may be thinking of country-level Flynn effects or something; I have no idea what.
The PitC study showed an 8 point IQ increase in the control group. You offered those three explanations and said that they explained why that wasn't particularly unusual, and my understanding of normed IQ tests is that they are expected to remain constant over short times.
Over the general average population when tested once, yes. But the control group is neither general nor average nor the population nor tested once.
If the control group isn't at least representative, there is a different methodology flaw. And if the confounding factor of prior IQ tests wasn't measured, given that there is apparently a significant increase in scores on the first retest (and presumably a diminishing increase in scores at some point; the expected result of taking the test very many times isn't to become the highest scorer ever), there is an unaccounted-for confounding factor.

I'm still trying to figure out what questions to ask before I dig up as much primary source material as I can. Is "points of normed IQ" the right thing to measure? That would imply that going from an IQ of 140 to 152 is as much a gain as going from 94 to 106. Is raw score the right thing to measure? That would imply that going from being able to answer 75% of the questions accurately to 80% is as much gain as going from 25% to 30%. Is the percentage decrease in incorrect answers the correct metric? 75%-80% would be the same as 25%-40%. The percentage increase in correct answers? 25%-30% (a 20% increase) would be equivalent to 75%-90%.

I'm still reluctant to accept class grades and state-mandated graduation test scores as measuring primarily intelligence, or even mastery of the material, rather than the specific skill of taking the test. That makes my error bars larger than those of someone who does accept them as accurate measurements of something important.
No, usually in these cases you will be using an effect size like Cohen's d: expressing the difference in standard deviations (on the raw score) between the two groups. You can convert it back to IQ points if you want; if you discover a d of 1.0, that's boosting scores by 1 standard deviation, which is usually defined as something like 15 IQ points, and so on. So if you have your standard paradigmatic experiment (an equal number of controls and experimentals, the two groups having exactly the same beginning mean IQ and standard deviation of the scores), you'd do your intervention, do a retest of IQ, and your effect size would be '(IQ(bigger) - IQ(smaller)) / standard deviation of experimentals & controls'.

Some of the things that these approaches do:

1. Test-retest concerns disappear, because you're not looking at the difference between the first test and the second test within groups, but just the difference in the second test between groups. Did the practice effect give them all 1 point, 5 points, 10 points? Doesn't matter, as long as it applies to both groups equally and their pre-tests were also equal. The first test is there to make sure you aren't accidentally picking a group of geniuses and a group of dunces, and that the two groups started off equivalent. (Fun fact: the single strongest effect in my n-back meta-analysis is from when a group on the pre-test answered something like 4 more questions than any of the others; even though their score dropped on the post-test, because the assumption that the groups were equivalent is built into the meta-analysis, they still look like n-back had an effect size of d=3 or something crazy like that.)

2. You're not converting to IQ points, but using the raw score. This avoids the discreteness issue (suppose the test has 10 questions on it. What does it then mean to convert scores on it to its normed range of 70-130 IQ or whatever? Getting even a single additional question right is worth 10 points!)

3. You avoid the issues of
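To make the effect-size arithmetic above concrete, here is a minimal sketch; the scores are entirely hypothetical, not data from any study discussed in this thread:

```python
import statistics

def cohens_d(experimental, control):
    """Cohen's d: difference of the group means on the raw score,
    divided by the pooled standard deviation of both groups."""
    n1, n2 = len(experimental), len(control)
    m1, m2 = statistics.mean(experimental), statistics.mean(control)
    v1, v2 = statistics.variance(experimental), statistics.variance(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical post-test raw scores for two equally sized groups.
experimental = [31, 28, 35, 30, 33, 29, 32, 34]
control = [27, 25, 30, 26, 29, 24, 28, 31]
d = cohens_d(experimental, control)
# A d of 1.0 would mean a one-standard-deviation gap, conventionally
# about 15 points on a normed IQ scale.
```

Note that only the post-test scores enter the calculation; as described above, any practice effect common to both groups cancels out.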
... For some reason I thought the first test was used to evenly distribute performance on the pretest between the two groups. Aren't the control and experimental groups supposed to be as close to identical as possible, and to help analysis identify which subgroups, if any, had effects different from other subgroups? If an intervention showed significantly different results for tall people than for short people, then a study of that intervention on people based on height may be indicated. That's carryover from a different branch, sorry.
Ideally, yes, but if you shuffle people around, you're not necessarily doing yourself any favors. (I think. This seems to be related to an old debate in experimental design, going back to Gosset and Fisher, over 'balanced' versus 'randomized' designs, which I don't understand very well.) This is part of the randomized vs balanced design debate.

Suppose tall people did better, but you just randomly allocated people; with a small sample like, say, 10 total and 5 in each, you would expect to wind up with different numbers of tall people in your control and experimentals (eg a 4-1 split of 5 tall people), and now that may be driving the difference. If you were using a large sample like 5000 people, then you'd expect the random allocation to be very even between the two groups of 2500.

If you specify in advance that tall people are a possibility, you can try to 'balance' the groups by additional steps: for example, you might randomize short people as usual, but block (randomize) pairs of tall people - if heads, the guy on the left is in the experimental and the right in control; if tails, the other way around - where by definition you get an even split of tall people (and maybe 1 guy left over). This is fine, sensible, and efficient use of your sample, and if you're testing additional hypotheses like 'tall people score better, even on top of the intervention', you'll take appropriate measures like increasing your sample size to reach your desired statistical power / alpha parameters. No problems there.

But any post hoc analysis can be abused. If after you run your study you decide to look at how tall people did, you may have an unbalanced split driving any result, you're increasing how many hypotheses you're testing, and so on. Post hoc analyses are untrustworthy and suspicious; here's an example where a post hoc analysis was done:
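The pair-blocking scheme described above can be sketched in a few lines; the roster, function name, and coin-flip details here are my own illustration, not taken from any actual study design:

```python
import random

def blocked_assignment(short_people, tall_people, seed=0):
    """Randomize short people individually, but 'block' tall people in
    pairs so the tall subjects split as evenly as possible between the
    experimental and control groups."""
    rng = random.Random(seed)
    experimental, control = [], []
    for person in short_people:
        (experimental if rng.random() < 0.5 else control).append(person)
    pairs = iter(tall_people)
    for left in pairs:
        right = next(pairs, None)
        if right is None:
            # Odd person left over: nothing to pair with, randomize alone.
            (experimental if rng.random() < 0.5 else control).append(left)
            break
        # One coin flip per pair sends one member to each group.
        if rng.random() < 0.5:
            experimental.append(left)
            control.append(right)
        else:
            experimental.append(right)
            control.append(left)
    return experimental, control

# Hypothetical roster: 20 short subjects, 10 tall subjects.
tall = ['tall%d' % i for i in range(10)]
short = ['short%d' % i for i in range(20)]
exp_group, ctrl_group = blocked_assignment(short, tall)
# With an even number of tall subjects, each group gets exactly half of them,
# while the short subjects are split purely at random.
```

The point of the pairing is exactly the one made above: the tall/short imbalance that random allocation could produce in a small sample is ruled out by construction.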
I was saying that if there was any reason to suspect height might be a factor, then height should be added to the factors considered when trying to make the groups indistinguishable from each other. If height isn't suspected to be a factor, adding height to those factors with a low weight does almost no harm to the rest of the distribution. Is there any excuse for the measured variable to notably differ between the control and experimental groups in a well-executed experiment?
In a perfect world, perhaps. But every variable is more effort, and you need to do it from the start or else you might wind up screwing things up (imagine processing people one by one over a few weeks and starting their intervention, and half-way through, noticing that height is differing between the groups...?) If you didn't balance them, it may easily happen. And the more variables that describe each person, the more likely the groups will be unbalanced by some variable. People are complex like that. If you're interested in the topic, I've already pointed you at the Wikipedia articles, but you could also check out Ziliak's papers.
I see where gathering information about all participants before starting the intervention might not be possible. It should still be possible to maximize balance with each batch added, but that means a tradeoff between balancing each batch and balancing the experiment as a whole. For a given experiment, we would have to decide the relative likelihood that there would be a confounding variable in the batches versus a confounding variable in the demographics.

The undetected confounding variable is always a possibility. That doesn't mean that we can't or shouldn't do as much about it as the expected gains offset the expected costs, and doing some really complicated math to divide the sample into two groups isn't much more expensive than collecting the data to go into it.
That's going further than I did. It's a reasonable prior, and the evidence is at least consistent with weak effects.
Eh. Decius was clearly thinking of, and still is thinking of, substantial and long-lasting effects rather than the almost trivially small, disappearing effects confirmed by the followups and meta-analyses. That is a completely unreasonable view to hold after reading that review, and I would suggest that even that small nonzero effect is dubious, since it seems that few to none of the studies fully accounted for the accuracy issue, and there is obviously publication bias at play.
I wonder if the teachers making the predictions were the same ones who then taught the students, and examined them to determine the outcome.
Here's a review of the literature on teacher self-fulfilling prophecies from Lee Jussim, who is skeptical but finds that they occur and are of nontrivial magnitude, more so for grades than standardized tests, although they dissipate quickly, and teacher judgments are driven more by accuracy than stereotypes in the aggregate.
Would you believe that many teachers use 'effort' as an explicit factor in assigning grades? As in, someone who understands the material without putting forth visible effort is assigned a lower grade than someone who visibly struggles to reach the same level of understanding.
They are officially required to (in Slovakia). But it is just one of many confusing, sometimes mutually contradictory, mostly applause-lights criteria. (From my memory: the grades have to reflect knowledge, they have to reflect effort, they have to be motivating, and they have to respect human rights, whatever that means. And a dozen other conditions.) Yes, this is both explicitly required and explicitly forbidden by the rules. Welcome to the world of the educational system!
One more data point: was it a politician with no educator qualifications who wrote the requirements?
I don't know, but my guess would be that a group of bureaucrats with zero educational experience from the Department of Education prepared the document, and some minister just signed it, because it seemed okay (it contained all the applause lights).

P.S. I was going to ask about the terms of your NDA. While I agree with greater transparency, I (perhaps idealistically) hope it can be done without breaking promises.

However, I also have a principle: showing honour to honourless dogs is worse than useless.

He couldn't understand why he had needed to do this, and indeed, refused.

I have to disagree that this is ineptitude. He knows which evidence he has to conceal from you, and is doing so effectively. Of course by doing so he only confirms that it is harmful to his case, but it nevertheless grants plausible deniability. Especially as I expect anyone who can fire him will collude in the concealment.

When I submitted this to my boss for approval, she was flabbergasted, and explained that the evaluator's job was to collude with the grant proposal submitter,

Sadly I cannot prove this, but I read this after writing the above paragraph. I wasn't primed on 'collude.' I'ma go ahead and conclude nothing happened to the poor bastard sideswiped by a thoroughly unexpected honest appraisal.

every single project I evaluated listed their 'process' and then said that their 'goal' was to enact the process.

Pays the piper, etc... Whoever ... (read more)

"The point of the grant program is to give goodies to certain demographics. The process was indeed the goal, no matter what anyone else said." The job of being the grant writer, and evaluator, are probably themselves goodies. In fact, the author of the article is probably viewed as ungrateful for doing the job he was hired to do so scrupulously.

make sure the criteria for math placement is based on achievement data

Make sure you collect achievement data. Bayesian calculations are fine and dandy, but I'd declare victory if they collected the data and let people see it.

I'm torn between thinking that if this is a hoax, the hoaxer should be banned with extreme prejudice, and hoping that there will be another hoax designed to appeal to right-wingers.

The sign of a good Usenet troll post is that it is a mirror held up to as many groups as possible.

I'm surprised you think the appeal of the OP is confined to left-wingers. The bad guys are all government bureaucrats, the current bogeymen of the right and a group championed by the left.

You've got a point. The OP would appeal to both-- I was probably biased by the left-wing appeal being at the end of the post.
I come down strongly on the "hoax" side because I spend a lot of time "reviewing" the emails that my father and other relatives exchange. These are of the sort: Obama born in Kenya, Obama dissed dead soldiers and their families, Obama pushing Sharia law, Obama hates flag pins. As far as I'm concerned, I have seen hundreds of the hoaxes designed to appeal to right-wingers. You can see them too: go to, search on Obama and False, and stop reading when you get bored.

As to banning: if we really are supposed to be learning rationality here, how does it help to erase all evidence that in large numbers we got tricked? And it didn't even take Omega to do it to us; it was just another Beta like ourselves. If this does turn out to be a hoax designed to appeal to us, it should be taught as something we need to watch out for.
I initially considered this post pretty credible, but then this thread happened; and I subsequently realized that it was kind of interesting that the post's description of the education system seemed to systematically have something in it to offend pretty much everyone's political views. That struck me as ... odd. I find myself not having much of an opinion on whether it is true or not; but then, I don't expect to take any particular action based on whether it is true, so I feel pretty safe not caring.
Your post might be clearer if you initially specified "this post" as ThinkOfTheChildren's rather than NancyLebovitz's.
That's interesting. If this were a hoax, it would certainly appeal to right-wingers. In general, the way the school board is debating this issue, the democrats are in favor of teacher recommendations and "helping the poor black kids", whereas the republicans (although, on the school board, they're all teapartiers) are the ones running with the "Data Driven Decisions D^3" slogan.
What specifically is the school board debating? Allow the Principal to keep some minority students in honors classes?
That would be the first 7 paragraphs.

Most of the money/resources schools receive comes in the form of grants.

Could you provide a source for that?

If you "know that there isn't actually any way to fix the problems," why do you care if the grants are scored in insane ways, or the interventions targeted demographically, or that 98% of the money is embezzled?

(Incidentally, a reason to give away incentives demographically rather than by test score is that they become an incentive to sandbag the early scores. Which could then produce the illusion that the program improved scores.)

What... (read more)

Most of the money/resources schools receive comes in the form of grants.

Could you provide a source for that?

This claim definitely conflicts with my understanding, although perhaps it's true for that portion of resources that is actually up for grabs and not already committed through the normal funding (government) process.

This link is more in line with my understanding, that is, that most resources come from state and local government, and most of those resources are not awarded through "grants," but rather that local resources generally stay with local schools and state resources are divided in other ways but not usually through award of a grant. But I'd be interested in hearing if my (not heavily researched/sourced) understanding is incorrect either generally or at least for some portion of schools.

Some quotes from the link:

States rely primarily on income and sales taxes to fund elementary and secondary education. State legislatures generally determine the level and distribution of funding, following different rules and procedures depending on the state.

State funding for elementary and secondary education is generally distributed by formula. Many states use fun

... (read more)
Thanks! That link did imply that the 10% of funding that is federal is structured as grants, which surprised me. Though it's not clear that it means exactly the same thing as in this post. That sounds close to a tautology to me. Aren't grant applications the way that one grabs resources that are up for grabs? (OK, I can think of other examples, like specializing in disabled students, but...)
Yeah, I didn't phrase that very clearly. My thinking was drawing a distinction between (1) what may be the smaller portion of resources that is always up for grabs (and that is perhaps mainly grants) and (2) the larger portion of resources that is not discretionary in the same way because it is awarded by the government without the competitive grant application process. Of course, there may still be opportunities to also influence how that larger portion of resources is distributed, e.g., lobbying or maybe gaming the system to affect the distribution in some way.
Hm, I'm surprised you're surprised. I thought it was widely understood that giving and withholding grants was the main way the federal government got around its lack of de jure authority over schools and exerted pressure on state and local governments and school districts* - you can read coverage of the sequester's effect on them and see that the funding comes as grants, and Obama's "Race To The Top" program was purely about competing for federal grants.

* because they are in effect insolvent without federal money
There are several aspects. I knew that federal highway funds are grants for the purpose of making highways to the interstate standards (eg, landing strips), but also conditional on various things, most famously the 55 speed limit. I thought federal school funding was like that. I am fairly surprised to learn that RTTT was zero-sum, but since it was part of the stimulus, I'm not sure that's very informative. I am still unclear on whether the states proposed things to spend the money on, or whether it was an unrestricted prize for winning the contest.
It occurs to me that what could be going on is not that the individual embezzled the grant money, but that the money wound up in the school's general budget. If I believe that most money flowing through the school is from grants, then I conclude that it is needed to pay teacher salaries. So it is a decidedly good thing that it is not spent on iPods.
Using one standardized test to choose placement in an academic route would be called tracking and for some reason is a terrible thing to do.
No, "tracking" is just having different academic routes - what the school is already doing. If you can find someone who has a strong opinion on the difference between tracking based on standardized tests vs local grades, I will be very surprised.
Well, it would be tracking either way, but it wouldn't be called such if it was entirely informal, which is what it appeared based on the OP.

The specific project I was evaluating had only gotten $800,000 out of the maximum $2m. Its strategy was to purchase the male students iPod Touches, the female students makeovers, manicures, and pedicures at a local beauty parlor, and all students were offered an additional iPod Touch or Makeover, respectively, if they passed the exam at the end of the current year.

Besides everything else, the iPod touch doesn't sound like the kind of thing that already having one makes you more likely to want another. What the heck would I do with a second iPod touch if I already have one? (Besides selling it or giving it to my sister, that is.)

One public and one private? One to listen to while you record with the other? One for practical and one for recreational use? Not that either of the uses you downplay aren't also good.
Okay, the second sentence of my comment might be an exaggeration, but I stand by its first sentence. Yeah, but I'd kind of prefer to be given the retail price of a new iPod touch in cash rather than be given a new iPod touch.

Even an effective program that actually, verifiably works would have its problems: it would (as it stands) target standardized test scores of some sort, which then automatically lose some of their previous reliability as surrogate parameters. That effect has a name, which eludes me; can anyone supply it? (Loss of reliability when a variable is targeted directly and thus becomes subject to manipulation.)

That effect has a name, which eludes me, can anyone supply it? (Loss of reliability when a variable is targeted directly and thus becomes subject to manipulations.)

You are thinking of Goodhart's Law.


If you are a decision maker in education in your area, please, please, please look into the various Bayesian predictive models used for math placement;

Bayesian methods still can (and in practice, will) use race and the like as evidence - meaning that if you're black you need higher test scores and grades to qualify; they just don't entirely stonewall you from qualifying, which is a step forward, I guess.

The fair approach is to have an entrance exam for better math classes, blind to race.

This seems to me mistaken. The reason race can be used as a proxy in the first place is that there is some correlation with performance on the standard tests. If you use the standard tests, then that entirely screens out all the race information; there is no additional information in race that you didn't get from the test. This is similar to checking whether the plane flies: all information from authority and from theory is screened out by the experiment. More generally: if A is a proxy for B, and you use B, then trying to use A in addition is double-counting. Now, perhaps you are arguing that race is a proxy for test scores and something else, and you can still extract the something-else? If so, this should be made explicit.
Consider a test followed by a re-test (which we are trying to predict). To calculate the expected score on the re-test you need to apply regression to the mean. For a population where you measured a lower mean or (in the high range) smaller variance, you'll have to regress more. Of course, that mathematical fact doesn't make it non-racist or morally right to do such an adjustment. You could add a couple of simple extra questions to the test to obtain a similar improvement in accuracy. Or you could use some other side data instead - weight, height, and blood type, for example; there's a lot of other data you can use besides race. If race is used but nothing else, that's because of a tradition of racism, not because of some awesome rationality.

It's fairly amusing to see how race realists justify racism with increased accuracy, but start complaining when you adjust your evaluation of them in much the same manner, using racism/non-racism as evidence...

edit: An important correction. The test-to-test variance may also differ between the groups. E.g. if we have some robots that always test the same, even if they have a low mean, they'll have smaller regression to the mean than humans.
This is only true if you assume there is some component of luck or guesswork to the score. I admit that this may be a good model for the kinds of tests you get in American high schools. However, it is not clear to me that "black people" is the correct population to use for the regression, because by construction you have an untypical member. Why not "high-scoring people" or "all students"? Perhaps it would be helpful to construct an example using something other than race as the difference between populations, to avoid emotional entanglements?
Try Neuroskeptic. If there is no component of luck or guesswork, or anything else that varies from test to test, then the retest will be exactly the same as the original test; but that's not what we see in pretty much any test, or in any measurement of anything.
Other way around. If you've begun with a socioeconomic disadvantage, then achieving a particular test score is an indicator of greater inherent ability, insofar as such a thing exists. Someone who can run a mile in five minutes while carrying a fifty-pound weight is a better runner than someone who runs the same mile at the same speed while carrying no additional weight.

That depends very much on the specific priors and correlations.

If you're looking for the expected score on a re-test, you should apply regression towards the mean, and a lower group mean means more regression. A school may be interested in the probability of a student's success in a course, which is not a measure of inherent ability either, but depends very much on the same disadvantages that lower the test score.

edit: that is to say, if you ran a programming contest where the contestants write programs to predict re-test scores from a score and a profile photo, given a huge enough database of US students (split in two, one half available to the contestants, one for the final test), the winning code would literally measure skin albedo, and in some cases maybe also try to detect eyeglasses. Of course, the moral of the story is not that racism is good, but that socially, sometimes we don't want the most accurate guess.

edit2: Subtler measures may correlate too, besides the racial ones. E.g. the angle between the horizontal and the line connecting the pupils of the eyes, the pupil dilation in the photo, use/non-use of flash, the strength of the red-eye effect, and who knows what else (how busy the background looks, maybe?). I don't think many people here want to have their math scores adjusted depending on how they held their head in a photograph. edit3: Oh, and the image metadata, or noise signatures, that'd be a big one: is the image taken by an expensive camera? Get free points on your math test. And a free tip: squint. It will think you're Asian, or smart enough to squint.
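The underlying claim, that any group-correlated feature mechanically improves raw predictive accuracy, can be checked on synthetic data. Everything below (group means, noise levels, the shrinkage factor) is an invented illustration, and it says nothing about whether using such features is desirable:

```python
import random

random.seed(0)

def simulate(n, group_mean, ability_sd=10.0, noise_sd=10.0):
    """Each student: a latent ability, plus two noisy measurements
    of it (a test and a re-test)."""
    rows = []
    for _ in range(n):
        ability = random.gauss(group_mean, ability_sd)
        test = ability + random.gauss(0, noise_sd)
        retest = ability + random.gauss(0, noise_sd)
        rows.append((group_mean, test, retest))
    return rows

data = simulate(5000, 100.0) + simulate(5000, 90.0)

shrink = 0.5        # ability_sd**2 / (ability_sd**2 + noise_sd**2)
pooled_mean = 95.0  # overall mean, ignoring group membership

def mse(predict):
    return sum((predict(gm, t) - r) ** 2 for gm, t, r in data) / len(data)

# A predictor that ignores the group-correlated feature vs. one that uses it:
mse_pooled = mse(lambda gm, t: pooled_mean + shrink * (t - pooled_mean))
mse_group = mse(lambda gm, t: gm + shrink * (t - gm))
print(mse_pooled > mse_group)  # the group-aware predictor has lower error
```

The group-aware predictor wins on accuracy because the pooled one shrinks every score toward the wrong mean for both groups; whether to allow such a predictor is the social question the comment is raising.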

I think fubar may be right in a certain way: if you observe someone reaching a very high score while having a known poor environment (let's say you've tested them enough so one can ignore issues of <1 reliability causing a regression to the mean on subsequent retests), then you might then estimate that the non-environmental contributions must be unusually high - because something must be causing him to score very high, and it's sure not the environment. So for example, we might infer that his genes or prenatal environment or personality are better than average.
Yes. As I say, it depends on what we are trying to predict, and on the priors. Even with one test and significant regression, it's correct to infer a higher non-environmental contribution, just not a higher combination of environmental and non-environmental contributions.
It seems to me that private_messaging is right and explains his point beautifully. Here's a Robin Hanson post making a similar point. Also see this discussion, especially Wei Dai's comment.
And more recently: and
I'm confused as to why race would matter if you already have the grades and test scores information. Race might be helpful in predicting what their previous grades and test scores were, but I doubt it would improve the accuracy much over a model that excluded it. And if it really was true that race matters in who will benefit the most from the program, then so be it. Why would you not want to help those that benefited the most first, regardless what information was used to predict it?
Is it more important to be fair or accurate? There are times where it's more important to be fair. For example, punishing a person because he's guilty discourages crime. Punishing someone because he's black does not. Thus, using the fact that he's black as evidence will mean more guilty people will go to jail, more innocents will avoid being jailed, and more people will commit crime. I don't think that really applies here, though.
Well, what do you think about losing points because your profile photo has atypical proportions, or atypical posture? Points adjustment for round face, or for relative finger lengths? For having too many or too few facebook friends, likes, and so on? Weight, height, and blood type? Well, if you want to encourage education rather than encourage being white or having typical posture or other things like that, it does apply.
If you're giving prizes to the best students to encourage them, then it applies. If you're trying to match the teaching style to the student, I don't think it does.
One might say that the sanity waterline one has to cross to rationally handle test-score-based Bayesian predictions in a by-and-large rational way is much lower than the sanity waterline one has to cross to rationally handle relative-finger-length-based predictions, which itself is lower than the waterline for skin-color-based predictions.
What sanity? Everyone is pushing for measures that would be advantageous for themselves and opposing disadvantageous ones, and there's nothing particularly insane about that; it's just instinctive selfishness. The white 'nerds', for instance, could be OK with adjustment for race, but very much not OK with adjustment for various 'looks odd' metrics (which lump them together with the autistic). It's only Bayesian when it's someone else; when it's you losing points, that's you being lumped together with other people (on the basis of some random trait that happens to be widely measured), which is of course bad and irrational and a bias (complete with examples of how it is inexact). Nothing insane about that either; it's just selfishness. Meanwhile, I'd dare to guess you can get a considerably larger boost in accuracy by adding a couple more questions to a test, or by using data from some other standardized test.

I live in Arkansas (you may remember us as the state that threw a fit over desegregation roughly a Jesus-lifetime ago), in a region that is pretty economically strong but still has distinct socioeconomic classes. I'm pretty confident that this post describes reality pretty well, based on (1) my direct observations as a student, (2) what I've heard from other students, past and present, and (3) what I've heard from teachers and principals, retired and practicing.

[edit] To be more clear, I think there's almost certainly a county or several in the United States ... (read more)


I wish I had computer access to write out a longer reply to this; for now, see educationrealist's response and his blog in general. I was torn whether to upvote or downvote the article, as I don't know whether it was exploiting or exposing key weaknesses of community rationality here.

Some people have expressed some doubts about this story, and because it is anonymous, we can't verify it directly. I would like to use this opportunity to explore our models of the school system, and especially the difference between the models of insiders and outsiders.

This poll asks a pair of questions. The first question is about how well the story fits your model of the educational system. The second question is whether you are a "school system insider". That means whether you ever had a full-time job or a part-time job related to the school system; ... (read more)

You move from 'very difficult to believe' to 'seems like a very exaggerated version of true events', which are almost the same thing, but from there you jump directly to 'This story is credible' without any middle ground. I am sorry, but this seems like really poor questionnaire design.
Once the poll is made, the answers cannot be edited. The difference was supposed to mean approximately: a) I don't believe this could ever happen; b) I believe it can happen exceptionally, but not all the time as the author claims; c) I believe this can be the way the system works. In other words, the second option is like: "I believe that with so many grant proposals, once in a while a crazy thing passes unnoticed; but I don't believe that it happens all the time, not even half of the time -- you have probably seen one or two bad cases, and now you exaggerate to make your case more appealing".
I found myself wanting to say "I think this sort of grant proposal thing happens maybe 25% of the time, but not all the time, the way the post implies." I also wished there were some kind of gradation for "school insider/outsider". I'm an outsider, but I talk a lot with a friend who teaches full time. I showed this article to her and she said "yes, yes, yes. This is basically how it is." I actually DO still assign substantial probability to this being a hoax, despite it matching my understanding - we know that this is a sockpuppet account, created ostensibly for NDA anonymity. But I can think of some people here who might have created this explicitly as a test of rationality, who are sort of annoyed that the politics involved here get less scrutiny, and want to demonstrate that.
Spreading false data as a "test of rationality" would be actively harmful. But I can imagine people misunderstanding that. Rationality is a method of working with the data you have. You should update on evidence correctly, instead of updating incorrectly. You should be able to recognize that this specific piece of evidence contradicts the model based on all the other evidence, which makes this specific piece of evidence suspicious. But you should also estimate your degree of certainty in a given model. It is proper to say "I defy the data" when one's model is based on a lot of reliable evidence. Saying it more often would be overconfidence, not rationality.
I was assuming that if it were a hoax, they'd let people know in another few days with a gloating update.
I agree that this would be the most likely course of action if the essay is a hoax, but I think it would still risk being harmful overall, since retractions generally don't result in an appropriate corresponding decrease in confidence in the material that was originally presented. I'd expect Less Wrong members in general to be better at reducing their confidence in a retracted claim than most people, but better is not necessarily good enough.
Which people? Stylometry might allow us to work out whether someone's writing style matches this post. Anonymity is hard.
Yes, we could probably de-anonymize OP; I have some passing familiarity with stylometrics, so I considered trying that myself. I decided not to because so far no one has produced any smoking guns that this is false (school districts vary massively in quality across the USA & I have already mentioned an existing and far more shocking instance of school failure/corruption), if it is true then I approve of whistleblowing and have no interest in attacking the OP*, de-anonymizing probably would not set a good precedent, and if it were false - well, I do not disapprove of red team tests of LW (and would be hypocritical to disapprove) and so far this seems to be limited to LW. * Similarly, it's been suggested to me by a few people that it would be an interesting project to try to de-anonymize Satoshi Nakamoto or La Griffe du Lion. I am not sure I could, and even if I could, I would choose not to since I either approve of their work or find their material interesting. If this post ever looked like it was both false and not a limited-scale test, like it was something else (an entrapment of an off-site person? an attempt to discredit LW entirely with a Sokal-style attack?), then I might change my mind. But so far, that does not seem to be the case since I see nothing indicating this post has been picked up by the Drudge Report, Hacker News, Fox News, Breitbart, etc.
Yeah, I also generally consider that posts like this have around 5% chances of being a hoax followed by a gloating "they swallowed it" update (here or somewhere else), though this post doesn't have any huge red flags (there doesn't seem to be any huge gloating potential, I mean it's basically just venting).
So, you are asking me to condition on belief that the author of this post isn't a troll who crafted this story in such a way to be especially appealing to the LW community. Am I right? If I am, you probably should edit your original post to make your idea more clear.
That sounds very different from "This story seems like a very exaggerated version of true events". One is about frequency (how often do things in the author's intended reference class occur?) while the other is about severity (how bad are the events that actually happened to the author?)
My thoughts: having seen lots of people trying to write things and failing miserably - my father the professor repeatedly finds that large numbers of students don't follow directions when writing lab reports - I'm not surprised by a claim that people would do stupid things when writing grant applications and then not understand what the problem is when it was pointed out to them.
I'm a school system insider by this standard. This makes the survey rather broken because I know very little about schools in your country. I recommend correcting your language.
Whose country? Viliam_Bur's country is most probably not the same country as the OP author's.
In that case the survey makes even less sense to me.
The educational systems seem to me similar enough in different countries. When I read stories of teachers in Britain or the USA, of course there are differences, but the similarities are also obvious. Some of the stories could as well have happened in my class.

The biggest differences are:

- How the school is organized. (Are teachers in a union? Who appoints the director? Is there a management layer between the director and the teachers?)
- Which minorities underperform, and what political consequences that has. (Black students? Romani students? Are there ethnic quotas? Affirmative action? How often are teachers accused of racism; what is the typical reason and the typical consequence?)
- The level of violence at school, though this varies also within a country, and changes over time. (How often and how severely do students attack each other? Do students attack teachers? Do parents attack teachers? What are the consequences for the aggressor?)

And here are the similarities:

- Bureaucracy. Decisions made by people who don't have a clue, and often have zero educational experience. Supervision and assessment according to unintelligible or actively harmful criteria.
- Pseudoscience, and an aversion to measuring outcomes. (Tests are bad. Teachers shouldn't explain, but entertain. If a recommended technique doesn't work, it is always the teacher's fault, never a problem with the technique. Knowledge is a lost purpose; the true goal of the school system is to make students happy.)
- People making big money selling pseudoscience to schools.
- Random minor changes to the school system, to show voters that politicians care about their children.
- Textbooks containing nonsense.
- Parents treating teachers as babysitters.
I mostly agree (strongly) with this. However the "Tests are bad" part in particular doesn't seem to be completely general. More testing and measurement seems to be the direction things have been going here.
I read "Tests are bad" as "The tests do not accurately measure".
Ok, I've answered the survey adopting this assumption. I chose "Very difficult to believe, school system insider". But note that I would also have found the prom segregation difficult to believe if not for the somewhat credible sources so discount the results as appropriate.
I have taught in a school and subbed in others. I had very little say about who could come into my classroom, though I was only a first-year teacher. (Quit after 1 year.) I don't have any reason to believe the education bureaucracy is generally bright or honest, but I can't say I've seen that kind of thing in the grant process, possibly just because I wasn't involved in it.
I voted earlier, then came back to this post to see how it was going. How do I view the results? I tried voting again, but it won't let me, and I'm sorry, I can't guess how to view the poll results anymore. EDIT: I figured it out. On the off chance anybody else wonders: hit the chain-link-looking icon at the bottom of the poll to go to a page where you see the results (at least if you have voted). That link points to:

Currently, my firm and its allies are trying to push the government into forcing the schools to use a Bayesian prediction model, in which you feed an individual student's test scores for the past 5 years and it spits out their probability of success in the advanced classes, and you keep putting the students with the highest probability of success in the top classes until you run out of teachers.

This is good, and I hope that such models are implemented. However, when I hear the phrase "problems in education," these sorts of placement problems a... (read more)
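For concreteness, here is a minimal sketch of the kind of placement model the quoted passage describes: fit a probability-of-success model on past scores, rank applicants by predicted probability, and fill the seats from the top. All data, features, and parameters below are invented for illustration; a real deployment would be far more elaborate.

```python
import math
import random

random.seed(1)

# Hypothetical data: five years of past scores per student, and a binary
# outcome for whether they succeeded in the advanced class.
def make_student():
    ability = random.gauss(70, 12)
    scores = [ability + random.gauss(0, 8) for _ in range(5)]
    succeeded = 1 if ability + random.gauss(0, 10) > 75 else 0
    return scores, succeeded

train = [make_student() for _ in range(2000)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Plain logistic regression fit by batch gradient descent (a sketch,
# not a production model).
w, b, lr = [0.0] * 5, 0.0, 0.0001
for _ in range(200):
    grad_w, grad_b = [0.0] * 5, 0.0
    for scores, y in train:
        err = sigmoid(sum(wi * s for wi, s in zip(w, scores)) + b) - y
        for i in range(5):
            grad_w[i] += err * scores[i]
        grad_b += err
    for i in range(5):
        w[i] -= lr * grad_w[i] / len(train)
    b -= lr * grad_b / len(train)

def prob_success(scores):
    return sigmoid(sum(wi * s for wi, s in zip(w, scores)) + b)

# The placement rule: rank applicants by predicted probability of
# success and fill the available seats from the top.
applicants = [make_student()[0] for _ in range(100)]
seats = 30
placed = sorted(applicants, key=prob_success, reverse=True)[:seats]
```

The key property, matching the post's description, is that placement depends only on the predicted probability, not on who recommended the student.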

There were math teachers who absolutely refused to allow blacks/poors in their classroom, or worse, treated them in such a way as to cause them to fail, thus confirming their worldview.

This is now? Not 100 years ago?

There were certainly double standards when I was in secondary.
When was that?
Early 2000's.

What would be the point of the hoax?

Ha ha, people believed a story about a fucked up bureaucracy! How foolish they are! Bureaucracies are the perfect will of God here on earth!

A fictional account of a screwed up bureaucracy, or a screwed up school district, would hardly prove that they don't exist.

The post seemed a little strange at some points, and a little like spam for "bayesian predictive models for mathematics placement". Maybe some company has a patent on that?

Whatever. Hoax or no hoax, lots of good stories and good times.

I voted credible/outsider so that I could see the poll results. I'd have gone for plausible/outsider or preferably "no strong opinion, but I want to see the results" if either had been available.

I was in the same situation as you, and I flipped a coin between “exaggerated” and “credible”.

The specific project I was evaluating had only gotten $800,000 out of the maximum $2m. Its strategy was to purchase the male students iPod Touches, the female students makeovers, manicures, and pedicures at a local beauty parlor, and all students were offered an additional iPod Touch or Makeover, respectively, if they passed the exam at the end of the current year.... only 25% (14/56) of the students targeted by the program had failed the reading exam in the first place.

$800,000 / 56 students ≈ $14,000 per student. Those are some expensive iPod Touches!

See this part of the post:

I described in rigorous detail everything the man had done wrong, put in a strong recommendation to not award him grant money in the future, and suggested that some sort of corruption investigation be conducted to see if he had committed any crimes (23 iPods + 23 Makeovers does not total to $800,000, after all).

Interestingly, 23+23 != 56
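For what it's worth, the arithmetic in this sub-thread is easy to check:

```python
# The grant amount and student counts quoted from the post:
grant = 800_000
students_total = 56
students_failed = 14  # "only 25% (14/56) ... had failed"

per_student = grant / students_total
print(round(per_student, 2))  # 14285.71 -- the "$14,000 per student" quip holds

assert students_failed / students_total == 0.25
print(23 + 23)  # 46, which indeed does not cover all 56 students
```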

Has anyone published data on the effectiveness of Bayesian prediction models as an educational intervention? It seems like that would be very helpful in terms of being able to convince school districts to give them a shot.

Most of the discussions I've happened to see focus on value-added modeling, because being able to judge teachers is directly useful to school districts and lots of outsiders are interested in the topic. Since the usual approach uses a multilevel model (you need to adjust for school-level effects, district-level effects, etc. before you can extract a usable teacher-level effect), it's almost Bayesian by default, and if you google 'bayesian value-added modeling' you'll find a ton of material.
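As a very rough illustration of why the school-level adjustment matters, here is a crude two-step sketch, not a real value-added or Bayesian multilevel model; all data are invented:

```python
import random

random.seed(2)

# Toy data: each school has its own baseline, and each teacher adds an
# effect on top of it.
school_base = {"A": 60.0, "B": 75.0}
teacher_true = {("A", "t1"): +5.0, ("A", "t2"): -5.0, ("B", "t3"): 0.0}

scores = []
for (school, teacher), effect in teacher_true.items():
    for _ in range(200):
        scores.append((school, teacher,
                       school_base[school] + effect + random.gauss(0, 10)))

def mean(xs):
    return sum(xs) / len(xs)

# Crude two-step version of the multilevel idea: subtract each school's
# mean, then average the residuals per teacher.
school_mean = {s: mean([x for sc, _, x in scores if sc == s])
               for s in school_base}
teacher_est = {key: mean([x - school_mean[key[0]]
                          for sc, t, x in scores if (sc, t) == key])
               for key in teacher_true}

# A limitation of the crude approach: school B has a single teacher, so
# the school mean absorbs that teacher's effect and the estimate is zero.
print(sorted(teacher_est, key=teacher_est.get))
```

A proper multilevel model handles the single-teacher-school case by partially pooling across levels instead of subtracting means outright; this sketch only shows the shape of the adjustment.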
My experience is that school districts have a strong not-invented-here bias. For example, special education laws require research based interventions, a requirement that is generally ignored.
Maybe slightly vary the parameters to make the model "new"? Like, fit it to data from that district, and it will probably be slightly different from "other" models.
But that requires effort, and school districts don't generally want to put in effort to do things differently.
I think the hard part of refitting the model would probably just be getting access to the data -- beyond that it seems like a statistician or programmer would be able to just tell a computer how to minimize some appropriate cost function. Something like most of the marginal effort is devoted to gathering the data, which presumably doesn't require that much expertise relative to understanding the model in the first place.
In practice, there are substantial privacy law issues, although those can be gotten around if the district is clever. More importantly, collecting, collating, and ensuring coder reliability is expensive. What you called "marginal effort" is quite difficult for just about any large bureaucracy.
Marginal effort within the bounds of a consulting agency offering a service "tailored" to each school district.
ThinkOfTheChildren: Quite a few things there. SAS's EVAAS is generally considered the gold standard of Bayesian prediction models as educational interventions; unfortunately, as SAS is based in North Carolina, it has yet to spread outside that particular state. Some states have similar systems being produced by similar companies. Particularly, if I were you I would read:

Scary, if true, but not too surprising.

(It seemed to me that my high school used an algorithm that amounted to "students who asked for honors classes got them", although apparently there was a lot going on behind the scenes that I didn't see...)

It depends entirely on when you were in school. At present, most of a student's path is determined by whether they are selected for 8th grade Algebra (in fact, if you were to rank all of the factors possible in determining a person's lifetime earnings, the factor at the top would be whether you took Algebra in 8th grade). The 7th grade math teacher's recommendation is the primary factor in this particular decision, and middle school teachers are incompetent at predicting whether a child could succeed at advanced math 4-6 years later.

in fact, if you were to rank all of the factors possible in determining a person's lifetime earnings, the factor at the top would be whether you took Algebra in 8th grade

Citation needed, especially for a claim of causality.

I would settle for a correlation which was stronger than "whether you wanted to take Algebra in the 8th grade".

Since the post never came back (much less with "citations galore"), here's a mirror.

It came back here

I don't suppose anyone saved the original version of this before it was edited?

Good data on the disparate racial outcomes for some advanced math teachers. What's the relative prevalence of the bigoted? Is this across other subjects as well? What region of the country are we talking about?

There were math teachers who absolutely refused to allow blacks/poors in their classroom, or worse, treated them in such a way as to cause them to fail, thus confirming their worldview.

They can't be fired/fined/reprimanded, or is there no will to do it?

As I understand it, it's exceptionally hard to fire teachers within the American school system -- it takes evidence of sexual misconduct or something similarly precipitous, and it's expensive, time-consuming, and legally hazardous. Even those charges aren't a sure bet. A teacher of mine in high school was suspended on sexual harassment charges -- well-founded ones from what I heard, although I have no direct knowledge -- leading to a lengthy punitive process that involved, among other things, investigators taking students out of their classes and interrogating them about the allegations. He was back in his classroom before the year was out. Needless to say, ordinary incompetence won't do it. I don't think implicit racism would either, as long as it stayed implicit -- wearing a KKK hood into the classroom would probably be beyond the pale. Probably.
I understand that it next to impossible to fire teachers, unless you hit on extreme hot button issues. Sex with students is number 1. But I'd expect the long knives to come out for racism/sexism/homophobia as well, at least in some jurisdictions. Likely not in others. That's why I was asking about what region of the country we're talking about.
In all but the most liberal districts, and maybe even then depending on how cynical you are, I think I'd expect any of that to get a pass as long as plausible deniability existed. Unfortunately, that's plausible deniability from the standpoint of parents and administrators who generally aren't statistically literate nor inclined to take student impressions all that seriously, and that leaves quite a bit of leeway as long as the teacher in question is bright enough to couch their objections in the right terms. You know and I know that if the bell curve on expected achievement is shaped such that 30% of the student population from some minority group should be admitted to an advanced math class, and 0% actually is, then after a couple of years that's as good as admitting racial prejudice. But I think that'd be a much harder sell to a review board, especially one that doesn't want to incur the wrath of the teachers' union or any further investigative costs.
If you have 20 teachers who are fair, it would not be surprising for a statistical analysis to show that one of them is 95% likely to be unfair.
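This is the standard multiple-comparisons point, and the number checks out:

```python
# If each fair teacher independently has a 5% chance of looking "unfair"
# at the 95% confidence level, then across 20 fair teachers:
p_at_least_one_flagged = 1 - 0.95 ** 20
print(round(p_at_least_one_flagged, 2))  # 0.64: more likely than not
```

So a district analysis that flags one teacher out of twenty at 95% confidence has shown essentially nothing without a correction for multiple tests.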