A piece I saw that Benjamin Todd adapted from THINK's module on charity assessment. Some of you may recall the network's recent launch.
Lots of social interventions end up doing more harm than good. Many more make no difference at all, and are just a waste of resources. At times, we’ve probably argued with friends about which interventions we’d like to see, and which we wouldn’t. But are we any good at judging what’s likely to work?
Here’s a cool bit of content adapted from THINK. Try and guess which of these eight programs made a difference, which had no effect, and which made things worse.
ciphergoth said that it should be emphasised that this isn't a trick question where the answer is that they all worked or none did.
Round #1: Scared Straight
Program description: “In the 1970s, inmates serving life sentences at a New Jersey (USA) prison began a program to ‘scare’ or deter at‐risk or delinquent children from a future life of crime. The program, known as ‘Scared Straight’, featured as its main component an aggressive presentation by inmates to juveniles visiting the prison facility. The presentation depicted life in adult prisons, and often included exaggerated stories of rape and murder … The program received considerable and favorable media attention and was soon replicated in over 30 jurisdictions nationwide … Although the harsh and sometimes vulgar presentation in the earlier New Jersey version is the most famous, inmate presentations are now sometimes designed to be more educational than confrontational but with a similar crime prevention goal. Some of these programs featured interactive discussions between the inmates and juveniles, also referred to as ‘rap sessions.’”(2)
Did the program decrease the rate of juvenile crime?
Round #2: Nurse‐Family Partnership
Program description: “The Nurse‐Family Partnership program provides nurse home visits to pregnant women with no previous live births, most of whom are i) low‐income, ii) unmarried, and iii) teenagers. The nurses visit the women approximately once per month during their pregnancy and the first two years of their children’s lives. The nurses teach i) positive health related behaviors, ii) competent care of children, and iii) maternal personal development (family planning, educational achievement, and participation in the workforce). The program costs approximately $12,500 per woman over the three years of visits (in 2010 dollars).”(6)
Did the program improve the quality of child care?
Round #3: Drug Abuse Resistance Education (DARE)
Program description: “DARE is a highly‐structured substance‐abuse prevention program taught by uniformed police officers … The program is typically provided over the course of 10‐20 weekly hour‐long sessions, during which the police officers use lectures, class discussion, role plays, and homework assignments to i) teach students about substance use and its effects, ii) teach students decision‐making and peer pressure resistance skills, and iii) boost students’ self‐esteem. Prior to teaching, the police officers take an 80‐hour training course on teaching techniques, classroom management, and the DARE curriculum … DARE costs approximately $130 (in 2004 dollars) per student and, as of 2001, was operating in 75% of American school districts.”(8)
Did the program decrease the rate of drug use?
Rounds #4 and #5: 21st Century Community Learning Centers
Program description: “21st Century Community Learning centers is a large ($1 billion per year) US Department of Education program which funds optional after‐school programs for elementary and middle school students in mostly high‐poverty schools. Key goals of the program are to i) provide students with a safe place after school, and ii) improve their academic performance. Recipients of program funds (ie, school districts and/or non‐profit educational/community organizations) are required to provide academically focused “extended learning activities” (e.g., instructional enrichment programs, tutoring, or homework assistance). Most centers also offer enrichment/recreational activities such as martial arts, sports, dance, art and/or music … (Elementary school) centers vary in the activities they offer and other key features, and thus comprise a range of after‐school interventions rather than a single intervention. In a typical center,
students may spend an hour doing homework and having a snack, an hour on additional academic activity (eg, a lesson or working in a computer lab), and an hour doing recreational or cultural activities;
the center’s staff are a mixture of certified teachers, instructional aides, and representatives of community youth organizations;
the center is open 4‐5 days per week for three hours after school, and serves approximately 85 students per day; and
the average student attends the center 2‐3 days per week.
Centers spend approximately $1,000 (in 2005 dollars) on each enrolled student per year.”(10)
Did the program increase the students’ academic achievement?
Did the program reduce students’ behavioural problems?
Round #6: Even Start Family Literacy program
Program description: “The Even Start program is intended to ‘help break the cycle of poverty and illiteracy by improving the educational opportunities of the nation’s low‐income families by integrating early childhood education, adult literacy or adult basic education, and parenting education into a unified family literacy program’. In 2000‐2001, there were 855 Even Start projects serving 31,896 families … Even Start grantees had considerable flexibility in designing services to meet the needs of the low‐income families, but all were required to offer four services:
adult education to develop basic educational and literacy skills;
early childhood education services to provide developmentally appropriate services to help prepare children for school;
parenting education to help parents support the educational growth of their children; and
parent‐child literacy activities.”(13)
Did the program increase literacy?
Round #7: Big Brothers Big Sisters
Program description: “Big Brothers Big Sisters’ community‐based mentoring program matches youths aged 6‐18, predominantly from low‐income, single‐parent households, with adult volunteer mentors who are typically young (20‐34) and well‐educated (the majority are college graduates) … The mentor and youth typically meet for 2‐4 times per month for at least a year, and engage in activities of their choosing (e.g., studying, cooking, playing sports). The typical meeting lasts 3‐4 hours … For the first year, Big Brothers Big Sisters case workers maintain monthly contact with the mentor, as well as the youth and his or her parent, to insure a positive mentor‐youth match, and to help resolve any problems in the relationship. Mentors are encouraged to form a supportive friendship with the youths, as opposed to modifying the youth’s behavior or character… In 2008, Big Brothers Big Sisters served 255,000 youths and 470 agencies nationwide. The national average cost of making and supporting a match is approximately $1,300 in 2009 dollars.”(14)
Did the program decrease drug use and violent behavior?
Round #8: Top 16 Educational Software
Program description: “In the No Child Left Behind Act of 2002, Congress called for a rigorous study of the effectiveness of educational technology for improving student academic achievement … In fall 2003, developers and vendors of educational technology products responded to a public invitation and submitted products for possible inclusion in the national study. Mathematica Policy Research, Inc. staff selected 40 of the 160 submissions for further review by two panels of outside experts, one for reading products and one for math products … In January 2004, (the US Department of Education) considered the panel’s recommendations and selected 16 products for the study. In selecting products, (the US Department of Education) grouped them into four areas:
early reading (first grade),
reading comprehension (fourth grade),
pre‐algebra (sixth grade), and
algebra (ninth grade).
The products ranged widely in their instructional approaches and how long they had been in use. In general, however, the criteria weighted the selection towards products that had evidence of effectiveness from previous research, or, for newer products, evidence that their designs were based on approaches found to be effective by research. Twelve of the sixteen products had received awards or been nominated for awards (some as recently as 2006) by trade associations, media, teachers, or parents.”(15)
Did the program improve test scores?
Here are the answers!
Round #1: Scared Straight
Negative! Several randomized controlled trials have shown that Scared Straight had a negative effect. Going through Scared Straight made children more likely to commit crimes in the future (3). Fun fact: Scared Straight programs are still being run today (4), and people promote them as being effective, despite the fact that they are harmful (5).
Round #2: Nurse‐Family Partnership
Positive! Three randomized controlled trials have shown that the Nurse‐Family Partnership had a positive effect. The program led to a reduction in child abuse/neglect, child injuries (20‐50% reduction) and an improvement in cognitive/educational outcomes for children of mothers with low mental health/confidence/intelligence (e.g., 6 percentile point increase in grade 1‐6 in reading/math achievement) (7).
Round #3: Drug Abuse Resistance Education (DARE)
No effect! Two randomized controlled trials have shown that DARE did not have an effect on the rate of drug use among participants. The rate of drug use did not increase or decrease (9).
Round #4: 21st Century Community Learning Centers
No effect! A randomized controlled trial has shown that the 21st Century Community Learning Centers had no effect on participating students’ academic performance. Students who participated were neither helped nor harmed by the program (11).
Round #5: 21st Century Community Learning Centers
Negative! A randomized controlled trial has shown that the 21st Century Community Learning Centers caused an increase in the behavioral problems of participating students (12).
Round #6: Even Start Family Literacy Program
No effect! A randomized controlled trial on a subset of Even Start programs found no evidence of an increase or decrease in literacy in parents or children (17).
Round #7: Big Brothers Big Sisters
Positive! A randomized controlled trial has shown that Big Brothers Big Sisters caused youths to be 46% less likely to have started using illegal drugs, 27% less likely to have started using alcohol, and 32% less likely to have hit someone in the previous year, and to skip fewer days of school during the past year (18).
Round #8: Top 16 Educational Software
No effect! The study described was a randomized controlled trial, and showed that the software did not make a noticeable difference in any of the categories. It did not help or hurt with 1) early reading (first grade), 2) reading comprehension (fourth grade), 3) pre‐algebra (sixth grade), or 4) algebra (ninth grade) (19).
How did you do?
If you got 7-8 right, there’s less than a 1% chance you were guessing. If you got 5-6 right, there was only an 8.5% chance you were guessing, so it might be skill. If you got 1-4 right, then you did no better than randomly guessing. If you got zero right … we could get useful information by always doing the opposite of what you do.
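The percentages quoted above are consistent with a simple binomial model in which a guesser picks one of the three possible answers (positive, no effect, negative) uniformly at random on each of the eight rounds. The post does not state its model, so the 1/3 guessing rate and the names `P_GUESS`, `prob_exactly`, and `prob_between` in this minimal sketch are assumptions. Note that what it computes is the probability of a score given pure guessing, not the probability of guessing given a score.

```python
from math import comb

N_ROUNDS = 8      # number of rounds in the quiz
P_GUESS = 1 / 3   # ASSUMPTION: three equally likely answers per round


def prob_exactly(k, n=N_ROUNDS, p=P_GUESS):
    """Probability that a pure guesser gets exactly k of n rounds right."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_between(lo, hi, n=N_ROUNDS, p=P_GUESS):
    """Probability that a pure guesser scores between lo and hi inclusive."""
    return sum(prob_exactly(k, n, p) for k in range(lo, hi + 1))


if __name__ == "__main__":
    for lo, hi in [(7, 8), (5, 6), (1, 4), (0, 0)]:
        print(f"{lo}-{hi} right: {prob_between(lo, hi):.2%}")
```

Under this assumption, the chance of 5-6 correct by guessing is 560/6561, or about 8.5%, and the chance of 7-8 correct is 17/6561, or about 0.26%, which matches the figures in the text.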
The effects of social interventions are extremely complex. All of these programs sound good, but unintended consequences can get in the way. It’s very difficult to work out what’s going to be successful ahead of time. Instead, we need to test, measure the results, and take it from there.
I thought Round #2 would have no effect, and I expected Round #5 to have no effect rather than a negative one, so I got 6 out of 8 correct. How well did you do?
I recommend checking out the links and references. Gwern's comment there was also interesting.
So out of this sample, the only two interventions that had positive effects were based on one-on-one relationships. Any wisdom we can draw from this, or is it just a coincidence?
I got all 8 correct. My take is that programs that make participants feel like they have low status are unlikely to succeed.
I would explicitly say "The answer is yes for some but not all of these" - ie it's not a trick question where the answer is that they all worked, or none did.
As it usually happens in the social "sciences," it's very naive to believe that in any of these cases we have anything like solid evidence about the total effect of the programs in question. Even ignoring the intractable problems with disentangling all the countless non-obvious confounding variables, there is still the problem of unintended consequences -- which may be unaccounted for even if the study seemingly asks all the relevant questions, and which may manifest themselves only in the longer run.
Take for example this nurse-family partnership program. Even if the study has correctly proven that these positive outcomes have occurred in the families covered by the intervention, and that they are in fact a consequence of the intervention -- a big if -- we still have no way of knowing its total long-run effect. For one, it may happen that it lowers the cost of having children for poor unmarried women, both by providing assistance and by lowering the stigma and fear of such an outcome, so that in the new long-term equilibrium, more children are born to such women, especially the least responsible, resourceful, and competent ones, eventually increasing the total measure of child poverty, neglect, abuse, etc. Of course, this may or may not be the case, but there's no way to know it based on these studies that purport to give a definitive evaluation of the program's success.
I found this article an interesting overview of examples of unintended consequences of past changes; it makes a case for being very cynical about this particular kind of argument:
A Really, Really, Really Long Post About Gay Marriage That Does Not, In The End, Support One Side Or The Other
Eh. There are two camps in libertarianism: the moral libertarians, and the technical libertarians. The moral libertarians derive their policies from principles- force is wrong, taxation implies the threat of force, and thus we need to build a society without taxation if we want to live in a moral society.
The technical libertarians derive their policies from economic arguments and history. It doesn't matter whether you think it's moral or immoral to lend money for profit- let's look at societies which allow that and societies which don't, and see which ones prosper more, and apply theoretical principles to expect which should be the case.
And so the atheist moral libertarian looks at gay marriage, and says something along the lines of "the state shouldn't be involved in marriage at all!" or, if you're lucky, "the state should recognize a marriage contract between any two consenting adults!". (The Christian moral libertarian probably thinks that gay marriage is wrong for the standard Christian reasons.) The technical libertarian, though, will be wi...
Lately I've been noticing that a lot of my comments get downvoted once shortly after I post them, especially (but not exclusively) in threads where I post libertarian-progressive-ish rebuttals to social-conservative positions.
I'd like to think that it's just someone who doesn't approve of political discussion on LW — but the socially conservative interlocutors don't seem to be getting the same treatment. (With the exception of the ever-popular sam0345, whose low comment scores I expect have more to do with his hostile attitude than the fact that he posts about politics.)
So there does seem to be some Blue/Green unpleasantness going on here. Comments advocating "race realism", sexual shame, or other socially conservative positions tend to float around +3 or +4, while responses disagreeing with them — even with citations to academic work and evidence on the subject — tend to float around -1 to +1.
It doesn't bother me all that much. If my comments were actually getting buried, I'd be worried that we had a bury brigade going on — but they're not. My current hypotheses are either ⓪ I'm just not very good at commenting, ① I have a stalker, ② the idea that social conservatism ...
I think this is what being on one side of a tribal conflict looks like from the inside. My experiences have been similar, with many of my posts getting instantly downvoted to -3 or -4, then slowly recovering karma later. As you probably recall from our recent conversations, we have differing opinions on some politically charged subjects.
I don't think you're a bad poster, and you seem to have a high karma score, so we can mostly throw out ⓪. I recall often upvoting your posts, even the ones I disagree with, and only recall downvoting a recent one where you seemed to be plain wrong in the context of the discussed article. In that case I also made a comment explaining why I thought it wrong. The contrarian explanation as I wil...
I used to be excited about the idea of harnessing the high intellectual ability and strong norms of politeness on LW to reach accurate insight about various issues that are otherwise hard to discuss rationally. However, more recently I've become deeply pessimistic about the possibility of having a discussion forum that wouldn't be either severely biased and mind-killed or strictly confined to technical topics in math and hard sciences.
It looks like even if a forum approaches this happy state of affairs, the way old Overcoming Bias and early LessWrong arguably did for some time, this can happen only as a brief and transient phenomenon. (In fact, it isn't hard to identify the forces that inevitably make this situation unstable.) So, while OB ceased to be much of a discussion forum long ago, LW is currently in the final stages of turning into a forum that still has unusual smarts and politeness, but where on any mention of controversial issues, bat...
Much of the uncertainty in estimating the "success" of these programs lies in not knowing to what degree each of the different social indicators measuring said "success" has already been corrupted, in line with Campbell's law.
I would like to see this article published in the (very hypothetical) World Where Nobody Just Guesses. They would surely be amused at that sky-high 1% figure, when the real number is obviously 0% since nobody was going to guess.
You can't just flip around statements about conditional probability, darn it.
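To make the point about conditional probability concrete, here is a minimal sketch of how the probability of guessing given a score differs from the probability of a score given guessing. Everything numeric in it is a hypothetical assumption for illustration only: a 1/3 per-round hit rate for a guesser, a 0.6 per-round hit rate for a "skilled" reader, and a 50% prior that any given reader is guessing. None of these figures comes from the post.

```python
from math import comb


def prob_score_range(lo, hi, p, n=8):
    """P(score between lo and hi inclusive | per-round hit rate p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def posterior_guessing(lo, hi, prior_guessing, p_guess=1 / 3, p_skill=0.6):
    """Bayes' rule: P(guessing | score in [lo, hi]).

    ASSUMPTIONS (hypothetical, not from the post): readers are either pure
    guessers with hit rate p_guess or skilled with hit rate p_skill.
    """
    like_guess = prob_score_range(lo, hi, p_guess)
    like_skill = prob_score_range(lo, hi, p_skill)
    evidence = like_guess * prior_guessing + like_skill * (1 - prior_guessing)
    return like_guess * prior_guessing / evidence


if __name__ == "__main__":
    likelihood = prob_score_range(7, 8, 1 / 3)
    posterior = posterior_guessing(7, 8, prior_guessing=0.5)
    print(f"P(7-8 right | guessing) = {likelihood:.4f}")
    print(f"P(guessing | 7-8 right) = {posterior:.4f}")
```

The two printed numbers differ, and the second one moves with the prior: in a world where everybody guesses (prior of 1), the posterior is 1 no matter how small the likelihood is.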
Did the first 4, then skipped to the answers - and I got all of those right! Well, with the exception of DARE; I see no logical way it can't be harmful and counterproductive - for the simple reason that it's crying wolf to teenagers, who are both impressionable and suspicious of adult authority.
I think that, when programs like these are unleashed upon sheltered teenagers from middle-class backgrounds (which is now happening in Russia as well), it can be a big part of the problem; kids are told in strong terms that marijuana and meth are both deadly and aw...
What does it say about me that on reading #1 I thought “Well, I guess that it increased the rate of crime among males and decreased it among females, but given that more crimes are committed by males than by females to begin with, I guess the total crime rate went up”?
Here are my results. P(yes) is the probability I gave that a given intervention worked before I saw the answers.
My average probability rating was 0.39. My average probability rating for correct answers was 0.5, and my average probability rating for incorrect answers was 0.35.
If I do better than chance at this, it's not by much.
6/8 - I thought 21st Century schools would improve behavior, and I thought Even Start would improve child literacy but not adult literacy. This is really more of a 5/7, though, since I already knew of DARE's ineffectiveness.
The Nurse-family partnership thing sounds like a big-government liberal dream program, so how could I not predict that it would greatly improve outcomes :P
According to the Wikipedia article on happiness economics, women historically reported greater happiness than men up until the 1960s. This coincides with the triumph of first-wave feminism (suffrage and the removal of legal obstacles) and the emergence of a social justice feminism that's more retributory and provides retaliatory privileges for women. Since then, women have reported increasingly less happiness and are now more unhappy than men on aggregate, in the West. Evidence supporting social conservatism?
Did they miss "against a properly set up control" after the word "test"?
I applaud your effort to promote clear thinking on effects of policy. One thing I wanted to mention is that even programs with weak effects may be eventually worthwhile if there is some cancellation of direct effects of the program by some indirect effects. Even if not worthwhile directly, interventions with "effect cancellation" may perhaps be informative for designing future interventions. People in "mediation analysis" worry about these things a lot.
Analyzing mediation is tricky because it needs strong assumptions.