I've just watched this talk about Genetics and Intelligence by Steve Hsu[1], a theoretical physicist and Scientific Advisor to the Cognitive Genomics Lab of BGI (formerly the Beijing Genomics Institute), probably the leading genomics research center in the world.
Apparently, the main reason he gave this talk was to recruit volunteers for a study from the Cognitive Genomics Lab with the goal of investigating the genetics of human cognition.
From their homepage:
We currently seek participants with high cognitive ability. You can qualify for the study if you have obtained a high SAT/ACT/GRE score, or have performed well in academic competitions such as the Math, Physics, or Informatics Olympiads, the William Lowell Putnam Mathematical Competition, TopCoder, etc.
Automatic qualifying criteria include:
- An SAT score of at least 760V/800M post-recentering or 700V/780M pre-recentering; ACT score of 35-36; or GRE score of at least 700V/800Q.
- A PhD from a top US program in physics, math, EE, or theoretical computer science.
- Honorable mention or better in the Putnam competition.
If you qualify as a participant, we may send you a DNA saliva kit. After you return this kit, we will genotype your DNA, and the data will eventually be available to you on this website, in a format compatible with many 3rd party interpretational tools.
I guess there are quite a few Lesswrongers smart enough to qualify for this study. If you want to advance Science and get genotyped for free, check out their website for further information.
1: Steve Hsu has an awesome blog called "Information Processing". He writes about the genetics of intelligence, economics, psychometry, career advice for geeks, physics, etc.
Actually, they will not sequence your genome - they will genotype you. It's actually quite a difference and I would recommend changing the title of this post. So what they do is use a so-called "SNP chip" to test your genotype at a large number (hundreds of thousands) of positions known to be polymorphic in the human population (http://en.wikipedia.org/wiki/SNP_genotyping). This is the same technology currently used by personal genomics companies such as 23andMe, and it's not particularly expensive.
Sequencing a genome is done by totally different technologies, and can potentially determine your entire genome sequence (whereas in the genotyping case you are restricted to the particular loci that were included on the chip). It is also still considerably more expensive.
I signed up for this back in March 2012, sent in my spit in September, and according to an email I got today they're actually sequencing my full genome. So it turns out the original title was correct (at least in my case).
Thanks for correcting my error. How should I name the title? What about "SNP genotyping for free! ( If your IQ is in the 99,9th percentile) "
( The problem is that some people, like me, may not know the term. Which makes for bad marketing :-) )
I think "Get genotyped for free" would work. I don't know how many people will not fully understand what is meant, but it'd be hard to come up with something else without getting non-snappy.
How ironic that the people who are supposed to be intelligent can't tell what they're signing up for. Then again, maybe I'm bitter because for some reason I didn't do well enough on the math portion.
Continuing; general media coverage:
Couldn't help but be bemused by http://www.cnn.com/2009/WORLD/asiapcf/08/03/china.dna.children.ability/ & http://www.bionews.org.uk/page_46453.asp - if ever phenotype should be screening off genotype...
And on the topic of whether they will use the results? http://www.nytimes.com/1998/08/16/world/scientists-debate-china-s-law-on-sterilizing-the-carriers-of-genetic-defects.html?pagewanted=all&src=pm came up in a search:
From another South China Morning Post article, "Genetic data on IQ could mislead, academics say":
Where do you get the IQ of 145 from? I don't see anything about it on their website.
He mentions it in the talk. They are searching for people with cognitive abilities at least +3 SD above average (which equals IQ 145). (EDIT: At 45:20 he starts talking about the study design.)
I was under the impression that an IQ of 145 puts one in the 99.75th percentile but I never investigated too closely. Some stats:
Rarity of IQ 145 on a 15 SD scale (e.g. Wechsler) and a 16 SD scale (e.g. Stanford-Binet):
- 15 SD: 99.8650032777%, 1 in 741
- 16 SD: 99.7542037453%, 1 in 407
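For anyone who wants to check those rarity figures, here is a minimal sketch. It assumes IQ is normally distributed with mean 100, which is the usual convention but still an assumption; the function names are just illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def iq_rarity(iq: float, sd: float, mean: float = 100.0):
    """Return (fraction of people scoring below iq, N such that ~1 in N scores at or above iq).

    Assumes a normal distribution with the given mean and SD.
    """
    z = (iq - mean) / sd
    p = normal_cdf(z)
    return p, 1.0 / (1.0 - p)

for sd in (15, 16):
    p, one_in = iq_rarity(145, sd)
    print(f"SD {sd}: {100 * p:.4f}th percentile, about 1 in {one_in:.0f}")
```

The same IQ number is +3 SD on a 15 SD scale but only about +2.8 SD on a 16 SD scale, which is where the two different rarities come from.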
Hm, I've trusted Steve Hsu on this. I don't know if they only accept people in the 99,9th percentile or if folks in the 99,865th meet the criteria, too. I guess it doesn't matter too much.
EDIT: And I've changed the title again. Now it should be sufficiently vague.
(And is best expressed as SDs because IQ numbers from one test are not directly equivalent to those on another test.)
Alright, I've changed the title.
I wonder how these criteria were decided upon. I don't see how a 1500 GRE score is at all comparable to an honorable mention on the Putnam. It seems like even getting a 20 on the Putnam (most years) is significantly more impressive than any GRE score.
I wouldn't be surprised if the two results are equivalent by the naive measure: the fraction of Putnam participants who get honorable mention or better might well be about equal to the fraction of GRE test takers who get a score of 1500 or greater.
The real issue here, of course, is self-selection. Lots of people take standardized tests because they want to apply somewhere that requires the score, which drives the GRE average down. On the other hand, most people who take the Putnam do so because they enjoy the challenge, and are more likely than average to be good at math. This drives the Putnam average up, compared to what it would be if every college student in the US took it.
Edit: I don't have combined statistics for the GRE, but between 5% and 6% of test-takers got 800 on the math section when I took it in 2010, while I estimate 700+ on the verbal section was about as likely. Obviously the two are correlated, but it seems reasonable to suspect that the combined percentage was less than the 2% of Putnam contestants who received an honorable mention or better.
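A rough way to bracket that combined percentage, as a sketch only: without the joint distribution, all we can compute are the bounds implied by the two marginal rates. The 5.5% figures below are assumptions (the midpoint of "between 5% and 6%" for math, and the same value for verbal since it was "about as likely").

```python
def joint_bounds(p_a: float, p_b: float):
    """Bounds on P(A and B) when only the marginal probabilities are known."""
    lower = max(0.0, p_a + p_b - 1.0)  # Fréchet lower bound
    independent = p_a * p_b            # value if the two scores were unrelated
    upper = min(p_a, p_b)              # value if one group fully contains the other
    return lower, independent, upper

p_math = 0.055    # assumed share of test-takers with 800 on math
p_verbal = 0.055  # assumed share of test-takers with 700+ on verbal

lower, independent, upper = joint_bounds(p_math, p_verbal)
print(f"if independent: {independent:.2%}, if fully overlapping: {upper:.2%}")
```

Since the two scores are positively correlated, the true figure should sit somewhere between the independent and fully overlapping values; whether it lands under 2% depends on how strong that correlation is.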
I think the real issue is that comparing the GRE quantitative section to the Putnam isn't reasonable.
I wouldn't be the least bit surprised if there are tons of people who are not capable of scoring anything at all on the Putnam, yet have perfect GRE quant scores.
In any event, seeing researchers using the naive measure (if this is indeed what they did) to compare what are blatantly apples and oranges makes me feel a bit uneasy.
We are aware that the SAT and GRE quant sections have low ceilings, which is why you also need a very high verbal score to qualify through one of those tests alone. But some extremely intelligent people fail to clear the verbal cutoff (especially likely when English isn't their first language), so we figured it was worth mentioning that performance on the level of Putnam Honorable Mention gets you in regardless of your standardized test scores.
Also keep in mind that these are just automatic qualifying criteria. We'll admit a fair number of volunteers who don't satisfy any of them.
Christopher Chang, BGI Cognitive Genomics
Great to hear from you, that's good to know!
On the GRE in 2010, 800M was ~94th percentile, while 700V was ~97th percentile. The math section is just too easy for many students in engineering, physics, mathematics, and so forth.
For those of you still waiting, got an email to the effect of: We did it, sorry it took so long, it'll be uploaded in 3 weeks. Also they said that more than half were not yet done.
I received a similar email and was able to download my genome file a few days ago. The file is 23andMe format output by Plink. It was text even though it had a .gz suffix. I had trouble uploading the file to Promethease, but was able to get it working by changing the header to one copied from an actual 23andMe file and removing the missing (--) SNPs. Unfortunately, despite being ~125MB (~5x the size of an example 23andMe file I have) my file is missing many of the 23andMe SNPs (7948 genotypes annotated in Promethease vs. 20k+ for the 23andMe example). I have an email in to BGI requesting additional information. For example, Promethease directly supports the dbSNPAnnotated.bz2 Complete Genomics file and I was hoping to get a copy of that file for my data.
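The "remove the missing (--) SNPs" step can be sketched roughly as below. This assumes the usual 23andMe raw-data layout of tab-separated rsid, chromosome, position, genotype with "#" comment lines; the rsIDs in the sample are made-up placeholders, and the header swap is not covered because it needs a header from a real 23andMe file.

```python
def drop_no_calls(lines):
    """Yield 23andMe-style lines, skipping rows whose genotype is '--' (no call)."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # keep header/comment lines and blank lines untouched
            continue
        fields = line.rstrip("\r\n").split("\t")
        # Expected columns: rsid, chromosome, position, genotype
        if len(fields) >= 4 and fields[3] == "--":
            continue  # drop missing genotypes
        yield line

# Placeholder data for illustration only, not real SNPs.
sample = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs0000001\t1\t1000\tAA\n",
    "rs0000002\t1\t2000\t--\n",
    "rs0000003\t2\t3000\tCT\n",
]
cleaned = list(drop_no_calls(sample))
print("".join(cleaned), end="")
```

For a real file you would pass an open file handle instead of the sample list and write the yielded lines to a new file, so the original download stays untouched.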
Have you had any success analyzing your results? Would anyone be interested in starting a discussion group for analyzing our BGI results?
Are you sure you've downloaded your entire genome file? My uncompressed file is about 500 MB, and I got about 26000 annotations on Promethease. It seems like your file might have gotten truncated during the download.
Short step-by-step guide for those who want to get their genome annotated by Promethease:
# rsid; Promethease chokes if you don't) and save. This is required to get Promethease to recognize the file.
* I advise against downloading the genome.txt.gz file directly because for some reason SpiderOak has Content-encoding: gzip in their HTTP response header, which means that browsers will transparently uncompress that file. This makes me uneasy because there is no checksum provided for the (somewhat large) plain text file, so we have little protection against corruption and truncation. In contrast, by using 'Download All Files' to download everything in a zip, the data's integrity will be automatically verified against CRC-32 checksums when we unzip and gunzip locally.
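If you want to check a download yourself, something like the following sketch should work. It uses only Python's standard gzip module, which checks the CRC-32 and length stored in the gzip trailer once the stream has been read to the end; the function names are my own.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def looks_gzipped(path: str) -> bool:
    """True if the file starts with the gzip magic bytes (i.e. it is not plain text)."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC

def gzip_is_intact(path: str, chunk_size: int = 1 << 20) -> bool:
    """Decompress the whole file; a truncated or corrupted stream raises an error."""
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass  # reading to the end triggers the trailer CRC-32/length check
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

Calling looks_gzipped("genome.txt.gz") tells you whether the browser already uncompressed the file behind your back, and gzip_is_intact("genome.txt.gz") tells you whether a real gzip file survived the download.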
Thanks for the explanation and tips! I used your procedure and ended up with the same 131MB file. Interestingly I did not need to remove the "--" entries. I have been exchanging email with BGI and they indicated files could have significantly different number of entries (but I am surprised at >3x!). Is there any chance your sequencing had greater than 4x coverage? My VCF file is queued up and should be available in a few months which should help clarify what I am seeing.
I don't know. How do I find out?
I think the VCF would tell you if you had it. Another possibility would be using a lower quality threshold for calling SNPs, but that seems unlikely.
I signed up with 23andMe, a few days before getting that letter from BGI. I'm currently waiting for both results. Can anyone point me to a good resource for studying what the data mean and what I can do with them?
I think Promethease (http://promethease.com) is a good and inexpensive ($5) start. If you have both sets of results I would recommend using 23andMe given my experience with uploading BGI data. Web searching "promethease review" will give some details and alternatives. Hopefully those of us in the BGI study can work out a good way of analyzing that data.
I'm another participant. I'm still waiting for my results, but would be interested in any discussion group for analysis.
What is 'it' here, just your particular raw SNP results and not news about any hits of reaching genome-wide statistical-significance?
Just my particular results.
Apropos of that, that's part of the Chinese effort http://duende.uoregon.edu/~hsu/talks/ggenomics.pdf :
I was struck by Hsu's estimate of how well the selection would work if the optimistic estimates about how many alleles they find works out:
The talk isn't really citable, but there's "BGI Cognitive Genomics Lab: Proposal for Gene-Trait Association Study of g"
There are some delicious discussions in there; for example, on weaknesses in previous research (this sort of discussion is why I tend to skepticism, as in my previous email on genes):
Here's information on Big Five and heredity:
Thank you for this info. I've signed up. I think this flipped my mood from gloomy to happy.
Incidentally, this is the second study I've signed up for via the web. The first is the Good Judgement Project which has been a fun exercise so far.
Is anyone aware of a cheap/free way to check your SAT scores online and provide documentation for this survey? The last time I had need of my SAT scores was nearly a decade ago, and my best memory is "really high and I got scholarships a lot," which I don't think would count officially.
I can view mine on collegeboard.org. I only took the tests a few years ago though, and I don't know how long they keep them.
It is often listed on unofficial transcripts, which are often provided free.
I was able to find this: http://testprep.about.com/od/act/ht/Old_Scores.htm
If there is a cheap/free online way I'd like to know about it, too. I'm pretty sure my verbal score was too low, but I'm not sure how much it missed by, I might have been close and I did take it a year or two early...
Update: Jeff says they only ever sent him part of his data.
I remember when this was announced; I eagerly checked my old SAT & GRE scores and... I was short by a few dozen points. Oh well. Genomics is falling so fast I'll just have it done in 10 years or so for next to nothing.
Hmmm... this seems like a fascinating project. The ego boost I got from qualifying is enough to motivate me to sign up :) .
Deadline: 25 March 2012. For those of us who are fascinating minorities or human specimens but not the sharpest tools in the shed, you may still have a chance at free genotyping via OpenSNP: http://opensnp.wordpress.com/2012/02/20/apply-now-for-a-free-genotyping/
Watching the talk linked below, at about 35:26 he mentions some new results on the comparison of IQ similarity to genetic similarity. Does anyone know where to find these results?
I will be happy to take the free stuff, thanks! But seriously...the GRE is kind of easy.
Easy for you? Do you think your experience was typical?
Easy enough that it can't really distinguish 2 SDs from 3 SDs at the top end.
Though it's possible that it's already an SD above the population mean to begin with since it's only college grads. I don't think these researchers are looking for a very precise cutoff.
Of course I am aware that I did unusually well, and I don't think that everyone can get a perfect score on the quant section, but last I checked, like one in 20 test takers does. To me, that suggests they could usefully make it a lot harder. Maybe I'm underestimating the difficulty of getting a 700 verbal, though. (I didn't think it was particularly more difficult than the SAT critical reading section, and I expect people to have a lot more practice reading dense academic prose at the end of undergrad than they did at the end of high school, but I can believe I'm not great at distinguishing degrees of difficulty in that range.)
I was briefly excited as I met both GRE and SAT cutoffs. But now I'm feeling guilty and debating whether or not to apply; I'm certainly not in the 99.9th percentile. I absolutely love this community but I don't really post because I sincerely feel inadequate.
I'm easily in the 5th percentile, but I feel like an imposter with my standardized test scores: the tests are SO damn easy and don't measure anything of substance. GRE verbal tests your ability to recall obscure words, and the math tests your ability to maintain focus through 2 hours of trivial middle-school math. I didn't study at all.
That's what being intelligent is supposed to feel like!
Guilt is overrated. They say you qualify. Therefore you do. It's their study. In fact, if you do qualify but don't think you should then you are biasing their data against genes for low self esteem.
If you meet their samples then you should go for it. If they've already decided that those are the metrics they want to use then that's what matters for their data gathering purposes.
Also, if you have GRE and SAT scores in that range, while it is possible that they aren't really a reflection of intelligence in the 99.9th percentile (ignoring for now what intelligence means in any useful sense) one is almost certainly well above the 95th percentile.
Please sign up. Do you really believe that your perception of your abilities is more accurate than that of objective tests? And if you believe that you aren't that intelligent, then why do you trust your reasoning and not that of the researchers who designed the study? Do you think they've made some great error in selecting these criteria? Do you think you are smarter than them? ;) Either way, you definitely should sign up.
Oh, and you don't have to be in the 99,9th percentile. The 99,865th percentile totally suffices! ;)
Typical for who? For the general population? For the people on LW? For the people I personally hang out with, or who are attending the same school as me?
I suspect that for the kinds of people who generally hang out on LW... yeah, it's not challenging. I often feel stupid compared to the people here, and I breezed through it without any special preparation. But judging by the amount of GRE prep material out there and the number of people moaning on grad school forums about whether they'll be admitted with their low scores, I guess there are lots of people who find it difficult.
Since risk from individual SNPs 'should' not be aggregated to indicate an individual's risk based on multiple sources of evidence, how are the magnitudes for genosets determined? Can Bayes or another method be used to interpret a Promethease report?
Even genetic epidemiology textbooks seem pessimistic about the usefulness of the genetic research underpinning precision medicine:
The references in question are about the impact of population stratification on genetic association studies. That doesn’t seem to substantiate such a broad stroke about the non-replicability of genetic epidemiology. I don't know what to make of these findings.
Damnit, can't post images in-line with comments. So, here is a link to a screenshot of those references
It surprises me that entrepreneurial machine learning analysts don’t beg for genetic research that identifies how combinatorial patterns of genes characterise individual risk. It seems like if/once they can get hold of that information, the gap from genetic science to consumer-actionable health information is bridged. So where are the 'lean gene learning machine' startups? I certainly don’t have the lean gene to do it myself. I don’t know machine learning.
Regulatory issues seem like the biggest hurdle. To the best of my google-fu, 23andme doesn't even disclose what its 'Established Research' genes are. So, once regulatory hurdles are surmounted, lots of useful research will flood out.
That's a bit misleading given that they don't make the decisions via IQ scores.
It's been a while; any further updates on this project? All the BGI website says is that my sample has been received.
I haven't had my IQ tested in so long. But back in the 1970s, via the Stanford Binet and the Wechsler (WISC) I qualified for the gifted education program (MGM- Mentally gifted minors) with scores of 142 and 144 respectively. I did not take the SAT as my family was poor and it was certain I wouldn't attend college, or at least, not a four year university. As it turned out, I did finish with my BA degree (financial aid and working odd jobs).
Anyway, I have already had my genotype (and other genetic information determined) via saliva sample sent to labs at 23andme.com, and I have the AA genotype. According to articles on 23andme.com, this genotype is prevalent among those with higher I.Q.s . It will be interesting to see if your findings corroborate these findings.
Apparently selection is still ongoing: I got an email today saying they're sending me the kit. What kind of information should I expect when the results come back? I've never been genotyped before, so I don't know if this will be telling me stuff I already know, listing risk factors for diseases, declaring me genetically nonhuman, or what. I'm a step behind the rest of you on what genotyping actually does.
Although I got the email 2/23/13 saying my results were expected in May, I haven't gotten anything yet, and the site says it is still being sequenced. The results will be encoded. You will need a program to decode it. When you log into the site, at the top one of the options is Software. Under that, they list several programs that can be used. If anyone can make sense of it, I'd love to learn how. Just because I was a mathematical idiot savant at one point doesn't mean that part of my brain didn't shrivel up and die once real life happened.
On the website, it eventually updated the due date on sequencing to July. Never got an updated email.
Has anyone gotten results? Is anyone else still waiting?
I didn't become involved through that route, but I was a participant in the Study of Mathematically Precocious Youth. The majority of participants are from SMPY via the samples taken earlier by Dr. Robert Plomin, another researcher. The SMPY research emphasized gender at nearly every step. I haven't heard a word about it with this. For whatever reason, the non-verbal intelligence of males exceeds that of females. That is especially true of spatial intelligence, which seems to be specifically what they (ideally) are looking for.
So, any other gals here participating in this?
If anyone is still following this, if you fell slightly short in one area they still wanted people to apply. Those figures were simply for guaranteed inclusion. I think there was a crappy understanding of Western thinking and a lot of people self-excluded who should not have. It seems BGI only got 500-600 people on its own, the other 1500 were via Plomin. When I first read of this almost 2 yrs ago, there was mention of getting 10k volunteers.
I can't speak to the GRE, but on the SAT it is slightly unusual for people to do extremely well on both the verbal and the quantitative sections. For 2008, a GRE score of 700 was the top 4%. Those top 5% scoring 800 Math would not necessarily be top 4% Verbally, those would be largely English and liberal arts types (like me, 20 yrs after graduating college, nowhere near 800 on Math despite scoring that on the SAT 25 yrs before). Hitting 700 V & 800 M thins the herd, which was already reduced because not everyone goes to grad school, not all schools require tests, some take GMAT, etc.
I am in the same position: results expected in May (2013), status page now says July. I sent them an email in August but had no reply. As a non-American I don't have a GRE or SAT, but when I volunteered I just listed my educational record and career and waited to see what they said. I am wondering what their actual criteria for inclusion are, defined by who they have actually included.
Have any of their participants, on LW or elsewhere, received their data? Or is this whole thing just a ploy to get free DNA samples from smart Westerners in order to, I dunno, craft a deadly virus that only smart Westerners will succumb to? (Joke.)
Someone recently asked BGI Cognitive Genomics and received the following response (NB: most of that linked thread is about commercial sequencing using BGI's lab, not about the BGI Cognitive Genomics study):
Earlier, on August 18, Steve Hsu wrote:
USA resident here, I submitted my sample in April 2013 and have not received data. The status page indicates they are still sequencing my genome. I emailed them twice to inquire on the timeframe for completion to no avail.
I had a reply from them in October 2013, saying:
I believe some other people have received the same. When I log in to their website and look at my status page, it says:
Sadly, since I neither grew up in the USA nor participated in any competition (out of disinterest), I am surely not eligible for this study. But I am very much interested in the findings, since IQ seems to be correlated with the risk of some serious mental disorders.
Edit: I would also be very interested in a study that examines the genetic basis for other personality traits, like conscientiousness or extraversion.
I signed up. I'm interested to see what their sequencing finds, although it seems like we will have to wait a while for that information.