A while ago I wrote about how I managed to add 13 points to my IQ (as measured by the mean of 4 different tests).

I had 3 “self-experimenters” follow my instructions in San Francisco. One of them dropped out since, surprise surprise, the intervention is hard.

The other two had increases of 11 and 10 IQ points respectively (using the “fluid” components of each test), and increases of 9 and 7 respectively if we include verbal IQ.

A total of 7 people acted as controls and were given advantages on the test compared to the intervention group, to exacerbate the effects of memory and motivation; only 1 scored on par with the intervention group. We get a very good p-value, considering the small n, both when comparing the % change in control vs intervention (0.04) and the before/after intervention values (0.006).

 

Working Hypothesis

My working hypothesis for this was simple:

If I can increase blood flow to the brain in a safe way (e.g. via specific exercises, specific supplements, and photostimulation in the NUV (near-ultraviolet) and NIR (near-infrared) ranges)

And I can make people think “out of the box” (e.g. via specific games, specific “supplements”, specific meditations)

And prod people to think about how they can improve in whatever areas they want (e.g. via journaling, talking, and meditating)

Then you get this amazing cocktail of spare cognitive capacity suddenly getting used.

As per the last article, I can’t exactly provide a step-by-step guide for how to do this, given that a lot of it is quite person-specific. I was rather lucky that 2 of my subjects were very athletic and “got it” quite fast in terms of the exercises they had to do.

 

The Rub

At this point, I’m confident all the “common sense” distillation of what people were experimenting with has been done; the remaining issue is that the intervention takes quite a while.

Dedicating 4 hours a day to something for 2 weeks is one thing, but given that we’re engaging in a form of training for the mind, the participants need to be not only present but actively engaged.

A core component of my approach is the idea that people can (often non-conceptually) reason through their shortcomings if given enough spare capacity, and reach a more holistic form of thinking.

I’m hardly the first to propose or observe this, though I’d like to think my approach is better-validated, entirely secular, and faster. Still, the main bottleneck remains convincing people to spend the time on it.

 

What’s next

My goal when I started thinking about this was to prove to myself that the brain and the mind are more malleable than we think; that relatively silly and easy things, to the tune of a few supplements and 3-4 hours of effort a day for 2 weeks, can change things that degrade with aging and are taken as impossible to reverse.

Over the last two months, I became quite convinced there is something here… I don’t quite understand its shape yet, but I want to pursue it.

At present, I am considering putting together a team of specialists (which is to say neuroscientists and “bodyworkers”), refining this intervention with them, and selling it to people as a 2-week retreat.

But there’s also a bunch of cool hardware coming out of doing this, as well as a much better understanding of the way some drugs and supplements work… an understanding I could package together with the insanely long test-and-iterate decision tree for using these substances optimally (more on this soon).


There was some discussion, and interest expressed by the Lighthaven team, in the previous post's comment section about replicating this. Now that I have data from more people, I hope that follows through; it'd be high-quality data from a trustworthy first party, and I'm well aware that at this point this should still trip the "quack" meter for most people.

I'm also independently looking for:

  1. People to help me get better psychometrics. The variance in my dataset is huge, and my tests stop working at 3 STDs of IQ, for the most part. I'd love to have one or two more comprehensive tests that are sensitive to analyses up to 5 STDs (see the sketch after this list for why that ceiling is hard to raise).
  2. People to run independent analyses on the data, in whatever way they see fit. If you are a professor or otherwise system-recognized expert in the area, this would be especially useful. I think the analysis here is quite trivial and "just look at the numbers" is sufficient, but having external validation also helps.
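To illustrate why that ceiling is hard to raise: the rarity of each IQ level grows so fast that a test normed on even a few thousand people has essentially no discriminating data past 3 STDs. A minimal sketch, assuming the standard mean-100 / SD-15 scale:

```python
# Sketch: rarity of IQ levels on the standard mean-100 / SD-15 scale.
# A test normed on a few thousand people simply has no data past ~3 SDs.
from scipy.stats import norm

for z in (2, 3, 4, 5):
    iq = 100 + 15 * z
    rarity = 1 / norm.sf(z)  # roughly 1 person in `rarity` scores above z SDs
    print(f"IQ {iq} (+{z} SD): ~1 in {rarity:,.0f}")
```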

For now, I’m pretty happy to explain to anyone who wants to do this intervention themselves what it involved for me (for free: I want the data). My disclaimers are as follows:

I am not a doctor, and anything I suggest might be unsafe; you do it at your own risk. I guarantee neither the results nor the safety profile of what I did.

I prefer to work with groups of 2 or 3 people.

I can’t be physically present to help you, but we can have a Zoom call every couple of days.

I expect you to bring 3 to 5 controls along for the ride; without them the data is much weaker. The more similar the controls are to you (in terms of environment and genetics), the better.

My current approach involves dedicating at least 3 to 4 hours of your day to this, wholeheartedly: in a way that’s consistent, involved, and enthusiastic.

The specialists you’ll need to hire and the hardware you’ll need to buy might well drive you past the $10k mark (for a group of 3 people) if you do this properly, and you might need a week of scouting to find the right people to work with you.

That being said, since a lot of people were excited to follow through with this last time, I am now putting this offer out there.

 

.

.

.

Confounder elimination

There are a few confounders in a self-experiment like this:

  • You are just taking people who are not supplementing or eating properly and making them use common-sense meals/supplements
  • You are taking people who don’t exercise and making them exercise; because exercise is magic, this will result in a positive change, but it’s boring (because exercise is hard)
  • You are making a tradeoff to increase performance on the IQ test (e.g. giving them caffeine and/or Adderall)
  • You are not taking into account memorization happening on the IQ tests
  • The subjects are “more motivated” to perform when redoing the tests

I have addressed all of these:

  • The subjects kept the same diet and the same supplement stack they used before; I only added 6 things on top. They are both pretty high up the food chain of supplement optimization: one ran 2 healthcare companies and worked with half a dozen more; the other is his partner
  • The subjects are both semi-professional athletes, exercising for >2 hrs a day, able to run marathons and Ironmans
  • The subjects’ HR and BP were monitored and no changes were observed; all supplements were stopped more than 24 hrs before re-taking the IQ tests
  • I had controls, and 2 of my controls took the tests 24 hours apart, to “maximize” memorization effects
  • I had controls that were being paid sums between $40 and $100 (adjusted to be ~2x their hourly pay rate) for every point of IQ gained upon retaking the tests


So how do the numbers look after I control for all this?

Intervention mean increases: (11.2 [9%], 9.6 [8%], 12.6 [10%]) (mean of means: 11.1) - Average increase: 9.3%
Control mean increase: (14.2 [12%], 4.4 [3%], 8.8 [7%], 7.6 [6%], 5.2 [4%], 5.6 [5%], 3.2 [2%]) (mean of means: 7.0) - Average increase: 5.9%

Controlled mean increase: 4.1
    
Related T-test between the before/after means for the intervention: -12.846 (p=0.006)
Related T-test between the before/after means for the control: -5.015 (p=0.002)
Independent T-test between the before/after difference between intervention and control: -2.46 (p=0.04)
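If you want to sanity-check these numbers yourself, here's a minimal sketch using the rounded per-subject increases above. A related t-test on before/after scores is equivalent to a one-sample t-test of the differences against zero; the independent comparison is shown as Welch's unequal-variance test, which is my assumption about the variant used, and the exact values depend on the unrounded per-test data.

```python
# Minimal sketch: re-checking the fluid-IQ statistics from the rounded
# per-subject mean increases listed above. Note that scipy's
# ttest_rel(before, after) flips the sign relative to a one-sample
# test of the increases.
from scipy.stats import ttest_1samp, ttest_ind

fluid_intervention = [11.2, 9.6, 12.6]
fluid_control = [14.2, 4.4, 8.8, 7.6, 5.2, 5.6, 3.2]

print(ttest_1samp(fluid_intervention, 0))  # t ≈ 12.85, p ≈ 0.006
print(ttest_1samp(fluid_control, 0))       # t ≈ 5.02,  p ≈ 0.002

# Welch's unequal-variance test (an assumption about the variant used);
# with the rounded inputs this lands near the reported t = -2.46, p = 0.04.
print(ttest_ind(fluid_intervention, fluid_control, equal_var=False))
```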

I’d say pretty damn nice, given that the controls are going above and beyond, taking the tests under better conditions and with more incentives than the intervention group. I am testing a “worst-case” scenario here, and even in that worst-case scenario about a third of the effect (4.1 of the 11.1 points) holds.


My speculation is that most of the control group's gain is just memorization or incentives. For one, the variance between controls is huge (and the p-values reflect this).

For another, let’s look at verbal IQ:

Intervention mean increases: (0.0 [0%], 5.0 [4%], -16.0 [-14%]) (mean of means: -3.7) - Average increase: -3.4%
Control mean increase: (18.0 [16%], 25.0 [25%], 14.0 [13%], 13.0 [10%], 2.0 [1%], 10.0 [8%], -5.0 [-4%]) (mean of means: 11.0) - Average increase: 10.2%

Controlled mean increase: -14.7
    
Related T-test between the before/after means for the intervention: 0.579 (p=0.621)
Related T-test between the before/after means for the control: -2.92 (p=0.027)
Independent T-test between the before/after difference between intervention and control: 2.032 (p=0.115)
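The same sketch applied to the verbal numbers, with the same caveats about sign convention and the independent-test variant:

```python
from scipy.stats import ttest_1samp, ttest_ind

verbal_intervention = [0.0, 5.0, -16.0]
verbal_control = [18.0, 25.0, 14.0, 13.0, 2.0, 10.0, -5.0]

print(ttest_1samp(verbal_intervention, 0))  # t ≈ -0.58, p ≈ 0.62
print(ttest_1samp(verbal_control, 0))       # t ≈ 2.92,  p ≈ 0.027

# Welch's test on these rounded inputs gives p in the ~0.12 ballpark
# of the reported 0.115; exact values depend on the unrounded data.
print(ttest_ind(verbal_intervention, verbal_control, equal_var=False))
```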

So the fluid component has a +4.1 diff, and the verbal component (which we expect to be stable) has a -14.7 diff. That to me indicates the controls are “trying harder” or “memorizing better” in a way that the intervention group isn’t.


Overall this doesn’t matter: the finding is significant and of an unexpected magnitude either way.

But I do feel like it’s important to stress that I am controlling for the worst-case scenario, and still getting an unambiguously positive result. This approach is not typical in science, where the control and intervention are equally matched, as opposed to the control being optimized to eliminate any and all potential confounders.

Comments

Well, what are your actual steps? Or is this just an advertisement?


This is your second post and you're still being vague about the method. I'm updating strongly towards this being a hoax and I'm surprised people are taking you seriously.

Edit: I'll offer you a 50 USD even money bet that your method won't replicate when tested by a 3rd party with more subjects and a proper control group.

I'm surprised people are taking you seriously.

If you're reading comments under the post, that obviously selects for people who take him seriously, similarly to how if you clicked through a banner ad promising to increase one's penis by X inches, you would mostly find people who took the ad more seriously than you'd expect.


People to help me get better psychometrics. The variance in my dataset is huge, and my tests stop working at 3 STDs of IQ, for the most part. I'd love to have one or two more comprehensive tests that are sensitive to analyses up to 5 STDs

A friend of mine made https://quantified-mind.appspot.com/ for measuring experiments like this (I helped with the website). It sounds like a good fit for what you're doing. You can create an experiment, invite subjects to it, and have them test daily, at the same time of day, for perhaps 5-15 minutes a day, for at least a few weeks. Ideally you cycle the subjects in and out of the experimental condition multiple times, so the controls are the off-protocol subjects, rather than using other people as the controls, because the interpersonal variance is so high.
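A minimal sketch of that cycling analysis (the data layout and column names here are hypothetical): compare each subject's on-protocol vs off-protocol scores, so subjects serve as their own controls.

```python
# Sketch of the on/off cycling analysis: each subject serves as their own
# control. Data layout, column names, and scores are all hypothetical.
import pandas as pd
from scipy.stats import ttest_rel

df = pd.DataFrame({
    "subject":     ["a"] * 6 + ["b"] * 6,
    "on_protocol": [False, True] * 6,
    "score":       [98, 104, 99, 106, 101, 107,   # subject a, made-up scores
                    88, 90, 87, 93, 89, 92],      # subject b, made-up scores
})

# Mean score per subject in each condition, then a paired test across subjects.
means = df.groupby(["subject", "on_protocol"])["score"].mean().unstack()
print(ttest_rel(means[True], means[False]))
```

With practice effects in the mix, a natural next step would be a regression of score on protocol status plus session number, so improvement over time doesn't get credited to the protocol.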

... not taking into account memorization happening on the IQ tests

Practice effects on cognitive testing are high, to the point where gains from practice usually dominate gains from interventions until tens of hours of practice for most tests.  This effect is higher the more complicated the test: practice effects attenuate faster with simple reaction time than with choice reaction time than with Stroop than with matrices than with SATs. This means you typically want to test on low-level basic psychometric tests, have subjects practice all the tests quite a bit before you start measuring costly interventions, and include time or test number as one of the variables you're analyzing.

Apart from practice, the biggest typical confounders are things like caffeine/alcohol, time of day, amount of sleep, and timing of meals, so you'd either want to hold those variables constant or make sure they're measured as part of your experiment.

These are my recollections of what we learned; my friend did most of the actual experiments and knows much more. If you want to go deep on experimental design, I can ask him.

I would love to help your friend set up a backup system at a secondary datacenter if the data is all in one datacenter!

Awesome! It's on an old version of Google App Engine, so not very vulnerable to that form of data loss, but it is very vulnerable to code rot and needs to be migrated. (It was originally running on quantified-mind.com, but he hasn't thought about it in a long time and let the domain expire.)

Is that upgrade process something you could help with? The underlying platform is pretty good, and he put a lot of time into adapting gold-standard psychometric tests in a way that allows for easy, powerful Quantified-Self-style experimentation, but the project doesn't have a maintainer.

Yes, I can help with that. Will DM.

The problem with DIY tests is that they have no external validation -- during my initial experiment I actually had a 5-minute test I did 2x a day (generated so it presented new problems each time) -- but the results from that don't really make sense to anyone but myself, hence why I've chosen to forgo it.

 

In terms of saturating the learning effect, that's a better approach, but getting people to put their time into doing that makes it even harder.

Right, Quantified Mind tests are not normed, so you couldn't say "participants added 10 IQ points" or even "this participant went from 130 to 140".

However, they do have a lot of data from other test-takers, so you can say, "participants increased 0.7 SDs [amidst the population of other QM subjects]" or "this participant went from +2.0 to +2.7 SDs", broken down very specifically by subskill.  You are not going to get any real statistical power using full IQ tests.

In terms of saturating the learning effect, that's a better approach, but getting people to put their time into doing that makes it even harder.

It sounds like the protocols involve hours of daily participant effort over multiple weeks. Compared to that, it seems doable to have them do 5-10 minutes of daily baseline psychometrics (which double as practice) for 2-4 weeks before the experimental protocols begin? This amount of practice washout might not be enough, but if your effects are strong, it might.

In reality, that's table stakes for measuring cognitive effects from anything short of the strongest of interventions (like giving vs. withholding caffeine to someone accustomed to having it). I recall the founder of Soylent approached us at the beginning, wanting to test whether it had cognitive benefits.  When we told him how much testing he would need to have subjects do, he shelved the idea. A QM-like approach reduces the burden of cognitive testing as much as possible, but you can't reduce it further than this, or you can't power your experiments.

On a more positive note, if you have a small number of participants who are willing to cycle your protocols for a long time, you can get a lot of power by comparing the on- and off-protocol time periods. So if this level of testing and implementation of protocols would be too daunting to consider for dozens of participants, but you have four hardcore people who can do it all for half a year, then you can likely get some very solid results.

If I sound skeptical about expected measured effects from cognitive testing due to various interventions, it's because, as I recall, virtually none of the experiments we ran (on ourselves, with academic collaborators from Stanford, with QS volunteers, etc.) ever led to any significant increases. The exceptions were all around removing negative interventions (being tired, not having your normal stimulants, alcohol, etc.); the supposed positives (meditation, nootropics, music, exercise, specific nutrients, etc.) consistently either did roughly nothing or had a surprising negative effect (butter). What this all reinforced:

  • it's easy to fool yourself with self-reports of cognitive performance (unreliable)
  • it's easy to fool yourself with underpowered experiments (especially due to practice effects in longer and more complicated tests)
  • virtually no one does well-powered experiments (because, as above, it's hard)

This gives me a strong prior against most of the "intervention X boosts cognition!" claims. ("How would you know?")

Still, I'm fascinated by this area and would love to see someone do it right and find the right interventions. If you offset different interventions in your protocols, you can even start to measure which pieces of your overall cocktail work, in general and for specific participants, and which can be skipped or are even hurting performance. I have a very old and poorly recorded talk on a lazy way to do this.

One last point: all of this kind of psychometric testing, like IQ tests, only measures subjects' alert, "aroused" performance, which is close to peak performance and is very hard to affect. Even if you're tired and not at your best but just plodding along, when someone puts a cognitive test in front of you, boom, let's go, wake up, it's time: energy levels go up, the test goes well, and then it's back to your slump. Most interventions that might make you generally more alert and significantly increase average, passive performance will end up having a negligible impact on the peak, active performance that the tests are measuring. If I were building more cognitive testing tools these days, I would try to build things that infer mental performance passively, without triggering this testing arousal. Perhaps that is where the real impacts from interventions are plentiful, strong, and useful.

I think it might be easier to improve on high-level IQ tests than low-level ones in a way that's still real and valuable. I am not sure how one would design more practice-resistant high-level tests. It might be too hard.

A question I have for the subjects in the experimental group:

Do they feel any different? Surely being +0.67 std will make someone feel different. Do they feel faster, smoother, or really anything different? Both physically and especially mentally? I'm curious if this is just helping for the IQ test or if they can notice (not rigorously ofc) a difference in their life. Of course, this could be placebo, but it would still be interesting, especially if they work at a cognitively demanding job (like are they doing work faster/better?).

One reported being significantly better at conversation afterwards, the other being able to focus much better.

Here's a market if you want to predict if this will replicate: https://manifold.markets/g_w1/will-george3d6s-increasing-iq-is-tr

Isn't this post describing the replication attempt?

I would say this is not enough data to close the market, I'd need some 3rd party self-experimenters to replicate it.

You should try doing the next version as an adversarial collaboration.

With whom?

Sorry if I've missed something about this elsewhere, but is it possible to explain what it involves to people who aren't going to properly do it?

I don't have 4+ hours a day to spare at the moment, nor $10k, but I'd love to know what the intervention involves so I can adopt as much of it as is feasible (given it sounds like a multi-pronged intervention). Unless there's reason to think it only works as an all-or-nothing? Even just the supplements on their own sound like they might be worth trying, otherwise.

Apologies, I just read your reply to Joseph C.

I would like to request the information, your reservations notwithstanding. I am happy to sign a liability waiver, or anything of that nature that would make you feel comfortable. I am also happy to share as much data as it is feasible to collect, and believe I could recruit at least some controls. As I mention above, I don't think I'll be able to implement the intervention in its entirety, given practical and resource constraints, but given your stated interest in a '1000 ships' approach this seems like it could be a positive for you.

Pinged you in DMs :) Happy to share. I don't need a liability waiver, just people making sure they understand this is not medical advice and I am not a doctor, plus not being assholes.

So the fluid component has a +4.1 diff, and the verbal component (which we expect to be stable) has a -14.7 diff. That to me indicates the controls are “trying harder” or “memorizing better” in a way that the intervention group isn’t.

Perhaps you are getting people to trade memory for reasoning?? There certainly seems to be a tradeoff among my peers between those two traits. This is pure speculation.

Memory tests are included within the FSIQ evaluation.


And I can make people think “out of the box” (e.g. via specific games, specific “supplements”, specific meditations)

And prod people to think about how they can improve in whatever areas they want (e.g. via journaling, talking, and meditating)

 

Ah, these two have made me more concerned about training effects: especially the games, but also the meditations and journaling.

It seems pretty plausible certain games could basically train the same skills as the IQ test.

I mean games as in physical group activities like "playing catch while blindfolded".

As for calling meditation and journaling training, that just seems like motivated reasoning; under that definition, anything is training.

If anything, journaling would lead to better verbal results, and, well, read my analysis.


I mean when I journal I come up with little exercises to improve areas of my life. I imagine that people in your cohort might do similarly, and given that they signed up to improve their IQ, that might include things adjacent to the tasks of the IQ test.

And I don't think general meditation should count as training, but specific meditations could (e.g. if you are training doing mental visualisations and the task involves mental rotations).

I'm not trying to say that there are definitely cross-training effects, just that these seem like the kinds of thing which are somewhat more likely (than, say, supplements) to create fairly narrow improvements close to the test.

That all sounds to me like increasing IQ?

Like, if shape rotation is an underlying component of many valuable cognitive processes (e.g. math) and you get better at it in a generic way (not learning for the test)... that's getting smarter


Yep, the question is definitely about how far it transfers.

Journaling makes you love to think and hate to read. There is a clear read vs. write (i.e. think) tradeoff in thinking styles, IMO. I want to try your course though.

What evidence do you have about how much time it takes per day to maintain the effect after the end of the 2 weeks?

No idea, I would re-do the tests on myself but I was semi-present for the replication so I'd rather wait more time.

All 3 of us might try to re-do the tests in a month and I can get 4-5 controls to re-do them too. Then I'd have numbers 1 month in.

This is also an important question for me.

Nitpick: I think it is good form to give p-values for things you think ARE coincidences as well, so we can compare the "coincidental" and the "causal" p-values or whatever. An example would be the control-before vs control-after p-value.

I agree, that's why I did that :)

Within the article you can find examples of that:

-> Control before vs control after p-values are provided (not looking good; a within-group p-value alone is insufficient, since it can capture learning, hence why I do a between-group % change test)

-> Control before vs after for verbal IQ (significant -- learning effect / motivation / shorter time between tests?)

-> Intervention before vs after for verbal IQ (not significant -- backs up the hypothesis that this works for fluid IQ only, and the control effect is learning plus the advantages in time & motivation)

One day I will learn to Read To The End Before Commenting. You really have done your due diligence.

I want to try! I can't spend money right now but I might be able to get a group to go in. How much do you need the hardware?

Can you send me your email and phone number? 

If you can get a motivated group together I might be able to fund you replicating it as long as you're ok being scrappy because I don't have that much money to throw at this

Actually -- DMed you my Signal, ping there.

Whoa 6% increase for control — did I read that right? Okay I'm 95% sure that's from memorizing the test but what if taking an IQ test turns your brain 5% more on? Only problem is how to control for this...

That's learning effects (: The tests are the same because psychometrics is BS and IQ tests aren't designed to be retaken (even though people, for some reason, make claims about IQ increases/decreases).

Makes sense, thank you for excusing my ignorance / reading comprehension.

You do realise that simply doing the IQ test more than once will result in a higher IQ score? I wouldn't be surprised at all if placebo and muscle memory account for a 10-20 point difference.

Edit: surprised at how much this is getting downvoted when I'm absolutely correct? Even professional IQ testing centres factor in whether someone's taken the test before to account for practice effects. There's a guy (I can't recall his name) who takes an IQ test once a year (might be in the Guinness Book of World Records, not sure) and has gone from 120 to 150 IQ.

Yes, he realizes and it would have been easy for you to know by reading. He had a control group that also took IQ tests to factor out training effects. 

I didn't get that impression at all from '...for every point of IQ gained upon retaking the tests...' but each to their own interpretation, I guess. 

I just don't see the feasibility in accounting for a practice effect when retaking the IQ test is also directly linked to the increased score you're bound to get.

I just don't see the feasibility in accounting for a practice effect when retaking the IQ test is also directly linked to the increased score you're bound to get.


How do you think controlled experiments work?