[musing] Actually, another mistake here, which I wish I'd just said in the first comment: I didn't have a strong enough TAP for this. If someone says a negative thing about your org (or something that could be interpreted negatively), you should have a high bar for not taking away data (meaning this more broadly than numbers) that they were using to form that perception, even if you think the data is wrong for reasons they're not tracking. You can try to clarify the misconception (ideally, given time & energy constraints etc.), and you can try harder to avoid putting wrong things out there, but don't just take it away -- it's not...
More thoughts here, but TL;DR: I've decided to revert the dashboard to its original state & have republished the stale data. (Just flagging for readers who wanted to dig into the metrics.)
Hey! Sorry for the silence; I was feeling a bit stressed by this whole thread, so I wanted to step away and think before responding. I've decided to revert the dashboard to its original state & have republished the stale data. I did some quick data checks but prioritized getting this out fast. For transparency: I've also added stronger context warnings, and I took down the form for requesting our raw data as a spreadsheet, but I intend to add it back once we've fixed the data. It's still on our stack to Actually Fix this at some point, but we're still figuring out the timing.
Hey! I just saw your edited text and wanted to jot down a response:
“Edit: I'll be honest, after thinking about it for longer, the only reason I can think of why you would take down the data is because it makes CEA and EA look less on an upwards trajectory. But this seems so crazy. How can I trust data coming out of CEA if you have a policy of retracting data that doesn't align with the story you want to tell about CEA and EA? The whole point of sharing raw data is to allow other people to come to their own conclusions. This really seems like such a dumb move.”
“The data was accurate as far as I can tell until August 2024”
I’ve heard a few reports over the last few weeks that made me unsure whether the pre-Aug data was actually correct. I haven’t had time to dig into this.
In one case (the EA.org data), we have a known problem with the historical data that I haven't had time to fix, which probably means the reported downward trend in views is misleading. Again, I haven't had time to scope the magnitude of this.
I’m going to check internally to see if we can just get this back up in a week or two (it was already high on our stack, so this just moves the timeline up a bit). I will update this thread once I have a plan to share.
I’m probably going to drop responding to “was this a bad call” and prioritize “just get the dashboard back up soon”.
Hi! A quick note: I created the CEA Dashboard, which is the 2nd link you reference. The data there hadn’t been updated since August 2024, and so was quite out of date at the time of your comment. I've now taken the dashboard down, since I think it's overall more confusing than helpful for grokking the state of CEA's work. We still intend to come back and update it within a few months.
Just to be clear on why / what’s going on:
I stopped updating the dashboard in August because I started getting busy with some other projects, and my manager & I decided to deprioritize this. (There are some manual steps needed to keep the data live).
I’ve now seen several people refer to that dashboard as a reference for how CEA is doing in ways I think are pretty misleading.
We (CEA) still intend to come back and fix this, and this is a good nudge to prioritize it.
Thanks, I found this interesting! I remember reading that piece by Froolow but I didn't realize the refactoring was such a big part of it (and that the GiveWell CEA was formatted in such a dense way, wow).
This resonates a lot with my experience auditing sprawling, messy Excel models back in my last job (my god are there so many shitty Excel models in the world writ large).
FWIW if I were building a model this complex, I'd personally pop it into Squiggle / Squigglehub — if only because at that point, properly multiplying probabilities together and keeping track of my confidence interval starts to really matter to me :)
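If you don't have Squiggle handy, here's a minimal Python sketch of the kind of uncertainty propagation I mean (all distributions and numbers below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # Monte Carlo samples

# Each input is a distribution rather than a point estimate (values invented).
people_reached = rng.lognormal(mean=np.log(10_000), sigma=0.5, size=N)
effect_per_person = rng.lognormal(mean=np.log(0.02), sigma=0.8, size=N)
cost = 50_000.0  # assume a fixed cost in dollars

# Multiplying samples element-wise propagates the uncertainty automatically.
impact_per_dollar = people_reached * effect_per_person / cost

lo, mid, hi = np.percentile(impact_per_dollar, [5, 50, 95])
print(f"median {mid:.2e}, 90% interval [{lo:.2e}, {hi:.2e}]")
```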
Epistemic status: Passion project / domain I’m pretty opinionated about, just for fun.
In this post, I walk through some principles I think good spreadsheets abide by, and then in the companion piece, I walk through a whole bunch of tricks I've found valuable.
Illustrated by GPT-4o
Who am I?
I’ve spent a big chunk of my (short) professional career so far getting good at Excel and Google Sheets.[1] As such, I’ve accumulated a bunch of opinions on this topic.
Who should read this?
This is not a guide to learning how to start using spreadsheets at all. I think you will get more out of this post if you use spreadsheets at least somewhat frequently, e.g.
I also spent a cursed day looking into the literature for NONS. I was going to try and brush this up into a post, but I'm probably not going to do that after all. Here are my scrappy notes if anyone cares to read them.
You're citing the same two main studies on Enovid that I found (the Phase 3 Lancet trial, or "Paper 1", and the Phase 2 UK trial, or "Paper 2"), so in case it's helpful, here are my notes under "Some concerns you might have" re: the Lancet paper:
The study was funded and conducted by the drug manufacturer (the first 3 authors of the study all work at the manufacturer).
For example: let’s say you want to know the impact of daily jogs on happiness. You randomly instruct 80 people either to jog daily or simply to continue their regular routine. As a per-protocol analyst, you drop the many treated people who did not go jogging. You keep the whole control group, because it wasn’t as hard for them to follow instructions.
I didn't realize this was a common practice, that does seem pretty bad!
Do you have a sense of how commonplace this is?
What’s depressing is that there is a known fix for this: intent-to-treat analysis. It looks at effects based on the original assignment, regardless of whether someone complied or not.
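To make the bias concrete, here's a toy Python simulation of the jogging example (the effect size and compliance model are invented): per-protocol inflates the estimate because the people who actually jog were happier to begin with, while intent-to-treat stays anchored to the randomization.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000  # per arm; large so the bias stands out from sampling noise

TRUE_EFFECT = 0.5  # jogging truly adds 0.5 happiness points (invented)

base_treat = rng.normal(5, 2, n)    # baseline happiness, assigned-to-jog arm
base_control = rng.normal(5, 2, n)  # baseline happiness, control arm

# Compliance model: happier people are more likely to actually go jogging.
p_comply = 1 / (1 + np.exp(-(base_treat - 5)))
complied = rng.random(n) < p_comply

happy_treat = base_treat + TRUE_EFFECT * complied  # only actual joggers benefit
happy_control = base_control                       # control continues as usual

itt = happy_treat.mean() - happy_control.mean()
per_protocol = happy_treat[complied].mean() - happy_control.mean()

# ITT is diluted toward zero by non-compliance (~0.25 here) but has no
# selection bias; per-protocol comes out far above the true 0.5, because
# compliers started out happier.
print(f"intent-to-treat: {itt:.2f}")
print(f"per-protocol:    {per_protocol:.2f}")
```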
I have built three or four traditional-style lumenators, as described in Eliezer and Raemon’s posts. There’s a significant startup cost — my last one cost $500 for the materials (with $300 of that being the bulbs), and the assembly always takes me several hours and is rife with frustration — but given that they last for years, it’s worth it to me.
Reading this post inspired me to figure out how to set up a lumenator in my room, so thank you for writing it! :)
I just set mine up, and FWIW I got 62,400 lumens for $87 ($3.35/bulb if you buy 26: 2600 lumens, 85 CRI, 5000K, non-dimmable). They're less than half the price of the 83 CRI Great Eagle bulbs you mentioned (currently $6.98/bulb), though you do give up dimmability.
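Sanity-checking the per-bulb math, since lumens per dollar is the figure doing the work here:

```python
lumens_per_bulb = 2600
price_per_bulb = 3.35      # dollars
great_eagle_price = 6.98   # dollars, the 83 CRI bulbs mentioned above

print(f"{lumens_per_bulb / price_per_bulb:.0f} lumens per dollar")           # ~776
print(f"{price_per_bulb / great_eagle_price:.0%} of the Great Eagle price")  # ~48%
```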
June 2023 cheap-ish lumenator DIY instructions (USA)
I set up a lumenator! I liked the products I used, and did ~3 hours of research, so am sharing the set-up here. Here are some other posts about lumenators.
$17 for command hooks (if you get a different brand, check that your hooks can hold the cumulative weight of your bulbs + string)
62,400 lumens total (!!)
Here are the bulbs I like. The 26 listed come out to $3.35 / bulb, 2600 lumens, 85 CRI, 5000k, non-dimmable. This comes out at 776 lumens / $ (!!!) which is kind