There isn't enough sharing of positive and negative results within the rationality community. I suspect this results in a fair amount of wasted effort as people explore the same dead ends, and a fair amount of lost potential when more effective tools don't get shared.
So, here are some things Boston has tried (not everything though):
Bureaucracy Day
Everyone shows up for a few hours with the intention of taking care of whatever bureaucratic tasks they've been putting off (doctor's appointments, getting a passport, taking care of personal finance tasks, etc.). Various supplies (printer, staplers, envelopes, etc.) are available.
I record how long each attempted task had been put off for, whether or not it was completed, and (optionally) what the task was. I use "old tasks get accomplished" as a proxy for the impact of the intervention, since I assume that if someone has been putting something off for six months they weren't going to do it in the counterfactual world where Bureaucracy Day doesn't exist.
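For anyone who wants to try the same bookkeeping, here is a minimal sketch of the record-keeping described above. It is not the spreadsheet I actually use: the `TaskRecord` fields, the function names, and the six-month cutoff for "old" are all assumptions made for illustration, and the example records are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRecord:
    """One attempted task (hypothetical schema, not the original log)."""
    months_put_off: float               # how long the task had been put off
    completed: bool                     # whether it got finished that day
    description: Optional[str] = None   # optional: what the task was


def old_tasks_completed(records, min_months=6.0):
    """Completed tasks that had been put off for at least `min_months`.

    The six-month default is an assumed cutoff for "would not have
    happened in the counterfactual world"; adjust to taste.
    """
    return [r for r in records if r.completed and r.months_put_off >= min_months]


def summarize(records, min_months=6.0):
    """Attempt and completion counts, plus the old-task proxy metric."""
    old_done = old_tasks_completed(records, min_months)
    return {
        "attempted": len(records),
        "completed": sum(1 for r in records if r.completed),
        "old_completed": len(old_done),
        "oldest_completed_months": max(
            (r.months_put_off for r in old_done), default=None
        ),
    }


# Made-up records, purely to show the shape of the data.
example = [
    TaskRecord(30, True, "renew passport"),
    TaskRecord(2, True),
    TaskRecord(8, False, "schedule doctor's appointment"),
]
print(summarize(example))
```

Keeping the description optional matters in practice, since people are more willing to log a task if they don't have to say what it was.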
Overall it's been way more successful than I expected. It's not uncommon for tasks that are years old to get finished.
I expect efficacy to drop over time as all the oldest tasks get accomplished, but so far that's been counteracted by new people participating.
Sprint Day
My attempt to use a hackathon mindset for something actually productive. People show up and work for ten hours. No social media, no non-essential conversations, and only one project is allowed. Food is ordered or cooked the night before.
There aren't any common metrics, because I don't want to disrupt workflow by imposing recording procedures. By my own metrics (number of GitHub issues resolved and time spent working), I'm about an order of magnitude more productive on sprint days than on normal Saturdays. I don't have enough data to tell whether I'm just redistributing my productivity to sprint days, but the data so far do not suggest this. Other participants report excellent results (sprint days are at least 90th-percentile productivity for them).
However, participation has been very low (and I suspect this hurts individual efficacy). I'm not sure if this is because people don't want to give up their Saturdays, don't have a project to work on, or something else.
Backthumb
A conversational norm that anyone can say "backthumb" and the conversation will return to the previous topic. We use it to cull tangents. It's useful enough to have reached fixation within the local community, and we use it regularly.
I do not have any data on how focused our conversations are with / without the backthumb norm, but I'm quite confident it's a significant improvement.
Boiling point of nitrogen
Another conversational norm. Anyone can say "boiling point of nitrogen" to indicate that the current disagreement or question can be easily resolved with Google, and then everyone has to shut up and google it. (Not sure if this is original to Boston or if we imported it.)
Works well for culling pointless debates. Adoption has steadily increased.
Order of the Sphex
We made a weekly review worksheet and attempted to iterate on it. Questions were things like: "What goals are you working on?", "What trivial inconveniences are in your way?", "Is there something you need to get off your plate?" (I can share the full list if anyone is interested.)
Was initially successful, but eventually became useless, and attempts to save it failed. There was also a meta failure where we didn't notice how badly it was failing, and so continued spending time on it.
There were many attempts to make use of time worksheets (a la CFAR) in our early days. As far as I can tell these were almost never useful for anyone (although one person reports them being very useful).
Individual members have reported finding group Habiticas useful in the past, but our attempt to create a community-wide Habitica failed. Very few people joined, and of those, very few made use of it.
Group intervention testing
We created a list of interventions (take modafinil, plan your day in the morning, etc.). Once a week an intervention was picked at random, and everyone would try it.
Project died almost immediately, since no one actually implemented the chosen interventions.
Alternatively, people aren't trying new things (I hope not), or they're sharing in places I don't check.
I'm not claiming that all results will generalize, or that it's never worth replicating an idea. But currently those aren't even possibilities (subject to ).