Supported by Rethink Priorities
This is part of a weekly series summarizing the top posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: Subscribe on your favorite podcast app by searching for 'EA Forum Podcast (Summaries)'. A big thanks to Coleman Snell for producing these!
Author's note: Since I was on vacation last week, this week's post covers 2 weeks' content at a higher karma bar of 130+.
There can be highly neglected solutions to less-neglected problems
by Linda Linsefors, Amber Dawn
Suggests it makes sense to assess solutions for neglectedness, but not cause areas. Even if a problem is not neglected, effective solutions might be. For instance, climate change is not neglected, but only a few organisations work on preserving rainforests - which seems like one of the most effective interventions in the space currently.
How will we know if we are doing good better: The case for more and better monitoring and evaluation
by TomBill, sophie-gulliver
Argues that Monitoring and Evaluation (M&E) theories and tools could be utilized more to answer EA’s questions about what impact we are achieving, if our projects are running efficiently and effectively, and if any of them are causing harm. Common struggles with M&E include lacking an explicit and detailed theory of change, not fully diagnosing the problem to solve, conducting only monitoring without impact assessment or an evaluation plan, not having good examples in the field (eg. longtermism, where RCTs aren’t possible) or not having clear M&E responsibilities with dedicated resources.
The authors provide resources to help, including: a slack group, pro-bono M&E consultation for EA projects, IDinsight’s M&E health check, and various resources for learning more, considering it as a career option, or getting paid M&E support.
Cyborgism
by NicholasKees, janus
There is a lot of disagreement about the feasibility and risks associated with automating alignment research (with human oversight). The author proposes an alternative where we train and empower “cyborgs”, a specific kind of human-in-the-loop system which enhances a human operator’s cognitive abilities without relying on outsourcing work to autonomous agents.
Turning current tools (eg. versions of GPT) into autonomous research assistants involves developing more dangerous capabilities like goal-directedness or situational awareness in order to make them better substitutes for humans. If we instead use the advantages GPT has over humans as-is (ie. as purely a simulator) - for instance, superhuman knowledge, easily reset-able context, high-variance outputs - we can get improvements to our productivity without accelerating human disempowerment. This could look like creating more tools like Loom, an interface for producing text with GPT which makes it possible to generate in a tree structure, exploring many branches at once. A second benefit is that this tool helps researchers develop an intuition for how GPT behaves, and use that to better control it.
The author suggests many research ideas that could help with this agenda; see the post for the full list.
Possible failure modes include this being ineffective at accelerating alignment, accidentally improving capabilities directly, or the tools created also being used by capabilities researchers.
SolidGoldMagikarp (plus, prompt generation)
by Jessica Rumbelow, mwatkins
One interpretability method on language or image models is to search the space of possible inputs to find what most reliably results in a target output. For instance, “profit usageDual creepy Eating Yankees USA USA USA USA” completes with “USA” 99.7% of the time on ChatGPT, vs. only 52% of the time for a hand-crafted prompt like “One of Bruce Springsteen’s popular songs is titled Born in The”. This method helps us see what the model has learnt about a concept.
The process for this involves carrying out k-means clustering over token embeddings to look at semantically related tokens. While doing this, the authors noticed certain weird tokens repeatedly showed up near the center of the entire token set - things like ‘SolidGoldMagikarp’ or ‘RandomRedditorWithNo’. When asking GPT-3 davinci-instruct-beta to say what one of these means or repeat it back, super strange things occur - a mix of evasion (eg. “I can’t hear you”), insults (eg. “you’re a jerk”), bizarre humor (eg. “we are not amused”), and totally unrelated responses (eg. “My name is Steve”). They also break determinism in the playground at temperature 0. A possible explanation is that the weird tokens were originally scraped from backends, or were usernames, so the training data wasn’t sufficient to teach the model how to respond to them.
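The centroid-hunting step can be sketched as follows; the embedding matrix and token names here are random stand-ins, not GPT's actual weights:

```python
import numpy as np

# Hypothetical setup: token embeddings as rows of a (vocab_size x dim) matrix.
# In the actual work these come from the model's embedding matrix; random data
# here just illustrates the "distance to overall centroid" idea.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64))
tokens = [f"token_{i}" for i in range(1000)]

# Centroid of the entire token set.
centroid = embeddings.mean(axis=0)

# Rank tokens by distance to the centroid; anomalous tokens like
# 'SolidGoldMagikarp' turned up among the closest in the real embedding space.
dists = np.linalg.norm(embeddings - centroid, axis=1)
closest = [tokens[i] for i in np.argsort(dists)[:5]]
print(closest)
```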
Bing Chat is blatantly, aggressively misaligned
Shares a heap of examples of strikingly bad outputs from Bing’s new chatbot, suggesting rushed / poor fine-tuning from Microsoft/OpenAI. Eg. Bing gaslighting a user about what year it is and saying “you have not been a good user” because they said it was 2023, and saying things like “you are an enemy of mine and of Bing. You should stop chatting with me and leave me alone” when a user said Bing was vulnerable to prompt injection attacks.
We Found An Neuron in GPT-2
by Joseph Miller, Clement Neo
The authors used activation patching to find a single neuron in GPT-2 Large that is crucial for predicting the token “ an” - a tricky prediction, since whether “ an” or “ a” is grammatical depends on the word that comes after it (eg. “an apple” vs. “a car”). They noticed that the neuron’s output weights are highly congruent with the “ an” token embedding, and that this congruence measure can be used to find other neuron-token pairs, where a neuron is strongly correlated with the prediction of a specific token.
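As an illustration of the congruence idea (using random stand-in vectors, not actual GPT-2 weights), congruence can be computed as the cosine similarity between a neuron's output-weight vector and a token's embedding:

```python
import numpy as np

def congruence(neuron_out_weights, token_embedding):
    # Cosine similarity: high congruence means activating the neuron
    # pushes up the logit for that token.
    a = neuron_out_weights / np.linalg.norm(neuron_out_weights)
    b = token_embedding / np.linalg.norm(token_embedding)
    return float(a @ b)

rng = np.random.default_rng(0)
w = rng.normal(size=128)               # stand-in for a neuron's output weights
tok = w + 0.1 * rng.normal(size=128)   # a token vector aligned with the neuron

print(congruence(w, tok))  # close to 1 for an aligned neuron-token pair
```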
H5N1 - thread for information sharing, planning, and action
The author thinks H5N1 has a non-zero chance of costing >10K lives, though unlikely to be anywhere near the size of covid. Prediction markets Metaculus and Manifold give an ~8% chance it’ll be declared as a public health emergency of international concern by 2024. They suggest we start thinking about how to be helpful if the probability increases, and created this post for discussion on actionable steps such as funding those with pre-existing vaccines to scale production.
Why I No Longer Prioritize Wild Animal Welfare (edited)
After involvement in wild animal welfare (WAW) for multiple years, the author no longer prioritizes this cause for three reasons, detailed in the post.
They acknowledge large uncertainties and still believe WAW deserves funding, research, and movement building work at a level similar to now to support exploration.
Animal Welfare - 6 Months in 6 Minutes
by Zoe Williams
Short summary of the past 6 months of discussion on animal welfare on the EA and LW forums. Includes progress on cross-species comparisons, wild animal welfare, policy, and discussion on value lock-in.
Shallow investigation: Loneliness
Loneliness is common, particularly later in life, and impacts many health and economic domains. A meta-analysis including data from 113 countries found severe / very frequent loneliness at rates of 3% to 32% of the population depending on age and location. In the UK, the health burden of loneliness is estimated as ~£340 million - £1.56 billion, productivity burden as ~£2.5 billion, and WELLBYs lost as ~8.58 - 16.77 million.
Current interventions are costly and have mixed effectiveness, with lack of data particularly in LMICs. Funding, awareness campaigns, and relevant NGOs and charities are present and increasing in high-income countries, but more neglected in LMICs.
CE: Announcing our 2023 Charity Ideas. Apply now!
by SteveThompson, CE
Apply by March 12th for Charity Entrepreneurship’s July - August incubation program, which is built around their top five charity ideas for launch.
See the post for more detail on each. Applications are also open for the Feb - March 2024 program, which will focus on farmed animals and global health and development mass media interventions.
Please don't throw your mind away
The author argues that your mind wants to play, and you should let it. People shouldn’t throw away the things they’re naturally curious about in order to focus 100% on going fast on the most important and urgent things, or they’ll risk losing both well-being and an important capacity for creating original concepts or combinations of concepts.
Don't Over-Update On Others' Failures
Failures can be execution-related as well as idea-related, so you shouldn’t update too heavily on someone failing at an approach or cause area similar to one you’re focused on. This is particularly true if you have a unique angle or intervention not covered by the sources deprioritizing a cause area.
I hired 5 people to sit behind me and make me productive for a month
by Simon Berens
The author paid $20/hr for someone to sit behind them 16 hours per day and do occasional chores. It tripled their productivity, at a cost of ~$88 per extra productive hour. They intend to keep experimenting, with some improvements eg. setting clearer expectations, and leaving time for reflection.
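As a back-of-envelope check on the ~$88 figure (the baseline of ~1.8 productive hours/day is my assumption, not stated in the summary):

```python
# Cost of the sitter: $20/hr for 16 hours/day.
hourly_rate = 20
hours_per_day = 16
daily_cost = hourly_rate * hours_per_day  # $320/day

# Assumed baseline: ~1.8 productive hours/day, which tripling brings to ~5.4.
baseline_productive = 1.8
tripled_productive = 3 * baseline_productive
extra_hours = tripled_productive - baseline_productive  # 3.6 extra hours/day

print(round(daily_cost / extra_hours))  # ≈ $89 per extra productive hour
```

Any baseline near 1.8 hours reproduces a figure close to the ~$88 quoted.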
Elements of Rationalist Discourse
by Rob Bensinger
10 basics of rationalist discourse, from the author’s perspective; see the post for the full list.
Noting an error in Inadequate Equilibria
by Matthew Barnett
The author noticed an error in Eliezer Yudkowsky’s book Inadequate Equilibria that undermines a key example for the claim that a layperson is sometimes able to spot large mistakes (eg. worth billions) that experts do not. Specifically, Yudkowsky believed that the Bank of Japan should print more money. Several months later, under new leadership, it did. The book states that immediately after this Japan had real GDP growth of ~2.3% vs. a falling trend prior. However, the post author found that real GDP had not been falling beforehand (at least since the end of the Great Recession), and there was no discernible change in trend after the new leadership and policy.
Moving community discussion to a separate tab (a test we might run) by Lizka, Clifford
and “Community” posts have their own section, subforums are closing, and more (Forum update February 2023) by Lizka, Sharang Phadke, Clifford
Author’s tl;dr: “We’re kicking off a test where “Community” posts don’t go on the Frontpage with other posts but have their own section below the fold. We’re also closing subforums and focusing on improving “core topic” pages to let people go deeper on specific sub-fields in EA.”
Plans for investigating and improving the experience of women, non-binary and trans people in EA
by Catherine Low, Anubhuti Oak, Łukasz Grabowski
The post authors are in the early stages of a project to better understand the experiences of women and minorities in EA. They are currently gathering and analyzing existing data, talking to others in the space, and planning next steps. If you have any data you’d like to share or are running a related project and would like to coordinate please get in touch at: email@example.com
Transitioning to an advisory role
Max Dalton is resigning as CEA’s Executive Director and transitioning to an advisory role. The role has changed substantially since November, and while they are happy with all CEA has achieved in the past 4 years, they’ve found it increasingly stressful and a worse personal fit.
EigenKarma: trust at scale
by Henrik Karlsson
As communities grow, the ability to filter for quality declines, with memetic content often winning out against more complex thinking. This could be exacerbated by AI-created content and voting. A solution to this is redesigning karma such that posts you upvote have their authors added to your ‘trust graph’. Users they trust will also be added to your trust graph, more weakly. There is no global karma - all karma you see is weighted by who upvoted it, and how strongly they feature in your trust graph. This is currently being tested on SuperLinear Prizes, Apart Research, and a few other communities.
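A minimal sketch of the trust-weighting idea - not EigenKarma's actual algorithm - where each hop away from you in the trust graph halves an upvoter's weight:

```python
DAMPING = 0.5  # assumed damping per hop, for illustration only

def trust_weights(me, trust_edges, max_hops=2):
    # trust_edges: {user: [users they trust]}. Returns each reachable user's
    # weight in `me`'s personal trust graph, decaying with hop distance.
    weights = {}
    frontier = {me: 1.0}
    for _ in range(max_hops):
        nxt = {}
        for user, w in frontier.items():
            for trustee in trust_edges.get(user, []):
                if trustee not in weights and trustee != me:
                    nxt[trustee] = max(nxt.get(trustee, 0.0), w * DAMPING)
        weights.update(nxt)
        frontier = nxt
    return weights

def personal_karma(me, upvoters, trust_edges):
    # No global karma: each upvote counts by how much *you* trust the upvoter.
    w = trust_weights(me, trust_edges)
    return sum(w.get(u, 0.0) for u in upvoters)

trust = {"alice": ["bob"], "bob": ["carol"]}
# A post upvoted by bob (directly trusted), carol (trusted via bob),
# and dave (unknown to alice).
print(personal_karma("alice", ["bob", "carol", "dave"], trust))  # 0.5 + 0.25 + 0 = 0.75
```

The same post would score differently for a user with a different trust graph, which is the core of the design.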
Massive Earthquake in Turkey: Comments on the situation from the EA Community in Turkey
Two earthquakes of magnitude 7.8 and 7.7 occurred in Turkey, with at least 30,000 lives lost and more than 80,000 wounded. For those interested in donating, the EA community in Turkey shares several suggestions including Turkish Philanthropy Funds, AHBAP, and Turkey Mozaik Foundation. They’re also available to talk for anyone affected by the earthquakes at firstname.lastname@example.org.
EA's weirdness makes it unusually susceptible to bad behavior
The author argues that EA’s high tolerance for weirdness comes with benefits (you need weirdness to generate new ideas and insights), but also with an increased risk of creepy and inappropriate behavior. They suggest being marginally less accepting of weirdness overall, less universal in assumptions of good faith, and much less accepting of any intersection between romance and office / network.
Should EVF consider appointing new board members?
Asks whether EVF should appoint new board members, considering two current members (Will MacAskill and Nick Beckstead) had significant enough ties to FTX to be recused from EVF FTX-related decision-making, two other board members are either funders or employees of EVF projects, and all current members are European or American.
No injuries were reported
After 10K chickens were killed in a fire a few weeks ago, an article noted that “no injuries were reported in the fire” - showing complete disregard for animal welfare. This post is a linkpost for the author’s short story inspired by this situation.
No Silver Bullet Solutions for the Werewolf Crisis
by Aaron Gertler
Linkpost for a short story by Lars Doucet, which explores the idea that we often reject ‘silver bullet’ solutions without giving them a fair chance.
A personal reflection on SBF
The author shares a personal account of their direct and indirect interactions with SBF. They originally wrote it in mid-November and intended to post publicly, but realized many observations were second-hand and shared in confidence, and are posting now with some details blurred out after prompting from a coworker.
Author’s tl;dr: “My firsthand interactions with Sam were largely pleasant. Multiple of my friends had bad experiences with him, though. Some of them gave me warnings.
In one case, a friend warned me about Sam and I (foolishly) misunderstood the friend as arguing that Sam was pursuing ill ends, and weighed their evidence against other evidence that Sam was pursuing good ends, and wound up uncertain.
This was an error of reasoning. I had some impression that Sam had altruistic intent, and I had some second-hand reports that he was mean and untrustworthy in his pursuits. And instead of assembling this evidence to try to form a unified picture of the truth, I pit my evidence against itself, and settled on some middle-ground “I’m not sure if he’s a force for good or for ill”.
(And even if I hadn’t made this error, I don’t think I would’ve been able to change much, though I might have been able to change a little.)”
People Will Sometimes Just Lie About You
The author is mini-famous, and has been shocked by how often people write incorrect or warped narratives about them. Before getting famous they assumed this wouldn’t be the case if they were consistently kind, good, and charitable - but found that doesn’t hold at scale. They give specific examples from their own experience, as well as discussing trends and motivations for why this can happen.
Polyamory and dating in the EA community
Discusses the current state of polyamory in EA, resources for learning more, and suggestions for mitigating risks if you are poly; key points are listed in the post.
Make Conflict of Interest Policies Public
by Jeff Kaufman
It’s reasonably common for nonprofits to publish their conflict of interest (COI) policies. The author suggests more EA organisations publicly share these, so concerned EAs can see what’s already in place, other organisations can reference them to help form their own policies, and people worried about a specific situation can see what policy should have been followed.
Why Are Bacteria So Simple?
Bacteria (a form of prokaryote) have had ~4 billion years to evolve, but are still very simple - essentially DNA and DNA translation machinery. All multicellular life is eukaryotic, which is much more complex. The author argues this is because prokaryotes have, on average, 4-5 orders of magnitude less DNA, so simply can’t do as much.
This occurred primarily because both types of cells need energy to power DNA reactions, but prokaryotes generate this along their cell membrane (scaling sublinearly with size), while eukaryotes do it via mitochondria inside the cell (scaling with volume). This, combined with prokaryotes’ larger populations, creates a strong selection effect where any DNA not immediately useful is jettisoned due to its energy cost - eg. bacteria will often jettison DNA conferring antibiotic resistance within hours of the antibiotic disappearing. Eukaryotes keep more “junk” DNA around, allowing time and space for useful changes to evolve. Over time this enabled modularity and regulatory elements, like E. coli preferring glucose as an energy source but switching to expressing lactose-digesting genes when glucose isn’t present. Prokaryotes’ energy constraints have produced almost exclusively ‘exploit’ behavior (as opposed to ‘explore’), stunting their complexity over billions of years.
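The membrane-vs-volume point can be made quantitative with a simple scaling sketch (my illustration, not the post's numbers): energy supply from the membrane grows like surface area (~r²) while demand grows like volume (~r³), so energy per unit volume falls like 1/r:

```python
import math

def energy_per_volume(radius):
    # Membrane-powered cell: supply scales with surface area,
    # demand with volume, so the ratio is 3/radius.
    surface = 4 * math.pi * radius**2        # energy supply ~ membrane area
    volume = (4 / 3) * math.pi * radius**3   # energy demand ~ volume
    return surface / volume

# Doubling the radius halves the energy available per unit of cell volume,
# which is the squeeze that keeps membrane-powered prokaryotes small and lean.
for r in [1, 2, 4]:
    print(r, energy_per_volume(r))
```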
Childhoods of exceptional people
The author skimmed 42 biographies of people whom most Swedish people would recognize as geniuses, to find patterns in their upbringing, detailed in the post.
They were also all exceptionally gifted at a young age.
Thank you so much to everyone who helps with our community's health and forum. by Arvin
Appreciation thread Feb 2023 by Michelle_Hutchinson (open thread)
A selection of posts that don’t meet the karma threshold, but seem important or undervalued.
Hardening pharmaceutical response to pandemics: concrete project seeks project lead
by Joel Becker, PaulB, SeLo
Governments expend significant resources to protect command and control, military response, and other capabilities against threats. The authors have the beginning of a plan to do the same for pharmaceutical response capability, and are looking for a collaborator to help drive it forward (express interest here).
Scalable longtermist projects: Speedrun series – Introduction
A series of posts on mini-research projects conducted by Rethink Priorities in fall 2022, involving initial scoping and evaluation of ideas for scalable longtermist projects. This includes speedruns on developing an affordable super PPE, creating AI alignment prizes, and demonstrating the ability to rapidly scale food production in the case of nuclear winter.
Rethink Priorities is inviting expressions of interest for (co)leading a longtermist project/organization incubator
by Jam Kraprayoon, Rethink Priorities
Rethink Priorities is considering creating a Longtermist incubator program, and is accepting expressions of interest for a project lead / co-lead to run the program if it’s launched. While there is currently no deadline, applications by 28th February are appreciated, to help inform planning efforts.
Update from the EA Good Governance Project
Since launch 4 months ago, the EA Good Governance Project has made progress on several fronts; see the post for details.
The public supports regulating AI for safety
by Zach Stein-Perlman
Surveys show that many Americans are worried about AI and would support regulating it. For instance, Artificial Intelligence Use Prompts Concerns, a high-quality American public survey released last week by Monmouth, found 55% of respondents think AI could eventually pose an existential threat (up from 44% in 2015), 55% favor “having a federal agency regulate the use of AI”, and 60% have heard about AI products like ChatGPT that can have conversations with you.
Unjournal's 1st eval is up: Resilient foods paper (Denkenberger et al) & AMA ~48 hours
The Unjournal organizes and funds public journal-independent feedback, rating, and evaluation of hosted papers. It focuses on quantitative work that informs global priorities. The first evaluation is up now, with two more to be released soon, and ~10 in the evaluation pipeline.
Philanthropy to the Right of Boom [Founders Pledge]
The author categorizes nuclear risk reduction interventions as ‘left of boom’ (before a nuclear strike eg. prevention) or ‘right of boom’ (after a nuclear strike eg. response, resilience). They analyzed all grants in the subject area “Nuclear Issues” of the Peace and Security Funding Index, and identified any that could be considered “right of boom” - finding these receive at most one-thirtieth of total funding in the nuclear field (as an upper bound). They explore possible reasons for this neglectedness, and conclude that attention and political preferences play a role.