This is part of a weekly series summarizing the top posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: Subscribe on your favorite podcast app by searching for 'EA Forum Podcast (Summaries)'. A big thanks to Coleman Snell for producing these!
Author's note: because of travel / leave, this post covers the past three weeks of posts. We'll be back to our usual weekly schedule from here.
FLI open letter: Pause giant AI experiments
by Zach Stein-Perlman
Linkpost for this open letter by Future of Life Institute, which calls for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.” It has over 2000 signatures, including Yoshua Bengio, Stuart Russell, Elon Musk, Steve Wozniak, and other well-known academics, entrepreneurs and public figures. It’s been covered by NYT, BBC, and many other media outlets.
Time Ideas also published an article by Eliezer Yudkowsky arguing that the letter’s ask doesn’t go far enough, and that 6 months isn’t enough time to make such systems safe.
by Akash
YouGov America (a reputable pollster) released a survey of 20,810 American adults which found:
There were no meaningful differences by region, gender, or political party.
Announcing Epoch’s dashboard of key trends and figures in Machine Learning
by Jaime Sevilla
Epoch has launched a new dashboard covering key numbers from their research to help understand the present and future of machine learning, e.g. training compute requirements, model size, availability and use of training data, hardware efficiency, algorithmic improvements, and investment in training runs over time.
New blog: Planned Obsolescence
by Ajeya, Kelsey Piper
Kelsey Piper and Ajeya Cotra have launched a new blog on AI futurism and alignment. It’s aiming to clearly communicate thoughts on the biggest challenges in technical work and policy to make AI go well, and is targeted at a broader audience than EA or AI Safety communities.
Hooray for stepping out of the limelight
by So8res
Celebrates that since ~2016, DeepMind has stepped out of the limelight and hyped their developments a lot less than they could have. The author suspects this was a purposeful move to avoid bringing AGI capabilities to the forefront of public attention.
by Zvi
OpenAI has launched the ability for ChatGPT to browse the internet for up-to-date information, run Python in a walled sandbox without internet access, and integrate with third-party apps. For instance, it can integrate with Slack and Zapier to access personal data and put its responses in context. It’s also been trained to know when to reach out to plug-ins like Wolfram Alpha to improve its responses.
If interpretability research goes well, it may get dangerous
by So8res
Interpretability research is important, but the author argues it should be kept private (or to a limited group) if it allows understanding of AIs that could significantly advance capabilities. They don’t think we’re close to that yet but want to highlight the potential trade-off.
AISafety.world is a map of the AIS ecosystem
by Hamish Doodles
AISafety.world is a reasonably comprehensive map of organizations, people, and resources in the AI safety space (including research organizations, blogs / forums, podcasts, YouTube channels, training programs, career support, and funders).
Policy discussions follow strong contextualizing norms
by Richard_Ngo
Claims like "X is worse than Y" can often be interpreted as a general endorsement of causing Y in order to avoid X. This is particularly true in areas with strong ‘contextualizing norms’ (ie. which expect implications of statements to be addressed) like policy.
GPTs are Predictors, not Imitators
by Eliezer Yudkowsky
Accurately predicting something can require more intelligence than imitating or simulating it. For instance, predicting the plaintext that follows a hash in a <hash, plaintext> pair requires cracking the hash algorithm - but generating typical instances of such pairs does not. A lot of what ChatGPT predicts has complex causal chains behind it - for instance, the results of a science paper - and accurately predicting that text requires understanding the science at play. Therefore the task GPTs are being trained on (next-token prediction) is in many ways harder than being an actual human.
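To make the asymmetry concrete, here’s a minimal Python sketch (mine, not from the post): generating a <plaintext, hash> pair in the forward direction is a single library call, while predicting the plaintext from a given hash amounts to inverting the hash, which brute force can only manage for trivially short inputs.

```python
import hashlib
import itertools
import string

# Generating a typical <plaintext, hash> pair is easy: pick any plaintext and hash it.
plaintext = "cat"
digest = hashlib.sha256(plaintext.encode()).hexdigest()

# Predicting the plaintext that follows a given hash is a far harder problem:
# with no shortcut, it means searching the space of possible plaintexts.
def invert_hash(target_digest, max_len=4):
    for length in range(1, max_len + 1):
        for chars in itertools.product(string.ascii_lowercase, repeat=length):
            guess = "".join(chars)
            if hashlib.sha256(guess.encode()).hexdigest() == target_digest:
                return guess
    return None  # search space grows exponentially; infeasible for realistic plaintexts

assert invert_hash(digest) == "cat"  # only feasible here because "cat" is tiny
```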
by Zvi
AutoGPT uses GPT-4 to generate, prioritize, and execute sub-tasks toward a given objective or larger task (a loop sketched below), using plug-ins for internet browsing and other access. It quickly became the #1 repository on GitHub and generated a lot of excitement. So far, though, it hasn’t achieved much - it has a tendency to get distracted, confused, or caught in loops. However, this is only a first version, and it’s likely to get significantly better over time. The author suggests AutoGPT arriving now could be good: we were always going to get agents eventually, and this gives us a chance to face increasingly capable AI agents gradually.
Relatedly, Stanford and Google researchers put 25 ChatGPT characters into a Sims-like game world, with starting personas and relationships, to see how they would interact. The author suggests taking this further by putting an agent into a game world with an economy and the goal of making money (and seeing whether it takes over the game world in the process).
The author also talks about arguments for and against AutoGPT being a ‘true agent’, and predictions on what they expect next.
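For readers unfamiliar with the pattern, here’s a heavily simplified sketch of the kind of loop AutoGPT-style agents run - illustrative only, not AutoGPT’s actual code, with `call_llm` standing in for any chat-model API:

```python
from collections import deque

def call_llm(prompt):
    """Hypothetical stand-in for a chat-model API call; not AutoGPT's actual code."""
    raise NotImplementedError

def run_agent(objective, max_steps=10):
    # Seed a task queue by asking the model to break the objective into sub-tasks.
    tasks = deque(call_llm(f"List sub-tasks to achieve: {objective}").splitlines())
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        # Execute the current sub-task (real agents can browse, run code, etc. here).
        results.append(call_llm(f"Objective: {objective}\nComplete this task: {task}"))
        # Ask the model to generate new sub-tasks and re-prioritize, given results so far.
        updated = call_llm(
            f"Objective: {objective}\nJust completed: {task}\nResult: {results[-1]}\n"
            f"Remaining tasks: {list(tasks)}\nReturn an updated, prioritized task list."
        )
        tasks = deque(line for line in updated.splitlines() if line.strip())
    return results
```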
Critiques of prominent AI safety labs: Redwood Research
by Omega
The first of a series of critiques of AI safety organizations that have received >$10M per year in funding.
Redwood Research launched in 2021 and focuses on technical AI safety (TAIS) alignment research. They have strong connections and a strong reputation within EA, and have received ~$21M in funding, primarily from Open Philanthropy (OP). So far they list six research projects on their website, have run programs (MLAB and REMIX) for junior TAIS researchers, and run the longtermist office space Constellation. They have employed ~6-15 FTE researchers at any given time over the past two years.
The author offers several observations and suggestions:
Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds
by 1a3orn
“Giant inscrutable matrices” are often cited as the reason why deep learning models are hard to understand and control, and some have argued for work on a more easily interpretable paradigm. However, the author argues the current state of affairs is the best we could have expected, because:
They also suggest it’s possible something quite simple will work for scaling interpretability of these matrices, citing research where a team was able to easily identify and patch out ‘cheese-seeking’ behavior from a model trained to seek cheese, using the simplest approach they first came across.
Nobody’s on the ball on AGI alignment
by leopold
The author argues we’re not on track to solve alignment of superhuman AGI systems, but could be with a large-scale effort similar to the first moon landing.
Few people are working on AI alignment: ~300 technical alignment researchers vs. 100,000 ML capabilities researchers. Of those, many aren’t tackling the core difficulties. Reinforcement learning from human feedback relies on human supervision, which isn’t reliable when models become superhuman. Mechanistic interpretability is producing interesting findings but is like “trying to engineer nuclear reactor security by doing fundamental physics research, 2 hours before starting the reactor”. There isn’t a lot else happening in technical AI safety outside of abstract / theoretical work like decision theory and mathematical proofs. However, alignment is increasingly becoming a ‘real science’ where experimentation is possible, meaning it’s possible to make substantial progress with enough money and attention.
U.S. is launching a $5 billion follow-up to Operation Warp Speed
by Juan Cambeiro
Linkpost for this article. Author’s summary: “The Biden administration is launching a $5 billion follow-up to Operation Warp Speed called "Project Next Gen." It has 3 goals, of which the most relevant for future pandemic preparedness is development of pan-coronavirus vaccines. The $5 billion seems to be coming from unspent COVID funds, so no new appropriations are needed.”
Polio Lab Leak Caught with Wastewater Sampling
by Cullen
Linkpost for this article, describing a polio lab leak in the Netherlands caught by wastewater sampling.
How much funding does GiveWell expect to raise through 2025?
by GiveWell
Medians and 90% confidence intervals for expected funds raised in 2023 to 2025:
These are relatively constant because a decrease in expected funding from GiveWell’s biggest funder, Open Philanthropy (OP), is offset by an expected increase from other donors. OP donated ~$350M in 2022 and tentatively plans to give ~$250M in 2023, with possible further decreases, which substantially increases uncertainty in GiveWell’s total funding level. Their strategy will continue to focus on finding outstanding giving opportunities, but they may smooth spending year to year to maintain a stable cost-effectiveness bar of 10x cash transfers, and they plan to increase fundraising efforts (with a goal of $500M raised from non-OP donors by 2025).
Introducing the Maternal Health Initiative
by Ben Williamson, Sarah H
The Maternal Health Initiative was launched in 2022 via the Charity Entrepreneurship incubation programme. It’s conducting an initial pilot in Ghana through local partner organisations, training health providers to offer family planning counseling during the postpartum period - the first training sessions take place in the next month. Their initial estimate is that this could avert one DALY (disability-adjusted life year) per $166, though more will be known after the pilot. Long-term, the plan is to demonstrate efficacy and then shift from direct delivery to supporting the Ghana Health Service in making this a standard long-term practice. You can see more on their website, sign up to their mailing list, or reach out directly.
Lead exposure: a shallow cause exploration
by JoelMcGuire, Samuel Dupret, MichaelPlant, Ryan Dwyer
A 2-week investigation into the impact of lead exposure in childhood on subjective wellbeing in adulthood. Two correlational longitudinal studies in New Zealand and Australia suggest that an additional microgram of lead per deciliter of blood, sustained throughout 10 years of childhood, leads to a loss of 1.5 WELLBYs (or an estimated 3.8 including household spillovers). Back-of-the-envelope calculations suggest lead-reducing interventions could be 1 to 107 times more cost-effective than cash transfers. The authors are unsure whether top organisations working to reduce lead exposure (e.g. Pure Earth, Lead Exposure Elimination Project) have funding gaps.
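To illustrate the shape of such a back-of-the-envelope comparison (a sketch with made-up placeholder inputs, not the authors’ actual model - only the 3.8 WELLBYs figure comes from the report):

```python
# From the report: estimated WELLBYs lost per ug/dL of blood lead sustained
# through childhood, including household spillovers.
WELLBYS_LOST_PER_UG_DL = 3.8

# Hypothetical placeholder inputs - NOT figures from the report.
cost_per_child = 20.0            # USD to reduce one child's blood lead (placeholder)
lead_reduction_ug_dl = 1.0       # ug/dL averted across childhood (placeholder)
wellbys_per_dollar_cash = 0.002  # WELLBYs per USD of direct cash transfers (placeholder)

wellbys_per_dollar_lead = (lead_reduction_ug_dl * WELLBYS_LOST_PER_UG_DL) / cost_per_child
multiple_of_cash = wellbys_per_dollar_lead / wellbys_per_dollar_cash
print(f"Lead intervention ~{multiple_of_cash:.0f}x cash transfers (under these placeholder inputs)")
```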
Announcing a new animal advocacy podcast: How I Learned to Love Shrimp
by James Özden, AmyOdene
“How I Learned to Love Shrimp is a podcast about promising ways to help animals and build the animal advocacy movement. We showcase interesting and exciting ideas within animal advocacy and will release bi-weekly, hour-long interviews with people who are working on these projects.”
Healthier Hens Y1.5 Update and scaledown assessment
by lukasj10, Isaac_Esparza
Author’s tl;dr: “Healthier Hens has scaled down due to not being able to secure enough funding to provide a sufficient runway to pilot dietary interventions effectively. We will continue through mini-projects and refining our plan for a feed pilot on the ground until our next organisational assessment at the end of summer 2023. Most efforts will now be spent on reporting, dissemination and fundraising. In this post we share updates, show what went well, less so and what others can learn from our attempts.”
Planned Updates to U.S. Regulatory Analysis Methods are Likely Relevant to EAs
by MHR
The U.S. Office of Management and Budget (OMB) has proposed an update to Circular A-4, which provides guidance to federal agencies regarding methods of regulatory analysis. Relevant updates include:
Public comments can now be submitted here and here until June 6th.
GWWC's 2020–2022 Impact evaluation (executive summary)
by Michael Townsend, Sjir Hoeijmakers, Giving What We Can
Giving What We Can (GWWC) estimates their counterfactual impact on donations in 2020 - 2022, based largely on self-reported data. Key results include:
The billionaires’ philanthropy index
by brb243
A spreadsheet of 2.5K billionaires, containing information on current wealth, donation amounts and cause areas donated to.
Apply to >30 AI safety funders in one application with the Nonlinear Network
by Drew Spartz, Kat Woods, Emerson Spartz
Nonlinear has built a network of >30 (and growing) earn-to-givers who are interested in funding good AI safety-related projects. Apply for funding or sign up as a funder by May 17th. Funders will get access to a database of applications relevant to their specific interests (e.g. interpretability, moonshots, forecasting) and can then get in touch directly or via Nonlinear with those they’d like to fund.
Write more Wikipedia articles on policy-relevant EA concepts
by freedomandutility
Writing Wikipedia articles on important EA concepts has hard-to-estimate but potentially high upside, with little downside risk. The author suggests 23 pages (with more ideas in the comments) that could be created or expanded, including ‘Existential Risk’, ‘Alternative Proteins’, and ‘Political Representation of Future Generations’.
SERI MATS - Summer 2023 Cohort
by Aris Richardson
Applications are open for the Summer 2023 cohort of the SERI ML Alignment Theory Scholars Program, due May 7th. The program aims to help scholars develop as alignment researchers, and will run from ~June to August (including 2 months in person in Berkeley), with an optional extension through to December.
EA is three radical ideas I want to protect
by Peter Wildeford
Argues that EA contains three important ideas that are rarely found elsewhere and are important enough to protect:
by Jeff Kaufman
The author is very happy to see posts about EA orgs that point out errors or ask hard questions. However, they suggest letting orgs review a draft first. This allows the org to prepare a response (potentially including important details not accessible to those outside the organization) and post it as a comment when you publish. Without this, staff may have to choose between scrambling to respond (potentially working out of hours) or delaying their response, risking that many readers never see the follow-up and downgrade their perception of the org without knowing those details.
Things that can make EA a good place for women
by lilly
EA is a subpar place for women in some ways, but the author argues it also does well on many gender issues relative to other communities, including:
by Benjamin_Todd
A long list of updates the author is and isn’t making after reflecting on the events of the past 6 months, including FTX.
Updates include (not a comprehensive list - see post for more):
Updates not made include (again not comprehensive):
Announcing CEA’s Interim Managing Director
by Ben_West
Ben West is the new Interim Managing Director of CEA, after Max Dalton stepped down as Executive Director a few weeks ago. This position will likely last ~6-12 months until a new Executive Director is hired.
They use this post to reflect on some wins in CEA over the past year, with plentiful jokes and memes (click into the post itself for those!):
EA & “The correct response to uncertainty is *not* half-speed”
by Lizka
“When we're unsure about what to do, we sometimes naively take the "average" of the obvious options — despite the fact that a different strategy is often better.”
For example:
Going ‘half-speed’ in this way can make sense if speed itself is part of the problem, if you’re being careful or preserving option value, or if it’s a low-cost way of freeing up capacity to figure out what to do. Otherwise it’s often not the best option.