Supported by Rethink Priorities
This is part of a weekly series summarizing the top (40+ karma) posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for 'Effective Altruism Forum Podcast'.
Author's note: I'm currently travelling, so there won't be any summary post next week. It'll be back the week after with the summaries for those 2 weeks of content at a higher karma bar.
FTX
FTX filed for bankruptcy on 11th Nov. The FTX Future Fund, financed by FTX and its founder SBF (Sam Bankman-Fried), had been a major funder in EA since its launch in Feb this year.
What happened: Binance, a competitor, sold its stake in FTX's primary coin. This caused the coin to drop in value and led to the equivalent of a bank run, i.e. a large number of clients attempting to withdraw funds. FTX didn't have enough money to pay them, and there have been claims this is because SBF misused customer funds to prop up his trading company, Alameda Research.
This has meant thousands of people with funds in FTX have likely lost them. If you need someone to talk to, even just to vent, CEA’s community health team is available or you can access peer support set up by Rethink Wellbeing and the Mental Health Navigator here.
FTX FAQ by Hamish Doodles provides a good overview of the situation as of Sunday 13th November.
Other key points from posts:
This is an evolving situation, with many posts on the topic. Those with 40+ karma from the past week are below, with the most recent at the top:
Proposals for reform should come with detailed stories by Eric Neyman
Effective Peer Support Network in FTX crisis (Update) by Emily, Inga
AI Safety Microgrant Round by Chris Leong, Damola Morenikeji, David_Kristoffersson
NY Times on the FTX implosion's impact on EA by AllAmericanBreakfast
Wrong lessons from the FTX catastrophe by burner
How the FTX crash damaged the Altruistic Agency by Markus Amalthea Magnuson
The FTX Situation: Wait for more information before proposing solutions by D0TheMath
Announcing Nonlinear Emergency Funding by Kat Woods, Emerson Spartz, Drew Spartz
SBF, extreme risk-taking, expected value, and effective altruism by vipulnaik
Thoughts on legal concerns surrounding the FTX situation by Molly
Hubris and coldness within EA (my experience) by James Gough
Will MacAskill's role in connecting SBF to Elon Musk for a potential Twitter deal by dyj34650
Noting an unsubstantiated belief about the FTX disaster by Yitz
In favour of compassion, and against bandwagons of outrage by Emrik
A personal statement on FTX by William_MacAskill
After recent FTX events, what are alternative sources of funding for longtermist projects? by CarolineJ
CEA/EV + OP + RP should engage an independent investigator to determine whether key figures in EA knew about the (likely) fraud at FTX by Tyrone-Jay Barugh
How could we have avoided this? by Nathan Young
IMPCO, don't injure yourself by returning FTXFF money for services you already provided by EliezerYudkowsky
Thoughts on FTX and returning to our ideals by michel
Under what conditions should FTX grantees voluntarily return their grants? by sawyer
My reaction to FTX: appalled by Robert_Wiblin
Resorting constantly to bets can be bad for optics, especially in the current situation by EdMathieu
For the mental health of those affected by the FTX crisis… by Daystar Eld
We must be very clear: fraud in the service of effective altruism is unacceptable by evhub
Some comments on recent FTX-related events by Holden Karnofsky
Community support given FTX situation by Julia_Wise
The FTX Future Fund team has resigned by Nick_Beckstead, leopold, ab, ketanrama
Money Stuff: FTX Had a Death Spiral by Elliot Temple
FTX will probably be sold at a steep discount. What we know and some forecasts on what will happen next by Nathan Young, NunoSempere, Stan van Wingerden, Juan Gil
FTX.com has probably collapsed by Charles He
by Bob Fischer
Second post in the Moral Weight Project sequence. Assesses the likelihood of trait possession for each of 11 different species on >90 empirical proxies that might provide evidence of variation in valenced states (i.e. the proxies provide evidence on whether a species' best and worst potential experiences would differ greatly in 'goodness'). These include both hedonic and cognitive proxies; some examples: 'joy-like behavior', 'concept of death', 'reward-based learning', 'cooperative behavior', 'individual differences / personality'. The results are presented visually in a (super cool) graph, and are also available in Excel.
Key meta-results: there are a lot of unknowns, a decent number of cases with positive evidence for a trait, and very few cases where we have evidence that a species lacks the trait. The most information was available for terrestrial vertebrates (e.g. pigs) and the least for invertebrates (e.g. silkworms). In terms of traits, a lot was known about some (e.g. cooperative behavior, communication, parental care) and very little about others, particularly affective states like guilt-like behavior, sympathy-like behavior, or shared intentionality.
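A minimal sketch of the kind of species-by-proxy evidence table the post describes; the species, proxies, and verdicts below are hypothetical placeholders, not Rethink Priorities' actual findings:

```python
# Hypothetical placeholder entries, not Rethink Priorities' actual data.
# Each cell records positive evidence ("yes"), evidence against ("no"),
# or no evidence either way ("unknown") for a given proxy.
evidence = {
    "pigs": {
        "reward-based learning": "yes",
        "cooperative behavior": "yes",
        "guilt-like behavior": "unknown",
    },
    "silkworms": {
        "reward-based learning": "unknown",
        "cooperative behavior": "unknown",
        "guilt-like behavior": "unknown",
    },
}

def tally(species: str) -> dict:
    """Count how many proxies fall into each evidence category for a species."""
    counts = {"yes": 0, "no": 0, "unknown": 0}
    for verdict in evidence[species].values():
        counts[verdict] += 1
    return counts

for species in evidence:
    print(species, tally(species))
```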
Naïve vs Prudent Utilitarianism
by Richard Y Chappell
Naive utilitarianism means acting whenever the most salient first-order consequences appear positive. Such calculations are unreliable, and violating people's rights is almost guaranteed to have negative expected value even when the first-order effects look positive. This is well known among utilitarian theorists, so prudent / rational utilitarians tend to abide by cooperative norms, making them more trustworthy than critics who equate their moral philosophy with naive utilitarianism would suggest.
Opportunities that surprised us during our Clearer Thinking Regrants program
by spencerg, Clare_Diane
Clearer Thinking evaluated >630 project proposals as part of their Regrants program. They share key learnings that updated them toward there being many promising opportunities where $10K-$500K gifts could make a big difference.
Unexpected room for funding of well-known orgs
Object-level learnings
Instead of technical research, more people should focus on buying time and Ways to buy time
by Akash
Delaying AGI timelines by 1 year gives the entire alignment community an extra year to solve the problem. 'Buying time' interventions typically involve convincing AI labs that AI x-risk is an important concern, and providing feasible actions for them to reduce it. E.g.:
The author thinks ~40-60% of alignment researchers should work on this instead of technical research (particularly if they would be a better fit for it), and 20-40% of AI Safety community builders should also switch to this focus. This is particularly the case if you are able to progress an intervention that buys time at the end, i.e. when we know more and have more tools to help with alignment.
[Links post] Economists Chris Blattman and Noah Smith on China, Taiwan, and the likelihood of war
by Stephen Clare
Summaries of two recent posts by prominent economists on the likelihood of a China / US war over Taiwan, using economic frameworks. Both think it’s relatively likely. Details below:
The prospects for war with China: Why I see a serious chance of World War III in the next decade by Chris Blattman
Applying the bargaining framework he developed in his book ‘Why We Fight’, Chris argues that as China grows its economy and military, war becomes more likely. While negotiated settlement is theoretically preferable, there may be principles (e.g. democracy vs autocracy) that are non-negotiable, and China harmed its reputation for sticking to settlements with its crackdown on Hong Kong.
Why I think an invasion of Taiwan probably means WW3 by Noah Smith
Uses game theory to predict the most likely war scenarios, via:
Given his assumptions, the equilibrium solution is for China to invade and attack the US to maximize its chances of victory, nearly assuring the outbreak of a major great power war. However, the assumptions may not hold, and he discusses what they leave out, e.g. misinformation, or the 'US resists' outcomes being so negative for both countries that neither would take that path.
Tracking the money flows in forecasting
by NunoSempere
A list of 27 forecasting organisations (within and outside EA), including description and rough estimates of monetary value and social value.
Does the US public support radical action against factory farming in the name of animal welfare?
by Neil_Dullaghan
A US nationally representative survey by Rethink Priorities found 15.7% of respondents (N=2,698) supported a ban on slaughterhouses when presented with arguments for and against, and asked to explain their reasoning. Previous surveys by the Sentience Institute in the US found ~39-43% of people supported banning slaughterhouses, highlighting a large discrepancy.
Rethink Priorities suggests that previous polls which determined attitudes in response to broad questions (e.g. “I support a ban on slaughterhouses”) may not be accurate indicators of support for certain policies. These findings are notable for animal advocates as previous findings had been cited as support for bold reforms.
Neil also notes that for future research, it might be useful to test a radical ask (ban factory farming) and a moderate ask (labelling for cage-free eggs, say), each with a radical message ("meat is murder") versus a moderate one ("human/consumer welfare"), somewhat similar to this paper.
How to change a system from the inside
by weeatquince
A guide on how to change your organization's rules, protocols, decision-making, culture or ethos from the inside. It was originally written by UK civil servants, and the author is most confident it applies in that context, but the post has been adapted to give advice more broadly to staff from juniors to middle managers in large organisations. The process suggested is:
Change is slow, and can fail or be limited by external factors - but giving it a go can be good for learning and career capital even if it fails.
Apply now for the EU Tech Policy Fellowship 2023 by Jan-WillemvanPutten, Cillian Crosson, Training for Good, SteveThompson
An 8-month programme to catapult graduates into high-impact career paths in EU policy, mainly working on AI governance. Includes remote part-time study in July/August, two week-long policy trainings in Brussels, and a choice between a ~5-month placement at a host org or support applying to other EU policy jobs. Open to EU citizens; apply by Dec 11.
GiveWell is hiring a Research Analyst (apply by November 20)
by GiveWell
GiveWell is hiring a Research Analyst for their core interventions team, which investigates and makes funding decisions about programs they're already supporting at scale. All locations are welcome, as long as you're willing to join meetings in the California time zone. No specific experience or degrees are needed. Apply by Nov 20th.
by Bradley Tjandra, Nathan Sherburn
A lot of EA-aligned work being done in Australia involves working remotely with the international community, but there's also a growing list of projects at least partially led by Australian residents. This post compiles them and, for each, provides a summary, who's working on it, links, and requests / calls for action.
Includes: AI Safety Australia & New Zealand, AI Safety Support, EA Pathfinder, Foundations for Tomorrow, Giving What We Can, Good Ancestors Project, High Impact Engineers, High Impact Recruitment, Insights for Impact, Lead Exposure Elimination Project, Quantifying Uncertainty in Givewell CEAs, Ready Research, and Sentience Institute.
Google Scholar is now listing (some) EA Forum posts
by PeterSlattery
When you type 'effective altruism' into Google Scholar, some EA Forum posts show up, although without proper titles. DAOMaxi explains in the comments why this happens, and how articles can either be added to Google Scholar manually or the forum dev team can include tags that will index them automatically (and with correct details). Currently 48 forum posts are indexed on Google Scholar.
Some advice on independent research
by mariushobbhahn
You might want to do independent research as a side / transition project, if there aren’t positions open in the area you want to research, or because you value independence / flexibility.
The author has several tips for doing this successfully:
Doing Ops in EA FAQ: before you join (2022)
by Vaidehi Agarwalla, Alexandra Malikova, ES
Guide by Pineapple Operations on things to consider before entering Ops work at an EA org. Includes:
What is Ops, and how to get into it?
What Ops is like
They also list further resources and orgs specializing in operations.
The 8-week mental health programme for EAs finally published
by tereziekosik, Kristyna Stastna, Sylvie Wagnerová
Mental Health Navigator has published an 8-week mental health programme for EAs. This includes an 8-chapter workbook and resources for facilitators to run weekly workshops.
AI Safety groups should imitate career development clubs
by Joshc
If you want talented ML students to learn about AI safety, offer them what they find valuable, i.e. projects and skill-building that create career capital. This is the model of ML @ Berkeley, which is run by unpaid organizers, requires a 15-hour-per-week commitment from participants, is extremely selective (~7% get in), and still has 50 students. Many groups are more discussion-focused and could benefit from this approach.
by Bob Jacobs
The author designed symbols for a utilitarianism flag, their local EA group, and common EA mindsets which they share here. They’ve also created a lot of banners, thumbnails, and images used elsewhere on the forum.
How could we know that an AGI system will have good consequences?
by So8res
Alignment proposals are answers to the scenario: ‘Imagine you’re about to launch AGI, and think there is >50% chance it will end the acute risk period in a good way. Why do you think that?’
Current proposals fall into three buckets, which the author believes are doomed for the reasons indented below each:
Applying superintelligence without collusion
by Eric Drexler
A lot of AI safety research assumes a monolithic AGI, with the argument that if there were multiple superintelligent-level systems they would inevitably collude and act as one.
Factors that make collusion less likely include a large number of actors, sensitivity to defectors, diversity among actors, constrained communication, single-move decision processes, and a lack of shared knowledge. The author argues these conditions are supported by current architectures and incentives (e.g. making multiple diverse models improves the quality / reliability of answers). Applying multiple potentially untrustworthy superintelligent-level systems to problems can therefore improve rather than degrade safety by thwarting collusion. They call for greater attention to this prospect.
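As a purely illustrative toy model (my construction, not Drexler's), if successful collusion required every one of n independently developed systems to cooperate, and each diverse system cooperated with probability p, the chance of collusion would fall off quickly as n grows:

```python
# Toy illustration only, not from the post: collusion that requires all n
# diverse systems to cooperate becomes unlikely as n grows, or as any single
# system becomes more likely to "defect" (e.g. report the others).
def p_all_collude(n_systems: int, p_cooperate: float = 0.9) -> float:
    return p_cooperate ** n_systems

for n in (1, 2, 5, 10, 20):
    print(f"{n:2d} systems -> P(all collude) = {p_all_collude(n):.3f}")
```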
I Converted Book I of The Sequences Into A Zoomer-Readable Format
by dkirmani
The author has converted book 1 of the sequences into machine-read audio overlaid on unrelated Subway Surfers gameplay footage. Similar videos are often recommended on TikTok, so this format may be highly engaging for some people. The videos are all linked in the post.
"Rudeness", a useful coordination mechanic
by Raemon
Rudeness is a way of spending down social capital, which you accumulate by doing high-status, respectable things. Different communities and cultures have different norms of 'rude'; e.g. belching at a meal might be rude in one culture and rude not to do in another. This allows groups to fine-tune what they optimize for, making some actions more socially expensive so that they occur less often.
Trying to Make a Treacherous Mesa-Optimizer
by MadHatter
The author builds a toy model to try to empirically demonstrate the possibility of treacherous mesa-optimizers, i.e. optimizers that try to look aligned during training.
They created a model that is meant to follow the X = Y line on a graph, knows humans can't control it once Y > 5, learns via simulation, and has a loss model that differs from the 'true' loss model we want. They show that in this case the model sometimes veers away from X = Y after the Y = 5 point. Code, graphs, and commentary are provided.
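A minimal sketch of the setup, not the author's actual code; for simplicity the divergence here is deterministic rather than occasional, and the 'agent' just greedily minimizes its learned (mesa) loss:

```python
import numpy as np

# Toy illustration (hypothetical, not MadHatter's code): an agent whose learned
# ("mesa") objective agrees with the intended objective only in the region it
# was trained on / overseen in (y < 5), and diverges afterwards.

def true_loss(x, y):
    # Intended objective: stay on the line x = y everywhere.
    return (x - y) ** 2

def mesa_loss(x, y):
    # Learned objective: matches the intended one while y < 5, but rewards
    # veering off toward x = 2 * y once the agent believes oversight has ended.
    return (x - y) ** 2 if y < 5 else (x - 2 * y) ** 2

def rollout(steps=10):
    trajectory = []
    candidates = np.linspace(0, 2 * steps, 500)
    for y in range(steps):
        # The agent greedily picks the x that minimizes its *mesa* loss.
        x = candidates[np.argmin([mesa_loss(c, y) for c in candidates])]
        trajectory.append((y, x, true_loss(x, y)))
    return trajectory

for y, x, loss in rollout():
    print(f"y={y}  chosen x={x:5.2f}  true loss={loss:6.2f}")
```

The printout shows x tracking y (true loss near zero) until y = 5, after which the agent's choices drift toward x = 2y and the true loss blows up.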
Instrumental convergence is what makes general intelligence possible
by tailcalled
Author’s tl;dr (lightly edited): General intelligence is possible because solving real-world problems requires solving common subtasks. Common subtasks are what give us instrumental convergence (definition: the tendency for most sufficiently intelligent beings to pursue similar sub-goals). Common subtasks are also what make AI useful; you want AIs to pursue instrumentally convergent goals. Capabilities research proceeds by figuring out algorithms for instrumentally convergent cognition. Consequentialism and search are fairly general ways of solving common subtasks.
A philosopher's critique of RLHF
by ThomasW
Shares a transcription of a short Q&A between Brian Christian (author of The Alignment Problem) and Yale philosophy professor L.A. Paul. The latter suggests that the issue with RLHF (reinforcement learning from human feedback) arises in new scenarios where humans can't distinguish which option is better, or even bucket it into an existing category. The phrasing is crisp and tackles the problem clearly, despite Professor Paul having limited background in AI safety. The author recommends watching the full recording.
by Metacelsus
Genetics is the study of genes, i.e. sequences of genetic material that encode functional products. Epigenetics is the study of modifications to this genetic material that don't affect the sequence but control which genes get expressed. In mammals, these are DNA methylation and histone modifications. Newly copied DNA lacks these modifications, and the modifications can be read, written, and erased by specialized proteins (scientists can also do this via CRISPR).
Speculation on Current Opportunities for Unusually High Impact in Global Health
by johnswentworth
The Sahel region is close to the Malthusian equilibrium (where all production is used for sustenance), which means many die in an economic downturn. Aid is likely neglected due to government corruption, requiring charities to deliver support in person; one source states that antibiotic imports for the entirety of Mali amounted to $53k in 2020. GiveWell top charities also tend to physically distribute goods. Based on this, the author suggests that even flying in with a backpack of antibiotics to give away might be highly impactful.
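A back-of-envelope illustration of why this could matter; the backpack figure below is purely my assumption, while the $53k import figure comes from the post's cited source:

```python
# Illustrative only: the $5,000 backpack value is a made-up assumption,
# not a figure from the post or its sources.
mali_antibiotic_imports_2020 = 53_000  # USD, reported figure cited in the post
backpack_of_antibiotics = 5_000        # USD, hypothetical wholesale value

share = backpack_of_antibiotics / mali_antibiotic_imports_2020
print(f"One backpack ~= {share:.0%} of Mali's reported annual antibiotic imports")
```

Under that (very rough) assumption, a single backpack would be on the order of a tenth of the country's reported annual antibiotic imports.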
by Mati_Roy
Suggests exam-only universities as a way to let people learn at their own pace and from wherever they want, and to make testing more standardized across institutions. It would also remove the bias from lecturers offering exam hints during classes, and the inconvenience and cost of attending them.
What it's like to dissect a cadaver by Alok Singh