2020 Review Article

Vaniver

A common thing in academia is to write ‘review articles’ that attempt to summarize a whole field quickly, allowing researchers to see what’s out there (while referring them to the actual articles for all of the details). This is my attempt to do something similar for the 2020 Review, focusing on posts that had sufficiently many votes (as covering every nominated post would have been a few too many).

I ended up clustering the posts into seven categories: rationality, gears, economics, history, current events, communication, and alignment.

Rationality

The site doesn't have a tagline anymore, but interest in rationality remains Less Wrong's defining feature.

There were a handful of posts on rationality 'directly'. Anna Salamon looked at two sorts of puzzles: reality-masking and reality-revealing, or those which are about controlling yourself (and others) or about understanding non-agentic reality. Listing out examples (both internal and external) helped explain cognitive biases more simply. Kaj Sotala elaborated on the Felt Sense, a core component of Gendlin’s Focusing. CFAR released its participant handbook. Jacob Falkovich wrote about the treacherous path to rationality, focusing on various obstacles in the way of developing more rationality.

Personal productivity is a perennial topic on LW. alkjash identified a belief that ‘pain is the unit of effort’, where caring is measured by suffering, and identified an alternative, superior view. Lynette Bye gave five specific high-variance tips for productivity, and then later argued prioritization is a huge driver of productivity, and explained five ways to prioritize better. AllAmericanBreakfast elaborated on what it means to give something a Good Try. adamShimi wrote about how habits shape identity. Ben Kuhn repeated Byrne Hobart’s claim that focus drives productivity, and argued that attention is your scarcest resource, and then talked about tools for keeping focused. alkjash pointed out some ways success can have downsides, and how to mitigate those downsides.

orthonormal discussed the impact of zero points, and thus the importance of choosing yours. Jacob Falkovich argued against victim mentality.

There was some progress on the project of 'slowly digest some maybe-woo things'. Kaj Sotala gave a non-mystical explanation of “no-self”, detailing some 'early insights' into what it means, as part of his sequence on multiagent models of mind. Ouroboros grappled with Valentine’s Kensho. I wrote a post about how Circling (the social practice) focuses on updating based on experience in a way that makes it deeply empirical.

Gears

John Wentworth wrote a sequence, Gears Which Turn the World, which had six nominated posts. The first post discussed constraints, and how technology primarily acts by changing the constraints on behavior. Later posts then looked at different types of constraints, and examples where that constraint is the tight constraint / scarce resource: coordination, interfaces, and transportation. He argued that money cannot substitute for expertise on how to use money, twice.

Other posts contained thoughts on how to develop better models and gearsy intuitions. While our intuitive sense of dimensionality is low-dimensional space, much of our decision-making and planning happens in high-dimensional space, where we benefit from applying heuristics trained on high-dimensional optimization and geometry. Ideas from statistical mechanics apply in many situations of uncertainty. Oliver Habryka and Eli Tyre described how to Fermi Model. Maxwell Peterson used animations to demonstrate how quickly the central limit theorem applies for some distributions. Mark Xu talked about why the first sample is the most informative when estimating an uncertain quantity.
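To make the central limit theorem point concrete, here is a minimal sketch (not the animations from Maxwell Peterson's post). It assumes an exponential distribution as the skewed starting point and uses sample skewness as a rough stand-in for "how far from normal"; the function names, trial count, and sample sizes are all made up for illustration.

```python
import random
import statistics


def sample_means(draw, n, trials=20000, seed=0):
    """Return `trials` means, each computed from `n` independent draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


def skewness(xs):
    """Population skewness: 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(xs)
    spread = statistics.pstdev(xs)
    return statistics.fmean(((x - mean) / spread) ** 3 for x in xs)


def exponential(rng):
    """One draw from a rate-1 exponential, a strongly right-skewed distribution."""
    return rng.expovariate(1.0)


# Skewness of the distribution of the sample mean, for growing sample sizes.
skew_by_n = {n: skewness(sample_means(exponential, n)) for n in (1, 4, 16, 64)}

for n, skew in skew_by_n.items():
    print(f"n={n:>2}  skewness of sample mean = {skew:.2f}")
```

For this distribution the skewness of the sample mean shrinks in theory like 2/sqrt(n), so quadrupling the sample size roughly halves it. Heavier-tailed distributions would converge more slowly, which is the "for some distributions" caveat.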

Scott Alexander wrote Studies on Slack, which goes through examples to see the impacts of amount of slack, and what dynamics lead to more or less of it.

landfish evaluated the evidence and theories and suggested nuclear war is unlikely to be an x-risk.

John Wentworth summarized Working With Contracts. A babble challenge on generating (within an hour) 50 ways to send something to the moon drew 29 responses. dynomight wondered: what happens if you drink acetone?, an example of boggling at simple things posted to the internet, to paraphrase a comment.

Economics

LessWrong has lots of systems thinkers; economics has been a perennial interest since the Sequences' inception on an economist's blog. Comparative advantage is not about trade, but about production. Talking about comparative advantage also sometimes involves talking about negotiation. Credit-allocation is often imperfect, and so it’s useful to think about incentive design that takes that into account. Paul Christiano thought about moral public goods.
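The production framing of comparative advantage can be shown with a little arithmetic. This is a standard textbook-style sketch rather than anything taken from the post itself; the two producers, their hourly outputs, and the 8-hour day are all hypothetical numbers chosen for illustration.

```python
from fractions import Fraction

HOURS = 8  # hypothetical working hours available to each producer

# Hypothetical output per hour. Alice is better at both goods in absolute terms.
output_per_hour = {
    "alice": {"apples": 6, "bananas": 3},
    "bob": {"apples": 2, "bananas": 2},
}


def opportunity_cost(producer, good, other):
    """Units of `other` given up to make one unit of `good`."""
    rates = output_per_hour[producer]
    return Fraction(rates[other], rates[good])


def comparative_advantage(good, other):
    """The producer who gives up the least of `other` per unit of `good`."""
    return min(output_per_hour, key=lambda p: opportunity_cost(p, good, other))


def totals(hours_on_apples):
    """Total (apples, bananas) when each producer spends the given hours on apples."""
    apples = sum(
        Fraction(h) * output_per_hour[p]["apples"] for p, h in hours_on_apples.items()
    )
    bananas = sum(
        (HOURS - Fraction(h)) * output_per_hour[p]["bananas"]
        for p, h in hours_on_apples.items()
    )
    return apples, bananas


# Everyone splits their time evenly between the two goods.
even_split = totals({"alice": 4, "bob": 4})

# Bob makes only bananas; Alice makes just enough bananas to match the old total.
specialized = totals({"alice": Fraction(20, 3), "bob": 0})

print(comparative_advantage("apples", "bananas"), comparative_advantage("bananas", "apples"))
print(even_split, specialized)
```

With these numbers, reallocating who produces what yields 40 apples instead of 32 with the same 20 bananas, before anyone has exchanged anything. How the extra output gets divided is a separate question.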

Buck talked about six economics misconceptions of his that he recently resolved. Richard Meadows defended the Efficient Market Hypothesis in the wake of COVID, and Wei Dai responded with some specific inefficiencies, and asked for help timing the SPAC bubble. aphyer looked into the limits of PredictIt’s ability to track truth.

philh looked at all possible two-player simultaneous symmetric games in normal form, and classified them. Abram Demski argued that most analysis of game theory misidentifies the relevant games.

History

Things happened in the past; we talk about them sometimes. Often historical examples can help ground our modeling efforts, as when AI Impacts searched for examples of discontinuous progress in history.

Mostly written about by Jason Crawford, Progress Studies seeks to understand what causes progress and thus better understand what interventions would make the world better. It’s grown more nuanced after contact with thinking on x-risk, and seeks a ‘new theory of progress’ that might care more about things like differential tech development. While he started crossposting on LW in 2019, 2020 saw 8 of his posts in the review, of which the best-liked was Industrial literacy, which argued that understanding the basics of how industrial society works helps reframe the economy as ‘solutions to problems’, which makes the world much more sensible (and perhaps makes people's desired interventions much more sensible).

Gwern wrote about his personal progress (and the progress he saw in the world) from 2010-2019. DARPA built a digital tutor that educated much more effectively than traditional classrooms in 2009.

Not only can we model the past, we can look at people in the past modeling the past (and future). Jan Bloch learned from the Franco-Prussian war that wars were getting much more damaging and less beneficial, and that the old style of warfare was on the way out; he tried to stop WWI and failed. Wei Dai shared his grandparents’ story of Communist China.

Martin Sustrik wrote about the Swiss Political System. Anna Salamon asked where stable, cooperative institutions came from. Zvi wrote about the dynamics and origins of moral mazes. Julia Wise shared notes on “The Anthropology of Childhood”. jefftk wrote about growing independence for his two young children.

Current Events

Things keep happening in the present; we talk about them sometimes too.

Anti-Aging is much further along than it looked in 2015; over a hundred companies are deliberately targeting it and plausibly there will be evidence of therapeutic success in 2025-2030. John Wentworth speculated about aging’s impact on the thymus and what could be done about it.

COVID spread to the world. Practical advice was collected. Smoke was seen. Points of leverage were discussed. The CDC was fact-checked. Authorities and Amateurs were compared. John Wentworth asked how hard it would be to make a COVID vaccine. Zvi analyzed the reaction with the lens of simulacra levels. catherio announced microCOVID.org. Zvi predicted in December that there would be a large wave of infections in March-May, which didn’t come to pass; he detailed in an edit how the data he had at the time led to his prediction.

Biden’s prediction market price was too low, according to deluks917. reallyeli asked if superforecasters are real; David Manheim said yes. niplav investigated the impact of time-until-event on forecasting accuracy, finding that long-run questions are easier to predict than short-run questions, but events are easier to predict closer to their time of resolution.

SuspendedReason interviewed a professional philosopher about LessWrong, highlighting adjacent ideas in contemporary philosophy and particularly friendly corners of that space.

Wei Dai asked if epistemic conditions have always been this bad, and responses were mixed (with "no, it's worse now" seeming to have a bit more weight behind it).

Richard Korzekwa talked about fixing indoor lighting.

Katja Grace posted a photo of an elephant seal.

Communication

Sometimes we talk about talking itself.

Ben Hoffman asked if crimes can be discussed literally, as often straightforward interpretations of behavior rely on ‘attack words’, and thus it is difficult to have clear conversations. Elizabeth considered negative feedback thru the lens of simulacra levels.

Malcolm Ocean crossposted his 2015 writing about Reveal Culture, an amendment to the Tell Culture model from 2014. Ben Kuhn claimed curiosity is a core component of listening well. Buck thought about criticism, noticing that he gets a lot of value from being criticized, and thinking about how to make it happen more. MakoYass outlined a way to build parallel webs of trust.

Zvi described Motive Ambiguity, where one might take destructive actions to reduce ambiguity and so signal one’s preferences or trustworthiness. Raemon talked about practicalities of confidentiality, and applying the fundamental question of rationality to it.

Alignment

Things will happen in the future; we talk about that sometimes.

The AI Alignment field has grown significantly over the years, and much of the discussion about it happens on the Alignment Forum, which automatically crossposts to LessWrong.

Some work collected, reviewed, and categorized previous work. Rohin Shah reviewed work done in 2018-2019. Evan Hubinger overviewed 11 proposals for building safe advanced AI. Andrew Critch laid out some AI research areas and their relevance to existential safety. Richard Ngo published AGI safety from first principles, which grew from a summary of many people's views to his detailed view.

Other work attempted to define relevant concepts. Evan Hubinger clarified his definitions of alignment terminology. Alex Flint attempted to ground optimization with a clear definition and many examples. John Wentworth wrote about abstraction. Alex Flint compared search and design.

Paul Christiano wrote precursors to his current research on Eliciting Latent Knowledge: Inaccessible Information, Learning the prior, and Better priors as a safety problem. Evan Hubinger argued that Zoom In by Chris Olah gives other researchers a foundation to build off of. nostalgebraist built a lens to interpret GPT. Beth Barnes et al summarized Progress on AI Safety via Debate, and then Barnes discussed obfuscated arguments in more detail. Mark Xu suspected that SGD favors deceptively aligned models. Scott Garrabrant introduced Cartesian Frames.

A forecasting thread resulted in a collection of AI timelines. hippke looked at Measuring hardware overhang thru backdating modern solutions to older hardware. Ajeya Cotra released her draft report on timelines to get feedback. Daniel Kokotajlo examined conquistadors as precedents for takeover of human societies (concluding that small advantages can be enough to give a significant edge, especially if you can take advantage of pre-existing schisms in your target society), observed that the visible event when AI takes over is preceded by the point of no return at which its takeover is inevitable, and argued against using GDP as a metric for timelines and takeoff. Lanrian extrapolated GPT-N performance. Stuart Armstrong assessed Kurzweil’s predictions about 2019 (half of them turned out false).

Steven Byrnes outlined his computational framework for the brain, drawing heavily on human neuroscience, inner alignment in the brain, and the specific example of inner alignment in salt-starved rats, where rats are able to identify the situational usefulness of salt in a way current RL algorithms can’t. Alex Zhu investigated cortical uniformity, ultimately thinking it’s plausible.

Chi Nguyen attempted to understand Paul Christiano’s Iterated Amplification. Rafael Harth explained inner alignment like I’m 12.

John Wentworth argued that alignability is a bottleneck to generating economic value for things like GPT-3. He also described what it might look like to get alignment by default. In the Pointers Problem, he argued that human values are a function of humans’ latent variables.

Jan Kulveit noted that there’s a ‘box inversion’, or duality, between the alignment problems as seen by Agent Foundations and Comprehensive AI Services. John Wentworth outlined Demons in Imperfect Search, and then DaemonicSigil built a toy model of it. Diffractor wrote up a sequence on Infra-Bayesianism, with the key post on inframeasure theory also making it into the review. Joar Skalse discussed research on why neural networks generalize, with lots of discussion in the comments. nostalgebraist thought GPT-3 was disappointing and later explained an OpenAI insight about scaling (that data would become the tight constraint instead of compute, moving past GPT-3).

Abram Demski presented two alternative views of ‘utility functions’: the ‘view from nowhere’ defined over the base elements of reductionism, or the ‘view from somewhere’ defined over perceptual events, and favored the second. He then discussed Radical Probabilism, where Richard Jeffrey expands the possible range of updates beyond strict Bayesian updates. In The Bayesian Tyrant, he gave a simple parable of futarchy further developing this view.

Conclusion

A lot happened on LW over the course of the year! The main thing that seemed noteworthy, reading thru the review, was just how much alignment stuff there was. [This could be an artifact of more people interested in alignment voting in the review, but I think this matches my memory of the year.]

One of the things that surprised me was how much continuity there was between posts from 2020 and things that people are writing about now. Part of this is because of COVID, but I think part of it is a sign of research interests maturing: rather than a handful of people chasing fads, the community is now a considerably larger set of people working on steady accumulation in narrower subfields.
