Many technical alignment researchers are bad-to-mediocre at writing up their ideas and results in a form intelligible to other people. And even for those who are reasonably good at it, writing up a good intuitive explanation still takes a lot of work, and that work lengthens the turn-time on publishing new results. For instance, a couple months ago I wrote a post which formalized the idea of abstractions as redundant information, and argued that it’s equivalent to abstractions as information relevant at a distance. That post came out about two months after I had the rough math worked out, because it took a lot of work to explain it decently - and I don’t even think the end result was all that good an explanation! And I still don’t have a post which explains well why that result is interesting.
I think there’s a lot of potential space in the field for people who are good at figuring out what other researchers’ math is saying intuitively, and why it’s interesting, and then communicating that clearly - i.e. the skill of distillation. This post will briefly sketch out what two kinds of distillation roles might look like, what skills are needed, and talk about how one might get started in such a role.
Two Distiller Roles
The two types of distiller role I’ll sketch are:
- “Independent” distiller: someone who works independently, understanding work published by other researchers and producing distillations of that work.
- “Adjunct” distiller: someone who works directly with one researcher or a small team, producing regular write-ups of what the person/team is thinking about and why.
These two roles add value in slightly different ways.
An independent distiller’s main value-adds are:
- Explaining the motivation and intended applications
- Coming up with new examples
- Boiling down the “key intuitive story” behind an argument
- Showing how the intuitive story fits into the context of the intended applications
I expect that coming up with novel examples and boiling down the core intuitive story behind a bunch of math are the rate-limiting skills here.
Rob Miles is a good example of an existing independent distiller in the field. He makes YouTube videos intuitively explaining various technical results and arguments. Rob’s work is aimed somewhat more at a popular audience than what I have in mind, but it’s nonetheless been useful for people in the field.
I expect an adjunct distiller’s main value-adds are:
- Writing up explanations, examples, and intuitions, similar to the independent distiller
- Saving time for the technical researcher/team, allowing more specialization
- Providing more external visibility/legibility into the research process and motivation
- Accelerating the research process directly by coming up with good examples and intuitive explanations
I expect finding a researcher/team to work with is the rate-limiting step to this sort of work.
Mark Xu is a good example of an existing adjunct distiller. He’s worked with both Evan Hubinger and Paul Christiano, and has written up decent distillations of some of their thoughts. I believe Mark did this with the aim of later doing technical research himself, rather than mostly being a distiller. That is a pretty good strategy and I expect it to be a common pathway, though naturally I expect people who aim to specialize in distillation long-term will end up better at distillation.
What Kind Of Skills Are Needed?
I expect the key rate-limiting skills are:
- Ability to independently generate intuitive examples when reading mathematical arguments, or having a mathematical discussion
- Ability to extract the core intuitive story from a mathematical argument
- Writing/drawing skills to clearly convey technical intuitions to a wider audience
- Ability to do most of the work of crossing the communication gap yourself - both so that researchers do not need to spend a lot of effort communicating to you, and so that readers do not need to spend a lot of effort understanding you
- For the adjunct role, ability to write decent things quickly and frequently without too much perfectionism
- For the non-adjunct role, ability to do all this relatively independently
How To Get Started
Getting started in an independent distiller role should be pretty straightforward: choose some research, and produce some distillations. It’s inherently a very legible job, so you should pretty quickly have some good example pieces which you could showcase in a grant application (e.g. from the Long Term Future Fund or FTX Future Fund). That said, bear in mind that you may need some practice before you actually start to produce very good distillations.
An adjunct role is more difficult, because you need someone to work with. Obvious advice: just asking people is an underutilized strategy, and works surprisingly well. Be sure to emphasize your intended value-add to the researcher(s). If you want to prove yourself a bit before reaching out, independently distilling some of a researcher’s existing public work is another obvious step. You might also try interviewing a researcher on some part of their work, and then distilling that, in order to get a better feel for what it would be like to work together before actually committing.
What posts of yours do you want distilled?
(Ordered by priority)
More generally, two big categories:
Finally, since it's Thomas Kwa asking this question: no, I am not going to create bounties for distillations on these right now, because I don't want to deal with the overhead. Fortunately, the target audience for distillations of my work is everyone except me, so people other than me are quite well qualified to set up their own distillation bounties.
I want to really signal boost an idea that's come up a bit recently: doing user research on pieces of writing. When writing such distillations, have someone in the target audience read them, and sit down next to that person (or join a video call) to watch them do so. Try to get them to think out loud. Observe what works and what doesn't. I suspect rather strongly that this is low-hanging fruit.
I think I weakly disagree with the implication that “distillation” should be thought of as a different category of activity from “original research”. It is in a superficial sense, but a lot of the underlying activities and skills and motivations overlap. For example, original researchers also have the experience of reading something, feeling confused about it, and then eventually feeling less confused about it. They just might not choose to spend the time writing up how they came to be less confused. Conversely, someone trying to understand something for the purpose of pedagogy may notice a mistake in the original, or that the original is outright wrong, which is original research.
I guess if I were writing something-like-this-post, I would frame it as:
(Maybe other things too.)
For my part I've spent much of the last five months on a #3 project, and I think that was the right call for my particular situation—I suspect that I learned more through writing those things than anyone else will by reading them. I also spent the better part of a month on a #2 project, and also found it a good use of time. And the very first thing I did when I decided to learn about the field was spend a few months creating unoriginal pedagogy. It was a great way to learn. :)
I agree, i.e. I also (fairly weakly) disagree with the value of thinking of 'distilling' as a separate thing. Part of me wants to conjecture that it comes from thinking of alignment work predominantly as mathematics or a hard science in which the standard 'unit' is an original theorem or original result which might be poorly written up but can't really be argued against much. But if we think of the area (I'm thinking predominantly about more conceptual/theoretical alignment) as a 'softer', messier, ongoing discourse full of different arguments from different viewpoints and under different assumptions, with counter-arguments, rejoinders, clarifications, retractions etc. that takes place across blogs, papers, talks, theorems, experiments etc. and that all somehow slowly works to produce progress, then it starts to be less clear what this special activity called 'distilling' really is.
Another relevant point, but one which I won't bother trying to expand on much here, is that a research community assimilating - and then eventually building on - complex ideas can take a really long time.
[At risk of extending into a rant, I also just think the term is a bit off-putting. Sure, I can get the sense of what it means from the word and the way it is used - it's not completely opaque or anything - but I'd not heard it used regularly in this way until I started looking at the alignment forum. What's really so special about alignment that we need to use this word? Do we think we have figured out some new secret activity that is useful for intellectual progress that other fields haven't figured out? Can we not get by using words like "writing" and "teaching" and "explaining"?]
(I might be wrong, but) I think there is a relatively large group of people who want to become AI alignment researchers but just wouldn't be good enough to do very effective alignment research, and I think many of those people might be more effective as distillers. (And I think being a distiller, or a teacher for AI safety, is currently a very neglected occupation.)
Similarly, there may also be people who think they aren't good enough for alignment research, but may be more encouraged to just learn the stuff well and then teach it to others.
I was about to write approximately this, so thank you! To add one point in this direction, I am sceptical about the value of reducing the expectation for researchers to explain what they are doing. My research is in two fields (arithmetic geometry and enumerative geometry). In the first we put a lot of burden on the writer to explain themselves, and in the latter poor and incomplete explanations are standard. This sometimes allows people in the latter field to move faster, but
Looking for "distillers" / happy to pay for this work.
Also distillation seems like a wrong name. What's often needed seems to be more like dilution & blending - I can often describe the core of an idea in a few sentences, but the inferential steps required from the reader are then too large, or rely on knowledge unknown to many readers.
Bit of a pointless gripe (and not too specific to this post), but I wish we could use the word “pedagogy” instead of “distillation”. Not only is “pedagogy” much more understandable to lots of potential readers, but I don't think “distill” is even capturing a helpful mental image. “Distillation” in chemistry is getting rid of all the excess to make something super-concentrated. But that's often the opposite of good pedagogy! Imagine (1) a dense math proof with almost no English words, (2) an explanation of the same proof with lots of examples and diagrams and stories and intuitions, and explaining the same thing multiple times from multiple perspectives, etc. When I imagine “distillation” in the original (chemistry) sense, to me it evokes a mental image much closer to (1) than (2). But (2) is better pedagogy, and (2) is what we actually want in this context.
(For example, Rob Miles videos are not designed to pack in the maximum possible number of concentrated insights per second of video.)
FYI there are three different words I need, and I don't know what the proper... conjugation? is for pedagogy:
The first one could be called "a teacher", but, that has different connotations when you're writing things down or whatnot.
Also "communicator", e.g. "science communicator".
I think distillation can describe the process very well from the right perspective. Imagine an academic paper in physics or math with pages of formulae. Obviously, the average person isn't going to understand them, so a distiller would take out all the jargon and specialized knowledge to leave behind the main ideas.
Funny timing seeing Scott Alexander's Yudkowsky Contra Christiano On AI Takeoff Speeds post so soon after this post. My god is he good. I wonder what the implications of that are.
My current plan is to go through most of the MIRI dialogues and anything else lying around that I think would be of interest to my readers, at some slow rate where I don't scare off people who don't want to read too much AI stuff. If anyone here feels like something else would be a better use of my time, let me know.
Coming up with good examples strikes me as the sort of thing that is a better fit for crowdsourcing than for an individual to try to think up themself.
In practice, I understand there is a bit of a catch-22 though: the time right after a post is published is when such crowdsourcing is most practical, but you'd probably want to have the good examples included in that published post. I'm not sure what to do about this.
One idea is to post something like "I'm working on a post for X and am looking for good examples of Y." But that's hard because it very well might be hard to explain X without having really written up X (depending on the X). A second idea that I like more is publishing rough drafts, or even more early stage writeups, and getting help with examples there. That's an idea I am a fan of anyway though, forgetting about the benefit of it helping with the generation of examples.
Curated. I think this is a message that's well worth getting out there, and a write-up of a message I find myself telling people often. As more people are interested in joining the Alignment field, I think we should establish that this is a way people can start contributing. A suggestion here is that people can further flesh out LessWrong wiki-tag pages on AI (see the concepts page), and I'd be interested in building further framework on LessWrong to enable distillation work.
This sounds like something that could be done as an organization creating a job for it, which could help with mentorship/connections/motivation/job security relative to expecting people to apply to EAIF/LTFF
My organization (Rethink Priorities) is currently hiring for research assistants and research fellows (among other roles) and some of their responsibilities will include distillation.
I'd be excited to see more of this happening.
It reminds me of the recent job posting from Abram, Vanessa and Diffractor, which seems to be for an adjunct distiller role for Infra-Bayesianism, though they use different terms.
A quadratic funding mechanism (similar to Gitcoin) could make sense for putting up distillation bounties. Quadratic funding (QF) lets a grant-maker put up a pool of matching funds while individual researchers specify how valuable each individual bounty would be to them; the matching is then done via the QF rule to optimize for aggregate researcher utility. Speaking for myself, I would contribute to a community fund for further distillations, and I would also be more likely to distill.
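As a rough illustration of the matching rule described above, here is a minimal Python sketch of the standard quadratic funding formula, in which a bounty's ideal match is the square of the sum of the square roots of the individual contributions, minus the contributions themselves. The bounty names, the dollar amounts, and the choice to scale matches down proportionally when the pool runs short are assumptions made for the example, not details of how Gitcoin or any real fund operates.

```python
from math import sqrt


def quadratic_match(contributions, matching_pool):
    """Return the matching funds per bounty under a basic quadratic funding rule.

    contributions: dict mapping a bounty name to a list of individual pledges.
    matching_pool: total matching money the grant-maker has put up.
    """
    ideal_match = {}
    for bounty, pledges in contributions.items():
        # Ideal QF total is (sum of square roots)^2; the match is the part
        # not already covered by the pledges themselves.
        ideal_total = sum(sqrt(pledge) for pledge in pledges) ** 2
        ideal_match[bounty] = ideal_total - sum(pledges)

    total_ideal = sum(ideal_match.values())
    if total_ideal == 0:
        return {bounty: 0.0 for bounty in contributions}

    # Assumption: if the pool cannot cover every ideal match, scale all
    # matches down by the same factor.
    scale = min(1.0, matching_pool / total_ideal)
    return {bounty: match * scale for bounty, match in ideal_match.items()}


# Hypothetical bounties: four researchers pledge $25 each to one,
# while a single researcher pledges $100 to the other.
example_pledges = {
    "bounty_a": [25, 25, 25, 25],
    "bounty_b": [100],
}
example_matches = quadratic_match(example_pledges, matching_pool=150)
```

In this example both bounties receive the same $100 in pledges, but the entire match goes to the one backed by four people, which is the sense in which QF rewards breadth of researcher interest rather than the size of any single pledge.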
I find the level of distillation done by Daniel Filan at AXRP to be great. Short and listenable enough for easy access while detailed enough to let you form opinions/directions for further research.
Some IMO valuable targets for distillation: Infra-Bayesianism, and recent work on why deep nets converge, e.g. 'Gradient Descent Finds Global Minima of Deep Neural Networks', 'Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers', and 'A Theoretical Analysis of Deep Q-Learning'.
One way to define such a bounty would be to define them via alignment forum karma. You could publish a list of papers to be distilled and define how much you would pay for an alignment forum post with 10, 25, and 50 alignment forum karma on the topic.
I consider myself a fairly good distiller, though fairly bad at math (compared to the average person here). So far I’ve mostly used that skill to write biographies on Wikipedia, but it might be possible for me to work on more math-heavy topics if I either have enough time to learn the math in depth or have a strongly collaborative relationship with someone who already does.
Four months in. Is there a list of distillations that one could look at?
Nope. Maybe somebody should make one ;).
Difficult if you don't know who worked on what.
But I offer this as an example: A summary of every "Highlights from the Sequences" post
I currently receive funding from LTF Fund for translating AI-Safety-related texts into Russian.
My partner (she's also my editor) is interested in something very similar to what this post describes. Over the last year she has often said something like "Your rationalist guys are definitely very smart, but they are very bad at speaking human language, so no one understands them".
We actually thought that after I finish my grant work in July (by which point she will have edited everything I translated and become well versed in the topic), she would send her own grant application with this plan: "she writes easy-to-understand texts about AI-Safety, a hired translator translates them into English (I'm good at English-to-Russian translation, but not vice versa), and I check that the translator doesn't distort the meaning."
She has a lot of experience with writing short texts: she has run a social media group with ~20,000 subscribers for 7 years.
Should we really do this, what do you think?
There's some trickiness around vetting translation work, but pairing it with distillation work seems like it'd make it easier to vet.
I do anticipate an issue where attempting to simplify something to be more readable loses important nuance, so I think it'd be good to have another person in the loop.
I'm pretty confused that this is as necessary as it is, particularly with writing that involves a lot of math and math notation. I don't understand how people get the insight and motivation necessary to write that kind of thing without explaining what the point of it is or giving examples of how to apply it as part of their expositions.
Your (johnswentworth's) posts don't seem to suffer nearly as much from this, at least from a quick skim of the ones you said on another thread you'd like distilled, but e.g. Infra-Bayesianism seems maybe important (it's about something I was thinking about anyway), but I've "bounced off" the sequence, as apparently have enough others for there to be a job offering distilling it. The same thing happens sometimes when I'm reading academic papers.
I have nothing against rigor and I actually enjoy math, but when writing about models or systems one wants to apply to something that isn't purely mathematical, these are the basic parts I generally see and how I react to them:
Motivation. This is great! Tell me what problem you're trying to solve and sketch your solution. People tend to do this only at the beginning of a piece, if they do it at all, which in my opinion is a mistake. I want to know why you care about proving that theorem, and if you don't say I can't always guess.
Examples. These are great! If you introduce a construct, show me one. Make it just non-trivial enough that there's a point to using the abstraction at all. Ideally make its components things I can already recognize and understand without the abstraction. Again, there often are some but not enough; any significant point made that isn't illustrated with an example forces me to come up with one on my own or else I just won't understand it.
Text description of math. This varies a lot. If it's showing the correspondence between the real problem being solved and the abstractions and notation used, that's fine. Other statements relating mathematical objects to one another are okay if I already have a good grasp of what those objects are (less so if some or all of them were only introduced in the current piece and are non-obvious in terms of the underlying constructs). Proofs written in text tend to frustrate me. They require extreme amounts of thought per sentence because they're written very elliptically (stating "X has property Z" while assuming I remember "X has property Y" and "property Y implies property Z", often when neither of these latter things is stated in the piece at all), and when I do figure out what they're saying, 90% of it is extremely obvious and uninteresting, but the 10% that makes the proof work isn't highlighted in any way. I largely end up skipping them, and since it seems like they're probably one of the hardest parts to write, that feels like a big waste.
Symbolic math. There's no other way to be so precise so concisely, but I find myself having to refer back to definitions constantly to figure out what they're saying, and long blocks of nothing but symbols are even more impenetrable than prose proofs; I end up skipping such blocks if there's any way I can figure out what's being said without decoding them. In programming we all know single letter variable and function names make code hard to understand (with a few common exceptions like naming a loop index "i"); I think symbolic math would be much easier to read if it looked more like a computer program.
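To make the comparison concrete, here is a small illustrative sketch; the scenario and every name in it are assumptions chosen for the example. Bayes' rule is usually written P(H|E) = P(E|H)P(H) / P(E), and the same computation written in a program-like style might look like this:

```python
def posterior_probability(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule with descriptive names in place of single-letter symbols."""
    # P(E): total probability of seeing the evidence, whether or not
    # the hypothesis holds.
    probability_of_evidence = (
        likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    )
    # P(H|E) = P(E|H) * P(H) / P(E)
    return likelihood_if_true * prior / probability_of_evidence


# Hypothetical numbers: a 1% prior, with evidence that shows up 90% of the
# time when the hypothesis is true and 5% of the time when it is false.
updated_belief = posterior_probability(
    prior=0.01, likelihood_if_true=0.9, likelihood_if_false=0.05
)
```

The named version is longer, but each term says what it is, so a reader does not need to scroll back to a definitions section to decode it.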
Now, I do also see writing that's almost all motivation and examples, and that doesn't seem right either. It doesn't create sufficient technical understanding that I could implement the ideas in a computer program, or adapt them to very different cases than the ones presented. But to summarize, some skills you say are important in distilling are:
And, well, I honestly don't understand how anyone can even think about math, and particularly not how they can come up with useful math, without having examples and an intuitive story in mind already.
Are there examples or best practices you would recommend for this?
We're hoping to alleviate parts of this issue with the use of language models: https://www.lesswrong.com/posts/ebYiodG3MAEqskCDG/a-survey-of-tool-use-and-workflows-in-alignment-research-1
Should the role of a distiller include spotting mistakes? I assume that you'd only want distillers to get to work once you have some confidence that the original claims are correct.
What you're looking for here is someone in a field called technical writing and/or translation analysis. Technical writers and translation analysts can have different specialties. Someone who is mathematics specialized could do the work you describe.
This is a growing and well-paying field for people with excellent writing skills to work in the tech, engineering, medical, etc fields without having much technical training/degrees.
This is extremely articulate and frankly universally applicable. Distillers are needed at any critical point in a business when you’re trying to understand what you have and what to do with it, a.k.a. decision intelligence. Anyone able to effectively and efficiently “distill” the critical, actionable nuggets which drive outcomes is the most valuable resource in a company!
What is the intended difference between a distiller and the field of science communication?
In practice, "science communicators" tend to be popsci-y; there's often a focus on entertainment over epistemics, and they'll often end up misleading or outright wrong as a result (often accidentally). A distiller's job is more centrally about deeply understanding the ideas themselves, and then communicating the core pieces accurately. Their path-to-impact is through providing useful explanations to researchers, not entertaining laypeople.
A minor degree of separation. A good communicator writes for an intended audience.
I want to point out what could be a serious problem for anybody attempting to do "distillation" in a public setting. Although here, "distillation" is specifically couched as a way of explicating mathematics, I believe the concept generalizes to any repackaging of a dry, terse, and abstract set of ideas into more intuitive language.
Let me start by giving a specific example of an unpublished piece of writing I produced as a form of distillation. The Handbook of the Biology of Aging is a great intellectual resource on the subject of geroscience, but it's written in terse, abstract academese. I rewrote chapter 4 in much livelier language, with expanded examples and a slightly reworked structure, and credited the original author with both the ideas and the structure, being clear that I'm not claiming any intellectual novelty in my new version. It was always intended for my blog, not for publication in any peer-reviewed journal. I'm pretty confident that most lay audiences would prefer to absorb the original author's ideas via my version than via the original.
The problem is plagiarism. Plagiarism isn't just about copying words - it's also about copying ideas. Although a careful distiller could avoid the risk of violating formal policies/laws concerning plagiarism by carefully citing their sources and being clear that their work is not attempting to provide any form of intellectual novelty, distillation still poses a potential reputational risk to the distiller.
Here, distillation is highlighted in part as a way for students to build into a career as a researcher. However, even if it's unfair, distillation can look like intellectual laziness - the academic version of an artist copying someone else's work and displaying it in a gallery. Even if the artist cites the original source, perhaps by labeling the image with the tag "a copy of Van Gogh's Starry Night", displaying their copy in public is likely to undermine their reputation for being capable of original artistry and build their reputation as a "mere copier." They might be perceived not only as artistically weak, but as a seedy sort of person who may well be on the road to selling art forgeries. A distiller faces the same reputational risk.
Think of it like the difference between an action that violates the law, and an action that could result in being sued. If a lawyer wants to sue you, then even if you ultimately win the case, you might be tied up in court for years, and suffer massive legal expenses. Smart people don't skirt the edge of being potentially sued, at least not as part of their normal business operations. They steer well clear of this whenever possible. I think that distillation is skirting so close to potential perceptions of plagiarism and intellectual laziness that it creates an analogous risk.
I think this is deeply unfortunate, because distillation has all the benefits described in the OP. A good distillation can make important ideas more accessible, and that might in fact be the bottleneck for creating new intellectual contributions based on those ideas. But unfortunately, academia doesn't really have a culture of considering distillation as a valuable form of scientific outreach. It will tend to see distillation as somewhere between plagiarism and intellectual laziness. Even if a persistent argument with one particular person who might accuse the distiller of these failings manages to convince them to see the value in the work of distillation, there will be another person right behind them to level the same accusation. And then the distiller looks like the sort of person who doesn't have the judgment to know what's going to rile up academics and create a perception of plagiarism - and who wants to work with somebody like that?
There's a difference between a literature review or piece of journalism, which weaves together properly cited ideas from a variety of sources into a fundamentally new structure in a way that everybody can understand is not plagiaristic, and a distillation which, as I understand it, takes the intellectual architecture of a single source and dresses it up in new language. The latter looks a lot like plagiarism to many people.
The difficulty with producing distillation that doesn't create laziness/plagiarism perceptions is unfortunate. But I think it's akin to the unfortunate effect of patent law in slowing innovation. Right now, we prioritize the need of intellectuals to protect their intellectual contributions over the need for writers to supply audiences with more accessible versions of those ideas.
So if I was going to leave potential distillers with a takeaway message, it would be this:
Be EXTREMELY CAREFUL in how you write distillations. If you must write them, consider not publishing them. It's not enough to properly cite your sources. Every time you publish a distillation, you are taking a reputational risk, and in many situations, there isn't any real personal reward to counterbalance it. Even if you have not plagiarized, and even if you are capable of original thought, you might create a perception that you are an intellectually lazy plagiarizer hiding these failings under the term "distillation." This might permanently damage your career prospects in academia. Unless you're very confident that your specific approach to distillation will avoid that reputational risk to yourself, strongly consider keeping your distillation private.
Do you have a particular story that shows the types of negative outcomes that could happen? While it's not impossible for me to imagine an overly sensitive academic getting unreasonably angry or annoyed at a distillation, it hardly seems to me like it would be at all likely. I have fairly high confidence in my understanding of academic mindsets, and a single sentence at the top, "this is a summary of XYZ's work on whatever", with a link would in almost all cases be enough. You could even add in another flattering sentence, "I'm very excited about this work because... I find it super exciting so here's my notes/attempt at understanding it more".
Generally, academics like it when people try to understand their work.
Yes. I posted the description of the aging distillation project I described above on the AskAcademia subreddit, and was met with a firestorm of downvotes and strident claims from multiple respondents that it would be plagiaristic/stealing, and that I was obviously unfit to be a graduate student for even considering it.
One important caveat is that I originally posted that I was going to "publish" this essay, which many respondents seem to have initially taken as meaning "publish in a peer reviewed journal, passing the ideas and structure off as my own." But even after I updated the OP and specifically addressed that point in numerous replies, respondents generally continued to see the idea as a form of intellectual theft that made no useful contribution to the reader.
It's entirely possible that my initial post grabbed the attention of a couple redditors who are a few SDs from the mean in terms of sensitivity to plagiarism concerns, and that they got so fired up about that possibility that they couldn't really make a distinction between the scenario they had imagined I was proposing and what I actually intended to do. But I think the more likely explanation is that a lot of academics would see a thorough rewrite of a specific source in new language as a form of intellectual laziness/theft, even with proper citations, and that people almost never do this for that exact reason. Up close, it might not be plagiarism, but from a distance, it sure looks like it. You have to do a lot of explaining to show why it's maybe not plagiarism. Even if you convince one person, they might even still feel pressured to accuse you of plagiarism, because otherwise it looks like they're being soft on crime. And even if not, they might still think you're a fool for provoking a potentially ugly controversy, and want to distance themselves from you.
There are probably ways to do distillations that avoid this sort of issue, but I think anybody planning to do it ought to have a carefully thought-through plan for how they're going to avoid accusations of plagiarism. Distillation of a single source is an unconventional format. Conventional formats - the book review, the summary, etc - exist because we, as a culture, have carved out a set of generally acceptable ways for people to respond to the works of other authors. Distillations aren't really one of them (correct me if I'm wrong and you can point to sources on things like "how to write a distillation" from the wider world). When people write academic works, they might expect a review, a piece of science journalism, or whatever, but not that some stranger will come along and try to write a "distillation" of their entire paper and publish it online. And they might be pissed off to have their expectations violated.
By analogy, it's a person deciding that since dancing is fun and healthy and they believe in "ask culture," it's OK for them to walk up to strangers at the bus stop and ask them to dance. It's a weird thing to be asked, people will be confused about your motives and get anxious, and you shouldn't be surprised if you quickly develop a reputation as a creep even if you always politely walk away when you get rejected and never ask the same person twice. We do not have a cultural norm of asking for dances at bus stops, and we don't have a cultural norm of writing distillations. So at the very least, you should carefully vet the proposed distillation with the original author and be super clear on why, in each specific case, it's OK for you to be producing one.