I am new to the field of AI safety and am testing my fit as a communicator because, in the current climate, I think it would do a lot of good to bring clarity to the everyday person.

I started off with writing distillations as was recommended and suggested and I wrote one on the first post of "Risks From Learned Optimization" by Evan Hubinger et al. It is filled with analogies and humor, because that's my style of writing.

But I recently took part in the SERI MATS workshops, in the Agent Foundations stream headed by John Wentworth, and realized I had a bunch of questions about writing distillations in general:

  • What's the appropriate length?
  • How much should you cut out / add to the distillation?
  • How do you assess whether your distillation adds anything of value to sites that alignment researchers/laypersons frequent? (Other than the obvious value of perceiving the material through a clearer lens?)

John's opinion (in quick response to my query) was that, while he firmly believed that distillations are valuable and that awareness is key, a growing number of recent distillations just regurgitate what the original article/paper/post said, or cut it down to the bare insights. When I asked further, he said that a good percentage of the time, a distillation should ideally be longer than its source material.


With that in mind, I think dividing distillations into two categories (they can also be split along a second axis, which I will get to) would make them easier to navigate for incoming would-be alignment researchers:

-> Expansive Distillations

     This would be a space for distillations that expand on the source material and clarify it, so that each paragraph of the original can be digested more easily, and that serve as a reference for anyone seeking an explanation.

-> Contractive Distillations

     I'm working on a better name, because this one sounds pretentious to me. These would be distillations that just summarize the key points of the source material for easy perusal. One could cover a larger number of papers this way, and it would help identify the more confusing posts that could use expansive distillations.


I think these distillations could also be classified into two different categories:

-> Explanatory Distillations

     This might be redundant, but this category is for cases where you have no new intuitions to offer but can break down complicated paragraphs in the source material.

-> Intuitive Distillations

     Short stories, analogies, and parallels belong in this category, providing a lovely wrapper around concepts that are harder to wrap one's mind around.


So one could have Contractive Intuitive Distillations for short, high-level intuitions; Expansive Intuitive Distillations for the lower-level intuitions; Contractive Explanatory Distillations, which could honestly just serve as a reference for the key insights (possibly a useless category); or Expansive Explanatory Distillations for detailed explanations using the original terms.

(This could be just three categories, I suppose, but I would like to ruminate on that.)

Honestly, I think what I had in mind is more of a personality test for any distillation based on these four categories, like 20% Intuitive and 60% Expansive, so that people can more easily find what they want to read.

I don't know how to end a LW post so thank you.

2 comments

I started off with writing distillations as was recommended and suggested and I wrote one on the first post of "Risks From Learned Optimization" by Evan Hubinger et al.

FWIW, I consider Risks From Learned Optimization to itself be a distillation - it distills a cluster of mental models which were already used by many people in alignment at the time. I also consider it one of the highest-value-add distillations in alignment to date, and a central example of what a really good distillation can achieve.

That's fascinating to hear as someone new to the landscape. I found the sequence particularly dense, each sentence heavy, when I first read it. Months later, it's much easier on the eyes. Do you think it could use any further help in getting it out? (Also, out of curiosity, do you have any examples of other good existing distillations?)