I finally took the time to read this post, and it's really interesting! Thanks for writing it!
One general comment is that I feel this book (as you summarize it) shows confusion over deconfusion. Notably, all the talk of pure vs applied and conceptual vs practical isn't cutting reality at the joints, and is just the old stereotype of theorists vs experimenters.
Additionally, it occurs to me that maybe "I have information for you" mode is just a cheaper version of the question/problem modes. Sometimes I think of something that might lead to cool new information (either a theory or an experiment), and I'm engaged more so by the potential for novelty than I am by the potential for applications.
I think I'd like to become more problem-driven: to derive possibilities for research from problems, and make sure I'm not just seeking novelty. At the end of the day, I don't think these roles are "equal": I think the problem-driven role is the best one, the one we should aspire to.
I don't necessarily agree with the "cheaper" judgment, except that if you want your research to contribute to a specific problem, instead of only maybe being relevant, the problem-driven role is probably better. Or at least the application-driven roles, which include the problem-driven and the deconfusion-driven.
Also, the trick I've found from reading EA materials, but which isn't really in the air in academia, is that if you want to work on a specific subject/problem but still follow your excitement, it's as simple as looking at many different approaches to the problem and finding ones that excite you. I feel like researchers are so used to protecting their ability to work on whatever they want that they end up believing that whatever looked cool first is necessarily their only interest. A bit like how the idea of passion can fuck up some career thinking.
Isn't it the case that deconfusion/writer role three research can be disseminated to practical-minded (as opposed to theoretical-minded) people, and then those people turn question-answer into problem-solution?
In my experience, practical-minded people without much nerd excitement for theory tend to be bad at deconfusion, because it doesn't interest them. They tend to be problem-solvers first, and you have to convince them that your deconfusion is useful for their problem for them to care, which is fair.
There might be some truth to your point if we tweak it though, because I feel like deconfusion requires a mix of conceptual and practical mindsets: you need to care enough about theory and concept to want to clarify things, but you also need to care enough about applications to clarify and deconfuse with a goal in mind.
The conceptual problem case is where intangibles come into play. The condition in that case is always the simple lack of knowledge or understanding of something. The cost in that case is simple ignorance.
Disagree with the cost part, because often the cost of deconfusion problems is... confusion. That is to say, it's being unable to solve the problem, or present it, or get money for it because nobody understands. There's a big chunk of that which sounds more like problem-related costs than pure ignorance to me.
A helpful exercise: if you find yourself saying "we want to understand x so that we can y", try flipping it to "we can't y if we don't understand x". This sort of shifts the burden onto the reader to provide ways in which we can y without understanding x. You can do this iteratively: come up with _z_s which you can't do without y, and so on.
That's a good intuition pump, but it is often too strong of a condition. Deconfusion of a given idea or concept might be the clearest, most promising or most obvious way of solving the problem, but it's almost never a necessary condition. There is almost no such thing as a necessary condition in the real world, and if you wait for one, you'll never do anything.
I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.
I would guess no, because that's a distinction nobody cares about in the STEM world. The only point is maybe "people should read the original papers instead of just citing them", but that doesn't apply to many things here.
Moreover, what is a primary source in the alignment community? Surely if one is writing about inner alignment, a primary source is the Risks from Learned Optimization paper. But what are Risks' primary, secondary, tertiary sources? Does it matter?
On inner alignment, Risks is a primary source which doesn't really have primary sources. I don't think it necessarily makes sense to talk about primary sources in STEM settings except as the first paper to present an idea/concept/theory. It's not about "the source written during the time it happened" as in history. So to even answer the question of Risks' primary sources, you first need to ask: primary sources about what?
But once again, I don't think there is any value here, except in making people read Risks instead of reverse engineering it from subsequent posts.
supported by CEEALAR
I happened to glance over at the CEEALAR bookshelf one day and saw a book called Craft of Research. Craft of Research is a book by five English professors that tries to provide a field-agnostic guide to, well, doing research. I previously shared my notes piecemeal in my LessWrong shortform, and today I'm collecting them into one top-level post. There will occasionally be reflections on the relevance of this book to the alignment community throughout.
The audience models of research - thoughts on chapter 2
Before considering the role you're creating for your reader, consider the role you're creating for yourself. Your broad options are the following:
1. I have information for you.
2. I can help you solve a practical problem.
3. I can help you understand something better, i.e. answer a question.
The authors recommend assuming one of these three. There is of course a wider gap between information and the neighborhood of problems and questions than there is between problems and questions! Later on in chapter four the authors provide a graph illustrating problems and questions:
Practical problem -> motivates -> Research question -> defines -> Conceptual/research problem -> finds -> Research answer -> helps to solve -> Practical problem
Information, when provided mostly for novelty, however, is not in this cycle. Information can be leveled at problems or questions, and plays a role in providing solutions or answers, but it can also be offered for "its own sake".

I'm reminded of a paper/post I started but never finished, on providing a poset-like structure to capabilities. I thought it would be useful if you could give a precise ordering on a set of agents, in order to assign supervising/overseeing responsibilities. Looking back, providing this poset (sketched below) would effectively just be a cool piece of information: I wasn't motivated by a question or problem so much as "look at what we can do". Yes, I can post-hoc think of a question or a problem that the research would address, but that was not my prevailing seed of a reason for starting the project. Is the role of the researcher primarily a writing thing, though, applying mostly to the final draft? Perhaps it's appropriate for early stages of the research to involve multi-role drifting, even if it's better for the reader experience if you settle on one role in the end.
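To make the poset idea concrete, here is a minimal sketch of the kind of structure I had in mind; the agent names and the `dominates` relation are entirely hypothetical, and none of this comes from the book.

```python
# A minimal sketch (hypothetical, not from the book or the unfinished post)
# of a partial order over agents used to assign oversight.
from itertools import combinations

# dominates[a] = the set of agents that a is at least as capable as
dominates = {
    "A": {"A", "B", "C"},
    "B": {"B"},
    "C": {"C"},
    "D": {"C", "D"},
}

def can_oversee(supervisor: str, subordinate: str) -> bool:
    """Assign oversight only along the partial order: the supervisor must be
    at least as capable as the agent it oversees."""
    return subordinate in dominates[supervisor]

# Because the order is only partial, some pairs are incomparable, and no
# oversight assignment is licensed in either direction.
incomparable = [
    (a, b)
    for a, b in combinations(dominates, 2)
    if not can_oversee(a, b) and not can_oversee(b, a)
]
print(incomparable)  # [('A', 'D'), ('B', 'C'), ('B', 'D')]
```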
Additionally, it occurs to me that maybe "I have information for you" mode is just a cheaper version of the question/problem modes. Sometimes I think of something that might lead to cool new information (either a theory or an experiment), and I'm engaged more so by the potential for novelty than I am by the potential for applications.
I think I'd like to become more problem-driven: to derive possibilities for research from problems, and make sure I'm not just seeking novelty. At the end of the day, I don't think these roles are "equal": I think the problem-driven role is the best one, the one we should aspire to.
The three reader roles complementing the three writer roles are:
1. Entertain me (give me interesting new information).
2. Help me solve my practical problem.
3. Help me understand something better.
It's basically stated that your choice of writer role implies a particular reader role, 1 mapping to 1, 2 mapping to 2, and 3 mapping to 3.
Role 1 speaks to an important difficulty in the x-risk/EA/alignment community: how not to get drawn into the phenomenal sensation of insight when something isn't going to help you on a problem. At my local EA meetup I sometimes worry that the impact of our speaker events is low, because the audience may not meaningfully update even though they're intellectually engaged. Put another way, intellectual engagement is goodhartable: the sensation of insight can distract you from your resolve to shatter your bottlenecks and save the world if it becomes an end in itself. Should researchers who want to be careful about this avoid the first role entirely? Should the alignment literature look upon the first reader role as a failure mode? We talk about a lot of cool stuff, and it can be easy to be drawn in by the cool factor, like some of the non-EA rationalists I've met at meetups.
I'm not saying reader role number two absolutely must dominate, because it can diverge from deconfusion, which is better captured by reader role number three.
Division of labor between reader and writer: writer roles do not always imply exactly one reader role
Isn't it the case that deconfusion/writer role three research can be disseminated to practical-minded (as opposed to theoretical-minded) people, and then those people turn question-answer into problem-solution? You can write in the question-answer regime, but there may be that (rare) reader who interprets it in the problem-solution regime! This seems to be an extremely good thing that we should find a way to encourage. In general, reading that drifts across multiple roles seems like the most engaged kind of reading.
Questions and Problems - thoughts on chapter 4
Last time we discussed the difference between information and a question or a problem, and I suggested that the novelty-driven mode of information presentation isn't as good as addressing actual questions or problems. In chapter 3, which I have not typed up thoughts about, a three-step procedure is introduced: name a topic, pose a question about it, and state the significance of answering that question.
The basic feedback loop introduced in this chapter relates practical with conceptual problems and relates research questions with research answers.
What should we do vs. what do we know - practical vs conceptual problems
Opposite each other in the loop are practical problems and conceptual problems. Practical problems are simply those which imply uncertainty over decisions or actions, while conceptual problems are those which only imply uncertainty over understanding. Concretely, your bike chain breaking is a practical problem because you don't know where to get it fixed, implying that the research task of finding bike shops will reduce your uncertainty about how to fix the bike chain.
Conditions and consequences
The structure of a problem is that it has a condition (or situation) and the (undesirable) consequences of that condition. This condition-costs model of problems holds both for practical problems and conceptual problems, but comes in slightly different flavors. In the practical problem case, the condition and costs are immediate and observed. However, a chain of "so what?" must be walked.
One person's cost may be another person's condition, so when stating the cost you ought to imagine a Socratic "so what?" voice, forcing you to articulate more immediate costs until the Socratic voice has to really reach in order to say that it's not a real cost.
The conceptual problem case is where intangibles come into play. The condition in that case is always the simple lack of knowledge or understanding of something. The cost in that case is simple ignorance.
Modus tollens
A helpful exercise: if you find yourself saying "we want to understand x so that we can y", try flipping it to "we can't y if we don't understand x". This sort of shifts the burden onto the reader to provide ways in which we can y without understanding x. You can do this iteratively: come up with _z_s which you can't do without y, and so on.
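In propositional terms (my own rendering, not the book's), the flip reads the motivation as a necessity claim and then takes its contrapositive:

```latex
% My own rendering, not the book's notation.
% Let U = "we understand x" and Y = "we can y".
% Read "we want to understand x so that we can y" at its strongest,
% as a necessity claim:
\[ Y \Rightarrow U \]
% The flipped form is its contrapositive, which is logically equivalent:
\[ \lnot U \Rightarrow \lnot Y \]
% A reader who rejects the flipped form owes you a way to achieve y without
% understanding x; that is exactly the burden the exercise shifts onto them.
```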
Pure vs. applied research
Research is pure when the significance stage of the topic-question-significance frame refers only to knowing, not to doing. Research is applied when the significance step refers to doing. Notice that the question step, even in applied research, refers to knowing or understanding.
Connecting research to practical consequences
You might find that the significance stage has to stretch a bit to connect to the conceptual understanding gained in the question stage. Sometimes you can modify the topic-question-significance frame by adding a fourth step, making it topic - conceptual question - conceptual significance - possible practical application. Splitting significance into two helps you draw reasonable, plausible applications. A claimed application is a stretch when it is not plausible. Note: the authors suggest that there is a class of conceptual papers in which you want to save practical implications entirely for the conclusion; for that kind of paper, practical applications do not belong in the introduction.
AI safety
One characteristic of AI safety that makes it difficult both to do and to interface with is that the chains of "so what?" are often very long. The path from deconfusion research to everyone dying or not dying feels like a stretch if not done carefully, and has a lot of steps when done carefully. As I mentioned in the section on chapter 2, it's easy to get sucked into the "novel information for its own sake" regime, at least as a reader. More practically oriented approaches are perhaps those that seek new regimes for how to even train models, where the "so what?" is answered with "so we have dramatically fewer OODR failures" or something. The condition-costs framework seems really beneficial for articulating alignment agendas and directions.
Misc
Sources - notes on chapters 5 and 6
Primary, secondary, and tertiary sources
The distinction between primary and secondary sources comes from 19th century historians, and the idea of tertiary sources came later. The boundaries can be fuzzy, and are certainly dependent on the task at hand.
I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.
The rest of chapter five is about how to use libraries and information technologies, and evaluating sources for relevance and reliability.
Chapter 6 starts off with the kind of thing you should be looking for while you read:
Look for creative agreement
Look for creative disagreement
The rest of chapter 6 is a few more notes about what you're looking for while reading (evidence, reasons), how to take notes, and how to stay organized while doing this.
The alignment community
I think I see the creative agreement modes and the creative disagreement modes floating around in posts. Would it be more helpful if writers decided on one or two of these modes before sitting down to write?
Moreover, what is a primary source in the alignment community? Surely if one is writing about inner alignment, a primary source is the Risks from Learned Optimization paper. But what are Risks' primary, secondary, tertiary sources? Does it matter?
Now look at Arbital. Arbital started off as a tertiary source, but articles that seemed more like primary sources started appearing there. I remember distinctly thinking "what's up with that?" It struck me as awkward for Arbital to change its identity like that, but I end up thinking about and citing the articles that seem more like primary sources.
There's also the problem that stuff in the memeplex which isn't written down is the real "primary" source, while the first person who happens to write it down looks like they're writing a primary source, when in fact what they're doing is really more like writing a secondary or even tertiary source.
Good arguments - notes on chapter 7
Arguments take place in five parts: claims, reasons, evidence, acknowledgments & responses, and warrants.
This can be modeled as a conversation with readers, where the reader prompts the writer to take the next step on the list.
Claims ought to be supported with reasons. Reasons ought to be based on evidence. Arguments are recursive: a part of an argument is an acknowledgment of an anticipated response, and another argument addresses that response. Finally, when the distance between a claim and a reason grows large, we draw connections with something called warrants.
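To make the recursion explicit, here is a minimal sketch of the five-part structure as a data type; it is my own toy rendering, not anything from the book, and the field contents are hypothetical.

```python
# A minimal sketch (hypothetical, not from the book) of the five-part
# argument structure, with the recursion made explicit: the response to an
# anticipated objection is itself a small argument.
from dataclasses import dataclass, field

@dataclass
class Argument:
    claim: str
    reasons: list[str] = field(default_factory=list)       # each should rest on evidence
    evidence: list[str] = field(default_factory=list)
    warrants: list[str] = field(default_factory=list)       # general principles bridging reason -> claim
    responses: list["Argument"] = field(default_factory=list)  # acknowledgments & responses, recursively

# Hypothetical toy instance, just to show the shape:
toy = Argument(
    claim="some claim",
    reasons=["a reason the reader should accept the claim"],
    evidence=["what the reason rests on"],
    warrants=["the general principle connecting reason to claim"],
    responses=[Argument(claim="a reply to an anticipated objection")],
)
print(toy.responses[0].claim)
```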
The logic of warrants proceeds in generalities and instances. A general circumstance predictably leads to a general consequence, and if you have an instance of the circumstance you can infer an instance of the consequence.
Arguing in real-life papers is more complicated than the bare five steps, because
Claims - thoughts on chapter 8
Broadly, the two kinds of claims are conceptual and practical.
Conceptual claims ask readers not to act, but to understand. The flavors of conceptual claim are as follows:
- claims of fact or existence
- claims of definition and classification
- claims of cause and consequence
- claims of evaluation or appraisal
There's essentially one flavor of practical claim: claims of action or policy.
If you read between the lines, you might notice that a kind of claim of fact or cause/consequence is that a policy works or doesn't work to bring about some end. In this case, we see that practical claims deal in ought or should. There is a difference, perhaps subtle perhaps not, between "X brings about Y" and "to get Y we ought to X".
Readers expect a claim to be specific and significant. You can evaluate your claim along these two axes.
To make a claim specific, you can use precise language and explicit logic. Usually, precision comes at the cost of a higher word count. To gain explicitness, use words like "although" and "because". Note that some fields might differ in norms.
You can think of the significance of a claim as how much it asks readers to change their minds, or I suppose even their behavior.
Avoid arrogance.
Two ways of avoiding arrogance are acknowledging limiting conditions and using hedges to limit certainty.
Don't run aground: there are innumerable caveats that you could think of, so it's important to limit yourself only to the most relevant ones or the ones that readers would most plausibly think of. Limiting certainty with hedging is illustrated by the example of Watson and Crick, who published what would become a high-impact result with "We wish to suggest ... in our opinion ... we believe ... Some ... appear".
It is not obvious how to walk the line between hedging too little and hedging too much.
Reasons and evidence - thoughts on chapter 9
We saw previously that claims ought to be supported with reasons, and reasons ought to be based on evidence. Now we will look closer at reasons and evidence.
Reasons must be in a clear, logical order. Atomically, readers need to buy each of your reasons, but compositionally they need to buy your logic. Storyboarding is a useful technique for arranging reasons into a logical order: physical arrangements of index cards, or some DAG-like syntax. Here, you can list evidence you have for each reason or, if you're speculating, list the kind of evidence you would need.
When storyboarding, you want to read out the top level reasons as a composite entity without looking at the details (evidence), because you want to make sure the high-level logic makes sense.
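As a toy illustration of what such a storyboard might look like (hypothetical node names; not from the book), here is a small DAG in which evidence stays attached to each reason but can be hidden for the top-level readout:

```python
# A minimal sketch (hypothetical, not from the book) of a storyboard: each
# node records the reasons that directly support it and the evidence (or the
# kind of evidence you would still need) attached to it.
storyboard = {
    "claim":    {"supported_by": ["reason_1", "reason_2"], "evidence": []},
    "reason_1": {"supported_by": [], "evidence": ["survey data (still needed)"]},
    "reason_2": {"supported_by": [], "evidence": ["case study (hypothetical)"]},
}

def top_level_readout(board: dict) -> list[str]:
    """Read out only the reasons directly under the claim, hiding the
    evidence, to check that the high-level logic hangs together."""
    return board["claim"]["supported_by"]

print(top_level_readout(storyboard))  # ['reason_1', 'reason_2']
```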
I think there is a contract between you and the reader. You must agree to cite sources that are plausibly truthful, and your reader must agree to accept that these sources are reliable. A diligent and well-meaning reader can always second-guess whether, for instance, the bureau of subject matter statistics is collecting and reporting data correctly, but at a certain point this violates the social contract. If they're genuinely curious or concerned, it may fall on them to investigate the source, not on you. The bar you need to meet is that your sources are plausibly trustworthy. The book doesn't talk much about this contract, so there's little I can say about what "plausible" means.
Sometimes you have to be extra careful to distinguish reasons from evidence. A (claim, reason, evidence) tuple is subject to regress in the latter two components: (A, B, C) may need to be justified by (B, C, D), and so on. The example given of this regress is if I told you (american higher education must curb escalating tuition costs, because the price of college is becoming an impediment to the american dream, today a majority of students leave college with a crushing debt burden). In the context of this sentence, "a majority of students..." is evidence, but it would be reasonable to ask for more specifics. In principle, any time information is compressed it may be reasonable to ask for more specifics. A new tuple might look like (the price of college is becoming an impediment to the american dream, because today a majority of students leave college with a crushing debt burden, in 2013 nearly 70% of students borrowed money for college with loans averaging $30000...). The third component is still compressing information, but it's not in the contract between you and the reader for the reader to demand the raw spreadsheet, so this second tuple might be a reasonable stopping point of the regress.

Sometimes you have to be careful to distinguish evidence from reports of it. Again, because we are necessarily dealing with compressed information, we can't often point directly to evidence. Even a spreadsheet, rather than summary statistics of it, is a compression of the phenomena in base reality that it tracks.
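Here is the same regress written out as tuples, just to make the shift pattern explicit (my own rendering, reusing the book's example):

```python
# A minimal sketch of the regress: the reason of one (claim, reason, evidence)
# tuple becomes the claim of the next, and the old evidence becomes the new
# reason, which now needs evidence of its own.
t1 = ("american higher education must curb escalating tuition costs",          # claim
      "the price of college is becoming an impediment to the american dream",  # reason
      "a majority of students leave college with a crushing debt burden")      # evidence

t2 = (t1[1],  # yesterday's reason is today's claim
      t1[2],  # yesterday's evidence is today's reason
      "in 2013 nearly 70% of students borrowed money for college, with loans averaging $30000")

# The regress stops once the last slot holds evidence the reader will accept
# without demanding the raw spreadsheet.
```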
There are criteria you want to screen your evidence against: accuracy, precision, representativeness, and authority.
Being honest about the reliability and prospective accuracy of evidence is always a positive signal. Evidence can be either too precise or not precise enough. The women in one or two of Shakespeare's plays do not represent all his women; they are not representative. Figure out what sorts of authority signals are considered credible in your community, and seek to emulate them.
Question your argument as your readers will - thoughts on chapter 10
Three predictable disagreements are
There are roughly two kinds of queries readers will have about your argument
Voicing too many hypothetical objections up front can paralyze you. Instead, what you should do before anything else is focus on what you want to say. Give that some structure, some meat, some life. Then, an important exercise is to imagine readers' responses to it.
I think cleaving these into two highly separated steps is an interesting idea; doing this with intention may be a valuable exercise next time I'm writing something.
The authors provide some questions about your problem from a possible reader:
Then, they provide some questions about your support from a possible reader.
It builds credibility to play defense: to recognize your own argument's limitations. It builds even more credibility to play offense: to explore alternatives to your argument and bring them into your reasoning. If you can, you might develop those alternatives in your own imagination, but more likely you'd like to find alternatives in your sources.
What is the right number of objections to acknowledge? Acknowledging too many can distract readers from the core of your argument, while acknowledging too few is a signal of laziness or even disrespect. You need to narrow your list of alternatives or objections by subjecting them to the following priorities
It is wise to build up good faith by acknowledging questions you can't answer. Concessions are often interpreted as positive signals by the reader.
It is important for your responses to acknowledgments to be subordinate to your main point, or else the reader will miss the forest for the trees.
Remember to make an intentional decision about how much credence to give to an objection or alternative. Weaker ones imply weaker credences, which in turn imply less effort in your acknowledgment and response.
On warrants - thoughts on chapter 11
We saw that arguments take place in five parts. Claims, reasons, evidence, acknowledgments & responses, and warrants flow together into a single argument.
In everyday life, people use proverbs as warrants. If I rush out of the room and leave behind my charging laptop, someone might say "you shouldn't have rushed", and if I question the relevance of rushing to leaving behind my laptop they might say "haste makes waste" to illustrate the cause-effect relation. In academic arguments, however, there's no reliance on commonsense or commonly acknowledged phrases that rhyme; warrants are instead "specific principles of reasoning that belong to particular communities of researchers".
Often warrants are implicit, especially in writing by experienced researchers. Like problems, warrants take on the structure of circumstance and consequence.
Warrant-powered arguments proceed in a logic of generalities and instances. A warrant might be "when a powerful system has a goal (general circumstance), the universe gets tiled with counterintuitive realizations of that goal (general consequence)", which could power an argument like "the atoms in my kidney might get rearranged into smileyfaces (claim) because an arbitrary self-improving system got tasked with making people happy (reason)". The atoms in my kidney getting rearranged into smileyfaces is a good instance of a counterintuitive realization of a goal, that goal being making people happy. Notice the square structure: the general circumstance and general consequence sit on top, and their specific instances sit below; a schematic rendering follows.
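Here is that square in my own schematic notation (not the book's):

```latex
% My own schematic rendering of the warrant "square"; not the book's notation.
% General warrant: whenever the general circumstance C holds, the general
% consequence Q follows.
\[ \forall s \; \bigl( C(s) \Rightarrow Q(s) \bigr) \]
% Instantiation: the reason supplies an instance of the circumstance,
% C(a) = "this self-improving system was tasked with making people happy",
% and the claim is the corresponding instance of the consequence,
% Q(a) = "the atoms in my kidney get rearranged into smileyfaces".
\[ C(a) \;\Rightarrow\; Q(a) \]
```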
When to state a warrant
Warrants can go unsaid quite often. There are roughly three circumstances where you want to be explicit about your warrants.
Meta-warrants
It is worth picking out such a thing as methodological warrants, or meta-warrants. Think of them as the source of proverbs rather than proverbs themselves. Three basic meta-warrants are
Reasons, evidence, and warrants