I have a plan to write a sequence that I can turn into a book about the problem of the criterion. I'm calling it "Fundamental Uncertainty" since I think it's a better name and it identifies the thing the problem of the criterion exposes. Yes, there's already an idea of "fundamental uncertainty" in the literature but that's okay because the existing name points to a special case of what I mean by the term. Maybe I'll come up with a better name before I finish and I'll switch to that.

Why write a book? Although I can adequately make my points in just a few blog posts and assemble a path to convey my points by linking to existing material on the internet and saying "read this and think real hard about it", most people won't do that. Books have a few advantages.

First, they carry weight. When someone takes the time to write a book they demonstrate the subject is important enough to put in that level of effort, so it signals that the topic is more likely than a randomly chosen topic to be worth knowing about.

Second, books let a reader spend time with ideas. When you read a blog post you can easily go on to the next, unrelated thing. When you read a book you stay with the same ideas for a long time and, if you don't read the whole book in one sitting, come back to them over time. Some of the deep thinking the author would like you to do happens automatically, because the long-form format gives the ideas time to seep into your brain.

Third, it's an opportunity to put a lot of related ideas together in one place and make clear how they fit together. It's easy to miss a blog post or not see how a set of blog posts fit together. Books naturally put ideas together in one place and encourage showing how they are related.

Fourth, publishing a book confers status on the author, and I shouldn't discount the signaling value in being a "published author" above and beyond the minor status bump from having published journal articles, conference papers, and of course blog posts in various places.

My post today is meant to serve as a rough table of contents, mostly for myself, and to incentivize me a bit to write all the parts (and not give up on it when it seems hard to flesh out some of the chapters, especially when I'd rather link to someone saying similar things better) by having said in public that I planned to.

Here's my initial draft for the table of contents:

  1. Introduction
    1. Basically a 2000 to 5000 word summary of the book. Doesn't go into details but does cover the whole thing. Personally I like when books do this, and I think it's honest: I can fit all of what I want to say in an essay if you already know the background material I want to reference, or you read the references. Also I like it as a didactic device: you tell people what you're going to tell them, you tell them, and then you tell them what you told them. Helps with retention and integration of the ideas. I think of it as something like putting up the drywall so there's somewhere to hang the pictures of the later chapters (and then you put the roof on? look, it's not a perfect metaphor).
  2. Symbol Grounding
    1. Chapter on the symbol grounding problem. This is well worn ground but I think it's a key example to talk about because it builds intuitions.
    2. Note that this and the next two chapters are basically just setup stolen from my post "Why the Problem of the Criterion Matters". I think it's important to first show why there's a problem that needs a solution. The problem of the criterion is fairly abstract, so let's show where the fundamental uncertainty it causes lies so by the time we get around to addressing it head on the reader has built up some intuitions that they can generalize from.
  3. Disagreements about values/morals
    1. Talk about how people seem to fundamentally value different things and end up disagreeing even when they agree on the facts. We probably don't have to get into metaethical uncertainty explicitly, but it never hurts to give a shout out!
  4. Motivation problems
    1. A bit more psychological, but basically look at problems where fundamental uncertainty causes people to distrust themselves and not get stuff done. This chapter could get pretty sprawling and I don't necessarily want to pull in lots of cutting edge psychology, but I think everyone is familiar with the experience of wanting to do one thing and then doing another, so we can use that as a basis to talk about ways that we distrust ourselves because we can't be certain, as I see that as the root of the problem.
  5. The Problem of the Criterion/Fundamental Uncertainty
    1. I've written about these ideas more since I wrote the linked post, so I can probably make a better version of it using some of the better ways I've cooked up to explain the ideas, especially by leaning on the intuitions of the previous three chapters.
  6. Purpose
    1. My "answer" to the problem of the criterion is to see it's only a relative problem because there was no absolute ground to stand on to begin with, and that's because things are actually grounded in purpose, care, Sorge, whatever you want to call it.
    2. This chapter might need to be split up, I'm not sure. I might need something like an interlude about cybernetics to give a satisfying answer here, but maybe it's only that I'm unsatisfied without the cybernetics to back up the theory and most others are fine without trying to understand what purpose is at the most fundamental level. I'll think about it. Feedback welcome.
  7. Epistemic Humility and Goodharting
    1. Seeking Truth Too Hard Can Keep You From Winning
    2. Now that we've come through explaining the problem of the criterion and purpose, I think it's good to return to a case I could explain without it but I think lands harder in the context of it, namely that if you try too hard to optimize for truth that is, as I've now shown, ultimately uncertain and contingent, then you can Goodhart yourself on it. End by pointing out the general problem of Goodharting from not accepting epistemic limits and other ways this shows up (see also "Forcing Yourself is Self Harm, or Don't Goodhart Yourself" & "Forcing yourself to keep your identity small is self-harm").
  8. Miscellaneous musings
    1. I think there are various interesting things we can say as a result of fundamental uncertainty about various philosophical and other questions. This is a bit of a fun, walk-in-the-park type chapter where we just look at a few things to get people thinking.
    2. Definitely gonna stick some stuff in here about AI alignment, pandemic mitigation, politics, and existential risks.
  9. Conclusion
    1. Revisit the introduction. Maybe just literally reprint it, but likely we can do something more that reflects what the reader has learned to say things a bit differently, like maybe the introduction but now use all the jargon and concepts the reader has learned about along the way. This is a combination recap and reinforcement of the core ideas so the reader walks away reminded of them. Spaced repetition in book form.

I'll probably work on the chapters out of order and add them to the sequence here slowly as I make progress. Chapters 2, 3, and 4 are totally unwritten by me, other than in passing in other posts. Chapters 5, 6, and 7 already exist as blog posts, but could stand substantial revision, especially to fit as part of a book. Chapters 1, 8, and 9 need to be written but basically come last, with chapter 8 likely being where I'll dump a bunch of tangents that get cut from other chapters to streamline them.

If I manage to make it to the end, I'll figure out something to do about publishing. If you have thoughts about what's planned in the chapter outline above, the publication process, or other general thoughts on the idea of this book, this is a great post to discuss them. For example, maybe I'm missing something really important or you think something seems off topic and should be cut. Let me know in the comments!

I'm also mildly interested in a co-author. Writing a book is hard, and many parts of the book don't benefit from specifically me writing them, or could be written by me providing initial drafts and my co-author turning them into polished work. I'd be happy to work with you to figure out funding if that would be needed to support you getting involved (I think there's a strong case to get money from an EA fund to support this work since I see it as vital to several existential risk mitigation missions). Ideally you'd be someone with a track record of quality published material (blog posts, books, articles, etc.), interested in this topic, and generally on board with my take on the problem of the criterion. I'd be looking for you to take on writing the earlier chapters in the book and help me improve the later chapters (i.e. help me better convey what I'm trying to say to readers without watering down what I want to say). If that sounds interesting, let's chat!

I'm not sure what my timeline is for completing the book. Without a co-author it's long, likely 2-3 years. With a co-author maybe it would be something like 18 months? I'm just trying to be realistic given I'm working a full-time, mentally taxing job and otherwise have a life, but I also know that writing comes in spurts and is often motivationally "free" vs. other things I might do. We'll see.

