TLDR: Poor categorization led me to focus too narrowly on tasks.
I was stuck in the pitfalls of rationality and didn’t know how to get out. For better or worse, I moved on to the next step of my master plan: understanding the most important causes in the multiverse.
So, what are the most important causes? I skimmed 80,000 Hours, the UN, Existential Hope, Wikipedia, and Big List of Cause Candidates. My gut jumped to AI safety. But I reminded myself that I didn’t just want a superintelligent AI to be intent-aligned; I wanted everyone to have my values as soon as possible. So my new number one cause became spreading my values.
However, even if everyone shared my values, they could accidentally misalign an AI. And I don’t just want to avoid misalignment. I don’t want anyone with my values to make a mistake. So, I decided rationality (i.e., decision-making) was a more important cause than AI.
It hit me. Cause categorizations (and all categories?) are arbitrary. Positively shaping the development of AI sounds like a bigger cause than AI safety because it could be read to include both AI safety and good AI capabilities. I could say the same thing about positively shaping biotechnology versus pandemic prevention. Or I could group AI safety, pandemic prevention, nuclear war, etc., into a cause named preventing existential risk.
Anything can be a cause. I just hadn’t heard anyone refer to the cause of climate change, eliminating poverty, and ending fidgeting, because people don’t commonly group those three things together.
I asked the wrong question. I wanted to rank causes because I thought it’d help me make a positive impact. But I explored causes in too much isolation. I should’ve assessed them alongside my comparative advantage, estimating what I could accomplish during my lifetime rather than playing with words.
In retrospect, I think I made similar mistakes when studying rationality. When I made a to-possibly-do list, I didn’t consider the length of my tasks.
I had “look into AI alignment” and “try AquaNotes” on the same list. One of those things has a much higher expected utility than the other. Looking into AI alignment probably takes more time too.
But duration wasn’t in my expected utility formula, which I planned to use to rank to-do’s. I could redefine a to-do, combine to-do’s, or break a to-do into multiple to-do’s to change the expected utility rankings. I didn’t notice I was breaking down tasks arbitrarily until I read about decision theory.
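Here’s a toy sketch of what changes once duration enters the formula, i.e., ranking by expected utility per hour instead of raw expected utility. The two task names come from this post; every number is invented for illustration:

```python
# Toy illustration: ranking to-do's by expected utility alone vs.
# expected utility per hour. Utilities and durations are made up.
tasks = [
    # (name, expected_utility, estimated_hours)
    ("look into AI alignment", 100.0, 50.0),
    ("try AquaNotes", 3.0, 0.5),
]

# Ranking by raw expected utility puts the big, slow task first.
by_utility = sorted(tasks, key=lambda t: t[1], reverse=True)

# Ranking by expected utility per hour puts the quick task first
# (3 / 0.5 = 6 utils/hr beats 100 / 50 = 2 utils/hr).
by_rate = sorted(tasks, key=lambda t: t[1] / t[2], reverse=True)

print([name for name, _, _ in by_utility])
print([name for name, _, _ in by_rate])
```

The point isn’t the numbers; it’s that the two rankings disagree, and that arbitrarily splitting or merging tasks changes both.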
I think I also studied rationality in too much isolation. I didn’t see how I could teach myself to notice when to recall concepts on my list. Such as cached thoughts. Or when I’d use a concept at all. For example, blackmail. What was I thinking when I wrote that down?
And some concepts seemed sufficiently complicated that I barely bothered to understand them. I wasn’t motivated.
In the end, studying rationality made me less worried I’m missing out on easily accessible valuable life lessons. But I’m skeptical I used my time efficiently.
So What Now?
I’ve remembered that my fundamental tenet of rationality and my number one cause are the same thing: maximize my expected utility. Now!
I’ve still got a ton to learn about rationality and causes. But I’ll strive to read about them or anything else in context. The context that I’m a utility monster!
Hopefully, I’ll learn from my categorization mistakes. My posts are infrequent enough that I’m not counting on becoming a Scott Alexander / Tim Urban / Nathan Fielder hybrid. It’s time to make a new plan.
But, as I think about moving forward, I feel rushed. My stress has already been clouding my thoughts.
I’m worried I’ll succumb to pressure to appear productive. And fail to step back, collect myself, and make the effort to be more rational. Potentially repeating mistakes that led me to waste thousands of hours.
Granted, I just went from reading about causes to reading 37 Ways That Words Can Be Wrong. I don’t think I’m moving too fast yet.
And if I don’t work through more of my anxieties, I don’t see how I can become much happier.
Footnotes
I was only looking to determine the “importance” (i.e., scale) of a cause. I wasn’t considering tractability (i.e., solvability), neglectedness, or any other criteria.
I’d consider an AI intent-aligned if it does what its creators want it to do. That doesn’t mean its creators will fully agree on what they want it to do.
And is any word a category? It depends on how you define category. Each definition of a word is like an item in the word’s category. And some (all?) words can be seen as categories that group categories. (E.g., tigers: “objects with the properties of largeness, yellowness, stripedness, and feline shape, [that] have previously often possessed the properties ‘hungry’ and ‘dangerous.’”)
I'd seen that categorization before writing this post. But I'd subconsciously treated preventing existential risk and AI safety (etc.) as incomparable categories. I probably did this because I think of AI safety as an item in the existential risk category. I hadn’t consciously realized that I had multiple levels of causes. (E.g., existential risk -> AI safety, etc., -> interpretability, other AI safety problems) I confused myself when ranking causes because I used the same word, cause, for each level of causes.
My brain also struggles with categorization because my categories don’t form a neat tree. They’re a graph. (E.g., As I implied, I treat AI safety as a sub-category of both existential risk and AI generally. And AI and existential risks affect each other.)
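A minimal way to see the tree-versus-graph distinction is to record each cause’s parent categories. The cause names below are the ones from this post; the structure is my own toy encoding:

```python
# Causes as a directed graph: each cause maps to its parent categories.
# In a tree, every node has at most one parent; "AI safety" has two,
# so this structure is a graph, not a tree.
parents = {
    "interpretability": ["AI safety"],
    "AI safety": ["existential risk", "AI"],
    "pandemic prevention": ["existential risk"],
    "existential risk": [],
    "AI": [],
}

is_tree = all(len(ps) <= 1 for ps in parents.values())
print(is_tree)  # False: "AI safety" sits under both existential risk and AI
```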
In this case, I was using the terms AI alignment and AI safety equivalently. I generally prefer the term AI safety. That’s because I’m not sure how commonly AI alignment is meant to include misuse (i.e., someone intent-aligning an AI, but using that AI for “bad”).
I haven’t tried AquaNotes yet.
Per page 5. It implies that (when it’s worth taking the time to make a decision) I should use expected utility per minute/hour to rank tasks. And in case clarification is needed: I’ve used an (arbitrarily defined) task’s importance and duration to make decisions for as long as I can remember. I just hadn’t come up with that formula.
And since I’d written up thoughts on this mistake before drafting this post, I’m not sure why I repeated this mistake when ranking causes. I think it’s because I temporarily forgot what I learned. And, even if my mistake was fresh in my mind, I probably still would’ve needed a little time to recognize that causes are defined arbitrarily.
Ask myself, whenever I have an opinion, whether I formed that opinion years ago? And ask if I’m succumbing to every cognitive bias? That’s time-consuming. It’s easier to ask myself about a broader cognitive bias (e.g., confirmation bias). A cached thought could be considered a form of confirmation bias because I want to believe I’m right and don’t want to make the effort to rethink my opinion. But, in practice, the term confirmation bias doesn’t trigger the same thoughts as the term cached thoughts, etc.
Anything I learn could be considered (applied) rationality. In the previous sentence, I’m defining rationality as anything related to decision-making that I don’t categorize as another subject.
I don’t need to break down the question, “How can I maximize my expected utility now, forever?” But I don’t see how I can prevent myself from making mental categories. And trying to answer that question without breaking it down sounds as overwhelming as asking myself, “What task should I do on this enormous list?”
Rather than being productive, or doing whatever else I’d do if I weren’t stressed.
I read all the content and linked posts through number 16.