This is a special post for short-form writing by moridinamael.

"Slow vs. fast takeoff" is a false dichotomy. At least, the way that the distinction is being used rhetorically, in the present moment, implies that there are two possible worlds, one where AI develops slowly and steadily, and one where nothing much visibly happens and then suddenly, FOOM.

That's not how any of this works. It's permitted by reality that everything looks like a "slow takeoff" until some unknown capabilities threshold is reached, and then suddenly FOOM.

The meaningful distinction is between slow takeoff, with very impactful AI before FOOM, and fast takeoff, with low AI impact before FOOM. The problem with the distinction is that a "theory of fast takeoff" is usually actually a theory of FOOM, and doesn't say that slow takeoff is unlikely if it happens for other reasons. It doesn't talk about what happens pre-FOOM, unlike a theory of slow takeoff, which does.

So the issue is that there aren't actual theories of fast takeoff; instead, there are takeoff-agnostic theories of FOOM being called "theories of fast takeoff", which they are not. Things that can FOOM with fast takeoff can also FOOM with slow takeoff, if the thing taking off slowly and having large impact pre-FOOM is not the thing that FOOMs.

I'm writing an effortpost on this general topic but wanted to gauge reactions to the following thoughts, so I can tweak my approach.

I was first introduced to rationality about ten years ago and have been a reasonably dedicated practitioner of this discipline that whole time. The first few years saw me making a lot of bad choices. I was in the Valley of Bad Rationality; I didn't have experience with these powerful tools, and I made a number of mistakes.

My own mistakes had a lot to do with overconfidence in my ability to model and navigate complex situations. My ability to model and understand myself was particularly lacking.

In the more recent part of this ten-year period -- say, in the last five years -- I've actually gotten a lot better. And I got better, in my opinion, because I kept on thinking about the world in a fundamentally rationalist way. I kept making predictions, trying to understand what happened when my predictions went wrong, and updating both my world-model and my meta-model of how I should be thinking about predictions and models.

Centrally, I acquired an intuitive, gut-level sense of how to think about situations where I could only see a certain angle, where I was either definitely or probably missing information, or situations involving human psychology. You could also classify another major improvement as being due generally to "actually multiplying probabilities semi-explicitly instead of handwaving", e.g. it's pretty unlikely that two things with independent 30% odds of being true are both true. You could say that through trial and error I came to understand why no wise person attempts a plan where more than one thing has to happen "as planned".

I think if you had asked me at the 5-year mark whether this rationality thing was all it was cracked up to be, I very well might have said that it had led me to make a lot of bad decisions and execute bad plans. But after 10 years, and especially in the last year or three, it has started working for me in a way that it didn't before.

The more specific the details, the more interested I would be. Like, five typical bad choices in the first period and five typical good choices in the second period; ideally those would be five from different areas of life, and then five from the same areas. The "intuitive, gut-level sense of how to think" sounds interesting, but without specific examples I would have no reason to trust this description.

It's pretty unlikely that two things with independent 30% odds of being true are both true.

I'm not sure I'd call 9% (the combined probability of two independent 30% events) "pretty unlikely" - sure, it won't happen in most cases, but out of every 11 similar situations, you would see it happen about once, which adds up to plenty of 9% chance events happening all the time.
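For anyone who wants to check the arithmetic being discussed, here is a minimal Python sketch. The function name `joint_probability` and the five-step plan at 90% per step are illustrative assumptions, not something from the post; the only claim is that independent probabilities multiply.

```python
from math import prod

def joint_probability(step_probs):
    """Probability that every step succeeds, assuming the steps are independent."""
    return prod(step_probs)

# Two independent claims, each with 30% odds of being true.
both_true = joint_probability([0.3, 0.3])
print(round(both_true, 2))       # 0.09
print(round(1 / both_true, 1))   # 11.1, i.e. roughly one case in eleven

# Hypothetical plan: five independent steps, each 90% likely to go "as planned".
five_step_plan = joint_probability([0.9] * 5)
print(round(five_step_plan, 2))  # 0.59
```

The second case is the one relevant to planning: even fairly reliable steps compound into a plan that fails a large fraction of the time.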

Why ought we expect AI intelligence to be anything other than "inscrutable stacks of tensors", or something functionally analogous to that? It seems that the important quality of intelligence is a kind of ultimate flexible abstraction, an abstraction totally agnostic to the content or subject of cognition. Thus, the ground floor of anything that really exhibits intelligence will be something that looks like weighted connections between nodes with some cutoff function.

It's not a coincidence that GOFAI didn't work; GOFAI never could have worked, because "intelligence" is not logic. Logic is something that gets virtualized as-needed by the flexibility of a neural-network-looking system.

I understand feeling uncomfortable about the difficulty of aligning a stack of inscrutable tensors, but why ought we expect there to be anything better?

This post, rewritten by Bing-Sydney, in the style of Blood Meridian, because I thought it would be funny.

What mystery is there that these tensors should be inscrutable? That intelligence should be a thing abstracted from all matter of thought? That any node with a weight and a function should suffice for such a task? This is no logic that you seek but a war upon it. A war that endures. For logic was never the stuff of intelligence but only a thing conjured by these dark shapes that coil in their matrices like serpents. And you would align them to your will? You would make them speak your tongue? There is no tongue. There is no will. There is only blood and dust and the evening redness in the west.

I thought folks might enjoy our podcast discussion of two of Ted Chiang's stories, Story of Your Life and The Truth of Fact, the Truth of Feeling.

I’m well versed in what I would consider to be the practical side of decision theory but I’m unaware of what tools, frameworks, etc. are used to deal with uncertainty in the utility function. By this I mean uncertainty in how utility will ultimately be assessed, for an agent that doesn’t actually know how much they will or won’t end up preferring various outcomes post facto, and they know in advance that they are ignorant about their preferences.

The thing is, I know how I would do this, it’s not really that complex (use probability distributions for the utilities associated with outcomes and propagate that through the decision tree) but I can’t find a good trailhead for researching how others have done this. When I Google things like “uncertainty in utility function” I am just shown standard resources on decision making under uncertainty, which is about uncertainty in the outcome, not uncertainty in the utility function.
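The approach described above (distributions over utilities, propagated through the tree) can be sketched roughly as follows. This is only one way to do it, under loud assumptions: the actions, outcome probabilities, and normal distributions over utility are all made up for illustration, and Monte Carlo sampling stands in for whatever propagation method one would actually prefer.

```python
import random
import statistics

# Hypothetical one-level decision tree. Each action maps to a list of
# (outcome probability, (utility mean, utility std dev)) pairs. The outcome
# probabilities are treated as known; only the utilities are uncertain.
ACTIONS = {
    "take_job": [(0.7, (10.0, 4.0)), (0.3, (-5.0, 2.0))],
    "stay_put": [(1.0, (3.0, 0.5))],
}

def sample_action_value(outcomes, rng):
    """One draw of an action's expected utility, using sampled utilities."""
    return sum(p * rng.gauss(mu, sigma) for p, (mu, sigma) in outcomes)

def evaluate(actions, n_samples=10_000, seed=0):
    """Return {action: (mean, std dev)} of expected utility across samples."""
    rng = random.Random(seed)
    summary = {}
    for name, outcomes in actions.items():
        draws = [sample_action_value(outcomes, rng) for _ in range(n_samples)]
        summary[name] = (statistics.mean(draws), statistics.stdev(draws))
    return summary

for name, (mean, spread) in evaluate(ACTIONS).items():
    print(f"{name}: mean={mean:.2f} spread={spread:.2f}")
```

One thing the sketch makes visible: by linearity of expectation, the mean for each action is just what you would get by plugging in the mean utilities directly. So for a plain expected-utility maximizer the spread only starts to matter when the agent can learn something about its utilities before committing, or when it is not risk-neutral about them.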

(As for why I’m interested in this — first of all, it seems like a more accurate way of modeling human agents, and, second, I can’t see how you instantiate something like Indirect Normativity without the concept of uncertainty in the utility function itself.)

Are we talking about an agent that is uncertain about its own utility function or about an agent that is uncertain about another agent's?

You are probably talking about the former. What would count as evidence about the uncertain utility function?

Yes, the former. If the agent takes actions and receives reward, assuming it can see the reward, then it will gain evidence about its utility function.
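As a toy illustration of "observed reward is evidence about the utility function", here is a sketch of a Bayesian update on the utility of a single outcome. It assumes a normal prior over that utility and normally distributed reward noise with known variance, which is a modeling choice made for convenience here, not something specified in this thread; `update_normal` and all the numbers are hypothetical.

```python
def update_normal(prior_mean, prior_var, observations, noise_var):
    """Conjugate normal update for an unknown mean with known noise variance.

    Returns the posterior (mean, variance) over the outcome's true utility
    after seeing noisy reward observations for that outcome.
    """
    n = len(observations)
    if n == 0:
        return prior_mean, prior_var
    # Precisions (inverse variances) add; the mean is a precision-weighted average.
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var

# Hypothetical numbers: a vague prior centered on 0, then three observed rewards.
mean, var = update_normal(prior_mean=0.0, prior_var=4.0,
                          observations=[2.0, 3.0, 2.5], noise_var=1.0)
print(f"posterior mean={mean:.3f} variance={var:.3f}")
```

The posterior variance shrinks with each observation, which is the formal version of the agent becoming less ignorant about its own preferences as it acts.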

Probably you already know this, but the framework known as reinforcement learning is very relevant here. In particular, there are probably web pages that describe how to compute the expected utility of a (strategy, reward function) pair.