# Nisan's Shortform

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a special post for quick takes by Nisan. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

On 2018-04-09, OpenAI said[1]:

OpenAI’s mission is to ensure that artificial general intelligence (AGI) [...] benefits all of humanity.

In contrast, in 2023, OpenAI said[2]:

[...] OpenAI’s mission: to build artificial general intelligence (AGI) that is safe and benefits all of humanity.

1. This archived snapshot is from 2023-05-17, but the document didn't get much attention until November that year. ↩︎

Recent interviews with Eliezer:


Hyperbolic growth

The differential equation $\frac{dx}{dt} = a x^{1+p}$, for positive $a$ and $p$, has solution

$$x(t) = (t_0 - t)^{-1/p}$$

(after changing the units). The Roodman report argues that our economy follows this hyperbolic growth trend, rather than an exponential one.

While exponential growth has a single parameter (the growth rate or interest rate), hyperbolic growth has two parameters: $t_0$ is the time until singularity, and $p$ is the "hardness" of the takeoff.

A value of $p$ close to zero gives a "soft" takeoff where the derivative gets high well in advance of the singularity. A large value of $p$ gives a "hard" takeoff, where explosive growth comes all at once right at the singularity. (Paul Christiano calls these "slow" and "fast" takeoff.)

Paul defines "slow takeoff" as "There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles." This corresponds to $p \le 2$. (At $p = 2$, the first four-year doubling starts at $16/3$ and the first one-year doubling starts at $4/3$ years before the singularity.)

So the simple hyperbola $x = 1/(t_0 - t)$ with $p = 1$ counts as "slow takeoff". (This is the "naive model" mentioned in footnote 31 of Intelligence Explosion Microeconomics.)
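The slow-takeoff criterion can be checked numerically. The sketch below assumes the growth curve has the form x = T^(-1/p), where T is the time remaining until the singularity and p is the hardness parameter. The function names `doubling_start` and `is_slow_takeoff` are made up for illustration. Under that assumption, output doubles while T shrinks by a factor of 2^(-p), so a doubling that starts at T takes T(1 - 2^(-p)) years.

```python
def doubling_start(duration: float, p: float) -> float:
    """Years before the singularity at which the first doubling of the
    given duration begins, assuming x = T**(-1/p) (T = time remaining)."""
    # x doubles as T shrinks from T1 to T1 * 2**(-p),
    # which takes T1 * (1 - 2**(-p)) years; solve that for T1.
    return duration / (1.0 - 2.0 ** (-p))


def is_slow_takeoff(p: float) -> bool:
    """Paul's criterion: a complete 4-year doubling finishes before
    the first 1-year doubling starts."""
    four_year_end = doubling_start(4.0, p) - 4.0
    one_year_start = doubling_start(1.0, p)
    # Larger values mean further from the singularity, i.e. earlier in time.
    # Note: exactly at the boundary p = 2 this comparison is sensitive to rounding.
    return four_year_end >= one_year_start


if __name__ == "__main__":
    for p in (0.5, 1.0, 3.0):
        print(p, doubling_start(4.0, p), doubling_start(1.0, p), is_slow_takeoff(p))
```

With these assumptions, the simple hyperbola (p = 1) passes the criterion and p = 3 fails it.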

Roodman's estimates of historical are closer to (see Table 3).

People are fond of using the neologism "cruxy", but there's already a word for that: "crucial". Apparently this sense of "crucial" can be traced back to Francis Bacon.

The point of using a word like this is to point to different habits of thought. If you use an existing word, that's unlikely to happen in listeners.

If you don't do that you get a lot of motte-and-bailey issues.

A cruxy point doesn't have to be important, and the whole question being considered doesn't have to be important either. This is an unfortunate connotation of "crucial", because when I'm pointing out that the sky is blue, I'm usually not saying that it's important that it's blue, or that it's important for this object level argument to be resolved. It's only important to figure out what caused a simple mistake that's usually reliably avoided, and to keep channeling curiosity to fill out the map, so that it's not just the apparently useful parts that are not wild conjecture.

I think it's relative.  A crux is crucial to a question, whether the question is crucial to anything else or not.  If you're pointing out the sky is blue, that's only a crux if it's important to some misunderstanding or disagreement.

I'm with Nisan.  "Crucial" is simply the proper and common term that should be used instead of the backformation "cruxy".

12 years ago, in The state of Computer Vision and AI: we are really, really far away, Andrej Karpathy wrote:

The picture above is funny.

But for me it is also one of those examples that make me sad about the outlook for AI and for Computer Vision. What would it take for a computer to understand this image as you or I do? [...]

In any case, we are very, very far and this depresses me. What is the way forward? :(

I just asked gpt-4o what's going on in the picture, and it understood most of it:

In this image, a group of men in business attire are seen in a locker room or a similar setting. The focus is on two men, where the taller man is standing on a scale. The shorter man, who appears to be playfully pressing down on the taller man's shoulders to increase his weight on the scale, is creating a humorous situation. Both men and those observing in the background are smiling or laughing, indicating that they are enjoying the lighthearted moment. The man pressing down seems to be taking part in a playful prank or joke, adding a sense of camaraderie and fun to the scene.

Of course, Karpathy's post could be in the multimodal training data.


Conception is a startup trying to do in vitro gametogenesis for humans!

We can derive Newton's law of cooling from first principles.

Consider an ergodic discrete-time dynamical system and group the microstates into macrostates according to some observable variable $X$. ($X$ might be the temperature of a subsystem.)

Let's assume that if $X = x$, then in the next timestep $X$ can be one of the values $x - 1$, $x$, or $x + 1$.

Let's make the further assumption that the transition probabilities for these three possibilities have the same ratio as the number of microstates.

Then it turns out that the rate of change over time $\frac{d\,\mathbb{E}[X]}{dt}$ is proportional to $\frac{dS}{dx}$, where $S$ is the entropy, which is the logarithm of the number of microstates.

Now suppose our system consists of two interacting subsystems with energies $E_1$ and $E_2$. Total energy is conserved. How fast will energy flow from one system to the other? By the above lemma, $\frac{dE_1}{dt}$ is proportional to $\frac{dS}{dE_1} = \frac{dS_1}{dE_1} - \frac{dS_2}{dE_2} = \beta_1 - \beta_2$.

Here $\beta_1$ and $\beta_2$ are the coldnesses of the subsystems. Coldness is the inverse of temperature, and is more fundamental than temperature.

Note that Newton's law of cooling says that the rate of heat transfer is proportional to $T_2 - T_1$. For a narrow temperature range this will approximate our result.
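Here is a rough numerical check of the lemma, under stated assumptions. The expected one-step change is computed directly from the three transition weights, taken proportional to the microstate counts exp(S). The two-subsystem entropy S_i = c_i ln(E_i) and the constants `total`, `c1`, `c2` are hypothetical choices for illustration, not anything from the argument above. For smooth S, the drift should come out close to (2/3) times the coldness difference.

```python
import math


def expected_drift(entropy, x: float) -> float:
    """Expected one-step change in X from macrostate x, when the moves to
    x-1, x, x+1 have probabilities proportional to exp(S) at each target."""
    s0 = entropy(x)
    # Subtract S(x) before exponentiating to avoid overflow.
    w_down = math.exp(entropy(x - 1) - s0)
    w_stay = 1.0
    w_up = math.exp(entropy(x + 1) - s0)
    return (w_up - w_down) / (w_down + w_stay + w_up)


def total_entropy(e1: float, total: float = 2000.0,
                  c1: float = 300.0, c2: float = 500.0) -> float:
    """Hypothetical entropy S1(E1) + S2(E - E1) with S_i = c_i * ln(E_i)."""
    return c1 * math.log(e1) + c2 * math.log(total - e1)


def coldness_difference(e1: float, total: float = 2000.0,
                        c1: float = 300.0, c2: float = 500.0) -> float:
    """beta_1 - beta_2, where beta_i = dS_i/dE_i = c_i / E_i."""
    return c1 / e1 - c2 / (total - e1)


if __name__ == "__main__":
    for e1 in (600.0, 750.0, 900.0):
        drift = expected_drift(total_entropy, e1)
        print(e1, drift, (2.0 / 3.0) * coldness_difference(e1))
```

In this toy setup the drift changes sign where the two coldnesses are equal (E_1 = 750 with the default constants), which is the equilibrium the cooling law predicts.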

I'd love if anyone can point me to anywhere this cooling law (proportional to the difference of coldnesses) has been written up.

Also my assumptions about the dynamical system are kinda ad hoc. I'd like to know assumptions I ought to be using.

Agents who model each other can be modeled as programs with access to reflective oracles. I used to think the agents have to use the same oracle. But actually the agents can use different oracles, as long as each oracle can predict all the other oracles. This feels more realistic somehow.

I'm not sure there's a functional difference between "same" and "different" oracles at this level of modeling.