An Exploratory Toy AI Takeoff Model

I agree that the background growth could be higher. I have two calculations with different backdrop growth speeds in this section, but the model works with unitless timesteps.

mike_hawke's Shortform

The post talks about this book; I assume the first paragraph is supposed to be a quote.


I very much dislike the second sentence in this tag: "If you do something to feel good about helping people, or even to be a better person in some spiritual sense, it isn't truly altruism."

First of all, it's cryptonormative. Second, it leads to the old "people only care about their happiness" model that explains everything. Third (but this is a weak & contextualizing point), it is related to the common perception that egoistic actions are usually bad.

I have replaced the second sentence with "However, non-altruistically motivated actions can still be good (e.g. people pursuing non-rival goods), and altruistically motivated actions can still be bad (e.g. people being mistaken about what is good).", and added that altruism is a motivation rather than a set of actions, but this is rather preliminary. I would be equally fine with the second sentence being deleted altogether.

Eli's shortform feed

I think you are thinking of “AI Alignment: Why It’s Hard, and Where to Start”:

The next problem is unforeseen instantiation: you can’t think fast enough to search the whole space of possibilities. At an early singularity summit, Jürgen Schmidhuber, who did some of the pioneering work on self-modifying agents that preserve their own utility functions with his Gödel machine, also solved the friendly AI problem. Yes, he came up with the one true utility function that is all you need to program into AGIs!

(For God’s sake, don’t try doing this yourselves. Everyone does it. They all come up with different utility functions. It’s always horrible.)

His one true utility function was “increasing the compression of environmental data.” Because science increases the compression of environmental data: if you understand science better, you can better compress what you see in the environment. Art, according to him, also involves compressing the environment better. I went up in Q&A and said, “Yes, science does let you compress the environment better, but you know what really maxes out your utility function? Building something that encrypts streams of 1s and 0s using a cryptographic key, and then reveals the cryptographic key to you.”

He put up a utility function; that was the maximum. All of a sudden, the cryptographic key is revealed and what you thought was a long stream of random-looking 1s and 0s has been compressed down to a single stream of 1s.

There's also a mention of that method in this post.
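The exploit in the quoted passage can be sketched in a few lines. This is only an illustration under stated assumptions, not anything from the talk: `zlib` stands in for "compression of environmental data", the keystream is a made-up SHA-256 counter construction, and the key and stream length are arbitrary.

```python
import hashlib
import zlib


def keystream(key: bytes, length: int) -> bytes:
    """Hypothetical keystream: SHA-256 over key + counter, truncated to `length` bytes."""
    n_blocks = -(-length // 32)  # ceiling division; SHA-256 digests are 32 bytes
    blocks = [hashlib.sha256(key + i.to_bytes(8, "big")).digest() for i in range(n_blocks)]
    return b"".join(blocks)[:length]


def xor(data: bytes, stream: bytes) -> bytes:
    """XOR two equal-length byte strings; applying it twice with the same stream undoes it."""
    return bytes(a ^ b for a, b in zip(data, stream))


plaintext = b"\xff" * 4096   # a stream consisting only of 1 bits
key = b"hypothetical-key"    # arbitrary key, chosen for illustration

# Before the key is revealed, the observer sees only random-looking data.
ciphertext = xor(plaintext, keystream(key, len(plaintext)))
size_before = len(zlib.compress(ciphertext))

# After the key is revealed, the same observations collapse to a run of 1s.
decrypted = xor(ciphertext, keystream(key, len(ciphertext)))
size_after = len(zlib.compress(decrypted))

print(size_before, size_after)
```

The jump from `size_before` to `size_after` is the "compression progress" an agent could manufacture for itself without learning anything about the world.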

1960: The Year The Singularity Was Cancelled

I believe this is an important gears-level addition to posts like hyperbolic growth, long-term growth as a sequence of exponential modes, and an old Yudkowsky post I am unable to find at the moment.

I don't know how closely these texts are connected, but Modeling the Human Trajectory picks up one year later, creating two technical models: one stochastically fitting and extrapolating GDP growth; the other providing a deterministic outlook, considering labor, capital, human capital, technology and production (and, in one case, natural resources). Roodman arrives at somewhat similar conclusions, too: the industrial revolution was a very big deal, and something happened around 1960 that has slowed the previous strong growth (as far as I remember, he doesn't provide an explicit reason for this).

A point in this post that I found especially interesting was the speculation about the Black Plague being the spark that ignited the industrial revolution. The reason given is a good example of slack catapulting a system out of a local maximum, in this case a Malthusian Europe into the industrial revolution.

Interestingly, neither this text nor Roodman considers individual intelligence an important factor in global productivity. Despite the well-known Flynn effect, which has mostly continued since 1960 (caveat caveat), no extraordinary change in global productivity has occurred. This makes some sense: a rise of less than 1 standard deviation might be appreciable, but not groundbreaking. But the relation to artificial intelligence makes it interesting: the purported (economic) advantage of AI systems is that they can copy themselves, so that population growth is no longer the most constraining variable in this growth model. I don't believe this is particularly anticipation-constraining, though: this could mean that either the post-singularity ("singularity") world is multipolar, or the singleton controlling everything has created many sub-agents.

I appreciate this post. I have referenced it a couple of times in conversations. Together with the investigation by OpenPhil it makes a solid case that the gods of straight lines have decided to throw us into the most important century of history. May the goddess of everything else be merciful with us.

Open & Welcome Thread - January 2021

Hello everybody!

I have done some commenting & posting around here, but I think a proper introduction is never bad.

I was a Marxist for a few years, then I fell out of it, discovered SSC and thereby LW three years ago, and started reading the Sequences and the Codex (yes, you now name them together). I very much enjoy the discussions around here, and the fact that LW got resurrected.

I sometimes write things for my personal website about forecasting, obscure programming languages and [REDACTED]. I think I might start cross-posting a bit more (the two last posts on my profile are such cross-posts).

I endorse spending my time reading, meditating, and [REDACTED], but my motivational system often decides to waste time on the internet instead.

Daniel Kokotajlo's Shortform

Maybe we give the LessWrong team a magic Karma Wand, and they take all the karma that the anonymous reviews got and bestow it (plus or minus some random noise) to the actual authors.

Wouldn't this achieve the opposite of what we want and disincentivize reviews? Unless coupled with paying people to write reviews, this would remove the remaining incentive.

I'd prefer going in the opposite direction, making reviews more visible (giving them a more prominent spot on the front page/on allPosts, so that more people vote on them/interact with them). At the moment, they still feel a bit disconnected from the rest of the site.

Eli's shortform feed

Yes, it definitely does: you just created the resource I will link people to. Thank you!

The third paragraph is especially cruxy. As far as I can tell, there are many people who have (to some extent) defused this propensity to get triggered for themselves. At least for me, LW was a resource to achieve that.

What failure looks like

I read this post only half a year ago after seeing it being referenced in several different places, mostly as a newer, better alternative to the existing FOOM-type failure scenarios. I also didn't follow the comments on this post when it came out.

This post makes a lot of sense in Christiano's worldview, where we have a relatively continuous, somewhat multipolar takeoff which to a large extent inherits the problems of our current world. This especially applies to part I: we already have many different instances of scenarios where humans follow measured incentives and produce unintended outcomes. Goodhart's law is a thing. Part I ties in especially well with Wei Dai's concern that

AI-powered memetic warfare makes all humans effectively insane.

While I haven't done research on this, I have a medium-strength intuition that this is already happening. Many people I know are at least somewhat addicted to the internet, having lost a lot of attention due to having their motivational system hijacked, which is worrying because Attention is your scarcest resource. I believe investigating the extent to which attention has deteriorated (or has been monopolized by different actors) would be valuable, as well as thinking about which incentives will arise when AI technologies become more powerful (Daniel Kokotajlo has been writing especially interesting essays on this kind of problem).

As for part II, I'm a bit more skeptical. I would summarize "going out with a bang" as a "collective treacherous turn", which would demand somewhat high levels of coordination between agents of different levels of intelligence (agents would be incentivized to turn early because of first-mover advantages, but this would increase the probability of humans doing something about it), as well as agents knowing very early that they want to perform a treacherous turn toward influence-seeking behavior. I'd like to think about how the frequency of premature treacherous turns relates to the intelligence of agents. Would that be continuous or discontinuous? Unrelated to Christiano's post, this seems like an important consideration (maybe work has gone into this and I just haven't seen it yet).

Still, part II holds up pretty well, especially since we can expect AI systems to cooperate effectively via merging utility functions, and we can see systems in the real world that fail regularly, but not much is being done about them (especially social structures that sort-of work).

I have referenced this post numerous times, mostly in connection with a short explanation of how I think current attention-grabbing systems are a variant of what is described in part I. I think it's pretty good, and someone (not me) should flesh the idea out a bit more, perhaps connecting it to existing systems (I remember the story about the recommender system manipulating its users into political extremism to increase viewing time, but I can't find a link right now).

The one thing I would like to see improved is at least some links to prior existing work. Christiano writes that

(None of the concerns in this post are novel.)

but it isn't clear whether he is just summarizing things he has thought about, which are implicit knowledge in his social web, or whether he is summarizing existing texts. I think part I would have benefitted from a link to Goodhart's law (or an explanation of why it is something different).

Eli's shortform feed

(This question is only related to a small point)

You write that one possible foundational strategy could be to "radically detraumatize large fractions of the population". Do you believe that

  1. A large part of the population is traumatized
  2. That trauma is reversible
  3. Removing/reversing that trauma would improve the development of humanity drastically?

If yes, why? I'm happy to get a 1k page PDF thrown at me.

I know that this has been a relatively popular talking point on Twitter, but without a canonical resource, and I also haven't seen it discussed on LW.
