Nate Showell

Nate Showell16d

On the first question, reaching superintelligence might require designing, testing, at-scale manufacturing, and installation of new types of computing hardware, which would probably take more than two years.

Nate Showell19d

In the pre-LLM era, it seemed more likely (compared to now) that there was an algorithmically simple core of general intelligence, rather than intelligence being a complex aggregation of skills. If you're operating under the assumption that general intelligence has a simple algorithmic structure, decision theory is an obvious place to search for it. So the early focus on decision theory wasn't random.

Nate Showell21d

There are the terms "closed individualism," "open individualism," and "empty individualism" used in this Qualia Computing post.

Replying to"The first two weeks are the hardest": my first digital declutter

Nate Showell22d

"The first two weeks are the hardest": my first digital declutter

My own experience is very different from those described in this post. I find it relaxing instead of stressful to spend time doing nothing, and felt this way even when I was a child and hadn't started meditating regularly. I also don't enjoy using a smartphone, due to the small screen size and reliance on touch inputs, so I don't fill gaps in activities by browsing the Internet on my phone. It's also common for me to have brief interactions with strangers even though I'm young. People frequently ask me for directions when I'm on my way to or from work.

-1

Nate Showell23d

The easiest way to promote justice is to focus on punishing people who behave badly (since that's easier than rewarding people who behave well).

The premises of the toy model don't require this to be true. Whether it's true, and to what extent, can vary between environments.

Nate Showell1mo*Quick Take

The orthogonality question is an engineering question

People usually think about the orthogonality question ("Is the orthogonality thesis true?") as a philosophical question. The usual way of approaching the orthogonality question is by taking a starting point of "assume an AGI exists" and then reasoning about what goals the AGI would have. But one can flip the usual starting point around and ask, for a specific goal, "is it realistically achievable to create a general intelligence that has this goal?" This reframing turns the orthogonality question into an engineering question that has more direct practical relevance than the philosophical version. The engineering version is a question about the types of results an AI... (read more)

-1

Replying toIn My Misanthropy Era

Nate Showell1mo

In My Misanthropy Era

There was likely a midwit-meme effect going on at the philosophy meetup, where, in order to distinguish themselves from the stereotypical sports-bar-goers, the attendees were forming their beliefs in ways that would never occur to a true "normie." You might have a better experience interacting with "common people" in a setting where they aren't self-selected for trying to demonstrate sophistication.

Nate Showell3mo

Just spitballing, but maybe you could incorporate some notion of resource consumption, like in linear logic. You could have a system where the copies have to "feed" on some resource in order to stay active, and data corruption inhibits a copy's ability to "feed."

Replying toA bad review != a bad book

Nate Showell3mo

A bad review != a bad book

I don't remember, it was something I saw in the New York Times Book Review section a few years ago.

Nate Showell3mo

The spiralism attractor is the same type of failure mode as GPT-2 getting stuck repeating a single character or ChatGPT's image generator turning photos into caricatures of black people. The only difference between the spiralism attractor and other mode collapse attractors is that some people experiencing mania happen to find it compelling. That is to say, the spiralism attractor is centrally a capabilities failure and only incidentally an alignment failure.

-4

Metroid Prime would work well as a difficult video-game-based test for AI generality.

It has a mixture of puzzles, exploration, and action.
It takes place in a 3D environment.
It frequently involves backtracking across large portions of the map, so it requires planning ahead.
There are various pieces of text you come across during the game. Some of them are descriptions of enemies' weaknesses or clues on how to solve puzzles, but most of them are flavor text with no mechanical significance.
The player occasionally unlocks new abilities they have to learn how to use.
It requires the player to manage resources (health, missiles, power bombs)
It's on the difficult side for human players, but not to an extreme level.

There are no current AI systems that are anywhere close to being able to autonomously complete Metroid Prime. Such a system would probably have to be at or near the point where it could automate large portions of human labor.

Bug report: when I'm writing an in-line comment on a quoted block of a post, and then select text within my comment to add formatting, the formatting menu is displayed underneath the box where I'm writing the comment. For example, this prevents me from inserting links into in-line comments.

How are you preparing for the possibility of an AI bust?

Nate Showell

Nate Showell, Nathan Helm-Burger

What actions have you been taking to prepare for the possibility that the AI industry will experience a market crash, something along the lines of the dotcom bust of the early 2000s? Also, what actions would you take if a crash like that occurred? For example:

If your career is strongly AI-focused, what backup plan do you have in place?
If you're working on AI alignment research, how would your research agenda change? Would you switch to a different cause area altogether?
Likewise, if you're working on AI policy, how would your approach to it change?

Let's set aside speculation about whether a crash will happen or not; I'd like to hear what concrete actions people are taking.

An edgy writing style is an epistemic red flag. A writing style designed to provoke a strong, usually negative, emotional response from the reader can be used to disguise the thinness of the substance behind the author's arguments. Instead of carefully considering and evaluating the author's arguments, the reader gets distracted by the disruption to their emotional state and reacts to the text in a way that more closely resembles a trauma response, with all the negative effects on their reasoning capabilities that such a response entails. Some examples of authors who do this: Friedrich Nietzsche, Grant Morrison, and The Last Psychiatrist.

Is trade ever fully causal? Ordinary trade can be modeled as acausal trade with the "no communication" condition relaxed. Even in a scenario as seemingly causal as using a vending machine, trade only occurs if the buyer believes that the vending machine will actually dispense its goods and not just take the buyer's money. Similarly, the vending machine owner's decision to set up the machine was informed by predictions about whether or not people would buy from it. The only kind of trade that seems like it might be fully causal is a self-executing contract that's tied to an external trigger, and for which both parties have seen the source code and verified that the other party have enough resources to make the agreed-upon trade. Would a contract like that still have some acausal element anyway?

I've come to believe (~65%) that Twitter is anti-informative: that it makes its users' predictive calibration worse on average. On Manifold, I frequently adopt a strategy of betting against Twitter hype (e.g., on the LK-99 market), and this strategy has been profitable for me.

What do other people here think of quantum Bayesianism as an interpretation of quantum mechanics? I've only just started reading about it, but it seems promising to me. It lets you treat probabilities in quantum mechanics and probabilities in Bayesian statistics as having the same ontological status: both are properties of beliefs, whereas in some other interpretations of quantum mechanics, probabilities are properties of an external system. This match allows quantum mechanics and Bayesian statistics to be unified into one overarching approach, without requiring you to postulate additional entities like unobserved Everett branches.

I find myself betting "no" on Manifold a lot more than I bet "yes," and it's tended to be a profitable strategy. It's common for questions on Manifold to have the form "Will [sensational event] happen by [date]." These markets have a systematic tendency to be too high. I'm not sure how much of this bias is due to Manifold users overestimating the probabilities of sensational, low-probability events, and how much of it is an artifact of markets being initialized at 50%.

Simulacrum level 4 is more honest than level 3. Someone who speaks at level 4 explicitly asks himself "what statement will win me social approval?" Someone who speaks at level 3 asks herself the same question, but hides from herself the fact that she asked it.

Nate Showell's Shortform

Nate Showell

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Degamification

Nate Showell

Goodhart's Law refers to the tendency that when someone sets a performance metric for a goal, the metric itself becomes a target of optimization, often at the expense of the goal it's supposed to measure. Some metrics are subject to imperfectly-aligned incentives in ways that are easy to identify, such as when students optimize for getting high grades rather than understanding the course material. But in other scenarios, metrics fail in less obvious ways. For example, someone might limit himself to one drink per night, but still end up drinking too much because he drinks every night and overestimates how much alcohol counts as "one drink." There's no custom-made giant wineglass staring... (read 513 more words →)

Reinforcement Learner Wireheading

Nate Showell

Edit: I no longer agree with most of this post, due to the arguments given in Reward is not the optimization target.

Epistemic status: still early in the process of exploring this.

Goal-content integrity is frequently listed as one of the convergent instrumental goals that any AGI is likely to have. In the usual description of goal-content integrity as a convergent instrumental goal, an AI that implements some form of reinforcement learning determines that having its goal function modified would prevent it from achieving its current goals. The AI therefore acts to prevent its goal function from being modified. An AI would be likely to try to maintain its current goal function regardless of... (read 871 more words →)

LESSWRONG
LW

LESSWRONG
LW

How are you preparing for the possibility of an AI bust?

Degamification

Reinforcement Learner Wireheading

Nate Showell's Shortform

Nate Showell

How are you preparing for the possibility of an AI bust?

Nate Showell's Shortform

Degamification

Reinforcement Learner Wireheading

Nate Showell

How are you preparing for the possibility of an AI bust?

Degamification

Reinforcement Learner Wireheading

Nate Showell's Shortform

Nate Showell

How are you preparing for the possibility of an AI bust?

Nate Showell's Shortform

Degamification

Reinforcement Learner Wireheading