For me, the OP brought to mind another kind of "not really math, not really science": string theory. My criticisms of agent foundations research are analogous to Sabine Hossenfelder's criticisms of string theory, in that string theory and agent foundations both screen themselves off from the possibility of experimental testing in their choice of subject matter: the Planck scale and the very early universe for the former, and idealized superintelligent systems for the latter. In both fields, the real-world counterparts of the objects under study (known elementary particles and fundamental forces in one case; humans and existing AI systems in the other) serve mainly as targets to which theoretical models are overfit. Neither makes testable predictions about current or near-future systems. And unlike early computer science, agent foundations doesn't come with an expectation of being able to perform experiments in the future, or even to perform rigorous observational studies.
Building on what you said, pre-LLM agent foundations research appears to have made the following assumptions about what advanced AI systems would be like:

- The world-model and the decision process would be separate components.
- Decision-related concepts like "hypothesis" and "causality" would be built-in primitives rather than learned.
- The decision process would be cleanly separable, so it could be plugged into a different world-model.
- As systems were scaled up, their decision-making would converge toward a simple, human-comprehensible decision theory.
None of these assumptions ended up being true of LLMs. In an LLM, the world-model and decision process are mixed together in a single neural network instead of being separate entities. LLMs don't come with decision-related concepts like "hypothesis" and "causality" pre-loaded; those concepts are learned over the course of training and are represented in the same messy, polysemantic way as any other learned concept. There's no way to separate out the reasoning-related features to get a decision process you could plug into a different world-model. In addition, when LLMs are scaled up, their decision-making becomes more complex and inscrutable due to being distributed across the neural network. The LLM's decision-making process doesn't converge into a simple and human-comprehensible decision theory.
More useful. It would save us the step of having to check for hallucinations when doing research.
Another example of this pattern that's entered mainstream awareness is tilt. When I'm playing chess and get tilted, I might think things like "all my opponents are cheating," "I'm terrible at this game and therefore stupid," or "I know I'm going to win this time, how could I not win against such a low-rated opponent." But if I take a step back, notice that I'm tilted, and ask myself what information I'm getting from the feeling of being tilted, I notice that it's telling me to take a break until I can stop obsessing over the result of the previous game.
Tilt is common, but also easy to fix once you notice the pattern of what it's telling you and start taking breaks when you experience it. The word "tilt" is another instance of a hangriness-type stance that's caught on because of its strong practical benefits--having a word for the state makes it easier to notice when you're in it.
It's working now. I think the problem was on my end.
This strategy suggests that decreasing ML model sycophancy should be a priority for technical researchers. It's probably the biggest current barrier to the usefulness of ML models as personal decision-making assistants. Hallucinations are probably the second-biggest barrier.
The new feed doesn't load at all for me.
There's another way in which pessimism can serve as a coping mechanism: it can be an excuse to avoid addressing personal-scale problems. A belief that one is doomed to fail, or that the world is inexorably getting worse, can justify giving up, on the grounds that comparatively small-scale problems will be swamped by uncontrollable societal forces. Compared to confronting those personal-scale problems, giving up can seem very appealing, and a large-scale but abstract problem makes for a convenient pretext for surrender. You probably know someone who spends substantial amounts of their free time watching videos, reading articles, and listening to podcasts that blame all of the world's problems on "capitalism," "systemic racism," "civilizational decline," or something similar, all while their bills are overdue and dishes pile up in their sink.
This use of pessimism as a coping mechanism is especially pronounced in the case of apocalypticism. If the world is about to end, every other problem becomes much less relevant in comparison, including all those small-scale problems that are actionable but unpleasant to work on. Apocalypticism can become a blanket pretext for giving in to your ugh fields. And while you're giving in to them, you end up thinking you're doing a great job of utilizing the skill of staring into the abyss (you're confronting the possibility of the end of the world, right?) when you're actually doing the exact opposite. Rather than something related to preverbal trauma, this usefulness as a coping mechanism is the more likely source of the psychological appeal of AI apocalypticism for many people who encounter it.
Another experiment idea: testing whether the reduction in hallucinations that Yao et al. achieved with unlearning can be made robust.
I was describing reasoning about idealized superintelligent systems as the method used in agent foundations research, rather than its goal. In the same way that string theory is trying to figure out "what is up with elementary particles at all," and tries to answer that question by doing not-really-math about extreme energy levels, agent foundations is trying to figure out "what is up with agency at all" by doing not-really-math about extreme intelligence levels.
If you've made enough progress in your research that it can make testable predictions about current or near-future systems, I'd like to see them. But the persistent failure of agent foundations research to come up with any such bridge between idealized models and real-world systems has made me doubtful that the former are relevant to the latter.