aysja

I wrote the first draft of this essay around a year ago, in between the bouts of delirium that long covid was beginning to deliver me. And I couldn’t quite tell back then how real it was, and as long covid consumed more of my mind it drifted further away. It began to feel impossible that I had ever had, or could ever have, courage. Because courage requires capacity and I was losing all of mine. And the doubts grew larger, and the clarity dimmed, and I forgot about Frodo for awhile, forgot about most everything, as I was left for many months staring directly into the bowels of deep atheism, wondering... (read more)

Solemn Courage

aysja

13d

Every so often it slips. It seems I am writing a book, but I can’t remember why. Somehow, the sentences are supposed to perform that impossible, intimate task: to translate my inner world into another. Yet they sit there so quiescent and small. How could an arrangement of words do anything, let alone reduce that ultimate threat to which it is all supposedly connected: the looming god machines? I look again at the monitor in which the words are contained and suddenly what once felt so raw and powerful deflates into limpness. Why would anyone listen to me, anyway? Have I said anything new? Or is too weird—the strangeness in my head failing to... (read 1551 more words →)

125

aysja1mo

I agree that human values are more accretive like this, but I would also call those genes “terminal” in the same sense that I call some of my own goals “terminal.” E.g., I can usually ask myself why I’m taking a given action and my brain will give a reasonable answer: “because I want to finish this post,” “because I’m hungry,” whatever. And then I can keep double clicking on those: “I want to finish the post because I don’t think this crux has been spelled out very well yet” and I can keep going and going until at some point the answer is like “I don’t know, because it’s intrinsically beautiful?”... (read more)

aysja1mo

I'm not sure if I expect motivated reasoning to come out better on average, even in domains where you might naively expect it to. In part that's because self-serving strategies often involve doing things other people don't like, e.g. being deceptive, manipulative, or generally unethical, in a way that can cause long-term harm to your reputation and so long term harm to your ability to win. And I think there is significant optimization pressure on catching this kind of thing, in part for reasons similar to the ones outlined in Elephant in the Brain, i.e., that we evolved in an environment where winning that cat and mouse game was a big part... (read more)

Replying toRuby's Inkhaven Retrospective

aysja3mo

Ruby's Inkhaven Retrospective

Fwiw, my experience has been more varied. My most well received comments (100+ karma) are a mix of spending days getting a hard point right and spending minutes extemporaneously gesturing at stuff without much editing. But overall I think the trend points towards "more effort = more engagement and better received." I have mostly attributed this to the standards and readership LessWrong has cultivated, which is why I feel excited to post here. It seems like one of the rare places on the internet where long, complex essays about the most fascinating and important topics are incentivized. My reddit posts are not nearly as well received, for instance. I haven't posted as... (read more)

Replying toUnless its governance changes, Anthropic is untrustworthy

aysja3mo

Unless its governance changes, Anthropic is untrustworthy

I haven't followed every comment you've left on these sorts of discussions, but they often don't include information or arguments I can evaluate. Which MIRI employees, and what did they actually say? Why do you think that working at Anthropic even in non-safety roles is a great way to contribute to AI safety? I understand there are limits to what you can share, but without that information these comments don't amount to much more than you asking us to defer to your judgement. Which is a fine thing to do, I just wish it were more clearly stated as such.

Replying toYou Are Much More Salient To Yourself Than To Everyone Else

aysja3mo

You Are Much More Salient To Yourself Than To Everyone Else

This was a really important update for me. I remember being afraid of lots of things before I started publishing more publicly on the internet: how my intelligence would be perceived, if I'd make some obviously stupid in retrospect point and my reputation ruined forever, etc. Then at some point in this thought loop I was like wait the most likely thing is just that no one reads this, right? More like a "huh" or a nothing at all rather than vitriolic hatred of my soul or whatever I was fearing. This was very liberating, and still is. I probably ended up over optimizing for invisibility because of the freedom I feel from it—being mostly untethered from myopic social dynamics has been really helpful for my thinking and writing.

Replying toAim for single piece flow

aysja3mo

Aim for single piece flow

I tend to write in large tomes that take months or years to complete, so I suppose I disagree with you too. Not that intellectual progress must consist of this, obviously, but that it can mark an importantly different kind of intellectual progress from the sort downstream of continuous shipping.

In particular, I think shipping constantly often causes people to be too moored to social reception, risks killing butterfly ideas, screens off deeper thought, and forces premature legibility. Like, a lot of the time I feel ready to publish something there is some bramble I pass in my writing, some inkling of “Is that really true? What exactly do I mean there?” These often... (read more)

Replying toRSPs are pauses done right

aysja3mo

RSPs are pauses done right

The first RSP was also pretty explicit about their willingness to unilaterally pause:

Note that ASLs are defined by risk relative to baseline, excluding other advanced AI systems.... Just because other language models pose a catastrophic risk does not mean it is acceptable for ours to.

Which was reversed in the second:

It is possible at some point in the future that another actor in the frontier AI ecosystem will pass, or be on track to imminently pass, a Capability Threshold… such that their actions pose a serious risk for the world. In such a scenario, because the incremental increase in risk attributable to us would be small, we might decide to lower the Required Safeguards.

Replying toA glimpse of the other side

aysja3mo

A glimpse of the other side

Replying toWhy Is Printing So Bad?

aysja4mo

Why Is Printing So Bad?

Relatedly, I often feel like I'm interfacing with a process that responded to every edge case with patching. I imagine this is some of what's happening when the poor printer has to interface with a ton of computing systems, and also why bureaucracies like the DMV seem much more convoluted than necessary. Since each time an edge case comes up the easier thing is to add another checkbox/more red tape/etc, and no one is incentivized enough to do the much harder task of refactoring all of that accretion. The legal system has a bunch of this too, indeed I just had to sign legal documents which were full of commitments to abstain from very weird actions (why on Earth would anyone do that?). But then you realize that yes, someone in fact did that exact thing, and now it has to be forever reflected there.

Non-Disparagement Canaries for OpenAI

aysja

aysja, Adam Scholl

Since at least 2017, OpenAI has asked departing employees to sign offboarding agreements which legally bind them to permanently—that is, for the rest of their lives—refrain from criticizing OpenAI, or from otherwise taking any actions which might damage its finances or reputation.^[1]

If they refused to sign, OpenAI threatened to take back (or make unsellable) all of their already-vested equity—a huge portion of their overall compensation, which often amounted to millions of dollars. Given this immense pressure, it seems likely that most employees signed.

If they did sign, they became personally liable forevermore for any financial or reputational harm they later caused. This liability was unbounded, so had the potential to be financially ruinous—if,... (read 568 more words →)

290

OMMC Announces RIP

Adam Scholl

Adam Scholl, aysja

At the Omnicide Machine Manufacturing Corporation, we work tirelessly to ensure an omnicide-free future. That’s why we’re excited to announce our Responsible Increase Policy (RIP)—our internal protocol for managing any risks that arise as we create increasingly omnicidal machines.

Inspired by the risk-management framework used in gain-of-function virology research, our RIP defines a framework of Omnicidal Ability Levels (OAL), reflecting the precautions we plan to take as we release increasingly dangerous features over time:

The basic idea of the RIP is simple: each time we ship an update which makes our product more lethal, we will pause our efforts for some amount of time, and then revise our policies to be in some sense... (read 598 more words →)

191

Why Are Bacteria So Simple?

aysja

As far as we can tell, bacteria were the first lifeforms on Earth. Which means they’ve had a full four billion years to make something of themselves. And yet, despite their long evolutionary history, they mostly still look like this:

Bacteria belong to one major class of cells—prokaryotes.^[1] The other major class of cells, eukaryotes, arrived about one billion years after bacteria. But despite their late start, they are vastly more complex.

Prokaryotes mostly only contain DNA, and DNA translation machinery. Eukaryotes, on the other hand, contain a huge variety of internal organelles that run all kinds of specialized processes—lysosomes digest, vesicles transport, cytoskeletons offer structural support, etc.

Not only that, but all multicellular life is eukaryotic.^[2] Every... (read 2708 more words →)

172

LESSWRONG
LW

LESSWRONG
LW

Non-Disparagement Canaries for OpenAI

OMMC Announces RIP

Why Are Bacteria So Simple?

Solemn Courage

aysja

Solemn Courage

Non-Disparagement Canaries for OpenAI

OMMC Announces RIP

Why Are Bacteria So Simple?

aysja

Non-Disparagement Canaries for OpenAI

OMMC Announces RIP

Why Are Bacteria So Simple?

Solemn Courage

aysja

Solemn Courage

Non-Disparagement Canaries for OpenAI

OMMC Announces RIP

Why Are Bacteria So Simple?