steven0461's Shortform Feed

Suppose you have an AI powered world stabilization regime. Suppose somebody makes a reasonable moral argument about how humanity's reflection should proceed, like "it's unfair for me to have less influence just because I hate posting on Facebook". Does the world stabilization regime now add a Facebook compensation factor to the set of restrictions it enforces? If it does things like this all the time, doesn't the long reflection just amount to a stage performance of CEV with human actors? If it doesn't do things like this all the time, doesn't that create a serious risk of the long term future being stolen by some undesirable dynamic?

Petrov Day Retrospective: 2021

If Petrov pressing the button would have led to a decent chance of him being incinerated by American nukes, and if he valued his life much more than he valued avoiding the consequences he could expect to face for not pressing, then he had no reason to press the button even from a purely selfish perspective. Pressing it would have been a purely destructive act, like in past LW Petrov Days, or maybe a kind of Russian roulette.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

Well, I don't think it's obviously objectionable, and I'd have trouble putting my finger on the exact criterion for objectionability we should be using here. Something like "we'd all be better off in the presence of a norm against encouraging people to think in ways that might be valid in the particular case where we're talking to them, but whose appeal comes from emotional predispositions that we sought out in them that aren't generally either truth-tracking or good for them" seems plausible to me. But I think it's not as obviously unobjectionable as Zack seemed to be suggesting in his last few sentences, which is what moved me to comment.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

If short timelines advocates were seeking out people with personalities that predisposed them toward apocalyptic terror, would you find it similarly unobjectionable? My guess is no. It seems to me that a neutral observer who didn't care about any of the object-level arguments would say that seeking out high-psychoticism people is more analogous to seeking out high-apocalypticism people than it is to seeking out programmers, transhumanists, reductionists, or people who think machine learning / deep learning are important.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

There's a sliding scale ranging from seeking out people who are better at understanding arguments in general to seeking out people who are biased toward agreeing with a specific set of arguments (and perhaps made better at understanding those arguments by that bias). Targeting math contest winners seems more toward the former end of the scale than targeting high-psychoticism people. This is something that seems to me to be true independently of the correctness of the underlying arguments. You don't have to already agree about the robot apocalypse to be able to see why math contest winners would be better able to understand arguments for or against the robot apocalypse.

If Yudkowsky and friends were deliberately targeting arguments for short AI timelines at people who already had a sense of a foreshortened future, then that would be more toward the latter end of the scale, and I think you'd object to that targeting strategy even though they'd be able to make an argument structurally the same as your comment.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

It sounds like they meant they used to work at CFAR, not that they currently do.

The interpretation of "I'm a CFAR employee commenting anonymously to avoid retribution" as "I'm not a CFAR employee, but used to be one" seems to me sufficiently strained and non-obvious that, given the commenter's choice not to use clearer language, we should treat them as having deliberately intended for readers to believe that they're a current CFAR employee.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

Or maybe you should move out of the Bay Area, a.s.a.p. (Like, half seriously, I wonder how much of this epistemic swamp is geographically determined. Not having the everyday experience, I don't know.)

I wonder what the rationalist community would be like if, instead of having been forced to shape itself around risks of future superintelligent AI in the Bay Area, it had been artificial computing superhardware in Taiwan, or artificial superfracking in North Dakota, or artificial shipping supercontainers in Singapore, or something. (Hypothetically, let's say the risks and opportunities of these technologies were equally great and equally technically and philosophically complex as those of AI in our universe.)

How to think about and deal with OpenAI

Hmm, I was imagining that in Anna's view, it's not just about what concrete social media or other venues exist, but about some social dynamic that makes even the informal benevolent conspiracy part impossible or undesirable.

How to think about and deal with OpenAI

a benevolent conspiracy that figured out which conversations could/couldn’t nudge AI politics in useful ways

functional private fora with memory (in the way that a LW comment thread has memory) that span across organizations

What's standing in the way of these being created?

What role should LW play in AI Safety?

By being the community out of which MIRI arose

I would say the LW community arose out of MIRI.
