LESSWRONG

roha

Posts

roha's Shortform · 5mo

Comments

The Eldritch in the 21st century
roha · 2mo · 32

"The Algorithm" is in the hands of very few actors. This is the prime gear where "Evil People have figured it out, and hold The Power" isn't a fantasy. There would be many obvious improvements if it were in adult hands.

An epistemic advantage of working as a moderate
roha · 2mo · 20

I think there might be a confusion between optimizing for an instrumental goal and for a higher-level one. Is maintaining good epistemics more important than working on the right topic? To me, the rigor of an inquiry seems secondary to choosing the right subject.

Consider chilling out in 2028
roha · 4mo · 10

I also had to look it up and got interested in testing whether or how it could apply.

Here's an explanation of Bulverism that suggests a concrete logical form of the fallacy:

  1. Person 1 makes argument X.
  2. Person 2 assumes person 1 must be wrong because of their Y (e.g. suspected motives, social identity, or other characteristic associated with their identity).
  3. Therefore, argument X is flawed or not true.

Here's a possible assignment for X and Y that tries to remain rather general:

  • X = Doom is plausible because ...
  • Y = Trauma / Fear / Fixation

Why would that be a fallacy? Whether an argument is true or false depends on the structure and content of the argument, not on its source (genetic fallacy), and not on a property of the source that gets equated with being wrong (circular reasoning). Whether an argument for doom is true does not depend on who is arguing for it, and being traumatized does not automatically imply being wrong.
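
To make the invalidity explicit, here is a rough formalization of the schema above (my own sketch in first-order notation, not something taken from the post or from Lewis):

```latex
% Illustrative formalization of the Bulverism schema (a sketch, not from the source).
% A(p, X): person p asserts argument X.
% Y(p):    person p has property Y (e.g. trauma, suspected motive, social identity).
\[
\frac{A(p,\,X) \qquad Y(p)}{\lnot X}
\]
% The rule is invalid: the truth value of X is fixed by X's content and structure,
% not by who asserts it (genetic fallacy). Reading Y(p) as if it already established
% that p is wrong is the circular step mentioned above.
```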


Here's another possible assignment for X and Y that tries to be more concrete. To make this possible, "Person 1" is replaced by more than one person, now called "Group 1":

  • X (from AI 2027) = A takeover by an unaligned superintelligence by 2030 is plausible because ...
  • Y (from the post) = "lots of very smart people have preverbal trauma" and "embed that pain such that it colors what reality even looks like at a fundamental level", so "there's something like a traumatized infant inside such people" and "its only way of "crying" is to paint the subjective experience of world in the horror it experiences, and to use the built-up mental edifice it has access to in order to try to convey to others what its horror is like".

From looking at this, I think the post suggests a slightly stronger logical form that extends step 3:

  1. Group 1 makes argument X.
  2. Person 2 assumes group 1 must be wrong because of their Y (e.g. suspected motives, social identity, or other characteristic associated with their identity).
  3. Therefore, argument X is flawed or not true AND group 1 can't evaluate its truth value because of their Y.

From this, I think one can see that not only does Bulverism make the model a bit suspicious, but two additional aspects also come into play:

  • If Group 1 is the LessWrong community, then there are also people outside of it who predict that there's an existential risk from AI and that timelines might be short. How can argument X from these people become wrong by Group 1 entering the stage, and would it still be true if Group 1 were doing something else?
  • I think it's fair to say that step 3 introduces an aspect that's adjacent to gaslighting, i.e. manipulating someone into questioning their perception of reality. Even if it's done in a well-meaning way, since some people's perception of reality is indeed flawed and they might benefit from becoming aware of it, the way it is woven into the argument doesn't seem that benign anymore. I suppose that might be the source of some people getting annoyed by the post.
Consider chilling out in 2028
roha · 4mo · 80

"it's psychologically appealing to have a hypothesis that means you don't have to do any mundane work"

I don't doubt that something like inverse bike-shedding can be a driving force for some individuals to focus on the field of AI safety. I highly doubt that it explains why the field and its associated risk predictions exist in the first place, or that their validity should be questioned on such grounds, but this seems to happen in the article if I'm not entirely misreading it. From my point of view, there is already an overemphasis on psychological factors in the broader debate, and it would be desirable to get back to the object level, be it with theoretical or empirical research, both of which have their value. This latter aspect seems to lead to a partial agreement here, even though there's more than one path to arrive at it.

Consider chilling out in 2028
roha · 4mo · 30

The point addressed with an unnecessarily polemic tone:

  • "Suppose that what's going on is, lots of very smart people have preverbal trauma."
  • "consider the possibility that the person in question might not be perceiving the real problem objectively because their inner little one might be using it as a microphone and optimizing what's "said" for effect, not for truth."

It is alright to consider it. I find it implausible that a wide range of accomplished researchers lay out arguments, collect data, interpret what has and hasn't been observed, and come to the conclusion that our current trajectory of AI development poses a significant amount of existential risk, potentially on short timelines, because a majority of them have a childhood trauma that blurs their epistemology on this particular issue but not on others where success criteria could already be observed.

Consider chilling out in 2028
roha · 4mo · 31

I'm close to getting a postverbal trauma from having to observe all the mental gymnastics around the question of whether building a superintelligence without having reliable methods to shape its behavior is actually dangerous. Yes, it is. No, that fact does not depend on whether Hinton, Bengio, Russell, Omohundro, Bostrom, Yudkowsky, et al. were held as babies.

EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
roha · 1y · 62

Further context about the "recent advancements in the AI sector have resolved this issue" paragraph:

  • Contained in a16z letter to UK parliament: https://committees.parliament.uk/writtenevidence/127070/pdf/
  • Contained in a16z letter to Biden, signed by Andreessen, Horowitz, LeCun, Carmack et al.: https://x.com/a16z/status/1720524920596128012
  • Carmack claiming not to have proofread it, both Carmack and Casado admitting the claim is false: https://x.com/GarrisonLovely/status/1799139346651775361
Ilya Sutskever and Jan Leike resign from OpenAI [updated]
[+] roha · 1y* · -6-5
Ilya Sutskever and Jan Leike resign from OpenAI [updated]
roha · 1y · 10

I assume they can't make a statement and that their choice of next occupation will be the clearest signal they can and will send out to the public.

Ilya Sutskever and Jan Leike resign from OpenAI [updated]
roha · 1y · 169

He has a stance towards risk that is a necessary condition for becoming the CEO of a company like OpenAI, but that doesn't give you a high probability of building a safe ASI:

  • https://blog.samaltman.com/what-i-wish-someone-had-told-me
    • "Inaction is a particularly insidious type of risk."
  • https://blog.samaltman.com/how-to-be-successful
    • "Most people overestimate risk and underestimate reward."
  • https://blog.samaltman.com/upside-risk
    • "Instead of downside risk [2], more investors should think about upside risk—not getting to invest in the company that will provide the return everyone is looking for."