
habryka

Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com. 

(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

56 · Habryka's Shortform Feed · Ω · 6y · 436 comments

Comments (sorted by newest)

Foom & Doom 1: “Brain in a box in a basement”
habryka · 1d

Promoted to curated: I think this post is good, as is the next post in the sequence. It made me re-evaluate some of the strategic landscape, and is also otherwise just very clear and structured in how it approaches things.

Thanks a lot for writing it!

A case for courage, when speaking of AI danger
habryka · 1d

I've often appreciated your contributions here, but given the stakes of existential risk, I do think that if my beliefs about risk from AI are even remotely correct, then it's hard to escape the conclusion that the people presently working at labs are committing the greatest atrocity that anyone in human history has or will ever commit. 

The logic of this does not seem that complicated, and while I disagree with Geoffrey Miller on how he goes about doing things, I have even less sympathy for someone who, in response to a bunch of people thinking extremely seriously and carefully about whether what that person is doing might be extremely bad, reacts with "if people making such comparisons decide to ostracize me then I consider it a nice bonus". You don't have to agree, but man, I feel like you clearly have the logical pieces to understand why one could believe you are causing extremely great harm, without that implying the insanity of the person believing it.

I respect at least some of the people working at capability labs. One thing that unites all of the ones I do respect is that they treat their role at those labs with the understanding that they are in a position of momentous responsibility, and that their mistakes could indeed cause historically unprecedented levels of harm. I wish you did the same here.

ryan_greenblatt's Shortform
habryka · 2d

Kids' safety seems like a pretty bad thing to focus on, in the sense that the vast majority of kids' safety activism causes very large amounts of harm (and it helping in this case really seems like a "stopped clock is right twice a day" situation).

The rest seem pretty promising. 

Car Thoughts
habryka · 3d

Pretty sure this user was spam. I banned + deleted their account.

Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild
habryka · 3d

Hmm, I don't want to derail the comments on this post with a bunch of culture war things, but these two sentences in combination seemed to me to partially contradict each other: 

> When present, the bias is always against white and male candidates across all tested models and scenarios.

> [...]

> The problem (race and gender bias) is one that labs have spent a substantial amount of effort to address, which mimics realistic misalignment settings.

I agree that the labs have spent a substantial amount of effort to address this issue, but the current behavior seems in line with the aims of the labs? Most of the pressure comes from left-leaning academics or reporters, who I think are largely in favor of affirmative action. The world where the AI systems end up with a margin of safety to be biased against white male candidates, in order to reduce the likelihood that they ever look like they discriminate in the other direction (which would actually be at substantial risk of blowing up), while not talking explicitly about the reasoning itself since that would of course prove highly controversial, seems basically the ideal result from a company PR perspective.

I don't currently think that is what's going on, but I do think that, because of these dynamics, the cited benefit of this scenario for studying the faithfulness of CoT reasoning does not currently seem real to me. My guess is that companies do not have a strong incentive to change this current behavior, and indeed I can't immediately think of a behavior in this domain that the companies would prefer from a selfish perspective.

Morphism's Shortform
habryka · 3d

The classical infohazard is "here is a way to build a nuke using nothing but the parts of a microwave". I think you are thinking of a much narrower class of infohazards than that word is intended to refer to.

TurnTrout's shortform feed
habryka · 3d

> That said, my feeling is Trump et al. weren't reacting against any specific woke activism, but very woke policies (and opinions) which resulted from the activism.

I don't think this is true; indeed, the counter-reaction is strongly to the woke activism. My sense is that a lot of current US politics is very identity-focused, and the policies on both sides matter surprisingly little (instead, a lot of what is going on is something more like personal persecution of the outgroup and trying to find ways to hurt them and to prop up your own status, which actually ends up with surprisingly similar policies on both ends).

[Meta] New moderation tools and moderation guidelines
habryka · 3d

Not going to go into this, since I think it's actually a pretty complicated situation, but at a very high level some obvious groups that could override me: 

  • The Lightcone Infrastructure board (me, Vaniver, Daniel Kokotajlo)
  • If Eliezer really wanted to, he could probably override me
  • A more distributed consensus among what one might consider the leadership of the rationality community (like, let's say Scott Alexander and Ryan Greenblatt and Buck and Nate and John Wentworth and Gwern all roughly agree on me messing up really badly)

There would be lots more to say on this topic, but as I said, I am unlikely to pick this thread up again, so I hope that's good enough!

[Meta] New moderation tools and moderation guidelines
habryka · 3d

> Yes, well… the problem is that this is the central issue in this whole dispute (such as it is). The whole point is that your preferred policies (the ones to which I object) directly and severely damage LW’s ability to be “a free marketplace of ideas, a place where contradicting ideas can be discussed and debated”, and instead constitute you effectively making a list of allowed or forbidden opinions on this forum.

I don't see where I am making any such list, unless you mean "list" in a weird way that doesn't involve any actual lists, or even things that are kind of like lists. 

> in any meaningful sense, undertake to unilaterally decide anything w.r.t. correctness of views and positions.

I don't think that's an accurate description of DSL; indeed, it appears to me that the de-facto list produced by the kind of policy you have chosen is pretty predictable (and IMO does not result in particularly good outcomes). Just because you have some other people make the choices doesn't change the predictability of the actual outcome, or who is responsible for it.

I already made the obvious point that of course, in some sense, I/we will define what is OK on LessWrong via some procedure. You can dislike the way I/we do it.

There is definitely no "fundamentally at odds"; there is a difference of opinion about what works here, which you and I have already spent hundreds of hours trying to resolve, and which we seem unlikely to resolve right now. Just making more comments stating that "I am wrong" in big words will not make that happen faster (or more likely to happen at all).

TurnTrout's shortform feed
habryka · 3d

It appears to me that the present Republican administration is largely a counter-reaction to various social justice and left-leaning activism. IMO a very costly one.

Wikitag Contributions

AI Psychology · 6mo · (+58/-28)
Orthogonality Thesis · 1y · (+3588/-1331)
Posts

19 · Open Thread - Summer 2025 · 12d · 15 comments
91 · ASI existential risk: Reconsidering Alignment as a Goal · 3mo · 14 comments
346 · LessWrong has been acquired by EA · 3mo · 53 comments
77 · 2025 Prediction Thread · 6mo · 21 comments
23 · Open Thread Winter 2024/2025 · 6mo · 60 comments
45 · The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228) · 6mo · 4 comments
36 · Announcing the Q1 2025 Long-Term Future Fund grant round · 7mo · 2 comments
112 · Sorry for the downtime, looks like we got DDosd · 7mo · 13 comments
610 · (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 7mo · 270 comments
532 · OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 8mo · 81 comments