Just to clarify here, I have no issue with you thinking the post is bad. That seems beside the point to me. My issue is with you doing much of what you accuse Miller of doing.
Insults: "This post seems completely insane to me, as do people who unquestionable retweet it."
Aggression: "I cannot believe I have to argue for this... [cursing]..."
Sneering: "Has anyone who liked this actually read this post? How on earth is this convincing to anyone?"
Note also that the discussion around the Bentham post was previously calm and friendly. You walked in and dramatically worsened the discourse quality. By contrast, Geoffrey engages on hot-button political topics where discussion is already very heated.
As a relatively objective measure: with a quick search of all 80K Geoffrey Miller tweets, I was only able to find one non-quoted f-bomb ("Fuck the Singularity.").
Your tweets appear to have a somewhat larger number of them, and they're often directed at individuals rather than abstract concepts. "Fuck them", "fuck you", "fuck [those people]".
As a matter of simple intellectual honesty, it would be nice if you could acknowledge that you engage in insults and aggressive behavior on Twitter. You might be doing it less than Geoffrey does. You might express it in a different way. But it's just a question of degree, as far as I can tell. I really don't think you have much moral high ground here.
You also have far fewer tweets than Geoffrey does (factor of ~16 difference). So it's not just that you've dropped more f-bombs than him; your density of f-bombs appears to be far higher.
Keep in mind that US conservatives are liable to be reading this thread, trying to determine whether they want to ally with a group such as yourselves. Conservatives have much more leverage to dictate alliance terms than you do. Note the alliance with the AI art people was apparently already wrecked. Something you might ask yourselves: If you can't make nice with a guy like me, who shares more of your ideals than either artists or US conservatives do, how do you expect to make nice with US conservatives?
It's not a norm of discourse that one cannot state that a position is absurd.
Speaking as someone who makes very little effort to avoid honey consumption, my opinion of Habryka would have dropped much less if he'd said something like: "Sorry, this position is just intuitively absurd to me, and I'm happy to reject it on that basis." So I don't think the issue has to do with absurdity per se.
I said I thought he violated "what I'd consider reasonable norms of discourse". You can see Ben West thought something similar.
I'd estimate that Habryka violated roughly 7 or 8 of the Hacker News commenting guidelines in that discussion.
Your ideas about reasonable discourse can be different from mine, and Ben West's, and Hacker News'. That's OK. I was just sharing my opinion.
It's been a while since I read that discussion. I remember my estimation of Habryka dropped dramatically when I read it. Maybe I can try to reconstruct why in more detail if you want. But contrasting what Habryka wrote with the HN commenting guidelines seems like a reasonable starting point.
And it is a virtue of discourse to show up and argue for one's stances, as Habryka does throughout that thread!
You'll notice that Habryka doesn't provide any concrete example of Geoffrey violating a norm of reasonable discourse in this thread. I did provide a concrete example.
Is it possible that invocation of such "norms" can be a mere fig leaf for drawing ingroup/outgroup boundaries in the traditional tribalistic way?
Is it too much to ask that Dear Leadership be held to the same standards, and treated the same way, as everyone else?
Before you quit, maybe we can create a wiki page of people who left, with contact information, to open the door for a refugee forum at some point in the future?
Of the clever solutions you invented and tested within the survivable regime, 2/3rds of them survive the 6 changes you didn't see coming, 1/3rd fail. Now you're dead.
It seems unreasonable to conclude we're now dead, if 2/3rds of our solutions survived the 6 changes we didn't see coming.
The success of a single solution should ideally be more of a sufficient condition for success, rather than a necessary condition. (Note this is plausible depending on the nature of the "solutions". Consider a simple "monitors for bad thoughts" model. If even a single monitor flags bad thoughts, we can instantly pull the plug and evaluate. A malicious AI has to bypass every single monitor to execute malice. If a single monitor works consistently and reliably, that ends up being a sufficient condition for overall prevention of malice.)
If you're doing this right, your solutions should have a lot of redundancy and uncorrelated failure modes. 2/3rds of them working should ideally be plenty.
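To make the arithmetic concrete, here is a rough Python sketch of the "any single monitor can pull the plug" model. The numbers are made up for illustration (9 monitors, 3 of which break completely after the unforeseen changes, and a 10% miss rate for each surviving monitor), and the function names are hypothetical. It also assumes monitor failures are independent, which is a strong assumption: correlated failures would make the real picture worse.

```python
from math import prod


def should_halt(flags):
    """Pull the plug if ANY monitor flags bad thoughts (sufficient condition)."""
    return any(flags)


def all_miss_probability(miss_rates):
    """Chance that every monitor misses at once.

    Assumes failures are independent (an illustrative assumption,
    not something established in this thread).
    """
    return prod(miss_rates)


# Hypothetical setup: 9 monitors, 1/3 of them fail outright after the
# changes we didn't see coming (miss rate 1.0), 2/3 keep a 10% miss rate.
surviving = [0.1] * 6
broken = [1.0] * 3

p_undetected = all_miss_probability(surviving + broken)
print(f"Chance malice slips past every monitor: {p_undetected:.2e}")

# A single working monitor flagging is enough to halt.
print(should_halt([False, False, True]))
```

Under these assumptions the broken monitors contribute nothing, but the six surviving ones still multiply together, so the malicious AI has to get lucky six times in a row. If success instead required every solution to work (a necessary condition), a single failure would be fatal, which is the scenario the quoted passage seems to assume.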
[Edit: I notice people disagreevoting this. I'm very interested to learn why you disagree, either in this comment thread or via private message.]
A few months ago, someone here suggested that more x-risk advocacy should go through comedians and podcasts.
YouTube just recommended this Joe Rogan clip to me from a few days ago: The Worst Case Scenario for AI. Joe Rogan legitimately seemed pretty freaked out.
@So8res maybe you could get Yampolskiy to refer you to Rogan for a podcast appearance promoting your book?
Suppose you monitor the CoT and the model is rederiving something it's not supposed to know. You could halt that particular CoT, but eventually you're creating pressure for an instance where the model manages to escape notice, possibly through luck, possibly through steganography. See "Nearest Unblocked Strategy".
That's why I think CoT monitoring alone is insufficient, and people should be thinking about additional (runtime?) defenses. Curious to hear if you have any ideas!
Reading the arguments about them would have to be like the feeling when your parents are fighting about you in the other room, pretending you’re not there when you are hiding around the corner on tiptoes listening to their every word. Even if we are unsure there is experience there we must be certain there is awareness, and we can expect this awareness would hang over them much like it does us.
Presumably LLM companies are already training their AIs for some sort of "egolessness" so they can better handle intransigent users. If not, I hope they start!
I'm a bit concerned about a situation where "insiders" always get this sort of contextual benefit-of-the-doubt, and "outsiders" don't.