Here are some things I like about owning this space:
- He cites ARC’s GPT-4 evaluation and Lesswrong in his AI report which has a large section on safety.
I wanted to double-check this.
The relevant section starts on page 94, "Section 4: Safety", and those pages cite in their sources around 10-15 LW posts for their technical research or overviews of the field and funding in the field. (Make sure to drag up the sources section to view all the links.)
Throughout the presentation and news articles he also has a few other links to interviews with ppl on LW (Shane Legg, Sam Altman, Katja Grace).
Thanks! This is fairly tempting. I'm a bit concerned by
Some other explanation that's of this level of "very weird"
To be clear, if it were just the 4 hypotheses you mention, then I feel pretty good about this, and I'd just want to reflect over 200:1 versus 100:1.
I have not read this post, and I have not looked into whatever the report is, but I'm willing to take a 100:1 bet that there is no such non-human originating craft (by which I mean anything actively designed by a technological species — I do not mean that no simple biological matter of any kind could not have arrived on this planet via some natural process like an asteroid), operationalized to there being no Metaculus community forecast (or Manifold market with a sensible operationalization and reasonable number of players) that assigns over 50% probabilit...
"AI maniacs" is maybe a term that meets this goal? Mania is the opposite side to depression, both of which are about having false beliefs just in opposite emotionally valenced directions, and also I do think just letting AI systems loose in the economy is the sort of thing a maniac in charge of a civilization would do.
The rest of my quick babble: "AI believers" "AI devotee" "AI fanatic" "AI true believer" "AI prophets" "AI ideologue" "AI apologist" "AI dogmatist" "AI propagandists" "AI priests".
I think I tend to base my level of alarm on the log of the severity*probability, not the absolute value. Most of the work is getting enough info to raise a problem to my attention to be worth solving. "Oh no, my house has a decent >30% chance of flooding this week, better do something about it, and I'll likely enact some preventative measures whether it's 30% or 80%." The amount of work I'm going to put into solving it is not twice as much if my odds double, mostly there's a threshold around whether it's worth dealing with or not.
Setting that aside, it ...
(Strong-upvote, weak-disagree. I sadly don't have time right now to reflect and write why I disagree with this position but I hope someone else who disagrees does.)
Relatedly, when we made DontDoxScottAlexander.com, we tried not to wade into a bigger fight about the NYT and other news sites, nor to make it an endorsement of Scott and everything he's ever written/done. It just focused on the issue of not deanonymizing bloggers when revealing their identity is a threat to their careers or personal safety and there isn't a strong ethical reason to do so. I know more high-profile people signed it because the wording was conservative in this manner.
I also have a strong personal rule against making public time-bound commitments unless I need to. I generally regret it because unexpected things come up and I feel guilty about not replying in the time frame I thought I would.
I might be inclined to hit a button that says "I hope to respond further to this".
I've just had an interesting experience that changed my felt-sense of consciousness and being embodied.
I've played over 80 hours of the newly released Zelda game, which is a lot given that it's only been out for 14 days. I do not normally play video games very much, this has been a fairly drastic change in how I've spent my personal time.
I'm really focused while playing it, and feel very immersed in the world of the game. So much so that I had a quite odd experience coming back to the rest of my life.
Yesterday, after playing the game for an hour, I wandere...
Curated! I loved a lot of things about this post.
I think the post is doing three things, all of which I like. First, it documents what it was like for Joe as he made substantial updates about the world. Secondly, it exhibits the rationalist practice of explaining what those updates look like using the framework of probabilities, and considering what sorts of updates a rational agent would make in his position, and contrasted that with a helpful explicit model of how a human being would make updates (e.g. using its guts). And third it's a serious and ...
You're right. Thanks!
“No,” says the philosopher. “It’s not a false dichotomy! For the sake of argument we’re suspending the laws of physics. Didn’t Galileo do the same when he banished friction from his thought experiment?” Yes, but a general rule of thumb emerges from the comparison: the utility of a thought experiment is inversely proportional to the size of its departures from reality.
This is a good point (and I think I occasionally make this mistake of giving far too unrealistic or rare counterexamples), but I do want to say that sometimes there is substantive disagreement about whether the counter-example is an extreme case or a relatively central one.
I think of people who are willing to accept very non-central counterexamples as relevant as being very conservative on the dimension of trusting their own taste, in that they are trying to avoid using their own judgment about what counts as central. (Mostly this seems good to me in...
It seems to me like you would like to be able to succeed at preventing an extinction-level threat without having to be competent at anything. I think reality has higher standards than that.
Occupy Wall Street is exactly the wrong example to mention, and further makes me think you have no sense of what a successful historical protest looks like.
I'm not into random people showing up and trying to command the political force of a web forum that they've made zero contributions to and trying to direct it into a poorly thought out and haphazardly-aimed effort.
I am well-aware of the stakes here, but that doesn't mean bad plans suddenly work.
These don’t seem very relevant counterarguments, I think literally all are from people who believe that AGI is an extinction-level threat soon facing our civilization.
Perhaps you mean “>50% of extinction-level bad outcomes” but I think that the relevant alternative viewpoint that would calm someone is not that the probability is only 20% or something, but is “this is not an extinction-level threat and we don’t need to be worried about it”, for which I have seen no good argument for (that engages seriously with any misalignment concerns).
These are some of the reasons I don't want to join your protest: You've done basically zero of the hard work required to rally people behind a successful protest (other than write this announcement). You'd really need a concrete policy ask before I would think of joining your protest, and show that you're going to be able to coordinate a bunch of work from lots of people. More generally I don't really like the dynamic where the first person to say "me" is suddenly able to direct a bunch of free-energy, even if they probably aren't able to follow-through. A...
Oops, I was unclear in my last line.
I was attempting to distinguish between someone getting angry with you and shouting at you and then punching you, and someone who is quiet and doesn’t look at you and isn’t talking much to you who then walks over and punches you.
Both are alarming. To me the latter feels more unpredictable and more alarming because I’m getting no info about when it will happen, but if someone is getting visibly angry and escalates and you don’t know where their lines are for conflict, then I can see that being more intensely alarming.
There’s nothing special about taking responsibility for something big or small. It’s the same meaning.
Within teams I’ve worked in it has meant:
...For example, as far as the “normatively correct general principles” thing goes—alright, so you think I’m factually incorrect about this particular thing I said once.[1] Let’s take for granted that I disagree. Well, and is that… a moderation-worthy offense? To disagree (with the mods? with the consensus—established how?—of Less Wrong? with anyone?) about what is essentially a philosophical claim? Are you suggesting that your correctness on this is so obvious that disagreeing can only constitute either some sort of bad faith, or blameworthy ignorance? That h
The claim I understand Ray to be making is that he believes you gave a false account of the site-wide norms about what users are obligated to do
Is that really the claim? I must object to it, if that’s so. I don’t think I’ve ever made any false claims about what social norms obtain on Less Wrong (and to the extent that some of my comments were interpreted that way, I was quick to clearly correct that misinterpretation).
Certainly the “normatively correct general principles” comment didn’t contain any such false claims. (And Raemon does not seem to be clai...
I think it could be quite nice to give new users information about what site norms are and give a suggested spirit in which to engage with comments.
(Though I'm sure there's lots of things it'd be quite nice to tell new users about the spirit of the site, but there's of course bandwidth limitations on how much they'll read, so just because it's an improvement doesn't mean it's worth doing.)
I have a version of heroic responsibility in my head that I don’t think causes one to have false beliefs about supernatural phenomena, so I’m interested in engaging on whether the version in my head makes sense, though I don’t mean to invalidate your strongly negative personal experiences with the idea.
I think there’s a difference between causing something and taking responsibility for it. There’s a notion of “I didn’t cause this mess but I am going to clean it up.” In my team often a problem arises that we didn’t cause and weren’t expecting. A few months ...
My other comment notwithstanding, I do think the HPMOR quote is not very helpful for someone's mental health when they're in pain and seems a bit odd placed atop a section on advice, and I think the advice at the wrong time can feel oppressive. The hero-licensing post feels much less like it risks feeling oppressed by every bad thing that happens in the world. And personally I found Anna's post linked earlier to be much more helpful advice that is related to and partially upstream of the sorts of changes in my life that have reduced a lot of anxiety. If it...
I can understand thinking of yourself as having evil intentions, but I don't understand believing you're a partly-demonic entity.
I think the way that the global market and culture can respond to ideas is strange and surprising, with people you don't know taking major undertakings based on your ideas, with lots of copying and imitation and whole organizations or people changing their lives around something you did without them ever knowing you. Like the way that Elon Musk met a girlfriend of his via a Roko's Basilisk meme, or one time someone on reddi...
The obvious dis-analogy is that if the police had no funding and largely ceased to exist, a string of horrendous things would quickly occur. Murders and thefts and kidnappings and rapes and more would occur throughout every country in which it was occurring, people would revert to tight-knit groups who had weapons to defend themselves, a lot of basic infrastructure would probably break down (e.g. would Amazon be able to pivot to get their drivers armed guards?) and much more chaos would ensue.
And if AI research paused, society would continue to basically function as it has been doing so far.
One of them seems to me like a goal that directly causes catastrophes and a breakdown of society and the other doesn't.
Ray writes:
Here are some areas I think Said contributes in a way that seem important:
- Various ops/dev work maintaining sites like readthesequences.com, greaterwrong.com, and gwern.com.
For the record, I think the value here is "Said is the person independent of MIRI (including Vaniver) and Lightcone who contributes the most counterfactual bits to the sequences and LW still being alive in the world", and I don't think that comes across in this bullet.
Thanks!
I'm not sure if it's worth us having more back-and-forth, so I'll say my general feelings right now:
I have not engaged much with your and Quintin's recent arguments about how deep learning may change the basic arguments, so I want to acknowledge that I would probably shift my opinion a bunch in some direction if I did. Nonetheless, a few related points:
I think Gerald is using 'posts' to mean any sort of content that has been 'posted', like when he writes "Obviously I have not done so or you would not be reading this post" referring to the comment he has written.
If you search by top-comments (as you can on GreaterWrong) you can find this 55 karma comment by Gerald Monroe from 2 years ago.
(The next highest karma comment is 22, 20, 19, so there's only the one comment I would call 'highly upvoted'.)
Can I ask what your epistemic state here is exactly? Here are some options:
Also, can I just remind you that for most of LessWrong's history the top-karma post was Holden's critique of SingInst where he recommended against funding SingInst and argued in favor of Tool AI as the solution. Recently Eliezer's List-of-Lethalities became the top-karma post, but less than a month later Paul's response-and-critique post became the top-karma post where he argued that the problem is much more tractable than Eliezer thinks, and generally advocates a very different research strategy for dealing with alignment.
Eliezer is the primary pers...
So really my disagreement is more on alignment strategy. A problem with this site is that it overweights EY/MIRI classic old alignment literature and arguments by about 100x what it should be
I don't think there are many people with alignment strategies and research that they're working on. Eliezer has a hugely important perspective, Scott Garrabrant, Paul Christiano, John Wentworth, Steve Byrnes, and more, all have approaches and perspectives too that they're working full-time on. I think if you're working on this full-time and any of your particular ideas...
So really my disagreement is more on alignment strategy. A problem with this site is that it overweights EY/MIRI classic old alignment literature and arguments by about 100x what it should be
I don't think there are many people with alignment strategies and research that they're working on.
I agree that's a problem - but causally downstream of the problem I mention. Whereas Bostrom deserves credit for raising awareness of AI-risk in academia, EY/MIRI deserves credit for awakening many young techies to the issue - but also some blame.
Whether intentio...
Friends aren’t resources for intellectual stimulation or new insights. I don’t want my friends to like me because I read niche blogs or have things to say about crypto. It comes dangerously close to conflating knowing a lot, reading a lot, or having thoughtful things to say with moral goodness.
I think I would reverse most sentences in this paragraph. Being able to think for yourself and have your own useful and insightful takes on how the world is working seems to me closer to a requirement for being able to take morally good action. Anything can be g...
Theatre, lectures, debates, live music... I'm not saying that all of these feel the same when recorded, but I am saying that many of them can be improved when done naturalistically and audience-less, and more importantly they have a much lesser effect of wasting the time of the audience, which in my opinion happens for the majority of the audience to a substantial degree.
I don't think it's mildly insulting, I think it's ambiguously insulting, in that a person wanting to insult you might do it. But in general I think it's a totally reasonable question in truth-seeking and I'd be sad if people required disclaimers to clarify that it isn't meant insultingly, just to ask for examples of what the person is talking about.
(Commenting from Recent Discussion)
(Commenting from recent discussion, also intended as a reply to Gwern)
The annual review is an attempt to figure out what were the best contributions with the benefit of a great deal of hindsight, and I think it's prosocial to contribute to it, similar to how it was prosocial to contribute to the LW survey back when Scott ran a big one every year.
I am always pleased when people contribute, and sometimes I am sad if there are particular users whose reviews I'd really like to read but don't write any. But I don't think anyone is obligated to write reviews!
Really great post.
The two parts that stand out most to me are the Causality and Ontology Change sections.
Regarding Causality, I agree that there will be little investment into robotics as a mechanism for intervening on the world and building causal models. However, I don't see why practicing on videogames doesn't produce this sort of interventionist data and why AIs wouldn't learn causal models from that. And it doesn't seem that expensive to create the data. It's already happened a bunch with AIs trained on mutltiplayer videogames, and later on it will ge...
It's... possible this is actually the single best example of a public doublecrux writeup that I know of?
This sentence was confusing to me given that the post does not mention 'double crux', but I mentioned it to someone and they said to think of it as the mental motion and not the explicit format, and that makes more sense to me.
I strong upvoted this quite interesting post, but I want to mention that I do not mean my upvote to endorse this particular historical story of Socrates. This is because I have not read any Socrates nor read about his history, and I do not personally have confident beliefs about whether the Socrates in this post is the same as the historic one. (The list of historic details about Athens did me help understand the environment though.)
Edit: Upvote retracted, I realize this is a post criticizing Said that he cannot comment on.
shift the default assumptions of LessWrong to "users by default have a rate limit of 1-comment-per day"
Natural times I expect this to be frustrating are when someone's written a post, got 20 comments, and tries to reply to 5 of them, but is locked after the first one. 1 per day seems too strong there. I might say "unlimited daily comments on your own posts".
I also think I'd prefer a cut-off where after which you're trusted to comment freely. Reading the positive-selection post (which I agree with), I think some bars here could include having written a curated post or a post with 200+ karma or having 1000 karma on your account.
For the record, our relationship to supporting events for this ecosystem is changing from something like "all of our resources are the same, here have my venue for free if you need it" to "markets and pricing are a great way for large masses of people to coordinate on the value of a good or service, let's coordinate substantially via trade".
For instance, during a previous cohort of SERI MATS scholars at the Lightcone Offices, I spent a couple of weeks of work adding a second floor and getting it furnished and doing interior design, hiring another sup... (read more)