On this Earth there are two factions, and you must pick one.
(Hat tip: I got these names 2 years ago from Robert Miles who had been playing with GPT-3.)
Did people say why they deferred to these people?
I think another interesting question to correlate with this would be "If you believe AI x-risk is a severely important issue, what year did you come to believe that?".
For the closing party of the Lightcone Offices, I used Midjourney 5 to make a piece of art to represent a LessWrong essay by each member of the Lightcone team, and printed them out on canvases. I'm quite pleased with how they came out. Here they are.
(context: Jacob has been taking flying lessons, and someday hopes to do cross-country material runs for the Rose Garden Inn at shockingly fast speeds by flying himself to pick them up)
Gotcha, I was unclear about whether you were saying it prescriptively or descriptively.
Right. I suspect we still have some disagreement but happy to leave it here.
(To briefly leave a pointer, but with no expectation, Jeff, for you to respond to it: I think this sort of dynamic extends further into lots of other criticism, where even if your criticism isn't about bad behavior, you're still pretty unsure how the org will respond and whether they'll respond well, and it can be very stressful to engage directly yet still pro-social to publish criticism.)
Yep, it seems good to me to respond to just one point that you disagreed with; definitely positive to do so relative to responding to none :)
I genuinely have uncertainty here. I know there were a bunch of folks at CSET who understood some of the args, but I'm not sure whether/what roles they have in government; I think of many of them as being in "policy think tanks" that are outside of government. Matheny was in the White House for a while but now he runs RAND; if he were still there I would be wrong and there would be at least one person who I believe gro...
Consider an average human, who understands goodness enough to do science without catastrophic consequences, but is not a benevolent sovereign.
If "science" includes "building and testing AGIs" or "building and testing nukes" or "building and testing nanotech", then I think the "average human" "doing science" is unaligned.
I have occasionally heard people debate whether "humans are aligned". I find it a bit odd to think of it as a yes/no question. I think humans are good at modeling some environments and not others. High-pressure environments with superstimuli ...
It's not clear to me that OpenAI has a clear lead over Anthropic in terms of capabilities.
Just repeating from my other comments: my main issue is the broad proposal of "let's get governments involved" that appears to not be aware of all the horrible and corrupt things governments do by-default when they get involved (cf. Covid), nor proposes any ways to avoid lots of dysfunction.
When we did Scott's petition, names were not automatically added to the list, but each name was read by me or Jacob, and if we were uncertain about one we didn't add it without checking in with others or thinking it over. This meant that added names were staggered throughout the day because we only checked every hour or two, but overall prevented a number of fake names from getting on there.
(I write this to contrast it with automatically adding names then removing them as you notice issues.)
I've slept, and now looked it over again.
Such decisions must not be delegated to unelected tech leaders.
I don’t agree with the clear implication that the problem with tech leaders is that they weren’t elected. I commonly think their judgment is better than people who are elected and in government. I think competent and elected people are best, but given the choice between only competent or only elected (in the current shitty electoral systems of the UK and the US that I am familiar with), I think I prefer competent.
If such a pause cannot be enacted quickly
I wasn't referring specifically to the OP when I wrote that, I meant I ought to push back against a pretty wide swath of possible arguments and pressures against publishing criticism. Nonetheless I want to answer your question.
My answer is more "yes" than "no". If someone publishes a critique of an EA org without having shown it to the org first, people can say "But you didn't check it with the org, which we agreed is a norm around here, so you don't seem willing to play by the rules, and I now suspect the rest of this post is in bad faith." Yet I think it'...
(Also Zoe Curzi and Leverage. Really there’s a lot of examples.)
Also examples on the other side, I would note. Without a healthy skepticism of anonymous or other kinds of reduced-accountability reports, one would've been led around by the nose by Ziz's attempts.
Oh, I didn't read that correctly. Good point.
I am concerned about some other parts of it, that seem to imbue a feeling of "trust in government" that I don't share, and I am concerned that if this letter is successful then governments will get involved in a pretty indiscriminate and corrupt way and then everything will get worse; but my concern is somewhat vague and hard to pin down.
I think it'd be good for me to sleep on it, and see whether it still seems bad to sign on to the next time I see it.
Ah, I was under a misapprehension, I thought the data was much more recent, but the GPT-4 page says:
GPT-4 generally lacks knowledge of events that have occurred after the vast majority of its data cuts off (September 2021)
However that is after GPT-3 was released (June 2020), so it's a new dataset.
Extrapolating naively, 2 years from now we will see GPT-5 trained on data from today.
My model is that OpenAI and Anthropic researchers set up a web-scraper that reads through lots of popular links shared on reddit (or possibly literally all of reddit) and then uses all of that as the training data for their language models.
...googling shows this as the official answer for GPT-3, which contains a lot of the popular and public internet. I am unclear whether that contains reddit, but if not then I believe I heard that they made a crawler specifically for reddit.
I don't agree with the recommendation, so I don't think I should sign my name to it.
To describe a concrete bad thing that may happen: suppose the letter is successful and then there is a pause. Suppose a bunch of AI companies agree to some protocols, and say that these protocols "ensure that systems adhering to them are safe beyond a reasonable doubt". If I (or another signatory) were then to say "But I don't think that any such protocols exist", I think they'd be in their right to say "Then why on Earth did you sign this letter saying that we could find them within 6 months?" and then not trust me again to mean the things I say publicly.
The letter says to pause for at least 6 months, not exactly 6 months.
So anyone who doesn't believe that protocols exist to ensure the safety of more capable AI systems shouldn't avoid signing the letter for that reason, because the letter can be interpreted as supporting an indefinite pause in that case.
I think scraping reddit is common. The SSC subreddit is pretty popular. I wonder if there could be a post on that subreddit that was just a space for people to publish books in the comments.
I concur, the typo in "Poeple" does call into question whether he has truly signed this letter.
I think this letter is not robust enough to people submitting false names. Back when Jacob and I put together DontDoxScottAlexander.com we included this section, and I would recommend doing something pretty similar:
I think someone checking some of these emails would slow down high-profile signatories by 6-48 hours, but sustain trust that the names are all real.
I'm willing to help out if those running it would like, feel free to PM me.
I believe the high-profile names at the top are individually verified, at least, and it looks like there's someone behind the form deleting fake entries as they're noticed. (Eg Yann LeCun was on the list briefly, but has since been deleted from the list.)
Oh no. Apparently also Yann LeCun didn't really sign this.
Indeed. Among the alleged signatories:
Xi Jinping, Poeple's Republic of China, Chairman of the CCP, Order of the Golden Eagle, Order of Saint Andrew, Grand Cordon of the Order of Leopold
Which I heavily doubt.
AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.
That's nice, but I don't currently believe there are any audits or protocols that can prove future AIs safe "beyond a reasonable doubt".
In parallel, AI developers must work with policymakers to dramatically accelerate development of ro
I'm not saying it won't improve someone's post to get direct feedback from us, and I'm not saying it might not reduce some amount of effort from someone on the Lightcone team responding to things people are wrong about. But my current model is that, for people to have justified belief in their model of the work that an org does, they should believe they would have heard negative info about us if it existed, and so I ought to encourage people to be openly, severely critical, and push back against demands to not write their criticism for a pretty large ...
For the record I don't think anyone needs to check with Lightcone before criticizing any of our work.
For the record I think "the correct number of people to die as a result of technological progress is not zero". My issue is that the correct number is not "all of the people".
My current understanding is that Sam gained good standing as a result of having lots of money for EA causes, not as a result of being charismatic in EA spaces? My sense is that the person you mentioned would struggle to gain good standing in the Lightcone offices without any preexisting money or other power.
No, he gained good standing from being around the EA community for so many years, having sophisticated ethical views (veganism, a form of utilitarianism, etc.), and convincing well-respected EAs to work with him and fund him, as well as from havi...
Sorry if I wrote unclearly. For most of the time (even in the last 6 months) I thought it was worth continuing to support the ecosystem, and certainly to support the people in the office, even if I was planning later to move on. I wanted to move on primarily because of the opportunity cost — I thought we could do something greater. But I believe Habryka wanted to separate from the whole ecosystem and question whether the resources we were providing were actually improving the world at all, and at that point it's not simply a question of opportunity cost bu...
Added: To give context, here's a list of number of LW posts by year:
(My, it's getting to be quite a lot of posts these days.)
This is an odd response from me, but recently, for my birthday, I posted a survey for my friends to fill out about me, anonymously rating me on lots of different attributes.
I included some spicier/fun questions, one of which was whether they thought they were smarter than me or not.
Here were the results for that question:
It was roughly 50/50 throughout the entire time data came in over the two days.
The vast majority of people responding said that they'd read the sequences (either "some" or "yes"). I'd guess that basically everyone had except my...
Even though Eliezer claims that there was no fire alarm for AGI, perhaps this is the fire alarm?
I mean, obviously not, as most governments don't know that metaculus exists or what a prediction market is.
And if you block any one path to the insight that the earth is round, in a way that somehow fails to cripple it, then it will find another path later, because truths are interwoven. Tell one lie, and the truth is ever-after your enemy.
In case it's of any interest, I'll mention that when I "pump this intuition", I find myself thinking it essentially impossible to expect we could ever build a general agent that didn't notice that the world was round, and I'm unsure why (if I recall correctly) I sometimes read Nate or Eliezer write that they think it's quit...
- the AGI was NOT exercising its intelligence & reason & planning etc. towards an explicit, reflectively-endorsed desire for “I am being helpful / I am being docile / I am acting with integrity / blah blah”.
I am naively more scared of such an AI. That AI sounds more like one that, if I say "you're not being helpful, please stop", will respond "actually, I thought about it, I disagree, I'm going to continue doing what I think is helpful".
I think that, if an AGI has any explicit reflectively-endorsed desire whatsoever, then I can tell a similar scary story: The AGI’s desire isn’t quite what I wanted, so I try to correct it, and the AGI says no. (Unless the AGI’s explicit endorsed desires include / entail a desire to accept correction! Which most desires don’t!)
And yes, that is a scary story! It is the central scary story of AGI alignment, right? It would be nice to make an AGI with no explicit desires whatsoever, but I don’t think that’s possible.
So anyway, if we do Procedure X which will n...
The fact that some people in EA (a huge broad community) are probably wrong about some things didn't seem to be an argument that Lightcone Offices would be ineffective as (AFAIK) you could filter people at your discretion.
I mean, no, we were specifically trying to support the EA community, we do not get to unilaterally decide who is part of the community. People I don't personally have much respect for but are members of the EA community who are putting in the work to be considered members in good standing definitely get to pass through. I'm not goin...
And these are both real obstacles. But there are deeper obstacles, that seem to me more central, and that I haven't observed others to notice on their own.
I brainstormed some possible answers. This list is a bit long. I'm publishing this comment because it's not worth the half hour to make it concise, yet it seems worth trying the exercise before reading the post and possibly others will find it worth seeing my quick attempt.
I think the last two bullets are probably my best guesses. Nonetheless here is my list:
For contrast, over the same time period, $185k/month could provide salary, lodging, and office space for 50 people in Europe, all of whom would counterfactually not be doing that work otherwise, for which I claim 50 man-months per month of extra x-risk reduction work.
The default outcome of giving people money is either nothing, noise, or the resources getting captured by existing incentive gradients. In my experience, if you give people free money, they will take it, and they will nominally try to please you with it, so it's not that surprising if you can fi...
the local incentives of those with high status agree with performance quantification just fine, so long as the metric in question is one by which they're already doing well.
To me this rhymes pretty closely with the message in Is Success the Enemy of Freedom?, in that in both cases you're very averse to competition on even pretty nearby metrics that you do worse on.
The vast majority of users see a little circle widget in the bottom right. For instance I can see it now as I look at your comment.
Sometimes users have added things to their browsers that remove it, and you can also remove it in your account settings.
People think the speed-up by rationalists is only ~5 years? I thought people were thinking 10-40. I do not think I would trade the entire history of LessWrong, including the articulation of the alignment problem, for 5 years of timelines. I mean, maybe it's the right call, but it hardly seems obvious.
When LessWrong was ~dead (before we worked on the revival) I had this strong sense that being able to even consider that OpenAI could be bad for the world, or the notion that the alignment problem wasn't going to go okay by-default, was being edged out o...
I mean, I don't see the argument for more than that. Unless you have some argument for hardware progress stopping, my sense is that things would get cheap enough that someone is going to try the AI stuff that is happening today within a decade.
A few replies:
That is ~$185k/month and ~$2.22m/year. I wonder if the cost has anything to do with the decision? There may be a tendency to say "an action is either extremely good or extremely bad because it either reduces x-risk or increases x-risk, so if I think it's net positive I should be willing to spend huge amounts of money."
I don't think cost had that much to do with the decision, I expect that Open Philanthropy thought it was worth the money and would have been willing to continue funding at this price point.
In general I think the correct re...
The hard question is "how much goodharting is too much goodharting".
You did see part of it before; I posted in Open Thread a month ago with the announcement, but today Ray poked me and Oli to also publish some of the reasoning we wrote in slack.
Yeah, but that doesn't sound like my strategy. I've many times talked to people who are leaving or left and interviewed them about why and what they didn't like and their reasons for leaving.
Thanks for saying. Sounds like another piece I will skip!
While I am generally interested in justice around these parts, I generally buy the maxim that if the news is important, I will hear the key info in it directly from friends (this was true both for covid and for Russia-nukes stuff), and that otherwise the news media spend enough effort on narrative-control that I'd much rather not even read the media's account of things.
This seems like a bad rule of thumb. If your social circle is largely comprised of people who have chosen to remain within the community, ignoring information from "outsiders" seems like a bad strategy for understanding issues with the community.
I also went to hpmor.com yesterday, was disappointed that the old site was gone and now redirects to a relatively under-optimized LessWrong page, and complained to the LW team about this.
I think mostly somewhat confused?
Though I've never met her, from her writing and things others have told me, I expect LaSota seems much more visibly out-of-it and threatening than e.g. Michael does, who I have met and didn't seem socially alarming or unpredictable in the way where you might be scared of a sudden physical altercation.
I think Vassar is alarming and unpredictable in a way that causes people to be afraid of a sudden physical altercation. For example, I have felt scared of physical altercations with him. If I recall correctly, he raised his voice while telling a friend of mine that he thought they were worse than the Nazis during a conversation in a hotel lobby, which freaked out other people who were in the lobby (I don't remember how my friend felt).
Not sure if this answers your question, but recently I had an assistant who would ask me questions about how I was feeling. Often, when I was in the midst of focusing on some difficult piece of work, I would answer "I don't know", and get back to focusing on the work.
My vague recollection is that she later showed me notes she'd written that said I was sighing deeply, holding my forehead, had my shoulders raised, was occasionally talking to myself, and I came to realize I was feeling quite anxious at those times, but this information wasn't accessible...
I think I've been implicitly coming to believe that (a) all people are feeling emotions all the time, but (b) people vary in how self-aware they are of these emotions.
Does anyone want to give me a counter-argument or counter-evidence to this claim?
I heard that LaSota ('Ziz') and Michael interacted, but I am sort of under the impression she was already kind of violent and bizarre before that, so I'm not putting much of this bizarreness down to Michael. Certainly interested in evidence about this (here or in DM).
It sure sounds like you think outsiders would typically have the "common sense" to avoid Ziz. What do you think such an outsider would make of this comment?
In case you're interested, I choose the latter, for there is at least the hope of learning from the mistakes.