The citation on that sentence is the same one discussed in the first paragraph of this post, about the animal-ethics.org website; Elizabeth is aware of that study and did not find it convincing.
In this and your comments below, you recapitulate points Elizabeth made pretty exactly, so it looks like you didn't need to read it after all!
My answer is "work on applications of existing AI, not the frontier". Advancing the frontier is the dangerous part, not using the state-of-the-art to make products.
But also, don't do frontend or infra for a company that's advancing capabilities.
I also had a bunch of thoughts of ‘oh, well, that’s easy, obviously you would just [OH MY LORD IS THIS CENSORED]’
I applaud your Virtue of Silence, and I'm also uncomfortable with the simplicity of some of the ones I'm sitting on.
Thanks for the info about sockpuppeting, will edit my first comment accordingly.
Re: Glassdoor, the most devastating reviews were indeed from after 2017, but it's still the case that none of the ~30 people who worked there during the Spartz era rated the CEO above average.
Thanks for updating! LessWrong at its best :)
I went through and added up all of the reviews from when Emerson was in charge, and the org averaged a 3.9 rating. You can check my math if you’d like: (5+3+5+4+1+4+5+5+5+5+5+5+5+1+5+5+3+5+5+5+3+1+2+4+5+3+1)/27
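If it helps, here's a minimal sketch in Python that just re-does the arithmetic with the same numbers listed above:

```python
# Ratings copied from the parenthetical above (Spartz-era Glassdoor reviews).
ratings = [5, 3, 5, 4, 1, 4, 5, 5, 5, 5, 5, 5, 5, 1, 5, 5, 3, 5, 5, 5, 3, 1, 2, 4, 5, 3, 1]
print(len(ratings))                           # 27 reviews
print(round(sum(ratings) / len(ratings), 1))  # 3.9
```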
For reference, Meta has a 4-star rating on Glassdoor and has won one of their prizes for Best Place to Work for 12 years straight. (2022 (#47), 2021 (#11), 2020 (#23), 2019 (#7), 2018 (#1), 2017 (#2), 2016 (#5), 2015 (#13), 2014 (#5), 2013 (#1), 2...
This is a good idea; unfortunately, based on discussions on the EA Forum, Nonlinear is not an organization I would trust to handle it. (Note, as external evidence, that the Glassdoor reviews of Emerson's previous company frequently mention a toxic upper-management culture of exactly the sort that the commenter alleges at Nonlinear, and give him a 0% approval rating as CEO.)
[EDITED TO ADD: The second comment quotes reviews written after the Spartz era (although I'm sure many of them were present during it), which is misleading; moreover, the second commenter was ...
Hi, thanks for saying you liked the idea, and I also appreciate the chance to clear up some things here. As a reminder, we’re not making funding decisions. We’re just helping funders and applicants find each other.
Some updates on that thread you might not have seen: the EA Forum moderators investigated and banned two users for creating ~8 fake sockpuppet accounts. This has possibly led to information cascades about things “lots of people are saying.”
Another thing you might not be aware of: the Glassdoor CEO rating of 0% was actually not Emers...
Looking for "elbows" in a noisy time series with relatively few points is a pretty easy way to get spurious results. If the 1960-62 obesity number was overestimated by 2 points and/or the 1976-1980 number was underestimated by 2 points, it wouldn't look like 1976-1980 was a special transition at all.
(And clearly errors of that magnitude happen, unless you think there's a deep reason why obesity rates were nonmonotonic from 2005-06 to 2011-12.)
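To make that concrete, here's a minimal sketch with purely made-up numbers (not the actual NHANES series): a perfectly straight trend plus ±2-point measurement noise is enough for a naive slope-change "elbow" detector to report a transition that isn't in the underlying data.

```python
import random

# Purely illustrative numbers - NOT real obesity data. A noiseless linear trend
# sampled at a handful of survey-like time points, plus +/-2-point measurement error.
random.seed(0)
years = [1960, 1966, 1972, 1978, 1984, 1990, 1996, 2002]
true_values = [10 + 0.5 * (y - 1960) for y in years]          # perfectly straight line
observed = [v + random.uniform(-2, 2) for v in true_values]   # add measurement error

def elbow(xs, ys):
    """Naive 'elbow' detector: the point where the segment-to-segment slope changes most."""
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    changes = [abs(slopes[i + 1] - slopes[i]) for i in range(len(slopes) - 1)]
    best = max(range(len(changes)), key=lambda i: changes[i])
    return xs[best + 1], changes[best]

print(elbow(years, true_values))  # slope change of 0.0: no elbow in the true trend
print(elbow(years, observed))     # noise alone can manufacture an apparent "elbow"
```

With so few points and errors of that size, the "strongest elbow" is mostly a readout of which surveys happened to err in which direction.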
[EDIT: fallenpegasus points out that there's a low bar to entry to this corner of TIME's website. I have to say I should have been confused that even now they let Eliezer write in his own idiom.]
The Eliezer of 2010 had no shot at being directly published in the TIME of 2010 (as opposed to being featured in an interview that at best paints him as a curiosity). I'm not sure about 2020.
I wonder at what point the threshold of "admitting it's at least okay to discuss Eliezer's viewpoint at face value" was crossed for the editors of TIME. I fear the answer is "last month".
Public attention is rare, and safety measures are rarer still, unless there's real-world damage. This is a known pattern in engineering, product design, and project planning, so I fear there will be little public attention and even less legislation until someone gets hurt by AI. That could take the form of a hot-coffee-type incident or a Chernobyl-type incident. The threshold won't be discussing Eliezer's point of view (we've been doing that for a long time) but losing sleep over it. I appreciate Yudkowsky's use, in the article, of the think-of-the-children stance, which has a great track record for sparking legislation.
I can confirm that Nate is not backdating memories—he and Eliezer were pretty clear within MIRI at the time that they thought Sam and Elon were making a tremendous mistake and that they were trying to figure out how to use MIRI's small influence within a worsened strategic landscape.
You were paying more attention than me (I don't follow anyone who engages with him a lot, so I maybe saw one of his tweets a week). I knew of him as someone who had been right early about COVID, and I also saw him criticizing the media for some of the correct reasons, so I didn't write him off just because he was obnoxious and a crypto fanatic.
The interest rate thing was therefore my Igon Value moment.
Balaji treating the ratio between 0.1% interest and 4.75% interest as deeply meaningful is so preposterous that I'm going to stop paying attention to anything he says from here on out.
I can imagine this coming from the equivalent of "adapt someone else's StackOverflow code" level capability, which is still pretty impressive.
In my opinion, the scariest thing I've seen so far is coding Game Of Life Pong, which doesn't seem to resemble any code GPT-4 would have had in its training data. Stitching those things together means coding for real for real.
Sam's real plan for OpenAI has never changed, and has been clear from the beginning if you knew about his and Elon's deep distrust of DeepMind:
Kudos for talking about learning empathy in a way that seems meaningfully different and less immediately broken than adjacent proposals.
I think what you should expect from this approach, should it in fact succeed, is not nothing, but still something more alien than the way we empathize with lower animals, let alone higher animals. Consider the empathy we have towards cats... and the way it is complicated by their desire to be a predator, and specifically to enjoy causing fear/suffering. Our empathy with cats doesn't lead us to abandon our empathy for their...
I used to believe, as do many Christians, that an open-hearted truthseeker will become convinced of the existence of the true God once they are exposed. To say otherwise makes missionary work seem rather manipulative (albeit still important for saving souls). More importantly, the principle is well attested in Christian thought and in the New Testament (Jesus with Nicodemus, Paul with the Athenians, etc).
There are and have been world religions that don't evangelize because they don't have the same assumption, but Christianity in particular is greatly wounded if that assumption proves false.
I have not read the book but I think this is exactly wrong, in that what happens after the ??? step is that shareholder value is not maximized.
I think you misinterpreted the book review: Caroline was almost surely making an Underpants Gnomes reference, which is used to indicate that the last step does not follow in any way from the preceding ones.
This is honestly some of the most significant alignment work I've seen in recent years (for reasons I plan to post on shortly), thank you for going to all this length!
Typo: "Thoughout this process test loss remains low - even a partial memorising solution still performs extremely badly on unseen data!", 'low' should be 'high' (and 'throughout' is misspelled too).
So I would argue that all of the main contenders are very training-data-efficient compared to artificial neural nets. I'm not going to go into detail on that argument unless people let me know that it seems cruxy to them and they'd like more detail.
I'm not sure I get this enough for it to even be a crux, but what's the intuition behind this?
My guess for your argument is that you see it as analogous to the way a CNN beats out a fully-connected one at image recognition, because it cuts down massively on the number of possible models, compatibly with the k...
Has any serious AI Safety research org thought about situating themselves so that they could continue to function after a nuclear war?
Wait, hear me out.
A global thermonuclear war would set AI timelines back by at least a decade, for all of the obvious reasons. So an AI Safety org that survived would have additional precious years to work on the alignment problem, compared to orgs in the worlds where we avoid that war.
So it seems to me that at least one org with short timelines ought to move to New Zealand or at least move farther away from cities.
(Yes, I k...
The distinction between your post and Eliezer's is more or less that he doesn't trust anyone to identify or think sanely about [plans that they admit have negative expected value in terms of log odds but believe possess a compensatory advantage in probability of success conditional on some assumption].
Such plans are very likely to hurt the remaining opportunities in the worlds where the assumption doesn't hold, which makes it especially bad if different actors are committing to different plans. And he thinks that even if a plan's assumptions hold, the odds...
In principle, I was imagining talking about two AIs.
In practice, there are quite a few preferences I feel confident a random person would have, even if the details differ between people and even though there's no canonical way to rectify our preferences into a utility function. I believe that the argument carries through practically with a decent amount of noise; I certainly treat it as some evidence for X when a thinker I respect believes X.
Identifying someone else's beliefs requires you to separate a person's value function from their beliefs, which is impossible.
I think it's unfair to raise this objection here while treating beliefs about probability as fundamental throughout the remainder of the post.
If you instead want to talk about the probability-utility mix that can be extracted from seeing another agent's actions even while treating them as a black box... two Bayesian utility-maximizers with relatively simple utility functions in a rich environment will indeed start inferring Bayesian...
This fails to engage with Eli's above comment, which focuses on Elon Musk, and is a counterargument to the very thing you're saying.
and they, I'm afraid, will be PrudentBot, not FairBot.
This shouldn't matter for anyone besides me, but there's something personally heartbreaking about seeing the one bit of research for which I feel comfortable claiming a fraction of a point of dignity, being mentioned validly to argue why decision theory won't save us.
(Modal bargaining agents didn't turn out to be helpful, but given the state of knowledge at that time, it was worth doing.)
Sorry.
It would be dying with a lot less dignity if everyone on Earth - not just the managers of the AGI company making the decision to kill us - thought that all you needed to do was be CooperateBot, and had no words for any sharper concepts than that. Thank you for that, Patrick.
But sorry anyways.
I see Biden as having cogent things he intends to communicate but sometimes failing to speak them coherently, while Trump is a pure stream of consciousness sometimes, stringing together loosely related concepts like a GPT.
(This isn't the same as cognitive capacity, mind you. Trump is certainly more intelligent than many people who speak more legibly.)
I haven't seen a "word salad" from Biden where I can't go "okay, here's the content he intended to communicate", but there are plenty from Trump where I can't reconstruct anything more than sentiment and gestures at disconnected facts.
"How" questions are less amenable to lucky guesses than "what" questions. Especially planning questions, e.g. "how would you make a good hat out of food?"
As Anisha said, GPT can pick something workable from a top-100-most-common menu with just a bit of luck, but engineering a plan for a nonstandard task seems beyond its capacity.
Is there already a concept handle for the notion of a Problem Where The Intuitive Solution Actually Makes It Worse But Makes You Want To Use Even More Dakka On It?
My most salient example is the way that political progressives in the Bay Area tried using restrictive zoning and rent control in order to prevent displacement... but this created a housing shortage and made the existing housing stock skyrocket in value... which led to displacement happening by other (often cruel and/or backhanded) methods... which led to progressives concluding that their rules...
You can see my other reviews from this and past years, and check that I don't generally say this sort of thing:
This was the best post I've written in years. I think it distilled an idea that's perennially sorely needed in the EA community, and presented it well. I fully endorse it word-for-word today.
The only edit I'd consider making is to have the "Denial" reaction explicitly say "that pit over there doesn't really exist".
(Yeah, I know, not an especially informative review - just that the upvote to my past self is an exceptionally strong one.)
Thank you!
Re: your second paragraph, I was (and am) of the opinion that, given the first sentence, readers were in danger of being sucked down into their thoughts on the object-level topic before they would even reach the meta-level point. So I gave a hard disclaimer then and there.
Your mileage varied, of course, but I model more people as having been saved by the warning lights than blinded by them.
There are some posts with perennial value, and some which depend heavily on their surrounding context. This post is of the latter type. I think it was pretty worthwhile in its day (and in particular, the analogy between GPT upgrades and developmental stages is one I still find interesting), but I leave it to you whether the book should include time capsules like this.
It's also worth noting that, in the recent discussions, Eliezer has pointed to the GPT architecture as an example that scaling up has worked better than expected, but he diverges from the thes...
Fighting is different from trying. To fight harder for X is more externally verifiable than to try harder for X.
It's one thing to acknowledge that the game appears to be unwinnable. It's another thing to fight any less hard on that account.
One tiny note: I was among the people on AAMLS; I did leave MIRI the next year; and my reasons for so doing are not in any way an indictment of MIRI. (I was having some me-problems.)
I still endorse MIRI as, in some sense, being the adults in the AI Safety room, which has... disconcerting effects on my own level of optimism.
Ditto - the first half makes it clear that any strategy which is more than 2 years slower than an unaligned approach will be useless, and that prosaic AI safety falls into that bucket.
Thanks for asking about the ITT.
I think that if I put a more measured version of myself back into that comment, it has one key difference from your version.
"Pay attention to me and people like me" is a status claim rather than a useful model.
I'd have said "pay attention to a person who incurred social costs by loudly predicting one later-confirmed bad actor, when they incur social costs by loudly predicting another".
(My denouncing of Geoff drove a wedge between me and several friends, including my then-best friend; my denouncing of the other on...
Thanks, supposedlyfun, for pointing me to this thread.
I think it's important to distinguish my behavior in writing the comment (which was emotive rather than optimized; it would even have helped my case to point out that the 2012 workshop was a weeklong experiment with lots of unstructured time, rather than the weekend format that CFAR later settled on, or to explain that his CoZE idea was to recruit teens to meddle with the other participants' CoZE) from the behavior of people upvoting the comment.
I expect that many of the upvotes were not of the form "this is a good comment on the meta level" so much as "SOMEBODY ELSE SAW THE THING ALL ALONG, I WORRIED IT WAS JUST ME".
Geography is a secondary concern, in that it's better to have one org with some people in different locations, but everyone communicating heavily, than to have two separate organizations.
I think this is much more complex than you're assuming. As a sketch of why: costs of communication scale poorly, and the benefits of being small and coordinating centrally often beat the costs imposed by needing to run everything as one organization. (This is why people advise startups to outsource non-central work.)
Sure - and MIRI/FHI are a decent complement to each other, the latter providing a respectable academic face to weird ideas.
Generally though, it's far more productive to have ten top researchers in the same org than to have five orgs, each with two top researchers and a couple of others to round them out. Geography is a secondary concern to that.
Thank you for writing this, Jessica. First, you've had some miserable experiences in the last several years, and regardless of everything else, those times sound terrifying and awful. You have my deep sympathy.
Regardless of my seeing a large distinction between the Leverage situation and MIRI/CFAR, I agree with Jessica that this is a good time to revisit the safety of various orgs in the rationality/EA space.
I almost perfectly overlapped with Jessica at MIRI from March 2015 to June 2017. (Yes, this uniquely identifies me. Don't use my actual name here anyw...
I think CFAR would be better off if Anna delegated hiring to someone else.
I think Pete did (most of?) the hiring as soon as he became ED, so I think this has been the state of CFAR for a while (while I think Anna has also been able to hire people she wanted to hire).
"People in and adjacent to MIRI/CFAR manifest major mental health problems, significantly more often than the background rate."

I think this is true.
My main complaint about this and the Leverage post is the lack of base-rate data. How many people develop mental health problems in a) normal companies, b) startups, c) small non-profits, d) cults/sects? So far, all I have seen are two cases. And in the startups I have worked at, I would also have been able to find mental health cases that could be tied to the company narrative. Humans being human, narratives get...
"if one believed somebody else were just as capable of causing AI to be Friendly, clearly one should join their project instead of starting one's own."
Nitpicking: there are reasons to have multiple projects; for example, it's convenient to be in the same geographic location, but not everyone can relocate to any place.
Elizabeth has invested a lot of work already, and has explicitly requested that people put in some amount of work when trying to argue against her cruxes (including actually reading her cruxes, and supporting their points with studies whose methodology they have critically checked).