All of Aryeh Englander's Comments + Replies

You should make this a top-level post so it gets visibility. I think it's important for people to know the caveats attached to your results and the limits on their implications for real-world dynamics.

When you say that you'd give different probability estimates on different days, do you think you can represent that as sampling, on different days, from a probability distribution over your "true" latent credence? If so, do you think it would be useful to try to estimate what that distribution looks like, and then report its mean, or perhaps a 90% CI or something like that? For example, if your estimate typically ranges between 33% and 66% depending on the day, with a mean of, say, 50%, then instead of reporting what you think today (the equivalent of taking a single random sample from the distribution), maybe you could report 50% because that's your mean, and/or report that your estimate typically ranges from 33% to 66%.
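A minimal sketch of what "estimate the distribution, then report a summary" could look like, assuming you had logged your daily estimates for the same question (the numbers below are made up for illustration):

```python
import statistics

# Hypothetical log of daily probability estimates for the same question.
daily_estimates = [0.33, 0.40, 0.45, 0.50, 0.55, 0.61, 0.66]

# Report the mean rather than today's single sample.
mean = statistics.mean(daily_estimates)

# Crude empirical ~90% interval: the interpolated 5th and 95th percentiles.
# quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(daily_estimates, n=20)
lo, hi = cuts[0], cuts[-1]

print(f"mean estimate: {mean:.2f}")
print(f"typical range (~90% interval): {lo:.2f} to {hi:.2f}")
```

With only a handful of samples the percentile estimates are rough, so reporting the observed min-max range (here 33% to 66%) alongside the mean may be just as informative.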

From a Facebook discussion with Scott Aaronson yesterday:

Yann: I think neither Yoshua nor Geoff believe that AI is going to kill us all with any significant probability.

Scott: Well, Yoshua signed the pause letter, and wrote an accompanying statement about what he sees as the risk to civilization (I agree that there are many civilizational risks short of extinction). In his words: “No one, not even the leading AI experts, including those who developed these giant AI models, can be absolutely certain that such powerful tools now or in the future cannot be used... (read more)

4gjm2mo
On the one hand, "Yuddite" is (kinda rude but) really rather clever. On the other hand, the actual Luddites were concerned about technological unemployment, which makes "Yuddite" a potentially misleading term, given that there's something of a rift between the "concerned about ways in which AI might lead to people's lives being worse within a world that's basically like the one we have now" camp and the "concerned about the possibility that AI will turn the world so completely upside down that there's no room for us in it any more" camp, and Yudkowsky is very firmly in the latter.
2dr_s2mo
Ok, but how is this any different in practice? Or preventable via "corporate law"? It feels to me like people make too much of a distinction between slow and fast takeoff scenarios, as if somehow, if humans appear to be in the loop a bit more, this makes the problem less bad or less AI-related. Essentially, if your mode of failure follows almost naturally from introducing AI systems into current society and basic economic incentives, to the point that you can't really look at any part of the process and identify anyone maliciously and intentionally setting it up to end the world, yet it does end the world, then it's an AI problem. It may be a weird, slow, cyborg-like amalgamation of AI and human society that causes the catastrophe instead of a singular agentic AI taking everything over quickly, but the AI is still the main driver, and the only way to avoid the problem is to make AI extremely robust not just to intentional bad use but also to unintentional bad-incentive feedback loops - essentially smart and moral enough to stop its own users and creators when they don't know any better. Or alternatively, to just not make the AI at all. Honestly, given what Facebook's recommender systems have already caused, it's disheartening that the leader of AI research at Meta doesn't get something like this.

Sometimes it's better in the long run to take a good chunk of time off to do things for fun and write or work less. Sometimes less is more. But this is very much a YMMV thing.

This is actually another related area of my research: To the extent that we cannot get people to sit down and agree on double cruxes, can we still assign some reasonable likelihoods and/or uncertainty estimates for those likelihoods? After all, we do ultimately need to make decisions here! Or if it turns out that we literally cannot use any numbers here, how do we best make decisions anyway?

4shminux1y
It's an interesting question; I think Scott A explored it in https://slatestarcodex.com/2019/06/03/repost-epistemic-learned-helplessness/. But it would likely be inferior to figuring out a way for people to either double-crux, or at least do some kind of adversarial collaboration. That seems a lot easier than the problem we are trying to address, so what hope is there for the bigger problem if this one remains unresolved?

I have now posted a "Half-baked AI safety ideas thread" (LW version, EA Forum version) - let me know if that's more or less what you had in mind.

Just putting in my vote for doing both broader and deeper explorations of these topics!

My impression - which I kind of hope is wrong - has been that it is much easier to get an EA grant the more you are an "EA insider" or have EA insider connections. The only EA connection that my professor has is me. On the other hand, I understand the reluctance to some degree in the case of AI safety because funders are concerned that researchers will take the money and go do capabilities research instead.

Honestly I suspect this is going to be the single largest benefit from paying Scott to work on the problem. Similarly, when I suggested in an earlier comment that we should pay other academics in a similar manner, in my mind the largest benefit of doing so is that it will help normalize this kind of research in the wider academic community. The more respected researchers there are working on the problem, the more other researchers start thinking about it as well, resulting (hopefully) in a snowball effect. Also, researchers often bring along their grad students!

9Adam Zerner1y
Right, I was going to bring up the snowball effect as well but I forgot. I think that's a huge point.

Hopefully. I have a feeling it won't be so easy, but we'll see.

7Adam Zerner1y
If it ends up not being easy, it seems to me like that means that we are in fact funding constrained. Is that true or am I missing something? (The advisor in question is just one person. If it was only them who wanted to work in AI safety but couldn't due to a lack of funds, that wouldn't be a big deal. But I am assuming that there are lots of similar people in a similar boat, in which case the lack of funding would be an important problem.) (I know this topic has been discussed previously. I bring it up again here because the situation with this advisor seems like a really good concrete example.)

Yes! I actually just discussed this with one of my advisors (an expert on machine learning), and he told me that if he could get funding to do it he would definitely be interested in dedicating a good chunk of his time to researching AGI safety. (For any funders who might read this and might be interested in providing that funding, please reach out to me by email Aryeh.Englander@jhuapl.edu. I'm going to try to reach out to some potential funders next week.)

I think that there are a lot of researchers who are sympathetic to AI risk concerns, but they either ... (read more)

There's been discussion about there being a surplus of funding in EA and not enough people who want to get funded to do important work. If that is true, shouldn't it be relatively easy for your presumably competent advisor to get such funding to work on AI safety?

4TekhneMakre1y
Non-rhetorically, what's the difference between AI risk questions and ordinary scientific questions, in this respect? "There aren't clear / precise / interesting / tractable problems" is a thing we hear, but why do we hear that about AI risk as opposed to other fields with sort of undefined problems? Hasn't a lot of scientific work started out asking imprecise, intuitive questions, or no? Clearly there's some difference.

It also depends on your target audience. (Which is basically what you said, just in slightly different words.) If you want to get Serious Researchers to listen to you and they aren't already within the sub-sub-culture that is the rationality community and its immediate neighbors, then in many (most?) cases ranting and freaking out is probably going to be actively counterproductive to your cause. Same if you're trying to build a reputation as a Serious Researcher, with a chance that decision makers who listen to Serious Researchers might listen to you. On t... (read more)

I'm pretty sure that's the whole purpose of having province governors and sub-kingdoms, and various systems in place to ensure loyalty. Every empire in history did this, to my knowledge. The threat of an imperial army showing up on your doorstep if you fail to comply has historically been sufficient to ensure loyalty, at least while the empire is strong.

We have a points system in our family to incentivize the kids to do their chores. But we have to regularly update the rules because it turns out that there are ways to optimize for the points that we didn't anticipate and that don't really reflect what we actually want the kids to be incentivized to do. Every time this happens I think - ha, alignment failure!

Alexey Turchin and David Denkenberger describe several scenarios here: https://philpapers.org/rec/TURCOG-2 (additional recent discussion in this comment thread: https://www.lesswrong.com/posts/MLKmxZgtLYRH73um3/we-will-be-around-in-30-years?commentId=pnxombDr6rCYG79y8)

Eliezer's go-to scenario (from his recent post: https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities):

The concrete example I usually use here is nanotech, because there's been pretty detailed analysis of what definitely look like physically attainable lower bounds on what should be possible with nanotech, and those lower bounds are sufficient to carry the point.  My lower-bound model of "how a sufficiently powerful intelligence would kill everyone, if it didn't want to not do that" is that it gets access to the Internet, emails some DNA sequences to any of the many many online firms that will take a DNA

... (read more)

https://www.gwern.net/fiction/Clippy (very detailed but also very long and very full of technical jargon; on the other hand, I think it's mostly understandable even if you have to gloss over most of the jargon)

Please describe or provide links to descriptions of concrete AGI takeover scenarios that are at least semi-plausible, and especially takeover scenarios that result in human extermination and/or eternal suffering (s-risk). Yes, I know that the arguments don't necessarily require that we can describe particular takeover scenarios, but I still find it extremely useful to have concrete scenarios available, both for thinking purposes and for explaining things to others.

1delton1371y
I find slower take-off scenarios more plausible. I like the general thrust of Christiano's "What failure looks like [https://www.lesswrong.com/posts/HBxe6wdjxK239zajf/what-failure-looks-like]". I wonder if anyone has written up a more narrative / concrete account of that sort of scenario.
2Evan R. Murphy1y
This new series of posts from Holden Karnofsky (CEO of Open Philanthropy) is about exactly this. The first post came out today: https://www.lesswrong.com/posts/oBBzqkZwkxDvsKBGB/ai-could-defeat-all-of-us-combined
3cousin_it1y
Without nanotech or anything like that, maybe the easiest way is to manipulate humans into building lots of powerful and hackable weapons (or just wait since we're doing it anyway). Then one day, strike. Edit: and of course the AI's first action will be to covertly take over the internet, because the biggest danger to the AI is another AI already existing or being about to appear. It's worth taking a small risk of being detected by humans to prevent the bigger risk of being outraced by a competitor.
1Aryeh Englander1y
https://www.lesswrong.com/posts/XFBHXu4YNqyF6R3cv/pitching-an-alignment-softball
1Aryeh Englander1y
https://www.lesswrong.com/posts/BAzCGCys4BkzGDCWR/the-prototypical-catastrophic-ai-action-is-getting-root

One of the most common proposals I see people raise (once they understand the core issues) is some form of, "can't we just use some form of slightly-weaker safe AI to augment human capabilities and allow us to bootstrap to / monitor / understand the more advanced versions?" And in fact lots of AI safety agendas do propose something along these lines. How would you best explain to a newcomer why Eliezer and others think this will not work? How would you explain the key cruxes that make Eliezer et al think nothing along these lines will work, while others think it's more promising?

2AprilSR1y
Eliezer's argument from the recent post: The reason why nobody in this community has successfully named a 'pivotal weak act' where you do something weak enough with an AGI to be passively safe, but powerful enough to prevent any other AGI from destroying the world a year later - and yet also we can't just go do that right now and need to wait on AI - is that nothing like that exists. There's no reason why it should exist. There is not some elaborate clever reason why it exists but nobody can see it. It takes a lot of power to do something to the current world that prevents any other AGI from coming into existence; nothing which can do that is passively safe in virtue of its weakness. If you can't solve the problem right now (which you can't, because you're opposed to other actors who don't want to be solved and those actors are on roughly the same level as you) then you are resorting to some cognitive system that can do things you could not figure out how to do yourself, that you were not close to figuring out because you are not close to being able to, for example, burn all GPUs. Burning all GPUs would actually stop Facebook AI Research from destroying the world six months later; weaksauce Overton-abiding stuff about 'improving public epistemology by setting GPT-4 loose on Twitter to provide scientifically literate arguments about everything' will be cool but will not actually prevent Facebook AI Research from destroying the world six months later, or some eager open-source collaborative from destroying the world a year later if you manage to stop FAIR specifically. There are no pivotal weak acts.

[Note that two-axis voting is now enabled for this post. Thanks to the mods for allowing that!]

2evhub1y
Seems worse for this post than one-axis voting imo.

This website looks pretty cool! I didn't know about this before.

8plex1y
Thanks! I've spent a lot of the last year and a half working on the wiki infrastructure, we're getting pretty close to being ready to launch to editors in a more serious way.

I haven't even read the post yet, but I'm giving a strong upvote in favor of promoting the norm of posting unpopular critical opinions.

Such a policy invites moral hazard, though. If many people followed it, you could farm karma by simply beginning each post with the trite "this is going to get downvoted" thing.

2mukashi1y
Thanks, I appreciate that

I forgot about downvotes. I'm going to add this in to the guidelines.

2Charbel-Raphaël1y
here we are, a concrete example of failure of alignment 

Background material recommendations (more in depth): Please recommend your favorite AGI safety background reading / videos / lectures / etc. For this sub-thread more in-depth recommendations are allowed, including material that requires technical expertise of some sort. (Please specify what kind of background knowledge / expertise is required to understand the material you're recommending.) This is also the place to recommend general resources people can look at if they want to start doing a deeper dive into AGI safety and related topics.

7plex1y
Stampy [https://stampy.ai/wiki/Main_Page] has the canonical answer to this: I’d like to get deeper into the AI alignment literature. Where should I look? [https://stampy.ai/read/Plex%27s_Answer_to_I%E2%80%99d_like_to_get_deeper_into_the_AI_alignment_literature._Where_should_I_look%3F] Feel free to improve the answer, as it's on a wiki. It will be served via a custom interface once that's ready (prototype here [https://stampy-ui.aprillion.workers.dev/]).
4Alex Lawsen 1y
AXRP [https://axrp.net/] - Excellent interviews with a variety of researchers. Daniel's substantial own knowledge means that the questions he asks are often excellent, and the technical depth is far better than anything else available in audio - audiobook readers struggle with papers and Alignment Forum posts that contain actual maths.
6Aryeh Englander1y
Obligatory link to the excellent AGI Safety Fundamentals curriculum [https://www.eacambridge.org/agi-safety-fundamentals].

Background material recommendations (popular-level audience, several hours time commitment): Please recommend your favorite basic AGI safety background reading / videos / lectures / etc. For this sub-thread please only recommend background material suitable for a popular level audience. Time commitment is allowed to be up to several hours, so for example a popular-level book or sequence of posts would work. Extra bonus for explaining why you particularly like your suggestion over other potential suggestions, and/or for elaborating on which audiences might benefit most from different suggestions.

1james.lucassen1y
Whatever you end up doing, I strongly recommend taking a learning-by-writing [https://www.cold-takes.com/learning-by-writing/] style approach (or anything else that will keep you in critical assessment mode rather than classroom mode). These ideas are nowhere near solidified enough to merit a classroom-style approach, and even if they were infallible, that's probably not the fastest way to learn them and contribute original stuff. The most common failure mode I expect for rapid introductions to alignment is just trying to absorb, rather than constantly poking and prodding to get a real working understanding. This happened to me, and wasted a lot of time.
3Jay Bailey1y
Human Compatible [https://www.amazon.com/Human-Compatible-Artificial-Intelligence-Problem-ebook/dp/B07N5J5FTS] is the first book on AI Safety I read, and I think it was the right choice. I read The Alignment problem and Superintelligence after that, and I think that's the right order if you end up reading all three, but Human Compatible is a good start.
4plex1y
Stampy [https://stampy.ai/wiki/Main_Page] has the canonical version of this: I’d like a good introduction to AI alignment. Where can I find one? [https://stampy.ai/read/Plex%27s_Answer_to_I%E2%80%99d_like_a_good_introduction_to_AI_alignment._Where_can_I_find_one%3F] Feel free to improve the answer, as it's on a wiki. It will be served via a custom interface once that's ready (prototype here [https://stampy-ui.aprillion.workers.dev/]).
1Alex Lawsen 1y
The Alignment Problem [https://brianchristian.org/the-alignment-problem/] - Easily accessible, well written and full of interesting facts about the development of ML. Unfortunately somewhat light on actual AI x-risk, but in many cases is enough to encourage people to learn more. Edit: Someone strong-downvoted this, I'd find it pretty useful to know why.  To be clear, by 'why' I mean 'why does this rec seem bad', rather than 'why downvote'. If it's the lightness on x-risk stuff I mentioned, this is useful to know, if my description seems inaccurate, this is very useful for me to know, given that I am in a position to recommend books relatively often. Happy for the reasoning to be via DM if that's easier for any reason.

Background material recommendations (popular-level audience, very short time commitment): Please recommend your favorite basic AGI safety background reading / videos / lectures / etc. For this sub-thread please only recommend background material suitable for complete newcomers to the field, with a time commitment of at most 1-2 hours. Extra bonus for explaining why you particularly like your suggestion over other potential suggestions, and/or for elaborating on which audiences might benefit most from different suggestions.

6Chris_Leong1y
I really enjoyed Wait but Why [https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html].
3plex1y
Stampy [https://stampy.ai/wiki/Main_Page] has an initial draft of an answer [https://stampy.ai/wiki/Luca%27s_Answer_to_What_is_AI_alignment%3F], but it could be improved.
2Alex Lawsen 1y
This [https://www.lesswrong.com/posts/76etTtAiKtZGGzkmi/video-and-transcript-of-presentation-on-existential-risk] talk from Joe Carlsmith. - Hits at several of the key ideas really directly given the time and technical background constraints. Like Rob's videos, implies an obvious next step for people interested in learning more, or who are suspicious of one of the claims (reading Joe's actual report, maybe even the extensive discussion of it on here).  
8Alex Lawsen 1y
Rob Miles's youtube channel, see this [https://www.youtube.com/watch?v=pYXy-A4siMw] intro. Also his video on the stop button problem for Computerphile. - Easily accessible, entertaining, videos are low cost for many people to watch, and they often end up watching several.  

Quick thought: What counts as a "company" and what counts as "one year of effort"? If Alphabet's board and directors decided for some reason to divert 99% of the company's resources towards buying up coal companies and thereby becomes a world leader in the coal industry, does that count? What if Alphabet doesn't buy the companies outright but instead headhunts all of their employees and buys all the necessary hardware and infrastructure?

Similarly, you specified that it needs to be a "tech company", but what exactly differentiates a tech company from a regu... (read more)

5Andrew_Critch1y
I agree this is an important question.  From the post: I.e., in the definition, the "company" is considered to have "chosen" once the Board and CEO have agreed to do it.  If the CEO and Board agree and make the choice but the company fails to do the thing — e.g., because the employees refuse to go along with the Board+CEO decision — then the company has failed to execute on its choice, despite "effort" (presumably, the CEO and Board telling their people and machines to do stuff that didn't end up getting done). As for what is or is not a tech company, I don't think it matters to the definition or the post or predictions, because I think only things that would presently colloquially be considered "tech companies" have a reasonable chance at meeting the remainder of the conditions in the definition.
4[comment deleted]1y

A friend pointed out on Facebook that Gato uses TPU-v3's. Not sure why - I thought Google already had v4's available for internal use a while ago? In any case, the TPU-v4 might potentially help a lot for the latency issue.

4lennart1y
They trained it on TPUv3s, however, the robot inference was run on a Geforce RTX 3090 (see section G). TPUs are mostly designed for data centers and are not really usable for on-device inference.
6Qumeric1y
Two main options:
  • It was trained e.g. 1 year ago but published only now
  • All TPU-v4s are very busy with something even more important

"More specifically, says my Inner Eliezer, it is less helpful to reason from or about one's priors about really smart, careful-thinking people making or not making mistakes, and much more helpful to think directly about the object-level arguments, and whether they seem true."

When you say it's much more helpful, do you mean it's helpful for (a) forming accurate credences about which side is in fact correct, or do you just mean it's helpful for (b) getting a much deeper understanding of the issues? If (b) then I totally agree. If (a) though, why would I expe... (read more)

Heh, no problem. At least I changed my LessWrong username from Iarwain to my real name a while back.

Darn, there goes my ability to use Iarwain as a really unusual pseudonym. I've used it off and on for almost 20 years, ever since my brother made me a new email address right after having read the LOTR appendixes.

...sincere apologies.

How about "the words 'hello world!' written on a piece of paper"? Or you could substitute "on a computer screen" instead of a piece of paper, or you could just leave out the writing medium entirely. I'm curious if it can handle simple words if asked specifically for them.

2Dave Orr1y
Added.
3Gunnar_Zarncke1y
You could go by the all-time highest voted: https://www.lesswrong.com/allPosts?timeframe=allTime

I keep having kind of off-the-cuff questions I would love to ask the community, but I don't know where the right place is to post those questions. I don't usually have the time to go polish up the questions so that they are high quality, cite appropriate sources and previous discussions, etc., but I would still like them answered! Typically these are the types of questions I might post on Facebook, but I think I would get higher quality answers here.

Do questions of this sort belong as question posts, shortform posts, or comments on the monthly open threads... (read more)

3Dagon1y
A lot depends on the question and the kinds of responses you hope to get. I don't think it's necessarily about polish and cites of previous work, but you do need enough specificity and clarity to show what you already know and what specific part of the topics you mention you're asking about. High-quality answers come from high-quality questions. There are things you can ask on Facebook which don't work well here, if you're not putting much effort into preparation. And there are things you can ask in shortform that you can't easily ask in a top-level question post (without getting downvoted). You're correct that you get less traction in those places, but the expectations are also lower. Also, shortform (and Facebook) has more expectation of refinement and discussion via comments, whereas capital-Q Question posts are generally (but not always) expected to elicit answers. All that said, you've been here long enough to get 500+ karma - I'd recommend just trying stuff out. The only way to spend that karma is to make risky posts, so get good value out of it by experimenting! I strongly expect that some questions will catch people's attention even if you're not greatly prepared in the post, and some questions won't get useful responses no matter how perfect your form is.
3Raemon1y
Either of the three options you listed are fine. Question posts don't need to be super-high-polish. Shortform and Open Threads are sort of interchangeable.

My general impression based on numerous interactions is that many EA orgs are specifically looking to hire and work with other EAs, many longtermist orgs are looking to specifically work with longtermists, and many AI safety orgs are specifically looking to hire people who are passionate about existential risks from AI. I get this to a certain extent, but I strongly suspect that ultimately this may be very counterproductive if we are really truly playing to win.

And it's not just in terms of who gets hired. Maybe I'm wrong about this, but my impression is t... (read more)

There is a precedent for doing secret work of high strategic importance, which is every intelligence agency and defense contractor ever.

in-group bias

I'm shocked, shocked, to find gambling in this establishment.

Also note that Percy Liang's Stanford Center for Research on Foundation Models seems to have a strong focus on potential risks as well as potential benefits. At least that's how it seemed to me based on their inaugural paper and from a lot of the talks at the associated workshop last year.

I think part of what I was reacting to is a kind of half-formed argument that goes something like:

  • My prior credence is very low that all these really smart, carefully thought-through people are making the kinds of stupid or biased mistakes they are being accused of.
  • In fact, my prior for the above is sufficiently low that I suspect it's more likely that the author is the one making the mistake(s) here, at least in the sense of straw-manning his opponents.
  • But if that's the case then I shouldn't trust the other things he says as much, because it looks lik
... (read more)
2Eli Tyre1y
Right. And according to Zvi's posit above, a large part of the point of this dialog is that that class of implicit argument is not actually good reasoning (acknowledging that you don't endorse this argument). More specifically, says my Inner Eliezer, it is less helpful to reason from or about one's priors about really smart, careful-thinking people making or not making mistakes, and much more helpful to think directly about the object-level arguments, and whether they seem true.

Meta-comment:

I noticed that I found it very difficult to read through this post, even though I felt the content was important, because of the (deliberately) condescending style. I also noticed that I'm finding it difficult to take the ideas as seriously as I think I should, again due to the style. I did manage to read through it in the end, because I do think it's important, and I think I am mostly able to avoid letting the style influence my judgments. But I find it fascinating to watch my own reaction to the post, and I'm wondering if others have any (co... (read more)

5Jotto9991y
My posting this comment will be contrary to the moderation disclaimer advising not to talk about tone.  But FWIW, I react similarly and I skip reading things written in this way, interpreting them as manipulating me into believing the writer is hypercompetent.
-7Charlie Sanders2y

When I try to mentally simulate negative reader-reactions to the dialogue, I usually get a complicated feeling that's some combination of:

  • Some amount of conflict aversion: Harsh language feels conflict-y, which is inherently unpleasant.
  • Empathy for, or identification with, the people or views Eliezer was criticizing. It feels bad to be criticized, and it feels doubly bad to be told 'you are making basic mistakes'.
  • Something status-regulation-y: My reader-model here finds the implied threat to the status hierarchy salient (whether or not Eliezer is just tryin
... (read more)

I had a pretty strong negative reaction to it. I got the feeling that the post derives much of its rhetorical force from setting up an intentionally stupid character who can be condescended to, and that this is used to sneak in a conclusion that would seem much weaker without that device.

2Pattern2y
It's not just a meta issue. The way it's written has a big impact on how to engage with it. I dealt with this by reading it and trying to be critical. The comment this produced was (predictably) downvoted.
Zvi2yΩ3065

Things I instinctively observed slash that my model believes that I got while reading that seem relevant, not attempting to justify them at this time:

  1. There is a core thing that Eliezer is trying to communicate. It's not actually about timeline estimates, that's an output of the thing. Its core message length is short, but all attempts to find short ways of expressing it, so far, have failed.
  2. Mostly so have very long attempts to communicate it and its prerequisites, which to some extent at least includes the Sequences. Partial success in some cases, full suc
... (read more)

I find it concerning that you felt the need to write "This is not at all a criticism of the way this post was written. I am simply curious about my own reaction to it" (and still got downvoted?).

For my part, I both believe that this post contains valuable content and good arguments, and that it was annoying / rude / bothersome in certain sections.

9Rafael Harth2y
1: To me, it made it more entertaining and thus easier to read. (No idea about non-anecdotal data, would also be interested.) 3: Also no data; I strongly suspect the metric is generally good because... actually I think it's just because the people I find worth listening to are overwhelmingly not condescending. This post seems highly unusual in several ways.

I've gotten one private message expressing more or less the same thing about this post, so I don't think this is a super unusual reaction.

Thank you for articulating this. This matches closely with my own thoughts re Eliezer's recently published discussion. I strongly agree that if Eliezer is in fact correct then the single most effective thing we could do is to persuasively show that to be true. Right now it's not even persuasive to many / most alignment researchers, let alone anybody else.

Conditional on Eliezer being wrong though, I'm not sure how valuable showing him to be wrong would be. Presumably it would depend on why exactly he's wrong, because if we knew that then we might be able to... (read more)

Why isn’t there a persuasive write-up of the “current alignment research efforts are doomed” theory?

EY wrote hundreds of thousands of words to show that alignment is a hard and important problem. And it worked! Lots of people listened and started researching this.

But that discussion now claims these efforts are no good. And I can't find good evidence, other than folks talking past each other.

I agree with everything in your comment except the value of showing EY’s claim to be wrong:

  • Believing a problem is harder than it is can stop you from finding creative
... (read more)

Could someone please steelman the position of people who disagree strongly with this book? Which parts of the book are considered factually or logically incorrect, which parts do people object to so strongly on moral grounds, etc.?

3Ape in the coat2y
There are a couple [https://www.amazon.com/Mismeasure-Man-Revised-Expanded/dp/0393314251] of [https://www.amazon.com/Intelligence-Genes-Success-Scientists-Statistics/dp/0387949860] books [https://www.amazon.com/Inequality-Design-Cracking-Bell-Curve/dp/0691028982] pointing out mistakes and methodological problems with the Bell Curve. Maybe we will get their reviews as well in the future. If you are not allergic to long youtube videos, you may be interested in this [https://www.youtube.com/watch?v=UBc7qBS1Ujo] fairly reasonable and thorough critique of the Bell Curve from the left. In a nutshell, Murray bases his conclusions on a bunch of epistemically poor research, often uses shady methodology, contradicts himself in a way that hints at the book being written in bad faith, and, despite his neutral tone, smuggles in a harmful political agenda.

Probably the most obvious example of bad research is Richard Lynn's review, which Murray uses as a control group for African-Americans. The data for this review was cherry-picked, ironically didn't produce a bell curve, wasn't meant to be translated into IQ scores, and was mainly collected from South Africans during apartheid. Murray's impressive claim that IQ score is a better predictor of success than parental socioeconomic status becomes much less impressive once we know that Murray omits many possible factors when computing SES; as soon as we include them, social status becomes a better predictor of success than IQ.

As for contradictions, the Bell Curve goes from stating that even if intelligence were 100% genetically determined it wouldn't change anything about our society, to claiming that due to 60% heritability of intelligence we must implement specific policies or else we are in trouble. In between, Murray conflates the heritability of a trait in a population with the degree to which the trait is genetically determined in an individual, despite mentioning that these are different things.

For what it's worth, I know of at least one decision theorist who is very familiar with and closely associated with the LessWrong community who, at least at one point not long ago, leaned toward two-boxing. I think he may have changed his mind since then, but this is at least a data point showing that it's not a given that philosophers closely aligned with LessWrong-style thinking necessarily one-box.

2Rob Bensinger2y
Yeah, I see possible signs of this in the survey data itself -- decision theorists strongly favor two-boxing, but a lot of their other answers are surprisingly LW-like if there's no causal connection like 'decision theorists are unusually likely to read LW'. It's one reasonable explanation, anyway.

I see that many people are commenting how it's crazy to try to keep things secret between coworkers, or to not allow people to even mention certain projects, or that this kind of secrecy is psychologically damaging, or the like.

Now, I imagine this is heavily dependent on exactly how it's implemented, and I have no idea how it's implemented at MIRI. But just as a relevant data point - this kind of secrecy is totally par for the course for anybody who works for certain government and especially military-related organizations or contractors. You need extensiv... (read more)

Some secrecy between coworkers could be reasonable. Including secrecy about what secret projects exist (e.g. "we're combining AI techniques X and Y and applying them to application Z first as a test").

What seemed off is that the only information concealed by the policy in question (that researchers shouldn't ask each other what they're working on) is who is and isn't recently working on a secret project. That isn't remotely enough information to derive AI insights to any significant degree. Doing detective work on "who started saying they had secrets at ... (read more)

Bit of a side point: One thing I got from spending a lot of time in the philosophy department was an appreciation for just how differently many philosophers think compared to how I tend to think. I spent a substantial fraction of my time in college trying to get at the roots of those differences - what exactly are the differences, what are the cruxes of the disagreements, and is there any way to show that one perspective is better than another? (And before anyone asks - nope, I still don't have good answers to any of those.)

It opened my eyes to the existen... (read more)
