AMA: Ajeya Cotra, researcher at Open Phil

Ajeya

Effective Altruism Forum

Jan 28, 2021 · 1 min read

Tags: Cause prioritization, AI safety, Forecasting, Ask Me Anything, AI alignment, Forum Prize, Longtermism, AI forecasting, Open Philanthropy

[EDIT: Thanks for the questions everyone! Just noting that I'm mostly done answering questions, and there were a few that came in Tuesday night or later that I probably won't get to.]

Hi everyone! I’m Ajeya, and I’ll be doing an Ask Me Anything here. I’ll plan to start answering questions Monday Feb 1 at 10 AM Pacific. I will be blocking off much of Monday and Tuesday for question-answering, and may continue to answer a few more questions through the week if there are ones left, though I might not get to everything.

About me: I’m a Senior Research Analyst at Open Philanthropy, where I focus on cause prioritization and AI. 80,000 Hours released a podcast episode with me last week discussing some of my work, and last September I put out a draft report on AI timelines which is discussed in the podcast. Currently, I’m trying to think about AI threat models and how much x-risk reduction we could expect the “last long-termist dollar” to buy. I joined Open Phil in the summer of 2016, and before that I was a student at UC Berkeley, where I studied computer science, co-ran the Effective Altruists of Berkeley student group, and taught a student-run course on EA.

I’m most excited about answering questions related to AI timelines, AI risk more broadly, and cause prioritization, but feel free to ask me anything!

Comments (106)

kokotajlod

Hi Ajeya! I'm a huge fan of your timelines report, it's by far the best thing out there on the topic as far as I know. Whenever people ask me to explain my timelines, I say "It's like Ajeya's, except..."

My question is, how important do you think it is for someone like me to do timelines research, compared to other kinds of research (e.g. takeoff speeds, alignment, acausal trade...)

I sometimes think that even if I managed to convince everyone to shift from median 2050 to median 2032 (an obviously unlikely scenario!), it still wouldn't matter much because people's decisions about what to work on are mostly driven by considerations of tractability, neglectedness, personal fit, importance, etc. and even that timelines difference would be a relatively minor consideration. On the other hand, intuitively it does feel like the difference between 2050 and 2032 is a big deal and that people who believe one when the other is true will probably make big strategic mistakes.

Bonus question: Murphyjitsu: Conditional on TAI being built in 2025, what happened? (i.e. how was it built, what parts of your model were wrong, what do the next 5 years look like, what do the 5 years after 2025 look like?)

Ajeya

Thanks so much, that's great to hear! I'll answer your first question in this comment and leave a separate reply for your Murphyjitsu question.

First of all, I definitely agree that the difference between 2050 and 2032 is a big deal and worth getting to the bottom of; it would make a difference to Open Phil's prioritization (and internally we're trying to do projects that could convince us of timelines significantly shorter than in my report). You may be right that it could have a counterintuitively small impact on many individual people's career choices, for the reasons you say, but I think many others (especially early career people) would and should change their actions substantially.

I think there are roughly three types of reasons why Bob might disagree with Alice about a bottom line conclusion like TAI timelines, which correspond to three types of research or discourse contributions Bob could make in this space:

1. Disagreements can come from Bob knowing more facts than Alice about a key parameter, which can allow Bob to make "straightforward corrections" to Alice's proposed value for that parameter. E.g., "You didn't think much about hardware, but I did a solid research project...

Ajeya

It occurred to me that another way to try to move someone on complicated category 3 disagreements might be to put together a well-constructed survey of a population that the person is inclined to defer to. This approach is definitely still tricky: you'd have to convince the person that the relevant population was provided with the strongest arguments for that person's view in addition to your counterarguments, and that the individuals surveyed were thinking about it reasonably hard. But if done well, it could be pretty powerful.

kokotajlod

Thanks, this was a surprisingly helpful answer, and I had high expectations! This is updating me somewhat towards doing more blog posts of the sort that I've been doing. As it happens, I have a draft of one that is very much Category 3, let me know if you are interested in giving comments! Your sense of why we disagree is pretty accurate, I think. The only thing I'd add is that I do think we should update downwards on low-end compute scenarios because of market efficiency considerations, just not as strongly as you perhaps, and moreover I also think that we should update upwards for various reasons (the surprising recent successes of deep learning, the fact that big corporations are investing heavily-by-historical-standards in AI, the fact that various experts think they are close to achieving AGI) and the upwards update mostly cancels out the downwards update IMO.

kokotajlod

Update: The draft I mentioned is now a post!

richard_ngo

An extension of Daniel's bonus question:

If I condition on your report being wrong in an important way (either in its numerical predictions, or via conceptual flaws) and think about how we might figure that out today, it seems like two salient possibilities are inside-view arguments and outside-view arguments.

The former are things like "this explicit assumption in your model is wrong". E.g. I count my concern about the infeasibility of building AGI using algorithms available in 2020 as an inside-view argument.

The latter are arguments that, based on the general difficulty of forecasting the future, there's probably some upcoming paradigm shift or crucial consideration which will have a big effect on your conclusions (even if nobody currently knows what it will be).

Are you more worried about the inside-view arguments of current ML researchers, or outside-view arguments?

Ajeya

I generally spend most of my energy looking for inside-view considerations that might be wrong, because they are more likely to suggest a particular directional update (although I'm not focused only on inside view arguments specifically from ML researchers, and place a lot of weight on inside view arguments from generalists too). It's often hard to incorporate the most outside-view considerations into bottom line estimates, because it's not clear what their implication should be. For example, the outside-view argument "it's difficult to forecast the future and you should be very uncertain" may imply spreading probability out more widely, but that would involve assigning higher probabilities to TAI very soon, which is in tension with another outside view argument along the lines of "Predicting something extraordinary will happen very soon has a bad track record."

Aryeh Englander

Shouldn't a combination of those two heuristics lead to spreading out the probability but with somewhat more probability mass on the longer-term rather than the shorter term?

Ajeya

That's fair, and I do try to think about this sort of thing when choosing e.g. how wide to make my probability distributions and where to center them; I often make them wider than feels reasonable to me. I didn't mean to imply that I explicitly avoid incorporating such outside view considerations, just that returns to further thinking about them are often lower by their nature (since they're often about unknown unknowns).

Aryeh Englander

True. My main concern here is the lamppost issue (looking under the lamppost because that's where the light is). If the unknown unknowns affect the probability distribution, then personally I'd prefer to incorporate that or at least explicitly acknowledge it. Not a critique - I think you do acknowledge it - but just a comment.
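
A rough way to see how the two outside-view heuristics discussed above could combine is to compare near-term probabilities under a toy distribution. The sketch below is illustrative only: the lognormal shape, the 2021 reference year, and every median and spread value are assumptions chosen for the example, not figures from Ajeya's report or from anyone in this thread.

```python
import math
from statistics import NormalDist

NOW = 2021  # assumed reference year (roughly when this thread was written)


def p_tai_before(year: float, median_year: float, sigma: float) -> float:
    """P(TAI arrives before `year`) if years-until-TAI is lognormal.

    `median_year` is the median arrival year and `sigma` is the spread of
    log(years until TAI). Both are hypothetical inputs for this sketch.
    """
    z = (math.log(year - NOW) - math.log(median_year - NOW)) / sigma
    return NormalDist().cdf(z)


# All three parameter sets are invented for illustration.
base = p_tai_before(2030, median_year=2050, sigma=0.6)         # narrower forecast
wider = p_tai_before(2030, median_year=2050, sigma=1.0)        # "be more uncertain"
wider_later = p_tai_before(2030, median_year=2060, sigma=1.0)  # wider AND later median

print(f"baseline:               {base:.3f}")
print(f"wider only:             {wider:.3f}")
print(f"wider and later median: {wider_later:.3f}")
```

Under these made-up parameters, widening alone raises the probability of TAI before 2030, while widening and also moving the median later gives a value between the other two. That is the pattern Aryeh describes: more spread overall, with the extra mass going mostly to later years.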

MichaelA

Just in case any readers would misinterpret that statement: I'm pretty sure that what Daniel is saying is unlikely is not that TAI will be built in 2032, but rather that he would be able to convince everyone to shift their median to that date. I think Daniel's median for when TAI will be built is indeed somewhere around 2032 or perhaps sooner. (I think that based on conversations around October and this post. Daniel can of course correct me if I'm wrong!) (Maybe no readers would've misinterpreted Daniel anyway and this is just a weird comment...)

kokotajlod

Yep, my current median is something like 2032. It fluctuates depending on how I estimate it, sometimes I adjust it up or down a bit based on how I'm feeling in the moment and recent updates, etc.

Ajeya

On the object level, I think it would probably turn out to be the case that a) I was wrong about horizon length and something more like ~1 token was sufficient, b) I was wrong about model size and something more like ~10T parameters was sufficient. On a deeper level, it would mean I was wrong about the plausibility of ultra-sudden takeoff and shouldn't have placed as much weight as I did on the observation that AI isn't generating a lot of annual revenue right now and its value-added seems to have been increasing relatively smoothly so far. I would guess that the model looks like a scaled-up predictive model (natural language and/or code), perhaps combined with simple planning or search. Maybe a coding model rapidly trains more-powerful successors in a pretty classically Bostromian / Yudkowskian way. Since this is a pretty Bostromian scenario, and I haven't thought deeply about those scenarios, I would default to guessing that the world after looks fairly Bostromian, with risks involving the AI forcibly taking control of most of the world's resources, and the positive scenario involving cooperatively using the AI to prevent other x-risks (including risks from other AI projects).

[anonymous]

Re why AI isn't generating much revenue - have you considered the productivity paradox? It's historically normal that productivity slows down before steeply increasing when a new general-purpose technology arrives. See "Why Future Technological Progress Is Consistent with Low Current Productivity Growth" in "Artificial Intelligence and the Modern Productivity Paradox".

MaxRa

Not sure how relevant, but I saw that Gwern seems to think this comes from a bottleneck of people who can apply AI, not from current AI being insufficient. And the lack of coders may rapidly disappear soon-ish, right? At least in Germany, studying ML seems to have been very popular for a couple of years now.

Ajeya

In some sense I agree with gwern that the reason ML hasn't generated a lot of value is because people haven't put in the work (both coding and otherwise) needed to roll it out to different domains, but (I think unlike gwern) the main inference I make from that is that it wouldn't have been hugely profitable to put in the work to create ML-based applications (or else more people would have been diverted from other coding tasks to the task of rolling out ML applications).

gwern

I mostly agree with that, with the further caveat that I tend to think the low value reflects not that ML is useless but the inertia of a local optimum where the gains from automation are low because so little else is automated and vice-versa ("automation as colonization wave"). This is part of why, I think, we see the broader macroeconomic trends like big tech productivity pulling away: many organizations are just too incompetent to meaningfully restructure themselves or their activities to take full advantage. Software is surprisingly hard from a social and organizational point of view, and ML more so. A recent example is coronavirus/remote-work: it turns out that remote is in fact totally doable for all sorts of things people swore it couldn't work for - at least when you have a deadly global pandemic solving the coordination problem...

As for my specific tweet, I wasn't talking about making $$$ but just doing cool projects and research. People should be a little more imaginative about applications. Lots of people angst about how they can possibly compete with OA or GB or DM, but the reality is, as crowded as specific research topics like 'yet another efficient Transformer variant' m...

Ajeya

Ah yeah, that makes sense -- I agree that a lot of the reason for low commercialization is local optima, and also agree that there are lots of cool/fun applications that are left undone right now.

Linda Linsefors

What type of funding opportunities related to AI Safety would OpenPhil want to see more of?

Is there anything else you can tell me about the funding situation with regard to AI Safety? I'm very confused about why more people and projects don't get funded. Is it because there is not enough money, or is there some bottleneck related to evaluation and/or trust?

Ajeya

I primarily do research rather than grantmaking, but I can give my speculations about what grant opportunities people on the grantmaking side of the organization would be excited about. In general, I think it's exciting when there is an opportunity to fund a relatively senior person with a strong track record who can manage or mentor a number of earlier-career people, because that provides an opportunity for exponential growth in the pool of people who are working on these issues. For example, this could look like funding a new professor who is aligned with our priorities in a sub-area and wants to mentor students to work on problems we are excited about in that sub-area. In terms of why more people and projects don't get funded: at least at Open Phil, grantmakers generally try not to evaluate large numbers of applications or inquiries from earlier-career people individually, because each evaluation can be fairly intensive but the grant size is often relatively small; grantmakers at Open Phil prefer to focus on investigations that could lead to larger grants. Open Phil does offer some scholarships for early career researchers (e.g. here and here), but in general we prefer that this sort of grantmaking be handled by organizations like EA Funds.

Linch

Looking at the mistakes you've made in the past, what fraction of your (importance-weighted) mistakes would you classify as:

Not being aware of the relevant empirical details/facts (that are both in principle and in practice within your ability to find), versus
Being wrong about stuff due to reasoning errors (that are both in principle and in practice within your ability to correct for)?

And what ratios would you assign to this for EAs/career EAs in general?

For context, a coworker and I recently had a discussion about, loosely speaking, whether it was more important for junior researchers within EA to build domain knowledge or general skills. Very very roughly, my coworker was more on the former case because he thought that EAs had an undersupply of domain knowledge over so-called "generalist skills." However, I leaned more on the latter side of this debate because I weakly believe that more of my mistakes (and more of my most critical mistakes) were due to errors of cognition rather than insufficient knowledge of facts. (Obviously credit assignment is hard in both cases).

Ajeya

I think the inclusion of "in principle" makes the answer kind of boring -- when we're not thinking about practicality at all, I think I'd definitely prefer to know more facts (about e.g. the future of AI or what would happen in the world if we pursued strategy A vs strategy B) than to have better reasoning skills, but that's not a very interesting answer. In practice, I'm usually investing a lot more in general reasoning, because I'm operating in a domain (AI forecasting and futurism more generally) where it's pretty expensive to collect new knowledge/facts, it's pretty difficult to figure out how to connect facts about the present to beliefs about the distant future, and facts you could gather in 2021 are fairly likely to be obsoleted by new developments in 2022. So I would say most of my importance-weighted errors are going to be in the general reasoning domain. I think it's fairly similar for most people at Open Phil, and most EAs trying to do global priorities research or cause prioritization, especially within long-termism. I think the more object-level your work is, the more likely it is that your big mistakes will involve being unaware of empirical details. However, investing in general reasoning doesn't often look like "explicitly practicing general reasoning" (e.g. doing calibration training, studying probability theory or analytic philosophy, etc). It's usually incidental improvement that's happening over the course of a particular project (which will often involve developing plenty of content knowledge too).

MichaelA

Interesting answer. Given that, could you say a bit more about how "investing in general reasoning" differs from "just working on projects based on what I expect to be directly impactful / what my employer said I should do", and from "trying to learn content knowledge about some domain(s) while forming intuitions, theories, and predictions about those domain(s)"? I.e., concretely, what does your belief that "investing in general reasoning" is particularly valuable lead you to spend more or less time doing (compared to if you believed content knowledge was particularly valuable)? Your other reply in this thread makes me think that maybe you actually think people should basically just spend almost all of their time directly working on projects they expect to be directly impactful, and trust that they'll pick up both improvements in their general reasoning skills and content knowledge along the way? For a concrete example: About a month ago, I started making something like 3-15 Anki cards a day as I do my research (as well as learning random things on the side, e.g. from podcasts), and I'm spending something like 10-30 minutes a day reviewing them. This will help with the specific, directly impactful things I'm working on, but it's not me directly working on those projects - it's an activity that's more directly focused on building content knowledge. What would be your views on the value of that sort of thing? (Maybe the general reasoning equivalent would be spending 10-30 minutes a day making forecasts relevant to the domains one is also concurrently doing research projects on.)

Ajeya

Personally, I don't do much explicit, dedicated practice or learning of either general reasoning skills (like forecasts) or content knowledge (like Anki decks); virtually all of my development on these axes comes from "just doing my job." However, I don't feel strongly that this is how everyone should be -- I've just found that this sort of explicit practice holds my attention less and subjectively feels like a less rewarding and efficient way to learn, so I don't invest in it much. I know lots of folks who feel differently, and do things like Anki decks, forecasting practice, or both.

MichaelA

Oh, actually, that all mainly relates to just one underlying reason why the sort of question Linch and I have in mind matters, which is that it could inform how much time EA researchers spend on various different types of specific tasks in their day-to-day work, and what goals they set for themselves on the scale of weeks/months. Another reason this sort of question matters is that it could inform whether researchers/orgs:

1. Invest time in developing areas of expertise based essentially around certain domains of knowledge (e.g., nuclear war, AI risk, politics & policy, consciousness), and try to mostly work within those domains (even when they notice a specific high-priority question outside of that domain which no one else is tackling, or when someone asks them to tackle a question outside of that domain, or similar)
2. Try to become skilled generalists, tackling whatever questions seem highest priority on the margin in a general sense (without paying too much attention to personal fit), or whatever questions people ask them to tackle, or similar, even if those questions are in domains they currently have very little expertise in

(This is of course really a continuum. And there could be other options that aren't highlighted by the continuum - e.g. developing expertise in some broadly applicable skillsets like forecasting or statistics or maybe policy analysis, and then applying those skills wherever seems highest priority on the margin.)

So I'd be interested in your thoughts on that tradeoff as well. You suggesting that improving on general reasoning often (in some sense) matters more than improving on content knowledge would seem to maybe imply that you lean a bit more towards option 2 in many cases?

Ajeya

My answer to this one is going to be a pretty boring "it depends" unfortunately. I was speaking to my own experience in responding to the top level question, and since I do a pretty "generalist"-y job, improving at general reasoning is likely to be more important for me. At least when restricting to areas that seem highly promising from a long-termist perspective, I think questions of personal fit and comparative advantage will end up determining the degree to which someone should be specialized in a particular topic like machine learning or biology. I also think that often someone who is a generalist in terms of topic areas still specializes in a certain kind of methodology, e.g. researchers at Open Phil will often do "back of the envelope calculations" (BOTECs) in several different domains, effectively "specializing" in the BOTEC skillset.

MichaelA

I'm the coworker in question, and to clarify a little, my position was more like "It's probably quite useful to build expertise in some area or cluster of areas by building lots of content knowledge in that area/those areas. And this seems worth doing for a typical full-time EA researcher even at the cost of having less time available to work on building general reasoning skills." And that in turn is partly because I'd guess that it'd be much harder for a typical full-time EA researcher to make substantial further progress on their general reasoning skills than on their content knowledge. I'd agree there's a major "undersupply" of general reasoning skills in the sense that all humans are way worse at general reasoning than would be ideal and than seems theoretically possible (if we stripped away all biases, added loads of processing power, etc.). I think Linch and I disagree more on how easy it is to make progress towards that ideal (for a typical full-time EA researcher), rather than on how valuable such progress would be. (I think we also disagree on how important more content knowledge tends to be.) And I don't think I'd say this for most non-EAs. E.g., I think I might actually guess that most non-EAs would benefit more from either reading Rationality: AI to Zombies or absorbing the ideas from it in some other way more fitting for the person (e.g., workshops, podcasts, discussions), rather than spending the same amount of time learning facts and concepts from important domains. (Though I guess I feel unsure precisely what I'm saying or what it means. E.g., I'd feel tempted to put "learning some core concepts from economics and some examples of how they're applied" in the "improving general reasoning" bucket in addition to the "improving content knowledge" bucket.) In any case, all of my views here are vaguely stated and weakly held, and I'd be very interested to hear Ajeya's thoughts on this!

Ajeya

In my reply to Linch, I said that most of my errors were probably in some sense "general reasoning" errors, and a lot of what I'm improving over the course of doing my job is general reasoning. But at the same time, I don't think that most EAs should spend a large fraction of their time doing things that look like explicitly practicing general reasoning in an isolated or artificial way (for example, re-reading the Sequences, studying probability theory, doing calibration training, etc). I think it's good to be spending most of your time trying to accomplish something straightforwardly valuable, which will often incidentally require building up some content expertise. It's just that a lot of the benefit of those things will probably come through improving your general skills.

Linch

Apologies if I misrepresented your stance! Was just trying to give my own very rough overview of what you said. :)

MichaelA

Yeah, that makes sense, and no need to apologise. I think your question was already useful without me adding a clarification of what my stance happens to be. I just figured I may as well add that clarification.

Alex HT

I’d be keen to hear your thoughts about the (small) field of AI forecasting and its trajectory. Feel free to say whatever’s easiest or most interesting. Here are some optional prompts:

Do you think the field is progressing ‘well’, however you define ‘well’?
What skills/types of people do you think AI forecasting needs?
What does progress look like in the field? E.g., does it mean producing a more detailed report, getting a narrower credible interval, getting better at making near-term AI predictions... (relatedly, how do we know if we're making progress?)
Can you make any super rough predictions like ‘by this date I expect we’ll be this good at AI forecasting’?

Aryeh Englander

I know you asked Ajeya, but I'm going to add my own unsolicited opinion that we need more people with professional risk analysis backgrounds, and if we're going to do expert judgment elicitations as part of forecasting then we need people with professional elicitation backgrounds. Properly done elicitations are hard. (Relevant background: I led an AI forecasting project for about a year.)

Ajeya

Hm, I think I'd say progress at this stage largely looks like being better able to cash out disagreements about big-picture and long-term questions in terms of disagreements about more narrow, empirical, or near-term questions, and then trying to further break down and ultimately answer these sub-questions to try to figure out which big picture view(s) are most correct. I think given the relatively small amount of effort put into it so far and the intrinsic difficulty of this project, returns have been pretty good on that front -- it feels like people are having somewhat narrower and more tractable arguments as time goes on. I'm not sure about what exact skillsets the field most needs. I think the field right now is still in a very early stage and could use a lot of disentanglement research, and it's often pretty chaotic and contingent what "qualifies" someone for this kind of work. Deep familiarity with the existing discourse and previous arguments/attempts at disentanglement is often useful, and some sort of quantitative background (e.g. economics or computer science or math) or mindset is often useful, and subject matter expertise (in this case machine learning and AI more broadly) is often useful, but none of these things are obviously necessary or sufficient. Often it's just that someone happens to strike upon an approach to the question that has some purchase, they write it up on the EA Forum or LessWrong, and it strikes a chord with others and results in more progress along those lines.

MichaelA

Interesting answer :) That made me think to ask the following questions, which are sort-of a tangent and sort-of a generalisation of the kind of questions Alex HT asked: (These questions are inspired by a post by Max Daniel.)

1. Do you think many major insights from longtermist macrostrategy or global priorities research have been found since 2015?
2. If so, what would you say are some of the main ones?
3. Do you think the progress has been at a good pace (however you want to interpret that)?
4. Do you think that this pushes for or against allocating more resources (labour, money, etc.) towards that type of work?
5. Do you think that this suggests we should change how we do this work, or emphasise some types of it more?

(Feel very free to just answer some of these, answer variants of them, etc.)

Ajeya

1. I think "major insights" is potentially a somewhat loaded framing; it seems to imply that only highly conceptual considerations that change our minds about previously-accepted big picture claims count as significant progress. I think very early on, EA produced a number of somewhat arguments and considerations which felt like "major insights" in that they caused major swings in the consensus of what cause areas to prioritize at a very high level; I think that probably reflected that the question was relatively new and there was low-hanging fruit. I think we shouldn't expect future progress to take the form of "major insights" that wildly swing views about a basic, high-level question as much (although I still think that's possible).
2. Since 2015, I think we've seen good analysis and discussion of AI timelines and takeoff speeds, discussion of specific AI risks that go beyond the classic scenario presented in Superintelligence, better characterization of multipolar and distributed AI scenarios, some interesting and more quantitative debates on giving now vs giving later and "hinge of history" vs "patient" long-termism, etc. None of these have provided definitive / authoritative answers, but they all feel useful to me as someone trying to prioritize where Open Phil dollars should go.
3. I'm not sure how to answer this; I think taking into account the expected low-hanging fruit effect, and the relatively low investment in this research, progress has probably been pretty good, but I'm very uncertain about the degree of progress I "should have expected" on priors.
4. I think ideally the world as a whole would be investing much more in this type of work than it is now. A lot of the bottleneck to this is that the work is not very well-scoped or broken into tractable sub-problems, which makes it hard for a large number of people to be quickly on-boarded to it.
5. Related to the above, I'd love for the work to become better-scoped over time -- this is one thing we

MichaelA

Thanks! Yeah, to be clear, I don't intend to imply that we should expect there to have been many "major insights" after EA's early years, or that that's the only thing that's useful. Tobias Baumann said on Max's post: That's basically my view too, and it sounds like your view is sort-of similar. Though your comment makes me notice that some things that don't seem explicitly captured by either Max's question or Tobias's response are:

* better framings and disentanglement, to lay the groundwork for future minor/major insights
  * I.e., things that help make topics more "well-scoped or broken into tractable sub-problems"
* better framings, to help us just think through something or be able to form intuitions more easily/reliably
* things that are more concrete and practical than what people usually think of as "insights"
  * E.g., better estimates for some parameter

(ETA: I've now copied Ajeya's answer to these questions as an answer to Max's post.)

Arepo3y10

One issue I feel the EA community has badly neglected is the probability, given various (including modest) civilizational backslide scenarios, of us still being able to develop (and *actually* developing) the economies of scale needed to become an interstellar species.

To give a single example, a runaway Kessler effect could make putting anything in orbit basically impossible unless governments overcome the global tragedy of the commons and mount an extremely expensive mission to remove enough debris to regain effective orbital access - in a world where we've lost satellite technology and everything that depends on it.

EAs so far seem to have treated 'humanity doesn't go extinct' in scenarios like this as equivalent to 'humanity reaches its interstellar potential', which seems very dangerous to me - intuitively, it feels like there's at least a 1% chance that we wouldn't ever solve such a problem in practice, even if civilisation lasted for millennia afterwards. If so, then we should be treating it as (at least) 1/100th of an existential catastrophe - and a couple of orders of magnitude doesn't seem like that big a deal, especially if there are many more such scenarios than there are extinction-causing ones.

Do you have any thoughts on how to model this question in a generalisable way that it could give a heuristic for non-literal-extinction GCRs? Or do you think one would need to research specific GCRs to answer it for each of them?
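The arithmetic in the comment above can be made concrete with a minimal sketch. It assumes each scenario is summarised by two numbers, the chance it happens and the chance humanity never recovers its long-term potential afterwards, and that scenarios can simply be summed. Every probability below is a hypothetical placeholder chosen for illustration, not an estimate from this thread.

```python
def expected_existential_loss(scenarios):
    """Sum of P(scenario occurs) * P(potential is never recovered | scenario).

    Each scenario is a (p_occurs, p_unrecoverable) pair. Summing treats the
    scenarios as non-overlapping, which is a simplifying assumption.
    """
    return sum(p_occurs * p_unrecoverable for p_occurs, p_unrecoverable in scenarios)


# Hypothetical numbers, for illustration only.
extinction_scenarios = [(0.001, 1.0)]      # extinction is unrecoverable by definition
backslide_scenarios = [(0.02, 0.01)] * 5   # five backslides, each with a 1% chance of never recovering

extinction_loss = expected_existential_loss(extinction_scenarios)
backslide_loss = expected_existential_loss(backslide_scenarios)
```

With these made-up inputs the two totals come out roughly equal, which is the shape of the argument: a 1/100 discount per scenario can be offset if such scenarios are more numerous or more likely than extinction-causing ones.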

abergal3y10

Also a big fan of your report. :)

Historically, what has caused the subjectively biggest-feeling updates to your timelines views? (e.g. arguments, things you learned while writing the report, events in the world).

Ajeya3y10

Thanks! :)

The first time I really thought about TAI timelines was in 2016, when I read Holden's blog post. That got me to take the possibility of TAI soonish seriously for the first time (I hadn't been explicitly convinced of long timelines earlier or anything, I just hadn't thought about it).

Then I talked more with Holden and technical advisors over the next few years, and formed the impression that many technical advisors believed a relatively simple argument: if a brain-sized model could be transformative, then there's a relatively tight argument implying it would take X FLOP to train it, which would become affordable in the next couple decades. That meant that if we had a moderate probability on the first premise, we should have a moderate probability on TAI in the next couple decades. This made me take short timelines even more seriously because I found the biological analogy intuitively appealing, and I didn't think that people who confidently disagreed had strong arguments against it.

Then I started digging into those arguments in mid-2019 for the project that ultimately became the report, and I started to be more skeptical again because it seemed tha... (read more)

EdoArad3y9

What cause-prioritization efforts would you most like to see from within the EA community?

Ajeya3y12

I'm most interested in forecasting work that could help us figure out how much to prioritize AI risk over other x-risks, for example estimating transformative AI timelines, trying to characterize what the world would look like in between now and transformative AI, and trying to estimate the magnitude of risk from AI.

tamgent3y9

If a magic fairy gave you 10 excellent researchers from a range of relevant backgrounds who were to work on a team together to answer important questions about the simulation hypothesis, what are the top n research questions you'd be most excited to discover they are pursuing?

Ajeya3y16

I'm afraid I don't have crisp enough models of the simulation hypothesis and related sub-questions to have a top n list. My biggest question is something more like "This seems like a pretty fishy argument, and I find myself not fully getting or buying it despite not being able to write down a simple flaw. What's up with that? Can somebody explain away my intuition that it's fishy in a more satisfying way and convince me to buy it more wholeheartedly, or else can someone pinpoint the fishiness more precisely?" My second biggest question is something like "Does this actually have any actionable implications for altruists/philanthropists? What are they, and can you justify them in a way that feels more robust and concrete and satisfying than earlier attempts, like Robin Hanson's How to Live in a Simulation?"

MichaelA3y8

Thanks for doing this AMA!

I'd be interested to hear about what you or Open Phil include (and prioritise) within the "longtermism" bucket. In particular, I'm interested in things like:

When you/Open Phil talk about existential risk, are you (1) almost entirely concerned about extinction risk specifically, (2) mostly concerned about extinction risk specifically, or (3) somewhat similarly concerned about extinction risk and other existential risks (i.e., risks of unrecoverable collapse or unrecoverable dystopias)?
When you/Open Phil talk about longtermism, are yo

... (read more)

Ajeya

* I'd say that we're interested in all three of preventing outright extinction, preventing some other kind of existential catastrophe, and in trajectory changes such as moving probability mass from "okay" worlds to "very good" worlds; I would expect some non-trivial fraction of our impact to come from all of those channels. However, I'm unsure how much weight each of these scenarios should get -- that depends on various complicated empirical and philosophical questions we haven't fully investigated (e.g. "What is the probability civilization would recover from collapse of various types?" and "How morally valuable should we think it is if the culture which arises after a recovery from collapse is very different from our current culture, and that culture is the one which gets to determine the long-term future?"). In practice our grantmaking isn't making fine-grained distinctions between these or premised on one particular channel of impact: biosecurity and pandemic preparedness grantmaking may help prevent both outright extinction and civilizational collapse scenarios, AI alignment grantmaking may help prevent outright extinction or help make an "ok" future into a "great" one, etc.
* I'd say that long-termism as a view is inherently animal-inclusive (just as the animal-inclusive view inherently also cares about humans); the view places weight on humans and animals today, and humans / animals / other types of moral patients in the distant future. Often the fact that it's animal-inclusive is less salient though, because it is concerned with the potential for creating large numbers of thriving digital minds in the future, which we often picture as more human-like than animal-like.
* I think the total view on population ethics is one important route to long-termism but others are possible. For example, you could be very uncertain what you value, but reason that it would be easier to figure out what we value and realize our values if we are safer, wiser, and have access

MichaelA

Thanks! FWIW, I think that all matches my own views, with the minor exception that I think longtermism (as typically defined, e.g. by MacAskill) is consistent with human-centrism as well as with animal-inclusivity. (Just as it's consistent with either intrinsically valuing only happiness and reductions in suffering or also other things like liberty and art, and consistent with weighting reducing suffering more strongly than increasing happiness or weighting them equally.) Perhaps you meant that Open Philanthropy's longtermist worldview is inherently animal-inclusive? (Personally, I adopt an animal-inclusive longtermist view. I just think one can be a human-centric longtermist.)

Ajeya

Yes, I meant that the version of long-termism we think about at Open Phil is animal-inclusive.

EdoArad3y7

How would you define a "cause area" and "cause prioritization", in a way which extends beyond Open Phil?

Ajeya3y18

I'd say that a "cause" is something analogous to an academic field (like "machine learning theory" or "marine biology") or an industry (like "car manufacturing" or "corporate law"), organized around a problem or opportunity to improve the world. The motivating problem or opportunity needs to be specific enough and clear enough that it pays off to specialize in it by developing particular skills, reading up on a body of work related to the problem, trying to join particular organizations that also work on the problem, etc.

Like fields and industries, the boundaries around what exactly a "cause" is can be fuzzy, and a cause can have sub-causes (e.g. "marine biology" is a sub-field of "biology" and "car manufacturing" is a sub-industry within "manufacturing"). But some things are clearly too broad to be a cause: "doing good" is not a cause in the same way that "learning stuff" is not an academic field and "making money" is not an industry. Right now, the cause areas that long-termist EAs support are in their infancy, so they're pretty broad and "generalist"; over time I expect sub-causes to become more clearly defined and deeper specialized expertise to develop within them (e.g. ... (read more)

Emily Grundy3y7

Hi Ajeya, that's a wonderful idea - I have a couple of questions below that are more about how you find working as a Senior Research Analyst and in this area:

What do you love about your role / work?

What do you dislike about your role / work?

What’s blocking you from having the impact you’d like to have?

What is the most important thing you did to get to where you are? (e.g., network, trying out lots of jobs / internships, continuity at one job, a particular course, etc.)

Ajeya

* The thing I most love about my work is my relationships with my coworkers and manager; they are all deeply thoughtful, perceptive, and compassionate people who help me improve along lots of dimensions.
* Like I discussed in the podcast, a demoralizing aspect of my work is that we're often pursuing questions where deeply satisfying answers are functionally impossible and it's extremely unclear when something is "done." It's easy to spend much longer on a project than you hoped, and to feel that you put in a lot of work to end up with an answer that's still hopelessly subjective and extremely easy to disagree with.
* I think I would do significantly better in my role if I were less sensitive about the possibility that someone (especially experts or fancy people) would think I'm dumb for missing some consideration, not having an excellent response to an objection, not knowing everything about a technical sub-topic, making a mistake, etc. It would allow me to make better judgment calls about when it's actually worth digging into something more, and to write more freely without getting bogged down in figuring out exactly how to caveat something.
* I think the most important thing I did before joining Open Phil was to follow GiveWell's research closely and to attempt to digest EA concepts well enough to teach them to others; I think this helped me notice when there was a job opportunity at GiveWell and to perform well in the interview process. Once at Open Phil, I think it was good that I asked a lot of questions about everything and pretty consistently said yes to opportunities to work on something harder than what I had done before.

MichaelA

I'm also interested in hearing more of what Ajeya has to say on these questions. People might also be interested in her answers to questions similar to at least the first and second of those questions on the 80k podcast, from around 2 hours 17 minutes onwards. (I also commented here about how parts of her answers resonated with my own experiences.)

MaxRa3y7

Regarding forecasts on transformative AI:

Ajeya Cotra: Yeah. I mean, I think I would frame it as like 12–15% by 2036, which is kind of the original question, a median of 2055, and then 70–80% chance this century. That’s how I would put the bottom line.

I'd be really interested in hearing about the discussions you have with people that have earlier median estimates, and/or what you expect those discussions would revolve around. For example, I saw that the Metaculus crowd has a median estimate of 2035 for fully general AI. Skimming their discussions, they migh... (read more)

Ajeya3y11

Like Linch says, some of the reason the Metaculus median is lower than mine is probably because they have a weaker definition; 2035 seems like a reasonable median for "fully general AI" as they define it, and my best guess may even be sooner.

With that said, I've definitely had a number of conversations with people who have shorter timelines than me for truly transformative AI; Daniel Kokotajlo articulates a view in this space here. Disagreements tend to be around the following points:

People with shorter timelines than me tend to feel that the notion of "effective horizon length" either doesn't make sense, or that training time scales sub-linearly rather than linearly with effective horizon length, or that models with short effective horizon lengths will be transformative despite being "myopic." They generally prefer a model where a scaled-up GPT-3 constitutes transformative AI. Since I published my draft report, Guille Costa (an intern at Open Philanthropy) released a version of the model that explicitly breaks out "scaled up GPT-3" as a hypothesis, which would imply a median of 2040 if all my other assumptions are kept intact.
They also tend to feel that extrapolations of whe

... (read more)

MaxRa

Thanks, super interesting! In my very premature thinking, the question of algorithmic progress is most load-bearing. My background is in cognitive science and my broad impression is that
* human cognition is not *that* crazy complex,
* that I wouldn't be surprised at all if one of the broad architectural ideas I've seen floating around on human cognition could afford "significant" steps towards proper AGI
  * e.g. how Bayesian inference and Reinforcement Learning may be realized in the predictive coding framework was impressive to me, for example fleshed out by Steve Byrnes on LessWrong
  * or e.g. rough sketches of different systems that fulfill specific functions like in the further breakdown of System 2 in Stanovich's Rationality and the Reflective Mind
* when thinking about how many "significant" steps or insights we still need until AGI, I think more on the order of less than ten
  * (I've heard the idea of "insight-based forecasting" from a Joscha Bach interview)
* those insights might not be extremely expensive and, once had, cheap-ish to implement
  * e.g. the GANs story maybe fits this, they're not crazy complicated, not crazy hard to implement, but very powerful

This all feels pretty freewheeling so far. Would be really interested in further thoughts or reading recommendations on algorithmic progress.

Ajeya

My approach to thinking about algorithmic progress has been to try to extrapolate the rate of past progress forward; I rely on two sources for this, a paper by Katja Grace and a paper by Danny Hernandez and Tom Brown. One question I'd think about when forming a view on this is whether arguments like the ones you make should lead you to expect algorithmic progress to be significantly faster than the trendline, or whether those considerations are already "priced in" to the existing trendline.
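As a rough illustration of what "extrapolating the rate of past progress forward" can look like, here is a minimal sketch that fits a straight line to the log of an efficiency measure and projects it to a later year. The data points are hypothetical placeholders, not figures from either paper, and the function names are made up for this example.

```python
import math


def fit_log_linear(years, values):
    """Least-squares fit of log2(value) = intercept + slope * year."""
    logs = [math.log2(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def extrapolate(intercept, slope, year):
    """Value implied by the fitted trendline at the given year."""
    return 2 ** (intercept + slope * year)


# Hypothetical series: algorithmic efficiency relative to the first year.
years = [2012, 2013, 2014, 2015, 2016]
efficiency = [1.0, 2.0, 4.0, 8.0, 16.0]

intercept, slope = fit_log_linear(years, efficiency)
doubling_time_years = 1 / slope          # slope is in doublings per year
projected_2020 = extrapolate(intercept, slope, 2020)
```

The "priced in" question then becomes whether a given consideration should change the fitted slope going forward, or whether it is one of the reasons the historical slope already looks the way it does.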

MaxRa

And yes, thanks, the point about thinking with trendlines in mind is really good. Maybe those two developments could be relevant:
* bigger number of recent ML/CogSci/Comp. Neuroscience graduates that academically grew up in times of noticeable AI progress and much more widespread aspirations to build AGI than the previous generation
* related to my question about non-academic open-source projects: If there is a certain level of computation necessary to solve interesting general reasoning gridworld problems with new algorithms, then we might unlock a lot of work in the coming years

MaxRa

Thanks! :) I find Grace's paper a little bit unsatisfying. From the outside, fields like SAT, factoring, scheduling and linear optimization seem only weakly analogous to the fields around developing general thinking capabilities. It seems to me that the former is about hundreds of researchers going very deep into very specific problems and optimizing a ton to produce slightly more elegant and optimal solutions, whereas the latter is more about smart and creative "pioneers" having new insights into how to frame the problem correctly and finding new relatively simple architectures that make a lot of progress. What would be more informative for me?
* by above logic maybe I would focus more on progress of younger fields within computer science
* also maybe there is a way to measure how "random" practitioners perceive the field to be - maybe just asking them how surprised they are by recent breakthroughs is a solid measure of how many other potential breakthroughs are still out there
* also I'd be interested in solidifying my very rough impression that breakthroughs like transformers or GANs are relatively simple algorithms in comparison with breakthroughs in other areas of computer science
* evolution's algorithmic progress would maybe also be informative to me, i.e. how much trial and error was roughly invested to make specific jumps
  * e.g. I'm reading Pearl's Book of Why and he makes a tentative claim that counterfactual reasoning is something that appeared at some point, and the first sign we can report of it is the lion-man from roughly 40,000 years ago
  * though of course evolution did not aim at general intelligence, e.g. saying "evolution took hundreds of millions of years to develop an AGI" in this context seems disanalogous
* how big of a fraction of human cognition do we actually need for TAI? E.g. we might save about an order of magnitude by ditching vision and focussing on language?

abergal

Sherry et al. have a more exhaustive working paper about algorithmic progress in a wide variety of fields.

Linch3y11

Note that the definition of "fully general AI" on that Metaculus question is considerably weaker than how Open Phil talks about "transformative AI."

For these purposes we will thus define "an artificial general intelligence" as a single unified software system that can satisfy the following criteria, all easily completable by a typical college-educated human.
Able to reliably pass a Turing test of the type that would win the Loebner Silver Prize.
Able to score 90% or more on a robust version of the Winograd Schema Challenge, e.g. the "Winogrande" challenge or comparable data set for which human performance is at 90+%
Be able to score 75th percentile (as compared to the corresponding year's human students; this was a score of 600 in 2016) on all the full mathematics section of a circa-2015-2020 standard SAT exam, using just images of the exam pages and having less than ten SAT exams as part of the training data. (Training on other corpuses of math problems is fair game as long as they are arguably distinct from SAT exams.)
Be able to learn the classic Atari game "Montezuma's revenge" (based on just visual inputs and standard controls) and explore all 24 rooms based on the equ

... (read more)

MaxRa

Thanks, I didn't read that carefully enough!

Linch

Right, to be clear I think this is (mostly) not your fault. Unfortunately others have made this and similar mistakes before, for both other questions and this specific question. Obviously some of the onus is on user error, but I think the rest of us (the forecasting community and the Metaculus platform) should do better on having the intuitive interpretation of the headline question match the question specifications, and vice versa.

Ofer3y6

Imagine you win $10B in a donor lottery. What sort of interventions—that are unlikely to be funded by Open Phil in the near future—might you fund with that money?

Ajeya

There aren't $10B worth of giving opportunities that I'd be excited about supporting now, for essentially the same reasons why Open Phil isn't giving everything away over the next few years. Basically, we expect (and I agree) that there will be more, better giving opportunities in the medium-term future and so it makes sense to save the marginal dollar for future giving, at least right now. There would likely be some differences between what I would fund and what Open Phil is currently funding due to different intuitions about the most promising interventions to investigate with scarce capacity, but I don't expect them to be large.

EdoArad3y6

How much worldview-diversification and dividing capital into buckets do you have within each of the three main cause areas, if at all? For example, I could imagine a divide between short and long AI Timelines, or a divide between policy-oriented and research-oriented grants.

Ajeya

We don't have firmly-articulated "worldview divisions" beyond the three laid out in that post, though as I mention towards the end of this section in my podcast, different giving opportunities within a particular worldview can perform differently on important but hard-to-quantify axes such as the strength of feedback loops, the risk of self-delusion, or the extent to which it feels like a "Pascal's mugging", and these types of considerations can affect how much we give to particular opportunities.

EdoArad

Thanks for the answer! I want to make sure that I understand this clearly, if you are still taking questions :) Are you making attempts to diversify grants based on these kinds of axes, in cases where there is no clear-cut position? My current understanding is that you do it, but mostly implicitly.

MichaelA3y6

I'd be interested to hear whether you think eventually expanding beyond our solar system is necessary for achieving a long period with very low extinction risk (and, if so, your reasons for thinking that).

Context for this question (adapted from this comment):

As part of the discussion of "Effective size of the long-term future" during your recent 80k appearance, you and Rob discussed the barriers to and likelihood of various forms of space colonisation.

During that section, I got the impression that you were implicitly thinking that a stable, low... (read more)

Ajeya

Thanks Michael! I agree space colonization may not be strictly required for achieving a stable state of low x-risk, but because it's the "canonical" vision of the stable low-risk future, I would feel significantly more uncertain if we were to rule out the possibility of expansion into space, and I would be inclined to be skeptical-by-default, particularly if we are picturing biological humans, because it seems like there are a large number of possible ways the environmental conditions needed for survival might be destroyed and it intuitively seems like "offense" would have an advantage over "defense" there. But I haven't thought deeply about the technology that would be needed to preserve a state of low x-risk entirely on Earth and I'd expect my views would change a lot with only a few hours of thinking on this.

Ofer3y5

Apart from the biological anchors approach, what efforts in AI timelines or takeoff dynamics forecasting—both inside and outside Open Phil—are you most excited about?

Ajeya

I'm pretty excited about economic modeling-based approaches, either:
* Estimating the value-added from machine learning historically and extrapolating it into the future, or
* Doing a takeoff analysis that takes into account how AI progress relates to inputs such as hardware and software effort, and the extent to which AI of a certain quality level can allow hardware to substitute for software effort, similar to the "Intelligence Explosion Microeconomics" paper.

NunoSempere3y5

What instrumental goals have you pursued successfully?

Ajeya

In my work, I've gotten better at resisting the urge to investigate sub-questions more deeply and instead pulling back and trying to find short-cuts to answering the high-level question. In my personal life, I've gotten better at setting up my schedule so I'm having fun in the evenings and weekends instead of mindlessly browsing social media. (I have a long way to go on both of these though.) Also, I got a university degree :)

Aryeh Englander3y5

For thinking about AI timelines, how do you go about choosing the best reference classes to use (see e.g., here and here)?

Ajeya

I don't think I have a satisfying general answer to this question; in practice, the approaches I pursue first are heavily influenced by which approaches I happen to find some purchase on, since many theoretically appealing reference classes or high-level approaches to the question may be difficult to make progress on for whatever reason.

MichaelA3y4

[I'm not sure if you've thought about the following sort of question much. Also, I haven't properly read your report - let me know if this is covered in there.]

I'm interested in a question along the lines of "Do you think some work done before TAI is developed matters in a predictable way - i.e., better than 0 value in expectation - for its effects on the post-TAI world, in ways that don't just flow through how the work affects the pre-TAI world or how the TAI transition itself plays out? If so, to what extent? And what sort of work?"

An example to illustra... (read more)

Ajeya

I haven't thought very deeply about this, but my first intuition is that the most compelling reason to expect to have an impact that predictably lasts longer than several hundred years without being washed out is because of the possibility of some sort of "lock-in" -- technology that allows values and preferences to be more stably transmitted into the very long-term future than current technology allows. For example, the ability to program space probes with instructions for creating the type of "digital life" we would morally value, with error-correcting measures to prevent drift, would count as a technology that allows for effective lock-in in my mind. A lot of people may act as if we can't impact anything post-transformative AI because they believe technology that enables lock-in will be built very close in time after transformative AI (since TAI would likely cause R&D towards these types of tech to be greatly accelerated).

MichaelA

[Kind-of thinking aloud; bit of a tangent from your AMA] Yeah, that basically matches my views. I guess what I have in mind is that some people seem to:
* round up "most compelling reason" to "only reason"
* not consider the idea of trying to influence lock-in events that occur after a TAI transition, in ways other than influencing how the TAI transition itself occurs
  * Such ways could include things like influencing political systems in long-lasting ways
* round up "substantial chance that technology that enables lock-in will be built very close in time after TAI" to "it's basically guaranteed that..."

I think what concerns me about this is that I get the impression many people are doing this without noticing it. It seems like maybe some thought leaders recognised that there were questions to ask here, thought about the questions, and formed conclusions, but then other people just got a slightly simplified version of the conclusion without noticing there's even a question to ask.

A counterpoint is that I think the ideas of "broad longtermism", and some ideas that people like MacAskill have raised, kind-of highlight the questions I'm suggesting should be highlighted. But even those ideas seem to often be about what to do given the premise that a TAI transition won't occur for a long time, or how to indirectly influence how a TAI transition occurs. So I think they're still not exactly about the sort of thing I'm talking about.

To be clear, I do think we should put more longtermist resources towards influencing potential lock-in events prior to or right around the time of a TAI transition than towards non-TAI-focused ways of influencing events after a TAI transition. But it seems pretty plausible to me that some longtermist resources should go towards other things, and it also seems good for people to be aware that a debate could be had on this.

(I should probably think more about this, check whether similar points are already covered well in some

NunoSempere3y4

To the extent that you have "a worldview" (in scare quotes), what is a short summary of that worldview?

Ajeya

I don't have an easily-summarizable worldview that ties together the different parts of my life. In my career, effective altruism (something like "Try to do as much good as possible, and think deeply about what that means and be open to counterintuitive answers") is definitely dominant. In my personal life, I try to be "agenty" about getting what I want, and to be open to trying unusually hard or being "weird" when that's what works for me and makes me happy. I think these are both evolving a lot in the specifics.

EdoArad3y4

I'm curious about your take on prioritizing between science funding and other causes. In the 80k interview you said:

When we were starting out, it was important to us that we put some money in science funding and some money in policy funding. Most of that is coming through our other causes that we already identified, but we also want to get experience with those things.
We also want to gain experience in just funding basic science, and doing that well and having a world-class team at that. So, some of our money in science goes there as well.
That’s coming muc

... (read more)

Ajeya

Decisions about the size of the basic science budget are made within the "near-termist" worldview bucket, since we see the primary case for this funding as the potential for scientific breakthroughs to improve health and welfare over the next several decades; I'm not involved with that since my research focus is on cause prioritization within the "long-termist" worldview. In terms of high-level principles, the decision would be made by comparing an estimate of the value of marginal science funding against an estimate of the value of the near-termist "last dollar", but I'm not familiar with the specific numbers myself.

BrownHairedEevee3y4

I really appreciated your 80K episode - it was one of my favorites! I created a discussion thread for it.

Some questions - feel free to answer as many as you want:

How much of your day-to-day work involves coding or computer science knowledge in general? I know you created a Jupyter notebook to go with your AI timelines forecast; is there anything else?
What are your thoughts on the public interest tech movement?
- More specifically, I've been thinking about starting some meta research on using public interest tech to address the most pressing problems from

... (read more)

Ajeya

Thanks, I'm glad you liked it so much!
* I reasonably often do things like make models in Python, but the actual coding is a pretty small part of my work -- something like 5%-10% of my time. I've never done a coding project for work that was more complicated than the notebook accompanying my timelines report, and most models I make are considerably simpler (usually implemented in spreadsheets rather than in code).
* I'm not familiar with the public interest tech movement unfortunately, so I'm not sure what I think about that research project idea.

Darika3y3

Any thoughts on the recent exodus of employees from OpenAI?

Aryeh Englander3y3

In your 80,000 Hours interview you talked about worldview diversification. You emphasized the distinction between total utilitarianism vs. person-affecting views within the EA community. What about diversification beyond utilitarianism entirely? How would you incorporate other normative ethical views into cause prioritization considerations? (I'm aware that in general this is basically just the question of moral uncertainty, but I'm curious how you and Open Phil view this issue in practice.)

Ajeya3y11

Most people at Open Phil aren't 100% bought into utilitarianism, but utilitarian thinking has an outsized impact on cause selection and prioritization because under a lot of other ethical perspectives, philanthropy is supererogatory, so those other ethical perspectives are not as "opinionated" about how best to do philanthropy. It seems that the non-utilitarian perspectives we take most seriously usually don't provide explicit cause prioritization input such as "Fund biosecurity rather than farm animal welfare", but rather provide input about what rules or constraints we should be operating under, such as "Don't misrepresent what you believe even if it would increase expected impact in utilitarian terms."

Samuel3y3

Hello! I really enjoyed your 80,000 Hours interview, and thanks for answering questions!

1 - Do you have any thoughts about the prudential/personal/non-altruistic implications of transformative AI in our lifetimes?

2 - I find fairness agreements between worldviews unintuitive but also intriguing. Are there any references you'd suggest on fairness agreements besides the OpenPhil cause prioritization update?

Ajeya

Thanks, I'm glad you enjoyed it!

1. I haven't put a lot of energy into thinking about personal implications, and don't have very worked-out views right now.
2. I don't have a citation off the top of my head for fairness agreements specifically, but they're closely related to "variance normalization" approaches to moral uncertainty, which are described here (that blog post links to a few papers).

quinn3y3

I've been increasingly hearing advice to the effect that "stories" are an effective way for an AI x-safety researcher to figure out what to work on, that drawing scenarios about how you think it could go well or go poorly and doing backward induction to derive a research question is better than traditional methods of finding a research question. Do you agree with this? It seems like the uncertainty when you draw such scenarios is so massive that one couldn't make a dent in it, but do you think it's valuable for AI x-safety researchers to make significant (...

Ajeya

I would love to see more stories of this form, and think that writing stories like this is a good area of research to be pursuing for its own sake that could help inform strategy at Open Phil and elsewhere. With that said, I don't think I'd advise everyone who is trying to do technical AI alignment to determine what questions they're going to pursue based on an exercise like this -- doing this can be very laborious, and the technical research route it makes the most sense for you to pursue will probably be affected by a lot of considerations not captured in the exercise, such as your existing background, your native research intuitions and aesthetic (which can often determine what approaches you'll be able to find any purchase on), what mentorship opportunities you have available to you and what your potential mentors are interested in, etc.

Ben Snodin3y3

Thanks for doing this and for doing the 80k podcast, I enjoyed the episode.

1. What are some longtermist cause areas other than AI, biorisk and cause prioritisation that you'd be keen to see more work on?
2. I gather that Open Phil has grown a lot recently. Can you say anything about the growth and hiring you expect for Open Phil over the next, say, 1-3 years? E.g. would you expect to hire lots more generalists, or maybe specialists in new cause areas, etc.

Ajeya

Thanks, I'm glad you enjoyed it!

1. This is fairly basic, but EA community building is definitely another cause I'd add to that list. I'm less confident in other potential areas, but I would also be curious about exploring some aspects of improving institutional decision-making as well.
2. The decision to open a hiring round is usually made at the level of individual focus areas and sub-teams, and we don't have an organization-wide growth plan, so it's fairly difficult to estimate exact numbers; with that said, I expect we'll be doing some hiring of both generalists and program specialists over the next few years. (We have a new open positions page here.)

MichaelA3y3

[The following question might just be confused, might not be important, and will likely be poorly phrased/explained.]

In your recent 80k appearance, you and Rob both say that the way the self-sampling assumption (SSA) leads to the doomsday argument seems sort-of "suspicious". You then say that, on the other hand, the way the self-indication assumption (SIA) causes an opposing update also seems suspicious.

But I think all of your illustrations of how updates based on the SIA can seem suspicious involved infinities. And we already know that loads o...

Ajeya

Thanks, I'm glad you found that explanation helpful! I think I broadly agree with you that SIA is somewhat less "suspicious" than SSA, with the small caveat that I think most of the weirdness can be preserved with a finite-but-sufficiently-giant world rather than a literally infinite world.

MaxRa3y3

Hi Ajeya! :) What do you think about open source projects like https://www.eleuther.ai/ that replicate cutting-edge projects like GPT-3 or AlphaFold? Speaking as an outsider, I imagine that a lot of AI progress comes from "random" tinkering, and so I wondered if "Discord groups tinkering along" are relevant actors in your strategic landscape.

(I really enjoyed listening to the recent interview!)

Ajeya

I'm not very familiar with these open source implementations; they seem interesting! So far, I haven't explicitly broken out different possible sources of algorithmic progress in my model, since I'm thinking about it in a very zoomed-out way (extrapolating big-picture quantitative trends in algorithmic progress). I'm not sure how much of the progress captured in these trends comes from traditional industry/academia sources vs open source projects like these.

janus3y2

Hi Ajeya, thank you for publishing such a massive and detailed report on timelines!! Like other commenters here, it is my go-to reference. Allowing users to adjust the parameters of your model is very helpful for picking out built-in assumptions and being able to update predictions as new developments are made.

In your report you mention that you discount the aggressive timelines in part due to lack of major economic applications of AI so far. I have a few questions along those lines.

Do you think TAI will necessarily be foreshadowed by incremental economic ...

Ajeya

Thanks! I'll answer your cluster of questions about takeoff speeds and commercialization in this comment and leave another comment responding to your questions about sharing my report outside the EA community.

Broadly speaking, I do expect that transformative AI will be foreshadowed by incremental economic gains; I generally expect gradual takeoff, meaning I would bet that at some point growth will be ~10% per year before it hits 30% per year (which was the arbitrary cut-off for "transformative" used in my report). I don't think it's necessarily the case; I just think it'll probably work this way. On the outside view, that's how most technologies seem to have worked. And on the inside view, it seems like there are lots of valuable-but-not-transformative applications of existing models on the horizon, and industry giants + startups are already on the move trying to capitalize.

My views imply a roughly 10% probability that the compute to train transformative AI would be affordable in 10 years or less, which wouldn't really leave time for this kind of gradual takeoff. One reason it's a pretty low number is that it would imply sudden takeoff and I'm skeptical of that implication (though it's not the only reason -- I think there are separate reasons to be skeptical of the Lifetime Anchor and the Short Horizon Neural Network anchor, which drive short timelines in my model).

I don't expect that several generations of more powerful successors to GPT-3 will be developed before we see significant commercial applications of GPT-3; I expect commercialization of existing models and scaleup to larger models to be happening in parallel. There are already various applications online, e.g. AI Dungeon (based on GPT-3), TabNine (based on GPT-2), and this list of other apps.

I don't think that evidence OpenAI was productizing GPT-3 would shift my timelines much either way, since I already expect them to be investing pretty heavily in this.

Relative to the present, I expect the

Ajeya

I haven't engaged much with people outside the EA and AI alignment communities, and I'd guess that very few people outside these communities have heard about the report. I don't personally feel sold that the risks of publishing this type of analysis more broadly (in terms of potentially increasing capabilities work) outweigh the benefits of helping people better understand what to expect with AI and giving us a better chance of figuring out if our views are wrong. However, some other people in the AI risk reduction community who we consulted (TBC, not my manager or Open Phil as an institution) were more concerned about this, and I respect their judgment, so I chose to publish the draft report on LessWrong and avoid doing things that could result in it being shared much more widely, especially in a "low-bandwidth" way (e.g. just the "headline graph" being shared on social media).

Ajeya

To clarify, we are planning to seek more feedback from people outside the EA community on our views about TAI timelines, but we're seeing that as a separate project from this report (and may gather feedback from outside the EA community without necessarily publicizing the report more widely).

mike_mclaren3y2

Hi Ajeya, thanks for doing this and for your recent 80K interview! I'm trying to understand what assumptions are needed for the argument you raise in the podcast discussion on fairness agreements that a longtermist worldview should have been willing to trade up all its influence for ever-larger potential universes. There are two points I was hoping you could comment on, specifically whether and how they align with your argument.

My intuition says that the argument requires a prior probability distribution on universe size that has an infinite expectation, rather than jus

...

Ajeya

1. I agree that your prior would need to have an infinite expectation for the size of the universe for this argument to go through.
2. I agree with the generalized statement that your prior over "value-I-can-affect" needs to have an infinite expectation, but I don't think I agree with the operationalization of "value-I-can-affect" as V/n. It seems possible to me that even if there is a high density of value-maximizing civilizations out there, each one could have an infinite impact through e.g. acausal trade. I'm not sure what a crisp operationalization of "value-I-can-affect" would be.
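
To make the infinite-expectation condition in point 1 concrete, here is one illustrative pair of priors over universe size N. Both distributions are assumptions chosen to keep the arithmetic simple, not ones taken from the podcast or the report. Under the first, the expectation diverges; under the second, the tail is still unbounded but thins out fast enough that the expectation is finite.

```latex
% Prior A (illustrative assumption): P(N = 2^k) = 2^{-k}, k = 1, 2, 3, ...
\sum_{k=1}^{\infty} 2^{-k} = 1, \qquad
\mathbb{E}_A[N] = \sum_{k=1}^{\infty} 2^{k} \cdot 2^{-k} = \sum_{k=1}^{\infty} 1 = \infty

% Prior B (illustrative assumption): P(N = 2^k) = 2 \cdot 3^{-k}, k = 1, 2, 3, ...
\sum_{k=1}^{\infty} 2 \cdot 3^{-k} = 1, \qquad
\mathbb{E}_B[N] = \sum_{k=1}^{\infty} 2^{k} \cdot 2 \cdot 3^{-k}
                = 2 \sum_{k=1}^{\infty} \left(\tfrac{2}{3}\right)^{k} = 4
```

Under a prior like A, each further doubling of possible size contributes as much expected value as the previous one, which is roughly the property the trade argument seems to rely on; under a prior like B, the contributions shrink geometrically and the argument would not obviously go through.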

mike_mclaren

I see, thank you!

Arepo3y1

What do you make of Ben Garfinkel's work on scepticism towards AI's capacity being separable from its goals/his broader skepticism of brain in a box scenarios?
