jsteinhardt's Comments

Ben Hoffman's donor recommendations

I don't understand why this is evidence that "EA Funds (other than the global health and development one) currently funges heavily with GiveWell recommended charities", which was Howie's original question. It seems like evidence that donations to OpenPhil (which afaik cannot be made by individual donors) funge against donations to the long-term future EA fund.

RFC: Philosophical Conservatism in AI Alignment Research

I like the general thrust here, although I have a different version of this idea, which I would call "minimizing philosophical pre-commitments". For instance, there is a great deal of debate about whether Bayesian probability is a reasonable philosophical foundation for statistical reasoning. It seems that it would be better, all else equal, for approaches to AI alignment to not hinge on being on the right side of this debate.

I think there are some places where it is hard to avoid pre-commitments. For instance, while this isn't quite a philosophical pre-commitment, it is probably hard to develop approaches that are simultaneously optimized for short and long timelines. In this case it is probably better to explicitly do case splitting on the two worlds and have some subset of people pursuing approaches that are good in each individual world.

Takeoff Speed: Simple Asymptotics in a Toy Model.

Thanks for writing this, Aaron! (And for engaging with some of the common arguments for/against AI safety work.)

I personally am very uncertain about whether to expect a singularity/fast take-off (I think it is plausible but far from certain). Some reasons that I am still very interested in AI safety are the following:

  • I think AI safety likely involves solving a number of difficult conceptual problems, such that it would take >5 years (I would guess something like 10-30 years, with very wide error bars) of research to have solutions that we are happy with. Moreover, many of the relevant problems have short-term analogues that can be worked on today. (Indeed, some of these align with your own research interests, e.g. imputing value functions of agents from actions/decisions; although I am particularly interested in the agnostic case where the value function might lie outside of the given model family, which I think makes things much harder.)
  • I suppose the summary point of the above is that even if you think AI is a ways off (my median estimate is ~50 years, again with high error bars), research is not something that can happen instantaneously, and conceptual research in particular can move slowly because it is harder to work on and to parallelize.
  • While I have uncertainty about fast take-off, that still leaves some probability that fast take-off will happen, and in that world it is an important enough problem that it is worth thinking about. (It is also very worthwhile to think about the probability of fast take-off, as better estimates would help to better direct resources even within the AI safety space.)
  • Finally, I think there are a number of important safety problems even from sub-human AI systems. Tech-driven unemployment is I guess the standard one here, although I spend more time thinking about cyber-warfare/autonomous weapons, as well as changes in the balance of power between nation-states and corporations. These are not as clearly an existential risk as unfriendly AI, but I think in some forms would qualify as a global catastrophic risk; on the other hand I would guess that most people who care about AI safety (at least on this website) do not care about it for this reason, so this is more idiosyncratic to me.

Happy to expand on/discuss any of the above points if you are interested.



Takeoff Speed: Simple Asymptotics in a Toy Model.

Very minor nitpick, but just to add, FLI is as far as I know not formally affiliated with MIT. (FHI is in fact a formal institute at Oxford.)

Zeroing Out

Hi Zvi,

I enjoy reading your posts because they often consist of clear explanations of concepts I wish more people appreciated. But I think this is the first instance where I feel I got something that I actually hadn't thought about before at all, so I wanted to convey extra appreciation for writing it up.



Seek Fair Expectations of Others’ Models

I think the conflation is between "decades out" and "far away".

Oxford Prioritisation Project Review

Points 1-5 at the beginning of the post are all primarily about community-building and personal development externalities of the project, and not about the donation itself.

Oxford Prioritisation Project Review

?? If you literally mean minimum wage, I think that is less than 10,000 pounds... although I agree with the general thrust of your point about the money being more valuable than the time (but I think you are missing the spirit of the exercise as outlined in the post).
