Consequentialists (including utilitarians) claim that the goodness of an action should be judged based on the goodness of its consequences. The word utility is often used to refer to the quantified goodness of a particular outcome. When the consequences of an action are uncertain, it is often taken for granted...
Work funded by the Long Term Future Fund. Corrigibility is the hypothetical feature of some agents which allows them to be 'shut down' by an outside user without attempting to manipulate whether or not they are shut down. The motivation behind this concept is the possibility of making an...
This post is about ideas in the 2015 paper Corrigibility by Soares et al. The paper is not too hard to follow but is written in a fairly dry academic style (which is reasonable as it is an academic paper!). It also uses what I think is a clunky notation...
This essay was written as an entry to the TxP Progress Prize. The prize is run in partnership with Civic Future and New Statesman Spotlight to encourage blog posts responding to the question 'Britain is stuck. How can we get it moving again?'. The result is a bit more 'argumentative'...
tl;dr: There are several trends which suggest that global temperatures over the next year will experience a short-term increase, relative to the long-term increase in temperatures caused by man-made global warming. Credits: Most of the information comes from Berkeley Earth monthly temperature updates. Several people on Twitter (Robert Rohde, Zeke...
Recently, I read Corrigibility by Soares et al. and became confused. I followed most of the mathematical reasoning but am now struggling to understand what the point or end goal of this avenue of research is meant to be. I know that MIRI now pursues a different research direction so...
Ahead of next week's AI Safety Summit, the UK government has published a discussion paper on the capabilities and risks of AI. The paper comes in three parts and can be found here. See here for the press release announcing the publication of the paper. The paper has been reviewed...