Wiki Contributions


Nice recommendations! In addition to brain enthusiasts being useful for empirical work, there also are theoretical tools from system neuroscience that could be useful for AI safety. One area in particular would be for interpretability: if we want to model a network at various levels of "emergence", recent development in information decomposition and multivariate information theory to move beyond pairwise interaction in a neural network might be very useful. Also see recent publications to model synergestic information  and dynamical independance to perhaps automate macro variables discovery which could also be well worth exploring to study higher levels of large ML models. This would actually require both empirical and theoretical work as once the various measures of information decomposition are clearer one would need to empirically estimate test them and use them in actual ML systems for interpretability if they turn out to be meaningful.

Thanks for all the useful links! I'm also always happy to receive more feedback.

I agree that the sense in which I use metaethics in this post is different from what academic philosophers usually call metaethics. I have the impression that metaethics, in academic sense, and metaphilosophy are somehow related. Studying what morality itself is, how to select ethical theories and what is the process behind ethical reasoning seems not independent. For example if moral nihilism is more plausible then it seems to be less likely that there is some meaningful feedback loop to select ethical theories or that there is such a meaningful thing as a ‘good’ ethical theory (at least in an observer independent way) . If moral emotivism is more plausible then maybe reflecting on ethics is more like emotions rationalisation, e.g. typically expressing in a sophisticated way something that just fundamentally means ‘boo suffering’. In that case having better understanding of metaethics in the academic sense seems to bring some light to a process that generates ethical theories, at least in humans.

Sure, I'm happy to read/discuss your ideas about this topic.

I am not sure about what computer aided analysis mean but one possibility could be to have formal ethical theories and prove some theorem inside their formal framework. But this raises questions about the sort of formal framework that one could use to 'prove theorems' under ethics in a meaningful way.