

Broadly agree, in that most safety research expands control over systems and our understanding of them, which can be abused by a bad actor.

For-profit companies face the same problem, except that profit rather than catastrophe is on the line. They too have R&D departments and research directions with potential for misuse. However, that research is done inside a social environment (the company) where it is only explicitly used to make money.

To give a more concrete example, improving self-driving capabilities also allows the companies making the cars to intentionally make them run people down if they so wish. The more advanced the capabilities, the more precise they can be in deploying their pedestrian-killing machines onto the roads. However, we would never expect this to happen, since it would clearly destroy the company's profitability and bring these activities to an end.

AI safety research is not currently done in this kind of environment at all. However, it does seem to me that institutions of this kind, which carefully vet research and products and release them only when they remain beneficial, are possible.


Really fascinating stuff! I have a (possibly already answered) question about whether experts' updates on one another's predictions might be valuable.

You discuss the negative impacts of allowing experts to aggregate themselves, or to view one another's forecasts before initially submitting their own. Might there be value in allowing experts to submit multiple times, each time seeing the submitted predictions of the previous round? The final aggregation scheme would then be able not only to assign a credence to each expert, but also to gain a proxy for the credence the experts give to one another. In a more practical scenario where experts will talk, if not collude, this might give better insight into how expert predictions are being created.
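
To make the idea a bit more concrete, here is a rough sketch of one way such a proxy could be read off two rounds of forecasts. This is a toy model of my own and not anything from the thesis: it assumes each expert's second-round forecast is a blend of their own first-round forecast and the mean of their peers' first-round forecasts, and it solves for the blending weight. The function name `deference` and the example numbers are hypothetical.

```python
from statistics import mean


def deference(round1, round2):
    """Estimate how far each expert moved toward the round-1 peer mean.

    Toy assumption: round2[i] = (1 - w) * round1[i] + w * peer_mean_i.
    Returns one weight w per expert, clipped to [0, 1]:
    0 means no movement, 1 means the expert fully adopted the peer mean.
    """
    weights = []
    for i, (p1, p2) in enumerate(zip(round1, round2)):
        # Mean of the other experts' first-round forecasts.
        peer_mean = mean(p for j, p in enumerate(round1) if j != i)
        gap = peer_mean - p1
        if abs(gap) < 1e-12:
            # The expert already agreed with the peers, so there is
            # nothing to infer from this question.
            weights.append(0.0)
            continue
        w = (p2 - p1) / gap
        weights.append(min(1.0, max(0.0, w)))
    return weights


# Hypothetical probabilities for a single question, three experts.
round1 = [0.2, 0.6, 0.7]
round2 = [0.4, 0.6, 0.65]
weights = deference(round1, round2)
```

A weight near 0 suggests an expert who trusts their own forecast over the group's, while a weight near 1 suggests heavy deference. With many questions, the same idea could be extended to estimate how much weight each expert places on each individual peer, rather than on the peer mean.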

Thanks for taking the time to distill this work into a more approachable format - it certainly made the thesis more manageable!