philipn

Comments

Thank you for this. This is very close to what I was hoping to find!

It looks like Benjamin Hilton makes a rough guess of the proportion of workers dedicated to AI x-risk at each organization. This seems appropriate for estimating a rough percentage across all organizations, but if we want to nudge organizations to employ more people on alignment, then I think we want to highlight exact figures.

E.g., we want to ask the organizations how many people they have working on alignment and then post what they say, as a sort of accountability feedback loop.

You mention the number of people at OpenAI doing alignment work. I think it would be helpful to compile a list of the different labs and the number of people who can reasonably be said to be doing alignment work. Then we could put together a chart of sorts, highlighting this gap.

Highlighting gaps like this is a proven and effective strategy to drive change when dealing with various organizational-level inequities.

If people reading this comment have insight into the number of people at the various labs doing alignment work and/or the total number of people at said labs: please comment here!

Could you elaborate on "For NN Model 1, the belief is encoded in the learned parameters. For NN Model 2, the belief is encoded in the architecture itself"?