larval stage AI alignment researcher

The main reason I think a split OpenAI means shortened timelines is that the main bottleneck to capabilities right now is insight/technical knowledge. Quibbles aside, basically any company with enough cash can get sufficient compute. Even with other big players and thousands (or millions) of open-source devs trying to do better, to my knowledge GPT-4 is still the best, implying a moderate-to-significant insight lead. I worry that fracturing OpenAI gives more people access to those insights, which 1) significantly increases the surface area of people working at the frontier of insight/capabilities, 2) burns the lead time OpenAI had, which might otherwise have been used to pay off some alignment tax, and 3) risks those insights ending up at a company less scrupulous about alignment.

A potential counter to (1): OpenAI's success could depend on having all (or some key subset) of their people centralized and collaborating.

Counter-counter: OpenAI staff (especially the core engineering talent, though at this point it seems the entire company) clearly want to mostly stick together, whether at the official OpenAI, at Microsoft, or in some other independent arrangement. So moving to any other host, such as Microsoft, gets you some of the worst of both worlds: OAI staff remain centralized for peak collaboration, and Microsoft probably unavoidably gets their insights. I don't buy the story that anything under the Microsoft umbrella gets swallowed and slowed down by bureaucracy; Satya knows what he is dealing with and what they need, and won't get in the way.

For one thing, there is a difference between disagreement and "overall quality" (good faith, well reasoned, etc.), and this division already exists in comment voting. So maybe it is a good idea to have this feature for posts as well, with disciplinary action taken only against posts that fall below some low/negative threshold for "overall quality".

Further, having multiple tiers of moderation/community-regulatory action in response to "overall quality" (encompassing both things like karma and explicit moderator action) seems good to me. The comment limitation you describe is just another tier in such a system: above "just ban them", but below "just let them take the lower karma from other users downvoting them".

It's possible that, lacking the existence of the tier you are currently on, the next-best tier you'd be rounded off to would be getting banned. (I haven't read your stuff, so I'm not suggesting either way whether this should be done in your case.)

If you were downvoted for good-faith disagreement and are now limited/penalized, then yeah, that's probably bad, and maybe a split voting system as mentioned would help. But it's possible you were primarily downvoted on the "overall quality" axis.

Is the usage of "Leviathan" (like here) just convergence on an apt biblical name, or is there additional history of it being used specifically as a name for an AI?

I'm trying to catch up with the general alignment ecosystem: is this site still intended to be live/active? I'm getting a 404.

Really happy with this podcast, but I feel it also fed a major concern I have about how this PR campaign is being conducted.

There we go, thank you! That matches my memory of what I was looking for.