I also am not paying for any LLM. Between Microsoft's Copilot (formerly Bing Chat), LMSYS Chatbot Arena, and Codeium, I have plenty of free access to SOTA chatbots/assistants. (Slightly worried that I'm contributing to race dynamics or AI risk in general even by using these systems for free, but not enough to stop, unless someone wants to argue for this.)
Unfortunately I don't have well-formed thoughts on this topic. I wonder if there are people who specialize in AI lab governance and have written about this, but I'm not personally aware of such writings. To brainstorm some ideas:
I'd like to hear from people who thought that AI companies would act increasingly reasonable (from an x-safety perspective) as AGI got closer. Is there still a viable defense of that position (e.g., that SamA being in his position / doing what he's doing is just uniquely bad luck, not reflecting what is likely to be happening / will happen at other AI labs)?
Also, why is there so little discussion of x-safety culture at other AI labs? I asked on Twitter and did not get a single relevant response. Are other AI company employees also reluctant to speak out? If so, that seems bad (every explanation I can think of seems bad, including default incentives + companies not proactively encouraging transparency).
Suggest having a row for "Transparency", to cover things like whether the company encourages or discourages whistleblowing, whether it reports bad news about alignment/safety (such as negative research results) or only good news (new ideas and positive results), whether it provides enough info to the public to judge the adequacy of its safety culture and governance, etc.
It's also notable that the topic of OpenAI nondisparagement agreements was brought to Holden Karnofsky's attention in 2022, and he replied with "I don’t know whether OpenAI uses nondisparagement agreements; I haven’t signed one." (He could have asked his contacts inside OAI about it, or asked the EA board member to investigate. Or even set himself up earlier as someone OpenAI employees could whistleblow to on such issues.)
If the point was to buy a ticket to play the inside game, then it was played terribly and negative credit should be assigned on that basis, and for misleading people about how prosocial OpenAI was likely to be (due to having an EA board member).
Agreed that it reflects badly on the people involved, although less on Paul since he was only a "technical advisor" and arguably less responsible for thinking through / due diligence on the social aspects. It's frustrating to see the EA community (on EAF and Twitter at least) and those directly involved all ignoring this.
("shouldn’t be allowed anywhere near AI Safety decision making in the future" may be going too far though.)
So these resignations don’t negatively impact my p(doom) in the obvious way. The alignment people at OpenAI were already powerless to do anything useful regarding changing the company direction.
How were you already sure of this before the resignations actually happened? I of course had my own suspicions that this was the case, but was uncertain enough that the resignations are still a significant negative update.
ETA: Perhaps worth pointing out here that Geoffrey Irving recently left Google DeepMind to be Research Director at UK AISI, but seemingly on good terms (since Google DeepMind recently reaffirmed its intention to collaborate with UK AISI).
Bad: AI developers haven't taken alignment seriously enough to have invested enough in scalable oversight, and/or those techniques are unworkable or too costly, causing them to be unavailable.
Turns out at least one scalable alignment team has been struggling for resources. From Jan Leike (formerly co-head of Superalignment at OpenAI):
Over the past few months my team has been sailing against the wind. Sometimes we were struggling for compute and it was getting harder and harder to get this crucial research done.
Even worse, apparently the whole Superalignment team has been disbanded.
These may be among the ‘most direct’ or ‘simplest to imagine’ possible actions, but in the case of superintelligence, simplicity is not a constraint.
I think it is considered a constraint by some because they think that it would be easier/safer to use a superintelligent AI to do simpler actions, while alignment is not yet fully solved. In other words, if alignment were fully solved, then you could use it to do complicated things like what you suggest, but there could be an intermediate stage of alignment progress where you could safely use SI to do something simple like "melt GPUs" but not to achieve more complex goals.