Apply at the link.
We're expanding our red team, whose job is to try to break our LLMs and get them to exhibit unexpected and unsafe behaviors. Note that one of the requirements for this specific position is a PhD in linguistics. If you have prior red teaming experience (professional or personal), even better :) If you do apply at the link, please send me a message on here as well.
If you do not meet the PhD requirement but think you'd be a good candidate for the red team (e.g. you have discovered new jailbreaks or adversarial techniques, are curious and have a hacker mindset, have done red teaming before, etc.), you can still message me on here so we can keep you in mind if other red teaming positions without the PhD qualification open up in the future.
There's an argument here along the lines of:
Imo this proves too much. The same reasoning would show that any behavior not present in <current generation of AIs> won't exist in AGI:
"AGI will ask humans for help when it realizes it would be useful"
"Ah, but present AI doesn't do this almost ever, despite plenty of examples in training of people reaching out for help. Therefore, this behavior won't exist in AGI either"
<Fill in with examples of essentially any smart behavior AI doesn't have today>
This post also doesn't touch on the...