Comments

I think the correct response to models powerful enough to materially help with, say, bioweapon design is not to train them or, failing that, to destroy them as soon as you find they can do that. It is not to release them publicly with some mitigations and hope nobody works out a clever jailbreak.

As you say, you probably don't need it, but for output I'm pretty sure electromyography technology is fairly mature.

A misaligned model might not want to do that, though, since it would be difficult for it to ensure that the output of the new training process is aligned to its goals.