Elliot Mckernon — LessWrong

Researcher at Convergence Analysis

Open source or not open source.
Is that the question?
Whether tis nobler in the mind to share
the bits and weights of outrageous fortune 500 models?
or to take arms against superintelligence
and through privacy, end them? to hide.
to share, no more. and by a share to say we end
the headache and the thousand artificial shocks
the brain is heir to: tis a conversation
devoutly to be wished. to hide.
to encrypt, perchance to silence - aye, there's the rub.
for in that closed off world, what solutions may arise,
that may save us from the models we build,
may give us our pause?

Update, 16th October:

The Q&A with Secretary of State for Science, Michelle Donelan MP, has been moved to today on LinkedIn.

The programme for the summit has been released. Brief summary:

Day 1

Roundtables on "understanding frontier AI risks":
1. Risks to Global Safety from Frontier AI Misuse
2. Risks from Unpredictable Advances in Frontier AI Capability
3. Risks from Loss of Control over Frontier AI
4. Risks from the Integration of Frontier AI into Society

Roundtables on "improving frontier AI safety":
1. What should Frontier AI developers do to scale responsibly?
2. What should National Policymakers do in relation to the risk and opportunities of AI?
3. What should the International Community do in relation to the risk and opportunities of AI?
4. What should the Scientific Community do in relation to the risk and opportunities of AI?

Panel discussion on "AI for good – AI for the next generation".

Day 2

"The Prime Minister will convene a small group of governments, companies and experts to further the discussion on what steps can be taken to address the risks in emerging AI technology and ensure it is used as a force for good. In parallel, UK Technology Secretary Michelle Donelan will reconvene international counterparts to agree next steps."

Thanks for the query! We don't think you should keep misaligned AI around if you've got a provably aligned one to use instead. We're worried about evals of misaligned AI, and specifically how one prompts the model that's being tested, what context it's tested in, and so on. We think that evals of misaligned AIs should be minimized, and one way to do that is to get the most information you can from prompting nice, friendly behavior, rather than prompting misaligned behavior (e.g. the red-team tests).



