ozhang

$250K in Prizes: SafeBench Competition Announcement

TLDR: CAIS is distributing $250,000 in prizes for benchmarks that empirically assess AI safety. This project is supported by Schmidt Sciences. Submissions are open until February 25th, 2025, and winners will be announced April 25th, 2025. To view additional info about the competition, including submission guidelines and FAQs, visit https://www.mlsafety.org/safebench If...

Apr 3, 2024 · 26

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Subscribe here to receive future versions. --- Cybersecurity Challenges in AI Safety Meta accidentally leaks a language model to the public. Meta’s newest language model, LLaMa,...

May 2, 2023 · 32

AI Safety Newsletter #3: AI policy proposals and a new challenger approaches

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Subscribe here to receive future versions. --- Policy Proposals for AI Safety Critical industries rely on the government to protect consumer safety. The FAA approves new...

Apr 25, 2023 · 33

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Subscribe here to receive future versions. --- ChaosGPT and the Rise of Language Agents Chatbots like ChatGPT usually only respond to one prompt at a time,...

Apr 18, 2023 · 30

AI Safety Newsletter #1 [CAIS Linkpost]

by Orpheus16, Dan H, and ozhang

The Center for AI Safety just launched its first AI Safety Newsletter. The newsletter is designed to inform readers about developments in AI and AI safety. No technical background required. Subscribe here to receive future versions. First edition below: --- Growing concerns about rapid AI progress Recent advancements in AI...

Apr 10, 2023 · 45

Announcing the Introduction to ML Safety course

by Dan H, TW123, and ozhang

TLDR: We're announcing a new course designed to introduce students with a background in machine learning to the most relevant concepts in empirical ML-based AI safety. The course is available publicly here. Background: AI safety is a small but rapidly growing field, and both younger and more experienced researchers are...

Aug 6, 2022 · 73

$20K In Bounties for AI Safety Public Materials

by Dan H, TW123, and ozhang

TLDR: We are announcing a $20k bounty for publicly understandable explainers of AI safety concepts. We are also releasing the results of the AI Safety Arguments competition. Background: Of the technologists, ML researchers, and policymakers thinking about AI, very few are seriously thinking about AI existential safety. This results in less...

Aug 5, 2022 · 71

ozhang

ML Alignment Theory Program under Evan Hubinger

[$20K in Prizes] AI Safety Arguments Competition

Introducing the ML Safety Scholars Program

Announcing the Introduction to ML Safety course
