Tom DAVID — LessWrong

Protecting Cognitive Integrity: Our internal AI use policy (V1)

We (at GPAI Policy Lab) want to share our V1 policy as an invitation for pushback. It is motivated in part by our extrapolations of AI capabilities, internal conversations about their effects on cognition, and some empirical evidence. I think the expected cost of being somewhat over-cautious here is lower...

Apr 24

Claude Mythos Preview: Analysis of Anthropic's Public Announcement

by Antoine Maier, Tom DAVID, lcarbonell, Jérémy Andréoletti, TReltgen, and Pierre Peigné

tl;dr:
* The virally shared figures of 10 trillion parameters and $10 billion training cost come from no identifiable source;
* Cybersecurity capabilities represent a significant leap, but are in line with previous models;
* Updated Responsible Scaling Policy removed threat models related to radiological and nuclear weapons with no...

Apr 14

A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities

Epistemic status: This idea emerged during my participation in the MATS program this summer. While I intended to develop it further and conduct more rigorous analysis, time constraints led me to publish this initial version (30-60 minutes of work). I'm...

Jan 9, 2025

Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks

I am sharing this call from the EU AI Office for organizations involved in evaluation. Please take a close look: among the selection criteria, organizations must be based in Europe, or their leader must be European. If these criteria pose challenges for some of you, feel free to reach out...

Nov 27, 2024

Workshop Report: Why current benchmarks approaches are not sufficient for safety?

I’m sharing the report from the workshop held during the AI, Data, Robotics Forum in Eindhoven, a European event bringing together policymakers, industry representatives, and academics to discuss the challenges and opportunities in AI, data, and robotics. This report provides a snapshot of the current state of discussions on benchmarking...

Nov 26, 2024