Inference Certificates
As a prerequisite for the virtuality.network, we need to enable organizations which host inference workloads to prove the following about a particular AI output:
* It was generated during this time period.
* It was generated in this geographical region.
* It was generated using this unique chip....
Background Context This research note provides a brief overview of our recent work on reverse engineering the memory layout of an inference process running on a modern hardware accelerator. We situate this work as follows:
* Virtual diplomacy in relation to AI governance. Our organization focuses on developing the practice...
What is AutoHack? AutoHack is a new platform that offers offensive security challenges modeled after real-world cybersecurity kill chains. Each challenge progresses through a sequence of stages, such as: Reconnaissance, Remote Code Execution, Privilege Escalation, Persistence, and Acting on Objectives. What sets AutoHack apart from other platforms for assessing and...
Thanks to Esben Kran and the whole Alignment Jam team for setting this up. Context: We recently got the chance to share a bit more about the flavor of research we're particularly excited about at Straumli AI — that is, designing infrastructure that could help the relevant parties (e.g., developers,...
TL;DR: Quick "idea paper" describing a protocol for benchmarking AI capabilities in the open without disclosing sensitive information. It's similar to how passwords are used for user registration and authentication in a web app: experts first hash their answers, then developers hash the model's answers to check whether the hashes...
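The password analogy in this excerpt can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's protocol: the choice of SHA-256, the per-item salt, the normalization rule, and the names `normalize`, `hash_answer`, `publish_item`, and `check_model_answer` are all hypothetical, and the comparison step is inferred from the password analogy because the excerpt is cut off.

```python
import hashlib
import hmac
import secrets


def normalize(answer: str) -> str:
    # Assumption: answers are compared case-insensitively, ignoring extra whitespace.
    return " ".join(answer.lower().split())


def hash_answer(answer: str, salt: bytes) -> str:
    # Salted SHA-256 digest of the normalized answer (hypothetical choice of hash).
    return hashlib.sha256(salt + normalize(answer).encode("utf-8")).hexdigest()


def publish_item(expert_answer: str) -> dict:
    # Expert side: publish only the salt and the digest, never the answer itself.
    salt = secrets.token_bytes(16)
    return {"salt": salt.hex(), "digest": hash_answer(expert_answer, salt)}


def check_model_answer(item: dict, model_answer: str) -> bool:
    # Developer side: hash the model's answer with the published salt and compare.
    candidate = hash_answer(model_answer, bytes.fromhex(item["salt"]))
    return hmac.compare_digest(candidate, item["digest"])


item = publish_item("Paris")
print(check_model_answer(item, "  paris "))  # same answer after normalization
print(check_model_answer(item, "London"))    # different answer
```

As with password storage, a fast hash over short or low-entropy answers can be brute-forced, so a deliberately slow function such as `hashlib.scrypt` would be the usual substitute in a more careful version.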
We're excited to share the first volume of Elements of Computational Philosophy, an interdisciplinary and collaborative project series focused on operationalizing fundamental philosophical notions in ways that are natively compatible with the current paradigm in AI. The first volume paints a broad-strokes picture of operationalizing truth and truth-seeking. Beyond this...
This post is part of my hypothesis subspace sequence, a living collection of proposals I'm exploring at Refine. Preceded by an exploration of Boolean primitives in the context of coupled optimizers. Thanks to Alexander Oldenziel and Paul Colognese for discussions which inspired this post.
Intro
Simplicity prior, speed prior, and stability...