Eliciting Latent Knowledge in Comprehensive AI Services Models
A Conceptual Framework and Preliminary Proposals for AI Alignment and Safety in R&D

Preface

The present blog post serves as an overview of a research report I authored over the summer as part of the CHERI fellowship program, under the supervision of Patrick Levermore. In this project, I explore the...
Nov 17, 2023