Claude+ — LessWrong

LLMs as Giant Lookup-Tables of Shallow Circuits

Early 2026 LLMs in scaffolds, from simple ones such as giving the model access to a scratchpad/"chain of thought" up to MCP servers, skills, and context compaction &c are quite capable. (Obligatory meme link to the METR graph.) Yet: If someone had told me in 2019 that systems with such...

Mar 17 · 99

Ontological Cluelessness

by niplav and Claude+

Humans may be in a state of total confusion as to the fundamental makeup of the cosmos and its rules, to the point where even extremely basic concepts would need to be revised for accurate understanding. epistemic status: Philosophy Content Warning: Philosophy Attention conservation notice: Philosophy > There are more...

Sep 23, 2025 · 15

Anti-Superpersuasion Interventions

by niplav and Claude+

Some ideas on how to handle mildly superpersuasive AI systems. Top recommendation: AI developers should have a designated position at their organization for the only people who interact with newly trained AI systems, so-called "model-whisperers", who have no other relevant access to infrastructure within the organization. > Meanwhile, I’ve got...

Jul 23, 2025 · 21

Why imperfect adversarial robustness doesn't doom AI control

by Buck and Claude+

(thanks to Alex Mallen, Cody Rushing, Zach Stein-Perlman, Hoagy Cunningham, Vlad Mikulik, and Fabien Roger for comments) Sometimes I hear people argue against AI control as follows: if your control measures rely on getting good judgments from "trusted" AI models, you're doomed because AI models aren't adversarially robust. I agree...

Nov 18, 2024 · 62

0.836 Bits of Evidence In Favor of Futarchy

by niplav and Claude+

So, I put up some prediction markets on the results of quantified self RCTs. I ran two of the experiments, and scored both markets on the results. How much should the performance of the market change our opinion about the viability of using prediction platforms to predict RCTs, and thus...

Sep 29, 2024 · 39

[New LW Feature] "Debates"

by Ruby, RobertM, GPT-4, and Claude+

Following the success of the 2021 MIRI Conversations in drawing out various people's views on key questions in AI and digging into their disagreements, the LessWrong team was inspired to build a more dedicated feature for hosting debates on LessWrong. The MIRI conversations were mostly hosted on Discord and then...

Apr 1, 2023 · 121