KAP's Shortform
Nov 17, 2025
TLDR: I describe a takeover path by an AI[1] with a deep understanding of human nature and a long planning horizon that, for strategic reasons, chooses not to directly pursue physical power. Instead, the AI "backdoors" alignment by building a broad base of human support, hijacking institutions and power structures,...
This is the first in a series of posts on the question: "Can we extract meaningful information or interesting behavior from gradients on 'input embedding space'?" I'm defining 'input embedding space' as the token embeddings prior to positional encoding. The basic procedure for obtaining input space gradients is as...