Trump and the Republican party will wield broad governmental control during what will almost certainly be a critical period for AGI development. In this post, we want to briefly share various frames and ideas we’ve been thinking through and actively pitching to Republican lawmakers over the past months in preparation for the possibility of a Trump win.
Why are we sharing this here? Given that >98% of the EAs and alignment researchers we surveyed earlier this year identified as everything-other-than-conservative, we consider thinking through these questions to be another strategically worthwhile neglected direction.
(Along these lines, we also want to proactively emphasize that politics is the mind-killer, and that, regardless of one’s ideological convictions, those who earnestly care about alignment must take seriously the possibility that Trump will be the US president...
Many thanks to Brandon Goldman, David Langer, Samuel Härgestam, Eric Ho, Diogo de Lucena, and Marc Carauleanu, for their support and feedback throughout.
Most alignment researchers we sampled in our recent survey think we are currently not on track to succeed with alignment–meaning that humanity may well be on track to lose control of our future.
In order to improve our chances of surviving and thriving, we should apply our most powerful coordination methods towards solving the alignment problem. We think that startups are an underappreciated part of humanity’s toolkit, and having more AI-safety-focused startups would increase the probability of solving alignment.
That said, we also appreciate that AI safety is highly complicated by nature[1] and therefore calls for a more nuanced approach than simple pro-startup boosterism. In the rest of this post,...
Quick update from AE Studio: last week, Judd (AE’s CEO) hosted a panel at SXSW with Anil Seth, Allison Duettmann, and Michael Graziano, entitled “The Path to Conscious AI” (discussion summary here[1]).
We’re also making available an unedited Otter transcript/recording for those who might want to read along or increase the speed of the playback.
With the release of each new frontier model seems to follow a cascade of questions probing whether or not the model is conscious in training and/or deployment. We suspect that these questions will only grow in number and volume as these models exhibit increasingly sophisticated cognition.
If consciousness is indeed sufficient for moral patienthood, then the stakes seem remarkably high from a utilitarian perspective that we do not commit the Type II error of behaving...
I had an opportunity to ask an individual from one of the mentioned labs about plans to use external evaluators and they said something along the lines of:
“External evaluators are very slow - we are just far better at eliciting capabilities from our models.”
They earlier said something much to the same effect when I asked if they’d been surprised by anything people had used deployed LLMs for so far, ‘in the wild’. Essentially, no, not really, maybe even a bit underwhelmed.