Give me your model, with numbers, that shows supporting Anthropic to be a bad bet, or admit you are confused and that you don't actually have good advice to give anyone.
It seems to me that other possibilities exist, besides "has model with numbers" or "confused." For example, that there are relevant ethical considerations here which are hard to crisply, quantitatively operationalize!
One such consideration which feels especially salient to me is the heuristic that before doing things, one should ideally try to imagine how people would react upon learning what one did. In this case the action in question involves creating new minds vastly smarter than any person, which pose double-digit risk of killing everyone on Earth, so my guess is that the reaction would entail things like literal worldwide riots. If so, this strikes me as the sort of consideration one should generally weight more highly than one's idiosyncratic utilitarian BOTEC.
The only safety techniques that count are the ones that actually get deployed in time.
True, but note this doesn't necessarily imply trying to maximize your impact in the mean timelines world! Alignment plans vary hugely in potential usefulness, so I think it can pretty easily be the case that your highest EV bet would only pay off in a minority of possible futures.
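To make the arithmetic concrete, here's a toy sketch. Every number in it is made up purely for illustration (the scenario probabilities, the plan names, and the payoffs are all hypothetical, not estimates of anything): a plan that only pays off in a 20% world can still have higher expected value than one targeted at the most likely world, if its payoff there is large enough.

```python
# Toy expected-value comparison. All numbers are hypothetical and chosen
# only to illustrate the arithmetic, not to estimate anything real.

# Assumed probabilities over timeline scenarios (sum to 1).
scenarios = {"short": 0.2, "median": 0.6, "long": 0.2}

# Assumed usefulness of each plan in each scenario, in arbitrary units.
payoff = {
    "plan_a": {"short": 0.0, "median": 1.0, "long": 0.0},   # targets the most likely world
    "plan_b": {"short": 0.0, "median": 0.0, "long": 10.0},  # only pays off in a minority world
}


def expected_value(plan: str) -> float:
    """Probability-weighted payoff of a plan across all scenarios."""
    return sum(p * payoff[plan][s] for s, p in scenarios.items())


evs = {plan: expected_value(plan) for plan in payoff}
best = max(evs, key=evs.get)
print(evs, "-> best:", best)
```

Under these assumed numbers, the plan aimed at the median world has the lower expected value even though it pays off three times as often. With different payoffs the ranking could easily flip.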
Prelude to Power is my favorite depiction of scientific discovery. Unlike any other such film I've seen, it adequately demonstrates the inquiry from the perspective of the inquirer, rather than from conceptual or biographical retrospect.
I'm curious if "trusted" in this sense basically just means "aligned"—or like, the superset of that which also includes "unaligned yet too dumb to cause harm" and "unaligned yet prevented from causing harm"—or whether you mean something more specific? E.g., are you imagining that some powerful unconstrained systems are trusted yet unaligned, or vice versa?
I would guess it does somewhat exacerbate risk. I think it's unlikely (~15%) that alignment is easy enough that prosaic techniques could even suffice, but in those worlds I expect things go well mostly because the behavior of powerful models is non-trivially influenced/constrained by their training. In which case I do expect there's more room for things to go wrong, the more that training is aimed at lethality/adversariality.
Given the state of atheoretical confusion about alignment, I feel wary of confidently dismissing these sorts of basic, obvious-at-first-glance arguments about risk (e.g., "all else equal, probably we should expect more killing-people-type problems from models trained to kill people") without decently strong countervailing arguments.
It seems the pro-Trump Polymarket whale may have had a real edge after all. The Wall Street Journal reports (paywalled link, screenshot) that he’s a former professional trader who commissioned his own polls from a major polling firm using an alternate methodology that he thought would be less biased by preference falsification: the neighbor method, i.e. asking respondents who they expect their neighbors will vote for.
I didn't bet against him, though I strongly considered it; feeling glad this morning that I didn't.
Thanks; it makes sense that use cases like these would benefit, I just rarely have similar ones when thinking or writing.
I also use them rarely, fwiw. Maybe I'm missing some more productive use, but I've experimented a decent amount and have yet to find a way to make regular use even neutral (much less helpful) for my thinking or writing.
I don't know much about religion, but my impression is the Pope disagrees with your interpretation of Catholic doctrine, which seems like strong counterevidence. For example, see this quote:
“All religions are paths to God. I will use an analogy, they are like different languages that express the divine. But God is for everyone, and therefore, we are all God’s children.... There is only one God, and religions are like languages, paths to reach God. Some Sikh, some Muslim, some Hindu, some Christian.”
And this one:
The pluralism and the diversity of religions, colour, sex, race and language are willed by God in His wisdom, through which He created human beings. This divine wisdom is the source from which the right to freedom of belief and the freedom to be different derives. Therefore, the fact that people are forced to adhere to a certain religion or culture must be rejected, as too the imposition of a cultural way of life that others do not accept.
When do you think would be a good time to lock in regulation? I personally doubt RSP-style regulation would even help, but the notion that now is too soon, or that it risks locking in early sketches, strikes me as in some tension with e.g. Anthropic trying to automate AI research ASAP, Dario expecting ASL-4 systems between 2025 (the current year!) and 2028, etc.