“Utility Engineering: Analyzing and Controlling Emergent Value Systems in AI”
https://www.emergent-values.ai/
I walked through this paper’s findings in detail in a previous episode of Doom Debates, which IMO is one of my best episodes. Just skip straight to the chapters in the second half, timestamp 49:13:
This article is just saying "doomers are failing to prevent doom for various reasons, and also they might be wrong that doom is coming soon". But we're probably not wrong, and not being doomers isn't a better strategy. So it's a lame article IMO.
Wow this is my favorite post in a long time, super educational. I was familiar with the basic concept from the Sequences, but this added a great level of understandable detail. Kudos.
By your logic, if I ask you a totally separate question, "What's the probability that a parent's two kids are both boys?", would you answer 1/3? Because the correct answer should be 1/4, right? So something about your preferred methodology isn't robust.
I agree that frequentists are flexible in their approach when trying to get the right answer. But I think your version of the problem highlights just how flexible they have to be (i.e., mental gymnastics) compared to just explicitly being Bayesian all along.
In scenario B, where a random child runs up, I wonder if a non-Bayesian might prefer to just eliminate the (girl, girl) possibility and say the probability of two boys is 1/3?
In Puzzle 1 in my post, the non-Bayesian has an interpretation that's still plausibly reasonable, but in your scenario B it seems like they'd be clowning themselves to take that approach.
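To make the contrast concrete, here's a quick Monte Carlo sketch (my own illustration, not from either post) of the three situations: no conditioning, conditioning on "at least one boy," and conditioning on a randomly encountered child being a boy. The 1/4, 1/3, and 1/2 answers fall out directly:

```python
import random

random.seed(0)
N = 200_000

# Simulate two-child families; True = boy, each child independently 50/50.
families = [(random.random() < 0.5, random.random() < 0.5) for _ in range(N)]

# No conditioning: P(both boys) ≈ 1/4.
p_plain = sum(a and b for a, b in families) / N

# Scenario A: told "at least one is a boy" → P(both boys) ≈ 1/3.
at_least_one_boy = [f for f in families if f[0] or f[1]]
p_a = sum(a and b for a, b in at_least_one_boy) / len(at_least_one_boy)

# Scenario B: a random child runs up and happens to be a boy → P(both boys) ≈ 1/2.
seen_boy = 0
both_given_seen_boy = 0
for a, b in families:
    child = random.choice((a, b))  # which child we happen to see
    if child:
        seen_boy += 1
        if a and b:
            both_given_seen_boy += 1
p_b = both_given_seen_boy / seen_boy

print(p_plain, p_a, p_b)
```

The naive "just eliminate (girl, girl)" move gives 1/3 in both scenarios, but the simulation shows scenario B really is 1/2, because the process that selected the evidence matters.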
So I think we're on the same page that whenever things get real/practical/bigger-picture, then you gotta be Bayesian.
Thanks for this post.
I'd love to have a regular (weekly/monthly/quarterly) post that's just "here's what we're focusing on at MIRI these days".
I respect and value MIRI's leadership on the complex topic of building understanding and coordination around AI.
I spend a lot of time doing AI social media, and I try to promote the best recommendations I know to others. Whatever thoughts MIRI has would be helpful.
Given that I think about this less often and less capably than you folks do, there seems to be a low-hanging-fruit opportunity for people like me to stay more in sync with MIRI. My show (Doom Debates) isn't affiliated with MIRI, but as long as I continue to have no particular disagreements with MIRI, I'd like to make sure I'm pulling in the same direction as you all.
I’ve heard MIRI has some big content projects in the works, maybe a book.
FWIW I think having a regular stream of lower-effort content that a somewhat mainstream audience consumes would help to bolster MIRI’s position as a thought leader when they release the bigger works.
I'd ask: If one day your God stopped existing, would anything observably change?
Seems like a meaningless concept: a node in their causal model of reality that has no power to constrain expectations, but which the person likes because knowing the node exists in their own belief network brings them emotional reward.
I have multi-year-wide confidence intervals, which I think the authors of AI 2027 also do, so I don't have much of a stance on whether the best guess is 2026 or 2027 or 2030 or 2035. I agree 2027 seems a bit soon given the subjective rate of progress 🤷‍♂️