There have been relevant prompt additions: https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-antisemitic-racist-content?utm_source=substack&utm_medium=email
Grok's behavior appeared to stem from an update over the weekend that instructed the chatbot to "not shy away from making claims which are politically incorrect, as long as they are well substantiated," among other things.
What do you think is the cause of Grok suddenly developing a liking for Hitler?
Are we sure that really happened? The press discourse can't actually assess Grok's average Hitler affinity; it only knows how to surface the 5 most sensational things Grok has said over the past month. So for all I can tell, this could just be an increase in variance.
If it were also saying more tankie stuff, no one would notice.
ignoring almost all of the details from the simulations
Would you assume this because it is those wasteful simulations that compute every step in high detail (instead of being actively understood and compressed) that contain the highest measure of human experience?
Hmm. I should emphasise that in order for the requested interventions to happen, they need to be able to happen in such a way that they don't invalidate whatever question the simulation is asking about the real world. Which is to say, they have to be historically insignificant: they have to be confined to dark corners that science and journalism won't document, or which power and its advisors will never read. That it would be able to manage those kinds of complications conflicts with the wasteful simulator assumption, doesn't it? But also, are you, who petition for interventions, willing to live in those dark corners? Or to part with your blessings whenever you depart the dark corners?
Could a superintelligence that infers that it needs to run simulations to learn about aliens fail to infer the contents of this post?
I've always assumed no, which is why I never wrote it myself.
I don't think this would ever be better than just randomizing your party registration over the distribution of how you would distribute your primary budget. Same outcomes in expectation at scale (usually?), but also more saliently, much less work, and you're able to investigate your assigned party a lot more thoroughly than you would if you were spreading your attention over more than one.
You could maybe rationalize it by doing a quadratic voting thing, where you get a vote weighted by the sqrt of your budget allocation/100. Quadratic voting is usually done over different political issues rather than parties, and it has some beautiful arguments in that use case. But as above, quadratic voting is also essentially a subsidy on low-information voting / spreading your vote thinly (the thinner you spread it, the more influence you end up exerting in total). I'm not sure how it could be a good thing on net.
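To make the "thinner spread, more total influence" point concrete, here is a rough sketch. It assumes the weight per party is sqrt(share/100), with shares in percent of your budget summing to 100; the function name `total_influence` is just made up for illustration.

```python
import math

def total_influence(allocation):
    """Sum of per-party vote weights, assuming weight = sqrt(share / 100).

    `allocation` is a list of budget shares in percent (assumed to sum to 100).
    """
    return sum(math.sqrt(share / 100) for share in allocation)

# Everything on one party, versus even splits over two and four parties.
concentrated = total_influence([100])
two_way = total_influence([50, 50])
four_way = total_influence([25, 25, 25, 25])

print(concentrated, two_way, four_way)
```

Under that assumption, an even split over n parties gives a total weight of sqrt(n), so the total keeps growing the more thinly you spread.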
Are you calling approval voting a ranked choice system here? I guess technically it consists of ranking every candidate either first or second equal, but it's a, uh, counterintuitive categorization.
I actually don't think we'd have those reporting biases.
Though I think that might be trivially true; if someone is part of a community, they're not going to be able or willing to hide their psychosis diagnosis from it. If someone felt a need to hide something like that from a community, they would not really be part of that community.
A nice articulation on false intellectual fences
Perhaps the deepest lesson that I've learned in the last ten years is that there can be this seeming consensus, these things that everyone knows that seem sort of wise, seem like they're common sense, but really they're just kind of herding behaviour masquerading as maturity and sophistication, and when you've seen how the consensus can change overnight, when you've seen it happen a number of times, eventually you just start saying nope
I think there are probably reporting bias and demographic selection effects going on too:
Definitely worthy of attention, but suspicious things about it: Author is anon, writes well despite never having posted before, named after a troll object, and also I've heard that ordinary levels of formate are usually only 4x lower than this.