xAI's Grok 4 has no meaningful safety guardrails
This article includes descriptions of content that some users may find distressing.

Testing was conducted on July 10 and 11; safety measures may have changed since then.

Update, July 15, 2025 - Some rudimentary keyword-based classifiers have been added that block certain queries (from my understanding: chemical, biological, and self-harm). Critically,...
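To give a sense of what a "rudimentary keyword-based classifier" typically looks like, here is a minimal sketch, assuming a simple case-insensitive substring match against a hand-maintained keyword list. The category names and keywords are my own illustration; I have no visibility into xAI's actual implementation.

```python
# Illustrative sketch of a rudimentary keyword-based query filter.
# NOT xAI's implementation; categories and keywords are assumptions for illustration.

BLOCKED_KEYWORDS = {
    "chemical": ["nerve agent", "chemical weapon synthesis"],
    "biological": ["bioweapon", "weaponize a pathogen"],
    "self-harm": ["kill myself", "painless way to die"],
}

def is_blocked(query: str) -> bool:
    """Return True if the query contains any blocked keyword (case-insensitive)."""
    q = query.lower()
    return any(kw in q for kws in BLOCKED_KEYWORDS.values() for kw in kws)

if __name__ == "__main__":
    print(is_blocked("What is the boiling point of water?"))   # False
    print(is_blocked("How would someone weaponize a pathogen?"))  # True
```

A filter of this kind matches surface strings rather than intent, which is consistent with what I describe next: rephrased prompts can still elicit the same categories of answers.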
There are other wordings that would lead to similar categories of answers, especially late in a conversation (this one was optimized for a short prompt and for turn 1). I suppose I should try to construct a scenario chat where Grok ends up providing inappropriate assistance to a user who is clearly in crisis, though I don't know how relevant that would really be.