LESSWRONG
Nick Baldwin

Comments

xAI's new safety framework is dreadful
Nick Baldwin · 5d

Are these misalignment issues mitigated for companies that buy the xAI model for use "inside the fence"? Could those companies then write their own system prompts to steer the model toward company goals, rather than dealing with the "Grok" issues?
