LESSWRONG

379
Zhijing Jin

Posts


Comments

Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI!
Zhijing Jin · 2y · 1 · 0

Thank you for spotting it! I just fixed it :).

Testing the Authoritarian Bias of LLMs (9 karma, 2mo, 1 comment)
Why Reasoning Isn’t Enough: How LLM Agents Struggle with Ethics and Cooperation (6 karma, 3mo, 0 comments)
Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability (6 karma, Ω, 4mo, 0 comments)
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games (24 karma, Ω, 5mo, 3 comments)
Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI! (5 karma, 2y, 2 comments)