I’m exploring whether AI alignment criteria should include a long-term directional constraint, rather than focusing primarily on short-term behavioral compliance.
In particular, I’m wondering if it makes sense to evaluate alignment partly by asking whether, over sufficiently long time scales, interaction with a system tends to reduce or stabilize collective human defensive cognition and existential anxiety rather than systematically increase them.
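To make the intuition slightly more concrete, here is a minimal toy sketch of the kind of check I have in mind. Everything in it is hypothetical: the "anxiety proxy" is just noise standing in for whatever measurement one would actually use, and the names (`anxiety_proxy`, `satisfies_directional_constraint`, the tolerance) are made up for illustration, not a claim that this quantity is measurable this way.

```python
# Toy sketch of a long-horizon "directional constraint", not a real metric.
# The proxy values are synthetic; in practice they would have to come from
# some actual measurement of collective defensive cognition / anxiety.

import random
import statistics

def anxiety_proxy(step: int) -> float:
    """Stand-in for a per-interaction measurement. Here: noisy values that
    slowly decay over time, purely for illustration."""
    return 1.0 / (1.0 + 0.01 * step) + random.gauss(0.0, 0.05)

def trend_slope(values: list[float]) -> float:
    """Ordinary least-squares slope of the values against their index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = statistics.fmean(values)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

def satisfies_directional_constraint(values: list[float],
                                     tolerance: float = 1e-3) -> bool:
    """The long-horizon check: over the whole window, the proxy should not
    be systematically increasing (slope at or below a small tolerance)."""
    return trend_slope(values) <= tolerance

if __name__ == "__main__":
    random.seed(0)
    history = [anxiety_proxy(t) for t in range(1000)]  # a long time scale
    print("slope:", round(trend_slope(history), 6))
    print("constraint satisfied:", satisfies_directional_constraint(history))
```

The point of the sketch is only that the criterion is about the direction of a long-run trend, not about compliance in any single interaction.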
I’m not sure whether this perspective overlaps with existing work or if there are known pitfalls in framing alignment this way. I’d really appreciate pointers to related discussions or critiques.