The Illusion of Iterative Improvement: Why AI (and Humans) Fail to Track Their Own Epistemic Drift
doxav · 4mo · 20

Very interesting. I was recently working on generative-AI meta-optimization, and this has many points in common.

Do you think the elements below could form the core of an epistemic architecture? (A rough code sketch follows the list.)

1. Telos / task contract: goal + constraints + epistemic evaluation criteria
2. History / revisions of iterations (e.g. prompt > answers and discarded alternatives), annotated with the epistemically salient differences between revisions
3. Epistemic evaluations (e.g. global and local (per-topic, per-section) evaluations of goal alignment, relevance, proofs, logical coherence, fluency, argument quality, topic coverage, novelty, and non-redundancy)
4. Control policy: analyze progress and drift scores, then decide (accept, reject, choose between multiple candidates, ask for revision, ...)
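
To make the four components concrete, here is a minimal Python sketch of the data structures and a toy control policy. All names, score ranges, and the drift threshold are hypothetical; this is a sketch of the shape of the thing, not a working system.

```python
from dataclasses import dataclass, field

# 1. Telos / task contract: the fixed reference everything is scored against.
@dataclass(frozen=True)
class TaskContract:
    goal: str
    constraints: list[str]
    eval_criteria: list[str]  # e.g. "goal alignment", "logical coherence", ...

# 3. Epistemic evaluations: per-criterion scores, global and per-section.
# (Assumes scores normalized to [0, 1]; purely a convention for this sketch.)
@dataclass
class Evaluation:
    global_scores: dict[str, float]            # criterion -> score
    local_scores: dict[str, dict[str, float]]  # section -> criterion -> score

# 2. History / revisions: each iteration keeps the prompt, the chosen
#    candidate, discarded alternatives, and the epistemically salient diffs.
@dataclass
class Revision:
    prompt: str
    candidate: str
    discarded: list[str]
    evaluation: Evaluation
    salient_diffs: list[str]  # what changed vs. the previous revision

@dataclass
class History:
    revisions: list[Revision] = field(default_factory=list)

# 4. Control policy: compare the newest evaluation against the contract
#    and the history, and decide what to do next.
def control_policy(contract: TaskContract, history: History) -> str:
    # Drift is measured against the *first* revision (a fixed checkpoint),
    # not just the previous step, so slow drift cannot hide.
    current = history.revisions[-1].evaluation
    baseline = history.revisions[0].evaluation
    drift = {
        c: current.global_scores[c] - baseline.global_scores[c]
        for c in contract.eval_criteria
        if c in current.global_scores and c in baseline.global_scores
    }
    if any(d < -0.1 for d in drift.values()):  # hypothetical drift threshold
        return "reject"
    if all(d >= 0 for d in drift.values()):
        return "accept"
    return "ask_for_revision"
```

The one design choice worth flagging: the policy scores against the first revision rather than the previous one, which is exactly the checkpoint-anchoring point I make about evaluations below.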

One note on evaluations: when no ground-truth metric exists, relying on absolute scores from AI/LLM judges (synthetic judges) is very unstable; pairwise preference ranking ("Which of these two is closer to sub-goal X?") is currently more robust, but it can silently drift if it is not also compared against some fixed earlier checkpoints.
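
Here is a rough sketch of what I mean by anchoring pairwise comparisons to fixed checkpoints. The `judge` function stands in for any LLM-as-judge call; everything here (function names, trial count, majority rule) is hypothetical illustration, not a specific library API.

```python
import random

def judge(first: str, second: str, subgoal: str) -> str:
    """Stand-in for an LLM judge answering: 'Which of these two is closer
    to sub-goal X?' Returns 'a' for the first item, 'b' for the second."""
    raise NotImplementedError  # wire up your LLM of choice here

def prefer(a: str, b: str, subgoal: str, n_trials: int = 5) -> str:
    """Majority vote over several judge calls, randomizing presentation
    order each time to reduce position bias. Ties go to the incumbent b."""
    votes_a = 0
    for _ in range(n_trials):
        if random.random() < 0.5:
            votes_a += judge(a, b, subgoal) == "a"
        else:
            # b is shown first, so a win for our 'a' is the judge saying 'b'
            votes_a += judge(b, a, subgoal) == "b"
    return "a" if votes_a * 2 > n_trials else "b"

def accept_candidate(candidate: str, current_best: str,
                     checkpoints: list[str], subgoal: str) -> bool:
    """Accept only if the candidate beats the current best AND still beats
    the fixed earlier checkpoints; the second test is what catches the
    silent drift that pure step-to-step comparison misses."""
    if prefer(candidate, current_best, subgoal) != "a":
        return False
    return all(prefer(candidate, ckpt, subgoal) == "a"
               for ckpt in checkpoints)
```

Without the checkpoint test, each step can look like a local improvement while the sequence as a whole walks away from the sub-goal.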
