Very short version in the title. A bit longer version at the end. Most of the question is context.
Long version / context:
This is something I vaguely remember reading (I think on ACX). I want to check whether I remember it correctly and find out where I could learn about it in more technical detail.
Say you go camping in a desert. You wake up and notice something that might be a scary spider. You take a look and confirm that it is indeed a scary spider. This is bad, and you feel bad.
Since this is bad, you will be less likely to do the things that led to you feeling bad; for example, you'll be less likely to go camping in a desert.
But you probably won't learn to:
- avoid looking at something that might be a scary spider or
- stop recognizing spiders
even though those were much closer to you feeling bad (about being close to a scary spider).
This is a bit weird if you think that humans simply learn to get reward: usually you'd expect stuff that happened closer to the punishment to get punished more, not less.
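To make the naive expectation concrete, here is a minimal sketch of what I mean by "closer to the punishment gets punished more". It assumes a toy credit-assignment rule where blame decays exponentially with each step back in time. The function `assign_blame`, the decay factor `GAMMA`, and the action labels are all made up for illustration; this is not a claim about how brains (or any particular RL algorithm) actually work.

```python
# Toy illustration (all names and numbers are hypothetical).
# Assumption: the most recent action gets the full punishment, and each
# earlier action gets GAMMA times the blame of the action after it.
GAMMA = 0.5


def assign_blame(actions, punishment, gamma=GAMMA):
    """Map each action to its share of blame under exponential decay."""
    n = len(actions)
    return {
        action: punishment * gamma ** (n - 1 - i)
        for i, action in enumerate(actions)
    }


episode = [
    "go camping in a desert",
    "look at the possible spider",
    "recognize the spider",
]
blame = assign_blame(episode, punishment=-1.0)

for action, value in blame.items():
    print(f"{action}: {value:+.2f}")
```

Under this rule, "recognize the spider" gets the most blame and "go camping in a desert" the least, which is the opposite of what seems to happen in practice.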
What I recall is that there is a different reward for "epistemic" tasks, one based on the accuracy or salience of what gets recognized, not on whether it's positive or negative.
A bit longer version of the question:
Why don't humans learn to not recognize unpleasant things (too much)? Is there a different reward for some "epistemic" processes? Where could I learn more about this?