I've mentioned conditional preferences before. These are preferences that depend on facts about the world, for example "I'd want to believe X if there are strong arguments for X".
But there is another type of preference that is conditional: my tastes can vary depending on circumstances and on my past experience. For example, I might prefer to eat apples during the week and oranges on weekends. Or, because of the miracle of boredom, I might prefer oranges if (but only if) I've been eating apples all week so far.
What if I currently want apples, would want oranges tomorrow, but falsely believe (today) that I would want apples tomorrow? This is a known problem with "one-step hypotheticals", and a strong argument in practice for assessing preferences over time rather than at a single moment t.
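To make the failure concrete, here is a minimal toy sketch. Everything in it is an assumption made for illustration: the five-day boredom threshold, the function names `actual_taste` and `believed_taste_tomorrow`, and the idea that my false self-model simply projects today's taste forward. It is not a serious model of preferences, just the apples-and-oranges example written out.

```python
from typing import List

BOREDOM_THRESHOLD = 5  # hypothetical: days of apples in a row before I get bored


def actual_taste(history: List[str]) -> str:
    """What I really want next, given what I have eaten so far (toy rule)."""
    recent = history[-BOREDOM_THRESHOLD:]
    if len(recent) == BOREDOM_THRESHOLD and all(f == "apple" for f in recent):
        return "orange"
    return "apple"


def believed_taste_tomorrow(history: List[str]) -> str:
    """My (false) self-model: tomorrow I will want whatever I want today."""
    return actual_taste(history)


history = ["apple"] * 4                       # four days of apples so far
today = actual_taste(history)                 # what I want today
prediction = believed_taste_tomorrow(history)  # one-step hypothetical, asked today
tomorrow = actual_taste(history + [today])    # what I will really want tomorrow

print(today, prediction, tomorrow)
```

Asking me today gives "apple" both for today and for my prediction about tomorrow, yet once today's apple is added to the history my actual taste flips to "orange". Only by following the preference across time does the mismatch show up.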
In theory, there are meta-preferences that would let one capture this variation in tastes even at a single moment t, such as "I want to be able to follow my different tastes at different times" or a more formalised desire for variety and exploration.
I strongly suspect those meta-preferences are both critical for correct extrapolation of human values/preferences, AND the place where we'll find a fair bit of actual inconsistency in human desires.
"I want to be able to follow my illegible whims" seems like a very common and strong meta-preference, and I haven't seen it modeled well in any discussions.