I'm writing an effortpost on this general topic but wanted to gauge reactions to the following thoughts, so I can tweak my approach.
I was first introduced to rationality about ten years ago and have been a reasonably dedicated practitioner of this discipline that whole time. The first few years saw me making a lot of bad choices. I was in the Valley of Bad Rationality; I didn't have experience with these powerful tools, and I made a number of mistakes.
My own mistakes had a lot to do with overconfidence in my ability to model and navigate complex situations. My ability to model and understand myself was particularly lacking.
In the more proximal part of this ten year period -- say, in the last five years -- I've actually gotten a lot better. And I got better, in my opinion, because I kept on thinking about the world in a fundamentally rationalist way. I kept making predictions, trying to understand what happened when my predictions went wrong, and updating both my world-model and my meta-model of how I should be thinking about predictions and models.
Centrally, I acquired an intuitive, gut level sense of how to think about situations where I could only see a certain angle, where I was either definitely or probably missing information, or situations involving human psychology. You could also classify another major improvement as being due generally to "actually multiplying probabilities semi-explicitly instead of handwaving", e.g. it's pretty unlikely that two things with independent 30% odds of being true, are both true. You could say through trial and error I came to understand why no wise person attempts a plan where more than one thing has to happen "as planned".
I think if you had asked me at the 5 year mark if this rationality thing was all it was cracked up to be, I very well might have said that it had led me to make a lot of bad decisions and execute bad plans, but after 10 years, and especially the last year or three, it has started working for me in a way that it didn't before.
The more specific details, the more interested would I be. Like, five typical bad choices in the first period, five typical good choices in the second period, in ideal case those would be five from different areas of life, and then five from the same areas. The "intuitive, gut level sense of how to think" sounds interesting, but without specific examples I would have no reason to trust this description.
It's pretty unlikely that two things with independent 30% odds of being true, are both true.
I'm not sure I'd call 9% (the combined probability of two independent 30% events) "pretty unlikely" - sure, it won't happen in most cases, but out of every 11 similar situations, you would see it happen once, which adds up to plenty of 9% chance events happening all the time
I thought folks might enjoy our podcast discussion of two of Ted Chiang's stories, Story of Your Life and The Truth of Fact, the Truth of Feeling.
I’m well versed in what I would consider to be the practical side of decision theory but I’m unaware of what tools, frameworks, etc. are used to deal with uncertainty in the utility function. By this I mean uncertainty in how utility will ultimately be assessed, for an agent that doesn’t actually know how much they will or won’t end up preferring various outcomes post facto, and they know in advance that they are ignorant about their preferences.
The thing is, I know how I would do this, it’s not really that complex (use probability distributions for the utilities associated with outcomes and propagate that through the decision tree) but I can’t find a good trailhead for researching how others have done this. When I Google things like “uncertainty in utility function” I am just shown standard resources on decision making under uncertainty, which is about uncertainty in the outcome, not uncertainty in the utility function.
(As for why I’m interested in this — first of all, it seems like a more accurate way of modeling human agents, and, second, I can’t see how you instantiate something like Indirect Normativity without the concept of uncertainty in the utility function itself.)
Are we talking about an agent that is uncertain about its own utility function or about an agent that is uncertain about another agent's?
You are probably talking about the former. What would count as evidence about the uncertain utility function?
Yes, the former. If the agent takes actions and receives reward, assuming it can see the reward, then it will gain evidence about its utility function.
Probably you already know this, but the framework known as reinforcement learning is very relevant here. In particular, there are probably web pages that describe how to compute the expected utility of a (strategy, reward function) pair.