Just a reminder to self that I wrote this, but need to write a counterargument to it based upon a new insight about what a good "popular book" can do.
Ah! Thanks so much. I was definitely conflating farsightedness as discount factor and farsightedness as vision of possible states in a landscape.
And that is why some resource increasing state may be too far out of the way, meaning NOT instrumentally convergent, - because the more distant that state is the closer its value is to zero, until it actually is zero. Hence the bracket.
"most agents stay alive in Pac-Man and postpone ending a Tic-Tac-Toe game", but only in the limit of farsightedness (γ→1)
I think there are two separable concepts at work in these examples, the success of an agent and the agent's choices as determined by the reward functions and farsightedness.
If we compare two agents, one with the limit of farsightedness and the other with half that, farsightedness (γ→1/2), then I expect the first agent to be more successful across a uniform distribution of reward functions and to skip over doing things like Trade School, but the second agent in light of more limited farsightedness would be more successful if it were seeking power. As Vanessa Kosoy said above,
... gaining is more robust to inaccuracies of the model or changes in the circumstances than pursuing more "direct" paths to objectives.
What I meant originally is that if an agent doesn't know if γ→1, then is it not true that an agent "seeks out the states in the future with the most resources or power? Now, certainly the agent can get stuck at a local maximum because of shortsightedness, and an agent can forgo certain options as result of its farsightedness.
So I am interpreting the theorem like so:
An agent seeks out states in the future that have more power at the limit of its farsightedness, but not states that, while they have more power, are below its farsightedness "rating."
Note: Assuming a uniform reward function.
If an agent is randomly placed in a given distribution of randomly connected points, I see why there are diminishing returns on seeking more power, but that return is never 0, is it?
This gives me pause.
You said, "Once promoted to your attention, you notice that the new plan isn't so much worse after all. The impact vanishes." Just to clarify, you mean that the negative impact of the original plan falling through vanishes, right?
When I think about the difference between value impact and objective impact, I keep getting confused.
Is money a type of AU? Money both functions as a resource for trading up (personal realization of goals) AND as a value itself (for example when it is an asset).
If this is the case, then any form of value based upon optionality violates the "No 'coulds' rule," doesn't it?
For example, imagine I have a choice between hosting a rationalist meetup and going on a long bike ride. There's a 50/50 chance of me doing either of those. Then something happens which removes one of those options (say a pandemic sweeps the country or something like that). If I'm interpreting this right, then the loss of the option has some personal impact, but zero objective impact.
Is that right?
Let's say an agent works in a low paying job that has a lot of positive impact for her clients - both by helping them attain their values and helping them increase resources for the world. Does the job have high objective impact and low personal impact? Is the agent in bad equilibrium when achievable objective impact mugs her of personal value realization?
Let's take your examples of sad person with (P, EU):
Mope and watch Netflix (.90, 1). Text ex (.06, -500). Work (.04, 10). If suddenly one of these options disappeared is that a big deal? Behind my question is the worry that we are missing something about impact being exploited by one of the two terms which compose it and about whether agents in this framework get stuck in strange equilibria because of the way probabilities change based on time.
Help would be appreciated.
Thank you for this.
Your characterization of Reductive Utility matches very well my own experience in philosophical discussion about utilitarianism. Most of my interlocutors object that I am proposing a reductive utility notion which suffers from incomputability (which is essentially how Anscombe dismissed it all in one paragraph, putting generations of philosophers pitted eternally against any form of consequentialism).
However, I always thought it was obvious that one need not believe that objects and moral thinking must be derived from ever lower levels of world states.
What do you think are the downstream effects of holding Reductive Utility Function theory?
I'm thinking the social effects of RUF is more compartmentalization of domains because from an agent perspective their continuity is incomputable, does that make sense?
I think about the mystery of spelling a lot. Part of it is that English is difficult, of course. But still why does my friend who reads several long books a year fail so badly at spelling, as he has always struggled since 2nd and 3rd grade when his mom would take extra time out just to ensure that he learned his spelling words enough to pass.
I have never really had a problem with spelling and seem to use many methods when I am thinking about spelling explicitly, sound it out, picture it, remember it as a chunk, recall the language of origin to figure out dipthongs. I notice that students who are bad at spelling frequently have trouble learning foreign languages, maybe the correlation points to a common cause?
Strongly agree causal models need lots of visuals. I liked the video, but I also realize I understood it because I know what Counterfactuals and Causal Inference is already. I think that is actually a fair assumption given your audience and the goals of this sequence. Nonetheless, I think you should provide some links to required background information.
I am not familiar with circuits or fluid dynamics so those examples weren't especially elucidating to me. But I think as long as a reader understands one or two of your examples it is fine. Part of making this judgment depends upon your own personal intuition about how labor should be divided between author and reader. I am fine with high labor, and making a video is, IMO, already quite difficult.
I think you should keep experimenting with the medium.
What have you learned about transfer in your experience at CFAR? Have you seen people gain the ability to transfer the methods of one domain into other domains? How do you make transfer more likely to occur?
I'm sure the methods of CFAR have wider application than to Machine Learning...