Coding day in and out on LessWrong 2.0
Yeah, my current commenting guidelines are empty. Other users have non-empty commenting guideliens.
The FAQ covers almost all the site-functionality, including karma. Here is the relevant section:
You can also link to subsections, if you just right-click on the relevant section in the ToC and select "Copy Link Address".
I've also spent 30 minutes looking for anything in this space and didn't find anything. The closest that I could find was Neuroeconomics.
Promoted to curated: It's been a while since this post has come out, but I've been thinking of the "credit assignment" abstraction a lot since then, and found it quite useful. I also really like the way the post made me curious about a lot of different aspects of the world, and I liked the way it invited me to boggle at the world together with you.
I also really appreciated your long responses to questions in the comments, which clarified a lot of things for me.
One thing comes to mind for maybe improving the post, though I think that's mostly a difference of competing audiences:
I think some sections of the post end up referencing a lot of really high-level concepts, in a way that I think is valuable as a reference, but also in a way that might cause a lot of people to bounce off of it (even people with a pretty strong AI Alignment background). I can imagine a post that includes very short explanations of those concepts, or moves them into a context where they are more clearly marked as optional (since I think the post stands well without at least some of those high-level concepts)
nods Seems good. I agree that there are much more interesting things to discuss.
nods You did say the following:
I honestly don’t see how they could sensibly be aggregated into anything at all resembling a natural category
I interpreted that as saying "there is no resemblance between attending a CFAR workshop and reading the sequences", which seems to me to include the natural categories of "they both include reading/listening to largely overlapping concepts" and "their creators largely shared the same aim in the effects it tried to produce in people".
I think there is a valuable and useful argument to be made here that in the context of trying to analyze the impact of these interventions, you want to be careful to account for the important differences between reading a many-book length set of explanations and going to an in-person workshop with in-person instructors, but that doesn't seem to me what you said in the your original comment. You just said that there is no sensible way to put these things into the same category, which just seems obviously wrong to me, since there clearly is a lot of shared structure to analyze between these interventions.
I mean, a lot of the CFAR curriculum is based on content in the sequences, the handbook covers a lot of the same declarative content, and they are setting out with highly related goals (with Eliezer helping with early curriculum development, though much less so in recent years). The beginning of R:A-Z even explicitly highlights how he thinks CFAR is filling in many of the gaps he left in the sequences, clearly implying that they are part of the same aim.
Sure, there are differences, but overall they are highly related and I think can meaningfully be judged to be in a natural category. Similar to how a textbook and a university-class or workshop on the same subject are obviously related, even though they will differ on many relevant dimensions.
Note that all three of the linked paper are about "boundedly rational agents with perfectly rational principals" or about "equally boundedly rational agents and principals". I have been so far unable to find any papers that follow the described pattern of "boundedly rational principals and perfectly rational agents".
I am confused. If MWI is true, we are all already immortal, and every living mind is instantiated a very large number of times, probably literally forever (since entropy doesn't actually decrease in the full multiverse, and is just a result of statistical correlation, but if you buy the quantum immortality argument you no longer care about this).
Bayesian agents are logically omniscient, and I think a large fraction of deceptive practices rely on asymmetries in computation time between two agents with access to slightly different information (like generating a lie and checking the consistencies between this new statement and all my previous statements)
My sense is also that two-player games with bayesian agents are actually underspecified and give rise to all kinds of weird things due to the necessity for infinite regress (i.e. an agent modeling the other agent modeling themselves modeling the other agent, etc.), which doesn't actually reliably converge, though I am not confident. A lot of decision-theory seems to do weird things with bayesian agents.
So overall, not sure how well you can prove theorems in this space, without having made a lot of progress in decision-theory, and I expect the resolution to a lot of our confusions in decision-theory to be resolved by moving away from bayesianism.
Yep, that's correct. We experimented with some other indicators, but this was the one that seemed least intrusive.