Epistemic status: trying to vaguely gesture at vague intuitions. A similar idea was explored here under the heading "the intelligibility of intelligence", although I hadn't seen it before writing this post. As of 2020, I consider this follow-up comment to be a better summary of the thing I was trying to convey with this post than the post itself.
There’s a mindset which is common in the rationalist community, which I call “realism about rationality” (the name being intended as a parallel to moral realism). I feel like my skepticism about agent foundations research is closely tied to my skepticism about this mindset, and so in this essay I try to articulate what it is.
Humans ascribe properties to entities in the world in order to describe and predict them. Here are three such properties: "momentum", "evolutionary fitness", and "intelligence". These are all pretty useful properties for high-level reasoning in the fields of physics, biology and AI, respectively. There's a key difference between the first two, though. Momentum is very amenable to formalisation: we can describe it using precise equations, and even prove things about it. Evolutionary fitness is the opposite: although nothing in biology makes sense without it, no biologist can take an organism and write down a simple equation to define its fitness in terms of more basic traits. This isn't just because biologists haven't figured out that equation yet. Rather, we have excellent reasons to think that fitness is an incredibly complicated "function" which basically requires you to describe that organism's entire phenotype, genotype and environment.
In a nutshell, then, realism about rationality is a mindset in which reasoning and intelligence are more like momentum than like fitness. It's a mindset which makes the following ideas seem natural:
- The idea that there is a simple yet powerful theoretical framework which describes human intelligence and/or intelligence in general. (I don't count brute force approaches like AIXI for the same reason I don't consider physics a simple yet powerful description of biology).
- The idea that there is an “ideal” decision theory.
- The idea that AGI will very likely be an “agent”.
- The idea that Turing machines and Kolmogorov complexity are foundational for epistemology.
- The idea that, given certain evidence for a proposition, there's an "objective" level of subjective credence which you should assign to it, even under computational constraints.
- The idea that Aumann's agreement theorem is relevant to humans.
- The idea that morality is quite like mathematics, in that there are certain types of moral reasoning that are just correct.
- The idea that defining coherent extrapolated volition in terms of an idealised process of reflection roughly makes sense, and that it converges in a way which doesn’t depend very much on morally arbitrary factors.
- The idea that having contradictory preferences or beliefs is really bad, even when there’s no clear way that they’ll lead to bad consequences (and you’re very good at avoiding Dutch books and money pumps and so on).
To be clear, I am neither claiming that realism about rationality makes people dogmatic about such ideas, nor claiming that they're all false. In fact, from a historical point of view I’m quite optimistic about using maths to describe things in general. But starting from that historical baseline, I’m inclined to adjust downwards on questions related to formalising intelligent thought, whereas rationality realism would endorse adjusting upwards. This essay is primarily intended to explain my position, not justify it, but one important consideration for me is that intelligence as implemented in humans and animals is very messy, and so are our concepts and inferences, and so is the closest replica we have so far (intelligence in neural networks). It's true that "messy" human intelligence is able to generalise to a wide variety of domains it hadn't evolved to deal with, which supports rationality realism, but analogously an animal can be evolutionarily fit in novel environments without implying that fitness is easily formalisable.
Another way of pointing at rationality realism: suppose we model humans as internally-consistent agents with beliefs and goals. This model is obviously flawed, but also predictively powerful on the level of our everyday lives. When we use this model to extrapolate much further (e.g. imagining a much smarter agent with the same beliefs and goals), or base morality on this model (e.g. preference utilitarianism, CEV), is that more like using Newtonian physics to approximate relativity (works well, breaks down in edge cases) or more like cavemen using their physics intuitions to reason about space (a fundamentally flawed approach)?
Another gesture towards the thing: a popular metaphor for Kahneman and Tversky's dual process theory is a rider trying to control an elephant. Implicit in this metaphor is the localisation of personal identity primarily in the system 2 rider. Imagine reversing that, so that the experience and behaviour you identify with are primarily driven by your system 1, with a system 2 that is mostly a Hansonian rationalisation engine on top (one which occasionally also does useful maths). Does this shift your intuitions about the ideas above, e.g. by making your CEV feel less well-defined? I claim that the latter perspective is just as sensible as the former, and perhaps even more so - see, for example, Paul Christiano's model of the mind, which leads him to conclude that "imagining conscious deliberation as fundamental, rather than a product and input to reflexes that actually drive behavior, seems likely to cause confusion."
These ideas have been stewing in my mind for a while, but the immediate trigger for this post was a conversation about morality which went along these lines:
R (me): Evolution gave us a jumble of intuitions, which might contradict when we extrapolate them. So it’s fine to accept that our moral preferences may contain some contradictions.
O (a friend): You can’t just accept a contradiction! It’s like saying “I have an intuition that 51 is prime, so I’ll just accept that as an axiom.”
R: Morality isn’t like maths. It’s more like having tastes in food, and then having preferences that the tastes have certain consistency properties - but if your tastes are strong enough, you might just ignore some of those preferences.
O: For me, my meta-level preferences about the ways to reason about ethics (e.g. that you shouldn’t allow contradictions) are so much stronger than my object-level preferences that this wouldn’t happen. Maybe you can ignore the fact that your preferences contain a contradiction, but if we scaled you up to be much more intelligent, running on a brain orders of magnitude larger, having such a contradiction would break your thought processes.
R: Actually, I think a much smarter agent could still be weirdly modular like humans are, and work in such a way that describing it as having “beliefs” is still a very lossy approximation. And it’s plausible that there’s no canonical way to “scale me up”.
I had a lot of difficulty in figuring out what I actually meant during that conversation, but I think a quick way to summarise the disagreement is that O is a rationality realist, and I’m not. This is not a problem, per se: I'm happy that some people are already working on AI safety from this mindset, and I can imagine becoming convinced that rationality realism is a more correct mindset than my own. But I think it's a distinction worth keeping in mind, because assumptions baked into underlying worldviews are often difficult to notice, and also because the rationality community has selection effects favouring this particular worldview even though it doesn't necessarily follow from the community's founding thesis (that humans can and should be more rational).
I didn't like this post. At the time, I didn't engage with it much beyond writing a mildly critical comment (which is currently the top-voted comment, somewhat to my surprise). So it seems like a good idea to say something now.
The main argument that this is valuable seems to be: this captures a common crux in AI safety. I don't think it's my crux, and I think other people who think it is their crux are probably mistaken. So from my perspective it's a straw-man of the view it's trying to point at.
The main problem is the word "realism". It isn't clear exactly what it means, but I suspect that being really anti-realist about rationality would not shift my views about the importance of MIRI-style research that much.
I agree that there's something kind of like rationality realism. I just don't think this post successfully points at it.
Ricraz starts out with the list: momentum, evolutionary fitness, intelligence. He says that the question (of rationality realism) is whether intelligence is more like momentum or more like fitness. Momentum is highly formalizable. Fitness is a useful a…
ETA: The original version of this comment conflated "evolution" and "reproductive fitness", I've updated it now (see also my reply to Ben Pace's comment).
MIRI in general and you in particular seem unusually (to me) confident that:
1. We can learn more than we already know about rationality of "ideal" agents (or perhaps arbitrary agents?).
2. This understanding will allow us to build AI systems that we understand better than the ones we build today.
3. We will be able to do this in time for it to affect real AI systems. (This could be either because it is unusually tractable and can be solved very quickly, or because timelines are very long.)
This is primarily based on what research you and MIRI do, some of MIRI's strategy writing, writing like the Rocket Alignment problem and law thinking, and an assumption that you are choosing to do this research because you think it is an effective way to reduce AI risk (given your skills).
(Another possibility is that you think that…
Huh? A lot of these points about evolution register to me as straightforwardly false. Understanding the theory of evolution moved us from "Why are there all these weird living things? Why do they exist? What is going on?" to "Each part of these organisms has been designed by a local hill-climbing process to maximise reproduction." If I looked into it, I expect I'd find that early medicine found it very helpful to understand how the system was built. It's the difference between me handing you a massive amount of code with a bunch of weird outputs and telling you to make it work better and more efficiently, versus doing the same thing but also telling you what company made the code, why and how they made it, and giving you loads of examples of other pieces of code they made in this fashion.
If I knew how to operationalise it I would take a pretty strong bet that the theory of natural selection has been revolutionary in the history of medicine.
This seems like the closest fit, but my view has some commonalities with points 1-3 nonetheless.
It sounds like our potential cruxes are closer to point 3 and to the question of how doomed current approaches are. Given that, do you still think rationality realism seems super relevant (to your attempted steelman of my view)?
I guess my position is something like this. I think it may be quite possible to make capabilities "blindly" -- basically the processing-power heavy type of AI progress (applying enough tricks so you're not lit…
I like this review and think it was very helpful in understanding your (Abram's) perspective, as well as highlighting some flaws in the original post, and ways that I'd been unclear in communicating my intuitions. In the rest of my comment I'll try to write a synthesis of my intentions for the original post with your comments; I'd be interested in the extent to which you agree or disagree.
We can distinguish between two ways to understand a concept X. For lack of better terminology, I'll call them "understanding how X functions" and "understanding the nature of X". I conflated these in the original post in a confusing way.
For example, I'd say that studying how fitness functions would involve looking into the ways in which different components are important for the fitness of existing organisms (e.g. internal organs; circulatory systems; etc). Sometimes you can generalise that knowledge to organisms that don't yet exist, or even prove things about those components (e.g. there's probably useful maths connecting graph theory with optimal nerve wiring), but it's still very grounded in concrete examples. If we thought that we…
[Spoiler-boxing the following response not because it's a spoiler, but because I was typing a response as I was reading your message and the below became less relevant. The end of your message includes exactly the examples I was asking for (I think), but I didn't want to totally delete my thinking-out-loud in case it gave helpful evidence about my state.]
I'm having trouble here because yes, the theory of population genetics factors in heavily to what I said, but to me reproductive fitness functions (largely) inherit their realness from the role they play in population genetics. So the two comparisons you give seem not very…
In this essay, ricraz argues that we shouldn't expect a clean mathematical theory of rationality and intelligence to exist. I have debated em about this, and I continue to endorse more or less everything I said in that debate. Here I want to restate some of my (critical) position by building it from the ground up, instead of responding to ricraz point by point.
When should we expect a domain to be "clean" or "messy"? Let's look at everything we know about science. The "cleanest" domains are mathematics and fundamental physics. There, we have crisply defined concepts and elegant, parsimonious theories. We can then "move up the ladder" from fundamental to emergent phenomena, going through high energy physics, molecular physics, condensed matter physics, biology, geophysics / astrophysics, psychology, sociology, economics... On each level more "mess" appears. Why? Occam's razor tells us that we should prioritize simple theories over complex theories. But, we shouldn't expect a theory to be more simple than the specification of the domain. The general theory of planets should be simpler than a detailed description of planet Earth, the general theory of atomic matter should be simpler th…
Rationality realism seems like a good thing to point out which might be a crux for a lot of people, but it doesn't seem to be a crux for me.
I don't think there's a true rationality out there in the world, or a true decision theory out there in the world, or even a true notion of intelligence out there in the world. I work on agent foundations because there's still something I'm confused about even after that, and furthermore, AI safety work seems fairly hopeless while still so radically confused about the-phenomena-which-we-use-intelligence-and-rationality-and-agency-and-decision-theory-to-describe. And, as you say, "from a historical point of view I’m quite optimistic about using maths to describe things in general".
I have an intuition that the "realism about rationality" approach will lead to success, even if it will have to be dramatically revised on the way.
To explain, imagine two groups, centuries ago, trying to find out how the planets move. Group A says: "Obviously, planets must move according to some simple mathematical rule. The simplest mathematical shape is a circle, therefore planets move in circles. All we have to do is find out the exact diameter of each circle." Group B says: "No, you guys underestimate the complexity of the real world. The planets, just like everything in nature, can only be approximated by a rule, but there are always exceptions and unpredictability. You will never find a simple mathematical model to describe the movement of the planets."
The people who finally find out how the planets move will be spiritual descendants of Group A, even if along the way they have to add epicycles, and then discard the idea of circles, which seems like a total failure of the original group. The problem with Group B is that it has no energy to move forward.
The right moment to discard a simple model is when you have enough data to build a more complex model.
In this particular example, it's true that Group A was more correct. This is because planetary physics can be formalised relatively easily, and also because it's a field where you can only observe and not experiment. But imagine the same conversation between sociologists who are trying to find out what makes people happy, or between venture capitalists trying to find out what makes startups succeed. In those cases, Group B can move forward using the sort of "energy" that biologists and inventors and entrepreneurs have, driven by an experimental and empirical mindset. Whereas Group A might spend a long time writing increasingly elegant equations which rely on unjustified simplifications.
Instinctively reasoning about intelligence using analogies from physics instead of the other domains I mentioned above is a very good example of rationality realism.
I'll note that (non-extreme) versions of this position are consistent with ideas like "it's possible to build non-opaque AGI systems." The full answer to "how do birds work?" is incredibly complex, hard to formalize, and dependent on surprisingly detailed local conditions that need to be discovered empirically. But you don't need to understand much of that complexity at all to build flying machines with superavian speed or carrying capacity, or to come up with useful theory and metrics for evaluating "goodness of flying" for various practical purposes; and the resultant machines can be a lot simpler and more reliable than a bird, rather than being "different from birds but equally opaque in their own alien way".
This isn't meant to be a…
Just want to note that I've been pushing for (what I think is) a proper amount of uncertainty about "realism about rationality" for a long time. Here's a collection of quotes from just my top-level posts, arguing against various items in your list: …
I really like the compression "There's no canonical way to scale me up."
I think it captures a lot of the important intuitions here.
I think I want to split up ricraz's examples in the post into two subclasses, defined by two questions.
The first asks, given that there are many different AGI architectures one could scale up into, are some better than others? (My intuition is both that some are better than others, and also that many are on the Pareto frontier.) And is there any sort of simple way to determine why one is better than another? This leads to the following examples from the OP: …
The second asks - suppose that some architectures are better than others, and suppose there are some simple explanations about why some are better than others. How practical is it to talk of me in this way today? Here are some concrete examples of things I might do: …
Making a type error is not easy to distinguish from attempting to shift frame. (If it were, the frame control wouldn't be very effective.) In the example Eliezer gave from the sequences, he was shifting frame from one that implicitly acknowledges interpretive labor as a cost, to one that demands unlimited amounts of interpretive labor by assuming that we're all perfect Bayesians (and therefore have unlimited computational ability, memory, etc).
This is a big part of the dynamic underlying mistake vs conflict theory.
Eliezer's behavior in the story you're alluding to only seems "rational" insofar as we think the other side ends up with a better opinion - I can easily imagine a structurally identical interaction where the protagonist manipulates someone into giving up on a genuine but hard to articulate objection, or proceeding down a conversational path they're ill-equipped to navigate, thus "closing the sale."
It's not at all clear that improving the other person's opinion was really one of Eliezer's goals on this occasion, as opposed to showing up the other person's intellectual inferiority. He called the post "Bayesian Judo", and highlighted how his showing-off impressed someone of the opposite sex.
He does also suggest that in the end he and the other person came to some sort of agreement -- but it seems pretty clear that the thing they agreed on had little to do with the claim the other guy had originally been making, and that the other guy's opinion on that didn't actually change. So I think an accurate, though arguably unkind, summary of "Bayesian Judo" goes like this: "I was at a party, I got into an argument with a religious guy who didn't believe AI was possible, I overwhelmed him with my superior knowledge and intelligence, he submitted to my manifest superiority, and the whole performance impressed a woman". On this occasion, helping the other party to have better opinions doesn't seem to have been a high priority.
Note: I confess to being a bit surprised that you picked this example. I’m not quite sure whether you picked a bad example for your point (possible) or whether I’m misunderstanding your point (also possible), but I do think that this question is interesting all on its own, so I’m going to try and answer it.
Here’s a joke that you’ve surely heard before—or have you?
The lesson of this joke applies to the “according to Aumann’s Agreement Theorem …” case.
When someone says “I guess we’ll have to agree to disagree” and their interlocutor responds with “Actually according to Aumann’s Agreement Theorem, we can’t”, I don’t know if I’d call this a “type error”, precisely (maybe it is; I’d have to think about it more carefully); but the second person is certainly being ridiculou…
Firstly, I hadn't heard the joke before, and it made me chuckle to myself.
Secondly, I loved this comment, for very accurately conveying the perspective I felt like ricraz was trying to defend wrt realism about rationality.
Let me say two (more) things in response:
Firstly, I was taking the example directly from Eliezer.
(Sidenote: I have not yet become sufficiently un-confused about AAT to have a definite opinion about whether EY was using it correctly there. I do expect after further reflection to object to most rationalist uses of the AAT but not this particular one.)
Secondly, and where I think the crux…
Well, I guess you probably won’t be surprised to hear that I’m very familiar with that particular post of Eliezer’s, and instantly thought of it when I read your example. So, consider my commentary with that in mind!
Well, whether Eliezer was using the AAT correctly rather depends on what he meant by “rationalist”. Was he using it as a synonym for “perfect Bayesian reasoner”? (Not an implausible reading, given his insistence elsewhere on the term “aspiring rationalist” for mere mortals like us, and, indeed, like himself.) If so, then certainly what he said about the Theorem was true… but then, of course, it would be wholly inappropriate to apply it in the actual case at hand (especially since his interlocutor was, I surmise, some sort of religious person, and plausibly not even an aspiring rationalist).
If, instead, Eliezer was using “rationalist” to refer to mere actual humans of today, such as himself and the fellow he was conve…
Relevant: Why Common Priors
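For reference, here is the theorem being argued about, in its standard form (the phrasing is mine, added for clarity, not taken from any comment): if two agents share a common prior $P$ and have information partitions $\mathcal{I}_1, \mathcal{I}_2$, and if at the actual state of the world their posteriors for an event $A$,
$$q_1 = P(A \mid \mathcal{I}_1) \quad \text{and} \quad q_2 = P(A \mid \mathcal{I}_2),$$
are common knowledge between them, then $q_1 = q_2$. The strength of the hypotheses (a literal common prior, posteriors that are genuinely common knowledge, perfect Bayesian updating) is exactly what the disagreement above turns on when the theorem is invoked against actual humans.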
Although I don't necessarily subscribe to the precise set of claims characterized as "realism about rationality", I do think this broad mindset is mostly correct, and the objections outlined in this essay are mostly wrong.
This seems entirely wrong to me. Evolution definitely should be studied using mathematical models, and although I am not an expert in that, AFAIK this approach is fairly standard. "Fitness" just refers to the expected behavior of the number of descendants of a given organism or gene. Therefore, it is perfectly defin…
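The definition being gestured at here can be written down explicitly (a textbook-style sketch, not a quote from the comment): the absolute fitness of a type $g$ in environment $E$ is
$$w(g) = \mathbb{E}\big[\text{number of offspring of an individual of type } g \text{ in } E\big],$$
which is perfectly well-defined as an expectation. The original post's complaint is about compressing or computing this quantity, since evaluating the expectation in general requires describing the organism's whole phenotype, genotype and environment.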
I used to think the same way, but the OP made me have a crisis of faith, and now I think the opposite way.
Sure, an animal brain solving an animal problem is messy. But a general purpose computer solving a simple mathematical problem can be just as messy. The algorithm for multiplying matrices in O(n^2.8) is more complex than the algorithm for doing it in O(n^3), and the algorithm with O(n^2.4) is way more complex than that. As I said in the other comment, "algorithms don't get simpler as they get better".
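A minimal sketch of the comparison being made, assuming square matrices whose size is a power of two (the O(n^2.8) algorithm here is Strassen's; the asymptotically faster ones, down towards O(n^2.4), are far more complicated still):

```python
# Naive matrix multiplication: three obvious nested loops, O(n^3).
def matmul_naive(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

# Strassen's algorithm, roughly O(n^2.81): seven recursive products
# instead of eight, at the cost of many extra additions and a far
# less obvious structure. Assumes n is a power of two.
def matmul_strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quarter(M):
        return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
                [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quarter(A)
    B11, B12, B21, B22 = quarter(B)
    # The seven Strassen products.
    M1 = matmul_strassen(add(A11, A22), add(B11, B22))
    M2 = matmul_strassen(add(A21, A22), B11)
    M3 = matmul_strassen(A11, sub(B12, B22))
    M4 = matmul_strassen(A22, sub(B21, B11))
    M5 = matmul_strassen(add(A11, A12), B22)
    M6 = matmul_strassen(sub(A21, A11), add(B11, B12))
    M7 = matmul_strassen(sub(A12, A22), add(B21, B22))
    # Recombine into the four quadrants of the result.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul_naive(A, B) == matmul_strassen(A, B) == [[19, 22], [43, 50]]
```

The naive version is three transparent loops; Strassen's buys its better exponent by trading one recursive multiplication for many extra additions and a much less obvious recombination step, which is the "better algorithms aren't simpler" point in miniature.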
I don't know a lot about the study of matrix multiplication complexity, but I think that one of the following two possibilities is likely to be true:
Moreover, if we only care about having a polynomial time algorithm with some exponent then the solution is simple (and doesn't require any astronomical coefficients like Levin search; incidentally, the O(n^3) algorithm is a…
Machine learning uses data samples about an unknown phenomenon to extrapolate and predict the phenomenon in new instances. Such algorithms can have provable guarantees regarding the quality of the generalization: this is exactly what computational learning theory is about. Deep learning is currently poorly understood, but this seems more like a result of how young the field is, rather than some inherent mysteriousness of neural networks. And even so, there is already some progress. People were making buildings and cannons before Newtonian mechanics, building engines before thermodynamics, and finding ways of using chemical reactions before quantum mechanics or modern atomic theory. The fact that you can do something using trial and error doesn't mean trial and error is the only way to do it.
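One concrete example of the kind of provable guarantee meant here (a standard finite-class bound, included as an illustration rather than taken from the comment): for a finite hypothesis class $\mathcal{H}$, a loss bounded in $[0,1]$, and $m$ i.i.d. samples, with probability at least $1-\delta$ every $h \in \mathcal{H}$ satisfies
$$\big|\operatorname{err}(h) - \widehat{\operatorname{err}}(h)\big| \le \sqrt{\frac{\ln|\mathcal{H}| + \ln(2/\delta)}{2m}},$$
where $\operatorname{err}$ is the true error and $\widehat{\operatorname{err}}$ the empirical error; the proof is just Hoeffding's inequality plus a union bound over $\mathcal{H}$.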
Once again, it seems perfectly possible to build an abstract theory of evolution (for example, evolutionary game theory would be one component of that theory). Of course, the specific organisms we have on Earth with their specific quirks is not something we can describe by simple equations: unsurprisingly, since they are a rather arbitrary point in the space of all possible organisms!
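For instance, the replicator equation (a standard piece of evolutionary game theory, stated here as an illustration): if $x_i$ is the frequency of strategy $i$ and $f_i(x)$ its fitness in population state $x$, then
$$\dot{x}_i = x_i\big(f_i(x) - \bar{f}(x)\big), \qquad \bar{f}(x) = \sum_j x_j f_j(x),$$
so a strategy grows exactly when its fitness exceeds the population average. Abstract results of this kind are what such a theory would consist of, without ever writing down the fitness of any particular Earth organism.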
It plays a minor role in deep learning, in the sense that some "deep" algorithms are adaptations of algorithms that have theoretical guarantees. For example, deep Q-learning is an adaptation of ordinary Q-learning. Obviously I cannot prove that it is possible to create an abstract theory of intellig…
As a matter of fact, I emphatically do not agree. "Birds" are a confusing example, because it speaks of modifying an existing (messy, complicated, poorly designed) system rather than making something from scratch. If we wanted to make something vaguely bird-like from scratch, we might have needed something like a "theory of self-sustaining, se…
Excellent post!
I find myself agreeing with much of what you say, but there are a couple of things which strike me as… not quite fitting (at least, into the way I have thought about these issues), and also I am somewhat skeptical about whether your attempt at conceptually unifying these concerns—i.e., the concept of “rationality realism”—quite works. (My position on this topic is rather tentative, I should note; all that’s clear to me is that there’s much here that’s confusing—which is, however, itself a point of agreement with the OP, and disagreement with “rationality realists”, who seem much more certain of their view than the facts warrant.)
Some specific points:
This seems to me to be a fundamentally confused proposition. Regardless of whether Hanson is right about how our minds work (and I suspect he is right to a large degree, if not quite entirely right), the question of who we are seems to be a matter of choosing which aspect(s) of our minds’ functioning to endorse as ego-syntonic. Under this view, it is non…
Thanks for the helpful comment! I'm glad other people have a sense of the thing I'm describing. Some responses:
I agree that it's a bit of a messy concept. I do suspect, though, that people who see each of the ideas listed above as "natural" do so because of intuitions that are similar both across ideas and across people. So even if I can't conceptually unify those intuitions, I can still identify a clustering.
I was a bit lazy in expressing it, but I think that the underlying idea makes sense (and have edited to clarify a little). There are certain properties we consider key to our identities, like consistency and introspective access. If w…
For the record, and in case I didn’t get this across—I very much agree that identifying this clustering is quite valuable.
As for the challenge of conceptual unification, we ought, I think, to treat it as a separate and additional challenge (and, indeed, we must be open to the possibility that a straightforward unification is not, after all, appropriate).
There are many classes of decision problems that allow optimal solutions, but none of them can cover all of reality, because in reality an AI can be punished for having any given decision theory. That said, the design space of decision theories has sweet spots. For example, future AIs will likely face an environment where copying and simulation is commonplace, and we've found simple decision theories that allow for copies and simulations. Looking for more such sweet spots is fun and fruitful.
Great post, thank you for writing this! Your list of natural-seeming ideas is very thought provoking.
I used to think that way, but now I agree with your position more. Something like Bayesian rationality is a small piece that many problems have in common, but any given problem will have lots of other structure to be exploited as well. In many AI problems, like recognizing handwriting or playing board games, that lets you progress faster than if you'd tried to start with the Bayesian angle.
We could still hope that the best algorithm for any given problem will turn out to be simple. But that seems unlikely, judging from both AI tasks like MNIST, where neural nets beat anything hand-coded, and non-AI tasks like matrix multiplication, where asymptotically best algorithms have been getting more and more complex. As a rule, algorithms don't get simpler as they get better.
I'm not sure what you changed your mind about. Some of the examples you give are unconvincing, as they do have simple meta-algorithms that both discover the more complicated better solutions and analyse their behavior. My guess is that the point is that, for example, looking into the nuances of things like decision theory is an endless pursuit, with more and more complicated solutions accounting for more and more unusual aspects of situations (that can no longer be judged as clearly superior), and no simple meta-algorithm that could've found these more complicated solutions, because it wouldn't know what to look for. But that's the content of values, the thing you look for in human behavior, and we need at least a poor solution to the problem of making use of that. Perhaps you mean that even this poor solution is too complicated for humans to discover?
My impression is that an overarching algorithm would allow the agent to develop solutions for the specialized tasks, not that it would directly constitute a perfect solution. I don’t quite understand your position here – would you mind elaborating?
It's a common crux between me and MIRI / rationalist types in AI safety, and it's way easier to say "Realism about rationality" than to engage in an endless debate about whether everything is approximating AIXI or whatever that never seems to update me.
I think it was important to have something like this post exist. However, I now think it's not fit for purpose. In this discussion thread, rohinmshah, abramdemski and I end up spilling a lot of ink about a disagreement that ended up being at least partially because we took 'realism about rationality' to mean different things. rohinmshah thought that irrealism would mean that the theory of rationality was about as real as the theory of liberalism, abramdemski thought that irrealism would mean that the theory of rationality would be about as real as the theo…
This is one of the unfortunately few times there was *substantive* philosophical discussion on the forum. This is a central example of what I think is good about LW.
I think within "realism about rationality" there are at least 5 plausible positions one could take on other metaethical issues, some of which do not agree with all the items on your list, so it's not really a single mindset. See this post, where I listed those 5 positions along with the denial of "realism about rationality" as the number 6 position (which I called normative anti-realism), and expressed my uncertainty as to which is the right one.
Curated this post for:
This post gave a short name for a way of thinking that I naturally fall into, and implicitly pointed to the possibility of that way of thinking being mistaken. This makes a variety of discussions in the AI alignment space more tractable. I do wish that the post were more precise at characterising the position of 'realism about rationality' and its converse, or (even better) that it gave arguments for or against 'realism about rationality' (even a priors-based one as in this closely related Robin Hanson post), but pointing to a type of proposition and giving it a name seems very valuable.
Although not exactly the central point, seemed like a good time to link back to "Do you identify as the elephant or the rider?"
I was kind of iffy about this post until the last point, which immediately stood out to me as something I vehemently disagree with. Whether or not humans naturally have values or are consistent is irrelevant -- that which is not required will happen only at random and thus tend not to happen at all, and so if you aren't very very careful to actually make sure you're working in a particular coherent direction, you're probably not working nearly as efficiently as you could be and may in fact be running in circles without noticing.
Thanks for writing this, it's a very concise summary of the parts of LW I've never been able to make sense of, and I'd love to have a better understanding of what makes the ideas in your bullet-pointed list appealing to those who tend towards 'rationality realism'. (It's sort of a background assumption in most LW stuff, so it's hard to find places where it's explicitly justified.)
Also: …
Is there any online reference explaining this?
First, let me say I 100% agree with the idea that there is a problem in the rationality community of viewing rationality as something like momentum or gold (I named my blog rejectingrationality after this phenomenon and tried to deal with it in my first post).
However, I'm not totally sure everything you say falls under that concept. In particular, I'd say that rationality realism is something like the belief that there is a fact of the matter about how best to form beliefs or take actions in response to a particular set of experiences and that…
I like this post and the concept in general, but would prefer slightly different terminology. To me, a mindset being called "realism about rationality" implies that this is the realistic, or correct mindset to have; a more neutral name would feel appropriate. Maybe something like "'rationality is math' mindset" or "'intelligence is intelligible' mindset"?
Some other ideas for the "rationality realism" list:
Very interesting post. I think exploring the limits of our standard models of rationality is very worthwhile. IMO the models used in AI tend to be far too abstract, and don't engage enough with situatedness, unclear ontologies, and the fundamental weirdness of the open world.
One strand of critique of rationality that I really appreciate is David Chapman's "meta-rationality," which he defines as "evaluating, choosing, combining, modifying, discovering, and creating [rational] systems"
https://meaningness.com/metablog/meta-rationality-curriculum
I consider myself a rationality realist, but I don't believe some of the things you attribute to rationality realism (particularly concerning morality and consciousness). I don't think there's a true decision theory or true morality, but I do think that you could find systems of reasoning that are provably optimal within certain formal models.
There is no sense in which our formal models are true, but as long as they have high predictive power the models would be useful, and that I think is all that matters.
"Implicit in this metaphor is the localization of personal identity primarily in the system 2 rider. Imagine reversing that, so that the experience and behaviour you identify with are primarily driven by your system 1, with a system 2 that is mostly a Hansonian rationalization engine on top (one which occasionally also does useful maths). Does this shift your intuitions about the ideas above, e.g. by making your CEV feel less well-defined?"
I find this very interesting because locating personal identity in system 1 feels conceptually impossible or…