Philosophical self-ratification

jessicata

"Ratification" is defined as "the act or process of ratifying something (such as a treaty or amendment) : formal confirmation or sanction". Self-ratification, then, is assigning validity to one's self. (My use of the term "self-ratification" follows philosophical usage in the analysis of causal decision theory.)

At first this seems like a trivial condition. It is, indeed, easy to write silly sentences such as "This sentence is true and also the sky is green", which are self-ratifying. However, self-ratification combined with other ontological and epistemic coherence conditions is a much less trivial condition, which I believe to be quite important for philosophical theory-development and criticism.

I will walk through some examples.

Causal decision theory

Formal studies of causal decision theory run into a problem with self-ratification. Suppose some agent A is deciding between two actions, L and R. Suppose the agent may randomize their action, and that their payoff equals their believed probability that they take the action other than the one they actually take. (For example, if the agent believes they take action L with 40% probability and actually takes action R, the agent's payoff is 0.4.)

If the agent believes they will take action L with 30% probability, then, if they are a causal decision theorist, they will take action L with 100% probability, because that leads to 0.7 payoff instead of 0.3 payoff. But, if they do so, this invalidates their original belief that they will take action L with 30% probability. Thus, the agent's belief that they will take action L with 30% probability is not self-ratifying: the fact of the agent having this belief leads to the conclusion that they take action L with 100% probability, not 30%, which contradicts the original belief.

The only self-ratifying belief is that the agent will take each action with 50% probability; this way, both actions yield equal expected utility, and so a policy of 50/50 randomization is compatible with causal decision theory, and this policy ratifies the original belief.
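The fixed-point structure of this example can be checked mechanically. The following is a minimal sketch, not a general model of causal decision theory: the function names are hypothetical, and it assumes the agent simply maximizes the payoff described above, with any mixture counting as optimal when the two actions tie.

```python
from fractions import Fraction

def expected_payoffs(believed_p_left):
    """Payoff of each action = believed probability of taking the OTHER action."""
    return {"L": 1 - believed_p_left, "R": believed_p_left}

def cdt_optimal_range(believed_p_left):
    """Range (lo, hi) of P(L) values that are payoff-maximizing given the belief."""
    payoffs = expected_payoffs(believed_p_left)
    if payoffs["L"] > payoffs["R"]:
        return (Fraction(1), Fraction(1))  # strictly prefers L
    if payoffs["L"] < payoffs["R"]:
        return (Fraction(0), Fraction(0))  # strictly prefers R
    return (Fraction(0), Fraction(1))      # indifferent: any mixture is optimal

def is_self_ratifying(believed_p_left):
    """A belief ratifies itself if acting on it can reproduce the believed P(L)."""
    lo, hi = cdt_optimal_range(believed_p_left)
    return lo <= believed_p_left <= hi

# Check beliefs P(L) = 0.0, 0.1, ..., 1.0 using exact fractions.
candidates = [Fraction(k, 10) for k in range(11)]
ratifying = [p for p in candidates if is_self_ratifying(p)]
print(ratifying)
```

On this grid of candidate beliefs, only P(L) = 1/2 passes the check; a belief of 30% fails because the payoff-maximizing response is to take L with certainty.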

Genetic optimism

(This example is due to Robin Hanson's "Uncommon Priors Require Origin Disputes".)

Suppose Oscar and Peter are brothers. Oscar is more optimistic than Peter. Oscar comes to believe that the reason he is more optimistic is due to inheriting a gene that inflates beliefs about positive outcomes, whereas Peter did not inherit this same gene.

Oscar's belief-set is now not self-ratifying. He believes the cause of his belief that things will go well to be a random gene, not correlation with reality. This means that, according to his own beliefs, his optimism is untrustworthy.

Low-power psychological theories

Suppose a psychological researcher, Beth, believes that humans are reinforcement-learning stimulus-response machines, and that such machines are incapable of reasoning about representations of the world. She presents a logical specification of stimulus-response machines that she believes applies to all humans. (For similar real-world theories, see: Behaviorism, Associationism, Perceptual Control Theory)

However, a logical implication of Beth's beliefs is that she herself is a stimulus-response machine, and incapable of reasoning about world-representations. Thus, she cannot consistently believe that her specification of stimulus-response machines is likely to be an accurate, logically coherent representation of humans. Her belief-set, then, fails to self-ratify, on the basis that it assigns to herself a level of cognitive power insufficient to come to know that her belief-set is true.

Moral realism and value drift

Suppose a moral theorist, Valerie, believes:

Societies' moral beliefs across history follow a random walk, not directed anywhere.
Her own moral beliefs, for the most part, follow society's beliefs.
There is a true morality which is stable and unchanging.
Almost all historical societies' moral beliefs are terribly, terribly false.

From these it follows that, absent further evidence, the moral beliefs of Valerie's society should not be expected to be more accurate (according to estimation of the objective morality that Valerie believes exists) than the average moral beliefs across historical societies, since there is no moral progress in expectation. However, this implies that the moral beliefs of her own society are likely to be terribly, terribly false. Therefore, Valerie's adoption of her society's beliefs would imply that her own moral beliefs are likely to be terribly, terribly false: a failure of self-ratification.

Trust without honesty

Suppose Larry is a blogger who reads other blogs. Suppose Larry believes:

The things he reads in other blogs are, for the most part, true (~90% likely to be correct).
He's pretty much the same as other bloggers; there is a great degree of subjunctive dependence between his own behavior and other bloggers' behaviors (including their past behaviors).

Due to the first belief, he concludes that lying in his own blog is fine, as there's enough honesty out there that some additional lies won't pose a large problem. So he starts believing that he will lie and therefore his own blog will contain mostly falsehoods (~90%).

However, an implication of his similarity to other bloggers is that other bloggers will reason similarly, and lie in their own blog posts. Since this applies to past behavior as well, a further implication is that the things he reads in other blogs are, for the most part, false. Thus the belief-set, and his argument for lying, fail to self-ratify.
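Larry's situation can be sketched as a fixed-point check as well. This is only an illustration under loud assumptions: the policy functions, the 0.5 threshold, and the 0.9 honesty rate for a non-lying blogger are all hypothetical, and subjunctive dependence is modeled crudely as "other bloggers' honesty equals whatever Larry's own policy outputs".

```python
def lying_policy(believed_honesty_of_others):
    """Hypothetical policy: lie (10% true) if others seem honest enough, else be 90% honest."""
    return 0.1 if believed_honesty_of_others > 0.5 else 0.9

def honest_policy(believed_honesty_of_others):
    """Hypothetical policy: stay 90% honest regardless of what others do."""
    return 0.9

def is_self_ratifying(believed_honesty_of_others, policy, tol=1e-9):
    """Assume other bloggers run the same policy, so their honesty equals Larry's."""
    implied_honesty_of_others = policy(believed_honesty_of_others)
    return abs(implied_honesty_of_others - believed_honesty_of_others) < tol

print(is_self_ratifying(0.9, lying_policy))   # belief in 90% honesty plus the lying policy
print(is_self_ratifying(0.9, honest_policy))  # belief in 90% honesty plus the honest policy
```

Under these assumptions, the belief that other blogs are about 90% true is compatible with the honest policy but not with the lying one, which is the shape of the failure described above.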

(I presented a similar example in "Is Requires Ought".)

Mental nonrealism

Suppose Phyllis believes that the physical world exists, but that minds don't exist. That is, there are not entities that are capable of observation, thought, etc. (This is a rather simple, naive formulation of eliminative materialism)

Her reason for this belief is that she has studied physics, and believes that physics is sufficient to explain everything, such that there is no reason to additionally posit the existence of minds.

However, if she were arguing for the accuracy of her beliefs about physics, she would have difficulty arguing except in terms of e.g. physicists making and communicating observations, theorists having logical thoughts, her reading and understanding physics books, etc.

Thus, her belief that minds don't exist fails to self-ratify. It would imply that she lacks evidential basis for belief in the accuracy of physics. (On the other hand, she may be able to make up for this by coming up with a non-mentalistic account for how physics can come to be "known", though this is difficult, as it is not clear what there is that could possibly have knowledge. Additionally, she could believe that minds exist but are somehow "not fundamental", in that they are determined by physics; however, specifying how they are determined by physics requires assuming they exist at all and have properties in the first place.)

Conclusion

I hope the basic picture is clear by now. Agents have beliefs, and some of these beliefs imply beliefs about the trustworthiness of their own beliefs, primarily due to the historical origins of the beliefs (e.g. psychology, society, history). When the belief-set implies that it itself is untrustworthy (being likely to be wrong), there is a failure of self-ratification. Thus, self-ratification, rather than being a trivial condition, is quite nontrivial when combined with other coherence conditions.

Why would self-ratification be important? Simply put, a non-self-ratifying belief set cannot be trustworthy; if it were trustworthy then it would be untrustworthy, which shows untrustworthiness by contradiction. Thus, self-ratification points to a rich set of philosophical coherence conditions that may be neglected if one is only paying attention to surface-level features such as logical consistency.
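The "untrustworthiness by contradiction" step has a simple propositional form. As a sketch, writing T for the (assumed, informal) proposition "this belief-set is trustworthy", the argument is that T → ¬T entails ¬T. In Lean 4 this can be stated as follows; the theorem name is hypothetical:

```lean
-- If trustworthiness would imply untrustworthiness, the belief-set is not trustworthy.
theorem untrustworthy_of_self_undermining {T : Prop} (h : T → ¬T) : ¬T :=
  fun t => h t t
```

This captures only the logical skeleton; the substantive work lies in showing that a given belief-set really does satisfy T → ¬T.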

Self-ratification as a philosophical coherence condition points at naturalized epistemology being an essential philosophical achievement. While epistemology may possibly start non-naturalized, as it gains self-consciousness of the fact of its embeddedness in a natural world, such self-consciousness imposes additional self-ratification constraints.

Using self-ratification in practice often requires flips between treating one's self as a subject and as an object. This kind of dual self-consciousness is quite interesting and is a rich source of updates to both self-as-subject beliefs and self-as-object beliefs.

Taking coherence conditions including self-ratification to be the only objective conditions of epistemic justification is a coherentist theory of justification; note that coherentists need not believe that all "justified" belief-sets are likely to be true (and indeed, such a belief would be difficult to hold given the possibility of coherent belief-sets very different from one's own and from each other).

Appendix: Proof by contradiction is consistent with self-ratification

There is a possible misinterpretation of self-ratification that says: "You cannot assume a belief to be true in the course of refuting it; the assumption would then fail to self-ratify".

Classical logic permits proof-by-contradiction, indicating that this interpretation is wrong. The thing that a proof by contradiction does is show that some other belief-set (not the belief-set held by the arguer) fails to self-ratify (and indeed, self-invalidates). If the arguer actually believed in the belief-set that they are showing to be self-invalidating, then, indeed, that would be a self-ratification problem for the arguer. However, the arguer's belief is that some proposition P implies not-P, not that P is true, so this does not present a self-ratification problem.

I think it's more subtle. In mathematical logic, there's a few things that can happen to a theory:

It can prove a falsehood. That's bad: the theory is busted.
It can prove itself consistent. That's bad too: it implies the theory is inconsistent, by the second incompleteness theorem.
It can prove itself inconsistent. That's not necessarily bad: the silly theory PA+¬Con(PA), which asserts its own inconsistency, is actually equiconsistent with PA. But it suggests that the theory has a funny relationship with reality (in this case, that any model of it must include some nonstandard integers).

Overall it seems we should prefer theories that don't say anything much about their own justifications one way or the other. I suspect the right approach in philosophy is the same.

Most of the examples I'm talking about are more like proving false or proving your own finitistic inconsistency than failing to prove your own consistency. Like, if your theory implies a strong (possibly probabilistic) argument that your theory is false, that's almost like proving false.

Gödel's incompleteness theorem doesn't rule out finitistic self-consistency proofs, e.g. the ability to prove in length n that there is no inconsistency proof of length up to n^2. Logical inductors also achieve this kind of finitistic self-trust. I think this is usually a better fit for real-world problems than proving infinitary consistency.

Sure, but I don't see why such self-trust is a good sign. All inconsistent theories have proofs of finitistic self-consistency up to n that are shorter than n (for some n), but only some consistent theories do. So seeing such a proof is Bayesian evidence in favor of inconsistency.

Suppose a psychological researcher, Beth, believes that humans are reinforcement-learning stimulus-response machines, and that such machines are incapable of reasoning about representations of the world. She presents a logical specification of stimulus-response machines that she believes applies to all humans. (For similar real-world theories, see: Behaviorism, Associationism, Perceptual Control Theory)
However, a logical implication of Beth's beliefs is that she herself is a stimulus-response machine, and incapable of reasoning about world-representations. Thus, she cannot consistently believe that her specification of stimulus-response machines is likely to be an accurate, logically coherent representation of humans. Her belief-set, then, fails to self-ratify, on the basis that it assigns to herself a level of cognitive power insufficient to come to know that her belief-set is true.

This section (and the physicist later) seem weird to me. Suppose the physicist is, rather than a mind skeptic, an anti-dualist. That is, the dualist thinks that there's a physical realm and a mental realm, and messages pass between them, and the physical realm by itself isn't sufficient to have minds in it. The anti-dualist doesn't see any way for messages to enter the physical realm, and so concludes that it actually must just be one realm, with minds originating from basic components in a reductionistic way.

The dualist will look at this and scoff. "They think they have a belief that there's no mental realm, but that's where beliefs are obviously stored, and so their disbelief in the mental realm cannot be self-ratifying." But the anti-dualist doesn't believe that's where beliefs are stored; they think beliefs are stored in the positions and behaviors of material objects, and so the anti-dualist's position looks self-ratifying to the anti-dualist.

Similarly with Beth, it looks like there's this sense of "ah, Beth is doing it wrong, and therefore can't be doing what she's doing in a self-ratifying way." But if Beth's standard of self-ratification is different, then Beth can think what she's doing is self-ratifying! "Yeah, I'm a stimulus-response machine," Beth says, "I didn't write that book through reasoning, I wrote it through responding to stimulus after stimulus. Even this sentence was constructed that way!"

This is a bit like looking at someone building an obviously-unstable tower saying "no, see, my tower is self-supporting, I just have a different notion of 'self-supporting'!". If I'm interpreting self-supporting in a physical manner and they're interpreting it in a tautological manner, then we are talking about different things in saying towers are self-supporting or not.

Note that I said at the top that self-ratification is nontrivial when combined with other coherence conditions; without those other conditions (e.g. in the case of asserting "psychological theories" that make no claim about being representative of any actual psychologies) it's a rather trivial criterion.

(In the case of eliminativism, what Phyllis would need is an account of the evidentiary basis for physics that does not refer to minds making observations, theorizing, etc; this account could respond to the dualist's objection by offering an alternative ontology of evidence)

I think you're dismissing the "tautological" cases too easily. If you don't believe in a philosophy, their standards will often seem artificially constructed to validate themselves. For example a simple argument that pops up from time to time:

Fallibilist: You can never be totally certain that something is true.

Absolutist: Do you think that's true?

F: Yes.

A: See, you've just contradicted yourself.

Obviously F is unimpressed by this, but if he argues that you can believe things without being certain of them, that's not that different from Beth telling someone who doesn't already believe her theory that she wrote the book by responding to stimuli.

I've been referring to this as modal inconsistency. I.e. a system that does not sustain the conditions that led to the system arising and is therefore unstable. This generalization goes outside logical belief structures to a broader set of memetic structures.

In Buddhist psychology it is one of the big problems with the god realm. (For those unfamiliar, Buddhist realms can be thought of as attractors in the space of possible mind designs.)

I'm confused about the difference between self-ratification and self-consistency. For example, CDT (as usually described) fights the hypothetical ("there are perfect predictors") in Newcomb's, by assigning non-zero probability to successful two-boxing, which has zero probability in the setup. Since CDT is blind to its own shortcoming (is it? I assume it is, not sure if there is a formal proof of it, or what it would even mean to write out such a statement somewhat formally), does it mean it's not self-ratifying? Inconsistent? As I said, confused...

Self-consistency is about not asserting contradictions; self-ratification is about asserting that the process that produced this very theory is likely to produce correct theories.

Theories completely blind to their own shortcomings can very well be consistent and self-ratifying (e.g. faith that one's beliefs came from divine revelation, including the belief that one's beliefs came from divine revelation).

Ah, makes sense. So self-ratification is about seeing oneself as trustworthy. Which is neither a necessary condition for a theory to be useful, nor a sufficient condition for it to be trustworthy from outside. But still a useful criterion when evaluating a theory.

Causal decision theory

This section seems like a thought experiment about how one might want to have 'non-self-ratifying'* beliefs.

*Beliefs which do not ratify themselves.

This means that, according to his own beliefs, his optimism is untrustworthy.

Not trustworthy. Untrustworthy means we have some reason to believe they are incorrect. Trustworthy means we have some reason to believe they are correct. Randomness has neither property. (But is the gene random?)

However, a logical implication of Beth's beliefs is that she herself is a stimulus-response machine, and incapable of reasoning about world-representations. Thus, she cannot consistently believe that her specification of stimulus-response machines is likely to be an accurate, logically coherent representation of humans. Her belief-set, then, fails to self-ratify, on the basis that it assigns to herself a level of cognitive power insufficient to come to know that her belief-set is true [with high probability].

The conclusion does not necessarily follow from the premises. Even if it did, one researcher's modus ponens may be another's modus tollens.** (For a more direct counterargument, consider the Chinese room argument. Even if humans can't reason about world-representations (and it's not clear what that means), that doesn't mean they can't run an algorithm which can, even if the humans don't understand the algorithm.)

There is a possible misinterpretation of self-ratification that says: "You cannot assume a belief to be true in the course of refuting it; the assumption would then fail to self-ratify".

"You cannot assume a belief to be true after it has been refuted."

"You cannot assume a belief is true merely because you have shown it is not false."

"You cannot assume something is true merely because you have shown it to be so, because you are fallible."

Since this applies to past behavior as well, a further implication is that the things he reads in other blogs are, for the most part, false.

Unless the blogs he reads are filtered in some fashion. (Perhaps he has some discernment, or cleverly reads blogs only for the first X posts/years, while bloggers are honest, or by some other means detects their treacherous turn that mirrors his own.)

Suppose Phyllis believes that the physical world exists, but that minds don't exist. That is, there are not entities that are capable of observation, thought, etc. (This is a rather simple, naive formulation of eliminative materialism)

Every time this post mentions a hypothetical about how humans/human minds work, I wonder 'what does this mean?' How can this belief be refuted as 'un-ratified' (which seems a form of logical consistency, what other forms are there?), when it isn't clear what it means?

there are not entities that are capable of observation

From HPMOR:

When you walked through a park, the immersive world that surrounded you was something that existed inside your own brain as a pattern of neurons firing. The sensation of a bright blue sky wasn't something high above you, it was something in your visual cortex, and your visual cortex was in the back of your brain. All the sensations of that bright world were really happening in that quiet cave of bone you called your skull, the place where you lived and never, ever left. If you really wanted to say hello to someone, to the actual person, you wouldn't shake their hand, you'd knock gently on their skull and say "How are you doing in there?" That was what people were, that was where they really lived. And the picture of the park that you thought you were walking through was something that was visualized inside your brain as it processed the signals sent down from your eyes and retina.

**Perhaps Beth isn't human.

Not trustworthy. Untrustworthy means we have some reason to believe they are incorrect. Trustworthy means we have some reason to believe they are correct. Randomness has neither property. (But is the gene random?)

If I tell you the sun is 1.6 * 10^43 kg, but I also tell you I generated the number using a random number generator (not calibrated to the sun), that is an untrustworthy estimate. The RNG wouldn't be expected to get the right answer by accident.

A stopped clock is right twice a day, but only that often, so it's untrustworthy.

The conclusion does not necessarily follow from the premises.

Huh? If Beth's brain can't reason about the world then she can't know that humans are stimulus-response engines. (I'm not concerned with where in her brain the reasoning happens, just that it happens in her brain somewhere)

"The conclusion does not necessarily follow from the premises."

Huh? If Beth's brain can't reason about the world then she can't know that humans are stimulus-response engines. (I'm not concerned with where in her brain the reasoning happens, just that it happens in her brain somewhere)

I was going to object a bit to this example, too, but since you're already engaged with it here I'll jump in.

I think reasoning about these theories as saying humans are "just" stimulus-response engines strawmans some of these theories. I feel similarly about the mental nonrealism example. In both cases there are better versions of these theories that aren't so easily shown to be non-self-ratifying, although I realize you wanted versions here for illustrative purposes. It's just a complication of mentioning classes of theories where only the "worst" version serves as an example: doing so is likely to raise objections that fail to notice the argument is restricted to that worst version.

