Some beliefs seem, naively, to imply radical and dangerous actions. But there are often rational reasons not to act on those beliefs. Knowing those reasons is really important for those who don't have a natural defense mechanism.
Most people have a natural defense mechanism, which is to not take ideas seriously. If you just follow what others do, it's less likely that an error in your explicit reasoning will lead you to do something radical and dangerous. The more likely you are to make such errors, the more (evolutionarily and individually) advantageous it is for you to have a conformist instinct.
The answer to this question is mostly meant for people with whom I want to share ideas that are dangerous if taken at face value / object-level (I want to make sure they have those defense mechanisms first; I encourage you to do the same, and to do your due diligence when discussing dangerous ideas; this post is not sufficient). I want to encourage smart people to take ideas more seriously, but I don't want them to fully repress their conformist instincts, especially if they haven't built explicit defense mechanisms. This post should also be useful for people who already lack those defense mechanisms, and for people who want to better understand the function of conformity (although conformity is not the only defense mechanism).
Note that the defense mechanisms are not meant as fully general counterarguments. They are not insurmountable (at least, not always); they just indicate when it's prudent to want more evidence.
As a small tangent, exploration also often has a positive externality:
Like it’s rational for any individual to be pursuing much more heavily exploitation based strategy as long as someone somewhere else is creating the information and part of what I find kind of charming and counterintuitive about this is that you realize people who are very exploratory by nature are performing a public service. (source: Computer science algorithms tackle fundamental and universal problems. Can they help us live better, or is that a false hope?)
I will post my answer below.
Bullet biting seems like a small subset of what you're gesturing at. Ideas may imply action without making it clear how those actions could go wrong (even if the act is successful).
Oh yeah, that's true. I guess I thought of it in terms of bullet biting because bitten bullets are the most conducive to the most dangerous actions.
Differential knowledge improvement / Differential learning
The order in which an agent (AI, human, etc.) learns things might be really important.
For a superintelligence, learning some information in the wrong order could pose an existential risk. For example, if it learns about the Pascal's mugging argument before its resolution, it might get its future light cone mugged.
For a human, if they learn arguments for dangerous behavior before learning about 'defense mechanisms', this could have a high cost, including imminent death. See examples.
I think I could come up with many more examples. Let me know if interested.
Can you give some examples? Some belief sets (that is, the sum conditional prediction of a potential action, or the sum of empirical and deontological beliefs that relate to the action), within most decision theories, do imply actions. But "radical" and "dangerous" are just part of the belief sets, not external labels on the actions.
Are those reasons not simply beliefs that go into the decision? Can you give me an example of a non-belief rational reason to act or not-act?
My point is that if you don't have some of the general / meta beliefs described in this post, you will generally make much worse decisions, in a way that will often be known to you intuitively, but not to your explicit reasoning (which is dangerous if you don't take your intuitive warning signals seriously).
Let's assume you're someone that doesn't know the answer to the question I asked (or the information in the specific answer I gave).
Here are examples of what could go wrong.
Example 1
Suppose you believe that a discontinuity in consciousness means you die, and that when consciousness is reestablished in the brain, another mind is instantiated that is a copy of you. Then you might decide to not go back to sleep until you actually, biologically die from sleep deprivation.
While this could be the actual optimal choice, even taking this post into account, it seems likely to me that taking the information in this post into account could change one's mind from 'not sleeping at all' to 'keeping normal sleeping habits'.
Some approaches to moral uncertainty might actually recommend sleeping even if you're rather confident it will kill you, because: (% you care about discontinuities) * (how long you can go without sleeping) << (% you don't care about discontinuities) * (how long you can live if you sleep).
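To make the comparison concrete, here is a minimal sketch of the two sides of that inequality. The function name and all the numbers (90% weight on discontinuity mattering, 11 days awake, 50 more years of life) are hypothetical placeholders chosen for illustration, not estimates.

```python
def compare_policies(p_discontinuity_matters, days_awake_limit, days_if_sleeping):
    """Return the two sides of the inequality above (all inputs are assumptions)."""
    # Left side: weight on "discontinuity matters" times the time that
    # refusing to sleep can buy you.
    stay_awake_value = p_discontinuity_matters * days_awake_limit
    # Right side: weight on "discontinuity doesn't matter" times the
    # lifespan you keep by sleeping normally.
    sleep_value = (1 - p_discontinuity_matters) * days_if_sleeping
    return stay_awake_value, sleep_value


# Hypothetical inputs: 90% weight on discontinuity mattering, about 11 days
# of possible wakefulness, about 50 more years of life when sleeping normally.
stay_awake_value, sleep_value = compare_policies(0.9, 11, 50 * 365)
print(f"stay awake: {stay_awake_value:.1f} weighted days")
print(f"sleep:      {sleep_value:.1f} weighted days")
```

With these made-up inputs, the sleeping side is larger by more than two orders of magnitude, even though most of the weight sits on the belief that sleep kills. The conclusion is only as good as the inputs and the way the two views are weighed against each other.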
But if you don't know about how to integrate uncertainty at the model level in your reasoning, then you might just act based on your belief that sleep kills, and so stop sleeping. This error mode could severely affect a lot of people around me based on the 'object-level' beliefs I see shared around.
I've written more about this here, but I have made the post private for now as I'm revisiting whether it contains an info-hazard.
Example 2
If you don't see any error with Pascal's mugging, and so you decide to act on its logical implications, then a mugger might rob you of everything, and render you a complete slave.
Actually, I'm not sure I have a defense mechanism to propose for this one, besides knowing the resolution of the problem before, or at the same time as, being introduced to the problem. But one could argue that "your intuition that this is wrong" would be a good defense mechanism against explicit reasoning going astray.
Like a belief that you've discovered a fantastic investment opportunity, perhaps?
So, false beliefs are the risk here? I'd think the defense mechanism is Bayes' Rule.
The vast majority of people who read about Pascal's Mugging won't actually be convinced to give money to someone promising them ludicrous fulfilment of their utility function. The vast majority of people who read about Roko's Basilisk do not immediately go out and throw themselves into a research institute dedicated to building the basilisk. However, they also do not stop believing in the principles underpinning these "radical" scenarios/courses of action (the maximization of utility, for one). Many of them will go on to affirm the very same thought processes that would lead you to give all your money to a mugger or build an evil AI, for instance by donating money to charities they think will be most effective.
This suggests that most people have some innate way of distinguishing between "good" and "bad" implementations of certain ideas or principles that isn't just "throw the idea away completely". It might* be helpful if we could dig out this innate method and apply it more consciously.
*I say might because there's a real chance that the method turns out to be just "accept implementations that are societally approved of, like giving money to charity, and dismiss implementations that are not societally approved of, like building rogue AIs". If this is the case, then it's not very useful. But it's probably worth investigating some amount at least.
Someone wrote on the Facebook thread (sharing with permission):