[EDIT - While I still support the general premise argued for in this post, the examples provided were fairly terrible. I won't delete this post because the comments contain some interesting and valuable discussions, but please bear in mind that this is not even close to the most convincing argument for my point.]
A great deal of the theory involved in improving computer and network security involves the definition and creation of "trusted systems", pieces of hardware or software that can be relied upon because the input they receive is entirely under the control of the user. (In some cases, this may instead be the system administrator, manufacturer, programmer, or any other single entity with an interest in the system.) The only way to protect a system from being compromised by untrusted input is to ensure that no possible input can cause harm, which requires either a robust filtering system or strict limits on what kinds of input are accepted: a blacklist or a whitelist, roughly.
One of the downsides of having a brain designed by a blind idiot is that said idiot hasn’t done a terribly good job with limiting input or anything resembling “robust filtering”. Hence that whole bias thing. A consequence of this is that your brain is not a trusted system, which itself has consequences that go much, much deeper than a bunch of misapplied heuristics. (And those are bad enough on their own!)
In discussions of the AI-Box Experiment I’ve seen, there has been plenty of outrage, dismay, and incredulity directed towards the underlying claim: that a sufficiently intelligent being can hack a human via a text-only channel. But whether or not this is the case (and it seems to be likely), the vulnerability is trivial in the face of a machine that is completely integrated with your consciousness and can manipulate it, at will, towards its own ends and without your awareness.
Your brain cannot be trusted. It is not safe. You must be careful with what you put into it, because it will decide the output, not you. We have been warned, here on Less Wrong, that there is dangerous knowledge; Eliezer has told us that knowing about biases can cause us harm. Nick Bostrom has written a paper describing dozens of ways in which information can hurt us, but he missed (at least) one.
The acquisition of some thoughts, discoveries, and pieces of evidence can lower our expected outcomes, even when they are true. This can be accounted for; we can debias. But some thoughts and discoveries and pieces of evidence can be used by our underhanded, untrustworthy brains to change our utility functions, a fate that is undesirable for the same reason that being forced to take a murder pill is undesirable.
(I am making a distinction here between the parts of your brain that you have access to and can introspect about, which for lack of better terms I call “you” or “your consciousness”, and the vast majority of your brain, to which you have no such access or awareness, which I call “your brain.” This is an emotional manipulation, which you are now explicitly aware of. Does that negate its effect? Can it?)
A few examples (in approximately increasing order of controversy):
Identity Politics: Paul Graham and Kaj Sotala have covered this ground, so I will not rehash their arguments. I will only add that, in the absence of a stronger aspect of your identity, truly identifying as something new is an irreversible operation. It might be overwritten again in time, but your brain will not permit an undo.
Power Corrupts: History is littered with examples of idealists seizing power only to find themselves betraying the values they once held dear. No human who values anything more than power itself should seek it; your brain will betray you. There has not yet been a truly benevolent dictator and it would be delusional at best to believe that you will be the first. You are not a mutant. (EDIT: Michael Vassar has pointed out that there have been benevolent dictators by any reasonable definition of the word.)
Opening the Door to Bigotry: I place a high value on not discriminating against sentient beings on the basis of artifacts of the birth lottery. I’ve also observed that people who come to believe that there are significant differences between the sexes/races/whatevers on average begin to discriminate against all individuals of the disadvantaged sex/race/whatever, even when they were only persuaded by scientific results they believed to be accurate and were reluctant to accept that conclusion. I have watched this happen to smart people more than once. Furthermore, I have never met (or read the writings of) any person who believed in fundamental differences between the whatevers and who was not also to some degree a bigot.
One specific and relatively common version of this are people who believe that women have a lower standard deviation on measures of IQ than men. This belief is not incompatible with believing that any particular woman might be astonishingly intelligent, but these people all seem to have a great deal of trouble applying the latter to any particular woman. There may be exceptions, but I haven’t met them. Based on all the evidence I have, I’ve made a conscious decision to avoid seeking out information on sex differences in intelligence and other, similar kinds of research. I might be able to resist my brain’s attempts to change what I value, but I’m not willing to take that risk; not yet, not with the brain I have right now.
If you know of other ways in which a person’s brain might stealthily alter their utility function, please describe them in the comments.
If you proceed anyway...
If the big red button labelled “DO NOT TOUCH!” is still irresistible, if your desire to know demands you endure any danger and accept any consequences, then you should still think really, really hard before continuing. But I’m quite confident that a sizable chunk of the Less Wrong crowd will not be deterred, and so I have a final few pieces of advice.
- Identify knowledge that may be dangerous. Forewarned is forearmed.
- Try to cut dangerous knowledge out of your decision network. Don’t let it influence other beliefs or your actions without your conscious awareness. You can’t succeed completely at this, but it might help.
- Deliberately lower dangerous priors, by acknowledging the possibility that your brain is contaminating your reasoning and then overcompensating, because you know that you’re still too overconfident.
- Spend a disproportionate amount of time seeking contradictory evidence. If believing something could have a great cost to your values, make a commensurately great effort to be right.
- Just don’t do it. It’s not worth it. And if I found out, I’d have to figure out where you live, track you down, and kill you.
Just kidding! That would be impossibly ridiculous.