This post was written for Convergence Analysis. Cross-posted to the EA Forum.

The concept of information hazards is highly relevant to many efforts to do good in the world (particularly, but not only, from the perspective of reducing existential risks). I’m thus glad that many effective altruists and rationalists seem to know of, and refer to, this concept. However, it also seems that:

  • People referring to the concept often don’t clearly define/explain it
  • Many people (quite understandably) haven’t read Nick Bostrom’s original (34-page) paper on the subject
  • Some people misunderstand or misuse the term “information hazards” in certain ways

Thus, this post seeks to summarise, clarify, and dispel misconceptions about the concept of information hazards. It doesn’t present any new ideas of my own.


In his paper, Bostrom defines an information hazard as:

A risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm.

He emphasises that this concept is just about how true information can cause harm, not how false information can cause harm (which is typically a more obvious possibility).

Bostrom’s paper outlines many different types of information hazards, and gives examples of each. The first two types listed are the following:

Data hazard: Specific data, such as the genetic sequence of a lethal pathogen or a blueprint for making a thermonuclear weapon, if disseminated, create risk.
[...] Idea hazard: A general idea, if disseminated, creates a risk, even without a data-rich detailed specification.
For example, the idea of using a fission reaction to create a bomb, or the idea of culturing bacteria in a growth medium with an antibiotic gradient to evolve antibiotic resistance, may be all the guidance a suitably prepared developer requires; the details can be figured out. Sometimes the mere demonstration that something (such as a nuclear bomb) is possible provides valuable information which can increase the likelihood that some agent will successfully set out to replicate the achievement.

He further writes:

Even if the relevant ideas and data are already “known”, and published in the open literature, an increased risk may nonetheless be created by drawing attention to a particularly potent possibility.

This leads him to his third type:

Attention hazard: mere drawing of attention to some particularly potent or relevant ideas or data increases risk, even when these ideas or data are already “known”.
Because there are countless avenues for doing harm, an adversary faces a vast search task in finding out which avenue is most likely to achieve his goals. Drawing the adversary’s attention to a subset of especially potent avenues can greatly facilitate the search. For example, if we focus our concern and our discourse on the challenge of defending against viral attacks, this may signal to an adversary that viral weapons—as distinct from, say, conventional explosives or chemical weapons—constitute an especially promising domain in which to search for destructive applications.

The possibility of these and other types of information hazards means that it will sometimes be morally best to avoid creating or spreading even true information. (Exactly when and how to attend to and reduce potential information hazards is beyond the scope of this post; for thoughts on that, see my later post.)

Context and scale

Those quoted examples all relate to fairly large-scale risks (perhaps even existential risks). Some also relate to risks from advancing the development of potentially dangerous technologies (contrary to the principle of differential progress). It seems to me that the concept of information hazards is most often used in relation to such large-scale, existential, and/or technological risks, and indeed that that’s where the concept is most useful.

However, it’s also worth noting that information hazards don’t have to relate to these sorts of risks. Information hazards can occur in a wide variety of contexts, and can sometimes be very mundane or minor. Some of Bostrom’s types and examples highlight that. For example:

Spoilers constitute a special kind of disappointment. Many forms of entertainment depend on the marshalling of ignorance. Hide-and-seek would be less fun if there were no way to hide and no need to seek. For some, knowing the day and the hour of their death long in advance might cast a shadow over their existence.
[Thus, it is also possible to have a] Spoiler hazard: Fun that depends on ignorance and suspense is at risk of being destroyed by premature disclosure of truth.

Who’s at risk?

Spoiler hazards are a type of information hazard that risks harm only to the knower of the true information themselves, and as a direct result of them knowing the information. (In contrast, in Bostrom’s earlier examples, although the knower might eventually be harmed, this would be (1) along with many other people, and (2) a result of catastrophic or existential risks that were made more likely by the knowledge the knower spread, rather than as a direct result of the knower having that knowledge.)

Bostrom also lists additional types of information hazards where the risk of harm is to the knower themselves, and directly results from their knowledge. But there appears to be no established term for the entire subset of information hazards that fit that description. Proposed terms I’m partial to include “knowledge hazards” and “direct information hazards”. (Further discussion can be found in that comments section. If you have thoughts on that, please comment here or there.)

But I should emphasise that that is indeed only a subset of all information hazards; information hazards can harm people other than the knower themselves, and, as mentioned above, this will be the case in the contexts where information hazards are perhaps most worrisome and worth attending to. (I emphasise this because some people misunderstand or misuse the term “information hazards” as referring only to what we might call “knowledge hazards”; this misuse is apparent here, and is discussed here.)

Information hazards are risks

As noted, an information hazard is “A risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm” (emphasis added). Thus, as best I can tell:

  • Something can create an information hazard even if no harm has yet occurred, and there’s no guarantee it will ever occur.
    • E.g., writing a paper on a plausibly dangerous technology can create an information hazard, even if the technology turns out to be safe after all.
  • But if we don’t have any specific reason to believe that there’s even a “risk” of harm from some true information, and we just think that it’d be worth bearing in mind that there might be a risk, it may be best to say there’s a “potential information hazard”.
    • So it’s probably not helpful to, for example, slap the label of “information hazard” on all of biological research.

Closing remarks

I hope you’ve found this post clear and useful. To recap:

  • The concept of information hazards relates to risks of harm from creating or spreading true information (not from creating or spreading false information).
  • The concept is definitely very useful in relation to existential risks and risks from technological development, but can also apply in a wide range of other contexts, and at much smaller scales.
  • Some information hazards risk harm only to the knower of the true information themselves, and as a direct result of them knowing the information. But many information hazards harm other people, or harm in other ways.
  • Information hazards are risks of harm, not necessarily guaranteed harms.

In my next posts, I'll:

  • discuss and visually represent how information hazards and downside risks relate to each other and to other types of effects
  • suggest some rules-of-thumb regarding why, when, and how one should deal with potential information hazards (aiming for more nuance than just “Always pursue truth!” or “Never risk it!”).

My thanks to David Kristoffersson and Justin Shovelain for feedback on this post.

Other sources relevant to this topic are listed here.

Comments

Thanks, this is a really useful summary to have, since linking back to Bostrom on info hazards is reasonable but not great if you want people to actually read something and understand information hazards rather than bounce off something explaining the idea. Kudos!

Thanks! And in that case, you may also be interested in my post trying to summarise/clarify the concepts of differential progress / intellectual progress / technological development (if you hadn’t already seen it).

Data hazard: Specific data, such as the genetic sequence of a lethal pathogen or a blueprint for making a thermonuclear weapon, if disseminated, create risk.

Dangerous blueprints. A generalization might include 'stability'.

It's interesting how it relates to false information. A failed implementation of 'true' nuclear reactor blueprints could also be dangerous (depending on the design).* Some designs could have more risk than others based on how likely the people handling them are to fail at making a safe reactor. (Dangerous Wooden Catapult - Plans not safe for children.)

*This property could be called stability, safety, robustness, etc.

Idea hazard: A general idea, if disseminated, creates a risk, even without a data-rich detailed specification.

This one may make more sense absent the 'true' criterion - telling true from false need not be trivial. (How would people who aren't nuclear specialists tell that a design for a nuke is flawed?) The difference between true and false w.r.t. possibly self-fulfilling prophecies isn't clear.

Also antibiotics might be useful for creating antibiotic-resistant bacteria. (Not sure if such bacteria are more deadly to humans all else equal - this makes categorization difficult: how can an inventor tell if their invention can be used for ill?)

Sometimes the mere demonstration that something (such as a nuclear bomb) is possible provides valuable information which can increase the likelihood that some agent will successfully set out to replicate the achievement.

This is a useful concept in its own right.

Attention hazard: mere drawing of attention to some particularly potent or relevant ideas or data increases risk, even when these ideas or data are already “known”.

Also could be a risk in reverse - hiding evidence of a catastrophe could hinder its prevention/counter measures being developed (in time).

A mild/'non-hazardous' form might be making methods of paying attention to a thing less valuable, or bringing attention to things which if followed turn out to be dead ends.

(Exactly when and how to attend to and reduce potential information hazards is beyond the scope of this post; Convergence hopes to explore that topic later.)

I look forward to this work.


the principle of differential progress

from the linked post:

What we do have the power to affect (to what extent depends on how we define “we”) is the rate of development of various technologies and potentially the sequence in which feasible technologies are developed and implemented. Our focus should be on what I want to call differential technological development: trying to retard the implementation of dangerous technologies and accelerate implementation of beneficial technologies, especially those that ameliorate the hazards posed by other technologies.

An idea that seems as good and obvious as utilitarianism. But what if these things come in cycles? Technology A may be both positive and negative, but technology B which negates its harms is based on A. Slowing down tech development seems good before A arrives, but bad after. (This scenario implicitly requires that the poison has to be invented before the cure.)


[Thus, it is also possible to have a] Spoiler hazard: Fun that depends on ignorance and suspense is at risk of being destroyed by premature disclosure of truth.

Words like 'hazard' or 'risk' seem too extreme in this context. The effect can also be reversed - the knowledge that learning physics could enable you to reach the moon might serve to make the subject more, rather than less, interesting. (The key point here is that people vary, which could be important to 'infohazards in general'. Perhaps some people acquiring the blueprints for a nuclear reactor wouldn't be dangerous because they wouldn't use them. Someone with the right knowledge (or in the right time and place) might be able to do more good with these blueprints, or even have less risk of harm; "I didn't think of doing that, but I see how it'd make the reactor safer.")

Terminology questions:

What is a minor hazard? (Info-paper cut doesn't sound right.)

What is the opposite of a hazard? (Info safeguard or shield sounds like it could refer to something that shields from info-hazards.)

As noted, an information hazard is “A risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm”

The opposite being a "noble lie".

E.g., writing a paper on a plausibly dangerous technology can be an information hazard even if it turns out to be safe after all.

This seems confused. It seems the relevant map-territory distinction here is "something could be an information hazard, without us knowing that it is."

The concept of information hazards relates to risks of harm from creating or spreading true information (not from creating or spreading false information).

By definition only - the hazards of information need not obey this constraint.

The concept is definitely very useful in relation to existential risks and risks from technological development, but can also apply in a wide range of other contexts, and at much smaller scales.

What is an infohazard seems relative. If information about how to increase health can also be used to negatively impact it, then whether or not something is an infohazard seems to be based on the audience - are they benign or malign?

Some information hazards risk harm only to the knower of the true information themselves, and as a direct result of them knowing the information. But many information hazards harm other people, or harm in other ways.

Can idea A be harmful to those that don't carry it? Can two ideas x and y exist such that both are true, but holders of one idea may be dangerous to holders of the other? Are there non-trivial cases where ideas q, r, and s are infohazards only as a whole? (A trivial case might be like 3 parts to knowing how to build a nuke.)

Can a set of ideas together in part be an infohazard, but be harmless as a whole?

Information hazards are risks of harm, not necessarily guaranteed harms.

Information...harms? weapons? (Weapons are a guaranteed source of harm.) Thorns? Sharp information?

Misc

I've quoted and replied to your comments regarding differential technological development on the relevant post, as I thought that was the more appropriate home for that discussion. Hope that's ok.

---

This is a useful concept in its own right.

The "mere demonstration" point was part of Bostrom's explanation of the concept of an idea hazard. He was expressing that just showing you've done something can communicate one form of true information - the idea that it can be done - which can be harmful, even without detailed data on precisely how to do it.

---

I look forward to this work.

Good to hear! Current plan is to have it out Thursday (UK time). We've got a fair few posts planned in this sequence.

Interaction between different pieces of info

Are there non-trivial cases where ideas q, r, and s are infohazards only as a whole? (A trivial case might be like 3 parts to knowing how to build a nuke.)

Good question! I hadn't explicitly thought about that, but I think the answer is "yes, in a sense", and that that's important. E.g., people have wondered whether some nuclear physics research, and possibly even broader sets of research, were harmful in that they helped lead to nuclear weapons. I'd guess that these pieces of research would have been pretty much purely positive if certain other pieces of research, ideas, info, etc. hadn't occurred. But given that the other pieces of research did occur, they were harmful.

I say "in a sense" because I think this may be better framed as such research having been an information hazard by itself, given the chance that the other information would later occur and cause harm, and that that was made more likely by this initial research. (Rather than that each piece of information "wasn't risky", and only the collective was.)

But I think that highlighting that information doesn't exist in a vacuum, and that there can be interaction effects between different bits of information, lines of research, uses of research, etc., is interesting and important.

Often harmful to people who don't have the info

Can idea A be harmful to those that don't carry it? Can two ideas x and y exist such that both are true, but holders of one idea may be dangerous to holders of the other?

Definitely.

Examples for the first question are easy - info on how to build nukes, make bioweapons, or create a misaligned AGI is/would be harmful to many people who don't carry it. That's the type of information hazard I'm most interested in, and the type that makes the concept highly relevant to (most technology-related) global catastrophic and existential risks.

Examples for the second question are a bit harder, if you mean dangerous specifically to holders of idea y, and not to the more general public. But one quick and somewhat uninteresting example would be information on how to do a terrorist attack on one rich and productive city, and the information that that city is rich and productive. Both pieces of information could be true. The latter info would draw people to the city, to some extent, increasing the chances that they're harmed by holders of the former info.

(I see this example as "uninteresting" in the sense that the interaction effects of the two pieces of information don't seem especially worth noting or highlighting. But it still seems to fit, and this sort of thing is probably common, I'd guess.)

Information hazards are risks, typically viewed ex ante

"E.g., writing a paper on a plausibly dangerous technology can be an information hazard even if it turns out to be safe after all."
This seems confused. It seems the relevant map-territory distinction here is "something could be an information hazard, without us knowing that it is."

I disagree with "This seems confused". Bostrom makes it clear that an information hazard is a risk. And I'd say that the most valuable usage* of the concept is ex ante, and subjective/epistemic (in the probability/credence sense); evaluating whether to develop or share some (true) information, based on the chance it could do harm, given what we know/can predict. We can't know for sure, at that stage, whether it will in fact do harm.

And if the concept meant guaranteed or certain harms from information, then that concept itself would be less useful ex ante, and we'd often want to say "risks of harm from (true) information" (i.e., we'd be back to the same concept, with a longer term).

This is totally in line with how the term risk is used in general. E.g., existential risks are risks even if it turns out they don't happen. In fact, a lot of people working on them expect they won't happen, but think even moderate or very low odds are utterly unacceptable, given the stakes, and want to lower those odds further. In the same way, you can meaningfully talk about your risk of dying in a car crash vs in a plane crash, even though you'll probably never die in either.

*We can also use the term ex post, to refer to harms that did happen, or times we thought a harm might come to pass but we got lucky or turned out to have assessed the risks incorrectly. (I'd call such actual harms "harms from information hazards", "harms from information", or perhaps "information harms", as you suggest. And this fits neatly with information hazards being risks of harm from information.) But that's not the only use-case, or (I'd argue) the most useful one.

"E.g., writing a paper on a plausibly dangerous technology can be an information hazard even if it turns out to be safe after all."

So you're talking about probability. If a gun has a bullet in one of its chambers, but it isn't known which one, and the rest are empty, then firing it once carries the physical risks of a fully loaded firearm with probability 1/n, where n is the number of chambers. Even if in one instance tragedy does not occur, that doesn't (necessarily) change the 'probability' we should assign in the same situations in the future.

If I'm understanding you correctly, I think you're indeed understanding me correctly :)

I'm saying, as per Bostrom's definition, that information hazards are risks of harm, with the risks typically being evaluated ex ante and subjectively/epistemically (in the sense in which those terms are used in relation to probability). In some cases the harm won't actually occur. In some cases a fully informed agent might have been able to say with certainty that the harm wouldn't have ended up occurring. But based on what we knew, there was a risk (in some hypothetical situation), and that means there was an information hazard.

Very minor risks

Words like 'hazard' or 'risk' seem too extreme in this context.
[...]
What is a minor hazard? (Info-paper cut doesn't sound right.)

Personally, I wouldn't really bother talking about information hazards at the really small scales. My interest is primarily in how they relate to catastrophic and existential risks.

But as I say, Bostrom defines the term, and gives types, in such a way that it can apply even at small scales. And that does seem to make sense conceptually - the basic mechanisms involved can sometimes be similar even at very small scales.

And I think you can say things like the "risk" of getting a paper cut. Although "hazard" does feel a little strong to me too.

So personally, I wouldn't want to invent a new term for the small scales, because

  • The term "information hazard" is defined and explained in such a way that it applies at those scales
  • That extension does seem to make conceptual sense to me
  • I rarely care about talking about risks on those small scales anyway, so I see no need for a new term for them

Information hazards are risks of harm, not necessarily net harm - and the audience matters

You write:

Also antibiotics might be useful for creating antibiotic resistant bacteria. (Not sure if such bacteria are more deadly to humans all else equal - this makes categorization difficult, how can an inventor tell if their invention can be used for ill?)

From memory, I don't believe the following point is actually explicit in Bostrom's paper, but I'd say that information hazards are just risks that some (notable level of) harm will occur, not necessarily that the net impacts of the information (dissemination) will be negative. (It's possible that this is contrary to standard usage, but I think standard usage doesn't necessarily have a clear position here, and this usage seems useful to me.)

Note that Bostrom defines an information hazard as "A risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm." He doesn't say "A risk that the dissemination or the potential dissemination of (true) information may make the world worse overall, compared to if the information hadn't been disseminated, or enable some agent to make the world worse overall, compared to if the information hadn't been disseminated."

In fact, our upcoming post will note that it may often be the case that some information does pose risks of harm, but is still worth developing or sharing on balance, because the potential benefits are sufficiently high. This will often be because the harms may not actually occur (as they're currently just risks). But it could also be because, even if both the harms and benefits do actually occur, the benefits would outweigh the harms.

(There are also other cases in which that isn't true, and one should be very careful about developing or sharing some information, or even just not do so at all. That'll be explored in that post.)
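(As a rough, toy formalisation of that "on balance" point, and purely my own simplification rather than anything stated in Bostrom's paper or the upcoming post: if $p_b$ and $p_h$ are the chances that the benefit and the harm actually materialise, and $B$ and $H$ are their sizes if they do, then on a simple expected-value view, sharing the information is worthwhile roughly when

$$p_b \cdot B \;>\; p_h \cdot H,$$

which can hold even when $p_h > 0$ - i.e., even when an information hazard exists.)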

But I hadn't thought about that point very explicitly when writing this post, and perhaps should've made it more explicit here, so thanks for bringing it to my attention.

Relatedly, you write:

(The key point here is that people vary, which could be important to 'infohazards in general'. Perhaps some people acquiring the blueprints for a nuclear reactor wouldn't be dangerous because they wouldn't use them. Someone with the right knowledge (or in the right time and place) might be able to do more good with these blueprints, or even have less risk of harm; "I didn't think of doing that, but I see how it'd make the reactor safer.")
[...]
What is an infohazard seems relative. If information about how to increase health can also be used to negatively impact it, then whether or not something is an infohazard seems to be based on the audience - are they benign or malign?

It will often be the case that developing and sharing information will have some positive consequences and some negative, as noted above. It will also often be the case that information will have mostly positive consequences when received by certain people, and mostly negative consequences when received by others, as you note here.

I would say that the risk that the information you share ultimately reaches people who may use it badly is part of information hazards. If it might not reach those people, that means the risk is lower. If it will also reach other people who'll use the information in beneficial ways, then that's a benefit of sharing the info, and a reason it may be worth doing so even if there are also risks.

In our upcoming post, we'll note that one strategy for handling potential information hazards would be to be quite conscious about how the info is framed/explained and who you share it with, to influence who receives it and how they use it. This is one of the "middle paths" between just sharing as if there's no risk and just not sharing as if there's only risk and no potential benefits.

Somewhat related to your point, I also think that, if we had a world where no one was malign or careless, then most things that currently pose information hazards would not. (Though I think some of Bostrom's types would remain.) And if we had a world where the vast majority of people were very malign or careless, then the benefits of sharing info would go down and the risks would go up. We should judge how risky (vs potentially beneficial) developing and sharing info is based on our best knowledge of how the world actually is, including the people in it, and especially those who are likely to receive the info.

Thanks for the detailed comment! I'll split my reply up by rough subtopics, in hopes that that makes things easier to follow and engage with.

On harms from true information

You write:

It's interesting how it relates to false information. A failed implementation of 'true' nuclear reactor blueprints could also be dangerous (depending on the design).* Some designs could have more risk than others based on how likely the people handling them are to fail at making a safe reactor. (Dangerous Wooden Catapult - Plans not safe for children.)
[...]
This one may make more sense absent the 'true' criterion - telling true from false need not be trivial. (How would people who aren't nuclear specialists tell that a design for a nuke is flawed?) The difference between true and false w.r.t. possibly self-fulfilling prophecies isn't clear.
[...]
Also could be a risk in reverse - hiding evidence of a catastrophe could hinder its prevention/counter measures being developed (in time).
[...]
"The concept of information hazards relates to risks of harm from creating or spreading true information (not from creating or spreading false information)."
By definition only - the hazards of information need not obey this constraint.

Bostrom writes:

This paper will also not discuss the ways in which harm can be caused by false information. Many of those ways are obvious. We can be harmed, for instance, by false information that misleads us into believing that some carcinogenic pharmaceutical is safe; or, alternatively, that some safe pharmaceutical is carcinogenic. We will limit our investigation to the ways in which the discovery and dissemination of true information can be harmful.

Likewise, I think it's definitely and clearly true that false information (which perhaps shouldn't be called information at all, but rather misinformation, falsehoods, mistaken ideas, etc.) can and often does cause harm. I expect that mirrors of all or most of Bostrom's types of information hazards could be created for misinformation specifically. But Bostrom's aim was to emphasise the less obvious and often overlooked claim that even true information can sometimes cause harm (not that especially or only true information can cause harm).

Also, yes, in a sense "The concept of information hazards relates to risks of harm from creating or spreading true information (not from creating or spreading false information)" is true only by definition, because that sentence is a definition. But concepts and definitions are useful - they aid in communication, and in gaining mutual understanding of what subset of all the myriad possible things and ideas one is referring to.

It's true that misinformation can cause harm, but that's widely noted, so it's useful to also have a term that highlights that true information can cause harm too, and that we use when we're referring to that narrow subset of all possibilities.

(This is why I wrote "He emphasises that this concept is just about how true information can cause harm, not how false information can cause harm (which is typically a more obvious possibility)" [emphasis swapped].)

Do people generally consider wireheading and Goodhart's law as information hazards? They're both 'errors' caused by access to true data - data that is easy to misuse.

Interesting.

I would say that, to the extent that true information allowed those outcomes to occur, and to the extent that those outcomes were harmful, the true information posed an information hazard. I.e., there was a risk of harm from true information, and the harm was that harmful wireheading or Goodharting happened.

I.e., I'd say that the wireheading or Goodharting isn't itself an information hazard; it's the harm that the information led to.

In the same way, it's not that a pandemic is an information hazard, but that a pandemic is a harm that spreading certain information could lead to, which makes that information hazardous.

Does that make sense?

ETA: I guess an implicit assumption I'm making is that the access to the true information made the harmful wireheading or Goodharting more likely, in these hypotheticals. If it happens to be that any random data, or a decent subset of all possible false data, would've also led to the harmful outcomes with similar likelihood, then there isn't necessarily a risk arising from true information.

I suggested that, while a set of plans for a nuclear reactor might be true, and safe if executed correctly, executing them incorrectly might have effects similar to a nuke. Thus 'stability' - if something is (almost) impossible for humans to execute correctly, and is unsafe in the event it is performed even slightly incorrectly, then it is 'unstable' (and dangerous in a different way than 'stable' designs for a nuke).

Misuse starts to get into relativity - someone who would never use plans to build a nuke isn't harmed by receiving them (absent other actors trying to steal said plans from them), which means information hazards are relative.