I'm a Researcher and Writer for Convergence Analysis (https://www.convergenceanalysis.org/), an existential risk strategy research group.

Posts of mine that were written for/with Convergence will mention that fact. In other posts, and in most of my comments, opinions expressed are my own.

I'm always very interested in feedback, comments, ideas, etc., and potentially research/writing collaborations.

About half of my posts are on the EA Forum: https://forum.effectivealtruism.org/users/michaela

MichaelA's Comments

Editor Mini-Guide

Yeah, I just tested it with both "longnote" and just random characters and it worked.

I can't really remember my issue, but given that I wrote "they were working as footnotes, but were just removing the paragraphs after the first one", I'm guessing I just hadn't realised I have to indent, or use 4 spaces for that, until after trying "bignote" instead of "longnote". And then I misinterpreted switching from "longnote" to "bignote" as the active ingredient that caused later paragraphs to stop disappearing.
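
For illustration, here's a rough sketch of the multi-paragraph footnote pattern described above. It assumes the editor follows the common Markdown footnote convention: a `[^label]` reference in the text, a `[^label]:` definition, and continuation paragraphs indented by 4 spaces. The label and the footnote text are made up for this example, and the editor's exact behaviour may differ.

```markdown
Some claim that needs a longer aside.[^longnote]

[^longnote]: First paragraph of the footnote.

    Second paragraph, indented by 4 spaces so that it stays part of the footnote.
```

The label itself seems to be arbitrary ("longnote", "bignote", or random characters all worked); the indentation is what keeps the later paragraphs attached to the footnote.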

What are information hazards?

If I'm understanding you correctly, I think you're indeed understanding me correctly :)

I'm saying, as per Bostrom's definition, that information hazards are risks of harms, with the risks typically being evaluated ex ante and subjectively/epistemically (in the sense in which those terms are used in relation to probability). In some cases the harm won't actually occur. In some cases a fully informed agent might have been able to say with certainty that the harm wouldn't have ended up occurring. But based on what we knew, there was a risk (in some hypothetical situation), and that means there was an information hazard.

Morality vs related concepts

I'm pretty confident that (self-described) utilitarians, in practice, very rarely do that. I think it's more common for them to view and discuss things as if they become "progressively more praiseworthy", or as if there's an obligation to do something that's at least sufficiently good, and then better things become "progressively more praiseworthy" (i.e., like you have to satisfice, and then past there it's a matter of supererogation).

I'm pretty confident that at least some forms of utilitarianism do see only the maximally good (either in expectation or "objectively") action as permitted. And I think that classical utilitarianism, if you take its original form at face value, fits that description. But there are various forms of utilitarianism, and it's very possible that not all of them have this "maximising" nature. (Note that I'm not a philosopher either.)

I think a few somewhat relevant distinctions/debates are subjectivism vs objectivism (as mentioned in this post) and actualism vs possibilism (full disclosure: I haven't read that linked article).

Note that me highlighting that self-described utilitarians don't necessarily live by or make statements directly corresponding to classical utilitarianism isn't necessarily a critique. I would roughly describe myself as utilitarian, and don't necessarily live by or make statements directly corresponding to classical utilitarianism. This post is somewhat relevant to that (and is very interesting anyway).

What are information hazards?


I would say that, to the extent that true information allowed those outcomes to occur, and to the extent that those outcomes were harmful, the true information posed an information hazard. I.e., there was a risk of harm from true information, and the harm was that harmful wireheading or Goodharting happened.

I.e., I'd say that the wireheading or Goodharting isn't itself an information hazard; it's the harm that the information led to.

In the same way, it's not that a pandemic is an information hazard, but that a pandemic is a harm that spreading certain information could lead to, which makes that information hazardous.

Does that make sense?

ETA: I guess an implicit assumption I'm making is that the access to the true information made the harmful wireheading or Goodharting more likely, in these hypotheticals. If it happens to be that any random data, or a decent subset of all possible false data, would've also led to the harmful outcomes with similar likelihood, then there isn't necessarily a risk arising from true information.

What are information hazards?


I've quoted and replied to your comments regarding differential technological development on the relevant post, as I thought that the more appropriate home for that discussion. Hope that's ok.


This is a useful concept in its own right.

The "mere demonstration" point was part of Bostrom's explanation of the concept of an idea hazard. He was expressing that just showing you've done something can communicate one form of true information - the idea that it can be done - which can be harmful, even without detailed data on precisely how to do it.


I look forward to this work.

Good to hear! Current plan is to have it out Thursday (UK time). We've got a fair few posts planned in this sequence.

What are information hazards?

Interaction between different pieces of info

Are there non-trivial cases where ideas q, r, and s are infohazards only as a whole? (Trivial case might be like 3 parts to knowing how to build a nuke.)

Good question! I hadn't explicitly thought about that, but I think the answer is "yes, in a sense", and that that's important. E.g., people have wondered whether some nuclear physics research, and possibly even broader sets of research, were harmful in that they helped lead to nuclear weapons. I'd guess that these pieces of research would have been pretty much purely positive if certain other pieces of research, ideas, info, etc. hadn't occurred. But given that the other pieces of research did occur, they were harmful.

I say "in a sense" because I think this may be better framed as such research having been an information hazard by itself, given the chance that the other information would later occur and cause harm, and that that was made more likely by this initial research. (Rather than that each piece of information "wasn't risky", and only the collective was.)

But I think that highlighting that information doesn't exist in a vacuum, and that there can be interaction effects between different bits of information, lines of research, uses of research, etc., is interesting and important.

What are information hazards?

Often harmful to people who don't have the info

Can idea A be harmful to those that don't carry it? Can two ideas x and y exist such that both are true, but holders of one idea may be dangerous to holders of the other?


Examples for the first question are easy - info on how to build nukes, make bioweapons, or create a misaligned AGI is (or would be) harmful to many people who don't carry it. That's the type of information hazard I'm most interested in, and the type that makes the concept highly relevant to (most technology-related) global catastrophic and existential risks.

Examples for the second question are a bit harder, if you mean dangerous specifically to holders of idea y, and not to the more general public. But one quick and somewhat uninteresting example would be the information of how to do a terrorist attack on one rich and productive city, and the information that that city is rich and productive. Both pieces of info could be true. The latter info would draw people to the city, to some extent, increasing the chances that they're harmed by holders of the former info.

(I see this example as "uninteresting" in the sense that the interaction effects of the two pieces of information don't seem especially worth noting or highlighting. But it still seems to fit, and this sort of thing is probably common, I'd guess.)

What are information hazards?

Information hazards are risks, typically viewed ex ante

"E.g., writing a paper on a plausibly dangerous technology can be an information hazard even if it turns out to be safe after all."
This seems confused. It seems the relevant map-territory distinction here is "something could be an information hazard, without us knowing that it is."

I disagree with "This seems confused". Bostrom makes it clear that an information hazard is a risk. And I'd say that the most valuable usage* of the concept is ex ante, and subjective/epistemic (in the probability/credence sense): evaluating whether to develop or share some (true) information, based on the chance it could do harm, given what we know/can predict. We can't know for sure, at that stage, whether it will in fact do harm.

And if the concept meant guaranteed or certain harms from information, then that concept itself would be less useful ex ante, and we'd often want to say "risks of harm from (true) information" (i.e., we'd be back to the same concept, with a longer term).

This is totally in line with how the term risk is used in general. E.g., existential risks are risks even if it turns out they don't happen. In fact, a lot of people working on them expect they won't happen, but think that even moderately low or very low odds are utterly unacceptable, given the stakes, and they want to lower those odds further. In the same way, you can meaningfully talk about your risk of dying in a car crash vs in a plane crash, even though you'll probably never die in either.

*We can also use the term ex post, to refer to harms that did happen, or times we thought a harm might come to pass but we got lucky or turned out to have assessed the risks incorrectly. (I'd call such actual harms "harms from information hazards", "harms from information", or perhaps "information harms", as you suggest. And this fits neatly with information hazards being risks of harm from information.) But that's not the only use-case, or (I'd argue) the most useful one.

What are information hazards?

Very minor risks

Words like 'hazard' or 'risk' seem too extreme in this context.
What is a minor hazard? (Info-paper cut doesn't sound right.)

Personally, I wouldn't really bother talking about information hazards at the really small scales. My interest is primarily in how they relate to catastrophic and existential risks.

But as I say, Bostrom defines the term, and gives types, in such a way that it can apply even at small scales. And it does seem like that makes sense, conceptually - the basic mechanisms occurring can sometimes be similar even at very small scales.

And I think you can say things like the "risk" of getting a paper cut. Although "hazard" does feel a little strong to me too.

So personally, I wouldn't want to invent a new term for the small scales, because

  • The term "information hazard" is defined and explained in such a way that it applies at those scales
  • That extension does seem to make conceptual sense to me
  • I rarely care about talking about risks on those small scales anyway, so I see no need for a new term for them

What are information hazards?

Information hazards are risks of harm, not necessarily net harm - and the audience matters

You write:

Also antibiotics might be useful for creating antibiotic resistant bacteria. (Not sure if such bacteria are more deadly to humans all else equal - this makes categorization difficult, how can an inventor tell if their invention can be used for ill?)

From memory, I don't believe the following point is actually explicit in Bostrom's paper, but I'd say that information hazards are just risks that some (notable level of) harm will occur, not necessarily that the net impacts of the information (dissemination) will be negative. (It's possible that this is contrary to standard usage, but I think standard usage doesn't necessarily have a clear position here, and this usage seems useful to me.)

Note that Bostrom defines an information hazard as "A risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm." He doesn't say "A risk that the dissemination or the potential dissemination of (true) information may make the world worse overall, compared to if the information hadn't been disseminated, or enable some agent to make the world worse overall, compared to if the information hadn't been disseminated."

In fact, our upcoming post will note that it may often be the case that some information does pose risks of harm, but is still worth developing or sharing on balance, because the potential benefits are sufficiently high. This will often be because the harms may not actually occur (as they're currently just risks). But it could also be because, even if both the harms and benefits do actually occur, the benefits would outweigh the harms.

(There are also other cases in which that isn't true, and one should be very careful about developing or sharing some information, or even just not do so at all. That'll be explored in that post.)

But I hadn't thought about that point very explicitly when writing this post, and perhaps should've made it more explicit here, so thanks for bringing it to my attention.

Relatedly, you write:

(The key point here is that people vary, which could be important to 'infohazards in general'. Perhaps some people acquiring the blueprints for a nuclear reactor wouldn't be dangerous because they wouldn't use them. Someone with the right knowledge (or in the right time and place) might be able to do more good with these blueprints, or even have less risk of harm; "I didn't think of doing that, but I see how it'd make the reactor safer.")
What is an infohazard seems relative. If information about how to increase health can also be used to negatively impact it, then whether or not something is an infohazard seems to be based on the audience - are they benign or malign?

It will often be the case that developing and sharing information will have some positive consequences and some negative, as noted above. It will also often be the case that information will have mostly positive consequences when received by certain people, and mostly negative consequences when received by others, as you note here.

I would say that the risk that the information you share ultimately reaches people who may use it badly is part of information hazards. If it might not reach those people, that means the risk is lower. If it will also reach other people who'll use the information in beneficial ways, then that's a benefit of sharing the info, and a reason it may be worth doing so even if there are also risks.

In our upcoming post, we'll note that one strategy for handling potential information hazards would be to be quite conscious about how the info is framed/explained and who you share it with, to influence who receives it and how they use it. This is one of the "middle paths" between just sharing as if there's no risk and just not sharing as if there's only risk and no potential benefits.

Somewhat related to your point, I also think that, if we had a world where no one was malign or careless, then most things that currently pose information hazards would not. (Though I think some of Bostrom's types would remain.) And if we had a world where the vast majority of people were very malign or careless, then the benefits of sharing info would go down and the risks would go up. We should judge how risky (vs potentially beneficial) developing and sharing info is based on our best knowledge of how the world actually is, including the people in it, and especially those who are likely to receive the info.
