Informational hazards and the cost-effectiveness of open discussion of catastrophic risks

by avturchin, 23rd Jun 2018

Personal Blog

TL;DR: To prevent x-risks, our strategic vision should outperform the technical capabilities of potential malevolent agents. This means that strategic discussion should be public and open, while the publication of dangerous technical knowledge should be prevented.

Risks and benefits of the open discussion

Bostrom has created a typology of info-hazards, but any information could also have an “x-risk prevention positive impact”, or info-benefits. Obviously, the info-benefits of open public discussion of x-risks must outweigh its info-hazards, or research on x-risks is useless. In other words, the “cost-effectiveness” of the open discussion of a risk A should be estimated: the potential increase in the probability of a catastrophe should be weighed against the possible decrease in that probability.

The benefits of public discussion are rather obvious: if we publicly discuss a catastrophic risk, we can raise awareness about it, prepare for the risk, and increase funding for its prevention. Publicly discussed risks are also more likely to be examined by a larger group of scientists than those that are discussed only within a closed group. Interdisciplinary research and comparison of different risks are impossible if the risks are kept secret by different groups; as a result, for example, asteroid risks are overestimated (they are more publicly discussed) and biorisks are underestimated (they are less discussed).

A blanket "information hazard counterargument" is too general, as any significant new research on x-risks changes the information landscape. For example, even good news about new prevention methods may be used by bad actors to overcome these methods.

The problem of informational hazards has already been explored in the field of computer security, which has developed best practices in this area. The protocol calls for the discoverer of an informational hazard to first try to contact the firm which owns the vulnerable software, and later, if there is no reply, to publish it openly (or at least hint at it), to give users an advantage over bad actors who may use it secretly.

The relative power of info-hazards depends on the information that has already been published and on other circumstances

Consideration 1: If something is already public knowledge, then discussing it is not an informational hazard. Example: AI risk. The same is true for the “attention hazard”: if something is already extensively present in the public field, it is less dangerous to discuss it publicly.

Consideration 2: If information X is public knowledge, then similar information X2 is a lower informational hazard. Example: if the genome of a flu virus N1 has been published, publishing the similar flu genome N2 adds only a marginal informational hazard.

Consideration 3: If many info-hazards have already been openly published, the world may be considered saturated with info-hazards, as a malevolent agent already has access to so much dangerous information. In our world, where genomes of the pandemic flus have been openly published, it is difficult to make the situation worse.

Consideration 4: If I have an idea about x-risks in a field in which I don’t have technical expertise and it only took me one day to develop this idea, such an idea is probably obvious to those with technical expertise, and most likely regarded by them as trivial or non-threatening. 

Consideration 5: A layman with access only to information available on Wikipedia is not able to generate ideas about really powerful informational hazards that could not already be created by a dedicated malevolent agent, like the secret service of a rogue country. However, if one has access to unique information which is typically not available to laymen, this could be an informational hazard.

Consideration 6: Some ideas will soon be suggested anyway, but by speakers who are less interested in risk prevention. For example, ideas similar to Roko’s Basilisk have been suggested independently twice by my friends.

Consideration 7: Suppressing some kinds of information may signal its importance to malevolent agents, producing a “Streisand effect”.

Consideration 8: If there is a secret network for discussing such risks, some people will be excluded from it, which may create undesired social dynamics.

Info-hazard classification 

There are several types of catastrophic informational hazard (a more detailed classification can be seen in Bostrom’s article, which covers not only catastrophic info-hazards, but all possible ones):  

* Dangerous technical information (e.g. the genome of a virus).

* Ideas about possible future risks (like the deflection of asteroids towards Earth).

* Value-related informational hazards (e.g. the idea of a voluntary human extinction movement, or of fighting overpopulation by creating small catastrophes).

* Attention-related info-hazards. These are closely related to value hazards, as the more attention an idea gets, the more value humans typically give to it. A potentially dangerous idea should be discussed in ways where it is more likely to attract the attention of specialists than of the public or of potentially dangerous agents. This includes specialized forums, jargon, non-sensational titles, and scientific venues for discussion.

The most dangerous types are value-related and technical information. Value-related information could work as a self-replicating meme, and technical information could be used to actually create dangerous weapons, while ideas about possible risks could help us to prepare for such risks or to start additional research.

Value of public discussion

Following Bostrom’s maxipok rule, we could use the change in human extinction probability as the only important measure of the effectiveness of any action. In that case, the utility of any public statement A is:

V = ∆I – ∆IH, where ∆I is the increase of survival probability via better preparedness, and ∆IH is the increase of the probability of the x-risk because bad actors will know about it.
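
As a rough illustration of the formula above, here is a minimal Python sketch. The names `Statement`, `net_value` and `worth_publishing`, and all the numbers, are made up for this example and are not estimates of any real risk; the only thing taken from the text is the difference V = ∆I – ∆IH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A public statement about a risk, with two illustrative estimates."""
    name: str
    delta_i: float   # ∆I: estimated increase in survival probability
    delta_ih: float  # ∆IH: estimated increase in catastrophe probability


def net_value(s: Statement) -> float:
    """Return V = ∆I - ∆IH for one statement."""
    return s.delta_i - s.delta_ih


def worth_publishing(s: Statement) -> bool:
    """A statement passes this simple test only if its net value is positive."""
    return net_value(s) > 0.0


# Hypothetical numbers, chosen only to show the two possible outcomes.
strategic = Statement("general overview of a risk", delta_i=0.002, delta_ih=0.0005)
technical = Statement("technical details of a weapon", delta_i=0.0005, delta_ih=0.002)

for s in (strategic, technical):
    print(s.name, "->", "publish" if worth_publishing(s) else "do not publish")
```

In practice both terms are highly uncertain, so a sketch like this is useful mostly for making explicit which of the two estimates a disagreement is really about.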

Emerging technologies increase the complexity of the future, which at some moment could become chaotic. The more chaotic the future is, the shorter the planning horizon is, and the less time we have to act preventively. We need a full picture of future risks for strategic planning. To have the full picture, we have to discuss the risks openly without going into technical details.

The reason is that we can’t prevent risks that we don’t know about, and a prevention strategy needs a full list of risks, while a malevolent agent may need only technical knowledge of one risk (and such knowledge is already available in the field of biotech, so malevolent agents can’t gain much from our lists).

Conclusion

Society could benefit from the open discussion of ideas about possible risks, as such discussion could help in the development of general prevention measures and increase awareness, funding and cooperation. This could also help us to choose priorities in fighting different global risks.

For example, biorisks are less discussed and thus could be perceived as less of a threat than the risks of AI. However, biorisks could exterminate humanity before the emergence of superintelligent AI (to prove this argument I would have to present general information which may be regarded as an informational hazard). But the amount of hazardous technical information openly published is much larger in the field of biorisks, exactly because the risk of the field as a whole is underestimated!

If you have a new idea which appears to be a potential info-hazard, you may need to search the internet to find out whether it has already been published; most likely, it has. Then you may privately discuss it with a respected scientist in the field who also has knowledge of catastrophic risks, and ask whether the scientist thinks that this idea is really dangerous. The attention hazard should be overcome by non-sensationalist and non-media-attracting methods of analysis.

It is a best practice to add to the description of any info-hazard the ways in which the risk could be overcome, or why the discussion could be used to find approaches for its mitigation.

Literature

Bostrom, “Information Hazards: A Typology of Potential Harms from Knowledge”, 2011. https://nickbostrom.com/information-hazards.pdf

Yampolskiy, “Beyond MAD?: The Race for Artificial General Intelligence”. https://www.itu.int/en/journal/001/Documents/itu2018-9.pdf

LessWrong Wiki, “Information hazard”. https://wiki.lesswrong.com/wiki/Information_hazard

11 comments

If many info-hazards have already been openly published, the world may be considered saturated with info-hazards, as a malevolent agent already has access to so much dangerous information. In our world, where genomes of the pandemic flus have been openly published, it is difficult to make the situation worse.

I strongly disagree that we're in a world of accessible easy catastrophic information right now.

This is based on a lot of background knowledge, but as a good start, Sonia Ben Ouagrham-Gormley makes a strong case that bioweapons groups historically have had very difficult times creating usable weapons even when they already have a viable pathogen. Having a flu genome online doesn't solve any of the other problems of weapons creation. While biotechnology has certainly progressed since major historic programs, and more info and procedures of various kinds are online, I still don't see the case for lots of highly destructive technology being easily available.

If you do not believe that we're at that future of plenty of calamitous information easily available online, but believe we could conceivably get there, then the proposed strategy of openly discussing GCR-related infohazards is extremely dangerous, because it pushes us there even faster.

If the reader thinks we're probably already there, I'd ask how confident they are. Getting it wrong carries a very high cost, and it's not clear to me that having lots of infohazards publicly available is the correct response, even for moderately high certainty that we're in "lots of GCR instruction manuals online" world. (For starters, publication has a circuitous path to positive impact at best. You have to get them to the right eyes.)

Other thoughts:

The steps for checking a possibly-dangerous idea before you put it online, including running it by multiple wise knowledgeable people and trying to see if it's been discovered already, and doing analysis in a way that won't get enormous publicity, seem like good heuristics for potentially risky ideas. Although if you think you've found something profoundly dangerous, you probably don't even want to type it into Google.

Re: dangerous-but-simple ideas being easy to find: It seems that for some reason or other, bioterrorism and bioweapons programs are very rare these days. This suggests to me that there could be a major risk in the form of inadvertently convincing non-bio malicious actors to switch to bio - by perhaps suggesting a new idea that fulfils their goals or is within their means. We as humans are in a bad place to competently judge whether ideas that are obvious to us are also obvious to everybody else. So while inferential distance is a real and important thing, I'd suggest against being blindly incautious with "obvious" ideas.

(Anyways, this isn't to say such things shouldn't be researched or addressed, but there's a vast difference between "turn off your computer and never speak of this again" and "post widely in public forums; scream from the rooftops", and many useful actions between the two.)

(Please note that all of this is my own opinion and doesn't reflect that of my employer or sponsors.)

I think that Sonia is wrong for several reasons, but discussing them may be regarded as an info hazard, so I could PM you if you are interested.

Another point. A lot of people on this forum discuss the potential risks of superintelligent AI. However, such public discussion may advertise the idea of AI as an instrument of global domination. The problem was recognised by Seth Baum in one of his articles (I can't find the link).

Would a world where nobody publicly discusses the problem of AI alignment be a better one? Probably not, because in that case all of EY's outreach would not have happened, and not much research in AI alignment would ever be done. In that case, the chances of creating beneficial AI would be slim.

Thanks for writing this. How best to manage hazardous information is fraught, and although I have some work in draft and under review, much remains unclear - as you say, almost anything could have some downside risk, and never discussing anything seems a poor approach.

Yet I strongly disagree with the conclusion that the default should be to discuss potentially hazardous (but non-technical) information publicly, and I think your proposals of how to manage these dangers (e.g. talk to one scientist first) generally err too lax. I provide the substance of this disagreement in a child comment.

I’d strongly endorse a heuristic along the lines of, “Try to avoid coming up with (and don’t publish) things which are novel and potentially dangerous”, with the standard of novelty being a relatively uninformed bad actor rather than an expert (e.g. highlighting/elaborating something dangerous which can be found buried in the scientific literature should be avoided).

This expressly includes more general information as well as particular technical points (e.g. “No one seems to be talking about technology X, but here’s why it has really dangerous misuse potential” would ‘count’, even if a particular ‘worked example’ wasn’t included).

I agree it would be good to have direct channels of communication for people considering things like this to get advice on whether projects they have in mind are wise to pursue, and to communicate concerns they have without feeling they need to resort to internet broadcast (cf. Jan Kulveit’s remark).

To these ends, people with concerns/questions of this nature are warmly welcomed and encouraged to contact me to arrange further discussion.

Thanks for this and the subsequent comment, which helped me to update my views on the problem and to become even more cautious in discussing things.

Some thoughts came to mind while reading; maybe I will have more later:

1. It looks like all the talk about infohazards could be boiled down to just one thesis: "biorisk is a much more serious x-risk than AI risk, but we decided not to acknowledge it, as doing so could be harmful".

2. Almost all work in AI safety is based on "red-teaming": someone comes up with an idea X for how to make AI safe, and EY appears and says "Actually, this will spectacularly fail because...". However, the fact that a future AI may read that thread of comments and act according to the red-team advice is not considered, because the AI is assumed to be superintelligent and able to create all our ideas from scratch.

3. The idea of infohazards is based on the idea of an intellectual advantage of "EA people" over "bad people", where even an armchair futurist could create a dozen ideas for how to destroy the world, while professional scientists of some rogue country are sitting completely clueless and have to go to obscure forums in search of inspiration. From an outside point of view, this could look like arrogance. But it could also be interpreted to mean that we in fact live in a world where it is very easy to create plausible ways of destroying it, which supports the idea of oversaturation with infohazards.

4. People who study x-risks are the most dangerous people in the world, as they actually know how to destroy the world. Moreover, if a "bad agent" ever appears, he is more likely to be some deranged LW commentator than a North Korean officer.

0: We agree potentially hazardous information should only be disclosed (or potentially discovered) when the benefits of disclosure (or discovery) outweigh the downsides. Heuristics can make principles concrete, and a rule of thumb I try to follow is to have a clear objective in mind for gathering or disclosing such information (and being wary of vague justifications like ‘improving background knowledge’ or ‘better epistemic commons’) and incur the least possible information hazard in achieving this.

A further heuristic which seems right to me is that one should disclose information in the way that maximally disadvantages bad actors versus good ones. There is a wide spectrum of approaches that could be taken that lie between ‘try to forget about it’ and ‘broadcast publicly’, and I think one of the intermediate options is often best.

1: I disagree with many of the considerations which push towards more open disclosure and discussion.

1.1: I don’t think we should be confident there is little downside in disclosing dangers a sophisticated bad actor would likely rediscover themselves. Not all plausible bad actors are sophisticated: a typical criminal or terrorist is no mastermind, and so may not make (to us) relatively straightforward insights, but could still ‘pick them up’ from elsewhere.

1.2: Although a big fan of epistemic modesty (and generally a detractor of ‘EA exceptionalism’), EAs do have an impressive track record in coming up with novel and important ideas. So there is some chance of coming up with something novel and dangerous even without exceptional effort.

1.3: I emphatically disagree we are at ‘infohazard saturation’ where the situation re. infohazards ‘can’t get any worse’. I also find it unfathomable ever being confident enough in this claim to base strategy upon its assumption (cf. eukaryote’s comment).

1.4: There are some benefits to getting out ‘in front’ of more reckless disclosure by someone else. Yet in cases where one wouldn’t want to disclose it oneself, delaying the downsides of wide disclosure as long as possible seems usually more important, and so rules against bringing this to an end by disclosing yourself save in (rare) cases one knows disclosure is imminent rather than merely possible.

2: I don’t think there’s a neat distinction between ‘technical dangerous information’ and ‘broader ideas about possible risks’, with the latter being generally safe to publicise and discuss.

2.1: It seems easy to imagine cases where the general idea comprises most of the danger. The conceptual step to a ‘key insight’ of how something could be dangerously misused ‘in principle’ might be much harder to make than subsequent steps from this insight to realising this danger ‘in practice’. In such cases the insight is the key bottleneck for bad actors traversing the risk pipeline, and so comprises a major information hazard.

2.2: For similar reasons, highlighting a neglected-by-public-discussion part of the risk landscape where one suspects information hazards lie has a considerable downside, as increased attention could prompt investigation which brings these currently dormant hazards to light.

3: Even if I take the downside risks as weightier than you, one still needs to weigh these against the benefits. I take the benefit of ‘general (or public) disclosure’ to have little marginal benefit above more limited disclosure targeted to key stakeholders. As the latter approach greatly reduces the downside risks, this is usually the better strategy by the lights of cost/benefit. At least trying targeted disclosure first seems a robustly better strategy than skipping straight to public discussion (cf.).

3.1: In bio (and I think elsewhere) the set of people who are relevant to setting strategy and otherwise contributing to reducing a given risk is usually small and known (e.g. particular academics, parts of the government, civil society, and so on). A particular scientist unwittingly performing research with misuse potential might need to know the risks of their work (likewise some relevant policy and security stakeholders), but the added upside to illustrating these risks in the scientific literature is limited (and the added downsides much greater). The upside of discussing them in the popular/generalist literature (including EA literature not narrowly targeted at those working on biorisk) is limited still further.

3.2: Information also informs decisions around how to weigh causes relative to one another. Yet less-hazardous information (e.g. the basic motivation given here or here, and you could throw in social epistemic steers from the prevailing views of EA ‘cognoscenti’) is sufficient for most decisions and decision-makers. The cases where this nonetheless might be ‘worth it’ (e.g. you are a decision maker allocating a large pool of human or monetary capital between cause areas) are few and so targeted disclosure (similar to 3.1 above) looks better.

3.3: Beyond the direct cost of potentially giving bad actors good ideas, the benefits of more public discussion may not be very high. There are many ways public discussion could be counter-productive (e.g. alarmism, ill-advised remarks poisoning our relationship with scientific groups, etc.). I’d suggest the examples of cryonics, AI safety, GMOs and other lowlights of public communication of policy and science are relevant cautionary examples.

4: I also want to supply other more general considerations which point towards a very high degree of caution:

4.1: In addition to the considerations around the unilateralist’s curse offered by Brian Wang (I have written a bit about this in the context of biotechnology here) there is also an asymmetry in the sense that it is much easier to disclose previously-secret information than make previously-disclosed information secret. The irreversibility of disclosure warrants further caution in cases of uncertainty like this.

4.2: I take the examples of analogous fields to also support great caution. As you note, there is a norm in computer security of ‘don’t publicise a vulnerability until there’s a fix in place’, and initially informing a responsible party to give them the opportunity to do this pre-publication. Applied to bio, this suggests targeted disclosure to those best placed to mitigate the information hazard, rather than public discussion in the hopes of prompting a fix to be produced. (Not to mention a ‘fix’ in this area might prove much more challenging than pushing a software update.)

4.3: More distantly, adversarial work (e.g. red-teaming exercises) is usually done by professionals, with a concrete decision-relevant objective in mind, with exceptional care paid to operational security, and their results are seldom made publicly available. This is for exercises which generate information hazards for a particular group or organisation - similar or greater caution should apply to exercises that one anticipates could generate information hazardous for everyone.

4.4: Even more distantly, norms of intellectual openness are used more in some areas, and much less in others (compare the research performed in academia to security services). In areas like bio, the fact that a significant proportion of the risk arises from deliberate misuse by malicious actors means security services seem to provide the closer analogy, and ‘public/open discussion’ is seldom found desirable in these contexts.

5: In my work, I try to approach potentially hazardous areas as obliquely as possible, more along the lines of general considerations of the risk landscape or from the perspective of safety-enhancing technologies and countermeasures. I do basically no ‘red-teamy’ types of research (e.g. brainstorm the nastiest things I can think of, figure out the ‘best’ ways of defeating existing protections, etc.)

(Concretely, this would comprise asking questions like, “How are disease surveillance systems forecast to improve over the medium term, and are there any robustly beneficial characteristics for preventing high-consequence events that can be pushed for?” or “Are there relevant limits which give insight to whether surveillance will be a key plank of the ‘next-gen biosecurity’ portfolio?”, and not things like, “What are the most effective approaches to make pathogen X maximally damaging yet minimally detectable?”)

I expect a non-professional doing more red-teamy work would generate less upside (e.g. less well networked to people who may be in a position to mitigate vulnerabilities they discover, more likely to unwittingly duplicate work) and more downside (e.g. less experience with trying to manage info-hazards well) than I. Given I think this work is usually a bad idea for me to do, I think it’s definitely a bad idea for non-professionals to try.

I therefore hope people working independently on this topic approach ‘object level’ work here with similar aversion to more ‘red-teamy’ stuff, or instead focus on improving their capital by gaining credentials/experience/etc. (this has other benefits: a lot of the best levers in biorisk are working with/alongside existing stakeholders rather than striking out on one’s own, and it’s hard to get a role without (e.g.) graduate training in a relevant field). I hope to produce a list of self-contained projects to help direct laudable ‘EA energy’ to the best ends.

Would you be up for me copying over the content of the article to LW? This makes it searchable in our search, and generally makes it easier for people to read it.

Yes, should I repost it as a text post, or will you copy it?

I will copy it over :)

Congrats, you rediscovered the rationale for the Atomic Energy Act of 1954, and postulated the logical problems stemming from it.

Way to go guys.

No, the main difference between the atomic situation and now is that now much smaller groups of people, or even an individual armchair-bound thinker, could (supposedly) come up with an idea which may be an informational hazard.