
An example of a real-world visual infohazard that isn't all that dangerous, but is very powerful: The McCollough effect. This could be useful as a quick example when introducing people to the concept of infohazards in general, and is also probably worthy of further research by people smarter than me.

Huh, started reading about this and then sorta got scared and stopped. :P

(I googled "McCollough effect", and looked at a bunch of images for a while before starting to read the article, then the article was saying "looking at the visuals might leave lasting changes", then I did some combination of 'freaked out slightly' and also 'decided I didn't care enough to finish reading the article')

[potential infohazard warning, though I’ve tried to keep details out] I’ve been thinking

EDIT: accidentally posted while typing, leading to the implication that the very act of thinking is an infohazard. This is far funnier than anything I could have written deliberately, so I’m keeping it here.

I find it surprising how few comments there are on some posts here. I've seen people ask some really excellent sceptical questions, which if answered persuasively could push that person towards research we think will be positive, but which instead go unanswered. What can the community do to ensure that sceptics are able to have a better conversation here?

There's generally been a pretty huge wave of Alignment posts and it's just kinda hard to get attention to all of them. I do agree the current situation is a problem.

I find it surprising how small the community is in general.

I think that’s primarily due to the unbelievably bad PR LessWrong has

EDIT: very kindly given a temporary key to Midjourney, thanks to that person! 😊

Does anyone have a spare key to Midjourney they can lend out? I’ve been on the waiting list for months, and there’s a time-sensitive comparative experiment I want to do with it. (Access to Dall-E 2 would also be helpful, but I assume there’s no way to get access outside of the waitlist)

I just reached the “kinda good hearts” leaderboard, and I notice I’m suddenly more hesitant to upvote other posts, perhaps because I’m “afraid” of being booted off the leaderboard now that I’m on it. This seems like a bad incentive, assuming that you think that upvoting posts is generally good. I can even imagine a more malicious and slightly stupider version of myself going around and downvoting others’ posts, so I’d appear higher up. I also notice a temptation to continue posting, which isn’t necessarily good if my writing isn’t constructive. On the upside, however, this has inspired me to actually write out a lot of potentially useful thoughts I would otherwise not have shared!

Thinking about how a bioterrorist might use the new advances in protein folding for nefarious purposes. My first thought is that they might be able to more easily construct deadly prions, which immediately brings up the question—just how infectious can prions theoretically become? If anyone happens to know the answer to that, I'd be interested to hear your thoughts

Prions can only cause problems to the extent that existing proteins are susceptible to misfolding into the shape of the prion.

They are also not a delivery vehicle.

Viruses and bacteria are both bigger problems because they can actually travel more easily from host to host.

I don't know the answer, but your question makes me think you might find it valuable to define what you mean by "infectious." Zvi had a section in one of his recent COVID posts where he was struggling with the ambiguity over what it actually means that there might be a "more infectious COVID strain."

With that definition in hand, it might perhaps be valuable to try a subreddit?

It would be interesting to consider the aspects of a disease mechanism that would make it more useful to a terrorist. I can think of a few characteristics. Deadliness, how easy it is to target, infectiousness, and ease of preparation all spring to mind.

Quick thought—has there been any focus on research investigating the difference between empathetic and psychopathic people? I wonder if that could help us better understand alignment…

I'd really like to understand what's going on in videos like this one where graphing calculators seem to "glitch out" on certain equations—are there any accessible reads out there about this topic?

Take two! [Note, the following may contain an infohazard, though I’ve tried to leave key details out while still getting what I want across] I’ve been wondering if we should be more concerned about “pessimistic philosophy.” By this I mean the family of philosophical positions which lead to a seemingly-rationally-arrived-at conclusion that it is better not to exist than to exist. It seems quite easy, or at least conceivable, for an intelligent individual, perhaps one with significant power, to find themselves at such a conclusion, and decide to “benevolently” try to act on that (perhaps Nick Land as interpreted by his critics is an example of this?). I’m not sure what, if anything, to do with this train of thought, and am concerned that with even light study of the subject, I’ve run into a large body of infohazards, some of which may have negatively affected me slightly (as far as I’m aware not contagious though, unless you count this post as a potential spreader. Reminder to be responsible with your personal mental health here if you want to look into this further.).

I have often come to a seemingly-rationally-arrived-at conclusion that 1+1=3 (or some other mathematical contradiction). I invariably conclude that my reasoning went astray, not that ZF is inconsistent.

I respond similarly to reasoning that it is better to die/never have existed/kill everyone and fill my future lightcone with copies of myself/erase my own identity/wirehead/give away everything I own/obsess over the idea that I might be a Boltzmann brain/go on an hour-long crying jag whenever I contemplate the sorrows of the world/be paralysed in terror at the octillions of potential future lives whose welfare and suffering hang on the slightest twitch of my finger/consider myself such a vile and depraved thing that one thousand pages by the most gifted writer could not express the smallest particle of my evilness/succumb to Power Word: Red Pill/respond to the zombie when it croaks "yes, but what if? what if?"/take the unwelcomeness of any of these conclusions as evidence of their truth.

I know not to trust my satnav when it tells me to drive off a cliff, and neither do I follow an argument when it leads into the abyss.

It's great that you have that satnav. I worry about people like me. I worry about being incapable of leaving those thoughts alone until I've pulled the thread enough to be sure I should ignore it. In other words, if I think there's a chance something like that is true, I do want to trust the satnav, but I also want to be sure my "big if true" discovery genuinely isn't true.

Of course, a good inoculation against this has been reading some intense blogs by people who've adopted alternative decision theories which led them down really scary paths to watch.

I worry "there but for the grace of chance go I." But that's not quite right, and being able to read that content and not go off the deep end myself is evidence that maybe my satnav is functioning just fine after all.

I suspect I'm talking about the same exact class of infohazard as mentioned here. I think I know what's being veiled and have looked it in the eye.

Thanks for your excellent input! It’s not really the potential accuracy of such dark philosophies that I’m worried about here (though that is also an area of some concern, of course, since I am human and do have those anxieties on occasion), but rather how easy it seems to be to fall prey to and subsequently act on those infohazards for a certain subclass of extremely intelligent people. We’ve sadly had multiple cases in this community of smart people succumbing to thought-patterns which arguably (probably?) led to real-world deaths, but as far as I can tell, the damage has mostly been contained to individuals or small groups of people so far. The same cannot be said of some religious groups and cults, who have a history of falling prey to such ideologies (“everyone in outgroup x deserves death,” is a popular one). How concerned should we be about, say, philosophical infohazards leading to x-risk level conclusions [example removed]? I suspect natural human satnav/moral intuition leads to very few people being convinced by such arguments, but due to the tendency of people in rationalist (and religious!) spaces to deliberately rethink their intuition, there seems to be a higher risk in those subgroups for perverse eschatological ideologies. Is that risk high enough that active preventative measures should be taken, or is this concern itself of the 1+1=3, wrong-side-of-the-abyss type?

I know what you mean, and I think that, similar to what Richard Kennaway says below, we need to teach people new to the sequences and to exotic decision theories not to drive off a cliff because of a thread they couldn't resist pulling.

I think we really need something in the sequences about how to tell if your wild-seeming idea is remotely likely, i.e. a "How to Trust Your SatNav" post. The basic content of the post is: remember to stay grounded, and ask how likely this wild new framework might be. Ask others who can understand and assess your theory, and if they say you're getting some things wrong, take them very seriously. This doesn't mean you can't follow your own convictions, it just means you should do it in a way that minimises potential harm.

Now, having read the content you're talking about, I think a person needs to already be pretty far gone epistemically before this infohazard can "get them," and I mean both the original idea-haver and those who receive it via transmission. But I think it's still going to help very new readers to not drive off so many cliffs. It's almost like some of them want to, which is... its own class of concerns.

Less (comparatively) intelligent AGI is probably safer, as it will have a greater incentive to coordinate with humans (over killing us all immediately and starting from scratch), which gives us more time to blackmail them.

Thinking about EY's 2-4-6 problem (the following assumes you've read the link, slight spoilers for the Sequences ahead), and I'm noticing some confusion. I'm going to walk through my thought process as it happens (starting with trying to solve the problem as if I don't know the solution), so this is gonna get messy.

Let's say you start with the "default" hypothesis (we'll call it Hyp1) that people seem to jump to first (by this I mean both me and Yudkowsky; I have no idea about others, or why we jumped to this one first): that only sequences of numbers increasing by 2 return True. How should a rationalist try to prove/disprove Hyp1 is the correct algorithm with as few sets of sequences as possible? Well, my naive thought would be let's test with a random high-number sequence that follows Hyp1. This would be to ensure there isn't some sort of cap to the size allowed (wouldn't be proven of course, but if I'm dealing with a human I can probably assume that). Now what? Knowing nothing else, should we continue with similar sequences, to try to add evidence through "positive" induction, or aim for something against the rules, to make sure we aren't being too restrictive? The fact of the matter is, while we can (and have to) make sure our hypothesis returns True and False, respectively, for all past "graded" sequences, we can't ensure the ruleset isn't more complicated than it seems so far, such that there will be a future sequence that our hypothesis will give the incorrect answer to. A possible exception to this is if the ruleset only allows for a finite number of sequences which return True, but as I'm typing this I realize that's not an exception; you can go through the full finite set of presumably True sequences and they'll all return True, but you still can't be sure the presumably False sequences will all return False. So there is no way to prove you've landed on the correct hypothesis with any possible hypothesis you give.

Okay, so at what point do you decide you've gathered enough evidence to determine if Hyp1, or any given Hyp, is true or not? As fun as it would be to have Yudkowsky waste his time grading infinite slips of paper (this is a joke; I do not endorse wasting Eliezer's time irl, he's got better stuff to do), I'm gonna get bored way before infinity. So let's say I only need to reach 50% confidence. How would I know once I've reached that? I'm not sure right now, and would appreciate any comments giving insight on this, since if we weren't dealing with a human, but rather with a God who chose an algorithm "randomly from among all finite computable algorithms" (let's assume for now the quoted statement has any meaning), then I think it will be impossible to gain any finite amount of confidence in a given algorithm in finite time, since there are infinitely more possible algorithms that can also be true, no matter the sample size.

The good news is we aren't dealing with a God, we're dealing with a human (presumably) that built this problem, so we can restrict the phase space of possible valid algorithms to those which a human would reasonably come up with.  We can also assume with good probability that the algorithm will be both fairly simple, and also at least somewhat counter-intuitive in nature, considering the classroom setting Yudkowsky put us in, and the fact it's EY who's giving us this problem, come on guys. 

For good measure, we can probably also assume only high-school level math or below will be used, since otherwise people who just don't know advanced math will feel cheated. I'm noticing this is turning into a psychological analysis, which I think is what's going on "under the hood" when people consider math problems irl. (I remember being in grade school, like 10 years old or so, and getting frustrated that a math question about the speed of an airplane didn't take acceleration into account. I was convinced that the teacher had the answer wrong, since she assumed a flat speed. Fun times, grade school...) This is something I don't think Yudkowsky brought up, but even seemingly simple questions like this have a lot of real-world assumptions baked into them, and it's often assumed by researchers and teachers that people will make the same assumptions they did. 
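Once the candidate set is finite, the earlier "how would I know I've hit 50%?" question has a fairly mechanical answer: put a prior weight on each candidate, throw out every candidate that disagrees with a graded triplet, and renormalize. The sketch below is only an illustration. The three candidate rules, their prior weights, and the second graded triplet are all made up for the example, and the resulting confidence is only relative to whatever candidate set you chose to write down.

```python
from fractions import Fraction

# Hypothetical candidate rules with made-up prior weights (illustration only).
CANDIDATES = {
    "increases by 2 each step": (
        lambda t: t[1] - t[0] == 2 and t[2] - t[1] == 2,
        Fraction(1, 2),
    ),
    "evenly spaced and ascending": (
        lambda t: t[1] - t[0] == t[2] - t[1] and t[1] > t[0],
        Fraction(1, 4),
    ),
    "strictly ascending": (
        lambda t: t[0] < t[1] < t[2],
        Fraction(1, 4),
    ),
}


def posterior(candidates, graded):
    """Renormalize prior weight over the candidates that match every graded triplet."""
    surviving = {
        name: prior
        for name, (rule, prior) in candidates.items()
        if all(rule(triplet) == verdict for triplet, verdict in graded)
    }
    total = sum(surviving.values())
    if total == 0:
        return {}  # every candidate was falsified; the candidate set was too narrow
    return {name: prior / total for name, prior in surviving.items()}


# (2, 4, 6) is the known example; the verdict for (1, 2, 3) is hypothetical.
graded = [((2, 4, 6), True)]
print(posterior(CANDIDATES, graded))
graded.append(((1, 2, 3), True))
print(posterior(CANDIDATES, graded))
```

Under these made-up priors, a triplet that every candidate accepts leaves the weights unchanged, which is the formal version of "confirming examples that all your hypotheses predict teach you nothing."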

Okay, so we've narrowed down the space of possible valid algorithms considerably (in fact from an infinite to a finite space, which is pretty impressive), and all this without writing down a thing. What now? Hyp1 was what intuitively came into my mind, and considering similarity of intuitive mental patterns across humans, it's likely that the test designer at least considered it as well. So let's start there. My first impulse is to see if we can extract individual assumptions, or variables, from Hyp1, so we can play around with them. Writing it down explicitly,

Hyp1 = for (the last two numbers of the sequence), return True iff (PreviousNumber + 2 = CurrentNumber)
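For anyone who wants to poke at it, here is one way Hyp1 might look as runnable Python. This is a sketch of the hypothesis only, not of the real rule. The function name `hyp1` and the choice to convert inputs through `Fraction(str(x))` are my own assumptions; the conversion is there because plain floats would make a check like `-2.07 + 2 == -0.07` fail due to rounding.

```python
from fractions import Fraction


def hyp1(triplet):
    """Hyp1: every number after the first is exactly 2 more than the one before it."""
    # Going through str() treats a literal like -2.07 as the exact decimal -207/100.
    nums = [Fraction(str(x)) for x in triplet]
    return all(cur - prev == 2 for prev, cur in zip(nums, nums[1:]))


print(hyp1((2, 4, 6)))
print(hyp1((6, 4, 2)))
```

Hyp1 predicts True for the first call and False for the second; whether the grader agrees is exactly what the experiment is for.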

Not sure where to put this, but I also just noticed that all of Yudkowsky's examples are whole positive numbers, and without testing we can't actually be sure if that's a requirement or not. I would test that in a minimal amount of space with one sequence which starts with a negative fractional number,  but which otherwise assumes Hyp1, such as (-2.07, -0.07, 1.93). If we're optimizing for space, we're probably going to want to pack in as many falsifiable assumptions as we can into each sequence, and if it returns false, we treat each assumption with separate sequences on the next set of three, so we can narrow down what went wrong.
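The "pack assumptions into one probe, then split them up if it fails" strategy can be sketched like this. Everything named here is hypothetical: `oracle` stands for whoever grades the triplets, and the two isolating probes are just one possible way to separate "negative" from "fractional" while still following Hyp1.

```python
# One probe that bundles two untested assumptions: negative and fractional values.
COMBINED_PROBE = (-2.07, -0.07, 1.93)

# Follow-up probes that each vary a single assumption (hypothetical choices).
ISOLATING_PROBES = {
    "negative start": (-2, 0, 2),
    "fractional values": (0.5, 2.5, 4.5),
}


def run_probes(oracle):
    """Try the bundled probe first; split into single-assumption probes only on failure."""
    if oracle(COMBINED_PROBE):
        # Both assumptions survived in a single query.
        return {"combined": True}
    results = {"combined": False}
    for name, probe in ISOLATING_PROBES.items():
        results[name] = oracle(probe)
    return results


def rejects_negatives(triplet):
    """Stand-in grader for demonstration: accepts only non-negative ascending triplets."""
    a, b, c = triplet
    return a >= 0 and a < b < c


print(run_probes(rejects_negatives))
```

With this stand-in grader the bundled probe fails, and the follow-ups show that the negative start is what broke it while fractional values are fine. The cost is one query when the bundle passes and three when it fails.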

What are some other assumptions we're making with Hyp1?

 

 

[submitting to save, this is still a work in progress; idk how to draftify a shortform lol]

The triplet

6, 4, 2

also seems worth testing.


infinite slips of paper

1. Write a program that knows the rule.

2. Go faster by allowing triplet rules.

Like a, a+2, a+4.

This isn't guessing the rule. If all instances would return true, then it gets true back.
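As I read the suggestion above, it could look roughly like the sketch below: a program holds the rule and answers queries about whole families of triplets at once. The names `hidden_rule` and `family_passes` are mine, the rule inside is only a stand-in, and checking a finite range of `a` is an approximation of "all instances", not a proof.

```python
def hidden_rule(triplet):
    """Stand-in for the grader's rule (an assumption for this sketch)."""
    a, b, c = triplet
    return a < b < c


def family_passes(rule, family, params):
    """Return True only if every sampled instance of the family satisfies the rule."""
    return all(rule(family(a)) for a in params)


def step_two(a):
    """The family a, a+2, a+4 from the comment above."""
    return (a, a + 2, a + 4)


def descending(a):
    """A contrasting family: a, a-2, a-4."""
    return (a, a - 2, a - 4)


sample = range(-50, 51)
print(family_passes(hidden_rule, step_two, sample))
print(family_passes(hidden_rule, descending, sample))
```

A True answer for a family still isn't a guess at the rule itself; it only says the family sits inside whatever the rule accepts.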

Thinking about Bricking, from yesterday’s post (https://www.lesswrong.com/posts/eMYNNXndBqm26WH9m/infohazards-hacking-and-bricking-how-to-formalize-these), and realized that the answer to the question “can a Universal Turing Machine be Bricked” is an obvious yes—just input the HALT signal! That would immediately make any input past that irrelevant, since the machine would never read it. Is there a way to sidestep this such that Bricking is a non-trivial concept? In a real-world computer, sending a HALT signal as an input doesn’t tend to hard brick (https://en.wikipedia.org/wiki/Brick_(electronics)#Hard_brick) the machine, since you can just turn it on again with relative ease. My guess is I’m missing some very basic concepts here, but I’m finding it hard to find any existing articles on the topic.

Technically, all the input to any Turing machine in the usual formalism is provided up-front as tape symbols, and output is considered to be the state of the tape after the machine halts. "Bricking" requires some notion of ongoing input concurrent with computation.

There are many models of computation, and some of those may be more suited to the question. For example, there are some models in which there is an "I/O" stream as well as a "memory" tape, with associated changes to the state transition formalism. In this model, a "bricked" machine could be one which enters a state from which it will never read from the input stream or write to output. Some machines of this type will never enter a "bricked" state, including some capable of universal computation.
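To make the "bricked state" idea concrete, here is a minimal sketch that looks only at a machine's control-state graph. It is a simplification I'm assuming for illustration: the memory tape is ignored, so a state this function flags can definitely never reach I/O again, but a state it doesn't flag might still be bricked in practice (deciding that in general needs more than the control graph). The state names and the `transitions` / `io_states` representation are hypothetical.

```python
def bricked_states(transitions, io_states):
    """Return control states from which no state that reads input or writes output is reachable.

    transitions: dict mapping each state to an iterable of possible successor states.
    io_states: states that perform a read from the input stream or a write to output.
    """
    can_do_io = set(io_states)
    changed = True
    while changed:  # propagate backwards until no new state is added
        changed = False
        for state, successors in transitions.items():
            if state not in can_do_io and set(successors) & can_do_io:
                can_do_io.add(state)
                changed = True
    all_states = set(transitions)
    for successors in transitions.values():
        all_states |= set(successors)
    return all_states - can_do_io


# Toy machine: "read" consumes input, "work" computes, "loop" spins forever, "halt" stops.
toy = {
    "read": {"work", "halt"},
    "work": {"read", "loop"},
    "loop": {"loop"},
    "halt": set(),
}
print(sorted(bricked_states(toy, io_states={"read"})))
```

In this toy machine both the infinite loop and the halt state come out as bricked, which matches the observation in the parent comment that halting is the trivial way to brick such a machine.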

Real-world computers are both more and less complex than any such formal model of computation. More complex in that they have a great deal more fiddly detail in their implementations, but less complex in that there are physical upper bounds on the complexity of algorithms they can compute.

Has there been any EA/rationalist writing/discussion on plausible effects of Roe v. Wade being overturned, if that ends up happening this summer?

Not that I've seen, and I hope it stays that way (at least on LW; there may be other rationalist-adjacent places where getting that close to current political topics works well).

It seems to be working okay with regard to Covid policy and Ukraine stuff, which are very heavily politicized. I’d expect perhaps a few nasty comments, but my (perhaps naïve) assumption is that it would be possible to discuss that sort of thing here in a relatively mature manner.

Those things are politicized, but there's a ground-truth behind them, and most of the discussion is driven by a few well-respected posters doing a LOT of great work to keep it primarily factual and not give much weight to the political side of things (while recognizing the pain caused by the fact that it's politicized).

I don't believe that treatment is possible with culture-war topics, which are political through and through.  I also don't expect any long-time prolific community member (who understands the somewhat mutable boundaries of what's useful here and what's not) to take up the topic.  

Also, those series started as linkposts to outside blogs, and the LW mods decided to promote and encourage them.  This is a GREAT pattern to follow for topics you think might work well, but aren't sure - post them on your own forum where you're used to having full freedom of topic and see how people will react there, then link one or two of the best ones here to see how it goes.

I agree that a topic like abortion by default doesn't fit well on LW. However, I could imagine some posts about abortion might be okay on LW (but wouldn't get on the frontpage, I'm guessing, because they'd be too at risk of triggering political fights). For example, a post analyzing various abortion policies and what effects they have without making a strong policy recommendation (or making multiple policy recommendations based on what objectives one is trying to achieve) would probably be fine and interesting. A post about how [terrible thing] will befall [group x] as a result of changes to US abortion policy would probably end up too political.

One aspect of the topic I would be interested in is expected long-term effects on population growth rates, potential movements/migrations as a result, etc. I'd expect there to be some data on the topic if other nations have done anything similar in the past, and while I don't feel qualified to analyze such topics in any depth, I can imagine it being handled well.

I think there are ways to treat those kinds of topics in a productive way (through ideological Turing tests, for example).