Sure, but let me clarify that I'm probably not drawing as hard a boundary between "ordinary paranoia" and "deep security" as I should be. I think Bruce Schneier's and Eliezer's buckets for "security mindset" blended together in the months since I read both posts. Also, re-reading the logistic success curve post reminded me that Eliezer calls into question whether someone who lacks security mindset can identify people who have it. So it's worth noting that my ability to identify people with security mindset is itself suspect by this criteria (there's no public evidence that I have security mindset and I wouldn't claim that I have a consistent ability to do "deep security"-style analysis.)

With that out of the way, here are some of the examples I was thinking of.

First of all, at a high level, I've noticed that you seem to consistently question assumptions other posters are making and clarify terminology when appropriate. This seems like a prerequisite for security mindset, since it's a necessary first step towards constructing systems.

Second and more substantively, I've seen you consistently raise concerns about human safety problems (also here. I see this as an example of security mindset because it requires questioning the assumptions implicit in a lot of proposals. The analogy to Eliezer's post here would be that ordinary paranoia is trying to come up with more ways to prevent the AI from corrupting the human (or something similar) whereas I think a deep security solution would look more like avoiding the assumption that humans are safe altogether and instead seeking clear guarantees that our AIs will be safe even if we ourselves aren't.

Last, you seem to be unusually willing to point out flaws in your own proposals, the prime example being UDT. The most recent example of this is your comment about the bomb argument, but I've seen you do this quite a bit and could find more examples if prompted. On reflection, this may be more of an example of "ordinary paranoia" than "deep security", but it's still quite important in my opinion.

Let me know if that clarifies things at all. I can probably come up with more examples of each type if requested, but it will take me some time to keep digging through posts and comments so figured I'd check in to see if what I'm saying makes sense before continuing to dig.

Jimrandomh's Shortform

by jimrandomh 1 min read4th Jul 201964 comments

This post is a container for my short-form writing. See this post for meta-level discussion about shortform as an upcoming site feature.