Views purely my own unless clearly stated otherwise
Maybe I will make a (somewhat lazy) LessWrong post with my favorite quotes
Edit: I did it: https://www.lesswrong.com/posts/jAH4dYhbw3CkpoHz5/favorite-quotes-from-high-output-management
Nice principle.
Reminds me of the following quote from the classic management book High Output Management:
Given a choice, should you delegate activities that are familiar to you or those that aren’t? Before answering, consider the following principle: delegation without follow-through is abdication. You can never wash your hands of a task. Even after you delegate it, you are still responsible for its accomplishment, and monitoring the delegated task is the only practical way for you to ensure a result. Monitoring is not meddling, but means checking to make sure an activity is proceeding in line with expectations. Because it is easier to monitor something with which you are familiar, if you have a choice you should delegate those activities you know best. But recall the pencil experiment and understand before the fact that this will very likely go against your emotional grain.
A common use of "Human Values" is in sentences like "we should align AI with Human Values" or "it would be good to maximize Human Values upon reflection", i.e. normative claims about how Human Values are good and should be achieved. However, if you're not a moral realist, there's no (or very little) reason to believe that humans, even after reflecting for a long time, will arrive at the same values. And most of the time, when someone says "Human Values", they don't mean to include the values of Hitler or a serial killer. This makes the term confusing: it gets used both descriptively and normatively, and the normative use is common enough that the purely descriptive use becomes ambiguous.
I agree that if you're a moral realist, it's useful to have a term for "preferences shared amongst most humans" as distinct from Goodness, but Human Values is a bad choice because:
I really appreciate your clear-headedness in recognizing these phenomena even in people "on the same team", i.e. people very concerned about and interested in preventing AI X-Risk.
However, I suspect that you also underrate the amount of self-deception going on here. It's much easier to convince others if you convince yourself first. I think people in the AI Safety community self-deceive in various ways, for example by choosing not to fully think through how their beliefs are justified (e.g. not acknowledging the extent to which those beliefs rest on deference; Tsvi writes about this rather well in his recent post).
There are of course people who explicitly, consciously plan to deceive, thinking things like "it's very important to convince people that AI Safety/policy X is important, so we should use the most effective messaging techniques possible, even if they rely on false or misleading claims." However, I think there's a larger set of people who, as they realize claims A, B, and C are useful for consequentialist reasons, internally start questioning A, B, and C less, and become biased toward believing them.
The <1% comes from a combination of two estimates. Very rough numbers would be (1) p(superintelligence within 20 years) = 1% and (2) p(superintelligence kills everyone within 100 years of being built) = 5%, though it's very hard to put numbers on such things while lacking info, so take this as gesturing at a general ballpark.
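To spell out how these numbers combine (a rough sketch on my part, reading the 5% as conditional on superintelligence being built and simply multiplying the two):

$$P(\text{doom from superintelligence built within 20 years}) \approx 0.01 \times 0.05 = 0.0005 = 0.05\% \ll 1\%$$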
I haven't written much about (1). Some of it is intuition from working in the field and using AI a lot. (Edit: see this from Andrej Karpathy, which gestures at some of that intuition.)
Re (2), I've written a couple of relevant posts (post 1, post 2, a review of IABIED), though I'm somewhat dissatisfied with their level of completeness. The TLDR is that I'm very skeptical of appeals to coherence-argument-style reasoning, which is central to most misalignment-related doom stories (relevant discussion with Raemon).
Correct. Though when writing the original comment I didn't realize Nikola's p(doom) within 19 years was literally >50%. My main point was that even if your p(doom) is relatively high but <50%, you can expect to be able to raise a family. Even at Nikola's p(doom), there's some chance he could raise children to adulthood (15% according to him), which makes it not a completely doomed pursuit if he really wanted them.
I mean, I also think it's OK to birth people who will die soon. But indeed that wasn't my main point.
Yeah, I think it's very unlikely that your family would die in the next 20 years (<<1%), so that's the crux regarding whether or not you can raise a family.
By the time I'd have had kids
It only takes 10 months to make one…
At the risk of committing a Bulverism, I've noticed a tendency for people to see ethical bullet-biting as epistemically virtuous, like a demonstration of how rational and unswayed by emotion you are (biasing them toward overconfident bullet-biting). However, this makes less sense in ethics, where intuitions like repugnance are a large proportion of what everything is based on in the first place.