Is it just me or is Overcoming Bias almost reaching the point of self-parody with recent posts like http://www.overcomingbias.com/2010/12/renew-forager-law.html ?
It's interesting as a "just for fun" idea. On some blogs it would probably be fine, but OB used to feel a lot more rigorous and important, and several levels above me, than it does now.

Coincidence or name-drop?
(A fine suggestion, in any case, though "List of cognitive biases" would also be a good one to have on the list.)
Ask a Mathematician / Ask a Physicist's "Q: Which is a better approach to quantum mechanics: Copenhagen or Many Worlds?" answers, to my surprise, unequivocally in favour of Many Worlds.
Some may appreciate the 'Chrismas' message OkCupid sent out to atheist members:
12 Days of Atheist Matches
Greetings, fellow atheist. This is Chris from OkCupid. I know from hard experience that non-believers have few holiday entertainment options:
- Telling kids that Santa Claus is just the tip of the iceberg, in terms of things that don't actually exist
- Asking Christians to explain why they don't also believe in Zeus
- Messaging cute girls
And I'll start things off with a question I couldn't find an existing place or post for.
Coherent extrapolated volition. That 2004 paper sets out what it would be and why we want it, in the broadest outlines.
Has there been any progress on making this concept any more concrete since 2004? How to work out a CEV? Or even one person's EV? I couldn't find anything.
I'm interested because it's an idea with obvious application even if the intelligence doing the calculation is human.
On the topic of CEV: the Wikipedia article only has primary sources and needs third-party ones.
Expand? Are you talking about saying things about the output of CEV, or something else?
Not just the output; the input and means of computation are also potential minefields of moral politics. After all, this touches on what amounts to the ultimate moral question: "If I had ultimate power, how would I decide how to use it?" When you are answering that question in public, you must use extreme caution, at least if you have any real intent to gain power.
There are some things that are safe to say about CEV, particularly things on the technical side. But for the most part it is best to avoid giving too many straight answers. I said something on the subject of what can be considered a subproblem ("Do you confess to being consequentialist, even when it sounds nasty?"). Eliezer's responses took a similar position:
then they would be better off simply providing an answer calibrated to please whoever they most desired to avoid disapproval from
No they wouldn't. Ambiguity is their ally. Both answers elicit negative responses, and they can avoid that from most people by not saying anything, so why shouldn't they shut up?
When describing CEV mechanisms in detail from...
In what situation would this be better or easier than simply donating more, especially if percentage of income is considered over some period of time instead of simply "here it is"?
Sure, and people feel safer driving than riding in an airplane, because driving makes them feel more in control, even though it's actually far more dangerous per mile.
Probably a lot of people would feel more comfortable with a genie that took orders than an AI that was trying to do any of that extrapolating stuff. Until they died, I mean. They'd feel more comfortable up until that point.
Feedback just supplies a form of information. If you disentangle the I-want-to-drive bias and say exactly what you want to do with that information, it'll just come out to the AI observing humans and updating some beliefs based on their behavior, and then it'll turn out that most of that information is obtainable and predictable in advance. There's also a moral component where making a decision is different from predictably making that decision, but that's on an object level rather than a metaethical level, and just says "There are some things we wouldn't want the AI to do until we actually decide them, even if the decision is predictable in advance, because the decision itself is significant and not just the strategy and consequences following from it."
When you build automated systems capable of moving faster and stronger than humans can keep up with, I think you just have to bite the bullet and accept that you have to get it right. The idea of building such a system and then having it wait for human feedback, while emotionally tempting, just doesn't work.
If you build an automatic steering system for a car that travels 250 mph, you either trust it or you don't, but you certainly don't let humans anywhere near the steering wheel at that speed.
Which is to say that while I sympathize with you here, I'm not at all convinced that the distinction you're highlighting actually makes all that much difference, unless we impose the artificial constraint that the environment doesn't change more quickly than a typical human can assimilate it well enough to provide meaningful feedback.
I mean, without that constraint, a powerful enough environment-changer simply won't receive meaningful feedback, no matter how willing it might be to take it if offered, any more than the 250-mph artificial car driver can get meaningful feedback from its human passenger.
And while one could add such a constraint, I'm not sure I want to die of old age w...
Complete randomness that seemed appropriate for an open thread: I just noticed the blog post header on the OvercomingBias summary: "Ban Mirror Cells"
Which, it turned out when I read it, is about chirality, but which I had parsed as talking about mirror neurons, and the notion of wanting to ban mirror neurons struck me as delightfully absurd: "Darned mirror neurons! If I wanted to trigger the same cognitive events in response to doing something as in response to seeing it done by others, I'd join a commune! Darned kids, get off my lawn!"
Even with the discussion section, there are ideas or questions too short or inchoate to be worth a post.
Thank you! I'd been mourning the loss. There have been plenty of things I had wanted to ask or say that didn't warrant a post even here.
It occurs to me that the concept of a "dangerous idea" might be productively viewed in the context of memetic immunization: ideas are often (but not always) tagged as dangerous because they carry infectious memes, and the concept of dangerous information itself is often rejected because it's frequently hijacked to defend an already internalized infectious memeplex.
Some articles I've read here seem related to this idea in various ways, but I can't find anything in the Sequences or on a search that seems to tackle it directly. Worth writing up as a post?
This seems like a good audience to solve a tip-of-my-brain problem. I read something in the last year about subconscious mirroring of gestures during conversation. The discussion was about a researcher filming a family (mother, father, child) having a conversation, and analyzing a 3 second clip in slow motion for several months. The researcher noted an almost instantaneous mirroring of the speaker's micro-gestures in the listeners.
I think that I've tracked the original researcher down to Jay Haley, though unfortunately the articles are behind a pay wall: ...
Global nuclear fuel bank reaches critical mass.
http://www.nytimes.com/2010/12/04/science/04nuke.html?_r=1&ref=science
I'm intrigued by this notion that the government solicited Buffett for the funding promise which then became a substantial chunk of the total startup capital. Did they really need his money, or were they looking for something else?
I'm jealous of all these LW meetups happening in places that I don't live. Is there not a sizable contingent of LW-ers in the DC area?
I've been thinking for a while about the distinction between instrumental and terminal values, because the places it comes up in the Sequences (1) are places where I've bogged down in reading them. And I am concluding that it may be a misleading distinction.
EY presents a toy example here, and I certainly agree that failing to distinguish between (V1) "wanting chocolate" and (V2) "wanting to drive to the store" is a fallacy, and a common one, and an important one to dissolve. And the approach he takes to dissolving it is sound, as far as it goes: consider the utility attached to each outcome, consider the probability of each outcome given possible actions, then choose the actions that maximize expected utility.
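To make that procedure concrete, here is a minimal sketch of it in Python; the outcomes, utilities, and probabilities below are numbers I made up for the chocolate example, not anything from EY's post.

# Toy expected-utility maximization for the chocolate example.
# All numbers are invented purely for illustration.
utility = {"have chocolate": 1.0, "no chocolate": 0.0}

# P(outcome | action): driving to the store usually yields chocolate,
# staying home usually does not.
p_outcome_given_action = {
    "drive to store": {"have chocolate": 0.9, "no chocolate": 0.1},
    "stay home":      {"have chocolate": 0.1, "no chocolate": 0.9},
}

def expected_utility(action):
    return sum(p * utility[outcome]
               for outcome, p in p_outcome_given_action[action].items())

# Choose the action that maximizes expected utility.
best_action = max(p_outcome_given_action, key=expected_utility)
print(best_action)  # -> drive to store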
But in that example, V1 and V2 aren't just different values, they are hierarchically arranged values... V2 depends on V1, such that if their causal link is severed (e.g., driving to the store stops being a way to get chocolate) then it stops being sensible to consider V2 a goal at all. In other words, the utility of V2 is zero within this toy example, and we just take the action with the highest probability of achieving V1 (which may incidentally involve satisfying V2, but that's just a path, not a goal).
Of course, we know wanting chocolate isn't a real terminal value outside of that toy example; it depends on other things. But by showing V1 as the stable root of a toy network, we suggest that in principle there are real terminal values, and a concerted philosophical effort by smart enough minds will identify them. Which dovetails with the recurring(1) idea that FAI depends on this effort because uncovering humanity's terminal values is a necessary step along the way to implementing them, as per Fun Theory.
But just because values exist in a mutually referential network doesn't mean they exist in a hierarchy with certain values at the root. Maybe I have (V3) wanting to marry my boyfriend and (V4) wanting to make my boyfriend happy. Here, too, these are different values, and failing to distinguish between them is a problem, and there's a causal link that matters. But it's not strictly hierarchical: if the causal link is severed (e.g., marrying my boyfriend isn't a way to make him happy) I still have both goals. Worse, if the causal link is reversed (e.g., marrying my boyfriend makes him less happy, because he has V5: don't get married), I still have both goals. Now what?
Well, one answer is to treat V3 and V4 (and V5, if present) as instrumental goals of some shared (as yet undiscovered) terminal goal (V6). But failing that, all that's left is to work out a mutually acceptable utility distribution that is suboptimal along one or more of (V3-V5) and implement the associated actions. You can't always get what you want. (2)
Well and good; nobody has claimed otherwise.
But, again, the Metaethics and Fun Sequences seem to depend(1) on a shared as-yet-undiscovered terminal goal that screens off the contradictions in our instrumental goals. If instead it's instrumental links throughout the network, and what seem like terminal goals are merely those instrumental goals at the edge of whatever subset of the network we're representing at the moment, and nothing prevents even our post-Singularity descendants from having mutually inhibitory goals... well, then maybe humanity's values simply aren't coherent; maybe some of our post-Singularity descendants will be varelse to one another.
So, OK... suppose we discover that, and the various tribes of humanity consequently separate. After we're done (link)throwing up on the sand(/link)(see flawed utopia), what do we do then?
Perhaps we and our AIs need a pluralist metaethic(3), one that allows us to treat other beings who don't share our values -- including, perhaps, the (link)Babykillers and the SHFP(/link)(see SHFP story) and the (link)Pebblesorters(/link)(see pebblesorters), as well as the other tribes of post-Singularity humans -- as beings whose preferences have moral weight?
=============
(1) The whole (link)meta-ethics Sequence(/link)(see meta-ethics Sequence) is shot through with the idea that compromise on instrumental values is possible given shared terminal values, even if it doesn't seem that way at first, so humans can coexist and extracting a "coherent volition" of humanity is possible, but entities with different terminal values are varelse: there's just no point of compatibility.
The recurring message is that any notion of compromise on terminal values is just wrongheaded, which is why the (link)SHFP's solution to the Babykiller problem(/link)(see SHFP story) is presented as flawed, as is viewing the (link)Pebblesorters(/link)(see pebblesorters) as having a notion of right and wrong deserving of moral consideration. Implementing our instrumental values can leave us (link)tragically happy(/link)(see flawed utopia), on this view, because our terminal values are the ones that really matter.
More generally, LW's formulation of post-Singularity ethics (aka (link)Fun(/link)(see fun Sequence)) seems to depend on this distinction. The idea of a reflectively stable shared value system that can survive a radical alteration of our environment (e.g., the ability to create arbitrary numbers of systems with the same moral weight that I have, or even mere immortality) is pretty fundamental, not just for the specific Fun Theory proposed, but for any fixed notion of what humans would find valuable after such a transition. If I don't have a stable value system in the first place, or if my stable values are fundamentally incompatible with yours, then the whole enterprise is a non-starter... and clearly our instrumental values are neither stable nor shared. So the hope that our terminal values are stable and shared is important.
This distinction also may underlie the warning against (link)messing with emotions(/link)(see emotions)... the idea seems to be that messing with emotions, unlike messing with everything else, risks affecting my terminal values. (I may be pounding that screw with my hammer, though; I'm still not confident I understand why EY thinks messing with everything else is so much safer than messing with emotions.)
(2) I feel I should clarify here that my husband and I are happily married; this is entirely a hypothetical example. Also, my officemate recently brought me chocolate without my even having to leave my cube, let alone drive anywhere. Truly, I live a blessed life.
(3) Mind you, I don't have one handy. But the longest journey begins, not with a single step, but with the formation of the desire to get somewhere.
I came here from the pedophile discussion. This comment interests me more so I'm replying to it.
To preface, here is what I currently think: Preferences are in a hierarchy. You make a list of possible universes (branching out as a result of your actions) and choose the one you prefer the most - so I'm basically coming from VNM. The terminal value lies in which universe you choose. The instrumental stuff lies in which actions you take to get there.
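Here is a minimal sketch of that picture, with invented universes, plans, and a made-up preference ranking: the ranking over end states stands in for the terminal part, and whichever plan reaches the top-ranked reachable state is the instrumental part.

# Toy VNM-flavoured choice: rank the reachable universes (the terminal part),
# then back out whichever action sequence reaches the best one (the instrumental part).
# Universes, plans, and the ranking are all invented for illustration.
reachable_universes = {
    ("work", "save", "donate"): "universe A",
    ("work", "spend"):          "universe B",
    ("do nothing",):            "universe C",
}

# Higher number = more preferred end state.
preference_rank = {"universe A": 3, "universe B": 2, "universe C": 1}

best_plan = max(reachable_universes,
                key=lambda plan: preference_rank[reachable_universes[plan]])
print(best_plan)  # -> ('work', 'save', 'donate')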
So I'm reading your line of thought...
...But just because values exist in a mutually referential network doesn't
Even with the discussion section, there are ideas or questions too short or inchoate to be worth a post.
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.