I'd just like to add that even if you think this piece is completely mistaken, I think it shows that we don't know enough about how values and motives work in us, much less in AI, to confidently predict that AIs will be usefully described by a single global utility function, or will work to subvert their reward system, or the like.

Maybe that will turn out to be true, but before we spend so many resources on trying to solve AI alignment, let's try to make the argument for the great danger much more rigorous first. That's usually the best way to start anyway.

This is one of the most important posts ever on LW though I don't think the implications have been fully drawn out. Specifically, this post raises serious doubts about the arguments for AI x-risk as a result of alignment mismatch and the models used to talk about that risk. It undercuts both Bostrom's argument that an AI will have a meaningful (self-aware?) utility function and Yudkowsky's reward button parables.

The role these two arguments play in convincing people that AI x-risk is a hard problem is to answer the following question: if you don't anthropomorphize, why should a program that's, say, excellent at conducting/scheduling interviews to ferret out moles in the intelligence community try to manipulate external events at all, rather than just think about them to better catch moles? I mean, it's often the case that people fail to pursue their fervent goals outside familiar contexts. Why will AI be different? Both arguments conclude that AI will inevitably act like it's very effectively maximizing some simple utility function in all contexts and in all ways.

Bostrom tries to convince us that as creatures get more capable they tend to act more coherently (more like they are governed by a global utility function). This is of course true for evolved creatures. But the post offers a theory of how value-type things can arise, and this theory predicts that if you only train your AI in a relatively confined class of circumstances (even if that requires making very accurate predictions about the rest of the world), it isn't going to develop that kind of simple global value. Rather, if forced to make value choices in very different circumstances, it would likely find multiple shards in tension without clear direction. Similarly, it explains why the AI won't just wirehead itself by pressing its reward button.

I absolutely think that the future of online marketing involves more asking people for their preferences. I know I go into my settings on Google to actively curate what they show me.

Indeed, I think Google is leaving a fucking pile of cash on the table by not adding an "I dislike" button and a little survey on their ads.

I feel there is something else going on here too.

Your claimed outside view asks us to compare a clean codebase with an unclean one and I absolutely agree that it's a good case for using currentDate when initially writing code.

But you motivated this by considering refactoring, and I think things go off the rails there. If the only issue in your codebase was that you called currentDate yyymmdd consistently, or even had other consistent weird names, it wouldn't be a mess; it would just have slightly weird conventions. Any coder working on it for a non-trivial length of time would start just reading yyymmdd as current date in their head.

The codebase is only messy when you inconsistently use a bunch of different names for a concept, none of which are very descriptive. But now refactoring faces exactly the same problem that working with the code does: the confusion coders experience on seeing the variable and wondering what it does becomes ambiguity, which forces a time-intensive refactor.

Practically, the right move is probably better standards going forward, plus encouraging coders to fix variable names in any piece of code they touch. But I don't think it's really a good example of divergent intuitions once you are talking about the same things.
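A rough sketch of the difference, purely for illustration: the `rename` helper and the toy source strings below are hypothetical and not taken from any real codebase. With one odd but consistent name, the cleanup is a single mechanical substitution. With several vague names, no substitution can decide which of them actually mean "current date".

```python
import re

def rename(source: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of one identifier with another."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)

# Hypothetical code using one odd but consistent name.
consistent = "yyymmdd = today()\nreport(yyymmdd)\narchive(yyymmdd)\n"

# The whole cleanup is one mechanical pass.
cleaned = rename(consistent, "yyymmdd", "currentDate")
print(cleaned)

# Hypothetical code using several vague names. Is date2 the current date
# or something parsed from input? A human has to read the code to decide.
inconsistent = (
    "d = today()\n"
    "dt = d\n"
    "report(dt)\n"
    "date2 = parse(user_input)\n"
    "archive(date2)\n"
)

# The same mechanical pass finds nothing to fix here.
unchanged = rename(inconsistent, "yyymmdd", "currentDate")
```

The second case is where the time goes: someone has to trace each of `d`, `dt`, and `date2` before any of them can be renamed safely.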

I don't think this is a big problem. The people who use ad blockers are both a small fraction of internet users and the most sophisticated ones, so I doubt they are a major issue for website profit. I mean, sure, Facebook is eventually going to try to squeeze out the last few percent of users if they can do so with an easy countermeasure, but if this were really a big concern, websites would be pushing to get that info back from the company they use to host ads. Admittedly, when I was working on ads for Google (I'm not cut out to be out of academia, so I went back to it) I never really got into this part of the system, so I can't comment on how it would work out. But I think if this mattered enough, companies serving ads would figure out how to report back to the page about ad blockers.

I'm sure some sites resent ad blockers and take some easy countermeasures but at an economic level I'm skeptical this really matters.

What this means for how you should feel about using ad blockers is more tricky but since I kinda like well targeted ads I don't have much advice on this point.

Interesting, but I think it's the other end of the equation where the real problem lies: voting. Consider the following facts:

1) A surprisingly large fraction of the US population has tried hard drugs of one kind or another.

2) Even those who haven't almost surely know people who have, and seem to find it interesting/fascinating/etc., not horrifying behavior that deserves prison time.

So how is it that, when people would never dream of sending their friend who tried coke to prison, or even the friend who sold that friend some of his stash, we end up with draconian drug laws?

I don't have an easy answer. I'm sure the Overton window and a desire to signal that they themselves are not pro-drug or drug users are part of the answer. It's like lowering the age of consent for sex: as long as the loudest voices arguing it should be legal for 40 year olds to sleep with 16 year olds are creeps, few people will make that argument no matter how good it is.

But this doesn't really seem like enough to explain the phenomenon.

So your intent here is to diagnose the conceptual confusion that many people have with respect to infinity, yes? And your thesis is that people are confused about infinity because they think it has a unique referent, while in fact positive and negative infinity are different?

I think you are on to something, but it's a little more complicated, and that's what gets people confused. The problem is that there are in fact a number of different concepts we use the term infinity to describe, which is why it is so confusing (and I bet there are more).

1. Virtual Points that are above or below all other values in an ordered ring (or their positive component) which we use as shorthand to write limits and reason about how they behave.

2. The background idea of the infinite as meaning something that is beyond all finite values (hence why a point at infinity is infinite).

3. The cardinality of sets which are bijectable with a proper subset of themselves, i.e., infinite. Even here there is an ambiguity between the sets with a given cardinality and the cardinal itself.

4. The notion of absolute mathematical infinity. If this concept makes sense it does have a single reference which is taken to be 'larger' (usually in the sense of cardinality) than any possible cardinal, i.e. the height of the true hierarchy of sets.

5. The metaphorical or theological notion of infinity as a way of describing something beyond human comprehension and/or without limits.

The fact that some of these notions do uniquely refer while others don't is a part of the problem.
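To make the contrast concrete, here are standard textbook instances of senses 1 and 3 (chosen for illustration, not taken from the post under discussion). In the first, the two infinities are distinct shorthand for limit behavior. In the second, "infinite" is a property of a set, witnessed by a bijection onto a proper subset of itself.

```latex
% Sense 1: +\infty and -\infty as distinct shorthand for how limits behave
\[
  \lim_{x \to 0^{+}} \frac{1}{x} = +\infty,
  \qquad
  \lim_{x \to 0^{-}} \frac{1}{x} = -\infty
\]

% Sense 3: \mathbb{N} is in bijection with a proper subset of itself
\[
  f : \mathbb{N} \to \mathbb{N} \setminus \{0\},
  \qquad
  f(n) = n + 1
\]
```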

Stimulants are an excellent short-term solution. If you absolutely need to get work done tonight and can't sleep, amphetamine (e.g., Adderall) is a great solution. Indeed, there are a number of studies/experiments (including those the Air Force relies on to give pilots amphetamines) backing up the fact that it improves the ability to get tasks done while sleep deprived.

Of course, if you are having long term sleep problems it will likely increase those problems.

There is a lot of philosophical work on this issue, some of which recommends taking conditional probability as the fundamental unit (in which case Bayes' theorem only applies for non-extremal values). For instance, see this paper

Computability is just \Delta^0_1 definability. There are plenty of other notions of definability you could try to cash out this paradox in terms of. Why pick \Delta^0_1 definability?

If the argument worked for any particular notion of definability (e.g. arithmetic definability) it would be a problem. Thus, the solution needs to explain why, with respect to any concrete notion of definable set, the argument doesn't go through.
