dankane

Comments

A taxonomy of Oracle AIs

I feel like your discussion of predictors makes a few not-necessarily-warranted assumptions about how the predictor deals with self-reference. Then again, I guess anything that doesn't handle self-reference fails as a predictor in a wide range of useful cases. It predicts that a massive fire will kill 100 people, and so naturally people act on this prediction, which invalidates the original prediction.

But there is a simple-ish fix. What if you simply ask it to predict what would happen if it (and, say, all similar predictors) suddenly stopped functioning immediately before this prediction was returned?

Markets are Anti-Inductive

Unless you can explain to me how prediction markets are going to break the pattern that two different shares of the same stock have correlated prices.

I'm actually not sure how prediction markets are supposed to have an effect on this issue. My issue is not that people have too much difficulty recognizing patterns. My issue is that some patterns once recognized do not provide incentives to make that pattern disappear. Unless you can tell me how prediction markets might fix this problem, your response seems like a bit of a non-sequitur.

Markets are Anti-Inductive

This seems like too general a principle. I agree that in many circumstances, public knowledge of a pattern in pricing will lead to effects that cause that pattern to disappear. However, it is not clear to me that this is always the case, or that the size of the effect will be sufficient to completely cancel out the original observation.

For example, I observe that two different units of Google stock have prices that are highly correlated with each other. I doubt that this observation will cause separate markets to spring up giving wildly divergent prices to different shares of the same stock. I also note that stock prices are always non-negative. I also doubt that this will cease to be the case any time soon.

Although these are somewhat tautological, one can imagine non-tautological observations that will not disappear. If stocks A and B are known to be highly correlated, this may well lead to a larger gap, as hedge funds that predict a small difference in expected returns will buy one and short the other. However, if they are correlated for structural reasons, part of this might be that it is hard to detect effects that will cause their prices to diverge significantly, so observing the correlation will likely not be enough to actually remove all of it.

One can also imagine general observations about the market itself, like the approximate frequency of crashes or the log-normality of price changes, that might not disappear simply because they are known. In order for an effect to disappear, there needs to be a way to make a profit off of it.

The Strangest Thing An AI Could Tell You

We probably couldn't even talk ourselves out of this box.

I don't know... That sounds a lot like what an AI trying to talk itself out of a box would say.

The Mystery of the Haunted Rationalist

Hmm... I would probably explain the threshold for staying in the house not as an implicit expected probability computation, but as an evaluation of the price of the discomfort associated with staying in a location that you find spooky. At least for me, I think that the part of my mind that knows that ghosts do not exist would have no trouble controlling whether or not I remain in the house. However, it might well decide that it is not worth the $10 that I would receive to spend the entire night in a place where some other piece of my mind is constantly yelling at me to run away screaming.

An overall schema for the friendly AI problems: self-referential convergence criteria

It's just that such self-referential criteria as reflective equilibrium are a necessary condition

Why? The only example of adequately friendly intelligent systems that we have (i.e. us) doesn't meet this condition. Why should reflective equilibrium be a necessary condition for FAI?

Taking Effective Altruism Seriously

That may be true (at least to the degree to which it is sensible to assign a specific cause to a given util). However, it is not very good evidence that investment in first world economies is the most effective way to generate utils in Africa.

Taking Effective Altruism Seriously

OK. So suppose that I grant your claim that donations to sub-Saharan Africa will not substantially affect the size of the future economic pie, but that other investments will. I claim that there may still be reason to donate there.

I grant that such a donation will produce fewer dollars of value than investing in capital infrastructure. On the other hand, dollars are not the objective, utils are. We can reasonably assume that the marginal utility of an extra dollar for a given person decreases as that person's wealth increases. We can reasonably expect that world GDP per capita will be much higher in 100 years, and we know that GDP per capita is much higher in the US than in sub-Saharan Africa. Thus, even if an investment in first-world infrastructure produces more total dollars of value, these dollars are going to much wealthier people than dollars donated to people today in sub-Saharan Africa, and thus might well produce fewer total utils.
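To make the dollars-versus-utils point concrete, here is a minimal sketch. It assumes logarithmic utility of wealth (one common way to model diminishing marginal utility, not something I am claiming is correct), and every dollar figure in it is hypothetical and chosen only for illustration.

```python
import math

def util_gain(wealth, extra_dollars):
    """Gain in utils from extra dollars, ASSUMING log utility of wealth."""
    return math.log(wealth + extra_dollars) - math.log(wealth)

# Hypothetical numbers, not data: the same $1,000 is either donated now
# or invested so that it later yields $5,000 for a much richer person.
donation_dollars = 1_000        # value delivered to the recipient today
recipient_wealth = 500          # assumed annual income of the recipient

investment_dollars = 5_000      # assumed value produced by the investment
beneficiary_wealth = 100_000    # assumed income of the future beneficiary

donation_utils = util_gain(recipient_wealth, donation_dollars)
investment_utils = util_gain(beneficiary_wealth, investment_dollars)

print(f"donation:   ${donation_dollars} -> {donation_utils:.3f} utils")
print(f"investment: ${investment_dollars} -> {investment_utils:.3f} utils")
```

Under these assumptions the investment produces five times as many dollars but far fewer utils. Different utility functions or different numbers could of course reverse the comparison, which is why the claim is only that the donation "might well" win.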

The Truly Iterated Prisoner's Dilemma

[I realize that I missed the train and probably very few people will read this, but here goes]

So in the non-iterated prisoner's dilemma, defect is a dominant strategy. No matter what the opponent is doing, defecting will always give you the best possible outcome. In the iterated prisoner's dilemma, there is no longer a dominant strategy. If my opponent is playing Tit-for-Tat, I get the best outcome by cooperating in all rounds but the last. If my opponent ignores what I do, I get the best outcome by always defecting. It is true that always defecting is the unique Nash equilibrium strategy, but this is a much weaker reason for playing it, especially given that evidence shows that when playing among people who are trying to win, Tit-for-Tat tends to achieve much better outcomes.
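The two best-response claims above can be checked by brute force. The following is a rough sketch, assuming the conventional payoffs T=5, R=3, P=1, S=0 and a 5-round game (both are my choices for illustration). Since both opponents here are deterministic, it is enough to enumerate every fixed sequence of my moves.

```python
from itertools import product

# Assumed conventional payoffs to me, keyed by (my move, opponent's move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(my_moves):
    """Opponent cooperates first, then copies my previous move."""
    return "C" if not my_moves else my_moves[-1]

def always_cooperate(my_moves):
    """An opponent that ignores what I do (here: always cooperates)."""
    return "C"

def score(my_plan, opponent):
    """My total payoff for a fixed sequence of moves against an opponent."""
    total = 0
    for i, move in enumerate(my_plan):
        reply = opponent(my_plan[:i])  # opponent sees only my earlier moves
        total += PAYOFF[(move, reply)]
    return total

def best_plans(opponent, rounds):
    """Enumerate all 2**rounds move sequences and return the top scorers."""
    plans = list(product("CD", repeat=rounds))
    top = max(score(p, opponent) for p in plans)
    return [p for p in plans if score(p, opponent) == top]

ROUNDS = 5
best_vs_tft = best_plans(tit_for_tat, ROUNDS)
best_vs_blind = best_plans(always_cooperate, ROUNDS)

print("best vs Tit-for-Tat:     ", ["".join(p) for p in best_vs_tft])
print("best vs ignoring player: ", ["".join(p) for p in best_vs_blind])
```

With these payoffs, the best plan against Tit-for-Tat is to cooperate until the final round and then defect, while the best plan against the opponent that ignores me is to defect throughout. Neither plan is best against both opponents, which is all that "no dominant strategy" means here.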

There seems to be a lot of discussion in the comments about this or that being the rational thing to do, and I think that this is a big problem that gets in the way of clear thinking about the issue. The problem is that people are using the word "rational" here without having a clear idea as to what exactly that means. Sure, it's the thing that wins, but wins when? Provably, there is no single strategy that achieves the best possible outcome against all possible implementations of Clippy. So what do you mean? Are you trying to optimize your expected utility under a Kolmogorov prior? If so, how come nobody seems to be trying to do computations of the posterior distribution? Or discussing exactly what side data we know about the issue that might inform this probability computation? Or even wondering which universal Turing machine we are using to define our prior? Unless you want to give a more concrete definition of what you mean by "rational" in this context, perhaps you should stop arguing for a moment about what the rational thing to do is.

An Introduction to Löb's Theorem in MIRI Research

I think that the way that humans predict other humans is the wrong way to look at this, and that we should instead consider how humans would reason about the behavior of an AI that they build. I'm not proposing simply "don't use formal systems", or even "don't limit yourself exclusively to a single formal system". I am actually alluding to a far more specific procedure:

  • Come up with a small set of basic assumptions (axioms)
  • Convince yourself that these assumptions accurately describe the system at hand
  • Try to prove that the axioms would imply the desired behavior
  • If you cannot do this, return to the first step and see if additional assumptions are necessary

Now it turns out that for almost any mathematical problem that we are actually interested in, ZFC is going to be a sufficient set of assumptions, so the first few steps here are somewhat invisible, but they are still there. Somebody needed to come up with these axioms for the first time, and each individual who wants to use them should convince themselves that they are reasonable before relying on them.

A good AI should already do this to some degree. It needs to come up with models of a system that it is interacting with before determining its course of action. It is obvious that it might need to update the assumptions it's using to model physical laws, so why shouldn't it just do the same thing for logical ones?
