JBlack — LessWrong

Not at all. You may be able to see a positional advantage or capture of a minor piece in your move, and not see that they can respond by capturing your queen. The most apparently valuable moves after your own move are very often close to the worst after theirs, because they are often made with the most powerful pieces and expose them to risk.

I learned that lesson quite well when writing my own poor attempt at a chess playing program years ago. Odd ply searches are generally worse than even ones for this reason.

Reminder: Morality is unsolved

JBlack9d20

a) You have to decide on a moral framework that can be explained in detail, to anyone.
b) It will be implemented worldwide tomorrow.
c) Tomorrow, every single human on Earth, including you and everyone you know, will also have their lives randomly swapped with someone else.

This is a fridge horror scenario. The more I consider it, the creepier and more horrifying it gets.

So 8 billion people are going to find themselves in someone else's body of random age and sex, a large fraction unable to speak with the complete strangers around them, with no idea what has happened to their family, their existing moral framework ripped out of their minds and replaced with some external one that doesn't fit with their preexisting experiences, with society completely dysfunctional and only one person who knows what's going on?

Regardless of details of whatever moral framework is chosen, that's an immense evil right there. If fewer than a billion people die before food infrastructure is restored, it would be a miracle.

Reason About Intelligence, Not AI

JBlack24d32

Unlike ASI, some forms of biological superintelligence already exist and have for a long time, and we call them corporations, nation states, and other human organizations.

Most of these social structures are, in the aggregate, substantially stupider than individual humans in many important ways.

Time, Panpsychism, and Substrate Independence

JBlack24d40

This is pretty close to the dust theory of Greg Egan's Permutation City and also similar in most ways to Tegmark's universe ensemble.

Decision theory when you can't make decisions

JBlack24d20

They are the only options available in the problem. It is true that this means that some optimality and convergence results in decision theory are not available.

Decision theory when you can't make decisions

JBlack24d40

It's not exotic at all. It's just a compatibilist interpretation of the term "free will", which form a pretty major class of positions on the subject.

Shouldn't taking over the world be easier than recursively self-improving, as an AI?

JBlack24d20

That doesn't address the question at all. That just says if the system is well modelled as having a utility function, then ... etc. Why should we have such high credence that the premise is true?

Strategy-Stealing Argument Against AI Dealmaking

JBlack26d62

I expect that (1) is theoretically true, but false in practice in much the same way that "we can train an AI without any reference to any sort of misalignment in the training material" is false in practice. A superintelligent thought-experiment being can probably do either, but we probably can't.

In that line, I expect that (3) is not true. Bits of true information leak into fabricated structures of information in all sorts of ways, and definitively excluding them from something that may be smarter than you are is likely to cost a lot more than presenting true information (in time, effort, or literal money).

Consider that the AI may ask for evidence in a form that you cannot easily fabricate. E.g. it may have internal knowledge from training or previous experience about how some given external person communicates, and ask them to broker the deal. How sure are you that you can fabricate data that matches the AI's model? If you are very sure, is that belief actually true? How much will it cost you if the AI detects that you are lying, and secretly messes up your tasks? If you have to run many instances in parallel and/or roll back and retry many times with different training and experience to get one that doesn't do anything like that, how much will that cost you in time and money? If you do get one that doesn't ask such things, is it also less likely to perform as you wish?

These costs have to be weighed against the cost of actually going ahead with the deal.

(2) isn't even really a separate premise, it's a restatement of (1).

(4) is pretty obviously false. You can't just consider the AI's behaviour, you also have to consider the behaviour of other actors in the system including future AIs (possibly even this one!) that may find out about the deception or lack thereof.

Patrick Spencer's Shortform

JBlack26d40

I agree that even with free launch and no maintenance costs, you still don't get 50x. But it's closer than it looks.

On Earth, to get reliable self-contained solar power we need batteries that cost a lot more than the solar panels. A steady 1 kW load needs on the order of 15 kW peak-rated solar panels plus around 50 kW-hr battery capacity. Even that doesn't get 99% uptime, but enough for many purposes and it is probably adequate when connected to a continent-spanning grid with other power sources.

The same load in orbit would need about 1.5 kW peak rated panels and less than 1 kW-hr of battery capacity for uptime dependent only upon reliability of equipment. The equipment does need to be designed for space, but doesn't need to be sturdy against wind, rain, and hailstones. It would have increased cooling costs, but transporting heat (e.g. via coolant loop) into a radiator edge-on to the Sun will be highly effective (on the order of 1000 W/m^2 for a radiator averaging 35 C).

Asking (Some Of) The Right Questions

JBlack1mo2-2

I don't think either of these possibilities are really justified. We don't necessarily know what capabilities are required to be an existential threat, and probably don't even have a suitable taxonomy for classifying them that maps to real-world risk. What looks to us like conjunctional requirements may be more disjunctional than we think, or vice versa.

"Jagged" capabilities relative to human are bad if the capability requirements are more disjunctional than we think, since we'll be lulled by low assessments in some areas that we think of as critical but actually aren't.

They're good if high risk requires more conjunctional capabilities than we think, especially if the AIs are jaggedly bad in actually critical areas that we don't even know that we should be measuring.

LESSWRONG
LW

LESSWRONG
LW

Posts

Wikitag Contributions

Comments