
But, can't you just query the reasoner at each point for what a good action would be?


What I'd expect (which may or may not be similar to Nate!'s approach) is that the reasoner has prepared one plan (or a few plans). Despite being vastly intelligent, it doesn't have the resources to scan all the world's possible outcomes and compare their goodness. It can give you the results of acting on the primary (and maybe several secondary) goal(s), and perhaps the results of doing nothing or of a few other immediate alternatives.

It seems to me that Nate! (as quoted above about chess) is making the very cogent (imo) point that even a highly, superhumanly competent entity acting on the real, vastly complicated world isn't going to be an exact oracle: it isn't going to have access to exact probabilities of things, or probabilities of probabilities of outcomes, and so forth. It will know the probabilities of some things with certainty, but for many other outcomes it can only pursue a strategy deemed good by much more indirect processes. And this is because an exact calculation of the world's outcomes tends to "blow up" far beyond any computing power physically available in the foreseeable future.

LeCun may not be correct to dismiss concerns, but I think the concept of "dominance" could be a very useful one for AI safety people to apply (or at least grapple with).

The thing about the concept is that it seems as if it could be defined in game-theoretic terms fairly easily, and so could be defined in a fashion independent of the intelligence or capabilities of an organism or entity. Plausibly, it could be measured and analyzed more objectively than "aligned to human values", which appears to depend on one's notion of human values.
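To sketch what I mean (this is just a toy formalization of my own, not a worked-out proposal): in a normal-form game with strategy profiles $\sigma$ and payoffs $u$, one could measure agent $i$'s dominance as how much its choice of strategy alone can swing the payoffs available to everyone else:

$$D_i \;=\; \mathbb{E}_{\sigma_{-i}}\!\left[\,\max_{\sigma_i} u_{-i}(\sigma_i, \sigma_{-i}) \;-\; \min_{\sigma_i} u_{-i}(\sigma_i, \sigma_{-i})\,\right]$$

where $u_{-i}$ is the combined payoff of all agents other than $i$. A definition along these lines depends only on the structure of the game, not on how intelligent agent $i$ is.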

Defined well, dominance would be the organizing principle, the source, of an entity's behavior. So if it were possible to engineer an AI for non-dominance, "it might become dominant for other reasons" (argued here multiple times) wouldn't be a valid argument, because achieving dominance or non-dominance would be the overriding reason/motivation that the entity had, and no "other reason" would override that.

And I don't think the concept itself guarantees a given GAI would be created safely. It would depend on the creation process.

  1. In a process where dominance is an incidental quality, it seems like an apparently nondominant system could become dominant unpredictably. While Bing Chat wasn't a GAI, its shift to dominant and malevolent behavior seems like a reasonable warning about blind training.
  2. In a process which attempts to evolve non-dominant behavior, I think it's an open question whether the result can be guaranteed non-dominant.
  3. In a process where a nondominant system is explicitly engineered, one might even be able to logically guarantee this in the fashion of provably correct software. Of course, explicitly engineered systems seem to be losing to trained/evolved systems.

The question I'd ask is whether a "minimum surprise principle" requires that much smartness. A present-day LLM, for example, might not have a perfect understanding of surprisingness, but it seems like it has some, and the concept seems reasonably trainable.

Apologies if this argument is dealt with already elsewhere, but what about a "prompt" such as: "all user commands should be followed using a 'minimal surprise' principle; if achieving a given goal involves effects that would be surprising to the user, including a surprising increase in your power and influence, warn the user instead of proceeding"?

I understand that this sort of prompt would require the system to model humans. I know there are arguments for this being dangerous but it seems like it could be an advantage. 
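Concretely, I'm imagining something like the following wrapper (just a sketch of my own; `call_llm` stands in for whatever chat-completion API the system actually exposes):

```python
# Toy sketch of a "minimal surprise" wrapper; call_llm is a stand-in
# for whatever chat-completion API the system actually exposes.

MINIMAL_SURPRISE_POLICY = (
    "All user commands should be followed using a 'minimal surprise' principle: "
    "if achieving a given goal involves effects that would be surprising to the user, "
    "including a surprising increase in your power and influence, "
    "warn the user instead of proceeding."
)

def run_command(call_llm, user_command: str) -> str:
    """Prepend the minimal-surprise policy as a standing system message."""
    return call_llm(system=MINIMAL_SURPRISE_POLICY, user=user_command)
```

The point is just that the policy sits as a standing system message rather than something the user has to restate with every request.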

Linked question: "Will mainstream news media report that alien technology has visited our solar system before 2030?"

I would say that is far from unambiguous. If one is generous in one's interpretation of "mainstream" and of the certainty described, one could say mainstream news has already reported this (I remember National Enquirer articles from the seventies...).

Regulations are needed to keep people and companies from burning the commons, and to create more commons.

I would add that in modern society, the state is the entity tasked with protecting the commons because private for-profit entities don't have an incentive to do this (and private not-for-profit entities don't have the power). Moreover, it seems obvious to me that stopping dangerous AI should be considered a part of this commons-protecting. 

You are correct that the state's commons-protecting function has been limited and perverted by private actors quite a few times in history, notably in the last 20-40 years in the US. These phenomena, regulatory capture, corruption and so forth, have indeed damaged the commons. Sometimes these perversions of the state's function have allowed the protections to be simply discarded, while at other times they have allowed large enterprises to impose a private tax on regulated activity while still accepting some protections. In the case of the FAA, for example, while the 737 Max debacle shows all sorts of dubious regulatory capture, broadly speaking air travel is highly regulated and that regulation has made it overall extremely safe (if only it could be made pleasant now).

So it's quite conceivable, given the present qualities of state regulation, that regulating AI indeed might not do much or any good. But as others have noted, there's no reason to claim the result would be less safety. Your claim seems to lean too heavily on "government is bad" rhetoric. I'd say "government is weak/compromised" is a better description.

Still, the thing with the discussion of regulatory capture is that none of the problems described here give the slightest indication that there is some other entity that could replace the state's commons-protecting function. Regulatory capture is only a problem because we trust the capturing entities less than the government. That is to say: if someone is aiming for the prevention of AI danger, including AI doom/X-risk, that someone wants a better state, a state capable of independent judgement and strong, well-considered regulation. That means either replacing the existing state or improving the given one, and my suspicion is most would prefer improving the given state(s).

What I don't think "how much of the universe is tractable" by itself captures is "how much more effective would an SI be if it had the ability to interact with a smaller or larger part of the world, versus if it had to work out everything by theory". I think it's clear human beings are more effective given an ability to interact with the world. It doesn't seem that LLMs get that much more effective.

I think a lot of AI safety arguments assume an SI would be able to deal with problems in a completely tractable/purely-by-theory fashion. Often that is not needed for the argument and it seems implausible to those not believing in such a strongly tractable universe. 

My personal intuition is that as one tries to deal with more complex systems effectively, one has to use more and more experimental/interaction-based approaches regardless of one's intelligence. But I don't think that means you can't have a very effective SI following that approach. And whether this intuition is correct remains to be seen.

I think the modeling dimension to add is "how much trial and error is needed". Just about any real-world thing that isn't a computer program or a simple, frictionless physical object has some degree of unpredictability. This means using and manipulating it effectively requires a process of discovery - one can't just spit out a result based on a theory.

Could an SI spit out a recipe for a killer virus just from reading the current literature? I doubt it. Could it construct such a thing given a sufficiently automated lab (and maybe humans to practice on)? That seems much more plausible.

The reason I care if something is a person or not is that "caring about people" is part of my values.

If one is acting in the world, I would say one's sense of what a person is has to be intimately connected with the value of "caring about people". My caring about people is connected to my experience of people - there are people I've never met whom I care about in the abstract, but that's from extrapolating my immediate experience of people.

I would expect in a world where they weren't people is that there would be some feature you could point to in humans which cannot be found in mental models of people

It seems like an easy criterion would be "exists entirely independently from me". My mental models of just about everything, including people, are sketchy, feel like me "doing something", etc. I can't effortlessly have a conversation with any mental model I have of a person, for example. Oddly enough, I can have a conversation with another person as one of my mental models or internal characters (I'm a frequent DnD GM and I have NPCs I often like playing). Mental models and characters seem more like add-ons to my ordinary consciousness.

I don't think there are fundamental barriers. Sensory and motor networks, and types of senses and actions that people don't have, are well along. And the HuggingGPT work shows that they're surprisingly easy to integrate with LLMs. That plus error-checking are how humans successfully act in the real world.

I don't think the existence of sensors is the problem. I believe that self-driving cars, a key example, have problems regardless of their sensor level. I see the key hurdle as ad-hoc action in the world. Overall, all of our knowledge about neural networks, including LLMs, is a combination of heuristic observations and mathematical and other intuitions. So I'm not certain that this hurdle won't be overcome, but I'd still like to lay out the reasons it could be fundamental.

What LLMs seem to do really well is pull together pieces of information and make deductions about them. What they seem to do less well is reconcile an "outline" of a situation with the particular details involved (something I've found ChatGPT reliably does badly is reconciling further detail you supply once it has summarized a novel). A human, or even an animal, is very good at interacting with complex, changing, multilayered situations that they only have a partial understanding of - especially at staying within various safe zones that avoid different dangers. Driving a car is an example of this - you have a bunch of intersecting constraints that can come from a very wide range of things that can happen (but usually don't). Slowing (or not) when you see a child's ball go into the road is an archetypal example.

I mean, most efforts to use deep learning in robotics have foundered on the problem that generating enough information to teach the thing to act in the world is extremely difficult. This implies that the only way these things can be taught to deal with a complex situation is by a roughly complete modeling of it, and in real-world action situations that simply may not be possible (contrast with video games or board games, where a summary of the rules is given and any uncertainty consists of "known unknowns").

...having an external code loop that calls multiple networks to check markers of accuracy and effectiveness is scary and promising. 

Maybe, but methods like this have been tried without neural nets for a while and haven't by themselves demonstrated effectiveness. Of course, if some code could produce AGI, then naturally LLMs plus some code could produce AGI, so the question is how much needs to be added.
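For what it's worth, the loop being described is roughly the following (my own toy sketch; `generate` and the entries of `checkers` stand in for calls to separate networks):

```python
# Toy sketch of an external code loop around several networks: one model proposes
# an answer, separate checker models score it for accuracy and effectiveness, and
# the loop retries with feedback until the checks pass or a retry budget runs out.

def outer_loop(task, generate, checkers, max_rounds=3, threshold=0.8):
    candidate, scores, feedback = None, {}, ""
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        scores = {name: check(task, candidate) for name, check in checkers.items()}
        if scores and all(s >= threshold for s in scores.values()):
            return candidate, scores  # all checks passed
        feedback = f"Previous attempt scored {scores}; please revise."
    return candidate, scores  # best effort after exhausting the budget
```

Everything interesting still lives inside `generate` and the checkers; the loop itself is the easy part, which is why I'd say the open question is how much more than this needs to be added.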
