Some of the scenarios I was thinking about included people who started dating due to alcohol, and then later went on to have an intended pregnancy.
But also, it does depend on whether we are looking at the local "do these people want to be pregnant" or the societal "what would a drop in birthrate do to society".
One thing that sort of falls into the powerful non-fooming corrigible AI bucket is good approximations to problems in higher complexity classes.
There is a sense in which, if you had an incredibly fast 3-SAT algorithm, you could use it with a formal proof checker to prove arbitrary mathematical statements. You could use your fast 3-SAT solver plus a fluid dynamics simulator to design efficient aerofoils. There are a lot of interesting search, optimization and simulation things you could do trivially if you had infinite compute.
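To gesture at how the 3-SAT trick cashes out: a standard search-to-decision reduction turns a yes/no SAT oracle into a witness-finder, so you could pull out an actual satisfying assignment (say, a proof certificate your checker accepts, once the check is encoded as clauses). A minimal Python sketch, using a brute-force stand-in for the hypothetical incredibly fast solver:

```python
# Sketch only: the brute-force solver below stands in for the imaginary
# incredibly fast 3-SAT algorithm; the reduction around it is the real point.
from itertools import product

def sat_decide(clauses, n_vars, fixed=None):
    """True if some assignment extending `fixed` satisfies every clause.
    Clauses are lists of nonzero ints: +i means variable i, -i its negation."""
    fixed = fixed or {}
    free = [v for v in range(1, n_vars + 1) if v not in fixed]
    for bits in product([False, True], repeat=len(free)):
        assign = {**fixed, **dict(zip(free, bits))}
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def sat_search(clauses, n_vars):
    """Turn the yes/no oracle into a witness-finder by fixing variables one at a time."""
    if not sat_decide(clauses, n_vars):
        return None
    fixed = {}
    for v in range(1, n_vars + 1):
        fixed[v] = True
        if not sat_decide(clauses, n_vars, fixed):
            fixed[v] = False  # True didn't extend to a solution, so False must
    return fixed

# Toy instance: (x1 or x2 or not x3) and (not x1 or x2 or x3) and (x1 or not x2 or x3)
print(sat_search([[1, 2, -3], [-1, 2, 3], [1, -2, 3]], 3))
```

The same outer loop would work for the aerofoil case if you can encode "the simulator scores this design above some threshold" as clauses; the oracle is doing all of the exponential work.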
There is a sense in which an empty Python terminal is already a corrigible AI. It does whatever you tell it to. You just have to tell it in Python. This feels like it's missing something. But when you try to say what is missing, the line between a neat programming language feature and a corrigible AI seems somehow blurry.
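Spelled out as a (deliberately silly) sketch, the "corrigible in the trivial sense" system is just:

```python
# Toy illustration, not a proposal: it has no goals of its own and does
# exactly what it is told, the catch being that you have to say it in Python.
while True:
    command = input(">>> ")
    exec(command)
```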
Given that alignment is theoretically solvable (probably) and not currently solved, almost any argument about alignment failure is going to have an
"and the programmers didn't have a giant breakthrough at the last minute" assumption.
If the simulation approach is to be effective, it probably has to have pretty high fidelity, in which case sim behaviours are likely to be pretty representative of real-world behaviour.
Yes. I expect that, before a smart AI takes competent harmful actions (as opposed to flailing randomly, which can also do some damage), there will exist, somewhere within the AI, a pretty detailed simulation of what is going to happen.
Reasons humans might not read the simulation and shut the AI down:
Let's consider an optimistic case. You have found a magic computer and have programmed in the laws of quantum field theory. You have added various features so you can put a virtual camera and microphone at any point in the simulation. Let's say you have a full VR setup. There is still a lot of room for all sorts of subtle indirect bad effects to slip under the radar, because the world is a big place and you can't watch all of it.
Also, you risk any prediction of a future infohazard becoming a current day infohazard.
At the other extreme, it's a total black box. Some utterly inscrutable computation, perhaps learned from training data. In the worst case, the whole AI, from data in to action out, is one big homomorphically encrypted black box.
"AI Psychosis". Is ad hominem a thing here in less wrong?
It felt similar. It was intended more as a hypothesis than an insult, but sure, I can see how you saw it that way.
Yes, actions that benefit the ecosystem in fact benefit the species and are in fact rewarded. Digging holes to hide food: reward for individual plus reward to ecosystem.
You seem to be mixing up several different claims here.
Claim 1) Evolution inherently favors actions that benefit the ecosystem, whether or not those actions benefit the individual. (false)
Claim 2) It so happens that every action that benefits the ecosystem also benefits the individual.
Claim 3) There exists an action that benefits the ecosystem, and also the individual.
I don't feel that "benefits the ecosystem" is well defined. The ecosystem is not an agent with a well-defined goal that can receive benefits. What does it mean to "benefit the planet Mars"? Ecosystems contain a variety of animals that are often at odds with each other.
What quantity exactly is going up when an ecosystem "benefits"? Total biomass? Genetic diversity? Individual animals' hedonic wellbeing? Similarity to how the ecosystem would look without human interference?
If the ecosystem survives, you survive.
That is wildly not true. Plenty of animals die all the time while their ecosystem survives.
If the ecosystem dies, you probably die with it.
Sure, you might escape. But probably, yes.
The problem is, the overall situation is like a many-player prisoner's dilemma. Often the actions of any individual animal make barely any difference to the overall environment, but it all adds up.
Cooperation can evolve in social settings, with small groups of animals that all know each other and punish the defectors.
A lot of the time this isn't the case, and nature is stuck on a many-way defect.
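A toy payoff calculation (made-up numbers, standard public-goods-style setup) shows the shape of the problem: defecting beats cooperating whatever everyone else does, unless defection is cheap to spot and punish, which is roughly the small-group-that-knows-each-other case.

```python
# Toy numbers, not from the comment: an N-player public-goods flavoured
# prisoner's dilemma. Each cooperator pays a cost c and adds benefit b,
# which is shared evenly across the whole group.
def payoff(my_move, n_others_cooperating, n_players, b=10.0, c=3.0,
           punishment=0.0):
    """Payoff to one individual. `punishment` is the fine defectors pay
    in small groups where everyone knows and sanctions each other."""
    cooperators = n_others_cooperating + (1 if my_move == "C" else 0)
    share = b * cooperators / n_players          # everyone gets the shared benefit
    cost = c if my_move == "C" else 0.0          # only cooperators pay the cost
    fine = punishment if my_move == "D" else 0.0 # only matters if defection is policed
    return share - cost - fine

n = 100
for k in (0, 50, 99):  # number of *other* players cooperating
    print(k, "others cooperate:",
          "C ->", payoff("C", k, n),
          " D ->", payoff("D", k, n))
# Defecting beats cooperating no matter what the others do (b/n < c here),
# even though everyone-cooperates beats everyone-defects overall.

print("small policed group:",
      "C ->", payoff("C", 4, 5, punishment=5.0),
      " D ->", payoff("D", 4, 5, punishment=5.0))
# With a big enough fine in a small group, cooperating becomes the better move.
```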
It also serves a purpose for mating.
It does seem to be mostly mating.
The things animals do in general are not necessarily necessary for their survival, but they are usually necessary for the good health of the ecosystem they are part of.
That seems strange. That's not how evolution usually works.
Movement is the first action of the universe. Memesis is the second. (Chaos is the third).
Have you had long discussions with ChatGPT? This sounds like the sort of thing a person suffering from AI psychosis might say.
Peacocks feathers are a defense mechanism
Really?
By the way, how many r's are in raspberry?
This isn't a good theory.
While humans aren't literally the only animals to make tools, the difference between human tools and the basically-just-pointy-sticks that other animals make is vast.
There isn't any great reason to expect there to exist a narratively satisfying "what makes us human".
"Waste" is a non-apple. You have a very specific definition of efficiency, and anything that doesn't fit this model is "waste". Sure, by your model, a cathedral is waste. But spending the same amount of effort and resources digging holes and then filling them in again would also be waste. By your definition, almost any activity that doesn't spread your genes is waste. So your theory is non-predictive as it doesn't explain why humans build cathedrals rather than something else.
Also, culture and "waste" aren't uniquely human. From peacock tails to bird songs to magpies collecting shiny things to orcas balancing fish on their heads. Lots of animals have something resembling culture.
Learning to copy is easy. Learning to flawlessly assess if some behavior is useful is hard. So it's no surprise that many animals learn by copying each other. And the actions they copy aren't always useful to survival, but are usually pretty good.
I think this might be a case of "no one got fired for ..."
No one got fired for designing the most generic boring beige office. No great skill is needed to pull it off. I think there is a common risk minimization strategy of producing bland neutral inoffensive designs.
One thing that might be interesting is asking for SVGs, and seeing if the errors in these maps match up with corresponding errors in the SVGs, suggesting a single global data store.
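For what that comparison might look like mechanically (hypothetical coordinates and a hypothetical model-generated SVG, just to show the shape of the check):

```python
# Hypothetical analysis sketch: all coordinates and the SVG snippet are made up.
# Compare where a model places labelled points in an SVG against reference
# positions, to see whether its "map errors" and "SVG errors" line up.
import xml.etree.ElementTree as ET

# Reference positions (in the same pixel space the model was asked to use).
reference = {"Paris": (248, 120), "Berlin": (310, 105), "Madrid": (200, 210)}

# Stand-in for an SVG returned by a model; in a real test this would come
# from the LLM's response.
svg_text = """<svg xmlns="http://www.w3.org/2000/svg" width="400" height="300">
  <circle id="Paris"  cx="250" cy="118" r="3"/>
  <circle id="Berlin" cx="330" cy="100" r="3"/>
  <circle id="Madrid" cx="190" cy="230" r="3"/>
</svg>"""

root = ET.fromstring(svg_text)
for circle in root.iter("{http://www.w3.org/2000/svg}circle"):
    name = circle.get("id")
    cx, cy = float(circle.get("cx")), float(circle.get("cy"))
    rx, ry = reference[name]
    print(f"{name}: error vector ({cx - rx:+.0f}, {cy - ry:+.0f})")
# If the same places are systematically displaced the same way across prompts
# and output formats, that's weak evidence for a single shared internal map.
```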
Also, this is a good reminder of what a huge and bewildering variety of LLMs there are these days.
The world is not automatically divided up into lots of separate tasks.
If you divide tasks into too many small pieces, too many little buckets, many important problems can fall through the gaps.
For example, if you use RL to train a plumbing robot, and separately train an electrician robot, then neither of these robots can solve the problem that you get an electric shock whenever you turn on the tap.
If you train on a few huge buckets, then you have 1 robot that does everything, and that's basically an AGI again.
And in this RL-as-a-service model, wouldn't there be people doing RL for AI research?
So, when this model gets good enough, someone can just say "build an AGI" and get one, because all tasks are being automated, and that includes the task of building AGI.
Actually, RL is based on trial and error. It would be hard to train an AI researcher without giving it the opportunity to run arbitrary code in training.