I'm not sure how obvious the following is to people, and it probably is obvious to most of the people thinking about FAI. But I thought I'd throw out a summary of it here anyway, since this is the one topic that makes me the most pessimistic about the notion of Friendly AI being possible, at least one based heavily on theory rather than on plenty of experimentation.

A mind can only represent a complex concept X by embedding it into a tightly interwoven network of other concepts that combine to give X its meaning. For instance, a "cat" is playful, four-legged, feline, a predator, has a tail, and so forth. These are the concepts that define what it means to be a cat; by itself, "cat" is nothing but a complex set of links defining how it relates to these other concepts. (As well as a set of links to memories about cats.) But then, none of those concepts means anything in isolation, either. A "predator" is a specific biological and behavioral class, the members of which hunt other animals for food. Of that definition, "biological" pertains to "biology", which is a "natural science concerned with the study of life and living organisms, including their structure, function, growth, origin, evolution, distribution, and taxonomy". "Behavior", on the other hand, "refers to the actions of an organism, usually in relation to the environment". Of those words... and so on.
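To make the picture concrete, here is a toy sketch of such a network as a data structure (the relation names and entries are made up purely for illustration): each concept is nothing but its labelled links to other concepts, and unfolding any definition only leads to further concepts.

```python
# Toy concept network: every concept is defined only by its links to other
# concepts, so "looking up" a concept just yields more concepts to look up.
concept_network = {
    "cat":      {"is_a": ["feline", "predator"],
                 "has":  ["tail", "four legs"],
                 "typically": ["playful"]},
    "predator": {"is_a": ["biological class", "behavioral class"],
                 "defined_by": ["hunts other animals for food"]},
    "biological class": {"pertains_to": ["biology"]},
    "biology":  {"is_a": ["natural science"],
                 "studies": ["life", "living organisms", "their structure",
                             "their evolution"]},
    # ... and so on; the regress never bottoms out in anything self-contained.
}

def expand(concept, depth=2):
    """Unfold a concept's definition; every term it mentions is itself a concept."""
    if depth == 0 or concept not in concept_network:
        return concept
    return {relation: [expand(target, depth - 1) for target in targets]
            for relation, targets in concept_network[concept].items()}

print(expand("cat"))
```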

It does not seem likely that humans could preprogram an AI with a ready-made network of concepts. There have been attempts to build knowledge ontologies by hand, but any such attempt is both hopelessly slow and lacking in much of the essential content. Even given a lifetime during which to work and countless assistants, could you ever hope to code everything you knew into a format from which it was possible to employ that knowledge usefully? An even worse problem is that the information would need to be in a format compatible with the AI's own learning algorithms, so that any new information the AI learnt would fit seamlessly into the previously entered database. It does not seem likely that we can come up with an efficient language of thought that can be easily translated into a format that is intuitive for humans to work with.
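As a rough illustration of the format-compatibility worry (the names and the 256-dimensional vectors below are purely hypothetical): hand-entered knowledge naturally comes out as crisp symbolic assertions, while a learning system may accumulate something like distributed numeric representations, with no obvious translation between the two.

```python
import numpy as np

# Hand-entered knowledge: crisp symbolic assertions, easy for humans to write.
hand_coded = [("cat", "is_a", "predator"),
              ("cat", "has", "tail")]

# What a learning algorithm might actually accumulate: distributed numeric
# representations in which no coordinate corresponds to "is_a" or "tail".
learned = {"cat": np.random.rand(256),
           "predator": np.random.rand(256)}

def integrate(triple, learned_representations):
    """Where should ("cat", "is_a", "predator") go in a space of opaque vectors?
    There is nothing principled to do here; that is the compatibility problem."""
    return None  # placeholder: no agreed translation between the two formats
```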

Indeed, there are existing plans for AI systems which make the explicit assumption that the AI's network of knowledge will develop independently as the system learns, and the concepts in this network won't necessarily have an easy mapping to those used in human language. The OpenCog wikibook states that:

Some ConceptNodes and conceptual PredicateNode or SchemaNodes may correspond with human-language words or phrases like cat, bite, and so forth. This will be the minority case; more such nodes will correspond to parts of human-language concepts or fuzzy collections of human-language concepts. In discussions in this wikibook, however, we will often invoke the unusual case in which Atoms correspond to individual human-language concepts. This is because such examples are the easiest ones to discuss intuitively. The preponderance of named Atoms in the examples in the wikibook implies no similar preponderance of named Atoms in the real OpenCog system. It is merely easier to talk about a hypothetical Atom named "cat" than it is about a hypothetical Atom (internally) named [434]. It is not impossible that a OpenCog system represents "cat" as a single ConceptNode, but it is just as likely that it will represent "cat" as a map composed of many different nodes without any of these having natural names. Each OpenCog works out for itself, implicitly, which concepts to represent as single Atoms and which in distributed fashion.
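To paraphrase the distinction the quote draws in code (this is only my illustration, not the actual OpenCog API; the class and node names are hypothetical):

```python
# Illustration only -- not the real OpenCog API; names are hypothetical.

class Atom:
    def __init__(self, atom_id, name=None):
        self.atom_id = atom_id   # internal id, e.g. 434
        self.name = name         # usually None: most atoms have no natural name

# The minority case: "cat" as a single, conveniently named node.
cat_node = Atom(17, name="cat")

# The more typical case: "cat" as a map, a fuzzy collection of unnamed atoms
# that tend to be active together; no single one of them means "cat".
cat_map = {Atom(434), Atom(435), Atom(812), Atom(2047)}
```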

Designers of Friendly AI seek to build a machine with a clearly defined goal system, one which is guaranteed to preserve the highly complex values that humans have. But the nature of concepts poses a challenge for this objective. There seems to be no obvious way of programming those highly complex goals into the AI right from the beginning, nor of guaranteeing that any goals thus preprogrammed will not end up being drastically reinterpreted as the system learns. We cannot simply code "safeguard these human values" into the AI's utility function without defining those values in detail, and defining those values in detail requires us to build the AI with an entire knowledge network. On a conceptual level, the decision theory and goal system of an AI are separate from its knowledge base; in practice, it does not seem like that separation can be maintained.
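One way to see the problem concretely (a purely hypothetical sketch, not anyone's proposed design): the top-level utility function is easy to write down, but every predicate it mentions is a pointer into exactly the kind of concept network we don't know how to supply or keep stable.

```python
def utility(world_state, concepts):
    """Toy 'safeguard human values' objective. The hard part is not this
    function but the concepts it silently depends on."""
    humans = concepts["human"].extension(world_state)        # what counts as human?
    return concepts["human values"].satisfaction(world_state, humans)

# Neither concepts["human"] nor concepts["human values"] can be filled in
# without handing the AI an entire knowledge network -- and if the AI later
# re-learns those concepts, the "fixed" utility function changes meaning too.
```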

The goal might not be impossible, though. Humans do seem to be pre-programmed with inclinations towards various complex behaviors, which might suggest pre-programmed concepts of varying complexity. Heterosexuality is considerably more common in the population than homosexuality, though this may have relatively simple causes, such as an inborn preference towards particular body shapes combined with social conditioning. (Disclaimer: I don't really know anything about the biology of sexuality, so I'm speculating wildly here.) Most people also seem to react relatively consistently to different status displays, and people have collected various lists of complex human universals. The exact method of their transmission remains unknown, however, as does the role that culture plays in it. It also bears noting that most so-called "human universals" are actually cultural rather than individual universals: any given culture might be guaranteed to express them, but there will always be individuals who don't fit the usual norms.

See also: Vladimir Nesov discusses a closely related form of this problem as the "ontology problem".

[-]Roko14y180

A mind can only represent a complex concept X by embedding it into a tightly interwoven network of other concepts that combine to give X its meaning.

I'm going to object right there. A mind can represent a concept as a high-level regularity of sensory data. For example, "cat" is the high level regularity that explains the sensory data obtained from looking at cats. Cats have many regularity properties: they are solid objects which have a constant size and a shape that varies only in certain predictable ways. There is more than one cat in the world, and they have similar appearances. They also behave similarly.

This "concept-as-regularity" idea means that you don't have a symbol grounding problem, you don't have to define the semantics of concepts in terms of other concepts, and you don't have the problem of having to hand-pick an ontology for your system; rather, it generates the ontology that is appropriate for the world that it sees, hears and senses.

Of course, you're still taking sensory inputs as primitives. How do you then evaluate changes to your sensory apparatus?

[-]Roko14y20

In the most basic case, simply ignore the possibility that this can happen.

In the more advanced case, I would say that you need to identify robust features of external reality using the first sensory apparatus you have. I.e. construct an ontology. Once you have that, you can utilize a different set of sensory apparatus, and note that many robust features of external reality manifest themselves as an isomorphic set of regularities in the new sensory apparatus.

For example, viewing a cat through an IR camera will not yield all and only the regularities that we see when looking at a cat through echo-location or visible light. But there will be a mapping, mediated by the fact that these sensor systems are all looking at the same reality.
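A toy version of that claim, under obviously artificial assumptions (two scalar "sensors" observing the same hidden events): the regularities found separately in each modality can be matched up through their co-occurrence, which is the mapping mediated by shared reality.

```python
import numpy as np

rng = np.random.default_rng(1)

# The same external events (say, cat-present vs cat-absent) seen through two
# different sensors, each with its own arbitrary encoding plus noise.
events = rng.integers(0, 2, size=200)                       # hidden reality
camera = events * 3.0 + rng.normal(0, 0.2, 200)             # sensor A reading
sonar  = (1 - events) * 5.0 + rng.normal(0, 0.2, 200)       # sensor B reading

# Cluster each modality separately into two crude "regularities" by threshold.
cam_label = (camera > camera.mean()).astype(int)
son_label = (sonar > sonar.mean()).astype(int)

# The mapping between the two ontologies is recovered from co-occurrence:
# which camera-regularity tends to fire together with which sonar-regularity?
cooc = np.zeros((2, 2))
for a, b in zip(cam_label, son_label):
    cooc[a, b] += 1
print(cooc)   # heavily diagonal or anti-diagonal: an isomorphism, not identity
```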

In the simplest case, the initial agent doesn't allow changes in its I/O construction. Any modified agent would be a special case of what the initial agent constructs in the environment, acting through the initial I/O, using the initial definition of preference expressed in terms of that initial I/O. Since the initial agent is part of the environment, its control over the environment allows it, in particular, to deconstruct or change the initial agent, understood as a pattern in the environment (in the model of sensory input/reaction to output, seen through preference).

Yup. And for preference, it's the same situation, except that there is only one preference (expressed in terms of I/O) and it doesn't depend on observations (but it determines what should be done for each possible observation sequence). As concepts adapt to actual observations, so could representations of preference, constructed specifically for efficient access in this particular world (but they don't take over the general preference definition).

I agree with the "concept as regularity" concept. You can see that in how computers use network packets to communicate with each other: they don't define a packet as a discrete message from another computer, they just chop the data up and process it according to its regularities.

This leads to problems when trying to point at humans in an AI motivational system, though, which you have to build yourself. The problem is this: starting at the level of visual and audio input signals, build a regularity parser that returns 1 when it apprehends a human, and 0 when it apprehends something else. You have to do the following: future-proof it so it recognises post/trans-humans as humans (else it might get confused when we seem to want to wipe ourselves out), and make sure it is not fooled by pictures, mannequins, answer phones, or chat bots.

Basically you have to build a system that can abstract out the computational underpinning of what it means to be human, and recognise it from physical interaction. And not just any computational underpinning: since physics is computational, there is tons of physics of our brains we don't care about, such as exactly how we get different types of brain damage from different types of blunt trauma. So you have to build a regularity processor that abstracts out what humans think is important about the computational part of humans.
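The specification in the two paragraphs above, written out as a toy test suite (the recognizer here is a deliberately naive stand-in, and the cases are the ones listed in the comment): a surface-feature matcher passes the easy cases and fails exactly on the "fooled by pictures" case, which is why surface matching isn't enough.

```python
# The specification above as a toy test suite for a hypothetical recognizer
# looks_human(observation) -> 1 (human) or 0 (not human).
SPEC = [
    ("adult human, seen directly",     1),
    ("post/trans human, future body",  1),   # future-proofing requirement
    ("photograph of a human",          0),   # must not be fooled by pictures
    ("mannequin",                      0),
    ("answer phone message",           0),
    ("chat bot transcript",            0),
]

def looks_human(observation):
    # Deliberately naive stand-in that matches surface features only.
    return 1 if "human" in observation else 0

for observation, expected in SPEC:
    status = "ok  " if looks_human(observation) == expected else "FAIL"
    print(status, observation)
```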

If you understand how it does this, you should be able to make uploads.

We develop an understanding of what it means to be human through interactions with humans, using a motivational system that can be somewhat gamed by static images and simulations, and one we don't trust fully. This, however, leads to conflicting notions about humanity: whether uploads are humans or not, for example. So this type of process should probably not be used for something that might go foom.

[-][anonymous]14y00

I've kind of wanted to write about the concept-as-regularity thing for a while, but it seems akrasia is getting the best of me. Here's a compressed block of my thoughts on the issue.

Concept-as-regularity ought to be formalized. It is possible to conclude that a concept makes sense under certain circumstances involving other existing concepts that are correlated with no apparent determining factor. Since a Y-delta transformation on a Bayesian network looks like CAR, I'm guessing that the required number of mutually correlated concepts is three. Formalizing CAR would allow us to "formally" define lots of concepts, hopefully all of them. Bleggs and rubes are a perfect example of what CAR is useful for.
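A quick simulation of what I take the Y-delta analogy to mean, using the blegg/rube example (my own illustration, not a formalization): three mutually correlated observable features with no visible common factor are the cue for introducing a latent concept node, i.e. for replacing the triangle of pairwise links with a "Y".

```python
import numpy as np

rng = np.random.default_rng(2)

# Hidden "concept" (blegg vs rube), never observed directly.
z = rng.integers(0, 2, size=5000)

# Three observable features, each a noisy function of the hidden concept.
noise = lambda: rng.random(5000) < 0.1
blue   = np.where(noise(), 1 - z, z)
eggish = np.where(noise(), 1 - z, z)
furred = np.where(noise(), 1 - z, z)

corr = lambda a, b: np.corrcoef(a, b)[0, 1]

# The "delta": all three features are pairwise correlated...
print(corr(blue, eggish), corr(blue, furred), corr(eggish, furred))

# ...but within either value of the latent concept the correlations vanish,
# which is the cue for replacing the triangle of links with a "Y": one new
# concept node with the three features hanging off it.
mask = z == 1
print(corr(blue[mask], eggish[mask]), corr(blue[mask], furred[mask]))
```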

[-]Roko14y00

OK, now I see what a Y-Delta transform is, but I doubt that anything that simple is the key to a rigorous definition of "concept as regularity". Better, see the paper "The discovery of structural form" by Charles Kemp and Joshua B. Tenenbaum.

[-]Roko14y00

What's a Y-delta transformation?

[-]Roko14y00

Whilst it would be intellectually pleasing if this were the concept that Warrigal is referencing, I doubt it.

I didn't think it was the electrical engineering trick of turning a star-connected load into a triangle-connected one, but on further reflection, we are talking about a network...

The electrical engineering trick was several decades before Yang and Baxter and has its own Wikipedia entry.

[-]Roko14y10

But the nature of concepts poses a challenge for this objective. There seems to be no obvious way of programming those highly complex goals into the AI right from the beginning

You don't have to.

The idea is to give the AI a preference that causes it to want to do what [certain] humans would want to do, even though it doesn't know what that will turn out to be.

The challenge is to give it enough information to unambiguously point it at those humans, so that it extrapolates our volitions, rather than, say, those of our genes (universe-tiled-with-your-DNA failure mode) or of our more subconscious processes. Key to this is getting it to identify a physical instantiation of an optimizing agent.

Key to this is getting it to identify a physical instantiation of an optimizing agent.

Here, have a functional upload, given as a lambda term. The main problem is what to do with it, not how to find one. Eventually we'll have uploads, but it's still far from being clear how to use them for defining preference. Recognizing a person without explicit uploading is a minor problem in comparison (though it's necessary for aggregation from whole non-uploaded humanity).

You misunderstand the post. The problem is that the concepts themselves, which you need to use to express the goals, will change in meaning as the AI develops.

I didn't notice this post at first, but it's really good. Very important; it points at a critical problem with the FAI plan.

While I am not particularly optimistic about the creation of an FAI, I say:

  • You don't need to make a particularly advanced ontology to create an AI. By this I mean that while complex, the AI need not come built with an ontology that represents even all of current human knowledge, let alone potential ontological breakthroughs made by future humans.
  • A GAI could maintain its original goal system under self improvement even if it makes ontological breakthroughs.
  • I would never trust an AI to have a full human goal system, or the attached ontology necessary to represent it.
  • The ontology of a pre-foom FAI would not be like the general human one. It would be simpler and clearer, with a mechanism to create whatever further ontology necessary to represent the values from an appropriate reference.
  • A super-intelligence can figure out ontological stuff better than a human. The 'only' problem (for the AI creators) is getting to a system that has a simplified (less incoherent) version of the goal system of the creators that can self-improve without goal change. Where 'without goal change' includes "don't destroy all of that gooey grey stuff from which you need to get a whole heap more of the detailed information about your values!!!"

Please justify your claims (particularly #2).

(Only slightly less briefly)

  • You don't need to make a particularly advanced ontology to create an AI. By this I mean that while complex, the AI need not come built with an ontology that represents even all of current human knowledge, let alone potential ontological breakthroughs made by future humans.

Human ontologies are complex, redundant, and outright contradictory at times. Not only is some of this not needed to create an AI, it would be counter-productive to include it.

  • A GAI could maintain its original goal system under self improvement even if it makes ontological breakthroughs.

When the AI comes to model human preferences and the elements of the universe which are most relevant to fulfilling them, those models need not interfere at all with the implementation of the AI itself. It's just a complex form of data and metadata to keep in mind. When it comes to things that would more fundamentally influence the direct operation of the AI's goals, it can ensure that any alterations do not contradict the old version, or do so only to resolve a discovered contradiction in whatever way is sanest.

  • I would never trust an AI to have a full human goal system, or the attached ontology necessary to represent it.

Humans suck compared to superintelligences. They even suck at knowing what they want. I'd rather tell a friendly superintelligence to do what I want it to do than try to program my goals into it. Did I mention that it is smarter than me? It can even emulate me and ask em-me my goals that way if it hasn't got a better option. There is no downside to getting the FAI to do it for me. If it isn't friendly then....

  • The ontology of a pre-foom FAI would not be like the general human one. It would be simpler and clearer, with a mechanism to create whatever further ontology necessary to represent the values from an appropriate reference.

Humans suck at creating ontologies. Less than any other species I know, but they still suck. I wouldn't include stupid parts in an FAI; that'd make it particularly hard to prove friendly. But it would naturally be able to look at humans and figure out any necessary stupid parts itself.

  • A super-intelligence can figure out ontological stuff better than a human. The 'only' problem (for the AI creators) is getting to a system that has a simplified (less incoherent) version of the goal system of the creators that can self-improve without goal change. Where 'without goal change' includes "don't destroy all of that gooey grey stuff from which you need to get a whole heap more of the detailed information about your values!!!"

That is rather dense, I'll admit. But the gist of the reasoning is there.

Hardcoding a knowledge ontology that would include e.g. all concepts humans have ever thought of is theoretically possible, since those concepts are made up of a finite amount of complexity. It's just that this would take so very long...

Anyway, I wouldn't rule out that a sufficient knowledge ontology for an FAI could be semi-manually constructed in a century or two, or perhaps a few millennia. It is also theoretically possible that all major players in the world come to an agreement that until then, very strong measures need to be taken to prevent anyone from building anything-that-could-go-UFAI.

I of course wouldn't claim this probability to be particularly high.

You might actually be able to do some back-of-the-envelope calculations on this. Humans are slow learners, and end up with reasonable ontologies in a finite number of years. By this old estimate, humans learn two bits' worth of long-term memory content per second. Assuming that people learn at this rate during 16 hours of waking time each day of their life, this would end up as something like 32 megabytes of accumulated permanent memory for a 13-year-old. 13-year-olds can have most of the basic world ontology fixed, and that's around the age where we stop treating people as children who can be expected to be confused about obvious elements of the world ontology as opposed to subtle ones.
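Spelling the arithmetic out (the retention rate is the shakiest input: a gross 2 bits/s over 13 years comes to roughly 68 MB, while the ~32 MB figure corresponds to a net retained rate of about one bit per second, i.e. allowing for forgetting):

```python
# Back-of-the-envelope version of the estimate above.
SECONDS_PER_WAKING_DAY = 16 * 3600
DAYS = 13 * 365

def accumulated_megabytes(bits_per_second):
    bits = bits_per_second * SECONDS_PER_WAKING_DAY * DAYS
    return bits / 8 / 1e6

print(accumulated_megabytes(2))   # ~68 MB at a gross 2 bits/s
print(accumulated_megabytes(1))   # ~34 MB, close to the quoted ~32 MB,
                                  # i.e. roughly one net retained bit per second
```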

Hand-crafting a concept kernel that compresses down to that order of magnitude doesn't seem like an impossible task, but it's possible there's something very wrong with the memory accumulation rate estimation.

The 32 megabytes in question should be added to any pre-programmed instincts.

Yes. Those would go into the complexity bound for the human genome, since the genome is pretty much the only information source for human ontogeny. The original post suggested 25 MB, which apparently turned out to be too low. If you make the very conservative assumption that all of the human genome is important, I think the limit is somewhere around 500 MB. The genes needed to build and run the brain are going to be just a fraction of the total genome, but I don't know enough biology to guess at the size of the fraction.
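For reference, the raw arithmetic behind that genome bound (approximate figures; the ~500 MB above presumably allows for the fact that much of the sequence is repetitive and compresses well):

```python
# Rough upper bound from the raw genome size (numbers are approximate).
base_pairs = 3.1e9            # approximate haploid human genome length
bits = base_pairs * 2         # 2 bits per base (A, C, G, T), no compression
print(bits / 8 / 1e6)         # ~775 MB raw; compression of repetitive sequence
                              # brings estimates down into the few-hundred-MB range
```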

Anyway, it looks like even in the worst case the code for an AGI that can do interesting stuff out of the box could fit on a single CD-ROM.

Also, by that time, people might have become sufficiently more complex that hand-coding all the concepts 21st-century people can hold will be an interesting historical project, but not enough for a useful FAI.