For years, I've felt that our AI categories have been missing an important step: what comes between narrow AI & general AI. With the rise of synthetic media (especially natural-language generation) and game-playing AI, we're finally forced to confront this architecture.
Based on my lack of knowledge in the field of machine learning and bare observations of OpenAI's GPT-2, I propose a hypothesis: between narrow artificial intelligence and general artificial intelligence, there lies a sort of architecture capable of generalized learning in narrow fields of tasks. I don't claim to be an AI expert or even an amateur. Indeed, I likely lack so much understanding of data science that literally everything I'm about to say is actually wrong on a fundamental level.
But I do feel like, at least when it comes to mainstream discussions of AI, there's a big problem. Several big problems, in fact.
How does media talk about AI? Typically by reducing it to three architectural categories:
Artificial narrow intelligence (ANI). This is AI that can do one thing, and only one thing. If you notice a network doing more than one thing, it's actually just a bundle of ANIs all doing different things at the same time. In technical parlance, most of what is called ANI isn't actually artificial intelligence at all: basic scripts, Markov chains, Monte Carlo tree searches, conversation trees, stochastic gradient descent, autoencoding, etc. are part of data science & optimization more than AI. That distinction really comes down to the fact that "AI" carries connotations of humanlike cognition. For the sake of this post, we'll consider them AI anyway.
Artificial general intelligence (AGI). The holy grail of data science. The cybernetic messiah. The solution to all our problems (which includes nuking all our problems). This is AI that can do anything, presumably as well as a human can.
Artificial superintelligence (ASI). The rapture of the nerds and your new God. This is an AGI on crack, if that crack was also on crack. Take the limits of human intelligence: fusion-ha'ing Einstein, Euler, Newton, Mozart, the whole lot of them. Push human intelligence as far as it can go genetically, to the absolute limit of standard deviations. ASI is everything even further beyond. It's a level of intelligence no human, living, dead, or yet to be, will ever attain.
That's all well and good, but surely one can recognize that there's a massive gap there. How do we go from an AI that can do only one thing to an AI that does literally everything? Surely there's some intermediate state in between where you have narrow networks that are generalized, but not quite "general AI."
Up until recently, we had no reference for such a thing. It was either the soberingly incapable computer networks of the present or the artificial brains of science fiction.
But then deep learning happened. By itself, deep learning is little more than a scaled-up, many-layered evolution of perceptrons made possible by modern computing power, GPUs in particular. Here we are a decade later, and what do we have? Networks and models that are either generalized or possess generalized capabilities.
Nominally, most of these networks can only do "one" thing, just like any ANI. But unlike other ANIs, they can learn to do something else that's either closely related to or a direct outgrowth of that thing.
For example: MuZero from DeepMind. This one network has mastered over 50 different games. Even AlphaZero qualified, as it could play three different games. Of course, it still has to be retrained to play these different games as far as I know.
There's another example, this one more of a "rooted in a narrow thread and sprouting into multiple areas" deal: GPT-2. Natural-language generation is probably about as narrow a task as you can get: generate data in natural language. But from this narrow task, you can see a very wide range of generalized results. By itself, it has to be trained to do certain things, so the training data determines whether it does any specific thing at this juncture. But as it turns out (and to my own surprise), this entails a lot. Natural-language processing is a very funny thing: because nearly any digital data can be written out as a sequence of symbols, much like text, a theoretical NLG model could in principle produce almost anything a computer can represent. Write a story, write a song, compose a song, play that song, create art...
Though GPT-2 can't actually "play" a game the way MuZero does, a game's moves can be written out as text, so theoretically it would be feasible to get MuZero and GPT-2 to face off against each other.
Why is this important? Because of something I've called the AGI Fallacy. It's a phenomenon where we assume new tech will either only come about with AGI or is unlikely without it.
We're probably familiar with the AI Effect, yes? The gist there is that we assume that a technology, accomplishment, or innovative idea [X] requires "true" artificial intelligence [Y], but once we actually accomplish [X], whatever accomplished it no longer counts as [Y]. That might sound esoteric on the surface, but it's simple: once we do something new with AI, it's no longer called "AI". It's just a classifier, a tree search, fuzzy statistics, a Boolean loop, an expert system, or something of that sort.
As a result, I've started translating "NAI" (narrow AI) as "Not AI" because that's what just about any and every narrow AI system is going to be.
It's possible there's a similar issue building with a fallacy that's closely related to (but is not quite) the AI Effect. To explain my hypothesis: take [X] again. It's a Super Task that requires skills far beyond any ANI system today. In order to reliably accomplish [X], we need [Y]: artificial general intelligence. But here's the rub: most experts place the ETA of AGI at around 2045 at the earliest, actual data scientists lean much closer to the 2060s, and more conservative estimates place its creation in the 22nd century. [Z] is how many years away this is, and for simplicity's sake, let's presume that [Z] = 50 years.
To simplify: [X] requires [Y], but [Y] is [Z] years away. Therefore, [X] must also be [Z] years away, or at least close to it, and accomplishing [X] would herald [Y].
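Spelled out as a bare inference, the reasoning looks something like the sketch below. The notation is mine and purely illustrative: t(A) stands for the expected number of years until A exists, and W is a hypothetical system that falls short of AGI.

```latex
% Illustrative notation only (assumes amsmath for \text):
% t(A) = expected number of years until A exists.
\[
\bigl(X \text{ requires } Y\bigr) \;\wedge\; \bigl(t(Y) = Z\bigr)
\;\Longrightarrow\; t(X) \ge Z
\]
% If a lesser system W turns out to be enough for X, the bound depends on W instead:
\[
\bigl(X \text{ requires only } W\bigr)
\;\Longrightarrow\; t(X) \ge t(W),
\qquad \text{where possibly } t(W) \ll Z
\]
```

The first step is only as strong as its premise that [X] truly requires [Y].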
But this hasn't been the case for almost everything done with AI thus far. As it turns out, sufficiently advanced narrow AI systems have been capable of doing things that past researchers were doggedly sure could only be done with general AI.
Of course, there are some classes of things that do require something more generalized, and it's those that people tend to hinge their bets on as being married to AGI. Except if there is a hitherto unrecognized type of AI that can also be generalized but doesn't require the herculean task of creating AGI, even those tasks can be expected to be solved long before AGI itself arrives.
So, say, generating a 5-minute-long video of a photorealistic person talking might seem to require AGI at first. This network has to generate a person, make that person move naturally, generate their text, generate their speech, and then make it coherent over the course of five minutes. How could you do it without AGI? Well, depending on the tools you have, it's possible it's relatively easy.
This can greatly affect future predictions too. If you write something off as requiring AGI and then say that AGI is 50 years away, you then put off that prediction as being 50 years away as well. So if you're concerned about fake videos & movies but think we need AGI to generate them in order for them to be decent or coherent, you're probably going to file that concern in the same place as your own natural death or your grandchildren attending college. It's worth none of your concern in the immediate future, so why bother caring so much about it?
Whereas if you believe that this tech might be here within five years, you're much more apt to act and prepare. If you accept that some AI will be generalized but not completely generalized, you'll be more likely to take seriously the possibility of great upheavals much sooner than commonly considered to be realistic.
It happens to be ridiculously hard to get some people to understand this because, as mentioned, we don't really have any name for that intermediate type of AI and, thus, never discuss it. This even brings some problems because whenever we do talk about "increasingly generalized AI," some types latch onto the "generalized" part of that and think that you're discussing general AI and, thus, believe that we're closer to AGI than we actually are. Or conversely, they say that whatever network you're talking about is the furthest thing from AGI and use that mention of AI generality to shut down the topic since it "deals with science fiction instead of facts."
That's why I really don't like using terms like "proto-AGI" since that makes it sound like we just need to add more power and tasks to make it the full thing when it's really an architectural issue.
Hence why I went with "artificial expert intelligence." I forget where I first heard the term, but it was justified by the facts that:
- The acronym can be "AXI," which sounds suitably cyberpunk.
- The acronym is original. The other candidate names include "artificial specialized intelligence" (ASI, which is taken) and "artificial networked intelligence" (ANI, which is taken).
The only real drawback is its potential association with expert systems. But generally, I went with "expert" because of the association: experts have specialized knowledge in a small cluster of fields and can explain the relationships between those fields. Not quite a polymath savant who knows everything, and not really a student who has memorized a few equations and definitions to pass some tests.
...ever since around 2015, I started asking myself: "what about AI that can do some things but not everything?" That is, it might be specialized for one specific class of tasks, but it can do several or all of the subtasks within that class. Or, perhaps more simply, it's generalized across a cluster of tasks and capabilities but isn't general AI. It seems so obvious to me that this is the next step in AI, and we even have networks that do this: transformers, for example, specialize in natural-language generation, but from text synthesis, you can also do rudimentary images or organize MIDI files; even with just pure text synthesis, you can generate anything from poems to scripts and everything in between. Normally, you'd need an ANI that specializes in each one of those tasks, and it's true that most transformers right now are trained to do one specifically. But as long as they generate character data, they can theoretically generate more than just words.
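To make that last point a bit more concrete, here's a minimal sketch of how something that isn't prose, a short melody in this case, can be written out as plain character data and read back again. The `NOTE_<pitch>_<beats>` token format and both function names are made up for illustration; this is not real MIDI encoding, and I'm not claiming any particular model uses this format.

```python
# Hypothetical illustration: a tiny melody written as plain text tokens.
# The "NOTE_<pitch>_<beats>" format is invented for this sketch; it is not
# a real MIDI encoding or the format any specific model is known to use.

def melody_to_text(melody):
    """Serialize (midi_pitch, beats) pairs into a space-separated token string."""
    return " ".join(f"NOTE_{pitch}_{beats}" for pitch, beats in melody)


def text_to_melody(text):
    """Parse a token string back into (midi_pitch, beats) pairs.

    Malformed tokens are skipped, since generated text is not guaranteed
    to follow the format perfectly.
    """
    melody = []
    for token in text.split():
        parts = token.split("_")
        if (len(parts) == 3 and parts[0] == "NOTE"
                and parts[1].isdigit() and parts[2].isdigit()):
            melody.append((int(parts[1]), int(parts[2])))
    return melody


# C major arpeggio: C4, E4, G4 for one beat each, then C5 held for two beats.
melody = [(60, 1), (64, 1), (67, 1), (72, 2)]
as_text = melody_to_text(melody)
print(as_text)  # NOTE_60_1 NOTE_64_1 NOTE_67_1 NOTE_72_2
assert text_to_melody(as_text) == melody
```

Anything that emits strings in that shape is, in effect, emitting notes. Whether a given model emits good ones is a separate question.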
This isn't "proto-AGI" or anything close; if anything, it's closer to ANI. But it isn't ANI; it's too generalized to be ANI.
Unfortunately, I have literally zero influence and clout in data science, and my understanding of it all is likely wrong, so it's unlikely this term will ever take off.