Will AGI surprise the world?

Cross-posted from my blog.

Yudkowsky writes:

In general and across all instances I can think of so far, I do not agree with the part of your futurological forecast in which you reason, "After event W happens, everyone will see the truth of proposition X, leading them to endorse Y and agree with me about policy decision Z."

...

Example 2: "As AI gets more sophisticated, everyone will realize that real AI is on the way and then they'll start taking Friendly AI development seriously."

Alternative projection: As AI gets more sophisticated, the rest of society can't see any difference between the latest breakthrough reported in a press release and that business earlier with Watson beating Ken Jennings or Deep Blue beating Kasparov; it seems like the same sort of press release to them. The same people who were talking about robot overlords earlier continue to talk about robot overlords. The same people who were talking about human irreproducibility continue to talk about human specialness. Concern is expressed over technological unemployment the same as today or Keynes in 1930, and this is used to fuel someone's previous ideological commitment to a basic income guarantee, inequality reduction, or whatever. The same tiny segment of unusually consequentialist people are concerned about Friendly AI as before. If anyone in the science community does start thinking that superintelligent AI is on the way, they exhibit the same distribution of performance as modern scientists who think it's on the way, e.g. Hugo de Garis, Ben Goertzel, etc.

My own projection goes more like this:

As AI gets more sophisticated, and as more prestigious AI scientists begin to publicly acknowledge that AI is plausibly only 2-6 decades away, policy-makers and research funders will begin to respond to the AGI safety challenge, just like they began to respond to CFC damages in the late 70s, to global warming in the late 80s, and to synbio developments in the 2010s. As for society at large, I dunno. They'll think all kinds of random stuff for random reasons, and in some cases this will seriously impede effective policy, as it does in the USA for science education and immigration reform. Because AGI lends itself to arms races and is harder to handle adequately than global warming or nuclear security are, policy-makers and industry leaders will generally know AGI is coming but be unable to fund the needed efforts and coordinate effectively enough to ensure good outcomes.

At least one clear difference between my projection and Yudkowsky's is that I expect AI-expert performance on the problem to improve substantially as a greater fraction of elite AI scientists begin to think about the issue in Near mode rather than Far mode.

As a friend of mine suggested recently, current elite awareness of the AGI safety challenge is roughly where elite awareness of the global warming challenge was in the early 80s. Except, I expect elite acknowledgement of the AGI safety challenge to spread more slowly than it did for global warming or nuclear security, because AGI is tougher to forecast in general, and involves trickier philosophical nuances. (Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!")

Still, there is a worryingly non-negligible chance that AGI explodes "out of nowhere." Sometimes important theorems are proved suddenly after decades of failed attempts by other mathematicians, and sometimes a computational procedure is sped up by 20 orders of magnitude with a single breakthrough.

129 comments

A third possibility is that AGI becomes the next big scare.

There's always a market for the next big scare, and a market for people who'll claim putting them in control will save us from the next big scare.

Having the evil machines take over has always been a scare. When AIs become more embodied and start working together autonomously, people will be more likely to freak, IMO.

Getting beat on Jeopardy is one thing, watching a fleet of autonomous quad copters doing their thing is another. It made me a little nervous, and I'm quite pro AI. When people see machines that seem like they're alive, like they think, communicate among themselves, and cooperate in action, many will freak, and others will be there to channel and make use of that fear.

That's where I disagree with EY. He's right that a smarter talking box will likely just be seen as a nonthreatening curiosity. Watson 2.0, big deal. But embodied intelligent things that communicate and take concerted action will press our base primate "threatening tribe" buttons.

"Her" would have had a very different feel if all those AI operating systems had bodies, and got together in their own parallel and much more quickly advancing society. Kurzweil is right in pointing out that with such advanced AI, Samantha could certainly have a body. We'll be seeing embodied AI well before any human level of AI. That will be enough for a lot of people to get their freak out on.

Yeah, this becomes plausible if some analogue of Chernobyl happens. Maybe self-driving cars cause some kind of horrible accident due to algorithms behaving unexpectedly.

There's always a market for the next big scare, and a market for people who'll claim putting them in control will save us from the next big scare.

That's how I've always viewed SIAI/MIRI, at least in terms of a significant subset of those who send them money...

Perhaps, but AFAIK, minus the "putting them in control" part.

MIRI literally claims to want to build an overlord for the universe and has actively solicited donations with that goal explicitly stated. I'd call that a variant on the theme where they get money via people who think they're putting them in charge.

I imagine someone puts an autonomous mosquito-zapping laser (to fight malaria) on a drone.

In that scenario, most people who are surprised to meet one in action, and many who hear of it, cannot help but wonder "how long till we're the mosquitoes?"

That'd be a smart move to secure some MIRI funding...

That'd be a smart move to secure some MIRI funding...

Besides just being freaking awesome. Imagine that floating around the backyard barbecue. Pew. Pew. Pew.

We'll be seeing embodied AI well before any human level of AI.

By "embodied" do you mean "humanoid"? It seems like there's more demand for humanoid robots in some parts of the world (Japan) than other parts (US). Or by "embodied" do you mean detached autonomous robots like the Roomba?

The reference to autonomous fleets of quadcopters would suggest the latter.

Yes, "embodied" was simply physical, capable of motion and physical action.

Reactions will depend on the physical capabilities and culture too. For me, I've never liked bugs that fly, so the quadcopters press my buttons. Watching Terminator may have something to do with it too.

Self-driving cars are already inspiring discussion of AI ethics in mainstream media.

Driving is something that most people in the developed world feel familiar with — even if they don't themselves drive a car or truck, they interact with people who do. They are aware of the consequences of collisions, traffic jams, road rage, trucker or cabdriver strikes, and other failures of cooperation on the road. The kinds of moral judgments involved in driving are familiar to most people — in a way that (say) operating a factory or manipulating a stock market are not.

I don't mean to imply that most people make good moral judgments about driving — or that they will reach conclusions about self-driving cars that an AI-aware consequentialist would agree with. But they will feel like having opinions on the issue, rather than writing it off as something that programmers or lawyers should figure out. And some of those people will actually become more aware of the issue, who otherwise (i.e. in the absence of self-driving cars) would not.

So yeah, people will become more and more aware of AI ethics. It's already happening.


Self-driving cars will also inevitably catalyze discussion of the economic morality of AI deployment. Or rather, self-driving trucks will, as they put millions of truck drivers out of work over the course of five to ten years — long-distance truckers first, followed by delivery drivers. As soon as the ability to retrofit an existing truck with self-driving is available, it would be economic idiocy for any given trucking firm to not adopt it as soon as possible. Robots don't sleep or take breaks.

So, who benefits? The owners of the trucking firm and the folks who make the robots. And, of course, everyone whose goods are now being shipped twice as fast because robots don't sleep or take breaks. (The AI does not love you, nor does it hate you, but you have a job that it can do better than you can.)

As this level of AI — not AGI, but application-specific AI — replaces more and more skilled labor, faster and faster, it will become increasingly impractical for the displaced workers to retrain into the fewer and fewer remaining jobs.

This is also a moral problem of AI ...

Whether we should do otherwise-obviously-suboptimal things solely because it'd result in more jobs is a question that long predates self-driving cars...

Well, I want to end up in the future where humans don't have to labor to survive, so I'm all for automating more and more jobs away. But in order to end up in that future, the benefits of automation have to also accrue to the displaced workers. Otherwise you end up with a shrinking productive class, a teeny-tiny owner class, and a rapidly growing unemployable class — who literally can't learn a new trade fast enough to work at it before it is automated away by accelerating AI deployment.

As far as I can tell, the only serious proposal that might make the transition from the "most adult humans work at jobs to make a living" present to the "robots do most of the work and humans do what they like" future — without the sort of mass die-off of the lower class that someone out there probably fantasizes about — is something like Friedman's basic income / negative income tax proposal. If you want to end up in a future where humans can screw off all day because the robots have the work covered, you have to let some humans screw off all day. May as well be the displaced workers.

I agree. (Yvain wrote about that in more detail here, and a followup here.)

I'd prefer something like Georgism to negative income tax, but the former has fewer chances of actually being implemented any time soon.

Whether we should do otherwise-obviously-suboptimal things solely because it'd result in more jobs is a question that long predates self-driving cars...

It long predates Milton Friedman, too

I don't think the linked PCP thing is a great example. Yes, the first time someone seriously writes an algorithm to do X it typically represents a big speedup on X. The prediction of the "progress is continuous" hypothesis is that the first time someone writes an algorithm to do X, it won't be very economically important---otherwise someone would have done it sooner---and this example conforms to that trend pretty well.

The other issue seems closer to relevant; mathematical problems do go from being "unsolved" to "solved" with comparatively little warning. I think this is largely because they are small enough problems that they are 1 man jobs (which would not be plausible if anyone really cared about the outcome), but that may not be the whole story and at any rate something is going on here.

In the PCP case, the relevantly similar outcome would be the situation where theoretical work on interactive proofs turned out to be useful right out of the box. I'm not aware of any historical cases where this has happened, but I could be missing some, and I don't really understand why it would happen as rarely as it does. It would be nice to understand this possibility better.

As for "people can't tell the difference between Watson and being close to broadly human-level AI," I think this is unlikely. At the very least the broader intellectual community is going to have little trouble distinguishing between Watson and economically disruptive AI, so this is only plausible if we get a discontinuous jump. But even assuming a jump, the AI community is not all that impressed by Watson and I expect this is an important channel by which significant developments would affect expectations.

Just for the record, I wasn't proposing the PCP thing as a counterexample to your model of "economically important progress is continuous."

As you noted on your blog, Elon Musk is concerned about unfriendly AI, and from his comments about how escaping to Mars won't be a solution because "The A.I. will chase us there pretty quickly" he might well share MIRI's fear that the AI will seek to capture all of the free energy of the universe. Peter Thiel, a major financial supporter of yours, probably also has this fear.

If after event W happens, Elon Musk, Peter Thiel, and a few of their peers see the truth of proposition X and decide that they and everything they care about will perish if policy Z doesn't get enacted, they will with high probability succeed in getting Z enacted.

I don't know if this story is true, but I read somewhere that when Julius Caesar was marching on the Roman Republic several Senators went to Pompey the Great, handed him a sword and said "save Rome." Perhaps when certain Ws happen we should make an analogous request of Musk or Thiel.

I would believe that as soon as AGI becomes near (if it ever will), predictions by experts will start to converge on some fixed date, rather than the usual "15-20 years in the future".

(Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!")

Apologies for asking an off-topic question that has certainly been discussed somewhere before, but if advanced decision theories are logically superior, then they are in some sense universal, in that a large subspace of mindspace will adopt them when the minds become intelligent enough ("Three Worlds Collide" seems to indicate that this is EY's opinion, at least for minds that evolved). If so, even a paperclip maximiser would assign some nontrivial component of its utility function to match humanity's, iff we would have done the same in the counterfactual case that FAI came first (I think this also has to assume that at least one party has a sublinear utility curve).

In this sense, it seems that as entities grow in intelligence, they are at least likely to become more cooperative/moral.

Of course, FAI is vastly preferable to an AI that might be partially cooperative, so I am not trying to diminish the importance of FAI. I'd still like to know whether the consensus opinion is that this is plausible.

Actually I think I know one place this has been discussed before - Clippy promised friendliness and someone else promised him a lot of paperclips. But I don't know of a serious discussion.

Just a historical note, I think Rolf Nelson was the earliest person to come up with that idea, back in 2007. Though it was phrased in terms of simulation warfare rather than acausal bargaining at first.

Cooperative play (as opposed to morality) strongly depends on the position from which you're negotiating. For example if the FAI scenario is much less likely (a priori) than a Clippy scenario, then there's no reason for Clippy to make strong concessions.

For example if the FAI scenario is much less likely (a priori) than a Clippy scenario, then there's no reason for Clippy to make strong concessions.

But if a "paperclips" maximizer, as opposed to "tables", "cars", or "alien sex toys" maximizer, is just one of many unfriendly maximizers, then maximizing "human values" is just one of many unlikely outcomes. In other words, you can't just say that unfriendly AIs are more likely than friendly AIs when it comes to cooperation. Since the opposition between a paperclip maximizer and an "alien sex toy" maximizer is the same as the opposition between the former and an alien or human friendly AI. Since all of them want to maximize their opposing values. And even if there turns out to be a subset of values shared by some AIs, other groups could cooperate to outweigh their leverage.

But since there is an exponentially huge set of random maximisers, the probability of each individual one is infinitesimal. OTOH, human values have a high probability density in mindspace because people are actually working towards it.

human values have a high probability density in mindspace because people are actually working towards it

Depends on how high a probability density humans (and alien life forms so similar to humans that they share our values) have in mindspace. Maybe very low. Maybe a society ruled by intelligent ants according to their values would make us very unhappy... and on a cosmic scale, ants are our cousins; alien life should be much more different.

I don't understand what point you're trying to make. My point was that cooperative game theory doesn't magically guarantee a UFAI will treat us nicely. It might work but only if there is a sufficiently substantial Everett branch with a FAI. The probability of that branch probably strongly depends on effort invested into FAI research.

If the probability of FAI (and friendly uploads etc) is near zero, then we're doomed either way. But even though I believe the probability of provably friendly AI coming first is <50%, it's definitely not 10^-11!

Fair enough, but there's still an enormous incentive to work on FAI.

Of course, I was not trying to suggest otherwise.

But then we might be able to achieve AI safety in a relatively easy way by creating networks of interacting agents (including interacting with us).

I think you point out the conclusion of the assumption that

Cooperative play strongly depends on the position from which you're negotiating.

but if you have multiple AIs then none of them is much stronger than the other.

Sorry, didn't follow that. Can you elaborate?

I mean you don't have to assume a singleton AI becoming very powerful very quickly. You can assume intelligence and friendliness developing in parallel (and incrementally).

Hmm.

Are you suggesting (super)intelligence would be a result of direct human programming, like Friendliness presumably would be?

Or that Friendliness would be a result of self-modification, like SIAI is predicted to be 'round these parts?

I am talking about SIRI. I mean that human engineers are making/will make multiple efforts at simultaneously improving AI and friendliness, and the ecosystem of AIs and AI users is selecting/will select for friendliness that works.

Is the idea that the network develops at roughly the same rate, with no single entity undergoing a hard takeoff?

In what sense don't I have to assume it? I think singleton AI happens to be a likely scenario and this has little to do with cooperation.

The more alternative scenarios there are, the lower the likelihood of the MIRI scenario, and the less need for the MIRI solution.

I don't understand what it has to do with cooperative game theory.

Also, cooperation seems to be at least a large component of morality, while some believe morality should be derived entirely from game theory.

I think this is a confusion. Game theory is only meaningful after you have specified the utility functions of the players. If these utility functions don't already include caring about other agents, the result is not what I'd call "morality", it is just cooperation between selfish entities. Surely the evolutionary reasons for morality have to do with cooperative game theory, so what? The evolutionary reason for sex is reproduction, but that doesn't mean we shouldn't be having sex with condoms. Morality should not be derived from anything except human brains.

I think this disagreement is purely a matter of semantics: 'morality' is an umbrella term which is often used to cover several distinct concepts, such as empathy, group allegiance and cooperation. In this case, the AI would be moral according to one dimension of morality, but not the others.

True, but since the universe is quite big, even a share of 10^-11 (many orders of magnitude lower than P(FAI)) would be sufficient for humanity to have a galaxy while Clippy clips the rest of the universe. If the laws of physics permit an arbitrarily large amount of whatever comprises utility, then all parties can achieve arbitrarily large amounts of utility, provided the utility functions do not actually involve impeding the other parties' activities.
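
As a rough check of the arithmetic in this comment: assuming the observable universe holds on the order of 10^11 galaxies (a commonly cited ballpark that is not stated in the thread, and published estimates vary), a 10^-11 share comes to about one galaxy. A minimal sketch; the constant and function name are illustrative assumptions.

```python
from fractions import Fraction

# Assumption (not from the thread): order-of-magnitude galaxy count
# for the observable universe. Published estimates vary.
ASSUMED_GALAXY_COUNT = 10**11


def galaxies_for_share(share: Fraction, total: int = ASSUMED_GALAXY_COUNT) -> Fraction:
    """Number of galaxies corresponding to a fractional share of the total."""
    return share * total


humanity_share = Fraction(1, 10**11)  # the 10^-11 share from the comment
print(galaxies_for_share(humanity_share))  # prints 1
```

If the true count is larger, the same share buys proportionally more galaxies, so the point does not hinge on the exact figure.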

It might be that Clippy will let us have our galaxy. However, "arbitrarily large amounts of utility" sounds completely infeasible, since for one thing the utility function has a time discount and for another our universe is going to hit heat death (maybe it is escapable by destabilizing the vacuum into a state with non-positive cosmological constant, but I'm not at all sure).

Utility functions do not have to have a time discount - in fact, while it might be useful when dealing with inflation, I don't see why there should be time discounts in general. As far as circumventing the second law of thermodynamics goes, there are several proposed methods, and given that humanity doesn't have a complete understanding of physics I don't think we can have a high degree of confidence one way or the other.

Utility functions do not have to have a time discount...

Without time discount you run into issues like the procrastination paradox and Boltzmann brains. UDT also runs into trouble since arbitrarily tight bounds on utility become impossible to prove due to Goedel incompleteness. If your utility function is unbounded it gets worse: your expectation values fail to converge (as exemplified by Pascal's mugging).
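
The non-convergence point can be illustrated with a St. Petersburg-style lottery. This is a standard textbook example, not something specified in the thread: outcome k has probability 2^-k and utility 2^k. Each term contributes exactly 1 to the expectation, so the partial sums grow without bound, whereas capping the utility makes them converge. The function name and the cap value below are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def partial_expectation(n_terms: int, utility_cap: Optional[int] = None) -> Fraction:
    """Sum of p_k * u_k for k = 1..n_terms, with p_k = 2^-k and u_k = 2^k.

    If utility_cap is given, utility is bounded: u_k = min(2^k, utility_cap).
    """
    total = Fraction(0)
    for k in range(1, n_terms + 1):
        utility = 2**k
        if utility_cap is not None:
            utility = min(utility, utility_cap)
        total += Fraction(1, 2**k) * utility
    return total


# Unbounded utility: every term adds 1, so the partial sum equals n_terms.
print(partial_expectation(10))    # 10
print(partial_expectation(1000))  # 1000

# Bounded utility (a cap of 8 is an arbitrary illustrative choice):
# the partial sums stay below, and approach, a finite limit of 4.
print(float(partial_expectation(1000, utility_cap=8)))
```

Exact rational arithmetic is used so that the growth of the partial sums is not obscured by floating-point rounding.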

As far as circumventing the second law of thermodynamics goes, there are several proposed methods...

Are there?

...given that humanity doesn't have a complete understanding of physics I don't think we can have a high degree of confidence one way or the other.

Well, we can't have complete confidence, but I think our understanding is not so bad. We're missing a theory of heterogeneous nucleation of string theoretic vacua (as far as I know).

Without time discount you run into issues like the procrastination paradox and Boltzmann brains. UDT also runs into trouble since arbitrarily tight bounds on utility become impossible to prove due to Goedel incompleteness.

Could you provide links? A google search turned up many different things, but I think you mean this procrastination paradox. Is it possible that one's utility function does not discount, but given uncertainty about the future one should kind of behave as if it does? (e.g. I value life tomorrow exactly as much as I value life today, but maybe we should party hard now because we cannot be absolutely certain that we will survive until tomorrow)

If your utility function is unbounded it gets worse: your expectation values fail to converge (as exemplified by Pascal's mugging).

What if I maximise measure, or maximise the probability of attaining an unbounded amount of utility?

WRT circumventing the second law of thermodynamics, there is the idea of creating a basement universe to escape into, some form of hypercomputation that can experience subjective infinite time in a finite amount of real time, and time crystals which apparently is a real thing and not what powers the TARDIS.

I think our understanding is not so bad. We're missing a theory of heterogeneous nucleation of string theoretic vacua (as far as I know).

AFAIK humanity does not know what the dark matter/dark energy is that 96% of the universe is made of. This alone seems like a pretty big gap in our understanding, although you seem to know more physics than I do.

Could you provide links?

Boltzmann brains were discussed in many places, not sure what the best link would be. The idea is that when the universe reaches thermodynamic equilibrium, after a humongous amount of time you get Poincare recurrences: that is, any configuration of matter will randomly appear an infinite number of times. This means there's an infinite number of "conscious" brains coalescing from randomly floating junk, living for a brief moment and perishing. In the current context this calls for time discount because we don't want the utility function to be dominated by the well being of those guys. You might argue we can't influence their well being anyway but you would be wrong. According to UDT, you should behave as if you're deciding for all agents in the same state. Since you have an infinite number of Boltzmann clones, w/o time discount you should be deciding as if you're one of them. Which means extreme short-term optimization (since your chances of surviving the next t seconds decline very fast with t). I wouldn't bite this bullet.

UDT is sort-of "cutting edge FAI research", so there are no very good references. Basically, UDT works by counting formal proofs. If your utility function involves an infinite time span it would be typically impossible to prove arbitrarily tight bounds on it since logical sentences that contain unbounded quantifiers can be undecidable.

...I think you mean this procrastination paradox.

Yes.

Is it possible that one's utility function does not discount, but given uncertainty about the future one should kind of behave as if it does?

Well, you can try something like this, but for one it doesn't sound consistent with "all parties can achieve arbitrarily large amounts of utility", because the latter requires arbitrarily high confidence about the future, and for another I think you need unbounded utility to make it work, which opens a different can of worms.

What if I maximise measure, or maximise the probability of attaining an unbounded amount of utility?

I don't understand what you mean by maximizing measure. Regarding maximizing the probability of attaining an unbounded (actually infinite) amount of utility, well, that would make you a satisficing agent that only cares about the asymptotically far future (since apparently anything happening in a finite time interval only carries finite utility). I don't think it's a promising approach, but if you want to pursue it, you can recast it in terms of finite utility (by assigning new utility "1" when old utility is "infinity" and new utility "0" in other cases). Of course, this leaves you with the problems mentioned before.

...there is the idea of creating a basement universe to escape into...

If I understand you correctly it's the same as destabilizing the vacuum which I mentioned earlier.

...some form of hypercomputation that can experience subjective infinite time in a finite amount of real time...

This is a nice fantasy but unfortunately strongly incompatible with what we know about physics. By "strongly" I mean that it would take a very radical update to make it work.

...and time crystals which apparently is a real thing and not what powers the TARDIS...

To me it looks like the journalist is misrepresenting what has actually been achieved. I think that this is a proposal for computing at extremely low temperatures, not for violating the second law of thermodynamics. Indeed the latter would require actual new physics, which is not the case here at all.

AFAIK humanity does not know what the dark matter/ dark energy is that 96% of the universe is made of. This alone seems like a pretty big gap in our understanding...

You're right, of course. There's a lot we don't know yet, what I meant is that we already know enough to begin discussing whether heat death is escapable because the answer might turn out to be universal or nearly universal across a very wide range of models.

Boltzmann brains were discussed in many places, not sure what the best link would be.

Sorry, I should have been more precise - I've read about Boltzmann brains, I just didn't realise the connection to UDT.

In the current context this calls for time discount because we don't want the utility function to be dominated by the well being of those guys.

This is the bit I don't understand - if these agents are identical to me, then it follows that I'm probably a Boltzmann brain too, as if I have some knowledge that I am not a Boltzmann brain, this would be a point of difference. In which case, surely I should optimise for the very near future even under old-fashioned causal decision theory. Like you, I wouldn't bite this bullet.

If your utility function involves an infinite time span it would be typically impossible to prove arbitrarily tight bounds on it since logical sentences that contain unbounded quantifiers can be undecidable.

I didn't know that - I've studied formal logic, but not to that depth unfortunately.

I don't understand what you mean by maximizing measure.

I was meaning in the sense of measure theory. I've seen people discussing maximising the measure of a utility function over all future Everett branches, although from my limited understanding of quantum mechanics I'm unsure whether this makes sense.

I don't think it's a promising approach, but if you want to pursue it, you can recast it in terms of finite utility (by assigning new utility "1" when old utility is "infinity" and new utility "0" in other cases).

Yeah, I doubt this would be a good approach either, in that if it does turn out to be impossible to achieve unboundedly large utility I would still want to make the best of a bad situation and maximise the utility achievable by the finite amount of negentropy available. I imagine a better approach would be to add the satisficing function to the time-discounting function, scaled in some suitable manner. This doesn't intuitively strike me as a real utility function, as it's adding apples and oranges so to speak, but perhaps useful as a tool?

If I understand you correctly it's the same as destabilizing the vacuum which I mentioned earlier.

Well, I'm approaching the limit of my understanding of physics here, but actually I was talking about alpha-point computation which I think may involve the creation of daughter universes inside black holes.

This is a nice fantasy but unfortunately strongly incompatible with what we know about physics. By "strongly" I mean that it would take a very radical update to make it work.

It does seem incompatible with e.g. the Planck time; I just don't know enough to dismiss it with a very high level of confidence, although I'm updating wrt your reply.

Your reply has been very interesting, but I must admit I'm starting to get seriously out of my depth here, in physics and formal logic.

This is the bit I don't understand - if these agents are identical to me, then it follows that I'm probably a Boltzmann brain too...

In UDT you shouldn't consider yourself to be just one of your clones. There is no probability measure on the set of your clones: you are all of them simultaneously. CDT is difficult to apply to situations with clones, unless you supplement it by some anthropic hypothesis like SIA or SSA. If you use an anthropic hypothesis, Boltzmann brains will still get you in trouble. In fact, some cosmologists are trying to find models w/o Boltzmann brains precisely to avoid the conclusion that you are likely to be a Boltzmann brain (although UDT shows the effort is misguided). The problem with UDT and Goedel incompleteness is a separate issue which has no relation to Boltzmann brains.

I was meaning in the sense of measure theory. I've seen people discussing maximising the measure of a utility function over all future Everett branches...

I'm not sure what you mean here. Sets have measure, not functions.

I imagine a better approach would be to add the satisficing function to the time-discounting function, scaled in some suitable manner. This doesn't intuitively strike me as a real utility function, as it's adding apples and oranges so to speak, but perhaps useful as a tool?

Well, you still got all of the abovementioned problems except divergence.

...actually I was talking about alpha-point computation which I think may involve the creation of daughter universes inside black holes.

Hmm, baby universes are a possibility to consider. I thought the case for them is rather weak but a quick search revealed this. Regarding performing an infinite number of computations I'm pretty sure it doesn't work.

CDT is difficult to apply to situations with clones, unless you supplement it by some anthropic hypothesis like SIA or SSA.

While I can see why there is intuitive cause to abandon the "I am person #2, therefore there are probably not 100 people" reasoning, abandoning "There are 100 clones, therefore I'm probably not clone #1" seems to be simply abandoning probability theory altogether, and I'm certainly not willing to bite that bullet.

Actually, looking back through the conversation, I'm also confused as to how time discounting helps in the case that one is acting like a Boltzmann brain - someone who knows they are a B-brain would discount quickly anyway due to short lifespan, wouldn't extra time discounting make the situation worse? Specifically, if there are X B-brains for each 'real' brain, then if the real brain can survive more than X times as long as a B-brain, and doesn't time discount, then the 'real' brain utility still is dominant.

I'm not sure what you mean here. Sets have measure, not functions.

I wasn't being very precise with my wording - I meant that one would maximise the measure of whatever it is one values.

Hmm, baby universes are a possibility to consider. I thought the case for them is rather weak but a quick search revealed this. Regarding performing an infinite number of computations I'm pretty sure it doesn't work.

Well, I have only a layman's understanding of string theory, but if it were possible to 'escape' into a baby universe by creating a clone inside the universe, then the process can be repeated, leading to an uncountably infinite (!) tree of universes.

While I can see why there is intuitive cause to abandon the "I am person #2, therefore there are probably not 100 people" reasoning, abandoning "There are 100 clones, therefore I'm probably not clone #1" seems to be simply abandoning probability theory altogether, and I'm certainly not willing to bite that bullet.

I'm not entirely sure what you're saying here. UDT suggests that subjective probabilities are meaningless (thus taking the third horn of the anthropic trilemma although it can be argued that selfish utility functions are still possible). "What is the probability I am clone #n" is not a meaningful question. "What is the (updated/posteriori) probability I am in a universe with property P" is not a meaningful question in general but has approximate meaning in contexts where anthropic considerations are irrelevant. "What is the a priori probability the universe has property P" is a question that might be meaningful but is probably also approximate since there is a freedom of redefining the prior and the utility function simultaneously (see this). The single fully meaningful type of question is "what is the expected utility I should assign to action A?" which is OK since it is the only question you have to answer in practice.

Actually, looking back through the conversation, I'm also confused as to how time discounting helps in the case that one is acting like a Boltzmann brain - someone who knows they are a B-brain would discount quickly anyway due to short lifespan, wouldn't extra time discounting make the situation worse?

Boltzmann brains exist very far in the future wrt "normal" brains, therefore their contribution to utility is very small. The discount depends on absolute time.
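A rough numerical sketch of this point, under loudly stated assumptions: the exponential discount, the rate of 0.01 per time unit, the lifespans, and the count of 1000 Boltzmann brains are all hypothetical numbers chosen for illustration, not figures from this thread or from cosmology. It only shows that a discount on absolute time can flip which side dominates the total.

```python
import math

def discounted_utility(rate, start, duration, discount):
    """Integral of rate * exp(-discount * t) for t in [start, start + duration]."""
    if discount == 0:
        return rate * duration
    end = start + duration
    return rate * (math.exp(-discount * start) - math.exp(-discount * end)) / discount

# Hypothetical setup: one "real" brain living 80 time units from t = 0,
# and 1000 Boltzmann brains, each lasting 1 unit, appearing at t = 10_000.
N_BBRAINS = 1000
DISCOUNT = 0.01

real_plain = discounted_utility(1.0, 0.0, 80.0, 0.0)
bbrain_plain = N_BBRAINS * discounted_utility(1.0, 10_000.0, 1.0, 0.0)

real_disc = discounted_utility(1.0, 0.0, 80.0, DISCOUNT)
bbrain_disc = N_BBRAINS * discounted_utility(1.0, 10_000.0, 1.0, DISCOUNT)

print("no discount:  ", real_plain, "vs", bbrain_plain)
print("with discount:", real_disc, "vs", bbrain_disc)
```

Without discounting, the Boltzmann brains dominate here because the real brain lasts fewer than 1000 times as long as one of them, which matches the condition stated in the quoted question. With the discount, their contribution is suppressed by roughly exp(-100) because of when they exist, not how long they last.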

I wasn't being very precise with my wording - I meant that one would maximise the measure of whatever it is one values.

If "measure" here equals "probability wrt prior" (e.g. Solomonoff prior) then this is just another way to define a satisficing agent (utility equals either 0 or 1).

Well, I have only a layman's understanding of string theory, but if it were possible to 'escape' into a baby universe by creating a clone inside the universe, then the process can be repeated, leading to an uncountably infinite (!) tree of universes.

Good point. Surely we need to understand these baby universes better.

In the current context this calls for time discount because we don't want the utility function to be dominated by the well being of those guys.

This is the bit I don't understand - if these agents are identical to me, then it follows that I'm probably a Boltzmann brain too, as if I have some knowledge that I am not a Boltzmann brain, this would be a point of difference. In which case, surely I should optimise for the very near future even under old-fashioned causal decision theory. Like you, I wouldn't bite this bullet.

I think Boltzmann brains in the classical formulation of random manifestation in vacuum are a non-issue, as neither can they benefit from our reasoning (being random, while reason assumes a predictable universe) nor from our utility maximization efforts (since maximizing our short-term utility will make it no more or less likely that a Boltzmann brain with the increased utility manifests).

I've posted about this before, but there are many aspects of AI safety that we can research much more effectively once strong AI is nearer to realization. If people today say "AI could be a risk but it would be hard to get a good ROI on research dollars invested in AI safety today", I'm inclined to agree.

Therefore, it won't simply be interest in X-risk, but feasibility of concrete research plans on how to reduce it that help advance any AI safety agenda.

policy-makers and research funders will begin to respond to the AGI safety challenge, just like they began to respond to... synbio developments in the 2010s.

What are we referring to here? As in, what synbio developments and how did they respond to them?

Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!"

We became better at constructing nuclear power plants, and nuclear bombs became cleaner. What critics are saying is that as AI advances, our control over it advances as well. In other words, the better AI becomes, the better we become at making AI work as expected. Because if AI became increasingly unreliable as its power grew, AI would cease to be a commercially viable product.

That's one of the standard responses to the MIRI argument, but not the same as the Artificial Philosopher response. I call it the SIRI versus MIRI response.

We became better at constructing nuclear power plants, and nuclear bombs became cleaner.

That would be small comfort if WWIII erupted triggering a nuclear winter.

...AI would cease to be a commercially viable product.

A doomsday device doesn't have to be a commercially viable product. It just has to be used, once.

Unless you can show it is reasonably likely that SIRI will take over the world, that is a Pascal's mugging.

I have my doubts about SIRI, but I think the plausibility of AI risk has already been shown in MIRI's writing and I don't see much point in repeating the arguments here. Regarding Pascal's mugging, I believe in bounded utility functions. So, yeah, something with low probability and dire consequences is important up to a point. But AI risk is not even something I'd say has low probability.

A related question:

Are any governments working on AI projects? Surely the idea has occurred to a lot of military planners and spy agencies that AI would be an extremely potent weapon. What would the world be like if AI is first developed secretly in a government facility in Maryland?

What would the world be like if AI is first developed secretly in a government facility in Maryland?

Fucked.

Fucked.

I'm not sure I would agree with that. Would you mind telling me why you think so?

To my mind, whoever is the first to develop AI has a good chance of having an awesome amount of power. I am not comfortable with anyone having that kind of power, but if I had to pick one person or organization, I would probably pick the United States government.

Would you mind telling me why you think so?

An AI developed by the military (or the security apparatus) will have the goals of the military (or the security). And being developed within the bureaucratic labyrinths of a federal organization ensures that there will be many things wrong with it, things about which no one will know (and so cannot suggest fixing) because it all will be very very secret.

And being developed within the bureaucratic labyrinths of a federal organization ensures that there will be many things wrong with it, things about which no one will know (and so cannot suggest fixing) because it all will be very very secret.

Well if Microsoft develops AI first, won't there be a similar problem?

Not to the same extent, I don't think so.

I would have to disagree. For one thing, if a corporation like Microsoft (or perhaps Google) develops AI, it seems pretty likely that it will be done in secret, if only to prevent competitors from gaining an advantage. Even if the existence of the project is known, the actual details will probably be secret. For another, it seems like it would be difficult for the public to provide meaningful feedback in such a situation.

And have those tricky philosophical nuances been solved? Can there be reliable predictions of AI unfriendliness without such a solution?

Does MIRI have a prediction market on this stuff?

By the time the market closes everyone will have bigger concerns than whatever was being risked on the market.

AGI is already 1-2 decades away. Or 2-5 years if a well-funded project started now. I don't think that is enough time for a meaningful reaction by society, even just its upper echelons.

I would be very concerned about the "out of nowhere" outcome, especially now that the AI winter has thawed. We have the tools, and we have the technology to do AGI now. Why assume that it is decades away?

Why do you think it's so near? I don't see many others taking that position even among those who are already concerned about AGI (like around here).

This is my adopted long-term field -- though professionally I work as a bitcoin developer right now -- and those estimates are my own. 1-2 decades is based on existing AGI work such as OpenCog, and what is known about generalizations to narrow AI being done by Google and a few smaller startups. These are reasonable extrapolations based on published project plans, the authors' opinions, and my own evaluation of the code in the case of OpenCog. 5 years is what it would take if money were not a concern. 2 years is based on my own, unpublished simplification of the CogPrime architecture meant as a blitz to seed-stage oracle AGI, under the same money-is-no-concern conditions.

The only extrapolations I've seen around here, e.g. by lukeprog, involve statistically sampling AI researchers' opinions. Stuart Armstrong showed a year or two ago just how inaccurate this method is historically, as well as concrete reasons for why such statistical methods are useless in this case.

You rate your ability to predict AI above AI researchers? It seems to me that at best, I as an independent observer should give your opinion about as much weight as any AI researcher. Any concerns with the predictions of AI researchers in general should also apply to your estimate. (With all due respect.)

This is required reading for anyone wanting to extrapolate AI researcher predictions:

https://intelligence.org/files/PredictingAI.pdf

In short, asking AI researchers (including myself) their opinions is probably the worst way to get an answer here. What you need to do instead is learn the field, try your hand at it yourself, ask AI researchers what they feel are the remaining unsolved problems, investigate those answers, and most critically form your own opinion. That's what I did, and where my numbers came from.

If several people follow this procedure, I would expect to get a better estimate from averaging their results than trying it out for myself.

That's a reasonable expectation. But in as much as one can expect AI researchers to have gone through this exercise in the past (this is where the problem is, I think), the data is apparently not predictive. Kaj Sotala and Stuart Armstrong looked at this in some detail, with MIRI funding. Some highlights:

"There is little difference between experts and non-experts" "There is little difference between current predictions, and those known to have been wrong previously" "It is not unlikely that recent predictions are suffering from the same biases and errors as their predecessors"

http://lesswrong.com/lw/e36/ai_timeline_predictions_are_we_getting_better/

https://intelligence.org/files/PredictingAI.pdf

In other words, asking AI experts is about as useless as it can get when it comes to making predictions about future AI developments. This includes myself, objectively. What I advocate people do instead is what I did: investigate the matter yourself and make your own evaluation.

It sounds to me as though you are aware that your estimate for when AI will arrive is earlier than most estimates, but you're also aware that the reference class of which your estimate is a part is not especially reliable. So instead of pushing your estimate as the one true estimate, you're encouraging others to investigate in case they discover what you discovered (because if your estimate is accurate, that would be important information). That seems pretty reasonable. Another thing you could do is create a discussion post where you lay out in detail the specific steps you took to come to the conclusion that AI will come relatively early, and get others to check your work directly that way. It could be especially persuasive if you were to contrast the procedure you think was used to generate other estimates and explain why you think that procedure was flawed.

"What I discovered" was that all the pieces for a seed AGI exist, are demonstrated to work as advertised, and could be assembled together rather quickly if adequate resources were available to do so. Really all that is required is rolling up our sleeves and doing some major integrative work in putting the pieces together.

With designs that are public knowledge (albeit not contained in one place), this could be done as a well-funded project on the order of 5 years -- an assessment that concurs with what is said by the leaders of the project I am thinking of as well.

My own unpublished contribution is a refinement of this particular plan which strips out those pieces not strictly needed for a seed UFAI (these components being learnt by the AI rather than hand coded), and tweaks the remaining structure slightly in order to favor self-modifying agents. The critical path here is 2 years assuming infinite resources, but more scarily the actual resources needed are quite small. With the right people it could be done in a basement in maybe 3-4 years and take the world by storm.

But here's the conundrum, as was mentioned in one of the other sub-threads: how do I convince you of that, without walking you through the steps involved in creating an UFAI? If I am right, I would then have posted on the internet blueprints for the destruction of humankind. Then the race would really be on.

So what can I do, except encourage people to walk the same path I did, and see if they come to the same conclusions?

But here's the conundrum, as was mentioned in one of the other sub-threads: how do I convince you of that, without walking you through the steps involved in creating an UFAI? If I am right, I would then have posted on the internet blueprints for the destruction of humankind. Then the race would really be on.

That's assuming people take you seriously. Even if your plan is solid, probably most people will write you off as another Crackpot Who Thinks He's Solved an Important Problem.

But I do agree it's a bit of a conundrum. If you have what you think is an important idea, it's natural to worry that people will either (1) steal your idea or (2) criticize it not because it's not a great idea but because they want to feel superior.

But I do agree it's a bit of a conundrum. If you have what you think is an important idea, it's natural to worry that people will either (1) steal your idea or (2) criticize it not because it's not a great idea but because they want to feel superior.

I think you entirely missed the point.

I think you entirely missed the point.

I would agree with this in the sense that my stated reasons for the "conundrum" are a bit different from yours.

Well perhaps instead of insinuating motives, you could share your thoughts about the actual stated reason? At what point does one have a moral obligation not to share information about a dangerous idea on a public forum?

Well perhaps instead of insinuating motives,

I was thinking of my own motives in similar situations, sorry if you took it as a characterization of yours. I do see it could have been read that way.

you could share your thoughts about the actual stated reason?

I would suggest you e-mail your blueprint to a few of the posters here with the understanding they keep it to themselves. If even one long-term poster says "I've read Friedenbach's arguments and while they are confidential, I now agree that his estimate of the time to AI is actually pretty good," then I think your argument is starting to become persuasive.

Sorry I didn't mean to come off so abrasively either. I was just being unduly snarky. The internet is not good for conveying emotional state :\

My own unpublished contribution is a refinement of this particular plan which strips out those pieces not strictly needed for a seed UFAI (these components being learnt by the AI rather than hand coded), and tweaks the remaining structure slightly in order to favor self-modifying agents. The critical path here is 2 years assuming infinite resources, but more scarily the actual resources needed are quite small. With the right people it could be done in a basement in maybe 3-4 years and take the world by storm.

If you've solved stable self-improvement issues, that's FAI work, and you should damn well share that component.

Read the OP, I didn't make any boisterous claims. I simply said UFAI is 2-5 years away, focused effort, and 10-20 years away otherwise. I therefore believe it important that FAI research be refocused on near-term solutions. I state so publicly in order to counter the entrenched meme that seems to have infected everyone here, saying that AI is X years away, where X is some arbitrary number that by golly seems like a lot, in the hope that some people who encounter the post consider refocusing on near-term work. What's wrong with that?

Disregard my reply. I really shouldn't be posting from my phone at 2 AM. Such a venture rarely ends well.

OpenCog

Hey, speaking as an AI layman, how do you rate the odds that a design based on OpenCog could foom? I haven't really dug into that codebase, but from reading the Wiki it's my impression that it's a bit of a heap left behind by multiple contributors trying to make different parts of it work for their own ends, and if a coherent whole could be wrought from it it would be too complex to feasibly understand itself. In that sense: how far out do you think OpenCog is from containing a complete operational causal model of its own codebase and operation? How much of it would have to be modified or rewritten to reach this point?

This is my adopted long-term field -- though professionally I work as a bitcoin developer right now -- and those estimates are my own. 1-2 decades is based on existing AGI work such as OpenCog, and what is known about generalizations to narrow AI being done by Google and a few smaller startups. These are reasonable extrapolations based on published project plans, the authors' opinions, and my own evaluation of the code in the case of OpenCog. 5 years is what it would take if money were not a concern. 2 years is based on my own, unpublished simplification of the CogPrime architecture meant as a blitz to seed-stage oracle AGI, under the same money-is-no-concern conditions.

I don't really entirely endorse the algorithms behind OpenCog and such, but I do share the forecasting timeline. Modern work in hierarchical learning, probabilities over sentences (and thus: learning and inference over structured knowledge), planning as inference... basically, I've been reading enough papers to say that we're definitely starting to see the pieces emerge that embody algorithms for actual, human-level cognition. We will soon confront the question, "Yes, we have all these algorithms, but how do we put them together into an agent?"

I also think that most if not all parts needed for AGI are already there and 'only' need to be integrated. But that is actually a hard part. Kind of comparable to our understanding of the human brain: We know how most modules work - or at least how we can produce comparable results - but not how these are integrated. Just adding a meta level to Cog and plugins for domain specific modules at least wouldn't do.

20 years is on the very soon end of plausible; but 2-5 years is absolutely impossible. We just don't have the slightest notion how we would do that, regardless of funding.

We do not have the tools or technology right now; it won't come out of the blue.

We just don't have the slightest notion how we would do that, regardless of funding.

Really? And what's that opinion based on? Are you an expert in the field? I very often see this meme quoted, but no explanation to back it up.

I'm a computer scientist that has been following the AI / AGI literature for years. I have been doing my own private research (since publishing AGI work is too dangerous) based on OpenCog, pretty much since it was first open sourced, and a few other projects. I've looked at the issues involved in creating a seed AGI, while creating my own design for just such a system. And they are all solvable, or more often already solved but not yet integrated.

I'm a computer scientist who has been in a machine learning and natural language processing PhD program quite recently. I have an in-depth knowledge of machine learning, NLP and text mining.

In particular, I know that the broadest existing knowledge bases in the real world (e.g. Google's Knowledge Graph) are built on a hodge-podge of text parsing and logical inference techniques. These systems can be huge in scale and very useful, and reveal that a lot of knowledge is quite shallow even if it is apparently deeper, but also reveal the difficulty in dealing with knowledge that genuinely is deeper, by which I mean it relies on complex models of the world.

I am not familiar with OpenCog, but I do not see how it can address these sorts of issues.

The pitfall with private research is that nobody sees your work, meaning there's nobody to criticize it or tell you your assessment "the issues are solvable or solved but not yet integrated" is incorrect. Or, if it is correct and I'm dead wrong in my pessimism, nobody can know that either. Why would publishing it be dangerous (yeah, I get the general "AGI can be dangerous" thing, but what would be the actual marginal danger vs. not publishing and being left out of important conversations when they happen, assuming you've got something)?

In terms of practicalities, AI and AGI share two letters in common, and that's about it. OpenCog / CogPrime is at core nothing more than an interface language specification built on hypergraphs which is capable of storing inputs, outputs, and trace data for any kind of narrow AI application. It is most importantly a platform for integrating narrow AI techniques. (If you read any of the official documentation, you'll find most of it covers the specific narrow AI components they've selected, and the specific interconnect networks they are deploying. But those are secondary details to the more important contribution: the universal hypergraph language of the atomspace.)

So when you say:

I am not familiar with OpenCog, but I do not see how it can address these sorts of issues.

It doesn't really make sense. OpenCog solves these issues in the same way: through traditional text parsing and logical inference techniques. What's different is that the inputs, outputs, and the way in which these components are used are fully specified inside of the system, in a data structure that is self-modifying. Think LISP: code is data (albeit using a weird hypergraph language instead of s-expressions), data is code, and the machine has access to its own source code.
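To make the "code is data" remark concrete, here is a minimal sketch in Python using nested tuples as a stand-in for s-expressions. This is not OpenCog's atomspace or its hypergraph language; the names `OPS`, `evaluate`, and `rewrite` are hypothetical and exist only for this illustration. It shows a program held as an ordinary data structure, executed by one function and transformed by another.

```python
import operator

# Hypothetical two-operator language; each expression is a number or (op, left, right).
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    """Run a program that is stored as plain nested tuples."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left), evaluate(right))

def rewrite(expr):
    """Transform the program as data: replace (x * 2) with (x + x)."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = rewrite(left), rewrite(right)
    if op == "*" and right == 2:
        return ("+", left, left)
    return (op, left, right)

program = ("*", ("+", 1, 2), 2)
rewritten = rewrite(program)

print(program, "=>", evaluate(program))
print(rewritten, "=>", evaluate(rewritten))
```

The rewrite rule here is fixed by hand. Representing a program this way is the easy part; deciding which rewrites are worth making is where the difficulty lies.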

That's mostly what AGI is about: the interconnects and reflection layers which allow an otherwise traditional narrow AI program to modify itself in order to adapt to circumstances outside of its programmed expertise.

My two cents here are just:

1) Narrow AI is still the bottleneck to Strong AI, and a feedback loop of development especially in the area of NLP is what's going to eventually crack the hardest problems.

2) OpenCog's Hypergraphs do not seem especially useful. The power of a language cannot overcome the fact that without sufficiently strong self-modification techniques, it will never be able to self-modify into anything useful. Interconnects and reflection just allow a program to mess itself up, not become more useful, and scale or better NLP modules alone aren't a solution.

That's mostly what AGI is about: the interconnects and reflection layers which allow an otherwise traditional narrow AI program to modify itself in order to adapt to circumstances outside of its programmed expertise.

Actually, what AGI is about, by definition, is to achieve human-level or higher performance in a broad variety of cognitive tasks.
Whether self-modification is useful or necessary to achieve such goal is questionable.

Even if self-modification turns out to be a core enabling technology for AGI, we are still quite far from getting it to work.
Just having a language or platform that allows introspection and runtime code generation isn't enough: LISP didn't lead to AGI. Neither did Eurisko. And, while I'm not very familiar with OpenCog, frankly I can't see any fundamental innovation in it.

Representing code as data is trivial. The hard problem is making a machine reason about code.
Automatic program verification is only barely starting to become commercially useful in a few restricted application domains, and automatic programming is still largely undeveloped with very little progress being made beyond optimizing compilers.

Having a machine write code at the level of a human programmer in 2 - 5 years is completely unrealistic, and 20 years looks like the bare minimum, with the realistic expectation being higher.

"Having a machine write code at the level of a human programmer" is a strawman. One can already think about machine learning techniques as the computer writing its own classification programs. These machines already "write code" (classifiers) better than any human could under the same circumstances; it just doesn't look like code a human would write.

A significant piece of my own architecture is basically doing the same thing but with the classifiers themselves composed in a nearly Turing-complete total functional language, which are then operated on by other reflective agents who are able to reason about the code due to its strong type system. This isn't the way humans write code, and it doesn't produce an output which looks like "source code" as we know it. But it does result in programs writing programs faster, better, and cheaper than humans writing those same programs.

Regarding what AGI is "about", yes that is true in the strictest, definitional sense. But what I was trying to convey is how AGI is separate from narrow AI in that it is basically a field of meta-AI. An AGI approaches a problem by first thinking about how to solve the problem. It first thinks about thinking, before it thinks.

And yes, there are generally multiple ways it can actually accomplish that, e.g. the AGI could not actually solve the problem or modify itself to solve the problem, but instead output the source code for a narrow AI which efficiently does so. But if you draw the system boundary large enough, it's effectively the same thing.

"Having a machine write code at the level of a human programmer" is a strawman. One can already think about machine learning techniques as the computer writing its own classification programs. These machines already "write code" (classifiers) better than any human could under the same circumstances; it just doesn't look like code a human would write.

Yes, and my pocket calculator can compute cosines faster than Newton could. Therefore my pocket calculator is better at math than Newton.

A significant piece of my own architecture is basically doing the same thing but with the classifiers themselves composed in a nearly Turing-complete total functional language, which are then operated on by other reflective agents who are able to reason about the code due to its strong type system.

Lots of commonly used classifiers are "nearly Turing-complete".
Specifically, non-linear SVMs, feed-forward neural networks and the various kinds of decision tree methods can represent arbitrary Boolean functions, while recurrent neural networks can represent arbitrary finite state automata when implemented with finite precision arithmetic, and they are Turing-complete when implemented with arbitrary precision arithmetic.
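The representational claim about decision trees can be sketched directly. The following is a minimal illustration, not any library's tree learner: the helper names `build_tree` and `evaluate` are hypothetical. It constructs a full-depth decision tree from a truth table, so any Boolean function of n inputs is represented exactly (at the cost of 2^n leaves). Note that this is about what a tree can represent, not about what a learning algorithm will find from data.

```python
from itertools import product

def build_tree(truth_table, n_vars, prefix=()):
    """Build a full-depth tree: level k splits on input k, leaves hold outputs.

    Internal nodes are (subtree_if_false, subtree_if_true) pairs.
    """
    if len(prefix) == n_vars:
        return truth_table[prefix]
    return (
        build_tree(truth_table, n_vars, prefix + (False,)),
        build_tree(truth_table, n_vars, prefix + (True,)),
    )

def evaluate(tree, inputs):
    """Follow one branch per input bit until a leaf is reached."""
    node = tree
    for bit in inputs:
        node = node[1] if bit else node[0]
    return node

# Example: 3-input parity, a function that no single linear threshold can represent.
N = 3
parity = {bits: sum(bits) % 2 == 1 for bits in product([False, True], repeat=N)}
tree = build_tree(parity, N)

matches = all(evaluate(tree, bits) == parity[bits] for bits in parity)
print("tree reproduces the truth table:", matches)
```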

But we don't exactly observe hordes of unemployed programmers begging in the streets after losing their jobs to some machine learning algorithm, do we?
Useful as they are, current machine learning algorithms are still very far from performing automatic programming.

But it does result in programs writing programs faster, better, and cheaper than humans writing those same programs.

Really? Can your system provide a correct implementation of the FizzBuzz program starting from a specification written in English?
Can it play competitively in a programming contest?
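For readers unfamiliar with the task, FizzBuzz is conventionally specified as: for each integer from 1 to n, output "Fizz" if it is divisible by 3, "Buzz" if divisible by 5, "FizzBuzz" if divisible by both, and the number itself otherwise. A minimal hand-written sketch of that conventional specification follows; it is only meant to show how small the target program is, and it says nothing about either commenter's system.

```python
def fizzbuzz(i):
    """Return the FizzBuzz word for a single positive integer."""
    if i % 15 == 0:
        return "FizzBuzz"
    if i % 3 == 0:
        return "Fizz"
    if i % 5 == 0:
        return "Buzz"
    return str(i)

def fizzbuzz_lines(n):
    """Return the outputs for 1..n as a list of strings."""
    return [fizzbuzz(i) for i in range(1, n + 1)]

print("\n".join(fizzbuzz_lines(15)))
```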

Or, even if your system is restricted to machine learning, can it beat random forests on a standard benchmark?

If it can do no such thing perhaps you should consider avoiding such claims, in particular when you are unwilling to show your work.

And yes, there are generally multiple ways it can actually accomplish that, e.g. the AGI could not actually solve the problem or modify itself to solve the problem, but instead output the source code for a narrow AI which efficiently does so. But if you draw the system boundary large enough, it's effectively the same thing.

Which we are currently very far from accomplishing.

I'm not disagreeing with the general thrust of your comment, which I think makes a lot of sense.

But it is not at all required that an AGI start out with the ability to parse human languages effectively. An AGI is an alien. It might grow up with a completely different sort of intelligence, and only at the late stages of growth have the ability to interpret and model human thoughts and languages.

We consider "write fizzbuzz from a description" to be a basic task of intelligence because it is for humans. But humans are the most complicated machines in the solar system, and we are naturally good at dealing with other humans because we instinctively understand them to some extent. An AGI may be able to accomplish quite a lot before human-style intelligence can be comprehended using raw general intelligence and massive amounts of data and study.

I agree that natural language understanding is not a necessary requirement for an early AGI, but I would say that by definition an AGI would have to be good at the sort of cognitive tasks humans are good at, even if communication with humans was somehow difficult.
Think of making first contact with an undiscovered human civilization, or better, a civilization of space-faring aliens.

... raw general intelligence ...

Note that it is unclear whether there is any way to achieve "general intelligence" other than by combining lots of modules specialized for the various cognitive tasks we consider to be necessary for intelligence.
I mean, Solomonoff induction, AIXI and the like do certainly look interesting on paper, but the extent to which they can be applied to real problems (if it is even possible) without any specialization is not known.

The human brain is based on a fairly general architecture (biological neural networks), instantiated into thousands of specialized modules. You could argue that biological evolution should be included into human intelligence at a meta level, but biological evolution is not a goal-directed process, and it is unclear whether humans (or human-like intelligence) were a likely outcome or a fortunate occurrence.

Anyway, even if it turns out that "universal induction" techniques are actually applicable to a practical human-made AGI, given the economic interests of humans I think that before seeing a full AGI we should see lots of improvements in narrow AI applications.

I agree that natural language understanding is not a necessary requirement for an early AGI, but I would say that by definition an AGI would have to be good at the sort of cognitive tasks humans are good at, even if communication with humans was somehow difficult.

I think we're now saying the same thing, but to be clear: I don't think it follows at all that an AGI needs to be good at X, for any interesting X, in order to be considered an AGI. No, it has the meta-level condition instead: it must be able to become good at X, if doing so accomplishes its goals and it is given suitable inputs and processing power to accomplish that learning task.

Indeed, my blitz AGI design involves no natural language processing components, at all. The initial goal loading and debug interfaces would be via a custom language best described as a cross between vocabulary-limited Lojban and a strongly typed functional programming language. Having looked at the best approaches to NLP so far (Watson et al), and expert opinions on what would be required to go beyond that and build a truly human-level understanding of language, I found nothing that could not be rediscovered and developed by a less capable seed AI, if given sufficient resources and time.

Note that it is unclear whether there is any way to achieve "general intelligence" other than by combining lots of modules specialized for the various cognitive tasks we consider to be necessary for intelligence.

Ok, try this experiment: start with a high-level diagram of what you would consider to be a complete human-level AGI design, e.g. able to do everything a human can do, as good or better. I think we're on the same page in assuming that at least on one level it would consist of a ton of little specialized programs handling the various specialized aspects of human intelligence. Enumerate all of these, and take a guess at how they are interconnected. I doubt you'll be able to fit it all in one sheet of paper, or even 10. Here's a start based on OpenCog, but there are lots more details you will need to fill in:

http://goertzel.org/MonsterDiagram.jpg

Now consider each component in turn. If you cut that component out of the diagram (perhaps rearranging some of the connections as necessary), could you reliably recreate it with the remaining pieces, if tasked with doing so and given the necessary inputs and processing power? If so, get rid of it. If not, ask: what are the minimum (less than human-level) capabilities required, which let you recreate the rest? Replace with that. Continue until the design can't be simplified further.

This experiment is a form of local search, and you may have to repeat from different starting points, or employ other global search methods to be sure that you are arriving at something close to the global minimum seed AGI design, but as an exercise I hope it gets the point across.
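As a rough illustration of the pruning loop described above, here is a minimal Python sketch. Everything in it is hypothetical: the component names, the `REBUILDABLE_FROM` table, and the `can_recreate` predicate are invented stand-ins for a judgment call that in reality is the hard part. It also covers only the "get rid of it" step, not the "replace with a weaker version" step.

```python
from typing import Callable, FrozenSet, Set


def prune_design(
    components: Set[str],
    can_recreate: Callable[[str, FrozenSet[str]], bool],
) -> Set[str]:
    """Greedy local search: drop any component the rest could rebuild."""
    design = set(components)
    changed = True
    while changed:
        changed = False
        for part in sorted(design):  # sorted only to make the order deterministic
            rest = frozenset(design - {part})
            if can_recreate(part, rest):
                design.remove(part)
                changed = True
                break  # rescan the smaller design from the start
    return design


# Invented toy table: for each component, the sets of other components
# that would be enough to rebuild it. An empty list means "irreducible".
REBUILDABLE_FROM = {
    "natural_language": [{"induction", "memory"}],
    "vision": [{"induction", "memory"}],
    "planning": [{"induction"}],
    "memory": [],
    "induction": [],
}


def toy_can_recreate(part: str, rest: FrozenSet[str]) -> bool:
    # A component is removable if any one of its recipes is still available.
    return any(needed <= rest for needed in REBUILDABLE_FROM[part])


seed = prune_design(set(REBUILDABLE_FROM), toy_can_recreate)
print(sorted(seed))  # ['induction', 'memory']
```

Because the loop is greedy, the result can depend on the order in which components are tried, which is why repeating it from different starting points matters.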

The basic AGI design I arrived at involved a dozen different "universal induction" techniques with different strengths, a meta-architecture for linking them together, a generic and powerful internal language for representing really anything, and basic scaffolding to stand in for the rest. It's damn slow and inefficient at first, but like a human infant a good portion of its time would be spent "dreaming" where it analyzes its acquired memories and seeks improvements to its own processes... and gains there have multiplying effects. Don't discount the importance of power-law mechanisms.

On the subject of recurrent neural networks, keep in mind that you are such a network, and training you to write code and write it well took years.

A significant piece of my own architecture is basically doing the same thing but with the classifiers themselves composed in a nearly Turing-complete total functional language, which are then operated on by other reflective agents who are able to reason about the code due to its strong type system.

Hmmm... Do you have a completeness result? I mean, I can see that if you make it a total language, you can just use coinduction to reason about indefinite computing processes, but I'm wondering what sort of internal logic you're using that would allow complete reasoning over programs in the language and decidable typing (since to have the agent rewrite its own code it will also have to type-check its own code).

Current theorem-proving systems like Coq that work in logics this advanced usually have undecidable type inference somewhere, and require humans to add type annotations sometimes.

Personal opinion: OpenCog is attempting to get as general as it can within the logic-and-discrete-maths framework of Narrow AI. They are going to hit a wall as they try to connect their current video-game-like environment to the real world, and find that they failed to integrate probabilistic approaches reasonably well. Also, without probabilistic approaches, you can't get around Rice's Theorem to build a self-improving agent.

Wellll.... the agent could make "narrow" self-improvements. It could build a formal specification for a few of its component parts and then perform the equivalent of provable compiler optimizations. But it would have a very hard time strengthening its core logic, as Rice's Theorem would interfere: proving that certain improvements are improvements (or, even, that the optimized program performs the same task as the original source code) would be impossible.

But it would have a very hard time strengthening its core logic, as Rice's Theorem would interfere: proving that certain improvements are improvements (or, even, that the optimized program performs the same task as the original source code) would be impossible.

This seems like the wrong conclusion to draw. Rice's theorem (and other undecidability results) imply that there exist optimizations that are safe but cannot be proven to be safe. It doesn't follow that most optimizations are hard to prove. One imagines that software could do what humans do -- hunt around in the space of optimizations until one looks plausible, try to find a proof, and then if it takes too long, try another. This won't necessarily enumerate the set of provable optimizations (much less the set of all enumerations), but it will produce some.

One imagines that software could do what humans do -- hunt around in the space of optimizations until one looks plausible, try to find a proof, and then if it takes too long, try another. This won't necessarily enumerate the set of provable optimizations (much less the set of all enumerations), but it will produce some.

To do that it's going to need a decent sense of probability and expected utility. Problem is, OpenCog (and SOAR, too, when I saw it) is still based in a fundamentally certainty-based way of looking at AI tasks, rather than one focused on probability and optimization.

Problem is, OpenCog (and SOAR, too, when I saw it) is still based in a fundamentally certainty-based way of looking at AI tasks, rather than one focused on probability and optimization.

Uh, what were you looking at? The basic foundation of OpenCog is a probabilistic logic called PLN (the wrong one to be using, IMHO, but a probabilistic logic nonetheless). Everything in OpenCog is expressed and reasoned about in probabilities.

Aaaaand now I have to go look at OpenCog again.

To do that it's going to need a decent sense of probability and expected utility. Problem is, OpenCog (and SOAR, too, when I saw it) is still based in a fundamentally certainty-based way of looking at AI tasks, rather than one focused on probability and optimization.

I don't see why this follows. It might be that mildly smart random search, plus a theorem prover with a fixed timeout, plus a benchmark, delivers a steady stream of useful optimizations. The probabilistic reasoning and utility calculation might be implicit in the design of the "self-improvement-finding submodule", rather than an explicit part of the overall architecture. I don't claim this is particularly likely, but neither does undecidability seem like the fundamental limitation here.
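To make the "random search, plus a prover with a fixed timeout, plus a benchmark" loop concrete, here is a toy Python sketch. It is an illustration under heavy assumptions, not a real self-improving system: the "prover" is just an exhaustive equivalence check over a finite domain, the timeout is a step budget, and the `cost` numbers are invented benchmark scores.

```python
import random
from typing import Callable, Iterable, List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    fn: Callable[[int], int]
    cost: int  # invented benchmark score; lower is better


def baseline(n: int) -> int:
    """Reference program: sum of 0..n computed by a loop."""
    return sum(range(n + 1))


def prove_equivalent(
    fn: Callable[[int], int],
    reference: Callable[[int], int],
    domain: Iterable[int],
    budget: int,
) -> Optional[bool]:
    """Toy 'prover': True/False if it finishes, None if the budget runs out."""
    for steps, x in enumerate(domain):
        if steps >= budget:
            return None  # stands in for hitting the timeout
        if fn(x) != reference(x):
            return False
    return True


def search(candidates: List[Candidate], budget: int, seed: int = 0) -> Candidate:
    pool = list(candidates)
    random.Random(seed).shuffle(pool)  # the "random search" is just a shuffle here
    best = Candidate("baseline", baseline, cost=100)
    for cand in pool:
        if cand.cost >= best.cost:
            continue  # the benchmark says this is no improvement
        verdict = prove_equivalent(cand.fn, baseline, range(200), budget)
        if verdict:  # timeouts (None) and refutations (False) are both skipped
            best = cand
    return best


CANDIDATES = [
    Candidate("closed_form", lambda n: n * (n + 1) // 2, cost=1),
    Candidate("off_by_one", lambda n: n * (n - 1) // 2, cost=1),
    Candidate("loop_variant", lambda n: sum(range(n + 1)), cost=50),
]

print(search(CANDIDATES, budget=1000).name)  # closed_form
print(search(CANDIDATES, budget=10).name)    # baseline
```

With a generous budget the correct closed form is accepted and the buggy one is rejected. With a tight budget the correct optimizations are simply never adopted, which matches the point above: some safe improvements go unproven, but nothing unsafe gets through.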

I have trouble trusting your expert opinion because it is not clear to me that you are an expert in the field, though you claim to be. Google doesn't point to any of your research in the area, and I can find no mention of your work beyond bitcoin by any (other) AI researchers. Feel free to link to anything corroborating your claims.

I have as much credibility as Eliezer Yudkowsky in that regard, and for the same reason. As I mention in the post you replied to, my work is private and unpublished. None of my work is accessible to the internet, as it should be. I consider it unethical to be publishing AGI research given what is at stake.

I have as much credibility as Eliezer Yudkowsky in that regard

That is, not very much.
But at least Eliezer Yudkowsky and pals have made an effort to publish arguments for their position, even if they haven't published in peer-reviewed journals or conferences (except some philosophical "special issue" volumes, IIRC).

Your "Trust me, I'm a computer scientist and I've fiddled with OpenCog in my basement but I can't show you my work because humans are not ready for it" gives you even less credibility.

I have as much credibility as Eliezer Yudkowsky in that regard, and for the same reason.

Eliezer published a lot of relevant work; I have seen none from you.