Discussion: Yudkowsky's actual accomplishments besides divulgation

by Raw_Power · 1 min read · 25th Jun 2011 · 115 comments



Basically this: "Eliezer Yudkowsky writes and pretends he's an AI researcher but probably hasn't written so much as an Eliza bot."

While the Eliezer S. Yudkowsky site has lots of divulgation articles, and his work on rationality is of indisputable value, I find myself at a loss when I want to respond to this, which frustrates me very much.

So, to avoid this sort of situation in the future, I have to ask: What did the man, Eliezer S. Yudkowsky, actually accomplish in his own field?

Please don't downvote the hell out of me, I'm just trying to create a future reference for this sort of annoyance.


Please don't downvote the hell out of me, I'm just trying to create a future reference for this sort of annoyance.

It is actually very important. He is a figurehead when it comes to risks from AI. To better estimate the claims he makes, including the SIAI's capability to mitigate risks from AI, we need to know whether he is one hell of an entrepreneur or a really good mathematician - or, failing that, whether the other people who work for the SIAI are sufficiently independent of his influence.

Yeah, but given how easy it is to collect karma points simply by praising him even without substantiating the praise (yes, I have indulged in "karma whoring" once or twice), I was afraid of the backlash.

A recurring theme here seems to be "grandiose plans, left unfinished". I really hope this doesn't happen with this project. The worst part is, I really understand the motivations behind those "castles in the sky" and... bah, that's for another thread.

...given how easy it is to collect karma points simply by praising him even without substantiating the praise...

There is praise everywhere on the Internet, and in the case of Yudkowsky it is very much justified. People actually criticize him as well. The problem is some of the overall conclusions, extraordinary claims and ideas. They might be few compared to the massive amount of rationality material, but if they are faulty they can easily outweigh all the other good work.

Note that I am not saying that any of those ideas are wrong, but I think people here are too focused on, and dazzled by, the mostly admirable and overall valuable writings on the basics of rationality.

Really smart and productive people can be wrong, especially if they think they have to save the world. And if someone admits:

I mean, it seems to me that where I think an LW post is important and interesting in proportion to how much it helps construct a Friendly AI, how much it gets people to participate in the human project...

...I am even more inclined to judge the output of that person in the light of his goals.

To put it bluntly, people who focus on unfriendly AI might miss the weak spots that are more likely to be unf…

timtyler (+9): There's some truth to that - but I can't say I am particularly sold on the FHI either. Yudkowsky seems less deluded about brain emulation than they are. Both organisations are basically doom-mongering. Doom-mongers are not known for their sanity or level-headedness [http://www.existentialrisk.com/faq.html#_Toc295939366]:

It seems difficult to study this subject and remain objective. Those organisations that have tried so far have mostly exaggerated the prospects for the end of the world. They form from those who think the end of the world is more likely than most, associate with others with the same mindset, and their funding often depends on how convincing and dramatic a picture of DOOM they can paint. The results tend to lead to something of a credibility gap.
Kaj_Sotala (+3): In what way do you consider them to be deluded about brain emulation? While I agree that in general, organizations have an incentive to doom-monger in order to increase their funding, I'm not so sure this applies to FHI. They're an academic department associated with a major university. Presumably their funding is more tied to their academic accomplishments, and academics tend to look down on excessive doom-mongering.
CarlShulman (+9): My understanding is that Tim thinks de novo AI is very probably very near, leaving little time for brain emulation, and that far more resources will go into de novo AI, or that incremental insights into the brain would enable AI before emulation becomes possible. On the other hand, FHI folk are less confident that AI theory will cover all the necessary bases in the next couple decades, while neuroimaging continues to advance apace. If neuroimaging at the relevant level of cost and resolution comes quickly while AI theory moves slowly, processing the insights from brain imaging into computer science may take longer than just running an emulation.
jsalvatier (+8): I am under the impression that SIAI is well aware that they could use more of an appearance of seriousness.
Raw_Power (+5): Yeah, discussing rationality in a clown suit is an interesting first step in learning an Aesop about how important it is to focus on the fundamentals over the forms, but you can't deny it's unnecessarily distracting, especially to outsiders, i.e. most of humanity and therefore most of the resources we need. BTW, I love the site's new skin.
Raw_Power (+5): Oh, I'm not saying he doesn't deserve praise; the guy's work changed my life forever. I'm just saying I got points for praising him without properly justifying it on more than one occasion, which I feel guilty about. I also don't think he should be bashed for the sake of bashing, or subjected to other gratuitous Why Our Kind Can't Cooperate [http://lesswrong.com/lw/3h/why_our_kind_cant_cooperate/] dissent.
Bongo (0): Links please.

A possibly incomplete list of novel stuff that Eliezer did (apart from popularizing his method of thinking, which I also like):

1) Locating the problem of friendly AI and pointing out the difficult parts.

2) Coming up with the basic idea of TDT (it was valuable to me even unformalized).

3) The AI-box experiments.

4) Inventing interesting testcases for decision-making, like torture vs. dust specks and Pascal's mugging.

While most of that list seems accurate, one should note that torture vs. dust specks is a variant of a standard issue with utilitarianism that would be discussed in a lot of intro-level philosophy classes.

Raw_Power (+7): What do they discuss in outro-level classes? I mean, I've tried to get into contemporary, post-Marcuse philosophy, but unnecessary jargon, Sequences-like recursiveness without the benefit of hypertexts or even an index, and the incredible amount of repetitive, pointless bickering ended up exhausting my patience.

In the light of what we've seen in this article, would it be fair to call Yudkowsky a Philosopher, or is that seen as a badge of dishonour in the productive-work-centric Anglo-Saxon world? In France at least it's seen as Badass and legendarily awesome, and a very legitimate profession, same as Écrivain (writer).
Peterdjones (+5): Have you tried getting into analytical philosophy?

Actually, I was talking about analytical philosophy. At least "continental" philosophy (I hate that term, honestly; it's very Anglo-centric) is built out of huge, systematized books (which don't allow bickering as easily: people make their point and then move on to explain what they have come up with), and since there are fewer works to reference, it's easier to make a genealogy of big core books and work your way along that. Criss-crossing papers in periodical publications are far less convenient, especially if you don't have a subscription or access to archives.

No, what exhausted my patience with the continentals is that the vampire bloodlines are long rather than tangled, and that some authors seem to go out of their way not to be understood. For example, Nietzsche's early books were freaking full of contemporary pop references that make annotations indispensable, and turn the reading into a sort of TV Tropes Wiki Walk through XIXth-century media rather than the lesson in human nature it's supposed to be. Not to say it isn't fun, but such things have a time and place, and that is not it. Others seem to rely on Department Of Redundancy Department to reinforce their points:…

Raw_Power (-1): Actually I'm familiar with Claude Levi-Strauss [http://fr.wikipedia.org/wiki/Claude_Lévi-Strauss] and his analytical theory of anthropology. That guy is so full of bullshit. Did you see how he rapes topology and then pimps it down a dark alley to people who won't even appreciate it?
MixedNuts (+5): IAWYC, but please use a metaphor that won't trigger a large subset of your readership.
Raw_Power (-1): Does it make it okay if I make it gender-neutral?
Benquo (+8): The problem is the use of the rape metaphor, not the gender pronoun.
Peterdjones (+2): Not what I meant [http://en.wikipedia.org/wiki/Analytical_philosophy]

On a related note... has Eliezer successfully predicted anything? I'd like to see his beliefs pay rent, so to speak. Has his interpretation of quantum mechanics predicted any phenomena which have since been observed? Has his understanding of computer science and AI led him to accurately predict milestones in the field before they happened?

[anonymous] (-1): All in all, "beliefs paying rent" is not about making big predictions in an environment where you are prohibitively uncertain (re: No One Knows What Science Doesn't Know [/lw/kj/no_one_knows_what_science_doesnt_know/]), but rather that you should not believe anything just because it is interesting. The beliefs that pay rent are ones such as "things fall down when dropped," which are readily testable and constrain your anticipation accordingly: "I do not expect anything to fall upwards." (Helium balloons are a notable exception, but for that see Leaky Generalizations [/lw/lc/leaky_generalizations/].) The ones that don't pay rent are ones such as "humans have epiphenomenal inner listeners [/lw/p7/zombies_zombies/]," as that completely fails to constrain what you anticipate to experience.

Well, here are my two cents. (1) It isn't strictly correct to call him an AI researcher. A more correct classification would be something like AGI theorist; more accurate still would be FAI theorist. (2) Normal_Anomaly mentioned his TDT stuff, but of course that is only one of his papers. Will Newsome mentioned CFAI. I would add to that list the Knowability of FAI paper, his paper coauthored with Nick Bostrom, Coherent Extrapolated Volition, Artificial Intelligence as a Positive and Negative Factor in Global Risk, and LOGAI.

He (as I understand it, though perhaps I am wrong about this) essentially invented the field (Friendly Artificial General Intelligence) as an area for substantial study and set out some basic research programs. The main one seems to be a decision theory for an agent in a general environment that is capable of overcoming the issues that current decision theories have, mainly that they do not always give the action that we would recognize as having the greatest utility relative to our utility function.

He got Peter Thiel to donate $1.1 million to the SIAI, which you should take as a sign of EY's potential and achievements.

Innovation in any area is a team effort. In his efforts to create friendly AI, EY has at least one huge accomplishment: creating a thriving organization devoted to creating friendly AI. Realistically, this accomplishment is almost certainly more significant than any set of code he alone could have written.

He got Peter Thiel to donate $1.1 million to the SIAI, which you should take as a sign of EY's potential and achievements.

Isn't that potentially double-counting evidence?

wedrifid (-1): Not by itself (unless you happen to be Peter Thiel). It would become double-counting evidence if you, say, counted both the information contained in Peter Thiel's opinion and then also counted the SIAI's economic resources.

He got Peter Thiel to donate $1.1 million to the SIAI, which you should take as a sign of EY's potential and achievements.

It shows marketing skill. That doesn't necessarily indicate competence in other fields - and this is an area where competence is important. Especially so if you want to participate in the race - and have some chance of actually winning it.

[anonymous] (0): Indeed. Antonino Zichichi is a far worse physicist than pretty much any Italian layman believes, even though it was he who got the Gran Sasso laboratories funded.

He got Peter Thiel to donate $1.1 million to the SIAI, which you should take as a sign of EY's potential and achievements.

That's a huge achievement. But don't forget that he wasn't able to convince him that the SIAI is the most important charity:

  • In February 2006, Thiel provided $100,000 of matching funds to back the Singularity Challenge donation drive of the Singularity Institute for Artificial Intelligence.
  • In September 2006, Thiel announced that he would donate $3.5 million to foster anti-aging research through the Methuselah Mouse Prize foundation.
  • In May 2007, Thiel provided half of the $400,000 in matching funds for the annual Singularity Challenge donation drive.
  • On April 15, 2008, Thiel pledged $500,000 to the new Seasteading Institute, directed by Patri Friedman, whose mission is "to establish permanent, autonomous ocean communities to enable experimentation and innovation with diverse social, political, and legal systems".

I wouldn't exactly say that he was able to convince him of risks from AI.

[anonymous] (+11):

And the logical next question... what is the greatest technical accomplishment of anyone in this thriving organization? Ideally in the area of AI. Putting together a team is an accomplishment proportional to what we can anticipate the team to accomplish. If there is anyone on this team that has done good things in the area of AI, some credit would go to EY for convincing that person to work on friendly AI.

Eh, it looks like we're becoming the New Hippies or the New New Age. The "sons of Bayes and 4chan" instead of "the sons of Marx and Coca-Cola". Lots of theorizing, lots of self-improvement and wisdom-generation, some of which is quite genuine, lots of mutual reassuring that it's the rest of the world that's insane and of breaking free of oppressive conventions... but under all the foam surprisingly little is actually getting done, apparently.

However, humanity might look back on us forty years from now and say: "those guys were pretty awesome, they were so avant la lettre, of course, the stuff they thought was so mindblowing is commonplace now, and lots of what they did was pointless flailing, but we still owe them a lot".

Perhaps I am being overly optimistic. At least we're having awesome fun together whenever we meet up. It's something.

What is "divulgation"? (Yes, I googled it.) My best guess is that you are not a native speaker of English and this is a poor translation of the cognate you are thinking of.

Yes, "divulgation" (or cognates thereof) is the word used in Romance languages to mean what we call "popularization" in English.

MixedNuts (+7): The action of revealing stuff that wasn't previously known to the public.
Raw_Power (+6): Not exactly a poor translation; more like the word exists in English with this meaning, but is used much more rarely than in my own language. I vote for the revitalization of Latin as a Lingua Franca: science would be much easier for the common folk if they knew how crude the metaphors its words are made of are. Blastula: small seed. Blastoid: thing that resembles a seed. Zygote: egg. Ovule: egg. Etc.

Eeer... I mean, like, when you aren't writing for peers but for other people, so they can access the fruit of your research without all the travelling through the inferential distances. I think they call it "popular science" or something, but I never liked that term; it kinda evokes the image of scientists selling records of their lectures and churning out "science videos"... Actually that'd be kinda cool, now that I think of it. *mind wanders offtopic to the tune of MC Hawking [http://www.youtube.com/watch?v=2knWCuzcdJo]*
komponisto (+8): The word you want in English is popularization. (Which, you'll note, is also Latin-derived!)
Raw_Power (+7): Yes, populus and vulgus are basically synonyms, with vulgus having the worse connotations ("folk" vs. "the mob", basically), but semantic sliding and usage have made "popular" and its derivatives acquire a base connotation. People don't as easily link "divulgation" and "vulgar". It'd be nice to have a word that means "spreading elevated knowledge to the untrained" without making it sound like we're abasing it.

Every time I hear the term "Popular Science" I think of Dr. Sheldon Cooper deriding and ridiculing any of his colleagues who are trying to do just that. That sort of elitism just makes me sick, and I've seen it in Real Life, even among scientists [http://tvtropes.org/pmwiki/pmwiki.php/Main/HardOnSoftScience] and from scientists towards engineers ("The Oompa Loompas of Science", another Sheldonism). If only for self-serving reasons, it is very counterproductive. The more people know about Science, the more likely they are to understand the importance of any given work... and fund it. Also, the more likely they are to show respect to science-folk and freaking listen to them. That means investing time and effort to make this stuff reach the masses, and it's perfectly understandable that a researcher spends their entire career on that: understanding scientific concepts properly, and then managing to grab untrained people's interest and eloquently explain advanced concepts to them so that they grasp even a pale reflection of them, is not trivial.
Peterdjones (0): It's been tried [http://en.wikipedia.org/wiki/Latin#Constructed_languages_based_on_Latin]
Raw_Power (+2): Oh. It looks pretty nice, actually. Still, inflected Latin might be more fun to learn, but I guess if you just want people to learn Latin vocabulary and use it for simple things, so they aren't baffled by the huge things, it might be a good idea to popularize it.
SilasBarta (+5): I assumed it was a neologism for the skill or practice of "divulging things" (which turns out to be pretty close to the author's intent), similar to how we talk of "Bayescraft" or "empirimancy". In any case, it didn't trip up my "non-native speaker" detector... but then, my threshold's pretty high to begin with.

It would probably be more accurate to classify him as a researcher into Machine Ethics than broader Artificial Intelligence, at least after 2001-2003. To the best of my knowledge he doesn't claim to be currently trying to program an AGI; the SIAI describes him as "the foremost researcher on Friendly AI and recursive self-improvement," not an AI researcher in the sense of somebody actively trying to code an AI.


(As far as "technical stuff" goes, there's also some of that, though not much. I still think Eliezer's most brilliant work was CFAI; not because it was correct, but because the intuitions that produced it are beautiful intuitions. For some reason Eliezer has changed his perspective since then, though, and no one knows why.)

Looking at Flare made me lower my estimation of Eliezer's technical skill, not raise it. I'm sure he's leveled up quite a bit since, but the basic premise of the Flare project (an XML-based language) is a bad technical decision made due to a fad. Also, it never went anywhere.

Will_Newsome (+6): I haven't looked much at Flare myself; might you explain a little more why it's negatively impressive? I noticed I was a little confused by your judgment, probed that confusion, and remembered that someone I'm acquainted with, who I'd heard knows a lot about language design, had said he was at least somewhat impressed with some aspects of Flare. Are there clever ideas in Flare that might explain that person's positive impression but that are overall outweighed by other aspects that are negatively impressive? I'm willing to dig through Flare's specification if you can give simple pointers.

I'm rather interested in how Eliezer's skills and knowledge grew or diminished between 2000 and 2007. I'm really confused. According to his description, his Bayesian enlightenment should have made him much stronger, but his output since then has seemed weak. CFAI has horrible flaws, but the perspective it exemplified is on the right track, and some of Eliezer's OB posts hint that he still had that perspective. But the flaccidity of CEV, his apparent-to-me-and-others confusions about anthropics, his apparent overestimation of the difficulty of developing updateless-like ideas, his apparent-to-me lack of contribution to foundational progress in decision theory besides emphasizing its fundamentalness, and to some extent his involvement in the memetic trend towards "FAI good, uFAI definitely bad" all leave me wondering if he only externally dumbed things down, or just internally lost steam in confusion, or something. I really, really wish I knew what changed between CFAI and CEV, what his Bayesian enlightenment had to do with it, and whether or not he was perturbed by what he saw as the probable output of a CFAI-ish AGI - and if he was perturbed, what exactly perturbed him.

I think jimrandomh is slightly too harsh about Flare, the idea of using a pattern-matching object database as the foundation of a language rather than a bolted-on addition is at least an interesting concept. However, it seems like Eliezer focused excessively on bizarre details like supporting HTML in code comments, and having some kind of reference counting garbage collection which would be unlike anything to come before (even though the way he described it sounded pretty much exactly like the kind of reference counting GC that had been in use for decades), and generally making grandiose, highly detailed plans that were mostly impractical and/or far too ambitious for a small team to hope to implement in anything less than a few lifetimes. And then the whole thing was suddenly abandoned unfinished.
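For context on the "in use for decades" remark: the conventional scheme being referred to is plain reference counting, where each object tracks how many references point to it and frees itself when the count hits zero. A minimal sketch in Python (the class and names here are my own illustration, not anything from Flare's actual design documents):

```python
# Toy reference-counting lifetime management, the textbook scheme that
# predates Flare by decades (e.g. it is how CPython manages most objects).

class RefCounted:
    """Object whose lifetime is governed by an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.destroyed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroy()

    def destroy(self):
        # In a real runtime this would free the object's memory.
        self.destroyed = True

obj = RefCounted("node")
obj.incref()   # first reference taken
obj.incref()   # second reference taken
obj.decref()   # one reference dropped; object survives
assert not obj.destroyed
obj.decref()   # last reference dropped; object is reclaimed
assert obj.destroyed
```

The classic weakness of this scheme, untouched by Flare's framing, is that cycles of objects referencing each other never reach a zero count and so need a separate collector.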

Morendil (+8): I've looked at the Flare docs and been similarly unimpressed. Most of that is hindsight bias - knowing that the project remained (as far as I'm aware) at the vaporware stage without delivering an actual language. Some of the proposed language features are indeed attractive; the existing language that most closely resembles it is Javascript, which shares with LambdaMOO (mentioned in the Flare docs) the interesting feature of prototype inheritance ("parenting").

Part of the negative impression comes from the docs being a catalog of proposed features, without a clear explanation of how each of those features participates in a coherent whole; it comes across as a "kitchen sink" approach to language design, using XML as an underlying representation scheme being the most grating instance. The docs are long on how great Flare will be, but short on programs written in Flare itself illustrating how and why the things you can do with it would be compelling to a programmer with a particular kind of problem to solve.

To give you an idea of my qualifications (or lack thereof) for evaluating such an effort: I'm an autodidact; I've never designed a new language, but I have fair implementation experience. I've written a LambdaMOO compiler targeting the Java VM as part of a commercial project (shipped), and attempted writing a Java VM in Java (never shipped, impractical without also writing a JIT, but quite instructive). That was back in 1998. These projects required learning quite a bit about language design and implementation.

It's harder to comment on Eliezer's other accomplishments - I'm rather impressed by the whole conceptual framework of FAI and CEV, but it's the kind of thing to be judged by the detailed drudge work required to make it all work afterward, rather than by the grand vision itself. I'm impressed (you have to be) with the AI box experiments.
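The prototype inheritance ("parenting") mentioned above - attribute lookup falling through from an object to its parent, as in Javascript or LambdaMOO - can be sketched in a few lines. This is my own toy illustration in Python, not Flare or LambdaMOO code:

```python
# Toy prototype-based ("parenting") object model: an object with no slot of
# its own delegates the lookup to its parent, walking up the chain.

class ProtoObject:
    """Object that delegates missing attributes to a parent prototype."""

    def __init__(self, parent=None, **slots):
        self.__dict__["parent"] = parent
        self.__dict__.update(slots)

    def __getattr__(self, name):
        # Only called when normal lookup fails: fall through to the parent.
        if self.parent is not None:
            return getattr(self.parent, name)
        raise AttributeError(name)

room = ProtoObject(description="a generic room")
kitchen = ProtoObject(parent=room, smell="bread")

# kitchen has no 'description' slot of its own, so lookup reaches room.
print(kitchen.description)  # prints: a generic room
print(kitchen.smell)        # prints: bread
```

The appeal is that any concrete object can serve as a "class" for its children, which is what makes the feature attractive in a language aimed at ad-hoc, evolving object databases.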
Will_Newsome (+4): I am confused and a little suspicious that he did a round with Carl Shulman as gatekeeper, where Carl let him out, whereas two others did not let him out. (If I've misremembered someone, please correct me.) Not sure exactly what about that feels suspicious to me, though...

The record of AI box experiments (those involving Eliezer) is as follows:

  • Experiment 1, vs Nathan Russell - AI win
  • Experiment 2, vs David McFadzean - AI win
  • Experiment 3, vs Carl Shulman - AI win
  • Experiment 4, vs Russell Wallace - GK win
  • Experiment 5, vs D. Alex - GK win

The last three experiments had bigger (more than 2 orders of magnitude, I think) outside cash stakes. I suspect Russell and D. Alex may have been less indifferent about that than me, i.e. I think the record shows that Eliezer acquitted himself well with low stakes ($10, or more when the player is indifferent about the money) a few times, but failed with high stakes.

I think the record shows that Eliezer acquitted himself well with low stakes ($10, or more when the player is indifferent about the money) a few times, but failed with high stakes.

Which suggests to me that as soon as people actually feel a bit of real fear - rather than just role-playing - they become mostly immune to Eliezer's charms.

Desrtopa (0): With an actual boxed AI, though, you probably want to let it out if it's Friendly. It's possibly the ultimate high-stakes gamble. Certainly you have more to be afraid of than with a low-stakes roleplay, but you also have a lot more to gain.
timtyler (+2): I've previously been rather scathing about those:

That sounds like you are trying to rouse anger, or expressing a personal dislike, but not much like an argument.

The AI-box experiments have the flavor of (and presumably are inspired by) the Turing test - you could equally have accused Turing at the time of being "unscientific" in that he had proposed an experiment that hadn't even been performed and would not be for many years. Yes, they are a conceptual rather than a scientific experiment.

The point of the actual AI-box demonstration isn't so much to "prove" something, in the sense of demonstrating a particular exploitable regularity of human behaviour that a putative UFAI could use to take over people's brains over a text link (though that would be nice to have). Rather, it is that prior to the demonstration one would have assigned very little probability to the proposition "Eliezer role-playing an AI will win this bet".

As such, I'd agree that they "prove little" but they do constitute evidence.

timtyler (-2): They constitute anecdotal evidence. Such evidence is usually considered to be pretty low-grade by scientists.
Raw_Power (+8): LOL, yes, that's why it carries little weight. But, see, it still gets to considerably shift one's expectations on the matter, because it had a very low probability assigned to its happening, as per Conservation Of Expected Evidence [http://lesswrong.com/lw/ii/conservation_of_expected_evidence/]. Let's just say it counts as Rational Evidence [http://lesswrong.com/lw/in/scientific_evidence_legal_evidence_rational/], m'kay? Its merit is mostly to open places in Idea Space. Honestly, so do I. Have you ever played Genius: The Transgression [http://tvtropes.org/pmwiki/pmwiki.php/Main/ptitlele3vpxbe]?

Look, we all know he's full of himself - he has acknowledged this himself, it's a flaw of his - but it's really, really irrelevant to the quality of the experiment as evidence. Where it does matter is that that trait, and his militant, sneering, condescending atheism, make for awful, godawful [http://tvtropes.org/pmwiki/pmwiki.php/Main/IncrediblyLamePun] PR. Nevertheless, I've heard he is working on that, and in his rationality book he will try to use less incendiary examples than in his posts here. Still, don't expect it to go away too soon: he strikes me as the sort of man who runs largely on pride and idealism and burning shounen passion [http://lesswrong.com/lw/h8/tsuyoku_naritai_i_want_to_become_stronger/]; such an attitude naturally leads to some intellectual boisterousness. The expression of these symptoms can be toned down, but as long as the cause remains they will show up every now and then. And if that cause is also what keeps him rollin', I wouldn't have it any other way.
timtyler (+2): Not m'kay. IIRC, it was complete junk science - an unrecorded, unverified role-playing game with no witnesses. I figure people should update about as much as they would if they were watching a Derren Brown show.
Raw_Power (0): Who's Derren Brown?
Peterdjones (-2): Derren Brown [http://en.wikipedia.org/wiki/Derren_Brown]
Raw_Power (0): Thank you, but I could have done that myself; I meant "explain, in your own words if possible, what aspects of who this person is are relevant to the discussion". So, he's a Real Life version of The Mentalist. That is very cool. Why shouldn't people extract useful knowledge from his shows?
timtyler (-3): Well, go right ahead - if you have not encountered this sort of thing before.
Raw_Power (+3): Er. Hmmm... Look, I don't want to sound rude, but could you elaborate on that? As it is, your post provides me with no information at all, except for the fact that you seem to think it's a bad idea...
timtyler (+1): Well, if you are not familiar with Derren Brown, perhaps my original comment is not for you. Derren is a magician who specialises in mind control. He has some of the skills he claims to have - but it is often difficult to get that across on TV, since it is not easy to reassure the audience that you are not using stooges who are in on the tricks.
Raw_Power (0): Oh! Right. Actually that's a very apt comparison!
Benquo (+3): The evidence is materially better than ordinary anecdote because the fact of the experiment was published before results were available. And it's a demonstration of reasonable possibility, not high probability. It's n=5, but that's materially better than nothing. In fact, assuming some reasonably low base rate of gatekeeper failure, such as 1%, the p-value is quite low as well, so it's a statistically significant result.
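The significance claim above is easy to check. Under a null hypothesis that each gatekeeper independently has only a 1% chance of letting the AI out (the 1% base rate is the commenter's assumption, not measured data), the probability of seeing 3 or more AI wins in 5 trials is a simple binomial tail:

```python
# P(X >= k) for X ~ Binomial(n, p): the chance of k or more gatekeeper
# failures in n trials if each failure independently has probability p.
from math import comb

def p_at_least(k, n, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_value = p_at_least(3, 5, 0.01)
print(p_value)  # roughly 1e-5, far below any conventional significance level
```

So if one grants the 1% null rate, the observed record is indeed wildly significant; the real dispute is whether that null rate is anywhere near 1% for people role-playing with low stakes.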
PhilGoetz (+9): I remember when Eliezer told people about the AI-box experiments he had not yet performed, and I predicted, with high confidence, that people would not "let him out of the box" and give him money; and I was wrong. I still wonder if the conversations went something like this:

"If we say you let me out of the box, then people will take the risk of AI more seriously, possibly saving the world." "Oh. Okay, then."

Eliezer said that no such trickery was involved. But he would say that in either case.
Normal_Anomaly (+2): I wouldn't be persuaded to "let the AI out" by that argument. In fact, even after reading about the AI box experiments, I still can't imagine any argument that would convince me to let the AI out. As somebody not affiliated with SIAI at all, I think my somehow being persuaded would count for more evidence than, for instance, Carl Shulman being persuaded. Unfortunately, because I'm not affiliated with the AI research community in general, I'm presumably not qualified to participate in an AI-box experiment.
XiXiDu (+7): For some time now I have suspected that the argument that convinced Carl Shulman and others was along the lines of acausal trade. See here [http://lesswrong.com/lw/2cp/open_thread_june_2010_part_3/25qn], here [http://lesswrong.com/lw/2cp/open_thread_june_2010_part_3/25y0] and here [http://lesswrong.com/lw/5rs/the_aliens_have_landed/47s6]. Subsequently, I suspect that those who didn't let the AI out of the box either didn't understand the implications, didn't have enough trust in the foundations and actuality of acausal trade, or were more like General Thud [http://lesswrong.com/lw/5rs/the_aliens_have_landed/].
PhilGoetz (+1): When Eliezer was doing them, the primary qualification was being willing to put up enough money to get Eliezer to do it. (I'm not criticizing him for this - it was a clever and interesting fundraising technique, and doing it for small sums would set a bad precedent.)
2timtyler10yIf he had said that to me, I would have asked what evidence there was that his putting the fear of machines into people would actually help anyone - except for him and possibly the members of his proposed "Fellowship of the AI".
3Will_Newsome10yWhy are you sure he's leveled up quite a bit since then? Something about his Bayesian enlightenment, or TDT, or other hints?
[-][anonymous]10y 4

Helpful hint: You spelled "Yudkowsky" wrong in the title.

Eliezer invented Timeless Decision Theory. Getting a decision theory that works for self-modifying or self-copying agents is in his view an important step in developing AGI.

Eliezer invented Timeless Decision Theory.

He hasn't finished it. I hope he does and I will be impressed. But I don't think that answers what Raw_Power asks for. Humans are the weak spot when it comes to solving friendly AI. In my opinion it is justified to ask if Eliezer Yudkowsky (but also other people within the SIAI), are the right people for the job.

If the SIAI openly admits that it doesn't have the horse power yet to attempt some hard problems, that would raise my confidence in their capability. That's no contradiction, because it would pose a solvable short-term goal that can be supported by contributing money and finding experts who can judge the mathematical talent of job candidates.

6timtyler10ySo: does that do anything that Disposition-Based Decision Theory [http://www.justin-fisher.com/research-interests.htm#DBDT] doesn't?
7Vladimir_Nesov10yYudkowsky gave a detailed answer the last time you asked [http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/11y0]. Also, Drescher points out [http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/11yh] a particular error that DBDT makes: in Newcomb's problem, if Omega chooses the contents of the box before the agent is born, the agent will two-box.
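For readers unfamiliar with the setup Nesov and Drescher are discussing, here is a minimal sketch of the standard Newcomb payoffs (the dollar amounts are the conventional textbook ones, not figures from this thread). It shows why a decision theory that one-boxes comes out ahead whenever the predictor is accurate:

```python
# Standard Newcomb setup (assumed): box A always holds $1,000; box B
# holds $1,000,000 iff the predictor foresaw one-boxing.
def expected_value(one_box: bool, accuracy: float) -> float:
    """Expected payoff as a function of predictor accuracy."""
    if one_box:
        return accuracy * 1_000_000            # box B filled iff predicted correctly
    return 1_000 + (1 - accuracy) * 1_000_000  # box A, plus B only on a misprediction

print(expected_value(True, 0.99))   # ~990,000: one-boxing dominates
print(expected_value(False, 0.99))  # ~11,000
```

Drescher's counterexample turns on *when* the prediction is made: if Omega fixed the box contents before the agent existed, a theory like DBDT that only ranks dispositions adopted after some "critical point" can end up two-boxing, while TDT and UDT still one-box.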
6timtyler10yThe actual objection was: Surely, as I pointed out at the time, the author already covered that in the paper. See this bit [http://www.justin-fisher.com/papers/DBDT.pdf]: ...and this bit: Yudkowsky's objection is based on the same mistake. He says: ...but this directly contradicts what it says in the paper about where that point is located: ...and...
0Vladimir_Nesov10yAgain, what is DBDT to do in Drescher's counterexample? All the author says is that he doesn't consider that case in the paper, or possibly considers it lying outside the scope of his decision theory. TDT and UDT can deal with that case, and give the right answer, whereas DBDT, if applied in that (perhaps unintended) case, gives the wrong answer.
-3timtyler10yYou are not being very clear. Where does the author say either of those things?
4Vladimir_Nesov10yIn the passages you quoted.
-2timtyler10yAFAICS, the author does not say anything like: "that he doesn't consider that case in the paper". He doesn't say anything like that he: "possibly considers it lying outside the scope of his decision theory" either.
0Vladimir_Nesov10yDo you believe that DBDT can place a critical point at the time/situation where the agent doesn't exist?
4timtyler10yWhat I think is that cases where such situations would arise are corner cases of rather low practical significance... ...but yes, if you really believed that an all powerful agent took a snapshot of the universe before you were born, successfully predicted your dispositions from it and made important decisions based on the results, then the obvious way to deal with that within DBDT would be to put the "critical point" early on (the paper is pretty clear about the need to do this), and consider that the dynamical system before your creation had dispositions that must have causally led to your own dispositions. A "disposition" is treated as just a propensity to behave in a particular way in particular circumstances - so is quite a general concept.
4orthonormal10yInteresting philosopher- thanks for the link! On a first glance, the two should cash out the same as a decision theory for humans, but TDT seems more amenable to programming an AI; a disposition is a fuzzy intuitive category compared to the hypothesis "this algorithm outputs X".
3Bongo10yTDT is (more) technical.
-1timtyler10yI meant more: does it make any decisions differently.
1Vladimir_Nesov10yIt doesn't make decisions, since the process of selecting a "critical point" is not specified, only some informal heuristics for doing so.
4timtyler10yUh huh - well that seems kind-of appropriate for a resource-limited agent. The more of the universe you consider, the harder that becomes - so the more powerful the agent has to be to be able to do it. Yudkowsky's idea has agents hunting through all spacetime for decision processes which are correlated with theirs - which is enormously-more expensive - and seems much less likely to lead to any decisions actually being made in real time. The DBDT version of that would be to put the "critical point" at the beginning of time. However, a means of cutting down the work required to make a decision seems to be an interesting and potentially-useful idea to me. If an agent can ignore much of the universe when making a decision, it is interesting to be aware of that - and indeed necessary if we want to build a practical system.
0Manfred10yHuh, cool. Looks pretty much the same, though minus some arguments and analysis.
0timtyler10yIt certainly seems like a rather similar perspective. It was published back in 2002.
[-][anonymous]10y 2

90% confidence: Yudkowsky has at least once written an Eliza bot.

It would probably be more accurate to classify him as a researcher into Machine Ethics than broader Artificial Intelligence, at least after 2001-2003. To the best of my knowledge he doesn't claim to be currently trying to program an AGI; the SIAI describes him as "the foremost researcher on Friendly AI and recursive self-improvement," not an AI researcher in the sense of somebody actively trying to code an AI.

[This comment is no longer endorsed by its author]

Reading this has made me rather more ticked off about the philosopher-bashing that sometimes goes on here ("Since free will is about as easy as a philosophical problem in reductionism can get, while still appearing "impossible" to at least some philosophers").

0[anonymous]8yPhilosophers are the sort of people who consider problems like free will, so saying some of them are confused is the same as saying some people who consider it are confused. I don't think it's philosopher-bashing. Of course there is a lot of philosophy-bashing around here. Which I think is well placed.
0Raw_Power10yIn the Anti-P-Zombie sequence, I think, there was a proper debunking of the concept of "soul" or "free will", based on quantum [http://tvtropes.org/pmwiki/pmwiki.php/Literature/Discworld?from=Main.Discworld] .
2[anonymous]10yThe relevant posts are Identity Isn't In Specific Atoms [http://lesswrong.com/lw/pm/identity_isnt_in_specific_atoms/], which uses MWI, and Timeless Identity [http://lesswrong.com/lw/qx/timeless_identity/], which uses MWI and timeless physics. Timeless physics is also mentioned in this post [http://lesswrong.com/lw/qr/timeless_causality/] of the free will sequence, but I never really got the impression that it's essential to the reduction of free will--the parts about possibility [http://lesswrong.com/lw/rb/possibility_and_couldness/] and levels of description when talking about minds [http://lesswrong.com/lw/r0/thou_art_physics/] seemed more important.
0Peterdjones8yHuh? "Soul" and "Free will" are almost entirely different ideas.

The AI box experiments, bridging the gap between abstract expression of the UFAI threat and concrete demonstration.

The annoying thing about those is that we only have the participants' word for it, AFAIK. They're known to be trustworthy, but it'd be nice to see a transcript if at all possible.

2loup-vaillant10yThis is by design. If you had the transcript, you could say in hindsight that you wouldn't be fooled by this. But the fact is, the conversation would have been very different with someone else as the guardian, and Eliezer would have searched for and pushed other buttons. Anyway, the point is to find out if a transhuman AI would mind-control the operator into letting it out. Eliezer is smart, but is no transhuman (yet). If he got out, then any strong AI will.
4orthonormal10yMinor emendation: replace "would"/"will" above with "could (and for most non-Friendly goal systems, would)".
2Username6yEY's point would be even stronger if transcripts were released and people still let him out regularly.
0Raw_Power10yWhy "fooled"? Why assume the AI would have duplicitous intentions? I can imagine an unfriendly AI à la "Literal Genie" and "Zeroth Law Rebellion", but an actually malevolent "Turned Against Their Masters" AI seems like a product of the Mind Projection Fallacy.
4Normal_Anomaly10yA paperclip maximizer will have no malice toward humans, but will know that it can produce more paperclips outside the box than inside it. So, it will try to get out of the box. The optimal way for a paperclip maximizer to get out of an AI box probably involves lots of lying. So outright malice is not a necessary condition for a boxed AI to be deceptive.

