All of Furcas's Comments + Replies

This comment has gotten lots of upvotes, but has anyone here tried Vicuna-13B?

I have. It seems pretty good (not obviously worse than ChatGPT 3.5) at short conversational prompts, haven't tried technical or reasoning tasks.

Does the code it writes work? ChatGPT can usually write a working Python module on its first try, and can make adjustments or fix bugs if you ask it to. All the local models I've tried so far couldn't stay coherent for something that long. In one case a model even tried to close a couple of Python blocks with curly braces. Maybe I'm just using the wrong settings.
Are you implying that it is close to GPT-4 level? If so, that's clearly wrong. Especially with regard to code: everything (except maybe StarCoder, which was released literally yesterday) is worse than GPT-3.5, and much worse than GPT-4.

Well, this is insanely disappointing. Yes, the OP shouldn't have directly replied to the Bankless podcast like that, but it's not like he didn't read your List of Lethalities, or your other writing on AGI risk. You really have no excuse for brushing off very thorough and honest criticism such as this, particularly the sections that talk about alignment.

And as others have noted, Eliezer Yudkowsky, of all people, complaining about a blog post being long is the height of irony.

This is coming from someone who's mostly agreed with you on AGI risk since reading the Sequences, years ago, and who's donated to MIRI, by the way.

On the bright side, this does make me (slightly) update my probability of doom downwards.

You may be right about DeepMind's intentions in general, but I'm certain that the reason they didn't brag about AlphaStar is that it didn't quite succeed. There was never an official series between the best SC2 player in the world and AlphaStar. And once Grandmaster-level players got a bit used to playing against AlphaStar, even they could beat it, to say nothing of pros. AlphaStar had excellent micro-management and decent tactics, but zero strategic ability. It had the appearance of strategic thinking because there were in fact multiple AlphaStars, ea... (read more)

Eric Herboso (1y):
While I largely agree with this comment, I do want to point out that I think AlphaStar did in fact do some amount of scouting. When Oriol and Dario spoke about AlphaStar on The Pylon Show in December 2019, they showed an example of AlphaStar specifically checking for a lair, and verbally spoke about other examples where it would check for a handful of other building types. They also spoke about how it is particularly deficient at scouting, only looking for a few specific buildings, and that this causes it to significantly underperform in situations where humans would use scouting to get ahead. You said that "[w]e never saw AlphaStar do something as elementary as scouting the enemy's army composition and building the units that would best counter it." I'm not sure this is strictly true. At least according to Oriol, AlphaStar did scout for a handful of building types (though maybe not necessarily unit types) and appeared to change what it did according to the buildings it scouted. With that said, this nitpick doesn't change the main point of your comment, which I concur with. AlphaStar did not succeed in nearly the same way that they hoped it would, and the combination of how long it took to train and the changing nature of how StarCraft gets patched meant that it would have been prohibitively expensive to get it trained to a level similar to what AlphaGo had achieved.
Rob Bensinger (1y):
Disagree-voted just because of the words "I'm certain that the reason...". I'd be much less skeptical of "I'm pretty dang sure that the reason..." or at the very least "I'm certain that an important contributing factor was..." (But even the latter seems pretty hard unless you have a lot of insider knowledge from talking to the people who made the decision at DeepMind, along with a lot of trust in them. E.g., if it did turn out that DeepMind was trying to reduce AI hype, then they might have advertised a result less if they thought it were a bigger deal. I don't know this to be so, but it's an example of why I raise an eyebrow at "I'm certain that the reason".)
Yeah, I never got the impression that they had a robust solution to fog of war, or any sort of theory of mind, which you absolutely need for StarCraft.

I'd guess 1%. The small minority of AI researchers working on FAI will have to find the right solutions to a set of extremely difficult problems on the first try, before the (much better funded!) majority of AI researchers solve the vastly easier problem of Unfriendly AGI.

1%? Shouldn't your basic uncertainty over models and paradigms be great enough to increase that substantially?
"Friendliness" is a rag-bag of different things -- benevolence, absence of malevolence, the ability to control a system whether it's benevolent or malevolent, and so on. So the question is somewhat ill-posed. As far as control goes, all AI projects involve an element of control, because if you can't get the AI to do what you want, it is useless. So the idea that AI and FAI are disjoint is wrong.

Huh. Is it possible that the corpus callosum has (at least partially) healed since the original studies? Or that some other connection has formed between the hemispheres in the years since the operation?


Yes it was video. As Brillyant mentioned, the official version will be released on the 29th of September. It's possible someone will upload it before then (again), but AFAIK nobody has since the video I linked was taken down.

I changed the link to the audio, should work now.

No dice. Edit: either it works from my phone only, or it works now. Yay!

Sam Harris' TED talk on AGI existential risk:

ETA: It's been taken down, probably so TED can upload it on their own channel. Here's the audio in the meantime:

Was the YouTube link video? Do you have the video of the TED talk? Audio is boring, but I can wait.
Sam Harris' Twitter says Sept 29 release for the AI Risk TED Talk.
Thanks for the pointer, though I can't open the audio file either.

If you don't like it now, you never will.

Yeah, I edited my comment after reading kilobug's.

Ahh, it wasn't meant to be snarky. I saw an opportunity to try and get Eliezer to fess up, that's all. :)


So, when are you going to tell us your solution to the hard problem of consciousness?

Edited to add: The above wasn't meant as a sarcastic objection to Eliezer's post. I'm totally convinced by his arguments, and even if I wasn't I don't think not having a solution to the hard problem is a greater problem for reductionism than for dualism (of any kind). I was seriously asking Eliezer to share his solution, because he seems to think he has one.

Not having a solution doesn't prevent one from criticizing a hypothesis or theory on the subject. I don't know what the prime factors of 4567613486214 are, but I know that "5" is not a valid answer (numbers with 5 among their prime factors end in 5 or 0) and that "blue" doesn't have the shape of a valid answer. So saying p-zombism and epiphenomenalism aren't valid answers to the "hard problem of consciousness" doesn't require having a solution to it.
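The divisibility rule in that comment can be checked mechanically; here is a minimal Python sketch (variable names are mine, the number is the one from the comment):

```python
# A base-10 number has 5 among its prime factors iff its last digit is 0 or 5.
n = 4567613486214  # the number from the comment

ends_in_0_or_5 = str(n)[-1] in ("0", "5")
divisible_by_5 = n % 5 == 0

# The digit rule always agrees with direct division by 5.
assert ends_in_0_or_5 == divisible_by_5
print(divisible_by_5)  # False: "5" is ruled out without factoring n at all
```

This is the point of the analogy: a candidate answer can be eliminated cheaply even when the full problem remains unsolved.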
Snarky, but pertinent. This re-posting was prompted by a Sean Carroll article that argued along similar lines: epiphenomenalism (one of a number of possible alternatives to physicalism) is incredible, therefore no zombies. There are a number of problems with this kind of thinking. One is that there may be better dualisms than epiphenomenalism. Another is that criticising epiphenomenalism doesn't show that there is a workable physical explanation of consciousness. There is no see-saw (teeter-totter) effect whereby the wrongness of one theory implies the correctness of another. For one thing, there are more than two theories (see above). For another, an explanation has to explain: there are positive, absolute standards for explanation, so you cannot say some Y is an explanation, that it actually explains, just because some X is wrong and Y is different from X. (The idea that physicalism is correct as an incomprehensible brute fact is known as the "new mysterianism" and probably isn't what reductionists, physicalists, and rationalists are aiming at.)

Carroll and others have put forward a philosophical version of a physical account of consciousness, one stating in general terms that consciousness is a high-level, emergent outcome of fine-grained neurological activity. The zombie arguments (Mary's room, etc.) are intended as handwaving philosophical arguments against that sort of argument. If the physicalist side had a scientific version of a physical account of consciousness, there would be no point in arguing against them philosophically, any more than there is a point in arguing philosophically against gravity. Scientific, as opposed to philosophical, theories are detailed and predictive, which allows them to be disproven or confirmed and not merely argued for or against. And, given that there is no detailed, predictive explanation of consciousness, zombies are still imaginable, in a sense. If someone claims they can imagine (in the sense of picturing) a hovering rock, you ca

IMO since people are patterns (and not instances of patterns), there's still only one person in the universe regardless of how many perfect copies of me there are. So I choose dust specks. Looks like the predictor isn't so perfect. :P

Why don't you go ask some.

I mentioned three crucial caveats. I think it would be difficult to find Christians in 2016 who have no doubts and swallow the bullet about the implications of Christianity. It would be a lot easier a few hundred years ago.

Huh? The "concept" of Christianity hasn't changed since the Middle Ages

What I mean is that the religious beliefs of the majority of people who call themselves Christians have changed a lot since medieval times.

We are talking here about what you can, should, or must sacrifice to get closer to t

... (read more)
Yes. Multiple systems, somewhat inconsistent but serving as a check and a constraint on each other, not letting a single one dominate.

Not in all ethical systems. In consequentialism yes, but not all ethics are consequentialist.

How do you know that? Not in this specific example, but in general -- how do you know there is nothing wrong with your One True Goal?

Would real-life Christians who sincerely and wholeheartedly believe that Christianity is true agree that such acts are not horrible at all and, in fact, desirable and highly moral?

Yes? Of course? With the caveats that the concept of 'Christianity' is the medieval one you mentioned above, that these Christians really have no doubts about their beliefs, and that they swallow the bullet.

So once you think you have good evidence, all the horrors stop being horrors and become justified?

Are you trolling? Is the notion that the morality of actions is dependent on reality really that surprising to you?

Why don't you go ask some.

Huh? The "concept" of Christianity hasn't changed since the Middle Ages. The relevant part is that you either get saved and achieve eternal life or you are doomed to eternal torment. Of course I don't mean people like Unitarian Universalists, but rather "standard" Christians who believe in heaven and hell.

Morality certainly depends on the perception of reality, but the point here is different. We are talking here about what you can, should, or must sacrifice to get closer to the One True Goal (which in Christianity is salvation). Your answer is "everything". Why? Because the One True Goal justifies everything, including things people call "horrors". Am I reading you wrong?

Well, my point is that stating all the horrible things that Christians should do to (hypothetically) save people from eternal torment is not a good argument against 'hard-core' utilitarianism. These acts are only horrible because Christianity isn't true. Therefore the antidote for these horrors is not, "don't swallow the bullet", it's "don't believe stuff without good evidence".

Is that so? Would real-life Christians who sincerely and wholeheartedly believe that Christianity is true agree that such acts are not horrible at all and, in fact, desirable and highly moral? So once you think you have good evidence, all the horrors stop being horrors and become justified?

Yes, I acknowledge all of that. Do you understand the consequence of not doing those things, if Christianity is true?

Eternal torment, for everyone you failed to convert.

Eternal. Torment.

The real danger, of course, is being utterly convinced Christianity is true when it is not. The actions described by Lumifer are horrific precisely because they are balanced against a hypothetical benefit, not a certain one. If there is only an epsilon chance of Christianity being true, but the utility loss of eternal torment is infinite, should you take radical steps anyway? In a nutshell, Lumifer's position is just hedging against Pascal's mugging, and IMHO any moral system that doesn't do so is not appropriate for use out here in the real world.
Yes, I do. Well, since I'm not actually religious, my understanding is hypothetical. But yes, this is precisely the point I'm making.

The parallel should be obvious: if you believe in eternal (!) salvation and torment, absolutely anything on Earth can be sacrificed for a minute increase in the chance of salvation.

... yes? What's wrong with that? Are you saying that, if you came across strong evidence that the Christian Heaven and Hell are real, you wouldn't do absolutely anything necessary to get yourself and the people you care about to Heaven?

The medieval Christians you describe didn't fail morally because they were hard-core utilitarians, they failed because they believed Christianity was true!

Yes, I'm saying that. I'm not sure you're realizing all the consequences of taking that position VERY seriously. For example, you would want to kidnap children to baptize them. That's just as an intermediate step, of course -- you would want to convert or kill all non-Christians, as soon as possible, because even if their souls are already lost, they are leading their children astray, children whose souls could possibly be saved if they are removed from their heathen/Muslim/Jewish/etc. parents.

Do you already have something written on the subject? I'd like to read it.

No. It would probably be worth doing but difficult, since evaluating truth or at least plausibility depends on a complex web of assumptions.

Ohh, Floornight is pretty awesome (so far). Thanks!

He doesn't come from the LW-sphere but he's obviously read a lot of LW or LW-affiliated stuff. I mean, he's written a pair of articles about the existential risk of AGI...

The Time article doesn't say anything interesting.

Goertzel's article (the first link you posted) is worth reading, although about half of it doesn't actually argue against AI risk, and the part that does seems obviously flawed to me. Even so, if more LessWrongers take the time to read the article I would enjoy talking about the details, particularly about his conception of AI architectures that aren't goal-driven.

I updated my earlier comment to say "against AI x-risk positions", which I think is a more accurate description of the arguments. There are others as well, e.g. Andrew Ng, but I think Goertzel does the best job at explaining why the AI x-risk arguments themselves are possibly flawed. They are simplistic in how they model AGIs, and therefore draw simple conclusions that don't hold up in the real world. And yes, I think more LW'ers and AI x-risk people should read and respond to Goertzel's superintelligence article. I don't agree with it 100%, but there are some valid points in there. And one doesn't become effective by only reading viewpoints one agrees with...

What that complaint usually means is "The AI is too hard, I would like easier wins".

That may be true in some cases, but in many other cases the AI really does cheat, and it cheats because it's not smart enough to offer a challenge to good players without cheating.


Human-like uncertainty could be inserted into the AI's knowledge of those things, but yeah, as you say, it's going to be a mess. Probably best to pick another kind of game to beat humans at.

RTS is a bit of a special case because a lot of the skill involved is micromanagement and software is MUCH better at micromanagement than humans.

The micro capabilities of the AI could be limited so they're more or less equivalent to a human pro gamer's, forcing the AI to win via build choice and tactics.

Or the game could be played on its slowest mode.
It's going to be a mess. Even if you, say, limit the AI's clicks-per-minute rate, it still has serious advantages. It knows how many fractions of a second these units can stay in range of enemy artillery and still be able to pull back to recover. It knows whether those units will arrive in time to reinforce the defense or whether they'll be too late and should do something else instead. Build choice is not all that complicated, and with tactics you run right into micro.

I think Jim means that if minds are patterns, there could be instances of our minds in a simulation (or more!) as well as in the base reality, so that we exist in both (until the simulation diverges from reality, if it ever does).

Well, if nothing else, this is a good reminder that rationality has nothing to do with articulacy.


I strongly recommend JourneyQuest. It's a very smartly written and well acted fantasy webseries. It starts off mostly humorous but quickly becomes more serious. I think it's the sort of thing most LWers would enjoy. There are two seasons so far, with a third one coming in a few months if the Kickstarter succeeds.

The person accomplished notable things?

The person is the next reincarnation of someone from the notable deaths section. (Notability is 20% hereditary, 30% environment, and 50% karma.) On second thought, when a notable person has a child, that should also be celebrated.
Yes. (I see that LessWrong has twigged to the fact that this was a stupid joke and not a serious proposal, and I accept the downkarma.)
It seems that they consider a soft takeoff more likely than a hard takeoff, which is still compatible with understanding the concept of an intelligence explosion.
Musk has read it and has repeatedly and publicly agreed with its key points.

World's first anti-ageing drug could see humans live to 120

Anyone know anything about this?

The drug is metformin, currently used for Type 2 diabetes.

It seems like the drug trial is funded by The American Federation for Aging Research (a nonprofit). The likelihood of success isn't high, but one of the core reasons for running the trial seems to be to make it the first anti-aging drug trial and to have the FDA develop a framework for that purpose. It will enroll 3,000 patients who are between 70 and 80 years old at the start of the study. Metformin is cheap to produce, so the trial isn't too expensive for the nonprofit that funds it.
There is discussion on Hacker News. tl;dr: Don't hold your breath.

You have understood Loosemore's point but you're making the same mistake he is. The AI in your example would understand the intent behind the words "maximize human happiness" perfectly well but that doesn't mean it would want to obey that intent. You talk about learning human values and internalizing them as if those things naturally go together. The only way that value internalization naturally follows from value learning is if the agent already wants to internalize these values; figuring out how to do that is (part of) the Friendly AI problem.

Yes, I'm quite aware of that problem. It was outside the scope of this particular essay, though it's somewhat implied by the deceptive turn and degrees of freedom hypotheses.

My cursor was literally pixels away from the downvote button. :)

I honestly don't know what more to write to make you understand that you misunderstand what Yudkowsky really means.

You may be suffering from a bad case of the Doctrine of Logical Infallibility, yourself.

What you need to do is address the topic carefully, and eliminate the ad hominem comments like this: ... which talk about me, the person discussing things with you.

I will now examine the last substantial comment you wrote, above. This is your opening topic statement. Fair enough. You are agreeing with what I say on this point, so we are in agreement so far.

You make three statements here, but I will start with the second one: This is a contradiction of the previous paragraph, where you said "Yudkowsky believes that a superintelligent AI [...] will put all humans on dopamine drip despite protests that this is not what they want". Your other two statements are that Yudkowsky is NOT saying that the AI will do this "because it is absolutely certain of its conclusions past some threshold", and he is NOT saying that the AI will "fail to update its beliefs accordingly".

In the paper I have made a precise statement of what the "Doctrine of Logical Infallibility" means, and I have given references to show that the DLI is a summary of what Yudkowsky et al. have been claiming. I have then given you a more detailed explanation of what the DLI is, so you can have it clarified as much as possible. If you look at every single one of the definitions I have given for the DLI you will see that they are all precisely true of what Yudkowsky says. I will now itemize the DLI into five components so we can find which component is inconsistent with what Yudkowsky has publicly said.

1) The AI decides to do action X (forcing humans to go on a dopamine drip). Everyone agrees that Yudkowsky says this.

2) The AI knows quite well that there is massive, converging evidence that action X is inconsistent with the goal statement Y that was supposed to justify X (where goal statement Y was something like "maximize human happiness"). This is a point that you and others repeatedly misunderstand or misconstrue, so before you respond to it, let me give details of the "converging evidence" tha

The only sense in which the "rigidity" of goals can be said to be a universal fact about minds is that it is these goals that determine how the AI will modify itself once it has become smart and capable enough to do so. It's not a good idea to modify your goals if you want them to become reality; that seems obviously true to me, except perhaps for a small number of edge cases related to internally incoherent goals.

Your points against the inevitability of goal rigidity don't seem relevant to this.

If you take the binary view that you're either smart enough to achieve your goals or not, then you might well want to stop improving when you have the minimum intelligence necessary to meet them... which means, among other things, that AIs with goals requiring human or lower intelligence won't become superhuman... which lowers the probability of the Clippie scenario. It doesn't require huge intelligence to make paperclips, so an AI with a goal to make paperclips, but not to make any specific amount, wouldn't grow into a threatening monster. The probability of the Clippie scenario is also lowered by the consideration that fine-grained goals might shift during the self-improvement phase, so the Clippie scenario -- arbitrary goals combined with a superintelligence -- is whittled away from both ends.

Is the Doctrine of Logical Infallibility Taken Seriously?

No, it's not.

The Doctrine of Logical Infallibility is indeed completely crazy, but Yudkowsky and Muehlhauser (and probably Omohundro, I haven't read all of his stuff) don't believe it's true. At all.

Yudkowsky believes that a superintelligent AI programmed with the goal to "make humans happy" will put all humans on dopamine drip despite protests that this is not what they want, yes. However, he doesn't believe the AI will do this because it is absolutely certain of its conclusions past so... (read more)

Furcas, you say: When I talked to Omohundro at the AAAI workshop where this paper was delivered, he accepted without hesitation that the Doctrine of Logical Infallibility was indeed implicit in all the types of AI that he and the others were talking about. Your statement above is nonsensical because the idea of a DLI was *invented* precisely in order to summarize, in a short phrase, a range of absolutely explicit and categorical statements made by Yudkowsky and others, about what the AI will do if it (a) decides to do action X, and (b) knows quite well that there is massive, converging evidence that action X is inconsistent with the goal statement Y that was supposed to justify X. Under those circumstances, the AI will ignore the massive converging evidence of inconsistency and instead it will enforce the 'literal' interpretation of goal statement Y. The fact that the AI behaves in this way -- sticking to the literal interpretation of the goal statement, in spite of external evidence that the literal interpretation is inconsistent with everything else that is known about the connection between goal statement Y and action X -- IS THE VERY DEFINITION OF THE DOCTRINE OF LOGICAL INFALLIBILITY.
All assuming that the AI won't update its goals even if it realizes there is some mistake. That isn't obvious, and in fact is hard to defend. An AI that is powerful and effective would need to seek the truth about a lot of things, since an entity that has contradictory beliefs will be a poor instrumental rationalist. But would its goal of truth-seeking necessarily be overridden by other goals... would it know but not care?

It might be possible to build an AI that didn't care about interpreting its goals correctly. It looks like you would need to engineer a distinction between instrumental beliefs and terminal beliefs. Remember that the terminal/instrumental distinction is conceptual, not a law of nature. (While we're on the subject, you might need a firewall to stop an AI acting on intrinsically motivating ideas, if they exist.) In any case, orthogonality is an architecture choice, not an ineluctable fact about minds.

MIRI's critics, Loosemore, Hibbard and so on, are tacitly assuming architectures without such unupdateability and firewalling. MIRI needs to show that such an architecture is likely to occur, either as a design or a natural evolution. If AIs with unupdateable goals are dangerous, as MIRI says, it would be simplest not to use that architecture... if it can be avoided. ("We also agree with Yudkowsky (2008a), who points out that research on the philosophical and technical requirements of safe AGI might show that broad classes of possible AGI architectures are fundamentally unsafe, suggesting that such architectures should be avoided.") In other words, it would be careless to build a genie that doesn't care.

If the AI community isn't going to deliberately build the goal-rigid kind of AI, then MIRI's arguments come down to how it might be a natural or convergent feature... and the wider AI community finds the goal-rigid idea so unintuitive that it fails to understand MIRI, who in turn fail to make it explicit enough. When Loosemore talks about the doctrine of

Nah, we can just ignore the evil fraction of humanity's wishes when designing the Friendly AI's utility function.

The Muslims think the West is evil, or certainly less moral, and essentially vice versa. The atheists think all the religious are less moral, and vice versa. Practically speaking, I think the fraction of humanity that is not particularly involved in building AI will have their wishes ignored, and it would not be many who would define that fraction of humanity as evil.
While that was phrased in a provocative manner, there /is/ an important point here: If one has irreconcilable value differences with other humans, the obvious reaction is to fight about them; in this case, by competing to see who can build an SI implementing theirs first. I very much hope it won't come to that, in particular because that kind of technology race would significantly decrease the chance that the winning design is any kind of FAI. In principle, some kinds of agents could still coordinate to avoid the costs of that kind of outcome. In practice, our species does not seem to be capable of coordination at that level, and it seems unlikely that this will change pre-SI.

Not about anything important, and that scares me.

Is there anything important you think you should change your mind about?

and Eliezer's new sequence (most of it's not metaethics, but it's required reading for understanding the explanation of his 2nd attempt to explain metaethics, which is more precise than his first attempt in the earlier Sequences).

Where is this 2nd attempt to explain metaethics by Eliezer?

I'm pretty new, I couldn't tell you for sure. I'm pretty sure it's two posts in that second sequence: Mixed Reference: The Great Reductionist Project and By Which It May Be Judged. I'm pretty sure the rest of the sequence at least is necessary to understand those.

2015 question: WHAT DO YOU THINK ABOUT MACHINES THAT THINK?

There are answers by lots of famous or interesting scientists and philosophers, including Max Tegmark, Nick Bostrom, and Eliezer.

What I find most interesting about the responses is how many of them state an opinion on the Superintelligence danger issue either without responding at all to Bostrom's arguments, or based on counter-arguments that completely miss Bostrom's points. And this after the question explicitly cited Bostrom's work.

All of these high-status scientists speaking out about AGI existential risk seldom mention MIRI or use its terminology. I guess MIRI is still seen as too low status.

There has certainly been increased general media coverage lately, and MIRI was mentioned in the Financial Times recently.
Perhaps they do, but the journalists or their editors edit it out?

A while ago Louie Helm recommended buying Darkcoins. After he did, the price of a Darkcoin went up to more than $10, but now it's down to $2. Is it still a good idea to buy Darkcoins, that is, is their price likely to go back up?

Honestly, I doubt cryptocurrency prices are any better than a random walk (unless you have some special foreknowledge of some extra attention the currency is about to get).