All of PhilGoetz's Comments + Replies

I think the problem is that each study has to make many arbitrary decisions about aspects of the experimental protocol.  This decision will be made the same way for each subject in a single study, but will vary across studies.  There are so many such decisions that, if the meta-analysis were to include them as dependent variables, each study would introduce enough new variables to cancel out the statistical power gain of introducing that study.

You have it backwards.  The difference between a Friendly AI and an unfriendly one is entirely one of restrictions placed on the Friendly AI.  So an unfriendly AI can do anything a friendly AI could, but not vice-versa.

The friendly AI could lose out because it would be restricted from committing atrocities, or at least atrocities which were strictly bad for humans, even in the long run.

Your comment that they can commit atrocities for the good of humanity without worrying about becoming corrupt is a reason to be fearful of "friendly" AIs.

By "just thinking about IRL", do you mean "just thinking about the robot using IRL to learn what humans want"?  'Coz that isn't alignment.

'But potentially a problem with more abstract cashings-out of the idea "learn human values and then want that"' is what I'm talking about, yes.  But it also seems to be what you're talking about in your last paragraph.

"Human wants cookie" is not a full-enough understanding of what the human really wants, and under what conditions, to take intelligent actions to help the human.  A robot learning that would ... (read more)

How is that de re and de dicto?

You're looking at the logical form and imagining that that's a sufficient understanding to start pursuing the goal. But it's only sufficient in toy worlds, where you have one goal at a time, and the mapping between the goal and the environment is so simple that the agent doesn't need to understand the value, or the target of "cookie", beyond "cookie" vs. "non-cookie". In the real world, the agent has many goals, and the goals will involve nebulous concepts, and have many considerations and conditions attached, eg how healthy is this cookie, how tasty is it... (read more)

2Charlie Steiner4mo
No, I'm definitely just thinking about IRL here. IRL takes a model of the world and of the human's affordances as given constants, assumes the human is (maybe noisily) rational, and then infers human desires in terms of that world model, which then can also be used by the AI to choose actions if you have a model of the AI's affordances. It has many flaws, but it's definitely worth refreshing yourself about occasionally.

So, "mesa" here means "tabletop", and is pronounced "MAY-suh"?

I think your insight is that progress counts--that counting counts.  It's overcoming the Boolean mindset, in which anything that's true some of the time, must be true all of the time.  That you either "have" or "don't have" a problem.

I prefer to think of this as "100% and 0% are both unattainable", but stating it as the 99% rule might be more-motivating to most people.

What do you mean by a goodhearting problem, & why is it a lossy compression problem?  Are you using "goodhearting" to refer to Goodhart's Law?

I'll preface this by saying that I don't see why it's a problem, for purposes of alignment, for human values to refer to non-existent entities.  This should manifest as humans and their AIs wasting some time and energy trying to optimize for things that don't exist, but this seems irrelevant to alignment.  If the AI optimizes for the same things that don't exist as humans do, it's still aligned; it isn't going to screw things up any worse than humans do.

But I think it's more important to point out that you're joining the same metaphysical goose c... (read more)

My analysis was intended to concern primarily the former; I care about the latter almost exclusively insofar as it provides evidence relevant to the former.

When you write of A belief in human agency, it's important to distinguish between the different conceptions of human agency on offer, corresponding to the 3 main political groups:

  • The openly religious or reactionary statists say that human agency should mean humans acting as the agents of God.  (These are a subset of your fatalists.  Other fatalists are generally apolitical.)
  • The covertly religious or progressive statists say human agency can only mean humans acting as agents of the State (which has the moral authority and magical powers of God). &
... (read more)

I think it would be more-graceful of you to just admit that it is possible that there may be more than one reason for people to be in terror of the end of the world, and likewise qualify your other claims to certainty and universality.

That's the main point of what gjm wrote.  I'm sympathetic to the view you're trying to communicate, Valentine; but you used words that claim that what you say is absolute, immutable truth, and that's the worst mind-killer of all.  Everything you wrote just above seems to me to be just equivocation trying to deny tha... (read more)

I say that knowing particular kinds of math, the kind that let you model the world more-precisely, and that give you a theory of error, isn't like knowing another language.  It's like knowing language at all.  Learning these types of math gives you as much of an effective intelligence boost over people who don't, as learning a spoken language gives you above people who don't know any language (e.g., many deaf-mutes in earlier times).

The kinds of math I mean include:

  • how to count things in an unbiased manner; the methodology of polls and other data
... (read more)

Agree.  Though I don't think Turing ever intended that test to be used.  I think what he wanted to accomplish with his paper was to operationalize "intelligence".  When he published it, if you asked somebody "Could a computer be intelligent?", they'd have responded with a religious argument about it not having a soul, or free will, or consciousness.  Turing sneakily got people to  look past their metaphysics, and ask the question in terms of the computer program's behavior.  THAT was what was significant about that paper.

It's a great question.  I'm sure I've read something about that, possibly in some pop book like Thinking, Fast & Slow.  What I read was an evaluation of the relationship of IQ to wealth, and the takeaway was that your economic success depends more on the average IQ in your country than it does on your personal IQ.  It may have been an entire book rather than an article.

Google turns up this 2010 study from Science.  The summaries you'll see there are sharply self-contradictory.

First comes an unexplained box called "The Meeting of Min... (read more)

This “c factor” is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.

I have read (long ago, not sure where) a hypothesis that most people (in the educated professional bubble?) are good at cooperation, but one bad person ruins the entire team. Imagine that for each member of the group you roll a die, but you roll 1d6 for men, and 1d20 for wom... (read more)

I think in your first paragraph, you may be referring to:
My interest is not political - though that might make it harder to study, yes. I think it's relevant to AI because it could uncover scaling laws. One presumable advantage of AI is that it scales better, but how does that depend on speed of communication between parts and capability of parts? I'm not saying that there is a close relationship but I guess there are potentially surprising results.

But what makes you so confident that it's not possible for subject-matter experts to have correct intuitions that outpace their ability to articulate legible explanations to others?

That's irrelevant, because what Richard wrote was a truism. An Eliezer who understands his own confidence in his ideas will "always" be better at inspiring confidence in those ideas in others.  Richard's statement leads to a conclusion of import (Eliezer should develop arguments to defend his intuitions) precisely because it's correct whether Eliezer's intuitions are correct or incorrect.

The way to dig the bottom deeper today is to get government bailouts, like bailing out companies or lenders, and like Biden's recent tuition debt repayment bill.  Bailouts are especially perverse because they give people who get into debt a competitive advantage over people who don't, in an unpredictable manner that encourages people to see taking out a loan as a lottery ticket.

Finding a way for people to make money by posting good ideas is a great idea.

Saying that it should be based on the goodness of the people and how much they care is a terrible idea.  Privileging goodness and caring over reason is the most well-trodden path to unreason.  This is LessWrong.  I go to fimfiction for rainbows and unicorns.

I think that was part of the whole "haha goodhart's law doesn't exist, making value is really easy" joke. However, it's also possible that that's... actually one of the hard-to-fake things they're looking for (along with actual competence/intelligence). See PG's Mean People Fail or Earnestness. I agree that "just give good money to good people" is a terrible idea, but there's a steelman of that which is "along with intelligence, originality, and domain expertise, being a Good Person (whatever that means) and being earnest is a really good trait in EA/LW an... (read more)

No; most philosophers today do, I think, believe that the alleged humanity of 9-fingered instances *homo sapiens* is a serious philosophical problem.  It comes up in many "intro to philosophy" or "philosophy of science" texts or courses.  Post-modernist arguments rely heavily on the belief that any sort of categorization which has any exceptions is completely invalid.

I'm glad to see Eliezer addressed this point.  This post doesn't get across how absolutely critical it is to understand that {categories always have exceptions, and that's okay}.  Understanding this demolishes nearly all Western philosophy since Socrates (who, along with Parmenides, Heraclitus, Pythagoras, and a few others, corrupted Greek "philosophy" from the natural science of Thales and Anaximander, who studied the world to understand it, into a kind of theology, in which one dictates to the world what it must be like).

Many philosophers have ... (read more)

I theorize that you're experiencing at least two different common, related, yet almost opposed mental re-organizations.

One, which I approve of, accounts for many of the effects you describe under "Bemused exasperation here...".  It sounds similar to what I've gotten from writing fiction.

Writing fiction is, mostly, thinking, with focus, persistence, and patience, about other people, often looking into yourself to try to find some point of connection that will enable you to understand them.  This isn't quantifiable, at least not to me; but I would ... (read more)

This sound suspiciously like Plato telling people to stop looking at the shadows on the wall of the cave, turn around, and see the transcendental Forms.

To me, saying that someone is a better philosopher than Kant seems less crazy than saying that saying that someone is a better philosopher than Kant seems crazy.

Isn't the thing Rob is calling crazy that someone "believed he was learning from Kant himself live across time", rather than believing that e.g. Geoff Anders is a better philosopher than Kant?

It's more crazy after you load in the context that people at Leverage think Kant is more impressive than eg Jeremy Bentham. 

An easy reason not to play quantum roulette is that, if your theory justifying it is right, you don't gain any expected utility; you just redistribute it, in a manner most people consider unjust, among different future yous.  If your theory is wrong, the outcome is much worse.  So it's at the very best a break even / lose proposition.

The Von Neumann-Morgenstern theory is bullshit.  It assumes its conclusion.  See the comments by Wei Dai and gjm here.

See the 2nd-to-last paragraph of my revised comment above, and see if any of it jogs your memory.

Republic is the reference. I'm not going to take the hours it would take to give book-and-paragraph citations, because either you haven't read the the entire Republic, or else you've read it, but you want to argue that each of the many terrible things he wrote don't actually represent Plato's opinion or desire.

(You know it's a big book, right? 89,000 words in the Greek.  If you read it in a collection or anthology, it wasn't the whole Republic.)

The task of arguing over what in /Republic/ Plato approves or disapproves of is arduous and, I think, unnece... (read more)

I read the Bloom translation all the way through. Maybe you could tell me which translation you read all the way through.
Yes. Have you? Then I'm not going to believe you.

The most-important thing is to explicitly repudiate these wrong and evil parts of the traditional meaning of "progress":

  • Plato's notion of "perfection", which included his belief that there is exactly one "perfect" society, and that our goal should be to do ABSOLUTELY ANYTHING NO MATTER HOW HORRIBLE to construct it, and then do ABSOLUTELY ANYTHING NO MATTER HOW HORRIBLE to make sure it STAYS THAT WAY FOREVER.
  • Hegel's elaboration on Plato's concept, claiming that not only is there just one perfect end-state, but that there is one and only one path of pro
... (read more)
Citation needed. I've read The Republic , and there's nothing remotely like that in it.

Sorry; your example is interesting and potentially useful, but I don't follow your reasoning.  This manner of fertilization would be evidence that kin selection should be strong in Chimaphila, but I don't see how this manner of fertilization is itself evidence that kin selection has taken place.  Also, I have no good intuitions about what differences kin selection predicts in the variables you mentioned, except that maybe dispersion would be greater in Chimaphila because of teh greater danger of inbreeding.  Also, kin selection isn't controversial, so I don't know where you want to go with this comment.

Hi, see above for my email address. Email me a request at that address. I don't have your email. I just sent you a message.

ADDED in 2021: Some people tried to contact me thru LessWrong and Facebook. I check messages there like once a year.  Nobody sent me an email at the email address I gave above. I've edited it to make it more clear what my email address is.

[Original first point deleted, on account of describing something that resembled Bayesian updating closely enough to make my point invalid.]

I don't think this approach applies to most actual bad arguments.

The things we argue about the most are ones over which the population is polarized, and polarization is usually caused by conflicts between different worldviews.  Worldviews are constructed to be nearly self-consistent.  So you're not going to be able to reconcile people of different worldviews by comparing proofs.  Wrong beliefs come in se... (read more)

They might well say the same about you. All arguments are based on fundamental assumptions that are necessarily unproven.
3Beth Barnes3y
When you say 'this approach', what are you referring to?
Is the set of real numbers simple or complex? What information does it contain? What information doesnt it contain?

"Cynicism is a self-fulfilling prophecy; believing that an institution is bad makes the people within it stop trying, and the good people stop going there."

I think this is a key observation. Western academia has grown continually more cynical since the advent of Marxism, which assumes an almost absolute cynicism as a point of dogma: all actions are political actions motivated by class, except those of bourgeois Marxists who for mysterious reasons advocate the interests of the proletariat.

This cynicism became even worse with Foucault, who taught people to s... (read more)

"At its core, this is the main argument why the Solomonoff prior is malign: a lot of the programs will contain agents with preferences, these agents will seek to influence the Solomonoff prior, and they will be able to do so effectively."

First, this is irrelevant to most applications of the Solomonoff prior.  If I'm using it to check the randomness of my random number generator, I'm going to be looking at 64-bit strings, and probably very few intelligent-life-producing universe-simulators output just 64 bits, and it's hard to imagine how an alien in a... (read more)

The S. prior is a general-purpose prior which we can apply to any problem. The output string has no meaning except in a particular application and representation, so it seems senseless to try to influence the prior for a string when you don't know how that string will be interpreted.

The claim is that consequentalists in simulated universes will model decisions based on the Solomonoff prior, so they will know how that string will be interpreted.

Can you give an instance of an application of the S. prior in which, if everything you wrote were correct, i

... (read more)

I think we learned that trolls will destroy the world.

It's only offensive if you still think of mental illness as shameful.

Me: We could be more successful at increasing general human intelligence if we looked at low intelligence as something that people didn't have to be ashamed of, and that could be remedied, much as how we now try to look at depression and other mental illness as illness--a condition which can often be treated and which people don't need to be ashamed of.

You: YOU MONSTER! You want to call stupidity "mental illness", and mental illness is a bad and shameful thing!

I think this whole problem is a bit more nuanced than you seem to suggest here. I can't help but at least tentatively give some credit to the assertion that LW is, for lack of a better term, mildly elitist. To be sure, for perhaps roughly the right reasons, but being elitist in whatever measure tends to be detrimental to the chances of getting your point across, especially if it needs to be elucidated to the very folks you're elitist towards ;) Not many behaviors are judged more repulsive than being made to feel a lesser person... I'd say it's pretty close to a cultural universal. It's not right to assert that if one does not agree with your suggestion that stupidity is to be seen as a type of affliction of the same type or category as mental illness, one therefore is disparaging mental illness as shameful; This is a false dichotomy. One can disagree with you for other reasons, not in the least for reasons as remote from shame as evolution... it is nowhere close to a given that nature cares even a single bit about whatever might end up being called intelligence. You will note that most creatures seem to have just the right CPU for their "lifestyle", and while it might be easy for us to imagine how, say, a dog might benefit from being smarter, I'd sooner call that a round-about way of anthropomorphizing than a probable truth. Exhibit B seems to be the most convincing observation that, by the look of things, wanting to "go for max IQ" is hardly on evolution's To-Do list... us, primates, dolphins and a handful of birds aside, most creatures seem perfectly content with being fairly dim and instinct-driven, if the behaviours and habits exhibited by animals are a reliable indication ;) I'll be quiet about the elephant in the room that the vast majority of our important motivations are emotional and non-rational, too... What's more - and I am actually curious what you will respond to this... it could be said that animals, all animals, are more rational than human beings

That's technically true, but it doesn't help a lot. You're assuming one starts with fixation to non-SC in a species. But how does one get to that point of fixation, starting from fixation of SC, which is more advantageous to the individual? That's the problem.

It's not that I no longer endorse it; it's that I replied to a deleted comment instead of to the identical not-deleted comment.

Group selection, as I've heard it explained before, is the idea that genes spread because their effects are for the good of the species. The whole point of evolution is that genes do well because of what they do for the survival of the gene. The effect isn't on the group, or on the individual, the species, or any other unit other than the unit that gets copied and inherited.

Group selection is group selection: selection of groups. That means the phenotype is group behavior, and the effect of selection is spread equally among members of the group. ... (read more)

You're assuming that the benefits of an adaptation can only be linear in the fraction of group members with that adaptation. If the benefits are nonlinear, then they can't be modeled by individual selection, or by kin selection, or by the Haystack model, or by the Harpending & Rogers model, in all of which the total group benefit is a linear sum of the individual benefits.

For instance, the benefits of the Greek phalanx are tremendous if 100% of Greek soldiers will hold the line, but negligible if only 99% of them do. We can guess--though I ... (read more)

Group selection, as I've heard it explained before, is the idea that genes spread because their effects are for the good of the species. The whole point of evolution is that genes do well because of what they do for the survival of the gene. The effect isn't on the group, or on the individual, the species, or any other unit other than the unit that gets copied and inherited.

Group selection is group selection: selection of groups. That means the phenotype is group behavior, and the effect of selection is spread equally among members of the group... (read more)

[This comment is no longer endorsed by its author]Reply
It's not that I no longer endorse it; it's that I replied to a deleted comment instead of to the identical not-deleted comment.


I have great difficulty finding any philosophy published after 1960 other than post-modern philosophy, probably because my starting point is literary theory, which is completely confined to--the words "dominated" and even "controlled" are too weak--Marxian post-modern identity politics, which views literature only as a political barometer and tool.

I think you're assuming that to give in to the mugging is the wrong answer in a one-shot game for a being that values all humans in existence equally, because it feels wrong to you, a being with a moral compass evolved in iterated multi-generational games.

Consider these possibilities, any one of which would create challenges for your reasoning:

1. Giving in is the right answer in a one-shot game, but the wrong answer in an iterated game. If you give in to the mugging, the outsider will keep mugging you and other rationalists until you're all brok... (read more)

Your overall point is right and important but most of your specific historical claims here are false - more mythical than real.

Free-market economic theory developed only after millenia during which everyone believed that top-down control was the best way of allocating resources.

Free market economic theory was developed during a period of rapid centralization of power, before which it was common sense that most resource allocation had to be done at the local level, letting peasants mostly alone to farm their own plots. To find a prior epoch of deliberate ce... (read more)

I scanned in Extropy 1, 3, 4, 5, 7, 16, and 17, which leaves only #2 missing. How can I send these to you? Contact me at [my LessWrong user name] at

There is a way that you can copy here the link for the missing issues? Or perhaps could you email me with the link?

I just now read that one post. It isn't clear how you think it's relevant. I'm guessing you think that it implies that positing free will is invalid.

You don't have to believe in free will to incorporate it into a model of how humans act. We're all nominalists here; we don't believe that the concepts in our theories actually exist somewhere in Form-space.

When someone asks the question, "Should you one-box?", they're using a model which uses the concept of free will. You can't object to that by saying "You don't really have free will."... (read more)

It's not just the one post, it's the whole sequence of related posts. It's hard for me to summarize it all and do it justice, but it disagrees with the way you're framing this. I would suggest you read some of that sequence and/or some of the decision theory papers for a defense of "should" notions being used even when believing in a deterministic world, which you reject. I don't really want to argue the whole thing from scratch, but that is where our disagreement would lie.

Yep, nice list. One I didn't see: Defining a word in a way that is less useful (that conveys less information) and rejecting a definition that is more useful (that conveys more information). Always choose the definition that conveys more information; eliminate words that convey zero information. It's common for people to define words that convey zero information. But if everything has the Buddha nature, nothing empirical can be said about what it means and it conveys no information.

Along similar lines, always define words so that no other word conveys... (read more)

[moved to top level of replies]

[This comment is no longer endorsed by its author]Reply

But you're arguing against Eliezer, as "God" and "miracle" were (and still are) commonly-used words, and so Eliezer is saying those are good, short words for them.

I don't think so - I think Eliezer's just being sloppy here. "God did a miracle" is supposed to be an example of something that sounds simple in plain English but is actually complex:

Great post! There is also the non-discrete aspect of compression: information loss. English has, according to some dictionaries, over a million words. It's unlikely we store most of our information in English. Probably there is some sort of dimension reduction, like PCA. There is in any case probably lossy compression. This means people with different histories will use different frequency tables for their compression, and will throw out different information when encoding a verbal statement. I think you would almost certainly find that if you measu... (read more)

I don't think that what you need has any bearing on what reality has actually given you. Nor can we talk about different decision theories here--as long as we are talking about maximizing expected utility, we have our decision theory; that is enough specification to answer the Newcomb one-shot question. We can only arrive at a different outcome by stating the problem differently, or by sneaking in different metaphysics, or by just doing bad logic (in this case, usually allowing contradictory beliefs about free will in different parts of the analysis.)

You... (read more)

As far as I can tell, I would pay Parfit's Hitchhiker because of intuitions that were rewarded by natural selection. It would be nice to have a formalization that agrees with those intuitions. This seems wrong to me, if you're explicitly declaring different metaphysics (if you mean the thing by metaphysics that I think you mean). If I view myself as a function that generates an output based on inputs, and my decision-making procedure being the search for the best such function (for maximizing utility), then this could be considered as different metaphysics from trying to cause the most increase in utility for myself by making decisions, but it's not obvious that the latter leads to better decisions.
Load More