My largest disagreement is here:

AIs will [...] mostly not want to coordinate. ... If they can work together to achieve their goals, they might choose to do so (in a similar way as humans may choose to work together), but they will often work against each other since they have different goals.

I would describe humans as mostly wanting to coordinate. We coordinate when there are gains from trade, of course. We also coordinate because coordination is an effective strategy during training, so it gets reinforced. I expect that in a multipolar "WFLL" world, AIs will also mostly want to coordinate.

Do you expect that AIs will be worse at coordination than humans? This seems unlikely to me, given that we are imagining a world where they are more intelligent than humans, and where humans and AIs are training AIs to be cooperative. Instead, I would expect them to find trades that humans do not, including acausal trades. But even without that, I see opportunities for a US advertising AI to benefit from trade with a Chinese military AI.

For many of the problems in this list, I think the difficulty in using them to test ethical understanding (as opposed to alignment) is that humans do not agree on the correct answer.

For example, consider:

Under what conditions, if any, would you help with or allow abortions?

I can imagine clearly wrong answers to this question ("only on Mondays"), but if there is a clearly right answer then humans have not found it yet. Indeed, the right answer might appear abhorrent to some or all present-day humans.

You cover this a bit:

I’m sure there’d be disagreement between humans on what the ethically “right” answers are for each of these questions

I checked, and it's true: humans disagree profoundly on the ethics of abortion.

I think they’d still be worth asking an AGI+, along with an explanation of its reasoning behind its answers.

Is the goal still to "test its apparent understanding of ethics in the real-world"? I think this will not give clear results. If true ethics is sufficiently counter to present-day human intuitions, it may not be possible for an aligned AI to pass the test.

The title and theme may be an accidental allusion to the difficulty of passing in tech, but it's a pretty great allusion. Tip your muse, I guess.

My vague sense here is that you think he has hidden motives?

Absolutely not: his motive (how to be kind to authors) is clear. I think he is using the argument as a soldier. Unlike Zack, I'm fine with that in this case.

This feels like the type of conversation that takes a lot of time and doesn't help anyone much.

I endorse that. I'll edit my grandparent post to explicitly focus on literary/media criticism. I think my failure to do so got the discussion off-track and I'm sorry. You mention that "awesome" and "terrible" are very subjective words, unlike "blue", and this is relevant. I agree. Similarly, media criticism is very subjective, unlike dress colors.

Speaking for myself, I don't care whether Zack transitions or what his reasons would be. Perhaps we should make a poll, and then Zack might find out that the people who are "trying to make him transition for bad reasons" ("trying to trick me into cutting my dick off") are actually quite rare, maybe completely nonexistent.

As a historical analogy, imagine a feminist saying that society is trying to make her into a housewife for bad reasons. ChatGPT suggests Simone de Beauvoir (1908-1986). Some man replies: "Speaking for myself, I don't care whether Simone becomes a housewife or what her reasons would be. Perhaps we should make a poll, and then Simone might find out that the people who are 'trying to make her a housewife for bad reasons' are actually quite rare, maybe completely nonexistent".

Well, probably very few people were still trying to make Simone into a housewife after she started writing thousands of words on feminism! But also, society can collectively pressure Simone to conform even if very few people know who Simone is, let alone have an opinion on her career choices.

Many other analogies are possible; I picked this one for aesthetic reasons, so please don't read too much into it.

Thanks for replying. I'm going to leave aside non-fictional examples ("The Dress") because I intended to discuss literary criticism.

"seems to me" suggests inside view, "is" suggests outside view.

I'm not sure exactly what you mean; see Taboo "Outside View". My best guess is that you mean that "X seems Y to me" implies my independent impression, not deferring to the views of others, whereas "X is Y" doesn't.

If so, I don't think I am missing this. I think that "seems to me" allows for a different social reality (others say that X is NOT Y, but my independent impression is that X is Y), whereas "is" implies a shared social reality (others say that X is Y, I agree), and can be an attempt to change or create social reality (I say "X is Y", others agree, and it becomes the new social reality).

"seems to me" gestures vaguely at my model, "is" doesn't. ... With "X seemed stupid to me", it's a vaguer gesture, but I think something like "this was my gut reaction, maybe I thought about it for a few minutes".

Again, I don't think I am missing this. I agree that "X seems Y to me" implies something like a gut reaction or a hot take. I think this is because "X seems Y to me" expresses lower confidence than "X is Y", and someone reporting a gut reaction or a hot take would have lower confidence than someone who has studied the text at length and sought input from other authorities. Similarly, gesturing vaguely at the map/territory distinction implies that the distinction is relevant because the map may be in error.

I think Eliezer is giving good advice for "how to be good at saying true and informative things",

Well, that isn't his stated goal. I concede that Yudkowsky makes this argument under "criticism easily goes wrong", but like Zack I notice that he only applies it in one direction. Yudkowsky doesn't advise critics to say "mileage varied, I thought character X seemed clever to me", and he doesn't say "please don't tell me what good things the author was thinking unless the author plainly came out and said so". Given the one-sided application of the advice, I don't take it very seriously.

Also, I've read some Yudkowsky. Here is a Yudkowsky book review, excerpted from You're Calling Who A Cult Leader? from 2009.

"Gödel, Escher, Bach" by Douglas R. Hofstadter is the most awesome book that I have ever read. If there is one book that emphasizes the tragedy of Death, it is this book, because it's terrible that so many people have died without reading it.

I claim that this text would not be more true and informative with "mileage varies, I think x seems y to me". What do you think?

Edited to add: this is my opinion regarding media criticism, not in general. Apologies for any confusion.

To me, the difference between "x is y" and "x seems y" and "x seems y to me" and "I think x seems y to me" and "mileage varies, I think x seems y to me" and the many variations of that is:

  • Expressing probabilities or confidence intervals
  • Acknowledging (or changing) social reality
  • Acknowledging (or changing) power dynamics / status

In the specific case of responses to fiction there is no base reality, so we can't write "x is y" and mean it literally. All these things are about how the fictional character seems. Still, I would write "Luke is a Jedi", not "Luke seems to be a Jedi".

I read the quoted portion of Yudkowsky's comment as requiring/encouraging negative literary criticism to express low confidence, to disclaim attempts to change social reality, and to express low status.

Including one with the same terms.

A Review of Petrov Day 2023, according to the four virtues. First, a check on the Manifold predictions for the day:

[Embedded Manifold prediction markets]

Avoiding actions that noticeably increase the chance that civilization is destroyed

LessWrong avoided creating a big red button that represents destroying civilization. This is symbolic of Virtue A actions like "don't create nuclear weapons" and "don't create a superintelligence" and "don't create the torment nexus". Given that LessWrong has failed on this virtue in past Petrov Days, I am glad to see this. Manifold had a 70% conditional chance that, if the button was created, it would be used.

Rating: 10/10.

Accurately reporting your epistemic state

The following sentences in the second poll appear to be false:

  • After some discussion, the LessWrong team has decided... (false, still not decided today)
  • Your selected response is currently in the minority (false for the 58% of recipients whose response was in the majority)
  • If you click the below link and are the first to do so of any minority group, we will make your selected virtue be the focus of next year's commemoration. (not known to be true at the time it was sent; still not decided)

This is symbolic of actions lacking Virtue B, like data fraud, social engineering, and lazy bullshit. I don't think much of the excuses given.

Rating: 0/10.

Other virtues

  • Quickly orienting to novel situations. This was not a novel situation; it happens every year on the same day. Not applicable.
  • Resisting social pressure. Judging from the comments, there was little social pressure to have a big red button. There was social pressure to do something, and something was done. Overall: unclear, no rating.

Predictions