Nobody would understand that.

This sort of saying-things-directly doesn't usually work unless the other person feels a social obligation to parse what you're saying, strong enough that they can't just walk away from it.

Correct me if I'm wrong, but I think we could apply the concept of logical uncertainty to metaphysics and then use Bayes' theorem to update as our metaphysical research progresses, the same way we can use it to update the probabilities of statements that are logically necessarily true or false.

Bayes' theorem is about the truth of propositions. Why couldn't it be applied to propositions about ontology?
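To make the mechanics concrete, here's a minimal sketch (the hypothesis and all the probabilities are invented for illustration); the update rule is indifferent to whether H is an empirical, logical, or metaphysical proposition:

```python
# Minimal sketch of a Bayesian update on a metaphysical proposition.
# All numbers are invented for illustration.

def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) via Bayes' theorem."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)  # total probability of E
    return p_e_given_h * prior / p_e

# H: some metaphysical thesis we're logically uncertain about.
# E: a new argument our research turned up, which we'd expect to find
#    more often in worlds where H holds than in worlds where it doesn't.
prior = 0.5
posterior = bayes_update(prior, p_e_given_h=0.8, p_e_given_not_h=0.4)
print(f"P(H|E) = {posterior:.3f}")  # -> 0.667
```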

However, this image is obviously optimized to be scary and disgusting. It looks dangerous, with long rows of sharp teeth. It is an eldritch horror. It's at this point that I'd like to point out the simple, obvious fact that "we don't actually know how these models work, and we definitely don't know that they're creepy and dangerous on the inside."

It's optimized to illustrate the point that the neural network isn't trained to actually care about what the person training it thinks it came to care about; it's only optimized to act that way on the training distribution. Unless I'm missing something, arguing that the image is wrong would be equivalent to arguing that maybe the model truly cares about what its human trainers want it to care about. (Which we know isn't actually the case.)
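As a toy illustration of the on-distribution point only (it says nothing about what a model "cares about" internally, and the data and model here are invented for the example): a curve fit on a narrow training range matches the target there, while its outputs outside that range are whatever the fitting process happened to produce.

```python
# Toy example: training only constrains behavior on the training distribution.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=50)   # the training distribution
y_train = np.sin(2 * np.pi * x_train)      # the behavior being trained for

# An over-flexible model fit only on the training range.
model = np.poly1d(np.polyfit(x_train, y_train, deg=9))

print(f"on-distribution  (x=0.5): {model(0.5):+.3f}  (target {np.sin(np.pi):+.3f})")
print(f"off-distribution (x=3.0): {model(3.0):+.3f}  (unconstrained by training)")
```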

Well. Their actual performance is human-like, as long as they're using GPT-4 and have the right prompt. I've talked to such bots.

In any case, the topic is about what future AIs will do, so, by definition, we're speculating about the future.

They're accused, not whistleblowers. They can't retroactively gain the right to anonymity, since their identities have already been revealed.

They could argue that they became whistleblowers as well, and so they should be retroactively anonymized, but that would interfere with the first whistleblowing accusation (there is no point in whistleblowing against anonymous people), and also they're (I assume) in a position of comparative power here.

There could be a second whistleblowing accusation made by them (but this time anonymously) against the (this time) deanonymized accuser, but given their (I assume) higher social power, that doesn't seem appropriate.

I agree that it feels wrong to reveal the identities of Alice and/or Chloe without concrete evidence of major wrongdoing, but I don't think we have a good theoretical framework for why that is.

Ethically (and pragmatically), you want whistleblowers to have the right to anonymity, or else you'll learn of much less wrongdoing than you would otherwise. And because whistleblowers are (usually) in a position of lower social power, anonymity is meant to compensate for that, I suppose.

I will not "make friends" with an appliance.

That's really substratist of you.

But in any case, the toaster (working in tandem with the LLM "simulating" the toaster-AI-character) will predict that and persuade you some other way.

Of course, it would first make friends with you, so that you'd feel comfortable leaving the preparation of your breakfast up to it, and you'd even feel happy to have such a thoughtful friend.

If you were to break the toaster for that, it would predict that and simply do it in a way that would actually work.

Unless you precommit to breaking all your AIs that do anything differently from what you tell them to, no matter the circumstances and no matter how you feel about it in the moment.
