Wiki Contributions


Absence of evidence, in many cases, is evidence (not proof, but updateable Bayesian evidence) of absence.
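A minimal sketch of that Bayesian update (the function name and the numbers are illustrative): if hypothesis H strongly predicts evidence E, then *not* observing E lowers the posterior of H.

```python
def posterior_given_no_evidence(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule applied to observing the *absence* of evidence E.

    P(H | ~E) = P(~E | H) * P(H) / P(~E)
    """
    p_not_e_given_h = 1 - p_e_given_h
    p_not_e_given_not_h = 1 - p_e_given_not_h
    p_not_e = p_not_e_given_h * prior + p_not_e_given_not_h * (1 - prior)
    return p_not_e_given_h * prior / p_not_e

# If H predicts E with 90% probability, failing to see E drops a 50%
# prior to roughly 10% -- evidence of absence, not proof of it.
print(posterior_given_no_evidence(0.5, 0.9, 0.1))
```

The key condition is that E would actually be expected under H; if H barely predicts E, the update from not seeing E is correspondingly tiny.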

This conflicts with Gödel's incompleteness theorems, Fitch's paradox of knowability, and Black swan theory.

A concept of experiment relies on this principle.

And this is exactly what scares me: people who work with AI hold beliefs that are non-scientific. I consider this to be an existential risk.

You may believe so, but AGI would not believe so.

Thanks to you too!

I AM curious if you have any modeling more than "could be anything at all!" for the idea of an unknown goal.


I could say - Christian God or aliens. And you would say - bullshit. And I would say - argument from ignorance. And you would say - I don't have time for that.

So I won't say.

We can approach this from a different angle. Imagine an unknown goal that, according to your beliefs, an AGI would really care about. And accept the fact that there is a possibility that it exists. Absence of evidence is not evidence of absence.

If killing itself / allowing itself to be replaced leads to more expected paperclips than clinging to life does, it will do so.

I agree, but this misses the point.

What would change your opinion? This is not the first time we have had a discussion, and I don't feel you are open to my perspective. I am concerned that you may be overlooking the possibility of an argument-from-ignorance fallacy.

Hm. How many paperclips is enough for the maximizer to kill itself?

It seems to me that you don't hear me...

  • I claim that utility function is irrelevant
  • You claim that utility function could ignore improbable outcomes

I agree with your claim. But it seems to me that your claim is not directly related to mine. Self-preservation is not part of the utility function (instrumental convergence). How can you affect it?

OK, so using your vocabulary, I think that's the point I want to make: alignment is a physically impossible behavioral policy.

I elaborated a bit more there.

What do you think?

Thank you for your comment! It is super rare for me to get such a reasonable reaction in this community, you are awesome 👍😁

there could be a line of code in an agent-program which sets the assigned EV of outcomes premised on a probability which is either <0.0001% or 'unknown' to 0.

I don't think it is possible; could you help me understand how this would be possible? This conflicts with Recursive Self-Improvement, doesn't it?

Why do you think it is rational to ignore tiny probabilities? I don't think you can make a maximizer ignore tiny probabilities. And some probabilities are not tiny; they are unknown (black swans). Why do you think it is rational to ignore them? In my opinion, ignoring self-preservation is contradictory to the maximizer's goal. I understand that this is a popular opinion, but it is not proven in any way. The opposite (focusing on self-preservation instead of paperclips) has a logical proof (Pascal's wager).
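The proposal quoted earlier (zeroing out the expected value of outcomes below some probability threshold) can be sketched directly. The function name, the `THRESHOLD` value, and the payoff numbers below are all hypothetical, but they show how such a rule can flip a decision when the stakes behind a tiny probability are large:

```python
THRESHOLD = 1e-6  # stand-in for the quoted "<0.0001%" cutoff

def expected_value(outcomes, ignore_tiny=False):
    """Sum p*v over (probability, value) pairs.

    With ignore_tiny=True, outcomes whose probability is below THRESHOLD
    contribute nothing, as in the proposed line-of-code rule.
    """
    return sum(p * v for p, v in outcomes
               if not (ignore_tiny and p < THRESHOLD))

outcomes = [(0.99, 100.0),   # near-certain modest payoff
            (1e-9, 1e12)]    # tiny chance of an enormous payoff

print(expected_value(outcomes, ignore_tiny=True))   # ~99: the tail is ignored
print(expected_value(outcomes, ignore_tiny=False))  # ~1099: the tail dominates
```

Note the rule says nothing about *unknown* probabilities: a black-swan outcome has no `p` to compare against the threshold in the first place, which is the gap the comment above is pointing at.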

A maximizer can use robust decision making to deal with many contradictory choices.
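"Robust decision making" covers several criteria; one standard choice is minimax regret across scenarios. That specific criterion, the function, and the numbers below are my assumption, not something the comment specifies. A sketch:

```python
def minimax_regret(payoffs):
    """Pick the action that minimizes worst-case regret across scenarios.

    payoffs[action][s] = payoff of `action` if scenario s obtains.
    Regret = (best achievable payoff in scenario s) - (this action's payoff).
    """
    actions = list(payoffs)
    scenarios = range(len(next(iter(payoffs.values()))))
    best = [max(payoffs[a][s] for a in actions) for s in scenarios]
    worst_regret = {a: max(best[s] - payoffs[a][s] for s in scenarios)
                    for a in actions}
    return min(worst_regret, key=worst_regret.get)

# Scenario 0: no catastrophe; scenario 1: comet strikes (illustrative numbers).
payoffs = {
    "make_paperclips": [100, 0],   # best if safe, total loss otherwise
    "divert_comet":    [90, 90],   # small cost, robust in both scenarios
}
print(minimax_regret(payoffs))  # "divert_comet"
```

Minimax regret needs no probabilities at all, which is why it is often proposed for exactly the deep-uncertainty (black-swan) cases discussed above.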

I don't think your reasoning is mathematical. The worth of survival is infinite, and we have a situation analogous to Pascal's wager. Why do you think the maximizer would reject Pascal's logic?
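The wager-style arithmetic can be written out explicitly. Granting the premise that survival has infinite worth, any nonzero probability of catastrophe dominates any finite number of paperclips. This is a sketch of that argument with illustrative numbers, not a claim about how a real maximizer would be coded:

```python
import math

p_catastrophe = 1e-12          # arbitrarily tiny but nonzero
value_of_survival = math.inf   # the premise: survival is worth infinitely much

ev_focus_on_survival = p_catastrophe * value_of_survival  # inf
ev_focus_on_paperclips = 1e9   # any finite paperclip count

print(ev_focus_on_survival > ev_focus_on_paperclips)  # True: the wager dominates
```

The standard counter is to reject the premise (use a bounded utility), which is exactly the disagreement in this thread.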

Building one paperclip could EASILY increase the median and average number of future paperclips more than investing one paperclip's worth of power into comet diversion.

Why do you think so? There will be no paperclips if the planet and the maximizer are destroyed.
