LESSWRONG
Dalmert

Comments
Using Bayes' Theorem to determine Optimal Protein Intake
Dalmert · 25d

I've been thinking of doing more things like this; however, I wonder about this part:

  • P(E | H1) = 0.2 (if 150g is enough, poor recovery is unlikely)
  • P(E | H2) = 0.7 (if 170g is needed, poor recovery is likely if you only ate 150g)

Any reason why these particular conditional probabilities -- 0.2 and 0.7 -- were chosen here?
 

Would there be any principled way to update these probabilities as new evidence rolls in, maybe even starting both from 0.5? I think with simple observations alone our set of formulae might be under-constrained, so maybe we'd need to incorporate another stream of evidence to constrain it enough?
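
For what it's worth, here is a minimal sketch in Python of the kind of sequential updating I have in mind, assuming the likelihoods from the post (0.2 and 0.7) are taken as fixed and both hypotheses start at 0.5; the observation stream is made up purely for illustration.

```python
# A sequential Bayesian update over the two hypotheses, with the likelihoods
# from the post assumed fixed. H1: 150g/day is enough; H2: 170g/day is needed.
# The observation stream below is invented for illustration.

def update(prior_h1, p_e_given_h1, p_e_given_h2):
    """Return P(H1 | evidence) from the prior on H1 and the two likelihoods."""
    prior_h2 = 1.0 - prior_h1
    numerator = p_e_given_h1 * prior_h1
    return numerator / (numerator + p_e_given_h2 * prior_h2)

p_h1 = 0.5  # start both hypotheses at 0.5, as suggested above

# One observation per night: True = poor recovery, False = good recovery.
observations = [True, True, False, True]

for poor_recovery in observations:
    if poor_recovery:
        # P(poor recovery | H1) = 0.2, P(poor recovery | H2) = 0.7 (from the post)
        p_h1 = update(p_h1, 0.2, 0.7)
    else:
        # Complements: P(good recovery | H1) = 0.8, P(good recovery | H2) = 0.3
        p_h1 = update(p_h1, 0.8, 0.3)
    print(f"P(150g is enough) = {p_h1:.3f}")
```

Of course this only shuffles probability mass between H1 and H2 given fixed likelihoods; to learn the likelihoods themselves, one would presumably need priors over them (e.g. Beta distributions) and a richer evidence stream, which is exactly the under-constrained part I'm asking about.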

Dalmert's Shortform
Dalmert · 1mo

'Shower thought: If Christianity is true, then being aborted is like winning the lottery'
 

Think about it - you start to exist without consciousness and you die without consciousness. You never experience any form of suffering, because you are not physically able to feel the pain. You are going straight to heaven, because you have no opportunity to sin in any way. You don't even have a birth sin because well... you weren't born. You never experience human existence, you are just "born" in heaven and live in the perfect world from the beginning - like Adam and Eve would have if they hadn't sinned. In Christianity, that's the best position you could be in.

But then, there's also the darker side of it. If you want to create the highest amount of those sinless sufferless beings, you could... get abortions as frequently as you can. The problem is, it will send YOU to hell. In that scenario the woman who would do that would be an ultimate martyr, who would sacrifice her own soul to create many infinitely happy children...

 

Source: /u/Uszanka from reddit (lightly edited mostly to clean up typos and likely ESL quirks)

This thought is short and sweet. I saw it and figured others here might appreciate this view as well. It was a to-me-novel way to look at what might be a somewhat fundamental incoherence at the core of Christianity; maybe some of you will be inclined to argue against it.

This also ties into something I've been thinking about lately: that there might be very few "true" Christians who do not just believe in belief, but truly, fully believe what they preach. Maybe with very heavy compartmentalization and low enough self-reflection it can happen, but the concept of heaven seems to cause some issues.

Here is another: if someone truly believed in heaven, they should take far fewer steps to avoid dying than their revealed preferences betray. On a long enough time horizon, evolution might make sure that, of belief in heaven, only belief-in-belief can persist; the true believers select themselves out.

Sense-making about extreme power concentration
Dalmert · 2mo

Thanks for your response; can I ask the same question of you as I do here, in this cousin comment?

Sense-making about extreme power concentration
Dalmert · 2mo

Reference class tennis, yay!

Sense-making about extreme power concentration
Dalmert · 2mo

I think I somewhat see where you are coming from, but can you spell it out for me a bit more? Maybe by describing a somewhat fleshed-out, concrete example scenario -- all the while I acknowledge that this would be just one hastily put together possibility of many.

Let me start by proposing one such possibility, but feel free to go in another direction entirely. Let's suppose the altruistic few put together sanctuaries or "wild human life reserves"; how might this play out from there? Will the selfish ones somehow try to intrude on or curtail this practice? By our scenario's granted premises, the altruistic ones do wield real power, and they do use some fraction of it to maintain this sanctuary. Even if the others are many, would they have a lot to gain by trying to mess with this? Is it just entertainment, or sport, for them? What do they stand to gain? Not really anything economic, or more power -- or maybe you think they do?

Sense-making about extreme power concentration
Dalmert · 2mo

There is one counterargument that I sometimes hear, and I'm not sure how convincing I should find it:

  1. AI will bring unprecedented and unimaginable wealth.
  2. More than zero people are more than zero altruistic.
  3. It might not be a stretch to argue that at least some altruistic people might end up with some considerable portion of that large wealth and power.
  4. Therefore: some of these somewhat-well-off somewhat-altruists[1] would rather give up little bits of their wealth[2] and power than see the largest humanitarian catastrophe ever unfold before their eyes, with their inaction playing a central role, especially if they have to give up comparatively so little to save so many.

Do you agree or disagree with any parts of this?

P.S. This might go without saying, but this question might only be relevant if technical alignment can be, and is, solved in some fashion. With that said, I think it's entirely good to ask it, lest we clear one impossible-seeming hurdle and still find ourselves in a world of hurt all the same.

  1. ^

    This only needs there to exist something of a Pareto frontier of either very altruistic okay-offs, or well-off only-a-little-altruists, or somewhere in between. If we have many very altruistic very-well-offs, then the argument might just make itself, so I'm arguing in a less convenient context.

  2. ^

    This might truly be tiny indeed, like one one-millionth of someone's wealth, truly a rounding error. Someone arguing for side A might be positing a very large amount of callousness if all other points stand. Or indifference. Or some other force that pushes against the desire to help.

Truth or Dare
Dalmert · 5mo

> The low-trust attractor starts to bend other people into reciprocal low-trust shapes, just like a prion twisting nearby proteins.

Convincing people using your actions sounds disgusting!


Could you expand on what you mean here? I'm not sure I or others followed you. Perhaps you mean it sarcastically?

(Formatting-wise: I'm not sure how to quote a quote here; perhaps someone knows?)

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies
Dalmert · 6mo

Is an audiobook version also planned, perchance? Could preordering that one also help?

Judging from Stephen Fry's endorsement and, as I've seen, his longstanding interest in the topic, perhaps a delightful and maybe even eager deal could be made where he narrates? Unless some other choice might be better for either party, of course. I also understand if negotiations or existing agreements prevent anyone from confirming anything on this front; I'd be happy just to hear whether an audio version is planned or intended to begin with, and when, if that can be known.

a confusion about preference orderings
Dalmert · 6mo

I might be missing something that's already written on this page (including the comments), but if not, here is my vague understanding of what people might fear regarding money pumps. I'm going to diverge from your model a bit and use the concept of a sub-world-state, denoted A', B', and C', which includes everything about the world that can be preferred except for how much money you have; money I handle separately in this comment.

A' -> B' -> C' -> A' preferences hold in a cycle.

M_less -> M_more preference also holds for money had.

I think agents, either intrinsically or instrumentally, (have to) simplify their decisions by factoring them at each timestep.

So they ask themselves:

Do I prefer going from A' -> B' more than having M_less -> M_more, more concretely M_Δ−1 -> M_Δ0?

In this example, the non-money preference is strong, so the answer is clearly yes.

Even if the agent plans ahead a bit, and considers:

Do I prefer A' -> B' -> C' more than having M_Δ−2 -> M_Δ0?

The answer will still be a clear yes.

The interesting question is what someone who fears money pumps might say an agent would do if it occurred to it to plan far enough ahead and consider:

Do I prefer A' -> B' -> C' -> A' more than having M_Δ−3 -> M_Δ0?

According to both this formalism and yours, this agent should clearly realize that they much prefer M_Δ−3 -> M_Δ0 and stay put at A'. And I think you are correct to ask whether planning is allowed by these different formalisms, and how it fits in.
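
To make the planning-horizon point concrete, here is a rough toy sketch in Python (entirely my own construction, with made-up numbers, not anything implied by your post): a horizon-1 agent evaluating one swap at a time keeps paying the fee and cycles forever, while a horizon-3 agent sees that the loop ends where it started and refuses.

```python
# Toy money-pump model: cyclic preferences over the sub-world-states A', B', C',
# plus a separate preference for money. The numbers are assumptions: each
# preferred swap along the cycle is "worth" 2 to the agent, and each swap
# costs a fee of 1.

CYCLE = ["A'", "B'", "C'"]
STEP_VALUE = 2.0  # subjective value of each preferred swap (assumed)
FEE = 1.0         # money given up per swap (assumed)

def plan_value(horizon: int) -> float:
    """Net value, as the agent sees it, of following the cycle for `horizon` swaps.

    Each swap looks worth STEP_VALUE - FEE > 0 in isolation, but after a full
    loop the agent is back at the same sub-world-state, so the non-money gains
    cancel and only the fees remain (M_Δ−3 -> M_Δ0, in the notation above).
    """
    remainder = horizon % len(CYCLE)         # where in the cycle we end up
    non_money_gain = STEP_VALUE * remainder  # completed loops contribute nothing
    money_lost = FEE * horizon
    return non_money_gain - money_lost

for horizon in (1, 2, 3):
    value = plan_value(horizon)
    verdict = "keep trading" if value > 0 else "stay put"
    print(f"horizon {horizon}: net value {value:+.1f} -> {verdict}")

# horizon 1: net value +1.0 -> keep trading   (the myopic agent gets pumped forever)
# horizon 2: net value +2.0 -> keep trading
# horizon 3: net value -3.0 -> stay put       (the planner notices the loop)
```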

I think concerns come in two flavors: 

One is how you put it: if the agent is stupid (or, more charitably, computationally bounded, as we all are), they might not realize that they are going in circles and trading away value in the process for no (comparable?) benefit to themselves. Maybe agents are more prone to notice repetition and stop after a few cycles, since prediction and planning are famously hard.

The other concern is what we seem to notice in other humans, and might notice in ourselves as well (and which might therefore in practice diverge from idealized formalisms): sometimes we know or strongly suspect that something is likely not a good choice, and yet we do it anyway. How come? One simple answer can be how preference evaluations work in humans: if A' -> B' is strongly enough preferred in itself, knowing or suspecting that A' -> B' -> C' -> A' comes bundled with M_Δ−3 -> M_Δ0 might not be strong enough to override it.

It might be important that, if we can, we construct agents that do not exhibit this 'flaw'. Although one might need to be careful with such wishes, since such an agent might monomaniacally pursue a peak to which it then statically sticks if reached. Which humans might dis-prefer. Which might be incoherent. (This has interested me for a while, and I am not yet convinced that human values do not contain some (fundamental?) incoherence, e.g. in the form of such loops. For better or for worse I expanded a bit on this below, though not very formally and, I fear, less than clearly.)

So in summary, I think that if an agent

  1. has static preferences over complete world states
  2. is computationally boundless (enough), plans, and
  3. does not 'suffer' from the kind of near-term bias that humans seem to

then it cannot be made worse by money pumps around things it cares about.

I think it is very important to get as clear as we can on your questions, and I've only responded to a little bit of what you wrote. I might respond to more, hopefully in a more targeted and clearer way, if I have more time later. And I really hope that others provide answers to your questions as well.


Some bonus pondering is below, much less connected to your post; it just felt nice to think through this a little and perhaps to invite others' thoughts as well.

Let's imagine the terminus of human preference satisfaction. Let's assume that all preferences are fulfilled -- importantly, in a non-wire-headed fashion. What would that look like, at least abstractly?

a) Rejecting the premise: all (human) preferences can never be fully satisfied. If one has n Dyson spheres, one can always wish for n+1. There will always be an orderable list of world states that we can inch ever higher on. And even if it's hard to imagine what someone might desire once they could have everything we can currently imagine, by definition new desires will always spring up. In a sense, dissatisfaction might be a constant companion.

b) We find a static peak of human preferences. Hard to imagine what this might be, especially if we ruled out wireheading. Hard to imagine not dis-preferring it at least a little.

c) A (small or large) cycle is found and occupied at the top. This might also fulfill the (meta-)preference against boringness. But it's hard to escape the fact that this would have to be a cycle. And if nothing else, we are spending negentropy anyway, so maybe this is still a helical money-pump spiral to the bottom?

d) Something more chaotic is happening at the top of the preferences, with no true cycles, but maybe dynamically changing fads and fashions, never deviating much from the peak. It is hard to see how or why states would be transitioned to and from if one believes in cardinal utility. This still spends negentropy, but if, negentropy aside, we never truly return to a prior world-state, maybe it's not a cycle in the formal sense?

I welcome thoughts and votes on the above possibilities.

Posts

Dalmert's Shortform (1mo)
OpenAI Superalignment: Weak-to-strong generalization (2y)
Interview with Paul Christiano: How We Prevent the AI’s from Killing us (3y)
Personal predictions for decisions: seeking insights [Question] (3y)