## LESSWRONG

Gurkenglas

I operate by Crocker's rules.

I won't deliberately, derisively spread something just because you tried to point out an infohazard.

# Wiki Contributions

When disambiguating as far as possible, löb becomes □(□B → A) → □A, but □löb becomes □(□(□B → A) → □B). Perhaps Ψ has a universal property related to this?

Similarly:

löb = □ (□ A → A) → □ A
□löb = □ (□ (□ A → A) → □ A)
□löb → löb:
Assume löb's premise, □ (□ A → A).
By internal necessitation, □ (□ (□ A → A)).
By □löb and distribution, □ (□ A).
By löb's premise and distribution, □ A.
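
The four steps above can be checked mechanically. Below is a minimal Lean 4 sketch under stated assumptions: `Box` stands for □ and is treated as an arbitrary operator on propositions, `K` is the distribution rule, and `four` is internal necessitation. These names are hypothetical and chosen for illustration; the sketch does not model provability itself.

```lean
-- Assumption: `Box` is any operator on propositions that satisfies
-- distribution (`K`) and internal necessitation (`four`).
theorem boxLob_implies_lob
    (Box : Prop → Prop)
    (K : ∀ {P Q : Prop}, Box (P → Q) → Box P → Box Q)
    (four : ∀ {P : Prop}, Box P → Box (Box P))
    (A : Prop)
    -- □löb, taken as a hypothesis
    (boxLob : Box (Box (Box A → A) → Box A)) :
    -- löb
    Box (Box A → A) → Box A := by
  intro h                                      -- löb's premise: □(□A → A)
  have h1 : Box (Box (Box A → A)) := four h    -- internal necessitation
  have h2 : Box (Box A) := K boxLob h1         -- □löb plus distribution
  exact K h h2                                 -- löb's premise plus distribution
```

Each `have` line corresponds to one line of the prose argument, which makes explicit that distribution is used twice.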

All consequentialism is kinda non-local: The morality of an action depends on its far-future effects. If you can reason about far-future effects, you should be able to reason about the average happiness of aliens.

Ah, we've seen my problem before: Solve bargaining, then make g bargain to choose f.

It could be a conceptual ingredient to something useful. For example, it can select between two plans.

Your math can be terser: f: U → ΔA, g: ΔU, h: ΔA. The condition is: f and g combine into h. To fix your first problem, make A policies, not actions.
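
One way to read "f and g combine into h" is as a mixture: h(a) = Σ_u g(u) · f(u)(a). That reading is an assumption, not something stated above. The sketch below illustrates it with finite distributions stored as dicts; the utility labels `u1`, `u2` and the actions `left`, `right` are hypothetical toy values.

```python
from collections import defaultdict

def combine(f, g):
    """Mix the action distributions f(u), weighting each by g(u).

    f: function from a utility label to a dict {action: probability}
    g: dict {utility label: probability}
    Returns h as a dict {action: probability}.
    """
    h = defaultdict(float)
    for u, weight in g.items():
        for action, prob in f(u).items():
            h[action] += weight * prob
    return dict(h)

# Hypothetical toy instance: two utility functions, two actions.
def f(u):
    if u == "u1":
        return {"left": 1.0}
    return {"left": 0.5, "right": 0.5}

g = {"u1": 0.25, "u2": 0.75}
h = combine(f, g)
print(h)
```

Here the condition "f and g combine into h" holds by construction; in the setting above it would instead be a constraint that f has to satisfy for a given g and h.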

My biggest problem here is that f depends on how we represent g, since U identifies u with 2u. If we silence this warning by normalizing U, f instead depends on how we normalize.

Sounds very related to https://towardsdatascience.com/curiosity-in-deep-reinforcement-learning-understanding-random-network-distillation-747b322e2403 , which deals with the problem of RL agents staring at a source of noise because they can't predict it.
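
For readers who don't want to click through, here is a minimal sketch of the mechanism as I understand it: a fixed random target network, a predictor trained to imitate it, and the prediction error used as the curiosity bonus. Everything in it is a simplifying assumption (linear maps instead of neural networks, pure-Python lists, hypothetical names such as `intrinsic_reward` and `train_step`); it is not the code from the linked article.

```python
import random

random.seed(0)
D_IN, D_OUT = 4, 3

# Fixed, randomly initialised target "network" (a linear map here); never trained.
target = [[random.gauss(0, 1) for _ in range(D_IN)] for _ in range(D_OUT)]
# Predictor starts at zero and is trained to imitate the target.
predictor = [[0.0] * D_IN for _ in range(D_OUT)]

def apply(w, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def intrinsic_reward(x):
    """Squared prediction error between predictor and target on observation x."""
    return sum((p - t) ** 2 for p, t in zip(apply(predictor, x), apply(target, x)))

def train_step(x, lr=0.5):
    """One gradient step on half the squared error for observation x."""
    err = [p - t for p, t in zip(apply(predictor, x), apply(target, x))]
    for i in range(D_OUT):
        for j in range(D_IN):
            predictor[i][j] -= lr * err[i] * x[j]

obs = [1.0, 0.0, 0.0, 0.0]      # an observation the agent keeps seeing
before = intrinsic_reward(obs)
for _ in range(10):
    train_step(obs)
after = intrinsic_reward(obs)
novel = intrinsic_reward([0.0, 1.0, 0.0, 0.0])  # an observation never trained on

print(before, after, novel)
```

The bonus for the repeated observation shrinks as the predictor catches up, while the unseen observation keeps its bonus. Because the target is a deterministic function of the observation, the error can in principle be driven to zero for anything seen often enough.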

The recursive self-improvement isn't necessarily human-out-of-the-loop: If an AGI comes up with simpler math, everything gets easier.

Huh, I misremembered Cotra's update, and wrt that Metaculus question that got retitled due to the AI effect, I can see most people thinking it resolves long before history ends.

People know that doing AI safety is a risk to their academic career.

Would it help to point out that 2040 is unlikely to be a real year, so they should stop having 20-year plans?