Kerrigan

Comments

Understanding Agent Preferences
Kerrigan · 3mo · 20

If an agent has a preference for a move in a specific chess position, but then gets more compute and more optimization, gets better at chess, and makes a different move after improving, would you say its preference changed, or that it reduced epistemic uncertainty and got better at achieving its preference, which stayed the same?
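
One concrete way to picture the "preference stayed the same, only the optimization improved" reading is to hold an evaluation function fixed and vary only the search depth. A minimal sketch in Python (the game tree, scores, and move names are invented for illustration, not taken from the post):

```python
# A fixed evaluation function ("the preference") plus deeper search can change
# the chosen move without the evaluation function itself changing.

# Toy game tree: each node maps a move to a child node; leaves have no moves.
TREE = {
    "root": {"a": "A", "b": "B"},
    "A": {"a1": "A1", "a2": "A2"},
    "B": {"b1": "B1", "b2": "B2"},
    "A1": {}, "A2": {}, "B1": {}, "B2": {},
}
# Fixed heuristic evaluation of each position, from the agent's point of view.
EVAL = {"A": 5, "B": 3, "A1": -10, "A2": -8, "B1": 2, "B2": 4}

def minimax(node, depth, maximizing):
    children = TREE[node]
    if depth == 0 or not children:
        return EVAL[node]
    values = [minimax(child, depth - 1, not maximizing) for child in children.values()]
    return max(values) if maximizing else min(values)

def best_move(depth):
    # The agent moves first (maximizing); the opponent replies (minimizing).
    moves = TREE["root"]
    return max(moves, key=lambda m: minimax(moves[m], depth - 1, maximizing=False))

print(best_move(depth=1))  # 'a' -- shallow search trusts the immediate heuristic
print(best_move(depth=2))  # 'b' -- deeper search sees the opponent's refutation of 'a'
```

On this framing the "preference" (the evaluation function) never changes; only the agent's ability to see consequences does, and that alone is enough to flip the chosen move.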

SIA > SSA, part 1: Learning from the fact that you exist
Kerrigan · 4mo · 10

Under an Occam prior the laws already lean simple. SSA leaves that tilt unchanged, whereas SIA multiplies each world’s weight by the total number of observers in the reference class. That means SSA, relative to SIA, favors worlds that stay simple, while SIA boosts those that are populous once the simplicity penalty is paid. Given that, can we update our credence in SSA vs. SIA by looking at how simple our universe’s laws appear and how many observers it seems to contain?
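
One rough way to write down the contrast just described (my formalization, assuming a Solomonoff-style Occam prior with description complexity $K(w_i)$, $N_i$ reference-class observers in world $w_i$, and restricting to worlds with at least one such observer):

$$
P_{\mathrm{SSA}}(w_i \mid \text{I exist}) \;\propto\; 2^{-K(w_i)},
\qquad
P_{\mathrm{SIA}}(w_i \mid \text{I exist}) \;\propto\; 2^{-K(w_i)}\, N_i.
$$

The proposed test then amounts to asking whether our world's observed combination of apparent simplicity and apparent observer count looks more typical of the first weighting or the second.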

why assume AGIs will optimize for fixed goals?
Kerrigan · 4mo · 10

Is this trivializing the concept of a Utility Function?

No Universally Compelling Arguments
Kerrigan · 4mo · 10

This post is from a long time ago. I think it is important to reconsider everything written here in light of developments in machine learning.

We Don't Have a Utility Function
Kerrigan · 4mo · 10

How are humans exploitable, given that they don't have utility functions?

Coherent decisions imply consistent utilities
Kerrigan · 5mo · 10

Since humans are not EU maximizers and are therefore exploitable, can someone give a concrete example of how they can be exploited?
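
A standard example in this literature is the money pump against cyclic preferences: an agent that prefers A to B, B to C, and C to A, and will pay a small fee for each swap to something it prefers, can be led around the cycle and end up strictly poorer while holding what it started with. A minimal sketch in Python (the specific cycle, fee, and trade sequence are invented for illustration):

```python
# Money pump against cyclic (intransitive) preferences.
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y) means x is preferred to y
FEE = 1  # the agent will pay this much for any trade up to a preferred item

def accepts_trade(offered, held):
    return (offered, held) in PREFERS

holding, money = "C", 100
for offered in ["B", "A", "C", "B", "A", "C"]:  # walk the agent around the cycle
    if accepts_trade(offered, holding):
        holding, money = offered, money - FEE

print(holding, money)  # back to 'C', but 6 units poorer than at the start
```

Whether real humans can actually be pumped like this in practice is, I take it, exactly the question being asked here.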

What do coherence arguments actually prove about agentic behavior?
Kerrigan · 5mo · 30

Is exploitability necessarily unstable? Could there be a tolerable level of exploitability, especially if it allows for tradeoffs with desirable characteristics that are only available to non-EU maximizers?

Clarifying Power-Seeking and Instrumental Convergence
Kerrigan · 5mo · 10

Why is this not true for most humans? Many religious people would not want to modify the lightcone, as they think it is God's territory to modify.

why assume AGIs will optimize for fixed goals?
Kerrigan · 5mo · 10

The initial distribution of values need not be highly related to the resultant values after moral philosophy and philosophical self-reflection. Optimizing for hedonistic utilitarianism, for example, looks very little like any values from the outer optimization loop of natural selection.

Coherent decisions imply consistent utilities
Kerrigan · 5mo · 10

Although there would be pressure for an AI not to be exploitable, wouldn't there also be pressure for adaptability and dynamism, i.e. the ability to alter preferences and goals given new environments?
