LESSWRONG

Utility Functions

Edited by the gears to ascension, Multicore, abramdemski, steven0461, Ruby, et al. Last updated 30th Dec 2024.

A utility function is a function that assigns numerical values ("utilities") to outcomes in such a way that outcomes with higher utilities are always preferred to outcomes with lower utilities, with no exceptions. The lack of exploitable holes in the preference ordering is part of the definition, and it is what separates utility from mere reward.

See also: Complexity of Value, Decision Theory, Game Theory, Orthogonality Thesis, Utilitarianism, Preference, Utility, VNM Theorem

Utility functions do not work very well in practice as a model of individual humans. Human drives are not coherent, nor is there any reason to think they would converge to a utility-function-grade level of reliability (Thou Art Godshatter), and even people with a strong interest in the concept have trouble working out even roughly what their utility function is (Post Your Utility Function). Furthermore, humans appear to calculate reward and loss separately: adding one to the other does not predict their behavior accurately, so human reward is not human utility. This makes humans highly exploitable, whereas not being exploitable is a minimum requirement for having a coherent utility function.
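The link between exploitability and the lack of a utility function can be illustrated with a small sketch. It assumes a finite set of outcomes and strict preferences given as (better, worse) pairs; the names `utility_from_preferences`, `money_pump`, `CYCLIC`, and `ACYCLIC` are hypothetical and chosen only for this example. An agent with cyclic preferences admits no utility function and can be charged a fee for every "upgrade" indefinitely, while an agent with acyclic preferences stops trading once it holds its top outcome.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# Illustrative strict preferences as (better, worse) pairs.
CYCLIC = [("A", "B"), ("B", "C"), ("C", "A")]   # A > B > C > A
ACYCLIC = [("A", "B"), ("B", "C")]              # A > B > C


def utility_from_preferences(prefers):
    """Return {outcome: utility} consistent with the pairs, or None if cyclic."""
    graph = {}
    for better, worse in prefers:
        # TopologicalSorter maps each node to its predecessors, so listing
        # `worse` as a predecessor of `better` places it earlier in the order.
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None  # no numerical ranking can respect a preference cycle
    return {outcome: rank for rank, outcome in enumerate(order)}


def money_pump(prefers, start, fee_cents, rounds):
    """Offer the agent swaps to something it prefers, charging a fee each time.

    Returns (final_holding, total_paid_cents).
    """
    holding, paid = start, 0
    for _ in range(rounds):
        upgrade = next((b for b, w in prefers if w == holding), None)
        if upgrade is None:
            break  # nothing is preferred to the current holding
        holding = upgrade
        paid += fee_cents
    return holding, paid
```

With these definitions, `utility_from_preferences(CYCLIC)` returns `None` and `money_pump(CYCLIC, "A", 1, 9)` returns `("A", 9)`: the agent ends up holding what it started with, nine cents poorer. For `ACYCLIC`, the pump starting from `"C"` stops after two trades at `"A"`.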

pjeby posits humans' difficulty in understanding their own utility functions as the root of akrasia.

However, utility functions can be a useful model for dealing with humans in groups, e.g. in economics.

The VNM Theorem tag is likely to be a strict subtag of the Utility Functions tag, because the VNM theorem establishes when preferences can be represented by a utility function, but a post discussing utility functions may or may not discuss the VNM theorem/axioms.

Because utility functions arise from VNM rationality, they may still be of note in understanding intelligent systems even when the system does not explicitly store a utility function anywhere, since reducing the rate of exploitable errors should eventually converge to utility-function-like guarantees.
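As a point of reference, the representation guaranteed by the VNM theorem can be written out as follows. The notation is chosen for this sketch and does not come from the text above: $\preceq$ is the preference relation over lotteries, lottery $L$ assigns probability $p_i$ to outcome $x_i$, and lottery $M$ assigns probability $q_i$ to the same outcome.

```latex
L \preceq M \iff \sum_i p_i \, u(x_i) \le \sum_i q_i \, u(x_i)
```

Such a function $u$ exists when the preferences satisfy completeness, transitivity, continuity, and independence, and it is unique only up to positive affine transformations $u' = a u + b$ with $a > 0$.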

Posts tagged Utility Functions
Coherent decisions imply consistent utilities (Eliezer Yudkowsky, 6y): 156 karma, 83 comments
An Orthodox Case Against Utility Functions (abramdemski, 5y, Ω): 154 karma, 66 comments
Coherence arguments do not entail goal-directed behavior (Rohin Shah, 7y, Ω): 134 karma, 69 comments
Approximately Bayesian Reasoning: Knightian Uncertainty, Goodhart, and the Look-Elsewhere Effect (RogerDearnaley, 2y): 16 karma, 2 comments
Utility ≠ Reward (Vlad Mikulik, 6y, Ω): 131 karma, 24 comments
Why Not Subagents? (johnswentworth, David Lorell, 2y, Ω): 130 karma, 52 comments
Bayesian Utility: Representing Preference by Probability Measures (Vladimir_Nesov, 16y, Ω): 50 karma, 37 comments
How easily can we separate a friendly AI in design space from one which would bring about a hyperexistential catastrophe? (Anirandis, 5y): 20 karma, 19 comments
Pinpointing Utility ([anonymous], 13y): 94 karma, 156 comments
The Human's Hidden Utility Function (Maybe) (lukeprog, 14y): 68 karma, 91 comments
Time and Effort Discounting (Scott Alexander, 14y): 66 karma, 32 comments
Why Subagents? (johnswentworth, 6y, Ω): 175 karma, 48 comments
Choosing the Zero Point (orthonormal, 5y): 170 karma, 25 comments
Shard Theory: An Overview (David Udell, 3y, Ω): 167 karma, 34 comments
Ngo and Yudkowsky on AI capability gains (Eliezer Yudkowsky, Richard_Ngo, 4y, Ω): 131 karma, 61 comments