Take some agents $A_i$ with utility functions $U_i(x_1, \dots, x_N)$. In general, if they are individually maximizing their utility functions, then their chosen actions $(x_1^*, \dots, x_N^*)$ might be some Nash equilibrium of the game, but it may not be possible to interpret this as the action of a "super-agent". There are two ways to...
Recent discourse surrounding the Jevons paradox provides an opportunity to flesh out the precise intuitions behind marginalist economics and how concepts like “elasticity” arise naturally from agent/graph-based first principles. Say you have resource x (e.g. compute) with cost C(x) that produces good y (e.g. AI), with utility function U(x) via production...
I’m listing some “ways to think about alignment”. I’m not sure how much of the “whole problem” each of these individually captures, but they are intended to be very well-motivated, fundamental problems which can: 1. demonstrate to an honest skeptic that there is “something here”, it’s not just hand-wavy sci-fi...
Markets for information are inefficient, in large part due to the Buyer’s Inspection Paradox: you can’t “inspect” information like you would any other good before buying — the moment you inspect the information, you have obtained it and cannot return it. More generally, the problem is an inability to reliably...
Work supported by MATS and SPAR. Code at https://github.com/ArjunPanickssery/math_problems_debate/. Three measures for evaluating debate are 1. whether the debate judge outperforms a naive-judge baseline where the naive judge answers questions without hearing any debate arguments. 2. whether the debate judge outperforms a consultancy baseline where the judge hears argument(s) from...
A phenomenon I have often encountered when thinking things through is that everything seems to collapse to tautology. This is hard to define precisely, but I’ll give you some examples: * Bounded rationality: Bounded rationality can be thought of as “rationality conditional on some given algorithmic information” (in contrast to...
A logarithmic scoring rule to elicit a probability distribution $r$ on a random variable $X \in \{1, \dots, n\}$ is $s(r) = b \log(r_X)$. Something that always seemed clear to me but I haven’t seen explicitly written anywhere is that the parameter $b$ is just the price of information on $X$. Firstly: for an agent with true...
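
As a rough numerical illustration of the scoring rule above (a sketch only, not the post's own argument, which is cut off here): the code below checks that an agent with beliefs p maximizes expected score by reporting p itself, and that the expected gain over reporting a uniform distribution is b times the information gained, measured in nats. The function names, the example distribution, and the value of b are all made up for illustration, and outcomes are indexed from 0 rather than 1.

```python
import math

def log_score(report, outcome, b=1.0):
    """Logarithmic scoring rule s(r) = b * log(r_X) for a realized outcome X."""
    return b * math.log(report[outcome])

def expected_score(beliefs, report, b=1.0):
    """Expected score of `report` for an agent whose true beliefs are `beliefs`."""
    return sum(p * log_score(report, i, b) for i, p in enumerate(beliefs) if p > 0)

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist if p > 0)

# Hypothetical example values, not taken from the post.
p = [0.5, 0.3, 0.2]            # the agent's true beliefs
b = 2.0                        # scale parameter of the scoring rule
uniform = [1 / 3, 1 / 3, 1 / 3]

honest = expected_score(p, p, b)                 # report the true beliefs
shaded = expected_score(p, [0.6, 0.3, 0.1], b)   # report something else
ignorant = expected_score(p, uniform, b)         # report the uniform prior

# Expected payment for moving from the uniform report to the honest one:
# b * KL(p || uniform) = b * (log n - H(p)).
gain = honest - ignorant
info_nats = math.log(len(p)) - entropy(p)

print(f"honest={honest:.4f} shaded={shaded:.4f} ignorant={ignorant:.4f}")
print(f"gain={gain:.4f}  b * info={b * info_nats:.4f}")
```

On this reading, b acts as a per-nat exchange rate: doubling b doubles what the agent can expect to earn for the same reduction in uncertainty about X.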