PaulK — LessWrong

When and why should you use the Kelly criterion?

I wonder if you can recover Kelly from linear utility in money, plus a number of rounds unknown to you and chosen probabilistically from a distribution.
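One way to start poking at this numerically is sketched below, under strong simplifying assumptions that are not in the comment itself: repeated i.i.d. bets won with probability `p` at net odds `b`, a fixed fraction `f` staked each round, and a number of rounds drawn from a geometric distribution independently of the outcomes. All function names and parameter values are hypothetical. In this particular setup expected terminal wealth is increasing in `f` whenever the bet has positive edge, so a linear-utility agent would still stake everything. Recovering Kelly would presumably need extra structure beyond an independent random horizon.

```python
import math


def kelly_fraction(p: float, b: float) -> float:
    """Kelly fraction for a bet won with probability p at net odds b."""
    return p - (1.0 - p) / b


def expected_log_growth(f: float, p: float, b: float) -> float:
    """Expected log growth per round when staking fraction f (requires 0 <= f < 1)."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def expected_wealth_random_rounds(f: float, p: float, b: float, stop_prob: float) -> float:
    """E[W_N] with W_0 = 1, where N is geometric on {1, 2, ...} with
    parameter stop_prob and independent of the bet outcomes (an assumption).

    Uses the closed form of sum_n stop_prob * (1 - stop_prob)**(n - 1) * m**n,
    where m is the expected per-round wealth multiplier.
    """
    mean_factor = 1.0 + f * (p * b - (1.0 - p))
    ratio = (1.0 - stop_prob) * mean_factor
    if ratio >= 1.0:
        raise ValueError("expected wealth diverges for these parameters")
    return stop_prob * mean_factor / (1.0 - ratio)


if __name__ == "__main__":
    p, b, stop_prob = 0.6, 1.0, 0.5  # illustrative values only
    print("Kelly fraction:", kelly_fraction(p, b))
    for f in (0.1, 0.2, 0.5, 0.9):
        print(f, expected_log_growth(f, p, b),
              expected_wealth_random_rounds(f, p, b, stop_prob))
```

Swapping in a different horizon distribution, or letting the horizon depend on wealth (for example, stopping at ruin), would be the natural next experiments.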

How could AIs 'see' each other's source code?

PaulK3y20

In the soaking-up-extra-compute case? Yeah, for sure, I can only really picture it (a) on a very short-term basis, for example maybe while linking up tightly for important negotiations (but even here, not very likely). Or (b) in a situation with high power asymmetry. For example maybe there's a story where 'lords' delegate work to their 'vassals', but the workload intensity is variable, so the vassals have leftover compute, and the lords demand that they spend it on something like blockchain mining. To compensate for the vulnerability this induces, the lords would also provide protection.

How could AIs 'see' each other's source code?

PaulK3y10

Yup, all that would certainly make it more complicated. In a regime where this kind of tightly-controlled delegation were really important, we might also demand our counterparties standardize their hardware so they can't play tricks like this.
I was picturing a more power-asymmetric situation, more like a feudal lord giving his vassals lots of busywork so they don't have time to plot anything.

How could AIs 'see' each other's source code?

Answer by PaulKJun 03, 2023*2-1

We might develop schemes for auditable computation, where one party can come in at any time and check the other party's logs. They should conform to the source code that the second party is supposed to be running; and also to any observable behavior that the second party has displayed. It's probably possible to have logging and behavioral signalling be sufficiently rich that the first party can be convinced that that code is indeed being run (without it being too hard to check -- maybe with some kind of probabilistically checkable proof).

However, this only provides a positive proof that certain code is being run, not a negative proof that no other code is being run at the same time. This part, I think, inherently requires knowing something about the other party's computational resources. But if you can know about those, then ~~it's possible~~ it might be possible. For a perhaps dystopian example, if you know your counterparty has compute A, and the program you want them to run takes compute B, then you could demand they do something (difficult but easily checkable) like invert hash functions, that'll soak up around A-B of their compute, so they have nothing left over to do anything secret with.
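As a rough illustration of what a "difficult but easily checkable" task could look like, here is a minimal sketch of a hashcash-style puzzle, that is, partial hash inversion rather than full inversion. The function names, the challenge string, and the difficulty setting are all assumptions made for the example. It says nothing about how to calibrate the work to A-B, or how to rule out hardware tricks.

```python
import hashlib
from itertools import count


def _digest_value(challenge: bytes, nonce: int) -> int:
    """SHA-256 of challenge plus an 8-byte nonce, read as a big-endian integer."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def solve_puzzle(challenge: bytes, difficulty_bits: int) -> int:
    """Brute-force the smallest nonce whose digest has at least
    difficulty_bits leading zero bits. Expected cost is about
    2**difficulty_bits hash evaluations."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify_puzzle(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash evaluation."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


if __name__ == "__main__":
    challenge = b"audit-round-1"  # hypothetical, chosen freshly by the auditing party
    nonce = solve_puzzle(challenge, 16)
    print(nonce, verify_puzzle(challenge, nonce, 16))
```

The asymmetry is the point: solving takes many hash evaluations on average, while checking takes one. The auditing party would issue a stream of fresh challenges and watch the rate of valid solutions.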

Is "Strong Coherence" Anti-Natural?

PaulK3y32

Sorry, I guess I didn't make the connection to your post clear. I substantially agree with you that utility functions over agent-states aren't rich enough to model real behavior. (Except, maybe, at a very abstract level, a la predictive processing? (which I don't understand well enough to make the connection precise)).

Utility functions over world-states -- which is what I thought you meant by 'states' at first -- are in some sense richer, but I still think inadequate.

And I agree that utility functions over agent histories are too flexible.

I was sort of jumping off to a different way to look at value, which might have both some of the desirable coherence of the utility-function-over-states framing, but without its rigidity.

And this way is something like, viewing 'what you value' or 'what is good' as something abstract, something to be inferred, out of the many partial glimpses of it we have in the form of our extant values.

Is "Strong Coherence" Anti-Natural?

PaulK3y20

Oh, huh, this post was on the LW front page, and dated as posted today, so I assumed it was fresh, but the replies' dates are actually from a month ago.

Is "Strong Coherence" Anti-Natural?

Answer by PaulKApr 11, 202310

(A somewhat theologically inspired answer:)

Outside the dichotomy of values (in the shard-theory sense) vs. immutable goals, we could also talk about valuing something that is in some sense fixed, but "too big" to fit inside your mind. Maybe a very abstract thing. So your understanding of it is always partial, though you can keep learning more and more about it (and you might shift around, feeling out different parts of the elephant). And your acted-on values would appear mutable, but there would actually be a, perhaps non-obvious, coherence to them.

It's possible this is already sort of a consequence of shard theory? In the way learned values would have coherences to accord with (perhaps very abstract or complex) invariant structure in the environment?

A stylized dialogue on John Wentworth's claims about markets and optimization

PaulK3y21

I still don't know exactly what parts of my comment you're responding to. Maybe talking about a concrete sub-agent coordination problem would help ground this more.

But as a general response: in your example it sounds like you already have the problem very well narrowed down, to 3 possibilities with precise probabilities. What if there were 10^100 possibilities instead? Or uncertainty where the full real thing is not contained in the hypothesis space?

A stylized dialogue on John Wentworth's claims about markets and optimization

PaulK3y10

This is for logical coordination? How does it help you with that?

A stylized dialogue on John Wentworth's claims about markets and optimization

PaulK3y202

IMO, coordination difficulties among sub-agents can't be waved away so easily. The solutions named, side-channel trades and counterfactual coordination, are both limited.

I would frame the nature of their limits, loosely, like this. In real minds (or at least the human ones we are familiar with), the stuff we care about lives in a high-dimensional space. A mind could be said to be, roughly, a network spanning such a space. A trade between elements (~sub-agents) that are nearby in this space will not be too hard to do directly. But for long-distance trades, side-channel reward will need to flow through a series of intermediaries -- this might involve several changes of local currencies (including traded favors or promises). Each local exchange needs to be worthwhile to its participants, and not overload the relationships that it's piggybacking on.

These long-distance trades can be really difficult to set up sometimes, in the same way that it would be hard for a random villager in France in the Middle Ages to send $10 to another random villager in China.

The difficulty depends on things like the size / dimensionality of the space; how well-connected it is; and how much slack is available in the relevant places in the system (for the intermediate elements to wiggle around enough to make all the local trades possible). Note that the need for slack makes this a holistic constraint: if you just have one really important trade to make, then sure, you can probably make it happen, by using up lots of slack (locking a lot of intermediate elements into orientations optimized for that big trade). But you can't do that for every possible trade. So these issues really show up when you have a lot of heterogeneous trades to make.

Counterfactual ("logical") coordination has similar issues. If A and B want to counterfactually coordinate, but they're far apart in this mind-space, then they can only communicate or understand one another in a limited way, via intermediaries (or via the small number of dimensions they do share). This just makes things harder -- hard to get shared meaning, hard to agree on what's fair, hard to find a solution together that will generalize well instead of being brittle.

BTW, I'm not denying that intelligence (whatever that might mean) helps with all this, but I am denying that it's a panacea.
