Wiki Contributions


I think what is going on here is that both $\mathbb{G}$ and $\nabla^\star$ are of the form $\exp \circ A \circ \log$ with $A = \mathbb{E}$ and $A = \nabla$, respectively. Let's define the star operator as $A^\star = \exp \circ A \circ \log$. Then $A^\star \circ B^\star = \exp \circ A \circ \log \circ \exp \circ B \circ \log = \exp \circ (A \circ B) \circ \log = (A \circ B)^\star$, by associativity of function composition. Further, if $A$ and $B$ commute, then so do $A^\star$ and $B^\star$.

So the commutativity of the geometric expectation and derivative falls directly out of their representation as $\mathbb{E}^\star$ and $\nabla^\star$, respectively, by the commutativity of $\mathbb{E}$ and $\nabla$, as long as they are over different variables.
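The star operator can be sketched concretely: conjugating an operator on lists of positive numbers by log/exp. Applying it to the arithmetic mean yields the geometric mean. A minimal illustration (function names mine):

```python
import math

# star(op) = exp ∘ op ∘ log, applied pointwise to a list of positive values.
def star(op):
    def starred(xs):
        return math.exp(op([math.log(x) for x in xs]))
    return starred

def mean(xs):
    return sum(xs) / len(xs)

geometric_mean = star(mean)  # exp of the mean of the logs
# geometric mean of [1, 4, 16] is the cube root of 64, i.e. 4
result = geometric_mean([1.0, 4.0, 16.0])
```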

We can also derive what happens when the expectation and gradient are over the same variables: $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f(x)]$. First, notice that $\log \mathbb{G}_{x \sim p_\theta}[f] = \mathbb{E}_{x \sim p_\theta}[\log f]$, so $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f] = \exp\left(\nabla_\theta\, \mathbb{E}_{x \sim p_\theta}[\log f]\right)$. Also $\nabla^\star_\theta f = \exp(\nabla_\theta \log f)$.

Now let's expand the composition of the gradient and expectation: $\nabla_\theta\, \mathbb{E}_{x \sim p_\theta}[\log f] = \mathbb{E}_{x \sim p_\theta}[\nabla_\theta \log f] + \mathbb{E}_{x \sim p_\theta}[\log f \cdot \nabla_\theta \log p_\theta]$, using the log-derivative trick. So $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f] = \exp\left(\mathbb{E}_{x \sim p_\theta}[\nabla_\theta \log f]\right) \cdot \exp\left(\mathbb{E}_{x \sim p_\theta}[\log f \cdot \nabla_\theta \log p_\theta]\right)$.

Therefore, $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f] = \mathbb{G}_{x \sim p_\theta}[\nabla^\star_\theta f] \cdot \mathbb{G}_{x \sim p_\theta}\left[f^{\nabla_\theta \log p_\theta}\right]$.

Writing it out, we have $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f(x)] = \mathbb{G}_{x \sim p_\theta}[\nabla^\star_\theta f(x)] \cdot \mathbb{G}_{x \sim p_\theta}\left[f(x)^{\nabla_\theta \log p_\theta(x)}\right]$.
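One candidate form of this same-variable identity, $\nabla^\star_\theta\, \mathbb{G}_{x \sim p_\theta}[f] = \mathbb{G}[f^{\nabla_\theta \log p_\theta}]$ when $f$ does not depend on $\theta$ (so the $\mathbb{G}[\nabla^\star_\theta f]$ factor is 1), can be spot-checked by finite differences. A toy sketch over a two-outcome distribution; all names are illustrative:

```python
import math

# Finite-difference check of grad*_theta G[f] = G[f ** score], where
# score(x) = d/dtheta log p_theta(x), for f independent of theta.
# Two-outcome distribution with p_theta(0) = sigmoid(theta).

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

f = [2.0, 5.0]  # positive values f(x) for x in {0, 1}

def p(theta):
    s = sigmoid(theta)
    return [s, 1.0 - s]

def G(theta):
    # geometric expectation: exp(E[log f])
    return math.exp(sum(pi * math.log(fx) for pi, fx in zip(p(theta), f)))

theta, h = 0.3, 1e-6

# left side: geometric derivative exp(d/dtheta log G), via central difference
lhs = math.exp((math.log(G(theta + h)) - math.log(G(theta - h))) / (2 * h))

# right side: G[f ** score] = exp(E[score * log f])
score = [
    (math.log(p(theta + h)[x]) - math.log(p(theta - h)[x])) / (2 * h)
    for x in range(2)
]
rhs = math.exp(sum(pi * s * math.log(fx) for pi, s, fx in zip(p(theta), score, f)))
```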

And if I pushed around symbols correctly, the geometric derivative can be pulled inside of a geometric expectation ($\nabla^\star\, \mathbb{G}[f] = \mathbb{G}[\nabla^\star f]$) similarly to how an additive derivative can be pulled inside an additive expectation ($\nabla\, \mathbb{E}[f] = \mathbb{E}[\nabla f]$). Also, just as additive expectation distributes over addition ($\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y]$), geometric expectation distributes over multiplication ($\mathbb{G}[XY] = \mathbb{G}[X]\,\mathbb{G}[Y]$).
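The multiplicative distributivity is immediate from the definition $\mathbb{G}[X] = \exp(\mathbb{E}[\log X])$, since log turns products into sums. A minimal numeric check on a finite distribution (values illustrative):

```python
import math

# G[XY] = exp(E[log X + log Y]) = G[X] * G[Y] on a finite sample space.
probs = [0.2, 0.5, 0.3]
X = [1.0, 2.0, 4.0]
Y = [3.0, 1.0, 2.0]

def G(values):
    # geometric expectation: exp of the probability-weighted mean of logs
    return math.exp(sum(p * math.log(v) for p, v in zip(probs, values)))

product = [x * y for x, y in zip(X, Y)]
```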

If I try to use this framework to express two agents communicating, I get an image with a V1, A1, P1, V2, A2, and P2, with cross arrows from A1 to P2 and A2 to P1. This admits many ways to get a roundtrip message. We could have A1 -> P2 -> A2 -> P1 directly, or A1 -> P2 -> V2 -> A2 -> P1, or many cycles among P2, V2, and A2 before P1 receives a message. But in none of these could I hope to get a response in one time step the way I would if both agents simultaneously took an action, and then simultaneously read from their inputs and their current state to get their next state. So I have this feeling that pi : S -> Action and update : Observation x S -> S already bake in this active/passive distinction by virtue of the type signature, and this framing is maybe just taking away the computational teeth/specificity. And I can write the same infiltration and exfiltration formulas by substituting S_t for V_t, Obs_t for P_t, Action_t for A_t, and S_env_t for E_t.
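The type signatures in question can be sketched with a simultaneous step, so that each agent receives the other's action within one time step. A hypothetical minimal sketch with integer states and actions (all concrete choices mine):

```python
from dataclasses import dataclass

# pi : S -> Action and update : Observation x S -> S, with both agents
# acting at once and then both reading the other's action.

@dataclass
class Agent:
    state: int

    def pi(self) -> int:                 # pi : S -> Action
        return self.state + 1

    def update(self, obs: int) -> None:  # update : Observation x S -> S
        self.state = self.state + obs

def simultaneous_step(a1: Agent, a2: Agent) -> None:
    act1, act2 = a1.pi(), a2.pi()  # both act from their current state...
    a1.update(act2)                # ...then both update on the other's action
    a2.update(act1)

a1, a2 = Agent(state=0), Agent(state=10)
simultaneous_step(a1, a2)
# each agent has now incorporated the other's message in one time step
```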

Actually maybe this family is more relevant: the power mean $M_p(x_1, \dots, x_n) = \left(\frac{1}{n}\sum_{i=1}^n x_i^p\right)^{1/p}$, where the geometric mean is the limit as $p$ approaches zero.
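A power-mean sketch (implementation mine): $p = 1$ gives the arithmetic mean, $p = -1$ the harmonic mean, and small $p$ approaches the geometric mean:

```python
import math

# M_p(x) = (mean(x_i ** p)) ** (1/p); the p -> 0 limit is the geometric
# mean, handled here as a special case.
def power_mean(xs, p):
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / len(xs))
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

xs = [1.0, 4.0, 16.0]
```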

The "harmonic integral" would be the inverse of the integral of the inverse of a function: $\left(\int_a^b f(x)^{-1}\, dx\right)^{-1}$.
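A numeric sketch of that idea as a harmonic average over an interval; normalizing by the interval length is my assumption, chosen so that constant functions are their own harmonic average:

```python
import math

# Harmonic average of f over [a, b]: reciprocal of the mean of the
# reciprocal, approximated with a midpoint-rule sum.
def harmonic_average(f, a, b, n=100_000):
    h = (b - a) / n
    mean_recip = sum(1.0 / f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)
    return 1.0 / mean_recip
```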

Also here is a nice family that parametrizes these different kinds of average: the quasi-arithmetic mean $M_f(x_1, \dots, x_n) = f^{-1}\!\left(\frac{1}{n}\sum_{i=1}^n f(x_i)\right)$, which gives the arithmetic mean for $f(x) = x$, the geometric mean for $f(x) = \log x$, and the harmonic mean for $f(x) = 1/x$.

If arithmetic and geometric means are so good, why not the harmonic mean? What would a "harmonic rationality" look like?
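One way to make the harmonic case concrete: the harmonic mean is the $f(x) = 1/x$ member of the quasi-arithmetic ($f$-mean) family, alongside arithmetic ($f(x) = x$) and geometric ($f(x) = \log x$). A minimal sketch (helper names mine):

```python
import math

# M_f(x) = f_inv(mean(f(x_i))): the choice of f selects the average.
def quasi_arithmetic_mean(xs, f, f_inv):
    return f_inv(sum(f(x) for x in xs) / len(xs))

xs = [1.0, 4.0, 16.0]
arithmetic = quasi_arithmetic_mean(xs, lambda x: x, lambda y: y)
geometric = quasi_arithmetic_mean(xs, math.log, math.exp)
harmonic = quasi_arithmetic_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)
```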

I wonder if this entails that RLHF, while currently useful for capabilities, will eventually become an alignment tax. Namely, OpenAI might have text evaluators discourage the LM from writing self-calling, agenty-looking code.

So in thinking about alignment futures that are the limit of RLHF, these feel like two fairly different forks of that future.

@Quinn @Zac Hatfield-Dodds Yep, I agree. I could allow voters to offer replacements for debate steps and aggregation steps. Then we get the choice to either 
  1) delete the old versions and keep a single active copy of the aggregation tree, or to 
  2) keep the whole multiverse of aggregation trees around. 

If we keep a single copy, and we have a sufficient number of users, the root of the merge tree will change too rapidly, unless you batch changes. However, recomputing the aggregation trees from a batch of changes will end up ignoring changes to parents of nodes in the batch, since all parents end up getting recomputed anyway. Suppose we keep all constitutions (either user submitted, intermediate aggregations, or final aggregations) as a flat list of candidates to be voted amongst. Then there will be too many constitution candidates for people to interact with. So instead a user can vote with a distribution by presenting a constitution, and the distribution is generated by the softmax of negated distances to all of the constitutions in the multiverse. A user could tune their distribution by weighing multiple query constitutions, and changing softmax temperatures to tune variances. And the general population doesn't really need to know what a distribution is -- they can just input a natural language paragraph, or pick an existing one as the query.
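The softmax-over-negated-distances vote can be sketched as follows; the scalar "embedding" of a constitution and the absolute-difference distance are placeholders for whatever representation is actually used:

```python
import math

# A user's vote is a distribution over candidate constitutions:
# softmax of negated distances from the user's query constitution.
def softmax_vote(query, candidates, temperature=1.0):
    logits = [-abs(query - c) / temperature for c in candidates]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

candidates = [0.0, 1.0, 5.0]
vote = softmax_vote(query=0.9, candidates=candidates, temperature=0.5)
# lower temperature concentrates the vote on the nearest candidate
```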

I agree with Andrew Critch's acausal normalcy post until he gets to boundaries as the important thing -- antisociality fits these criteria too well. I'm not quite trying to say that people are just active inference agents. It does seem like there is some targeting stage that is not necessarily RL, such as with a decision transformer, and in this vein I am not quite on board with prediction as human values.
