All of Chris van Merwijk's Comments + Replies

Countably Factored Spaces

I suggest renaming this to "countably factored spaces", since countability is a property of the factorization rather than of the space.

I also suggest adding an actual self-contained definition of a countably factored space to make it more readable.

Moloch games

Maybe. I actually don't think the term "Moloch" is very important. What I think is important is getting a good conceptual understanding of the behavioural notion of "what society wants": behavioural in the sense that it is independent of idealized notions of what would be good, or of what individuals imagine society wants, but depends instead on how the collection of agents behaves or is incentivized to behave. I view the fact that this ends up deviating from what would be good for the sum of utilities as essentially the motivation for this topic, but not the core concep... (read more)

Moloch games

Yeah, I suppose you're taking an essential property of a Moloch to be that it wants something other than the sum of utilities. That's a reasonable terminological condition, but I'm addressing the question of "what does it even mean for 'society' to want anything at all?" Whatever that is, it might be that (e.g. by some coincidence, or by good coordination mechanisms, or because everyone wants the same thing) what society wants is the same as what would be good for the sum of individual utilities. It seems to me that the question of "what does society want?" is more fundamental than "how does that which society wants deviate from what would be good for its individuals?"

VojtaKovarik (2 points, 4mo): I definitely agree that (1) "what society wants" is a useful notion and that it is different from (2) "situations in which what society wants deviates from what would be good for its individuals". I would just argue that given both the historical and SSC-inspired connotations of "Moloch", this term should be associated with (2) rather than with (1) :-).

Finite Factored Sets: Conditional Orthogonality

I think a subpartition of S can be thought of as a partial function on S, or equivalently, a variable on S that has the possible value "Null"/"undefined".

Ramana Kumar (3 points, 4mo): That's right. A partial function can be thought of as a subset (of its domain) and a total function on that subset. And a (total) function can be thought of as a partition (of its domain): the parts are the inverse images of each point in the function's image.
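
The correspondence in this exchange can be sketched in Python. This is only an illustration under stated assumptions: the helper names `partition_of` and `subpartition_of`, the example set `S`, and the use of `None` for "undefined" are all hypothetical choices, not anything from the post.

```python
from collections import defaultdict


def partition_of(f, domain):
    """Partition of `domain` induced by a total function `f`.

    Each part is the inverse image of one point in the image of `f`.
    """
    parts = defaultdict(set)
    for s in domain:
        parts[f(s)].add(s)
    return {frozenset(part) for part in parts.values()}


def subpartition_of(f, domain):
    """Subpartition induced by a partial function `f`.

    Assumption for this sketch: `f` returns None where it is undefined.
    The result partitions only the subset on which `f` is defined.
    """
    defined = {s for s in domain if f(s) is not None}
    return partition_of(f, defined)


S = {0, 1, 2, 3, 4, 5}

# A total function on S: parity.
full = partition_of(lambda s: s % 2, S)

# A partial function on S: parity, but undefined ("Null") for s >= 4.
sub = subpartition_of(lambda s: s % 2 if s < 4 else None, S)
```

Here `full` covers all of `S`, while `sub` covers only the elements where the partial function is defined, which is what makes it a subpartition rather than a partition.
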
Finite Factored Sets: Orthogonality and Time

I just want to point out some interesting properties of this definition of time: Let time_C refer to the classical notion of time in a dynamical system, and time_FFS the notion defined in this article.

1. Suppose we have a field on space-time generated by a typical differential dynamical law that satisfies time_C-reversal symmetry, and suppose we factorize its histories according to the states of the system at time_C t=0. Then time_FFS doesn't make a distinction between the "positive" and "negative" part of the time_C. That is, if x is some position (choose... (read more)

Finite Factored Sets: Orthogonality and Time

In the proof of proposition 18, "part 3" should be "part 4".

Scott Garrabrant (2 points, 5mo): fixed, thanks.

Finite Factored Sets: Orthogonality and Time

Can't you define  for any set  of partitions of , rather than  w.r.t. a specific factorization , simply as  iff ? If so, it would seem to me to be clearer to define  that way (i.e. make 7 rather than 2 from proposition 10 the definition), and then basically proposition 10 says "if  is a subset of factors of a partition then here are a set of equivalent definitions in terms of chimera". Also I would guess that proposition 11 is still true for  rat... (read more)

Scott Garrabrant (2 points, 5mo): I could do that. I think it wouldn't be useful, and wouldn't generalize to subpartitions.

How to Throw Away Information

I might be misunderstanding something or have made a mistake, and I'm not going to try to figure it out since the post is old and may no longer be active, but isn't the following a counterexample to the claim that the method of constructing S described above does what it's supposed to do?

Let X and Y be independent coin flips. Then S will be computed as follows:

X=0, Y=0 maps to uniform distribution on {{0:0, 1:0}, {0:0, 1:1}}
X=0, Y=1 maps to uniform distribution on {{0:0, 1:0}, {0:1, 1:0}}
X=1, Y=0 maps to uniform distribution on {{0:1, 1:0}, {0:1, 1:1}}
X=1, Y=1 maps to u... (read more)
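
The fully listed cases above are consistent with the following rule: S is a lookup table from each possible Y value to an X value, where the entry for the actual Y equals the actual X and the other entry is drawn uniformly. The sketch below assumes that rule (it is inferred from the listed cases, not quoted from the post) and uses a hypothetical helper, `s_distribution`, to enumerate the distribution of S for a given X and Y.

```python
from fractions import Fraction
from itertools import product

VALUES = (0, 1)  # possible outcomes of each coin flip


def s_distribution(x, y):
    """Distribution over lookup tables S, given the actual X=x and Y=y.

    Assumed rule: S[y] = x, and S[y'] is uniform over VALUES for y' != y.
    Each table is returned as a sorted tuple of (y', x') pairs so that it
    can be used as a dictionary key.
    """
    others = [v for v in VALUES if v != y]
    weight = Fraction(1, len(VALUES) ** len(others))
    dist = {}
    for fill in product(VALUES, repeat=len(others)):
        table = {y: x, **dict(zip(others, fill))}
        key = tuple(sorted(table.items()))
        dist[key] = dist.get(key, 0) + weight
    return dist


case_00 = s_distribution(0, 0)
case_01 = s_distribution(0, 1)
case_10 = s_distribution(1, 0)
```

Writing a table such as {0:0, 1:1} as `((0, 0), (1, 1))`, the three results can be compared directly with the three cases listed in the comment.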

Violating the EMH - Prediction Markets

Is there currently a way to pool money on the trades you're suggesting? In general, it seems like there are some economies of scale to be gained by creating some kind of rationalist fund.

antanaclasis (9 points, 9mo): On this note, I would definitely be willing to pay a premium to be part of a fund run by a rationalist who’s more intimately involved with the crypto and prediction markets than I am, and would thereby be able to get significantly more edge than I currently can.

Predictive Coding has been Unified with Backpropagation

I suspect a better title would be "Here is a proposed unification of a particular formalization of predictive coding, with backprop"

Moloch games

Yes that's what I meant, thanks.

Subspace optima

I made up the term on the spot, so I don't think so.

Tabooing 'Agent' for Prosaic Alignment

I endorse this. I like the framing, and it's very much in line with how I think about the problem. One point I'd make is: I'd replace the word "model" with "algorithm", to be even more agnostic. "Model" seems for many people already to carry an implicit intuitive interpretation of what the learned algorithm is doing, namely "trying to faithfully represent the problem", or something similar.

Two agents can have the same source code and optimise different utility functions

Here are some counterarguments:

There can be scenarios where the agent cannot change his source code without processing observations. E.g. the agent may need to reprogram himself via some external device.

The agent may not be aware that there are multiple copies of him.

It seems that for many plausible agent designs, changing his utility function would require a significant change in the architecture. E.g. if two human sociopaths wanted to change their utility function into a weighted average of the two, they couldn't do so without signif... (read more)

cousin_it (2 points, 4y): Yeah, I was talking mostly about idealized UDT agents, not humans.

Two agents can have the same source code and optimise different utility functions

You don't necessarily need "explicit self-reference". The difference in utility functions can also be obtained due to a difference in the location of the agent in the universe. Two identical worms placed in different locations will have different utility functions due to their atoms being not exactly in the same location, despite not having explicit self-reference. Similarly, in a computer simulation, the agents with the same source code will be called by the universe-program in different contexts (if they weren't, I don't see how ... (read more)

philh (2 points, 4y): Using the definitions from the post, those agents would be optimising the same utility functions, just by taking different actions.