Consider using reversible automata for alignment research

[-]Adam Shai3y131

This is super interesting. I was wondering if you could give a few more thoughts/intuitions about why you think reversibility is important. I understand that it would make the simulations more physics-like, but why is being physics-like important to alignment research and/or agency research?

I clicked on the paper by the Critter creator, which seems like it might go deeper into that issue, but don't have the time to read through it right now. Super exciting stuff! Thanks.

[-]Alex_Altair3y51

I'm (currently) mostly interested in it for the purpose of understanding optimization. If, for example, the world has a finite number of possible states, and the evolution rule is reversible, then no long-term optimization is possible, because all (accessible) states will be visited equally often. That scenario is relatively clear, and I'm trying to understand exactly what happens under different constraints, and which kinds of optimization are possible.
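The finite, reversible case can be checked directly. The sketch below is only an illustration: it models a reversible evolution rule as a random bijection on a small set of integer states (the function names and the choice of 20 states are assumptions, not anything from the post) and counts visits along one trajectory.

```python
import random
from collections import Counter

def random_reversible_rule(n_states, seed=0):
    """Return a random bijection on range(n_states), stored as a list."""
    rng = random.Random(seed)
    rule = list(range(n_states))
    rng.shuffle(rule)
    return rule

def cycle_length(rule, start):
    """Number of updates before the trajectory first returns to `start`."""
    state, length = rule[start], 1
    while state != start:
        state = rule[state]
        length += 1
    return length

def visit_counts(rule, start, steps):
    """Count how often each state is occupied over `steps` updates."""
    counts = Counter()
    state = start
    for _ in range(steps):
        counts[state] += 1
        state = rule[state]
    return counts

rule = random_reversible_rule(n_states=20, seed=0)
period = cycle_length(rule, start=0)
counts = visit_counts(rule, start=0, steps=100 * period)

# Every accessible state is visited equally often (100 times each here).
print(period, set(counts.values()))
```

Because the rule is a bijection on a finite set, the trajectory from any start must return to that start, so over whole periods no accessible state is favoured.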

[-]tgb3y50

Not sure that I understand your claim here about optimization. An optimizer is presumably given some choice of possible initial states to choose from to achieve its goal (otherwise it cannot interact at all). In which case, the set of accessible states will depend upon the chosen initial state and so the optimizer can influence long term behavior and choose whatever best matches its desires.
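To make the dependence on the initial state concrete: a reversible rule on a finite state space splits that space into disjoint cycles, and the starting state decides which cycle the system stays on forever. A minimal sketch, again assuming the rule is represented as a bijection on integers (the helper name `cycles_of` is hypothetical):

```python
import random

def cycles_of(rule):
    """Split the state space of a bijection into its disjoint cycles."""
    seen, cycles = set(), []
    for start in range(len(rule)):
        if start in seen:
            continue
        cycle, state = [], start
        while state not in seen:
            seen.add(state)
            cycle.append(state)
            state = rule[state]
        cycles.append(cycle)
    return cycles

rng = random.Random(1)
rule = list(range(12))
rng.shuffle(rule)

# Each printed list is one closed set of accessible states.
for cycle in cycles_of(rule):
    print(cycle)
```

Choosing an initial state amounts to choosing one of these cycles; within a cycle, every state is still visited equally often.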

[-]Adam Shai3y30

I share your confusions/intuition about what is meant by optimization here. But I think for the purposes of this post, optimization is defined here, which is linked to at the beginning of this post. In that link, optimization is thought of as a pattern that persists in the face of perturbations and that evolves towards a small set of states. I'm still not totally grokking it though.

[-]tgb3y20

Thanks. I think I've been tripped up by this terminology more than once now.

[-]tailcalled3y60

Another physics property that would be nice to have is relativity, as it allows objects to have different velocities despite otherwise being the same configuration. However relativity might be too difficult to have in practice, as it puts a lot of constraints on how the automaton should behave, and prevents it from being grid-based.

[-]Alex_Altair3y40

Alex Mennen mentioned this too. It would be super interesting if we could have a discrete state space that still obeyed special relativity! (I unfortunately never properly learned it.)

One difference is that special relativity is approximately not true at everyday speeds (or at least, it doesn't need to be accounted for) whereas reversible laws of physics are fundamentally evident in everyday scenarios. So they somehow feel intuitively more of a useful constraint to me.

[-]tailcalled3y20

> One difference is that special relativity is approximately not true at everyday speeds (or at least, it doesn't need to be accounted for) whereas reversible laws of physics are fundamentally evident in everyday scenarios. So they somehow feel intuitively more of a useful constraint to me.

At everyday speeds, special relativity still holds, it just happens to also be closely approximated by a different kind of relativity that is sometimes called "Newtonian relativity".

> Alex Mennen mentioned this too. It would be super interesting if we could have a discrete state space that still obeyed special relativity! (I unfortunately never properly learned it.)

Stuff like relativity is fundamentally about symmetry. You want to say that if you have some trajectory $τ$ which satisfies the laws of physics, and some symmetry $σ$ (such as "have everything move in the $\to$ direction at a speed of 5 m/s"), then $σ τ$ must also satisfy the laws of physics.

Interestingly, I think you can actually have something resembling Newtonian relativity with a discrete state space, though in doing so you lose the lightspeed limit, which seems bad. More concretely, consider e.g. the trajectory space in cellular automata like Conway's Game of Life or Critters. It is $2^{N^{2} \times N}$, i.e. one binary cell state for each grid position and time step (writing it as $N^{2} \times N$ rather than $N^{3}$ to separate out the time axis).

Suppose we want to create a symmetry corresponding to a Newtonian boost by some vector $v$. That is, we need to create a symmetry $σ : 2^{N^{2} \times N} \to 2^{N^{2} \times N}$ such that if you input some trajectory $τ$, you get a resulting trajectory $σ τ$ where everything moves in the direction of $v$. This can be defined by $σ τ (q, t) = τ (q - t v, t)$.

I suspect that there are no interesting automata which satisfy this symmetry. However, one might be able to make some if one expanded from a cell space of $2$ to a richer cell space, and also had the symmetry act on that space (for example, one could attach a velocity to each of the cells, and then have the symmetry also add to that velocity).
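The boost $σ τ (q, t) = τ (q - t v, t)$ is easy to try out on a concrete automaton. The sketch below is an illustration under assumptions of my own: a trajectory is stored as a list of sets of live cells on an unbounded grid, and the helper names (`life_step`, `boost`, `obeys_life`) are hypothetical. It shows that boosting a Game of Life blinker gives something that is no longer a Life trajectory, so Life at least does not have this symmetry; it says nothing about other automata.

```python
from collections import Counter

def life_step(live):
    """One Game of Life update; `live` is a set of (x, y) live cells."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

def trajectory(live, steps):
    """List of frames tau(., 0), ..., tau(., steps)."""
    frames = [set(live)]
    for _ in range(steps):
        frames.append(life_step(frames[-1]))
    return frames

def boost(frames, v):
    """sigma tau (q, t) = tau(q - t v, t): shift frame t by t * v."""
    vx, vy = v
    return [
        {(x + t * vx, y + t * vy) for (x, y) in frame}
        for t, frame in enumerate(frames)
    ]

def obeys_life(frames):
    """True if each frame is the Life successor of the previous one."""
    return all(life_step(a) == b for a, b in zip(frames, frames[1:]))

blinker = {(0, 0), (1, 0), (2, 0)}
tau = trajectory(blinker, steps=4)

print(obeys_life(tau))                  # True
print(obeys_life(boost(tau, (1, 0))))   # False
```

Boosts defined this way compose by adding velocities, which is the Galilean behaviour described above.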

Still, I think giving up on having a finite speed of light is a pretty serious problem, so I think one would want special relativity rather than Newtonian relativity, and I am pretty sure that is incompatible with a discrete state space.

[-]Alex_Altair3y20

Ah, right. (I've heard this referred to as Galilean relativity.) That does seem like it qualifies as "fundamentally evident in everyday scenarios".

Cellular automata don't really even feel like they have a concept of velocity, just a fixed rate of causal propagation. And when things "move" it's just that they're making changes in the grid next to them, and some patterns just so happen to do so in a way where, after a certain period, it's the same pattern translated... is that what we think happens in our universe? Are electrons moving "just causal propagations"? Somehow this feels more natural for the Game of Life and less natural for physics.
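The "same pattern translated after a certain period" description can be verified for the standard glider. This is a small sketch with a set-of-live-cells representation (an implementation choice for illustration, with y increasing downward): after four updates the glider is exactly its starting shape shifted by one cell diagonally.

```python
from collections import Counter

def life_step(live):
    """One Game of Life update; `live` is a set of (x, y) live cells."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

def translate(live, dx, dy):
    """Shift every live cell by (dx, dy)."""
    return {(x + dx, y + dy) for (x, y) in live}

# Rows, top to bottom:  .O.  /  ..O  /  OOO
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = life_step(state)

print(state == translate(glider, 1, 1))  # True
```

Nothing in the rule mentions motion; the displacement of one diagonal cell per four steps is a property of this particular pattern.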

[-]tailcalled3y5-1

My claim would basically be that "velocity" as a concept only exists in relativistic systems. Because in a relativistic system, you can take any configuration and apply the symmetry to it to get a configuration that travels at a given velocity. Meanwhile in a non-relativistic system like Game of Life, only very special configurations have a recurring pattern, and they might be better thought of as spacetime crystals than as having an actual velocity.

(In fact, I think it might be useful to think of "velocity exists" as the thing that relativity is asserting, rather than "everything is relative" being the thing relativity is asserting.)

[-]Adam Scherlis3y10

This seems too strong. Can't you write down a linear field theory with no (Galilean or Lorentzian) boost symmetry, but where waves still propagate at constant velocity? Just with a weird dispersion relation?

(Not confident in this, I haven't actually tried it and have spent very little time thinking about systems without boost symmetry.)

[-]tailcalled3y4-1

You can probably come up with lots of systems that look approximately like they have velocity. The trouble comes when you want them to exactly satisfy the rule of "for any trajectory t, there is an equivalent trajectory t' which is exactly the same except everything moves with some given velocity, and it still follows the laws of physics", because if you have that property then you also have relativity because relativity is that property.

[-]Adam Scherlis3y30

I just realized,

> for any trajectory t, there is an equivalent trajectory t' which is exactly the same except everything moves with some given velocity, and it still follows the laws of physics

This describes Galilean relativity. For special relativity you have to shift different objects' velocities by different amounts, depending on what their velocity already is, so that you don't cross the speed of light.

So the fact that velocity (and not just rapidity) is used all the time in special relativity is already a counterexample to this being required for velocity to make sense.
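The difference between the two kinds of boost shows up numerically. The sketch below uses the standard one-dimensional relativistic velocity-addition formula and the definition of rapidity as $\operatorname{artanh}(u/c)$, in units where $c = 1$; the function names are just for illustration.

```python
import math

C = 1.0  # work in units where the speed of light is 1

def galilean_boost(u, v):
    """Galilean boost: every velocity shifts by the same amount v."""
    return u + v

def lorentz_boost(u, v, c=C):
    """Relativistic velocity addition in one dimension."""
    return (u + v) / (1.0 + u * v / c ** 2)

def rapidity(u, c=C):
    """Rapidity of a velocity u; rapidities add under Lorentz boosts."""
    return math.atanh(u / c)

boost_velocity = 0.5
for u in (0.0, 0.5, 0.9):
    boosted = lorentz_boost(u, boost_velocity)
    velocity_shift = boosted - u
    rapidity_shift = rapidity(boosted) - rapidity(u)
    print(f"u={u:.1f}  velocity shift={velocity_shift:.4f}  "
          f"rapidity shift={rapidity_shift:.4f}")
```

The velocity shift shrinks as the object's initial speed approaches $c$, while the rapidity shift is the same for every object.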

[-]Adam Scherlis3y30

Sure. I'd say that property is a lot stronger than "velocity exists as a concept", which seems like an unobjectionable statement to make about any theory with particles or waves or both.

[-]tailcalled3y40

I guess there's "velocity exists as a description you can impose on certain things within the trajectory", and then there's "velocity exists as a variable that can be given any value". When I say relativity asserts that velocity exists, I mean in the second sense.

In the former case you would probably not include velocity within causal models of the system, whereas in the latter case you probably would.

[-]Adam Scherlis3y20

As far as I know, condensed matter physicists use velocity and momentum to describe quasiparticles in systems that lack both Galilean and Lorentzian symmetry. I would call that a causal model.

[-]tailcalled3y20

Interesting point. Do the velocities for such quasiparticles act intuitively similar to velocities in ordinary physics?

[-]Adam Scherlis3y10

Yes, it's exactly the same except for the lack of symmetry. In particular, any quasiparticle can have any velocity (possibly up to some upper limit like the speed of light).
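As one illustration (my choice of example, not one named in the thread), take phonons on a one-dimensional chain of masses and springs. The lattice singles out a rest frame, so there is no boost symmetry, yet each wave packet has a well-defined group velocity $dω/dk$. The sketch assumes the textbook dispersion relation $ω(k) = 2\sqrt{K/m}\,|\sin(ka/2)|$ and differentiates it numerically.

```python
import math

def omega(k, spring=1.0, mass=1.0, spacing=1.0):
    """Dispersion relation of a 1D harmonic chain (textbook form)."""
    return 2.0 * math.sqrt(spring / mass) * abs(math.sin(k * spacing / 2.0))

def group_velocity(k, dk=1e-6, **params):
    """Numerical derivative d(omega)/dk via a central difference."""
    return (omega(k + dk, **params) - omega(k - dk, **params)) / (2.0 * dk)

# With spring = mass = spacing = 1 the upper limit is spacing * sqrt(spring / mass) = 1.
for k in (0.1, 1.0, 2.0, 3.0):
    print(f"k={k:.1f}  group velocity={group_velocity(k):.4f}")
```

The group velocity takes every value between zero and the long-wavelength sound speed as $k$ varies, which matches the "any velocity up to some upper limit" description.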

[-]the gears to ascension3y30

I had to look up "boost symmetry", so for posterity, here are the results of the lookup. From text-davinci-003:

Boost symmetry is a property of ~~quantum field theory~~ [note: actually, relativity] which states that the laws of physics remain unchanged under a Lorentz boost, or change in the relative velocity of two frames of reference. This means that the same equations of motion will be true regardless of the observer's velocity relative to the system, and that the laws of nature do not depend on the frame of reference in which they are measured.

I found this video on Lorentz transformations by minutephysics to be the best explanation I found, and I now feel I understand well enough to understand the point being made in context.

Here's a lookup trace:

First I tried Google, which gave results that seemed to mostly assume I wanted a math reference rather than a first visual explanation; it did link to wikipedia:LorentzTransformation, which does give a nice summary of the math, but I wasn't yet sure it was the right thing. So then I asked text-davinci-003 (~~because chatgpt is an insufferable teenager and I'm tired of talking to it whereas td3 is a ... somewhat less insufferable teenager~~). td3 gave the above explanation.

I was still pretty sure I didn't quite understand, so I popped the explanation into metaphor.systems which gave me a bunch of vaguely relevant links, probably because it's not quantum, it's relativity, but I hadn't noticed the error yet.

Then I sighed and tried a YouTube search for "boost symmetry". That gave one result, the video I linked above, which did explain to my satisfaction, and I stopped looking. I don't think I could pass many tests on it at the moment, but my visual math system seems to have a solid enough grasp on it for now.

[-]Alex_Altair3y50

(I enjoyed this style of "log of how I looked something up" comment.)

[-]gwern3y20

I have a series of search case studies if you want to read more like that.

[-]the gears to ascension3y20

curious if you've tried metaphor.

[-]Adam Scherlis3y30

Yeah, sorry for the jargon. "System with a boost symmetry" = "relativistic system" as tailcalled was using it above.

Quoting tailcalled:

> Stuff like relativity is fundamentally about symmetry. You want to say that if you have some trajectory $τ$ which satisfies the laws of physics, and some symmetry $σ$ (such as "have everything move in the $\to$ direction at a speed of 5 m/s"), then $σ τ$ must also satisfy the laws of physics.

A "boost" is a transformation of a physical trajectory ("trajectory" = complete history of things happening in the universe) that changes it by adding a fixed offset to everything's velocity; or equivalently, by making everything in the universe move in some direction while keeping all their relative velocities the same.

[-]Adam Scherlis3y*30

> And when things "move" it's just that they're making changes in the grid next to them, and some patterns just so happen to do so in a way where, after a certain period, it's the same pattern translated... is that what we think happens in our universe? Are electrons moving "just causal propagations"? Somehow this feels more natural for the Game of Life and less natural for physics.

This is what we think happens in our universe!

Both general relativity and quantum field theory are field theories: they have degrees of freedom at each point in space (and time), and objects that "move" are just an approximate description of propagating patterns of field excitations that reproduce themselves exactly in another location after some time.

The most accessible example of this is that light is an electromagnetic wave (a pattern of mutually-reinforcing electric and magnetic waves); photons aren't an additional part of the ontology, they're just a description of how electromagnetic waves work in a quantum universe.

(Quantum field theory can be described using particles to a very good degree of approximation, but the field formalism includes some observable phenomena that the particle formalism doesn't, so it has a strictly better claim to being fundamental.)

Beware, though; string theory may be what underlies QFT and GR, and it describes a world of stringy objects that actually do move through space... But at the very least, the cellular-automata perspective on "objects" and "motion" is not at all strange from a modern physics perspective.

EDIT: I might go so far as to claim that the reason all electrons are identical is the same as the reason all gliders are identical.

[-]Vivek Hebbar3y31

> Beware, though; string theory may be what underlies QFT and GR, and it describes a world of stringy objects that actually do move through space

I think this contrast is wrong.^[1] IIRC, strings have the same status in string theory that particles do in QFT. In QM, a wavefunction assigns a complex number to each point in configuration space, where configuration space has an axis for each property of each particle.^[2] So, for instance, a system with 4 particles with only position and momentum will have a 12-dimensional configuration space.^[3] IIRC, string theory is basically a QFT over configurations of strings (and also branes?), instead of particles. So the "strings" are just as non-classical as the "fundamental particles" in QFT are.

^[1] I don't know much about string theory though, I could be wrong.

^[2] Oversimplifying a bit.

^[3] 4 particles * 3 dimensions. The reason it isn't 24-dimensional is that position and momentum are canonical conjugates.

[-]Adam Scherlis3y20

QFT doesn't actually work like that -- the "classical degrees of freedom" underlying its configuration space are classical fields over space, not properties of particles.

Note that Quantum Field Theory is not the same as the theory taught in "Quantum Mechanics" courses, which is as you describe.

"Quantum Mechanics" (in common parlance): quantum theory of (a fixed number of) particles, as you describe.

"Quantum Field Theory": quantum theory of fields, which are ontologically similar to cellular automata.

"String Theory": quantum theory of strings, and maybe branes, as you describe.*

"Quantum Mechanics" (strictly speaking): any of the above; quantum theory of anything.

You can do a change of basis in QFT and get something that looks like properties of particles (Fock space), and people do this very often, but the actual laws of physics in a QFT (the Lagrangian) can't be expressed nicely in the particle ontology because of nonperturbative effects. This doesn't come up often in practice -- I spent most of grad school thinking QFT was agnostic about whether fields or particles are fundamental -- but it's an important thing to recognize in a discussion about whether modern physics privileges one ontology over the other.

(Note that even in the imperfect particle ontology / Fock space picture, you don't have a finite-dimensional classical configuration space. 12 dimensions for 4 particles works great until you end up with a superposition of states with different particle numbers!)

String theory is as you describe, AFAIK, which is why I contrasted it to QFT. But maybe a real string theorist would tell me that nobody believes those strings are the fundamental degrees of freedom, just like particles aren't the fundamental degrees of freedom in QFT.

*Note: People sometimes use "string theory" to refer to weirder things like M-theory, where nobody knows which degrees of freedom to use...

[-]the gears to ascension3y50

anyone have thoughts on the usefulness of smoothlife/lenia/etc continuous cellular automata?

[-]rorygreig3y80

I have been thinking about this for quite a while. In particular this paper which learns robust "agents" in Lenia seems very relevant to themes in alignment research: Learning Sensorimotor Agency in Cellular Automata

Continuous cellular automata have a few properties which in my view make them a potentially interesting testbed for agency research in AI alignment:

- They seem to be able to support (or make discoverable) much more robust and complex behaviours and agents than discrete CAs, which makes them seem a bit less like "toy" models.
- They can be differentiable, which allows for more efficient search for interesting behaviours (as in the linked paper). This should also be amenable to being accelerated by GPUs.
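For a rough sense of why the update can be differentiable, here is a Lenia-like step on a one-dimensional ring. It is a simplified sketch, not Lenia itself: the flat averaging kernel, the parameter values, and the function names are all assumptions chosen for brevity. The state is real-valued, and the update is built from smooth operations (an average and a Gaussian-shaped growth function) followed by a clip to $[0, 1]$.

```python
import math

def growth(u, mu=0.3, sigma=0.05):
    """Smooth growth function: positive near mu, negative elsewhere."""
    return 2.0 * math.exp(-((u - mu) ** 2) / (2.0 * sigma ** 2)) - 1.0

def step(state, radius=3, dt=0.1):
    """One update of a continuous CA on a ring of real-valued cells."""
    n = len(state)
    offsets = [d for d in range(-radius, radius + 1) if d != 0]
    new_state = []
    for i, a in enumerate(state):
        # Flat kernel: plain average of the neighbours within `radius`.
        neighbourhood = sum(state[(i + d) % n] for d in offsets) / len(offsets)
        new_state.append(min(1.0, max(0.0, a + dt * growth(neighbourhood))))
    return new_state

state = [0.5 * (1.0 + math.sin(2.0 * math.pi * i / 32)) for i in range(32)]
nudged = list(state)
nudged[0] += 1e-6  # tiny perturbation of a single cell

gap = max(abs(x - y) for x, y in zip(step(state), step(nudged)))
print(gap)  # a small input change produces a comparably small output change
```

In a discrete automaton a single flipped cell changes the output by a whole unit or not at all; here small changes in the state give small changes in the next state, which is what makes gradient-based search for behaviours possible in principle.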

I am hoping to get the time at some point to explore some of these ideas using Lenia (I am working a full time job so it would have to be more of a side project). In particular I would like to re-implement the sensorimotor agency paper then see what avenues that opens. Perhaps trying to quantitatively measure abstraction within Lenia: for example, can we come up with a measure of abstraction that can automatically identify these "agents"? Or something along the lines of the information theory of individuality, to see whether optimizing globally for these measures (with gradient descent) actually produces something that we recognise as agents / individuals.

I will admit that a lot of my motivation for this is just that I find continuous cellular automata fascinating and fun, rather than considering this the most promising direction for alignment research. But I do also think it could be fruitful for alignment research.

[-]Alex_Altair3y30

This previous comment thread talks about this idea, and I probably read it a while ago and was influenced.

[-]Rafael Cosman3y10

Have spent some time playing with reversible CAs, and can confirm that they are very interesting. They are a great example of how provable high-level properties (things like conservation of gliders) can come out of low level properties (reversibility).
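One simple way to get the low-level property is the second-order construction, where the next state is a function of the current state XORed with the previous state. To be clear, this is a different technique from the block rule used by Critters, and the local rule below is an arbitrary choice for illustration. The sketch runs such an automaton forward, swaps the last two frames, and runs the same rule to recover the initial condition exactly.

```python
import random

def local_rule(left, centre, right):
    """Arbitrary neighbourhood function; any choice keeps the CA reversible."""
    return (left | right) & (centre ^ 1)

def step(prev, cur):
    """Second-order update: next = f(cur) XOR prev. Returns (cur, next)."""
    n = len(cur)
    nxt = [
        local_rule(cur[i - 1], cur[i], cur[(i + 1) % n]) ^ prev[i]
        for i in range(n)
    ]
    return cur, nxt

def run(prev, cur, steps):
    """Apply `step` repeatedly and return the final pair of frames."""
    for _ in range(steps):
        prev, cur = step(prev, cur)
    return prev, cur

rng = random.Random(0)
first = [rng.randint(0, 1) for _ in range(16)]
second = [rng.randint(0, 1) for _ in range(16)]

# Forward 50 steps, then swap the two frames and run the same rule again.
late_prev, late_cur = run(first, second, steps=50)
back_prev, back_cur = run(late_cur, late_prev, steps=50)

print((back_cur, back_prev) == (first, second))  # True
```

Reversibility here follows from XOR being its own inverse, whatever `local_rule` does; higher-level conservation properties like the ones mentioned above would have to be studied on top of a specific rule.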
