This is your second post and you're still being vague about the method. I'm updating strongly towards this being a hoax and I'm surprised people are taking you seriously.

Edit: I'll offer you a 50 USD even-money bet that your method won't replicate when tested by a third party with more subjects and a proper control group.

Stephen Fowler's Shortform

Stephen Fowler 2mo 50

You are given a string s corresponding to the Instructions for the construction of an AGI which has been correctly aligned with the goal of converting as much of the universe into diamonds as possible.

What is the conditional Kolmogorov complexity, given s, of the string s' which produces an AGI aligned with "human values" or any other suitable alignment target?

To convert an abstract string to a physical object, the "Instructions" are read by a Finite State Automaton, with the state of the FSA at each step dictating the behavior of a robotic arm (with appropriate mobility and precision) with access to a large collection of physical materials.
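
For reference, a sketch of the standard definition the question relies on. The universal machine U and the program p are notation introduced here for illustration; they are not part of the original post, and U is a separate object from the FSA and robotic arm described above.

```latex
% Conditional Kolmogorov complexity of s' given s, relative to a fixed
% universal machine U (assumed notation): the length of the shortest
% program p that outputs s' when given s as auxiliary input.
K(s' \mid s) \;=\; \min_{p} \bigl\{\, |p| \;:\; U(p, s) = s' \,\bigr\}

% Having s available can only help, up to an additive constant:
K(s' \mid s) \;\le\; K(s') + O(1)
```

By the invariance theorem, changing the choice of U shifts this value by at most an additive constant, so the question is well posed only up to that constant.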

Is a random box of gas predictable after 20 seconds?

Answer by Stephen Fowler Feb 10, 2024 10

Tangential.

Is part of the motivation behind this question to think about the level of control that a super-intelligence could have over a complex system if it was only able to influence a small part of that system?

Stephen Fowler's Shortform

Stephen Fowler 4mo 10

I was not precise enough in my language, and I agree with you highlighting that what "alignment" means for LLMs is a bit vague. While people felt Sydney Bing was cool, if it had not been possible to rein it in, it would have been very difficult for Microsoft to gain any market share. An LLM that doesn't do what it's asked or regularly expresses toxic opinions is ultimately bad for business.

In the above paragraph, understand "aligned" in the concrete sense of "behaves in a way that is aligned with its parent company's profit motive", rather than "acting in line with humanity's CEV". To rephrase the point I was making above, I feel much (a majority, even) of today's alignment research is focused on the first definition of alignment, whilst neglecting the second.

Stephen Fowler's Shortform

Stephen Fowler 4mo 175

A concerning amount of alignment research is focused on fixing misalignment in contemporary models, with limited justification for why we should expect these techniques to extend to more powerful future systems.

By improving the performance of today's models, this research makes investing in AI capabilities more attractive, increasing existential risk.

Imagine an alternative history in which GPT-3 had been wildly unaligned. It would not have posed an existential risk to humanity but it would have made putting money into AI companies substantially less attractive to investors.

Agent membranes and causal distance

Stephen Fowler 4mo 10

Nice post.

"Membranes are one way that embedded agents can try to de-embed themselves from their environment."

I would like to hear more elaboration on "de-embedding". For agents that are embedded in and interact directly with the physical world, I'm not sure that a process of de-embedding is well defined.

There are fundamental thermodynamic properties of agents that are relevant here. Discussion of agent membranes could also include an analysis of how the environment and agent do work on each other via the membrane, and how the agent dissipates waste heat and excess entropy to the environment.

Stephen Fowler's Shortform

Stephen Fowler 4mo 12

"Day by day, however, the machines are gaining ground upon us; day by day we are becoming more subservient to them; more men are daily bound down as slaves to tend them, more men are daily devoting the energies of their whole lives to the development of mechanical life. The upshot is simply a question of time, but that the time will come when the machines will hold the real supremacy over the world and its inhabitants is what no person of a truly philosophic mind can for a moment question."

— Samuel Butler, DARWIN AMONG THE MACHINES, 1863

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Stephen Fowler 4mo 30

An additional distinction between contemporary and future alignment challenges is that the latter concerns the control of physically deployed, self-aware systems.

Alex Altair has previously highlighted that they will (microscopically) obey time reversal symmetry^[1], unlike the information processing of a classical computer program. This recent paper published in Entropy^[2] touches on the idea that a physical learning machine (the "brain" of a causal agent) is an "open irreversible dynamical system" (pp. 12-13).

^[1] Altair A. "Consider using reversible automata for alignment research". 2022.
^[2] Milburn GJ, Shrapnel S, Evans PW. "Physical Grounds for Causal Perspectivalism". Entropy. 2023; 25(8):1190. https://doi.org/10.3390/e25081190

Stephen Fowler's Shortform

Stephen Fowler 4mo 51

Feedback wanted!

What are your thoughts on the following research question:

"What nontrivial physical laws or principles govern the behavior of agentic systems?"

(Very open to feedback along the lines of "hey, that's not really a research question".)

How do you feel about LessWrong these days? [Open feedback thread]

Stephen Fowler 5mo 10

Yes, perhaps there could be a way of having dialogues edited for readability.
