something i've found useful recently is to notice, and reason about, program searches. they are a particular kind of optimization process; the thing they are searching for happens to itself be a program, or some other program-like optimization process.

kinds of program searches

solomonoff induction is a program search, looking for programs to serve as hypotheses. we'll ignore unbounded solomonoff induction because it's uncomputable, and stick to time-bounded variants like the universal program or levin search.
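to make this concrete, here's a toy sketch of levin-style search (the scheduling is the real thing; the bitstring interpreter and all names are my own illustrative assumptions): in phase k, every program of length ≤ k gets a step budget of 2^(k − length), so shorter programs get exponentially more of the total budget and no single long-running program can eat the whole search.

```python
from itertools import product

def levin_search(run, is_solution, max_phase=20):
    """toy levin search: in phase k, every bitstring program p of length
    <= k gets a step budget of 2**(k - len(p)), so each phase costs about
    2**k steps in total and shorter programs get exponentially more time."""
    for k in range(1, max_phase + 1):
        for n in range(1, k + 1):
            for bits in product("01", repeat=n):
                program = "".join(bits)
                output = run(program, steps=2 ** (k - n))
                if output is not None and is_solution(output):
                    return program
    return None

# toy interpreter (an assumption for illustration): a program is a binary
# numeral; "running" it outputs the number it denotes, at a pretend cost
# of 2 steps per bit. it returns None when the step budget is too small.
def run(program, steps):
    if steps < 2 * len(program):
        return None
    return int(program, 2)

# search for a program that outputs 42; the shortest is "101010"
found = levin_search(run, is_solution=lambda out: out == 42)
```

the point is the budgeting: total work through phase k is only about k·2^k, so a short solution gets found without ever committing unbounded time to any one candidate.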

evolution is also a program search; the programs are genes/beings.
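the same framing can be sketched as a tiny genetic algorithm (the bitstring "genes", the fitness function, and every parameter here are illustrative assumptions, not a model of real biology):

```python
import random

def evolve(target, pop_size=50, generations=200, seed=0):
    """toy evolution-as-program-search: 'genes' are bitstrings, fitness is
    how many bits match a fixed target environment, and selection plus
    random mutation finds fitter genes with no foresight at all."""
    rng = random.Random(seed)
    n = len(target)
    fitness = lambda g: sum(a == b for a, b in zip(g, target))
    pop = ["".join(rng.choice("01") for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if pop[0] == target:
            break
        parents = pop[: pop_size // 2]  # keep the fitter half
        # offspring: copy a random parent, resampling each bit with 5% chance
        pop = [
            "".join(c if rng.random() > 0.05 else rng.choice("01")
                    for c in rng.choice(parents))
            for _ in range(pop_size)
        ]
        pop[0] = max(parents, key=fitness)  # elitism: never lose the best gene
    return max(pop, key=fitness)

best = evolve("1011001110110011")
```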

those first two are "naive" program searches: they explore the space of programs at random or by testing every single possibility, and stumble onto things that work by chance. this is very slow; in general, a program is only found in time exponential in its size. but there are more efficient kinds of program searches:
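a minimal sketch of the naive case, enumerating bitstring "programs" in length order (the goal predicate is a stand-in): the candidate pool doubles with each extra bit, so an n-bit program takes on the order of 2^n checks to find.

```python
from itertools import product

def naive_program_search(is_solution, max_len=16):
    """enumerate every bitstring 'program' in length order and test each
    one; the candidate pool doubles with each extra bit, so an n-bit
    program is only found after on the order of 2**n checks."""
    checked = 0
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            checked += 1
            program = "".join(bits)
            if is_solution(program):
                return program, checked
    return None, checked

# toy goal predicate (an assumption for illustration): recognize one
# specific 8-bit program. even 8 bits already costs hundreds of checks;
# a 64-bit program would cost on the order of 2**64.
target = "10110011"
found, tried = naive_program_search(lambda p: p == target)
```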

software engineering is a human-level intelligent program search; humans are designing particular programs, with specific goals in mind, which they sometimes have some idea how to accomplish. this lets them navigate programspace more cleverly than by trying every program in order or at random.

(in the same way, markets are a human-level intelligent program search; the programs are companies trying to do new things.)

eventually, we'll have superintelligent program searches. i'd say those are characterized by the search being powered by a thing which optimizes its own workings, not just optimizes the program it's searching for.

somewhere between naive and superintelligent program searches, is machine learning (ML): this produces useful programs (trained neural networks) in way less than exponential time, but still without being a superintelligent process. it's not clear how to compare ML-level intelligence and human-level intelligence — they each, for now, have tasks that they beat the other at.
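a minimal sketch of why ML-style search lands between naive and intelligent (the toy task, a straight-line fit, is an illustrative assumption): gradient descent uses local error signals to move directly through parameter space, instead of trying candidate "programs" one by one.

```python
def fit_line(points, lr=0.01, steps=2000):
    """fit y = w*x + b by gradient descent on mean squared error;
    each step nudges (w, b) downhill instead of enumerating candidates."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in points:
            err = (w * x + b) - y
            gw += 2 * err * x  # d(err**2)/dw
            gb += 2 * err      # d(err**2)/db
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# recover the "program" y = 3x + 1 from input/output examples
data = [(x, 3 * x + 1) for x in range(-3, 4)]
w, b = fit_line(data)
```

two thousand cheap local steps recover the target function, where enumerating weight settings to the same precision would be hopeless.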

malignhood and demons

it is known that the solomonoff prior is malign: because it is a program search, it can find individual programs which happen to be (or contain) consequentialist/agentic programs, which will try to manipulate the environment surrounding the program search by influencing the output of the computation it inhabits. those programs are called "demons".

machine learning is also suspected to be malign; in fact, that is the whole reason we have AI alignment: we fear that ML will encounter neural nets which are adversarial to us, and able to beat us.

software engineering could be malign if people out there were programming AI (more deliberately than through ML); markets are malign because we do occasionally spawn companies that are adversarial to our general interests; evolution is malign, not just to itself in the usual way but also to us, for example when it keeps producing ever more resistant strains of viruses.


i feel like there are many things which take the shape of program search, the efficiency of which we can reason about, and which we should consider potentially malign. and this feels like an abstraction that i gain value by recognizing in various places.


comments

I think the popular frame where an agent is a program is misleading. An agent is a developing behavior, and to the extent that this behavior is given by a program, the agent is searching for a better program rather than following the current one, even if the current program persists. This is in the sense that agent's identity is not rooted in that program, so if the agent finds a way of incarnating elsewhere, like with other programs implemented in the environment, those other programs could have a greater claim to being/channeling the agent than the original unchanged program.

Even worse, an agent needs a map, a simulator that in particular captures the world (but also many fictional/hypothetical situations), and agent's own behavior is only a minor part of that map. A more natural perspective on an agent doesn't have everything revolving around its own behavior, but instead has everything revolving around map-making, with its own behavior as a somewhat better developed area of its map, possibly with some care taken with not forgetting where exactly the agent is, its identity and associated objectives. In this frame, programs only appear as approximate descriptions of behaviors of things/persisting-developing-episodes/simulacra on the map (including agent's own behaviors), and programs are not central to anything that's going on.

would it be fair to say that an agent-instant is a program? and then we can say that a "continuous agent" can be a sequence of programs where each program tends to value the future containing something roughly similar to itself, in the same way that me valuing my survival means that i want the future to contain things pretty similar to me.

(i might follow up on the second part of that comment when i get around to reading that Simulators post)
