Pointing at Normativity
Consequences of Logical Induction
Partial Agency
Alternate Alignment Ideas
Filtered Evidence, Filtered Arguments
Embedded Agency
Hufflepuff Cynicism


Yes, thanks for citing it here! I should have mentioned it, really.

I see the Skyrms iterative idea as quite different from the "just take a fixed point" theory I sketch here, although clearly they have something in common. FixDT makes it easier to combine both epistemic and instrumental concerns -- every fixed point obeys the epistemic requirement; and then the choice between them obeys the instrumental requirement. If we iteratively zoom in on a fixed point instead of selecting from the set, this seems harder?
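To make the "select from the set of fixed points" picture concrete, here is a toy sketch. Everything in it is an assumption made for illustration: `world_response` is a made-up map from the agent's credence in an event to the event's actual probability, `expected_utility` is a made-up payoff, and the grid search is just the simplest way to find fixed points. It is not taken from the FixDT writeup itself.

```python
# Toy illustration only: the functions below are hypothetical stand-ins.

def world_response(p):
    """Assumed map from the agent's credence p to the event's true probability."""
    return 3 * p**2 - 2 * p**3

def expected_utility(p):
    """Assumed payoff: utility 1 if the event happens, 0 otherwise."""
    return p

# Epistemic requirement: only credences that match the resulting probability.
grid = [i / 1000 for i in range(1001)]
fixed_points = [p for p in grid if abs(world_response(p) - p) < 1e-9]

# Instrumental requirement: choose among the fixed points by expected utility.
best = max(fixed_points, key=expected_utility)

print(fixed_points)  # the epistemically admissible credences
print(best)          # the one the instrumental criterion selects
```

The two requirements stay cleanly separated here: the filter handles the epistemic side, the `max` handles the instrumental side.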

If we try the Skyrms iteration thing, maybe the most sensible thing would be to move toward the beliefs of greatest expected utility -- but do so in a setting where epistemic utility emerges naturally from pragmatic concerns (such as A Pragmatist's Guide to Epistemic Decision Theory by Ben Levinstein). So the agent is only ever revising its beliefs in pragmatic ways, but we assume enough about the environment that it wants to obey both the epistemic and instrumental constraints? But, possibly, this assumption would just be inconsistent with the sort of decision problem which motivates FixDT (and Greaves).

That would be really cool.

I don't understand. The fact that every single node has a Markov blanket seems unrelated. The claim that the intersection of any two blankets is a blanket doesn't seem true? For example, I can have a network:

a -> b -> c
|    |    |
v    v    v
d -> e -> f
|    |    |
v    v    v
g -> h -> i


It seems like the intersection of the blankets for 'a' and 'c' doesn't form a blanket.
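A quick way to check the example is to compute the blankets directly. The sketch below assumes the standard graphical definition (a node's Markov blanket is its parents, its children, and its children's other parents) and reads the diagram as a 3x3 grid with edges pointing right and down. The helper names are made up for illustration. The connectivity test at the end uses the fact that two adjacent nodes can never be d-separated, so a candidate blanket which leaves the rest of the graph connected cannot screen off any proper subset of the remaining nodes.

```python
# Edges of the 3x3 grid DAG: each node points right and down.
EDGES = [
    ("a", "b"), ("b", "c"),
    ("d", "e"), ("e", "f"),
    ("g", "h"), ("h", "i"),
    ("a", "d"), ("b", "e"), ("c", "f"),
    ("d", "g"), ("e", "h"), ("f", "i"),
]
NODES = sorted({n for edge in EDGES for n in edge})

def parents(node):
    return {u for u, v in EDGES if v == node}

def children(node):
    return {v for u, v in EDGES if u == node}

def markov_blanket(node):
    """Parents, children, and the children's other parents."""
    kids = children(node)
    co_parents = set()
    for kid in kids:
        co_parents |= parents(kid)
    return (parents(node) | kids | co_parents) - {node}

def connected_without(removed):
    """True if the undirected skeleton stays connected after deleting `removed`."""
    remaining = set(NODES) - set(removed)
    start = next(iter(remaining))
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for u, v in EDGES:
            for x, y in ((u, v), (v, u)):
                if x == cur and y in remaining and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == remaining

blanket_a = markov_blanket("a")
blanket_c = markov_blanket("c")
overlap = blanket_a & blanket_c

print(sorted(blanket_a), sorted(blanket_c), sorted(overlap))
# If the rest of the graph is still connected, the overlap screens nothing off.
print(connected_without(overlap))
```

Under these assumptions the overlap is the single node 'b', and deleting 'b' leaves the other eight nodes connected, so 'b' alone cannot separate any group of them from the others.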

I find your attempted clarification confusing. 

Our model is going to have some variables in it, and if we don't know in advance where the agent will be at each timestep, then presumably we don't know which of those variables (or which function of those variables, etc) will be our Markov blanket. 

No? A probabilistic model can just be a probability distribution over events, with no "random variables in it". It seemed like your suggestion was to define the random variables later, "on top of" the probabilistic model, not as an intrinsic part of the model, so as to avoid the objection that a physics-ish model won't have agent-ish variables in it.

So the random variables for our Markov blanket can just be defined as things like skin surface temperature & surface lighting & so on; random variables which can be derived from a physics-ish event space, but not by any particularly simple means (since the location of these things keeps changing).

On the other hand, if we knew which variables or which function of the variables were the blanket, then presumably we'd already know where the agent is, so presumably we're already conditioning on something when we say "the agent's boundary is a Markov blanket".

Again, no? If I know skin surface temperature and lighting conditions and so on all add up to a Markov blanket, I don't thereby know where the skin is.

I think that is a basically-correct argument. It doesn't actually argue that agent boundaries aren't Markov boundaries; I still think agent boundaries are basically Markov boundaries. But the argument implies that the most naive setup is missing some piece having to do with "where the agent is".

It seems like you agree with Sam way more than would naively be suggested by your initial reply. I don't understand why. 

When I talked with Sam about this recently, he was somewhat satisfied by your reply, but he did think there were a bunch of questions which follow. By giving up on the idea that the Markov blanket can be "built up" from an underlying causal model, we potentially give up on a lot of niceness desiderata which we might have wanted. So there's a natural question of how much you want to try to recover: properties which you could have gotten from "structural" Markov blankets, and might be able to get some other way, but don't automatically get from arbitrary Markov blankets.

In particular, if I had to guess: causal properties? I don't know about you, but my OP was mainly directed at Critch, and iiuc Critch wants the Markov blanket to have some causal properties so that we can talk about input/output. I also find it appealing for "agent boundaries" to have some property like that. But if the random variables are unrelated to a causal graph (which, again, is how I understood your proposal) then it seems difficult to recover anything like that.

Okay, so you know how AI today isn't great at certain... let's say "long-horizon" tasks? Like novel large-scale engineering projects, or writing a long book series with lots of foreshadowing?

(Modulo the fact that it can play chess pretty well, which is longer-horizon than some things; this distinction is quantitative rather than qualitative and it’s being eroded, etc.)

And you know how the AI doesn't seem to have all that much "want"- or "desire"-like behavior?

(Modulo, e.g., the fact that it can play chess pretty well, which indicates a certain type of want-like behavior in the behaviorist sense. An AI's ability to win no matter how you move is the same as its ability to reliably steer the game-board into states where you're check-mated, as though it had an internal check-mating “goal” it were trying to achieve. This is again a quantitative gap that’s being eroded.)

I don't think the following is all that relevant to the point you are making in this post, but someone cited this post of yours in relation to the question of whether LLMs are "intelligent" (summarizing the post as "Nate says LLMs aren't intelligent") and then argued against the post as goalpost-moving, so I wanted to discuss that.

It may come as a shock to some that Abram Demski adamantly defends the following position: GPT4 is AGI. I would be goalpost-moving if I said otherwise. I think the AGI community is goalpost-moving to the extent that it says otherwise.

I think there is some tendency in the AI Risk community to equate "AGI" with "the sort of AI which kills all the humans unless it is aligned". But "AGI" stands for "artificial general intelligence", not "kills all the humans". I think it makes more sense for the definition of AGI to be up to the community of AI researchers who use the term AGI to distance their work from narrow AI, rather than for it to be up to the AI risk community. And GPT4 is definitely not narrow AI.

I'll argue an even stronger claim: if you come up with a task which can be described and completed entirely in text format (and then evaluated somehow for performance quality), for most such tasks the performance of GPT4 is at or above the performance of a random human. (We can even be nice and only randomly sample humans who speak whichever languages are appropriate to the task; I'll still stand by the claim.) Yes, GPT4 has some weaknesses compared to a random human. But most claims of weaknesses I've heard are in fact contrasting GPT4 to expert humans, not random humans. So my stronger claim is: GPT4 is human-level AGI, maybe not by all possible definitions of the term, but by a very reasonable-seeming definition which 2014 Abram Demski might have been perfectly happy with. To deny this would be goalpost-moving for me; and, I expect, for many.

So (and I don't think this is what you were saying) if GPT4 were being ruled out of "human-level AGI" because it cannot write a coherent set of novels on its own, or do a big engineering project, well, I call shenanigans. Most humans can't do that either.

This topic came up when working on a project where I try to make a set of minimal assumptions such that I know how to construct an aligned system under these assumptions. After knowing how to construct an aligned system under this set of assumptions, I then attempt to remove an assumption and adjust the system such that it is still aligned. I am trying to remove the cartesian assumption right now.

I would encourage you to consider looking at Reflective Oracles next, to describe a computationally unbounded agent which is capable of thinking about worlds which are as computationally unbounded as itself; and a next logical step after that would be to look at logical induction or infrabayesianism, to think about agents which are smaller than what they reason about.

You can compute everything that takes finite compute and memory instantly. (This implies some sense of cartesianess, as I am sort of imagining the system running faster than the world, as it can just do an entire tree search in one "clock tick" of the environment.) 

This part makes me quite skeptical that the described result would constitute embedded agency at all. It's possible that you are describing a direction which would yield some kind of intellectual progress if pursued in the right way, but you are not describing a set of constraints such that I'd say a thing in this direction would definitely be progress.

My intuition is that this would still need to solve the problem of giving an agent a correct representation of itself, in the sense that it can "plan over itself" arbitrarily. This can be thought of as enabling the agent to reason over the entire environment which includes itself.  Is that part a solved problem?

This part seems inconsistent with the previous quoted paragraph; if the agent is able to reason about the world only because it can run faster than the world, then it sounds like it'll have trouble reasoning about itself. 

Reflective Oracles solve the problem of describing an agent with infinite computational resources which can do planning involving itself and other similar agents, including uncertainty (via reflective-oracle Solomonoff induction), which sounds superior to the sort of direction you propose. However, they do not run "faster than the world", as they can reason about worlds which include things like themselves.

Amusingly, searching for articles on whether offering unlicensed investment advice is illegal (and whether disclaiming it as "not investment advice" matters) brings me to pages offering "not legal advice" ;p

Also, to be clear, nothing in this post constitutes investment advice or legal advice. 


(Also I know enough to say up front that nothing I say here is Investment Advice, or other advice of any kind!)

None of what I say is financial advice, including anything that sounds like financial advice. 

I usually interpret this sort of statement as an invocation to the gods of law, something along the lines of "please don't smite me", and certainly not intended literally. Indeed, it seems incongruous to interpret it literally here: the whole point of the discussion, as I'm understanding it, is to provide potentially useful ideas about investing strategies. Am I supposed to pretend that it's just, like, an interesting thought experiment? Or is there some other interpretation of your disclaimer I'm not seeing?
