johnswentworth

Sequences

From Atoms To Agents
"Why Not Just..."
Basic Foundations for Agent Models
Framing Practicum
Gears Which Turn The World
Abstraction 2020
Gears of Aging
Model Comparison

Wiki Contributions

Comments

Yup. Also, I'd add that entropy in this formulation increases exactly when more than one macrostate at time t maps to the same actually-realized macrostate at time t+1, i.e. when the macrostate evolution is not time-reversible.
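Spelling that out a bit (a quick gloss, assuming the underlying microstate dynamics are injective): if two macrostates at time t, compatible with W_1 and W_2 microstates respectively, both evolve into the actually-realized macrostate at time t+1, then that later macrostate must be compatible with at least W_1 + W_2 microstates, so

\[ S_{t+1} \;\ge\; \log(W_1 + W_2) \;>\; \log W_i \;=\; S_t \]

for whichever of the two macrostates was actually realized at time t.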

This post was very specifically about a Boltzmann-style approach. I'd also generally consider the Gibbs/Shannon formula to be the "real" definition of entropy, and usually think of Boltzmann as the special case where the microstate distribution is constrained to be uniform. But a big point of this post was to be like "look, we can get a surprising amount (though not all) of thermo/stat mech without bringing in any actual statistics, just by restricting ourselves to the Boltzmann notion of entropy".
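Spelled out, that special case: with the Gibbs/Shannon entropy

\[ S \;=\; -\sum_i p_i \log p_i , \]

constraining the microstate distribution to be uniform over the W microstates compatible with the macrostate (i.e. p_i = 1/W) gives

\[ S \;=\; -\sum_{i=1}^{W} \tfrac{1}{W} \log \tfrac{1}{W} \;=\; \log W , \]

which is the Boltzmann entropy (up to the choice of units/Boltzmann's constant).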

Meta: this comment is decidedly negative feedback, so it needs the standard disclaimers. I don't know Ethan well, but I don't harbor any particular ill-will towards him. This comment is negative feedback about Ethan's skill in choosing projects in particular; I do not think others should mimic him in that department, but that does not mean I think he's a bad person/researcher in general. I leave the comment mainly for the benefit of people who are not Ethan, so for Ethan: I am sorry for being not-nice to you here.


When I read the title, my first thought was "man, Ethan Perez sure is not someone I'd point to as an exemplar of choosing good projects".

On reading the relevant section of the post, it sounds like Ethan's project-selection method is basically "forward-chain from what seems quick and easy, and also pay attention to whatever other people talk about". Which indeed sounds like a recipe for very mediocre projects: it's the sort of thing you'd expect a priori to reliably produce publications and get talked about, but to have basically-zero counterfactual impact. These are the sorts of projects where someone else would likely have done something similar regardless, and which aren't likely to change how people think about things or build things; they just generally add marginal effort to the prevailing milieu, whatever that might be.

From reading, I imagined a memory+cache structure instead of being closer to "cache all the way down".

Note that the things being cached are not things stored in memory elsewhere. Rather, they're (supposedly) outputs of costly-to-compute functions - e.g. the instrumental value of something would be costly to compute directly from our terminal goals and world model. And most of the values in cache are computed from other cached values, rather than "from scratch" - e.g. the instrumental value of X might be computed (and then cached) from the already-cached instrumental values of some stuff which X costs/provides.

Coherence of Caches and Agents goes into more detail on that part of the picture, if you're interested.
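If a concrete toy helps, here's a minimal sketch (my own illustration, with made-up names and values, not anything from the post): most cache entries get filled in from other already-cached entries, rather than being recomputed from the terminal goals and world model each time.

```python
# Toy example (hypothetical names/values): instrumental values as a cache.
terminal_value = {"health": 10.0, "free_time": 5.0}   # terminal goals
provides = {                                          # what each thing helps obtain
    "money": ["health", "free_time"],
    "job": ["money"],
}

cache = dict(terminal_value)  # seed the cache with terminal values

def instrumental_value(x):
    """Costly to compute in general; here, derived from already-cached values."""
    if x not in cache:
        cache[x] = sum(instrumental_value(y) for y in provides.get(x, []))
    return cache[x]

print(instrumental_value("job"))  # 15.0, computed via the cached value of "money"
```

Here the value of "job" is computed from the cached value of "money", which was itself derived (and cached) from the terminal values; nothing is recomputed from scratch.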

Very far through the graph representing the causal model, where we start from one or a few nodes representing the immediate observations.

You were talking about values and preferences in the previous paragraph, then suddenly switched to “beliefs”. Was that deliberate?

Yes.

... man, now that the post has been downvoted a bunch I feel bad for leaving such a snarky answer. It's a perfectly reasonable question, folks!

Overcompressed actual answer: core pieces of a standard doom argument involve things like "killing all the humans will be very easy for a moderately-generally-smarter-than-human AI" and "killing all the humans (either as a subgoal or a side-effect of other things) is convergently instrumentally useful for the vast majority of terminal objectives". A standard doom counterargument usually doesn't dispute those two pieces (though there are of course exceptions); it usually argues instead that we'll have ample opportunity to iterate, and therefore it doesn't matter that the vast majority of terminal objectives instrumentally incentivize killing humans; we'll iterate until we find ways to avoid that sort of thing.

The standard core disagreement is then mostly about the extent to which we'll be able to iterate, or will in fact iterate in ways which actually help. In particular, cruxy subquestions tend to include:

  • How visible will "bad behavior" be early on? Will there be "warning shots"? Will we have ways to detect unwanted internal structures?
  • How sharply/suddenly will capabilities increase?
  • Insofar as problems are visible, will labs and/or governments actually respond in useful ways?

Militarization isn't very centrally relevant to any of these; it's mostly relevant to things which aren't really in doubt anyway, at least in the medium-to-long term.

Yes, I mean "mole" as in the unit from chemistry. I used it because I found it amusing.
