johnswentworth's Comments

Algorithms vs Compute

The underlying question I want to answer is: ML performance is limited by both available algorithms and available compute. Both of those have (presumably) improved over time. Relatively speaking, how taut are those two constraints? Has progress come primarily from better algorithms, or from more/cheaper compute?

Embedded Agency via Abstraction

Thanks for the pointer, sounds both relevant and useful. I'll definitely look into it.

On hiding the source of knowledge

Lately I've been explicitly trying to trace the origins of the intuitions I use for various theoretical work, and writing up various key sources of background intuition. That was my main reason for writing a review of Design Principles of Biological Circuits, for instance. I do expect this will make it much easier to transfer my models to other people.

It sounds like many of the sources of your intuition are way more spiritual/political than most of mine, though. I have to admit I'd expect intuition-sources like mystic philosophy and conflict-y politics to systematically produce not-very-useful ideas, even in cases where the ideas are true. Specifically, I'd expect such intuition-sources to produce models without correct gears in them.

Coordination as a Scarce Resource

Good points. Personal to Prison Gangs makes a similar point about regulation, along with several other phenomena - litigation, credentialism, tribalism, etc. Based on the model in the OP, all of these are increasing over time because they solve coordination problems more scalably than old systems (e.g. personal reputation).

With regards to coordination vs object-level skills, I think a decent approximation is that object-level skills usually need to be satisficed - one needs to produce a good-enough product/service. After that, it's mainly about coordination: finding the people who already need the good-enough product you have. To put it differently, decreasing marginal returns usually seem to kick in much earlier in object-level investments than in coordination investments.

Material Goods as an Abundant Resource

I think the usual formulation of homo economicus would agree with you on that one, actually.

Constraints & Slackness as a Worldview Generator

Great question! The short answer, in the context of the China example, is that the capital bottleneck is the first gear in the model. Whether banking/lending would relax the constraint depends on the next gear up the chain - i.e. it depends on why capital was scarce in the first place.

Here are a few possibilities:

  • Malthusian poverty trap: all excess resources go to expanding the population, so there is little-to-no surplus to invest in capital.
  • institutions: weak property rights or poor contract-enforcement mechanisms make it difficult to invest.
  • coordination problem: there are plenty of people with surplus to invest, and plenty of people with profitable ways to invest it, but the coordination problem between them hasn't been solved.

Introducing banking/lending would potentially solve the last one, but not the first two. In the constraint language, banking technology introduces new constraints: it requires contract enforcement, and it requires people with excess resources to invest (among other things). Those new constraints need to be less taut than the old capital constraint in order for the technology to be adopted.

In the case of China, banking/lending technology was almost certainly available - it simply wasn't used to the same extent as in Europe. I have heard both the Malthusian trap and the institutions explanations given as possible reasons, but I haven't personally studied the question enough to know which was most relevant.

(A -> B) -> A in Causal DAGs

Again, decision theory/game theory are not about "executing a knowable strategy" or "behavior selection according to legible reasoning". They're about what goal-directed behavior means, especially under partial information and in the presence of other goal-directed systems. The theory of decisions/games is the theory of how to achieve goals. Whether a legible strategy achieves a goal is mostly incidental to decision/game theory - there are some games where legibility/illegibility could convey an advantage, but that's not really something that most game theorists study.

(A -> B) -> A in Causal DAGs

Very interesting, thank you for the link!

Main difference between what they're doing and what I'm doing: they're using explicit utility & maximization nodes; I'm not. It may be that this doesn't actually matter. The representation I'm using certainly allows for utility maximization - a node downstream of a cloud can just be a maximizer for some utility on the nodes of the cloud-model. The converse question is less obvious: can any node downstream of a cloud be represented by a utility maximizer (with a very artificial "utility")? I'll probably play around with that a bit; if it works, I'd be able to re-use the equivalence results in that paper. If it doesn't work, then that would demonstrate a clear qualitative difference between "goal-directed" behavior and arbitrary behavior in these sorts of systems, which would in turn be useful for alignment - it would show a broad class of problems where utility functions do constrain.
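For concreteness, here's a minimal Python sketch of the first direction - a node downstream of a cloud implemented as an expected-utility maximizer over the cloud-model. This is not the formalism from the post: the cloud is reduced to a toy joint distribution, and the names (cloud_dist, utility, decision_node) are hypothetical illustrations of the claim that the representation allows for utility maximization.

```python
# Toy sketch (hypothetical, not the post's formalism): the "cloud" is reduced
# to a joint distribution over two binary nodes (X, Y), and the downstream
# node simply picks the action that maximizes expected utility under it.

cloud_dist = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.2,
    (1, 1): 0.3,
}

def utility(x, y, action):
    # Arbitrary utility over the cloud's nodes plus the downstream action.
    return (x + y) * action - 0.5 * action

def decision_node(actions=(0, 1)):
    # The downstream node: argmax of expected utility under the cloud-model.
    def expected_utility(a):
        return sum(p * utility(x, y, a) for (x, y), p in cloud_dist.items())
    return max(actions, key=expected_utility)

print(decision_node())  # -> 1, since E[X + Y] = 0.9 > 0.5
```

The converse question in the comment - whether an arbitrary downstream node can be rewritten in this form with some artificial utility - is exactly what a sketch like this doesn't settle.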

Theory of Causal Models with Dynamic Structure?

Indeed, that's exactly why I'm looking for it.

(A -> B) -> A in Causal DAGs

On reflection, there's a better answer to this than I originally gave, so I'm trying again.

"What the agent believes the model to be" is whatever's inside the cloud in the high-level model. That's precisely what the clouds mean. But the clouds (and their contents) only exist in the high-level model; the low-level model contains no clouds. The "actual model" is the low-level model.

So, when we talk about the extent to which the high-level and low-level models match - i.e. what queries on the low-level model can be answered by queries on the high-level model - we're implicitly talking about the extent to which the agent's model matches the low-level model.

The high-level model (at least the part of it within the cloud) is "what the agent believes the model to be".
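As a toy illustration of the "queries on the low-level model answered by queries on the high-level model" notion (my own sketch, not the post's formalism): take a low-level model of three independent biased coins and a high-level model that only tracks their sum. Any query expressible in terms of the sum gives the same answer in either model; queries about individual coins don't, which is the sense in which the match is partial.

```python
from itertools import product

# Hypothetical low-level model: three independent biased coins.
biases = [0.2, 0.5, 0.7]

low_level = {}
for bits in product([0, 1], repeat=3):
    p = 1.0
    for b, bias in zip(bits, biases):
        p *= bias if b else (1 - bias)
    low_level[bits] = p

# High-level model: keep only the sum of the coins (the abstract variable).
high_level = {}
for bits, p in low_level.items():
    high_level[sum(bits)] = high_level.get(sum(bits), 0.0) + p

# A query the high-level model can answer: P(at least two heads).
query_low = sum(p for bits, p in low_level.items() if sum(bits) >= 2)
query_high = sum(p for s, p in high_level.items() if s >= 2)
assert abs(query_low - query_high) < 1e-12  # same answer either way
```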
