David James

My top interest is AI safety, followed by reinforcement learning. My professional background is in software engineering, computer science, and machine learning. I have degrees in electrical engineering, liberal arts, and public policy. I currently live in the Washington, DC metro area; before that, I lived in Berkeley for about five years.

Comments

First, I encourage you to give some weight to the current score of -40 and to a moderator saying the post doesn't meet LessWrong's quality bar.

By LD you mean Lincoln-Douglas debate, right? If so, please continue reading.

Second, I'd like to put some additional ideas up for discussion and consideration, not debate; I don't want to debate you, certainly not in LD style. If you care about truth-seeking, I suggest taking a hard, critical look at LD. To what degree is Lincoln-Douglas debate organized around truth-seeking? How often does a participant in an LD debate change their position based on new evidence? In my understanding, in practice LD is quite uninterested in the notion of being "less wrong". It seems to be about a particular kind of rhetorical art: fortifying one's own position as much as possible while attacking another's. One might hope that the LD debate process somehow surfaces the truth. Maybe, in some cases. But generally speaking, I find it to be a woeful distortion of curious discussion and truth-seeking.

Surprisingly, perhaps, https://dl.acm.org/doi/book/10.5555/534975 has a free link to the full-text PDF.

Reinforcement learning is not required for the analysis above. Only evolutionary game theory is needed.

  • In evolutionary game theory, the population's mix of strategies changes via replicator dynamics.
  • In RL, each individual agent modifies its policy as it interacts with its environment using a learning algorithm.
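The contrast above can be sketched numerically. Below is a minimal replicator-dynamics simulation for a two-strategy game; the payoff matrix is an illustrative assumption (a simple coordination game), not taken from the analysis being discussed. The key point is that no individual agent learns anything: only the population shares change.

```python
# Replicator dynamics sketch (illustrative payoffs, not from the post).
# x[i] is the population share playing pure strategy i. Shares grow in
# proportion to how a strategy's payoff compares to the population average.
# Contrast with RL, where each agent would update its own policy.

A = [[3.0, 0.0],   # assumed payoff matrix: a 2x2 coordination game
     [0.0, 2.0]]

def replicator_step(x, dt=0.01):
    # Payoff of each pure strategy against the current population mix.
    fitness = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    # Population-average payoff.
    avg = sum(x[i] * fitness[i] for i in range(2))
    # Replicator update: advantage over the average drives growth.
    x = [x[i] + dt * x[i] * (fitness[i] - avg) for i in range(2)]
    total = sum(x)
    return [xi / total for xi in x]  # renormalize against numerical drift

x = [0.6, 0.4]
for _ in range(2000):
    x = replicator_step(x)

print(x)  # the mix converges toward the pure equilibrium favored at the start
```

Here the population converges to one of the pure equilibria without any agent ever running a learning algorithm, which is the sense in which RL is not required.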

Personally, I am most confident in 1, then 4, then 3, then 2 (in each case conditional on all the previous claims).

Oops. A previous version of this comment was wrong, so I edited it. The author’s confidence can be written as:

Also, independent of the author’s confidence:

"thereby writing directly into your brain’s long-term storage and bypassing the cache that would otherwise get erased"

What do we know about "writing directly" into long-term storage versus a short-term cache? What studies? Any theories about the mechanism(s)?

First, thank you for writing this. I would ask that you continue to think & refine and share back what you discover, prove, or disprove.

To me, it seems quite likely that B will have a lot of regularity to it. It will not be good code from the human perspective, but I think it will have a lot of structure, simply because that structure is present in T and the environment.

I'm interested to see if we can (i) do more than claim this is likely and (ii) unpack reasons that would require it to be the case.

One argument for (ii) would go like this. Assume the generating process for A has a preference for shorter programs. Then we can think of A as tending to find shorter descriptions that match task T.

Claim: shorter (and correct) descriptions reflect some combination of environmental structure and compression.

  • By 'environmental structure' I mean the laws underlying the task.
  • By 'compression' I mean algorithmic techniques, grounded in information theory, that make the program smaller.

I think this claim is true, but let's not answer that too quickly. I'd like to probe this question more deeply.

  1. Are there more than two factors (environmental structure & compression)?
  2. Is it possible that the description gets the structure wrong but makes up for it with great compression? I think so. One can imagine a clever trick by which a small program expands itself into something like a big ball of mud that solves the task well.
  3. Any expansion process takes time and space. This makes me wonder if we should care not only about description length but also run time and space. If we pay attention to both, it might be possible to penalize programs that expand into a big ball of mud.
  4. However, penalizing run time and space might be unwise, depending on what we care about. One could imagine a program that starts from first principles and derives higher-level approximations that are good enough to model the domain. It might be worth paying the cost of setting up the approximations because they are used frequently. (In other words, the amortized cost of the expansion is low.)
  5. Broadly, what mathematical tools can we use on this problem?
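Point 4 above can be made concrete with a small sketch. Everything here (the table, the function names, the use of squaring as the "first principles" computation) is a hypothetical illustration, not something from the original discussion: a short description pays a one-time expansion cost to build a large derived structure, after which each query is cheap.

```python
# A short "description" that expands itself at startup: the source stays
# small, while the expanded in-memory state is O(n). If queries are
# frequent, the amortized cost of the expansion is low.

def build_table(n):
    # Expansion step: derive a lookup table from "first principles"
    # (trivially, squaring, as a stand-in for an expensive derivation).
    return [i * i for i in range(n)]

TABLE = build_table(10_000)   # one-time setup cost, paid once

def fast_square(i):
    return TABLE[i]           # each subsequent query is a cheap lookup

print(fast_square(123))
```

A description-length-only criterion would reward the small source; a criterion that also charged for run time or space would notice the large expanded state, which is the tension points 3 and 4 are circling.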

See also Nomic, a game by Peter Suber where a move in the game is a proposal to change the rules of the game.

I grant that legalese increases the total page count, but I don't think it necessarily changes the depth of the tree very much (by depth I mean how many documents refer back to other documents).

I've seen spaghetti towers written in very concise computer languages (such as Ruby) that nevertheless involve perhaps 50+ levels (in this context, a level is a function call).

In my experience, programming languages with {static or strong} typing are considerably easier to refactor than languages with {weak or dynamic} typing.*

* The {static vs dynamic} and {strong vs weak} dimensions are sometimes blurred together, but this Stack Overflow Q&A unpacks the differences pretty well.
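A small sketch of why refactoring is harder under dynamic typing (the function and the bad call site here are hypothetical): a type mismatch introduced during a refactor surfaces only when the offending line actually runs, whereas a static checker such as mypy would flag it before execution.

```python
def total(prices):
    # Dynamically typed: nothing verifies the argument before the call.
    return sum(prices)

# Suppose a refactor changed this call site to pass the wrong type.
# The mistake surfaces only when the line executes:
try:
    result = total("not a list of numbers")
except TypeError:
    result = "error surfaced at runtime"

print(result)
```

With an annotation like `def total(prices: list[float]) -> float`, a static type checker would report the bad call without running the program, which is the refactoring advantage being claimed.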

"No source code"

I get the intended meaning, but I would like to make the wording a little more precise. While we can regard an organism's DNA as executable source code, DNA is far from a high-level language.
