I'm Steve Byrnes, a professional physicist in the Boston area. I have a summary of my AGI safety research interests at:

steve2152's Comments

Computational Model: Causal Diagrams with Symmetry

The mention of circuits in your later article reminded me of a couple arguments I had on Wikipedia a few years ago (2012, 2015). I was arguing basically that cause-effect (or at least the kind of cause-effect relationship that we care about and use in everyday reasoning) is part of the map, not the territory.

Here's an example I came up with:

Consider a 1kΩ resistor, in two circuits. The first circuit is the resistor attached to a 1V power supply. Here an engineer would say: "The supply creates a 1V drop across the resistor; and that voltage drop causes a 1mA current to flow through the resistor." The second circuit is the resistor attached to a 1mA current source. Here an engineer would say: "The current source pushes a 1mA current through the resistor; and that current causes a 1V drop across the resistor." Well, it's the same resistor ... does a voltage across a resistor cause a current, or does a current through a resistor cause a voltage, or both, or neither? Again, my conclusion was that people think about causality in a way that is not rooted in physics, and indeed if you forced someone to exclusively use physics-based causal models, you would be handicapping them. I haven't thought about it much or delved into the literature or anything, but this still seems correct to me. How do you see things?
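To make the arithmetic explicit: both engineers are applying the same constitutive relation, just reading it in opposite directions:

```latex
V = IR \quad\Longleftrightarrow\quad I = \frac{V}{R},
\qquad R = 1\,\mathrm{k\Omega}:\quad V = 1\,\mathrm{V} \;\Leftrightarrow\; I = 1\,\mathrm{mA}
```

The physics supplies only the constraint; which quantity counts as the "cause" depends on which one the circuit holds fixed.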

The Credit Assignment Problem

I wrote a post inspired by / sorta responding to this one—see Predictive coding = RL + SL + Bayes + MPC

Computational Model: Causal Diagrams with Symmetry

I'll highlight that the brain and a hypothetical AI might not use the same primitives; they're on very different hardware, after all.

Sure. There are a lot of levels at which algorithms can differ.

  • quicksort.c compiled by clang versus quicksort.c compiled by gcc
  • quicksort optimized to run on a CPU vs quicksort optimized to run on an FPGA
  • quicksort running on a CPU vs mergesort running on an FPGA
  • quicksort vs a different algorithm that doesn't involve sorting the list at all

There are people working on neuromorphic hardware, but I don't put much stock in anything coming of it in terms of AGI (the main thrust of that field is low-power sensors). So I generally think it's very improbable that we would copy brain algorithms at the level of firing patterns and synapses (like the first bullet-point or less). I put much more weight on the possibility of "copying" brain algorithms at like vaguely the second or third bullet-point level. But, of course, it's also entirely possible for an AGI to be radically different from brain algorithms in every way. :-)

Computational Model: Causal Diagrams with Symmetry

On further reflection, the primitive of "temporal sequences" (more specifically high-order Markov chains) isn't that different from cause-effect. High-order Markov chains are like "if A happens and then B and then C, probably D will happen next". So if A and B and C are a person moving to kick a ball, and D is the ball flying up in the air...well I guess that's at least partway to representing cause-effect...

(High-order Markov chains are more general than cause-effect because they can also represent non-causal things like the lyrics of a song. But in the opposite direction, I'm having trouble thinking of a cause-effect relation that cannot be represented as a high-order Markov chain, at least at some appropriate level of abstraction, and perhaps with some context-dependence of the transitions.)

I have pretty high confidence that high-order Markov chains are one of the low-level primitives of the brain, based on both plausible neural mechanisms and common sense (e.g. it's hard to say the letters of the alphabet in reverse order). I'm less confident about what exactly are the elements of those Markov chains, and what are the other low-level primitives, and what's everything else that's going on. :-)
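To make "high-order Markov chain" concrete, here's a minimal sketch (toy code, not a claim about neural implementation): next-symbol prediction conditioned on the previous n symbols. Note how it learns the alphabet forward but knows nothing about reversed contexts, matching the can't-recite-backwards observation:

```python
from collections import defaultdict, Counter

def train_markov(sequence, order):
    """Count next-symbol frequencies conditioned on the previous `order` symbols."""
    model = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order : i])
        model[context][sequence[i]] += 1
    return model

def predict(model, context):
    """Return the most likely next symbol after `context`, or None if unseen."""
    counts = model.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Train on the alphabet, a purely sequential (non-causal) pattern:
model = train_markov(list("ABCDEFG"), order=3)
print(predict(model, "ABC"))  # "D" — forward prediction works
print(predict(model, "DCB"))  # None — reversed contexts were never stored
```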

Just thinking out loud :-)

Computational Model: Causal Diagrams with Symmetry

Sounds exciting, and I wish you luck and look forward to reading whatever you come up with! :-)

Computational Model: Causal Diagrams with Symmetry

For what it's worth, my current thinking on brain algorithms is that the brain has a couple low-level primitives, like temporal sequences and spatial relations, and these primitives are able to represent (1) cause-effect, (2) hierarchies, (3) composition, (4) analogies, (5) recursion, and on and on, by combining these primitives in different ways and with different contextual "metadata". This is my opinion, it's controversial in the field of cognitive science and I could be wrong. But anyway, that makes me instinctively skeptical of world-modeling theories where everything revolves around cause-effect, and equally skeptical of world-modeling theories where everything revolves around hierarchies, etc. etc. I would be much more excited about world-modeling theories where all those 5 different types of relationships (and others I omitted, and shades of gray in between them) are all equally representable.

(This is just an instinctive response / hot-take. I don't have a particularly strong opinion that the research direction you're describing here is unpromising.)

What is Abstraction?

An abstraction like "object permanence" would be useful for a very wide variety of goals, maybe even for any real-world goal. An abstraction like "golgi apparatus" is useful for some goals but not others. "Lossless" is not an option in practice: our world is too rich, you can just keep digging deeper into any phenomenon until you run out of time and memory ... I'm sure that a 50,000 page book could theoretically be written about earwax, and it would still leave out details which for some goals would be critical. :-)

Seeking Power is Provably Instrumentally Convergent in MDPs

This is great work, nice job!

Maybe a shot in the dark, but there might be some connection with that paper from a few years back, Causal Entropic Forces (more accessible summary). They define "causal path entropy" as basically the number of different paths you can go down starting from a certain point, which might be related to or the same as what you call "power". And they calculate some examples of what happens if you maximize this (in a few different contexts, all continuous not discrete), and get fun things like (what they generously call) "tool use". I'm not sure that paper really adds anything important conceptually that you don't already know, but just wanted to point that out, and PM me if you want help decoding their physics jargon. :-)
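The analogy can be sketched in a discrete toy setting (this is my hypothetical example, not the paper's continuous formulation): proxy "causal path entropy" by counting distinct trajectories from a state, so a well-connected "hub" state has more paths (more "power") than a locked-in one:

```python
def count_paths(transitions, state, horizon):
    """Count distinct trajectories of length `horizon` starting from `state`.
    `transitions[s]` lists the successor states reachable from s in one step."""
    if horizon == 0:
        return 1
    return sum(count_paths(transitions, s2, horizon - 1)
               for s2 in transitions[state])

# Hypothetical 4-state system: state 0 is a hub; states 1-3 are absorbing.
transitions = {0: [1, 2, 3], 1: [1], 2: [2], 3: [3]}
print(count_paths(transitions, 0, 3))  # 3 — the hub keeps options open
print(count_paths(transitions, 1, 3))  # 1 — an absorbing state has no options
```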

What I talk about when I talk about AI x-risk: 3 core claims I want machine learning researchers to address.

I think capybaralet meant ≥1%.

I don't think your last paragraph is fair; doing outreach / advocacy, and discussing it, is not particularly related to motivated cognition. You don't know how much time capybaralet has spent trying to figure out whether their views are justified; you're not going to get a whole life story in an 800-word blog post.

There is such a thing as talking to an ideological opponent who has spent no time thinking about a topic and has a dumb opinion that could not survive 5 seconds of careful thought. We should still be good listeners, not be condescending, etc., because that's just the right way to talk to people; but realistically we're probably not going to learn anything new (about this specific topic) from such a person, let alone change our own minds (assuming we've already deeply engaged with both sides of the issue).

On the other hand, when talking to an ideological opponent who has spent a lot of time thinking about an issue, we may indeed learn something or change our mind, and I'm all for being genuinely open-minded and seeking out and thinking hard about such opinions. But I think that's not the main topic of this little blog post.
