Steven Byrnes

I'm an AGI safety / AI alignment researcher in Boston with a particular focus on brain algorithms. Research Fellow at Astera. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, Twitter, Mastodon, Threads, Bluesky, GitHub, Wikipedia, Physics-StackExchange, LinkedIn

Sequences

Intuitive Self-Models
Valence
Intro to Brain-Like-AGI Safety

Comments

I think we’re in agreement on everything.

By "the real-world thing you're looking at", you mean the image on your monitor, right?

Yup, or as I wrote: “2D pattern of changing pixels on a flat screen”.

I'm very confident a 3D model was involved

For what it’s worth, even if that’s true, it’s still at least possible that we could view both the 3D model and the full source code, and yet still not have an answer to whether it’s spinning clockwise or counterclockwise. E.g. perhaps you could look at the source code and say “this code is rotating the model counterclockwise and rendering it from the +z direction”, or you could say “this code is rotating the model clockwise and rendering it from the -z direction”, with both interpretations matching the source code equally well. Or something like that. That’s not necessarily the case, just possible, I think. I’ve never coded in Flash, so I wouldn’t know for sure. Yeah this is definitely a side track. :)
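
For concreteness, here's a tiny numerical illustration of the kind of ambiguity I have in mind (plain Python, nothing to do with Flash or the actual animation; the specific setup, a depth-mirrored twin point, is just my own stand-in): an orthographic projection throws away depth, so a point orbiting the vertical axis in one direction traces exactly the same 2D screen path as its depth-mirrored twin orbiting in the other direction, and the rendered pixels alone can't settle which is “really” happening.

```python
# Toy illustration (my own construction, not the actual dancer animation):
# an orthographic projection onto the screen plane discards depth, so a
# point orbiting the vertical y-axis one way projects to the same 2D path
# as its depth-mirrored twin orbiting the other way.
import math

def rotate_about_y(point, angle):
    """Rotate a 3D point about the vertical y-axis by `angle` radians."""
    x, y, z = point
    return (x * math.cos(angle) + z * math.sin(angle),
            y,
            -x * math.sin(angle) + z * math.cos(angle))

def project(point):
    """Orthographic render along the depth (z) axis: keep only (x, y)."""
    x, y, z = point
    return (round(x, 9), round(y, 9))

p = (1.0, 0.5, 0.3)             # some point on the 3D model
p_mirror = (p[0], p[1], -p[2])  # same point with depth flipped

for t in (0.0, 0.4, 0.8, 1.2, 1.6):
    frame_a = project(rotate_about_y(p, +t))         # rotating one way
    frame_b = project(rotate_about_y(p_mirror, -t))  # mirror rotating the other way
    assert frame_a == frame_b
print("Both interpretations produce identical 2D frames.")
```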

Nice find with the website, thanks.

Oh, it’s not obnoxious! You’re engaging in good faith. :)

Why do you assume they were forming homunculus concepts?

On further consideration, I have now replaced “But they were forming homunculus concepts just like us.” with “So [the location where thinking happens is] obviously not the kind of thing one can just “feel”.” That actually fits better with the flow of the argument anyway.

I'm from the same culture as you and I claim I don't have homunculus concept, or at least not one that matches what you describe in this post.

For those trying to follow along, this comment was written before the update I described here, which (I hope) helps clarify things.

From both this comment and especially our thread on Post 2, I have a strong impression that you just completely misunderstand this series and everything in it. I think you have your own area of interest which you call “conceptual analysis” here, involving questions like “what is the self REALLY?”, an area which I dismiss as pointlessly arguing over definitions. Those “what is blah REALLY” questions are out-of-scope for this series.

I really feel like I pulled out all the stops to make that clear, including with boldface font (cf. §1.6.2) and multiple repetitions in multiple posts. :)

And yet you somehow seem to think that this “conceptual analysis” activity is not only part of this series, but indeed the entire point of this series! And you’re latching onto various things that I say that superficially resemble this activity, and you’re misinterpreting them as examples of that activity, when in fact they’re not.

I suggest that you should have a default assumption going forward that anything at all that you think I said in this series, you were probably misunderstanding it. :-P

It’s true that what I’m doing might superficially seem to overlap with “conceptual analysis”. For example, “conceptual analysis” involves talking about intuitions, and this series also involves talking about intuitions. There’s a good reason for that superficial overlap, and I explain that reason in §1.6.

If you can pinpoint ways that I could have written more clearly, I’m open to suggestions. :)

Linda is referring to the following paragraph in §3.4.2 that I just deleted :)

There’s a whole lot more detailed structure that I’m glossing over in that diagram. For example, in my own mind, I think of goals as somehow “inside” the homunculus. In some respects, my body feels like “a thing that the homunculus operates”, like that little alien-in-the-head picture at the top of the post, whereas in other respects my body feels connected to the homunculus in a more intimate way than that. The homunculus is connected to awareness both as an input channel (it “watches the stream-of-consciousness (§2.3) on the projector screen of the Cartesian theater”, in the Consciousness Explained analogy), and as an output (“choosing” thoughts and actions). Moods might be either internalized (“I’m really anxious”) or externalized (“I feel anxiety coming on”), depending on the situation. (More on externalization in §3.5.4 below.) And so on.

I thought about it more and decided that this paragraph was saying things that I hadn’t really thought too hard about, and that don’t really matter for this series, and that are also rather hard to describe (or at any rate, that I lack the language to describe well). I mean, concepts can be kinda vague clouds that have a lot of overlaps and associations, making a kinda complicated mess … and then I try to describe it, and it sounds like I’m describing a neat machine of discrete non-overlapping parts, which isn’t really what I meant.

(That said, I certainly wouldn’t be surprised if there were also person-to-person differences, between you and me, and also more broadly, on top of my shoddy introspection and descriptions :) )

New version is:

As above, the homunculus is definitionally the thing that carries “vitalistic force”, and that does the “wanting”, and that does any acts that we describe as “acts of free will”. Beyond that, I don’t have strong opinions. Is the homunculus the same as the whole “self”, or is the homunculus only one part of a broader “self”? No opinion. Different people probably conceptualize themselves rather differently anyway.

I agree that it is possible to operationalize “top-down versus bottom-up” such that it corresponds to a real and important bright-line distinction in the brain. But it’s also possible to operationalize “top-down versus bottom-up” such that it doesn’t. And that’s what sometimes happens. :)

This is fantastic!

Thanks! :)

Why "veridical" instead of simply "accurate"?

Accurate might have been fine too. I like “veridical” mildly better for a few reasons, more about pedagogy than anything else.

One reason is that “accurate” has a strong positive-valence connotation (i.e., “accuracy is good, inaccuracy is bad”), which is distracting, since I’m trying to describe things independently of whether they’re good or bad. I would rather find a term with a strictly neutral vibe. “Veridical”, being a less familiar term, is closer to that. But alas, I notice from your comment that it still has some positive connotation. (Note how you said “being unfair”, suggesting a frame where I said the intuition was non-veridical = bad, and you’re “defending” that intuition by saying no it’s actually veridical = good.) Oh well. It’s still a step in the right direction, I think.

Another reason is that I’m trying hard to push for a two-argument usage (“X is or is not a veridical model of Y”), rather than a one-argument usage (“X is or is not veridical”). I wasn’t perfect about that. But again, I think “accurate” makes that problem somewhat worse. “Accurate” has a familiar connotation that the one-argument usage is fine, because of course everybody knows what territory corresponds to the map. “Veridical” is more of a clean slate in which I can push people towards the two-argument usage.

Another thing: if someone has an experience that there’s a spirit talking to them, I would say “their conception of the spirit is not a veridical model of anything in the real world”. If I said “their conception of the spirit is not an accurate model of anything in the real world”, that seems kinda misleading: it’s not just a matter of less accurate versus more accurate; it’s stronger than that.

The GIF isn't rotating, but the 3D model that produced the GIF was rotating, and that's the thing our intuitive models are modeling. So exactly one of [spinning clockwise] and [spinning counterclockwise] is veridical, depending on whether the graphic artist had the dancer rotating clockwise or counterclockwise before turning her into a silhouette.

It was made by a graphic artist. I’m not sure of their exact technique, but it seems at least plausible to me that they never actually created a 3D model. Some people are just really good at art. I dunno. This seems like the kind of thing that shouldn’t matter though! :)

Anyway, I wrote “that’s not a veridical model of the real-world thing you’re looking at” to specifically preempt your complaint. Again see what I wrote just above, about two-argument versus one-argument usage :)

Of course nobody is forcing you to do it when you find it pointless, which is okay.

Yup! :) :)

The algorithm analysis method arguably doesn't really fit here, since it requires access to the algorithm, which isn't available in the case of the brain.

Oh I have lots and lots of opinions about what algorithms are running in the brain. See my many dozens of blog posts about neuroscience. Post 1 has some of the core pieces: I think there’s a predictive (a.k.a. self-supervised) learning algorithm, that the trained model (a.k.a. generative model space) for that learning algorithm winds up stored in the cortex, and that the generative model space is continually queried in real time by a process that amounts to probabilistic inference. Those are the most basic things, but there’s a ton of other bits and pieces that I introduce throughout the series as needed, things like how “valence” fits into that algorithm, how “valence” is updated by supervised learning and temporal difference learning, how interoception fits into that algorithm, how certain innate brainstem reactions fit into that algorithm, how various types of attention fit into that algorithm … on and on.
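
If it helps to see the shape of that claim, here’s a deliberately toy Python sketch of the kind of loop I’m describing, where a learned generative model proposes candidate next thoughts and a scalar “valence” signal, updated by temporal difference learning, biases which candidate wins. Every concrete detail below (the dictionaries, the learning rate, the sampling rule) is an illustrative stand-in, not a claim about how neurons actually implement any of this.

```python
import random

class ToyBrainLoop:
    """Deliberately oversimplified stand-in for the loop described above;
    none of these data structures are claims about neural implementation."""

    def __init__(self, learning_rate=0.1):
        self.generative_model = {}  # context -> set of candidate next thoughts
        self.valence = {}           # thought -> scalar "goodness" signal
        self.lr = learning_rate

    def predictive_learning(self, context, observed_next_thought):
        # Self-supervised update: remember what tends to follow what.
        self.generative_model.setdefault(context, set()).add(observed_next_thought)

    def td_update(self, thought, reward, next_thought):
        # Temporal-difference update of the valence attached to a thought.
        v = self.valence.get(thought, 0.0)
        v_next = self.valence.get(next_thought, 0.0)
        self.valence[thought] = v + self.lr * (reward + v_next - v)

    def query(self, context):
        # Query the generative model; higher-valence candidates win more often.
        candidates = list(self.generative_model.get(context, {"(no prediction)"}))
        weights = [max(0.01, 1.0 + self.valence.get(c, 0.0)) for c in candidates]
        return random.choices(candidates, weights=weights)[0]
```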

Of course, you don’t have to agree! There is never a neuroscience consensus. Some of my opinions about brain algorithms are close to neuroscience consensus, others much less so. But if I make some claim about brain algorithms that seems false, you’re welcome to question it, and I can explain why I believe it. :)

 

…Or separately, if you’re suggesting that the only way to learn about what an algorithm will do when you run it is to actually run it on an actual computer, then I strongly disagree. It’s perfectly possible to just write down pseudocode, think for a bit, and conclude non-obvious things about what that pseudocode would do if you were to run it. Smart people can reach consensus on those kinds of questions without ever running the code. It’s basically math—not so different from the fact that mathematicians are perfectly capable of reaching consensus about math claims without relying on computer-verified formal proofs as ground truth. Right?

As an example, “the locker problem” is basically describing an algorithm, and asking what happens when you run that algorithm. That question is readily solvable without running any code on a computer, and indeed it would be perfectly reasonable to find that problem on a math test where you don’t even have computer access. Does that help? Or sorry if I’m misunderstanding your point.
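
(In case the reference isn’t familiar: the locker problem says there are 100 closed lockers, and person k toggles every k-th locker, for k = 1 to 100. Here it is as a few lines of Python, purely for illustration; the point is that you can predict the output on paper, since locker i gets toggled once per divisor of i, and therefore ends up open iff i has an odd number of divisors, i.e. iff i is a perfect square.)

```python
# The locker problem as code: 100 closed lockers, person k toggles every
# k-th locker. You can predict the result without running this: locker i
# is toggled once per divisor of i, so it ends open iff i is a perfect square.
n = 100
open_lockers = [i for i in range(1, n + 1)
                if sum(1 for k in range(1, i + 1) if i % k == 0) % 2 == 1]
print(open_lockers)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```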

And you don't even aim for a good definition? For what do you aim then? … I think if I'm doing a priori armchair reasoning on socks, the way you and I do armchair reasoning here, I'm pretty much constrained to conceptual analysis. Which is the activity of finding necessary and sufficient conditions for a concept.

The goal of this series is to explain how certain observable facts about the physical universe arise from more basic principles of physics, neuroscience, algorithms, etc. See §1.6.

I’m not sure what you mean by “armchair reasoning”. When Einstein invented the theory of General Relativity, was he doing “armchair reasoning”? Well, yes in the sense that he was reasoning, and for all I know he was literally sitting in an armchair while doing it. :) But what he was doing was not “constrained to conceptual analysis”, right?

As a more specific example, one thing that happens a lot in this series is: I describe some algorithm, and then I talk about what happens when you run that algorithm. Those things that the algorithm winds up doing are often not immediately obvious just from looking at the algorithm pseudocode by itself. But they make sense once you spend some time thinking it through. This is the kind of activity that people frequently do in algorithms classes, and it overlaps with math, and I don’t think of it as being related to philosophy or “conceptual analysis” or “a priori armchair reasoning”.

In this case, the algorithm in question happens to be implemented by neurons and synapses in the human brain (I claim). And thus by understanding the algorithm and what it does when you run it, we wind up with new insights into human behavior and beliefs.

Does that help?

are you even disagreeing with me on this map example here

Yes I am disagreeing. If there’s a perfect map of London made by an astronomically-unlikely coincidence, and someone asks whether it’s a “representation” of London, then your answer is “definitely no” and my answer is “Maybe? I dunno. I don’t understand what you’re asking. Can you please taboo the word ‘representation’ and ask it again?” :-P

Your loop example at the top (Decide X is right, Go all in for X, start worrying that Y is actually better than X, switch to Y, repeat) is very close to how I would describe a very healthy process of iteration / pivoting.

I guess it depends on whether you’re pivoting based on things that you’ve learned, versus grass-is-greener.

For example, I’ve mentioned that AGI safety was the 5th long-term (i.e. multi-year) intense ambitious hobby of my life; it then turned into my job, and I’m in it to the end. All the switches made sense, given what I knew at the time. Glad I didn’t “get unstuck” from the “loop” when I was on my 3rd or 4th hobby. :)
