Steven Byrnes

I'm an AGI safety / AI alignment researcher in Boston with a particular focus on brain algorithms. Research Fellow at Astera. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, X/Twitter, Bluesky, Mastodon, Threads, GitHub, Wikipedia, Physics-StackExchange, LinkedIn

Sequences

Intuitive Self-Models
Valence
Intro to Brain-Like-AGI Safety

Comments

Huh, this is helpful, thanks, although I’m not quite sure what to make of it and how to move forward.

I do feel confused about how you’re using the term “equanimity”. I sorta have in mind a definition kinda like: neither very happy, nor very sad, nor very excited, nor very tired, etc. Google gives the example: “she accepted both the good and the bad with equanimity”. But if you’re saying “apply equanimity to positive sensations and it makes them better”, you’re evidently using the term “equanimity” in a different way than that. More specifically, I feel like when you say “apply equanimity to X”, you mean something vaguely like “do a specific tricky learned attention-control maneuver that has something to do with the sensory input of X”. That same maneuver could contribute to equanimity, if it’s applied to something like anxiety. But the maneuver itself is not what I would call “equanimity”. It’s upstream. Or sorry if I’m misunderstanding.

I also want to distinguish two aspects of an emotion. In one, “duration of an emotion” is kinda like “duration of wearing my green hat”. I don’t have to be thinking about it the whole time, but it’s a thing happening with my body, and if I go to look, I’ll see that it’s there. Another aspect is the involuntary attention. As long as it’s there, I can’t not think about it, unlike my green hat. I expect that even black-belt PNSE meditators are unable to instantly turn off anger / anxiety / etc. in the former sense. I think these things are brainstem reactions that can be gradually unwound but not instantly. I do expect that those meditators would be able to more instantly prevent the anger / anxiety / etc. from controlling their thought process. What do you think?

Also, just for context, do you think you’ve experienced PNSE? Thanks!

I don’t think any of the challenges you mentioned would be a blocker to aliens that have infinite compute and infinite time. “Is the data big-endian or little-endian?” Well, try it both ways and see which one is a better fit to observations. If neither seems to fit, then do a combinatorial listing of every one of the astronomical number of possible encoding schemes, and check them all! Spend a trillion years studying the plausibility of each possible encoding before moving on to the next one, just to make sure you don’t miss any subtlety. Why not? You can do all sorts of crazy things with infinite compute and infinite time.
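As a toy sketch of what “try it both ways and see which fits” could look like (the byte string and the plausibility score below are made-up placeholders, not anything from the actual discussion):

```python
# Decode the same unknown 4-byte field under both byte orders, then score
# which reading looks more plausible. The scoring rule is a stand-in for
# whatever fit-to-observations test the hypothetical aliens would actually use.
data = bytes([0x00, 0x00, 0x01, 0x2C])  # made-up "observed" bytes

candidates = {
    "big-endian": int.from_bytes(data, "big"),
    "little-endian": int.from_bytes(data, "little"),
}

def plausibility(value: int) -> float:
    # Placeholder heuristic: smaller values count as more plausible readings.
    return 1.0 / (1 + value)

best = max(candidates, key=lambda name: plausibility(candidates[name]))
print(candidates)           # {'big-endian': 300, 'little-endian': 738263040}
print("better fit:", best)  # better fit: big-endian
```

With infinite compute, the same loop just runs over every candidate encoding scheme instead of two.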

I don’t think this is too related to the OP, but in regard to your exchange with jbash:

I think there’s a perspective where “personal identity” is a strong intuition, but a misleading one—it doesn’t really (“veridically”) correspond to anything at all in the real world. Instead it’s a bundle of connotations, many of which are real and important. Maybe I care that my projects and human relationships continue, that my body survives, that the narrative of my life is a continuous linear storyline, that my cherished memories persist, whatever. All those things veridically correspond to things in the real world, but (in this perspective) there isn’t some core fact of the matter about “personal identity” beyond that bundle of connotations.

I think jbash is saying (within this perspective) that you can take the phrase “personal identity”, pick whatever connotations you care about, and define “personal identity” as that. And then your response (as I interpret it) is that no, you can’t do that, because there’s a core fact of the matter about personal identity, and that core fact of the matter is very very important, and it’s silly to define “personal identity” as pointing to anything else besides that core fact of the matter.

So I imagine jbash responding that “do I nonetheless continue living (in the sense of, say, anticipating the same kind of experiences)?” is a confused question, based on reifying misleading intuitions around “I”. It’s a bit like saying “in such-and-such a situation, will my ancestor spirits be happy or sad?”

I’m not really defending this perspective here, just trying to help explain it, hopefully.

If we apply the Scott Aaronson waterfall counterargument to your Alice-bot-and-Bob-bot scenario, I think it would say: The first step was running Alice-bot, to get the execution trace. During this step, the conscious experience of Alice-bot manifests (or whatever). Then the second step is to (let’s say) modify the Bob code such that it does the same execution but has different counterfactual properties. Then the third step is to run the Bob code and ask whether the experience of Alice-bot manifests again.

But there’s a more basic question. Forget about Bob. If I run the Alice-bot code twice, with the same execution trace, do I get twice as much Alice-experience stuff? Maybe you think the answer is “yeah duh”, but I’m not so sure. I think the question is confusing, possibly even meaningless. How do you measure how much Alice-experience has happened? The “thick wires” argument (I believe due to Nick Bostrom, see here, p189ff, or shorter version here) seems relevant. Maybe you’ll say that the thick-wires argument is just another reductio about computational functionalism, but I think we can come up with a closely-analogous “thick neurons” thought experiment that makes whatever theory of consciousness you subscribe to have an equally confusing property.

I don’t think Premise 2 is related to my comment. I think it’s possible to agree with premise 2 (“there is an objective fact-of-the-matter whether a conscious experience is occurring”), but also to say that there are cases where it is impossible-in-practice for aliens to figure out that fact-of-the-matter.

By analogy, I can write down a trillion-digit number N, and there will be an objective fact-of-the-matter about what is the prime factorization of N, but it might take more compute than fits in the observable universe to find out that fact-of-the-matter.
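A small worked version of that analogy (with a deliberately tiny N; the objective answer exists either way, but the search cost explodes as the number of digits grows):

```python
# Trial division: fine for a small N, hopeless for a trillion-digit one,
# even though the factorization is an objective fact in both cases.
def factorize(n: int) -> list[int]:
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

N = 104_729 * 1_299_709  # product of two primes, chosen just for the example
print(factorize(N))      # [104729, 1299709] -- the fact-of-the-matter
```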

This is kinda helpful but I also think people in your (1) group would agree with all three of: (A) the sequence of thoughts that you think directly corresponds to something about the evolving state of activity in your brain, (B) random noise has nonzero influence on the evolving state of activity in your brain, (C) random noise cannot be faithfully reproduced in a practical simulation.

And I think that they would not see anything self-contradictory about believing all of those things. (And I also don’t see anything self-contradictory about that, even granting your (1).)

Well, I guess this discussion should really be focused more on personal identity than consciousness (OP wrote: “Whether or not a simulation can have consciousness at all is a broader discussion I'm saving for later in the sequence, and is relevant to a weaker version of CF.”).

So in that regard: my mental image is that computational functionalists in your group (1) would also say things like (D) “If I start 5 executions of my brain algorithm, on 5 different computers, each with a different RNG seed, then they are all conscious (they are all exuding consciousness-stuff, or whatever), and they all have equal claim to being “me”, and of course they all will eventually start having different trains of thought. Over the months and years they might gradually diverge in beliefs, memories, goals, etc. Oh well, personal identity is a fuzzy thing anyway. Didn’t you read Parfit?”
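As a toy sketch of the divergence part of (D), with a made-up random walk standing in for the “brain algorithm”:

```python
import random

# Five runs of the "same algorithm" with different RNG seeds: identical code,
# but the trajectories drift apart as the injected noise accumulates.
def run(seed: int, steps: int = 10) -> float:
    rng = random.Random(seed)
    state = 0.0
    for _ in range(steps):
        state += rng.gauss(0, 1)  # noise nudges the state each step
    return state

print([round(run(seed), 2) for seed in range(5)])  # five diverged end states
```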

But I haven’t read as much of the literature as you, so maybe I’m putting words in people’s mouths.

FYI for future readers: the OP circles back to this question (what counts as a computation) more in a later post of this sequence, especially its appendix, and there’s some lively discussion happening in the comments section there.

You can’t be wrong about the claim “you are having a visual experience”.

Have you heard of Cotard's syndrome?

It’s interesting that you care about what the alien thinks. Normally people say that the most important property of consciousness is its subjectivity. Like, people tend to say things like “Is there something that it’s like to be that person, experiencing their own consciousness?”, rather than “Is there an externally-legible indication that there’s consciousness going on here?”.

Thus, I would say: the simulation contains a conscious entity, to the same extent that I am a conscious entity. Whether aliens can figure out that fact is irrelevant.

I do agree with the narrow point that a simulation of consciousness can be externally illegible, i.e. that you can manifest something that’s conscious to the same extent that I am, in a way where third parties will be unable to figure out whether you’ve done that or not. I think a cleaner example than the ones you mentioned is: a physics simulation that might or might not contain a conscious mind, running under homomorphic encryption with a 100000-bit key, and where all copies of the key have long ago been deleted.

Actually never mind. But for future reference I guess I’ll use the intercom if I want an old version labeled. Thanks for telling me how that works.  :)

(There’s a website / paper going around that cites a post I wrote way back in 2021, when I was young and stupid, so it had a bunch of mistakes. But after re-reading that post again this morning, I decided that the changes I needed to make weren’t that big, and I just went ahead and edited the post like normal, and added a changelog to the bottom. I’ve done this before. I’ll see if anyone complains. I don’t expect them to. E.g. that same website / paper cites a bunch of arxiv papers while omitting their version numbers, so they’re probably not too worried about that kind of stuff.)
