People on the internet love to make resolutions for November.
So, for the entire month of November, 41 people, myself included, set out to publish a post every day, as part of a writing residency called Inkhaven. On the final day, I couldn’t resist the temptation to pull off a prank and make sure no one appeared to have failed.
(Note that I want to maintain everything as plausibly deniable; everything in this post might or might not have happened.)
From Mahmoud, another resident:
Inkhaven is the name of the 30 day blogging retreat that I went on this month. The rules were: come meet your blogging heroes, post 500 words a day, online, by midnight or we kick you out.
I made it to the final day, but, on purpose, this last post will be fewer than 500 words. My reasons for this are kind of silly, but mostly I think it would be more fun if at least one person failed the program. I’m also writing this post as a stream of consciousness, unsure where it will go. Maybe I’ll come up with a better reason if I keep writing.
Every time I write the word “failed” something inside of me winces. Decades of consciousness and achievement seeking have trained me to avoid that word at all costs. I think many people come to this realisation.
I used to think that writing was about perfectly describing the ideas in your head, and ever so gently placing them on the surface of the world so others can see them, in their pristine, final form. In the first half of Inkhaven I learnt that you can write in another way too - where your ideas are shared half formed and open ended. But in the second half, I learnt that doing this makes you better at doing the first thing too. The only cost is that you need to give up on your previous idea of what failing looks like, and embrace something that looks eerily similar to it.
And if you do fail, so what? What are they gonna do, kick you out?
When I heard that Mahmoud planned to intentionally fail, I knew I had to act.
You see, I didn’t want to die.
I have a friend who was about to go pilot a plane with some friends, but decided to flip a quantum random coin: if it fell heads four times in a row, he wouldn’t do it.
The coin then fell heads four times. This updated him, in a fundamentally unsharable way, with an odds ratio of 16:1, that he would’ve died if he went to pilot that plane.
There were 41 of us at Inkhaven, publishing a post every day of November.
By the end of November, it was clear that something was wrong.
Even if you have a 99% chance of publishing a post on a specific day if you really want to, the probability that 41 people would do that successfully for 30 days is 0.99^1230, or about 0.000428%.
That outcome was really quite unlikely.
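The arithmetic is easy to check. Here is a minimal sketch, assuming every person-day is an independent 99% success (a dubious modelling assumption, and the function name is made up for illustration):

```python
def all_succeed_probability(p_daily: float, residents: int, days: int) -> float:
    """Chance that every resident publishes on every day.

    Assumes each person-day is an independent trial with the same
    success probability, which is an illustrative simplification.
    """
    person_days = residents * days  # 41 * 30 = 1230 chances to fail
    return p_daily ** person_days


prob = all_succeed_probability(0.99, residents=41, days=30)
print(f"{prob:.8%}")  # a few ten-thousandths of a percent
```

Nudging `p_daily` up to 0.999 gives a far less alarming number, so the conclusion leans heavily on the 99% guess.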
So, by that time I was pretty convinced that Ben Pace must have a nuclear bomb that he would set off if anyone failed Inkhaven (which is how he could be certain no one would fail in the remaining versions of Berkeley).
(When asked for a comment for this piece, Ben Pace denied owning a nuclear bomb.)
But now that I’m out of the Bay and safely out of reach of a nuke, I can finally make a confession.
At the dawn of November 30, I decided to do the funniest thing possible, and made Mahmoud, the author of the “How I failed Inkhaven” post, fail to fail Inkhaven.
It would’ve been quite easy, really: just publish a post and fill out the form with Mahmoud’s name on it, and mark the post as hidden so a link to it isn’t actually displayed from the Inkhaven dashboard.
At that point, the feeling of the community was quite strong. Everyone was cheering Wordpress Dot Com[1], and people really wanted everyone to succeed.
To make sure people were on board and expected it to be funny rather than mean to Mahmoud and to the Inkhaven organizers, I consulted a few fellow residents and a member of the Inkhaven team[2], who all found the idea hilarious and fully supported it, and around 9pm, I got to work. I skimmed through a few of Mahmoud’s posts and noticed that he occasionally starts his posts in a characteristic way, and that a few of his ideas overlap with what I could write about. So by ~9:20pm, not yet having written my own November 30 post, I thought of a comment that I made on an ACX post pointing at an idea, and decided to expand on it. 20 minutes later, I had this:
Hi! I’ve written a few posts on computers and on consciousness. This post is about whether LLMs can, in principle, be conscious.
in order to be conscious, AIs would need to feed back high-level representations into the simple circuits that generate them. LLMs/transformers - the near-hegemonic AI architecture behind leading AIs like GPT, Claude, and Gemini - don’t do this. They are purely feedforward processors, even though they sort of “simulate” feedback when they view their token output stream.
But that’s not really true. You can unwrap recurrent circuits into sequential: say, you have a circuit that computes consciousness and occupies three layers, and information at the output is fed back into the input. You can just copy that three-layer circuit to layers 1-3, 4-6, 7-9, etc. of a feed-forward neural network. The same computation happens as a result, despite no recurrence in the architecture.
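A toy sketch of that unrolling, assuming a made-up three-layer block with arbitrary weights and a fixed number of feedback steps (it only illustrates the copy-the-layers trick and says nothing about what such a circuit would compute):

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def linear_relu(weights: List[List[float]]) -> Layer:
    """Build one layer: a matrix multiply followed by ReLU."""
    def layer(v: Vector) -> Vector:
        return [max(0.0, sum(w * x for w, x in zip(row, v))) for row in weights]
    return layer


# Hypothetical three-layer circuit; the weights are arbitrary.
block: List[Layer] = [
    linear_relu([[0.5, 0.25], [-0.5, 1.0]]),
    linear_relu([[1.0, -0.75], [0.25, 0.5]]),
    linear_relu([[0.75, 0.5], [0.5, -0.25]]),
]


def run_recurrent(block: List[Layer], x: Vector, steps: int) -> Vector:
    """Feed the block's output back into its input `steps` times."""
    state = x
    for _ in range(steps):
        for layer in block:
            state = layer(state)
    return state


def unroll(block: List[Layer], steps: int) -> List[Layer]:
    """Copy the block into layers 1-3, 4-6, 7-9, ... of a feed-forward stack."""
    return [layer for _ in range(steps) for layer in block]


def run_feedforward(layers: List[Layer], x: Vector) -> Vector:
    """Single pass through the stack, no feedback anywhere."""
    for layer in layers:
        x = layer(x)
    return x


x0 = [1.0, 2.0]
recurrent_out = run_recurrent(block, x0, steps=3)
unrolled_out = run_feedforward(unroll(block, steps=3), x0)
print(recurrent_out == unrolled_out)
```

The catch is that the unrolled stack only covers as many feedback steps as it has copies of the block, which is why the number of layers comes up as a question.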
An even stronger claim is that to the extent any computer program can contain consciousness, an LLM can do it, too, due to the universal approximation theorem.
Furthermore, actual LLMs are trained to do whatever leads to correctly predicting the outputs of the text on the internet; and much of that text written by humans is a result of them being conscious: as someone conscious, you can talk about your experience in a way that closely matches your actual feeling of experiencing, which is a result of the circuits in your brain responsible for consciousness having not just inputs but also outputs. And since a very good way of predicting the text might be to run, on some level, whatever leads to them, it seems very clear that LLMs can, in principle, learn to contain a lot of tiny conscious people that wonder about their own experiences and write text about those.
Wouldn’t the number of layers be a problem? Well, probably not really: the depth of recursion or reflection required for the minimal consciousness is unlikely to be much higher than the number of layers in LLMs, and is actually likely to be far lower.
If you’re still not convinced, LLMs don’t just do one forward pass; they can pick tokens that would reflect their current state, write them down, and after outputting all of their current state read the new tokens at the beginning and continue from where they left off.
The way the indirect object identification circuit works is that a very few layers are able to write certain contents in a certain direction and, paying attention to new words, remove them from that direction as they appear; if there’s something left at the end, they can remove it from that sort of cache.
There’s probably a lot of slow text about reflecting on conscious experience on the internet; and so the same way, an LLM could start some kind of reflection, and store some of the information that it wants to pick up for further reflection in the words that it’s outputting as it reflects.
So: there’s nothing preventing LLMs from being conscious.
I then edited my original comment, created a new Substack account, called m[short for Mahmoud]secondaccount, and published the post as a note.
It remained only to fill out the Airtable form.
I looked up an old daily submissions link that didn’t (unlike newer personalized links) have the “Who are you” field prefilled, and decided to sow more chaos by setting the title of the piece to “AI consciousness in LLMs is possible (PRETEND THIS DOESN’T EXIST EXCEPT I DON’T ACTUALLY FAIL)” and hoping the organizers wouldn’t reach out to Mahmoud to check, or wouldn’t read too much into him denying everything.
Happy, I double-checked the post and the form with others at Inkhaven, submitted it, and, giggling, went on to write my own final post of Inkhaven. (I published it 9 minutes before the midnight deadline.)
A while later, I went to sleep, and then woke up to lots of happy people around me (some of them smiling about a shared secret) and one very confused resident.
I thought it would be good for me to do something a little bit contrarian and silly. Especially when the stakes were so low. I also wrote some reflections (in fewer than 500 words) about how embracing failure was a part of the artistic process. I still stand by the point. I guess my lesson for today is that you don’t get to choose the terms on which you fail.
When I went to bed last night I was feeling a bit conflicted. It would have been nice for the group and for the programme leads if everyone had made it through to the end. It probably messes up some pre-drafted retrospective emails if they have to write “Everyone* made it to the end” (*footnote: except for one person who deliberately failed on the last day).
There were also actual consequences. You get kicked out of the community slack channel, you no longer get invited to alumni events / reunions. I was aware of these and told the organisers I didn’t want them to feel conflicted about enforcing them, I had made my choice, I was happy overall. The choice would not have been as meaningful if there hadn’t been some cost to pay.
I was a bit sad about damaging the vibes for the broader group. On principle I wasn’t going to let this get in the way of a good blog post idea, though the thought still hurt a bit. When I told my partner I was unsure how to feel about having pulled this silly stunt she asked me something along the lines of “are you proud of your choice?” My answer was yes.
A little bit after midnight I looked at the dashboard.
It’s interesting that he could’ve discovered the additional hidden post! If he looked at the published posts, he would’ve seen his own short one, and then mine, titled “Hidden Post” on the dashboard.
A greyed-out diamond under my name, after a streak of 29 solid ones. A permanent memorial of my crime. This seemed kind, I was half expecting them to cross out my name, or maybe remove my entire profile from their site.
It’s also interesting that he wouldn’t have noticed anything at all, if the interface displayed the first post published in a day as a diamond, not the last one. But now he did notice that something changed!
But wait... other people had greyed out diamonds too. Does this mean they had failed the program on previous days?
No - this was just the UI for private posts. Strange that they didn’t make it different for people who hadn’t submitted at all.
So close!!!
(They did make it different for people who didn’t submit at all; those were displayed as a small dot. In any case, Mahmoud did submit! The submission was simply under the 500 words.)
A fellow resident contacted me to point this out to me too. Maybe I had messed the system up by submitting my illegally short post into their submission form?
That must be it. Unless... nah. I went to sleep untroubled by any other possibilities.
…or this other resident either knew something or decided that *you* were pranking everyone by secretly submitting another post. (That ended up being the consensus conclusion!
Around noon, Amanda was discussing plans to resolve the market on whether everyone succeeded. That was a bit scary, as I didn’t want to cause market misresolution, so I tried to clarify to Amanda that people should really be a bit uncertain, but then it turned out the market would resolve the same if 0 to 1 people failed, so that was fine. She resolved the market and wrote the following:
The Lightcone staff just confirmed to me that ALL 41 residents finished. Mahmoud tried to fail out on purpose as a joke, but he posted another post that was >500 words later in the evening, before midnight on 11/30.
It’s a bit unfortunate that the Lightcone staff didn’t pretend the post didn’t exist, as they were asked to; this would’ve been so much funnier! Oh well, I definitely had that one coming.)
Anyway:
Here is a list of other possibilities:
1. During a nap earlier in the evening, I had sleep-walked over to my laptop and written 223 additional words, posted them to some secret corner of the internet and then sent them to the organisers to make up my posting deficit.
2. A cosmic ray hurtling through space, had, at just the right moment, struck the Airtable servers which host the backend for inkhaven.blog. The ray flipped just the right bit in just the right register to permanently update my wordcount for the 30th of November to be over five hundred words.
3. In the dead of night, a blog-goblin, post-er-geist, writing-fairy, or other mysterious creature had logged in to the submission form and sent a post under my name.
4. In the late 1990s researchers at the Cybernetic Culture Research Unit at the university of Warwick posited a series of esoteric cybernetic principles by which superstitions could become real via social feedback loops very similar to the ones I have been blogging about under the title of Dynamic Nominalism. These eventually came to be known as Hyperstitions. It is possible that by merely thinking about these kinds of ideas my blog has become subject to such forces. If you believe in the explanatory power of these things, then one explanation is that enough people were sure that nobody would fail Inkhaven, that this collective certainty overwrote my agency as an individual. My choice to fail was an illusion.
5. A closely related theory to number 4. There was a prediction market running for how many people would make it to the end of Inkhaven. By the final day the market was so certain that greater than 40 residents would make it to the end that even the mere possibility that one resident would fail was unthinkable. The market is always right. And capital, even play-money capital, has powers which can overwrite the will of mere bloggers like me.
6. Due to the indeterminacy of implementation there exists some degenerate mapping of the rules of common sense and the stated rules of Inkhaven by which you can interpret the series of events which unfolded over the last 30 days as me having actually posted 500 words every day after all.
7. I submitted another 500 word post just before midnight and am lying about having no recollection of it here.
Stranger things have happened I suppose.
When I woke up this morning, the organisers confirmed that indeed, everyone had submitted at least 500 words.
This list of possibilities is incredibly funny, even as you know yourself to be the mysterious creature.
I used to write a lot and not share it with anyone else. The writing was nearly all in the form of journal entries which I kept in paper notebooks. If I did put it online, it was just to make sure it was backed up somewhere.
When you write in this private way there is an especially comforting sense in which the writing remains “yours”. Not only are you the author, you are also the only intended reader. You own the whole pipeline from creation to consumption of the work. When you write for others, even if you write the same things you would have written only for yourself, you necessarily give up some of this control.
This is because once your writing is out there, you no longer own it in the same way. Others are free to interpret it as they wish, and you need to make peace with the possibility that you may be challenged or misunderstood. This is especially true when you become part of a community of other online writers, who are reading and commenting on each other’s work. I was very grateful to get a taste of this over the last month.
I’m pretty sure that whatever words were written which kept me in this program were not words which I wrote. However, authorship is a strange thing. Two of my posts this month already were collaborations with other authors, and each of those collaborations took quite different forms.
Yes, authorship is a strange thing, and I have decided to surrender to the idea that my writing might take on a life of its own. So I guess maybe there is some sense in which I did write those words. I wonder what they said.
This was amazing! Anyway, when asked by the organizers whether the post is by him and whether he submitted it, Mahmoud replied that he didn’t submit the form, but referred to the above for his views on authorship (which is incredibly based).
The probability of everyone succeeding as much as they did was the full 0.00043217% (only 1,229 of the 1,230 person-days had to go right), not the mere 0.00042785%.
(Or maybe the timelines where I didn’t submit that post were nuked.)
So: I can recommend signing up for the second iteration of Inkhaven, but before you do, make sure that you are okay with the less productive versions of you dying in radioactive fire.
[1] At first, we attempted to cheer Wordpress, without the Dot Com part, but were quickly stopped by the organizers, who explained to us the difference between Wordpress, an open-source blogging platform, and Wordpress Dot Com, a blogging platform for hosting Wordpress that sponsored the organizers of Inkhaven. So every day, during the lunch-time announcements, the crowd would loudly cheer Wordpress Dot Com. Notably, almost no one actually used Wordpress, but I think all of us have warm feelings towards the Wordpress Dot Com-branded blankets that we received.
[2] I really didn’t want to disappoint Ben Pace with the news; of course, not because I would be sad if he was sad, but because who knows what he’s going to do with the nuke. (Ben denies owning the nuke.)
(Content warnings: dubious math, quantum immortality, nuclear war)
Normal people make New Year’s resolutions.
On the way to the airport, I saw Mahmoud’s piece on How I failed to fail Inkhaven, and started dying from laughter.