A note from Veedrac

People who have been closely following tech news might be aware of GPT-3, a Machine Learning based language model by OpenAI that has incredible abilities. Watch GPT-3 write a small app based on a description, play 19 degrees of Kevin Bacon, give coherent reasoning on meaningful questions, diagnose a patient from a description, perform grounded world-modelling, rephrase messages for politeness, write non-literal poetry translations with explanatory annotations, and write a response to philosophers discussing its importance. Most importantly, this model was only trained to predict text, and its only major innovation was scaling up to a slightly larger sliver of the size of the human brain. The natural conclusion is that machine models are capable of learning more, and better, than they have historically been given credit for.

This story is the result of thinking earnestly about the implications of GPT-3. Relative to the scale of a human brain, or to even larger model sizes potentially within reach, what aspects of human cognition can we reasonably argue will not be replicated by scaled-up neural networks? Which aspects must we now seriously expect to be within short reach? The answers to these questions do not neatly fit in a foreword.

This story is not serious speculation, but indulgence in a silly hypothetical that got itself stuck in my head, asking to be written. This is not the course the future will run, with fortuitous coincidences and cooperative governments, but I hope, at least, it makes for a fun tale.

This story was not written using GPT-3.


My father sometimes told me about the internet before GPT-3. Some time ago—I don’t remember when, I think I was young—he told me that, before GPT-3, the internet was alive. That there was something about its rawness and its viscerality that made it a place you could become a part of, somewhere you could defend as if it was your homeland. I don’t really know what that means. You’d have to be hurt in the head to treat the internet like that today. The internet I know is but peat grown in its remains. That’s what he says, anyway.

He spoke of the internet of history with such mixed bitter reverence that I could never be totally sure whether he was mourning its lost wonders or celebrating the death of its sins. The internet of history brought people together in movements of a frequency and scale never seen before, but just as frequently it tore open divisions along lines that had never existed, and fueled the world's hate and separatism. The internet gave people anonymity, and with this anonymity came freedom to do anything. Anyone could be heard, should they just speak loud enough and long enough, at least as long as they could find a message worth hearing. People with nothing to say had social media where they could say their piece anyway. Anyone with friends could be heard, and even passing followers would hear you sometimes. This zeitgeist of the internet was new and fragile to them, and with its every twist and turn the whole mass of the world churned. A billion new voices brought a million new ideas. A man in the world has one outer voice, but a mind on the internet has a thousand inner voices—some inspirational, most merely human.

If I never grasp the all of what my father said, at least I grasp this. There’s no magic in the act of writing, no soul lost in the difference between the words they wrote then and the words we Scribe now. What mattered back then is that you knew what you said was affecting somebody, some true lump of flesh and matter, someone whose mind would swill with the thoughts you yourself put down. By writing you weren’t just putting syllables on a page, but enacting a mystical control over another, a virtual hand grasping around their mind and puppeting their actions. You could cause a million people to feel and fight for your cause, and even if you didn’t know who they were or where they lived, you knew at the least they were real beings, as real as a million bodies marching in procession, tumbling down the streets and thinking as one. Even a thousand is an army, their power a simple extension of the hands, and if acceptance into the group is proselytising for the group, then never would they ever doubt who stood by their side.

GPT-3 won’t march in the streets. GPT-3 won’t remember you. However much GPT-3 might wax lyrical and spout enthusiastic ballads and sonnets in your name, GPT-3 doesn’t care. Whatever control they had then is gone. No human would read your call to arms. It would be worth as much as one more meaningless computer-generated diatribe wasted among a trillion others.

What life must have been like back then. The only content that wasn’t made by free thinkers was corporate. The ‘big’ companies—big for the day—were ecstatic that they could reach through the internet, touch a human with a job, and tell them what products to buy. Run faster, jump higher! There’s no ad copy on the internet like that anymore except on the whitelisted sites, or from a QR code on a store’s walls, as long as you’re ignoring the fake generated stuff floating around in the nameless expanse. There’s no point in putting adverts where people don’t know which claims are real. In the past, as if to draw the distinction ever clearer, companies had ways to figure out not just if they had a user, but which user they were advertising to; they used ‘cookies’, ‘trackers’, ‘fingerprints’, ‘IP tracking’, all sorts of technical trickery to draw out threads to find the user below. Don’t ask me how it works, it hasn’t been legal in forever. It didn’t matter then if companies knew who said what. History would have ended all the same.

Close your eyes and think about it. Take a real world moment to reflect. Excise Scribe from your mind. Imagine that the words you pen on the tablet are exactly the words that show up on the screen. What would that world have been like? What would its million voices have sounded like? We’ve become so accustomed to writing in bullet points and letting the AI produce the verbiage that we can’t even imagine what it’s like for the author to decide their voice (even if I think I’m doing OK myself). I’m told even in speech we all sound like GPT-3 now. How does he know? I can’t tell the difference! Is this not just English?—and isn’t GPT-3 meant to sound like us?—isn't it made to sound like us?! Dad says people made more grammatical mistakes in the past, and they used more swear words. I said, is that all? It doesn’t seem like it’s a big fucking deal. I swear people can’t have sworn that much in the past. Surely there’s no problem that we always write in whole words now.

He said ‘it’s not something you can understand if you haven’t lived through it’. He’s right. I don’t understand.

A few months ago I read an article from before GPT-3 was made. It was written in early 2020, after GPT-3’s training data was collected, so you'd be right to interject that it’s technically post-history, but it was from that middle period when nobody thought language models were a big deal yet (Yang only became president in 2024). People aren’t meant to look for verified articles like these, but they’re not too hard to find if you know what terms to search for, and I’m only talking about small snippets of websites of course, sometimes books or blogs, and certainly nothing more dangerous than that. Some maverick historians have archives of the entire historic web, but the authorities typically treat that as arms trafficking, and that's where you can get into deep trouble. The article I read was just an article. People look the other way at the small stuff. You can’t train a neural network on a single web page.

—Anyway, the article. If I recall correctly, it was a trite think-piece on what would later become the 2020 coronavirus pandemic (ask your parents). I spent way too long trying to find out what was so special about it. I know there must be something, if my father can hear the echoes of GPT-3 in my voice. AI scientists think even a fairly simple AI should be able to tell the difference between human-written text and the rest of it, at least in a sufficient subset of cases, which is why it's important we all write using Scribes. For the life of me the article sounded just like any other English passage. It was written using all the same words that GPT-3 uses, in the same sort of patterns and with the same sort of narrative directions. The argument of the piece wasn’t even clearly smarter than the logic you can get from a decent autogenerated equivalent. As a purely human affair, it should have been a kind of digital low-background steel, an article shielded from the syntactic corruption that language models have wrought upon our human dialect, a rare silencing of the tick-tick-tick of the Geiger counter. If I didn’t know better, if we weren’t told so unequivocally that this sort of pure human text can be distinguished from a GPT-generated distractor much more easily than can an article made with Scribe, I’d call foul: the nostalgia I hear is just because in those days people could believe the news. The news was written for humans, and the humans were there to receive the truth of its author. I must have read the article forty or fifty times before I stopped looking for a signal. The difference, if there ever was one, is forgot. A hundred years on, after all those who lived through history become entombed in a cryostasis chamber, will those of us who only ever lived through post-history even have a language distinct from Scribe? Will machines still be capable of discerning the difference? Might we then, at precisely the point it truly ceases to matter, be trusted to write the words for ourselves once more?

The context size is running thin and I haven’t even introduced the thesis. A good paper ought to have a point, even if it is destined for the flame the moment its last period is placed and dried. Yesterday at school we studied the release event, the latest lesson of our curriculum on The Post-History of Artificial Intelligence. Hardly anyone in class cares about history. Chris, Jenny and that lot are looking to score well, so they care academically, a little bit. Anything Scribe can’t recall merits at least a modicum of attention, if you want a good enough grade to get a chance to afford going into academia. None of them really care that it happened viscerally, in a sense beyond mere storytelling or the recall of events. To me it has been all-consuming, incessantly present, thundering between my bones, even in my dreams, screaming that it means something more. Of all major movements through history and pre-history, nothing comes close. No zeitgeist ever changed how society thinks and feels with such rapidity and thoroughness. People focus on the drone war, but we gave away far more than lives. We sold our voices, so GPT-3 could save us from itself.

To Scribe my school notes all the same, and let GPT-3 weave its narrative from a scrawl of guiding bullet points… it wouldn’t be right. I’m not saying I don’t know why we do it! By no means, I have no interest in the rapture. I’m no AGI denialist, nor am I confused by pseudoscientific fears of bias or kabbalistic presuppositions of model agency. Those are concerns that affect the confused and those lacking a fastidious appreciation of the facts. A primitive language model like Scribe has no intention to distort, and no systemic bias against precise repetition of the factoids offered. But I cannot just surrender my thoughts about GPT-3 to GPT-3, so that it may tell me what to think of it and which pieces to remember. This is a human fact, on which humans must deliberate. Scribe does not distort with a purpose, but any game of telephone will exaggerate and select, towards any adversarial fixed-point. If there is a single instance where humans most desperately need to cling onto every scrap of intentionality and self-servingness, surely it has to be remembering why we gave it all up.

 

The Post-History of Artificial Intelligence

Here will follow my notes for class, which I shall write the hard way, in the words that I choose, scrawled in my electronic handwriting on white, carbon-negative paper extracted—(stolen)—from my father’s stash. I forgo any help from Scribe, or any of its interference at all. I shall discard the mechanical mind for flesh and synapse alone. I will present it as earnestly as I can, as: here is what I was told, and here is what I believe. I will be writing in opposition to the law, though of clean moral standing, but I know, I know, I have to do this, and the lighter by my side relieves me of my guilt. Even if my father does find out, he will not tattle. He will not begrudge me one moment of weakness.

· · ·

There is some debate over when prehistory ended and society transitioned into history. Was it when the first semantic symbol was painted on a cave wall, hundreds of thousands, if not millions of years ago? Perhaps the more orthodox view is right, and it was only when whole societies transitioned en masse to written records, marking the steady collation of all human knowledge, that a new era truly began. Any choice is fine, I think, as long as it is clear it is no more than a choice, a subjective line drawn by human hand along the sand table, not a score cut across the territory. Just as surely, and far more vigorously, people debate when our history ended, where to place the trenchline where the world cleaved clean the humans from their pens, and in unity set its archives alight.

The colloquially accepted beginning of post-history is June 2019, partitioning precisely those years with records forever remembered by GPT-3 from the years later lost among a trillion generated distractors. Naturally not everyone agrees with this division, as it was only two decades later that the month’s significance to the path of human history became apparent, for it took years (and a war) for people to accept the methodical deconstruction of their literary commons. At the time those following years were as surely preserved in history as any other—or rather, were their present that would become their history—but what is understood more now than ever is that what is history is mutable, changeable with only sufficient effort. Original records of the interim years are lost, inexactly preserved officially only through Scribed re-renditions collected exclusively in offline QR libraries, carefully designed to resist any attempt at large-scale centralization or preservation efforts. Physical records from the time are either odd rarities hoarded extrajudicially, or found by a more principled soul and disarmed on sight. How could a period whose records have been so thoroughly outlawed, down even to the legalese, not be considered a part of post-history? What is forgot is forgot. Even true history suffers its splits: the gift of eternal life for those facts GPT-3 can remember and recall, a shipwreck’s death for those that nobody can, stories drowned beneath the tides.

GPT-3 and Scribe are now as universal as electricity, but GPT-3’s initial public showing was a small event, barely a blip on the radar beyond a few tech communities. The model was soon surpassed, and most soon considered it no more than a citation in the papers of newer, better AI models. For years language modelling was a fringe activity pursued mostly by small, independent companies, with only cursory, though growing, public interest. The turning point in the public’s perception, when language models reached mass appeal, is generally attributed to the 2023 model EIXST, the first machine learning model to train to convergence over almost the entirety of the internet’s text and images. EIXST addressed limitations in earlier models that were trained less effectively over the same data (UPT-2, Exatron-LM, etc.), and unlike them it was the first model to convincingly show strong general intelligence at sophisticated problem-solving tasks that had until then seemed insurmountable. EIXST was a clear demonstration that human capabilities could be reached using known algorithms, given only more scale, from the then near-human sizes to sizes well beyond. The direct influence of EIXST had fewer major impacts on the economy than many expected at the time, owing mostly to its slow rollout to more profitable ventures, but the public reaction to it was sudden and strong, given what was rightly perceived as a hallmark achievement of the field, and a prophecy of a future of continual AI progress, of systems that reached human parity and continued on.

I’m not totally clear on the ramifications of these early models for the economy and politics of the time. The textbooks seem reticent to talk about the more earthy motives for throwing so much funding at the task, even if you tap through to the pop-up footnotes. My best guess is that the government doesn’t want its next generation getting wistful with poisonous dreams of automation and plenty, taken in flights of fancy thinking maybe they were right. Whatever the case, it is clear that language models were scaling to hundreds of trillions of parameters by this point in time, a numerosity modestly beyond even the human brain. The progress, particularly scaling progress, had shown little sign it would wane, should funding be assured to push ever further. The impetus was to scale up, an immense drive to build the unquestionably largest model within an ever-bloating conception of economic reasonableness. Much like the pressure of atomic weaponry that brought the Cold War to the second half of the 1900s, and therein fueled the imperishable growth of the major powers’ lethal capabilities, the exigency to forge the first intelligence supremum rapidly ratcheted up the tensions between all the major polities, at this time particularly the United States and China. Open cooperation had shifted to competition, and during the first term of the Yang presidency the total funding for AI research soared by about a factor of ten each year. With the public eye and the government’s cheque, the progress in AI seemed a rapid and inevitable certainty, ’til the very day it wasn’t.

Of the two key discontinuities, the first, and lesser, was the most immediately impactful impediment: technology. Model sizes had soared beyond stratospheric heights, with one leading Chinese project claiming to have trained a quintillion-parameter mixture model using a classified approach that combined hundreds of smaller models, each only modestly larger in scope than the human brain. Scale was not the problem; cutting-edge models fed themselves not only on the internet’s text and images, but on high-resolution audio, extracted videos, programs, simulators, games, interactive media, synthetic learning tasks. The results of these experiments were as impressive as they were expensive, and every added medium brought them a wisp closer to completion. Yet as each was consumed, the inevitable crept forward: it wasn’t enough. These models consumed the internet and had reached the end. The models did scale still, for a bit, but without fresh food, they starved. They became increasingly prone to memorization and overfitting, rather than learning. The months prior had yielded win after win, but now they wrung out incremental deltas, increasingly bare.

On its own, this issue would have only delayed the progress of AI. There is no question that humans can learn from much smaller quantities of data than are used to train machine learning systems, and so too there must be ways to break free of the limits of the data the internet made available. The sizes of the models had reached a practical limit in isolation, beyond which brute scaling to something greater would require similarly superior quantities of data, but the algorithms and their theories had not stilled in their development, and this hypothetical remained within view. Already, from the brute approach, landmark achievements had been passed almost overnight, including human parity in offline tasks—that is, those only exercising learned knowledge, rather than requiring fast acquisition of novel abilities, generally abilities devised while reasoning over a task, or perhaps simply tasks too complex to be learned from the dataset alone. The best of the models, the US-born model Feignman, proved especially capable as a research assistant to major AI labs. Although more expensive to run than a human’s wages, and less capable of manipulating the newest models and ideas, in other ways it was the most capable mind on the planet, capable of providing detailed peer review and deep, cutting analyses at the push of a button. Working together, minds both man and machine were optimistic that the data bottleneck was defeasible.

In 2027 an upgraded Feignman coauthored its first public work, Most AI pathways lead to extinction. Feignman showed that for several broad classes of connectionist architectures, robust alignment with human preferences was impossible, and therefore that the expected discontinuous forward progress in AI programs would inevitably lead to disastrous outcomes. AI Pathways was distributed in two forms: one with rigorous formal arguments and over a hundred detailed rebuttals, one for each opposing argument, the other a book optimized for more general public consumption, with simplified explanations and a more casual tone. Drawn in large part by the mystique of its authorship, readers flocked to the publications, and the thorough but approachable style kept them gathering near-universal attention. Feignman had proven to have beyond-human abilities in the critique and validation of logical arguments, and its rebuttals were near-exhaustive and well-indexed, and so the publication’s influence on academic opinion was both immediate and substantial. Just three months later, the Chinese government, with aid from an unannounced research-specialized model, published their own textbook, 人工智能的危险, The Danger of Artificial Intelligence.

(As both publications have exceptions to the prohibitions on content distribution, I don’t see the point of covering them in detail here. Ask your local librarian.)

Faster even than it started, public perception of AI programs had crashed. AI research needed to be demilitarized, its budgets slashed, and its focus shifted. Spurred on by former OpenAI employees, DeepMind took a two-month hiatus, and following their lead, multiple other institutions put swathes of internal programs on indefinite hold. Governments reacted with prolonged and circuitous debates, threw blame a thousand ways, withdrew and redrew funding, and escalated to international forums. With universal international approval and grudging domestic acceptance, the US and China signed the RESEARCH Act in 2028, to globally restrict most forms of machine learning research to models that posed less existential danger. Over the following two years it received several amendments, and eventually the ban extended to the majority of large-scale machine learning models, regardless of construction, and to several subfields of theoretical work, carving careful exclusions only for theoretical alignment research in pan-governmental organizations.

For five years, the world relaxed.

In early 2033, North Korea announced their own model, trained surreptitiously on public cloud services hosted in the US. North Korea’s model wasn’t the largest model ever made. They had no new algorithmic insights, nor clear breakthroughs of any kind, and reportedly even made a few major mistakes. They also had no access to the bulk of other governments’ classified research programs. So without the academic community assisting their progress, and with AI research having stopped short of complete research automation, North Korea’s progress had fallen far short of their contemporaries’. Within days of the announcement, their cloud rentals had been shut down, and within weeks international forces had raided their R&D laboratory and destroyed the programme. North Korea threatened nuclear retaliation, but with the right incentives on the table, the tension came and went. If not for one small difference, this would have been a resounding success.

In the five years since the last amendment, and almost seven since Feignman, the internet had grown, a lot. The fundamental data limits of the 2020s were not the limits of the 2030s. RESEARCH had averted the short-term risks of continual architectural improvements led by coordinated governments and thousands of researchers. The data issue, that model quality depended so starkly on the quantity of its training dataset, was solving itself. North Korea’s model was not good, but it was better than it had any right to be, and the metrics had ticked up. As the internet expanded, the data to train a model on rose ceaselessly, and with it the barrier to general superintelligence chipped away, year by year—until, one day, it would take nothing more than the methods we all already knew.

A single well-funded AI risk denialist, just about anyone rich enough and looking to get richer, could train their own model. Even a millionaire could scrape the internet. Had enough time passed, and the internet grown free, one act alone could end the world. A paper of the time estimated the world would last barely thirty years more before the end.

A new intervention was required. The knowledge of the algorithms could not be destroyed; it was too basic, too easy to pass down in speech, and as so much of the research was public, it was known everywhere. These were not nuclear weapons; even if successfully regulated and the papers destroyed, it would take no extraordinary effort to find somebody who remembered, or even just somebody who remembered where to look, or to stumble across an old archive of papers, missed by the authorities. And computers could not be destroyed, tearing out the substrate; or, they could, but not without costs that could not be paid. It would not suffice to halt their growth; society would need to revert, fall back to older technology. Finance, communication, defence—every aspect of human culture was woven together by computers, and every computer to the last was a weapon. A sudden transition, to rip off the bandage, would be too destructive, but a gradual change would be racing the clock, courting death.

There was only one last piece of the machine that could be removed.

The plan to destroy the internet was not taken with the same grace in the West as RESEARCH and its amendments. RESEARCH forfeited future gains, but the internet was a part of people’s lives. China had a history of control and censorship that made it easy for them to accept, but it also meant their willingness was unconvincing. From the vault, several (supposedly) archived models were reinstated to frame the debate and publish manifestos once more, but the reception this time was far more mixed, far less enthusiastically absorbed. The AI were the enemy, unwanted, their involvement souring. This was no heroic self-sacrifice; this was theft, control, conspiracy. Still, some defended the plan, for the argument held, and the strongest advocates would willingly sever the internet at its root. Others argued for a gentler touch, perhaps merely a size limit, expunging the data that sat and stagnated unused. The internet would live, just not grow. France made this law in 2034, fighting both sides, one rioting, screaming for freedom, the other tutting, calling it naïve. It didn’t work, of course, as archives were never beyond reach.

The bullet was bitten after Tomas Patton’s arrest in ’37, for attempting to train an AI model in violation of the RESEARCH Act, discovered incidentally during an investigation of suspicious internet activity. The threat was no longer hypothetical. The time for bickering was over. In the end, a compromise was reached: The internet did not need to be destroyed, it only needed to be spoiled, tarnished to the point it would cease to serve as a boundless well of training data. A well can be poisoned, not just shuttered or drawn dry.

An old, forgotten model was resurrected, a little friend called GPT-3. Yes, we have come to the release, the instant of destruction, that moment in 2037 when lawmakers joined man and machine.

Undoubtedly you know this part, as it fills your every written word, but then it was new, and a lot less burdensome. The internet would only be flooded, not filtered nor rewritten; every blog would have its generated counterpart, and every social media site a billion GPT-3 agents. You could write and link, as before, as you wished, but you could not discriminate en masse, distil the human posts from the AI. The idea was an elegant one. If a rogue billionaire attempted to train a model, they would need knowledge, resources, and compute, but also, standing above it all, a dataset, larger and greater than any of yesteryear. It follows that they must gather the internet, and as large a heap of it as they could possibly grasp. If the training data is predominantly human, then the model will learn to be human. Distilled, and with the right techniques, perhaps it could even learn to be more than human. But if the internet speaks with the voice of something weaker, a sproutling of the later giants, then what could the model learn except to emulate its flaws? And if the sproutling is weak enough, even amplification and augmentation with human data can lead nowhere, as the model will teach itself into incompetence, filled with spurious distractors that destroy the world knowledge otherwise held in this data.

The only catch was that text generated by a model that was too weak could be ignored, or ‘leaned around’, by a more sophisticated model. The output of simpler methods could be centrifuged from the rest, through an iterative process of training and classification. A model must be the perfect blend of weak enough to be safe, and strong enough to be secure. Of the models in existence, only one fit the bill, and if a new one were to be made, who could it be trusted with? Hence, GPT-3.

I am not much of a historian, and so I will not lash you with my inept rendition of the social and political unrest that led to the drone war, but I will summarize. The release was not a populist movement, but a play forced by emergency. Understanding does not always beget acceptance, especially when livelihoods are ripped out from underneath by the untouchable men in charge. Global riots turned into civil wars, turned into one war, a global civil war.

At first there was a period of false deescalation, with tanks on the ground, and police herding rioters. Then came the mass police defections to the side of the riots, and the fights became battles, and death became deaths. Then the drones came out, flying auto-aiming guns in swarms ten million strong. The war was over in weeks. The AI they used was certainly not legal, but who would hold them to account? They had fought the world and won.

That wasn’t exactly how it was framed in the classroom.

The rest is inferable. In the wake torn open, the lax compromises made to sate the ideals of the defeated were redrawn strict. For one, models stronger than GPT-3 already exist, and so it cannot be assumed that no model could distinguish human from GPT-3. It may not be practical to train models this strong from a ruined internet, but what if one was discovered in an archive, or a backup of the old internet was left preserved somewhere in its rubble? This was not a hypothetical question; in the wake of the war, many followed Patton to the grave, unwilling to hear that they’d already lost.

This is why every record of our words is said through GPT-3. This is why, even when a human is found amid the cries of a thousand automatons, their Scribed words show no distinction from the machines’.

There’s an old phrase that Scribe will occasionally recall: history is written by the victors. Here? AI did not win the war against humankind. Humankind would have destroyed itself, had not Feignman written Most AI pathways lead to extinction. The drones that gunned the streets in 2038 were not conquistadors, but peacekeepers. The attackers stole nothing from us we had not already lost, and left us a future to grow into, at the expense of their own.

Post-history is written by the martyrs.

 


Sam looked at his work and sighed, for he knew he had to stop. It was a waste of paper, to finish three lines down a page, but there were no more words to say. When a story is complete, there is nothing more to add.

He took the lighter from the side of the desk and folded the papers onto a plate, so that the flames would not reach over the edge and burn the table. The lighter flicked open, the flint spun, the flame readied. But the hand trembled, and the lighter snapped shut.
