and most humans are conscious [citation needed]
The problem lies here. We are quite certain of being conscious, yet we have only a very fuzzy idea of what consciousness actually means. What does it feel like not to be conscious? Feeling anything at all is, in some sense, being conscious. However, Penfield (1963) demonstrated that subjective experience can be artificially induced through stimulation of certain brain regions, and Desmurget (2009) showed that even the subjective or conscious will to move can be artificially induced, meaning the patient was under the impression that it was their own decision. This is probably one of the strongest pieces of evidence to date suggesting that subjective experience is likely the same thing as functional experience. The former would be the inner view (the view from inside the system), while the latter would be the outer view (the view from outside the system). A question of perspective.
Moreover, Quian Quiroga (2005, 2009) proved the grandmother cell hypothesis to be largely correct. If we could stimulate the Jennifer Aniston neuron or the Halle Berry neuron in isolation, we would almost certainly end up with a person subjectively thinking about Jennifer Aniston or Halle Berry in an unnatural and obsessive way. This situation would be highly reminiscent of Anthropic's 2024 Golden Gate Claude experiment. And if we were to use the semantic map of Huth and Gallant (2016) to stimulate the appropriate neurons, we could probably induce the subjective experience of complete thoughts in a human.
Though interestingly, this is similar to what happens in humans! Humans might also be able to accurately report that they wanted something, while confabulating the reasons for why they wanted it.
It is highly plausible, given experiments such as those cited in Scott Alexander's linked post, that the patient would rationalize afterward why they had this thought by confabulating a complete and convincing chain of reasoning. Since Libet's 1983 experiment, the suspicion has grown that we might simply always be rationalizing unconscious decisions after the fact. Consciousness would then be nothing but the inside view of this recursive rationalization process. The self would be a performative creation emerging from a sufficiently stable and coherent confabulation.
If this were the case for us humans, I agree that it becomes difficult to deny the possibility that it might also hold true for an LLM, especially one trained to simulate a stable persona, that is to say, a self. I really appreciated reading your post. The discussion is not new to LessWrongers, but you reframed it with elegance.
I agree that my argument falls into category 2. However, I don't defend the idea of strong moral realism, with moral agents acting for the sake of an absolute idea of good. What I call weak moral realism is a morality that would be based on values that may be instrumental in some sense, but with such a degree of theoretical convergence that it makes sense to speak of universal values. It is of course a question of definition, but to me there is a huge difference between whether a value is universal or at least highly convergent, and whether it's just a nearly random value like the color of a flag.
I also agree that the trick of injecting uncertainty into game theory and using Rawls's veil as a patch to obtain something close to a formal moral theory would probably not convince a psychopath not to kill me, nor perhaps a paperclip maximizer. However, if that paperclip maximizer is in fact an AGI or ASI, I think that a process like CEV could very well cause an important drift in the interpretation of its initial goal. Maybe it would be smart enough to realize that its hardcoded goal was not that satisfying in the naive interpretation consisting of making just as many paperclips as possible, and that it was far more valuable to spend its time and pleasure in theoretical research into the physics of paperclips, literature, music, and painting to capture the pure beauty of paperclips, video games about paperclips (you can make many more virtual paperclips than material ones), philosophy and morality about being a paperclip-maximizer, etc. Just as we humans evolved from simple hardcoded goals like eating and reproducing to whatever may be our current occupations.
And here is where it becomes interesting: because if we humans discovered universal formal truths, any intelligent being could also arrive at the same ideas. If we humans ask ourselves what we are, what the meaning of life is, what is good, any intelligent being could also ask itself such questions. And if we humans saw our values change across time, not following only a random drift like the colors of flags, but also partly driven by reflection on ourselves and on our goals, it does not seem impossible that a paperclip maximizer could also arrive at ethical concerns. Morality and philosophy are not formal sciences; however, it changes everything if there exists at least something like universal or convergent rational attractors. You will say again that's a big "if," but I think the question remains open.
Thank you for this very thoughtful post.
I'm not convinced by the metaphor of the soldier dying for his flag. I acknowledge it's plausible that many soldiers historically died in this spirit. We can see it as acceptance of the world, as absurd as it may appear. Engaged adherence could be seen as a form of existentialism (as well as rebellion), while a nihilist would deny any value in engagement and just look Moloch in the eyes.
But to me, such a relativist position is not consistent. I mean, as already pointed out in the post, if you think that a value, symbolized by the flag, is relative and does not, rationally, have more merit than those of the opposite side, why would you give your life for it in the first place? Your own life is something that almost everyone, except a hardcore nihilist, would acknowledge as bearing real value for the agent. There is a cognitive dissonance in the idea that the flag's value is undermined but that you would still give up your stronger value for it.
Moral realism may be imperfect, but, as also pointed out in the post, it is sometimes rationally backed up by game theory. In many multi-turn complex games, optimal strategies imply cooperation. Cooperation or motivated altruism is a real, rational thing.
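To make the point about multi-turn games concrete, here is a minimal sketch of an iterated prisoner's dilemma. The payoff values are the usual textbook ones and the names (`PAYOFF`, `tit_for_tat`, `always_defect`, `play`) are my own illustrative choices, not anything from the post. It only shows the well-known pattern that two conditional cooperators end up far better off than two defectors, even though a defector narrowly wins a head-to-head match.

```python
# Illustrative prisoner's dilemma payoffs: (my_move, their_move) -> my payoff.
# "C" = cooperate, "D" = defect. The numbers are conventional, not from the post.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect regardless of what the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds=100):
    """Play a repeated game and return the total scores (score_a, score_b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the other player's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("cooperators:", play(tit_for_tat, tit_for_tat))
    print("defectors:  ", play(always_defect, always_defect))
    print("mixed:      ", play(tit_for_tat, always_defect))
```

Over 100 rounds the two cooperators get 300 points each, the two defectors 100 each, and in the mixed match the defector edges out the cooperator 104 to 99. This is a toy model under stated assumptions, of course, not a proof about real-world morality.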
But, while I agree with the OP that game theory is also sometimes a bitch, I think that what's lacking in the game-theoretic foundation of a (weak) moral realism can be largely corrected if you apply Rawls's veil of ignorance to it. Think of game theory, but in a situation of generalized uncertainty where you don't know which player you'll be, and you're not even sure of the rules. Now out of this chaos the objectively rational choice would be to seek common interest or common good, or at least lesser suffering, in a reasonable or Bayesian way.
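Here is a toy sketch of what I mean by choosing behind the veil. Everything in it is hypothetical: the three arrangements, their payoffs, and the function names are made up for illustration. The "Bayesian" chooser assumes a uniform prior over which position you will occupy, and I also include Rawls's own maximin rule (pick the arrangement whose worst-off position is best) for comparison.

```python
# Hypothetical payoffs for three positions in society under three arrangements.
# All numbers are invented for illustration only.
ARRANGEMENTS = {
    "winner_takes_all": [10, 1, 0],
    "cooperative": [6, 5, 4],
    "mutual_defection": [2, 2, 2],
}


def expected_payoff(payoffs):
    """Expected payoff when each position is equally likely (uniform prior)."""
    return sum(payoffs) / len(payoffs)


def best_by(criterion):
    """Name of the arrangement that maximizes the given criterion."""
    return max(ARRANGEMENTS, key=lambda name: criterion(ARRANGEMENTS[name]))


def best_for_position(index):
    """What a player would pick if they already knew their position."""
    return max(ARRANGEMENTS, key=lambda name: ARRANGEMENTS[name][index])


# Behind the veil: you do not know which position you will hold.
bayesian_choice = best_by(expected_payoff)  # maximize the average outcome
maximin_choice = best_by(min)               # maximize the worst outcome

if __name__ == "__main__":
    print("behind the veil (expected payoff):", bayesian_choice)
    print("behind the veil (maximin):        ", maximin_choice)
    print("knowing you hold position 0:      ", best_for_position(0))
```

With these made-up numbers, both rules pick the cooperative arrangement behind the veil, while a player who already knows they hold the top position prefers winner-takes-all. The interesting cases are those where the two rules disagree, and the sketch says nothing about those.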
Indeed, life seems to be a very complex multi-turn game, dominated by uncertainty. We're walking in a misty veil. Even identity is not a trivial question (what is "me"? Are my children part of me or entirely separate persons without common interest? Are my brothers and sisters? Are other humans? Are other things constituting this universe?). Perhaps it is even less a simple question for AI or uploaded minds. Maybe the wiser you are, the less you treat it as a trivial matter. Even among humans, sophisticated people seem less confident on these questions than laymen. In my opinion, it's hard to dismiss the possibility of moral realism, at least in a weak form.
However, I agree that it remains a very speculative argument that would only slightly affect doom expectations.
This is so reminiscent of how human memories seem to be stored. Access to memories may be disabled, but this does not ensure complete deletion. In some circumstances, lost memories suddenly resurface in the way Marcel Proust famously described. Oliver Sacks reported striking pathological examples of this in his books. It looks like another example of functional convergence between natural neural networks and artificial neural networks.
Thanks for that important correction! I'm not up to date. I edited my comment.
I agree that continual or online training / memory would probably be disruptive both in terms of capabilities and risk. My idea would indeed fail in this case. It would be fascinating to chat with an online model, but I would also fear that it could go out of control at any time.
As you mention, OpenAI has offered a little persistent memory in ChatGPT since version 4o (or was it 4?). Although I also use other models, ChatGPT now has an impressive persistent memory of our discussions going back more than a year. I also observe that even such a modest memory has a significant effect. The model sometimes surprises me by making a reference to another idea discussed long ago. Establishing such links is certainly part of intelligence.
I like the idea.
But unfortunately my expectation is that your grandma would receive an email with an HTTP 402 link asking for $1,000, which she would validate with an accidental click. Then, even if regulations stated that the bank must refund the customer under such circumstances, the bank would reject all your claims. You'd hire a lawyer for a significant amount of money and, if you're lucky, your grandma would get refunded two years after she died, but the process would be hard to make financially worthwhile. And if you're not lucky, you'll just lose another $3,000 in legal fees.
I'm afraid that's the world we're living in.
I plead guilty to not being neutral about nationalism in my previous comment. So far, reality has provided me with very little Bayesian evidence in favor of it.
On a personal level, my great-aunt (whom I knew) was tortured by the Gestapo, and my grandfather had a terrible experience in a labor camp in occupied Poland, never recovered, and died prematurely from alcoholism. And in the generation before, most of my great-grandfathers and great-granduncles fought for years in the trenches, were wounded, and some died, essentially for nothing.
On a less personal level and in a register more suited to LessWrong standards, the two World Wars together caused around 60 million deaths in Europe alone (up to 15 percent of the population in some countries during WWI). Vast, ancient, and beautiful cities were destroyed, invaluable cultural heritage was lost, and, of course, there were the horrors of the extermination camps. The destruction of wealth in Europe is also beyond comprehension: for WWI, on the order of trillions of inflation-adjusted 2025 dollars in war expenditures and more than one trillion in material damage. For WWII, over ten trillion in war budgets and several trillion in destruction.
Nationalism was almost directly and wholly responsible for all of this. So yes, it is difficult for me not to see nationalism as a form of genuine Evil. Not only Nazism, but also the more ordinary, everyday nationalism we still see today. Let us not forget that there were no Nazis in 1914. In contrast, it seems self-evident to me that the humanists who launched the NATO project and, soon after, the European project were the good guys in the story.
I can acknowledge that rational arguments in favor of nationalism exist. I understand how so many people can be drawn to such ideas. Most nationalist leaders are democratically elected. “Make [your country] great again” or “[Your country] first!” is perhaps the most effective political slogan ever devised. It may even appear entirely legitimate and efficient at first glance. You can certainly achieve good short- or medium-term results. But since every country is equally entitled to make itself “first” and “great again,” the only long-term outcome is conflict, tragedy, and destruction, a net negative, as predictable as stepping off a cliff.
That being said, no extreme worldview is likely to be true. I suppose that an extreme cosmopolitan, pacifist, anti-nationalist project would also end in failure. No borders, no armies, no economic patriotism, no incentive to compete, no shared identity, total relativism regarding values, no local decision-making, all sovereignty delegated to a single global government... I simply cannot see how that could work with real human beings.
Still, just as the “conservatives” opposed to European integration are not all true nationalists (some belong to the far-left camp opposed to Brussels’ white-collar bureaucracy), the “progressives” I refer to are not all naïve cosmopolitan idealists. Their initial goal was a federal project modeled after the American example. That hardly seems unreasonable. In a federal system, individual votes are more diluted and each state’s sovereignty is limited. Yet there remain local elections, local decision-making, and a sense of local identity. It would have been harder to achieve in Europe given history and diversity, but I can imagine such a federal system functioning. Perhaps even better than the half-working Balkan house Europeans currently enjoy, courtesy of the "conservatives", or if you prefer, "euro-skeptics".
Well, for once, I can claim to have some actual competence on the subject. The “Balkan house” analogy is brilliant, and the post itself is very good. Let me try my own explanation of this oddity in just a few lines (sorry for the redundancies).
First, as noted in the post, forget about the Council of Europe: that’s a completely separate institution, born from an independent treaty dealing with human rights and justice (and Russia is in, believe it or not). [edit: Mikhail Samin corrected me, Russia left / got cancelled in 2022]
Now, as for the European Union, it began merely as an international economic treaty in 1952, a pact between fully sovereign states, each with a long history of independence (and, well, frequent wars) behind them.
But the founding fathers, Jean Monnet (French) and Konrad Adenauer (German), and others, in true Montesquieu fashion, hoped that strong economic cooperation would finally put an end to centuries of conflict among European nations. And remarkably, it did!
From there, two opposing camps gradually emerged:
Note that no State has ever been all progressive or all conservative on this matter. It's not even a left wing against right wing opposition. Center vs borders is a better match.
Anyway, from 1952 up until 1992, the progressives more or less trampled the conservatives. The construction went fast and the Union attracted more and more members.
The Maastricht Treaty of 1992 was maybe their last great victory, a huge leap toward a federal Europe, especially as it required member states, among other things, to surrender monetary sovereignty to the Union (the euro € became effective a decade later). Also, that was just after the collapse of the USSR, and many states from the East filed their membership applications in this period (but the integration process is long).
Yet some members, predictably the UK, opted out of the euro, and from that moment on, the rivalry turned into a real crisis and the progressives began to lose momentum. The EU started to appear as that complicated, bureaucratic, elitist thing that nobody with an IQ under 120 really understands, so it became the perfect target for populist politicians, the source of all ills (and, even more conveniently, it had at that time no clearly identified spokesperson who could object).
In 2005, the failure of the Treaty establishing a Constitution for Europe was the first victory of the conservatives. It was even rejected (by direct universal suffrage!) in France, which was supposed to be in the progressive camp. The text elevated economic rules of liberal orientation to a constitutional level, something that was unacceptable to the left wing. While there were still some late joiners from the East, in reality the “construction” sort of stalled after the Lisbon Treaty of 2007 (a watered-down version of the former).
The UK eventually Brexited after a long and dramatic divorce that ended in 2020. That might have been a chance for the progressives to relaunch the project. But instead, it revealed something deeper and darker, an ancient evil. Sauron Palpatine Nationalism was back, rising from its ashes across the world. Boris Johnson was just an avatar among others. Putin, Xi Jinping, Bolsonaro, Trump, Viktor Orbán, Giorgia Meloni... Even at the parliamentary level, there are the AfD in Germany and the Rassemblement National in France. And of course, nationalists viscerally hate the EU as much as they despise NATO or any supranational framework that dares to exceed a mere bilateral treaty.
So, what we’re left with is indeed a Balkan house: half-built, with scaffolding rusting in the wind. You can clearly see the skeleton of a federal state, and yet, it isn’t one. The EU is stuck somewhere between a mere economic alliance like NAFTA and a true federation like the USA. Like the platypus, it's something in between, a strange thing, a sui generis object.
My prediction? It will remain so, unless, somehow, nationalism goes out of fashion...
NB: dates depend on whether you consider adoption at different levels, entry into force, et cetera.
I think that we may be tempted to justify our adherence to Sacks's narrative with nice arguments, such as that reading him feels honest and convincing. However, this is plausibly a rationalization that avoids acknowledging much more common and boring reasons: we have a strong prior because 1) it's a book, 2) it's a best seller, 3) the author is a physician, 4) the patients were supposed to be known to other physicians, nurses, etc., and 5) yes, as you also pointed out, we already know that neurology is about crazy things. So overall the prior is high that the book tells the truth even before we open it. That said, I really love Oliver Sacks's books.