During yesterday's interview, Eliezer didn't give a great reply to Ezra Klein's question, "Why does even a small amount of misalignment lead to human extinction?" I think many people agree with this assessment; still, my goal isn't to criticize EY. Instead, my goal is to find levels of explanation that have been tested and tend to work for audiences with different backgrounds. Suggestions?
Related:
speck1447 : ... Things get pretty bad about halfway through though, Ezra presents essentially an alignment-by-default case and Eliezer seems to have so much disdain for that idea that he's not willing to engage with it at all (I of course don't know what's in his brain. This is how it reads to me, and I suspect how it reads to normies.)
I think that Eliezer means that mildly misaligned AIs are also highly unlikely, not that a mildly misaligned AI would also kill everyone:
When I say that alignment is difficult, I mean that in practice, using the techniques we actually have, "please don't disassemble literally everyone with probability roughly 1" is an overly large ask that we are not on course to get. So far as I'm concerned, if you can get a powerful AGI that carries out some pivotal superhuman engineering task, with a less than fifty percent chance of killing more than one billion people, I'll take it.
As for LLMs being aligned by default, I don't have the slightest idea how Ezra came up with this. GPT-4o was already a super-sycophant[1] and drove people into psychosis, in spite of OpenAI's Spec prohibiting such behavior. Grok's alignment was so fragile that a mistake by xAI turned Grok into MechaHitler.
In defense of 4o, it was trained on human feedback, which is biased towards sycophancy and "demands erotic sycophants" (c) Zvi. But why would 4o drive people into a trance or psychosis?
Asking even a good friend to take the time to read The Sequences (aka Rationality: A-Z) is a big ask. But how else does one absorb the background and culture necessary to engage deeply with rationalist writing? I think we need alternative ways to communicate the key concepts, varying in style and assumed background. If you know of useful resources, please post them as a comment. Thanks.
Some different lenses that could be helpful:
“I already studied critical thinking in college, why isn’t this enough?”
“I’m already a practicing data scientist, what else do I need to know and why?”
“I’m already interested in prediction markets… how can I get better?”
“I’m not a fan of parables, can you teach me aspiring rationality in a different way?”
“I would never admit this except to close friends: weird or fringe movements make me nervous. But since I do think of myself as a critical thinker, help me ramp up without getting into uncomfortable areas.”
“I would never admit this to others — and may not even recognize it in myself — but my primary motivation is to use evidence and rationality as a cudgel against my opponents. Can you teach me how to be better at this?” (And this reminds me of Harry teaching Malfoy in HPMOR with an ulterior motive!)
As a STEM enthusiast, I suspect I would've much more quickly engaged with the Sequences had I first been recommended arbital.com as a gateway to it instead of "read the Sequences" directly.
Is power shifting away from software creators towards attention brokers? I think so...
Background: Innovation and Compositionality
How does innovation work? Economists, sociologists, and entrepreneurs sometimes say things like:
Software engineers would probably add that practical innovation is driven by the awareness and availability of useful building blocks such as libraries, APIs, data structures, and algorithms. Software developers know this: these blocks are the raw material of their work.
But I don't know if the rest of the world gets it. Off the top of my head, I don't think I've seen a compelling account of how compositionality in software feels completely bonkers compared with other industries. Just keeping up is exciting (if you are lucky) but often disorienting. Looking backwards, previous practices seem foreign, clunky, quaint, or even asinine.
Imagine if this rate of change applied to dentistry. Imagine a dentist sitting down with a patient. "Hello, how are you today?" The patient answers nervously, "Not too bad..." and mentally appends "...yet". The dentist says "Well, let's have a look-see..." and glances over at her tray of tools. Eeek. Nothing looks familiar. She nervously calls for an assistant. "Where is my Frobnicator???" The assistant answers: "That technology was superseded yesterday. Here. Use the Brofnimator instead."
Software development feels like this.
To the extent software is modular, components can be swapped or improved with relatively little cost. Elegant software abstractions reduce the cost of changing implementation details. It is striking how much intellectual energy goes into various software components. Given the size and scope of software industry, maybe we shouldn't be surprised.
Example: over the course of only a few years, there seems to be widespread acceptance (in circles I read, like Hacker News) that embedded databases can play key roles. Rather than reaching for, say, a server-based database management system (DBMS) like PostgreSQL, developers increasingly choose embedded (in-process) data storage libraries like SQLite or one of the many embedded K/V stores (RocksDB, LMDB, etc.). Many of the newer K/V stores, such as redb, have been developed and adopted quite rapidly.
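To illustrate what "embedded" means in practice, here is a small sketch using Python's standard-library `sqlite3` module: the entire database runs inside the application process, with no server to install, configure, or connect to over a socket.

```python
import sqlite3

# An embedded database lives inside the application process.
# ":memory:" keeps it in RAM; a file path would make it persistent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("greeting", "hello"))
row = conn.execute(
    "SELECT value FROM kv WHERE key = ?", ("greeting",)
).fetchone()
# row is now ("hello",) -- full SQL, zero server administration.
```

The low setup cost is much of the appeal: prototyping and integrating an embedded store takes minutes, which feeds directly into the adoption dynamics discussed below.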
Tension Between Building and Adoption
Now, to abstract a bit, here is my recent insight about software. When I think of the balance between:
1. the cost (time, money, labor) of designing & building, versus
2. the steps needed to socialize, persuade, trial, adopt, and integrate,
... it is clear that cost (1) is dropping fast. And large parts of (2) are dropping too. The following are getting easier: (a) surveying options; (b) assessing fitness for purpose; (c) prototyping; (d) integrating.
This means that attention, socialization, and persuasion are increasingly important. This trend is perhaps nothing new in politics, advertising, fashion, and so on, but it seems notable for software technology. In a big way, it shifts the primary locus of power away from creators to attention brokers and persuaders.
Yeah, the amount of change feels overwhelming to me, too. During my career as a software developer I have programmed in Basic, C++, Clojure, Java, JavaScript, Pascal, Perl, PHP, Python, Ruby, and XSLT. In Java alone, I wrote front ends in AWT, Java Server Faces, Java Server Pages, PrimeFaces, Struts, Stripes, Swing. As a database, I used Datomic, DynamoDB, H2, Microsoft SQL Server, MySQL, Oracle, PostgreSQL. Probably forgot some.
I don't mind learning new things, but it is annoying to spend a year or more learning something, only to throw it away and learn something different that accomplishes more or less the same thing, and then do it again, and again. Especially when there are so many genuinely useful things that I do not have the capacity to learn anymore.
Before I had kids, I learned new things in my free time; now I have less free time than I used to have, and I want to spend some of it doing things that are not related to my job.
...end of rant.
That said, I don't understand your question. Do you think that in future, AI will write the code, and the job of the developer will be to persuade the employer to use technology X instead of technology Y? Why not ask AI instead?
Is power shifting away from software creators towards attention brokers?
That said, I don't understand your question. Do you think that in future, AI will write the code, and the job of the developer will be to persuade the employer to use technology X instead of technology Y? Why not ask AI instead?
Upon further reflection, my question is more about function (or role) than who has the power. It seems like the function of persuasion is more important than ever, relative to creation. It might be helpful to think of the persuasion role being done by some combination of people, AI decision support, and AI agents.
The writing above could be clearer as to what I mean. Here are some different ways of asking the question. (I abbreviate software creator as creator.)
Since there is variation across industries or products (call them value chains):
Or maybe we want to ask the question from the POV of an individual:
I wanted to highlight the Trustworthy Systems Group at School of Computer Science and Engineering of UNSW Sydney and two of their projects, seL4 and LionsOS.
We research techniques for the design, implementation and verification of secure and performant real-world computer systems. / Our techniques provide the highest possible degree of assurance—the certainty of mathematical proof—while being cost-competitive with traditional low- to medium-assurance systems.
seL4 is both the world's most highly assured and the world's fastest operating system kernel. Its uniqueness lies in the formal mathematical proof that it behaves exactly as specified, enforcing strong security boundaries for applications running on top of it while maintaining the high performance that deployed systems need.
seL4 is grounded in research breakthroughs across multiple science disciplines. These breakthroughs have been recognised by internationally acclaimed awards, from the MIT Technology Review Award, to the ACM Hall of Fame Award, the ACM Software Systems Award, the DARPA Game Changer award, and more.
They are building a modular operating system called LionsOS:
We are designing a set of system services, each made up of simple building blocks that make best use of the underlying seL4 kernel and its features, while achieving superior performance. The building blocks are simple enough for automated verification tools (SMT solvers) to prove their implementation correctness. We are furthermore employing model checkers to verify key properties of the interaction protocols between components.
Core to this approach are simple, clean and lean designs that can be well optimised, use seL4 to the best effect, and provide templates for proper use and extension of functionality. Achieving this without sacrificing performance, while keeping the verification task simple, poses significant systems research challenges.
Here are two compound nouns that I've found useful for high-bandwidth communication: economics-as-toolkit versus economics-as-moral-foundation.
To varying degrees, people immersed in the study of economics may adopt more than just an analytical toolkit; they may hold some economic concepts as moral foundations.
To give some context, I have bolded two parts from Teaching Economics as if Ethics Mattered (2004) by Charles K. Wilber:
I have spent the past thirty-five years as a professor of economics. Over the course of my tenure at The American University, in Washington, DC (1965-75) and the University of Notre Dame (1975-present), however, I became ever more disenchanted with the capacity of traditional economic theory to enable people to lead a good life. I found myself unable to accept the values embedded in economic theory, particularly the elevation of self-interest, the neglect of income distribution, and the attempts to export these values into studies of the family, the role of the state, and so on. As a result I started researching and writing on the nature of economics and the role of ethics in economic theory.
This work has led me to three important conclusions. First is the conviction that economic theory is not value-free, as is so often claimed. Rather, it presupposes a set of value judgments upon which economic analysis is erected. Second is the realization that the economy itself requires that the self-interest of economic actors be morally constrained. Third is the recognition that economic institutions and policies impact people's lives, requiring that both efficiency and equity be assessed. Teachers of economics need to make use of these insights.
Though Wilber (above) focuses on neoclassical economics as entangled with libertarianism, there are other couplings across different flavors of economic history: Marxism, Keynesianism, Development Economics, and more.
I have a working theory, which I have reflected on imperfectly: mentioning the difference between an analytical toolkit (aka a modeling approach) and a value system goes a long way towards orienting myself in a discussion with someone I don't know well. In some ways, it can serve as a trial balloon to assess someone's awareness of the complexity of economics.