Modeling Transformative AI Risk (MTAIR)

Wiki Contributions


Unlikely, since he could have walked away with a million dollars instead of doing this. (Per Zvi's other post, "Leopold was fired right before his cliff, with equity of close to a million dollars. He was offered the equity if he signed the exit documents, but he refused.")

The most obvious reason for skepticism about the impact that would cause follows.

David Manheim: I do think that Leopold is underrating how slow much of the economy will be to adopt this. (And so I expect there to be huge waves of bankruptcies of firms that are displaced or adapted slowly, and resulting concentration of power, but also some delay as assets change hands.)

I do not think Leopold is making that mistake. I think Leopold is saying a combination of the drop-in remote worker being a seamless integration, and also not much caring about how fast most businesses adapt to it. As long as the AI labs (and those in their supply chains?) are using the drop-in workers, it mostly does not matter who else does. The local grocery store refusing to cut its operational costs won't much postpone the singularity.


I want to clarify the point I was making - I don't think that this directly changes the trajectory of AI capabilities, I think it changes the speed at which the world wakes up to those possibilities. That is, I think that in worlds with the pace of advances he posits, the impacts on the economy lag the advances in AI, and we get a faster capabilities takeoff than we do economic impacts that make the transformation fully obvious to the rest of the world.

The more important point, in my mind, is what this means for geopolitics, which I think aligns with your skepticism. As I said responding to Leopold's original tweet: "I think that as the world wakes up to the reality, the dynamics change. The part of the extensive essay I think is least well supported, and least likely to play out as envisioned, is the geopolitical analysis. (Minimally, there's at least as much uncertainty as AI timelines!)"

I think the essay showed lots of caveats and hedging about the question of capabilities and timelines, but then told a single story about geopolitics - one that I think is both unlikely, and that fails to notice the critical fact: that this is describing a world where government is smart enough to act quickly, but not smart enough to notice that we all die very soon. To quote myself again, "I think [this describes] a weird world where military / government "gets it" that AGI will be a strategic decisive advantage quickly enough to nationalize labs, but never gets the message that this means it's inevitable that there will be loss of control at best."

I don't really have time, but I'm happy to point you to a resource to explain this: https://oyc.yale.edu/economics/econ-159

And I think I disagreed with the concepts inasmuch as you are saying something substantive, but the terms were confused, and I suspect, but may be wrong, that if laid out clearly, there wouldn't be any substantive conclusion you could draw from the types of examples you're thinking of.

That seems a lot like Davidad's alignment research agenda.

Agree that it's possible to have small amounts of code describing very complex things, and as I said originally, it's certainly partly spaghetti towers. However, to expand on my example, for something like a down-and-in European call option, I can give you a two-line equation for the payout, or a couple lines of easily understood Python code with three arguments (strike price, minimum price, final price) to define the payout, but it takes dozens of pages of legalese instead.
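To make the contrast concrete, here is a minimal sketch of the kind of payout function described. The barrier level is not specified in the comment above, so it is included here as a hypothetical extra parameter (a real contract would fix it in the terms):

```python
def down_and_in_call_payout(strike, min_price, final_price, barrier=90.0):
    """Payout of a down-and-in European call.

    The option only "knocks in" if the underlying's minimum price over the
    option's life touched the barrier; otherwise it expires worthless.
    `barrier` is a hypothetical default, not from the original comment.
    """
    knocked_in = min_price <= barrier
    return max(final_price - strike, 0.0) if knocked_in else 0.0
```

A few lines of code, versus dozens of pages of legalese - which is the asymmetry the comment is pointing at.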

My point was that the legal system contains lots of that type of what I'd call fake complexity, in addition to the real complexity from references and complex requirements.

Very happy to see a concrete outcome from these suggestions!


I'll note that I think this is a mistake that lots of people working in AI safety have made, ignoring the benefits of academic credentials and prestige because of the obvious costs and annoyance. It's not always better to work in academia, but it's also worth really appreciating the costs of not doing so, in foregone opportunities and experience, as Vanessa highlighted. (Founder effects matter; Eliezer had good reasons not to pursue this path, but I think others followed that path instead of evaluating the question clearly for their own work.)

And in my experience, much of the good work coming out of AI safety has been sidelined because it fails the academic prestige test, and so it fails to engage with academics who could contribute or who have done closely related work. Other work avoids or fails the publication process because the authors don't have the right kind of guidance and experience to get their papers into the right conferences and journals, and not only is it therefore often worse for not getting feedback from peer review, but it doesn't engage others in the research area.

Answer by David Manheim

There aren't good ways to do this automatically for text, and the state of the art is rapidly evolving.

For photographic images which contain detailed depictions of humans, or non-standard objects with fine details, there are still some reasonably good heuristics for when AIs will mess up those details, but I'm not sure how long they will remain valid.

This is one of the key reasons that the term alignment was invented and used instead of control; I can be aligned with the interests of my infant, or my pet, without any control on their part.

Most of this seems to be subsumed in the general question of how you do research, and there's lots of advice, but it's (ironically) not at all a science. From my limited understanding of what goes on in the research groups inside these companies, it's a combination of research intuition, small-scale testing, checking with others and discussing the new approach, validating your ideas, and getting buy-in from people higher up that it's worth your and their time to try the new idea. Which is the same as research generally.

At that point, I'll speculate and assume whatever idea they have is validated in smaller but still relatively large settings. For things like sample efficiency, they might, say, train a GPT-3-size model, which now costs only a fraction of the researcher's salary to do. (Yes, I'm sure they all have very large compute budgets for their research.) If the results are still impressive, I'm sure there is lots more discussion and testing before actually using the method in training the next round of frontier models that cost huge amounts of money - and those decisions are ultimately made by the teams building those models, and management.
