Nisan

# Wiki Contributions

I expressed some disagreement in my comment, but I didn't disagree-vote.

I like your upper bound. The way I'd put it is: If you buy $1 of Microsoft stock, the most impact that can have is if Microsoft sells it to you, in which case Microsoft gets one more dollar to invest in AI today. And Microsoft won't spend the whole dollar on AI, although they'd plausibly spend most of a marginal dollar on AI even if they don't spend most of the average dollar on AI.

I'm not sure what to make of the fact that Microsoft is buying back stock. I'd guess it doesn't make a difference either way? Perhaps if they were going to buy back $X worth of shares but then you offer to buy $1 of shares from them at market price, they'd buy back $X and sell you $1 for a net buyback of $(X-1), and you still have an impact of $1.

I like the idea that buying stock only has a temporary effect on price. If the stock price is determined by institutional investors that take positions on the price, then maybe when you buy $1 of stock, these investors correct the price immediately, and the overall effect is to give those investors $1, which is ethically neutral? James_Miller makes this point here. But I'd like to have a better understanding of where the boundary lies between tiny investors who have zero impact and big investors who have all the impact.

Or maybe the effect of buying $1 of stock is giving $1 to early Microsoft investors and employees? The ethics of that are debatable since the early investors didn't know they were funding an AGI lab.

That could be, but also maybe there won't be a period of increased strategic clarity. Especially if the emergence of new capabilities with scale remains unpredictable, or if progress depends on finding new insights.

I can't think of many games that don't have an endgame. These examples don't seem that fun:

• A single round of musical chairs.
• A tabletop game that follows an unpredictable, structureless storyline.

I don't think this is a good argument. A low probability of impact does not imply the expected impact is negligible. If you have an argument that the expected impact is negligible, I'd be happy to see it.
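
As a toy illustration of that point (the symbols and numbers here are made up for the example and are not taken from the discussion): write $p$ for the probability that the action matters and $V$ for the size of the effect if it does. The expected impact is their product, so a small $p$ alone tells you nothing until you also bound $V$.

$$
\mathbb{E}[\text{impact}] = p \cdot V, \qquad \text{e.g. } p = 10^{-6},\; V = 10^{9} \;\Rightarrow\; \mathbb{E}[\text{impact}] = 10^{3}.
$$

An argument that the expected impact is negligible would therefore need to show that $p \cdot V$ is small, not just that $p$ is.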

We had the model for ChatGPT in the API for I don't know 10 months or something before we made ChatGPT. And I sort of thought someone was going to just build it or whatever and that enough people had played around with it.

I assume he's talking about text-davinci-002, a GPT-3.5 model supervised-finetuned on InstructGPT data. And he was expecting someone to finetune it on dialog data with OpenAI's API. I wonder how that would have compared to ChatGPT, which was finetuned with RL and can't be replicated through the API.

I agree that institutional inertia is a problem, and more generally there's the problem of getting principals to do the thing. But it's more dignified to make alignment/cooperation technology available than not to make it.

I'm a bit more optimistic about loopholes because I feel like if agents are determined to build trust, they can find a way.

I agree those nice-to-haves would be nice to have. One could probably think of more.

I have basically no idea how to make these happen, so I'm not opinionated on what we should do to achieve these goals. We need some combination of basic research, building tools people find useful, and stuff in-between.

Your poster talks about "catastrophic outcomes" from "more-powerful-than-human" AI. Does that not count as alarmism and x-risk? This isn't meant to be a gotcha; I just want to know what counts as too alarmist for you.