This is a linkpost for

I somehow hadn't read this post until now, so I am posting this here in case I am not the only one (and I wasn't able to find a previous linkpost for it). Relevant to relatively recent discussion on AI-as-a-service, but also just good as a broad reference.


Agenty AIs can be well-defined mathematically. We understand what an agent is well enough to start dreaming up failure modes. Most of what we have for tool ASI is analogies to systems too stupid to fail catastrophically anyway, plus pleasant imaginings.

Some possible programs will be tool ASIs, just as some will be agent ASIs. The question is the relative difficulty for humans of building, and the relative benefit of, each kind of AI. Conditional on friendly AI, I would consider it more likely to be an agent than a tool, with a lot of probability on "neither", "both", and "that question isn't mathematically well defined". I wouldn't be surprised if tool AI and corrigible AI turned out to be the same thing.

There have been attempts to define tool-like behavior, and they have produced interesting new failure modes. We don't have the tool AI version of AIXI yet, so it's hard to say much about tool AI.
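For concreteness, the agent formalism the comment is gesturing at is Hutter's AIXI, which defines an optimal agent by expectimax over all computable environments, weighted by program length (this is the standard textbook definition, included here only as a reference point, not something from the original thread):

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Here $U$ is a universal Turing machine, $q$ ranges over environment programs consistent with the interaction history of actions $a_i$, observations $o_i$, and rewards $r_i$, and $\ell(q)$ is the length of $q$. The point of the comment stands: there is no comparably crisp equation whose optimum is "behave like a tool".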

I wonder if gwern has changed their view on RL/meta-learning at all given GPT, scaling laws, and the current dominance of training on big offline datasets. This would be somewhat in line with skybrian's comment on Hacker News:

I see that it has references to papers from this year, so presumably it has been updated to reflect any changes in view.