First? Swing low, see how it performs, especially with a long-term project. Something low-stakes. Maybe something like a populated immersive game world. See what comes from there. Is it stable? Is it sane? Does it keep to its original parameters? What are the costs of running the agent/system? Can it solve social alignment problems?
Heck, test out some theories for some of your other answers in there.
Thank you for the comment. I think all of what you said is reasonable. I see now that I probably should’ve been more precise in defining my assumptions, as I would put much of what you said under “…done significant sandbox testing before you let it loose.”
This depends entirely on context and specifics. How did I get such control (and what does "control" even mean, for something agentic)? How do I know it's the first, and how far ahead of the second is it? What can this agent do that my human collaborators or employees can't?
In the sci-fi version, where it's super-powerful and able to plan and execute, but only has goals that I somehow verbalize, I think Eliezer's genie description (from https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden-complexity-of-wishes) fits: 'There are three kinds of genies: Genies to whom you can safely say "I wish for you to do what I should wish for"; genies for which no wish is safe; and genies that aren't very powerful or intelligent'. Which one is this?
A more interesting framing of a similar question is: for the people working to bring about agentic, powerful AI, what goals are you trying to imbue into its agency?
Thanks for the comment. I agree that context and specifics are key. This is what I was trying to get at with “If you’d like to change or add to these assumptions for your answer, please spell out how.”
By “controlled,” I basically mean it does what I actually want it to do, filling in the unspecified blanks at least as well as a human would, following my true meaning/desire as closely as it can.
Thanks for your “more interesting framing” version. Part of the point of this post was to give AGI developers food for thought about what they might want to prioritize for their first AGI to do.
(If you work for a company that’s trying to develop AGI, I suggest you don’t publicly answer this question lest the media get ahold of it.)
(Let’s assume you’ve “aligned” this AGI and done significant sandbox testing before you let it loose with its first task(s). If you’d like to change or add to these assumptions for your answer, please spell out how.)
Possible answers:
If you think things should be done concurrently, your answer should be in the form of, for example: “(1) 90%, (6) 9%, (10) 1%.”
If you want things done sequentially and concurrently, an example answer would be: “(1) 100%, then (8) 100%, then (9) 50% and (21) 50% (Other: "help me win my favorite video game").”
You can also give answers such as “do (8) first unless it looks like it’ll take more than a year, then do (9) first until I say switch to something else.” I’d suggest, however, not getting too crazy detailed/complicated with your answers - I’m not going to hold you to them!
I found a somewhat similar question on Reddit that might give you some other ideas.