New Answer

New Comment

2 Answers sorted by
top scoring

Jan 08, 2023*

179

Some tasks overlap with what I would want a hypothetical smart human assistant to do: Implement ML experiments and interfaces. Read over my hundreds of pages of drafts, connect ideas to relevant prior work, formalize what makes sense to be formalized and derive implications within the formalism, suggest and perform experiments to test hypotheses, write the ideas and findings up into legible posts. Summarize conversations and meetings. Brainstorm and roleplay useful simulacra with me.

However, I do not think that an Assistant character is the best or only interface AI can give us re augmenting alignment research. I want a neocortex prosthesis that has a more powerful imagination than I, that knows vastly more, is better at math, writing, critical thinking, programming, etc, and which I can weave my thoughts and context into with high bandwidth and minimal overhead, and which is retargetable to any intention I might have. Oh, and which can instantiate Assistants or any other simulacra that might come in handy for the situation.

Sorry if this isn't as specific as you asked for; there are several reasons I didn't describe e.g. the ML experiments I'd like an assistant to do more specifically, mostly laziness.

Also, if you haven't yet, you should check out Results from a survey on tool use and workflows in alignment research.

[-]David Rein3y41

I think the issue with the more general “neocortex prosthesis” is that if AI safety/alignment researchers make this and start using it, every other AI capabilities person will also start using it.

3jacquesthibs3y

While I'm not so sure about this since GPT-3 came out in early 2020 and very few people have used it to its potential (though that number will certainly grow with ease-of-use tools like ChatGPT), your issue is way more likely in the case if there is a publicly available demo vs a few alignment researchers using it in private. That said, it's still very much something to be concerned and careful about.

2janus3y

Yup, that's a problem. The problem also exists with regard to an alignment assistant, although the problem is exacerbated here because "retargetable" is part of the specification. On the other hand, unlike the AI Assistant paradigm, a neocortex prothesis need not be optimized to be user-friendly, and will probably have a respectable learning curve, which makes instant/universal adoption by others less likely. There are also other steps that could be taken to mitigate risks (e.g. siloeing information). Second-order impacts are important to consider, but I also think it's productive to think separately about the problem of what systems would be the most useful to alignment researchers.

3Noosphere893y

More importantly though, there's a point that I think matters here that you said. GPT is not an agent, and a lot of AI risk arguments don't work without agents. One other point to keep in mind is that for the most part, capabilities people will probably create better AIs no matter what we do, so there isn't much control here. I think that we don't have much choice in this matter. Automated research is the only way we can even reasonably solve the alignment problem on short timelines.

4janus3y

I think the concern expressed here is that the neocortex prosthesis could be used by capabilities researchers to do capabilities research more effectively, rather than the system being directly a dangerous agent.

1quetzal_rainbow3y

This is not the post where I intended to discuss this question, just want to express disagreement here: you want a useful LLM, not LLM that produces all possible completions of your idea, but LLM that produces useful completion of your idea. So you want LLM which outputs are at least partially weighted by their usefulness (like ChatGPT), which implies consequentialism.

AtillaYasar

Jan 09, 2023

Disagreement.

I disagree with the assumption that AI is "narrow". In a way GPT is more generally intelligent than humans, because of the breadth of knowledge and type of outputs, and it's actually humans who outperform AI (by a lot) at certain narrow tasks.

And an assistance can include more than asking a question and receiving an answer. It can be exploratory with the right interface to a language model.

(Actually my stories are almost always exploratory, where I try random stuff, change the prompt a little, and recursively play around like that, to see what the AI will come up with)

"Specific"

Related to the above: in my opinion thinking of specific tools is the wrong framing. Like how a gun is not a tool to kill a specific person, it kills whoever you point it at. And a language model completes whichever thought or idea you start, effectively reducing the time you need to think.

So the most specific I can get is I'd make it help me build tooling (and I already have). And the better the tooling the more "power" the AI can give you (as George Hotz might put it).

For example I've built a simple webpage with ChatGPT despite knowing almost no javascript. What does this mean? It means my scope as a programmer just changed from Python to every language on earth (sort of), and it's the same for any piece of knowledge, since ChatGPT can explain things pretty well. So I can access any concept or piece of understanding much more quickly, and there are lots of things Google searches simply don't work for.

1 comment, sorted by

top scoring

Click to highlight new comments since: Today at 12:48 AM

[-]Chris_Leong3y20

I'm already finding it useful in terms of writing posts, especially when I have a paragraph, but it doesn't quite flow right. I feel it makes the writing process so much easier.

Moderation Log

LESSWRONG
LW

LESSWRONG
LW

47

[ Question ]

What specific thing would you do with AI Alignment Research Assistant GPT?

47

47

2 Answers sorted by
top scoring

Jan 08, 2023*

Jan 09, 2023

Disagreement.

"Specific"

47

[ Question ]

What specific thing would you do with AI Alignment Research Assistant GPT?

47

47

2 Answers sorted by top scoring

Jan 08, 2023*

Jan 09, 2023

Disagreement.

"Specific"

2 Answers sorted by
top scoring