
otto.barten's Shortform

I think '1+1 = ?' may actually not be an easy enough goal, since it's not 100% certain that the answer is 2. Getting to 100% certainty (including about what I actually meant by the question) could still be nontrivial. But what if the goal is 'delete filename.txt'? Maybe the trick is in the language.

otto.barten's Shortform

Thanks again for your reply. I see your point that the world is complicated and a utility maximizer would be dangerous, even if the maximization is supposedly trivial. However, I don't see how an achievable goal has the same problem. If my AI finds the answer 2 before a meteor hits it, I would say it has solidly landed at 100% and stops doing anything. Your argument would hold if it decided to rule out all possible risks first, before actually starting to look for the answer to the question, which it would otherwise quickly find. But since ruling out those risks would be much harder than finding the answer, I can't see my little agent doing that.

I think my easy goals come closest to what you call other-izers. Do you have any further pointers to that literature?

Thanks for your help, it helps me to calibrate my thoughts for sure!

otto.barten's Shortform

Thanks for your insights. I don't really understand 'setting [easy] goals is an unsolved problem'. If you set a goal such as "tell me what 1+1 is", isn't that possible? And once it is completed ("2!"), the AI would stop self-improving, right?

I think this may contribute only a tiny piece of the puzzle, though, because there will always be someone setting a complex or, worse, unachievable goal ("make the world a happy place!"), and boom, there you have your existential risk again. But in a hypothetical situation where you have your AGI in the lab, no one else does, and you want to play around safely, I guess easy goals might help?

Curious about your thoughts. Also, I can't imagine this is an original idea; is there already any literature on the topic?

Help wanted: feedback on research proposals for FHI application

Thanks Charlie! :)

They are asking for only one proposal, so I will have to choose one and then work that one out in full. So I'm mostly asking which idea you find most interesting, rather than which one is the strongest proposal right now - that part still has to be worked out. But thanks a lot for your feedback so far - that helps!

otto.barten's Shortform

AGI is unnecessary for an intelligence explosion

Many arguments assume that an intelligence explosion requires AGI. However, it seems to me that the critical requirement for such an explosion is that an AI can self-improve. Which skills are needed for that? If there is hardware overhang, it probably comes down to the kind of skills an AI researcher uses: reading papers, combining insights, running computer experiments until new insights emerge, and writing papers about them. Perhaps an AI PhD can weigh in on the actual skills needed. My argument is that far from all the mental skills humans have are needed for AI research. Appreciating art? Not needed. Intelligent conversation about non-AI topics? Not needed. Motor skills? Not needed.

I think the skills needed most for AI research (and therefore for self-improvement) are the ones at which a computer may be relatively strong: methodical thinking, language processing, coding. I would therefore expect us to reach an intelligence explosion significantly earlier than actual AGI with the full range of human skills. This should matter for the timeline discussion.

otto.barten's Shortform

Tune AGI intelligence by easy goals

If an AGI is given an easily achievable utility function ("fetch a coffee"), it will lack the incentive to self-improve indefinitely: the fetch-a-coffee AGI only needs to become as smart as a hypothetical simple-minded waiter. By choosing how easy a utility function is, we can therefore tune the intelligence level we want an AGI to reach through self-improvement. The only way to get an indefinite intelligence explosion (up to e.g. physical limits) would be to program a utility function that maximizes something. That type of utility function will therefore be the most dangerous.

Could we create AI safety by prohibiting maximizing-type utility functions? Could we safely experiment with AGIs just a little smarter than us, by using moderately hard goals?
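A toy sketch of the distinction I have in mind (purely illustrative, with made-up functions, and obviously nothing like a real AGI): an agent whose goal is an achievable predicate halts as soon as the predicate holds, while a maximizer is never "done" and only stops at whatever external limit is imposed.

```python
# Toy illustration only: an agent with an achievable goal halts once the goal
# predicate is satisfied; a maximizer has no natural stopping point.

def run_agent(goal_achieved, improve, state, max_steps=1000):
    """Loop: keep 'improving' until the goal predicate holds or the limit is hit."""
    for step in range(max_steps):
        if goal_achieved(state):
            return f"halted after {step} steps, state={state}"
        state = improve(state)
    return f"still running after {max_steps} steps, state={state}"

# Achievable goal ("tell me what 1+1 is"): satisfied at a fixed, reachable state.
print(run_agent(goal_achieved=lambda s: s >= 2, improve=lambda s: s + 1, state=0))

# Maximizing goal ("make the number as large as possible"): no state ever
# satisfies it, so the agent only stops at the artificial step limit.
print(run_agent(goal_achieved=lambda s: False, improve=lambda s: s + 1, state=0))
```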

Reaching out to people with the problems of friendly AI

This is exactly the topic I'm thinking a lot about, thanks for the link! I've written my own essay for a general audience, but it seems ineffective. I knew about the Wait But Why blog post, but better approaches must be possible. What I find hard to understand is that there have been multiple best-selling books on the topic, yet no general alarm has been raised and the topic is not discussed in e.g. politics. I would be interested in why this paradox exists, and also in how to fix it.

Is there any more information on LessWrong about reaching out to a general audience? I've not been able to find it using the search function etc.

The reason I'm interested is twofold:

1) If we convince a general audience that we face an important and understudied issue, I expect them to fund research into it several orders of magnitude more generously, which should help enormously in reducing the X-risk (I'm not working in the field myself).

2) If we convince a general audience that we face an important and understudied issue, they may convince governing bodies to regulate, which I think would be wise.

I've heard the following counterarguments before, but didn't find them convincing. If someone wants to convince me that convincing the public about AGI risk is not a good idea, these are the places to start:

1) General audiences might start pressing for regulation, which could delay AI research in general and/or AGI. That is true and indeed a real problem, since it would also postpone the potential positive effects of AI/AGI (which may be enormous). However, in my opinion the argument is not sufficient because:

A) AGI existential risk is so high and important that reducing it matters more than delaying AI/AGI, and

B) Increased public knowledge of AGI will also increase general interest in AI, and this effect could outweigh the delay that regulation might cause.

2) AGI worries among the general public could make AI researchers more secretive and less willing to cooperate with AI Safety research. My problem with this argument is the alternative: I think that currently, without e.g. politicians discussing this issue, the investments in AI Safety are far too small to have a realistic shot at actually solving the problem in time. Finally, AI Safety may well not be solvable at all, in which case regulation becomes even more important.

Would be super to read your views and get more information!

Looking for non-AI people to work on AGI risks

One more thought: (partially) switching careers comes with a large penalty, since you don't have as much prior knowledge, experience, credibility, and network in the new topic. The only reason I'm considering it is that I think AGI risk is a lot more important to work on than climate risk. If you're moving in the opposite direction:

1) Do you agree that such moving comes with a penalty?

2) Do you think that climate risk is a lot more important to work on than AGI risk?

If so, only one of us can be right. It would be nice to know who that is, so we don't make silly choices.

Looking for non-AI people to work on AGI risks

Hi Brian, thanks for your reply! I think we would not need very special qualifications for this; it's more a matter of reading up on the current state of AI and AI safety, citing the main conclusions from academia, and making sure they are presented well to both policy makers and ordinary people. You say you'd expect countless others to want to work on this too, but I haven't found them yet. I'm still hopeful they exist somewhere, and if you find people already doing this, I'd love to get in contact with them. Otherwise, we should start ourselves.

Interesting observation! I think your second front is especially interesting/worrying where AI improvement tasks are automated. For a positive feedback loop to occur, making AI smarter very fast, many imagine an AGI is necessary. However, I'm asking: what is improving AI now? Which skills are required? Part of it is hardware improvement: academia and industry working together to keep Moore's law going. The other part is software/algorithm improvement, also done by academics and by companies such as DeepMind. So if the tasks of those researchers were automated, that would be the point at which the singularity could take off. Their jobs tend to be analytical and focused on a single task, rather than generically human and social, which I guess means AI would find them easier. That in turn means the singularity (there should be a less sci-fi name for this) could happen sooner than AGI, if policy doesn't intervene. So this is also a long-winded way of saying I agree.

So how should we go about organizing this, if no one is doing it yet? Any thoughts?

Thanks again for your reply; as I said above, it's heartening that there are people out there who are more or less on the same page!

Looking for non-AI people to work on AGI risks

Hi WH, thank you for the reply! I find it really heartening and encouraging to learn what others are thinking.

Could you explain what hardware you think would be needed? This is pretty much the first time I've heard someone talk about that, so of course I'm curious what you think it would take.

I agree with your point that understanding the risks of AI projects is a good way of framing things. Given the magnitude of AGI risk (as I understand it now, human extinction), an alarmist tone in a policy report would still be justified in my opinion. I also agree that we should keep an open mind: I see the benefits of AI, and even more the benefits of AGI, which would be biblical if we could control the risks. Climate adaptation could indeed be carried out a lot better, as could many other tasks. However, I think that we will not be able to control AGI, and we may therefore go extinct if we develop it anyway. But agreed: let's keep an open mind about the developments.

Do you know of any reliable overview of AGI risks? It would be great to have a kind of IPCC equivalent that's as uncontroversial as possible, to convince people that this problem needs attention. Or papers, from a reliable source, stating that there is a nonzero chance of human extinction. Any such information would be great!

If I can help you by the way with ideas on how to fight the climate crisis, let me know!
