I have little to no technical knowledge when it comes to AI, so I have two questions here.
1.) Is the focus of the paper the agents themselves, or the learning process that was constructed?
2.) How major of a breakthrough is this toward AGI?
I apologize if this is something you’ve already covered ad nauseam, but what are your timelines?
IMO, I doubt you have to be pessimistic to believe that there’s an unacceptably high probability of AI doom. Some may think that there’s a <10% chance of something really bad happening, but I would argue even that is unacceptable.
The most glaring argument I could see raised against Christiano’s IDA is that it assumes a functioning AGI would already be developed before measures are taken to make it corrigible. That objection may very well rest on a misunderstanding on my part, though. It’s also possible that MIRI would simply prefer the field prioritize other work over preparing for non-FOOM scenarios. But I don’t understand how IDA couldn’t “possibly, possibly, possibly work”.
I’m a little confused, though. I’m aware of Yudkowsky’s misgivings regarding the possible failings of prosaic AGI alignment, but I’m not sure where he states it to be borderline impossible or worse. Also, when you refer to MIRI being highly pessimistic about prosaic AGI alignment, are you referring to the organization as a whole, or to a few key members?
I also don’t understand why this disparity of projections exists. Is there a more implicit part of the argument that neither party (Paul Christiano and MIRI) has publicly addressed?
EDIT: Is the argument instead that corrigibility isn't currently achievable because we lack an understanding of what corrigibility even is, setting aside how possible it might become some years down the line?
Interesting, but what probability do you assign to this chain of events? Likewise, what probability would you assign to the advent of transformative AI (AGI) being prosaic, i.e. achieved by scaling existing architectures with more compute and better hardware?
Thanks for the reply!
Likewise, I’m also less confident that shorter timelines imply a high, irreducible probability of failure.
EDIT: If doom instead follows simply from the advent of prosaic AGI, then I still disagree, even more so, actually.
I’m somewhat confused as to how “slightly more confident” and “slightly less confident” add up to doom, which is a pretty strong claim, IMO.
Thank you for the reply!