While I have been contemplating this subject for quite some time, this is my first attempt at communicating it publicly. I have been thinking about AI implementation as well as safety. I've been looking at the ideas of various experts in the field and considering how they might be combined and integrated to improve performance or safety (preferably both). One important aspect of that is the legibility of how neural networks work. While I've been thinking about it for some time, a YouTube video I watched last night helped me crystallize how to implement it. It's an interview with the authors of this paper on arXiv. In the paper, they show...
Excellent posts. You and several others have stated much of what I’ve been thinking about this subject.
Sorcerer’s Apprentice and Paperclip scenarios seem to be non-issues given what we have learned over the last couple of years from SotA LLMs.
I feel like much of the argumentation in favor of those doom scenarios relies on formerly reasonable but now outdated issues that we faced in simpler systems, precisely because they were simpler.
I think that’s the real core of the general misapprehension that I believe is occurring in this realm. It is extraordinarily difficult to think about extremely complex systems, so we break them down into simpler ones that we can examine...