I didn't upvote or downvote this post. Although I do find the spirit of this message interesting, I have a disturbing feeling that arguing to future AI to "preserve humanity for Pascal's-mugging-type reasons" trades off X-risk for S-risk. I'm not sure that any of these aforementioned cases encourage AI to maintain lives worth living. I'm not confident that this meaningfully changes S-risk or X-risk positively or negatively, but I'm also not confident that it doesn't.
With the advent of Sydney and now this, I'm becoming more inclined to believe that AI Safety and policies related to it are very close to being in the Overton window of most intellectuals (I wouldn't say the general public, yet). Like, maybe within a year, more than 60% of academic researchers will have heard of AI Safety. I don't feel confident whatsoever about the claim, but it now seems more than ~20% likely. Does this seem to be a reach?
There is a fuzzy line between "let's slow down AI capabilities" and "let's explicitly, adversarially, sabotage AI research". While I am all for the former, I don't support the latter; it creates worlds in which AI safety and capabilities groups are pitted head to head, and capabilities orgs explicitly become more incentivized to ignore safety proposals. These aren't worlds I personally wish to be in.
While I understand the motivation behind this message, I think the actions described in this post cross that fuzzy boundary and push way too far toward that style of adversarial messaging.
We know, from like a bunch of internal documents, that the New York Times has been operating for the last two or three years on a, like, grand [narrative structure], where there's a number of head editors who are like, "Over this quarter, over this current period, we want to write lots of articles, that, like, make this point..."
Can someone point me to an article discussing this, or the documents themselves? While this wouldn't be entirely surprising to me, I'm trying to find more data to back this claim, and I can't seem to find anything significant.
It feels strange hearing Sam say that their products are released whenever they feel as though 'society is ready.' Perhaps they can afford to do that now, but I cannot help but think that market dynamics will inevitably create strong incentives for race conditions very quickly (perhaps it is already happening) which will make following this approach pretty hard. I know he later says that he hopes for competition in the AI-space until the point of AGI, but I don't see how he balances the knowledge of extreme competition with the hope that society is prepared for the technologies they release; it seems that even current models, which appear to be far from the capabilities of AGI, are already transformative.
Let's say Charlotte was a much more advanced LLM (almost AGI-like, even). Do you believe that if you had known that Charlotte was extraordinarily capable, you might have been more guarded, recognizing its ability to understand and manipulate human psychology, and thus been less susceptible to it potentially doing so?
I find that a small part of me still thinks, "oh, this sort of thing could never happen to me, since I can learn from others that AGI and LLMs can make you emotionally vulnerable, and thus not fall into a trap!" But perhaps this is just wishful thinking that would crumble once I interact with more and more advanced LLMs.
I'm trying to engage with your criticism in good faith, but I can't help but get the feeling that a lot of your critiques here seem to be a form of "you guys are weird": your privacy norms are weird, your vocabulary is weird, you present yourselves as weird, etc. And while I may agree that it sometimes feels as if LessWrongers are out of touch with reality, this criticism, coupled with some of the other object-level disagreements you were making, seems to overlook the many benefits that LessWrong provides; I can personally attest to the fact that I've improved in my thinking as a whole due to this site. If that makes me a little weird, then I'll accept that as a way to help me shape the world as I see fit. And hopefully I can become a little less weird through the same rationality skills this site helps develop.
Humans can often teach themselves to be better at a skill through practice, even without a teacher or ground truth
Definitely, but I currently feel that the vast majority of human learning comes with a ground truth to reinforce good habits. I think this is why I'm surprised this works as much as it does: it kinda feels like letting an elementary school kid teach themselves math by practicing certain skills they feel confident in, without any regard for whether that skill is even "mathematically correct".
Sure, these skills are probably on the right track toward solving math problems - otherwise, the kid wouldn't have felt as confident about them. But would this approach not ignore skills the student needs to work on, or even amplify "bad" skills? (Or maybe this is just a faulty analogy and I need to re-read the paper)
I don't quite understand the perspective behind someone 'owning' a specific space. Do airlines specify that when you purchase a ticket, you are entitled to the chair + the surrounding space (in whatever ambiguous way that may mean)? If not, it seems to me that purchasing a ticket pays for a seat and your right to sit down on it, and everything else is complimentary.
Yeah, something along the lines of this. Preserving humanity =/= humans living lives worth living.