Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
Don't religions sort of centrally try to get you to believe known-to-be-false claims?
I agree that institutionally they are set up to do a lot of that, but the force they bring to bear on any individual is actually quite small in my experience, compared to what I've seen in AI safety spaces. Definitely lots of heterogeneity here, but most of the optimization that religions do to actually keep you believing their claims is pretty milquetoast.
Are you saying that EAs are better at deceiving people than typical members of those groups?
Definitely in-expectation! I think SBF, Sam Altman, Dario, Geoff Anders, plus a bunch of others, are pretty big outliers on these dimensions. I think in-practice there is a lot of variance between individuals, with a very high-level gloss being something like "the geeks are generally worse, unless they make it an explicit optimization target, but there are a bunch of very competent sociopaths around, in the Venkatesh Rao sense of the word, who seem a lot more competent and empowered than even the sociopaths in other communities".
Are you claiming that members of those groups may regularly spout false claims, but they're actually not that invested in getting others to believe them?
Yeah, that's a good chunk of it. Like, members of those groups do not regularly sit down and make extensive plans about how to optimize other people's beliefs in the way that seems routine around here. Some of it is a competence side-effect: the appropriate level of paranoia rises the more competent your adversary is. The AI Safety community is a particularly scary adversary in that respect (and one that, due to relatively broad buy-in for something like naive consequentialism, can bring more of its competence to bear on the task of deceiving you).
My current best guess is that you have a higher likelihood of being actively deceived, of having someone actively plot to mislead you, or of having someone put very substantial optimization pressure into getting you to believe something false or self-serving, if you interface with the AI safety community than with almost any of the above.
A lot of that is the result of agency, which is often good, but in this case a double-edged sword. Naive consequentialism and lots of intense group-beliefs make the appropriate level of paranoia when interfacing with the AI Safety community higher than with most of these places.
"Appropriate levels of paranoia when interfacing with you" is of course not the only measure of honor and integrity, though as I am hoping to write about sometime this week, it's kind of close to the top.
On that dimension, I think the AI Safety community is below AGI companies and the US military, and above all the other ones on this list. For the AGI companies, it's unclear to me how much of it is the same generator. Approximately 50% of the AI Safety community are employed by AI labs, and they have historically made up a non-trivial fraction of the leadership of those companies, so those datapoints are highly correlated.
Huh, those are very confident AGI timelines. Have you written anything on your reasons for that? (No worries if not, am just curious).
I hope we continue to hold ourselves to high standards for integrity and honor, and as long as we do, I will be proud to be part of this community no matter what the super PACs say.
I do wish this were the case, but as I have written many times in the past, I just don't think this is an accurate characterization. See e.g.: https://www.lesswrong.com/posts/wn5jTrtKkhspshA4c/michaeldickens-s-shortform?commentId=zoBMvdMAwpjTEY4st
I don't think the AI safety community has particularly much integrity or honor. I would like there to be something in the space that has those attributes, but please don't claim valor we/you don't have!
I mean "not solving alignment" pretty much guarantees misuse by everyone's lights? (In both cases conditional on building ASI)
As I understood the title, the point was to indicate that the author is tired of this, and that while they may have been able to help some people, knowing this information has come at great emotional and social cost to them, such that they wish (in at least some sense hinted at in the title) that they had never encountered it.
So, at least as the title plays it, it would fall into the “negatively-valued information” bucket.
Your message:
Yeah, I honestly think the above is pretty clear?
I do not think it at all describes a policy of "if someone was trying to harm the third party, and having this information would cause them to do it sooner, then I would give them the information". Indeed, it seems really very far away from that! In the above story nobody is trying to actively harm anyone else as far as I can tell? I certainly would not describe "CEA Comm Health team is working on a project to do a bunch of investigations, and I tell them information that is relevant to how highly they should prioritize those investigations" as being anything close to "trying to harm someone directly"!
You didn’t say that when we were talking about it!
No, I literally said "Like, to be clear, I definitely rather you not have told me". And then later "Even if I would have preferred knowing the information packaged with the request". And my first response to your request said "You can ask in-advance if I want to accept confidentiality on something, and I'll usually say no".
If you were like, “sorry, I obviously can’t actually not propagate this information in my world model and promise it won’t reflect on my plans, but I won’t actively try to use outside of coordinating with the third party and will keep it confidential going forward”, that would’ve been great and expected and okay.
Sure, but I also wouldn't have done that! The closest deal we might have had would have been a "man, please actually ask in advance next time, this is costly and makes me regret having that whole conversation in the first place. If you recognize that as a cost and owe me a really small favor or something, I can keep it private, but please don't take this as a given", but I did not (and continue to not) have the sense that this would actually work.
but I won’t actively try to use outside of coordinating with the third party
Maybe I am being dense here, and on first read this sounded like maybe a thing I could do, but after thinking more about it I do not know what I am promising if I promise I "won't actively try to use [this information] outside of coordinating with the third party". Like, am I allowed to write it in my private notes? Am I allowed to write it in our weekly memos as a consideration for Lightcone's future plans? Am I not allowed to think the explicit thought "oh, this piece of information is really important for this plan that puts me in competition with this third party, better make sure to not forget it, and add it to my Anki deck"?
Like, I am not saying there isn't any distinction between "information passively propagating" and "actively using information", but man, it feels like a very tricky distinction, and I do not generally want to be in the business of adding constraints to my private planning and thought-processes that would limit how I can operate here and that rely on this distinction being clear to other people. Maybe other people have factored their minds and processes in ways that make this easy, but I have not.
I don't feel great about my donations to a nonprofit funding their "hotel/event venue business" (as I would call it)
The nice thing about Lighthaven is that it mostly funds itself! Our current expected net spending on Lighthaven is about 10% of our budget, largely as a result of subsidizing events and projects here that couldn't otherwise exist. Looking at that marginal expenditure, I think Lighthaven is wildly cost-effective if you consider any of the organizations whose events we subsidize here to be cost-effective.
because if, e.g., someone was considering whether it's important to harm the third party now rather than later and telling them the information that I shared would've moved them towards harming the third party earlier, Oliver would want to share information with that someone so that they could harm the third party.
No, I didn't say anything remotely like this! I have no such policy, and I don't think I ever said anything that might imply such a policy. I only clarified, again, that I am not making promises to you about not doing these things. I would definitely not randomly hand out information to anyone who wants to harm the third party.
At this point I am just going to stop commenting every time you summarize me inaccurately, since I don't want to spend all day doing this, but please, future readers, do not assume these summaries are accurate.
Then, after hearing Oliver wouldn't agree to confidentiality given that I haven't asked him for it in advance
I have clarified like 5 times that this isn't because you didn't ask in advance. If you had asked in advance, I would have rejected your request as well; it's just that then you would have never told me in the first place.
don't try to tell people specifically for the purpose of harming the third party
This is also not what you asked for! You said "I just ask you to not use this information in a way designed to hurt [third party]", which is much broader. "Not telling people" and "not using information" are drastically different. I have approximately no idea how to commit to "not use information for purpose X". Information propagates throughout my world model. If I end up in conflict with the third party, I might want to compete with them and consider the information as part of my plans. I couldn't blind myself to that information when making strategic decisions.
Yeah, that does sound roughly like what I mean, and then I think most people just drop the second part:
I do not think that SBF was doing this part. He was doing the former, though!
My best guess is that you are doing a mixture of: