Glad to read this.
I am currently writing about it. So, if you have questions, remarks or sections that you've found particularly interesting and/or worth elaborating upon, I would benefit from you sharing them (whether it is here or in DM).
So if I don't take myself too seriously in general, hold most of my models lightly, and run OODA loops where I recursively reflect on whether I'm becoming the person I want to be and have set out to be in the past, is that not better than keeping my guard up?
I believe it is hard to accept, but you do get changed by what you spend your time on, regardless of your psychological stance.
You may be very detached. Regardless, if you see A then B a thousand times, you'll expect B when you see A. If you witness a human-like entity feel bad at the mention of a concept a thousand times, it's going to do something to your social emotions. If you interact with a cognitive entity (another person, a group, an LLM chatbot, or a dog) for a long time, you'll naturally develop your own shared language.
--
To be clear, I think it's good to try to ask questions in different ways and discover just enough of a different frame to be able to 80-20 it and use it with effort, without internalising it.
But Davidad is talking about "people who have >1000h LLM interaction experience."
--
From my point of view, people get cognitively pwnd all the time.
People get converted and deconverted, public intellectuals get captured by their audience, newbies try drugs and change their lives after finding meaning in them, academics waste their research on what's trendy instead of what's critical, nerds waste their whole careers on what's elegant instead of what's useful, adults get syphoned into games (not video games) to which they realise much later they have lost thousands of hours, thousands of EAs get tricked into supporting AI companies in the name of safety, citizens get memed both into avoiding political action and into feeling bad about politics.
--
I think getting pwnd is the default outcome.
From my point of view, it's not that you must make a mistake to get pwnd. It's that if you don't take any precautions, it naturally happens.
It has in fact been a while since I last had written conversations with strangers; I'm sorry that my tone came across as too abrasive for productive conversation.
> Next time I'm working on it, I'll see if I can consolidate this claim from it and ping you?
I have shared my email in DM.
By the way, tone doesn't come across well in writing. To be fair, even orally, I am often a bit abrasive.
So just to be clear: I'm thankful that you're engaging with the conversation. Furthermore, I am assuming that you are doing so genuinely, so thanks for that too.
> yes. I don't think any of them suggest that LessWrong is supporting or enthusiastic about OpenAI
I think you may have misread what I wrote.
My statements were that the LessWrong community has supported DeepMind, OpenAI and Anthropic, and that it had friends in all three companies.
I did not state that it was enthusiastic about them, and much less so that it currently is. When I say "has supported", I literally mean that it has supported them. Eliezer introducing Demis and Thiel, Paul Christiano doing RLHF at OpenAI and helping with ChatGPT, the whole cluster founding Anthropic, all the people safety-washing the companies, etc. I didn't make a grand statement about its feelings, just a pragmatic one about some of its actions.
Nevertheless, as a reaction to my statements, you picked a thread where the top answer recommends people work at OpenAI, and where the second topmost answer expresses happiness about capabilities (Paul's RLHF) work.
How could he have known that Paul's work would lead to capabilities 2 years before ChatGPT? By using enmity and keeping in mind that an organisation that races to AGI will leverage all of its internal research (including the research labelled "safety") for capabilities.
I don't know how you did footnotes in comments, but...
For instance, the context of Ben Pace's response was one in which many people in the community at the time (plausibly himself too!) recommended people work at OpenAI's safety teams.
He mentions in his comment that he is happy that Paul and Chris get more money at OpenAI than they would have had otherwise; the same reasoning would have applied to other researchers working with them.
From my point of view, this is pretty damning. You picked one post, and the topmost answers featured two examples of support, the type that you would naturally, and should clearly, avoid with enemies.
To be clear, the LessWrong community has supported DeepMind, OpenAI and Anthropic many times, and at the same time has harboured bad feelings about them too. This is quite a normal, awkward situation in the absence of clear enmity.
This is not surprising. Enmity would have helped clarify this relationship and avoid this mistake.
Also, remember that I do not view enmity as a single-dimensional axis, and this is a major point of my thesis! My recommendation boils down to: be more proactive in deeming others enemies, and at the same time, remain cordial, polite and professional with them.
"if you write something that will predictably make people feel worse about [real person or org], you should stick to journalistic standards of citing sources and such"
This is a selective demand for rigour, which induces an extremely strong positivity bias when discussing other people. I would not willingly introduce such a strong bias.
I think other norms make sense, and do not lead to entire communities distorting their vision of the social world. Cordiality, politeness, courtesy and the like.
I think it's very unlikely that having laxer standards for accusing others is a good thing.
I know you think so. And I disagree, especially on "~0% suffer from having too high standards" (my immediate reaction is that you are obviously rejecting the relevant evidence when you say this).
This is why I am thinking of writing an article specifically about this, tailored to LessWrong.
> To varying degrees. People are probably less negative on Anthropic than OpenAI. We're certainly not enthusiastic about OpenAI. In any case I don't think it summarizes to "the Lesswrong community has supported" these orgs.
Have you read the most upvoted responses to your link?
Its conclusion is "I think people who take safety seriously should consider working at OpenAI" (with the link to its job page!)
The conclusion of the second most upvoted one, from Ben Pace, is "Overall I don't feel my opinion is very robust, and could easily change.", and "And of course I'm very happy indeed about a bunch of the safety work they do and support. The org give lots of support and engineers to people like Paul Christiano, Chris Olah, etc". For reference, Paul Christiano's "safety work" included RLHF, which was instrumental to ChatGPT.
From my point of view, you are painfully wrong about this, and indeed, Lesswrong should have had much more enmity toward OpenAI, instead of recommending people work there because of safety.
Entities can be both enemies and allies at the same time, entities can be more or less enemies, etc.
From my point of view: being willing to see someone as an enemy only in extreme cases or in cases of the most complete opposition is part of the mindset that I am denouncing.
As I said earlier, and as I'm willing to explain if needed, this is the canonical losing move in politics.
This seems very fake and idiosyncratic, and a much smaller problem (if any) than failing to organise against one's enemies.
Nevertheless, if you have a write-up about it of ~10 pages, I'm interested in checking it out (I routinely get proven wrong, and I have found it good to extend this amount of interest to ~anyone who engages with me on a topic that I started).
I am genuinely interested in your point of view.
I see it as causally connected to why the Lesswrong community has supported three orgs racing to AGI.
Out of the following, which would count as "talking badly about an org" and would require a norm of being more thorough before saying it?
If the above passes your threshold for "need to be more thorough before saying it", then that informs what a potential follow-up to my article geared toward Lesswrong would have to be about.
Specifically, it should be about Lesswrong having a bad culture. One that favours norms that make punishing enemies harder, up to the point of not being able to straightforwardly say "if you are pro-nuke, an org that has been anti-nuke for decades is your enemy". Let alone dealing with AI corporations racing to AGI that have friends in the community.
If the above doesn't pass your threshold and you think it's fine, then I don't think it makes sense for me to write a follow-up article for LessWrong. That was basically as far as my article went, IIRC, and so the problem lies deeper.
> A bunch of wealthy libertarian-leaning Silicon Valley nerds who routinely dismiss the concern that wealthy countries could exploit poor countries
You are projecting.
I have written in the past specifically against tech-libertarianism, in the context of wealth concentration leading to abuses of power.
> they're offended when they're asked to even address that concern
I'm not offended that I'm asked to address a concern. I merely find it irrelevant.
What offends me is the lack of thought behind assuming that I didn't know that Greenpeace had arguments. I have seen better on Lesswrong.
agreed
agreed, for similar reasons
I strongly disagree with this, and believe this advice is quite harmful.
"Uncompromisingly insisting on the importance of principles like honesty, and learning to detect increasingly subtle deceptions so that you can push back on them" is one of the stereotypical ways to get cognitively pwnd.
"I have stopped finding out increasingly subtle deceptions" is much more evidence of "I can't notice it anymore and have reached my limits" than "There is no deception anymore."
An intuition pump may be noticing the same phenomenon coming from a person, a company, or an ideological group. Of course, the moment where you have stopped noticing their increasingly subtle lies after pushing against them is the moment they have pwnd you!
The opposite would be "You push back on a couple of lies, and don't get any more subtle ones as a result." That one would be evidence that your interlocutor grokked a Natural Abstraction of Lying and has stopped resorting to it.
But pushing back on "increasingly subtle deceptions" up until the point where you don't see any is almost a canonical instance of The Most Forbidden Technique.