How attached are you to the wording "take those questions in stride"? Because in order to fully agree with your comment, I'd want to replace it with something more like "make it through these questions without lastingly diminishing the strength of the relationship." The re-wording would allow for outcomes like "one or more people feel terrible for having had an insecurity triggered, even though it doesn't imply anything bad about relationship compatibility."

Basically, I feel like there are two types of issues-that-cause-bad-feelings that can be unearthed via these questions:
I agree that there's something wrong/sad about shrugging away from things around the first bullet point.

On the second bullet point, I'd say the nature of triggers/insecurities is precisely that they can give you a bad day even when there's no rational reason to worry. Some of the phrasings you use ("in stride", "navigate smoothly" – edit: though not all of them, because you also say "if these questions would cause you problems" at one point, and that seems like an appropriately strong wording to me!) suggest that you think finding the exercise emotionally very difficult means that there's automatically something suboptimal about the relationship. I don't agree that this follows. I concede that there's a point above which too many or too strong insecurities will predictably impair the nature of a relationship. However, I think that point comes significantly above "have zero triggers/insecurities." The important part is to keep triggers/insecurities below the threshold where they incentivize even the most considerate and trusted people in your life to white-lie to you or hide things from you to avoid causing you too much harm. I think the list of questions goes well past that level because it's adversarially optimized to find people's triggers. You don't need that level of bullet-proofness to "live in truth" in a relationship.

This gets me to a bit of a tangential rant: I think there's a virtue-signalling-related failure mode among rationalists where pseudo-virtues sometimes develop in the vicinity of things that are truly important (like "living in truth"). People push the truly important thing to outlandish extremes and thereby pass implicit judgment on others who don't go to these extremes, implying that they're less good rationalists. "Living in truth" is an important virtue, but it has very little to do with "you're doing something wrong as a rationalist if you have significant triggers/insecurities."
(I'm not saying that you were claiming that in your post (see also my edit above!), or that your comment is evidence that you think that way, but based on my overall impression from reading your posts/comments, it wouldn't surprise me if part of you thought something like that, perhaps in an unreflective fashion.) All else equal, it's better not to have triggers/insecurities, yes. But people differ tremendously along dimensions like neuroticism, and there are tons of other rationality-related skills one can practice, and then there's a whole part of actually doing work that reduces suffering (or work that advances someone's self-oriented real-life goals, if we're talking about non-effective-altruist rationalists) instead of this perpetually-inward-focused work on "improving one's rationality."
I see at least two ways in which it isn't a lemon market for everyone/every circumstance:

1. If you value compatibility a lot and seek a long-term relationship, you're wasting your own time if you try to cover up things that some people might consider flaws or dealbreakers.
2. Some people are temperamentally quite sensitive to rejection, and rejection hurts more the closer someone gets to the real you. To protect against the pain of rejection at a later point, some people are deliberately very open about their flaws right out of the gate.

Doing a lot of 2. can be a sign that someone isn't ready for a relationship (as it almost exclusively turns off potential partners), but I think it's possible for people who are temperamentally tempted to self-sabotage that way to transform it into a strength. Combined with developing self-confidence about one's good qualities, an awareness of (and openness about) one's weaknesses can seem quite appealing.

You might say: "But then you're indirectly signalling positive qualities again ("awareness of flaws; the confidence to admit flaws"), so this is still about presenting oneself in the best light possible." Hm, sort of, but if you're actually admitting to things that some people will consider dealbreakers, you're opening yourself up to the luck of the draw ("will she/he consider it a dealbreaker or not?"), so in the instances where you get lucky and she/he is okay with it or finds it endearing, you actually sent a credible signal! (I.e., you partly overcame the dynamics/incentives that make it difficult in early-stage dating to update much on quality and compatibility.)
I had a super-long OkCupid profile up for about seven years, minus the <1y period where I was in a relationship with someone I met irl via EA/work. I also had Tinder for a period of time, where I circumvented the word limit by linking to text from my OkCupid profile. I always thought the long profile was the right approach for me because I knew that people who are soulmate-compatible with me would appreciate both the length and the honesty.

I was primarily looking for a life-long relationship, but there were times where I noted that I'd be interested in trying out more casual relationships while the search was continuing. (This was certainly a tradeoff: the long-profile approach is probably suboptimal for impressing women who are looking for a more casual relationship.)

After seven years of very little success, someone replied to me with "I've just read your profile and I have rarely come across someone so like-minded, ever. It's a little bit uncanny." It turned out that this impression of uncanny compatibility was mutual! We've been together for nearly 1.5 years now and things couldn't be better <3

So, it was definitely worth it for me, even though it seemed like the profile wasn't doing much for me for the vast majority of the time that I had it. (For instance, even on the rare occasions where I got replies to first messages, it would almost never lead to a conversation where the women would eventually comment on the parts of my profile that I was particularly fond of.)
That's a great point! It'll also help communicate the difficulty of the problem if they conclude that the field is in trouble and time is running out (in case that's true – experts disagree here). I think AI strategy people should consider trying to get more ambassadors on board. (I now see the ambassador effect as more important than those people's direct contributions, but you definitely only want ambassadors whose understanding of AI risk is crystal clear.)
Edit: That said, bringing in reputable people from outside ML may not be a good strategy to convince opinion leaders within ML, so this could backfire.
The idea that you can reach 90+% confidence that a non-human animal is sentient, via evidence like 'I heard its vocalizations and looked into its eyes and I just knew', is objectively way, way, way, way, way, way crazier than Lemoine thinking he can reach 90+% confidence that LaMDA is sentient via his conversation.
I don't agree with that. The animal shares an evolutionary history with us whereas a language model works in an alien way, and in particular, it wasn't trained to have a self-model. Edit: Nevermind, my reply mentions arguments other than "I looked into its eyes," so probably your point is that if we forget everything else we know about animals, the "looking into the eyes" part is crazy. I agree with that.
>I wouldn't assign "blame" at all. Given their past and their cognitive structure, they acted as made the most sense in the moment.

I concede that "blame" is a bad framing. What I'm trying to say is that, if we want to move towards a world where people are happier on average, it's important to treat liars and self-deceivers differently than we'd treat people who don't do these things. For instance, people who got caught doing these things repeatedly would no longer get the benefit of the doubt. (I think I'm saying something very trivial here, so we probably agree!)

>Two people fighting is, in any case, an indication to not trust both.

Kind of, when we think of loose correlations. But sometimes the connection is unwarranted for one of the participants of the fight, in which case it's important to hold space for that possibility (that one side was unfairly accused or otherwise caught up in something, or [edit:] that the side was correctly accused but somehow managed to deny, attack, and reverse victim and offender in the social environment's perception of the incident).

If you get dragged into a fight for unfair reasons, it's arguably kind of suboptimal to just give in to the attacker and avoid making a scene. "Be the bigger person" makes sense in a lot of circumstances, and even when it seems suboptimal, it can be very understandable. Still, there are situations where you definitely can't fault people for fighting back. Sometimes people who fight are justly defending themselves against an attacker. Sometimes that's brave, and other times it may even be their only option because the attack is an existential threat to their social standing and they no longer have anything to lose. I think it happens somewhat frequently that attackers make things up or bizarrely misrepresent stuff. Dark triad personality traits (and vulnerable dark triad) tend to be involved, and while it's possible for people with such traits to adhere to "Social Contract" principles, many don't.
(It's often part of the symptom criteria that they don't, but I definitely don't want to demonize groups of people based on a cluster of criteria that doesn't always present in the same way.)

Overall, I very much second Duncan's norm of "withholding judgment" rather than a stance like "two people fighting is an indication to not trust both." And I also think that it's good to cultivate a desire to get to the bottom of things, rather than have an attitude of "oh, another fight, looks like the truth is somewhere in between." That said, the original post by Duncan exemplifies that it's often not practically possible to get to the bottom of it, and that's arguably one of the most unfortunate things about human civilization.
Not contradicting what you say: This is sometimes (or even quite often) true, but it's still worth emphasizing that, if one side is in fact lying, while the other side is trying hard to be truthful, then your assumption would assign roughly equal blame, which incentivizes/rewards the lying. (It can be easy for bad actors to obfuscate the facts in a situation by lying about everything they think they can get away with.) So if you don't want to be a force that makes the world worse, you have an obligation to (at least) strongly consider and investigate the possibility that one side is almost completely responsible for all discrepancies.
On point 35, "Any system of sufficiently intelligent agents can probably behave as a single agent, even if you imagine you're playing them against each other": This claim is somewhat surprising to me given that you're expecting powerful ML systems to remain very hard to interpret to humans.
I guess the assumption is that superintelligent ML models/systems may not remain uninterpretable to each other, especially not with the strong incentive to advance interpretability in specific domains/contexts (benefits from cooperation or from making early commitments in commitment races). Still, if a problem is hard enough, then the fact that strong incentives exist to solve it doesn't mean it will likely be solved. Having thought a bit about possible avenues to make credible commitments, it feels non-obvious to me whether superintelligent systems will be able to divide up the lightcone, etc. If anyone has more thoughts on the topic, I'd be very interested.
This may get you massive s-risk at comparatively little potential benefit. On many people's values, the future you describe may not be particularly good anyway, and there's an increased risk of something going wrong because you'd be making a desperate attempt with something you don't fully understand.
To me, the most obvious hypothesis for the “seeing spiders” case would be that immediate sensory processing is more immune to this than more abstract planning procedures for some reason.
Yeah, it would make sense for evolution to make the brain system that does predicting sensory data have independent reward from the brain system that evaluates how well your day or life is going at the moment. Your comment made me (vaguely) remember that Scott Alexander (maybe?) wondered something similar in his review of the book Surfing Uncertainty and/or his post Toward a Predictive Theory of Depression. (I haven't re-read the posts lately, so I'm not confident that they contain relevant information or speculation.)