Speaking of Stag Hunts

There's a lot here, and I've put in a lot of work writing and rewriting. After failing for long enough to put things in a way that is both succinct and clear, I'm going to abandon hopes of the latter and go all in on the former. I'm going to use the minimal handles for the concepts I refer to, in a way similar to using LW jargon like "steelman" without the accompanying essays, in hopes that the terms are descriptive enough on their own. If this ends up being too opaque, I can explicate as needed later.

Here's an oversimplified model to play with:

  • Changing minds requires attention, and bigger changes require more attention.
  • Bidding for bigger attention requires bigger respect, or else no reason to follow.
  • Bidding for bigger respect requires bigger security, or else not safe enough to risk following. 
  • Bidding for that sense of security requires proof of actual security, or else people react defensively, cooperation isn't attended to, and good things don't happen.

GWS took an approach of offering proof of security and making fairly modest bids for both security and respect. As a result, the message was accepted, but it was fairly restrained in what it attempted to communicate. For example, GWS explicitly says "I do not expect that I would give you the type of feedback that Jennifer has given you here (i.e. the question the validity of your thesis variety)."

Jennifer, on the other hand, went full bore, commanding attention to places which demand lots of respect if they are to be followed, while offering little in return*. As a result, accepting this bid also requires a large degree of security, and she offered no proof that her attacks on Duncan's ideas (it feels weird addressing you in the third person given that I am addressing this primarily to you, but it seems like it's better looked at from an outside perspective?) would be limited to that which wouldn't harm Duncan's social standing here. This makes the whole bid very hard to accept, and so it was not accepted, and Duncan gave high heat responses instead.

Bolder bids like that make for much quicker work when accepted, so there is good reason to be as bold as your credit allows. One complicating factor here is that the audience is mixed, and overbidding for Duncan himself doesn't necessarily mean the message doesn't get through to others, so there is a trade off here between "Stay sufficiently non-threatening to maintain an open channel of cooperation with Duncan" and "Credibly convey the serious problems with Duncan's thesis, as I see them, to all those willing to follow".

Later, she talks about wanting to help Duncan specifically, and doesn't seem to have done so. There are a few possible explanations for this.

1) When she said it, there might have been an implied "[I'm only going to put in a certain level of work to make things easy to hear, and beyond that I'm willing to fail]". In this branch, the conversation between Duncan and Jennifer is going nowhere unless Duncan decides to accept at least the first bid of security. If Duncan responds without heat (and feeling heated but attempting to screen it off doesn't count), the negotiation can pick up on the topic of whether Jennifer is worthy of that level of respect, or further up if that is granted too.

2) It's possible that she lacks a good and salient picture of what it looks like to recover from over-bidding, and just doesn't have a map to follow. In this branch, demonstrating what that might look like would likely result in her doing it and recovering things. In particular, this means pacing Duncan's objections without (necessarily) agreeing with them until Duncan feels that she has passed his ITT and trusts her intent to cooperate and collaborate rather than to tear him down.

3) It could also be that she's got her own little hang up on the issue of "respect", which caused a blind spot here. I put an asterisk there earlier, because she was only showing "little respect" in one sense, while showing a lot in another. If you say to someone "Lol, your ideas are dumb", it's not showing a lot of respect for those ideas of theirs. To the extent that they afford those same ideas a lot of respect, it sounds a lot like not respecting them, since you're also shitting on their idea of how valuable those ideas are and therefore their judgement itself. However, if you say to someone "Lol, your ideas are dumb" because you expect them to be able to handle such overt criticism and either agree or prove you wrong, then it is only tentatively disrespectful of those ideas and exceptionally and unusually respectful of the person themselves.

She explicitly points at this when she says "Duncan is a special case. I'm not treating him like a student, I'm treating him like an equal", and then hints at a blind spot when she says (emphasis her own) "who should be able to manage himself and his own emotions" -- translating to my model, "manage himself and his emotions" means finding security and engaging with the rest of the bids on their own merits unobstructed by defensive heat. "Should" often points at a willful refusal to update one's map to what "is", and instead responding to it by flinching at what isn't as it "should" be. This isn't necessarily a mistake (in the same way that flinching away from a hot stove isn't a mistake), and while she does make other related comments elsewhere in the thread, there's no clear indication of whether this is a mistake or a deliberate decision to limit her level of effort there. If it is a mistake, then it's likely "I don't like having to admit that people don't demonstrate as much security as I think they should, and I don't wanna admit that it's a thing that is going to stay real and problematic even when I flinch at it". Another prediction is that to the extent that it is this, and she reads this comment, this error will go away.

I don't want to confuse my personal impression with the conditional predictions of the model itself, but I do think it's worth noting that I personally would grant the bid for respect. Last time I laughed off something that she didn't agree should be laughed off, it took me about five years to realize that I was wrong. Oops.

Speaking of Stag Hunts

I'm torn about getting into this one, since on one hand it doesn't seem like you're really enjoying this conversation or would be excited to continue it, and I don't like the idea of starting conversations that feel like a drain before they even get started. In addition, other than liking my other comment on this post, you don't really know me and therefore I don't really have the respect/trust resources I'd normally lean on for difficult conversations like this (both in the "likely emotionally significant" and also "just large inferential distances with few words" senses).

On the other hand I think there's something very important here, both on the object level and on a meta level about how this conversation is going so far. And if it does turn out to be a conversation you're interested in having (either now, or in a month, or whenever), I do expect it to be actually quite productive.

If you're interested, here's where I'm starting:

Jennifer has explicitly stated that at this point her goal is to help you. This doesn't seem to have happened. While it's important to track possibilities like "Actually, it's been more helpful than it looks", it looks more like her attempt(s) so far have failed, and this implies that she's missing something.

Do you have a model that gives any specific predictions about what it might be? Regardless of whether it's worth the effort or whether doing so would lead to bad consequences in other ways, do you have a model that gives specific predictions of what it would take to convey to her the thing(s) she's missing such that the conversation with her would go much more like you think it should, should you decide it to be worthwhile?

Would you be interested in hearing the predictions my models give?

Speaking of Stag Hunts

Yeah, I anticipated that the "Drama is actually kinda important" bit would be somewhat controversial. I did qualify that it was selected "(if imperfectly)" :p

Most things are like "Do we buy our scratch paper from Walmart or Kinko's?", and there are few messes of people so bad that it'd make me want to say "Hey, I know you think what you're fighting about is important, but it's literally less important than where we buy our scratch paper, whether we name our log files .log or .txt, and literally any other random thing you can think of".

(Actually, now that I say this, I realize that it can fairly often look that way and that's why "bikeshedding" is a term. I think those are complicated by factors like "What they appear to be fighting about isn't really what they're fighting about", "Their goals aren't aligned with the goal you're measuring them relative to", and "The relevant metric isn't how well they can select on an absolute scale or relative to your ability, but relative to their own relatively meager abilities".)

In one extreme, you say "Look, you're fighting about this for a reason, it's clearly the most important thing, or at least top five, ignore anyone arguing otherwise".

In another, you say "Drama can be treated as random noise, and the actual things motivating conflict aren't in any way significantly more important than any other randomly selected thing one could attend to, so the correct advice is just to ignore those impulses and plow forward".

I don't think either is a very good way of doing it, to understate it a bit. "Is this really what's important here?" is an important question to keep in mind (which people sometimes forget, hence point 3), but it cannot be treated as a rhetorical question and must be asked in earnest, because the answer can very well be "Yes, to the best of my ability to tell" -- especially within groups of higher functioning individuals.

I think we do have a real substantive disagreement in that I think the ability to handle drama skillfully is more important and also more directly tied into more generalized rationality skills than you do, but that's a big topic to get into.

I am, however, in full agreement on the main idea of "in a high dimensional space, choosing the right places to explore is way more important than speed of exploration", and that it generalizes well and is a very important concept. It's actually pretty amusing that I find myself arguing "the other side" here, given that so much of what I do for work (and otherwise) involves face palming about people working really hard to optimize the wrong part of the pie chart, instead of realizing to make a pie chart and work only on the biggest piece or few.

Speaking of Stag Hunts

I think your main point here is wrong.

Your analysis rests on a lot of assumptions:

1) It's possible to choose a basis which does a good job separating the slope from the level

2) Our perturbations are all small relative to the curvature of the terrain, such that we can model things as an n-dimensional plane

3) "Known" errors can be easily avoided, even in many dimensional space, such that the main remaining question is what the right answers are

4) Maintenance of higher standards doesn't help distinguish between better and worse directions.

5) Drama pushes in random directions, rather than directions selected for being important and easy to fuck up.

1) In a high dimensional space, almost all bases have the slope distributed among many basis vectors. If you can find a basis that has a basis vector pointing right down the gradient and the rest normal to it, that's great. If your bridge has one weak strut, fix it. However, there's no reason to suspect we can always or even usually do this. If you had to describe the direction of improvement from a rotting log to a nice cable stayed bridge, there's no way you could do it simply. You could name the direction "more better", but in order to actually point at it or build a bridge, many many design choices will have to be made. In most real world problems, you need to look in many individual directions and decide whether it's an improvement or not and how far to go. Real world value is built on many "marginal" improvements.

2) The fact that we're even breathing at all means that we've stacked up a lot of them. Almost every configuration is completely non-functional, and being in any way coherent requires getting a lot of things right. We are balanced near optima on many dimensions, even though there is plenty left to go. While almost all "small" deviations have even smaller impact, almost all "large" deviations cause a regression to the mean or at least have more potential loss than gain. The question is whether all perturbations can be assumed small, and the answer is clear from looking at the estimated curvature. On a bad day you can easily exhibit half the tolerance that you do on a good day. Different social settings can change the tolerance by *much* more than that. I could be pretty easily convinced that I'm averaging 10% too tolerant or 10% too intolerant, but a factor of two either way is pretty clearly bad in expectation. In other words, the terrain can *not* be taken as planar.

3) Going uphill, even when you know which way is up, is *hard*, and there is a tendency to downslide. Try losing weight, if you have any to lose. Try exercising as much as you think you should. Or just hiking up a real mountain. Gusts of wind don't blow you up the mountain as often as they push you down; gusts of wind cause you to lose your footing, and when you lose your footing you inevitably degenerate into a high entropy mess that is further from the top. Getting too little sleep, or being yelled at too much, doesn't cause people to do better as often as it causes them to do worse. It causes people to lose track of longer term consequences, and short term gradient following leads to bad long term results. This is because so many problems are non-minimum phase. Bike riding requires counter-steering. Strength training requires weight lifting, and accepting temporary weakening. Getting rewarded for clear thinking requires first confronting the mistakes you've been making. "Knowing which way to go" is an important part of the problem too, and it does become limiting once you get your other stuff in order, but "consistently performs as well as they could, given what they know" is a damn high bar, and we're not there yet. "Do the damn things you know you're supposed to do, and don't rationalize excuses" is a really important part of it, and not as easy as it sounds.

4) Our progress on one dimension is not independent of our ability to progress on the others. Eat unhealthy foods despite knowing better, and you might lose a day of good mental performance that you could have used to figure out "which direction?". Let yourself believe a comforting belief, and that little deviation from the truth can lead to much larger problems in the future. One of the coolest things about LW, in my view, is that people here are epistemically careful enough that they don't shoot themselves in the foot *immediately*. Most people reason themselves into traps so quickly that you either have to be extremely careful with the order and manner in which you present things, or else you have to cultivate an unusual amount of respect so they'll listen for long enough to notice their confusion. LW is *better* at this. LW is not *perfect* at this. More is better. We don't have clear thinking to burn. So much of clear thinking has to do with having room to countersteer that doing anything but maximizing it to the best of our ability is a huge loss in future improvement.

5) Drama is not unimportant, and it is not separable. We are social creatures, and the health and direction of our social structures is a big deal. If you want to get anything done as a community, whether it be personal rationality improvement or collective efforts, the community has to function or that ain't gonna happen. That involves a lot of discussing which norms and beliefs should be adopted, as well as meta-norms and beliefs about how disagreement should be handled, and applying them to relevant cases. Problems with bad thinking become exposed, and that makes such discussions both more difficult and more risky, but also more valuable to get right. Hubris that gets you in trouble when talking to others doesn't just go away when making private plans and decisions, but in those cases you do lack someone to call you on it and therefore can't so easily find which direction(s) you are erring in. Drama isn't a "random distraction", it's an error signal showing that something is wrong with your/your community's sense-making organs, and you need those things in order to find the right directions and then take them. It's not the *only* thing, and there are plenty of ways to screw it up while thinking you're doing the right thing (non-minimum phase again), but it is selected (if imperfectly) for being centered around the most important disagreements, or else it wouldn't command the attention that it does.

Emotional microscope

I'm on-board mostly - except maybe I'd leave some room for doubt of some claims you're making. 

I might agree with the doubt, or I might be able to justify the confidence better.

to me that would [be] quite a huge win! 

I agree! Just not easy :P

And I sort of thought that "people framing it as objective" is a good thing - why do you think it's a problem? 
I could even go as far as saying that even if it was totally inaccurate, but unbiased - like a coin-flip - and if people trusted it as objectively true, that would already help a lot! Unbiased = no advantage to either side. Trusted = no debate about who's right. Random = no way to game it.

Because it wouldn't be objective or trustworthy. Or at least, it wouldn't automatically be objective and trustworthy, and falsely trusting a thing as objective can be worse than not trusting it at all.

A real world example is what happened when these people put Obama's face into a "depixelating" AI.

If you have a human witness describe a face to a human sketch artist, both the witness and the artist may have their own motivated beliefs and dishonest intentions which can come in and screw things up. The good thing though, is that they're limited to the realism of a sketch. The result is necessarily going to come out with a degree of uncertainty, because it's not a full resolution depiction of an actual human face -- just a sketch of what the person might kinda look like. Even if you take it at face value, the result is "Yeah, that could be him".

AI can give extremely clear depiction of exactly what he looks like, and be way the fuck off -- far outside the implicit confidence interval that comes with expressing the outcome as "one exact face" rather than "a blurry sketch" or "a portfolio of most likely faces". If you take AI at face value here, you lose. Obama and that imaginary white guy are clearly different people. 

In addition to just being overconfident, the errors are not "random" in the sense that they are both highly correlated with each other and predictable. It's not just Obama that the AI imagines as a white guy, and anyone who can guess that they fed it a predominantly white data set can anticipate this error before even noticing that the errors tend to be "biased" in that direction. If 90% of your dataset is white and you're bad at inferring race, then the most accurate thing to do (if you can't say "I have no idea, man") is to guess "White!" every time -- so "eliminating bias" in this case isn't going to make the answers any more accurate, but you still can't just say "Hey, it has no hate in its heart so it can't be racially biased!". And even if the AI itself doesn't have the capacity to bring in dishonesty, the designers still do. If they wanted that result, they can choose what data to feed it such that it forms the inferences they want it to form. 
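The base-rate point above can be made concrete with a small sketch (the 90/10 split and the label names are hypothetical, purely for illustration): a model that always guesses the majority class beats an "unbiased" coin flip on raw accuracy while being maximally biased, which is exactly why "unbiased" and "accurate" come apart.

```python
import random

random.seed(0)

# Hypothetical dataset: 90% of the faces carry the majority label.
labels = ["white"] * 900 + ["nonwhite"] * 100

# Strategy A: always guess the majority class. Maximally "biased",
# yet it maximizes raw accuracy when the model can't actually infer race.
majority_acc = sum(y == "white" for y in labels) / len(labels)

# Strategy B: an unbiased coin flip. No preference for either label,
# but it is wrong far more often.
coinflip_acc = sum(random.choice(["white", "nonwhite"]) == y
                   for y in labels) / len(labels)

print(majority_acc)   # 0.9
print(coinflip_acc)   # roughly 0.5
```

The coin flip is "fair" in the sense of favoring neither label, but it gets nearly half the answers wrong; the biased strategy is far more accurate. Neither property tells you the system is trustworthy.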

This particular AI at this particular job is under-performing humans while giving far more confident answers, and with bias that humans can readily identify, which is sorta "proof of concept" for distrust of AI to be reasonable. As the systems get more sophisticated it will get more difficult to spot the biases and causes, but that doesn't mean they just go away. Neither does it mean that one can't have a pretty good idea of what the biases are -- just that it starts to become a "he said she said" thing, where one of the parties is AI. 

At the end of the day, you still have to solve the hard problem of either a) communicating the insights such that you don't have to trust and can verify yourself, or b) demonstrating sufficient credibility that people will actually trust you. This is the same problem that humans face, with the exception again that AI is more scalable. If you solve the problem once, you're now faced with an easier problem of credibly demonstrating that the code hasn't changed when things scale up.

How to respond to a series of defiantly persistent evidence-free claims?
Answer by jimmy, Oct 15, 2021

In your example, both parties are asserting at the other party who doesn't respect their assertions, and neither is getting curious. Saying "Would you please provide evidence" might superficially sound like curiosity, but you'll notice that you still haven't asked what evidence is or why they believe it. The question is "Will you justify yourself to me", and this hypothetical person seems pretty clearly uninterested in that -- and that's okay, you may not be worth justifying to. Maybe you are, but it's not guaranteed. As Christian says, burden of proof doesn't always work that way. As a simple rule, "the burden of proof is on the person who wants to change minds". On the other side of the coin, the burden of curiosity is on the person who wants to find truth.

When you find yourself in this kind of situation, you can shortcut the whole thing by dropping the presuppositions that they need to justify things in ways that would satisfy you, and engaging in curiosity yourself.

A: X is true
You: Hm, how do you know?

Emotional microscope

The last time I said something similar [to "don't be sad"] it was to a person who had a strong meditation background and it got her out of a depressive phase in which she was the prior two months


That's awesome

Emotional microscope

I think there's a temptation to see AI as an unquestionable oracle of truth. If you can get objective and unquestionable answers to these kinds of questions then I do think it would be a revolutionary technology in an immediate and qualitatively distinct way. If you can just pull out an app that looks at both people and says "This guy on my right is letting his emotions get to him, and the guy on my left is being perfectly calm and rational", and both people accept it because not accepting it is unthinkable, that would change a lot.

In practice, I expect a lot of "Fuck your app". I expect a good amount of it to be justified too, both by people who are legitimately smarter and wiser than the app and also by people who are just unwilling to accept when they're probably in the wrong. And that's true even if no one is thumbing the scales. Given their admission to thumbing the scale on search results, would you trust Google's AI to arbitrate these things? Or even something as politically important as "Who is attractive"?

In my experience, a lot of the difficulty in introspecting on our own emotions isn't in the "seeing" part alone, but rather in the "knowing what to do with them" part, and that a lot of the difficult "seeing" isn't so much about being unable to detect things so much as not really looking for things that we're not ready to handle. Mostly not "I'm looking, and I'm ready to accept any answer, but it's all just so dark and fuzzy I just can't make anything out". Mostly "Well, I can't be angry because that'd make a bad person, so what could it possibly be! I have no idea!" -- only often much more subtle than that. As a result, simply telling people what they're feeling doesn't do a whole lot. If someone tells you that you're angry when you can't see it, you might say "Nuh uh" or you might say "Okay, I guess I'm angry", but neither one gives you a path to explaining your grievances or becoming less angry and less likely to punch a wall. This makes me really skeptical of the idea that simple AI tools of the sort like "micro-expression analyzers" are going to make large discontinuous changes.

If you can figure out what someone is responding to, why they're responding to it in that way, what other options they have considered and rejected, what other options they have but haven't seen, etc, then you can do a whole lot. However, then you start to require a lot more context and a full fledged theory of mind, and that's a much deeper problem. To do it well you not only have to have a good model of human minds (and therefore AI-complete, but no more objective than humans) but also have a model of the goal humans should be aiming for, which is a good deal harder still.

It's possible to throw AI at problems like chess and have them outperform humans without understanding chess strategy yourself, but that has a lot to do with the fact that the problem is arbitrarily narrowed to something with a clear win condition. The moment you try to do something like "improving social interaction" you have to know what you're even aiming for. And the moment you make emotions goals, you Goodhart on them. People Goodhart these already, like with people who have learned how to be really nice and avoid near-term conflict and get people to like them but fail to learn the spine needed to avoid longer term conflict, let alone make things good. Competent AI is just going to Goodhart that much more effectively, and that isn't a good thing.

As I see it, the way for AI to help isn't to blindly throw it at ill-defined problems in hopes that it'll figure out wisdom before we do, but rather to help with scaling up wisdom we already have or have been able to develop. For example, printing presses allow people to share insights further and with more fidelity than teaching people who teach people who teach people. Things like computer administered CBT allow even more scalability with even more fidelity, but at the cost of flexibility and ability to interact with context. I don't see any *simple* answers for how better technology can be used to help, but I do think there are answers to be found that could be quite helpful. The obvious example that follows from above would be that computer administered therapy would probably become more effective if it could recognize emotions rather than relying on user reports.

I'm also not saying that an argument mediator app which points out unacknowledged emotions couldn't be extremely helpful. Any tool that makes it easier for people to convey their emotional state credibly can make it easier to cooperate, and easier cooperation is good. It's just that it's going to come with most of the same problems as human mediators (plus the problem of people framing it as 'objective'), just cheaper and more scalable. Finding out exactly how to scale things as AI gets better at performing more and more of the tasks that humans do is part of it, but figuring out how to solve the bigger problems with actual humans is a harder problem in my estimation, and I don't see any way for AI to help that part in any direct sort of way.

Basically, I think if you want to use AI emotion detectors to make a large cultural change, you need to work very hard to keep it accurate, to keep its limitations in mind, and to figure out with other humans how to put it to use in a way that gets results good enough that other people will want to copy. And then figure out how to scale the understanding of how to use these tools.

Why didn't we find katas for rationality?

The gap isn't nearly so clear cut. In wrestling, we'd do solo drills of shots/sprawls/sit outs/etc every day, and sometimes combine them into chains like sit out->granby->stand up-> shot. The karate guys next to us did their partner katas all the time. 

The difference is that when karate dude hits air, it looks like a joke. For all you can tell watching it, not only has this dude never been in a real fight, neither have any of his instructors or his instructors' instructors. When Mike Tyson hits air, it's terrifying, and you can tell just from watching it that this guy is very in touch with what is required for knocking out skilled opponents. It's more about whether the connection with the ultimate test is still there and informing how the katas are done than whether you're doing it alone or for multiple moves in a row or anything.

Why didn't we find katas for rationality?

While it's true that live resistance is an essential component, it's important to note that effective martial arts do katas as well under different names. In wrestling, it's "drilling". In boxing, it's hitting the bag or pads.
