Nora_Ammann

Comments

Maps of Maps, and Empty Expectations

Glad to hear it seemed helpful!

FWIW I'd be interested in reading you spell out in more detail what you think you learnt from it about simulacra levels 3+4.

Re "writing the bottom line first": I'm not sure. I think it might be, but at least this connection didn't feel salient, or like it would buy me anything in terms of understanding, when thinking about this so far. Again interested in reading more about where you think the connections are. 

To maybe say more about why (so far) it didn't seem clearly relevant to me: "Writing the bottom line first", to me, comes with a sense of actively not wanting, and taking steps to avoid, figuring out where the arguments/evidence lead you. Maps of maps feels slightly different insofar as the person really wants to find the correct solution but is utterly confused about how to do that, or where to look. Similarly, "writing the bottom line first" suggests that you do have a concrete "bottom line" that you want to be true, whereas empty expectations don't have anything concrete to say about what you would want to be true - there is hardly any object-level substance there.
Most succinctly, "writing the bottom line first" seems closer to motivated reasoning, and maps of maps/empty expectation seem closer to (some fundamental sense of) confusion (about where to even look to figure out the truth/solution). (Which, having spelt this out just now, makes the connection to simulacra levels 3+4 more salient.)

 

How do we prepare for final crunch time?

Regarding "Staying grounded and stable in spite of the stakes": 
I think it might be helpful to unpack the virtue/skill(s) involved according to the different timescales at which emergencies unfold.

For example: 

1. At the time scale of minutes or hours, there is a virtue/skill of "staying level headed in a situation of acute crisis". This is the sort of skill you want your emergency doctor or firefighter to have. (When you pointed to the military, I think you in part pointed to this scale but I assume not only.)

From talking to people who do or did jobs like this, a typical pattern seems to be that some types of people, when in situations like this, basically "freeze" and others basically move into a mode of "just functioning". There might be some margin for practice here (maybe you freeze the first time around and are able to snap out of the freeze the second time around, and after that, you can "remember" what it feels like to shift into functioning mode ever after) but, according to the "common wisdom" in these professions (as I understand it), mostly people seem to fall into one or the other category.

The sort of practice that I see being helpful here is a) overtraining on whatever skill you will need in the moment (e.g. imagine the emergency doctor) such that you can hand over most cognitive work to your autopilot once the emergency occurs; and b) training the skill of switching from freeze into high-functioning mode. I would expect "drill-type practices" are the most apt to get at that, but as noted above I don't know how large the margin for improvement is. (A subtlety here: there seems to be a massive difference between "being the first person to switch into functioning mode" vs "switching into functioning mode after (literally or metaphorically speaking) someone screamed at your face to get moving". (Thinking of the military here.))

All that said, I don't feel particularly excited for people to start doing a bunch of drill practice or the like. I think there are possible extreme scenarios of "narrow hingy moments" that will involve this skill, but overall this doesn't seem to me to be the thing that is most needed/with highest EV.

(Probably also worth putting some sort of warning flag here: genuinely high-intensity situations can be harmful to people's psyche, so one should be very cautious about experimenting with things in this space.)


2. Next, there might be a related virtue/skill at the timescale of weeks and months. I think the pandemic, especially from ~March to May/June, is an excellent example of this, and was also an excellent learning opportunity for people involved in some time-sensitive COVID-19 problem. I definitely think I've gained some gears on what a genuine (i.e. highly stakey) 1-3 month sprint involves, and what challenges and risks are involved for you as an "agent" who is trying to also protect their agency/ability to think and act (though I think others have learnt and been stress-tested much more than I have).

Personally, my sense is that this is "harder" than the thing in 1., because you can't rely on your autopilot much, and this makes things feel more like an adaptive rather than technical problem (where the latter is a problem where the solution is basically clear and you just have to do it, and the former is a problem where most of the work needed is in figuring out the solution, not so much (necessarily) in executing it).

One difficulty is that this skill/virtue involves managing your energy, not only spending it well. Knowing yourself and how your energy and motivation structures work - and in particular how they work in extreme scenarios - seems very important. I can see how people who have meditated a lot have gained valuable skills here. I don't think it's the only way to get these skills, and I expect the thing that is paying off here is more "being able to look back on years of meditation practice and the ways this has rewired one's brain in some deep sense" rather than "benefits from having a routine to meditate" or something like this.

During the first couple of COVID-19 months, I was also surprised how "doing well at this" was more a question of collective rationality than I would have thought (by collective rationality I mean things like: ability to communicate effectively, ability to mobilise people/people with the right skills, ability to delegate work effectively). There is still a large individual component of "staying on top of it all/keeping the horizon in sight" such that you are able to make hard decisions (which you will be faced with en masse).

I think it could be really good to collect lessons learnt from the folks involved in some EA/rationalist-adjacent COVID-19 projects.

3. The scale of ~(a few) years seems quite similar in type to 2. The main thing that I'd want to add here is that the challenge of dealing with strong uncertainty while the stakes are massive can be very psychologically challenging. I do think meditation and related practices can be helpful in dealing with that in a way that is both grounded and not flinching from the truth.

I find myself wondering whether the military does anything to help soldiers prepare for the act of "going to war" where the possibility of death is extremely real. I imagine they must do things to support people in this process. It's not exactly the same but there certainly are parallels with what we want.

On the nature of purpose

Re language as an example: parties involved in communication using language have comparable intelligence (and even there I would say someone just a bit smarter can cheat their way around you using language). 

Mhh yeah so I agree these are examples of ways in which language "fails". But I think they don't bother me too much?
I put them in the same category as "two agents with good faith sometimes miscommunicate - and still, language overall is pragmatically reliable", or "works well enough". In other words, even though there is potential for exploitation, that potential is in fact meaningfully constrained. More importantly, I would argue that the constraint comes (in large part) from the way the language has been (co-)constructed.

On the nature of purpose

a cascade of practically sufficient alignment mechanisms is one of my favorite ways to interpret Paul's IDA (Iterated Distillation-Amplification)

Yeah, great point!

On the nature of purpose

However, I think its usefulness hinges on ability to robustly quantify the required alignment reliability / precision for various levels of optimization power involved. 

I agree and think this is a good point! I think on top of quantifying the required alignment reliability "at various levels of optimization" it would also be relevant to take the underlying territory/domain into account. We can say that a territory/domain has a specific epistemic and normative structure (which e.g. defines the error margin that is acceptable, or tracks the co-evolutionary dynamics). 


 

Nora_Ammann's Shortform

Pragmatically reliable alignment
[taken from On purpose (footnotes); sharing this here because I want to be able to link to this extract specifically]

AI safety-relevant side note: The idea that translations of meaning need only be sufficiently reliable in order to be reliably useful might provide an interesting avenue for AI safety research. 

Language works, as evidenced by the striking success of human civilisations made possible through advanced coordination, which in turn requires advanced communication. (Sure, humans miscommunicate what feels like a whole lot, but in the bigger scheme of things, we still appear to be pretty damn good at this communication thing.)

Notably, language works without there being theoretically air-tight proofs that map meanings onto words.

Right there, we have an empirical case study of a symbolic system that functions on a (merely) pragmatically reliable regime. We can use it to inform our priors on how well this regime might work in other systems, such as AI, and how and why it tends to fail.

One might argue that a pragmatically reliable alignment isn’t enough - not given the sheer optimization power of the systems we are talking about. Maybe that is true; maybe we do need more certainty than pragmatism can provide. Nevertheless, I believe that there are sufficient reasons for why this is an avenue worth exploring further. 

The Inner Workings of Resourcefulness

The question you're pointing at is definitely interesting. A Freudian, slightly pointed way of phrasing it is something like: are humans' deepest desires, in essence, good and altruistic, or violent and selfish?

My guess is that this question is wrong-headed. For example, I think this makes the mistake of drawing a dichotomy and rivalry between my "oldest and deepest drives" and "reflective reasoning", and depending on your conception of which of these two wins, your answer to the above question ends up being positive or negative. I don't really think those people would endorse that, but I do have a sense that something like this influences their background model of the world, and informs their intuitions about the "essence of human nature" or whatever.

This dichotomy/rivalry seems wrong to me. In my experience, my intuition/drives and my explicit reasoning can very much "talk to each other". For example, I can actually integrate the knowledge that our minds have evolved such that we are scope insensitive into the whole of my overall reasoning/worldview. Or I can integrate the knowledge that, given I know that I can suffer or be happy, it's very likely other people can also suffer or feel pleasure, and that does translate into my S1-type drives. Self-alignment, as I understand it, is very much about having my reflective beliefs and my intuitions inform one another, and I don't need to throw either one overboard to become more self-aligned.

That said, my belief that self-alignment is worth pursuing is definitely based on the belief that this leads people to be more in touch with the Good and more effective in pursuing it. That belief in turn is mostly informed by my own experience and reports from other people. I acknowledge that that likely doesn't sound very convincing to someone whose experience points in the exact opposite direction.

The feeling of breaking an Overton window

[I felt inclined to look for observations of this thing outside of the context of the pandemic.]

Some observations: 

I experience this process (either in full or the initial stages of it) for example when asked about my work (as it relates to EA, x-risks, AI safety, rationality and the like), or when sharing an ~unconventional plan (e.g. "I'll just spend the next few months thinking about this") when talking to e.g. old friends from when I was growing up, or people in the public sphere like a dentist, physiotherapist etc. This used to be also somewhat the case with my family but I've made some conscious (and successful) effort to reduce it.

My default reaction is [exaggerating a bit for the purpose of pulling out the main contours] to sort of duck; my brain kicks into a process that feels like "oh we need to fabricate a lie now, focus" (though, lying is a bit misleading - it's more like "what are the fewest, least revealing words I can say about this that will still be taken as a 'sufficient' answer"); my thinking feels restrained, quite the opposite of being able to think freely, clearly and calmly; often there is an experience (reminiscent) of something like shame; also some feeling of helplessness, "I can't explain myself" or "they won't understand me"; sometimes the question feels a bit intrusive, as if they wanted to come in(to my mind?) and break things. (?)

Some reflections:

  • "Inferential distance and the cost of explaining":
    • This is very viscerally salient to me in the moment when the "alien process" kicks in. I basically have the thought pattern of "They won't understand what I'm talking about. Mhh I guess I could explain it to them? But that will be lengthy and effortful, and I don't want to spend that effort."
      • I think this pragmatic consideration is often legitimate. At the same time I also suspect that my mind often uses this as an excuse/cover-up for something else.
      • For example, I am on average much less reluctant to give answers to such questions in English compared to German or French. I think about my work in English; thus, explaining my beliefs in another language is extra costly because it requires lots of non-trivial translation. That said, speaking German or French is also correlated with being in specific environments, notably environments I grew up in and that trigger memories of older self-conceptions of mine, and where I generally feel more expectations from society, or something.
  • "Updating based on someone's conclusions (including their observed behaviour) is often misleading (as opposed to updating based on someone's reasoning/map)":
    • Based on the above, if I inner sim telling someone about my belief X that is, say, slightly outside of their Overton window, I feel kinda doomy, like things will go wrong or at the very least it won't be useful. So, it feels like I either want to get the chance to sit down with them for 2h+ or say as little as possible about my belief X.
    • I think it's interesting to double click on what "things go wrong" means here. The two main things that come up are:
      • An epistemic worry: they will objectively-speaking make a wrong update and walk away with more wrong rather than less wrong beliefs
      • A ~social worry: all they will update about is me being ~weird. A decent part of the worry here is something like: they will distance themselves from me because they will feel like they can't talk to me/like we're not talking the same language, a sense of isolation. Another part seems more extreme: They will think I'm crazy(?) (I sort of cringe at this one. I don't really think they will think I'm crazy(?). Idk - there is something here, but I'm confused about what it is.)

On the nature of purpose

As far as I can tell, I agree with what you say - this seems like a good account of how the cryptographer's constraint cashes out in language.

To your confusion: I think Dennett would agree that it is Darwinian all the way down, and that their disagreement lies elsewhere. Dennett's account for how "reasons turn into causes" is made on Darwinian grounds, and it compels Dennett (but not Rosenberg) to conclude that purposes deserve to be treated as real, because (compressing the argument a lot) they have the capacity to affect the causal world.

Not sure this is useful?

On the nature of purpose

I'm inclined to map your idea of "reference input of a control system" onto the concept of homeostasis, homeostatic set points and homeostatic loops. Does that capture what you're trying to point at?

(Assuming it does) I agree that homeostasis is an interesting puzzle piece here. My guess for why this didn't come up in the letter exchange is that D/R are trying to resolve a related but slightly different question: the nature and role of an organism's conscious, internal experience of "purpose".

Purpose and its pursuit have a special role in how humans make sense of the world and themselves, in a way they don't for non-human animals (though it's not a binary).

The suggested answer to this puzzle is that, basically, the conscious experience of purpose and intent (and the allocation of this conscious experience to other creatures) is useful and thus selected for. 

Why? They are meaningful patterns in the world. An observer with limited resources who wants to make sense of the world (i.e. an agent that wants to do sample complexity reduction) can abstract along the dimension of "purpose"/"intentionality" to reliably get good predictions about the world. (Except, "abstracting along the dimension of intentionality" isn't an active choice of the observer, but rather a result of the fact that intentions are a meaningful pattern.) The "intentionality-based" prediction does well at ignoring variables that aren't very predictive and capturing the ones that are, in the context of a bounded agent.




 
