All of Nora_Ammann's Comments + Replies

Here is another interpretation of what can cause a lack of robustness to scaling down: 

(Maybe this is what you have in mind when you talk about single-single alignment not (necessarily) scaling to multi-multi alignment - but I am not sure that is the case, and even if it is, I feel pulled to state it again, as I don't think it comes out as clearly as I would want it to in the original post.)

Taking the example of an "alignment strategy [that makes] the AI find the preferences and values of humans, and then pursu[e] that", robustness to scaling ... (read more)

Curious what different aspects the "duration of seclusion" is meant to be a proxy for? 

You definitely point at things like "when are they expected to produce intelligible output" and "what sorts of questions appear most relevant to them". Another dimension that came to mind - though I am not sure whether you mean to include it in the concept - is something like "how often are they allowed/able to peek directly at the world, relative to the length of the periods during which they reason about things in ways that are removed from empirical data"? 

5Duncan_Sabien1y
As Henry points out in his comment, certainly at least some 1,000 and 10,000-day monks must need to encounter the territory daily. I think that for some monks there is probably a restriction to actually not look for the full duration, but for others there are probably more regular contacts.

I think that one thing the duration of seclusion is likely to be a firm proxy for is "length of time between impinging distractions." Like, there is in fact a way in which most people can have longer, deeper thoughts while hiking on a mountainside with no phone or internet, and which is severely curtailed even by having phone or internet for just 20 minutes per day at a set time.

So I think that even if a monk is in regular contact with society, the world, etc., there's something like a very strong protection against other people claiming that the monk owes them time/attention/words/anything.

PIBBSS Summer Research Fellowship -- Q&A event

  • What? Q&A session with the fellowship organizers about the program and application process. You can submit your questions here.
  • For whom? For everyone curious about the fellowship and for those uncertain whether they should apply.
  • When? Wednesday 12th January, 7 pm GMT
  • Where? On Google Meet, add to your calendar


I think it's a shame that, for many people these days, the primary connotation of the word "tribe" is tied to the culture wars. In fact, our decision to use this term was in part motivated by wanting to reclaim it for something less politically loaded.

As you can read in our post (see "What is a tribe?"), we mean something particular by it. Like any collective of human beings, it can in principle be subject to excessive in-group/out-group dynamics, but that is by far not the only, nor the most interesting, part of it. 

2jbash1y
I did actually read that. I admit that I didn't read all the detailed advice about how to make one work, since I have no intention of doing so... but I did read the definition and the introductory part. It wouldn't have mattered what word you'd used. Your groups are actually smaller than most things called tribes anyway.

I am reacting to the substance. I doubt that humans are, in a practical way, capable of tightening up their in-groups like that without at the same time increasing hostility to out-groups (or at least to people who are out of the group). Not in principle, but in practice. If nothing else, you have to start by giving some kind of preference to members of the tribe. And, since it's about mutual aid with certain costs, you have to enforce its boundaries. And set up norms about what you can and can't do and still be "in" (which will not all be formally considered, will not all be under organized control, and yet will involve enough people that they can't easily be changed, challenged, or made too complicated).

I suspect that the specific scale of "up to the limit of the number of people who can all personally know each other" is a particularly dangerous one. For one thing, it means that at the edges of the group, you will often know, and have some special duty toward, the person or people on one side of some brewing conflict... but you will NOT know or feel any special duty toward the person or people on the other side. For another, it's probably the scale at which people most often had occasion to attack each other in the "evolutionary environment". For a third, it means you're always at risk of growing to the point of having to split the group, with no obvious way to handle that without generating acrimony. You may address that last one in your detailed material; I don't know.

It's true, though, that the word "tribe" is kind of attached to that kind of concern. And there must be a reason why the word got a bad name, as well as a reason you...

Context:  (1) Motivations for fostering EA-relevant interdisciplinary research; (2) "domain scanning" and "epistemic translation" as a way of thinking about interdisciplinary research

[cross-posted to the EA forum in shortform]
 

List of fields/questions for interdisciplinary AI alignment research

The following list of fields and leading questions could be interesting for interdisciplinary AI alignment research. I started to compile this list to provide some anchorage for evaluating the value of interdisciplinary research for EA causes, specifical... (read more)

Glad to hear it seemed helpful!

FWIW I'd be interested in reading you spell out in more detail what you think you learnt from it about simulacra levels 3+4.

Re "writing the bottom line first": I'm not sure. I think it might be, but at least this connection didn't feel salient, or like it would buy me anything in terms of understanding, when thinking about this so far. Again interested in reading more about where you think the connections are. 

To maybe say more about why (so far) it didn't seem clearly relevant to me: "Writing the bottom line first", to ... (read more)

3romeostevensit2y
I meant that empty expectations are another anchor for antidoting writing the bottom line first. As for simulacra levels: This highlights how we switch abstraction levels when we don't know how to solve a problem on the level we're on. This is a reasonable strategy in general that sometimes backfires.

Regarding "Staying grounded and stable in spite of the stakes": 
I think it might be helpful to unpack the virtue/skill(s) involved according to the different timescales at which emergencies unfold. 

For example: 

1. At the timescale of minutes or hours, there is a virtue/skill of "staying level-headed in a situation of acute crisis". This is the sort of skill you want your emergency doctor or firefighter to have. (When you pointed to the military, I think you were in part pointing to this scale, but I assume not only to it.)

From talking to people who do ... (read more)

Re language as an example: parties involved in communication using language have comparable intelligence (and even there I would say someone just a bit smarter can cheat their way around you using language). 

Mhh, yeah, I agree these are examples of ways in which language "fails". But I think they don't bother me too much? 
I put them in the same category as "two agents acting in good faith sometimes miscommunicate - and still, language overall is pragmatically reliable", or "works well enough". In other words, even though there is potential for exploitation, that ... (read more)

a cascade of practically sufficient alignment mechanisms is one of my favorite ways to interpret Paul's IDA (Iterated Distillation-Amplification)

Yeah, great point!

However, I think its usefulness hinges on ability to robustly quantify the required alignment reliability / precision for various levels of optimization power involved. 

I agree and think this is a good point! I think on top of quantifying the required alignment reliability "at various levels of optimization" it would also be relevant to take the underlying territory/domain into account. We can say that a territory/domain has a specific epistemic and normative structure (which e.g. defines the error margin that is acceptable, or tracks the co-evolutionary dynamics). 


 

Pragmatically reliable alignment
[taken from On purpose (footnotes); sharing this here because I want to be able to link to this extract specifically]

AI safety-relevant side note: The idea that translations of meaning need only be sufficiently reliable in order to be reliably useful might provide an interesting avenue for AI safety research. 

Language works, as evidenced by the striking success of human civilisations, made possible through advanced coordination, which in turn requires advanced communication. (Sure, humans miscommunicate what feels like a w... (read more)

The question you're pointing at is definitely interesting. A Freudian, slightly pointed way of phrasing it is something like: are humans' deepest desires, in essence, good and altruistic, or violent and selfish? 

My guess is that this question is wrong-headed. For example, I think it makes the mistake of drawing a dichotomy and rivalry between my "oldest and deepest drives" and "reflective reasoning", such that, depending on your conception of which of these two wins, your answer to the above question ends up being positive or negative. I don't... (read more)

[I felt inclined to look for observations of this thing outside of the context of the pandemic.]

Some observations: 

I experience this process (either in full or just its initial stages) for example when asked about my work (as it relates to EA, x-risks, AI safety, rationality and the like), or when sharing ~unconventional plans (e.g. "I'll just spend the next few months thinking about this") when talking to e.g. old friends from when I was growing up, or people in the public sphere like a dentist, physiotherapist, etc. This used to be also somewhat the cas... (read more)

As far as I can tell, I agree with what you say - this seems like a good account of how the cryptographer's constraint cashes out in language. 

To your confusion: I think Dennett would agree that it is Darwinian all the way down, and that their disagreement lies elsewhere. Dennett's account of how "reasons turn into causes" is made on Darwinian grounds, and it compels Dennett (but not Rosenberg) to conclude that purposes deserve to be treated as real, because (compressing the argument a lot) they have the capacity to affect the causal world.

Not sure this is useful?

I'm inclined to map your idea of "reference input of a control system" onto the concept of homeostasis, homeostatic set points and homeostatic loops. Does that capture what you're trying to point at?

(Assuming it does) I agree that homeostasis is an interesting puzzle piece here. My guess for why this didn't come up in the letter exchange is that D/R are trying to resolve a related but slightly different question: the nature and role of an organism's conscious, internal experience of "purpose". 
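To make the mapping concrete, here is a minimal toy sketch (purely illustrative; the temperature set point, gain, and function name are made up and not taken from the letter exchange): a homeostatic loop is just a negative-feedback control loop whose "reference input" is the set point.

```python
# Toy homeostatic loop as a negative-feedback controller.
# The "reference input" of the control system plays the role of the homeostatic set point.

def homeostatic_loop(set_point=37.0, initial=39.0, gain=0.3, steps=10):
    """Drive a sensed variable (e.g. body temperature) toward the set point."""
    state = initial
    for step in range(steps):
        error = set_point - state   # comparator: reference input minus sensed value
        correction = gain * error   # effector response, proportional to the error
        state += correction         # the correction feeds back into the state
        print(f"step {step}: state={state:.2f}, error={error:+.2f}")
    return state

homeostatic_loop()
```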

Purpose and its pursuit have a special role in how hu... (read more)

2Richard_Kennaway2y
I would map "homeostasis" onto "control system", but maybe that's just a terminological preference.

The internal experience of purpose is a special case of internal experience, explaining which is the Hard Problem of Consciousness, which no-one has a solution for. I don't see a reason to deny this sort of purpose to animals, except to the extent that one would deny all conscious experience to them. I am quite willing to believe that (for example) cats, dogs, and primates have a level of consciousness that includes purpose.

The evolutionary explanation does not make any predictions. It looks at what is, says "it was selected for", and confabulates a story about its usefulness. Why do we have five fingers? Because every other number was selected against. Why were they selected against? Because they were less useful. How were they less useful? They must have been, because they were selected against.

Even if some content were put into that, it still would not explain the thing that was to be explained: what is purpose? It is like answering the question "how does a car work?" by expatiating upon how useful cars are.

In regards to "the meaning of life is what we give it", that's like saying "the price of an apple is what we give it". While true, it doesn't tell the whole story. There are actual market forces that dictate apple prices, just like there are actual Darwinian forces that dictate meaning and purpose.

Agree; the causes that we create ourselves aren't all that governs us - in fact, it's a small fraction of that, considering physical, chemical, biological, game-theoretic, etc. constraints. And yet, there appears to be an interesting difference between the causes t... (read more)

I'm confused about the "purposes don't affect the world" part. If I think my purpose is to eat an apple, then there will not be an apple in the world that would have otherwise still been there if my purpose wasn't to eat the apple. My purpose has actual effects on the world, so my purpose actually exists.

So, yes, basically this is what Dennett reasons in favour of, and what Rosenberg is skeptical of. 

I think the thing here that needs reconciliation - and what Dennett is trying to do - is to explain why,  in your apple story, it's justified to use... (read more)

Thanks :)

> I will note that I found the "Rosenberg's crux" section pretty hard to read, because it was quite dense. 

Yeah, you're right - thanks for the concrete feedback! 

I wasn't originally planning to make this a public post, and I later failed to take a step back and properly model what it would be like for a reader without the context of having read the letter exchange. 

I'm considering adding a short intro paragraph to partially remedy this.  

3jacobjacob2y
Makes sense! An intro paragraph could be good :) 

While I'm not an expert, I did study political science and am Swiss. I think this post paints an accurate picture of important parts of the Swiss political system. I also admire how nicely it explains the basic workings of a naturally fairly complicated system.

If people are interested in reading more about Swiss Democracy and its underlying political/institutional culture (which, as pointed out in the post, is pretty informal and shaped by its historic context), I can recommend this book: https://www.amazon.com/Swiss-Democracy-Solut... (read more)

Are there any existing variolation projects that I can join?

FWIW, this is the one I know of: https://1daysooner.org/

That said, the last time I got an update from them (~1 month ago), any execution of these trials was still at least a few months away. (You could reach out to them via the website for more up-to-date information.) Also, there is a limited number of places where the trials can actually take place, so you'd have to check whether there is anything close to where you are.

(Meta: This isn't necessarily an endorsement of your main question.)

That's cool to hear!

We are hoping to write up our current thinking on ICF at some point (although I don't expect it to happen within the next 3 months) and will make sure to share it.

Happy to talk!