All of Jalen Lyle-Holmes's Comments + Replies

You also could just use this to disguise your 'style' if you want to say something anonymously going forward (doesn't work for stuff you've already got out there). Just ask an LLM to reword it in a different style before you post, could be a plugin or something, and then it can't be identified as being by you, right?

Yep that's it! Glad my explanation helped.
(Though if we want to be a bit pedantic about it, we'd say that a world where 21 heads in a row ever happens is actually not unlikely (if heaps and heaps of coin tosses happen across the world over time, as in our world), but a world where any particular given sequence of 21 coin flips is all heads is indeed very unlikely (before any of them have been flipped).)

Ah yes this was confusing to me for a while too, glad to be able to help someone else out with it!

The key thing to realise, for me, is that the probability of 21 heads in a row changes as you toss each of those 21 coins.

The sequence of 21 heads in a row does indeed have much less than 0.5 chance: to be precise, $0.5^{21}$, which is about 0.000000476837158. But it only has such a tiny probability before any of those 21 coins have been tossed. As soon as the first coin is tossed, the probability of those 21 coins all being heads changes. If firs...
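The arithmetic in this comment can be checked with a few lines of Python. This is only a minimal sketch, assuming a fair coin and independent tosses; the function name `prob_all_heads` is made up for illustration.

```python
from fractions import Fraction


def prob_all_heads(remaining_tosses: int) -> Fraction:
    """Probability that every remaining fair, independent toss comes up heads."""
    return Fraction(1, 2) ** remaining_tosses


# Before any coin is tossed, all 21 tosses still have to come up heads.
before_any = prob_all_heads(21)

# After 20 heads have already been observed, only one toss is left to go.
after_twenty_heads = prob_all_heads(1)

print(float(before_any))          # roughly 4.77e-07
print(float(after_twenty_heads))  # 0.5
```

The point of the sketch is that the two numbers answer different questions: the first is the chance of the whole run before it starts, the second is the chance of completing the run once 20 heads are already in the past.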

1mikbp4mo
I think you explain it very well!   So the thing is something like the following, right?: "Looking at it from the outside, a world where 21 heads showed in a row is incredibly unlikely: (if the coin is fair) I would happily bet against this world happening. However, I am already in an incredibly weird world where 20 heads have shown in a row, and another heads only makes it a bit more weird, so I don't know what to bet, heads or tails."

Love this way of pointing at this distinction!

Thanks! What do you mean about the cross referencing?

Good for any particular topic or just in general?

2Chris_Leong10mo

Oooo thanks for this, just used it to write a post for my blog and it was more fun and easier than usual: My Anxiety Keeps Persisting Even Though It Has Been Diagnosed, Rude

Thanks for sharing your ideas. I'm a bit confused about your core claim and would love it if you could clarify (or refer to the specific part of your writing that addresses these questions): I get the general gist of your claim, that AI alignment depends on whether humans can all have the same values, but I don't know how much 'the same' you mean. You say 'substantially' align, could you give some examples of how aligned you mean? For example, do you mean all humans sharing the same political ideology (libertarian/communist/etc.)? Do you mean that for a...

Yes, great question. Looking at programming in general, there seem to be many obvious counterexamples, where computers have certain capabilities ('features') that humans don't (e.g. doing millions of arithmetic operations extremely fast with zero clumsy errors) and likewise where they have certain problems ('bugs') that we don't (e.g. adversarial examples for image classifiers, which don't trip humans up at all but entirely ruin the neural net's classification).

Yes, if all humans agreed on everything, there would still be significant technical problems to get an AI to align with all the humans. Most of the existing arguments for the difficulty of AI alignment would still hold even if all humans agreed. If you (Henry) think these existing arguments are wrong, could you say something about why you think that, i.e. offer counterarguments?

Chris Aruffo has done some work on this: http://www.aruffo.com/eartraining/

I love this idea and would watch this stream!

These are a couple posts I came up with in a quick search, so not necessarily the best examples:

Covid 9/23: There Is a War

"The FDA, having observed and accepted conclusive evidence that booster shots are highly effective, has rejected allowing people to get those booster shots unless they are over the age of 65, are immunocompromised or high risk, or are willing to lie on a form. The CDC will probably concur.  I think we all know what this means. It means war! ..."

Covid 11/18: Paxlovid Remains Illegal

"It seems to continue to be the official position t

...

There's a 16 week Zoom book club coming up for Burns' book about TEAM-CBT, facilitated by a TEAM-CBT trainer, in case anyone is interested (starts Sep 8th 2021): https://www.feelinggreattherapycenter.com/book-club
(I just signed up)

To me it seems useful to distinguish two different senses of 'containing knowledge', and that some of your examples implicitly assume different senses. Sense 1: How much knowledge a region contains, regardless of whether an agent in fact has access to it (This is the sense in which the sunken map does contain knowledge) and 2. How much knowledge a region contains and how easily a given agent can physically get information about the relevant state of the region in order to 'extract' the knowledge it contains (This is the sense in which the go-kart with a da...

Hm, on reflection I actually don't think this does what I thought it did. Specifically, I don't think it captures the amount of 'complexity barrier' reducing the usability of the information. I think I was indeed equivocating between computational (space and time) complexity and Kolmogorov complexity. My suggestion captures the latter, not the former.

Also, some further Googling has told me that the expected absolute mutual information, my other suggestion at the end, is "close" to Shannon mutual information (https://arxiv.org/abs/cs/0410002) so doesn't seem like that's actually significantly different to the mutual information option which you already discussed.

Building off Chris' suggestion about Kolmogorov complexity, what if we consider the Kolmogorov complexity of the thing we want knowledge about (e.g. the location of an object) given the 'knowledge containing' thing (e.g. a piece of paper with the location coordinates written on it) as input?

Wikipedia tells me this is called the 'conditional Kolmogorov complexity' of $x$ (the thing we want knowledge about) given $y$ (the state of the region potentially containing knowledge), written $K(x \mid y)$.

(Chris I'm not sure if I understood all of your commen...
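To make the conditional Kolmogorov complexity idea a bit more concrete, here is a rough Python sketch. True Kolmogorov complexity is uncomputable, so this swaps in a standard compression heuristic instead: the extra compressed bytes needed to describe the thing once the region is already known. The names `approx_conditional_complexity`, `location`, `paper`, and `unrelated` are made up for illustration, and using `zlib` as the compressor is an assumption, not something from the original suggestion.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data."""
    return len(zlib.compress(data, 9))


def approx_conditional_complexity(x: bytes, y: bytes) -> int:
    """Crude proxy for K(x|y): extra compressed bytes needed for x once y is given."""
    return compressed_len(y + x) - compressed_len(y)


rng = random.Random(0)  # fixed seed so the toy data is reproducible

# Toy stand-ins: 'location' is the thing we want knowledge about.
location = bytes(rng.getrandbits(8) for _ in range(500))
paper = location  # a region that records the location exactly
unrelated = bytes(rng.getrandbits(8) for _ in range(500))  # a region with no relation to it

print(approx_conditional_complexity(location, paper))
print(approx_conditional_complexity(location, unrelated))
```

Under this proxy, the region that records the location leaves far fewer extra bytes to describe than the unrelated one. It says nothing about how hard the information is to extract in time or space, which is a separate question from description length.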

Interesting sequence so far!

Could we try like an 'agent relative' definition of knowledge accumulation?

e.g. Knowledge about X (e.g. the shape of the coastline) is accumulating in region R (e.g. the parchment) accessibly for an agent A (e.g. a human navigator) to the extent that agent A is able to condition its behaviour on X by observing R and not X directly. (This is borrowing from the Cartesian Frames definition of an 'observable' being something the agent can condition on).

If we want to break this down to lower level concepts than 'agents' a...

Thank you Alex! Just sent you a PM :)

Oh cool, I'm happy you think it makes sense!
I mean, could the question even be as simple as "What is an optimiser?", or "What is an optimising agent?"?
With the answer maybe being something roughly to do with
1. being able to give a particular Cartesian frame over possible world histories, such that there exists an agent 'strategy' ('way that the agent can be') $a \in A$ such that for some 'large' subset of possible environments $E' \subseteq E$, and some target set of possible worlds $S \subseteq W$, we have $a \cdot e \in S$ for all $e \in E'$

and 2. that...
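Condition 1 can be illustrated on a tiny finite example. The Python sketch below is only a toy under stated assumptions: a Cartesian frame is modelled as a set of agent strategies, a set of environments, and an evaluation function mapping each (strategy, environment) pair to a world, and 'large' is read as "at least a given fraction of environments". The function name `find_optimising_strategies`, the thermostat strategies, and the 0.75 threshold are all made up for illustration.

```python
def find_optimising_strategies(agents, envs, evaluate, target, min_fraction):
    """Return strategies whose outcome lands in `target` for at least
    `min_fraction` of the environments in `envs`."""
    found = []
    for a in agents:
        # Environments in which this strategy steers the world into the target set.
        good_envs = [e for e in envs if evaluate(a, e) in target]
        if len(good_envs) >= min_fraction * len(envs):
            found.append(a)
    return found


def evaluate(strategy, outside_temp):
    """Toy evaluation: the resulting 'world' is just the final room temperature."""
    if strategy == "heat_if_cold" and outside_temp < 20:
        return 20  # the heater brings the room up to 20
    return outside_temp  # otherwise the room matches the outside


agents = ["heat_if_cold", "always_off"]
envs = [10, 15, 20, 25]  # possible outside temperatures
target = {20}            # target set of worlds: room ends up at 20

print(find_optimising_strategies(agents, envs, evaluate, target, min_fraction=0.75))
```

Here `heat_if_cold` reaches the target in three of the four environments and `always_off` in only one, so only the former passes the threshold. How 'large' should really be measured is left open in the comment; the fraction used here is just one possible reading.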

4Alex Flint2y
Jalen, I would love to support you to turn this into a standalone post, if that is of interest to you. Perhaps we could jump on a call and discuss the ideas you are pointing at here.

Side question: After I read some of the Cartesian Frames sequence, I wondered if something cool could come out of combining its ideas with your ideas from your Ground of Optimisation post. Because: Ground of Optimisation 1. formalises optimisation but 2. doesn't split the system into agent and environment, whereas Cartesian Frames 1. gives a way of 'imposing' an agent-environment frame on a system (without pretending that frame is a 'top level' 'fundamental' property of the system), but 2. doesn't really deal with optimisation. So I've wondered if there might be something fruitful in trying to combine them in some way, but I'm not sure if this would actually make sense or be useful (I'm not a researcher!). What do you think?

2Alex Flint2y
Thanks for this note Jalen. Yeah I think this makes sense. I'd be really interested in what question might be answered by the combination of ideas from Cartesian Frames and The Ground of Optimization. I have the strong sense that there is a very real question to be asked but I haven't managed to enunciate it yet. Do you have any sense of what that question might be?

I assume you've read the Cartesian Frames sequence? What do you think about that as an alternative to the traditional agent model?
