All of Chipmonk's Comments + Replies

Synthesis says it worked with DARPA to make a tool to teach kids math:

(Also, I asked a friend who worked there about this last summer and, if I recall correctly, they said it does not use LLMs, it's something else.)

  • show up to office hours for classes you aren’t a part of, just to chat with the professor

this is how I became friends with Po-Shen Loh. I would be the only person to show up to office hours, and we wouldn't even talk about math.

oh this would make a lot more sense if true.

How were you getting SLIT for $25/week

I don't know, but the online shops that sell it are still $100/mo today.

how much did SLIT cost in total for you

$25/wk * 2 years

were the doses tailored to your particular allergies based on tests?

yes, there were several allergens in there, just the ones i was allergic to

Interesting, I don't know anything about the quality of different SLIT manufacturers, but $2600 sounds a lot more affordable than $7000. I'll try to remember to ask my allergist about this if I ever see them again.

The advice I wish I had earlier is:

LW is full of good writers

I disagree with this definitively. I can't read most if not almost all LW posts. You can do much better if you know what you want to communicate and only say the essential words. 

That’s interesting. I find it relaxing to read most LW posts/comments, which tempts me to call them good writers. Perhaps it’s not that they write “well” but that they think similar to me?

If you don't like fluoride polish you can instead bring your own nano-hydroxyapatite tooth polish. (It's essentially tooth polish made from [synthetic] teeth.) I ship this one from Japan (it's also sometimes available on US amazon).

2 · Nathan Helm-Burger · 25d
Oh yeah, I love nano-hydroxyapatite Japanese toothpaste. I've been buying it from Amazon for years.

Yes, see Agent membranes/boundaries and formalizing “safety” and davidad's comment. 

(Also, I'm not necessarily agreeing that your examples are not violations of boundaries. First one isn't a violation of end-person (although probably the farmer). Second one could be.)

in full generality, what's a "threat"?

in full generality, what's a "dangerous" collision?

Hm I'm not immediately sure how to define these

is that the defense is actually used to strike inside another's boundary, as has been the case for ~all weapons

Yeah, I am worried about this. 

This is notably not the case for infosec and encryption, where defensive capability doesn't imply offensive capability. However, I'm unsure if this is also true for any physical interventions. (e.g.:  Vaccines? No, bioweapons… Nanotech? No…)

That said, physical interventions do seem to be defense-dominant when there is coordination among a sufficiently large portion of society/power.

2 · the gears to ascension · 1mo
I don't think I'm convinced physical interactions are defense dominant. The easiest-to-formally-certify defense is to enclose something in a hunk of impenetrable matter, and that only can be certified up to a given impact energy level. Above that energy level, the defense will simply be stripped away. Only MAD seems able to be game theoretically durable, and certifying that a MAD situation will endure requires proving through a simulation of the opposition.

Thread of revisions to this post. (This post was originally published on 2024 Jan 3.)

Today, I revised it to be much more clear and complete.
If all you want is something like a shutdown button, then a timer is a good and probably-simpler way to achieve it. It does still run into many of the same general issues (various ontological issues, effects of other agents in the environment, how to design the "shutdown utility function", etc) but it largely sidesteps the things which are confusing about corrigibility specifically. The flip side is that, because it sidesteps the things which are confusing about corrigibility specifically, it doesn't offer much insight on how to tackle more general problems of corrigibility, beyond just the shutdown problem.

this exists for some of those --

Thank you! Also, alternatives

I think the more general form of the emotions thing is: reductionism and "i can't understand it consciously therefore it's not rational". 

The counter is deep respect for Chesterton's Fence.

This is also how many people get into woo

"selective, corrective, and structural" are kinda like:

  • Selective: Interventions that occur before the critical event. t < T_c
  • Corrective: Interventions that occur during the critical event. t ~= T_c
  • Structural: Interventions that occur after the critical event. t > T_c
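
A rough sketch of that timing split in Python, just to make the three cases concrete. The function name and the `tolerance` window (how close to T_c still counts as "during") are assumptions added here for illustration; the original only gives the three inequalities.

```python
def classify_intervention(t: float, t_c: float, tolerance: float = 1.0) -> str:
    """Label an intervention by when it happens relative to the critical event.

    t         -- time of the intervention
    t_c       -- time of the critical event (T_c above)
    tolerance -- hypothetical window around t_c that counts as "during"
    """
    if abs(t - t_c) <= tolerance:
        return "corrective"   # t ~= T_c
    if t < t_c:
        return "selective"    # t < T_c
    return "structural"       # t > T_c


# Example: critical event at t = 10
labels = [classify_intervention(t, 10.0) for t in (2.0, 10.5, 30.0)]
```

How wide the "during" window should be is a judgment call; with `tolerance=0` only an intervention at exactly T_c counts as corrective.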

The non-linear drama might be another good case study


Here's a tricky example I've been thinking about:

Is a cell getting infected by a virus a boundary violation?

What I think makes this tricky is that viruses generally don't physically penetrate cell membranes. Instead, cells just "let in" some viruses (albeit against their better judgement). 

Then once you answer the above, please also consider:

Is a cell taking in nutrients from its environment a boundary violation?

I don't know what makes this different from the virus example (at least as long as we're not allowed to refer to preferences).

Where is the 2024 fellowship going to be?

edit: oh i see website

  • Fellows can join the program from anywhere in the world. We provide tailored solutions to support fellows in their research, such as access to one of several AI safety-specific office spaces in Europe or the US, or support to visit their mentors in-person.

If you're an agent observing other agents, other agents stop seeming sovereign from your perspective if you can mind-read those agents in high resolution.

And "Vingean Agency" is the opposite of that, got it.

any proposed actions which can be meaningfully interpreted by sandboxed human-level supervisory AIs as messages with nontrivial semantics could be rejected.

I want to give a big +1 on preventing membrane piercing not just by having AIs respect membranes, but also by using technology to empower membranes to be stronger and better at self-defense.

Thanks for writing this! I largely agree (and the rest I need to think more about)

thx to friend:

The key scholars are Amartya Sen and Martha Nussbaum. SEP is a good place to get oriented:

thanks just listened to it. This reminds me a lot of what Scott Garrabrant has been thinking about. Perhaps intentionally setting up membranes within a society so that failure/infection/etc. in one region doesn't infect every other region. They talk about the same thing but for insight in solving problems

2 · Alexander Gietelink Oldenziel · 2mo
Iirc it's in the original boundaries sequence. Sorry I'm too lazy to look it up. It's the part about natural bargaining points and people being able to 'go home'

Psychology may [… be …] ultimately very important, perhaps critically important

oh good, my other project is psychology, so I really hope it circles back.

I originally got into the AI safety stuff because I was trying to understand psychological boundaries properly and realized that no one did. Then I realized that many people were maybe doing the same thing when thinking about interactions between AIs and humans.

This is also maybe one of my qualms about anchoring this idea on the words "membranes" and "boundaries"... i think the actual structure is a bit more continuous. "Membranes" are a nice anchor abstraction for other reasons though so I'm sticking with it for now

Hmm I'm confused about why people ask "where is the boundary?" in this situation. 

I visualize this situation as more of a topological map, kinda like

[Image: generated by DALL·E]

where the horizontal axes are kinda like physical space (more precisely: such that pieces (e.g.: your arm, your brain, web search, personal note taking app) that are more closely connected are closer horizontally)

and the vertical axis is like "communication bandwidth". For example, the communication bandwidth (ability to sense and control) between your brain/nervous system and your arm is pretty darn high...
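
One way to play with this picture is a toy sketch in Python: treat the pieces as nodes, give each link a bandwidth score, and see what counts as "inside" at different cutoffs. The link scores and the `region_around` helper below are made up for illustration and are not taken from any actual measurement or from the original comment.

```python
from collections import defaultdict

# Hypothetical bandwidth scores in [0, 1]; illustrative numbers only.
LINKS = {
    ("brain", "arm"): 0.9,
    ("brain", "note app"): 0.3,
    ("brain", "web search"): 0.1,
}


def region_around(start, links, min_bandwidth):
    """Return every piece reachable from `start` through links whose
    bandwidth is at least `min_bandwidth`."""
    neighbours = defaultdict(set)
    for (a, b), bandwidth in links.items():
        if bandwidth >= min_bandwidth:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen


tight = region_around("brain", LINKS, 0.8)    # brain + arm
looser = region_around("brain", LINKS, 0.2)   # also the note app
loosest = region_around("brain", LINKS, 0.05) # also web search
```

Lowering the cutoff grows the region step by step, which is the sense in which "where is the boundary?" has no single answer in this picture: each cutoff gives a different contour line on the map.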

2 · the gears to ascension · 2mo
well maybe it can be as a backstop but what about, idk, dogs? or just humans that aren't in the protected group, eg people outside a state?

essentially also the "membranes for safety" idea, but Critch takes it in a broader and more civilizational direction

could you restate your argument again plainly i missed it

2 · the gears to ascension · 2mo
groups working together to be an interaction have an aggregate meta-membrane: for any given group who are participating in an interaction of some kind, the fact of their connection is a property of {the mutual information of some variables about their locations and actions or something} that makes them act as a semicoherent shared being, and we call that shared being "a friendship", "a romance", "a family", "a party", "a talk", "an event", "a company", "a coop", "a neighborhood", "a city", etc etc etc. each of these will have a different level of membrane defense depending on how much the participants in the thing act to preserve it. in each case, we can recognize some unreliable pattern of membrane defense. typically the membrane gets stretched through communication tubes, I think? consider how loss of internet feels like something cutting a stretched membrane that was connecting you. this seems like an obvious consequence of not getting to specify "organism" in any straightforward way; we have to somehow information theoretically specify something that will simultaneously identify bacteria and organisms, and then we need some sort of weighting that naturally recognizes individual humans. those should end up getting the majority of the weight in the integral, of course, but it shouldn't need to be hardcoded.

Non-technical topics thread

  • geopolitics, sovereignty

The concept of “separation of tasks” from Adlerian Psychology. This is where many of my intuitions originally came from. This isn't technical though.

3 · Roman Leventov · 2mo
Psychology may not be "technical enough" because an adequate mathematical science or process theory is not developed for it, yet, but it's ultimately very important, perhaps critically important: see the last paragraph of Davidad apparently thinks that it can be captured with an Infra-Bayesian model of a person/human. Also on psychology: what is the boundary of personality, where just a "role" (spouse, worker, etc) turns into multiple-personality disorder?

(This all seems very Santa Fe. Anything from the Santa Fe Institute to link?)

4 · Roman Leventov · 2mo
In the most recent episode of his podcast show, Jim Rutt (former president of SFI) and his guest talk about membranes a lot, the word appears 30 times on a transcript page:
4 · Roman Leventov · 2mo
Related, consciousness frame: where is the boundary of it? Is our brain conscious, or the whole nervous system, or the whole human, or the whole human + the entire microbiome populating them, or human + robotic prosthetic limbs, or human + web search + chat AI + personal note taking app, or the whole human group (collective consciousness), etc. Some computational theories of consciousness attempt to give a specific, mathematically formalised answer to this question.

Cyborgism (for empowering membranes / making them better at defending themselves)

Also Capabilitarianism

Does anyone have good links for these^?

7 · Roman Leventov · 2mo
Related, quantum information theory: "The physical meaning of the Holographic Principle", and many other related papers on the Quantum Free Energy Principle at
2 · the gears to ascension · 2mo
(I don't follow the relationship. clarify or don't at your whim)

from the post:

a minimal “membrane” for each agent/moral patient humanity values

Many membranes (ie: many possible Markov blankets if you could observe them all) are not valued, empirically.

2 · the gears to ascension · 2mo
right, which translates into, it's not a uniform integral, there's some sort of weighting. but I don't retract my argument that the moral value of my relationship with my friend means that me and my friend acting together as a friendship means that the friendship has a membrane. How familiar are you with social network analysis? if not very, I'd suggest speedwatching at least the first half hour of which should take 15m at 2x speed. I suggest this because of the way the explanation and the visual intuitions give a framework for reasoning about networks of people. we also need to somehow take into account when membranes dissipate but this isn't a failure according to the individual beings.

kill its family

Huh interesting. 

To be clear, I think this probably emotionally harms most humans, but ultimately that's up to whatever interpretations and emotional beliefs that person has (which are all flexible, in principle).

The counter thought would be that there are different dimensions of the membrane that extend over parts of the world. For example part of my membranes extend over the things I care about, and things that affect my survival.


The question then becomes how to quantify these different membranes and in terms of interacting wit...
2 · the gears to ascension · 2mo
if I and my friend work together well, aren't we an aggregate being that has a new membrane that needs protecting?

Edit: just see Davidad's comment

Hmmm. It's becoming apparent to me that I don't want to regard membrane piercing as a necessarily objective phenomenon. Membrane piercing certainly isn't always visible from every perspective.

That said, I think it's still possible to prevent "membrane piercing", even if whether it occurred can be somewhat subjective. 

Responding to some of your examples:

Is it piercing a membrane if I speak and it distracts you, but I don't touch you otherwise

Again: I don't actually care so much about whether this is or isn't a membrane p...

2 · the gears to ascension · 2mo
I sort of agree, but my food sources are not my property, they're a farmer's property. I edited numbers into my questions, could you edit to make your response numbered and get each one?

See my recent post where I talk about membranes in cybersecurity in one of the sections

Hm right now I only see asserting properties about the state of the membrane, and not about anything inside

2 · the gears to ascension · 2mo
I conjecture that someone will be able to prove that in expectation over properties of the membrane (call the random variable P), properties P you wish to assert about the state of the membrane without reference to the inside of the membrane are strongly probably either insufficient, and therefore allow adversaries to "damage" the insides of the membrane, or the given P is overly constraining and itself "damages" the preferences of the being inside the membrane; where "damage" means "move towards a state dispreferred by a random variable U of what partial utility function the inside of the membrane implies". this is a counting argument, and those have been taking some flak lately, but my point is that we need to do better than simplicity-weighted random properties of the membrane.