Jonathan Claybrough

Software engineer who transitioned into AI safety, teaching, and strategy. Particularly interested in psychology, game theory, system design, and economics.

Comments

There's tacit knowledge in Bay rationalist conversation norms that I'm discovering and thinking about; here's an observation and a related thought. (I put the example after the generalisation because that's my preferred style; feel free to read in the other order.)

Willingness to argue righteously and hash things out to the end, repeated over many conversations, makes it more salient when you're pursuing a dead-end argument. That salience can inspire you to argue more concisely and to the point over time. 
Going to the end of things generates ground data on which to update your models of arguing and conversation paths, instead of leaving things unanswered. 
So, though it's skilful to know when not to "waste" time on details and unimportant disagreements, the norm of "frequently enough going through till everyone agrees" seems profoundly virtuous. 

Short example from today: I say "good morning". They point out that it's not morning (it's 12:02). I comment that 2 minutes is not that much. They argue that 2 minutes is definitely more than zero, and that's the important cut-off. 
I realize that "2 minutes is not that much" was not my true rebuttal; that next token my brain generated was mostly defensive reasoning rather than curious exploration of why they disagreed with my statement. Next time I could instead note that they're using "morning" with a different definition/central cluster than I do, appreciate that they pointed this out, and decide whether I want to explore this discrepancy or not. 

Many things don't make sense if you're just doing them for local effect, but do when you consider long-term gains. (Something something naive consequentialism vs virtue-ethics-flavored stuff.)

I don't strongly disagree, but I do weakly disagree on some points, so I guess I'll answer.

Re the first: if you buy into automated alignment work by human-level AGI, then trying to align ASI now seems less worth it. The strongest counterargument to this I see is that "human-level AGI" is impossible to get with our current understanding, as it will be superhuman at some things and weirdly bad at others.

Re the second: the disagreement might be nitpicking on "few other approaches" vs "few currently pursued approaches". There are probably a bunch of things that would allow fundamental understanding if they panned out (various agent foundations agendas, provably safe AI agendas like davidad's), though one can argue they won't apply to deep learning or are less promising to explore than SLT.

I don't think your second footnote sufficiently addresses the large variance in 3D visualization abilities (note that I do say visualization, which includes seeing a 2D video in your mind of a 3D object and manipulating it smoothly). Overall I'm not sure what you're getting at if you don't ground your post in specific predictions about what you expect people can and cannot do thanks to their ability to visualize 3D. 

You might be ~conceptually right that our eyes see "2D" and add depth, but *um ackshually*, two eyes each receiving 2D data means you've received 4D input (using ML conventions, you've got 4 input dimensions per time unit, 5 overall in your tensor). It's very redundant, and that redundancy mostly allows you to extract depth using a local algorithm, which lets you build a 3D map in your mental representation. I don't get why you claim we don't have a 3D map at the end.
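
(To make the dimension counting concrete, here's a quick sketch, not from the original post; the colour-channel axis and the specific resolutions are my own assumptions about how to reach 4 axes per time unit and 5 overall.)

```python
import numpy as np

# One possible layout of binocular visual input as an ML-style tensor.
# Shapes are arbitrary placeholders; the colour-channel axis is my assumption
# for how the "4 input dimensions per time unit, 5 overall" count works out.
T, EYES, CHANNELS, H, W = 30, 2, 3, 480, 640

per_time_step = np.zeros((EYES, CHANNELS, H, W))     # 4 axes per time unit
full_stream = np.zeros((T, EYES, CHANNELS, H, W))    # 5 axes with time included

print(per_time_step.ndim, full_stream.ndim)  # -> 4 5

# The two eye images are largely redundant; a local comparison between them
# (disparity) is what lets you recover depth and build a 3D map.
```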

Back to concrete predictions: are there things you expect a strong human visualizer couldn't do? To give intuition, I'd say a strong visualizer has at least the equivalent visualizing, modifying, and measuring capabilities of SolidWorks/Blender in their mind. You tell one to visualize a 3D object they know, and they can tell you anything about it.

It seems to me the most important thing you noticed is that in real life we rarely see past the surface of things (the spectrum of light we see doesn't penetrate most materials), so most people don't know the insides of 3D objects very well. But that can be explained by lack of exposure rather than an inability to understand 3D. 

Fwiw, looking at the spheres I guessed a volume ratio of approximately 2.5. I'm curious: if you visualized yourself picking up these two spheres one after the other, imagining them made of a dense metal, could you feel that one is 2.3 times heavier than the other?
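
(For reference, here's the arithmetic I'm relying on; the only assumption is that both spheres have the same density, so the weight ratio equals the volume ratio.)

$$\frac{V_2}{V_1} = \left(\frac{r_2}{r_1}\right)^{3} \quad\Longrightarrow\quad \frac{r_2}{r_1} = \left(\frac{V_2}{V_1}\right)^{1/3}, \qquad 2.3^{1/3} \approx 1.32, \quad 2.5^{1/3} \approx 1.36.$$

So guessing 2.5 when the true ratio is 2.3 only corresponds to misjudging the radius ratio by about 3%.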

I'll give fake internet points to whoever actually follows the instructions and posts photographic proof.

The naming might be confusing because "pivotal act" sounds like a one-time action, but in most cases getting to a stable world without any threat from AI requires ongoing pivotal processes. This makes almost all the destructive approaches moot (and they're probably already ruled out by ethical concerns and the many others already discussed), because you'll make yourself a pariah.

The most promising avenue for a pivotal act/pivotal process that I know of is doing good research so that ASI risks are known and proven, doing good outreach and education so that most world leaders and decision makers are well aware of them, and helping set up good governance worldwide to monitor and limit the development of AGI and ASI until we can control it.

I recently played Outer Wilds and Subnautica, and the exercise I recommend for both of these games is: get to the end of the game without ever failing. 
In Subnautica, failing means dying even once; in Outer Wilds it's a spoiler to describe what failing is (successfully getting to the end could certainly be argued to be a fail).
I failed in both. I played Outer Wilds first and was surprised by my failure, which inspired me to play Subnautica without dying. I got pretty far but eventually died from a mix of one unexpected game mechanic, careless measurement of another mechanic, and a lack of redundancy in my contingency plans. 

Oh wow, that makes sense. It felt weird that you'd spend so much time on posts, yet if you didn't spend much time it would mean you write at least as fast as Scott Alexander. Well, thanks for putting in the work. I probably don't publish much because I want good posts to not take much work, so it's reassuring to hear it's normal that they do.

(Aside: I generally like your posts' scope and clarity; mind saying how long it takes you to write something of this length?)

Self-modeling is a really important skill, and you can measure how good you are at it by writing predictions about yourself. A notably important one for people who have difficulty with motivation is predicting your own motivation: will you be motivated to do X in situation Y?

If you can answer that question generally, you can plan to actually do anything you could theoretically do, using the following algorithm: from current situation A, to achieve wanted outcome Z, find a predecessor situation Y from which you'll be motivated to get to Z (e.g. having written 3 of an essay's 4 paragraphs), then a predecessor situation X from which you'll get to Y, and iterate until you reach A (or forward-chain, from A to Z). Check that you'll indeed be motivated at each step of the way.

How can the above plan fail? Either you were mistaken about yourself, or about the world. Figure out which and iterate.
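
(Here's a minimal sketch of that backward-chaining loop; `predecessors` and `motivated` are hypothetical stand-ins for your world-model and self-model, and the essay example is just a toy.)

```python
# Backward-chain from a wanted outcome Z to the current situation A,
# keeping only steps you predict you'll actually be motivated to take.
def plan_backwards(current, goal, predecessors, motivated, max_depth=20):
    """Return a list of situations [current, ..., goal], or None if no
    motivated chain is found within max_depth steps."""
    chain = [goal]
    state = goal
    for _ in range(max_depth):
        if state == current:
            return list(reversed(chain))
        # Find a predecessor situation from which you'd be motivated
        # to move to the current link of the chain.
        options = [p for p in predecessors(state) if motivated(p, state)]
        if not options:
            return None  # mistaken about yourself or the world: revise and iterate
        state = options[0]
        chain.append(state)
    return None


# Toy usage with a hypothetical essay example.
if __name__ == "__main__":
    preds = {
        "essay done": ["3/4 paragraphs written"],
        "3/4 paragraphs written": ["outline written"],
        "outline written": ["sitting at desk"],
    }
    plan = plan_backwards(
        current="sitting at desk",
        goal="essay done",
        predecessors=lambda s: preds.get(s, []),
        motivated=lambda frm, to: True,  # replace with your real self-prediction
    )
    print(plan)
    # -> ['sitting at desk', 'outline written', '3/4 paragraphs written', 'essay done']
```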

I appreciate the highlight of identity as this important/crucial self-fulfilling prophecy; I use that frame a lot.

What does the title mean? Since they all disagree, I don't see one as being more of a minority than the others. 
