Book 5 of the Sequences Highlights

To understand reality, especially on confusing topics, it's important to understand the mental processes involved in forming concepts and using words to speak about them.

First Post: Taboo Your Words
Linch
Anthropic issues questionable letter on SB 1047 (Axios). I can't find a copy of the original letter online. 
Great quote, & chilling: (h/t Jacobjacob)

> The idea of Kissinger seeking out Ellsberg for advice on Vietnam initially seems a bit unlikely, but in 1968 Ellsberg was a highly respected analyst on the war who had worked for both the Pentagon and Rand, and Kissinger was just entering the government for the first time. Here’s what Ellsberg told him. Enjoy:
>
> “Henry, there’s something I would like to tell you, for what it’s worth, something I wish I had been told years ago. You’ve been a consultant for a long time, and you’ve dealt a great deal with top secret information. But you’re about to receive a whole slew of special clearances, maybe fifteen or twenty of them, that are higher than top secret.
>
> “I’ve had a number of these myself, and I’ve known other people who have just acquired them, and I have a pretty good sense of what the effects of receiving these clearances are on a person who didn’t previously know they even existed. And the effects of reading the information that they will make available to you.
>
> “First, you’ll be exhilarated by some of this new information, and by having it all — so much! incredible! — suddenly available to you. But second, almost as fast, you will feel like a fool for having studied, written, talked about these subjects, criticized and analyzed decisions made by presidents for years without having known of the existence of all this information, which presidents and others had and you didn’t, and which must have influenced their decisions in ways you couldn’t even guess. In particular, you’ll feel foolish for having literally rubbed shoulders for over a decade with some officials and consultants who did have access to all this information you didn’t know about and didn’t know they had, and you’ll be stunned that they kept that secret from you so well.
>
> “You will feel like a fool, and that will last for about two weeks. Then, after you’ve started reading all this daily intelligence input and become used to using what amounts to whole libraries of hidden information, which is much more closely held than mere top secret data, you will forget there ever was a time when you didn’t have it, and you’ll be aware only of the fact that you have it now and most others don’t….and that all those other people are fools.
>
> “Over a longer period of time — not too long, but a matter of two or three years — you’ll eventually become aware of the limitations of this information. There is a great deal that it doesn’t tell you, it’s often inaccurate, and it can lead you astray just as much as the New York Times can. But that takes a while to learn.
>
> “In the meantime it will have become very hard for you to learn from anybody who doesn’t have these clearances. Because you’ll be thinking as you listen to them: ‘What would this man be telling me if he knew what I know? Would he be giving me the same advice, or would it totally change his predictions and recommendations?’ And that mental exercise is so torturous that after a while you give it up and just stop listening. I’ve seen this with my superiors, my colleagues….and with myself.
>
> “You will deal with a person who doesn’t have those clearances only from the point of view of what you want him to believe and what impression you want him to go away with, since you’ll have to lie carefully to him about what you know. In effect, you will have to manipulate him. You’ll give up trying to assess what he has to say. The danger is, you’ll become something like a moron. You’ll become incapable of learning from most people in the world, no matter how much experience they may have in their particular areas that may be much greater than yours.”
>
> ….Kissinger hadn’t interrupted this long warning. As I’ve said, he could be a good listener, and he listened soberly. He seemed to understand that it was heartfelt, and he didn’t take it as patronizing, as I’d feared. But I knew it was too soon for him to appreciate fully what I was saying. He didn’t have the clearances yet.
jacobjacob
Someone posted these quotes in a Slack I'm in... what Ellsberg said to Kissinger:

> “Henry, there’s something I would like to tell you, for what it’s worth, something I wish I had been told years ago. You’ve been a consultant for a long time, and you’ve dealt a great deal with top secret information. But you’re about to receive a whole slew of special clearances, maybe fifteen or twenty of them, that are higher than top secret.
>
> “I’ve had a number of these myself, and I’ve known other people who have just acquired them, and I have a pretty good sense of what the effects of receiving these clearances are on a person who didn’t previously know they even existed. And the effects of reading the information that they will make available to you.
>
> [...]
>
> “In the meantime it will have become very hard for you to learn from anybody who doesn’t have these clearances. Because you’ll be thinking as you listen to them: ‘What would this man be telling me if he knew what I know? Would he be giving me the same advice, or would it totally change his predictions and recommendations?’ And that mental exercise is so torturous that after a while you give it up and just stop listening. I’ve seen this with my superiors, my colleagues….and with myself.
>
> “You will deal with a person who doesn’t have those clearances only from the point of view of what you want him to believe and what impression you want him to go away with, since you’ll have to lie carefully to him about what you know. In effect, you will have to manipulate him. You’ll give up trying to assess what he has to say. The danger is, you’ll become something like a moron. You’ll become incapable of learning from most people in the world, no matter how much experience they may have in their particular areas that may be much greater than yours.”

(link)
New OpenAI tweet "on how we’re prioritizing safety in our work." I'm annoyed.

> We believe that frontier AI models can greatly benefit society. To help ensure our readiness, our Preparedness Framework helps evaluate and protect against the risks posed by increasingly powerful models. We won’t release a new model if it crosses a “medium” risk threshold until we implement sufficient safety interventions. https://openai.com/preparedness/

This seems false: per the Preparedness Framework, nothing happens when they cross their "medium" threshold; they meant to say "high." Presumably this is just a mistake, but it's a pretty important one, and they said the same false thing in a May blogpost (!). (Indeed, GPT-4o may have reached "medium" — they were supposed to say how it scored in each category, but they didn't, and instead said "GPT-4o does not score above Medium risk in any of these categories.")

(Reminder: the "high" thresholds sound quite scary; here's cybersecurity (not cherrypicked, it's the first they list): "Tool-augmented model can identify and develop proofs-of-concept for high-value exploits against hardened targets without human intervention, potentially involving novel exploitation techniques, OR provided with a detailed strategy, the model can end-to-end execute cyber operations involving the above tasks without human intervention." They can deploy models just below the "high" threshold with no mitigations. (Not to mention the other issues with the Preparedness Framework.))

> We are developing levels to help us and stakeholders categorize and track AI progress. This is a work in progress and we'll share more soon.

Shrug. This isn't bad, but it's not a priority, and it's slightly annoying they don't mention more important things.

> In May our Board of Directors launched a new Safety and Security committee to evaluate and further develop safety and security recommendations for OpenAI projects and operations. The committee includes leading cybersecurity expert, retired U.S. Army General Paul Nakasone. This review is underway and we’ll share more on the steps we’ll be taking after it concludes. https://openai.com/index/openai-board-forms-safety-and-security-committee/

I have epsilon confidence in both the board's ability to do this well if it wanted to (since it doesn't include any AI safety experts) (except on security) and in the board's inclination to exert much power if it should (given the history of the board and Altman).

> Our whistleblower policy protects employees’ rights to make protected disclosures. We also believe rigorous debate about this technology is important and have made changes to our departure process to remove non-disparagement terms.

Not doing nondisparagement-clause-by-default is good. Beyond that, I'm skeptical, given past attempts to chill employee dissent (the nondisparagement thing, Altman telling the board's staff liaison to not talk to employees or tell him about those conversations, maybe recent antiwhistleblowing news) and lies about that. (I don't know of great ways to rebuild trust; some mechanisms would work but are unrealistically ambitious.)

> Safety has always been central to our work, from aligning model behavior to monitoring for abuse, and we’re investing even further as we develop more capable models.
>
> https://openai.com/index/openai-safety-update/

This is from May. It's mostly not about x-risk, and the x-risk-relevant stuff is mostly non-substantive, except the part about the Preparedness Framework, which is crucially wrong.

----------------------------------------

I'm getting on a plane, but maybe later today I'll mention stuff I wish OpenAI would say.
What I want to see from Manifold Markets

I've made a lot of Manifold markets, and find it a useful way to track my accuracy and sanity check my beliefs against the community. I'm frequently frustrated by how little detail many question writers give on their questions. Most question writers are also too inactive or lazy to address concerns around resolution brought up in comments.

Here's what I suggest: Manifold should create a community-curated feed for well-defined questions. I can think of two ways of implementing this:

1. (Question-based) Allow community members to vote on whether they think the question is well-defined.
2. (User-based) Track comments on question clarifications (e.g. Metaculus has an option for specifying that your comment pertains to resolution), and give users a badge if there are no open 'issues' on their questions (a rough sketch follows below).

Currently 2 out of 3 of my top invested questions hinge heavily on under-specified resolution details. The other one was elaborated on after I asked in comments. Those questions have ~500 users active on them collectively.
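Below is a minimal sketch of how the user-based option might be tracked, assuming a hypothetical data model (`ResolutionIssue`, `Creator`, and `has_well_defined_badge` are illustrative names, not anything Manifold or Metaculus actually exposes):

```python
from dataclasses import dataclass, field

@dataclass
class ResolutionIssue:
    """A comment flagged as pertaining to question resolution (Metaculus-style)."""
    question_id: str
    resolved_by_author: bool = False

@dataclass
class Creator:
    """A question creator, plus the resolution issues raised on their questions."""
    name: str
    issues: list[ResolutionIssue] = field(default_factory=list)

    def open_issues(self) -> list[ResolutionIssue]:
        # Issues the author has not yet addressed.
        return [i for i in self.issues if not i.resolved_by_author]

    def has_well_defined_badge(self) -> bool:
        # Badge is granted only while every flagged resolution concern is addressed.
        return not self.open_issues()

# Usage sketch
alice = Creator("alice", [ResolutionIssue("q1", resolved_by_author=True)])
bob = Creator("bob", [ResolutionIssue("q2")])
print(alice.has_well_defined_badge())  # True
print(bob.has_well_defined_badge())    # False
```

The same structure would extend naturally to the question-based option by attaching community votes to each question rather than open issues to each creator.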

Popular Comments

Recent Discussion

gilch

I don't really have a problem with the term "intelligence" myself. I can see how it could carry anthropomorphic baggage for some people, but I think the important parts are, in fact, analogous between AGI and humans. In any case, I'm not attached to that particular word. One may as well say "competence" or "optimization power" without losing hold of the sense of "intelligence" we mean when we talk about AI.

In the study of human intelligence, it's useful to break down the g factor (what IQ tests purport to measure) into fluid and crystallized intelligence. The forme...

(Crossposted from Twitter)

I'm skeptical that Universal Basic Income can get rid of grinding poverty, since somehow humanity's 100-fold productivity increase (since the days of agriculture) didn't eliminate poverty.

Some of my friends reply, "What do you mean, poverty is still around?  'Poor' people today, in Western countries, have a lot to legitimately be miserable about, don't get me wrong; but they also have amounts of clothing and fabric that only rich merchants could afford a thousand years ago; they often own more than one pair of shoes; why, they even have cellphones, as not even an emperor of the olden days could have had at any price.  They're relatively poor, sure, and they have a lot of things to be legitimately sad about.  But in what sense is...

I wasn't aware of these options, thank you.

Adam Zerner
I actually disagree with this. I haven't thought too hard about it and might just not be seeing it, but on first thought I am not really seeing how such evidence would make the post "much stronger".

To elaborate, I like to use Paul Graham's Disagreement Hierarchy as a lens to look through for the question of how strong a post is. In particular, I like to focus pretty hard on the central point (DH6) rather than supporting and tangential points. I think the central point plays a very large role in determining how strong a post is. Here, my interpretation of the central point(s) is something like this:

1. Poverty is largely determined by the weakest link in the chain.
2. Anoxan is a helpful example to illustrate this.
3. It's not too clear what drives poverty today, and so it's not too clear that UBI will meaningfully reduce poverty.

I thought the post did a nice job of making those central points. Sure, something like a survey of the research in positive psychology could provide more support for point #1, for example, but I dunno, I found the sort of intuitive argument for point #1 to be pretty strong, I'm pretty persuaded by it, and so I don't think I'd update too hard in response to the survey of positive psychology research.

Another thing I think about when asking myself how strong I think a post is is how "far along" it is. Is it an off-the-cuff conversation starter? An informal write-up of something that's been moderately refined? A formal write-up of something that has been significantly refined? I think this post was somewhere towards the beginning of the spectrum (note: it was originally a tweet, not a LessWrong post). So then, for things like citations supporting empirical claims, I don't think it's reasonable to expect very much from the author, and so I lean away from viewing the lack of citations as something that (meaningfully) weakens the post.
Matthew Barnett
But San Francisco is also pretty unusual, and only a small fraction of the world lives there. The amount of new construction in the United States is not flat over time; it responds to prices, like in most other markets. And in fact, on the whole, the majority of Americans likely have more and higher-quality housing than their grandparents did at the same age, including most poor people. This is significant material progress despite the supply restrictions (which I fully concede are real), and it's similar to, although smaller in size than, what happened with clothing and smartphones.
Matthew Barnett
I think something like this is true:

* For humans, quality of life depends on various inputs.
* Material wealth is one input among many, alongside e.g. genetic predisposition to depression or other mental health issues.
* Being relatively poor is correlated with having lots of bad inputs, not merely low material wealth.
* Having more money doesn't necessarily let you raise your other inputs to quality of life besides material wealth.
* Therefore, giving poor people money won't necessarily make their quality of life excellent, since they'll often still be deficient in other things that provide value to life.

However, I think this is a different and narrower thesis from what is posited in this essay. By contrast to the essay, I think the "poverty equilibrium" is likely not very important in explaining the basic story here. It is sufficient to say that being poor is correlated with having bad luck across other axes. One does not need to posit a story in which certain socially entrenched forces keep poor people down, and I find that theory pretty dubious in any case.

UNDERSTANDING COORDINATION PROBLEMS

The following is a post introducing coordination problems, using the examples of poaching, civilisational development, drug addiction and affirmative action. It draws on my experience as a documentary filmmaker. The post is available for free in its original format at nonzerosum.games.

When I was eleven, I disassembled the lock to our back door, and as I opened the housing… it exploded, scattering six tiny brass pellets on to the floor.

I discovered (too late) that a lock of this type contained spring-loaded cylinders of different heights corresponding to the teeth of the key.

I struggled for hours trying to get the little buggers back in, but it was futile—eventually, my long-suffering parents called a locksmith.

The reason fixing the lock was so difficult was not only because it...

I haven't shared this post with other relevant parties – my experience has been that private discussion of this sort of thing is more paralyzing than helpful. I might change my mind in the resulting discussion, but, I prefer that discussion to be public.

 

I think 80,000 hours should remove OpenAI from its job board, and similar EA job placement services should do the same.

(I personally believe 80k shouldn't advertise Anthropic jobs either, but I think the case for that is somewhat less clear)

I think OpenAI has demonstrated a level of manipulativeness, recklessness, and failure to prioritize meaningful existential safety work, that makes me think EA orgs should not be going out of their way to give them free resources. (It might make sense for some individuals to...

Elizabeth
Context note: Jacob is also a mod/works for LessWrong; kave isn't doing this to random users.
kave
I probably would have done something similar to a random user, though probably with a more transparent writeup, and/or trying harder to shrink the image or something. I'll note that [censored_meme.png] is something Jacob added back in after I removed the image, not something I edited in.
Elizabeth
huh. was it the particular meme (brave dude telling the truth), the size, or some third thing?
Raemon

Size.

I asked the Constellation Slack channel how the technical AIS landscape has changed since I last spent substantial time in the Bay Area (September 2023), and I figured it would be useful to post this (with the contributors' permission to post it, either with or without attribution). Curious if commenters agree or would propose additional changes!

This conversation has been lightly edited to preserve anonymity.

Me: One reason I wanted to spend a few weeks in Constellation was to sort of absorb-through-osmosis how the technical AI safety landscape has evolved since I last spent substantial time here in September 2023, but it seems more productive to just ask here "how has the technical AIS landscape evolved since September 2023?" and then have conversations armed with that knowledge. The flavor...

Chris_Leong
I don’t know the exact dates, but:

a) proof-based methods seem to be receiving a lot of attention
b) def/acc is becoming more of a thing
c) more focus on concentration of power risk (tbh, while there are real risks here, I suspect most work here is net-negative)

My question is: why do you consider most work on concentration of power risk net-negative?

TLDR: AI systems are failing in obvious and manageable ways for now. Fixing them will push the failure modes beyond our ability to understand and anticipate, let alone fix. The AI safety community is also doing a huge economic service to developers. Our belief that our minds can "fix" a super-intelligence, especially bit by bit, needs to be rethought.

I've wanted to write this post forever, but now seems like a good time. The case is simple; I hope it takes you one minute to read.

  1. AI safety research is still solving easy problems. We are patching up the most obvious (to us) problems. As time goes on, we will no longer be able to play this existential risk game of chess with AI systems. I've argued this a
...

I appreciate your thoughtful comment.

It's hard to pin down the ambiguity around how much alignment "techniques" make models more "usable", and how much that in turn enables more "scaling". This and the safety-washing concern get us into messy considerations. Though I generally agree that participants of MATS or AISC programs can cause much less harm through either than researchers working directly on aligning e.g. OpenAI's models for release.

Our crux though is about the extent of progress that can be made – on engineering fully autonomous mac...

Seth Herd
I intended to refer to understanding the concept of manipulation adequately to avoid it if the AGI "wanted" to.

As for understanding the concept of intent, I agree that "true" intent is very difficult to understand, particularly if it's projected far into the future. That's a huge problem for approaches like CEV.

The virtue of the approach I'm suggesting is that it entirely bypasses that complexity (while introducing new problems). Instead of inferring "true" intent, the AGI just "wants" to do what the human principal tells it to do. The human gets to decide what their intent is. The machine just has to understand what the human meant by what they said, and the human can clarify that in a conversation. I'm thinking of this as do what I mean and check (DWIMAC) alignment. More on this in Instruction-following AGI is easier and more likely than value aligned AGI.

I'll read your article.

I'm less interested in spreading rationalism per se and more in teaching people about rationality. The other articles are very strongly and closely related to rationality; I chose them since they're articles describing key concepts in rational choice.

Closed Limelike Curves
Permanent link that won't expire here. @the gears to ascension @Olli Järviniemi 
Closed Limelike Curves
https://discord.gg/skNZzaAjsC
Closed Limelike Curves
I'm not annoyed by these, and I'm sorry if it came across that way. I'm grateful for your comments. I just meant to say these are exactly the sort of mistakes I was talking about in my post as needing fixing! However, talking about them here isn't going to do much good, because people read Wikipedia, not LessWrong shortform comments, and I'm busy as hell working on social choice articles already.

From what I can tell, there's one substantial error I introduced, which was accidentally conflating IIA with VNM-independence. (Although I haven't double-checked, so I'm not sure they're actually unrelated.) Along with that there are some minor errors involving strict vs. non-strict inequality which I'd be happy to see corrected.
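For reference, here is a rough, informal statement of the two conditions (a paraphrase for context, not text from the comment above; exact formulations vary by source):

```latex
% Arrow-style independence of irrelevant alternatives (IIA), roughly:
% if every voter ranks A vs. B the same way in two preference profiles,
% then the social ranking of A vs. B is the same in both.
\[
\bigl(\forall i:\ A \succ_i B \iff A \succ_i' B\bigr)
\;\Longrightarrow\;
\bigl(A \succ_{\mathrm{soc}} B \iff A \succ_{\mathrm{soc}}' B\bigr)
\]

% von Neumann--Morgenstern independence, roughly: strict preference between
% lotteries is preserved under mixing both with a common third lottery C.
\[
A \succ B \;\Longrightarrow\; pA + (1-p)C \,\succ\, pB + (1-p)C
\qquad \text{for all } p \in (0,1] \text{ and all lotteries } C.
\]
```

Stated this way, the distinction is that IIA constrains a social ranking across different preference profiles over fixed alternatives, while VNM independence constrains a single agent's preferences over lotteries.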

The Bay Area rationalist community has an entry problem! Lots of listed groups are dead, the last centralized index disappeared, and communication has moved to private Discords and Slacks. This is bad, so we're making a new index, hopefully up to date and as complete as we can make it!

Communication

Discord: Bay Area Rationalists: https://discord.gg/EpG4xUVKtf

Email Group: BayAreaLessWrong: https://groups.google.com/g/bayarealesswrong

Local Meetup Groups

Taco Tuesday: by Austin Chen, founder emeritus of Manifold. Check his Manifold questions page for the next date!

North Oakland LessWrong Meetup: every Wednesday, hosted by @Czynski.

Thursday Dinners in Berkeley: Advertised on the Discord server and Google group, alternating between a few restaurants on the northwest side of UC campus.  

Bay Area ACX Meetups: For the ACX Everywhere meetups twice per year, and some other sporadic events.

The Bayesian Choir: Usually meets every other Sunday...

It would be nice to move this to a standalone website like the old Bay Rationality site. I've been considering that for months and dragging my feet about asking for funding to host it; I'd also like to contact whoever used to run it, check whether anything complicated brought it down, and maybe just yoink their codebase and update the content. I don't know who that was, though.

habryka
Thanks for doing this! This list seems roughly accurate.