A book review examining Elinor Ostrom's "Governing the Commons" in light of Eliezer Yudkowsky's "Inadequate Equilibria." Are successful local institutions for governing common pool resources possible without government intervention? Under what circumstances can such institutions emerge spontaneously to solve coordination problems?

robo
Our current big stupid: not preparing for 40% agreement

Epistemic status: lukewarm take from the gut (not brain) that feels rightish

The "Big Stupid" of the AI doomers 2013-2023 was that AI nerds' solution to the problem "How do we stop people from building dangerous AIs?" was "research how to build AIs". Methods normal people would consider to stop people from building dangerous AIs, like asking governments to make it illegal to build dangerous AIs, were considered gauche. When the public turned out to be somewhat receptive to the idea of regulating AIs, doomers were unprepared.

Take: The "Big Stupid" of right now is still the same thing. (We've not corrected enough.) Between now and transformative AGI we are likely to encounter a moment where 40% of people realize AIs really could take over (say, if every month another 1% of the population loses their job). If 40% of the world were as scared of AI loss-of-control as you, what could the world do? I think a lot! Do we have a plan for then?

Almost every LessWrong post on AIs is about analyzing AIs. Almost none are about how, given widespread public support, people/governments could stop bad AIs from being built. [Example: if 40% of people were as worried about AI as I was, the US would treat GPU manufacture like uranium enrichment. And fortunately GPU manufacture is hundreds of times harder than uranium enrichment! We should be nerding out researching integrated circuit supply chains, choke points, foundry logistics in jurisdictions the US can't unilaterally sanction, that sort of thing.]

TLDR: stopping deadly AIs from being built needs less research on AIs and more research on how to stop AIs from being built.

*My research included 😬
Very Spicy Take

Epistemic Note: Many highly respected community members with substantially greater decision-making experience (and LessWrong karma) presumably disagree strongly with my conclusion.

Premise 1: It is becoming increasingly clear that OpenAI is not appropriately prioritizing safety over advancing capabilities research.

Premise 2: This was the default outcome. Instances in history in which private companies (or any individual humans) have intentionally turned down huge profits and power are the exception, not the rule.

Premise 3: Without repercussions for terrible decisions, decision makers have no skin in the game.

Conclusion: Anyone and everyone involved with Open Phil recommending a grant of $30 million be given to OpenAI in 2017 shouldn't be allowed anywhere near AI safety decision making in the future. To go one step further, potentially any and every major decision they have played a part in needs to be reevaluated by objective third parties. This must include Holden Karnofsky and Paul Christiano, both of whom were closely involved.

To quote Open Phil: "OpenAI researchers Dario Amodei and Paul Christiano are both technical advisors to Open Philanthropy and live in the same house as Holden. In addition, Holden is engaged to Dario’s sister Daniela."
Akash
My current perspective is that criticism of AGI labs is an under-incentivized public good. I suspect there's a disproportionate amount of value that people could have by evaluating lab plans, publicly criticizing labs when they break commitments or make poor arguments, talking to journalists/policymakers about their concerns, etc. Some quick thoughts:

* Soft power– I think people underestimate how strong the "soft power" of labs is, particularly in the Bay Area.
* Jobs– A large fraction of people getting involved in AI safety are interested in the potential of working for a lab one day. There are some obvious reasons for this– lots of potential impact from being at the organizations literally building AGI, big salaries, lots of prestige, etc.
  * People (IMO correctly) perceive that if they acquire a reputation for being critical of labs, their plans, or their leadership, they will essentially sacrifice the ability to work at the labs.
  * So you get an equilibrium where the only people making (strong) criticisms of labs are those who have essentially chosen to forgo their potential of working there.
* Money– The labs and Open Phil (which has been perceived, IMO correctly, as investing primarily into metastrategies that are aligned with lab interests) have an incredibly large share of the $$$ in the space. When funding became more limited, this became even more true, and I noticed a very tangible shift in the culture & discourse around labs + Open Phil.
* Status games//reputation– Groups who were more inclined to criticize labs and advocate for public or policymaker outreach were branded as "unilateralist", "not serious", and "untrustworthy" in core EA circles. In many cases, there were genuine doubts about these groups, but my impression is that these doubts got amplified/weaponized in cases where the groups were more openly critical of the labs.
* Subjectivity of "good judgment"– There is a strong culture of people getting jobs/status for having "good judgment". This is sensible insofar as we want people with good judgment (who wouldn't?), but it often ends up being so subjective that it leads to people being quite afraid to voice opinions that go against mainstream views and metastrategies (particularly those endorsed by labs + Open Phil).
* Anecdote– Personally, I found my ability to evaluate and critique labs + mainstream metastrategies substantially improved when I spent more time around folks in London and DC (who were less closely tied to the labs). In fairness, I suspect that if I had lived in London or DC *first* and then moved to the Bay Area, it's plausible I would've had a similar feeling but in the "reverse direction".

With all this in mind, I find myself more deeply appreciating folks who have publicly and openly critiqued labs, even in situations where the cultural and economic incentives to do so were quite weak (relative to staying silent or saying generic positive things about labs). Examples: Habryka, Rob Bensinger, CAIS, MIRI, Conjecture, and FLI. More recently, @Zach Stein-Perlman, and of course Jan Leike and Daniel K.
If your endgame strategy involved relying on OpenAI, DeepMind, or Anthropic to implement your alignment solution that solves science / super-cooperation / nanotechnology, consider figuring out another endgame plan.
quila
A quote from an old Nate Soares post that I really liked: > It is there, while staring the dark world in the face, that I find a deep well of intrinsic drive. It is there that my resolve and determination come to me, rather than me having to go hunting for them. > > I find it amusing that "we need lies because we can't bear the truth" is such a common refrain, given how much of my drive stems from my response to attempting to bear the truth. > > I find that it's common for people to tell themselves that they need the lies in order to bear reality. In fact, I bet that many of you can think of one thing off the top of your heads that you're intentionally tolerifying, because the truth is too scary to even consider. (I've seen at least a dozen failed relationships dragged out for months and months due to this effect.) > > I say, if you want the intrinsic drive, drop the illusion. Refuse to tolerify. Face the facts that you feared you would not be able to handle. You are likely correct that they will be hard to bear, and you are likely correct that attempting to bear them will change you. But that change doesn't need to break you. It can also make you stronger, and fuel your resolve. > > So see the dark world. See everything intolerable. Let the urge to tolerify it build, but don't relent. Just live there in the intolerable world, refusing to tolerate it. See whether you feel that growing, burning desire to make the world be different. Let parts of yourself harden. Let your resolve grow. It is here, in the face of the intolerable, that you will be able to tap into intrinsic motivation.

Recent Discussion

 [memetic status: stating directly despite it being a clear consequence of core AI risk knowledge because many people have "but nature will survive us" antibodies to other classes of doom and misapply them here.]

Unfortunately, no.[1]

Technically, “Nature”, meaning the fundamental physical laws, will continue. However, people usually mean forests, oceans, fungi, bacteria, and generally biological life when they say “nature”, and those would not have much chance competing against a misaligned superintelligence for resources like sunlight and atoms, which are useful to both biological and artificial systems.

There’s a thought that comforts many people when they imagine humanity going extinct due to a nuclear catastrophe or runaway global warming: Once the mushroom clouds or CO2 levels have settled, nature will reclaim the cities. Maybe mankind in our hubris will have wounded Mother Earth and paid the price ourselves, but...

plex
This is correct. I'm not arguing about p(total human extinction|superintelligence), but p(nature survives|total human extinction), as this is the conditional probability I see people getting very wrong sometimes. It's not implausible to me that we survive due to decision-theoretic reasons; this seems possible, though not my default expectation (I mostly expect Decision theory does not imply we get nice things, unless we manually win a decent chunk more timelines than I expect). My confidence is in the claim "if AI wipes out humans, it will wipe out nature". I don't engage with counterarguments to a separate claim, as that is beyond the scope of this post and I don't have much to add over existing literature like the other posts you linked. Edit: Partly retracted; I see how the second-to-last paragraph makes a more overreaching claim, edited to clarify my position.
jaan
i might be confused about this but “witnessing a super-early universe” seems to support “a typical universe moment is not generating observer moments for your reference class”. but, yeah, anthropics is very confusing, so i’m not confident in this.
plex

By my models of anthropics, I think this goes through.

quiet_NaN
I think an AI is slightly more likely to wipe out or capture humanity than it is to wipe out all life on the planet. While any true-Scotsman ASI is as far above us humans as we are above ants, and does not need to worry about any meatbags plotting its downfall any more than we generally worry about ants, it is entirely possible that the first AI which has a serious shot at taking over the world is not quite at that level yet. Perhaps it is only as smart as von Neumann and a thousand times faster.

To such an AI, the continued thriving of humans poses all sorts of x-risks. They might find out you are misaligned and coordinate to shut you down. More worrisome, they might summon another unaligned AI which you would have to battle or concede utility to later on, depending on your decision theory. Even if you still need some humans to dust your fans and manufacture your chips, suffering billions of humans to live in high-tech societies you do not fully control seems like the kind of rookie mistake I would not expect a reasonably smart unaligned AI to make.

By contrast, most of life on Earth might get snuffed out when the ASI gets around to building a Dyson sphere around the sun. A few simple life forms might even be spread throughout the light cone by an ASI who does not give a damn about biological contamination.

The other reason I think the fate in store for humans might be worse than that for rodents is that alignment efforts might not only fail, but fail catastrophically. So instead of an AI which cares about paperclips, we get an AI which cares about humans, but in ways we really do not appreciate.

But yeah, most forms of ASI which turn out bad for homo sapiens also turn out bad for most other species.

As I note in the third section, I will be attending LessOnline at month’s end at Lighthaven in Berkeley. If that is your kind of event, then consider going, and buy your ticket today before prices go up.

This month’s edition was an opportunity to finish off some things that got left out of previous editions or where events have left many of the issues behind, including the question of TikTok.

Oh No

All of this has happened before. And all of this shall happen again.

Alex Tabarrok: I regret to inform you that the CDC is at it again.

Marc Johnson: We developed an assay for testing for H5N1 from wastewater over a year ago. (I wasn’t expecting it in milk, but I figured it was going to poke up somewhere.)

However,

...

Europeans... vastly less rich than they could be.

POSIWID (the purpose of a system is what it does). The metric being optimized is not "having the most money". It is debatable whether it should be; as one of the "poor Europeans", my personal opinion is that we're doing just fine.

I stayed up too late collecting way-past-deadline papers and writing report cards. When I woke up at 6, this anxious email from one of my g11 Computer Science students was already in my Inbox.

Student: Hello Mr. Carle, I hope you've slept well; I haven't.

I've been seeing a lot of new media regarding how developed AI has become in software programming, most relevantly videos about NVIDIA's new artificial intelligence software developer, Devin.

Things like these are almost disheartening for me to see as I try (and struggle) to get better at coding and developing software. It feels like I'll never use the information that I learn in your class outside of high school because I can just ask an AI to write complex programs, and it will do it...

Jiao Bu

"[I]s a traditional education sequence the best way to prepare myself for [...?]"

This is hard to answer, because in some ways the foundation of a broad education in all subjects is absolutely necessary. And some of them (math, for example) are a lot harder to patch in later if you are bad at them at, say, 28.

However, the other side of this is that once some foundation is laid and someone has some breadth and depth, the answer to the above question, with regards to nearly anything, is often (perhaps usually) "Absolutely Not."

So, for a 17 year old, Yes. ... (read more)

Thanks to Taylor Smith for doing some copy-editing on this.

In this article, I tell some anecdotes and present some evidence in the form of research artifacts about how easy it is for me to work hard when I have collaborators. If you are in a hurry I recommend skipping to the research artifact section.

Bleeding Feet and Dedication

During AI Safety Camp (AISC) 2024, I was working with somebody on how to use binary search to approximate a hull that would contain a set of points, only to knock a glass off of my table. It splintered into a thousand pieces all over my floor.

A normal person might stop and remove all the glass splinters. I just spent 10 seconds picking up some of the largest pieces and then decided...

These thoughts remind me of something Scott Alexander once wrote - that sometimes he hears someone say true but low status things - and his automatic thoughts are about how the person must be stupid to say something like that, and he has to consciously remind himself that what was said is actually true.

For anyone who's curious, this is what Scott said, in reference to him getting older – I remember it because I noticed the same in myself as I aged too:

I look back on myself now vs. ten years ago and notice I’ve become more cynical, more mellow, and more pro

... (read more)
Mo Putera
Holden advised against this: Also relevant are the takeaways from Thomas Kwa's "effectiveness is a conjunction of multipliers", in particular:
Johannes C. Mayer
At the top of this document.
Algon
Thanks!

This is a followup to the D&D.Sci post I made last Friday; if you haven’t already read it, you should do so now before spoiling yourself.

Below is an explanation of the rules used to generate the dataset (my full generation code is available here, in case you’re curious about details I omitted), and their strategic implications.


Ruleset

Impossibility

Impossibility is entirely decided by who a given architect apprenticed under. Fictional impossiblists Stamatin and Johnson invariably produce impossibility-producing architects; real-world impossiblists Penrose, Escher and Geisel always produce architects whose works just kind of look weird; the self-taught break Nature's laws 43% of the time.

Cost

Cost is entirely decided by materials. In particular, every structure created using Nightmares is more expensive than every structure without them.

Strategy

The five architects who would...

The halting problem is the problem of taking as input a Turing machine M and returning true if it halts, false if it doesn't. This is known to be uncomputable. The consistent guessing problem (named by Scott Aaronson) is the problem of taking as input a Turing machine M (which either returns a Boolean or never halts) and returning true or false; if M ever returns true, the oracle's answer must be true, and likewise for false. This is also known to be uncomputable.
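As a concreteness check (my notation, not Aaronson's): a consistent guessing oracle is any total function $O$ from Turing machines to $\{\text{true}, \text{false}\}$ satisfying

$$
O(M) =
\begin{cases}
\text{true} & \text{if } M \text{ halts and returns true,} \\
\text{false} & \text{if } M \text{ halts and returns false,} \\
\text{either value (unconstrained)} & \text{if } M \text{ never halts.}
\end{cases}
$$

Unlike the halting oracle, $O$ is not unique: any assignment on the non-halting machines is allowed, which is why the question below quantifies over every consistent guessing oracle.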

Scott Aaronson inquires as to whether the consistent guessing problem is strictly easier than the halting problem. This would mean there is no Turing machine that, when given access to a consistent guessing oracle, solves the halting problem, no matter which consistent guessing oracle...

Note that Andy Drucker is not claiming to have discovered this; the paper you link is expository.

Since Drucker doesn't say this in the link, I'll mention that the objects you're discussing are conventionally known as PA degrees. The PA here stands for Peano arithmetic; a Turing degree solves the consistent guessing problem iff it computes some model of PA. This name may be a little misleading, in that PA isn't really special here. A Turing degree computes some model of PA iff it computes some model of ZFC, or more generally any theory capable o... (read more)


It's also notable that the topic of OpenAI nondisparagement agreements was brought to Holden Karnofsky's attention in 2022, and he replied with "I don’t know whether OpenAI uses nondisparagement agreements; I haven’t signed one." (He could have asked his contacts inside OAI about it, or asked the EA board member to investigate. Or even set himself up earlier as someone OpenAI employees could whistleblow to on such issues.)

If the point was to buy a ticket to play the inside game, then it was played terribly and negative credit should be assigned on that bas... (read more)

Stephen Fowler
This does not feel super cruxy, as the power incentive still remains.
Joe_Collman
I think there's a decent case that such updating will indeed disincentivize making positive EV bets (in some cases, at least).

In principle we'd want to update on the quality of all past decision-making. That would include both [made an explicit bet by taking some action] and [made an implicit bet through inaction]. With such an approach, decision-makers could be punished/rewarded with the symmetry required to avoid undesirable incentives (mostly). Even here it's hard, since there'd always need to be a [gain more influence] mechanism to balance the possibility of losing your influence.

In practice, most of the implicit bets made through inaction go unnoticed - even where they're high-stakes (arguably especially when they're high-stakes: most counterfactual value lies in the actions that won't get done by someone else; you won't be punished for being late to the party when the party never happens). That leaves the explicit bets. To look like a good decision-maker the incentive is then to make low-variance explicit positive EV bets, and rely on the fact that most of the high-variance, high-EV opportunities you're not taking will go unnoticed.

From my by-no-means-fully-informed perspective, the failure mode at OpenPhil in recent years seems not to be [too many explicit bets that don't turn out well], but rather [too many failures to make unclear bets, so that most EV is left on the table]. I don't see support for hits-based research. I don't see serious attempts to shape the incentive landscape to encourage sufficient exploration. It's not clear that things are structurally set up so anyone at OP has time to do such things well (my impression is that they don't have time, and that thinking about such things is no-one's job (?? am I wrong ??)).

It's not obvious to me whether the OpenAI grant was a bad idea ex-ante (though probably not something I'd have done). However, I think that another incentive towards middle-of-the-road, risk-averse grant-making is the last t
starship006
Hmmm, can you point to where you think the grant shows this? I think the following paragraph from the grant seems to indicate otherwise:

The forum has been very much focused on AI safety for some time now, so I thought I'd post something different for a change: Privilege.

Here I define Privilege as an advantage over others that is invisible to the beholder. [EDIT: thanks to JenniferRM for pointing out that "beholder" is the wrong word.] This may not be the only definition, or the central definition, or how you see it, but it is the definition I use for the purposes of this post. I also do not mean it in the culture-war sense, as a way to undercut others, as in "check your privilege". My point is that we all have some privileges [we are not aware of], and also that nearly every one has a flip side.

In some way this...

Viliam
What are the advantages of noticing all of this?

* better model of the world;
* not being an asshole, i.e. not assuming that other people could do just as well as you, if they only were not so fucking lazy;
* realizing that your chances to achieve something may be better than you expected, because you have all these advantages over most potential competitors, so if you hesitated to do something because "there are so many people, many of them could do it much better than I could", the actual number of people who could do it may be much smaller than you have assumed, and most of them will be busy doing something else instead.
shminux
I mostly meant your second point, just generally being kinder to others, but the other two are also well taken.
localdeity
Also:

* Knowing the importance of the advantages people have makes you better able to judge how well people are likely to do, which lets you make better decisions when e.g. investing in someone's company or deciding who to hire for an important role (or marry).
* Also orients you towards figuring out the difficult-to-see advantages people must have (or must lack), given the level of success that they've achieved and their visible advantages.
* If you're in a position to influence what advantages people end up with—for example, by affecting the genes your children get, or what education and training you arrange for them—then you can estimate how much each of those is worth and prioritize accordingly.

One of the things you probably notice is that having some advantages tends to make other advantages more valuable. Certainly career-wise, several of those things are like, "If you're doing badly on this dimension, then you may be unable to work at all, or be limited to far less valuable roles." For example, if one person's crippling anxiety takes them from 'law firm partner making $1 million' to 'law analyst making $200k', and another person's crippling anxiety takes them from 'bank teller making $50k' to 'unemployed', then, well, from a utilitarian perspective, fixing one person's problems is worth a lot more than the other's. That is probably already acted upon today—the former law partner is more able to pay for therapy/whatever—but it could inform people who are deciding how to allocate scarce resources to young people, such as the student versions of the potential law partner and bank teller.

(Of course, the people who originally wrote about "privilege" would probably disagree in the strongest possible terms with the conclusions of the above lines of reasoning.)

Excellent point about the compounding, which is often multiplicative, not additive. Incidentally, multiplicative advantages result in a power law distribution of income/net worth, whereas additive advantages/disadvantages result in a normal distribution. But that is a separate topic, well explored in the literature.
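A minimal simulation sketch of that contrast (illustrative numbers only; strictly speaking, independent multiplicative factors yield an approximately log-normal distribution, which is heavy-tailed like a power law rather than an exact one, while additive factors yield an approximately normal one):

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_factors = 100_000, 20

# Each person draws 20 small, independent advantage factors centered on 1.
factors = rng.normal(loc=1.0, scale=0.1, size=(n_people, n_factors))

additive = factors.sum(axis=1)         # advantages add up -> roughly normal
multiplicative = factors.prod(axis=1)  # advantages compound -> heavy right tail

for name, outcomes in [("additive", additive), ("multiplicative", multiplicative)]:
    top_share = np.sort(outcomes)[-n_people // 100:].sum() / outcomes.sum()
    print(f"{name:>14}: top 1% hold {top_share:.1%} of the total")
```

The additive column stays close to 1% (everyone is roughly equal), while the multiplicative column concentrates noticeably more of the total in the top 1%, and the gap widens as the number of compounding factors grows.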

TLDR: Things seem bad. But chart-wielding optimists keep telling us that things are better than they’ve ever been. What gives? Hypothesis: the point of conversation is to solve problems, so public discourse will focus on the problems—making us all think that things are worse than they are. A computational model predicts both this dynamic, and that social media makes it worse.
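The post's actual computational model isn't reproduced here; purely as an illustrative toy (the numbers and the sampling rule below are my own assumptions), this is the core dynamic: if conversation topics are sampled in proportion to how bad each domain is, the world as discussed looks systematically worse than the world as it is.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy world: 1,000 domains (crime, schools, the economy, ...), each with a
# true quality score in [0, 1]; most domains are doing fine on average.
true_quality = rng.beta(a=8, b=2, size=1_000)   # mean around 0.8

# Problem-focused conversation: a domain is discussed with probability
# proportional to how bad it is.
weights = 1 - true_quality
weights /= weights.sum()
discussed = rng.choice(len(true_quality), size=10_000, p=weights)

print(f"average true quality:          {true_quality.mean():.2f}")
print(f"average quality as discussed:  {true_quality[discussed].mean():.2f}")
# The discussed sample is systematically worse than reality, even though
# the underlying world is unchanged.
```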


Are things bad? Most people think so.

Over the last 25 years, satisfaction with how things are going in the US has tanked, while economic sentiment is as bad as it’s been since the Great Recession:

Meanwhile, majorities or pluralities in the US are pessimistic about the state of social norms, education, racial disparities, etc. And when asked about the wider world—even in the heady days of 2015—only a...

Agreed that people have lots of goals that don't fit in this model. It's definitely a simplified model. But I'd argue that ONE of (most) people's goals is to solve problems; and I do think, broadly speaking, it is an important function (evolutionarily and currently) of conversation. So I still think this model gets at an interesting dynamic.

LessOnline Festival

May 31st to June 2nd, Berkeley, CA