Ebenezer Dukakis

My best idea for achieving an international AI pause: https://forum.effectivealtruism.org/posts/RNatHBdxdCdidhqWf/chinscratch-s-quick-takes?commentId=XCxZjqhMRPGK3Faq8


Comments

On LessWrong, there's a comment section where hard questions can be asked and are asked frequently.

In my experience, asking hard questions here is quite socially unrewarding. I could probably list a dozen or so cases where I think the LW consensus "emperor" has no clothes but haven't posted about them, simply because I expect it to be an exercise in frustration. I think I will probably quit posting here soon.

I don't think AI policy is a good example for discourse on LessWrong. There are strategic reasons to be less transparent about how to affect public policy than for most other topics.

In terms of advocacy methods, sure. In terms of desired policies, I generally disagree.

Everything that's written publicly can be easily picked up by journalists wanting to write stories about AI.

If that's what we are worried about, there is plenty of low-hanging fruit in terms of e.g. not tweeting wildly provocative stuff for no reason. (You can ask for examples, but be warned, sharing them might increase the probability that a journalist writes about them!)

"The far left is censorious" and "Republicans are censorious" are in no way incompatible claims :-)

Great post. Self-selection seems huge for online communities, and I think it's no different on these fora.

Confidence level: General vague impressions and assorted thoughts follow; could very well be wrong on some details.

A disagreement I have with both the rationalist and EA communities is what the process of coming to robust conclusions looks like. In those communities, it seems like the strategy is often to identify a few super-geniuses who go do a super-deep analysis, and come to a conclusion that's assumed to be robust and trustworthy. See the "Groupthink" section on this page for specifics.

From my perspective, I would rather see an ordinary genius do an ordinary-depth analysis, and then have a bunch of other people ask a bunch of hard questions. If the analysis holds up against all those hard questions, then the conclusion can be taken as robust.

Everyone brings their own incentives, intuitions, and knowledge to a problem. If a single person focuses a lot on a problem, they run into diminishing returns regarding the number of angles of attack. It seems more effective to generate a lot of angles of attack by taking the union of everyone's thoughts.

From my perspective, placing a lot of trust in top EA/LW thought leaders ironically makes them less trustworthy, because people stop asking why the emperor has no clothes.

The problem with saying the emperor has no clothes is: Either you show yourself a fool, or else you're attacking a high-status person. Not a good prospect either way, in social terms.

EA/LW communities are an unusual niche with opaque membership norms, and people may want to retain their "insider" status. So they do extra homework before accusing the emperor of nudity, and might just procrastinate indefinitely.

There can also be a subtle aspect of circular reasoning to thought leadership: "we know this person is great because of their insights", but also "we know this insight is great because of the person who said it". (Certain celebrity users on these fora get 50+ positive karma on basically every top-level post. Hard to believe that the authorship isn't coloring the perception of the content.)

A recent illustration of these principles might be the pivot to AI Pause. IIRC, it took a "super-genius" (Katja Grace) writing a super long post before Pause became popular. If an outsider simply said: "So AI is bad, why not make it illegal?" -- I bet they would've been downvoted. And once that's downvoted, no one feels obligated to reply. (Note, also -- I don't believe there was much reasoning transparency regarding why the pause strategy was considered unpromising at the time. You kinda had to be an insider like Katja to know the reasoning in order to critique it.)

In conclusion, I suspect there are a fair number of mistaken community beliefs which survive because (1) no "super-genius" has yet written a super-long post about them, and (2) poking around by asking hard questions is disincentivized.

Yeah, I think there are a lot of underexplored ideas along these lines.

It's weird how so much of the internet seems locked into either the reddit model (upvotes/downvotes) or the Twitter model (likes/shares/followers), when the design space is so much larger than that. Someone like Aaron, who played such a big role in shaping the internet, seems more likely to have a gut-level belief that it can be shaped. I expect there are a lot more things like Community Notes that we could discover if we went looking for them.

I've always wondered what Aaron Swartz would think of the internet now, if he was still alive. He had far-left politics, but also seemed to be a big believer in openness, free speech, crowdsourcing, etc. When he was alive those were very compatible positions, and Aaron was practically the poster child for holding both of them. Nowadays the far left favors speech restrictions and is cynical about the internet.

Would Aaron have abandoned the far left, now that they are censorious? Would he have become censorious himself? Or would he have invented some clever new technology, like RSS or reddit, to try and fix the internet's problems?

Just goes to show what a tragedy death is, I guess.

I expect escape will happen a bunch

Are you willing to name a specific year/OOM such that if there are no publicly known cases of escape by that year/OOM, you would be surprised? What, if anything, would you acknowledge as evidence that alignment is easier than you thought, here?

To ensure the definition of "escape" is not gerrymandered -- do you know of any cases of escape right now? Do you think escape has already occurred and you just don't know about it? "Escape" means something qualitatively different from any known event up to this point, yes? Does it basically refer to self-exfiltration of weights which was not requested by any human? Can we get a somewhat precise definition by any chance?

Sure there will be errors, but how important will those errors be?

Humans currently control the trajectory of humanity, and humans are error-prone. If you replace humans with something that's error-prone in similar ways, that doesn't seem like it's obviously either a gain or a loss. How would such a system compare to an em of a human, for example?

If you want to show that we're truly doomed, I think you need additional steps beyond just "there will be errors".

Some recent-ish bird flu coverage:

Global health leader critiques ‘ineptitude’ of U.S. response to bird flu outbreak among cows

A Bird-Flu Pandemic in People? Here’s What It Might Look Like. TLDR: not good. (Reload the page and ctrl-a then ctrl-c to copy the article text before the paywall comes up.) Interesting quote: "The real danger, Dr. Lowen of Emory said, is if a farmworker becomes infected with both H5N1 and a seasonal flu virus. Flu viruses are adept at swapping genes, so a co-infection would give H5N1 opportunity to gain genes that enable it to spread among people as efficiently as seasonal flu does."

Infectious bird flu survived milk pasteurization in lab tests, study finds. Here's what to know.

1 in 5 milk samples from grocery stores test positive for bird flu. Why the FDA says it’s still safe to drink -- see also updates from the FDA here: "Last week we announced preliminary results of a study of 297 retail dairy samples, which were all found to be negative for viable virus." (May 10)

The FDA is making reassuring noises about pasteurized milk, but given that CDC and friends also made reassuring noises early in the COVID-19 pandemic, I'm not fully reassured.

I wonder if drinking a little bit of pasteurized milk every day would be helpful inoculation? You could hedge your bets by buying some milk from every available brand, and consuming a teaspoon from a different brand every day, gradually working up to a tablespoon etc.

About a month ago, I wrote a quick take suggesting that an early messaging mistake made by MIRI was: claim there should be a single leading FAI org, but not give specific criteria for selecting that org. That could've led to a situation where DeepMind, OpenAI, and Anthropic can all think of themselves as "the best leading FAI org".

An analogous possible mistake that's currently being made: Claim that we should "shut it all down", and also claim that it would be a tragedy if humanity never created AI, but not give specific criteria for when it would be appropriate to actually create AI.

What sort of specific criteria? One idea: A committee of random alignment researchers is formed to study the design; if at least X% of the committee rates the odds of success at Y% or higher, it gets the thumbs up. Not ideal criteria, just provided for the sake of illustration.
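To make the illustration concrete, here's a minimal sketch of that decision rule in Python. The committee size, the X% approval threshold, and the Y% success-odds cutoff are placeholder numbers I've picked for the example, not part of any actual proposal:

```python
# Minimal sketch of the illustrative "unpause committee" rule above.
# All parameter values are hypothetical placeholders.

def committee_approves(ratings, min_fraction=0.6, min_odds=0.9):
    """Return True if at least `min_fraction` of committee members
    rate the odds of success at `min_odds` or higher.

    ratings: list of floats in [0, 1], one per committee member.
    """
    if not ratings:
        return False
    favorable = sum(1 for r in ratings if r >= min_odds)
    return favorable / len(ratings) >= min_fraction

# Example: 7 of 10 members rate the odds of success at >= 0.9,
# which clears the 60% threshold, so the design gets the thumbs up.
example_ratings = [0.95, 0.9, 0.92, 0.91, 0.9, 0.93, 0.9, 0.7, 0.5, 0.8]
print(committee_approves(example_ratings))  # True
```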

Why would this be valuable?

  • If we actually get a pause, it's important to know when to unpause as well. Specific criteria could improve the odds that an unpause happens in a reasonable way.

  • If you want to build consensus for a pause, advertising some reasonable criteria for when we'll unpause could get more people on board.

Don’t have time to respond in detail but a few quick clarifications/responses:

Sure, don't feel obligated to respond, and I invite the people disagree-voting my comments to hop in as well.

— There are lots of groups focused on comms/governance. MIRI is unique only insofar as it started off as a “technical research org” and has recently pivoted more toward comms/governance.

That's fair, when you said "pretty much any other organization in the space" I was thinking of technical orgs.

MIRI's uniqueness does seem to suggest it has a comparative advantage for technical comms. Are there any organizations focused on that?

by MIRI’s lights, getting policymakers to understand alignment issues would be more likely to result in alignment progress than having more conversations with people in the technical alignment space

By 'alignment progress' do you mean an increased rate of insights per year? Due to increased alignment funding?

Anyway, I don't think you're going to get "shut it all down" without either a warning shot or a congressional hearing.

If you just extrapolate trends, it wouldn't particularly surprise me to see Alex Turner at a congressional hearing arguing against "shut it all down". Big AI has an incentive to find the best witnesses it can, and Alex Turner seems to be getting steadily more annoyed. (As am I, fwiw.)

Again, extrapolating trends, I expect MIRI's critics like Nora Belrose will increasingly shift from the "inside game" of trying to engage w/ MIRI directly to a more "outside game" strategy of explaining to outsiders why they don't think MIRI is credible. After the US "shuts it down", countries like the UAE (accused of sponsoring genocide in Sudan) will likely try to quietly scoop up US AI talent. If MIRI is considered discredited in the technical community, I expect many AI researchers to accept that offer instead of retooling their career. Remember, a key mistake the board made in the OpenAI drama was underestimating the amount of leverage that individual AI researchers have, and not trying to gain mindshare with them.

Pause maximalism (by which I mean focusing 100% on getting a pause and not trying to speed alignment progress) only makes sense to me if we're getting a ~complete ~indefinite pause. I'm not seeing a clear story for how that actually happens, absent a much broader doomer consensus. And if you're not able to persuade your friends, you shouldn't expect to persuade your enemies.

Right now I think MIRI only gets their stated objective in a world where we get a warning shot which creates a broader doom consensus. In that world it's not clear advocacy makes a difference on the margin.
