Wei Dai


Just saw a poll result that's consistent with this.

I didn't downvote/disagree-vote Ben's comment, but it doesn't unite the people who think that accelerating development of certain technologies isn't enough to (sufficiently) prevent doom, that we also need to slow down or pause development of certain other technologies.

Crossposted from X (I'm experimenting with participating more there.)

This is speaking my language, but I worry that AI may inherently disfavor defense (in at least one area), decentralization, and democracy, and may differentially accelerate wrong intellectual fields, and humans pushing against that may not be enough. Some explanations below.

"There is an apparent asymmetry between attack and defense in this arena, because manipulating a human is a straightforward optimization problem [...] but teaching or programming an AI to help defend against such manipulation seems much harder [...]" https://www.lesswrong.com/posts/HTgakSs6JpnogD6c2/two-neglected-problems-in-human-ai-safety

"another way for AGIs to greatly reduce coordination costs in an economy is by having each AGI or copies of each AGI profitably take over much larger chunks of the economy" https://www.lesswrong.com/posts/Sn5NiiD5WBi4dLzaB/agi-will-drastically-increase-economies-of-scale

Lastly, I worry that AI will slow down progress in philosophy/wisdom relative to science and technology, because we have easy access to ground truths in the latter fields, which we can use to train AI, but not the former, making it harder to deal with new social/ethical problems.

Came across this account via a random lawyer I'm following on Twitter (for investment purposes), who commented, "Huge L for the e/acc nerds tonight". Crazy times...

https://twitter.com/karaswisher/status/1725678898388553901 Kara Swisher @karaswisher

Sources tell me that the profit direction of the company under Altman and the speed of development, which could be seen as too risky, and the nonprofit side dedicated to more safety and caution were at odds. One person on the Sam side called it a “coup,” while another said it was the right move.

Similarly, suppose you have two deontological values which trade off against each other. Before systematization, the question of “what’s the right way to handle cases where they conflict” is not really well-defined; you have no procedure for doing so.

Why is this a problem that calls out to be fixed (hence leading to systematization)? Why not just stick with the default of "go with whichever value/preference/intuition feels stronger in the moment"? People do that unthinkingly all the time, right? (I have my own thoughts on this, but I'm curious whether you agree with me, or what your own thinking is.)

And that’s why the “mind itself wants to do this” does make sense, because it’s reasonable to assume that highly capable cognitive architectures will have ways of identifying aspects of their thinking that “don’t make sense” and correcting them.

How would you cash out "don't make sense" here?

which makes sense, since in some sense systematizing from our own welfare to others’ welfare is the whole foundation of morality

This seems wrong to me. I think concern for others' welfare comes from being directly taught/trained as a child to have concern for others, and then later reinforced by social rewards/punishments as one succeeds or fails at various social games. This situation could have come about without anyone "systematizing from our own welfare", just by cultural (and/or genetic) variation and selection. I think value systematizing more plausibly comes into play with things like widening one's circle of concern beyond one's family/tribe/community.

What you're trying to explain with this statement, i.e., "Morality seems like the domain where humans have the strongest instinct to systematize our preferences" seems better explained by what I wrote in this comment.

This reminds me that I have an old post asking Why Do We Engage in Moral Simplification? (What I called "moral simplification" seems very similar to what you call "value systematization".) I guess my post didn't really fully answer this question, and you don't seem to talk much about the "why" either.

Here are some ideas after thinking about it for a while. (Morality is Scary is useful background here, if you haven't read it already.)

  1. Wanting to use explicit reasoning with our values (e.g., to make decisions), which requires making our values explicit, i.e., defining them symbolically, which necessitates simplification given limitations of human symbolic reasoning.
  2. Moral philosophy as a status game, where moral philosophers are implicitly scored on the moral theories they come up with by simplicity and by how many human moral intuitions they are consistent with.
  3. Everyday signaling games, where people (in part) compete to show that they have community-approved or locally popular values. Making values legible and not too complex facilitates playing these games.
  4. Instinctively transferring our intuitions/preferences for simplicity from "belief systematization" where they work really well, into a different domain (values) where they may or may not still make sense.

(Not sure how any of this applies to AI. Will have to think more about that.)

if the biggest impact is the very slow thinking from a very small group of people who care about them, then I think that’s a very small impact

I guess from my perspective, the biggest impact is the possibility that the idea of better preparing for these risks becomes a lot more popular. An analogy with Bitcoin comes to mind, where the idea of cryptography-based distributed money languished for many years, known only to a tiny community, and then was suddenly everywhere. An AI pause would provide more time for something like that to happen. And if the idea of better preparing for these risks is actually a good one (as you seem to think), there's no reason why it couldn't (or would be very unlikely to) spread beyond a very small group, do you agree?

In My views on “doom” you wrote:

Probability of messing it up in some other way during a period of accelerated technological change (e.g. driving ourselves crazy, creating a permanent dystopia, making unwise commitments…): 15%

Do you think these risks can also be reduced by 10x by a "very good RSP"? If yes, how or by what kinds of policies? If not, isn't "cut risk dramatically [...] perhaps a 10x reduction" kind of misleading?

It concerns me that none of the RSP documents or discussions I've seen talked about these particular risks, or "unknown unknowns" (other risks that we haven't thought of yet).

I'm also bummed that "AI pause" people don't talk about these risks either, but at least an AI pause would implicitly address these risks by default, whereas RSPs would not.
