Decision Theory: Newcomb's Problem


As a personal datapoint: I think the OP's descriptions have a lot in common with how I used to be operating, and I think this would have been tremendously good advice for me personally, both in terms of its impact on my personal wellness and in terms of its impact on whether I did good-for-the-world things or harmful things.

(If it matters, I still think AI risk is a decent pointer at a thingy in the world that may kill everyone, and that this matters.  The "get sober" thing is a good idea both in relation to that and broadly AFAICT.)

Nope, haven't changed it since publication.

I like this observation.  As a random note, I've sometimes heard people justifying "leave poor working conditions in place for others, rather than spending managerial time improving them" based on how AI risk is an emergency, though whether this checks out on a local consequentialist level is not actually analyzed by the model above, since it partly involves tradeoffs between people and I didn't try to get into that.

I sorta also think that "people acting on a promise of community and support that they later [find] [isn't] there" is sometimes done semi-deliberately by the individuals in question, who are trying to get as much work out of their System 1s as possible, by hoping a thing works out without really desiring accurate answers.  Or by others who value getting particular work done (via those individuals working hard) and think things are urgent and so are reasoning short-term and locally consequentialist-ly.  Again partly because people are reasoning near an "emergency."  But this claim seems harder to check/verify.  I hope people put more time into "really generating community" rather than "causing newcomers to have an expectation of community," though.

I can think of five easily who spontaneously said something like this to me and who I recall specific names and details about.  And like 20 more who I'm inclined to project it onto but there was some guesswork involved on my part (e.g., they told me about trouble having hobbies and about feeling kinda haunted by whether it's okay to be "wasting" their time, and it seemed to me these factors were connected, but they didn't connect them aloud for me; or I said I thought there was a pattern like this and they nodded and discussed experiences of theirs but in a way that left some construal to me and might've been primed.  Also I did not name the 20, might be wrong in my notion of how many).

In terms of the five: two "yes better shot IMO," three not.  For the 20, maybe 1/4th "better shot IMO".

If you get covid (which many of my friends seem to be doing lately), and your sole goal is to minimize risk of long-term symptoms, is it best to take paxlovid right away, or with a delay?

My current low-confidence guess is that it is best with a delay of ~2 days post symptoms.  Would love critique/comments, since many here will face this sometime this year.

Basic reasoning: anecdotally, "covid rebound" seems extremely common among those who get paxlovid right away, probably also worse among those who get paxlovid right away.  Paxlovid prevents viral replication but does not destroy the virus already in your body.  With a delay, your own immune system learns to do this, else not as much.

Data and discussion:

Maybe.  But a person following up on threads in their leisure time, and letting the threads slowly congeal until they turn out to turn into a hobby, is usually letting their interests lead them initially without worrying too much about "whether it's going anywhere," whereas when people try to "found" something they're often trying to make it big, trying to make it something that will be scalable and defensible.  I like that this post is giving credit to the first process, which IMO has been historically pretty useful pretty often.  I'd also point to the old tradition of "gentlemen scientists" back before the era of publicly funded science, who performed very well per capita; I would guess that high performance was at least partly because there was more low-hanging fruit back then, but my personal guess is that that wasn't the only cause.

I appreciate this comment a lot.  Thank you.  I appreciate that it’s sharing an inside view, and your actual best guess, despite these things being the sort of thing that might get social push-back!

My own take is that people depleting their long-term resources and capacities is rarely optimal in the present context around AI safety.

My attempt to share my reasoning is pretty long, sorry; I tried to use bolding to make it skimmable.

In terms of my inside-view disagreement, if I try to reason about people as mere means to an end (e.g. “labor”):

0.  A world where I'd agree with you.  If all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, either because AI was in two years, or because there would be an army of new people in two years), I agree that people focusing obsessively for 60 hours/week would probably produce more than the same people capping their work at 35 hrs/week.

But (0) is not the world we’re in, at least right now.  Specific differences between a world where I'd agree with you, and the world we seem to me to be in:

1.  Having a steep discount rate on labor seems like a poor predictive bet to me.  I don’t think we’re within two years of the singularity; I do think labor is increasing but not at a crazy rate; and a person who keeps their wits and wisdom about them, who pays attention and cares and thinks and learns, and especially someone who is relatively new to the field and/or relatively young (which is the case for most such engineers I think), can reasonably hope to be more productive in 2 years than they are now, which can roughly counterbalance the increase (or more than counterbalance the increase) on my best guess.

E.g., if they get hired at Redwood and then stay there, you’ll want veterans a couple years later who already know your processes and skills.

(In 2009, I told myself I needed only to work hard for ~5 years, maybe 10, because after that I’d be a negligible portion of the AI safety effort, so it was okay to cut corners.  I still think I’m a non-negligible portion of the effort.)

1.1.  Trying a thing to see if it works (e.g. 60 hrs/week of obsession, to see how that is) might still be sensible, but more like “try it and see if it works, especially if that risk and difficulty is appealing, since “appealingness” is often an indicator that a thing will turn out to make sense / to yield useful info / to be the kind of thing one can deeply/sincerely try rather than forcing oneself to mimic, etc.” not like “you are nothing and don’t matter much after two years, run yourself into the ground while trying to make a project go.”  I suppose your question is about accepting a known probability of running yourself into the ground, but I’m having trouble booting that sim; to me the two mindsets are pretty different.  I do think many people are too averse to risk and discomfort; but also that valuing oneself in the long-term is correct and important.  Sorry if I’m dodging the question here.

2.  There is no single project that is most of what matters in AI safety today, AFAICT.  Also, such projects as exist are partly managerially bottlenecked.  And so it isn’t “have zero impact” vs “be above Redwood’s/project such-and-such’s hiring line,” it is “be slightly above a given hiring line” (and contribute the difference between that spot and the person who would fill it next, or between that project having one just-above-margin person and having one fewer but more managerial slack) vs “be alive and alert and curious as you take an interest in the world from some other location”, which is more continuous-ish.

3.  We are confused still, and the work is often subtle, such that we need people to notice subtle mismatches between what they’re doing and what makes sense to do, and subtle adjustments to specific projects, to which projects make sense at all, and subtle updates from how the work is going that can be propagated to some larger set of things, etc.  We need people who care and don’t just want to signal that they kinda look like they care.  We need people who become smarter and wiser and more oriented over time and who have deep scientific aesthetics, and other aesthetics.  We need people who can go for what matters even when it means backtracking or losing face.  We don’t mainly need people as something like fully needing-to-be-directed subjugated labor, who try for the appearances while lacking an internal compass.  I expect more of this from folks who average 35 hrs/week than 60 hrs/week in most cases (not counting brief sprints, trying things for awhile to test and stretch one’s capacities, etc. — all of which seems healthy and part of fully inhabiting this world to me).  Basically because of the things pointed out in Raemon’s post about slack, or Ben’s post about the Sabbath.  Also because often 60 hrs/week for long periods of time means unconsciously writing off important personal goals (cf Critch’s post about addiction to work), and IMO writing off deep goals for the long-term makes it hard to sincerely care about things.

(4.  I do agree there’s something useful about being able to work on other peoples’ projects, or on mundane non-glamorous projects, that many don’t have, and that naive readings of my #3 might tend to pull away from.  I think the deeper readings of #3 don’t, but it could be discussed.)

If I instead try to share my actual views, despite these being kinda wooey and inarticulate and hard to justify, instead of trying to reason about people as means to an end:

A.  I still agree that in a world where all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, or probably even five years), people focusing obsessively for 60 hours/week would be in many ways saner-feeling, more grounding, and more likely to produce the right kind of work in the right timeframe than the same people capping their work at 35 hrs/week.  (Although even here, vacations, sabbaths, or otherwise carefully maintaining enough of the right kinds of slack and leisure that deep new things can bubble up seems really valuable to me; otherwise I expect a lot of people working hard at dumb subtasks.)

A2.  I’m talking about “saner-feeling” and “more grounding” here, because I’m imagining that if people are somehow capping their work at 35 hrs/week, this might be via dissociating from how things matter, and dissociation sucks and has bad side-effects on the quality of work and of team conversation and such IMO.  This is really the main thing I’m optimizing for ~in general; I think sane grounded contexts where people can see what causes will have what effects and can acknowledge what matters will mostly cause a lot of the right actions, and that the main question is how to cause such contexts, whether that means 60 hrs/week or 35 hrs/week or what.

A3.  In this alternate world, I expect people will kinda naturally reason about themselves and one another as means to an end (to the end of us all surviving), in a way that won’t be disoriented and won’t be made out of fear and belief-in-belief and weird dissociation.

B.  In the world we seem to actually be in, I think all of this is pretty different:

B1.  It’s hard to know what safety strategies will or won’t help how much.  

B2.  Lots of people have “belief in belief” about safety strategies working.  Often this is partly politically motivated/manipulated, e.g. people wanting to work at an organization and to rise there via buying into that organization’s narrative; an organization wanting its staff and potential hires to buy its narrative so they’ll work hard and organize their work in particular ways and be loyal.

B3.  There are large “unknown unknowns,” large gaps in the total set of strategies being done, maybe none of this makes sense, etc.

B4.  AI timelines are probably more than two years, probably also more than five years, although it’s hard to know.

C.  In a context like the hypothetical one in A, people talking about how some people are worth much more than others, about what tradeoffs will have what effects, etc. will for many cash out in mechanistic reasoning and so be basically sane-making and grounding.  (Likewise, I suspect battlefield triage or mechanistic reasoning from a group of firefighters considering rescuing people from a burning building is pretty sane-making.)

In a context like the one in B (which is the one I think we’re in), people talking about themselves and other people as mere means to an end, and about how much more some people are worth than others such that those other people are a waste for the first people to talk to, and so on, will tend to increase social fear, decrease sharing of actual views, and increase weird status stuff and the feeling that one ought not question current social narratives, I think.  It will tend to erode trust, erode freedom to be oneself or to share data about how one is actually thinking and feeling, and increase the extent to which people cut off their own and others’ perceptual faculties.  The opposite of sane-ifying/grounding.

To gesture a bit at what I mean: a friend of mine, after attending a gathering of EA elites for the first time, complained that it was like: “So, which of the 30 organizations that we all agree has no more than a 0.1% chance of saving the world do you work for?”, followed by talking shop about the specifics within that plan, with almost no attention to the rest of the probability mass.

So I think we ought mostly not to reason about ourselves and other “labor” as though we’re in simple microecon world, given the world we’re in, and given that it encourages writing off a bunch of peoples’ perceptual abilities etc.  Though I also think that you, Buck (or others) speaking your mind, including when you’re reasoning this way, is extremely helpful!  We of course can’t stop wrong views by taking my best guess at which views are right and doing belief-in-belief about it; we have to converse freely and see what comes out.

(Thanks to Justis for saying some of this to me in the comments prior to me posting.)

I think this is a solid point, and that pointing out the asymmetry in evolutionary gradients is important; I would also expect different statistical distributions for men and women here.  At the same time, my naive ev psych guess about how all this is likely to work out would also take into account that men and women share genes, and that creating gender-specific adaptations is actually tricky.  As evidence: men have nipples, and those nipples sometimes produce drops of milk.

Once, a while ago and outside this community, a female friend swore me to secrecy and then shared a story and hypothesis similar to the OP's (ETA: it was also a story of sexual touching, not of rape; I suspect rape is usually more traumatic).  I've also heard stories of being pretty messed up by sexual abuse from both men and women, including at least two different men messed up by having, as teenagers, had sex with older women (in one case, one of his teachers) without violence/force, but with manipulation.  My current guess is that adaptations designed for one sex typically appear with great variability in the other sex (e.g. male nipples' milk production), and so we should expect some variability in male reactions here.  Also everyone varies.

ETA: I'd like to quarrel with the use of the word "infohazardous" in the OP's title.  My best guess is that people would be better off having all the stories, including stories such as the OP's that it is currently somewhat taboo to share; my best guess is that there is a real risk of harm the OP is gesturing at, but this is significantly via the selective non-sharing of info, rather than being primarily via the sharing of info.

A bunch of people have told me they got worse at having serious/effortful intellectual hobbies, and at "hanging out", after getting worried about AI.  I did, for many long years.  Doesn't mean it's not an "excuse"; I agree it would be good to try to get detailed pictures of the causal structure if we can.

In fairness, a lot of these things (clothes, hairstyles, how "hard core" we can think we are based on working hours and such) have effects on our future self-image, and on any future actions that're mediated by our future self-image.  Maybe they're protecting their psyches from getting eaten by corporate memes, by refusing to cut their hair and go work there.

I suspect we need to somehow have things less based in self-image if we are to do things that're rooted in fresh perceptions etc. in the way e.g. science needs, but it's a terrifying transition.
