All of Linch's Comments + Replies

In 2015 or so, when my friend and I independently came across a lot of rationalist concepts, we learned that we were both interested in this sort of LW-shaped thing. He suggested we try the AI box game. I played the game as Gatekeeper and won with ease. So at least my own anecdotes don't make me particularly worried.

That said, these days I wouldn't publicly offer to play the game against an unlimited pool of strangers. When my friend and I played against each other, there was an implicit set of norms in play that explicitly don't apply to the game as s…

Yeah, this came up a number of times during covid forecasting in 2020. Eg, you might expect the correlational effect of having a lockdown during times of expected high mortality load to outweigh any causal advantages of lockdowns on mortality.
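This confounding story can be sketched with a toy simulation (the numbers and the selection rule below are made up purely for illustration, not fitted to any real covid data):

```python
import random

random.seed(0)

# Toy model: each country has a latent expected mortality load;
# countries facing higher loads are more likely to lock down,
# even though lockdowns causally *reduce* deaths here.
countries = []
for _ in range(10_000):
    base_deaths = random.uniform(10, 1000)       # expected deaths with no lockdown
    locked_down = base_deaths > 400              # high-load countries lock down
    causal_multiplier = 0.8 if locked_down else 1.0  # lockdown cuts deaths by 20%
    deaths = base_deaths * causal_multiplier
    countries.append((locked_down, deaths))

mean_locked = sum(d for l, d in countries if l) / sum(1 for l, _ in countries if l)
mean_open = sum(d for l, d in countries if not l) / sum(1 for l, _ in countries if not l)

# Naive cross-country comparison points the "wrong" way:
# lockdown countries show more deaths on average, because
# selection outweighs the causal benefit.
print(mean_locked > mean_open)  # True
```

With these made-up numbers, every lockdown causally reduced deaths by 20%, yet the naive comparison still associates lockdowns with higher mortality.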

Do I understand this correctly as: "Countries with more deaths are more likely to do lockdowns, which may result in: 'the countries that did lockdowns actually had more covid deaths than the countries that did not'."?


Shankar Sivarajan (1mo):
For a community that makes a big deal about "Bayesian reasoning," it's amusing and utterly unsurprising that so many here fail utterly to recognize applications of it in practice.

Going forward, LTFF is likely to be a bit more stringent (~15-20%?[1] Not committing to the exact number) about approving mechanistic interpretability grants than grants in other subareas of empirical AI safety, particularly from junior applicants. Some assorted reasons (note that not all fund managers necessarily agree with each of them):

  • Relatively speaking, a high fraction of resources and support for mechanistic interpretability comes from sources in the community other than LTFF; we view support for mech interp as less neglected within t…

My guess is that it's because "Francesca" sounds more sympathetic as a name.

Yes. 'Gino' (rather than 'Gina') is a guy name, while 'Francesca' is a woman's name. This incorrect framing (the correct legal use of 'X v Y' would of course be to use her inconveniently-male-sounding surname) is a cheap but useful PR trick for her, so she's not going to miss it, and this framing is part of her overall defense: that she's being persecuted out of misogyny and she is the real victim here.

Interesting! That does align better with the survey data than what I see on e.g. Twitter.

Out of curiosity, is "around you" a rationalist-y crowd, or a different one?

No, just regular people.

Apologies if I'm being naive, but it doesn't seem like an oracle AI[1] is logically or practically impossible, and a good oracle should be able to perform well at long-horizon tasks[2] without "wanting things" in the behaviorist sense, or bending the world in consequentialist ways.

The most obvious exception is if the oracle's own answers are causing people to bend the world in the service of hidden behaviorist goals that the oracle has (e.g. making the world more predictable to reduce future loss), but I don't have strong reasons to believe that…

An oracle doesn't have to have hidden goals. But when you ask it what actions would be needed to do the long-term task, it chooses the actions that would lead to that task being completed. If you phrase that carefully enough, maybe you can get away with it. But maybe it calculates that the best output to achieve result X is an output that tricks you into rewriting it into an agent, etc. In general, asking an oracle AI any question whose answers depend on the future effects in the real world of those answers would be very dangerous. On the other hand, I don't think answering important questions on solving AI alignment is a task whose output necessarily needs to depend on its future effects on the real world. So, in my view, an oracle could be used to solve AI alignment without killing everyone, as long as there are appropriate precautions against asking it careless questions.

I'd be pretty scared of an oracle AI that could do novel science, and it might still want things internally. If the oracle can truly do well at designing a fusion power plant, it can anticipate obstacles and make revisions to plans just as well as an agent-- if not better because it's not allowed to observe and adapt. I'd be worried that it does similar cognition to the agent, but with all interactions with the environment done in some kind of efficient simulation. Or something more loosely equivalent.

It's not clear to me that this is as dangerous as havin…

Someone privately messaged me this whistleblowing channel for people to give their firsthand accounts of board members. I can't verify the veracity/security of the channel but I'm hoping that having an anonymous place to post concerns might lower the friction or costs involved in sharing true information about powerful people:

In the last 4 days, they were probably running on no sleep (and were less used to that/had less access to the relevant drugs than Altman and Brockman), and had approximately zero external advisors, while Altman seemed to be tapping into half of Silicon Valley and beyond for help/advice.

Apologies, I've changed the link.


I think it was clear from context that Lukas' "EAs" was intentionally meant to include Ben, and is also meant as a gentle rebuke re: naivete, not a serious claim re: honesty.

This is not a link to a shared chat; this is a link that is private to you.

Besides his most well-known controversial comments re: women, at least according to my read of his Wikipedia page, Summers has a poor track record re: being able to identify and oust sketchy people specifically.

I think it was most likely unanimous among the remaining 4, otherwise one of the dissenters would've spoken out by now.

My favorite low-probability theory is that he had blackmail material on one of the board members[1], who initially decided after much deliberation to go forward despite the blackmail, and then, when they realized they got outplayed by Sam not using the blackmail material, backpedaled and refused to dox themselves. And the other 2-3 didn't know what to do afterwards, because their entire strategy was predicated on optics management around said blackmail + blackmail material.

  1. ^

    Like something actually really bad.

AFAICT the only formal power the board has is in firing the CEO, so if we get a situation where whenever the board wants to fire Sam, Sam comes back and fires the board instead, well, it's not exactly an inspiring story for OpenAI's governance structure.

This is a very good point. It is strange, though, that the Board was able to fire Sam without the Chair agreeing to it. It seems like something as big as firing the CEO should have required at least a conversation with the Chair, if not the affirmative vote of the Chair. The way this was handled was a big mistake. There need to be new rules in place to prevent big mistakes like this.

Here are some notes on why I think Imperial Japan was unusually bad, even by the very low bar set by the Second World War.

CW: fairly frank discussions of violence, including sexual violence, in some of the worst publicized atrocities with human victims in modern human history. Pretty dark stuff in general.

tl;dr: Imperial Japan did worse things than the Nazis. There was probably a greater scale of harm, more unambiguous and greater cruelty, and more commonplace breaking of near-universal human taboos.

I think the Imperial Japanese Army was noticeably worse during World War II than the Nazis. Obviously words like "noticeably worse" and "bad" and "crimes against humanity" are to some ex…

Huh, I didn't expect something this compelling after I voted disagree on that comment of yours from a while ago. I do think I probably still overall disagree, because the Holocaust so uniquely attacked what struck me as one of the most important gears in humanity's engine of progress, which was the Jewish community in Europe, and the (almost complete) loss of that seems to me like it has left deeper scars than anything the Japanese did (though man, you sure have made a case that Imperial Japan in WW2 was really quite terrifying).
Alexander Gietelink Oldenziel (4mo)

Yeah, terrorists are often not very bright, conscientious, or creative.[1] I think rationalist-y types might systematically overestimate how much proliferation of non-novel information can still be bad, via giving scary ideas to scary people.

  1. ^

    No offense intended to any members of the terror community reading this comment

My guess is still that this is below the LTFF bar (which imo is quite high) but I've forwarded some thoughts to some metascience funders I know. I might spend some more free time trying to push this through later. Thanks for the suggestion! 

This is the type of thing that speaks to me aesthetically but my guess is that it wouldn't pencil, though I haven't done the math myself (nor do I have a good sense of how to model it well). Improving business psychology is just not a very leveraged way to improve the long-term future compared to the $X00M/year devoted to x-risk, so the flow-through effects have to be massive in order to be better than marginal longtermist grants. (If I was making grants in metascience this is definitely the type of thing I'd consider).

I'm very open to being wrong though; ... (read more)

(sorry for pontificating when you asked for an actual envelope or napkin) upside is an externality, Ziani incidentally benefits but the signal to other young grad students that maybe career suicide is a slightly more viable risk seems like the source of impact. Agree that this subfield isn't super important, but we should look for related opportunities in subfields we care more about. I don't know if designing a whistleblower prize is a good nerdsnipe / econ puzzle, in that it may be a really bad goosechase (since generating false positives through incentives imposes name-clearing costs on innocent people, and either you can design your way out of this problem or you can't).

I started writing a comment reply to elaborate after getting some disagreevotes on the parent comment, but decided that it'd be a distraction from the main conversation; I might expand on my position in an LW shortform at some point in the near future.

Answer by Linch, Oct 26, 2023

I'm happy to play any of the 4 roles. I haven't played non-blitz chess in quite a while (and never played it seriously), but I would guess I'm ~1300 at standard time controls (interpolating between different time controls and assuming a similar decay as in other games like Go).

I'm free after 9pm PDT most weekdays, and free between noon and 6pm or so on weekends.

I think Germany is an extreme outlier here fwiw; (eg) Japan did far worse things and after WW2 cared more about covering up wrongdoing than about admitting fault; further, Germany's governmental and cultural "reformation" was very much strongarmed by the US and other Allies, whereas the US actively assisted Japan in covering up war crimes.

EDIT: See shortform elaboration: 

Here are some notes on why I think Imperial Japan was unusually bad, even by the very low bar set by the Second World War.
Daniel Kokotajlo (4mo):
Curious why you say "far worse" rather than "similarly bad" though this isn't important to the main conversation.

Thanks! Though tbh I don't think I fully got the core point via reading the post so I should only get partial credit; for me it took Alexander's comment to make everything click together.

Answer by Linch, Oct 11, 2023

I think anything that roughly maps to "live vicariously through higher-status people" is usually seen as low-status, because higher-status people are presumed to be able to just do those things directly:

  • watching porn/masturbation (as opposed to having sex)
  • watching sports
  • watching esports
  • romance novels
  • reality TV
  • Guitar Hero etc (a bit of a stretch but you're playing the role of somebody who could actually play guitar)

One exception is when the people you're living vicariously through are very high status/do very difficult things (eg it's medium to high status…

I think I read this a few times but I still don't think I fully understand your point. I'm going to try to rephrase what I believe you are saying in my own words:

  • Our correct epistemic state in 2000 or 2010 should be to have a lot of uncertainty about the complexity and fragility of human values. Perhaps it is very complex, but perhaps people are just not approaching it correctly.
  • At the limit, the level of complexity can approach "simulate a number of human beings in constant conversation and moral deliberation with each other, embedded in the existing broa…
Matthew Barnett (5mo):
Yes, I think so, with one caveat: I'm not saying anything about the fragility of value argument, since that seems like a separate argument than the argument that value is complex. I think the fragility of value argument is plausibly a statement about how easy it is to mess up if you get human values wrong, which still seems true depending on one's point of view (e.g. if the AI exhibits all human values except it thinks murder is OK, then that could be catastrophic). Overall, while I definitely could have been clearer when writing this post, the fact that you seemed to understand virtually all my points makes me feel better about this post than I originally felt.

My current (maybe boring) view is that any academic field where the primary mode of inquiry is applied statistics (much of the social sciences and medicine) is suss. The fields where the primary tool is mathematics (pure mathematics, theoretical CS, game theory, theoretical physics) still seem safe, and the fields where the primary tool is computers (distributed systems, computational modeling in various fields) are reasonably safe. ML is somewhere in between computers and statistics.

Fields where the primary tool is just looking around and counting (demography, taxonomy, astronomy(?)) are probably safe too? I'm confused about how to orient towards the humanities. 

I don't think this is a sufficiently complete way of looking at things. It could make sense when the problem was thought to be "replication crisis via p-hacking," but it turns out things are worse than this:

  • The research methodology in biology doesn't necessarily have room for statistical funny business, but there are all these cases of influential Science/Nature papers that had fraud via photoshop.
  • Gino and Ariely's papers might have been statistically impeccable; the problem is they were just making up data points.
  • There is fraud in experimental physics and applied sciences too from time to time.

I don't know much about what opportunities there are for bad research practices in the humanities. The only thing I can think of is citing a source that doesn't say what is claimed. This seems like a particular risk when history or historical claims are involved, or when a humanist wants to refer to the scientific literature. The spectacular claim that Victorian doctors treated "hysteria" using vibrators turns out to have resulted from something like this. Outside cases like that, I think the humanities are mostly "safe" like math, in that they just need some kind of internal consistency, whether that is presenting a sound argument or a set of concepts and descriptions that people find to be harmonious or fruitful.
Gesild Muka (5mo):
In this incident something was true because the “experts” decided it must be true. That’s humanities in (almost?) every incident.

What you originally said was "say it's not at all obvious that a vegan diet has health tradeoffs ex-ante". I think what you meant here was "it's not clear a vegan diet is net negative." A vegan diet leading to lower energy levels but longer lifespan is the definition of a trade-off. 

This might be semantics, but when you said "Change my mind: Veganism entails trade-offs, and health is one of the axes" I (until now) interpreted the claim as vegans needing to trade off health (writ large) against other desirable properties (taste, cost, convenience, etc)…

Life is trade-offs all the way down. 

crossposted from answering a question on the EA Forum.

(My own professional opinions, other LTFF fund managers etc might have other views) 

Hmm I want to split the funding landscape into the following groups:

  1. LTFF
  2. OP
  3. SFF
  4. Other EA/longtermist funders
  5. Earning-to-givers
  6. Non-EA institutional funders.
  7. Everybody else


At LTFF our two biggest constraints are funding and strategic vision. Historically it was some combination of grantmaking capacity and good applications but I think that's much less true these days. Right now we have enough new donations to fund what…

In others, being a worker vs. queen is something of a choice, and if circumstances change a worker may start reproducing. There isn't a sharp transition between cooperative breeding and eusociality. 

Yep. Once the old naked mole-rat queen dies, the remaining female naked mole-rats have a dominance contest until one girl emerges victorious and becomes the new queen.

Thanks, that's very helpful context. In principle, I wouldn't put too much stock in the specific numbers of a single poll, since those results depend too much on specific wording etc. But the trend in this poll is consistent enough over all questions that I'd be surprised if the questions could be massaged to get the opposite results, let alone ones 10x in favor of the accelerationist side.

I believe this has been replicated consistently across many polls. For the results to change, reality (in the sense of popular opinion) likely has to change, rather than…

David Hornbein (5mo):
One very common pattern is, most people oppose a technology when it's new and unfamiliar, then later once it's been established for a little while and doesn't seem so strange most people think it's great.

UPDATE 2023/09/13: 

Including only money that has already landed in our bank account and extremely credible donor promises of funding, LTFF has raised ~1.1M and EAIF has raised ~500K. After Open Phil matching, this means LTFF now has ~3.3M in additional funding and EAIF has ~1.5M in additional funding.

We are also aware that other large donors, including both individuals and non-OP institutional donors, are considering donating to us. In addition, while some recurring donors have likely moved up their donations to us because of our recent unusually u…

possibly enough to risk a suit if Lincoln wanted to

Would be pretty tough to do given the legal dubiousness re: enforceability of non-disparagement agreements in the US (note: the judgement applies retroactively)

Thanks! I haven't played ONUW much; Avalon is the main game I play, along with more classic mafia, werewolf, Secret Hitler, and Quest.

Eg I think in advanced Among Us lobbies it's an important skill to subtly push an unproductive thread of conversation without making it obvious that you were the one who distracted everybody.

I'm not much of an avid Among Us player, but I suspect this only works in Among Us because of the (much) heavier-than-usual time pressures. In the other social deception games I'm aware of, the structural incentives continue to point in the other direction, so the main reason for bad guys to make spurious accusations is for anti-inductive reasons (if everybody knows th…

I don't understand this - it reads to me like you're saying a similar thing is true for the game and real life? But that goes against your position.

Sorry that was awkwardly worded. Here's a simplified rephrase:

In games, bad guys want to act and look not the same. In real life, if you often agree with known bad folks, most think you're not good.

Put in a different way, because of the structure of games like Avalon (it's ~impossible for all the bad guys to not be found out, minions know who each other are, all minions just want their "team" to win so having s…

Errol is a Logical Decision Theorist. Whenever he's playing a game of Werewolf, he's trying to not just win that game, but to maximize his probability of winning across all versions of the game, assuming he's predictable to other players. Errol firmly commits to reporting whether he's a werewolf whenever he gets handed that role, reasoning that behind the veil of ignorance, he's much more likely to land as villager than as werewolf, and that villager team always having a known villager greatly increases his overall odds of winning. Errol follows through with his commitments. Errol is not very fun to play with and has since been banned from his gaming group.
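Errol's veil-of-ignorance arithmetic can be sketched for a hypothetical setup (7 players, 2 werewolves; the win rates below are invented purely for illustration, not taken from any real Werewolf statistics):

```python
from fractions import Fraction

players, werewolves = 7, 2  # hypothetical game setup
p_villager = Fraction(players - werewolves, players)  # 5/7: role Errol lands most often
p_werewolf = Fraction(werewolves, players)            # 2/7

# Invented win rates: committing to out yourself helps the village a lot
# when you're a villager, but guarantees a loss when you're a werewolf.
win_commit = p_villager * Fraction(70, 100) + p_werewolf * 0
win_normal = p_villager * Fraction(50, 100) + p_werewolf * Fraction(40, 100)

print(float(win_commit))          # 0.5
print(win_commit > win_normal)    # True: the commitment pays off on average
```

Whether the commitment actually pays depends entirely on those made-up win rates; the point is just that trading away all werewolf wins for a boost to the more common villager role can come out ahead in expectation.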

As I mentioned to you before, I suspect werewolf/mafia/avalon is a pretty bad analogy for how to suss out the trustworthiness of people irl:

  • in games, the number of werewolves etc is often fixed and known to all players ahead of time; irl a lot of the difficulty is figuring out whether (and how many) terminally bad actors exist, vs honest misunderstandings, vs generically suss people. 
  • random spurious accusations with zero factual backing are usually considered town/vanilla/arthurian moves in werewolf games; irl this breeds chaos and is a classic DARVO
…
This is very tangential, but: if that's your experience with e.g. one night ultimate werewolf, then I strongly recommend changing the mix of roles so that the numbers on each side are random and the werewolf side ends up in the majority a nontrivial fraction of the time. Makes the game a lot more fun/interesting IMO, and negates some of the points you list about divergence between such games and real life.
Elias Schmied (6mo):
In my experience this is only true for beginner play (where werewolves are often too shy to say anything), and in advanced play it is a bad guy tactic for the same reasons as IRL. Eg I think in advanced Among Us lobbies it's an important skill to subtly push an unproductive thread of conversation without making it obvious that you were the one who distracted everybody.

It's not clear/concrete to me in what ways points 3 and 5 are supposed to invalidate the analogy.

I don't understand this - it reads to me like you're saying a similar thing is true for the game and real life? But that goes against your position.

Interpreted literally,

FWIW I’ve never known a character of high integrity who I could imagine writing the phrase “your career in EA would be over with a few DMs”.

contains the phrase "your career in EA would be over with a few DMs". I don't think it was meant to be interpreted literally though.

Are you familiar with the use-mention distinction? It seems pretty relevant here.

For example, maybe I know the person well enough to justify the following charitable interpretation:

That phrase could be interpreted as a subtle threat, especially in the context of us cu…
Adam Zerner (6mo):
FWIW, I didn't mean it as a cheap shot. I just wanted to establish that context is, in fact, relevant (use-mention is an example of relevant context). And from there, go on to talk about why I think there are realistic contexts where a high-character person would make the statement.

I'm pretty sure that most EAs I know have ~100% confidence that what they're doing is net positive for the long-term future.

Really? Without giving away names, can you tell me roughly what cluster they are in? Geographical area, age range, roughly what vocation (technical AI safety/AI policy/biosecurity/community building/earning-to-give)? 

I'm super interested in how you might have arrived at this belief: would you be able to elaborate a little? For instance, is there a theoretical argument going on here, like a weak form of cluelessness? Or is it mor…

My current guess is that a lack of LTFF funding is probably producing more researchers at Anthropic than otherwise, because there just aren't that many opportunities for people to work on safety or safety-adjacent roles. E.g. I know of people who are interviewing for Anthropic capabilities teams because, idk man, they just want a safety-adjacent job with a minimal amount of security, and it's what's available. Having spoken to a bunch of people, I strongly suspect that of the people that I'd want to fund but won't be funded, at least a good fraction are significan…

(personal opinions)

Yeah, most of the things I'm thinking of didn't look like technical safety stuff; more like Demis and Shane being concerned about safety -> deciding to found DeepMind, Eliezer introducing Demis and Shane to Peter Thiel (their first funder), etc.

In terms of technical safety stuff, sign confusion around RLHF is probably the strongest candidate. I'm also a bit worried about capabilities externalities of Constitutional AI, for similar reasons. There's also the general vibes issue of safety work (including quite technical work) and communic…

Yeah, I definitely think this is true to some extent. "First get impact, then worry about the sign later" and all. 