mesaoptimizer

https://mesaoptimizer.com

learn math or hardware

Comments

I'd love to read an elaboration of your perspective on this, with concrete examples, one that avoids the usual points of disagreement (pivotal acts vs. pivotal processes, whether social facets of the game are important for us to track, etc.) and focuses mainly on your thoughts on epistemology and rationality and how they deviate from what you consider the LW norm.

I started reading your meta-rationality sequence, but it ended after just two posts without going into detail.

David Chapman's website seems like the standard reference for what the post-rationalists call "metarationality". (I haven't read much of it, but the little I read made me somewhat unenthusiastic about continuing.)

Note that the current power differential between evals labs and frontier labs is such that I don't expect evals labs to have the slack to simply state that a frontier model failed their evals.

For such a possibility to occur, you'd need regulation with serious teeth and competent 'bloodhound' regulators watching the space like a hawk.

I just encountered polyvagal theory and I share your enthusiasm for how useful it is for modeling other people and oneself.

Note that I'm waiting for the entire sequence to be published before I read it (past the first post), so here's a heads up that I'm looking forward to seeing more of this sequence!

I think Twitter systematically underpromotes tweets with links external to the Twitter platform, so reposting isn't a viable strategy.

Thanks for the link. I believe I read it a while ago, but it is useful to reread it from my current perspective.

trying to ensure that AIs will be philosophically competent

I think such scenarios are plausible: I know some people argue that certain decision theory problems cannot be safely delegated to AI systems, but if we as humans can work on these problems safely, I expect that we could probably build systems that are about as safe (by crippling their ability to establish subjunctive dependence) but are also significantly more competent at philosophical progress than we are.

Leopold's interview with Dwarkesh is a very useful window into what's going on in his mind.

What happened to his concerns over safety, I wonder?

He doesn't believe in a 'sharp left turn', which means he doesn't consider general intelligence to be a discontinuous (latent) capability spike after which alignment becomes significantly more difficult. To him, alignment is simply a somewhat harder empirical-techniques problem, much like capabilities work. I assume he expects behavior similar to current RLHF-ed models even after frontier labs have doubled or quadrupled the OOMs of optimization power applied to creating SOTA models.

He models (incrementalist) alignment research as "dual use", and therefore effectively treats capabilities and alignment as the same kind of work.

He also expects humans to continue to exist once certain communities of humans achieve ASI, and imagines that the future will be 'wild'. This is a very rare and strange model to have.

He is quite hawkish: he is incredibly focused on preventing China from stealing AGI capabilities, and believes that private labs will be too incompetent to defend against Chinese infiltration. He would prefer that the US government take over AGI development so that it can race effectively against China.

His model for take-off relies quite heavily on "trust the trendline", estimating linear intelligence increases with more OOMs of optimization power (linear with respect to human intelligence growth from childhood to adulthood). It's not the best way to extrapolate what will happen, but it is a sensible, concrete model he can use to talk to normal people and sound confident rather than vague: a key skill if you are an investor, and an especially key skill for someone trying to make it in the SF scene. (Note that he clearly states in the interview that he's describing his modal model for how things will go, and that he does have uncertainty over how things will occur but wants to be concrete about his modal expectation.)

He has claimed that running a VC firm means he can essentially run it as a "think tank" too, focused on better modeling (and perhaps influencing) the AGI ecosystem. Given his desire for a hyper-militarization of AGI research, it makes sense that he'd try to steer things in this direction using the money and influence he will build as the founder of an investment firm.

So, in summary, he isn't concerned about safety because he prices it in as about as difficult as (or slightly more difficult than) capabilities work. This puts him in an ideal epistemic position to run a VC firm for AGI labs, since his optimism is what persuades investors to give him money in the expectation that he will return them a profit.

Oh, by that I meant something like "yeah, I really think it is not a good idea to focus on an AI arms race". See also "Slack matters more than any other outcome".

If Company A is 12 months from building Cthulhu, we fucked up upstream. Also, I don't understand why you'd want to play the AI arms race -- you have better options. They expect an AI arms race. Use other tactics. Get into their OODA loop.

Unsee the frontier lab.
