(By "most promising" I mostly mean "not obviously making noob mistakes", with the central examples being "any Proper Noun research agenda associated with a specific person or org".)

(By "formal" I mean "involving at least some math proofs, and not solely coding things".)

Asking because the field is both relatively-small and also I'm not sure if any single person "gets" all of it anymore.

Example that made me ask this (not necessarily a central example): Nate Soares wrote this about John Wentworth's work, but then Wentworth replied saying it was inaccurate about his current/overall priorities.

New to LessWrong?

New Answer
New Comment
2 comments, sorted by Click to highlight new comments since: Today at 11:30 PM
[-]Seth Herd10mo-2-1

I'm just curious, why the specification of math proofs? I know of some modestly promising ideas for aligning the sorts of AGI we're likely to get, and none of them were originally specified in mathematical terms. Tacking on maths to those wouldn't really be useful. My impression is that the search for formal proofs of safety have failed and are probably hopeless. It also seems like adding mathematical gloss to ML and psychological concepts is more often confusing than enlightening.

It's to differentiate from more-obviously-tractable less-formalism/conceptual/deconfusion-based research agendas like i.e. HCH. As asked, I'm looking for info specifically related to this other kind.