I have basically no idea how to make these happen, so I'm not opinionated on what we should do to achieve these goals. We need some combination of basic research, building tools people find useful, and stuff in-between.

[-]Charlie Steiner3y20

I'll admit I'm pessimistic, because I expect institutional inertia to be large and implementation details to unavoidably leave loopholes. But it definitely sounds interesting.

[-]Nisan3y40

I'm a bit more optimistic about loopholes because I feel like if agents are determined to build trust, they can find a way.

[-]Nisan3y20

I agree that institutional inertia is a problem, and more generally there's the problem of getting principals to do the thing. But it's more dignified to make alignment/cooperation technology available than not to make it.

[-]the gears to ascension3y20

On whose shoulders are we standing?

Some metaphor searches to find (some of) the prior work for each section:

"On whose shoulders are we standing?" entire section as a search
zero-knowledge literature paragraph as a search
open source game theory paragraph as a search
"...modal logic, biosimulation, simplicial complexes..." paragraph as a search, with added "reading list" phrasing
"executable textbook for game theory", finds a big list of game theory intro textbooks

My follow-through, I’m a bad employee in terms of consistency and dependability, and much of that would apply to independent research: I kick ass for stretches then crash (3 to 5 months asskicking per one month burned out).

holy crap, same type of pattern. am currently in a burned-out period, but feel like I could become productive again with active management. if you figure out where to buy that, definitely let me know! I'm personally considering applying to universities.

p.s. this metaphor search found some amusing old ai alignment plans that I don't think are terribly useful but may be of historical interest to someone

I’m sniped by the areas of math I’m most aesthetically attracted to, and creating a 300 IQ plan with a bajillion 4D chess moves to rationalize working on them.

While you might be risking wasting your time for all I know, this research plan as a whole seems extremely high quality to me and on the right track in a way few are. That said, I think you're underestimating how soon we'll see TAI.

(or maybe I don't know what people mean by TAI? I don't think all technology will be solved until several decades after TAI, and hitting max level on AI does not result in science instantly being completed. many causal experiments and/or enormous high-precision barely-approximate simulations will still be needed; part of the task is factorizing that work, but the work itself will still have to be done.)

Sheet of unit cube	Takeoff scenario type
(-1, y, z)	unipolar
(1, y, z)	multipolar
(x, -1, z)	homogeneous
(x, 1, z)	heterogeneous
(x, y, -1)	fast
(x, y, 1)	slow
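
As a rough illustration of the table above, here is a minimal sketch that reads a point in the cube as a takeoff scenario. The axis order and face labels come from the table; the function name `describe`, the axis names, the "strength" reading of a coordinate's magnitude, and the treatment of 0 as neutral are assumptions made for the example, not something the post specifies.

```python
from typing import Dict, Tuple

# (axis name, label of the -1 face, label of the +1 face), in table order.
# The axis names are hypothetical shorthand for this sketch.
AXES = (
    ("polarity", "unipolar", "multipolar"),
    ("geneity", "homogeneous", "heterogeneous"),
    ("speed", "fast", "slow"),
)


def describe(point: Tuple[float, float, float]) -> Dict[str, Tuple[str, float]]:
    """Map a point in [-1, 1]^3 to (nearest face label, distance from 0) per axis."""
    labels: Dict[str, Tuple[str, float]] = {}
    for coord, (axis, neg_label, pos_label) in zip(point, AXES):
        if not -1.0 <= coord <= 1.0:
            raise ValueError(f"{axis} coordinate {coord} is outside [-1, 1]")
        if coord == 0:
            # Assumption: the midpoint leans toward neither face.
            labels[axis] = ("neutral", 0.0)
        else:
            pole = neg_label if coord < 0 else pos_label
            labels[axis] = (pole, float(abs(coord)))
    return labels
```

For example, `describe((-1, 0.5, 1))` reads as fully unipolar, moderately heterogeneous, and fully slow. A forecast that is a solid rather than a point could be represented as a collection of such points, though that is beyond this sketch.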

Scenario	A nice-to-have product for that scenario	What needs to be in place for that product to be viable
The boards of corporations A and B are negotiating a merger, and have delegated most of the work to an executive assistant service.	Synthetic bargaining subroutines with free assurances that the services will act in a predictable manner	Proof-assistant-like tooling honed for properties of agents.
The leaders of country X are threatening to invade country Z, and country Y is privately committed to defending Z. Civilian and military leadership of X and Y each are augmented by assistants which are plugged into autonomous weapons systems.	A specification language to describe the stakes of the scenario (very loosely) as a “game” and a calculator to recommend policies for scenarios that abide by the specification language.	A semantic account of games and strategies that plays nicely with algorithmic game theory.
An agent is the custodian of a small civilization of ems or digital minds, which values purple hats. A DAO populated by human principals would like to utilize that civilization’s computronium, which they’re willing to play along with if members of the DAO agree to wear lots of purple hats. The agent is simulating the DAO at some fixed resolution, and the agent’s source code is known to the DAO.	A tooling stack for taking a bargaining subroutine and generating audits, interpretations, and assurances about how that subroutine will behave (this would build up confidence between the two parties).	Meaningful asks for what an audit of a bargaining subroutine should reveal, theoretical foundations for useful interpretation, etc.
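
The third scenario assumes the agent's source code is known to the other party. A minimal sketch of why that kind of transparency can matter, in the style of the standard toy examples from open-source game theory: agents are given each other's source and can condition on it. Everything here is an illustrative assumption, not part of the plan: the one-shot prisoner's dilemma framing, representing agents as Python source strings, and the names `CLIQUE_BOT`, `DEFECT_BOT`, `load`, and `play`.

```python
from typing import Callable, Dict, Tuple

# Hypothetical agent sources. Each defines act(my_src, opp_src) -> "C" or "D".
CLIQUE_BOT = '''
def act(my_src, opp_src):
    # Cooperate only with an exact copy of myself.
    return "C" if opp_src == my_src else "D"
'''

DEFECT_BOT = '''
def act(my_src, opp_src):
    # Ignore the opponent's source and always defect.
    return "D"
'''


def load(src: str) -> Callable[[str, str], str]:
    """Turn an agent's source string into its act function."""
    namespace: Dict[str, object] = {}
    exec(src, namespace)  # Only safe for trusted toy sources like the ones above.
    return namespace["act"]  # type: ignore[return-value]


def play(src_a: str, src_b: str) -> Tuple[str, str]:
    """Each agent sees its own source and the opponent's, then picks a move."""
    return load(src_a)(src_a, src_b), load(src_b)(src_b, src_a)
```

Here `play(CLIQUE_BOT, CLIQUE_BOT)` yields mutual cooperation, while `play(CLIQUE_BOT, DEFECT_BOT)` yields mutual defection. Exact string equality is a brittle condition, which is one reason the tooling asked for in the table (audits, interpretations, assurances about a subroutine's behavior) would need something richer than this.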

17

Master plan spec: needs audit (logic and cooperative AI)

On whose shoulders are we standing?

Takeoff geometry

-polarity

-geneity

speed

Forecasting by giving a solid

Multi-multi delegation

TLDR: what I broadly think crunch time looks like

What products would we like to be on the market at crunch time?

Object-level technical content

On whose shoulders are we standing?

Program equilibrium and modal combat

Compositional game theory / open games

Domain theory

Plausibly helpful directions not explicitly a part of my master plan right now

Projects I’m in early stages of

Redteam