It seems that certain thermodynamic phenomena are highly relevant to alignment.  From Paul Christiano's rebuttal to a recent seminal paper:

In fact, it appears that thousands of words may have been written on this topic, and that many lesser researchers have bounced off it.

Seeing that Paul Christiano and John Swentworth have spent countless hours investigating this important topic, its importance is quite evident.  In order to free up their time for other strategies like ELK, I am launching a new EA-funded initiative:

Our initial staff consists of myself and 10 Mechanical Turk workers, who shall be tasked with data gathering.  For seed funding, we are requesting a modest sum of $50M.  Given that Paul's time is worth $20 trillion per hour, this seems like a reasonable tradeoff.

Our initial research agenda consists of investigating the space of designs with a complex number of hoses.  Further, we have reason to believe that regularizing by the complexity of the stupidest argument for a given design leads to good inductive bias properties.  Our work may even have direct implications for other alignment agendas, as it shares structure with many important problems.

Furthermore, building infrastructure for practical engineering projects is of great value, even if air conditioners aren't quite mechanistically identical to AGI.

You can support our work by loudly advertising your loyalty to the 1-hose or 2-hose camp.

11 comments

I have noticed that all parties in The Great Air Conditioner Debate Of 2022 agree that the two-hose design is at least somewhat more efficient than the one-hose design, and that the zero-hose design utterly fails to cool any room at all. (2-hose) > (1-hose) > (0-hose).

Personally, I put far more faith in the tendency of trends to continue than in all this abstract theorizing. I therefore propose that one of [AC]RC's first projects should be to build a gigahose air conditioner and test its performance.

Other than being a really cool project, this will provide us with important data on the reliability of scaling laws, which will hopefully generalize to other domains.

I urge against abandoning the 0-hose case too quickly! Consider direct radiative cooling of people, which appears to be correctly aligned to the human value of thermal comfort, and highly efficient, while still running afoul of the heuristic that it fails to cool any room at all.

This suggests the multi-hose paradigm is already Goodharted, and designs isomorphic to the 0-hose case are a more fruitful path of investigation.

It is shocking to me that such an esteemed institution as [AC]RC would get John Wentworth's surname wrong in an announcement post. Clearly this is proof that their staff has no time to spare for proofreading. This must mean they are currently starved for funding and they need significantly more than the modest sum of $50M they are asking for.

To the mean discussion, I'll contribute by proposing the harmonic mean as an alternative point of compromise. I don't know if the harmonic mean has any desirable properties, but the name sounds more prestigious than the arithmetic or geometric mean, and as such it is probably the right choice for fundraising advertisement purposes.

It is shocking to me that such an esteemed institution as [AC]RC would get John Wentworth's surname wrong in an announcement post.

Clearly John's preference is to have exactly two hoses attached to the "W"(indow) ... making both your spelling (with 0 hoses) and the [AC]RC spelling (with just one hose) weirdly confrontational.

As the difference between 0 hoses and 1 hose is big and the difference between one hose and two hoses is much smaller, clearly hose scaling will have to be exponential. So if we take a mean, we should do it in logspace.

I propose a compromise: 1.5 hoses. This will lead to greatness.

According to information theory, the geometric mean is more suitable -- 1.41 should do fine as an approximation.
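For anyone keeping score in the mean discussion, here is a minimal sketch that computes the competing compromise hose counts between the 1-hose and 2-hose camps. It assumes the only inputs are the two hose counts 1 and 2; the variable names are made up for illustration, and the "log-space mean" is taken to be the exponential of the average log, which is the same thing as the geometric mean.

```python
import math
import statistics

# The two camps being averaged (an assumption: only 1-hose and 2-hose count).
hose_counts = [1, 2]

# Arithmetic mean: the "1.5 hoses" compromise.
arithmetic = statistics.mean(hose_counts)

# Geometric mean: sqrt(1 * 2), the "1.41" proposal.
geometric = statistics.geometric_mean(hose_counts)

# Harmonic mean: 2 / (1/1 + 1/2), the prestigious-sounding option.
harmonic = statistics.harmonic_mean(hose_counts)

# Mean taken in log space, then mapped back; this matches the geometric mean.
log_space = math.exp(statistics.mean([math.log(h) for h in hose_counts]))

print(f"arithmetic: {arithmetic:.3f}")
print(f"geometric:  {geometric:.3f}")
print(f"harmonic:   {harmonic:.3f}")
print(f"log-space:  {log_space:.3f}")
```

For any set of positive numbers the harmonic mean is at most the geometric mean, which is at most the arithmetic mean, so the harmonic proposal is the stingiest with hoses.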

Please PM me an ETH address for the $50MM funds, which you can consider secure enough to get started with your organization's work.

Actually, I’d be happy to route it through my personal account! I may also know a few Nigerian princes who can help as well…

Entirely totally coincidentally, the acronym ARC was taken; let us know if you have better ideas.

John Swentworth

I'm pretty sure it's Johns Wentworth, he's plural.
