Critiques of the Agent Foundations agenda?

Stretching the definition of 'substantial' further:

Beth Zero was an ML researcher and Sneerclubber with some things to say. Her blog is down unfortunately but here's her collection of critical people. Here's a flavour of her thoughtful Bulverism. Her post on the uselessness of Solomonoff induction and the dishonesty of pushing it as an answer outside of philosophy was pretty good.

Sadly most of it is against foom, against short timelines, against longtermism, rather than anything specific about the Garrabrant or Demski or Kosoy programmes.

Critiques of the Agent Foundations agenda?

Nostalgebraist (2019) sees it as equivalent to solving large parts of philosophy: a noble but quixotic quest. (He also argues against short timelines but that's tangential here.)

Here is what this ends up looking like: a quest to solve, once and for all, some of the most basic problems of existing and acting among others who are doing the same. Problems like “can anyone ever fully trust anyone else, or their future self, for that matter?” In the case where the “agents” are humans or human groups, problems of this sort have been wrestled with for a long time using terms like “coordination problems” and “Goodhart’s Law”; they constitute much of the subject matter of political philosophy, economics, and game theory, among other fields.

The quest for “AI Alignment” covers all this material and much more. It cannot invoke specifics of human nature (or non-human nature, for that matter); it aims to solve not just the tragedies of human coexistence, but the universal tragedies of coexistence which, as a sad fact of pure reason, would befall anything that thinks or acts in anything that looks like a world.

It sounds misleadingly provincial to call such a quest “AI Alignment.” The quest exists because (roughly) a superhuman being is the hardest thing we can imagine “aligning,” and thus we can only imagine doing so by solving “Alignment” as a whole, once and forever, for all creatures in all logically possible worlds. (I am exaggerating a little in places here, but there is something true in this picture that I have not seen adequately talked about, and I want to paint a clear picture of it.)

There is no doubt something beautiful – and much raw intellectual appeal – in the quest for Alignment. It includes, of necessity, some of the most mind-bending facets of both mathematics and philosophy, and what is more, it has an emotional poignancy and human resonance rarely so close to the surface in those rarefied subjects. I certainly have no quarrel with the choice to devote some resources, the life’s work of some people, to this grand Problem of Problems. One imagines an Alignment monastery, carrying on the work for centuries. I am not sure I would expect them to ever succeed, much less to succeed in some specified timeframe, but in some way it would make me glad, even proud, to know they were there.

I do not feel any pressure to solve Alignment, the great Problem of Problems – that highest peak whose very lowest reaches Hobbes and Nash and Kolmogorov and Gödel and all the rest barely began to climb in all their labors...

#scott wants an aligned AI to save us from moloch; i think i'm saying that alignment would already be a solution to moloch

Rationalists from the UK -- what are your thoughts on Dominic Cummings?

Huh, works for me. Anyway I'd rather not repeat his nasty slander but "They're [just] a sex cult" is the gist.

Rationalists from the UK -- what are your thoughts on Dominic Cummings?

The received view of him is as just another heartless Conservative with an extra helping of tech fetishism and deceit. In reality he is an odd accelerationist just using the Tories (Ctrl+F "metastasising"). Despite him quoting Yudkowsky in that blog post, and it getting coverage in all the big papers, people don't really link him to LW or rationality, because those aren't legible, even in the country's chattering classes. We are fortunate that he is such a bad writer, so that no one reads his blog.

Here's a speculative rundown of things he probably got implemented (but we won't really know until 2050 declassification):

  • Doubling of the already large state R&D budget (by 2025). This would make the government responsible for half of all UK R&D spending. Includes an £800m ARPA-like agency; £300m of STEM funding is already out.

  • Pushed the COVID science committee into an earlier lockdown. Lockdown sceptics / herd immunity types likely to gain influence now.

  • An uncapped immigration path for scientists

  • Tutoring in state schools

  • Data-driven reform of the civil service is incomplete and probably abortive. His remaining crew are "misfits" with little influence. He got data science, superforecasting and evidence-based policy associated with racists and edgelords. (One of those is on record as having a ridiculously negative view of LW.) The weirdo hiring scheme may make Whitehall hiring even more staid in the short run.

  • Something something bullying, norms, deception, centralisation of power. Whipping the Treasury probably not a good precedent.

  • His hypocrisy probably weakened lockdown norms. This also wasted a huge amount of Boris Johnson's political capital during a public health crisis; I don't know how to evaluate that.

Model Depth as Panacea and Obfuscator

Great post. Do you have a sense of

  1. how much of tree success can be explained / replicated by interpretable models;
  2. whether a similar analysis would work for neural nets?

You suggest that trees work so well because they let you charge ahead when you've misspecified your model. But in the biomedical/social domains where ML is most often deployed, we are always misspecifying the model. Do you think your new GLM would offer similar idiotproofing?

Yeah, the definition of evidence you use (that results must single out only one hypothesis) is quite strong, what people call "crucial" evidence.

Are there good ways to find expert reviews of popular science books?

I suspect there is no general way. ): Even the academic reviews tend to cherry-pick one or two flaws and gesture at the rest.

Partial solutions:

  1. Invest the time to follow the minority of Goodreads users who know their stuff. (Link is people I follow.)
  2. See if Stuart Ritchie has reviewed it for money.

Most reliable news sources?

The Economist ($) for non-Western events and live macroeconomics. They generally foreground the most important thing that happens every week, wherever it happens to occur. They pack the gist into a two-page summary, "The World this Week". Their slant is pro-market, pro-democracy, pro-welfare, pro-rights, but it rarely gets in the way. The obituaries are often extremely moving.

Conceptual engineering: the revolution in philosophy you've never heard of

Raised in the old guard, Chalmers doesn't understand...

This amused me, given that in the 90s he was considered an outsider and an upstart, coming round here with his cognitive science, shaking things up. ("'The Conscious Mind' is a stimulating, provocative and agenda-setting demolition-job on the ideology of scientific materialism. It is also an erudite, urbane and surprisingly readable plea for a non-reductive functionalist account of mind. It poses some formidable challenges to the tenets of mainstream materialism and its cognitivist offshoots.")

Not saying you're wrong about him in that lecture. Maybe he has socialised and hardened as he gained standing. A funny cycle, in that case.
