Garrett Baker

Independent alignment researcher



Do you think the final big advance happens within or without labs?

I don't know the exact article that convinced me, but I'd bet this summary of the history of economic thought on the subject, which I have skimmed, is a good place to start; it seems to cover the main points with citations.

Interesting lens! Though I'm not sure if this is fair -- the largest things that are done tend to get done through governments, whether those things are good or bad. If you blame catastrophes like Mao's famine or Hitler's genocide on governments, you should also credit things like slavery abolition and vaccination and general decline of violence in civilized society to governments too.

I do mostly[1] credit such things to governments, but the argument is about whether companies or governments are more liable to take on very large tail risks. Not about whether governments are generally good or bad. It may be that governments just like starting larger projects than corporations. But in that case, I think the claim that a greater percentage of those end in catastrophe than similarly large projects started by corporations still looks good.

  1. I definitely don't credit slavery abolition to governments, at least in America, since that industry was largely made possible in the first place by governments subsidizing the cost of chasing down runaway slaves. I'd guess general decline of violence is more attributable to generally increasing affluence, which has a range of factors associated with it, than government intervention so directly. But I'm largely ignorant on that particular subject. The "mostly" here means "I acknowledge governments do some good things". ↩︎

I will push back on democratic, in the sense I think Linch is using the term, being actually all that good a property for cosmically important orgs. See Bryan Caplan's The Myth of the Rational Voter, and the literature around social-desirability bias, for reasons why, which I'm sure Linch is familiar with, but which I notice are not mentioned.

I also claim that most catastrophes throughout both recent and long-ago history have been caused by governments, not just in the trivial sense, but also if we normalize by the amount of stuff done. A good example everyone right now should be familiar with is qualified immunity, and the effects it has on irresponsible policing. The fact is we usually hold our companies to much higher standards than our governments (or perhaps we just have more control over the incentives of our companies than of our governments). It is also strange that the example Linch gives for a bad company is Blackwater, which, while bad, is... about par for the course when it comes to CIA projects.

I note too the America-centric bias with all of these examples & comparisons. Maybe the American government is just too incompetent compared to others, and we should instead embed the project within France or Norway.

There's a general narrative that basic research is best done in government/academia, but is this true? The academia end seems possibly true: in the 20th century, most discoveries were made by academics. But there was also a significant contribution by folks at research labs started by monopolies of the period (most notably Bell Laboratories). This seems like the kind of thing which could turn out to be false going forward, though, as our universities become more bloated and we kill off our monopolies. In either case, I don't know why Linch thinks quality basic research will be done by the government. People like bringing up the Apollo program & Manhattan Project, but both of those were quality projects due to their applied research, not their basic research, which was all laid down ahead of time. I'm not saying it doesn't happen, but does anyone have good case studies? CERN comes to mind, but of course for projects that just require governments to throw massive amounts of money at a problem, government does well. AGI is plausibly like this, but alignment is not (though more money would be nice).

Government also tends to go slow, which I think is the strongest argument in favor of doing AGI inside a government. But also, man, I don't trust government to implement an alignment solution if such a solution is invented during the intervening time. I'm imagining trying to convince a stick-in-the-ass bureaucrat fancying himself a scientist-philosopher, whose only contribution to the project was politicking at a few important senators to get himself enough authority to stand in the way of anyone changing anything about the project, and who thinks he knows the solution to alignment when he is in fact wrong, insisting we should use so-and-so proven strategy or such-and-such ensemble approach instead. Maybe a cynical picture, but one I'd bet resonates with those working to improve government processes.

I'd be interested to hear how Austin has updated regarding Sam's trustworthiness over the past few days.

The second half deals with more timeless considerations, like whether OpenAI should be embedded in a larger organization which doesn't have its main reason for existence being creating AGI, like a large company or a government.

I don't know of any clear progress on your interests yet. My argument was about the trajectory MI is on, which I think is largely pointed in the right direction. We can argue about the speed at which it gets to the hard problems, whether it's fast enough, and how to make it faster, though. So you seem to have understood me well.

A core motivating intuition behind the MI program is (I think) "the stuff is all there, perfectly accessible programmatically, we just have to learn to read it". This intuition is deeply flawed: Koan: divining alien datastructures from RAM activations

I think I'm more agnostic than you are about this, and also about how "deeply" flawed MI's intuitions are. If you're right, once the field progresses to nontrivial dynamics, we should expect those operating at a higher level of analysis--conceptual MI--to discover more than those operating at a lower level, right?

I have, and I also remember seeing Adam’s original retrospective, but I always found it unsatisfying. Thanks anyway!

My recommendation would be to get an LTFF, Manifund, or Survival and Flourishing Fund grant to work on the research. Then, if it seems to be going well, try getting into MATS, or move to Berkeley & work in an office with other independent researchers (like FAR) for a while, and use either of those situations to find co-founders for an org that you can scale to a greater number of people.

Alternatively, you can call up your smart & trustworthy college friends to help start your org.

I do think there's just not that much experience or skill around these parts with setting up highly effective & scalable organizations, so what help can be provided won't be that helpful. In terms of resources for how to do that, I'd recommend Y Combinator's How to Start a Startup lecture recordings, and I've been recommended the book Traction: Get a Grip on Your Business.

It should also be noted that if you do want to build a large org in this space, once you get to the large org phase, OpenPhil has historically been less happy to fund you (unless you're also making AGI[1]).

  1. This is not me being salty; the obvious response to "OpenPhil has historically not been happy to fund orgs trying to grow to larger numbers of employees" is "but what about OpenAI or Anthropic?", which I think are qualitatively different than, say, Apollo. ↩︎

I'd say mechanistic interpretability is trending toward a field which cares about & researches the problems you mention. For example, the doppelganger problem is a fairly standard criticism of the sparse autoencoder work, and diasystemic novelty seems the kind of thing you'd encounter when doing developmental interpretability, interp-through-time, or inductive biases research, especially with a focus on phase changes (a growing focus area). And though I'm having a hard time parsing your creativity post (an indictment of me, not of you, as I didn't spend too long with it), it seems the kind of thing which would come from the study of in-context learning, a goal I believe mainstream MI has, even if it doesn't focus on it now (likely because it believes it's unable to at this moment), and which I think it will care more about as the power of such in-context learning becomes more and more apparent.

ETA: An argument could be that though these problems will come up, the field will ultimately prioritize hacky fixes which only sweep the problems under the rug. I think many in MI will prioritize such limited fixes, but also that some won't, and because these problems will have become empirical, those people will be able to prove the value of their theoretical work & methodology by convincing MI people with practical applications, and money will get diverted to such theoretical work & methodology by DL-theory-traumatized grantmakers.

Advertisements are often very overt so that users don't get suspicious of your product, so I imagine you get GPT-Cola, which believes it's a nice, refreshing, cold, bubbling bottle of Coca-Cola. And it loves, between & within paragraphs of actually answering your question, to talk about how tasty & sweet Coca-Cola is, and how, for a limited time only, you can buy specialty GPT-4 Coke bottles with GPT-Cola Q&As written on the front.
