I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.
I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:
Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:
People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.
I'm interested to know how (if at all) you'd say the perspective you've just given deviates from something like this:
My current guess is you agree with some reasonable interpretation of all these points. And maybe also have some more nuance you think is important?
Given the picture I've suggested, the relevant questions are:
A complementary angle: we shouldn't be arguing over whether or not we're in for a rough ride, we should be figuring out how to not have that.
I suspect more people would be willing to get behind (both empirically and theoretically) 'ruthless consequentialist maximisers are one extreme of a spectrum which gets increasingly scary and dangerous; it would be bad if those got unleashed'.
Sure, skeptics can still argue that this just won't happen even if we sit back and relax. But I think then it's clearer that they're probably making a mistake (since origin stories for ruthless consequentialist maximisers are many and disjunctive). So the debate becomes 'which sources of supercompetent ruthless consequentialist maximisers are most likely and what options exist to curtail that?'.
"This short story perfectly depicts the motivations and psychological makeup of my milieu," I think wryly as I strong upvote. I'm going to need to discuss this at length with my therapist. Probably the author is one of those salty mid-performing engineers who didn't get the offer they wanted from Anthropic or whatever. That thought cheers me up a little.
Esther catches sight of the content on my screen over my shoulder. "I saw that too," she remarks, looking faintly worried in a way which reminds me of why I am hopelessly in love with what she represents. "Are we, like, the bad guys, or maybe deluding ourselves that we're the good guys in a bad situation? It seems like that author thinks so. It does seem like biding my time hasn't really got me any real influence yet."
I rack my brain for something virtuous to say. "Yeah, um, safety-washing is a real drag, right?" Her worry intensifies, so I know I'm pronouncing the right shibboleths. God, I am really spiritually emaciated right now. I need to cheer her up. "But think about it, we really are in the room, right? Who else in the world can say that? It's not like Vox or Krishna are going to wake up any time soon. That's a lot of counterfactual expected impact."
She relaxes. "You're right. Just need to keep vigilant for important opportunities to speak up. Thanks." We both get back to tuning RL environments and meta-ML pipelines.
Well, why do we envy? The evopsych just-so story says that, of course, what others have impinges on me whether I want it to or not (my security/precarity, my liberty, my relative success in interactive situations, ...).
I think you can express that in a 'goods' framing by glossing it as the consumption of 'status and interactive goods' or something like that (while noting that these are positional and contingent on the wider playing field, which the bog-standard welfare-theorem utility functions don't capture).
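For concreteness, one toy way to write down that positional term (purely my own illustrative gloss, not anything the welfare theorems themselves license):

$$u_i \;=\; v(c_i) \;+\; w\!\Big(c_i - \tfrac{1}{n-1}\textstyle\sum_{j \neq i} c_j\Big)$$

where, across $n$ people, $v$ is ordinary own-consumption utility and $w$ is the status/positional component that depends on everyone else's consumption. The welfare theorems' standard setup assumes preferences over one's own bundle only, i.e. the $w$ term is absent - which is exactly the bit that envy violates.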
This might be some sort of crux between you and him, where he might point to growth in researcher focus/headcount as well as perhaps some uplift from AI by now. (I might endorse the first of those a bit, and probably not the second, though it's conceptually reasonable and not out of the question.)
Thanks for this really helpful nuance; distinguishing types of algorithmic improvement seems really important for forecasting when and how the trends will deflect, how R&D inputs and Wright's-Law-ish effects apply to each factor, and how recursive 'self'-improvement might play out.
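(For reference, since I'm leaning on it above: by 'Wright's-Law-ish' I mean the usual experience-curve form, where unit cost falls as a power law in cumulative output,

$$C(x) \;=\; C_1 x^{-b}, \qquad \text{learning rate} \;=\; 1 - 2^{-b},$$

and the open question for each type of algorithmic improvement is what the right 'cumulative output' variable $x$ even is - cumulative experiments, cumulative compute, or cumulative R&D effort. That last mapping is my speculation, not something from the post.)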
Dario’s been working primarily on Transformer-based LLMs for at least 7 years. I flat out do not believe that those kinds of “optimizations” can make the same calculations run
It's a nit, but he was saying that he estimates the rate now to be 4x, not that it's been 4x the whole time. So he's not claiming it should be 16k times more economical now.
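(Spelling out the arithmetic behind my reading, on the assumption that the 4x is a per-year rate and the window in question is the ~7 years mentioned above:

$$4^{7} = 16384 \approx 16\,000,$$

so reading the 4x as holding over the whole period is what would produce a ~16k-fold figure - and that's precisely the reading he's disclaiming.)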
The solution may be analogous: some form of paternalism, where human minds are massively protected by law from some types of interference. This may or may not work, but once it is the case, you basically cannot start from classical liberal and libertarian assumptions.
I half believe this. I notice, though, that many modern societies have some paternalistic bits and pieces, often around addiction and other preference-hijacking activities... but are also often on the whole liberal+libertarian. It may be that more enclaves and 'protection' are needed at various layers, and then liberalism can be maintained within those boundaries.
Right, I think positional goods and the like are among several distortions of the basic premises of the welfare theorems (and indeed, empirically, many people are sad, lonely, etc. in our modern world of abundance). I sometimes think those theorems imply a sort of normative 'well, just don't worry about other people's stuff!' (i.e. non-envy, which is, after all, a deadly sin). Cf. Paretotopia, which makes exactly this normative case in the AI futurism frame.
(I forgot that more conversation might happen on a LW crosspost, and I again lament that the internet has yet to develop a unified routing system for same-content-different-edition discourse. Copied comment from a few days ago on Substack:)
I really appreciate this (and other recent) transparency. This is much improved since AI 2027.
One area I get confused by (same with Davidson, with whom I've discussed this a bit) is 'research taste'. When you say things like 'better at research taste', and when I look at your model diagram, it seems you're thinking of taste as a generic competence. But what is taste? It's nothing but a partially-generalising learned heuristic model of experiment value-of-information. (Said another way, it's a heuristic value function for the 'achieve insight' objective of research).
How do you get such learned models? No other way than by experimental throughput and observation thereof (direct or indirect: it can include textbooks, or notes and discussions with existing experts)!
See my discussion of research and taste.
As such, taste accumulates like a stock, on the basis of experimental throughput and sample efficiency (of the individual or the team) at extracting the relevant updates to the VOI model. It 'depreciates' as you go, because the frontier of the known moves, drifting gradually outside the generalising region of the taste heuristic (eventually falling back to naive trial and error) - most saliently here with data and model scale, but also in other ways.
This makes sample efficiency (of taste accumulation) and experimental throughput extremely important - central, in my view. You might think that expert interviews, reading all the textbooks ever, etc. provide a meaningful jumpstart to the taste stock. But they certainly don't help with the flow. So then you need to know how fast the stock depreciates over the relevant regime.
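To make the stock/flow picture concrete, here's a deliberately crude toy model (all parameter names, values, and functional forms are my own made-up illustration, not anything from your model):

```python
# Toy stock/flow model of 'research taste' as a depreciating stock.
# Illustrative only: the linear inflow and exponential depreciation are
# assumptions of this sketch, not a claim about the real dynamics.

def simulate_taste(
    n_steps: int = 100,
    throughput: float = 10.0,         # experiments observed per step
    sample_efficiency: float = 0.02,  # taste gained per observed experiment
    depreciation: float = 0.05,       # fraction of the taste stock invalidated per
                                      # step as the frontier (scale etc.) moves
    jumpstart: float = 1.0,           # one-off stock from textbooks / expert interviews
) -> list[float]:
    """Return the trajectory of the taste 'stock' over time."""
    taste = jumpstart
    history = []
    for _ in range(n_steps):
        inflow = throughput * sample_efficiency  # flow from running/observing experiments
        outflow = depreciation * taste           # frontier moving past old heuristics
        taste += inflow - outflow
        history.append(taste)
    return history


if __name__ == "__main__":
    trajectory = simulate_taste()
    steady_state = (10.0 * 0.02) / 0.05  # inflow / depreciation
    print(f"final taste stock ~ {trajectory[-1]:.2f}")
    print(f"implied steady state ~ {steady_state:.2f}")
```

The qualitative point the sketch is meant to show: the one-off jumpstart washes out (the stock converges towards inflow/depreciation regardless of where it starts), so throughput × sample efficiency versus the depreciation rate is what sets the long-run taste level.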
(Besides pure heuristic improvements, if you think faster you can also reason your way to somewhat better experiment designs, either by naively pumping your taste heuristics for best-of-k, or by combining and iterating on designs. I think this reasoning boost falls off quite sharply, but I'm unsure. See my question on this.)