Pablo Villalobos — LessWrong

Previously Staff Researcher at Epoch, these days I'm mostly thinking about better frameworks for understanding superintelligence.

I think this is partly cope. Middle management at large companies might have a significant political component but large companies still have much higher labor productivity than small ones, they still represent like a quarter of OECD economic output, and the average middle manager is probably still creating very significant amounts of marginal value despite the political infighting.

Yes, there might be very few middle managers at the top of Forbes' list. But look at millionaires, the people in the top 10% of the wealth distribution in the US. Most middle managers will probably be there, along with other highly paid professionals. And if you found a startup and it fails, which is what happens in the overwhelming majority of cases, you won't make the top 10%.

So purely in terms of wealth and creating value for society, marginal improvements in middle management seem quite valuable. Sure, being a founder might have much higher EV, but also vastly higher variance. And risk aversion is behaviorally indistinguishable from just having a different utility function.

Speaking of which, you might want to be immortal and go to the moon, but most people don't. You could argue that if they'd read the right books or had the right parent/teacher/friend or had more vision, they would also want that, but at that point you're just saying that everyone should be playing the game you like, instead of the one they like.

And I dispute the idea that there's less politics, signaling or strategic/conflictive behavior in the "winner's bracket". Look at Sam Altman, look at the status competitions and fighting for credit in the highest halls of science. Look at states for God's sake. Do you really think Vladimir Putin or Donald Trump or Xi Jinping are not in the winner's bracket, that they will have less real power than the first person to figure out how to solve aging?

Physicists might have found the secret knowledge of how to create nuclear weapons that nobody else had, but after they had the bright idea the bottlenecks were capital, labor, natural resources, and the ability to manage their combination efficiently, and the physicists were not the ones who got to control the weapons in the end.

I think something like your thesis might be true in terms of actually having good counterfactual impact on reality vs merely capturing the resulting wealth, power and prestige, what you call leading the parade. But that doesn't mean that you get to both have the impact and lead the parade by pursuing just the impact part!

Personal view as an employee: Epoch has always been a mix of EAs/safety-focused people and people with other views. I don't think our core mission was ever explicitly about safety, for a bunch of reasons including that some of us were personally uncertain about AI risk, and that an explicit commitment to safety might have undermined the perceived neutrality/objectiveness of our work. The mission was raising the standard of evidence for thinking about AI and informing people to hopefully make better decisions.

My impression is that Matthew, Tamay and Ege were among the most skeptical about AI risk and had relatively long timelines more or less from the beginning. They have contributed enormously to Epoch and I think we'd have done much less valuable work without them. I'm quite happy that they have been working with us until now. They could have moved to do direct capabilities work or anything else at any point if they wanted, and I don't think they lacked opportunities to do so.

Finally, Jaime is definitely not the only one who still takes risks seriously (at the very least I also do), even if there have been shifts in relative concern about different types of risks (eg: ASI takeover vs gradual disempowerment).

We're in the nearby bar, Casa Remigio, since the theater is occupied

I suspect the analogy does not really work that well. Much of human genetic variation is just bad mutations that take a while to be selected out. For example, maybe a gene variant slightly decreases the efficiency of your neurons and makes everything in your brain slightly slower

I stand corrected, although the broader point still holds: share prices noisily approximate a discounted expected cash flow, which can be added or multiplied.

There is a sense in which the price approximates an intrinsic property of the shares that you can add up or multiply by the number of shares. Each share gives you a vote in the shareholder assembly and an equal portion of the dividends. If you had all the shares, you would own the company and in principle could pay yourself as much as the company can afford in dividends.

How much the company can afford to pay in dividends in the future is basically how much net operating profit after taxes (NOPAT) the company will have.

If you have a prediction of the future NOPAT of the company, it implies a present value for the whole company and its shares assuming all of it is cashed out as dividends. It is commonly assumed that in most cases the market price of shares oscillates around a rational expectation of future NOPAT, in which case it would be a reasonable approximation to something that you can semantically multiply by the number of shares to get the overall value of the company.
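The step from a NOPAT forecast to a per-share value can be sketched in a few lines. This is only an illustration under simplifying assumptions: all NOPAT is paid out as dividends, the discount rate is constant, and the forecast covers a finite number of years. The function names and all numbers are hypothetical.

```python
def present_value(cash_flows, discount_rate):
    """Discount yearly cash flows (year 1, year 2, ...) back to today."""
    return sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )


def value_per_share(cash_flows, discount_rate, shares_outstanding):
    """Each share is an equal claim on the discounted total."""
    return present_value(cash_flows, discount_rate) / shares_outstanding


# Hypothetical forecast: NOPAT for the next three years, fully paid as dividends.
nopat_forecast = [100.0, 110.0, 121.0]
rate = 0.10   # assumed constant discount rate
shares = 50   # assumed number of shares outstanding

company_value = present_value(nopat_forecast, rate)
share_value = value_per_share(nopat_forecast, rate, shares)

print(f"company value: {company_value:.2f}")
print(f"value per share: {share_value:.2f}")
```

In this toy setting, multiplying the per-share value by the number of shares recovers the company value by construction. The claim in the text is that market prices oscillate around such a figure, not that they equal it.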

The arguments you make seem backwards to me.

All this to say, land prices represent aggregation effects / density / access / proximity of buildings. They are the cumulative result of being surrounded by positive externalities which necessarily result from other buildings not land. It is the case that as more and more buildings are built, the impact of a single building to its land value diminishes although the value of its land is still due to the aggregation of and proximity to the buildings that surround it.

Yes, this is the standard Georgist position, and it's the reason why land owners mainly capture (positive and negative) externalities from land use around them, not in their own land.

Consider an empty lot on which you can build either a garbage dump or a theme park, each of equivalent economic value. Under SQ, the theme park is built as the excess land value is captured by the land owner. Under LVT, the garbage dump is built as the reduced land value reduces their tax burden. The SQ encourages positive externalities, LVT encourages negative externalities.

This seems wrong. The construction of a building mainly affects the value of the land around it, not the land on which it sits. Consider the following example in which instead of buildings, we have an RV and a truck, so there is no cost of building or demolishing stuff:

There's a pristine neighborhood with two empty lots next to each other in the middle of it. Both sell for the same price. The owner of empty lot 1 rents it to a drug dealer, who places a rusty RV on the lot and sells drugs in it. The owner of empty lot 2 rents it to a well-known chef who places a stylish food truck on the lot and serves overpriced food to socialites in it.

Under SQ, who do you think would profit from selling the land now? The owner of lot 2 has to sell land next to a drug dealer that a prospective buyer can do nothing about. The owner of lot 1 has to sell land next to delicious high-status food, and if a buyer minds the drug dealer he can kick him out. Who is going to have an easier time selling? Who is going to get a higher price?

Now, suppose there is an LVT. If the tax is proportional to the selling price of the land under SQ (as it ideally should be), which owner is going to pay more tax?

The case of the theme park and garbage dump is exactly the same, with the added complication of construction / demolition costs. An LVT should be proportional to the price of the land if there were no buildings on top of it (and without taking into account the tax itself), so building a garbage dump is not going to significantly reduce your tax payments.
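The two-lot example can be written as a toy model. It assumes, as a simplification of the argument above, that the unimproved value of a lot depends only on what its neighbors do and not on its own use. The base value, externality sizes and tax rate are made-up illustrative numbers, not estimates.

```python
# Hypothetical effect that each land use has on the value of NEIGHBORING lots.
EXTERNALITY = {"empty": 0.0, "drug_rv": -30.0, "food_truck": 20.0}
BASE_VALUE = 100.0  # assumed value of a lot in the pristine neighborhood
TAX_RATE = 0.05     # assumed yearly LVT rate


def unimproved_land_value(neighbor_uses):
    """Value of the bare lot: base value plus spillovers from the neighbors."""
    return BASE_VALUE + sum(EXTERNALITY[use] for use in neighbor_uses)


def land_value_tax(neighbor_uses):
    """Tax proportional to the unimproved value; the lot's own use is ignored."""
    return TAX_RATE * unimproved_land_value(neighbor_uses)


# Lot 1 hosts the RV, so its neighbor is the food truck, and vice versa.
lot1_value = unimproved_land_value(["food_truck"])
lot2_value = unimproved_land_value(["drug_rv"])

print("lot 1 (hosts the RV):", lot1_value, land_value_tax(["food_truck"]))
print("lot 2 (hosts the food truck):", lot2_value, land_value_tax(["drug_rv"]))
```

Under these assumptions the owner who rents to the drug dealer ends up with the more valuable lot and the higher tax bill, because the tax base never looks at what sits on the owner's own land.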

In such a way, a land value tax has a regularisation effect on building density, necessitating a spread of concentration.

There are several separate effects here, if you are a landowner. Under LVT:

You are incentivized to reduce the density in surrounding land
You are incentivized to build as densely as possible within your own land to compensate for the tax

Under SQ:

You are incentivized to increase the density in surrounding land
You are not incentivized to increase density in your own land

The question is, which of these effects is bigger? I would say that landowners have more influence over their own land than over surrounding land, so a priori I would expect more density to result from an LVT

We'll be at the ground floor!

Not quite. What you said is a reasonable argument, but the graph is noisy enough, and the theoretical arguments convincing enough, that I still assign >50% credence that data (number of feedback loops) should be proportional to parameters (exponent=1).

My argument is that even if the exponent is 1, the coefficient corresponding to horizon length ('1e5 from multiple-subjective-seconds-per-feedback-loop', as you said) is hard to estimate.
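To make the role of the coefficient concrete, here is a rough sketch of how the horizon length enters the estimate. It assumes the number of feedback loops scales as a power of the parameter count and that total training experience is that number times the subjective seconds per feedback loop. The function names, the constant `k` and the parameter count are hypothetical.

```python
def feedback_loops_needed(params, exponent=1.0, k=1.0):
    """Data requirement as a power law in parameter count (k is an assumed constant)."""
    return k * params ** exponent


def subjective_seconds_needed(params, horizon_seconds, exponent=1.0, k=1.0):
    """Total training experience: feedback loops times seconds per loop."""
    return horizon_seconds * feedback_loops_needed(params, exponent, k)


params = 1e12  # hypothetical model size

short_horizon = subjective_seconds_needed(params, horizon_seconds=1.0)
long_horizon = subjective_seconds_needed(params, horizon_seconds=1e5)

# With the exponent fixed, the horizon factor passes straight through to the total.
ratio = long_horizon / short_horizon
print(f"long / short horizon requirement: {ratio:.0e}")
```

Even with the exponent pinned at 1, the estimate moves by the full ratio of the two candidate horizon lengths, which is why an uncertain coefficient matters so much.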

There are two ways of estimating this factor:

1. Empirically fitting scaling laws for whatever task we care about
2. Reasoning about the nature of the task and how long the feedback loops are

Number 1 requires a lot of experimentation, choosing the right training method, hyperparameter tuning, etc. Even OpenAI made some mistakes on those experiments. So probably only a handful of entities can accurately measure this coefficient today, and only for known training methods!

Number 2, if done naively, probably overestimates training requirements. When someone learns to run a company, a lot of the relevant feedback loops probably happen on timescales much shorter than months or years. But we don't know how to perform this decomposition of long-horizon tasks into sets of shorter-horizon tasks, how important each of the subtasks is, etc.

We can still use the bioanchors approach: pick a broad distribution over horizon lengths (short, medium, long). My argument is that outperforming bioanchors by making more refined estimates of horizon length seems too hard in practice to be worth the effort, and maybe we should lean towards shorter horizons being more relevant (because so far we have seen a lot of reduction from longer-horizon tasks to shorter-horizon learning problems, eg expert iteration or LLM pretraining).

Note that you can still get EUM-like properties without completeness: you just can't use a single fully-fleshed-out utility function. You need either several utility functions (that is, your system is made of subagents) or, equivalently, a utility function that is not completely defined (that is, your system has Knightian uncertainty over its utility function).
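A minimal sketch of the "several utility functions" reading: the system strictly prefers one outcome to another only when every subagent weakly agrees and at least one strictly agrees; otherwise the two outcomes are left unranked. The two utility functions and the outcomes below are made up for illustration, and this unanimity rule is one simple way to model incomplete preferences, not necessarily the construction used in the papers cited below.

```python
def strictly_prefers(a, b, utilities):
    """True if no subagent is worse off with a than with b, and one is better off."""
    diffs = [u(a) - u(b) for u in utilities]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def comparable(a, b, utilities):
    """Two outcomes are ranked if one is preferred or all subagents are indifferent."""
    return (
        strictly_prefers(a, b, utilities)
        or strictly_prefers(b, a, utilities)
        or all(u(a) == u(b) for u in utilities)
    )


# Hypothetical outcomes are (money, leisure) pairs; each subagent cares about one.
utilities = [lambda outcome: outcome[0], lambda outcome: outcome[1]]

print(strictly_prefers((2, 2), (1, 1), utilities))  # both subagents agree
print(comparable((2, 1), (1, 2), utilities))        # subagents disagree
```

When the subagents disagree, neither outcome is preferred, which is the failure of completeness. The resulting strict preference is still transitive, so the system avoids being money-pumped through cycles even though it has no single utility function.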

See Knightian Decision Theory. Part I

Arguably humans ourselves are better modeled as agents with incomplete preferences. See also Why Subagents?
