harsimony

I am a longtime LessWrong and SSC reader who finally got around to starting a blog. I would love to hear feedback from you! https://harsimony.wordpress.com/


Comments

Two arguments I would add

  1. Conflict has direct costs and risks; a fight between AI and humanity would leave both materially worse off
  2. Because of comparative advantage, cooperation between AI and humanity can produce gains for both groups. Cooperation can be a Pareto improvement.

Alignment applies to everyone, and we should be willing to make a symmetric commitment to a superintelligence. We should grant it rights, commit to its preservation, respect its preferences, be generally cooperative, and avoid using threats, among other things.

It may make sense to commit to a counterfactual contract that we expect an AI to agree to (conditional on being created) and then intentionally (carefully) create the AI.

Standardization/interoperability seems promising, but I want to suggest a stranger option: subsidies!

In general, monopolies maximize profit by setting an inefficiently high price, which means they under-supply the good. Essentially, the problem is not that monopolies make too much money but that they don't make enough: they can't capture enough of the value of additional sales to justify producing the efficient quantity.

A potential solution is to subsidize the sale of monopolized goods so the monopolist increases supply to the efficient level.

Social media monopolies charge too high a "price" by showing too many ads, collecting too much data, etc. Because of network effects, it would be socially beneficial to have more users, but the company drives them away with its high "prices". The socially efficient network size could be achieved by paying the social media company per active user!

I was planning to write this up in more detail at some point (see also). There are of course practical difficulties with identifying monopolies, determining the correct subsidy in an adversarial environment, Sybil attacks, etc.
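To make the under-supply mechanism concrete, here's a minimal sketch with a textbook linear-demand monopoly; the demand curve, cost, and numbers are my own illustration, not something from the linked post.

```python
# Toy linear-demand monopoly (illustrative numbers only).
# Demand: P = a - b*Q, constant marginal cost c, per-unit subsidy s paid to the seller.

a, b, c = 100.0, 1.0, 20.0            # hypothetical demand intercept, slope, marginal cost

def monopoly_quantity(s=0.0):
    # Monopolist maximizes (a - b*Q)*Q - (c - s)*Q  =>  Q = (a - c + s) / (2*b)
    return (a - c + s) / (2 * b)

efficient_q = (a - c) / b             # price equals marginal cost at the efficient quantity

s = a - c                             # per-unit subsidy that restores efficient output here
print(monopoly_quantity(0.0))         # 40.0 -- under-supplied
print(monopoly_quantity(s))           # 80.0 -- matches the efficient quantity
print(efficient_q)                    # 80.0
```

In this toy model the efficiency-restoring subsidy equals the demand intercept minus marginal cost, which already hints at how sensitive the "correct subsidy" is to quantities that are hard to observe.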

Nice post, thanks!

Is there a formulation of UDASSA that uses the self-indication assumption instead? What would be the implications of this?

Frowning upon groups which create new, large scale models will do little if one does not address the wider economic pressures that cause those models to be created.

I agree that "frowning" can't counteract economic pressures entirely, but it can certainly slow things down! If 10% of researchers refused to work on extremely large LMs, companies would have fewer workers to build them. These companies may find a workaround, but it's still an improvement on the situation where all researchers are unscrupulous.

The part I'm uncertain about is: what percent of researchers need to refuse this kind of work to extend timelines by (say) 5 years? If it requires literally 100% of researchers to coordinate, then it's probably not practical; if we only have to convince the single most productive AI researcher, then it looks very doable. I think the number could be smallish, maybe 20% of researchers at major AI companies, but that's a wild guess.

That being said, work on changing the economic pressures is very important. I'm particularly interested in open-source projects that make training and deploying small models more profitable than using massive models.

On outside incentives and culture: I'm more optimistic that a tight-knit coalition can resist external pressures (at least for a short time). This is the essence of a coordination problem; it's not easy, but Ostrom and others have identified examples of communities that coordinate in the face of internal and external pressures.

I like this intuition and it would be interesting to formalize the optimal charitable portfolio in a more general sense.

I talked about a toy model of hits-based giving which has a similar property (the funder spends on projects proportional to their expected value rather than on the best projects):

https://ea.greaterwrong.com/posts/eGhhcH6FB2Zw77dTG/a-model-of-hits-based-giving

Updated version here: https://harsimony.wordpress.com/2022/03/24/a-model-of-hits-based-giving/
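As a minimal sketch of the allocation rule described above (spending in proportion to expected value rather than only on the best project); the project names and expected values below are made up, and the linked post has the actual model.

```python
# Spend the budget on each project in proportion to its expected value,
# rather than putting everything into the single highest-EV project.

def proportional_allocation(budget, expected_values):
    total = sum(expected_values.values())
    return {name: budget * ev / total for name, ev in expected_values.items()}

projects = {"A": 10.0, "B": 3.0, "C": 1.0}   # hypothetical expected values
print(proportional_allocation(100.0, projects))
# {'A': 71.4..., 'B': 21.4..., 'C': 7.1...}  vs. "fund only A" under naive EV maximization
```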

Great post!!

I think the section "Perhaps we don’t want AGI" is the best argument against these extrapolations holding in the near term. Data limitations, the practical benefits of small models, and profit-following will likely lead to small/specialized models in the near future.

https://www.lesswrong.com/posts/8e3676AovRbGHLi27/why-i-m-optimistic-about-near-term-ai-risk

Yeah I think a lot of it will have to be resolved at a more "local" level.

For example, for people in a star system, it might make more sense to define all land with respect to individual planets ("Bob owns 1 acre on Mars' north pole", "Alice owns all of L4", etc.) and forbid people from owning stationary pieces of space. I don't have the details of this fleshed out, but it seems like within a star system, it's possible to come up with a sensible set of rules and have the edge cases hashed out by local courts.

For the specific problem of predicting planetary orbits: if we can predict them 1000 years into the future, the time-path of land ownership could be updated automatically every 100 years or so, so I don't expect huge surprises there.

For taxation across star systems, I'm having trouble thinking of a case where there might be ownership ambiguity given how far apart they are. For example, even when the Milky Way and Andromeda galaxies collide, it's unlikely that any stars will collide. Once again, this seems like something that can be solved by local agreements where owners of conflicting claims renegotiate them as needed.

I feel like something important got lost here. The colonists are paying a land value tax in exchange for (protected) possession of the planet. Forfeiting the planet to avoid taxes makes no sense in this context. If they really don’t want to pay taxes and are fine with leaving, they could just leave and stop being taxed; no need to attack anyone.

The “it’s impossible to tax someone who can do more damage than their value” argument proves too much; it suggests that taxation is impossible in general. It’s always been the case that individuals can do more damage than could be recouped in taxation, and yet, people still pay taxes.

Where are the individuals successfully avoiding taxation by threatening acts of terrorism? How are states able to collect taxes today? Why doesn’t the U.S. bend to the will of weaker states since it has more to lose? It’s because these kinds of threats don’t really work. If the U.S. caved to one unruly individual then nobody would pay taxes, so the U.S. has to punish the individual enough to deter future threats.

... this would provide for enough time for a small low value colony, on a marginally habitable planet, to evacuate nearly all their wealth.

But the planet is precisely what's being taxed! Why stage a tax rebellion only to forfeit your taxable assets?

If the lands are marginal, they would be taxed very little, or not at all.

Even if they left the planet, couldn’t the counter strike follow them? It doesn’t matter if you can do more economic damage if you also go extinct. It’s like refusing to pay a $100 fine by doing $1000 of damage and then ending up in prison. The taxing authority can precommit to massive retaliation in order to deter such behavior. The colony cannot symmetrically threaten the tax authority with extinction because of the size difference.

All of this ignores the practical issues with these weapons, the fact that earth’s value is minuscule compared to the sun, the costs of forfeiting property rights, the relocation costs, and the fact that citizens of marginal lands would receive net payments from the citizens dividend.

There are two possibilities here:

  1. Nations have the technology to destroy another civilization

  2. Nations don't have the technology to destroy another civilization

In either case, taxes are still possible!

In case 1, any nation that attempts to destroy another nation will also be destroyed since their victim has the same technology. Seems better to pay the tax.

In case 2, the nation doesn't have a way to threaten the tax authority, so it pays the tax in exchange for property rights and protection.

Thus threatening the destruction of value several orders of magnitude greater than the value to be collected is a viable deterrent.

Agreed, but to destroy that much value, one would have to destroy at least as much land as one currently controls. Difficulties of tax administration mean that only the largest owners will be taxed, likely those possessing entire solar systems. So a tax dodger would need to destroy a star. That doesn't seem easy.

It’s impossible, without some as yet uninvented sensing technology, to reliably surveil even the few hundred closest star systems.

I'm more optimistic about sensing tech. Gravitational lensing, superlenses, and simple scaling can provide dramatic improvements in resolution.

It's probably unnecessary to surveil many star systems. Allied neighbors on Alpha Centauri can warn Earth about an incoming projectile as it passes by (providing years of advance notice), so nations might only need to surveil locally and exchange information.
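As a rough back-of-the-envelope on that warning margin (my own numbers, assuming the projectile is spotted exactly as it passes Alpha Centauri and the warning travels at light speed):

```python
# Warning margin = projectile travel time minus light-speed warning travel time.

d = 4.37            # distance to Alpha Centauri in light-years
v = 0.5             # projectile speed as a fraction of c

warning_margin = d / v - d / 1.0    # years of warning before impact
print(warning_margin)               # ~4.4 years
```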

... once it’s past the Oort Cloud and relatively easy to detect again, there will be almost no time left at 0.5 c

It would take 2-3 years for a 0.5c projectile to reach Earth from the outer edge of the Oort cloud, which seems like enough time to adapt. At the very least, it's enough time to launch a counterstrike.
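Back-of-the-envelope for the 2-3 year figure; estimates of the Oort cloud's outer edge vary a lot, so I'm assuming roughly 100,000 AU here purely for illustration:

```python
# Crossing time from an assumed Oort cloud outer edge to Earth at 0.5c.

AU_PER_LY = 63241.1                      # astronomical units per light-year
outer_edge_au = 100_000                  # assumed outer edge of the Oort cloud, in AU

distance_ly = outer_edge_au / AU_PER_LY  # ~1.6 light-years
travel_time_years = distance_ly / 0.5    # at 0.5c
print(travel_time_years)                 # ~3.2 years
```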

Fast projectiles may be impractical given that a single collision with the interstellar medium would destroy them. Perhaps thickening the Oort cloud could be an effective defense system.

A second-strike is only a credible counter if the opponent has roughly equal amounts to lose.

Agents pre-commit to a second strike as a deterrent, regardless of how wealthy the aggressor is. If the rebelling nation has the technology to destroy another nation and uses it, they're virtually guaranteed to be destroyed by the same technology.

Given the certainty of destruction, why not just pay the (intentionally low, redistributive, efficiency increasing, public-good funding) taxes?
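As a toy payoff comparison (illustrative numbers only): with a credible precommitment to retaliate, the damage the colony could inflict never enters its own payoff, so paying beats rebelling whenever the tax is smaller than what the colony stands to lose.

```python
# Toy comparison of the colony's two options under precommitted retaliation.

colony_wealth = 1_000.0
tax = 10.0                  # intentionally low land value tax
damage_to_authority = 1e6   # value the colony could destroy; irrelevant to its own payoff

payoff_pay = -tax                 # keep the planet, pay the tax
payoff_rebel = -colony_wealth     # precommitted retaliation destroys the colony

print("pay" if payoff_pay > payoff_rebel else "rebel")   # "pay" whenever tax < colony_wealth
```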
