Isnasene

Comments

Dutch-Booking CDT: Revised Argument

So, a silly question that doesn't really address the point of this post (this may well be just a clarity issue, but an answer would be useful to me for earning-to-give-related reasons that are off-topic here) --

Here you claim that CDT is a generalization of decision theories that includes TDT (fair enough!):

Here, "CDT" refers -- very broadly -- to using counterfactuals to evaluate expected value of actions. It need not mean physical-causal counterfactuals. In particular, TDT counts as "a CDT" in this sense.

But here you describe CDT as two-boxing in Newcomb, which conflicts with my understanding that TDT one-boxes, given your claim that TDT counts as a CDT:

For example, in Newcomb, CDT two-boxes, and agrees with EDT about the consequences of two-boxing. The disagreement is only about the value of the other action.

So is this conflict a matter of using the colloquial definition of CDT in the second quote but the broader one in the first, of having a more general framework for what counts as two-boxing than mine, or of knowing something about TDT that I don't?
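
For concreteness, here's the back-of-the-envelope calculation I have in mind when I say "two-boxing" -- a minimal sketch assuming the standard $1,000,000 / $1,000 payoffs, a hypothetical 99%-accurate predictor, and a 50/50 prior for the physical-causal counterfactual; the numbers are mine, not from the post:

```python
# Back-of-the-envelope Newcomb expected values (illustrative numbers only).
# The opaque box holds $1,000,000 iff the predictor predicted one-boxing;
# the transparent box always holds $1,000.
BIG, SMALL = 1_000_000, 1_000
ACCURACY = 0.99   # assumed predictor accuracy
P_FULL = 0.5      # physical-causal CDT's unconditional credence that the opaque box is full

# EDT-style evaluation: condition the box's contents on the action taken.
edt_one_box = ACCURACY * BIG
edt_two_box = (1 - ACCURACY) * BIG + SMALL

# Physical-causal CDT-style evaluation: the contents are already fixed,
# so two-boxing dominates by exactly SMALL.
cdt_one_box = P_FULL * BIG
cdt_two_box = P_FULL * BIG + SMALL

print(f"EDT: one-box {edt_one_box:>11,.0f}  two-box {edt_two_box:>11,.0f}")
print(f"CDT: one-box {cdt_one_box:>11,.0f}  two-box {cdt_two_box:>11,.0f}")
```

A TDT-style counterfactual would treat the prediction as correlated with the choice (so it one-boxes), which is why the two quotes read differently to me.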

OpenAI announces GPT-3

Thanks! This is great.

OpenAI announces GPT-3
A year ago, Joaquin Phoenix made headlines when he appeared on the red carpet at the Golden Globes wearing a tuxedo with a paper bag over his head that read, "I am a shape-shifter. I can't change the world. I can only change myself."

-- the GPT-3-generated news article that humans found easiest to distinguish from the real deal.

... I haven't read the paper in detail but we may have done it; we may be on the verge of superhuman skill at absurdist comedy! That's not even completely a joke. Look at the sentence "I am a shape-shifter. I can't change the world. I can only change myself." It's successful wordplay (whether intended or not). "I can't change the world. I can only change myself" is often used as a sort of moral truism (e.g. Man in the Mirror, Michael Jackson). In contrast, "I am a shape-shifter" is a literal claim about one's ability to change oneself.

The upshot is that GPT-3 can equivocate between the colloquial meaning of a phrase and the literal meaning of a phrase in a way that I think is clever. I haven't looked into whether the other GPTs did this (it makes sense that a statistical learner would pick up this kind of behavior) but dayum.

AI Boxing for Hardware-bound agents (aka the China alignment problem)
I propose that we ought to have less faith in our ability to control AI or its worldview and place more effort into making sure that potential AIs exist in a sociopolitical environment where it is to their benefit not to destroy us.

This is probably the crux of our disagreement. If an AI is indeed powerful enough to wrest power from humanity, the catastrophic convergence conjecture implies that, by default, it will. And if the AI is indeed powerful enough to wrest power from humanity, I have difficulty envisioning things we could offer it in trade that it couldn't just unilaterally satisfy for itself more cheaply and efficiently.

As an intuition pump for this, I think that the AI-human power differential will be more similar to the human-animal differential than the company-human differential. In the latter case, the company actually relies on humans for continued support (something an AI that can roll out human-level AI won't need to do at some point) and thus has to maintain a level of trust. In the former case, well... people don't really negotiate with animals at all.

Multiple Arguments, Multiple Comments

Yeah, I don't do it, mainly for selfish reasons, but I agree that there are a lot of benefits to separating arguments into multiple comments in terms of improving readability and structure. Frankly, I commend you for doing it (and I'm particularly amenable to it because I like bullet points). With that said, here are some reasons (which you shouldn't take too seriously) for why I don't:

Selfish Reasons:

  • It's straightforwardly easier -- I tend to write my comments with a sense of flow. It feels more natural for me to type from start to finish and hit submit once than to write and submit multiple things.
  • I often use my comments to practice writing/structure, and the more your arguments are divided into different comments, the less structure you need. In some cases, reducing structure is a positive, but it's not really what I'm going for.
  • When I see several comment notifications on the little bell in the corner of my screen, my immediate reaction is "oh no, I messed up" followed by "oh no, I have a lot of work to do now." When I realize it's all from one person, some of this is relieved, but it still causes some stress -- more comments feels like more people even if it isn't.

Practical Reasons:

  • If multiple arguments rely on the same context, it allows me to state the context once and then give the arguments that follow from it. If I comment on each argument separately, I have to restate the context multiple times -- once for each argument relying on it.
  • Arguments in general can often interact with each other -- so building on one argument might strengthen/weaken my position on a different argument. If I split each argument into its own comment, then I have to link around to different places to build this.
  • When I'm reading an argument, it's often because I'm trying to figure out which position on a certain thing is right, and I don't want to dig through comments that may serve other purposes (i.e. top-level replies often include general commentary or explorations of the post material that aren't literally arguments). In this context, having to dig through many different kinds of comments to find the arguments is a lot more work than just finding a chain [Epistemic Status: I haven't actually tried this]. This isn't an issue for second-level comments.
  • Similarly, when deciding what position to take, I like some broader unifying discussion of which arguments were right and which were wrong, leading to some conclusion about the position itself. If 3/4 of your arguments made good points and it's not a big deal that the fourth was wrong, this should be explored. Similarly, if 1/4 of your arguments made good points but that one is absolutely crucial compared to the others, this should be explored as well. A conventional back-and-forth argument gives you a nice way to end the conversation with this, but it becomes more complex if you split your arguments into multiple comments. [Note that in some cases though, it's better to make your readers review each argument and think critically for themselves!]
AI Boxing for Hardware-bound agents (aka the China alignment problem)

Nice post! The moof scenario reminds me somewhat of Paul Christiano's slow-takeoff scenario, which you might enjoy reading about. This is basically my stance as well.

AI boxing is actually very easy for Hardware Bound AI. You put the AI inside of an air-gapped firewall and make sure it doesn't have enough compute power to invent some novel form of transmission that isn't known to all of science. Since there is a considerable computational gap between useful AI and "all of science", you can do quite a bit with an AI in a box without worrying too much about it going rogue.

My major concern with AI boxing is the possibility that the AI might just convince people to let it out (i.e. remove the firewall, provide unbounded internet access, connect it to the cloud). Maybe you can get around this by combining a limited AI output data stream with a very arduous, pre-committed gated process for letting the AI out, but I'm not very confident.

If the biggest threat from AI doesn't come from AI Foom, but rather from Chinese-owned AI with a hostile world-view.

The biggest threat from AI comes from AI-owned AI with a hostile worldview -- no matter how the AI gets created. If we can't answer the question "how do we make sure AIs do the things we want them to do when we can't tell them all the things they shouldn't do?", we might wind up with Something Very Smart scheming to take over the world while lacking at least one Important Human Value. Think Age of Em, except the Ems aren't even human.

Advancing AI research is actually one of the best things you can do to ensure a "peaceful rise" of AI in the future. The sooner we discover the core algorithms behind intelligence, the more time we will have to prepare for the coming revolution. The worst-case scenario still is that some time in the mid 2030's a single research team comes up with a revolutionary new software that puts them miles ahead of anyone else. The more evenly distributed AI research is, the more mutually beneficial economic games will ensure the peaceful rise of AI.

Because I'm still concerned about making sure AI actually does the things we want it to do, I'm worried that faster AI advancement will make that problem harder. Beyond that, I'm not really worried about economic dominance in the context of AI. Given a slow-takeoff scenario, the economy will be booming like crazy wherever AI has been exercised to its technological capacities, even before AGI emerges. In a world of abundant labor and so on, the need for mutually beneficial economic games with other human players, let alone countries, will be much smaller.

I'm a little worried about military dominance, though -- the country with the best military AI may leverage it to gain a radical geopolitical upper hand. Still, we were able to handle nuclear weapons, so we should probably be able to handle this too.

It's Not About The Nail

Admittedly, the first time I read this I was confused because you went "When a bad thing happens to you, that has direct, obvious bad effects on you. But it also has secondary effects on your model of the world." This gave me the sense that the issue was with the model of the world and not the world itself. That isn't what you meant, but I made a list of reasons why talking is a thing people do anyway:

  • When you become more vulnerable and the world is less predictable, the support systems you have for handling those things which were created in a more safe/predictable world will have a greater burden. Talking to people in that support system about the issue makes them aware of it and establishes precedent for you requesting more help than usual in the future. Pro-active support system preparation.
  • Similar to talking as a way re-affirming relationships (like you mentioned), talking can also be used directly to strengthen relationships. This might not solve the object-level problem but it gives you more slack to solve it. Pro-active support system building.
  • Even when talking doesn't seem to be providing a solution, it still often provides you information about the problem at hand. For instance, someone else's reaction to your problem can help you gauge its severity and influence your strategy. Oftentimes you don't actually want to find the solution to the problem immediately -- you want to collect a lot of information so you can slowly process it until you reach a conclusion. Information collection.
    • Similarly this is really good if you actually want to solve the problem but don't trust the person you're talking to to actually give you good solutions.
  • Even when talking doesn't seem to be providing a solution, talking typically improves your reasoning ability anyway -- see rubber duck debugging, for instance. Note that literally talking about your problems to a rubber duck is more trouble than it's worth in cases where "I'm talking about my problems to a rubber duck" is an emotionally harmful concept.
  • People evolved to interact with far fewer people than we actually interact with today. In the modern world, talking to someone about a problem often has little impact. But back in the day, talking to one of the dozen or so people in your tribe could have massive utility. In this sense, I think that talking to people about problems is kinda instinctual and has built-in emotional benefits.
Is ethics a memetic trap ?
Applying these systems to the kind of choices that I make in everyday life I can see all of them basically saying something like:...

The tricky thing with these kinds of ethical examples is that a bunch of selfish (read: amoral) people would totally take care of their bodies, be nice to the people they're in iterated games with, try to improve themselves in their professional lives, and seek long-term relationship value. The only unambiguously selfless thing on that list, in my opinion, is donating -- and that tends to kick the question of ethics down the road to the matter of who you are donating to, which differs across ethical philosophies.

In any case, the takeaway from this is that people's definitions of what they ought to do are deeply entangled with the things that they would want to do. I think this is why many of the ethical systems you're describing make similar suggestions. But once you start to think about actions you might not actually be comfortable doing, many ethical systems make nontrivial claims.

Not every ethical system says you may lie if it makes people feel better. Not every ethical system says you shouldn't eat meat. Not every ethical system says you should invest in science. Not every ethical system says you should pray. Not every ethical system says you should seek out lucrative employment purely to donate the money.

These nontrivial claims matter, because in some cases they correspond to the highest-leverage ethical actions a person could possibly take -- eclipsing the relevance of ordinary day-to-day actions entirely.

There are easy ways to being a better moral agent, but to do that, you should probably maximize the time you spend taking care of yourself, taking care of others, volunteering, or working towards important issues… rather than reading Kant.

I agree with this, though. If you want to do ethical things... just go ahead and do them. If it wasn't something you cared about before you read about moral imperatives, it's unlikely to start being something you care about after.

TheRealClippy's Shortform

Nah. Based on my interaction with humans who work from home, most aren't really that invested in the whole "support the paperclip factories" thing -- as evidenced by their willingness to chill out now that they're away from offices and can do it without being yelled at (sorry humans! forgive me for revealing your secrets!). Nearly half of Americans live paycheck to paycheck, so, on the margin, Covid-19 is absolutely catastrophic for the financial well-being (read: self-agency) of many people, which propagates into the long term via wage scarring. It's completely understandable that they're freaking out.

Also note that many of the people objecting to being forced to stay home are older. They might not be as at-risk as the oldest or infirm people, but they're still at serious risk. I'd frankly do quite a bit to avoid getting coronavirus if I could, and I'm young. If you're in dire enough straits to risk getting coronavirus for employment, you're probably doing it because you need to -- certainly not because of any abstract concerns about paperclip factories.

That being said, there are totally a bunch of people who are acting like our paperclip-making capabilities outweigh the importance of old and infirm humans. They aren't most humans, but they exist. They're called Moloch's Army, and a bunch of the other humans really are working on figuring out how to talk about them in public. Beware though: the protestors you are thinking of might not be the droids you're looking for.

April Coronavirus Open Thread

I think the brief era of me looking at Kinsa weathermap data has ended for now. My best guess is that covid spread among Kinsa users has been almost completely mitigated by the lockdown, and current estimates of r0 are being driven almost exclusively by other demographics. Otherwise, the data doesn't really line up:

  • As of now, Kinsa reports 0% ill for the United States (this is likely just a matter of misleading rounding: New York County shows 0.73% ill)
  • New York's trend is a much more aggressive drop than would be anticipated from Cuomo's official estimate of r0=0.9 (see the rough arithmetic after the footnote).
  • None of these trends really fall in line with state-by-state r0 estimates[1] either
    • Georgia has the worst r0 estimate at 1.5, but Fulton County, GA (Atlanta) has been flat at 0% ill since April 7 according to Kinsa

[1] Linking to the Twitter thread because there is some criticism of these estimates: "They use case counts, which are massively and non-uniformly censored. A big daily growth rate in positive cases is often just testing ramping up or old tests finally coming back."
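
For reference, here's the rough arithmetic behind "much more aggressive drop than anticipated" -- a quick sketch assuming a ~5-day generation interval, which is my own illustrative assumption, not part of Cuomo's estimate:

```python
# Decline rate implied by r0 = 0.9, assuming a ~5-day generation interval
# (illustrative assumption; not from Cuomo's estimate).
r0 = 0.9
generation_days = 5
daily_factor = r0 ** (1 / generation_days)
print(f"Implied daily change in new infections: {100 * (daily_factor - 1):.1f}%")
# Roughly -2% per day, i.e. a halving time of about a month --
# much gentler than the drop Kinsa shows for New York.
```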
