JeremyHussell — LessWrong

Mechanisms too simple for humans to design

Yeah, bad example. Nonetheless, an adult human brain cannot be recreated solely from its genetic code, just as documents written using Microsoft Word cannot be recreated solely from the source code of Microsoft Word and an LLM cannot be recreated without training data. Most of the article falls apart because comparing source code size (uncompressed, note) to genome size tells us very little about the relative complexity of software and living organisms.

Your brain probably is the most complex thing in the room, with ~86 billion neurons, each of which has a lot of state that matters.

Covid 10/7: Steady as She Goes

JeremyHussell4y10

What happened here wasn’t that Harvard and the CDC stopped being interested in truth. That ship sailed a while ago. What happened here was that Harvard and the CDC’s lack of interest in truth was revealed more explicitly and clearly, and became closer to common knowledge.

Further, thinking and talking about organizations as if they were interested or disinterested in anything keeps leading to errors. CDC and Harvard likely have no institutional rules or incentives in place to promote truth over falsehood, or even to promote trust in the institutions themselves. (They may have "vision statements" or "principles", but these are, in practice, neither rules nor incentives for most people in an organization.) The people who were the first members of the organizations were very interested in and careful about truth and trust, and that's how the reputation arose. But the CDC was founded in 1946 and it's all different people now. This applies even more to Harvard, which was founded in 1636, but their current reputation was mostly formed around the same time. In the long run, one cannot trust future people flowing through an organization to do anything except obey the rules written in stone and the incentives in place. Even then, if rules are felt to be too restrictive it's likely future people will ignore, undermine, reinterpret, and outright rewrite them. And one can expect future people might do anything which is permitted by the rules and incentives, even if it would be unthinkable today.

The corollary is that one cannot trust organizational "character" not to change, and one must regularly update the reputation one attributes to an organization (really to the people in an organization), because the current people are regularly changing, both individuals simply changing over time and different people outright taking the place of previous ones.

The corollary to the corollary is that it can be helpful to have an estimate of the rate of turnover and mentoring in each organization, because that affects how long one can safely go without updating the rest of the reputation of the organization (which, again, is really the combined reputation of the particular people in that organization). Harvard professors likely have very low rates of turnover and excellent mentoring, so I wouldn't expect their teaching and research to have changed much in the last decade, but university administrators probably change jobs as often as CDC administrators. Looking into it more deeply can give you an estimate of how often you'll have to re-evaluate the reputation of any particular organization. Every decade? Every five years? Yearly?

Another line of thought: what incentives and rules for an institution could one set up which would encourage desirable behaviour and discourage undesirable behaviour in all the future people who will take positions in the organization? Making progress on problems like this, where agents are human but have a very long future time in which to find and exploit loopholes, seems like an obvious prerequisite to making progress on AI alignment. One cannot simply trust that no future person will abuse the rules for their own benefit, just as one cannot trust that no AI will immediately do the same.

Maybe if some progress were made on this, we could have some sustainable trust in some institutions. The checks-and-balances concept is a good start: set up independent institutions all able to monitor and correct each other.

Possible worst outcomes of the coronavirus epidemic

JeremyHussell6y100

After-the-fact analysis of the causes of major disasters often reveals multiple independent causes, none of which would have caused a disaster by itself, but each of which degraded or disabled the usual safeguards in place for the other problems. This seems to come up in everything from relatively small-scale transportation disasters to the fall of civilizations, and possibly in major extinction events. E.g. there have been many large asteroid impacts, but the one which finished off the dinosaurs happened to also coincide with (and possibly triggered or exacerbated) major volcanic activity. (The Deccan Traps.)

So the worst possible outcome of the epidemic might be that it happens to coincide with some other, totally unrelated disaster. For example, natural disasters such as earthquake+tsunamis, widespread rainfall and flooding, major fires piling air-quality issues on top of COVID-19 breathing problems, and so on. (In a way, I'm thankful the recent fires in Australia happened then, and are therefore not happening now.) Unrelated war(s) would make everything worse. So would a second pandemic at the same time. So would just about anything on the list of possible existential risks. I think this would count as a worst-case outcome of the epidemic, even though it would be an indirect outcome.

The global scale of this epidemic, and its months-long projected duration, seem to make it more probable that something else will go badly wrong just when everything else is under stress.

Possible worst outcomes of the coronavirus epidemic

JeremyHussell6y50

Once enough people have been infected and recovered, gaining immunity, the evolutionary pressures on a virus switch from "spread as fast as possible into new hosts" to "keep the current host alive and infectious long enough to encounter a host without immunity". Even though influenza periodically bypasses existing immunity, the evolutionary pressure towards lower mortality is still present most of the time. In particular, our actions to quarantine and isolate, if sufficiently widespread, will also put a lot of evolutionary pressure towards less-severe effects on SARS-CoV-2. All those mild and asymptomatic cases? Pretty soon those are going to be the most successful replication strategy, and the SARS-CoV-2 population as a whole will be pushed towards causing lower mortality.

There is still a chance we'll have repeated high-mortality waves, but one should note that coronaviruses and influenza viruses are not particularly closely related. Influenza seems to have about one mutation in its protein coat per replication, while as of Feb. 11th the 81 sequenced samples of SARS-CoV-2 had "at most seven mutations relative to a common ancestor". So I'm inferring that SARS-CoV-2 is less likely to be able to bypass existing immunity on a yearly basis. Influenza has a high mutation rate due to lacking RNA proofreading enzymes, so if SARS-CoV-2 has an RNA proofreading enzyme or hijacks host-cell proofreading enzymes I would also update towards a lower probability of repeated waves. Influenza is also unusual because it's composed of eight pieces of RNA, which makes it easy for different strains of influenza to swap genes when they infect the same cell at the same time. This is another major reason that influenza bypasses immunity so often. SARS-CoV-2 seems to have one 30,000 base-pair segment of RNA, so it can't do that trick either.

There are still a lot of unknowns, but so far there's no evidence I've heard of which has made me update towards SARS-CoV-2 being more likely to be able to bypass existing immunity than other coronaviruses, much less influenza.

Simplified Poker

JeremyHussell7y10

8 months late. I'm coming into this cold but having previously read about a very similar competition to create strategies to play Rock-Paper-Scissors (RPS). First, work out all the decision points in the game, and the possible information available at each decision point. We end up with 2 binary decisions for each player, and 3 states of information at each decision point.

So my first strategy is to predict my opponent's decisions, and calculate which of my possible decisions will give me the best result. For RPS this is pretty simple:

P(R), P(P), P(S): probabilities my opponent will play Rock, Paper, and Scissors.

V(R), V(P), V(S): expected score (value) for me playing Rock, Paper, Scissors.

V(R) = (P(R) * 0 + P(P) * -1 + P(S) * 1) / (P(R) + P(P) + P(S))

The calculation on the line above is for the general case. For the specific case of RPS, it simplifies to:

V(R) = P(S) - P(P)

V(P) = P(R) - P(S)

V(S) = P(P) - P(R)

A surprising number of competitors fail to play optimally against their opponent's predicted actions. For example, with P(R) = 0.45, P(P) = 0.16, P(S) = 0.39, many competitors play Paper, even though the best expected value is from playing Rock. (Optimal play exploits unusually low probabilities as well as unusually high probabilities.)
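As a quick check of the example above, here is a minimal Python sketch of the expected-value formulas. The function names `rps_values` and `best_move` are my own labels for illustration, not anything from the competition.

```python
def rps_values(p_rock, p_paper, p_scissors):
    """Expected score for each of my moves, given the predicted
    probabilities of my opponent's moves (win = +1, loss = -1, tie = 0)."""
    # Normalise, as in the general-case formula, in case inputs don't sum to 1.
    total = p_rock + p_paper + p_scissors
    r, p, s = p_rock / total, p_paper / total, p_scissors / total
    return {
        "R": s - p,  # V(R) = P(S) - P(P)
        "P": r - s,  # V(P) = P(R) - P(S)
        "S": p - r,  # V(S) = P(P) - P(R)
    }


def best_move(p_rock, p_paper, p_scissors):
    """Move with the highest expected score."""
    values = rps_values(p_rock, p_paper, p_scissors)
    return max(values, key=values.get)


values = rps_values(0.45, 0.16, 0.39)
best = best_move(0.45, 0.16, 0.39)
```

With these numbers Rock is worth about 0.23 per game, Paper about 0.06, and Scissors about -0.29, so beating the opponent's single most likely move (Rock) with Paper leaves most of the available value on the table.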

In RPS there are three possible decisions, but in simplified poker all the decision points are binary, so we can use A and !A to represent both probabilities, instead of A, B, and C. I choose to represent betting and calling as direct probabilities, and checking and folding as the complementary probabilities.

A, B, C: player #1 bets with a 1, 2, or 3 respectively

D, E, F: after a check and a bet, player #1 calls with a 1, 2, 3

G, H, I: after a bet, player #2 calls with a 1, 2, 3

J, K, L: after a check, player #2 bets with a 1, 2, 3

The expected value calculations are more complicated than in RPS (among other things, you can be uncertain about the current state of the game because you don't know which card your opponent has, and the outcome of player #1's game sometimes depends on its own future decisions), but thanks to the binary decisions the results can be simplified almost as much as in RPS.

D(A), D(B), etc.: condition necessary to decide to do A, B, etc. Calculate V(A) and V(!A), then D(A) = V(A) > V(!A) and D(!A) = V(A) < V(!A). If they're equal, then you play your predetermined Nash equilibrium strategy.

Player #1:

D(A) = 4/3 > P(H) + P(I)

D(B) = 2 + P(G) + z > 3 * P(I), where z = P(L) - P(J) when 3 * P(J) > P(L) and z = 2 * P(J) when 3 * P(J) < P(L)

D(C) = P(G) + P(H) > P(J) + P(K)

D(D) = false

D(E) = 3 * P(J) > P(L)

D(F) = true

Player #2:

D(G) = false

D(H) = 3 * P(A) > P(C)

D(I) = true

D(J) = P(!B) * (2 * P(!E) - P(E)) > P(!C) * (P(F) - 2 * P(!F))

D(K) = P(!A) * P(D) > P(!C) * (3 * P(F) + 2)

D(L) = P(!A) * P(D) + P(!B) * P(E) > 0

Translated back to English:

#1 with a 1: If you predict #2 will fold often enough, then bet (bluff), otherwise check, and always fold if #2 bets.

#1 with a 2: Bet only if you predict #2 will call with a 1 and fold with a 3 enough more than bluffing with a 1 and checking with a 3. Call after #2 bets if there's a high enough chance it's a bluff.

#1 with a 3: Bet or check depending on whether #2 is more likely to call your bet or to bet after you check. Always call if #2 bets.

#2 with a 1: If #1 bets, fold. If #1 checks and will fold often enough, then bluff, otherwise check.

#2 with a 2: If #1 bets, call if the chances of a bluff are high enough, otherwise fold. If #1 checks, check unless you predict #1 will call with a 1 and fold with a 3 often enough combined to be worth it.

#2 with a 3: If #1 bets, call. If #1 checks, bet.
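For anyone who wants to experiment, the D(...) conditions listed earlier can be transcribed into Python roughly as follows. This is only a sketch: it copies the inequalities as written without re-deriving them, the function name `decisions` and the dict layout are my own assumptions, and ties (where you would fall back to a Nash equilibrium strategy) simply come out as False here.

```python
def decisions(p):
    """p maps each letter 'A'..'L' to its predicted probability.
    Returns a dict mapping each letter X to the boolean D(X)."""

    def n(key):
        # P(!X) is the complementary probability of P(X).
        return 1.0 - p[key]

    # z from the D(B) condition. The two branches agree when
    # 3 * P(J) == P(L), so the boundary case needs no special handling.
    if 3 * p["J"] > p["L"]:
        z = p["L"] - p["J"]
    else:
        z = 2 * p["J"]

    return {
        # Player #1
        "A": 4 / 3 > p["H"] + p["I"],
        "B": 2 + p["G"] + z > 3 * p["I"],
        "C": p["G"] + p["H"] > p["J"] + p["K"],
        "D": False,
        "E": 3 * p["J"] > p["L"],
        "F": True,
        # Player #2
        "G": False,
        "H": 3 * p["A"] > p["C"],
        "I": True,
        "J": n("B") * (2 * n("E") - p["E"]) > n("C") * (p["F"] - 2 * n("F")),
        "K": n("A") * p["D"] > n("C") * (3 * p["F"] + 2),
        "L": n("A") * p["D"] + n("B") * p["E"] > 0,
    }


# Hypothetical prediction: every decision is a coin flip.
coin_flip = {letter: 0.5 for letter in "ABCDEFGHIJKL"}
result = decisions(coin_flip)
```

Keeping both players' probabilities in one dict is a convenience: several of player #2's conditions depend on player #1's later calls, so each side needs the other's predicted behaviour anyway.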

Alert readers will complain that I've skipped over the most interesting step: predicting what my opponent will play. This is true, but the above steps needed to be done first, because many of the interesting strategies for predicting your opponent's play assume they've done the same analysis. If both players play following this strategy, and both know that the other will play following this strategy, then play settles into one of the Nash equilibriums. But, many players won't play optimally, and if you can identify deviations from the Nash equilibrium quickly then you can get a better score. If your opponent is doing the same thing, then you can fake a deviation from Nash that lowers your score a little, but causes your opponent to deviate from the Nash equilibrium in a way that you can exploit for more gain than your loss (until your opponent catches on). So I can predict you will predict I will predict you will... and it seems to go into an infinite loop of ever-higher levels of double-think.

My most important takeaway from the Rock-Paper-Scissors competition was that if there are a finite number of deterministic strategies, then the number of levels of double-think is finite too. This is much easier to see in RPS. Given a method of prediction P:

P0: assume your opponent is vulnerable to prediction by method P, play to beat it.

P1: assume your opponent thinks you will use method P0, and plays to beat it. Play to beat that.

P2: assume your opponent thinks you will use P1, and plays to beat it. Play to beat that.

But because in RPS there are only 3 possible deterministic strategies, P3 recommends you play the same way as P0!

There's also a second stack where you assume your opponent is using P to predict you, then assuming you know that, and so on, which also ends with 3 deterministic strategies.
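The cycle in the first stack (P0, P1, P2) is easy to see in code. A minimal sketch, assuming the base predictor P has already produced a single predicted move for the opponent; `BEATS` and `level_move` are names I made up for this illustration.

```python
# BEATS[x] is the move that beats x.
BEATS = {"R": "P", "P": "S", "S": "R"}


def level_move(predicted_opponent_move, level):
    """Move recommended by level `level` of the double-think stack,
    given the move that predictor P says the opponent will play."""
    # P0: simply beat the predicted move.
    move = BEATS[predicted_opponent_move]
    for _ in range(level):
        # The opponent expects `move` and plays BEATS[move];
        # the next level up plays whatever beats that.
        move = BEATS[BEATS[move]]
    return move


# If P predicts Rock, levels 0 through 3 recommend:
recommendations = [level_move("R", k) for k in range(4)]
```

Each level shifts the recommendation two steps around the Rock, Paper, Scissors cycle, so after three levels you are back where you started and only P0, P1, and P2 are distinct.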

In simplified poker, if you predict your opponent is not playing a Nash equilibrium strategy, and respond optimally yourself, then you will respond in one of 16 ways. If you assume your opponent has guessed your play and will respond optimally, then there are 8 ways for player #1 to respond, and only 4 ways for player #2 to respond. So, assuming I haven't made a mistake, there are at most 5 levels of second guessing, 1 for responding to naive play, and at most 4 more for responding to optimal play before either you or your opponent start repeating yourselves.

So, for any method of prediction which does not involve double-thinking, you can generate all double-think strategies and reverse double-think strategies. Then you need a meta-strategy to decide which one to use on the next hand. If you do this successfully then you'll defeat anyone who is vulnerable to one of your methods of prediction, uses one of your methods of prediction, or uses a strategy to directly defeat one of your methods of prediction.

An Undergraduate Reading Of: Macroscopic Prediction by E.T. Jaynes

JeremyHussell8y40

Note that this paper was first published in 1985, not 1996. The full source is in a footnote at the bottom of the first page.
