gwern

Comments

March 2021 gwern.net newsletter

"'Nash equilibrium strategy' is not necessarily synonymous to 'optimal play'. A Nash equilibrium can define an optimum, but only as a defensive strategy against stiff competition. More specifically: Nash equilibria are hardly ever maximally exploitive. A Nash equilibrium strategy guards against any possible competition including the fiercest, and thereby tends to fail taking advantage of sub-optimum strategies followed by competitors. Achieving maximally exploitive play generally requires deviating from the Nash strategy, and allowing for defensive leaks in one's own strategy."
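To make the quoted distinction concrete, here is a minimal sketch using rock-paper-scissors. The game and the rock-heavy opponent are illustrative assumptions, not taken from the quoted source. It compares the expected payoff of the uniform Nash strategy with that of a maximally exploitive reply, and then shows the leak the exploitive reply opens up.

```python
# Payoff to the first player in rock-paper-scissors: +1 win, 0 tie, -1 loss.
MOVES = ("rock", "paper", "scissors")
PAYOFF = {
    ("rock", "rock"): 0, ("rock", "paper"): -1, ("rock", "scissors"): 1,
    ("paper", "rock"): 1, ("paper", "paper"): 0, ("paper", "scissors"): -1,
    ("scissors", "rock"): -1, ("scissors", "paper"): 1, ("scissors", "scissors"): 0,
}

def expected_payoff(mine, theirs):
    """Expected payoff of mixed strategy `mine` against mixed strategy `theirs`."""
    return sum(mine[a] * theirs[b] * PAYOFF[(a, b)] for a in MOVES for b in MOVES)

# The Nash equilibrium of rock-paper-scissors is the uniform mix.
nash = {move: 1 / 3 for move in MOVES}

# Hypothetical sub-optimal opponent who over-plays rock (an assumption for illustration).
rock_heavy = {"rock": 0.5, "paper": 0.25, "scissors": 0.25}

# Best response to rock_heavy, and the counter-strategy that punishes it.
always_paper = {"rock": 0.0, "paper": 1.0, "scissors": 0.0}
always_scissors = {"rock": 0.0, "paper": 0.0, "scissors": 1.0}

# Nash cannot be beaten, but it also fails to profit from the weak opponent.
print("Nash vs rock-heavy:       ", round(expected_payoff(nash, rock_heavy), 6))
# Deviating from Nash wins more against this particular opponent...
print("Paper vs rock-heavy:      ", round(expected_payoff(always_paper, rock_heavy), 6))
# ...at the cost of a defensive leak against an opponent who adapts.
print("Paper vs always-scissors: ", round(expected_payoff(always_paper, always_scissors), 6))
```

The uniform mix breaks even against everyone, while always-paper earns a positive expectation against the rock-heavy player but loses every round to an opponent who switches to scissors.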

2020 AI Alignment Literature Review and Charity Comparison

That's interesting. I did see YC listed as a major funding source, but given Sam Altman's listed loans/donations, I assumed, because YC has little or nothing to do with Musk, that YC's interest was Altman, Paul Graham, or just YC collectively. I hadn't seen anything at all about YC being used as a cutout for Musk. So assuming the Guardian didn't screw up its understanding of the finances there completely (the media is constantly making mistakes in reporting on finances and charities in particular, but this seems pretty detailed and specific and hard to get wrong), I agree that that confirms Musk did donate money to get OA started and it was a meaningful sum.

But it still does not seem that Musk donated the majority or even plurality of OA donations, much less the $1b constantly quoted (or any large fraction of the $1b collective pledge, per ESRogs).

The best frequently don't rise to the top

Among the most interesting media experiments I know of are the Yahoo Media experiments:

  1. "Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market", Salganik et al 2006:

    We investigated this paradox experimentally, by creating an artificial ‘‘music market’’ in which 14,341 participants downloaded previously unknown songs either with or without knowledge of previous participants’ choices. Increasing the strength of social influence increased both inequality and unpredictability of success. Success was also only partly determined by quality: The best songs rarely did poorly, and the worst rarely did well, but any other result was possible.

  2. "Web-Based Experiments for the Study of Collective Social Dynamics in Cultural Markets", Salganik & Watts 2009:

    Using a ‘‘multiple-worlds’’ experimental design, we are able to isolate the causal effect of an individual-level mechanism on collective social outcomes. We employ this design in a Web-based experiment in which 2,930 participants listened to, rated, and downloaded 48 songs by up-and-coming bands. Surprisingly, despite relatively large differences in the demographics, behavior, and preferences of participants, the experimental results at both the individual and collective levels were similar to those found in Salganik, Dodds, and Watts (2006)...A comparison between Experiments 1 and 2 reveals a different pattern. In these experiments, there was little change at the song level; the correlation between average market rank in the social influence worlds of Experiments 1 and 2 was 0.93.

This is analogous to test-retest error: if you run a media market with the same authors, and same creative works, how often do you get the same results? Forget completely any question about how much popularity correlates with 'quality' - does popularity even correlate with itself consistently? If you ran the world several times, how much would the same songs float to the top?

The most relevant rank correlation they seem to report is rho=0.93*. That may seem high, but the more datapoints there are, the higher the necessary correlation soars to give the results you want.

A rho=0.93 implies that if you had a million songs competing in a popularity contest, the #1 popular song in our world would probably be closer to only the #35,000th most popular song in a parallel world's contest as it regresses to the mean (1000000 - (500000 + (500000 * 0.93))). (As I noted the other day, even in very small samples you need extremely high correlations to guarantee double-maxes or similar properties, once you move beyond means; our intuitions don't realize just what an extreme demand we make when we assume that, say, J.K. Rowling must be a very popular successful writer in most worlds simply because she's a billionaire in this world, despite how many millions of people are writing fiction and competing with her. Realistically, she would be a minor but respected author who might or might not've finished out her HP series as sales flagged for multi-volume series; sort of like her crime novels published pseudonymously.)
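The 'double-max' point can be checked with a small Monte Carlo sketch. The assumptions here are mine, not the papers': popularity in the two worlds is modeled as a bivariate Gaussian with Pearson correlation rho (the experiments report a rank correlation), and the function name, item counts, and trial counts are hypothetical choices for illustration.

```python
import math
import random

def top_item_repeat_rate(rho, n_items, n_trials, seed=0):
    """Fraction of trials in which world A's #1 item is also #1 in world B.

    Each item gets a score in world A and a correlated score in world B:
    b = rho * a + sqrt(1 - rho^2) * noise, so corr(a, b) = rho.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - rho * rho)
    repeats = 0
    for _ in range(n_trials):
        best_a = best_b = -math.inf
        idx_a = idx_b = -1
        for i in range(n_items):
            a = rng.gauss(0.0, 1.0)
            b = rho * a + noise_scale * rng.gauss(0.0, 1.0)
            if a > best_a:
                best_a, idx_a = a, i
            if b > best_b:
                best_b, idx_b = b, i
        if idx_a == idx_b:
            repeats += 1
    return repeats / n_trials

if __name__ == "__main__":
    # Only 500 items, far fewer than a million competing songs or authors.
    for rho in (1.0, 0.93, 0.52):
        rate = top_item_repeat_rate(rho, n_items=500, n_trials=200)
        print(f"rho={rho}: same #1 in both worlds in {rate:.0%} of trials")
```

Under these assumptions only rho=1 guarantees the same winner in both worlds, and increasing `n_items` should make a repeat winner rarer still at any rho below 1.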

Then toss in the undoubtedly <<1 correlation between popularity and any 'quality'... It is indeed no surprise that, out of the millions and millions of chefs over time, the best chefs in the world are not the most popular YouTube chefs. Another example of 'the tails come apart' at the extremes and why order statistics is counterintuitive.

* They also report a rho=0.52 from some other experiments, which are arguably now more relevant than the 0.93 estimate. Obviously, if you use 0.52 instead, my point gets much much stronger: then, out of a million, you regress from #1 to #240,000!
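The two regressed ranks quoted above come from the same back-of-envelope formula, N - (N/2 + (N/2) * rho). A minimal sketch of that arithmetic follows; the function name is hypothetical, and this is the linear heuristic from the comment rather than an exact order-statistics calculation.

```python
def regressed_rank(n_items, rho):
    """Rough expected rank in a parallel world of this world's #1 item.

    Uses the comment's heuristic: the item regresses from the top toward
    the middle of the pack, keeping a fraction rho of its lead.
    """
    midpoint = n_items / 2
    return n_items - (midpoint + midpoint * rho)

for rho in (0.93, 0.52):
    print(f"rho={rho}: #1 regresses to about #{regressed_rank(1_000_000, rho):,.0f}")
# rho=0.93: #1 regresses to about #35,000
# rho=0.52: #1 regresses to about #240,000
```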

The EMH is False - Specific Strong Evidence

I knew someone was going to ask that. Yes, it's impure indexing, it's true. The reason is the returns to date on the whole-world indexes have been lower, the expense is a bit higher, and after thinking about it, I decided that I do have a small opinion about the US overperforming (mostly due to tech/AI and a general sense that people persistently underestimate the US economically) and feel pessimistic about the rest of the world. Check back in 20 years to see how that decision worked out...

Against evolution as an analogy for how humans will create AGI

As described above, I expect AGI to be a learning algorithm—for example, it should be able to read a book and then have a better understanding of the subject matter. Every learning algorithm you’ve ever heard of—ConvNets, PPO, TD learning, etc. etc.—was directly invented, understood, and programmed by humans. None of them were discovered by an automated search over a space of algorithms. Thus we get a presumption that AGI will also be directly invented, understood, and programmed by humans.

For a post criticizing the use of evolution for end-to-end ML, this post seems to be pretty strawmanish and generally devoid of any grappling with the Bitter Lesson, end-to-end principle, Clune's arguments for generativity and AI-GAs program to soup up self-play for goal generation/curriculum learning, or any actual research on evolving better optimizers, DRL, or SGD itself... Where's Schmidhuber, Metz, or AutoML-Zero? Are we really going to dismiss PBT evolving populations of agents in the AlphaLeague as just 'tweaking a few human-legible hyperparameters'? Why isn't Co-Reyes et al 2021 an example of evolutionary search inventing TD-learning, which you claim is absurd and the sort of thing that has never happened?

Thirty-three randomly selected bioethics papers

This was exactly what I expected. The problem with the field of bioethics has never been the papers being 100% awful, but how it operates in the real world, the asymmetry of interventions, and what its most consequential effects have been. I would have thought 2020 made this painfully clear. (That is, my grandmother did not die of coronavirus while multiple highly-safe & highly-effective vaccines sat on the shelf unused, simply because some bioethicist screwed up a p-value in a paper somewhere. If only!)

The actual day-to-day churn of publishing bioethics papers/research... Well, HHGttG said it best in describing humans in general:

Mostly Harmless.

The EMH is False - Specific Strong Evidence

I haven't heard that claim before. My understanding was that such a claim would be improbable or cherrypicking of some sort, as a priori risk-adjusted etc returns should be similar or identical but by deliberately narrowing your index, you do predictably lose the benefits of diversification. So all else equal (such as fees and accessibility of making the investment), you want the broadest possible index.

The EMH is False - Specific Strong Evidence

Since we're discussing EMH and VTSAX, this seems as good a place as any to add a recent anecdote:

Chatting with someone, investments came up and they asked me where I put mine. I said 100% VTSAX. Why? Because I think the EMH is as true as it needs to be, I don't understand why markets rise and fall when they do even when I think I'm predicting future events accurately (such as, say, coronavirus), and I don't think I can beat the stock markets, at least not without investing far more effort than I care to. They said they thought it wasn't that hard, and had (unlike me) sold all their stocks back in Feb 2020 or so when most everyone was still severely underestimating coronavirus, and beat the market drops. Very impressive, I said, but when had they bought back in? Oh, they hadn't yet. But... didn't that mean they missed out on the +20% net returns or so of 2020, and had to pay taxes? (VTSAX returned 21% for 2020, and 9.5% thus far for 2021.) Yes, they had missed out. Oops.

Trading is hard.

What's a good way to test basic machine learning code?

ALE is doubtless the Arcade Learning Environment (the standard Atari benchmark). I've never seen an 'ALE' in DRL discussions that refers to anything else.
