gwern's Comments

Are veterans more self-disciplined than non-veterans?

Draft lotteries provide randomized experiments. The most famous one is the Vietnam draft analysis by Angrist 1990, which finds a 15% penalty to lifetime income caused by being drafted. There's also little evidence of any benefit in Low-Aptitude Men In The Military: Who Profits, Who Pays? (Laurence & Ramberger 1991), which covers Project 100,000 among others; these programs drafted individuals who would not otherwise have been drafted, either as a matter of policy or, in the ASVAB Misnorming, by unrealized accident. (As Fredrik deBoer likes to say, if there's any way it can possibly be a selection effect, then yeah, it's a selection effect.)

SARS-CoV-2 pool-testing algorithm puzzle

It describes a lot of things, and I'm not sure which ones would be the right answer here. If anyone wants to read through and describe which one seems to best fit the corona problem, that'd be a better answer than just providing an enormous difficult-to-read monograph.

Evaluability (And Cheap Holiday Shopping)

No, one shouldn't. Playing a game of chance once or a thousand times does not influence the outcome of the next round (aka the gambler's fallacy). If a bet is a good idea one time, it's a good idea a thousand times. And if a bet is a good idea a thousand times, it is a good idea the first time. How could you consider betting a thousand times to be a good idea, if you think each individual bet is a bad idea?

Gambler's ruin. The bets are the same, but you are not. If bets are rare enough or large enough, they do not justify the simplifying assumption of ergodicity ('adding up to normality' in this case) and greedy expected-value maximization is not optimal.
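A minimal sketch of the point, under assumptions that are mine and not from the thread: an even-money bet won with probability 0.6, a bankroll starting at 1, and a bettor who stakes a fixed fraction of the bankroll each round. Every individual bet has positive expected value, yet staking everything (the greedy expected-value choice) ends in ruin at the first loss, while a small fraction compounds.

```python
import math
import random

def simulate(fraction, p_win=0.6, rounds=1000, bankroll=1.0, seed=0):
    """Stake a fixed fraction of the current bankroll on an even-money bet each round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        stake = fraction * bankroll
        if rng.random() < p_win:
            bankroll += stake
        else:
            bankroll -= stake
        if bankroll <= 0:
            return 0.0  # ruined: no further bets are possible
    return bankroll

def expected_log_growth(fraction, p_win=0.6):
    """Expected log-growth per round; -inf when a single loss wipes out the bankroll."""
    if fraction >= 1.0:
        return -math.inf
    return p_win * math.log(1 + fraction) + (1 - p_win) * math.log(1 - fraction)

# Kelly fraction for an even-money bet is p - q (about 0.2 here).
kelly = 2 * 0.6 - 1

all_in = simulate(1.0)      # maximizes expected value per bet
careful = simulate(kelly)   # maximizes long-run growth rate
```

The bet is identical in every round; what changes is the bettor's bankroll. Under these assumptions, even staking half the bankroll has a negative long-run growth rate (`expected_log_growth(0.5) < 0`), although each such bet is favorable in expectation.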

What would be the consequences of commoditizing AI?

I'm not clear here what you mean by "AI". Full AGI, or just AI like the present (various specialized systems, usually but not always with subhuman capabilities)? If the former, it's hard to speculate.

The latter, however, is already being commoditized and I include it as an example. FANG practically compete to see who can give away more source code & datasets & research and open-source everything, to the extent that when someone announces a release of a new reinforcement learning framework library I don't even bother announcing it on /r/reinforcementlearning unless there's something special. FANG is not threatened by open-sourcing everything because all the benefits come from the integration into their ecosystem as part of your smartphone, running on the ASICs built into your smartphone, with access to all the data in your account and the private data in all the FANG datacenters, which power your upfront payments and then advertising eyeballs later. Someone who invents a better image classifier can not in any meaningful way threaten Google's Android browser advertising revenues, but does help make Android image apps somewhat better and increases Google revenue indirectly, and so on. Thus, Google can release not just new research but the actual models themselves like EfficientNet, and it's no problem. You can provide a free competitor. But Google provides something which is "better than free".

As far as 'tool AI' or 'AI services' go, this seems to be the overall theme. Most tasks themselves are not too valuable. The question is what can you build on top or around it, and what does it unlock? ( is interesting.)

What information, apart from the connectome, is necessary to simulate a brain?

One question for WBE is whether we really need to upload a specific brain or if it is enough to use a generic brain as a template and a kind of informative prior to greatly speed up alternative AI techniques (an emerging paradigm I'm calling "brain imitation learning" for now).

The correct response to uncertainty is *not* half-speed

Happens all the time in decision theory & reinforcement learning: the average of many good plans is often a bad plan, and a bad plan followed to the end is often both more rewarding & informative than switching at every timestep between many good plans. Any kind of multi-modality or need for extended plans (eg due to upfront costs/investments) will do it, and exploration is quite difficult - just taking the argmax or adding some randomness to action choices is not nearly enough, you need "deep exploration" (as Osband likes to call it) to follow a specific hypothesis to its limit. This is why you have things like 'posterior sampling' (generalization of Thompson sampling), where you randomly pick from your posterior of world-states and then follow the optimal strategy assuming that particular world state. (I cover this a bit in two of my recent essays, on startups & socks.)
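For the simplest case of posterior sampling, here is a sketch of Thompson sampling on a Bernoulli bandit. The arm probabilities, step count, and function name are illustrative assumptions. A bandit has no extended plans, so this shows only the "sample a world-state, then act optimally for it" step, not deep exploration over long horizons.

```python
import random

def thompson_bandit(true_probs, steps=2000, seed=0):
    """Thompson sampling with a Beta(1, 1) prior on each arm's success rate."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # 1 + observed successes per arm
    beta = [1] * n_arms   # 1 + observed failures per arm
    pulls = [0] * n_arms
    for _ in range(steps):
        # Sample one hypothesis about the world from the posterior...
        sampled = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        # ...and act optimally as if that hypothesis were true.
        arm = max(range(n_arms), key=lambda i: sampled[i])
        if rng.random() < true_probs[arm]:
            alpha[arm] += 1
        else:
            beta[arm] += 1
        pulls[arm] += 1
    return pulls

# Hypothetical arms; the agent never sees these probabilities directly.
pulls = thompson_bandit([0.2, 0.5, 0.8])
```

Arms whose posterior still allows them to be the best keep getting sampled, so exploration is driven by the remaining uncertainty and not by noise added to the action choice.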

[AN #91]: Concepts, implementations, problems, and a benchmark for impact measurement

On the other hand, OpenAI Five (AN #13) also has many, many subtasks, that in theory should interfere with each other, and it still seems to train well.

True, but OA5 is inherently a different setup than ALE. Catastrophic forgetting is at least partially offset by the play against historical checkpoints, which doesn't have an equivalent in your standard ALE; the replay buffer typically turns over so old experiences disappear, and there are no adversarial dynamics or AlphaStar-style population of agents which can exploit forgotten areas of state-space. Since Rainbow is an off-policy DQN, I think you could try saving old checkpoints and periodically spending a few episodes running old checkpoints and adding the experience samples to the replay buffer, but that might not be enough.
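A toy sketch of the checkpoint idea, to make the data flow concrete. Everything here is a hypothetical stand-in: `ToyEnv` replaces an ALE game, `make_policy` replaces a saved network, and the DQN update itself is omitted. It only shows old-checkpoint episodes being fed into a FIFO replay buffer alongside the current policy's experience.

```python
import random
from collections import deque

class ToyEnv:
    """Hypothetical stand-in for a game: a 1-D walk ending at +/-3 or after 20 steps."""
    def reset(self):
        self.pos, self.t = 0, 0
        return self.pos

    def step(self, action):  # action is -1 or +1
        self.pos += action
        self.t += 1
        done = abs(self.pos) >= 3 or self.t >= 20
        reward = 1.0 if self.pos >= 3 else 0.0
        return self.pos, reward, done

def rollout(env, policy, buffer):
    """Run one episode with `policy` and append its transitions to `buffer`."""
    state, done = env.reset(), False
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state

def make_policy(p_right, rng):
    """Hypothetical 'checkpoint': a fixed stochastic policy that ignores the state."""
    return lambda state: 1 if rng.random() < p_right else -1

rng = random.Random(0)
buffer = deque(maxlen=500)  # FIFO replay buffer: old experience falls off the end
checkpoints = []            # saved snapshots of earlier policies

for iteration in range(50):
    current = make_policy(0.5 + 0.01 * iteration, rng)  # stand-in for a learner that changes
    if iteration % 10 == 0:
        checkpoints.append(current)  # periodically save a checkpoint
    rollout(ToyEnv(), current, buffer)
    if iteration % 5 == 4:
        # Occasionally spend an episode on an old checkpoint so that its
        # state distribution is represented in the buffer again.
        rollout(ToyEnv(), rng.choice(checkpoints), buffer)
    # A real agent would sample minibatches from `buffer` and do an off-policy update here.
```

Mixing in transitions generated by other policies is only legitimate because the learner is off-policy; the ratio of old to new episodes (one in five here) is an arbitrary choice.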

There's also the batch size. The OA5 batch size was ridiculously large. Given all of the stochasticity in a DoTA2 game & additional exploration, that covers an awful lot of possible trajectories.

In Gmail, everything after

They also present some fine-grained experiments which show that for a typical agent, training on specific contexts adversely affects performance on other contexts that are qualitatively different.

Is cut off by default due to length.

Welcome to LessWrong!

If you're interested in LW2's typography, you should take a look at GreaterWrong, which offers a different and much more old-school non-JS take on LW2, with a number of features like customizable CSS themes. There is a second project, Read The, which focuses on a pure non-interactive typography-heavy presentation of a set of highly-influential LW1 posts. Finally, there's been cross-pollination between LW2/GW/RTS and my own website (description of design).
