
Optimization Process's Shortform

by Optimization Process
14th Mar 2020

This is a special post for quick takes by Optimization Process. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

9 comments, sorted by top scoring

Optimization Process · 5y

Some wagers have the problem that their outcome correlates with the value of what's promised. For example, "I bet $90 against your $10 that the dollar will not undergo >1000% inflation in the next ten years": if hyperinflation does happen, the $90 I owe you gets paid in nearly worthless dollars, so the apparent odds of 9:1 don't equal the probability of hyperinflation at which you'd be indifferent to this bet.
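
A minimal sketch of the distortion, with made-up numbers (the post doesn't say how much value a dollar keeps after >1000% inflation; I assume 1/11 of its purchasing power, the bound the 1000% figure implies):

```python
# Indifference probability for a bet whose payout currency loses value in
# exactly the world where it pays out. Numbers below are illustrative.

face_win = 10.0          # dollars I win if there is no hyperinflation
face_loss = 90.0         # dollars I pay out if hyperinflation happens
retained_value = 1 / 11  # >1000% inflation: a dollar keeps <1/11 of its value

# Naive reading of the 9:1 odds: indifferent when p * 90 = (1 - p) * 10.
p_naive = face_win / (face_win + face_loss)
print(f"apparent indifference probability: {p_naive:.2f}")  # 0.10

# Real-terms reading: the $90 I'd pay is worth only 90/11 ~= $8.18, so
# indifference is where p * (90 / 11) = (1 - p) * 10.
real_loss = face_loss * retained_value
p_real = face_win / (face_win + real_loss)
print(f"real-terms indifference probability: {p_real:.2f}")  # 0.55
```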

For some (all?) of these problematic bets, you can mitigate the problem by making the money change hands in only one arm of the bet, reframing it as e.g. "For $90, I will sell you an IOU that pays out $100 in ten years if the dollar hasn't seen >1000% inflation." (Okay, you'll still need to tweak the numbers for time-discounting purposes, but it seems simpler now that we're conditioning on lack-of-hyperinflation.)
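
And a sketch of the one-armed IOU version; the two knobs the post leaves open (the probability of no hyperinflation, and the time discount) are invented here:

```python
# Fair price today of "an IOU that pays $100 in ten years if the dollar
# hasn't seen >1000% inflation". Conditioning on no-hyperinflation means
# the payout's real value is predictable, so a plain discount factor works.

payout = 100.0     # dollars, paid in ten years in the no-hyperinflation arm
p_no_hyper = 0.95  # assumed probability of no >1000% inflation (invented)
discount = 0.97    # assumed per-year discount factor (invented)
years = 10

fair_price = payout * p_no_hyper * discount ** years
print(f"fair price today: ${fair_price:.2f}")  # ~= $70
```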

Does this seem correct in the weak case? ("some")

Does this seem correct in the strong case? ("all")

Dagon · 5y

Clearly not all - the extreme version of this is betting on human extinction. It's hard to imagine the payout that has any value after that comes to pass. In some cases, you can find a conditional wager that works; in others, you can find a better resource or measure to wager in (one gram of gold, or one day's average wage as reported by X government). In many, though, there is just no wager possible, as the parties' utilities diverge too much from any resource available to denominate the wager in.

Optimization Process · 5y

> Clearly not all - the extreme version of this is betting on human extinction. It's hard to imagine the payout that has any value after that comes to pass.

Agreed that post-extinction payouts are essentially worthless -- but doesn't the contract "For $90, I will sell you an IOU that pays out $100 in one year if humans aren't extinct" avoid that problem?

Dagon · 5y

Small amounts and near-even-money ($90 for $100) are bad intuition pumps - this is in the range where other considerations dominate the outcome estimates. In fact, you probably can't find many people who would accept only ~11% interest ($100/$90 ≈ 1.11) on a one-year unsecured loan.

Donald Hobson · 5y

This is exactly equivalent to a bond that pays out in one year "unconditionally" - i.e., a loan with interest. (There are a few contrived scenarios where humans are extinct and money isn't worthless, depending on the definitions of those words. Would this bond pay out in a society of uploaded minds?)
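
A one-line check of that equivalence, with an invented extinction probability; the crux is that a dollar's value conditional on extinction is ~0:

```python
# Expected real value of a $100 bond, with and without an "only if humans
# survive" clause. Because dollars are worth ~0 post-extinction, the clause
# changes nothing.

p_extinct = 0.01                             # invented extinction probability
value_if_alive, value_if_extinct = 1.0, 0.0  # real value per dollar, by world

conditional_bond = (1 - p_extinct) * 100 * value_if_alive
unconditional_bond = conditional_bond + p_extinct * 100 * value_if_extinct
print(conditional_bond == unconditional_bond)  # True: the clause is free
```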

Optimization Process · 4d

Heuristic: distrust any claim that's much memetically fitter than its retraction would be. (Examples: "don't take your vitamins with {food}, because it messes with {nutrient} uptake"; "Minnesota is much more humid than prior years because of global-warming-induced corn sweat"; "sharks are older than trees"; "the Great Wall of China is visible from LEO with the naked eye")

papetoast · 4d

What do you mean by "retraction"? Do you just mean an opposite statement "sharks are older than trees" --> "sharks are not older than trees", or do you mean something more specific?

Assuming you just mean a general contrasting statement, my gut feeling is that 1. this heuristic is true for certain categories of statements, but generates wrong intuitions for other categories, and 2. when this heuristic works, it is rarely for memetic reasons - it's really just the signal-to-noise ratio of the subject.

Currently I am thinking about counterexamples among statements that roughly equate to a recommendation, like "Twitch is the best streaming platform" (I know it isn't a very fitting example of a memetic statement). That sounds plausibly true to me because 1. I know there is only a small number of streaming platforms, and 2. people who talk about this are likely to know what they are talking about.

Optimization Process · 4y

Consider AI-generated art (e.g. TWDNE, GPT-3 does Seinfeld, reverse captioning, Jukebox, AI Dungeon). Currently, it's at the "heh, that's kinda neat" stage; a median person might spend 5-30 minutes enjoying it before the novelty wears off.

(I'm about to speculate a lot, so I'll tag it with my domain knowledge level: I've dabbled in ML, I can build toy models and follow papers pretty well, but I've never done anything serious.)

Now, suppose that, in some limited domain, AI art gets good enough that normal people will happily consume large amounts of its output. It seems like this might cause a phase change where human-labeled training data becomes cheap and plentiful (including human labels for the model's output, a more valuable reward signal than e.g. a GAN's discriminator); this makes better training feasible, which makes the output better, which makes more people consume and rate the output, in a virtuous cycle that probably ends with a significant chunk of that domain getting automated.
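
To make the cycle concrete, here is a toy simulation (every number is invented, and it models only the economics of the loop, not any real training process):

```python
import random

# Toy simulation of the flywheel: better output draws a bigger audience,
# the audience's ratings are free training labels, and more labels make
# the next round of fine-tuning more effective.

quality = 0.30    # chance a random person enjoys a given output (invented)
audience = 1_000  # people consuming (and rating) the output each round

for round_num in range(6):
    # Each audience member gives a thumbs-up with probability `quality`;
    # these ratings are the human-labeled training data.
    labels = sum(random.random() < quality for _ in range(audience))
    # Fine-tuning effectiveness grows with label count, capped, and with
    # diminishing returns as quality approaches 1.
    quality += (1 - quality) * min(0.5, labels / 5_000)
    audience = int(audience * (0.8 + quality))  # churn vs. word of mouth
    print(f"round {round_num}: quality={quality:.2f}, audience={audience:,}")
```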

I expect that this, like all my most interesting ideas, is fundamentally flawed and will never work! I'd love to hear a Real ML Person's take on why, if there's an obvious reason.

Optimization Process · 4y

Trying to spin this into a plausible story: OpenAI trains Jukebox-2, and finds that, though it struggles with lyrics, it can produce instrumental pieces in certain genres that people enjoy about as much as human-produced music, for about $100 a track. Pandora notices that it would only need to play each track ~75k times ($100 / $0.00133-per-play) to break even with the royalties it wouldn't have to pay. Pandora leases the model from OpenAI, throws $100k at this experiment to produce 1k tracks in popular genres, plays each track 100k times, gets ~1M thumbs-[up/down]s (plus ~100M "no rating" reactions, for whatever those are worth), and fine-tunes the model using that reward signal to produce a new crop of tracks people will like slightly more.
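
Spot-checking the arithmetic in that story (the per-track cost and royalty rate are the comment's; the ~1% rating rate is what its 1M-ratings-per-100M-plays figure implies):

```python
# Break-even plays per track, and the scale of the free labeled data.

cost_per_track = 100.0      # $ to generate one Jukebox-2 track
royalty_per_play = 0.00133  # $ per play Pandora would otherwise owe

breakeven = cost_per_track / royalty_per_play
print(f"break-even plays per track: {breakeven:,.0f}")  # 75,188

budget, plays_per_track, rating_rate = 100_000.0, 100_000, 0.01
tracks = int(budget / cost_per_track)     # 1,000 tracks
total_plays = tracks * plays_per_track    # 100,000,000 plays
ratings = int(total_plays * rating_rate)  # ~1,000,000 thumbs
print(f"{tracks:,} tracks, {total_plays:,} plays, {ratings:,} ratings")
```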

Hmm. I'm not sure if this would work: sure, from one point of view, Pandora gets ~1M data points for free (on net), but from another reasonable point of view, each data point (a track) costs $100 -- definitely not cheaper than getting 100 ratings off Mechanical Turk, which is probably about as good a signal. This cycle might only work for less-expensive-to-synthesize art forms.
