Comments

You have been a bad Bing:

Just in the last few weeks, after the model change, Copilot has been adding a link to Masterpiece on LessWrong to the end of random messages and denying that it did so when asked about it. Just kinda creepy. I worry about GPT-4 sometimes, even though I know other people have access to it.

gwern 1d Ω276025

Warning for anyone who has ever interacted with "robosucka" or been solicited for a new podcast series in the past few years: https://www.tumblr.com/rationalists-out-of-context/744970106867744768/heads-up-to-anyone-whos-spoken-to-this-person-i

gwern 4d -1 -11

I think all of them...that suggests its bad

They don't. As I already explained, these examples are bad because the outcomes are not all bad, in addition to not reflecting the same causal patterns or being driven by adverse selection. The only consistent thing here is a Marxian paranoia that everyone else is naive and being ripped off in trades, which is a common cognitive bias in denying gains to trade. The subway car is simply an equilibrium. You cannot tell if 'you' are better off or worse off in any car, so it is not the case that 'the deal is bad'. The room and food examples actually imply the best outcome happened, as the room and food went to those who valued them more and so took or ate them sooner (it's not about correlation of preferences, it's about intensity); the deal was good there. And the Laffy Taffy example explicitly doesn't involve anything like that but is pure chance (so it can't involve "other people's maps" or 'adverse selection').

But the framing here is completely wrong...

But OK, let's leave aside the title and its attempt to imply anything about 99% of trades out there, or the basically Marxist take on all exchanges being exploitation and the obsession with showing how you are being tricked or ripped off. The examples are still very bad and confused! Like, these examples are not even all about adverse selection, and several of them are just wrong in portraying the hypothetical as a bad thing.

The first one, about subways, isn't even about adverse selection to begin with. A reminder of what "adverse selection" is:

In economics, insurance, and risk management, adverse selection is a market situation where buyers and sellers have different information. The result is the unequal distribution of benefits to both parties, with the party having the key information benefiting more.

In the subway example, there is no difference in information: it's about how governments do rationing and make markets clear by letting the goods degrade until the utility is destroyed, because of a lack of appetite for setting clearing prices like surge prices or fare enforcement; that's not 'adverse selection' at all, any more than freeways reaching an equilibrium of misery where they are so slow that people avoid them is 'adverse selection'. (If you think it's 'adverse selection', explain what "buyers and sellers have different information" means in the context of lack of congestion pricing in transport...?)

#3 and #4 are not adverse selection either (still no difference in information), and are fundamentally wrong in portraying it as a bad outcome: the outcomes are not bad, but neutral or good - OP gives no reason to think that the outcomes would have been better if 'you' had gotten the good room or to eat whichever dish. (In fact, presumptively, those are the desirable outcomes: if 'you' cared so much, why did you leave it up to Bob; and why did you not eat the dish yourself, but someone hungrier did?)

#6 doesn't demonstrate anything because no trade happened, so it can't show anything about your surplus from trades that do happen.

And the Wall Street efficient market examples are true (finally, an actual adverse selection example!), but relevant to vanishingly few people who are also extremely aware of it and spend a lot of effort dealing with it, generally successfully; and people who do auctions more than occasionally generally do not have any problem with winner's curses, and auctions are widely & intensively used in many fields by experts. And so on.
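For anyone who hasn't seen the winner's curse spelled out, here is a minimal sketch, assuming a common-value first-price auction in which every bidder gets an unbiased noisy estimate of the true value and naively bids that estimate. The function name, the Gaussian noise, and all parameter values are my own illustrative assumptions, not anything from OP's post.

```python
import random


def average_naive_winner_profit(n_bidders=5, true_value=100.0,
                                noise_sd=10.0, n_auctions=10_000, seed=0):
    """Average profit of the winner when every bidder bids their raw estimate.

    Hypothetical toy model: the item is worth `true_value` to everyone, each
    bidder sees true_value + Gaussian noise, and the highest bid wins and pays
    its own bid (first-price).
    """
    rng = random.Random(seed)
    total_profit = 0.0
    for _ in range(n_auctions):
        estimates = [rng.gauss(true_value, noise_sd) for _ in range(n_bidders)]
        winning_bid = max(estimates)  # naive bidding: bid == estimate
        total_profit += true_value - winning_bid
    return total_profit / n_auctions


if __name__ == "__main__":
    # With several bidders, the winner is whoever overestimated the most.
    print(average_naive_winner_profit(n_bidders=5))
    # With a single bidder there is no selection on the estimate.
    print(average_naive_winner_profit(n_bidders=1))
```

Each individual estimate is unbiased, but the maximum of several is not, so the naive winner loses money on average; shading bids downward to account for that is the standard correction, and it is what people who do auctions more than occasionally already do.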

For buying milk you have multiple samples as to good price. Even if any is contrived, the bulk still capture something real

No, the bulk don't, because I buy milk a lot more often than I go on Wall Street and try to get cute with limit orders or manufacturing options or straddles on speculative merger/takeover targets or sign up to MoviePass or park while ignorant in NYC. The bulk of my life is buying milk, not speculating on Widgets Inc. And if I did those enough times to come anywhere near the number of times I've bought milk, so that 'the bulk' could potentially be any of those things, I would also not be doing it nearly as badly as OP postulates I would. (Because I would be, say, a market-maker like Jane Street, which makes a lot of money off doing that sort of thing.)

Counterpoint: actually, you're wrong, because most trades I make IRL leave me with a lot of consumer surplus, and in reality, conditional on me making a trade, it was pretty good.

The fact that you have to reach for exotic scenarios, involving government failures like subways, or limit orders in highly efficient markets for financial speculation on liquid but volatile assets (not exactly an everyday 'trade', I hope you'll concede), or contests or auctions entered by naive non-auction-goers who don't even know to account for the winner's curse, or getting stuff for free, should make you rethink what you are claiming about "most trades you make aren't all that great".

If your point was true, it should be as simple as "you go into the grocery store to buy a gallon of milk. You are filled with deep remorse and shame when you get home and look at the receipt and think about how much you spent in gas to boot. You look in your freezer for comfort. You are filled with deep remorse and shame when you are reminded how much you paid for the ice cream. With little choice, you pull out a spoon you bought years ago - and are filled with deep remorse and shame &etc &etc". You wouldn't need to invoke all these weird hypotheticals like "you ask your friend Drew to sell you under the table a cheap limited share of his cow's monthly milk production in ice cream tickets through your company redeemable in NYC but only in an office which can be reached by an express subway (which runs on alternate Tuesdays)"...

The effect of structural variants like that would be bounded by the difference between SNP heritability and full heritability. That's an easy measurement. (And if it was really responsible for much variance, then it ought to show up as a variance component with whole-genomes from long-read sequencing, I would think.) What evidence is there that transposon counts really matter much in terms of total variance phenome-wide?
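Written out, the bound is roughly the following. The symbols are my own shorthand for this sketch: $h^2_{\text{total}}$ for full (e.g. family-based) heritability, $h^2_{\text{SNP}}$ for SNP heritability, and $h^2_{\text{SV}}$ for the share of variance due to structural variants that SNPs do not already tag.

```latex
\[
  h^2_{\text{SV}} \;\le\; h^2_{\text{total}} - h^2_{\text{SNP}}
\]
```

Any structural-variant effect that is tagged by nearby SNPs through linkage disequilibrium is already counted inside $h^2_{\text{SNP}}$, so the gap only bounds the untagged part, and that gap has to be shared with every other source of "missing" heritability as well.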

You're at token i in a non-final layer. Which token's output are you optimizing for? i+1?

I already addressed this point. If I'm in a non-final layer then I can be optimizing for arbitrary tokens within the context window, sure, and 'effectively' predicting intermediate tokens because that is the 'dominant' effect at that location... insofar as it is instrumentally useful for predicting the final token using the final layer. Because that is where all the gradients flow from, and why the dog wags the tail.

I don't think I am. ("conditioned future informativity" - informativity for what? ...the next/last token, which is the only thing taken into account by a causal loss which masks out the rest - that's the definition of it! everything else like packing or doing all the sub-sequences is an optimization and doesn't change the objective.) But feel free to expand on it and explain how the tail wags the dog in causal/decoder Transformers.
