Fearing that this would be adequate with a large influx of low-quality users
Clarifying: this is a typo and should be inadequate, right?
It seems unlikely that AI labs are going to comply with this petition. Supposing they don't, does this petition help, hurt, or have no impact on AI safety, compared to the counterfactual where it doesn't exist?
All possibilities seem plausible to me. Maybe it's ignored so it just doesn't matter. Maybe it burns political capital or establishes a norm of "everyone ignores those silly AI safety people and nothing bad happens". Maybe it raises awareness and does important things for building the AI safety coalition.
Modeling social reality is always hard, but has there been much analysis of what messaging one ought to use here, separate from the question of what policies one ought to want?
Not if the people paying in sex are poor! Imagine that 10% of housing is reserved for the poorest people in society as part of some government program that houses them for free, and the other 90% is rented for money at a rate of £500/month (this is a toy model where all housing is identical, no mansions here). One day the government ends the housing program and privatizes the units; they all go to landlords who start charging money. Is the new rate for housing lower, higher, or the same?
The old £500/month rate was the equilibrium that fell out of matching the richest 90% of people with 90% of the housing stock. The new equilibrium has 10% more people and 10% more housing to work with, but the added people are poorer than average; supply and demand tells us that prices will go down to reflect the average consumer having less buying power.
If you think of paying the rent with sex as "getting housing for free" and "government bans sex for rent" as "ending the free housing program", this model applies to both cases. If the people paying the rent in sex are of exactly average wealth, the new equilibrium might also be £500/month, but if they are much poorer than average it should be lower (and, interestingly, if they're richer than average it would end up higher).
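A minimal market-clearing sketch of this toy model (all numbers hypothetical): with identical units and one unit per person, the price that fills every unit is set by the marginal renter, i.e. the lowest willingness to pay among the bidders who get housed.

```python
def clearing_price(willingness_to_pay, num_units):
    """Highest price at which all num_units still find a tenant:
    the num_units-th highest willingness to pay."""
    bids = sorted(willingness_to_pay, reverse=True)
    return bids[num_units - 1]

# Hypothetical population: the richest 90 people can pay 500 or more,
# the poorest 10 only 200-290.
rich = [500 + 10 * i for i in range(90)]
poor = [200 + 10 * i for i in range(10)]

# Before: the poorest 10% are housed for free, so only the rich 90
# compete for the 90 market-rate units.
before = clearing_price(rich, 90)           # 500

# After privatization: all 100 people compete for all 100 units.
after = clearing_price(rich + poor, 100)    # 200

print(before, after)  # prints "500 200"
```

In this crude model the marginal renter sets the price, so folding poorer consumers into the market pulls the clearing price down, as the argument above claims.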
Good point. I feel like it shouldn't happen much but I agree the simple economic model predicts it should. I could resolve it within the model as some kind of market friction argument (finding someone to sell sex to is not trivial, the landlord makes it easier to go into prostitution by providing himself as a "steady employer"), but I think my real intuition is that this is a place where homo economicus breaks down so I shouldn't be trying to apply simple economic models.
Also, even if my initial argument does work, this is basically a novel form of rent control, so the standard arguments against rent control should apply (supply isn't completely inelastic, constraining demand will reduce future supply, which we don't want).
Nitpicking the landlord case: Banning sex for rent drives down prices.
Suppose the market rate for a room is £500 or X units of sex. Most people pay in money, but some are desperate and lack £500, so they pay in sex. One day the government bans paying in sex. This is an artificial constraint on demand: some people who would have paid at the old sex rate are being prevented from doing so. When you constrain demand for something with relatively inelastic supply, prices fall. Specifically, the rooms that would have been rented for sex sit empty until their prices are lowered; the new market rate is £490.
Some people are still worse off because of this (a lot of the desperate people don't have £490 to pay either) but there are possible values where the utilitarian calculus works out net positive (plenty of non-desperate people still benefit from lower rent). One can imagine the government in a productive role as a renter's negotiating partner: "Gosh Mr. Landlord, I'd love to pay in sex but that's illegal, best I can do is £490."
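One way to make the £490 figure concrete is a toy clearing-price calculation (all numbers hypothetical): with identical rooms, the price that fills every room is the willingness to pay of the marginal money bidder, and banning payment in sex forces more rooms onto the money market.

```python
def clearing_price(willingness_to_pay, num_units):
    # num_units-th highest bid: the price at which every unit still rents
    return sorted(willingness_to_pay, reverse=True)[num_units - 1]

# Hypothetical money bids: 98 people can pay the old rate of 500 or more;
# the next bidders down the demand curve can only pay 495, 490, 485.
bids = [500 + 5 * i for i in range(98)] + [495, 490, 485]

# Before the ban, 2 of the 100 rooms are paid for in sex, so only 98
# rooms need money tenants.
old_rate = clearing_price(bids, 98)    # 500

# After the ban, all 100 rooms need money tenants, so landlords must
# cut prices until the 100th-highest bidder is willing to rent.
new_rate = clearing_price(bids, 100)   # 490

print(old_rate, new_rate)  # prints "500 490"
```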
we know how to specify rewards for... "A human approved this output"; we don't know how to specify rewards for "Actually good alignment research".
Can't these be the same thing? If we have humans who can identify actually good alignment research, we can sit them down in the RLHF booth and have the AI try to figure out how to make them happy.
Now obviously a sufficiently clever AI will infer the existence of the RLHF booth and start hacking the human in order to escape its box, which would be bad for alignment research. But it's looking increasingly plausible that e.g. GPT-6 will be smart enough to provide actually good mathematical research without being smart enough to take over the world (that doesn't happen until GPT-8). So why not alignment research?
To break the comparison I think you need to posit either that alignment research is way harder than math research (which Eli understands to be Eliezer's view), such that anything smart enough to do it is also smart enough to hack a human, or, I suppose, that we don't have humans who can identify actually good alignment research.
If you believe strongly enough in the Great Man theory of startups then it's actually working as intended. If startups are more about selling the founder than the product, if the pitch is "I am the kind of guy who can do cool business stuff" rather than "Look at this cool stuff I made", then penalizing founders who don't pre-truth is correctly downranking them for being some kind of chump. A better founder would have figured out that he was supposed to pre-truth, and it is significant information about his competence that he did not.
Realistically it is surely at least a little bit about the product itself, and honest founders must be "unfairly" losing points on the perceived merits of their product, but one could argue that identifying people savvy enough to play the game creates more value than is lost by underestimating the merits of honest product pitches.
Depending on exactly where the boundaries of the pre-truth game are, I think I could argue no one is being deceived (I mean realistically there will be at least a couple naive investors who think founders are speaking literal truth, but there could be few enough that hoodwinking them isn't the point).
When founders present a slide deck full of pre-truths about how great their product is, that slide deck is aimed solely at investors. The founder usually doesn't publish the slide deck, and if they did they wouldn't expect Joe Average to care much. The purpose of the pre-truths isn't to make anyone believe that their product is great (because all the investors know that this is an audition for lying, so none of them are going to take the claims literally), rather it is to demonstrate to investors that the founder is good at exaggerating the greatness of their product. This establishes that a few years later when they go to market, they will be good at telling different lies to regulators, customers, etc.
The pre-truth game could be a trial run for deceiving people, rather than itself being deceptive.
Here is a possible defense of pre-truth. I'm not sure if I believe it, but it seems like one of several theories that fit the available evidence.
Willingness to lie is a generally useful business skill. Businesses that lie to regulators will spend less time on regulatory compliance, businesses that lie to customers will get more sales, etc. The optimal amount of lying is not zero.
The purpose of the pre-truth game is to allow investors to assess the founder's skill at lying, because you wouldn't want to fund some chump who can't or won't lie to regulators. Think of it as an initiation ritual: if you run a criminal gang it might be useful to make sure all your new members are able to kill a man, and if you run a venture capital firm it might be useful to make sure all the businessmen you invest in are skilled liars. The process generates value in the same way as any other skill-assessing job interview. There's a conflict which features lying, but it's a coalition of founders and investors against regulators and customers.
So why keep the game secret? Well, it would probably be bad for the startup scene if it became widely known that everyone's hoping startups will lie to regulators and customers. Also, keeping the game secret makes "figure out what game we're playing" part of the interview process, and you'd probably prefer to invest in people savvy enough to figure that out on their own.
One can cross-reference the moderation log with "Deleted by alyssavance, Today at 8:19 AM" to determine who made any particular deleted comment. Since this information is already public, does it make sense to preserve the information directly on the comment, something like "[comment by Czynski deleted]"?