Multidimensional voting is good. It's nice to be able to distinguish "you're wrong" from "shut up" (or "you're right" from "well said"); and it's nice to be part of a community which makes & encourages that distinction.
Has some government or random billionaire sought out Petrov's heirs and made sure none of them have to work again if they don't want to? It seems like an obviously sensible thing to do from a game-theoretic point of view.
Not at the scale you're suggesting, but relevant: https://futureoflife.org/recent-news/50000-award-to-stanislav-petrov-for-helping-avert-wwiii-but-us-denies-visa/
It seems like an obviously sensible thing to do from a game-theoretic point of view.
Hmm, seems highly contingent on how well-known the gift would be? And even if potential future Petrovs are vaguely aware that this happened to Petrov's heirs, it's not clear that it would be an important factor when they make key decisions; if anything, it would probably feel pretty speculative/distant as a possible positive consequence of doing the right thing. Especially if those future decisions are not directly analogous to Petrov's, such that it's not clear whether they fall in the same category. But yeah, mainly I just suspect this type of thing won't get enough attention to end up shifting important decisions in the future? Interesting idea, though -- upvoted.
. . . Is there a way a random punter could kick in, say, $100k towards Elon's bid? Either they end up spending $100k on shares valued at somewhere between $100k and $150k; or, more likely, they make the seizure of OpenAI $100k harder at no cost to themselves.
It's amazing how smart dumb people are, how dumb dumb people are, how dumb smart people are, and how smart smart people are.
There's an upside of conventional education which no-one on any side of any debate ever seems to bring up, but which was a major benefit (possibly the major benefit) of my post-primary studies. Namely: it lets students discover what they have a natural aptitude for (or lack thereof) relative to a representative peer group. The most valuable things I learned in my Engineering courses at university were:
- I'm pretty mediocre at Engineering, especially sub-subjects which aren't strictly Structural and/or Mechanical.
- In particular, I'm significantly worse than the average would-be Engineer at working with electronics, so I should abandon my plan to specialize in that field.
- Conversely, I'm significantly better than the average would-be Engineer at work involving code, money, probability, simulation and inference. (Learning this is a large part of why I eventually left Engineering for Data Science and Finance.)
Findings like this aren't perfect since they can give false negatives (a poorly-taught course can lead someone to conclude they don't like the subject when they actually don't like the teacher) and false positives (my initial showing at coding courses made me think I had a genius-l...
he doesn't really see any value in systems that are there purely to signal.
This seems like a mischaracterization of his view. I'm pretty sure he thinks it's wrong to subsidize such signaling mechanisms.
First off, signaling is relative, so if (say) everyone goes to high school and only the very best go to college, from a signaling perspective this is just as useful a signal as everyone going to college and only the very best going to grad school. Therefore we should not spend public dollars getting more people to go to college.
Second, in the signaling framework, there are no externalities to schooling kids, so there is no market failure for (say) government subsidies of college students' debt to correct.
Third, due to the first point, if any major market failure is present it's the tendency to get into signaling spirals, where the positive signal of (say) a high school education degrades over time, making everyone spend more years and dollars in college getting what was once the same signal as a high school diploma. More years of schooling here is a cost, which everyone would prefer to pay less of. So insofar as there's any case for government involvement, it ought to be a tax, not a subsidy.
I once saw an advert claiming that a pregnancy test was “over 99% accurate”. This inspired me to invent an only-slightly-worse pregnancy test, which is over 98% accurate. My invention is a rock with “NOT PREGNANT” scrawled on it: when applied to a randomly selected human being, it is right more than 98% of the time. It is also cheap, non-invasive, endlessly reusable, perfectly consistent, immediately effective and impossible to apply incorrectly; this massive improvement in cost and convenience is obviously worth the ~1% decrease in accuracy.
I think they meant over 99% when used on a non-randomly selected human who's bothering to take a pregnancy test. Your rock would run maybe 70% or so on that application.
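(For anyone who wants to check the arithmetic, here's a minimal sketch; the base rates below are assumptions of mine chosen to roughly match the figures in the two comments above, not data from either commenter.)

```python
# The accuracy of a rock that always reads "NOT PREGNANT" is just one minus the
# base rate of pregnancy among the people it's applied to. Base rates here are
# illustrative assumptions only.
def rock_accuracy(p_pregnant: float) -> float:
    """Accuracy of always predicting 'not pregnant', given the base rate of pregnancy."""
    return 1.0 - p_pregnant

print(rock_accuracy(0.02))  # ~2% of randomly selected humans pregnant -> ~98% accurate
print(rock_accuracy(0.30))  # ~30% of people actually taking a test -> ~70% accurate
```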
AI-users: please ask your AIs "what do you think the probability of this being an Eval is?" in the middle of your regular non-Eval use, and post (and/or summarize) their responses here.
Details of what you were using them for, which AI(s) they were, and what they say to variants like ". . . ignoring the fact that I asked that question", are appreciated but supererogatory.
(credit to Noosphere89 for pointing out eval-awareness false positives are worth looking into)
Earlier this week I attended a presentation on AI use in only-somewhat-techie corporate contexts, and found it fascinating how LW terminology has gone mainstream but the meanings haven't: the presenter talked a lot about 'existential risk' (which I slowly inferred meant 'AI-using competitors might put us out of business'), and 'alignment' (which he helpfully defined as 'getting various AI modalities - coding, search, image gen etc - to work together harmoniously').
I think we need to call it human extinction risk to make it clear. Or even abrupt extermination risk.
“Existential risk” here doesn't necessarily come from LessWrong. Using the phrase “existential risk” to refer to your company going out of business makes perfect sense, as it's literally a risk to your company's existence. Alignment is a trickier one, but even there the phrasing makes enough sense that it could plausibly not be LessWrong-inspired.
Anecdotally I agree with OP – I basically never heard companies use those phrases from ~2008-2023, and then around 2024 "alignment" and "existential risk" became a lot more commonly used.
I also think this is a fairly common pattern – someone invents jargon with a very specific meaning (e.g. "emotional labor"), that phrase gets used in a wider context, and people interpret the phrase based on their most direct interpretation of the literal words involved, which is sometimes pretty different from the original meaning.
Every now and then I've asked AIs to "name as many characters as you can from [moderately obscure game/story]". So far I've never had one fail to hallucinate extra characters, or fail to double down when I ask for more details about its creations.
I tried ChatGPT(-5.2-Thinking) on the original D&D.Sci challenge (which is tough, but not tricky) and it got almost a perfect answer, one point shy of the optimum.
I also tried ChatGPT on the second D&D.Sci challenge (which is tricky, but not tough), and it completely failed (albeit in a sensible and conservative manner). Repeated prompts of "You're missing something, please continue with the challenge" didn't help.
I once pointed out that METR's
Baselined tasks tend to be easier and Baselining seems (definitely slightly) biased towards making them look easier still, while Estimated tasks tend to be harder and Estimation seems (potentially greatly) biased towards making them look harder still: the combined effect would be to make progress gradients look artificially steep in analyses where Baselined and Estimated tasks both matter.
but found (to my surprise!) that removing all Estimated tasks didn't affect headline results, presumably/partly because
...most of the Estimate
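To illustrate the combined-bias claim, here's a toy simulation (my own construction, not METR's methodology or data; every number in it is made up): if shorter tasks get their lengths nudged down and longer tasks get theirs nudged up, the fitted log-length-versus-date slope comes out steeper than the true one.

```python
# Toy simulation: opposite biases on easy (Baselined) vs hard (Estimated) task lengths
# steepen an apparent progress gradient. Everything here is made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200
release_date = rng.uniform(0, 5, n)                # years since some reference point
true_minutes = np.exp(1.0 + 1.0 * release_date + rng.normal(0, 0.5, n))

easy = true_minutes < np.median(true_minutes)      # pretend the easier half got Baselined
observed_minutes = np.where(easy,
                            true_minutes * 0.9,    # Baselining: slight downward bias
                            true_minutes * 1.5)    # Estimation: larger upward bias

def slope(minutes):
    # least-squares slope of log(task length) against release date
    return np.polyfit(release_date, np.log(minutes), 1)[0]

print(f"true slope:     {slope(true_minutes):.2f}")
print(f"observed slope: {slope(observed_minutes):.2f}")  # steeper than the true slope
```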
So what practical things can people do now, to prep for not-worst-but-still-reasonably-bad-case cybersecurity implications of Mythos?
There's Yudkowsky's thing of saving anything stored electronically, which you don't want deleted, on an airgapped hard drive. And there's withdrawing ~$1000 from your bank account and stashing it in a book, so if your credit card stops working for a bit you'll have some leeway.
What else?
I'm a little surprised that no-one's publicly using pure AI on the currently-running D&D.Sci challenge: player count at time of writing stands at five humans and a centaur. This could be a really good (or at least really interesting) sanity check on how well this year's Agents handle novel inference problems.
"What important truth do you believe, which most people don't?"
"I don't think I possess any rare important truths."
I recently watched (the 1997 movie version of) Twelve Angry Men, and found it fascinating from a Bayesian / confusion-noticing perspective.
My (spoilery) notes (cw death, suspicion, violence etc):
What's the meta on communicating "hey random pseudonymous internet acquaintance / stranger, just fyi I've stumbled on something linking you to your irl identity, here's how I did it so you can ensure no-one else makes that connection"? I can never figure out how to not feel/seem/be creepy when doing this, but knowing and not telling seems like a dick move.
I should probably get into the habit of splitting my comments up. I keep making multiple assertions in a single response, which means when people add (dis)agreement votes I have no idea which part(s) they're (dis)agreeing with.
I'm fanatically in favor of creating new ways to test (& thereby develop) rationality in general and scientific capabilities in particular. However, any such resources would necessarily be dual-use, providing AI developers evals (or eval paradigms) which could help accelerate AI development along the axes on which it's currently most lacking. This seems like an obviously insane thing to worry about but I can't figure out why; soliciting other opinions.
- If using gradient descent, might be a bad starting point
  - To diagnose: Try running it with number of training rounds and/or learning rate set to/near zero, and seeing if it predicts an unsuitable value for everything.
  - To fix: Set the starting point to the average outcome in the training set (see the sketch after this list).
- If using gradient descent, might be numerical instability
  - To diagnose: Watch how individual and aggregate predictions change from round to round. If they flicker back and forth (with unornamented gradient descent) or swing ...
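A minimal sketch of the bad-starting-point fix mentioned above (my own illustration, with made-up data and a hand-rolled linear model rather than anyone's actual pipeline): initialise the intercept at the mean of the training targets instead of at zero, and the same budget of gradient-descent rounds gets much closer to a sensible fit.

```python
# Sketch of the "set the starting point to the average outcome" fix, using plain
# batch gradient descent on squared error. Data and hyperparameters are made up.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 100.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 1, 500)  # targets far from 0

def fit_linear(X, y, start_at_mean, lr=0.01, rounds=200):
    """Gradient descent on a linear model, with a choosable starting intercept."""
    w = np.zeros(X.shape[1])
    b = y.mean() if start_at_mean else 0.0   # the fix: start from the average outcome
    for _ in range(rounds):
        resid = X @ w + b - y
        w -= lr * (X.T @ resid) / len(y)
        b -= lr * resid.mean()
    return w, b

for start_at_mean in (False, True):
    w, b = fit_linear(X, y, start_at_mean)
    mse = np.mean((X @ w + b - y) ** 2)
    print(f"start_at_mean={start_at_mean}: MSE = {mse:.1f}")
```

(Setting rounds=0 here also reproduces the diagnostic: the start_at_mean=False version predicts 0 for everything, which is the "unsuitable value for everything" symptom.)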
I used to implicitly believe that when I have a new idea for a creative(/creative-adjacent) project, all else being equal, I should add it to the end of my to-do list (FIFO). I now explicitly believe the opposite: that the fresher an idea is, the sooner I should get started making it a reality (LIFO). This way:
The downsides are: