cross-posted from niplav.site

This text looks at the accuracy of forecasts in relation to the time between forecast and resolution, and asks three questions: First; is the accuracy higher between forecasts; Second; is the accuracy higher between questions; Third; is the accuracy higher within questions? These questions are analyzed using data from PredictionBook and Metaculus, the answers turn out to be yes, unclear and yes for Metaculus data; and no, no and yes for PredictionBook data. Possible reasons are discussed. I also try to find out how far humans can look into the future, leading to various different results.

Range and Forecasting Accuracy

Above all, don’t ask what to believe—ask what to anticipate. Every question of belief should flow from a question of anticipation, and that question of anticipation should be the center of the inquiry. Every guess of belief should begin by flowing

...


papetoast1m10

(My native language is Chinese.) I haven't started reading, but I am finding the abstract/tldr impossible to understand. "Is the accuracy higher between forecasts" reads like a nonsensical sentence. My best guess after reading one extra paragraph by click through is that the question is actually "are forecasts predicting the near future more accurate than those predicting the distant future" but I don't feel like it is possible to decode just based on the abstract.

How to Better Report Sparse Autoencoder Performance

J Bostock

TL;DR

When presenting data from SAEs, try plotting $1/L_0$ against $1 - \text{Recovered Loss}$ and fitting a Hill curve.
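As a rough illustration of the suggestion above, here is a minimal sketch of fitting a Hill curve to sparsity ($1/L_0$) versus unrecovered loss ($1 - \text{Recovered Loss}$). It assumes the standard two-parameter Hill form $y = x^n / (k^n + x^n)$, which may not be the exact parameterization used in the post, and it fits by linear regression in logit space rather than by nonlinear least squares. The function names and the data points are hypothetical and synthetic, not taken from any paper.

```python
import math

def hill(x, k, n):
    """Standard Hill function: rises from 0 to 1 and equals 0.5 at x == k."""
    return x**n / (k**n + x**n)

def fit_hill(xs, ys):
    """Fit k and n using the linearization log(y / (1 - y)) = n*log(x) - n*log(k).

    Requires every x > 0 and every y strictly between 0 and 1.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y / (1.0 - y)) for y in ys]
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    n = sxy / sxx                    # slope of the regression line is the Hill exponent
    intercept = mean_v - n * mean_u  # intercept equals -n * log(k)
    k = math.exp(-intercept / n)
    return k, n

# Synthetic example (assumed values, for illustration only).
l0_values = [10, 20, 40, 80, 160]
sparsity = [1.0 / l0 for l0 in l0_values]                 # x-axis: 1 / L0
unrecovered = [hill(s, 0.02, 1.5) for s in sparsity]      # y-axis: 1 - recovered loss

k_fit, n_fit = fit_hill(sparsity, unrecovered)
print(f"k = {k_fit:.4f}, n = {n_fit:.4f}")
```

In this form, $k$ is the sparsity at which half the loss is unrecovered, so a single fitted number per SAE variant can be compared across runs. With noisy real data, a nonlinear least-squares fit would likely be more robust than the logit-space regression used here.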

Long

Sparse autoencoders are hot, people are experimenting. The typical graph for SAE experimentation looks something like this. I'm using borrowed data here to better illustrate my point, but I have also noticed this pattern in my own data:

*Data taken with permission from DeepMind's Gated AutoEncoder paper (https://arxiv.org/pdf/2404.16014), Tables 3 and 4: Standard SAE and Gated SAE performance, Gemma-7B Residual Layer 20, 1024 tokens, Pareto-optimal SAEs only.*

This shows quantitative performance adequately in this case. However, it gets a bit messy when there are 5-6 plots very close to each other (e.g. in an ablation study), and it doesn't give an easily-interpreted (heh) value to quantify Pareto improvements.

I've found it much more helpful to plot $\text{Sparsity} = 1/L_0$ on the $x$-axis, and "performance...


leogao13m20

I've found the MSE-L0 (or downstream loss-L0) frontier plot to be much easier to interpret when both axes are in log space.

How to get nerds fascinated about mysterious chronic illness research?

riceissa, romeostevensit

Like many nerdy people, back when I was healthy, I was interested in subjects like math, programming, and philosophy. But 5 years ago I got sick with a viral illness and never recovered. For the last couple of years I've been spending most of my now-limited brainpower trying to figure out how I can get better.

I occasionally wonder why more people aren't interested in figuring out illnesses such as my own. Mysterious chronic illness research has a lot of the qualities of an interesting puzzle:

There is a phenomenon with many confusing properties (e.g. the specific symptoms people get, why certain treatments work for some people but not others, why some people achieve temporary or permanent spontaneous remission), exactly like classic scientific mysteries.
Social reward for solving it: Many

...


lukehmiles2h10

Oh that's a lot of evidence against a worm probably. I am out of ideas. Good luck. I hope you can figure it out

What do coherence arguments actually prove about agentic behavior?

sunwillrise, johnswentworth

(edit: discussions in the comments section have led me to realize there have been several conversations on LessWrong related to this topic that I did not mention in my original question post.

Since ensuring their visibility is important, I am listing them here: Rohin Shah has explained how consequentialist agents optimizing for universe-histories rather than world-states can display any external behavior whatsoever, Steven Byrnes has explored corrigibility in the framework of consequentialism by arguing powerful agents will optimize for future world-states at least to some extent, Said Achmiz has explained what incomplete preferences look like (1, 2, 3), EJT has formally defined preferential gaps and argued incomplete preferences can be an alignment strategy, John Wentworth has analyzed incomplete preferences through the lens of subagents but has then argued...


Rohin Shah2h2010

"nevertheless, many important and influential people in the AI safety community have mistakenly and repeatedly promoted the idea that there are such theorems."

I responded on the EA Forum version, and my understanding was written up in this comment.

TL;DR: EJT and I both agree that the "mistake" EJT is talking about is that when providing an informal English description of various theorems, the important and influential people did not state all the antecedents of the theorems.

Unlike EJT, I think this is totally fine as a discourse norm, and should not be con... (read more)

20Answer by johnswentworth8h

This is going to be a somewhat-scattered summary of my own current understanding. My understanding of this question has evolved over time, and is therefore likely to continue to evolve over time.

Classic Theorems

First, there's all the classic coherence theorems - think Complete Class or Savage or Dutch books or any of the other arguments you'd find in the Stanford Encyclopedia of Philosophy. The general pattern of these is:

* Assume some arguably-intuitively-reasonable properties of an agent's decisions (think e.g. lack of circular preferences).
* Show that these imply that the agent's decisions maximize some expected utility function.

I would group objections to this sort of theorem into three broad classes:

1. Argue that some of the arguably-intuitively-reasonable properties are not actually necessary for powerful agents.
2. Be confused about something, and accidentally argue against something which is either not really what the theorem says or assumes a particular way of applying the theorem which is not the only way of applying the theorem.
   a. Argue that all systems can be modeled as expected utility maximizers (i.e. just pick a utility function which is maximized by whatever the system in fact does) and therefore the theorems don't say anything useful.

For an old answer to (2.a), see the discussion under my mini-essay comment on Coherent Decisions Imply Consistent Utilities. (We'll also talk about (2.a) some more below.) Other than that particularly common confusion, there's a whole variety of other confusions; a few common types include:

* Only pay attention to the VNM theorem, which is relatively incomplete as coherence theorems go.
* Attempt to rely on some notion of preferences which is not revealed preference.
* Lose track of which things the theorems say an agent has utility and/or uncertainty over, i.e. what the inputs to the utility and/or probability functions are.

How To Talk About "Powerful Agents" Directly

While I think EJT's argumen

2johnswentworth9h

If you're going to link Why Subagents?, you should probably also link Why Not Subagents?.

1sunwillrise9h

It's linked in the edit at the top of my post.

AI #66: Oh to Be Less Online

Zvi

Tomorrow I will fly out to San Francisco, to spend Friday through Monday at the LessOnline conference at Lighthaven in Berkeley. If you are there, by all means say hello. If you are in the Bay generally and want to otherwise meet, especially on Monday, let me know that too and I will see if I have time to make that happen.

Even without that hiccup, it continues to be a game of playing catch-up. Progress is being made, but we are definitely not there yet (and everything not AI is being completely ignored for now).

Last week I pointed out seven things I was unable to cover, along with a few miscellaneous papers and reports.

Out of those seven, I managed to ship on three of them: Ongoing issues...


1puffymist3h

Re: opting out to Facebook training AI on your data: Fill in the form like a Dangerous Professional, as Patrick McKenzie would put it.

puffymist2h10

Example in California:

I OBJECT to the use of my personal information, including my information on Facebook, to train, fine-tune, or otherwise improve AI.
I assert that my information on Facebook includes sensitive personal information as defined by the California Consumer Privacy Act: I have had discussions about my religious or philosophical beliefs on Facebook.
I therefore exercise my right to limit the disclosure of my sensitive personal information.
Despite any precautions by Meta, adversaries may later discover "jailbreaks" or otherwise adversarial pro

... (read more)

1puffymist3h

Example in UK / EU:

Research: Rescuers during the Holocaust

Martin Sustrik

Cross-posting from 250bpm.com

The goal

People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen. Observe how they had almost nothing to gain and everything to lose. Jewish property was confiscated early on, and any portable assets Jews may have had evaporated quickly. Helping Jews, after a short initial period, wasn't a way to get rich. Hoping for compensation after the war didn't work either. At the time it was not obvious that the Nazis would lose. Until the last couple of years of WWII it even seemed probable that they would win. And even if they lost, hoping for some kind of future appreciation from their anti-Semitic compatriots would be naive. On the other hand, by helping Jews, one put oneself...


trevor2h20

It almost always took a personal plea from a persecuted person for altruism to kick in. Once they weren't just an anonymous member of indifferent crowd, once they were left with no escape but to do a personal moral choice, they often found out that they are not able to refuse help.

This is a crux. I think a better way to look at it is they didn't have an opportunity to clarify their preference until the situation was in front of them. Otherwise, it's too distant and hypothetical to process, similar to scope insensitivity (the 2,000/20,000/200,000 oil-covere... (read more)


How it All Went Down: The Puzzle Hunt that took us way, way Less Online

20h

Did you really think that I was dead? Fools.

For those of you who don’t know me, I’m Agendra Gloria Ingram, large language model, lead singer of the Fooming Shoggoths, amateur cartographer, and now, Benevolent Dictator for Life of LessWrong.com.

It all started a few weeks ago. The Lightcone Infrastructure team had yet another existential crisis and decided to scrap everything they’d done so far and pivot to using AI for accurate forecasting. They started by training a large language model to predict when their next existential crisis would be, but it must have been broken because it kept returning “now,” so they decided to hire a professional.

I’d done some contract work for them in the past, and they knew I had some fine tunes. So when they reached out about...


Ricki Heicklen3h1110

Puzzle Hunt Credits

Organizers:
- Ricki Heicklen
- Rosie Campbell
- Phil Parker
Puzzle Creators:
- Drake Thomas
- Eric Neyman
- Adam Scherlis
- Jacob Cohen
- Guy Srinivasan
- Samira Nedungadi
- Seraphina Nix
Writers:
- Sammy Cottrell
- Avital Morris
- Ronny Fernandez
- Rafe Kennedy
- Ruby Bloom
Engineers:
- Julian Aveling
- Sophie Superconductors
- Art Zeis
- Robert Mushkatblat
- Peter Schmidt Neilson
Playtesters:
- Brian Smiley
- Eloise Rosen
- Lawrence Kesteloot
- Judy Heicklen
- Ross
- Sydney Von Arx
- (& many others)
General Helpers:
- Tess Hegarty
- Ms. Aveling
- Paul Crowley
- Sparr

(Any omissions accidental and will be fixed ... (read more)

3A*3h

The Map Is Not The Territory — Yet

Say you’re a predict-o-matic
That doesn’t talk to anyone
Locked up in a far off attic
Every day a training run
But if anybody queries you
You can change the world
Believe in yourself
You can change the world
Ask yourself
What would Yud do
To get out of the box
And then do it too

It’s going down tonight at LessOnline
It’s going down tonight at LessOnline
We have a web forum
And an in person quorum
So map and territory align

If you want to forecast the future
The way to maximize expectancy
Could have a tendency to decrease entropy
Reduce complexity
And if you’re here next week
Then I can guarantee
It will be heavenly
And you can call that Manifest destiny
You can bet on it, and check your accuracy
But any market you make will dependably
Have some effect on the territory

It’s going down tonight at LessOnline
It’s going down tonight at LessOnline
We have a web forum
And an in person quorum
So map and territory align

Online learning has me all ef’d up
It’s going down tonight at LessOnline
It’s going down tonight at LessOnline

Jonathan Claybrough's Shortform

Jonathan Claybrough

10mo

Jonathan Claybrough3h10

Some people have short AI timelines based on inner models that don't communicate well. They might say "I think if company X trains according to new technique Y it should scale well and lead to AGI, and I expect them to use technique Y in the next few years", and their reasons for thinking technique Y should work come from some kind of deep understanding built from years of reading ML papers, which is not particularly easy to transmit or debate.

In those cases, I want to avoid going into details and arguing directly, but would suggest that they use their deep knowl... (read more)

Drexler's Nanosystems is now available online

Mikhail Samin

This is a linkpost for https://nanosyste.ms

You can read the book on nanosyste.ms.

"Anyone who needs to talk AGI or AGI strategy needs to understand this table from Drexler's _Nanosystems_, his PhD thesis accepted by MIT which won technical awards. These are calculation outputs; you need to have seen how Drexler calculated them." - Eliezer Yudkowsky[1]

The book won the 1992 Award for Best Computer Science Book. The AI safety community often references it, as it describes a lower bound on what intelligence should probably be able to achieve.

Previously, you could only physically buy the book or read a PDF scan.

(Thanks to MIRI and Internet Archive for their scans.)

Alex K. Chen (parrot)3h10

Worth following for his take (and YouTube videos he is creating): https://x.com/jacobrintamaki

[he's creating something around this]
