Why do some societies exhibit more antisocial punishment than others? Martin explores both some of the literature on the subject and his own experience living in a country where "punishment of cooperators" was fairly common.

William_S
I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of four people, which worked on trying to understand language model features in context, leading to the release of an open-source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
habryka
Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven't followed this in detail, and my guess is it's basically just random chance, but it sure would be a huge deal if a publicly traded company were now performing assassinations of U.S. citizens. Curious whether anyone has looked into this, or has thought much about the baseline risk of assassinations or other forms of violence from economic actors.
Thomas Kwa
You should update by ±1% on AI doom surprisingly frequently. This is just a fact about how stochastic processes work. If your p(doom) is a Brownian motion in 1% steps, starting at 50% and stopping once it reaches 0 or 1, then there will be about 50^2 = 2500 steps of size 1%. This is a lot! If we get all the evidence for whether humanity survives or not uniformly over the next 10 years, then you should make a 1% update 4-5 times per week. In practice there won't be as many, because heavy-tailedness in the distribution concentrates the updates into fewer events, and because you don't start at 50%. But I do believe that evidence is coming in every week such that ideal market prices should move by 1% in maybe half of weeks, and it is not crazy for your probabilities to shift by 1% during many weeks if you think about it.
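A quick way to check the arithmetic behind this claim (a minimal simulation sketch, not part of the original comment): a ±1 random walk started at 50 and absorbed at 0 and 100 takes 50 × 50 = 2500 steps on average (the standard gambler's-ruin result), and 2500 steps spread uniformly over ten years comes out to roughly 4-5 per week.

```python
import random

def steps_until_absorbed(start: int = 50, lo: int = 0, hi: int = 100) -> int:
    """Length of a +-1 symmetric random walk from `start` until it hits `lo` or `hi`."""
    p, steps = start, 0
    while lo < p < hi:
        p += random.choice((-1, 1))
        steps += 1
    return steps

trials = 2000
avg_steps = sum(steps_until_absorbed() for _ in range(trials)) / trials
weeks = 10 * 52  # ten years of uniformly arriving evidence
print(f"average number of 1% updates: {avg_steps:.0f}")   # ~2500
print(f"updates per week: {avg_steps / weeks:.1f}")       # ~4-5
```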
Dalcy
Thoughtdump on why I'm interested in computational mechanics:

* one concrete application to natural abstractions from here: tl;dr, belief structures generally seem to be fractal shaped. one major part of natural abstractions is trying to find the correspondence between structures in the environment and concepts used by the mind. so if we can do the inverse of what adam and paul did, i.e. 'discover' fractal structures from activations and figure out what stochastic process they might correspond to in the environment, that would be cool. (a minimal belief-updating sketch follows this list)
* ... but i was initially interested in reading compmech stuff not with a particular alignment-relevant thread in mind, but rather because it seemed broadly similar in direction to natural abstractions.
* re: how my focus would differ from my impression of current compmech work done in academia: academia seems faaaaaar less focused on actually trying out epsilon reconstruction on real-world noisy data. CSSR is an example of a reconstruction algorithm. apparently people did do compmech stuff on real-world data; i don't know how good it is, but far less effort has gone into it compared to theory work.
* i would be interested in these reconstruction algorithms, e.g. what the bottlenecks to scaling them up are, etc.
* tangent: epsilon transducers seem cool. if the reconstruction algorithm is good, a prototypical example i'm thinking of is something like: pick some input-output region within a model, and literally try to discover the hmm reconstructing it? of course it's gonna be unwieldy large. but, to shift the thread in the direction of bright-eyed theorizing ...
* the foundational Calculi of Emergence paper talked about the possibility of hierarchical epsilon machines, where you build epsilon machines on top of epsilon machines, and for simple examples where you can analytically do this, you get wild things like more and more compact representations of stochastic processes (e.g. data stream -> tree -> markov model -> stack automaton -> ... ?)
* this ... sounds like natural abstractions in its wildest dreams? literally point at some raw datastream and automatically build hierarchical abstractions that get more compact as you go up
* haha but alas, (almost) no development afaik since the original paper. seems cool
* and also more tangentially, compmech seems to have a lot to say about providing interesting semantics to various information measures aka True Names, so another angle i was interested in was learning about them.
  * e.g. crutchfield talks a lot about developing the right notion of information flow: obvious usefulness in e.g. formalizing boundaries?
  * many other information measures from compmech have suggestive semantics: cryptic order? gauge information? synchronization order? check ruro1 and ruro2 for more.
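To make the "belief structures" point above concrete, here is a minimal, purely illustrative sketch (mine, not from the comment): Bayesian belief updating over the hidden states of a small, made-up HMM. The set of belief vectors an observer visits this way is the mixed-state presentation; for richer processes, that set is the fractal-shaped object that the adam-and-paul-style work recovers from transformer activations.

```python
import numpy as np

# Hypothetical 2-state HMM (parameters made up for illustration).
T = np.array([[0.9, 0.1],   # T[i, j] = P(next hidden state j | current state i)
              [0.2, 0.8]])
E = np.array([[0.7, 0.3],   # E[i, k] = P(emit symbol k | hidden state i)
              [0.1, 0.9]])

def update_belief(belief: np.ndarray, obs: int) -> np.ndarray:
    """One step of Bayesian filtering: predict via T, reweight by the
    likelihood of the observed symbol, renormalize."""
    predicted = belief @ T
    unnormalized = predicted * E[:, obs]
    return unnormalized / unnormalized.sum()

# Generate a sample path (transition, then emit from the new state) and track
# the observer's belief over hidden states after each symbol.
rng = np.random.default_rng(0)
state, belief = 0, np.array([0.5, 0.5])
beliefs = [belief]
for _ in range(2000):
    state = rng.choice(2, p=T[state])
    obs = rng.choice(2, p=E[state])
    belief = update_belief(belief, obs)
    beliefs.append(belief)

# The distinct belief points visited approximate the mixed-state presentation.
print(np.unique(np.round(beliefs, 3), axis=0)[:10])
```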
Buck
[epistemic status: I think I'm mostly right about the main thrust here, but probably some of the specific arguments below are wrong. In the following, I'm much more stating conclusions than providing full arguments. This claim isn't particularly original to me.]

I'm interested in the following subset of risk from AI:

* Early: risk that comes from AIs that are just powerful enough to be extremely useful and dangerous-by-default (i.e. these AIs aren't wildly superhuman).
* Scheming: risk associated with loss of control to AIs that arises from AIs scheming.
  * So e.g. I exclude state actors stealing weights in ways that aren't enabled by the AIs scheming, and I also exclude non-scheming failure modes. IMO, state actors stealing weights is a serious threat, but non-scheming failure modes aren't (at this level of capability and dignity).
* Medium dignity: that is, developers of these AIs are putting a reasonable amount of effort into preventing catastrophic outcomes from their AIs (perhaps they're spending the equivalent of 10% of their budget on cost-effective measures to prevent catastrophes).
* Nearcasted: no substantial fundamental progress on AI safety techniques, no substantial changes in how AI works.

This subset of risk is interesting because I think it's a natural scenario at which to target technical work on AI safety. (E.g. it's the main scenario we're targeting with our AI control agenda.)

I claim that the majority of this risk comes from the lab using AIs internally to do AI development (by which I mean both research and engineering). This is because the AIs doing AI development naturally require access to compute and model weights that they can potentially leverage into causing catastrophic outcomes; in particular, those resources can be abused to run AIs unmonitored.

Using AIs for AI development looks uniquely risky to me among applications of early-transformative AIs, because unlike all other applications I know about:

* It's very expensive to refrain from using AIs for this application.
* There's no simple way to remove affordances from the AI such that it's very hard for the AI to take a small sequence of actions which plausibly lead quickly to loss of control. In contrast, most other applications of AI can probably be controlled just by restricting their affordances.

If I'm right that the risk from scheming early-transformative models is concentrated onto this pretty specific scenario, it implies a bunch of things:

* It implies that work on mitigating these risks should focus on this very specific setting.
* It implies that AI control is organizationally simpler, because most applications can be made trivially controlled.
* It is pretty plausible to me that AI control is quite easy, because you actually can remove affordances from the AIs that are doing AI R&D such that it's hard for them to cause problems.


Recent Discussion

Happy May the 4th from Convergence Analysis! Cross-posted on the EA Forum.

As part of Convergence Analysis’s scenario research, we’ve been looking into how AI organisations, experts, and forecasters make predictions about the future of AI. In February 2023, the AI research institute Epoch published a report in which its authors use neural scaling laws to make quantitative predictions about when AI will reach human-level performance and become transformative. The report has a corresponding blog post, an interactive model, and a Python notebook.

We found this approach really interesting, but also hard to understand intuitively. While trying to follow how the authors derive a forecast from their assumptions, we wrote a breakdown that may be useful to others thinking about AI timelines and forecasting. 

In what follows, we set out our interpretation of...
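To make the shape of this kind of calculation concrete (this is not Epoch's model; their interactive model and notebook linked above are the real thing, and every constant below is made up): assume loss falls with training compute according to a simple power law, pick a loss level you treat as "transformative", and invert the law to find the compute at which that level is reached. A timeline forecast then follows from guessing when that much compute will be spent.

```python
# Purely illustrative power law: loss(C) = irreducible + a * C**(-b).
# None of these constants come from Epoch's report; they are made up so the
# arithmetic has something to chew on.
IRREDUCIBLE = 1.7   # assumed entropy floor of the data (loss cannot go below this)
A, B = 6.3, 0.05    # assumed scale and exponent of the reducible term

def predicted_loss(compute_flop: float) -> float:
    return IRREDUCIBLE + A * compute_flop ** (-B)

def compute_needed(target_loss: float) -> float:
    """Invert the power law: the compute at which predicted loss hits the target."""
    return (A / (target_loss - IRREDUCIBLE)) ** (1 / B)

target = 1.9  # loss level we *assume* corresponds to transformative performance
flop = compute_needed(target)
print(f"compute needed under these made-up constants: {flop:.1e} FLOP")
print(f"sanity check, loss at that compute: {predicted_loss(flop):.2f}")
```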

Under the current version of the interactive model, its median prediction is just two decades earlier than that from Cotra’s forecast


Just?

NunoSempere
You might also enjoy this review: https://nunosempere.com/blog/2023/04/28/expert-review-epoch-direct-approach/

 Summary

  • We present a method for performing circuit analysis on language models using "transcoders," an occasionally-discussed variant of SAEs that provides an interpretable approximation to MLP sublayers' computations. Transcoders are exciting because they allow us not only to interpret the output of MLP sublayers but also to decompose the MLPs themselves into interpretable computations. In contrast, SAEs only allow us to interpret the output of MLP sublayers, not how they were computed. (A minimal sketch of this idea follows the summary.)
  • We demonstrate that transcoders achieve similar performance to SAEs (when measured via fidelity/sparsity metrics) and that the features learned by transcoders are interpretable.
  • One of the strong points of transcoders is that they decompose the function of an MLP layer into sparse, independently-varying, and meaningful units (like neurons were originally intended to be before superposition was discovered).
...
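As a rough illustration of the transcoder idea from the summary above, here is a minimal PyTorch sketch (my own, not the authors' code; the dimensions, loss, and L1 coefficient are placeholders): unlike an SAE, which encodes an activation and reconstructs that same activation, a transcoder reads the MLP sublayer's input and is trained to reproduce the MLP's output through a wide, sparsely activating feature layer, so the features describe the MLP's computation rather than only its output.

```python
import torch
import torch.nn as nn

class Transcoder(nn.Module):
    """Sparse, interpretable stand-in for an MLP sublayer: maps the MLP's input
    to an approximation of the MLP's output via a wide feature layer.
    (Sketch only; hyperparameters are illustrative.)"""
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # feature activations
        self.decoder = nn.Linear(d_features, d_model)  # reconstructs MLP *output*

    def forward(self, mlp_input: torch.Tensor):
        features = torch.relu(self.encoder(mlp_input))
        return self.decoder(features), features

def transcoder_loss(pred_out, true_mlp_out, features, l1_coeff=1e-3):
    """Fidelity to the real MLP's output plus an L1 sparsity penalty,
    mirroring the fidelity/sparsity trade-off mentioned above."""
    fidelity = (pred_out - true_mlp_out).pow(2).mean()
    sparsity = features.abs().mean()
    return fidelity + l1_coeff * sparsity

# Usage sketch: collect (mlp_input, mlp_output) activation pairs from the model,
# then train the transcoder on them. Random tensors stand in for real activations.
tc = Transcoder(d_model=768, d_features=24576)
x = torch.randn(4, 768)   # stand-in for MLP inputs
y = torch.randn(4, 768)   # stand-in for the true MLP outputs
pred, feats = tc(x)
print(transcoder_loss(pred, y, feats).item())
```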
RGRGRG

Question about the "rules of the game" you present. Are you allowed to simply look at layer-0 transcoder features for the final 10 tokens? You could probably roughly estimate the input string from those features' top activators. From your case study, it seems that you effectively look at layer-0 transcoder features for a few of the final tokens through a backwards search, but I wonder if you can skip the search and simply look at the transcoder features directly. Thank you.

James Payor

By "gag order" do you mean just as a matter of private agreement, or something heavier-handed, with e.g. potential criminal consequences?

I have trouble understanding the absolute silence we seem to be seeing. There seem to be very few leaks, and all of them are very mild-mannered and fail to build any consensus narrative that challenges OA's press in the public sphere.

Are people not able to share info over Signal or otherwise tolerate some risk here? It doesn't add up to me if the risk is just some chance of OA trying to then sue you to bankruptcy, ...

Martín Soto
What's PPU?
MondSemmel
From here:
mishka
No, OpenAI (assuming that it is a well-defined entity) also uses a probability distribution over timelines. (In reality, every member of its leadership has their own probability distribution, and this translates to OpenAI having a policy and behavior formulated approximately as if there is some resulting single probability distribution.)

The important thing is, they are uncertain about timelines themselves: in part because no one knows how perplexity translates to capabilities; in part because there might be differences in capabilities even at the same perplexity if the underlying architectures are different (e.g. in-context learning might depend on architecture even at fixed perplexity, and we do see a stream of potentially very interesting architectural innovations recently); and in part because it's not clear how big the potential of "harness"/"scaffolding" is, and so on.

This does not mean there is no political infighting. But it happens against the background of them being correctly uncertain about the true timelines...

----------------------------------------

Compute-wise, inference demands are huge and growing with the popularity of the models (look how much Facebook did to make Llama 3 more inference-efficient). So if they expect models to become useful enough for almost everyone to want to use them, they should worry about compute, assuming they do want to serve people like they say they do (I am not sure how this looks for very strong AI systems; they will probably be gradually expanding access, and the speed of expansion might depend).

[If you haven't come since we started meeting at Rocky Hill Cohousing, make sure to read this for more details about where to go and park.]

We're the regular Northampton area meetup for Astral Codex Ten readers, and (as far as I know) the only rationalist or EA meetup in Western Massachusetts.

We started as part of the blog's 2018 "Meetups Everywhere" event, and have been holding meetups with varying degrees of regularity ever since. At most meetups we get about 4-7 people out of a rotation of 15-20, with a nice mix of regular faces, people who only drop in once in a while, and occasionally total newcomers.

After meeting more sporadically in the past, we recently started meeting biweekly (currently every other Saturday, although there's a chance I'll...

(ElevenLabs reading of this post; if anyone can tell me how to embed the audio into LessWrong, I'd appreciate it)

I'm excited to share a project I've been working on that I think many in the LessWrong community will appreciate: converting some rational fiction into high-quality audiobooks using cutting-edge AI voice technology from ElevenLabs, under the name "Askwho Casts AI".

The keystone of this project is an audiobook version of Planecrash (AKA Project Lawful), the epic glowfic authored by Eliezer Yudkowsky and Lintamande. Given the scope and scale of this work, with its large cast of characters, I'm using ElevenLabs to give each character their own distinct voice. Producing this audiobook version of the story has been a labor of love, and I hope if anyone has bounced...

GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict fairly well what the cross-entropy loss will be, but pretty much nothing else.

Maybe we will suddenly discover that the difference between GPT-4 and superhuman level is actually quite small. Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

Hopefully model evaluations can catch catastrophic risks before wide deployment, but again, it’s hard to be sure. GPT-5 could plausibly be devious enough to circumvent all of...

Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

I am by no means an expert on machine learning, but this sentence reads weird to me. 

I mean, it seems possible that a part of an NN develops some self-reinforcing feature which uses gradient descent (or whatever is used in training) to push in a particular direction and take over the NN, much as a human adrift on a raft in the ocean might decide to build a sail to make the raft go in a particular direction.

Or is that s...

Nathan Helm-Burger
I absolutely sympathize, and I agree that, with the worldview and information you have, advocating for a pause makes sense. I would get behind 'regulate AI' or 'regulate AGI', certainly. I think, though, that pausing is an incorrect strategy which would do more harm than good, so despite being aligned with you in being concerned about AGI dangers, I don't endorse that strategy.

Some part of me thinks this oughtn't matter, since there's approximately ~0% chance of the movement achieving that literal goal. The point is to build an anti-AGI movement, and to get people thinking about what it would be like to have a government able to issue an order to pause AGI R&D, or turn off datacenters, or whatever. I think that's a good aim, and your protests probably (slightly) help that aim.

I'm still hung up on the literal 'Pause AI' concept being a problem though. Here's where I'm coming from:

1. I've been analyzing the risks of current-day AI. I believe (but will not offer evidence for here) that current-day AI is already capable of providing small-but-meaningful uplift to bad actors intending to use it for harm (e.g. weapon development). I think that having stronger AI in the hands of government agencies designed to protect humanity from these harms is one of our best chances at preventing such harms.
2. I see the 'Pause AI' movement as being targeted mostly at large companies, since I don't see any plausible way for a government or a protest movement to enforce what private individuals do with their home computers. Perhaps you think this is fine because you think that most of the future dangers posed by AI derive from actions taken by large companies or organizations with large amounts of compute. This is emphatically not my view. I think that actually more danger comes from the many independent researchers and hobbyists who are exploring the problem space. I believe there are huge algorithmic power gains which can, and eventually will, be found. I furthermore
yanni kyriacos
Hi Tomás! Is there a prediction market for this that you know of?
yanni kyriacos
I think it is unrealistic to ask people to internalise that level of ambiguity. This is how EAs turn themselves into mental pretzels.

I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions keep coming.

Seriously Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what they will want.

So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?

Can we really afford to do this while our god software looks like...

May this find you well.

Dagon

Agreed, but it's not just software.  It's every complex system, anything which requires detailed coordination of more than a few dozen humans and has efficiency pressure put upon it.  Software is the clearest example, because there's so much of it and it feels like it should be easy.

Ustice
I don't know about making god software, but human software is a lot of trial and error. I have been writing code for close to 40 years. The best I can do is write automated tests to anticipate the kinds of errors I might get. My imagination just isn't as strong as reality. There is provably no way to fully predict how a software system of sufficient complexity will behave. With careful organization it becomes easier to reason about and predict, but unless you are writing provable software (a very slow and complex process, I hear), that's the best you get.

I feel you on being distracted by software bugs. I'm one of those guys who reports them, or even submits code-change suggestions (GitHub pull requests).
Johannes C. Mayer
I think it is incorrect to say that testing things fully formally is the only alternative to whatever the heck we are currently doing. There is property-based testing as a first step (which maybe you also refer to with "automated tests", but I would guess you are mainly talking about unit tests); a tiny illustration follows this comment.

Maybe try Haskell, or even better, Idris? The Haskell compiler is very annoying until you realize that it loves you. Each time it annoys you with compile errors, it is actually saying "Look, I found this error here that I am very, very sure you'd agree is an error, so let me not produce machine code that would do things you don't want it to do." It's very bad at communicating this though, so its words of love usually get blurted out as an incomprehensible wall of type errors. Don't bother understanding the details; they are not important. So maybe Haskell's greatest strength, being a very "noisy" compiler, is also its downfall. Nobody likes being told that they are wrong, at least not until you understand that your goals and the compiler's goals are actually aligned, and that the compiler is just better at thinking about certain kinds of things that are harder for you to think about. In Haskell, you don't really ever try to prove anything about your program in your program. All of this you get by just using the language normally.

You can then go one step further with Agda, Idris2, or Lean, and start to prove things about your programs, which can easily get tedious. But even then, when you have dependent types, you can add a lot more information to your types, which makes the compiler able to help you better. Really we could see it as an improvement in how you can tell the compiler what you want. But again, you know what you can do in dependent type theory? NOT use dependent type theory! You can use Haskell-style code in Idris whenever that is more convenient.

And by the way, I totally agree that all of these languages I named are probably only ghostly images of what they could truly be. But
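For concreteness, here is a minimal property-based test, written in Python with Hypothesis rather than Haskell's QuickCheck purely to keep this page's examples in one language (the function under test and its property are my own toy illustration): instead of hand-picking cases, you state a property ("decode inverts encode") and let the framework search for counterexamples.

```python
from hypothesis import given, strategies as st

def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse runs of repeated characters into (char, count) pairs."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs

def run_length_decode(runs: list[tuple[str, int]]) -> str:
    return "".join(ch * count for ch, count in runs)

# The property: decoding an encoding returns the original string, for *any* string.
@given(st.text())
def test_roundtrip(s: str) -> None:
    assert run_length_decode(run_length_encode(s)) == s

if __name__ == "__main__":
    test_roundtrip()  # Hypothesis generates and checks many examples when called directly
```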
Johannes C. Mayer
"If you are assuming Software works well you are dead" because: * If you assume this you will be shocked by how terrible software is every moment you use a computer, and your brain will constantly try to fix it wasting your time. * You should not assume that humanity has it in it to make the god software without your intervention. * When making god software: Assume the worst.
May 5th
41 Cleveland Avenue South, Saint Paul

This is just an ACX hangout!  I'm ordering pizza, gonna get a large with half vegetarian toppings.  No reading this week.

25Hour

People primarily come to this via the Discord, so I just have this on LW for visibility.

Shannon Vallor is the first Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the Edinburgh Futures Institute. She is the author of Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. She believes that we need to build and cultivate a new virtue ethics appropriate to our technological era. The following summarizes some of her arguments and observations:

Why Do We Need New Virtues?

Vallor believes that we need to discover and cultivate a new set of virtues appropriate to the onslaught of technological change that marks our era. This is for several reasons, including:

  • Technology is extending our reach, so that our decisions have effects with broader ethical implications than traditional moral wisdom is prepared to cope with.
  • Technology is also starting
...

Something I think humanity is going to have to grapple with soon is the ethics of self-modification / self-improvement, and the perils of value-shift due to rapid internal and external changes. How do we stay true to ourselves while changing fundamental aspects of what it means to be human?
