We are headed into an extreme compute overhang

If we achieve AGI-level performance using an LLM-like approach, the training hardware will be capable of running ~1,000,000s concurrent instances of the model.

Definitions

Although there is some debate about the definition of compute overhang, I believe that the AI Impacts definition matches the original use, and I prefer it: "enough computing hardware to run many powerful AI systems already exists by the time the software to run such systems is developed". A large compute overhang leads to additional risk due to faster takeoff.

I use the types of superintelligence defined in Bostrom's Superintelligence book (summary here).

I use the definition of AGI in this Metaculus question. The adversarial Turing test portion of the definition is not very relevant to this post.

Thesis

Due to practical reasons, the compute requirements for training LLMs...

(See More – 408 more words)

snewman15m10

Assuming we require a performance of 40 tokens/s, the training cluster can run $\frac{2000}{30} \times 24000 = 1, 600, 000$ concurrent instances of the resulting 70B model

Nit: you mixed up 30 and 40 here (should both be 30 or both be 40).

I will assume that the above ratios hold for an AGI level model.

If you train a model with 10x as many parameters, but use the same training data, then it will cost 10x as much to train and 10x as much to operate, so the ratios will hold.

In practice, I believe it is universal to use more training data when training larger models? Imply... (read more)

Uncontrollable Super-Powerful Explosives

Sammy Martin

In the late 19th century, two researchers meet to discuss their differing views on the existential risk posed by future Uncontrollable Super-Powerful Explosives.

Catastrophist: I predict that one day, not too far in the future, we will find a way to unlock a qualitatively new kind of explosive power. This explosive will represent a fundamental break with what has come before. It will be so much more powerful than any other explosive that whoever gets to this technology first might be in a position to gain a DSA over any opposition. Also, the governance and military strategies that we were using to prevent wars or win them will be fundamentally unable to control this new technology, so we'll have to reinvent everything on the fly or die in

...

(Continue Reading – 1183 more words)

Sammy Martin26m20

In the late 1940s and early 1950s nuclear weapons did not provide an overwhelming advantage against conventional forces. Being able to drop dozens of ~kiloton range fission bombs in eastern European battlefields would have been devastating but not enough by itself to win a war. Only when you got to hundreds of silo launched ICBMs with hydrogen bombs could you have gotten a true decisive strategic advantage

On Not Pulling The Ladder Up Behind You

Screwtape

32m

Epistemic Status: Musing and speculation, but I think there's a real thing here.

I.

When I was a kid, a friend of mine had a tree fort. If you've never seen such a fort, imagine a series of wooden boards secured to a tree, creating a platform about fifteen feet off the ground where you can sit or stand and walk around the tree. This one had a rope ladder we used to get up and down, a length of knotted rope that was tied to the tree at the top and dangled over the edge so that it reached the ground.

Once you were up in the fort, you could pull the ladder up behind you. It was much, much harder to get into the fort without the ladder....

(Continue Reading – 2402 more words)

ACX Atlanta - The Atlanta Moloch Slayers

ACX Atlanta Meetups Everywhere Spring 2024

Apr 27thAtlanta

Steve French

The April 2024 Meetup will be April 27th at Bold Monk at 2:00 PM

We return to Bold Monk brewing for a vigorous discussion of rationalism and whatever else we deem fit for discussion – hopefully including actual discussions of the sequences and Hamming Circles/Group Debugging.

Location:
Bold Monk Brewing
1737 Ellsworth Industrial Blvd NW
Suite D-1
Atlanta, GA 30318, USA

No Book club this month!

This is also the meetups everywhere meetup that will be advertised on the blog - so we should have a large turnout!

We will be outside out front (in the breezeway) – this is subject to change, but we will be somewhere in Bold Monk. If you do not see us in the front of the restaurant, please check upstairs and out back – look for the yellow table sign. We will have to play the weather by ear.

Remember – bouncing around in conversations is a rationalist norm!

Steve French1h10

Great!

Reducing sycophancy and improving honesty via activation steering

117

Nina Rimsky

Ω 499mo

Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger.

I generate an activation steering vector using Anthropic's sycophancy dataset and then find that this can be used to increase or reduce performance on TruthfulQA, indicating a common direction between sycophancy on questions of opinion and untruthfulness on questions relating to common misconceptions. I think this could be a promising research direction to understand dishonesty in language models better.

What is sycophancy?

Sycophancy in LLMs refers to the behavior when a model tells you what it thinks you want to hear / would approve of instead of what it internally represents as the truth. Sycophancy is a common problem in LLMs trained on human-labeled data because human-provided training signals...

(Continue Reading – 2574 more words)

1alexandraabbas12h

"[...] This is because there would be no general direction towards a truth-based belief domain or away from using human modeling in output generation." What do you mean by "human modeling in output generation"?

Nina Rimsky1h20

I am contrasting generating an output by:

Modeling how a human would respond (“human modeling in output generation”)
Modeling what the ground-truth answer is

Eg. for common misconceptions, maybe most humans would hold a certain misconception (like that South America is west of Florida), but we want the LLM to realize that we want it to actually say how things are (given it likely does represent this fact somewhere)

[Concept Dependency] Concept Dependency Posts

Johannes C. Mayer

This is a Concept Dependency Post. It may not be worth reading on its own, out of context. See the backlinks at the bottom to see which posts use this concept.

See the backlinks at the bottom of the post. Every post starting with [Concept Dependency] is a concept dependency post, that describes a concept this post is using.

Problem: Often when writing I come up with general concepts that make sense in isolation. Often I want to reuse these concepts without having to reexplain them.

A Concept Dependency Post is explaining a single concept, usually with no or minimal context. It is expected that the relevant context is provided by another post that links to the concept dependency post.

Concept Dependency Posts can be very short. Much shorter than a regular post. They might not be worth reading on their own....

(See More – 414 more words)

1quila1h

i like the idea. it looks useful and it fits my reading style well. i wish something like this were more common - i have seen it on personal blogs before like carado's. i would use [Concept Dependency] or [Concept Reference] instead so the reader understands just from seeing the title on the front page. also avoids acronym collision

Johannes C. Mayer1h20

Adopted.

To get the best posts emailed to you, create an account! (2-3 posts per week, selected by the LessWrong moderation team.)

Nathan Young's Shortform

Nathan Young

2Nathan Young1h

If that were true then there are many ways you could partially do that - eg give people a set of tokens to represent their mana at the time of the devluation and if at future point you raise. you could give them 10x those tokens back.

2James Grugett4h

We are trying our best to honor mana donations! If you are inactive you have until the rest of the year to donate at the old rate. If you want to donate all your investments without having to sell each individually, we are offering you a loan to do that. We removed the charity cap of $10k donations per month, which is going beyond what we previous communicated.

Nathan Young1h20

Nevertheless lots of people were hassled. That has real costs, both to them and to you.

2Nathan Young2h

I’m discussing with Carson. I might change my mind but i don’t know that i’ll argue with both of you at once.

Take the wheel, Shoggoth! (Lesswrong is trying out changes to the frontpage algorithm)

Ruby, RobertM

For the last month, @RobertM and I have been exploring the possible use of recommender systems on LessWrong. Today we launched our first site-wide experiment in that direction.

(In the course of our efforts, we also hit upon a frontpage refactor that we reckon is pretty good: tabs instead of a clutter of different sections. For now, only for logged-in users. Logged-out users see the "Latest" tab, which is the same-as-usual list of posts.)

Why algorithmic recommendations?

A core value of LessWrong is to be timeless and not news-driven. However, the central algorithm by which attention allocation happens on the site is the Hacker News algorithm^[1], which basically only shows you things that were posted recently, and creates a strong incentive for discussion to always be...

(See More – 965 more words)

2habryka1h

GDPR is a giant mess, so it's pretty unclear what it requires us to implement. My current understanding is that it just requires us to tell you that we are collecting analytics data if you are from the EU. And the kind of stuff we are sending over to Recombee would be covered by it being data necessary to provide site functionality, not just analytics, so wouldn't be covered by that (if you want to avoid data being sent to Google Analytics in-particular, you can do that by just blocking the GA script in uBlock origin or whatever other adblocker you use, which it should do by default).

the gears to ascension1h20

drat, I was hoping that one would work. oh well. yes, I use ublock, as should everyone. Have you considered simply not having analytics at all :P I feel like it would be nice to do the thing that everyone ought to do anyway since you're in charge. If I was running a website I'd simply not use analytics.

back to the topic at hand, I think you should just make a vector embedding of all posts and show a HuMAP layout of it on the homepage. that would be fun and not require sending data anywhere. you could show the topic islands and stuff.

2kave1h

I am sad to see you getting so downvoted. I am glad you are bringing this perspective up in the comments.

2habryka1h

I am pretty excited about doing something more in-house, but it's much easier to get data about how promising this direction is by using some third-party services that already have all the infrastructure. If it turns out to be a core part of LW, it makes more sense to in-house it. It's also really valuable to have an relatively validated baseline to compare things to. There are a bunch of third-party services we couldn't really replace that we send user data to. Hex.tech as our analytics dashboard service. Google Analytics for basic user behavior and patterns. A bunch of AWS services. Implementing the functionality of all of that ourselves, or putting a bunch of effort into anonymizing the data is not impossible, but seems pretty hard, and Recombee seems about par for the degree to which I trust them to not do anything with that data themselves.

Examples of Highly Counterfactual Discoveries?

149

johnswentworth, kromem

The history of science has tons of examples of the same thing being discovered multiple time independently; wikipedia has a whole list of examples here. If your goal in studying the history of science is to extract the predictable/overdetermined component of humanity's trajectory, then it makes sense to focus on such examples.

But if your goal is to achieve high counterfactual impact in your own research, then you should probably draw inspiration from the opposite: "singular" discoveries, i.e. discoveries which nobody else was anywhere close to figuring out. After all, if someone else would have figured it out shortly after anyways, then the discovery probably wasn't very counterfactually impactful.

Alas, nobody seems to have made a list of highly counterfactual scientific discoveries, to complement wikipedia's list of multiple discoveries.

To...

(See More – 189 more words)

1Johannes C. Mayer2h

A few adjacent thoughts: * Why is a programming language like Haskell that is extremely powerful in the sense that if your program compiles, it is the program that you want with a very high probability because most stupid mistakes are now compile errors? * Why is there basically no widely used homoiconic language, i.e. a language in which you can use the language itself to <reason about the language/manipulate the language>. Here we have some technology that is basically ready to use (Haskell or Clojure), but people decide to mostly not use them. And with people, I mean professional programmers and companions who make software. * Why did nobody invent Rust earlier, by which I mean a system-level programming language that prevents you from making really dumb mistakes that can be machine-checked if you make them? * Why did it take like 40 years to get a latex replacement, even though latex is terrible in very obvious ways? These things have in common that there is a big engineering challenge. It feels like maybe this explains it, together with that people who would benefit from these technologies where in the position that the cost of creating them would have exceeded the benefit that they would expect from them. For Haskell and Clojure we can also consider this point. Certainly, these two technologies have their flaws and could be improved. But then again we would have a massive engineering challenge.

4Alexander Gietelink Oldenziel3h

I would not say that the central insight of SLT is about priors. Under weak conditions the prior is almost irrelevant. Indeed, the RLCT is independent of the prior under very weak nonvanishing conditions. The story that symmetries mean that the parameter-to-function map is not injective is true but already well-understood outside of SLT. It is a common misconception that this is what SLT amounts to. To be sure - generic symmetries are seen by the RLCT. But these are, in some sense, the uninteresting ones. The interesting thing is the local singular structure and its unfolding in phase transitions during training. The issue of the true distribution not being contained in the model is called 'unrealizability' in Bayesian statistics. It is dealt with in Watanabe's second 'green' book. Nonrealizability is key to the most important insight of SLT contained in the last sections of the second to last chapter of the green book: algorithmic development during training through phase transitions in the free energy. I don't have the time to recap this story here.

mattmacdermott1h30

Lucius-Alexander SLT dialogue?

4Alexander Gietelink Oldenziel3h

All proofs are contained in the Watanabe's standard text, see here https://www.cambridge.org/core/books/algebraic-geometry-and-statistical-learning-theory/9C8FD1BDC817E2FC79117C7F41544A3A

LESSWRONG
LW

Quick Takes

Popular Comments

Recent Discussion

Definitions

Thesis

I.

What is sycophancy?

Why algorithmic recommendations?

LessOnline

A Festival of Writers Who are Wrong on the Internet

May 31 - Jun 2, Berkeley, CA