anaguma — LessWrong

Alexander Gietelink Oldenziel's Shortform

anaguma4d10

I'm reminded from time to time of a tweet that Ilya Sutskever posted a few years ago.

if you value intelligence above all other human qualities, you’re gonna have a bad time

Daniel Tan's Shortform

anaguma5d20

Why not both? I imagine you could average the gradients so that you learn both at the same time.

anaguma's Shortform

anaguma7d30

80,000 hours has done a great podcast with Helen Toner on her work in AI security and policy.

anaguma's Shortform

anaguma10d202

A notable section from Ilya Sutskever's recent deposition:

WITNESS SUTSKEVER: Right now, my view is that, with very few exceptions, most likely a person who is going to be in charge is going to be very good with the way of power. And it will be a lot like choosing between different politicians.
ATTORNEY EDDY: The person in charge of what?
WITNESS SUTSKEVER: AGI.
ATTORNEY EDDY: And why do you say that?
ATTORNEY AGNOLUCCI: Object to form.
WITNESS SUTSKEVER: That's how the world seems to work. I think it's very -- I think it's not impossible, but I think it's very hard for someone who would be described as a saint to make it. I think it's worth trying. I just think it's -- it's like choosing between different politicians. Who is going to be the head of the state?

anaguma's Shortform

anaguma10d10

That makes sense, thanks for the corrections!

anaguma's Shortform

anaguma11d20

Energy Won't Constrain AI Inference.

The energy for LLM inference follows the formula: Energy = 2 × P × N × (tokens/user) × ε, where P is active parameters, N is concurrent users, and ε is hardware efficiency in Joules/FLOP. The factor of 2 accounts for multiply-accumulate operations in matrix multiplication.

Using NVIDIA's GB300, we can calculate ε as follows: the GPU has a TDP of 1400W and delivers 14 PFLOPS of dense FP4 performance. Thus ε = 1400 J/s ÷ (14 × 10^15 FLOPS) = 100 femtojoules per FP4 operation. With this efficiency, a 1 trillion active parameter model needs just 0.2 mJ per token (2 × 10^12 × 10^-13 J). This means 10 GW could give every American 167^[1] tokens/second continuously.

^{^}
300 million users: tokens/second = 10^10 W ÷ (2 × 10^12 × 3 × 10^8 × 10^-13) = 167 tokens/second per person

anaguma's Shortform

anaguma11d80

OpenAI plans to have automated AI researchers by March 2028.

Needless to say, I hope that they don't succeed.

From Sam Altman's X:

Yesterday we did a livestream. TL;DR:

We have set internal goals of having an automated AI research intern by September of 2026 running on hundreds of thousands of GPUs, and a true automated AI researcher by March of 2028. We may totally fail at this goal, but given the extraordinary potential impacts we think it is in the public interest to be transparent about this.

We have a safety strategy that relies on 5 layers: Value alignment, Goal alignment, Reliability, Adversarial robustness, and System safety. Chain-of-thought faithfulness is a tool we are particularly excited about, but it somewhat fragile and requires drawing a boundary and a clear abstraction.

On the product side, we are trying to move towards a true platform, where people and companies building on top of our offerings will capture most of the value. Today people can build on our API and apps in ChatGPT; eventually, we want to offer an AI cloud that enables huge businesses.

We have currently committed to about 30 gigawatts of compute, with a total cost of ownership over the years of about $1.4 trillion. We are comfortable with this given what we see on the horizon for model capability growth and revenue growth. We would like to do more—we would like to build an AI factory that can make 1 gigawatt per week of new capacity, at a greatly reduced cost relative to today—but that will require more confidence in future models, revenue, and technological/financial innovation.

Our new structure is much simpler than our old one. We have a non-profit called OpenAI Foundation that governs a Public Benefit Corporation called OpenAI Group. The foundation initially owns 26% of the PBC, but it can increase with warrants over time if the PBC does super well. The PBC can attract the resources needed to achieve the mission.

Our mission, for both our non-profit and PBC, remains the same: to ensure that artificial general intelligence benefits all of humanity.

The nonprofit is initially committing $25 billion to health and curing disease, and AI resilience (all of the things that could help society have a successful transition to a post-AGI world, including technical safety but also things like economic impact, cyber security, and much more). The nonprofit now has the ability to actually deploy capital relatively quickly, unlike before.

In 2026 we expect that our AI systems may be able to make small new discoveries; in 2028 we could be looking at big ones. This is a really big deal; we think that science, and the institutions that let us widely distribute the fruits of science, are the most important ways that quality of life improves over time.

Patrick Spencer's Shortform

anaguma12d20

One argument is that energy costs 0.1 cents per kWh versus 5 cents on Earth. For now launch costs dominate this but in the future this balance might change.

Mikhail Samin's Shortform

anaguma12d10

Do you think the world would be a better place if Anthropic didn’t exist?

Temporarily Losing My Ego

anaguma14d60

Just like your mind can only see the rightside-up-stairs (w/ a blue wall closer to you) or the upside-down-stairs (w/ a green wall closer to you) (but never both of them at the same time), you can also suddenly shift from a viewpoint of being little brother, to being the full system.

If you aren’t able to see the second thing, try flipping your screen and upside down, and looking at it normally again.

LESSWRONG
LW

LESSWRONG
LW

Posts

Wikitag Contributions

Comments