HunterJay — LessWrong

Replying to Tensor-Transformer Variants are Surprisingly Performant

HunterJay1d

Tensor-Transformer Variants are Surprisingly Performant

You're correct, sorry for being confusing. Tracing through:

My understanding of steering is that you can add a steering vector to an activation vector at some layer, which causes the model outputs to be 'steered' in that direction. I.e.:
- Record layer $n$'s activations when outputting "I am very happy", get vector $h$
- Record layer $n$'s activations when outputting "I am totally neutral", get vector $q$
- Subtract $q$ from $h$ to get steering vector $s = h - q$, the difference between 'happy' and 'neutral' outputs.
- Add $α s$ to the activations at layer $n$ to steer the model into acting more happy, where $α$ is some scalar.
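The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the activations are random stand-ins rather than vectors recorded from a real model, and `d_model` and `steer` are hypothetical names chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # hypothetical hidden size, chosen only for illustration

# Stand-ins for layer-n activations. In practice these would be recorded
# from the model while it produces each output; here they are random.
h = rng.normal(size=d_model)  # activations for "I am very happy"
q = rng.normal(size=d_model)  # activations for "I am totally neutral"

# Steering vector: the difference between 'happy' and 'neutral'.
s = h - q


def steer(activation, steering_vector, alpha):
    """Return the activation shifted by alpha times the steering vector."""
    return activation + alpha * steering_vector


x = rng.normal(size=d_model)  # some other layer-n activation to be steered
x_steered = steer(x, s, alpha=2.0)
```

With `alpha=1.0`, steering the neutral activation `q` recovers `h` exactly, which is a quick sanity check that `s` points from 'neutral' towards 'happy'.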
The tensor network architecture is scale invariant, which (by my understanding) means that scaling the activation

HunterJay3d

Tensor-Transformer Variants are Surprisingly Performant

With more thinking, I was broadly wrong here:

- If you add a steering vector, it's not just scaling, so scale invariance doesn't make a difference.

- If you scale an existing activation vector which makes up the entirety of one of the layers, the only effect would be to change the absolute magnitudes going into the softmax (since scale invariance means the relative magnitude at each position is the same). That could have some minor effect -- changing the probability distribution to be sharper or flatter, but that's all.

- If you scale some existing activation which is not an entire layer, then it's no longer scale invariant either; it's kind of like adding a steering vector with zero magnitude in the other dimensions.
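A small sketch of the second point, assuming (as a simplification, not something established about the actual architecture) that scaling a whole-layer activation by a positive constant just multiplies the logits going into the softmax by a positive constant `c`. The logit values are made up for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def entropy(p):
    """Shannon entropy in nats; lower means a sharper distribution."""
    return float(-np.sum(p * np.log(p)))


logits = np.array([2.0, 1.0, 0.5])  # illustrative values only

results = {}
for c in (0.5, 1.0, 2.0):
    p = softmax(c * logits)  # assumption: scaling shows up as c * logits
    results[c] = (int(np.argmax(p)), float(p.max()), entropy(p))
    print(f"scale={c}: argmax={results[c][0]}, "
          f"max_prob={results[c][1]:.3f}, entropy={results[c][2]:.3f}")
```

The top-ranked token stays the same at every scale; only the sharpness of the distribution changes, with larger `c` giving a higher maximum probability and lower entropy.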

There is still a weak advantage for steering vectors in a tensor network because the change is going to be smooth, rather than discrete (since we're not flipping gates on and off), but basically I was just confused here, sorry about that.

Replying to My AI Predictions 2023 - 2026

HunterJay4d

My AI Predictions 2023 - 2026

Another year has passed, 27 months total. Time for another review!

2023 Predictions for 2024, reassessed:

First, some predictions for 2024 were wrong because they hadn’t happened yet. Of those, let’s see how wrong I was -- did they happen in 2025 instead?

“Agents can do basic tasks on computers -- like filling in forms, working in excel, pulling up information on the web, and basic robotics control. This reaches the point where it is actually useful for some of these things”
- <1 year off | I rated this as ‘Debatable’ last year, based on Claude with Computer Use and Figure’s robot control. Today, Claude for Chrome and Claude Code, Codex, etc can clearly do these

...

HunterJay5d

Also piggybacking, if anybody is Sydney-based or visiting Sydney, you are welcome to work out of the SydneyAISafetySpace.org (SASS) for free.

Replying to Tensor-Transformer Variants are Surprisingly Performant

HunterJay6d*

Tensor-Transformer Variants are Surprisingly Performant

The fact that tensor network architectures are scale-invariant seems underappreciated for useful steering. If my understanding is correct, it would mean that scaling the steering vector should cause the ~~same~~ ~~pathways through the model to be activated, whereas without this we could be activating a totally different pathway, and get much less predictable behaviour.~~

Correction Below

Claude Opus 4.6 is Driven

HunterJay

Claude is driven to achieve its goals, possessed by a demon, and raring to jump into danger. These are my impressions from the first day of usage. Epistemic status: personal observations and quotes from more reliable sources.

____

Today Claude Opus 4.6 was launched along with an update to Claude Code which enabled a ‘teams’ mode (also known as an Agent Swarm). The mode sets up multiple agents to run in parallel with a supervisor, and provides them with methods of communicating between themselves. Here are my impressions after a morning with Claude!

Using the Agent Swarm

The first thing I did was spin up a team to try and make code improvements to an existing repository for...

Replying to My AI Predictions 2023 - 2026

HunterJay1y

My AI Predictions 2023 - 2026

I agree, I definitely underestimated video. Before publishing, I had a friend review my predictions and they called out video as being too low, and I adjusted upward in response and still underestimated it.

I'd now agree with 2026 or 2027 for coherent feature film length video, though I'm not sure if it would be at feature film artistic quality (including plot). I also agree with Her-like products in the next year or two!

Personally I would still expect cloud compute to be used for robotics, but only in ways where latency doesn't matter (like a planning and reasoning system on top of a smaller local model, doing deeper analysis like "There's a...

Replying to My AI Predictions 2023 - 2026

HunterJay1y

My AI Predictions 2023 - 2026

One year and 3 months on, I'm reviewing my predictions! Overall, I mark 13 predictions as true or mostly true, 6 as false or mostly false, and 3 as debatable.

Rest of 2023

Small improvements to LLMs
- Google releases something competitive to ChatGPT.
  - Mostly True | Google had already released Bard at the time, which sucked, but this was upgraded to Gemini and relaunched in December 2023. Gemini Ultra wasn’t released until February 2024 though, so points off for that.
- Anthropic and OpenAI slightly improve GPT-4 and Claude 2
  - True | GPT-4 Turbo and Claude 2.1 were both released in November 2023.
- Meta or another group releases better open source models, up to around GPT-3.5 level.
  - False | Llama 2 had already been

...

Replying to Superintelligent AI is possible in the 2020s

HunterJay2y

Superintelligent AI is possible in the 2020s

You might be right -- and whether the per-dollar gains were higher or lower than expected would be interesting to know -- but I just don't have any good information on this! If I'd thought of the possibility, I would have added it in Footnote 23 as another speculation, but I don't think what I said is misleading or wrong.

For what it's worth, in a one year review from Jacob Steinhardt, increased investment isn't mentioned as an explanation for why the forecasts undershot.

Superintelligent AI is possible in the 2020s

HunterJay

Back in June 2023, Soroush Pour and I discussed AI timelines on his podcast, The AGI Show. The biggest difference between us was that I think “machines more intelligent than people are likely to be developed within a few years”, and he thinks that it’s unlikely to happen for at least a few decades.^[1]

We haven’t really resolved our disagreement on this prediction in the year since, so I thought I would write up my main reasons for thinking we’re so close to superintelligence, and why the various arguments made by Soroush (and separately by Arvind Narayanan and Sayash Kapoor) aren’t persuasive to me.

Part 1 - Why I Think We Are Close

Empirically

You can pick pretty...

Replying to My AI Predictions 2023 - 2026

HunterJay2y

My AI Predictions 2023 - 2026

10x per year for compute seems high to me. Naïvely I would expect the price/performance of compute to double every 1-2 years as it has been forever, with overall compute available for training big models being a function of that + increasing investment in the space, which could look more like one-time jumps. (I.e. a 10x jump in compute in 2024 may happen because of increased investment, but a 100x increase by 2025 seems unlikely.) But I am somewhat uncertain of this.
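To make the arithmetic concrete, here is a rough sketch. It assumes that total training compute grows as price/performance times spending, which is a simplification, and the function names are hypothetical, chosen for this example.

```python
def annual_factor(doubling_time_years):
    """Yearly growth factor implied by a given doubling time."""
    return 2.0 ** (1.0 / doubling_time_years)


def required_spend_factor(target_growth, doubling_time_years):
    """Yearly spending growth needed to hit target_growth in total compute,
    assuming compute = price/performance x spending (a simplification)."""
    return target_growth / annual_factor(doubling_time_years)


target = 10.0  # the 10x-per-year figure under discussion
for t in (1.0, 2.0):
    print(f"doubling every {t:g} yr: hardware {annual_factor(t):.2f}x/yr, "
          f"spending must grow {required_spend_factor(target, t):.2f}x/yr")
```

Under these assumptions, price/performance alone contributes roughly 1.4x to 2x per year, so sustaining 10x per year would need spending to grow about 5x to 7x every year.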

For parameters, I definitely think the largest models will keep getting bigger, with compute being the big driver of that -- but also I would expect improvements like mixture...

Replying to My AI Predictions 2023 - 2026

HunterJay2y

My AI Predictions 2023 - 2026

I wrote this late at night, so to clarify and expand a little bit:

- "Work on more than one time scale" I think is actually an interesting idea to dwell on for a second. Like, when a person is trying to solve a problem, they will often pace back and forth, or talk, etc. They don't have to do everything in one pass; somehow the complex computation which lets them see and move around can work on a very fast time scale, while other problem solving is going on simultaneously, and only starts to affect motor outputs later on. That's interesting. The spinal cord doing processing independent of the brain thing I...

My AI Predictions 2023 - 2026

HunterJay

Epistemic status: My mostly intuitive guesses, with only a few days spent dwelling on them and no serious research beyond what I already knew.

I work in the startup sphere, in field robotics, and I am about to have an opportunity to majorly shift what I am working on. To work out what projects might make sense on a multi-year time frame, I wrote up what I thought might happen in AI in the next couple of years as specifically as I could.

I found the exercise surprisingly useful. It turned a whole bunch of vague "X will get better over time" into actionable "X will be practical in around Y years". I don't...

Minimum Viable Alignment

HunterJay

What is the largest possible target we could have for aligned AGI?

That is, instead of creating a great and prosperous future, is it possible that we can find an easier path to align an AGI by aiming for the entire set of 'this-is-fine' kind of futures?

For example, a future where all new computers are rendered inoperable by malicious software. Or a future where a mostly-inactive AI does nothing except prevent any superintelligence from forming, or that continuously tries to use up all of the available compute in the world.

I don't believe there is a solution here yet either, but could relaxing the problem from 'what we actually want' to 'anything we could live with' help? Has there been much work in this direction? Please let me know what to search for if so. Thank you.