LESSWRONG

anaguma

Comments
Rauno's Shortform
anaguma · 13h · 40

Do you currently work at OpenAI?

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety
anaguma · 3d · 32

Isn’t there a KL divergence term from the base model, like is done with RLHF?
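For reference, the KL term the comment asks about can be sketched in plain Python. The logits, vocabulary size, and beta value below are toy assumptions for illustration, not taken from any particular RLHF implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a toy vocabulary.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_penalty(policy_logits, base_logits, beta=0.1):
    # beta * KL(policy || base) at one token position: the penalty
    # that keeps the RL-tuned policy close to the frozen base model.
    p = softmax(policy_logits)
    q = softmax(base_logits)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return beta * kl

# Identical distributions incur no penalty; a shifted policy does.
print(kl_penalty([1.0, 2.0, 0.5], [1.0, 2.0, 0.5]))       # 0.0
print(kl_penalty([2.0, 1.0, 0.5], [1.0, 2.0, 0.5]) > 0)   # True
```

In practice this penalty is computed per token over the sampled rollout and subtracted from (or folded into) the reward, which is the mechanism the question is pointing at.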

10x more training compute = 5x greater task length (kind of)
anaguma · 5d · 30

With sufficiently large batch sizes for rollouts, why should we expect lower utilization than pretraining?

Energy-Based Transformers are Scalable Learners and Thinkers
anaguma · 9d · 20

Interesting, I was looking at figure 7, but that seems to be a much smaller run. I retract my original comment.

Energy-Based Transformers are Scalable Learners and Thinkers
anaguma · 10d · 10

Unfortunately they only extended the scaling curves to ~10B tokens, roughly 3 OOMs less than the data used to train frontier models. So it’s unclear whether this will work at scale, and the fact that they didn’t extend it further is some evidence against it working.
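The order-of-magnitude gap can be checked directly. The frontier token count below is an assumed ballpark (~10T tokens), not a figure from the paper:

```python
import math

# Assumed ballpark figures: the paper's curves stop near 10B tokens,
# while frontier models are trained on very roughly 10T tokens.
paper_tokens = 10e9
frontier_tokens = 10e12
gap_in_ooms = math.log10(frontier_tokens / paper_tokens)
print(gap_in_ooms)  # 3.0
```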

[This comment is no longer endorsed by its author]
How much novel security-critical infrastructure do you need during the singularity?
anaguma · 14d · 42

Also, AI companies might write a bunch of security-critical infra code really fast because they suddenly decide to build or physically reconfigure their datacenter really fast in order to increase physical security. These security requirements might rule out the possibility of them using an existing cloud compute provider, potentially necessitating a bunch of novel code.


If the models are scheming, doing so seems like a risky proposition.

AI #122: Paying The Market Price
anaguma · 19d · 10

The scripts are misconfigured, but the behavior is qualitatively much more agentic than previous models.

More capable models show qualitatively new scheming behavior. For example, Opus-4-early creates fake legal documentation, establishes persistence mechanisms, makes multiple hidden back-ups, creates a fake press release, and sets up an automated Cron job – all in one roll-out.

The scripts are misconfigured, but the behavior is qualitatively much more agentic than previous models.

The second sentence is duplicated.

LLMs are (mostly) not helped by filler tokens
anaguma · 22d · 10

Has anyone done research on filler tokens since this was published?

Foom & Doom 1: “Brain in a box in a basement”
anaguma · 24d · 10

Thanks!

Posts

3 · 2025 Alignment Predictions [Question] · 6mo · 3
2 · anaguma's Shortform · 7mo · 13
5 · OpenAI o1 + ChatGPT Pro release · 7mo · 0
11 · Anthropic - The case for targeted regulation · 8mo · 0