
Nullity
Karma: 7040

Comments

Daniel Kokotajlo's Shortform
Nullity · 2mo · 20

I wouldn’t worry too much about these. It’s not at all clear that all the alignment researchers moving to Anthropic is net-negative, and for AI 2027, the people who are actually inspired by it won’t care too much if you’re being dunked on.

Plus, I expect basically every prediction about the near future to be wrong in some major way, so it’s very hard to determine what actions are net negative vs. positive. It seems like your best bet is to do whatever has the most direct positive impact.

Thought this would help, since these worries aren’t productive, and anything you do in the future is likely to lower p(doom). I’m looking forward to whatever you’ll do next.

quetzal_rainbow's Shortform
Nullity · 4mo · 30

I don’t understand how this is an example of misalignment—are you suggesting that the model tried to be sycophantic only in deployment?

leogao's Shortform
Nullity · 4mo · 30

I usually think of execution as compute and direction as discernment. Compute = ability to work through specific directions effectively, discernment = ability to decide which of two directions is more promising. Roughly speaking, success is upper-bounded by the product of the two.
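
As a toy formalization (my own gloss, and the setup is purely illustrative): say discernment is the probability $d$ of picking the more promising of two directions, and compute is the probability $c$ of then executing the chosen direction well enough for it to pay off, with the worse direction paying off nothing. The expected payoff is then roughly

$$\mathbb{E}[\text{payoff}] \approx c \cdot d \cdot V,$$

where $V$ is the value of the better direction executed well, which is where the "product of the two" bound comes from.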

AI 2027 is a Bet Against Amdahl's Law
Nullity · 4mo · 30

New commenter here; I think this is a great post. I think the distribution given by AI 2027 is actually close to correct, and is maybe even too slow (I would expect SAR+ to give a bit more of a multiplier to R&D). It seems like most researchers are assuming that ASI will look like scaled LLMs + scaffolding, but I think that transformer-based approaches will be beaten out by other architectures at around SAR level, since transformers were designed to be language predictors rather than reasoners.

This makes my most likely paths to ASI either “human researchers develop new architecture which scales to ASI” or “human researchers develop LLMs at SC-SAR level, which then develop new architecture capable of ASI”. I also think a FOOM-like scenario with many OOMs of R&D multiplier is more likely, so once SIAR comes along there would probably be at most a few days to full ASI.

AI R&D is far less susceptible to Amdahl’s law than pretty much anything else, as it’s only bottlenecked on compute and sufficiently general intelligence. You’re right that if future AIs are about as general as current LLMs, then automation of AI R&D will be greatly slowed, but I see no reason why generality won’t increase in the future.
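
To spell out what I mean by "susceptible" (the numbers here are purely illustrative, not taken from AI 2027): Amdahl's law says that if a fraction $p$ of a workflow can be sped up by a factor $s$, the overall speedup is

$$S = \frac{1}{(1 - p) + p/s},$$

so even as $s \to \infty$ the speedup is capped at $1/(1 - p)$. If, say, 20% of AI R&D stayed bottlenecked on humans, the multiplier could never exceed 5x no matter how fast the automated part ran. My claim is that as generality increases, the non-automatable fraction of AI R&D shrinks toward zero, so that cap keeps rising.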

Lastly, I think that many of the difficulties relating to training data (especially for specialist tasks) will become irrelevant in the future as AIs become more general. In other words, the AIs will be able to generalize from “human specialist thought in one area” to “human specialist thought in X” without needing training data in the latter.

I agree that without these assumptions, the scenario in AI 2027 would be unrealistically fast.
