An independent researcher of ethics, AI safety, and AI impacts. Twitter: https://twitter.com/leventov. E-mail: leventov.ru@gmail.com (the preferred mode of communication).
You can help boost my sense of accountability and give me a feeling that my work is valued by becoming a paid subscriber of my Substack (though I don't post anything paywalled; in fact, on this blog I just syndicate my LessWrong writing).
A Telegram group where we discuss AI x-risk/safety, theories of intelligence, agency, consciousness, and ethics, in Russian: https://t.me/agi_risk_and_ethics.
It seems that the "ethical simulator" from point 1 and the LLM-based agent from point 2 overlap, so you just overcomplicate things if you make them two distinct systems. Instead, picture an LLM prompted with the right "system prompt" (virtue ethics), doing some branching-tree search for optimal plans according to a trained "utility/value" evaluator (consequentialism), and filtering out plans that contain actions which are always prohibited (law, deontology). The second component is the closest to what you described as an "ethical simulator", but is not quite it: the "utility/value" evaluator cannot say whether an action or a plan is ethical in absolute terms; it can only compare some plans proposed for the particular situation by some planner.
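The three-component architecture above can be sketched in a few lines. This is a minimal, hypothetical illustration: all names (`select_plan`, `value_of`, `is_prohibited`) and the toy evaluator are placeholders I'm introducing, not a real API, and the point is only to show how the deontic filter and the comparative evaluator compose.

```python
# Hypothetical sketch of combining a deontic filter (always-prohibited actions)
# with a comparative "utility/value" evaluator over candidate plans.
from typing import Callable, Optional

Plan = list[str]  # a plan is a sequence of candidate actions

def select_plan(
    candidate_plans: list[Plan],
    value_of: Callable[[Plan], float],     # trained evaluator (consequentialism)
    is_prohibited: Callable[[str], bool],  # deontic filter over individual actions
) -> Optional[Plan]:
    # Deontology: drop any plan containing an always-prohibited action.
    permitted = [p for p in candidate_plans
                 if not any(is_prohibited(a) for a in p)]
    if not permitted:
        return None
    # Consequentialism: the evaluator only *compares* the remaining plans;
    # it does not certify any plan as ethical in absolute terms.
    return max(permitted, key=value_of)

# Toy usage with stand-in components:
plans = [["lie", "profit"], ["negotiate", "share"], ["wait"]]
banned = {"lie", "steal"}
best = select_plan(plans, value_of=len, is_prohibited=lambda a: a in banned)
print(best)  # -> ['negotiate', 'share']
```

Note that the virtue-ethics component (the system prompt shaping which candidate plans get generated at all) sits upstream of this selection step and is not shown here.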
What is the right mathematical language in which to talk about modularity, boundaries, etc?
I think this is an ill-posed question. Boundaries and modularity could be discussed in the context of different mathematical languages/frameworks: quantum mechanics, random dynamical systems formalism, neural network formalism, whatever. All these mathematical languages permit talking about information exchange, modularity, and boundaries. Cf. this comment.
Even if we reformulate the question as "Which mathematical language permits identifying boundaries [of a particular physical system, because asking this question in the abstract for any system also doesn't make sense] most accurately?", then the answer probably depends on the meta-theoretical (epistemological) framework that the scientist who asks this question applies to themselves.
Why are biological systems so modular? To what extent will that generalize to agents beyond biology?
See section 3. "Optimization and Scale Separation in Evolving Systems" in "Toward a theory of evolution as multilevel learning" (Vanchurin et al., 2022).
Also, see Michael Levin's work on "multiscale competency architectures". Fields, Levin, et al. apply this framework to ANNs in "The free energy principle induces neuromorphic development" (2022), see sections 2 and 4 in particular. This paper also addresses the question "How do modules/boundaries interact with thermodynamics - e.g. can we quantify the negentropy/bits-of-optimization requirements to create new boundaries/modules, or maintain old ones?"
To what extent do boundaries/modules typically exist "by default" in complex systems, vs require optimization pressure (e.g. training/selection) to appear?
Dalton Sakthivadivel showed here that boundaries (i.e., sparse couplings) do exist and are "ubiquitous" in high-dimensional (i.e., complex) systems.
You think «membranes» will not be able to be formalized in a consistent way, especially in a way that is consistent across different levels of modeling
No, I think membranes could be formalised (Markov blankets, objective "joints" of the environment as in https://arxiv.org/abs/2303.01514, etc.; though theory-laden, I think that the "diff" between the boundaries identifiable from the perspective of different theories is usually negligible).
We humans intrude into each other's boundaries, and into the boundaries of animals, organisations, communities, etc., all the time. A surgeon intruding into the boundaries of a patient is an ethical thing to do. If an AI automated the entire economy, waited until humanity completely lost the ability to run civilisation on its own, and then suddenly stopped all maintenance of the automated systems that support human lives, watching humans die out because they cannot support themselves, it would be "respecting humans' boundaries", but it would also be an evil treacherous turn. Messing with Hitler's boundaries (i.e., killing him) in 1940 would have been an ethical action from the perspective of most systems that might care about it (individual humans, organisations, countries, communities).
I think that boundaries (including consciousness boundaries: what is the locus of animal consciousness? Just the brain, the whole body, or does it even extend beyond the body? What is the locus of AI's consciousness?) are an undeniably important concept, usable for inferring ethical behaviour. But I don't think a simple "winning" deontology is derivable from this concept. I'm currently preparing an article in which I describe how, from the AI engineering perspective, deontology, virtue ethics, and consequentialism can be seen as engineering techniques (approaches) that help to produce and continuously infer an ethical style of behaviour. None of these "classical" approaches to normative ethics is either necessary or sufficient, but they all could help improve ethics in some cognitive architectures.
Getting traction on the deontic feasibility hypothesis
Davidad believes that using formalisms such as Markov Blankets would be crucial in encoding the desiderata that the AI should not cross boundary lines at various levels of the world-model. We only need to “imply high probability of existential safety”, so according to davidad, “we do not need to load much ethics or aesthetics in order to satisfy this claim (e.g. we probably do not get to use OAA to make sure people don't die of cancer, because cancer takes place inside the Markov Blanket, and that would conflict with boundary preservation; but it would work to make sure people don't die of violence or pandemics)”. Discussing this hypothesis more thoroughly seems important.
I think no finitely-specified deontology would ensure existential safety; even more likely, following just a finite deontology (such as "don't interfere with others' boundaries") can lead to a dystopian scenario for humanity.
In my current meta-ethical view, ethics is a style of behaviour (i.e., dynamics of a physical system) that is inferred by the system (or its supra-system, such as in the course of genetic or cultural evolution). The style could be characterised/described in the context of multiple different (or, perhaps infinitely many) modelling frameworks/theories for describing the dynamics of the system (perhaps, on various levels of description). Examples of such modelling frameworks are "raw" neural dynamics/connectomics (note: this is already a modelling framework, not the "bare" reality!), Bayesian Brain/Active Inference, Reinforcement Learning, cognitive psychology, evolutionary game theory, etc. All these theories would lead to somewhat different descriptions of the same behaviour which don't completely cover each other[1].
It seems easy to find counterexamples in which intruding into someone's boundaries is the ethical thing to do and abstaining from doing so would be highly unethical. Sorting out multilevel conflicts/frustrations between infinitely many system/boundary partitions of the world[2] in the context of infinitely many theoretical frameworks (such as quantum mechanics[3], the neural network framework[4], the theory of conscious agents[5], etc.) should guide the attainment of the best ethical style that we (and AI agents) can reach, but I think it could hardly be captured by a single deontic rule.
However, in "Mathematical Foundations for a Compositional Account of the Bayesian Brain" (2022), Smithe establishes that it might be possible to formally convert between these frameworks using category theory.
Vanchurin, V., Wolf, Y. I., Katsnelson, M. I., & Koonin, E. V. (2022). Toward a theory of evolution as multilevel learning. Proceedings of the National Academy of Sciences, 119(6), e2120037119. https://doi.org/10.1073/pnas.2120037119
Fields, C., Friston, K., Glazebrook, J. F., & Levin, M. (2022). A free energy principle for generic quantum systems. Progress in Biophysics and Molecular Biology, 173, 36–59. https://doi.org/10.1016/j.pbiomolbio.2022.05.006
Vanchurin, V. (2020). The World as a Neural Network. Entropy, 22(11), 1210. https://doi.org/10.3390/e22111210
Hoffman, D. D., Prakash, C., & Prentner, R. (2023). Fusions of Consciousness. Entropy, 25(1), 129.
All the critiques focus on MI not being effective enough at its ultimate purposes: primarily interpretability, and secondarily, I guess, finding adversaries, and maybe something else?
Did you seriously think through whether interpretability and/or finding adversaries, or some specific aspects or kinds of either, could be net negative for safety overall? Such concerns were contemplated in "AGI-Automated Interpretability is Suicide", "AI interpretability could be harmful?", and "Why and When Interpretability Work is Dangerous". That said, I think none of the authors of these three posts is an expert in interpretability or adversaries, so it would be really interesting to see your thinking on this topic.
The real question is not whether (mechanistic) interpretability is helpful, but whether it could also be "harmful", i.e., speed up capabilities without delivering commensurate or higher improvements in safety (Quintin Pope also talks about this risk in this comment), or by creating a "foom overhang" as described in "AGI-Automated Interpretability is Suicide". Good interpretability also creates an infosec/infohazard attack vector, as I described here.
Thus, the "theory of impact" for interpretability should not just list its potential benefits, but also explain why these benefits are expected to outweigh the potential harms, timeline shortening, and new risks.