Feedback welcome: www.admonymous.co/mo-putera
Long-time lurker (c. 2013), recent poster. I also write on the EA Forum.
For my own reference: some "benchmarks" (very broadly construed) I pay attention to.
An interesting quote from Rafa Fernández, host of the Protocols for Business special interest group (SIG), on the downstream consequences of locally speeding up output production with LLMs in business processes, from his essay Finding Fault Lines within the Firm:
AI is usually discussed in terms of automation or productivity. Those framings are not wrong, but they miss what makes AI adoption particularly revealing from a protocol perspective. While much of the public discussion frames AI in terms of cost-savings or new markets, our SIG has been focusing on the pressure it places on current coordination systems by changing the speed and scale at which work is produced ...
Across the SIG’s discussions, interviews, and readings, a consistent pattern has emerged. Under AI adoption, the first thing that stops working smoothly seems unintuitive: time.
This became clear when our group reviewed Blake Scholl’s writing on Boom Supersonic. Here, Scholl distinguishes between at least two clocks operating inside the same organization. The first is the calendar: project timelines, milestones, and delivery dates. The second is what he calls the Slacker Index: the amount of time engineers spend waiting – on inputs, approvals, dependencies, or external constraints – rather than building. Even in well-run, safety-critical organizations, these clocks coexist.
Under stable conditions and in mature industries, this alignment is usually implicit. Engineering velocity, supplier lead times, regulatory review cycles, and internal decision-making rhythms evolve together. At Boom, hardware design, simulation, testing, and supplier manufacturing are paced to one another. Slower clocks constrain faster ones in predictable ways. Waiting is visible, expected, and priced into the system.
As Scholl points out, AI-enabled production changes the speed and scale of production. Certain forms of work – design iteration, analysis, documentation, internal review – can suddenly accelerate by orders of magnitude. From the perspective of the Slacker Index, local waiting collapses. Yet the calendar will not automatically follow. Supplier lead times remain fixed. Certification processes still unfold at human and institutional speeds. External partners continue to operate on contractual and regulatory time.
The consequence of AI-enabled opportunity is temporal divergence (a topic explored in depth by SIG member Sachin). Some clocks speed up sharply while others remain unchanged. At Boom, this would mean design teams outrunning suppliers, simulations outrunning manufacturing feedback, or internal decision cycles outrunning the capacity of external partners to respond. The Slacker Index may improve locally – less waiting to produce – but worsen systemically as downstream dependencies fall behind.
AI systems further amplify this effect in two ways. First, they generate outputs without passing through the durations that normally situate work, creating a dizzying disorientation. ... Knowledge accumulates faster than it can be evaluated, integrated, or acted upon.
Second, AI software using LLMs can be contextually misaligned: it draws on data that is often years out of date (a model trained up to 2024, used in 2026) and produced outside the local business context. From this lens, the recent focus on improving AI product memory seems intuitive. Efforts such as RAG, MCP, skills, and even “undo” prompt features become attempts to realign probabilistic software into business context, tempo, and authority.
Safety-critical organizations like Boom make these dynamics visible precisely because they cannot simply collapse time. Hardware, suppliers, and regulators enforce non-negotiable rhythms. When AI accelerates internal work without moving those external clocks, coordination strain surfaces quickly. Slack accumulates in unfamiliar places, with no protocols available to redistribute it.
When time regimes fall out of alignment, coordination problems and opportunities change form. Delays no longer appear as isolated errors that can be corrected locally. Instead, organizations experience escalating tensions: pressure to act without corresponding capacity to review, decide, or remember.
So how have orgs adapted? Three categories of examples:
When shared assumptions about time lose coherence, organizations first adapt within current structures. Work continues by absorbing friction rather than resolving its source.
One visible form of this absorption is Boom’s solution: integrate vertically. The critical move was purchasing their own large-scale manufacturing equipment rather than continuing to rely on external suppliers whose lead times dominated the schedule. Supplier queues and fabrication delays had become the governing clock for the entire program, producing a high Slacker Index: engineers were ready to iterate, but progress stalled while waiting on parts. By acquiring the machine, Boom internalized that bottleneck and converted supplier wait time into an internal, controllable process. This collapsed a multi-month external dependency into a shorter, iterable internal cycle, allowing design, testing, and manufacturing to co-evolve rather than queue sequentially.
Another response was novel translation work. The SIG discussed the fast-growing Forward Deployed Engineer role, emerging to help mediate between fast-moving demands and slower-moving infrastructure. Their task is not to eliminate the mismatch, but to work across it and leverage it – adjusting scope, translating intent, and negotiating constraints as they appear. This work allows organizations to keep operating even as tempos diverge, and to gain a competitive advantage in the process. At its best, the work defines the operating model. This is the case for Palantir and large AI labs like OpenAI and Anthropic.
Other adaptations the SIG encountered took the form of operational formalization: AI usage guidelines, governance documents, digitized ontologies. These measures make previously tacit constraints visible without altering the structures that produced the misalignment. They stabilize behavior at the margin while leaving underlying coordination regimes intact.
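The essay's point about RAG as realignment is worth making concrete. Here's a minimal sketch of the idea (mine, not the essay's): the model's weights encode stale, non-local knowledge, so retrieval prepends fresh, business-specific documents to each prompt. Everything here is illustrative: the sample documents are invented, bag-of-words cosine stands in for learned embeddings, and the final print stands in for a real model call.

```python
# Toy illustration of RAG as "context realignment" (my sketch, not the essay's).
# A model trained up to 2024 knows nothing about this week's supplier delays,
# so we retrieve recent internal documents and prepend them to the prompt.
import math
from collections import Counter

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: similarity(query, d), reverse=True)[:k]

# Recent, local, business-specific context the base model has never seen
# (hypothetical examples in the spirit of the Boom discussion above).
internal_docs = [
    "2026-01-12: supplier lead time for titanium parts slipped from 6 to 14 weeks.",
    "2026-01-15: certification review for the intake redesign is paused pending feedback.",
    "2024 handbook: standard supplier lead time is 6 weeks.",
]

query = "How long should we expect to wait for titanium parts?"
context = "\n".join(retrieve(query, internal_docs))

# The augmented prompt is what actually reaches the LLM: stale weights,
# fresh local context. (Replace the print with a call to your model API.)
prompt = f"Context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

The point of the sketch is that nothing about the model changes; the realignment happens entirely in what the organization chooses to put in front of it, which is why these efforts read as coordination work rather than capability work.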
Was the editor's note written by an LLM?
Gemini 3 Pro analogized Scott Alexander to a beaver when I asked it to make sense of him, because "Scott is a keystone individual" and "in ecology, a keystone species (like the beaver) exerts influence disproportionate to its abundance because it creates the ecosystem in which others live":
- He built the dam (The Community/Lighthaven) that pooled the water.
- He signaled where the food was (Grants/Open Threads).
- He warned of the predators (Moloch/AI Risk).
This was mildly funny. It was also striking how many factual details it got wrong (in the rest of the response, that is, not the beaver analogy), details which to an outsider might sound plausible if dramatic.
The essay's examples show how those types of payments decorrelate at the extremes, which makes it useful to think about improving one's "payment allocation".
Bit tangential: re: your sequence name "civilization is FUBAR", I get the FU, but why BAR? Maybe I'm just in too much of a progress-vibed bubble?
I asked a bunch of LLMs with websearch to try and name the classic mistake you're alluding to:
To be honest, these just aren't very good; they usually do better at naming half-legible vibes.
Just learned about the Templeton World Charity Foundation (TWCF), which is unusual in that one of their 7 core funding areas is, explicitly, 'genius':
Genius
TWCF supports work to identify and cultivate rare cognitive geniuses whose work can bring benefits to human civilization.
In this context, geniuses are not simply those who are classified as such by psychometric tests. Rather, they are those who: (1) generate significant mathematical, scientific, technological, and spiritual discoveries and inventions that benefit humanity or have the potential to transform human civilization, and (2) show exceptional cognitive ability, especially at an early age.
Eligible projects may include research on the benefits of various attributes of geniuses to humanity, biographical studies of individual geniuses, comparisons of groups of geniuses with various levels of cognitive abilities, and projects that facilitate the spread of creative insights, discoveries, and original ideas of geniuses. Projects may also investigate genetic factors contributing to genius, and the cultural and nurturing factors that engender geniuses who contribute to such cognitive virtues as diligence, constructive thinking, and noble purposes. Ineligible projects include physical, musical, or artistic geniuses; spelling bees; geniuses with spectacular memory; and scholarships for geniuses.
Among the 613 projects they've funded so far, 7 grants come up if you search for 'genius', all between 2013 and 2018, so I'm not sure why they've since stopped. Some of the largest grants:
Yeah the "pi like you" was a reference to that passage.
Yeah, this was the source of much personal consternation when I left my operations-heavy career path in industry to explore research roles, as much as I found the latter more intrinsically exciting.
It's also what's always back-of-mind w.r.t. the alignment-related work I'm most excited by, even though part of why I'm excited about it is how relatively empirically grounded it is.
Not exactly comparable to the AI Village's open-ended long-horizon tasks above, but it's interesting what Cursor found on their project to build a web browser from scratch (GitHub), totaling >1M LoC across 1k files, running "hundreds of concurrent agents" for a week: it's the opposite of what I'd have predicted just from how much more useful Claude is vs comparable-benchmark models. Also, "GPT-5.2 is a better planner than GPT-5.1-codex, even though the latter is trained specifically for coding": what's up with that?