Tags: Abstraction · Natural Abstraction · AI · World Modeling
Synthesizing Standalone World-Models, Part 3: Dataset-Assembly

by Thane Ruthenis · 25th Sep 2025 · AI Alignment Forum · 3 min read
Previous: Synthesizing Standalone World-Models, Part 2: Shifting Structures
Next: Synthesizing Standalone World-Models, Part 4: Metaphysical Justifications
2 comments, sorted by top scoring
Morpheus
> What are high-level ways to formalize the dataset-assembly subproblem?
> What are some heuristics for solving this subproblem?
> How should we think about/model the problem of solving all three subproblems jointly?

I have read the first summary post and this one. I have only skimmed your abstraction posts etc., so maybe I am missing something here. But if you are ultimately aiming for some particular pivotal act routed through human understanding, I think you should spend at least 1-6 months trying to just solve that pivotal act (or trying multiple if it turns out that particular one seems less tractable). Look at what type of knowledge and training data are useful for your brain, and go for understanding directly rather than trying to route it indirectly through an autoencoder that you don't know how to train yet. I am reasonably confident your plan is harder than just trying to go for, say, adult human intelligence enhancement directly. But even if that is false, I am very confident you will ultimately save a bunch of time by investing enough time in this step. You will save time when training that autoencoder, when debugging and validating any algorithms you run on top of your autoencoder, and when trying to learn concepts from your autoencoder. For example, if you go for adult genetic intelligence enhancement, you are going to run into peculiarities of genetics, and I think it's just easier to learn about them directly from a textbook optimized for pedagogy and not just for short description length. This should be your first step, not your last step. Listen to Andrej Karpathy and become one with the data!

Thane Ruthenis

My response here would be similar to this one. I think there's a kind of "bitter lesson" here: for particularly complex fields, it's often easier to solve the general problem of which that field is an instance, rather than attempting to solve the field directly. For example:

  • If you're trying to solve mechanistic interpretability, studying a specific LLM in detail isn't the way; you'd be better off trying to find methods that generalize across many LLMs.
  • If you're trying to solve natural-language processing, it turns out that tailor-made methods are dramatically outperformed by general-purpose generative models (LLMs) trained by a general-purpose search method (SGD).
  • If you're trying to advance bioscience, you can try building models of biology directly, or you can take the aforementioned off-the-shelf general-purpose generative model, dump biology data into it, and get a tool significantly ahead of your manual efforts.
  • Broadly, LLMs/DL have "solved" or outperformed a whole bunch of fields at once, without even deliberately trying, simply as the result of looking for something general and scalable. 

Like, yeah, after you've sketched out your general-purpose method and you're looking for where to apply it, you'd need to study the specific details of the application domain and tinker with your method's implementation. But the load-bearing, difficult step is deriving the general-purpose method itself; the last-step fine-tuning is comparatively easy.

In addition, I'm not optimistic about solving e. g. interpretability directly, simply because there's already a whole field of people trying to do that, with fairly leisurely progress. On the intelligence-enhancement front, there would be mountains of regulatory red tape to go through, and the experimental loops would be rate-limited by slow human biology. Etc., etc.


This is part of a series covering my current research agenda. Refer to the linked post for additional context.

This is going to be a very short part. As I'd mentioned in the initial post, I've not yet done much work on this subproblem.
 


From Part 1, we more or less know how to learn the abstractions given the set of variables over which they're defined. We know their type signature and the hierarchical structure they assemble into, so we can just cast it as a machine-learning problem (assuming a number of practical issues are solved). For clarity, let's dub this problem "abstraction-learning".
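For intuition, here is a minimal sketch of "casting it as a machine-learning problem" — not the post's actual machinery. It assumes (my assumption, for illustration) that the abstraction's type signature is a low-dimensional summary of fixed low-level variables, in which case the simplest stand-in objective is a linear autoencoder, whose optimum is given by the top principal components. The toy generative setup and all the sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 20 low-level variables that are all noisy views of
# 2 hidden "abstract" factors. The abstraction to be learned is
# a 2-dimensional summary of the 20 observed variables.
n_samples, n_vars, n_factors = 5000, 20, 2
factors = rng.normal(size=(n_samples, n_factors))   # hidden abstractions
mixing = rng.normal(size=(n_factors, n_vars))       # how they show up low-level
observed = factors @ mixing + 0.1 * rng.normal(size=(n_samples, n_vars))

# Linear "abstraction-learning": the best rank-k summary under squared
# reconstruction loss is the top-k principal components (the optimum of
# a linear autoencoder).
centered = observed - observed.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
encoder = vt[:n_factors].T                # 20-dim -> 2-dim summary map
summary = centered @ encoder              # the learned abstraction
reconstruction = summary @ encoder.T      # decode back to the 20 variables

# The 2-dim summary captures nearly all the structure:
explained = 1 - np.sum((centered - reconstruction) ** 2) / np.sum(centered ** 2)
print(f"variance explained by 2-dim abstraction: {explained:.3f}")
```

The point of the sketch is just that, once the variables and the type signature are pinned down, "learn the abstraction" reduces to ordinary optimization.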

From Part 2, we more or less know how to deal with shifting/resampled structures. While the presence of specific abstractions doesn't uniquely lock down what other abstractions are present at higher/lower/sideways levels, we can infer a probability distribution over what abstractions are likely to be there, and then resample from it until we find one that works. Let's call this "truesight".
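A cartoon version of that resampling loop, with everything (the candidate structures, the prior, the consistency check) invented for illustration rather than taken from the post:

```python
import random

random.seed(0)

# Toy "truesight": we observe low-level samples but don't know which of
# several candidate structures (generating rules) produced them. We hold
# a distribution over candidates and resample until one fits.
candidates = {
    "parity":    lambda xs: sum(xs) % 2 == 0,
    "all_equal": lambda xs: len(set(xs)) == 1,
    "sorted":    lambda xs: xs == sorted(xs),
}
prior = {"parity": 0.5, "all_equal": 0.3, "sorted": 0.2}

# Observations actually produced by the "sorted" structure:
observations = [[1, 2, 3], [0, 0, 5], [2, 4, 9]]

def consistent(name):
    """A candidate structure 'works' if it holds on every observed sample."""
    return all(candidates[name](xs) for xs in observations)

# Resample from the inferred distribution until finding one that works:
names, weights = zip(*prior.items())
guess = random.choices(names, weights=weights)[0]
while not consistent(guess):
    guess = random.choices(names, weights=weights)[0]

print("inferred structure:", guess)
```

In the real problem the candidates are abstractions at adjacent levels rather than hand-written predicates, but the propose-check-resample shape is the same.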

Except, uh. Part 1 only works given the solution to Part 2's problem, and Part 2 only works given the solution to Part 1's problem. We can't learn abstractions before we've stabilized the structure/attained truesight, but we can't attain truesight until we learn what abstractions we're looking for. We need to, somehow, figure out how to learn them jointly.

This represents the third class of problems we need to solve: figuring out how to transform whatever data we happen to have into datasets for learning new abstractions. Such datasets would need to be isomorphic to samples from the same fixed (at least at a given high level) structure. Assembling them might require:

  • Figuring out what variables to consider.
  • Figuring out what samples of those variables to consider. (E. g., when learning Conway's glider, we'd only consider the samples of the first [0:3]×[0:3] set of cells for the first few time-steps. Similarly, in fluid dynamics, we use material derivatives to lock our viewpoint to a specific particle in a stream, instead of to a specific point in space – switching the representation from "spatial points are the low-level random variables" to "particles are the low-level random variables".)
  • Figuring out what functions of those variables to consider. (Which might involve effectively discarding specific "subvariables" x_i^j of a given variable x_i, or treating a bunch of variables as subvariables to get at their synergistic information.)

Call this "dataset-assembly".
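The glider case above can be sketched end-to-end. This is my illustration of the second bullet, not code from the post: run Conway's Game of Life, then make the dataset-assembly choices explicit — which variables (the [0:3]×[0:3] window of cells) and which samples (the first few time-steps, while the glider lives there):

```python
import numpy as np

def life_step(grid):
    """One step of Conway's Game of Life on a 0/1 array (toroidal edges)."""
    # Count live neighbours by summing the 8 shifted copies of the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

# A glider in the top-left corner of an otherwise empty board:
grid = np.zeros((10, 10), dtype=int)
grid[0:3, 0:3] = [[0, 1, 0],
                  [0, 0, 1],
                  [1, 1, 1]]

# Dataset-assembly choices: variables = the [0:3]x[0:3] window of cells;
# samples = the first few time-steps, while the glider is inside it.
dataset = []
for t in range(4):
    dataset.append(grid[0:3, 0:3].copy())
    grid = life_step(grid)

print(len(dataset), "samples of shape", dataset[0].shape)
```

After those 4 steps the glider has translated one cell down-right, so continuing to sample the same window would no longer capture the same object — exactly the "lock the viewpoint to the particle, not the point in space" issue from the fluid-dynamics analogy.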

Dataset-assembly has some overlap with the truesight problem, so the heuristic machinery for implementing them would be partly shared. In both cases, we're looking for functions over samples of some known variables that effectively sample from the same stable structure. The difference is whether we already know that structure or not.

Another overlap is with the heuristics I'd mentioned in 1.6, the ones for figuring out which subsets of variables to try learning synergistic/redundant-information variables for (instead of doing it for all subsets). Indeed, given the shifting-structures problem, those are actually folded into the heuristics for assembling abstraction-learning datasets!
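As a toy illustration of why synergistic information forces these heuristics (my example, not the post's): for y = x1 XOR x2 with independent uniform bits, neither input alone carries any information about the output, but the pair jointly determines it. A dataset-assembly procedure that scored variables only individually would never group x1 and x2 together:

```python
from collections import Counter
from math import log2

# y = x1 XOR x2, with (x1, x2) uniform over the four possibilities.
samples = [(x1, x2, x1 ^ x2) for x1 in (0, 1) for x2 in (0, 1)]

def mutual_information(pairs):
    """I(A;B) in bits from a list of (a, b) samples, uniformly weighted."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum(
        (c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
        for (a, b), c in p_ab.items()
    )

i_x1 = mutual_information([(x1, y) for x1, _, y in samples])            # 0 bits
i_x2 = mutual_information([(x2, y) for _, x2, y in samples])            # 0 bits
i_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])  # 1 bit

print(i_x1, i_x2, i_joint)
```

All the information about y is synergistic: it only appears once the subset {x1, x2} is considered jointly, which is why "which subsets to try" needs heuristics rather than exhaustive search over all subsets.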

Introspectively, in humans, "dataset-assembly" is represented by qualitative research, as well as by philosophical reasoning (or at least my model of what "philosophical reasoning" is). "Dataset-assembly heuristics" correspond to research taste: to figuring out what features of some new environment/domain to pay attention to, and which parts of reality could be meaningfully grouped together and decomposed into a new abstract hierarchy/separate field of study.

My thinking on the topic of dataset-assembly is relatively new, and isn't yet refined into a proper model/distilled into something ready for public consumption. Hence, this post is little more than a stub.

That said, I hope the overall picture of the challenge is now clarified. We need to figure out how to set up a process that jointly learns the heuristics for solving these three classes of problems.


What I'm particularly interested in here for the purposes of the bounties is... well, pretty much anything related, since the map is pretty blank. Three core questions:

  • What are high-level ways to formalize the dataset-assembly subproblem?
  • What are some heuristics for solving this subproblem?
  • How should we think about/model the problem of solving all three subproblems jointly?

(My go-to approach in such cases is to figure out several practical heuristics, go through a few concrete cases, then attempt to distill general principles/algorithms based on those analyses.)

Mentioned in: Research Agenda: Synthesizing Standalone World-Models