I'm not sure exactly what you're trying to learn here, or what debate you're trying to resolve. (Do you have a reference?)

If almost all the complexity is in architecture, you can have fast takeoff because the system doesn't work well until all the pieces are in place; or you can have slow takeoff in the opposite case. If almost all the complexity is in learned content, you can have fast takeoff because there are 50 million books and 100,000 years of YouTube videos and the AI can deeply understand all of them in 24 hours; or you can have slow takeoff because, for example, maybe the fastest supercomputers can only barely run the algorithm, and the algorithm gets slower and slower as it learns more until it eventually grinds to a halt, or something like that.

If an algorithm uses data structures that are specifically suited to doing Task X, and a different set of data structures that are suited to Task Y, would you call that two units of content or two units of architecture?

(I personally do not believe that intelligence requires a Swiss-army-knife of many different algorithms, see here, but this is certainly a topic on which reasonable people disagree.)

I'm not sure exactly what you're trying to learn here, or what debate you're trying to resolve. (Do you have a reference?)

I'm not entirely sure what I'm trying to learn here (which is part of what I was trying to express with the final paragraph of my question); this just seemed like a natural question to ask as I started thinking more about AI takeoff.

In "I Heart CYC", Robin Hanson writes: "So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior A

...

[ Question ]

Source code size vs learned model size in ML and in humans?

by riceissa · 1 min read · 20th May 2020 · 5 comments

11


To build intuition about content vs architecture in AI (which comes up a lot in discussions about AI takeoff that involve Robin Hanson), I've been wondering about content vs architecture size (where size is measured in number of bits).

Here's how I'm operationalizing content and architecture size for ML systems:

  • content size: The number of bits required to store the learned model of the ML system (e.g. all the floating point numbers in a neural network).
  • architecture size: The number of bits of source code. I'm not sure if it makes sense to include the source code of supporting software (e.g. standard machine learning libraries).
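As a minimal sketch of the two definitions above, the following Python measures both sizes in bits. The function names, the default of 32 bits per parameter, and the choice to count only `.py` files are all assumptions made for illustration; a real system might store weights at a different precision or compressed, and whether to count supporting libraries is left open, as noted above.

```python
import os


def content_size_bits(num_parameters: int, bits_per_parameter: int = 32) -> int:
    """Bits needed to store the learned parameters, uncompressed.

    bits_per_parameter=32 assumes single-precision floats (an assumption).
    """
    return num_parameters * bits_per_parameter


def architecture_size_bits(source_dir: str, extensions=(".py",)) -> int:
    """Bits of source code: total on-disk size of matching files under source_dir.

    Only files whose names end with one of `extensions` are counted, so
    supporting libraries are included only if they live under source_dir.
    """
    total_bytes = 0
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name.endswith(extensions):
                total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes * 8


# Hypothetical model with one million 32-bit parameters.
print(content_size_bits(1_000_000))  # 32000000
```

Pointing `architecture_size_bits` at a project directory gives one crude number; pointing it at the directory of an installed library gives a sense of how much the answer changes if supporting software is counted.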

I tried looking at the AlphaGo paper to see if I could find this kind of information, but after about 30 minutes I was unable to find what I wanted. I can't tell if this is because I'm not acquainted enough with the ML field to locate this information or if that information just isn't in the paper.

Is this information easily available for various ML systems? What is the fastest way to gather this information?

I'm also wondering about this same content vs architecture size split in humans. For humans, one way I'm thinking of it is as "amount of information encoded in inheritance mechanisms" vs "amount of information encoded in a typical adult human brain". I know that Eliezer Yudkowsky has cited 750 megabytes as the amount of information in human DNA, and has also emphasized that most of this information is junk. This was in 2011 and I don't know if there's a new consensus or how to factor in epigenetic information. There is also content stored in genes, and I'm not sure how to separate out the content and architecture in genes.
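For reference, here is a rough sketch of arithmetic that yields a figure of 750 megabytes. The inputs are assumptions rather than something stated in the cited source: a round 3 billion base pairs, 2 bits per base (four possible bases), and no compression.

```python
BASE_PAIRS = 3_000_000_000  # assumed round figure for the human genome
BITS_PER_BASE = 2           # four possible bases (A, C, G, T): log2(4) = 2 bits


def genome_size_megabytes(base_pairs: int = BASE_PAIRS,
                          bits_per_base: int = BITS_PER_BASE) -> float:
    """Uncompressed size of the sequence in megabytes (1 MB = 10**6 bytes)."""
    bits = base_pairs * bits_per_base
    return bits / 8 / 1_000_000


print(genome_size_megabytes())  # 750.0
```

This counts raw sequence only, so it is an upper bound on the information in the sequence itself and says nothing about epigenetic information or about how much of the sequence is junk.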

I'm pretty uncertain about whether this is even a good way to think about this topic, so I would also appreciate any feedback on this question itself. For example, if this isn't an interesting question to ask, I would like to know why.


1 Answer

2 points -

First, this information will be hard to compile because of the way the systems work, but compiling it seems like a very useful exercise. I would add that the program complexity should include some measure of the "size" of the hardware architecture as well as the libraries, etc. used.

Second, I think that for humans, the relevant size is not just the brain, but the information embedded in the cultural process used for education. This seems vaguely comparable to training data and/or architecture search for ML models, though the analogy should probably be clarified.