The Wrights invented the airplane using an empirical, trial-and-error approach. They had to learn from experience. They couldn’t have solved the control problem without actually building and testing a plane. There was no theory sufficient to guide them, and what theory did exist was often wrong. (In fact, the Wrights had to throw out the published tables of aerodynamic data, and make their own measurements, for which they designed and built their own wind tunnel.)
This part in particular is where I think there's a whole bunch of useful lessons for alignment...
Major problem with that particular name: in philosophy, "intention" means something completely different from the standard use. From SEP:
In philosophy, intentionality is the power of minds and mental states to be about, to represent, or to stand for, things, properties and states of affairs. To say of an individual’s mental states that they have intentionality is to say that they are mental representations or that they have contents.
So e.g. Dennett's "intentional stance" does not mean what you probably thought it did, if you've heard of it! (I personally learned of this just recently, thank you Steve Peterson.)
Y'know, I didn't realize until reading this that I hadn't seen a short post spelling it out before. The argument was just sort of assumed background in a lot of conversations. Good job noticing and spelling it out.
Scaling up the data wasn't algorithmic progress. Knowing that they needed to scale up the data was algorithmic progress.
That would, and in general restrictions aimed at increasing price/reducing supply could work, though that doesn't describe most GPU restriction proposals I've heard.
Note that this probably doesn't change the story much for GPU restrictions, though. For purposes of software improvements, one needs compute for lots of relatively small runs rather than one relatively big run, and lots of relatively small runs is exactly what GPU restrictions (as typically envisioned) would not block.
I expect words are usually pointers to natural abstractions, so that part isn't the main issue - e.g. when we look at how natural language fails all the time in real-world coordination problems, the issue usually isn't that two people have different ideas of what "tree" means. (That kind of failure does sometimes happen, but it's unusual enough to be funny/notable.) The much more common failure mode is that a person is unable to clearly express what they want - e.g. a client failing to communicate what they want to a seller. That sort of thing is one reason why I'm highly uncertain about the extent to which human values (or other variations of "what humans want") are a natural abstraction.
So I saw the Taxonomy Of What Magic Is Doing In Fantasy Books and Eliezer’s commentary on ASC's latest linkpost, and I have cached thoughts on the matter.
My cached thoughts start with a somewhat different question - not "what role does magic play in fantasy fiction?" (e.g. what fantasies does it fulfill), but rather... insofar as magic is a natural category, what does it denote? So I'm less interested in the relatively-expansive notion of "magic" sometimes seen in fiction (which includes e.g. alternate physics), and more interested in the pattern cal...
There's an asymmetry between local differences from the true model which can't match the true distribution (typically too few edges) and differences which can (typically too many edges). The former get about O(n) bits against them per local difference from the true model, the latter about O(log(n)), as the number of data points n grows.
Conceptually, the story for the log(n) scaling is: with n data points, we can typically estimate each parameter to ~log(n) bit precision. So, an extra parameter costs ~log(n) bits.
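One standard way to make this asymmetry concrete is the BIC-style approximation to description length. This is a sketch under that assumption, not necessarily the exact calculation behind the comment; the symbols $k$ (number of free parameters), $\hat{L}$ (maximized likelihood) and $D_{KL}$ are introduced here for illustration.

```latex
\begin{aligned}
\text{description length (bits)} &\approx -\log_2 \hat{L} + \frac{k}{2}\log_2 n \\
\text{cost of one unnecessary parameter} &\approx \tfrac{1}{2}\log_2 n \\
\text{cost of a missing dependency} &\approx n \cdot D_{KL}\!\left(P_{\text{true}} \,\|\, P_{\text{model}}\right)
\end{aligned}
```

The first penalty grows like log(n) because each parameter can only be estimated to a standard error of roughly $n^{-1/2}$; the second grows like n because every data point pays the same per-sample mismatch.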
Just that does usually work pretty well for (at least a rough estimate of) the undirected graph structure, but then you don't know the directions of any arrows.
I've tried this before experimentally - i.e. code up a gaussian distribution with a graph structure, then check how well different graph structures compress the distribution. Modulo equivalent graph structures (e.g. A -> B -> C vs A <- B <- C vs A <- B -> C), the true structure is pretty consistently favored.
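A minimal sketch of that kind of experiment, assuming a three-variable linear-Gaussian chain and using a BIC-style score as the stand-in for compressed size. The function name `bic_bits`, the coefficients, and the sample size are all hypothetical choices made for illustration, not the original code.

```python
import numpy as np

def bic_bits(data, parents):
    """Approximate description length (bits) of `data` under a linear-Gaussian
    Bayes net whose structure is `parents` (node index -> list of parent indices)."""
    n = data.shape[0]
    total = 0.0
    for node, pa in parents.items():
        y = data[:, node]
        # Design matrix: intercept plus one column per parent.
        X = np.column_stack([np.ones(n)] + [data[:, p] for p in pa])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        var = resid @ resid / n  # maximum-likelihood residual variance
        # Negative maximized Gaussian log-likelihood, converted from nats to bits.
        nll_bits = 0.5 * n * (np.log(2 * np.pi * var) + 1) / np.log(2)
        k = len(pa) + 2  # parent coefficients + intercept + variance
        total += nll_bits + 0.5 * k * np.log2(n)
    return total

# Sample from the assumed true structure A -> B -> C.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(size=n)
c = 0.8 * b + rng.normal(size=n)
data = np.column_stack([a, b, c])

structures = {
    "chain A->B->C (true)": {0: [], 1: [0], 2: [1]},
    "fork A<-B->C (equivalent)": {1: [], 0: [1], 2: [1]},
    "missing edge B->C": {0: [], 1: [0], 2: []},
    "extra edge A->C": {0: [], 1: [0], 2: [0, 1]},
}
scores = {name: bic_bits(data, pa) for name, pa in structures.items()}
for name, bits in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{bits:12.1f} bits  {name}")
```

The chain and the fork should score the same up to floating-point error, since they are equivalent structures. Dropping a real edge costs a number of bits proportional to n, while the extra edge typically costs only about half of log2(n) bits more than it gains.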
I'm not sure what motivation for worst-case reasoning you're thinking about here. Maybe just that there are many disjunctive ways things can go wrong other than bad capability evals and the AI will optimize against us?
This is getting very meta, but I think my Real Answer is that there's an analogue of You Are Not Measuring What You Think You Are Measuring for plans. Like, the system just does not work any of the ways we're picturing it at all, so plans will just generally not at all do what we imagine they're going to do.
(Of course the plan could still in-pri...
This answer clears the bar for at least some prize money to be paid out, though the amount will depend on how far other answers go by the deadline.
One thing which would make it stronger would be to provide a human-interpretable function for each equivalence class (so Alice can achieve the channel capacity by choosing among those functions).
The suggestions for variants of the problem are good suggestions, and good solutions to those variants would probably also qualify for prize money.
Yes, there is a story for a canonical factorization of , it's just separate from the story in this post.
Sounds like we need to unpack what "viewing as a latent which generates " is supposed to mean.
I start with a distribution . Let's say is a bunch of rolls of a biased die, of unknown bias. But I don't know that's what is; I just have the joint distribution of all these die-rolls. What I want to do is look at that distribution and somehow "recover" the underlying latent variable (bias of the die) and factorization, i.e. notice that I can write the distribution as , where i...
Phase transitions are definitely on the todo list of things to reinvent. Haven't thought about lattice waves or phonons; I generally haven't been assuming any symmetry (including time symmetry) in the Bayes net, which makes such concepts trickier to port over.
If you have sets of variables that start with no mutual information (conditioning on ), and they are so far away that nothing other than could have affected both of them (distance of at least ), then they continue to have no mutual information (independent).
Yup, that's basically it. And I agree that it's pretty obvious once you see it - the key is to notice that distance implies that nothing other than could have affected both of them. But man, when I didn't know that was what I should look for? Much less obvious.
... I feel compelled to note that I'd pointed out a very similar thing a while ago.
Granted, that's not exactly the same formulation, and the devil's in the details.
Let be the initial state of a Gibbs sampler on an undirected probabilistic graphical model, and be the final state. Assume the sampler is initialized in equilibrium, so the distribution of both and is the distribution given by the graphical model.
Take any subsets of , such that the variables in each subset are at least a distance away from the variables in the other subsets (with distance given by shortest path length in the graph). Then ...
Ah, no, I suppose that part is supposed to be handled by whatever approximation process we define for ? That is, the "correct" definition of the "most minimal approximate summary" would implicitly constrain the possible choices of boundaries for which is equivalent to ?
Almost. The hope/expectation is that different choices yield approximately the same , though still probably modulo some conditions (like e.g. sufficiently large ).
What's the here? Is it meant to be ?
System size, i.e. number of variab...
First crucial point which this post is missing: the first (intuitively wrong) net reconstructed represents the probabilities using 9 parameters (i.e. the nine rows of the various truth tables), whereas the second (intuitively right) represents the probabilities using 8. That means the second model uses fewer bits; the distribution is more compressed by the model. So the "true" network is favored even before we get into interventions.
Implication of this for causal epistemics: we have two models which make the same predictions on-distribution, and only make ...
Good question. I recommend looking at this post. The very short version is:
I don't know of a good existing write-up on this, and I think it would be valuable for someone to write.
This seems to be arguing against a starry-eyed idealist case for an "AI disarmament treaty", but not really against a cynical/realistic case. (At first I was going to say "arguing against a strawman", but no, there are in fact lots of starry-eyed idealists in alignment.)
Here's my cynical/realistic case for an "AI disarmament treaty" (or something vaguely in that cluster) with China. As the post notes, the regulations mostly provide evidence that Beijing sees near-term AI as a potential threat to stability that needs to be addressed with regulation. For pur...
There are NNs that train for a lifetime then die, and there are NNs that train for a lifetime but then network together to share all their knowledge before dying.
But crucially, humans do not share all their knowledge. Every time a great scientist or engineer or manager or artist dies, a ton of intuition and skills and illegible knowledge dies with them. What is passed on is only what can be easily compressed into the extremely lossy channels of language.
As the saying goes, "humans are as stupid as they can be while still undergoing intelligence-driven take...
Yeah, the main changes I'd expect in category 1 are just pushing things further in the directions they're already moving, and then adjusting whatever else needs to be adjusted to match the new hyperparameter values.
One example is brain size: we know brains have generally grown larger in recent evolutionary history, but they're locally-limited by things like e.g. birth canal size. Circumvent the birth canal, and we can keep pushing in the "bigger brain" direction.
Or, another example is the genetic changes accounting for high IQ among the Ashkenazi. In order...
Ah, interesting. If I were going down that path, I'd probably aim to use a Landauer-style argument. Something like, "here's a bound on mutual information between the policy and the whole world, including the agent itself". And then a lock/password could give us a lot more optimization power over one particular part of the world, but not over the world as a whole.
... I'm not sure how to make something like that nontrivial, though. Problem is, the policy itself would then presumably be embedded in the world, so is just .
This is all assuming that the power consumption for a wire is at-or-near the Landauer-based limit Jacob argued in his post.
Thanks!
Also, I recognize that I'm kinda grouchy about the whole thing and that's probably coming through in my writing, and I appreciate a lot that you're responding politely and helpfully on the other side of that. So thank you for that too.
Here are two intuitive arguments:
I mean, sure, but I doubt that e.g. Eliezer thinks evolution is inefficient in that sense.
Basically, there are only a handful of specific ways we should expect to be able to beat evolution in terms of general capabilities, a priori:
Interesting - I think I disagree most with 1. The neuroscience seems pretty clear that the human brain is just a scaled-up standard primate brain, the secret sauce is just language (I discuss this now and again in some posts and in my recent part 2). In other words - nothing new about the human brain has had much time to evolve, all evolution did was tweak a few hyperparams mostly around size and neoteny (training time): very very much like GPT-N scaling (which my model predicted).
Basically human technology beats evolution because we are not constrained ...
In an absolute sense, yes, but I expect it can be bounded as a function of bits of optimization without observation. For instance, if we could only at-most double the number of bits of opt by observing one bit, then that would bound bit-gain as a function of bits of optimization without observation, even though it's unbounded in an absolute sense.
Unless you're seeing some stronger argument which I have not yet seen?
The new question is: what is the upper bound on bits of optimization gained from a bit of observation? What's the best-case asymptotic scaling? The counterexample suggests it's roughly exponential, i.e. one bit of observation can double the number of bits of optimization. On the other hand, it's not just multiplicative, because our xor example at the top of this post showed a jump from 0 bits of optimization to 1 bit from observing 1 bit.
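A small sketch of the xor jump, under two assumptions that are not spelled out in this comment: the setup is Y = A xor O with O a fair coin, and "bits of optimization" is approximated here by the mutual information between a target bit M and the outcome Y. The helper names are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import log2

def mutual_information(pairs):
    """Mutual information (bits) between the coordinates of equally likely (m, y) pairs."""
    joint = defaultdict(float)
    for m, y in pairs:
        joint[(m, y)] += 1 / len(pairs)
    pm, py = defaultdict(float), defaultdict(float)
    for (m, y), p in joint.items():
        pm[m] += p
        py[y] += p
    return sum(p * log2(p / (pm[m] * py[y])) for (m, y), p in joint.items())

def steering_bits(policy):
    """How much the target bit M controls Y = A xor O when A = policy(M, O)."""
    outcomes = []
    for m, o in product([0, 1], repeat=2):  # M and O are independent fair coins
        a = policy(m, o)
        outcomes.append((m, a ^ o))
    return mutual_information(outcomes)

blind_bits = steering_bits(lambda m, o: m)         # action ignores O
informed_bits = steering_bits(lambda m, o: m ^ o)  # action uses the observed bit
print(blind_bits, informed_bits)
```

Under these assumptions the blind policy leaves Y uniform whatever M is, while the informed policy sets Y = M exactly. That is an additive jump from 0 to 1, which no purely multiplicative bound can describe.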
The four claims you listed as "central" at the top of this thread don't even mention the word "brain", let alone anything about it being pareto-efficient.
It would make this whole discussion a lot less frustrating for me (and probably many others following it) if you would spell out what claims you actually intend to make about brains, nanotech, and FOOM gains, with the qualifiers included. And then I could either say "ok, let's see how well the arguments back up those claims" or "even if true, those claims don't actually say much about FOOM because...", rather than this constant probably-well-intended-but-still-very-annoying jumping between stronger and weaker claims.
Alright, I think we have an answer! The conjecture is false.
Counterexample: suppose I have a very-high-capacity information channel (N bit capacity), but it's guarded by a uniform random n-bit password. O is the password, A is an N-bit message and a guess at the n-bit password. Y is the N-bit message part of A if the password guess matches O; otherwise, Y is 0.
Let's say the password is 50 bits and the message is 1M bits. If A is independent of the password, then there's a 2^-50 chance of guessing the password, so the bitrate will be about ...
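A back-of-envelope version of the arithmetic, as a sketch only. It treats the blind bitrate as payload size times the chance of a correct guess, which is a crude estimate and not necessarily the figure the comment goes on to give; the variable names are hypothetical.

```python
PASSWORD_BITS = 50        # uniform random password length, from the example
MESSAGE_BITS = 1_000_000  # payload size, from the example

# A guess that is independent of the password is right with probability 2^-50.
p_guess = 2.0 ** -PASSWORD_BITS

# Crude estimate: the payload only gets through when the guess happens to match.
blind_bits = p_guess * MESSAGE_BITS

# A sender who observes the 50-bit password always gets the full payload through.
informed_bits = MESSAGE_BITS

print(f"blind: ~{blind_bits:.2e} bits/message, informed: {informed_bits} bits/message")
print(f"ratio: 2^{PASSWORD_BITS}")
```

On this estimate, 50 bits of observation raise throughput from under a billionth of a bit per message to a million bits per message.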
Trying to patch the thing which I think this example was aiming for:
Let A be an n-bit number, O be 0 or 1 (50/50 distribution). Then let Y = A if A mod 2 = O, else Y = 0. If the sender knows O, then they can convey n-1 bits with every message (i.e. n bits minus the lowest-order bit). If the sender does not know O, then half the messages are guaranteed to be 0 (and which messages are 0 communicates at most 1 bit per, although I'm pretty sure it's in fact zero bits per in this case, so no loophole there). So at most ~n/2 bits per message can be conveyed if ...
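A brute-force sketch of this channel for small n. The pass condition used here, that the lowest-order bit of A must equal O, is an assumption inferred from the "n bits minus the lowest-order bit" remark; the function names and the choice N = 4 are hypothetical.

```python
from itertools import product

N = 4  # bits in A; kept small so every case can be enumerated

def channel(a, o):
    """Y = A if the lowest-order bit of A equals O (assumed condition), else 0."""
    return a if (a & 1) == o else 0

def encode_informed(message, o):
    """Sender knows O: message goes in the high N-1 bits, O in the low bit."""
    return (message << 1) | o

# Informed sender: every (N-1)-bit message is recovered for either value of O.
informed_ok = all(
    channel(encode_informed(m, o), o) >> 1 == m
    for m, o in product(range(2 ** (N - 1)), [0, 1])
)

# Blind sender: for each fixed A, count the values of O that produce Y = 0.
zeroed_counts = [sum(channel(a, o) == 0 for o in [0, 1]) for a in range(2 ** N)]

print(informed_ok, zeroed_counts)
```

Every nonzero A is wiped out for exactly one of the two values of O, which is the "half the messages are guaranteed to be 0" step. The informed sender loses only the one bit spent on matching O.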
Damn, that one sounded really promising at first, but I don't think it works. Problem is, if A is fixed-length, then knowing the number of 1's also tells us the number of 0's. And since we get to pick P[A] in the optimization problem, we can make A fixed-length.
EDIT: oh, Alex beat me to the punch.
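A sketch of why the fixed-length objection bites, assuming messages of the form "k ones followed by LENGTH - k zeros" with at least one of each symbol. The names and the choice LENGTH = 8 are hypothetical.

```python
LENGTH = 8  # fixed message length (hypothetical choice)

def strip_channel(bits, stripped):
    """Remove every occurrence of the symbol `stripped` ('0' or '1')."""
    return bits.replace(stripped, "")

def encode(k):
    """k ones followed by LENGTH - k zeros, so the total length is fixed."""
    return "1" * k + "0" * (LENGTH - k)

def decode(received):
    """Recover k from whichever symbol survived, without knowing what was stripped."""
    survivors = len(received)
    return survivors if received[0] == "1" else LENGTH - survivors

recovered = {
    (k, stripped): decode(strip_channel(encode(k), stripped))
    for k in range(1, LENGTH)  # at least one 1 and one 0, so the output is never empty
    for stripped in "01"
}
print(all(value == k for (k, _), value in recovered.items()))
```

The receiver recovers k whichever symbol the channel strips, so a sender who knows in advance which symbol will be stripped gains nothing from that knowledge.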
My gloss of the section is 'you could potentially make the brain smaller, but it's the size it is because cooling is expensive in a biological context, not necessarily because blind-idiot-god evolution left gains on the table'
I tentatively buy that, but then the argument says little-to-nothing about barriers to AI takeoff. Like, sure, the brain is efficient subject to some constraint which doesn't apply to engineered compute hardware. More generally, the brain is probably efficient relative to lots of constraints which don't apply to engineered compute hardw...
FWIW, I basically buy all of these, but they are not-at-all sufficient to back up your claims about how superintelligence won't foom (or whatever your actual intended claims are about takeoff). Insofar as all this is supposed to inform AI threat models, it's the weakest subclaims necessary to support the foom-claims which are of interest, not the strongest subclaims.
I basically buy all of these, but they are not-at-all sufficient to back up your claims about how superintelligence won't foom
Foom isn't something that EY can prove beyond doubt or I can disprove beyond doubt, so this is a matter of subjective priors and posteriors.
If you were convinced of foom inevitability before, these claims are unlikely to convince of the opposite, but they do undermine EY's argument:
I think you may be misunderstanding why I used the blackbody temp - I (and the refs I linked) use that as a starting point to indicate the temp the computing element would achieve without convective cooling (ie in vacuum or outer space).
There's a pattern here which seems-to-me to be coming up repeatedly (though this is the most legible example I've seen so far). There's a key qualifier which you did not actually include in your post, which would make the claims true. But once that qualifier is added, it's much more obvious that the arguments are utterly...
The 'big-sounding' claim you quoted makes more sense only with the preceding context you omitted:
...Conclusion: The brain is a million times slower than digital computers, but its slow speed is probably efficient for its given energy budget, as it allows for a full utilization of an enormous memory capacity and memory bandwidth. As a consequence of being very slow, brains are enormously circuit cycle efficient. Thus even some hypothetical superintelligence, running on non-exotic hardware, will not be able to think much faster than an artificial brain runnin
After chewing on it a bit, I find it very plausible that this is indeed a counterexample. However, it is not obvious to me how to prove that there does not exist some clever encoding scheme which would achieve bit-throughput competitive with the O-dependent encoding without observing O. (Note that we don't actually need to ensure the same Y pops out either way, we just need the receiver to be able to distinguish between enough possible inputs A by looking at Y.)
Ok simpler example:
You know the channel either removes all 0s or all 1s, but you don't know which.
The most efficient way to send a message is to send n 1s, followed by n 0s, where n is the number represented by the binary message you want to send.
If you know whether 1s or 0s are stripped out, then you only need to send n bits of information, for a total saving of n bits.
EDIT: this doesn't work, see comment by AlexMennen.
(Note that this, in turn, also completely undermines the claims about optimality of speed in the next section. Those claims ultimately ground out in high temperatures making high clock speeds prohibitive, e.g. this line:
Scaling a brain to GHz speeds would increase energy and thermal output into the 10MW range, and surface power density to / , with temperatures well above the surface of the sun
)
(Copied with some minor edits from here.)
Jacob's argument in the Density and Temperature section of his Brain Efficiency post basically just fails.
Jacob is using a temperature formula for blackbody radiators, which is basically irrelevant to temperature of realistic compute substrate - brains, chips, and probably future compute substrates are all cooled by conduction through direct contact with something cooler (blood for the brain, heatsink/air for a chip). The obvious law to use instead would just be the standard thermal conduction law: heat flow per uni...
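For reference, the two textbook relations being contrasted, in standard notation. The symbols are introduced here and are not taken from either post: $P/A$ is power per unit surface area, $\sigma$ the Stefan-Boltzmann constant, $k$ the thermal conductivity of the material, $\Delta T$ the temperature difference across it, and $d$ its thickness.

```latex
\begin{aligned}
\text{blackbody radiation:} \quad & \frac{P}{A} = \sigma T^{4} \\
\text{steady-state conduction through a slab:} \quad & \frac{P}{A} = k \,\frac{\Delta T}{d}
\end{aligned}
```

The first depends only on the absolute surface temperature. The second depends on the temperature gradient to whatever the element is in contact with, which is why the two can give very different answers for a cooled substrate.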
I'm going to make this slightly more legible, but not contribute new information.
Note that downthread, Jacob says:
the temp/size scaling part is not one of the more core claims so any correction there probably doesn't change the conclusion much.
So if your interest is in Jacob's arguments as they pertain to AI safety, this chunk of Jacob's writings is probably not key for your understanding and you may want to focus your attention on other aspects.
Both Jacob and John agree on the obvious fact that active cooling is necessary for both the brain and for GPUs a...
Consider two claims:
These two claims should probably not both be true! If any system can be modeled as maximizing a utility function, and it is possible to build a corrigible system, then naively the corrigible system can be modeled as maximizing a utility function.
I exp...