I'm a PhD student at the University of Amsterdam. I have research experience in multivariate information theory and equivariant deep learning, and I recently became very interested in AI alignment. https://langleon.github.io/
One question: Do you think Chinchilla scaling laws are still correct today, or are they not? I would assume these scaling laws depend on the data set used in training, so that if OpenAI found/created a better data set, this might change scaling laws.
Do you agree with this, or do you think it's false?
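For concreteness, here is the parametric loss form that the Chinchilla paper fits, as I remember it. The symbols are the usual ones: N is the parameter count, D is the number of training tokens, and E, A, B, α, β are constants fitted empirically on one particular training corpus. I state it only to make explicit where data-set dependence could enter.

\[
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
\]

Since all five constants are fitted to training runs on a specific data set, a different data set could in principle yield different constants and hence a different compute-optimal trade-off between N and D.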
https://x.com/sama/status/1813984927622549881
According to Sam Altman, GPT-4o mini is much better than text-davinci-003 was in 2022, but 100 times cheaper. In general, we see increasing competition to produce smaller-sized models with great performance (e.g., Claude Haiku and Sonnet, Gemini 1.5 Flash and Pro, maybe even the full-sized GPT-4o itself). I think this trend is worth discussing. Some comments (mostly just quick takes) and questions I'd like to have answers to:
I went to this event in 2022 and it was lovely. Will come again this year. I recommend coming!
Thanks for the answer!
But basically, by "simple goals" I mean "goals which are simple to represent", i.e. goals which have highly compressed representations
It seems to me you are using "compressed" with two very different meanings in parts 1 and 2. Or, to be fairer, I interpret the meanings very differently.
Let me try to make my view more concrete:
Compressed representations: A representation is a function f: O → R from observations of the world state (or sequences of such observations) into a representation space R of "features". That this is "compressed" means (a) that in R, only a small number of features are active at any given time and (b) that this small number of features is still sufficient to predict/act in the world.
Goals building on compressed representations: A goal is a (maybe linear) function g: R → ℝ from the representation space into the real numbers. The goal "likes" some features and "dislikes" others. (Or if it is not entirely linear, then it may like/dislike some simple combinations/compositions of features.)
It seems to me that in part 2 of your post, you view goals as compositions g ∘ f. Part 1 says that f is highly compressed. But it's totally unclear to me why the composition g ∘ f should then have the simplicity properties you claim in part 2, which in my mind don't connect with the compression properties of f as I just defined them.
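To make these definitions tangible, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `represent` and `goal`, the random linear feature map, and the top-k sparsification are hypothetical stand-ins, not anything from the post. It only shows a representation in which few features are active, plus a linear goal evaluated on top of it.

```python
import random


def represent(observation, weights, k=3):
    """Hypothetical representation map: linear features, keeping only the
    k activations of largest magnitude and zeroing out the rest."""
    activations = [sum(w * x for w, x in zip(row, observation)) for row in weights]
    order = sorted(range(len(activations)), key=lambda i: abs(activations[i]))
    top_k = set(order[-k:])
    return [a if i in top_k else 0.0 for i, a in enumerate(activations)]


def goal(features, preferences):
    """Hypothetical linear goal: a positive preference "likes" a feature,
    a negative preference "dislikes" it."""
    return sum(p * f for p, f in zip(preferences, features))


random.seed(0)
n_obs, n_feat = 64, 16
weights = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_feat)]
preferences = [random.gauss(0, 1) for _ in range(n_feat)]
observation = [random.gauss(0, 1) for _ in range(n_obs)]

features = represent(observation, weights)
active = sum(1 for f in features if f != 0.0)  # number of active features
score = goal(features, preferences)            # goal applied to the representation
```

Note that the sparsity lives entirely in `represent`. Nothing about it makes the composed map from observations to `score` simple in any further sense, which is the gap I'm pointing at.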
A few more thoughts:
This is largely my fault since I haven't really defined "representation" very clearly, but I would say that the representation of the concept of a dog should be considered to include e.g. the neurons representing "fur", "mouth", "nose", "barks", etc. Otherwise if we just count "dog" as being encoded in a single neuron, then every concept encoded in any neuron is equally simple, which doesn't seem like a useful definition.
(To put it another way: the representation is the information you need to actually do stuff with the concept.)
I'm confused. Most of the time, when seeing a dog, most of what I need is actually just to know that it is a "dog", so this is totally sufficient to do something with the concept. E.g., if I walk on the street and wonder "will this thing bark?", then knowing "my dog neuron activates" is almost enough.
I'm confused for a second reason: It seems like here you want to claim that the "dog" representation is NOT simple (since it contains "fur", "mouth", etc.). However, the "dog" representation needs lots of intelligence and should thus come along with compression, and if you equate compression and simplicity, then it seems to me like you're not consistent. (I feel a bit awkward saying "you're not consistent", but I think it's probably good if I state my honest state of mind at this moment).
To clarify my own position, in line with my definition of compression further above: I think that whether a representation is simple/compressed is NOT a property of a single input-output relation (like "pixels of dog get mapped to dog-neuron being activated"), but instead a property of the whole FUNCTION that maps inputs to representations. This function is compressed if, for any given input, only a small number of neurons in the last layer activate, and if these can be used (ideally in a linear way) for further predictions and for evaluating goal-achievement.
I agree that most people who say they are hedonic utilitarians are not 100% committed to hedonic utilitarianism. But I still think it's very striking that they at least somewhat care about making hedonium. I claim this provides an intuition pump for how AIs might care about squiggles too.
Okay, I agree with this, fwiw. :) (Though I may not necessarily agree with claims about how this connects to the rest of the post)
Thanks for the post!
a. How exactly do 1 and 2 interact to produce 3?
I think the claim is along the lines of "highly compressed representations imply simple goals", but the connection between compressed representations and simple goals has not been argued, unless I missed it. There's also a chance that I simply misunderstand your post entirely.
b. I don't agree with the following argument:
Decomposability over space. A goal is decomposable over space if it can be evaluated separately in each given volume of space. All else equal, a goal is more decomposable if it's defined over smaller-scale subcomponents, so the most decomposable goals will be defined over very small slices of space—hence why we're talking about molecular squiggles. (By contrast, you can't evaluate the amount of higher-level goals like "freedom" or "justice" in a nanoscale volume, even in principle.)
The classic ML algorithm that evaluates features separately in space is a CNN. That doesn't mean that features in CNNs look for tiny structures, though: The deeper into the CNN you are, the more complex the features get. Actually, deep CNNs are an example of what you describe in argument 1: The features in later layers of CNNs are highly compressed, and may tell you binary information such as "is there a dog", but they apply to large parts of the input image.
Therefore, I'd also expect that what an AGI would care about are ultimately larger-scale structures since the AGI's features will nontrivially depend on the interaction of larger parts in space (and possibly time, e.g. if the AGI evaluates music, movies, etc.).
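As a rough illustration of how features in deeper CNN layers depend on larger parts of the input, here is a small sketch using the standard receptive-field recurrence for stacked convolutions. The layer configuration and the name `receptive_fields` are made up for the example, and padding and dilation are ignored.

```python
def receptive_fields(layers):
    """Return the receptive-field size (in input pixels, along one axis)
    after each layer, given a list of (kernel_size, stride) pairs."""
    size, jump = 1, 1  # a single input pixel sees itself; jump = input pixels per output step
    sizes = []
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each extra kernel tap reaches `jump` pixels further
        jump *= stride               # striding spreads later taps further apart
        sizes.append(size)
    return sizes


# Hypothetical stack: five 3x3 convolutions, two of them with stride 2.
layers = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]
sizes = receptive_fields(layers)
print(sizes)  # [3, 5, 9, 13, 21]
```

Even in this tiny stack, a unit in the last layer depends on a 21-pixel-wide window, so the features that matter at depth are not defined over small slices of space.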
c. I think this leaves open the question of why philosophers end up favoring the analog of squiggles when they become hedonic utilitarians. I'd argue that the premise may be false, since it's unclear to me how what philosophers say they care about ("hedonium") connects with what they actually care about (e.g., maybe they still listen to complex music, build a family, build status through philosophical argumentation, etc.).
You should all be using the "Google Scholar PDF reader extension" for Chrome.
Features I like:
I guess (but don't know) that most people who downvote Garrett's comment overupdated on intuitive explanations of singular learning theory, not realizing that entire books with novel and nontrivial mathematical theory have been written on it.
I do all of these except 3, and implementing a system like 3 is among the deprioritized items on my to-do list. Maybe I should prioritize it.
Yes, the first! Thanks for the link!
The news is not very old yet. Lots of potential for people to start freaking out.