I interpret OP (though this is colored by the fact that I was thinking along these lines before I read the post) as saying Adaptation-Executers, not Fitness-Maximizers, but about ML. At that point you can widen the reference class to all organisms.

Gradient descent isn't really different from what evolution does. It's just a bit faster, and takes a slightly more direct line. Importantly, it's not more capable of avoiding local maxima (per se, at least).

So, I want to note a few things. The original Eliezer post was intended to argue against this line of reasoning:

I occasionally run into people who say something like, "There's a theoretical limit on how much you can deduce about the outside world, given a finite amount of sensory data."

He didn't worry about compute, because that's not a barrier on the theoretical limit. And in his story, the entire human civilization had decades to work on this problem.

But you're right, in a practical world, compute is important.

I feel like you're trying to make this take as much compute as possible.

Since you talked about headers, I feel I need to reiterate that, when we are talking to a neural network, we do not add the extra data. The goal is to communicate with the neural network, so we intentionally put the input in easier-to-understand formats.

In the practical cases for this to come up (e.g. a nascent superintelligence figuring out physics faster than we expect), we probably will also be inputting data in an easy to understand format.

Similarly, I expect you don't need to check every possible esoteric format. The likelihood of the image using an encoding like 61 bits per pixel, with 2 for red, 54 for green, and 5 for blue, is just very low a priori. I admit I'm not sure whether restricting to "reasonable" formats would cut the possibilities down into the computable realm (that obviously depends on your definition of reasonable, though part of me suspects you could, with a lot of work, construct an objective likelihood score for various encodings). But it's certainly a lot harder to argue that it wouldn't than to just say "f(x) = (63 pick x) grows very fast."

Though, since I don't have a good sense of whether the "reasonable" formats would be a more computable number, I should update in your direction. (I tried to look into something roughly analogous: the 200 most common passwords cover a little over 4% of all used passwords, which isn't large enough for me to feel comfortable expecting that the 1,000 most "likely" formats would cover a significant share of the probability space.)

(Also potentially important: modern neural nets don't really receive things as a string of bits, but as a string of numbers, nicely split up into separate nodes. Yes, those numbers are made of bits, but they're floating-point numbers, and the way neural nets interact with them is through floating-point operations, so I don't think the neural net touches the bit representation of the numbers in any meaningful way.)

"you're jumping to the conclusion that you can reliably differentiate between..."

I think you absolutely can, and the idea was already described earlier.

You pay attention to regularities in the data. In most non-random images, pixels near each other are similar. In an MxN image, the pixel below a[i] is a[i+M], whereas in an NxM image it's a[i+N]. If, across the whole image, the difference between a[i] and a[i+M] is smaller than the difference between a[i] and a[i+N], it's more likely an MxN image. I expect you could find the resolution by searching all possible resolutions from 1x<length> to <length>x1 and finding which one minimizes the average distance between "adjacent" pixels.
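That search can be sketched in a few lines. `guess_width` is a hypothetical helper name (not from any post here), and the sketch assumes a single-channel image flattened in row-major order:

```python
import numpy as np

def guess_width(flat):
    """Guess the width of a flattened grayscale image by picking the
    candidate width that minimizes the average difference between
    vertically adjacent pixels."""
    n = len(flat)
    best_w, best_score = None, float("inf")
    for w in range(1, n + 1):
        if n % w != 0 or n // w < 2:
            continue  # need an exact factorization with at least 2 rows
        img = flat.reshape(-1, w)
        # Mean absolute difference between each pixel and the one below it.
        score = np.abs(np.diff(img, axis=0)).mean()
        if score < best_score:
            best_w, best_score = w, score
    return best_w

# A smooth 20x30 gradient: the true width should win.
y, x = np.mgrid[0:20, 0:30]
print(guess_width((x + y).ravel()))  # -> 30
```

On a real photograph you'd expect the same effect, just noisier: the correct width lines up genuinely adjacent pixels, and every wrong width scrambles rows against each other.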

Similarly (though you'd likely do this first), you can tell the difference between RGB and RGBA. If you have (255, 0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0), this is probably 4 red pixels in RGB, and not a fully opaque red pixel, followed by a fully transparent green pixel, followed by a fully transparent blue pixel, in RGBA. It could also be 2 pixels that are mostly red and slightly green in 16-bit RGB, though; I'm not sure how you could tease that apart.

Aliens would probably do a different encoding. We don't know what the rods and cones in their eye-equivalents are, and maybe they respond to different colors. Maybe it's not Red Green Blue, but instead Purple Chartreuse Infrared. I'm not sure this matters. It just means your eggplants look red.

I think even if it had 5 (or 6, or 20) channels, this regularity would be borne out: the difference between value i and value i+5 would be smaller than the difference between value i and value i+1, i+2, i+3, or i+4.
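The same stride trick works for guessing the channel count. `guess_channels` is a hypothetical helper, assuming interleaved per-pixel channels; the true channel count c makes samples at stride c (the same channel of the neighboring pixel) most similar:

```python
def guess_channels(flat, candidates=(1, 2, 3, 4, 5)):
    """Guess channels-per-pixel by finding the stride c that minimizes
    the mean difference between values c apart."""
    best_c, best_score = None, float("inf")
    for c in candidates:
        diffs = [abs(flat[i + c] - flat[i]) for i in range(len(flat) - c)]
        score = sum(diffs) / len(diffs)
        if score < best_score:
            best_c, best_score = c, score
    return best_c

# The run of red pixels from the RGB-vs-RGBA example above:
print(guess_channels([255, 0, 0] * 8))  # -> 3
```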

Now, there's still a lot that this doesn't get you yet. But given that there are ways to figure these things out, I think I should have a decent expectation that there are ways to figure out other things too, even if I don't know them.

I do also think it's important to zoom out to the original point. Eliezer posed this as an idea about AGI. We currently sometimes feed images to our AIs, and when we do, we feed them as raw RGB data, not in a compressed encoding, because we know encoding would make the data harder for the AI to figure out. It would be very weird, if we were trying to train an AI, to send it compressed video; it's much more likely that we would, in fact, send it raw RGB values frame by frame.

I will also say that the original claim (by Eliezer, not the top of this thread), was not physics from one frame, but physics from like, 3 frames, so you get motion, and acceleration. 4 frames gets you to third derivatives, which, in our world, don't matter that much. Having multiple frames also aids in ideas like the 3d -> 2d projection, since motion and occlusion are hints at that.

And I think the whole question is "does this image look reasonable," which, you're right, is not a rigorous mathematical concept. But "'looks reasonable' is not a rigorous concept" doesn't get followed by "and therefore is impossible." Above are some mathematical descriptions of what "reasonable" means in certain contexts. Rendering a 100x75 image as 75x100 will not "look reasonable," but it's not beyond thinking and math to determine what you mean by that.

"the addition of an unemployable worker causes ... the worker's Shapley values to drop to $208.33 (from $250)."

I would emphasize here that "the workers'" includes the unemployable one. That was not obvious to me until about halfway through the next paragraph, and I think the next paragraph would read better with that in mind from the start.
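For anyone who wants to check numbers like these themselves, Shapley values can be brute-forced by averaging each player's marginal contribution over every join order. The three-player game below (two productive workers plus an unemployable worker "U", each productive worker adding $100) is purely illustrative, not the game from the post:

```python
from itertools import permutations

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            before = v(frozenset(coalition))
            coalition.add(p)
            totals[p] += v(frozenset(coalition)) - before
    return {p: t / len(perms) for p, t in totals.items()}

# Hypothetical game: each productive worker adds $100; "U" adds nothing.
productive = {"A", "B"}
v = lambda s: 100 * len(s & productive)
print(shapley_values(["A", "B", "U"], v))  # A and B get 100.0, U gets 0.0
```

In this toy game the unemployable worker's value is exactly zero; in the post's game the shares shift because the characteristic function there gives even the unemployable worker some marginal contribution in certain orderings.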

I'd be interested to know why you think that.

I'd be further interested if you would endorse the statement that your proposed plan would fully bridge that gap.

And if you wouldn't, I'd ask if that helps illustrate the issue.

It seems odd to suggest that the AI wouldn't kill us because it needs our supply chain. If I had the choice between "Be shut down because I'm misaligned" (or "Be reprogrammed to be aligned" if not corrigible) and "Have to reconstruct the economy from the remnants of human civilization," I think I'm more likely to achieve my goals by trying to reconstruct the economy.

So if your argument was meant to say "We'll have time to do alignment while the AI is still reliant on the human supply chain," then I don't think it works. A functional AGI would rather destroy the supply chain and probably fail at its goals than be realigned and definitely fail at its goals.

(Also, this is mostly a minor thing, but I don't really understand your reference class for novel technologies. Why is the time measured from "proof of concept submarine" to "capable of sinking a warship"? Or from "theory of relativity" to "atom bomb being dropped"? Maybe that was just the data available, but why isn't it "Wright brothers start attempting heavier-than-air flight" to "Wright brothers achieve heavier-than-air flight"? When reading, my mind immediately wondered how much of the 36-year gap on mRNA vaccines was from "here's a cool idea" to "here's a use case," rather than from "here's a cool idea" to "we can actually do that.")

Surely creating the full concrete details of the strategy is not much different from "putting forth as-good-as-human definitions, finding objections to them, and then improving the definitions based on the considered objections." I at least don't see why the same mechanism couldn't be used here (i.e., apply this definition iteration to the word "good" and have the AI do that, then apply it to "bad" and have the AI avoid that). If you see it as a different thing, can you explain why?

Exactly. I notice you aren't who I replied to, so the canned response I had won't work. But perhaps you can see why most of his objections to my objections would apply to objections to that plan?

Let me ask you this. Why is "Have the AI do good things, and not do bad things" a bad plan?

I think you missed the point. I'd trust an aligned superintelligence to solve the objections. I would not trust a misaligned one. If we already have an aligned superintelligence, your plan is unnecessary. If we do not, your plan is unworkable. Thus, the problem.

If you still don't see that, I don't think I can make you see it. I'm sorry.
