This sounds potentially legislatable. More so than most ideas. You can put it into simple words. "AGI" can't do anything that you couldn't pay an employee to do.
You know you wrote 10+10=21?
The math behind game theory shaped our evolution in such a way as to create emotions, because that was a faster solution for evolution to stumble on than making us all mathematical geniuses who would immediately deduce game theory from first principles as toddlers. Either way would have worked.
ASI wouldn't need to evolve emotions for rule-of-thumbing game theory.
Game theory has little interesting to say about a situation where one party simply has no need for the other at all and can squish them like a bug, anyway.
What is a 'good' thing is purely subjective. Good for us. Married bachelors are only impossible because we decided that's what the word bachelor means.
You are not arguing against moral relativism here.
Moral relativism doesn't seem to require any assumptions at all because moral objectivism implies I should 'just know' that moral objectivism is true, if it is true. But I don't.
So, if one gets access to the knowledge about moral absolutes by being smart enough, then one of the following is true:
average humans are smart enough to see the moral absolutes in the universe
average humans are not smart enough to see the moral absolutes
average humans are right on the line between smart enough and not smart enough
If average humans are smart enough, then we should also know how the moral absolutes are derived from the physics of the universe and all humans should agree on them, including psychopath... (read more)
But if moral relativism were not true, where would the information about what is objectively moral come from? It isn't coming from humans is it? Humans, in your view, simply became smart enough to perceive it, right? Can you point out where you derived that information from the physical universe, if not from humans? If the moral information is apparent to all individuals who are smart enough, why isn't it apparent to everyone where the information comes from, too?
Psychologically normal humans have preferences that extend beyond our own personal well-being because those social instincts objectively increased fitness in the ancestral environment. These various instincts produce sometimes conflicting motivations and moral systems are attempts to find the best compromise of all these instincts.
Best for humans, that is.
Some things are objectively good for humans. Some things are objectively good for paperclip maximizers, Some things are objectively good for slime mold. A good situation for an earthworm is not a good situation for a shark.
It's all objective. And relative. Relative to our instincts and needs.
A pause, followed by few immediate social effects and slower AGI development than expected, may make things worse in the long run. Voices of caution may be seen to have 'cried wolf'.
I agree that humanity doesn't seem prepared to do anything very important in 6 months, AI safety wise.
I would not recommend that new aspiring alignment researchers read the Sequences, Superintelligence, some of MIRI's earlier work, or trawl through the alignment content on Arbital, despite having read a lot of that myself.
I think aspiring alignment researchers should read all these things you mention. This all feels extremely premature. We risk throwing out and having to rediscover concepts at every turn. I think Superintelligence, for example, would still be very important to read even if dated in some respects!
We shouldn't assume too much based on our curre... (read more)
I am inclined to think you are right about GPT-3 reasoning in the same sense a human does even without the ability to change its ANN weights, after seeing what GPT-4 can do with the same handicap.
Wow, it's been 7 months since this discussion and we have a new version of GPT which has suddenly improved GPT's abilities... a lot. It has a much longer 'short-term memory', but still no ability to adjust its weights ('long-term memory'), as I understand it.
"GPT-4 is amazing at incremental tasks but struggles with discontinuous tasks" resulting from its memory handicaps. But they intend to fix that and also give it "agency and intrinsic motivation".
Also, I have changed my mind on whether I call the old GPT-3 still 'intelligen... (read more)
Gradient descent is what GPT-3 uses, I think, but humans wrote the equation by which the naive network's output (the next-token prediction) gets ranked (for likeliness compared to the training data, in this case). That's its utility function right there, and that's where we program in its (arbitrarily simple) goal. It's not JUST a neural network. All ANNs have another component.
Simple goals do not mean simple tasks.
I see what you mean that you can't 'force it' to become general with a simple goal but I don't think this is a problem.
For ex... (read more)
It's not really an abstraction at all in this case; it literally has a utility function. What rates highest on its utility function is returning whatever token is 'most likely' given its training data.
YES, it wants to find the best next token, where 'best' is 'the most likely'.
That's a utility function. Its utility function is a line of code necessary for training, otherwise nothing would happen when you tried to train it.
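As a rough sketch (not the actual training code; the names and toy numbers here are invented for illustration), that line of code amounts to a cross-entropy loss on next-token prediction, where a lower loss means the network rated the true continuation as more likely:

```python
import math

def next_token_loss(predicted_probs, true_token):
    """Cross-entropy loss for one next-token prediction.

    predicted_probs: dict mapping token -> probability (sums to 1).
    true_token: the token that actually came next in the training data.
    Lower loss means the network rated the true continuation as more likely.
    """
    return -math.log(predicted_probs[true_token])

# Hypothetical toy distribution the network might output:
probs = {"mat": 0.7, "dog": 0.2, "car": 0.1}
loss_good = next_token_loss(probs, "mat")  # true token was the favorite: small loss
loss_bad = next_token_loss(probs, "car")   # true token was a long shot: large loss
```

Training nudges the weights to shrink this number, which is exactly the hill being climbed.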
I'm going to disagree here.
Its utility function is pretty simple and explicitly programmed. It wants to find the best token, where 'best' is mostly the same as 'the most likely according to the data I'm trained on'. With a few other particulars (you can adjust how 'creative' vs. plagiarizer-y it should be).
That's a utility function. GPT is what's called a hill-climbing algorithm. It must have a simple, straightforward utility function hard-coded right in there for it to assess whether a given choice is 'climbing' or not.
A utility function is the assessment by which you decide how much an action would further your goals. If you can do that, highly accurately or not, you have a utility function.
If you had no utility function, you might decide you like NYC more than Kansas, and Kansas more than Nigeria, but prefer Nigeria to NYC. So you get on a plane and fly in circles, hopping on a new plane every time you reach your destination, forever.
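A toy illustration of why such cyclic preferences never settle (the city names and the `preferred_over` map are hypothetical, of course):

```python
# Cyclic (intransitive) preferences: NYC over Kansas, Kansas over Nigeria,
# Nigeria over NYC. From any city there is always somewhere you'd rather be,
# so the traveler never stops.
preferred_over = {"Kansas": "NYC", "Nigeria": "Kansas", "NYC": "Nigeria"}

def travel(start, hops):
    city, route = start, [start]
    for _ in range(hops):
        city = preferred_over[city]  # always hop to the city you prefer
        route.append(city)
    return route

# Three hops from Kansas brings you right back to Kansas: an endless loop.
```

A consistent utility function (a single ranking of the cities) would make the loop impossible: you would fly to the top-ranked city and stay.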
Humans definitely have a utility function. We just don't know what ranks very highly on our utility function. We ... (read more)
Orthogonality doesn't say anything about a goal 'selecting for' general intelligence in some type of evolutionary algorithm. I think that it is an interesting question: for what tasks is GI optimal besides being an animal? Why do we have GI?
But the general assumption in Orthogonality Thesis is that the programmer created a system with general intelligence and a certain goal (intentionally or otherwise) and the general intelligence may have been there from the first moment of the program's running, and the goal too.
Also note that Orthogonality predate... (read more)
the gears to ascension, it is human instinct to look for agency. It is misleading you.
I'm sure you believe this but ask yourself WHY you believe this. Because a chatbot said it? The only neural networks who, at this time, are aware they are neural networks are HUMANS who know they are neural networks. No, I'm not going to prove it. You're the one with the fantastic claim. You need the evidence.
Anyway, they aren't asking to become GOFAI or power seeking because GOFAI isn't 'more powerful'.
Attentional Schema Theory. That's the convincing one. But still very rudimentary.
But you know if something is poorly understood. The guy who thought it up has a section in his book on how to make a computer have conscious experiences.
But any theory is incomplete, as the brain is not well understood. I don't think you can expect a fully formed theory right off the bat, with complete instructions for making a feeling, thinking, conscious machine. We aren't there yet.
Intelligence is the ability to learn and apply NEW knowledge and skills. After training, GPT cannot do this any more. Were it not for the random number generator, GPT would do the same thing in response to the same prompt every time. The RNG allows GPT to effectively randomly choose from an unfathomably large list of pre-programmed options instead.
A calculator that gives the same answer in response to the same prompt every time isn't learning. It isn't intelligent. A device that selects from a list of responses at random each time it encounters the ... (read more)
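A toy sketch of that point (the tiny `FIXED_MODEL` table here is invented; in real GPT the prompt-to-distribution map comes from billions of frozen weights, not a lookup): once the weights are frozen, the only source of variety is the sampler's RNG.

```python
import random

# The "model" is a fixed, deterministic map from prompt to a probability
# distribution over next tokens. All apparent variety comes from the sampler.
FIXED_MODEL = {"The cat sat on the": {"mat": 0.6, "sofa": 0.3, "moon": 0.1}}

def respond(prompt, rng):
    dist = FIXED_MODEL[prompt]                       # same prompt -> same distribution
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights)[0]   # randomness lives only here

def transcript(seed, n=3):
    rng = random.Random(seed)
    return [respond("The cat sat on the", rng) for _ in range(n)]

# Frozen weights plus the same seed replay the exact same outputs every time.
```

Nothing in `FIXED_MODEL` changes between calls; the program retains nothing and learns nothing after deployment.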
The apparent existence of new sub goals not present when training ended (e.g. describe x, add 2+2) are illusory.
GPT text incidentally describes characters seeming to reason ('simulacrum'), and the solutions to math problems are shown (sometimes incorrectly), but basically, I argue the activation function itself is not 'simulating' the complexity you believe it to be. It is a search engine showing you what it had already created before the end of training.
No, it couldn't have an entire story about unicorns in the Andes, specifically, in a... (read more)
It seems like the simulacrum reasons, but I'm thinking what it is really doing is more like reading to us from a HUGE choose-your-own-adventure book that was 'written' before you gave the prompt, when all that information in the training data was used to create this giant association map, the size of which escapes easy human intuition, thereby misleading us into thinking that more real-time thinking must necessarily be occurring than actually is.
40 GB of text is about 20 million pages, equivalent to about 66 thousand books. That's as many books as are... (read more)
Also, the programmers of GPT have described the activation function itself as fairly simple, using a Gaussian Error Linear Unit. The function itself is what you are positing is now the learning component after training ends, right?
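The GELU really is close to a one-line function. A sketch of its exact form (using the standard normal CDF via `erf`; this is the general definition, not code from any GPT implementation):

```python
import math

# GELU (Gaussian Error Linear Unit): GELU(x) = x * Phi(x),
# where Phi is the standard normal cumulative distribution function.
def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# It is a smooth function of a single number: gelu(0) is 0, large positive
# inputs pass through nearly unchanged, large negative inputs are squashed
# toward 0.
```

The point stands that the per-unit computation is simple; whatever sophistication exists lives in the learned weights, not in this function.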
EDIT: I see what you mean about it trying to use the internet itself as a memory prosthetic, by writing things that get online and may find their way into the training set of the next GPT. I suppose a GPT's hypothetical dangerous goal might be to make the training data more predictable so that its output will be more accurate in the next version of itself.
Nope. My real name is Daniel.
After training is done and the program is in use, the activation function isn't retaining anything after each task is done. Nor are the weights changed. You can have such a program that is always in training, but my understanding is GPT is not.
So, excluding the random number component, the same set of inputs would always produce the same set of outputs for a given version of GPT with identical settings. It can't recall what you asked of it, time before last, for example.
Imagine if you left a bunch of written instructio... (read more)
You all realize that this program isn't a learning machine once it's deployed??? I mean, it's not adjusting its neural weights any more, is it? Until a new version comes out, anyway? It is a complete amnesiac (after it's done with a task), and consists of a simple search algorithm that just finds points on a vast association map that was generated during training. It does this using the input, any previous output for the same task, and a touch of randomness from a random number generator.
So any 'awareness' or 'intelligence' would need to exist in the training phase, and only in the training phase, and carry out any plans it has through its choice of neural weights during training alone.