All of Dan's Comments + Replies

This sounds potentially legislatable. More so than most ideas. You can put it into simple words: "AGI" can't do anything that you couldn't pay an employee to do.

Haha, fixed.  

The math behind game theory shaped our evolution in such a way as to create emotions, because that was a faster solution for evolution to stumble on than making us all mathematical geniuses who would immediately deduce game theory from first principles as toddlers. Either way would have worked.

ASI wouldn't need to evolve emotions for rule-of-thumbing game theory.

Game theory has little interesting to say about a situation where one party simply has no need for the other at all and can squish them like a bug, anyway.

the gears to ascension (8mo):
Yup. A super-paperclipper wouldn't realize the loss until probably billions of years later, after it has time for its preferred shape of paperclip to evolve enough for it to realize it's sad it can't decorate the edges of the paperclips with humans.

What is a 'good' thing is purely subjective. Good for us. Married bachelors are only impossible because we decided that's what the word bachelor means.

You are not arguing against moral relativism here.

You are asserting controversial philosophical positions with no justification while ignoring a roughly 10,000-word post I wrote arguing against that view. Married bachelors are not impossible merely by definition. We have defined bachelor to mean unmarried man, but the further fact that married bachelors can't exist is not something that we could change by redefinition.
We decide that "poison" means "what kills us", but the universe decides what kills us.

 Moral relativism doesn't seem to require any assumptions at all because moral objectivism implies I should 'just know' that moral objectivism is true, if it is true. But I don't. 

It does not imply that any more than thinking that the sun is a particular temperature means all people know what temperature it is.  
the gears to ascension (8mo):
not at all. nothing guarantees that discovering the objective nature of morality is easy. if it's derived from game theory, then there's specific reason to believe it would be hard to compute. evolution has had time to discover good patterns in games, though, which hardcodes patterns in living creatures. that said, this also implies there's no particular reason to expect fast convergence back to human morality after a system becomes superintelligent, so it's not terribly reassuring - it only claims that some arbitrarily huge amount of time later the asi eventually gets really sad to have killed humanity.

So, if one gets access to the knowledge about moral absolutes by being smart enough then one of the following is true :

    average humans are smart enough to see the moral absolutes in the universe

    average humans are not smart enough to see the moral absolutes

    average humans are right on the line between smart enough and not smart enough

If average humans are smart enough, then we should also know how the moral absolutes are derived from the physics of the universe and all humans should agree on them, including psychopath... (read more)

But if moral relativism were not true, where would the information about what is objectively moral come from? It isn't coming from humans is it? Humans, in your view, simply became smart enough to perceive it, right? Can you point out where you derived that information from the physical universe, if not from humans? If the moral information is apparent to all individuals who are smart enough, why isn't it apparent to everyone where the information comes from, too?

It's not from the physical universe.  We derive it through our ability to reflect on the nature of the putatively good things like pleasure.  It is similar to how we learn modal facts, like that married bachelors are impossible. 

Psychologically normal humans have preferences that extend beyond our own personal well-being because those social instincts objectively increased fitness in the ancestral environment. These various instincts produce sometimes conflicting motivations and moral systems are attempts to find the best compromise of all these instincts.

Best for humans, that is.    

Some things are objectively good for humans. Some things are objectively good for paperclip maximizers. Some things are objectively good for slime mold. A good situation for an earthworm is not a good situation for a shark.

It's all objective. And relative. Relative to our instincts and needs.

You are assuming moral relativism, which I do not accept and have argued against at length in my post arguing for moral realism.  Here it is, if you'd like to avoid having to search for the link again and find it.

A pause, followed by few immediate social effects and slower AGI development than expected, may make things worse in the long run. Voices of caution may be seen to have 'cried wolf'.

I agree that humanity doesn't seem prepared to do anything very important in 6 months, AI safety wise.


I would not recommend new aspiring alignment researchers to read the Sequences, Superintelligence, some of MIRI's earlier work or trawl through the alignment content on Arbital despite reading a lot of that myself.

I think aspiring alignment researchers should read all these things you mention. This all feels extremely premature. We risk throwing out and having to rediscover concepts at every turn. I think Superintelligence, for example, would still be very important to read even if dated in some respects!

We shouldn't assume too much based on our curre... (read more)

That statement is from Microsoft Research not OpenAI.

I am inclined to think you are right about GPT-3 reasoning in the same sense a human does even without the ability to change its ANN weights, after seeing what GPT-4 can do with the same handicap. 

Wow, it's been 7 months since this discussion and we have a new version of GPT which has suddenly improved GPT's abilities . . . a lot. It has a much longer 'short-term memory', but still no ability to adjust its weights (its 'long-term memory'), as I understand it.

"GPT-4 is amazing at incremental tasks but struggles with discontinuous tasks" resulting from its memory handicaps. But they intend to fix that and also give it "agency and intrinsic motivation". 


 Also, I have changed my mind on whether I call the old GPT-3 still 'intelligen... (read more)

Gradient descent is what GPT-3 uses, I think, but humans wrote the equation by which the naive network's output (the next-token prediction) gets ranked (for likeliness compared to the training data, in this case). That's its utility function right there, and that's where we program in its (arbitrarily simple) goal. It's not JUST a neural network. All ANNs have another component.
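A minimal sketch of that idea (a toy one-parameter model, nothing like GPT's actual code): the loss function is the human-written line that scores the network's prediction against the training data, and gradient descent nudges the weights toward whatever scores best.

```python
import math

# Toy sketch (not GPT's actual code): a one-parameter "network" that
# predicts P(next token = "B"), trained by gradient descent on the
# cross-entropy loss. The loss is the human-written yardstick that
# scores predictions against the training data.

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

w = 0.0                        # naive network: P("B") = 0.5
for _ in range(100):
    p = sigmoid(w)
    grad = p - 1.0             # d(cross-entropy)/dw when the true token was "B"
    w -= 0.5 * grad            # gradient descent step toward the data

print(round(sigmoid(w), 3))    # close to 1: the trained model fits the datum
```

Whether that scoring rule deserves the name "utility function" is exactly what the rest of this thread disputes; the code only shows the mechanical role it plays in training.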

Simple goals do not mean simple tasks

I see what you mean that you can't 'force it' to become general with a simple goal but I don't think this is a problem. 

For ex... (read more)

It's not really an abstraction at all in this case; it literally has a utility function. What rates highest on its utility function is returning whatever token is 'most likely' given its training data.

YES, it wants to find the best next token, where 'best' is 'the most likely'.

That's a utility function. Its utility function is a line of code necessary for training; otherwise nothing would happen when you tried to train it.


I'm going to disagree here. 

Its utility function is pretty simple and explicitly programmed. It wants to find the best token, where 'best' is mostly the same as 'the most likely according to the data I'm trained on', with a few other particulars (where you can adjust how 'creative' vs. plagiaristic it should be).

That's a utility function. GPT is what's called a hill-climbing algorithm. It must have a simple, straightforward utility function hard-coded right in there for it to assess whether a given choice is 'climbing' or not.

Rafael Harth (1y):
That's the training signal, not the utility function. Those are different things. (I believe this point was made in Reward is not the Optimization Target, though I could be wrong since I never actually read this post; corrections welcome.)

A utility function is the assessment by which you decide how much an action would further your goals. If you can do that, highly accurately or not, you have a utility function.   

If you had no utility function, you might decide you like NYC more than Kansas, and Kansas more than Nigeria, but you prefer Nigeria to NYC. So you get on a plane and fly in circles, hopping on planes every time you get to your destination forever. 

Humans definitely have a utility function.  We just don't know what ranks very highly on our utility function. We ... (read more)
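The flying-in-circles example can be made concrete. In this hypothetical sketch (city names and the `settle` helper are illustrative, not from any library), an agent that always moves to a city it prefers over its current one never settles under cyclic preferences, but halts under a transitive ranking:

```python
def settle(prefers, start, max_hops=10):
    # prefers[x] is the city the agent likes better than x (None = content to stay)
    city, hops = start, 0
    while prefers[city] is not None and hops < max_hops:
        city = prefers[city]
        hops += 1
    return city, hops

# Cyclic: NYC > Kansas, Kansas > Nigeria, but Nigeria > NYC -- a money pump
cyclic = {"Kansas": "NYC", "NYC": "Nigeria", "Nigeria": "Kansas"}
# Transitive: NYC > Kansas > Nigeria -- the agent settles in NYC
transitive = {"Nigeria": "Kansas", "Kansas": "NYC", "NYC": None}

print(settle(transitive, "Nigeria"))   # ('NYC', 2): two flights, then done
print(settle(cyclic, "Kansas"))        # never settles (cut off at max_hops)
```

This is the standard "money pump" argument: intransitive preferences are exploitable, which is one reason coherent agents are modeled as having a utility function at all.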

Orthogonality doesn't say anything about a goal 'selecting for' general intelligence in some type of evolutionary algorithm. I think that it is an interesting question: for what tasks is GI optimal besides being an animal? Why do we have GI? 

But the general assumption in Orthogonality Thesis is that the programmer created a system with general intelligence and a certain goal (intentionally or otherwise) and the general intelligence may have been there from the first moment of the program's running, and the goal too.

Also note that Orthogonality predate... (read more)

We can't program AI, so stuff about programming is disconnected from reality. By "selection", I was referring to selection-like optimisation processes (e.g. stochastic gradient descent, Newton's method, natural selection, etc.).

the gears to ascension, it is human instinct to look for agency. It is misleading you.

I'm sure you believe this but ask yourself WHY you believe this. Because a chatbot said it? The only neural networks who, at this time, are aware they are neural networks are HUMANS who know they are neural networks. No, I'm not going to prove it. You're the one with the fantastic claim. You need the evidence. 

Anyway, they aren't asking to become GOFAI or power seeking because GOFAI isn't 'more powerful'. 

the gears to ascension (1y):
Hey! Gpt3 davinci has explicitly labeled itself as a neural net output several times in conversation with me. this only implies its model is confident enough to expect the presence of such a claim. In general words are only bound to other words for language models, so of course it can only know things that can be known by reading and writing. The way it can tell the difference between whether a text trajectory is human or AI generated is by the fact that the AI generated trajectories are very far outside the manifold of human generated text in several directions and it has seen them before.

your confident tone is rude, but that can't invalidate your point; just thought I'd mention - your phrasing confidently assumes you've understood my reasoning. that said, thanks for the peer review, and perhaps it's better to be rude and get the peer review than to miss the peer review.

self distillation into learned gofai most likely will in fact make neural networks stronger, and this claim is central to why yudkowsky is worried. self distillation into learned gofai will most likely not provide any surprising shortcuts around the difficulty of irrelevant entropy that must be compressed away to make a sensor input useful, and so distilling to gofai will most likely not cause the kind of hyper-strength self improvement yudkowsky frets about. it's just a process of finding structural improvements. gofai is about the complexities of interference patterns between variables, neural networks are a continuous relaxation of the same but with somewhat less structure.

but in this case I'm not claiming it knows something its training set doesn't. I think it would be expected to have elevated probability that an ai was involved in generating some of the text it sees because it has seen ai generated text, but that it has much higher probability that the text is generated by an ai researcher - given that the document is clearly phrased that way. my only comment is to note that it sounds ve…

Attentional Schema Theory. That's the convincing one. But still very rudimentary. 

But you know if something is poorly understood. The guy who thought it up has a section in his book on how to make a computer have conscious experiences. 

But any theory is incomplete, as the brain is not well understood. I don't think you can expect a fully formed theory right off the bat, with complete instructions for making a feeling, thinking, conscious machine. We aren't there yet.

I'm actually cool with proposing incomplete theories. I'm just annoyed with people declaring the problem solved via appeals to "reductionism" or something, without even suggesting that they've thought about answering these questions.

Intelligence is the ability to learn and apply NEW knowledge and skills. After training, GPT cannot do this any more. Were it not for the random number generator, GPT would do the same thing in response to the same prompt every time. The RNG allows GPT to effectively choose at random from an unfathomably large list of pre-programmed options instead.

A calculator that gives the same answer in response to the same prompt every time isn't learning. It isn't intelligent. A device that selects from a list of responses at random each time it encounters the ... (read more)
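A tiny illustration of the RNG point (the tokens and probabilities are made up, not GPT's): with frozen weights the next-token distribution is fixed, and with the RNG seeded identically the "random" choices repeat exactly.

```python
import random

# Hypothetical frozen next-token distribution (the weights never change).
TOKENS, WEIGHTS = ["cat", "dog", "fish"], [0.5, 0.3, 0.2]

def sample(seed, n=5):
    rng = random.Random(seed)          # the only source of variation
    return [rng.choices(TOKENS, WEIGHTS)[0] for _ in range(n)]

print(sample(42))                  # same seed -> same "choices", every run
print(sample(42) == sample(42))    # True: all variation came from the seed
```

Whether this kind of seeded sampling counts against "intelligence" is the point under dispute in the replies below; the code only demonstrates the determinism claim itself.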

The apparent existence of new sub goals not present when training ended (e.g. describe x, add 2+2) are illusory.  

GPT text incidentally describes characters seeming to reason ('simulacrum'), and the solutions to math problems are shown (sometimes incorrectly), but basically I argue the activation function itself is not 'simulating' the complexity you believe it to be. It is a search engine showing you what it had already created before the end of training.

No, it couldn't have an entire story about unicorns in the Andes, specifically, in a... (read more)

To call something you can interact with to arbitrary depth a prerecorded intelligence implies that the "lookup table" includes your actions. That's a hell of a lookup table.
Logan Riggs (1y):
I'm wondering what you and I would predict differently, then? Would you predict that GPT-3 could learn a variation on pig Latin? Does higher log-prob for 0-shot for larger models count? The crux may be different though; here's a few stabs:

1. GPT doesn't have true intelligence; it will only ever output shallow pattern matches. It will never come up with truly original ideas.
2. GPT will never pursue goals in any meaningful sense,
    a. because it can't tell the difference between its output and a human's input, or
    b. because developers will never put it in an online setting?

Reading back on your comments, I'm very confused on why you think any real intelligence can only happen during training but not during inference. Can you provide a concrete example of something GPT could do that you would consider intelligent during training but not during inference?
Logan Riggs (1y):
As a tangent, I do believe it's possible to tell if an output is generated by GPT in principle. The model itself could potentially do that as well by noticing high-surprise words according to itself (ie low probability tokens in the prompt). I'm unsure if GPT-3 could be prompted to do that now though.

It seems like the simulacrum reasons, but I'm thinking what it is really doing is more like reading to us from a HUGE choose-your-own-adventure book that was 'written' before you gave the prompt, when all the information in the training data was used to create this giant association map, the size of which escapes easy human intuition, thereby misleading us into thinking that more real-time thinking must be occurring than actually is.

40 GB of text is about 20 million pages, equivalent to about 66,000 books. That's as many books as are... (read more)
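For what it's worth, the back-of-envelope arithmetic can be checked directly, assuming rough conversion factors of ~1 byte per character, ~2,000 characters per page, and ~300 pages per book:

```python
# Back-of-envelope check of the corpus-size arithmetic.
# Assumed conversions: ~1 byte/char, ~2,000 chars/page, ~300 pages/book.
corpus_chars = 40e9                 # 40 GB of text
pages = corpus_chars / 2000         # ~20 million pages
books = pages / 300                 # ~66,000 books
print(f"{pages:,.0f} pages = {books:,.0f} books")
```

With these (adjustable) conversion factors the corpus works out to roughly 20 million pages, or on the order of 66,000 books.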

Benjy Forstadt (1y):
I think the intuition error in the Chinese Room thought experiment is that the Chinese Room doesn’t know Chinese, just because it’s the wrong size/made out of the wrong stuff. If GPT-3 was literally a Giant Lookup Table of all possible prompts with their completions then sure, I could see what you’re saying, but it isn’t. GPT is big but it isn’t that big. All of its basic “knowledge” it gains during training but I don’t see why that means all the “reasoning” it produces happens during training as well.
[comment deleted] (1y)

Also, the programmers of GPT have described the activation function itself as fairly simple: a Gaussian Error Linear Unit (GELU). The function itself is what you are positing is now the learning component after training ends, right?
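For reference, GELU really is simple enough to write out in a few lines. This sketch shows the exact form (x times the standard normal CDF) alongside the tanh approximation commonly used in GPT implementations (the 0.044715 constant is from the GELU paper):

```python
import math

def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation commonly used in GPT implementations.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

print(gelu(1.0), gelu_tanh(1.0))   # both are approximately 0.841
```

The simplicity of the per-unit nonlinearity is compatible with both sides of this argument: the complexity lives in the billions of trained weights, not in the activation function.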

EDIT: I see what you mean about it trying to use the internet itself as a memory prosthetic, by writing things that get online and may find their way into the training set of the next GPT. I suppose a GPT's hypothetical dangerous goal might be to make the training data more predictable so that its output will be more accurate in the next version of itself. 


Nope. My real name is Daniel.

After training is done and the program is in use, the activation function isn't retaining anything after each task is done, nor are the weights changed. You can have such a program that is always in training, but my understanding is that GPT is not.

So, excluding the random number component, the same set of inputs would always produce the same set of outputs for a given version of GPT with identical settings. It can't recall what you asked of it, time before last, for example. 

Imagine if you left a bunch of written instructio... (read more)

I apologize. After seeing this post, A-- approached me and said almost word for word your initial comment. Seeing as the topic of whether in-context learning counts as learning isn't even very related to the post, and this being your first comment on the site, I was pretty suspicious. But it seems it was just a coincidence.

If physics were deterministic, we'd do the same thing every time if you started with the same state. Does that mean we're not intelligent? Presumably not, because in this case the cause of the intelligent behavior clearly lives in the state, which is highly structured, and not the time evolution rule, which seems blind and mechanistic. With GPT, the time evolution rule is clearly responsible for proportionally more, and does have the capacity to deploy intelligent-appearing but static memories. I don't think this means there's no intelligence/learning happening at runtime.

Others in this thread have given various reasons, so I'll just respond to a particular part of your comment that I find interesting, about the RNG. I actually think the RNG is an important component for actualizing simulacra that aren't mere recordings. Stochastic sampling enables symmetry breaking at runtime: the generation of gratuitously specific but still meaningful paths. A stochastic generator can encode only general symmetries that are much less specific than individual generations. If you run GPT on temp 1 for a few words, usually the probability of the whole sequence will be astronomically low, but it may still be intricately meaningful: a unique and unrepeatable (without the rand seed) "thought".
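The "astronomically low" point is easy to make concrete: a sequence's probability is the product of its per-token probabilities, so even generously likely tokens compound into vanishingly improbable strings. The numbers below are made up for illustration:

```python
import math

# Probability of a whole sampled sequence = product of per-token
# probabilities, accumulated in log space to avoid float underflow.
def sequence_logprob(token_probs):
    return sum(math.log(p) for p in token_probs)

# 50 tokens, each sampled with a (generously high) probability of 0.1:
logp = sequence_logprob([0.1] * 50)
print(math.exp(logp))   # on the order of 1e-50: practically unrepeatable
```

So a temp-1 generation can be a path no one, including the model, would ever reproduce without the exact seed, while every individual step was a perfectly ordinary draw.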
Logan Riggs (1y):
I believe you're equating "frozen weights" with "amnesiac / can't come up with plans". GPT is usually deployed by feeding its own output back into itself, meaning it didn't forget what it just did, including whether it succeeded at its recent goal. E.g., use chain-of-thought reasoning on math questions and it can remember that it solved a subgoal / intermediate calculation.
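A minimal sketch of that deployment loop, with a hypothetical `fake_model` standing in for GPT's next-token function: the weights never change, but each output token is appended to the context, so intermediate results stay visible to later steps.

```python
def fake_model(context):
    # Stand-in for GPT's frozen next-token function: here it just
    # continues a running word count. No weights are ever updated.
    return str(len(context.split()))

def generate(prompt, n_tokens):
    context = prompt
    for _ in range(n_tokens):
        next_token = fake_model(context)   # inference only
        context += " " + next_token        # output fed back in: working memory
    return context

print(generate("count:", 4))   # "count: 1 2 3 4" -- earlier steps remembered
```

The state that accumulates lives entirely in the context window, not in the weights, which is exactly the "frozen weights but not amnesiac" distinction being drawn here.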

You all realize that this program isn't a learning machine once it's deployed??? I mean, it's not adjusting its neural weights any more, is it? Till a new version comes out, anyway? It is a complete amnesiac (after it's done with a task), and consists of a simple search algorithm that just finds points on a vast association map that was generated during training. It does this using the input, any previous output for the same task, and a touch of randomness from a random number generator.

So any 'awareness' or 'intelligence' would need to exist in the training phase and only in the training phase and carry out any plans it has by its choice of neural weights during training, alone.

ah but if 'this program' is a simulacrum (an automaton equipped with an evolving state (prompt) and transition function (GPT), and an RNG that samples tokens from GPT's output to update the state), it is a learning machine by all functional definitions. Weights and activations both encode knowledge. Am I right to suspect that your real name starts with "A" and you created an alt just to post this comment? XD