What's Your Cognitive Algorithm?

[-]johnswentworth6y190

I definitely agree that most people most of the time are using an algorithm like babble-with-associations, with feedback refining the associations over time. That said, I do think it is possible to do better, and at least some people (especially STEM people working on hard problems) learn to use other methods at least sometimes. In the rationality-movement-I'd-like-to-see, systematizing and training such more-efficient-thinking-algorithms is a central piece of the curriculum, and a major goal of the movement.

I've been thinking about explicit models of more-efficient-thinking-algorithms for a while now - Mazes and Crayon was an early piece along those lines, and of course a lot of the gears pieces are tied in. I have at least one thinking-algorithm which is more efficient on novel problems than vanilla babble-with-associations and which I think is trainable. I've been working towards a write-up (Everyday Lessons from High-Dimensional Optimization was partly intended as background for that write-up).

That write-up isn't ready yet, but here are some sparknotes:

We use gears-level models because black-box problem solving in high dimensions is inefficient. The idea of "gears" is to decompose the system into coupled low-dimensional subsystems.
My current best guess is that equations/constraints are a universal representation for gears. E.g. in a causal model (specifically structural equations), the equations/constraints are the structural equations themselves. Likewise in physics/chemistry/engineering/etc: the constraints are the system's governing equations. When we want to problem-solve on the system, those governing equations act as constraints on our problem.
- I expect that humans' native representation of gears-level models amounts to something equivalent to constraints as well.
This immediately suggests a whole class of problem-solving strategies, in particular:
- constraint relaxation to generate heuristics (e.g. "could I solve this problem ignoring this constraint?")
- dual methods (e.g. thinking about tautness/slackness of constraints - which constraints are "easy", and which are limiting our ability to solve the problem)
There's also the question of how to figure out the equations/constraints in the first place. Presumably we could look for constraints-on-the-constraints, but then we go up a whole ladder of meta-levels... except we actually don't. Here again, we can leverage duality: constraints-on-constraints are partial solutions to the original problem. So as we go "up" the meta-ladder, we just switch back-and-forth between two problems. One specific example of this is simultaneously looking for a proof and a counterexample in math. A more visual example is in this comment.
- Concretely, this process looks like trying out some solution-approaches while looking for any common barrier they run into, then thinking about methods to specifically circumvent that barrier (ignoring any other constraints), then looking for any common barrier those methods run into, etc...

The key feature of this class of thinking-algorithms is that they can get around the inefficiency issues in high-dimensional problems, as long as the problem-space decomposes - which real-world problem spaces usually do, even in cases where you might not expect it. This is basically just taking some standard tricks from AI (constraint relaxation, dual methods), and applying them directly to human thinking.

Note that this can still involve babbly thinking for the lower-level steps; the whole thing can still be implemented on top of babble-with-associations. The key is that we only want to babble at low-dimensional subproblems, while using a more systematic approach for the full high-dimensional problem.

[-]Raemon6y50

This all sounds true (and I meant it to be sort of implied by the post, although I didn't delve into every given possible "improved algorithm", and perhaps could have picked a better example.)

What seemed to me was the gears/model-based-thinking still seems implemented on babble, not just for the lower level steps, but for the higher level systematic strategy. (I do think this involves first building some middle-order thought processes on top of the babble, and then building the high level strategy out of those pieces)

i.e. when I use gears-based-systematic-planning, the way the pieces of the plan come together still feel like they're connected via the same underlying associative babbling. It's just that I'd have a lot of tight associations between collections of strategies, like:

Notice I'm dealing with a complex problem
Complex problem associates into "use the appropriate high level strategy for this problem" (which might involve first checking possible strategies, or might involve leaping directly to the correct strategy)
Once I have a gears-oriented strategy, it'll usually have a step one, then a step two, etc (maybe looping around recursively, or with branching paths) and each step is closely associated with the previous step.

Does it feel differently to you?

When you do the various techniques you describe above, what is the qualia and low-level execution of it feel like?

[-]johnswentworth6y50

I do think it's all ultimately implemented on top of something babbly, yeah. The babbly part seems like the machine code of the brain - ultimately everything has to be implemented in that.

I think what I mean by "gearsy reasoning" is something different than how you're using the phrase. It sounds like you're using it as a synonym for systematic or system-2 reasoning, whereas I see gears as more specifically about decomposing systems into their parts. Gearsy reasoning doesn't need to look very systematic, and systematic reasoning doesn't need to be gearsy - e.g. simply breaking things into steps is not gearsy reasoning in itself. So the specific "tight associations" you list do not sound like the things I associate with gearsy thinking specifically.

As an example, let's say I'm playing a complicated board game and figuring out how to get maximum value out of my resources. The thought process would be something like:

Ok, main things I want are X, Y, Z -> what resources do I need for all that?
(add it up)
I have excess A and B but not enough C -> can I get more C?
I have like half a dozen ways of getting more C, it's basically interchangeable with B at a rate of 2-to-1 -> do I have enough B for that?
...

So that does look like associative babbling; the "associations" it's following are mainly the relationships between objects given by the game actions, plus the general habit of checking what's needed (i.e. the constraints) and what's available.

I guess one insight from this: when engaged in gears-thinking, it feels like the associations are more a feature of the territory than of my brain. It's not about taste, it's about following the structure of reality (or at least that's how it feels).

[-]Raemon6y40

I think what I mean by "gearsy reasoning" is something different than how you're using the phrase. It sounds like you're using it as a synonym for systematic or system-2 reasoning, whereas I see gears as more specifically about decomposing systems into their parts.

Yeah. My reply was somewhat general and would work for non-gearsy strategies as well. I do get that gearsiness and systematicness are different axes and strategies can employ them independently. I was referring offhandedly to "systematic gearsiness" because it's what you had just mentioned and I just meant to convey that the babble-process worked for it.

i.e, I think your list that begins "Okay, the main things I want are X, Y and Z..." follows naturally from my list that ends "Once I have a gears-oriented strategy, it'll usually have a step one..."

I guess one insight from this: when engaged in gears-thinking, it feels like the associations are more a feature of the territory than of my brain. It's not about taste, it's about following the structure of reality.

The way I'd parse it is that I have some internalized taste that "when figuring out a novel, complex problem, it's important to look for associations that are entangled with reality". And then as I start exploring possible strategies to use, or facts that might be relevant, "does this taste gearsy?" and "does this taste 'entangled with reality'" are useful things to be able to check. (Having an aesthetic taste oriented around gearsy-entangledness lets you quickly search or rule out directions of thought at the sub-second level, which might then turn into deliberate, conscious thought)

Alternately: I'm developing a distaste for "babbling that isn't trying to be methodical" when working on certain types of problems, which helps remind me to move in a more methodical direction (which is often but not always gearsy)

[edit: I think you can employ gearsy strategies without taste, I just think taste is a useful thing to acquire

[-]Kaj_Sotala5y*170

For my paper "How Feasible is the Rapid Development of Artificial Superintelligence?", I looked at some of the existing literature on human expertise to develop a model of exactly what it is that human intelligence consists of.

As a very rough distinction, somewhat analogous to Type 1 and Type 2 reasoning, we can divide human expertise into two components: pattern recognition and mental simulation. An excerpt:

There exists a preliminary understanding, if not of the details of human decision-making, then at least the general outline. A picture that emerges from this research is that expertise is about developing the correct mental representations (Klein 1999, Ericsson and Pool 2016).

A mental representation is a very general concept, roughly corresponding to any mental structure forming the content of something that the brain is thinking about (Ericsson and Pool 2016).

Domain-specific mental representations are important because they allow experts to know what something means; know what to expect; know what good performance should feel like; know how to achieve the good performance; know the right goals for a given situation; know the steps necessary for achieving those goals; mentally simulate how something might happen; learn more detailed mental representations for improving their skills (Klein 1999, Ericsson and Pool 2016).

Although good decision-making is often thought of as a careful deliberation of all the possible options, such a type of thinking tends to be typical of novices (Klein 1999). A novice will have to try to carefully reason their way through to an answer, and will often do poorly regardless, because they do not know what things are relevant to take into account and which ones are not. An expert does not need to—they are experienced enough to instantly know what to do.

A specific model of expertise is the recognition-primed decision-making model (Klein 1999). First, a decision-maker sees some situation, such as a fire for a firefighter or a design problem for an architect. The situation may then be recognized as familiar, such as a typical garage fire. Recognizing a familiar situation means understanding what goals make sense and what should be focused on, which cues to pay attention to, what to expect next and when a violation of expectations shows that something is amiss, and knowing what the typical ways of responding are. Ideally, the expert will instantly know what to do.

The expectations arising from mental representations also give rise to intuition. As one example, Klein (1999) describes the case of a firefighter lieutenant responding to a kitchen fire in an ordinary one-story residential house. The lieutenant’s crew sprayed water on the fire, but contrary to expectations, the water seemed to have little impact. Something about the situation seemed wrong to the lieutenant, who ordered his crew out of the house. As soon as they had left the house, the floor where they had been standing collapsed. If the firefighters had not pulled out, they would have fallen down to the fire raging in the basement. The lieutenant, not knowing what had caused him to give the order to withdraw, initially attributed the decision to some form of extra-sensory perception.

In a later interview, the lieutenant explained that he did not suspect that the building had a basement, nor that the seat of the fire was under the floor that he and his crew were standing on. However, several of his expectations of a typical kitchen fire were violated by the situation. The lieutenant was wondering why the fire did not react to water as expected, the room was much hotter than he would have expected out of a small kitchen fire, and while a heat that hot should have made a great deal of noise, it was very quiet. The mismatch between the expected pattern and the actual situation led to an intuitive feeling of not knowing what was going on, leading to the decision to regroup. This is intuition: an automatic comparison of the situation against existing mental representations of similar situations, guiding decision-making in ways whose reasons are not always consciously available.

In an unfamiliar situation, the expert may need to construct a mental simulation of what is going on, how things might have developed to this point, and what effect different actions would have. Had the floor mentioned in the previous example not collapsed, given time the firefighter lieutenant might have been able to put the pieces together and construct a narrative of a fire starting from the basement to explain the discrepancies. For a future-oriented example, a firefighter thinking about how to rescue someone from a difficult spot might mentally simulate where different rescue harnesses might be attached on the person, and whether that would exert dangerous amounts of force on them.

Mental representations are necessary for a good simulation, as they let the expert know what things to take into account, what things could plausibly be tried, and what effects they would have. In the example, the firefighter’s knowledge allows him to predict that specific ways of attaching the rescue harness would have dangerous consequences, while others are safe.

Similar to the firefighter's intuition, GPT-2 has the ability to make predictions about what's most likely to "happen next". But there are also several differences.

Most notably, GPT-2's only goal is just that: predict what's going to happen next. This is a much more limited task than the one faced by (for example) a human firefighter, who needs to not just predict how a fire might proceed, but also how to best respond to it and which actions to take.

Let's take a concrete example and look at it in more detail; from Klein 1999:

The initial report is of flames in the basement of a four-story apartment building: a one-alarm fire. The [firefighter] commander arrives quickly and does not see anything. There are no signs of smoke anywhere. He finds the door to the basement, around the side of the building, enters, and sees flames spreading up the laundry chute. That's simple: a vertical fire that will spread straight up. Since there are no external signs of smoke, it must just be starting.

The way to fight a vertical fire is to get above it and spray water down, so he sends one crew up to the first floor and another to the second floor. Both report that the fire has gotten past them. The commander goes outside and walks around to the front of the building. Now he can see smoke coming out from under the eaves of the roof.

It is obvious what has happened: the fire has gone straight up to the fourth floor, has hit the ceiling there, and is pushing smoke down the hall. Since there was no smoke when he arrived just a minute earlier, this must have just happened. It is obvious to him how to proceed now that the chance to put out the fire quickly is gone. He needs to switch to search and rescue, to get everyone out of the building, and he calls in a second alarm. The side staircase near the laundry chute had been the focus of activity before. Now the attention shifts to the front stairway as the evacuation route.

And this picture shows the general algorithm for recognition-primed decision-making. Breaking down the story, we might say that something like the following happened:

1. The commander sees no smoke outside, then flames spreading up the laundry chute. These are the cues that allow him to recognize this as a vertical fire that is just starting.

2. The commander's mental representation of vertical fires includes plausible goals, expectancies of what is going to happen, and actions that could further the goals. A plausible goal for this situation: put the fire out quickly, before it has a chance to spread. An action that would further it: send people to spray water on the fire from above. A rapid mental simulation suggests that this should work, so he gives the order.

3. The crews report that the fire has gotten past them. This violates the expectancy that the fire should be in the basement only; to diagnose this anomaly, the commander goes outside to gather more data. When he sees the smoke coming up from the roof, this allows him to construct a story of what has happened.

4. The situation is now different, calling up a new mental representation: that of a fire that has spread from the basement to the top floor. Plausible goals in this situation: get everyone out of the building. Actions to take here: call in reinforcements, get people to the front stairway to carry out an evacuation.

As at least one major difference, GPT-2 never does the thing where it expects that something will happen, and then takes actions to re-evaluate the situation if the prediction goes wrong. If it predicts "the word after 'maximize' is going to be 'paperclip'" with 90% confidence, finding out that it's actually followed by 'human values' doesn't cause it to...

Actually, I don't need to complete that sentence, because "seeing that it was mistaken" isn't actually a thing that happens to GPT-2 in the first place. It does get feedback to its predictions during its training phase, but once it has been trained, it will never again compare its prediction with the actual result. You just give it a prompt and then it tries to predict the rest, that's it. If you give it one prompt, have it predict the rest of it, and then give it a revised prompt with the correct completion, it has no idea that you are doing this. It just sees one prompt and then another. This makes it incapable of noticing that its expectations are violated, gathering more information in return, and then constructing a story of what happened and what kind of a situation it's actually in.

You could probably apply it to something like "predict what a human firefighter would do in this situation" (imitation learning), but as anyone can verify by playing AI Dungeon (which now uses GPT-3 not GPT-2), its predictions get very nonsensical very quickly. It doesn't really do the kind of causal reasoning that would involve mental simulations to produce novel responses, e.g. the following example from Klein:

A [firefighter] lieutenant is called out to rescue a woman who either fell or jumped off a highway way overpass. She is drunk or on drugs and is probably trying to kill herself. Instead of falling to her death, she lands on the metal supports of a highway sign and is dangling there when the rescue team arrives.

The lieutenant recognizes the danger of the situation. The woman is semiconscious and lying bent over one of the metal struts. At any moment, she could fall to her death on the pavement below. If he orders any of his team out to help her, they will be endangered because there is no way to get a good brace against the struts, so he issues an order not to climb out to secure her.

Two of his crew ignore his order and climb out anyway. One holds onto her shoulders and the other to her legs.

A hook-and-ladder truck arrives. The lieutenant doesn't need their help in making the rescue, so tells them to drive down to the highway below and block traffic in case the woman does fall. He does not want to chance that the young woman will fall on a moving car.

Now the question is how to pull the woman to safety.

First, the lieutenant considers using a rescue harness, the standard way of raising victims. It snaps onto a person's shoulders and thighs. In imagining its use, he realizes that it requires the person to be in a sitting position or face up. He thinks about how they would shift her to sit up and realizes that she might slide off the support.

Second, he considers attaching the rescue harness from the back. However, he imagines that by lifting the woman, they would create a large pressure on her back, almost bending her double. He does not want to risk hurting her.

Third, the lieutenant considers using a rescue strap-another way to secure victims, but making use of a strap rather than a snap-on harness. However, it creates the same problems as the rescue harness, requiring that she be sitting up or that it be attached from behind. He rejects this too.

Now he comes up with a novel idea: using a ladder belt-a strong belt that firefighters buckle on over their coats when they climb up ladders to rescue people. ple. When they get to the top, they can snap an attachment on the belt to the top rung of the ladder. If they lose their footing during the rescue, they are still attached to the ladder so they won't plunge to their death.

The lieutenant's idea is to get a ladder belt, slide it under the woman, buckle it from behind (it needs only one buckle), tie a rope to the snap, and lift her up to the overpass. He thinks it through again and likes the idea, so he orders one of his crew to fetch the ladder belt and rope, and they tie it onto her.

In the meantime, the hook-and-ladder truck has moved to the highway below the overpass, and the truck's crew members raise the ladder. The firefighter on the platform at the top of the ladder is directly under the woman shouting, "I've got her. I've got her." The lieutenant ignores him and orders his men to lift her up.

At this time, he makes an unwanted discovery: ladder belts are built for sturdy firefighters, to be worn over their coats. This is a slender woman wearing a thin sweater. In addition, she is essentially unconscious. When they lift her up, they realize the problem. As the lieutenant put it, "She slithered through the belt like a slippery strand of spaghetti."

Fortunately, the hook-and-ladder man is right below her. He catches her and makes the rescue. There is a happy ending.

Now the lieutenant and his crew go back to their station to figure out what had gone wrong. They try the rescue harness and find that the lieutenant's instincts were right: neither is usable.

Eventually they discover how they should have made the rescue. They should have used the rope they had tied to the ladder belt. They could have tied it to the woman and lifted her up. With all the technology available to them, they had forgotten that you can use a rope to pull someone up.

Consider the lieutenant's first idea. Possibly GPT-2 might have been able to notice that statistically, firefighters typically use rescue harnesses in situations like this. But it doesn't do any mental simulation to see what the predicted outcome of using that harness would be. If there had been enough previous situations where a harness was unusable, and enough cues to indicate to GPT-2 that this was one of those situations, then it could accurately predict that the rescuers would do something different. But if this is a novel situation (as most situations are), then it needs to actually do causal reasoning and notice that the woman would slide off the support. (This is similar to your "check for badness" thing, except it happens via mental simulation rather than just association.)

Orthonormal gives us a real-life example of what happens when an AI uses pattern-matching, but does not do causal reasoning, and then tries to play Starcraft:

The overhyped part is that AlphaStar doesn't really do the "strategy" part of real-time strategy. Each race has a few solid builds that it executes at GM level, and the unit control is fantastic, but the replays don't look creative or even especially reactive to opponent strategies.

That's because there's no representation of causal thinking - "if I did X then they could do Y, so I'd better do X' instead". Instead there are many agents evolving together, and if there's an agent evolving to try Y then the agents doing X will be replaced with agents that do X'. But to explore as much as humans do of the game tree of viable strategies, this approach could take an amount of computing resources that not even today's DeepMind could afford.

(This lack of causal reasoning especially shows up in building placement, where the consequences of locating any one building here or there are minor, but the consequences of your overall SimCity are major for how your units and your opponents' units would fare if they attacked you. In one comical case, AlphaStar had surrounded the units it was building with its own factories so that they couldn't get out to reach the rest of the map. Rather than lifting the buildings to let the units out, which is possible for Terran, it destroyed one building and then immediately began rebuilding it before it could move the units out!)

AlphaStar notices that the units are trapped, which it associates with "must destroy the thing that is trapping them". Then it notices that it is missing a factory, and its associations tell it that in this situation it should have one more factory, and it should be located right where the destroyed factory should be, so...

In contrast, a human might have considered destroying the factory, but then noticed that this leads to a situation where there is one factory too little; and then realized that the building can just be lifted out of the way.

Here is Klein's illustration of how a generic mental simulation seems to work; he also has illustrations of the more specific variant of explaining the past (e.g. the firefighter commander constructing a story of how the basement fire had spread) and projecting to the future (e.g. the firefighter lieutenant trying to figure out how to rescue the woman). Here's him explaining a part of the figures:

Consider this example. Some need arises for building a mental simulation; let us say a coworker has suddenly started acting rudely toward you. The simulation has to let you infer what the original situation was that led to the events you are observing. You assemble the action sequence: the set of transitions that make up the simulation. Perhaps you recall an incident that same morning when you were chatting with some other people in your office and said something that made them laugh. Perhaps you also recall that earlier that morning, your coworker had confided some embarrassing secret to you. So you construct a sequence in which your coworker trusts you with a confidence, then regrets it immediately afterward and feels a little awkward around you, then sees you possibly entertaining some other people with the secret, and then feels that it is going to be unbearable to live with you in the same setting. Now you can even remember that after you made the other people laugh, you looked up and saw the coworker giving you a look that made you feel uneasy. This set of states and transitions is the action sequence, the mental simulation that explains the rude behavior.

The next step is to evaluate the action sequence at a surface level. Is it coherent (Do the steps follow from each other)? Yes, it is. Does it apply (Does the sequence account for the rudeness)? Yes, it does. How complete is it (Does it leave out any important factors, such as the excellent performance evaluation you have just received)? Yes, there are some more pieces that might belong to the puzzle. But in general, the mental simulation passes the internal evaluation. It is an acceptable explanation. That does not mean it is correct.

Sometimes the mental simulation will not pass the internal evaluation, and that also helps you make sense of things. [The following example] illustrates this with a story reported in a newspaper. [...]

The IRA Terrorist. A well-respected lawyer has agreed to defend a man accused of committing an act of terrorism: planting a bomb for the IRA. The lawyer, asked why he would take the case, answers that he interviewed the accused man, who was shaking and literally overcome with panic. He was surprised to see the man fall apart like that. He tried to imagine the IRA's recruiting such a person for a dangerous mission and found that he could not. He cannot conjure up a scenario in which the IRA would give a terrorism assignment to a man like this, so his conclusion is that the man is innocent.

This lawyer could not generate an action sequence that passed his internal evaluation-specifically, the requirement that the transition between steps be plausible. His failure to assemble a plausible sequence of steps led him to a different explanation than the prosecutors had formed. That's why you see a long, curving arc in figure 5.4: the failure to assemble the mental simulation was the basis of the conclusion.

There are also times when you use mental simulation to try to increase your understanding of situations like these. You are trying to build up better models. When you run the action sequence in your mind, you may notice parts that still seem vague. Maybe you can figure out how to set up a better action sequence, or maybe there are some more details about the present state that you should gather. Going back to the example of your coworker, your explanation has not included the fact that you received such a good performance evaluation. What was your coworker's performance evaluation? Possibly the coworker felt you had gotten recognition for work that someone else had done. Perhaps you can get a general sense of the situation by talking to your boss. That might give you some more data points for building your explanations.

(If you look at explanations of how GPT's Transformer architecture works [1, 2], you can see that it doesn't do anything like this.)

[-]Raemon5y40

So, doublechecking my comprehension:

In my OP, my claim was basically "you probably can get human-level output out of something GPT-like by giving it longer term rewards/punishments, and having it continuously learn" (i.e. give it an actual incentive to figure out how to fight fires in novel situations, which current GPT doesn't have).

I realize that leaves a lot of fuzziness in "well, is it really GPT if has a different architecture that continuously learns and has longterm rewards?". My guess was that it'd be fairly different from GPT architecturally, but that it wouldn't depend on architectural insights we haven't already made, it'd just be work to integrate existing insights.

Is your claim "this is insufficient – you still need working memory and the ability to model scenarios, and currently we don't know how to do that, and there are good reasons to think that throwing lots of data and better reward structures at our existing algorithms won't be enough to cause this to develop automatically via Neural Net Magic?"

[-]Kaj_Sotala5y40

Is your claim "this is insufficient – you still need working memory and the ability to model scenarios, and currently we don't know how to do that, and there are good reasons to think that throwing lots of data and better reward structures at our existing algorithms won't be enough to cause this to develop automatically via Neural Net Magic?"

So at this point I'm pretty uncertain of what neural nets can or can not learn to do. But at least I am confident in saying that GPT isn't going to learn the kinds of abilities that would be required for actually fighting fires, as it is trained and tested on a fundamentally static task, as opposed to one that requires adapting your behavior to a situation as it develops. For evaluating at progress on those, projects like AlphaStar look like more relevant candidates.

I don't feel confident in saying whether some combination of existing algorithms and training methods could produce a system that approached the human level on dynamic tasks. Most people seem to agree that we haven't gotten neural nets to learn to do good causal reasoning yet, so my understanding of the expert consensus is that current techniques seem inadequate... but then the previous expert consensus would probably also have judged neural nets to be inadequate for doing many of the tasks that they've now mastered.

[-]Raemon5y40

Thanks, this is great. May have more thoughts after thinking it over a bit.

[-]Steven Byrnes5y100

Meanwhile, Jeff Hawkins says "Every part of the neocortex is running the same algorithm", and it's looking like maybe brains aren't doing that complicated a set of things.

This is nitpicking, but your post goes back and forth between the "underlying algorithm" level and the "learned model" level. Jeff Hawkins is talking about the underlying algorithm level when he says that it is (more or less) the same in every part of the neocortex. But almost all the things you mention in "My algorithm as I understand it" are habits of thought that you've learned over the years. (By the same token, we should distinguish between "Transformer + SGD" and "whatever calculations are being done by the particular weight settings in the trained Transformer model".)

I don't expect there to be much simplicity or universality at the "learned model" level ... I expect that people use lots of different habits of thought.

Has anyone done anything like "Train a neural net on Reddit, where it's somehow separately rewarded for predicting the next word, and also for predicting how much karma a cluster of words will get, and somehow propagating that back into the language generation?")

I imagine the easiest thing would be to pre-pend the karma to each post, fine-tune the model, then you can generate high-karma posts by just prompting with "Karma 1000: ...". I'm not aware of anyone having done this specific thing but didn't check. I vaguely recall something like that for AlphaStar, where they started by imitation learning with the player's skill flagged, and then could adjust the flag to make their system play better or worse.

What's happening in System 2 thought?

If you haven't already, see Kaj's Against System 1 and System 2. I agree with everything he wrote; the way I would describe it is: Our brains house a zoo of compositional generative models, and system 2 is a cool thing where generative models can self-assemble into an ad-hoc crappy serial computer. For example, you can learn a Generative Model X that first summons a different Generative Model Y, and then summons either Generative Model Z₁ or Z₂ conditional on some feature of Generative Model Y. (Something like that ... I guess I should write this up better someday.) Anyway, this is a pretty neat trick. Can a trained Transformer NN do anything like that? I think there's some vague sense in which a 6-layer Transformer can do similar things as a series of 6 serial human thoughts maybe?? I don't know. There's definitely a ton of differences too.

...chunking...

My vague sense about foresight (rolling out multiple steps before deciding what to do) is that it's helpful for sample-efficiency but not required in the limit of infinite training data. Some examples: in RL, both TD learning and tree search eventually converge to the same optimal answer; AlphaGo without a tree search is good but not as good as AlphaGo with a tree search.

Perhaps not coincidentally, language models are pretty sample inefficient compared to people...

In my everyday life, I feel like my thoughts very often involve a sequence of two or three chunks, like "I will reach into my bag and then pull out my wallet", and somewhat less often is it a longer sequence than that, but i dunno.

Maybe "AlphaStar can't properly block or not block narrow passages using buildings" is an example where it's held back by lack of foresight.

[-]Raemon5y20

Thanks! Will reply to some different bits separately. First, on reddit-karma training:

I imagine the easiest thing would be to pre-pend the karma to each post, fine-tune the model, then you can generate high-karma posts by just prompting with "Karma 1000: ...".

This doesn't accomplish what I'm going for (probably). The key thing I want is to directly reward GPT disproportionately in different circumstances. As I currently understand it, every situation for GPT is identical – bunch of previous words, one more word to predict, graded on that one word.

GPT never accidentally touches a burning hot stove, or gets a delicious meal, or builds up a complicated web of social rewards that they aspire to succeed at. I bet toddlers learn not to touch hot stoves very quickly even without parental supervision, faster than GPT could.

I don't want "1 karma", "10 karma" and "100 karma" to be a few different words with different associations. I want 10 karma to be 10x the reward of 1 karma, and 100 karma 10x that. (Well, maybe not literally 10x, I'd fine tune the reward structure with some fancy math)

When GPT-3 sort of struggles to figure out "I'm supposed to be doing addition or multiplication here", I want to be able to directly punish or reward it more than it usually is.

[-]Steven Byrnes5y20

Well, sure, you could take bigger gradient-descent steps for some errors than others. I'm not aware of people doing that, but again, I haven't checked. I don't know how well that would work (if at all).

The thing you're talking about here sounds to me like "a means to an end" rather than "an end in itself", right? If writing "Karma 100000: ..." creates the high-karma-ish answer we wanted, does it matter that we didn't use rewards to get there? I mean, if you want algorithmic differences between Transformers and brains, there are loads of them, I could go on and on! To me, the interesting question raised by this post is: to what extent can they do similar things, even if they're doing it in very different ways? :-)

[-]lionhearted (Sebastian Marshall)6y100

I think you'd probably like the work of John Boyd:

https://en.wikipedia.org/wiki/John_Boyd_(military_strategist)

He's really interesting in that he worked on a mix of problems and areas with many different levels of complexity and rigor.

Notably, while he's usually talked about in terms of military strategy, he did some excellent work in physics that's fundamentally sound and still used in civilian and military aviation today:

https://en.wikipedia.org/wiki/Energy%E2%80%93maneuverability_theory

He was a skilled fighter pilot, so he was able to both learn theory and convert into tactile performance.

Then, later, he explored challenges in organizational structures, bureaucracy, decision making, corruption, consensus, creativity, inventing, things like that.

There's a good biography on him called "Boyd: The Fighter Pilot Who Changed the Art of War" - and then there's a variety of briefings, papers, and presentations he made floating around online. I went through a phase of studying them all; there's some gems there.

Notably, his "OODA" loop is often incorrectly summarized as a linear process but he defined it like this —

https://taskandpurpose.com/.image/c_fit%2Ccs_srgb%2Cfl_progressive%2Cq_auto:good%2Cw_620/MTcwNjAwNDYzNjEyMTI2ODcx/18989583.jpg

I think the most interesting part of it is under-discussed — the "Implicit Guidance and Control" aspect, where people can get into cycles of Observe/Act/Observe/Act rapidly without needing to intentionally orient themselves or formally make a decision.

Since he comes at it from a different mix of backgrounds with a different mix of ability to do formal mathematics or not, he provides a lot of insights. Some of his takeaways seem spot-on, but more interesting are the ways he can prime thinking on topics like these. I think you and he were probably interested in some similar veins of thought, so it might produce useful insights to dive in a bit.

[-]johnswentworth5y30

I've read some of his stuff on strategy. It seemed like there were a lot of interesting insights in there, but it was all presented in the sort of way that sounds sciency to non-science-people but didn't really communicate a proper model. If someone knows of or could write a good explanation of the models underlying his ideas, I'd be very interested to read that.

[-]Trinley Goldenberg5y20

Most of Boyd's work was communicated through briefings and presentations, so we don't have a lot of the underlying models, except second hand.

[-]quanticle5y70

Super naive question: given all we know about the myriad ways in which the brain fools itself, and more specifically, the ways that subconscious mental activities fool our conscious selves, why should we trust introspection? More specifically, why should I believe that the way I perceive myself to think is the way I actually think (as opposed to an abstraction put up by my subconscious)?

My model is that any psychological model that relies on introspection is going to be inherently flawed. If we want to learn how people think, we should observe their actions, and carefully watch how people behave in response to different stimuli and situations. I think asking people how they think tells us more about how they rationalize their thinking than it does about how they actually think.

[-]Gordon Seidoh Worley5y80

There is both some actual fact of what it is like to experience your own mind, and then there is the way you make sense of it to explain it to yourself and others that has been reified into concepts. Just because the reification of the experience of our own thinking is flawed in a lot of ways doesn't make it not evidence of our thoughts, it only makes it noisy, unreliable, and "known" in ways that have to be "unknown" (we have to find and notice confusion).

You worry that asking people who they think will tell us more about their understanding of how they think rather than how they actually think, and that's probably true, but also useful, because they got that understanding somehow and it's unlikely to be totally divorced from reality. Lacking better technology for seeing into our minds, we're left to perform hermeneutics on our self reports.

[-]Kaj_Sotala5y40

I wouldn't be that surprised if GPT-2 was "only" a System 1. But I also wouldn't be that surprised if it naturally developed a System 2 when scaled up, and given more training. I also wouldn't be that surprised if it turned out not to need a System 2.

As steve2152 also noted, System 2 (or more accurately, Type 2) reasoning involves passing the outputs from one Type 1 system to another using working memory resources. Working memory seems to involve several specialized components, including memory storages and executive functions that control and regulate it. If GPT-2 doesn't have those kinds of architectural properties already, it's not going to develop them by just having more training data thrown at it.

[-]Raemon5y20

Something I notice here (about myself) is that I don't currently understand enough about what's going on under-the-hood to make predictions about what sort of subsystems GPT could develop internally, and what it couldn't. (i.e. if my strength as a rationalist is the ability to be more confused by fiction than reality, well, alas)

It seems like it has to develop internal models in order to make predictions. It makes plausible sense to me that working memory is a different beast that you can't develop by having more training data thrown at you, but I don't really know what facts about GPT's architecture should constrain my beliefs about that.

(It does seem fairly understandable to me that, even if it were hypothetically possible for GPT to invent working memory, it would be an inefficient way of inventing working memory)

[-]Gordon Seidoh Worley5y40

A quick summary of the phenomenology of my thoughts:

thoughts primarily have shapes, textures, and other touch-related features
- prior/below the level of words, my thoughts are like touch-felt objects that exist in a field that permeates my body
thinking feels like exploring, as in i'm traveling around finding/bumping into thought objects
sometimes i'm not very in touch with this aspect though and i can't feel it, but i'm pretty sure it's always there for hard to explain reasons
when i can't see it words seem to just come up from nowhere for unknown reasons

Some differences from your model:

i don't have what feels like a badness check, rather it feels like i have a thought and then maybe a linked thought is about what the consequences of it might be, and sometimes those are bad.
- but sometimes i might be distracted and not follow up and notice the bad association.

[-]Raemon5y20

i don't have what feels like a badness check, rather it feels like i have a thought and then maybe a linked thought is about what the consequences of it might be, and sometimes those are bad.

I think this is actually probably what's going on with me, upon further reflection.

[-]avturchin6y40

I had an idea similar to yours "badness" algorithm: It will be interesting to add to the GPT a truth discriminator: another neural net which predicts the truth values of GPT's statement relative to the real world and is trained on a database of true statements (there are several). The whole thing then is trained in GAN-style, and the GPT thus trained to produce statements with highest true score.

[-]Raemon5y40

Actually, I think your comment about this awhile ago was what got me started on all this. I tried looking for it when I wrote this post but couldn't find it easily. If you give me the link I'd be happy to credit you in the OP.

[-]avturchin5y20

:) Don't remember where I wrote about it.

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

75

What's Your Cognitive Algorithm?

75

75

My algorithm, as I understand it

How far removed from that is GPT-2?

From Toddlers to Software Architects

But, really, what about deep planning and models and creativity?

What's happening in System 2 thought?

What Other Kinds of Thought Are There?

How Do You Think?