Models of Memory and Understanding

by Hazard · 4 min read · 7th May 2019 · 2 comments

johnswentworth's post on Declarative Mathematics got me thinking about what different types of understanding/learning/memory feel like to me. In order to explain an initial confusion I had, I'm explicating some of my models on how my mind can work.

Jumps vs Procedure

I want to highlight two different styles of solving problems. One is where you jump directly to the solution, and the other is where you follow a procedure to produce the solution.

The prototypical example of these two styles would be a neural net vs an automated theorem prover. Once a neural net is trained, there's a sense in which it takes an input and "jumps directly" to an output. An automated theorem prover takes an input and systematically applies a bunch of deduction rules till it gets its output.
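A toy contrast (illustrative only, not from the post): a "jump" solver that maps input straight to an answer, versus a "procedure" solver that applies rules step by step and leaves a visible derivation trail.

```python
# Jump: a trained model is, at inference time, just input -> output.
learned = {"2+3": "5", "4*4": "16"}           # stands in for trained weights
def jump_solve(problem):
    return learned[problem]                    # one leap, no visible steps

# Procedure: apply a deduction rule repeatedly, like a theorem prover.
def procedure_solve(n):
    # "Prove" that n is even by applying the rule even(k) <- even(k-2).
    steps = []
    while n > 1:
        steps.append(f"even({n}) <- even({n-2})")
        n -= 2
    return n == 0, steps                       # verdict plus the derivation
```

The jump gives you no insight into *why* the answer is right; the procedure is slower but every step can be inspected.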

I don't care to claim that these two styles are "fundamentally" different (because I'd have to figure out what that means), but I do want to point out that they certainly feel different. This is what matters to me, because this post is about what it feels like for me to solve problems in different domains.

Memory as a Graph

My working model of memory is something like a "connected graph of ideas with Hebbian learning". This model explains why use and connection are such a big deal for making something stick in memory. The more you use something, the stickier it becomes (a pretty non-controversial empirical claim, seemingly backed up theoretically by observations of how neurons work). The more connections you make between ideas, the stickier they become (also a non-controversial empirical claim; my theory is something like "doing graph search to get to a node on the idea graph becomes easier the more connected the idea is" (hmmm, seems fishy)).
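A toy sketch of this model, purely illustrative (the class, the update rate, and the "stickiness" measure are all my own inventions, not claims about actual neurons):

```python
from collections import defaultdict

class IdeaGraph:
    """Toy model: ideas are nodes; edge weights strengthen with co-use."""
    def __init__(self):
        self.weights = defaultdict(float)  # (idea_a, idea_b) -> strength

    def use_together(self, a, b, rate=0.1):
        # Hebbian-style update: each co-use bumps the connection weight.
        key = tuple(sorted((a, b)))
        self.weights[key] += rate

    def stickiness(self, idea):
        # An idea's stickiness grows with the total strength of its edges.
        return sum(w for pair, w in self.weights.items() if idea in pair)

g = IdeaGraph()
for _ in range(5):
    g.use_together("juggling", "bicycle")   # repeated use
g.use_together("juggling", "practice")      # an extra connection
```

Both levers from the paragraph above show up: repetition raises a single edge's weight, and new connections add more edges feeding the same node.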

My thinking of memory as an idea-graph is shaped by alkjash's Babble and Prune sequence and Kevin Simler's A Nihilist's Guide to Meaning.

Analogies, mapping between modules

I can think of two different ways my mind uses analogies. One is as a mnemonic crutch, and the other is as a licence to substitute.

The licence to substitute is what happens when I learn that two different math structures are isomorphic. I'm allowed to map things in the source domain to the target domain, do my work in the target domain, map it back and be guaranteed a correct answer (if I did the work correctly).
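A hypothetical worked example of the round trip (my choice of structures, not from the post): the map x → log(x) is an isomorphism from the positive reals under multiplication to the reals under addition, so you can multiply by adding in the target domain.

```python
import math

def multiply_via_logs(x, y):
    # Map into the target domain, do the (easier) work there, map back.
    a, b = math.log(x), math.log(y)    # map source -> target
    s = a + b                          # work in the target domain: addition
    return math.exp(s)                 # map back; correct up to float error
```

This is exactly the slide-rule trick, and it has the guarantee described above: if the work in the target domain is done correctly, the mapped-back answer is correct.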

Analogies as a mnemonic crutch is when I only use one particular idea from a target domain. When I say "juggling is like riding a bicycle" I'm really only talking about one specific part: at first it seems impossible, you do it a bunch, and eventually you get it, often without a detailed understanding of what you did differently to make it work. Analogies as mnemonics feel like copying a pointer to the same chunk of memory in C (me using analogies to describe how analogies work is sort of like using gcc to compile the new version of gcc). My brain already has the experience "impossible -> practice -> it works -> don't know how" stored near ideas/experiences related to riding bikes. At some point in time I was told juggling is like riding a bicycle, and in my juggling knowledge I stored a pointer to that previous experience I had with bikes.
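The pointer analogy rendered as code, a minimal sketch (the dictionaries and keys are made up for illustration): the "juggling" entry stores a reference to the existing memory, not a fresh copy.

```python
# The stored bike experience: one chunk of memory.
bike_experience = {"pattern": "impossible -> practice -> it works -> don't know how"}

# "Juggling is like riding a bicycle": juggling knowledge holds a reference
# to the same chunk, like copying a pointer in C, rather than duplicating it.
juggling_knowledge = {"feel": bike_experience}

juggling_knowledge["feel"] is bike_experience  # same object, not a copy
```

One design consequence of sharing rather than copying: if the bike memory is later revised, the juggling "memory" sees the revision too, for better or worse.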

As is often the case, mnemonic vs substitution is probably better thought of as a spectrum. The more you allow an analogy to be a substitution, the less meta-data you have to remember about "what parts of this analogy are valid to draw conclusions from?"

The book Where Mathematics Comes From claims that analogy-based reasoning is one of our fundamental thinking tools, and then goes on to make the more interesting claim: "And here are the grounding analogies that dictate people's intuitions about math." Interesting read, haven't finished, but would recommend.

Key Idea: one might be tempted to throw out mnemonic analogies because if you mistake them for substitution analogies, you get the wrong answer. I claim that mnemonic analogies can be incredibly useful for giving you a memory handle that allows you to even remember all of the new ideas in the first place.

I'll use the term module to refer to a chunk of your mind that is experienced at solving problems in some domain using a mix of jumps and procedural knowledge. The key thing I want to communicate when speaking of a module is that it's solidly cemented in your memory, and whether fast or slow to produce an answer, it's reliable.

Rigor: Pre and post

Terry Tao has a post that people here have seen before where he divides one's mathematical journey into three stages: pre-rigor, rigor, and post-rigor. His post is short and you should read it. Here's each phase translated into my own models described above.

Pre-rigor: You have scant procedural knowledge of the topic, you have no well-fit modules to map over, and most of your understanding comes from analogies as mnemonics which, due to their lack of fit, take a lot of "rote practice glue" to stick together. You are not very good.

Rigor: Your procedural knowledge of the topic has gotten a lot more connected and well used. You are getting stronger at getting answers. You see how your previous mnemonic analogies were very misleading and discard them. You distrust people who make jumps and stick to your now familiar formal procedures. You are okay.

Post-rigor: Your procedural knowledge has been practiced so much it begins to form a module that allows you to jump to some solutions. You have a sense of what the "shape" and "feel" of different structures are. Having built new modules with rigorous foundations, you are more comfortable using substitution analogies to guide your jumps. You are becoming quite powerful.

The actual thing I wanted to talk about

See this comment.

Let's say I'm using a high level library (like asyncio) to make calls over a network. The library provides me an interface that very easily maps onto concepts I'm already familiar with. I'm doing things like "opening a connection" "writing to the other computer" and "waiting to read". These are all things that easily map onto mental modules I'm able to use and make jumps with.
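A minimal sketch of the kind of asyncio usage described above (the host, port, and payload are placeholders, and error handling is omitted):

```python
import asyncio

async def talk_to_server(host="example.com", port=80):
    # "opening a connection"
    reader, writer = await asyncio.open_connection(host, port)

    # "writing to the other computer"
    writer.write(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    await writer.drain()

    # "waiting to read"
    data = await reader.read(100)

    writer.close()
    await writer.wait_closed()
    return data
```

Every line maps onto a familiar mental module (open, write, wait, close); nothing here forces you to think about sockets, TCP handshakes, or the wires underneath.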

If the abstraction is good, maps to a module I can already wield, and isn't that leaky, I don't really care about what's under the hood. Sometimes I get curious and investigate: "Hmmmm, how do you actually get something like 'opening a connection' with just wires and 1's and 0's?" But that's not necessary for me to use the abstraction well. It is necessary for me to know what's under the hood if things start breaking, but not until then.

When I was imagining not knowing how things worked "under the hood" for mathematics, I was imagining my experience with real analysis, where I never got solid enough mnemonic analogies (nor built new modules through rigorous practice) for the ideas/concepts/results to stay in my mind. The only tool I had was "start from the basic definitions and slowly work your way towards the proof".

Now that I've thought things through and can imagine not knowing what's "under the hood" yet still having a mapping to an existing module that lets me remember things, I'm more inclined to be interested in declarative mathematics.
