Note: this post leans heavily on metaphors and examples from computer programming, but I've tried to write it so it's accessible to a determined person with no programming background.
To summarize some info from computer processor design at very high density: There are a variety of ways to manufacture the memory that's used in modern computer processors. There's a trend where the faster a kind of memory is to read from and write to, the more expensive it will be. So modern computers have a hierarchical memory structure: a very small amount of memory that's very fast to do computation with ("the registers"), a larger amount of memory that's a bit slower to do computation with, an even larger amount of memory that's even slower to do computation with, and so on. The two layers immediately below the registers (the L1 cache and the L2 cache) are typically abstracted away from even the assembly language programmer. They store data that's been accessed recently from the level below them ("main memory"). The processor will do a lookup in the caches when accessing data; if the data is not already in the cache, that's called a "cache miss" and the data will get loaded into the cache before it's accessed.
(Please correct me in the comments if I got any of that wrong; it's based on years-old memories of an undergrad computer science course.)
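For readers who'd like to see the cache lookup spelled out, here is a minimal sketch in Python. It is a toy model, not a description of real hardware: the `Cache` class, its `capacity` parameter, and the choice to evict the least recently used entry are all assumptions made for illustration, and real processor caches are organized quite differently.

```python
from collections import OrderedDict


class Cache:
    """A toy cache sitting in front of a slower backing store.

    Assumption for illustration: when the cache is full, the least
    recently used entry is evicted.
    """

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # stands in for "main memory"
        self.capacity = capacity            # how many entries fit in the cache
        self.lines = OrderedDict()          # address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            # Cache hit: the data is already here, so the access is fast.
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]

        # Cache miss: fetch from the slower level and keep a copy.
        self.misses += 1
        value = self.backing_store[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used
        return value


main_memory = {address: address * 10 for address in range(8)}
cache = Cache(main_memory, capacity=2)
values = [cache.read(address) for address in (0, 1, 0, 2, 1)]
print(values, "hits:", cache.hits, "misses:", cache.misses)
```

With a capacity of two, the second read of address 0 is a hit, but the second read of address 1 is a miss, because loading address 2 in the meantime pushed it out. Nothing announces the eviction; the data is simply gone the next time you look.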
Lately I've found it useful to think of my memory in the same way. I've got working memory (7±2 items?), consisting of things that I'm thinking about in this very moment. I've got short-term memory and long-term memory. And if I can't find something after trying to think of it for a while, I'll look it up (frequently on Google). Cache miss for the lose.
What are some implications of thinking about memory this way?
Register limitations and chunking
When programming, I've noticed that sometimes I'll encounter a problem that's too big to fit in my working memory (WM) all at once. In the spirit of getting stronger, I'm typically tempted to attack the problem head on, but I find that my brain just tends to flit around the details of the problem instead of actually making progress on it. So lately I've been toying with the idea of trying to break off a piece of the problem that can be easily modularized and fits fully in my working memory and then solving it on its own. (Feynman: "What's the smallest nontrivial example?") You could turn this idea around and define a good software architecture as one that consists of modular components, each of which can be made to fit completely into one's working memory when reading code.
As you write or read code modules, you'll come to understand them better and you'll be able to compress or "chunk" them so they take up less space in your working memory. This is why top-down programming doesn't always work that well. You're trying to fit the entire design in your working memory, but because you don't have a good understanding of the components yet (since you haven't written them), you aren't dealing with chunks but pseudochunks. This is true for concepts in general: it takes all of a beginner's WM to comprehend a for loop, but in a master's WM a for loop can be but one piece in a larger puzzle.
One thing to observe: you don't get alerted when memory at the top of your mental hierarchy gets overwritten. We've all had the experience of having some idea in the shower and having forgotten it by the time we get out. Similarly, if you're working on a delicate mental task (programming, math, etc.) and you get interrupted, you'll lose mental state related to the problem you're working on.
If you're having difficulty focusing, this can easily make doing a delicate mental task, like a complicated math problem, much less fun and productive. Instead of actually making progress on the task, your mind drifts away from it, and when you redirect your attention, you find that information related to the problem has swapped out of your working memory or short-term memory and must be re-loaded. If you're getting distracted frequently enough or you're otherwise lacking mental stamina, you may find that you spend the majority of your time context switching instead of making progress on your problem.
Adding an additional external cache level
Anecdotally, adding an additional brain cache level between long-term memory and Google seems like a pretty big win for personal productivity. My digital notebook (since writing that post, I've started using nvALT) has turned out to be one of my biggest wins where productivity is concerned; it's ballooned to over 700K words, and a decent portion of it consists of copy-pasted snippets that represent the best information from Google searches I've done. A co-worker wrote a tool that allows him to quickly look up how to use software libraries and reports that he's continued to find it very useful years after making it.
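In code terms, the notebook is one more level in the lookup chain. Here's a minimal sketch, assuming each level can be modeled as a simple question-to-answer mapping; the `look_up` function and the `web_search` parameter are hypothetical names for illustration, not a real search API.

```python
def look_up(question, long_term_memory, notebook, web_search):
    """Try the fastest level first and fall through to slower ones.

    `long_term_memory` and `notebook` are plain dicts; `web_search` is a
    hypothetical callable standing in for a Google search.
    Returns (answer, name_of_the_level_that_answered).
    """
    if question in long_term_memory:
        return long_term_memory[question], "long-term memory"
    if question in notebook:
        return notebook[question], "notebook"

    # Miss at every local level: do the slow search, then paste the
    # result into the notebook so the next lookup is cheaper.
    answer = web_search(question)
    notebook[question] = answer
    return answer, "web search"


# A stand-in for the web, so the example runs without a network.
fake_web = {"reverse a list in python": "reversed(xs) or xs[::-1]"}
memory, notebook = {}, {}

first = look_up("reverse a list in python", memory, notebook, fake_web.__getitem__)
second = look_up("reverse a list in python", memory, notebook, fake_web.__getitem__)
print(first[1], "->", second[1])
```

The first lookup falls all the way through to the search; the second is answered from the notebook. Only the notebook gets written to here, since pasting a snippet is a deliberate act in a way that committing something to long-term memory isn't.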
Text is the most obvious example of an exobrain memory device, but here's a more interesting example: if you're cleaning a messy room, you probably don't develop a detailed plan in your head of where all of your stuff will be placed when you finish cleaning. Instead, you incrementally organize things into related piles, then decide what to do with the piles, using the organization of the items in your room as a kind of external memory aid that allows you to do a mental task that you wouldn't be able to do entirely in your head.
Would it be accurate to say that you're "not intelligent enough" to organize your room in your head without the use of any external memory aids? It doesn't really fit with the colloquial use of "intelligence", does it? But in the same way computers are frequently RAM-limited, I suspect that humans are also frequently RAM-limited, even on mental tasks we frequently associate with "intelligence". For example, if you're reading a physics textbook and you notice that you're getting confused, you could write down a question that would resolve your confusion, then rewrite the question to be as precise as possible, then list hypotheses that would answer your question along with reasons to believe/disbelieve each hypothesis. By writing things down, you'd be able to devote all of your working memory to the details of a particular aspect of your confusion without losing track of the rest of it.