I recently made a mistake where I tried doing something like this in Ruby on Rails:

johns_campaign_logs = SmsLog.all.filter { |s| s.campaign_id == 1234 }

I did have a sense that it wasn't The Rails Way, but I also had a sense that it was quite readable and sufficient. The code was working well locally, so I issued a pull request.

When I issued the PR, someone pointed out a major issue. SmsLog.all is going to fetch all of the sms logs in our database. Suppose there are 100 million of them. That'd be a very expensive query, taking all 100M rows from the DB and loading them into memory on the server. Instead, we could say to the database, "Hey, could you give us just the sms logs from John's campaign?" Suppose there are 30k of those. Now we only have to load 30k records into memory on our server instead of 100M. Much better.
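To make the difference concrete, here is a minimal sketch in plain Ruby. It is not real ActiveRecord: FakeTable, rows_loaded, and the sample data are hypothetical stand-ins that only count how many rows end up in application memory. In an actual Rails app, the database-side version would typically be written as SmsLog.where(campaign_id: 1234).

```ruby
# Hypothetical stand-in for a database table; not real ActiveRecord.
# It counts how many rows get copied into application memory.
class FakeTable
  attr_reader :rows_loaded

  def initialize(rows)
    @rows = rows
    @rows_loaded = 0
  end

  # Like SmsLog.all followed by a Ruby block: every row is materialized.
  def all
    @rows_loaded += @rows.size
    @rows.map(&:dup)
  end

  # Like a WHERE clause: the "database" filters, only matches are loaded.
  def where(conditions)
    matches = @rows.select do |row|
      conditions.all? { |key, value| row[key] == value }
    end
    @rows_loaded += matches.size
    matches.map(&:dup)
  end
end

# 10,000 fake logs, 100 of which belong to campaign 1234.
rows = (1..10_000).map do |i|
  { id: i, campaign_id: (i % 100).zero? ? 1234 : 1 }
end

slow_table = FakeTable.new(rows)
slow = slow_table.all.filter { |s| s[:campaign_id] == 1234 }

fast_table = FakeTable.new(rows)
fast = fast_table.where(campaign_id: 1234)

puts "filter in Ruby: #{slow_table.rows_loaded} rows loaded"
puts "filter in DB:   #{fast_table.rows_loaded} rows loaded"
```

Both approaches return the same records. The only thing that changes is how many rows had to be loaded to get there.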

I felt bad about this mistake. I've been programming for eight years. Of course you don't want to load 100M records into memory on your server. Duh. How could I get such a basic thing wrong?

A cognitive behavioral therapist would call this a dysfunctional thought. And they'd ask me to come up with a rational response to the thought. Well, here's my attempt at a rational response.


Imagine that you had a pyramid of things a programmer knows. At the bottom are basic things like what the map function does. At the top are advanced things like what a monad is. In the middle are stuff in between.

I think that this base layer of the pyramid is very, very wide. There are a lot of these "basic" things to know. It's to the point that even senior engineers are going to have a very non-trivial number of gaps in this level of their pyramid. For example, here is a rough sketch of what I'd imagine a senior engineer's pyramid would look like versus a junior engineer's pyramid.

The senior engineer did a pretty nice job of filling out that bottom level, but there are still some notable gaps. There are still definitely going to be times when the senior engineer stumbles upon basic things that they don't know.

And this is why I say that my thought was dysfunctional. Just because you stumble across something basic that you don't know doesn't mean you're a bad developer. Even good developers have lots of gaps at the base of their pyramids. You could very well be a good developer who just so happened to stumble upon one of those gaps.

Tangent: I really like this pyramid analogy. I think it shows how you can't really be learning level two stuff if you don't already have the level one stuff. Notice how in my diagrams, when there's a gap at level one, there's never anything above it. Well, technically I think reality is more like Jenga, where you don't need a 100% perfect foundation to build on top of, but there are consequences of having a shaky foundation.

I also want to comment on how both the senior and junior engineers started moving up to level two without first filling out the entirety of level one. I think this is appropriate.

It's been my experience that all senior developers have their fair share of gaps at the bottom level. I can't think of anyone I've met who is an exception to this. Can you?

A great example is Dan Abramov, the creator of Redux, and a member of the React Core team. In one of my favorite blog posts ever, Dan bravely shares with the world a bunch of basic things that he doesn't know. Some of the things that stood out to me as especially eye opening are:

  • "Modern CSS. I don’t know Flexbox or Grid. Floats are my jam."
  • "CORS. I dread these errors! I know I need to set up some headers to fix them but I’ve wasted hours here in the past."
  • "Algorithms. The most you’ll get out of me is bubble sort and maybe quicksort on a good day. I can probably do simple graph traversing tasks if they’re tied to a particular practical problem. I understand the O(n) notation but my understanding isn’t much deeper than 'don’t put loops inside loops'."

In that spirit, I'd like to list out some of the basic things that, after programming for eight years, I personally still don't know:

  • How joins work in SQL. I have a vague idea, but I have trouble thinking about them and visualizing them.
  • Database normalization is just about avoiding duplication, right? Why is that such a big deal? Is it such a big deal?
  • I often lose track of what "this" is.
  • Without googling, I couldn't tell you what is cool about Node.js. And after a quick google, I still don't really get it. I'd have to think harder about it.
  • Relative paths always screw with me.
  • I'm not great with git. I don't really understand the tradeoffs of merging vs rebasing, and sometimes I get myself into a pickle.
  • With CSS, I'm the opposite of Dan. Flexbox and Grid make sense to me, but floats are never something I've been able to develop a solid grasp of. In using them I am either referencing an example from Stack Overflow, or I have to play around until it works.
  • Speaking of CSS, I always have to look up how to center things. CSS Tricks' article is my go-to. I remember one time I was getting paid to tutor someone. He asked me how to center stuff, and I had to spend time fumbling around referencing the CSS Tricks article as I tried to explain it.
  • DNS stuff. Whenever I set up a new website, dealing with DNS stuff is always a struggle. I've actually been paying $7/month for a while to serve static content via Heroku instead of moving to Netlify because I can't figure out how to migrate over.
  • I have an intuitive sense that watchers are something to avoid in Vue if you can, but I can't really explain why.

To be clear, I'm not just talking about a front end developer not knowing the basics of how compilers work. I'm talking about a front end developer having level one gaps in normal, everyday things like floats and flexbox. About having basic gaps in your day-to-day domain, not just outside of it. The gaps will of course be less frequent when you're inside your normal domain, but they are still very non-trivial.

I think I'll stop here. I can go on of course, but this should get the point across.


Why write this post? These points all feel sorta obvious. Of course you can't expect someone to know 100% of the basic things. Duh. No one is perfect.

It's not that I expect people to disagree with the core point, but I do think that it's something that is easy to lose sight of. And underestimate. So if you were previously in either of those boats, hopefully I've been able to bring you back to shore. And if you started off on shore, well, hopefully you've had a good time singing kumbaya with me.


Thanks for writing this. I struggle with imposter syndrome sometimes and it's reassuring to be told I don't need to know everything, even if that should be obvious.

It's also worth mentioning that making a mistake doesn't necessarily indicate lack of knowledge. You can know something well and still make a mistake out of absentmindedness, lack of sleep, etc.

Glad it helped. Good point about mistakes not necessarily indicating lack of knowledge, that is very true.

Not knowing some trivia is fine. But bouncing off something when I try to figure it out, if my peers have no trouble with it, doesn't feel fine to me. It feels terrible. It makes me double my effort, then double it again. And this reaction feels right to me, I wouldn't want to get rid of it.

Thank you for bringing up a good counterpoint to a core point of the post.

I agree with the sentiment that tsuyoku naritai is desirable, and that some amount of disappointment is required for it. But feeling "terrible" seems like it is taking things too far.

I'm not sure where the right point on the spectrum is. I lean towards thinking that intrinsic motivation + carrots (rather than sticks) go a long way and that it is appropriate to apply "the stick" in a similar manner as you would apply hot sauce to a dish. You'd shake a little on top, not soak the entire plate in it. And similar to hot sauce, it depends on your tolerance level.

It's also worth distinguishing between two things I've conflated a little bit: 1) the appropriate emotional reaction, and 2) how much one should update on the evidence. I think the more central point in the post was about (2), although (1) is definitely still worth discussing. Also, if the answer to (2) is in fact "very little", then I think that feeling terrible for (1) would probably necessitate a little bit of lying to yourself, which is a dark art.

There's an important point which I think this misses.

Rather than imagining the bottom level of a 2D pyramid, imagine the bottom level of a 3D pyramid. As you fill in the bottom level of that 3D pyramid, at some point you go from "it's mostly space with a few islands filled in" to "it's mostly filled in with a few islands of space". There's this phase-transition-like-phenomenon where all the concepts/knowledge go from disconnected pieces to connected whole.

For instance, in studying mechanics, this transition came for me around the time I took a differential equations class (I'd already taken some physics and programming). I went from feeling like "I can only model the dynamics of certain systems with special, tractable forms" to "I can model most systems, at least numerically, except for certain systems with special, intractable weird stuff". This was still only level 1 of the pyramid - the higher levels still provided important tools for solving mechanics problems more efficiently - but it gave me a unified framework in which everything fit together, and in which I could generally see where the holes were.

I really like that point about connected vs disconnected pieces, but isn't that true in a 2D pyramid as well?

In a 2D pyramid, the bottom layer is 1D, so any "hole" anywhere breaks it into two disconnected pieces.

Ah, gotcha.

I feel deeply inadequate because I don't know how to scale a web system properly, despite doing web programming for eight years. I'm not even entirely sure what "systems design" actually is or isn't. 

I also recently fell badly on my face when given a basic algorithms problem. This was equal parts anxiety, being out of practice interviewing, and being out of practice with algo questions. 

If it helps, I know how to scale web systems and basically never do it because you can run almost anything on one server unless you're Google.

If it makes you feel any better, up until a few weeks ago I was in the exact same boat for systems design. For algorithms it was similar. And I wasn't able to pass a Google interview after studying almost full time for four months.

I don't know the Chinese word for spatula, the chemical formula for sucrose or what % does in Vim. Last month a friend showed me a second set of adjuster screws on my bicycle brake pads I had never used before. I sleep just fine at night.

Looking at it from that perspective of "how many times do people bat an eye vs how many opportunities to bat an eye are there", I agree that the ratio is small, but I disagree that the ratio is (particularly) relevant. To me, the important point seems to be that the left-hand side of the ratio is problematic regardless of what number you use for the right-hand side. (This isn't the best or most precise way of saying what I'm trying to say, but hopefully it's sufficient.)

Database normalization is just about avoiding duplication, right?

I think the thing here is that people who get database design can't really understand how it is possible to not get it, but there are a lot of people for whom it is extremely difficult to understand this topic. I sat through years of lectures wondering why we were taught things that were completely self-evident. Then I looked at a lot of other people's code, and it became clear that it wasn't self-evident at all.

It always puzzles me why this is so hard for me to accept. Maybe one aspect could be that work relationships force you to present yourself as best as possible to your employer. And this leads to situations in which you try to signal competence instead of uncertainty, even to yourself.

I've noticed that the more high-level and complex the work you're doing, the sillier your bugs get. Perhaps because you focus so much on the complex parts since they're difficult to get right, so you gloss over the more basic parts.

I don't think your pyramid is a good conceptual framework to understand programming expertise. Expertise comes mostly from seeing common/overarching patterns (which would be all over the place on your pyramid) and from understanding the entire stack - having at least some sense of how each level functions, from the high-level abstraction of RoR's ORM to object lifetime and memory concerns to how database queries happen to how the db executes it (e.g. db indexes are also relevant to your example), down to at least having played around a little with assembly language.

I don't even know Ruby or RoR, but if I had to use it for your example, my first thought would be "ok, how do I do a WHERE query in their ORM", because every db abstraction in every language and framework has to solve this problem and I've seen a lot of those solutions. And I'll know to consider eager vs lazy evaluation (what if a campaign has 1M records after filtering, maybe I want to iterate over results instead of getting a plain list), and whether campaign_id has an index, because all of those are very common concerns that crop up.

So the expertise isn't knowing a factoid "don't use x.all.filter() in RoR", it's knowing that anything that queries a database has to deal with the concerns above somehow.
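As a rough illustration of the eager vs lazy concern, here is a minimal sketch in plain Ruby. The row source and the produced counter are hypothetical and stand in for a database cursor. In Rails I believe the batch-iteration counterpart would be find_each, but that is not what this sketch uses.

```ruby
# Hypothetical row source: yields rows one at a time and counts
# how many rows were actually produced.
produced = 0
rows = Enumerator.new do |yielder|
  1.upto(1_000_000) do |i|
    produced += 1
    yielder << { id: i, campaign_id: i.even? ? 1234 : 1 }
  end
end

# Lazy: filter rows as they stream past and stop once we have enough,
# instead of building a plain list of every matching row first.
first_five = rows.lazy
                 .select { |row| row[:campaign_id] == 1234 }
                 .first(5)

puts first_five.map { |row| row[:id] }.inspect
puts "rows produced: #{produced}"
```

An eager rows.select { ... }.first(5) would give the same five rows, but only after walking all one million of them.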

I don't think your pyramid is a good conceptual framework to understand programming expertise.

Perhaps. But does it meet the bar of "all models are wrong, but some are useful"?

So the expertise isn't knowing a factoid "don't use x.all.filter() in RoR", it's knowing that anything that queries a database has to deal with the concerns above somehow.

Agreed. The issue was that I am lacking a little in backend experience and didn't think about those concerns.

In the middle are stuff in between.

What an Applicative is? :)

I actually wouldn't know!

You can map across any monad, but not everything you can map across is a monad. Applicatives are in between.
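A minimal sketch of the three levels, written in Ruby to match the post. The Maybe class and its method names (map, ap, flat_map) are hypothetical and written only for illustration. They are not taken from any library.

```ruby
# Hypothetical Maybe container, for illustration only.
class Maybe
  attr_reader :value

  def self.some(value)
    new(value, true)
  end

  def self.none
    new(nil, false)
  end

  def initialize(value, present)
    @value = value
    @present = present
  end

  def some?
    @present
  end

  # Functor: apply a plain function to the wrapped value.
  def map
    some? ? Maybe.some(yield(value)) : self
  end

  # Applicative: apply a wrapped function to a wrapped argument.
  def ap(wrapped_arg)
    some? && wrapped_arg.some? ? Maybe.some(value.call(wrapped_arg.value)) : Maybe.none
  end

  # Monad: the block itself returns a Maybe, so a later step can
  # depend on the result of an earlier one.
  def flat_map
    some? ? yield(value) : self
  end
end

add = ->(a) { ->(b) { a + b } }

mapped  = Maybe.some(2).map { |n| n + 1 }
applied = Maybe.some(add).ap(Maybe.some(2)).ap(Maybe.some(3))
chained = Maybe.some(10).flat_map { |n| n.zero? ? Maybe.none : Maybe.some(100 / n) }
missing = Maybe.some(add).ap(Maybe.none).ap(Maybe.some(3))

puts mapped.value
puts applied.value
puts chained.value
puts missing.some?
```

With ap you can combine several independent wrapped values, which map alone can't do. With flat_map the next step can also decide what to do based on the previous value.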

Re database normalization, it’s obviously good to do if you can afford the hit for speed and scalability. Unfortunately I believe the software industry currently has a big problem with a lack of capable databases to support elegant data denormalization patterns: https://lironshapira.medium.com/data-denormalization-is-broken-7b697352f405

Interesting, thanks for sharing. I really like the way you explained what a paradigm shift is. And the point you make in the It's Not Your Fault section:

On the day you’re implementing the User.numUnreadRooms field, you’ll rack your brain for all the places in your code that will need to update it, and I don’t blame you for overlooking the onDeleteRoom event handler.

On the day you’re implementing the onDeleteRoom event handler, you’ll rack your brain for all the denormalized fields that it might need to update, and I don’t blame you for overlooking the User.numUnreadRooms field.

I don’t blame you, the application developer, for any bug in your denormalization logic. I blame the paradigm of tightly coupling denormalization logic with application code.

I think that it is related to my Don't Feel Bad About Not Knowing Basic Things post. If the "basic thing" you don't understand is also something that plenty of other people don't understand, then maybe it's because it is unnecessarily complicated. (There are other possibilities too of course.)

NodeJS is mostly cool because you can use the same language and the same development tools across your whole stack. When it launched I think another selling point was that it’s reasonably good at handling multiple requests in parallel.

NodeJS is mostly cool because you can use the same language and the same development tools across your whole stack.

Oh ok that's good to hear. That's always been what felt most appealing about it to me.

When it launched I think another selling point was that it’s reasonably good at handling multiple requests in parallel.

What is it about Node that causes this to be true? I have a feeling it has to do with Node being single-threaded, but that never made sense to me. Being single-threaded seems like it'd be strictly worse than being multi-threaded. Anything you can do on a single-threaded system, you can also do on a multi-threaded system, right? Just don't create a second thread.

Ya I don’t know the details even though I use NodeJS almost every day :) Maybe it does run parallel requests in separate threads.

Glad to hear I'm not alone :)

I'm three years into my PhD. Still don't really understand what a semi-group is (something something exponential). I can't remember the formula for the exponents in the Sobolev inequality, but here it is just my fault for never learning it correctly.

So the idea is that if you've got a linear differential equation, like dx/dt=2x, then all the solutions look like x=x0*exp(2t).

And you could imagine that there's an operator which says "Where will this point end up in 1 second?", call it O(1), which looks like x0->x0*exp(2). I.e. it just gets multiplied.

If you know that operator, then you know the operator that represents "Where will the point end up in 2 seconds?", because you can say "where will it end up one second later?" and then "where will that point end up in another second?".

 So this gives you a law that O(2) is O(1) done twice. And O(3) is O(2) followed by O(1). So these operators are a group (the identity is 'where will it be after I run the equation for no seconds').
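Written out in symbols, for the example dx/dt=2x above. Writing O(t) for a general time t is my extension of the O(1), O(2), O(3) notation, so treat it as an assumption about notation rather than anything standard:

```latex
\begin{aligned}
O(t)\,x_0 &= x_0\, e^{2t} \\
O(0) &= I \\
O(s+t) &= O(s)\,O(t) \\
O(t)^{-1} &= O(-t)
\end{aligned}
```

The third line is the "done twice" law: O(s)O(t)x_0 = x_0 e^{2t} e^{2s} = x_0 e^{2(s+t)}. The last line is what makes it a group: you can also run the equation backwards.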

In the one-dimensional ODE case, this is all so trivial that it's not worth mentioning. Almost a pun. 

But it generalises nicely to dx/dt=Ax, where x is a vector and A is a matrix, and that gives us a way of defining exp(At), i.e., how to exponentiate a matrix, by saying, "well what if we run this matrix equation for a second and see where it puts all the vectors?"

And in fact we can generalize it further, to differential equations on linear operators on infinite-dimensional spaces. Which is another way of looking at partial differential equations. (A function is an infinite dimensional vector, a vector is a function on a finite set). 

So you can talk by analogy about how to exponentiate the diffusion equation, and a time delay operator that takes wiggly functions to less wiggly functions. 

But you can't always run PDEs backwards, e.g. the diffusion equation won't run in reverse, so you don't always get a full group.

I think that's the intuition for semigroups, I hope that's what you were talking about! (And sorry if it's not clear or just wrong, I haven't been a mathematician for thirty years.)

If you think about this for long enough, you should suddenly understand why e to the i pi is minus one. In fact it's just a really obvious thing that has to be true. At that point you've probably got it. 

If you think about this for long enough, you should suddenly understand why e to the i pi is minus one. In fact it's just a really obvious thing that has to be true. At that point you've probably got it.

Be careful about the illusion of transparency here. This all strikes me as the sort of stuff that is unlikely to be obvious.

Sorry, obvious to a mathematician who thinks about dz/dt=iz and realises that "exponentiation is time-evolution". At that point it's just "if you rotate for just long enough to turn half-way round, you'll be pointing backwards".
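Spelled out as a short sketch, assuming the starting point z(0)=1 (which isn't stated above):

```latex
\frac{dz}{dt} = i z, \quad z(0) = 1
\;\Longrightarrow\;
z(t) = e^{it} = \cos t + i \sin t,
\qquad
z(\pi) = e^{i\pi} = \cos \pi + i \sin \pi = -1
```

Multiplying by i turns the velocity a quarter-turn away from the position, so the point moves around the unit circle at unit speed, and after time pi it has gone half-way round.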

That's a good layman description - a semi group is basically the exponential of some linear operator. The problem is that I'm supposed to be a bit more than a layman.

Still don't really understand what a semi-group is (something something exponential).

Then perhaps this was a little hyperbolic?

It was, a little. Truth is I know the basic definition but I've yet to build up enough knowledge and intuition around them to really use them in my research. Think analytic vs bounded semigroups, L^\infty calculus, angular sectors and so on.