(Cross-posted on my personal blog.)
Is it worth refactoring yyyymmdd to currentDate? I think that there are two ways to look at it.
You can zoom in and ask yourself questions about whether such a refactor will actually have a business impact. Will it improve velocity? Reduce bugs? Sure, currentDate might be slightly more descriptive, but does it really move the needle? How long does it take to figure out that yyyymmdd refers to a date? A few seconds, maybe? Won't it be pretty obvious given the context? Shouldn't your highly paid, highly intelligent engineers be smart enough to put two and two together? Did we all just waste 30 seconds of our lives talking about this?
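To make the question concrete, here is a minimal sketch of what such a rename might look like. Only the two variable names, yyyymmdd and currentDate, come from the discussion here; the formatDate helper and the report file name functions are hypothetical, made up purely for illustration.

```typescript
// Hypothetical helper (an assumption for this sketch): formats a date as YYYYMMDD.
function formatDate(date: Date): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return String(date.getFullYear()) + pad(date.getMonth() + 1) + pad(date.getDate());
}

// Before: the name describes the format of the value, not what it represents.
function reportFileNameBefore(now: Date): string {
  const yyyymmdd = formatDate(now);
  return "report-" + yyyymmdd + ".csv";
}

// After: identical behavior; the name now says which date this is.
function reportFileNameAfter(now: Date): string {
  const currentDate = formatDate(now);
  return "report-" + currentDate + ".csv";
}
```

The two functions behave identically, which is exactly why the zoomed-in view makes the rename look so cheap to skip.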
The other way of looking at it is to zoom out. How do you feel when you work in codebases where the variable names are slightly confusing? It slows you down, right? Oftentimes you legitimately can't put two and two together. And there are times when it leads to bugs. Right?
It's interesting how two different viewpoints, zoomed in vs. zoomed out, can produce wildly different answers to essentially the same question: do the costs of investing in code quality outweigh the benefits? When you zoom in, e.g. to a single variable name, unless the code is truly awful, it usually doesn't seem worth it. The answer is usually, "it's not that bad, developers will be able to figure it out." But when you zoom out and look at the entirety of a codebase, I think the answer is usually that working in messy codebases will have legitimate, significant impacts on things like velocity and bugs, and it's worth taking the time to do things the right way.
What's going on here? Is this a paradox? Which is the right answer? To answer those questions, let's talk about something called the planning fallacy.
The Denver International Airport opened sixteen months later than scheduled, with a total cost of $4.8 billion, over $2 billion more than expected.
When estimating things, people usually zoom in. "Build an airport in Denver? Well, we just have to do A, B, C, D, E and F. Each should take about six months and $500M, so overall it should be three years and $3B." The problem with this is… well… the problem is that it just never works. You always forget something. And the individual components always end up being more complicated than they seem. Just like when you think dinner will be ready in 30 minutes.
So what can you do instead? Well, how long have similarly sized airports taken to build in the past? Ten years and $10B? Hm, if so, maybe your estimate is off. Sure, your situation is different from those other situations, but you can adjust upwards or downwards using the reference class of the other airports as a starting point. Maybe that brings you from 10 to 8 or 10 to 7, but probably not 10 to 3.
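The arithmetic behind the two approaches can be sketched in a few lines. This is only an illustration under stated assumptions: the Estimate type, the function names, and the 0.8 adjustment factor are all hypothetical, and the "ten years and $10B" reference class is the made-up figure from the paragraph above, not real airport data.

```typescript
interface Estimate {
  years: number;
  costBillions: number;
}

// Adds up a list of estimates.
function total(estimates: Estimate[]): Estimate {
  let years = 0;
  let costBillions = 0;
  for (const e of estimates) {
    years += e.years;
    costBillions += e.costBillions;
  }
  return { years, costBillions };
}

// Inside view: sum the components you can think of.
function insideView(components: Estimate[]): Estimate {
  return total(components);
}

// Outside view: start from the average of comparable past projects, then
// scale by an adjustment factor for how your situation differs.
function outsideView(referenceClass: Estimate[], adjustment: number): Estimate {
  if (referenceClass.length === 0) {
    throw new Error("need at least one comparable project");
  }
  const sum = total(referenceClass);
  return {
    years: (sum.years / referenceClass.length) * adjustment,
    costBillions: (sum.costBillions / referenceClass.length) * adjustment,
  };
}

// Components A through F, each guessed at six months and $500M.
const components: Estimate[] = ["A", "B", "C", "D", "E", "F"].map(() => ({
  years: 0.5,
  costBillions: 0.5,
}));
const inside = insideView(components); // 3 years, $3B

// Hypothetical reference class: one comparable airport at ten years and $10B,
// adjusted downward by an assumed factor of 0.8.
const outside = outsideView([{ years: 10, costBillions: 10 }], 0.8); // roughly 8 years, $8B
```

The point of the sketch is the starting point: the inside view builds up from the pieces you happened to think of, while the outside view starts from what actually happened elsewhere and only then adjusts.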
How does this relate to code quality? Well, I think that something similar is going on. When you zoom in and take the inside view, it looks like everything will be good. But when you zoom out and take the outside view, you realize that messy codebases usually cause significant problems. Is there a good reason to believe that your codebase is a special snowflake where messiness won't cause significant problems? Probably not.
I feel like I'm being a little bit dishonest here. I don't want to hype up the outside view too much. In practice, inside view thinking also has its virtues. And it makes sense to combine inside view thinking with outside view thinking. Doing so is more of an art than a science, and something that I am definitely still developing a feel for.
I think that certain things lend themselves more naturally to inside view thinking, and others lend themselves more naturally to outside view thinking. For example, coming up with startup ideas and coming up with scientific theories are both good fits for inside view thinking, IMHO. On the other hand, code quality feels to me like something that is a great fit for the outside view. And so, that's the viewpoint that I favor when I think about whether or not code quality is worth investing in.