Problem-solving is an essential part of software development. Bugs are inevitable and the ongoing opportunity to solve puzzles is one of the reasons I love this work. However, sometimes we get stuck on a particularly baffling problem, and this can feel frustrating and discouraging. The following are some of the methods I turn to when this happens to me.
The most puzzling problems almost definitionally have one thing in common: something we think we know is incorrect. At the outset there is no way to know what we have overlooked, so be a good detective and consider everything a suspect. This includes sections of code that seem trivial or obvious, as well as all of your tools (including the browser!). Sometimes solutions to the most mysterious problems are hiding in plain sight.
I once spent hours trying to understand why I wasn’t seeing updates to a user’s profile page, only to eventually realize that I was signed in as a different user. On another occasion, I struggled late into the night trying to figure out why new records weren’t getting saved in the database. It turned out I was looking in the wrong database. I have more examples like this than I care to admit!
It is a time-honored developer tradition to start by printing your variables. In many cases, this reveals the problem right away. But when you don’t find what you’re looking for, or what you find doesn’t make sense, take a scorched earth approach to logging. Log everything and then log some more. Start from the unexpected result and work your way backwards, logging everything you can, including values you feel certain you know.
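As a minimal sketch of what "log everything" can look like, the Python decorator below records the arguments and the result of every call it wraps. The names `trace`, `apply_discount`, and `order_total` are hypothetical, invented for this illustration only.

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger("debug_trace")


def trace(func):
    """Log each call's arguments and result, including the "obvious" ones."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("%s called with args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            log.exception("%s raised", func.__name__)
            raise
        log.debug("%s returned %r", func.__name__, result)
        return result
    return wrapper


# Hypothetical functions standing in for the code under suspicion.
@trace
def apply_discount(price, percent):
    return price * (1 - percent / 100)


@trace
def order_total(prices, percent):
    return sum(apply_discount(p, percent) for p in prices)


total = order_total([10.0, 20.0], 50)
```

Reading the resulting log from the last line upwards mirrors the advice above: start at the unexpected result and work backwards through each intermediate value.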
If I trace the problem as far back as possible and things still do not make sense, then it’s time for some high-level sanity checks. Is my unit test actually calling the code I think I’m testing? Is my browser actually pointed at the correct instance of the site? In my experience, sometimes the most frustrating bugs have the simplest (overlooked) solutions.
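One such sanity check can be automated. The sketch below, assuming a Python project, asks the interpreter which file actually defines the object being tested. The helper `describe_origin` is a hypothetical name, and the standard library's `json` module merely stands in for "the module I think I am testing".

```python
import inspect
import json  # stand-in for the module under test


def describe_origin(obj):
    """Return the path of the file that defines obj, to confirm which copy is loaded."""
    module = inspect.getmodule(obj)
    return getattr(module, "__file__", "<no file: built-in or frozen module>")


origin = describe_origin(json.dumps)
print("json.dumps is loaded from:", origin)
```

If the printed path points at an old checkout, a different virtual environment, or an installed package rather than the working copy, the mystery may already be solved.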
Sometimes multiple factors combine like layers of an onion to create what feels like a single intractable problem. In situations like these, it is sometimes helpful to temporarily shift your focus away from the conspicuous problem and toward the intermediate factor(s) getting in the way. For example, if an error only occurs in a remote environment, perhaps locating the log files is the first layer of the problem that needs to be solved.
Imagine an e-commerce site with a checkout form that throws an error. My first instinct is usually to work directly towards the desired end state. In this example, that would focus my attention on preventing the error. But this might be extremely inefficient if I don’t have an easy way to submit the form without making an actual purchase. Sometimes the quickest way to the solution feels indirect, and might even involve temporarily “breaking” things. The fastest way to solve the checkout error might be to first disable the payment gateway so I can resubmit the form (without going broke). It may feel regressive because now two things are “broken”, but this is significant progress if it gets me closer to solving the underlying problem.
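One way to "disable" the gateway is to swap in a stand-in that records charges without moving any money. The sketch below assumes the checkout code receives its gateway as a parameter. `FakeGateway`, `submit_checkout`, and the form fields are all hypothetical names and do not reflect any real payment library.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGateway:
    """Hypothetical stand-in for a payment gateway: records charges, moves no money."""
    charges: list = field(default_factory=list)

    def charge(self, amount_cents, token):
        self.charges.append((amount_cents, token))
        return {"status": "approved", "id": f"fake-{len(self.charges)}"}


def submit_checkout(form, gateway):
    """Hypothetical checkout handler; the gateway is passed in so it can be swapped."""
    quantity = int(form["quantity"])
    amount_cents = quantity * int(form["unit_price_cents"])
    receipt = gateway.charge(amount_cents, form["card_token"])
    return {"amount_cents": amount_cents, "receipt": receipt}


# With the fake in place, the form can be resubmitted as often as needed.
gateway = FakeGateway()
form = {"quantity": "2", "unit_price_cents": "1250", "card_token": "tok_test"}
for _ in range(3):
    result = submit_checkout(form, gateway)
```

With the purchase out of the way, each attempt at reproducing the checkout error costs seconds instead of money.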
When I catch myself grumbling “if I just had…” I try to stop and think hard about what it would take to address that need. Is there an affordable tool that would unblock me? Is there a gap in my knowledge that is forcing me to guess about something knowable? Simply put: don’t allow the problem to distract from the solution.
Error messages and broken layouts tend to draw our attention. But sometimes we can learn important lessons even when the problem does not occur. The term negative facts is a bit contentious in philosophy circles, but for our purposes it simply refers to instances where the issue doesn’t happen.
Recently I investigated a site that was loading extremely slowly. The problem was inconsistent, affecting a small but steady portion of page loads, and we were struggling to find an explanation. We scrutinized everything we could think of to explain the slow responses, but studying the “normal” responses turned out to be more fruitful. We eventually discovered that the normal requests all included a header indicating that they hit the CDN cache. This clue helped us determine that the performance bottleneck was consistently present on the origin server, but often concealed by the caching layer. This created the illusion of mysterious delays when in reality we were noticing the absence of caching.
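This kind of comparison can be sketched in a few lines of Python. The records below are made-up sample data, not measurements from the site described above, and the `X-Cache` header name is an assumption: CDNs differ in which header they use to report cache status.

```python
from collections import defaultdict
from statistics import median

# Made-up sample records for illustration only.
responses = [
    {"path": "/", "seconds": 0.12, "headers": {"X-Cache": "HIT"}},
    {"path": "/", "seconds": 4.80, "headers": {"X-Cache": "MISS"}},
    {"path": "/about", "seconds": 0.09, "headers": {"X-Cache": "HIT"}},
    {"path": "/about", "seconds": 5.10, "headers": {"X-Cache": "MISS"}},
    {"path": "/shop", "seconds": 0.15, "headers": {"X-Cache": "HIT"}},
]


def median_time_by_header(records, header):
    """Group response times by the value of one header and return each group's median."""
    groups = defaultdict(list)
    for record in records:
        value = record["headers"].get(header, "<absent>")
        groups[value].append(record["seconds"])
    return {value: median(times) for value, times in groups.items()}


summary = median_time_by_header(responses, "X-Cache")
for value, seconds in sorted(summary.items()):
    print(f"{value}: median {seconds:.2f}s")
```

Running the same grouping over each header in turn is a cheap way to find out what the fast responses have in common, not just the slow ones.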
If the same component throws an error on one page, but not another, maybe there is something special about that page that causes the error. Or, maybe there is something special on the other page that is preventing the error. If only certain users experience a bug, it’s natural to first wonder what they all have in common. But if you cannot find a pattern, don’t forget to look for a pattern in all the users that don’t report the problem.
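The same question can be asked in both directions with a little set arithmetic. In this sketch the users and their traits are invented for illustration, and `distinguishing_traits` is a hypothetical helper.

```python
def distinguishing_traits(group, other):
    """Return traits shared by everyone in group and by no one in other."""
    if not group:
        return set()
    common = set.intersection(*(set(user["traits"]) for user in group))
    seen_in_other = set().union(*(user["traits"] for user in other))
    return common - seen_in_other


# Invented example data: who reports the bug, and who does not.
affected = [
    {"name": "user_a", "traits": {"safari", "en"}},
    {"name": "user_b", "traits": {"chrome", "fr"}},
]
unaffected = [
    {"name": "user_c", "traits": {"chrome", "en", "ad_blocker"}},
    {"name": "user_d", "traits": {"firefox", "fr", "ad_blocker"}},
]

causes = distinguishing_traits(affected, unaffected)      # what might cause the bug
preventers = distinguishing_traits(unaffected, affected)  # what might prevent it
print("shared only by affected users:", causes)
print("shared only by unaffected users:", preventers)
```

In this invented data the affected users have nothing in common, but every unaffected user has an ad blocker, which hints that something the ad blocker removes may be triggering the bug.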
A simple exercise that helps me to think in this way starts by asking myself: “If what I think is true, then what else should I expect to see?” Challenging my working hypothesis in this way helps me avoid guessing by comparing my theory to the evidence. It also helps me notice missing evidence, and this absence might be just as significant.
Community support forums and Q&A sites like Stack Overflow are incredibly powerful resources in multiple ways. Obviously, your question might be answered by a helpful stranger, and this is lovely. But I write up my question even for extremely obscure problems that are unlikely to be answered by the community. Very often, the exercise of writing my question causes me to consider the problem in a different way so I can figure it out myself.
I used to think of writing a question as a last resort that I would turn to only when nothing else worked. But now I do it much earlier in my sequence. Stopping to write the question may feel like it takes time away from debugging the problem. But the act of writing the question is part of debugging the problem because a good question requires that you restate the facts, summarize what you’ve already tried, and define the exact information that you are searching for. Even if the act of writing the question turns out not to be fruitful, the sooner the question is published, the sooner the community has a chance to see it and potentially answer it. It’s an investment. Make it early.
You can also do a spoken version of this technique. I find this particularly helpful when I am working in parallel with other members of a team. Everyone is working towards the same eventual goal of solving the problem, but the pieces they already understand, the ways they are thinking about the problem, and the solutions they’re exploring are often different. Remind yourselves to speak aloud what you think might be happening, why you think that, and what you need to figure out in order to verify it.
Most of us want to end a work session on a high note. It’s unsatisfying to step away from something unresolved or broken. But ironically this is sometimes the quickest way to fix it. If I focus on the same problem for a long time it becomes difficult to maintain perspective and objectivity. It also makes me tired, which tends to promote guessing and sloppiness. Often, a short break is all I need before returning to the problem with clear eyes and sharper thinking. This simple tactic is the single most effective debugging technique in my toolbox.
I have learned to resist the fear that my current headspace is essential and I won’t ever be able to recover it when I return. I will. I also resist the delusion that if I stay up all night grinding away I’ll eventually figure it out. I probably won’t. Countless times, I have made this mistake: stubbornly working late on something frustrating before finally giving up, only to return the next morning and immediately notice a missing semicolon or a typo in a variable name. It takes practice, but I have learned to recognize the early signs of getting stuck or frustrated. When I feel this happening I step away, or I set a strict time limit for how long I will allow myself to keep struggling before I take a break. Experience has taught me that the sooner I take that break, the sooner I will actually solve my problem.
Don’t forget to savor the moment whenever you finally solve your difficult problem. Do a victory dance. Drink a tasty beverage. Consider calling it a day. Problems tend to clamor for our attention, but successes can slip by quietly if we allow them. You’ve worked hard and accomplished something challenging. There will always be more problems to solve tomorrow.