InformIT

The Bad Code Spotter's Guide

Date: Apr 14, 2006

Return to the article

Old maps were marked with the phrase "Here be Dragons" to help seafarers steer away from dangerous places; in programming the best way to avoid dealing with bad code is to avoid writing it. Diomidis Spinellis points out 10 giveaways to spot bad code that you (or others) may have written.

The ability to spot substandard code is crucial both for programmers and software development managers. As programmers, when we encounter bad code we know we have to take a defensive stance: plan for more development time, expect bugs to crop out of nowhere, and anticipate that small changes may require significant rework. Ideally, we should also arrange for an opportunity to give the code a facelift, refactoring the worse-designed elements and correcting style problems. As managers (for others or of our own work), when we find bad code we need to take a step back and evaluate the situation. Why am I or my group writing this trash? Are there schedule pressures or insufficient incentives for writing brilliant code; is there a need for more training, better design, different development practices?

Here’s a list of 10 giveaways I typically use to spot bad code that I or others have written.

Poor Commenting

Writing good comments is more difficult than writing good code, and therefore identifying code whose commenting is poor is a child’s play. If you see nontrivial functions or methods lacking a comment at their beginning explaining what they do, you know you’re in trouble. The same goes for global variables, class and structure fields, and code blocks implementing a complex algorithm: all should be accompanied by a comment. Note that I don’t expect to see everything adorned with a comment: getter and setter methods, straightforward code, and many local variables are better left to explain themselves.

There’s also the problem of useless comments that simply repeat what is obvious from the name of the entity they explain. A particular thorn here are the elaborate javadoc comments that some sophisticated editors create automatically and some unsophisticated programmers never fill in. Such comments take up valuable screen real estate and distract the code’s reader without contributing something to the program’s understanding.

Finally, there’s also the (relatively less common) problem of excessive commenting. This tends to cause difficulties, because as we programmers maintain the code we often forget to update the comments (especially when there are many detailed comments and we’re unsure about what we’re doing) and this causes the comments to diverge from reality. So, unless you’re programming in assembly language, don’t feel you’ve got to comment every line of code you write.

Inappropriate Naming

Problems with the naming of the program’s entities are also easy to spot. One area is inconsistency. When you see in the same program CamelCaseIdentifiers mixed with underscore_separated_names, you know that you’re treading on the remains of a battle between different personalities. Similarly, if you find a wanderGet and a setWander method, you know that the programmer who wrote the code probably had his mind wandering somewhere else at the time.

When choosing names, moderation should be the norm. A long and overly descriptive name used for a local index variable that should simply be named "i" is just as bad as a short and cryptic global variable name.

Inconsistent Spacing and Indentation

The third giveaway for bad code is the inappropriate use of spaces for indenting the code and separating the program’s elements. There are many different ways for indenting the code to indicate its structure, but there are no excuses for using none of them, for inventing a new one, or for combining two of them in the same file. Any one of those three sins should alert you that the code is sick, and the problems could go deeper than its surface.

The same goes for the use of spaces for forming statements and separating operands from operators in expressions. Novice programmers often fail to notice that there are specific spacing rules for programs. I can still remember an afternoon in 1983: it was only three months after I had started learning C from a third generation photocopy of K&R, and I was anxious to show off my code to a friend who was at the time studying computer engineering. He flatly told me that my code had too few spaces, and didn’t offer a word of praise. You certainly wouldn’t want to maintain the code I was writing then, so watch out for those spaces.

Missing Error Checks

The final easy ways to spot bad code are calls to library functions that fail to perform an appropriate action when the function returns with an error. In C code, errors are often passed from the API to the program in-band, that is together with the function’s result: a special value, like NULL or a negative integer, is used to indicate that the result is not an entity the caller would expect to be returned, but an indication of an error. If the program fails to check the return value of, say fopen or malloc, for NULL, then you can be almost sure there will be situations where the program will crash on the face of its hapless end user. In the Java and .NET platforms, errors are typically communicated out of band through exceptions. In these cases dodging an error situation requires a bit of explicit effort: an empty exception handler. Despite this, I’ve often encountered them in sloppily written code and they are typically bad news.

Handling errors is difficult because the actions one can take after the error occurs are rarely able to rectify the error. Thus, we programmers are put in a position where we have to tediously check one possible error after another, only to report it and exit gracefully. Nevertheless, programs that do not check for API errors reek of unprofessionalism and are often untrustworthy.

Repetition

Copying one piece of code from one place to another is easy to do, but finding that code after many years and programmers have left their signs on it is rarely that straightforward. We typically encounter the situation of repeated code sequences when we fix an error or introduce a change, and then look in the program for further places we should adjust (this is always a good practice). Sometimes we locate another (typically subtly different) instance of the code we modified, and then another, and another. These code segments violate the "Don’t Repeat Yourself" (DRY) and "single point control" principles.

Although the problem of repetition is rarely glaringly apparent like the other signs we’ve examined so far, the trouble it causes goes deeper than the surface. First, repetition indicates a failure to use appropriate abstraction mechanisms in the program, such as methods, functions, subclasses, and generic types. As a result, the code is longer than it should be, and, as you can surely appreciate, it will contain more bugs. In addition, repeated instances of code hinder maintenance because they force you to manually track and modify each separate repeating fragment.

Lack of Clarity

Despite what competitions like the Obfuscated C Code Contest might let one believe, the writing of complex undecipherable code is not a sign of programmer prowess. Consider as an example the following code fragment, found in a widely used program.

     if (SWITCH_TAKES_ARG (c) > (p[1] != 0))
      nskip += SWITCH_TAKES_ARG (c) - (p[1] != 0);

Amazingly, this mind-twisting sequence could be simply rewritten as

     if (SWITCH_TAKES_ARG (c) && (p[1] == 0))
      nskip++;

which, incidentally, is also slightly more efficient than the initial code.

We can’t express all algorithms in a clear manner. However, some would regard deliberately obtuse code as an insult toward the developers who will maintain it in the future.

When you see code that appears overly complex, try to simplify it, starting with expression or statement-level refactoring operations. If you succeed, you’ll know the code’s quality isn’t all that great, and you’ll also leave the code in a state slightly better than the state you found it in.

Gratuitously Complicated Design

The mother of unclear and complex code is code that expresses a complicated design. Here I’m thinking about abstraction through mechanisms like inheritance, interfaces, and generic types with no apparent gains. Design patterns are also sometimes misused in a similar manner.

There’s a fine difference between a complex design that serves a purpose and one that simply shows-off designer expertise or guards against vague and unspecified requirements that might appear in the future. The agile programming community has taught us that paying upfront the cost of future-proof goldplating is a disastrous waste of programmer productivity. The design becomes difficult to navigate today, while the rewards may get reaped in the future.

So, when you encounter complex designs that fit our description don’t feel intimidated by what might lie beneath; simply lament the fact that somebody’s Pharaonic design ideas will cause you some everyday pain.

Hidden Assumptions

At the opposite end of complex designs, you may find programs that use too few abstractions hiding a number of assumptions deep inside the code. Two chief and easily recognizable culprits in this area are fixed buffer sizes and magic numbers. Although nowadays C++, Java, and .NET come with a feature-rich container framework, one can still find code that instead of using the platform’s vector container—which implements a growable array of objects—it declares a fixed-size array. The number of elements that the array will hold is hidden within the code, and the reasons for choosing that particular number are often similarly hidden within the programmer’s mind. This code will crash when it faces more elements than those originally planned for, and in some cases, it may even expose the program to a buffer overflow attack.

Numbers appearing out of the blue don’t only occur in array declarations. They come in many shapes and sizes, and they are aptly termed magic numbers. We can’t avoid them, because at some point we do need, for example, to specify that TCP packets are identified by the number 6, and UDP packets by the number 17. However, directly embedding these numbers in the code makes the code difficult to understand and maintain.

Failure to Use Standard Libraries

Using an array instead of a vector may be bad, but implementing a resizable collection from scratch is worse. In general, when you find code implementing functionality that duplicates facilities available in the standard libraries, be wary. Although standard libraries are typically mature, stress-tested, and well-thought-out, home brew code duplicating their functionality is anything but. When you find a complete XML parser or a specially crafted version of a sort function in a program, you’ll have to spend effort to understand its interface, verify its functionality, and ensure that its performance is adequate. All that because somebody wanted to reinvent the wheel or didn’t bother to read the platform’s documentation.

Deficient Testing Support

I’ll end my tirade on bad code signs with a word about testing. Unit tests are becoming standard, but I wouldn’t go as far as saying that any code lacking unit tests is deficient. Other methods of testing are also acceptable, especially for legacy code. For example, the liberal use of assertions in the code, or the provision of a framework for regression testing, can often provide a solid testing infrastructure. Also, programs implemented as filters or programs providing a scriptable interface can also be easy to test. On the other hand, when it’s obvious that no one ever spent a thought on how the code will be tested, and the code defies reasonable attempts to test it, then you’ve got a problem on your hands.

Many of the bad code traits I’ve identified may seem trivial to you, and many indeed are. Some can even be superficially corrected with a code formatting tool like indent. However, remember we’re talking about ways to spot bad code here, not ways to identify good code. These traits are simply danger signs. If those who wrote the code weren’t motivated enough or didn’t pay enough attention to keep the simple things in order, you can be sure that many worse issues will be lurking in it.

Old maps were marked with the phrase "Here be Dragons" to help seafarers steer away from dangerous places; in programming the best way to avoid dealing with bad code is to avoid writing it.

800 East 96th Street, Indianapolis, Indiana 46240