Tools and Techniques for Debugging Your Program Code
If you've gotten past all the standard problems people run into with CGI scripts and your script still isn't working properly, you have to start searching for bugs within your code. This can be a much tougher job than dealing with all the common problems mentioned previously.
Basically, errors within programs come in two flavors: syntax errors and runtime errors. Syntax errors crop up when you use your programming language of choice improperly. If you leave out a required semicolon or use elseif instead of elsif, your program will not compile, or if it's an interpreted language, it won't execute all the way through. Runtime errors occur when all the syntax in your program is correct, but your program still doesn't behave as expected.
Runtime errors can cause your programs to exit with an error during execution, or they might execute but produce unexpected results. For example, if you have a mathematical construct in your program that divides a number by 0, most languages will exit and return an error during execution. On the other hand, if your program multiplies a number by 1000 when it should multiply by 100, the program will appear to work correctly but will produce invalid results.
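The two kinds of runtime error can be seen side by side in a short shell script. This is only a sketch: the variable names are made up for illustration, and the division is run in a subshell so that its failure doesn't end the demonstration early.

```shell
#!/bin/sh
# Case 1: a runtime error that halts execution (division by zero).
# The arithmetic runs in a subshell so the failure doesn't end this demo.
divisor=0
if ( : $((10 / $divisor)) ) 2>/dev/null; then
    crash_result="finished normally"
else
    crash_result="exited with an error"
fi
echo "Dividing by zero: $crash_result"

# Case 2: a runtime error that runs cleanly but gives the wrong answer.
dollars=5
cents=$((dollars * 1000))   # bug: the multiplier should be 100
echo "$dollars dollars = $cents cents"
```

The first case announces itself with an error. The second prints a plausible-looking line and gives no hint that anything is wrong, which is why this kind of bug is usually harder to find.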
Compiled Versus Interpreted Languages
When we talk about debugging, it's important to contrast two types of languages: compiled languages and interpreted languages. The difference between the two is that there are at least two steps to get from source code to execution with compiled languages; with interpreted languages, there's only one: execution. A scripting language, which is a simple language designed to perform special or limited tasks, is usually interpreted.
Let me talk about an interpreted language first to give you an example. Scripts written in the Bourne shell are interpreted. When a Bourne shell script is executed, the shell reads the program command by command, and executes each command before moving on to the next one. With interpreted languages, even syntax errors are "runtime errors." In other words, if there's a syntax error on the fifth line of a shell script, the first four lines will be executed, and the program will exit with an error when it reaches the fifth line.
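You can watch this happen with a small experiment. The sketch below writes a throwaway script whose fifth line is a deliberate syntax error, runs it with sh, and shows what was printed before the shell gave up. The temporary file and the variable names are assumptions made for the demonstration.

```shell
#!/bin/sh
# Write a small script whose fifth line is a deliberate syntax error.
script=$(mktemp)
cat > "$script" <<'EOF'
echo "line 1"
echo "line 2"
echo "line 3"
echo "line 4"
if then
echo "line 6"
EOF

# Run it, capturing normal output and discarding the shell's error message.
output=$(sh "$script" 2>/dev/null)
status=$?
rm -f "$script"

echo "$output"
echo "exit status: $status"
```

The first four lines are printed, the sixth never is, and the exit status is nonzero. The exact status and the wording of the error message vary from shell to shell.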
On the other hand, when you write a program in a compiled language, the source code must be transformed into machine-readable instructions prior to execution. How this is handled differs from language to language. When you program in C, you use a compiler to transform source code into machine-readable code. The code is then stored in executable format so that it can be used. When you program in Java, the source code is compiled into an intermediate format called bytecode, and is stored in that format. When you execute a Java program, the Java virtual machine translates the bytecode into machine-readable code for the platform it's running on and executes the code.
Another example of a compiled language is Perl, which is a scripting language. Despite the fact that it's a scripting language, it's not an interpreted language. The difference between Perl and other compiled languages is that Perl code isn't stored in an executable format or bytecode after it's compiled. Instead, every time you run a Perl program, the source code is read by the Perl interpreter and compiled, and then executed immediately. So, it's a scripting language in the sense that to the user, compilation and execution are part of the same step, but it's a compiled language because the code is compiled before it is executed.
The difference in debugging the two types of languages is that, with an interpreted language, all debugging occurs at runtime. There is no compilation step during which you can iron out all the syntax errors in your code; instead you have to run the program to find any errors in it. This really becomes a problem when your interpreted program modifies files, or makes any other changes to permanent resources. For example, let's say you've written a shell script that, among other things, submits a credit card transaction for processing. If there's a syntax error in the program after the credit card submission, you'll have to submit a credit card transaction to get to the bug, and then submit another to determine whether the bug is fixed after a change. This can make debugging interpreted programs a real hassle (fortunately, in this case, most credit card verification services provide bogus credit card numbers you can use to test your programs). Generally in these situations you comment out instructions that make permanent changes for debugging.
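An alternative to commenting lines out by hand is to route every permanent change through a switch that you can turn off while debugging. The following is a minimal sketch of that idea; DRY_RUN, run, and submit_transaction are hypothetical names invented for this example, and submit_transaction merely stands in for the real credit card submission.

```shell
#!/bin/sh
# DRY_RUN, run, and submit_transaction are hypothetical names for illustration.
DRY_RUN=1   # set to 0 to perform the real action

# Stand-in for a step that makes a permanent change.
submit_transaction() {
    echo "charged card: $1"
}

# Run the given command, or only report it when DRY_RUN is on.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "DRY RUN, skipped: $*"
    else
        "$@"
    fi
}

result=$(run submit_transaction 19.95)
echo "$result"
```

With DRY_RUN set to 1, the script reports the step it would have taken and carries on, so the code after the submission can be exercised as often as needed without sending anything.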
Running CGI Scripts from the Command Line
When you're debugging your CGI scripts, it's important to remember that they're standard command-line programs. That means that you don't have to always run them through the Web server. In many cases, it's a lot easier to run them from the command line and look at the output there to find bugs.
This is certainly true when your script is returning a 500 Server Error result when requested. As you know, when you run into one of these errors, oftentimes you can find out what caused the error in the server's error log. You should always take a 500 Server Error as an indication that you should test the CGI program at the command line if you can't spot the problem immediately in the error log.
Running your program at the command line will let you know immediately whether there's a syntax error in your program if it's written in a scripting language, or whether there's a runtime error in the code that prevents the program from executing if it's a compiled language. In fact, if you're writing a program in Perl, you can execute it at the command line using the -cw flags to compile the program without executing it, and to turn on warnings to catch coding mistakes that aren't necessarily syntax errors. For example, to compile (but not execute) the program example.pl with warnings, the following command line is used:
perl -cw example.pl
It's also easy to verify from the command line whether your program produces the proper HTTP header. The first two lines of the program's output should be the content type header and a blank line. This output is processed by the Web browser, so you never see it when you're testing through a server, but it's right there on the screen if you test your program at the command line.
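If you'd rather not check the header by eye, the same test can be scripted. The sketch below assumes a stand-in CGI program written as a shell function (hello_cgi is a hypothetical name); in practice you would capture the output of your own script instead.

```shell
#!/bin/sh
# hello_cgi is a hypothetical stand-in for a real CGI program.
hello_cgi() {
    printf 'Content-type: text/html\n\n'
    printf '<html><body>Hello</body></html>\n'
}

# Capture the output and pull out its first two lines.
output=$(hello_cgi)
first_line=$(printf '%s\n' "$output" | sed -n '1p')
second_line=$(printf '%s\n' "$output" | sed -n '2p')

# Line 1 must be a content type header; line 2 must be blank.
case $first_line in
    [Cc]ontent-[Tt]ype:*) header_ok=yes ;;
    *)                    header_ok=no ;;
esac
[ -z "$second_line" ] || header_ok=no

echo "header ok: $header_ok"
```

This only checks the simplest case described above, in which the content type header is the first line and the blank line follows it directly. A script that sends additional headers would need a looser check.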
CGI.pm and the Command Line
CGI.pm, the standard Perl tool for creating CGI scripts, is designed to make testing CGI programs from the command line very easy. When you run a program that uses CGI.pm from the command line, you can pass in name and value pairs as arguments, like this:
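$ perl archive.pl year=2002 month=5 day=1
You can also pass them as a single query string, quoted so that the shell doesn't treat the ampersands as its own syntax:
$ perl archive.pl 'year=2002&month=5&day=1'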
CGI.pm treats data received through the POST and GET methods the same, so there's no need to distinguish between them when you pass data to a script from the command line.
Using Print Statements for Debugging
One of the most important and time-tested techniques for tracking down logic errors in programs is using print statements to isolate bugs and find out where incorrect values are introduced. After you have your script working well enough to produce output, you can start inserting print statements to track down logic errors.
Many programming languages include debuggers that allow you to step through the execution of your program statement by statement. You can stop execution at any time, and examine the values of variables that have been set within the program. Unfortunately, many languages used for CGI programming don't have debuggers, and even if they do, it's sometimes easier to just use print statements for debugging.
Generally speaking, print statements are most often used to display the values of variables that are normally used internally by the program. Let me show you how print statements are used for debugging with some examples. One of the most common uses of print statements to debug is to print values as a loop executes. If you're unsure of how many times a loop iterates in a program, you can insert a print statement in the loop so that it prints either a count, or just a marker value every time through.
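Here is what that looks like in a shell script. It's a minimal sketch with made-up list items; the debugging line is sent to standard error so that it stays separate from the program's normal output.

```shell
#!/bin/sh
count=0
for item in alpha beta gamma; do
    count=$((count + 1))
    # Debugging output: which pass this is and what value is being handled.
    echo "DEBUG: pass $count, item=$item" >&2
done
echo "loop ran $count times"
```

Each pass through the loop prints one DEBUG line, so a loop that runs too many or too few times shows up as soon as you run the script.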
Another common usage for print statements is to determine where, exactly, an error is occurring. Generally, you place print statements before and after the potentially offending code, and check to see whether the text in both of the print statements gets printed out. You can place these types of "markers" throughout your code to figure out where, exactly, something is failing along the way.
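The marker technique can be sketched as follows. step_one and step_two are hypothetical stand-ins for sections of a real program, and step_two is written to fail on purpose. The steps run inside a command substitution only so that the failure doesn't end the demonstration itself.

```shell
#!/bin/sh
# step_one and step_two are hypothetical stand-ins for sections of a program.
step_one() { :; }          # succeeds
step_two() { exit 1; }     # fails and ends the program

# The command substitution runs the steps in a subshell, so the exit inside
# step_two ends only that subshell and the markers printed so far are kept.
output=$(
    echo "MARKER 1: before step_one"
    step_one
    echo "MARKER 2: after step_one, before step_two"
    step_two
    echo "MARKER 3: after step_two"
)
echo "$output"
```

Markers 1 and 2 appear but marker 3 does not, which tells you that the program stopped somewhere inside step_two.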