Home > Articles

This chapter is from the book

Debugging CGI Scripts

Some people would say that the toughest step in the process of creating an application is debugging it after all the functionality that you wanted is, in theory, complete. Certainly working out all of the nagging bugs in an application can take as long as getting the main functionality written.

One problem with CGI scripts is that they exist as part of a larger system. The actual CGI program is just one component of the system; there's also the operating system, the Web server, and the network connection between the Web browser and the Web server. A problem can occur with any of these components, and one of the toughest tasks facing a CGI programmer is determining where a problem occurred.

In many cases, problems crop up in the network connection between the Web server and Web browsers. In this book, I'm not going to discuss problems caused by bad Internet connections. Instead, I'm going to focus on problems that crop up on the Web server. However, you should be mindful of the fact that in some cases, the problem lies in the network connection, or even in the configuration of the browser or computer running the browser. Sometimes, when errors in your CGI scripts are reported, it turns out that the user just didn't understand how to use the script.

Finding the Source of an Error

Before you can fix an error, you have to figure out what caused it. Because this book focuses on CGI programming and not Web server administration, I'm not concerned with problems like your Web server crashing or your router preventing incoming connections. The important point to make is that if you can't connect to the Web server at all, either because the domain name in the URL you entered can't be looked up, or the server is refusing network connections, don't blame your CGI program. You've got some problem cropping up at a lower level that has to be fixed first.

Let's assume that the Web server is up and running, and there's some problem with your CGI script. Before you start digging through your program's source code, you should verify that a few common mistakes haven't been made. These mistakes plague everyone who writes CGI scripts, even people who are old hats. You can learn a lot about what went wrong by looking at the response code that's returned with the response.

Examining the HTTP Status Code

Every HTTP response is accompanied by a status code that indicates what the result of the request was. The status code is part of the response headers that are sent from the server to the browser. The most common status code by far is 200 Success. You never see or hear about this one because it means that the request was valid and returned a successful response. Whenever you request a page that is displayed properly, the response had a status code of 200.

When the status is something other than 200 Success, the Web server generally sends an error document back with the response. Often, the response code will be displayed as part of the error document. If it is not, you have to check the server's access log to find the response code for the failed request. The most common error code encountered is 404 Not Found. This code is returned when the requested resource could not be located by the Web server. Usually this error message crops up when a user clicks on a link to a file that no longer exists.

A 403 Access Denied error is returned when a user attempts to request a file that the Web server is not allowed to read. The Web server user must have read permission for a file in order to send it to a user.

When a user tries to visit a site that is password protected using basic authentication and she enters an invalid account and password combination, she receives a 401 Unauthorized error.

None of these errors are specific to CGI programming, I'm just including them here so that you'll know what you're looking at when you see them. The most important error for CGI programmers is 500 Server Error. It indicates that something went wrong when the server tried to execute the CGI script. It doesn't necessarily mean that the program itself is broken, just that the server had trouble requesting it and getting back the proper results.

Reading the Error Log

Web servers that support CGI programs maintain an error log of some kind. Any time a request fails, the error that occurred is stored in the log so that responsible system administrators and programmers can see what went wrong when their applications failed to work properly. The error log isn't restricted to CGI-related errors. It stores any error that the Web server sends as a response to a request, so all the 404 Not Found errors and other errors associated with regular requests go there as well.

The most important feature of the error log, from a CGI programmer's standpoint, is that it goes beyond storing the error code generated by the Web server to storing the error message produced by the program itself. This works in an interesting way. When a CGI program fails, it generally displays an error message. If the error message is displayed before (or instead of) the required Content-type header, the Web server reports an error to the user (because of the missing header). When the request for the CGI program fails, the server copies some of the output of the attempt to execute the program (in other words, the CGI program's error message) into the error log. This message is generally the most important clue for determining what went wrong with the CGI program.

The error log for the Apache Web server is generally found in the logs directory under the server root and is usually named error_log. However, the name and location of the error log can be changed to something else, so you can't count on that being the case. You can generally get the location of this file from the server administrator, or if you're able, by checking the Web server's configuration files. Some service providers don't turn on error logging by default, and you may have to ask them to enable error logs for your site so that you can debug your programs more easily.

Listing 3.1 contains an excerpt from an Apache error log. I'll discuss what some of the common errors found in the error log are a bit later.

Listing 3.1 An Excerpt from an Apache Error Log

 1: [Wed Dec 1 21:33:20 1999] [error] (2)No such file or directory: exec of 
/web/cgi-bin/bad.cgi failed
 2: [Wed Dec 1 21:33:20 1999] [error] [client 199.72.11.45] Premature end of 
script headers: /web/cgi-bin/bad.cgi

Fixing Setup Errors

Now that you've learned how to find errors, I'm going to go over some of the common errors you might encounter and explain how to fix them. Because these common errors are generally related to features required of all CGI programs, they're easy to track down and fix. Errors in your application logic are far more insidious and can take an awful lot longer to fix than the simpler errors listed here. I will talk about some debugging techniques later that you can use to isolate these types of bugs in your code.

Setting the Proper File Permissions

One of the most common mistakes most people make when they write CGI programs is setting the file permissions improperly. CGI scripts must be executable by the user that the Web server runs as. The easiest way around this is to make sure that all your CGI programs are executable by everyone. If the Web server isn't allowed to execute the program, the response to any request for it will be 500 Server Error.

File Permissions Under UNIX-Based Operating Systems

If your CGI programs are installed on a server running some UNIX-based operating system, you can make your program executable by everyone using the following command:

chmod a+x

You can tell if a program is executable by looking at the long version of the directory listing, which looks like this:

drwxr-xr-x  3 rafeco users  512 Dec 1 00:11 ./
drwxr-xr-x 26 rafeco users 1024 Nov 30 22:56 ../
-rwxr-xr-x  1 rafeco users 5018 Oct 27 11:21 archive.pl*
lrwxrwxrwx  1 rafeco users   9 Dec 1 00:11 example.sh@ ->  
  simple.sh
drwxr-xr-x  2 rafeco users  512 Nov 30 22:56 guestbook/
-rwxr-xr-x  1 rafeco users  280 Oct 24 12:44 pinggeneric.sh*
-rwxr-xr-x  1 rafeco users  666 Aug 23 12:31 sample.cgi*
-rwxr-xr-x  1 rafeco users 3867 Oct 27 11:07 search.pl*
-rwxr-xr-x  1 rafeco users  156 Aug 22 23:51 simple.sh*

The file permissions are cryptically expressed by the string -rwxr-xr-x. Let me explain how this string is decoded. The first character, a dash in this case, indicates what type of file the current file is. The dash indicates that this is a normal file. Directories have a d in this space, and symbolic links have an l. The next nine characters are used to display the access permissions for the file.

The characters are divided into three groups of three permissions. There are three sets of people that can be granted permission for a file, and there are three types of permission for each file. From left to right, the three sets are user, group, and others.

The user permissions pertain to the owner of the file. The owner is listed in the third column of the long directory listing. The group permissions apply to the members of the group to which the file is assigned. The group associated with a file is listed in the fourth column of a long directory listing. The last group of permissions is for others. The others permissions apply to all the system's users.

When you attempt to access a file, the permissions for the most restrictive set of users of which you are a member apply to you. In other words, the others permissions are only used if you're neither the owner of the file nor a member of the group associated with it. Similarly, the group permissions do not apply to the file's owner, even for a member of the group associated with the file.

Now let me explain what the individual permissions mean. As I said before, there are three permissions for each set of users. The permissions are read, write, and execute. If that set of users has the permission, the appropriate letter will appear in that space. If they do not, a dash will appear there instead. For example, if the owner has the permissions rw-, he can read and write the file, but not execute it. Similarly, if others have r-x permissions, they can read and execute the file, but not modify it.

Permissions are slightly different for directories. The names of the permissions read, write, and execute are the same, but how they work differs. If you have read permission for a directory, you're allowed to list files in that directory. If you have execute permission for a directory, you can make it your current directory. If you have write access to a directory, you can create files in that directory. It's possible to access files in a directory if you have execute permission for the directory but not read permission. You just have to be allowed access to the file, and you have to know the name of it because you aren't allowed to get a listing.

So now let's go back and look at the full file permissions for a file. If a file has the permissions -rwxr-xr-x, it's a normal file, and the owner has read, write, and execute permission for the file. Both the group and others have read and execute permission, but not write permission.

Checking Your Headers

A common pitfall for CGI programmers is forgetting to include the code that produces the content-type header; instead the program goes straight to generating HTML or other output, causing an error.

Anytime your program won't execute because of syntax errors, or because the Web server is unable to call it, you'll generally get an error complaining that the script didn't print the proper header. As described earlier, the script probably sent an error message instead, so you should look in the error log to find out what the error was.

Checking the Path to Your Script Interpreter

One very common problem you'll often find when you move scripts from one server to another, or you install a script downloaded from the Internet, is that the path to the Perl interpreter (or whatever script interpreter you're using) is wrong in the shebang line.

If the path to your script interpreter is incorrect, instead of seeing the output of your script, you'll see something like the following in the error log:

[Wed Dec 1 21:33:20 1999] [error] (2)No such file or directory: exec of 
/web/cgi-bin/bad.cgi failed
[Wed Dec 1 21:33:20 1999] [error] [client 199.72.11.45] Premature end of 
script headers: /web/cgi-bin/bad.cgi

If you encounter one of these errors, you just need to enter the right path in the shebang line. To do so, find the directory where your script interpreter is installed, and use that path in your program.

TIP

If you have a UNIX shell account on the server where the CGI program resides, you can often find the location of executable files using the which command. The which command searches all the directories in your path for the program you specify, and tells you where it is. For example, the following sequence illustrates how which is used to find the Perl interpreter

$ which perl
/usr/local/bin/perl

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020