Essential Apache for Web Professionals

By Scott Hawkins
Published Dec 10, 2001 by Pearson. Part of the Essential Series for Web Professionals series.

Book

Sorry, this book is no longer in print.

About

Features

Companion Web site contains downloadable code, configuration directives, sample images for the book's projects, and additional resources: a complete integrated Book/Web Learning System, with a concise, hands-on book complemented by a rich companion Web site.
- Gives students powerful additional resources for learning Apache, and updates for staying current.

Extensive task-based cross-referencing: simplifies access to a wide range of configuration directives and step-by-step techniques.
- Helps students rapidly find the solution they are searching for.

Complete example configuration directives for key tasks: includes detailed configuration directives for several common Apache scenarios and environments, including virtual hosting.
- Gives students a jumpstart by providing directives they can borrow and adapt.

Database connectivity, step-by-step: walks through integrating back-end databases for the delivery of dynamic content, a crucial requirement for most significant Web sites.
- Teaches students how to perform one of the most crucial tasks in real-world Web site management, a task neglected by many Apache introductory guides.



Description

Copyright 2002
Dimensions: F
Pages: 256
Edition: 1st

Book
ISBN-10: 0-13-064930-9
ISBN-13: 978-0-13-064930-0

Deploy and manage Apache-based Web sites—now!
Focuses on the Apache skills Web professionals need most
Learn hands on, with real configuration, deployment, and management projects!
Virtual hosting, database connectivity, load balancing, and more
Quick step-by-step solutions based on real-world scenarios

Essential Apache for Web Professionals is the fastest way for Web professionals to master key skills for configuring, deploying, and managing virtually any site! Through hands-on projects and real configuration directives, you'll learn how to use the world's #1 Web server to handle virtual hosting, database connectivity, and even complex session management and load balancing tasks. You'll start with simple examples, then work your way up to sophisticated projects, learning practical techniques you'd otherwise need months, or years, to learn. The companion Web site contains downloadable configuration directives, sample images for the book's projects, and even more in-depth explanations!

You'll master all this, and much more!

Downloading, unpacking, and compiling Apache source code
Installing precompiled versions of Apache
Basic runtime configuration techniques
Security and access control
Using built-in, third-party, and custom modules
Virtual hosting: running multiple sites from the same server
Serving dynamic content: CGI, FastCGI, PHP and mod_perl
Integrating Apache with Web databases
Tuning Apache for improved performance
Load balancing, session management, and other advanced techniques

Rely on Essential Guides for ALL the Web Skills You Need! All these books share the same great format, and the same dynamic Web site... so once you've used one, they're all a piece of cake!

Essential XML for Web Professionals
Essential JSP for Web Professionals
Essential Flash 5 for Web Professionals
Essential CSS & DHTML for Web Professionals
Essential PHP for Web Professionals
Essential ASP for Web Professionals
Essential Design for Web Professionals
Essential PERL 5 for Web Programmers
Essential Photoshop 6 for Web Designers
Essential JavaScript for Web Programmers
with more to come!



Sample Content

1. Installation.

2. Basic Apache.

3. Hosting Multiple Sites.

4. Dynamic Content.

5. Advanced Topics.

Preface

Introduction

This book is a discussion of the installation, configuration, and maintenance of the Apache Web server. At this writing, Apache is the most popular Web server in the world. Apache is open-source software; among other things, that means it is available for download at no cost.

The source code also is included with most distributions. If you choose, you can modify Apache to suit your needs. This feature has led to a rich variety of third-party add-ons. Many talented programmers have chosen to make their work available to the general public.

Apache is high-quality software. It is rare to encounter an error in the source code itself. If you do encounter a problem, technical support is available from a variety of outlets on the Internet and in bookstores. Some companies also provide phone support for a fee.

This book will teach you how to use the Apache Web server. The discussion assumes a basic familiarity with computer concepts, but if you use computers at all, you should find the book accessible. In this chapter, I first provide a brief discussion of some general networking concepts. If you've been working with networks for a while, feel free to skip this section. If not, you should review it, as the discussion in later chapters presumes a familiarity with the terms introduced here. The last portion of this chapter lays out the typographical conventions of the rest of the book.

Basic Concepts

In this section I will introduce several fundamental concepts of networking software in general. Next I'll cover several concepts that are specific to Apache.

Web Servers

Apache is a Web server. A Web server is a piece of software that responds to the requests of Web browsers. When you type a URL in the address window of your Web browser, an intricate question-and-answer sequence is initiated between your browser and various Internet services. In order to understand the material in this book, you need to have some understanding of these processes, so I will explain them first.

IP ADDRESSES

If you're even peripherally involved in the computer industry, you're probably familiar with the concept of an IP address. An IP address is a sequence of four numbers, each ranging in value from 0 to 255, which are separated by periods. The following is an example of an IP address:

192.168.100.1

You will probably notice that most of the examples in this book use addresses in the range of 192.168.100.1 to 192.168.100.255. These aren't real addresses, at least not ones you can get to from the Internet. They are part of a range of addresses that was set aside for private networks not connected to the Internet. As such, they are perfect for examples: because they cannot be reached from the public Internet, an example address never points at a real site that someone could attack.
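As a minimal sketch of the private-range idea, Python's standard `ipaddress` module can report whether an address belongs to a block reserved for private networks. The helper name `describe_address` is hypothetical and exists only for this illustration.

```python
import ipaddress


def describe_address(text):
    """Return a short label saying whether an IP address is private."""
    # ip_address() raises ValueError if the text is not a valid address,
    # for example when one of the four numbers is larger than 255.
    address = ipaddress.ip_address(text)
    if address.is_private:
        return f"{address} is private"
    return f"{address} is publicly routable"


if __name__ == "__main__":
    print(describe_address("192.168.100.1"))
```

Calling the helper with a malformed value such as `192.168.100.256` raises `ValueError`, which is a convenient way to catch typing mistakes in example addresses.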

NAME RESOLUTION

As members of the browsing public, we are accustomed to thinking of Web addresses in terms of their domain names. A domain name is an address of the form:

www.stitch.com

You might be surprised to learn that those names are not of much use to your computer. Computers almost never care about English names for things. In order to connect to your Apache server and start downloading information, the Web browser that wants to be your client must know two things about you:

the IP address of your machine
the port that your server is monitoring

However, in all likelihood, when users try to connect to your Web site, all they have is your domain name. How do we get from the

www.stitch.com

printed on your business card to the IP address and port number the networking software uses?

The first step in the process is name resolution. Name resolution is the process of looking up the IP address associated with a domain name. Name resolution usually occurs without any help from the end user. When you install networking software on your PC, such as the kind provided by your Internet service provider (AOL, Earthlink, and so on), part of the installation process is to tell your machine where to go when it needs some name resolution done.

Usually, the machines that perform name resolution are large, powerful server machines that are dedicated to that one task. Most of them run software that implements the Domain Name System (DNS). Not every machine that runs DNS contains every single address of the Internet. DNS servers store only the addresses that are most popular among their client bases. When they are asked to resolve a domain name with which they are not familiar, they pass the question on to another DNS server. The details of the name resolution process aren't really important to you as an administrator. The key point to remember is this:

When you decide to add a new Web site to your server, you must make sure the Internet at large knows that the domain name you are supporting is associated with the IP address of your server. The actual mechanics of this process are probably outside of your control. In practice, DNS registration usually is accomplished by picking up the phone and calling your Internet service provider (ISP). Tell them that you want the domain name you are hosting to be registered in DNS as belonging to your IP address. Generally this process takes a couple of hours on hold and $50 or so. You also should allow a couple of days for news of the change to travel from the DNS software of your ISP out to the world at large.

Once you have found an unclaimed domain name you can live with and have registered it with DNS, the worst is over.
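To make name resolution concrete, the following sketch asks the operating system's resolver for the IPv4 addresses behind a name, using Python's standard `socket.getaddrinfo`. The function name `resolve` is hypothetical, and the result for any real domain depends on your machine's resolver configuration, so no particular output is promised here.

```python
import socket


def resolve(hostname, port=80):
    """Return the sorted, de-duplicated IPv4 addresses reported for a name."""
    results = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({sockaddr[0] for *_ignored, sockaddr in results})


if __name__ == "__main__":
    # "localhost" normally resolves without contacting any DNS server.
    print(resolve("localhost"))
```

Passing a string that is already an IP address returns that address unchanged, because no lookup is needed.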

PORTS

Let us assume that the example browser has contacted a DNS server and that name resolution has been completed successfully. Now the browser knows the IP address of the machine with which it wants to communicate.

However, you may recall that earlier I said that, in order to make a network connection, the client browser also needs to know what port the Web server will be listening on. The machine associated with the IP address you found may be running multiple network services (ftp, telnet, etc.). Each of these services must respond to different requests in different ways. How does the server keep them separated? The answer is ports.

A port is a secondary number associated with an IP address. Ports come in the range of 1 to 65535. Rather than asking each individual machine which service it associates with which port, it has become customary for all machines connected to the Internet to use the same port for the same services. The term for this custom is well-known port. The well-known port for Web service is number 80. When connecting across the Secure Sockets Layer (SSL), port 443 is used.
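The well-known port convention can be sketched in a few lines: if a URL names a port, use it; otherwise fall back to the customary port for the scheme. The table `WELL_KNOWN_PORTS` and the function `effective_port` are hypothetical names for this illustration, and the table assumes only the standard assignments of 80 for http and 443 for https.

```python
from urllib.parse import urlsplit

# Assumed for this sketch: the standard well-known ports for Web traffic.
WELL_KNOWN_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a browser would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port in the URL always wins
    # Raises KeyError for schemes this small table does not know about.
    return WELL_KNOWN_PORTS[parts.scheme]


if __name__ == "__main__":
    print(effective_port("http://www.stitch.com/"))
```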

SOCKETS

A socket is a network programming construct that enables two machines to communicate across a network. A socket is defined by the IP address of the originating machine, the IP address of the terminating machine, and the port they are using to communicate. Socket connections are requested by the client browser. If there is a server process (such as Apache) on the machine at the IP address requested by the client, monitoring the well-known port associated with Web connections, that server will accept the connection. At that point, a socket is created. The actual transmission of Web pages occurs across the socket connection.
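The request-and-accept sequence can be observed on a single machine. The sketch below is an illustration rather than anything Apache-specific: it opens a listening socket on the loopback address, connects to it as a client would, and reports the addresses and ports at each end. The function name `open_loopback_connection` is hypothetical, and port 0 is used so that the operating system picks any free port.

```python
import socket


def open_loopback_connection():
    """Open one TCP connection on the loopback interface and describe it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen(1)
    server_address = listener.getsockname()

    # The client requests the connection...
    client = socket.create_connection(server_address)
    # ...and the server process accepts it.
    accepted, client_address = listener.accept()

    try:
        return {
            "server": server_address,
            "client": client_address,
            "client_view_of_server": client.getpeername(),
        }
    finally:
        client.close()
        accepted.close()
        listener.close()


if __name__ == "__main__":
    print(open_loopback_connection())
```

Each end is identified by an address and a port; the client's port is a temporary one assigned by the operating system, while the server's is the port it chose to monitor.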

PROTOCOL

The term protocol, as it is used in computer science, is derived from the term as it is used in human interaction. Just as diplomats and debutantes have all sorts of rituals they perform to facilitate a smooth interaction between parties, so do computers. The idea is that computers aren't versatile enough to improvise, so the order and nature of each request (and each response to each request) must be rigidly defined.

To give you just a rough idea of what I'm talking about, the first thing a client browser does after the server has accepted its connection is to send a request that names the data it wants and the version of the protocol it is using. The server uses this information to fine-tune the nature of its response, which begins with a line stating the protocol version and a status code, followed by either a Web page or an error message. The client displays the data it received and the cycle repeats itself.

All network services use some sort of protocol. Sometimes, as in the case of File Transfer Protocol (ftp) and HyperText Transfer Protocol (http), the names reflect this. The protocol associated with the World Wide Web is http.
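As a rough illustration of how rigidly http messages are defined, the sketch below builds the text of a minimal HTTP/1.0 GET request and splits a response status line into its parts. The function names `build_request` and `parse_status_line` are hypothetical, and the sketch covers only the first line of a response, not headers or body.

```python
def build_request(path, host):
    """Build the bytes of a minimal HTTP/1.0 GET request."""
    # Lines end with CRLF, and an empty line marks the end of the request.
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line):
    """Split a status line into (protocol version, status code, reason)."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("/", "www.stitch.com"))
    print(parse_status_line("HTTP/1.0 200 OK"))
```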

It's worth noting that the http protocol is not absolutely ideal. At the time it was created in 1990, something called "the Internet" did exist; it was largely the province of academics and lonely single men. The relentless hype that came to characterize it in the mid-1990s was still years in the future. The most popular Internet applications at the time were newsgroups and bulletin boards, both of which were, for the most part, text only. This was partly a function of bandwidth: modems at the time were glacially slow compared to what's available today. At 300 baud, even text-only messages took an achingly long time to download, and image files were out of the question.

Modem speed improved, of course. At about the same time, a guy at CERN (a European research center) named Tim Berners-Lee developed a piece of software that would exploit both the increasing speed of modems and the graphical user interface (GUI) capabilities of modern operating systems. His http enabled the user to access data, including pictures, across a network using an intuitive, point-and-click interface.

This was truly a brilliant idea, and it took off immediately. However, in retrospect, it may have taken off too quickly. Let me preface these next few sentences with a disclaimer: I am about to indulge in some shameless Monday-morning quarterbacking. I was a computer science student during this period, and I had access to the same sorts of resources Tim Berners-Lee did. The main difference between us is that I was the one who failed to invent the World Wide Web.

Having said that, I will go ahead and point out that http contained no provision for the secure transfer of data, no provision for the execution of scripts on either the client or the server side, and only rudimentary graphic-formatting capabilities. For the last 11 years or so, the computer science community has expended enormous energy trying to find a way to retrofit these capabilities into http. The solutions that have been developed are certainly functional, but no one ever describes them as elegant.

To be fair, I don't think that anybody at the time had any idea just how huge the Web was going to be. If they had, they might have spent a bit more time refining the protocols before releasing them on an unsuspecting public.

How Apache Works

Usually, Web servers handle requests from many browsers simultaneously. If a single server process were to handle all of the incoming requests, a great deal of overhead would be incurred in keeping track of who wants what, what stage of the protocol they are in, and so forth. In the UNIX environment, there is a simpler way: On UNIX systems, each client is assigned its own individual server process.

How does this work? When Apache is started, the first thing it does is check whether it is the first such process on the machine. The first process, called the parent, has rights and responsibilities that the other processes do not have. Specifically, it is responsible for creating copies of itself, called child processes, to handle user requests. It also is responsible for killing the child processes off as necessary. As an Apache server administrator, you have the ability to control the number of these processes.
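The parent/child arrangement can be imitated in miniature. The sketch below is not Apache's code; it is a toy, UNIX-only illustration in which a parent process creates a fixed number of children with `os.fork`, each child exits immediately with its slot number in place of serving requests, and the parent collects them. The function name `run_parent` is hypothetical.

```python
import os


def run_parent(num_children):
    """Fork num_children child processes and return their exit codes."""
    child_pids = []
    for slot in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child process: a real server would handle requests here.
            # os._exit ends the child at once, skipping parent cleanup code.
            os._exit(slot)
        child_pids.append(pid)  # parent process: remember the new child

    exit_codes = []
    for pid in child_pids:
        _pid, status = os.waitpid(pid, 0)  # the parent reaps each child
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes


if __name__ == "__main__":
    print(run_parent(3))
```

The `num_children` argument plays the role of the administrator's control over how many processes exist at once.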

Apache on Windows is slightly different. On Windows, Apache relies on multiple threads within a single process to handle all user requests. The Apache program has a lot of assumptions about parent and child processes that were difficult to remove when the Windows port was performed, so there is a parent process as well. Note, however, that the parent/child model is not optimal.

Directives

Apache is a versatile piece of software. It alters its behavior at runtime based on the values of hundreds or even thousands of different variables stored in its configuration file. These variables are called directives. Most of this book is concerned with defining what these directives do, what their possible values are, and how you can best exploit them to suit your needs.

Even the simplest Apache server will need to have dozens of directives set. Rather than type the directives in when the server process is invoked, as is common with UNIX command line utilities, Apache stores the directives in a configuration file. This configuration file is a plain old text file. You can edit it with your favorite text editor, copy it at will, and generally treat it as you would any other text file.

In order for any changes you make in the configuration file to take effect, you must restart the server process. The details of how to do this are discussed in Chapter 2.
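Because the configuration file is plain text, it is easy to inspect with a short script. The sketch below is a simplification under stated assumptions: it reads only simple one-line directives of the form "name value", skips blank lines and `#` comments, and ignores container sections that begin with `<`. The function name `parse_directives` and the sample values are hypothetical; `ServerName`, `Listen`, and `DocumentRoot` are real directive names, but the values shown are for illustration only.

```python
def parse_directives(text):
    """Return (name, value) pairs for each simple directive line."""
    directives = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("<"):
            continue  # container sections are out of scope for this sketch
        name, *rest = line.split(None, 1)
        directives.append((name, rest[0] if rest else ""))
    return directives


# Hypothetical sample values, for illustration only.
SAMPLE_CONFIG = """\
# A tiny example configuration
ServerName www.stitch.com
Listen 80
DocumentRoot /usr/local/apache/htdocs
"""

if __name__ == "__main__":
    for name, value in parse_directives(SAMPLE_CONFIG):
        print(name, "=", value)
```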

Modules

Apache distributions all come with the same chunk of basic functionality, called the core, enabled by default. This functionality includes the ability to do such basic tasks as read its configuration file, perform rudimentary access control, and find the Web pages it is supposed to be serving.

Each of these (and many other) tasks is handled by its own clearly defined section of code. These sections of code are called modules. Apache is designed so that you can use only the modules you really need and discard the rest.

In order to fully exploit the modular capabilities of Apache, you will need to create an executable program from the source code provided with the distribution. The process of creating an executable program from source code is called compilation. The program you end up compiling is called httpd. The compilation process is discussed in detail in Chapter 1.

It's worth emphasizing here that Apache is httpd. The terms will be used interchangeably throughout this book and all other Apache documentation. Why don't we just call httpd "Apache"? That's a fair question. The code that eventually became the Apache server is descended from a program called the HyperText Transfer Protocol Daemon. The name Apache is one of the weak jokes common among programmers: it refers to the fact that early versions of the server required a lot of software patches in order to run correctly. By the time the name Apache was coined, using the label httpd for the running server process was unassailably entrenched in both the source code and documentation.

Perhaps the best way to build a module is as a Dynamic Shared Object (or DSO). A DSO is a module that can be added to or removed from the httpd executable as the server is being started simply by changing a few directives in the configuration file. This is an amazingly handy ability. Compiling a module as a DSO is slightly more complicated than compiling it into a static server process, but it is a smart investment of time. The details of compiling DSOs also are discussed in Chapter 1.

Handlers

Modules sometimes provide specific handlers, which are methods of processing files or requests in an unusual way. Sometimes handlers are named so that they can be referred to in configuration directives. Named handlers and their associated modules are listed in Table 0-1.

**TABLE 0-1** Named Handlers
Handler	Module	Effect
send-as-is	mod_asis	Serve file and headers as-is
cgi-script	mod_cgi	Attempt to execute and serve output
imap-file	mod_imap	Imagemap rule file
server-info	mod_info	Display server configuration information
server-parsed	mod_include	Locate and replace server-side includes
server-status	mod_status	Display server status information
type-map	mod_negotiation	Parse as type map file

As I implied in the discussion of modules, you must include a module in the current httpd executable before you can access its handler.

MIME Types

MIME is an acronym for Multipurpose Internet Mail Extensions. The idea behind MIME types is to enable a program to determine what kind of data a file contains by looking at the file's extension. Apache comes with a default mechanism that enables you to define how MIME types will be presented to the client. Like everything else in Apache, this mechanism is fully configurable.
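The extension-to-type lookup can be tried out with Python's standard `mimetypes` module. This is Python's own table, not Apache's mechanism, so treat it only as an illustration of the idea; the function name `content_type_for` and the fallback type are choices made for this sketch.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Guess a MIME type from a file name's extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None for extensions it does not recognize.
    return guessed or default


if __name__ == "__main__":
    for name in ("index.html", "logo.gif", "data.zzzunknown"):
        print(name, "->", content_type_for(name))
```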

Conventions of This Book

Throughout this book you'll find example commands and configuration directives, always accompanied by at least some explanation and sometimes by example output. In general, I don't provide detailed syntax information for directives and system commands in the regular text. That sort of thing is found in the Appendices, particularly Appendix A (Core Directives) and Appendix B (Other Directives). I hope you'll be able to glean the general nature of any command with which you are unfamiliar from the context.

The success or failure of any given Apache transaction depends on the internal server configuration, the content being transferred, the configuration of the underlying operating system, and the vagaries of the network support services. Given that, it is impossible to say with absolute certainty that the examples presented herein will run on your particular machine. You have my solemn vow that I typed each and every one of them in and they worked for me.

If you have any questions, comments, corrections, or suggestions for improvement, please feel free to contact me at:

s_hawkins@mindspring.com

Additional information about this and other books in Prentice Hall PTR's Essential Web Series can be found at:

www.phptr.com/essential/

Recap

A Web server is a piece of software that monitors an IP address and port and uses the http protocol to respond to requests from client browsers. The Web pages are served across a network connection called a socket.

The behavior of the Apache server is controlled by variables called directives stored in a configuration file. Apache is not a single process but, rather, a collection of nearly identical child processes that are created and destroyed by a parent.

Apache is composed of modules that may be included in the server process at the discretion of the administrator. Some modules provide handlers, which are methods of processing files or requests in a nonstandard way.


