Any developer who's comfortable with Perl can build remarkably powerful TCP/IP network applications -- no C required! In Network Programming with Perl, Lincoln Stein shows how, step-by-step, with extensive code examples. Modeled on W. Richard Stevens' legendary Unix network programming book, this book opens up network programming to a new generation of programmers: Web developers ready to build serious network applications and solve complex network problems. Stein begins with an overview of Perl's increasingly powerful networking facilities, then introduces Berkeley Sockets and the UDP and TCP protocols at the heart of network programming. He presents Perl's IO::Socket API, which simplifies the creation and use of sockets; demonstrates how to create forking servers; and introduces practical techniques for creating multithreaded and multiplexed applications. Modeled upon the style of Stevens, and using extensive sample code, Stein demonstrates all of the key features. Network Programming with Perl also includes chapter-length explanations of creating Internet modules for FTP and Telnet; Mail and News; and Web services.
I. BASICS.
1. Input/Output Basics.
Perl and Networking.
Networking Made Easy.
Using Object-Oriented Syntax with the IO::Handle and IO::File Modules.
Summary.
2. Processes, Pipes, and Signals.
Summary.
3. Introduction to Berkeley Sockets.
Clients, Servers, and Protocols.
A Simple Network Client.
Network Names and Services.
Network Analysis Tools.
Summary.
4. The TCP Protocol.
A TCP Echo Client.
Socket Functions Related to Outgoing Connections.
A TCP Echo Server.
Adjusting Socket Options.
Other Socket-Related Functions.
Exceptional Conditions during TCP Communications.
Summary.
5. The IO::Socket API.
More Practical Examples.
Performance and Style.
II. DEVELOPING CLIENTS FOR COMMON SERVICES.
6. FTP and Telnet.
Summary.
7. SMTP: Sending Mail.
Introduction to the Mail Modules.
Summary.
8. POP, IMAP, and NNTP: Processing Mail and Netnews.
The Post Office Protocol.
The IMAP Protocol.
Internet News Clients.
A News-to-Mail Gateway.
Summary.
9. Web Clients.
Parsing HTML and XML.
III. DEVELOPING TCP CLIENT/SERVER SYSTEMS.
10. Forking Servers and the inetd Daemon.
Standard Techniques for Concurrency.
Running Example: A Psychotherapist Server.
The Psychotherapist as a Forking Server.
A Client Script for the Psychotherapist Server.
Daemonization on UNIX Systems.
Starting Network Servers Automatically.
Using the inetd Super Daemon.
Summary.
11. Multithreaded Applications.
A Multithreaded Psychiatrist Server.
A Multithreaded Client.
Summary.
12. Multiplexed Applications.
A Multiplexed Client.
The IO::Select Module.
A Multiplexed Psychiatrist Server.
Summary.
13. Nonblocking I/O.
Creating Nonblocking I/O Handles.
Using Nonblocking Handles.
Using Nonblocking Handles with Line-Oriented I/O.
A Generic Nonblocking I/O Module.
Nonblocking Connects and Accepts.
Summary.
14. Bulletproofing Servers.
Using the System Log.
Setting User Privileges.
Handling HUP and Other Signals.
Summary.
15. Preforking and Prethreading.
A Nonblocking TCP Client Using IO::Poll.
IV. ADVANCED TOPICS.
17. TCP Urgent Data.
"Out-of-Band" Data and the Urgent Pointer.
Using TCP Urgent Data.
The sockatmark() Function.
A Travesty Server.
Summary.
18. The UDP Protocol.
A Time of Day Client.
Creating and Using UDP Sockets.
Using UDP Sockets with IO::Socket.
Sending to Multiple Hosts.
Increasing the Robustness of UDP Applications.
Summary.
19. UDP Servers.
An Internet Chat System.
The Chat Client.
The Chat Server.
Detecting Dead Clients.
Unicasting versus Broadcasting.
Sending and Receiving Broadcasts.
Broadcasting Without the Broadcast Address.
Enhancing the Chat Client to Support Resource Discovery.
Sample Multicast Applications.
Summary.
22. UNIX-Domain Sockets.
Using UNIX-Domain Sockets.
A "Wrap" Server.
Using UNIX-Domain Sockets for Datagrams.
Summary.
Appendix A. Additional Source Code.
Net::NetmaskLite (Chapter 3).
PromptUtil.pm (Chapters 8 and 9).
IO::LineBufferedSet (Chapter 13).
IO::LineBufferedSessionData (Chapter 13).
DaemonDebug (Chapter 14).
Text::Travesty (Chapter 17).
mchat_client.pl (Chapter 21).
Appendix B. Perl Error Codes and Special Variables.
System Error Constants.
Magic Variables Affecting I/O.
Other Perl Globals.
Appendix C. Internet Reference Tables.
Assigned Port Numbers.
Registered Port Numbers.
Internet Multicast Addresses.
Appendix D. Bibliography.
The network is everywhere. At the office, machines are wired together into local area networks, and the local networks are interconnected via the Internet. At home, personal computers are either intermittently connected to the Internet or, increasingly, connected full-time through "always-on" cable and DSL modems. New wireless technologies, such as Bluetooth, promise to vastly expand the network realm, embracing everything from cell phones to kitchen appliances.
Such an environment creates tremendous opportunities for innovation. Whole new classes of applications are now predicated on the availability of high-bandwidth, always-on connectivity. Interactive games allow players from around the globe to compete on virtual playing fields and the instant messaging protocols let them broadcast news of their triumphs to their friends. New peer-to-peer systems, such as Napster and Gnutella, allow people to directly exchange MP3 audio files and other types of digital content. The SETI@Home project takes advantage of idle time on the millions of personal computers around the world to search for signs of extraterrestrial life in a vast collection of cosmic noise.
The ubiquity of the network allows for more earthbound applications as well. With the right knowledge, you can write a robot that will fetch and summarize prices from competitors' Web sites; a script to page you when a certain stock drops below a specified level; a program to generate daily management reports and send them off via e-mail; a server that centralizes some number-crunching task on a single high-powered machine, or alternatively distributes that task among the multiple nodes of a computer cluster.
Whether you are searching for the best price on a futon or for life in a distant galaxy, you'll need to understand how network applications work in order to take full advantage of these opportunities. You'll need a working understanding of the TCP/IP protocol--the common denominator for all Internet-based communications and the most common protocol in use in local area networks as well. You'll need to know how to connect to a remote program, to exchange data with that program, and what to do when something goes wrong. To work with existing applications, such as Web servers, you'll have to understand how the application-level protocols are built on top of TCP/IP, and how to deal with common data exchange formats such as XML and MIME.
This book uses the Perl programming language to illustrate how to design and implement practical network applications. Perl is an ideal language for network programming for a number of reasons. First, like the rest of the language, Perl's networking facilities were designed to make the easy things easy. It takes just two lines of code to open a network connection to a server somewhere on the Internet and send it a message. A fully capable Web server can be written in a few dozen lines of code.
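As a rough illustration of the "two lines" claim, here is a minimal sketch using the IO::Socket::INET module from the standard distribution. So that it runs without Internet access, the script first sets up its own stand-in server on the loopback address; that server, the message text, and the variable names are assumptions made for this example, not code from the book.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Stand-in server so the sketch runs without Internet access:
# listen on the loopback interface and let the OS choose a free port.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "cannot listen: $!";
my $port = $server->sockport;

# The "two lines": open a connection, then send a message.
my $client = IO::Socket::INET->new("127.0.0.1:$port") or die "cannot connect: $!";
print $client "hello, network\n";
$client->close;

# Server side: accept the queued connection and read what was sent.
my $conn     = $server->accept or die "accept failed: $!";
my $received = <$conn>;
print "server received: $received";
```

To talk to a real service, you would replace the loopback address and port with the remote host name and port number.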
Second, Perl's open architecture has encouraged many talented programmers to contribute to an ever-expanding library of useful third-party modules. Many of these modules provide powerful interfaces to common network applications. For example, after loading the LWP::Simple module, a single function call allows you to fetch the contents of a remote Web page and store it in a variable. Other third-party modules provide intuitive interfaces to e-mail, FTP, net news, and a variety of network databases.
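The single-call fetch mentioned above might look like the following sketch. LWP::Simple is a CPAN module and may need to be installed first; the wrapper name fetch_page and the command-line handling are hypothetical, added only so the script does nothing unless you give it a URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper wrapping LWP::Simple's get() function.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;               # loaded on demand; not a core module
    return LWP::Simple::get($url);     # page contents, or undef on failure
}

# Fetch only when a URL is supplied on the command line.
if (@ARGV) {
    my $html = fetch_page($ARGV[0]);
    die "could not fetch $ARGV[0]\n" unless defined $html;
    printf "fetched %d characters\n", length $html;
}
```

In a script of your own, the usual form is simply to load the module with use LWP::Simple and call get($url) directly.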
Perl also provides impressive portability. Most of the applications developed in this book will run without modification on UNIX machines, Windows boxes, Macintoshes, VMS systems, and OS/2.
However, the most compelling reason to choose Perl for network application development is that it allows you to fully exploit the power of TCP/IP. Perl provides you with full access to the same low-level networking calls that are available to C programs and other natively compiled languages. You can create multicast applications, implement multiplexed servers, and design peer-to-peer systems. Using Perl, you can rapidly prototype new networking applications and develop interfaces to existing ones. Should you ever need to write a networking application in C or Java, you'll be delighted to discover how much of the Perl API carries over into these languages.
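To give a sense of how closely the low-level calls mirror their C counterparts, the sketch below uses Perl's built-in socket(), bind(), listen(), connect(), and accept() functions with constants from the standard Socket module. Both ends of the connection live in one script on the loopback address purely for illustration; the variable names and the "ping" message are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOMAXCONN INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in);

my $tcp = getprotobyname('tcp');   # protocol number in scalar context

# Listening socket on the loopback interface; port 0 lets the OS pick one.
socket(my $listener, PF_INET, SOCK_STREAM, $tcp) or die "socket: $!";
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die "bind: $!";
listen($listener, SOMAXCONN) or die "listen: $!";
my ($port) = unpack_sockaddr_in(getsockname($listener));

# Client socket: the same socket()/connect() pair a C program would use.
socket(my $client, PF_INET, SOCK_STREAM, $tcp) or die "socket: $!";
connect($client, pack_sockaddr_in($port, INADDR_LOOPBACK)) or die "connect: $!";
syswrite($client, "ping\n") or die "write: $!";

# accept() fills in a new handle for the incoming connection.
accept(my $session, $listener) or die "accept: $!";
sysread($session, my $buffer, 1024);
print "low-level server read: $buffer";
```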
You should have access to a Perl interpreter and some experience writing, running, and debugging scripts. Just as important, you should have access to a computer that is connected both to a local area network and to the Internet! Although the recipes in Chapter 10 on setting Perl-based network servers to start automatically when a machine is booted do require superuser (administrative) access, none of the other examples require privileged access to a machine.
This book does take advantage of the object-oriented features in Perl version 5 and higher, but most chapters do not assume a deep knowledge of this system. Chapter 1 addresses all the details you will need as a casual user of Perl objects.
This book is not a thorough review of the TCP/IP protocol at the lowest level, nor is it a guide to installing and configuring network hubs, routers, and name servers. Many good books on the mechanics of the TCP/IP protocol and network administration are listed in the references in Appendix D.
This book is organized into four main parts: Basics, Developing Clients for Common Services, Developing TCP Client/Server Systems, and Advanced Topics.
Part I, Basics, introduces the fundamentals of TCP/IP network communications.
Part II, Developing Clients for Common Services, looks at a collection of the best third-party modules that developers have contributed to the Comprehensive Perl Archive Network (CPAN).
Part III, Developing TCP Client/Server Systems--the longest part of the book--discusses the alternatives for designing TCP-based client/server systems. The major example used in these chapters is an interactive psychotherapist server, based on Joseph Weizenbaum's classic Eliza program.
These chapters also cover the select() call, which enables an application to process multiple I/O streams concurrently without using multiprocessing or multithreading.
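The multiplexing idea can be sketched with the standard IO::Select module, which wraps select(). In this illustrative example two pipes stand in for two network connections, and a single loop services whichever one has data ready; the labels A and B and the message text are assumptions made for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Select;

# Two pipes stand in for two network connections.
pipe(my $read_a, my $write_a) or die "pipe: $!";
pipe(my $read_b, my $write_b) or die "pipe: $!";
my %name = (fileno($read_a) => 'A', fileno($read_b) => 'B');

# Put one line of data into each pipe, then close the writing ends.
syswrite($write_a, "first stream\n");
syswrite($write_b, "second stream\n");
close $write_a;
close $write_b;

my $select = IO::Select->new($read_a, $read_b);
my %seen;

# Service whichever handles are ready until both have reported end-of-file.
while ($select->count) {
    for my $fh ($select->can_read) {
        my $bytes = sysread($fh, my $data, 1024);
        if ($bytes) {
            $seen{ $name{ fileno($fh) } } .= $data;
        } else {
            $select->remove($fh);   # 0 means end-of-file, undef means error
            close $fh;
        }
    }
}
print "$_: $seen{$_}" for sort keys %seen;
```

The same loop works unchanged with sockets in place of the pipes, which is what makes the approach attractive for servers that must handle several clients in a single process.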
Part IV, Advanced Topics, addresses techniques that are useful for specialized applications.
All good things evolve to meet changing conditions, and Perl has gone through several major changes in the course of its short life. This book was written for versions of Perl in the 5.X series (5.003 and higher recommended). At the time I wrote this preface (August 2000), the most recent version of Perl was 5.6, with the release of 5.7 expected imminently. I expect that Perl versions 5.8 and 5.9 (assuming there will be such versions) will be compatible with the code examples given here as well.
Over the horizon, however, is Perl version 6. Version 6, which is expected to be in early alpha form by the summer of 2001, will fix many of the idiosyncrasies and misfeatures of earlier versions of Perl. In so doing, however, it is expected to break most existing scripts. Fortunately, the Perl language developers are committed to developing tools to automatically port existing scripts to version 6. With an eye to this, I have tried to make the examples in this book generic, avoiding the more obscure Perl constructions.
More serious are the differences between implementations of Perl on various operating systems. Perl started out on UNIX (and Linux) systems, but has been ported to many different operating systems, including Microsoft Windows, the Macintosh, VMS, OS/2, Plan9, and others. Ideally, a script written for the Windows platform will run on UNIX or Macintosh without modification.
The problem is that the I/O subsystem (the part of the system that manages input and output operations) is the part that differs most dramatically from operating system to operating system. This restricts the ability of Perl to make its I/O system completely portable. While Perl's basic I/O functionality is identical from port to port, some of the more sophisticated operations are either missing or behave significantly differently on non-UNIX platforms. This affects network programming, of course, because networking is fundamentally about input and output.
Chapters 1 through 9 of this book use generic networking calls that will run on all platforms. The exceptions to this rule are the last example in Chapter 5, which calls fork(), a function that isn't implemented on the Macintosh, and some of the introductory discussion in Chapter 2 of process management on UNIX systems. The techniques discussed in these chapters are all you need for the vast majority of client programs, and are sufficient to get a simple server up and running. Chapters 10 through 22 deal with more advanced topics in server design.
The nice thing is that the non-UNIX ports of Perl are improving rapidly, and there is a good chance that new features will be available at the time you read this.
All the sample scripts and modules discussed in this book are available on the Web in ZIP and TAR/GZIP formats. The URL for downloading the source is http://www.modperl.com/perl_networking. This page also includes instructions for unpacking and installing the source code.
Many of Perl's networking modules are preinstalled in the standard distribution. Others are third-party modules that you must download and install from the Web. Most third-party modules are written in pure Perl, but some, including several that are mentioned in this book, are written partly in C and must be compiled before they can be used.
CPAN is a large Web-based collection of contributed Perl modules. You can get access to it via a Web or FTP browser, or by using a command-line application built into Perl itself.
To find a CPAN site near you, point your Web browser at http://www.cpan.org/. This will present a page that allows you to search for specific modules, or to browse the entire list of contributed modules sorted in various ways. When you find the module you want, download it to disk.
Perl modules are distributed as gzipped tar archives. You can unpack them like this:
% gunzip -c Digest-MD5-2.00.tar.gz | tar xvf -
Once the archive is unpacked, enter the newly created directory and give the perl Makefile.PL, make, make test, and make install commands. These will build, test, and install the module.
% cd Digest-MD5-2.00
% perl Makefile.PL
% make
% make test
% make install
On UNIX systems, you may need superuser privileges to perform the final step. If you don't have such privileges, you can install the modules in your home directory. At the perl Makefile.PL step, provide a PREFIX= argument with the path of your home directory. For example, assuming your home directory can be found at /home/jdoe, you would type:
% perl Makefile.PL PREFIX=/home/jdoe
The rest of the install procedure is identical to what was shown earlier.
If you are using a custom install directory, you must tell Perl to look in this directory for installed modules. One way to do this is to add the name of the directory to the environment variable PERL5LIB. For example:
Another way is to place the following line at the top of each script that uses an installed module.
use lib '/home/jdoe';
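To check that a custom directory has actually been added to Perl's module search path, a short script can print @INC, the list of directories Perl searches. This is a minimal sketch that reuses the /home/jdoe example path from the text; the variable name is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use lib '/home/jdoe';   # example path from the text; adjust to your own

# Directories Perl searches for modules, in order. Entries added with
# "use lib" come first, then any from PERL5LIB, then the built-in defaults.
print "$_\n" for @INC;

my $custom_dir_listed = grep { $_ eq '/home/jdoe' } @INC;
print "custom directory is on the search path\n" if $custom_dir_listed;
```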
A simpler way to do the same thing is to use Andreas Koenig's wonderful CPAN shell. With it, you can search, download, build, and install Perl modules from a simple command-line shell. The install command does it all:
% perl -MCPAN -e shell
cpan shell -- CPAN exploration and modules installation (v1.40)
These examples all assume that you have UNIX-compatible versions of the gzip, tar, and make commands. Virgin Windows systems do not have these utilities. The Cygwin package, available from http://www.cygnus.com/cygwin/, provides these utilities as part of a complete set of UNIX-compatible tools.
It is easier, however, to use the ActiveState Perl Package Manager (PPM). This Perl script is installed by default in the ActiveState distribution of Perl, available at http://www.activestate.com. Its interface is similar to the command-line CPAN interface shown in the previous section, except that it can install precompiled binaries as well as pure-Perl scripts. For example:
C:\WINDOWS>ppm
The MacPerl Module Porters site, http://pudge.net/cgi-bin/mmp.plx, contains a series of modules that have been ported for use in MacPerl. A variety of helper programs have been developed to make module installation easier on the Macintosh. The packages are described at http://pudge.net/macperl/macperlmodinstall.html, which also gives instructions on downloading and installing them.
In addition to books and Web sites, Network Programming with Perl refers to two major sources of online information: Internet RFCs and Perl POD documentation.
The specifications of all the fundamental protocols of the Internet are described in a series of Requests for Comment (RFCs) submitted to the Internet Engineering Task Force (IETF). These documents are numbered sequentially. For example, RFC 1927--"Suggested Additional MIME Types for Associating Documents"--was the 1927th RFC submitted. Some of these RFCs eventually become Internet Standards, in which case they are given sequentially numbered STD names. However, most of them remain RFCs. Even though the RFCs are unofficial, they are the references that people use to learn the details of networking protocols and to validate that a particular implementation is correct.
The RFC archives are mirrored at many locations on the Internet, and maintained in searchable form by several organizations. One of the best archives is maintained at http://www.faqs.org/rfcs/. To retrieve an RFC from this site, go to the indicated page and type the number of the desired RFC in the text field labeled "Display the document by number." The document will be delivered in a minimally HTMLized form. This page also allows you to search for standards documents, and to search the archive by keywords and phrases. If you prefer a text-only form, the www.faqs.org site contains a link to their FTP site, where you can find and download the RFCs in their original form.
Much of Perl's internal documentation comes in Plain Old Documentation (POD) format. These are mostly plain text, with a few markup elements inserted to indicate headings, subheadings, and itemized lists.
When you installed Perl, the POD documentation was installed as well. The POD files are located in the pod subdirectory of the Perl library directory. You can either read them directly, or use the perldoc script to format and display them in a text pager such as more.
To use perldoc, type the command followed by the name of the POD file you wish to view. The best place to start is the Perl table of contents, perltoc:
% perldoc perltoc
This will give you a list of other POD pages that you can display.
For a quick summary of a particular Perl function, perldoc accepts the -f flag. For example, to see a summary of the socket() function, type:
% perldoc -f socket
For Macintosh users, the MacPerl distribution comes with a "helper" application called shuck. This adds POD viewing facilities to the MacPerl Help menu.