Internet Software Backbone: Sockets
From a user's point of view, the Internet we know today is experienced mostly through a browser and a handful of related tools: sites can be viewed and searched, information can be entered, songs can be played, movies can be watched, messages can be sent and received through email, and files can be sent and received using FTP tools.
Most information on the Internet is transferred using HTTP. Email is usually transferred using SMTP and POP. Files are usually downloaded or uploaded using FTP. All of these protocols ride on top of the TCP/IP protocol suite.
But what is the software mechanism used to transfer information across the net using these protocols? The answer is sockets.
Sockets can be defined as the software backbone of the Internet. The Internet is what it is today because of the socket paradigm.
What Are Sockets?
The socket paradigm was first defined for the C language under UNIX for network programming. It was later incorporated into the MS Windows environment, where it is called WINSOCK. If you understand the use of sockets in an algorithmic sense, you should have no problem applying them on different platforms and in different environments.
Each of these environments provides a set of functions for working with sockets. At the highest level, the basic function names and the parameters they take are generally the same across environments such as C under UNIX, Perl, and WINSOCK in Windows.
Before we look at the functions in detail, a few basic concepts must be understood.
What Is a Port in Socket Terminology?
In very basic terms, a port is a number that identifies an application on a machine. In TCP and UDP, a port number is a 16-bit value, so it ranges from 0 to 65535. For example, let's say that two programs (applications), PA and PB, are running on one machine on a network. Now imagine that a second machine on the network needs to use program PA on Machine 1. Machine 2 sends its request to Machine 1 and addresses it to PA's unique identifier, which is PA's port. Applications that listen to and service requests from other machines are assigned port numbers.
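The PA and PB scenario above can be sketched in a few lines. This is only an illustration, written in Python as an assumption (the text itself discusses C, Perl, and WINSOCK); the helper name bind_to_free_port is hypothetical. Each call creates a socket and asks the operating system for an unused port, so the two "applications" end up with different port numbers on the same machine.

```python
import socket


def bind_to_free_port(host="127.0.0.1"):
    """Create a TCP socket bound to an OS-chosen port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the operating system to pick any unused port.
    sock.bind((host, 0))
    port = sock.getsockname()[1]
    return sock, port


if __name__ == "__main__":
    sock_a, port_a = bind_to_free_port()  # stands in for program PA
    sock_b, port_b = bind_to_free_port()  # stands in for program PB
    print("PA is reachable on port", port_a)
    print("PB is reachable on port", port_b)
    sock_a.close()
    sock_b.close()
```

A real server would normally bind to a fixed, well-known port so that clients know where to send requests; port 0 is used here only so the sketch runs without clashing with other programs.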
What Is Network Bandwidth?
In very basic terms, network bandwidth is the amount of information that can be transmitted over a network in a given amount of time. It is usually expressed in bits per second.
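As a rough worked example of this definition, the sketch below estimates how long a transfer would take at a given bandwidth. The function name and the figures are illustrative assumptions, and the calculation ignores latency, protocol overhead, and congestion, so it gives a lower bound rather than a prediction.

```python
def transfer_time_seconds(size_bytes, bandwidth_bits_per_second):
    """Idealized transfer time: payload size in bits divided by bandwidth."""
    return size_bytes * 8 / bandwidth_bits_per_second


if __name__ == "__main__":
    # A 5,000,000-byte file over a 10,000,000 bit/s link.
    print(transfer_time_seconds(5_000_000, 10_000_000))  # prints 4.0
```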
What Is a Protocol?
A protocol is a set of rules or standards to be followed to do a particular task. To transmit data between different machines over a network, both sides must follow the same set of rules. These rules define how to establish a connection, package information, address the packets of information, transmit them, perform error checking (if any), perform error correction (if any) over the network, and so on.
There are two types of network rules or protocols:
Connection-oriented—An example is the Transmission Control Protocol (TCP)
Connectionless—An example is the User Datagram Protocol (UDP)
Connection-oriented protocols provide reliable, ordered delivery to the destination using mechanisms such as sequence numbers, acknowledgments, retransmission of lost packets, and a checksum on each data packet. Although this ensures data integrity and reliability, it adds overhead and therefore consumes more bandwidth. Connectionless protocols avoid this overhead because they do not guarantee delivery or ordering of data packets.
Information to be sent over a network is broken into units of bytes called packets. In the case of TCP, each packet consists of the actual data plus a header of at least 20 bytes that carries sequence and acknowledgment numbers and a checksum, which are used to ensure data integrity and reliability. A UDP packet carries only an 8-byte header containing the port numbers, a length, and a checksum (optional over IPv4); therefore, less bandwidth is required compared to TCP to send the same amount of information over a network.
In the case of TCP, therefore, data either reaches its destination intact and in order or the sender learns that the transfer failed, which is not true in the case of UDP. The extra bookkeeping also tends to make transmission over TCP slower than over UDP.
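To make the checksum idea mentioned above concrete, here is a minimal sketch of the 16-bit ones' complement checksum described in RFC 1071, the algorithm used by the Internet protocols. The function name is an assumption for illustration. Real TCP and UDP implementations also include a pseudo-header with the IP addresses in the sum, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")  # sample bytes from RFC 1071
    checksum = internet_checksum(payload)
    print(hex(checksum))  # prints 0x220d
    # The receiver sums the data together with the checksum; 0 means no error seen.
    print(internet_checksum(payload + checksum.to_bytes(2, "big")))  # prints 0
```

A checksum only detects corruption. Recovering from it still depends on the acknowledgment and retransmission machinery of a connection-oriented protocol.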
Each machine on a network is identified by an Internet Protocol (IP) address. Each packet of information contains the IP address of the destination machine as well as that of the sending machine.
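As a small illustration, an IPv4 address written in dotted form is just a 4-byte value, and it is this packed form that travels in each packet. The sketch below uses Python's standard socket.inet_aton and socket.inet_ntoa; the wrapper names and the sample address (taken from the 192.0.2.0/24 range reserved for documentation) are assumptions.

```python
import socket


def ipv4_to_bytes(dotted: str) -> bytes:
    """Pack a dotted-quad IPv4 address into its 4-byte network form."""
    return socket.inet_aton(dotted)


def bytes_to_ipv4(packed: bytes) -> str:
    """Convert a 4-byte packed IPv4 address back to dotted-quad text."""
    return socket.inet_ntoa(packed)


if __name__ == "__main__":
    packed = ipv4_to_bytes("192.0.2.10")
    print(list(packed))           # prints [192, 0, 2, 10]
    print(bytes_to_ipv4(packed))  # prints 192.0.2.10
```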
Sockets are classified into different types according to the protocol they use:
Stream sockets that use TCP
Datagram sockets that use UDP
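The two socket types can be compared side by side with a minimal sketch that sends a short message to itself over the loopback interface. The function names are hypothetical, and the code assumes small messages and a single process playing both roles; a real client and server would be separate programs. Note the difference in steps: the stream socket must listen, connect, and accept before any data moves, while the datagram socket simply sends to an address.

```python
import socket


def tcp_round_trip(message: bytes) -> bytes:
    """Send message over a stream (TCP) socket on loopback; return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())  # connection set up first
            server_side, _ = listener.accept()
            with server_side:
                client.sendall(message)
                client.shutdown(socket.SHUT_WR)  # signal end of the byte stream
                chunks = []
                while True:
                    chunk = server_side.recv(4096)
                    if not chunk:  # empty result means the sender is finished
                        break
                    chunks.append(chunk)
                return b"".join(chunks)


def udp_round_trip(message: bytes) -> bytes:
    """Send message as one datagram (UDP) on loopback; return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # UDP may drop packets, so do not wait forever
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(message, receiver.getsockname())  # no connection step
        data, _ = receiver.recvfrom(4096)
        return data


if __name__ == "__main__":
    print(tcp_round_trip(b"hello over TCP"))
    print(udp_round_trip(b"hello over UDP"))
```

The timeout in the UDP version reflects the lack of a delivery guarantee: if the datagram were lost, recvfrom would raise an exception instead of blocking indefinitely.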