Network Layers: A Brief Primer on Internet Protocols (and Relevant Acronyms)
Networking is often described in terms of layers. This concept of layered protocols is exciting and important to computer scientists, but might not be as interesting to the reader. Nonetheless, it is useful to know about the many layers of software that allow the Internet to function.
Many other fine books do a more thorough job explaining the different protocols used in networking and how they stack on top of one another. However, this section aims to give just enough coverage of this topic to help you understand the protocols relevant to Internet video.
You have probably encountered the terms HTTP and TCP/IP. These are protocols (networking languages spoken between computers). This section introduces you to a few other relevant protocols and explains how they fit together.
Conceptually, you know that there is a physical layer (copper wires, telephone lines, cable, fiber optic, and so on), the hardware that actually carries the proverbial 1s and 0s from one computer to another.
Figure 5-13 Internet hardware layer.
Data Link Layer
The next layer up is the data link layer, the language appropriate for that hardware: Asynchronous Transfer Mode (ATM) for fiber and copper wires, Data Over Cable Service Interface Specification (DOCSIS) for cable modems, and so on. You can obtain Internet access in many different ways, including wireless (WiFi, 802.11). The languages of all these transports live at the data link layer. Here, almost every piece of hardware, from wireless to optical to TV cable, telephone, or satellite, has its own special "language" with which it intercommunicates.
Figure 5-14 Data link layer.
The next layer up is the network layer. This is where you find IP, the Internet Protocol, the basic language used between two computers on the Internet. You've heard of IP addresses; this layer is all about sending packets from one address to another address. And, this is the layer where IP has become the lingua franca of hundreds of millions of computer devices.
In IP, a packet is a chunk of information that has its own source and destination IP addresses, a size, and some data inside it. Packets range in size from around 34 bytes (characters) to a few kilobytes.
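To make the idea of a packet more concrete, the following minimal sketch decodes the fixed 20-byte portion of an IPv4 header and pulls out the fields described above: source address, destination address, size, and data. The function names, the sample addresses (taken from ranges reserved for documentation), and the hand-built sample packet are assumptions made for illustration. The header checksum is left at zero, which a real network stack would not do.

```python
import socket
import struct

# Layout of the fixed part of an IPv4 header (20 bytes, network byte order).
IPV4_FIXED_HEADER = struct.Struct("!BBHHHBBH4s4s")

def build_sample_packet(src: str, dst: str, payload: bytes) -> bytes:
    """Build an illustrative IPv4 packet (checksum left at 0 for simplicity)."""
    header = IPV4_FIXED_HEADER.pack(
        0x45,                 # version 4, header length 5 x 4 = 20 bytes
        0,                    # type of service
        20 + len(payload),    # total length: header plus data
        0,                    # identification
        0,                    # flags and fragment offset
        64,                   # time to live
        17,                   # protocol number (17 = UDP)
        0,                    # header checksum (not computed in this sketch)
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    return header + payload

def parse_ipv4_header(packet: bytes) -> dict:
    """Extract the addresses, size, and data from an IPv4 packet."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = IPV4_FIXED_HEADER.unpack(packet[:20])
    header_length = (version_ihl & 0x0F) * 4
    return {
        "version": version_ihl >> 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "data": packet[header_length:total_length],
    }

if __name__ == "__main__":
    sample = build_sample_packet("192.0.2.1", "198.51.100.7", b"hello")
    print(parse_ipv4_header(sample))
```

Note that the addresses and the size travel inside every packet, which is what lets each router along the way forward it independently.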
Figure 5-15 Packets are sent over IP.
Machines on the Internet are sometimes called hosts. Whenever a host needs to send a packet, it sends it to the nearest router. Your ISP provides the router that routes the packets sent by your host (machine) to other hosts on the Internet.
Sometimes routers go down or the links they control go down, and traffic has to be re-routed through other paths, as shown in Figure 5-16. This is somewhat analogous to freeway traffic; when a normally high-volume freeway becomes clogged, people take alternative routes or have to slowly make it through the congested freeway.
Figure 5-16 Traffic gets re-routed when one path is too busy or unavailable.
If a router is too busy and there is too much traffic going through it, it has every right to simply throw packets away, as shown in Figure 5-17. This is what's referred to as packet loss. You will hear a lot about this occurrence in this chapter, and it is the cause of most difficulty when delivering video over the Internet. Unfortunately, it is not a phenomenon that will go away with time. It is the nature of the Internet to lose packets. Techniques exist to mitigate it, however, which is one of the major purposes of streaming protocols.
Figure 5-17 When a router is too busy, packets can be lost or discarded.
The next layer up is called the transport layer. At the network layer, IP controls the movement of packets around the network, and an IP address identifies a particular machine on the Internet. Here at the transport layer, TCP and UDP provide different kinds of information delivery.
This is the layer where port numbers are used to identify what the hosts are talking about. You can think of port numbers as a sort of "channel" on a computer, as shown in Figure 5-18. For instance, the port number 80 is usually used for web pages. So instead of just saying, "Send a packet to the machine at 18.104.22.168," the transport layer allows you to say, "Send a packet to the email channel (port 25) on the machine at 22.214.171.124."
Figure 5-18 Port numbers indicate which Internet protocol is used.
You might see port numbers in URLs after a colon, such as http://www.masteringinternetvideo.com:8080. This means that the web service on this machine is listening on port 8080 rather than on the default web port, 80.
There are 65,536 ports, and many of them are well established with specific uses, such as streaming media (554, 7070), web service (80, 8080), file transfer (20 and 21), network administration (161), remote access (3389), online games (666), and time service (123), to name a few.
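As a small illustration of how software reads the port out of a URL, the sketch below uses Python's standard `urllib.parse` module. The `port_for` helper and the `DEFAULT_PORTS` table are hypothetical names introduced here; the table lists only the two web schemes and is not a complete registry.

```python
from urllib.parse import urlsplit

# Default ports for the two common web schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443}

def port_for(url: str) -> int:
    """Return the port a URL points at, falling back to the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port              # an explicit ":8080" style port wins
    return DEFAULT_PORTS[parts.scheme]  # otherwise use the well-known port

if __name__ == "__main__":
    print(port_for("http://www.masteringinternetvideo.com:8080"))
    print(port_for("http://www.masteringinternetvideo.com"))
```

When no port appears in the URL, the browser quietly falls back to the well-known port for that protocol, which is why you rarely see ":80" typed out.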
Transmission Control Protocol (TCP)
TCP is the dominant transport protocol. It is considered a self-healing protocol in that it detects errors and resends packets that were damaged or dropped because of network transmission. It also numbers packets, so that the receiving end can identify that it received packets 1, 2, 4, and 5, but not packet 3. TCP also performs a function called flow control, which makes sure the sender slows down if the receiver (or the intervening connection) can't handle the speed.
All this makes TCP a high-quality protocol; you are virtually guaranteed to get all the data, eventually. But there's a price for the benefits of error detection, automatic resending, and flow control: speed. When traffic is high and connections are swamped with data, causing packet loss, TCP sends data more slowly and has to resend a lot of packets. The "World Wide Wait" is the natural outcome of TCP's high-reliability, potentially high-delay architecture. Of course, there are times when there's so much packet loss that even TCP can't overcome it. That's when you'll see the message, "A connection failure has occurred."
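The numbering-and-resending idea can be sketched as a toy simulation. This is not how a real TCP stack is implemented (real TCP numbers bytes rather than packets and relies on timers, acknowledgments, and sliding windows); the function name and the `drop_first_attempt` parameter are hypothetical and exist only to make the loss deterministic.

```python
def deliver_reliably(chunks, drop_first_attempt):
    """Simulate a sender that resends every numbered chunk the receiver lacks.

    chunks: list of byte strings, numbered by their list position.
    drop_first_attempt: set of chunk numbers lost the first time they are sent.
    Returns the reassembled data and the total number of transmissions.
    """
    received = {}               # chunk number -> data held by the receiver
    already_dropped = set()
    attempts = 0
    pending = list(range(len(chunks)))
    while pending:
        still_missing = []
        for seq in pending:
            attempts += 1
            if seq in drop_first_attempt and seq not in already_dropped:
                already_dropped.add(seq)    # lost in transit, never acknowledged
                still_missing.append(seq)   # the sender will try this one again
            else:
                received[seq] = chunks[seq]
        pending = still_missing
    # The numbering lets the receiver put everything back in order.
    data = b"".join(received[seq] for seq in range(len(chunks)))
    return data, attempts

if __name__ == "__main__":
    print(deliver_reliably([b"a", b"b", b"c", b"d", b"e"], {2}))
```

The receiver ends up with all the data, but only after an extra transmission. That extra round is the delay the rest of this section keeps coming back to.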
User Datagram Protocol (UDP)
TCP is considered a reliable protocol, in that it's architected to reliably deliver the data. But reliability isn't always the name of the game. In some cases, a lightweight approach is preferable. User Datagram Protocol (UDP) implements an efficient but not entirely reliable delivery mechanism. It is a lean protocol that doesn't add many features on top of IP. Hosts send datagrams, which are basically IP packets with a destination host IP address and port number. UDP also includes checksumming, a way to tell if the packet was received intact or was damaged in transmission. You can see a conceptual comparison of TCP and UDP in Figure 5-19.
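For readers curious about what checksumming involves, here is a minimal sketch of the 16-bit ones' complement "Internet checksum" that UDP uses. It is simplified: a real UDP checksum also covers a pseudo-header containing the IP addresses, which this sketch omits. The function names are chosen here for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum of a byte string."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold any carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def is_intact(data: bytes, checksum: int) -> bool:
    """Return True if the data still matches the checksum computed by the sender."""
    return internet_checksum(data) == checksum

if __name__ == "__main__":
    original = b"video frame 42"
    checksum = internet_checksum(original)
    damaged = b"video frame 43"
    print(is_intact(original, checksum), is_intact(damaged, checksum))
```

The sender computes the checksum and places it in the datagram; the receiver recomputes it and discards the datagram if the two values disagree.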
Figure 5-19 UDP uses datagrams, whereas TCP uses two-way data channels.
UDP leaves it up to the application to negotiate a resend of a dropped or damaged packet. Unlike TCP, UDP doesn't assume that dropped packets should be automatically resent. This has obvious advantages for video: In many cases, it's preferable to just skip a missing packet and move on to the next chunk of data. It's better than coming to a standstill, waiting to see if the Internet will manage to deliver that packet on the next attempt. On the other hand, UDP has no built-in packet ordering, so the application also has to put sequence numbers inside the datagrams it sends. Applications have to do their own accounting to figure out if a packet has been dropped.
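The accounting described above might look something like the following sketch, in which the application prefixes each datagram with its own sequence number and the receiver simply skips whatever never arrives. The 4-byte header layout and all function names are assumptions made for this example, not part of UDP or of any particular streaming protocol.

```python
import struct

# Hypothetical application header: a 4-byte sequence number before the payload.
SEQ_HEADER = struct.Struct("!I")

def make_datagram(seq: int, payload: bytes) -> bytes:
    """Prefix a payload with its sequence number."""
    return SEQ_HEADER.pack(seq) + payload

def read_datagram(datagram: bytes):
    """Split a datagram into (sequence number, payload)."""
    (seq,) = SEQ_HEADER.unpack_from(datagram)
    return seq, datagram[SEQ_HEADER.size:]

def play(datagrams):
    """Play payloads as they arrive, skipping gaps instead of waiting for them.

    Returns the payloads played and the sequence numbers that were skipped.
    """
    expected = None
    played, skipped = [], []
    for datagram in datagrams:
        seq, payload = read_datagram(datagram)
        if expected is not None:
            if seq < expected:
                continue                          # late or duplicate: ignore it
            skipped.extend(range(expected, seq))  # note the gap and move on
        played.append(payload)
        expected = seq + 1
    return played, skipped

if __name__ == "__main__":
    sent = [make_datagram(n, b"frame%d" % n) for n in range(1, 6)]
    arrived = sent[:2] + sent[3:]                 # datagram 3 is lost in transit
    print(play(arrived))
```

A real player would also decide what to show in place of the missing data, but the key point is that playback never stalls waiting for datagram 3.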
From a delivery perspective, TCP operates like certified mail delivered by slow, thorough postal workers; you'll always get your mail eventually, and you'll be informed if there was a problem or delay. UDP operates like postcards delivered by fast, sloppy postal workers; they deliver the mail quickly but they might lose your mail as well, and unless you've numbered the postcards, you'll never know.
From a programming perspective, think of UDP and TCP like manual and automatic transmissions. It's a lot harder to drive a stick shift, but you have potentially more control over speed, acceleration, and fuel economy. It's harder to write UDP applications because the programmer has to address issues that TCP takes care of for you (things like handling dropped packets and flow control), but it gives you more control over exactly how these problems are handled. Automatics are easier to drive, but they don't always change gears when you want them to and don't give the best performance or economy possible from the engine. Like driving an automatic, TCP is much simpler to program for, but when the delays get out of hand, there's not much you can do to improve the situation.
To get a feel for how multimedia transmission sounds with unreliable packets, think about mobile phones. While making a call, the audio is constantly connected, but sometimes the signal gets worse and you can't hear anything for a while, and sometimes you lose the connection. But, when you do hear the call, it works pretty well: you hear the other party in real time and they hear you. Though mobile phones don't operate over TCP/IP (at least not yet), the protocols they use are similar to UDP.
All the other Internet protocols used by applications are built upon the transport layer protocols. These protocols make up the application layer. The best known of these is of course HTTP (Hypertext Transfer Protocol). Other familiar protocols are FTP (File Transfer Protocol, ports 20 and 21) and SMTP (Simple Mail Transfer Protocol). POP3 (Post Office Protocol, version 3) is here too. All these protocols use TCP as their transport, and each performs the same basic function: connect, send data, receive data, and disconnect.
Perhaps as a more comprehensible analogy, IP can be likened to letters; TCP and UDP are made up of many IP packets and can be likened to words; and higher level protocols assemble these words into more complex sentences, paragraphs, pages, and so on. Figure 5-20 illustrates this concept.
Figure 5-20 Application protocols run over TCP or UDP, which run over IP.
Although the applications that most users interact with are built on TCP, several important applications and protocols are built on UDP. Perhaps the most important is the Domain Name System (DNS), the network of servers that translate domain names into IP addresses. When you type a new URL into a web browser, the first thing the computer does is send a UDP packet to a DNS server asking to resolve the domain name you entered to an IP address. This is an ideal application for UDP in that the request can be encapsulated into a single packet.
For networking programmers and streaming system designers, UDP is the preferred protocol for delivering streaming video. UDP has many advantages over TCP for delivering video, including the following:
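To see how small such a request is, the sketch below assembles a DNS query for an address ("A") record by hand. It only builds the bytes and does not send anything. The function name and the fixed query ID are assumptions for illustration; a real resolver picks a random ID and handles the response as well.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build the bytes of a DNS query asking for a host's IPv4 address."""
    # Header: ID, flags (0x0100 = recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # Question type 1 = A record (IPv4 address), class 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

if __name__ == "__main__":
    query = build_dns_query("www.masteringinternetvideo.com")
    print(len(query), "bytes")
```

The whole question fits comfortably inside one datagram, so there is little to gain from setting up a TCP connection just to ask it.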
Delay: If a packet is dropped in UDP, the server can just keep sending UDP packets to the receiver. A single dropped packet is probably only a frame or two of video, and the video player can just keep going. When a packet is lost in TCP, TCP then stops all sending and tries to fetch the packet again. This causes a phenomenon called "jitter," which results in uneven timing and delays in the data.
Flow control: Unlike TCP, UDP has no built-in flow control. TCP's flow control makes it automatically slow down when the receiver can't accept the data at the speed it is sent. However, it is often not the receiver that can't accept the data, but some temporary condition on the network. Only UDP can keep sending data at a constant pace, independent of what the network does.
Low overhead: UDP is a simple protocol; it essentially puts the data in an envelope, stamps it, and sends it. All the hard work is left for the application. TCP has an elaborate protocol for making sure the packet is delivered, similar to registered mail with signatures and a return mailer. As a result, there is more paperwork, so to speak; this overhead increases delays and affects the amount of data that is delivered.
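One visible part of that paperwork is the size of the headers themselves. The following sketch compares the bytes sent per packet using the minimum header sizes (20 bytes for IPv4, 8 for UDP, 20 for TCP without options). It counts header bytes only; TCP's acknowledgments, connection setup, and resends are not modeled, and the function names and the 1,000-byte payload are assumptions for illustration.

```python
IPV4_HEADER = 20   # minimum IPv4 header size in bytes (no options)
UDP_HEADER = 8     # fixed UDP header size in bytes
TCP_HEADER = 20    # minimum TCP header size in bytes (no options)

def bytes_on_wire(payload_size: int, transport: str) -> int:
    """Return payload plus IP and transport header bytes for one packet."""
    transport_header = {"udp": UDP_HEADER, "tcp": TCP_HEADER}[transport]
    return IPV4_HEADER + transport_header + payload_size

def overhead_percent(payload_size: int, transport: str) -> float:
    """Return the share of each packet that is header rather than data."""
    total = bytes_on_wire(payload_size, transport)
    return 100 * (total - payload_size) / total

if __name__ == "__main__":
    for transport in ("udp", "tcp"):
        total = bytes_on_wire(1000, transport)
        print(transport, total, round(overhead_percent(1000, transport), 2))
```

The per-packet difference is modest; the larger cost of TCP for video comes from the waiting and resending described in the items above.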