18.7 Forward Compatibility
Protocols generally evolve, and it is good to design with provision for making minor or major changes. Some changes are "incompatible," meaning that it is preferable for the later-version node to be aware that it is talking to an earlier-version node and switch to speaking the earlier version of the protocol. Other changes are "compatible," and later-version protocol messages can be processed without harm by earlier-version nodes. There are various techniques, covered in the next few sections, to ensure forward compatibility.
18.7.1 Large Enough Fields
A common mistake is to make fields too small. It is better to overestimate than to underestimate, because generously sized fields greatly extend the lifetime of a protocol. Following are examples of fields that one could argue should have been larger.
- Packet identifier in IP and CLNP headers (because it could wrap around within a packet lifetime).
- Fragment identifier in IS-IS (because an LSP could be larger than 256 fragments).
- Packet size in IPv6. (Some people might argue that the desire to "optimize for most common case" is the reason for splitting the high-order part into an option in the unusual case where packets larger than 64K would be desired.)
18.7.2 Independence of Layers
It is desirable to design a protocol with as little dependence as possible on other layers, so that in the future one layer can be replaced without affecting the others. An example of violating this principle is to have protocols above layer 3 make the assumption that addresses are 4 bytes long.
Wrong medium: At work, during a meeting, the loudspeaker announced, "Urgent. There is a water emergency. Everyone must log out and turn off all computers." I didn't even know there was a loudspeaker. That was the first announcement I'd ever heard in the year I'd worked there. And gee, whatever a "water emergency" was, it sounded really important. So we all rushed off to turn off our machines and finish the meeting. Later we were chatting in the halls, and someone walked by, joined in the conversation, and then said, "Well, gotta get back to work." We said, "But all the machines are down." He said, "Oh no, they're not. They never even went down." We said, "Oh really? We didn't hear another announcement. How did you find out we didn't need to turn everything off?" He said, "They sent email."
The downside of this principle is that if you do not exploit the special capabilities of a particular technology at layer n, you wind up with the least common denominator. For example, not all data links provide multicast capability, but it is useful for routing algorithms to use link-level multicast for neighbor discovery, efficient propagation of information to all LAN neighbors, and so on. If we adhered too strictly to the principle of not making special assumptions about the data link layer, we might not have allowed layer 3 to exploit the multicast capability of some layer 2 technologies.
Another danger of exploiting special capabilities of layer n–1 is that a new technology at layer n–1 might need to be altered in unnatural ways to make it support the API designed for a different technology. An example is attempting to make a technology such as frame relay or SMDS provide multicast so that it "looks like" Ethernet. For example, the way that multicast is simulated in SMDS is to have packets with a multicast destination address transmitted to a special node that is manually configured with the individual members; that node individually addresses copies of the "multicast" packet to each of the recipients.
18.7.3 Reserved Fields
Often there are spare bits. If they are carefully specified to be transmitted as zero and ignored upon receipt, they can later be used for purposes such as signaling that the transmitting node has implemented later version features, or they can be used to encode information, such as priority, that can safely be ignored by nodes that do not understand it. This is an excellent example of the maxim "Be conservative in what you send, and liberal in what you accept" because you should always set reserved bits to zero and ignore them upon receipt.
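The rule for reserved bits can be sketched in a few lines of Python. This is only an illustration under assumed conditions: the 1-byte flags field, the three defined flag names, and the helper functions are all hypothetical and do not come from any real protocol.

```python
# Hypothetical 1-byte flags field: the low 3 bits are defined today,
# and the top 5 bits are reserved for future use.
FLAG_URGENT = 0x01
FLAG_ACK = 0x02
FLAG_MORE = 0x04
DEFINED_MASK = FLAG_URGENT | FLAG_ACK | FLAG_MORE


def encode_flags(flags: int) -> int:
    """Sender side: always transmit the reserved bits as zero."""
    return flags & DEFINED_MASK


def decode_flags(octet: int) -> int:
    """Receiver side: ignore reserved bits rather than rejecting the packet."""
    return octet & DEFINED_MASK
```

Because the receiver masks instead of checking for zero, a later version can start setting one of the reserved bits without breaking nodes built from this sketch.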
18.7.4 Single Version-number Field
One method of expressing version is with a single number. What should an implementation do if the version number is different? Sometimes a node might implement multiple previous versions. Sometimes newer versions are indeed compatible with older versions.
It is generally good to specify that a node that receives a packet with a larger version number should simply drop it, or respond with an older-version packet, rather than logging an error or crashing. If two nodes attempt to communicate, and the one with the newer version notices that it is talking to a node with an older version, the newer-version node simply switches to speaking the older version of the protocol, setting the version number to the one recognized by the other side.
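As a rough sketch of this behavior, the following Python function decides which version number to put in a reply. The function name, the `respond_when_newer` switch, and the use of a set of supported versions are assumptions made for illustration; a real protocol specification would pin down one of the two behaviors (drop or respond).

```python
from typing import Optional, Set


def reply_version(supported: Set[int], received: int,
                  respond_when_newer: bool = True) -> Optional[int]:
    """Return the version to speak in a reply, or None to silently drop.

    `supported` is the (hypothetical) set of versions this node implements.
    """
    highest = max(supported)
    if received in supported:
        # Speak the version the other side is using, even if we know a newer one.
        return received
    if received > highest and respond_when_newer:
        # The peer is newer than we are: answer with our own latest version
        # and let the peer switch down to it.
        return highest
    # Unknown older version, or policy says to drop newer packets: drop quietly.
    return None
```

Note that neither branch raises an exception or logs an error; an unrecognized version is treated as an expected event.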
The "spectacularly bad idea" protocol: I was staying at a hotel where they put the following announcement under my door. "Tomorrow morning we will be testing the emergency procedures. Guests should ignore all alarms and all behavior of the staff during the day tomorrow, as it is only a drill."
One problem that can result is that two new-version nodes might get tricked into talking the old version of the protocol to each other. Any stale memory on one side that the other side is older causes it to talk the older version, which in turn causes the other side to talk the older version. One method of solving this problem is to use a reserved bit indicating "I could be speaking a later version, but I think this is the latest version you support." Another possibility is to periodically probe with a later-version packet.
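The reserved-bit remedy can be illustrated with a toy simulation. Everything here is an assumption for the sake of the sketch: the `Node` class, the representation of a packet as a `(version, could_do_more)` pair, and the simplification that version numbers are consecutive integers and that older nodes send the hint bit as zero.

```python
class Node:
    """Toy model of a node that remembers which version its peer supports."""

    def __init__(self, max_version: int, assumed_peer_version: int):
        self.max_version = max_version
        # Possibly stale memory of the peer's highest supported version.
        self.peer_max = assumed_peer_version

    def send(self):
        version = min(self.max_version, self.peer_max)
        # Hint bit: "I could be speaking a later version than this one."
        could_do_more = self.max_version > version
        return version, could_do_more

    def receive(self, version: int, could_do_more: bool) -> None:
        if could_do_more:
            # The peer supports at least the next version up.
            self.peer_max = max(self.peer_max, version + 1)
        else:
            # No hint: this is the peer's latest version.
            self.peer_max = version


# Two version-3 nodes that each wrongly believe the other speaks version 1.
a, b = Node(3, 1), Node(3, 1)
for _ in range(3):
    b.receive(*a.send())
    a.receive(*b.send())
```

In this sketch the two nodes climb back to version 3 within a couple of exchanges. Without the hint bit, each would keep echoing version 1 back to the other indefinitely.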
18.7.5 Split Version-number Field
This strategy uses two or more subfields, sometimes referred to as major and minor version numbers. The major subfield is incremented if the protocol has been modified in an incompatible way and it is dangerous for an old-version node to attempt to process the packet. The minor subfield is incremented if there are compatible changes to the protocol. For example, a transport-layer protocol might have added the feature of delayed acks to avoid silly window syndrome.
The same result could be achieved with reserved bits (signaling that you implement enhanced features that are compatible with this version). However, having a minor version field in addition to the major version allows 2^n possible enhancements to be signaled with an n-bit minor version field (assuming that the enhancements were added to the protocol in sequential order, so that announcing enhancement 23 means that you support all previous enhancements), whereas n reserved bits can signal only n enhancements.
If you want to allow more flexibility than "all versions up to n," there are various possibilities.
- I support all capabilities between k and n (requires a field twice the size of the minor version field and the assumption that it makes sense to implement a range rather than everything smaller than n).
- I support capabilities 2, 3, and 6 (you're probably better off with a bitmask of n bits).
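The three ways of announcing capabilities discussed above can be compared side by side. The function names below are hypothetical, and enhancements are assumed to be numbered starting from 1.

```python
def supports_by_minor(minor_version: int, enhancement: int) -> bool:
    """Sequential scheme: minor version m implies enhancements 1 through m."""
    return 1 <= enhancement <= minor_version


def supports_by_range(low: int, high: int, enhancement: int) -> bool:
    """Range scheme: two fields announce 'everything between k and n'."""
    return low <= enhancement <= high


def bitmask_for(enhancements) -> int:
    """Bitmask scheme: bit (e - 1) is set if enhancement e is implemented."""
    mask = 0
    for e in enhancements:
        mask |= 1 << (e - 1)
    return mask


def supports_by_bitmask(mask: int, enhancement: int) -> bool:
    return bool((mask >> (enhancement - 1)) & 1)
```

The tradeoff is visible in the encodings: an n-bit minor version distinguishes 2^n sequential levels, while an n-bit mask describes any subset of only n enhancements.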
With a version number field, care must be taken if it is allowed to wrap around. It is far simpler to avoid this issue, either by making the version number field very large or by being conservative about incrementing it. One solution to the problem of running out of version numbers is to define the highest numerical value in the version number field to mean that the actual version number follows, most likely in a larger field. For example, if the original protocol called for a 1-byte version number, the value 255 might mean that the version number extension follows.
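A minimal sketch of the escape-value idea follows. The choice of a 4-byte big-endian extension field is an assumption for illustration only; the text says merely that the extension would most likely be a larger field.

```python
import struct

ESCAPE = 255  # highest 1-byte value, meaning "the real version number follows"


def encode_version(version: int) -> bytes:
    if version < 0:
        raise ValueError("version must be non-negative")
    if version < ESCAPE:
        return bytes([version])
    # Assumed extension format: 4-byte unsigned integer, network byte order.
    return bytes([ESCAPE]) + struct.pack("!I", version)


def decode_version(data: bytes):
    """Return (version, number of bytes consumed)."""
    if data[0] != ESCAPE:
        return data[0], 1
    (version,) = struct.unpack("!I", data[1:5])
    return version, 5
```

Versions 0 through 254 still cost a single byte, so the common case is unaffected; only a protocol that actually exhausts the small field pays for the extension.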
Another way of providing for future protocol evolution is to allow the appending of options. It is desirable to encode options so that an unknown option can be skipped. Sometimes, however, it is desirable for an unknown option to generate an error rather than be ignored. The most flexible solution is to specify, for each option, what a node that does not recognize the option should do, whether it be "skip and ignore," "skip and log," or "stop parsing and generate an error report."
To skip unknown options, the strategies are as follows.
- Have a special marker at the end of the option (requires a linear scan of the option to find the end).
- Have options be TLV-encoded, which means having a type field, a length field, and a value field.
Note that to skip over an unknown option, L (length) must always mean the same thing. Sometimes protocols have L depend on T—for example, not having any L field if the particular type is always fixed length, or having the L be expressed in bits versus bytes depending on T.
Skipping an unknown option also becomes impossible if L is interpreted inconsistently with respect to padding. Suppose that L is the usable length and that the actual length is always padded to, say, a multiple of 8 bytes. If the specification is clear that all options interpret L in that way, then options can be parsed. But if some option types use L as "how much data to skip" and others use it as the length of the "relevant information," with the padding inferred somehow, then it is not possible to parse unknown options.
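The following sketch shows why a uniform meaning for L matters. It assumes a hypothetical encoding with a 1-byte type, a 1-byte length that counts only the value bytes, and no padding; the parser can then step over options it has never heard of using L alone.

```python
def parse_options(data: bytes, known_types):
    """Return [(type, value), ...] for known options, skipping unknown ones."""
    options = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated option header")
        opt_type, length = data[offset], data[offset + 1]
        start = offset + 2
        end = start + length
        if end > len(data):
            raise ValueError("option value runs past end of packet")
        if opt_type in known_types:
            options.append((opt_type, data[start:end]))
        # Known or not, L tells us exactly where the next option begins.
        offset = end
    return options
```

For example, given a packet containing options of types 1, 9, and 2, a node that knows only types 1 and 2 still recovers both of them, because the length of the unknown type 9 is interpreted the same way as every other length.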
To determine what to do with unknown options, there are various strategies.
- Specify the handling of all unknown types (for example, skip and log, skip and ignore, generate error and ignore entire packet) and live without the flexibility in the future of defining an option that should be handled differently.
- Have a field present in all options that specifies the handling of the option.
- Have the handling implicit in the type number: for example, a range of T values that the specification says should be ignored, another range to be skipped and logged, and so on. This is similar to considering a bit in the type field as a flag indicating the handling of the packet.
Priority is an example of an option that would make sense to ignore if unknown. An example of an option in which the packet should be dropped is strict source routing.
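The strategy of making the handling implicit in the type number might look like the sketch below. The layout is purely an assumption for illustration: the top two bits of a 1-byte type select the action for an unrecognized option, and the unassigned code 3 is treated conservatively as a rejection.

```python
from enum import Enum


class Action(Enum):
    SKIP_AND_IGNORE = 0
    SKIP_AND_LOG = 1
    REJECT_PACKET = 2


def action_for_unknown(opt_type: int) -> Action:
    """Derive the handling of an unrecognized option from its type number."""
    code = (opt_type >> 6) & 0b11  # top two bits of the 1-byte type
    # Code 3 is unassigned in this sketch; fall back to the safest action.
    return Action(min(code, 2))
```

Under this layout, a priority-like option would be assigned a type in the range 0x00 to 0x3F so that old nodes ignore it, whereas something like strict source routing would be given a type of 0x80 or above so that a node that does not understand it drops the packet.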
Ironically, the 1984 version of ASN.1 (still widely deployed today), although it uses TLV encoding, does not allow you to add new options that are accepted by old implementations of your protocol if you define your structures in the straightforward ASN.1 way. You can define fields in your data structure as optional, but that means only that the implementation is willing for any of those defined fields to appear. Any options that you don't define will cause a parsing error.