The Encoding (Protocol Binding and Serialization)

If YANG plus a YANG model can be compared to a language such as English, defined by a dictionary of known words and a grammar describing how the words can be combined, you still need to work out an encoding for your language before you can communicate. English has two very commonly used encodings: text and voice. The text encoding can be further divided into computer encodings like ASCII, Windows-1252, and UTF-8. Text can also be encoded as pixels of different colors in a photo file, as ink on paper, microfiche, or grooves on a stone or a copper plate.

Similarly, there are several different encodings of messages (also known as protocol bindings or serialization) relating to YANG-based models, each more or less suitable depending on the context. The most commonly mentioned (and used) YANG-related encodings are XML, JavaScript Object Notation (JSON; in two variants), and protobuf. As time has passed, new encodings have emerged. For example, this section also covers Concise Binary Object Representation (CBOR).

Figure 2-4, shown earlier in this chapter, displays the different possible encodings for a specific protocol. Keep in mind that new encodings become available all the time. That makes sense because, in the end, an encoding is just... an encoding.

XML

Ask most IT professionals if they know XML, and the reply is invariably “yes.” As most IT folks know, XML stands for Extensible Markup Language. If you ask the same IT professionals to outline the extensibility mechanism in XML, however, only a tiny fraction will give you a reasonable reply. This is why there is a brief recap of the most important XML features, including the extensibility mechanism, in Chapter 3.

Most people think of XML as those HTML-like text documents full of angle brackets (<some>xml</some>), which may seem like a rather small and simple concept. It is not, however. XML and its family of related standards form a complicated and far-reaching dragon. It has a well-designed extensibility mechanism that allows XML-based content to evolve nicely over time, a query language called XPath, a schema language called XML Schema, and a transformation language called XSLT, to mention just a few features. Keep in mind that a schema is the definition of the structure and content of data (names, types, ranges, and defaults).
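
As a small illustration of two of these features, namespaces and XPath-style queries, here is a minimal Python sketch. The sample document and the function name are made up for this example; the namespace URI is the one published for the ietf-interfaces YANG module. Note that Python's ElementTree supports only a subset of XPath, so this shows the idea rather than the full language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, written for this illustration only.
DOC = """
<interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
  <interface>
    <name>eth0</name>
    <enabled>true</enabled>
  </interface>
  <interface>
    <name>eth1</name>
    <enabled>false</enabled>
  </interface>
</interfaces>
"""

# Map a local prefix to the namespace URI so the query can name elements.
NS = {"if": "urn:ietf:params:xml:ns:yang:ietf-interfaces"}


def enabled_interface_names(xml_text):
    """Return the names of all interfaces whose <enabled> child is 'true'."""
    root = ET.fromstring(xml_text)
    # Select <name> under every <interface> that has <enabled>true</enabled>.
    hits = root.findall("if:interface[if:enabled='true']/if:name", NS)
    return [element.text for element in hits]


print(enabled_interface_names(DOC))
```

The prefix "if" is local to the query. What identifies the elements is the namespace URI, which is how XML keeps names from different vocabularies apart as documents evolve.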

Because of these qualities and the abundant availability of tools, the NETCONF working group decided to base its protocol on XML for the message encoding. In the beginning, before YANG was invented, a lot of people assumed that NETCONF would be modeled using XML Schema Definition (XSD). XSD was even used, together with a set of additional mapping conventions, as the official modeling language for about half a year. This was when the group was attempting to craft the first standard models for NETCONF. Then a serious flaw was found in one of the early models made by the NETCONF gurus. This led to a rather heated debate about how such a major flaw could be introduced and not be found in review by even the inner circle of NETCONF pundits. The root cause was eventually declared to be the hard-to-read nature of XML Schema. This situation prompted small teams from separate organizations across the industry to get together to define a new schema language, with the fundamental requirement that models must be easy to read and write. Those team members’ names are found in RFC 6020. Today, this language is known as YANG.

This explains why the ties between NETCONF and XML were (and remain) very strong, and even why YANG 1.0 depends on XML mechanisms quite a bit. Now that several encodings are in use, YANG 1.1 is designed to be much more neutral toward protocol encodings.

JSON

As the demand for a REST-based approach similar to the functionality standardized in NETCONF was rising, the NETCONF WG started developing RESTCONF [21]. REST-based transports use a fairly wide variety of message encodings, but there is no question that the most popular one is JavaScript Object Notation (JSON). It has its roots in the way objects are represented in JavaScript: It started as a very simple collection of encoding rules that fit on a single page. This simplicity was a major driver for JSON’s popularity. Today there is a clear, precise, and language-independent definition of JSON in RFC 7159, which has since been obsoleted by RFC 8259.

While simplicity is always welcome, the downside is that JSON in its simple form handled many use cases quite poorly. For example, there were no mechanisms for evolution and extension, and no counterpart of the YANG namespace mechanism. Another example is that JSON has a single number type with an integer precision of about 53 bits in typical implementations. The 64-bit integers in YANG therefore must be encoded as something other than numbers, namely strings. To handle these and many other similar-but-not-so-obvious mapping cases, a set of encoding conventions was needed on top of JSON itself (just like with XSD for XML, as mentioned earlier). For YANG data carried in RESTCONF, these conventions are found in “JSON Encoding of Data Modeled with YANG” (RFC 7951; do not confuse this with RFC 7159, mentioned earlier, despite their similar numbers). The general JSON community is also working on standardization around additional use cases for JSON, and today JSON is approaching XML in versatility and complexity.
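
The 53-bit limit and the string workaround can be sketched in a few lines of Python. The module name example-stats, the leaf names, and the helper functions are hypothetical and exist only for this illustration. The rule that 64-bit types are carried as strings and the module-name prefix on the top-level member follow the RFC 7951 conventions.

```python
import json

# Largest value of the YANG uint64 type.
UINT64_MAX = 2**64 - 1


def through_double(n):
    """Simulate a consumer that stores every JSON number as an IEEE 754 double."""
    return int(float(n))


def encode_leaf(yang_type, value):
    """Encode one leaf value; 64-bit types become strings, as in RFC 7951."""
    if yang_type in ("int64", "uint64", "decimal64"):
        return str(value)
    return value


# Hypothetical module "example-stats"; the top-level member name carries the
# module name, which plays the role of the namespace in this encoding.
doc = {
    "example-stats:counters": {
        "in-octets": encode_leaf("uint64", UINT64_MAX),
        "mtu": encode_leaf("uint16", 1500),
    }
}
text = json.dumps(doc)

print(through_double(UINT64_MAX) == UINT64_MAX)  # the value does not survive
print(text)
```

A receiver that knows the model converts the string back to an integer. A receiver that does not know the model still gets the exact digits, which a plain JSON number could not guarantee.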

Besides JSON, the RESTCONF specification (RFC 8040) also defines how to encode the data as XML. Some RESTCONF servers support JSON, some XML, and many support both. When the encoding is JSON, it really means JSON as specified in RFC 7159, plus all the conventions in RFC 7951.
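
A client typically selects the encoding through HTTP content negotiation. The following Python sketch builds a request without sending it. The host name and the helper function are hypothetical, and the "/restconf" root is only the commonly used example value, since a server may place its API root elsewhere. The two media types are the ones registered by RFC 8040.

```python
import urllib.request

# Media types registered by RFC 8040 for YANG-modeled data.
MEDIA_TYPES = {
    "json": "application/yang-data+json",
    "xml": "application/yang-data+xml",
}


def build_get(base_url, data_path, encoding="json"):
    """Build (but do not send) a RESTCONF GET asking for one encoding."""
    # Assumption: the API root is "/restconf"; real servers may differ.
    url = f"{base_url}/restconf/data/{data_path}"
    return urllib.request.Request(
        url,
        headers={"Accept": MEDIA_TYPES[encoding]},
        method="GET",
    )


# Hypothetical device address, used for illustration only.
request = build_get("https://device.example.com", "ietf-interfaces:interfaces", "xml")
print(request.full_url)
print(request.get_header("Accept"))
```

A server that supports only one of the two encodings can reject the other through normal HTTP status codes, so a client that must work with many servers may need to try both.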

Google Protobufs

Protocol buffers, or protobufs, are another supported encoding in gNMI. Protobufs were originally invented by Google and are widely used in many of Google’s products and services, wherever communication over the wire or data storage is required.

You specify how you want the information you are serializing to be structured by defining protocol buffer message types in .proto files. Once you define your messages, you run the protocol buffer compiler for your application’s language on your .proto file to generate data access classes. Protobufs have built-in support for versioning and extensibility, which many see as an edge over JSON. The messaging mechanism is openly available with bindings for a long list of languages, which has led to very broad usage.

Protobufs come in two formats: self-describing and compact. The self-describing mode is three times larger than the compact mode in terms of bits on the wire. The compact mode is a tight binary form, which has the advantage of saving space on the wire and in memory. As a consequence, this mode is twice as fast. This encoding is therefore well suited for telemetry, where a lot of data is pushed at high frequency toward a collector. On the other hand, the compact format is hard to debug or trace: without the .proto files, you cannot tell the names, meaning, or full data types of fields.
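
To see why a compact binary message cannot be read without its .proto file, it helps to build one by hand. The sketch below follows the basic protobuf wire format (a key made of the field number and a wire type, followed by the value) for a hypothetical message with two fields: field 1, a string name, and field 2, an unsigned integer counter. The message layout and function names are assumptions made for this example, and the JSON document is only a stand-in for an encoding that carries names. It is not the self-describing protobuf mode described above.

```python
import json


def varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def key(field_number, wire_type):
    """Encode the field key: the field number and wire type packed together."""
    return varint((field_number << 3) | wire_type)


def encode_counter(name, in_octets):
    """Encode the hypothetical message: field 1 = name, field 2 = in_octets."""
    raw = name.encode("utf-8")
    # Wire type 2 is length-delimited (strings); wire type 0 is varint.
    return key(1, 2) + varint(len(raw)) + raw + key(2, 0) + varint(in_octets)


compact = encode_counter("eth0", 300)
with_names = json.dumps({"name": "eth0", "in-octets": 300}).encode("utf-8")

print(compact.hex())
print(len(compact), len(with_names))
```

The binary message contains the numbers 1 and 2 where the JSON text contains "name" and "in-octets". Only the .proto file tells a reader what fields 1 and 2 mean, which is the debugging cost mentioned above.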

CBOR

Concise Binary Object Representation (CBOR) is another encoding being discussed, and it is particularly useful for small, embedded systems, typically in the Internet of Things (IoT). CBOR is super-efficient, as it compresses even the identifiers. CBOR is used in connection with the CoAP Management Interface (CoMI) protocol on the client side.

As of this writing, CBOR (RFC 7049) is not in wide use with YANG-based servers, but discussions are ongoing in the IETF CORE (Constrained RESTful Environments) working group, where a document called “CBOR Encoding of Data Modeled with YANG” is being crafted.
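
To give a feel for how compact CBOR is, here is a minimal hand-written encoder in Python that covers only unsigned integers, text strings, and maps. It is a sketch of the basic CBOR byte layout, not a complete implementation. The sample data and the numeric identifiers 1 and 2 are made up for this illustration. They only show the general idea of replacing names with small integers and do not come from any real YANG model.

```python
import json


def head(major, n):
    """Encode the CBOR initial byte (major type) plus its argument n."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("value too large for CBOR")


def encode(obj):
    """Encode unsigned ints, text strings, and maps; nothing else."""
    if isinstance(obj, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(obj, int) and obj >= 0:
        return head(0, obj)  # major type 0: unsigned integer
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return head(3, len(raw)) + raw  # major type 3: text string
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return head(5, len(obj)) + body  # major type 5: map
    raise TypeError(f"unsupported type: {type(obj).__name__}")


by_name = {"name": "eth0", "mtu": 1500}
by_number = {1: "eth0", 2: 1500}  # hypothetical numeric identifiers

print(len(json.dumps(by_name)), len(encode(by_name)), len(encode(by_number)))
print(encode(by_number).hex())
```

Most of the savings in the last form come from the keys: a one-byte integer replaces each name, which matters on constrained devices where every byte on the wire counts.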
