## Data Integrity

At a much lesser cost than encrypting a plaintext in its entirety, both services of data integrity and origin authentication can be afforded by a secret cryptographic scheme using a Message Authentication Code (MAC). Its main component is the digest function.

### Hash Functions

By definition a hash function is a deterministic function that maps a message
of arbitrary length to a fixed length string (e.g., 128 or 160 bits) commonly
known as a message digest [MERK79, PREN93]. Hash functions are considered to be
one of the fundamental primitives in modern cryptography. For a function
*H* to be a hash function, it should satisfy the following properties:

For an arbitrary input

*p*, it is easy to compute*h = H(p)*It is computationally infeasible to compute the inverse

*p = H*(^{–1}*h*), a property that attributes the hash function the one-way hash function nameIt is also computationally infeasible to determine

*p*' such that*H*(*p*) =*H*(*p*'), a property know as collision resistance

The collision resistance characteristic of a hash function is further
enhanced by having the function satisfy a property whereby the probability that
a randomly picked stream of bytes is mapped to a known *n*-bit hash-value
is 2^{(–n)}.

The fundamental premise here is that the hash value becomes, depending upon the strength of the hashing algorithm, a compact representation of the original data; hence hash value is sometimes referred to as fingerprint. In modern cryptography hash functions are commonly used in providing digital signatures of documents, data integrity, and data origin authentication.

MD5 [RIVE92] and SHA-1 [SHS95] are the most widely used cryptographic hash functions. MD5 yields a 128-bit hash value; SHA-1 results in a 160-bit digest. SHA-1 appears to be a cryptographically stronger function as it presents an enhancement over MD5 and is more resistant to brute force attacks. On the other hand, MD5 edges SHA-1 in computational performance.

The combination of a hash function and a secret key cryptographic scheme yields a MAC, which enables both data integrity and data origin authenticity. In contrast to imply using a hash function to digest a message as is the case in Modification Detection Codes (MDC), the keyed MAC hashing function yields a value that can only be verified by an entity having knowledge of the secret key.

### MAC Examples

One simple method of turning a one-way hash function into a MAC is to encrypt the resulting hash value with a secret-key block algorithm. Similarly, a MAC can be computed by solely using a symmetric block algorithm in a mode such as the cipher block chaining (CBC) [FIPS81]. In this mode we start with a randomly chosen block of data as an initial vector. We perform the XOR of the initial vector with the first block to be encrypted. Then we encrypt the block. The procedure is repeated for the next block using the block that was encrypted last as an initial vector. The cascading nature of this chained feedback is shown in Figure 1.8. The last ciphertext block is encrypted once more in CBC mode, yielding the final MAC value. Known instances of this procedure employ DES and triple DES resulting in DES-MAC and Triple-DES-MAC, respectively.

**FIGURE 1.8 Feedback chaining in a CBC-based MAC algorithm**

### HMAC

Other methods of constructing MACs rely solely on keyed one-way hash functions. One simple example would be to prefix or suffix the data to be digested with a secret key. The result is then subjected to the hashing transformation. Another variation consists of prefixing and suffixing the data to be digested with the key. Additionally, a more reliable version of such a method includes the length of the data to digest while computing the hash value.

A common method in this category is the keyed hashing for message
authentication known as HMAC [KRAW97]. The HMAC algorithm is described using a
generic iterative hash function *H*. In practice, however, it has been
employed mostly with MD5 and SHA-1.

Let's denote by *b* the block length of the underlying hash
function's input block (64 for both MD5 and SHA-1). The following inner and
outer padding values are defined:

*innerPad*= the byte 0 x 36 repeated*b*times*outerPad*= the byte 0 x 5C repeated*b*times.

To compute HMAC over an input *p* we perform:

*H((K XOR outerPad) || H( (K XOR innerPad) || p))*,

where || denotes string concatenation. This computation breaks into the following steps:.

Append zeros to the end of K to create a b byte string

XOR the b byte string computed in step (1)with innerPad

Append the stream p to the b-byte string resulting from step (2)

Apply H to the stream generated in step append (3)

XOR the b byte string computed in step (1)with outerPad

Append the H result from step (4)to the b byte string resulting from step (5)

Apply H to the stream generated in step (6)and output the result

The effective contribution to the final hash value is accumulated in
accordance to function *H*. The procedure is then iteratively applied to
each of the remaining blocks.