# One-way compression function

In cryptography, a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output of the same size as one of the inputs. The transformation is "one-way", meaning that, given a particular output, it is difficult to compute inputs that compress to that output. One-way compression functions are not related to conventional data compression, which can instead be inverted exactly (lossless compression) or approximately (lossy compression) to recover the original data.

One-way compression functions are used, for instance, in the Merkle-Damgård construction inside cryptographic hash functions.

One-way compression functions are often built from block ciphers. Some methods to turn any normal block cipher into a one-way compression function are Davies-Meyer, Matyas-Meyer-Oseas, Miyaguchi-Preneel, MDC-2/Meyer-Schilling and MDC-4. These methods are described in detail further down. (MDC-2 is also the name of a hash function patented by IBM.)

## Compression

A compression function mixes two fixed-length inputs and produces a single fixed-length output of the same size as one of the inputs. Equivalently, the compression function can be seen as transforming one large fixed-length input into a shorter fixed-length output.

For instance, input A might be 128 bits and input B 128 bits, and they are compressed together into a single output of 128 bits. This is equivalent to compressing a single 256-bit input into a single 128-bit output.

Some compression functions take two inputs of different sizes, but the output is usually the same size as one of the inputs. For instance, input A might be 256 bits and input B 128 bits, and they are compressed together into a single output of 128 bits. That is, a total of 384 input bits are compressed into 128 output bits.

The mixing is done in such a way that a full avalanche effect is achieved. That is, every output bit depends on every input bit.
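The avalanche effect can be observed with a small experiment. The sketch below is only an illustration under stated assumptions: `compress` is a hypothetical stand-in (SHA-256 over the concatenated inputs, truncated to 128 bits), not a real compression function design. It flips one bit of input B and counts how many output bits change; for a well-mixed function one would expect roughly half of them to differ.

```python
import hashlib

def compress(a: bytes, b: bytes) -> bytes:
    """Hypothetical stand-in: SHA-256 of both inputs, truncated to 128 bits."""
    return hashlib.sha256(a + b).digest()[:16]

def bit_difference(x: bytes, y: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(p ^ q).count("1") for p, q in zip(x, y))

a = bytes(16)                          # input A: 128 zero bits
b = bytes(16)                          # input B: 128 zero bits
b_flipped = bytes([0x01]) + bytes(15)  # input B with a single bit flipped

out1 = compress(a, b)
out2 = compress(a, b_flipped)
changed = bit_difference(out1, out2)
print(f"{changed} of {len(out1) * 8} output bits changed")
```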

## One-way

A one-way function is a function that is easy to compute but hard to invert. One-way compression functions are one-way in the following ways:

• If you know both inputs it is easy to calculate the output.
• If an attacker knows only the output, it should be infeasible to calculate any of the inputs.
• If an attacker knows the output and one of the inputs, it should be infeasible to figure out the other input.

The compression function should also be collision resistant. That is, it should be hard to find two different sets of inputs that compress to the same output.

## The Merkle-Damgård construction

Main article: Merkle-Damgård construction

One-way compression functions are used in the Merkle-Damgård construction inside cryptographic hash functions.

A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using a one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher.

The last block processed should also be length padded; this is crucial to the security of the construction. The overall scheme is called the Merkle-Damgård construction. Most widely used hash functions, including SHA-1 and MD5, take this form.
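The block-by-block iteration and the length padding can be sketched as follows. This is a minimal illustration, not a real hash function: `compress` is a hypothetical stand-in built from truncated SHA-256, and the block size, state size, all-zero initial value and padding layout (a `0x80` byte, zero bytes, then the message length in bits as a 64-bit big-endian integer) are all assumptions chosen for the example.

```python
import hashlib

BLOCK_SIZE = 32          # bytes per message block (assumed)
STATE_SIZE = 16          # bytes of chaining value (assumed)
IV = bytes(STATE_SIZE)   # hypothetical all-zero initial value

def compress(h: bytes, block: bytes) -> bytes:
    """Stand-in compression function: (16-byte state, 32-byte block) -> 16 bytes."""
    return hashlib.sha256(h + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Length padding: 0x80, zero bytes, then the bit length as 8 big-endian bytes."""
    bit_len = len(message) * 8
    padded = message + b"\x80"
    # Add zeros so that, after the 8-byte length, the total is a whole number of blocks.
    padded += bytes((-len(padded) - 8) % BLOCK_SIZE)
    return padded + bit_len.to_bytes(8, "big")

def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the padded message blocks."""
    h = IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        h = compress(h, padded[i:i + BLOCK_SIZE])
    return h

print(md_hash(b"abc").hex())
```

Because the message length is part of the final block, two messages of different lengths never produce the same padded block sequence.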

## Often built from block ciphers

One-way compression functions are often built from block ciphers.

Like one-way compression functions, block ciphers take two fixed-size inputs (the key and the plaintext) and return a single output (the ciphertext) that is the same size as the input plaintext.

However, modern block ciphers are only partially one-way. That is, given a plaintext and a ciphertext, it is infeasible to find a key that encrypts the plaintext to the ciphertext. But given a ciphertext and a key, a matching plaintext can be found simply by using the block cipher's decryption function. Thus, to turn a block cipher into a one-way compression function, some extra operations have to be added.
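This asymmetry can be illustrated with a small sketch. The cipher used here is a hypothetical toy Feistel network (`toy_encrypt` / `toy_decrypt`, with truncated SHA-256 as its round function), invented purely for illustration; it is not a real or vetted block cipher. The sketch shows why raw encryption alone does not make a one-way compression function: anyone who knows the output and the key input can recover the plaintext input by decrypting.

```python
import hashlib

HALF = 8  # bytes per Feistel half; the toy block size is 16 bytes (128 bits)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Hypothetical round function: truncated SHA-256 of key, round number and half-block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with a 4-round toy Feistel network."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Invert toy_encrypt by running the rounds backwards."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = xor(right, round_function(key, i, left)), left
    return left + right

def naive_compress(key: bytes, plaintext: bytes) -> bytes:
    """Insecure 'compression function': just the raw cipher output."""
    return toy_encrypt(key, plaintext)

key = b"k" * 16
plaintext = b"secret 16 bytes!"
output = naive_compress(key, plaintext)

# Knowing the output and the key input is enough to recover the other input.
recovered = toy_decrypt(key, output)
print(recovered == plaintext)
```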

Some methods to turn any normal block cipher into a one-way compression function are Davies-Meyer, Matyas-Meyer-Oseas, Miyaguchi-Preneel, MDC-2 and MDC-4. They are then used inside the Merkle-Damgård construction to build the actual hash function. These methods are described in detail further down. (MDC-2 is also the name of a hash function patented by IBM.)

Using a block cipher to build the one-way compression function for a hash function is usually somewhat slower than using a specially designed one-way compression function. This is because all known secure constructions do the key scheduling for each block of the message. Black, Cochran and Shrimpton have shown that it is impossible to construct a one-way compression function that makes only one call to a block cipher with a fixed key. In practice, reasonable speeds are achieved provided that the key scheduling of the selected block cipher is not too expensive an operation.

In some cases, however, it is more convenient, because a single implementation of a block cipher can serve both as a cipher and as the basis of a hash function. This can also save code space in very small embedded systems such as smart cards or nodes in cars or other machines.

If a block cipher has a block size of, say, 128 bits, most of the methods create a hash function that has a block size of 128 bits and produces a hash of 128 bits. But there are also methods that produce a hash twice as large as the block size of the block cipher used. So a 128-bit block cipher can be turned into a 256-bit hash function.

The hash function can only be considered secure if at least the following conditions are met:

• The block cipher has no special properties that distinguish it from ideal ciphers, such as for example weak keys or keys that lead to identical or related encryptions.
• The resulting hash size is big enough. 64-bit is too small, 128-bit might be enough.
• The last block is properly length padded prior to the hashing. (See the Merkle-Damgård construction.) Length padding is normally implemented and handled internally in specialised hash functions like SHA-1 etc.

The constructions presented below (Davies-Meyer, Matyas-Meyer-Oseas and Miyaguchi-Preneel) have been shown to be secure under black-box analysis. The black-box model assumes that a random block cipher is used. In this model an attacker may freely encrypt and decrypt any blocks, but does not have access to an implementation of the block cipher.

## Davies-Meyer

The Davies-Meyer one-way compression function feeds each block of the message ($m_i$) as the key to a block cipher. It feeds the previous hash value ($H_{i-1}$) as the plaintext to be encrypted. The output ciphertext is then also XORed ($\oplus$) with the previous hash value ($H_{i-1}$) to produce the next hash value ($H_i$). In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

In mathematical notation Davies-Meyer can be described as:

$H_i = E_{m_i}(H_{i-1}) \oplus H_{i-1}$

If the block cipher uses, for instance, 256-bit keys, then each message block ($m_i$) is a 256-bit chunk of the message. If the same block cipher uses a block size of 128 bits, then the input and output hash values in each round are 128 bits.
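A minimal sketch of one Davies-Meyer step, chained over several message blocks, might look as follows. The block cipher is a hypothetical toy Feistel network (`toy_encrypt`) invented for illustration only and not meant to be secure; it has a 128-bit block and is keyed here with 256-bit message blocks, mirroring the sizes in the example above. The all-zero `H0` is likewise an assumption.

```python
import hashlib

HALF = 8  # bytes per Feistel half; the toy block size is 16 bytes (128 bits)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Hypothetical round function: truncated SHA-256 of key, round number and half-block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with a 4-round toy Feistel network."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right

def davies_meyer(h_prev: bytes, m: bytes) -> bytes:
    """H_i = E_{m_i}(H_{i-1}) XOR H_{i-1}: the message block is used as the key."""
    return xor(toy_encrypt(m, h_prev), h_prev)

H0 = bytes(16)                   # assumed constant initial value
blocks = [b"A" * 32, b"B" * 32]  # two 256-bit message blocks

h = H0
for m in blocks:
    h = davies_meyer(h, m)
print(h.hex())
```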

Variations of this method replace XOR with any other group operation, such as addition on 32-bit unsigned integers.

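Fixed points can be found for this construction: for any message block $m$, the value $H = E_m^{-1}(0)$ satisfies $E_m(H) \oplus H = H$, and this holds whether or not the underlying block cipher has been broken. This enables a so-called fixed point attack. According to Bruce Schneier this "is not really worth worrying about."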
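A fixed point of the Davies-Meyer step can be computed by decrypting the all-zero block with the chosen message block as key, since the encryption of the result is then zero and the final XOR returns the input unchanged. The sketch below demonstrates this with a hypothetical toy Feistel cipher (`toy_encrypt` / `toy_decrypt`), which is an illustrative stand-in and not a real block cipher.

```python
import hashlib

HALF = 8  # bytes per Feistel half; the toy block size is 16 bytes (128 bits)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Hypothetical round function: truncated SHA-256 of key, round number and half-block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with a 4-round toy Feistel network."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Invert toy_encrypt by running the rounds backwards."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = xor(right, round_function(key, i, left)), left
    return left + right

def davies_meyer(h_prev: bytes, m: bytes) -> bytes:
    """H_i = E_{m_i}(H_{i-1}) XOR H_{i-1}."""
    return xor(toy_encrypt(m, h_prev), h_prev)

m = b"any message block, 32 bytes long"  # arbitrary message block used as the key
fixed_point = toy_decrypt(m, bytes(16))    # H such that E_m(H) = 0

# Compressing the fixed point with m leaves the chaining value unchanged.
print(davies_meyer(fixed_point, m) == fixed_point)
```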

The security of the Davies-Meyer construction under the black-box assumption was first proved by R. Winternitz.

## Matyas-Meyer-Oseas

The Matyas-Meyer-Oseas one-way compression function can be considered the dual (the opposite) of Davies-Meyer.

It feeds each block of the message ($m_i$) as the plaintext to be encrypted. The output ciphertext is then also XORed ($\oplus$) with the same message block ($m_i$) to produce the next hash value ($H_i$). The previous hash value ($H_{i-1}$) is fed as the key to the block cipher. In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

If the block cipher has different block and key sizes, the hash value ($H_{i-1}$) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. In that case the hash value is first fed through a function $g()$ that converts or pads it to fit as a key for the cipher.

In mathematical notation Matyas-Meyer-Oseas can be described as:

$H_i = E_{g(H_{i-1})}(m_i) \oplus m_i$
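The formula above can be sketched in code as follows. The block cipher is a hypothetical toy Feistel network (`toy_encrypt`) invented for illustration and not meant to be secure. The conversion function `g` shown here, which zero-pads the 128-bit hash value to a 256-bit key, is also only an assumed example; a real design would choose `g` to suit its cipher. The all-zero `H0` is likewise an assumption.

```python
import hashlib

HALF = 8  # bytes per Feistel half; the toy block size is 16 bytes (128 bits)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Hypothetical round function: truncated SHA-256 of key, round number and half-block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with a 4-round toy Feistel network."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right

def g(h: bytes) -> bytes:
    """Hypothetical key conversion: zero-pad the 16-byte hash value to a 32-byte key."""
    return h.ljust(32, b"\x00")

def matyas_meyer_oseas(h_prev: bytes, m: bytes) -> bytes:
    """H_i = E_{g(H_{i-1})}(m_i) XOR m_i: the previous hash value is used as the key."""
    return xor(toy_encrypt(g(h_prev), m), m)

H0 = bytes(16)                   # assumed constant initial value
blocks = [b"A" * 16, b"B" * 16]  # two 128-bit message blocks

h = H0
for m in blocks:
    h = matyas_meyer_oseas(h, m)
print(h.hex())
```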

## Miyaguchi-Preneel

The Miyaguchi-Preneel one-way compression function is an extended variant of Matyas-Meyer-Oseas. It was independently proposed by Shoji Miyaguchi and Bart Preneel.

It feeds each block of the message ($m_i$) as the plaintext to be encrypted. The output ciphertext is then XORed ($\oplus$) with the same message block ($m_i$) and then also XORed with the previous hash value ($H_{i-1}$) to produce the next hash value ($H_i$). The previous hash value ($H_{i-1}$) is fed as the key to the block cipher. In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

If the block cipher has different block and key sizes, the hash value ($H_{i-1}$) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. In that case the hash value is first fed through a function $g()$ that converts or pads it to fit as a key for the cipher.

In mathematical notation Miyaguchi-Preneel can be described as:

$H_i = E_{g(H_{i-1})}(m_i) \oplus H_{i-1} \oplus m_i$

The roles of $m_i$ and $H_{i-1}$ may be switched, so that $H_{i-1}$ is encrypted under the key $m_i$, thus making this method an extension of Davies-Meyer instead.
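The basic Miyaguchi-Preneel step (with the previous hash value as the key) can be sketched as follows. As a hedge: the block cipher is a hypothetical toy Feistel network (`toy_encrypt`) that is not meant to be secure, `g` is an assumed zero-padding conversion from a 128-bit hash value to a 256-bit key, and the all-zero `H0` is an arbitrary choice for the example.

```python
import hashlib

HALF = 8  # bytes per Feistel half; the toy block size is 16 bytes (128 bits)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Hypothetical round function: truncated SHA-256 of key, round number and half-block."""
    return hashlib.sha256(key + bytes([round_no]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with a 4-round toy Feistel network."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right

def g(h: bytes) -> bytes:
    """Hypothetical key conversion: zero-pad the 16-byte hash value to a 32-byte key."""
    return h.ljust(32, b"\x00")

def miyaguchi_preneel(h_prev: bytes, m: bytes) -> bytes:
    """H_i = E_{g(H_{i-1})}(m_i) XOR H_{i-1} XOR m_i."""
    ciphertext = toy_encrypt(g(h_prev), m)
    return xor(xor(ciphertext, h_prev), m)

H0 = bytes(16)                   # assumed constant initial value
blocks = [b"A" * 16, b"B" * 16]  # two 128-bit message blocks

h = H0
for m in blocks:
    h = miyaguchi_preneel(h, m)
print(h.hex())
```

Compared with Matyas-Meyer-Oseas, the only difference in this sketch is the extra XOR with the previous hash value.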