Block cipher: Difference between revisions

From Citizendium
imported>Sandy Harris
{{subpages}}
{{main|Cryptography}}
{{TOC|right}}
'''Block ciphers''' are one of the two main types of [[symmetric cipher]]; they operate on fixed-size blocks of [[plaintext]], giving a block of [[ciphertext]] for each. The other main type is the [[stream cipher]], which generates a continuous stream of keying material to be mixed with messages.
The main function of block ciphers is to keep messages or stored data [[Information_security#Content_confidentiality | secret]]; the intent is that an unauthorised person be completely unable to read the enciphered material. Block ciphers therefore use a [[Key (cryptography)|key]] and are designed to be hard to read without that key. Of course an attacker's intent is exactly the opposite; he wants to read the material without authorisation, and often without the key. See [[cryptanalysis]] for his methods.
Block ciphers are often used as components in [[hybrid cryptosystem]]s; these combine [[public key]] (asymmetric) cryptography with [[symmetric key cryptography | secret key]] (symmetric) techniques such as block ciphers or [[stream cipher]]s. Typically, the symmetric cipher is the workhorse that encrypts large amounts of data; the public key mechanism manages keys for the symmetric cipher and provides [[Information_security#Source_authentication|authentication]]. Generally other components such as [[cryptographic hash]]es and a cryptographically strong [[random number]] generator are used as well.
Various [[block cipher modes of operation]] are used when multiple blocks are to be encrypted. The block cipher defines how a single block is encrypted; the mode defines how multiple block encryptions are combined to achieve some larger goal. Using a mode that is inappropriate for the application at hand may lead to insecurity, even if the cipher itself is secure.
Among the best-known and most widely used block ciphers are two US government standards. The [[#DES | Data Encryption Standard]] (DES) from the 1970s is now considered obsolete; the [[#AES | Advanced Encryption Standard]] (AES) replaced it in 2002.
To choose the new standard, the [[National Institute of Standards and Technology]] ran an [[AES competition]]. Fifteen ciphers were entered, five finalists were selected, and eventually AES was chosen. The text below describes the five finalists: Rijndael (which became [[#AES | AES]]), [[#Twofish | Twofish]], [[#Mars | Mars]], [[#RC6 | RC6]] and [[#Serpent | Serpent]]; it also describes some others. The [[AES competition]] article covers all fifteen candidates.
== Context ==
Block ciphers are essential components in many security systems. However, just having a good block cipher does not give you security, much as just having good tires does not give you transportation. It may not even help; good tires are of little use if you need a boat. Even in systems where block ciphers are needed, they are never the whole story.
Most of this article deals with block ciphers themselves. The major sections are:
* [[#Size parameters|Size parameters]] describes how block size and key size are chosen
* [[#Principles and techniques | Principles and techniques]] defines terms and introduces major ideas and methods
* [[#DES and alternatives | DES and alternatives]] describes 20th century block ciphers, from the 70s into the 90s
* [[#The AES generation | The AES generation]] describes the next generation, the first 21st century ciphers
* [[#Large-block ciphers | Large-block ciphers]] covers a few special cases that do not fit in the other sections
This section aims to provide a context for those by mentioning some issues that, while outside the study of the ciphers themselves, are crucially important in understanding and using these ciphers.
* It is hard to design any system that must withstand adversaries; see [[Cryptography#Cryptography_is_difficult|cryptography is difficult]] and [[information security]].
* Block ciphers must withstand [[cryptanalysis]]. It is impossible to design a good block cipher, or evaluate the security of one, without a thorough understanding of the available attack methods.
* [[Kerckhoffs' Principle]] applies to block ciphers. No cipher can be considered secure unless it can resist an attacker who knows all its details except the key in use. Analysis of security claims cannot even begin until all internal details of a cipher are published. Anyone making security claims without publishing those details will be either ignored or mocked by most experts.
* Any cipher is worthless without a good key. In any application which encrypts large volumes of data, the key must be changed from time to time. See the [[cryptography#Keying | cryptography]] article for discussion and [[random number]] for the methods used to generate keys which an opponent cannot guess.
* A block cipher defines how a single block is encrypted; a [[block cipher modes of operation| mode of operation]] defines how multiple block encryptions are combined to achieve some larger goal. Using a mode that is inappropriate for the application at hand may lead to insecurity, even if the cipher itself is secure.
* When block ciphers are used as components in [[hybrid cryptosystem]]s the system can only be as strong as its weakest link, and it may not even be that strong. Using secure components including good block ciphers is certainly necessary, but just having good components does not guarantee that the ''system'' will be secure. See [[hybrid cryptosystem]] for how the components fit together, and [[information security]] for broader issues.
That said, we turn to the block ciphers themselves.
== Size parameters ==
One could say there are only three things to worry about in designing a block cipher:
* make the '''block size large enough''' that an enemy cannot create a [[code book attack | code book]], collecting so many known plaintext/ciphertext pairs that the cipher is broken.
* make the '''key size large enough''' that he cannot use a [[brute force]] attack, trying all possible keys
* then '''design the cipher well enough''' that no other attack is effective
Getting adequate block size and key size is the easy part; just choose large enough numbers. This section describes how those choices are made. Making ciphers that resist attacks that are cleverer than brute force (see [[cryptanalysis]]) is far more difficult. The following section, [[#Principles and techniques | Principles and techniques]], covers ideas and methods for that.
Later on, we describe two generations of actual ciphers. The [[#DES and alternatives | 20th century ciphers]] use 64-bit blocks and key sizes from 56 bits up. The [[#The AES generation | 21st century ciphers]] use 128-bit blocks and 128-bit or larger keys.
If two or more ciphers use the same block and key sizes, they are effectively interchangeable. One can replace another in almost any application without requiring any other change to the application. This might be done to comply with a particular government's standards, to replace a cipher against which some new attack had been discovered, to provide efficiency in a particular environment, or simply to suit a preference.
Nearly all cryptographic libraries give a developer a choice of components, and some protocols such as [[IPsec]] allow a network administrator to select ciphers. This may be a good idea if all the available ciphers are strong, but if some are weak it just gives the developer or administrator, neither of whom is likely to be an expert on ciphers, an opportunity to get it wrong. There is an argument that supporting multiple ciphers is an unnecessary complication. On the other hand, being able to change ciphers easily if one is broken provides a valuable safety mechanism.
=== Block size ===
The block size of a cipher is chosen partly for implementation convenience; using a multiple of 32 bits makes software implementations simpler. However, it must also be large enough to guard against [[code book attack]]s.
DES and the [[#DES and alternatives | generation of ciphers]] that followed it all used a 64-bit block size. To weaken such a cipher significantly the attacker must build up a [[code book attack | code book]] with 2<sup>32</sup> blocks (32 gigabytes of data at 8 bytes per block), all encrypted with the same key. As long as the cipher user changes keys reasonably often, a code book attack is not a threat. Procedures and protocols for block cipher usage therefore always include a re-keying policy.
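The arithmetic behind the 32-gigabyte figure can be checked with a short Python sketch. It assumes that the threshold generalises to other block sizes as 2<sup>b/2</sup> blocks for a b-bit block (the usual birthday-bound rule of thumb, which the text above states only for 64 bits); the function name is hypothetical.

```python
def code_book_bytes(block_bits: int) -> int:
    """Data volume at which a code book starts to weaken a cipher.

    Assumption: about 2**(block_bits / 2) blocks are needed, each
    block_bits / 8 bytes long, all encrypted under the same key.
    """
    blocks = 2 ** (block_bits // 2)
    return blocks * (block_bits // 8)


GIB = 2 ** 30  # bytes per (binary) gigabyte

for bits in (64, 128):
    print(f"{bits}-bit block: {code_book_bytes(bits) // GIB:,} GiB")
```

Under that assumption, moving from 64-bit to 128-bit blocks raises the required data volume from 2<sup>35</sup> bytes to 2<sup>68</sup> bytes.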
However, with [[Moore's Law]] making larger code books more practical, [[NIST]] chose to play it safe in their [[#The AES generation | AES specifications]]; they used a 128-bit block size. This was a somewhat controversial innovation at the time (1998), since it meant changes to a number of applications and it was not absolutely clear that the larger size was necessary. However, it has since become common practice; later ciphers such as [[Camellia (cipher)|Camellia]] and [[SEED (cipher)|SEED]] also use 128 bits.
There are also a few ciphers which either support variable block size or have a large fixed block size. See the section on [[#Large-block ciphers|large-block ciphers]] for details.
=== Key size ===
In theory, any cipher can be broken by a [[brute force]] attack; the enemy just has to try keys until he finds the right one. However, the attack is practical only if the cipher's key size is inadequate. '''Current block ciphers all use at least 128-bit keys''' to protect against brute force attacks; many support larger keys as well. If the key has n bits, there are 2<sup>n</sup> possible keys and on average the attacker must test half of them, so the average cost of the attack is 2<sup>n-1</sup> encryptions.
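To give a feel for the 2<sup>n-1</sup> figure, the following Python sketch converts it into time. The search rate of 10<sup>12</sup> keys per second is a purely hypothetical assumption chosen for illustration, not a measurement of any real attacker.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_brute_force_years(key_bits: int, keys_per_second: float) -> float:
    """Average time to find an n-bit key: 2**(n-1) trials at the given rate."""
    average_trials = 2 ** (key_bits - 1)
    return average_trials / keys_per_second / SECONDS_PER_YEAR


HYPOTHETICAL_RATE = 1e12  # keys tested per second; an assumption, not a benchmark

for bits in (56, 128):
    years = average_brute_force_years(bits, HYPOTHETICAL_RATE)
    print(f"{bits}-bit key: about {years:.3g} years on average")
```

At that assumed rate a 56-bit key falls in well under a year, while a 128-bit key remains far out of reach.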
 
Key size is critical in [[stream cipher]]s as well as block ciphers, for the same reasons. In many applications of [[cryptographic hash]] algorithms, no key is used. However in applications where a key is used, such as [[hashed message authentication code]]s, key size is naturally an issue.
== Principles and techniques ==
This section introduces the main principles of block cipher design, defines standard terms, and describes common techniques.
All of the principles and many of the terms and techniques discussed here for block ciphers also apply to other cryptographic primitives such as [[stream cipher]]s and [[cryptographic hash]] algorithms.
=== Iterated block ciphers ===
Nearly all block ciphers are '''iterated block ciphers'''; they have multiple '''rounds''', each applying the same transformation to the output of the previous round. At setup time the '''primary key''' undergoes '''key scheduling''' giving a number of '''round keys'''. In the actual encryption or decryption, each round uses its own round key. This allows the designer to define some relatively simple transformation, called a '''round function''', and apply it repeatedly to create a cipher with enough overall complexity to thwart attacks.
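The structure just described can be sketched in Python. Everything specific here (the 32-bit block, the constants, the key schedule and the round function) is invented for illustration and is in no way secure; the point is only the shape: a primary key is expanded into round keys, and one simple round function is applied repeatedly.

```python
MASK32 = 0xFFFFFFFF
ROUND_CONSTANT = 0x9E3779B9  # arbitrary constant, chosen only for illustration


def rotl32(x: int, r: int) -> int:
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def rotr32(x: int, r: int) -> int:
    r %= 32
    return ((x >> r) | (x << (32 - r))) & MASK32


def key_schedule(primary_key: int, rounds: int) -> list:
    """Toy key scheduling: derive one 32-bit round key per round."""
    key = primary_key & MASK32
    return [(rotl32(key, 7 * i) + i * ROUND_CONSTANT) & MASK32 for i in range(rounds)]


def round_function(block: int, round_key: int) -> int:
    """One toy round: mix in the round key, add a constant, rotate."""
    return rotl32(((block ^ round_key) + ROUND_CONSTANT) & MASK32, 5)


def inverse_round(block: int, round_key: int) -> int:
    """Undo round_function by reversing its steps in the opposite order."""
    return ((rotr32(block, 5) - ROUND_CONSTANT) & MASK32) ^ round_key


def encrypt(block: int, round_keys: list) -> int:
    for k in round_keys:
        block = round_function(block, k)
    return block


def decrypt(block: int, round_keys: list) -> int:
    for k in reversed(round_keys):
        block = inverse_round(block, k)
    return block


round_keys = key_schedule(0xDEADBEEF, rounds=8)
assert decrypt(encrypt(0x01234567, round_keys), round_keys) == 0x01234567
```

Changing the number of rounds only changes the argument to the hypothetical <code>key_schedule</code>; the round function itself is untouched.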
Two common ways to design iterated block ciphers, [[#SP networks | SP networks]] and [[#Feistel structure | Feistel structures]], and two important ways to look at the complexity requirements, [[#Avalanche | avalanche]] and [[#Nonlinearity | nonlinearity]], are covered in the following sections.
Any iterated cipher can be made more secure by increasing the number of rounds or made faster by reducing the number. In choosing the number of rounds, the cipher designer tries to strike a balance that achieves both security and efficiency simultaneously. Often a safety margin is applied; if the cipher appears to be secure after a certain number of rounds, the designer specifies a somewhat larger number for actual use.
There is a trade-off that can be made in the design. With a simple fast round function, many rounds may be required to achieve adequate security; for example, [[GOST cipher| GOST]] and [[Tiny Encryption Algorithm| TEA]] both use 32 rounds. A more complex round function might allow fewer rounds; for example, [[International Data Encryption Algorithm| IDEA]] uses only 8 rounds. Since the ciphers with fast round functions generally need more rounds and the ones with few rounds generally need slower round functions, neither strategy is clearly better. Secure and reasonably efficient ciphers can be designed either way, and compromises are common.
In [[cryptanalysis]] it is common to attack '''reduced round''' versions of a cipher. For example, in attacking a 16-round cipher, the analyst might start by trying to break a two-round or four-round version. Such attacks are much easier. Success against the reduced round version may lead to insights that are useful in work against the full cipher, or even to an attack that can be extended to break the full cipher.
=== Whitening and tweaking ===
Nearly all block ciphers use the same basic design, an iterated block cipher with multiple rounds. However, some have additional things outside that basic structure.
'''Whitening''' involves mixing additional material derived from the key into the plaintext before the first round, or into the ciphertext after the last round, or both. The technique was introduced by [[Ron Rivest]] in [[Data Encryption Standard#Variations on DES | DES-X]] and has since been used in other ciphers such as [[Rivest ciphers#RC6|RC6]], [[Blowfish (cipher)| Blowfish]] and [[Twofish]]. If the whitening material uses additional key bits, as in DES-X, then this greatly increases resistance to brute force attacks because of the larger key. If the whitening material is derived from the primary key during key scheduling, then resistance to brute force is not increased since the primary key remains the same size. However, using whitening is generally much cheaper than adding a round, and it does increase resistance to other attacks; see papers cited for [[Data Encryption Standard#Variations on DES | DES-X]].
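A minimal Python sketch of whitening in the DES-X style follows: extra key material is XORed into the block before and after an unchanged core cipher. The core cipher here is a hypothetical stand-in (an addition and a rotation on 64 bits) that is not secure and is not DES; only the wrapping is the point.

```python
MASK64 = (1 << 64) - 1


def toy_core_encrypt(block: int, key: int) -> int:
    """Stand-in for a real 64-bit block cipher; NOT secure."""
    x = (block + key) & MASK64
    return ((x << 13) | (x >> 51)) & MASK64  # rotate left by 13


def toy_core_decrypt(block: int, key: int) -> int:
    x = ((block >> 13) | (block << 51)) & MASK64  # rotate right by 13
    return (x - key) & MASK64


def whitened_encrypt(block: int, key: int, pre_white: int, post_white: int) -> int:
    """XOR whitening material in before the first round and after the last."""
    return toy_core_encrypt(block ^ pre_white, key) ^ post_white


def whitened_decrypt(block: int, key: int, pre_white: int, post_white: int) -> int:
    """Strip the whitening in the opposite order."""
    return toy_core_decrypt(block ^ post_white, key) ^ pre_white


key, k1, k2 = 0x0123456789ABCDEF, 0x1111111111111111, 0x2222222222222222
ciphertext = whitened_encrypt(0xFEEDFACECAFEBEEF, key, k1, k2)
assert whitened_decrypt(ciphertext, key, k1, k2) == 0xFEEDFACECAFEBEEF
```

The cost of whitening is two XOR operations per block, which illustrates why it is cheaper than adding a round.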
A recent development is the '''tweakable''' block cipher
<ref>{{citation
| title=Tweakable Block Ciphers
| author=M. Liskov, R. Rivest, and D. Wagner
| journal=LNCS, Crypto 2002
| publisher=Springer Verlag
| date=2002
| url=http://www.eecs.berkeley.edu/~daw/papers/tweak-crypto02.pdf
}}</ref>.
Where a normal block cipher has only two inputs, plaintext and key, a tweakable block cipher has a third input called the '''tweak'''. The tweak, along with the key, controls the operation of the cipher. Whitening can be seen as one form of tweaking, but many others are possible.
If changing tweaks is sufficiently lightweight, compared to the key scheduling operation which is often fairly expensive, then some new [[Block_cipher_modes_of_operation#Tweakable_modes | modes of operation]] become possible. Unlike the key, the tweak need not always be secret, though it should be somewhat [[random number |random]] and in some applications it should change from block to block. Tweakable ciphers and the associated modes are an active area of current research.
The [[#Hasty Pudding|Hasty Pudding Cipher]] was one of the first tweakable ciphers, pre-dating the ''Tweakable Block Ciphers'' paper and referring to what would now be called the tweak as "spice".
=== Avalanche ===
The designer wants changes to quickly propagate through the cipher. This was named the '''avalanche effect''' in a paper <ref>{{ citation | author = Horst Feistel | title = Cryptography and Computer Privacy | journal = Scientific American | date = 1973 | url = http://www3.edgenet.net/dcowley/docs.html }}</ref> by [[Horst Feistel]]. The idea is that changes should build up like an avalanche, so that a tiny initial change quickly creates large effects. The term and its exact application were new, but the basic concept was not; avalanche is a variant of [[Claude Shannon]]'s diffusion and that in turn is a formalisation of ideas that were already in use.
If a single bit of input is changed at round <math>n</math>, that should affect all bits of the ciphertext by round <math>n+x</math> for some reasonably small <math>x</math>. Ideally, <math>x</math> would be 1, but this is not generally achieved in practice. Certainly <math>x</math> must be much less than the total number of rounds; if <math>x</math> is large, then the cipher will need more rounds to be secure.
The '''strict avalanche criterion''' <ref name=SAC> {{citation | author = A. F. Webster and [[Stafford E. Tavares]] | title = On the design of S-boxes | journal = Advances in Cryptology - Crypto '85 (Lecture Notes in Computer Science) | date = 1985 }} </ref> is a strong version of the requirement for good avalanche properties. Complementing any single bit of the input or the key should give exactly a 50% chance of a change in any given bit of output.
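Avalanche behaviour can be estimated empirically. The Python sketch below complements one randomly chosen input bit and counts how many output bits change, averaged over many trials. It measures only the average flip rate over all output bits, which is weaker than the strict criterion (that requires 50% for every input-bit and output-bit pair). As a well-mixed test function it assumes SHA-256 truncated to 32 bits, which is a hash rather than a block cipher; the identity function serves as a deliberately bad example.

```python
import hashlib
import random


def average_flip_rate(func, in_bits: int, out_bits: int, samples: int = 200, seed: int = 1) -> float:
    """Estimate the chance that an output bit changes when one input bit is complemented."""
    rng = random.Random(seed)
    flipped = 0
    for _ in range(samples):
        x = rng.getrandbits(in_bits)
        bit = rng.randrange(in_bits)
        difference = func(x) ^ func(x ^ (1 << bit))
        flipped += bin(difference).count("1")
    return flipped / (samples * out_bits)


def identity(x: int) -> int:
    """No mixing at all: each input bit affects exactly one output bit."""
    return x


def sha256_32(x: int) -> int:
    """First 32 bits of SHA-256 of the 4-byte input (assumed well mixed)."""
    digest = hashlib.sha256(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


print("identity :", average_flip_rate(identity, 32, 32))
print("sha256_32:", average_flip_rate(sha256_32, 32, 32))
```

A rate close to 0.5 is what good avalanche looks like; a rate far below it means changes are not propagating.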
=== Cipher structures ===
In [[Claude Shannon]]'s
<ref>{{citation
| author = C. E. Shannon
| title = Communication Theory of Secrecy Systems
| journal = Bell System Technical Journal
| volume = 28
| date = 1949
| pages = pp.656-715
| url = http://www.prism.net/user/dcowley/docs.html }}</ref>
terms, a cipher needs both '''confusion''' and '''diffusion''', and a general design principle is that of the '''product cipher''' which combines several operations to achieve both goals. This goes back to the combination of substitution and transposition in various [[Cipher#Classical_cipher_components | classical ciphers]] from before the advent of computers. All modern block ciphers are product ciphers.
Two structures are very commonly used in building block ciphers: SP networks and the Feistel structure. In Shannon's terms, both are product ciphers. Either structure is a known quantity for a cipher designer, part of the toolkit. He or she gets big chunks of a design (an overall cipher structure with a well-defined hole for the round function to fit into) from the structure. This leaves him or her free to concentrate on the hard part, designing the actual round function. Neither structure gives ideal avalanche in a single round but, with any reasonable round function, both give excellent avalanche after a few rounds.
Not all block ciphers use one of these structures, but most do. This section describes these two common structures.
==== SP networks ====
A '''substitution-permutation network''' or '''SP network''' or '''SPN''' is Shannon's own design for a product cipher. It uses two layers in each round: a '''substitution layer''' provides confusion, then a '''permutation layer''' provides diffusion.
The '''S-layer''' typically uses look-up tables called substitution boxes or '''S-boxes''', though other mechanisms are also possible. The input is XOR-ed with a round key, split into parts and each part used as an index into an S-box. The S-box output then replaces that part so the combined S-box outputs become the S-layer output. S-boxes are discussed in more detail in their own [[#S-boxes | section]] below.
The '''P-layer''' permutes the resulting bits, providing diffusion or in Feistel's terms helping to ensure avalanche.
A single round of an SP network does not provide ideal avalanche; output bits are affected only by inputs to their S-box, not by all input bits. However, the P-layer ensures that the output of one S-box in one round will affect several S-boxes in the next round so, after a few rounds, overall avalanche properties can be very good.
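One round of a small SP network can be sketched in Python as follows. The 16-bit block, the 4-bit S-box values and the bit permutation are all assumptions chosen for illustration (the S-box is simply some permutation of 0..15, and the P-layer sends each output bit of one S-box to a different S-box). The sketch also shows why every layer has to be invertible for decryption.

```python
# An arbitrary 4-bit permutation used as the S-box (illustrative values).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(v) for v in range(16)]


def substitute(block: int, box: list) -> int:
    """S-layer: split the 16-bit block into four nibbles and look each one up."""
    out = 0
    for nibble in range(4):
        shift = 4 * nibble
        out |= box[(block >> shift) & 0xF] << shift
    return out


def permute(block: int) -> int:
    """P-layer: move bit i to position (i % 4) * 4 + i // 4.

    This spreads the four output bits of each S-box across all four
    S-boxes of the next round. The mapping is its own inverse.
    """
    out = 0
    for i in range(16):
        if (block >> i) & 1:
            out |= 1 << ((i % 4) * 4 + i // 4)
    return out


def spn_round(block: int, round_key: int) -> int:
    return permute(substitute(block ^ round_key, SBOX))


def spn_round_inverse(block: int, round_key: int) -> int:
    return substitute(permute(block), INV_SBOX) ^ round_key


assert spn_round_inverse(spn_round(0x26B7, 0x3A94), 0x3A94) == 0x26B7
```

After one round each output bit still depends on only one input nibble; it is the repetition of rounds that lets the P-layer carry changes to every S-box.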
==== Feistel structure ====
Another way to build an iterated block cipher is to use the '''Feistel structure'''. This technique was devised by [[Horst Feistel]] of IBM and used in [[#DES|DES]]. Such ciphers are known as '''Feistel ciphers''' or '''Feistel networks'''. In Shannon's terms, they are another class of product cipher.
Feistel ciphers are sometimes referred to as '''Luby-Rackoff''' ciphers after the authors of a theoretical paper
<ref>{{citation
| author = M. Luby and C. Rackoff
| title = How to Construct Pseudorandom Permutations and Pseudorandom Functions
| journal = SIAM J. Comput
| date = 1988
}}</ref>
analyzing some of their properties. Later work
<ref>{{citation
| author = Jacques Patarin
| title = Luby-Rackoff: 7 Rounds Are Enough for Security
| journal = Lecture Notes in Computer Science
| volume = 2729
| date = Oct 2003
| pages = 513 - 529
}}</ref>
based on that shows that a Feistel cipher with seven rounds can be secure.
In a Feistel cipher, each round uses an operation called the '''F function''' whose input is half a block and a round key; the output is a half-block of scrambled data which is XORed into the other half block of text. The rounds alternate direction: in one round, data from the left half-block is input and the right half-block is changed; in the next round that is reversed.
Showing the half-blocks as left and right, XOR as ^ and the round key for round n as k<sub>n</sub>, even-numbered rounds are then:
: left<sub>n</sub> = left<sub>n-1</sub> ^ F(right<sub>n-1</sub>, k<sub>n</sub>)
: right<sub>n</sub> = right<sub>n-1</sub>
and odd-numbered rounds are
: right<sub>n</sub> = right<sub>n-1</sub> ^ F(left<sub>n-1</sub>, k<sub>n</sub>)
: left<sub>n</sub> = left<sub>n-1</sub>
Since XOR is its own inverse (a^b^b=a for any a,b) and the half-block that is used as input to the F function is unchanged in each round, reversing a Feistel round is straightforward. Just calculate the F function again with the same inputs and XOR the result into the ciphertext to cancel out the previous XOR. For example, the decryption step matching the first example above is:
: left<sub>n-1</sub> = left<sub>n</sub> ^ F(right<sub>n</sub>, k<sub>n</sub>)
: right<sub>n-1</sub> = right<sub>n</sub>
In some ciphers, including those based on SP networks, all operations must be reversible so that decryption can work. The main advantage of a Feistel cipher over an SP network is that the F function itself need not be reversible, only repeatable. This gives the designer extra flexibility; almost any operation he can think up can be used in the F function. On the other hand, in the Feistel construction, only half the output changes in each round while an SP network can change all of it in a single round.
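The round equations and the decryption argument above can be sketched in Python. The F function here is a toy (squaring modulo 2<sup>32</sup>) chosen because it is visibly not invertible; it is an assumption for illustration and would be hopeless in a real cipher. Rounds are numbered from 0, so even-numbered rounds change the left half and odd-numbered rounds change the right half.

```python
MASK32 = 0xFFFFFFFF


def f_function(half: int, round_key: int) -> int:
    """Toy F function: repeatable but not invertible (x and -x square to the same value)."""
    x = (half + round_key) & MASK32
    return (x * x) & MASK32


def feistel_round(left: int, right: int, n: int, round_key: int):
    """Apply round n; the half used as input to F is left unchanged."""
    if n % 2 == 0:
        left ^= f_function(right, round_key)
    else:
        right ^= f_function(left, round_key)
    return left, right


def encrypt(left: int, right: int, round_keys: list):
    for n, key in enumerate(round_keys):
        left, right = feistel_round(left, right, n, key)
    return left, right


def decrypt(left: int, right: int, round_keys: list):
    # The same round is its own inverse; only the order of rounds is reversed.
    for n in reversed(range(len(round_keys))):
        left, right = feistel_round(left, right, n, round_keys[n])
    return left, right


keys = [0x0F1E2D3C, 0x4B5A6978, 0x8796A5B4, 0xC3D2E1F0]
assert f_function(1, 0) == f_function(MASK32, 0)  # two inputs, one output
assert decrypt(*encrypt(0x01234567, 0x89ABCDEF, keys), keys) == (0x01234567, 0x89ABCDEF)
```

Decryption works even though the F function cannot be run backwards, because it is only ever recomputed with the same inputs.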
There is a variant called an unbalanced Feistel cipher in which the block is split into two unequal-sized pieces rather than two equal halves. [[Skipjack]] was a well-known example. There are also variations which treat the text as four blocks rather than just two; [[#MARS|MARS]] and [[CAST (cipher)#CAST-256|CAST-256]] are examples.
A single round in a Feistel cipher has less than ideal avalanche properties; only half the output is changed. However, the other half is changed in the next round so, with a good F function, a Feistel cipher can have excellent overall avalanche properties within a few rounds.
The hard part of Feistel cipher design is of course the F function. Design goals include efficiency, easy implementation, and good avalanche properties. Also, it is critically important that the F-function be highly nonlinear. All other operations in a Feistel cipher are linear and a cipher without enough nonlinearity is weak; see the next section.
=== Nonlinearity ===
To be secure, '''every cipher must contain nonlinear operations'''. If all operations in a cipher were linear &mdash; in any algebraic system, with the attacker making the choice of system and perhaps trying more than one &mdash; then the cipher could be reduced to a system of linear equations and broken by an [[algebraic attack]]. The attacker can also try [[linear cryptanalysis]]. If he can find a good enough linear ''approximation'' for the round function and has enough known plaintext/ciphertext pairs, then this will break the cipher. Defining "enough" in the two places where it occurs in the previous sentence is tricky; see [[linear cryptanalysis]].
What makes these attacks impractical is a combination of the sheer size of the system of equations used (large block size, whitening, and more rounds all increase this) and nonlinearity in the relations involved. In any algebra, solving a system of ''linear'' equations is more-or-less straightforward provided there are more equations than variables. However, solving ''nonlinear'' systems of equations is far harder, so the cipher designer strives to introduce [[nonlinearity]] to the system, preferably to have at least some components that are ''not even close to linear''. Combined with good avalanche properties and enough rounds, this makes both direct algebraic analysis and [[linear cryptanalysis]] prohibitively difficult.
There are several ways to add nonlinearity; some ciphers rely on only one while others use several.
One method is '''mixing operations from different algebras'''. If the cipher relies only on Boolean operations, the cryptanalyst can try to attack using Boolean algebra; if it uses only arithmetic operations, he can try normal algebra. If it uses both, he has a problem. Of course arithmetic operations can be expressed in Boolean algebra or vice versa, but the expressions are inconveniently (for the cryptanalyst!) complex and nonlinear whichever way he tries it.
For example, in the [[Blowfish (cipher)| Blowfish]] F function, it is necessary to combine four 32-bit words into one. This is not done with a straightforward x = a+b+c+d or x=a^b^c^d but instead with x = ((a+b)^c)+d. On most computers this costs no more, but it makes the analyst's job harder.
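The difference between the two ways of combining words can be seen in a short Python sketch. It treats the first word as the variable, holds the others fixed, and checks whether the combination behaves as an affine function over XOR, that is, whether g(x ^ y) = g(x) ^ g(y) ^ g(0). The function names and test values are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mix_mixed(a: int, b: int, c: int, d: int) -> int:
    """((a + b) ^ c) + d with 32-bit wraparound: addition and XOR interleaved."""
    return ((((a + b) & MASK32) ^ c) + d) & MASK32


def mix_xor(a: int, b: int, c: int, d: int) -> int:
    """a ^ b ^ c ^ d: XOR only."""
    return a ^ b ^ c ^ d


def xor_affine_in_first_word(mix, pairs, b: int = 1, c: int = 0, d: int = 0) -> bool:
    """True if g(x) = mix(x, b, c, d) satisfies g(x ^ y) == g(x) ^ g(y) ^ g(0) on all pairs."""
    g = lambda x: mix(x, b, c, d)
    return all(g(x ^ y) == g(x) ^ g(y) ^ g(0) for x, y in pairs)


pairs = [(1, 2), (3, 12), (0x0F0F, 0xF0F0)]
print("XOR only :", xor_affine_in_first_word(mix_xor, pairs))
print("mixed ops:", xor_affine_in_first_word(mix_mixed, pairs))
```

The XOR-only combination passes the check for every pair, while the mixed one fails as soon as an addition produces a carry.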
Other operations can also be used, albeit at higher costs. [[International Data Encryption Algorithm | IDEA]] uses multiplication modulo 2<sup>16</sup>+1 and [[#AES | AES]] does matrix multiplications with polynomials in a [[Galois field]].
'''Rotations''', also called '''circular shifts''', on words or registers are nonlinear in normal algebra, though they are easily described in Boolean algebra. [[GOST cipher| GOST]] uses rotations by a constant amount, [[CAST (cipher)#CAST-128 | CAST-128]] and [[CAST (cipher)#CAST-256 | CAST-256]] use a key-dependent rotation in the F function, and [[Rivest ciphers#RC5 | RC5]], [[Rivest ciphers#RC6 | RC6]] and [[#MARS | MARS]] all use data-dependent rotations.
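A Python sketch of a 32-bit rotation illustrates the claim: the rotation distributes over XOR (it only moves bits around) but not over addition modulo 2<sup>32</sup>. The data-dependent variant takes its rotation amount from the low five bits of a second word, which is an assumption made for this sketch rather than a description of any particular cipher.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, r: int) -> int:
    """Rotate the 32-bit word x left by r bits."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def data_dependent_rotl(x: int, y: int) -> int:
    """Rotate x left by an amount taken from the low five bits of y (assumed convention)."""
    return rotl32(x, y & 31)


x = y = 0x80000000
z = 0x12345678

# Linear over XOR: rotating the XOR equals the XOR of the rotations.
xor_linear = rotl32(x ^ z, 3) == rotl32(x, 3) ^ rotl32(z, 3)

# Not linear over addition: here x + y wraps to 0, but the rotated words add to 2.
add_linear = rotl32((x + y) & MASK32, 1) == (rotl32(x, 1) + rotl32(y, 1)) & MASK32

print(xor_linear, add_linear)
```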
A general operation for introducing nonlinearity is the substitution box or '''S-box'''; see following section.
Nonlinearity is also an important consideration in the design of [[stream cipher]]s and [[cryptographic hash]] algorithms. For hashes, much of the mathematics and many of the techniques used are similar to those for block ciphers. For stream ciphers, rather different mathematics and methods apply (see [[Berlekamp-Massey algorithm]] for example), but the basic principle is the same.
=== S-boxes ===
S-boxes or '''substitution boxes''' are look-up tables. The basic operation involved is a = sbox[b] which, at least for reasonable sizes of a and b, is easily done on any computer.
There is an extensive literature on the design of good S-boxes, much of it emphasizing achieving high nonlinearity though other criteria are also used. See [[Block_cipher/External_Links#Other_links| external links]] and references below.
S-boxes are described as <math>m*n</math> or <math>m</math> by <math>n</math>, with <math>m</math> representing the number of input bits and <math>n</math> the number of output bits. For example, [[#DES|DES]] uses 6 by 4 S-boxes. The storage requirement for an <math>m*n</math> S-box is 2<sup>m</sup>*n bits, so large values of <math>m</math> (many input bits) are problematic. Values up to eight are common; going much beyond that would be expensive. Large values of <math>n</math> (many output bits) are not a problem; 32 is common and at least one system, the Tiger hash algorithm
<ref>{{citation
| title=Tiger: a fast new hash function
| author=Ross Anderson & Eli Biham
| journal=Fast Software Encryption, Third International Workshop Proceedings
| date= 1996
| url=http://www.cs.technion.ac.il/~biham/Reports/Tiger/}}</ref>,
uses 64.
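The storage formula is easy to evaluate in Python. The sketch below applies 2<sup>m</sup>*n bits to three shapes: the 6 by 4 boxes used in DES, an 8 by 32 box, and a hypothetical 32 by 16 box included only to show how quickly the cost grows with the number of input bits.

```python
def sbox_storage_bits(m: int, n: int) -> int:
    """Storage for an m-by-n S-box: 2**m entries of n bits each."""
    return (2 ** m) * n


for m, n in ((6, 4), (8, 32), (32, 16)):
    size_bytes = sbox_storage_bits(m, n) // 8
    print(f"{m} by {n}: {size_bytes:,} bytes")
```

Each extra input bit doubles the table, while each extra output bit adds only one bit per entry.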
S-boxes are often used in the S-layer of an [[#SP networks | SP network]]. In this application, the S-box must have an inverse to be used in decryption. It must therefore have the same number of bits for input and output; only <math>n*n</math> S-boxes can be used. For example, [[#AES|AES]] is an SP network with a single <math>8*8</math> S-box and [[#Serpent|Serpent]] is one with eight <math>4*4</math> S-boxes. Another common application is in the F function of a [[#Feistel structure | Feistel cipher]]. Since the F function need not be reversible, there is no need to construct an inverse S-box for decryption and S-boxes of any size may be used.
With either an SP network or a Feistel construction, '''nonlinear S-boxes and enough rounds give a highly nonlinear cipher'''.
==== Large S-boxes ====
The first generation of Feistel ciphers used relatively small S-boxes, <math>6*4</math> for [[Data Encryption Standard|DES]] and <math>4*4</math> for [[GOST cipher|GOST]]. Later Feistel ciphers use larger ones, <math>8*32</math> for both [[CAST (cipher)| CAST]] and [[Blowfish (cipher) | Blowfish]]. They are primarily designed for software implementation, rather than the 1970s hardware DES was designed for, so looking up a full computer word at a time makes sense. An 8*32 S-box takes one kilobyte of storage; several can be used on a modern machine without difficulty.
In earlier Feistel ciphers, [[Data Encryption Standard | DES]] and [[GOST cipher| GOST]], the F function is essentially one round of an [[#SP networks | SP network]]. These ciphers use eight 6*4 or 4*4 S-boxes to get 32 bits of S-box output. Those bits, reordered by a simple transformation, become the 32-bit output of the F function. Avalanche properties are less than ideal since each output bit depends only on the inputs to one S-box. The output transformation compensates for this, ensuring that the output from one S-box in one round affects several S-boxes in the next round so that good avalanche is achieved after a few rounds.
Later Feistel ciphers such as [[CAST (cipher)| CAST]] and [[Blowfish (cipher) | Blowfish]] use bigger S-boxes and do not use S-box bits directly as F function output. Instead, they take 32-bit words from several S-boxes and combine them to form a 32-bit output, so that the '''F function has ideal avalanche properties''': every output bit depends on all S-box output words, and therefore on all input bits and all key bits. With the Feistel structure and such an F function, complete avalanche (all 64 output bits depend on all 64 input bits) is achieved in three rounds. This also requires fewer S-box lookups than the eight in DES or GOST so the F function, and therefore the whole cipher, can be reasonably efficient.
No output transformation is required in such an F function, but one may be used anyway; [[CAST (cipher)#CAST-128|CAST-128]] has a key-dependent rotation.
==== S-box design ====
The CAST S-boxes use [[bent function]]s (the most highly nonlinear Boolean functions) as their columns. That is, the mapping from all the input bits to any single output bit is a bent function. Such S-boxes meet the '''strict avalanche criterion''' <ref name="SAC" />; not only does every bit of round input and every bit of round key affect every bit of round output, but complementing any input bit has exactly a 50% chance of changing any given output bit. A paper on generating the S-boxes is Mister & Adams "Practical S-box Design"
<ref name="sbox">{{citation
| author = S. Mister, C. Adams
| title = Practical S-Box Design
| journal = Selected Areas in Cryptography (SAC '96)
| date = August, 1996
| url=http://adonis.ee.queensu.ca:8000/sac/sac96/papers/paper7.ps
| pages = 61-76 }}</ref>.
Bent functions are combined to get additional desirable traits: a balanced S-box (equal probability of 0 and 1 output), minimum correlation among output bits, and high overall S-box nonlinearity.
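What "most highly nonlinear" means can be made concrete with the Walsh transform, which measures how well a Boolean function agrees with each linear function of its inputs. The Python sketch below uses a standard textbook example of a bent function on four bits, f = x1 x2 XOR x3 x4; this is an illustrative choice and is not claimed to be one of the functions used in CAST. For a bent function every Walsh coefficient has the same magnitude, so no linear function approximates it better than any other.

```python
def parity(x: int) -> int:
    """XOR of all bits of x."""
    return bin(x).count("1") & 1


def walsh_spectrum(truth_table: list, n_bits: int) -> list:
    """Walsh coefficient for each mask a: sum over x of (-1)**(f(x) XOR a.x)."""
    size = 2 ** n_bits
    return [sum((-1) ** (truth_table[x] ^ parity(a & x)) for x in range(size))
            for a in range(size)]


def nonlinearity(truth_table: list, n_bits: int) -> int:
    """Distance from the nearest affine function."""
    return 2 ** (n_bits - 1) - max(abs(w) for w in walsh_spectrum(truth_table, n_bits)) // 2


def bent_example(x: int) -> int:
    """f(x1, x2, x3, x4) = x1*x2 XOR x3*x4."""
    x1, x2, x3, x4 = (x >> 3) & 1, (x >> 2) & 1, (x >> 1) & 1, x & 1
    return (x1 & x2) ^ (x3 & x4)


BENT_TABLE = [bent_example(x) for x in range(16)]
LINEAR_TABLE = [(x >> 3) & 1 for x in range(16)]  # f = x1, a purely linear function

print("bent  :", nonlinearity(BENT_TABLE, 4), sorted(set(map(abs, walsh_spectrum(BENT_TABLE, 4)))))
print("linear:", nonlinearity(LINEAR_TABLE, 4))
```

The example function outputs 1 for only 6 of its 16 inputs, which illustrates why bent columns have to be combined or adjusted to obtain a balanced S-box.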
[[Blowfish (cipher) | Blowfish]] uses a different approach, generating random S-boxes as part of the key scheduling operation at cipher setup time. Such S-boxes are not as non-linear as the carefully constructed CAST ones, but they are non-linear enough and, unlike the CAST S-boxes, they are unknown to an attacker.
In '''perfectly nonlinear S-boxes'''
<ref>{{citation
| author = Kaisa Nyberg
| title = Perfect nonlinear S-boxes
| journal = Eurocrypt'91, LNCS 547
| publisher = Springer-Verlag
| date = 1991 }}</ref>,
not only are all columns [[bent function]]s (the most nonlinear possible Boolean functions), but all linear combinations of columns are bent functions as well. For an S-box with <math>m</math> input bits and <math>n</math> output bits, this is possible only if <math>m \geq 2n</math>; that is, there must be at least twice as many input bits as output bits. Such S-boxes are therefore not much used.
==== S-boxes in analysis ====
S-boxes are sometimes used as an analytic tool even for operations that are not actually implemented as S-boxes. Any operation whose output is fully determined by its inputs can be described by an S-box: concatenate all inputs into an index, look that index up, and read off the output. For example, the [[International Data Encryption Algorithm | IDEA]] cipher uses a multiplication operation with two 16-bit inputs and one 16-bit output; it can be modeled as a 32 by 16 S-box. In an academic paper, one might use such a model in order to apply standard tools for measuring S-box nonlinearity. A well-funded cryptanalyst might actually build the S-box (2<sup>32</sup> entries of two bytes each, or 8 gigabytes of memory) either to use in his analysis or to speed up an attack.
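The IDEA example can be made concrete with a short sketch. IDEA's multiplication works modulo 2<sup>16</sup> + 1, with the all-zero word standing for 2<sup>16</sup>; the helper names below are chosen for this illustration.

```python
def idea_mul(a, b):
    """IDEA-style multiplication modulo 2**16 + 1.

    A 16-bit word of 0 represents the value 2**16, so every word maps to
    a nonzero residue and the operation is invertible.
    """
    a = a or 0x10000
    b = b or 0x10000
    return ((a * b) % 0x10001) & 0xFFFF  # a result of 2**16 is stored as 0


def sbox_index(a, b):
    """Concatenate the two 16-bit inputs into one 32-bit S-box index."""
    return (a << 16) | b


def sbox_size_bytes(input_bits, output_bits):
    """Memory needed to tabulate an S-box, assuming whole-byte entries."""
    return (1 << input_bits) * ((output_bits + 7) // 8)
```

With these definitions, `sbox_size_bytes(32, 16)` is 2<sup>33</sup> bytes, which matches the 8 gigabyte figure given above.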
=== Resisting linear & differential attacks ===
Two very powerful [[cryptanalysis | cryptanalytic]] methods of attacking block ciphers are [[linear cryptanalysis]] and [[differential cryptanalysis]]. The former works by finding linear approximations for the non-linear components of a cipher, then combining them using the [[piling-up lemma]] to attack the whole cipher. The latter looks at how small changes in the input affect the output, and how such changes propagate through multiple rounds. These are the only known attacks that break [[#DES| DES]] with less effort than brute force, and they are completely general attacks that apply to any block cipher.
These attacks, however, require large numbers of known or chosen plaintexts, so a simple defense against them is to re-key often enough that the enemy cannot collect sufficient texts.
Techniques introduced for [[CAST (cipher)|CAST]] go further, building a cipher that is '''provably immune to linear or differential analysis''' with any number of texts. The method, taking linear cryptanalysis as our example and abbreviating it LC, is as follows:
: start from properties of the round function (for CAST, from bent functions in the S-boxes)
: derive a limit m, the maximum possible quality of any linear approximation to a single round
: consider the number of rounds, r, as a variable
: derive an expression for e, the effort required to break the cipher by LC, in terms of r and m
: find the minimum r such that e exceeds the effort required for brute force, making LC ''impractical''
: derive an expression for c, the number of chosen plaintexts required for LC, also in terms of r and m
: (LC with only known plaintext requires more texts, so it can be ignored)
: find the minimum r such that c exceeds the number of ''possible'' plaintexts, 2<sup>blocksize</sup>, making LC ''impossible''
A similar approach applied to differentials gives values for r that make differential cryptanalysis impractical or impossible. Choose the actual number of rounds so that, at a minimum, both attacks are impractical. Ideally, make both impossible, then add a safety factor.
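The round-count calculation described above can be sketched numerically. The model here is a deliberate simplification and an assumption of this example, not the analysis from the CAST papers: each round contributes one linear approximation with bias at most m, the biases combine by the piling-up lemma to 2<sup>r-1</sup>m<sup>r</sup>, and the number of texts needed is taken as roughly the inverse square of the combined bias.

```python
import math


def log2_texts_needed(rounds, m):
    """Base-2 log of the texts needed for LC under the simplified model.

    m is the assumed maximum bias of any one-round linear approximation.
    Piling-up lemma: combined bias = 2**(rounds - 1) * m**rounds.
    Texts needed are approximated as 1 / bias**2.
    """
    log2_bias = (rounds - 1) + rounds * math.log2(m)
    return -2.0 * log2_bias


def min_rounds(m, log2_threshold, max_rounds=1024):
    """Smallest round count whose text requirement exceeds 2**log2_threshold.

    Returns None if no count up to max_rounds suffices (for example when
    m is so large that extra rounds do not reduce the combined bias).
    """
    for r in range(1, max_rounds + 1):
        if log2_texts_needed(r, m) > log2_threshold:
            return r
    return None
```

For instance, with an assumed per-round bias limit of 2<sup>-3</sup> and a 64-bit block, `min_rounds(2**-3, 64)` asks how many rounds are needed before the attack would require more texts than the 2<sup>64</sup> that exist. Working in logarithms avoids floating-point underflow for large round counts.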
This type of analysis is now a standard part of the cryptographer's toolkit. Many of the [[#The_AES_generation | AES candidates]], for example, included proofs along these lines in their design documentation, and [[#AES|AES]] itself uses such a calculation to determine the number of rounds required for various key sizes.
== DES and alternatives ==
The era effectively ended when the US government began working on a new cipher standard to replace their Data Encryption Standard, the [[#AES | Advanced Encryption Standard]] or AES. A whole new generation of ciphers arose, the first 21st century block ciphers. Of course these designs still drew on the experience gained in the post-DES generation, but overall these ciphers are quite different. In particular, they all use 128-bit blocks and most support key sizes up to 256 bits.
== The AES generation ==
By the 90s, the [[Data Encryption Standard]] was clearly obsolete; its small key size made it more and more vulnerable to [[brute force]] attacks as computers became faster. The US [[National Institute of Standards and Technology]] (NIST) therefore began work on an [[Advanced Encryption Standard]], '''AES''', a block cipher to replace DES in government applications and in regulated industries.
To do this, they ran a very open international [[AES competition]], starting in 1997. Their requirements specified a block cipher with 128-bit [[#Block_size | block size]] and support for 128-bit, 192-bit and 256-bit [[#Key size | key sizes]]. Evaluation criteria included security, performance on a range of platforms from 8-bit CPUs (e.g. in smart cards) up, and ease of implementation in both software and hardware.
Fifteen submissions meeting basic criteria were received. All were [[#Iterated block ciphers | iterated block ciphers]]; in Shannon's terms all were [[#Cipher_structures|product ciphers]]. Most used an [[#SP network | SP network]] or [[#Feistel structure | Feistel structure]], or variations of those. Several had [[#Resisting_linear_.26_differential_attacks |  proofs of resistance]] to various attacks. The [[AES competition]] article covers all candidates and many have their own articles as well. Here we give only a summary.
After much analysis and testing, and two conferences, the field was narrowed to five finalists:
* [[Twofish]], a cipher with key-dependent S-boxes, from a team at [[Bruce Schneier]]'s company Counterpane
* [[MARS (cipher)| MARS]], a variant of Feistel cipher using data-dependent rotations, from [[IBM]]
* [[Serpent (cipher)| Serpent]], an SP network, from an international group of well-known players
* [[Rivest ciphers#RC6 | RC6]], a cipher using data-dependent rotations, from a team led by [[Ron Rivest]]
* [[Advanced Encryption Standard | Rijndael]], an SP network, from two Belgian designers
After another year of analysis and testing, they chose a winner. In October 2000, Rijndael was chosen to become the [[Advanced Encryption Standard]] or AES; the standard was published in 2001 and took effect in 2002. See [[Block_cipher/External_Links#AES_links | external links]] for the official standard.
An entire [[#DES_and_alternatives|generation]] of block ciphers used the 64-bit block size of [[#DES|DES]], but since [[#AES|AES]] many new designs use a 128-bit block size.
As discussed under [[#Size parameters|size parameters]], if two or more ciphers have the same block and key sizes, then they are effectively interchangeable; replacing one cipher with another requires no other changes in an application. When asked to implement [[#AES|AES]], the implementer might include the other finalists ([[#Twofish | Twofish]], [[#Serpent | Serpent]], [[Rivest ciphers#RC6|RC6]] and [[#MARS | MARS]]) as well. This provides useful insurance against the (presumably unlikely) risk of someone finding a good attack on AES. Little extra effort is required since open source implementations of all these ciphers are readily available; see [[Block_cipher/External_Links#AES_links|external links]]. All except RC6 have completely open licenses.
There are also many other ciphers that might be used. There were ten AES candidates that did not make it into the finals:
* [[CAST (cipher)#CAST-256|CAST-256]], based on CAST-128 and with the same theoretical advantages
* [[DFC (cipher)| DFC]], based on another theoretical analysis proving resistance to various attacks
* [[Hasty Pudding (cipher)|Hasty Pudding]], a variable block size [[#Whitening_and_tweaking|tweakable]] cipher
* [[DEAL (cipher)|DEAL]]
* [[FROG (cipher)| FROG]], an innovative cipher; interesting but weak
* [[E2 (cipher)| E2]], from Japan
* [[CRYPTON (cipher)| CRYPTON]], a Korean cipher with some design similarities to AES
* [[Magenta (cipher)|Magenta]], Deutsche Telekom's candidate, quickly broken
* [[LOKI97]], one of the [[LOKI (cipher)|LOKI]] family of ciphers, from Australia
* [[SAFER+]], one of the [[SAFER (cipher)|SAFER]] family of ciphers, from Cylink Corporation
Some should not be considered. Magenta and FROG have been broken, DEAL is slow, and E2 has been replaced by its descendant Camellia.
There are also some newer 128-bit ciphers that are widely used in certain countries:
* [[Camellia (cipher)|Camellia]], an 18-round Feistel cipher widely used in Japan and one of the standard ciphers for the [[NESSIE]] (New European Schemes for Signatures, Integrity and Encryption) project.
* [[SEED (cipher)|SEED]], developed by the [[Korean Information Security Agency]] (KISA) and widely used in Korea.
== Large-block ciphers ==
For most applications a 64-bit or 128-bit block size is a fine choice; nearly all common block ciphers use one or the other. One exception is [[3-Way]], which uses 96 bits. Such ciphers can be used to encrypt objects larger than their block size; just choose an appropriate [[Block cipher modes of operation|mode of operation]].
However, a few ciphers supporting larger block sizes do exist; this section discusses them.
A block cipher with larger blocks may be more efficient; it takes fewer block operations to encrypt a given amount of data. It may also be more secure in some ways; diffusion takes place across a larger block size, so data is more thoroughly mixed, and large blocks make a [[code book attack]] more difficult. On the other hand, there are several drawbacks. Great care must be taken to ensure adequate diffusion within a block, so a large-block cipher may need more rounds. Larger blocks require more padding. There is not a great deal of literature on designing and attacking such ciphers, so it is hard to know if one is secure. Finally, large-block ciphers are inconvenient for some applications and simply do not fit in some protocols.
Some block ciphers, such as [[Tiny Encryption Algorithm| Block TEA]] and [[Hasty Pudding (cipher)|Hasty Pudding]], support variable block sizes. They may therefore be both efficient and convenient in applications that need to encrypt many items of a fixed size, for example disk blocks or database records. However, just using the cipher in [[Block_cipher_modes_of_operation#Electronic_Code_Book.2C_ECB | ECB mode]] to encrypt each block under the same key is unwise, especially if encrypting many objects. With ECB mode, identical blocks will encrypt to the same ciphertext and give the enemy some information. One solution is to use a [[#Whitening_and_tweaking| tweakable]] cipher such as Hasty Pudding with the block number or other object identifier as the tweak. Another is to use [[Block_cipher_modes_of_operation#Cipher_Block_Chaining.2C_CBC |CBC mode]] with an initialisation vector derived from an object identifier.
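The difference between ECB and CBC with an identifier-derived initialisation vector can be seen in a small sketch. Everything here is an assumption made for illustration: `toy_encrypt_block` is a placeholder with no security at all (it only XORs with the key), the IV derivation by hashing key and identifier is one plausible choice rather than a recommendation, and the data length is assumed to be a multiple of the block size.

```python
import hashlib

BLOCK = 16  # block size in bytes for this toy example


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    """Placeholder block 'cipher': XOR with the key. NOT secure."""
    return xor_bytes(block, key)


def ecb_encrypt(key, data):
    """Encrypt each block independently under the same key."""
    return b"".join(
        toy_encrypt_block(key, data[i:i + BLOCK])
        for i in range(0, len(data), BLOCK)
    )


def iv_from_identifier(key, object_id):
    """Derive an IV from an object identifier (hypothetical derivation)."""
    return hashlib.sha256(key + object_id.to_bytes(8, "big")).digest()[:BLOCK]


def cbc_encrypt(key, object_id, data):
    """CBC: each plaintext block is XORed with the previous ciphertext block."""
    prev = iv_from_identifier(key, object_id)
    out = []
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt_block(key, xor_bytes(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)
```

Encrypting a record made of two identical blocks shows the leak: under `ecb_encrypt` the two ciphertext blocks are equal, while under `cbc_encrypt` they differ, and the same record stored under two different identifiers gives different ciphertexts.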
[[Cryptographic hash]] algorithms can be built using a block cipher as a component. There are general-purpose methods for this that can use existing block ciphers; ''Applied Cryptography''
<ref name="schneier">{{citation
| first = Bruce | last = Schneier
| title = Applied Cryptography
| date = 2nd edition, 1996,
| publisher = John Wiley & Sons
|ISBN =0-471-11709-9}}</ref>
gives a long list and describes weaknesses in many of them. However, some hashes include a specific-purpose block cipher as part of the hash design. One example is [[Hash_(cryptography)#Whirlpool|Whirlpool]], a 512-bit hash using a block cipher similar in design to [[#AES|AES]] but with 512-bit blocks and a 512-bit key. Another is the [[Hash_(cryptography)#The_Advanced_Hash_Standard | Advanced Hash Standard]] candidate [[Hash_(cryptography)#Skein|Skein]] which uses a [[#Whitening_and_tweaking| tweakable]] block cipher called Threefish. Threefish has 256-bit, 512-bit and 1024-bit versions; in each version block size and key size are both that number of bits.
It is possible to go the other way and use any [[cryptographic hash]] to build a block cipher; again ''Applied Cryptography'' <ref name="schneier" /> has a list of techniques and describes weaknesses. The simplest method is to make a Feistel cipher with double the hash's block size; the F function is then just to hash text and round key together. This technique is rarely used, partly because a hash makes a rather expensive round function and partly because the block cipher block size would have to be inconveniently large; for example using a 160-bit hash such as [[SHA-1]] would give a 320-bit block cipher.
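A minimal sketch of the Feistel construction just described, using SHA-1 from Python's standard library, might look as follows. The order in which round key and text are hashed, the number of rounds, and the way round keys are supplied are all assumptions of this example; it is not Bernstein's code or any published design, and no security claim is made for it.

```python
import hashlib

HALF = 20  # SHA-1 output is 20 bytes, so the block is 40 bytes (320 bits)


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def f_function(half, round_key):
    """F function: hash the round key and half-block together."""
    return hashlib.sha1(round_key + half).digest()


def encrypt_block(block, round_keys):
    """One Feistel pass per round key: (L, R) -> (R, L xor F(R))."""
    if len(block) != 2 * HALF:
        raise ValueError("block must be 40 bytes")
    left, right = block[:HALF], block[HALF:]
    for k in round_keys:
        left, right = right, xor_bytes(left, f_function(right, k))
    return left + right


def decrypt_block(block, round_keys):
    """Undo the rounds in reverse order; F itself is never inverted."""
    if len(block) != 2 * HALF:
        raise ValueError("block must be 40 bytes")
    left, right = block[:HALF], block[HALF:]
    for k in reversed(round_keys):
        left, right = xor_bytes(right, f_function(left, k)), left
    return left + right
```

Decryption works even though the hash is one-way, because the Feistel structure only ever evaluates F in the forward direction.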
The hash-to-cipher technique was, however, important in one legal proceeding, the [[Cryptography_controversy#Export_Controls | Bernstein case]]. At the time, US law strictly controlled export of cryptography because of its possible military uses, but hash functions were allowed because they are designed to provide authentication rather than secrecy. Bernstein's code built a block cipher from a hash, effectively circumventing those regulations. Moreover, he sued the government over his right to publish his work, claiming the export regulations were an unconstitutional restriction on freedom of speech. The courts agreed, effectively striking down the export controls.
It is also possible to use a [[public key]] operation as a block cipher. For example, one might use [[RSA]] with 1024-bit keys as a block cipher with 1024-bit blocks. Since the round function is itself cryptographically secure, only one round is needed. However, this is rarely done; public key techniques are expensive so this would give a very slow block cipher. A much more common practice is to use [[public key]] methods, block ciphers, and [[cryptographic hash]]es together in a [[hybrid cryptosystem]].
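The idea of a public key operation acting as a single-round block cipher can be shown with textbook RSA on a deliberately tiny modulus. The primes and exponent below are the usual small teaching values, chosen for this illustration; real use needs keys of 1024 bits or more and proper padding, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
# Toy parameters for illustration only; far too small to be secure.
P, Q = 61, 53
N = P * Q                          # modulus: "blocks" are integers below N
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (modular inverse)


def rsa_encrypt_block(m):
    """Encrypt one block, an integer in range(N), with the public key."""
    if not 0 <= m < N:
        raise ValueError("block out of range")
    return pow(m, E, N)


def rsa_decrypt_block(c):
    """Decrypt one block with the private key."""
    if not 0 <= c < N:
        raise ValueError("block out of range")
    return pow(c, D, N)
```

Each block costs a modular exponentiation rather than a few table lookups and XORs, which is why this arrangement is so much slower than a conventional block cipher.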
==References==
{{reflist|2}}

Revision as of 10:59, 25 July 2009

DES and alternatives

The Data Encryption Standard, DES, is among the best known and most thoroughly analysed block ciphers. It was invented by IBM, and was made a US government standard for non-classified government data and for regulated industries such as banking, in the late 70s. From then until about the turn of the century, it was very widely used. It is now considered obsolete because its 56-bit key is too short to resist brute force attacks if the opponents have recent technology.

The DES standard marked the beginning of an era in work related to block ciphers. For an entire generation, every student of cryptanalysis tried to find a way to break DES and every student of cryptography tried to devise a cipher that was demonstrably better than DES. Very few succeeded.

Every new cryptanalytic technique invented since DES became a standard has been tested against DES. None of them have broken it completely, but two — differential cryptanalysis and linear cryptanalysis — give attacks theoretically significantly better than brute force. This does not appear to have much practical importance since both require enormous numbers of known or chosen plaintexts and DES can be broken by brute force with one known plaintext. All the older publicly known cryptanalytic techniques have also been tried, or at least considered, for use against DES; none of them work.

DES served as a sort of baseline for cipher design through the 80s and 90s; the design goal for almost any 20th century block cipher was to replace DES in some of its many applications with something faster, more secure, or both. All these ciphers used 64-bit blocks, like DES, but all used 128-bit or longer keys for better resistance to brute force. Many of the techniques used came from DES and many of the design principles came from analysis of DES.

Ciphers of this generation include:

  • The Data Encryption Standard itself, the first well-known Feistel cipher, using 16 rounds and eight 6 by 4 S-boxes.
  • The GOST cipher, a Soviet standard similar in design to DES, a 32-round Feistel cipher using eight 4 by 4 S-boxes.
  • IDEA, the International Data Encryption Algorithm, a European standard, not a Feistel cipher, with only 8 rounds and no S-boxes.
  • RC2, a Feistel cipher from RSA Security which was approved for easy export from the US (provided it was used with only a 40-bit key), and so was widely deployed.
  • RC5, a Feistel cipher from RSA Security. This was fairly widely deployed, often replacing RC2 in applications.
  • CAST-128, a widely used 16-round Feistel cipher, with 8 by 32 S-boxes.
  • Blowfish, another widely used 16-round Feistel cipher with 8 by 32 S-boxes.
  • The Tiny Encryption Algorithm, or TEA, designed to be very small and fast but still secure, a 32-round Feistel cipher without S-boxes.
  • Skipjack, an algorithm designed by the NSA for use in the Clipper chip, a 32-round unbalanced Feistel cipher.

Several of these ciphers introduced interesting new design ideas. The CAST ciphers were the first to use large S-boxes which allow the F function of a Feistel cipher to have ideal avalanche properties, and to use bent functions in the S-box columns. Blowfish introduced key-dependent S-boxes. RC5 was the first well-known cipher to use data-dependent rotations to achieve nonlinearity. IDEA uses a clever variant on multiplication to achieve nonlinearity.

The era effectively ended when the US government began working on a new cipher standard to replace their Data Encryption Standard, the Advanced Encryption Standard or AES. A whole new generation of ciphers arose, the first 21st century block ciphers. Of course these designs still drew on the experience gained in the post-DES generation, but overall these ciphers are quite different. In particular, they all use 128-bit blocks and most support key sizes up to 256 bits.