User talk:Sandy Harris/Sandbox
Principles and terms
In encryption and decryption, a key is one or more unique values used by an encryption or decryption algorithm. Encryption algorithms take as input a key and plaintext, producing ciphertext output. For decryption, the process is reversed to turn ciphertext into plaintext.
The system should be secure against an attacker who knows all its details except the key; this is known as Kerckhoffs' Principle.
Methods of defeating cryptosystems have a long history and an extensive literature; see cryptanalysis. Anyone designing or deploying a cryptosystem must take cryptanalytic results into account.
The ciphertext produced by an encryption algorithm should bear no resemblance to the original message. Ideally, it should be indistinguishable from a random string of symbols. Any non-random properties may provide an opening for a skilled cryptanalyst.
Keying
Even an excellent safe cannot protect against a thief who knows the combination. Even an excellent cipher cannot protect against an enemy who knows the key.
Many cryptographic techniques — block ciphers, stream ciphers, public key encryption, digital signatures, and hashed message authentication codes — depend on cryptographic keys. None of these can be secure if the key is not. Enemies can sometimes read encrypted messages without breaking the cipher; they use practical cryptanalysis techniques such as breaking into an office to steal keys.
The quality of the keys is almost as important as their secrecy. Keys need to be highly random, effectively impossible to guess. See random number for details. A key that an enemy can easily guess, or that he can find with a low-cost search, does not provide much protection. Using strong cryptography with a poor key is like buying good locks then leaving the key under the doormat.
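As a minimal sketch of what "highly random" means in practice, the following Python fragment draws key material from the operating system's cryptographically secure generator via the standard `secrets` module. The function name `generate_key` and the 32-byte default are illustrative assumptions, not something the text prescribes.

```python
import secrets


def generate_key(num_bytes: int = 32) -> bytes:
    """Return num_bytes of key material from the OS-backed secure generator.

    The 32-byte default (256 bits) is an assumption made for illustration.
    """
    return secrets.token_bytes(num_bytes)


key = generate_key()
print(len(key), "bytes of key material:", key.hex())
```

By contrast, general-purpose generators such as Python's `random` module are designed for simulation, not secrecy, and should not be used to produce keys.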
In applications which encrypt a large volume of data, any cipher must be re-keyed from time to time to prevent an enemy from accumulating large amounts of data encrypted with a single key. Such a collection facilitates some attacks — see code book attack, linear cryptanalysis and differential cryptanalysis in particular, and cryptanalysis in general. It also makes the payoff for breaking that key very large. Re-keying also limits the damage if a key is compromised in some other way. Neither block ciphers nor stream ciphers typically include a re-keying mechanism; some higher-level protocol manages that and re-keys the cipher using the normal keying mechanism.
In some applications, there are natural breaks where a new key should be used. For example it is natural to use a different key for each new message in a message-oriented protocol such as email, or for each new connection in a connection-oriented protocol such as SSH. This may be all the re-keying required. Or it may not; what if some users send multi-gigabyte emails or stay logged in for months?
In other applications, a mechanism for periodic re-keying is required. For a VPN connection between two offices, this would normally be the Internet Key Exchange protocol. For an embassy, it might be a clerk who changes the key daily and an officer who delivers more keys once a month, flying in with a briefcase handcuffed to his wrist.
There are many ways to manage keys, ranging from physical devices and smartcards to cryptographic techniques such as Diffie-Hellman. In some cases, an entire public key infrastructure may be involved. See key management for details.
External attacks
Any of the techniques of espionage — bribery, coercion, blackmail, deception ... — may be used to obtain keys; such methods are called practical cryptanalysis. In general, these methods work against the people and organisations involved, looking for human weaknesses or poor security procedures. They are beyond our scope here; see information security.
For computer-based security systems, host security is a critical prerequisite. No system can be secure if the underlying computer is not. Even systems generally thought to be secure, such as IPsec or PGP, are trivially easy to subvert for an enemy who has already subverted the machine they run on. See computer security.
For some systems, host security may be an impossible goal. Consider a Digital Rights Management system whose design goal is to protect content against the owner of the computer or DVD player it runs on. If that owner has full control over his device then the goal is not achievable.
Encrypting messages does not prevent traffic analysis; an enemy may be able to gain useful information from the timing, size, source and destination of traffic, even if he cannot read the contents.
Side channel attacks
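Side channel attacks target a system's physical implementation rather than the mathematics of the cipher; they exploit information leaked through channels such as electromagnetic radiation, timing and power consumption.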
For example, any electrical device handling fast-changing signals will produce electromagnetic radiation. An enemy might listen to the radiation from a computer or from crypto hardware. For the defenders, there are standards for limiting such radiation; see TEMPEST and protected distribution system.
Timing attacks make inferences from the length of time cryptographic operations take. These may be used against devices such as smartcards or against systems implemented on computers. Any cryptographic primitive — block cipher, stream cipher, public key or cryptographic hash — can be attacked this way. Power analysis has also been used, in much the same way as timing. The two may be combined.
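One common place where timing leaks arise is the comparison of a secret value (such as an authentication tag) with an attacker-supplied guess. The sketch below contrasts a naive comparison, whose running time depends on how many leading bytes match, with `hmac.compare_digest` from the Python standard library, which is designed to avoid that dependence. The function names are hypothetical and chosen only for this illustration.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Naive comparison: returns at the first mismatching byte.

    The earlier the mismatch, the sooner it returns, so the running time
    can reveal how long a correct prefix the guess has.
    """
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison intended not to leak the position of the first mismatch."""
    return hmac.compare_digest(a, b)


secret_tag = b"\x13\x37" * 8
print(leaky_equal(secret_tag, b"\x00" * 16))
print(constant_time_equal(secret_tag, secret_tag))
```

Both functions return the same answers; only their timing behaviour differs.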
Differential fault analysis attacks a cipher embedded in a smartcard or other device. Apply stress (heat, mechanical stress, radiation, ...) to the device until it begins to make errors; with the right stress level, most will be single-bit errors. Comparing correct and erroneous output gives the cryptanalyst a window into cipher internals. This attack is extremely powerful; "we can extract the full DES key from a sealed tamper-resistant DES encryptor by analyzing between 50 and 200 ciphertexts generated from unknown but related plaintexts" [1].
See cryptanalysis for details and information security for defenses.
Secret key systems
Until the 1970s, all (publicly known) cryptosystems used secret key or symmetric key cryptography methods. In such a system, there is only one secret key for a message; that key can be used either to encrypt or decrypt the message. Both the sender and receiver must have the key, and third parties (potential intruders) must be prevented from obtaining the key. Symmetric key encryption may also be called traditional, shared-secret, secret-key, or conventional encryption.
Historically, ciphers worked at the level of letters; see history of cryptography for details. Attacks on them used techniques based largely on linguistic analysis, such as frequency counting; see cryptanalysis.
On computers, there are two main types of symmetric encryption algorithm:
A block cipher breaks the data up into fixed-size blocks and encrypts each block under control of the key. Since the message length will rarely be an integer number of blocks, there will usually need to be some form of "padding" to make the final block long enough. The block cipher itself defines how a single block is encrypted; modes of operation specify how these operations are combined to achieve some larger goal.
A stream cipher encrypts a stream of input data by combining it with a pseudo-random stream of data; the pseudo-random stream is generated under control of the encryption key.
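The two ideas above, padding for block ciphers and a key-controlled pseudo-random stream for stream ciphers, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a usable cipher: the padding follows the PKCS#7 convention (one common scheme; the text does not name one), the 16-byte block size is assumed, and the keystream is built by hashing the key, a nonce and a counter with SHA-256 purely to show the structure. All function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # bytes; an assumed block size for illustration


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """PKCS#7-style padding: append k bytes, each with the value k."""
    k = block_size - (len(data) % block_size)
    return data + bytes([k]) * k


def unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Remove padding added by pad(), rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    k = padded[-1]
    if k < 1 or k > block_size or padded[-k:] != bytes([k]) * k:
        raise ValueError("invalid padding")
    return padded[:-k]


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a running counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, nonce, len(data))))


message = b"attack at dawn"
print(len(pad(message)))                       # padded up to a whole block
ciphertext = stream_xor(b"demo key", b"nonce-1", message)
print(stream_xor(b"demo key", b"nonce-1", ciphertext))
```

Because XOR is its own inverse, the same `stream_xor` call both encrypts and decrypts; real systems should use a vetted cipher rather than a construction like this one.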
Another method, usable manually or on a computer, is a one-time pad. This works much like a stream cipher, but it does not need to generate a pseudo-random stream because its key is a truly random stream as long as the message. This is the only known cipher which is provably secure (provided the key is truly random and no part of it is ever re-used), but it is impractical for most applications because managing such keys is too difficult.
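A one-time pad is simple enough to sketch directly. In the fragment below, the pad comes from Python's `secrets` module, which is a cryptographically secure generator rather than a truly random source, so this is an approximation of the idea for illustration only. The last lines show why no part of the pad may ever be re-used: XORing two ciphertexts made with the same pad cancels the pad and leaves the XOR of the two plaintexts. The function names are hypothetical.

```python
import secrets
from typing import Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(message: bytes) -> Tuple[bytes, bytes]:
    """Return (pad, ciphertext); the pad is exactly as long as the message."""
    pad = secrets.token_bytes(len(message))  # stand-in for a truly random pad
    return pad, xor_bytes(message, pad)


def otp_decrypt(pad: bytes, ciphertext: bytes) -> bytes:
    """Recover the message by XORing the ciphertext with the same pad."""
    return xor_bytes(ciphertext, pad)


pad, ciphertext = otp_encrypt(b"meet at noon")
print(otp_decrypt(pad, ciphertext))

# Pad re-use: the pad cancels out, leaking the XOR of the two plaintexts.
c1 = xor_bytes(b"hello", pad[:5])
c2 = xor_bytes(b"world", pad[:5])
print(xor_bytes(c1, c2) == xor_bytes(b"hello", b"world"))
```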
Key management
More generally, key management is a problem for any secret key system.
- It is critically important to protect keys from unauthorised access; if an enemy obtains the key, then he or she can read all messages ever sent with that key.
- It is necessary to change keys periodically, both to limit the damage if an attacker does get a key and to prevent various attacks which become possible if the enemy can collect a large sample of data encrypted with a single key.
- It is necessary to communicate keys; without a copy of the identical key, the intended receiver cannot decrypt the message.
Managing all of these simultaneously is an inherently difficult problem.
One problem is where, and how, to safely store the key. In a manual system, you need a key that is long and hard to guess because keys that are short or guessable provide little security. However, such keys are hard to remember and if the user writes them down, then you have to worry about someone looking over his shoulder, or breaking in and copying the key, or the writing making an impression on the next page of a pad, and so on.
On a computer, keys must be protected so that enemies cannot obtain them. Simply storing the key unencrypted in a file or database is a poor strategy. A better method is to encrypt the key and store it in a file that is protected by the file system; this way, only authorized users of the system should be able to read the file. But then, where should one store the key used to encrypt the secret key? It becomes a recursive problem. Also, what about an attacker that can defeat the file system protection? If the key is stored encrypted but you have a program that decrypts and uses it, can an attacker obtain the key via a memory dump or a debugging tool? If a network is involved, can an attacker get keys by intercepting network packets? Can an attacker put a keystroke logger on the machine; if so, he can get everything you type, possibly including keys or passwords.
Communicating keys is an even harder problem. With secret key encryption alone, it would not be possible to open up a new secure connection on the internet, because there would be no safe way initially to transmit the shared key to the other end of the connection without intruders being able to intercept it. A government or major corporation might send someone with a briefcase handcuffed to his wrist, but for many applications this is impractical.
Moreover, the problem grows quadratically if there are many users. If n users must all be able to communicate with each other securely, then there are n(n-1)/2 possible connections, each of which needs its own key. For large n this becomes quite unmanageable.
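The growth is easy to see numerically. With n users, every distinct pair needs its own key, giving n(n-1)/2 keys in total; the short Python sketch below (the function name is hypothetical) evaluates this for a few network sizes.

```python
def pairwise_keys(n: int) -> int:
    """Number of distinct secret keys needed if every pair of n users shares one."""
    return n * (n - 1) // 2


for n in (10, 100, 1000):
    print(n, "users need", pairwise_keys(n), "keys")
# 10 users need 45 keys
# 100 users need 4950 keys
# 1000 users need 499500 keys
```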
Various techniques can be used to address the difficulty. A centralised server, such as the Kerberos system developed at MIT [2] and used (not without controversy [3]) by all versions of Microsoft Windows since Windows 2000 [4], is one method. Other techniques use two factor authentication, combining "something you have" (e.g. your ATM card) with "something you know" (e.g. the PIN).
The development of public key techniques, described in the next section, allows simpler solutions.
Public key systems
Public key or asymmetric key cryptography was first proposed, in the open literature, in 1976 by Whitfield Diffie and Martin Hellman[1]. The historian David Kahn described it as "the most revolutionary new concept in the field since polyalphabetic substitution emerged in the Renaissance" [2]. There are two reasons public key cryptography is so important. One is that it solves the key management problem described in the preceding section; the other is that public key techniques are the basis for digital signatures.
In a public key system, keys are created in matched pairs, such that when one of a pair is used to encrypt, the other must be used to decrypt. The system is designed so that calculation of one key from knowledge of the other is computationally infeasible, even though they are necessarily related. Keys are generated secretly, in interrelated pairs. One key from a pair becomes the public key and can be published. The other is the private key and is kept secret, never leaving the user's computer.
In many applications, public keys are widely published — on the net, in the phonebook, on business cards, on key server computers which provide an index of public keys. However, it is also possible to use public key technology while restricting access to public keys; some military systems do this, for example. The point of public keys is not that they must be made public, but that they could be; the security of the system does not depend on keeping them secret.
One big payoff is that two users (traditionally, A and B or Alice and Bob) need not share a secret key in order to communicate securely. When used for content confidentiality, the public key is typically used for encryption, while the private key is used for decryption. If Alice has (a trustworthy, verified copy of) Bob's public key, then she can encrypt with that and know that only Bob can read the message since only he has the matching private key. He can reply securely using her public key. This solves the key management problem. The difficult question of how to communicate secret keys securely does not need to even be asked; the private keys are never communicated and there is no requirement that communication of public keys be done securely.
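The matched-pair idea can be illustrated with textbook RSA on deliberately tiny numbers. This is a sketch for intuition only: the primes 61 and 53 are toy values that offer no security, real systems add padding and use far larger keys, and the function names are hypothetical. It relies on `pow(e, -1, phi)` for the modular inverse, which requires Python 3.8 or later.

```python
def make_toy_keypair():
    """Build a matched (public, private) pair from two tiny primes."""
    p, q = 61, 53                 # toy primes; real keys use enormous primes
    n = p * q                     # modulus, shared by both keys (3233)
    phi = (p - 1) * (q - 1)       # 3120
    e = 17                        # public exponent, coprime to phi
    d = pow(e, -1, phi)           # private exponent (2753); Python 3.8+
    return (n, e), (n, d)


def rsa_apply(key, value: int) -> int:
    """Apply one key of the pair: value ** exponent mod n."""
    n, exponent = key
    return pow(value, exponent, n)


bob_public, bob_private = make_toy_keypair()

message = 65                                   # a message encoded as a number < n
ciphertext = rsa_apply(bob_public, message)    # Alice encrypts with Bob's public key
recovered = rsa_apply(bob_private, ciphertext) # only Bob's private key undoes it
print(ciphertext, recovered)                   # 2790 65
```

Note that Alice needs only Bob's public key; nothing secret is ever exchanged between them.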
Moreover, key management on a single system becomes much easier. In a system based on secret keys, if Alice communicates with n people, her system must manage n secret keys all of which change periodically, all of which must sometimes be communicated, and each of which must be kept secret from everyone except the one person it is used with. For a public key system, the main concern is managing her own private key; that generally need not change and it is never communicated to anyone.
Of course, she must also manage the public keys for her correspondents. In some ways, this is easier; they are already public and need not be kept secret. However, it is absolutely necessary to authenticate each public key. Consider a philandering husband sending passionate messages to his mistress. If the wife creates a public key in the mistress' name and he does not check the key's origins before using it to encrypt messages, he may get himself in deep trouble.
Public-key encryption is slower than conventional symmetric encryption, so it is common to use a public key algorithm for key management but a faster symmetric algorithm for the main data encryption. Such systems are described in more detail below; see hybrid cryptosystems.
The other big payoff is that, given a public key cryptosystem, digital signatures are a straightforward application. The basic principle is that if Alice uses her private key to encrypt some known data then anyone can decrypt with her public key and, if they get the right data, they know (assuming the system is secure and her private key unknown to others) that it was her who did the encryption. In effect, she can use her private key to sign a document. The details are somewhat more complex and are dealt with in a later section.
Many different asymmetric techniques have been proposed and some have been shown to be vulnerable to some forms of cryptanalysis; see the public key article for details. The most widely used public key techniques today are the Diffie-Hellman key agreement protocol[3] and the RSA (Rivest-Shamir-Adleman) public-key system[4]. Techniques based on elliptic curves are also used.
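As a rough illustration of Diffie-Hellman key agreement, the sketch below lets two parties derive the same shared value while exchanging only public numbers. The prime 23 and base 5 are textbook toy parameters with no security value, the function names are hypothetical, and a real deployment would also authenticate the exchanged values.

```python
import secrets

P = 23  # toy prime modulus; real systems use very large primes
G = 5   # toy base


def dh_private() -> int:
    """Pick a random private exponent in the range 1..P-2."""
    return secrets.randbelow(P - 2) + 1


def dh_public(private: int) -> int:
    """Public value sent to the other party: G ** private mod P."""
    return pow(G, private, P)


def dh_shared(their_public: int, my_private: int) -> int:
    """Shared secret computed from the other party's public value."""
    return pow(their_public, my_private, P)


a, b = dh_private(), dh_private()      # Alice's and Bob's private exponents
A, B = dh_public(a), dh_public(b)      # exchanged in the clear
print(dh_shared(B, a) == dh_shared(A, b))  # both sides arrive at the same value
```

An eavesdropper sees only P, G, A and B; recovering the shared value from those is believed to be infeasible when the parameters are large enough.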
In 1997, it finally became publicly known that asymmetric cryptography had been invented by James H. Ellis at GCHQ, a British intelligence organization, in the early 1970s, and that both the Diffie-Hellman and RSA algorithms had been previously developed (by Malcolm J. Williamson and Clifford Cocks, respectively)[5].
Cryptographic hash algorithms
Hashing or message digest algorithms take an input of arbitrary size and produce a fixed-size digest, a sort of fingerprint of the input document. Some of the techniques are the same as those used in other cryptography but the goal is quite different. Where ciphers (whether symmetric or asymmetric) provide secrecy, hashes provide authentication.
Using a hash for data integrity protection is straightforward. If Alice hashes the text of a message and appends the hash when she sends it to Bob, then Bob can then verify that he got the correct message. He computes a hash from the received message text and compares that to the hash Alice sent. If they compare equal, then he knows (with overwhelming probability, though not with absolute certainty) that the message was received exactly as Alice sent it. Exactly the same method works to ensure that a document extracted from an archive or a file downloaded from a software distribution site is as it should be.
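The integrity check described above can be sketched with Python's standard `hashlib` module. SHA-256 is used here as an assumption, since the text does not name a particular hash, and the function names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hex: str) -> bool:
    """Recompute the digest of the received data and compare with the one sent."""
    return digest(data) == expected_hex


# Alice sends the message together with its hash.
message = b"Pay Bob 10 dollars"
sent_hash = digest(message)

# Bob recomputes the hash over whatever he received.
print(verify(message, sent_hash))                  # True
print(verify(b"Pay Bob 90 dollars", sent_hash))    # False
```

This catches accidental corruption; anyone who can change both the message and the appended hash can still substitute a consistent pair.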
That technique is useful, but an unkeyed hash is useless against an adversary who intentionally changes the data. The enemy simply calculates a new hash for his changed version and stores or transmits that instead of the original hash. To resist an adversary takes a keyed hash, a hashed message authentication code or HMAC. Sender and receiver share a secret key; the sender hashes using both the key and the document data, and the receiver verifies using both. Lacking the key, the enemy cannot alter the document undetected.
If Alice uses an HMAC and it verifies correctly, then Bob knows both that the received data is correct and that whoever sent it knew the secret key. If the rest of the system is secure, then only Alice and Bob know that key, so Bob knows Alice was the sender. An HMAC provides source authentication as well as data authentication.
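A keyed hash of this kind is available in the Python standard library as the `hmac` module. The sketch below assumes SHA-256 as the underlying hash and a key already shared between sender and receiver; the function names and the example key are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare without leaking timing."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"key known only to Alice and Bob"   # illustrative value
message = b"Pay Bob 10 dollars"
tag = make_tag(shared_key, message)

print(check_tag(shared_key, message, tag))                 # True
print(check_tag(shared_key, b"Pay Bob 90 dollars", tag))   # False
print(check_tag(b"attacker's guess", message, tag))        # False
```

Without the key, an attacker who alters the message cannot compute a matching tag.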
Digital signatures
Two cryptographic techniques are used together to produce a digital signature, a hash and a public key system.
Alice calculates a hash from the message, encrypts that hash with her private key and appends the encrypted hash to the message as a signature. To verify the signature, Bob needs a trustworthy copy of Alice's public key. He uses that to decrypt the signature; this should give him the hash Alice calculated. He then hashes the received message body himself to get another hash value and compares the two hashes.
If the two hash values are identical, then Bob knows with overwhelming probability that the document Alice signed and the document he received are identical. He also knows that whoever generated the signature had Alice's private key. If both the hash and the public key system used are secure, and no-one except the sender knows her private key, then the signatures are trustworthy.
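The hash-then-sign procedure can be sketched with textbook RSA and SHA-256. Everything about the key here is an assumption made for illustration: the modulus is the product of two well-known Mersenne primes, which makes the arithmetic easy to follow but is completely insecure, and real signature schemes add padding that this sketch omits. The function names are hypothetical, and `pow(E, -1, ...)` requires Python 3.8 or later.

```python
import hashlib

# Toy key pair for illustration only; these primes are public knowledge.
P, Q = 2**89 - 1, 2**127 - 1
N = P * Q
E = 65537                              # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # Alice's private exponent (Python 3.8+)


def hash_to_int(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Alice: transform the hash with her private key."""
    return pow(hash_to_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob: undo the transform with Alice's public key and compare hashes."""
    return pow(signature, E, N) == hash_to_int(message)


document = b"I owe Bob 10 dollars. Alice"
signature = sign(document)
print(verify(document, signature))                          # True
print(verify(b"I owe Bob 90 dollars. Alice", signature))    # False
```

The second check fails because the altered document hashes to a different value, which is what ties a signature to one specific document.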
A digital signature has some of the desirable properties of an ordinary signature. It is easy for a user to produce, but difficult for anyone else to forge. The signature is permanently tied to the content of the message being signed; it cannot be copied from one document to another, or used with an altered document, since the different document would give a different hash.
Any public key technique can provide digital signatures. RSA is widely used, as is the US government standard Digital Signature Algorithm (DSA).
Certificates and PKIs
Practical use of asymmetric cryptography, on any sizable basis, requires a public key infrastructure (PKI). A public key will normally be embedded in a digital certificate that is issued by a certification authority. In the event of compromise of the private key, the certification authority can revoke the key by adding it to a certificate revocation list. Digital certificates, like passports or other identification documents, usually have expiration dates, and a means of verifying both the validity of the certificate and of the certificate issuer.
Hybrid cryptosystems
Public key exchanges are used to open up secure secret key channels between strangers across the internet.
Generating session keys
The primary usage of public-key encryption is in hybrid systems where a symmetric algorithm does the bulk data encryption while the public key algorithm provides other services. Public-key encryption is slower than conventional symmetric encryption. For example, in Pretty Good Privacy (PGP) email encryption the sender generates a random key for the symmetric bulk encryption and uses public key techniques to securely deliver it to the receiver. In the Diffie-Hellman key agreement protocol, used in IPsec and other systems, public key techniques provide authentication.
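The PGP-style arrangement, a random session key for bulk encryption that is itself delivered under the receiver's public key, can be sketched as follows. This is not how PGP is implemented; it is a toy under loud assumptions: textbook RSA without padding on a modulus built from two publicly known Mersenne primes, and a SHA-256 counter keystream standing in for a real symmetric cipher. All names are hypothetical, and `pow(E, -1, ...)` requires Python 3.8 or later.

```python
import hashlib
import secrets

# Receiver's toy key pair; illustrative and insecure.
P, Q = 2**89 - 1, 2**127 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def hybrid_encrypt(message: bytes):
    """Sender: fresh session key, wrapped with the receiver's public key."""
    session_key = secrets.token_bytes(16)                  # 128 bits, below N
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped_key, xor_stream(session_key, message)


def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Receiver: unwrap the session key with the private key, then decrypt."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return xor_stream(session_key, ciphertext)


wrapped, ct = hybrid_encrypt(b"a long message would go here")
print(hybrid_decrypt(wrapped, ct))
```

The slow public key operation touches only the 16-byte session key, while the fast symmetric step handles the message, however long it is.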
An advantage of asymmetric over symmetric cryptosystems is that all symmetric system keys must be kept secret, and the logistics of key management become complex. Each distinct pair of communicating parties must share a different key. The number of keys required increases as the square of the number of network members, which requires very complex key management schemes in large networks. The difficulty of establishing a secret key between two communicating parties when a secure channel doesn't already exist between them also presents a chicken-and-egg problem which is a considerable practical obstacle for cryptography users in the real world.
One-way encryption
There are a substantial number of applications where it is not necessary to be able to reconstruct the plaintext from the ciphertext, but merely to be able to prove that some piece of information could be generated only from the original plaintext. See one-way encryption for the techniques; some applications are presented here.
Protection against record modification
Protection against file modification
Protecting stored passwords
When passwords are stored on a computer, it is essential that they be kept secret. Thus it is recommended practice to encrypt the passwords before writing them to disk, and furthermore to prevent anyone who might find them from decrypting them. One-way encryption involves storing an encrypted string which cannot be decrypted. When a user later enters their password, the newly entered password is first encrypted, and then is compared to the encrypted stored string.
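The store-then-compare procedure can be sketched with the Python standard library. Note the swapped-in technique: rather than a single plain hash, this sketch uses PBKDF2 with a random per-user salt (`hashlib.pbkdf2_hmac`), a common hardening that the text above does not itself describe. The iteration count and function names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor; tune for the deployment


def hash_password(password: str):
    """Return (salt, stored) for a new password; only these are written to disk."""
    salt = secrets.token_bytes(16)
    stored = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, stored


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the newly entered password the same way and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("wrong guess", salt, stored))                   # False
```

The password itself is never stored, and the stored value cannot be run backwards to recover it; an attacker who steals the file must guess passwords and hash each guess.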
The password is usually encrypted as a message digest or hash, a large number generated by scrambling and condensing plain text letters. An example of a hash digest is SHA-1, which dates from 1994. The SHA-1 algorithm takes a string as input. The algorithm is a digest because the result is a fixed-size number. The SHA-1 algorithm always outputs a 160-bit number (20 bytes of storage). Up to 49 decimal digits would be required to express this number, and it is usually displayed to humans as a 28-character, base-64 encoded string. Here are some examples:
- Hello World: z7R8yBtZz0+eqead7UEYzPvVFjw=
- VB: L1SHP0uzuGbMUpT4z0zlAdEzfPE=
- vb: eOcnhoZRmuoC/Ed5iRrW7IxlCDw=
- Vb: e3PaiF6tMmhPGUfGg1nrfdV3I+I=
- vB: gzt6my3YIrzJiTiucvqBTgM6LtM=
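The sizes quoted above can be checked with a short Python sketch that computes a SHA-1 digest and base-64 encodes it. The text encoding is an assumption (UTF-8 by default here); the examples listed above may have been produced with a different encoding, so this sketch is not claimed to reproduce those exact strings. The function name is hypothetical.

```python
import base64
import hashlib


def sha1_base64(text: str, encoding: str = "utf-8") -> str:
    """Return the SHA-1 digest of text as a base-64 string."""
    digest = hashlib.sha1(text.encode(encoding)).digest()   # always 20 bytes
    return base64.b64encode(digest).decode("ascii")         # always 28 characters


for sample in ("Hello World", "VB", "vb", "Vb", "vB"):
    encoded = sha1_base64(sample)
    print(sample, encoded, len(encoded))
```

Inputs that differ only in letter case produce unrelated digests, which is the point the examples illustrate.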
- ↑ Diffie, Whitfield (June 8, 1976), "Multi-user cryptographic techniques", AFIPS Proceedings 45: 109-112
- ↑ David Kahn, "Cryptology Goes Public", 58 Foreign Affairs 141, 151 (fall 1979), p. 153
- ↑ Cite error: no text was provided for the reference named dh2
- ↑ Rivest, Ronald L.; Adi Shamir & Len Adleman, A Method for Obtaining Digital Signatures and Public-Key Cryptosystems
- ↑ Clifford Cocks. A Note on 'Non-Secret Encryption', CESG Research Report, 20 November 1973.