Talk:Cryptanalysis/Draft: Difference between revisions
imported>Sandy Harris No edit summary |
imported>Howard C. Berkowitz (→Intermediate systems: new section) |
Revision as of 12:04, 17 October 2008
I am thinking of a re-organisation here, along the lines:
- Attacks on the system
- Practical cryptanalysis
- Traffic analysis
- Side channel attacks
- Bypassing authentication
- Guessing secrets
- Dictionary attacks on passwords
- Random number weaknesses
- Small keys
- Attacks on the ciphers
Then the topics we currently have under "Mathematical cryptanalysis".
Things like man-in-the-middle would then turn up in two places, first under "Bypassing authentication" because if you can do that then you don't have to break the actual encryption, and second under "Attacks on the ciphers" for details of attacks on different authentication mechanisms since those details are much the same as other attacks on RSA, block ciphers or whatever. Sandy Harris 01:51, 17 October 2008 (UTC)
- Should social engineering be under guessing, or its own category? For that matter, where does one put the people who write their keys on their desk calendar?
- "Social engineering" and "shoulder surfing" would be categories, perhaps subheads under practical cryptanalysis. Sandy Harris 07:18, 17 October 2008 (UTC)
- Side channel, I assume, covers TEMPEST/HIJACK/TEAPOT/NONSTOP, timing analysis on plaintext, acoustic cryptanalysis, and Operation RAFTER (a specific case of getting the received text off the intermediate frequency).
- Could you define "attacks on the ciphers"?
- I think this is going somewhere interesting, but not sure where it is yet.
- If you would, see if we can agree on some of the more specific attacks (e.g., authentication attacks) in communications security. I am also open to a better name for that article. Howard C. Berkowitz 05:51, 17 October 2008 (UTC)
- By "attacks on the ciphers" (chosen mainly to contrast with "attacks on the system") I meant what is now called "mathematical cryptanalysis" and might be called "cryptanalysis proper". The article introduction refers to it as "classic cryptanalysis". Not sure what the best title would be.
- Somewhere up in the opening/overview part I'd want to say that, while this is a real threat, it may not be the main threat in many cases. Quote Anderson [1] about banking systems: "the threat model commonly used by cryptosystem designers was wrong: most frauds were not caused by cryptanalysis or other technical attacks, but by implementation errors and management failures." Or quote Schneier's intro to Secrets and Lies, where he says that in some ways writing "Applied Cryptography" was a mistake: too much technology, not enough attention to other issues.
- Side channel certainly covers Tempest and RAFTER (new to me). I'm not sure if differential fault analysis and timing attacks go there or under "attacks on the ciphers"; I lean toward the latter since they aim at finding the keys rather than just reading material. Sandy Harris 07:18, 17 October 2008 (UTC)
Intermediate systems
Given that many readers won't have much background, I think we need a section that covers both the early manual methods, from the point where they moved beyond pure guesswork (and maybe a little on the guesswork itself), and then materials, some declassified, from the 1920s and 1930s.
Let me offer what is mostly outline, which would go before specific attacks:
Preparatory
While it is not necessary to be fluent in a language to cryptanalyze it, languages have different statistical properties and it helps to know the language. Of course, if there is a chance of guessing probable words, knowledge of the language is more important, and knowledge of the area of application (e.g., smuggling liquor or describing radar) can reveal even more. In real-world rather than puzzle cryptanalysis, knowledge of the circumstances in which the ciphertext was collected can be informative. William Friedman proposed four basic steps:[1]
1. The determination of the language employed in the plain-text version.
2. The determination of the general system of cryptography employed.
3. The reconstruction of the specific key in the case of a cipher system, or the reconstruction, partial or complete, of the code book, in the case of a code system; or both, in the case of an enciphered code system.
4. The reconstruction or establishment of the plain text.
These steps are usually performed in the order given above, although occasionally the second step may precede the first.
Basic methods
The very earliest cryptanalysis seems to have been inspired guessing about the plaintext, or perhaps attacks against known and fairly weak manual systems. In the 20th century, mathematical and statistical techniques came to the fore.
The most basic is frequency analysis.[2] When the letter frequencies of the ciphertext, graphed as letter against count, do not follow the curve characteristic of the language,[3] some form of substitution is indicated. The departure from that curve grows as keys become more complex, even in simple polyalphabetic substitution on a monographic cipher, such as the Vigenère method. Frequency analysis is almost useless against pure transposition systems, other than for confirming that the system is simple transposition: in that case the letter-frequency curve matches, because the letter counts in the ciphertext are the same as those of cleartext in the natural language.
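For the article, a rough sketch of the counting step might help readers. This is only an illustration in Python; the function names `letter_counts` and `relative_frequencies` are made up for this example and are not from Friedman.

```python
from collections import Counter
import string

def letter_counts(text):
    """Count occurrences of A-Z, ignoring case, spaces and punctuation."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    return Counter(letters)

def relative_frequencies(text):
    """Return each letter's share of the total, for comparison with a language's curve."""
    counts = letter_counts(text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: counts[c] / total for c in string.ascii_uppercase}

sample = "ATTACK AT TWO TODAY"
# Print the three most common letters with their counts.
for letter, n in letter_counts(sample).most_common(3):
    print(letter, n)

# A pure transposition only reorders letters, so the counts are unchanged.
transposed = sample[::-1]
print(letter_counts(sample) == letter_counts(transposed))
```

A sample this short is far too small to say anything about the language; the point is only the mechanics of building the count that would be graphed.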
Monoalphabetic substitution
The most basic form, such as the Caesar cipher, uses a single key alphabet; the same cleartext letter is always represented by the same ciphertext letter. Against this, basic frequency analysis is effective. Monoalphabetic keys, in order of increasing complexity, include:
- Constant shift monoalphabetic substitution, in which each plaintext letter is shifted a fixed number of letter positions to the right
- Keyword-based mixed, where the unique letters of a keyword (e.g., CRYPTO) form the first letters of a key alphabet, with the rest following in alphabetical order: CRYPTOABDEFGHIJKLMNQSUVWXZ
- Completely random, i.e., an arbitrary permutation of the 26 letters: QPWOEIRTYUSAGFHDKJLMZNXBCV
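The three kinds of key alphabet listed above could be illustrated with something like the following. It is a sketch only; the helper names (`shift_alphabet`, `keyword_alphabet`, `random_alphabet`, `encipher`) are invented for the example, and it assumes the 26-letter English alphabet.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def shift_alphabet(shift):
    """Constant-shift key: plaintext A maps to the letter `shift` places to the right."""
    k = shift % 26
    return ALPHABET[k:] + ALPHABET[:k]

def keyword_alphabet(keyword):
    """Keyword-mixed key: unique keyword letters first, then the unused letters in order."""
    seen = []
    for c in keyword.upper() + ALPHABET:
        if c in ALPHABET and c not in seen:
            seen.append(c)
    return "".join(seen)

def random_alphabet(seed=None):
    """Random key: a shuffled permutation of the 26 letters."""
    letters = list(ALPHABET)
    random.Random(seed).shuffle(letters)
    return "".join(letters)

def encipher(plaintext, key_alphabet):
    """Monoalphabetic substitution; anything that is not a letter is dropped."""
    table = str.maketrans(ALPHABET, key_alphabet)
    letters = "".join(c for c in plaintext.upper() if c in ALPHABET)
    return letters.translate(table)

print(shift_alphabet(3))
print(keyword_alphabet("CRYPTO"))
print(encipher("ATTACK", shift_alphabet(3)))
```

Whatever the construction, a usable key alphabet has to contain each of the 26 letters exactly once, which is easy to check by sorting it.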
Increasing the complexity of monoalphabetics
Cryptanalysis, even of fairly simple ciphers, became more difficult when two things happened:
- encrypted text no longer followed the word spacing of the plaintext, so the analyst could no longer tell whether a particular cipher symbol stood at the beginning, in the middle, or at the end of a word.[4]
- padding came into use: X's or perhaps meaningless digraphs were inserted so that the message length was always an integral multiple of the number of cipher alphabets
Basic and enhanced manual polyalphabetic substitutions
The simplest polyalphabetic substitution used multiple constant shift keys. The period is the number of different key alphabets used before the sequence of keys repeats. For a period of 4, an example would be:
Key 1: BCDEFGHIJKLMNOPQRSTUVWXYZA
Key 2: CDEFGHIJKLMNOPQRSTUVWXYZAB
Key 3: DEFGHIJKLMNOPQRSTUVWXYZABC
Key 4: EFGHIJKLMNOPQRSTUVWXYZABCD
Even with a method that was a short-period polyalphabetic substitution, cryptographers might use different techniques for writing letters into the encryption system and taking them off. Assume there are four alphabets, 1-4, all Caesar variants (+1, +2, +3, +4), not shown for simplicity in the tables, and the rows of ciphertext fall under the appropriate alphabet. This starts to introduce transposition.
Contrast the two tables below, for ATTACK AT TWO TODAY with spaces suppressed. The initial encryption works by writing out the cleartext in rows of four, one column per alphabet, and then enciphering each letter with the alphabet that heads its column.
Cleartext written out under the four alphabets
       1 2 3 4
Row A  A T T A
Row B  C K A T
Row C  T W O T
Row D  O D A Y
Result of polyalphabetic substitution (four Caesar-based alphabets used)
       1 2 3 4
Row A  B V W E
Row B  D M D X
Row C  U Y R X
Row D  P F D C
Mixing in basic transposition
Even with a method that was fundamentally polyalphabetic substitution, cryptographers might use different techniques for writing letters into the encryption system and taking them off. Assume there are four alphabets, 1-4, not shown for simplicity in the table, and the rows of ciphertext fall under the appropriate alphabet.
In the example above, taking off the letters in groups of four, row by row from left to right, would give
BVWE DMDX UYRX PFDC
Only a slight modification, sending all the odd rows first and then the even rows, would give:
BVWE UYRX DMDX PFDC
While this is a trivial example, simple frequency analysis is no longer enough to decrypt; even given the key alphabets, straightforward decryption yields nonsense unless the order of the rows is also known. A system simple enough to respond to hand analysis was often still a challenge, especially when the encryptor used mixed alphabets and changed them frequently.
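To make the write-in and take-off steps concrete, here is a small sketch, assuming the four Caesar shifts +1 to +4 and rows of four letters as described above. The names `encipher_rows` and `take_off` are hypothetical helpers for this illustration only.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_letter(c, k):
    """Shift one uppercase letter k places to the right, wrapping after Z."""
    return ALPHABET[(ALPHABET.index(c) + k) % 26]

def encipher_rows(plaintext, shifts):
    """Write the text in rows of len(shifts); each column uses its own shift."""
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    width = len(shifts)
    rows = []
    for start in range(0, len(letters), width):
        chunk = letters[start:start + width]
        rows.append("".join(shift_letter(c, shifts[i]) for i, c in enumerate(chunk)))
    return rows

def take_off(rows, order):
    """Transmit the rows in the given order (0-based row indices)."""
    return " ".join(rows[i] for i in order)

rows = encipher_rows("ATTACK AT TWO TODAY", [1, 2, 3, 4])
in_order = take_off(rows, [0, 1, 2, 3])
odd_then_even = take_off(rows, [0, 2, 1, 3])  # rows A and C first, then B and D
print(in_order)
print(odd_then_even)
```

The substitution and the take-off order are independent steps, which is why a receiver who knows the alphabets but not the row order still recovers scrambled text.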
Nonperiodic key alphabets
Ciphers were not always "geometric" or strictly periodic. An aperiodic cipher might have only 4 cipher alphabets in the key, but if the order of their use is 1-2-4-3 on the first cycle and 2-3-4-1 on the next, reconstructing the key becomes more difficult.
Contrast the two tables below, again for ATTACK AT TWO TODAY with spaces suppressed. The cleartext is written out as before, but the columns are now enciphered with alphabets 4, 3, 1 and 2, in that order.
Cleartext written out under the alphabets in the order 4-3-1-2
       4 3 1 2
Row A  A T T A
Row B  C K A T
Row C  T W O T
Row D  O D A Y
Result of nonperiodic (4-3-1-2) polyalphabetic substitution
       4 3 1 2
Row A  E W U C
Row B  G N B V
Row C  X Z P V
Row D  S G B A
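A sketch of how the order of alphabet use changes the ciphertext, assuming alphabet n is simply a Caesar shift of +n. The function name `encipher_schedule` is made up for this example. It takes a list of alphabet numbers and repeats it as needed, so strictly speaking the output is still periodic, with a period equal to the length of the list; a truly aperiodic system would need a schedule that does not repeat.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase

def encipher_schedule(plaintext, schedule):
    """Shift each letter by the next entry of `schedule`, repeating the schedule as needed."""
    letters = (c for c in plaintext.upper() if c in ALPHABET)
    return "".join(
        ALPHABET[(ALPHABET.index(c) + k) % 26]
        for c, k in zip(letters, cycle(schedule))
    )

message = "ATTACK AT TWO TODAY"
# Same order, 4-3-1-2, on every cycle.
fixed_order = encipher_schedule(message, [4, 3, 1, 2])
# Order 1-2-4-3 on the first cycle and 2-3-4-1 on the next.
changing_order = encipher_schedule(message, [1, 2, 4, 3, 2, 3, 4, 1])
print(fixed_order)
print(changing_order)
```

With the changing order, letters in the same column of the four-wide layout are no longer all enciphered by the same alphabet, which is what frustrates the usual column-by-column attack.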
Mathematical cryptanalysis emerged roughly in the 1920s, principally from William Friedman and his colleagues. Their work still used fairly basic mathematical techniques. A significant increase in cryptanalytic power came when techniques such as group theory were applied against the Enigma machine and other early machine ciphers.
Index of coincidence
Kappa test
References
1. Friedman, William F. (1938), Military Cryptanalysis, vol. I: Monoalphabetic Substitution Systems, Signal Intelligence Section, Plans and Training Division, U.S. War Department, http://www.nsa.gov/public/mil_crypt_I.pdf, pp. 7-11
2. Friedman-I, pp. 11-17
3. Friedman-I, pp. 18-26
4. Gaines, Helen (1939), Cryptanalysis, Dover, pp. 69-72