Ok, away from politics, and back to the good stuff. When I left off talking about encryption, we were looking at RSA, as an example of an asymmetric encryption system.
Since it's been a while, I'll start off with a quick review of RSA.
Asymmetric encryption systems like RSA are based on computations that are easy to perform if you know the keys, and incredibly hard to perform if you don't. In the specific case of RSA, everything is based on a pair of very large prime numbers. If you know those two primes, and you know the public key, it's really easy to compute the private key. But if you don't know the two prime numbers, then even given the public key, it's incredibly difficult to compute the private key.
To be a bit more specific, in RSA, you get a pair of large prime numbers, P and Q. You compute from them a totient of their product, which is the number N=(P-1)×(Q-1). Then you arbitrarily pick a public exponent, E, which is smaller than N, and which is prime relative to N. You can then compute the private key exponent, D. If you know what P and Q are, it's pretty easy to compute D.
Once you've got D and E, your public key is the pair is (N,E), and the private key is the pair is (N,D).
If you've got a plaintext message M, then encrypting it with the public key is done by computing ME mod N. If you've got a ciphertext C encrypted with the public key, then decrypting it is done by computing CD mod N.
Encrypting a plaintext with the public key is exactly the same process as decrypting a ciphertext produced with the private key. And vice versa.
RSA is amazingly elegant. Every time I look at it for any reason, I'm struck by its beauty and it's simplicity. It's really an amazing example of how something seemingly abstract like number theory can produce practical, down-to-earth, and useful.
But RSA isn't perfect. In fact, it's got a some rather serious problems that are a direct result of its mathematical structure. I'm just going to mention one, but there are several of them.
Suppose you've got your public/private keypair. You want to encrypt two messages, I and J with your private key. Let's call the function for encrypting a message with the private key E - so E(M) = MD mod N. Then CI = E(I), and CJ=E(J). Good so far.
The problem is that given those two messages - for which an attacker has access to both the plaintext and the ciphertext - it happens that there's a vulnerability. Because of the way that encryption works, E(I×J) = E(I) × E(J)!
This opens you up to something called chosen ciphertext attacks. A chosen ciphertext attack is one where you attack an crytographic system by running chosen ciphertexts through the decryption system to see if you can compute the encryption key. There's an effective attack against the naive use of RSA based on a selected ciphertext method.
In addition to this, there are a few other interesting attacks that I won't describe in detail. They all rely on various problems of the numerical structure of RSA encryption. In order to defeat those attacks, RSA is typically used with something called a padding scheme. The idea of the padding scheme is twofold. First, it increases the size of the message - which guarantees that the encrypted message is large enough that it will not be easy to use for an attack. Second, it intersperses pseudo-random information in a way that means that a given plaintext message could be encrypted to a wide range of different ciphertexts, depending on the choices made during padding.
A common scheme for padding is OAEP - Optimal Asymmetric Encryption Padding. OAEP is a basic Feistel network system, like the ones I described when I was talking about DES. OAEP ends up doing a mixture of permuting the plaintext, and adding pseudo-random noise to it. It's a reversible transformation, and the receiver of the encrypted message (and thus any attacker) knows how to do the reversal of the padding given the decrypted plaintext. But the process of permutation and random injection performed by the padding has the effect of breaking the properties of RSA that make it easy to attack the system.
For example, suppose that the padding function is called P. E(P(I))×E(P(J)) is not equal to E(P(I×J)). Similarly, the message-size-related problems with RSA are not usable on messages whose size is increased past the vulnerability threshold using OAEP padding.
So what does the padding look like? Let's start with the pieces.
- You have a number, N, which is the length of the cryptographic modulus.
- You have two integers, k0 and k1, which are parameters of the protocol in use.
- You have the original plaintext message, M. By definition, the length of M is (n - k0 - k1). (The protocol is responsible for breaking up large messages into sub-messages via an appropriate mode of operation.)
- You have a 0-padded version of M, M', which is M followed by k1 zeros - so M' has length N-k0.
- You have a random string of bits, r, which has length k0.
- You have a cryptographic hash function G, which expands a string of k0 bits to a string of N-k0 bits.
- You have another cryptographic hash function H, which contracts a string of n-k0 bits to a srting of k0 bits.
The OAEP padding algorithm is illustrated off to the side. The way that it works is: you compute G(r), giving you a string of N-k0 bits. You XOR G(r) with M', producing a string of N-k0 bits, which we'll call X. Then compute H(X), and XOR that with r, producing a result that we'll call Y. The end result of the padding is the concatenation of X and Y.
The way to understand this is that you've got some random bits in R, which you're shuffling up (using G and H) and mixing with the bits of the original message. The resulting message is longer, and has a random element mixed into it, which defeats the numerical properties that make some attacks against RSA work. It's easy to compute X and Y given M and r. And given X and Y, if you know G and H it's easy to compute M and r. But given E(X concat Y), as an attacker, you're screwed. You can still decript the message, and get X and Y, decode them, and get the plaintext. But the process of doing the padding obscures the numerical properties of RSA so that even knowing the padding function, your attacks won't work.