Next, we are going to study a result called Shannon's Perfect Secrecy Theorem, a fundamental result in cryptography. In this system, X denotes the plaintext, the message that we want to transmit in secrecy, and Y denotes the ciphertext, the encryption of the message. The message is encrypted by means of a key, denoted by Z, where this key is available to both the transmitter and the receiver. Now, it is possible that an adversary intercepts Y.

To achieve perfect secrecy, we need to ensure that nothing about X can be learned from Y. That is, X and Y are statistically independent, or equivalently the mutual information I(X;Y) is equal to 0. We also require decipherability, that is, the receiver, upon receiving Y and knowing Z, is able to recover X. In other words, X is a function of Y and Z, or equivalently the conditional entropy H(X|Y,Z) is equal to 0.

We are going to show that these two requirements imply that H(Z) is greater than or equal to H(X). That is, the length of the key is at least the length of the plaintext. In his 1949 work, Shannon gave a combinatorial proof of this result. We are going to prove it using an information diagram.

First, perfect secrecy requires I(X;Y) = 0. So in the information diagram, we let I(X;Y|Z) be a, and then I(X;Y;Z) is −a, since I(X;Y) = I(X;Y|Z) + I(X;Y;Z) = 0. Decipherability requires H(X|Y,Z) = 0. Since I(Y;Z) = I(Y;Z|X) + I(X;Y;Z) is greater than or equal to 0, we have that I(Y;Z|X) is at least a. We also have that the conditional entropy H(Z|X,Y) is greater than or equal to 0. We need to show that μ*(Z̃) is greater than or equal to μ*(X̃). In the information diagram at the bottom, compare the atoms of X̃ with the atoms of Z̃. The atoms in X̃ ∩ Z̃ are common to both X̃ and Z̃, so their measures cancel with each other, and these atoms do not need to be considered.
Only the atoms in X̃ − Z̃ and in Z̃ − X̃ need to be compared. From the information diagram at the top, it is evident that μ*(Z̃ − X̃) is greater than or equal to μ*(X̃ − Z̃), because I(Y;Z|X) ≥ a = I(X;Y|Z) and H(Z|X,Y) ≥ 0 = H(X|Y,Z). Therefore, we conclude that H(Z) is at least H(X), as was to be shown.

Example 3.15 is the Imperfect Secrecy Theorem, which is a generalization of Shannon's Perfect Secrecy Theorem. Let X be the plaintext, Y be the ciphertext, and Z be the key in a secret-key cryptosystem. Since X can be recovered from Y and Z, we have H(X|Y,Z) = 0. We can show that this constraint implies that the mutual information I(X;Y) is lower bounded by H(X) − H(Z); briefly, H(X) = I(X;Y) + I(X;Z|Y) + H(X|Y,Z) ≤ I(X;Y) + H(Z). For the details, please study Example 3.15.

The interpretation of the Imperfect Secrecy Theorem is the following. The mutual information I(X;Y) measures the leakage of information, that is, the amount of information about X that an adversary can learn by knowing Y. The Imperfect Secrecy Theorem says that the leakage of information is at least the deficiency of the key length, that is, H(X) − H(Z). When I(X;Y) = 0, that is, when no leakage of information is allowed, the Imperfect Secrecy Theorem reduces to Shannon's Perfect Secrecy Theorem.

The next example is the Data Processing Theorem, which we have discussed in Chapter 2. It says that if X, Y, Z, T forms a Markov chain, then the mutual information I(X;T) is less than or equal to the mutual information I(Y;Z).
This is readily seen in the information diagram for the Markov chain X, Y, Z, T, because I(X;T) corresponds to the atom marked in red in the information diagram, and I(Y;Z) corresponds to the four atoms marked in blue. In fact, from the information diagram, it can readily be seen that I(Y;Z) = I(X;T) + I(X;Z|T) + I(Y;T|X) + I(Y;Z|X,T).

Example 3.18 is a more elaborate example. It says that if we have a Markov chain X, Y, Z, T, U consisting of five random variables, then H(Y) + H(T) = I(Z; X,Y,T,U) + I(X,Y; T,U) + H(Y|Z) + H(T|Z). This identity is very difficult to discover without an information diagram, and it is instrumental in proving an outer bound for a problem in information theory called the multiple description problem. As an exercise, verify the information diagram for the above equality.

Information inequalities that are implied by the basic inequalities are called Shannon-type inequalities. They can be proved by means of a linear program called ITIP, which stands for Information Theoretic Inequality Prover, developed in MATLAB at CUHK back in 1996. A more portable version written in C, called Xitip, was developed at EPFL in 2007. See Chapters 13 and 14 for a discussion of this topic.

Here are some examples of using ITIP. In the first example, we want to prove that H(X,Y,Z) is less than or equal to H(X) + H(Y) + H(Z), namely, the independence bound for entropy. Upon hitting the Return key, the program returns True, which means that this is a Shannon-type inequality. In the second example, we want to verify whether I(X;Z) = 0 given the constraints I(X;Z|Y) = 0 and I(X;Y) = 0. Upon hitting the Return key, the program returns True.
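The second example can also be verified by hand with the chain rule for mutual information:

```latex
% Given I(X;Y) = 0 and I(X;Z|Y) = 0, show I(X;Z) = 0.
\begin{align}
I(X;Y,Z) &= I(X;Y) + I(X;Z|Y) = 0 + 0 = 0, \\
0 \le I(X;Z) &\le I(X;Z) + I(X;Y|Z) = I(X;Y,Z) = 0,
\end{align}
% hence I(X;Z) = 0.
```

so the constraints do imply I(X;Z) = 0, which is why ITIP returns True.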
In the third example, we want to verify whether X, Y, Z, T forms a Markov chain given the two Markov subchains X, Y, Z and Y, Z, T, and this cannot be proved by ITIP. In this case, it can be shown that the two Markov subchains do not imply the long Markov chain: there exist joint distributions satisfying both subchains for which X, Y, Z, T is not a Markov chain. In the last example, we want to verify an inequality for four random variables X, Y, Z, and U. The inequality says that I(Z;U) − I(Z;U|X) − I(Z;U|Y) ≤ (1/2) I(X;Y) + (1/4) I(X;Z,U) + (1/4) I(Y;Z,U). This is not provable by ITIP, but in fact it is true. This last inequality is a so-called non-Shannon-type inequality, that is, an inequality which is valid but not implied by the basic inequalities. See Chapter 15 for a discussion.
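To make the idea behind ITIP concrete, here is a minimal sketch of how a Shannon-type inequality can be checked by a linear program, written in Python with scipy (this is an illustration of the method only, not the actual ITIP code, which has its own MATLAB syntax). The criterion: an inequality E ≥ 0 is Shannon-type if and only if the minimum of E over the cone cut out by the elemental inequalities is 0 rather than minus infinity.

```python
# Sketch: check the independence bound H(X,Y,Z) <= H(X)+H(Y)+H(Z)
# against the elemental inequalities for n = 3 random variables.
# Illustrative only; not the actual ITIP implementation.
from itertools import combinations
from scipy.optimize import linprog

n = 3
# One LP variable per joint entropy h(S), for each non-empty S.
sets = [frozenset(s) for k in range(1, n + 1)
        for s in combinations(range(n), k)]
idx = {s: i for i, s in enumerate(sets)}

def vec(*terms):
    """Coefficient vector from (coeff, subset) pairs."""
    v = [0.0] * len(sets)
    for c, s in terms:
        v[idx[frozenset(s)]] += c
    return v

full = set(range(n))
elemental = []
# H(Xi | all other variables) >= 0
for i in range(n):
    elemental.append(vec((1, full), (-1, full - {i})))
# I(Xi; Xj | S) >= 0 for all i < j and S a subset of the rest
for i, j in combinations(range(n), 2):
    others = full - {i, j}
    for k in range(len(others) + 1):
        for s in combinations(others, k):
            s = set(s)
            terms = [(1, s | {i}), (1, s | {j}), (-1, s | {i, j})]
            if s:                       # H(empty set) = 0, so skip it
                terms.append((-1, s))
            elemental.append(vec(*terms))

# Expression to test: H(X)+H(Y)+H(Z) - H(X,Y,Z) >= 0
target = vec((1, {0}), (1, {1}), (1, {2}), (-1, {0, 1, 2}))

# Minimise the expression subject to elemental inequalities >= 0.
# Optimum 0 (attained at h = 0) means the inequality is Shannon-type;
# an unbounded LP would mean it is not provable this way.
res = linprog(c=target,
              A_ub=[[-a for a in row] for row in elemental],
              b_ub=[0.0] * len(elemental),
              bounds=[(None, None)] * len(sets))
print("Shannon-type" if res.status == 0 and abs(res.fun) < 1e-9
      else "not provable from the basic inequalities")
```

The same setup scales to any number of random variables and to constrained problems (additional equalities such as Markov conditions become equality constraints of the LP), which is essentially what ITIP and Xitip automate; the non-Shannon-type inequality in the last example is exactly one that such an LP cannot certify.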