2.3 Standard Block Cipher Systems
2.3.1 DES
The Data Encryption Standard (DES) originates from Lucifer, a cipher developed by IBM, which was later modified and adopted as a block cipher standard in 1977 by NBS, the predecessor of NIST (National Institute of Standards and Technology). DES is an instance of the Feistel structure described earlier. The block length is 64 bits, and the key length is also 64 bits, of which 8 are parity check bits, leaving 56 effective key bits. Figure 3 gives an overview of the DES structure.
[Figure 3 shows the plaintext passing through IP, then 16 rounds of the form $L_i = R_{i-1}$, $R_i = L_{i-1} \oplus f(R_{i-1}, K_i)$ for $i = 1, \ldots, 16$, and finally IP$^{-1}$, which produces the ciphertext.]
Figure 3: DES structure
IP and IP$^{-1}$ are the initial permutation and its inverse, respectively. Each $L_i$ and $R_i$ is 32 bits long. The f function takes $R_{i-1}$ and $K_i$ as inputs and is shown in Figure 4.
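The round structure of Figure 3 can be sketched in a few lines of Python. This is a minimal sketch and not DES itself: IP, IP$^{-1}$ and the real f function are omitted, `toy_f` is a hypothetical stand-in chosen only so that the code runs, and the subkeys are placeholders. It illustrates that decryption is the same network with the subkeys in reverse order, whatever f is.

```python
MASK32 = 0xFFFFFFFF

def toy_f(r, k):
    """Hypothetical round function (NOT the DES f); any 32-bit function works."""
    return ((r * 0x9E3779B1) ^ k ^ (r >> 7)) & MASK32

def feistel(block64, round_keys, f=toy_f):
    """Run one pass of a Feistel network over a 64-bit block."""
    left, right = block64 >> 32, block64 & MASK32
    for k in round_keys:
        # L_i = R_{i-1},  R_i = L_{i-1} XOR f(R_{i-1}, K_i)
        left, right = right, left ^ f(right, k)
    # As in Figure 3, the halves are swapped before the output: (R_n, L_n).
    return (right << 32) | left

def encrypt(block64, round_keys):
    return feistel(block64, round_keys)

def decrypt(block64, round_keys):
    # Same network, subkeys applied in reverse order.
    return feistel(block64, list(reversed(round_keys)))

keys = [0x0F0F0F0F + i for i in range(16)]   # placeholder subkeys, not a DES key schedule
pt = 0x0123456789ABCDEF
ct = encrypt(pt, keys)
```

Calling `decrypt(ct, keys)` recovers `pt`; the invertibility comes from the structure alone and does not require f to be invertible.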
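[Figure 4 shows the data path of f: the 32-bit $R_{i-1}$ is expanded by E to 48 bits, XORed with the 48-bit $K_i$, split into blocks $B_1, \ldots, B_8$ that pass through the S-boxes $S_1, \ldots, S_8$ to give $C_1, \ldots, C_8$, and permuted by P into the 32-bit output.]
Figure 4: The DES f function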
$R_{i-1}$ is first expanded to 48 bits and then XORed with the 48-bit round key. The result is divided into 8 blocks of 6 bits each, which are fed to the 8 S-boxes; each S-box outputs 4 bits. The 8 output blocks $C_i$ (32 bits in total) are then permuted according to P, and the result of this permutation is the output of the f function.
The key schedule generates 16 subkeys of 48 bits each from the initial 64-bit key (56 bits once the parity bits are dropped). The scheduling is shown in Figure 5. PC-1 and PC-2 are also permutations, which in addition drop some of their input bits.
Figure 5: Key scheduling of DES
On January 2, 1997, NIST began the process of choosing a replacement for DES, called the Advanced Encryption Standard (AES). AES requires a block length of 128 bits and must support key lengths of 128, 192, and 256 bits (Nk = 4, 6, 8).
On October 2, 2000, Rijndael [3][4] was selected as the new standard.
Rijndael supports block lengths of 128, 192, and 256 bits (Nb = 4, 6, 8), of which AES uses only the 128-bit case. The number of rounds Nr is 10, 12, or 14, determined by the larger of Nb and Nk. All operations in AES are byte oriented.
The State is the input arranged as a byte array (Figure 6). AES first generates the required subkeys from the initial key using the KeyExpansion algorithm. Then, for the first Nr-1 rounds, it performs the Round function, which consists of ByteSub, ShiftRow, MixColumn, and AddRoundKey. Finally, it applies the FinalRound, which is the same as Round but without MixColumn. The algorithm is given in Algorithm 2.1 in pseudo-C notation.
KeyExpansion generates the Nr+1 round subkeys from the initial key. The expanded key is a linear array of 4-byte words. The first Nk words contain the cipher key. All other words are defined recursively in terms of words with smaller indices. The algorithm is given in Algorithm 2.2.
Figure 6: AES State representation for Nb=4, 6, 8
Algorithm 2.1: AES algorithm
Rijndael(State, Key)
The round constants are independent of Nk and are defined by:
Rcon[i] = (RC[i], '00', '00', '00'), with RC[i] representing an element of $GF(2^8)$ with value $x^{i-1}$, so that:
RC[1] = 1 (i.e. '01')
RC[i] = x (i.e. '02') · RC[i-1] = $x^{i-1}$
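The recursion for RC[i] can be sketched as follows. The reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (written `0x11B`) is taken from the AES specification and is not stated in the text above; the function names are illustrative only.

```python
def xtime(b):
    """Multiply b by x ('02') in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:          # degree-8 term present: reduce
        b ^= 0x11B
    return b

def round_constants(count):
    """Return [RC[1], ..., RC[count]] with RC[1] = 1 and RC[i] = x * RC[i-1]."""
    rc = [0x01]
    while len(rc) < count:
        rc.append(xtime(rc[-1]))
    return rc

RC = round_constants(10)
# Rcon[i] = (RC[i], '00', '00', '00'); list index 0 holds Rcon[1].
RCON = [(rc, 0x00, 0x00, 0x00) for rc in RC]
```

The first eight constants are plain powers of two; the ninth is the first one where the modular reduction takes effect.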
RotByte is a cyclic rotation of the bytes, i.e., RotByte(B0,B1,B2,B3) = (B1,B2,B3,B0). Round key i is then given by the Round Key buffer words W[Nb*i] through W[Nb*(i+1)-1].
As stated above, each of the first Nr-1 rounds applies the Round function with its four sub-functions (ByteSub, ShiftRow, MixColumn, and AddRoundKey), and the FinalRound is the same as Round but without MixColumn. Next, we briefly introduce the sub-functions.
ByteSub replaces each byte by another byte, i.e., it acts as an S-box. The detailed algorithm will be given in Chapter 4.
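Algorithm 2.2:
KeyExpansion(byte Key[4*Nk], word W[Nb*(Nr+1)]) {
    /* The first Nk words are the cipher key itself. */
    for (i = 0; i < Nk; i++)
        W[i] = (Key[4*i], Key[4*i+1], Key[4*i+2], Key[4*i+3]);
    /* Each later word depends on the previous word and the word Nk positions back. */
    for (i = Nk; i < Nb * (Nr + 1); i++) {
        temp = W[i - 1];
        if (i % Nk == 0)
            temp = SubByte(RotByte(temp)) ⊕ Rcon[i / Nk];
        else if (Nk > 6 && i % Nk == 4)   /* extra substitution, needed only for Nk = 8 */
            temp = SubByte(temp);
        W[i] = W[i - Nk] ⊕ temp;
    }
}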
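Algorithm 2.2 can be turned into a small runnable sketch. Several details below are assumptions taken from the AES specification rather than from this text: the S-box is built as the multiplicative inverse in $GF(2^8)$ followed by the standard affine map, the field polynomial is `0x11B`, Nr is computed as max(Nk, Nb) + 6, and an extra substitution step is applied for Nk = 8. Words are represented as 4-tuples of bytes.

```python
def xtime(b):
    """Multiply by x in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b

def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a = xtime(a)
        b >>= 1
    return p

def gf_inv(a):
    """Multiplicative inverse as a^254, with 0 mapped to 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r if a else 0

def rotl8(b, n):
    return ((b << n) | (b >> (8 - n))) & 0xFF

def sbox_entry(x):
    v = gf_inv(x)
    return v ^ rotl8(v, 1) ^ rotl8(v, 2) ^ rotl8(v, 3) ^ rotl8(v, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

def key_expansion(key, nb=4):
    """Return the expanded key as a list of Nb*(Nr+1) words (4-tuples of bytes)."""
    nk = len(key) // 4
    nr = max(nk, nb) + 6
    w = [tuple(key[4 * i:4 * i + 4]) for i in range(nk)]
    rc = 0x01                                       # RC[1]
    for i in range(nk, nb * (nr + 1)):
        temp = w[i - 1]
        if i % nk == 0:
            temp = temp[1:] + temp[:1]              # RotByte
            temp = tuple(SBOX[b] for b in temp)     # SubByte
            temp = (temp[0] ^ rc,) + temp[1:]       # XOR with Rcon[i / Nk]
            rc = xtime(rc)
        elif nk > 6 and i % nk == 4:
            temp = tuple(SBOX[b] for b in temp)     # Nk = 8 only
        w.append(tuple(a ^ b for a, b in zip(w[i - nk], temp)))
    return w

key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")   # example key from the AES specification
W = key_expansion(key)
```

Round key i is then the slice `W[4*i : 4*(i+1)]` for Nb = 4.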
Figure 7: AES ByteSub function
ShiftRow cyclically shifts the rows of the State to the left by the offsets given in Table 1; row 0 is not shifted.
Table 1: Shift offsets with different Nb
Nb C1 C2 C3
Figure 8: AES ShiftRow operation
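A minimal sketch of ShiftRow for Nb = 4 follows. The offsets (C1, C2, C3) = (1, 2, 3) used here are the values from the Rijndael specification for Nb = 4; they are an assumption in the sense that the entries of Table 1 are not reproduced in this text.

```python
def shift_row(state, offsets=(0, 1, 2, 3)):
    """Cyclically shift row r of the State left by offsets[r] byte positions.

    `state` is a list of 4 rows, each a list of Nb bytes.
    """
    return [row[c:] + row[:c] for row, c in zip(state, offsets)]

# Bytes numbered column by column, as in the State layout.
state = [[0x00, 0x04, 0x08, 0x0C],
         [0x01, 0x05, 0x09, 0x0D],
         [0x02, 0x06, 0x0A, 0x0E],
         [0x03, 0x07, 0x0B, 0x0F]]
shifted = shift_row(state)
```

For other block lengths only the `offsets` argument would change.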
MixColumn replaces each column of the State by a new one obtained by multiplying the column by a fixed matrix over $GF(2^8)$.
Figure 9: AES MixColumn operation
AddRoundKey simply XORs the State with the RoundKey.
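The last two sub-functions can be sketched on a single column. The matrix below (rows 02 03 01 01, 01 02 03 01, 01 01 02 03, 03 01 01 02) and the field polynomial `0x11B` are taken from the AES specification, not from this text, and the function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

MIX = [[2, 3, 1, 1],
       [1, 2, 3, 1],
       [1, 1, 2, 3],
       [3, 1, 1, 2]]

def mix_column(col):
    """Multiply one 4-byte column by the MixColumn matrix over GF(2^8)."""
    out = []
    for row in MIX:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)   # addition in GF(2^8) is XOR
        out.append(acc)
    return out

def add_round_key(col, key_col):
    """XOR a State column with the matching round key column."""
    return [s ^ k for s, k in zip(col, key_col)]

mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

Since AddRoundKey is a plain XOR, applying it twice with the same key column restores the input.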
2.4 Other Block Cipher Systems
Although the previous two ciphers are the most commonly used today, there are other systems that do not belong to either of these two kinds. They all share one common feature, however: they use repeated rounds to achieve the required security.
2.4.1 RC6
RC6 [28] is a block cipher designed to meet the requirements of AES. The design is based on RC5, modified to increase security and performance. It has a block length of 128 bits and can be seen as extending RC5 from 64-bit to 128-bit blocks.
However, instead of using two 64-bit registers, it uses four 32-bit registers, since the target architectures for AES do not support 64-bit operations. Like RC5, RC6 makes extensive use of data-dependent rotations. The philosophy of RC5 is to exploit operations (such as rotations) that are efficiently implemented on modern processors. RC6 follows this philosophy and adds 32-bit integer multiplication, since this operation is now implemented on almost all processors. The advantage of integer multiplication is that it diffuses effectively. RC6 uses it to compute the rotation amounts, so that they depend on all of the bits of another register. Thus RC6 has much faster diffusion than RC5 and achieves its security with fewer rounds.
A version of RC6 is more accurately specified as RC6-w/r/b, where the word size is w bits, encryption consists of a nonnegative number of rounds r, and b denotes the length of the encryption key in bytes. RC6 consists of the following six basic operations:
a + b: integer addition modulo $2^w$
a - b: integer subtraction modulo $2^w$
a ⊕ b: bitwise exclusive-or of w-bit words
a × b: integer multiplication modulo $2^w$
a <<< b: rotate the w-bit word a to the left by the amount given by the least significant lg w bits of b
a >>> b: rotate the w-bit word a to the right by the amount given by the least significant lg w bits of b
The key scheduling is as follows. The user supplies a key of b bytes, where 0 ≤ b ≤ 255. From this key, 2r + 4 words (w bits each) are derived and stored in the array S[0,1,…,2r + 3]. This array is used in both encryption and decryption. The encryption and decryption algorithms are shown in the following figures.
Input: Plaintext stored in four w-bit input registers A, B, C, D
Number r of rounds
w-bit round keys S[0,1,…,2r + 3]
Figure 10. Encryption algorithm with RC6-w/r/b
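Since only the inputs of the figure survive in this text, the following is a sketch of RC6-32/r/b encryption and decryption as given in the published RC6 description, built from the six basic operations listed above. The key schedule is not implemented: the round keys `S` below are arbitrary placeholders, so the output is not comparable with official test vectors. The sketch only demonstrates the round structure and that decryption inverts encryption.

```python
W = 32                     # word size w
MASK = (1 << W) - 1
LGW = 5                    # lg w

def rotl(a, b):
    b &= W - 1             # only the least significant lg w bits of b are used
    return ((a << b) | (a >> (W - b))) & MASK

def rotr(a, b):
    b &= W - 1
    return ((a >> b) | (a << (W - b))) & MASK

def rc6_encrypt(A, B, C, D, S, r):
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, r + 1):
        t = rotl((B * (2 * B + 1)) & MASK, LGW)
        u = rotl((D * (2 * D + 1)) & MASK, LGW)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK       # data-dependent rotation
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    A = (A + S[2 * r + 2]) & MASK
    C = (C + S[2 * r + 3]) & MASK
    return A, B, C, D

def rc6_decrypt(A, B, C, D, S, r):
    C = (C - S[2 * r + 3]) & MASK
    A = (A - S[2 * r + 2]) & MASK
    for i in range(r, 0, -1):
        A, B, C, D = D, A, B, C
        u = rotl((D * (2 * D + 1)) & MASK, LGW)
        t = rotl((B * (2 * B + 1)) & MASK, LGW)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return A, B, C, D

R = 20
S = [(0x9E3779B9 * (i + 1)) & MASK for i in range(2 * R + 4)]   # placeholder round keys
pt = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)
ct = rc6_encrypt(*pt, S, R)
```

The multiplications B × (2B + 1) and D × (2D + 1) are what make the rotation amounts depend on all bits of a register, as described above.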
2.4.3 IDEA
IDEA (International Data Encryption Algorithm) [16][17] was developed by Lai and Massey in 1991. IDEA is used in PGP (Pretty Good Privacy), a cryptographic system for Internet and e-mail security. Like DES, IDEA has a 64-bit block length; it uses 8 rounds and a 128-bit key.
The algorithm is illustrated in Figure 12. The 64-bit plaintext is divided into four 16-bit blocks, $X_1, X_2, X_3, X_4$. In each round, six 16-bit subkeys are used, denoted by $K_{i,1}, K_{i,2}, \ldots, K_{i,6}$ for round i. Since there are 8 rounds, 48 subkeys are used, plus 4 extra subkeys used after the last round to transform the output. The four output ciphertext blocks are denoted by $Y_1, Y_2, Y_3, Y_4$.
Input: Ciphertext stored in four w-bit input registers A, B, C, D
Number r of rounds
w-bit round keys S[0,1,…,2r + 3]
Figure 11. Decryption algorithm with RC6-w/r/b
In each round, the 16-bit blocks are XORed, added, and multiplied as the figure shows. The multiplication modulo $2^{16}+1$ can be regarded as the S-box of IDEA. After the last round, each of the resulting 16-bit blocks is combined with its corresponding subkey: the first and fourth blocks are multiplied modulo $2^{16}+1$, and the second and third blocks are added modulo $2^{16}$.
The key schedule is simple. The 128-bit initial key is divided into 8 blocks of 16 bits, which become $K_{1,1}, \ldots, K_{1,6}$ and $K_{2,1}, K_{2,2}$. The key is then cyclically shifted left by 25 bits and again divided into 8 blocks, which give the next 8 subkeys. The procedure continues until 52 subkeys are generated.
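The key schedule just described is short enough to sketch directly. The function name and the representation of the key as a 128-bit Python integer are choices made for this illustration; subkey $K_{i,j}$ ends up at list index 6(i-1) + (j-1).

```python
MASK128 = (1 << 128) - 1

def idea_subkeys(key128):
    """Derive the 52 16-bit IDEA subkeys from a 128-bit key given as an integer."""
    subkeys = []
    k = key128
    while len(subkeys) < 52:
        # Cut the current 128-bit value into 8 blocks of 16 bits, left to right.
        for i in range(8):
            subkeys.append((k >> (112 - 16 * i)) & 0xFFFF)
        # Cyclic left shift by 25 bits before the next batch.
        k = ((k << 25) | (k >> 103)) & MASK128
    return subkeys[:52]

subkeys = idea_subkeys(0x00010002000300040005000600070008)
```

Seven batches of 8 blocks are produced (56 values) and the last 4 are discarded.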
Decryption uses the same algorithm as encryption. The subkeys are used in reverse order with some modifications: each decryption subkey is the inverse of the corresponding encryption subkey, with respect to multiplication or addition as appropriate.
Figure 12. The IDEA structure
⊕: bit-by-bit XOR
⊞: addition modulo $2^{16}$
⊙: multiplication modulo $2^{16}+1$, where the all-zero block represents $2^{16}$
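The multiplication modulo $2^{16}+1$ and the two kinds of key inverses used for decryption can be sketched as follows. The function names are illustrative. The multiplicative inverse uses Fermat's little theorem, which applies because $2^{16}+1 = 65537$ is prime.

```python
MOD_MUL = 0x10001      # 2^16 + 1
MOD_ADD = 0x10000      # 2^16

def idea_mul(a, b):
    """Multiply modulo 2^16 + 1, with the all-zero block standing for 2^16."""
    a = a or MOD_ADD
    b = b or MOD_ADD
    return ((a * b) % MOD_MUL) & 0xFFFF   # a result of 2^16 is stored as 0

def idea_mul_inv(a):
    """Inverse with respect to idea_mul."""
    a = a or MOD_ADD
    return pow(a, MOD_MUL - 2, MOD_MUL) & 0xFFFF

def idea_add_inv(a):
    """Inverse with respect to addition modulo 2^16."""
    return (-a) & 0xFFFF

x, k = 0x1234, 0xABCD
restored_mul = idea_mul(idea_mul(x, k), idea_mul_inv(k))
restored_add = (((x + k) & 0xFFFF) + idea_add_inv(k)) & 0xFFFF
```

Because zero is interpreted as $2^{16}$, every 16-bit block is invertible under the multiplication, which is why it can serve as a keyed substitution.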
In this chapter, we introduced several block cipher systems, from the basic schemes (Feistel networks and SPNs) to the standard systems (DES and AES). In the next chapter, we use linear cryptanalysis to attack SPNs and present our strategies for attacking them more efficiently.
Chapter 3 Linear Cryptanalysis
In this chapter, we introduce linear cryptanalysis, one of the most important attacks on block cipher systems. Section 3.1 briefly introduces the concept of Matsui's attack on DES. Section 3.2 gives the complete procedure of the attack on SPNs. Section 3.3 introduces improved techniques proposed by other researchers. Section 3.4 presents our new strategies, which find trails with good bias, and closes with their performance.
3.1 Matsui’s Attack on DES
Linear cryptanalysis was originally developed by Matsui and Yamagishi [21] against the FEAL (Fast Data Encipherment Algorithm) cipher [31] in 1992. In 1994, Matsui adapted it to DES [18] in a theoretical attack on the full 16-round DES that requires $2^{47}$ known plaintext-ciphertext pairs and obtains 14 key bits. It has since become one of the most important attacks against block ciphers. In his paper, Matsui introduced two attack algorithms. The first, called Algorithm 1, recovers only one bit of key information. The second, called Algorithm 2, can extract more key bits in one attack.
Algorithm 1:
Step 1: Let T be the number of plaintexts such that the left side of the equation
$$P[i_1, i_2, \ldots, i_a] \oplus C[j_1, j_2, \ldots, j_b] = K[k_1, k_2, \ldots, k_c]$$
is equal to zero.
Step 2: If T > N/2 (N denotes the number of plaintexts), then guess $K[k_1, k_2, \ldots, k_c] = 0$ (when p > 1/2) or 1 (when p < 1/2); else guess $K[k_1, k_2, \ldots, k_c] = 1$ (when p > 1/2) or 0 (when p < 1/2). Here p is the probability that the equation holds.
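Algorithm 1 amounts to a majority vote, which the following sketch makes concrete. The bit selections $P[i_1, \ldots, i_a]$ and $C[j_1, \ldots, j_b]$ are represented as integer masks, and the "cipher" used to exercise the function is a deliberately trivial XOR with the key (so the approximation holds with p = 1); both are assumptions of this illustration and have nothing to do with DES.

```python
def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

def algorithm1(pairs, p_mask, c_mask, p_gt_half=True):
    """Guess the key bit sum K[k1,...,kc] from known plaintext-ciphertext pairs."""
    n = len(pairs)
    # T: number of pairs for which the left side P[...] xor C[...] equals 0.
    t = sum(1 for p, c in pairs if (parity(p & p_mask) ^ parity(c & c_mask)) == 0)
    if t > n / 2:
        return 0 if p_gt_half else 1
    return 1 if p_gt_half else 0

def toy_pairs(key):
    """Known pairs for the toy cipher C = P xor K on 8-bit blocks."""
    return [(p, p ^ key) for p in range(256)]

mask = 0b00010110
guess_a = algorithm1(toy_pairs(0b10110100), mask, mask)
guess_b = algorithm1(toy_pairs(0b00000010), mask, mask)
```

For a real cipher p is only slightly different from 1/2, so N has to be large enough for the majority to be reliable.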
Algorithm 2:
Step 1: For each candidate $K_n^{(i)}$ (i = 1, 2, ...) of $K_n$, let $T_i$ be the number of plaintexts such that the left side of the equation
$$P[i_1, i_2, \ldots, i_a] \oplus C[j_1, j_2, \ldots, j_b] \oplus F_n(C_L, K_n)[l_1, l_2, \ldots, l_d] = K[k_1, k_2, \ldots, k_c]$$
is equal to zero.
Step 2: Let $T_{max}$ be the maximal value and $T_{min}$ be the minimal value of all $T_i$'s.
• If $|T_{max} - N/2| > |T_{min} - N/2|$, then adopt the key candidate corresponding to $T_{max}$ and guess $K[k_1, k_2, \ldots, k_c] = 0$ (when p > 1/2) or 1 (when p < 1/2).
• If $|T_{max} - N/2| < |T_{min} - N/2|$, then adopt the key candidate corresponding to $T_{min}$ and guess $K[k_1, k_2, \ldots, k_c] = 1$ (when p > 1/2) or 0 (when p < 1/2).
In the remaining parts of this thesis, we focus on Algorithm 2 since it is much more powerful.
3.2 Linear Cryptanalysis on SPNs
Here we briefly explain how linear cryptanalysis works on SPNs. A detailed introduction is given in [11][33], and Keliher also discusses linear attacks on SPNs in [14]. To apply a linear attack, we need to find a subset of bits whose XOR behaves in a non-random way. We first introduce a lemma that is useful in linear attacks.
3.2.1 The Piling-up lemma
Suppose $X_1, X_2, \ldots \in \{0,1\}$ are independent random variables. Let $p_1, p_2, \ldots$ be real numbers such that $0 \le p_i \le 1$, and suppose that $\Pr[X_i = 0] = p_i$ and $\Pr[X_i = 1] = 1 - p_i$. We define the bias of $X_i$ to be $\epsilon_i = p_i - \frac{1}{2}$. Let $\epsilon_{i_1, i_2, \ldots, i_k}$ denote the bias of the random variable $X_{i_1} \oplus \cdots \oplus X_{i_k}$. It is easy to see that
$$\epsilon_{i_1, i_2} = 2\,\epsilon_{i_1}\,\epsilon_{i_2}.$$
We can generalize this in the following lemma.
Lemma 3.1 (Piling-up Lemma) [18]: Let $\epsilon_{i_1, i_2, \ldots, i_k}$ denote the bias of the random variable $X_{i_1} \oplus \cdots \oplus X_{i_k}$. Then
$$\epsilon_{i_1, i_2, \ldots, i_k} = 2^{k-1} \prod_{j=1}^{k} \epsilon_{i_j}.$$
3.2.2 Linear approximations of S-boxes
Next, we compute the linear approximation table of an S-box so that we can determine which bits have a non-random XOR.
Example 3.1: Consider the following S-box $\pi_S: \{0,1\}^4 \to \{0,1\}^4$.
X1 X2 X3 X4 Y1 Y2 Y3 Y4
0 0 0 0 1 1 1 0
0 0 0 1 0 1 0 0
0 0 1 0 1 1 0 1
0 0 1 1 0 0 1 0
0 1 0 0 0 0 0 1
0 1 0 1 1 1 1 1
0 1 1 0 1 0 1 1
0 1 1 1 1 0 0 0
1 0 0 0 0 0 1 1
1 0 0 1 1 0 1 0
1 0 1 0 0 1 1 0
1 0 1 1 1 1 0 0
1 1 0 0 0 1 0 1
1 1 0 1 0 0 0 0
1 1 1 0 1 0 0 1
1 1 1 1 0 1 1 1
If we want to know the probability that $X_2 \oplus Y_2 \oplus Y_3 = 0$, we count the number of rows in the above table where $X_2 \oplus Y_2 \oplus Y_3 = 0$ and denote this number by $N_L$. We then divide $N_L$ by $2^4$ (4 is the number of S-box input bits) to get the probability that $X_2 \oplus Y_2 \oplus Y_3 = 0$. Here $N_L = 4$, so the probability is 4/16 and the bias is $4/16 - 1/2 = -1/4$.
In a similar way, we can record all possible input-output XORs in a linear approximation table (Table 2). We read the table using the following notation:
$$\left(\bigoplus_{i=1}^{4} a_i X_i\right) \oplus \left(\bigoplus_{i=1}^{4} b_i Y_i\right), \qquad a_i, b_i \in \{0,1\}.$$
Take $(a_1, \ldots, a_4)$ as the row index and $(b_1, \ldots, b_4)$ as the column index. The values in the table are $N_L - 8$. Thus, $X_2 \oplus Y_2 \oplus Y_3$ of Example 3.1 is expressed as a = 0100, b = 0110, and the corresponding $N_L - 8$ is the shaded entry of the table (row 4, column 6), which is -4, in agreement with the count in Example 3.1. The table consists of $2^n \times 2^m$ entries, where n and m denote the number of X variables and Y variables respectively (in Example 3.1, n = m = 4). In linear cryptanalysis, we search for entries with a large absolute bias to attack.
Table 2: Linear approximation table of Example 3.1
a (rows) \ b (columns) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 +8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 -4 0 -4 0 -4 0 +4 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 +2 -2 +6 +2 +2 -2 -2 +2
3 0 0 0 0 0 0 0 0 +2 -6 -2 -2 +2 +2 -2 -2
4 0 +4 -2 -2 -2 -2 -4 0 0 0 -2 +2 +2 -2 0 0
5 0 0 -2 +2 -2 +2 +4 +4 0 0 -2 +2 +2 -2 0 0
6 0 0 -2 +2 +2 -2 0 0 -2 -2 0 +4 -4 0 -2 -2
7 0 0 -2 +2 +2 -2 0 0 -2 +2 0 0 +4 +4 -2 +2
8 0 0 0 0 0 0 0 0 -2 +2 +2 -2 +2 -2 -2 -6
9 0 0 0 0 0 0 0 0 -2 -2 +2 +2 +2 +2 +6 -2
10 0 0 0 0 -4 -4 +4 -4 0 0 0 0 0 0 0 0
11 0 +4 0 -4 +4 0 +4 0 0 0 0 0 0 0 0 0
12 0 0 +2 -2 -2 +2 0 0 +2 +2 0 +4 0 +4 -2 -2
13 0 0 +2 -2 -2 +2 0 0 -6 -2 0 0 0 0 -2 +2
14 0 +4 +2 +2 -2 -2 0 +4 0 0 +2 -2 -2 +2 0 0
15 0 0 -6 -2 -2 +2 0 0 0 0 +2 -2 -2 +2 0 0
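The construction of Table 2 can be sketched as follows. The list `SBOX` is read off the truth table of Example 3.1, with $X_1$ and $Y_1$ as the most significant bits; the masks a and b use the same bit order, so that a = 0100 selects $X_2$. The function names are illustrative.

```python
# S-box of Example 3.1: SBOX[x] is the output (Y1 Y2 Y3 Y4) for input x = (X1 X2 X3 X4).
SBOX = [0xE, 0x4, 0xD, 0x2, 0x1, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x0, 0x9, 0x7]

def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

def linear_approximation_table(sbox, n=4, m=4):
    """Entry [a][b] is N_L - 2^(n-1), where N_L counts the inputs x for which
    the XOR of the input bits selected by a equals the XOR of the output bits
    selected by b."""
    table = [[0] * (1 << m) for _ in range(1 << n)]
    for a in range(1 << n):
        for b in range(1 << m):
            n_l = sum(1 for x in range(1 << n)
                      if parity(a & x) == parity(b & sbox[x]))
            table[a][b] = n_l - (1 << (n - 1))
    return table

LAT = linear_approximation_table(SBOX)
bias_example = LAT[0b0100][0b0110] / 16      # bias of X2 xor Y2 xor Y3
```

Dividing an entry by $2^4$ gives the bias directly, which is the quantity the Piling-up lemma combines.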
3.2.3 Linear expression of a trail
We then use such a weakness (a large bias) to find a trail through the entire SPN, giving a linear expression that involves only some plaintext bits, some data bits entering the last round (bits of $U^{Nr}$), and all subkey bits encountered along the path. All other intermediate data bits of $U^r$ and $V^r$, where r < Nr, cancel out. Thus we obtain a linear expression of the form
$$P_I \oplus C_J \oplus K_K = 0, \qquad (3.1)$$
where $P_I$, $C_J$, and $K_K$ denote the XOR of some plaintext bits, some data bits of $U^{Nr}$, and the encountered key bits, respectively. Since $K_K$ is a fixed (unknown) constant for a given key, what we care about is only
$$P_I \oplus C_J = 0. \qquad (3.2)$$
[Figure 13 depicts a four-round SPN with plaintext bits P1 … P16 at the top and ciphertext bits C1 … C16 at the bottom. Each round r consists of subkey Kr mixing, four S-boxes Sr1 … Sr4, and a permutation, with the intermediate values labelled $U^r$, $V^r$ and $W^r$; a final subkey K5 mixing follows the S-boxes of round 4.]
Figure 13: A possible attack trail.
Figure 13 shows a possible attack trail. Here, $P_I$ is $P_5 \oplus P_7 \oplus P_8$ and $C_J$ is $U^4_6 \oplus U^4_7 \oplus U^4_{14} \oplus U^4_{15}$. The trail is formed as follows. In S12, we choose $X_1 \oplus X_3 \oplus X_4 \oplus Y_4$ since it has a large bias. We then follow the output through the permutation and the XOR with K2. In round 2, it becomes the input $X_2$ of S24. So we look up the linear approximation table to find which output bits, XORed with $X_2$, give a large bias (row 4, since $X_2$ corresponds to $0100_2$). Continuing this procedure, we obtain a complete trail.
After the trail is determined, the overall bias of the entire SPN can be calculated by the Piling-up lemma (the bias of each S-box encountered is viewed as one $\epsilon_{i_j}$), and we denote the resulting bias by $\epsilon$.
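The bias computation for a trail can be sketched with exact rational arithmetic. The four S-box biases in `trail` are hypothetical values of the kind read from Table 2 (entries of ±4 divided by 16); they are not the biases of the trail in Figure 13. The second function checks the lemma by enumerating all outcomes of independent bits, which is only practical for small k.

```python
from fractions import Fraction
from itertools import product

def piling_up(biases):
    """Bias of X_1 xor ... xor X_k for independent bits: 2^(k-1) * product of biases."""
    result = Fraction(1, 2)
    for e in biases:
        result *= 2 * Fraction(e)
    return result

def xor_bias_by_enumeration(biases):
    """Same quantity computed directly from the definition of the bias."""
    p_zero = Fraction(0)
    for bits in product((0, 1), repeat=len(biases)):
        prob = Fraction(1)
        for bit, e in zip(bits, biases):
            p0 = Fraction(1, 2) + e               # Pr[X_i = 0]
            prob *= p0 if bit == 0 else 1 - p0
        if sum(bits) % 2 == 0:                    # XOR of the bits is 0
            p_zero += prob
    return p_zero - Fraction(1, 2)

# Hypothetical trail through four S-boxes.
trail = [Fraction(4, 16), Fraction(-4, 16), Fraction(-4, 16), Fraction(-4, 16)]
epsilon = piling_up(trail)
```

The lemma assumes the S-box approximations are independent, which holds only approximately inside a real cipher.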
3.2.4 Subkeys attack
Once we have the trail and the bias, we then begin to extract the subkeys of the last round. It proceeds as follows:
1. The subkey bits we are going to extract are those involved in the last part of the trail. For example, in Figure 13, the bits of $C_J$ in (3.2) are inputs to the second and fourth S-boxes. The subkey bits being extracted are then those at the output positions of these S-boxes, i.e., the circled part in Figure 13.
2. Since the attack is a known-plaintext attack, we have many plaintext-ciphertext pairs, say T pairs. We maintain an array of counters, one for each candidate subkey. For each candidate subkey, we partially decrypt each ciphertext. If the linear expression (3.2) holds, we increment the counter of that candidate.
3. In the end, we expect the candidate whose counter is closest to $(\frac{1}{2} \pm \epsilon)T$ to be the most likely subkey.
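The counting procedure in steps 1 to 3 can be sketched on a toy cipher. Everything about the cipher is an assumption made for illustration: it is a two-round, 4-bit construction with a single S-box per round (the S-box of Example 3.1), and the trail is the one-round approximation a = 0100, b = 0110 with table entry -4. With the full codebook of 16 plaintexts, the counter of the correct last subkey deviates from T/2 by exactly 4.

```python
SBOX = [0xE, 0x4, 0xD, 0x2, 0x1, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x0, 0x9, 0x7]
INV_SBOX = [SBOX.index(y) for y in range(16)]

def parity(x):
    return bin(x).count("1") & 1

def toy_encrypt(p, k1, k2, k3):
    """Toy 2-round SPN on 4 bits: U1 = P^K1, V1 = S[U1], U2 = V1^K2, C = S[U2]^K3."""
    u1 = p ^ k1
    v1 = SBOX[u1]
    u2 = v1 ^ k2
    return SBOX[u2] ^ k3

def count_candidates(pairs, p_mask, u_mask):
    """One counter per candidate last-round subkey (step 2)."""
    counters = [0] * 16
    for cand in range(16):
        for p, c in pairs:
            u2 = INV_SBOX[c ^ cand]               # partial decryption of the last round
            if (parity(p & p_mask) ^ parity(u2 & u_mask)) == 0:
                counters[cand] += 1               # expression (3.2) holds
    return counters

K1, K2, K3 = 0x3, 0xA, 0x6                        # secret subkeys of the toy cipher
pairs = [(p, toy_encrypt(p, K1, K2, K3)) for p in range(16)]
counters = count_candidates(pairs, p_mask=0b0100, u_mask=0b0110)
deviation = [abs(t - len(pairs) / 2) for t in counters]
```

On such a small cipher wrong candidates can reach the same deviation as the correct one, so the sketch shows the mechanics of the counters rather than a reliable key recovery; a real attack uses a larger block and far more pairs than candidates.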
3.3 More on Linear Cryptanalysis
In this section, we introduce some further research on linear cryptanalysis. With the help of these techniques, we can increase the success rate and reduce the number of data pairs needed.
3.3.1 Linear hull
Nyberg [24] proposed the linear hull effect in 1994. The main result shows that the success rate of Algorithm 2 is underestimated in Matsui's paper, or equivalently that the required data complexity can be reduced. The reason is that there may be many linear expressions with the same input and output masks but different internal subkey bits, i.e., $P_I$ and $C_J$ are the same but $K_K$ is different. For input mask a and output mask b, ALH(a,b) denotes the approximation linear hull. We state the definition and theorem in the more accessible form given in [15].
Definition 3.1: Given nonzero N-bit masks a, b, the approximation linear hull, ALH(a,b), is the set of all T-round characteristics, for the T rounds under consideration, having a as the input mask for round 1 and b as the output mask for round T, i.e., all characteristics of the form $\Omega = \langle a, a_2, a_3, \ldots, a_T, b \rangle$.
The characteristic Ω here corresponds to what we called a trail earlier. We have the following theorem.
Theorem 3.1: Let a and b be fixed nonzero N-bit input and output masks, respectively, for T rounds of an SPN. Then
$$E_T[a,b] = \sum_{\Omega \in ALH(a,b)^*} LCP(\Omega). \qquad (3.3)$$
$E_T[a,b]$ denotes the expected value of the linear probability of the masks (a,b) over all independent keys, and LCP(Ω) denotes the linear characteristic probability of a characteristic Ω. The theorem shows that for given masks (a,b) there may be many different characteristics, and that the expected value for the masks (a,b) is the sum of LCP(Ω) over a large set of characteristics. In other words, for given $P_I$ and $C_J$, the expected value is the sum over a large set of different trails with the same $P_I$ and $C_J$. Therefore, the linear characteristic probability of the best characteristic is no greater than $E_T[a,b]$, and strictly less whenever another characteristic contributes. This implies that an attacker who relies only on the best trail will overestimate the number of pairs required for a given success rate.
3.3.2 Key ranking
After linear cryptanalysis was proposed, Matsui carried out the attack experimentally in 1994 with some modifications [20]. In that paper, he uses two new linear approximation equations, each of which provides candidates for 13 key bits. Further, he takes the reliability of the key candidates into consideration: he stores not only the most likely key bits but also the i-th most likely candidates. That is, he stores the keys $\hat{k}_1, \hat{k}_2, \ldots$ in order, where $\hat{k}_i$ is the i-th most likely value of the key bits. If the most likely candidate turns out to be wrong, he falls back to the second most likely one, and so on. A candidate is tested with a plaintext-ciphertext pair (P, C): the remaining key bits are searched exhaustively to check whether the candidate key bits can generate C from P. To increase accuracy, a few more pairs {(P1, C1), (P2, C2), …} can be used, since wrong key bits generate the correct Ci only with negligible probability. Thus, if $\hat{k}_1$ fails the test, then $\hat{k}_2$ is used, and so on until the correct one is found.
With this simple improvement, he increased the success rate. In his experiment, he successfully recovered 26 key bits of the full 16-round DES with $2^{43}$ plaintext-ciphertext pairs. The remaining 30 key bits can be found by exhaustive key search. In comparison with his original attack, more key bits are recovered with fewer pairs.
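The ranking idea can be sketched independently of DES. The counter values and the `verify` callback below are hypothetical: in the real attack `verify` would run the exhaustive search over the remaining key bits against a few known pairs, whereas here it is a stand-in that simply recognizes one fixed candidate.

```python
def rank_candidates(counters, n):
    """Order candidates from most to least likely by |T_i - N/2|."""
    return sorted(counters, key=lambda cand: abs(counters[cand] - n / 2), reverse=True)

def first_verified(ranked, verify):
    """Try candidates in ranked order; return (candidate, rank) of the first that passes."""
    for rank, cand in enumerate(ranked, start=1):
        if verify(cand):
            return cand, rank
    return None, None

# Hypothetical counters T_i for four candidates of two key bits, N = 1000 pairs.
N = 1000
counters = {0b00: 510, 0b01: 560, 0b10: 430, 0b11: 495}
ranked = rank_candidates(counters, N)

true_key_bits = 0b01                                   # stand-in for the unknown key
found, rank = first_verified(ranked, lambda cand: cand == true_key_bits)
```

In this made-up example the correct candidate is ranked second, so an attack that kept only the top candidate would fail while the ranked attack succeeds on its second test.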
3.3.3 Multiple linear approximations
Kaliski and Robshaw [29] proposed using multiple linear approximations in linear cryptanalysis at CRYPTO '94. Suppose we have n linear approximations that involve the same key bits but differ in the plaintext and ciphertext bits they use. Each linear approximation i is assigned a weight $a_i$ (which may be decided by its bias), with $\sum_{i=1}^{n} a_i = 1$. Then, for each candidate $K^{(j)}$, j = 1, 2, …, of the key bits and each linear approximation i, let $T_j^i$ be the number of pairs for which the linear equation holds. We then calculate
$$U_j = \sum_{i=1}^{n} a_i T_j^i$$
for each j. The rest is just like the original Algorithm 2 in Matsui's attack, i.e., we find the $U_j$ that is furthest from N/2 (N is the number of pairs) and take the corresponding candidate as the most likely key bits.
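The combination step can be sketched as follows. The weighting rule used here, $a_i$ proportional to the absolute bias of approximation i, is one simple choice consistent with "decided by their biases" and is an assumption of this sketch, as are the counter values. The sketch also assumes the approximations have been oriented so that their counters deviate from N/2 in the same direction for the correct candidate.

```python
def weights_from_biases(biases):
    """Weights a_i proportional to |bias_i|, normalized to sum to 1 (assumed rule)."""
    total = sum(abs(e) for e in biases)
    return [abs(e) / total for e in biases]

def combine_counters(counts, weights):
    """counts[i][j] is T_j^i; returns the list of U_j = sum_i a_i * T_j^i."""
    n_candidates = len(counts[0])
    return [sum(w * counts[i][j] for i, w in enumerate(weights))
            for j in range(n_candidates)]

def most_likely(u, n):
    """Index j whose U_j is furthest from N/2."""
    return max(range(len(u)), key=lambda j: abs(u[j] - n / 2))

# Hypothetical data: two approximations, three candidate keys, N = 1000 pairs.
N = 1000
biases = [0.06, 0.03]
counts = [[505, 540, 490],     # T_j^1 for j = 0, 1, 2
          [498, 545, 510]]     # T_j^2 for j = 0, 1, 2
a = weights_from_biases(biases)
U = combine_counters(counts, a)
best = most_likely(U, N)
```

With a single approximation (n = 1, $a_1$ = 1) the procedure reduces to the selection step of Algorithm 2.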
This technique is intended to increase the success rate and reduce the data complexity. In their experiments, however, the gain in effectiveness on DES is somewhat limited. It is still an important technique, since it may be generally applicable to other block ciphers and may be very effective in reducing data complexity there.
3.4 Our Attack Design
As mentioned in the introduction, we want to apply linear cryptanalysis repeatedly to obtain most of the key bits. We use one trail to extract a subset of key bits and another trail to extract another subset. Once all bits of the last round key $K^{Nr+1}$ have been extracted, we move one level up and extract the key bits of $K^{Nr}$ with new trails, and so on.
3.4.1 Observations
Before we explain our strategies, there are some observations to be made.
1. The number of subkey bits attacked in a single attack should not be too large, i.e., not too many S-boxes should be involved in the last round, because the more subkey bits we want to extract in one attack, the more time we need. For example, if we want to get 8 key bits at once, we have to test $2^8$ candidates for all pairs. But if we get 4 bits and then another 4 bits in two attacks, we only need to test $2 \times 2^4$ candidates for all pairs.
2. The fewer S-boxes are involved, the larger the bias. There may exist an input-output XOR with the largest bias whose output spreads to many S-boxes through the permutation; we should then consider whether such a path is worth choosing.
3. It is easy to see that with a trail through the first Nr-1 rounds we can get bits of $K^{Nr+1}$. So with a trail through Nr-2 rounds we can get bits of $K^{Nr}$. Continuing the process we can get all key