Unified Inversion and Division Algorithms

2.4 Galois Field Arithmetic

2.4.2 Unified Inversion and Division Algorithms

Unified Inversion Algorithms based on Fermat’s Little Theorem

Based on Fermat’s Little Theorem (FLT), X^(p−1) ≡ 1 (mod p), the inversion operation is obtained as X⁻¹ ≡ X^(p−2) (mod p). FLT is commonly used in projective operations because of its low cost and high integration with radix-r MMM. Algorithm 2.5 shows the unified MMI algorithm based on FLT (FLT-UMMI); its execution time is about m² ∼ 2m² cycles, where m is the execution cycle of one MMM. FLT can also be used to perform the MI operation, as shown in Algorithm 2.6.

Algorithm 2.5. (Unified MMI algorithm based on FLT.)

Input: X · 2^m and p, where X is in GF (p) or GF (2^m) and p is the prime or irreducible polynomial.

Output: R ≡ X⁻¹· 2^m (mod p).

1. if the operating field is prime, then: T = p − 2
2. else: T = 2^m − 2
3. R = X · 2^m
4. for i from m − 2 to 0 by −1 do
5.   R = MMM(R, R)
6.   if T_i = 1, then: R = MMM(R, X · 2^m)
7. endfor
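As a rough illustration of Algorithm 2.5 for the prime-field case only, the following Python sketch models MMM by a plain reference formula, a · b · 2^(−m) mod p, rather than by a radix-r datapath. The function names `mmm` and `flt_ummi` are hypothetical, m is assumed to be the bit-length of p, and `pow` with a negative exponent assumes Python 3.8 or later.

```python
def mmm(a, b, p, m):
    """Reference model of Montgomery multiplication: a * b * 2^(-m) mod p."""
    return a * b * pow(2, -m, p) % p


def flt_ummi(x_mont, p):
    """Return X^(-1) * 2^m mod p, given x_mont = X * 2^m mod p (p an odd prime)."""
    m = p.bit_length()          # assumption: m is the bit-length of p
    t = p - 2                   # exponent from Fermat's Little Theorem
    r = x_mont                  # accounts for the top bit of the exponent
    for i in range(m - 2, -1, -1):
        r = mmm(r, r, p, m)             # square, result stays in Montgomery form
        if (t >> i) & 1:
            r = mmm(r, x_mont, p, m)    # multiply when bit T_i is set
    return r
```

Because every product carries exactly one factor 2^m, the result stays in the Montgomery domain throughout and no final conversion is needed.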

Algorithm 2.6. (Unified MI algorithm based on FLT (FLT-UMI).)

Input: X and p, where X is in GF (p) or GF (2^m) and p is the prime or irreducible polynomial.

Output: R ≡ X⁻¹ (mod p).

1. if the operating field is prime, then: T = p − 2
2. else: T = 2^m − 2
3. R = X
4. for i from m − 2 to 0 by −1 do
5.   R = R · R (mod p)
6.   if T_i = 1, then: R = R · X (mod p)
7. endfor
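The dual-field aspect of Algorithm 2.6 can be sketched in Python as below. This is a software model under stated assumptions, not the hardware design: binary-field elements are assumed to be stored as integer bit-vectors, `gf2_mulmod` is a hypothetical shift-and-XOR helper, and m is taken as the bit-length of p in GF(p) and as the degree of the irreducible polynomial in GF(2^m).

```python
def gf2_mulmod(a, b, poly, m):
    """Multiply two GF(2^m) elements (bit-vector polynomials) modulo poly."""
    r = 0
    while b:
        if b & 1:
            r ^= a              # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly           # reduce as soon as the degree reaches m
    return r


def flt_umi(x, p, m, prime_field=True):
    """Return X^(-1) in GF(p) or GF(2^m) by square-and-multiply with T."""
    if prime_field:
        t = p - 2
        mul = lambda a, b: a * b % p
    else:
        t = (1 << m) - 2
        mul = lambda a, b: gf2_mulmod(a, b, p, m)
    r = x                       # accounts for the top bit of T
    for i in range(m - 2, -1, -1):
        r = mul(r, r)
        if (t >> i) & 1:
            r = mul(r, x)
    return r
```

For example, `flt_umi(x, 13, 4)` inverts x modulo the prime 13, while `flt_umi(x, 0b10011, 4, prime_field=False)` inverts x in GF(2^4) defined by x^4 + x + 1. Only the multiplier changes between the two fields; the control flow is shared.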

Kaliski’s Unified Inversion Algorithm

Algorithm 2.7 shows the unified inversion algorithm proposed by Kaliski (K-UI) [27].

This algorithm supports the MI and MMI operations over dual fields. It calculates R = X⁻¹ · 2^m (mod p), the Montgomery representation of the modular inverse, where m is the bit-length of p and X (≠ 0) is an element of the field. Similarly, R = X⁻¹ (mod p), the integer representation of the modular inverse, can also be obtained from this algorithm. The inversion is computed by intertwining the procedure for finding the modular quotient with that for calculating gcd(X, p). The algorithm requires four operands, U, V, R, and S. U and V are used for calculating gcd(X, p), and R and S are used for calculating the modular inverse. U and V are initialized to p and X, respectively, and the properties shown in Table 2.5 are applied iteratively to calculate gcd(X, p). For example, when U is even, U can be replaced by U/2 according to the property gcd(U, V) = gcd(U/2, V). In addition, R and S are initialized to 0 and 1, respectively, in agreement with the initial row of Table 2.5. The corresponding operations on R and S are determined by the following invariants:

\[
\begin{cases}
X \cdot R \equiv -U \cdot 2^{i} \pmod{p} \\
X \cdot S \equiv V \cdot 2^{i} \pmod{p}
\end{cases}
\tag{2.5}
\]

During phase 1 (steps 2∼8), the exponent i is increased by 1 in every iteration. Table 2.5 details the operations on U, V, R, and S that follow from the properties and invariants. For instance, if U is even, the algorithm replaces U by U/2 and increases i to i + 1 so that the first equivalence in (2.5) still holds. To increase i to i + 1 in the second equivalence as well, the operand S must be multiplied by 2.

At the end of the while loop, U and V are 1 and 0, respectively, which means R ≡ −X⁻¹ · 2ⁱ (mod p) with m ≤ i ≤ 2m and S ≡ 0 (mod p). Then, in phase 2 (steps 10 to 14), the value of i is reduced to m for the MMI operation or to 0 for the MI operation. This can be done either by iteratively halving modulo p or by multiplication modulo p [28]. After phase 2, R is −X⁻¹ · 2^m (mod p) or −X⁻¹ (mod p), and in the prime field R is brought back into the range [0, p − 1] with the correct sign by computing p − R. Finally, it has been proved that the numbers of cycles needed to complete the MMI and MI operations are m ∼ 3m and 2m ∼ 4m, respectively, if X and p are co-prime [27].

Algorithm 2.7. (Kaliski’s unified inversion algorithm.)
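The listing of Algorithm 2.7 is not reproduced here. As a hedged illustration of the two phases described above, the following Python sketch follows the update rules of Table 2.5 and the invariants (2.5) for the prime-field case only. It is not the listing from [27]: the names `halve_mod` and `kaliski_inverse` and the `montgomery` flag are hypothetical, and phase 2 is assumed to use iterative halving modulo p.

```python
def halve_mod(r, p):
    """Return r / 2 (mod p) for odd p and 0 <= r < p, using add-and-shift."""
    return r // 2 if r % 2 == 0 else (r + p) // 2


def kaliski_inverse(x, p, montgomery=False):
    """Return X^(-1) mod p, or X^(-1) * 2^m mod p when montgomery=True."""
    m = p.bit_length()
    u, v, r, s, i = p, x % p, 0, 1, 0
    if v == 0:
        raise ZeroDivisionError("X must be non-zero modulo p")
    # Phase 1: binary gcd on (U, V); R and S are updated so that
    # X*R = -U*2^i and X*S = V*2^i (mod p) keep holding.
    while v > 0:
        if u % 2 == 0:
            u, s = u // 2, 2 * s
        elif v % 2 == 0:
            v, r = v // 2, 2 * r
        elif u > v:
            u, r, s = (u - v) // 2, r + s, 2 * s
        else:
            v, s, r = (v - u) // 2, r + s, 2 * r
        i += 1
    if u != 1:
        raise ValueError("X and p are not co-prime")
    r %= p                      # now X*R = -2^i (mod p)
    # Phase 2: halve modulo p until the exponent is m (MMI) or 0 (MI).
    target = m if montgomery else 0
    while i > target:
        r = halve_mod(r, p)
        i -= 1
    return (p - r) % p          # remove the minus sign
```

With p = 7 and X = 3, phase 1 ends with i = 4 and R = 4, and `kaliski_inverse(3, 7)` returns 5, which is 3⁻¹ modulo 7.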

Table 2.5: The properties of Kaliski’s unified inversion algorithm.

| State | Invariant on R and U | Invariant on S and V |
|---|---|---|
| Initial | X · (0) ≡ −(p) · 2⁰ (mod p) | X · (1) ≡ (X) · 2⁰ (mod p) |
| End of MMI operation | X · (−X⁻¹ · 2^m) ≡ −(1) · 2^m (mod p) | X · (0) ≡ (0) · 2^ρ (mod p) |
| End of MI operation | X · (−X⁻¹) ≡ −(1) (mod p) | X · (0) ≡ (0) · 2^ρ (mod p) |

| Condition | Properties | Invariants |
|---|---|---|
| U is even | gcd(U, V) = gcd(U/2, V) | X · R ≡ −(U/2) · 2ⁱ⁺¹ (mod p); X · 2 · S ≡ V · 2ⁱ⁺¹ (mod p) |
| V is even | gcd(U, V) = gcd(U, V/2) | X · 2 · R ≡ −U · 2ⁱ⁺¹ (mod p); X · S ≡ (V/2) · 2ⁱ⁺¹ (mod p) |
| U > V | gcd(U, V) = gcd((U − V)/2, V) | X · (R + S) ≡ −((U − V)/2) · 2ⁱ⁺¹ (mod p); X · 2 · S ≡ V · 2ⁱ⁺¹ (mod p) |
| U ≤ V | gcd(U, V) = gcd(U, (V − U)/2) | X · 2 · R ≡ −U · 2ⁱ⁺¹ (mod p); X · (R + S) ≡ ((V − U)/2) · 2ⁱ⁺¹ (mod p) |
| phase 2 | (none) | X · (R/2) ≡ −(1) · 2ⁱ⁻¹ (mod p); X · (0) ≡ (0) · 2^ρ (mod p) |

ρ is equal to the value i in the last iteration of phase 1.

Takagi’s Unified Modular Division Algorithm

In 1998, Takagi proposed a unified modular division algorithm (T-UMD) [38] based on the extended binary GCD algorithm [57]. The algorithm calculates S = X · Y⁻¹ (mod p) by finding gcd(Y, p) and the corresponding modular quotient, where X and Y are elements of the field with odd prime (or irreducible polynomial) p.

This algorithm requires four operands, U, V, R, and S. U and V are used for calculating gcd(Y, p), and R and S are used for calculating the modular quotient. U and V are initialized to Y and p, respectively, and the properties shown in Table 2.6 are applied repeatedly to calculate gcd(Y, p). R and S are initialized to X and 0, respectively. The same operations that are performed on U and V are then applied to R and S, so that the modular quotient is obtained as U and V are reduced. Furthermore, the operands U and V are integers and are allowed to be negative. δ represents α − β, where α and β are values such that 2^α and 2^β indicate the upper bounds of |U| and |V|, respectively. The value δ = 0 is introduced to represent min(α, β). For correctness, the condition of the while loop in the original algorithm is slightly modified here.

This algorithm is based on the following invariants:

\[
\begin{cases}
X^{-1} \cdot Y \cdot R \equiv U \pmod{p} \\
X^{-1} \cdot Y \cdot S \equiv V \pmod{p}
\end{cases}
\tag{2.6}
\]

It can easily be shown that the equivalences in Table 2.6 always hold. Since gcd(Y, p) = 1, U is 0 and V is 1 or −1 when the loop terminates. Hence X⁻¹ · Y · S ≡ ±1 (mod p), that is, S ≡ ±X · Y⁻¹ (mod p), and the final step of the algorithm corrects the sign when V = −1 so that S ≡ X · Y⁻¹ (mod p). Moreover, the algorithm needs at least m and at most 2m iterations if Y and p are co-prime.

Algorithm 2.8. (Takagi’s unified modular division algorithm.)

Input: X, Y , and p, where X, Y are in GF (p) or GF (2^m) and p is the prime integer or irreducible polynomial.

Output: S ≡ X · Y⁻¹ (mod p).

1. U = Y, V = p, R = X, S = 0, δ = 0
2. while U ≠ 0 do
3.   if U is even, then: U = U/2, R = R/2 (mod p), δ = δ − 1
4.   else
5.     if δ < 0, then: swap(U, V), swap(R, S), δ = −δ
6.     if (U + V) (mod 4) = 0, then: q = 1
7.     else: q = −1
8.     U = (U + q · V)/4, δ = δ − 1
9.     R = (R + q · S)/4 (mod p)
10.   endif
11. endwhile
12. if V = −1, then: S = p − S
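The steps of Algorithm 2.8 can be sketched in Python for the prime-field case as follows. This is an illustrative model, not the hardware datapath: the names `halve_mod` and `takagi_divide` are hypothetical, division by 2 and 4 modulo p is assumed to be done by add-and-shift, and q is chosen as 1 when U + V is divisible by 4 and −1 otherwise, so that U + q · V is always a multiple of 4, in line with the conditions of Table 2.6.

```python
def halve_mod(r, p):
    """Return r / 2 (mod p) for odd p and 0 <= r < p, using add-and-shift."""
    return r // 2 if r % 2 == 0 else (r + p) // 2


def takagi_divide(x, y, p):
    """Return X * Y^(-1) mod p for an odd prime p and Y not divisible by p."""
    if y % p == 0:
        raise ZeroDivisionError("Y must be non-zero modulo p")
    u, v, r, s, delta = y % p, p, x % p, 0, 0
    while u != 0:
        if u % 2 == 0:
            u //= 2
            r = halve_mod(r, p)
        else:
            if delta < 0:                   # keep the larger bound on U
                u, v = v, u
                r, s = s, r
                delta = -delta
            q = 1 if (u + v) % 4 == 0 else -1
            u = (u + q * v) // 4            # exact: U + q*V is a multiple of 4
            r = halve_mod(halve_mod((r + q * s) % p, p), p)
        delta -= 1                          # both branches lower delta by 1
    if v == -1:                             # V ends as +1 or -1; fix the sign
        s = (p - s) % p
    return s
```

For p = 7, X = 1, and Y = 3, the loop runs four iterations and `takagi_divide(1, 3, 7)` returns 5, which is 3⁻¹ modulo 7.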

Table 2.6: The properties of Takagi’s unified modular division algorithm.

| State | Invariant on R and U | Invariant on S and V |
|---|---|---|
| Initial | X⁻¹ · Y · (X) ≡ (Y) (mod p) | X⁻¹ · Y · (0) ≡ (p) (mod p) |
| End | X⁻¹ · Y · (0) ≡ (0) (mod p) | X⁻¹ · Y · (±X · Y⁻¹) ≡ (±1) (mod p) |

| Condition | Properties | Invariants |
|---|---|---|
| U is even and V is odd | gcd(U, V) = gcd(U/2, V) | X⁻¹ · Y · (R/2) ≡ U/2 (mod p); X⁻¹ · Y · S ≡ V (mod p) |
| U + V is divisible by 4 | gcd(U, V) = gcd((U + V)/4, V) | X⁻¹ · Y · (R + S)/4 ≡ (U + V)/4 (mod p); X⁻¹ · Y · S ≡ V (mod p) |
| U − V is divisible by 4 | gcd(U, V) = gcd((U − V)/4, V) | X⁻¹ · Y · (R − S)/4 ≡ (U − V)/4 (mod p); X⁻¹ · Y · S ≡ V (mod p) |

Liu’s Unified Division Algorithm

Algorithm 2.9 shows Liu’s unified division algorithm (L-UD), proposed in [31, 33]. The initial values of U, V, R, and S are set to p, Y, 0, and X, respectively, and the invariants are as follows:

\[
\begin{cases}
X^{-1} \cdot Y \cdot R \equiv -U \cdot 2^{i} \pmod{p} \\
X^{-1} \cdot Y \cdot S \equiv V \cdot 2^{i} \pmod{p}
\end{cases}
\tag{2.7}
\]

The L-UD algorithm takes the same number of execution cycles as the K-UI algorithm, but it additionally supports the MMD and MD operations.

Algorithm 2.9. (Liu’s unified division algorithm.)

Input: X, Y , and p, where X, Y are in GF (p) or GF (2^m) and p is the prime integer or

In document: Design and Implementation of a Power-Analysis-Resistant Dual-Field Elliptic Curve Cryptographic Arithmetic Unit (pp. 24-31)