The error model of messages - Fast Authentication of Multimedia Data Integrity

Chapter 3 Definitions

3.3 The error model of messages

We assume that errors in a message occur randomly and independently, so the locations at which errors occur are not correlated. This assumption simplifies the problem model without loss of generality. The error model of a message differs under different circumstances such as forgery, compression, watermarking, and channel noise. In different applications and environments, we can adjust the threshold parameters of the AMAC to find a suitable authentication scheme, and the random sampling model makes the error model fit the random error distribution condition. We also assume, without loss of generality, that the verifier has information about the error model.

3.4 Definitions of AMACs

In this section, we first define the MAC model mathematically, and then introduce the definition of AMAC in [18] and our own AMAC definition with its authentication model. We then compare the two models and explain why we use our definition. At the end of this chapter, we discuss the probabilistic properties of AMACs.

3.4.1 Definitions of MACs

Let the integer s be a security parameter. A message authentication code (MAC) is a triple (K, T, V), where the algorithms K, T, V run in polynomial time and satisfy the following syntax. The key generation algorithm K takes a random string as input and returns a secret key k of length s. The authenticating algorithm T takes a message m and a secret key k as inputs and produces a tag string t. The verifying algorithm V takes a message m’, a secret key k, and a tag string t as inputs and returns a value in {1, 0}. A MAC has to satisfy a correctness requirement:

After k is generated by K and the tag t is generated by T on m, V on input (m’, t) outputs 1 if m’ = m.
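The (K, T, V) syntax above can be illustrated with a minimal sketch. It uses HMAC-SHA256 from the Python standard library as a stand-in MAC; the choice of HMAC and the 32-byte default key length are assumptions made for illustration and are not taken from this thesis.

```python
import hashlib
import hmac
import secrets


def K(s: int = 32) -> bytes:
    """Key generation: return a random secret key of s bytes."""
    return secrets.token_bytes(s)


def T(k: bytes, m: bytes) -> bytes:
    """Authentication: compute the tag t for message m under key k."""
    return hmac.new(k, m, hashlib.sha256).digest()


def V(k: bytes, m_prime: bytes, t: bytes) -> int:
    """Verification: return 1 if t is a valid tag for m_prime, else 0."""
    return 1 if hmac.compare_digest(T(k, m_prime), t) else 0


k = K()
t = T(k, b"message")
print(V(k, b"message", t))   # the unmodified message is accepted
print(V(k, b"messagf", t))   # a one-character change is rejected
```

A conventional MAC of this kind rejects any modification, however small, which is exactly the behaviour that the approximate definitions in the following subsections relax.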

3.4.2 AMAC with parameters (s, d_m, e, α), algorithms (K, T, V)

A (dm, e, α)-noise-tolerant AMAC satisfies the following: after k is generated using K, if the tag is generated using algorithm Tk on input message m, then algorithm Vk, on input (m’, tag), outputs 1 with probability at least α if dm(m, m’) ≦ e. Here e stands for the acceptable number of errors.

From the above definition, we can observe that an AMAC gives a probabilistic guarantee: if the number of errors is at most e, the AMAC makes the correct decision with probability at least α, which is quite different from the guarantee of a cryptographic MAC. The definition also does not consider the situation in which too many errors occur and the message should not be accepted by the verifier; there is no probabilistic guarantee that the AMAC detects such errors. Because of these insufficiencies, we enhance the AMAC definition in the next subsection.

3.4.3 AMAC with parameters(s,d_m,c₁,c₂,p₁,p₂), algorithms (K,T,V)

Unlike the AMAC definition in 3.4.2, the original error threshold e becomes two error thresholds c1, c2, and the probability parameter α becomes two probability parameters p1, p2. The main idea is to consider the probability that the AMAC outputs 1 in both situations: an acceptable number of errors and an unacceptable number of errors.

Let the integer s be a security parameter, let the algorithms K, T, V run in polynomial time, and let the key k of length s be generated by algorithm K. dm is the distance function on messages, where dm(m, m’) returns the distance between m and m’; it can be the Hamming distance or another distance function. c1, c2 are two error parameters; depending on the distance function dm, they can be real numbers between 0 and 1 or natural numbers of errors. p1, p2 are two probabilities that satisfy 0 ≦ p2 < p1 ≦ 1. The following two requirements are satisfied:


T is the tag algorithm, with t = T(k, m), and V is the verification algorithm, which on input (t, m’) outputs 1 with probability p ≧ p1 if d(m, m’) ≦ c1 and outputs 1 with probability p ≦ p2 if d(m, m’) ≧ c2. In other words, when the number of errors is below the threshold c1, verification should pass with probability at least the high value p1. On the other hand, when the number of errors is above the threshold c2, verification should pass with probability at most the relatively low value p2.
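Written out as formulas, the two requirements described in this paragraph can be stated as follows. This is a restatement under the assumption that the subscript notation T_k, V_k of 3.4.2 carries over; it is not the original typeset equation.

```latex
\Pr\bigl[\, V_k(m', T_k(m)) = 1 \,\bigr] \ge p_1 \quad \text{if } d_m(m, m') \le c_1,
\qquad
\Pr\bigl[\, V_k(m', T_k(m)) = 1 \,\bigr] \le p_2 \quad \text{if } d_m(m, m') \ge c_2 .
```

No requirement is placed on messages whose distance lies strictly between c_1 and c_2.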

Compared with the AMAC definition in 3.4.2, the definition here is more suitable for real-world applications, since both false alarms and false positives have a cost for the verifier.

3.4.4 Distance-preserving AMACs

Distance preservation is the key idea that reduces estimating the distance between messages to measuring the distance between tags. We will describe later how to use the information in the tags to estimate the distance between the messages.

A (dm, dt, δm, δt)-distance-preserving AMAC satisfies the following:

For any m1, m2 such that dm(m1, m2) ≦ δm, the expected value of dt(t1, t2) is at most δt. We will use the properties of distance-preserving AMACs to construct AMACs that satisfy the definition in 3.4.3.

A straightforward idea for constructing V(t, m’) is:

return 1 if dt(t, t’) ≦ 2δt, and return 0 otherwise.

If we know the distribution of Et(δm), then we can compute the expected p1 and p2 for given c1, c2.

3.5 Mutually independent AMACs

Because we want to know the distribution of Et(δm), we use AMACs with independent symbols. An l_m-dimensional message space over an alphabet of size N is constructed from all possible messages of length l_m. One of the values {0,…, N-1} is assigned to each dimension, so the message space contains N^l_m possible messages. We compute each AMAC symbol from non-overlapping sets of message elements. Given a key k and an initialization vector I, the N-ary AMAC algorithm maps each message to an AMAC tag of length L. The following theorem holds:

Assuming the existence of a pseudo-random generator, the AMAC symbols are mutually independent.

When we calculate an AMAC tag of length L, the message is partitioned into L non-overlapping sets after the operations with the outputs of the pseudo-random number generator (PRG). Each set contains l_m/L elements. Each AMAC symbol is calculated from the corresponding set. Since the random sampling and modulo operations eliminate the correlations between the sets, the AMAC symbols are mutually independent.

3.6 Probabilistic properties of mutually independent AMACs

If the symbols of an AMAC are mutually independent, then the probability that a single AMAC tag symbol changes determines the properties of the AMAC. We denote this probability by PA; then δt follows a binomial distribution with L trials and success probability PA.

Here PA can be written as a function of δm; by the distance-preserving property, PA increases strictly as δm increases. Two different values of δm therefore produce two different binomial distributions.
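To make the binomial model concrete, the following sketch computes the probability that verification passes at two error levels. It assumes a single decision threshold T with the rule "pass if δt ≦ T", which is a simplification chosen for illustration. The inputs L = 1024, T = 265, PA = 0.234 and PA = 0.284 are the values reported for N = 2 in Section 4.6; the function names are hypothetical, and the output of this idealized model need not match the experimentally measured probabilities.

```python
from math import exp, lgamma, log


def binom_pmf(L: int, p: float, j: int) -> float:
    """P[X = j] for X ~ Binomial(L, p), computed in log space (needs 0 < p < 1)."""
    log_pmf = (lgamma(L + 1) - lgamma(j + 1) - lgamma(L - j + 1)
               + j * log(p) + (L - j) * log(1.0 - p))
    return exp(log_pmf)


def pass_probability(L: int, p_a: float, threshold: int) -> float:
    """P[delta_t <= threshold] when delta_t ~ Binomial(L, p_a)."""
    return sum(binom_pmf(L, p_a, j) for j in range(threshold + 1))


L, T = 1024, 265
p1 = pass_probability(L, 0.234, T)   # acceptable error level
p2 = pass_probability(L, 0.284, T)   # unacceptable error level
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, p1 - p2 = {p1 - p2:.3f}")
```

Because the two binomial distributions overlap, no choice of T drives p1 to 1 and p2 to 0 at the same time; moving T trades one against the other.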

Chapter 4 Our AMAC Algorithm

Our AMAC is a probabilistic checksum calculated using a pseudo-random permutation, masking via a modulo sum operation, and the MODE function, such that a small difference between two messages tends to result in a small difference between their AMACs. N is the symbol size of the messages; for any message, N can easily be changed depending on the verifier.

For N=2, the N-ary AMAC reduces to the binary AMAC, where the modulo sum operator reduces to the XOR operation and the MODE function reduces to the MAJORITY function.

Let m be the input N-ary message of length l_m. The ith element of the message is denoted m(i). We are given a secret key k generated by K and a pseudo-random number generator PRG. As with conventional MACs, the AMAC length L is typically chosen in the range 128 ≦ L ≦ 1024 bits. We compare different AMACs at the same length L; alternatively, we say that one AMAC is better than another if they have the same properties while one has a shorter L.

First, the raw data m of length l is input to the authentication tag generation, and random sampling reduces the size of m from l to L*H. After re-formatting, the matrix M is masked by the random matrix P generated with key k. The masked matrix Z is then input to the tag function column by column, and the output after quantization is the final AMAC tag with L bits.


Figure 3. The flow chart of our AMAC scheme

4.1 Initialization

The verifier and the owner share the secret key k, which is input to the pseudo-random generator (PRG) as a seed. The output of the PRG must be available to both the verifier and the owner. The PRG is used repeatedly as a source of N-ary pseudo-random numbers.

4.2 Feature extraction

A feature vector that represents the media content is extracted from the original message and hashed into a small digest. The digest is then signed by a standard digital signature algorithm. Since only the semantic information is extracted for authentication, incidental noise can be tolerated. Different features can be used to represent the content of an image, such as edge information, DCT coefficients, and color or intensity histograms. The histogram feature is used in our AMAC.

Our AMAC can detect only the number of errors, not perceptual data errors. If an attacker changes the data with a number of errors below the threshold, the change will not be detected by our AMAC, since that number of errors is acceptable. The error position and error distance are not measured in the tag function; the histogram feature of the data only considers the number of errors. To enhance our AMAC for detecting attackers, it is helpful to preprocess the multimedia data to extract perceptual features. Many works extract different types of multimedia features. We assume that the extracted features are suitable for further histogram feature extraction, which means that the errors in the multimedia features are not correlated in location or distance. Thus, we can first apply feature extraction appropriate to the type of multimedia data and then apply our AMAC, so that the final AMAC can detect the attacker.

4.3 Random sampling

Because the accuracy of the AMAC decreases more slowly than the proportion of the message that takes part in the AMAC computation, we use random sampling to reduce the computation.

We use m_old to denote the original message of length l. L is the length of the tag; that is, the tag has L symbols. We sample L*H symbols from the original message using the PRG. The other message symbols do not take part in the AMAC computation.

The PRG is used to form a sample table, such that the elements of the message matrix selected by the sample table form a new matrix. The verifier and the message sender use the shared key k for the PRG. The purpose of the pseudo-random sampling is not only to destroy any existing spatial correlation between neighboring elements but also to enhance the security against attacks.

4.4 Masking

The N-ary message of length L*H is denoted m = (m(0), m(1), ..., m(LH-1)). The message is then re-formatted into a matrix, denoted

$$
M = \begin{pmatrix}
m(0) & m(1) & \cdots & m(L-1) \\
m(L) & m(L+1) & \cdots & m(2L-1) \\
\vdots & \vdots & & \vdots \\
m(LH-L) & m(LH-L+1) & \cdots & m(LH-1)
\end{pmatrix}
$$

Let P be the pseudo-random L*H matrix generated from the PRG. The matrix M is then masked, element by element, by a modulo-N sum with the pseudo-random matrix P. The masked matrix is again denoted M, with M = (M + P) mod N, where mij = (mij + pij) mod N.

The modulo operation makes the masked variables {mij} independent of each other and unbiased whenever the samples {pij} are mutually independent and unbiased, that is, whenever they obey a discrete uniform distribution on {0,1,…,N-1}.
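The re-formatting and masking steps can be sketched as follows. Python's `random.Random` is used here only as a stand-in for the keyed PRG and is not cryptographically secure; the function names `reshape`, `mask` and `unmask` and the example values are hypothetical.

```python
import random


def reshape(symbols, L):
    """Split a flat list of L*H symbols into H rows of L symbols each."""
    return [symbols[r:r + L] for r in range(0, len(symbols), L)]


def mask(symbols, pad, N):
    """Element-wise modulo-N sum of the message symbols and the pad."""
    if len(symbols) != len(pad):
        raise ValueError("message and pad must have the same length")
    return [(m + p) % N for m, p in zip(symbols, pad)]


def unmask(masked, pad, N):
    """Inverse of mask, shown only to illustrate that masking loses nothing."""
    return [(z - p) % N for z, p in zip(masked, pad)]


N, L, H = 4, 3, 2
rng = random.Random(b"demo key")              # stand-in for the keyed PRG
m = [0, 1, 2, 3, 3, 0]                        # sampled message, length L*H
p = [rng.randrange(N) for _ in range(L * H)]  # pseudo-random pad
M = reshape(mask(m, p, N), L)                 # masked H-by-L matrix
print(M)
```

For N = 2 the modulo sum coincides with XOR, which matches the remark at the start of this chapter that the binary case reduces to the XOR operation.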

4.5 Feature extraction: Tag function

After random sampling and masking, the tag t of length L symbols is computed from the matrix M: t = Tag(M). Because of the random sampling, we can simply divide M into columns and compute each tag symbol without a further permutation. Each tag symbol is computed as ti = Tag(Mi), where Mi = {mi, mL+i, …, mL(H-1)+i}; thus the tag symbols are mutually independent.

MODE

The MODE is defined as the most common value in a set. If a “tie” occurs, the MODE operation breaks the tie by comparing the adjacent values.

Example:

MODE(0, 1, 1, 1)=1, MODE(1, 3, 0, 3, 2)=3

MODE2

The MODE2 is defined as the appearance frequency of the most common value in a set. If a “tie” occurs, the MODE operation breaks the tie by comparing the adjacent values.

MODE2(0, 1, 1, 1)=3, MODE2(1, 3, 0, 3, 2)=2
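A minimal sketch of the two functions follows. The tie-breaking rule is not fully specified in the text, so this sketch resolves ties by taking the smallest tied value, which is an assumption; the examples above contain no ties, so they do not depend on it.

```python
from collections import Counter


def mode(values):
    """Most common value; ties go to the smallest value (an assumption)."""
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, c in counts.items() if c == best)


def mode2(values):
    """Appearance frequency of the most common value."""
    return max(Counter(values).values())


print(mode([0, 1, 1, 1]), mode([1, 3, 0, 3, 2]))
print(mode2([0, 1, 1, 1]), mode2([1, 3, 0, 3, 2]))
```

Note that MODE2 needs no tie-breaking rule at all, since tied values share the same frequency.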

4.6 Feature reduction: Quantization

Quantizing the tag symbols into binary or another n-ary alphabet, n<N, reduces the number of bits in the tag without significantly affecting the accuracy of the AMAC. The reason is that if we quantize the tag symbols, a tag of the same length can contain more symbols.

Example:

tag t=(0, 1, 2, 3), Quantization(t, 2)=(0, 1, 0, 1)

Having more symbols naturally improves the accuracy significantly. But if we fix the length of the tag, the only way to increase the number of symbols is to compress each symbol. The disadvantage is that the probability that each tag symbol changes is decreased, for example:

mi = (0, 1, 2, 3, 3), Tag(mi) = ti = 3

After transmission, mi becomes mi’ = (0, 1, 2, 3, 1), Tag(mi’) = ti’ = 1, where ti ≠ ti’

After quantization, ti = Quantization(ti, 2) = 1, and the value of ti’ becomes ti’ = Quantization(ti’, 2) = 1, which is equal to ti. In such cases, the tag does not change if we quantize it.

Quantization benefits accuracy because the number of symbols increases, but it decreases the probability that each tag symbol changes; the final AMAC accuracy is a tradeoff between these two factors.
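The quantization examples in this section are all consistent with reducing each symbol modulo n. The sketch below uses that rule as an assumption inferred from the examples; the text does not state the quantizer explicitly, and the function name is hypothetical.

```python
def quantize(symbols, n):
    """Map each tag symbol to an n-ary symbol by reduction modulo n (assumed rule)."""
    return [s % n for s in symbols]


print(quantize([0, 1, 2, 3], 2))

# The collision from the example: the tag symbols 3 and 1 differ before
# quantization but coincide after it, so the change goes unnoticed.
t_i, t_i_prime = 3, 1
print(quantize([t_i], 2) == quantize([t_i_prime], 2))
```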

Figure 4. False alarm versus true positive for different quantization

From Figure 4, we can observe that, under the same conditions but with different quantization parameters, more quantization gives better accuracy in these three cases.

[Figure 4 plot: both axes range from 0 to 1; curves shown for MODE2 with s=0.06 and q=2, q=4, q=16.]

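sample 3%

| N | L | Pa[0.03] | Pa[0.04] | P[0.03] | P[0.04] | P[0.03]-P[0.04] | T |
|---|---|---|---|---|---|---|---|
| 2 | 128*8 | 0.234 | 0.284 | 0.986 | 0.037 | 0.949 | 265 |
| 4 | 128*4 | 0.376 | 0.453 | 0.996 | 0.053 | 0.943 | 212 |
| 16 | 128*2 | 0.518 | 0.591 | 0.91 | 0.133 | 0.776 | 143 |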

Figure 5. Different symbol size with fixed tag length

In Figure 5, we use the same bit length for the tag but different symbol sizes without quantization. Although a smaller N seems to give better performance, when N=2 is compared with N=4, the AMAC with N=2 does not significantly dominate the AMAC with N=4. This means that the AMAC does not always benefit from quantizing the AMAC symbols more. A practical approach is to find the best quantization factor for the AMAC experimentally.

4.7 Verification

The resulting N-ary AMAC tag t, together with the initialization data, is sent along with the message m. The receiver compares the received AMAC t with the AMAC t’ constructed from the received message m’. The distance between the two AMACs is measured by the distance function dt; we use the Hamming distance here. Over an N-ary alphabet, the Hamming distance between two vectors is defined as the number of positions in which they differ. Although other distance functions such as the Euclidean distance could also be considered, the Hamming distance between two AMACs is effective in showing the differences between two messages.

The larger the distance between t and t’, the larger the difference between m and m’ is judged to be. We then compare the distance δt between t and t’ with the thresholds c1, c2:

if dt(t, t’) < c1, return 1

if c1 ≦ dt(t, t’) ≦ c2, don’t care

if dt(t, t’) > c2, return 0

[Figure 6 diagram blocks: Data, Key, MAC generation, MAC1, MAC2, Similarity comparison, Decision.]

Figure 6. The scheme of AMAC verification

Tag Algorithm

input: the secret key k, data m, sample rate s, L, H

generate the index set a, where ai = PRNG(k, i), i from 0 to LH - 1

m = (ma0, ma1, …, maLH-1)

generate r, where ri = PRNG(k, i + LH), i from 0 to LH - 1

m = (m + r) mod N

mi = (mi, mL+i, …, m(H-1)L+i), i from 0 to L - 1

t = (MODE2(m0), MODE2(m1), …, MODE2(mL-1))

return t
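The tag algorithm can be sketched in Python as follows. Several points are assumptions made for illustration: `random.Random` seeded with the key stands in for PRNG(k, i) and is not cryptographically secure, the indices are drawn without replacement so that the sampled sets do not overlap, and the alphabet size N is passed explicitly. This is a sketch of the steps listed above, not the implementation used in the experiments.

```python
import random
from collections import Counter


def mode2(values):
    """Appearance frequency of the most common value."""
    return max(Counter(values).values())


def tag(key: bytes, m, N: int, L: int, H: int):
    """Compute an L-symbol tag from message m (requires len(m) >= L*H)."""
    rng = random.Random(key)                    # stand-in for the keyed PRNG
    idx = rng.sample(range(len(m)), L * H)      # sample L*H distinct positions
    sampled = [m[a] for a in idx]
    masked = [(x + rng.randrange(N)) % N for x in sampled]
    # Group i holds the elements i, L+i, ..., (H-1)L+i of the masked vector.
    groups = [masked[i::L] for i in range(L)]
    return [mode2(g) for g in groups]


message = [(7 * i) % 256 for i in range(2000)]
t = tag(b"shared key", message, N=256, L=16, H=8)
print(t)
```

Because each message position is sampled at most once, changing a single message symbol can alter at most one tag symbol, which is the behaviour the distance-preserving property relies on.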

Verification Algorithm

input: the secret key k, modified data m’, tag t, thresholds c1,c2

t’ = Tag(m’, k)

δt = dt(t, t’)

if δt < c1, return 1

if c1 ≦ δt ≦ c2, don’t care

if δt > c2, return 0
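The decision step of the verification algorithm can be sketched as below, with the Hamming distance as dt. Returning `None` for the "don't care" region is a choice made for this sketch; the algorithm itself leaves that case unspecified. The recomputation of t’ from m’ is omitted, so the function takes both tags as inputs.

```python
def hamming(t, t_prime):
    """Number of positions in which two equal-length tags differ."""
    if len(t) != len(t_prime):
        raise ValueError("tags must have the same length")
    return sum(a != b for a, b in zip(t, t_prime))


def verify(t, t_prime, c1, c2):
    """Return 1 (accept), 0 (reject), or None in the don't-care region."""
    delta_t = hamming(t, t_prime)
    if delta_t < c1:
        return 1
    if delta_t > c2:
        return 0
    return None


print(verify([1, 2, 3, 4], [1, 2, 3, 4], c1=1, c2=2))   # identical tags
print(verify([1, 2, 3, 4], [4, 3, 2, 1], c1=1, c2=2))   # all symbols differ
```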

Chapter 5 Experiment

5.1 Experiment environment

We use an 8-bit 800*600 image as our example; the raw data can be seen as a gray-scale image, and the desired AMAC length is 128 to 1024 bits. We use a computer with an Intel Core i7 Q720 1.6GHz CPU and 4GB of RAM, and the code is written with Bloodshed Dev-C++ 4.9.9.2. The key is used as the seed for the pseudo-random number generator. In this chapter, we first discuss how to compare two different AMACs, then discuss several factors that affect our AMAC, and compare it with the AMAC of [18]. The distance function we use for both data and tags is the Hamming distance, which is the number of symbols that differ; the magnitude of the difference in each symbol is not considered in our AMAC.

5.2 Comparison of two different AMACs

To compare two different AMACs, we compare the probability that each AMAC outputs the correct authentication decision given the same input data; the keys of the AMACs are generated randomly.

We simulate the authentication many times with different keys and compute the probability that the AMAC makes the correct authentication decision. Errors are added to the image randomly, byte by byte; for example, if an error occurs at a pixel with original value 100, the pixel is changed to a random value in 0~255 other than 100. The amount of error added to the image is just at the edge of the acceptable number of errors or of the unacceptable number of errors; we discuss this later.
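The error injection described above can be sketched as follows. The sketch assumes that a fixed number of distinct pixel positions is corrupted, so that the Hamming distance to the original equals the target exactly; the function name and the seed are hypothetical.

```python
import random


def add_errors(pixels, error_ratio, rng):
    """Corrupt round(error_ratio * len(pixels)) distinct 8-bit pixels."""
    out = list(pixels)
    n_errors = round(error_ratio * len(out))
    for pos in rng.sample(range(len(out)), n_errors):
        # Draw uniformly from the 255 values other than the original one.
        new = rng.randrange(255)
        if new >= out[pos]:
            new += 1
        out[pos] = new
    return out


original = [100] * 1000
noisy = add_errors(original, 0.03, random.Random(1))
print(sum(a != b for a, b in zip(original, noisy)))
```

Skipping over the original value guarantees that every selected pixel really changes, so the number of differing bytes is exactly the requested error count.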

5.2.1 The length of AMAC tag

It is not difficult to see that an AMAC with a longer tag has an advantage in distinguishable ability over one with a shorter tag. Suppose we compare two AMACs from the same AMAC family, one with a 128-bit tag and the other with a 256-bit tag, and that the 256-bit tag is divided into two parts. The first 128 bits are the same as the 128-bit AMAC tag, and the last 128 bits are additional information that is not contained in the 128-bit AMAC tag. In the worst case for the 256-bit AMAC, we simply drop the last 128 bits, and the authentication result is the same as for the 128-bit AMAC; thus the distinguishable ability of the 256-bit AMAC is equal to or better than that of the 128-bit AMAC. In addition, the longer the tag is, the higher the probability of an error in the tag itself, or the more effort and redundancy we need to protect the tag. Thus, to compare AMACs fairly, we compare them at the same AMAC tag length.

5.2.2 AMAC distinguishable ability measurement

We measure an AMAC by its distinguishable ability, which is defined as follows:

P1 = P [m’ passes AMAC verification | m’ is acceptable]

P2 = P [m’ passes AMAC verification | m’ is unacceptable]

Comparing two AMACs at the same level of P2, the AMAC with the higher P1 is the better one.

Since we consider AMACs with mutually independent symbols, the properties of an AMAC are determined by the probability that one AMAC symbol changes, denoted PA. PA is a function of δm, denoted fPA(δm), and fPA(δm) should be a strictly increasing function of δm, because more errors in the message increase the probability that an AMAC symbol changes. From Figure 7, we can observe that PA is strictly increasing as the error ratio increases, where the error ratio is δm/|m|.

Figure 7. The probability that one AMAC symbol changes under different error ratio

Since our AMAC symbols are independent, the expected total number of tag symbol changes E(δt) can be calculated simply as L·PA. We define the accuracy of an AMAC as:

accuracy = P[true positive] - P[false alarm]

where P[true positive] = P[Pass | δm = c1] and P[false alarm] = P[Pass | δm = c2]

The accuracy defined above is simple. Consider the penalty described below:

penalty = a1* P[Reject|δm < c1] + a2* P[Pass|δm > c2]

where a1, a2 are the penalty coefficients for rejecting an acceptable message and for passing an unacceptable message, respectively.

Since we do not know the true error probability and distribution of the environment, we cannot determine the coefficients a1, a2. We assume a1 = a2, so the penalty becomes:

penalty = a1* (P[Reject|δm < c1] + P[Pass|δm > c2])

Removing the factor a1 gives

penalty = P[Reject|δm < c1] + P[Pass|δm > c2]

Since P[Reject|δm < c1] = 1 - P[Pass|δm < c1], and evaluating both terms at the boundary error levels used in the accuracy definition,

1 - penalty = P[Pass|δm < c1] - P[Pass|δm > c2] = P[true positive] - P[false alarm]

Thus, 1 - penalty = accuracy

A lower penalty is better; conversely, a higher accuracy is better.
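The relation between penalty and accuracy can be checked numerically with a short sketch. The function names are hypothetical, and the inputs 0.986 and 0.037 are the P[0.03] and P[0.04] values reported for N = 2 in Section 4.6, interpreted here as the pass probabilities at the acceptable and unacceptable error levels.

```python
def accuracy(p_pass_acceptable, p_pass_unacceptable):
    """accuracy = P[Pass | acceptable] - P[Pass | unacceptable]."""
    return p_pass_acceptable - p_pass_unacceptable


def penalty(p_pass_acceptable, p_pass_unacceptable, a1=1.0, a2=1.0):
    """Weighted sum of wrongly rejecting and wrongly passing a message."""
    p_reject_acceptable = 1.0 - p_pass_acceptable
    return a1 * p_reject_acceptable + a2 * p_pass_unacceptable


acc = accuracy(0.986, 0.037)
pen = penalty(0.986, 0.037)
print(round(acc, 3), round(1.0 - pen, 3))
```

With unequal coefficients a1 and a2 the identity no longer holds, which is why the derivation needs the assumption a1 = a2.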

5.3 Error estimation with tags

Figure 8. δm with different number of errors and threshold.

Figure 8 shows an example in which the acceptable data error is 1% and the unacceptable data error is 2%. Data with 1% errors and data with 2% errors are each generated 5000 times, and the resulting values of δt are counted. Data with 2% errors generally have higher values of δt, which is consistent with our expectation.

From Figure 8, there is an overlap region between the 1% errors and the 2% errors. This means that, no matter what threshold we use in this case, the probability that the threshold-based authentication decision is incorrect is not equal to 0.
