Expectation and maximization algorithm for estimating parameters of a simple partial erasure model


Tsai-Sheng Kao and Mu-Huo Cheng

Abstract—The identification of the model parameters of a high-density recording channel generally requires solution of nonlinear equations. In this paper, we apply the expectation and maximization (EM) algorithm to realize the maximum likelihood estimation of the parameters of a simple partial erasure model, including the reduction parameters and the isolated transition response. The algorithm that results from this approach iteratively solves two least-squares problems and, thus, realization is simple. Computer simulations verify the feasibility of the EM algorithm, and show that the proposed algorithm has fast convergence and the resulting estimator is asymptotically efficient.

Index Terms—Expectation and maximization algorithm, least-squares methods, maximum-likelihood estimation, Monte Carlo methods, simple partial erasure model.

I. INTRODUCTION

AS RECORDING densities grow in magnetic storage, nonlinear distortions become the primary factors limiting detector performance. The main nonlinear distortions in magnetic storage are the nonlinear transition shift and partial erasure. A magnetic recording channel model that represents nonlinear distortions accurately enables one to design an improved detector [1]–[3]. Several models have been presented to characterize the nonlinear distortions [4]–[6]. However, the more complicated the channel model, the higher the complexity of the corresponding detector, which may become prohibitive. A simple partial erasure model [7], which is simplified from the transition-width-reduction model in [4], has been shown to preserve sufficient accuracy while being much simpler than the other models and, thus, is often adopted in detector design.

Accurate estimation of model parameters is crucial in designing a high-performance detector so that high recording density can be achieved. Techniques widely used for identifying nonlinear distortions are the echo extraction method using a pseudorandom binary sequence input [8], the autocorrelation method [9], the frequency-domain nonlinear measurement method [10], and an adaptive technique using the stochastic gradient [11]. These methods, however, either require an input with a special pattern or have to compute the gradient vector and determine the step size; thus, the complexity is often high and convergence is not assured.

Manuscript received February 6, 2002; revised July 26, 2002.

The authors are with the Department of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C. (e-mail: mhcheng@cc.nctu.edu.tw).

Digital Object Identifier 10.1109/TMAG.2002.806343

In this paper, we focus on the simple partial erasure model and apply the expectation and maximization (EM) algorithm [12]–[14] for estimating the model parameters, including the reduction parameters and samples of the isolated transition response. The EM algorithm ensures convergence and obtains the maximum-likelihood (ML) estimates of the channel parameters [15]. At each iteration, the EM algorithm only has to solve two least-squares problems and is therefore simple to realize. Simulation results also demonstrate that the proposed algorithm converges quickly.

The rest of the paper is arranged as follows. In Section II, a simple partial erasure model is reviewed and the sampled output in terms of the model parameters is formulated. In Section III, the problem is first formulated as the joint ML estimation of the model parameters. Then, the EM algorithm is applied to solve the joint ML estimation iteratively. In Section IV, the feasibility of the EM algorithm is verified by computer simulations. Finally, a conclusion is given in Section V.

II. SIMPLE PARTIAL ERASURE CHANNEL MODEL

Assume that the nonlinear transition shift has been eliminated by the precompensation technique [16]. In terms of the simple partial erasure model, the playback signal of a high-density recording channel can be expressed as

$$y(t) = \sum_{k} \gamma_k b_k h(t - kT) + n(t) \tag{1}$$

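where the effective transition-width ratio $\gamma_k$ is determined by the neighboring transitions and is given by

$$\gamma_k = \begin{cases} 1, & \text{if } b_{k-1} = b_{k+1} = 0 \\ \gamma_1, & \text{if exactly one of } b_{k-1},\, b_{k+1} \text{ is nonzero} \\ \gamma_2, & \text{if both } b_{k-1} \text{ and } b_{k+1} \text{ are nonzero.} \end{cases} \tag{2}$$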

The nonreturn-to-zero-inverted (NRZI) modulated input $b_k$ with the period $T$ may be of values $\{-1, 0, +1\}$. The term $h(t)$ represents the isolated transition response, which is often modeled as the Lorentzian function

$$h(t) = \frac{1}{1 + \left(2t / PW_{50}\right)^2} \tag{3}$$

where $PW_{50}$ is the width of the response at half of its peak amplitude and the normalized density $PW_{50}/T$ quantifies the recorded bit density. Note that here we assume $h(t)$ is unknown and its samples are to be estimated. The last term $n(t)$ in (1) represents the measurement noise. The simple partial erasure model characterizes the channel nonlinearity by the reduction parameters $\gamma_1$ and $\gamma_2$; when $\gamma_1 = \gamma_2 = 1$, the model is reduced to a linear model. The simple partial erasure model commonly sets $\gamma_2 = \gamma_1^2$. Here, we use two independent variables to represent the model; thus, the modeling flexibility is enhanced.

0018-9464/03$17.00 © 2003 IEEE

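The sampled output can be obtained from (1) as

$$y_k = y(kT) = \sum_{i} \gamma_i b_i h\big((k - i)T\big) + n(kT) \tag{4}$$

$$\phantom{y_k} \approx \sum_{i=-L}^{L} \gamma_{k-i} b_{k-i} h_i + n_k \tag{5}$$

where $h_i = h(iT)$, $n_k = n(kT)$, and $L$ is chosen such that $h_i$ for $|i| > L$ are small enough to be neglected.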
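As a minimal sketch in Python (NumPy), the sampled output of the model can be simulated as below. The function names, the 0/1 bit convention used for NRZI encoding, the unit-peak Lorentzian parameterized by its half-amplitude width `pw50` (in bit periods), and all numerical values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def nrzi_transitions(bits):
    """NRZI: a 1 toggles the write level; returns transitions in {-1, 0, +1}."""
    level = np.where(np.cumsum(bits) % 2 == 1, 1, -1)   # write level after each bit
    prev = np.concatenate(([-1], level[:-1]))            # write level before each bit
    return (level - prev) // 2

def width_ratios(b, g1, g2):
    """Ratio per position: 1, g1 or g2 for 0, 1 or 2 neighbouring transitions."""
    a = np.abs(b)
    neighbours = np.concatenate(([0], a[:-1])) + np.concatenate((a[1:], [0]))
    return np.array([1.0, g1, g2])[neighbours]

def lorentzian_taps(pw50, L):
    """Samples h_i, i = -L..L, of a unit-peak Lorentzian (pw50 in bit periods)."""
    i = np.arange(-L, L + 1)
    return 1.0 / (1.0 + (2.0 * i / pw50) ** 2)

def sampled_output(b, h, g1, g2, noise_std=0.0, rng=None):
    """y_k = sum_i ratio_{k-i} b_{k-i} h_i + n_k, with zeros beyond the record ends.

    Assumes len(b) >= len(h) and that h has odd length 2L + 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = width_ratios(b, g1, g2) * b
    y = np.convolve(d, h, mode="same")                   # centred on tap h_0
    return y + noise_std * rng.standard_normal(len(b))

# Hypothetical settings, chosen only for illustration.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=1000)
b = nrzi_transitions(bits)
h = lorentzian_taps(pw50=2.5, L=10)
y = sampled_output(b, h, g1=0.7, g2=0.5, noise_std=0.05, rng=rng)
```

With both reduction parameters set to 1, `sampled_output` reduces to an ordinary linear convolution of the transitions with the taps, which is a convenient sanity check.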

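Denote $d_k = \gamma_k b_k$, $\mathbf{h} = [h_{-L}, \ldots, h_{L}]^T$, and $\mathbf{d}_k = [d_{k+L}, \ldots, d_{k-L}]^T$, where the superscript $T$ represents the transpose operation; then, the sampled output (5) can be rewritten as follows:

$$y_k = \mathbf{d}_k^T \mathbf{h} + n_k. \tag{6}$$

This formulation illustrates the relation between the output $y_k$ and the isolated transition response $\mathbf{h}$. We can obtain another expression emphasizing the effect of the reduction parameters to represent the sampled output as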

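$$y_k = x_k + \mathbf{c}_k^T \boldsymbol{\gamma} + n_k \tag{7}$$

where $\boldsymbol{\gamma} = [\gamma_1, \gamma_2]^T$, $\mathbf{c}_k = [c_{k,1}, c_{k,2}]^T$, and

$$x_k = \sum_{i=-L}^{L} b^{(0)}_{k-i} h_i \tag{8}$$

where

$$b^{(0)}_k = \begin{cases} b_k, & \text{if } b_{k-1} = b_{k+1} = 0 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

$$c_{k,1} = \sum_{i=-L}^{L} b^{(1)}_{k-i} h_i \tag{10}$$

where

$$b^{(1)}_k = \begin{cases} b_k, & \text{if exactly one of } b_{k-1},\, b_{k+1} \text{ is nonzero} \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

$$c_{k,2} = \sum_{i=-L}^{L} b^{(2)}_{k-i} h_i \tag{12}$$

where

$$b^{(2)}_k = \begin{cases} b_k, & \text{if both } b_{k-1} \text{ and } b_{k+1} \text{ are nonzero} \\ 0, & \text{otherwise.} \end{cases} \tag{13}$$

Note that the formulation (7) expresses the relation between the output $y_k$ and the reduction parameters $\boldsymbol{\gamma}$. The two different expressions (6) and (7) of the output sample will be used in the application of the EM algorithm. In (6), the reduction parameters are buried in the data vector $\mathbf{d}_k$, while in (7), the samples of the isolated transition response are buried in the coefficient vector $\mathbf{c}_k$.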

The problem considered here is to estimate the isolated transition response $\mathbf{h}$ and the reduction parameters $\boldsymbol{\gamma}$ from the given samples $y_k$ under the assumption that all concerned $b_k$s are known. In Section III, we shall derive how the EM algorithm is applied to obtain the estimates of $\mathbf{h}$ and $\boldsymbol{\gamma}$.
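The equivalence of the two representations (6) and (7) can be checked numerically. The sketch below is only an illustration under assumed conventions: `windows` and `split_by_neighbours` are hypothetical helper names, the reduction-parameter vector is called `gamma`, and the input pattern and taps are arbitrary random test data rather than a real NRZI sequence or a Lorentzian response.

```python
import numpy as np

def windows(s, L):
    """Row k is [s_{k+L}, ..., s_{k-L}] (zero outside the record),
    so (windows(s, L) @ h)[k] equals sum_i s_{k-i} h_i for h = [h_{-L}, ..., h_L]."""
    padded = np.concatenate((np.zeros(L), s, np.zeros(L)))
    idx = np.arange(len(s))[:, None] + np.arange(2 * L, -1, -1)[None, :]
    return padded[idx]

def split_by_neighbours(b):
    """Copies of b keeping only transitions with 0, 1 and 2 neighbouring transitions."""
    a = np.abs(b)
    m = np.concatenate(([0], a[:-1])) + np.concatenate((a[1:], [0]))
    return [np.where(m == j, b, 0) for j in (0, 1, 2)]

rng = np.random.default_rng(1)
L = 3
b = rng.choice([-1, 0, 1], size=200)     # arbitrary test pattern
h = rng.standard_normal(2 * L + 1)       # arbitrary taps h_{-L}, ..., h_L
gamma = np.array([0.7, 0.5])             # assumed reduction parameters

W0, W1, W2 = (windows(s, L) for s in split_by_neighbours(b))

# Form (6): the reduction parameters sit inside the data matrix D.
D = W0 + gamma[0] * W1 + gamma[1] * W2
# Form (7): the taps sit inside x and C, and gamma enters linearly.
x = W0 @ h
C = np.column_stack((W1 @ h, W2 @ h))

assert np.allclose(D @ h, x + C @ gamma)
```

The same noiseless output is linear in the taps for fixed reduction parameters and linear in the reduction parameters for fixed taps, which is what allows each EM step to be a least-squares problem.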

III. JOINT ML ESTIMATION OF MODEL PARAMETERS VIA EM ALGORITHM

In this section, the ML estimation of the model parameters is first formulated. Then, the direct solution using this formulation is shown to require solving highly nonlinear and intractable equations. Finally, the EM algorithm for realizing the estimator is derived and discussed.

A. ML Estimator

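Using the representation (6) to express the sample data in a matrix form, we obtain

$$\mathbf{y} = \mathbf{D}\mathbf{h} + \mathbf{n} \tag{14}$$

where the sample data vector $\mathbf{y} = [y_1, \ldots, y_N]^T$, the noise vector $\mathbf{n} = [n_1, \ldots, n_N]^T$, and the matrix $\mathbf{D} = [\mathbf{d}_1, \ldots, \mathbf{d}_N]^T$.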

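Assume that the noise samples are independent and identically distributed Gaussian random variables with zero mean and variance $\sigma^2$. The likelihood function is thus

$$f(\mathbf{y} \mid \mathbf{h}, \boldsymbol{\gamma}) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{\|\mathbf{y} - \mathbf{D}\mathbf{h}\|^2}{2\sigma^2}\right) \tag{15}$$

where $\|\mathbf{v}\|^2 = \mathbf{v}^T\mathbf{v}$. The ML estimates of $\mathbf{h}$ and $\boldsymbol{\gamma}$ can be obtained directly by solving the equations derived by setting the derivatives of the logarithm of the likelihood function (15) with respect to $\mathbf{h}$ and $\boldsymbol{\gamma}$ to zero. Note that the reduction parameters are buried in the data matrix $\mathbf{D}$, so this approach requires solving nonlinear and intractable equations. Therefore, a better algorithm should be developed.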
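To make the coupling explicit, the stationarity conditions can be written out as below. This is a worked sketch rather than the authors' derivation. It assumes the symbol $\boldsymbol{\gamma}$ for the vector of reduction parameters, writes $\mathbf{D}(\boldsymbol{\gamma})$ for the data matrix of (6), and writes $\mathbf{x}(\mathbf{h})$ and $\mathbf{C}(\mathbf{h})$ for the stacked quantities of (7), so that $\mathbf{D}(\boldsymbol{\gamma})\mathbf{h} = \mathbf{x}(\mathbf{h}) + \mathbf{C}(\mathbf{h})\boldsymbol{\gamma}$.

```latex
\begin{align}
\ln f(\mathbf{y} \mid \mathbf{h}, \boldsymbol{\gamma})
  &= -\frac{N}{2}\ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\,
       \bigl\| \mathbf{y} - \mathbf{D}(\boldsymbol{\gamma})\mathbf{h} \bigr\|^2 \\
\frac{\partial \ln f}{\partial \mathbf{h}} = \mathbf{0}
  &\;\Longrightarrow\;
  \mathbf{D}(\boldsymbol{\gamma})^{T}
  \bigl( \mathbf{y} - \mathbf{D}(\boldsymbol{\gamma})\mathbf{h} \bigr) = \mathbf{0} \\
\frac{\partial \ln f}{\partial \boldsymbol{\gamma}} = \mathbf{0}
  &\;\Longrightarrow\;
  \mathbf{C}(\mathbf{h})^{T}
  \bigl( \mathbf{y} - \mathbf{x}(\mathbf{h}) - \mathbf{C}(\mathbf{h})\boldsymbol{\gamma} \bigr) = \mathbf{0}
\end{align}
```

Each condition is an ordinary set of normal equations when the other parameter is held fixed, but solved jointly they contain products of the entries of $\mathbf{h}$ and $\boldsymbol{\gamma}$, which is the source of the nonlinearity.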

B. EM Algorithm for Estimating Model Parameters

The EM algorithm, developed in 1976 [12], is well known for its power to solve nonlinear intractable ML problems via iterations of two simple steps: the expectation step (E-step) and the maximization step (M-step). Besides the ensured convergence, the EM algorithm often demands low realization complexity. Here, we apply the EM algorithm for finding the ML estimates of $\mathbf{h}$ and $\boldsymbol{\gamma}$. The E-step is formulated for estimating the reduction parameters $\boldsymbol{\gamma}$; then the M-step is used to obtain the estimate of $\mathbf{h}$.

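Using the terminology in [12], we set the reduction parameters $\boldsymbol{\gamma}$ as the hidden data; the measurement data $\mathbf{y}$ is incomplete because we cannot obtain the ML estimate of $\mathbf{h}$ via $\mathbf{y}$ without knowing the hidden data $\boldsymbol{\gamma}$. The EM algorithm centers on the following auxiliary function:

$$Q(\mathbf{h} \mid \hat{\mathbf{h}}) = E\left[\ln f(\mathbf{y}, \boldsymbol{\gamma} \mid \mathbf{h}) \,\middle|\, \mathbf{y}, \hat{\mathbf{h}}\right] \tag{16}$$

$$\phantom{Q(\mathbf{h} \mid \hat{\mathbf{h}})} = \int \ln f(\mathbf{y}, \boldsymbol{\gamma} \mid \mathbf{h})\, f(\boldsymbol{\gamma} \mid \mathbf{y}, \hat{\mathbf{h}})\, d\boldsymbol{\gamma} \tag{17}$$

where $E[\cdot]$ denotes the expectation operator. Given an initial estimate $\hat{\mathbf{h}}$ of the parameter $\mathbf{h}$, the E-step in the EM algorithm evaluates the auxiliary function $Q(\mathbf{h} \mid \hat{\mathbf{h}})$, yielding the byproduct $\hat{\boldsymbol{\gamma}}$, the estimate of $\boldsymbol{\gamma}$. The M-step then finds the estimate of $\mathbf{h}$ that maximizes the auxiliary function $Q(\mathbf{h} \mid \hat{\mathbf{h}})$. This estimate of $\mathbf{h}$ in the M-step is used again in the E-step as $\hat{\mathbf{h}}$ for evaluating the auxiliary function; then the M-step follows. This iteration continues until the estimates converge. The converged estimate $\hat{\mathbf{h}}$ from the M-step and the byproduct $\hat{\boldsymbol{\gamma}}$ from the E-step are the desired results. It is shown in [15] that at each iteration in the EM algorithm, these estimated parameters are obtained so that the likelihood function in (15) is nondecreasing and, thus, the convergence is guaranteed.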

As seen from (17), different realizations of the EM algorithm emerge when different distributions of the hidden data are assumed. Here, we assume that $\boldsymbol{\gamma}$ is an unknown, deterministic vector; the detailed algorithm and its derivation for our problem are briefly discussed as follows.

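Step 1. Initialization: Given an initial estimate $\hat{\mathbf{h}}$.

Step 2. E-step (Expectation): Using the representation (7) and the estimate $\hat{\mathbf{h}}$, we can express the $N$ measurements in the matrix form

$$\mathbf{y} = \hat{\mathbf{x}} + \hat{\mathbf{C}}\boldsymbol{\gamma} + \mathbf{n} \tag{18}$$

where $\hat{\mathbf{C}} = [\hat{\mathbf{c}}_1, \ldots, \hat{\mathbf{c}}_N]^T$, $\hat{\mathbf{x}} = [\hat{x}_1, \ldots, \hat{x}_N]^T$, and $\hat{\mathbf{c}}_k$, $\hat{x}_k$ are obtained using (8)–(13) with $h_i$ replaced by $\hat{h}_i$. Then the estimate of $\boldsymbol{\gamma}$ is derived by the least-squares method, yielding

$$\hat{\boldsymbol{\gamma}} = (\hat{\mathbf{C}}^T\hat{\mathbf{C}})^{-1}\hat{\mathbf{C}}^T(\mathbf{y} - \hat{\mathbf{x}}). \tag{19}$$

Since $\boldsymbol{\gamma}$ is assumed deterministic and the noise samples are independent and normally distributed, using the representation (6) and replacing $\boldsymbol{\gamma}$ by $\hat{\boldsymbol{\gamma}}$ we obtain the $Q$ function (17)

$$Q(\mathbf{h} \mid \hat{\mathbf{h}}) = K(\mathbf{y}^T\mathbf{y} - 2\mathbf{y}^T\hat{\mathbf{D}}\mathbf{h} + \mathbf{h}^T\hat{\mathbf{D}}^T\hat{\mathbf{D}}\mathbf{h}) \tag{20}$$

where $K$ is a negative constant and $\hat{\mathbf{D}}$ is obtained using (5), (6), and (14) with $\boldsymbol{\gamma}$ replaced by $\hat{\boldsymbol{\gamma}}$.

Step 3. M-step (Maximization): Once $Q$ is obtained, the M-step finds the estimate $\hat{\mathbf{h}}$ that maximizes the function, resulting in

$$\hat{\mathbf{h}} = (\hat{\mathbf{D}}^T\hat{\mathbf{D}})^{-1}\hat{\mathbf{D}}^T\mathbf{y}. \tag{21}$$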

The EM algorithm iteratively executes the E-step and M-step until convergence. Convergence is often checked by evaluating a measure of the difference between the estimated parameters of successive iterations. When the measure is less than a predetermined value, the iteration terminates. The computational flowchart of the EM algorithm for estimating the parameters of a simple partial erasure model is depicted in Fig. 1.
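The three steps can be sketched in Python (NumPy) as an alternation of two least-squares solves. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names, the tolerance, the iteration cap, and the simulated channel settings (unit-peak Lorentzian with half-amplitude width 2.5 bit periods, reduction parameters 0.7 and 0.5, noise standard deviation 0.02) are all hypothetical. `numpy.linalg.lstsq` is used in place of the explicit inverses so that the all-zero initial estimate, for which the regressor matrix of the E-step is zero, simply yields the minimum-norm solution.

```python
import numpy as np

def windows(s, L):
    """Row k is [s_{k+L}, ..., s_{k-L}] with zeros outside the record."""
    padded = np.concatenate((np.zeros(L), s, np.zeros(L)))
    idx = np.arange(len(s))[:, None] + np.arange(2 * L, -1, -1)[None, :]
    return padded[idx]

def split_by_neighbours(b):
    """Copies of b keeping only transitions with 0, 1 and 2 neighbouring transitions."""
    a = np.abs(b)
    m = np.concatenate(([0], a[:-1])) + np.concatenate((a[1:], [0]))
    return [np.where(m == j, b, 0) for j in (0, 1, 2)]

def em_estimate(y, b, L, tol=1e-6, max_iter=200):
    """Alternate the two least-squares problems until the taps stop changing."""
    W0, W1, W2 = (windows(s, L) for s in split_by_neighbours(b))
    h = np.zeros(2 * L + 1)                      # Step 1: initial estimate of the taps
    gamma = np.zeros(2)
    for n_iter in range(1, max_iter + 1):
        # Step 2 (E-step): least squares for the reduction parameters given h.
        x = W0 @ h
        C = np.column_stack((W1 @ h, W2 @ h))
        gamma = np.linalg.lstsq(C, y - x, rcond=None)[0]
        # Step 3 (M-step): least squares for the taps given the reduction parameters.
        D = W0 + gamma[0] * W1 + gamma[1] * W2
        h_new = np.linalg.lstsq(D, y, rcond=None)[0]
        converged = np.linalg.norm(h_new - h) < tol   # 2-norm of the change in h
        h = h_new
        if converged:
            break
    return h, gamma, n_iter

# Simulated data with hypothetical settings.
rng = np.random.default_rng(0)
L, N = 10, 2000
bits = rng.integers(0, 2, size=N)
level = np.where(np.cumsum(bits) % 2 == 1, 1, -1)        # NRZI write level
b = (level - np.concatenate(([-1], level[:-1]))) // 2    # transitions in {-1, 0, +1}
h_true = 1.0 / (1.0 + (2.0 * np.arange(-L, L + 1) / 2.5) ** 2)
g_true = np.array([0.7, 0.5])
W0, W1, W2 = (windows(s, L) for s in split_by_neighbours(b))
y = (W0 + g_true[0] * W1 + g_true[1] * W2) @ h_true + 0.02 * rng.standard_normal(N)

h_hat, g_hat, n_iter = em_estimate(y, b, L)
print(n_iter, np.round(g_hat, 3))
```

Because the three window matrices depend only on the known input pattern, they are built once and reused, so each iteration costs two small least-squares solves.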

Fig. 1. Computational flowchart of the EM algorithm.

Fig. 2. Averaged output square error versus iterations of a simulation run.

IV. COMPUTER SIMULATIONS

In this section, the EM algorithm is demonstrated to accurately estimate the parameters of a simple partial erasure model by computer simulations. Assume that the model has a Lorentzian isolated transition response with and the reduction parameters and . The number of samples for modeling the isolated transition response is set to be 21, i.e., $2L + 1 = 21$ with $L = 10$. The data $b_k$s are obtained by NRZI encoding of a random signal of equal probability on two values . The number of the sample data is . The signal power, given the model settings, can be analytically obtained to be 0.5921. The signal-to-noise ratio (SNR), defined as the ratio of the signal power to the noise variance, is set to be 20 dB; hence, the noise variance is equal to 0.005921.
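The noise variance follows from the quoted signal power and SNR by a one-line conversion from decibels, shown here as a short Python check; the variable names are illustrative.

```python
signal_power = 0.5921                              # analytical signal power quoted above
snr_db = 20.0                                      # SNR in decibels
noise_var = signal_power / 10 ** (snr_db / 10)     # 20 dB is a power ratio of 100
print(f"{noise_var:.6f}")                          # 0.005921
```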

The EM algorithm discussed above with zeros as the initial estimate is simulated. The simulation results of one typical run are shown in Figs. 2–4. Fig. 2 depicts the average squared output error of the identified model versus the iteration. Note that the vertical axis is in log scale and the noise power is also shown as a dashed line. This result indicates that the algorithm converges rapidly in a few iterations, and the converged average squared error is close to the noise variance. The algorithm terminates at the eighth iteration when the predetermined error measure, defined as the 2-norm of the difference between the present and the previous estimates of , is set to 10 . The samples of the actual isolated transition response $\mathbf{h}$ and its estimates $\hat{\mathbf{h}}$, obtained at the eighth iteration, are shown with different markers in Fig. 3. The estimated reduction parameters $\hat{\gamma}_1$ and $\hat{\gamma}_2$ at each iteration are also depicted in Fig. 4(a) and (b). In these simulations, the estimated samples of the isolated transition response and the reduction parameters are very close to the true ones, which verifies the proposed method.

Fig. 3. Samples of the actual isolated transition response and its estimates obtained at the eighth iteration of a simulation run.

Fig. 4. (a) $\hat{\gamma}_1$ versus iterations of a simulation run. (b) $\hat{\gamma}_2$ versus iterations of a simulation run.

TABLE I

ESTIMATED ERROR VARIANCE OF EACH REDUCTION PARAMETER AND ITS CRAMÉR-RAO BOUND UNDER VARIOUS SNRS

Fig. 5. (a) Estimated error variance of $\hat{\gamma}_1$ versus the number of data samples. (b) Estimated error variance of $\hat{\gamma}_2$ versus the number of data samples.

We also demonstrate by simulation that the joint ML estimator of the model parameters is asymptotically efficient. For simplicity, we shall only show the error variance of each reduction parameter, obtained by Monte Carlo simulations using the average of 100 independent runs. The simulation results under various SNRs are listed in Table I. As shown in the table, each error variance is of the same order as the Cramér-Rao bound. As also shown in Fig. 5, under an SNR of 20 dB, increasing the number of data samples decreases the discrepancy between the obtained error variance and the Cramér-Rao bound. When the number of data samples equals 5000, the error variances of $\hat{\gamma}_1$ and $\hat{\gamma}_2$ are nearly equal to their corresponding Cramér-Rao bounds. These results show that the proposed algorithm for estimating the parameters of a simple partial erasure model is asymptotically efficient.
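The paper does not spell out how the Cramér-Rao bounds were computed. One standard route, sketched below under stated assumptions, uses the fact that for white Gaussian noise of known variance and a deterministic mean, the Fisher information is the Gram matrix of the Jacobian of the mean divided by the noise variance. Here the Jacobian with respect to the taps is the data matrix of (6) and the Jacobian with respect to the reduction parameters is the coefficient matrix of (7). The helper names and the channel settings are hypothetical.

```python
import numpy as np

def windows(s, L):
    """Row k is [s_{k+L}, ..., s_{k-L}] with zeros outside the record."""
    padded = np.concatenate((np.zeros(L), s, np.zeros(L)))
    idx = np.arange(len(s))[:, None] + np.arange(2 * L, -1, -1)[None, :]
    return padded[idx]

def split_by_neighbours(b):
    """Copies of b keeping only transitions with 0, 1 and 2 neighbouring transitions."""
    a = np.abs(b)
    m = np.concatenate(([0], a[:-1])) + np.concatenate((a[1:], [0]))
    return [np.where(m == j, b, 0) for j in (0, 1, 2)]

def crb_gamma(b, h, gamma, noise_var):
    """Cramér-Rao bounds for the two reduction parameters when the taps are also unknown."""
    L = (len(h) - 1) // 2
    W0, W1, W2 = (windows(s, L) for s in split_by_neighbours(b))
    D = W0 + gamma[0] * W1 + gamma[1] * W2       # Jacobian of the mean w.r.t. the taps
    C = np.column_stack((W1 @ h, W2 @ h))        # Jacobian w.r.t. the reduction parameters
    G = np.hstack((D, C))
    fisher = G.T @ G / noise_var
    return np.diag(np.linalg.inv(fisher))[-2:]   # last two entries belong to gamma

# Hypothetical settings for illustration only.
rng = np.random.default_rng(0)
L = 10
h = 1.0 / (1.0 + (2.0 * np.arange(-L, L + 1) / 2.5) ** 2)
gamma = np.array([0.7, 0.5])
bounds = {}
for N in (1000, 5000):
    bits = rng.integers(0, 2, size=N)
    level = np.where(np.cumsum(bits) % 2 == 1, 1, -1)
    b = (level - np.concatenate(([-1], level[:-1]))) // 2
    bounds[N] = crb_gamma(b, h, gamma, noise_var=0.005921)
    print(N, bounds[N])
```

A Monte Carlo error variance from repeated EM runs on independent noise realizations could then be compared against these bounds, in the spirit of Table I and Fig. 5.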

V. CONCLUSION

The EM algorithm has been successfully applied to estimating the parameters of a simple partial erasure model, including the reduction parameters and the isolated transition response. This approach not only avoids solving the nonlinear equations but also provides an effective way to identify the model parameters accurately. The resulting algorithm iterates between two least-squares problems; hence, its realization is simple. Simulation results also show that the convergence is fast and the resulting estimator is asymptotically efficient. This algorithm can estimate the model parameters rapidly and accurately and, thus, is expected to improve the performance of high-density magnetic recording.


REFERENCES

[1] S. Choi, S. Ong, C. You, D. Hong, and J. Cho, “Performance of neural equalizers on partial erasure model,” IEEE Trans. Magn., vol. 33, pp. 2788–2790, Sept. 1997.

[2] M. P. C. Fossorier, T. Mori, and H. Imai, “Simplified trellis diagram for a simple partial erasure model,” IEEE Trans. Magn., vol. 34, pp. 320–322, Jan. 1998.

[3] S. Choi and D. Hong, “Performance of RBF equalizer in data storage channels,” in Int. Joint Conf. Neural Networks, vol. 2, 1999, pp. 1077–1080.

[4] T. Yamauchi and J. M. Cioffi, “A nonlinear model in the thin film disk recording systems,” in Int. Conf. Magn., vol. EC-10, Apr. 1993.

[5] W. E. Ryan and N. H. Yeh, “Viterbi detector for PR4-equalized magnetic recording channels with transition-shift and partial erasure nonlinearities,” IEEE Trans. Magn., vol. 32, pp. 3950–3952, Sept. 1996.

[6] R. Hermann, “Volterra modeling of digital magnetic saturation recording channels,” IEEE Trans. Magn., vol. 26, pp. 2125–2127, Sept. 1990.

[7] I. Lee, T. Yamauchi, and J. M. Cioffi, “Performance comparison of receivers in a simple partial erasure model,” IEEE Trans. Magn., vol. 30, pp. 1465–1469, July 1994.

[8] D. Palmer, P. A. Ziperovich, R. Wood, and T. Howell, “Identification of nonlinear write effect using pseudo-random sequences,” IEEE Trans. Magn., vol. 23, pp. 2377–2379, Sept. 1987.

[9] X. Che and P. A. Ziperovich, “A time-correction method of calculating nonlinearities utilizing pseudo-random sequences,” IEEE Trans. Magn., vol. 30, pp. 4239–4241, Nov. 1994.

[10] Y. S. Tang and C. Tsang, “A technique for measuring nonlinear bit shift,” IEEE Trans. Magn., vol. 27, pp. 5316–5318, Nov. 1991.

[11] Y. S. Cho and N. J. Lee, “A technique for nonlinear distortion in high-density magnetic recording channels,” IEEE Trans. Magn., vol. 34, pp. 40–44, Jan. 1998.

[12] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Roy. Stat. Soc., ser. B, vol. 39, no. 1, pp. 1–38, 1977.

[13] T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal Processing Mag., pp. 47–60, Nov. 1996.

[14] R. Perry, W. A. Berger, and K. Buckley, “EM algorithms for sequence estimation over random ISI channels,” in Asilomar Conf. Signals, Systems and Computers, vol. 1, 1999, pp. 295–299.

[15] C. Wu, “On the convergence properties of the EM algorithm,” Ann. Statist., vol. 11, no. 1, pp. 95–103, 1983.

[16] S. X. Wang and A. M. Taratorin, Introduction to Magnetic Information Storage Technology. New York: Academic, 1999, pp. 291–293.

Tsai-Sheng Kao was born in Taipei, Taiwan, R.O.C., in 1975. He received the B.S. and M.S. degrees in electrical and control engineering from National Chiao Tung University, Hsinchu, Taiwan, in 1997 and 1999, respectively. He is currently a doctoral student in the Department of Electrical and Control Engineering at National Chiao Tung University.

His research interests include magnetic storage systems, parameter estimation, and digital signal processing.

Mu-Huo Cheng received the B.S. degree in electrical engineering from National Cheng-Kung University, Tainan, Taiwan, R.O.C., in 1980, the M.S. degree in electronic engineering from National Chiao Tung University, Hsinchu, Taiwan, in 1987, and the Ph.D. degree from the Department of Electrical and Computer Engineering at Carnegie Mellon University, Pittsburgh, PA, in 1995.

Since 1995, he has been an Associate Professor in the Department of Electrical and Control Engineering at National Chiao Tung University. His current research interests include adaptive signal processing, detection, and estimation.