DSP Program Flow for Viterbi Decoder - Execution Flow of RS Decoder and Viterbi Decoder

6 Implementation and Acceleration of 802.16a Reed-Solomon Decoder on TI

6.3 DSP Implementation of Reed-Solomon Decoder and Viterbi Decoder

6.3.2 Execution Flow of RS Decoder and Viterbi Decoder

6.3.2.2 DSP Program Flow for Viterbi Decoder

The interface of the Viterbi decoder implementation is shown in Fig. 6.7. It is similar to that of the RS decoder except for the text edit box for the coding mode.

The program execution flow is also similar to that of the RS decoder, shown in Fig. 6.6, except that the Viterbi decoder does not need to determine the coding mode.

Figure 6.7: The Interface of the Viterbi Decoder Implementation

6.3.3 Performance Analysis

In this section, we present the execution time of our implementation for the RS decoder and Viterbi decoder of the IEEE 802.16a wireless communication standard.

The code size and the processing rate of our implementation are shown in Table 6.9.

| Implemented Decoder Name | Code Size | Processing Rate (Kbytes/sec) | Improvement Percentage (%) |
|---|---|---|---|
| Original RS Decoder | 17,137,575 | 58.80 | N/A |
| Improved RS Decoder | 17,139,055 | 176.40 | 96.44 |
| Viterbi | 17,120,975 | 17.42 | N/A |

Table 6.9: Profile of Our Implementation for RS Decoder and Viterbi Decoder

The code sizes of the decoder implementations are almost the same because the largest part of the final code is overhead: the transfer mechanism, plus the functions and constants provided by the library.

The improved RS decoder reaches a processing rate of 176.4 Kbytes/sec, an improvement of up to 96.44% over Lee's RS decoder without file-level optimization. The processing rate of the Viterbi decoder is about 17.42 Kbytes/sec. To accelerate the Viterbi decoder, it seems better to design logic that parallelizes its operations than to execute it sequentially on the DSP platform. Moreover, the algorithm of the Viterbi decoder is almost fixed, so we only measure its efficiency on the DSP platform.
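The remark about parallelizing the Viterbi decoder can be illustrated with a minimal sketch of the add-compare-select (ACS) step. This is not the thesis code: it is a hard-decision decoder written for illustration, and it assumes a rate-1/2 convolutional code with constraint length 7 and generators 171 and 133 (octal). The function names `encode`, `acs_stage`, and `viterbi_decode` are hypothetical. The point to notice is that, within one trellis stage, each of the 64 next states is updated from the previous metrics only, so the inner loop over `ns` has no dependency between iterations and could be unrolled into parallel logic.

```python
# Illustrative hard-decision Viterbi decoder (assumed parameters, not the thesis code).
K = 7                        # constraint length (assumption)
G = (0o171, 0o133)           # generator polynomials in octal (assumption)
N_STATES = 1 << (K - 1)      # 64 trellis states


def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoder starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state        # newest bit sits in the MSB
        out += [parity(reg & g) for g in G]
        state = reg >> 1                    # shift register advances
    return out


def acs_stage(metrics, r0, r1):
    """One add-compare-select stage for the received bit pair (r0, r1).

    Every next state `ns` reads only the old `metrics`, so the 64
    iterations of the outer loop are independent of each other.
    """
    new_metrics = [0] * N_STATES
    decisions = [0] * N_STATES
    for ns in range(N_STATES):
        b = ns >> (K - 2)                   # input bit that leads into ns
        best = None
        for low in (0, 1):                  # the two candidate predecessors
            ps = ((ns << 1) & (N_STATES - 1)) | low
            reg = (b << (K - 1)) | ps
            branch = (parity(reg & G[0]) ^ r0) + (parity(reg & G[1]) ^ r1)
            cand = metrics[ps] + branch     # add
            if best is None or cand < best: # compare
                best, decisions[ns] = cand, low   # select
        new_metrics[ns] = best
    return new_metrics, decisions


def viterbi_decode(received):
    """Decode a list of hard bits (two per information bit)."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)  # encoder is known to start in state 0
    history = []
    for i in range(0, len(received), 2):
        metrics, dec = acs_stage(metrics, received[i], received[i + 1])
        history.append(dec)
    # Trace back from the state with the smallest path metric.
    state = min(range(N_STATES), key=metrics.__getitem__)
    bits = []
    for dec in reversed(history):
        bits.append(state >> (K - 2))
        state = ((state << 1) & (N_STATES - 1)) | dec[state]
    return bits[::-1]
```

A typical use is to append K-1 zero tail bits to a message, pass it through `encode`, flip a few bits to imitate channel errors, and check that `viterbi_decode` returns the original message. On a sequential processor the 64 state updates per stage are executed one after another, which is consistent with the lower processing rate reported for the Viterbi decoder.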

Chapter 7 Conclusions and Future Works

7.1 Conclusions

The speech coding approach taken by AMR adjusts the speech and channel coding rates to the channel condition without losing too much quality. The Reed-Solomon codec in IEEE 802.16a provides several coding rates and error-correction capabilities for wireless communication. However, the multiple speech coding modes and the additional channel coding for reducing channel errors increase the complexity of a hardware implementation. Fortunately, VLSI and architecture design techniques are advancing rapidly, which gives us the opportunity to implement complicated algorithms on hardware. In this thesis, the AMR speech codec is implemented on a DSP platform that is used mainly for multimedia coding purposes. So is the Reed-Solomon decoder, which is widely used because of its high capability of correcting both random and burst errors.

In the previous chapters, we first focus on the AMR speech codec. We profile the C program provided by 3GPP and find that most functions consist mainly of calls to arithmetic operations. Accelerating the arithmetic operations is therefore an effective way to reduce the execution time. We also use the TI DSP intrinsics, which are efficient instructions supported by the C64x DSP that take advantage of the DSP architecture, to accelerate the AMR codec. The improvement is up to 68.88% for the encoder and 66.12% for the decoder when compiler-level optimization is also enabled. Finally, we implement the accelerated program on the DSP platform, where it runs at 14.05 ms/frame for the encoder and 2.43 ms/frame for the decoder. The measured time includes the data transfer and still meets the real-time requirement.
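The real-time claim can be checked with a short calculation. The sketch below takes the two measured per-frame times from the text and compares them with the AMR frame duration of 20 ms (160 samples at 8 kHz). The helper names `realtime_load` and `meets_realtime` are hypothetical, and summing the encoder and decoder times assumes the worst case in which both run back to back on the same processor.

```python
# Real-time budget check for the AMR codec timings quoted in the text.
FRAME_MS = 20.0      # AMR frame duration: 160 samples at 8 kHz
ENCODER_MS = 14.05   # measured encoder time per frame (from the text)
DECODER_MS = 2.43    # measured decoder time per frame (from the text)


def realtime_load(per_frame_ms, frame_ms=FRAME_MS):
    """Fraction of one frame period spent on processing."""
    return per_frame_ms / frame_ms


def meets_realtime(per_frame_ms, frame_ms=FRAME_MS):
    """True if a frame is processed before the next one arrives."""
    return per_frame_ms < frame_ms


# Assumption: encoder and decoder share one DSP and run sequentially.
total_ms = ENCODER_MS + DECODER_MS

if __name__ == "__main__":
    print("encoder meets real time:", meets_realtime(ENCODER_MS))
    print("decoder meets real time:", meets_realtime(DECODER_MS))
    print("encoder + decoder meet real time:", meets_realtime(total_ms))
```

Under these assumptions the combined encoder and decoder time is 16.48 ms per frame, which stays below the 20 ms frame period.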

The other topic in this thesis is the Reed-Solomon decoder in IEEE 802.16a. The conventional decoding algorithm is described and treated as the original version for further improvement. The original decoder is first profiled and then accelerated in the syndrome computation and Chien search modules, which are the two most time-consuming procedures. We reduce their complexity and simplify their structure to suit software pipelining. The improvement is up to 97.79% in the syndrome computation and 73.72% in the Chien search. The improved Reed-Solomon decoder is also implemented on the DSP platform. Its processing rate is up to 176.4 Kbytes/sec, a 96.44% improvement over the original one. The Viterbi decoder is also implemented to complete the FEC scheme in our IEEE 802.16a project; its processing rate on the DSP is 17.42 Kbytes/sec. The final versions of both the Reed-Solomon decoder and the Viterbi decoder in IEEE 802.16a reach our goal of real-time operation for the AMR speech coding.
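For readers unfamiliar with the two modules named above, the following is a minimal sketch of the conventional (unoptimized) syndrome computation and Chien search. It is not the accelerated version developed in this thesis. It assumes GF(2^8) with field polynomial x^8 + x^4 + x^3 + x^2 + 1, primitive element 02 (hex), a code length of 255, T = 8, and syndromes evaluated at the powers 0 to 2T-1 of the primitive element. All function and table names are hypothetical.

```python
# Conventional syndrome computation and Chien search over GF(2^8).
# Field polynomial, code length and T are assumptions for illustration.
PRIM_POLY = 0x11D    # x^8 + x^4 + x^3 + x^2 + 1
N, T = 255, 8


def _build_tables():
    """Build antilog (EXP) and log (LOG) tables for alpha = 2."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= PRIM_POLY
    for i in range(255, 512):       # duplicate so gf_mul needs no modulo
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = _build_tables()


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received, n_syn=2 * T):
    """S_i = r(alpha^i) for i = 0..n_syn-1, evaluated by Horner's rule.

    received[0] is the coefficient of the highest power of x.
    """
    out = []
    for i in range(n_syn):
        s = 0
        for byte in received:
            s = gf_mul(s, EXP[i]) ^ byte
        out.append(s)
    return out


def chien_search(locator, n=N):
    """Return every position j with locator(alpha^-j) == 0.

    locator[k] is the coefficient of x^k in the error locator polynomial.
    """
    positions = []
    for j in range(n):
        acc = 0
        for k, coef in enumerate(locator):
            acc ^= gf_mul(coef, EXP[(-j * k) % 255])
        if acc == 0:
            positions.append(j)
    return positions
```

Both routines consist of two nested loops whose bodies are a table-based multiplication followed by an XOR. In this plain form the syndrome computation performs 2T passes over all 255 received symbols, and the Chien search tries all 255 candidate positions, which is consistent with these two modules dominating the decoder profile.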

7.2 Future Works

As discussed above, the processing speed of the Viterbi decoder is the bottleneck in our IEEE 802.16a FEC procedure. However, we have adopted the most efficient algorithm we know of, and it is hard to accelerate it further by algorithmic fine-tuning. One way to implement and accelerate the Viterbi decoder is to design VLSI logic that parallelizes its operations. The FEC scheme may therefore be accelerated by implementing the Viterbi decoder on the FPGA with the help of the DSP. The DSP platform we use in this project contains a Xilinx FPGA, so this may be worth trying.

There are also other issues in the AMR codec implementation. Analog input and output are not yet supported, although they are available on the DSP baseboard we use. Reading and writing files are the primary I/O for our present implementation. It would be more useful in practice to process real-time input speech or audio using the microphone and the speaker. However, owing to limited time, we have not yet tested or used the I/O ports. This can be another subject to explore.

Bibliography

[1] O. Corbun, M. Almgren, and K. Svanbro, “Capacity and Speech Quality aspects using Adaptive Multi-Rate (AMR),” The Ninth IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, vol. 3, pp. 1535-1539, 1998.

[2] 3G TS 26.071: “AMR Speech Codec; General Description,” 3GPP, Aug. 1999.

[3] 3G TS 26.090: “AMR Speech Codec; Speech Transcoding Functions,” 3GPP, Dec. 1999.

[4] D. A. F. Florencio, “Investigating the use of Asymmetric Windows in CELP Vocoders,” ICASSP, vol. 2, pp. 427-430, 1993.

[5] R. Salami, C. Laflamme, J. P. Adoul, and D. Massaloux, “A Toll Quality 8 Kb/s Speech Codec for the Personal Communications System (PCS),” IEEE Transactions on Vehicular Technology, vol. 43, no. 3, pp. 808-816, Aug. 1994.

[6] R. Salami, C. Laflamme, J. P. Adoul, A. Kataoka, S. Hayashi, T. Moriya, C. Lamblin, D. Massaloux, S. Proust, P. Kroon, and Y. Shoham, “Design and Description of CS-ACELP: A Toll Quality 8 kb/s Speech Coder,” IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 116-130, Mar.

[7] P. Kabal and R. P. Ramachandran, “The computation of line spectral frequencies using Chebyshev polynomials,” IEEE Transactions on ASSP, vol. 34, no. 6, pp. 1419-1426, Dec. 1986.

[8] C. R. Galand, J. E. Menez, and M. M. Rosso, “Adaptive Code Excited Predictive Coding,” IEEE Transactions on Signal Processing, vol. 40, no. 6, pp. 1317-1326, Jun. 1992.

[9] P. Kroon and B. S. Atal, “On the Use of Pitch Predictors with High Temporal Resolution,” IEEE Transactions on Signal Processing, vol. 39, no. 3, pp. 733-735, Mar. 1991.

[10] E. Ekudden, R. Hagen, I. Johansson, and J. Svedberg, “The Adaptive Multi-Rate Speech Coder,” IEEE Proceeding of Speech Coding, pp. 117-179, 1999.

[11] A. Uvliden, S. Bruhn, and R. Hagen, “Adaptive Multi-Rate – A Speech Service Adapted to Cellular Radio Network Quality,” IEEE Thirty-Second Asilomar Conference, vol. 1, pp. 343-347, 1998.

[12] K. Jarvinen, J. Vainio, P. Kapanen, T. Honkanen, and P. Haavisto, “GSM Enhanced Full Rate Speech Codec,” ICASSP, vol. 2, pp. 771-774, 1997.

[13] A. M. Kondoz, Digital Speech: Coding for Low Bit Rate Communication Systems. Wiley, 2004.

[14] IEEE Standard for local and metropolitan area networks, Part 16, Amendment 2, 2003.

[15] I. S. Reed and X.-M. Chen, Error-Control Coding for Data Networks. Kluwer Academic Publishers, Dordrecht, 1999.

[16] J.-S. Lin, DSP Implementation and Error Performance Study on Speech Source/Channel Coding. M.S. thesis, National Chiao Tung University, Dep. of Elect. Eng., Hsinchu, Taiwan R.O.C., Jun. 2002.

[17] Y.-P. Ho, Study on OFDM Signal Description and Channel Coding in the IEEE 802.16a TDD OFDMA Wireless Communication Standard. M.S. thesis, National Chiao Tung University, Dep. of Elect. Eng., Hsinchu, Taiwan R.O.C., Jun. 2003.

[18] F. Tosato and P. Bisaglia, “Simplified Soft-Output Demapper for Binary Interleaved COFDM with Application to HIPERLAN/2,” IEEE International Conference Communications, vol. 2, pp. 664-668, 2002.

[19] Y.-P. E. Wang and R. Ramesh, “To bite or not to bite – a study of tail bits versus tail-biting,” Proc. IEEE International Symposium on Personal Indoor Mobile Radio Communication, vol. 2, pp. 317-321, Oct. 1996.

[20] Y.-T. Lee, DSP Implementation and Optimization of the Forward Error Correction Scheme in IEEE 802.16a Standard. M.S. thesis, National Chiao Tung University, Dep. of Elect. Eng., Hsinchu, Taiwan R.O.C., Jun. 2004.

[21] Texas Instruments, TMS320C6000 CPU and Instruction Set Reference Guide. Literature Number: SPRU189F, Oct. 2000.

[22] Texas Instruments, TMS320C64x Technical Overview. Literature Number: SPRU396B, Jan. 2001.

[23] Texas Instruments, TMS320C6000 Programmer’s Guide. Literature Number: SPRU198G, Aug. 2002.

[24] Innovative Integration, Quixote User’s Manual. 2003.

[25] Innovative Integration, Quixote Architecture. 2003.

[26] Q. Zhuge, B. Xiao, and E. H.-M. Sha, “Code Size Reduction Technique and Implementation for Software-Pipelined DSP Applications,” ACM Transactions on Embedded Computing Systems, vol. 2, pp. 590-613, Nov. 2003.

[27] 3G TS 26.074: “AMR Speech Codec Test Sequence,” 3GPP, Dec. 2004.

[28] Texas Instruments, Reed Solomon Decoder: TMS320C64x Implementation. Literature Number: SPRA686, Dec. 2000.

[29] T.-K. Truong, J.-H. Jeng, and I. S. Reed, “Fast Algorithm for Computing the Roots of Error Locator Polynomials up to Degree 11 in Reed-Solomon Decoders,” IEEE Transactions on Communications, vol. 49, no. 5, May 2001.

[30] M. Morii and M. Kasahara, “Generalized Key-Equation of Remainder Decoding Algorithm for Reed-Solomon Codes,” IEEE Transactions on Information Theory, vol. 38, no. 6, Nov. 1992.

[31] X. Ma and X.-M. Wang, “On the Minimal Interpolation Problem and Decoding RS Codes,” IEEE Transactions on Information Theory, vol. 46, no. 4, Jul. 2000.

[32] W. G. Chambers, R. E. Peile, K. Y. Tsie, and N. Zein, “Algorithm for Solving the Welch-Berlekamp Key-Equation, with a Simplified Proof,” Electronics Letters, vol. 29, no. 18, Sep. 1993.

[33] S. R. Blackburn, “Fast Rational Interpolation Reed-Solomon Decoding, and the Linear Complexity Profiles of Sequence,” IEEE Transactions on Information Theory, vol. 43, no. 2, Mar. 1997.

[34] A. Mahmudi, M. Benaissa, and P. Sweeney, “The Implementation of Generalized Minimum Distance Decoding for Reed Solomon Codes,” IEEE International Symposium on Circuits and Systems, May 2000.

[35] D. Dabiri and I. F. Blake, “Fast Parallel Algorithm for Decoding Reed-Solomon Codes,” IEEE Transactions on Information Theory, vol. 41, no. 4, Jul. 1994.

[36] T.-K. Truong, J.-H. Jeng, and T. C. Cheng, “A New Decoding Algorithm for Correcting Both Erasures and Errors of Reed-Solomon Codes,” IEEE Transactions on Communications, vol. 51, no. 3, Mar. 2003.

[37] E. R. Berlekamp and L. Welch, “Error Correction for algebraic block codes,” U.S. patent 4633470, 1986.

[38] E. R. Berlekamp, Bounded Distance+1 Soft Decision Reed-Solomon Decoding. preprint.
