A Division-Free (255,k) RS Code Syndrome
Computation Scheme for PC-based DVB-T
Software Radio Implementation
Shu-Ming Tseng
Department of Electronic Engineering, National Taipei University of Technology, Taipei 106, Taiwan
Email: [email protected]
Yao-Teng Hsu
Department of Electronic Engineering, National Taipei University of Technology, Taipei 106, Taiwan
Email: [email protected]
Jheng-Zong Shih
Liteon Corp., Taipei, Taiwan
Email: [email protected]
Abstract―In this paper, we propose a novel division-free algorithm of syndrome evaluation for (255,k) Reed-Solomon codes on a PC-based software radio platform. The proposed algorithm reduces the execution time of syndrome evaluation: with the proposed division-free algorithm, syndrome evaluation is almost three times faster than with the typical one.
Index Terms―modulo, division-free algorithm, PC platform
I. INTRODUCTION
PC-based software radio implementation has become popular because of the ease of implementing and modifying the baseband processing algorithms. In [1], a PC-based software GPS receiver is implemented. In [2], a real-time notebook PC-based Digital Audio Broadcasting (DAB) receiver is implemented.
Based on our previous work on DAB, we now want to implement a PC-based Digital Video Broadcasting-Terrestrial (DVB-T) software receiver. However, the bottleneck for real-time processing is the Reed-Solomon (RS) code syndrome computation, which accounts for more than half of the total computation time in our PC-based DVB-T software receiver written in C, as shown in Table 1.
In this paper, we propose a simplified algorithm of syndrome evaluation over GF($2^8$) for the x86 PC platform to reduce the computation time. We replace the modulo (division) operation by addition operations to reduce the execution time. The new algorithm can also be executed in parallel with SSE instructions for further gain.
This paper is organized as follows: In Section II, we briefly describe the syndrome evaluation. In Section III, we describe the proposed algorithm. The simulation result is given in Section IV. Section V is the conclusion.
II. THE SYNDROME EVALUATION
We focus on (255,k) RS codes because they are popular in industry standards. Let 255-k=2t, where t is the number of correctable symbol errors.
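As a concrete instance (a well-known example, not one stated in this section), the DVB-T outer code RS(204,188) is a shortened version of the (255,239) code, for which

$$255 - k = 255 - 239 = 16 = 2t \quad\Rightarrow\quad t = 8,$$

so up to 8 symbol errors per codeword can be corrected and 2t = 16 syndromes have to be evaluated.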
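The received codeword is
$$r(x) = v(x) + e(x) \qquad (1)$$
where $v(x)$ is the transmitted codeword and $e(x)$ is the error polynomial
$$e(x) = e_{j_1} x^{j_1} + e_{j_2} x^{j_2} + \cdots + e_{j_v} x^{j_v} \qquad (2)$$
The $e_{j_1}, e_{j_2}, \ldots, e_{j_v}$ are the error values and $x^{j_1}, x^{j_2}, \ldots, x^{j_v}$ are the error positions, with
$$e_{j} \in GF(2^m) \qquad (3)$$
Syndromes are evaluated in GF($2^m$) as
$$S_1 = r(\alpha) = r_0 + r_1 \alpha + \cdots + r_{2^m-2}\,\alpha^{(2^m-2)}$$
$$S_2 = r(\alpha^2) = r_0 + r_1 \alpha^{2} + \cdots + r_{2^m-2}\,\alpha^{2(2^m-2)}$$
$$\vdots$$
$$S_{2t} = r(\alpha^{2t}) = r_0 + r_1 \alpha^{2t} + \cdots + r_{2^m-2}\,\alpha^{2t(2^m-2)} \qquad (4)$$
where $\alpha$ is a primitive element of GF($2^m$) and $r_0, r_1, \ldots, r_{2^m-2}$ are the coefficients of $r(x)$.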
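The conventional evaluation of these syndromes with logarithm tables can be sketched in C as follows. This is only an illustrative sketch, not the authors' code: the field polynomial `0x11D`, the table names `gf_log` and `gf_anti_log`, and the function names are assumptions introduced here. Each product of a received symbol with a power of alpha is computed as an anti_log of a sum of exponents reduced mod 255, which is the step the paper aims to speed up.

```c
#include <assert.h>
#include <stdint.h>

#define RS_N    255    /* codeword length of a (255,k) code */
#define GF_POLY 0x11D  /* ASSUMPTION: x^8+x^4+x^3+x^2+1, not given in the text */

static uint8_t gf_log[256];      /* gf_log[a] = i such that alpha^i = a (a != 0) */
static uint8_t gf_anti_log[255]; /* gf_anti_log[i] = alpha^i, 0 <= i < 255       */

/* Build the log / anti_log tables by repeated multiplication by alpha. */
static void gf_init(void)
{
    unsigned v = 1;
    for (unsigned i = 0; i < 255; i++) {
        gf_anti_log[i] = (uint8_t)v;
        gf_log[v] = (uint8_t)i;
        v <<= 1;
        if (v & 0x100)
            v ^= GF_POLY;
    }
}

/* Conventional syndrome evaluation (hypothetical helper):
 * S[k-1] = sum over n of r[n] * alpha^(n*k), for k = 1..two_t.
 * Each product is anti_log[(log(r[n]) + exponent) mod 255]. */
static void rs_syndromes(const uint8_t r[RS_N], uint8_t S[], int two_t)
{
    assert(two_t >= 1 && two_t <= 254);
    for (int k = 1; k <= two_t; k++) {
        uint8_t  s = 0;
        unsigned e = 0;              /* e = (n*k) mod 255, updated incrementally */
        for (int n = 0; n < RS_N; n++) {
            if (r[n] != 0) {
                unsigned x = gf_log[r[n]] + e;  /* 0 <= x <= 508 */
                s ^= gf_anti_log[x % 255];      /* the costly reduction */
            }
            e += (unsigned)k;
            if (e >= 255)
                e -= 255;
        }
        S[k - 1] = s;
    }
}
```

After calling `gf_init()` once, `rs_syndromes(r, S, 16)` would fill the 16 syndromes of a t = 8 code. Since both summands are at most 254, the argument of the reduction never exceeds 508.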
III. THE PROPOSED SIMPLIFIED ALGORITHM
The mod operation is achieved with the DIV (division) instruction on the x86 platform, as there is no mod instruction on that platform. According to the Intel report [3], the typical latency of the DIV instruction is 23 CPU cycles. This is far larger than that of an ADD instruction, whose latency is 0.5 CPU cycle, so it is a major performance hotspot. If we can replace the DIV instruction with ADD, SUB or other low-latency instructions, we are closer to a real-time software-radio implementation.
The conventional x mod 255 scheme is shown in Fig. 1, where anti_log is the inverse function of the logarithm function. Writing $x = 256 \cdot ah + al$, where ah and al are the high and low bytes of x, we can divide the x mod 255 operation into three cases for further analysis:
● Case I: if $x < 255$, $ah = 0$:
$$x \bmod 255 = (256 \cdot 0 + al) \bmod 255 = al \bmod 255 = al = ah + al \qquad (5)$$
● Case II: if $x = 255$, $ah = 0$, $al = 255$:
$$x \bmod 255 = 0 \neq ah + al = 255 \qquad (6)$$
● Case III: if $256 \le x < 510$, $ah = 1$, $0 \le al < 255$:
$$x \bmod 255 = (256 \cdot ah + al) \bmod 255 = (256 \bmod 255 + al \bmod 255) \bmod 255 = (1 + al) \bmod 255 = 1 + al = ah + al \qquad (7)$$
The last step of (7) holds because $x < 510$ implies $al \le 253$.
The results are listed in Table 2. From this table we can notice that in most cases (x mod 255) can be replaced with ah+al. With x=255, this relation does not hold because x mod 255 = 0 ≠ ah + al = 255. In order to avoid this exception, we consider another approach, "(x+1) mod 255". The result of this approach is shown in Table 3. From this table, we note that the column "(al+ah-1) of (x+1)" is the same as "x mod 255" for x<510.
With this relationship, we design a new flow as shown in Fig. 2. The proposed scheme contains no division operations.
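The (x+1) relationship described above can be sketched in C as follows. This is a minimal illustration under the stated assumption 0 <= x < 510, not the authors' implementation; the function names `mod255_div` and `mod255_divfree` are hypothetical. The division-free version uses only an increment, a shift, a mask, an addition and a subtraction.

```c
#include <assert.h>

/* Reference reduction using the % operator. */
static unsigned mod255_div(unsigned x)
{
    return x % 255u;
}

/* Division-free reduction, valid only for 0 <= x < 510:
 * split y = x + 1 into its high byte ah and low byte al,
 * then x mod 255 = ah + al - 1. */
static unsigned mod255_divfree(unsigned x)
{
    assert(x < 510u);          /* precondition assumed by the derivation */
    unsigned y  = x + 1u;
    unsigned ah = y >> 8;      /* high byte: 0 or 1 in this range */
    unsigned al = y & 0xFFu;   /* low byte */
    return ah + al - 1u;
}
```

For example, x = 255 gives y = 256, ah = 1, al = 0 and a result of 0, which is the case where the plain ah + al shortcut on x itself fails. Because the routine is branch-free and uses only simple integer operations, it is also the kind of code that lends itself to the SSE parallel execution mentioned in the Introduction.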
IV. SIMULATION RESULTS
The throughput of syndrome evaluation of the typical algorithm and the proposed algorithm is measured on an x86 computer with an Intel Pentium M 1.4 GHz CPU; a brief specification is shown in Table 4. The development tool is Microsoft Visual Studio 2005 Team Edition for Software Developers. The performance measuring tool is Performance Explorer within Microsoft Visual Studio 2005. The result of the evaluation is listed in Table 5. The throughput of syndrome evaluation with our proposed algorithm on the testing platform is 85470 packets per second, which is 3.06 times the throughput of the typical algorithm (27917 packets per second).
V. CONCLUSION
The Reed-Solomon code is very common in many communication systems, and software decoding of it on an x86 PC has been a problem in the past. Our proposed algorithm is division-free and thus much faster. It is very helpful for software radio related research in the future.
REFERENCES
[1] N. Kubo, S. Kondo and A. Yasuda, “Evaluation of code multipath mitigation using a software GPS receiver,” IEICE Trans. Commun., vol. E88-B, no. 11, pp. 4204-4211, November 2005.
[2] S. M. Tseng, Y. T. Hsu, M. C. Chang and H. L. Chan, “A Notebook PC Based Real-Time Software Radio DAB Receiver,” IEICE Transactions on Communications, vol. E89B, no. 12, pp. 3208-3214, Dec. 2006.
[3] R. Gerber, “The Software Optimization Cookbook,” Intel, pp. 63, 2002.
Fig. 1. The conventional scheme to compute $r_n \alpha^{nk}$ in GF(256)
Fig. 2. The proposed scheme to compute $r_n \alpha^{nk}$ in GF(256)
Table 1. Computation time of RS decoder using C programming
Function                      Computation time
Get Syndrome                  59.7%
Berlekamp-Massey algorithm    1.8%
Chien search                  37.7%
Forney algorithm              0.8%

Table 2. x mod 255 result
Case        x                x mod 255    ah+al
CASE I      0 ≤ x < 255      x            x
CASE II     255              0            255
CASE III    256 ≤ x < 510    x mod 255    x mod 255

Table 3. (x+1) mod 255 result
Case        x                x mod 255    ah+al for (x+1)    ah+al-1 for (x+1)
CASE I      0 ≤ x < 254      x            x+1                x
CASE II     254              254          255                254
CASE III    255              0            1                  0
CASE IV     256              1            2                  1
CASE V      256 < x < 510    x-255        x-254              x-255

Table 4. Platform Specification
Component    Spec
OS           Windows XP
CPU          Intel Pentium M 1.4G
Ram          256MB
Type         ASUS M2400N

Table 5. Performance comparison
Name                      Throughput
Typical algorithm         27917 packets/s
Our proposed algorithm    85470 packets/s