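National Science Council, Executive Yuan: Research Project Final Report
Subproject 2: Development of an OFDM-Based Multi-Mode Baseband Transceiver
(3/3)
Project type: Integrated project. Project number: NSC93-2220-E-009-033-. Execution period: August 1, 2004 to July 31, 2005 (ROC 93/08/01 to 94/07/31). Executing institution: Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University. Principal investigator: Chen-Yi Lee (李鎮宜). Report type: Full report. Report attachments: Report on attending an international conference, and published papers. Handling: This project is open to public query. Date: June 1, 2006 (ROC 95).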
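National Science Council, Executive Yuan: Funded Research Project Final Report
System-on-Chip Design Techniques for Software-Defined Radio Baseband Processing
Subproject 2: Development of an OFDM-Based Multi-Mode Baseband Transceiver (3/3)
Project type: Integrated project (not an individual project). Project number: NSC 91-2218-E-009-010, execution period: August 1, 2002 to July 31, 2004 (ROC 91/8/1 to 93/7/31). Project number: NSC92-2220-E-009-019-, execution period: August 1, 2003 to July 31, 2004 (ROC 92/8/1 to 93/7/31). Project number: NSC93-2220-E-009-033-, execution period: August 1, 2004 to July 31, 2005 (ROC 93/8/1 to 94/7/31). Principal investigator: Chen-Yi Lee (李鎮宜). Project participants: (none listed). Executing institution: Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University. Date: October 28, 2005 (ROC 94).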
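Abstract (translated from the Chinese)
This final report describes the third-year improvement and design of the key modules in the DVB-T baseband receiver system. The research items include the improvement and implementation of the frequency synchronization system, as well as the architecture design and chip implementation of the complete DVB-T baseband system. During system integration, the key modules developed over the three years, such as the FFT processor, Viterbi decoder, RS decoder, and de-interleaver, were fully integrated into a single chip, which passed complete functional, operating speed, and power consumption measurements.
Keywords: digital video system, orthogonal frequency division multiplexing (OFDM), frequency synchronization algorithm, system integration and implementation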
Abstract:
This final report describes the third-year progress in developing and implementing
core technologies for OFDM-based digital video broadcasting (DVB), covering not only
the DVB-T system but also the DVB-H system. The research tasks include improvement of
the frequency synchronization system and the design and implementation of a DVB-T/H
baseband receiver. We integrate the DVB-T/H baseband receiver from several previously
developed functional blocks, such as the 2k/4k/8k-point FFT processor, Viterbi decoder,
single-memory de-interleaver, and RS decoder. Finally, the measurement results of the
single-chip DVB-T/H baseband receiver are reported.
Keywords: DVB-T baseband receiver, OFDM, Frequency Synchronization Algorithm,
Part I: Low Complexity Carrier Frequency Synchronization for DVB-T/H System
A low complexity carrier frequency offset (CFO) synchronization scheme is proposed
for the Digital Video Broadcasting-Terrestrial/Handheld (DVB-T/H) system. It comprises
two acquisition strategies and a tracking loop. In the time-domain (pre-FFT) algorithm,
the proposed fractional CFO acquisition overcomes the distortion caused by multipath
delay spread and achieves a 0.25 to 7.8 dB gain in RMSE compared with the conventional
approach. In the frequency-domain (post-FFT) algorithm, a 2-stage scheme is proposed
for the integral CFO acquisition to reduce the search range. In addition, we propose
two low complexity algorithms that detect the accurate integral CFO value and save
more than 80% of the multiplications without any performance loss.
1. Carrier Frequency Offset Synchronization Scheme
The objective of CFO synchronization is to establish subcarrier orthogonality as fast and
accurately as possible (acquisition) and then maintain orthogonality as well as possible at all
times during online reception (tracking). However, a single CFO acquisition algorithm cannot
be both fast and sufficiently accurate, because
1. Pre-FFT algorithms allow fast acquisition of the fractional CFO only, and cannot
acquire the integral CFO.
2. Post-FFT algorithms allow fast acquisition of the integral CFO but, due to the lack of
orthogonality, acquisition of the fractional CFO is very complicated.
Both fast and accurate acquisition can be attained by adopting a multi-stage
synchronization strategy with two one-shot acquisition stages (one pre-FFT and the other
post-FFT) followed by tracking. In the DVB-T/H system, the data format provides training
information only in the frequency domain (continual and scattered pilots) and none in the
time domain. Hence, we adopt pre-FFT non-data-aided acquisition together with post-FFT
data-aided acquisition and tracking, as shown in Fig. 1.
[Figure: blocks are the CFO compensator, FFT, pre-FFT CFO acquisition ($\hat{\varepsilon}_F$), post-FFT CFO acquisition ($\hat{\varepsilon}_I$), and post-FFT CFO tracking ($\hat{\varepsilon}_R$); data flows from the ADC to the EQ.]
Fig. 1: Overall CFO synchronization and compensation scheme
The control loops of the three-stage synchronization subsystem operate on a
per-OFDM-symbol basis. When the CFO acquisition or tracking stage has generated a
CFO estimate, the CFO compensator calculates the effective compensation value before
the beginning of the next pre-FFT OFDM symbol, and applies the updated CFO value
when that symbol arrives.
1.1 Fractional Carrier Frequency Offset Synchronization
The conventional fractional CFO estimation uses maximum likelihood estimation
(MLE) of the differential phase between two repeated training symbols in the frequency
domain. The estimation range is limited to ±0.5 subcarrier spacing, and the estimate can be expressed as
$$\hat{\varepsilon}_F = \frac{1}{2\pi}\tan^{-1}\left[\frac{\sum_{k=-K/2}^{K/2-1}\mathrm{Im}\left(R_{1,k}^{*}\cdot R_{2,k}\right)}{\sum_{k=-K/2}^{K/2-1}\mathrm{Re}\left(R_{1,k}^{*}\cdot R_{2,k}\right)}\right] \tag{1}$$
where $R_{1,k}^{*}$ and $R_{2,k}$ are the pre-defined training symbols in the frequency domain.
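As an illustration of (1), the following Python sketch estimates the fractional CFO from two repeated frequency-domain symbols. The function name, the toy training data, and the use of the four-quadrant `atan2` in place of the plain inverse tangent are assumptions of this sketch, not part of the original design.

```python
import cmath
import math

def moose_fractional_cfo(r1, r2):
    """Estimate the fractional CFO (in subcarrier spacings) from two repeated
    frequency-domain training symbols, following Eq. (1)."""
    # Accumulate R1[k]* . R2[k] over all subcarriers.
    acc = sum(a.conjugate() * b for a, b in zip(r1, r2))
    # atan2 returns an angle in (-pi, pi], so the estimate lies within +/-0.5.
    return math.atan2(acc.imag, acc.real) / (2 * math.pi)

# Hypothetical test data: 8 unit-power subcarriers; the second symbol is the
# first one rotated by 2*pi*eps.
eps = 0.2
r1 = [cmath.exp(1j * 0.7 * k) for k in range(8)]
r2 = [x * cmath.exp(2j * math.pi * eps) for x in r1]
estimate = moose_fractional_cfo(r1, r2)
```

In this noiseless toy case the estimate equals the injected offset; offsets beyond ±0.5 subcarrier spacing would alias, which is the range limit stated above.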
In the IEEE 802.11a WLAN system, a similar idea is exploited, but with different training
patterns. The CFO is estimated with the aid of the pre-defined short and long preambles
in the time domain, which achieves a wider estimation range than Moose's [...]
scattered pilots in the DVB-T/H system. Neither of these two data-aided algorithms is
a suitable solution for our application.
From section 2.1.2, we know that the phase of the received time-domain signal is rotated
by the CFO linearly with the sample time instant $t_n$, as (2-4) shows. When two received
samples are separated by the FFT length $N$, the difference between their CFO-induced
phase errors can be expressed as
$$\theta_l(n+N)-\theta_l(n) = 2\pi\Delta f\,t_{n+N} - 2\pi\Delta f\,t_n = \frac{2\pi\varepsilon\,(lN_s+N_g+n+N)}{N} - \frac{2\pi\varepsilon\,(lN_s+N_g+n)}{N} = 2\pi\varepsilon = 2\pi(\varepsilon_I+\varepsilon_F). \tag{2}$$
Since phase rotations that are multiples of $2\pi$ can be ignored, the phase error between
$r_l(n)$ and $r_l(n+N)$ is equal to $2\pi\varepsilon_F$ and is proportional to the fractional CFO value.
This property is used in our proposed fractional CFO synchronization. In the proposed
DVB-T/H system platform, however, no useful training symbol is available in the time
domain. So if we want to exploit the phase error between $r_l(n)$ and $r_l(n+N)$, the guard
interval based algorithm is the most suitable solution.
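A short numeric check of (2) may help here. The Python sketch below evaluates both phase terms for hypothetical parameter values (2k mode with a 1/32 guard interval and a CFO of 3.2 subcarrier spacings) and wraps the difference to show that only the fractional part survives. The helper names are assumptions of this sketch.

```python
import math

def phase_difference(eps, l, n, N, Ng):
    """Phase difference theta_l(n+N) - theta_l(n) from Eq. (2), in radians."""
    Ns = N + Ng
    theta_late = 2 * math.pi * eps * (l * Ns + Ng + n + N) / N
    theta_early = 2 * math.pi * eps * (l * Ns + Ng + n) / N
    return theta_late - theta_early

def wrap_to_pi(x):
    """Wrap an angle to the interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi

# Hypothetical values: N = 2048, Ng = N/32 = 64, CFO = 3 + 0.2 spacings.
eps_i, eps_f = 3, 0.2
diff = phase_difference(eps_i + eps_f, l=5, n=10, N=2048, Ng=64)
wrapped = wrap_to_pi(diff)   # observable phase, equal to 2*pi*eps_f here
```

The unwrapped difference is $2\pi\cdot 3.2$ regardless of $l$ and $n$, while the wrapped value is $2\pi\cdot 0.2$, which is why a time-domain correlation over a lag of $N$ can only see the fractional CFO.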
To counter the influence of multipath channel spread and inter-symbol
interference (ISI), a cyclic prefix is inserted in front of each symbol. The cyclic prefix
is a copy of the last part of the symbol, and its length has to be greater than
or equal to the multipath delay spread, as shown in Fig. 2.
[Figure: a guard interval GI of length Ng, copied from the tail of the symbol of length N, is placed in front of the symbol; the channel impulse response falls within the guard interval.]
Fig. 2: Guard interval insertion and multipath channel spread
The received sample $r_l(n)$ in the guard interval and $r_l(n+N)$ in the symbol's tail are exactly identical when no distortion such as multipath delay spread or CFO exists. As
mentioned in the previous sections, the phase difference between $r_l(n)$ and $r_l(n+N)$ is
proportional to the fractional CFO value $\varepsilon_F$. We can conclude that the tail samples and their cyclic prefix are identical except for a phase rotation
of exactly $2\pi\varepsilon_F$. The fractional CFO can therefore be estimated with the MLE of the differential phase between the guard interval and the tail of the symbol, which can be
expressed as
$$x = \sum_{n=N_s-N_g}^{N_s-1} r_{l,n-N}^{*}\cdot r_{l,n} = \sum_{n=N_s-N_g}^{N_s-1} r_{l,n-N}^{*}\cdot\left(r_{l,n-N}\,e^{j2\pi\varepsilon_F}\right) = e^{j2\pi\varepsilon_F}\sum_{n=N_s-N_g}^{N_s-1}\left|r_{l,n-N}\right|^{2}$$
$$\hat{\varepsilon}_F = \frac{1}{2\pi}\arg(x) = \frac{1}{2\pi}\tan^{-1}\left[\frac{\mathrm{Im}\left(\sum_{n=N_s-N_g}^{N_s-1} r_{l,n-N}^{*}\cdot r_{l,n}\right)}{\mathrm{Re}\left(\sum_{n=N_s-N_g}^{N_s-1} r_{l,n-N}^{*}\cdot r_{l,n}\right)}\right] \tag{3}$$
Equation (3) shows that the distinguishable phase error $\arg(x)$ lies within $\pm\pi$, so the estimation range of the fractional CFO synchronization is also limited to ±0.5 subcarrier spacing. In the proposed CFO synchronization scheme, the rough estimate of the
fractional CFO is calculated from the first symbol after the symbol boundary is decided. The
estimated fractional CFO value $\hat{\varepsilon}_F$ is then sent to the CFO compensator before the data is sent to the FFT, as Fig. 1 shows.
If AWGN is the only external distortion, the accuracy of the fractional CFO
synchronization will be very good, because the correlation between the guard interval and
the tail of the symbol averages out the noise induced by AWGN. However, the DVB-T/H system is an
[...] multipath channel is necessary. As Fig. 2 shows, the delay spread of the multipath channel
directly affects the data in the front portion of the guard interval, especially when the
guard interval is relatively short (2k mode, $N_g/N = 1/32$). To reduce the effect of multipath delay spread, several samples at the beginning of the guard interval must be discarded,
and (3) can be rewritten as
$$\hat{\varepsilon}_F = \frac{1}{2\pi}\arg(x) = \frac{1}{2\pi}\tan^{-1}\left[\frac{\mathrm{Im}\left(\sum_{n=N_s-N_g+y}^{N_s-1} r_{l,n-N}^{*}\cdot r_{l,n}\right)}{\mathrm{Re}\left(\sum_{n=N_s-N_g+y}^{N_s-1} r_{l,n-N}^{*}\cdot r_{l,n}\right)}\right] \tag{4}$$
where $y$ is the number of discarded samples. However, discarding too many samples will
also degrade the averaging performance.
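The guard interval correlation of (3) and (4) can be sketched in Python as follows. This is a minimal noiseless illustration under assumed toy parameters (N = 16, Ng = 4); the function name and the test signal are hypothetical and are not taken from the implemented receiver.

```python
import cmath
import math

def gi_fractional_cfo(symbol, N, Ng, y=0):
    """Guard-interval-based fractional CFO estimate, Eq. (4).

    symbol: list of Ns = N + Ng complex samples, guard interval first.
    y: number of leading guard-interval samples to discard (y = 0 gives Eq. (3)).
    """
    Ns = N + Ng
    acc = 0j
    for n in range(Ns - Ng + y, Ns):
        # symbol[n - N] lies in the guard interval, symbol[n] in the symbol tail.
        acc += symbol[n - N].conjugate() * symbol[n]
    return cmath.phase(acc) / (2 * math.pi)

# Hypothetical toy symbol with a CFO of 2.3 subcarrier spacings.
N, Ng, eps = 16, 4, 2.3
body = [cmath.exp(1j * (0.9 * k * k % 6.0)) for k in range(N)]
tx = body[-Ng:] + body                        # cyclic prefix insertion
rx = [s * cmath.exp(2j * math.pi * eps * n / N) for n, s in enumerate(tx)]
est_all = gi_fractional_cfo(rx, N, Ng)        # Eq. (3): all Ng samples
est_trim = gi_fractional_cfo(rx, N, Ng, y=2)  # Eq. (4): first 2 samples discarded
```

Both calls recover only the fractional part (0.3) of the injected offset. In a multipath channel the two results would differ, since the discarded samples are the ones most affected by the delay spread; this sketch does not model that effect.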
1.2 Integral Carrier Frequency Offset Synchronization
From the previous section, we know that the time-domain guard interval correlation
algorithm can only deal with the phase rotation caused by the fractional CFO. The
effect of the integral CFO must be detected and corrected in the
frequency domain. Thanks to the compensation of $\hat{\varepsilon}_F$, the residual fractional CFO $\varepsilon_R$ is relatively small ($\varepsilon_R \le 0.02$) and the ICI noise can be neglected. In essence, the $k$-th transmitted subcarrier shows up at the FFT output bin with subcarrier index $k+\varepsilon_I$, as Fig. 2-3 (b) shows. The subcarrier index shift, which is equal to the integral CFO $\varepsilon_I$, must then be detected by using the pre-defined training sequence (continual and scattered pilots) or the null
subcarriers. The following sections describe and discuss several algorithms for integral
CFO synchronization.
1.2.1 Conventional Pilots Based Approach
The DVB-T/H standard defines continual and scattered pilots for synchronization and
equalization in the frequency domain. The signal power of the two kinds of pilots is at boosted
[...] continual and scattered pilots is their subcarrier index. The continual pilots are located at
fixed subcarrier indices that do not change from one OFDM symbol to the next. The scattered
pilots, however, are inserted every 12 subcarriers, and their positions shift by 3 subcarriers
in each successive symbol. In general, the continual pilots based integral CFO synchronization
algorithms are the most widely used because of their good performance in low SNR and mobile
environments. The main idea of this approach is based on MLE theory. In the first step, the
correlation between continual pilots at the same subcarrier index in two successive
symbols is calculated in the frequency domain for each shift of the pilot positions:
$$C_i = \left|\sum_{k\in P_i} R_{l-1,k}^{*}\cdot R_{l,k}\right|,\quad |i|\le m \tag{5}$$
where $C_i$ is the correlation value at the $i$-th shift location,
$P_i = [p_1+i,\ p_2+i,\ \ldots,\ p_P+i]$ are the positions of the subcarriers to be correlated in two successive symbols, and $m$ is the estimation range. The integer CFO value $\varepsilon_I$ is then estimated by detecting the offset position $i$ where the value $C_i$ is maximized:
$$\hat{\varepsilon}_I = \arg\max_i\,(C_i) \tag{6}$$
Fig. 3 shows the received signal versus subcarrier index in the frequency domain when
the integral CFO is equal to 1 subcarrier spacing. In DVB-T/H 2k mode, the positions of
continual pilots should be 0, 48, 54, 87, and so on. Accordingly, if the maximum value of $C_i$ is
obtained from subcarriers 1, 49, 55, 88, and so on, the estimated integral CFO is 1, because the
maximum correlation is found one subcarrier position away from the original
continual pilots. Because the continual pilots are transmitted at a boosted power level, the
difference between correlation values remains apparent and is not masked by strong noise even
in low SNR and deep delay spread channel conditions. The total number of multiplications
can be expressed as
$$M = (2m+1)\cdot(P\cdot 4+2) \tag{7}$$
where $M$ is the total number of multiplications and $P$ is the number of correlated pilots.
In the DVB-T/H system, $P$ is 45, 89, and 177 for 2k, 4k, and 8k mode, respectively. From (7) we
can see that if all of the continual pilots are used for estimation, the total number of
multiplications grows enormously as the search range increases. For example, if the desired
search range $m$ is 60 for 2k mode when using all continual pilots, the number of
multiplications rises to $(2\cdot 60+1)\cdot(45\cdot 4+2) = 22022$. For low power operation, such a large number of
multiplications should be avoided. The tradeoff between estimator performance and power
consumption is therefore an important issue for the integral CFO acquisition.
[Figure: subcarrier indices 47 to 49, 53 to 55, and 87 to 89 of the (j-1)-th and j-th symbols, marking continual pilots and data; of the correlations $C_{-1}$, $C_{0}$, $C_{+1}$, the maximum is $C_{+1}$, so the estimated CFO is 1.]
Fig. 3: Received signal in frequency domain when CFO=1 subcarrier space
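The search of (5) and (6) and the count of (7) can be sketched in Python as below. The 64-bin spectrum, the pilot positions, and the wrap-around indexing are assumptions made only to keep the example small; they are not DVB-T/H parameters.

```python
def cp_integer_cfo(prev_sym, cur_sym, pilots, m):
    """Return the shift i in [-m, m] that maximises C_i of Eq. (5), as in Eq. (6)."""
    n = len(cur_sym)
    best_shift, best_val = 0, -1.0
    for i in range(-m, m + 1):
        acc = 0j
        for p in pilots:
            k = (p + i) % n   # wrap-around indexing is a simplification of this sketch
            acc += prev_sym[k].conjugate() * cur_sym[k]
        if abs(acc) > best_val:
            best_shift, best_val = i, abs(acc)
    return best_shift

def cp_multiplications(m, P):
    """Multiplication count of Eq. (7)."""
    return (2 * m + 1) * (P * 4 + 2)

# Hypothetical toy spectrum: data bins flip sign between symbols on every other
# bin, so they add incoherently; pilots are boosted (amplitude 4/3) and constant.
n_bins, pilots, true_shift = 64, [3, 17, 30, 44, 59], 2
prev_sym = [1 + 0j] * n_bins
cur_sym = [complex(1 if k % 2 else -1) for k in range(n_bins)]
for p in pilots:
    k = (p + true_shift) % n_bins
    prev_sym[k] = cur_sym[k] = complex(4 / 3)
estimated = cp_integer_cfo(prev_sym, cur_sym, pilots, m=5)
count_2k = cp_multiplications(60, 45)   # the m = 60, 2k-mode example from the text
```

The inner loop runs once per pilot and per candidate shift, which is where the product of search range and pilot count in (7) comes from.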
Besides the continual pilots based approach, another algorithm based on both continual
and scattered pilots (CP+SP) was also proposed. This algorithm calculates the correlation
between the 4 possible CP+SP patterns and the shifted received symbol in the frequency
domain. By detecting the peak of the correlation result among the 4 CP+SP patterns,
the integral CFO and the scattered pilot mode can be estimated at the same time, which can be
expressed as
$$\hat{\varepsilon}_I = \arg\max_i\left|\sum_{k=1}^{P'} R_{l,z_k+i}\cdot Y_{z,k}^{*}\right|,\quad z\in[0,1,2,3] \tag{8}$$
where $P'$ is the total number of CP+SP, $Y_{z,k}$ is the type-$z$ CP+SP sequence, and $z$ is the
subcarrier index pattern of the 4 possible types of CP+SP. Although this approach
acquires the scattered pilot mode and the integral CFO at the same time, its computational
complexity is about 4 times that of the continual pilots based approach, which leads to higher
power consumption.
1.2.2 Conventional Guard Band Based Approach
In the DVB-T/H system, the number of subcarriers $K$ within an OFDM symbol is chosen
smaller than the symbol length $N$ so that so-called "guard bands" at the edges of the
transmission spectrum are left free. Hence all the subcarriers within the guard bands are
null subcarriers and their transmitted signal power is zero. According to the
DVB-T standard, the signal power of the useful data subcarriers is normalized to 1, and the
power of the reference pilots is 16/9. By exploiting this power difference, a guard
band power detection based algorithm for integral CFO acquisition was proposed by Kim in
1997. This algorithm uses the guard bands on both sides of the spectrum as a moving window
to search for the subcarrier index shift caused by the integral CFO. The main idea is that
when no useful signal component (data or pilot subcarriers) lies within the moving window,
the total power within the window consists only of noise. So
when the power in the moving window reaches its minimum, the shift of the window is
equal to the shift of the signal spectrum due to the integral CFO, which can be expressed as
$$\hat{\varepsilon}_I = \arg\min_i\left\{\sum_{k=K_{\min}-w}^{K_{\min}-1}\left|R_{l,k+i}\right|^{2} + \sum_{k=K_{\max}+1}^{K_{\max}+w}\left|R_{l,k+i}\right|^{2}\right\},\quad |i|\le m \tag{9}$$
where $w$ is the width of the moving window at both sides of the guard band and is set as [...]
[Figure: power versus subcarrier index (0, $K_{\min}$, $K_{\max}$, N-1) for the perfect symbol and the shifted symbol, with window shifts i = -3 to 3; the minimum power occurs in the window at i = -2.]
Fig. 4: Received symbol in frequency domain when CFO is -2
Fig. 4 shows the received symbol spectrum versus subcarrier index in the frequency domain
when the integral CFO is -2 subcarrier spacings. As we can see, the minimum power
appears in the moving window where $i$ is -2, because that window does not include any data or pilot
component. The total number of multiplications $M$ required for the acquisition of the integral CFO
can be expressed as
$$M = (w+2m)\cdot 4 \tag{10}$$
From (10), we can see that the number of multiplications $M$ can be reduced effectively
by using a small moving window width $w$. However, a small $w$ may degrade the performance
of this algorithm in low SNR and deep frequency selective fading environments. So the trade-off
between $w$ and $M$ must be treated very carefully.
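A minimal Python sketch of the moving-window search in (9), together with the count in (10), is given below. The 64-bin spectrum, the band edges, and the small noise floor are hypothetical values chosen to mirror the CFO = -2 case of Fig. 4.

```python
def gb_integer_cfo(sym, k_min, k_max, w, m):
    """Return the shift i in [-m, m] that minimises the guard-band power of Eq. (9)."""
    def power(k):
        return abs(sym[k]) ** 2
    best_shift, best_power = 0, float("inf")
    for i in range(-m, m + 1):
        left = sum(power(k + i) for k in range(k_min - w, k_min))
        right = sum(power(k + i) for k in range(k_max + 1, k_max + w + 1))
        if left + right < best_power:
            best_shift, best_power = i, left + right
    return best_shift

def gb_multiplications(w, m):
    """Multiplication count of Eq. (10)."""
    return (w + 2 * m) * 4

# Hypothetical toy spectrum: useful band K_min..K_max = 16..47, shifted by -2,
# unit power inside the band and a small noise floor elsewhere.
n_bins, k_min, k_max, shift = 64, 16, 47, -2
sym = [0.01 + 0j] * n_bins
for k in range(k_min, k_max + 1):
    sym[k + shift] = 1 + 0j
detected = gb_integer_cfo(sym, k_min, k_max, w=4, m=3)
count = gb_multiplications(w=8, m=60)   # w = 8 is an assumed window width
```

The sketch recomputes each window from scratch for clarity; the count in (10) corresponds to updating the window power incrementally as it slides.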
To improve the performance of the conventional guard band power detection
based algorithm, a modified guard band power detection method was proposed. This
algorithm modifies the structure of the symbol spectrum and inserts additional null
[...] frequency selective fading. However, the modification conflicts with the DVB-T/H standard
and cannot be applied to our system platform.
1.3. Proposed 2-stage Approach
From the previous sections, we can conclude that neither the continual pilots based
algorithm nor the guard band power detection based algorithm achieves good performance
and low computational complexity at the same time. Besides, the number of multiplications of
all these algorithms is proportional to the search range. If the integral CFO
estimator must work in low SNR and deep frequency selective fading environments and search
a large CFO range with low computational complexity, none of these algorithms is a good
choice. To solve this problem, a 2-stage integral CFO acquisition algorithm is
proposed, as Fig. 5 shows. The objective of the first stage is to recognize whether the integral
CFO value $\varepsilon_I$ is positive or negative (i.e. whether the subcarrier shift due to the integral CFO is toward the right or the left) with a low complexity guard band based algorithm. Once
the first stage has found the direction of the subcarrier shift, both the search range and the
number of multiplications are halved. In the second stage, the
accurate integral CFO value $\varepsilon_I$ is acquired along the direction estimated by the first stage with the proposed continual pilots based algorithm or guard band based algorithm. The
details of the proposed 2-stage approach are described in the following sections.
[Figure: Stage 1 recognizes whether $\varepsilon_I > 0$ or $\varepsilon_I < 0$; Stage 2 then finds the accurate $\varepsilon_I$ toward the right ($\varepsilon_I > 0$) or toward the left ($\varepsilon_I < 0$) and acquires $\varepsilon_I$.]
Fig. 5: Proposed 2-stage integral CFO acquisition
1.3.1. The First Stage of the Proposed Approach
The main task of this stage is to find quickly and efficiently whether the integral CFO value $\varepsilon_I$ is positive or negative. For this purpose we use a left window and a right window, each composed of $w_1$
guard band null subcarriers and $w_1$ data subcarriers at the boundary between guard band and
data. In the first step, the signal power of two successive OFDM
symbols is summed separately over the left and the right window. When the
integral CFO value $\varepsilon_I$ is not equal to zero, the distribution of guard band and data subcarriers within the left and right windows becomes imbalanced. So in the second step, we compare the
two window powers to decide whether the integral CFO value $\varepsilon_I$ is positive or negative, which can be expressed as
$$L = \sum_{k\in left}\left[\left|R_{l-1,k}\right|^{2}+\left|R_{l,k}\right|^{2}\right],\qquad R = \sum_{k\in right}\left[\left|R_{l-1,k}\right|^{2}+\left|R_{l,k}\right|^{2}\right]$$
$$L>R \Rightarrow \varepsilon_I\le 0 \quad\text{or}\quad L<R \Rightarrow \varepsilon_I\ge 0 \tag{11}$$
where $left = [K_{\min}-w_1,\ K_{\min}-w_1+1,\ \ldots,\ K_{\min}-1,\ K_{\min},\ K_{\min}+1,\ \ldots,\ K_{\min}+w_1-1]$,
$right = [K_{\max}-w_1+1,\ K_{\max}-w_1+2,\ \ldots,\ K_{\max}-1,\ K_{\max},\ K_{\max}+1,\ \ldots,\ K_{\max}+w_1]$, and $w_1$ is the window
width.
[Figure: for symbols j-1 and j, the left window (L) spans $w_1$ guard band (GB) and $w_1$ data subcarriers around $K_{\min}$, and the right window (R) spans $w_1$ data and $w_1$ GB subcarriers around $K_{\max}$; the comparison gives L < R: positive, L > R: negative.]
Fig. 6: The first stage of the proposed integral CFO estimator
When the integral CFO value is positive, the received subcarriers shift toward the right, and
the left window contains more guard band subcarriers than data subcarriers. In the right
window, conversely, there are fewer guard band subcarriers than data subcarriers. The resulting
power difference between the
left and right windows tells us whether the integral CFO value $\varepsilon_I$ is positive or negative. The total number of multiplications of the first stage can be expressed
as
$$M = 8\cdot w_1 \tag{12}$$
From (12) we can see that the number of multiplications of the first stage is not
affected by the estimation range $m$, and that low complexity can be achieved by
choosing a smaller window width. However, too small a window width degrades the
performance of the first stage. The optimal window width will be shown by simulation results
in chapter 3.
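The window comparison of (11) can be sketched in Python as follows. The 64-bin spectrum, the band edges 16 and 47, and the noiseless unit-power symbols are assumptions of this sketch; the tie case (L = R) is reported separately here, whereas (11) leaves it to either branch.

```python
def cfo_sign_stage1(prev_sym, cur_sym, k_min, k_max, w1):
    """Return +1 if L < R (positive CFO), -1 if L > R, and 0 if L == R."""
    def power2(k):
        # Power summed over two successive symbols, as in Eq. (11).
        return abs(prev_sym[k]) ** 2 + abs(cur_sym[k]) ** 2
    left = sum(power2(k) for k in range(k_min - w1, k_min + w1))
    right = sum(power2(k) for k in range(k_max - w1 + 1, k_max + w1 + 1))
    if left < right:
        return 1
    if left > right:
        return -1
    return 0

def make_symbol(n_bins, k_min, k_max, shift):
    """Hypothetical noiseless spectrum: unit power in the shifted useful band."""
    sym = [0j] * n_bins
    for k in range(k_min, k_max + 1):
        sym[k + shift] = 1 + 0j
    return sym

pos = make_symbol(64, 16, 47, shift=3)
neg = make_symbol(64, 16, 47, shift=-3)
zero = make_symbol(64, 16, 47, shift=0)
sign_pos = cfo_sign_stage1(pos, pos, 16, 47, w1=4)
sign_neg = cfo_sign_stage1(neg, neg, 16, 47, w1=4)
sign_zero = cfo_sign_stage1(zero, zero, 16, 47, w1=4)
```

Note that the cost of this stage depends only on $w_1$, not on the search range $m$, which matches (12).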
1.3.2. The Second Stage of the Proposed Approach
With the aid of the first stage, the search range of the second stage can be reduced from
$\pm m$ to $m$. However, the result of the first stage may be incorrect when the integral CFO value $\varepsilon_I$ is smaller than the window width $w_1$ in a deep frequency selective fading channel
environment. To prevent estimation errors, the search range should be extended from
$m$ to $m+w_1$; that is, $w_1$ more points are added to the search range in the
reverse direction to ensure a correct acquisition result when the integral CFO value $\varepsilon_I$ is near zero in a deep frequency selective fading channel.
Once the search range of the second stage is decided, various algorithms can still
be applied to acquire the accurate integral CFO value $\varepsilon_I$. The trade-off between estimator performance and computational complexity, however, still exists among the
previously mentioned algorithms. Considering acceptable acquisition performance and efficient
computation load, a reduced continual pilot based algorithm and a guard band power
detection based algorithm are proposed and illustrated in later sections.
(a) Proposed Reduced Continual Pilot Based Approach
From (7), we can see that the number of multiplications of the conventional continual
pilot based approach is proportional not only to the search range $m$ but also to the number of
continual pilots used, $P$. To achieve an efficient computational load, the number of
continual pilots used should be reduced together with the search range. Hence a
reduced continual pilot based approach is proposed. The proposed
reduced continual pilot based algorithm for the second stage integral CFO acquisition is
similar to the conventional continual pilot based one, but it exploits only a
subset of the continual pilots instead of all of them to reduce the number of multiplications. It
can be expressed as
$$\hat{\varepsilon}_I = \arg\max_i\left|\sum_{k\in P_{r,i}} R_{l-1,k}^{*}\cdot R_{l,k}\right| \tag{13}$$
where $P_{r,i}$ is the set of shifted subcarrier indices of the selected continual pilots, with $-m\le i\le w_1$
when the first stage estimates a negative value and $-w_1\le i\le m$ when it estimates a positive value.
The number of multiplications for the proposed reduced continual pilot approach
can be expressed as
$$M = (m+w_1+1)\cdot(P_r\cdot 4+2) \tag{14}$$
where $P_r$ is the total number of correlated continual pilots. Because the power
difference between pilot and data subcarriers is very significant, it is not necessary to use all of
the continual pilots; the acquisition performance remains acceptable at a lower
computational load. As (14) shows, the number of multiplications can be reduced effectively.
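The second-stage search of (13) and the count of (14) might be sketched in Python as below. The pilot subset, the 64-bin toy spectrum, the wrap-around indexing, and the parameter values used for the count ($w_1 = 5$, $P_r = 8$) are all hypothetical; the report does not state which pilots or values were selected.

```python
def stage2_search_range(sign, m, w1):
    """Candidate shifts: [-w1, m] after a positive first stage, [-m, w1] otherwise."""
    return range(-w1, m + 1) if sign > 0 else range(-m, w1 + 1)

def reduced_cp_integer_cfo(prev_sym, cur_sym, selected_pilots, sign, m, w1):
    """Return the shift that maximises the reduced-pilot correlation of Eq. (13)."""
    n = len(cur_sym)
    def metric(i):
        acc = sum(prev_sym[(p + i) % n].conjugate() * cur_sym[(p + i) % n]
                  for p in selected_pilots)   # wrap-around is a simplification
        return abs(acc)
    return max(stage2_search_range(sign, m, w1), key=metric)

def reduced_cp_multiplications(m, w1, Pr):
    """Multiplication count of Eq. (14)."""
    return (m + w1 + 1) * (Pr * 4 + 2)

# Hypothetical toy spectrum: five boosted pilots shifted by +2, data bins that
# flip sign between symbols on every other bin; only three pilots are correlated.
n_bins, all_pilots, true_shift = 64, [3, 17, 30, 44, 59], 2
prev_sym = [1 + 0j] * n_bins
cur_sym = [complex(1 if k % 2 else -1) for k in range(n_bins)]
for p in all_pilots:
    k = (p + true_shift) % n_bins
    prev_sym[k] = cur_sym[k] = complex(4 / 3)
estimated = reduced_cp_integer_cfo(prev_sym, cur_sym, [3, 30, 59], sign=1, m=5, w1=2)
reduced_count = reduced_cp_multiplications(m=60, w1=5, Pr=8)   # assumed w1 and Pr
```

With these assumed values the count is 2244, compared with 22022 for the conventional approach at $m = 60$ in 2k mode; the actual saving depends on the chosen $w_1$ and $P_r$.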
(b) Proposed Guard Band Power Detection Based Approach
As mentioned in the previous sections, the conventional guard band power detection based
algorithm requires fewer multiplications but performs worse in low
[...] symbol. To keep the advantage of lower computational complexity while improving
the performance in critical channel conditions, we propose a new guard band power detection
based algorithm. With the aid of the proposed first stage, the search range of the second stage
can be reduced effectively, and more OFDM symbols can be used to improve the
acquisition performance. The proposed guard band power detection based algorithm thus
keeps the moving window scheme and sums the signal power over three
successive OFDM symbols, which can be expressed as
$$\hat{\varepsilon}_I = \arg\min_i\left\{\sum_{k=K_{\min}-w_2}^{K_{\min}-1}\left[\left|R_{l,k+i}\right|^{2}+\left|R_{l-1,k+i}\right|^{2}+\left|R_{l-2,k+i}\right|^{2}\right] + \sum_{k=K_{\max}+1}^{K_{\max}+w_2}\left[\left|R_{l,k+i}\right|^{2}+\left|R_{l-1,k+i}\right|^{2}+\left|R_{l-2,k+i}\right|^{2}\right]\right\} \tag{15}$$
where $w_2$ is the width of the moving window at both sides of the guard band, with
$-m\le i\le w_1$ when the first stage estimates a negative value and $-w_1\le i\le m$ when it estimates a positive value. As Fig. 7 shows, summing over three
successive OFDM symbols effectively reduces the distortion induced by noise in severe
environments. The number of multiplications can be expressed as
$$M = (w_2+m+w_1)\cdot 12 \tag{16}$$
Comparing (16) with (10), we can see that the proposed guard band power detection based
approach requires about 1.5 times as many multiplications as the
conventional approach. However, the acquisition performance is improved significantly.
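[Figure: a moving window of width $w_2$ over the guard bands (GB) on both sides of the data band ($K_{\min}$ to $K_{\max}$), applied to symbols j-2, j-1, and j.]
Fig. 7: Moving window over three successive OFDM symbols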
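A minimal Python sketch of the three-symbol moving-window search in (15) and the count in (16) follows. The toy spectrum, the noise floor, and the window widths are hypothetical values, and the three symbols are identical here only to keep the example deterministic.

```python
def gb3_integer_cfo(symbols, k_min, k_max, w2, sign, m, w1):
    """Return the shift minimising the three-symbol guard-band power of Eq. (15).

    symbols: the three most recent frequency-domain symbols (l-2, l-1, l).
    sign: direction reported by the first stage (+1 or -1).
    """
    def power3(k):
        return sum(abs(s[k]) ** 2 for s in symbols)
    candidates = range(-w1, m + 1) if sign > 0 else range(-m, w1 + 1)
    def window_power(i):
        left = sum(power3(k + i) for k in range(k_min - w2, k_min))
        right = sum(power3(k + i) for k in range(k_max + 1, k_max + w2 + 1))
        return left + right
    return min(candidates, key=window_power)

def gb3_multiplications(w2, m, w1):
    """Multiplication count of Eq. (16)."""
    return (w2 + m + w1) * 12

# Hypothetical toy spectrum: useful band 16..47 shifted by -2, small noise floor.
n_bins, k_min, k_max, shift = 64, 16, 47, -2
sym = [0.01 + 0j] * n_bins
for k in range(k_min, k_max + 1):
    sym[k + shift] = 1 + 0j
detected = gb3_integer_cfo([sym, sym, sym], k_min, k_max, w2=4, sign=-1, m=3, w1=1)
count = gb3_multiplications(w2=8, m=60, w1=4)   # assumed widths w2 = 8, w1 = 4
```

For large $m$ the ratio of (16) to (10) approaches $12m / 8m = 1.5$, which is consistent with the "about 1.5 times" stated in the text.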
1.4. Residual Carrier Frequency Offset Synchronization
After the acquisition stage has estimated the integral CFO and most of the fractional CFO,
the residual CFO is usually less than 1 to 2 percent of the subcarrier spacing. However,
the time-domain phase error induced by even such a small CFO still affects the system
performance during long reception. As Fig. 2.2 shows, the accumulated phase
error for a residual CFO of 0.01 exceeds $\pi$ once more than 10,000 samples have been received. Besides, the Doppler effect in a mobile environment also introduces a small
drift in the CFO. Therefore tracking of the residual CFO is necessary and has to operate
continuously until reception is turned off.
Generally speaking, the residual CFO value $\varepsilon_R$ is very small, so fractional CFO estimation alone is sufficient. In particular, the estimate of the residual CFO at the
tracking stage must be precise and have low variation. Therefore, in our DVB-T/H system
platform, the tracking stage of the CFO is divided into two parts. The first part estimates the
residual CFO value symbol by symbol, and it is followed by a PI (proportional-integral) loop filter that
reduces the variation. The tracking loop of the CFO synchronization is shown in Fig. 8.
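[Figure: tracking loop. Blocks are the CFO compensator, FFT, one-shot residual CFO estimator, and PI loop filter; data flows from the ADC to the EQ, and the compensation value combines $\hat{\varepsilon}_I + \hat{\varepsilon}_F$ with the filtered residual estimate.]
Fig. 8: The tracking loop of the CFO synchronization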
As shown in Fig. 8, $e_1$ is the residual CFO value in the first iteration of the tracking
loop. After $e_1$ has been estimated, the output of the residual CFO estimator, $\hat{e}_1$, is
post-processed by the PI loop filter to give $\tilde{e}_1$. When the second iteration starts, the CFO compensator
compensates the incoming data with the updated CFO value
$\hat{\varepsilon}_I + \hat{\varepsilon}_F + \tilde{e}_1$ and then obtains the next residual CFO error $e_2$ of the second iteration. As the CFO tracking loop works
iteratively, the residual CFO error is minimized.
1.4.1. Residual CFO Estimation
The objective of the residual CFO estimator is to estimate the residual CFO error
precisely and quickly. As mentioned in the previous section, fractional CFO synchronization alone is
sufficient for this estimator. For hardware integration and resource reuse, the
fractional CFO estimator could be reused as the residual CFO estimator. However, the
non-data-aided algorithm that exploits the guard interval is very sensitive to the inter-symbol
interference introduced by the multipath delay spread, and the estimate may not be
precise enough for residual CFO estimation in a deep delay fading environment. Only a
rough fractional CFO value can be obtained with this approach. Therefore an efficient
data-aided algorithm that employs the pre-defined continual pilots is applied in the residual
CFO estimator.
After most of the CFO has been estimated and compensated, the residual CFO is
usually less than 1 to 2 percent of the subcarrier spacing, and the ICI noise is small enough to be
neglected. As (2-3) shows, if the ICI term is disregarded, the phase error caused by the residual
CFO and SCO at the $k$-th subcarrier of the $l$-th OFDM symbol in the frequency domain can
be expressed as
$$\varphi_l(k) = \frac{2\pi\varepsilon_R\,(lN_s+N_g)(1+\zeta)}{N} + \frac{2\pi k}{N}(lN_s+N_g)\,\zeta + \phi_l(k) \tag{17}$$
where $\phi_l(k)$ is the phase of the channel frequency response $H_{l,k}$. If the channel is a slowly fading channel ($\phi_l(k)\approx\phi_{l-1}(k)$), the difference of phase rotation between two successive OFDM symbols is
$$\varphi'_l(k) = \varphi_l(k)-\varphi_{l-1}(k) = \frac{2\pi\varepsilon_R N_s}{N} + \frac{2\pi\varepsilon_R N_s\,\zeta}{N} + \frac{2\pi k N_s\,\zeta}{N} \approx \frac{2\pi\varepsilon_R N_s}{N} + \frac{2\pi k N_s\,\zeta}{N} \tag{18}$$
The second term, $2\pi\varepsilon_R N_s\zeta/N$, can be ignored since the product $\varepsilon_R\cdot\zeta$ is usually less
than $2.0\times 10^{-6}$. From (18) we can see that the residual CFO $\varepsilon_R$ causes a mean phase error
and the SCO $\zeta$ causes a linear phase offset between two consecutive OFDM symbols. If we take two adjacent continual pilots of any two consecutively received OFDM symbols,
the phase rotation is as shown in Fig. 9 [17]. The total phase rotation includes the effects of
symbol timing offset, residual CFO, and SCO. As we can see from Fig. 9, the
phase rotation induced by the symbol timing offset is identical in the two symbols and is
proportional to the subcarrier index. However, in the current symbol, the effects of residual CFO
and SCO are accumulated on top of the phase of the previous symbol, where the residual CFO
induces a mean phase and the SCO generates a linear phase. Thus, we must estimate the residual
CFO as well as the SCO by computing the phase rotation between two successive symbols.
[Figure: phase versus subcarrier index for two adjacent continual pilots in the previous symbol and the current symbol, showing the contributions of symbol timing offset, CFO, and sampling clock offset.]
Fig. 9: Phase rotation between two successive OFDM symbols
The continual pilots, which have fixed subcarrier indices, are exploited to estimate the residual CFO.
In general, the residual CFO and the SCO are estimated jointly because their effects on the phase
rotation are uncorrelated. Thus a joint residual CFO and SCO estimation algorithm is applied
as
$$\hat{\varepsilon}_R = \frac{1}{2\pi(1+N_g/N)}\cdot\frac{1}{2}\cdot\left(\varphi_{2,l}+\varphi_{1,l}\right)$$
$$\hat{\zeta} = \frac{1}{2\pi(1+N_g/N)}\cdot\frac{1}{K/2}\cdot\left(\varphi_{2,l}-\varphi_{1,l}\right)$$
$$\varphi_{1|2,l} = \arg\left[\sum_{k\in C_{1|2}} R_{l,k}\cdot R_{l-1,k}^{*}\right] \tag{19}$$
where $C_1$ denotes the set of subcarrier indices of the continual pilots located in the left
half ($k\in[0,(K-1)/2)$), and $C_2$ denotes the set of subcarrier indices of the continual pilots
located in the right half ($k\in((K-1)/2,K_{\max}]$) of the OFDM symbol spectrum. Correlating the continual pilots of two successive OFDM symbols and
accumulating the correlation results in two parts leads to the so-called CFD/SFD (carrier
frequency detector / sampling frequency detector) algorithm [18]. The sum of $\varphi_{2,l}$ and $\varphi_{1,l}$ gives the mean phase error, while the difference of $\varphi_{2,l}$ and $\varphi_{1,l}$ gives the linear phase error. As a result, the residual CFO and SCO can be estimated jointly by
multiplying by different coefficients.
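The joint estimator of (19) can be sketched in Python as follows. The helper `rotate`, which applies the symbol-to-symbol phase rotation of (18) to a flat test spectrum, and the pilot index sets are assumptions made for this illustration; the demo injects a residual CFO only, so that the expected outputs are unambiguous.

```python
import cmath
import math

def cfd_sfd(prev_sym, cur_sym, c1, c2, N, Ng, K):
    """Joint residual CFO / SCO estimate following Eq. (19).

    c1, c2: continual pilot indices in the left and right half of the spectrum.
    """
    def phase(indices):
        acc = sum(cur_sym[k] * prev_sym[k].conjugate() for k in indices)
        return cmath.phase(acc)
    phi1, phi2 = phase(c1), phase(c2)
    scale = 1.0 / (2 * math.pi * (1 + Ng / N))
    eps_r = scale * 0.5 * (phi2 + phi1)      # mean phase -> residual CFO
    zeta = scale * (phi2 - phi1) / (K / 2)   # linear phase -> SCO
    return eps_r, zeta

def rotate(prev_sym, eps_r, zeta, N, Ng):
    """Hypothetical test helper: apply the phase rotation of Eq. (18) per subcarrier."""
    Ns = N + Ng
    return [x * cmath.exp(2j * math.pi * (eps_r + k * zeta) * Ns / N)
            for k, x in enumerate(prev_sym)]

# Assumed 2k-mode-like sizes and illustrative pilot index sets.
N, Ng, K = 2048, 64, 1705
prev_sym = [1 + 0j] * K
cur_sym = rotate(prev_sym, eps_r=0.01, zeta=0.0, N=N, Ng=Ng)
eps_est, zeta_est = cfd_sfd(prev_sym, cur_sym, [0, 48, 54, 87],
                            [1101, 1107, 1110, 1137], N, Ng, K)
```

The factor $1/(K/2)$ treats the two pilot groups as being separated by half the used bandwidth on average, so the SCO estimate is exact only when the pilot sets satisfy that assumption.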
Besides the continual pilots based approach, other scattered pilots based
approaches are presented in [14] and [19]. [14] proposes a residual CFO estimator that
exploits the continual and scattered pilots of the $l$-th and the $(l-4)$-th OFDM symbols.
The equation of this approach is very similar to (19) except for the correlated symbols and
pilots. The main feature of this algorithm is that it uses more pilots to reduce the distortion caused
by AWGN and ICI noise. However, its convergence time is about 2.5 times
longer than that of the CFD/SFD algorithm because it uses the $l$-th and the $(l-4)$-th OFDM
[...] successive OFDM symbols and has an equation similar to (19). However, the subcarrier
indices of the scattered pilots of two successive OFDM symbols are not identical and
differ by 3. The estimated phase error between two scattered pilots is therefore also distorted by the
symbol timing offset. The symbol timing offset is an unknown factor and cannot be
estimated precisely by the symbol synchronizer, so the estimate from this approach is
reliable only if a precise symbol offset value is available.
1.4.2. Residual CFO Tracking Loop Filter
In order to reduce the variation of the estimated residual CFO, a PI loop filter is utilized
in our CFO synchronization design [20]. The PI loop filter is composed of two paths. The
proportional path multiplies the estimated residual CFO by a proportional factor KP. The
integral path multiplies the estimated residual CFO by an integral factor KI and then
integrates the scaled value by using an adder and a delay element. The block diagram of the
PI loop filter is shown in Fig. 2.10.
Fig. 2.10: Block diagram of PI loop filter [blocks: gains KP and KI, two adders, delay element Z^-1]
The transfer function of the PI loop filter can be represented as

$$H(z) = K_P + K_I \frac{z^{-1}}{1 - z^{-1}} \qquad (20)$$

For small loop delay and $K_I \ll K_P \ll 1$, the standard deviation of the steady-state tracking error is expressed as

$$\sigma(e') = \sqrt{K_P / 2} \cdot \sigma(e) \qquad (21)$$

where e is the estimation error of the residual CFO estimator and e' is the steady-state
tracking error. The closed-loop tracking time constant is approximately given by

$$T_{loop} \approx 1 / K_P \qquad (22)$$

So from (21) and (22) we can find that there is a tradeoff between steady-state tracking
error and tracking convergence speed. In our proposed DVB-T/H platform, the loop
parameter KP is first set to a larger value to increase the convergence speed at the beginning
of tracking, and is then switched to a smaller value to reduce the steady-state tracking error.
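The tradeoff and the gain switching can be illustrated with a small behavioural model. This is a sketch under assumptions, not the hardware design: the function `pi_loop_track`, the per-symbol gain schedule, the Gaussian estimator noise, and the accumulator that applies the filter output as a CFO correction are all hypothetical choices made for the illustration.

```python
import numpy as np

def pi_loop_track(true_cfo, kp_schedule, ki, est_noise_std=0.0, seed=0):
    """Track a constant residual CFO with a PI loop filter.

    kp_schedule : proportional gain per OFDM symbol (allows gain switching)
    ki          : integral gain
    Returns the CFO that is still uncorrected after each symbol.
    """
    rng = np.random.default_rng(seed)
    integ = 0.0        # integral path state (adder + delay element)
    correction = 0.0   # CFO correction accumulated so far (assumed NCO model)
    remaining = np.empty(len(kp_schedule))
    for n, kp in enumerate(kp_schedule):
        # noisy residual CFO estimate seen by the loop filter
        err = (true_cfo - correction) + est_noise_std * rng.standard_normal()
        out = kp * err + integ   # H(z) = KP + KI * z^-1 / (1 - z^-1)
        integ += ki * err
        correction += out
        remaining[n] = true_cfo - correction
    return remaining

# Gain switching: large KP for fast acquisition, small KP for low jitter.
schedule = [0.5] * 20 + [0.05] * 500
trace = pi_loop_track(0.02, schedule, ki=1e-4, est_noise_std=0.01)
```

With a constant gain, the uncorrected CFO shrinks by roughly a factor (1 − KP) per symbol, consistent with (22), while the residual jitter grows with KP as in (21).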
Part II: A Single Chip DVB-T/H baseband receiver design
A DVB-T/H baseband receiver with a 2k/4k/8k-point FFT, complete synchronization,
a channel equalizer, and a channel decoder is implemented from previously developed designs:
a 2K/4K/8K FFT processor, a path-merging Viterbi decoder, a single-memory de-interleaver, and
an RS decoder. With multi-step CFO compensation and a 2D linear channel equalizer, this
baseband receiver tolerates a 70 Hz Doppler frequency in 2k mode. The chip, with
154 Kbytes of embedded single-port SRAM, consumes only 250 mW at the highest data rate of 31.67 Mb/s.
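As a cross-check of the 31.67 Mb/s figure, the useful bit rate can be recomputed from the DVB-T frame parameters. The parameter values below (8 MHz channel, 6048 data carriers and 896 µs useful symbol time in 8k mode, 64-QAM, code rate 7/8, guard interval 1/32, RS(204,188)) are taken from the DVB-T standard [1] rather than from this report, and the helper name `dvbt_useful_bitrate` is hypothetical.

```python
from fractions import Fraction

def dvbt_useful_bitrate(data_carriers, tu_us, bits_per_carrier, code_rate, guard):
    """Useful (MPEG-2 TS) bit rate in Mb/s for one DVB-T configuration."""
    ts_us = Fraction(tu_us) * (1 + guard)                 # symbol time incl. guard
    bits_per_symbol = (data_carriers * bits_per_carrier   # coded bits per symbol
                       * code_rate                        # inner (convolutional) code
                       * Fraction(188, 204))              # outer RS(204,188) code
    return float(bits_per_symbol / ts_us)                 # bits per microsecond = Mb/s

rate_8k = dvbt_useful_bitrate(6048, 896, 6, Fraction(7, 8), Fraction(1, 32))
print(f"{rate_8k:.2f} Mb/s")   # 31.67 Mb/s
```

The 2k mode (1512 data carriers, 224 µs) gives the same rate, since both the carrier count and the symbol time scale by four.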
I. Introduction
Conventional DVB-T receiver designs are partial-function designs
[8][10][12][13], designs that do not cover the full baseband [11][14], or multi-chip
designs [3][4][5][7][9]. There is no single-chip DVB-T and DVB-H baseband receiver design
except Fechtel's design [6], and that approach is designed for a simpler channel environment.
These designs are optimized for individual functional blocks or for part of the system. In this paper,
we present a complete DVB-T/H baseband receiver that includes two synchronization
systems, an FFT core, equalization, QAM demodulation, and FEC decoders. The remainder of this paper
is organized as follows. Section II introduces the proposed system architecture. Section III
describes the detailed architecture of the functional blocks. Section IV shows the simulation results,
estimation results, and chip photo. Section V gives the
conclusion and future work.
Existing DVB-T receivers are partial-function designs [8][10][12][13] or multi-chip
designs [3][4][5][7][9]; processor/DSP-based approaches [LSI
Logic, L64782] have also been proposed in the past few years. The DVB-H system is based on the DVB-T
system and modified for handheld applications.
II. System Architecture
In this report, we introduce one single chip DVB-T/H [1][2] baseband receiver, which
demodulation, inner de-interleaver, Viterbi decoder, outer de-interleaver, and RS decoder. The
system block diagram is shown in Fig. 1.
Fig. 1: Block diagram of DVB-T/H baseband receiver
We separate the DVB-T/H baseband receiver into two main
sections: the inner receiver (synchronization/demodulation) and the outer receiver (FEC). The
inner receiver has two synchronization subsystems: a timing synchronization system and a
frequency synchronization system. Each of the two subsystems is split into pre-FFT
and post-FFT synchronization. The timing synchronization system has
two targets: detecting the OFDM symbol boundary and estimating the sampling
clock offset (SCO). The frequency synchronization system reduces the carrier
frequency offset (CFO) through three estimation processes: fractional CFO
estimation, integral CFO estimation, and residual CFO tracking.
The channel estimator uses two 1-D linear interpolators [6] to cope with
mobile environments, and the channel equalizer uses the zero-forcing method. The
structure of the channel estimator and channel equalizer is shown in Fig. 3.
[Fig. 1 block labels: Rx RF; ADC; Interpolator/Decimation; Coarse Symbol Synchronization; GI/Mode decision; Pre-FFT AFC; Carrier Frequency Compensator; FFT Window; FFT Core; SP Mode Detection; Post-FFT AFC; Fine Symbol Synchronization; Carrier Frequency Tracking; Sampling Frequency Tracking; Equalizer; TPS decode & Pilot Remove; QAM Demodulation; Inner Deinterleaver; Inner Decoder; Outer Deinterleaver; Outer Decoder; De-scrambler]
At the end of the inner receiver there are two operations: QAM demodulation and inner
de-interleaving. Soft-decision QAM demodulation is used to improve the performance gain
of the inner decoder (Viterbi decoder). In inner de-interleaving, we move the symbol
de-interleaving operation ahead of QAM demodulation, which reduces the data overhead
when the word length of the soft-decision QAM demodulation result is longer than that of
the channel equalization output.
A 64-state structure with 64 ACS units and the path-merging method is implemented in the Viterbi decoder,
which reduces the feedback timing and the memory access time. For the outer de-interleaver, we
propose an address generator with a universal memory structure, which improves the memory
efficiency and minimizes the required memory size. The RS (204, 188) decoder is assembled
from several modules: Syndrome Calculator, Key Equation Solver, Chien Search, Error Value
Evaluator, and Error Corrector. At the output of the RS decoder, the last stage of the receiver, the
Descrambler descrambles the data stream, except for the synchronization words of the
original data stream.
III. Functional block architecture
The proposed DVB-T/H receiver is organized into two main regions: the inner receiver and the outer
receiver. The inner receiver performs the synchronization process, FFT operation, channel
equalization, QAM demodulation, and inner de-interleaving. The block diagram of the inner
receiver is shown in Fig. 2. The outer receiver contains the inner decoder (Viterbi decoder), outer
de-interleaver, outer decoder (RS decoder), and Descrambler. The block diagram of the outer
receiver is shown in Fig. 4. We introduce the details of the inner receiver and the outer receiver in
the following two sub-sections.
1. Functional blocks of inner receiver
The proposed inner receiver is the first block to operate after the system reset signal is triggered. When
received data arrive, the timing synchronization system detects the key frame
parameters: the operation mode and the Guard Interval Ratio (1/4, 1/8, 1/16, 1/32). These two parameters are the main operating
parameters of the inner receiver; without the operation mode and the guard interval ratio, the
following functional blocks cannot work correctly. After the operation mode and guard
interval ratio are detected, the coarse symbol boundary is decided by the normalized maximum correlation
method. At the same time, the fractional part of the CFO is estimated from the phase
rotation between the guard interval data and the corresponding part of the symbol data.
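The guard-interval correlation described above can be sketched as follows. This is a minimal illustration under assumptions, not the chip's algorithm: the function name `cp_sync`, the exhaustive search over one symbol period, and the particular energy normalization are choices made here for clarity.

```python
import numpy as np

def cp_sync(r, n_fft, n_guard):
    """Coarse symbol start and fractional CFO from guard-interval correlation.

    r : received complex baseband samples (at least two symbol periods)
    Returns (start, frac_cfo), frac_cfo in units of the subcarrier spacing.
    """
    r = np.asarray(r, dtype=complex)
    n_search = n_fft + n_guard
    if r.size < 2 * n_search:
        raise ValueError("need at least two OFDM symbol periods of samples")
    best_metric, best_d, best_p = -1.0, 0, 0j
    for d in range(n_search):
        head = r[d:d + n_guard]                    # candidate guard interval
        tail = r[d + n_fft:d + n_fft + n_guard]    # samples one FFT length later
        p = np.sum(np.conj(head) * tail)
        energy = 0.5 * np.sum(np.abs(head) ** 2 + np.abs(tail) ** 2)
        metric = np.abs(p) / (energy + 1e-12)      # normalized correlation
        if metric > best_metric:
            best_metric, best_d, best_p = metric, d, p
    # The guard interval repeats n_fft samples later, so the correlation phase
    # equals 2*pi times the fractional CFO.
    frac_cfo = np.angle(best_p) / (2.0 * np.pi)
    return best_d, frac_cfo
```

Because the phase is only known modulo 2π, this estimate covers offsets within ±0.5 subcarrier spacing; the integral part has to be found after the FFT.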
Fig. 2: Inner receiver architecture for DVB-T/H system
The FFT core performs the fast Fourier transform once a complete OFDM symbol has been received.
Because the FFT core needs almost 3 times as many clock cycles as there are samples
in one symbol, it operates at 4 times the sampling clock rate. The FFT
core is based on a radix-8 butterfly unit with a 64-point pre-fetch buffer and uses dynamic
scaling to reduce the output word-length overhead.
[Fig. 2 block labels: Data input; Time Sync; CFO fractional Estimation; CFO Compensation; FFT memory; FFT Core; CFO integral Estimation; CFO Tracking; Symbol memory; Channel Estimator; CE memory; Channel Equalizer; TPS decoder; Symbol De-Interleaver; QAM demapper; Bitwise memory; P/S 4x→6x; Control unit; Viterbi input; numeric labels 8, 6, 36, 24]

The scattered pilot (SP) order detection and the post-FFT CFO estimation start once the FFT
outputs data. These two functions take 3 OFDM symbols to detect the SP order and the integral
part of the CFO. Once the integral part of the CFO is ready, the CFO compensator compensates the FFT input data to reduce the ICI
effect. After the SP order is detected, the channel estimator
extracts the SP information from the OFDM symbol. To obtain a correct and acceptable channel
frequency response (CFR), the channel estimator queues four OFDM symbols to obtain
two-dimensional CFR information. After the channel estimator has collected sufficient CFR information,
the channel equalizer equalizes the data and outputs them to the symbol de-interleaver memory. The
detailed structure of the channel estimator and channel equalizer is shown in Fig. 3.
Fig. 3: Architecture of 2D linear channel equalizer
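The combination of two 1-D linear interpolations and zero-forcing equalization can be sketched as below. This is an illustration under assumptions rather than the circuit of Fig. 3: it uses the scattered-pilot pattern of the DVB-T standard [1] (start index 3·(l mod 4), step 12), ignores continual pilots, and needs symbols t−3 to t+3, which differs from the four-symbol queue of the actual design. The function names are hypothetical.

```python
import numpy as np

def sp_positions(sym_idx, n_carriers):
    """Scattered-pilot subcarrier indices of one OFDM symbol (DVB-T pattern)."""
    return np.arange(3 * (sym_idx % 4), n_carriers, 12)

def estimate_cfr(rx, pilots, t):
    """Channel frequency response of symbol t by 2 x 1-D linear interpolation.

    rx     : (n_sym, n_carriers) FFT outputs
    pilots : known transmitted pilot value for every carrier index
    """
    n_sym, n_carriers = rx.shape
    if t - 3 < 0 or t + 3 >= n_sym:
        raise ValueError("symbols t-3 .. t+3 are required")
    known_k, known_h = [], []
    for back in range(4):
        s0 = t - back                       # last symbol carrying this pilot phase
        k = sp_positions(s0, n_carriers)
        h = rx[s0, k] / pilots[k]
        if back:
            h_next = rx[s0 + 4, k] / pilots[k]
            w = back / 4.0                  # time-direction linear interpolation
            h = (1.0 - w) * h + w * h_next
        known_k.append(k)
        known_h.append(h)
    k_all = np.concatenate(known_k)
    h_all = np.concatenate(known_h)
    order = np.argsort(k_all)
    k_all, h_all = k_all[order], h_all[order]
    carriers = np.arange(n_carriers)
    # frequency-direction linear interpolation (real and imaginary parts)
    return (np.interp(carriers, k_all, h_all.real)
            + 1j * np.interp(carriers, k_all, h_all.imag))

def zero_forcing(y, h):
    """Zero-forcing equalization: divide each carrier by its CFR estimate."""
    return y / h
```

After the time-direction step the CFR is known on every third carrier, so the frequency-direction step only has to bridge gaps of two carriers.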
Before the symbol de-interleaving operation, we have to decode the Transmission Parameter
Signalling (TPS), which includes the operation mode, QAM modulation method, guard interval
ratio, symbol interleaving order, and coding rate of the inner decoder. Without the decoded TPS
information, the inner de-interleaver, QAM receiver, and outer receiver cannot operate.
The complete TPS information is embedded in one OFDM frame (68 OFDM symbols) of the
DVB-T/H frame structure, so we have to wait at least one OFDM frame to collect the
complete TPS code and decode it by DBPSK demodulation. When the TPS
information is ready, symbol de-interleaving, QAM demodulation, and bitwise
de-interleaving start working. For each bitwise interleaving section, the symbol de-interleaver
outputs 126 de-interleaved data to the 64-level soft-decision QAM receiver. The QAM receiver
writes the demodulated data into the bitwise de-interleaving memory. After every 126
demodulated data have been written into the bitwise de-interleaving memory, the symbol de-interleaver and the QAM
receiver hold until the bitwise de-interleaver has transferred one complete section of data.
[Fig. 3 block labels: Serial In; Pilots storage; three RAMs of size 6817; two RAMs of size 3424; one RAM of size 2288; weights 1x, 2x, 3x and 1x, 2x; adders; DIV; Serial Out]
2. Functional blocks of outer receiver
The detailed block diagram of the outer receiver is shown in Fig. 4. After bitwise
de-interleaving, the Viterbi decoder receives the bitwise de-interleaver output and decodes the
punctured convolutional code. The outer de-interleaver is a universal memory structure with
a dedicated address generator. The required memory space of the universal memory structure
can be reduced to the minimum size, which depends on the RS(204,188) decoding length. The
RS(204,188) decoder has 5 steps: Syndrome Calculator, Key Equation Solver, Chien
Search, Error Value Evaluator, and Error Corrector. At the output of the RS decoder, the
Descrambler descrambles the data stream except for the synchronization words.
Fig. 4: FEC architecture for DVB-T/H system
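The descrambling step, which leaves the synchronization words untouched, can be sketched as follows. The PRBS polynomial (1 + x^14 + x^15), the initialization sequence 100101010000000, the eight-packet grouping, and the inverted sync byte 0xB8 are taken from the DVB-T standard [1], not from this report; the function names are hypothetical and the sketch ignores hardware aspects such as streaming operation.

```python
SYNC, SYNC_INV, PKT = 0x47, 0xB8, 188
GROUP = 8 * PKT   # the PRBS is re-initialized every eight transport packets

def prbs_bytes(count):
    """Energy-dispersal PRBS bytes, polynomial 1 + x^14 + x^15."""
    reg = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    out = []
    for _ in range(count):
        byte = 0
        for _ in range(8):
            bit = reg[13] ^ reg[14]      # taps at stages 14 and 15
            reg = [bit] + reg[:14]       # shift, feed the result back
            byte = (byte << 1) | bit     # MSB first
        out.append(byte)
    return out

def _xor_group(data, start, seq):
    # PRBS byte i-1 belongs to byte i of the group; sync bytes are skipped,
    # but the PRBS keeps running underneath them.
    for i in range(1, GROUP):
        if i % PKT:
            data[start + i] ^= seq[i - 1]

def scramble(stream):
    """Transmitter side: randomize payload, invert first sync of each group."""
    data, seq = bytearray(stream), prbs_bytes(GROUP - 1)
    for start in range(0, len(data) - GROUP + 1, GROUP):
        if data[start] != SYNC:
            raise ValueError("expected sync byte 0x47 at group start")
        data[start] = SYNC_INV
        _xor_group(data, start, seq)
    return bytes(data)

def descramble(stream):
    """Receiver side: restore the payload and the inverted sync byte."""
    data, seq = bytearray(stream), prbs_bytes(GROUP - 1)
    for start in range(0, len(data) - GROUP + 1, GROUP):
        if data[start] != SYNC_INV:
            raise ValueError("expected inverted sync byte 0xB8 at group start")
        data[start] = SYNC
        _xor_group(data, start, seq)
    return bytes(data)
```

Since the randomization is a plain XOR with a fixed sequence, the receiver applies exactly the same sequence as the transmitter; only the handling of the first sync byte differs.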
IV. Simulation and implementation
[Fig. 4 block labels: Bitwise deinterleaving output; branch metric; 64 ACS; 64 path metric; survivor memory; Universal Outer DeInterleaver Memory; Outer DeInterleaver address generator; Syndrome Calculator; Key Equation Solver; Chien Search; Error Value Evaluator; Error Corrector; Descrambler; MPEG-2 Stream]

The chip is implemented in a 0.18 um CMOS process with a die size of 6.9 x 5.8 mm2, including I/O pads, in a 208-pin CQFP package. The simulated post-layout power
consumption is 250 mW at 31.67 Mbps, the maximum data rate of the DVB-T/H system. The
chip measurement and testing procedure have been completed. Fig. 5 shows the power profile from
simulation and measurement. The comparison with existing designs is shown in
Table 1, which highlights the details and advantages of the proposed design.
The chip photo and the supported system specification are shown in Fig. 6.
Fig. 5: Power consumption profile [bar chart; axis 0 to 250; series: Sync. stage (simulated), Sync. stage (measurement), receiving-out stage (simulated), receiving-out stage (measurement); categories: Synchronization, FFT core, Memory access, FECs, others, overall]
Table 1: Comparison with existing designs
| Item | Proposed | Jheng's [14] | Hosemann's [11] | LSI L64782 [15] |
| Technology | 0.18 um CMOS ASIC | 0.18 um CMOS ASIC | 0.13 um CMOS AS-DSP | - |
| Input clock | 109.71 MHz | 54.86 MHz *1 | 250 MHz | - |
| Power | 250 mW | 307 mW | 300 mW | 800 mW *2 |
| Memory size | 1263.4 Kbits | - | 1180 Kbits | - |
| Die size | 40.02 mm2 | 15.6 mm2 | 9.7 mm2 | - |
| Feature | All functional blocks included, except ADC; low power consumption | 1D CE method, includes FEC blocks; without sync. function and ADC | Without sync. function; DSP-based approach; does not support DVB-H yet | Fully baseband integrated with embedded 10-bit ADC; does not support DVB-H yet |
*1: nominal frequency
Fig. 6: chip result
V. Conclusion and future work
In the published literature, there is no single-chip DVB-T/H full baseband receiver
design that includes synchronization, demodulation, and channel decoding.
Even where a single-chip design has been published, it assumes a simpler system environment, so the
design may not meet the system performance required by DVB-T/H. In this paper, we present
a single-chip DVB-T/H baseband receiver with a simulated average power
consumption of 250 mW at 1.8 V for a 31.67 Mbps output data rate. Several COFDM applications
have been announced worldwide, for example ISDB, DMB, and DVB systems, and they have similar
system architectures or frame structures, so the design strategy and algorithms may be
reused for these systems. Based on the current research results, developing a low-power,
universal COFDM processor for multiple COFDM systems is our next target.
Reference
[1] ETSI EN 300 744 V1.5.1, “Digital Video Broadcasting (DVB): Framing structure, channel coding and modulation for digital terrestrial television”, ETSI, Nov. 2004.
[2] ETSI EN 302 304 V1.1.1, “Digital Video Broadcasting (DVB): Transmission System for Handheld Terminals (DVB-H)”, ETSI, Nov. 2004.
[3] Anikhindi, S., et al., “A commercial DVB-T receiver chipset”, International Broadcasting Convention, 12-16 Sept. 1997, pp. 528-533.
[Fig. 6 content: chip specification and die photo labels]
Technique: UMC 0.18um CMOS, 1P6M
Logic gate count (excluding SRAM): 371,353
Embedded memory size: 154.2 Kbytes
Package: 208-pin CQFP
Die size: 6.9 x 5.8 mm2
Input clock speed: 109.71 MHz
Supply voltage: 1.8V core, 3.3V I/O
Power consumption: 250 mW @ 31.67 Mbps *
Supporting standard: DVB-T / DVB-H
Operation mode: 2k, 4k, 8k
Modulation: QPSK, 16QAM, 64QAM
Code rate: 1/2, 2/3, 3/4, 5/6, 7/8
Guard interval ratio: 1/4, 1/8, 1/16, 1/32
Die photo labels: FFT RAM; FFT; CE RAM; Viterbi Decoder; RS Decoder; Viterbi RAM; CE; EQ (I); EQ (II); Inner Deinterleaver; DeQAM; CFO (I); CFO (II); CFO (III); Time (I); Time (II); Others
[4] Makowitz, R., et al., “DVB-T Decoder ICs”, IEEE Transactions on Consumer Electronics, vol. 43, no. 3, pp. 438-442, Aug. 1997.
[5] Tognin, M., et al., “A VLSI solution for a digital terrestrial TV (DVB-T) receiver”, International Broadcasting Convention, 12-16 Sept. 1997, pp. 343-348.
[6] Fechtel, S.A., et al., “Advanced receiver chip for terrestrial digital video broadcasting: architecture and performance”, IEEE Transactions on Consumer Electronics, vol. 44, no. 3, pp. 1012-1018, Aug. 1998.
[7] Makowitz, R., et al., “A single-chip DVB-T receiver”, IEEE Transactions on Consumer Electronics, vol. 44, no. 3, pp. 990-993, Aug. 1998.
[8] Frescura, F., et al., “DSP based OFDM receiver and equalizer for professional DVB-T receivers”, IEEE Transactions on Broadcasting, vol. 45, no. 3, pp. 323-332, Sept. 1999.
[9] Jen Seung Choi, et al., “Design and implementation of DVB-T receiver system for digital TV”, IEEE Transactions on Consumer Electronics, vol. 50, no. 4, pp. 991-998, Nov. 2004.
[10] Andrijevic, G., et al., “A fully integrated low-IF DVB-T receiver architecture”, Proceedings of the 2004 International Symposium on System-on-Chip, 16-18 Nov. 2004, pp. 189-192.
[11] Hosemann, M., et al., “Implementing a receiver for terrestrial digital video broadcasting in software on an application-specific DSP”, IEEE Workshop on Signal Processing Systems (SIPS 2004), 2004, pp. 53-58.
[12] Yun-Nan Chang, “Design of An Efficient Memory-based DVB-T Channel Decoder”, IEEE International Symposium on Circuits and Systems (ISCAS 2005), 23-26 May 2005, pp. 5019-5022.
[13] Chua-Chin Wang, et al., “A 2K/8K mode small-area FFT processor for OFDM demodulation of DVB-T receivers”, IEEE Transactions on Consumer Electronics, vol. 51, no. 1, pp. 28-32, Feb. 2005.
[14] Kai-Yuan Jheng, et al., “A DVB-T baseband receiver design based on multimode silicon IPs”, 2005 IEEE VLSI-TSA International Symposium on VLSI Design, Automation and Test (VLSI-TSA-DAT), 27-29 April 2005, pp. 49-52.