CHAPTER 3 BIT LOADING AND OPTIMAL THROUGHPUT OF
3.4 Summary
In this chapter, the maximum throughput of a DMT-based VDSL system is analyzed under various constraints, such as the effects of crosstalk and bridged taps. Raising the sampling rate improves the system throughput, especially when the loop is short. However, as the preceding simulations show, this method has its limits, and the optimal solutions are calculated according to the defined constraints.
Our simulation results show that, for loop lengths below 800 m, increasing the sampling rate improves the system throughput significantly. For loops between 800 m and 1500 m, a 4.4 MHz sampling rate, which is the value selected in the ADSL2 standard, is sufficient to reach the optimal throughput of the DMT-based VDSL system.
Chapter 4
Frame Synchronization by Cyclic Prefix
In the previous chapter, the maximum transmission data rate was calculated based on the optimal bit-loading capability of the twisted-pair line. On the receiver side, however, the overall performance also depends on the equalization algorithm, frame synchronization, and other factors.
Since the channel responses of these test loops have been calculated, the well-trained coefficients of the TEQ [36] and FEQ can be applied to those simulations directly.
In this chapter² [37], the theory of the maximum likelihood (ML) algorithm for frame synchronization is discussed, and a modified algorithm is then proposed to improve the performance and obtain the optimal estimate.
4.1 ML Algorithm for Frame Synchronization
In the transmitted data sequence of a DMT system, the cyclic prefix causes non-zero correlation between some pairs of samples. A maximum-likelihood estimator for frame synchronization derived in [38][39][40] exploits this special characteristic of the cyclic prefix.
Fig. 4.1 shows that the last L samples, I’, are copied to the guard interval, I, at the beginning of each DMT symbol of length N+L (from θ-L+1 to N+θ). We assume that the length of the cyclic prefix is L and the size of the IFFT is N, and that 2(N+L) samples of two consecutive received symbols are observed. The variable θ is defined as the start position of a frame in the observed data, i.e., the θth sample is placed as the first element in the FFT operation.
² Part of the content in this chapter has been published in: S. T. Lin and C. H. Wei, “Precursor ISI-Free Frame Synchronization for DMT VDSL System,” IEICE Trans.
The maximum likelihood estimation of θ , given the received signal r(k), can be formulated as [38]
If we define the “correlation” in terms of the Euclidean distance between two constellation points, a low-complexity ML algorithm [40] is obtained. When the signal-to-noise ratio is high, i.e., ρ is close to unity, the estimated result is similar to that of the ML algorithm.
All the above algorithms are based on the assumption that the channel is ideal. In other words, the received signals r(k) are assumed to be identical to r(k+N) in the cyclic prefix period.
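The estimator described above can be illustrated with a minimal Python sketch. It assumes that the low-complexity metric is the negative sum of squared differences between r(k) and r(k+N) over a window of L samples, which is how the text describes it. The function names, the toy sizes (N = 64, L = 8), and the use of real Gaussian samples in place of true DMT samples are assumptions made for illustration only.

```python
import random

def cp_metric(r, start, window, n_fft):
    """Negative squared distance between r[k] and r[k + n_fft] over the window."""
    return -sum((r[k] - r[k + n_fft]) ** 2 for k in range(start, start + window))

def estimate_cp_start(r, window, n_fft):
    """Window start with the largest metric (the first one if several tie)."""
    last_start = len(r) - n_fft - window
    return max(range(last_start + 1), key=lambda s: cp_metric(r, s, window, n_fft))

def make_symbol(rng, n_fft, cp_len):
    """Toy symbol: Gaussian samples with the last cp_len samples copied in front."""
    body = [rng.gauss(0.0, 1.0) for _ in range(n_fft)]
    return body[-cp_len:] + body

N_FFT, CP_LEN, OFFSET = 64, 8, 5      # illustrative sizes, not the VDSL values
rng = random.Random(1)

# Observation of 2(N+L) samples over an ideal, noise-free channel.
observed = [rng.gauss(0.0, 1.0) for _ in range(OFFSET)]   # tail of the previous symbol
observed += make_symbol(rng, N_FFT, CP_LEN)               # symbol of interest
observed += [rng.gauss(0.0, 1.0)
             for _ in range(2 * (N_FFT + CP_LEN) - len(observed))]

cp_start = estimate_cp_start(observed, CP_LEN, N_FFT)
theta = cp_start + CP_LEN             # first sample handed to the FFT
print(cp_start, theta)
```

In this ideal setting the metric reaches zero only when the window lies exactly on the cyclic prefix, so the estimate coincides with the true position. The sketch says nothing yet about dispersive channels, which are treated next.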
However, the twisted-pair channel is dispersive, so an algorithm that minimizes the ISI should be considered. In Fig. 1.3, a DMT symbol is sent after parallel-to-serial conversion as Y = {yN-L, yN-L+1, …, yN-1, y0, y1, y2, …, yN-1}. For simplicity, denote yN-L+k = ak. If the channel response P is shortened to length M, M < L, by a time-domain equalizer (TEQ), then the shortened channel response H is represented by {h0, h1, h2, …, hM-1}. The received signal vector R is the convolution of the transmitted signal Y and the shortened channel response H.
The representation of r(k) can be expressed as:
If the shortened channel impulse response is shorter than the cyclic prefix length L, then only the Mth to Lth samples are identical to the (M+N)th to (L+N)th samples, respectively. If the number of summation terms of the estimation function is reduced to less than (L-M), the estimated value is much more likely to be located at the initial transition edge of the channel impulse response. Thus we formulate a new modified ML algorithm by defining the
From the matrix analysis listed above, it can be proved theoretically that equation (4.5) gives a better estimate over a dispersive channel. The detailed derivation is described in the following section.
4.2 Modified ML Algorithm
In order to prove that equation (4.5) performs better than equation (4.2), a new variable, u(k), is defined as the difference between r(k) and r(k+N), i.e., u(k) = r(k) - r(k+N). Applying the previously defined r(k) and r(k+N), u(k) can be expressed as:
……….(4.6)
Then u(k) is used in equations (4.2) and (4.5) to find the maximum estimation value. If some elements in the summation window are nonzero, i.e., r(k) is not equal to r(k+N), their squared values are greater than zero, and the result is therefore negative. Otherwise, if r(k) equals r(k+N) for all elements in the summation window, the result equals zero, which is the maximum value.
When the summation window length is shorter than (L-M), as in equation (4.5), the summation is zero if k = L. Therefore, the estimated value θ is equal to L, which locates the frame boundary. Since the Lth to (N+L)th elements of r(k) contain only the jth symbol, no ISI exists. However, if the summation window length equals L, as in equation (4.2), the estimated value will be located at the peak of the channel response. To simplify the derivation, we assume a simple channel model with h0 = 0.1, hp = 0.8, and hM-1 = 0.1. Then u(k) can be rewritten as:
If the summation window length is L, as in equation (4.2), the maximum value is obtained when k = (L+p), a delay of p from the previous value, where p is the peak position of the channel response. Since the (N+L)th to (N+L+p)th elements of r(k) contain the jth as well as the (j+1)th symbol, ISI is introduced. Fig. 4.1 illustrates these derivations. They show that equation (4.2) estimates the frame boundary at the channel peak instead of its initial rising edge; therefore, ISI is introduced, which degrades the system performance [41].
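The displacement by p can be checked numerically with a small Python sketch. It is only an illustration under stated assumptions: the three-tap channel h0 = 0.1, hp = 0.8, hM-1 = 0.1 is placed on M = 8 taps with p = 3, the sizes N = 64 and L = 16 are toy values, real Gaussian samples stand in for DMT samples, the channel is noise-free, and the metric is accumulated over 40 frames so that the outcome does not hinge on a single random symbol. Offsets are counted from the first cyclic prefix sample of each frame, and all names are hypothetical.

```python
import random

def fir(signal, taps):
    """Causal FIR filter: out[n] = sum over m of taps[m] * signal[n - m]."""
    return [sum(h * signal[n - m] for m, h in enumerate(taps) if h and n >= m)
            for n in range(len(signal))]

def accumulated_metric(r, offset, window, n_fft, cp_len, frames):
    """Sum of -(r[k] - r[k+N])^2 over the window, accumulated over several frames."""
    total = 0.0
    for f in range(1, frames + 1):            # frame 0 only supplies the ISI tail
        start = f * (n_fft + cp_len) + offset
        total -= sum((r[k] - r[k + n_fft]) ** 2 for k in range(start, start + window))
    return total

N_FFT, CP_LEN, FRAMES, PEAK = 64, 16, 40, 3
taps = [0.1, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.1]   # h0, hp, hM-1 with p = 3, M = 8
M = len(taps)

rng = random.Random(7)
tx = []
for _ in range(FRAMES + 2):
    body = [rng.gauss(0.0, 1.0) for _ in range(N_FFT)]
    tx += body[-CP_LEN:] + body               # prepend the cyclic prefix
rx = fir(tx, taps)

def scan(window):
    """Metric for every window offset from 0 to L."""
    return [accumulated_metric(rx, d, window, N_FFT, CP_LEN, FRAMES)
            for d in range(CP_LEN + 1)]

# Conventional estimator: window of L samples, FFT starts where the window ends.
full = scan(CP_LEN)
conventional_start = full.index(max(full)) + CP_LEN

# Modified estimator: window of L - M samples, last offset of the flat maximum.
short_window = CP_LEN - M
short = scan(short_window)
plateau = [d for d, v in enumerate(short) if v >= max(short) - 1e-9]
modified_start = plateau[-1] + short_window

print(conventional_start, modified_start)
```

With these assumptions the full-length window settles p samples late, at L + p, while the shortened window ends exactly at L, which agrees with the behaviour derived above for this toy channel.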
Figure 4.1 Illustration for equation derivation. Panels: (a) Original DMT data sequence; (b) Channel response; (c) Received data convolved with channel response and offset by N samples; (d) Received data convolved with channel response; (e) Equation (4.5) output by applying modified ML algorithm; (f) Equation (4.2) output by applying ML algorithm.
4.3 Computer Simulations
To evaluate the performance of the modified algorithm, computer simulations are performed. First, the simulation environment is introduced, and a new term, Et/N0, is defined as a performance measure for these algorithms. The simulation results show that the performance depends strongly on the bit-loading of each tone, so the relationship between these two parameters is studied in the second subsection. The test procedures are then applied to a simple loop of various lengths to compare the performance of the two algorithms as well as the loop characteristics. Finally, all VDSL test loops, VDSL0 to VDSL7 in reference [4], are simulated to study the performance of these algorithms over various types of loop topology.
4.3.1 Simulation Environment
The complexity of the modified ML algorithm is the same as that of the conventional one, yet it provides a better estimate for frame synchronization of the DMT system. A test environment is built to evaluate these algorithms. The simulation program includes several modules: the DMT symbol generator, the channel models, and the receiver. The DMT symbol generator continuously assigns random data to N/2 sub-channels. The number of bits of each sub-channel is assigned by the bit-loading table, which is calculated according to the transfer functions of the corresponding loops. A look-up table contains the QAM constellations of various sizes, ranging from 2 bits to 15 bits, with normalized signal power. Each DMT symbol is converted to the corresponding time-domain samples by an N-point IFFT, a parallel-to-serial converter, and the addition of the cyclic prefix. These data are then convolved with the channel model in the time domain to simulate the data at the receiver side. In the analysis, AWGN is taken into consideration and the transmission power is -60 dBm/Hz in the amateur radio bands.
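The symbol generation step can be sketched in Python as follows. This is a minimal illustration rather than the simulation program itself: it assumes that the DC and Nyquist tones are left empty, uses a naive inverse DFT in place of an FFT, and takes the QAM points as given instead of drawing them from a bit-loading table. The names `idft` and `dmt_symbol` are hypothetical.

```python
import cmath

def idft(spectrum):
    """Naive inverse DFT; adequate for a small illustrative N."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dmt_symbol(tones, cp_len):
    """Build one real DMT symbol from QAM points on tones 1 .. N/2 - 1."""
    n = 2 * (len(tones) + 1)
    spectrum = [0j] * n                       # DC and Nyquist stay empty (assumption)
    for k, point in enumerate(tones, start=1):
        spectrum[k] = point
        spectrum[n - k] = point.conjugate()   # Hermitian symmetry gives a real output
    samples = [v.real for v in idft(spectrum)]
    return samples[-cp_len:] + samples        # cyclic prefix: copy of the last samples

qam4 = [1 + 1j, -1 + 1j, 1 - 1j]              # three 2-bit tones, so N = 8
symbol = dmt_symbol(qam4, cp_len=2)
print(len(symbol), symbol[:2] == symbol[-2:])
```

The Hermitian mirroring is what lets the sketch discard the imaginary part after the inverse transform; without it the time-domain samples would be complex.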
The received data are processed by a TEQ so that the impulse response is shortened to less than the length of the cyclic prefix. Subsequently, the frame synchronization algorithm is applied to locate an FFT symbol so that precursor ISI-free demodulation can be obtained. A one-tap-per-sub-channel FEQ equalizes the channel distortion so that the transmitted QAM signal can be recovered. The performance monitor program calculates the difference between the received data and the original source.
Currently, most DMT xDSL systems use a TEQ [36] plus FEQ structure for channel equalization. Since the channel response of a twisted-pair line extends over hundreds of taps, a TEQ is used to reduce the length of the impulse response. In our simulation, the TEQ coefficients are calculated by the Impulse Response Shortening (IRS) algorithm [36] with 32 zeros and 16 poles in the downstream direction. It generates a shortened impulse response (SIR) shorter than the cyclic prefix for each test loop in our system. The FEQ coefficients are calculated as the inverse of the product of the TEQ and the original channel response. Since the topologies of the test loops are known, their impulse responses and the corresponding TEQ and FEQ coefficients can be calculated directly without any training process.
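The role of the one-tap FEQ can be illustrated with a short Python sketch. It assumes that the shortened impulse response is already available (here a hypothetical three-tap response), that it is shorter than the cyclic prefix, and that the frame boundary is known, so that each FEQ tap is simply the reciprocal of the N-point DFT of that response. The naive DFT and all names are illustrative choices, not the simulation code.

```python
import cmath

def dft(x, n):
    """Naive n-point DFT of x (zero-padded when len(x) < n)."""
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(len(x)))
            for k in range(n)]

def feq_taps(sir, n_fft):
    """One complex tap per tone: inverse of the equalized channel's frequency response."""
    return [1.0 / h for h in dft(sir, n_fft)]

def fir(signal, taps):
    """Causal FIR filter: out[n] = sum over m of taps[m] * signal[n - m]."""
    return [sum(h * signal[n - m] for m, h in enumerate(taps) if n >= m)
            for n in range(len(signal))]

N_FFT, CP_LEN = 8, 3
sir = [0.1, 0.8, 0.1]                          # hypothetical shortened impulse response
block = [1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, -0.5]

received = fir(block[-CP_LEN:] + block, sir)   # transmit with cyclic prefix
rx_block = received[CP_LEN:CP_LEN + N_FFT]     # drop the cyclic prefix
equalized = [y * w for y, w in zip(dft(rx_block, N_FFT), feq_taps(sir, N_FFT))]

# Largest deviation from the spectrum of the transmitted block.
error = max(abs(a - b) for a, b in zip(equalized, dft(block, N_FFT)))
print(error < 1e-9)
```

Because the cyclic prefix turns the linear convolution into a circular one, a single complex multiplication per tone is enough here; that no longer holds once the FFT window is misplaced, which is why the frame boundary estimate matters.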
The simulation environment has been built by creating the functional blocks shown in Fig. 1.3. All topologies of test loops, VDSL0 to VDSL7 as well as a simple loop with length varying from 100 m to 1500 m, have been applied in the performance test simulations. The values of θ estimated by the ML and modified ML algorithms are calculated first. The block of each frame is then selected according to the calculated frame boundary and passed through the FFT and FEQ blocks frame by frame. Finally, the results are compared with the original transmitted data. Due to the channel effect, the received data are scattered around the original constellation point. The deviation of the noisy data determines the performance of the system. If the deviation is larger than half the distance between two adjacent constellation points, the received data may be decoded incorrectly. We define a new parameter, Et/N0, to indicate the normalized energy between two constellation points over the noise power. The distance between two constellation points shrinks as the QAM size grows, so larger constellations are more sensitive to noise.
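The last point can be made concrete with a short Python sketch. It uses the standard closed form for the minimum distance of a square QAM constellation normalized to unit average power, which covers even bit counts only. The ratio of squared minimum distance to noise power computed here is a stand-in chosen for illustration; the exact normalization of Et/N0 is not spelled out in this section, and the noise power value is arbitrary.

```python
import math

def min_distance_square_qam(bits):
    """Minimum distance of a square QAM constellation with unit average power."""
    if bits < 2 or bits % 2:
        raise ValueError("closed form used here covers even bit counts only")
    return math.sqrt(6.0 / (2 ** bits - 1))

def distance_margin_db(bits, noise_power):
    """Squared minimum distance over noise power, in dB (a stand-in, not the Et/N0 above)."""
    return 10.0 * math.log10(min_distance_square_qam(bits) ** 2 / noise_power)

# Same noise power for every constellation size (arbitrary illustrative value).
for bits in (2, 4, 8, 12, 14):
    print(bits, round(distance_margin_db(bits, noise_power=1e-4), 2))
```

For large constellations each additional pair of bits costs about 6 dB of margin at a fixed noise power, which is consistent with the observation that tones carrying many bits are the most sensitive to residual ISI.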
4.3.2 Performance Comparison of ML and Modified ML Algorithms
The computer simulations of the two synchronization algorithms are applied to all the test loops to obtain the Et/N0 of each tone on the receiver side. The simulation results show that Et/N0 depends strongly on the number of bits per sub-channel. Loop VDSL 1200 m is selected to demonstrate this relationship because its bit-loading curve descends like a staircase from 15 bits to 3 bits, as shown in Fig. 4.2. The performance monitor module calculates the average Et/N0 (dB) of each tone over 100 symbols under various transmission powers and noise levels, as shown in Fig. 4.3.
Figure 4.2 Channel capacity of gauge #26 twisted-pair line with 8 kHz symbol rate. (Axes: bits vs. Nth tone; curves: bit loading, total 18.30 Mbps, and SNR curve.)
Figure 4.3 Performance of frame synchronization algorithms with various transmission power: (a) -60 dBm/Hz; (b) -40 dBm/Hz; (c) without AWGN of -140 dBm/Hz. (Axes: Et/N0 (dB) vs. Nth tone; curves: maximum likelihood, modified ML.)
For the conventional maximum likelihood estimator, the obtained values of Et/N0 are 0, 12, and 20 dB for the 12-, 10-, and 7-bit constellations, respectively. For the modified ML method, however, Et/N0 stays around 20 dB when the transmission power is around -60 dBm/Hz. If the transmission power is raised to -40 dBm/Hz, Et/N0 rises to around 40 dB for the modified ML method. If the AWGN of -140 dBm/Hz is not added to the received signal, Et/N0 can rise to around 50 dB. The performance degradation caused by precursor ISI is clear, especially for high-level QAM. The peak in each case occurs at the 64th sub-channel (the pilot tone), where Et/N0 is high because only two bits are carried in this high-SNR band. The simulation results show that the modified ML algorithm performs better than the conventional one, especially at the tones carrying more bits. The Et/N0 of the modified ML algorithm is greater than that of the original one, which indicates that the decoded values after the FFT and FEQ blocks are closer to the original constellation points. Therefore, the precursor ISI is reduced and the BER is improved.
The SNR differences between the two algorithms are shown in Fig. 4.3. The simulation is applied to all the VDSL test loops, including a simple loop with length varying from 100 m to 1500 m. Since the SNR curve of a pure loop without any bridged tap decreases smoothly and the corresponding bit-loading curve is similar to that shown in Fig. 4.3, it is easier to see the relationship between Et/N0 and bit-loading. For loops shorter than 500 m, the bit-loading stays above 12 bits for most tones, and therefore the SNR differences are high for all tones. For the longer loops, the bit-loading is lower at high frequencies and the SNR difference is reduced, as shown in Fig. 4.4.
Figure 4.4 SNR difference of various loops. (Axes: SNR difference in Et/N0 (dB) vs. Nth tone; curves: 300 m, 500 m, 800 m, 1300 m, 1500 m.)
Although the SNR difference is affected by the bit-loading, it also depends on the original SNR curve of the tested loop. In other words, the absolute value of the SNR difference cannot indicate the exact bit-loading value, but it increases at tones with high bit-loading. The simulations show that the modified ML algorithm performs better than the conventional one, especially for a DMT VDSL system with high-level QAM.
4.3.3 Loop Length and Channel Characteristics
The VDSL system limits its loop length to 1500 m. Therefore, a set of test loops with simple topology and lengths from 100 m to 1500 m is simulated to show the relationship between channel characteristics and loop length.
The longer the loop, the broader its response, and the delay also grows with the loop length. The channel impulse responses of these loops have long tails and are much longer than the cyclic prefix. Therefore, the TEQ should be applied in the receiver to obtain the SIR of each channel. The original channel response and its corresponding SIR are shown in Fig. 4.5.
Figure 4.5 Channel response and its SIR of loop VDSL 1200 m. (Axes: magnitude vs. sample; curves: VDSL 1200 m, SIR.)
Fig. 4.5 shows that the length of the SIR, the output of the TEQ functional block, is reduced to fewer than 20 samples, i.e., M < 20, whereas the original impulse response is more than 100 samples long.
To search for the optimal summation window length for frame synchronization, M is varied from 1 to L; the corresponding averaged Et/N0 of some tested loops is shown in Fig. 4.6. If the channel response is shorter than the cyclic prefix, more than one window length is optimal. In Fig. 4.6, Et/N0 stays at the same level as the window length (L-M) varies from 2 to 22. If M is large enough to cover the main energy band of the SIR, the estimator delivers a proper solution without precursor ISI. The conventional ML estimator can be viewed as a special case of the modified ML algorithm with M equal to zero. Since the channel responses of the VDSL test loops are sharp after the TEQ, the summation window length can be set under 20.
In a real system, the exact length of an SIR is difficult to estimate; therefore, M is assigned a reasonable value by the program, such as 20. As long as the assigned M covers the main energy band of the SIR, the ISI caused by the residual tail is negligible. On the other hand, if M is larger than the length of the SIR, the output of equation (4.5) has more than one solution; in other words, the maximum over θ is a continuous band instead of a sharp peak. To obtain the correct solution in our system, the largest θ in the continuous band should be selected.
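The selection rule for a flat maximum can be sketched in a few lines of Python. The sketch assumes that the metric has already been evaluated for each candidate θ and that values within a tolerance of the maximum count as part of the band; the tolerance, the function name, and the sample values are hypothetical and would have to be tuned to the noise level in practice.

```python
def band_end(metric, tol):
    """Last index of the contiguous run around the maximum that stays within tol of it."""
    best = max(metric)
    i = metric.index(best)
    while i + 1 < len(metric) and metric[i + 1] >= best - tol:
        i += 1
    return i

# Made-up metric values with a flat top at indices 2 to 4 and an isolated value at 6.
metric = [-4.0, -1.5, -0.02, 0.0, -0.01, -2.0, -0.005]
print(band_end(metric, tol=0.05))
```

Walking right from the maximum, rather than taking any index within the tolerance, keeps an isolated near-maximum elsewhere in the search range from being mistaken for the end of the band.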
Figure 4.6 Estimated delays and their Et/N0 vs. window length.
Figure 4.7 θ and Et/N0 of VDSL test loops. (Panels: frame boundary estimated value (delay) and Et/N0, each vs. length of the tested loop (m); curves: maximum likelihood, modified ML.)
The estimated frame boundary θ and the averaged Et/N0 of all used tones for 8 kHz DMT VDSL downstream over a loop of various lengths are shown in Fig. 4.7.
The test loops range from 100 m to 1500 m in length, with index numbers from 1 to 15, respectively. The delays estimated by the ML and modified ML algorithms are about 2 to 8. The delay indicates the distance between the original rising edge and the peak position of the shortened channel response. The bit-loading of each tone is high for short loops, and therefore the modified ML algorithm performs much better than the conventional one.
4.3.4 VDSL Test Loops with Complex Topologies
The above procedure can be applied to other test loops with complex topologies as listed in the VDSL draft standard [35]. The estimated frame boundary θ and the averaged Et/N0 of all used tones for 8 kHz DMT VDSL downstream are listed in Table 4.1 and shown in Fig. 4.8.
For each loop length in Table 4.1, the first row contains the results of the conventional ML algorithm and the second row those of the modified ML algorithm.
Table 4.1 (θ, Et/N0) of VDSL test loops.
Loop              Algorithm     VDSL1      VDSL2      VDSL3       VDSL4      VDSL5, 6, & 7
Short (300 m)     ML            (43, -12)  (42, -5)   (43, -4.5)  (43, -5)   (42, -4)
                  Modified ML   (37, 35)   (39, 36)   (37, 36)    (40, 30)   (38, 32)
Medium (1000 m)   ML            (59, 2)    (58, 8)    (58, -0.5)  (58, 26)   (58, 2)
                  Modified ML   (50, 41)   (48, 32)   (49, 35)    (55, 38)   (49, 38)
Long (1500 m)     ML            (70, 17)   (69, 3)    (69, -4)    (70, 19)   (70, 10)
                  Modified ML   (61, 24)   (60, 23)   (60, 20)    (64, 21)   (61, 33)
Figure 4.8 θ and Et/N0 of VDSL test loops. (Panels: frame boundary estimated value (delay) and averaged SNR (Et/N0), each vs. test loop index; curves: maximum likelihood, modified ML.)
The test loops are ordered from short to long, with index numbers from 1 to 15, respectively. In other words, VDSL1 to VDSL4 of 300 m length are indexed from 1 to 4, VDSL5 is 5, VDSL1 to VDSL4 of 1000 m length are 6 to 9, VDSL6 is 10, and so on. Since the topologies of these test loops contain various types of wires and some of them have bridged taps, the performance differences between the two algorithms do not change as smoothly as in the previous test cases. However, the bit-loading of each tone is high for short loops, and therefore the modified ML algorithm performs much better than the conventional one, especially for the short test loops.
4.4 Summary
In this chapter, a low-complexity modified ML algorithm for frame boundary estimation is developed. It performs better than the conventional one over a dispersive channel. A test environment is built to analyze the performance of the DMT-based VDSL system. The analysis takes AWGN into consideration, and the transmission power is -60 dBm/Hz in the amateur radio bands. Simulation results show that the modified ML algorithm raises the Et/N0 of each tone, especially for tones carrying high-level QAM data. The modified algorithm improves the VDSL system performance significantly because most of its sub-channels are loaded with a large number of bits.
Chapter 5
ISI Cancellation Algorithm for DMT-based VDSL System
For the traditional DMT-based VDSL system, two drawbacks remain: uncancelled residual ISI outside the cyclic prefix and deterioration of channel capacity by the TEQ. In this chapter³ [43], we propose an algorithm to enhance the performance of the DMT-