Chapter 4 Gaussian mixture identification with EM algorithm
4.3 Gaussian mixture identification via EM algorithm
The mixture-density parameter estimation problem is one of the most widely used applications of the EM algorithm. In this case, the density to be identified is a mixture whose components are density functions parameterized by $\theta_i$. Equivalently, we have $M$ component densities mixed together with mixing coefficients $\alpha_i$.
Even with the Gaussian assumption on $p_j(\cdot)$, (4.8) is difficult to maximize since it is highly nonlinear. If we consider $X$ as incomplete and posit the existence of unobserved data $Y=\{y_i\}_{i=1}^{N}$ whose values indicate which component density generated each data item, the likelihood expression can be significantly simplified and the solution is easier to obtain. We assume that $y_i \in \{1, 2, \ldots, M\}$ for each $i$, and that $y_i = k$ if the $i$-th sample is generated by the $k$-th mixture component. If we know the data $Y$, the likelihood can be expressed as:
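$\log L(\Theta \mid X, Y) = \sum_{i=1}^{N} \log\big(P(x_i \mid y_i)\,P(y_i)\big) = \sum_{i=1}^{N} \log\big(\alpha_{y_i}\, p_{y_i}(x_i \mid \theta_{y_i})\big)$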
which, given a particular form of the component densities, can be easily optimized using a variety of techniques. However, we do not know the values of $Y$. If we assume $Y$ is a random vector, we can proceed.
First, we must derive an expression for the distribution of the unobserved data. In the beginning, we guess that $\Theta^g = \{\alpha_1^g, \alpha_2^g, \ldots, \alpha_M^g, \theta_1^g, \theta_2^g, \ldots, \theta_M^g\}$. Given $\Theta^g$, we can easily compute $p_j(x_i \mid \theta_j^g)$ for each $i$ and $j$. Besides, the mixing parameters $\alpha_j$ can be thought of as the prior probability of each mixture component, that is, $\alpha_j = p(\text{component } j)$. Using Bayes' rule, we obtain:
Then, we can simplify (4.12) as (4.14):
In order to maximize (4.14), we can maximize the term containing $\alpha_l$ and the term containing $\theta_l$ independently, since they are not related.
To find $\alpha_l$, we introduce the Lagrange multiplier $\lambda$ with the constraint that $\sum_l \alpha_l = 1$. Lagrange multipliers provide a strategy for finding the maximum/minimum of a function subject to constraints. For example, if we want to maximize $f(x, y)$ subject to the constraint $g(x, y) = c$, then the cost function can be redefined with the Lagrange multiplier $\lambda$ as follows:
$\Lambda(x, y, \lambda) = f(x, y) + \lambda\,(g(x, y) - c)$ (4.15)
Now, we can solve the following equation for $\alpha_l$:
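$\frac{\partial}{\partial \alpha_l}\left[\sum_{l=1}^{M}\sum_{i=1}^{N} \log(\alpha_l)\, p(l \mid x_i, \Theta^g) + \lambda\left(\sum_{l} \alpha_l - 1\right)\right] = 0$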
Note that in our scenario, we do not need to estimate the parameters $\alpha_l$, since we assume that all component densities are mixed together with the same mixing coefficient $\alpha_l$.
We then want to find $\theta_l$. For some distributions, it is possible to get an analytical expression for $\theta_l$. In our scenario, the distribution is a mixture of two one-dimensional Gaussian distributions with means $\mu_1 = -\mu_2$ and variances $\sigma_1^2 = \sigma_2^2$, which is shown as:
Taking the logarithm of (4.19), ignoring constant terms, and substituting the result into the last term on the right-hand side of (4.14), we obtain
Taking the derivative of (4.20) with respect to $\mu_l$ and setting it equal to zero yields the estimate of $\mu_l$.
Finally, we can obtain the estimate of $\sigma_l^2$:
Now, a complete EM algorithm for the Gaussian mixture identification has been derived. To sum up, the E-step finds the expected value of the complete-data log-likelihood, and the M-step obtains a new estimate by maximizing the expectation computed in the E-step. The estimates of the new parameters in terms of the old parameters are summarized below:
; 1
Each iteration of the EM algorithm is guaranteed not to decrease the log-likelihood, and the algorithm converges to a local maximum of the likelihood function. Since the EM algorithm is not guaranteed to find the global maximum, we need to choose the initial values carefully.
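The E-step and M-step described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it uses the standard per-component weighted-mean and weighted-variance updates for a one-dimensional Gaussian mixture, it keeps the mixing coefficients fixed and equal (as assumed in our scenario), and all function names (`em_step`, `fit`, `sample_mixture`) are hypothetical. It does not enforce the symmetry $\mu_1 = -\mu_2$, $\sigma_1^2 = \sigma_2^2$ explicitly, so the update equations of this chapter may differ in form.

```python
import math
import random


def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, mus, variances, alphas):
    """Incomplete-data log-likelihood of the mixture."""
    return sum(
        math.log(sum(a * gaussian_pdf(x, m, v)
                     for a, m, v in zip(alphas, mus, variances)))
        for x in data
    )


def em_step(data, mus, variances, alphas):
    """One EM iteration; the mixing coefficients are held fixed (assumption)."""
    # E-step: posterior probability of each component for each sample (Bayes' rule).
    resp = []
    for x in data:
        joint = [a * gaussian_pdf(x, m, v)
                 for a, m, v in zip(alphas, mus, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: weighted mean and weighted variance for each component.
    new_mus, new_vars = [], []
    for l in range(len(mus)):
        weight = sum(r[l] for r in resp)
        mu = sum(r[l] * x for r, x in zip(resp, data)) / weight
        var = sum(r[l] * (x - mu) ** 2 for r, x in zip(resp, data)) / weight
        new_mus.append(mu)
        new_vars.append(var)
    return new_mus, new_vars


def fit(data, mus, variances, alphas=(0.5, 0.5), iterations=50):
    """Run EM from the given initial guess; also return the log-likelihood trace."""
    history = [log_likelihood(data, mus, variances, alphas)]
    for _ in range(iterations):
        mus, variances = em_step(data, mus, variances, alphas)
        history.append(log_likelihood(data, mus, variances, alphas))
    return mus, variances, history


def sample_mixture(n, mu, sigma, seed=0):
    """Hypothetical test data: equal-weight mixture of N(mu, sigma^2) and N(-mu, sigma^2)."""
    rng = random.Random(seed)
    return [rng.gauss(mu if rng.random() < 0.5 else -mu, sigma) for _ in range(n)]
```

For example, `fit(sample_mixture(2000, 2.0, 1.0), [-1.0, 1.0], [1.0, 1.0])` starts from a deliberately poor guess. The returned log-likelihood trace can be inspected to confirm that it does not decrease from one iteration to the next.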
5 Compress and Forward in User Cooperation
In Chapter 3, we described two cooperative protocols, i.e., AF and DF. In this chapter, we investigate another cooperative scheme, called compress-and-forward (CF). In CF, the relay forwards the quantized/observed/estimated version of its observations. In [11], CF for the turbo decoder was introduced. In this chapter, we extend its use to the LDPC decoder.
Since the SR channel may experience deep fades with a high probability, the DF scheme cannot operate in the cooperative mode all the time, and this causes performance degradation. AF does not have this problem. However, the SR channel may be noisy, and retransmission further amplifies the noise. CF can alleviate both problems. Under CF, the relay retransmits the information from the source to the destination whether or not the decoding is successful. The destination can then combine the LLRs from the source and the relay for data detection. More details are given in the later sections.
5.1 Compress-and-forward (CF) Cooperation Strategy
Figure 5-1: Block diagram of hybrid compress-and-forward (CF)
Figure 5-1 shows the block diagram of the hybrid CF cooperative strategy; for simplicity, we just call it CF. In DF, when the decoding in the relay fails, the relay switches to the non-cooperation mode. As a result, the destination only has the information from the source. In the CF scheme, however, when the decoding fails, the relay switches to a mode in which it retransmits the quantized LLR information to the destination. Figure 5-2 shows the CF cooperative protocol.
Figure 5-2: CF for BPSK modulation
In Figure 5-2, $x_s$ denotes the transmit signal at the source, $v_s$ its received signal at the destination, $\hat{r}^{(t)}$ the LLR of a decoded bit at the relay, $w^{(t)}$ a modulated index for the quantized LLR (here only one-bit quantization), and $z^{(t)}$ the received index at the destination. The destination then collects $v_s$ and $z^{(t)}$ to recover the transmitted data.
Here, we assume that the LDPC code is used at the source. The overall approach can be summarized as follows:
1) In the first time slot, the source broadcasts signal $x_s$ to the relay and the destination simultaneously.
2) The relay performs LDPC decoding to estimate $x_s$. If $x_s$ is decoded successfully, the relay uses the traditional DF scheme. If the decoding fails, the relay quantizes the LDPC-decoder LLR, encodes the index, conducts symbol mapping with BPSK, QPSK, or M-QAM, and transmits the resultant signal to the destination.
3) The destination combines the information received from the source and the relay to recover the information bits.
Note that in the second time slot, an indicating bit may be piggybacked on the relay packet, so that the destination knows whether information bits or LLR indices are retransmitted. Besides, if the error rate at the relay is too high, we may also switch to the non-cooperative mode. This is because if the SR channel is very poor, not much information can be exploited from the relay.
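The relay behaviour described above can be summarized as a small decision function. The following Python sketch is only illustrative: the function name, the way the relay obtains an error-rate estimate, the threshold parameter, and the encoding of the indicating bit (0 for information bits, 1 for LLR indices) are all assumptions made for this example and are not specified in the text.

```python
def select_relay_mode(decoding_ok, relay_error_rate, nc_threshold):
    """Return (mode, indicator_bit) for the second time slot.

    decoding_ok      -- True if the LDPC decoding at the relay succeeded
    relay_error_rate -- hypothetical estimate of the error rate at the relay
    nc_threshold     -- hypothetical error rate above which the relay stays silent
    indicator_bit    -- assumed encoding: 0 = information bits, 1 = LLR indices,
                        None = nothing is transmitted by the relay
    """
    if decoding_ok:
        # Decoding succeeded: re-encode and forward the bits (traditional DF).
        return "DF", 0
    if relay_error_rate > nc_threshold:
        # SR channel too poor: the relay observation carries little information.
        return "NC", None
    # Decoding failed but the LLRs are still useful: quantize and forward them.
    return "CF", 1
```

The destination can read the indicator bit to decide whether to treat the relay packet as re-encoded information bits or as quantized LLR indices.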
5.2 System Model
Here we consider the same scenario as that in Chapter 3, a typical three-node relay system. Let the channels be block Rayleigh fading, the variances of all channels be one, and all noises be AWGN. The received signals at the relay and the destination can be expressed as follows:
$n_{SD}$, $n_{RD}$ are the received signals and noise at the destination (first and second time slot). All noises have zero mean, and the variances of $n_{SD}$, $n_{SR}$, and $n_{RD}$ are $\sigma_{SD}^2$, $\sigma_{SR}^2$, and $\sigma_{RD}^2$, respectively.
For decoding, we have to calculate the channel LLR upon receiving the signal at the relay and the destination. For retransmission at the relay, we have to calculate the decoder LLR. For a received signal $r$, the LLR can be written as:
2
For the decoder LLR $\hat{r}$, we can model it as a binary signal corrupted by Gaussian noise, i.e.:
n N σ . The probability density function (PDF) of the decoder LLR is then
In the next section, we will discuss the quantizer and index encoder at the relay.
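As an illustration of the channel LLR mentioned in this section, the following Python sketch computes the LLR of a BPSK symbol under a simplified model. The assumptions are ours: a real-valued observation $y = h x + n$ with $x \in \{+1, -1\}$, a known real channel gain $h$, real Gaussian noise of variance `noise_var`, and the LLR defined as $\log p(y \mid x=+1) - \log p(y \mid x=-1)$. The sign convention and the function names are hypothetical.

```python
def bpsk_channel_llr(y, h, noise_var):
    """Closed-form LLR for y = h*x + n, x in {+1, -1}, n ~ N(0, noise_var)."""
    return 2.0 * h * y / noise_var


def llr_from_densities(y, h, noise_var):
    """Same LLR obtained directly from the two Gaussian log-densities.

    The common normalization constant cancels in the difference.
    """
    log_p_plus = -(y - h) ** 2 / (2.0 * noise_var)
    log_p_minus = -(y + h) ** 2 / (2.0 * noise_var)
    return log_p_plus - log_p_minus
```

Both functions return the same value. The second one shows where the closed form comes from: the quadratic terms in $y$ and $h$ cancel and only the cross term $2hy/\sigma^2$ remains.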
5.3 Quantizer Optimization [11]
In this section, we address how to quantize $\hat{r}^{(t)}$ at the relay, where the superscript $t$ is the index of a signal in a packet. In CF, an index encoder (IE) typically follows the quantizer to compress the indices of the quantization bins for further rate reduction. The design of a CF quantizer needs to take the index encoder type into account. Here, for simplicity, we consider fixed-rate index encoding. We discuss the design procedure by considering a four-level scalar quantizer. However, one can extend the method to a higher-level quantizer.
Let $\{u_i,\ i = 0, 1, 2, 3, 4\}$ be the bin boundaries, where $u_0$ and $u_4$ are set to be the low-end and the high-end bin boundaries for $\hat{r}^{(t)}$. Figure 5-3 shows the bin boundaries and an example of the index encoder. For a specific bin index $w^{(t)}$, we have
( ) ( )
Figure 5-3: Bin boundaries and index-encoder
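A four-level scalar quantizer with fixed-rate index encoding can be sketched as follows. This is a minimal Python illustration, not the optimized quantizer of this section: the inner boundaries $u_1, u_2, u_3$ are passed in as given, $u_0$ and $u_4$ are treated as $-\infty$ and $+\infty$, and the natural-binary two-bit labeling in `index_bits` is a hypothetical choice that need not match the index encoder of Figure 5-3.

```python
import bisect


def quantize(llr, inner_boundaries):
    """Return the bin index l such that u_l <= llr < u_{l+1}.

    inner_boundaries -- sorted list [u_1, u_2, u_3]; u_0 and u_4 are taken
                        as -infinity and +infinity (assumption).
    """
    return bisect.bisect_right(inner_boundaries, llr)


def index_bits(index):
    """Hypothetical fixed-rate two-bit label (natural binary) of a bin index."""
    return (index >> 1) & 1, index & 1
```

For instance, with inner boundaries `[-1.0, 0.0, 1.0]`, an LLR of `0.5` falls in bin 2, whose assumed label is `(1, 0)`.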
The general design goal for a CF scheme, as well as for other cooperative schemes, is for the relay to maximize the amount of "new" information about signal $x_s$. By "new", we mean non-overlapping information that complements the information conveyed directly to the destination by the source. Mathematically, this criterion can be expressed as
arg min ( ( )| ( )) (5.9)
Substituting (5.11) and (5.12) into (5.10), we can have computed by the EM algorithm described in Chapter 4.
5.4 LLR Computation at Destination
Another important issue in CF is how the destination exploits the information received from the relay. As mentioned, the LLR is what we need for LDPC decoding.
In this section, we derive the formula to compute the LLR at the destination using the observations in Figure 5-2, $z^{(t)}$ and $v_s^{(t)}$. Define $z^{(t)} = (z^{(t,0)}, z^{(t,1)})$,
5.4.1 BPSK Modulation at the Relay
First, we consider a simple case where BPSK is used at the relay for the modulation of the quantized LLR. Let
(5.17), and (5.18) to obtain the LLR that combines the information from the source and the relay at the destination.
( ) ( ) ( ) ( ) ( )
Since $v^{(t)}$ and $z^{(t)}$ are independent, we can rearrange the equation as:
( ) ( ) ( ) ( )
( )
Here, the first part of (5.20) is the same as the LLR considered in Chapter 3, so we can rewrite it as:
where we assume
We also assume that the destination will receive the value of
( )
relay by a side information channel.
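The idea of combining the source observation with the relay observation can be illustrated with the following Python sketch. It is not the formula of this section but one possible instance under explicit assumptions: the relay LLR is modeled as $x\mu + n$ with Gaussian $n$ of standard deviation `sigma`, a four-level quantizer with inner boundaries $u_1, u_2, u_3$ is used, each two-bit index is sent as two real BPSK symbols (bit 1 mapped to $+1$) over a real channel `h_rd`, and the index is marginalized out at the destination. The bin probabilities are computed locally here from hypothetical model parameters, whereas the text assumes that relay-side quantities reach the destination over a side information channel.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_probabilities(x, mu, sigma, inner_boundaries):
    """P(w = l | x) when the relay LLR is modeled as x*mu + N(0, sigma^2)."""
    edges = [-math.inf] + list(inner_boundaries) + [math.inf]
    cdf = [std_normal_cdf((u - x * mu) / sigma) for u in edges]
    return [cdf[l + 1] - cdf[l] for l in range(len(edges) - 1)]


def index_likelihood(z, index, h_rd, noise_var):
    """p(z | w = index) up to a constant; two BPSK symbols per index (assumed)."""
    bits = ((index >> 1) & 1, index & 1)
    log_p = 0.0
    for z_b, bit in zip(z, bits):
        symbol = 1.0 if bit else -1.0  # assumed bit-to-symbol mapping
        log_p += -(z_b - h_rd * symbol) ** 2 / (2.0 * noise_var)
    return math.exp(log_p)


def combined_llr(llr_sd, z, h_rd, noise_var, mu, sigma, inner_boundaries):
    """Source-link LLR plus the LLR contributed by the relay observation z."""
    like = [index_likelihood(z, l, h_rd, noise_var) for l in range(4)]
    p_plus = bin_probabilities(+1, mu, sigma, inner_boundaries)
    p_minus = bin_probabilities(-1, mu, sigma, inner_boundaries)
    num = sum(a * b for a, b in zip(like, p_plus))
    den = sum(a * b for a, b in zip(like, p_minus))
    return llr_sd + math.log(num / den)
```

Because the two observations are treated as independent given the transmitted bit, the result is the sum of the source-link LLR and a relay term. An uninformative relay observation such as `z = (0.0, 0.0)` leaves the source-link LLR unchanged.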
5.4.2 QPSK Modulation at the Relay
Note that in Figure 5-2, $w^{(t)} = \{w^{(t,0)}, w^{(t,1)}\}$ has two bits. Thus, BPSK needs two symbols to transmit it, whereas QPSK needs only one. We can let $w^{(t)}$ be a complex number, i.e., $w^{(t)} = w^{(t,0)} + j\,w^{(t,1)}$. Figure 5-2 can then be redrawn as Figure 5-4.
Figure 5-4: CF for QPSK modulation
The LLR can then be expressed as:
( )
where $z'^{(t)}_I$ and $z'^{(t)}_Q$ are the real part and the imaginary part of $z'^{(t)}$, and $z'^{(t)}$ is defined as in Section 5.4.1. For each bit of the QPSK signal, we can find the regions used to demap $w^{(t,0)}$ and $w^{(t,1)}$. With Gray coding, the regions are shown in Figure 5-5. Bit 1 is from $w^{(t,0)}$, and bit 2 is from $w^{(t,1)}$.
Figure 5-5: The regions for $w^{(t,0)}$ and $w^{(t,1)}$
5.4.3 16QAM Modulation at the Relay
The idea is similar to QPSK, and the system model is also similar to Figure 5-4.
However, there are 4 bits carried in a 16QAM signal, so we need to modulate two bin indices $\{w_n^{(t,0)}, w_n^{(t,1)}\}$ and $\{w_n^{(t+1,0)}, w_n^{(t+1,1)}\}$ onto one 16QAM symbol. Therefore, the symbol we send from the relay is $w_{16QAM} = (w_n^{(t,0)}, w_n^{(t,1)}, w_n^{(t+1,0)}, w_n^{(t+1,1)}) = w_n^{(t)} + j\,w_n^{(t+1)}$, where $w_n^{(t)}, w_n^{(t+1)} \in \{\pm 1, \pm 3\}$. Figure 5-6 is the corresponding system model.
Figure 5-6 : CF for 16QAM modulation
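The symbol mappings used at the relay for QPSK and 16QAM can be sketched as follows. This Python fragment is illustrative only: the Gray labeling of the four amplitude levels in `LEVELS` and the bit-to-sign convention in `qpsk_symbol` are hypothetical choices, since the exact labeling is given by the figures rather than by the text.

```python
# Hypothetical Gray labeling of a two-bit bin index onto the levels {-3, -1, +1, +3}.
LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def qpsk_symbol(bit0, bit1):
    """One bin index (two bits) mapped to one QPSK symbol w0 + j*w1."""
    return complex(2 * bit0 - 1, 2 * bit1 - 1)


def qam16_symbol(index_bits_t, index_bits_t1):
    """Two consecutive bin indices mapped to one 16QAM symbol.

    The index at time t sets the real part and the index at time t+1
    sets the imaginary part, each taking a value in {-3, -1, +1, +3}.
    """
    return complex(LEVELS[index_bits_t], LEVELS[index_bits_t1])
```

With this convention, one 16QAM symbol carries the quantized LLRs of two consecutive code bits, so the relay sends half as many symbols as with QPSK.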
The two LLRs can be expressed as:
Figure 5-7: The regions for $w_n^{(t)}$ and $w_n^{(t+1)}$
Similarly, we assume that the destination will receive the value of
( )
( ) ( ) ˆ
t h t l
u
u ⋅ dr
∫
from the relay by a side information channel. After computing the LLR, we can use it as the input to the LDPC decoder to find the soft decoder LLR. Finally, we make data decisions as in (2.30).
6 Simulations
In this chapter, we report simulation results evaluating the performance of different cooperative schemes in different scenarios. In the simulations, we assume that the instantaneous CSI $h_{SD}$, $h_{SR}$, and $h_{RD}$ are known to the receivers, and BPSK, QPSK, and 16QAM are used as the modulation schemes. The bit error rate (BER) and packet error rate (PER) are used as the performance measures.
We also assume that $h_{SD}$, $h_{SR}$, and $h_{RD}$ are spatially independent and experience slow Rayleigh fading. The variance of each channel is one. We also consider the line-of-sight (LOS) channel, in which each channel has a unit gain. As to the noise, we consider AWGN. The means of $n_{SD}$, $n_{SR}$, and $n_{RD}$ are zero and the variances are $\sigma_{SD}^2$, $\sigma_{SR}^2$, and $\sigma_{RD}^2$, respectively. Given the SNR and the average power of a signal $P_k$, we can compute the noise variance easily. For ease of reference, we denote the SNR of the SR channel as SNRSR, that of the SD channel as SNRSD, and that of the RD channel as SNRRD.
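The channel and noise setup described above can be sketched in Python as follows. The sketch rests on assumptions of ours: the SNR is taken as the ratio of the average signal power to the total noise variance and is given in dB, a Rayleigh fading coefficient is drawn as a zero-mean complex Gaussian with unit variance, and the function names are hypothetical.

```python
import math
import random


def noise_variance(snr_db, signal_power=1.0):
    """Noise variance for a given SNR in dB, assuming SNR = signal_power / variance."""
    return signal_power / (10.0 ** (snr_db / 10.0))


def rayleigh_coefficient(rng):
    """Complex Gaussian channel gain with E[|h|^2] = 1 (Rayleigh-distributed magnitude)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def los_coefficient():
    """LOS channel: the gain is always one."""
    return complex(1.0, 0.0)
```

For example, `noise_variance(10.0)` gives 0.1 for a unit-power signal. For block fading, one coefficient would be drawn per packet and held fixed over the packet.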
At the source, we encode the original information bits with the LDPC encoder defined in IEEE 802.15.3c with a code rate of 1/2. We let the packet size be equal to the coding-block size. In other words, there is one LDPC codeword (672 bits) in one packet. Five scenarios are considered. We use the DC scheme for Scenarios 1 to 4, and the MD scheme for Scenario 5.
6.1 Scenario 1
In this scenario, we consider the LOS channel, i.e., the gain of each channel is always one. We evaluate the performance of the non-cooperative (NC), the cooperative, and the cooperative LDPC-coded schemes. Here, let SNRSD=SNRSR=SNRRD. Figure 6-1 shows the simulation results. As we can see, at BER = $10^{-3}$ the cooperative scheme outperforms NC by about 2 dB. Also, AF with LDPC coding outperforms AF without coding by about 4 dB.
We then conduct more simulations for AF without LDPC coding. Let SNRSD=5dB, SNRRD=1, 5, 9 dB, and let SNRSR be varied. Figure 6-2 shows the performance comparison. From the figure, we see that the higher the SNRSR, the better the performance. Then, we let SNRSD=5dB, SNRSR=5, 9dB, and let SNRRD be varied. Figure 6-3 shows the performance comparison. From the figure, we see that the higher the SNRRD, the better the performance.
Then we conduct simulations for AF with LDPC coding. Let SNRSD=1dB, SNRRD=1, 5 dB, and let SNRSR be varied. Figure 6-4 shows the performance comparison. From the figure, we see that the higher the SNRSR, the better the performance. With LDPC coding, the performance is much better than without it. Then, we let SNRSD=1dB, SNRSR=1, 5dB, and let SNRRD be varied. Figure 6-5 shows the performance comparison. From the figure, we also see that the higher the SNRRD, the better the performance. In summary, cooperative systems with LDPC codes work better than those without LDPC codes.
Figure 6-1 : BER comparison for AF cooperative/non cooperative systems with LDPC codes and without LDPC codes
Figure 6-2 : BER comparison for various SNRSR in AF without LDPC codes
Figure 6-3 : BER comparison for various SNRRD in AF without LDPC codes
Figure 6-4 : BER comparison for various SNRSR in AF with LDPC codes
Figure 6-5 : BER comparison for various SNRRD in AF with LDPC codes
6.2 Scenario 2
In this scenario, we consider the system with Rayleigh fading channels. We compare the PER performance of the NC, AF, and DF systems with BPSK modulation. The channel SNRs are set as SNRSD=SNRSR=SNRRD. In AF, the relay just amplifies the signals and transmits them to the destination, so the relay also propagates the noise. However, it does not suffer from decision errors. DF degenerates to the NC mode when a decision error occurs at the relay. Despite that, DF has a 1~2.5 dB gain over AF.
Figure 6-6 : BER comparison for NC, AF and DF, (SNRSR= SNRSD= SNRRD)
6.3 Scenario 3
In this scenario, we assume the LOS channel in the system. That means in every packet, the SNR is always fixed. We include the CF scheme in our simulations.
In CF, if the relay decodes the information bits correctly, it chooses the DF mode to re-encode and retransmit the information bits to the destination. If it decodes the bits incorrectly, the relay has two modes to choose from, the CF and NC modes. Here, we set a threshold for the mode selection. If the BER at the relay is lower than the threshold, the relay chooses the CF mode; otherwise, the relay switches to the NC mode. The threshold we set is 0.5.
For cooperative systems, the source uses the 16QAM modulation scheme. In the first time slot, it transmits the modulated signal to the relay and the destination. In the second slot, the relay uses DF or CF to transmit the processed signal to the destination. In DF, the 16QAM scheme is used, while for CF, the BPSK, QPSK, and 16QAM modulation schemes are used. We use CF(BPSK), CF(QPSK), and CF(16QAM) to denote the various CF schemes we consider. Note that the data rates in the RD channel are different for different modulation/cooperative schemes. In general, the CF scheme requires a higher data rate. However, since the PER is typically small, the overhead introduced by the CF scheme is only slightly higher than that of the DF scheme.
6.3.1 Case 1
We set the channel SNRs as SNRSR=SNRRD - 8 dB and SNRSD=SNRRD - 10 dB. Figure 6-7 shows the simulation results, and we can see that the performance of CF is much better than that of DF. Also, the performances of CF(BPSK), CF(QPSK), and CF(16QAM) are very close. Below SNRRD=15 dB, the performance of DF is almost the same as that of NC. This is because SNRSR is low and DF switches to the NC mode most of the time.
Figure 6-7 : BER comparison for NC, DF, and CF in LOS channel, (SNRSR=SNRRD - 8 and SNRSD=SNRRD – 10)
6.3.2 Case 2
We set the channel SNRs as SNRSR=7dB and SNRSD=5dB, and vary SNRRD. Figure 6-8 shows the results, and we can see that in this case DF is still close to NC. The performance of the CF scheme improves very quickly as SNRRD increases. Finally, the BER saturates around $10^{-2}$.
Figure 6-8 : BER comparison for NC, DF, and CF in LOS channel (SNRSR=7dB and SNRSD=5dB)
6.3.3 Case 3
We set the channel SNRs as SNRSR=8dB and SNRSD = SNRRD - 10dB. Figure 6-9 shows the simulation results, and we can see that the performance of CF is much better than that of DF when SNRRD is higher than 5dB. The higher the SNRRD, the larger the gain we can obtain with CF.
Figure 6-9 : BER comparison for NC, DF, and CF in LOS channel (SNRSR=8dB and SNRSD = SNRRD - 10dB)
6.3.4 Case 4
We set the channel SNRs as SNRSR=SNRRD and SNRSD=SNRRD-10dB. Figure 6-10 shows the result. From the figure, we can see that the NC and DF curves are almost the same when SNRRD is less than 7dB. Also, DF is worse than AF in this SNR region. The reason, as mentioned, is that DF switches to the NC mode most of the time. When SNRRD is higher than 7dB, the performance of DF starts to improve and becomes better than that of CF. As we can see, CF always gives the best performance among all the cooperative schemes.
Figure 6-10 : BER comparison for NC, DF, and CF in LOS channel (SNRSR=SNRRD, and SNRSD=SNRRD-10dB)
6.4 Scenario 4
In this scenario, we assume Rayleigh fading channels in our system. At the source, the transmitter uses QPSK as the modulation scheme. At the relay, DF uses QPSK as the modulation scheme, while CF uses BPSK, QPSK, or 16QAM. Since a two-bit quantizer is used in CF, the number of transmitted bits at the relay is doubled. We use DC at the destination. As a result, if BPSK or QPSK is used, the data rate of CF is higher than that of DF. However, if 16QAM is used, the data rate for CF is then the same as