2 Double Talk Detector For Acoustic Echo Canceller
3.1 Cross-correlation DTD in echo path change and double talk
One of the problems with a double talk detector (DTD) is that it is difficult to distinguish an echo path change [3] from double talk (DT). This distinction is important because the adaptive filter coefficients should be updated continuously during an echo path change but not during double talk periods. In particular, when there is an abrupt echo path change (EPC) in the near-end room, an adaptive filter with a fast rate of convergence is required to track it. It is therefore necessary for a DTD to distinguish between DT and EPC in order to obtain appropriate tracking performance from the adaptive filter. In both cases, the misadjustment of the adaptive filter, and thus the error signal, increases drastically. Thus, the error signal alone cannot be used as a DTD, since it cannot distinguish between these two events.
For example, in Fig 3.1, the cross-correlation $\rho_{d,\hat{y}}(k)$ decreases during the double talk period. However, $\rho_{d,\hat{y}}(k)$ also decreases when an echo path change is present. In Fig 3.1, double talk occurs from 1 to 1.5 on the iteration axis (in units of $10^4$), and the echo path change occurs at 2.2. The cross-correlation double talk detector therefore cannot distinguish between DT and EPC. The variation of $\rho_{d,\hat{y}}(k)$ has four cases, which the conventional cross-correlation DTD cannot tell apart, so it is not robust.
Fig 3.1 Variation of $\rho_{d,\hat{y}}$ between DT and EPC (cross correlation versus iteration, ×10^4; DT and EPC intervals marked)
In the next section, we propose the modified cross-correlation double talk detector, which can distinguish the four cases. Whatever the level of double talk or echo path change, the modified DTD detects it correctly.
3.2 The modified cross-correlation double talk detector
In Section 3.1, we saw that the cross-correlation DTD has difficulty differentiating between double talk and echo path change. In this section, we propose the modified cross-correlation DTD. Although the conventional cross-correlation DTD can handle one situation, in the general case it confuses DT with EPC.
In Table 3.1, we consider four cases of the cross-correlation, depending on the near-end signal energy and the degree of the echo path change. A small EPC means that the new channel is close to the original channel. A large DT means that the near-end speech energy is large.
Table 3.1 thus gives four typical cases of double talk and echo path change, but the conventional cross-correlation DTD works in only one of them. To make the detector robust, we extend the cross-correlation DTD by incorporating the microphone energy. Fig 3.2 shows the structure of the modified cross-correlation DTD.
Fig 3.2 Structure of the modified cross-correlation DTD including the microphone energy
The modified cross-correlation DTD includes a microphone energy detector. The energy detector can detect double talk and helps the cross-correlation DTD make the correct decision. The energy detector is similar to the Geigel double talk detector. With it, the modified cross-correlation DTD can correctly detect the four typical cases. Using the microphone energy has a drawback, however: when the near-end speech energy is small, the modified cross-correlation DTD has difficulty detecting double talk. This drawback is shared with the Geigel DTD. Unlike the modified DTD, however, the Geigel DTD also has difficulty detecting an echo path change.
The flow chart of the modified cross-correlation DTD algorithm is given in Fig. 3.3.
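Fig 3.3 Flow chart of the modified cross-correlation DTD algorithm using the microphone energy (inputs $d(k)$, $x(k)$ and $\hat{y}(k)$; $\rho_{d,\hat{y}}(k)$ is computed, followed by two threshold tests, each with a "> T" and a "<" branch)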
First, we use the correlation $\rho_{d,\hat{y}}(k)$ in (2.1.4) to decide between single talk and double talk (or echo path change): double talk or an echo path change is present when the correlation is smaller than some threshold. Second, we use the microphone energy to detect double talk. For simplicity, the microphone energy detector in (2.3.2) is written as
$P_\alpha(k) = \dfrac{P_d(k)}{P_x(k)}$
The microphone energy detector is in fact a Geigel DTD with smoothed microphone and far-end signal energies. If $P_\alpha(k)$ is larger than the threshold, we decide double talk. Once double talk is declared, the detection is held for a minimum period of time. If $P_\alpha(k)$ is smaller than the threshold, we decide echo path change.
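To illustrate the energy detector described above, the following is a minimal Python sketch, not the exact form of (2.3.2). It assumes that the microphone and far-end powers are smoothed exponentially with a factor `lam`, and that the hold period is counted in samples. The names `energy_detector`, `lam`, `hold` and `eps` are hypothetical and do not come from the text.

```python
def energy_detector(d, x, threshold, lam=0.9, hold=3, eps=1e-12):
    """Return one boolean per sample: True where double talk is declared.

    d, x      : microphone and far-end samples (same length)
    threshold : decision threshold on P_alpha(k) = P_d(k) / P_x(k)
    lam       : exponential smoothing factor (assumed form of the smoothing)
    hold      : number of samples a DT decision is kept, including the
                sample that triggered it (assumed unit)
    """
    p_d = 0.0
    p_x = 0.0
    hold_left = 0
    decisions = []
    for dk, xk in zip(d, x):
        # Smoothed power estimates of the microphone and far-end signals.
        p_d = lam * p_d + (1.0 - lam) * dk * dk
        p_x = lam * p_x + (1.0 - lam) * xk * xk
        p_alpha = p_d / (p_x + eps)  # eps avoids division by zero
        if p_alpha > threshold:
            hold_left = hold  # (re)start the hold period
        decisions.append(hold_left > 0)
        if hold_left > 0:
            hold_left -= 1
    return decisions
```

With `lam=0.0` the smoothing is switched off, which makes the hold behaviour easy to check by hand: a single loud microphone sample yields `hold` consecutive double talk decisions.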
With the two detectors combined, we can detect all situations and decide between double talk and echo path change with more confidence. The cross-correlation DTD can detect double talk even when the near-end speech energy is small, whereas the Geigel DTD cannot.
In this way, we make the cross-correlation DTD more robust. In Chapter 4, simulations of all cases are performed to verify the effectiveness of the modified cross-correlation DTD.
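The two-stage decision of this section can be sketched as follows. This is only an illustration under stated assumptions: the correlation and the energy ratio are taken as already computed, and the names `classify`, `t_rho` and `t_energy` as well as the labels "ST", "DT" and "EPC" are hypothetical.

```python
def classify(rho, p_alpha, t_rho, t_energy):
    """Classify one sample as single talk, double talk or echo path change.

    rho      : cross-correlation between microphone and estimated echo
    p_alpha  : smoothed microphone / far-end energy ratio
    t_rho    : threshold for the correlation test (first stage)
    t_energy : threshold for the energy test (second stage)
    """
    if rho >= t_rho:
        return "ST"   # correlation is high: single talk, keep adapting
    if p_alpha > t_energy:
        return "DT"   # low correlation and high microphone energy
    return "EPC"      # low correlation but no excess microphone energy


def should_adapt(state):
    """The filter is frozen only during double talk."""
    return state != "DT"
```

The design choice is that the correlation test runs first and the energy test only resolves the ambiguity between DT and EPC, so the filter keeps adapting in both the single talk and the echo path change cases.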
3.3 Evaluating DTD in echo path change and double talk
In Section 2.2, we described the objective technique proposed by Morgan [8] for evaluating double talk detectors. That technique, however, only calculates the miss probability in double talk. Since the DTD should decide between double talk and echo path change, we can also calculate the miss probability when an echo path change is present. This section introduces a method for calculating the miss probability in the echo path change period. We can then combine [8] with our technique to calculate the miss probability in both double talk and echo path change.
3.3.1 A technique for evaluating the DTD in echo path change
Before introducing the technique for evaluating the DTD in echo path change, we discuss the echo path change itself. In [8], the miss probability is calculated for different levels of the near-end signal energy. For the echo path change, however, we must calculate the miss probability for different echo paths. We define the parameter $\rho_h$ to quantify the degree to which the echo path changes. The channel correlation $\rho_h$ is defined as follows.
$\rho_h = \dfrac{\hat{h}^T h_c}{\|\hat{h}\|_2 \, \|h_c\|_2}$ (3.3.1)
where $h_c$ is the new (changed) channel and $h$ is the original channel; assuming the filter has converged, $\hat{h} \approx h$.
A small channel correlation means that the changed echo path $h_c$ is very different from the original channel $h$. The detector can easily detect an echo path change with a small $\rho_h$. The channel correlation $\rho_h$ lies between -1 and 1. In Section 2.2, the miss probability was used for double talk. In this section, we also calculate a miss probability for the echo path change. To avoid confusing the two, we call the miss probability in echo path change the change miss probability $P_{cm}$. Similarly, the false alarm probability in echo path change is called the change false alarm probability $P_{cf}$.
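As a small numerical illustration of the channel correlation in (3.3.1), the sketch below computes the normalized inner product of two impulse responses in plain Python. The function name `channel_correlation` and the example vectors are hypothetical.

```python
import math


def channel_correlation(h_hat, h_c):
    """Normalized inner product of the converged filter and the new channel."""
    if len(h_hat) != len(h_c):
        raise ValueError("both impulse responses must have the same length")
    dot = sum(a * b for a, b in zip(h_hat, h_c))
    norm = math.sqrt(sum(a * a for a in h_hat)) * math.sqrt(sum(b * b for b in h_c))
    if norm == 0.0:
        raise ValueError("impulse responses must be non-zero")
    return dot / norm


# An unchanged channel gives a correlation of (approximately) 1,
# a sign-inverted channel gives -1, and orthogonal channels give 0.
h = [1.0, 2.0, 3.0]
same = channel_correlation(h, h)
inverted = channel_correlation(h, [-v for v in h])
orthogonal = channel_correlation([1.0, 0.0], [0.0, 1.0])
```

By the Cauchy-Schwarz inequality the result always lies between -1 and 1, in agreement with the range stated above.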
Fig 3.4 Method of calculating the false alarm probability
Now, we introduce the technique for evaluating the DTD in echo path change. First, we calculate the change false alarm probability $P_{cf}$ using the count in Fig 3.4.
$P_{cf} = P(\text{EPC is detected} \mid \text{EPC does not happen})$
The change false alarm probability is
$P_{cf} = \dfrac{\phi}{N}$ (3.3.2)
where $N$ is the trial length and $\phi$ is the number of times the detector makes an erroneous decision.
We find the threshold under fixed false alarm probability from (3.3.2). This step is equal to
In the second step, we let the EPC happen in range C. With this procedure, we can calculate the change miss probability in EPC. The same technique can be used to calculate the miss probability for different levels of echo path change.
The complete DTD evaluation technique is summarized as follows.
1) a) Select threshold T.
   b) Compute the false alarm probability $P_{cf}$ from (3.3.2).
   c) Repeat steps a) and b) over a range of threshold values.
   d) Select the threshold value that corresponds to $P_{cf} = 0.1$.
2) Calculate echo path change: let the EPC happen in range C.
3) Select a channel correlation value.
   a) The DTD algorithm decides the echo path change range $\phi$.
   b) Compute $P_m$ from (3.2.2).
   c) Repeat steps a) and b) over all conditions.
4) Repeat step 2 and step 3 over a range of channel correlation values.
5) Plot the average $P_m$ as a function of the channel correlation.
Table 3.2 DTD evaluation procedure in case of echo path change
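The procedure in Table 3.2 can be sketched in Python as follows. This is a simplified illustration, not the evaluation code used for the simulations. It assumes that the detector declares an EPC whenever its decision statistic falls below the threshold, that the statistic has been recorded separately for trials without EPC and for trials inside range C, and that the miss probability is the plain fraction of missed samples. All function and variable names are hypothetical.

```python
def false_alarm_probability(stats_no_epc, threshold):
    """P_cf = phi / N, with phi the number of wrong 'EPC' decisions in N trials."""
    phi = sum(1 for s in stats_no_epc if s < threshold)
    return phi / len(stats_no_epc)


def select_threshold(stats_no_epc, candidates, target=0.1):
    """Step 1: pick the candidate threshold whose P_cf is closest to the target."""
    return min(
        candidates,
        key=lambda t: abs(false_alarm_probability(stats_no_epc, t) - target),
    )


def change_miss_probability(stats_epc, threshold):
    """Step 3: fraction of samples in range C where the EPC is not detected."""
    detected = sum(1 for s in stats_epc if s < threshold)
    return 1.0 - detected / len(stats_epc)


# Illustrative numbers only (not simulation results).
no_epc = [0.99, 0.97, 0.96, 0.98, 0.95, 0.99, 0.98, 0.97, 0.99, 0.90]
threshold = select_threshold(no_epc, [0.85, 0.93, 0.96, 0.995])
p_cm = change_miss_probability([0.5, 0.7, 0.94, 0.92], threshold)
```

Steps 4 and 5 then amount to repeating the last call for each channel correlation value and plotting the averaged result.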
3.3.2 Evaluating DTD miss probability in echo path change and double talk
In Section 3.3.1, we introduced a technique for calculating the miss probability when an echo path change is present. We now combine the technique of Section 2.2 with the method of Section 3.3.1. The combined technique can calculate the miss probability in both the double talk and the echo path change periods.
Fig 3.6 A flow chart of the calculated miss probabilities for DTD and EPC
As shown in Fig 3.6, steps 1 and 2 of the combined technique are the same as the method in [8]; these first two steps calculate the miss probability in the double talk period. Steps 3 and 4 of the combined technique were introduced in Section 3.3.1; these last two steps calculate the miss probability when an echo path change is present.
3.3.3 Weighted miss probability
So far, every detection has carried the same weight in the miss probability. However, equal weighting is not fair, so we modify the weights. The weighted miss probability is fairer and more robust. We use the residual error power or the near-end signal energy to set the different weights.
In DT period, we calculate the miss probability from (2.2.2). The miss probability
The variable $\Phi$ is binary: it is one when the DTD detects correctly and zero when the DTD detects wrongly. Every correct detection therefore carries the same weight in the miss probability.
Therefore, we modify the weight. For example, the weight is 0.6 when the near-end energy is greater than 5 dB.
When the near-end signal energy is large, the DTD can decide DT more easily, so we make the weight smaller. Conversely, it is harder for the DTD to decide DT when the near-end signal energy is small, so there the weight is larger than 1. With these modified weights, the miss probability is fairer and more robust.
Likewise, we can modify the weight when an echo path change is present. In Section 3.3.1, we proposed a technique for calculating the miss probability in echo path change. We calculate
$P_m = 1 - \dfrac{O_d \cdot O_n}{O_n \cdot O_n}$
The variable $O_d$ equals one when the DTD detects correctly, so again every point has the same weight. Therefore, we modify the weight according to the residual error power. In Section 3.3.1, we also proposed $\rho_h$ to judge the level of the echo path change. When $\rho_h$ is large, the new channel is close to the original channel, so the residual error power is small and it is harder for the DTD to detect correctly. Conversely, it is easier for the DTD to detect correctly when $\rho_h$ is small. The modified weight is as follows.
$O_d = \begin{cases} 0.7, & e^2(k) > 0\ \text{dB} \\ 1, & -10\ \text{dB} \le e^2(k) \le 0\ \text{dB} \\ 1.6, & e^2(k) < -10\ \text{dB} \end{cases}$ (3.3.5)
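The weighting in (3.3.5) can be sketched in Python as follows. The weight values and the dB boundaries are those of (3.3.5). Everything else is an assumption made for illustration: the residual error power is given in dB per sample, $O_n$ is taken to be an all-ones indicator over the samples where the EPC is present, and the names `epc_weight` and `weighted_miss_probability` are hypothetical.

```python
def epc_weight(err_power_db):
    """Weight of a correct detection as a function of e^2(k) in dB, after (3.3.5)."""
    if err_power_db > 0.0:
        return 0.7   # large residual error: detection is easy, smaller weight
    if err_power_db >= -10.0:
        return 1.0
    return 1.6       # small residual error: detection is hard, larger weight


def weighted_miss_probability(correct, err_power_db):
    """P_m = 1 - (O_d . O_n) / (O_n . O_n) with weighted O_d.

    correct      : booleans, True where the DTD decision was correct
    err_power_db : residual error power e^2(k) in dB for the same samples
    """
    o_n = [1.0] * len(correct)  # assumed indicator of the EPC samples
    o_d = [epc_weight(e) if c else 0.0 for c, e in zip(correct, err_power_db)]
    num = sum(a * b for a, b in zip(o_d, o_n))
    den = sum(b * b for b in o_n)
    return 1.0 - num / den
```

Note that, under these assumptions, the weighted value is no longer confined to the interval from 0 to 1, because the weights can exceed one; it is best read as a relative score for comparing detectors.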
3.4 Variant threshold
In Section 2.1, we introduced the cross-correlation DTD. If $\rho_{d,\hat{y}}(k)$ is smaller than the threshold, the detector declares double talk. So far we have used only a fixed threshold, which has a drawback. In Fig 3.7, double talk is present from 1 to 1.5 on the iteration axis (in units of $10^4$), and the fixed threshold is set to 0.95. When $\rho_{d,\hat{y}}(k)$ falls below the threshold, the detector declares double talk, and the correlation decreases to 0.5. When double talk is over, the correlation increases back to 0.95. Only once $\rho_{d,\hat{y}}(k)$ is larger than the threshold does the filter continue adapting. A better detector would let the filter continue adapting as soon as double talk is over.
To improve the detector, we use a variant threshold.
Fig 3.7 The cross correlation $\rho_{d,\hat{y}}$ under fixed threshold (cross correlation versus iteration, ×10^4; DT interval and threshold marked)
Now, we use a variant threshold, as in Fig 3.8. Once the detector decides that double talk is present, we set a new threshold that is larger than the correlation in the DT period.
With the new threshold, the detector can decide more quickly that double talk is over.
The detector then decides single talk, and the threshold is reset. This means that the double talk detector can again quickly detect double talk when it reappears.
Fig 3.8 The cross correlation $\rho_{d,\hat{y}}$ under variant threshold (cross correlation versus iteration, ×10^4; DT interval and Threshold(1), Threshold(2), Threshold(3) marked)
In Section 2.4, we analyzed the theoretical cross-correlation under both double talk and echo path change. From (2.4.4) and (2.4.5), the theoretical correlation gives the exact value in any double talk case. If the detector decides that double talk is present, we use (2.4.5) to recalculate the threshold.
However, (2.4.5) is not practical, so we rewrite it in a simpler form.
,ˆ 2 2 threshold in double talk period. The new threshold is calculated as follows.
Threshold = filter is safer and not diverges. The variant threshold in cross correlation DTD algorithm is from Fig 3.9.
Suppose the filter has converged and double talk suddenly appears. The correlation decreases and falls below the threshold, so the DTD detects the double talk and freezes the update.
We then calculate the new threshold by (3.4.2). With the new threshold, we can find more quickly that the double talk is over, so the performance is better than with the fixed threshold. Once the correlation exceeds the original threshold, the threshold is reset to the original threshold.
Fig. 3.9 Algorithm of the variant threshold selection in cross correlation DTD
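The threshold switching described in this section can be sketched as a small state machine. This is only an illustration: the value of the threshold used during double talk, which the text obtains from (3.4.2), is passed in as the hypothetical parameter `t_dt` rather than computed, and the function name `variant_threshold_dtd` is also hypothetical.

```python
def variant_threshold_dtd(rho_seq, t_origin, t_dt):
    """Return one boolean per sample: True where double talk is declared.

    rho_seq  : sequence of cross-correlation values
    t_origin : original (fixed) threshold, for example 0.95
    t_dt     : threshold used after double talk has been declared
               (assumed given; the text computes it from (3.4.2))
    """
    threshold = t_origin
    decisions = []
    for rho in rho_seq:
        is_dt = rho < threshold
        decisions.append(is_dt)
        if is_dt:
            threshold = t_dt        # double talk declared: switch threshold
        elif rho > t_origin:
            threshold = t_origin    # correlation has recovered: reset
    return decisions


rho = [0.98, 0.90, 0.50, 0.50, 0.70, 0.90, 0.97, 0.93]
variant = variant_threshold_dtd(rho, t_origin=0.95, t_dt=0.6)
fixed = [r < 0.95 for r in rho]
# variant: [False, True, True, True, False, False, False, True]
# fixed:   [False, True, True, True, True,  True,  False, True]
```

In this toy sequence the variant threshold releases the filter two samples earlier than the fixed threshold once the correlation starts to recover, and double talk is detected again at the last sample after the threshold has been reset.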
Chapter 4
Computer Simulations
In this chapter, we perform computer simulations to verify the results derived previously and compare the simulated and theoretical results. Both white and speech signals are considered.
In Section 4.1, we explain the simulation parameters, such as the echo path impulse response, the speech model and other parameters. In Section 4.2, we compare the DTDs of Section 2.1. In Section 4.3, we illustrate the effectiveness of the robust DTD.
The theoretical and simulated cross-correlations of the DTD are compared in Section 4.4 and shown to be very close to each other. In Section 4.5, we compare the theoretical and simulated miss probabilities.
In Section 4.6, we verify that the modified cross-correlation DTD detects precisely whether double talk or echo path change is present. The technique for evaluating the DTD in echo path change and double talk is simulated in Section 4.7. In Section 4.8, we apply the variant threshold to adapt to the real speech environment.
4.1 Simulation parameters and room impulse response
The echo impulse response $h(k)$ is shown in Fig 4.1. Fig 4.2 shows the far-end signal and the near-end speech, where DT occurs from 10k to 15k. The following speech simulations use the signals in Fig 4.2.
4.2 Comparison of four typical double talk detectors in double talk
In Section 2.2, we explained how the technique in [8] calculates the miss probability in the double talk period. Here we let the step size $\mu = 0.2$ and the forgetting factor $\lambda = 0.01$, and $\mathrm{NFR} = \sigma_v^2 / \sigma_x^2$ denotes the energy ratio of the near-end signal to the echo signal.
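When the miss probability is evaluated over a range of NFR values, the near-end signal has to be scaled to each target ratio. A minimal sketch of that scaling is given below. It assumes zero-mean signals, so that the mean power stands in for the variance, and it expresses the NFR in dB; the names `nfr_db`, `scale_to_nfr`, `near` and `ref` are hypothetical.

```python
import math


def nfr_db(near, ref):
    """NFR in dB: mean power of the near-end signal over that of the reference."""
    p_v = sum(v * v for v in near) / len(near)
    p_x = sum(x * x for x in ref) / len(ref)
    return 10.0 * math.log10(p_v / p_x)


def scale_to_nfr(near, ref, target_db):
    """Scale the near-end signal so that its NFR equals target_db."""
    gain = 10.0 ** ((target_db - nfr_db(near, ref)) / 20.0)
    return [gain * v for v in near]


ref = [1.0, -1.0, 1.0, -1.0]
near = [2.0, -2.0, 2.0, -2.0]
scaled = scale_to_nfr(near, ref, target_db=-10.0)
```

The gain is applied to the amplitude, which is why the dB difference is divided by 20 rather than 10.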
From Fig 4.3, the mic/AEC correlation DTD performs better than the other DTDs. The two echo path model can likewise freeze adaptation during the double talk period, but Fig 4.3 shows that its performance is poor. Nevertheless, the two echo path model is a good choice for implementation in a real environment because of its excellent stability. The gradient correlation double talk detector is most effective when the near-end signal energy is very small. The miss probability decreases as the NFR increases.
Fig 4.3 Comparison of DTDs miss probability in different near end signal energies
4.3 Robust DTD
In Section 2.3.1, we introduced the robust Geigel DTD. The buffer range, shown in Fig 2.8, is set by the PDF of $P_\alpha$. From Fig 4.4, the robust Geigel DTD is better than the conventional Geigel DTD: in the DT period, the ERLE of the robust DTD is about 3 dB better than that of the conventional one. The simulation shows that the buffer range makes the Geigel DTD more robust.
Fig 4.4 Comparison of the robust and conventional Geigel DTD (ERLE versus iteration, ×10^4; curves: Geigel DTD, robust Geigel DTD, no DTD; DT interval marked)
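For reference, the ERLE plotted in such comparisons can be computed as sketched below. The sketch assumes the usual definition of the echo return loss enhancement, the ratio in dB of the microphone signal power to the residual error power over a block of samples, since the definition is not restated in this chapter. The name `erle_db` and the guard `eps` are hypothetical.

```python
import math


def erle_db(d, e, eps=1e-12):
    """ERLE in dB over one block: 10 * log10(mean(d^2) / mean(e^2)).

    d : microphone samples in the block
    e : residual error samples in the same block
    """
    p_d = sum(v * v for v in d) / len(d)
    p_e = sum(v * v for v in e) / len(e)
    return 10.0 * math.log10((p_d + eps) / (p_e + eps))


# A residual with one tenth of the microphone amplitude corresponds
# to an ERLE of about 20 dB.
example = erle_db([1.0] * 4, [0.1] * 4)
```

A curve like the one in Fig 4.4 would be obtained by evaluating such a function on successive blocks of the simulation.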
In Section 2.3.2, we also introduced the robust cross-correlation DTD, with the different buffer ranges shown in Fig 2.9. In this section, we simulate these different ranges.
From Fig 4.5, the wide buffer range is better than both the narrow buffer range and the conventional DTD. However, the robust cross-correlation DTD performs only slightly better than the conventional one.
Fig 4.5 Comparison of the robust and conventional cross-correlation DTD (ERLE versus iteration, ×10^4; curves: wide range DTD, conventional DTD, narrow range DTD; DT interval marked)
4.4 Analysis of the cross-correlation DTD
4.4.1 Analysis of the microphone signal and the estimated signal correlation DTD
In Section 2.4, we derived the theoretical cross-correlation $\rho_{d,\hat{y}}$ of the mic/AEC correlation DTD. Here we assume that the near-end signal energy is equal to the far-end signal energy.
By (2.4.5), the theoretical correlation $\rho_{d,\hat{y}}$ depends on the near-end speech energy, and the simulation is very close to the theoretical correlation in the DT period. In the other case, when the echo path change occurs at 22k, the theoretical correlation $\rho_{d,\hat{y}}$ is equal to the channel correlation in (2.4.9). When EPC is present, the correlation $\rho_{d,\hat{y}}$ is related to the correlation between the original and new channels. The simulation is again very close to the theoretical value in EPC.
Fig 4.6 Simulated and theoretical mic/AEC correlation $\rho_{d,\hat{y}}$ in DT and EPC (cross correlation versus iteration, ×10^4; curves: simulation, theoretical; DT and EPC intervals marked)
4.4.2 Analysis of the microphone signal and the error signal correlation DTD
Fig 4.7 Simulated and theoretical mic/error correlation $\rho_{d,e}$ in DT and EPC (mic/error correlation versus iteration, ×10^4; curves: simulated, theoretical; DT and EPC intervals marked)
From Fig 4.7, the simulation is again very close to the theoretical value.
We add the near-end signal from 10k to 15k, and the near-end signal energy is 1. From (2.4.15), the theoretical value depends on the near-end speech energy in the double talk period. From (2.4.19), the theoretical correlation in EPC depends on the changed channel $h_c$ and on the difference $\Delta h$ between the new channel and the original channel $h$.
4.5 Analysis of the cross-correlation DTD miss probability
In Section 2.5, we studied the miss probability of the cross-correlation DTD. First, we derived the threshold as a function of the false alarm probability in (2.5.5). In Figure 4.8, the DTD threshold under a fixed false alarm probability is close to the theoretical threshold when the false alarm probability $P_f$ is larger than 0.2. Therefore, the theoretical derivation (2.5.5) appears to be correct.
Next, we derive the miss probability under a fixed false alarm probability.
From Figure 4.8, as the false alarm probability becomes progressively smaller, the threshold also becomes smaller. This result is intuitive. Because (2.5.3) does not include the near-end signal, the correlation is very close to 1, and hence the threshold is also close to 1. But when the threshold moves further away from 1, the detector makes frequent errors in the double talk period. Therefore, the false alarm probability is larger.
From Fig 4.9 and 4.10, the miss probability $P_m$ decreases as the false alarm probability $P_f$ increases. In Figure 4.11, the larger the near-end signal energy, the smaller the miss probability. However, the simulated curve is not close to the theoretical curve, especially when the NFR is below -5 dB. This may be due to our assumption that the random variable $r = n^2 + v^2$ is chi-square distributed. When the near-end signal energy is small, it is mixed with noise, and the mixed signal is not chi-square distributed. When the near-end signal energy is large, it may well be assumed to have a chi-square distribution. When the miss probability is larger than 0.2, the cost of the detector is very high. Therefore, we focus on small miss probabilities, where the theory is close to the simulation.
Fig 4.9 Simulated and theoretical cross correlation DTD miss probability under fixed false
Fig 4.11 Simulated and theoretical cross correlation DTD miss probability
4.6 Nonlinear effect for cross-correlation DTD
In Section 2.6, we derived the theoretical correlation $\rho_{d,\hat{y}}$ for a nonlinear loudspeaker. From Fig 4.12, $\rho_{d,\hat{y}}$ in both single talk and DT is decreased by the nonlinearity. The theoretical correlation $\rho_{d,\hat{y}}$ is close to the simulated one in single talk. In the DT period, however, the theoretical correlation is not close to the simulated one, because we approximate the correlation $\rho_{d,\hat{y}}$ in (2.6.4).