CHAPTER 3 ACOUSTIC FEEDBACK MODEL CONSTRUCTION &
3.2 Simulated Results
3.2.2 Algorithms compared
In this section, we show that our algorithm's echo cancellation performance is comparable to that of the other algorithms on the model introduced in Section 3.2.1.
Fig. 27 shows the learning curves of the LMS, DLMS, NLMS, and our P2SPT algorithms.
Fig 27. Learning curves of four algorithms
Our input signal is random white noise between -1 and 1. We also add 10% noise to the input signal. All algorithms use 32 taps, and the step size of both the LMS and the DLMS algorithm is 0.1. The channel to be learned is the echo channel introduced in Section 3.1.
Comparing the convergence speed of the algorithms in Fig. 27, the order from fastest to slowest is:
NLMS > P2SPT > LMS > DLMS.
Comparing the precision of the algorithms, the order from best to worst is:
NLMS > LMS > DLMS > P2SPT.
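The kind of experiment behind such learning curves can be sketched as follows. This is a minimal illustration, not the simulation used for Fig. 27: the echo path `h` is a hypothetical decaying random channel (not the Section 3.1 channel), the 10% noise is assumed to be additive on the observed echo, the NLMS step size of 0.5 is our assumption, and only LMS and NLMS are shown (DLMS and P2SPT are omitted).

```python
import math
import random

def adapt(x, d, taps=32, mu=0.1, normalized=False, eps=1e-8):
    """Run LMS (or NLMS when normalized=True); return (weights, errors)."""
    w = [0.0] * taps
    buf = [0.0] * taps                      # buf[k] holds x[n-k]
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]               # newest sample first
        y = sum(wk * bk for wk, bk in zip(w, buf))
        e = dn - y                          # a-priori estimation error
        energy = sum(b * b for b in buf)
        step = mu / (eps + energy) if normalized else mu
        w = [wk + step * e * bk for wk, bk in zip(w, buf)]
        errors.append(e)
    return w, errors

rng = random.Random(0)
N_ITER, TAPS = 2000, 32
# White noise input between -1 and 1, as described in the text.
x = [rng.uniform(-1.0, 1.0) for _ in range(N_ITER)]
# Hypothetical echo path (assumption): decaying random taps.
h = [0.5 * rng.gauss(0.0, 1.0) * math.exp(-0.2 * k) for k in range(TAPS)]
# Observed echo plus additive noise at 10% amplitude (assumed placement).
d = [sum(h[k] * x[n - k] for k in range(TAPS) if n - k >= 0)
     + 0.1 * rng.uniform(-1.0, 1.0)
     for n in range(N_ITER)]

w_lms, e_lms = adapt(x, d, TAPS, mu=0.1)
w_nlms, e_nlms = adapt(x, d, TAPS, mu=0.5, normalized=True)

def mean_sq(e):
    """Mean squared value of a list of errors."""
    return sum(v * v for v in e) / len(e)

print("LMS  MSE first/last 200:", mean_sq(e_lms[:200]), mean_sq(e_lms[-200:]))
print("NLMS MSE first/last 200:", mean_sq(e_nlms[:200]), mean_sq(e_nlms[-200:]))
```

Plotting a moving average of the squared errors against the iteration index gives a learning curve of the same form as Fig. 27.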
However, what we want to compare is the performance of the echo cancellers built on these algorithms, not the raw filter speed or precision. Comparing the speed or precision of the filters alone cannot give the result we are looking for. Only a comparison that uses real voice signals with real echo noise reflects the actual echo cancellation performance. Therefore, we use the model introduced in Section 3.2.1 to verify the echo cancellation performance of all the algorithms.
We first show the LMS algorithm, then the DLMS and NLMS algorithms. Finally, we show the performance of our P2SPT algorithm and the performance without any echo canceller.
LMS algorithm:
Fig. 28 presents four plots of the LMS algorithm's performance. The original input signal is in the lower left-hand corner and the system output is in the lower right-hand corner. The echo canceller output is in the upper left, and the MSE between the original input and the system output is in the upper right.
Fig. 29, Fig. 30, Fig. 31, and Fig. 32 follow the same layout.
The MSE results of all algorithms are summarized at the end.
Fig. 28 is shown below:
Fig 28. LMS algorithm's performance (MSE panel: mean squared error for system output & original input)
DLMS algorithm:
Fig. 29 is shown below. Our delay parameter is four iterations.
Fig 29. DLMS algorithm's performance (MSE panel: mean squared error for system output & original input)
NLMS algorithm:
Fig. 30 shows the performance of the NLMS algorithm:
Fig 30. NLMS algorithm's performance (MSE panel: mean squared error for system output & original input)
P2SPT algorithm:
Fig. 31 shows the performance of the P2SPT algorithm:
Fig 31. P2SPT algorithm's performance (MSE panel: mean squared error for system output & original input)
Using no echo canceller:
Fig. 32 shows the performance without an echo canceller:
Fig 32. Performance with no echo canceller (MSE panel: mean squared error for system output & original input)
Table 2 lists the MSE and SNR of the algorithms.
Table 2. Performance of the algorithms
Algorithm type      NLMS     LMS      DLMS     P2SPT    No canceller
MSE (unit: 10^-4)   0.2297   0.421    0.4174   1.93     13.36
SNR (dB)            32.79    30.16    30.19    23.55    15.14
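The MSE and SNR rows of Table 2 can be cross-checked against each other. The following is a minimal sketch, assuming the SNR is defined as 10*log10(signal power / MSE), which the text does not state explicitly; the helper name `implied_signal_power` is ours. If the assumption holds, every column should imply roughly the same signal power.

```python
# Values copied from Table 2 (MSE given in units of 1e-4, SNR in dB).
mse = {"NLMS": 0.2297e-4, "LMS": 0.421e-4, "DLMS": 0.4174e-4,
       "P2SPT": 1.93e-4, "none": 13.36e-4}
snr_db = {"NLMS": 32.79, "LMS": 30.16, "DLMS": 30.19,
          "P2SPT": 23.55, "none": 15.14}

def implied_signal_power(mse_value, snr_value_db):
    """Signal power implied by SNR = 10*log10(power / MSE) (assumed definition)."""
    return mse_value * 10.0 ** (snr_value_db / 10.0)

powers = {name: implied_signal_power(mse[name], snr_db[name]) for name in mse}
for name, p in powers.items():
    print(f"{name:6s} implied signal power = {p:.5f}")

# SNR improvement of each canceller over the system without a canceller, in dB.
gain_db = {name: snr_db[name] - snr_db["none"] for name in snr_db}
print(gain_db)
```

Under this assumption all five columns imply a signal power close to 0.044, so the two rows of the table are mutually consistent.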
By these measures, our algorithm does not perform better than the others. To the listener, however, every algorithm cancels the echo; only the system without an echo canceller does not.
In fact, human hearing does not perceive echo noise that stays under the allowed range [32]. We use the spectrograms below to support this point.
Spectrograms comparison:
Fig. 33 shows the original input waveform in the upper left-hand corner and the spectrogram of the original input in the upper right-hand corner. The lower left shows the spectrogram of the system output without an echo canceller, and the lower right shows the spectrogram of the P2SPT system output.
Fig 33. (a) Original input wave. The spectrograms of (b) original input (c) system output without echo canceller (d) P2SPT system output
Compared with the original input, the system output without an echo canceller and the P2SPT system output clearly differ: high-frequency energy appears in the system output without an echo canceller. Next, we compare our P2SPT system output with the outputs of the other algorithms and analyze the spectrograms to show that there is no difference in the user's experience.
Fig. 34 shows the spectrogram of the NLMS system output in the upper left-hand corner and that of the LMS system output in the upper right-hand corner. The lower left shows the spectrogram of the DLMS system output, and the lower right shows that of the P2SPT system output.
Fig 34. The spectrograms of (a) NLMS system output (b) LMS system output (c) DLMS system output (d) P2SPT system output
In Fig. 34, there is no evidence that our algorithm performs worse than the others. Our algorithm adds only a small amount of high-frequency energy, which has no effect on the user's hearing experience.
3.2.3 Performance on human’s voice
In this section, we show that our algorithm achieves good echo cancellation performance on human voice input. We use the same model to verify the algorithm's performance in the voice tests.
Our real goal is to reduce the power consumption of the echo canceller, so comparing speed or precision is not our main concern. For that reason, we only show that the echo noise is eliminated in the spectrogram and waveform graphs.
First, we present tests on four types of human voice. The stability on a long voice signal and the effect of changes in the forward path are verified at the end.
The four voice types in the test are:
• Man's voice.
• Woman's voice.
• Boy's voice.
• Girl's voice.
For each type, we show the echo cancellation performance of our P2SPT algorithm in spectrogram and waveform graphs.
The four types produce different echo noise responses. In this test, the boy's voice has the lowest frequency and the woman's voice has the highest. This lets us show that our algorithm is genuinely useful for echo cancellation in hearing aids.
Man’s voice:
First, the man's original signal is shown in Fig. 35. The y-axis ranges from -1 to 1 and the x-axis from 0 to 12000 (unit: iterations).
Fig 35. Waveform of man's original signal
The echo canceller output is shown in Fig. 36. (12-bit representation: a value of 2048 means 1.)
Fig 36. Waveform of man's echo canceller output signal
With the echo canceller working, the system output is shown in Fig. 37.
Fig 37. Waveform of man's system output signal
If we turn off the echo canceller, the system output looks like Fig. 38. In this case, the signal contains a bleep that we call echo noise. (The y-axis also ranges from -1 to 1.)
Fig 38. Waveform of man's system output signal without echo canceller
Next, we show the MSE of the system output relative to the original input, with and without the echo canceller. Fig. 39 shows the MSE with the echo canceller. (The y-axis ranges from 0 to 2.)
Fig 39. MSE (with echo canceller) of man's testing
Fig. 40 shows the MSE (without echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 2.)
Fig 40. MSE (without echo canceller) of man's testing
The man's spectrogram results are shown below. In Fig. 43, the energy distribution differs from that of Fig. 41. On the other hand, the spectrogram in Fig. 42 is similar to Fig. 41.
Fig 41. Spectrograms of man’s original signal
Fig 42. Spectrograms of man’s system output signal
Fig 43. Spectrograms of man’s system output signal without echo canceller
Woman’s voice:
As in the man's voice test, the woman's original signal is shown in Fig. 44. The y-axis ranges from -1 to 1 and the x-axis from 0 to 18000 (unit: iterations).
Fig 44. Waveform of woman's original signal
The echo canceller output is shown in Fig. 45. (12-bit representation: a value of 2048 means 1.)
Fig 45. Waveform of woman's echo canceller output signal
With the echo canceller working, the system output is shown in Fig. 46.
Fig 46. Waveform of woman's system output signal
With the echo canceller turned off, the system output looks like Fig. 47. (The y-axis also ranges from -1 to 1.)
Fig 47. Waveform of woman's system output signal without echo canceller
Fig. 48 shows the MSE (with echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.4.)
Fig 48. MSE (with echo canceller) of woman's testing
Fig. 49 shows the MSE (without echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.4.)
Fig 49. MSE (without echo canceller) of woman's testing
The woman's spectrogram results are shown below. In Fig. 52, the energy distribution again differs from that of Fig. 50, while the spectrogram in Fig. 51 is similar to Fig. 50.
Fig 50. Spectrograms of woman’s original signal
Fig 51. Spectrograms of woman’s system output signal
Fig 52. Spectrograms of woman’s system output signal without echo canceller
Boy’s voice:
The boy's original signal is shown in Fig. 53. The y-axis ranges from -1 to 1 and the x-axis from 0 to 20000 (unit: iterations).
Fig 53. Waveform of boy's original signal
The echo canceller output is shown in Fig. 54. (12-bit representation: a value of 2048 means 1.)
Fig 54. Waveform of boy's echo canceller output signal
With the echo canceller working, the system output is shown in Fig. 55.
Fig 55. Waveform of boy's system output signal
With the echo canceller turned off, the system output looks like Fig. 56. (The y-axis also ranges from -1 to 1.)
Fig 56. Waveform of boy's system output signal without echo canceller
Fig. 57 shows the MSE (with echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.35.)
Fig 57. MSE (with echo canceller) of boy's testing
Fig. 58 shows the MSE (without echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.35.)
Fig 58. MSE (without echo canceller) of boy's testing
The boy's spectrogram results are shown below. In Fig. 61, the energy distribution differs from that of Fig. 59, while the spectrogram in Fig. 60 is similar to Fig. 59.
Fig 59. Spectrograms of boy’s original signal
Fig 60. Spectrograms of boy’s system output signal
Fig 61. Spectrograms of boy’s system output signal without echo canceller
Girl’s voice:
The girl's original signal is shown in Fig. 62. The y-axis ranges from -1 to 1 and the x-axis from 0 to 15000 (unit: iterations).
Fig 62. Waveform of girl's original signal
The echo canceller output is shown in Fig. 63. (12-bit representation: a value of 2048 means 1.)
Fig 63. Waveform of girl's echo canceller output signal
With the echo canceller working, the system output is shown in Fig. 64.
Fig 64. Waveform of girl's system output signal
With the echo canceller turned off, the system output looks like Fig. 65. (The y-axis also ranges from -1 to 1.)
Fig 65. Waveform of girl's system output signal without echo canceller
Fig. 66 shows the MSE (with echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.25.)
Fig 66. MSE (with echo canceller) of girl's testing
Fig. 67 shows the MSE (without echo canceller) of the system output relative to the original input. (The y-axis ranges from 0 to 0.25.)
Fig 67. MSE (without echo canceller) of girl's testing
The girl's spectrogram results are shown below. In Fig. 70, the energy distribution differs from that of Fig. 68, while the spectrogram in Fig. 69 is similar to Fig. 68.
Fig 68. Spectrograms of girl’s original signal
Fig 69. Spectrograms of girl’s system output signal
Fig 70. Spectrograms of girl’s system output signal without echo canceller
The MSE and SNR results of the voice tests are collected in Table 3.
Table 3. Performance of voice testing
                          man      woman        boy      girl
MSE with echo canceller   0.0026   0.00058188   0.0013   0.00042373
MSE w/o
Table 3 shows that our echo canceller removes most of the echo noise according to the MSE and SNR results. The spectrograms also confirm that our algorithm performs well in echo cancellation.
Therefore, the echo noise does not affect the user's hearing experience while our algorithm is working.
Next, we discuss the long voice test and the effect of changing the forward path delay on our algorithm.
Forward path delays changed:
We now consider the effect of the forward path delay in this model and show that the echo cancellation performance does not depend on it.
Table 4 lists the MSE for different forward path delays, measured with the man's voice signal. The MSE is essentially independent of the forward path delay in iterations, which gives the other components of the hearing aid more design flexibility.
Table 4. MSE for different forward path delays
Delay (iterations)   50       100       150       300
MSE                  0.0026   0.00255   0.00252   0.00264
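To put a number on how little the MSE in Table 4 varies, the sketch below computes the spread of the four values relative to their mean. Only the table values come from the thesis; the spread metric is our own choice for illustration.

```python
# Forward path delays (iterations) and MSE values copied from Table 4.
delays = [50, 100, 150, 300]
mse = [0.0026, 0.00255, 0.00252, 0.00264]

mean_mse = sum(mse) / len(mse)
# Relative spread: (largest - smallest) divided by the mean (illustrative metric).
spread = (max(mse) - min(mse)) / mean_mse

for delay, value in zip(delays, mse):
    print(f"delay {delay:3d}: MSE = {value:.5f}")
print(f"mean MSE = {mean_mse:.7f}, relative spread = {spread:.1%}")
```

The relative spread stays under 5% across delays from 50 to 300 iterations.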
Long voice’s testing:
To test the stability of our algorithm, we use a voice signal of 700000 iterations (almost 44 seconds) and show that our algorithm adapts the window coefficients to keep the echo noise under control.
Fig. 71 shows the system input for our model.
Fig 71. Original long voice signal
Fig. 72 shows the system output for our model.
Fig 72. Long voice system output signal
In Fig. 72, the system output sounds like the original input signal to a human listener.
The key point of this thesis is low power design. Having reached the goal of cancelling the echo noise as perceived by a human listener, we now focus on the power consumption of the actual design.
In the next chapter, Chapter 4 Architecture Designs & Power Reports, we therefore use our P2SPT algorithm to obtain the simplest architecture design.
Chapter 4
Architecture Designs & Power Reports
To reduce the power consumption of the echo canceller, more work is needed in the architecture design. Based on the P2SPT algorithm, we fold the architecture to address the low power requirement. We discuss the reasons for folding in Section 4.1, and we present the power report in Section 4.2 after introducing the architecture of our design.
4.1 Architecture Designs
In this section, we introduce the architecture of our design. Section 4.1.1 explains why we fold the design, and Section 4.1.2 describes the architecture of our echo canceller in detail.
4.1.1 Architecture design’s description
Folding [34] is the key architecture technique in this thesis.
Under our specification and working conditions, folding the design minimizes its power consumption. We explain the reasons for folding and give estimated results at the end of this section.
The reasons for folding are shown in Fig. 73. We use folding to (1) reduce the area for efficient power consumption and (2) make it possible to use an SRAM-type register file. All of these architecture choices serve one purpose: minimizing the power consumption of our echo canceller design.
We fold the design to minimize power consumption under the following working conditions:
• Using a standard library.
• Fixed voltage.
• Very low working frequency.
• Leakage problem in 90 nm processes.
• Simplest architecture.
• Replacement of the shift register.
The key points for using folding are shown in Fig. 73:
Fig 73. Key point for using the folding technique
In Fig. 73, we replace the shift registers with a register file and use an SRAM-type register file in place of D flip-flops. This minimizes the clock loading and the power consumption. In our specification, the sampling rate is 16 kHz. Because the design has a very simple architecture, we fold it to obtain a smaller area and thereby reduce the leakage power, which matters especially in advanced processes [35].
1. Reason for replacing shift registers with a register file:
We first explain why we use a register file in place of shift registers.
As Fig. 74 shows, all input data shift in every iteration. Each register passes its data to the next register in every iteration, so a new input forces all the others to move forward. For instance, if the filter has 32 taps, 31 unnecessary moves occur for every new input.
Fig 74. Diagram of shift registers
In our application, the speech input signal always swings between -1 and 1. A lot of unnecessary power is therefore consumed in the shift registers, especially because we use 2's complement to represent the input data in our hardware design.
Shift registers used to be a popular structure in filter design. Considering the power issue, however, we need a way that does not actually shift all the input data but still keeps the ordering information we need.
We choose a register file to replace the shift register, because each new input simply replaces the oldest input, which is no longer used.
As Fig. 75 shows, the new input replaces the oldest one when a suitable address is supplied.
Fig 75. Diagram of register file
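The difference between the two storage schemes can be illustrated in software. The following is a minimal behavioral sketch, not the hardware design: the class names, the `writes` counter, and the `newest_first` helper are hypothetical and only serve to count how many storage locations change per new sample.

```python
class ShiftRegister:
    """Delay line in which every stored word moves on each new sample."""

    def __init__(self, depth):
        self.data = [0] * depth
        self.writes = 0                      # storage locations written so far

    def push(self, sample):
        # depth-1 moves of old data, followed by one write of the new sample.
        for i in range(len(self.data) - 1, 0, -1):
            self.data[i] = self.data[i - 1]
            self.writes += 1
        self.data[0] = sample
        self.writes += 1

    def newest_first(self):
        return list(self.data)


class RegisterFile:
    """Addressed storage in which the new sample overwrites the oldest one."""

    def __init__(self, depth):
        self.data = [0] * depth
        self.write_addr = 0                  # address of the oldest entry
        self.writes = 0

    def push(self, sample):
        self.data[self.write_addr] = sample  # a single write, nothing moves
        self.write_addr = (self.write_addr + 1) % len(self.data)
        self.writes += 1

    def newest_first(self):
        depth = len(self.data)
        # Read addresses walk backwards from the most recent write.
        return [self.data[(self.write_addr - 1 - k) % depth]
                for k in range(depth)]


shift_reg, reg_file = ShiftRegister(32), RegisterFile(32)
for sample in range(1, 101):                 # 100 dummy input samples
    shift_reg.push(sample)
    reg_file.push(sample)

print(shift_reg.writes, reg_file.writes)
print(shift_reg.newest_first() == reg_file.newest_first())
```

Both structures present the same 32 most recent samples to the filter, but the register file touches one location per sample instead of 32, at the cost of generating read and write addresses.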
Using a register file in the filter design requires the technique we call folding. We can then use an SRAM-type register file to reduce the power consumption and layout area, whereas a shift register can only be built from D flip-flops.
In return, we have to design a unit that controls the addresses and data flows, so address scheduling [36] must be considered.
In our research, replacing the shift register with a register file reduces the power consumption of the register part to 10%. A control unit is then necessary, however, and part of the power consumption moves to this control unit.
2. Estimation of power versus folding:
We estimate the power consumption for different folding factors of our design to find the folding factor with minimum power consumption.
We first describe our working conditions and then show the estimated results. With a sampling rate of 16 kHz, the unfolded clock rate is 16 kHz. If we fold the architecture, the clock rate becomes higher than 16 kHz.
In low power design, speeding up the clock is known to increase the power consumption [37]. In our case, however, the working clock rate is very low, especially because our echo canceller is designed with the simplest structure.
Folding therefore increases the power consumption per unit area but reduces the total area of the design. To capture the relationship between folding and power consumption, we make the following three assumptions:
1. The power consumption of the register part is defined as follows:
Table 5. Definition of the register part's power consumption
Fold   uW   Estimation function
32     68   (AC*freq + DC) * 1.3 V, freq: 512 kHz
SRAM-type register files are only available for folding factors of 8 or more. For smaller folding factors, we return to a shift register built from D flip-flops and assume that the power doubles. The data sheet values of the TSMC_013 SRAM-type register files are as follows:
Table 6. Power information of the register part
TSMC_013 register file   AC current     DC current
SYHD130_8X12X1CM2        0.004 mA/MHz   1.521 uA
SYHD130_16X12X1CM2       0.004 mA/MHz   1.665 uA
SYHD130_32X12X1CM2       0.004 mA/MHz   1.950 uA
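As a worked example of the estimation function (AC*freq + DC) * 1.3 V, the sketch below evaluates it for a single register file macro from Table 6. It assumes that the folded clock rate equals the 16 kHz sampling rate times the folding factor (which gives the 512 kHz quoted for 32-fold) and that a folding factor of N uses the N-word macro; this mapping and the function names are our assumptions. The result is a per-macro figure and is not meant to reproduce the total in Table 5, whose composition is not broken down in the text.

```python
SAMPLE_RATE_HZ = 16_000          # sampling rate from the specification
SUPPLY_V = 1.3                   # supply voltage used in the estimation function
AC_MA_PER_MHZ = 0.004            # AC current of all three macros in Table 6
# DC current in uA, keyed by macro depth in words (assumed equal to the fold factor).
DC_UA = {8: 1.521, 16: 1.665, 32: 1.950}

def folded_clock_hz(fold):
    """Clock rate after folding: one sample period is split into `fold` cycles."""
    return SAMPLE_RATE_HZ * fold

def macro_power_uw(fold):
    """(AC*freq + DC) * V for one register file macro, in microwatts."""
    freq_mhz = folded_clock_hz(fold) / 1e6
    ac_ua = AC_MA_PER_MHZ * freq_mhz * 1000.0   # convert mA to uA
    return (ac_ua + DC_UA[fold]) * SUPPLY_V

for fold in (8, 16, 32):
    print(f"fold {fold:2d}: clock = {folded_clock_hz(fold) / 1e3:.0f} kHz, "
          f"macro power = {macro_power_uw(fold):.3f} uW")
```

At these clock rates the DC (leakage) term is comparable to the AC term, which is in line with the concern about leakage stated earlier in this section.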
2. The tap part consumes 10% more power when folded.
For instance, compared with an unfolded design that has 32 tap units, a design folded 32 times onto one tap unit consumes more power in that unit because of the higher clock frequency.
3. The other parts consume 10% more power when folded.
Only one copy of the other parts remains, but the design complexity increases with folding.
These three assumptions give a clear calculation for our estimation.
This analysis describes the relationship between folding and power consumption. The estimated power of the design is given in Table 7, and the estimate of power consumption versus folding is plotted in Fig. 76.
Table 7. Calculation of the power estimation versus folding
Fold   uW   Tap + other + register
Fig. 76 shows the relationship between folding and power consumption.
Fig 76. Estimation of folding to power consumption (x-axis: folding, 0 to 30; y-axis: uW, 100 to 1100)
As Fig. 76 shows, the power consumption is minimized when we fold our design 32 times.
Based on this result, we design our architecture with a folding factor of 32.
The actual architecture design is presented in the next section, Section 4.1.2 Design's Architecture.
4.1.2 Design’s Architecture
In this section, we introduce our architecture design, starting with the diagram in Fig. 77.
Fig 77. Diagram of design's architecture
There are four main blocks in Fig. 77. Their functions are summarized below, and each block is explained in detail afterwards.
• Control unit: produces the addresses and flow control signals.
• Partial unit: handles the partial update function.
• Tap unit: executes the tap function and produces the output data.
• Register file: stores the input data and the updated coefficients.
1. Control unit
The architecture of the control unit is shown in Fig. 78.
Fig 78. Architecture of control unit
The key task of this unit is to work out the relationship between all the control and address flows. We also share the counter and adder wherever we can, which leads to the architecture in Fig. 78.
Our design has 32 taps and is folded 32 times (onto 1 tap unit). The control unit must therefore reproduce the shifting behavior in the read address signal, which we call “in_read_ctl”, and it uses “in_write_ctl” to move the start position in the register file. In this way, the control unit controls all the data flows of our design.
2. Partial unit
The architecture of the partial unit is shown in Fig. 79.
Fig 79. Architecture of partial unit
This unit reduces the number of update operations, which in turn reduces the power consumption of the design. Based on our P2SPT algorithm, the partial update function can be implemented with a very simple architecture.
For example, if the partial update condition is not true, the number of update operations is n/2. If the condition is true, the number of update operations is n/4 (n: iterations; see Chapter 2 for the complete explanation).
3. Tap unit
The architecture of tap unit is shown in Fig. 80.