A. Direction of Arrival (DOA) Estimation and Beamforming
2. The Superdirective Method
The superdirective method is presented in this section.23-28 As in the delay-sum method, the signals received by the microphones are
$$\mathbf{x}(e^{j\omega}) = s(e^{j\omega})\,\mathbf{d}_s + \mathbf{v}(e^{j\omega}) \qquad (23)$$
where $\mathbf{d}_s$ is the look direction vector, which depends on the actual geometry of the array and the direction of the sound source signal. Then, the output is
$$y = \mathbf{w}^H \mathbf{x} \qquad (24)$$
where $\mathbf{w}$ denotes the frequency-domain coefficients of the beamformer and the operator $H$ denotes complex conjugate transposition (the Hermitian operator). The array gain measures the improvement in SNR between one sensor and the output of the whole array. Therefore,
$$G = \frac{\mathrm{SNR}_{\mathrm{Array}}}{\mathrm{SNR}_{\mathrm{Sensor}}} \qquad (25)$$
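The SNR of one sensor is given by the ratio of the power spectral densities (PSD) of the signal, $\phi_{ss}$, and the average noise, $\phi_{V_aV_a}$. The SNR at the output can be computed by deriving the PSD of the output signal,
$$\phi_{YY} = \mathbf{w}^H \boldsymbol{\Phi}_{XX}\, \mathbf{w}$$
where $\boldsymbol{\Phi}_{XX}$ is the power spectral density matrix of the array input signals. When only the desired signal is present, the output PSD is
$$\phi_{YY}\big|_{\mathrm{signal}} = \phi_{ss} \left|\mathbf{w}^H \mathbf{d}_s\right|^2$$
and when only the noise is present, it is
$$\phi_{YY}\big|_{\mathrm{noise}} = \phi_{V_aV_a}\, \mathbf{w}^H \boldsymbol{\Phi}_{VV}\, \mathbf{w}$$
where $\boldsymbol{\Phi}_{VV}$ is the normalized cross power spectral density matrix of the noise.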
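Therefore, the array gain in Eq. (25) can be rewritten as
$$G = \frac{\left|\mathbf{w}^H \mathbf{d}_s\right|^2}{\mathbf{w}^H \boldsymbol{\Phi}_{VV}\, \mathbf{w}}$$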
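Assuming a homogeneous noise field, $\boldsymbol{\Phi}_{VV}$ can be expressed in terms of the coherence matrix $\boldsymbol{\Gamma}_{VV}$, whose element $(n, m)$ is the coherence function between the noise at sensors $n$ and $m$, and the gain can be rewritten again as
$$G = \frac{\left|\mathbf{w}^H \mathbf{d}_s\right|^2}{\mathbf{w}^H \boldsymbol{\Gamma}_{VV}\, \mathbf{w}}$$
This representation accommodates different noise fields, each of which is expressed by its coherence function.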
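The gain expression in terms of the coherence matrix can be evaluated numerically. The following is a minimal sketch, assuming a four-microphone array steered to broadside and delay-sum weights; the function name `array_gain` and all variable names are illustrative and do not come from the original work.

```python
import numpy as np

def array_gain(w, d_s, gamma_vv):
    """Array gain |w^H d_s|^2 / (w^H Gamma_VV w) at a single frequency."""
    numerator = np.abs(np.vdot(w, d_s)) ** 2          # np.vdot conjugates its first argument
    denominator = np.real(np.vdot(w, gamma_vv @ w))
    return numerator / denominator

M = 4                                  # assumed number of microphones
d_s = np.ones(M, dtype=complex)        # broadside look direction: equal phase at every sensor
w = d_s / M                            # delay-sum weights

# Two limiting noise fields, each described only by its coherence matrix
gain_uncorrelated = array_gain(w, d_s, np.eye(M))        # spatially uncorrelated noise
gain_coherent = array_gain(w, d_s, np.ones((M, M)))      # noise fully coherent with the look direction
print(gain_uncorrelated, gain_coherent)
```

Under these assumptions the delay-sum weights give a gain of M = 4 in uncorrelated noise and a gain of 1 when the noise is fully coherent with the look direction, which shows how the same formula covers different noise fields.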
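A common quantity for evaluating beamformers is the directivity index (DI), which describes the ability of the array to suppress a diffuse noise field. Therefore, we can compute the DI by using the coherence function of a diffuse noise field,
$$\Gamma_{V_nV_m}\big|_{\mathrm{diffuse}} = \frac{\sin\left(\omega\, l_{nm}/c\right)}{\omega\, l_{nm}/c}$$
where $l_{nm}$ is the distance between sensors $n$ and $m$ and $c$ is the speed of sound. Then, the directivity index (DI) is
$$\mathrm{DI} = 10\log_{10}\left(\frac{\left|\mathbf{w}^H \mathbf{d}_s\right|^2}{\mathbf{w}^H \boldsymbol{\Gamma}_{VV}\big|_{\mathrm{diffuse}}\, \mathbf{w}}\right)$$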
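The DI computation can be sketched as follows. The geometry is an assumption made for illustration: a uniform line array of four microphones with 4 cm spacing (12 cm aperture), a speed of sound of 343 m/s, a broadside look direction, and delay-sum weights. The helper names are hypothetical. Note that `np.sinc(x)` is the normalized sinc, sin(πx)/(πx), so its argument is scaled accordingly.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def diffuse_coherence(positions, freq_hz, c=C):
    """Coherence matrix of a diffuse (spherically isotropic) noise field."""
    dist = np.abs(positions[:, None] - positions[None, :])   # sensor distances l_nm
    # np.sinc(x) = sin(pi x)/(pi x), so this equals sin(w l/c)/(w l/c) with w = 2 pi f
    return np.sinc(2.0 * freq_hz * dist / c)

def directivity_index_db(w, d_s, gamma_diffuse):
    """DI = 10 log10(|w^H d_s|^2 / (w^H Gamma_diffuse w))."""
    numerator = np.abs(np.vdot(w, d_s)) ** 2
    denominator = np.real(np.vdot(w, gamma_diffuse @ w))
    return 10.0 * np.log10(numerator / denominator)

positions = np.arange(4) * 0.04          # assumed: 4 microphones, 4 cm apart
d_s = np.ones(4, dtype=complex)          # broadside look direction
w = d_s / 4                              # delay-sum weights

di_low = directivity_index_db(w, d_s, diffuse_coherence(positions, 100.0))
di_high = directivity_index_db(w, d_s, diffuse_coherence(positions, 4000.0))
print(f"DI at 100 Hz: {di_low:.2f} dB, DI at 4000 Hz: {di_high:.2f} dB")
```

For this small aperture the delay-sum DI is close to 0 dB at 100 Hz, because the diffuse noise is almost fully coherent across the sensors, and it rises to several dB at 4 kHz.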
Another important quantity is the ability of the array to suppress spatially uncorrelated noise, which can be caused by the self-noise of the sensors. Inserting the coherence matrix for this noise field,
$$\boldsymbol{\Gamma}_{VV}\big|_{\mathrm{uncorr}} = \mathbf{I} \qquad (34)$$
into the array gain expression, Eq. (30), results in the white noise gain (WNG)
$$\mathrm{WNG} = \frac{\left|\mathbf{w}^H \mathbf{d}_s\right|^2}{\mathbf{w}^H \mathbf{w}} \qquad (35)$$
On a logarithmic scale, positive values represent an attenuation of uncorrelated noise, whereas negative values show an amplification.
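As a small numerical illustration of Eq. (35), the sketch below compares delay-sum weights with a hypothetical set of large, partly opposing weights that still have unity response at broadside. The weight values and function name are invented for the example and are not taken from the original work.

```python
import numpy as np

def white_noise_gain_db(w, d_s):
    """WNG = |w^H d_s|^2 / (w^H w), returned in dB."""
    numerator = np.abs(np.vdot(w, d_s)) ** 2
    denominator = np.real(np.vdot(w, w))
    return 10.0 * np.log10(numerator / denominator)

M = 4
d_s = np.ones(M, dtype=complex)                  # broadside look direction

w_delay_sum = d_s / M                            # uniform weights
w_large = np.array([5.0, -4.5, -4.5, 5.0], dtype=complex)   # hypothetical weights, sum to 1

wng_delay_sum = white_noise_gain_db(w_delay_sum, d_s)
wng_large = white_noise_gain_db(w_large, d_s)
print(f"delay-sum: {wng_delay_sum:.2f} dB, large weights: {wng_large:.2f} dB")
```

The delay-sum weights give 10 log10(4), about 6.0 dB, so uncorrelated noise is attenuated. The large weights give roughly -19.6 dB, so uncorrelated noise is amplified even though the look-direction response is still one.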
In a practical implementation of superdirective beamforming, large microphone weights result in array instability and sensitivity to uncorrelated noise. The noise sensitivity $\Psi$,26
$$\Psi = \frac{\mathbf{w}^H \mathbf{w}}{\left|\mathbf{w}^H \mathbf{d}_s\right|^2} \qquad (36)$$
which is the reciprocal of the WNG, can be used to quantify this sensitivity to uncorrelated noise.
When $\mathbf{w}$ is determined by Eq. (38), $\mathbf{w}^H \mathbf{d}_s = 1$ and $\Psi$ is simply $\mathbf{w}^H \mathbf{w} = \sum_i |w_i|^2$. To ensure that the array has unity gain in the direction of the sound source, $\mathbf{w}$ must be normalized to that direction. Figure 12 shows the relationship between noise sensitivity and DI in broadside and endfire arrays. The results show that high noise sensitivity implies large microphone weights. To achieve the same DI, the optimal weights must be larger at low frequencies than at high frequencies. Therefore, the DI at low frequencies is hard to improve unless large weights are used.
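The normalization and the resulting noise sensitivity can be sketched as follows. The raw weight values are hypothetical and only serve to show that, once the look-direction response is one, the noise sensitivity reduces to the sum of the squared weight magnitudes.

```python
import numpy as np

def normalize_to_look_direction(w, d_s):
    """Scale w so that w^H d_s = 1 (unity gain toward the source)."""
    return w / np.conj(np.vdot(w, d_s))

def noise_sensitivity(w, d_s):
    """Psi = w^H w / |w^H d_s|^2, the reciprocal of the white noise gain."""
    return np.real(np.vdot(w, w)) / np.abs(np.vdot(w, d_s)) ** 2

d_s = np.ones(4, dtype=complex)                           # broadside look direction
w_raw = np.array([2.0, -1.0, -1.0, 2.0], dtype=complex)   # hypothetical unnormalized weights

w = normalize_to_look_direction(w_raw, d_s)
psi = noise_sensitivity(w, d_s)
sum_of_squares = np.sum(np.abs(w) ** 2)
print(psi, sum_of_squares)
```

Both printed values are 2.5 for this example, and the noise sensitivity itself is unchanged by the normalization because it is a ratio.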
In order to design an optimal beamformer, we have to minimize the power of the output signal of the array. To avoid the trivial solution $\mathbf{w} = \mathbf{0}$, the minimization is constrained to give an undistorted signal response in the desired look direction.
Therefore, the following constrained minimization problem has to be solved:
$$\min_{\mathbf{w}}\; \mathbf{w}^H \boldsymbol{\Phi}_{XX}\, \mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H \mathbf{d}_s = 1 \qquad (37)$$
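Since we are only interested in the optimal suppression of the noise, and we assume a perfect correspondence between the direction of the desired signal and the look direction of the array, only the noise PSD matrix $\boldsymbol{\Phi}_{VV}$ is used. The well-known solution for Eq. (37) is called the Minimum Variance Distortionless Response (MVDR) beamformer.23 It is given by
$$\mathbf{w} = \frac{\boldsymbol{\Phi}_{VV}^{-1}\, \mathbf{d}_s}{\mathbf{d}_s^H\, \boldsymbol{\Phi}_{VV}^{-1}\, \mathbf{d}_s} \qquad (38)$$
and can be derived by using the Lagrange multiplier method.29 Assuming a homogeneous noise field, the solution is a function of the coherence matrix:
$$\mathbf{w} = \frac{\boldsymbol{\Gamma}_{VV}^{-1}\, \mathbf{d}_s}{\mathbf{d}_s^H\, \boldsymbol{\Gamma}_{VV}^{-1}\, \mathbf{d}_s} \qquad (39)$$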
Either form of the MVDR solution can be interpreted as a spatial decorrelation process followed by a matched filter for the desired signal. The normalization in the denominator leads to unity signal response for the look direction.
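The coherence-matrix form of the MVDR solution can be sketched numerically. The example assumes a uniform line array of four microphones with 4 cm spacing, a diffuse noise field, a speed of sound of 343 m/s, an endfire look direction and a far-field source; all function names are hypothetical.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def steering_vector(positions, freq_hz, theta_rad, c=C):
    """Far-field look direction vector; theta is measured from the array axis."""
    delays = positions * np.cos(theta_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def diffuse_coherence(positions, freq_hz, c=C):
    dist = np.abs(positions[:, None] - positions[None, :])
    return np.sinc(2.0 * freq_hz * dist / c)      # normalized sinc, see np.sinc

def mvdr_weights(gamma_vv, d_s):
    """w = Gamma^{-1} d_s / (d_s^H Gamma^{-1} d_s)."""
    g_inv_d = np.linalg.solve(gamma_vv, d_s)      # spatial decorrelation of the matched filter
    return g_inv_d / np.vdot(d_s, g_inv_d)        # normalization for unity look-direction response

def array_gain(w, d_s, gamma_vv):
    return np.abs(np.vdot(w, d_s)) ** 2 / np.real(np.vdot(w, gamma_vv @ w))

positions = np.arange(4) * 0.04                   # assumed geometry
freq = 2000.0
d_s = steering_vector(positions, freq, 0.0)       # endfire look direction
gamma = diffuse_coherence(positions, freq)

w_mvdr = mvdr_weights(gamma, d_s)
w_delay_sum = d_s / len(d_s)

response = np.vdot(w_mvdr, d_s)                   # w^H d_s, should equal 1
gain_mvdr = array_gain(w_mvdr, d_s, gamma)
gain_delay_sum = array_gain(w_delay_sum, d_s, gamma)
print(abs(response), gain_mvdr, gain_delay_sum)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is a common numerical choice rather than something stated in the text.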
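The well-known delay-sum beamformer is included for comparison purposes.
It is the "optimal" beamformer with respect to the WNG. We can derive its coefficients from the MVDR solution by inserting the coherence matrix for spatially uncorrelated noise, Eq. (34):
$$\mathbf{w} = \frac{\mathbf{d}_s}{\mathbf{d}_s^H \mathbf{d}_s} \qquad (40)$$
which reduces to $\mathbf{d}_s/N$ for an array of $N$ microphones when every element of $\mathbf{d}_s$ has unit magnitude.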
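To limit the microphone weights, the constrained design adds the same small scalar $\varepsilon$ to every element of the main diagonal of the normalized PSD or coherence matrix:
$$\mathbf{w} = \frac{\left(\boldsymbol{\Gamma}_{VV} + \varepsilon \mathbf{I}\right)^{-1} \mathbf{d}_s}{\mathbf{d}_s^H \left(\boldsymbol{\Gamma}_{VV} + \varepsilon \mathbf{I}\right)^{-1} \mathbf{d}_s} \qquad (41)$$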
The factor $\varepsilon$ can vary from zero to infinity, which results in the unconstrained superdirective beamformer or the delay-sum beamformer, respectively. The results for the delay-sum and superdirective beamformers in broadside and endfire arrays are illustrated in the following.
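The effect of the factor ε can be checked numerically before looking at the figures. The sketch below is illustrative only: it assumes a four-microphone uniform line array with 4 cm spacing, a diffuse noise field at 500 Hz, a speed of sound of 343 m/s and a broadside look direction, and the function names are hypothetical.

```python
import numpy as np

def loaded_weights(gamma_vv, d_s, eps):
    """Constrained design: add eps to the main diagonal, then solve as for MVDR."""
    loaded = gamma_vv + eps * np.eye(len(d_s))
    g_inv_d = np.linalg.solve(loaded, d_s)
    return g_inv_d / np.vdot(d_s, g_inv_d)

def white_noise_gain_db(w, d_s):
    return 10.0 * np.log10(np.abs(np.vdot(w, d_s)) ** 2 / np.real(np.vdot(w, w)))

positions = np.arange(4) * 0.04                       # assumed geometry
dist = np.abs(positions[:, None] - positions[None, :])
gamma = np.sinc(2.0 * 500.0 * dist / 343.0)           # diffuse coherence at 500 Hz
d_s = np.ones(4, dtype=complex)                       # broadside look direction

eps_values = [0.0, 0.001, 0.01, 0.1]
wng_values = [white_noise_gain_db(loaded_weights(gamma, d_s, e), d_s) for e in eps_values]
for e, g in zip(eps_values, wng_values):
    print(f"eps = {e:<6} WNG = {g:7.2f} dB")

# A very large eps drives the solution toward the delay-sum weights d_s / N
w_limit = loaded_weights(gamma, d_s, 1e6)
print(np.round(w_limit.real, 4))
```

In this sketch the WNG increases monotonically with ε, and for a very large ε the weights approach the uniform delay-sum weights, in line with the two limits described above.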
Figure 13 shows the effect of setting the factor ε to 0, 0.1, 0.01 and 0.001 on the DI in the broadside and endfire arrays. The results show that the DI of the endfire array is larger than that of the broadside array. In the broadside and endfire arrays, the DI is independent of ε above 2 and 3 kHz, respectively.
Figure 14 shows the effect of the same values of ε on the WNG in the broadside and endfire arrays. The results show that the WNG of the broadside array is larger than that of the endfire array. In the broadside and endfire arrays, the WNG is independent of ε above 2 and 3 kHz, respectively.
Figure 15 shows the effect of the same values of ε on the optimum weights in the broadside and endfire arrays. The results show that the optimum weights in the broadside array are larger than those in the endfire array. The larger the optimum weights, the harder the FIR filters are to implement. In the broadside and endfire arrays, the optimum weights are independent of ε above 2 and 3 kHz, respectively. The optimal weights must not be too large; they should be small enough to be implemented as FIR filters. Considering these effects of ε, the value ε = 0.01 is chosen.
With a suitable ε chosen, the delay-sum and superdirective beamformers are compared in terms of DI, WNG, and optimal weights.
Figure 16 compares the DI of the delay-sum and superdirective methods with ε = 0.01 in the broadside and endfire arrays. The DI of the superdirective method is greatly improved in both the broadside and endfire arrays.
Figure 17 compares the WNG of the delay-sum and superdirective methods with ε = 0.01 in the broadside and endfire arrays. The WNG of the delay-sum method is better than that of the superdirective method in both arrays. Figure 18 compares the optimum weights of the delay-sum and superdirective methods with ε = 0.01 in the broadside and endfire arrays. The results show that the optimum weights of the superdirective method are larger than those of the delay-sum method at low frequencies in both arrays. Because the WNG is small at low frequencies, the weights have to be larger there to compensate. The weights of the superdirective method are nevertheless acceptable for implementation as FIR filters.
In addition, the contour plots of the delay-sum and superdirective methods in the broadside array are compared in Figure 19. The directivity of the superdirective method at low frequencies is greatly improved, which means that the SNR of the superdirective method is higher than that of the delay-sum method. The performance of the microphone array can therefore be improved by superdirective beamforming. Figure 20 shows the corresponding contour plots of the delay-sum and superdirective methods in the endfire array.
The results are the same as for the broadside array. Comparing the broadside and endfire arrays using the performance indices above, the directivity of the endfire array is better than that of the broadside array, but the beamwidth shown in the contour plots is wider. The reason is that the directivity is calculated over 0° to 360°: the directivity pattern of the broadside array is symmetrical about 0° and 180° and extends over 0° to 360°, whereas that of the endfire array is not symmetrical and extends only over -90° to 90°. This explains the differences in directivity and beamwidth between the broadside and endfire arrays.
Once the frequency response of the superdirective weights is obtained, the inverse Fourier transform is used to obtain the impulse response of the superdirective FIR filters. If $P_w$ frequencies are acquired, the discrete frequency response can be obtained as
$$H_m(l) = w_m^*(l), \quad l = 1, \ldots, P_w \qquad (42)$$
where $H_m(l)$ is the discrete frequency response of the $m$th channel and $w_m^*(l)$ is the complex conjugate of the superdirective weight of the $m$th channel at frequency index $l$. In order to obtain an impulse response with real coefficients, $H_m(l)$ is mirrored with a conjugate operation to obtain a conjugate-symmetric frequency response, so that the value at each mirrored (negative-frequency) bin is the complex conjugate of the value at the corresponding positive-frequency bin:
$$H_m(-l) = H_m^*(l), \quad l = 1, \ldots, P_w$$
Finally, the impulse response can be obtained by applying the inverse Fourier transform to each channel. This first yields the non-causal impulse response of the superdirective FIR filters of order 64. The causal impulse response of the superdirective FIR filters of order 64 is shown in Figure 22.
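The mirroring and inverse transform for one channel can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the bins are assumed to run from DC to the Nyquist frequency on a uniform grid, 33 bins are used so that the filter has 64 taps, the example frequency response is invented, and the non-causal response is made causal by a circular shift of half the filter length.

```python
import numpy as np

def fir_from_half_spectrum(h_half):
    """Build a real FIR filter from the response at bins 0..N/2 of one channel."""
    h_half = np.asarray(h_half, dtype=complex).copy()
    # The DC and Nyquist bins must be real for a real impulse response
    h_half[0] = h_half[0].real
    h_half[-1] = h_half[-1].real
    # Mirror with conjugation: H[N - l] = conj(H[l]) for the interior bins
    full = np.concatenate([h_half, np.conj(h_half[-2:0:-1])])
    h_noncausal = np.fft.ifft(full).real      # imaginary part is numerically zero here
    # Circular shift by half the length so the response becomes causal
    h_causal = np.roll(h_noncausal, len(h_noncausal) // 2)
    return h_noncausal, h_causal

# Hypothetical frequency response: low-frequency boost with a mild phase slope
l = np.arange(33)
h_half = (1.0 + 3.0 * np.exp(-l / 4.0)) * np.exp(-0.2j * l)

h_noncausal, h_causal = fir_from_half_spectrum(h_half)
print(len(h_causal), h_causal.dtype)
```

The forward FFT of the non-causal response reproduces the prescribed interior bins, and the causal version differs only by a delay of half the filter length.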
Therefore, the digital superdirective FIR filters are implemented. The results show that the FIR filters are symmetrical. The superdirective FIR filter of Microphone 1 is the same as that of Microphone 4, and the superdirective FIR filter of Microphone 2 is the same as that of Microphone 3. In addition, the relationship between Microphone 1 and Microphone 2 resembles that of 'differential microphones'.
Figures 23-26 show the simulations and measurements of the DOA estimation of the delay-sum and superdirective methods at single frequencies of 500, 1000, 2000 and 4000 Hz, respectively. The sound source is located at 10°. The results show that the DOA estimate of the superdirective method is more precise than that of the delay-sum method at 500, 1000 and 2000 Hz. The DOA estimates are almost the same at 4000 Hz and at other high frequencies.
However, the sidelobes of the superdirective method are larger than those of the delay-sum method. Figure 27 shows the DOA estimation simulated by the delay-sum and superdirective methods with a sampling rate of 16000 Hz and a 12 cm aperture in 8 kHz broadband white noise.
The beamwidth of the superdirective method is clearly smaller than that of the delay-sum method, and its directivity is also better. However, the sidelobes of the superdirective method are larger than those of the delay-sum method, so there is a trade-off between beamwidth and sidelobe level. Overall, the DOA estimation of the superdirective method is more precise than that of the delay-sum method.