Organization of this Dissertation - Speech Enhancement Beamforming and Multi-Channel Post-Filtering Based on Multi-Rank Signal Models

Chapter 1 Introduction

1.5 Organization of this Dissertation

The remainder of this dissertation is organized as follows. Chapter 2 introduces robust adaptive beamforming with multi-rank signal models based on the Kalman filter. Chapter 3 presents multi-channel post-filtering based on a spatial coherence measure. Chapter 4 shows the experimental results of the proposed beamformer and post-filter. Finally, conclusions are drawn and future work is discussed in Chapter 5.

Chapter 2

Robust Adaptive Beamforming with Multi-Rank Signal Models Based on the Kalman Filter

2.1 Introduction

MVDR beamforming aims to minimize the variance of the interference and noise while maintaining the desired array response. Its performance is known to degrade dramatically under even small mismatches in the desired signal model, especially when the desired signal is present in the training data. Robust MVDR beamforming aims to preserve the output SINR in the presence of array or propagation uncertainties. In real-world environments, the spatial correlation is typically multi-rank due to local scattering, wavefront fluctuation, or reverberation. A multi-rank signal model can therefore describe the sound propagation from the desired source to the microphones more accurately. If the array response given by the multi-rank signal model is known exactly, the performance degradation can be reduced. To further reduce the sensitivity to arbitrary kinds of mismatches, the diagonal loading (DL) technique [2][23][24] has been a popular approach to improving the robustness of the MVDR beamformer. The major drawback of DL is that it is not clear how to choose the diagonal loading level given the input covariance matrix. The norm-constrained Capon beamforming (NCCB) is known to be equivalent to DL [2][26]. However, how to choose the norm constraint value has not been completely studied.

In this dissertation, we use the norm constraint rather than the original DL formulation. The advantage of selecting the norm constraint value rather than the diagonal loading level is demonstrated in the simulations (see Section 4.1.1).

In this dissertation, we adopt constrained Kalman filtering for more flexible on-line implementations. Constrained Kalman filtering [55–59] has been widely investigated in the last decade. The approaches mainly fall into three categories: pseudo-observation methods (or penalty methods), projection methods, and dimension reduction methods. Among these, the pseudo-observation method is the most intuitive way to incorporate constraints into the state space of the Kalman filter, since it treats the constraints as additional measurement equations. In this way, several existing nonlinear Kalman filtering algorithms can be applied directly. Chen et al. were the first to introduce the soft-constrained pseudo-observation (SCPO) method [58][59] into the traditional MVDR problem [67]. El-Keyi et al. applied the SCPO to robust adaptive beamforming based on worst-case performance optimization [68]. In this dissertation, we also apply the SCPO to robust adaptive beamforming with multi-rank signal models. A potential drawback is that the unconstrained problem obtained with the SCPO can be ill-conditioned if the parameter matrices are not chosen appropriately. The settings of the initial conditions and parameter matrices are therefore studied, both to achieve good performance and to keep the SCPO method from becoming ill-conditioned. Compared to the prior work [20], the proposed method avoids the computation of the principal eigenvector.

Since the robust adaptive beamforming problem with multi-rank signal models and the norm constraint is a quadratically constrained quadratic program (QCQP) [69], nonlinear Kalman filtering is introduced. The most widely used nonlinear Kalman filter is the extended Kalman filter (EKF) [60–62]. Another popular method is the unscented Kalman filter (UKF) [62–66]. The EKF approximates the nonlinear functions using their Jacobian and Hessian matrices (in the first- and second-order approximations, respectively), while the UKF approximates the probability distribution of the nonlinear transformation using sigma points. Theoretically, the second-order extended Kalman filter (SOEKF) gives the best approximation in the MSE sense. However, due to the approximation of the second-order errors (see Appendix I), the SOEKF is sensitive to improper initial conditions and parameter matrices. A comparison of these nonlinear Kalman filters is given in Section 4.1.2.

The remainder of this chapter is organized as follows. In Section 2.2, we briefly review the problem of the MVDR beamforming with multi-rank signal models. Section 2.3 gives a modified problem with the normalized signal model and the norm constraint, and formulates its state space model based on the SCPO method. The relationship between the diagonal loading level and the norm constraint value for the multi-rank case is also analyzed. Section 2.4 presents the solutions using the EKFs and the UKF. Finally, a summary is drawn in Section 2.5.

2.2 Problem Formulation

2.2.1 Robust MVDR Beamforming with Multi-Rank Signal Models

The well-known MVDR beamformer minimizes the output power of interference-signals-plus-noise while maintaining a distortionless constraint at the look direction [1]. Considering the noise vector $\mathbf{n}(\omega,k)$ in the STFT domain given in (1-2), the problem can be formulated as

$$\min_{\mathbf{w}(\omega)} \mathbf{w}^H(\omega)\,\boldsymbol{\Phi}_n(\omega)\,\mathbf{w}(\omega) \quad \text{subject to} \quad \mathbf{w}^H(\omega)\,\mathbf{a}_s(\omega) = 1 \tag{2-1}$$

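where $\boldsymbol{\Phi}_n(\omega) = E[\mathbf{n}(\omega,k)\mathbf{n}^H(\omega,k)]$ is the noise-only PSD matrix. The problem is equivalent to the maximum signal-to-interference-plus-noise ratio (SINR) beamformer [2]. The solution of the MVDR beamformer can be easily obtained by the Lagrange multiplier method as

$$\mathbf{w}_{\mathrm{MVDR}}(\omega) = \frac{\boldsymbol{\Phi}_n^{-1}(\omega)\,\mathbf{a}_s(\omega)}{\mathbf{a}_s^H(\omega)\,\boldsymbol{\Phi}_n^{-1}(\omega)\,\mathbf{a}_s(\omega)} \tag{2-2}$$

The MVDR beamformer maintains a distortionless response on the steering vector $\mathbf{a}_s(\omega)$, where the steering vector is usually considered as a point source model, or a rank-1 signal model. Shahbazpanahi et al. [20] modified the distortionless constraint to a quadratic one and incorporated the multi-rank signal models given in Section 1.2.2. The modified MVDR problem is given by

$$\min_{\mathbf{w}(\omega)} \mathbf{w}^H(\omega)\,\boldsymbol{\Phi}_n(\omega)\,\mathbf{w}(\omega) \quad \text{subject to} \quad \mathbf{w}^H(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\,\mathbf{w}(\omega) = 1 \tag{2-3}$$

where $\hat{\boldsymbol{\Phi}}_s(\omega)$ is the designed or estimated multi-rank signal model. The modified problem can be solved by the Lagrange multiplier method, which results in the following generalized eigenvalue problem [70]:

$$\boldsymbol{\Phi}_n(\omega)\,\mathbf{w}(\omega) = \lambda(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\,\mathbf{w}(\omega) \tag{2-4}$$

where the Lagrange multiplier $\lambda(\omega)$ can be considered as the corresponding generalized eigenvalue. Since the PSD matrices $\hat{\boldsymbol{\Phi}}_s(\omega)$ and $\boldsymbol{\Phi}_n(\omega)$ are positive semi-definite, $\lambda(\omega)$ is always real-valued and non-negative.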

The solution to the minimization problem in (2-3) is the generalized eigenvector corresponding to the smallest generalized eigenvalue of the matrix pencil $\{\boldsymbol{\Phi}_n(\omega), \hat{\boldsymbol{\Phi}}_s(\omega)\}$. Assuming that $\boldsymbol{\Phi}_n(\omega)$ is full-rank and invertible, equation (2-4) can be rewritten as

$$\boldsymbol{\Phi}_n^{-1}(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\,\mathbf{w}(\omega) = \frac{1}{\lambda(\omega)}\,\mathbf{w}(\omega) \tag{2-5}$$

The smallest eigenvalue in (2-4) corresponds to the maximum eigenvalue in (2-5). Thus, the optimal weight vector of the problem in (2-3) can be expressed by

$$\mathbf{w}_{\mathrm{opt}}(\omega) = \mathcal{P}\left\{\boldsymbol{\Phi}_n^{-1}(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\right\} \tag{2-6}$$

where $\mathcal{P}\{\cdot\}$ denotes the operator that yields the principal eigenvector of a matrix. It is known that when the noise field is incoherent (i.e., $\boldsymbol{\Phi}_n(\omega) = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix), the optimal MVDR beamformer turns into the matched filter (or the delay-and-sum (DS) filter for the rank-1 signal model). Hence, the matched filter for the multi-rank case can be obtained by

$$\mathbf{w}_{\mathrm{matched}}(\omega) = \mathcal{P}\left\{\hat{\boldsymbol{\Phi}}_s(\omega)\right\} \tag{2-7}$$
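As a minimal numerical sketch of the principal-eigenvector solution described above, the following NumPy snippet solves the quadratically constrained problem for one frequency bin. The matrices are random stand-ins, and the helper names (`principal_eigvec`, `multirank_mvdr`) are hypothetical, chosen only for this illustration. The eigenvector is rescaled so that the quadratic distortionless constraint holds exactly, which is one possible way to fix the otherwise arbitrary eigenvector scale.

```python
import numpy as np

def principal_eigvec(A):
    """Eigenvector of A associated with the eigenvalue of largest real part."""
    vals, vecs = np.linalg.eig(A)
    return vecs[:, np.argmax(vals.real)]

def multirank_mvdr(Phi_n, Phi_s):
    """Weights for min w^H Phi_n w subject to w^H Phi_s w = 1 (hypothetical helper)."""
    p = principal_eigvec(np.linalg.solve(Phi_n, Phi_s))
    # Rescale so that the quadratic distortionless constraint holds exactly.
    return p / np.sqrt(np.real(p.conj() @ Phi_s @ p))

rng = np.random.default_rng(0)
M = 4
# Illustrative rank-2 signal model and full-rank noise PSD matrix (random stand-ins).
A = rng.standard_normal((M, 2)) + 1j * rng.standard_normal((M, 2))
Phi_s = A @ A.conj().T
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_n = B @ B.conj().T + np.eye(M)

w = multirank_mvdr(Phi_n, Phi_s)
constraint = np.real(w.conj() @ Phi_s @ w)   # should equal 1
out_power = np.real(w.conj() @ Phi_n @ w)    # equals the smallest generalized eigenvalue
```

With this scaling, the output noise power equals the reciprocal of the largest eigenvalue of the product of the inverse noise PSD matrix and the signal model, which is the smallest generalized eigenvalue of the pencil.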

For speech enhancement, the desired signal can appear together with the interferences and noise. In practice, the noise PSD matrix is replaced by the input PSD matrix of the training data as

$$\mathbf{w}_{\mathrm{MRSMI}}(\omega) = \mathcal{P}\left\{\hat{\boldsymbol{\Phi}}_x^{-1}(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\right\}, \qquad \hat{\boldsymbol{\Phi}}_x(\omega) = \frac{1}{N}\sum_{k=1}^{N} \mathbf{x}(\omega,k)\,\mathbf{x}^H(\omega,k) \tag{2-8}$$

where $\hat{\boldsymbol{\Phi}}_x(\omega)$ is referred to as the sample matrix [1] and N is the training size. The solution in (2-8) is the multi-rank (MR) version of the well-known sample matrix inverse (SMI) beamformer [20]. However, when the desired signal exists in the training data, the MVDR beamformer is known to degrade dramatically due to the mismatches between the presumed and actual array responses to the desired signal [20]. This is the so-called self-cancellation phenomenon. To improve the robustness of MVDR beamforming against mismatches, one of the most popular approaches is the diagonal loading (DL) method. It is equivalent to imposing additive noise on the covariance matrix [1][2], and the MVDR problem in (2-3) can be modified as

$$\min_{\mathbf{w}(\omega)} \mathbf{w}^H(\omega)\left(\hat{\boldsymbol{\Phi}}_x(\omega) + \gamma(\omega)\mathbf{I}\right)\mathbf{w}(\omega) \quad \text{subject to} \quad \mathbf{w}^H(\omega)\,\hat{\boldsymbol{\Phi}}_s(\omega)\,\mathbf{w}(\omega) = 1 \tag{2-9}$$

where $\gamma(\omega)$ is the diagonal loading level to be determined. The solution of the modified problem (2-9) is referred to as the multi-rank loaded SMI (MRLSMI) beamformer:

$$\mathbf{w}_{\mathrm{MRLSMI}}(\omega) = \mathcal{P}\left\{\left(\hat{\boldsymbol{\Phi}}_x(\omega) + \gamma(\omega)\mathbf{I}\right)^{-1}\hat{\boldsymbol{\Phi}}_s(\omega)\right\} \tag{2-10}$$

The major drawback of MRLSMI is that it is not clear how to choose the best diagonal loading level $\gamma(\omega)$ since the optimal choice depends on the unknown signal and interference parameters [20].
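The loaded sample-matrix beamformer can be sketched numerically as follows. This is an illustration under stated assumptions rather than the implementation used in this work: the snapshots are synthetic white noise, the rank-2 signal model is random, and the function names are hypothetical. The sketch checks that the quadratic distortionless constraint holds for several loading levels and that a very heavy loading drives the weights toward the matched filter, since the loaded matrix then behaves like a scaled identity.

```python
import numpy as np

def sample_matrix(X):
    """Sample PSD matrix (1/N) * sum_k x_k x_k^H from snapshots X of shape (M, N)."""
    return X @ X.conj().T / X.shape[1]

def mr_loaded_smi(Phi_x, Phi_s, loading):
    """Principal eigenvector of (Phi_x + loading*I)^{-1} Phi_s, scaled so w^H Phi_s w = 1."""
    M = Phi_x.shape[0]
    vals, vecs = np.linalg.eig(np.linalg.solve(Phi_x + loading * np.eye(M), Phi_s))
    p = vecs[:, np.argmax(vals.real)]
    return p / np.sqrt(np.real(p.conj() @ Phi_s @ p))

rng = np.random.default_rng(1)
M, N = 4, 50
A = rng.standard_normal((M, 2)) + 1j * rng.standard_normal((M, 2))
Phi_s = A @ A.conj().T                                   # illustrative rank-2 signal model
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Phi_x = sample_matrix(X)

loadings = (0.0, 1.0, 1e6)                               # arbitrary example loading levels
weights = [mr_loaded_smi(Phi_x, Phi_s, g) for g in loadings]
constraints = [np.real(w.conj() @ Phi_s @ w) for w in weights]

# Matched filter: principal eigenvector of the signal model (eigh sorts ascending).
p_matched = np.linalg.eigh(Phi_s)[1][:, -1]
alignment = abs(p_matched.conj() @ weights[-1]) / np.linalg.norm(weights[-1])
```

The loading levels above are arbitrary; the sketch does not address how to pick the level, which is exactly the difficulty noted in the text.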

2.2.2 Motivations of the Proposed Robust Beamforming

The motivations for the proposed robust beamforming are listed below:

1) The performance of MVDR beamforming is known to degrade severely in the presence of even small mismatches between the actual and presumed array responses to the desired signal [20], especially when the desired signal “contaminates” the training data. With multi-rank signal models, it is possible to model the array response more accurately, which reduces the performance degradation due to model mismatches.

2) In the original multi-rank MVDR beamforming algorithm [20], normalization is not taken into account because the algorithm targets narrow-band applications. However, normalization is important in wide-band applications, since different normalization factors among frequencies introduce frequency-dependent distortion or different white noise gains [1]. This can seriously deteriorate the speech quality in multi-channel speech enhancement. Therefore, this dissertation proposes a modification of the problem based on a normalized multi-rank signal model.

3) The selection of the diagonal loading level $\gamma(\omega)$ depends on the unknown signal and interference parameters. Cox et al. [2] have shown that the DL problem in (2-9) is equivalent to the norm-constrained Capon beamforming problem. In this dissertation, the relationship between the diagonal loading level and the norm constraint value for the multi-rank case is analyzed. The simulations in Section 4.1.1 show that the optimal choice of the norm constraint value is less sensitive to unknown signal powers and small angle mismatches at high SNRs.

4) In [20], the computation of the principal eigenvector is needed. For on-line implementation, we introduce Kalman filter algorithms for more flexible designs.

5) The selection of initial conditions and parameter matrices is critical, especially when the system is nonlinear. Wrong settings can severely degrade the performance of the system. In this dissertation, the initial state is suggested to lie in the feasible set of the constraints, or at least close to the feasible set relative to the chosen variance parameters. The error covariance is initialized in the null space of the initial state. Further, the selection of the parameter matrices is investigated to achieve good performance and to avoid the ill-conditioning problem.

2.3 Proposed Robust Beamforming Based on the Soft-Constrained Pseudo-Observation Method

The soft-constrained pseudo-observation (SCPO) method is one of the methods in constrained Kalman filtering [55–59]. With the SCPO method, constraints can be easily formulated in the state space as augmented measurements. In the following, the distortionless constraint using normalized signal models for wide-band applications is proposed in Section 2.3.1. Then, the norm-constrained Capon beamforming (NCCB) for the multi-rank case is introduced in Section 2.3.2. Finally, the state space of the NCCB problem is formulated using the SCPO method in Section 2.3.3.

2.3.1 Normalized Multi-Rank Signal Model for Wide-Band Applications

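In narrow-band applications, the normalization is immaterial since it does not affect the SINR defined as

$$\mathrm{SINR}(\omega) = \frac{\mathbf{w}^H(\omega)\,\boldsymbol{\Phi}_s(\omega)\,\mathbf{w}(\omega)}{\mathbf{w}^H(\omega)\,\boldsymbol{\Phi}_n(\omega)\,\mathbf{w}(\omega)} \tag{2-11}$$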

However, the normalization is important in wide-band applications to keep the array gain toward the desired signal consistent across frequencies. In (2-3) and (2-9), it is worth noting that different powers of the designed signal model $\hat{\boldsymbol{\Phi}}_s(\omega)$ lead to different normalization factors. Since the distortionless constraint is meant to constrain the desired array response without regard to the signal power, normalizing the PSD matrix $\hat{\boldsymbol{\Phi}}_s(\omega)$ is reasonable. Thus, we modify the distortionless constraint by using the normalized signal model, which makes it comparable to the conventional case. For example, considering the rank-1 signal model given in (1-5), the left-hand side of the distortionless constraint in (2-12a) is

               

This gives the same norm as the distortionless constraint given in (2-1).
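The effect of the signal-model power on the weight scaling can be illustrated with a small sketch. It assumes a rank-1 model built from a unit-modulus steering vector and an identity noise matrix; both are choices made only for this example, and the helper name is hypothetical. Under the unnormalized quadratic constraint, scaling the model power by a factor of 4 halves the array gain toward the steering vector, which is the frequency-dependent scaling that normalization is meant to remove.

```python
import numpy as np

def constrained_principal(Phi_n, Phi_s):
    """Principal eigenvector of Phi_n^{-1} Phi_s, scaled so that w^H Phi_s w = 1."""
    vals, vecs = np.linalg.eig(np.linalg.solve(Phi_n, Phi_s))
    p = vecs[:, np.argmax(vals.real)]
    return p / np.sqrt(np.real(p.conj() @ Phi_s @ p))

M = 4
# Unit-modulus steering vector (illustrative), so that ||a||^2 = M.
a = np.exp(1j * np.pi * np.arange(M) * np.sin(0.3))
Phi_n = np.eye(M)

gains = {}
for power in (1.0, 4.0):
    Phi_s = power * np.outer(a, a.conj())   # rank-1 model with a given signal power
    w = constrained_principal(Phi_n, Phi_s)
    gains[power] = abs(w.conj() @ a)        # array gain toward a: 1 / sqrt(power)
```

With unit power the model reduces to the outer product of the steering vector with itself, and the quadratic constraint becomes the squared magnitude of the conventional distortionless response.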

2.3.2 Norm-Constrained Capon Beamforming

Cox et al. [2] have shown that the DL problem is equivalent to the norm-constrained Capon beamforming (NCCB) problem [23][26]. For the rank-1 signal model, the NCCB can be expressed by

 

         

where $T(\omega)$ is the designed constraint value of the squared weight vector norm. The solution of the NCCB has the diagonal loading (DL) form as

 



   



 

By substituting the weight vector in (2-15) into the norm constraint, the relationship between the diagonal loading level $\gamma(\omega)$ and the norm constraint value $T(\omega)$ of the weight vector can be obtained as [26]

Now, the NCCB formulation can be extended with multi-rank signal models as

 

           

is the normalized signal PSD matrix given in (2-12b). The Lagrangian function of (2-17) can be defined by

     

obtain the solution similar to (2-10) as

   

is the normalization factor to meet the distortionless constraint. Likewise, the relationship between $\gamma(\omega)$ and $T(\omega)$ with the multi-rank signal models can be derived as

The value of $T(\omega)$ is no smaller than 1/M. According to the distortionless constraint, we have

$$T(\omega) \ge \frac{1}{M} \tag{2-22}$$

Note that the trace inequality tr(AB) ≤ tr(A) tr(B) used in (2-21) is described in detail in Section 3.3.1. If we assume a non-negative diagonal loading level $\gamma(\omega)$, there is also an upper bound for $T(\omega)$. The expression in (2-20) decreases monotonically to 1/M as $\gamma(\omega)$ increases.

If the norm constraint value $T(\omega)$ is greater than the upper bound, there is no feasible solution for a non-negative $\gamma(\omega)$.
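The dependence of the squared weight norm on the loading level can be checked numerically for the rank-1 case. The sketch below assumes the standard diagonally loaded MVDR weight vector, proportional to the loaded inverse applied to the steering vector and scaled to a unit response; this form, the synthetic snapshots, and the function name are assumptions made for the illustration. It shows the squared norm decreasing monotonically toward 1/M as the loading grows, with the largest value at zero loading.

```python
import numpy as np

def squared_norm_vs_loading(Phi_x, a, loading):
    """Squared norm of w = (Phi_x + g I)^{-1} a / (a^H (Phi_x + g I)^{-1} a)."""
    M = Phi_x.shape[0]
    u = np.linalg.solve(Phi_x + loading * np.eye(M), a)
    w = u / (a.conj() @ u)          # enforces the distortionless response w^H a = 1
    return np.real(w.conj() @ w)

rng = np.random.default_rng(2)
M, N = 6, 40
a = np.exp(1j * np.pi * np.arange(M) * np.sin(0.1))   # presumed steering vector (illustrative)
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Phi_x = X @ X.conj().T / N                            # sample matrix from synthetic snapshots

loadings = [0.0, 0.1, 1.0, 10.0, 1e3]
T = [squared_norm_vs_loading(Phi_x, a, g) for g in loadings]
upper_bound = T[0]                                     # value at zero loading
```

Any requested norm value above `upper_bound` cannot be reached with a non-negative loading level in this rank-1 setting, which mirrors the feasibility remark above.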

For the rank-1 signal model, the weight vector can be decomposed into subspaces of the presumed steering vector and its null space based on the concept of GSC [71] as

 

w a . In this case, the norm of the weight vector can be expressed as

 

² ¹ s

 

  M  ^ 

w a (2-25)

It can be seen that constraining the norm of w() is equivalent to constrain the norm of

 



a . Thus, to express the effect of the latter term in (2-25), we decompose the

threshold $T(\omega)$ as

$$T(\omega) = \frac{1}{M} + \varepsilon(\omega), \quad \text{where } \varepsilon(\omega) \ge 0 \tag{2-26}$$

In the rest of this dissertation, we discuss the selection of the deviation $\varepsilon(\omega)$ in (2-26) instead of $T(\omega)$, which better describes the scale of the norm deviations.

Compared to the selection of the diagonal loading level $\gamma(\omega)$, the selection of $\varepsilon(\omega)$ is less sensitive to the signal powers due to the division in (2-16) and (2-19). Note that speech signals are nonstationary and time-varying, so this insensitivity benefits speech enhancement applications. Section 4.1.1 shows that the selection of the norm constraint is also less sensitive to small angle mismatches at high SNRs.

2.3.3 State Space Formulation Using the SCPO Method

The pseudo-observation method treats the set of constraint equations as additional observations, but with no measurement noise [55–59]. In this case, the constraint equations are called perfect measurements, and the constraints are considered “hard constraints”. However, perfect measurements are known to give a singular error covariance matrix, which leads to an ill-conditioning problem in the Kalman filter. Thus, small variances are assigned to the constraint equations instead, which gives “soft-constrained” solutions.

Considering the NCCB problem in (2-17), the state space model is given as follows:

State Space Formulation of the Proposed Robust Beamforming Problem

A vector form of (2-27b) is expressed as

 

mutually uncorrelated with the covariance matrices

 

 E_k _s



,k

 

^H_s ,k



 0²

 

The only real measurement in (2-27b) is the input vector $\mathbf{x}(\omega,k)$ in the first equation, which follows from the objective of minimizing the filtered output power in the MSE sense, i.e., $E_k\{|0 - \mathbf{x}^H(\omega,k)\,\mathbf{w}(\omega,k)|^2\}$.

Considering the measurement update in the Kalman filter,

      



 



As the constraint parameters approach zero, the constraint costs are increasingly weighted, and solutions that do not satisfy the constraints are increasingly penalized. The solution of the SCPO should approach the solution of the NCCB problem in (2-17) if the constraint parameters $\sigma_2^2(\omega)$ and $\sigma_3^2(\omega)$ are much smaller than $\sigma_1^2(\omega)$ and the approximation of the nonlinear functions is adequate. To avoid numerical problems, the constraint parameters are typically not set to zero. Therefore, the SCPO method does not strictly satisfy the constraints, but provides a flexible approach to incorporating different equality constraints.
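The weighting effect of the constraint variances can be sketched as a penalized cost. The layout below is an assumption made for illustration: one real measurement (target 0 for the filtered output) and two pseudo-observations (target 1 for the quadratic distortionless constraint and a target norm value for the norm constraint), each divided by its own variance. The function names and the numerical values are hypothetical.

```python
import numpy as np

def scpo_residual(w, x, Phi_s_norm, T):
    """Targets minus predictions: output term and two constraint pseudo-observations."""
    z = np.array([0.0, 1.0, T], dtype=complex)
    h = np.array([x.conj() @ w, w.conj() @ Phi_s_norm @ w, w.conj() @ w])
    return z - h

def scpo_cost(w, x, Phi_s_norm, T, variances):
    """Sum of squared residuals, each weighted by the inverse of its variance."""
    r = scpo_residual(w, x, Phi_s_norm, T)
    return float(np.sum(np.abs(r) ** 2 / np.asarray(variances)))

rng = np.random.default_rng(5)
M = 4
a = np.exp(1j * np.pi * np.arange(M) * np.sin(0.2))   # illustrative steering vector
Phi = np.outer(a, a.conj())                            # normalized rank-1 model (assumed)
T = 1.0 / M
x = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

w_feasible = a / M           # satisfies both constraints exactly
w_infeasible = 2.0 * w_feasible

r_feasible = scpo_residual(w_feasible, x, Phi, T)
loose = scpo_cost(w_infeasible, x, Phi, T, (1.0, 1.0, 1.0))
tight = scpo_cost(w_infeasible, x, Phi, T, (1.0, 1e-4, 1e-4))
```

Shrinking the two constraint variances leaves the cost of the feasible vector unchanged while the cost of the infeasible one grows by orders of magnitude, which is the soft-constraint behavior described above.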

2.4 Solutions Using Nonlinear Kalman Filters

In the multi-rank MVDR problems, the multi-rank distortionless constraint and the norm constraint are quadratic. Hence, a nonlinear approximation of the measurement equations is needed. The nonlinear measurement equation described in (2-27) can be approximated by a second-order Taylor expansion around an estimate

 

 and P is the number of measurement equations.

$\mathbf{F}_w$ denotes the Jacobian matrix of the nonlinear function $\mathbf{f}(\mathbf{w}(\omega,k))$, and $\mathbf{F}_{ww}^{(i)}$ denotes the Hessian matrix of the i-th measurement equation in $\mathbf{f}(\mathbf{w}(\omega,k))$. The two major categories of nonlinear Kalman filtering are the extended Kalman filter (EKF) and the unscented Kalman filter (UKF). An overview of these algorithms is given in [66]:

1) The first-order EKF (FOEKF) [60][62] approximates the nonlinear function using only the Jacobian matrix $\mathbf{F}_w$. This works well as long as the Hessian (i.e., second-order) term is small, which depends on the state estimation error and the degree of nonlinearity of $\mathbf{f}(\mathbf{w}(\omega,k))$.

2) The second-order EKF (SOEKF) [61][62] approximates both the Jacobian matrix and the Hessian matrices.

3) The UKF [62–65] implicitly estimates the first- and second-order terms of the nonlinear transformation in (2-32) instead of computing the Jacobian and Hessian matrices. In other words, the UKF approximates the probability distribution using sigma points rather than approximating an arbitrary nonlinear function or transformation.

In the following, the solutions using the EKFs and the UKF for our problems are listed, followed by a discussion of the algorithms.
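The difference between the first- and second-order expansions can be seen on a single quadratic measurement. The sketch below uses a real-valued weight vector and a symmetric matrix as a simplified stand-in for the complex quadratic constraints; these simplifications and the function names are assumptions for illustration only. For a quadratic function, the second-order expansion is exact, while the first-order expansion misses exactly the quadratic form in the deviation.

```python
import numpy as np

def h(w, A):
    """Quadratic measurement function w^T A w."""
    return w @ A @ w

def jacobian(w, A):
    """Gradient of w^T A w, valid for symmetric A."""
    return 2.0 * A @ w

def hessian(A):
    """Hessian of w^T A w, valid for symmetric A."""
    return 2.0 * A

rng = np.random.default_rng(3)
M = 5
S = rng.standard_normal((M, M))
A = S @ S.T                          # symmetric positive semi-definite matrix
w_hat = rng.standard_normal(M)       # expansion point (the current estimate)
d = 0.1 * rng.standard_normal(M)     # deviation of the true state from the estimate

exact = h(w_hat + d, A)
first = h(w_hat, A) + jacobian(w_hat, A) @ d
second = first + 0.5 * d @ hessian(A) @ d
first_order_error = exact - first    # equals d^T A d, which is non-negative here
```

The first-order error grows with the square of the deviation, so a poor initial estimate makes the linearized constraint noticeably biased.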

2.4.1 Solutions Using Extended Kalman Filters

The EKF has been widely used in nonlinear filtering [60–62]. It approximates the nonlinear function around the a priori estimate of the Kalman filter. In our problems, only some of the measurement equations are nonlinear. Therefore, only the Jacobian and Hessian matrices of the nonlinear measurement equations need to be computed.

Given the state space model in Section 2.3.3, the Jacobian matrix $\mathbf{F}_w(k)$ and the Hessian matrices $\mathbf{F}_{ww}^{(1)}(\omega,k)$ and $\mathbf{F}_{ww}^{(2)}(\omega,k)$ of the nonlinear functions can be

For the SOEKF, the Hessian matrices in (2-32) lead to additional terms in the innovation and its covariance matrix in the MSE sense (see Appendix I). The corresponding bias terms in our problem can be expressed as

       

Finally, the EKFs using the first- and second-order Taylor expansion can be summarized as follows [62]:

Multi-Rank MVDR Beamformer Using the FOEKF and SOEKF

The SOEKF, using the second-order Taylor expansion, for the state space model in (2-27) is given by the following recursions initialized with $\hat{\mathbf{w}}(\omega,0)$ and $\mathbf{P}^{+}(\omega,0)$

the innovation vector and its covariance matrix. For a detailed derivation of the SOEKF, please refer to Appendix I.

2.4.2 Solution Using the Unscented Kalman Filter

The UKF uses sigma points to approximate the first- and second-order moments of the nonlinear transformation. There are different ways to set the sigma points and the weightings [62–66]. In this dissertation, we choose the method given by [62] since it gives positive weightings. The (2M + 1) sigma points for the approximation of the nonlinear measurement equations are generated by

 

The transformed sigma points are given by

     

where 1 denotes the M-by-1 all-one vector.
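A generic symmetric sigma-point set with positive weights can be sketched as follows. This is not necessarily the exact construction of [62]: the spread parameter `kappa`, the Cholesky factorization, and the real-valued state are assumptions made for the illustration. The sketch checks that the weighted points reproduce the mean and covariance, and that the weighted average of a quadratic function matches its exact expectation.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """2n+1 symmetric sigma points and weights (all positive when kappa > 0)."""
    n = mean.size
    L = np.linalg.cholesky((n + kappa) * cov)    # L @ L.T = (n + kappa) * cov
    # Rows of L.T are the columns of L, i.e. the spread directions.
    pts = np.vstack([mean, mean + L.T, mean - L.T])
    wts = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    wts[0] = kappa / (n + kappa)
    return pts, wts

rng = np.random.default_rng(4)
n = 3
S = rng.standard_normal((n, n))
P = S @ S.T + np.eye(n)                          # state error covariance (illustrative)
G = rng.standard_normal((n, n))
A = G @ G.T                                      # symmetric matrix of the quadratic form
m = rng.standard_normal(n)                       # state mean

pts, wts = sigma_points(m, P)
ut_mean = sum(wt * (p @ A @ p) for wt, p in zip(wts, pts))
exact_mean = m @ A @ m + np.trace(A @ P)         # E[w^T A w] for mean m, covariance P
```

Because the constraints considered here are quadratic, a symmetric sigma-point set captures their mean without any Jacobian or Hessian evaluation.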

The UKF is summarized as follows [62]:

Multi-Rank MVDR Beamformer Using the UKF

By using the sigma points given in (2-36), the UKF is given by the following recursions initialized with $\hat{\mathbf{w}}(\omega,0)$ and $\mathbf{P}^{+}(\omega,0)$

 

       ^ ^ ^ ^ 

The setting of the initial conditions is important for constrained Kalman filtering problems. The initial conditions should satisfy the constraints, or at least lie close to the feasible set of the constraints, within the order of the chosen variance parameters in the matrix $\mathbf{R}(\omega)$. An improper setting can dramatically degrade the performance under nonlinear constraints.

First, consider the rank-1 MVDR beamforming problem in (2-1). Based on the projection method in constrained Kalman filtering [55–59], the state error covariance matrix $\mathbf{P}^{+}(k)$ converges to the null space of the presumed steering vector $\mathbf{a}_s(\omega)$. Therefore, the initial values $\hat{\mathbf{w}}(\omega,0)$ and $\mathbf{P}^{+}(\omega,0)$ are chosen as

         

The normalization term in (2-38a) ensures that the distortionless constraint $\mathbf{w}^H(\omega)\,\mathbf{a}_s(\omega) = 1$ is satisfied.
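One initialization consistent with this description can be sketched for the rank-1 case. The specific forms below are assumptions, not a transcription of (2-38): the initial weights are the steering vector divided by its squared norm, and the initial covariance is the orthogonal projector onto the null space of the steering vector.

```python
import numpy as np

M = 4
a = np.exp(1j * np.pi * np.arange(M) * np.sin(0.25))   # presumed steering vector (illustrative)
a_norm_sq = np.real(a.conj() @ a)

# Assumed initial state: scaled steering vector, so that w0^H a = 1.
w0 = a / a_norm_sq

# Assumed initial covariance: projector onto the null space of a^H.
P0 = np.eye(M) - np.outer(a, a.conj()) / a_norm_sq

response = w0.conj() @ a        # distortionless response of the initial weights
leak = P0 @ a                   # component of a that survives the projection
```

With this choice, the initial uncertainty has no component along the steering vector, so early updates move the weights only in directions that leave the distortionless response unchanged to first order.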

Likewise, for multi-rank signal models, the weight vector $\mathbf{w}(\omega)$ is initialized to satisfy the distortionless constraint, and the state error covariance matrix $\mathbf{P}^{+}(k)$ can be set to the null space of $\mathbf{w}(\omega)$. Thus, the initial conditions $\hat{\mathbf{w}}(\omega,0)$ and $\mathbf{P}^{+}(\omega,0)$ for the multi-rank case are chosen as

   

2.4.4 Estimation of the Parameter Matrices

In the update equations of the proposed Kalman filters, there are two parameter matrices to be determined: $\mathbf{Q}(\omega)$ and $\mathbf{R}(\omega)$. In general, $\mathbf{Q}(\omega)$ models the random walk in the state update, which is typically assumed to be stochastically white. For
