

In document: 基於影像之3D物體重建 (Image-Based 3D Object Reconstruction), pp. 64-70

4. A Post-Nonlinear ICA Reflectance Model for 3D Surface Reconstruction

4.3. Post-Nonlinear ICA Model

In this section, we introduce a particular class of nonlinear mixtures that can be viewed as a hybrid structure: a linear stage followed by a nonlinear stage, as shown in Fig. 4-2. This structure, introduced by Taleb and Jutten [59], yields the observation $x(t) = (x_1(t), x_2(t), \ldots, x_n(t))^T$, which is an unknown nonlinear mixture of the unknown statistically independent sources $s(t) = (s_1(t), s_2(t), \ldots, s_n(t))^T$:

$$x_i(t) = f_i\Big(\sum_{j=1}^{n} a_{ij}\, s_j(t)\Big), \qquad i = 1, 2, \ldots, n,$$

where $f_i(\cdot)$ are unknown invertible differentiable nonlinear functions, and $a_{ij}$ are the entries of the unknown mixing matrix $A$. In the following, the mixture vector $x(t)$, and by extension the pair $(A, f)$, will be called a post-nonlinear (PNL) model.

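[Figure: the sources s1(t), s2(t), s3(t) enter the mixing matrix A, whose outputs pass through the nonlinearities f1, f2, f3 to give the observations x1(t), x2(t), x3(t).]

Figure 4-2. Post-nonlinear mixing ICA model (n = 3).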
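The mixing structure of Fig. 4-2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis implementation: the function name `pnl_mix`, the Laplacian sources, the matrix `A`, and the three nonlinearities are all assumptions chosen only to make the example runnable.

```python
import numpy as np

def pnl_mix(s, A, fs):
    """Post-nonlinear mixture: x_i(t) = f_i(sum_j a_ij * s_j(t)).

    s  : (n, T) array of source samples
    A  : (n, n) mixing matrix
    fs : list of n elementwise invertible functions (hypothetical f_i)
    """
    z = A @ s                                             # linear stage
    return np.vstack([f(z_i) for f, z_i in zip(fs, z)])   # nonlinear stage

rng = np.random.default_rng(0)
n, T = 3, 1000
s = rng.laplace(size=(n, T))              # illustrative independent sources
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])           # illustrative mixing matrix
fs = [np.tanh, lambda z: z + 0.1 * z**3, np.arcsinh]   # assumed nonlinearities
x = pnl_mix(s, A, fs)                     # observations, shape (n, T)
```

Each row of `x` depends on all sources through `A`, but the nonlinearity acts on one channel at a time, which is what distinguishes the PNL model from a general nonlinear mixture.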

Contrary to general nonlinear mixtures, the PNL model has a favorable separability property. Using the separation structure $(g, B)$ shown in Fig. 4-3, it can be demonstrated, under weak conditions on the mixing matrix $A$ and on the source distribution, that output independence is obtained if and only if the compositions $g_i \circ f_i$ are linear for every index $i$ from 1 to $n$. This means that the outputs $y(t) = (y_1(t), y_2(t), \ldots, y_n(t))^T$, estimated using an independence criterion, are equal to the unknown sources up to the same indeterminacies noted for the linear mixture model.

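[Figure: the observations x1(t), x2(t), x3(t) pass through the nonlinearities g1(θ1, x1(t)), g2(θ2, x2(t)), g3(θ3, x3(t)) and then through the matrix B to give the outputs y1(t), y2(t), y3(t).]

Figure 4-3. Separation architecture of the post-nonlinear ICA model (n = 3).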
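The separation structure (g, B) can be illustrated with a toy case in which the mixing nonlinearities are known, so that each g_i composed with f_i is linear by construction. Everything below (the function name `pnl_separate`, the choice of f_i and g_i, the scaling matrix `D`) is an assumption made for the sketch; in practice g and B must be estimated.

```python
import numpy as np

def pnl_separate(x, B, gs):
    """Separation structure: y(t) = B g(x(t)), g applied channel by channel."""
    v = np.vstack([g(x_i) for g, x_i in zip(gs, x)])
    return B @ v

rng = np.random.default_rng(0)
s = rng.uniform(-1.0, 1.0, size=(3, 500))       # illustrative sources
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])
z = A @ s
# Assumed mixing nonlinearities f_1 = arcsinh, f_2 = tanh, f_3 = exp.
x = np.vstack([np.arcsinh(z[0]), np.tanh(z[1]), np.exp(z[2])])

# g_1 o f_1 is linear with gain 2; g_2 o f_2 and g_3 o f_3 are the identity.
gs = [lambda u: 2.0 * np.sinh(u), np.arctanh, np.log]
D = np.diag([2.0, 1.0, 1.0])                    # per-channel gains of g_i o f_i
B = np.linalg.inv(D @ A)                        # linear stage absorbs the gains
y = pnl_separate(x, B, gs)
```

The gain of 2 on the first channel shows why g_i composed with f_i only needs to be linear rather than the identity: any per-channel scaling is absorbed by B.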

A very popular approach to estimating the ICA model is maximum likelihood (ML) estimation. ML estimation is a fundamental method of statistical estimation that takes as estimates the parameter values that give the highest probability for the observations. In the following sections, we show how to apply the ML estimation technique to post-nonlinear ICA estimation. A similar derivation of Eqs. (4-3)-(4-13), using mutual information as the cost function, is given by Taleb [59], [60].

4.3.1. Independence Criterion and Deriving the Likelihood

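The statistical independence of the sources is the main assumption. Accordingly, the separation architecture is tuned so that the components of its output $y = B\,g(\theta, x)$, with $g(\theta, x) = (g_1(\theta_1, x_1), \ldots, g_n(\theta_n, x_n))^T$, become statistically independent. This is achieved if the joint density factorizes as the product of the marginal densities:

$$p_y(y) = \prod_{i=1}^{n} p_{y_i}(y_i). \tag{4-3}$$

The density of the observation $x$ can then be formulated as

$$p_x(x) = |\det B| \prod_{i=1}^{n} \big|g_i'(\theta_i, x_i)\big| \prod_{i=1}^{n} p_{y_i}(y_i), \tag{4-4}$$

where $g_i'$ denotes the derivative of $g_i$ with respect to $x_i$ and $p_{y_i}$ are the densities of the independent components. Equation (4-4) can be expressed as a function of $B = (b_1, b_2, \ldots, b_n)^T$ and $x$, giving

$$p_x(x) = |\det B| \prod_{i=1}^{n} \big|g_i'(\theta_i, x_i)\big|\; p_{y_i}\big(b_i^T g(\theta, x)\big). \tag{4-5}$$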

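Then the likelihood is obtained as the product of this density evaluated at the $T$ sample points $x(1), x(2), \ldots, x(T)$. Denoting it by $L(B, \theta)$, we have

$$L(B, \theta) = \prod_{t=1}^{T} |\det B| \prod_{i=1}^{n} \big|g_i'(\theta_i, x_i(t))\big|\; p_{y_i}\big(b_i^T g(\theta, x(t))\big). \tag{4-6}$$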

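In general, it is more practical to use the logarithm of the likelihood, since it is algebraically simpler. This makes no difference here, since the logarithm attains its maximum at the same point as the likelihood. The log-likelihood is given by

$$\log L(B, \theta) = \sum_{t=1}^{T} \sum_{i=1}^{n} \log p_{y_i}\big(b_i^T g(\theta, x(t))\big) + \sum_{t=1}^{T} \sum_{i=1}^{n} \log \big|g_i'(\theta_i, x_i(t))\big| + T \log |\det B|. \tag{4-7}$$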

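To simplify notation, we denote the sum over the sample index $t$ by an expectation operator and divide the log-likelihood by $T$, which gives

$$\frac{1}{T} \log L(B, \theta) = E\Big\{ \sum_{i=1}^{n} \log p_{y_i}\big(b_i^T g(\theta, x(t))\big) \Big\} + E\Big\{ \sum_{i=1}^{n} \log \big|g_i'(\theta_i, x_i(t))\big| \Big\} + \log |\det B|, \tag{4-8}$$

where the expectation operator here is an average computed from the observed samples.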
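As a rough illustration, the sample-averaged log-likelihood (density term, Jacobian term of the nonlinear stage, and log|det B|) can be evaluated as below. The parametric form g_i(θ_i, x_i) = x_i + θ_i x_i^3 and the assumed log-density -2 log cosh(y) (constant dropped) are choices made only for this sketch; the thesis does not specify them at this point, and all function names are hypothetical.

```python
import numpy as np

def g(theta, x):
    """Hypothetical nonlinearity g_i(theta_i, x_i) = x_i + theta_i * x_i**3."""
    return x + theta[:, None] * x**3

def g_prime(theta, x):
    """Derivative of the hypothetical g_i with respect to x_i."""
    return 1.0 + 3.0 * theta[:, None] * x**2

def log_p_super(y):
    """Assumed log-density -2 log cosh(y), additive constant dropped."""
    return -2.0 * (np.logaddexp(y, -y) - np.log(2.0))

def avg_log_likelihood(B, theta, x):
    """(1/T) log L(B, theta), with expectations replaced by sample means.

    B: (n, n) separation matrix, theta: (n,) parameters, x: (n, T) data.
    """
    y = B @ g(theta, x)
    density_term = np.mean(np.sum(log_p_super(y), axis=0))
    jacobian_term = np.mean(np.sum(np.log(np.abs(g_prime(theta, x))), axis=0))
    _, logabsdet = np.linalg.slogdet(B)
    return density_term + jacobian_term + logabsdet
```

Using `np.logaddexp` for log cosh and `np.linalg.slogdet` for the determinant keeps the evaluation numerically stable for large |y| or ill-scaled B.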

4.3.2. The Derivation of Adaptation Rules with Maximum Likelihood (ML) Estimation

To perform maximum likelihood estimation in practice, we need an algorithm for the numerical maximization of the log-likelihood. In this section, we carry out this maximization by gradient methods. First, the maximization of the log-likelihood requires the computation of its gradient with respect to the separation architecture parameters $B$ and $\theta_i$, $i = 1, 2, \ldots, n$.

The first layer: To estimate the linear stage parameters, we must compute the gradient of the log-likelihood in Eq. (4-8) with respect to the separation matrix $B$:

$$\frac{1}{T} \frac{\partial \log L(B, \theta)}{\partial B} = E\big\{ h\big(B g(\theta, x(t))\big)\, g(\theta, x(t))^T \big\} + (B^T)^{-1}, \tag{4-9}$$

where $h(y) = (h_1(y_1), \ldots, h_n(y_n))^T$ and $h_i(y_i) = \partial \log p_{y_i}(y_i) / \partial y_i$ is the derivative of the $i$-th assumed log-density. This immediately gives the following adaptation rule for ML estimation:

$$\Delta B = \eta \big[ (B^T)^{-1} + E\{ h(y)\, g(\theta, x(t))^T \} \big], \tag{4-10}$$

where $\eta$ is the learning rate for adapting $B$. This rule has the same form as the one for $B$ in linear source separation, where the algorithm is often called the Bell-Sejnowski algorithm [61]. It is the simplest algorithm for maximizing the likelihood by gradient methods. However, because the matrix $B$ must be inverted at every step in Eq. (4-10), it converges very slowly. The convergence can be improved by whitening the data, and especially by using the natural gradient [63], which is based on the geometrical structure of the parameter space. Therefore, Eq. (4-12) is used to estimate the linear stage instead of Eq. (4-10):

$$\Delta B = \eta \big( I + E\{ h(y)\, y^T \} \big) B. \tag{4-12}$$
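The natural-gradient rule of Eq. (4-12), a step proportional to (I + E{h(y) y^T}) B, can be sketched as follows. To keep the example small, the input `v` stands in for g(θ, x), that is, the nonlinear stage is assumed to be already compensated; the Laplacian sources, the matrix `A`, the learning rate, the iteration count, and the function names are all illustrative assumptions.

```python
import numpy as np

def natural_gradient_step(B, v, eta, h):
    """One update B <- B + eta * (I + E{h(y) y^T}) B, with y = B v.

    v   : (n, T) inputs to the linear stage (stands in for g(theta, x))
    eta : learning rate
    h   : elementwise derivative of the assumed log-density
    """
    y = B @ v
    C = (h(y) @ y.T) / v.shape[1]        # sample estimate of E{h(y) y^T}
    return B + eta * (np.eye(B.shape[0]) + C) @ B

def h_super(y):
    """h(y) = -2 tanh(y), the choice for super-Gaussian sources."""
    return -2.0 * np.tanh(y)

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 5000))          # illustrative super-Gaussian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])               # illustrative mixing matrix
v = A @ s
B = np.eye(2)
for _ in range(1000):
    B = natural_gradient_step(B, v, eta=0.1, h=h_super)

P = B @ A   # close to a scaled permutation matrix if separation succeeded
```

No matrix inverse appears in the update, which is the practical advantage over the rule in Eq. (4-10).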

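The second layer: Differentiating the log-likelihood in Eq. (4-8) with respect to the parameters $\theta_k$ of the nonlinear function $g_k(\theta_k, x_k)$ gives

$$\frac{1}{T} \frac{\partial \log L(B, \theta)}{\partial \theta_k} = E\Big\{ \sum_{i=1}^{n} h_i(y_i)\, b_{ik}\, \frac{\partial g_k(\theta_k, x_k(t))}{\partial \theta_k} \Big\} + E\Big\{ \frac{\partial \log \big|g_k'(\theta_k, x_k(t))\big|}{\partial \theta_k} \Big\},$$

where $b_{ik}$ is the $(i, k)$ entry of $B$ and $h_i(y_i) = \partial \log p_{y_i}(y_i) / \partial y_i$. From this derivative of the log-likelihood, the parameters $\theta_k$ of the function $g_k(\theta_k, x_k(t))$ are updated by gradient ascent, that is, by a step proportional to this derivative.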
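For a concrete feel of the second-layer gradient, the sketch below evaluates the derivative of the sample-averaged log-likelihood with respect to each θ_k for one hypothetical parametrization, g_k(θ_k, x_k) = x_k + θ_k x_k^3. This parametrization, the function names, and the learning rate `eta_theta` are assumptions for illustration only; the thesis's actual g_k and update constants are not recoverable from the text here.

```python
import numpy as np

def theta_gradient(B, theta, x, h):
    """(1/T) d log L / d theta_k for the hypothetical g_k = x_k + theta_k x_k^3.

    First term : E{ sum_i h_i(y_i) b_ik * dg_k/dtheta_k }, with dg_k/dtheta_k = x_k^3
    Second term: E{ d log|g_k'| / dtheta_k }, with g_k' = 1 + 3 theta_k x_k^2
    """
    v = x + theta[:, None] * x**3                 # g(theta, x)
    y = B @ v                                     # outputs of the linear stage
    back = B.T @ h(y)                             # row k holds sum_i b_ik h_i(y_i)
    dg_dtheta = x**3
    dlogjac = 3.0 * x**2 / (1.0 + 3.0 * theta[:, None] * x**2)
    return np.mean(back * dg_dtheta + dlogjac, axis=1)

def theta_step(B, theta, x, h, eta_theta=0.01):
    """Gradient-ascent update of theta (eta_theta is an assumed learning rate)."""
    return theta + eta_theta * theta_gradient(B, theta, x, h)
```

Computing `B.T @ h(y)` once propagates the output derivatives back through the linear stage for all channels at the same time, in the manner of backpropagation through a two-layer network.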

4.3.3. Estimation of the Source Densities

Denote by $p_{y_j}(y_j)$ the assumed densities of the independent components, with $h_j(y_j) = \partial \log p_{y_j}(y_j) / \partial y_j$, and constrain the estimates of the independent components to be uncorrelated and to have unit variance. Then the ML estimator is locally consistent if the assumed densities $p_{y_j}(y_j)$ fulfill

$$E\{ y_j h_j(y_j) - h_j'(y_j) \} > 0, \quad \forall j. \tag{4-17}$$

The proof can be found in [64]. This condition shows how to construct families consisting of only two densities, so that the condition in Eq. (4-17) is true for one of these densities. For example, consider the following log-densities:

$$\log p^{+}(s) = \alpha_1 - 2 \log \cosh(s), \tag{4-18}$$

$$\log p^{-}(s) = \alpha_2 - \big[ s^2/2 - \log \cosh(s) \big], \tag{4-19}$$

where $\alpha_1$, $\alpha_2$ are positive parameters that are fixed so as to make these two functions logarithms of probability densities. These constants can be ignored in the following. For super-Gaussian independent components, the pdf defined by Eq. (4-18) is usually used. This means that the nonlinear function $h(\cdot)$ is the tanh function:

$$h^{+}(y) = -2 \tanh(y). \tag{4-20}$$

For sub-Gaussian independent components, the other pdf, defined by Eq. (4-19), is used. Then the nonlinear function $h(\cdot)$ can be written as

$$h^{-}(y) = \tanh(y) - y. \tag{4-21}$$

Finally, the choice between the two nonlinearities in Eq. (4-20) and Eq. (4-21) can be made by computing the nonpolynomial moment

$$k_j = \mathrm{sign}\Big( E\big\{ -\tanh(y_j)\, y_j + \big[ 1 - \tanh(y_j)^2 \big] \big\} \Big), \quad j = 1, 2, \ldots, n, \tag{4-22}$$

using some estimates of the independent components. The source distribution is then taken to be super-Gaussian when $k_j = 1$ and sub-Gaussian when $k_j = -1$, where the expectation in the formula is the average over all samples $t = 1, 2, \ldots, T$.
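The selection rule of Eq. (4-22) and the two nonlinearities of Eqs. (4-20) and (4-21) translate directly into code. The sketch below is illustrative: the function names are hypothetical, and the unit-variance Laplacian and uniform samples are synthetic stand-ins for estimated components.

```python
import numpy as np

def gaussianity_sign(y):
    """k_j = sign(E{-tanh(y_j) y_j + (1 - tanh(y_j)^2)}), one value per row of y."""
    t = np.tanh(y)
    return np.sign(np.mean(-t * y + (1.0 - t**2), axis=1))

def h_adaptive(y, k):
    """Row-wise nonlinearity: -2 tanh(y) where k_j = 1, tanh(y) - y where k_j = -1."""
    k = np.asarray(k)[:, None]
    return np.where(k > 0, -2.0 * np.tanh(y), np.tanh(y) - y)

rng = np.random.default_rng(0)
T = 20000
y = np.vstack([
    rng.laplace(scale=1.0 / np.sqrt(2.0), size=T),    # super-Gaussian, unit variance
    rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=T), # sub-Gaussian, unit variance
])
k = gaussianity_sign(y)      # sign per component
hy = h_adaptive(y, k)        # nonlinearity matched to each component
```

In an iterative scheme, `gaussianity_sign` would be re-evaluated on the current outputs so that each component switches to the nonlinearity satisfying the condition in Eq. (4-17).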

4.4. Solving the Proposed Nonlinear Reflectance Model by
