Related Works of Magnetic Source Imaging


For human beings, the brain is the most important organ, coordinating all parts of the body. For thousands of years, people have devoted themselves to discovering how the brain functions. With the progress of technology, invasive anatomy has given way to non-invasive functional brain mapping, where "non-invasive" means without physical harm.

In order to acquire brain signals non-invasively, we can measure electrophysiological, hemodynamic, or metabolic activity. Currently, Magnetoencephalography (MEG), Electroencephalography (EEG), functional Magnetic Resonance Imaging (fMRI), and Near-Infrared Reflectance Spectroscopy (NIRS) are the four major modalities.

When neurons become active, they induce changes in blood flow and oxygenation levels, which can be imaged as a proxy for neural activity. fMRI can detect hemodynamic changes with spatial resolution as high as 1-3 mm. However, because hemodynamic changes are relatively slow, fMRI has relatively poor temporal resolution (on the order of seconds) compared with techniques such as EEG and MEG.

MEG and EEG are two complementary techniques that measure, respectively, the magnetic induction and the scalp electric potentials produced by the ensemble of neuronal activities inside the brain. Their major advantage over fMRI is higher temporal resolution (on the order of milliseconds), because they measure electrical brain activity directly. MEG measures the magnetic fields produced by electrical activity inside the brain via extremely sensitive devices such as superconducting quantum interference devices (SQUIDs), and achieves a higher signal-to-noise ratio (SNR) than EEG. Therefore, MEG is often the modality of choice for source imaging.

However, the spatial resolutions of EEG and MEG are limited for two reasons. One is the limited number of spatial measurements: EEG and MEG measure brain signals with a limited number of electrodes or sensors, usually tens to a few hundred, which is small compared with the tens of thousands of voxels measured by fMRI. The other reason is the inherent ambiguity of the inverse problem, i.e., finding the distribution of sources when only the recordings are given. Even with infinitely many MEG and EEG sensors, an unambiguous localization of the neuronal activity inside the brain would not be possible. We can reach resolutions similar to those of fMRI only by imposing restrictive models on the sources of MEG and EEG signals [1]. The MEG device is shown in Figure 1.1.

Figure 1.1: The MEG device at Taipei Veterans General Hospital.

In order to discover brain functionality, we need to conduct source imaging, which involves solving the inverse problem. We can understand the inverse problem through the Bayesian statistical framework:

P(x|y) = P(x)P(y|x) / P(y)    (1.1)

P(x|y) is the conditional probability of x given y. It is also called the posterior probability because it is derived from, or depends upon, the specified value of y. P(y|x) is the conditional probability of y given x. P(y) is the prior or marginal probability of y and acts as a normalizing constant. P(x) is the prior or marginal probability of x; it is "prior" in the sense that it does not take into account any information about y.

Applying this Bayesian framework to the MEG inverse problem, let x represent the distribution of the sources in the head and y represent the recordings from the MEG sensors. P(x|y) then describes the inverse problem: obtaining the distribution of the sources given the MEG recordings. From the Bayesian equation, if we want to infer the parameters of the source distribution from the recordings, we can express the inverse problem in terms of the right-hand side of Eq (1.1). Therefore, P(y|x) is the key to the solution: it describes the probability of the recordings given the parameters of the source distribution, which is the forward problem.
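To make the role of each term concrete, consider a standard illustration (not the specific model of this thesis): a linear forward model y = Lx + n with lead field matrix L, Gaussian noise n, and a Gaussian prior on the sources x. Maximizing the posterior P(x|y) then reduces to a regularized least-squares problem,

x̂ = arg max_x P(x|y) = arg min_x ‖y − Lx‖² + λ‖x‖²

where λ is the ratio of the noise variance to the prior variance. The prior P(x) thus plays the role of the restrictive source model mentioned above; the minimum-norm constraint discussed in Section 1.1.2 is one such choice.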

However, there are several difficulties in robust source imaging: the amount of MEG recordings, the appropriate approaches for signal analysis and source localization, and head motion correction. First, we discuss the amount of recordings. For robust source imaging, we need a sufficiently large SNR, and the SNR can be enhanced by acquiring a sufficient number of trials. Because there are no explicit criteria or rules for the experimentalist to follow, the number of recordings is usually determined by the experimentalist's rule of thumb, which is very subjective.

Even when the amount of recordings is sufficient for data analysis, choosing appropriate approaches for signal analysis and source localization is another important issue. For signal analysis, Independent Component Analysis (ICA) is a powerful method that can retrieve independent components from a linear mixture of sources, under the assumption that the components are statistically independent. For source localization, many algorithms have been proposed. Source localization is an ill-posed problem because the number of sources is much larger than the number of sensors, so additional constraints are necessary. Among the approaches for source localization, the beamformer is comparatively focal and accurate, and has therefore attracted much attention in recent years.

There are also experimental conditions that affect the accuracy of source localization, such as head motion during MEG experiments. If the head moves during an experiment, the relative position between the head and the sensors changes, which introduces source localization errors. In MEG studies, however, head motion is unavoidable over long recording sessions. We need to solve this crucial problem to achieve higher source localization accuracy.

We discuss the three issues in detail in the following sections.

1.1.1 Amount of Recordings

To get event-related potentials for signal analysis and source localization, we usually need tens or hundreds of trials of brain signals to reach a sufficiently large SNR. To our knowledge, there is still no on-line evaluation method in the literature for judging whether the amount of MEG recordings is sufficient.

The easiest way is visual inspection of the on-line averaged waveform: when the expected (or well-known) features of the averaged waveform appear, we can decide that the amount of recordings is sufficient. But this is a subjective method, and what we need is an objective, formal criterion.

Some methods of statistical thresholding of source activities extract target source activities by measuring the discrimination between target and non-target source activities [2, 3]. Following this concept, in this thesis we display the discrimination between selected active and control states in the MEG recordings, giving the experimentalist an indication of whether the amount of recordings is sufficient and saving unnecessary time and effort.

We use a statistical hypothesis test to quantify this discrimination.
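As a minimal sketch of this idea (the window choices and threshold below are hypothetical, not the exact procedure of this thesis), one can compare per-trial amplitudes in an active window against those in a control (baseline) window with a paired t-test each time a new trial arrives, and stop acquiring once the discrimination is significant:

    import numpy as np
    from scipy import stats

    def enough_trials(trials, active_win, control_win, alpha=0.01):
        """Paired t-test between active and control windows.

        trials      : array (n_trials, n_samples) for one sensor
        active_win  : slice covering the expected response window
        control_win : slice covering the pre-stimulus baseline
        """
        active = trials[:, active_win].mean(axis=1)    # per-trial active amplitude
        control = trials[:, control_win].mean(axis=1)  # per-trial baseline amplitude
        t, p = stats.ttest_rel(active, control)        # paired hypothesis test
        return p < alpha, p

    # On-line usage: re-evaluate as trials accumulate.
    rng = np.random.default_rng(0)
    collected = []
    for trial in range(1, 201):
        x = rng.normal(0.0, 1.0, 300)   # hypothetical single trial: noise ...
        x[150:200] += 0.4               # ... plus a weak evoked response
        collected.append(x)
        if trial >= 10:
            ok, p = enough_trials(np.array(collected), slice(150, 200), slice(0, 100))
            if ok:
                print(f"discrimination reached after {trial} trials (p = {p:.2e})")
                break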

1.1.2 Signal Analysis and Source Localization

Signal Analysis

For signal analysis, ICA is a powerful method. ICA was originally proposed to solve the problem of blind source separation. In the following, we describe the concept of ICA.

We want to recover N independent source signals s = {s1(t), . . . , sN(t)} after they have been linearly mixed by a mixing matrix A:

x = {x1(t), . . . , xN(t)} = As    (1.2)

Given x, we estimate an un-mixing matrix W which recovers the original sources from x as u = Wx, under the assumption that the estimated sources u are as statistically independent as possible. Here u contains the independent components.

Before proceeding, we should note that uncorrelatedness is not independence; independence is a stronger condition than uncorrelatedness. Two random variables are said to be uncorrelated if their covariance is zero:

E{y1y2} − E{y1}E{y2} = 0    (1.3)

If two random variables are independent, they are uncorrelated. However, uncorrelated random variables are not necessarily independent. Take sin(x) and cos(x) for example, with x uniformly distributed over [0, 2π): both depend on x, but they are uncorrelated since cov(sin(x), cos(x)) = 0. Moreover, uncorrelatedness is a relation between only two random variables, whereas independence can be a relationship among more than two random variables.
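A quick numerical check of the sin/cos example (a sketch using NumPy):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 2.0 * np.pi, 1_000_000)  # x uniform on [0, 2*pi)
    y1, y2 = np.sin(x), np.cos(x)

    # The covariance is (numerically) zero, so y1 and y2 are uncorrelated ...
    print(np.mean(y1 * y2) - np.mean(y1) * np.mean(y2))  # ~0

    # ... yet they are clearly dependent: y1**2 + y2**2 equals 1 exactly,
    # so knowing y1 restricts y2 to at most two possible values.
    print(np.allclose(y1**2 + y2**2, 1.0))  # True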

Decorrelation methods such as PCA give uncorrelated output components. While PCA merely decorrelates the outputs, ICA attempts to make the outputs statistically independent.

First, we should know that there are some assumptions in ICA [4].

I. The independent components are assumed to be statistically independent.

This is the main assumption that ICA requires. We say that random variables y1, y2, . . . , yn are independent if information about yi does not give any information about yj for all i ≠ j. Mathematically, this can be formulated in terms of probability density functions:

p(y1, y2, . . . , yn) = p1(y1)p2(y2) . . . pn(yn)    (1.4)

where p(y1, y2, . . . , yn) is the joint probability density function of the yi and pi(yi) is the marginal probability density function of yi.

II. Gaussian variables are forbidden.

The higher-order cumulants of a gaussian distribution are zero, but such higher-order information is necessary for estimating the ICA model. One can show that any orthogonal transformation of gaussian random variables has exactly the same distribution; thus, with gaussian variables, the ICA model can be estimated only up to an orthogonal transformation. In other words, the mixing matrix is not identifiable for gaussian independent components. (Actually, if only one of the independent components is gaussian, the ICA model can still be estimated.)

Second, there are some ambiguities in the ICA model, which we explain below:

I. The variances of independent components are unknown.

We cannot determine the variances of the independent components: because both A and s are unknown, any scalar multiplier in one of the sources si of s can always be canceled by dividing the corresponding column ai of A by the same scalar.

II. The order of the independent components is unknown.

We cannot determine the order of the independent components. We can always apply a permutation matrix P to Eq (1.2), giving x = AP⁻¹Ps. Then Ps is just the original signals in a different order, and AP⁻¹ is a new unknown mixing matrix that the ICA algorithm could equally well recover; a numerical check of both ambiguities is sketched below.
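Concretely, scaling and permuting the sources while adjusting the mixing matrix accordingly leaves the observations x unchanged, so no algorithm can tell the two factorizations apart (a toy NumPy check):

    import numpy as np

    rng = np.random.default_rng(2)
    s = rng.laplace(size=(3, 1000))   # three non-gaussian sources
    A = rng.normal(size=(3, 3))       # unknown mixing matrix
    x = A @ s

    P = np.eye(3)[[2, 0, 1]]          # a permutation matrix
    D = np.diag([2.0, -0.5, 3.0])     # an arbitrary diagonal scaling

    # Mixing the permuted, rescaled sources with the adjusted matrix
    # reproduces x exactly, so order and variances are unidentifiable.
    A2 = A @ np.linalg.inv(D @ P)
    s2 = D @ P @ s
    print(np.allclose(A2 @ s2, x))    # True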

Having introduced the concept of ICA and its assumptions, we now turn to its optimization algorithm. The central idea is to make the output components as statistically independent as possible, which raises the question of how to measure independence. The key observation is that non-gaussianity indicates independence. By the central limit theorem (CLT), the distribution of a sum of independent random variables tends toward a gaussian distribution; in other words, a mixture of two independent signals usually has a distribution closer to gaussian than either of the original signals. Since u = Wx = WAs, any component of u that is still a mixture of the sources must be more gaussian than the sources themselves. But we want u to be a copy of s, i.e., as little gaussian as possible. This is equivalent to maximizing the non-gaussianity of Wx.

Kurtosis is often used as a measure of non-gaussianity; the kurtosis of a gaussian random variable is zero. However, kurtosis has the problem of being very sensitive to outliers. As a result, we use negentropy instead of kurtosis. From information theory, a gaussian variable has the largest entropy among all random variables of equal variance, so we can use entropy as a measure of non-gaussianity. For computational simplicity, we use an approximation to calculate the negentropy:

J(u) ≈ c[E{G(u)} − E{G(ν)}]    (1.5)

where c is some constant, E denotes the expectation, ν is a gaussian variable of zero mean and unit variance, and G is a non-quadratic function called a contrast function. There are several choices for G, such as tanh, exp, and the power function [5].

In the ICA algorithm, the contrast function G is a parameter that should be chosen for each condition [5]. Furthermore, we need preprocessing to make the ICA estimation problem simpler and better conditioned. Whitening is a preprocessing step that makes the input x uncorrelated with unit variance. The effect of whitening is shown in Figure 1.2: (a) is the joint density function of the original independent signals with uniform distributions; after an uncorrelated mixing of the independent signals, the distribution becomes (b), which is no longer independent; in (c), we see the effect of whitening, where the square defining the distribution is now clearly a rotated version of the original square in (a). Whitening can be achieved with principal component analysis (PCA) or singular value decomposition (SVD); a short sketch is given after the figure.


Figure 1.2: The effect of whitening. (a) The joint probability density function of the original independent signals with uniform distributions. (b) The distribution after an uncorrelated mixing of the independent signals from (a). (c) The joint distribution of the whitened mixtures.
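As referenced above, whitening itself is a single linear map. A minimal sketch via the eigendecomposition of the sample covariance (equivalent to PCA; using the SVD of the data matrix is analogous):

    import numpy as np

    def whiten(x):
        """Whiten x (n_channels, n_samples): zero mean, identity covariance."""
        xc = x - x.mean(axis=1, keepdims=True)   # remove the mean
        cov = xc @ xc.T / xc.shape[1]            # sample covariance matrix
        d, E = np.linalg.eigh(cov)               # eigendecomposition (PCA)
        V = E @ np.diag(d ** -0.5) @ E.T         # whitening matrix cov^(-1/2)
        return V @ xc, V

    # After whitening, the covariance of z is (numerically) the identity.
    rng = np.random.default_rng(3)
    x = rng.normal(size=(4, 4)) @ rng.uniform(-1, 1, size=(4, 5000))
    z, V = whiten(x)
    print(np.allclose(z @ z.T / z.shape[1], np.eye(4)))  # True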

Figure 1.3 gives an overview of the ICA algorithm. The first step is whitening, which makes the ICA estimation problem simpler and better conditioned. The second step optimizes the un-mixing matrix W iteratively by maximizing the non-gaussianity, approximated by the negentropy with the contrast function G. This optimization step achieves the purpose of ICA: making the output components as statistically independent as possible.

Figure 1.3: Overview of the ICA algorithm. The input measurement x is first whitened, making the ICA estimation problem simpler and better conditioned; the un-mixing matrix W is then optimized iteratively by maximizing the non-gaussianity.

There exist many ICA algorithms, including FastICA, Infomax, JADE, and others, each with its own advantages and disadvantages. Since we want to solve the problem efficiently, we choose the FastICA algorithm [5, 6] as our method.
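A compact sketch of the FastICA fixed-point iteration with the tanh contrast function is given below (illustrative only; in practice one would use a published implementation such as those accompanying [5, 6]). It assumes the input has already been whitened, e.g. with the whiten function sketched earlier:

    import numpy as np

    def fastica(z, n_iter=200, tol=1e-8):
        """Symmetric FastICA on whitened data z (n_components, n_samples)."""
        n, m = z.shape
        rng = np.random.default_rng(0)
        W = np.linalg.qr(rng.normal(size=(n, n)))[0]  # random orthogonal init
        for _ in range(n_iter):
            u = W @ z
            g = np.tanh(u)                            # contrast derivative g(u)
            g_prime = 1.0 - g ** 2                    # g'(u)
            # Fixed-point update: W <- E{g(u) z^T} - E{g'(u)} W
            W_new = g @ z.T / m - np.diag(g_prime.mean(axis=1)) @ W
            # Symmetric decorrelation: W <- (W W^T)^(-1/2) W
            d, E = np.linalg.eigh(W_new @ W_new.T)
            W_new = E @ np.diag(d ** -0.5) @ E.T @ W_new
            # Converged when every row direction stops changing.
            if np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol:
                W = W_new
                break
            W = W_new
        return W @ z, W   # estimated components u = Wz and un-mixing matrix W

The recovered components come back in arbitrary order and scale, reflecting the ambiguities discussed above.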

Source Localization

As mentioned above, because the electromagnetic field recorded by the MEG sensors is the ensemble of neuronal activities within the whole brain, we can predict the recorded electromagnetic field from forward solutions. The inverse problem can then be cast as estimating the neuronal activities inside the brain via Eq (1.1). These inverse problems are ill-posed because the number of sources is much larger than the number of sensors, so additional constraints are necessary. Many algorithms have been proposed [1] to solve the inverse problem.

The most widely used method is least-squares dipole fitting [7, 8]. It assumes the neuronal activities are composed of a fixed number of Equivalent Current Dipoles (ECDs) and finds the best solution by minimizing the error between the electromagnetic field generated by the ECDs and the MEG/EEG recordings. However, this algorithm has two difficulties: the number of sources must be known a priori, and the fit can become trapped in a local minimum. Multiple Signal Classification (MUSIC) [9–11] is another class of method. It avoids local minima by searching through the region of interest and selecting the locations whose forward models have peak projections onto the signal subspace. Because there is no perfect head model or noise-free MEG/EEG recording, determining the best signal subspace is an important issue. Minimum Norm Estimation (MNE) [12] sets the dipole orientation either on the tangential plane or normal to the local cortical surface, but its major problem is that, because of the minimum-norm constraint, the result tends to emphasize the cortical regions closer to the MEG sensors.

The inverse method that has attracted the most attention in recent years is beamforming [13, 14]. We can think of the beamformer as creating a virtual sensor at a specified position in the source space, using the information recorded by the MEG sensors to measure the signal with a specified orientation. It obtains the activation magnitude of the targeted source by imposing a unit-gain constraint, while suppressing the contributions from other sources by applying the minimum-variance criterion. This property improves the spatial resolution and the accuracy of the spatial filter.

There are two kinds of beamformers: the vector-type beamformer [15] and the scalar-type beamformer [16]. The vector-type beamformer decomposes the dipole into three orthogonal components, each with its own spatial filter. The scalar-type beamformer has only one spatial filter and determines the dipole orientation by maximizing the Z-value. Compared with the scalar type, the vector-type beamformer computes its spatial filters more efficiently, because all the involved procedures are deterministic. The major advantages of the scalar-type beamformer, however, are its higher signal-to-noise ratio (SNR) and higher spatial resolution [17, 18]. The two types give exactly the same output power and output SNR when the orientation is optimal, but the theoretical output SNR of a vector-type beamformer degrades significantly away from the optimal orientation [17].

We describe the theory of the scalar beamformer below.

I. Scalar Beamformer

First, based on the Sarvas forward model [19–21], consider a target unit dipole with parameters θ = {rq, q}, where rq is the dipole location and q is the dipole orientation. The MEG recordings m(t) can be separated into three components:

m(t) = mθ(t) + mδ(t) + n(t)    (1.6)

where mθ(t) denotes the contribution from the target source, mδ(t) denotes the magnetic recordings generated by all other sources, and n(t) denotes the noise of the real environment. Merging mδ(t) + n(t) into mn(t) and replacing mθ(t) with its lead-field form, the above equation becomes

m(t) = lθ sθ(t) + mn(t)    (1.7)

where sθ(t) is the dipole moment and lθ is the lead field, i.e., the predicted recordings of a unit dipole with orientation q.

Here lθ is the combination of the lead field matrix Lr and the dipole orientation q:

lθ = Lr q    (1.8)

The output signal y(t) is obtained by filtering the MEG recordings with the spatial filter wθ:

y(t) = wθ^T m(t)    (1.9)

We further apply Eq (1.7) to Eq (1.9). The unit-gain and minimum-variance constraints are illustrated in Figure 1.4: with the unit-gain constraint, the spatial filter preserves the full strength of the target source at (r, q), while the minimum-variance criterion suppresses the contributions from other sources.

Figure 1.4: With the unit-gain constraint (wθ^T lθ = 1), the spatial filter preserves the full strength of the target source at (r, q); the minimum-variance criterion suppresses the contributions from sources at other locations (wᵀl = 0).

Therefore, we can write the output as:

y(t) = wθ^T m(t)
     = wθ^T mθ(t) + wθ^T mn(t)
     = sθ(t) wθ^T lθ + wθ^T mn(t)
     = sθ(t) + wθ^T mn(t)    (1.10)

where the last equality uses the unit-gain constraint wθ^T lθ = 1. From the above, in order for the output signal y(t) to approximate the source strength sθ(t), we must suppress the leakage from sources at other locations, wθ^T mn(t), which corresponds to minimizing the variance of y(t). Therefore, we can obtain the optimal spatial filter from

ŵθ = arg min_wθ E{‖y(t) − E{y(t)}‖²} + α‖wθ‖²   subject to wθ^T lθ = 1    (1.11)

where E denotes the expectation and α is the parameter of Tikhonov regularization [22]. Here α adjusts the trade-off between specificity and noise sensitivity, corresponding to the shape of the beamformer spatial filter.

Substituting Eq (1.10) into Eq (1.11), we can solve this constrained minimization with a Lagrange multiplier to obtain the solution for wθ.
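The intermediate algebra is omitted in the text; for completeness, a standard sketch of the Lagrange-multiplier step is as follows. Since y(t) = wθ^T m(t), the objective in Eq (1.11) equals wθ^T (C + αI) wθ, so the Lagrangian is

L(wθ, λ) = wθ^T (C + αI) wθ + λ (wθ^T lθ − 1)

Setting the gradient ∂L/∂wθ = 2(C + αI)wθ + λlθ to zero gives wθ ∝ (C + αI)⁻¹ lθ, and the unit-gain constraint wθ^T lθ = 1 fixes the scale: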

ŵθ = (C + αI)⁻¹ lθ / (lθ^T (C + αI)⁻¹ lθ)    (1.12)

where C is the covariance matrix of the MEG recordings m(t) and I is the N×N identity matrix.
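A minimal numerical sketch of Eq (1.12) (with hypothetical array shapes; a real pipeline would estimate C from epoched MEG data and obtain lθ from a forward model):

    import numpy as np

    def scalar_beamformer_weights(C, l, alpha=0.0):
        """Spatial filter of Eq (1.12).

        C     : (N, N) covariance matrix of the MEG recordings
        l     : (N,)   lead field of the target dipole
        alpha : Tikhonov regularization parameter
        """
        Ci = np.linalg.inv(C + alpha * np.eye(C.shape[0]))
        return Ci @ l / (l @ Ci @ l)   # unit gain: w^T l = 1 by construction

    # Usage: filter the recordings to get the source waveform y(t) = w^T m(t).
    N, T = 64, 1000
    rng = np.random.default_rng(4)
    m = rng.normal(size=(N, T))        # stand-in for MEG recordings m(t)
    l = rng.normal(size=N)             # stand-in for the lead field l_theta
    C = m @ m.T / T
    w = scalar_beamformer_weights(C, l, alpha=0.05 * np.trace(C) / N)
    y = w @ m                          # beamformer output y(t)
    print(abs(w @ l - 1.0) < 1e-10)    # unit-gain constraint holds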

II. Statistical Mapping

Because the SNR of the recordings is often small, the effect of non-task-related sources must be considered. At every target position rq, we can compute the spatial filter from Eq (1.12). The norm of the spatial filter is location-dependent.

The lead field norm of a deeper source is smaller than that of a superficial source, and because of the unit-gain constraint, the norm of the spatial filter is inversely proportional to the squared norm of the lead field. Therefore, if we take only the filtered
