CHAPTER 2. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS

2.3 RESULTS AND DISCUSSION

2.3.1 SUCROSE SOLUTION

The 78 sucrose solution samples were divided in a 2:1 ratio into 52 calibration samples and 26 validation samples. The distribution of their sugar content (°Brix) is shown in Table 2.1. The maximum values of the two sets differed by 0.2 °Brix; the differences in minimum, mean, and standard deviation did not exceed 0.5 °Brix, and the coefficients of variation (CV) differed by 0.02. The two sets therefore met the requirement of consistent sugar content distributions.

Table 2.1 Summary of sugar contents of the sucrose solution samples. The total samples (n = 78), calibration set (n = 52), and validation set (n = 26) were arranged to have consistent distributions of sugar content.

Sucrose Solutions: Sugar Content (°Brix)

Group             n     Max.    Min.    Mean    SD      CV
Total Samples     78    19.00   0.40    9.83    5.48    0.56
Calibration Set   52    19.00   0.40    9.72    5.52    0.57
Validation Set    26    18.80   0.90    10.06   5.52    0.55
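The text does not state how the 2:1 split was made, only that the two sets were arranged to have consistent distributions. One common way to obtain such a split, given here as a minimal sketch and not as the procedure used in this study, is to sort the samples by reference value and send every third one to the validation set. The function names and the evenly spaced placeholder data are assumptions for illustration.

```python
from statistics import mean, stdev


def split_two_to_one(values):
    """Sort by reference value and send every third sample to validation."""
    ordered = sorted(values)
    calibration = [v for i, v in enumerate(ordered) if i % 3 != 1]
    validation = [v for i, v in enumerate(ordered) if i % 3 == 1]
    return calibration, validation


def summarize(values):
    """Return the statistics reported per group in Table 2.1."""
    m, sd = mean(values), stdev(values)
    return {"n": len(values), "max": max(values), "min": min(values),
            "mean": m, "sd": sd, "cv": sd / m}


# Placeholder data: 78 evenly spaced values, NOT the measured sugar contents.
brix = [0.4 + i * (19.0 - 0.4) / 77 for i in range(78)]
cal, val = split_two_to_one(brix)
for name, group in (("calibration", cal), ("validation", val)):
    print(name, summarize(group))
```

With this scheme the lowest and highest samples both fall in the calibration set, which matches the pattern in Table 2.1, where the calibration set spans the full range of 0.40 to 19.00 °Brix.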

2.3.1.1 SELECTION OF THE MOST APPROPRIATE NUMBER OF ICS

According to the definition of ICA, the observed signals can be decomposed into at most as many independent components (ICs) as there are samples (Hyvärinen and Oja, 2000). This study used the full wavelength range (400 to 2498 nm) as the input to ICA, decomposed the original spectra of the 52 sucrose solution calibration samples with 1 to 52 ICs, and examined the prediction error of the resulting calibration models, both with and without normalization. With only one IC the prediction error was high, so results are shown only for 2 to 50 ICs. As shown in Fig. 2.2, when the number of ICs increased to 4, the SEC without normalization decreased sharply to 0.14 °Brix and the SEV fell to 0.21 °Brix, indicating that the number of ICs influences the predictability of the spectral calibration model. Applying more ICs did not necessarily improve the calibration model, however, because the sucrose solutions were mixtures of only sucrose and water; hence only the first 4 ICs were used in this calibration model.

The results of ICA with normalized spectra are also shown in Fig. 2.2. The prediction error decreased greatly as the number of ICs increased to 7; the SEC and SEV with 7 ICs were 0.12 and 0.22 °Brix, respectively. Normalization apparently reduced the variation in SEV compared with the original spectra.

Fig. 2.2 Relationship between the numbers of ICs and errors of the predicted sugar

content for sucrose solutions. The most appropriate number of ICs for

normalized spectra was determined by the tendency of SEC (green-short dash

line) and SEV (blue-dash dot dot line) values.

2.3.1.2 SPECTRA DECOMPOSITION AND CORRELATION ANALYSIS OF SUGAR CONTENT

For ICA, it is critical to examine whether these 7 ICs were statistically independent. As an illustration, IC 1 and IC 4 were selected and their relationship is shown in Fig. 2.3; the coefficient of determination (r²) was only 4.0 × 10⁻⁸, which is consistent with IC 1 and IC 4 being independent of each other. The plots for every other pair among the 7 ICs showed distributions similar to that in Fig. 2.3, with all r² values smaller than 0.243, conforming to the mutually independent character of ICs (Hyvärinen and Oja, 2000).
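The pairwise check described above can be sketched as follows. This is an illustrative outline only: the helper names and the toy score columns are hypothetical and are not data from this study. Note also that a small r² shows only that two ICs are uncorrelated, which is a necessary but not a sufficient condition for statistical independence.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def max_pairwise_r2(columns):
    """Largest r^2 over all pairs of IC score columns, with the pair (1-based)."""
    best = (0.0, None)
    for (i, a), (j, b) in combinations(enumerate(columns, start=1), 2):
        r2 = pearson_r(a, b) ** 2
        if r2 > best[0]:
            best = (r2, (i, j))
    return best


# Toy scores for three hypothetical ICs (not data from this study).
scores = [
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [2.0, 0.0, 0.0, -2.0],
]
r2, pair = max_pairwise_r2(scores)
print(r2, pair)
```

Applied to the 7 ICs of the sucrose solutions, the same loop would cover all 21 pairs, and the largest value it returns would correspond to the upper bound of 0.243 quoted above.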

Fig. 2.3 Distribution of calibration and validation samples of sucrose solutions in IC

1-IC 4 space. IC 1 and IC 4 were randomly selected from the 7 ICs.

Eq. 2.5 shows that the constituent information ‘sugar content’ should correspond mainly to a specific IC, so the values of that IC in the mixing matrix should be highly correlated with the sugar content. The reference sugar content was therefore plotted against the values of each column (each IC) of the mixing matrix. As shown in Fig. 2.4, the correlation coefficient (r) between IC 1 and the reference sugar content reached 0.977, which means that, with 7 ICs extracted, IC 1 carried most of the sugar content information in the spectra. These results agree with Westad (2005). The number of ICs selected is therefore important, because it influences how the information is used after spectral decomposition.

Fig. 2.4 Correlation between the values of IC 1 in the mixing matrix and the reference

sugar contents of sucrose solutions.

The regression coefficient matrix obtained from the NIR spectra and the reference sugar content of the calibration set is shown in Table 2.2, where the values from top to bottom refer to IC 1 to IC 7. All values were compared in terms of their absolute values. The value in the first row (IC 1) was the largest, followed by that of IC 4. This agreed with the order of correlation between each IC and the reference sugar content, and indicated that the importance of each IC was independent of the IC sequence. Each major constituent had a corresponding IC decomposed by ICA with a clearly defined contribution, so that all constituents of the mixtures could be distinguished by ICA (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis et al., 2007; Kaneko et al., 2008).

Table 2.2 Regression coefficient matrix of sucrose solutions with 7 ICs extracted from the NIR spectra of the calibration set. The correlation between the absolute value of each IC in the regression coefficient matrix and the sugar content was examined.

IC #    Regression Coefficient
1       -2.1811
2       -0.2843
3       -0.1843
4        1.2976
5        0.1876
6       -0.1334
7       -0.1416
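The comparison by absolute value can be reproduced directly from Table 2.2. The short sketch below only sorts the tabulated coefficients; the variable names are chosen for illustration.

```python
# Regression coefficients of IC 1 to IC 7, copied from Table 2.2.
coefficients = {1: -2.1811, 2: -0.2843, 3: -0.1843, 4: 1.2976,
                5: 0.1876, 6: -0.1334, 7: -0.1416}

# Rank the ICs from the largest to the smallest absolute coefficient.
ranking = sorted(coefficients, key=lambda ic: abs(coefficients[ic]), reverse=True)
print(ranking)  # [1, 4, 2, 5, 3, 7, 6]
```

The resulting order, IC 1, 4, 2, 5, 3, 7, and 6, is the same as the order of correlation with the reference sugar content reported for the normalized calibration spectra.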

The ICs decomposed from the spectra by ICA reflect the spectral characteristics of the unknown mixture and, under ideal conditions, constitute the spectra of the pure materials in the mixture (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis et al., 2007). Since the sucrose solutions were mixtures of sucrose and water and their spectra comprised contributions from both constituents, the ICs decomposed by ICA should reflect the characteristics of these two pure substances. For the normalized original spectra of the calibration set, the order of the 7 ICs according to their correlation with the reference sugar content was IC 1, 4, 2, 5, 3, 7, and 6. The original NIR spectra of the calibration set and IC 1 are shown in Fig. 2.5(A) and (B), and the reflectance spectrum of sucrose powder after Detrend pretreatment is shown in Fig. 2.5(C). The peak positions of IC 1 (964, 1090, 1436, 2100, and 2276 nm) matched the characteristic wavelength ranges of sugar content (C-H band) (Chang et al., 1998; Park, 2003; Hahn and Yoon, 2006) and were also consistent with the absorption bands seen in Fig. 2.5(C). IC 1 can therefore be considered to respond mainly to the sugar content, in agreement with the results above. The other ICs were poorly correlated with the reference sugar content, and their absolute values in the regression coefficient matrix were much smaller than that of IC 1, so they played an assisting role.

Fig. 2.5 (A) Original NIR spectra of sucrose solutions, (B) IC 1 decomposed from

calibration sets, and (C) the reflectance spectrum of sucrose powder

post-Detrend.

2.3.1.3 SUGAR CONTENT QUANTIFICATION BASED ON ICA AND PLSR

Quantitative analyses of sugar content in the sucrose solutions were conducted by ICA and PLSR using the full wavelength range from 400 to 2498 nm. For ICA, the best spectral calibration model was obtained with the normalized original spectra and 7 ICs, giving Rc = 0.9998, SEC = 0.124 °Brix, rv = 0.9993, SEV = 0.216 °Brix, bias = 0.014 °Brix, and RPD = 25.54. Comparing the original spectra with and without normalization, the calibration models yielded similar outcomes for the validation set, whereas the SEC improved when normalization was applied. Although derivatives can correct the baseline shift of the original spectra and amplify signal features, they may also amplify noise, which makes them unsuitable for noisy spectral bands. The spectra in the range of 2200 to 2498 nm were noisier; the predictability of the spectral calibration models therefore decreased when derivatives were applied.

Table 2.3 Regression results by ICA and PLSR analyses for sucrose solutions.

2nd Derivative + Normalization 2 0.9990 0.243 20.96 0.9869 0.899 34.99 0.013 6.14

For PLSR, the best spectral calibration model was obtained with the original spectra and 2 factors, giving Rc = 0.9995, SEC = 0.181 °Brix, rv = 0.9985, SEV = 0.300 °Brix, bias = 0.069 °Brix, and RPD = 18.38 (Table 2.3). Moreover, with SEC = 0.192 °Brix and SEV = 0.546 °Brix for the 1st derivative with normalization, and SEC = 0.243 °Brix and SEV = 0.899 °Brix for the 2nd derivative with normalization, the SEV values of the two derivative pretreatments were about 2.8 and 3.7 times the corresponding SEC values. These PLSR spectral calibration models therefore had poor predictability when applied to the validation set.
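The figures quoted for the sucrose solution models can be cross-checked with a few lines of arithmetic. The sketch below assumes that RPD is the ratio of the standard deviation of the validation set (5.52 °Brix, Table 2.1) to the SEV; this definition is not restated in this section and is an assumption here, although it reproduces the reported RPD values to within rounding. The names `MODELS`, `rpd`, and `summary` are illustrative.

```python
# Standard deviation of the validation set, in °Brix (Table 2.1).
SD_VALIDATION = 5.52

# (SEC, SEV) in °Brix, as quoted in the text for each model.
MODELS = {
    "ICA, normalized original spectra, 7 ICs": (0.124, 0.216),
    "PLSR, original spectra, 2 factors": (0.181, 0.300),
    "PLSR, 1st derivative + normalization": (0.192, 0.546),
    "PLSR, 2nd derivative + normalization": (0.243, 0.899),
}


def rpd(sd, sev):
    """Assumed definition: SD of the validation reference values over SEV."""
    return sd / sev


summary = {
    name: {"sev_over_sec": sev / sec, "rpd": rpd(SD_VALIDATION, sev)}
    for name, (sec, sev) in MODELS.items()
}

for name, row in summary.items():
    print(f"{name}: SEV/SEC = {row['sev_over_sec']:.2f}, RPD = {row['rpd']:.2f}")
```

A SEV/SEC ratio well above that of the other models, as for the two PLSR derivative models, is the pattern the text interprets as poor predictability on the validation set.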

Comparing the quantitative analysis results of ICA and PLSR, all ICA spectral calibration models performed better than the PLSR models in predicting both the calibration and validation sets. This means that ICA extracts the characteristic information from the spectra more effectively, not only improving the explanatory ability of the calibration models for the calibration set but also increasing their tolerance for the validation set. The results also showed that ICA was preferable to PLSR because of its much lower bias (Table 2.3). This finding became more obvious with normalization, indicating that ICA was more tolerant of influences from factors other than the chemical characteristics of the constituents in the samples, which helped to build more robust spectral calibration models. In summary, for the sucrose solutions ICA achieved better quantitative analysis of sugar content than PLSR did, and selecting a suitable number of ICs and suitable spectral pretreatments helped improve the predictability of the spectral calibration models. The results for the sucrose solutions also helped establish proper procedures, with information applicable to the ICA analysis of wax jambu.

2.3.2 WAX JAMBU

A total of 114 wax jambu samples were used; their sugar contents ranged from 6.4 to 14.5 °Brix, with a mean of 9.92 °Brix and a standard deviation of 1.61 °Brix. The samples were divided in a 2:1 ratio into 76 calibration samples and 38 validation samples (Table 2.4).

Table 2.4 Summary of sugar contents of the wax jambu (Syzygium samarangense Merrill & Perry) samples. The total samples (n = 114), calibration set (n = 76), and validation set (n = 38) were arranged to have consistent distributions of sugar content.

Wax Jambu: Sugar Content (°Brix)

Group             n     Max.    Min.    Mean    SD      CV
Total Samples     114   14.50   6.40    9.92    1.61    0.16
Calibration Set   76    14.50   6.40    9.89    1.61    0.16
Validation Set    38    14.00   7.10    9.99    1.62    0.16

2.3.2.1 CORRELATION ANALYSIS OF NIR SPECTRA AND SUGAR CONTENT

Fig. 2.6 shows the distribution of the correlation coefficients between the sugar content of the wax jambu samples and their original, 1st derivative, and 2nd derivative spectra. The main absorption wavelengths of the original spectra were 676, 968, and 1144 nm; 676 nm lies in the visible region of red light, whereas 968 and 1144 nm lie in the NIR region and belong to the 2nd overtone of the C-H bond. The main absorption wavelengths of the 1st derivative spectra were 626, 974, 1070, and 1406 nm; 626 nm lies in the visible region of orange light, with a correlation of up to 0.808, while the remaining wavelengths lie in the NIR region. The main absorption wavelengths of the 2nd derivative spectra, namely 594, 642, and 692 nm, lie in the visible region between orange and red light. Fig. 2.6 shows that the wavelength range of 600 to 1098 nm was the major absorption band and that the 1st derivative spectra were most strongly correlated with the sugar content (Chung et al., 2004). The spectral band from 650 to 700 nm, which belongs to the absorption band of red light, was consistent with the color of wax jambu skin, indicating that color information was also reflected in the spectrum.

Fig. 2.6 Correlation coefficient distributions of the spectra and the sugar content of wax

jambu through three different spectral pretreatments (original spectra, 1st

derivative spectra, and 2nd derivative spectra).

The NIR spectra of the wax jambu samples were analyzed in band regions of 100 nm each: the full spectrum from 400 to 2498 nm was divided into 21 band regions, which were analyzed separately. The spectra of the 76 wax jambu calibration samples could have been decomposed into as many as 76 ICs; however, applying too many ICs could easily lead to overfitting of the model. Hence, in this study ICA was

conducted with the limit of 30 ICs. The SEV showed no obvious trend when applying 1

water in the wax jambu samples, so it was necessary to avoid using the spectral bands of

1450 and 1900 nm, which represent primarily water absorption. When applying 7 to 30 ICs (Fig. 2.7), the SEV values in the ranges of 600 to 700 nm and 800 to 1098 nm were less than 1 °Brix for the original spectra, as were those of the 1st and 2nd derivative spectra. All three types of spectra agreed with the spectral bands of higher correlation in Fig. 2.6, so the specific wavelength regions for the spectral analysis of wax jambu were selected from the ranges of 600 to 700 nm and 800 to 1098 nm (Chung et al., 2004).

Fig. 2.7 Relationship between spectral bands and errors of the predicted sugar content

for wax jambu when applying 7 to 30 ICs. Full spectrum range from 400 to

2498 nm was divided into 21 band regions by taking every 100 nm as a band

region.
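The division of the full spectrum into band regions can be sketched as follows. The function name and the convention that neighbouring bands share their boundary wavelength are assumptions for illustration; the text states only that each band region spans 100 nm and that 21 regions result.

```python
def band_regions(start=400, stop=2498, width=100):
    """Split [start, stop] nm into consecutive bands; the last one is clipped."""
    bands = []
    lower = start
    while lower < stop:
        upper = min(lower + width, stop)
        bands.append((lower, upper))
        lower += width
    return bands


bands = band_regions()
print(len(bands), bands[0], bands[-1])
```

The last region is shorter than the others (2400 to 2498 nm) because the measured range ends at 2498 nm, which is why 21 regions are obtained and not 20.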

2.3.2.2 SUGAR CONTENT QUANTIFICATION BASED ON ICA AND PLSR

2.3.2.2.1 ANALYSIS WITHOUT SPECTRAL PRETREATMENT

The ICA results of the spectral calibration model for wax jambu are shown in Table 2.5. The best spectral calibration model was obtained with the normalized 1st derivative spectra and 10 ICs, giving Rc = 0.956, SEC = 0.471 °Brix, rv = 0.954, SEV = 0.489 °Brix, bias = -0.013 °Brix, and RPD = 3.32. Among the 10 ICs, the four most strongly correlated with the reference sugar content were, in order, IC 3, 7, 8, and 6, with correlation coefficients (r) of -0.805, 0.647, -0.612, and 0.279, respectively. IC 3, 7, and 8 can be considered to respond mainly to the sugar content information (including fructose, glucose, and sucrose) (Moneruzzaman et al., 2011; Tehrani et al., 2011), since the composition of wax jambu is more complicated than that of a sucrose solution. Because the specific wavelengths used were within the ranges of 600 to 700 nm and 800 to 1098 nm, the spectra covered the 3rd overtone of the C-H bond, conforming to the results in Fig. 2.6 and 2.7. Additionally, the spectral calibration models built after normalization used the characteristic information of 10 ICs, which is in line with the SEV trend observed in Fig. 2.7. Moreover, the small bias values indicated that ICA had good tolerance to influences from factors other than the internal chemical composition of the samples.

The PLSR results of the spectral calibration model are shown in Table 2.5, with the

best spectral calibration model found in the normalized original spectra with 5 factors,

yielding Rc = 0.884, SEC = 0.753 °Brix, rv = 0.867, SEV = 0.816 °Brix, and bias =

0.238 °Brix. The specific wavelength regions used were within the wavelength range of

600 to 700 nm and 800 to 1098 nm, consistent with the aforementioned results.

Table 2.5 Regression results by ICA and PLSR analyses for wax jambu (without spectral pretreatment).

After comparing the results of ICA and PLSR quantitative analysis, it was found that

the ICA calibration model performed better than PLSR, since not only did it enhance the

predictability of the model but it also reduced the bias. The specific wavelengths used in

ICA and PLSR showed a high degree of coincidence. When applied to wax jambu

samples, the correlation analysis between NIR spectra and sugar content provided a

basis to select the appropriate specific wavelength regions.

2.3.2.2.2 ANALYSIS WITH SPECTRAL PRETREATMENT

To evaluate the best achievable predictability of the ICA models for wax jambu, ICA was further performed with spectral pretreatment and outlier removal. After selecting the best pretreatment parameters (the number of smoothing points and the derivative gap were both 3) and eliminating about one tenth of the samples as outliers (11 of the 114 samples), the best spectral calibration model, shown in Table 2.6, was obtained with the normalized 1st derivative spectra and 9 ICs, giving Rc = 0.988, SEC = 0.243 °Brix, rv = 0.971, SEV = 0.381 °Brix, bias = 0.001 °Brix, and RPD = 4.15. The PLSR results under the same conditions were Rc = 0.983, SEC = 0.287 °Brix, rv = 0.963, SEV = 0.426 °Brix, bias = -0.039 °Brix, and RPD = 3.71. With pretreatment and outlier removal, the ICA spectral calibration model again performed better than the PLSR model in predicting both the calibration and validation sets.
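The two pretreatment parameters mentioned above can be illustrated with a minimal sketch. The exact smoothing and derivative formulas used in this study are not given in this section, so the sketch assumes a centered moving average for smoothing and a simple difference between points separated by the gap for the derivative; other conventions exist. The function names and the linear test spectrum are hypothetical.

```python
def moving_average(spectrum, points=3):
    """Centered moving average; edge points use only the neighbours available."""
    half = points // 2
    smoothed = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def gap_derivative(spectrum, gap=3):
    """First difference between points that are `gap` samples apart."""
    return [spectrum[i + gap] - spectrum[i] for i in range(len(spectrum) - gap)]


# Hypothetical linear "spectrum" used only to show the behaviour.
raw = [float(i) for i in range(10)]
smoothed = moving_average(raw, points=3)
derivative = gap_derivative(smoothed, gap=3)
print(derivative)
```

For a linear input the interior derivative values are constant, and only the two end values differ because of the shortened smoothing windows at the edges. A larger gap or more smoothing points would suppress more noise at the cost of spectral resolution, which is the trade-off behind selecting these parameters.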

Table 2.6 Regression results by ICA and PLSR analyses for wax jambu (with spectral pretreatment).

Compared with previous studies (You, 2002; Lin, 2002; Chung et al., 2004), the spectral calibration models built by ICA had higher predictability for wax jambu: the SEC values reported by You (2002), Chung et al. (2004), and Lin (2002) were 0.413 °Brix, 0.388 °Brix, and 0.252 °Brix, respectively. Among them, the SEP values reported by Chung et al. (2004), 0.262 °Brix, 0.207 °Brix, and 0.322 °Brix, were all lower than the corresponding SEC value (0.388 °Brix). These MLR results seem unreasonable, because the prediction sets were unknown to the calibration model, so the SEP values would be expected to be higher than the SEC value. Even so, our ICA results listed in Table 2.6 were better than those reported by Chung et al. (2004) and Lin (2002) in terms of Rc, SEC, rp, and RPD.

The results of ICA sugar content quantification based on NIR spectroscopy showed

that ICA can effectively extract the characteristic information in the spectra, and build

the spectral calibration models with desirable abilities to evaluate the concentration of

the constituents. It thus can be expected that integration of ICA with NIR spectroscopy

could become a powerful tool for quantitative analysis of specific targets.