CONSTITUENTS’ CONTENTS
After eliminating the outliers (23 samples, 1/10 of the total) from the 230 Gentiana
scabra Bunge samples, the remaining 207 effective samples were divided into 138
calibration samples and 69 validation samples, a ratio of 2:1. Statistical assessments of
the gentiopicroside and swertiamarin contents in each data set are shown in Table 4.2.
The differences in the mean, standard deviation, and coefficient of variation (CV)
between the calibration and validation sets were all less than 0.05 %.
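The 2:1 division can be illustrated with a minimal sketch. The text does not state how samples were assigned to each set, so the rule used here (sort by reference content, send every third sample to validation) is an assumption, as are the function name `split_2_to_1` and the synthetic contents.

```python
import random


def split_2_to_1(contents):
    """Return (calibration, validation) index lists in a 2:1 ratio.

    Assumed rule, not taken from the text: rank the samples by reference
    content and assign every third ranked sample to the validation set, so
    both sets span a similar content range.
    """
    order = sorted(range(len(contents)), key=lambda i: contents[i])
    calibration = [idx for pos, idx in enumerate(order) if pos % 3 != 2]
    validation = [idx for pos, idx in enumerate(order) if pos % 3 == 2]
    return calibration, validation


random.seed(0)
# Synthetic contents (%) for illustration only, NOT the measured data.
contents = [random.uniform(1.59, 8.77) for _ in range(207)]
cal, val = split_2_to_1(contents)
print(len(cal), len(val))
```

Ranking before splitting is one common way to keep the mean and spread of the two sets close to each other, which is the property reported in Table 4.2.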
Table 4.2 The target constituents’ contents of effective samples, calibration set, and
validation set in Gentiana scabra Bunge.
                   No. of     Gentiopicroside Content (%)          Swertiamarin Content (%)
                   Samples    Mean (Min. - Max.)    SD     CV      Mean (Min. - Max.)    SD     CV
Effective Samples  207        4.72 (1.59 - 8.77)    1.52   0.32    0.69 (0.12 - 2.15)    0.49   0.72
Calibration Set    138        4.73 (1.59 - 8.77)    1.53   0.32    0.69 (0.12 - 2.15)    0.49   0.72
Validation Set     69         4.72 (1.92 - 8.19)    1.51   0.32    0.68 (0.12 - 1.72)    0.49   0.72
The NIR spectra of the 207 Gentiana scabra Bunge samples were obtained after
MSC treatment. As shown in Fig. 4.1(A), absorption peaks were found in the visible
region at both blue light (452 nm) and red light (666 nm), since the chlorophyll in
Gentiana scabra Bunge absorbs the majority of blue and red light during
photosynthesis. The spectra of the tissue culture and the shoot were similar, which could
be attributed to the fact that during the domestication period the tissue is mainly
composed of shoots, since the root development of Gentiana scabra Bunge is not obvious
at that time. In contrast, the root spectra differed markedly in the visible region,
with high absorption from green to yellow light (492 to 586 nm) and low
absorption (flat waveform) from orange to red light (606 to 700 nm). This could be due
to the lack of chlorophyll in the roots of the Gentiana scabra Bunge plant, which reduces
the absorption of blue and red light, while reflecting green light.
After MSC treatment, the spectra of Gentiana scabra Bunge were analyzed using the
following pretreatments: (1) smoothing; (2) smoothing with 1st derivative; and (3)
smoothing with 2nd derivative. The best pretreatment parameters (smoothing points /
gap) for the gentiopicroside analysis were (3/0), (2/2), and (6/6), whereas those for the
swertiamarin analysis were (1/0), (2/2), and (6/6). Both the smoothing points and the
gap were less than 10, indicating that the NIRS 6500 spectrophotometer was stable and
that the spectra of Gentiana scabra Bunge powder exhibited minimal noise.
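The pretreatment chain described above (MSC, then smoothing, then a gap derivative) can be sketched as follows. This is an illustrative sketch under stated assumptions: MSC is taken to be multiplicative scatter correction against the mean spectrum, "smoothing points" is read as the width of a running mean, and "gap" as the spacing of a finite difference. The exact definitions used by the instrument software are not given in the text, and the function names are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the mean spectrum (x = offset + slope * ref)
    and then corrected as (x - offset) / slope.
    """
    n_points = len(spectra[0])
    reference = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_points)]
    ref_mean = sum(reference) / n_points
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in s])
    return corrected


def smooth(spectrum, points):
    """Running mean over `points` consecutive values (assumed meaning)."""
    return [sum(spectrum[i:i + points]) / points
            for i in range(len(spectrum) - points + 1)]


def gap_derivative(spectrum, gap):
    """Finite difference between values `gap` positions apart (assumed meaning)."""
    return [spectrum[i + gap] - spectrum[i] for i in range(len(spectrum) - gap)]


# Synthetic example: one underlying shape with different offsets and scales.
base = [0.2, 0.5, 0.9, 0.4, 0.3, 0.6, 0.8, 0.7]
scatter = [(0.0, 1.0), (0.1, 1.2), (-0.05, 0.8)]  # (offset, scale) per sample
spectra = [[a + b * v for v in base] for a, b in scatter]

corrected = msc(spectra)
first = gap_derivative(smooth(corrected[0], 2), 2)   # "smoothing with 1st derivative"
second = gap_derivative(first, 2)                    # "smoothing with 2nd derivative"
```

In this synthetic case the three spectra differ only by offset and scale, so MSC maps all of them onto the same corrected curve.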
The correlation between the spectra of Gentiana scabra Bunge powder and the
bioactive components was assessed first when selecting specific wavelength regions
of the spectra. For the original, 1st derivative, and 2nd derivative spectra, the
correlation coefficients of gentiopicroside for the effective samples were
distributed as shown in Fig. 4.1(B), and a threshold value (|r| > 0.50) was set to
determine the degree of correlation. Because the influence of water absorption on the
spectrum of Gentiana scabra Bunge powder had been eliminated, it was unnecessary to
avoid the O-H bond absorption bands around 1450 and 1900 nm. Highly correlated bands
were found in both the visible and the NIR regions. For the original spectra, they were
located between the orange and red light region as well as in the O-H bond region. For
the 1st derivative spectra, they were located in the regions of red light, the 4th
overtone of C-H bond, the combination of 1st overtone of C-H bond, and the combination
between C-H bonds. For the 2nd derivative spectra, they were located in the regions of
red light, the 4th overtone of C-H bond, the 1st overtone of C-H bond, and the
combination between N-H bond and O-H bond.
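The band screening described above amounts to computing a correlation coefficient at each wavelength and keeping the wavelengths whose |r| exceeds the threshold. The following is a minimal sketch, assuming a Pearson correlation; the function names and the tiny data set are hypothetical and only illustrate the mechanics.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_bands(wavelengths, spectra, contents, threshold):
    """Return (wavelength, r) pairs whose |r| exceeds the threshold."""
    selected = []
    for j, wavelength in enumerate(wavelengths):
        r = pearson([s[j] for s in spectra], contents)
        if abs(r) > threshold:
            selected.append((wavelength, r))
    return selected


# Synthetic absorbances at three wavelengths for four samples (illustration only).
wavelengths = [666, 704, 1450]
spectra = [
    [0.1, 0.4, 0.3],
    [0.2, 0.3, 0.1],
    [0.3, 0.2, 0.1],
    [0.4, 0.1, 0.3],
]
contents = [1.0, 2.0, 3.0, 4.0]

bands = correlated_bands(wavelengths, spectra, contents, threshold=0.50)
print([wl for wl, _ in bands])
```

The same routine applies to derivative spectra by passing the pretreated values instead of the original absorbances.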
The correlation coefficients between the spectra of Gentiana scabra Bunge powder
and swertiamarin are shown in Fig. 4.1(C), with the threshold value (|r| > 0.75) set to
determine the degree of correlation. For the original spectra, the highly correlated
bands were located in several regions, including red light, the 1st overtone of C-H
bond, the combination between N-H bond and O-H bond, and the combination between C-H
bond and C-C bond. For the 1st derivative spectra, they were located in the regions of
the 4th overtone of C-H bond, the 2nd overtone of N-H bond, the 2nd overtone of C-H
bond, the combination of 1st overtone of C-H bond, the 1st overtone of C-H bond, and the
combination between C-H bond and C-C bond; for the 2nd derivative spectra, they were
located in the red light and the 4th overtone of C-H bond regions. As indicated by
Fig. 4.1(B) and 4.1(C), the 4th overtone of C-H bond was the main absorption band for
both gentiopicroside and swertiamarin. It is noteworthy that the dominance of red light
in the visible region of the original spectra could be attributed to the differences in
color among tissue culture, shoot, and root.
Fig. 4.1 (A) The spectra of Gentiana scabra Bunge powder post-MSC; (B) correlation
coefficient distributions between the spectra and gentiopicroside; and (C)
correlation coefficient distributions between the spectra and swertiamarin.
4.3.3 NIR SPECTRA DECOMPOSITION AND ICA ANALYSIS OF THE
TARGET CONSTITUENTS
According to the definition of ICA, an observed signal can be decomposed into at
most as many ICs as there are training samples (Hyvärinen and Oja, 2000). To avoid
over-fitting of the calibration model caused by using excessive ICs, calibration
models were built with only 1 to 17 ICs when ICA was conducted on the original spectra
(400 to 2498 nm) of the calibration set. The SEV of the calibration models dropped
steadily until 7 ICs were applied and then rose, indicating that incorporating more ICs
would not necessarily help the analysis and that decomposing the spectra into 7 ICs was
sufficient.
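For reference, the linear mixing model that underlies this decomposition, and the rule used to choose the number of ICs, can be written compactly. The symbols below (X, A, S, n, p, m) are notation assumed here for illustration and may differ from the notation of the equations given elsewhere in this work.

```latex
% X: spectra of n calibration samples at p wavelengths
% S: m independent components (one per row)
% A: mixing matrix (one row of IC scores per sample)
\[
  X = A\,S, \qquad
  X \in \mathbb{R}^{n \times p},\quad
  A \in \mathbb{R}^{n \times m},\quad
  S \in \mathbb{R}^{m \times p},\quad
  m \le n
\]
% Number of ICs chosen where the validation error stops decreasing
\[
  m^{*} = \arg\min_{1 \le m \le 17} \mathrm{SEV}(m) = 7
\]
```

Under this reading, each row of A holds the scores of one sample on the m ICs, and those scores are the inputs to the calibration models.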
After the original spectra (400 to 2498 nm) of the calibration set were decomposed
into 7 ICs, the correlations between each IC and the two bioactive components were
checked. ICs 4 and 5 presented the highest correlation coefficients, followed by IC 6,
suggesting that the spectral information about gentiopicroside and swertiamarin was
mainly stored in these three ICs. Peaks appeared at 704 nm for IC 4, at 692 and 740 nm
for IC 5, and at 494, 1838, 1944, 2058, and 2132 nm for IC 6 (Fig. 4.2), which was
consistent with the absorption bands seen in Fig. 4.1(B) and 4.1(C). This suggests that
the spectral characteristics of gentiopicroside and swertiamarin were mainly reflected
in ICs 4, 5, and 6 (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis,
2006; Kardamakis et al., 2007). These wavelengths were taken as the reference for
selecting specific wavelength regions of the spectra when building calibration models.
Fig. 4.2 The three ICs decomposed from the original spectra of Gentiana scabra Bunge
powder post-MSC that have higher correlations with gentiopicroside and
swertiamarin.
As shown in Eq. 4.2, the mixing matrix contained concentration information of the
two bioactive components in each sample. Since the spectral information of
gentiopicroside and swertiamarin was mainly stored in ICs 4 and 5, the values of
these two ICs in the mixing matrix were used to configure 2-D distributions. As can be
seen in Fig. 4.3(A) and 4.3(B), tissue culture, shoot, and root were distributed in three
distinct locations of the IC 4-IC 5 space. The values of tissue culture and shoot were
close to each other and the root presented a higher value in IC 5, showing differences
among different parts of Gentiana scabra Bunge presented in the spectra, which are
consistent with the result in Fig. 4.1(A). If the average contents of gentiopicroside and
swertiamarin were taken as the threshold values, the samples could be classified into
four groups, namely A: gentiopicroside and swertiamarin at high contents; B:
gentiopicroside at high content and swertiamarin at low content; C: gentiopicroside at
low content and swertiamarin at high content; and D: gentiopicroside and swertiamarin
at low contents. The distributions of calibration and validation sets in the IC 4-IC 5
space are shown in Fig. 4.3(C) and 4.3(D), of which the gentiopicroside contents of
most tissue cultures were higher than the mean value, suggesting that the production of
gentiopicroside of Gentiana scabra Bunge was sufficient during the domestication
period. As the grown plants of Gentiana scabra Bunge were collected at different
growth stages, their gentiopicroside content in root varied. The gentiopicroside content
in shoot was low, indicating that gentiopicroside was mainly stored in the root for
Gentiana scabra Bunge plant during greenhouse cultivation. On the other hand, the
swertiamarin content in tissue culture was higher than the mean value, but lower than
the mean value in shoot and root, indicating that swertiamarin in Gentiana scabra
Bunge plant was reduced during greenhouse cultivation; therefore it is preferable to
extract swertiamarin from tissue culture.
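The four-group rule described above can be written as a short function. This is a minimal sketch: the thresholds are the effective-sample means from Table 4.2 (4.72 % and 0.69 %), while the function name and the example contents are hypothetical. Treating a content equal to the mean as "low" is also an assumption, since the text does not say how ties are handled.

```python
GENTIOPICROSIDE_MEAN = 4.72  # %, effective samples, Table 4.2
SWERTIAMARIN_MEAN = 0.69     # %, effective samples, Table 4.2


def classify(gentiopicroside, swertiamarin):
    """Assign a sample to group A, B, C, or D using the mean contents as thresholds."""
    high_g = gentiopicroside > GENTIOPICROSIDE_MEAN
    high_s = swertiamarin > SWERTIAMARIN_MEAN
    if high_g and high_s:
        return "A"  # both high
    if high_g:
        return "B"  # gentiopicroside high, swertiamarin low
    if high_s:
        return "C"  # gentiopicroside low, swertiamarin high
    return "D"      # both low


# Hypothetical contents (%), for illustration only.
examples = [(6.10, 1.20), (6.10, 0.30), (2.50, 1.20), (2.50, 0.30)]
groups = [classify(g, s) for g, s in examples]
print(groups)
```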
Fig. 4.3 Scores of tissue culture, shoot, and root in IC 4-IC 5 space established with
calibration samples. (A) = calibration set, (B) = validation set. Scores of
gentiopicroside and swertiamarin in IC 4-IC 5 space established with
calibration samples. (C) = calibration set, (D) = validation set.
As discussed above, IC 6 also contains spectral information about gentiopicroside
and swertiamarin, so the values of ICs 4, 5, and 6 in the mixing matrix were used to
configure 3-D distributions. As shown in Fig. 4.4(A) and 4.4(B), tissue culture, shoot,
and root were clearly distributed in three locations of the IC 4-IC 5-IC 6 space,
indicating that even though the correlation between IC 6 and the two bioactive
components was lower than that of ICs 4 and 5, its information could still be helpful
to the analysis.
If the average contents of gentiopicroside and swertiamarin were used for sample
grouping, the distributions of calibration and validation sets in the IC 4-IC 5-IC 6 space
could be constructed, as shown in Fig. 4.4(C) and 4.4(D). The lower the value of IC 4 is,
the higher the value of IC 6, hence the higher the gentiopicroside content. Similarly, the
lower the values of ICs 4 and 5 are, the higher the value of IC 6, thus the higher the
swertiamarin content. Fig. 4.3 and 4.4 indicate that the differences among the various
parts of Gentiana scabra Bunge, together with the trends of the two bioactive
components, could be clearly identified in the space of ICs, making the information
useful in qualitative and quantitative analysis by NIR spectroscopy.
Fig. 4.4 Scores of tissue culture, shoot, and root in IC 4-IC 5-IC 6 space established
with calibration samples. (A) = calibration set, (B) = validation set. Scores of
gentiopicroside and swertiamarin in IC 4-IC 5-IC 6 space established with
calibration samples. (C) = calibration set, (D) = validation set.
The ICA results for the two bioactive components are shown in Table 4.3.
The best spectral calibration model of gentiopicroside was attained with the
2nd derivative spectra, for which the smoothing points and the gap were both 6 and the
wavelength ranges were 600 to 700 nm, 1600 to 1700 nm, and 2000 to 2300 nm (Rc = 0.847,
SEC = 0.865 %, rv = 0.756, SEV = 0.909 %, bias = -0.395 %, and RPD = 1.67). For
swertiamarin, the best spectral calibration model was acquired with the 1st
derivative spectra, for which the smoothing points and the gap were both 2 and the
wavelength ranges were 600 to 800 nm and 2200 to 2300 nm (Rc = 0.948, SEC = 0.168 %, rv
= 0.898, SEV = 0.216 %, bias = 0.003 %, and RPD = 2.28). Satisfactory outcomes were
acquired for both gentiopicroside and swertiamarin. The relationships between the
predicted and reference concentrations of both bioactive components are shown in Fig.
4.5. Since the content of gentiopicroside predicted by the calibration model was mainly
affected by bias, the predictability can be improved by eliminating the bias calculated
from a set of representative samples. As for the prediction accuracy of swertiamarin
content, the error mainly came from a few outlier samples because the swertiamarin
content in Gentiana scabra Bunge is relatively low, which is also why the quantity and
uniformity of Gentiana scabra Bunge powder are both important.
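The validation statistics and the bias correction mentioned above can be illustrated with a minimal sketch. The definitions used here are common chemometric conventions and are assumptions, since the text does not define them: bias is the mean of (predicted - reference), SEV is the bias-corrected standard error, and RPD is the standard deviation of the reference values divided by SEV. The function names and data are hypothetical.

```python
from math import sqrt


def validation_stats(reference, predicted):
    """Return (bias, SEV, RPD) under the assumed definitions."""
    n = len(reference)
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = sum(errors) / n
    sev = sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    ref_mean = sum(reference) / n
    sd = sqrt(sum((r - ref_mean) ** 2 for r in reference) / (n - 1))
    return bias, sev, sd / sev


def remove_bias(predicted, bias):
    """Subtract a bias estimated from representative samples."""
    return [p - bias for p in predicted]


# Synthetic reference and predicted contents (%), for illustration only.
reference = [2.0, 4.0, 6.0, 8.0]
predicted = [1.7, 3.5, 5.7, 7.5]

bias, sev, rpd = validation_stats(reference, predicted)
corrected = remove_bias(predicted, bias)
bias_after, sev_after, _ = validation_stats(reference, corrected)
print(round(bias, 3), round(bias_after, 3))
```

Because SEV is computed after removing the mean error, subtracting the bias leaves SEV unchanged and only shifts the predictions toward the reference values.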
Table 4.3 Prediction of the target constituents’ contents in Gentiana scabra Bunge by ICA models.
Fig. 4.5 Relationship between the estimated contents and the reference contents of (A)
gentiopicroside; and (B) swertiamarin in Gentiana scabra Bunge.
4.4 CONCLUSIONS
This study applied ICA to the NIR spectroscopic analysis of gentiopicroside and
swertiamarin, the bioactive components of Gentiana scabra Bunge, and discussed the
relevant tissue culture and grown plant (including shoot and root). By selecting ICs
that were
highly correlated to the bioactive components, the space of ICs could clearly show the
distribution of gentiopicroside and swertiamarin in different parts of Gentiana scabra
Bunge. Additionally, the predictability of the spectral calibration models on the two
bioactive components was adequate for establishing qualitative and quantitative
correlations. Therefore, by combining ICA with NIR spectroscopy, fast and accurate
evaluation at different growth stages could be achieved. This technology could
contribute substantially to the
quality management of Gentiana scabra Bunge during and post cultivation.
ACKNOWLEDGMENT
I would like to thank Mr. Cheng-Wei Huang, Mr. Yu-Song Chen, and Mr. Chun-Chi
Chen for their assistance.
CHAPTER 5. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS
WITH NEAR INFRARED SPECTROSCOPY FOR
EVALUATION OF RICE FRESHNESS
5.1 INTRODUCTION
Near infrared (NIR) spectroscopy, a rapid nondestructive inspection method based on
specific absorptions within a given range of wavelength corresponding to the
constituents in the sample, has been widely applied for evaluation of internal quality of
agricultural products (Delwiche, 1998; Delwiche and Graybosch, 2002; Bao et al., 2007;
Chen and Huang, 2010; Salgó and Gergely, 2012). Because an NIR spectrum of a disassembling the mixture’s signals from a Gaussian distribution into non-Gaussian
independent constituents with only a small loss of information and does not require any
additional information from the source (Comon, 1994).
Application of ICA for spectrum analysis has been demonstrated by Chen and Wang
(2001) in separating the pure spectra of various constituents from the NIR spectra of the
mixtures, whereupon relationships were established between the estimated independent
components and the constituents. Such a capability also enabled complete explanation
of the constituents’ properties for NIR qualitative analyses (Westad and Kermit, 2003).
In addition, ICA was used to obtain statistically independent and chemically
interpretable latent variables (LVs) in multivariate regression (Gustafsson, 2005). It was
also noted that the number of independent components extracted from the spectra of
mixtures is related to the performance of ICA (Westad, 2005). Moreover, ICA was
employed to identify the infrared spectra of mixtures containing two pure materials
(Hahn and Yoon, 2006) as well as the constituents in commercial gasoline (Pasadakis
and Kardamakis, 2006; Kardamakis et al., 2007). Equally noteworthy is the observation
that the calibration model built through multiple linear regression (MLR), after using
ICA to extract independent components of aqueous solutions, gave good predictability
(Kaneko et al., 2008). In other work, the accuracy of the NIR estimation of sucrose
concentration (Chuang et al., 2010) and glucose concentration (Al-Mbaideen and
Benaissa, 2011) was enhanced by using ICA.
While application of ICA for spectral analysis appears promising, available literature
still focuses mainly on chemical samples or non-natural products. To date, ICA has not
been applied to NIR quantitative analysis of the internal quality of rice. The storage
time of rice has an enormous effect on its appearance, flavor, and nutrient quality
(Zhou et al., 2002). Previous studies demonstrated that most lipids in rice hydrolyze
into free fatty acids, causing the acidity of rice to increase with prolonged storage
(Takano, 1989; Hu, 2011; Chen et al., 2011). The determination of rice freshness is
therefore one of the main goals of on-site examination, and there is a strong need to
develop a non-invasive, rapid detection method for the analysis of freshness.
Accordingly, the objective of the current study was to examine rice freshness both
qualitatively and quantitatively using NIR spectroscopy. Rice freshness was expressed by
both pH value and fat acidity (Hu, 2011; Chen et al., 2011). The pH values were
determined by the bromothymol blue-methyl red (BTB-MR) method (Hsu and Song, 1988) and
fat acidity by AACC International method 02-02.02 (AACC International, 2000). By means
of a calibration curve, a relationship between pH and fat acidity was established (Hu,
2011; Chen et al., 2011). ICA was subsequently integrated with NIR spectral analysis to
quantify the pH in rice. Linear regression was then used to build spectral calibration
models of pH value.
5.2 MATERIALS AND METHODS
5.2.1 SAMPLE PREPARATION
A total of 180 (= 6 cargo lots × 30 draws per lot) Tainan 11 (TN-11) paddy rice
samples stored at 10-15°C were provided by the Erlin Farmers’ Association, Changhua
County (a central-west coastal county in Taiwan) and Agricultural Research and
Extension Station, Taichung in Taiwan, including 6 crop seasons (1 lot per season): 2nd
crop of 2010, 1st crop of 2010, 1st crop of 2009, 1st crop of 2008, 1st crop of 2007 and 1st
crop of 2006. All samples were collected at one time and then dehulled and milled soon
thereafter (Hu, 2011; Chen et al., 2011).