Dry powder of G. scabra Bunge was gently poured into a small ring cup (i.d. 5 cm)
and subjected to NIR measurement (NIRS 6500, FOSS NIRSystems, Inc., Laurel, MD,
U.S.A.). The reflectance spectra of the samples were collected in the range of 400 to
2498 nm with 2 nm intervals, and the NIR spectrum of each sample was the average of
32 scans (Yang et al., 2008; Cheng, 2009).
To obtain the reference value of the bioactive component, gentiopicroside was
measured by HPLC (DX 500 ion chromatograph, Dionex Corporation, Sunnyvale, CA,
U.S.A.) equipped with a DIONEX C18 column (250 mm × 4.6 mm i.d.). The peak of
gentiopicroside appeared at 250 nm when methanol : water (20:80) was used as the
mobile phase at a flow rate of 1 mL/min. Gentiopicroside standard powder was weighed
on a high-precision balance and diluted to 1000, 500, and 250 ppm with 70%
methanol to prepare the standard solutions for the three-point calibration of HPLC. A
quantitative linear relationship was established between the standard concentration and
the peak area (Yang et al., 2008; Cheng, 2009).
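The three-point calibration described above can be sketched as a least-squares line through the standards. This is only an illustration: the peak areas below are hypothetical values chosen to lie on a straight line, and the function names `fit_calibration` and `area_to_conc` are not from the original work.

```python
import numpy as np

def fit_calibration(conc_ppm, peak_area):
    """Fit peak_area = slope * conc + intercept by least squares."""
    slope, intercept = np.polyfit(conc_ppm, peak_area, 1)
    return slope, intercept

def area_to_conc(peak_area, slope, intercept):
    """Invert the calibration line to estimate concentration (ppm)."""
    return (np.asarray(peak_area, dtype=float) - intercept) / slope

# Standard concentrations from the text; the areas are HYPOTHETICAL
# (generated as area = 2 * conc + 5 purely for illustration).
standards_ppm = np.array([250.0, 500.0, 1000.0])
areas = np.array([505.0, 1005.0, 2005.0])

slope, intercept = fit_calibration(standards_ppm, areas)
unknown_ppm = area_to_conc(1405.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, unknown={float(unknown_ppm):.1f} ppm")
```

With real data, the measured peak areas of the three standards would replace the hypothetical `areas` array.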
3.2.3 DATA ANALYSIS
In order to apply the specific wavelengths identified to multi-spectral imaging
inspection of G. scabra Bunge, the spectra of the full wavelength range (400 to 2498 nm)
and the silicon CCD sensing band (400 to 1098 nm) were analyzed. Modified partial
least squares regression (MPLSR) and stepwise multiple linear regression (SMLR)
methods were employed to build the calibration models of gentiopicroside.
3.2.3.1 MPLSR
MPLSR is an extension of partial least squares regression (PLSR) in which the spectra
and constituent values are normalized prior to PLSR. PLSR is
a standard tool in chemometrics and has been widely used in the pharmaceutical,
chemical, and agricultural fields (Wold et al., 2001). When PLSR is applied to spectral
analysis, the spectra can be regarded as a combination of several principal components
(PCs), each of which is expressed as a ‘factor’ in the PLSR algorithm. The factors are
ordered by their influence, i.e., the more important a factor is, the earlier it is ranked.
Since PLSR analysis uses information from the spectral bands, the analysis results
can be improved by selecting an appropriate number of factors and specific wavelength
ranges.
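To make the notion of successive factors concrete, the following is a minimal sketch of ordinary single-response PLSR (the NIPALS PLS1 algorithm) in NumPy. It is not the modified PLSR implemented in WinISI, and the function name `pls1_fit` and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Plain PLS1 via NIPALS; returns regression coefficients and intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # centered residual matrices
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f                        # weight: covariance of X residuals with y
        w /= np.linalg.norm(w)
        t = E @ w                          # scores of the current factor
        tt = t @ t
        p = E.T @ t / tt                   # X loadings
        c = f @ t / tt                     # y loading
        E = E - np.outer(t, p)             # deflate X
        f = f - c * t                      # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic example (not thesis data): 30 "spectra" with 5 variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 4.0
coef_demo, intercept_demo = pls1_fit(X_demo, y_demo, n_factors=5)
```

Each pass of the loop extracts one factor, so earlier factors capture more of the covariance between spectra and constituent values, which is why the number of factors retained matters.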
3.2.3.2 SMLR
SMLR selects the specific wavelengths according to the F-test (F ≥ 3) of the null
hypothesis (Chang et al., 1998). To build a calibration model from numerous candidate
wavelengths, the SMLR algorithm first chooses the most important specific wavelength,
typically from the major molecular bonding region of the object; the second most
important specific wavelength is usually chosen from a combination band of related
molecular bonds or an overtone of complementary bonds, and so on. When a new
wavelength is added, the algorithm builds on the previously selected specific
wavelengths and searches for the wavelength that gives the highest multiple
coefficient of determination (r²) and the minimum prediction error, and determines
whether that wavelength can replace a currently selected specific wavelength. If the
newly added wavelength does not improve the model sufficiently, the algorithm stops.
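The selection loop can be sketched as a forward stepwise regression with a partial F-test for entry. This simplified version only adds wavelengths and omits the replacement step described above; the function names, the `max_terms` parameter, and the synthetic data are assumptions for illustration, not the WinISI implementation.

```python
import numpy as np

def _rss(X, y, columns):
    """Residual sum of squares of an OLS fit with intercept on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_stepwise(X, y, f_to_enter=3.0, max_terms=None):
    """Add one column at a time while the partial F statistic is >= f_to_enter."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = X.shape
    max_terms = m if max_terms is None else min(max_terms, m)
    selected = []
    rss_old = _rss(X, y, selected)         # intercept-only model
    while len(selected) < max_terms:
        candidates = [j for j in range(m) if j not in selected]
        rss_by_col = {j: _rss(X, y, selected + [j]) for j in candidates}
        best = min(rss_by_col, key=rss_by_col.get)
        rss_new = rss_by_col[best]
        dof = n - len(selected) - 2        # n minus (new predictors + intercept)
        if dof <= 0 or rss_new <= 0.0:
            break                          # F statistic undefined; stop
        f_stat = (rss_old - rss_new) / (rss_new / dof)
        if f_stat < f_to_enter:
            break                          # candidate does not improve the fit enough
        selected.append(best)
        rss_old = rss_new
    return selected

# Synthetic example (not thesis data): only columns 3 and 7 carry signal.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 10))
y_demo = 2.0 * X_demo[:, 3] - 1.5 * X_demo[:, 7] + 0.01 * rng.normal(size=60)
chosen = forward_stepwise(X_demo, y_demo, max_terms=2)
```

With a threshold as low as F ≥ 3 and many candidate wavelengths, some columns can enter by chance, which is one reason a validation set is used to check the selected model.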
3.2.3.3 SPECTRAL PRETREATMENTS
The purpose of spectral pretreatments was to eliminate the spectral variation not
caused by chemical information contained in the samples (de Noord, 1994; Fearn, 2001).
Since light scattering is inevitably introduced into the spectra when powder samples
are measured by NIR, especially when the particle size is not uniform,
multiplicative scatter correction (MSC) was used to correct the additive and
multiplicative effects in the spectra (Eq. 3.1). It was conducted using the average
spectrum of all samples as the reference, and calculating the parameters a and b by
the least squares method. MSC treatment not only reduced the physical impact of
non-uniform particles on the spectra of G. scabra Bunge powder (Helland et al., 1995;
Maleki et al., 2007), but also ensured the linearity of the spectral information
(Isaksson and Næs, 1988), which would benefit subsequent linear regression analysis
(Thennadil et al., 2006).
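A minimal sketch of MSC in its commonly used form is given below, assuming each spectrum x is regressed on the mean spectrum as x ≈ a + b · mean and then corrected as (x − a) / b. The function name `msc` and the synthetic spectra are illustrative assumptions.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    spectra: 2-D array, one spectrum per row.
    reference: defaults to the mean spectrum of all samples.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        # Least-squares fit x = a + b * reference (polyfit returns slope first).
        b, a = np.polyfit(reference, x, 1)
        corrected[i] = (x - a) / b
    return corrected

# Synthetic example: one underlying shape with different offsets and gains.
base = np.linspace(0.2, 1.0, 50) ** 2
raw = np.array([0.10 + 0.8 * base, -0.05 + 1.2 * base, 0.30 + 1.0 * base])
fixed = msc(raw)
```

In this synthetic case the three spectra differ only by offset and gain, so after correction they all coincide with the mean spectrum.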
The spectra were also subjected to three independent treatments, namely (1) smoothing; (2) smoothing with 1st derivative; and (3)
smoothing with 2nd derivative, in order to choose the best pretreatment parameters,
including the smoothing points and the gap ranging from 2 to 50, with the gap being
greater than or equal to the smoothing points.
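The parameter search can be illustrated with a moving-average smoother and a gap (finite-difference) derivative. This is a sketch under stated assumptions: the exact smoothing and derivative algorithms used by the software are not given in the text, both parameters are assumed to run from 2 to 50, and the function names are hypothetical.

```python
import numpy as np

def moving_average(spectrum, points):
    """Boxcar smoothing over `points` adjacent data points (edges dropped)."""
    kernel = np.ones(points) / points
    return np.convolve(np.asarray(spectrum, dtype=float), kernel, mode="valid")

def gap_difference(spectrum, gap, order=1):
    """Apply the finite difference x[i + gap] - x[i] `order` times."""
    out = np.asarray(spectrum, dtype=float)
    for _ in range(order):
        out = out[gap:] - out[:-gap]
    return out

def pretreat(spectrum, points, gap, order):
    """order 0: smoothing only; 1 or 2: smoothing followed by a gap derivative."""
    smoothed = moving_average(spectrum, points)
    return smoothed if order == 0 else gap_difference(smoothed, gap, order)

def parameter_grid(low=2, high=50):
    """All (smoothing points, gap) pairs with gap >= smoothing points."""
    return [(points, gap)
            for points in range(low, high + 1)
            for gap in range(points, high + 1)]

grid = parameter_grid()
```

Each pair in `grid` would be evaluated for each of the three treatments, for example by cross validation, to pick the best pretreatment.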
3.2.3.4 MODEL ESTABLISHMENT
The spectral calibration models of MPLSR and SMLR were built by WinISI II
chemometric software (Infrasoft International, LLC., Port Matilda, PA, U.S.A.). The
MPLSR analysis procedure included: (1) spectral pretreatments; (2) selecting the
specific wavelength regions; (3) selecting the calibration set and validation set; and (4)
determining the best calibration model. In steps 1 and 2, 3-fold cross validation (CV) was
used to enable objective selection of the parameters. Calibration and validation samples
were assigned in a 2:1 ratio according to the gentiopicroside concentration of the
samples. All samples were ranked in ascending order of gentiopicroside concentration
and divided so that the gentiopicroside concentration in the calibration set was higher
than that in the validation set, while both sets had a concentration distribution similar
to that of all samples. When selecting the best calibration model, in order to avoid
over-fitting caused by the use of excessive factors, the following principles were adhered to:
(1) the maximum number of factors is one tenth of the number of calibration samples plus 2 to
3; (2) stop if adding a new factor makes the SEV rise; and (3) stop adding new factors
when the SEV is lower than the SEC. The SMLR analysis procedure was: (1)
selecting the calibration set and validation set; (2) spectral pretreatments; and (3)
determining the best calibration model and the specific wavelengths. The same calibration
and validation sets were used for both MPLSR and SMLR analyses.
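One possible reading of the 2:1 split is sketched below: samples are sorted by concentration and the middle sample of every consecutive group of three goes to the validation set, so the calibration set keeps the lowest and highest concentrations while both sets follow the overall distribution. The exact assignment rule is not stated in the text, so this rule, the function name, and the example concentrations are assumptions.

```python
import numpy as np

def split_two_to_one(concentrations):
    """Return (calibration_idx, validation_idx) for a ranked 2:1 split."""
    conc = np.asarray(concentrations, dtype=float)
    order = np.argsort(conc)                 # sample indices, ascending concentration
    position = np.arange(order.size)
    is_validation = position % 3 == 1        # middle sample of each group of three
    if order.size:
        is_validation[-1] = False            # keep the highest sample in calibration
    return order[~is_validation], order[is_validation]

# Hypothetical concentrations for nine samples (arbitrary units).
conc_demo = [5.1, 2.0, 9.3, 4.4, 7.7, 1.2, 8.8, 3.3, 6.0]
cal_idx, val_idx = split_two_to_one(conc_demo)
```

Because the same index arrays can be reused, this also reflects the point that both MPLSR and SMLR were built on identical calibration and validation sets.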
After the respective spectral calibration models of MPLSR and SMLR were built,
they were used to predict the gentiopicroside concentration of the
calibration and validation sets. The predictability of the models was evaluated based
on the following statistical parameters: the coefficient of correlation of the calibration
set (Rc), the standard error of calibration (SEC), the standard error of validation (SEV),
the bias, and the ratio of the standard deviation of the reference values to the
standard error of validation (RPD). In the definitions of these parameters,
Yc and Yv represent the estimated gentiopicroside concentration of the
calibration set and the validation set, respectively; Yr is the reference gentiopicroside
concentration; nc and nv are the number of samples in the calibration set and validation
set, respectively; and SD is the standard deviation of gentiopicroside concentration within
the validation set.
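In terms of these symbols, the statistics are conventionally written as follows. This is a sketch of the usual forms rather than a verbatim restatement: the sample index i is added for illustration, and the denominators n_c and n_v are an assumption, since some conventions use n − 1 or also subtract the number of model terms.

```latex
\begin{align*}
\mathrm{SEC}  &= \left[ \frac{1}{n_c} \sum_{i=1}^{n_c} \left( Y_{c,i} - Y_{r,i} \right)^{2} \right]^{1/2} \\
\mathrm{bias} &= \frac{1}{n_v} \sum_{i=1}^{n_v} \left( Y_{v,i} - Y_{r,i} \right) \\
\mathrm{SEV}  &= \left[ \frac{1}{n_v} \sum_{i=1}^{n_v} \left( Y_{v,i} - Y_{r,i} - \mathrm{bias} \right)^{2} \right]^{1/2} \\
\mathrm{RPD}  &= \frac{\mathrm{SD}}{\mathrm{SEV}}
\end{align*}
```

Under these forms, a larger RPD means the validation error is small relative to the natural spread of the reference values.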
3.3 RESULTS AND DISCUSSION