Near Infrared Analysis of Biomaterials Using Independent Component Analysis


Department of Bio-Industrial Mechatronics Engineering

College of Bioresources and Agriculture

National Taiwan University

Doctoral Dissertation

Near Infrared Analysis of Biomaterials Using Independent Component Analysis

Yung-Kun Chuang (莊永坤)

Advisor: Suming Chen, Ph.D. (陳世銘)

July, 2013

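ACKNOWLEDGEMENTS

My doctoral training has been full of memories, and I am very glad to have completed this dissertation. Throughout this journey I have been indebted to my advisor, Dr. Suming Chen (陳世銘), for his guidance and teaching. He set an excellent example in both word and deed, and led me from a naive undergraduate project student to a doctor capable of independent thinking and problem solving. I thank him sincerely.

I am grateful to the members of my dissertation committee for their careful review and correction of this dissertation: Emeritus Professor 盧福明 (bio-mechatronics engineering, National Taiwan University), Professor 盛中德 and Associate Professor 謝廣文 (bio-mechatronics engineering, National Chung Hsing University), and Dean 邱奕志 (bio-mechatronics engineering, National Ilan University). The guidance and suggestions of Dr. Y. Martin Lo of the Department of Nutrition and Food Science, University of Maryland, and Dr. Stephen R. Delwiche of the USDA Beltsville Agricultural Research Center were equally indispensable, and my gratitude to both is beyond words. I also thank the National Science Council for the one-year 千里馬 program grant (100-2917-I-002-004), which allowed me to visit the University of Maryland and the USDA, receive thorough and solid training and guidance from these two renowned scholars, and broaden my international perspective. I deeply appreciate and cherish this opportunity.

In the course of my research in the laboratory, I thank my senior labmates 加增 and 宜璋 for introducing me to the field. Dr. 蔡兆胤, who has been both a mentor and a friend, has been my role model; I benefited and grew by handling many of the laboratory's tasks together with him. I also thank 世傑, 翕雯, 毓良, 錦銘, and 詠惠, who were indispensable both in keeping the laboratory's work and projects running smoothly and in helping and encouraging me. I am grateful to 宇帆, 慶茵, 易平, 育菘, and 俊吉 for their help with the experiments and data in this research, and to Ms. Diane E. Chan, engineer at the USDA Beltsville Agricultural Research Center, for polishing the text of the dissertation. To the research assistants, seniors, classmates, and juniors who worked with me during my studies: you were my best companions, and only long-standing teamwork and a good atmosphere could have produced the laboratory's fruitful research results. Thank you.

Finally, I dedicate this dissertation to my dearest father, mother, elder brother, girlfriend, relatives, and friends. You have always given me unconditional support and encouragement in life and in spirit, which allowed me to persevere until I earned my doctoral degree. I present this dissertation to you as the best gift I can offer. Thank you.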

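CHINESE ABSTRACT

This dissertation uses independent component analysis (ICA) as the sole core algorithm for near infrared (NIR) quantitative analysis of three biomaterials: wax jambu (Syzygium samarangense Merrill & Perry), the medicinal plant Gentiana scabra Bunge, and white rice. Samples of different types (sucrose solution, intact wax jambu fruit, dried Gentiana scabra Bunge powder, and white rice kernels) are also analyzed. The first part of the study combines ICA with NIR spectroscopy for rapid quantification of sugar content in wax jambu, using the JADE algorithm, linear regression, and spectral pretreatments to analyze the NIR spectra of wax jambu and of sucrose solutions. Compared with other multivariate analysis methods, ICA provides more complete information on the sugar content of wax jambu. The best spectral calibration model used normalized first derivative spectra in the ranges of 600 to 700 nm and 900 to 1098 nm, with Rc = 0.988, SEC = 0.243 °Brix, and SEV = 0.381 °Brix, showing that ICA can quickly and accurately extract sugar content information from wax jambu spectra and build calibration models with high predictability, and thus quantify the sugar content of wax jambu more effectively. The second part applies ICA to the NIR analysis of gentiopicroside and swertiamarin, the marker constituents of Gentiana scabra Bunge, using 94 tissue culture plantlets and 68 grown plant samples (comprising 68 shoots and 68 roots). After independent components (ICs) highly correlated with the two marker constituents were selected, the tissue cultures, shoots, and roots were clearly distributed in three positions in the IC space, where trends in gentiopicroside and swertiamarin content could be observed. The best calibration model for gentiopicroside used second derivative spectra in the ranges of 600 to 700 nm, 1600 to 1700 nm, and 2000 to 2300 nm, with Rc = 0.847, SEC = 0.865 %, and SEV = 0.909 %; the best calibration model for swertiamarin used first derivative spectra in the ranges of 600 to 800 nm and 2200 to 2300 nm, with Rc = 0.948, SEC = 0.168 %, and SEV = 0.216 %. Both models showed good predictability. This study successfully established qualitative and quantitative relationships for gentiopicroside and swertiamarin, so that the contents of the two marker constituents can be measured in Gentiana scabra Bunge at different growth stages, providing a rapid and accurate tool for quality evaluation. The third part applies ICA to the rapid qualitative and quantitative analysis of rice freshness. Freshness is an important indicator of rice quality, and storage time affects the appearance, eating quality, and nutritional value of rice. White rice from six crop seasons (1st crop of 2006, 1st crop of 2007, 1st crop of 2008, 1st crop of 2009, 1st crop of 2010, and 2nd crop of 2010) was studied. The results show that white rice of different freshness was clearly distributed in the three-dimensional IC space, and the spectral calibration model built for pH value also had high predictability, with Rc = 0.939, SEC = 0.202, and SEP = 0.233. This indicates that combining ICA with NIR spectroscopy can effectively evaluate rice freshness, and that pH value and fat acidity are more appropriate indicators than crop year and season. Combining ICA with NIR spectroscopy allows fast and accurate evaluation of the internal constituents of biomaterials. ICA gives NIR spectroscopy a rapid and reliable tool for quantitative analysis of the internal constituents of biomaterials and contributes substantially to the evaluation of their internal quality.

Keywords: Near infrared spectroscopy, Independent component analysis, Wax jambu, Sugar content, Gentiana scabra Bunge, Gentiopicroside, Swertiamarin, Rice freshness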

ABSTRACT

In this study, independent component analysis (ICA) was first adopted as the sole tool

in conducting NIR quantitative analyses of biomaterials, including wax jambu fruit

(Syzygium samarangense Merrill & Perry), medicinal plant Gentiana scabra Bunge,

and milled white rice, to evaluate the applicability of this method. The influence due to

various types of samples (sucrose solution, intact fruit, dry powder of Gentiana scabra

Bunge, and rice kernel) was also studied. In the first part, ICA was integrated with near

infrared (NIR) spectroscopy for rapid quantification of sugar content in wax jambu. The

JADE algorithm (Joint Approximate Diagonalization of Eigenmatrices) and linear

regression with spectral pretreatments were incorporated to analyze the NIR spectra of

wax jambu as well as sucrose solutions. Unlike other multivariate approaches, ICA

enabled comprehensive quantification of sugar content in wax jambu. In the present

study, ICA was used as the sole tool to build the NIR calibration model of internal

quality of intact wax jambu without any other assisted multivariate analysis methods.

The best spectral calibration model of wax jambu (600 to 700 nm and 900 to 1098 nm)

yielded Rc = 0.988, SEC = 0.243 °Brix, and SEV = 0.381 °Brix using the normalized

first derivative spectra. Thus, ICA can quickly identify and effectively quantify the

sugar contents in wax jambu with calibration models achieving high predictability. In

the second part, ICA was applied to NIR spectroscopy on the analysis of gentiopicroside

and swertiamarin, the two bioactive components of Gentiana scabra Bunge.

Independent components (ICs) that are highly correlated to the two bioactive

components were selected for the analysis of tissue cultures, shoots and roots, which

were found to distribute in three different positions within the domain (2D and 3D)

constructed by the ICs. This setup could be used for quantitative determination of

respective contents of gentiopicroside and swertiamarin within the plants. For

gentiopicroside, the spectral calibration model based on the 2nd derivative spectra

produced the best effect in the wavelength ranges of 600 to 700 nm, 1600 to 1700 nm,

and 2000 to 2300 nm (Rc = 0.847, SEC = 0.865 %, and SEV = 0.909 %). For

swertiamarin, spectral calibration model based on the 1st derivative spectra gave the best

effect in the wavelength ranges of 600 to 800 nm and 2200 to 2300 nm (Rc = 0.948,

SEC = 0.168 %, and SEV = 0.216 %). Both models showed satisfactory predictability.

This study successfully established qualitative and quantitative correlations for

gentiopicroside and swertiamarin with NIR spectra, enabling rapid and accurate

inspection on the bioactive components of Gentiana scabra Bunge at different growth

stages. Furthermore, determination of freshness is an important issue for rice quality.

The storage time of rice has an enormous effect on its appearance, flavor, and quality of

the nutrients. A total of 180 white rice samples were collected from 6 crop seasons (2nd

crop of 2010, 1st crop of 2010, 1st crop of 2009, 1st crop of 2008, 1st crop of 2007 and 1st

crop of 2006) for the purpose of developing an ICA NIR based procedure for rice

freshness as quantified by pH. Freshness of white rice could be distinguished either

visually by a 3-dimensional diagram composed from ICs 2, 3 and 4, or statistically by a

calibration model (Rc = 0.939, SEC = 0.202, and SEP = 0.233). The results showed that

ICA with NIR has the potential to be a useful tool for evaluating rice freshness.

Compared to harvest time, pH value and fat acidity were more appropriate to serve as

indicators of rice freshness. By combining ICA with NIR spectroscopy, fast and

accurate evaluation of constituents in biomaterials could be achieved. ICA offers a rapid

and reliable tool for quantitative analyses of constituents in biomaterials by NIR

spectroscopy. The obtained results contribute substantially to identifying multiple

constituents of biomaterials and evaluating their concentrations.

Keywords: Near infrared spectroscopy, Independent component analysis, Wax jambu,

Sugar content, Gentiana scabra Bunge, Gentiopicroside, Swertiamarin,

Rice freshness

CONTENTS

ACKNOWLEDGEMENTS

CHINESE ABSTRACT

ABSTRACT

CONTENTS

LIST OF FIGURES

LIST OF TABLES

CHAPTER 1. GENERAL INTRODUCTION

1.1 INTRODUCTION

1.1.1 WAX JAMBU

1.1.2 GENTIANA SCABRA BUNGE

1.1.3 RICE

1.2 GENERAL OBJECTIVE

1.3 DISSERTATION ORGANIZATION

CHAPTER 2. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS WITH NEAR INFRARED SPECTROSCOPY FOR RAPID QUANTIFICATION OF SUGAR CONTENT IN WAX JAMBU

2.1 INTRODUCTION

2.2 MATERIALS AND METHODS

2.2.1 SAMPLE PREPARATION

2.2.2 NIR SPECTRA AND SUGAR CONTENT MEASUREMENT

2.2.3 DATA ANALYSIS

2.3 RESULTS AND DISCUSSION

2.3.1 SUCROSE SOLUTION

2.3.2 WAX JAMBU

2.4 CONCLUSIONS

CHAPTER 3. QUANTIFICATION OF BIOACTIVE GENTIOPICROSIDE IN A MEDICINAL PLANT GENTIANA SCABRA BUNGE BY NEAR INFRARED SPECTROSCOPY

3.1 INTRODUCTION

3.2 MATERIALS AND METHODS

3.2.1 G. SCABRA BUNGE SAMPLE PREPARATION

3.2.2 NIR SPECTRA AND HPLC MEASUREMENT

3.2.3 DATA ANALYSIS

3.3 RESULTS AND DISCUSSION

3.3.1 GENTIOPICROSIDE CONCENTRATION AND DISTRIBUTION IN G. SCABRA BUNGE

3.3.2 CORRELATION BETWEEN NIR SPECTRA AND GENTIOPICROSIDE CONTENT

3.3.3 GENTIOPICROSIDE QUANTIFICATION USING SPECIFIC WAVELENGTH RANGES

3.3.4 GENTIOPICROSIDE QUANTIFICATION USING CCD CAMERA WAVELENGTH SPECTRA

3.4 CONCLUSIONS

CHAPTER 4. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS WITH NEAR INFRARED SPECTROSCOPY FOR ANALYSIS OF BIOACTIVE COMPONENTS IN A MEDICINAL PLANT GENTIANA SCABRA BUNGE

4.1 INTRODUCTION

4.2 MATERIALS AND METHODS

4.2.1 GENTIANA SCABRA BUNGE SAMPLE PREPARATION

4.2.2 NIR SPECTRA AND HPLC MEASUREMENT

4.2.3 DATA ANALYSIS

4.3 RESULTS AND DISCUSSION

4.3.1 DISTRIBUTIONS OF THE TARGET CONSTITUENTS IN GENTIANA SCABRA BUNGE

4.3.2 CORRELATION BETWEEN NIR SPECTRA AND TARGET CONSTITUENTS’ CONTENTS

4.3.3 NIR SPECTRA DECOMPOSITION AND ICA ANALYSIS OF THE TARGET CONSTITUENTS

4.4 CONCLUSIONS

CHAPTER 5. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS WITH NEAR INFRARED SPECTROSCOPY FOR EVALUATION OF RICE FRESHNESS

5.1 INTRODUCTION

5.2 MATERIALS AND METHODS

5.2.1 SAMPLE PREPARATION

5.2.2 NIR SPECTRA AND PH VALUE MEASUREMENT

5.2.3 DATA ANALYSIS

5.3 RESULTS AND DISCUSSION

5.3.1 RELATIONSHIP BETWEEN FAT ACIDITY AND PH VALUE

5.3.2 DISTRIBUTIONS OF THE PH VALUE IN RICE

5.3.3 NIR SPECTRA DECOMPOSITION AND ICA ANALYSIS OF THE PH VALUE

5.4 CONCLUSIONS

CHAPTER 6. GENERAL CONCLUSIONS

6.1 GENERAL DISCUSSION

6.2 RECOMMENDATIONS FOR FUTURE RESEARCH

REFERENCES

LIST OF FIGURES

Fig. 2.1 A wax jambu (Syzygium samarangense Merrill & Perry) sample: (A) side view and the NIR measurement location, and (B) sample placement with the suggested distance of 7.62 cm between the light source and the top of the sample in the on-line NIRS 6500 spectrophotometer.

Fig. 2.2 Relationship between the numbers of ICs and errors of the predicted sugar content for sucrose solutions. The most appropriate number of ICs for normalized spectra was determined by the tendency of SEC (green short-dash line) and SEV (blue dash-dot-dot line) values.

Fig. 2.3 Distribution of calibration and validation samples of sucrose solutions in IC 1-IC 4 space. IC 1 and IC 4 were randomly selected from the 7 ICs.

Fig. 2.4 Correlation between the values of IC 1 in the mixing matrix and the reference sugar contents of sucrose solutions.

Fig. 2.5 (A) Original NIR spectra of sucrose solutions, (B) IC 1 decomposed from calibration sets, and (C) the reflectance spectrum of sucrose powder post-Detrend.

Fig. 2.6 Correlation coefficient distributions of the spectra and the sugar content of wax jambu through three different spectral pretreatments (original spectra, 1st derivative spectra, and 2nd derivative spectra).

Fig. 2.7 Relationship between spectral bands and errors of the predicted sugar content for wax jambu when applying 7 to 30 ICs. The full spectrum range from 400 to 2498 nm was divided into 21 band regions by taking every 100 nm as a band region.

Fig. 3.1 The spectra of G. scabra Bunge powder post-MSC: (A) tissue culture and (B) grown plants.

Fig. 3.2 Correlation coefficient distributions between absorbance values of the spectra and gentiopicroside contents of the G. scabra Bunge powder: (A) tissue culture and (B) grown plants.

Fig. 4.1 (A) The spectra of Gentiana scabra Bunge powder post-MSC; (B) correlation coefficient distributions between the spectra and gentiopicroside; and (C) correlation coefficient distributions between the spectra and swertiamarin.

Fig. 4.2 The three ICs decomposed from the original spectra of Gentiana scabra Bunge powder post-MSC that have higher correlation with gentiopicroside and swertiamarin.

Fig. 4.3 Scores of tissue culture, shoot, and root in IC 4-IC 5 space established with calibration samples. (A) = calibration set, (B) = validation set. Scores of gentiopicroside and swertiamarin in IC 4-IC 5 space established with calibration samples. (C) = calibration set, (D) = validation set.

Fig. 4.4 Scores of tissue culture, shoot, and root in IC 4-IC 5-IC 6 space established with calibration samples. (A) = calibration set, (B) = validation set. Scores of gentiopicroside and swertiamarin in IC 4-IC 5-IC 6 space established with calibration samples. (C) = calibration set, (D) = validation set.

Fig. 4.5 Relationship between the estimated contents and the reference contents of (A) gentiopicroside; and (B) swertiamarin in Gentiana scabra Bunge.

Fig. 5.1 Relationship between fat acidity and pH value established by the 18 selected rice samples (Hu, 2011; Chen et al., 2011).

Fig. 5.2 Distributions of pH values for six crop seasons of white rice samples.

Fig. 5.3 Relationship between the numbers of independent components and the model standard errors for pH value.

Fig. 5.4 Scores of white rice with 6 crop seasons in the vector space of independent components 2, 3, and 4 established with calibration samples. a = calibration set, b = validation set.

Fig. 5.5 Scatter plot of the reference pH values and the predicted pH values by independent component analysis of the NIR spectra.

LIST OF TABLES

Table 2.1 Summary of sucrose solutions and sample sugar contents. Total samples (n = 78), calibration set (n = 52) and validation set (n = 26) were arranged to have consistent distributions of sugar content.

Table 2.2 Regression coefficient matrix of sucrose solutions with 7 ICs extracted from the NIR spectra of calibration sets. Correlation between the absolute value of each IC in the regression coefficient matrix and sugar content was examined.

Table 2.3 Regression results by ICA and PLSR analyses for sucrose solutions.

Table 2.4 Summary of wax jambu (Syzygium samarangense Merrill & Perry) and sample sugar contents. Total samples (n = 114), calibration set (n = 76) and validation set (n = 38) were arranged to have consistent distributions of sugar content.

Table 2.5 Regression results by ICA and PLSR analyses for wax jambu (without spectral pretreatment).

Table 2.6 Regression results by ICA and PLSR analyses for wax jambu (with spectral pretreatment).

Table 3.1 The gentiopicroside content in tissue culture and grown plants of G. scabra Bunge.

Table 3.2 The gentiopicroside content of effective samples, calibration set, and validation set in tissue culture and grown plants.

Table 3.3 Prediction of the gentiopicroside content in tissue culture and grown plants of G. scabra Bunge by MPLSR models in the wavelength range of 400 to 2498 nm.

Table 3.4 Prediction of the gentiopicroside content in tissue culture and grown plants of G. scabra Bunge by MPLSR models in the wavelength range of 400 to 1098 nm.

Table 3.5 Prediction of the gentiopicroside content in tissue culture and grown plants of G. scabra Bunge by SMLR models in the wavelength range of 400 to 1098 nm.

Table 4.1 Contents and distributions of the target constituents in Gentiana scabra Bunge.

Table 4.2 The target constituents’ contents of effective samples, calibration set, and validation set in Gentiana scabra Bunge.

Table 4.3 Prediction of the target constituents’ contents in Gentiana scabra Bunge by ICA models.

Table 5.1 Regression results by ICA analyses for white rice.

CHAPTER 1. GENERAL INTRODUCTION

1.1 INTRODUCTION

Near infrared (NIR) spectroscopy, a nondestructive sensing method based on specific

absorptions within a given range of wavelength corresponding to the constituents in the

sample (McClure, 2003), has been widely applied for the evaluation of internal quality

of agricultural products (Davey et al., 2009; Lebot et al., 2011). Since the NIR spectrum of a

mixture is the linear summation of the individual spectra of its constituents,

such mixture spectra can thus be regarded as mixtures of ‘blind sources’ (Hyvärinen et al., 2001)

as the proportion of constituents in the samples remains unknown. Many attempts have

been made in recent years to extract critical features from the spectra using multivariate

analysis (Blanco and Villarroya, 2002; Burns and Ciurczak, 2008), including multiple

linear regression (MLR) (Chang et al., 1998), principal component regression (PCR)

(Wold, 1987), and partial least squares regression (PLSR) (Wold et al., 2001). However,

these methods were not designed for resolving the ‘blind source’ problem and may not

correlate well with the properties of constituents in the mixture, consequently hindering

the applicability of the spectra for chemometric analysis of the constituents

(Al-Mbaideen and Benaissa, 2011; Chen and Wang, 2001; Kaneko et al., 2008).
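
The linear-summation view described above can be written compactly. The following is only a sketch under assumed notation: the symbols x_i, s_j, a_ij, X, A, and S are introduced here for illustration and are not taken from the dissertation text.

```latex
x_i(\lambda) \;=\; \sum_{j=1}^{m} a_{ij}\, s_j(\lambda), \qquad i = 1, \dots, n
\qquad\Longleftrightarrow\qquad
\mathbf{X} = \mathbf{A}\,\mathbf{S}
```

Here x_i(λ) stands for the measured spectrum of sample i at wavelength λ, s_j(λ) for the pure spectrum of constituent j, and a_ij for the proportion of constituent j in sample i. The problem is ‘blind’ in the sense that both A and S are unknown and only X is measured.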

A multiuse statistical approach originally used to implement ‘blind source separation’

in signal processing (Herault and Jutten, 1986; Vittoz and Arreguit, 1989), independent

component analysis (ICA) is capable of decomposing mixture signals, which tend toward a Gaussian

distribution, into non-Gaussian independent components with only a small loss of

information and does not require any prior information about the sources (Comon,

1994). In practice, multiple ICA algorithms have been developed, including the JADE

algorithm (joint approximate diagonalization of eigenmatrices) (Cardoso and

Souloumiac, 1993; Cardoso, 1999) and the FastICA algorithm (Hyvärinen and Oja, 1997;

Hyvärinen, 1999), making ICA a high-speed and reliable tool (Hyvärinen and Oja, 2000)

for analytical chemistry (Lathauwer et al., 2000; Al-Mbaideen and Benaissa, 2011),

biomedical signal processing, telecommunications, econometrics, audio processing, and

image processing (Hyvärinen et al., 2001).
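
To make the idea of blind source separation concrete, the following minimal Python sketch unmixes two synthetic signals. It is a toy illustration only: it is neither the JADE nor the FastICA algorithm cited above, but a simple two-signal variant that whitens the data and then searches for the rotation that maximizes non-Gaussianity (measured by squared excess kurtosis). The sources, the mixing weights, and all function names are assumptions made for this example.

```python
import math
import random


def center(row):
    """Subtract the mean so the signal has zero mean."""
    mean = sum(row) / len(row)
    return [v - mean for v in row]


def excess_kurtosis(row):
    """Non-Gaussianity measure (0 for a Gaussian); expects zero-mean input."""
    n = len(row)
    m2 = sum(v * v for v in row) / n
    m4 = sum(v ** 4 for v in row) / n
    return m4 / (m2 * m2) - 3.0


def rotate(u, v, angle):
    """Rotate the pair of signals (u, v) by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * a + s * b for a, b in zip(u, v)],
            [-s * a + c * b for a, b in zip(u, v)])


def whiten(x1, x2):
    """Decorrelate two zero-mean signals and scale each to unit variance."""
    n = len(x1)
    var1 = sum(a * a for a in x1) / n
    var2 = sum(b * b for b in x2) / n
    cov = sum(a * b for a, b in zip(x1, x2)) / n
    # Rotation angle that diagonalizes the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * cov, var1 - var2)
    p1, p2 = rotate(x1, x2, angle)
    sd1 = math.sqrt(sum(a * a for a in p1) / n)
    sd2 = math.sqrt(sum(b * b for b in p2) / n)
    return [a / sd1 for a in p1], [b / sd2 for b in p2]


def unmix(x1, x2, steps=90):
    """Grid-search the rotation of the whitened data with the most non-Gaussian outputs."""
    z1, z2 = whiten(center(x1), center(x2))
    best_score, best_pair = -1.0, (z1, z2)
    for k in range(steps):
        # The contrast repeats every 90 degrees, so [0, pi/2) is enough.
        y1, y2 = rotate(z1, z2, 0.5 * math.pi * k / steps)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if score > best_score:
            best_score, best_pair = score, (y1, y2)
    return best_pair


def abs_corr(u, v):
    """Absolute Pearson correlation (ICA recovers sources only up to sign and scale)."""
    u, v = center(u), center(v)
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return abs(num / den)


# Hypothetical independent, non-Gaussian sources standing in for pure-constituent signals.
rng = random.Random(0)
n = 2000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]                         # sub-Gaussian
s2 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]  # super-Gaussian

# Two observed mixtures with assumed mixing weights (unknown to the unmixing step).
x1 = [0.7 * a + 0.3 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 0.6 * b for a, b in zip(s1, s2)]

y1, y2 = unmix(x1, x2)
print(abs_corr(y1, s1), abs_corr(y1, s2))
print(abs_corr(y2, s1), abs_corr(y2, s2))
```

Each recovered component should correlate strongly with exactly one of the sources, although the order and sign of the components are arbitrary. Real NIR spectra involve many more samples and wavelengths, for which established algorithms such as JADE or FastICA are the appropriate tools.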

Application of ICA for spectrum analysis has been demonstrated by Chen and Wang

(2001) to separate the pure spectra of various constituents from the NIR spectra of the

mixture and to build qualitative relationship between the estimated independent

components and the constituents. Such a capability also enabled complete explanation of the constituents’ properties for NIR qualitative analyses (Westad and Kermit, 2003).

In addition, ICA was used to obtain statistically independent and chemically

interpretable latent variables (LVs) in multivariate regression (Gustafsson, 2005). It was

also noted that the number of independent components extracted from the spectra of

mixtures is related to the performance of ICA (Westad, 2005). Moreover, ICA was

employed to identify the infrared spectrum of mixtures containing two pure materials

(Hahn and Yoon, 2006) as well as the constituents in commercial gasoline (Pasadakis

and Kardamakis, 2006; Kardamakis et al., 2007). Equally noteworthy is that the

calibration model built through MLR, after using ICA to extract independent

components of aqueous solutions, gave good predictability (Kaneko et al., 2008),

whereas NIR estimation of sucrose concentration (Chuang et al., 2010) and glucose

concentration (Al-Mbaideen and Benaissa, 2011) were enhanced by using ICA.

While application of ICA for spectral analysis appears promising, available literature

still focuses mainly on chemical samples or non-natural products, such as (1)

identification of constituents in the mixture, especially for chemical compounds (Chen

and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis

et al., 2007; Liu et al., 2008; Kaneko et al., 2008; Al-Mbaideen and Benaissa, 2011); (2)

a preprocessing method for improving predictability of calibration model (Zou and

Zhao, 2006); and (3) combination of ICA and other multivariate analysis methods, such

as PCA-ICA (Pasadakis and Kardamakis, 2006), ICA-MLR (Kaneko et al., 2008; Liu et

al., 2009), ICA-PLS (Liu et al., 2009), ICA-LS-SVM (Wu et al., 2008) and ICA-NNR

(Fang and Lin, 2008) to deal with linear or nonlinear problems. However, no published study

has used ICA with NIR spectroscopy as the sole tool to quantify the internal quality

or constituents of biomaterials without the assistance of other multivariate analysis methods.

The applicability of ICA for quantitative inspection of biomaterials should therefore be

evaluated. In this dissertation, ICA was applied for the first time as the sole tool in

conducting NIR quantitative analyses of biomaterials, including wax jambu fruit (see

CHAPTER 2), medicinal plant Gentiana scabra Bunge (see CHAPTER 3 and 4), and

milled white rice (see CHAPTER 5), to evaluate the applicability of this method.

The influence of sample type (sucrose solution, intact fruit, dry powder of

Gentiana scabra Bunge, and rice kernel) was also studied.

1.1.1 WAX JAMBU

Wax jambu (Syzygium samarangense Merrill & Perry), a fruit endemic to Taiwan

and parts of Southeast Asia, has a distinctive surface and texture that are easily bruised or

damaged, so the fruit must be handled delicately from harvest to shipping

and distribution. To date, several studies have aimed to develop a non-invasive and rapid

detection method for the analysis of internal quality of wax jambu (You, 2002; Lin,

2002; Chung et al., 2004). For further applications of ICA with NIR spectroscopy in the

inspection of fruits, wax jambu is suitable to serve as an example for discussion. In the

present study, ICA was integrated with NIR spectral analysis to quantify the sugar

content in intact wax jambu. The results of wax jambu were also compared with those

of sucrose solutions (mixtures of sucrose and de-ionized water). Spectral pretreatments

and linear regression were then used to build spectral calibration models of sugar

content. The analysis results of ICA were also compared with those of PLSR to assess

the abilities in predicting sugar content in wax jambu.

1.1.2 GENTIANA SCABRA BUNGE

Medicinal plants have always been considered an important and reliable source of

pharmaceuticals, since they are rich in many bioactive components. The international trade

market for medicinal plant products continues to expand and covers food, beverages,

drugs, cosmetics, and skin care products. Gentiana scabra Bunge, a perennial

herbaceous plant, is mainly grown in temperate regions such as Taiwan, China, Japan,

South Korea, and Russia. Dried root and rootstock of Gentiana scabra Bunge are

commonly used as pharmaceutical raw materials, since they are rich in many

secoiridoid glycosides such as gentiopicroside, swertiamarin and sweroside (Kakuda et

al., 2001). In particular, gentiopicroside has been shown to protect liver, inhibit liver

dysfunction, and promote gastric acid secretion in addition to its antimicrobial and

anti-inflammatory effects, making it a popular ingredient in Chinese herbal medicine

and health products (Kim et al., 2009).

In the past, Gentiana scabra Bunge was mainly collected from the wild. As the

demand for Gentiana scabra Bunge has increased, wild resources have gradually diminished,

and restoration of Gentiana scabra Bunge has become an important issue (Zhang et al.,

2010). Studies in recent years have used tissue culture technology to cultivate Gentiana

scabra Bunge (Cai et al., 2009), by domesticating the tissue culture of Gentiana scabra

Bunge, then transplanting it to the greenhouse for cultivation. In order to monitor the

change of Gentiana scabra Bunge during the growth process, it is necessary to measure

the bioactive components of Gentiana scabra Bunge. However, the commonly used

methods such as micellar electrokinetic capillary chromatography (MECC) (Glatz et al.,

2000), high performance liquid chromatography (HPLC) (Szücs et al., 2002; Kikuchi et

al., 2005; Carnat et al., 2005; Kušar et al., 2010; Hayta et al., 2011a; Hayta et al.,

2011b), liquid chromatography-mass spectrometry (LC-MS) (Aberham et al., 2007;

Aberham et al., 2011), and ultra-performance liquid chromatography (UPLC)

(Nastasijević et al., 2012) are all time-consuming and energy-intensive, and hence are not

suitable for daily quality inspection of Gentiana scabra Bunge during cultivation.

NIR spectroscopy has been widely used in pharmaceutical analysis, such as herbal component

analysis of the Chinese herbal plants Angelicae gigantis Radix (Woo et al., 2005), rhubarb

(Zhang and Tang, 2005), licorice (Glycyrrhiza uralensis Fisch.) (Wang et al., 2007),

Panax species (Chen et al., 2011), and Lonicera japonica (Wu et al., 2012), as well as

the content detection of active pharmaceutical ingredients (APIs) in tablets (Paris et al.,

2006; Jamrógiewicz, 2012; Porfire et al., 2012). However, it has not been employed to

qualitatively monitor the growth of Gentiana scabra Bunge. In recent years, ICA has

been used in medicinal tests (Fang and Lin, 2008; Wang et al., 2009; Shao et al., 2009).

Because no study has yet applied NIR spectroscopy to the inspection of the

internal components of Gentiana scabra Bunge, the intent of this study is to

apply ICA, which can analyze several components simultaneously, to the NIR

spectroscopic analysis of gentiopicroside and swertiamarin and to examine the qualitative and

quantitative relationships of the two bioactive components. Efforts were also made to

build spectral calibration models with high predictability in order to evaluate the

potential of NIR for quality inspection of Gentiana scabra Bunge.

1.1.3 RICE

Rice is one of the most important and popular food crops in the world, and freshness

of rice depends on the storage conditions such as storage time, storage temperature, and

relative humidity. Among them, the storage time of rice has an enormous effect on its

appearance, flavor, and quality of the nutrients (Zhou et al., 2002). Previous studies

demonstrated that most lipids in rice hydrolyze into free fatty acids and cause the acidity

of rice to increase with prolonged storage (Takano, 1989; Hu, 2011; Chen et al., 2011).

Therefore, the determination of rice freshness is one of the main goals of on-site

inspection, and there is a strong need to develop a non-invasive, rapid detection method

for the analysis of freshness. The objective of the current study was thus to inspect

rice freshness in terms of qualitative and quantitative approaches using NIR

spectroscopy. Rice freshness was expressed by both pH value and fat acidity. The pH

values were determined by bromothymol blue - methyl red (BTB-MR) method (Hsu

and Song, 1988) and fat acidity by AACC International method 02-02.02 (AACC

International, 2000). By means of a calibration curve, a relationship between pH and fat

acidity was established. ICA was subsequently integrated with NIR spectral analysis to

quantify the pH in rice. Linear regression was then used to build spectral calibration

models of pH value.
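
As a rough illustration of how a linear calibration model and its summary statistics (Rc, SEC, SEP) might be computed, the sketch below fits pH against a single IC score by least squares. All numbers are synthetic values made up for this example, not data from this study, and the degrees-of-freedom convention used for the standard errors is one common choice that is assumed here, since the exact definitions are not given in this section.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(a, b):
    """Pearson correlation coefficient between two sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    den = math.sqrt(sum((u - mean_a) ** 2 for u in a) *
                    sum((v - mean_b) ** 2 for v in b))
    return num / den


def standard_error(reference, predicted, n_params=0):
    """Root of the residual sum of squares over (n - n_params); an assumed convention."""
    dof = len(reference) - n_params
    rss = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(rss / dof)


# Synthetic, made-up IC scores and reference pH values (illustration only).
cal_scores = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
cal_ph = [6.1, 6.4, 6.6, 6.9, 7.3, 7.5]
val_scores = [0.20, 0.50, 0.80]
val_ph = [6.3, 6.8, 7.4]

intercept, slope = fit_line(cal_scores, cal_ph)
cal_pred = [intercept + slope * s for s in cal_scores]
val_pred = [intercept + slope * s for s in val_scores]

rc = pearson_r(cal_ph, cal_pred)                     # calibration correlation
sec = standard_error(cal_ph, cal_pred, n_params=2)   # slope and intercept were fitted
sep = standard_error(val_ph, val_pred)               # independent validation samples
print(round(rc, 3), round(sec, 3), round(sep, 3))
```

With several ICs the same idea extends to multiple linear regression on the IC scores; the single-predictor form is used here only to keep the sketch short.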

1.2 GENERAL OBJECTIVE

The objective of the dissertation was to apply ICA as the sole tool in conducting NIR

quantitative analyses of biomaterials, including wax jambu fruit, medicinal plant

Gentiana scabra Bunge, and milled white rice, to evaluate its applicability. The influence

of sample type (sucrose solution, intact fruit, dry powder of Gentiana

scabra Bunge, and rice kernel) was also studied in the dissertation.

1.3 DISSERTATION ORGANIZATION

The dissertation is written in the alternative format. The “GENERAL INTRODUCTION” section is followed by chapters containing manuscripts of four

research papers: (1) Integration of independent component analysis with near infrared

spectroscopy for rapid quantification of sugar content in wax jambu (Syzygium

samarangense Merrill & Perry), (2) Quantification of bioactive gentiopicroside in a

medicinal plant Gentiana scabra Bunge using near infrared spectroscopy, (3)

Integration of independent component analysis with near infrared spectroscopy for

analysis of bioactive components in a medicinal plant Gentiana scabra Bunge, and (4)

Integration of independent component analysis with near infrared spectroscopy for

evaluation of rice freshness. These are followed by a “GENERAL CONCLUSIONS”

section.

CHAPTER 2. INTEGRATION OF INDEPENDENT COMPONENT ANALYSIS WITH NEAR INFRARED SPECTROSCOPY FOR RAPID QUANTIFICATION OF SUGAR CONTENT IN WAX JAMBU

2.1 INTRODUCTION

Near infrared (NIR) spectroscopy, a nondestructive inspection method based on

specific absorptions within a given range of wavelength corresponding to the

constituents in the sample (McClure, 2003), has been widely applied for the evaluation

of internal quality of agricultural products (Davey et al., 2009; Lebot et al., 2011). Since

the NIR spectrum of a mixture is the linear summation of the individual spectra of its constituents, such mixture spectra can thus be regarded as mixtures of ‘blind

sources’ (Hyvärinen et al., 2001) as the proportion of constituents in the samples

remains unknown. Many attempts have been made in recent years to extract critical

features from the spectra using multivariate analysis (Blanco and Villarroya, 2002;

Burns and Ciurczak, 2008), including multiple linear regression (MLR) (Chang et al.,

1998), principal component regression (PCR) (Wold, 1987), and partial least squares

regression (PLSR) (Wold et al., 2001). However, these methods were not designed for resolving the ‘blind source’ problem and may not correlate well with the properties of

constituents in the mixture, consequently hindering the applicability of the spectra for

chemometric analysis of the constituents (Al-Mbaideen and Benaissa, 2011; Chen and

Wang, 2001; Kaneko et al., 2008).

A multiuse statistical approach originally used to implement ‘blind source separation’

in signal processing (Herault and Jutten, 1986; Vittoz and Arreguit, 1989), independent

component analysis (ICA) is capable of decomposing mixture signals, which tend toward a Gaussian

distribution, into non-Gaussian independent components with only a small loss of

information and does not require any prior information about the sources (Comon,

1994). In practice, multiple ICA algorithms have been developed, including the JADE

algorithm (joint approximate diagonalization of eigenmatrices) (Cardoso and

Souloumiac, 1993; Cardoso, 1999) and the FastICA algorithm (Hyvärinen and Oja, 1997;

Hyvärinen, 1999), making ICA a high-speed and reliable tool (Hyvärinen and Oja, 2000)

for analytical chemistry (Lathauwer et al., 2000; Al-Mbaideen and Benaissa, 2011),

biomedical signal processing, telecommunications, econometrics, audio processing, and

image processing (Hyvärinen et al., 2001).

Application of ICA for spectrum analysis has been demonstrated by Chen and Wang

(2001) to separate the pure spectra of various constituents from the NIR spectra of the

mixture and to build relationship between the estimated independent components and

the constituents. Such a capability also enabled complete explanation of the constituents’ properties for NIR qualitative analyses (Westad and Kermit, 2003). In

addition, ICA was used to obtain statistically independent and chemically interpretable

latent variables (LVs) in multivariate regression (Gustafsson, 2005). It was also noted

that the number of independent components extracted from the spectra of mixtures is

related to the performance of ICA (Westad, 2005). Moreover, ICA was employed to

identify the infrared spectrum of mixtures containing two pure materials (Hahn and

Yoon, 2006) as well as the constituents in commercial gasoline (Pasadakis and

Kardamakis, 2006; Kardamakis et al., 2007). Equally noteworthy is that the calibration

model built through MLR, after using ICA to extract independent components of

aqueous solutions, gave good predictability (Kaneko et al., 2008), whereas NIR

estimation of sucrose concentration (Chuang et al., 2010) and glucose concentration

(Al-Mbaideen and Benaissa, 2011) were enhanced by using ICA.

Although the application of ICA to spectral analysis appears promising, the available literature has focused mainly on (1) identification of constituents in a mixture, especially chemical compounds (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis et al., 2007; Liu et al., 2008; Kaneko et al., 2008; Al-Mbaideen and Benaissa, 2011); (2) its use as a preprocessing method for improving the predictability of calibration models (Zou and Zhao, 2006); and (3) combinations of ICA with other multivariate analysis methods, such as PCA-ICA (Pasadakis and Kardamakis, 2006), ICA-MLR (Kaneko et al., 2008; Liu et al., 2009), ICA-PLS (Liu et al., 2009), ICA-LS-SVM (Wu et al., 2008) and ICA-NNR (Fang and Lin, 2008), to deal with linear or nonlinear problems. However, no study has applied ICA with NIR spectroscopy as the sole tool, without any other multivariate analysis method, to quantify the internal quality of intact fruit. Wax jambu (Syzygium samarangense Merrill & Perry), a fruit endemic to Taiwan and parts of Southeast Asia (Fig. 2.1), has a distinctive surface and texture and is easily bruised or damaged, so it must be handled delicately from harvest through shipping and distribution. To date, several studies have aimed to develop a non-invasive and rapid detection method for analyzing the internal quality of wax jambu (You, 2002; Lin, 2002; Chung et al., 2004). Wax jambu is therefore a suitable sample for exploring further applications of ICA as the sole tool with NIR spectroscopy in fruit inspection. In the present study, ICA was integrated with NIR spectral analysis to quantify the sugar content of intact wax jambu. The results for wax jambu were also compared with those for sucrose solutions. Spectral pretreatments and linear regression were used to build spectral calibration models of sugar content. The ICA results were also compared with those of PLSR to assess the ability of each method to predict sugar content in wax jambu.

2.2 MATERIALS AND METHODS

2.2.1 SAMPLE PREPARATION

Sucrose (C12H22O11, FW: 342.30) powder was solubilized in de-ionized water to

prepare 78 sucrose solutions with sugar content ranging from 0.4 to 19.0 °Brix. The

average sugar content was 9.83 °Brix, and the standard deviation was 5.48 °Brix. A total

of 114 wax jambu (S. samarangense Merrill & Perry) samples purchased from Fangliao,

Pingtung County in Taiwan were employed for the study. Before the sugar content was measured, each wax jambu was pressed to extract 15 mL of juice, which was clarified by centrifugation for 15 min at 2500 rpm and 22 °C in a KUBOTA 2700 centrifuge (KUBOTA Corporation Co., Ltd., Osaka City, Osaka, Japan).

2.2.2 NIR SPECTRA AND SUGAR CONTENT MEASUREMENT

A NIRS 6500 spectrophotometer and sample transport (FOSS NIRSystems, Laurel,

MD, U.S.A.) with quartz cuvette were used to measure the transmittance spectra of

sucrose solutions. The wavelength ranged from 400 to 2498 nm with 2 nm intervals.

The quartz cuvette (light path: 1 mm; external dimensions: length = 3.0 cm, width = 0.2

cm, and height = 3.5 cm) was filled with sucrose solution for transmittance

measurements. An on-line NIRS 6500 spectrophotometer (FOSS NIRSystems, Laurel,

MD, U.S.A.) was used to measure the reflectance spectra of the wax jambu samples.

The wavelength range was from 400 to 2498 nm and the interval was 2 nm. As shown

in Fig. 2.1(A), the wax jambu was examined to find an area with no defects, which was

then selected as the location for reflectance measurements. The wax jambu was placed

horizontally in line with the spectrophotometer in a dark compartment, and the distance

between the light source and the top of sample was adjusted to the suggested value of

7.62 cm, as shown in Fig. 2.1(B). The spectrophotometer was controlled by a personal

computer to perform NIR acquisition and spectrum editing. All spectral data were

recorded as the logarithm of reciprocal of reflectance (log 1/R), and NIR spectrum of

each sample was the average of 32 scans. A digital refractometer (PR-101, ATAGO Co.,

Ltd., Itabashi-ku, Tokyo, Japan) was used to measure the sugar content as the reference values. The index ‘°Brix’ used for PR-101 refractometer is a parameter that denotes the

total amount of soluble solids in the sample. For fruits such as wax jambu, most of the

soluble solids in the juice are sugars, mainly sucrose, fructose and glucose. Therefore,

the value of °Brix measured from wax jambu can be regarded as the total sugar content.

Fig. 2.1 A wax jambu (Syzygium samarangense Merrill & Perry) sample (A) side view

and the NIR measurement location, and (B) sample placement with suggested

distance 7.62 cm between the light source and the top of sample in the on-line

NIRS 6500 spectrophotometer.

2.2.3 DATA ANALYSIS

2.2.3.1 INDEPENDENT COMPONENT ANALYSIS (ICA)

Independent component analysis (ICA) transforms observed multivariate data into statistically independent components (ICs) and presents them as linear combinations of the observed variables. ICA requires that the number of receptors (observed signals) be greater than or equal to the number of sources, and that the signals emitted by the sources follow non-Gaussian distributions (Hyvärinen and Oja, 2000). The ICs are latent variables and cannot be observed directly, which means that the mixing matrix is also unknown. The purpose of the ICA algorithm is to determine the mixing matrix (M) or the separating matrix (W). To estimate the unknown sources, it is assumed that W = M^{-1}, so that

$$\hat{s} = Wx = M^{-1}Ms \tag{2.1}$$

where ŝ is the estimate of the sources (s) and x represents the observed spectra of the objects.
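As a hedged illustration of Eq. 2.1, the sketch below mixes two synthetic non-Gaussian sources with a known mixing matrix and recovers them with W = M^{-1}. The matrix values, the uniform source distribution, and all variable names are assumptions made for this example. In real ICA the mixing matrix is unknown, and W has to be estimated by an algorithm such as JADE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical non-Gaussian sources (uniform noise), 1000 observations each.
s = rng.uniform(-1.0, 1.0, size=(2, 1000))

# Hypothetical square, invertible mixing matrix M (values chosen arbitrarily).
M = np.array([[1.0, 0.5],
              [0.3, 2.0]])

x = M @ s                  # observed signals: x = M s
W = np.linalg.inv(M)       # separating matrix: W = M^{-1}
s_hat = W @ x              # estimated sources: s_hat = W x = M^{-1} M s

# With the true M known, the sources are recovered up to rounding error.
recovered = np.allclose(s_hat, s)
```

The example only shows why knowing W is equivalent to knowing the sources; it does not perform any estimation.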

In the present study, the JADE (joint approximate diagonalization of eigenmatrices) algorithm (Cardoso and Souloumiac, 1993; Cardoso, 1999) was employed to conduct ICA. In general, JADE handles spectral data rapidly because it works off the shelf, an advantage over other multivariate approaches such as PCR and PLSR. Assuming that the measured spectra of the unknown mixtures are linear combinations of the spectra of the various components, they can be expressed as:

$$A = MI \tag{2.2}$$

The spectra of the samples are all linear combinations of m ICs. Matrix $A_{l \times n}$ holds the spectra of l samples, each containing n values; $I_{m \times n}$ is the matrix of ICs, containing m independent components; and $M_{l \times m}$ is the mixing matrix, which is related to the component concentrations in the mixture. The linear relationship between the mixing matrix (M) and the component concentration (C) can be expressed as:

$$C = MB \tag{2.3}$$

where B is the matrix of regression coefficients. In this way, the concentration of each component in the mixture can be determined by combining ICA with linear regression.
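The regression step in Eq. 2.3 can be sketched as an ordinary least-squares fit. The sketch below assumes the mixing matrix has already been obtained from ICA and replaces it with random numbers. The function names, the 52 x 7 shape, and the coefficient values are hypothetical, and the thesis itself used MATLAB rather than Python.

```python
import numpy as np

def fit_coefficients(M_cal, c_cal):
    """Least-squares estimate of B in C = M B (no intercept, as in Eq. 2.3)."""
    B, *_ = np.linalg.lstsq(M_cal, c_cal, rcond=None)
    return B

def predict_concentration(M_new, B):
    """Predicted concentrations for samples described by mixing-matrix rows."""
    return M_new @ B

rng = np.random.default_rng(1)

# Stand-in for a mixing matrix from ICA: 52 calibration samples x 7 ICs.
M_cal = rng.normal(size=(52, 7))

# Made-up regression coefficients used only to generate noise-free test data.
B_true = np.array([-2.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0])
c_cal = M_cal @ B_true

B_hat = fit_coefficients(M_cal, c_cal)
c_hat = predict_concentration(M_cal, B_hat)
```

Because the synthetic data are noise-free, the fitted coefficients match the ones used to generate them; with measured spectra the fit would only be approximate.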

2.2.3.2 PARTIAL LEAST SQUARES REGRESSION (PLSR)

Partial least squares regression (PLSR), a typical method in chemometrics (Wold et al., 2001), has been widely applied in chemistry and engineering. When PLSR is applied to spectral analysis, the spectra can be regarded as a composition of several principal components (PCs), each expressed as a ‘factor’ in the PLSR algorithm. The factors are ordered by influence, so the more important factors come first (factor 1, factor 2, and so on). Because PLSR uses information from spectral bands, its results can be improved by selecting an appropriate number of factors and specific wavelength ranges. To avoid overfitting the PLSR model with too many factors, the factors in this study were selected according to the following principles: (1) the maximum number of factors was limited to 1/10 of the number of calibration samples plus 2 to 3 factors; (2) new factors were not added if they caused a rise in the prediction error; and (3) new factors were not added if they resulted in a standard error of validation (SEV) smaller than the standard error of calibration (SEC).
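One possible reading of the three factor-selection principles is sketched below. It is an illustration only: the function name `select_factors`, the `extra` parameter (the "2 to 3 factors" allowance), and the use of SEV as the prediction error in principle (2) are assumptions, and the error values in the example are invented.

```python
def select_factors(sec, sev, n_cal, extra=2):
    """Pick a number of PLSR factors from per-factor error lists.

    sec[k - 1] and sev[k - 1] are the SEC and SEV of a model with k factors.
    """
    # Principle (1): cap at 1/10 of the calibration samples plus a small allowance.
    max_factors = min(len(sev), n_cal // 10 + extra)
    best = 1
    for k in range(2, max_factors + 1):
        # Principle (2): stop if the prediction error rises.
        if sev[k - 1] > sev[best - 1]:
            break
        # Principle (3): stop if SEV would drop below SEC.
        if sev[k - 1] < sec[k - 1]:
            break
        best = k
    return best

# Invented error curves for a calibration set of 52 samples.
example_sec = [0.50, 0.30, 0.20, 0.15]
example_sev = [0.60, 0.40, 0.30, 0.35]
chosen = select_factors(example_sec, example_sev, n_cal=52)
```

In this example the SEV rises at the fourth factor, so three factors are kept.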

2.2.3.3 SPECTRAL PRETREATMENTS

The purpose of spectral pretreatments was to eliminate the spectral variation, which

was not caused by chemical information contained in the samples (de Noord, 1994). For

the raw NIR spectra of sucrose solutions and wax jambu, three different spectral

pretreatments were employed in this study: (1) normalization; (2) 1st derivative with

normalization; and (3) 2nd derivative with normalization. Normalization scaled the absorbance spectra of all samples to fall within the interval from -1 to 1. With a view to applying ICA in fast on-line inspection of fruits, the selection of optimal pretreatment parameters, including the number of smoothing points and the derivative gap, was omitted to save computation time. The derivative gap was set at the minimal value of 2 so as to retain as many wavelength values as possible as model inputs.
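The pretreatments can be sketched as follows. The text does not spell out how the scaling to [-1, 1] or the gap derivative was computed, so this sketch assumes per-spectrum division by the largest absolute value and a simple gap difference measured in data points. Both function names are hypothetical.

```python
import numpy as np

def normalize(spectra):
    """Scale each spectrum (row) into [-1, 1] by its largest absolute value."""
    spectra = np.asarray(spectra, dtype=float)
    peak = np.max(np.abs(spectra), axis=1, keepdims=True)
    peak[peak == 0] = 1.0          # leave all-zero spectra unchanged
    return spectra / peak

def gap_derivative(spectra, gap=2, order=1):
    """Gap-difference derivative; each pass shortens the spectrum by `gap` points."""
    spectra = np.asarray(spectra, dtype=float)
    for _ in range(order):
        spectra = spectra[:, gap:] - spectra[:, :-gap]
    return spectra

raw = np.array([[0.0, 1.0, 4.0, 9.0, 16.0, 25.0]])
first = gap_derivative(raw, gap=2, order=1)    # [[4, 8, 12, 16]]
second = gap_derivative(raw, gap=2, order=2)   # [[8, 8]]
scaled = normalize(first)                      # [[0.25, 0.5, 0.75, 1.0]]
```

A small gap discards fewer end points per derivative pass, which is consistent with the stated aim of keeping as many wavelength values as possible.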

2.2.3.4 MODEL ESTABLISHMENT

This study used the mathematical software MATLAB (The MathWorks, Inc., Natick, MA, U.S.A.) to write ICA programs based on the JADE algorithm for establishing ICA spectral calibration models. The ICA results were compared with the PLSR spectral calibration models built with the WinISI II (Infrasoft International, LLC., Port Matilda, PA, U.S.A.) chemometric software package. The analysis procedure of both ICA and PLSR for wax jambu and sucrose solution samples included: (1) selecting the calibration and validation sets, (2) applying spectral pretreatments, and (3) determining the best calibration model. Because the sucrose solutions were mixtures of sucrose powder and water, their composition was rather simple. Therefore, the full wavelength range (400 to 2498 nm) was used to compare the tolerance of ICA and PLSR, since spectral bands with more noise (e.g. 2200 to 2498 nm) often affect the analysis results. For wax jambu, specific wavelength ranges had to be identified because its composition is more complicated than that of sucrose solutions, which required an additional correlation analysis between wavelengths and sugar content. All sucrose solution samples and all wax jambu samples were used in their respective analyses to assess the tolerance of ICA and PLSR. Calibration and validation samples were assigned in a ratio of 2:1 according to sugar content: all samples were ranked in ascending order of sugar content, samples 1 and 2 were assigned to calibration and sample 3 to validation, and subsequent samples followed the same alternating sequence. The same calibration and validation sets were used for both ICA and PLSR analyses.
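The 2:1 assignment can be sketched as below. This is one straightforward reading of the ranking scheme, with a hypothetical function name and synthetic sugar values; the author's actual assignment may differ in detail, for example in how ties or the ends of the ranking were handled.

```python
import numpy as np

def split_two_to_one(sugar):
    """Return (calibration_idx, validation_idx) after ranking by sugar content.

    Within every run of three ranked samples, the first two go to
    calibration and the third goes to validation.
    """
    order = np.argsort(sugar, kind="stable")     # sample indices, ascending sugar
    position = np.arange(len(order))
    is_validation = position % 3 == 2
    return order[~is_validation], order[is_validation]

# Synthetic example with 78 samples, matching the sucrose-solution set size.
rng = np.random.default_rng(2)
sugar = rng.uniform(0.4, 19.0, size=78)
cal_idx, val_idx = split_two_to_one(sugar)
```

For 78 samples this yields 52 calibration and 26 validation samples, and for 114 samples it yields 76 and 38, the set sizes used in this chapter.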

After the respective spectral calibration models of sucrose solutions and wax jambu were built, they were used to predict the sugar contents of the calibration and validation sets. Predictability was evaluated using the following statistical parameters: coefficient of correlation of the calibration set (Rc), standard error of calibration (SEC), coefficient of correlation of the validation set (rv), standard error of validation (SEV), bias, and ratio of [standard error of] performance to [standard] deviation (RPD), as defined by:

$$\mathrm{SEC} = \left[ \frac{1}{n_c} \sum_{i=1}^{n_c} \left( Y_c - Y_r \right)_i^{2} \right]^{1/2} \tag{2.4}$$

$$\mathrm{SEV} = \left[ \frac{1}{n_v} \sum_{i=1}^{n_v} \left( Y_v - Y_r - \mathrm{Bias} \right)_i^{2} \right]^{1/2} \tag{2.5}$$

$$\mathrm{Bias} = \frac{1}{n_v} \sum_{i=1}^{n_v} \left( Y_v - Y_r \right)_i \tag{2.6}$$

$$\mathrm{RPD} = \mathrm{SD} / \mathrm{SEV} \tag{2.7}$$

where Yc and Yv represent the estimated sugar contents of the calibration set and the

validation set, respectively. Yr is the reference sugar content, nc and nv are the number of

samples in the calibration set and validation set, and SD is the standard deviation of

sugar content within the validation set. RPD is one of the indices used to evaluate the performance of a model: the greater the RPD, the better the model performs, and a sufficiently high RPD is considered adequate for analytical purposes in most NIR spectroscopy applications for agricultural products (Williams and Sobering, 1993).
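The statistics in Eqs. 2.4 to 2.7 can be sketched in a few lines. The sketch assumes that each sum is divided by the number of samples (n_c or n_v) and that SD is the sample standard deviation (`ddof=1`) of the reference values in the validation set. Both choices, the function names, and the four-sample data are assumptions for illustration.

```python
import numpy as np

def sec(y_ref, y_cal):
    """Standard error of calibration: root mean squared residual."""
    d = np.asarray(y_cal, dtype=float) - np.asarray(y_ref, dtype=float)
    return np.sqrt(np.mean(d ** 2))

def bias(y_ref, y_val):
    """Mean difference between predicted and reference values."""
    return np.mean(np.asarray(y_val, dtype=float) - np.asarray(y_ref, dtype=float))

def sev(y_ref, y_val):
    """Standard error of validation: bias-corrected root mean squared residual."""
    d = np.asarray(y_val, dtype=float) - np.asarray(y_ref, dtype=float)
    return np.sqrt(np.mean((d - d.mean()) ** 2))

def rpd(y_ref, y_val):
    """Ratio of the reference standard deviation to SEV."""
    return np.std(np.asarray(y_ref, dtype=float), ddof=1) / sev(y_ref, y_val)

# Invented reference and predicted sugar contents (degrees Brix).
y_ref = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])
stats = {"SEC": sec(y_ref, y_hat), "Bias": bias(y_ref, y_hat),
         "SEV": sev(y_ref, y_hat), "RPD": rpd(y_ref, y_hat)}
```

Subtracting the bias inside SEV means that a model with a constant offset is penalized through the bias term rather than through SEV.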

2.3 RESULTS AND DISCUSSION

2.3.1 SUCROSE SOLUTION

The 78 sucrose solution samples were divided into 52 calibration samples and 26 validation samples, a ratio of 2:1. The distribution of their sugar content (°Brix) is shown in Table 2.1. Between the calibration and validation sets, the maximum values differed by 0.2 °Brix, and the differences in the other items (minimum, average, standard deviation, and coefficient of variation (CV)) did not exceed 0.5 °Brix. The two sets therefore met the requirement of consistent sugar content distributions.

Table 2.1 Summary of sucrose solutions and sample sugar contents. Total samples (n =

78), calibration set (n = 52) and validation set (n = 26) were arranged to have

consistent distributions of sugar content.

Sucrose Solutions: Sugar Content (°Brix)

| Group | n | Max. | Min. | Mean | SD | CV |
|---|---|---|---|---|---|---|
| Total Samples | 78 | 19.00 | 0.40 | 9.83 | 5.48 | 0.56 |
| Calibration Set | 52 | 19.00 | 0.40 | 9.72 | 5.52 | 0.57 |
| Validation Set | 26 | 18.80 | 0.90 | 10.06 | 5.52 | 0.55 |

2.3.1.1 SELECTION OF THE MOST APPROPRIATE NUMBER OF ICS

According to the definition of ICA, the observed receptor signals can be decomposed into at most as many independent components (ICs) as there are samples (Hyvärinen and Oja, 2000). This study used the full wavelength range (400 to 2498 nm) as the input to ICA, applied ICA to the original spectra of the 52 sucrose solution calibration samples with 1 to 52 ICs, and observed the prediction error of the resulting calibration model. Cases with and without normalization were both examined. When only one IC was applied, the prediction error was high, so only the results for 2 to 50 ICs are shown. As shown in Fig. 2.2, when the number of ICs increased to 4, the SEC without normalization decreased sharply to 0.14 °Brix and the SEV fell to 0.21 °Brix, indicating that the number of ICs influences the predictability of the spectral calibration model. However, applying more ICs did not necessarily improve the calibration model, because the sucrose solutions were simple mixtures of sucrose and water; hence only the initial 4 ICs were applied in the calibration model.

The results of ICA with normalized spectra are also shown in Fig. 2.2. The prediction error fell markedly as the number of ICs increased to 7; the SEC and SEV with 7 ICs were 0.12 and 0.22 °Brix, respectively. Normalization apparently gave less variation in SEV than the original spectra did.

Fig. 2.2 Relationship between the numbers of ICs and errors of the predicted sugar

content for sucrose solutions. The most appropriate number of ICs for

normalized spectra was determined by the tendency of SEC (green-short dash

line) and SEV (blue-dash dot dot line) values.

2.3.1.2 SPECTRA DECOMPOSITION AND CORRELATION ANALYSIS OF

SUGAR CONTENT

After ICA, it is critical to examine whether these 7 ICs are statistically independent. As an illustration, IC 1 and IC 4 were selected and their correlation is shown in Fig. 2.3; the coefficient of determination (r²) was only 4.0 × 10⁻⁸, indicating that IC 1 and IC 4 were independent of each other. Scatter plots of every other pair among the 7 ICs showed a distribution similar to that in Fig. 2.3, with all r² values smaller than 0.243, conforming to the mutually independent character of ICs (Hyvärinen and Oja, 2000).
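The pairwise check described above can be sketched as a matrix of squared Pearson correlations between IC columns. The function names and the small hand-made data set are hypothetical, and a low r² shows only that two components are uncorrelated, which is the quantity reported here.

```python
import numpy as np

def pairwise_r2(components):
    """Squared Pearson correlation between every pair of columns.

    `components` has one row per sample and one column per IC.
    """
    r = np.corrcoef(np.asarray(components, dtype=float), rowvar=False)
    return r ** 2

def max_off_diagonal(r2):
    """Largest r² between two different ICs."""
    mask = ~np.eye(r2.shape[0], dtype=bool)
    return r2[mask].max()

# Hand-made columns: a and b are uncorrelated, c is a linear function of a.
a = np.array([1.0, -1.0, 1.0, -1.0])
b = np.array([1.0, 1.0, -1.0, -1.0])
c = 2.0 * a + 1.0

r2_ab = pairwise_r2(np.column_stack([a, b]))
r2_abc = pairwise_r2(np.column_stack([a, b, c]))
```

Reporting the largest off-diagonal value corresponds to the statement that all pairwise r² values stayed below a given bound.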

Fig. 2.3 Distribution of calibration and validation samples of sucrose solutions in IC

1-IC 4 space. IC 1 and IC 4 were randomly selected from the 7 ICs.

Eq. 2.3 implies that the constituent information ‘sugar content’ should correspond mainly to a specific IC, and that the values of this IC in the mixing matrix should correlate highly with the sugar content. The reference sugar content was therefore plotted against the values in each column (each IC) of the mixing matrix. As shown in Fig. 2.4, the correlation coefficient (r) between IC 1 and the reference sugar content reached 0.977, which means that, of the 7 ICs extracted, IC 1 carried the most information arising from the sugar content in the spectra. These results agree with Westad (2005). The number of ICs must therefore be selected with care, since it influences how the information is used after spectral decomposition.
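Finding the IC that tracks the sugar content most closely can be sketched by correlating each column of the mixing matrix with the reference values. The function name and the five-sample data are hypothetical and serve only to show the ranking by absolute correlation.

```python
import numpy as np

def rank_ics_by_correlation(M, c_ref):
    """Return (order, r): IC indices sorted by |r| descending, and r per IC."""
    M = np.asarray(M, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    r = np.array([np.corrcoef(M[:, j], c_ref)[0, 1] for j in range(M.shape[1])])
    order = np.argsort(-np.abs(r))
    return order, r

# Invented data: column 1 is an exact (negative) multiple of the sugar content,
# column 0 is uncorrelated with it.
c_ref = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
M_demo = np.column_stack([
    np.array([1.0, -1.0, 1.0, -1.0, 1.0]),
    -2.0 * c_ref,
])
order, r = rank_ics_by_correlation(M_demo, c_ref)
```

Ranking by the absolute value matters because the sign of an IC is arbitrary, so a strongly negative correlation is as informative as a positive one.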

Fig. 2.4 Correlation between the values of IC 1 in the mixing matrix and the reference

sugar contents of sucrose solutions.

The regression coefficient matrix obtained from the NIR spectra and the reference sugar contents of the calibration set is shown in Table 2.2, in which the values from top to bottom correspond to IC 1 to IC 7. All values were compared in terms of their absolute values. The value in the first row (IC 1) was the largest, followed by that of IC 4. This agreed with the order of correlation between each IC and the reference sugar content, and indicated that the importance of each IC was independent of the IC sequence. Each major constituent had its corresponding IC decomposed by ICA, with a clearly defined contribution, so that all constituents of the mixtures could be distinguished by ICA (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis et al., 2007; Kaneko et al., 2008).

Table 2.2 Regression coefficient matrix of sucrose solutions, with 7 ICs extracted from the NIR spectra of the calibration set. The correlation between the absolute value of each IC's regression coefficient and the sugar content was examined.

| IC # | Regression Coefficient |
|---|---|
| 1 | -2.1811 |
| 2 | -0.2843 |
| 3 | -0.1843 |
| 4 | 1.2976 |
| 5 | 0.1876 |
| 6 | -0.1334 |
| 7 | -0.1416 |

The ICs decomposed from the spectra by ICA reflect the spectral characteristics of the unknown mixture and, under ideal conditions, constitute the spectra of the pure materials in this mixture (Chen and Wang, 2001; Hahn and Yoon, 2006; Pasadakis and Kardamakis, 2006; Kardamakis et al., 2007). Since the sucrose solutions were mixtures of sucrose and water, and their spectra comprised contributions from both constituents, the ICs decomposed by ICA should reflect the characteristics of these two pure substances. For the normalized original spectra of the calibration set with 7 ICs applied, the order of the ICs according to their correlation with the reference sugar content was IC 1, 4, 2, 5, 3, 7, and 6. The original NIR spectra of the calibration set and IC 1 are shown in Fig. 2.5(A) and (B), and the detrended reflectance spectrum of sucrose powder is shown in Fig. 2.5(C). The peak positions of IC 1 (964, 1090, 1436, 2100, and 2276 nm) matched the specific wavelength ranges associated with sugar content (C-H band) (Chang et al., 1998; Park, 2003; Hahn and Yoon, 2006) and were also consistent with the absorption bands seen in Fig. 2.5(C). IC 1 can therefore be considered to respond mainly to the sugar content, in line with the above results. The other ICs correlated poorly with the reference sugar content, and their absolute values in the regression coefficient matrix were much smaller than that of IC 1, so they played a supporting role.

Fig. 2.5 (A) Original NIR spectra of sucrose solutions, (B) IC 1 decomposed from

calibration sets, and (C) the reflectance spectrum of sucrose powder

post-Detrend.

2.3.1.3 SUGAR CONTENT QUANTIFICATION BASED ON ICA AND PLSR

Quantitative analyses of sugar content in sucrose solutions were conducted by ICA and PLSR using the full wavelength range from 400 to 2498 nm (Table 2.3). For ICA, the best spectral calibration model was obtained with the normalized original spectra and 7 ICs, giving Rc = 0.9998, SEC = 0.124 °Brix, rv = 0.9993, SEV = 0.216 °Brix, bias = 0.014 °Brix, and RPD = 25.54. Comparing the original spectra with and without normalization, the calibration models yielded similar outcomes for the validation set, whereas the SEC improved when normalization was applied. Although derivatives can correct baseline shift in the original spectra and amplify signal characteristics, they may also amplify noise, which makes them unsuitable for noisy spectral bands. The spectra in the range of 2200 to 2498 nm contained more noise; therefore, the predictability of the spectral calibration models decreased when derivatives were applied.

Table 2.3 Regression results by ICA and PLSR analyses for sucrose solutions.

Calibration set (n = 52): mean 9.715, SD 5.515 (columns Rc, SEC, RSEC). Validation set (n = 26): mean 10.058, SD 5.515 (columns rv, SEV, RSEV, bias, RPD).

| Method | Spectrum | ICs / Factors | Rc | SEC (°Brix) | RSEC (%) | rv | SEV (°Brix) | RSEV (%) | Bias (°Brix) | RPD |
|---|---|---|---|---|---|---|---|---|---|---|
| ICA | Original | 4 | 0.9997 | 0.144 | 6.97 | 0.9995 | 0.215 | 3.57 | 0.045 | 25.69 |
| ICA | Original + Normalization | 7 | 0.9998 | 0.124 | 4.01 | 0.9993 | 0.216 | 3.68 | 0.014 | 25.54 |
| ICA | 1st Derivative + Normalization | 4 | 0.9994 | 0.193 | 13.71 | 0.9984 | 0.331 | 10.34 | 0.028 | 16.66 |
| ICA | 2nd Derivative + Normalization | 5 | 0.9983 | 0.321 | 19.66 | 0.9973 | 0.409 | 16.20 | -0.014 | 13.48 |
| PLSR | Original | 2 | 0.9995 | 0.181 | 11.41 | 0.9985 | 0.300 | 8.78 | 0.069 | 18.38 |
| PLSR | Original + Normalization | 4 | 0.9990 | 0.218 | 11.59 | 0.9975 | 0.399 | 8.68 | 0.022 | 13.82 |
| PLSR | 1st Derivative + Normalization | 3 | 0.9995 | 0.192 | 11.50 | 0.9950 | 0.546 | 12.92 | 0.031 | 10.10 |
| PLSR | 2nd Derivative + Normalization | 2 | 0.9990 | 0.243 | 20.96 | 0.9869 | 0.899 | 34.99 | 0.013 | 6.14 |

For PLSR, the best spectral calibration model was obtained with the original spectra and 2 factors: Rc = 0.9995, SEC = 0.181 °Brix, rv = 0.9985, SEV = 0.300 °Brix, bias = 0.069 °Brix, and RPD = 18.38 (Table 2.3). Moreover, with SEC = 0.192 °Brix and SEV = 0.546 °Brix for the 1st derivative with normalization, and SEC = 0.243 °Brix and SEV = 0.899 °Brix for the 2nd derivative with normalization, the SEV values for the 1st and 2nd derivatives were about 2.8 and 3.7 times the corresponding SEC values. This shows that the PLSR spectral calibration models built on derivative spectra had poor predictability when applied to the validation set.

Comparing the quantitative results of ICA and PLSR (Table 2.3), ICA gave a lower SEV than PLSR for every pretreatment, and a lower SEC for the original and normalized spectra. This suggests that ICA extracts the characteristic information from the spectra more effectively, improving both the explanatory ability of the calibration models for the calibration set and their tolerance for the validation set. ICA also showed a lower absolute bias than PLSR in most cases (Table 2.3). The advantage became more obvious with normalization, indicating that ICA had a better tolerance to influences other than the chemical characteristics of the constituents in the samples, which helped to build more robust spectral calibration models. In summary, for the sucrose solutions ICA achieved better quantitative analysis of sugar content than PLSR did, and selecting a suitable number of ICs and suitable spectral pretreatments helped improve the predictability of the spectral calibration models. The results for sucrose solutions also helped establish procedures applicable to the ICA analysis of wax jambu.

2.3.2 WAX JAMBU

A total of 114 wax jambu samples were used; their sugar contents ranged from 6.4 to 14.5 °Brix, with an average of 9.92 °Brix and a standard deviation of 1.61 °Brix. The samples were divided in a 2:1 ratio into 76 calibration samples and 38 validation samples (Table 2.4).

Table 2.4 Summary of wax jambu (Syzygium samarangense Merrill & Perry) and

sample sugar contents. Total samples (n = 114), calibration set (n = 76) and

validation set (n = 38) were arranged to have consistent distributions of

sugar content.

Wax Jambu: Sugar Content (°Brix)

| Group | n | Max. | Min. | Mean | SD | CV |
|---|---|---|---|---|---|---|
| Total Samples | 114 | 14.50 | 6.40 | 9.92 | 1.61 | 0.16 |
| Calibration Set | 76 | 14.50 | 6.40 | 9.89 | 1.61 | 0.16 |
| Validation Set | 38 | 14.00 | 7.10 | 9.99 | 1.62 | 0.16 |

2.3.2.1 CORRELATION ANALYSIS OF NIR SPECTRA AND SUGAR

CONTENT

Fig. 2.6 shows the distribution of the correlation coefficients between the sugar contents of the wax jambu samples and their original, 1st derivative, and 2nd derivative spectra. The main absorption wavelengths of the original spectra were 676, 968, and 1144 nm, of which 676 nm lies within the visible red region, whereas 968 and 1144 nm lie in the NIR region and belong to the 2nd overtone of the C-H bond. The main absorption wavelengths of the 1st derivative spectra were 626, 974, 1070, and
