Chapter III Data Analysis
III-4. Multivariate data analysis
For the multivariate analysis, we performed multivariate curve resolution (MCR) on a four-dimensional data matrix (two spatial dimensions, X and Y on the image plane, one spectral dimension, and one temporal dimension; see Figure III-4) using software (nmf-11, Pylone) which was developed at Tokyo specifically for spectral imaging applications. To handle four-dimensional data, we unfolded the four-dimensional data matrix into a two-dimensional array. One dimension of this array corresponds to the spectral dimension (pixel or wavenumber), so that the non-negative matrix factorization resolves each component as an individual spectrum [37]. The other dimensions were merged into a single dimension. In MCR, given an 𝑚 × 𝑛 non-negative data matrix 𝑨, a low-rank approximation of 𝑨 is sought by solving the problem [37, 38]:
$\mathbf{A} \approx \mathbf{W}\mathbf{H}$ (III-1)
with the non-negativity constraints 𝑾 ≥ 𝟎 and 𝑯 ≥ 𝟎. In the present work, 𝑾 is an 𝑚 × 𝑘 matrix whose columns correspond to spectra and 𝑯 is a 𝑘 × 𝑛 matrix whose rows represent spatiotemporal concentration profiles. By reorganizing the 𝑯 matrix, multivariate Raman images can be obtained. The parameter 𝑘, which specifies the number of components that make up the data, must be given by the user in advance. The most suitable value was found to be 𝑘 = 6 in the present case. The optimal solutions of 𝑾 and 𝑯 are obtained by solving alternating least-squares (ALS) problems for equation III-1 so that the squared Frobenius norm $\|\mathbf{A} - \mathbf{W}\mathbf{H}\|_F^2$ is minimized.
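The unfolding described above can be sketched in a few lines of NumPy. This is only a minimal illustration and not the code of the nmf-11 software: the axis order (time, Y, X, spectral), the function name `unfold`, and the array sizes are assumptions made for the example.

```python
import numpy as np

def unfold(data4d):
    """Unfold a (T, Y, X, S) data set into an (S, T*Y*X) matrix A.

    Assumed layout: T measurement times, Y x X image pixels, S spectral points.
    """
    t, y, x, s = data4d.shape
    # Put the spectral axis first, then merge time and the two spatial axes.
    return np.moveaxis(data4d, -1, 0).reshape(s, t * y * x)

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
data = rng.random((3, 4, 5, 7))     # 3 times, 4 x 5 pixels, 7 spectral points
A = unfold(data)
print(A.shape)                      # (7, 60)
```

With this layout, each column of A is the spectrum recorded at one pixel and one measurement time, and the column index is (t·Y + y)·X + x.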
Here, we briefly describe the MCR procedure adopted in this work, which consisted of the following steps:
(1). The MCR software requires a two-dimensional matrix as input. Thus, the four-dimensional matrix constructed from the data cubes recorded at each measurement time (Figure III-4) needs to be rearranged into a two-dimensional matrix. Because spectral information is one of our primary concerns, the wavenumber dimension was kept as one dimension of the matrix and all other dimensions were merged into the second. By doing so, we generated the data matrix 𝑨.
(2). SVD-based initialization [39] was used to generate initial guesses for 𝑾 and 𝑯. Although random initialization is also available in our software, it often leads to a local minimum [39]. The number of components was set to 𝑘 = 6, which yielded the best resolution of polysaccharide, lipid, and protein components.
(3). An additional L1-norm constraint (lasso regression [40]) was imposed on the ALS optimization of 𝑾 and 𝑯. An L1 penalty term with α² = 0.0095 was added to obtain sparser solutions as
$(\mathbf{W}^{T}\mathbf{W} + \alpha^{2}\mathbf{E})\,\mathbf{H} = \mathbf{W}^{T}\mathbf{A}$ (III-2)
and
$(\mathbf{H}\mathbf{H}^{T} + \alpha^{2}\mathbf{E})\,\mathbf{W}^{T} = \mathbf{H}\mathbf{A}^{T}$ (III-3)
where 𝑬 is a 𝑘 × 𝑘 matrix all of whose elements are unity.
(4). Step (3) was repeated until $\|\mathbf{A} - \mathbf{W}\mathbf{H}\|_F^2$ converged. The number of iterations was set to 4000, ensuring that $\|\mathbf{A} - \mathbf{W}\mathbf{H}\|_F^2$ converged to a sufficiently small value.
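Steps (2)-(4) can be illustrated with the following NumPy sketch. It is not the implementation used in our software. Several choices are assumptions made for the example: the simplified SVD-based start (absolute values of the leading singular vectors, standing in for the initialization of [39]), the clipping of negative entries to zero after each least-squares solve, the relative stopping tolerance `tol`, and the synthetic test matrix. The second update is solved for the transpose of 𝑾, which is the form whose dimensions match (𝑘 × 𝑚 on both sides). Because 𝑬 is the matrix of ones, the added term corresponds to penalizing α² times the squared column sums of the factor being updated, which for a non-negative factor are its column-wise L1 norms.

```python
import numpy as np

def mcr_als(A, k, alpha2=0.0095, n_iter=4000, tol=1e-10):
    """Penalized ALS for A ~ W H with W, H >= 0 (illustrative sketch)."""
    # Simplified SVD-based start (assumption): absolute values of the leading
    # singular vectors, scaled by the square roots of the singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    W = np.abs(U[:, :k] * np.sqrt(s[:k]))
    H = np.abs(np.sqrt(s[:k])[:, None] * Vt[:k])
    E = np.ones((k, k))                      # k x k matrix of ones
    prev = np.inf
    for _ in range(n_iter):
        # (W^T W + alpha^2 E) H = W^T A, then clip negative entries (assumption).
        H = np.linalg.lstsq(W.T @ W + alpha2 * E, W.T @ A, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # (H H^T + alpha^2 E) W^T = H A^T, then clip negative entries (assumption).
        W = np.linalg.lstsq(H @ H.T + alpha2 * E, H @ A.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
        err = np.linalg.norm(A - W @ H, "fro") ** 2
        if abs(prev - err) <= tol * max(err, 1.0):
            break                            # squared residual has stopped changing
        prev = err
    return W, H, err

# Synthetic non-negative data of rank 3 (hypothetical, for illustration only).
rng = np.random.default_rng(1)
A = rng.random((30, 3)) @ rng.random((3, 40))
W, H, err = mcr_als(A, k=3, n_iter=500)
```

`np.linalg.lstsq` is used instead of a direct solve so that the sketch does not fail if a component collapses to zero and the 𝑘 × 𝑘 system becomes singular.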
Last, we discuss how well the matrices 𝑾 and 𝑯 reproduce the matrix 𝑨. For the unfolded data matrix 𝑨 (a 790 × 6885 matrix), we calculated the normalized residual matrix 𝑹 (Figure III-5a) using the following equation:
$R_{ij} = \dfrac{\left| A_{ij} - (\mathbf{W}\mathbf{H})_{ij} \right|}{A_{ij}}$ (III-4)
where 𝑅𝑖𝑗 represents the normalized residual at row i and column j. Apart from horizontal stripes at the locations of intense Raman bands, the 2D plot shows no particular pattern, and the residuals appear to be randomly distributed. Figure III-5b shows a histogram of all fitting residuals, which are less than 10%. This result indicates that the original data are well reproduced by the present MCR analysis. We also compare a reproduced spectrum with the corresponding original (SVD-treated) spectrum at a randomly selected position (Figure III-5c). The two spectra are almost identical, with residuals of less than 5%.
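The normalized residual can be computed as in the short sketch below. The function names, the treatment of near-zero data points (set to NaN so that they do not produce divisions by zero), and the toy matrices are assumptions made for the illustration; they are not taken from the analysis software.

```python
import numpy as np

def normalized_residual(A, W, H, eps=1e-12):
    """R_ij = |A_ij - (WH)_ij| / A_ij; entries with A_ij ~ 0 are set to NaN."""
    A = np.asarray(A, dtype=float)
    fit = np.asarray(W, dtype=float) @ np.asarray(H, dtype=float)
    denom = np.where(A > eps, A, np.nan)   # assumption: skip near-zero data points
    return np.abs(A - fit) / denom

def fraction_below(R, threshold=0.10):
    """Fraction of the valid residuals that are smaller than the threshold."""
    valid = np.isfinite(R)
    return float(np.mean(R[valid] < threshold))

# Toy example (hypothetical numbers).
A = np.array([[2.0, 4.0], [1.0, 0.0]])
W = np.array([[1.0], [0.5]])
H = np.array([[2.0, 3.0]])
R = normalized_residual(A, W, H)
```

A histogram of the finite entries of `R` would correspond to the kind of plot shown in Figure III-5b.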
Table III-1. Band assignments for Raman spectra of single living S. pombe cells.
Wavenumber (cm-1)    Assignment
1655                 cis-C=C stretching of the unsaturated lipid chains; Amide I mode of proteins
1602                 Not yet assigned. "Raman spectroscopic signature of life"
1440                 CH2 scissoring and CH3 degenerate deformation
1340                 CH bending of the aliphatic chain of proteins
1301                 In-phase CH2 twisting mode
1260                 C=C-H in-plane bend of the cis- –CH=CH– linkage; Amide III mode of proteins
1003                 Ring breathing of the phenylalanine residues
852                  "Tyrosine doublet" (Fermi resonance of a ring-breathing vibration and the overtone of an out-of-plane ring-bending vibration of the tyrosine residues)
783                  Cytosine vibration and/or −O−P−O− symmetric stretching
Figure III-1. Typical lipid-rich Raman spectrum of fission yeast cells with 633 nm excitation. Some Raman bands, e.g. at 1655, 1440, 1260, and 1154 cm-1, show a complicated feature composed of lipids/proteins and other molecular species.
Figure III-2. (a) Typical space-resolved Raman spectra of a single living S. pombe cell acquired with an exposure time of 1.5 s and laser power of 1 mW. A–E denote the positions in the cell at which the Raman spectra were recorded. The SVD-treated spectra (right) exhibit much higher S/N than the raw data (left). (b) Raman images for the 1655 cm−1 band constructed from the raw (left) and SVD-analyzed (right) data, respectively. It is clear that SVD analysis makes it possible to construct high-contrast Raman images even with low excitation power and short exposure time.
Figure III-3. (a) Baseline connecting two ends for the Raman band at 1655 cm-1. The blue-shaded area is used as the Raman intensity of this band at a given pixel in the univariate Raman image. (b) Univariate Raman image of the 1655 cm-1 band. A rainbow color scale displays the highest Raman intensity with red and the lowest with purple.
Figure III-4. Diagrammatic representation of the unfolding of the overall four-dimensional spectral data into a two-way array that facilitates multivariate data analysis. X and Y denote positions in the image plane, and the remaining axis of each data cube is the spectral dimension. Three dimensions (two spatial and one temporal) are combined into a single dimension.
Figure III-5. (a) 2D plot of the normalized residual matrix 𝑹 (see equation III-4). (b) Histogram of all fitting residuals calculated at all pixels. (c) Comparison of a typical SVD-treated Raman spectrum (blue curve) and the corresponding MCR-fitted spectrum (red curve). Also shown is their difference spectrum.
Chapter IV
Results and Discussion
IV-1. Cell cycle of fission yeast cells under the microscope
Figure IV-1 schematically shows the cell cycle of S. pombe. The optical images showing the morphology of the S. pombe cell in each phase during the cycle are those captured in our Raman imaging experiment (see below). The S. pombe cell cycle consists of four different phases called the M (mitosis) phase, G1 (gap-1) phase, S (synthesis) phase, and G2 (gap-2) phase. It is known that S. pombe has a very short G1 phase under normal vegetative conditions.
This makes the G1 phase of S. pombe difficult to distinguish from the S phase. Thus, we use the notation G1/S for the two indistinguishable phases. In the present work, we randomly chose a single yeast cell in the G2 phase and started an in vivo long-term measurement with the Raman microspectrometer equipped with the laboratory-built chamber (see Chapter II). Time zero is defined as the instant at which we put a single colony of S. pombe into PMLU medium. At 1 h, we started to measure a randomly chosen single yeast cell in the sample dish. Once an imaging measurement was done, laser illumination was blocked until the next measurement. We are confident that the cell was initially in the G2 phase, because we observed elongation of the cell along the cell polarity axis by a factor of ∼1.2 on going from 1 to 4 h, which is common behavior in the G2 phase. The cell cycle progressed from the G2 to the M phase, and the cell prepared for the coming cell division in the first four hours (1-4 h). At 6 h, a septum had already formed and separated the cell into two compartments, indicating that the cell was in the G1/S phase. By 6.5 h, the cell had divided completely into two daughter cells. Subsequently, the two daughter cells should enter a new cell cycle (G2 phase again). To present the stages in the cell cycle more clearly, we also performed experiments using fission yeast cells whose nuclei were labeled with GFP, but they were not successful. A possible reason for the failure is that the nucleus-labeled fission yeast cells seem to be more photolabile than unlabeled cells and are strongly affected by laser illumination during the cell cycle.
It is worthwhile to discuss the behavior of the selected yeast cell after 6.5 h. We tentatively assume that the daughter cells entered a new cell cycle after the mother cell divided. Each of them would then be expected to start a new cytoplasmic division after 6.5 h. However, as shown in the optical images of Figure IV-2, the daughter cells exhibit little morphological change, suggesting that the conditions were not favorable for the yeast cells to divide. Singh et al. [41] showed that budding yeast cells cannot tolerate even 400 µW of 632.8-nm laser radiation if the irradiation lasts for a long period of time. Thus, we presume that the fission yeast cells divided only once because of the 1 mW laser irradiation in our experiments. The fission yeast cells may have entered the G0 (gap-0) phase without morphological changes. Even in such a case, the molecular composition continues to change in response to the external stress. As discussed in the next section, our results do reveal that the molecular compositions and distributions vary continuously even while there is little morphological change in the optical images.
IV-2. Univariate Raman images
Using the univariate analysis described in Chapter III, we constructed time-lapse univariate Raman images of a single S. pombe cell during the cell cycle for 10 Raman bands at 1655, 1602, 1440, 1340, 1301, 1260, 1154, 1003, 852, and 348 cm-1, integrating over spectral windows selected to be as narrow as possible (Figure IV-2). We classify these 10 sets of Raman images into three groups according to their assignments. The three groups are lipids, proteins, and admixtures of lipids, proteins, and other molecular species.
IV-2-1. Group of lipids
Univariate Raman images of lipids include those for the 1440 and 1301 cm-1 Raman bands. As discussed in Chapter III, the Raman band at 1440 cm-1 arises from the CH2 scissoring (1439 cm-1) and CH3 deformation (1456 cm-1) of both lipids and proteins. How then can we obtain a Raman image that is attributable solely to lipids? Figure IV-3 shows a pair of lipid-rich and protein-rich Raman spectra of fission yeast cells. The peak position of the CH bend in the protein-rich spectrum (Figure IV-3a) is at 1451 cm-1, which differs from that in the lipid-rich spectrum, i.e., 1440 cm-1 (Figure IV-3b). This result is consistent with the fact that proteins have a larger CH3/CH2 ratio than lipids [42]. As long as we carefully choose a narrow region around 1440 cm-1 for intensity integration (recall Chapter III), we can extract a Raman image of lipids from this overlapped band. In contrast to the 1440 cm-1 Raman band, the 1301 cm-1 Raman band originates predominantly from the CH2 in-phase twist of lipids, so a pure lipid Raman image can be constructed easily.
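The narrow-window integration can be illustrated with the following sketch, which integrates the area of a band above a straight baseline connecting the two ends of the chosen window (the construction shown in Figure III-3a). The function name, the window limits, and the synthetic spectrum are hypothetical and serve only to illustrate the idea; they are not the values used in this work.

```python
import numpy as np

def band_area(wavenumber, spectrum, lo, hi):
    """Area of a band above the straight line joining the two window ends."""
    wavenumber = np.asarray(wavenumber, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    mask = (wavenumber >= lo) & (wavenumber <= hi)
    x, y = wavenumber[mask], spectrum[mask]
    order = np.argsort(x)                  # make sure the axis is ascending
    x, y = x[order], y[order]
    # Linear baseline through the first and last points of the window.
    baseline = np.interp(x, [x[0], x[-1]], [y[0], y[-1]])
    d = y - baseline
    # Trapezoidal integration of the baseline-corrected band.
    return float(np.sum(0.5 * (d[1:] + d[:-1]) * np.diff(x)))

# Synthetic spectrum: sloping background plus a triangular band at 1440 cm-1.
x = np.arange(1400.0, 1481.0, 1.0)
y = 0.01 * x + 5.0 + np.clip(1.0 - np.abs(x - 1440.0) / 10.0, 0.0, None)
area = band_area(x, y, 1430.0, 1450.0)     # hypothetical window limits
```

Applying such a function to the spectrum at every pixel yields one univariate image per band; narrowing the window around 1440 cm-1 reduces the contribution of the neighboring protein band near 1451 cm-1.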
IV-2-2. Group of proteins
The group of proteins contains the 1003 and 852 cm-1 Raman bands, which originate exclusively from proteins. The sharp band at 1003 cm-1 is assigned to the ring breathing mode of the phenylalanine residues in proteins. The 852 cm-1 band is one component of the tyrosine doublet, which arises from a Fermi resonance between the ring breathing fundamental and the overtone of an out-of-plane ring bending vibration of tyrosine residues [43].
IV-2-3. Group of admixtures of lipids, proteins, and other molecular species
Here we discuss the last group, which consists of the 1655, 1340, 1260, and 1154 cm-1 Raman bands. It is well known that the peak positions of the amide I band (1657 cm-1) and the C=C stretching of the unsaturated lipid chains (1655 cm-1) are almost identical [44]. Thus, these severely overlapped Raman bands cannot be easily resolved even by the method used for the 1440 cm-1 Raman band. The 1340 cm-1 Raman band is located on a shoulder of a broad band around 1300 cm-1 to which many molecular species contribute, suggesting that the contributions to the 1340 cm-1 image are very complicated. The Raman images of the weak band at 1154 cm-1 may also suffer from large uncertainties. Furthermore, the Raman band of crystalline sodium polyphosphate at 1162 cm-1 [17] and the skeletal C-C stretch modes in the 1000-1150 cm-1 region [45] may interfere with the 1154 cm-1 Raman images.
Despite the spectral contaminations for those complicated bands, we can still separate those admixtures, to some extent, into lipid-dominated and protein-dominated contributions.
The cis-C=C band at 1655 cm-1 of lipids is usually stronger than the amide I band at 1657 cm-1 in yeast cells. Thus, the lipid-dominated Raman images at 1655 cm-1 coincide with the other lipid Raman images. However, more green patterns that fill the entire yeast cell are observed in the Raman images at 1655 cm-1. This implies that proteins also contribute slightly to the Raman images at 1655 cm-1. Crystalline sodium polyphosphate is known to appear in yeast cells under conditions of starvation [17]. However, in the present work, the treatment with fresh medium provides sufficient nutrition and prevents the yeast cells from starving.
The interference from this band should thus be small. Skeletal C-C stretch modes are very broad so that we can remove this contribution by using a well-selected baseline. Thus, the 1154 cm-1 Raman images here can be considered as originating from proteins. The Raman bands at 1340 and 1260 cm-1 are too weak to be used for constructing decent Raman images.
It is not clear whether their behaviors represent that of lipids or proteins.
IV-2-4. Others
We treat the 1602 and 348 cm-1 Raman bands separately. The 348 cm-1 Raman band is assigned to polysaccharides, which are abundant in the cell walls and the septa [46, 47]. To our knowledge, Raman images of this particular band in fission yeast cells have been obtained here for the first time, owing to the improved S/N and sample conditions.
The Raman band at 1602 cm-1 is relatively isolated from the others, but it has not yet been given a conclusive assignment [48-50]. Recently, Chiu et al. showed that ergosterol contributes, at least partially, to the 1602 cm-1 Raman band [51]. In another project of our group, we also studied this mysterious Raman band using ubiquinone Q10 deficient yeast cells, and we will discuss the results in Appendix I.
IV-3. Time dependence of the Raman intensities during the cell cycle
Next, we discuss the time dependence of the total relative concentrations derived from selected bands (Figure IV-4). Note that the Raman intensity of each band is proportional to the concentration of the molecule that gives rise to the band. Moreover, the effective laser volume in the axial direction (~2 µm) is similar to the thickness of a single fission yeast cell (~2 µm). Taken together, the Raman intensity of a band can represent dynamic changes in the concentration of the molecular species within the entire yeast cell. Here we focus only on the four sets of Raman images that have been assigned unambiguously (1440 and 1301 cm-1 for lipids and 1003 and 852 cm-1 for proteins) and discuss their specific dynamic behaviors. The normalized total intensity of these four bands reaches a maximum at 6 h, indicating that the yeast cell produces a large amount of lipids, proteins, and other biological molecules right before dividing. Subsequently, the normalized Raman intensities of the four bands drop by ~50% when the cell divides at 6.5 h. After the cell division, they gradually increase over the following 15.5 h. Interestingly, although the morphology of the yeast cell looks identical from 6.5 h to 22 h, the total Raman intensities at 1440, 1301, 1003, and 852 cm-1 continue to increase throughout this period. Although the yeast cell might fall into the G0 phase at some point after 6.5 h due to unexpected stress, we do see slight changes in the concentrations of lipids and proteins, indicating that the yeast cell might return to the regular cell cycle.
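A time course of this kind can be obtained from a stack of univariate images as sketched below. The normalization to the maximum of the time course and the toy intensities are assumptions made for the illustration; the sketch does not reproduce the data of Figure IV-4.

```python
import numpy as np

def normalized_total_intensity(images):
    """Sum each (Y, X) image of a (T, Y, X) stack and scale the maximum to 1."""
    images = np.asarray(images, dtype=float)
    totals = images.reshape(images.shape[0], -1).sum(axis=1)
    return totals / totals.max()           # assumption: normalize to the maximum

# Toy stack of four 2 x 2 images with uniform intensities 1, 2, 4 and 2.
images = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 4.0, 2.0)])
curve = normalized_total_intensity(images)
print(curve)                               # [0.25 0.5  1.   0.5 ]
```

In this toy example the last point is half of the maximum, which mimics a ~50% drop of the total intensity between two consecutive measurements.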
IV-4. Distribution changes of lipids and proteins during the cell cycle
In this section, we discuss the distribution of lipids and proteins in more detail. Dynamic changes in the distribution of lipids during the cell cycle can be explored using the 1301 cm-1 Raman images, which are constructed from a pure lipid band (Figure IV-5a). At 1 h (G2 phase), the 1301 cm-1 Raman image shows red patterns localized at the two ends of the yeast cell. In the next 5 h (from M to G1/S phase), the red areas gradually decrease in number and appear to spread over the whole cell. To see more quantitatively whether the Raman images show a change from a localized to a delocalized distribution in the yeast cell during the cell cycle, we consider a cross-section of the Raman images that runs from the upper left to the lower right of the yeast cell (see lines on the Raman images of Figure IV-5a). The Raman intensity at each pixel on the line is summed with those at the two neighboring pixels alongside the line to reduce the uncertainty in the choice of the cross-section. The line profile of the Raman intensity clearly shows that lipids are localized at the two ends of the yeast cell at 1 h (early G2) and that their distribution becomes progressively more uniform on going from late G2 to the G1/S phase. In contrast, the distribution of proteins (Figure IV-5b) is delocalized in every phase, indicating that the concentration of proteins is always homogeneous over the whole yeast cell. These results are consistent with our previous work [52].
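The construction of such a line profile can be sketched as follows. This is one possible reading of the summation described above, in which the two neighbors are taken perpendicular to the line; the function name, the nearest-pixel sampling, and the toy image are assumptions made for the illustration.

```python
import numpy as np

def line_profile(image, start, end, half_width=1):
    """Intensity along a line, summed with neighbors on both sides of it.

    start and end are (row, column) pixel coordinates and must differ.
    """
    image = np.asarray(image, dtype=float)
    (r0, c0), (r1, c1) = start, end
    n = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
    rows = np.linspace(r0, r1, n)
    cols = np.linspace(c0, c1, n)
    # Unit vector perpendicular to the line.
    length = np.hypot(r1 - r0, c1 - c0)
    nr, nc = -(c1 - c0) / length, (r1 - r0) / length
    profile = np.zeros(n)
    for off in range(-half_width, half_width + 1):
        rr = np.rint(rows + off * nr).astype(int)
        cc = np.rint(cols + off * nc).astype(int)
        # Ignore sampling points that fall outside the image.
        inside = (rr >= 0) & (rr < image.shape[0]) & (cc >= 0) & (cc < image.shape[1])
        profile[inside] += image[rr[inside], cc[inside]]
    return profile

# Toy image; a horizontal line through the middle row keeps the example simple.
image = np.arange(25, dtype=float).reshape(5, 5)
profile = line_profile(image, start=(2, 0), end=(2, 4))
```

For a diagonal cross-section, `start` and `end` would be the upper-left and lower-right ends of the line; a profile with two maxima near its ends and a minimum in between would then indicate localization at the cell ends.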
We interpret this phenomenon as follows. Lipids are associated with energy storage, which is a basic metabolic process in yeast cells [53, 54]. In the early G2 phase, the yeast cell stores high concentrations of lipids for energy in particular organelles, whose locations appear as the aggregated (red) areas, and prepares for the cell cycle. The yeast cell then transfers those lipids to sites where energy is needed to start duplicating organelles and to proceed to cell division in the M and G1/S phases. As a result, a uniform distribution is observed in this period. The present finding demonstrates the potential use of lipid Raman images as an indicator of energy consumption in the cell.
IV-5. Multivariate Raman images
To overcome the difficulty that the univariate analysis faces because of possible contamination from multiple intracellular molecules, we performed multivariate curve resolution (MCR). MCR analyzes the Raman image data globally and can in principle resolve all major components contained in the spectrum of fission yeast cells. The MCR analysis of the same data set as used for the univariate analysis yielded time-lapse Raman images and component spectra of six chemical species (Figure IV-6). A rainbow color scale is used to represent the intensity at each pixel, with red showing the highest intensity and purple the lowest. The six components are denoted 1-6. In the following, we describe how each component is assigned based on its spectral features and concentration distribution.
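The time-lapse images of each component are obtained by folding the rows of 𝑯 back into image form. A minimal sketch is given below; it assumes that the columns of 𝑯 are ordered as (time, Y, X), i.e., in the same order in which the data were unfolded, and the function name and array sizes are hypothetical.

```python
import numpy as np

def component_images(H, n_times, ny, nx):
    """Fold a (k, T*Y*X) matrix H into an array of shape (k, T, Y, X)."""
    H = np.asarray(H)
    k, n = H.shape
    if n != n_times * ny * nx:
        raise ValueError("H has the wrong number of columns for this image size")
    # Assumption: columns are ordered with time slowest and X fastest.
    return H.reshape(k, n_times, ny, nx)

# Toy H with 2 components, 3 measurement times and 4 x 5 pixels.
H = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 60)
imgs = component_images(H, n_times=3, ny=4, nx=5)
print(imgs.shape)                          # (2, 3, 4, 5)
```

Here `imgs[c, t]` is the image of component `c` at measurement time `t`, which can then be displayed with a rainbow color scale.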
IV-5-1. Component 1
Component 1 shows a featureless spectrum (Figure IV-6b-1) and is distributed exclusively outside the yeast cell (Figure IV-6a-1), indicating component 1 is associated with background from the medium.
IV-5-2. Component 2
Component 2 can be attributed with confidence to polysaccharides such as β-1,3-glucan, the major constituent of the yeast cell wall [55]. The Raman images of component 2 (Figure IV-6a-2) clearly visualize the cell wall at all measurement times and a septum at 6 h. Furthermore, the spectrum of component 2 (Figure IV-6b-2) is in good agreement with the reported Raman spectrum of β-1,3-glucan [16]. According to the normal-mode analysis of model disaccharides [46, 47], the low-frequency bands at 348 and 423 cm-1 are predominantly attributed to the skeletal deformations (C–C–C, C–C–O, and O–C–O), the 890 cm-1 band to the C–O stretch mode of the glycosidic linkage, and the 1463 cm-1 band to the CH bending mode of the CH2OH group.
IV-5-3. Component 3
The Raman images of component 3 are almost identical to the 1301 cm-1 Raman images (Figure IV-2) obtained with the univariate analysis, indicating that component 3 is associated with lipids. Its component spectrum is a typical lipid-rich Raman spectrum of yeast cells, with bands at 716, 1301, 1440, 1602, and 1658 cm-1. As already discussed above, the 1658 cm-1 band is assigned to cis-C=C stretching of the unsaturated lipid chains, the 1440 cm-1 band to the CH bending of the lipid chains, and the 1301 cm-1 band to in-phase CH2 twisting.
In particular, the 716 cm-1 band is a characteristic spectral signature of the phospholipid headgroup [56, 57], suggesting that phospholipids also contribute to component 3. Moreover, close inspection of the distribution pattern in the images shows a blue (low-intensity) region around the center of the cell that corresponds to the nucleus. This result is consistent with a low concentration of lipids in nuclei.