Part I - Evaluation of Peptide Fractionation Strategies used in Proteome Analysis.
Two-dimensional LC separation possesses sufficient resolving power to substantially reduce the spatial and temporal complexity of peptide mixtures.
An increase in the number of measurable peptides and a wider overall dynamic range consequently increase proteome coverage. Four different peptide fractionation strategies for 2D LC-MS/MS were compared in this study. The goal was to determine which fractionation strategy permits the highest number of peptides/proteins to be identified. Acidic-RP is the method of choice in the second LC dimension because of its high resolution and compatibility with ESI-MS/MS.
A flowchart of the experiment is presented in Figure 13. One milligram of the PLC/5 cell lysate was reduced, alkylated and digested, as described in the experimental section. Aliquots containing equal amounts of peptide (155 µg) were first desalted using an RP column, and the resulting solution was either injected onto the first LC dimension (acidic-RP, SCX, HILIC or alkaline-RP chromatography) or separated on an OFFGEL fractionator. Figure 6 shows the chromatographic profile of the peptide mixture desalted using a C18 column; the fractions eluting from 36 to 40 min were collected. The first-dimension separation resulted in 24 fractions, typical of a global proteomic analysis,54 which were further analyzed by LC-MS/MS. Acidic-RP was used as a control, since it has the same separation properties as the second LC dimension. The five methods examined were based on previously described methods55-57 as well as our own experience with these separation techniques. The total numbers of proteins/peptides identified from each 2D separation method were used to indicate the overall fractionation efficiency. The distribution of the identified peptides and proteins is illustrated in Figure 14.
The Influence of Salt on Separation Efficiency
A salt gradient was used to elute the peptides from the SCX column, similar to a previous study.58 The peptide sample contained NH4HCO3 from the digestion step. To examine the influence of salt content (~10 mM) on the SCX separation, both non-desalted and desalted peptide samples were analyzed. As shown in Table 3 and Figure 15, a total of 1990 proteins were identified from the desalted sample but only 1375 from the non-desalted sample. This difference was magnified at the peptide level: 10 055 distinct peptides were found in the desalted sample, versus only 4644 in the non-desalted sample. Figure 16 compares the protein/peptide abundance per fraction for the non-desalted and desalted samples. More proteins/peptides were identified in nearly all fractions of the desalted sample than in those of the non-desalted sample. This was especially true for the fractions after No. 10, where the peptides carried three or more charges. As shown in Figure 17, the charge distribution for the non-desalted sample differs markedly from that of the desalted sample: peptides with a charge of +2 were distributed over the first 8 fractions in the desalted sample but appeared only in the first 2 fractions in the non-desalted sample. This indicates that peptides with a charge of +2 may not be retained on the SCX column when the salt concentration is greater than the KH2PO4 concentration in the mobile phase A
identifiable peptides, compared to the desalted sample. The total number of distinct peptides identified in the non-desalted sample was slightly greater than that in the desalted sample in the case of solution-IEF. This indicates that, provided the salt concentration is kept below 10 mM, as suggested by the manufacturer, the salt content of the sample is not a critical factor for separation efficiency in solution-IEF.
Orthogonality of Two-Dimensional Separations
Orthogonality in chromatography refers to alternative selectivity between separations. Orthogonal, or 2D, separations are needed to address one of the major concerns in method development, i.e., insufficient resolution, which can mask signals from analytes with similar physical and chemical properties. Such separations can be achieved by modifying some of the parameters and/or by an appropriate choice of stationary phase, organic modifier conditions and mobile phase pH, so that the two separation dimensions have different selectivities. Orthogonality is therefore what makes 2D separations successful.9
The orthogonality between the two dimensions of separation was evaluated by plotting each peptide hit per fraction in the first dimension separation as a function of its retention time in a second dimension separation (Figure 18).7 The fact that the peptides are evenly distributed over the whole
map provides a clear indication of the high degree of orthogonality for SCX x RPLC and HILIC x RPLC.
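The visual impression of an evenly filled map can also be summarised as a number. The following is a minimal sketch and not the procedure used in this study: it divides the second-dimension retention time into bins and reports the fraction of (first-dimension fraction, retention-time bin) cells that contain at least one peptide hit. The function name `bin_coverage`, the bin count and the toy data are all hypothetical.

```python
def bin_coverage(hits, n_fractions, rt_min, rt_max, n_rt_bins):
    """Fraction of cells in an n_fractions x n_rt_bins grid holding >= 1 hit.

    hits: iterable of (fraction_number, retention_time) pairs, with
    fraction_number in 1..n_fractions and rt_min <= retention_time <= rt_max.
    """
    bin_width = (rt_max - rt_min) / n_rt_bins
    occupied = set()
    for fraction, rt in hits:
        # Clamp so that rt == rt_max falls into the last bin.
        rt_bin = min(int((rt - rt_min) / bin_width), n_rt_bins - 1)
        occupied.add((fraction, rt_bin))
    return len(occupied) / (n_fractions * n_rt_bins)


# Toy data (hypothetical): 3 first-dimension fractions, 0-60 min gradient.
spread = [(1, 5.0), (1, 35.0), (2, 15.0), (2, 55.0), (3, 25.0), (3, 45.0)]
clustered = [(1, 5.0), (1, 6.0), (2, 7.0), (2, 8.0), (3, 9.0), (3, 9.5)]

print(bin_coverage(spread, 3, 0.0, 60.0, 6))     # hits spread over the map
print(bin_coverage(clustered, 3, 0.0, 60.0, 6))  # hits crowded into early bins
```

A higher value means that more of the available separation space is used. The result depends on the chosen bin size, so values are only comparable between maps binned in the same way.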
A few reports suggest that the orthogonality of the SCX x RPLC approach is less than ideal.8 It is known that the separation in SCX is directed by the charge on the peptide. Since the charge states of the tryptic peptides identified from SCX were mainly +2 and +3, the peptides cluster in a narrow retention window. The majority of peptides elute from the column early in the analysis, leaving a portion of the separation space relatively devoid of peaks. Therefore, the resolution of the SCX x RPLC approach is worse than expected.
The fact that SCX fractionation can easily be coupled with a UV/Vis detector is a distinct advantage, since it permits the elution of peptides to be monitored on-line. The chromatographic peak areas, determined from UV absorbance values, indicate the amount of peptide at each time point, so it is feasible to pool the collected fractions such that each LC-MS/MS analysis receives approximately the same amount of peptide. The eluate from 1 to 7 min was pooled as the first fraction and that from 8 to 9 min as the second fraction; from 10 to 14 min, one fraction was collected per minute; from 15 to 22 min and from 29 to 38 min, one fraction was collected every two minutes; after 39 min, the eluate was pooled into one fraction per 15 min. Fractionation based on the peak area determined by UV absorbance permitted more peptides to be identified (Table 3).
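The idea of pooling by UV peak area can be illustrated with a short sketch. This is not the procedure used in the study, where the pooling windows were chosen from the chromatogram; it is one possible way, under stated assumptions, to group consecutive collection intervals into pools of roughly equal summed area. The function name `pool_by_area` and the per-minute areas are hypothetical.

```python
def pool_by_area(areas, n_pools):
    """Group consecutive time slices into pools of roughly equal summed area.

    areas: UV peak area per collection interval (e.g. per minute), in order.
    Returns up to n_pools (start_index, end_index) pairs, inclusive, 0-based.
    """
    target = sum(areas) / n_pools
    pools, start, cumulative = [], 0, 0.0
    for i, area in enumerate(areas):
        cumulative += area
        # Close the current pool once the cumulative area reaches the next
        # multiple of the target; the final pool is closed after the loop.
        if len(pools) < n_pools - 1 and cumulative >= target * (len(pools) + 1):
            pools.append((start, i))
            start = i + 1
    if start < len(areas):
        pools.append((start, len(areas) - 1))
    return pools


# Hypothetical per-minute peak areas (arbitrary units).
areas = [1, 1, 2, 4, 8, 8, 4, 2, 1, 1]
pools = pool_by_area(areas, 4)
print(pools)
print([sum(areas[a:b + 1]) for a, b in pools])  # summed area of each pool
```

With these toy values, the low-abundance early and late intervals are merged into wide pools, while each interval at the peak maximum forms its own pool, which mirrors the uneven collection windows described in the text.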
It has been reported that solution-IEF (pI based) gives better focusing of peptides than SCX (charge based).51 However, the improved peptide separation did not result in the detection of an increased number of peptides/proteins in this study. This could be attributed to poorer recovery in the case of solution-IEF, although the manufacturer claims that the sample can be recovered from the gel in 95% yield. In addition, the peptides migrate in the gel without any monitoring device, which makes it difficult to predict the actual amount of peptide in each compartment of the IEF solution.
Another problem is that the amount of peptide varies greatly between the fractions from solution-IEF. As seen in Figure 16B, there are two gaps in the peptide distribution map. These are due to a lack of peptides with specific pI values.17
Alkaline-RP provides good orthogonality (Figure 18E). Because of the ionic nature of peptides, substantial separation orthogonality can be achieved in the alkaline-RP x RPLC system by using a high pH buffer in the first dimension and a low pH buffer in the second dimension, even when both columns are packed with an identical sorbent.20
Charge, GRAVY and pI Value Distribution of Peptides
Figure 17 shows the charge distribution of the peptides identified in SCX. It is known that singly and doubly charged peptides elute at the very beginning of the separation, whereas peptides carrying three or more charges are retained more strongly on the column and elute at higher salt concentrations. As shown in Figure 17A, the majority of the peptides identified from SCX are doubly charged and are present in fractions 2 through 8. Triply charged peptides make up a smaller share of the total peptides identified and are mainly present in fractions 9 through 18. More highly charged peptides, in relatively low amounts, are present in the last five fractions, by which point the majority of triply charged peptides had already eluted from the column.
GRAVY (grand average of hydropathy) for a peptide or protein is calculated as the sum of the hydropathy values of all of the amino acids, divided by the number of residues in the sequence.59 Both acidic-RP and alkaline-RP separate peptides on the basis of hydrophobicity: less hydrophobic peptides tend to elute early in the analysis, while more hydrophobic peptides elute at higher concentrations of organic modifier.
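The GRAVY definition above can be written out as a short calculation. The sketch below assumes the Kyte-Doolittle hydropathy scale, which is the scale conventionally used for GRAVY; the text does not state which scale reference 59 refers to, so this is an assumption. The function name `gravy` and the example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
# (assumed scale; the source only cites reference 59 for the definition).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Sum of residue hydropathy values divided by the sequence length."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)


# Hypothetical example peptides.
print(gravy("AAAA"))  # uniformly hydrophobic residues give a positive value
print(gravy("KR"))    # basic, hydrophilic residues give a negative value
```

Positive values indicate a more hydrophobic peptide and negative values a more hydrophilic one, which is the sense in which GRAVY values are compared between fractions in this section.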
In contrast with reversed phase, more hydrophobic peptides elute from the HILIC column early in the process. The GRAVY distributions for acidic-RP and alkaline-RP show the same trend: the peptides identified in early fractions have smaller GRAVY values and the peptides identified in later fractions have higher GRAVY values (Figure 19A).
The GRAVY distribution for HILIC (Figure 19B) is the opposite of that for RP: peptides in early fractions have larger GRAVY values and those in later fractions have smaller GRAVY values. HILIC also exhibited a fairly pronounced advantage over alkaline-RP in identifying peptides with lower GRAVY values (Figure 20). The number of peptides with a GRAVY value below -0.5 was 2966 for HILIC and 2150 for alkaline-RP. This clearly shows that the HILIC fractionation approach permitted more hydrophilic peptides to be identified.
Solution-IEF utilizes differences in the pI values of peptides as the basis for separation. A 24 cm long IPG gel strip with a linear pH gradient ranging from 3 to 10 was used in this experiment; the gel strip was divided into 24 sections, each spanning a pI range of 0.29. The pI distribution of the peptides identified by solution-IEF is shown in Figure 14F and Figure 14G. Peptides were unevenly distributed along the IPG strip; there are two gaps in the map, one in the pI range 4.45 to 5.03 and the other in the pI range 7.64 to 8.51. The absence of peptides in certain fractions could be attributed to a lack of amino acid combinations that would give peptides pI values in these ranges.56 Several reports indicate that the distribution of a proteome digest over the OFFGEL IPG strip does not depend on the nature of the sample or the specific organism (eukaryote or prokaryote).17, 56, 60
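The relation between the strip sections and the quoted pI ranges can be checked with a little arithmetic. The sketch below assumes that section 1 lies at the acidic end (pH 3) and uses the rounded section width of 0.29 given in the text; the exact width for a 3 to 10 gradient over 24 sections would be 7/24, or about 0.292. The function names are hypothetical.

```python
def section_ranges(ph_start=3.0, n_sections=24, width=0.29):
    """Return (section_number, low_pI, high_pI) for each strip section.

    Uses the rounded width quoted in the text; numbering from the acidic
    end is an assumption.
    """
    return [
        (k + 1,
         round(ph_start + k * width, 2),
         round(ph_start + (k + 1) * width, 2))
        for k in range(n_sections)
    ]


def sections_in_gap(ranges, gap_low, gap_high):
    """Section numbers whose pI window lies entirely inside the gap."""
    return [n for n, low, high in ranges if low >= gap_low and high <= gap_high]


ranges = section_ranges()
print(sections_in_gap(ranges, 4.45, 5.03))  # first gap quoted in the text
print(sections_in_gap(ranges, 7.64, 8.51))  # second gap quoted in the text
```

Under these assumptions, the two quoted gaps coincide with whole sections of the strip (sections 6 to 7 and 17 to 19), which suggests that the gap limits in the text were derived from the section boundaries.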
Complementarity of SCX x RPLC, HILIC x RPLC, Alkaline-RP x RPLC and sIEF x RPLC Methods
To compare the relative efficiencies of the 2D separations for complex protein mixtures systematically, a consistent peptide load in the second dimension, a consistent number of collected fractions (24), consistent LC-MS/MS conditions, and consistent database search parameters were employed. Table 3 summarizes the total proteins/peptides identified by Mascot for each workflow. As shown in Table 3, SCX x RPLC using desalted samples gave the greatest number of proteins identified, followed closely by HILIC x RPLC. Alkaline-RP x RPLC and solution-IEF x RPLC gave intermediate results, while RP x RPLC, used as a control, gave the worst results.
To evaluate the overlap between the four methods, Venn diagrams based on the proteins and peptides identified by the different methods were constructed using Scaffold (Figure 21). A total of 1762 proteins were identified within all sets of methods; SCX covered 96.54% of the total proteins identified (Figure 21A). When SCX and HILIC were used, 98.92% of the total proteins were identified. At the peptide level (Figure 21B), SCX and HILIC covered 70.04% and 58.75% of the total peptides identified, respectively, but the sets of peptides were somewhat independent of one another. The number of peptides identified with all setups is rather high in comparison with the number of proteins identified with all setups, which reflects the fact that different peptides may contribute to the identification of the same protein.
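The coverage percentages and overlaps discussed above are set operations on the lists of identifications. The sketch below illustrates the calculation on made-up identifiers; it does not reproduce the Scaffold analysis or the real data, and the helper name `coverage` and all identifiers are hypothetical.

```python
def coverage(method_ids, all_ids):
    """Percentage of all identified entries that one set of methods found."""
    return 100.0 * len(method_ids & all_ids) / len(all_ids)


# Hypothetical identifications per first-dimension method.
identified = {
    "SCX": {"P1", "P2", "P3", "P4", "P5", "P6"},
    "HILIC": {"P2", "P3", "P4", "P7"},
    "alkaline-RP": {"P1", "P3", "P8"},
    "sIEF": {"P3", "P4"},
}

union_all = set().union(*identified.values())             # found by any method
shared_by_all = set.intersection(*identified.values())    # found by every method

print(coverage(identified["SCX"], union_all))                        # SCX alone
print(coverage(identified["SCX"] | identified["HILIC"], union_all))  # SCX + HILIC
print(sorted(shared_by_all))
```

The same calculation applies at the protein and at the peptide level; only the identifiers placed in the sets change.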
The combination of methods not only marginally increases the number of identified proteins but also significantly increases the number of unique peptides identified. Table 4 lists 10 proteins identified in our experiment. The peptide-per-protein ratio listed in the table increased after combining the data from the four different fractionation approaches. For example, in the case of early endosome antigen 1 (accession number: IPI00329536.2), the sequence coverage was only 19.99% in the SCX approach, but after combining the data from the four fractionation approaches, a sequence coverage of 33.1% was realized. This indicates that, although SCX x RPLC is capable of generating the largest numbers of peptide and protein identifications, the four fractionation methods complement each other to some degree.
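How combining peptide lists raises sequence coverage can be shown with a small sketch. It marks every residue of a protein that is covered by at least one identified peptide and reports the covered percentage. The protein sequence, the peptides and the function name `sequence_coverage` are hypothetical and are not taken from Table 4.

```python
def sequence_coverage(protein, peptides):
    """Percentage of protein residues covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical 20-residue protein and peptide identifications.
protein = "ACDEFGHIKLMNPQRSTVWY"
scx_peptides = ["ACDEF"]
hilic_peptides = ["DEFGHIK", "STVWY"]

print(sequence_coverage(protein, scx_peptides))                   # one method
print(sequence_coverage(protein, scx_peptides + hilic_peptides))  # combined
```

Overlapping peptides are counted once per residue, so the combined coverage is smaller than the sum of the individual coverages whenever two methods identify the same region.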
Table 5 lists the number of phosphopeptides identified from each fractionation method; SCX gave the best results (48) and solution-IEF gave the worst results (17). When the data from the four fractionation methods were combined, the number of unique phosphopeptides identified increased substantially to 83 (73% higher than with SCX x RPLC alone).
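The quoted 73% follows directly from the two phosphopeptide counts. As a brief check, using only the numbers given in the text (the function name `percent_increase` is hypothetical):

```python
def percent_increase(baseline, combined):
    """Relative increase of `combined` over `baseline`, in percent."""
    return 100.0 * (combined - baseline) / baseline


# 48 phosphopeptides with SCX x RPLC alone, 83 after combining all methods.
print(round(percent_increase(48, 83)))  # 73
```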
The increased number of distinct peptides identified is especially beneficial, not only for confidently identifying proteins but also for improving the feasibility of proteomic studies of post-translational modifications and of peptide-based quantification approaches using stable isotope labeling, such as iTRAQ, SILAC, or TMT. For quantification approaches, improved sequence coverage of proteins will increase the number of peptide matches per protein, thus reducing false positive identifications and enabling statistical quantitation for a greater number of identified proteins.
Summary
To date, no single peptide separation approach allows all peptides to be identified from a complex biological sample. A multi-dimensional separation strategy that reduces the complexity of the protein sample is therefore essential. Here, four of the most widely applied fractionation methods for 2D shotgun proteomics (SCX, HILIC, alkaline-RP and solution-IEF) were compared and evaluated. In terms of the total number of proteins identified, SCX x RPLC using desalted samples provided superior results.
It should be noted that the salt content of the sample can substantially lower the separation efficiency of SCX, because singly and doubly charged peptides are not retained, resulting in fewer peptide identifications. If a single 2D separation method is applied for general proteomics analysis, it is recommended that the sample be fractionated with SCX x RPLC, both because it permits a sufficient number of proteins to be identified and because of its ease of use.
Combining two highly resolving separation approaches, such as SCX x RPLC with HILIC x RPLC or others, could significantly increase the sequence coverage of identified proteins. This would further assist proteomic studies, especially those involving post-translational modifications and peptide-based quantification approaches.