Chapter 1 – Introduction
1-1 Phosphoproteomics
1-1.1 From Genomics to Proteomics
One of the most significant biological achievements of the past 40 years has been the completion of draft DNA sequences of the human genome [1,2]. The International Human Genome Sequencing Consortium, led in the United States by the National Human Genome Research Institute and the Department of Energy, published its scientific description of the finished human genome sequence, reducing the estimated number of human protein-coding genes from 35,000 to only 20,000-25,000, a surprisingly low number for our species [3,4]. Although the number is surprisingly low, a single gene can be transcribed into a pre-mRNA that may be processed into one mRNA or, through alternative splicing, into several mRNA forms. The transcripts are translated into functional proteins, the ultimate operating molecules that produce the physiological state. Processing at the RNA level and post-translational modification of proteins add a large amount of further diversity to the expression profiles of gene products [5]. The term “proteome” was coined in 1994, as a linguistic equivalent of the concept of the genome, to describe the set of proteins expressed by a genome [5]. In contrast to the “static” genome sequence, the proteome is a highly dynamic entity, because the abundance, state of modification, subcellular location, and other properties of its proteins depend on the physiological state of the cell or tissue [6]. The study of the proteome, called proteomics, initially referred to the study of the expressed proteins of a genome using 2D gel electrophoresis.
This approach is now referred to as “expression” proteomics. The term proteomics is now used more broadly to describe the study of the complete set of proteins that is expressed, and modified following expression, by the entire genome over the lifetime of a cell. It is also used in a less universal sense to describe the complement of proteins expressed by a cell at any one time. Today, proteomics is a scientific discipline that promises to bridge the gap between our understanding of genome sequence and cellular behavior, and it can be viewed as a biological assay for determining gene function.
1-1.2 Post-Translational Modification
In addition to protein expression, the post-translational modification of a protein determines its turnover, localization, activity, and binding interactions.
The vast majority of eukaryotic proteins are post-translationally modified, and more than 200 post-translational modifications of amino acids have been reported thus far. These modifications act on individual residues, either through cleavage at specific points, deletions, or additions, or through conversion or modification of the side chains. Post-translational modification may involve the formation of disulfide bridges or the attachment of any of a number of biochemical functional groups, such as acetate, phosphate, various lipids, and carbohydrates. Some post-translational modifications extend the range of possible functions of a protein by introducing other chemical groups into its makeup.
Such chemical changes may alter the hydrophobicity of a protein and thus determine whether the modified protein is cytosolic or membrane-bound.
1-1.3 Protein Phosphorylation
Of the wide variety of post-translational modifications, protein phosphorylation is the most common: about one third of human proteins contain covalently bound phosphate, and the human genome encodes five hundred protein kinases and a third that number of protein phosphatases. Phosphorylation adds two negative charges to a protein and allows the formation of hydrogen bonds, which leads to changes in electrostatic interactions, substrate binding, conformation, and catalytic activity that are important for biological function. Phosphorylation regulates protein function and localization and is a ubiquitous regulatory mechanism in both eukaryotes and prokaryotes [7].
Intracellular phosphorylation is carried out by protein kinases, which are activated in response to extracellular signals and in turn control processes such as kinase cascade activation, membrane transport, gene transcription, and motor mechanisms. The study of cell biology is now littered with examples of regulation by phosphorylation:
increasing or decreasing the biological activity of an enzyme, helping to move proteins between subcellular compartments, allowing interactions between proteins to occur, and labeling proteins for degradation. Several diseases have been recognized to be associated with the abnormal phosphorylation of cellular proteins. For example, a major virulence factor of Yersinia is a protein tyrosine phosphatase. This genus of bacteria causes several serious diseases, including bubonic plague, which has been responsible for many pandemics over the past millennium [8]. These include the Black Death, which killed 25% of the population of Europe in the 14th century. This phosphatase can enter human cells, causing uncontrolled dephosphorylation of many tyrosine residues that rapidly proves fatal. Therefore, phosphorylated proteins are attractive drug targets. Large-scale identification of phosphorylated kinase substrates will certainly enhance our understanding of diverse biological phenomena, potentially leading to targeted intervention in any number of disease paradigms [9].
1-2 Analysis of Protein Phosphorylation
1-2.1 Analytical Challenges of Protein Phosphorylation
To gain further insight into the regulation of cellular function by reversible phosphorylation, it is often necessary to characterize the phosphorylation state of specific proteins under certain conditions. However, global analysis of protein phosphorylation remains a major analytical challenge. Mann et al. have reviewed the reasons for this in detail [10].
First, the stoichiometry of phosphorylation is generally low: only a small fraction of the available intracellular pool of a protein is phosphorylated at any given time as a result of a stimulus. Second, the phosphorylated sites on a protein can vary, implying that any given phosphoprotein is heterogeneous (i.e., it exists in several different phosphorylated forms). Third, many signaling molecules are present at low abundance within cells, and in these cases enrichment is a prerequisite for analysis. Fourth, most analytical techniques used for studying protein phosphorylation have a limited dynamic range, which means that although major phosphorylation sites might be located easily, minor sites might be difficult to identify. Finally, phosphatases could dephosphorylate residues unless precautions are taken to inhibit their activity during the preparation and purification of cell lysates.
Therefore, the study of protein phosphorylation requires analytical techniques with a high dynamic range.
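The combined effect of low stoichiometry and low abundance on the dynamic range an assay must cover can be sketched with a short calculation. The copy numbers and the stoichiometry below are hypothetical values chosen only to illustrate the arithmetic; they are not measurements from this work, and the function names are likewise illustrative.

```python
import math


def phosphorylated_copies(total_copies: float, stoichiometry: float) -> float:
    """Return how many copies of a protein carry the phosphate at a given site."""
    if not 0.0 <= stoichiometry <= 1.0:
        raise ValueError("stoichiometry must lie between 0 and 1")
    return total_copies * stoichiometry


def orders_of_magnitude(most_abundant: float, least_abundant: float) -> float:
    """Return the dynamic range, in powers of ten, spanned by two abundances."""
    if most_abundant <= 0 or least_abundant <= 0:
        raise ValueError("abundances must be positive")
    return math.log10(most_abundant / least_abundant)


# Hypothetical values, chosen only to illustrate the arithmetic.
signaling_protein_copies = 1_000      # low-abundance signaling protein, copies per cell
abundant_protein_copies = 1_000_000   # abundant background protein, copies per cell
site_stoichiometry = 0.05             # 5% of the signaling protein is phosphorylated

phospho_form = phosphorylated_copies(signaling_protein_copies, site_stoichiometry)
dynamic_range = orders_of_magnitude(abundant_protein_copies, phospho_form)

print(f"phosphorylated copies per cell: {phospho_form:.0f}")
print(f"required dynamic range: {dynamic_range:.1f} orders of magnitude")
```

Under these assumed numbers, the phosphorylated form sits more than four orders of magnitude below the abundant background protein, which is why enrichment before analysis is usually needed.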
1-2.2 Analytical Techniques for Protein Phosphorylation
Several analytical techniques have been developed for the analysis of protein phosphorylation. The conventional approach for characterizing protein phosphorylation relies on ³²P labeling with 2D-PAGE and quantification by scintillation counting or autoradiography [11]. The protein and its phosphorylation sites can then be identified by Edman sequencing [12]. However, these techniques require relatively long sample-handling times and are inadequate for complex protein mixtures. In addition, ³²P labeling suffers from the difficulties associated with the use of radioactive isotopes, and Edman sequencing requires large quantities of sample and is limited by its long analysis time. In recent years, mass spectrometry has become the method of choice for protein phosphorylation analysis. The table below compares mass spectrometry-based approaches (mainly LC-MS/MS) with methods using Edman sequencing and ³²P labeling [10]. Mass spectrometry combines high sensitivity and throughput with site localization, and it requires neither radioactivity nor a homogeneous protein, which makes it better suited than the other methods to identifying and characterizing phosphoproteins.
Property                               ³²P labeling     Edman sequencing                        Mass spectrometry
Sensitivity                            most sensitive   less sensitive                          highly sensitive
Radioactivity                          yes              in some methods                         no
Localization of phosphorylation sites  no               yes (tyrosine sites can be difficult)   yes
Sample throughput                      very slow        slow                                    high
Homogeneous protein required           yes              yes                                     no
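One reason mass spectrometry can detect phosphorylation is that each phosphate group adds a fixed mass to a peptide: the monoisotopic mass of HPO3, about 79.966 Da. The following is a minimal sketch of that idea, not the procedure used in this work. The peptide sequence, the masses, the tolerance, and the function names are hypothetical, and only serine, threonine, and tyrosine are considered as acceptor residues.

```python
from typing import List, Optional, Tuple

PHOSPHO_SHIFT_DA = 79.96633   # monoisotopic mass added per phosphate (HPO3)
ACCEPTOR_RESIDUES = "STY"     # assumption: only O-phosphorylation on Ser/Thr/Tyr


def candidate_sites(peptide: str) -> List[Tuple[int, str]]:
    """List (1-based position, residue) for every possible acceptor residue."""
    return [(i + 1, aa) for i, aa in enumerate(peptide) if aa in ACCEPTOR_RESIDUES]


def count_phosphates(observed_mass: float, unmodified_mass: float,
                     tolerance_da: float = 0.01, max_sites: int = 5) -> Optional[int]:
    """Return the number of phosphates that explains the mass difference, or None."""
    delta = observed_mass - unmodified_mass
    for n in range(max_sites + 1):
        if abs(delta - n * PHOSPHO_SHIFT_DA) <= tolerance_da:
            return n
    return None


# Hypothetical peptide and masses, for illustration only.
peptide = "AGSPTYK"
unmodified = 1000.0
observed = unmodified + 2 * PHOSPHO_SHIFT_DA

print(candidate_sites(peptide))
print(count_phosphates(observed, unmodified))
```

The mass difference alone gives only the number of phosphate groups. Assigning them to particular residues among the candidates requires fragment-ion (MS/MS) data, which this sketch does not model.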