1
Introduction to Functional Proteomics
2
Course outline
• Introduction
• Protein overview
• Protein separation, purification, and analysis
• Sample preparation
• One-dimensional electrophoresis
• Two-dimensional electrophoresis
• Image analysis
• Types of mass spectrometers
• MS for protein analysis and identification
• Protein modifications
• Drugs vs. proteomics
• Microarray and SNP
• Modification of proteins (MS analysis)
• Applications of proteomics
• Student presentation
3
Every story on Earth begins this way ... 15 billion years ago
The Big Bang
Newton magazine (1994), issue 132, p. 20
4
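[Figure: soap-bubble model of the universe. Labels: great chaos, Big Bang, the universe clears, the present universe. Time marks: 10^-44 s, 10^-34 s, 100,000 years, 15 billion years.]
Newton magazine (1994), issue 129, p. 116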
5
From elementary particles to atoms
[Figure: elementary particles (+ and - charges), atomic nuclei, and atoms.]
Newton magazine (1991), issue 93, p. 103
6
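Key events in the early evolution of the Earth
[Figure: asteroid collisions, magma ocean, crust formation, first great rain, sky clears. Time marks: 4.6 billion years ago, 3.8 billion years ago.]
• The Earth's water was brought by meteorites
• Only a thin layer of the Earth, the crust, is cold
Newton magazine (1994), issue 132, p. 37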
7
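From basic small molecules to building-block molecules
[Figure: basic small molecules made of H, C, N, and O (atomic numbers 6, 7, and 8 for C, N, and O) and the building-block molecules formed from them.]
Campbell (1999) Biochemistry (3e) p.16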
8
9 10
11
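The post-genomic era is moving into functional research centered on proteome science. This will lead us to discover the interactions among proteins, give us opportunities to understand disease pathways and develop new drugs, and support the development of therapies, with fundamental prevention strategies as the ultimate goal.
Beyond the Genome
• Proteins are ultimately responsible for all biological processes that take place within cells.
• Protein dynamics reflect the state of a biological system at a given time.
• Detection and identification of post-translational modifications (PTMs)
Proteomics and the new biology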
12
Definitions (will be on the exam)
• Genome: the entire collection of genes in an organism.
• Genomics: the study of the genome, the gene complement of an organism.
• Proteome: the protein complement of the genome/organism (at a specific time and in a specific target).
• Proteomics: the study of the proteome, the protein complement of the genome/organism.
• Functional proteomics
13
"The analysis of the entire protein complement expressed by a genome, or by a cell or tissue type."
Wasinger VC et al. Progress with gene-product mapping of the mollicutes: Mycoplasma genitalium. Electrophoresis 16 (1995) 1090-1094
14
From Genotype to Phenotype
• Genome: DNAs
• Transcriptome: RNAs
• Proteome: Proteins
• Physiome: Metabolites
• Biome: Environment
15
The proteome
1. The proteome consists of all of the proteins expressed by a cell under specific conditions or stimulation.
2. The proteome of an organism depends on cell or tissue type, developmental stage, environment/stimuli, nutritional and metabolic status, etc.
3. The genome of an organism is fixed; the proteome, however, is dynamic.
4. The proteome is much larger than the genome: each gene can give rise to multiple protein isoforms.
5. The proteome is much harder to analyze than the genome, especially for proteins expressed at low levels.
16
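"Omics": the use of large-scale, global research methods (including protein identification, cell and molecular biology techniques, bioactivity assays, and computation) to study the physiological roles of molecules across the whole organism.
"Omics" in life science
• Genomics: arrangement, identification, and discovery of genes
• Transcriptomics: RNA expression
• Proteomics
• Metabolomics: metabolites in a specific organism
"Omics" terms symbolize a redefinition of how we think about biology and the workings of living systems.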
17
Current -omics
18
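Proteomics
Narrow definition: the study of the types, properties, and abundance of all proteins in a specific target (tissue, cell, organelle, ...) at a specific point in time.
Broad definition: in addition to the narrow definition, it adds rigorous experiments (control and experimental groups) and combines techniques and knowledge from physiology, pharmacology, molecular biology, and bioinformatics to explore the roles that protein interactions may play in the organism.
Don't forget that the proteome is dynamic, changing to reflect the environment that the cell is in.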
19
Proteomics Segmentation
[Diagram labels:]
• Protein Discovery: Identification / Characterization; Expression Quantitation; Differential Display; PTMs
• Protein Expression
• Protein Function: Protein-Protein Interaction; Protein Function Assays; Metabolic Pathway Analysis
• Protein Structure Analysis: 3-D Protein Structure; Computer Modeling
• Protein Complexes: Cross Linking Studies; Higher Order Protein Structure; Epitope Mapping; Active Site Investigation
• Drug Targets
• BioMarkers & Diagnostics
• Protein Therapeutics
20
[Diagram: flow of genetic information]
DNA → (transcription) → mRNA → (translation) → Proteins → (modification: phosphorylation, methylation, glycosylation, ...) → Functional protein → Cell function
• Genome: studied by genomics
• Proteome: studied by proteomics
In humans: 40,000 genes → mRNA (1,000,000) → protein
21
Proteome: Wilkins and Williams, 1994
Proteomics:
• Yates: defined proteomics as the scientific discipline of characterizing and analyzing the proteins, interactions, and modifications of an organism.
• Gygi and Aebersold: defined proteomics as the ability to systematically identify every protein expressed in a specific target (cell, tissue, ...) and determine the salient properties of each protein (such as abundance and modification).
• Wagner: defined proteomics as the entire profile of all the proteins expressed by a specific target under strictly defined conditions at a given time.
22
Genomics vs Proteomics
• Genomics: same within an organism. Proteomics: different within an organism, conditionally and functionally.
• Genomics: DNA code revealed and shared. Proteomics: protein content revealed and shared.
• Genomics: linear information. Proteomics: multi-dimensional information.
Rich information content of proteomics
23
Definitions of proteomics (depending on the question you ask)
• Expression proteomics: identifying all the proteins in an organism
• Functional proteomics: determining how those proteins join forces to form networks
• Structural proteomics: outlining the precise three-dimensional structure of the proteins
24
25
Types of Proteomics
• Protein Expression
– Quantitative study of protein expression between samples that differ by some variable
• Structural Proteomics
– Goal is to map out the 3-D structure of proteins and protein complexes
• Functional Proteomics
– To study protein-protein interactions, 3-D structures, cellular localization, and PTMs in order to understand the physiological function of the whole proteome.
26
If we can measure gene expression, why bother with proteomics?
We can measure DNA → mRNA → predicted protein expression, but predicted expression is not the same as the actual protein.
mRNA level:
1. Stability
2. Efficiency of translation
Protein level:
1. Stability (degradation)
2. Turnover (transcription factors, signal transduction, cell cycle, ...)
1 gene = 1 protein?
In fact, the definition of a gene is debatable (promoter, pseudogene, gene product, etc.).
1 gene = how many proteins?
Why? (function)
27
Co- and Post-translational modification
Proteomics and posttranslational modifications
Patterson and Aebersold, Nature Genetics (supp.), 33, 311 (2003)
[Figure: eukaryotic cell. Examples of protein properties are shown, including the interaction of proteins and protein modifications. Labels: protein-ligand interactions; protein complexes (machines); protein families (activity or structural); post-translationally modified proteins.]
28
Proteomic Analysis of Post-translational Modifications
Post-translational modifications (PTMs)
– Covalent processing events that change the properties of a protein
• proteolytic cleavage
• addition of a modifying group to one or more amino acids
– Determine its activity state, localization, turnover, and interactions with other proteins
– Mass spectrometry and other biophysical methods can be used to determine and localize potential PTMs
• However, PTMs remain a challenging aspect of proteomics with current methodologies
29
Mann and Jensen, Nature Biotech. 21, 255 (2003)
The post-translational modifications
30
Name | Site | Modification | Mass change, Δm (Da)
N-terminal acetylation | Terminal NH2- | Replaced by CH3CONH- | 42
N-terminal formylation | Terminal NH2- | Replaced by HCONH- | 28
N-terminal myristylation | Terminal NH2- | Replaced by CH3(CH2)12CONH- | 210
N-terminal palmitoylation | Terminal NH2- | Replaced by CH3(CH2)14CONH- | 238
C-terminal amidation | Terminal -COOH | Replaced by -CONH2 | -1
Disulfide bonds | 2 Cys -SH | Replaced by -S-S- | -2
Glycosylation (N-linked) | N-X-S/T | |
Glycosylation (O-linked) | S/T | |
Sulfation | -OH of Y | Replaced by -OSO3H | 80
Phosphorylation | -OH of Y/S/T | Replaced by -OPO3H2 | 80
N-methylation | -NH2 of K/R/H/Q | Replaced by -NHCH3 | 14
O-methylesterification | -COOH of E/D | Replaced by -COOCH3 | 14
Carboxylation | -NH2 of E/D | Replaced by -NHOCH3 | 30
Hydroxylation | -NH2 of P/K/D | Replaced by -NHOH | 16
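The mass changes in the modification table are what a mass spectrometer actually sees: a modified peptide is heavier (or lighter) than the unmodified one by Δm. The following is a minimal sketch of that arithmetic in Python, using a subset of the nominal values from the table. The function names and the tolerance are illustrative assumptions, and real analyses use monoisotopic masses rather than these rounded values.

```python
# Nominal mass shifts in Da, copied from a subset of the modification table.
# These are rounded values; real searches use monoisotopic masses.
PTM_MASS_SHIFT = {
    "N-terminal acetylation": 42,
    "N-terminal formylation": 28,
    "C-terminal amidation": -1,
    "Disulfide bond": -2,
    "Sulfation": 80,
    "Phosphorylation": 80,
    "N-methylation": 14,
    "O-methylesterification": 14,
    "Hydroxylation": 16,
}


def modified_mass(unmodified_mass, modifications):
    """Expected mass after applying each named modification once."""
    return unmodified_mass + sum(PTM_MASS_SHIFT[name] for name in modifications)


def candidate_ptms(observed_mass, unmodified_mass, tolerance=0.5):
    """Single modifications whose mass shift explains the observed difference."""
    delta = observed_mass - unmodified_mass
    return sorted(
        name for name, shift in PTM_MASS_SHIFT.items()
        if abs(delta - shift) <= tolerance
    )


if __name__ == "__main__":
    print(modified_mass(1000.0, ["Phosphorylation"]))
    print(candidate_ptms(1080.0, 1000.0))
```

Note that a shift of 80 Da matches both phosphorylation and sulfation at nominal resolution, which is one reason PTM assignment remains difficult.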
31
One functional protein = one physiological function
Interactions with other proteins (complex formation, modification) under different conditions give rise to different functions:
• Function A (complex; transcription factor)
• Function B (modification; phosphorylation)
• Function C
• ...
32
33
Difference between protein chemistry and proteomics
Protein chemistry:
• Individual proteins
• Complete sequence analysis
• Emphasis on structure and function
Proteomics:
• Complex proteins (interactions)
• Partial sequence
• Emphasis on identification by database matching
• Systems biology
34
Why do proteomics?
1. It focuses on the gene product. In an organism, proteins, not mRNA or DNA, directly carry out biological functions.
2. mRNA expression analysis (array chips, Northern blots, ...) does not always reflect the expression level of the protein.
3. It analyzes protein modifications, which cannot be determined from mRNA or DNA.
4. Biological samples such as CSF, serum, and urine are not suitable for mRNA analysis.
5. It analyzes the location of proteins.
6. It analyzes protein-protein interactions.
35
The specific aims of PROTEOMICS
• The protein expression profile in a specific target
• The modifications of proteins
• The response to different insults (disease, drug response)
• Properties that cannot be predicted from the gene or mRNA sequence, such as modifications of specific proteins under different stimulation/conditions
• Finding proteins that may play an important role in a physiological function
36
Applications of Proteomics
• Mining: identifying all (possible) proteins in a sample, whereas DNA/gene microarrays only predict them. Combines MS, databases, and software.
• Protein-expression profiling: identifying the proteins in a particular sample as a function of a particular state of the organism (e.g. differentiation, development, disease). The information provides a basis for future study.
• Protein-network mapping: how proteins interact with each other in living systems. The interactions include signal transduction cascades and biosynthetic or degradation pathways.
• Mapping of protein modifications: the task of identifying how and where proteins are modified (specific modifications on specific sites).
37
New biological tool-proteomics
NCBI (U.S. National Library of Medicine) MEDLINE (biomedical literature database): hits for the keyword "proteomics"
• 1995: about 10
• 2005: 3,570
• October 2006: 8,960
[Chart: global output value of proteomics, 1999-2005, in units of 100 million USD (axis 0-60). Data labels: 7.8, 10.8, 15.8, 23.8, 32.8, 44.8, 56.8. Source: ITRI (Industrial Technology Research Institute).]
Proteomics-related journals
38 39
The challenges of proteomics
• Enormous number of proteins, some expressed at levels too low to detect easily: 30,000-40,000 genes might produce 3,000,000-40,000,000 proteins, and protein expression levels are not predictable from mRNA expression levels.
• Variation of the same type of protein in different organisms: human and orangutan genes are > 99% similar, but different protein expression results in different appearance. Proteins are uniquely modified and processed in ways that are not apparent from the gene sequence.
• Different conditions may induce changes in proteins but not in genes: food, living environment, drugs, disease, ... induce protein changes. Proteomes are dynamic and reflect the state of biological systems.
• Data bank
40
The landmark of proteomics
1950 Smith: gel electrophoresis
1975 O'Farrell: 2-D electrophoresis
1985 Karas and Hillenkamp develop MALDI; Fenn develops ESI
1990 Human Genome Project launched by the U.S. Department of Energy and the National Institutes of Health: 3,000,000,000 (3 billion) nucleotides of DNA to sequence completely
1994 Wilkins and Williams first use "proteomics"
1997 First book about "proteomics"
2003 Human genome completed
41
Application of proteomics
1. New drug development
2. Disease diagnosis
3. Provision of instruments and equipment
4. Development of technology platforms
42
Proteomics (summary)
• Annotation of genomes, i.e. functional annotation
– Genome + proteome = annotation
• Protein function
• Protein post-translational modification
• Protein localization and compartmentalization
• Protein-protein interactions
• Protein expression studies
– Differential gene expression is not the answer
43
Protein Production Pathway
The mRNA level ≠ the expressed protein level, nor does it indicate the nature of the functional protein product.
Genomic sequence → (transcriptional regulation) → mRNA → (translational regulation) → Protein product → (post-translational regulation) → Functional protein product
44
Traditional RNA analysis technique : Northern blotting
1. Estimated time to get results: 2-3 days
2. Expressed genes (mRNA) checked: 1-8 species
3. Accuracy: low to moderate
[Figure labels: electrophoresis, blotting, labelling, probing, developing]
45
New RNA analysis technique: microarray (with movie)
• Labelled sample mRNA serves as the probe
• cDNA or oligonucleotides are spotted on chips
1. Estimated time to get results: 5-7 days
2. Expressed genes (mRNA) checked: thousands
3. Accuracy: moderate to high
Data analysis (high-throughput): clustered genes, clustered experiments
46
Alizadeh et al.
Nature 403 (2000) 503-511
microarray
47
Microarrays revolutionized biological and medical research
• Before: one gene at a time; now: tens of thousands simultaneously - PROTEOMICS
• Gene expression
• Gene-disease relation
• Gene-gene interaction
• Finding Co-Regulated Genes
• Understanding Gene Regulatory Networks
• Many, many more
48
cDNA and Oligonucleotide Microarrays
Types of microarrays
1. cDNA arrays (microspotting)
2. Oligonucleotide arrays (photolithographic synthesis, ink-jet technology, etc.)
Applications
• Gene expression
• SNPs
• Mutations
• Deletions and insertions
• Genotyping
49 50
Basic idea of Microarray
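• Manufacturing principle
– Base sequences complementary to characteristic genes, called probes, are arrayed on a microchip
• Application principle
– A liquid sample containing gene sequences is poured onto the microchip
– Using complementary base hybridization, the required information is extracted from how the sample interacts with the gene sequences on the microchip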
51
Basic idea of Microarray
• Construction
– Place array of probes on microchip
• Probe (for example) is oligonucleotide ~25 bases long that characterizes gene or genome
• Each probe has many, many clones
• Chip is about 2cm by 2cm
• Application principle
– Put the (liquid) sample containing genes on the microarray, allow probe and gene sequences to hybridize, and wash away the rest
– Analyze hybridization pattern
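The hybridization step described above can be sketched in a few lines of Python. This is a simplified model, not a simulation of real chemistry: it assumes perfect Watson-Crick pairing with no mismatches, and the probe names and sequences are hypothetical.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the strand that would pair with `sequence`."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def hybridizes(probe, target):
    """True if the probe can pair perfectly with some stretch of the target."""
    return reverse_complement(probe) in target


def hybridization_pattern(probes, sample):
    """Map each probe name to whether any sample sequence binds it."""
    return {
        name: any(hybridizes(probe, target) for target in sample)
        for name, probe in probes.items()
    }


if __name__ == "__main__":
    probes = {"probe_1": "ATGC", "probe_2": "GGGG"}  # hypothetical probes
    sample = ["TTGCATTT", "ACACAC"]                   # hypothetical sample
    print(hybridization_pattern(probes, sample))
```

The resulting dictionary plays the role of the hybridization pattern: spots whose probe found a partner in the sample light up, and everything else is washed away.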
52
Fabrication
Fabrication via printing
• DNA solution pre-synthesized in the lab
• DNA sequence stuck to glass substrate
Fabrication in situ
• Sequence "built"
• Photolithographic techniques use light to release capping chemicals
• 365 nm light allows 20-μm resolution
53
On-chip photolithographic synthesis of oligonucleotides
[Figure: light passes through a mask and deprotects selected positions on the substrate; a nucleotide (T, then C in the example) is coupled at the deprotected positions; the cycle is repeated with a new mask.]
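The mask-and-couple cycle of on-chip photolithographic synthesis can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions: each mask is represented as the set of spot indices exposed to light, coupling is assumed to be perfectly efficient, and the function names and target sequences are hypothetical.

```python
def plan_cycles(targets):
    """Derive (base, exposed_spots) cycles that build every target sequence.

    For each sequence position, one cycle per base exposes only the spots
    whose next nucleotide is that base.
    """
    longest = max(len(target) for target in targets)
    cycles = []
    for position in range(longest):
        for base in "ACGT":
            exposed = {
                spot for spot, target in enumerate(targets)
                if position < len(target) and target[position] == base
            }
            if exposed:
                cycles.append((base, exposed))
    return cycles


def synthesize(num_spots, cycles):
    """Apply each cycle: light deprotects the exposed spots, then the base couples."""
    spots = ["" for _ in range(num_spots)]
    for base, exposed in cycles:
        for spot in exposed:
            spots[spot] += base
    return spots


if __name__ == "__main__":
    targets = ["TC", "TT", "CA"]  # hypothetical probe sequences, one per spot
    cycles = plan_cycles(targets)
    print(cycles)
    print(synthesize(len(targets), cycles))
```

Because at most four cycles are needed per position, the number of masks grows with probe length rather than with the number of spots, which is what makes the approach practical for dense arrays.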
54
cDNA microarray schema
Principle of cDNA chip fabrication
55
GeneChip for gene expression profile
[Figure labels: Specimens, Labeled Targets, Bioinformatics]
56
Target Preparation
[Figure: Specimens → mRNA (poly-A tail, AAAA) → cDNA → Labeled Targets (cDNA or cRNA) → Labeled Fragments]
57
Rat Genome Chip
8799 Probe Sets
58
Rat-Tox Chip
1031 Probe Sets
59
2-Dye Technology of cDNA Microarrays
[Figure: control and test samples → RT and labeling with fluorescent dyes (Cy3, Cy5) → cDNA microarrays → scanning → bioinformatics]
60
Microarray analysis
Operation Principle:
• Samples are tagged with fluorescent material to show the pattern of sample-probe interaction (hybridization)
• A microarray may have 60K probes
• The gene expression profiles of two samples are compared
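Comparing two samples on a two-dye array usually comes down to a per-gene intensity ratio. The sketch below, in Python, computes log2 ratios and makes a simple up/down call. It assumes the test sample is labeled with Cy5 and the control with Cy3 (the slides do not say which is which), and the gene names, intensities, and two-fold threshold are made up for illustration. Real pipelines also subtract background and normalize between channels.

```python
import math


def log2_ratios(cy5, cy3):
    """log2(Cy5 / Cy3) for every gene with a positive signal in both channels."""
    return {
        gene: math.log2(cy5[gene] / cy3[gene])
        for gene in cy5
        if gene in cy3 and cy5[gene] > 0 and cy3[gene] > 0
    }


def classify(ratios, threshold=1.0):
    """Call a gene 'up' or 'down' when |log2 ratio| reaches the threshold."""
    calls = {}
    for gene, ratio in ratios.items():
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    # Hypothetical spot intensities: test sample in Cy5, control in Cy3.
    cy5 = {"geneA": 800.0, "geneB": 100.0, "geneC": 210.0}
    cy3 = {"geneA": 200.0, "geneB": 400.0, "geneC": 200.0}
    print(classify(log2_ratios(cy5, cy3)))
```

A threshold of 1.0 on the log2 scale corresponds to a two-fold change in either direction.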
61
Microarray Processing sequence
From: Shin-Mu Tseng tsengsm@mail.ncku.edu.tw
62
63
Demonstration
http://www.bio.davidson.edu/courses/genomics/chip/chip.html
64
One gene, one protein
65
Traditional Protein technique:
peptide sequencing
1. Protein purification: necessary
2. Proteins identified: 1 per purified sample
Workflow: cut the desired band → peptide N-terminal sequencing (Edman degradation) → database searching for homologs
66
Protein analysis technique : Western blotting
67
Protein analysis technique : Western blotting
[Figure: Western blot; values shown: 2.14, 1.12, 0.91, 1, 1, 1.98, 2.2, 2.0, 3.14, 3.98, 0.95]
68
New protein analysis technique : Protein-array
69
New protein analysis technique : Protein-array
70
71 72
Applications of proteomics to Medicine
• Detection of disease markers in body fluids
• Pharmaceutical studies
• Toxicological studies
• Glycosylation and phosphorylation of proteins and hormones
• Tissue analysis for cancer research
73
[Diagram: from the lab to industry]
Lab:
• DNA: sequencing
• RNA: microarray
• Protein: proteomics
• Cell: cellular assay
• Animal: transgenic
Industry: disease model; clinical diagnosis; SNP; gene therapy; recombinant DNA; protein expression; functional assay; diagnosis (antibody); disease therapy; structure modeling for compound screening; screening
74
The role of proteomics in the new drug development process
[Diagram labels: disease pathway; proteomics provides drug targets; virtual screening; high-throughput drug screening; protein array; chemical array; protein drug; vaccine; clinical trial]
75
From proteomics to medicine
• Identification of disease markers: diagnosis, patient profiling
• Identification of drug targets: discovery, validation, clinical trials, patient monitoring
• Identification of protein-protein and protein-drug interactions: side-effect prediction, potency prediction, ...
76
Drug discovery process
77 78
The essential tools of proteomics
• Protein separation or collection
• Mass spectrometry (MS)
• Databases
• Software (for MS databases)
79
• Sample generation
– Origin of sample
• Sample processing
– Gels (1D/2D), columns, other methods
• Mass Spectrometry
– Spectra, machine and componentry types, parameter, processing methods
• In Silico analysis
– Database name + version, partial sequence, search parameters, search hits, accession numbers
80
81
Proteomics sample analysis workflow
Framework: ITRI (Industrial Technology Research Institute)
82
Proteomic study steps
Step 1: Sample Preparation
Step 2: Isoelectric Focusing
Step 3: SDS Polyacrylamide Gel Electrophoresis
Step 4: Staining of the Gels
Step 5: Scanning of Gels and Image Analysis
Step 6: 2D DIGE or 2D electrophoresis
Step 7: Spot Excision
Step 8: Sample Destaining
Step 9: In-gel Digestion
Step 10: Microscale Purification
Step 11: Chemical Derivatisation of the Peptide Digest
Step 12: MS Analysis
Step 13: Calibration of the MALDI-ToF MS
Step 14: Preparing for a Database Search
Step 15: PMF Database Search Unsuccessful
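The peptide mass fingerprint (PMF) search in the last steps compares measured peptide masses with masses predicted from database sequences. The following Python sketch shows the idea under stated assumptions: digestion is modeled as ideal trypsin cleavage (after K or R, but not before P, with no missed cleavages), masses are neutral monoisotopic masses, and the function names, the example sequence, and the tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.2):
    """Number of observed masses explained by the sequence's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        1 for mass in observed_masses
        if any(abs(mass - t) <= tolerance for t in theoretical)
    )


if __name__ == "__main__":
    print(tryptic_digest("MKWVTFR"))
    print(count_matches([277.15, 500.0], "MKWVTFR"))
```

In a real search the match count (or a score derived from it) is computed for every database entry, and the best-scoring protein is reported as the identification.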
83
The front end of an exciting system
[Diagram labels: sample prep; 2-D gels or blots; imaging; PDQUEST; spot cutter; protein digest station; mass spec (MALDI-TOF); MS/MS, ESI-MS; ProteinLynx; PepSeq; WorksBase; SWISS-PROT, TrEMBL, etc.; HTML links]
84
Identification of 2-D Separated Proteins
[Diagram labels:]
• Protein isolation: biological material (tissue, cells, body fluids); biological pre-fractionation; biochemical pre-fractionation; 2-D gel protein spot; excision; blotting; 2-D gel spot immobilized on a membrane; digestion; peptide mix; HPLC; fractionated peptide
• Characterization: amino acid analysis (amino acid composition); Edman degradation (N-terminal sequence); carboxypeptidase; MALDI-MS (peptide mass fingerprint); ESI-MS/MS (sequence tag, de novo sequence); database search; protein ID
• Automated characterization: detection; imaging; excision; analysis software; ProteomeWorks Spot Cutter