Protein structure and function
The Function of Proteins
Enzymes: biological catalysts.
Immunoglobulins: antibodies of the immune system.
Transport: move materials around, e.g. hemoglobin for O2.
Regulatory: hormones, control metabolism.
Structural: coverings and support (skin, tendons, hair, nails, bone).
Movement: muscles, cilia, flagella.
DNA Protein
The life cycle of a protein
Functional protein Families
Science (2001):
15 %: intermediary and nucleic acid metabolism
15-20 %: structure and protein metabolism (cytoskeleton, chaperones, mediators of degradation)
20-25 %: signal transduction or DNA-binding proteins
40 %: genes encoding protein products of unknown function
General flow scheme for proteomic analysis
[Figure: flow scheme. Surviving labels: sample (clinical, experiment); proteome; protein mixture; separation; protein; digestion; peptide mixture; peptide; mass spectrometer; data analysis; identification]
1° primary, 2° secondary, 3° tertiary, 4° quaternary
Nelson & Cox (2000) Lehninger Principles of Biochemistry
Amino acid 2 amino acids peptide polypeptide
The general structural formula of amino acids:
H2N-CH(R)-COOH (uncharged form)
+H3N-CH(R)-COO- (zwitterion)
pK1: α-carboxylic acid group
pK2: α-amino group
pK1 values of the α-carboxylic acid groups lie in a small range around 2.2, so above pH 3.5 these groups are almost entirely in the carboxylate form (COOH → COO-).
pK2 values of the α-amino groups lie in a small range around 9.4, so below pH 8.0 these groups are almost entirely protonated (NH2 → NH3+).
• Under normal cellular conditions amino acids are zwitterions (dipolar ions):
Amino group = -NH3+ Carboxyl group = -COO-
Acidic environment: +H3N-CH(R)-COOH, net charge +1
Neutral environment: +H3N-CH(R)-COO-, net charge 0
Alkaline environment: H2N-CH(R)-COO-, net charge -1
pK1 ~ 2, pK2 ~ 9
Isoelectric point (pI) = (pK1 + pK2) / 2 = 5.5
[Figure: titration curve, pH (0 to 12) against added [OH-]; pK1, pI and pK2 are marked on the curve]
Amino Acids Have Buffering Effect
Juang RH (2004) BCbasics
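The charge states and the isoelectric point above can be checked numerically. The following is a minimal sketch, assuming an amino acid with a non-ionizable side chain and the approximate values pK1 ~ 2 and pK2 ~ 9 from the slide; the function names are illustrative only, and the Henderson-Hasselbalch relation is used to estimate the fraction of each group that is charged.

```python
def net_charge(ph, pk1=2.0, pk2=9.0):
    """Average net charge of an amino acid with a non-ionizable side chain."""
    # Fraction of alpha-amino groups that are protonated (-NH3+, charge +1)
    amino = 1.0 / (1.0 + 10 ** (ph - pk2))
    # Fraction of alpha-carboxyl groups that are deprotonated (-COO-, charge -1)
    carboxyl = 1.0 / (1.0 + 10 ** (pk1 - ph))
    return amino - carboxyl


def isoelectric_point(pk1=2.0, pk2=9.0):
    """pI = (pK1 + pK2) / 2 when only the two alpha groups ionize."""
    return (pk1 + pk2) / 2


for ph in (1.0, isoelectric_point(), 12.0):
    print(f"pH {ph:4.1f}: net charge {net_charge(ph):+.2f}")
```

The net charge is close to +1 in acid, zero at the pI and close to -1 in alkali. Near pK1 and pK2 the charge changes gradually with pH, which is the buffering effect named above.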
Four aliphatic amino acid structures (non-polar)
[Figure: side-chain structures; surviving label: Isoleucine (Ile, I)]
Aromatic amino acid structures
Methionine and cysteine
Non-polar aliphatic
Methionine: first amino acid; cysteine: disulfide bond
Formation of cystine
Side Chains with Alcohol Groups
• Serine (Ser, S) and Threonine (Thr, T) have uncharged polar side chains
Glycosylation and phosphorylation
Phosphorylation is reversible and is used in many pathways to control activity. Enzymes that add a phosphate to a hydroxyl side chain are commonly called kinases. Enzymes that remove a phosphate from a phosphorylated side chain are called phosphatases.
Phosphorylation
Glycosylation
There are two basic types of glycosylation, which occur on:
(a) N-linked: asparagine
(b) O-linked: serine and threonine
1. Covalently attached to the polypeptide as oligosaccharide chains containing 4 to 15 sugars
2. Sugars frequently comprise 50% or more of the total molecular weight of a glycoprotein
3. Most glycosylated proteins are either secreted or remain membrane- bound
4. Glycosylation is the most abundant form of post-translational modification
5. Glycosylation confers resistance to protease digestion by steric protection
6. Important in cell-cell recognition
The biological function of protein glycosylation
N-linked glycosylation on asparagine (Asn) side chains:
• an alkali-stable bond between the amide nitrogen of asparagine and the C-1 of an amino sugar residue
• occurs co-translationally in the endoplasmic reticulum (ER) during synthesis
• lipid-linked oligosaccharide complex is transferred to polypeptide by oligosaccharyl transferase
• target sequence or consensus site on protein is Asn-X-Ser/Thr
• further processing in Golgi apparatus
Examples:
Heavy chain of immunoglobulin G (IgG)
Hen ovalbumin
Ribonuclease B
N-LINKED OLIGOSACCHARIDES
There are two types, which both share a common pentasaccharide core (shaded yellow in the next two slides):
High-mannose type
Complex type
A COMPLEX TYPE N-LINKED OLIGOSACCHARIDE
O-linked glycosylation on serine (Ser) or threonine (Thr) side chains
• an alkali-labile bond between the hydroxyl group of serine or threonine and an amino sugar
• carried out by a class of membrane-bound enzymes called glycosyl transferases which reside in the endoplasmic reticulum (ER) or the Golgi apparatus
• nucleotide-linked monosaccharides added to protein side chain one at a time
• Linked one at a time to OH of Ser, Thr or OH in modified amino acids (e.g. Hydroxylysine of collagen)
• Example: Blood group antigens on erythrocyte surface:
• The A antigen and B antigen are pentasaccharides which differ in the composition of the 5th sugar residue
• The O substance is a tetrasaccharide which is missing the 5th residue and does not elicit an antibody response (non-antigenic).
Protein glycosylation takes place in the ER and Golgi
The endoplasmic reticulum- ER
– A continuous cytoplasmic network studded with ribosomes and functions as a transport system for newly synthesized proteins.
The Golgi complex
– An organelle consisting of stacks of flat membranous vesicles that modify, store, and route products of the ER.
N-linked glycosylation begins in the ER and continues in the Golgi apparatus (via dolichol phosphate).
O-linked glycosylation takes place only in the Golgi apparatus.
In the Golgi:
1. O-linked sugar units are linked to proteins.
2. N-linked glycoproteins continue to be modified.
3. Proteins are sorted and sent to lysosomes, secretory granules or the plasma membrane according to signals encoded by amino acid sequences.
Glycosylation starts in the endoplasmic reticulum and continues in the Golgi.
GLYCOSYLATION
Glycosylated proteins, probably half of all the proteins in animals, include:
Most proteins in the extracellular matrix
Most proteins on the plasma membrane and in the lysosome
Most proteins in blood (albumin is a notable exception)
Coat proteins of viruses made in animal cells
Function of glycosylation:
Makes proteins more hydrophilic
Increases stability by decreasing protease accessibility of the backbone
Provides specific recognition handles for targeting
Glycosylation is not limited to proteins.
Lipids are glycosylated (glycolipids) altering their presentation in the membrane and some glycolipids (e.g. gangliosides) can have messenger character.
The liver makes extensive use of glycosylation during detoxification of foreign substances.
Precursor synthesis
The oligosaccharide is assembled sugar by sugar onto the carrier lipid dolichol
High energy pyrophosphate bond
(We shall discuss only the synthesis of a complex N-linked oligosaccharide.)
1. Synthesis of Nucleotide Sugars
Every sugar is first linked to a nucleotide. This is accomplished by a series of reactions whose details need not concern us here.
2. Synthesis of Lipid-linked Oligosaccharides
In this stage, which takes place in the endoplasmic reticulum, the oligosaccharide is assembled onto a very hydrophobic lipid: dolichol phosphate.
3. Transfer of the Oligosaccharide to the Protein
The enzyme protein-oligosaccharyltransferase transfers the oligosaccharide en bloc onto an asparagine residue of the protein. This reaction also takes place in the ER.
4. Processing of the Protein-Bound Oligosaccharide
Biosynthesis of dolichol pyrophosphoryl oligosaccharide precursor
Strongly hydrophobic lipid (79-95 carbon)
Oligosaccharide side chain may promote folding and stability of glycoproteins
Consensus:
Asn-X-Ser/Thr
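Candidate N-glycosylation sites can be located by scanning a sequence for the Asn-X-Ser/Thr consensus. The sketch below is illustrative only: the function name and the test sequence are hypothetical, and it assumes the common convention (not stated on the slide) that X may be any residue except proline. A match marks a potential site, not a confirmed glycosylation.

```python
import re

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
# Assumption: X is any residue except proline.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, matched tripeptide) for each sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


# Hypothetical one-letter sequence, for illustration only
print(find_sequons("MNGSANPTNNTS"))
```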
Addition & processing of N-linked oligosaccharides in the rough ER of vertebrate cells
N-glycosylation: Oligosaccharide precursor is attached to the protein co-translationally
Red: GlcNAc Blue: mannose Green: Glucose
Mannose 6-Phosphate directs a protein to the lysosome
• M6P is generated in the cis-Golgi in a 2-step process
• This sorting signal recruits adaptor proteins and clathrin
Newly synthesized proteins destined for the lysosome are first transported to the late endosome.
Adaptins bridge the M6P receptor to clathrin.
Hydrolases are transported to the late endosome, which later matures into a lysosome.
Acidic pH causes hydrolase to dissociate from the receptor. M6P receptor is recycled back to the TGN.
The acid hydrolases in the lysosome are sorted in the TGN based on the chemical marker mannose 6-phosphate.
Formation of the mannose 6-phosphate tag at the Golgi complex.
This was first attached in the ER.
The phosphate is added in the Golgi
The creation of the M6P marker in the Golgi relies on recognition of a signal patch in the tertiary structure of the hydrolase.
Patients with a disease called inclusion-cell disease have cells lacking hydrolases in their lysosomes. Instead, the hydrolases are found in the blood. These patients lack GlcNAc phosphotransferase. Without the M6P-tag, the acid hydrolases are transported to the plasma membrane instead of the late endosome.
Structures of histidine, lysine and arginine
Structures of aspartate, glutamate, asparagine and glutamine
Glycosylation
Families of Amino Acids
• The common amino acids are grouped according to whether their side chains are:
– Acidic: aspartate, glutamate, (tyrosine)
– Basic: lysine, arginine, (histidine)
– Neutral (uncharged polar): serine, threonine, glutamine, asparagine, (glycine)
– Nonpolar: (glycine), alanine, valine, leucine, methionine, cysteine, proline, phenylalanine, tryptophan
• Hydrophilic amino acids (uncharged polar) are usually on the outside of a protein whereas nonpolar residues cluster on the inside of protein
• Basic or acidic amino acids are very polar and are generally found on the outside of protein molecules
polar
Ampholyte
An ampholyte contains both positive and negative groups on the same molecule
Uncommon a.a.
Plant cell wall Collagen
Collagen
myosin
Prothrombin Ca2+ binding protein
elastin
21st amino acid (selenocysteine)
Added during protein synthesis at a UGA codon
Found in glutathione peroxidases
Broad spectrum amino acids
Essential amino acids (the body cannot synthesize them)
Arginine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine
Neuroendocrine related
GABA, glycine, serine, taurine, glutamate, aspartate, tyrosine
Energy metabolism
Asparagine, aspartate, glutamate, citrulline, ornithine
Cysteine → disulfide bonds
Glycine → small; makes the peptide more flexible
Proline → makes the secondary structure more rigid
There Are Four Levels of Protein Structure
Primary structure - linear sequence of amino acids
Secondary structure - regions of regularly repeating conformations of the peptide chain, such as α-helices and β-sheets
Tertiary structure - describes the shape of the fully folded polypeptide chain
Quaternary structure - arrangement of two or more polypeptide chains into multisubunit molecule
• Peptide bond - the linkage between amino acids is a secondary amide bond
• Formed by condensation of the α-carboxyl of one amino acid with the α- amino of another amino acid (loss of H2O molecule)
• Primary structure- linear sequence of amino acids in a polypeptide or protein
Peptide Bonds Link Amino Acids in Proteins
The backbone of protein (polypeptide)
N-C-C-N-C-C-N-C-C-N-C-C
N-terminal → C-terminal
Repeating unit: N-C-C
Peptide bond
The Hydrophobicity of Amino Acid Side Chains
Hydropathy: the relative hydrophobicity of each amino acid
The larger the hydropathy, the greater the tendency of an amino acid to prefer a hydrophobic environment. Higher hydropathy = more hydrophobic = less soluble in water.
Hydropathy affects protein folding:
hydrophobic side chains tend to be in the interior;
hydrophilic residues tend to be on the surface.
Membrane protein vs. hydropathy
One of the most commonly used properties is the suitability of an amino acid for an aqueous environment
Hydropathy & Hydrophobicity
– degree to which something is “water hating” or “water fearing”
Hydrophilicity
– degree to which something is “water loving”
Hydro-pathy/phobicity/philicity
Analysis:
Goal: Obtain quantitative descriptions of the degree to which regions of a protein are likely to be exposed to aqueous solvents
Starting point: Tables of propensities of each amino acid
Describe the likelihood that each amino acid will be found in an aqueous environment - one value for each amino acid
Commonly used tables
– Kyte-Doolittle hydropathy
– Hopp-Woods hydrophilicity
– Eisenberg et al. normalized consensus hydrophobicity
Hydro-pathy/phobicity/philicity table
The topology of a membrane protein often can be deduced from its sequence: hydropathy profile
Hydropathic index for each aa.
Total hydrophobicity of 20 contiguous aa
hydrophobicity
Usually for cytosol
Example Hydrophilicity Plot
This plot is for a tubulin, a soluble cytoplasmic protein.
Regions with high hydrophilicity are likely to be exposed to the solvent (cytoplasm), while those with low hydrophilicity are likely to be internal or interacting with other proteins.
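A hydropathy profile of the kind described above can be sketched as a sliding-window average over a per-residue scale. The example below is a rough illustration, not the method used to draw the plots on these slides: it uses the Kyte-Doolittle hydropathy values, a window of 20 residues as mentioned earlier, and reports the window mean (the window total divided by 20). The function name and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index (one value per amino acid, one-letter code)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=20):
    """Mean hydropathy of each run of `window` contiguous residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical sequence: charged flanks around a 20-residue leucine stretch
profile = hydropathy_profile("KKDE" + "L" * 20 + "RRDE")
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}")
```

Windows with a high mean are candidates for membrane-spanning or buried segments; windows with a low mean are more likely to be solvent exposed. A hydrophilicity scale such as Hopp-Woods would be used the same way, with the interpretation inverted.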
Amino Acid Composition of Proteins
• Amino acid analysis - determination of the amino acid composition of a protein
• Peptide bonds are cleaved by acid hydrolysis (6 M HCl, 110 °C, 16-72 hours), so peptide bonds withstand heat alone but are vulnerable to enzymes
• Amino acids are separated chromatographically and quantitated
• Phenylisothiocyanate (PITC; Edman degradation) is used to derivatize the amino acids prior to HPLC analysis
Secondary structure α-Helix
[Figure: α-helix backbone showing N-H and C=O groups joined by hydrogen bonds]
Every amide hydrogen and carbonyl oxygen is involved in a hydrogen bond.
Multiple strands may entwine to make a protofibril.
The α-helix
3.6 amino acids per turn; each turn rises 5.4 Å
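The two helix parameters on this slide (3.6 residues per turn, 5.4 Å per turn) fix the rise and the rotation per residue. The short sketch below works through that arithmetic; the function names are illustrative only.

```python
RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4  # rise along the helix axis per full turn

RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN  # Å per residue
ANGLE_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN          # degrees per residue


def helix_length(n_residues):
    """Approximate axial length (Å) of an ideal α-helix of n residues."""
    return n_residues * RISE_PER_RESIDUE


def wheel_angles(n_residues):
    """Angular position (degrees) of each side chain viewed down the axis."""
    return [(i * ANGLE_PER_RESIDUE) % 360 for i in range(n_residues)]


print(f"rise per residue: {RISE_PER_RESIDUE:.2f} Å")
print(f"20-residue helix: {helix_length(20):.1f} Å")
```

Because successive side chains are about 100° apart, residues spaced three or four positions apart fall on the same face of the helix, which is how one helix can present a polar face and a non-polar face.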
The α-helix can produce polar or non polar protein folding
Outside → + Inside → -
Polar folding
Stereo view of right-handed α helix
• All side chains project outward from helix axis
Alberts et al (2002) Molecular Biology of the Cell (4e) p.679
Garrett & Grisham (1999) Biochemistry (2e) p.1054
The α-helix brings residues into new relative positions
α -Helix Example
Myosin/actin structure: proteins used in muscle
Labels: actin, troponin, myosin head, myosin tail, ATP and actin binding sites, thick filament, thin filament
α-helices combine to form more complex structures
Stryer (1995) Biochemistry (4e) p.436
Many α helices complex structure
Horse liver alcohol dehydrogenase
• Amphipathic α helix (blue ribbon)
• Hydrophobic residues (blue) directed inward, hydrophilic (red) outward
β-Sheets (a) parallel, (b) antiparallel
parallel antiparallel
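Turn: β turn, γ turn
Pro: sharp turn
R groups on the same side / R groups on opposite sides
Three amino acids span one hydrogen bond / two amino acids span one hydrogen bond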
Mathews et al (2000) Biochemistry (3e) p.181
Reverse Turns: β turn, γ turn
Turns are also stabilized by hydrogen bonds
Mathews et al (2000) Biochemistry (3e) p.164
α helix, β sheet
Both are formed by hydrogen bonds
All α helices, all β sheets, helices + sheets
Kleinsmith & Kish (1995) Principles of Cell and Molecular Biology (2e) p.26
Secondary structures assemble into tertiary structure
turn
Glu Met Ala Leu Lys Phe Gln Trp Ile Val Asp His Arg Thr Ser Cys Tyr Asn Pro Gly
α helix β sheet β turn
Contribution of each amino acid to secondary structure
Tertiary structure
1. Hydrogen bonds, disulfide bonds, peptide bonds (covalent), ionic bonds and hydrophobic interactions all contribute to the structure (secondary structure relies on hydrogen bonds).
2. Contains many secondary (supersecondary) structures.
3. Ionic interactions help stabilize the structure.
4. Hydrophobic groups lie in the core (a soluble protein results from folding of the secondary and tertiary structure).
Trypsin inhibitor 2- structure H-bond
3-structure
Tertiary Structure of Proteins
[Figure labels: disulfide crosslink (-S-S-); salt bridge (-COO- and +H3N-); hydrogen bonding (-O-H and O-); hydrophobic interaction]
Hydrophobic attractions - attractions between R groups of non-polar amino acids.
Hydrogen bonding - interaction between polar amino acid R groups.
Ionic bonding - bonding between oppositely charged amino acid R groups.
Tertiary structure- ionic bond
Cys, His, Glu and Asp can interact with metal ions and form ionic bonds
Function:
1. Fixes (stabilizes) the structure
2. Protein function (cofactor, e.g. Mg2+)
Hydrophobic amino acid group
For water-soluble proteins (the opposite holds for lipid-soluble proteins)
Alberts et al (2002) Molecular Biology of the Cell (4e) p.135
Hydrophilic amino acid group
The hydrophobic strength of tertiary structure
[Figure: two cysteine -CH2-SH groups are oxidized to a -CH2-S-S-CH2- disulfide bond; reduction reverses the reaction]
Intrachain disulfide bond (within one molecule)
Interchain disulfide bond (between molecules)
The disulfide bond of tertiary structure
The bonds contribute to protein structure
1. Hydrogen bond
2. Hydrophobic interaction
3. Ionic bond
4. Disulfide bond
RNase
124 amino acids, with 4 disulfide bonds
Strong and stable; unfolding is reversible
+ urea, mercaptoethanol / - urea, mercaptoethanol
Incorrect folding
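Four disulfide bonds imply eight cysteines, and the number of ways eight cysteines could pair up shows why incorrect folding is possible when the bonds re-form at random. The sketch below counts the pairings using the standard formula for perfect matchings, (2n)! / (2^n · n!); the function name is illustrative, and the calculation assumes every cysteine is paired and any cysteine can pair with any other.

```python
from math import factorial


def disulfide_pairings(n_cysteines):
    """Number of ways to pair all cysteines into disulfide bonds."""
    if n_cysteines % 2:
        raise ValueError("an even number of cysteines is needed to pair them all")
    n_bonds = n_cysteines // 2
    # (2n)! / (2^n * n!): perfect matchings on 2n items
    return factorial(n_cysteines) // (2 ** n_bonds * factorial(n_bonds))


ways = disulfide_pairings(8)
print(f"{ways} possible pairings, so random pairing is correct about {100 / ways:.1f} % of the time")
```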
Common motifs
1. Supersecondary structure
2. 2-5 secondary structures
Tertiary structure - motif
Quaternary Structure of Proteins
Many proteins are not single peptide strands.
They are combinations of several proteins - aggregate of smaller globular proteins.
Conjugated protein - incorporates another type of group that performs a specific function.
Common domain folds
1. Several hundred amino acids
2. > 2 motifs; between secondary and tertiary structure
3. Has a specific conformation
4. Functional structure (myoglobin, 1; antibody, > 2)
Tertiary structure-domain
Tertiary structure- modification
1. Glycoprotein
2. Lipoprotein
3. Metalloprotein
4. Phosphorylation
5. Coenzyme requirement
6. Protein cleavage (e.g. proinsulin → insulin; chymotrypsinogen → chymotrypsin)
Alberts et al (2002) Molecular Biology of the Cell (4e) p.156
Protein cleavage can regulate protein function or produce a new function
Quaternary Structure
• Refers to the organization of subunits in a protein with multiple subunits (an “oligomer”)
• Subunits (may be identical or different) have a defined stoichiometry and arrangement
• Subunits are held together by many weak, noncovalent interactions (hydrophobic, electrostatic)
Quaternary structure of multidomain proteins
Hemoglobin tetramer
(a) Human oxyhemoglobin (b) Tetramer schematic
Sequence → Conformation → Activity → Regulative function
Determination of primary structure: amino acid sequence
(1) Direct sequencing: Edman degradation; F. Sanger (1951, Cambridge U.) → insulin (A, B chains)
(2) Deduce the amino acid sequence from the cDNA sequence (DNA sequencing method: F. Sanger)
ATCGATCC……..
One DNA → two template strands → three possible reading frames each
[Figure: insulin A-chain and B-chain sequences in one-letter code, joined by disulfide (S-S) bonds]
Protein function- serine protease
Acid-catalyzed hydrolysis of a peptide
Traditional N-terminal sequencing: the whole peptide is degraded and cannot be recovered
Edman degradation: the rest of the peptide can be recovered
Edman degradation steps:
1. Denature
2. Reduce and alkylate disulfides
3. Cleavage
4. Edman reaction
5. Recover, …
Step 1 Denature
Step 2 Reduce and alkylate disulfides
Step 3 Cleavage (for greater efficiency)
Now: a sequenator can read about 50 amino acids
Step 4 Edman degradation
Reconstruct sequence of polypeptides
How to identify the primary sequence of a protein?
(1) From cDNA (genome), predict the peptide sequence:
one DNA strand → three possible amino acid sequences; double-stranded DNA → six possible amino acid sequences
(2) Sequence the peptide directly:
N-terminal sequencing (F. Sanger, Cambridge U.; e.g. insulin), Edman degradation
Reading direction of nucleic acid sequences
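The "three frames per strand, six per double-stranded DNA" rule can be made concrete with a short translation routine. This is a minimal sketch, assuming the standard genetic code and an input containing only A, C, G and T; the function names are illustrative, and the test fragment is the ATCGATCC example shown earlier.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate one reading frame (0, 1 or 2); trailing bases are ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))


def six_frames(dna):
    """Translations of all three frames on each of the two strands."""
    opposite = reverse_complement(dna)
    frames = {f"+{f + 1}": translate(dna, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(opposite, f) for f in range(3)})
    return frames


for name, peptide in six_frames("ATCGATCC").items():
    print(name, peptide)
```

In practice only one of the six frames usually encodes the protein, so a predicted sequence from cDNA still has to be matched against experimental evidence such as N-terminal sequence data.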
Signal Peptide
Enter ER? Peroxisomes? Nucleus?
Ribbon Structure of Ribonuclease A
N-Terminus
Lys-Glu-Ser-Arg-Ala
Three Steps of Edman Chemistry
Lys-Glu-Ser-Arg-Ala 1. COUPLE
Edman Reagent
Lys-Glu-Ser-Arg-Ala 2. CLEAVE
Glu-Ser-Arg-Ala 3. CONVERT
PTH-Lysine
PTH Chromatograms
PTH-Lysine
PTH-Glutamic acid
PTH-Serine
1st amino acid residue
2nd amino acid residue
3rd amino acid residue
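The cycle-by-cycle logic of Edman sequencing (couple, cleave, convert, then repeat on the shortened peptide) can be mimicked with a toy simulation. This sketch models only the bookkeeping, not the chemistry or the yields; the function name is illustrative, and the peptide is the Lys-Glu-Ser-Arg-Ala example from the slides.

```python
def edman_cycles(residues, n_cycles=None):
    """Return (PTH derivative, remaining peptide) for each Edman cycle."""
    remaining = list(residues)
    results = []
    while remaining and (n_cycles is None or len(results) < n_cycles):
        # Couple + cleave: the N-terminal residue is released from the chain
        n_terminal = remaining.pop(0)
        # Convert: the released residue is identified as its PTH derivative
        results.append((f"PTH-{n_terminal}", "-".join(remaining)))
    return results


peptide = ["Lys", "Glu", "Ser", "Arg", "Ala"]
for cycle, (pth, rest) in enumerate(edman_cycles(peptide, 3), start=1):
    print(f"cycle {cycle}: {pth}; remaining peptide: {rest}")
```

Each cycle yields one chromatogram, so reading the identified PTH derivatives in order gives the sequence from the N-terminus.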
Procise
Standard Instrument: HT (high throughput)
conversion flask
Reagents: R1, R2, R3, R4, R5
Solvents: S1, S2, S3, S4
Cartridges 1-4: one protein sample per cartridge
Couple: R1, R2
Cleave: R3
Convert: R4
Standard: R5
Models: 491 (one cartridge), 492 (two cartridges), 494 (four cartridges)
Standard PTH Chromatogram on Procise 10 picomoles each PTH-AA
D = Aspartic acid, N = Asparagine, S = Serine, Q = Glutamine, T = Threonine, G = Glycine, E = Glutamic acid
H = Histidine, A = Alanine, R = Arginine, Y = Tyrosine, P = Proline, M = Methionine, V = Valine
W = Tryptophan, F = Phenylalanine, I = Isoleucine, K = Lysine, L = Leucine
dmptu, dptu = Edman byproducts
[Chromatogram time axis: 5, 10, 15 minutes]
Classical Proteomics Example: Surface Proteins in Human Lung
211 silver-stained spots
182 identified by Edman sequencing
61 different proteins
2 unknown proteins
Human bronchoalveolar lavage fluid: 2D gel, amino acid microsequencing and identification of major proteins, R. Wattiez et al., Electrophoresis 20, 1634 (1999)
Identification and Quantification of Novel Yeast Proteins in Glucose Pathway
Hxk2p: hexokinase II, 1 pmol
EF1b: elongation factor, 6 pmol
E1: pyruvate dehydrogenase, 0.8 pmol
hsp71: heat shock protein, 3 pmol
IF5A: initiation factor, 7 pmol
EMBO Journal, vol. 18, pp 4157-4168
Procise
Standard Instrument: HT
Detector (785): low noise (+/- 1 x 10^-5 AU); flow cell: 8 mm path, 12 µL volume
Binary pump (140C): low pulsation, dual-syringe pump; flow rate: 325 µL/min
PTH-AA in injection loop → PTH column
First Residue of β-Lactoglobulin
Sample: 10 picomoles, initial yield 7 picomoles
[Chromatogram time axis: 5, 10, 15 minutes]
PTH-Leucine
High Sensitivity Sequencing
Sequential Cleavage and Excision of a Segment of the Thyrotropin Receptor Ectodomain
S. de Bernard et al.
J. Biological Chemistry, vol 274, pp 101-107, 1999
TSH receptor protein, immunoprecipitate, run on SDS PAGE, electroblot on PVDF (0.2-0.5 pmoles), sequence on Procise 494
Molecular Cloning and Expression of Lipid Transfer Inhibitory Protein Reveals Its Identity with Apolipoprotein F
X. Wang et al.
J. Biological Chemistry, vol 274, pp 1814-1820 (1999)
affinity purify apoE complex, SDS PAGE, electroblot on PVDF at 0.5 to 1 picomole, sequence on Procise 492