Protein structure and function

(1)


The Function of Proteins

• Enzymes: biological catalysts.
• Immunoglobulins: antibodies of the immune system.
• Transport: move materials around, e.g. hemoglobin for O₂.
• Regulatory: hormones, which control metabolism.
• Structural: coverings and support (skin, tendons, hair, nails, bone).
• Movement: muscles, cilia, flagella.

(2)

The life cycle of a protein

[Figure: from DNA to protein]

(3)

Functional protein families

Science (2001):

• 15 %: intermediary and nucleic acid metabolism
• 15-20 %: structure and protein metabolism (cytoskeleton, chaperones, mediators of degradation)
• 20-25 %: signal transduction or DNA-binding proteins
• 40 %: genes encoding protein products of unknown function

General flow scheme for proteomic analysis

[Figure: a sample (experimental or clinical) yields a proteome (protein mixture); separation and digestion at the protein or peptide level give a peptide mixture, which goes to the mass spectrometer, then data analysis and identification]

(4)

1°, 2°, 3°, 4°: primary, secondary, tertiary and quaternary structure

Nelson & Cox (2000) Lehninger Principles of Biochemistry

Amino acid → peptide (2 amino acids) → polypeptide

The general structural formula of amino acids

H₂N-CH(R)-COOH (pK₁: α-carboxylic acid group; pK₂: α-amino group)

⁺H₃N-CH(R)-COO⁻ (zwitterion)

• pK₁ values of the α-carboxylic acid groups lie in a small range around 2.2, so above pH 3.5 these groups are almost entirely in the carboxylate form (-COOH → -COO⁻).
• pK₂ values of the α-amino groups lie in a small range around 9.4, so below pH 8.0 these groups are almost entirely in the ammonium form (-NH₂ → -NH₃⁺).
• Under normal cellular conditions amino acids are zwitterions (dipolar ions): amino group = -NH₃⁺, carboxyl group = -COO⁻.

(5)

Ionization states of an amino acid:

Acidic environment: ⁺H₃N-CH(R)-COOH, net charge +1
Neutral environment: ⁺H₃N-CH(R)-COO⁻, net charge 0
Alkaline environment: H₂N-CH(R)-COO⁻, net charge -1

pK₁ ~ 2, pK₂ ~ 9, isoelectric point ~ 5.5

(6)

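Amino Acids Have Buffering Effect

[Figure: titration curve of an amino acid, pH (0 to 12) against added [OH⁻]; pK₁ and pK₂ mark the buffering regions, and the isoelectric point pI (★) marks the zwitterion ⁺H₃N-CH(R)-COO⁻]

Isoelectric point: pI = (pK₁ + pK₂) / 2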

Juang RH (2004) BCbasics
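The isoelectric point and the +1 / 0 / -1 charge states can be checked numerically. The following is a minimal sketch, assuming an amino acid with a non-ionizable side chain and the typical values pK₁ = 2.2 and pK₂ = 9.4 quoted in these slides; the function names are illustrative only, and the charge is computed with the Henderson-Hasselbalch relation.

```python
def net_charge(ph, pk1=2.2, pk2=9.4):
    """Average net charge of an amino acid with a non-ionizable side chain.

    pk1 (alpha-carboxyl) and pk2 (alpha-amino) default to the typical
    values quoted in the slides; real amino acids differ slightly.
    """
    # Fraction of alpha-amino groups that are protonated (-NH3+)
    positive = 1.0 / (1.0 + 10 ** (ph - pk2))
    # Fraction of alpha-carboxyl groups that are deprotonated (-COO-)
    negative = 1.0 / (1.0 + 10 ** (pk1 - ph))
    return positive - negative


def isoelectric_point(pk1=2.2, pk2=9.4):
    """pI = (pK1 + pK2) / 2 for an amino acid without an ionizable side chain."""
    return (pk1 + pk2) / 2


pi = isoelectric_point()
for ph in (1.0, pi, 12.0):
    print(f"pH {ph:5.2f}: net charge {net_charge(ph):+.2f}")
```

The charge is close to +1 in strong acid, crosses zero at the pI, and approaches -1 in strong alkali. Amino acids with ionizable side chains (Asp, Glu, Lys, Arg, His and others) need a third pK and are not covered by this sketch.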

Four aliphatic amino acid structures (non-polar)

[Figure: structures; the surviving label is isoleucine (Ile, I), with side chain -CH(CH₃)-CH₂-CH₃]

(7)

Aromatic amino acid structures

(8)

Methionine and cysteine

Non-polar aliphatic

Methionine: the first amino acid of a newly synthesized chain. Cysteine: forms disulfide bonds.

Formation of cystine

(9)

Side Chains with Alcohol Groups

• Serine (Ser, S) and Threonine (Thr, T) have uncharged polar side chains

• Their hydroxyl groups are sites of glycosylation and phosphorylation

Phosphorylation

Phosphorylation is reversible and is used in many pathways to control activity. Enzymes that add a phosphate to a hydroxyl side chain are commonly called kinases. Enzymes that remove a phosphate from a phosphorylated side chain are called phosphatases.

(10)

Glycosylation

There are two basic types of glycosylation, which occur on:

(a) N-linked: asparagine
(b) O-linked: serine and threonine

The biological function of protein glycosylation

1. Covalently attached to the polypeptide as oligosaccharide chains containing 4 to 15 sugars
2. Sugars frequently comprise 50% or more of the total molecular weight of a glycoprotein
3. Most glycosylated proteins are either secreted or remain membrane-bound
4. Glycosylation is the most abundant form of post-translational modification
5. Glycosylation confers resistance to protease digestion by steric protection
6. Important in cell-cell recognition

(11)

N-linked glycosylation on asparagine (Asn) side chains:

• an alkali-stable bond between the amide nitrogen of asparagine and the C-1 of an amino sugar residue

• occurs co-translationally in the endoplasmic reticulum (ER) during synthesis

• lipid-linked oligosaccharide complex is transferred to polypeptide by oligosaccharyl transferase

• target sequence or consensus site on protein is Asn-X-Ser/Thr

• further processing in Golgi apparatus

Examples:

• Heavy chain of immunoglobulin G (IgG)
• Hen ovalbumin
• Ribonuclease B

N-LINKED OLIGOSACCHARIDES

There are two types, which share a common pentasaccharide core (shaded yellow in the next two slides):

• High-mannose type
• Complex type

(12)

A COMPLEX TYPE N-LINKED OLIGOSACCHARIDE

O-linked glycosylation on serine (Ser) or threonine (Thr) side chains

• an alkali-labile bond between the hydroxyl group of serine or threonine and an amino sugar

• carried out by a class of membrane-bound enzymes called glycosyl transferases which reside in the endoplasmic reticulum (ER) or the Golgi apparatus

• nucleotide-linked monosaccharides added to protein side chain one at a time

• Linked one at a time to OH of Ser, Thr or OH in modified amino acids (e.g. Hydroxylysine of collagen)

• Example: blood group antigens on the erythrocyte surface:

– The A antigen and B antigen are pentasaccharides that differ in the composition of the 5th sugar residue
– The O substance is a tetrasaccharide that lacks the 5th residue and does not elicit an antibody response (non-antigenic)

(13)

Protein glycosylation takes place in the ER and Golgi

The endoplasmic reticulum- ER

– A continuous cytoplasmic network studded with ribosomes and functions as a transport system for newly synthesized proteins.

The Golgi complex

– An organelle consisting of stacks of flat membranous vesicles that modify, store, and route products of the ER.

N-linked glycosylation begins in the ER and continues in the Golgi apparatus (via dolichol phosphate).

O-linked glycosylation takes place only in the Golgi apparatus.

In the Golgi:

1. O-linked sugar units are linked to proteins.

2. N-linked glycoproteins continue to be modified.

3. Proteins are sorted and sent to lysosomes, secretory granules or the plasma membrane, according to signals encoded by amino acid sequences.

Glycosylation starts in the endoplasmic reticulum and continues in the Golgi.

GLYCOSYLATION

Glycosylated proteins, probably half of all the proteins in animals, include:

• Most proteins in the extracellular matrix
• Most proteins on the plasma membrane and in the lysosome
• Most proteins in blood (albumin is a notable exception)
• Coat proteins of viruses made in animal cells

Function of glycosylation:

• Makes proteins more hydrophilic
• Increases stability by decreasing protease accessibility of the backbone
• Provides specific recognition handles for targeting

Glycosylation is not limited to proteins.

Lipids are glycosylated (glycolipids) altering their presentation in the membrane and some glycolipids (e.g. gangliosides) can have messenger character.

The liver makes extensive use of glycosylation during detoxification of foreign substances.

(14)

Precursor synthesis

The oligosaccharide is assembled sugar by sugar onto the carrier lipid dolichol

High energy pyrophosphate bond

(We shall discuss only the synthesis of a complex N-linked oligosaccharide.)

1. Synthesis of Nucleotide Sugars

Every sugar is first linked to a nucleotide. This is accomplished by a series of reactions whose details need not concern us here.

2. Synthesis of Lipid-linked Oligosaccharides

In this stage, which takes place in the endoplasmic reticulum, the oligosaccharide is assembled onto a very hydrophobic lipid: dolichol phosphate.

3. Transfer of the Oligosaccharide to the Protein

The enzyme protein-oligosaccharyltransferase transfers the oligosaccharide en bloc onto an asparagine residue of the protein. This reaction also takes place in the ER.

4. Processing of the Protein-Bound Oligosaccharide

(15)

Biosynthesis of dolichol pyrophosphoryl oligosaccharide precursor

Strongly hydrophobic lipid (79-95 carbon)

Oligosaccharide side chain may promote folding and stability of glycoproteins

Consensus:

Asn-X-Ser/Thr
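The consensus site lends itself to a simple sequence scan. The sketch below is illustrative rather than a definitive predictor: the function name and the test peptide are hypothetical, and it additionally assumes the widely used rule that X may be any residue except proline, which these slides do not state. A matching sequon is necessary but not sufficient for glycosylation in the cell.

```python
import re

# Asn, then any residue except Pro (assumption, not stated in the slides),
# then Ser or Thr. The lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr sequons.

    `protein` is a one-letter amino acid sequence.
    """
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: Asn3 (N-G-S), Asn11 (N-N-T) and Asn12 (N-T-S) match;
# Asn7 is followed by Pro and is skipped.
print(find_sequons("MKNGSANPTLNNTS"))
```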

Addition and processing of N-linked oligosaccharides in the rough ER of vertebrate cells

(16)

N-glycosylation: Oligosaccharide precursor is attached to the protein co-translationally

Red: GlcNAc Blue: mannose Green: Glucose

Consensus:

Asn-X-Ser/Thr

Mannose 6-Phosphate directs a protein to the lysosome

• M6P is generated in the cis-Golgi in a 2-step process

• This sorting signal recruits adaptor proteins and clathrin

(17)

Newly synthesized proteins destined for the lysosome are first transported to the late endosome.

Adaptins bridge the M6P receptor to clathrin.

Hydrolases are transported to the late endosome, which later matures into a lysosome.

Acidic pH causes hydrolase to dissociate from the receptor. M6P receptor is recycled back to the TGN.

The acid hydrolases in the lysosome are sorted in the TGN based on the chemical marker mannose 6-phosphate.

(18)

Formation of the mannose 6-phosphate tag in the Golgi complex.

The mannose-containing oligosaccharide was first attached in the ER.

The phosphate is added in the Golgi.

The creation of the M6P marker in the Golgi relies on recognition of a signal patch in the tertiary structure of the hydrolase.

Patients with a disease called inclusion-cell disease have cells lacking hydrolases in their lysosomes. Instead, the hydrolases are found in the blood. These patients lack GlcNAc phosphotransferase. Without the M6P-tag, the acid hydrolases are transported to the plasma membrane instead of the late endosome.

(19)

Structures of histidine, lysine and arginine

Structures of aspartate, glutamate, asparagine and glutamine

Glycosylation

(20)

Families of Amino Acids

• The common amino acids are grouped according to whether their side chains are:

– Acidic: aspartate, glutamate, (tyrosine)
– Basic: lysine, arginine, (histidine)
– Uncharged (neutral) polar: serine, threonine, glutamine, asparagine, (glycine)
– Nonpolar: (glycine), alanine, valine, leucine, methionine, cysteine, proline, phenylalanine, tryptophan

• Hydrophilic amino acids (uncharged polar) are usually on the outside of a protein, whereas nonpolar residues cluster on the inside

• Basic or acidic amino acids are very polar and are generally found on the outside of protein molecules

Ampholyte

An ampholyte carries both positive and negative groups on the same molecule.

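Uncommon amino acids

[Figure labels: where each uncommon amino acid is found]

• Plant cell wall, collagen
• Collagen
• Myosin
• Prothrombin, Ca²⁺-binding proteins
• Elastin
• The 21st amino acid (selenocysteine): added during protein synthesis at a UGA codon; found in glutathione peroxidases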

(21)

Broad spectrum amino acids

Essential amino acids (the body cannot synthesize them): arginine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, valine

Neuroendocrine related: GABA, glycine, serine, taurine, glutamate, aspartate, tyrosine

Energy metabolism: asparagine, aspartate, glutamate, citrulline, ornithine

(22)

Cysteine → forms disulfide bonds

Glycine → small; makes the peptide more flexible

Proline → makes the secondary (2°) structure more rigid

(23)

There Are Four Levels of Protein Structure

• Primary structure: the linear amino acid sequence
• Secondary structure: regions of regularly repeating conformations of the peptide chain, such as α-helices and β-sheets
• Tertiary structure: the shape of the fully folded polypeptide chain
• Quaternary structure: the arrangement of two or more polypeptide chains into a multisubunit molecule

Peptide Bonds Link Amino Acids in Proteins

• Peptide bond: the linkage between amino acids is a secondary amide bond
• Formed by condensation of the α-carboxyl of one amino acid with the α-amino of another amino acid (loss of an H₂O molecule)
• Primary structure: the linear sequence of amino acids in a polypeptide or protein

(24)

The backbone of protein (polypeptide)

N-terminal: N-C-C-N-C-C-N-C-C-N-C-C :C-terminal

• Unit: one N-C-C repeat (one amino acid residue)
• Peptide bond: the C-N link between adjacent units

The Hydrophobicity of Amino Acid Side Chains

Hydropathy: the relative hydrophobicity of each amino acid

The larger the hydropathy, the greater the tendency of an amino acid to prefer a hydrophobic environment. Higher hydropathy = more hydrophobic = less soluble in water.

Hydropathy affects protein folding:

• hydrophobic side chains tend to be in the interior
• hydrophilic residues tend to be on the surface

Membrane protein vs. hydropathy

(25)

One of the most commonly used properties is the suitability of an amino acid for an aqueous environment

Hydropathy & Hydrophobicity

– degree to which something is “water hating” or “water fearing”

Hydrophilicity

– degree to which something is “water loving”

Hydropathy/hydrophobicity/hydrophilicity analysis:

• Goal: obtain quantitative descriptions of the degree to which regions of a protein are likely to be exposed to aqueous solvents
• Starting point: tables of propensities, which describe the likelihood that each amino acid will be found in an aqueous environment (one value for each amino acid)
• Commonly used tables:
– Kyte-Doolittle hydropathy
– Hopp-Woods hydrophilicity
– Eisenberg et al. normalized consensus hydrophobicity

Hydropathy/hydrophobicity/hydrophilicity table

(26)

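The topology of a membrane protein often can be deduced from its sequence: hydropathy profile (hydrophilic or hydrophobic behavior along the chain)

[Figure: hydropathy plot. The hydropathic index of each amino acid is summed over 20 contiguous amino acids and plotted as hydrophobicity along the sequence. Surviving label: "Usually for cytosol".]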
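A hydropathy profile of this kind can be sketched in a few lines. The example below assumes the Kyte-Doolittle scale named earlier in these slides and sums the index over a sliding window of 20 residues; the function name and the toy sequence are hypothetical, and no threshold for calling a transmembrane segment is implied.

```python
# Kyte-Doolittle hydropathy index, one value per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=20):
    """Total hydropathy of each run of `window` contiguous residues.

    Returns one value per window start position (empty if the sequence is
    shorter than the window). Raises KeyError on non-standard residues.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) for i in range(len(scores) - window + 1)]


# Toy sequence (hypothetical): charged flanks around a 20-residue leucine stretch.
toy = "DEKR" * 5 + "L" * 20 + "DEKR" * 5
profile = hydropathy_profile(toy)
peak = profile.index(max(profile))
print(f"most hydrophobic window starts at residue {peak + 1}")
```

Summing rather than averaging follows the "total hydrophobicity of 20 contiguous aa" wording above; dividing by the window length gives the averaged form that is also commonly plotted.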

(27)

Example Hydrophilicity Plot

This plot is for a tubulin, a soluble cytoplasmic protein.

Regions with high hydrophilicity are likely to be exposed to the solvent (cytoplasm), while those with low hydrophilicity are likely to be internal or interacting with other proteins.

Amino Acid Composition of Proteins

• Amino acid analysis - determination of the amino acid composition of a protein

• Peptide bonds are cleaved by acid hydrolysis (6 M HCl, 110 °C, 16-72 hours), so peptide bonds withstand heat alone but are vulnerable to enzymes

• Amino acids are separated chromatographically and quantitated

• Phenylisothiocyanate (PITC, the Edman reagent) is used to derivatize the amino acids prior to HPLC analysis

(28)

Secondary structure: α-Helix

[Figure: α-helix backbone with N-H···O=C hydrogen bonds]

Every amide hydrogen and carbonyl oxygen is involved in a hydrogen bond.

Multiple strands may entwine to make a protofibril.

The α-helix

3.6 amino acids per turn; each turn rises 5.4 Å.
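The two helix parameters above (3.6 residues per turn, 5.4 Å per turn) fix the rise and rotation per residue. A small arithmetic sketch, with hypothetical function names, assuming an ideal, straight α-helix:

```python
RESIDUES_PER_TURN = 3.6  # from the slide
PITCH_ANGSTROM = 5.4     # rise along the axis per full turn, from the slide


def rise_per_residue():
    """Axial rise contributed by each residue, in angstroms."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN


def rotation_per_residue():
    """Rotation about the helix axis per residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Length in angstroms of an ideal helix of n_residues."""
    return n_residues * rise_per_residue()


print(f"{rise_per_residue():.1f} Å and {rotation_per_residue():.0f}° per residue")
print(f"20 residues span about {helix_length(20):.0f} Å")
```

Each residue therefore advances the helix by about 1.5 Å and turns it by about 100°, so a 20-residue helix is roughly 30 Å long.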

(29)

The α-helix can produce polar or non-polar protein folding

Polar folding: outside → +, inside → -

[Figure: α-helix backbone with N-H···O=C hydrogen bonds]

Stereo view of right-handed α helix

• All side chains project outward from helix axis

(30)

Alberts et al (2002) Molecular Biology of the Cell (4e) p.679

Garrett & Grisham (1999) Biochemistry (2e) p.1054

The α-helix places residues in new relative positions

α-Helix Example: myosin/actin structure

Proteins used in muscle:

• Thin filament: actin, troponin
• Thick filament: myosin (myosin head with ATP and actin binding sites; myosin tail)

(31)

α helices combine to form more complex structures

Stryer (1995) Biochemistry (4e) p.436

Many α helices: complex structure

Horse liver alcohol dehydrogenase

• Amphipathic α helix (blue ribbon)
• Hydrophobic residues (blue) directed inward, hydrophilic residues (red) outward

(32)

β-Sheets: (a) parallel, (b) antiparallel

Turn

(33)

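Reverse Turns: β turn, γ turn

• β turn: one hydrogen bond spans three amino acids
• γ turn: one hydrogen bond spans two amino acids
• Pro: sharp turn
• Figure labels: R groups on the same side; R groups on opposite sides

Turns also depend on hydrogen bonds.

Mathews et al (2000) Biochemistry (3e) p.181

α helix and β sheet: both are formed by hydrogen bonds

Mathews et al (2000) Biochemistry (3e) p.164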

(34)

Secondary structures assemble into tertiary structure: all α helices; all β sheets; helices + sheets (with turns)

Kleinsmith & Kish (1995) Principles of Cell and Molecular Biology (2e) p.26

Contribution of each amino acid to secondary structure (α helix, β sheet, β turn)

[Figure: propensities for Glu, Met, Ala, Leu, Lys, Phe, Gln, Trp, Ile, Val, Asp, His, Arg, Thr, Ser, Cys, Tyr, Asn, Pro, Gly]

(35)

Tertiary structure

1. Hydrogen bonds, disulfide bonds, peptide bonds (covalent), ionic bonds and hydrophobic interactions all contribute to the structure (secondary structure relies on hydrogen bonds).
2. Contains many secondary (supersecondary) structures.
3. Ionic interactions help stabilize the structure.
4. Hydrophobic groups sit in the core (a soluble protein achieves this through folding of its 2° and 3° structure).

Example: trypsin inhibitor (2° structure: hydrogen bonds; 3° structure)

Tertiary Structure of Proteins

[Figure: interactions that stabilize tertiary structure]

• Salt bridge: -COO⁻ ··· ⁺H₃N-
• Disulfide crosslink: -S-S-
• Hydrogen bonding: -O-H ··· O-
• Hydrophobic interaction

(36)

Tertiary Structure of Proteins

• Hydrophobic attractions: attractions between R groups of non-polar amino acids.
• Hydrogen bonding: interaction between polar amino acid R groups.
• Ionic bonding: bonding between oppositely charged amino acid R groups.

Tertiary structure: ionic bond

Cys, His, Glu and Asp can interact with metal ions and form ionic bonds.

Function:

1. Fixes (stabilizes) the structure
2. Protein function (cofactor, e.g. Mg²⁺)

(37)

The hydrophobic force in tertiary structure

Hydrophobic amino acid groups and hydrophilic amino acid groups, shown for a water-soluble protein (the opposite arrangement holds for lipid-soluble proteins)

Alberts et al (2002) Molecular Biology of the Cell (4e) p.135

The disulfide bond of tertiary structure

-CH₂-SH + HS-CH₂- ⇌ -CH₂-S-S-CH₂- (oxidation forms the bond; reduction breaks it)

• Intrachain (intramolecular) disulfide bond
• Interchain (intermolecular) disulfide bond

(38)

The bonds that contribute to protein structure

1. Hydrogen bond
2. Hydrophobic interaction
3. Ionic bond
4. Disulfide bond

RNase (ribonuclease)

• 124 amino acids, with 4 disulfide bonds
• Strong and stable; unfolding is reversible
• + urea, mercaptoethanol: unfolds; - urea, mercaptoethanol: refolds
• Incorrect folding is also possible

(39)

Tertiary structure: motif

Common motifs

1. Supersecondary structure
2. Built from 2-5 secondary structure elements

Quaternary Structure of Proteins

Many proteins are not single peptide strands. They are combinations of several proteins: aggregates of smaller globular proteins.

Conjugated protein: incorporates another type of group that performs a specific function.

(40)

Tertiary structure: domain

Common domain folds

1. Several hundred amino acids
2. More than 2 motifs; a level between secondary and tertiary structure
3. Has a specific conformation
4. Functional unit (myoglobin: 1 domain; antibody: more than 2)

Tertiary structure: modification

1. Glycoprotein
2. Lipoprotein
3. Metalloprotein
4. Phosphorylation
5. Coenzyme requirement
6. Protein cleavage (e.g. insulin, chymotrypsinogen)

(41)

Alberts et al (2002) Molecular Biology of the Cell (4e) p.156

Protein cleavage can regulate a protein or produce a new protein function

(42)

Quaternary Structure

• Refers to the organization of subunits in a protein with multiple subunits (an “oligomer”)

• Subunits (may be identical or different) have a defined stoichiometry and arrangement

• Subunits are held together by many weak, noncovalent interactions (hydrophobic, electrostatic)

Quaternary structure of multidomain proteins

(43)

Hemoglobin tetramer

(a) Human oxyhemoglobin (b) Tetramer schematic

Sequence → Conformation → Activity → Regulatory function

(44)

Determination of primary structure: the amino acid sequence

(1) Direct sequencing: Edman degradation; F. Sanger (1951, Cambridge U.) → insulin (A and B chains)

(2) Deduce the amino acid sequence from the cDNA sequence (DNA sequencing method: F. Sanger)

One double-stranded DNA → two template strands → three possible reading frames each (e.g. ATCGATCC...)

Insulin (A and B chains)

A-chain: ⁺H₃N-GIVEQCCTSICSLYQLENYCN-COO⁻
B-chain: ⁺H₃N-FVNQHLCGSHLVEALYLVCGERGFFYTPKT-COO⁻

Disulfide bonds (S-S): two interchain bonds link the A and B chains, and one intrachain bond lies within the A chain.

(45)

Protein function- serine protease

Acid-catalyzed hydrolysis of a peptide

(46)

Traditional N-terminal sequencing: the whole peptide is degraded and cannot be recovered.

Edman degradation: the remaining peptide can be recovered.

(47)

Edman degradation steps:

1. Denature
2. Reduce and alkylate disulfides
3. Cleavage
4. Edman reaction
5. Recover, …

Step 1 Denature

Step 2 Reduce and alkylate disulfides

(48)

Step 3 Cleavage (for more efficient sequencing)

Now: a sequenator reads about 50 amino acids

Step 4 Edman degradation

(49)

Reconstruct sequence of polypeptides

How to identify the primary sequence of a protein?

(1) From cDNA (genome), predict the peptide sequence

One DNA strand → three possible amino acid sequences (reading frames); double-stranded DNA → six possible amino acid sequences

(2) Directly sequence the peptide

N-terminal sequencing (F. Sanger, Cambridge U.; e.g. insulin); Edman degradation

Reading direction of the nucleic acid sequence
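The "three frames per strand, six per double-stranded DNA" point can be illustrated with a short six-frame translation. This is a minimal sketch, assuming the standard genetic code and an input containing only A, C, G and T; the function names are hypothetical, stop codons are written as `*`, and every frame is read 5'→3'.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, frame=0):
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    strands = {"+": dna.upper(), "-": reverse_complement(dna)}
    return {
        f"{sign}{frame + 1}": translate(seq, frame)
        for sign, seq in strands.items()
        for frame in range(3)
    }


# The short fragment shown in the slides.
for label, peptide in six_frame_translations("ATCGATCC").items():
    print(label, peptide)
```

In practice only one of the six frames encodes the protein; a long open reading frame without stop codons is the usual clue.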

(50)

Signal Peptide

Enter ER? Peroxisomes? Nucleus?

Ribbon Structure of Ribonuclease A

N-Terminus: Lys-Glu-Ser-Arg-Ala

(51)

Three Steps of Edman Chemistry

1. COUPLE: the Edman reagent attaches to the N-terminus of Lys-Glu-Ser-Arg-Ala
2. CLEAVE: the derivatized N-terminal residue is released, leaving Glu-Ser-Arg-Ala
3. CONVERT: the released residue is converted to PTH-Lysine

PTH Chromatograms

• 1st amino acid residue: PTH-Lysine
• 2nd amino acid residue: PTH-Glutamic acid
• 3rd amino acid residue: PTH-Serine
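The cycle-by-cycle logic of Edman sequencing can be mimicked in a few lines. This toy sketch only models the bookkeeping (one N-terminal residue released per cycle as its PTH derivative, shortened peptide carried into the next cycle), not the chemistry; the function name is hypothetical.

```python
def edman_cycles(peptide, n_cycles=None):
    """Yield (cycle, PTH derivative, remaining peptide) for each Edman cycle.

    `peptide` uses hyphen-separated three-letter codes, N-terminus first.
    """
    residues = peptide.split("-")
    if n_cycles is None:
        n_cycles = len(residues)
    for cycle in range(1, min(n_cycles, len(residues)) + 1):
        # Couple, cleave and convert: the N-terminal residue leaves as PTH-residue.
        n_terminal, residues = residues[0], residues[1:]
        yield cycle, f"PTH-{n_terminal}", "-".join(residues)


for cycle, pth, remaining in edman_cycles("Lys-Glu-Ser-Arg-Ala", n_cycles=3):
    print(cycle, pth, remaining)
```

Three cycles on the example peptide release PTH-Lys, PTH-Glu and PTH-Ser in turn, matching the three chromatograms listed above.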

(52)

Procise

Standard Instrument: HT (high throughput)

• Cartridges 1 to 4: one protein sample per cartridge
• Conversion flask
• Reagents: R1, R2, R3, R4, R5 (couple: R1, R2; cleave: R3; convert: R4; standard: R5)
• Solvents: S1, S2, S3, S4
• Models: 491, one cartridge; 492, two cartridges; 494, four cartridges

Standard PTH Chromatogram on Procise: 10 picomoles of each PTH-AA (time axis marked at 5, 10 and 15 minutes)

D = Aspartic acid, N = Asparagine, S = Serine, Q = Glutamine, T = Threonine, G = Glycine, E = Glutamic acid, H = Histidine, A = Alanine, R = Arginine, Y = Tyrosine, P = Proline, M = Methionine, V = Valine, W = Tryptophan, F = Phenylalanine, I = Isoleucine, K = Lysine, L = Leucine

dmptu, dptu = Edman byproducts

(53)

Classical Proteomics Example: Surface Proteins in Human Lung

• 211 silver-stained spots
• 182 identified by Edman sequencing
• 61 different proteins
• 2 unknown proteins

R. Wattiez et al., "Human bronchoalveolar lavage fluid: 2D gel, amino acid microsequencing and identification of major proteins", Electrophoresis 20, 1634 (1999)

Identification and Quantification of Novel Yeast Proteins in Glucose Pathway

• Hxk2p (hexokinase II): 1 pmol
• EF1b (elongation factor): 6 pmol
• E1 (pyruvate dehydrogenase): 0.8 pmol
• hsp71 (heat shock protein): 3 pmol
• IF5A (initiation factor): 7 pmol

EMBO Journal, vol. 18, pp. 4157-4168

(54)

Procise

Standard Instrument: HT

• Detector (785): low noise (± 1 × 10⁻⁵ AU); flow cell with 8 mm path and 12 µL volume
• Binary pump (140C): low pulsation, dual-syringe pump; flow rate 325 µL/min
• PTH-AA in injection loop → PTH column

First Residue of β-Lactoglobulin

Sample: 10 picomoles; initial yield: 7 picomoles

[Chromatogram: PTH-Leucine peak; time axis marked at 5, 10 and 15 minutes]
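The figures above (10 pmol loaded, 7 pmol recovered at the first residue) correspond to an initial yield of 70 %. A rough sketch of how the signal then fades from cycle to cycle is given below; it assumes a constant per-cycle repetitive yield and a fixed detection limit, both of which are hypothetical parameters not given in these slides, and the function names are illustrative only.

```python
def expected_signal(sample_pmol, initial_yield, repetitive_yield, cycle):
    """Expected PTH-amino acid signal (pmol) at a 1-based cycle number."""
    return sample_pmol * initial_yield * repetitive_yield ** (cycle - 1)


def max_readable_cycle(sample_pmol, initial_yield, repetitive_yield, detection_limit_pmol):
    """Last cycle whose expected signal still reaches the detection limit."""
    if not 0 < repetitive_yield < 1:
        raise ValueError("repetitive_yield must be between 0 and 1 (exclusive)")
    cycle = 0
    while expected_signal(sample_pmol, initial_yield, repetitive_yield, cycle + 1) >= detection_limit_pmol:
        cycle += 1
    return cycle


# 10 pmol sample and 70 % initial yield are from the slide; the 95 % repetitive
# yield and 0.5 pmol detection limit are assumptions for illustration.
for cycle in (1, 10, 25, 50):
    print(cycle, round(expected_signal(10, 0.7, 0.95, cycle), 2))
print(max_readable_cycle(10, 0.7, 0.95, 0.5))
```

Because the loss compounds every cycle, read length is limited to tens of residues, which is consistent with the "about 50 amino acids" quoted for the sequenator.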

(55)

High Sensitivity Sequencing

Sequential Cleavage and Excision of a Segment of the Thyrotropin Receptor Ectodomain

S. de Bernard et al.

J. Biological Chemistry, vol 274, pp 101-107, 1999

TSH receptor protein, immunoprecipitate, run on SDS PAGE, electroblot on PVDF (0.2-0.5 pmoles), sequence on Procise 494

Molecular Cloning and Expression of Lipid Transfer Inhibitory Protein Reveals Its Identity with Apolipoprotein F

X. Wang et al.

J. Biological Chemistry, vol 274, pp 1814-1820 (1999)

affinity purify apoE complex, SDS PAGE, electroblot on PVDF at 0.5 to 1 picomole, sequence on Procise 492