Department of Chemistry, College of Science
National Taiwan University
Master Thesis
Effect of Lysine Methylation on α-Helix Propensity, β-Sheet Propensity, RNA Recognition, and Cell Penetration
Mu-Chun Liu
Advisor: Richard P. Cheng, Ph.D.
July, 2014
Abstract
Post-translational modifications govern many aspects of protein behavior. Methylation of
lysine affects both protein function and structure. Three variations of methylated
lysine have been identified in proteins, and different numbers of methyl groups
attached to the side chain amino group are expected to affect proteins to different
degrees. In this study, various types of methylated lysines were placed into two basic
secondary structures, the α-helix and the simplest β-sheet model, the β-hairpin, to
investigate the effect of lysine methylation on structural stability. The fraction helix
of the helical peptides was determined by circular dichroism spectroscopy. The
structural information of the hairpin peptides was analyzed by 2D NMR.
Lysine methylation also plays an important part in many biological processes. The
regulatory Tat protein contains a basic region (RKKRRQRRR, residues 49 to 57) that
specifically binds to the trans-activating responsive (TAR) element to modulate HIV-1
RNA transcription. The binding between the HIV-1 Tat protein and TAR RNA is essential
for the virus to efficiently produce full-length viral RNA. To study the effect of
lysine methylation on RNA recognition and cellular uptake, the two lysine residues Lys50
and Lys51 were replaced with monomethylated, dimethylated, and trimethylated lysines.
The dissociation constants of the Tat-derived peptide-TAR RNA complexes were
determined by gel shift assay. The cellular uptake efficiency of the Tat-derived peptides
into Jurkat cells was assessed by flow cytometry.
Table of Contents
Acknowledgements
Chinese Abstract
Abstract
Table of Contents
List of Charts
List of Figures
List of Tables
List of Schemes
Abbreviation
Chapter 1. Introduction
1-1 Central Dogma of Molecular Biology
1-2 Proteins
1-3 Protein Folding and Function
1-4 Hierarchy of Protein Structure
1-5 Driving Force of Protein Folding
1-6 RNA Recognition
1-7 Post-Translational Modifications (PTMs)
1-8 Thesis Overview
1-9 References
Chapter 2.
2-1 Introduction
α-Helix
Lifson-Roig Theory
β-Sheet
Lysine Methylation
2-2 Results and Discussion
Peptide Design and Synthesis
Circular Dichroism Spectroscopy
Helix Formation Parameters
Hairpin Structure Characterization
2-3 Conclusions
2-4 Future Aspects
2-5 Acknowledgement
2-6 Experimental Section
General Materials and Methods
Peptide Synthesis
Ultraviolet-Visible (UV-vis) Spectroscopy
Circular Dichroism Spectroscopy
Helix Propensity and Capping Parameter Derivation
Hairpin Peptide Structure Analysis by 2D-NMR
2-7 References
Chapter 3.
3-1 Introduction
Ribonucleic acid (RNA)
Human Immunodeficiency Virus (HIV)
Trans-Activation Response Element (TAR) RNA
Trans-Activator of Transcription (Tat) Protein
Tat-Mediated Transcription
Lysine Methylation in Tat Protein
Cell Penetration
3-2 Results and Discussion
Peptide Design and Synthesis
Reductive Methylation on Lysine
Electrophoretic Mobility Shift Assay in the Presence of Bulk E. coli tRNA
Circular Dichroism Spectroscopy
Cellular Uptake Assay
3-3 Conclusion
3-4 Acknowledgement
3-5 Experimental Section
General Materials and Methods
Peptide Synthesis
Ultraviolet-Visible (UV-vis) Spectroscopy
Electrophoretic Mobility Shift Assay
Circular Dichroism Spectroscopy
Cells and Cell Cultures
Cellular Uptake Assay
3-6 References
Appendix.
List of Charts
Chart 2-1. Chemical Structure, Full Name and 3-Letter Code of Methylated Lysines
Chart 3-1. Chemical Structure, Full Name and 3-Letter Code of Methylated Lysines
Chart 3-2. The Chemical Structures of Commercially Available Methylated Lysines
List of Figures
Figure 1-1. The central dogma of molecular biology is the genetic information flowing from DNA through RNA to proteins.
Figure 1-2. The peptide bond and the dihedral angles φ and ψ in the backbone.
Figure 1-3. Ramachandran plot.
Figure 1-4. The structure of an α-helix.
Figure 1-5. The structure of a β-hairpin.
Figure 1-6. Four hierarchical levels of protein structure.
Figure 2-1. Chemical structure of the experimental (A), the fully unfolded (B), and the fully folded (C) hairpin peptides. Xaa was replaced by Mmk, Dmk, or Tmk.
Figure 2-2. Circular dichroism spectra of the peptides at pH 7 (273 K) in 1 mM phosphate, borate, and citrate buffer with 1 M NaCl. (A) KXaa9 peptides, (B) KXaa14 peptides, (C) NCapXaa peptides, (D) CCapXaa peptides.
Figure 2-3. The Hα chemical shift deviation for peptides HPTMmkAla (A), HPTDmkAla (B), and HPTTmkAla (C). Reference fully unfolded peptides are HPTUMmkAla, HPTUDmkAla, and HPTUTmkAla, respectively.
Figure 2-4. The Hα chemical shift deviation for peptides HPTFMmkAla (A), HPTFDmkAla (B), and HPTFTmkAla (C). Reference fully unfolded peptides are HPTUMmkAla, HPTUDmkAla, and HPTUTmkAla, respectively.
Figure 2-5. The NOEs (A) and Wüthrich diagram (B) of peptide HPTMmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-6. The NOEs (A) and Wüthrich diagram (B) of peptide HPTUMmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-7. The NOEs (A) and Wüthrich diagram (B) of peptide HPTFMmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-8. The NOEs (A) and Wüthrich diagram (B) of peptide HPTDmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-9. The NOEs (A) and Wüthrich diagram (B) of peptide HPTUDmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-10. The NOEs (A) and Wüthrich diagram (B) of peptide HPTFDmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-11. The NOEs (A) and Wüthrich diagram (B) of peptide HPTTmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-12. The NOEs (A) and Wüthrich diagram (B) of peptide HPTUTmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-13. The NOEs (A) and Wüthrich diagram (B) of peptide HPTFTmkAla. The thickness of the bands reflects the NOE intensity.
Figure 2-14. The folding percentage of each residue for peptide HPTMmkAla (A), HPTDmkAla (B), and HPTTmkAla (C).
Figure 2-15. Fraction folded for the HPTXaaAla peptides (Xaa = Mmk, Dmk, Tmk).
Figure 3-1. The chemical structure of four nucleobases for RNA.
Figure 3-2. The basic constitution of a single-stranded RNA.
Figure 3-3. Various RNA secondary structures: helix (A), stem-loop (B), and bulge loop (C).
Figure 3-4. The landmarks of the HIV-1 genome, which consists of nine essential genes. The trans-activation response element (TAR, the boxed region at the bottom left) is located at the viral 5’ LTR promoter, and the trans-activator of transcription (Tat) protein is displayed on the right-hand side.
Figure 3-5. The sequence and secondary structure of HIV-1 from +17 to +45. This region, TAR RNA, contains a bulge and a loop, +23 to +25 and +30 to +35, respectively.
Figure 3-6. A schematic illustration of the Tat protein.
Figure 3-7. Trans-activated transcription of HIV-1 via Tat-TAR binding.
Figure 3-8. The classification of cell-penetrating peptides.
Figure 3-9. The chemical structure of 6-carboxy-fluorescein.
Figure 3-10. The analytical RP-HPLC chromatogram of peptide Fl-Dmk51-Tat synthesized using commercially available Dmk (A), and synthesized by reductive methylation (B).
Figure 3-11. Images of typical gels of electrophoretic mobility shift assay (EMSA) for Tat-derived peptides. All lanes contain 100 nM fluorescein-labeled HIV-1 TAR RNA in the presence of 10 μg/mL bulk E. coli tRNA.
Figure 3-12. The global fitting results of Tat-derived peptides binding to TAR RNA in the presence of 10 μg/mL bulk E. coli tRNA.
Figure 3-13. Apparent dissociation constants for Tat-derived peptide-TAR RNA complexes as determined by EMSA in the presence of 10 μg/mL bulk E. coli tRNA. TAR RNA concentration was 100 nM.
Figure 3-14. CD spectra between 200 and 300 nm of the Tat-derived peptides. The spectra were acquired in 10 mM Tris buffer at pH 7 and 25 °C. Peptide concentration was 50 μM.
Figure 3-15. The mean fluorescence intensity of Jurkat cells treated with 7 μM (A) and 120 μM (B) Tat-derived peptides in cellular uptake assays.
Figure A-1. Flow cytometry results showing the side scattered light plotted against the forward scattered light for live control cells, dead control cells, and cells incubated with 7 μM Tat-derived peptides. The gate used to restrict the population of cells analyzed is shown and labeled as P1.
Figure A-2. Flow cytometry results showing the propidium iodide fluorescence against the fluorescein fluorescence for live control cells, dead control cells, and cells incubated with 7 μM Tat-derived peptides. The gate used to restrict the fluorescence of cells analyzed is shown and labeled as P2.
Figure A-3. Flow cytometry results showing the fluorescein fluorescence for live control cells, and cells incubated with 7 μM Tat-derived peptides for 15 minutes at 37 °C.
Figure A-4. Flow cytometry results showing the side scattered light plotted against the forward scattered light for live control cells, dead control cells, and cells incubated with 120 μM Tat-derived peptides. The gate used to restrict the population of cells analyzed is shown and labeled as P1.
Figure A-5. Flow cytometry results showing the propidium iodide fluorescence against the fluorescein fluorescence for live control cells, dead control cells, and cells incubated with 120 μM Tat-derived peptides. The gate used to restrict the fluorescence of cells analyzed is shown and labeled as P2.
Figure A-6. Flow cytometry results showing the fluorescein fluorescence for live control cells, and cells incubated with 120 μM Tat-derived peptides for 15 minutes at 37 °C.
Figure A-7. The overlaid bright-field and fluorescence microscopy images of Jurkat cells incubated with 7 μM Fl-Mmk50-Tat (A), Fl-Dmk50-Tat (B), Fl-Tmk50-Tat (C), Fl-Mmk51-Tat (D), Fl-Dmk51-Tat (E), and Fl-Tmk51-Tat (F) for 15 minutes at 37 °C in the presence of fetal bovine serum, washed and treated with trypsin at 37 °C for 5 minutes.
Figure A-8. The license agreement for Figure 3-4.
Figure A-9. The license agreement for Figure 3-7.
List of Tables
Table 2-1. Sequence of Ala-based Peptides for Determining the N-Cap Parameter, C-Cap Parameter, and Helix Propensity of Modified Lys Analogs
Table 2-2. Sequences for the Hairpin Peptides HPTXaaAla, the Unfolded Reference Peptides HPTUXaaAla, and the Folded Reference Peptide HPTFXaaAla Containing Modified Lys Analogs
Table 2-3. The Purity and Weight of the Helical Peptides
Table 2-4. The Purity and Weight of the Hairpin Peptides
Table 2-5. Mean Residue Ellipticity at 222 nm and Fraction Helix (fhelix) of Helical Peptides Containing Methylated Lys
Table 2-6. Statistical Mechanical Helix Formation Parameters for Modified Lys Analogs Derived from Experimentally Measured Fraction Helix Based on Modified Lifson–Roig Theory
Table 2-7. The ³JNHα Coupling Constant Values (Hz) of Peptides HPTFXaaAla
Table 2-8. The ³JNHα Coupling Constant Values (Hz) of Peptides HPTXaaAla
Table 2-9. The ³JNHα Coupling Constant Values (Hz) of Peptide HPTUXaaAla
Table 2-10. Fraction Folded (%) and ΔGfold (kcal/mol) of the Peptide HPTXaaAla
Table 2-11. Sequences for the Future Model Peptides for Investigating Helix and Sheet Stability
Table 2-12. The ¹H Chemical Shift Assignments for Peptide HPTMmkAla
Table 2-13. The ¹H Chemical Shift Assignments for Peptide HPTDmkAla
Table 2-14. The ¹H Chemical Shift Assignments for Peptide HPTTmkAla
Table 2-15. The ¹H Chemical Shift Assignments for Peptide HPTUMmkAla
Table 2-16. The ¹H Chemical Shift Assignments for Peptide HPTUDmkAla
Table 2-17. The ¹H Chemical Shift Assignments for Peptide HPTUTmkAla
Table 2-18. The ¹H Chemical Shift Assignments for Peptide HPTFMmkAla
Table 2-19. The ¹H Chemical Shift Assignments for Peptide HPTFDmkAla
Table 2-20. The ¹H Chemical Shift Assignments for Peptide HPTFTmkAla
Table 3-1. The Sequences of Tat-Derived Peptides Capped with an Acetyl Group
Table 3-2. The Sequences of Tat-Derived Peptides Capped with 6-Carboxy-Fluorescein
Table 3-3. The Purity and Weight of the Tat-Derived Peptides
Table 3-4. The Apparent Dissociation Constants for Tat-Derived Peptides-TAR RNA Complexes in the Presence of 10 μg/mL bulk E. coli tRNA. TAR RNA concentration was 100 nM
Table 3-5. The Z and P values for Comparing the Apparent Dissociation Constants of Wild Type Peptide and Tat-Derived Peptides in the Presence of 10 μg/mL bulk E. coli tRNA
Table 3-6. Cellular Uptake of Tat-Derived Peptides Treated into Jurkat Cells in PBS. Mean fluorescence intensity for each peptide with 7 μM and 120 μM
Table 3-7. The Z and P Value of the Mean Fluorescence Intensity at 7 μM for All Peptides of Cellular Uptake Assays
Table 3-8. The Z and P Value of the Mean Fluorescence Intensity at 120 μM for All Peptides of Cellular Uptake Assays
Table 3-9. Amount of Reagents for Preparation of the Separating Gel
Table 3-10. Amount of Reagents for Preparation of Samples with Different Concentrations
Table A-1. The Secondary Structural Occurrence of Methylated Lysine in Natural Proteins
Table A-2. The Z and P Values for w9 Value of KXaa9
List of Schemes
Scheme 3-1. Synthesis of Fl-Dmk51-Tat via Reductive Methylation
Scheme 3-2. Mechanism of Reductive Methylation
Abbreviation
α-SYN α-Synuclein
Aβ Amyloid β peptide
Ac Acetyl
AD Alzheimer’s disease
AIDS Acquired immune deficiency syndrome
Ala Alanine
APS Ammonium persulfate
Arg Arginine
Bis-acrylamide N,N’-methylene-bis-acrylamide
CCR5 C-C chemokine receptor type 5
CD Circular dichroism
CD4 Cluster of differentiation 4
CDK9 Cyclin-dependent kinase 9
CPPs Cell-penetrating peptides
Cys Cysteine
CXCR4 C-X-C chemokine receptor type 4
DIEA Diisopropylethylamine
DMF Dimethylformamide
Dmk Dimethyllysine
DNA Deoxyribonucleic acid
DQF-COSY Double-quantum filtered-correlated spectroscopy
E. coli tRNA Escherichia coli transfer ribonucleic acid
EMSA Electrophoretic mobility shift assay
FBS Fetal bovine serum
Fl 6-Carboxy-fluorescein
Fmoc N-9-Fluorenylmethoxycarbonyl
Gln Glutamine
Gly Glycine
gp120 Envelope glycoprotein GP120
HBTU O-1H-benzotriazol-1-yl-1,1,3,3-tetramethyluronium hexafluorophosphate
HIV Human immunodeficiency virus
HOBT 1-Hydroxybenzotriazole
Ile Isoleucine
KD Dissociation constant
Leu Leucine
LTR Long terminal repeat
Lys Lysine
MALDI-TOF Matrix-assisted laser desorption ionization time-of-flight
MeOH Methanol
Mmk Monomethyllysine
NELF Negative elongation factor
NMR Nuclear magnetic resonance
NOE Nuclear Overhauser effect
NOESY Nuclear Overhauser effect spectroscopy
Orn Ornithine
PD Parkinson’s disease
PMT Photomultiplier tube
PRC2 Polycomb repressive complex 2
Pro Proline
PrP Prion protein
P-TEFb Positive transcriptional elongation factor-b
PTMs Post-translational modifications
RNA Ribonucleic acid
RNAPII RNA polymerase II
ROESY Rotating-frame nuclear Overhauser effect correlation spectroscopy
RT Reverse transcriptase
RP-HPLC Reversed phase high-performance liquid chromatography
RPMI Roswell Park Memorial Institute medium
SPPS Solid phase peptide synthesis
TAR Trans-activation response element
Tat Trans-activator of transcription
TEMED N,N,N’,N’-Tetramethylethylenediamine
TFA Trifluoroacetic acid
Thr Threonine
Tmk Trimethyllysine
TOCSY Total correlation spectroscopy
Tris Tris(hydroxymethyl)aminomethane
Tyr Tyrosine
UV-vis Ultraviolet-visible
Val Valine
Chapter 1. Introduction
1-1 Central Dogma of Molecular Biology
DNA, RNA, and proteins are the three crucial macromolecules in living organisms.
In 1958, Crick introduced the central dogma to describe the process of producing
proteins from DNA through RNA (Figure 1-1).1 DNA is a biopolymer that
carries genetic information. DNA is duplicated before a cell divides.
DNA is also used to produce pre-RNA through transcription. In eukaryotic cells, both
duplication and transcription of DNA occur in the nucleus. The pre-RNA is processed
through RNA splicing to remove the non-coding regions before translation. The mature
RNA is then transported to the cytoplasm, where proteins are built through translation
according to the genetic code carried on the mature RNA.
Figure 1-1. The central dogma of molecular biology is the genetic information flowing from DNA through RNA to proteins.1 The solid arrows indicate the information flow that occurs in all eukaryotic cells. The dashed arrow indicates the information flow that occasionally occurs in viruses through reverse transcriptases.
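The information flow described above can be illustrated with a minimal Python sketch. It is not part of the thesis work: the function names are hypothetical, the codon table is deliberately partial (a handful of standard codons only), and RNA splicing is omitted for brevity.

```python
# Minimal sketch of the flow DNA -> RNA -> protein (illustrative only).
# CODON_TABLE is a partial excerpt of the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "AAA": "Lys", "AAG": "Lys",
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """Read codons from the first AUG until a stop codon or the sequence end."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        # "Xaa" marks a codon that is missing from the partial table above.
        residue = CODON_TABLE.get(mrna[i:i + 3], "Xaa")
        if residue == "Stop":
            break
        residues.append(residue)
    return residues

mrna = transcribe("ATGAAAGCTAAGTAA")
peptide = translate(mrna)  # ['Met', 'Lys', 'Ala', 'Lys']
```

The dictionary lookup mirrors the way each triplet codon specifies one amino acid. A complete implementation would need the full 64-codon table.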
1-2 Proteins
Proteins are the end products of the central dogma. Based on the unique genetic
code carried by the RNA, each protein is composed of different types and numbers of
amino acids. Most amino acids in proteins are L-α-amino acids. Proteins are linear
biopolymers with peptide bonds linking the α-carboxyl group of one amino acid and the
α-amino group of another. The peptide bond is planar with six atoms in the same plane.
The length of a peptide bond is 1.32 Å, which is between that of a C-N single bond
(1.49 Å) and a double bond (1.27 Å), suggesting partial double bond character.2 Each
amino acid contains a different side chain functional group, allowing proteins to
perform various bioactivities.
Proteins are essential elements that control nearly all cellular functions. They serve
diverse roles, including structural components,3 signal
transduction,4 catalysis,5 and immune response.6 Because proteins are responsible for
almost all bioactivities in the cell, studies that deepen the fundamental knowledge of
proteins should improve our understanding of nature and may lead to technological
advancement.
1-3 Protein Folding and Function
To perform their biological functions, proteins must fold accurately into
three-dimensional structures. Different protein structures give rise to
different protein functions.3, 7 For example, at least 15 distinct enzyme families require a
specific protein fold, the αβ barrel, to construct the appropriate active site geometry.8
If proteins are denatured or mutated and cannot fold into the correct
three-dimensional shape, they lose their functions and may even cause protein misfolding
diseases such as Alzheimer’s,9 Parkinson’s,10 Huntington’s,11 and Creutzfeldt-Jakob
(prion) diseases.12 Alzheimer’s disease (AD) is a clinical syndrome caused by
neurodegeneration; an estimated 24.3 million people suffered from it in
2001.13 AD is related to the abnormal formation and accumulation of amyloid β peptide
(Aβ) and tau protein.14 Parkinson’s disease (PD) is a common neurological syndrome caused
by the abnormal aggregation of a stable tetrameric protein, α-synuclein (α-SYN), into
insoluble fibrils.15 Prion disease is likewise caused by the aggregation of a
helix-containing protein called prion protein (PrP).16 All three diseases
involve the abnormal stacking of structurally diverse proteins into β-sheet
structured amyloid fibrils. The exact conformation of a protein therefore plays an
important role in its function, and a thorough study of protein function at the
molecular level requires detailed structural analysis.
1-4 Hierarchy of Protein Structure
In 1952, Linderstrøm-Lang proposed the hierarchy of protein structure with four
levels: primary, secondary, tertiary, and quaternary.17 In Linderstrøm-Lang’s model,
each level is constructed from the elements of the previous level and is characterized
by specific patterns of interactions.17 The primary structure is the direct
composition of a protein in units of amino acids, written from the
amino-terminal end (N) to the carboxyl-terminal end (C’). The main-chain atoms of each
residue are an NH group, a central carbon atom (Cα) to which the side
chain (R) is attached, and a carbonyl group C’=O linked to the NH of the next residue.
The backbone is thus composed of a repeating unit (NH-CαH-C’=O)n,
which serves as the common framework of all amino acid residues (Figure 1-2). To
describe the structural properties of a protein, the main chain can also be divided in
another way. The repeating unit can be viewed as one central
carbon (Cα(n+1)) extending to its preceding (Cα(n)) and subsequent (Cα(n+2)) central
carbons. As discussed earlier, the peptide C-N bond has partial double bond character.18
This character constrains the six main-chain atoms of each peptide unit
(Cα(n)-C’O-NH-Cα(n+1) and Cα(n+1)-C’O-NH-Cα(n+2)) to a rigid planar structure.2 Two
neighboring rigid planes are linked through the covalent bonds of the shared Cα atom
and can rotate about the N-Cα and Cα-C’ bonds. The dihedral angles for these
two bonds are conventionally named phi (φ) and psi (ψ), respectively (Figure 1-2).
Figure 1-2. The peptide bond and the dihedral angles φ and ψ in the backbone.
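As an illustration of how a backbone dihedral angle such as φ or ψ can be obtained from atomic positions, the following Python sketch computes the signed dihedral angle defined by four points. It assumes the usual convention that φ is measured over the atoms C’(i-1), N, Cα, C’ and ψ over N, Cα, C’, N(i+1). The function names are hypothetical and the coordinates are toy values, not data from this thesis.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    0 degrees corresponds to the cis (eclipsed) arrangement of p0 and p3,
    180 degrees to the trans arrangement.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (not real atom positions): a quarter turn about the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))  # about +90 degrees
```

Using atan2 rather than an arccosine keeps the sign of the angle, which is needed to cover the full range from -180° to 180°.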
The combinations of the dihedral angles are used to describe the structural
properties of the main chain. Most combinations of φ and ψ angles are not
allowed because of steric clashes between the peptide backbone and the side chains.19 G. N.
Ramachandran calculated the sterically allowed regions and plotted them as Ramachandran
plots, with the dihedral angles ranging from -180° to 180° (Figure 1-3).19 The allowed
regions depend on the permitted van der Waals contact distance and the combination of
dihedral angles.19
Figure 1-3. Ramachandran plot.19 The X axis is φ and the Y axis is ψ; both angles range from -180° to 180°.
Secondary structure is defined by patterns of hydrogen bonds between the
backbone amide and carbonyl groups. The basic secondary structures are the α-helix
and the β-sheet.20 The α-helix was first described by Pauling in 1951.21 The α-helix is a
right-handed coil with dihedral angles φ = -57° and ψ = -47°.22, 23 The coil-like structure
has 3.6 residues per turn and is characterized by consecutive main-chain i←i+4
hydrogen bonds between each carbonyl oxygen (i) and the amide hydrogen (i+4) on the
adjacent helical turn (Figure 1-4).24 One third of all protein residues adopt an α-helical
conformation, showing that helices play important roles in living organisms.25
Figure 1-4. The structure of an α-helix (an α-helix from a four-α-helix bundle, PDB
2I7U).
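The i←i+4 hydrogen bond pattern and the 3.6 residues per turn quoted above can be sketched in a few lines of Python. The function names and the 1-based residue numbering are assumptions made for illustration only.

```python
RESIDUES_PER_TURN = 3.6  # value quoted in the text

def helix_hbond_pairs(n_residues: int) -> list:
    """Return (i, i + 4) pairs for an ideal helix with 1-based numbering.

    In each pair, the C=O of residue i accepts a hydrogen bond from the
    N-H of residue i + 4.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]

def rotation_per_residue() -> float:
    """Angular advance around the helix axis per residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN

pairs = helix_hbond_pairs(10)  # (1, 5), (2, 6), ..., (6, 10)
step = rotation_per_residue()  # about 100 degrees per residue
```

One consequence visible in this sketch is that a helix of n residues can form at most n - 4 such backbone hydrogen bonds, so the carbonyl groups of the last four residues and the amide groups of the first four residues remain unpaired.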
The β-sheet is another common secondary structure. It is a flat plate configuration
containing multiple β-strands with inter-strand hydrogen bonds between the backbone
C’=O and N-H groups on neighboring strands. β-Sheets can be further categorized into two
types, parallel and anti-parallel, distinguished by the arrangement of their hydrogen
bonds.26 A parallel β-sheet is characterized by a series of twelve-membered
hydrogen-bonded rings, while an anti-parallel β-sheet is characterized by an alternating
series of ten- and fourteen-membered hydrogen-bonded rings. The dihedral angles of
parallel and anti-parallel β-sheets are (φ = -119°, ψ = +113°) and (φ = -139°, ψ = +135°),
respectively. β-Hairpins are among the simplest super-secondary structures, consisting of
two anti-parallel β-strands connected through a short loop region (Figure 1-5).27-29
Figure 1-5. The structure of a β-hairpin (the C-terminal β-hairpin from GB1 protein,
PDB 2PLP).
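The ideal dihedral angles quoted above for the α-helix and the two types of β-sheet can be used to label a (φ, ψ) pair by its nearest reference conformation. The Python sketch below is a rough illustration only: the nearest-point rule and the function names are assumptions, and real secondary structure assignment relies on hydrogen bond patterns and not on a single pair of angles.

```python
import math

# Ideal (phi, psi) values in degrees, as quoted in the text.
CANONICAL = {
    "alpha-helix": (-57.0, -47.0),
    "parallel beta-sheet": (-119.0, 113.0),
    "anti-parallel beta-sheet": (-139.0, 135.0),
}

def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, handling wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_conformation(phi: float, psi: float) -> str:
    """Return the name of the reference conformation closest to (phi, psi)."""
    def distance(item):
        ref_phi, ref_psi = item[1]
        return math.hypot(angular_difference(phi, ref_phi),
                          angular_difference(psi, ref_psi))
    return min(CANONICAL.items(), key=distance)[0]

label = nearest_conformation(-60.0, -45.0)  # 'alpha-helix'
```

The wrap-around handling matters because dihedral angles are periodic: 179° and -179° differ by only 2°.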
Tertiary structure refers to the stable three-dimensional structure formed by a
polypeptide chain.30 Various recurring secondary structures assemble to form the
tertiary structure, which is required to perform different and precise protein functions.
X-ray analysis has revealed a significant relationship between function and structure.
Domains are the fundamental units of tertiary structure and are also closely related to
protein function. The concept of a domain was first introduced by Wetlaufer after X-ray
studies of hen lysozyme and papain,31, 32 and proteolysis studies of immunoglobulins.33, 34
Protein tertiary structures can be divided into four major classes based on the
secondary structure content of their domains: all-α domains, all-β domains, α+β domains,
and α/β domains.35 According to the Structural Classification of
Proteins (SCOP) database, which classifies proteins by sequence and structure, these common
folds account for 16.2%, 22.6%, 25.4%, and 23.4% of the total 87681 structural hits,
respectively.36 Pyruvate kinase is a phosphate group-transferring enzyme that plays a
crucial role in glycolysis. It contains three major domains: an all-β regulatory domain,
an α/β substrate binding domain, and an α/β nucleotide binding domain. Each
structurally different domain serves a different purpose in phosphate group transfer. A
typical tertiary structure has its nonpolar residues buried in the interior, forming a
hydrophobic core.37 Polar and charged residues are more frequently found on the
surface, where proteins can interact with the aqueous environment through the
hydrophilic side chains.37
Quaternary structure is the spatial assembly of multiple polypeptide chains.38
Examples of proteins with quaternary structure include hemoglobin, DNA polymerase,
and ion channels. Conformational change or re-orientation of individual polypeptides
can induce changes in quaternary structure or in the connection between polypeptides.
Such structural changes regulate protein activity and allow proteins to exert their
physiological functions.
Each level of protein structure is held together by characteristic interactions and
forces. Higher levels of protein structure are assembled from the structural units of
the lower level (Figure 1-6). Among the protein structure hierarchy, the secondary
structural level plays a key role in protein folding. Therefore, research on the factors
that affect the formation of secondary structure is important for understanding protein
structure formation and prediction.
Figure 1-6. Four hierarchical levels of protein structure: primary, secondary, tertiary, and quaternary (triosephosphate isomerase, PDB 8TIM).
1-5 Driving Force of Protein Folding
A protein must fold into its native structure to carry out its function. Four
dominant forces drive protein folding, and all four are non-covalent in
nature.39 These forces are the hydrophobic effect, electrostatic interactions, hydrogen
bonding, and van der Waals interactions.37, 39-46
Protein residues can be divided into two groups, polar and non-polar, depending on
their side chains. When a protein folds, most of the non-polar residues are buried inside
and form a hydrophobic core, while polar residues are mostly exposed to solvent. This
arrangement is entropically favored and therefore increases the stability of
proteins.37, 47, 48 The hydrophobic effect was first described by Kauzmann in 1959.
Polar residues, many of which are charged, are free to interact with their environment,
including solvent molecules and other polar functional groups. Electrostatic interactions
can be divided into three types: ion-ion, ion-dipole, and dipole-dipole.41, 49 A charged
side chain can interact with an oppositely charged functional group located on another
residue or the protein terminus. Dipoles are formed by the asymmetric distribution of
electrons due to the differences in electronegativity of the two atoms in a covalent bond.
Electrostatic interactions through ionic charges or dipoles contribute to protein stability
and the formation of protein structures.50, 51
A hydrogen bond is an interaction between a hydrogen atom in an X-H group and a
highly electronegative atom Y such as nitrogen, oxygen, or fluorine.40, 52 The partial
positive charge on the H atom interacts with the partial negative charge on the Y atom.40, 52
Such an interaction is important for stabilizing secondary and tertiary structures.44, 53, 54
The backbone hydrogen bond C=O···H-N is the most prevalent (68.1%), with
C=O···side chain (10.9%), N-H···side chain (10.4%), and side chain···side chain
hydrogen bonds (10.6%) accounting for the remainder of the hydrogen bonds in protein
structures.44
Another interaction is the van der Waals force. The van der Waals force is a
dispersion force caused by the fluctuating polarization of nearby entities.55 In a
symmetrical molecule, there is no net charge separation on average. In reality, electrons
are mobile and might move towards one end of the molecule, forming a slightly negatively
charged end (δ-) and a slightly positively charged end (δ+).55 Individual van der Waals
interactions are very weak, yet a massive number of such weak forces can still
significantly influence protein structure and stability.56
1-6 RNA Recognition
RNA-protein interactions are important in various fundamental biological processes,
including transcription, translation,57 and RNA processing and modification.58 Double
helical RNA and DNA are both built from complementary base pairs: A-U and C-G in RNA,
and A-T and C-G in DNA.59 Three factors control the binding affinity
between RNA and protein: electrostatic interactions between the positively
charged region of the protein and the negatively charged phosphate groups on the RNA
backbone, hydrogen bonding, and interactions between the RNA groove and the protein side
chains. Specific proteins bind to specific sites on specific RNAs, and the appropriate
binding of such proteins acts as a switch for RNA activation or repression. Therefore,
studies on RNA-protein recognition are important for understanding many diseases
related to RNA.
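The complementary base pairing mentioned above (A-U and C-G in RNA; A-T and C-G in DNA) can be expressed as a simple lookup. The Python sketch below is illustrative only: the function names are hypothetical, and only the Watson-Crick pairs listed in the text are considered, so non-canonical pairs such as the G-U wobble pair are ignored.

```python
# Watson-Crick pairing rules as listed in the text.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(seq: str, pairs: dict) -> str:
    """Return the anti-parallel complementary strand, written 5' to 3'."""
    return "".join(pairs[base] for base in reversed(seq.upper()))

def can_form_duplex(strand_a: str, strand_b: str, pairs: dict) -> bool:
    """True if the two strands (both written 5' to 3') are fully complementary."""
    return strand_b.upper() == complementary_strand(strand_a, pairs)

rna_partner = complementary_strand("GGCA", RNA_PAIRS)  # 'UGCC'
dna_partner = complementary_strand("GGCA", DNA_PAIRS)  # 'TGCC'
```

The sequence is reversed before pairing because the two strands of a double helix run in opposite directions.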
Human immunodeficiency virus (HIV) is a type of RNA retrovirus that causes the
acquired immune deficiency syndrome (AIDS).60 A retrovirus is a single-stranded RNA
virus that targets a host cell as an obligate parasite.61 In most viruses, DNA is
transcribed into RNA, and RNA is translated into viral protein. In retroviruses, however,
RNA is reverse-transcribed into DNA by a virally encoded reverse transcriptase, and
then integrated into the genome of the host cell by a virally encoded integrase.62 Most
retroviruses contain three common genes in RNA genomes: gag, pol, and env. These
genes contain the information necessary for building the structural proteins and
important enzymes for new virus particles. The gag and env genes code for the core
nucleocapsid polypeptides and surface-coat proteins of the virus, respectively.63 The pol
gene codes for the viral reverse transcriptase and other enzymes.64 In the HIV-1 viral
RNA genome, there are six additional regulatory genes (tat, rev, nef, vif, vpr, and vpu)
that code for proteins that control the infection by HIV and the production of new viral
particles.64 The tat gene encodes the Tat protein, which serves as a transcriptional
trans-activator by binding TAR RNA. The Tat protein is important for HIV-1
replication.
The trans-activator of transcription (Tat) protein contains a basic region that can
recognize RNA: RKKRRQRRR (residues 49 to 57). The Tat protein targets the
trans-activating responsive element (TAR) RNA located at the 5′ end of nascent HIV-1
transcripts.65 The TAR RNA contains a stem-loop structure composed of 59 nucleotides.
Two essential regions, both within the segment from +17 to +45, are the pentanucleotide
loop (+29CUGGG+33) and the three-base bulge (+22UCU+24). By interacting with this loop
and bulge region, the Tat protein alters the properties of the transcriptional complex and
recruits crucial enzymes, including the positive transcription elongation complex and RNA
polymerase II, for efficient production of full-length viral RNA.66 The Tat-TAR binding
provides a positive feedback cycle and allows HIV to have an explosive response once
the threshold amount of Tat protein is reached.67 Blocking this protein-RNA interaction
may repress the transcription of HIV-1 and serve as a potential treatment for
AIDS.68
1-7 Post-Translational Modifications (PTMs)
Proteins are synthesized through the following biological steps: translation,
polymerization, termination, and processing.69 There are only 20 amino acids encoded
by the triplet nucleotide codons in mRNA. However, about 140 amino acid
derivatives have been identified in different proteins.70 These 20 encoded amino
acids must undergo various modifications to increase or even alter their functionalities.
Any modification that occurs after the completion of translation is considered a
post-translational modification (PTM).70
PTMs are a series of covalent processing events including peptide bond cleavage
and functional group attachment onto individual amino acids. Some common PTMs are
phosphorylation,71 acetylation,72 glycosylation,73 acylation,74 and methylation.75 PTMs
are responsible for protein function regulation and structural change.76
Protein methylation is a common post-translational modification that affects
thermal stability,77 cellular stress response,78 protein aging,79 gene regulation,80-82 and
transcriptional regulation.83 Protein methylation typically takes place on arginine (Arg)
or lysine (Lys) residues in the protein sequence.75 Lysine can be methylated once, twice,
or three times by lysine methyltransferases into monomethyllysine (Mmk),
dimethyllysine (Dmk), and trimethyllysine (Tmk), respectively.84 Lysine methylation
increases both the effective radius of the positive charge and the hydrophobicity of the
side chain. Such
methylated lysines play an important role in protein-protein and protein-nucleic acid
regulation.85, 86 For the Tat protein, several post-translational modifications have been
identified that modulate the interactions of Tat with TAR and other essential enzyme
complexes.87 These modifications include lysine methylation at the residue adjacent to
the basic region.87 Accordingly, in this thesis, we investigate the effect of various types
of lysine methylation on TAR RNA recognition by Tat47-57 derivatives.
1-8 Thesis Overview
Post-translational modifications are responsible for many protein behaviors. Lysine
methylation alters the physiological properties of the residue and may impact both
protein function and structure. Three variants of methylated lysine have been
identified in proteins. It is logical to assume that different numbers of methyl groups
attached to the side chain amino group should have different effects on proteins. In this
study, various types of methylated lysines are placed into two basic secondary structures,
the α-helix and the simplest β-sheet model, the “β-hairpin”, to investigate the effect of lysine
methylation on structural stability (Chapter 2).
Lysine methylation also plays an important part in biological processes.75 The
regulatory Tat protein contains a basic region (RKKRRQRRR, residues 49 to 57), which
can specifically bind to the trans-activating responsive (TAR) element to modulate
transcription.88 The binding between HIV-1 Tat protein and TAR RNA is essential for
the HIV-1 virus to efficiently produce viral RNA.88 To study the effect of lysine
methylation on RNA recognition and cellular uptake, the two lysine residues Lys50 and
Lys51 were individually replaced with monomethylated, dimethylated, and trimethylated
lysines (Chapter 3). The dissociation constants of the Tat-derived peptide-TAR
RNA complexes were determined by gel shift assay. The cellular uptake efficiency of the
Tat-derived peptides into Jurkat cells was assessed by flow cytometry.
1-9 References
1. Crick, F. Central dogma of molecular biology. Nature 1970, 227, 561-563.
2. Pauling, L.; Corey, R. B. The Planarity of the Amide Group in Polypeptides. J. Am.
Chem. Soc. 1952, 74, 3964-3964.
3. Hall, A. Rho GTPases and the actin cytoskeleton. Science 1998, 279, 509-514.
4. Nishizuka, Y. The Role of Protein Kinase-C in Cell-Surface Signal Transduction and Tumor Promotion. Nature 1984, 308, 693-698.
5. Radzicka, A.; Wolfenden, R. A Proficient Enzyme. Science 1995, 267, 90-93.
6. Aderem, A.; Ulevitch, R. J. Toll-like receptors in the induction of the innate immune response. Nature 2000, 406, 782-787.
7. Gavin, A. C.; Aloy, P.; Grandi, P.; Krause, R.; Boesche, M.; Marzioch, M.; Rau, C.;
Jensen, L. J.; Bastuck, S.; Dumpelfeld, B.; Edelmann, A.; Heurtier, M. A.; Hoffman, V.; Hoefert, C.; Klein, K.; Hudak, M.; Michon, A. M.; Schelder, M.; Schirle, M.;
Remor, M.; Rudi, T.; Hooper, S.; Bauer, A.; Bouwmeester, T.; Casari, G.; Drewes, G.; Neubauer, G.; Rick, J. M.; Kuster, B.; Bork, P.; Russell, R. B.; Superti-Furga, G.
Proteome survey reveals modularity of the yeast cell machinery. Nature 2006, 440, 631-636.
8. Wierenga, R. K. The TIM-barrel fold: a versatile framework for efficient enzymes.
FEBS Lett. 2001, 492, 193-198.
9. Georges, J. Alzheimer's disease in real life. Eur. J. Neurol. 2005, 12, 328-328.
10. Lee, J. C.; Gray, H. B.; Winkler, J. R. Copper(II) binding to α-synuclein, the Parkinson's protein. J. Am. Chem. Soc. 2008, 130, 6898-6899.
11. Bates, G. P. Huntington's disease - Exploiting expression. Nature 2001, 413, 691-694.
12. Prusiner, S. B.; Groth, D.; Serban, A.; Stahl, N.; Gabizon, R. Attempts to Restore Scrapie Prion Infectivity after Exposure to Protein Denaturants. Proc. Natl. Acad.
Sci. U. S. A. 1993, 90, 2793-2797.
13. Ferri, C. P.; Prince, M.; Brayne, C.; Brodaty, H.; Fratiglioni, L.; Ganguli, M.; Hall, K.; Hasegawa, K.; Hendrie, H.; Huang, Y. Q.; Jorm, A.; Mathers, C.; Menezes, P. R.;
Rimmer, E.; Scazufca, M.; Alzheimer's Disease International. Global prevalence of dementia: a Delphi consensus study. Lancet 2005, 366, 2112-2117.
14. Ballard, C.; Gauthier, S.; Corbett, A.; Brayne, C.; Aarsland, D.; Jones, E.
Alzheimer's disease. Lancet 2011, 377, 1019-1031.
15. Kahle, P. J. α-synucleinopathy models and human neuropathology: similarities and differences. Acta Neuropathol. 2008, 115, 87-95.
16. Caughey, B.; Chesebro, B. Prion protein and the transmissible spongiform encephalopathies. Trends. Cell Biol. 1997, 7, 56-62.
17. Linderstrøm-Lang, K. U. Proteins and Enzymes. Lane Medical Lectures 1952, 6.
18. Invernizzi, G.; Papaleo, E.; Sabate, R.; Ventura, S. Protein aggregation:
Mechanisms and functional consequences. Int. J. Biochem. Cell B 2012, 44, 1541-1554.
19. Ramachandran, G. N.; Ramakrishnan, C.; Sasisekharan, V. Stereochemistry of Polypeptide Chain Configurations. J. Mol. Biol. 1963, 7, 95-99.
20. Richardson, J. S. The anatomy and taxonomy of protein structure. Adv. Protein Chem. 1981, 34, 167-339.
21. Pauling, L.; Corey, R. B.; Branson, H. R. The Structure of Proteins - 2 Hydrogen-Bonded Helical Configurations of the Polypeptide Chain. Proc. Natl.
Acad. Sci. U. S. A. 1951, 37, 205-211.
22. Arnott, S.; Wonacott, A. J. Atomic co-ordinates for an α-helix: refinement of the crystal structure of α-poly-l-alanine. J. Mol. Biol. 1966, 21, 371-383.
23. Barlow, D. J.; Thornton, J. M. Helix geometry in proteins. J. Mol. Biol. 1988, 201, 601-619.
24. Pauling, L.; Corey, R. B. The structure of synthetic polypeptides. Proc. Natl. Acad.
Sci. U. S. A. 1951, 37, 241-250.
25. Cheng, R. P.; Girinath, P.; Suzuki, Y.; Kuo, H. T.; Hsu, H. C.; Wang, W. R.; Yang, P.
A.; Gullickson, D.; Wu, C. H.; Koyack, M. J.; Chiu, H. P.; Weng, Y. J.; Hart, P.;
Kokona, B.; Fairman, R.; Lin, T. E.; Barrett, O. Positional Effects on Helical Ala-Based Peptides. Biochemistry 2010, 49, 9372-9384.
26. Pauling, L.; Corey, R. B. The pleated sheet, a new layer configuration of polypeptide chains. Proc. Natl. Acad. Sci. U. S. A. 1951, 37, 251-256.
27. Sibanda, B. L.; Thornton, J. M. β-hairpin families in globular proteins. Nature 1985, 316, 170-174.
28. Sibanda, B. L.; Blundell, T. L.; Thornton, J. M. Conformation of β-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J. Mol. Biol. 1989, 206, 759-777.
29. Sibanda, B. L.; Thornton, J. M. Conformation of β hairpins in protein structures:
classification and diversity in homologous structures. Methods Enzymol. 1991, 202, 59-82.
30. Janin, J.; Chothia, C. Domains in proteins: definitions, location, and structural principles. Methods Enzymol. 1985, 115, 420-430.
31. Phillips, D. C. 3-Dimensional Structure of an Enzyme Molecule. Sci. Am. 1966, 215, 78-90.
32. Drenth, J.; Jansonius, J. N.; Koekoek, R.; Swen, H. M.; Wolthers, B. G. Structure of Papain. Nature 1968, 218, 929-932.
33. Porter, R. R. Structural Studies of Immunoglobulins. Science 1973, 180, 713-716.
34. Edelman, G. M. Antibody Structure and Molecular Immunology. Science 1973, 180, 830-840.
35. Levitt, M.; Chothia, C. Structural Patterns in Globular Proteins. Nature 1976, 261, 552-558.
36. Murzin, A. G.; Brenner, S. E.; Hubbard, T.; Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures.
J. Mol. Biol. 1995, 247, 536-540.
37. Pace, C. N.; Shirley, B. A.; McNutt, M.; Gajiwala, K. Forces contributing to the conformational stability of proteins. FASEB J. 1996, 10, 75-83.
38. Klotz, I. M.; Langerman, N. R.; Darnall, D. W. Quaternary structure of proteins.
Annu. Rev. Biochem. 1970, 39, 25-62.
39. Dill, K. A. Dominant forces in protein folding. Biochemistry 1990, 29, 7133-7155.
40. Hagler, A. T.; Huler, E.; Lifson, S. Energy functions for peptides and proteins. I.
Derivation of a consistent force field including the hydrogen bond from amide crystals. J. Am. Chem. Soc. 1974, 96, 5319-5327.
41. Perutz, M. F. Electrostatic effects in proteins. Science 1978, 201, 1187-1191.
42. Barlow, D. J.; Thornton, J. M. Ion-pairs in proteins. J. Mol. Biol. 1983, 168, 867-885.
43. Nicholls, A.; Sharp, K. A.; Honig, B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 1991, 11, 281-296.
44. Stickle, D. F.; Presta, L. G.; Dill, K. A.; Rose, G. D. Hydrogen bonding in globular proteins. J. Mol. Biol. 1992, 226, 1143-1159.
45. Pace, C. N.; Grimsley, G. R.; Scholtz, J. M. Protein ionizable groups: pK values and their contribution to protein stability and solubility. J. Biol. Chem. 2009, 284, 13285-13289.
46. Stigter, D.; Dill, K. A. Charge effects on folded and unfolded proteins. Biochemistry 1990, 29, 1262-1271.
47. Pace, C. N. Contribution of the hydrophobic effect to globular protein stability. J.
Mol. Biol. 1992, 226, 29-35.
48. Pace, C. N.; Fu, H.; Fryar, K. L.; Landua, J.; Trevino, S. R.; Shirley, B. A.;
Hendricks, M. M.; Iimura, S.; Gajiwala, K.; Scholtz, J. M.; Grimsley, G. R.
Contribution of hydrophobic interactions to protein stability. J. Mol. Biol. 2011, 408,
514-528.
49. Yoder, C. H. Teaching ion-ion, ion-dipole, and dipole-dipole interactions. J. Chem.
Educ. 1977, 54, 402-408.
50. Wada, A.; Nakamura, H. Nature of the charge distribution in proteins. Nature 1981, 293, 757-758.
51. Hol, W. G.; Halie, L. M.; Sander, C. Dipoles of the α-helix and β-sheet: their role in protein folding. Nature 1981, 294, 532-536.
52. Hagler, A. T.; Lifson, S. Energy functions for peptides and proteins. II. The amide hydrogen bond and calculation of amide crystal properties. J. Am. Chem. Soc. 1974, 96, 5327-5335.
53. Baker, E. N.; Hubbard, R. E. Hydrogen bonding in globular proteins. Prog. Biophys.
Mol. Biol. 1984, 44, 97-179.
54. McDonald, I. K.; Thornton, J. M. Satisfying hydrogen bonding potential in proteins.
J. Mol. Biol. 1994, 238, 777-793.
55. Feinberg, G.; Sucher, J. General Theory of the van der Waals Interaction: A Model-Independent Approach. Phys. Rev. A 1970, 2, 2395-2415.
56. Levitt, M.; Gerstein, M.; Huang, E.; Subbiah, S.; Tsai, J. Protein folding: the endgame. Annu. Rev. Biochem. 1997, 66, 549-579.
57. Matsuo, H.; Li, H.; McGuire, A. M.; Fletcher, C. M.; Gingras, A. C.; Sonenberg, N.;
Wagner, G. Structure of translation factor eIF4E bound to m7GDP and interaction with 4E-binding protein. Nat. Struct. Biol. 1997, 4, 717-724.
58. Varani, G.; Nagai, K. RNA recognition by RNP proteins during RNA processing.
Annu. Rev. Biophys. Biomol. Struct. 1998, 27, 407-445.
59. Seeman, N. C.; Rosenberg, J. M.; Rich, A. Sequence-specific recognition of double helical nucleic acids by proteins. Proc. Natl. Acad. Sci. U. S. A. 1976, 73, 804-808.
60. Weiss, R. A. How does HIV cause AIDS? Science 1993, 260, 1273-1279.
61. Yoshida, M. Discovery of HTLV-1, the first human retrovirus, its unique regulatory mechanisms, and insights into pathogenesis. Oncogene 2005, 24, 5931-5937.
62. Smith, J. A.; Daniel, R. Following the path of the virus: the exploitation of host DNA repair mechanisms by retroviruses. ACS Chem. Biol. 2006, 1, 217-226.
63. King, S. R. HIV - Virology and Mechanisms of Disease. Ann. Emerg. Med. 1994, 24, 443-449.
64. Greene, W. C. The molecular biology of human immunodeficiency virus type 1 infection. N. Engl. J. Med. 1991, 324, 308-317.
65. Weeks, K. M.; Ampe, C.; Schultz, S. C.; Steitz, T. A.; Crothers, D. M. Fragments of the HIV-1 Tat protein specifically bind TAR RNA. Science 1990, 249, 1281-1285.
66. Mujeeb, A.; Bishop, K.; Peterlin, B. M.; Turck, C.; Parslow, T. G.; James, T. L.
NMR Structure of a Biologically-Active Peptide-Containing the RNA-Binding
Domain of Human-Immunodeficiency-Virus Type-1 Tat. Proc. Natl. Acad. Sci. U. S.
A. 1994, 91, 8248-8252.
67. Cullen, B. R. Regulation of HIV-1 Gene-Expression. FASEB J. 1991, 5, 2361-2368.
68. Stevens, M.; De Clercq, E.; Balzarini, J. The regulation of HIV-1 transcription:
Molecular targets for chemotherapeutic intervention. Med. Res. Rev. 2006, 26, 595-625.
69. Merrick, W. C. Mechanism and regulation of eukaryotic protein synthesis.
Microbiol. Rev. 1992, 56, 291-315.
70. Uy, R.; Wold, F. Post-translational covalent modification of proteins. Science 1977, 198, 890-896.
71. Lipmann, F. A.; Levene, P. A. Serinephosphoric acid obtained on hydrolysis of vitellinic acid. J. Biol. Chem. 1932, 98, 109-114.
72. Choudhary, C.; Kumar, C.; Gnad, F.; Nielsen, M. L.; Rehman, M.; Walther, T. C.;
Olsen, J. V.; Mann, M. Lysine Acetylation Targets Protein Complexes and Co-Regulates Major Cellular Functions. Science 2009, 325, 834-840.
73. Moremen, K. W.; Tiemeyer, M.; Nairn, A. V. Vertebrate protein glycosylation:
diversity, synthesis and function. Nat. Rev. Mol. Cell. Bio. 2012, 13, 448-462.
74. Towler, D. A.; Gordon, J. I.; Adams, S. P.; Glaser, L. The Biology and Enzymology of Eukaryotic Protein Acylation. Annu. Rev. Biochem. 1988, 57, 69-99.
75. Paik, W. K.; Kim, S. Protein Methylation. Science 1971, 174, 114-119.
76. Seo, J.; Lee, K. J. Post-translational modifications and their biological functions:
proteomic analysis and systematic approaches. J. Biochem. Mol. Biol. 2004, 37, 35-44.
77. Febbraio, F.; Andolfo, A.; Tanfani, F.; Briante, R.; Gentile, F.; Formisano, S.;
Vaccaro, C.; Scire, A.; Bertoli, E.; Pucci, P.; Nucci, R. Thermal stability and aggregation of sulfolobus solfataricus β-glycosidase are dependent upon the N-epsilon-methylation of specific lysyl residues: critical role of in vivo post-translational modifications. J. Biol. Chem. 2004, 279, 10185-10194.
78. Desrosiers, R.; Tanguay, R. M. Methylation of Drosophila histones at proline, lysine, and arginine residues during heat shock. J. Biol. Chem. 1988, 263, 4686-4692.
79. Najbauer, J.; Orpiszewski, J.; Aswad, D. W. Molecular aging of tubulin:
accumulation of isoaspartyl sites in vitro and in vivo. Biochemistry 1996, 35, 5183-5190.
80. Kramer, J. M. Epigenetic regulation of memory: implications in human cognitive disorders. BioMol. Concepts 2013, 4, 1-12.
81. Nakayama, J.; Rice, J. C.; Strahl, B. D.; Allis, C. D.; Grewal, S. I. Role of histone H3 lysine 9 methylation in epigenetic control of heterochromatin assembly. Science
2001, 292, 110-113.
82. Grewal, S. I.; Rice, J. C. Regulation of heterochromatin by histone methylation and small RNAs. Curr. Opin. Cell. Biol. 2004, 16, 230-238.
83. Chen, D.; Ma, H.; Hong, H.; Koh, S. S.; Huang, S. M.; Schurter, B. T.; Aswad, D.
W.; Stallcup, M. R. Regulation of transcription by a protein methyltransferase.
Science 1999, 284, 2174-2177.
84. Paik, W. K.; Paik, D. C.; Kim, S. Historical review: the field of protein methylation.
Trends. Biochem. Sci. 2007, 32, 146-152.
85. Martin, C.; Zhang, Y. The diverse functions of histone lysine methylation. Nat. Rev.
Mol. Cell Bio. 2005, 6, 838-849.
86. Zhang, X.; Wen, H.; Shi, X. B. Lysine methylation: beyond histones. Acta Bioch.
Bioph. Sin. 2012, 44, 14-27.
87. Hetzer, C.; Dormeyer, W.; Schnolzer, M.; Ott, M. Decoding Tat: the biology of HIV Tat posttranslational modifications. Microbes Infect. 2005, 7, 1364-1369.
88. Debaisieux, S.; Rayne, F.; Yezid, H.; Beaumelle, B. The Ins and Outs of HIV-1 Tat.
Traffic 2012, 13, 355-363.
Chapter 2
Effect of Lysine Methylation on α-Helix and
β-Sheet Propensity
2-1 Introduction
α-Helix
The most abundant secondary structure in proteins is the α-helix, which is adopted
by nearly one third of all protein residues.1 The α-helix is characterized by consecutive,
main-chain, i←i+4 hydrogen bonds between each carbonyl oxygen (i) and an amide
hydrogen (i+4) on the adjacent helical turn.2 α-Helix stability is determined by N- and
C-capping effects, side chain-helix macrodipole interactions, side chain-side chain
interactions, and the intrinsic structure-forming tendencies of the constituent amino
acids.3 The relative frequencies with which each amino acid occurs in different secondary
structures were analyzed by Chou and Fasman.4 These statistical results revealed that
each amino acid has its own propensity for different secondary structures.4 The
thermodynamic helix propensities were determined by Baldwin and co-workers in
alanine-based peptides with minimal side chain interactions, based on circular dichroism
spectra and a modified Lifson-Roig theory.5, 6 The thermodynamic propensities of the
amino acids were converted to the free energy of helix formation/propagation. In addition,
helix nucleation is thought to be more difficult than propagation;
therefore, capping effects were considered for helix formation.6 As such, the Lifson-Roig
theory was modified by Baldwin and coworkers to incorporate N- and C-capping
parameters.5-7 The basic assumption is that the helix propensity of each amino acid is
position independent.8
Lifson-Roig Theory
Statistical mechanical models have been used to describe the helix-coil equilibrium,
including Zimm-Bragg theory and Lifson-Roig theory.9, 10 Both models assume that the
helix-coil equilibrium of each residue is a two-state equilibrium: a residue can only
adopt either the helix (h) or the coil (c) conformation. Zimm-Bragg theory introduced two
parameters to describe the helix-coil equilibrium of each residue: nucleation (σ) and
propagation (s).9 The statistical weight for initiating a helical unit (hc or ch) is defined
as σs. The statistical weight of two successive coil residues (cc) is unity (set to 1). The
statistical weight of two successive helical residues (hh) is σs². Through the
Zimm-Bragg model, the helicity of a peptide can be deduced from a partition function
with a few parameters. However, the original Zimm-Bragg model neglected many other
factors that affect helicity, such as N- and C-capping, electrostatics, and the helix macrodipole.11
Modified Zimm-Bragg theories with additional parameters to include other important
interactions have been proposed.12, 13
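The Zimm-Bragg weights described above can be illustrated with a short numerical sketch. This is a simplified illustration, not the procedure used in this thesis. It assumes that a coil residue has weight 1, that a helical residue following a helical residue has weight s, that a helical residue following a coil residue has weight σs, and that a virtual coil residue precedes the chain. The function names and the finite-difference step are hypothetical choices made for this example.

```python
import math

def zimm_bragg_partition(n_residues, s, sigma):
    """Partition function of an n-residue chain under the assumed weights."""
    # z_h / z_c: summed weights of all partial chains ending in h / c.
    # A virtual coil residue precedes the chain, so the first h costs sigma*s.
    z_h, z_c = 0.0, 1.0
    for _ in range(n_residues):
        new_h = z_h * s + z_c * sigma * s  # h after h: s; h after c: sigma*s
        new_c = z_h + z_c                  # a coil residue always has weight 1
        z_h, z_c = new_h, new_c
    return z_h + z_c

def fraction_helix(n_residues, s, sigma, rel_step=1e-6):
    """Mean fraction of helical residues, (1/N) * d ln Z / d ln s.

    Each helical residue contributes exactly one factor of s, so the
    logarithmic derivative counts helical residues. It is evaluated here
    by a central finite difference (an illustrative choice).
    """
    up = math.log(zimm_bragg_partition(n_residues, s * (1 + rel_step), sigma))
    down = math.log(zimm_bragg_partition(n_residues, s * (1 - rel_step), sigma))
    dlnz_dlns = (up - down) / (math.log(1 + rel_step) - math.log(1 - rel_step))
    return dlnz_dlns / n_residues
```

For a two-residue chain this recursion gives Z = 1 + 2σs + σs², which reproduces the weights quoted above: 1 for cc, σs for hc or ch, and σs² for hh.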
The Lifson-Roig theory, similar to Zimm-Bragg theory, employs two parameters
(w and v) to describe the equilibrium between α-helix and random-coil states in a
statistical mechanical manner.10 However, a segment is considered a helix only when at
least three consecutive residues adopt the helical conformation. This is because a helix
cannot exist without the
stabilization of (i, i+4) hydrogen bonding. The statistical weight of each state of each
residue is based on the residue's own state and the states of the two neighboring
residues.10 The Lifson-Roig model utilizes a 4×4 transfer matrix to describe the statistical
weight of a residue, while the Zimm-Bragg model uses a simpler 2×2 transfer matrix.
      hh   hc   ch   cc
hh    w    v    0    0
hc    0    0    1    1
ch    v    v    0    0
cc    0    0    1    1
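The 4×4 transfer matrix above can be applied numerically as in the following sketch. It is an illustration under stated assumptions rather than the fitting procedure used in this thesis. It assumes that rows index the states of residues (i-1, i) and columns the states of residues (i, i+1), that the coil weight is 1, and that virtual coil residues flank both ends of the chain. The function names and the finite-difference step are hypothetical, and capping parameters are not included.

```python
import math

def lifson_roig_partition(n_residues, w, v):
    """Partition function from the 4x4 matrix, state order hh, hc, ch, cc."""
    m = [
        [w,   v,   0.0, 0.0],  # (h,h) -> (h,h): w ; (h,h) -> (h,c): v
        [0.0, 0.0, 1.0, 1.0],  # (h,c) -> (c,h) or (c,c): coil weight 1
        [v,   v,   0.0, 0.0],  # (c,h) -> (h,h) or (h,c): v
        [0.0, 0.0, 1.0, 1.0],  # (c,c) -> (c,h) or (c,c): coil weight 1
    ]
    # Virtual coil before the chain: the first pair is either ch or cc.
    vec = [0.0, 0.0, 1.0, 1.0]
    for _ in range(n_residues):
        vec = [sum(vec[r] * m[r][c] for r in range(4)) for c in range(4)]
    # Virtual coil after the chain: only pairs ending in c (hc, cc) are allowed.
    return vec[1] + vec[3]

def mean_w_residues(n_residues, w, v, rel_step=1e-6):
    """Mean number of residues carrying weight w, d ln Z / d ln w."""
    up = math.log(lifson_roig_partition(n_residues, w * (1 + rel_step), v))
    down = math.log(lifson_roig_partition(n_residues, w * (1 - rel_step), v))
    return (up - down) / (math.log(1 + rel_step) - math.log(1 - rel_step))
```

For a three-residue chain the sketch gives Z = 1 + 3v + 3v² + v²w, so the weight w first appears only when three consecutive residues are helical, consistent with the three-residue requirement stated above.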
The parameters were introduced to describe the different structural states of a
residue: u, coil state (set to 1 in the matrix above); v, helix state adjacent to a coil
state; w, helix state located between
helix states. The parameter v describes helix initiation,10 which is an independent and
energetically uphill process in helix formation. Baldwin and co-workers later discovered
different capping effects at the N- and C-termini, thereby introducing two additional
parameters to describe N- and C-capping (n and c).5 The modified 4×4 transfer
matrix was: