
[Figure 2.5.2.3 plots: ΔΔG against (a) ΔSConsensus, r = 0.38, and (b) ΔSCombination, r = 0.60.]

Figure 2.5.2.3. Correlation between ΔΔG and ΔS for (A) Consensus and (B) Combination.

2.6 Discussion

2.6.1 Comparison within the SCOP homeodomain family of 1B8I-A

Figure 2.6.1.1 shows an example, the ultrabithorax homeodomain (Ubx) from Drosophila melanogaster (PDB entry 1B8I-A [138]), selected from the 66 representative domains to describe the characteristics of our method. The DNA is shown in green, DNA-contact residues are shown as yellow sticks, and the other residues are shown in blue. The protein sequence is also presented, with each contact residue marked by an asterisk.

As an example of aligning the representative domain (1B8I-A) to a member domain of its family, Figure 2.6.1.1 presents PDB entry 1PUF-A, the homeobox protein Hox-A9 from mouse [139]. The contact residues are highly conserved among the aligned amino acids of the two domains, and our scoring method accordingly yields a high z-score (z-score = 11.92).

On the other hand, when 1B8I-A is aligned to 250 non-DNA-binding proteins, our method is able to discard structurally similar proteins whose contact residues are not conserved. Figure 2.6.1.1 shows an example: the alignment of 1B8I-A to 1BOB, the histone acetyltransferase Hat1 from S. cerevisiae in complex with acetyl coenzyme A [140], yields a z-score of only 0.58.

Figure 2.6.1.1. Search results for the ultrabithorax homeodomain protein, using the homeotic Ubx/Exd/DNA ternary complex (PDB entry 1B8I-A) from Drosophila melanogaster as the query. (A) The contact residues of the 1B8I-A complex are shown as yellow sticks. The sequence of 1B8I-A is shown, with contact residues marked by asterisks. (B) Structure alignment of 1B8I-A (blue) and 1PUF-A (green). Our scoring method gives a score of 4.78 and a z-score of 11.92. (C) Structure alignment of 1B8I-A (blue) and the non-DNA-binding protein 1BOB (green). Only the aligned structure and sequence regions of 1B8I-A and 1BOB are shown. The score is -0.72 and the z-score is 0.58.

Figure 2.6.1.2. Comparison of the bound DNA sequences of homologous proteins, aligned using the homeotic Ubx/Exd/DNA ternary complex (PDB entry 1B8I-A) as the query. (A) The z-scores and the bound DNA sequences of the complexes 1B8I (PDB entries 1B8I-C and 1B8I-D), 1PUF (PDB entries 1PUF-D and 1PUF-E), and 1O4X (PDB entries 1O4X-C and 1O4X-D). All sequences are written 5' to 3'. (B) Alignments of the bound DNA sequences of the complexes 1B8I and 1PUF. A colon denotes an identical pair and an asterisk denotes a contact nucleotide (asterisks are placed above the letters of the upper sequence and below the letters of the lower sequence). (C) Alignments of the bound DNA sequences of the complexes 1B8I and 1O4X.

The z-scores of DNA-binding domains in the same SCOP family may vary for some representative domains (Figure 2.6.1.2(A)). 1PUF-A and 1O4X-A1 (an Oct-1 POU homeodomain from Homo sapiens [141]) are both members of the family represented by 1B8I-A. With 1B8I-A as the query, their z-scores are 11.92 (1PUF-A) and 4.40 (1O4X-A1) (Figure 2.6.1.2(A)). These z-scores indicate that the contact residues interacting with the bases of the core DNA binding site are more conserved between 1PUF-A and 1B8I-A than between 1O4X-A1 and 1B8I-A.

To investigate the variation of contact residues among DNA-binding domains in the same SCOP family, we compared the DNA sequences bound by two DNA-binding domains by aligning their double-stranded sequences to each other. 1B8I-A binds two DNA strands (PDB entries 1B8I-C and 1B8I-D) and 1O4X-A1 binds another two (PDB entries 1O4X-C and 1O4X-D). We first generated four pairwise alignments: 1B8I-C with 1O4X-C; 1B8I-C with 1O4X-D; 1B8I-D with 1O4X-C; and 1B8I-D with 1O4X-D. No gap insertion is allowed when aligning a pair of DNA sequences; each alignment is obtained by sliding the two sequences against each other until the best match is found. The alignment with the maximum number of identical aligned pairs is chosen, which here is the alignment between 1B8I-C and 1O4X-C (Figure 2.6.1.2(C)). We then adjusted the alignment of the other strand pair (1B8I-D and 1O4X-D) according to this best alignment (1B8I-C and 1O4X-C).
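The gap-free sliding alignment described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the function names, the tie-breaking rule (the first best offset found is kept), and the toy sequences in the usage note are all assumptions introduced here.

```python
from typing import Dict, Tuple


def best_gapless_alignment(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Slide seq_b along seq_a without gaps and keep the best overlap.

    Returns (offset, identical_pairs), where offset is the index in seq_a
    at which seq_b[0] is placed (negative if seq_b starts before seq_a).
    Ties are resolved in favour of the smallest offset (an assumption).
    """
    best_offset, best_matches = 0, -1
    # Try every offset that leaves at least one overlapping position.
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        matches = sum(
            1
            for j, base_b in enumerate(seq_b)
            if 0 <= offset + j < len(seq_a) and seq_a[offset + j] == base_b
        )
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


def choose_strand_pairing(
    strands_a: Dict[str, str], strands_b: Dict[str, str]
) -> Tuple[str, str, int, int]:
    """Align every strand of complex A to every strand of complex B and
    return (name_a, name_b, offset, identical_pairs) for the pairing with
    the most identical aligned pairs."""
    best = None
    for name_a, seq_a in strands_a.items():
        for name_b, seq_b in strands_b.items():
            offset, matches = best_gapless_alignment(seq_a, seq_b)
            if best is None or matches > best[3]:
                best = (name_a, name_b, offset, matches)
    return best
```

For example, with the hypothetical strands `{"A1": "GGATAAAC", "A2": "GTTTATCC"}` and `{"B1": "ATAAA", "B2": "TTTAC"}` (toy sequences, not the PDB sequences), `choose_strand_pairing` selects the pairing of `A1` with `B1`. The offset of the chosen pairing would then be used to place the remaining strand pair, as described in the text.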

Figure 2.6.1.2(B) and Figure 2.6.1.2(C) show that, over the whole DNA sequences, the number of identical nucleotides between 1B8I-C and 1PUF-E [142] and between 1B8I-D and 1PUF-D [142] is much higher than that between 1B8I-C and 1O4X-C [143] and between 1B8I-D and 1O4X-D [144]. Likewise, the alignments of 1B8I-C with 1PUF-E and of 1B8I-D with 1PUF-D yield 11 identical contact nucleotides, whereas the alignments of 1B8I-C with 1O4X-C and of 1B8I-D with 1O4X-D yield only two (contact nucleotides are the nucleotides that interact with contact residues of the protein). With respect to 1B8I-A, 1PUF-A and 1O4X-A1 thus differ not only in the DNA sequences they bind but also in their DNA-binding sites. These results show that members of the same SCOP family may have different DNA-binding models, and that our method is able to detect the different protein-DNA interactions based on the evolutionary conservation of DNA-contact residues.

We produced a multiple protein sequence alignment of 13 homeodomains (Figure 2.6.1.3) selected from SCOP 1.71 using a multiple structure alignment tool, MUSTANG [145]. These domains were ranked by the z-scores calculated with our scoring method, using the sequence of 1B8I-A as the query. According to the z-scores, the 13 domains can be roughly divided into two groups: the Ubx-like homeodomains, colored in blue (e.g. PDB entries 9ANT-A (12.77), 1AHD-P (12.19), and 1SAN (11.96)), and the Oct-1 POU homeodomains, colored in red (e.g. PDB entries 1E3O-C1 (6.40), 1GT0-C1 (6.38), and 1O4X-A1 (4.40)). Figure 2.6.1.3 shows that all Ubx-like homeodomains are significantly more conserved than the Oct-1 POU homeodomains at the contact residues (green).

The Ubx homeodomain binds together with the extradenticle homeodomain (Exd) to recognize four DNA bases (ATAA) [138] through four residues, Ile47, Gln50, Asn51, and Met54, located on the α3 helix of Ubx (gray columns in Figure 2.6.1.3). The z-scores of domains that are conserved at these four residues are higher, as seen for the three Antennapedia homeodomains and the two Hox homeobox proteins. These results show that contact residues interacting with bases in the DNA sequences are often conserved. This is consistent with previous results [146], in which the homeodomain family was considered a multi-specific family consisting of several subfamilies. That work concluded that members of the same subfamily bind DNA specifically, whereas members of different subfamilies recognize different DNA targets. In summary, we demonstrated the conservation of DNA-contact residues in DNA-binding domains.

Figure 2.6.1.3. Multiple structure alignment of 13 homeodomain structures. Domains with DNA-binding specificities similar to that of 1B8I-A are shown in blue and the others in red. The contact residues of 1B8I-A are marked in green. The contact residues interacting with the bases of the core binding site (ATAA) in the DNA major groove are indicated in gray.