Chapter 5. Web server
5.3 Case study - Foxk1
This case study illustrates how to use DBD2BS to infer the binding PWM of mouse Foxk1
(Forkhead box protein K1) from one of the unbound structures of its DNA-binding
domain, and how to interpret DBD2BS’s output with the utilities built into
DBD2BS.
The query (PDB ID: 2D2W:A) was submitted to DBD2BS with the ‘Query with a
protein structure’ form. Because 2D2W is an NMR structure, DBD2BS selects the first
MODEL by default. After a while, the user sees the candidate templates sorted by
their TM-scores (that is, their structural similarity) against the query structure (Figure 15).
In this case, the top three templates suggested by DBD2BS are used. The results are
shown in Figure 16.
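The default handling of NMR entries described above (keeping only the first MODEL) can be sketched as follows. This is a minimal illustration, not DBD2BS's actual code: the function name first_model_lines is hypothetical, and the sketch assumes a standard PDB-format file in which models are delimited by MODEL and ENDMDL records.

```python
def first_model_lines(pdb_text):
    """Return the ATOM/HETATM lines of the first MODEL in a PDB-format string.

    Hypothetical helper for illustration. If the file has no MODEL records
    (typical for X-ray entries), all ATOM/HETATM lines are returned.
    """
    kept = []
    seen_model = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies the first six columns
        if record == "MODEL":
            if seen_model:
                break  # a second MODEL begins: stop reading
            seen_model = True
        elif record == "ENDMDL":
            break  # end of the first model
        elif record in ("ATOM", "HETATM"):
            kept.append(line)
    return kept
```

Applied to an NMR ensemble such as 2D2W, a helper like this would keep the coordinates of one conformer only, which is what a single-structure alignment tool expects.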
On the result page, the selected templates with their PWM and other detailed
information are listed in the left area (marked as “1” in Figure 16). Clicking the “3D”
icon of each PWM in “1” loads the superimposed complex in the 3D view. The loaded
complex contains the protein-DNA template and the superimposed query to help users
observe the query-DNA interactions and the conformational changes between the
unbound state (the query) and the bound state (the protein in the template) (“3” in
Figure 16). DBDs are displayed as sticks. Residues within 1.5 Å of any heavy
atom of the DNA are colored red when the option “Atom collision” (“2” in Figure 16)
is turned on. DNA base pairs are colored according to their conservation level. The 5’ end
of the PWM (position ‘1’ in the PWM) is highlighted in the Jmol panel by showing the
corresponding base in green, so that users can quickly link the PWM with the DNA in
the Jmol panel.
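The “Atom collision” rule can be illustrated with a short sketch. It is not the server's implementation: the function name colliding_residues and the tuple-based atom representation are assumptions made for this example, as is the choice to ignore hydrogen atoms on the protein side as well as on the DNA side. Only the 1.5 Å cutoff comes from the text.

```python
import math

CUTOFF = 1.5  # Angstroms, the cutoff quoted in the text


def colliding_residues(protein_atoms, dna_atoms, cutoff=CUTOFF):
    """Return sorted IDs of residues with an atom within `cutoff` of DNA.

    protein_atoms: iterable of (residue_id, element, (x, y, z))
    dna_atoms:     iterable of (element, (x, y, z))
    Hydrogens are skipped on both sides (an assumption of this sketch).
    """
    dna_heavy = [xyz for element, xyz in dna_atoms if element != "H"]
    hits = set()
    for residue_id, element, xyz in protein_atoms:
        if element == "H" or residue_id in hits:
            continue  # skip hydrogens and residues already flagged
        if any(math.dist(xyz, d) <= cutoff for d in dna_heavy):
            hits.add(residue_id)
    return sorted(hits)
```

The brute-force double loop is adequate for a single DBD and a short DNA duplex; a spatial index would only matter for much larger complexes.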
Clicking the “open” button of each template in “1” shows more detail about the
template and the prediction made with it, including the following files, which are
ready for download:
superimposed complex: the protein-DNA complex structure that DBD2BS
generates from the query,
template complex: the protein-DNA complex structure whose protein is similar to
the query protein,
alignment: the result reported by the structure alignment tool,
contact residue: the contact residues of the query protein in the superimposed
complex,
PWM: the binding profile generated by DBD2BS.
Figure 15 Snapshot of the template select page.
Figure 16 Snapshot of the result page.
The “3D” icon of the first PWM (template ID 2C6Y:B) in “1” was clicked and the option “Atom collision” in “2” was turned on.
Users can click the “CMP” icon of each PWM to see whether the predicted PWM is
supported by the PWMs predicted from the other templates. In this case, the “CMP” icon
of the first PWM (template ID 2C6Y:B) was clicked (Figure 16) because this template has the lowest
E-value. On the comparison page, the PWM of the selected template is highlighted as
the reference PWM. The reference PWM is aligned against the other PWMs
as follows. The reference complex (the superimposed complex of the
query protein and the template complex corresponding to the reference PWM) is first
aligned to the other complexes by superimposing the query protein inside them. After
superimposition, the DNA structures from the two complexes are structurally aligned via
dynamic programming. Base pairs from different complexes are aligned if they are
closer than 2 Å. This may result in a discontinuous alignment of the two sequence logos.
Figure 17 shows the comparison with the unaligned positions trimmed. Comparing the PWMs
from different templates shows which positions in the predicted PWM have higher
confidence, namely those where consistent predictions are observed. In this tutorial, the prediction
based on 2C6Y:B was consistent with those based on 2C6Y:A, 2AS5:F and 3G73:A at
four positions (xAxACA) (positions ‘9’ and ‘12’-‘16’ on 2C6Y:B). Closer
inspection of the first and third positions of the trimmed PWM on 2C6Y:B (positions ‘9’
and ‘13’ of the original PWM) shows that the first position, ‘G’, aligns to ‘T’ on the
other templates and the third position, ‘T’, aligns to ‘A’ on the other templates. To
confirm this, the user can click the “CMP” icon of another PWM. For
example, clicking the “CMP” icon of the second PWM (template ID 2C6Y:A) opens a
comparison page showing that the third position of the trimmed PWM on 2C6Y:A (position
‘8’ of the original PWM), ‘A’, aligns to ‘T’ on 2C6Y:B but to ‘A’ on the
other templates. This observation gives us higher confidence in changing ‘T’ to ‘A’
at the third position on 2C6Y:B. Thus, we obtain the consensus “AAACA”.
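The base-pair alignment step can be sketched with a small dynamic program. The text states only that dynamic programming is used and that base pairs closer than 2 Å are aligned; everything else here is an assumption for illustration: each base pair is reduced to a single reference point (for example its centroid), the score is simply the number of aligned pairs, and the name align_base_pairs is hypothetical.

```python
import math


def align_base_pairs(ref_points, other_points, cutoff=2.0):
    """Order-preserving pairing of base pairs that lie closer than `cutoff`.

    ref_points, other_points: lists of (x, y, z), one point per base pair,
    taken after the two complexes have been superimposed.
    Returns a list of (ref_index, other_index) tuples (0-based).
    """
    n, m = len(ref_points), len(other_points)
    # score[i][j]: most pairs alignable using the first i and first j points
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(score[i - 1][j], score[i][j - 1])
            if math.dist(ref_points[i - 1], other_points[j - 1]) < cutoff:
                best = max(best, score[i - 1][j - 1] + 1)
            score[i][j] = best
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        close = math.dist(ref_points[i - 1], other_points[j - 1]) < cutoff
        if close and score[i][j] == score[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

Because positions without a partner within the cutoff are skipped, the returned pairs can be non-contiguous, which corresponds to the discontinuous logo alignment mentioned above; positions absent from the result are the ones that would be trimmed.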
In addition, Figure 16 shows that the collisions, which indicate a large conformational
change, occur near position ‘11’ of the PWM (the green base in Jmol marks position
‘1’ at the 5’ end of the PWM) and positions ‘3’, ‘4’ and ‘5’ of the other DNA chain on
2C6Y:B. Similarly, the collisions occur near position ‘6’ (‘7’, ‘8’ and ‘9’)
on 2C6Y:A, positions ‘13’, ‘14’ and ‘15’ (‘7’) on 2AS5:F and position ‘11’ (‘8’-‘11’) on
3G73:A. In this case, the positions near the collisions (position ‘11’) differ from the
positions consistently predicted by DBD2BS (positions ‘9’ and ‘12’-‘16’) on 2C6Y:B. Thus,
users can disregard these collisions.
To further assess the accuracy of the consistent positions identified
above, they were aligned to the annotated PWM (Figure 18). According to the
annotated PWM (the first row of Figure 18) obtained from [37], the consensus
“AAACA” has five correct bases among the five positions in the annotated motif.
Furthermore, the similarity and complete-similarity between the annotated PWM and
the consensus are 0.95 and 0.47, respectively. This demonstrates
DBD2BS’s success in predicting the most important positions (the largest characters in
the sequence logo) of the annotated PWM (Figure 18) by comparing similar templates.
DBD2BS is also able to provide useful information based on the unbound
structure of the query DBD together with bound structures from its homologues.
Figure 17 Snapshot of the comparison page.
Figure 18 Alignment of the annotated PWM (row “Annotation”) and the consensus predicted by DBD2BS (row “consensus”) for the mouse Forkhead box protein K1 (Foxk1).