Sandbox UC 25
Structural analysis of free DNA duplexes
According to current DNA structural data, free DNA duplexes adopt certain conformations preferentially. Examples are shown by examining the data available in the PDB.
Sources
All structures were downloaded from two sources and are identified by their PDB ID codes:
- Nucleic Acid Database (NDB) [1] FTP server
- Protein Data Bank (PDB) [2]
NDB X-Ray and NMR structures
The NDB server gives filter options for downloading structures. Briefly, the applied filters are:
- Only DNA structures (proteins, RNA and ligands are discarded)
- No nucleic acid modifications (base, sugar or phosphate)
- No triplex, quadruplex, or single-stranded helices [3]
- No internal loops, tetraloops, hairpin loops, or bulged bases are allowed.
- No three-way or four-way junctions.
Protein Data Bank (PDB)
All free DNA structures were downloaded from the PDB, applying the same filters used for the NDB.
Data collection problems
Files not found
Some results returned by the NDB server had no associated files. In particular:
X-ray
- bdj061 (PDB ID not linked to NDB ID): The entry links to a Nature paper, http://www.nature.com/nature/journal/v368/n6470/abs/368469a0.html. However, the PDB ID does not exist in the NDB files. The described structure corresponds to the Trp operon operator site. The paper was not publicly available, so the old PDB ID could not be retrieved and the reason for the change could not be checked (discard if the reference is not available).
NMR
- 2kal: Found in the NDB but with broken links. The correct files were downloaded from the PDB (PDB ID 2KAL).
- 2k1y: Same problem as 2kal; files were downloaded from the PDB (PDB ID 2K1Y).
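When an NDB link is broken, the coordinate file can be fetched from the PDB by its ID. The following is a minimal sketch, assuming the RCSB file-download URL pattern `https://files.rcsb.org/download/<ID>.pdb`; it is not necessarily how the files on this page were retrieved, and the helper names `pdb_url` and `fetch_pdb` are hypothetical.

```python
from urllib.request import urlopen

# Assumption: RCSB serves legacy PDB-format files under this URL pattern.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download URL for a four-character PDB ID (case-insensitive)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id, timeout=30):
    """Download a PDB-format file and return its contents as text."""
    with urlopen(pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_pdb("2kal")` would request the file for entry 2KAL; network access is required.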
Modified structures
Although the NDB search was filtered to exclude modified nucleic acids, subsequent analysis revealed modified bases in several files. The reported set was checked manually, and any structure containing a modification was discarded.
NMR (41 modified structures)
1k8l, 1k8n, 1laq, 1las, 1mk6, 1mkl, 1oq2, 1s0t, 1s74, 1s75, 1s88, 1sp6, 1tuq, 1u6n, 1uab, 1xci, 2hli, 2hmd, 2hmr, 2hou, 2k1y, 2kal, 2kd9, 2kda, 2kh3, 2kh5, 2kh6, 2kh7, 2l2u, 2l2v, 2l8p, 2lfa, 2lfx, 2lfy, 2lg0, 2lho, 2lib, 2lsc, 2lsz, 2lt0, 2rrq
X-ray (9 modified structures)
458d, 2qef, 2qeg, 122d, 123d, 3bna, 4bna, 329d, 3n4n
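The check on this page was done manually, but a first pass could be automated. The sketch below is one possible approach, not the procedure used here: it scans PDB-format text for `MODRES` records and for `ATOM` records whose residue name is not one of the standard DNA names. It assumes remediated files that use `DA`, `DC`, `DG`, `DT`; the function name `find_modified_residues` is hypothetical.

```python
# Assumption: remediated PDB files name standard DNA residues DA, DC, DG, DT.
STANDARD_DNA = {"DA", "DC", "DG", "DT"}


def find_modified_residues(pdb_text):
    """Return a sorted list of residue names that look like modifications."""
    suspects = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODRES":
            # MODRES records hold the modified residue name in columns 13-15.
            suspects.add(line[12:15].strip())
        elif record == "ATOM":
            # ATOM records hold the residue name in columns 18-20.
            resname = line[17:20].strip()
            if resname not in STANDARD_DNA:
                suspects.add(resname)
    return sorted(suspects)
```

A structure would be flagged for manual inspection when the returned list is not empty. HETATM records are deliberately not scanned, so waters and ions are ignored; a modified nucleotide stored only as HETATM without a `MODRES` record would be missed.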
More filters
- Abasic sites - NMR (4 structures): 2hsk, 2hsl, 2hsr, 2hss
- Nicked structures - X-ray (2 structures): 1ndn, 1vte
- Triangular asymmetric units - X-ray (1 structure): 3nao
- 4 way junction - X-ray (1 structure): 3q5c
- Triple helix - X-ray (1 structure): 272d
- Resolution (lower than 3.0 Å)
- Redundancy (99% sequence ID)
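The resolution criterion can be sketched as follows. This is an illustration under stated assumptions: the cutoff is read as keeping X-ray structures whose reported resolution value is below 3.0 Å, the value is taken from the `REMARK   2` record of a PDB-format file, and entries with no numeric resolution (for example NMR entries) pass through unchanged. The function names are hypothetical.

```python
import re

# Matches lines such as "REMARK   2 RESOLUTION.    1.90 ANGSTROMS."
_RESOLUTION = re.compile(r"^REMARK   2 RESOLUTION\.\s+([0-9]+\.?[0-9]*)\s+ANGSTROMS")


def parse_resolution(pdb_text):
    """Return the resolution in angstroms, or None if it is not reported."""
    for line in pdb_text.splitlines():
        match = _RESOLUTION.match(line)
        if match:
            return float(match.group(1))
    return None


def passes_resolution(pdb_text, cutoff=3.0):
    """Keep entries with resolution value below the cutoff.

    Assumption: entries without a numeric resolution (e.g. NMR) are kept.
    """
    resolution = parse_resolution(pdb_text)
    return resolution is None or resolution < cutoff
```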
B-DNA dinucleotides filter [OBSOLETE AND NOT INCLUDED]
A filter to distinguish between the different right-handed DNA dinucleotide conformations was described in the 3DNA paper from Olson's group [4]. The method identifies DNA populations that can be visualized using Zp(h) vs. Zp plots (B-DNA = bottom distribution). Using this criterion, 3DNA was applied to classify all dinucleotides in the current set, and structures in which fewer than half of the dinucleotides were B-DNA were discarded (Note: this is the strongest filter applied).
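The "fewer than half" rule can be written as a small decision function. The sketch below assumes the per-dinucleotide classification has already been done elsewhere (for example from 3DNA output) and is supplied as a list of labels in which `"B"` marks a B-DNA step; the label scheme and function names are hypothetical, and no Zp or Zp(h) thresholds are implied.

```python
def b_dna_fraction(labels):
    """Fraction of dinucleotide steps labelled "B"; 0.0 for an empty list."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == "B") / len(labels)


def keep_structure(labels, min_fraction=0.5):
    """Discard a structure when fewer than half of its steps are B-DNA."""
    return b_dna_fraction(labels) >= min_fraction
```

Under this reading, a structure with exactly half of its steps labelled `"B"` is kept, since only structures with fewer than half are discarded.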
Redundancy reduction
After applying the filters above, the resulting set of structures (n=86) was analyzed by sequence to remove repeated motifs that occur locally in more than one structure. A dendrogram showing sequence identity between PDB sequences is shown below (left). For each group with 100% sequence identity, one representative structure was selected using resolution, pH (between 6.5 and 7.5), and an acceptable temperature range as criteria, giving a final set (n=34).
In the current project, 34 free DNA structures have been selected by filtering the entire PDB using the criteria described above.
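The selection of one representative per group of identical sequences can be sketched as below. This is an illustration, not the procedure actually used: the order of the criteria (pH inside 6.5 to 7.5 first, then the lowest resolution value) is an assumption, the temperature criterion is left out because its range is not stated, and the `Entry` fields and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    pdb_id: str
    sequence: str
    resolution: Optional[float]  # in angstroms; None when not reported
    ph: Optional[float]          # None when not reported


def _rank(entry):
    """Sort key: pH within 6.5-7.5 first, then lowest resolution value."""
    ph_ok = entry.ph is not None and 6.5 <= entry.ph <= 7.5
    resolution = entry.resolution if entry.resolution is not None else float("inf")
    # The PDB ID is a final tie-breaker so the choice is deterministic.
    return (not ph_ok, resolution, entry.pdb_id)


def pick_representatives(entries):
    """Group entries with identical sequences and keep the best of each group."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.sequence.upper()].append(entry)
    return [min(group, key=_rank) for group in groups.values()]
```

Given a list of `Entry` objects, `pick_representatives` returns one entry per distinct sequence; only groups with 100% sequence identity are merged, and partial identity is not handled.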
References
- [1] Berman HM, Olson WK, Beveridge DL, Westbrook J, Gelbin A, Demeny T, Hsieh SH, Srinivasan AR, Schneider B. The nucleic acid database. A comprehensive relational database of three-dimensional structures of nucleic acids. Biophys J. 1992 Sep;63(3):751-9. PMID:1384741 doi:10.1016/S0006-3495(92)81649-1
- [2] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000 Jan 1;28(1):235-42. PMID:10592235
- [3] B-DNA double helices are classified according to the author who uploaded the PDB.
- [4] Lu XJ, Olson WK. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nat Protoc. 2008;3(7):1213-27. PMID:18600227 doi:10.1038/nprot.2008.104