Standard Residues: Difference between revisions
Eric Martz (talk | contribs) No edit summary |
Revision as of 22:34, 19 December 2019
Standard residues (standard amino acids and nucleotides) are defined in the PDB data file format, and have record type ATOM in PDB-format atomic coordinate files. Standard residues are:
- The 22 standard amino acids, plus ambiguous residue codes ASX, GLX, and undetermined UNK. In 2014, the PDB added the genetically encoded proteinogenic amino acids SEC (U, selenocysteine) and PYL (O, pyrrolysine) as "standard"[1][2].
- Eleven standard nucleotides A, C, G, I, U, DA, DC, DG, DI, DT, and DU[3], plus N for an unknown nucleotide. (I is inosine.) The PDB provides this list under HET.
The distinction between ribonucleotides (A, C, G, I, U) and deoxyribonucleotides (DA, DC, DG, DI, DT, DU[3]) was first made when the PDB was remediated, effective August 1, 2007. The unremediated files can still be obtained; see Getting Unremediated PDB Files.
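The residue lists above can be turned into a small lookup. The following is a minimal sketch, not part of the PDB or any library: the function names `classify_residue` and `record_and_residue` are hypothetical, and the 20 classic three-letter amino acid codes are filled in from standard nomenclature because the article does not spell them out. The column positions (record name in columns 1-6, residue name in columns 18-20) are those of the PDB coordinate format.

```python
# Residue names taken from the lists above. The 20 classic three-letter
# amino acid codes are an addition from standard nomenclature (assumption:
# the article itself only names SEC and PYL explicitly).
AMINO_ACIDS = frozenset(
    "ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE LEU LYS MET PHE PRO SER "
    "THR TRP TYR VAL SEC PYL".split()
)
AMBIGUOUS_OR_UNKNOWN_AA = frozenset({"ASX", "GLX", "UNK"})
RIBONUCLEOTIDES = frozenset({"A", "C", "G", "I", "U"})
DEOXYRIBONUCLEOTIDES = frozenset({"DA", "DC", "DG", "DI", "DT", "DU"})
UNKNOWN_NUCLEOTIDE = frozenset({"N"})


def classify_residue(name):
    """Return a coarse class for a residue name (hypothetical helper)."""
    key = name.strip().upper()
    if key in AMINO_ACIDS:
        return "amino acid"
    if key in AMBIGUOUS_OR_UNKNOWN_AA:
        return "ambiguous or unknown amino acid"
    if key in RIBONUCLEOTIDES:
        return "ribonucleotide"
    if key in DEOXYRIBONUCLEOTIDES:
        return "deoxyribonucleotide"
    if key in UNKNOWN_NUCLEOTIDE:
        return "unknown nucleotide"
    return "non-standard"


def record_and_residue(line):
    """Return (record type, residue name) from one ATOM/HETATM line.

    Uses fixed columns: 1-6 for the record name, 18-20 for the residue name.
    """
    return line[0:6].strip(), line[17:20].strip()
```

For example, `classify_residue("DU")` falls in the deoxyribonucleotide set, while `classify_residue("T")` is reported as non-standard because ribo-T is not in the list above. Applying `record_and_residue` to coordinate lines of an old and a remediated file is one way to see a change such as HETATM CSE becoming ATOM SEC.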
Note that in Jmol, for backward compatibility, A, C, G, I, T, and U select nucleotides in either DNA or RNA, while DA, DC, DG, DI, DT, and DU select only DNA nucleotides. You can select only RNA nucleotides with, e.g., "(A, U) and RNA", or by enclosing the single-letter nucleotide names in brackets, e.g., "([A],[C],[G],[I],[U])".
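If such expressions are generated by a script rather than typed by hand, the two forms quoted above can be assembled from a list of names. This is a minimal sketch in Python that only builds the expression strings; the helper names `jmol_bracketed` and `jmol_rna_only` are hypothetical, and nothing here calls Jmol itself.

```python
RNA_NAMES = ("A", "C", "G", "I", "U")


def jmol_bracketed(names):
    """Build the bracketed form, e.g. ([A],[C]), from residue names."""
    if not names:
        raise ValueError("at least one residue name is required")
    return "(" + ",".join("[" + n + "]" for n in names) + ")"


def jmol_rna_only(names):
    """Build the '(A, U) and RNA' form from single-letter names."""
    if not names:
        raise ValueError("at least one residue name is required")
    return "(" + ", ".join(names) + ") and RNA"


print(jmol_bracketed(RNA_NAMES))   # ([A],[C],[G],[I],[U])
print(jmol_rna_only(("A", "U")))   # (A, U) and RNA
```

The resulting strings are atom expressions of the kind given to Jmol's `select` command.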
At RCSB.org, using the Advanced Search with query type Chemical ID, you can find all entries in the database that contain a particular chemical component. For example, in December 2019, five entries contained PYL (explained in Non-Standard Residues).
A complete list of all compounds in the PDB is available, including hydrogens, 3D structures, and bond orders, in the Chemical Components Dictionary of the Worldwide Protein Data Bank. This includes all Standard Residues as well as Non-Standard Residues, carbohydrate adducts, Ligands and Hetero Groups. It is updated weekly for newly released entries.
See also
Notes & References
- [1] Announcement: Standardization of Amino Acid Nomenclature, World Wide Protein Data Bank News, January 8, 2014.
- [2] 1fdo, released 1997, had selenocysteine 140 in chain A coded as HETATM CSE through the WWPDB snapshot of 2014-01-02, but had it coded as ATOM SEC in the 2014-12-03 snapshot. See Getting Unremediated PDB Files.
- [3] In December 2019, there were over 80 entries in the Protein Data Bank containing deoxyribo-U ("DU"). There were none containing ribo-T ("T").