Calculating GDT TS
This page is under construction. Eric Martz 01:18, 5 March 2021 (UTC)
What is GDT_TS?
The Global Distance Test - Total Score (GDT_TS)[1][2] is used to quantitate the similarity between a predicted protein structure and a reference structure, which is typically an empirical model. GDT_TS gives an overall average measure of how close each amino acid in the predicted model is to its counterpart in the empirical model, taking into account many different superpositions of the two models. When the two structures differ in detail, GDT_TS is better at detecting similarities in fold than is the Root Mean Square Deviation (RMSD). "RMSD uses the actual distances between alpha carbons, where GDT works with the percentage of alpha carbons that are found within certain cutoff distances of each other."[3] Both tests compare the positions of only the alpha carbon atoms. GDT_TS values range from 0 (a meaningless prediction) to 100 (a perfect prediction). "Random predictions give around 20; getting the gross topology right gets one to ~50; accurate topology is usually around 70; and when all the little bits and pieces, including side-chain conformations, are correct, GDT_TS begins to climb above 90."[4]
Results of predictions submitted to the biennial CASP competitions are judged largely by GDT_TS. Proteopedia pages using GDT_TS include Theoretical models and AlphaFold2 examples from CASP 14.
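The "percentage of alpha carbons within certain cutoff distances" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LGA program's implementation: it assumes the commonly used GDT_TS cutoffs of 1, 2, 4 and 8 Å (not stated on this page), it assumes the percentages are taken relative to the number of residues in the reference structure, and it scores a single superposition, whereas the real GDT_TS searches many superpositions and keeps the best percentage for each cutoff. The function name and its parameters are hypothetical.

```python
from typing import Sequence

# Assumption: the commonly used GDT_TS distance cutoffs, in angstroms.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts_single_superposition(
    ca_distances: Sequence[float],
    n_reference: int,
    cutoffs: Sequence[float] = GDT_TS_CUTOFFS,
) -> float:
    """Average, over the cutoffs, of the percentage of reference alpha
    carbons whose paired model alpha carbon lies within each cutoff.

    ca_distances: distances (angstroms) between paired alpha carbons
                  after ONE superposition (hypothetical input).
    n_reference:  number of residues in the reference structure.
    """
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    percentages = [
        100.0 * sum(1 for d in ca_distances if d <= cutoff) / n_reference
        for cutoff in cutoffs
    ]
    return sum(percentages) / len(percentages)


# Five paired alpha carbons: 1, 2, 3 and 4 of them fall within
# 1, 2, 4 and 8 angstroms, i.e. 20%, 40%, 60% and 80%.
example_score = gdt_ts_single_superposition([0.5, 1.5, 3.0, 6.0, 10.0], 5)
print(example_score)
```

Because only one superposition is scored, a value computed this way can only underestimate the GDT_TS reported by a tool that optimizes the superposition separately for each cutoff.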
Server for calculating GDT_TS
GDT_TS can be calculated with the free AS2TS Server provided by Adam Zemla[5][6]. Below are detailed instructions kindly provided by Zemla.
Run 1: Superposition
You will need to do two runs on the server. The first run produces the best superposition. The second run calculates GDT_TS based on that superposition.
1. Go to the AS2TS server: linum.proteinmodel.org.
2. Under Protein Structure Analysis services, click LGA = pairwise protein structure comparison. (Technical information is available at the Service description link on the same line.)
3. Fill in your email address.
4. Provide the two structures to be compared using method 1, 2, or 3. Specify the predicted (query) structure first; it will be superposed on the reference structure. Specify the reference structure second.
- For a concrete example, we'll use SARS-CoV-2 ORF8, which was a target in CASP 14. There are two X-ray structures, 7jtl and 7jx6. Chain A of the latter has the highest resolution and the fewest missing residues. When it was submitted to the DALI Server in PDB 25 mode (February 2021), the top hit for structural similarity was the N-terminal domain of 5a2f. The entry made in slot 1 was therefore 5a2f_A 7jx6_A.
5. Caveat: If you let your browser auto-fill the email address slot, make sure to clear any other slots that got auto-filled inadvertently. Otherwise you may get an error message.
6. Leaving the parameters at their defaults, press the START button.
- In the results for Run 1, you may be interested in the RMSD and Seq_Id for the superposition deemed optimal by this server.
The LGA_S value is a structure similarity score for the number of alpha carbons given under N. LGA_S values below 40 indicate that the two structures have different folds. The LGA_S score for our example is 49.46 for 87 alpha carbons, indicating similar folds.
- Caveat: If the first structure is half the length of the second (reference) structure, then the maximum possible LGA score is 50%. On the other hand, if the second structure is half the length of the first one, then the maximum possible LGA score is 100%. In our example, the length of 5a2f_A is 218 amino acids, and the length of the reference structure 7jx6_A is 104. Therefore, a score of 100 is possible in principle.
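The length caveat and the fold threshold above can be expressed as a small Python sketch. It assumes, based only on the two cases described in the caveat, that the score is normalized by the length of the second (reference) structure, so the ceiling is the smaller of the two lengths divided by the reference length. Both function names are hypothetical and are not part of the AS2TS server or LGA.

```python
def max_possible_lga_score(n_query: int, n_reference: int) -> float:
    """Upper bound (percent) on the LGA score, assuming normalization
    by the length of the second (reference) structure."""
    if n_query <= 0 or n_reference <= 0:
        raise ValueError("structure lengths must be positive")
    return 100.0 * min(n_query, n_reference) / n_reference


def folds_look_similar(lga_s: float, threshold: float = 40.0) -> bool:
    """LGA_S values below the threshold (40 on this page) indicate
    different folds; values at or above it are treated as similar."""
    return lga_s >= threshold


# Query half the length of the reference: ceiling is 50%.
print(max_possible_lga_score(52, 104))
# The page's example: 5a2f_A (218 residues) on 7jx6_A (104 residues).
print(max_possible_lga_score(218, 104))
# The page's example LGA_S score of 49.46.
print(folds_look_similar(49.46))
```

Checking the ceiling before interpreting a score helps distinguish a low score caused by a short query from one caused by a genuinely different fold.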
References
1. GDT description at the CASP website.
2. Global distance test at Wikipedia.
3. GDT in the Foldit Wiki.
4. AlphaFold2 @ CASP14: “It feels like one’s child has left home.” by Mohammed AlQuraishi, December 8, 2020.
5. Zemla A. LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res. 2003 Jul 1;31(13):3370-4. doi:10.1093/nar/gkg571. PMID:12824330.
6. Zemla A, Zhou CE, Slezak T, Kuczmarski T, Rama D, Torres C, Sawicka D, Barsky D. AS2TS system for protein structure modeling and analysis. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W111-5. doi:10.1093/nar/gki457. PMID:15980437.