AlphaFold2 examples from CASP 14
| [[7jtl]]:A || || 4.02<br>'''0.66''' || 102/102 (100%)<br>'''87/102 (85%)''' || 4.3<br>'''1.58''' || 829/918 (90%)<br>'''709/918 (77%)'''
|-
| AlphaFold2 || 87 || 2.58<br>'''1.25''' || 92/92 (100%)<br>'''83/92* (90%)''' || 3.23<br>'''1.21''' || 727/728 (100%)<br>'''336/728 (46%)'''
|}
Revision as of 02:58, 24 February 2021
This page is under construction. Eric Martz 01:03, 22 February 2021 (UTC)
Predicting protein structures from amino acid sequences (theoretical modeling) has been extremely challenging. In 2020, breakthrough success was achieved by AlphaFold2[1], a project of DeepMind. For an overview of this breakthrough, verified by the biennial prediction competition CASP, please see 2020: CASP 14. Some examples of predictions from that competition are illustrated below.
SARS-CoV-2 ORF8
Following the discussion by Rubiera[2], our first example is SARS-CoV-2 protein ORF8, which contributes to virulence in COVID-19[3]. CASP 14 classified ORF8 as a "free modeling" (FM) target[4], meaning that there were no adequate empirical templates for homology modeling. This is easily confirmed: when the amino acid sequence of ORF8 is submitted to Swiss Model, it reports the best templates for homology modeling. When the two empirical models that were not available during CASP 14 (7jtl and 7jx6) are excluded, the best template offered, chain B of 3afc, covers only 36% of the length of ORF8 at 13.2% sequence identity, with a 4-residue untemplated gap in the sequence alignment. This template would not be adequate for constructing a useful model.
X-Ray Structures for ORF8
The quality of predictions for the structure of ORF8 is judged by comparison with X-ray crystallographic empirical models that were not available to the groups making predictions. Shortly after the CASP 14 competition (summer 2020), two X-ray crystal structures were reported for ORF8: 7jtl, released August 26, 2020, and 7jx6, released September 23, 2020. The resolutions are 2.0 and 1.6 Å respectively, and both have worse than average Rfree values.
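The template coverage and sequence identity figures quoted above for the Swiss Model search can be illustrated with a small calculation. The following is a minimal sketch, not Swiss Model's own procedure: the function name `coverage_and_identity` and the toy alignment are hypothetical, and the definitions used (coverage as aligned target residues over target length, identity as identical pairs over aligned pairs) are assumptions, since servers differ in how they define these quantities.

```python
from typing import Tuple


def coverage_and_identity(target_aln: str, template_aln: str) -> Tuple[float, float]:
    """Return (coverage %, identity %) for the two rows of a pairwise alignment.

    Assumed definitions (other tools may define these differently):
      coverage = target residues aligned to a template residue / target length
      identity = identical aligned pairs / aligned pairs
    Gaps are written as '-'.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("alignment rows must have equal length")

    # Number of residues in the target sequence (gap characters excluded).
    target_len = sum(1 for c in target_aln if c != "-")

    # Columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(target_aln, template_aln)
             if a != "-" and b != "-"]
    if target_len == 0 or not pairs:
        return 0.0, 0.0

    identical = sum(1 for a, b in pairs if a == b)
    coverage = 100.0 * len(pairs) / target_len
    identity = 100.0 * identical / len(pairs)
    return coverage, identity


# Toy alignment (hypothetical sequences, not ORF8 or 3afc).
cov, ident = coverage_and_identity("MKFLVAAGTC", "--FIV--GSC")
print(f"coverage {cov:.1f}%, identity {ident:.1f}%")
```

A template with low coverage and low identity, such as the 36% and 13.2% reported for 3afc chain B, leaves most of the target without a reliable structural guide, which is why ORF8 qualified as a free modeling target.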
: 87 of the 102 residues/chain (85%) align with a root mean square deviation (RMSD) of 0.66 Å[5]. When sidechains are included, 829 atoms (90% of 918 total) align with RMSD 4.3 Å[6], and 348 atoms (38%) align with RMSD 0.71 Å[5].
AlphaFold2 Prediction for ORF8
The quality of a prediction in CASP is judged, in large part, by the Global Distance Test Total Score, GDT_TS. AlphaFold2's predicted structure[7] has a GDT_TS score of 87. (A score of 0 is meaningless, and a score of 100 means perfect agreement with an X-ray crystal structure.) A score of 87 means the model is close to the accuracy of an X-ray crystal structure. The structure predicted by AlphaFold2 is almost as close to the X-ray crystallographic model 7jx6 as is the independently determined X-ray structure 7jtl. AlphaFold2 predicted the positions of 92 amino acids. (CASP 14 excluded residues 48-59, a 12-residue surface loop, from the target residues[4].) The alpha carbons of 83 amino acids in the prediction aligned with those in 7jx6 with an RMSD of 1.25 Å[5]. When sidechains are included,
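The RMSD and GDT_TS figures discussed in this section can be made concrete with a short sketch. It assumes the two coordinate sets are already superposed and paired atom by atom; it does not perform the fitting done by Swiss PDB Viewer, and it evaluates only one superposition, whereas CASP's GDT_TS searches over many. GDT_TS is taken here as the average, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of alpha carbons lying within that distance of the reference. All function names and the toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distances(model: Sequence[Point], reference: Sequence[Point]) -> List[float]:
    """Per-atom distances (Å) between paired, already-superposed coordinates."""
    if len(model) != len(reference):
        raise ValueError("coordinate lists must pair up one to one")
    return [math.dist(p, q) for p, q in zip(model, reference)]


def rmsd(dists: Sequence[float]) -> float:
    """Root mean square deviation of a set of per-atom distances."""
    return math.sqrt(sum(d * d for d in dists) / len(dists))


def gdt_ts_single_superposition(
    dists: Sequence[float],
    cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0),
) -> float:
    """Average percentage of atoms within each cutoff, for ONE superposition.

    Simplification: the real GDT_TS maximizes each percentage over many
    superpositions, so this value is a lower bound on it.
    """
    n = len(dists)
    fractions = [sum(1 for d in dists if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy data: four alpha carbons displaced by 0.5, 1.5, 3.0 and 9.0 Å.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 9.0, 0.0)]

d = distances(model, reference)
print("RMSD, all atoms:       ", round(rmsd(d), 2))
# Dropping poorly matching atoms lowers the RMSD, which is why an RMSD is
# only meaningful together with the number of atoms it covers.
close = [x for x in d if x <= 2.0]
print("RMSD, atoms within 2 Å:", round(rmsd(close), 2), f"({len(close)}/{len(d)} atoms)")
print("GDT_TS (one superposition):", gdt_ts_single_superposition(d))
```

The toy numbers show the same pattern as the comparisons above: a single badly placed atom inflates the all-atom RMSD, while the subset of well-matched atoms agrees closely, so both the RMSD and the fraction of atoms included need to be reported.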
References
- Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K, Hassabis D. Improved protein structure prediction using potentials from deep learning. Nature. 2020 Jan;577(7792):706-710. Epub 2020 Jan 15. PMID:31942072 doi:http://dx.doi.org/10.1038/s41586-019-1923-7
- CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics, a blog post by Carlos Outeiral Rubiera, December 3, 2020.
- Flower TG, Buffalo CZ, Hooy RM, Allaire M, Ren X, Hurley JH. Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein. Proc Natl Acad Sci U S A. 2021 Jan 12;118(2). pii: 2021785118. PMID:33361333 doi:http://dx.doi.org/10.1073/pnas.2021785118
- Summary and Classifications of Domains for CASP 14.
- Alignment and RMSD calculation was by iterative magic fit of Swiss PDB Viewer.
- Alignment and RMSD calculation was by magic fit of Swiss PDB Viewer.
- Download AlphaFold2's predicted structure for ORF8 from T1064TS427_1-D1.pdb.