AlphaFold

From Proteopedia

In 2020, the AlphaFold2[1][2] system of DeepMind[3][4] demonstrated a major breakthrough[5][6][7][8]. At CASP14, among over 100 competing groups, AlphaFold2 was by far the best at predicting structures, including sidechain positions. Its predictions were so close to the subsequently revealed X-ray crystallographic structures that they differed by little more than two independently determined X-ray structures of the same molecule differ from each other. It achieved this for about two-thirds of the targets in the competition. AlphaFold2 has been hailed as largely solving the protein structure prediction problem for single-chain proteins[5][6][7][8]. "Never in my life had I expected to see a scientific advance so rapid," said Mohammed AlQuraishi of Columbia University[5].

See Theoretical_models#2020:_CASP_14 for more about the initial demonstration at CASP14, and the reactions to it. AlphaFold2_examples_from_CASP_14 describes a detailed analysis of two of the CASP14 predictions.

AlphaFold published July 2021

AlphaFold was published in July, 2021[9]. Methods were described in considerable detail. The source code, trained weights, and inference script were made available under an open-source license. Structure prediction required about one GPU (Graphics Processing Unit) minute per model of about 384 amino acids.

Impressively, AlphaFold had remarkable success predicting a set of 10,795 protein chain structures (filtered for high reliability, lengths restricted to 80-1,400 residues) published in the PDB after AlphaFold's training set[10]. Overall alpha carbon accuracy had a median of 1.46 Å RMSD at 95% coverage. The majority of chain structures were predicted with full-chain alpha carbon RMSD values <2 Å. About 25% were predicted with RMSD >4 Å.
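As a rough illustration of how an RMSD figure like those above is obtained, the following Python sketch computes an alpha carbon RMSD over the best-fitting fraction of residues. It assumes the predicted and experimental coordinates are already superposed and paired residue by residue. The function name ca_rmsd and the simple "drop the worst-fitting 5%" reading of 95% coverage are illustrative assumptions, not the procedure used in the publication.

```python
import math


def ca_rmsd(pred, ref, coverage=1.0):
    """RMSD (in angstroms) over the best-fitting fraction of paired atoms.

    pred, ref: equal-length lists of (x, y, z) alpha carbon coordinates,
               assumed to be already superposed (no fitting is done here).
    coverage:  fraction of residues kept, e.g. 0.95 drops the worst 5%.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("need two non-empty coordinate lists of equal length")
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")

    # Squared deviation for each residue pair, smallest first.
    squared = sorted(
        sum((p - r) ** 2 for p, r in zip(a, b)) for a, b in zip(pred, ref)
    )

    # Number of residues kept, rounded to the nearest whole residue.
    keep = max(1, round(coverage * len(squared)))
    return math.sqrt(sum(squared[:keep]) / keep)
```

A single badly placed residue can dominate a full-chain RMSD, which is why a value at reduced coverage is often reported alongside it.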

Importantly, each prediction comes with a confidence score that reliably predicts the accuracy of the predicted structure.
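The confidence score can be inspected programmatically. The sketch below assumes a PDB-format prediction file in which the per-residue confidence (known as pLDDT, on a 0-100 scale) is stored in the B-factor field of each ATOM record, which is the convention used by AlphaFold output files. The function names and the cutoff of 70 are assumptions chosen for illustration.

```python
def ca_confidences(pdb_text):
    """Return one confidence value per residue, read from CA atoms.

    Assumes fixed-column PDB ATOM records, with the confidence score
    stored in the B-factor field (columns 61-66).
    """
    scores = []
    for line in pdb_text.splitlines():
        # Atom name occupies columns 13-16 of an ATOM record.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarize_confidence(scores, threshold=70.0):
    """Return (mean score, fraction of residues at or above threshold)."""
    if not scores:
        raise ValueError("no CA atoms found")
    mean = sum(scores) / len(scores)
    fraction = sum(1 for s in scores if s >= threshold) / len(scores)
    return mean, fraction
```

Given the text of a prediction file, summarize_confidence(ca_confidences(text)) gives a quick indication of how much of the model is likely to be reliable.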

Accurate prediction of sidechains required accurate prediction of the main chain. Accurate prediction required a multiple sequence alignment depth >~30 sequences, with a depth of ~100 sequences being adequate.
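The depth thresholds above can be checked before running a prediction. The sketch below counts the sequences in a FASTA-style alignment (one ">" header line per sequence) and compares the count with the approximate thresholds of 30 and 100. A raw sequence count is used here as a stand-in for alignment depth; the published analysis may weight similar sequences differently. The function names and category labels are hypothetical.

```python
def msa_depth(alignment_text):
    """Count sequences in a FASTA-style alignment (one '>' header each)."""
    return sum(
        1 for line in alignment_text.splitlines() if line.startswith(">")
    )


def depth_category(depth, minimum=30, adequate=100):
    """Classify an alignment depth against the approximate thresholds."""
    if depth <= minimum:
        return "likely insufficient"
    if depth < adequate:
        return "intermediate"
    return "adequate"
```

For example, depth_category(msa_depth(text)) applied to an alignment of 45 sequences falls in the intermediate band, where accuracy may still be limited by the alignment.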

Free AlphaFold-based Servers

RoseTTAFold

Also in July, 2021, Minkyung Baek and a large team in the group of David Baker developed RoseTTAFold, which employs a three-track network based in part on methods inspired by AlphaFold but not yet fully detailed by DeepMind. They reported "accuracies approaching those of DeepMind in CASP14"[11]. At the time of its release in July, 2021, it had outperformed all other available structure prediction servers[11].

The RoseTTAFold Server was made freely available. (Open the Structure Prediction menu at the top and choose Submit. At the form, be sure to check RoseTTAFold before submitting your job).

AlphaFold Database of Predictions

Also in July, 2021, DeepMind made available over 300,000 structure predictions from amino acid sequences in their free AlphaFold DB[12][13][14][15]. These predictions include all ~20,000 proteins in the human proteome[15][16]. The limitations of these predictions were enumerated

References

  1. Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K, Hassabis D. Improved protein structure prediction using potentials from deep learning. Nature. 2020 Jan;577(7792):706-710. doi: 10.1038/s41586-019-1923-7. Epub 2020 Jan 15. PMID:31942072
  2. AlphaFold at Wikipedia.
  3. AlphaFold: a solution to a 50-year-old grand challenge in biology, DeepMind Blog, November 30, 2020.
  4. DeepMind at Wikipedia.
  5. AlphaFold2 @ CASP14: “It feels like one’s child has left home.” by Mohammed AlQuraishi, December 8, 2020.
  6. Artificial intelligence solution to a 50-year-old science challenge could ‘revolutionise’ medical research, CASP Press Release, November 30, 2020.
  7. Callaway E. 'It will change everything': DeepMind's AI makes gigantic leap in solving protein structures. Nature. 2020 Dec;588(7837):203-204. doi: 10.1038/d41586-020-03348-4. PMID:33257889
  8. DeepMind and CASP14 by John R. Helliwell, International Union of Crystallography Newsletter, December 4, 2020.
  9. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Zidek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature. 2021 Jul 15. doi: 10.1038/s41586-021-03819-2. PMID:34265844
  10. The training set cutoff was 2018/04/30. The test set was obtained between then and 2021/02/15.
  11. Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD, Millan C, Park H, Adams C, Glassman CR, DeGiovanni A, Pereira JH, Rodrigues AV, van Dijk AA, Ebrecht AC, Opperman DJ, Sagmeister T, Buhlheller C, Pavkov-Keller T, Rathinaswamy MK, Dalwadi U, Yip CK, Burke JE, Garcia KC, Grishin NV, Adams PD, Read RJ, Baker D. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021 Jul 15. doi: 10.1126/science.abj8754. PMID:34282049
  12. We’ve made AlphaFold predictions freely available to anyone in the scientific community at DeepMind.com (date of release not specified, approximately July 2021).
  13. AlphaFold’s protein structure predictions now available to explore at the European Bioinformatics Institute, July 23, 2021.
  14. Great expectations – the potential impacts of AlphaFold DB at the European Bioinformatics Institute, July 22, 2021
  15. DeepMind and EMBL release the most complete database of predicted 3D structures of human proteins at the European Bioinformatics Institute, July 22, 2021.
  16. Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Zidek A, Bridgland A, Cowie A, Meyer C, Laydon A, Velankar S, Kleywegt GJ, Bateman A, Evans R, Pritzel A, Figurnov M, Ronneberger O, Bates R, Kohl SAA, Potapenko A, Ballard AJ, Romera-Paredes B, Nikolov S, Jain R, Clancy E, Reiman D, Petersen S, Senior AW, Kavukcuoglu K, Birney E, Kohli P, Jumper J, Hassabis D. Highly accurate protein structure prediction for the human proteome. Nature. 2021 Jul 22. doi: 10.1038/s41586-021-03828-1. PMID:34293799

Proteopedia Page Contributors and Editors

Eric Martz, Angel Herraez, Joel L. Sussman, Wayne Decatur