Homology modeling servers

From Proteopedia

A number of free servers create homology models (also called comparative models) for a submitted amino acid sequence, or offer libraries of 3D models created in advance for protein sequences. The performance of homology modeling methods is evaluated in an international, biennial competition called CASP. A comparison of 10 servers is included in the 2009 description of Phyre by Kelley and Sternberg[1], which also offers guidance on how to use these servers effectively.

Servers

Terminology: The sequence with unknown 3D structure is usually called the target. It is modeled on the template. Because the words target and template are similar and easily confused, this page calls the target the query.

The list below is incomplete and may omit some of the best servers. It also does not assess server performance. Please help by adding servers.

  • SWISS-MODEL provides a free, fully-automated homology modeling service. In the Automated Mode, you submit a protein sequence. When the PDB contains an empirically-determined structure with sufficient sequence identity to your query sequence, that structure is used as the template, and the homology model is constructed automatically.
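Whether a template has "sufficient sequence identity" depends on how identity is counted. The sketch below is one illustrative way to compute percent identity from a pairwise alignment. It assumes gaps are written as "-" and that identity is counted only over columns where both sequences have a residue; servers may use a different denominator or threshold. The function name `percent_identity` is hypothetical, not part of any server.

```python
def percent_identity(query_aln: str, template_aln: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: both strings come from the same pairwise alignment,
    have equal length, and use '-' for gaps.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either sequence.
    aligned = [(q, t)
               for q, t in zip(query_aln.upper(), template_aln.upper())
               if q != "-" and t != "-"]
    if not aligned:
        return 0.0
    identical = sum(1 for q, t in aligned if q == t)
    return 100.0 * identical / len(aligned)
```

For example, `percent_identity("MKT-AYL", "MRTGA-L")` counts 4 identical residues over 5 ungapped columns.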

Handling of gaps

There are three kinds of gaps that present challenges when creating a homology model. It is important to know how a given server handles these challenges. Behavior marked Caution seems likely to produce errors in the homology model.

For each server, the entries below describe three cases. The first two are gaps in the sequence alignment[2]: untemplated query residues (a gap in the template sequence), and a gap in the query sequence. The third is template residues lacking 3D coordinates.

Swiss-Model (Automated Mode)

  • Untemplated query residues (gap in template sequence): Untemplated query residues (aligned with a gap in the template sequence) are present in the 3D model, and are indicated with a high temperature value. This appears to be true regardless of the length of the untemplated region. Long untemplated regions may occur in the 3D model as a long hairpin loop extending away from a compact domain, making their lack of template fairly obvious.
  • Gap in query sequence: The 3D model takes a shortcut, skipping the residues in the template aligned with the gap. This causes the 3D template to bulge away from the 3D query model in this region, and permits registration to be maintained.
  • Template residues lacking 3D coordinates: Caution: These residues are omitted from the sequence alignment, yet the omission is not indicated there by a gap. The 3D model lacks a spatial gap between the residues at the gap boundaries, effectively ligating** them. This causes a shift in query-template registration, and produces a 3D model that fails to make apparent the absence of some residues in the template (unless the structural alignment is examined as the downloadable Project in DeepView). The absence of some residues will affect analyses of the 3D model, such as charge distribution and distribution of evolutionary conservation.

Phyre2 (Normal Modeling Mode)

  • Untemplated query residues (gap in template sequence), short regions: For short untemplated regions (e.g. 1-5 untemplated residues), the untemplated query residues are present in the 3D model. Registration is maintained according to the sequence alignment by bunching up the untemplated residues, which bulge away from the template in the 3D model.
  • Untemplated query residues (gap in template sequence), long regions: Caution: For long untemplated regions (e.g. 87 residues), the untemplated query residues are omitted from the 3D model, effectively ligating** the templated boundary residues together. The omission fails to reveal, in the 3D model, that a large untemplated region exists. The absence of some query residues will affect analyses of the 3D model, such as charge distribution and distribution of evolutionary conservation.
  • Caution: Phyre2's behavior is identical to that of Swiss-Model, above.

*Observation is based on a single model and needs confirmation with additional models.
**Covalent peptide bonds between amino acids are not explicit in PDB files, but all commonly used software places covalent bonds based on interatomic distances. Thus, when a spatial gap is omitted in the 3D model, the two residues abutting the gap are effectively ligated.
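Two of the behaviors described above can be checked directly in a downloaded model file. The sketch below is a minimal illustration, not a tool used by either server. It reads fixed-column ATOM records from PDB-format text, lists residues whose alpha-carbon temperature value is at or above a cutoff, and reports places where residue numbers jump while the C and N atoms on either side remain within bonding distance (the "effectively ligated" case). The function names, the caller-supplied `cutoff`, and the 1.5 Å `max_bond` default are all assumptions; a peptide C-N bond is about 1.33 Å. Insertion codes are not treated specially when looking for jumps.

```python
import math

def parse_backbone(pdb_text):
    """Collect N, CA and C atoms per residue, in file order.

    Returns {(chain, resseq, icode): {atom_name: (xyz, b_factor)}}.
    """
    residues = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name not in ("N", "CA", "C"):
            continue
        key = (line[21], int(line[22:26]), line[26].strip())
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(key, {})[name] = (xyz, float(line[60:66]))
    return residues

def high_b_residues(residues, cutoff):
    """Residues whose CA temperature value is >= cutoff (cutoff is an assumption)."""
    return [key for key, atoms in residues.items()
            if "CA" in atoms and atoms["CA"][1] >= cutoff]

def numbering_jumps(residues, max_bond=1.5):
    """List (prev, next, C-N distance, ligated) wherever numbering jumps by > 1.

    ligated is True when the C of the earlier residue and the N of the
    later residue are within max_bond angstroms, so viewers will draw a bond.
    """
    keys = list(residues)
    jumps = []
    for prev, nxt in zip(keys, keys[1:]):
        if prev[0] != nxt[0] or nxt[1] - prev[1] <= 1:
            continue  # different chain, or consecutive numbering
        if "C" not in residues[prev] or "N" not in residues[nxt]:
            continue  # backbone atoms missing; cannot measure
        d = math.dist(residues[prev]["C"][0], residues[nxt]["N"][0])
        jumps.append((prev, nxt, round(d, 2), d <= max_bond))
    return jumps
```

A jump reported with `ligated` set to True marks residues that are numbered non-consecutively yet sit within bonding distance, so the 3D model gives no visual hint that residues are missing between them.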

Problems

Sequence Numbering Anomalies

It is common for the sequences of proteins in PDB structures to begin with a number other than 1 (2fsr:A, 1ucy:E, 1nsa), and to include a residue numbered zero (1avq:A, 1bxw:A). Discontinuities in sequential numbering may occur (1igt:B, 2fsr:A, 1nsa, 1iao:B). Residues in the same chain may have the same sequence number, notably in the case of "insertions" relative to a reference sequence (1igt:B, 1ucy). These inserted residues may all have the same number, but are distinguished by insertion codes, typically letters in alphabetical order. However, in rare cases, the letters may not be in alphabetical order, e.g. chain J in 1ucy. An overview of sequence numbering anomalies in the PDB, including further examples, is at Specification for Interactive Sequence Listings (in FirstGlance in Jmol).

  • Phyre2 - Caution: In early April 2011, Phyre2 numbered the aligned portion of the template sequence incorrectly when the above kinds of sequence numbering anomalies occurred in the template PDB file. At that time, the development team had acknowledged the problem and was working on a fix.
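Before trusting the numbering in a server's alignment, it can help to check whether the template chain has any of the anomalies described above. The following is a minimal sketch under stated assumptions: it reads residue numbers and insertion codes from fixed-column ATOM records of PDB-format text, and the function names and report keys are hypothetical. It treats any decrease in numbering, or any increase greater than 1, as a discontinuity.

```python
def residue_ids(pdb_text, chain):
    """Return [(resseq, icode), ...] for one chain, in file order, without repeats."""
    ids = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[21] == chain:
            rid = (int(line[22:26]), line[26].strip())
            if not ids or ids[-1] != rid:
                ids.append(rid)
    return ids

def numbering_anomalies(ids):
    """Summarize numbering anomalies in a list of (resseq, icode) pairs."""
    report = {
        "starts_at": ids[0][0] if ids else None,  # a value other than 1 is worth noting
        "has_residue_zero": any(n == 0 for n, _ in ids),
        "discontinuities": [],       # (number before, number after)
        "insertion_codes": [(n, c) for n, c in ids if c],
        "unordered_insertions": [],  # (number, code before, code after)
    }
    for (n1, c1), (n2, c2) in zip(ids, ids[1:]):
        if n2 - n1 > 1 or n2 < n1:
            report["discontinuities"].append((n1, n2))
        # Same number, both with insertion codes, letters not ascending.
        if n1 == n2 and c1 and c2 and c2 < c1:
            report["unordered_insertions"].append((n1, c1, c2))
    return report
```

An empty report (start at 1, no zero, no discontinuities, no insertion codes) suggests the template's numbering is unlikely to be the cause of a numbering mismatch.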

Sequence Alignment

  • Swiss-Model fails to indicate which residues are identical, and which are similar, in its sequence alignment.
  • Phyre2 may number the aligned template sequence incorrectly -- see above.
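When a server's alignment does not mark identical and similar residues, a marker line can be generated from the two aligned sequences. The sketch below is illustrative only. It assumes gaps are written as "-", marks identical residues with "|" and similar residues with ":", and defines similarity with a simple set of residue groups chosen for this example (`SIMILAR_GROUPS` is an assumption, not a standard substitution matrix such as BLOSUM62).

```python
# Illustrative similarity groups (an assumption, not a standard):
# aliphatic, aromatic, polar uncharged, positive, negative.
SIMILAR_GROUPS = ["AVLIM", "FWY", "STNQ", "KRH", "DE"]

def marker_line(query_aln: str, template_aln: str) -> str:
    """Return a string marking each alignment column.

    '|' = identical, ':' = similar (same group), ' ' = different or gapped.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for q, t in zip(query_aln.upper(), template_aln.upper()):
        if "-" in (q, t):
            marks.append(" ")
        elif q == t:
            marks.append("|")
        elif any(q in group and t in group for group in SIMILAR_GROUPS):
            marks.append(":")
        else:
            marks.append(" ")
    return "".join(marks)
```

Printing the query, the marker line, and the template on three consecutive lines in a fixed-width font gives the familiar annotated alignment layout.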

See Also

References and Notes

  1. Kelley LA, Sternberg MJ. Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009;4(3):363-71. PMID:19247286 doi:10.1038/nprot.2009.2
  2. Alignment between the query and template sequences. See Homology modeling.

Proteopedia Page Contributors and Editors

Eric Martz, Wayne Decatur