Homology modeling servers: Difference between revisions

From Proteopedia
Eric Martz (talk | contribs)
(53 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Stub}}
<center><table style="background:#ffff80;"><tr><td>CAUTION: Issues with servers, reported in this article, have not been updated since 2011.</td></tr></table></center>


There are a number of free servers that create [[Theoretical_models#Homology_Models|homology models]] (also called ''comparative models'') for a submitted amino acid sequence, or that offer libraries of 3D models created in advance for protein sequences. The performance of homology modeling methods is evaluated in an international, biennial competition called [[CASP]]. A comparison of 10 servers is included in the 2009 description of Phyre by Kelley and Sternberg<ref>PMID: 19247286</ref>, which also offers guidance in how to use these servers effectively.
There are a number of free servers that create [[Theoretical_models#Homology_Models|homology models]] (also called ''comparative models'') for a submitted amino acid sequence, or that offer libraries of 3D models created in advance for protein sequences. The performance of homology modeling methods is evaluated in an international, biennial competition called [[CASP]]. A comparison of 10 servers is included in the 2009 description of Phyre by Kelley and Sternberg<ref>PMID: 19247286</ref>, which also offers guidance in how to use these servers effectively.


==Servers==
==Servers==
''The list below is incomplete, and may not include some of the best servers, nor does it include assessments of server performance.''
<center><table style="background: #ffe0a0;" width="90%" cellpadding="6">
<tr>
<td><b>Terminology:</b> The sequence with unknown 3D structure is usually called the <b>target</b>. It is modeled on the <b>template</b>. Because those two terms are similar, sometimes leading to confusion, we shall call the target the <b>query</b>.</td>
</tr></table></center>
''The list below is incomplete, and may not include some of the best servers, nor does it include assessments of server performance. Please help by adding additional servers.''


* [http://swissmodel.expasy.org/ SWISS-MODEL] provides a free, fully-automated homology modeling service. Using the ''Automated Mode'', you submit a protein sequence. When the [[PDB]] contains an empirically-determined structure with sufficient sequence identity with your target sequence, it will be used as a template. The resulting homology model will be constructed automatically.
* [http://swissmodel.expasy.org/ SWISS-MODEL] provides a free, fully-automated homology modeling service. Using the ''Automated Mode'', you submit a protein sequence. When the [[PDB]] contains an empirically-determined structure with sufficient sequence identity with your query sequence, it will be used as a template. The resulting homology model will be constructed automatically.


* [http://modbase.compbio.ucsf.edu/modbase-cgi/index.cgi ModBase: Database of Comparative Protein Structure Models] allows users to calculate comparative models on demand.
* [http://modbase.compbio.ucsf.edu/modbase-cgi/index.cgi ModBase: Database of Comparative Protein Structure Models] allows users to calculate comparative models on demand.
*[http://www.ncbi.nlm.nih.gov/pubmed/17640066 The Homology Modeling Automatically (HOMA) web site].
*[http://zhanglab.ccmb.med.umich.edu/I-TASSER/ I-TASSER], formerly known as 'the Zhang lab-Server', employs comparative protein modelling based on protein threading and has won the last few [[CASP]] events.
*[http://www.yasara.org/ YASARA] (Yet Another Scientific Artificial Reality Application) features a complete homology modeling module that fully automatically takes all the steps from an amino acid sequence to a refined high-resolution model using a [[CASP]]-approved protocol.
===MetaServers===
MetaServers are servers that submit your modeling job to other servers.
* [http://www.proteinmodelportal.org/ Protein Model Portal (PMP)]<ref>PMID: 19037750</ref>, a service of the Protein Structure Initiative.


==Handling of gaps==
==Handling of gaps==
There are three kinds of gaps that present challenges when creating a homology model. It is important to know how a given server handles these challenges.
There are three kinds of gaps that present challenges when creating a homology model. It is important to know how a given server handles these challenges. Behavior marked <font color="red">''Caution''</font> seems likely to produce errors in the homology model.


<table border="1" cellpadding="5"><tr><td rowspan="2">
<table border="1" cellpadding="5"><tr><td rowspan="2">
Server
Server
</td><td colspan="2"><center>
</td><td colspan="2"><center>
Sequence Alignment<ref>Alignment between the target and template sequences. See [[Homology modeling]].</ref>
Sequence Alignment<ref>Alignment between the query and template sequences. See [[Homology modeling]].</ref>
</center></td><td rowspan="2">
</center></td><td rowspan="2">
Template residues lacking 3D coordinates
Template residues lacking 3D coordinates
Line 22: Line 37:
</tr><tr>
</tr><tr>
<td>
<td>
Untemplated<br>(Gap in Template)
Untemplated Query Residues<br>(Gap in Template Sequence)
</td><td>
</td><td>
Gap in Target
Gap in Query Sequence
</td>
</td>
</tr><tr>
</tr><tr>
Line 30: Line 45:
[http://swissmodel.expasy.org/ Swiss-Model] (Automated Mode)
[http://swissmodel.expasy.org/ Swiss-Model] (Automated Mode)
</td><td>
</td><td>
Untemplated residues (aligned with gap in target sequence) are present in 3D model, and are indicated with a high [[temperature value]]*.
Untemplated query residues (aligned with a gap in the template sequence) are present in the 3D model, and are indicated with a high [[temperature value]]. This appears to be true regardless of the length of the untemplated region. Long untemplated regions may occur in the 3D model as a long hairpin loop extending away from a compact domain, making their lack of template fairly obvious.
</td><td>
</td><td>
The 3D model takes a shortcut, skipping the residues in the template aligned with the gap. This causes the 3D template to bulge away from the 3D query model in this region, and permits registration to be maintained.
</td><td>
</td><td>
<font color="red">
<font color="red">Caution:</font>
Omitted in sequence alignment. The 3D model lacks a spatial gap between the residues at the gap boundaries, effectively ligating** them. This causes a shift in registration.
Omitted in the sequence alignment, yet not indicated there by a gap. The 3D model lacks a spatial gap between the residues at the gap boundaries, effectively ligating** them. This causes a shift in query-template registration, and produces a 3D model that fails to make apparent the absence of some residues in the template (unless the structural alignment is examined as the downloadable ''Project'' in [[DeepView]]). The absence of some residues will affect analyses of the 3D model, such as charge distribution, and distribution of [[Evolutionary Conservation|evolutionary conservation]].
</font>
</td>
</td>
</tr><tr>
</tr><tr>
<td>
<td>
[http://www.sbg.bio.ic.ac.uk/~phyre/ Phyre2]
[http://www.sbg.bio.ic.ac.uk/~phyre/ Phyre2]
</td>
(Normal Modeling Mode)
<td>
</td><td>
</td>
For short untemplated regions (e.g. 1-5 untemplated residues), the untemplated query residues are present in the 3D model, and registration is maintained according to the sequence alignment by bunching up the untemplated residues in the 3D model, allowing the untemplated query residues to bulge away from the template in the 3D model.
<td>
<br><br>
</td>
<font color="red">Caution:</font> For long untemplated regions (e.g. 87 residues), the untemplated query residues are omitted from the 3D model, effectively ligating** the templated boundary residues together. The omission fails to reveal, in the 3D model, that a large untemplated region exists. The absence of some query residues will affect analyses of the 3D model, such as charge distribution, and distribution of [[Evolutionary Conservation|evolutionary conservation]].
<td>
</td><td>
</td><td>
<font color="red">Caution:</font>
Phyre2's behavior is identical to that of Swiss-Model, above.
</td>
</td>
</tr>
</tr>
Line 55: Line 73:
==Problems==
==Problems==
===Sequence Numbering Anomalies===
===Sequence Numbering Anomalies===
It is common for the sequences of proteins in [[PDB]] structures to begin with a number other than 1 ([[2fsr]]:A, [[1ucy]]:E, [[1nsa]]), and to include a residue numbered zero ([[1avq]]:A, [[1bxw]]:A). Discontinuities in sequential numbering may occur ([[1igt]]:B, [[2fsr]]:A, [[1nsa]], [[1iao]]:B). Residues in the same chain may have the same sequence number, notably in the case of "insertions" relative to a reference sequence ([[1igt]]:B, [[1ucy]]). These inserted residues may all have the same number, but are distinguished by insertion codes, typically letters in alphabetical order. However, in rare cases, the letters may not be in alphabetical order, e.g. chain J in [[1ucy]]. An overview of sequence numbering anomalies in the [[PDB]], including further examples, is at [http://firstglance.jmol.org/seqspecs.htm Specification for Interactive Sequence Listings] (in [[FirstGlance in Jmol]]).
It is common for the sequences of proteins in [[PDB]] structures to begin with a number other than 1 ([[2fsr]]:A, [[1ucy]]:E, [[1nsa]]), and to include a residue numbered zero ([[1avq]]:A, [[1bxw]]:A). Discontinuities in sequential numbering may occur ([[1igt]]:B, [[2fsr]]:A, [[1nsa]], [[1iao]]:B). Residues in the same chain may have the same sequence number, notably in the case of "insertions" relative to a reference sequence ([[1igt]]:B, [[1ucy]]). These inserted residues may all have the same number, but are distinguished by insertion codes, typically letters in alphabetical order. However, in rare cases, the letters may not be in alphabetical order, e.g. chain J in [[1ucy]]. An overview of sequence numbering anomalies in the [[PDB]], including further examples, is at [[Unusual sequence numbering]]<!--[http://firstglance.jmol.org/seqspecs.htm Specification for Interactive Sequence Listings] (in [[FirstGlance in Jmol]])-->.
* '''Phyre2''': In early April 2011, Phyre2 numbered the aligned portion of the template sequence incorrectly when the above kinds of sequence numbering anomalies occurred in the template [[PDB]] file. At that time, the development team had acknowledged the problem and was working on a fix.
* '''Phyre2''' - <font color="red">Caution</font>: In early April 2011, Phyre2 numbered the aligned portion of the template sequence incorrectly when the above kinds of sequence numbering anomalies occurred in the template [[PDB]] file. At that time, the development team had acknowledged the problem and was working on a fix.


===Sequence Alignment===
===Sequence Alignment===
Line 63: Line 81:


==See Also==
==See Also==
*[[Practical Guide to Homology Modeling]]
*[[Homology modeling]]
*[[Theoretical models]]
*[http://en.wikipedia.org/wiki/Homology_modeling Homology modeling] at Wikipedia.
*[http://en.wikipedia.org/wiki/Homology_modeling Homology modeling] at Wikipedia.
*[[User:Wayne Decatur/Homology Modeling]]
*[[User:Wayne Decatur/Homology Modeling]]
* An [http://www.jove.com/video/3259/a-protocol-for-computer-based-protein-structure-and-function-prediction article and video entitled "A Protocol for Computer-Based Protein Structure and Function Prediction"] at the [http://www.jove.com/ Journal of Visualized Experiments] illustrates and discusses the use of the I-TASSER server for protein structure and function prediction.
* [[Structural bioinformatics servers]]


==References and Notes==
==References and Notes==
<references />
<references />

Latest revision as of 21:43, 26 October 2018

CAUTION: Issues with servers, reported in this article, have not been updated since 2011.

There are a number of free servers that create homology models (also called comparative models) for a submitted amino acid sequence, or that offer libraries of 3D models created in advance for protein sequences. The performance of homology modeling methods is evaluated in an international, biennial competition called CASP. A comparison of 10 servers is included in the 2009 description of Phyre by Kelley and Sternberg[1], which also offers guidance in how to use these servers effectively.

Servers

Terminology: The sequence with unknown 3D structure is usually called the target. It is modeled on the template. Because those two terms are similar, sometimes leading to confusion, we shall call the target the query.

The list below is incomplete, and may not include some of the best servers, nor does it include assessments of server performance. Please help by adding additional servers.

  • SWISS-MODEL provides a free, fully-automated homology modeling service. Using the Automated Mode, you submit a protein sequence. When the PDB contains an empirically-determined structure with sufficient sequence identity with your query sequence, it will be used as a template. The resulting homology model will be constructed automatically.
  • I-TASSER, formerly known as 'the Zhang lab-Server', employs comparative protein modelling based on protein threading and has won the last few CASP events.
  • YASARA (Yet Another Scientific Artificial Reality Application) features a complete homology modeling module that fully automatically takes all the steps from an amino acid sequence to a refined high-resolution model using a CASP-approved protocol.

MetaServers

MetaServers are servers that submit your modeling job to other servers.

Handling of gaps

There are three kinds of gaps that present challenges when creating a homology model. It is important to know how a given server handles these challenges. Behavior marked Caution seems likely to produce errors in the homology model.

Each server is listed below with its behavior for the three kinds of gaps. The first two concern the sequence alignment[3]; the third concerns template residues lacking 3D coordinates.

Swiss-Model (Automated Mode)

  • Sequence alignment, untemplated query residues (gap in template sequence): Untemplated query residues (aligned with a gap in the template sequence) are present in the 3D model, and are indicated with a high temperature value. This appears to be true regardless of the length of the untemplated region. Long untemplated regions may occur in the 3D model as a long hairpin loop extending away from a compact domain, making their lack of template fairly obvious.
  • Sequence alignment, gap in query sequence: The 3D model takes a shortcut, skipping the residues in the template aligned with the gap. This causes the 3D template to bulge away from the 3D query model in this region, and permits registration to be maintained.
  • Template residues lacking 3D coordinates: Caution: Omitted in the sequence alignment, yet not indicated there by a gap. The 3D model lacks a spatial gap between the residues at the gap boundaries, effectively ligating** them. This causes a shift in query-template registration, and produces a 3D model that fails to make apparent the absence of some residues in the template (unless the structural alignment is examined as the downloadable Project in DeepView). The absence of some residues will affect analyses of the 3D model, such as charge distribution, and distribution of evolutionary conservation.

Phyre2 (Normal Modeling Mode)

  • Sequence alignment, untemplated query residues (gap in template sequence): For short untemplated regions (e.g. 1-5 untemplated residues), the untemplated query residues are present in the 3D model, and registration is maintained according to the sequence alignment by bunching up the untemplated residues in the 3D model, allowing the untemplated query residues to bulge away from the template in the 3D model. Caution: For long untemplated regions (e.g. 87 residues), the untemplated query residues are omitted from the 3D model, effectively ligating** the templated boundary residues together. The omission fails to reveal, in the 3D model, that a large untemplated region exists. The absence of some query residues will affect analyses of the 3D model, such as charge distribution, and distribution of evolutionary conservation.
  • Sequence alignment, gap in query sequence: (no entry)
  • Template residues lacking 3D coordinates: Caution: Phyre2's behavior is identical to that of Swiss-Model, above.

*Observation is based on a single model and needs confirmation with additional models.
**Covalent peptide bonds between amino acids are not explicit in PDB files, but all commonly used software places covalent bonds based on interatomic distances. Thus, when a spatial gap is omitted in the 3D model, the two residues abutting the gap are effectively ligated.
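
The ligation effect described in the second note can be looked for in a downloaded model. What follows is a minimal Python sketch, not part of any server's tooling. It assumes the model is a standard fixed-column PDB file, that the model keeps the query's residue numbering (so that omitted residues appear as a jump in residue numbers), and that a C-to-N distance below an assumed cutoff of 2.0 Å would be drawn as a bond by viewing software. The function names and the cutoff value are illustrative assumptions.

```python
import math

# Assumed cutoff in angstroms; a typical peptide C-N bond is about 1.33 angstroms.
PEPTIDE_BOND_CUTOFF = 2.0


def backbone_atoms(pdb_text, chain_id):
    """Collect backbone N and C coordinates per residue, in file order."""
    residues = []  # list of ((number, insertion_code), {"N": xyz, "C": xyz})
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 54 or line[21] != chain_id:
            continue
        name = line[12:16].strip()
        if name not in ("N", "C"):
            continue
        res_id = (int(line[22:26]), line[26].strip())
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if not residues or residues[-1][0] != res_id:
            residues.append((res_id, {}))
        residues[-1][1][name] = xyz
    return residues


def ligated_jumps(residues, cutoff=PEPTIDE_BOND_CUTOFF):
    """Report consecutive residues whose numbers jump but whose C and N atoms
    are close enough that software would draw a peptide bond between them."""
    hits = []
    for (id_a, atoms_a), (id_b, atoms_b) in zip(residues, residues[1:]):
        if "C" not in atoms_a or "N" not in atoms_b:
            continue
        distance = math.dist(atoms_a["C"], atoms_b["N"])
        if id_b[0] - id_a[0] > 1 and distance <= cutoff:
            hits.append((id_a[0], id_b[0], round(distance, 2)))
    return hits
```

Each reported tuple gives the residue numbers on either side of the jump and the C-to-N distance. A hit suggests that residues were omitted without leaving a spatial gap; it does not by itself say which server behavior produced the omission.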

Problems

Sequence Numbering Anomalies

It is common for the sequences of proteins in PDB structures to begin with a number other than 1 (2fsr:A, 1ucy:E, 1nsa), and to include a residue numbered zero (1avq:A, 1bxw:A). Discontinuities in sequential numbering may occur (1igt:B, 2fsr:A, 1nsa, 1iao:B). Residues in the same chain may have the same sequence number, notably in the case of "insertions" relative to a reference sequence (1igt:B, 1ucy). These inserted residues may all have the same number, but are distinguished by insertion codes, typically letters in alphabetical order. However, in rare cases, the letters may not be in alphabetical order, e.g. chain J in 1ucy. An overview of sequence numbering anomalies in the PDB, including further examples, is at Unusual sequence numbering.

  • Phyre2 - Caution: In early April 2011, Phyre2 numbered the aligned portion of the template sequence incorrectly when the above kinds of sequence numbering anomalies occurred in the template PDB file. At that time, the development team had acknowledged the problem and was working on a fix.
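
Before submitting a job, it can help to know whether a candidate template has any of the numbering anomalies listed above. The following Python sketch is one way to check, under the assumption that the template is a standard fixed-column PDB file; the function names and the shape of the report are hypothetical and chosen only for illustration.

```python
def chain_residue_ids(pdb_text, chain_id):
    """Return the ordered (number, insertion_code) pairs for one chain."""
    ids = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[21] != chain_id:
            continue
        # Columns 23-26 hold the residue number, column 27 the insertion code.
        res_id = (int(line[22:26]), line[26].strip())
        if not ids or ids[-1] != res_id:
            ids.append(res_id)
    return ids


def numbering_report(ids):
    """Summarize the kinds of numbering anomalies found in a chain."""
    numbers = [number for number, _ in ids]
    return {
        "first_number": numbers[0] if numbers else None,
        "has_residue_zero": 0 in numbers,
        "has_insertion_codes": any(code for _, code in ids),
        # A step of 0 (insertion) or 1 (next residue) is treated as normal.
        "discontinuities": [
            (a, b) for a, b in zip(numbers, numbers[1:]) if b - a not in (0, 1)
        ],
    }
```

A chain whose report shows a first number other than 1, a residue zero, insertion codes, or any discontinuities is one where the template numbering in a server's alignment deserves a manual check.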

Sequence Alignment

  • Swiss-Model fails to indicate which residues are identical, and which are similar, in its sequence alignment.
  • Phyre2 may number the aligned template sequence incorrectly -- see above.
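
When a server's alignment does not mark identical and similar residues, the marking can be added afterwards. The sketch below is a minimal Python illustration that takes the two aligned sequences as equal-length strings with "-" for gaps. The similarity groups are a hypothetical, simplified grouping chosen only for this example, and percent identity is computed over columns where neither sequence has a gap, which is one of several conventions in use.

```python
# Hypothetical similarity groups, for illustration only.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ"]


def annotate_alignment(query, template):
    """Return a marker line and percent identity for an aligned pair.

    Markers: "|" identical, ":" similar, "." different, " " gap in either row.
    """
    if len(query) != len(template):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    identical = 0
    aligned = 0
    for q, t in zip(query.upper(), template.upper()):
        if q == "-" or t == "-":
            marks.append(" ")
            continue
        aligned += 1
        if q == t:
            marks.append("|")
            identical += 1
        elif any(q in group and t in group for group in SIMILAR_GROUPS):
            marks.append(":")
        else:
            marks.append(".")
    percent_identity = 100.0 * identical / aligned if aligned else 0.0
    return "".join(marks), percent_identity
```

Printing the query, the marker line, and the template on three consecutive lines in a fixed-width font gives the usual annotated alignment view.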

See Also

References and Notes

  1. Kelley LA, Sternberg MJ. Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009;4(3):363-71. PMID:19247286 doi:10.1038/nprot.2009.2
  2. Arnold K, Kiefer F, Kopp J, Battey JN, Podvinec M, Westbrook JD, Berman HM, Bordoli L, Schwede T. The protein model portal. J Struct Funct Genomics. 2009 Mar;10(1):1-8. Epub 2008 Nov 27. PMID:19037750 doi:10.1007/s10969-008-9048-5
  3. Alignment between the query and template sequences. See Homology modeling.

Proteopedia Page Contributors and Editors

Eric Martz, Wayne Decatur