ConSurfDB vs. ConSurf: Difference between revisions
Eric Martz (talk | contribs) |
Eric Martz (talk | contribs) No edit summary |
||
(35 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
<table align="right" class="wikitable" width=430><tr><td> | |||
[[Image:2vaa-APD0.31-40degslow.gif]] | |||
</td></tr><tr><td> | |||
{{Template:ColorKey_ConSurf_NoYellow_NoGray}} | |||
Conservation of amino acids non-covalently interacting with a peptide ({{Template:ColorKey_Element_C}} {{Template:ColorKey_Element_N}} {{Template:ColorKey_Element_O}}) in the groove of [https://www.youtube.com/watch?v=2ZakngfbHSo Major Histocompatibility Protein] Class I ([[2vaa]]). Conservation was '''not revealed''' until an [[#Average Pairwise Distance]] of 0.31 was achieved in a customized ConSurf Server job. [[#Examples|DETAILS BELOW]]. | |||
</td></tr></table> | |||
Evolutionary Conservation is introduced at [[Introduction to Evolutionary Conservation]], and treated in somewhat greater depth in the article [[Conservation, Evolutionary]]. These describe how conservation patterns in 3D can help to identify functional sites in proteins. Proteopedia displays conservation patterns pre-calculated by [http://consurfdb.tau.ac.il ConSurfDB], when available. These are usually based on broad protein families that include sequences of proteins with multiple functions. Consequently, they usually '''obscure conservation''' present in a family of proteins with a single function (see [[Conservation%2C_Evolutionary#Caveats|Caveats]]). | Evolutionary Conservation is introduced at [[Introduction to Evolutionary Conservation]], and treated in somewhat greater depth in the article [[Conservation, Evolutionary]]. These describe how conservation patterns in 3D can help to identify functional sites in proteins. Proteopedia displays conservation patterns pre-calculated by [http://consurfdb.tau.ac.il ConSurfDB], when available. These are usually based on broad protein families that include sequences of proteins with multiple functions. Consequently, they usually '''obscure conservation''' present in a family of proteins with a single function (see [[Conservation%2C_Evolutionary#Caveats|Caveats]]). | ||
Line 123: | Line 130: | ||
==Examples== | ==Examples== | ||
<StructureSection load='' size='350' side='right' caption='' scene='39/399854/2vaa_consurf_halos_w274_y159/4'> | <StructureSection load='' size='350' side='right' caption='' scene='39/399854/2vaa_consurf_halos_w274_y159/4'> | ||
With default parameters, the ConSurf Server results have an average [[#Average Pairwise Distance]] (APD) of 1.00<ref name="APD">Tested with 20 arbitrarily selected proteins, mostly enzymes. Average of the average pairwise distance (APD) values: 1.00; range 0.82-1.42.</ref>, and an average of only a few "yellow" residues with insufficient data.<ref name="ISD">Tested with 20 arbitrarily selected proteins, mostly enzymes. Average number of amino acids with insufficient data ("yellow" in ConSurf): 3.5; range 0 to 16.</ref> For the examples below, it was necessary to [[#Limiting ConSurf Analysis to Proteins of a Single Function|customize the ConSurf Server job parameters]] in order to reveal conservation | With default parameters, the ConSurf Server results have an average [[#Average Pairwise Distance]] (APD) of 1.00<ref name="APD">Tested with 20 arbitrarily selected proteins, mostly enzymes. Average of the average pairwise distance (APD) values: 1.00; range 0.82-1.42.</ref>, and an average of only a few "yellow" residues with insufficient data.<ref name="ISD">Tested with 20 arbitrarily selected proteins, mostly enzymes. Average number of amino acids with insufficient data ("yellow" in ConSurf): 3.5; range 0 to 16.</ref> For the examples below, it was necessary to [[#Limiting ConSurf Analysis to Proteins of a Single Function|customize the ConSurf Server job parameters]] in order to '''reveal conservation of key residues''' in proteins with the same function as the query. | ||
These molecular scenes were obtained in [[FirstGlance in Jmol]], which offers many conveniences for analyzing ConSurf Server results. See [[Help:How to Insert a ConSurf Result Into a Proteopedia Green Link|How to Insert a ConSurf Result Into a Proteopedia Green Link]]. | These molecular scenes were obtained in [[FirstGlance in Jmol]], which offers many conveniences for analyzing ConSurf Server results. See [[Help:How to Insert a ConSurf Result Into a Proteopedia Green Link|How to Insert a ConSurf Result Into a Proteopedia Green Link]]. | ||
Line 132: | Line 139: | ||
====ConSurf Server Default APD 1.1==== | ====ConSurf Server Default APD 1.1==== | ||
[[2vaa]] contains three chains. Here, (<scene name='39/399854/2vaa_consurf_halos_w274_y159/4'>restore initial scene, ConSurf Server default settings, APD 1.1</scene>)<ref name=" | [[2vaa]] contains three chains. Here, (<scene name='39/399854/2vaa_consurf_halos_w274_y159/4'>restore initial scene, ConSurf Server default settings, APD 1.1</scene>)<ref name="consurfdefaults">Default ConSurf Server settings: UniRef90 database, excluding sequences with > 95% or < 35% identity with the query, MSA has 150 sequences sampled evenly from all unique sequence hits.</ref> ConSurf colors are applied only to the alpha chain (chain A), while the beta chain (chain B = β-2 microglobulin) and the 8 amino acid peptide (chain P) are shown as gray backbone traces. | ||
Conservation of important residues in the groove is obscured by inclusion in the MSA of proteins with different functions ([[#Example With Multiple Functions|see analysis above]]). The sides of the groove are variable due to many alleles that enable it to bind a wide range of peptide sequences. The only groove residue that is conserved at greater than level 7 is '''Tyr159''' (level 8), whose sidechain hydrogen bonds the main-chain oxygen of the amino-terminal peptide residue. Only a handful of surface residues are highly conserved (level 9), including '''Trp274''' involved in binding CD8. | Conservation of important residues in the groove is obscured by inclusion in the MSA of proteins with different functions ([[#Example With Multiple Functions|see analysis above]]). The sides of the groove are variable due to many alleles that enable it to bind a wide range of peptide sequences. The only groove residue that is conserved at greater than level 7 is '''Tyr159''' (level 8), whose sidechain hydrogen bonds the main-chain oxygen of the amino-terminal peptide residue. Only a handful of surface residues are highly conserved (level 9), including '''Trp274''' involved in binding CD8. | ||
Line 140: | Line 147: | ||
====ConSurf Server Custom APD 0.51==== | ====ConSurf Server Custom APD 0.51==== | ||
A custom ConSurf job resulting in an APD of 0.51<ref name="apd0.51">Custom ConSurf Server settings for APD 0.51: UniRef90 database, excluding sequences with > 95% or '''< 50%''' identity with the query, MSA has 150 sequences sampled evenly from all unique sequence hits.</ref> (NOT SHOWN) had '''NO groove residues with conservation levels > 6'''. Trp274 was level 9. | A custom ConSurf job resulting in an APD of 0.51<ref name="apd0.51">Custom ConSurf Server settings for APD 0.51 with 2vaa: UniRef90 database, excluding sequences with > 95% or '''< 50%''' identity with the query, MSA has 150 sequences sampled evenly from all unique sequence hits.</ref> (NOT SHOWN) had '''NO groove residues with conservation levels > 6'''. Trp274 was level 9. | ||
====ConSurf Server Custom APD 0.31==== | ====ConSurf Server Custom APD 0.31==== | ||
<span style="float:right;">{{Template:ColorKey_ConSurf_NoYellow_NoGray}}</span> | <span style="float:right;">{{Template:ColorKey_ConSurf_NoYellow_NoGray}}</span> | ||
By default, ConSurf Server excludes from the multiple sequence alignment sequences with >95% identity, or <35% identity with the query sequence. Changing those limits to >98% and <70% reduced the default APD of 1.1 to 0.31<ref name="apd0.31">Custom ConSurf Server settings for APD 0.31: UniRef90 database, excluding sequences with '''> 98% or < 70%''' identity with the query, MSA has 150 sequences sampled evenly from all unique sequence hits.</ref>. <scene name='39/399854/2vaa_apd_point31/3'>This result reveals high conservation of the following 4 key residues in the groove</scene> (yellow halos). With spin OFF, touch a residue to identify it. | By default, ConSurf Server excludes from the multiple sequence alignment sequences with >95% identity, or <35% identity with the query sequence. Changing those limits to >98% and <70% reduced the default APD of 1.1 to 0.31<ref name="apd0.31">Custom ConSurf Server settings for APD 0.31 with 2vaa: UniRef90 database, excluding sequences with '''> 98% or < 70%''' identity with the query, MSA has 150 sequences sampled evenly from all unique sequence hits.</ref>. <scene name='39/399854/2vaa_apd_point31/3'>This result reveals high conservation of the following 4 key residues in the groove</scene> ({{Yelspan|yellow halos}}). With spin OFF, touch a residue to identify it. | ||
* <span style="background-color:#961d54;color:white;padding:0.2em 0.4em 0.1em 0.4em;">Level 9:</span> | * <span style="background-color:#961d54;color:white;padding:0.2em 0.4em 0.1em 0.4em;">Level 9:</span> | ||
Line 155: | Line 162: | ||
(Tyr159 was level 6. CD8 binding site Trp274 remains level 9.) | (Tyr159 was level 6. CD8 binding site Trp274 remains level 9.) | ||
<scene name='39/399854/2vaa_peptide_contacts/1'>Here are all the polar residues contacting the peptide</scene>. Use the '''POPUP BUTTON''' to see details! (This scene is easily obtained in [http://firstglance.jmol.org FirstGlance]: Tools tab, click Contacts, check Label Contacts, and [[Help:How to Insert a ConSurf Result Into a Proteopedia Green Link|made into a Green Link]]. | <scene name='39/399854/2vaa_peptide_contacts/1'>Here are all the polar residues contacting the peptide</scene>. Use the '''POPUP BUTTON''' to see details! (This scene is easily obtained in [http://firstglance.jmol.org FirstGlance]: Tools tab, click Contacts, check Label Contacts, and [[Help:How to Insert a ConSurf Result Into a Proteopedia Green Link|made into a Green Link]].) | ||
Another custom ConSurf Server job<ref name="apd0.30">Custom ConSurf Server settings for APD 0.30 with 2vaa: UniRef90 database, excluding sequences with > 95% or < 35% identity with the query, MSA has '''250''' sequences '''closest''' to the query.</ref> gave an '''APD of 0.30''', but levels for the above 4 groove residues were 7-8. These lower levels can be accounted for by the highest expectation value<ref name="evalue" /> in the MSA, which was 10<sup>-141</sup>. In contrast, for the job with APD 0.31, the highest expectation value was 10<sup>-84</sup>. | |||
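The identity limits and the two MSA selection modes mentioned in the settings footnotes above ("sampled evenly from all unique sequence hits" versus "closest to the query") can be illustrated with a minimal Python sketch. This is not ConSurf Server code: the <code>Hit</code> record, the function names, and the choice to rank hits by expectation value are assumptions made only for illustration.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one homolog hit (not a ConSurf data type)."""
    name: str
    identity: float  # percent identity with the query
    evalue: float    # expectation value of the hit


def filter_by_identity(hits: List[Hit],
                       min_identity: float = 35.0,
                       max_identity: float = 95.0) -> List[Hit]:
    """Exclude hits with identity above max_identity or below min_identity."""
    return [h for h in hits if min_identity <= h.identity <= max_identity]


def select_closest(hits: List[Hit], n: int) -> List[Hit]:
    """Keep the n hits closest to the query.

    Assumption: "closest" is taken to mean smallest expectation value.
    """
    return sorted(hits, key=lambda h: h.evalue)[:n]


def sample_evenly(hits: List[Hit], n: int) -> List[Hit]:
    """Keep n hits spaced evenly along the list ranked by expectation value.

    Assumption: one plausible reading of "sampled evenly".
    """
    if n <= 0:
        return []
    ranked = sorted(hits, key=lambda h: h.evalue)
    if n >= len(ranked):
        return ranked
    step = len(ranked) / n
    return [ranked[int(i * step)] for i in range(n)]


# Toy data, invented for this example only.
hits = [
    Hit("h1", 99.0, 1e-200),
    Hit("h2", 90.0, 1e-150),
    Hit("h3", 80.0, 1e-120),
    Hit("h4", 60.0, 1e-80),
    Hit("h5", 40.0, 1e-40),
    Hit("h6", 30.0, 1e-10),
]

default_hits = filter_by_identity(hits)             # 35% to 95% identity
narrow_hits = filter_by_identity(hits, 70.0, 98.0)  # 70% to 98% identity
```

Raising the lower identity limit, or keeping only the closest hits, removes the more distant homologs from the MSA, which is the kind of change that lowered the APD in the jobs described above.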
===Case #2: UV Resistance Protein=== | |||
<scene name='39/399854/4dnw_consurf_apd-point48/1'>''Arabidopsis'' UVB-Resistance Protein UVR8</scene> [[4dnw]] is a homodimer with an <scene name='39/399854/4dnw_consurf_apd-point48/2'>unusual number of between-chain salt bridges</scene>. '''Are the between-chain salt bridges more conserved than the within-chain salt bridges?''' | |||
[[FirstGlance in Jmol]] displays <scene name='39/399854/4dnw_consurf_apd-point48/2'>all salt bridges</scene> with one click (Tools tab), colored by conservation (if pre-processed by the ConSurf Server), and can list them, '''spreadsheet-ready, including conservation level numbers, and marking those between chains'''. | |||
With the default ConSurf Server result '''APD 1.42''', and with a custom ConSurf Server result '''APD 0.91''', the salt-bridged residues have about '''average''' conservation. With a custom result '''APD 0.48''', the between-chain salt bridges have '''above-average''' conservation (7.6 vs. 6.8), while the within-chain salt bridges have below-average conservation (6.3 vs. 6.8). In conclusion, when the multiple sequence alignment is limited to sequences closely related to the query (APD 0.48), '''between-chain salt-bridged residues are more conserved than are within-chain salt-bridged residues.''' The difference is '''statistically significant''' (p < 0.01<ref name="stats">With APD 0.48, mean conservation of between-chain salt-bridged atoms is 7.57 ± 0.13 SEM. Subtracting 3 SEM (99% confidence limit) gives 7.18. This does not overlap with either 7.16 (the all-salt-bridged atoms mean + 3 SEM) or 6.82 (the mean for within-chain salt-bridged atoms + 3 SEM).</ref>). | |||
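The 3-SEM comparison described in the statistics footnote can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the rounded means and SEMs reported for APD 0.48; the function names are illustrative and not part of any ConSurf or FirstGlance tool.

```python
import math


def sem(standard_deviation: float, count: int) -> float:
    """Standard error of the mean: standard deviation / sqrt(count)."""
    return standard_deviation / math.sqrt(count)


def lower_limit(mean: float, sem_value: float, k: float = 3.0) -> float:
    """Mean minus k standard errors."""
    return mean - k * sem_value


def upper_limit(mean: float, sem_value: float, k: float = 3.0) -> float:
    """Mean plus k standard errors."""
    return mean + k * sem_value


def clearly_higher(high: tuple, low: tuple, k: float = 3.0) -> bool:
    """True if (mean, SEM) 'high' minus k SEM exceeds 'low' plus k SEM."""
    return lower_limit(*high, k) > upper_limit(*low, k)


# Rounded values reported for APD 0.48.
between_chain = (7.57, 0.13)
reference = (6.8, 0.12)     # the mean compared against in "7.6 vs. 6.8"
within_chain = (6.3, 0.17)

print(round(lower_limit(*between_chain), 2))  # 7.57 - 3 * 0.13
print(round(upper_limit(*reference), 2))      # 6.8 + 3 * 0.12
print(round(upper_limit(*within_chain), 2))   # 6.3 + 3 * 0.17
print(clearly_higher(between_chain, reference))
print(clearly_higher(between_chain, within_chain))
```

With the rounded inputs, the within-chain upper limit comes out as 6.81 rather than the 6.82 given in the footnote, presumably because the footnote was computed from unrounded values.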
< | <table class="wikitable" style="text-align:center;"> | ||
<tr> | |||
<td colspan=5> | |||
Salt Bridges in [[4dnw]] | |||
</td> | |||
</tr><tr> | |||
<td rowspan=2> | |||
ConSurf [[#Average Pairwise Distance|APD]] | |||
</td> | |||
<td rowspan=2> | |||
Level 9:<br>% of All Residues | |||
</td> | |||
<td colspan=3> | |||
<center> | |||
Mean Conservation Levels ± SEM | |||
</center> | |||
</td> | |||
</tr><tr> | |||
<td> | |||
All Residues | |||
</td><td> | |||
Salt Bridges Between Chains | |||
</td><td> | |||
Salt Bridges Within Chains | |||
</td> | |||
</tr><tr> | |||
<td> | |||
1.42<ref name="consurfdefaults" /> | |||
</td><td> | |||
14% | |||
</td><td> | |||
3.7 | |||
</td><td> | |||
3.5 | |||
</td><td> | |||
3.8 | |||
</td> | |||
</tr><tr> | |||
<td> | |||
0.91<ref name="apd0.91">ConSurf settings for APD 0.91 with 4dnw: Clean UniProt, 35-95%, 200 sequences closest to query.</ref> | |||
</td><td> | |||
16% | |||
</td><td> | |||
5.4 | |||
</td><td> | |||
6.0 | |||
</td><td> | |||
5.0 | |||
</td> | |||
</tr><tr> | |||
<td> | |||
0.48<ref name="apd0.48">ConSurf settings for APD 0.48 with 4dnw: Clean UniProt, 35-95%, 125 sequences closest to query.</ref> | |||
</td><td> | |||
18% | |||
</td><td> | |||
6.8 ± 0.12* | |||
</td><td> | |||
7.6 ± 0.13* | |||
</td><td> | |||
6.3 ± 0.17* | |||
</td> | |||
</tr> | |||
</table> | |||
* *Averages are per atom for 88 between-chain salt-bridged atoms and 140 within-chain salt-bridged atoms. SEMs were calculated as the standard deviation divided by the square root of the atom count. Differences for APD 0.48 are statistically significant, p < 0.01<ref name="stats" />. | |||
*Salt bridges are Lys or Arg sidechain nitrogens within 4.0 Å of Asp or Glu sidechain oxygens. | |||
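The salt-bridge definition above (Lys or Arg sidechain nitrogens within 4.0 Å of Asp or Glu sidechain oxygens) can be sketched in Python as follows. This is not FirstGlance code: the <code>Atom</code> record and function name are hypothetical, and the atom names are the standard PDB names for these sidechain atoms.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not a FirstGlance or Jmol type)."""
    chain: str
    resname: str
    resnum: int
    name: str
    x: float
    y: float
    z: float


# Standard PDB atom names for the charged sidechain atoms.
CATIONIC_N = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}}
ANIONIC_O = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}


def find_salt_bridges(atoms: List[Atom],
                      cutoff: float = 4.0) -> List[Tuple[Atom, Atom, float, bool]]:
    """Return (nitrogen, oxygen, distance, between_chains) for each N/O pair
    of oppositely charged sidechain atoms within the cutoff (in Angstroms)."""
    nitrogens = [a for a in atoms if a.name in CATIONIC_N.get(a.resname, ())]
    oxygens = [a for a in atoms if a.name in ANIONIC_O.get(a.resname, ())]
    bridges = []
    for n in nitrogens:
        for o in oxygens:
            d = math.dist((n.x, n.y, n.z), (o.x, o.y, o.z))
            if d <= cutoff:
                bridges.append((n, o, d, n.chain != o.chain))
    return bridges


# Toy coordinates, invented for this example only.
atoms = [
    Atom("A", "LYS", 10, "NZ", 0.0, 0.0, 0.0),
    Atom("B", "ASP", 20, "OD1", 3.0, 0.0, 0.0),
    Atom("A", "GLU", 30, "OE1", 10.0, 0.0, 0.0),
    Atom("A", "ARG", 40, "NH1", 10.0, 3.5, 0.0),
    Atom("A", "ALA", 50, "N", 0.5, 0.0, 0.0),  # backbone N, ignored
]

bridges = find_salt_bridges(atoms)
between = [b for b in bridges if b[3]]
within = [b for b in bridges if not b[3]]
```

The all-pairs loop is adequate for a single structure; the between-chain flag mirrors the marking of between-chain salt bridges described for the spreadsheet-ready list.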
Examples of conserved patches on other proteins, revealed by ConSurf, will be found in the articles on | Examples of conserved patches on other proteins, revealed by ConSurf, will be found in the articles on | ||
Line 169: | Line 247: | ||
==Conclusion== | ==Conclusion== | ||
In order to discover key functional residues, it is important to inspect multiple ConSurf Server jobs for highly conserved residues, including multiple jobs with [[#Average Pairwise Distance]]s in the range 0.25-0.5 using the [[#Limiting ConSurf Analysis to Proteins of a Single Function|above methods]]. Residues conserved in the broader family of more distantly related proteins with different functions will typically be revealed with default ConSurf Server settings, or even in the ConSurf'''DB''' result. | In order to discover key functional residues, it is important to inspect multiple ConSurf Server jobs for highly conserved residues, including multiple jobs with [[#Average Pairwise Distance]]s (APD) in the range 0.25-0.5 using the [[#Limiting ConSurf Analysis to Proteins of a Single Function|above methods]]. Jobs with APD higher than 0.5 may obscure conservation of residues crucial for the function of the query protein. Residues conserved in the broader family of more distantly related proteins with different functions will typically be revealed with default ConSurf Server settings (APD ~ 1.0), or even in the ConSurf'''DB''' result. | ||
==The ConSurf-DB Mechanism== | ==The ConSurf-DB Mechanism== | ||
Line 225: | Line 303: | ||
# Displays the protein, colored by conservation, in interactive 3D, using the NGL Viewer, [[FirstGlance in Jmol]], [[Chimera]], or [[PyMOL]]. | # Displays the protein, colored by conservation, in interactive 3D, using the NGL Viewer, [[FirstGlance in Jmol]], [[Chimera]], or [[PyMOL]]. | ||
==References== | ==See Also== | ||
*[[ConSurf/Index]] provides links to all pages about evolutionary conservation and ConSurf in Proteopedia. | |||
==Notes & References== | |||
{{Reflist}} | {{Reflist}} |