Conservation, Evolutionary: Difference between revisions

Eric Martz (talk | contribs)
polishing
Line 1: Line 1:
==Locating Conserved Patches==
Patches of highly conserved residues on the surface of a protein molecular structure are good candidates for [[Site | functional sites]]. These can be readily identified and visualized automatically with the [http://consurf.tau.ac.il ConSurf Server], provided a sufficient number of related protein sequences are in the Uniprot database. Before submitting a custom job request to the [http://consurf.tau.ac.il ConSurf Server], check out your [[PDB code]] at [http://consurfdb.tau.ac.il ConSurfDB], a database of precalculated conservation levels for most chains in the [[Protein Data Bank]].
==Locating Variable Patches==
In some cases, patches of highly variable (rapidly mutating) residues are also functional sites. These can also be identified with the [http://consurf.tau.ac.il ConSurf Server]. For example, mutations in influenza hemagglutinin help the virus to evade host defenses. Another example is the high allelic variability of the peptide-binding groove of [[Major Histocompatibility Complex Class I]]. That variability helps the grooves of the alleles within any individual to bind a wide range of peptides, hence enabling the T lymphocyte system to defend against a wide range of pathogens, including influenza virus.


==Caveat==
Line 53: Line 56:
# Calculates a conservation score for each amino acid.
# Displays the protein, colored by conservation, in interactive 3D, using [[FirstGlance in Jmol]], [[Chimera]], [[PyMOL]], or [[Protein Explorer]].
==Locating Variable Patches==
In some cases, patches of highly variable (rapidly mutating) residues are also functional sites. These can also be identified with the [http://consurf.tau.ac.il ConSurf Server]. For example, mutations in influenza hemagglutinin help the virus to evade host defenses. Another example is the high allelic variability of the peptide-binding groove of [[Major Histocompatibility Complex Class I]]. That variability helps the grooves of the alleles within any individual to bind a wide range of peptides, hence enabling the T lymphocyte system to defend against a wide range of pathogens, including influenza virus.


==Examples==

Revision as of 19:10, 11 March 2009

Locating Conserved Patches

Patches of highly conserved residues on the surface of a protein molecular structure are good candidates for functional sites. These can be readily identified and visualized automatically with the ConSurf Server, provided a sufficient number of related protein sequences are in the UniProt database. Before submitting a custom job request to the ConSurf Server, look up your PDB code at ConSurfDB, a database of precalculated conservation levels for most chains in the Protein Data Bank.

Locating Variable Patches

In some cases, patches of highly variable (rapidly mutating) residues are also functional sites. These can also be identified with the ConSurf Server. For example, mutations in influenza hemagglutinin help the virus to evade host defenses. Another example is the high allelic variability of the peptide-binding groove of Major Histocompatibility Complex Class I. That variability helps the grooves of the alleles within any individual to bind a wide range of peptides, hence enabling the T lymphocyte system to defend against a wide range of pathogens, including influenza virus.

Caveat

This caveat applies only to molecules that contain chains with different sequences. The conservation colors shown in Proteopedia's Evolutionary Conservation scenes do not indicate the same levels of conservation in chains of different sequences. This is because ConSurfDB calculates conservation levels independently for each sequence-different chain, and the levels are relative to the multiple sequence alignment constructed for that chain.

For example, consider 1bqh, which contains 10 chains representing two copies of a 5-chain molecule. Each molecule contains four sequence-different chains, one of which (chain C) is too short for ConSurf to process. A visit to ConSurfDB reveals, as expected, that a different number of sequences was used for the multiple sequence alignment (MSA) and conservation calculation of each remaining sequence-different chain, and that each MSA had a different average pairwise difference (APD), a measure of diversity within the MSA. Therefore, residues with, for example, conservation level 9 (maximal conservation) in each of the three ConSurfDB-colored sequence-different chains do not have exactly the same levels of conservation.

1bqh

Chain   Length   Number of sequences in MSA   APD
A       274      144                          1.72
B       99       75                           1.49
C       8        Length below minimum for ConSurf
G       129      201                          1.35

In Proteopedia's Evolutionary Conservation scenes, we have chosen to color all the chains in the molecule at once. This gives a potentially useful overview, but can be misleading unless one realizes that a given conservation color, in two sequence-different chains, does not mean exactly the same level of conservation. In contrast to Proteopedia's Evolutionary Conservation scenes, ConSurfDB and ConSurf Server apply conservation level colors to only one chain sequence at a time, thereby avoiding this possible confusion.
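
The point that a color is only meaningful relative to its own chain can be illustrated with a small sketch. The binning below is a deliberately simple equal-width scheme and is an assumption made for illustration; it is not ConSurfDB's actual grading procedure, and the function name and the toy scores are hypothetical.

```python
from typing import List


def grade_within_chain(scores: List[float], n_grades: int = 9) -> List[int]:
    """Map raw scores to grades 1..n_grades relative to this chain only.

    Assumption of this sketch: a higher raw score means more conserved,
    and grades are equal-width bins between the chain's own min and max.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # No spread within the chain: give every residue the middle grade.
        return [(n_grades + 1) // 2] * len(scores)
    width = (hi - lo) / n_grades
    grades = []
    for s in scores:
        g = int((s - lo) / width) + 1
        grades.append(min(g, n_grades))  # the chain maximum falls in the top bin
    return grades


# Two toy chains with very different raw score ranges.
chain_a = [0.0, 1.0, 2.0]
chain_b = [10.0, 10.5, 11.0]
print(grade_within_chain(chain_a))
print(grade_within_chain(chain_b))
```

Both toy chains receive the same grades even though their raw scores differ, which is the situation the caveat warns about: grade 9 in one chain and grade 9 in another need not reflect the same underlying level of conservation.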

A second caveat bears mentioning here. Slight variations in the conservation pattern will occur over time, as the number of sequences in the sequence databases used by ConSurfDB increases. Each update of ConSurfDB uses somewhat larger sequence databases, and consequently, the MSAs for each chain will be slightly different.

The ConSurf Server

The ConSurf Server can calculate and display the conservation pattern for 3D structures completely automatically. It uses state-of-the-art methods, all of which are published in peer-reviewed journal articles. It also permits considerable customization. For example, the user may submit their own multiple sequence alignment, or phylogenetic tree. I[1] know of no other server with these advantages.

In brief, the ConSurf Server uses the following process by default:

  1. Obtains the protein sequence for the specified PDB code and chain.
  2. Gathers closely related sequences from Swiss-Prot (or UniProt) with a PSI-BLAST search.
  3. Does a multiple sequence alignment.
  4. Constructs a phylogenetic tree.
  5. Calculates a conservation score for each amino acid.
  6. Displays the protein, colored by conservation, in interactive 3D, using FirstGlance in Jmol, Chimera, PyMOL, or Protein Explorer.
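
To give a feel for what "a conservation score for each amino acid" (step 5) means, here is a minimal sketch that scores each column of a toy alignment by Shannon entropy. This is a simplified stand-in chosen for illustration: it ignores the phylogenetic tree that ConSurf constructs in step 4, so it is not ConSurf's method. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from typing import List


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps.

    0.0 means a single residue type (fully conserved in this toy measure);
    larger values mean more variability.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0  # gap-only column: no information, reported as 0.0 here
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(alignment: List[str]) -> List[float]:
    """Entropy for every column of an alignment of equal-length sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]


# Toy alignment: columns 1 and 4 are invariant, columns 2 and 3 vary.
toy_alignment = ["MKVG", "MRVG", "MKIG", "MQLG"]
print(conservation_profile(toy_alignment))
```

Low-entropy columns correspond to conserved positions and high-entropy columns to variable ones. A phylogeny-aware method, as used by ConSurf, additionally accounts for how closely related the sequences are, which this sketch does not attempt.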

Examples

Evolutionary conservation reported by ConSurfDB for the Major Histocompatibility Complex Class I alpha chain in 2vaa.


At right is the pattern of evolutionary conservation and variability reported by ConSurfDB for the alpha chain of Major Histocompatibility Complex Class I (chain A of 2vaa).

2vaa contains three chains. ConSurf processes only one chain at a time. In the example shown here, the alpha chain (chain A) was processed, while the beta chain (chain B) and the peptide (chain P) are shown as gray backbone traces. Below are instructions for how to insert a ConSurf result into a Proteopedia scene.

Examples of conserved patches revealed by ConSurf will be found in the articles on

Conservation for Domain Folding

Certain residues on the surfaces of protein molecules tend to be conserved in order to maintain proper folding, rather than because they are part of a site that interacts with a substrate, ligand, or protein partner. At the molecular surface, secondary structure elements must break so that the chain can turn back into the folded protein domain. Therefore, it is common to see highly conserved residues that enable turns or break helices, notably glycines and prolines, on protein structure surfaces.

Every structure in Proteopedia has a link to be displayed in FirstGlance in Jmol. There, you can use the Find dialog to enter the name of an amino acid, e.g. glycine or proline, and the positions of all of the specified amino acids will be highlighted. You can then visualize their distribution in the 3D structure.
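
Outside FirstGlance, the same kind of listing can be approximated from a PDB-format file. The sketch below scans ATOM records using the fixed-column layout of the PDB format (residue name in columns 18-20, chain in column 22, residue number in columns 23-26) and reports each glycine or proline once. The function names and the toy records are hypothetical; the toy lines carry no coordinates and exist only to exercise the parser.

```python
from typing import List, Sequence, Tuple


def toy_atom_line(serial: int, atom: str, res_name: str,
                  chain: str, res_seq: int) -> str:
    """Build a truncated ATOM record (no coordinates) with PDB column layout."""
    return f"ATOM  {serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}"


def find_residues(pdb_text: str,
                  names: Sequence[str] = ("GLY", "PRO")
                  ) -> List[Tuple[str, int, str]]:
    """Return (chain, residue number, residue name) for each matching residue."""
    wanted = set(names)
    seen = set()
    hits = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        res_name = line[17:20].strip()
        if res_name not in wanted:
            continue
        chain = line[21]
        res_seq = int(line[22:26])
        key = (chain, res_seq, line[26:27])  # include insertion code if present
        if key not in seen:  # a residue has many atoms; report it only once
            seen.add(key)
            hits.append((chain, res_seq, res_name))
    return hits


SAMPLE = "\n".join([
    toy_atom_line(1, "N", "MET", "A", 1),
    toy_atom_line(2, "CA", "MET", "A", 1),
    toy_atom_line(3, "N", "GLY", "A", 2),
    toy_atom_line(4, "CA", "GLY", "A", 2),
    toy_atom_line(5, "N", "PRO", "B", 7),
])
print(find_residues(SAMPLE))
```

The resulting positions can then be compared with the conservation coloring to see which conserved surface residues are glycines or prolines.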

How to Insert a ConSurf Result Into Proteopedia

To create a green-linked scene with a molecule colored by evolutionary conservation, use the button "evolutionary conservation" in the "color" tab of the Scene Authoring Tools.

If for some reason you want to calculate the ConSurf coloring scheme on your own and want to insert that into a Proteopedia scene, here is how:

  1. Using either the ConSurf Database or the ConSurf Server, obtain the desired result.
  2. On the ConSurf result page, follow the link RasMol Coloring Script to display the script, choosing either the version that shows or the version that hides insufficient data. Select and copy the entire script.
  3. We assume that you already have an article in Proteopedia, with a Jmol applet in place for displaying your ConSurf result. (If not, see the Video Guides and Help:Editing.)
  4. Edit your Proteopedia page, and open the Scene Authoring Tools.
  5. Load the desired molecule into Jmol in the Scene Authoring Tools.
  6. Click on "Jmol" (at the lower right of Jmol) to open Jmol's menu, and there, click on "Console".
  7. In the small white Console window, paste your RasMol Coloring Script into the bottom box, and click Execute.
  8. Make any other changes you wish to this scene, and then save the scene.
  9. Copy the wikitext for the green link that will display your scene, and close the Scene Authoring Tool.
  10. Paste the green link wikitext into your page, and save the page.
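
For readers curious about what a coloring script of the kind pasted in step 7 does, the sketch below generates a script of per-residue select and color commands from a table of conservation grades. Everything here is an assumption made for illustration: the palette is a made-up gradient, not ConSurf's color scheme, the function names are hypothetical, and the command syntax (select 10:A; color [r,g,b]) is written in the style accepted by the Jmol console. For real work, use the script provided on the ConSurf result page.

```python
from typing import Dict, Tuple

# Hypothetical palette end points (NOT ConSurf's colors).
VARIABLE = (0, 200, 200)     # grade 1
NEUTRAL = (255, 255, 255)    # grade 5
CONSERVED = (160, 40, 100)   # grade 9


def grade_color(grade: int) -> Tuple[int, int, int]:
    """Interpolate an RGB color for a grade from 1 (variable) to 9 (conserved)."""
    if not 1 <= grade <= 9:
        raise ValueError("grade must be between 1 and 9")
    if grade <= 5:
        start, end, t = VARIABLE, NEUTRAL, (grade - 1) / 4
    else:
        start, end, t = NEUTRAL, CONSERVED, (grade - 5) / 4
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def build_script(grades: Dict[Tuple[str, int], int]) -> str:
    """Emit one 'select ...; color ...' line per (chain, residue number)."""
    lines = []
    for (chain, res_seq), grade in sorted(grades.items()):
        r, g, b = grade_color(grade)
        lines.append(f"select {res_seq}:{chain}; color [{r},{g},{b}]")
    return "\n".join(lines)


# Toy grades for three residues of a hypothetical chain A.
print(build_script({("A", 10): 9, ("A", 11): 5, ("A", 12): 1}))
```

Text produced this way could be pasted into the Console in the same manner as the ConSurf script in step 7, although the colors would then follow the made-up palette rather than the ConSurf color key.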

The color key that you see above on this page can be inserted in any page using this wikitext:

{{Template:ColorKey_ConSurf}}

See also Help:Color_Keys. If something is not clear, please let us know at .

Notes

  1. User:Eric Martz in January, 2009.

Proteopedia Page Contributors and Editors

Eric Martz, Eran Hodis, Wayne Decatur