Conservation, Evolutionary: Difference between revisions

From Proteopedia
Eric Martz (talk | contribs)
→‎The ConSurf-DB Mechanism: describing ConSurf-DB Mechanism

Revision as of 23:39, 11 March 2009

Locating Conserved Patches

Patches of highly conserved amino acid residues on the surface of a protein molecular structure are good candidates for functional sites. Every article in Proteopedia that is titled with a PDB code has an Evolutionary Conservation section below the molecular scene. Clicking "show" in the blue Evolutionary Conservation bar automatically colors all chains in the molecule by evolutionary conservation as calculated by ConSurf-DB.

Topic pages in Proteopedia (manually-authored pages that typically discuss more than one PDB code) may also include molecular scenes colored by evolutionary conservation. See below for Examples and Instructions.

Locating Variable Patches

In some cases, patches of highly variable (rapidly mutating) residues are also functional sites. These can also be identified with Proteopedia's Evolutionary Conservation scenes. For example, mutations in influenza hemagglutinin help the virus to evade host defenses. Another example is the high allelic variability of the peptide-binding groove of Major Histocompatibility Complex Class I. That variability helps the grooves of the alleles within any individual to bind a wide range of peptides, hence enabling the T lymphocyte system to defend against a wide range of pathogens, including influenza virus. See the ConSurf-colored example below.

Caveat

This caveat applies only to molecules that contain chains with different sequences. The conservation colors shown in Proteopedia's Evolutionary Conservation scenes do not indicate the same levels of conservation for chains of different sequences. This is because ConSurf-DB calculates conservation levels independently for each sequence-different chain, and the levels are relative to the multiple sequence alignment constructed for that chain.

For example, consider 1bqh, which contains 10 chains, representing two copies of a 5-chain molecule. Each molecule contains four sequence-different chains. A visit to ConSurf-DB reveals, as expected, that a different number of sequences was utilized for the multiple sequence alignment (MSA) and conservation calculations for each of these sequence-different chains, and that each MSA had a different average pairwise difference (APD), a measure of diversity within the MSA. Therefore, residues with, for example, conservation level 9 (maximal conservation) in each of the three ConSurf-DB-colored sequence-different chains have the highest levels of conservation within their own chain, but do not have exactly the same absolute levels of conservation.

1bqh
Chain | Length | Number of sequences in MSA | APD
A | 274 | 144 | 1.72
B | 99 | 75 | 1.49
C | 8 | Length below minimum for ConSurf |
G | 129 | 201 | 1.35

In Proteopedia's Evolutionary Conservation scenes, we have chosen to color all the chains in the molecule at once. This gives a potentially useful overview, but can be misleading unless one realizes that a given conservation color, in two sequence-different chains, does not mean exactly the same level of conservation. In contrast to Proteopedia's Evolutionary Conservation scenes, ConSurf-DB and ConSurf Server apply conservation level colors to only one chain sequence at a time, thereby avoiding this possible confusion.
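The caveat above can be illustrated with a small sketch. It assumes a simplified grading rule (equal-width bins over each chain's own score range, with higher raw scores meaning more conserved); this is not ConSurf-DB's actual procedure, and the function name and the raw scores are hypothetical. The point is only that grades assigned relative to each chain's own data are not comparable between chains.

```python
def grade_scores(raw_scores, n_grades=9):
    """Map raw per-residue scores to grades 1..n_grades, relative to this chain only.

    Assumption for this sketch: higher raw score = more conserved, and the
    chain's own min..max range is cut into equal-width bins.
    """
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # No spread within the chain: give every residue the middle grade.
        return [(n_grades + 1) // 2] * len(raw_scores)
    width = (hi - lo) / n_grades
    grades = []
    for score in raw_scores:
        grade = int((score - lo) / width) + 1
        grades.append(min(grade, n_grades))  # the maximum lands in the top bin
    return grades


# Hypothetical raw scores for two sequence-different chains.
chain_a = [0.1, 0.5, 0.9]
chain_b = [0.5, 0.9, 2.0]

print(grade_scores(chain_a))  # the raw score 0.9 is the top grade in chain A
print(grade_scores(chain_b))  # the same raw score 0.9 gets a low grade in chain B
```

In this toy case the same raw score receives the top grade in one chain and a low grade in the other, which is why one color in two sequence-different chains should not be read as the same absolute level of conservation.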

A second caveat bears mentioning here. Slight variations in the conservation pattern will occur over time, as the number of sequences in the sequence databases used by ConSurf-DB increases. Each update of ConSurf-DB uses somewhat larger sequence databases, and consequently, the MSAs for each chain will be slightly different.

The ConSurf-DB Mechanism

The ConSurf DataBase server, ConSurf-DB, pre-calculates conservation levels for each amino acid in every protein chain in the Protein Data Bank[1]. It went into service in 2008. The amino acid sequence of each protein chain is submitted to PSI-BLAST[2]

The ConSurf Server

The ConSurf Server can calculate and display the conservation pattern for 3D structures completely automatically. It uses state-of-the-art methods, all of which are published in peer-reviewed journal articles. It also permits considerable customization. For example, the user may submit their own multiple sequence alignment, or phylogenetic tree. I[3] know of no other server with these advantages.

In brief, the ConSurf Server uses the following process by default:

  1. Obtains the protein sequence for the specified PDB code and chain.
  2. Gathers closely related sequences from Swiss-Prot (or UniProt) with a PSI-BLAST search.
  3. Does a multiple sequence alignment.
  4. Constructs a phylogenetic tree.
  5. Calculates a conservation score for each amino acid.
  6. Displays the protein, colored by conservation, in interactive 3D, using FirstGlance in Jmol, Chimera, PyMOL, or Protein Explorer.
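To make step 5 more concrete, here is a minimal sketch of a per-column conservation score computed from a multiple sequence alignment. It is a deliberately simplified stand-in: it ignores the phylogenetic tree built in step 4 and uses Shannon entropy instead, so it is not the method the ConSurf Server uses. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(msa):
    """Return one score per alignment column.

    1.0 means every sequence has the same residue at that position; lower
    values mean more variability. Gaps ('-') are ignored, and a column made
    only of gaps gets None.
    """
    n_cols = len(msa[0])
    if any(len(seq) != n_cols for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(n_cols):
        residues = [seq[col] for seq in msa if seq[col] != "-"]
        if not residues:
            scores.append(None)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Toy alignment (hypothetical sequences, already aligned).
toy_msa = ["MKGP", "MRGP", "MKGA", "M-GS"]
for position, score in enumerate(column_conservation(toy_msa), start=1):
    print(position, round(score, 3))
```

Columns 1 and 3 of the toy alignment are invariant and score 1.0, while columns 2 and 4 score lower. A tree-aware method would additionally weigh how closely related the sequences showing each substitution are.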

Examples

Evolutionary conservation reported by ConSurf-DB for Major Histocompatibility Class I alpha chain in 2vaa.


At right is the pattern of evolutionary conservation and variability reported by ConSurf-DB for the alpha chain of Major Histocompatibility Complex Class I (chain A of 2vaa).

2vaa contains three chains. ConSurf processes only one chain at a time. In the example shown here, the alpha chain (chain A) was processed, while the beta chain (chain B) and the peptide (chain P) are shown as gray backbone traces. Below are instructions for how to insert a ConSurf result into a Proteopedia scene.

Examples of conserved patches revealed by ConSurf will be found in the articles on

Conservation for Domain Folding

Certain residues on the surfaces of protein molecules tend to be conserved in order to maintain proper folding, rather than because they are part of a site that interacts with a substrate, ligand, or protein partner. At the protein surface, secondary structure elements must break so that the chain can turn back into the folded domain. Therefore, it is common to see highly conserved residues that enable turns or break helices, notably glycines or prolines, on protein structure surfaces.

Every structure in Proteopedia has a link to be displayed in FirstGlance in Jmol. There, you can use the Find dialog to enter the name of an amino acid, e.g. glycine or proline, and the positions of all of the specified amino acids will be highlighted. You can then visualize their distribution in the 3D structure.
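Outside of FirstGlance, the same kind of lookup can be done on a plain one-letter sequence. The sketch below is only an illustration: the function name and the toy sequence are hypothetical, and positions are counted sequentially from 1, which may not match the residue numbering in a given PDB file.

```python
def find_residues(sequence, targets="GP", first_number=1):
    """Return {amino_acid: [positions]} for each one-letter code in targets."""
    return {
        aa: [i for i, residue in enumerate(sequence, start=first_number) if residue == aa]
        for aa in targets
    }


toy_sequence = "MGSPLAGGPK"  # hypothetical sequence
print(find_residues(toy_sequence))
```

The resulting positions can then be compared with the conservation grades at those positions, to see which glycines and prolines are conserved.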

How to Insert a ConSurf Result Into a Proteopedia Green Link

To create a green-linked scene with a molecule colored by evolutionary conservation, use the button "evolutionary conservation" in the "color" tab of the Scene Authoring Tools.

If for some reason you want to calculate the ConSurf coloring scheme on your own and want to insert that into a Proteopedia scene, here is how:

  1. Using either the ConSurf Database or the ConSurf Server, obtain the desired result.
  2. At the ConSurf result page, use the link RasMol Coloring Script to display either the script showing or hiding insufficient data. Block and copy the entire script.
  3. We assume that you already have an article in Proteopedia, with a Jmol applet in place for displaying your ConSurf result. (If not, see the Video Guides and Help:Editing.)
  4. Edit your Proteopedia page, and open the Scene Authoring Tools.
  5. Load the desired molecule into Jmol in the Scene Authoring Tools.
  6. Click on "Jmol" (at the lower right of Jmol) to open Jmol's menu, and there, click on "Console".
  7. In the small white Console window, paste your RasMol Coloring Script into the bottom box, and click Execute.
  8. Make any other changes you wish to this scene, and then save the scene.
  9. Copy the wikitext for the green link that will display your scene, and close the Scene Authoring Tool.
  10. Paste the green link wikitext into your page, and save the page.
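For readers curious about what a coloring script of this kind contains, the sketch below generates one from a table of conservation grades. It is an assumption-laden illustration, not the script that ConSurf produces: the function names are hypothetical, the three anchor colors are placeholders chosen for illustration rather than an official palette, and the output uses "select" and "color [r,g,b]" commands in the style accepted by the Jmol console. In practice, paste the script obtained from the ConSurf result page as described in the steps above.

```python
# Placeholder anchor colors (assumptions for illustration only).
VARIABLE = (16, 200, 209)    # grade 1
AVERAGE = (255, 255, 255)    # grade 5
CONSERVED = (160, 37, 96)    # grade 9


def blend(color_1, color_2, fraction):
    """Linearly interpolate between two RGB colors."""
    return tuple(round(a + (b - a) * fraction) for a, b in zip(color_1, color_2))


def grade_color(grade):
    """Return an RGB color for a conservation grade from 1 to 9."""
    if not 1 <= grade <= 9:
        raise ValueError("grade must be between 1 and 9")
    if grade <= 5:
        return blend(VARIABLE, AVERAGE, (grade - 1) / 4)
    return blend(AVERAGE, CONSERVED, (grade - 5) / 4)


def coloring_script(grades, chain):
    """Build script text from {residue_number: grade} for one chain."""
    by_grade = {}
    for resno, grade in sorted(grades.items()):
        by_grade.setdefault(grade, []).append(resno)
    lines = []
    for grade in sorted(by_grade):
        selection = ", ".join(f"{resno}:{chain}" for resno in by_grade[grade])
        r, g, b = grade_color(grade)
        lines.append(f"select {selection}")
        lines.append(f"color [{r},{g},{b}]")
    return "\n".join(lines)


# Hypothetical grades for three residues of chain A.
print(coloring_script({10: 9, 11: 1, 12: 9}, "A"))
```

Grouping residues by grade keeps the script short: one selection and one color command per grade, rather than one pair per residue.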

The color key that you see above on this page can be inserted in any page using this wikitext:

{{Template:ColorKey_ConSurf}}

See also Help:Color_Keys. If something is not clear, please let us know at .

Notes

  1. Goldenberg O, Erez E, Nimrod G, Ben-Tal N. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res. 2009 Jan;37(Database issue):D323-7. Epub 2008 Oct 29. PMID:18971256 doi:10.1093/nar/gkn822
  2. PSI-BLAST (Position Specific Iteration-BLAST) is an extension of the Basic Local Alignment Search Tool (BLAST) that is more sensitive at finding distantly related sequences. See PSI-BLAST at Wikipedia and PSI-BLAST at NCBI.
  3. User:Eric Martz in January, 2009.

Proteopedia Page Contributors and Editors

Eric Martz, Eran Hodis, Wayne Decatur