NMR Ensembles of Models: Difference between revisions
m →Structure Determination by NMR: added link to X-ray crystallography page
Eric Martz (talk | contribs)
(17 intermediate revisions by 5 users not shown)
Line 1: | Line 1: | ||
==Structure Determination by NMR==
About 6% of the entries in the [[Protein Data Bank]] were determined by nuclear magnetic resonance in solution (NMR) as of October 2024. 83% were determined by [[X-ray crystallography]], 10% by [[electron cryomicroscopy]], and <1% by other methods. NMR can only be used for relatively small macromolecules (see [[#Median_Size_of_Published_NMR_Structures|below]]).
<blockquote>
NMR spectroscopy is based on the ability of a nucleus with a spin of 1/2 (e.g. <sup>1</sup>H, <sup>13</sup>C, <sup>15</sup>N, <sup>31</sup>P) to adopt two different orientations in a magnetic field. The distribution of nuclei between the two states can be changed by subjecting them to a short pulse of radiation with a frequency commensurate with the energy difference between them. Monitoring the magnetic signals in the subsequent decay can yield dynamic information about the orientation and spacing of the nuclei, which provide restraints that can be turned into structural information<ref>Quoted from page 22 of the book ''Molecular Biology of Assemblies and Machines'' by Steven, Baumeister, Johnson and Perham, [https://www.crcpress.com/Molecular-Biology-of-Assemblies-and-Machines/Steven-Baumeister-Johnson-Perham/9780815341666 Garland/CRC Press, 2016].</ref>.
</blockquote>
The primary data yielded by NMR analysis consist of geometric information about atoms within the structure: mostly local and, more recently, also global. Typically, these include distances between pairs of atoms, dihedral angles (typically backbone φ angles and some side-chain χ1 angles) and sometimes global information such as the orientation of a given bond with respect to a fixed axis of the molecule. These data are used as "restraints" to reconstruct 3D models that are compatible with the NMR data. All calculations are performed directly in physical space, starting with a random conformation of the macromolecule, which is progressively folded to satisfy the restraints. Typically, several runs are performed, starting from different initial conformations, in order to check that the calculation converges onto a single solution. The result is thus an ensemble of models, the distribution of which gives a measure of the precision of the NMR structure.
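One common way to put a number on the "distribution" of an ensemble is the mean pairwise root-mean-square deviation (RMSD) between models. The Python sketch below is only an illustration: the function names and the toy coordinates are invented here, and it assumes the models list the same atoms in the same order and are already superposed (no fitting step is performed).

```python
import itertools
import math


def rmsd(model_a, model_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(model_a) != len(model_b):
        raise ValueError("models must contain the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(model_a, model_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(model_a))


def mean_pairwise_rmsd(models):
    """Average RMSD over all pairs of models; smaller means a tighter ensemble."""
    pairs = list(itertools.combinations(models, 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Toy ensemble (invented coordinates): three models of a two-atom "molecule".
# Assumption: the models are already superposed on one another.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
spread = mean_pairwise_rmsd(toy_ensemble)
print(f"mean pairwise RMSD: {spread:.3f}")
```

A small value indicates that independent runs converged on similar conformations. It measures precision only, and says nothing about whether the models are accurate.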
Line 11: | Line 11: | ||
Model building for NMR experiments typically starts with the complete protein or nucleic acid chain, including [[Hydrogen in macromolecular models|hydrogen atoms]]. The distance restraints are then applied. The resulting model usually includes the entire protein and nucleic acid chains, unlike [[X-ray crystallography|X-ray crystallographic]] models that often lack the ends, and even loops in the middle of chains, due to [[Disorder|disorder]] in protein crystals.
Macromolecular structure determination by NMR is done at high protein concentrations in aqueous solution, and thus requires that the molecule be highly soluble. For more information, see ''NMR in Wikipedia''<ref>[http://en.wikipedia.org/wiki/Nuclear_magnetic_resonance NMR in Wikipedia]</ref>
and ''Nature of 3D Structural Data at PDB''<ref>[https://web.archive.org/web/20110415074813/http://www.pdb.org/pdb/static.do?p=general_information/about_pdb/nature_of_3d_structural_data.html Nature of 3D Structural Data (archived copy)]</ref>.
==Displaying NMR Models==
Line 17: | Line 18: | ||
===Display of NMR Models by Proteopedia===
{{STRUCTURE_1lcd| PDB=1lcd | SCENE= }}
For ensembles of models from NMR experiments, Proteopedia initially displays just the first model, in the usual cartoon rendering, to speed up page loading. You will see a "Displaying simplified model" message within the JSmol panel. If you then click the orange "load full" button, Proteopedia will show '''all the models''', enabling you to see where the models agree with each other, and where they differ. Each model is shown as a thin backbone trace (a line connecting alpha carbon atoms of amino acids, or phosphorus atoms in DNA or RNA chains). The backbone traces are colored by <font color="blue">'''Amino'''</font> to <font color="red">'''Carboxy'''</font> "rainbow", a spectral sequence of colors starting at the amino terminus (or 5' terminus of nucleic acid chains) and ending at the carboxy terminus (or 3' terminus).
{{ColorKey N2CRainbow}}
Line 23: | Line 24: | ||
'''Ligands''' ([[Hetero atoms]]) are also shown for all models, except that they are opaque only for model 1, and translucent for all other models. Ligand atoms are colored by element, using the [[CPK color scheme]]. Examples with hetero groups covalently linked to chain termini, with extremely variable positions, are [[1jsa]] and [[1dqc]]. [[1bah]] also has hetero groups in variable positions. [[1hpn]] has only hetero atoms.
The example at right, after clicking the "load full" button, shows the 3 models for [[1lcd]], a lac repressor domain bound to DNA, with one sodium ion. '''Water''' is present in this model, but for clarity, Proteopedia does not show water in its initial scene. <scene name='NMR_Ensembles_of_Models/Water/1'>Show water</scene>. (To hide water, click the ''initial scene'' green link just below the molecule.)
'''Disulfide bonds''' are shown as yellow rods connecting backbones, with the first model opaque, and all other models translucent. An example is [[1iw4]].
Line 29: | Line 30: | ||
===Individual Models===
Proteopedia shows only the first model by default, with the message ''Displaying simplified model''. After you click the <font color="orange">orange '''load full'''</font> button, all models will be displayed.
To view individual models, click on ''JSmol'' or ''Jmol_S'' (lower right corner below the molecule) to '''open Jmol's menu'''. There, use the '''All N models''' item (where N is the total number of models in the ensemble). For example, clicking on 1.1: 1 will display only model 1, and the menu will now say ''model 1/N''. You can also use Jmol's menu to change the rendering and coloring.
[[FirstGlance in Jmol]] also shows model 1 by default, but you can click on ''View All Models''.
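Viewers can offer individual models because a multi-model PDB file delimits each model with MODEL and ENDMDL records. The Python sketch below shows how such a file could be split into models. It is an illustration only, not how Jmol or FirstGlance is implemented; the function name <code>split_models</code> and the toy records are invented here.

```python
def split_models(pdb_text):
    """Return {model_number: [coordinate lines]} from multi-model PDB text."""
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "MODEL":
            current = int(line.split()[1])  # model serial number
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            models[current].append(line)
    return models


# Toy two-model file (invented records; one alpha carbon per model).
toy_pdb = "\n".join([
    "MODEL        1",
    "ATOM      1  CA  MET A   1       0.000   0.000   0.000  1.00  0.00",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  CA  MET A   1       0.500   0.000   0.000  1.00  0.00",
    "ENDMDL",
])
models = split_models(toy_pdb)
print(sorted(models))  # the model numbers found in the file
```

Selecting "model 1" then amounts to taking <code>models[1]</code>. Coordinate lines outside any MODEL block are ignored by this sketch, so it would need adjusting for single-model files that omit MODEL records.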
===Animating NMR Ensembles===
When the models in an NMR ensemble are played like a movie, the resulting animation simulates thermal motion (although not all the motions are necessarily real -- see [[#Meaning_of_the_Variation_Between_Models|below]]). To animate the models, click on ''JSmol'' or ''Jmol_S'' (lower right corner below the molecule) to '''open Jmol's menu'''. Choose '''Animation''', then '''Animation mode''', and click on '''Loop'''. Then choose ''Animation'' again, and click '''Play'''. You can change the speed of the animation with '''FPS''' (frames per second) on the ''Animation'' menu. By default, there is a delay at the first and last models.
==Multiple Model Ensembles from NMR==
Line 44: | Line 49: | ||
===Meaning of the Variation Between Models===
The '''variation between models''' in the ensemble can mean either of two things. The variation can represent actual '''flexibility and thermal motion''' that occurred during the NMR measurements in solution, typically at room temperature. Alternatively, the variation can simply mean '''uncertainty in the atomic positions''', namely, that an inadequate number of restraints were available to determine the positions of some atoms. Unfortunately, there is nothing comparable to the [[B value]] or [[Temperature value]] that quantitates the uncertainty of the position of each atom in crystallographic results. Specific NMR relaxation experiments can, however, be used to measure the dynamics of individual atoms, mainly backbone amide groups, because the relaxation of the NMR signal depends on the internal motions of the molecule. When these NMR relaxation data are available, they can be used to determine '''order parameters''', which are strongly correlated with the [[B value]]s of the crystallographic structures. These can be used to distinguish between intrinsic flexibility and uncertainty due to lack of restraints. When relaxation data are not available, the only way to find out the meaning of the variation between models is to contact the experimenters who authored the published ensemble of models.
Protein chains commonly have more variation between models at the ends than in the middle. An example is [[2yru]].
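To locate where an ensemble varies, for instance to see whether the chain ends are more variable than the middle, one can compute a per-residue spread across models. The Python sketch below uses the RMS distance of each residue's alpha carbon from its mean position. The function name and toy coordinates are invented for illustration, the models are assumed to be already superposed, and the result cannot by itself distinguish real flexibility from a lack of restraints.

```python
import math


def per_residue_spread(models):
    """RMS distance of each residue's position from its mean across models.

    `models` is a list of models; each model is a list of (x, y, z) tuples,
    one per residue (e.g. alpha carbons), in the same order in every model.
    """
    n_models = len(models)
    n_residues = len(models[0])
    if any(len(m) != n_residues for m in models):
        raise ValueError("all models must have the same number of residues")
    spreads = []
    for i in range(n_residues):
        points = [m[i] for m in models]
        mean = [sum(p[k] for p in points) / n_models for k in range(3)]
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


# Toy ensemble (invented coordinates): two models, three residues.
# The middle residue is identical in both models; the ends differ.
toy_models = [
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 4.0, 0.0)],
]
spreads = per_residue_spread(toy_models)
print([round(s, 2) for s in spreads])
```

In this toy case the two end residues have a nonzero spread while the middle residue has none, mirroring the pattern described for chain termini.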
Line 58: | Line 63: | ||
==Reliability of NMR Models==
NMR models are more likely to contain major errors <ref>Traditional biomolecular structure determination by NMR spectroscopy allows for major errors. Sander B. Nabuurs, Chris. A. E. M. Spronk, Geerten W. Vuister, and Gert Vriend. (2006). PLoS Computational Biology 2: [http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020009 Open Access Full Text] [http://proteinexplorer.org/favlit/nmr.htm Precis]. DOI: 10.1371/journal.pcbi.0020009</ref> than are crystallographic models that have good [[Resolution]] and [[Free R]] values. See also [[Quality assessment for molecular models]]. In 2012, an X-ray crystallographic structure of integral membrane diacylglycerol kinase, [[3ze4]], revealed functionally important domain swapping<ref>PMID: 23676672</ref><ref>PMID: 23676677</ref> that was not present in an earlier NMR structure [[2kdc]]<ref>PMID: 19556511</ref>.
==Median Size of Published NMR Structures==
Line 69: | Line 74: | ||
Another example of two folded domains (zinc fingers) connected by a flexible linker is [[1zu1]]. Again, only one domain can be aligned, and which one is arbitrary.
==See Also==
*[[X-ray crystallography]]
*[[Empirical models]]
==References and Websites==
<references />
==External Resources==
*[http://www.pdb.org/pdb/static.do?p=education_discussion/Looking-at-Structures/methods.html Methods for Determining Atomic Structures discussed at the Protein Data Bank]