Biological Unit
Revision as of 17:16, 12 May 2022
The Biological Unit, also called the Biological Assembly[1][2], is the quaternary structure that is believed to be the major functional form of a macromolecule. Often it is not the structure contained in the published PDB file, which is called the asymmetric unit. It is therefore important to visualize the biological unit in order to best relate function to structure.
Definition
The Biological Unit, also called the Biological Assembly[1][2], is the quaternary structure of a protein that is believed to be the main functional form of the molecule. It can be a single chain, or a quaternary assembly of multiple identical or non-identical chains. For example, the biological unit of hemoglobin includes two alpha chains and two beta chains, making it a tetrameric α2β2 structure. When a biological unit contains multiple chains that have co-evolved to bind to each other, it may also be referred to as a specific oligomer.
The functional form (biological unit) under one set of conditions may change under a different set of conditions, so there may be more than one functional form (biological unit) that includes a given protein chain. For example, phosphorylation or dephosphorylation by protein kinases or phosphatases often changes the affinities between proteins, and hence their quaternary assemblies.
Published macromolecular structure data files (atomic coordinate files, often in the PDB file format) contain the Asymmetric Unit, which may be identical with the biological unit, or only a portion of it, or may contain multiple biological units. Interchain contacts that occur in the asymmetric unit but are absent in the biological unit are termed crystal contacts. When publishing a macromolecular structure, the authors may elect to specify the biological unit. In the PDB file format, this is done in REMARK 350.
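As an illustration of what REMARK 350 holds, the following Python sketch reads the REMARK 350 lines of a PDB-format file and reports, for each biomolecule, how it was determined and how many chain copies it implies. The names `parse_remark_350` and `chain_copies` and the sample header text are hypothetical, made up for this example. The sketch assumes the usual REMARK 350 layout (BIOMOLECULE, AUTHOR DETERMINED BIOLOGICAL UNIT, SOFTWARE DETERMINED QUATERNARY STRUCTURE, APPLY THE FOLLOWING TO CHAINS, and BIOMT1 to BIOMT3 lines) and is not a complete parser.

```python
# Hypothetical sample header (not taken from any real PDB entry):
# one chain A and two symmetry operators, i.e. a dimer.
SAMPLE = """\
REMARK 350 BIOMOLECULE: 1
REMARK 350 AUTHOR DETERMINED BIOLOGICAL UNIT: DIMERIC
REMARK 350 SOFTWARE DETERMINED QUATERNARY STRUCTURE: DIMERIC
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000
REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000
REMARK 350   BIOMT1   2 -1.000000  0.000000  0.000000       10.00000
REMARK 350   BIOMT2   2  0.000000 -1.000000  0.000000        0.00000
REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        0.00000
"""


def _chains(body):
    """Split 'LABEL: A, B, C' into ['A', 'B', 'C']."""
    return [c.strip() for c in body.split(":", 1)[1].split(",") if c.strip()]


def parse_remark_350(pdb_text):
    """Return {biomolecule number: description} from REMARK 350 lines."""
    assemblies = {}
    current = None
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 350"):
            continue
        body = line[10:].strip()
        if body.startswith("BIOMOLECULE:"):
            current = {"author": None, "software": None, "blocks": []}
            assemblies[int(body.split(":", 1)[1])] = current
        elif current is None:
            continue  # free-text preamble before the first biomolecule
        elif body.startswith("AUTHOR DETERMINED BIOLOGICAL UNIT:"):
            current["author"] = body.split(":", 1)[1].strip()
        elif body.startswith("SOFTWARE DETERMINED QUATERNARY STRUCTURE:"):
            current["software"] = body.split(":", 1)[1].strip()
        elif body.startswith("APPLY THE FOLLOWING TO CHAINS:"):
            # Each APPLY line starts a block of chains sharing the operators below it.
            current["blocks"].append({"chains": _chains(body), "operators": set()})
        elif body.startswith("AND CHAINS:") and current["blocks"]:
            current["blocks"][-1]["chains"] += _chains(body)
        elif body.startswith("BIOMT") and current["blocks"]:
            # The second field is the operator serial number.
            current["blocks"][-1]["operators"].add(int(body.split()[1]))
    return assemblies


def chain_copies(assembly):
    """Number of chains in the assembly: chains times operators, per block."""
    return sum(len(b["chains"]) * len(b["operators"]) for b in assembly["blocks"])


if __name__ == "__main__":
    for number, assembly in parse_remark_350(SAMPLE).items():
        print(number, assembly["author"], assembly["software"], chain_copies(assembly))
```

Comparing the chain count obtained this way with the number of chains in the asymmetric unit shows whether the stated biological unit is smaller than, larger than, or the same size as the asymmetric unit.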
Examples
Model | Chains in | |
2 | 1 | |
2 | 2 | |
6 | 6* | |
2 | 2* ** | |
4 | 4**, 4*, 2, 2 | |
1 | 2 | |
6 | 360 (virus capsid) |
* The contacts in this biological unit differ from those in the asymmetric unit.
** The "author specified" assembly (in this case the same as the asymmetric unit) appears unlikely in view of the assembly predicted by PISA, which has a much larger buried surface area.
Truncated proteins may form oligomers that are impossible in the native protein. For example, 1bk5 (karyopherin alpha) is a truncated part of the natural chain, and forms a dimer that would be prevented by the full-length chain. Dimerization is dependent upon Y397. Mutation Y397D prevents this artifactual dimerization, leading to the monomer 1ee5.
Visualizing the Biological Unit
FirstGlance in Jmol
FirstGlance in Jmol makes it quick and easy to see the biological unit.
- Display the molecule in FirstGlance in Jmol:
- Enter the PDB code in the top search slot at the left edge of any page in Proteopedia. At the page in Proteopedia titled with the PDB code, under Resources, click on the link to FirstGlance.
- Alternatively, go directly to FirstGlance.Jmol.Org and enter the PDB code.
- In the Molecule Information tab (the first/left-most tab), click Biological Unit and follow instructions.
How To Show The Biological Unit In Proteopedia
Please see Biological Unit: Showing.
Sources of Biological Unit Models
Unreliability of REMARK 350 in the PDB File Header
When a structure is deposited in the PDB, the authors are required to specify the biological unit if it is known. This is given in REMARK 350 in the header of the PDB file format. Unfortunately, information in REMARK 350 is often incorrect (see discussion of this problem by Roland Dunbrack)[1][2]. There are numerous examples in which the authors state that the biological unit is a monomer in REMARK 350, but provide good experimental evidence in the paper reporting the structure that the biological unit is a dimer. Jose Duarte provided a list of examples.
In 2022, the wwPDB is adding REMARK 350 to PDB-format files originally deposited without that remark, mostly models determined by NMR. Unfortunately, the added REMARK 350 specifies "AUTHOR DETERMINED", although it appears that the authors are not being consulted about whether the deposited model is believed to be the biological unit (major functional assembly). An example is 2z59, deposited in 2007 and released in 2008, which lacked REMARK 350 until it was added on March 16, 2022 (see the REVDAT records in the PDB-format file).
In some cases, more than one putative biological unit is specified in REMARK 350. Biological units specified by the author(s) are distinguished from those predicted by software. An example is 3fad, which is explained in Introduction to Biological Assemblies and the PDB Archive.
The most reliable way to find out the biological unit is to read the literature and/or contact experts on the molecule in question. Short of such efforts, here are some suggestions:
- When the "author determined" biological unit stated in REMARK 350 has a different number of chains than the asymmetric unit, the biological unit is more likely to be correct, simply because the difference in chains shows that the authors gave REMARK 350 some real consideration.
- When the "author determined" biological unit stated in REMARK 350 has the same number of chains, but in a different conformation, than the asymmetric unit, the stated biological unit is more likely to be correct, for the same reason given in the previous case above. An example is 1qrd, given in the table in the Examples section above.
- When the "author determined" biological unit stated in REMARK 350 has the same number of chains, in the same conformation, as the asymmetric unit, the stated biological unit is less likely to be correct. There is a significant chance that the authors failed to state a known biological unit in REMARK 350 (see examples above).
- When a biological unit is determined only by software, it is less likely to be correct. The software makes an educated guess based upon the characteristics of the contacts present in the protein crystal, but it is sometimes incorrect.
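The rules of thumb in the list above can be written as a small decision function. This is only a sketch: the function name `remark350_plausibility`, its parameters, and the returned labels are hypothetical, and the result is a rough prior, not a substitute for reading the literature.

```python
def remark350_plausibility(determined_by, asym_chains, unit_chains,
                           same_arrangement=True):
    """Apply the rules of thumb for judging a biological unit in REMARK 350.

    determined_by    -- "author" or "software", as stated in REMARK 350
    asym_chains      -- number of chains in the asymmetric unit
    unit_chains      -- number of chains in the stated biological unit
    same_arrangement -- True if the chains are arranged as in the
                        asymmetric unit (only matters when counts match)
    """
    if determined_by not in ("author", "software"):
        raise ValueError("determined_by must be 'author' or 'software'")
    if determined_by == "software":
        # Software only makes an educated guess from the crystal contacts.
        return "less likely correct"
    if unit_chains != asym_chains:
        # A different chain count shows the authors considered REMARK 350.
        return "more likely correct"
    if not same_arrangement:
        # Same count but different contacts: same reasoning applies.
        return "more likely correct"
    # Identical to the asymmetric unit: may simply have been left unexamined.
    return "less likely correct"


if __name__ == "__main__":
    print(remark350_plausibility("author", asym_chains=2, unit_chains=1))
    print(remark350_plausibility("author", asym_chains=2, unit_chains=2))
```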
Generation of Biological Unit Models from REMARK 350
The following servers generate biological unit models from REMARK 350. Be careful because, as explained above, REMARK 350 is often incorrect.
MakeMultimer
- The MakeMultimer Server generates a PDB file in which every chain is assigned a distinct single-character name, and all chains are in a single model. MakeMultimer provides direct links for downloading, or for visualizing each biological unit in FirstGlance in Jmol.
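The general idea of building a single-model assembly with distinctly named chains can be sketched in Python as follows. This is not MakeMultimer's actual code: the names `transform` and `build_assembly`, the input layout, and the sample coordinates are hypothetical. The sketch assumes each REMARK 350 operator is a 3x3 rotation matrix plus a translation vector (the BIOMT1 to BIOMT3 rows), applied as x' = R x + t.

```python
import string

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def transform(xyz, rotation, translation):
    """Apply x' = R x + t to one (x, y, z) coordinate."""
    return tuple(
        sum(rotation[i][j] * xyz[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def build_assembly(chains, operators):
    """Copy every chain once per operator, giving each copy a unique name.

    chains    -- dict: original chain name -> list of (x, y, z) atoms
    operators -- list of (rotation, translation) pairs
    Returns a dict: new single-character chain name ->
    (original chain name, transformed coordinates), all in one model.
    """
    names = iter(string.ascii_uppercase + string.ascii_lowercase + string.digits)
    assembly = {}
    for rotation, translation in operators:
        for original_name in sorted(chains):
            try:
                new_name = next(names)
            except StopIteration:
                raise ValueError(
                    "more chain copies than single-character names"
                ) from None
            assembly[new_name] = (
                original_name,
                [transform(atom, rotation, translation) for atom in chains[original_name]],
            )
    return assembly


if __name__ == "__main__":
    # Hypothetical input: one chain with one atom, and a twofold axis along z.
    twofold = ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0))
    operators = [(IDENTITY, (0.0, 0.0, 0.0)), (twofold, (10.0, 0.0, 0.0))]
    for name, (source, atoms) in build_assembly({"A": [(1.0, 2.0, 3.0)]}, operators).items():
        print(name, "copy of", source, atoms)
```

Single-character names run out after 62 copies in this sketch, so a large assembly such as a virus capsid would need a different naming scheme.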
RCSB
- Atomic coordinates for biological units, when specified by the authors of a published structure in REMARK 350 of the PDB file format, are available from the RCSB (US) Protein Data Bank. As of April, 2010, "Biological Assemblies" were available at the bottom of the list under Download Files (upper right, near the large PDB code).
- One technical problem with the files from RCSB is that when they contain more than one copy of the asymmetric unit, the duplicated chains all have identical names. RCSB offers visualization of these models in Jmol, but it is usually difficult to tell how many chains are present in the biological unit, either in the snapshot (where each chain is colored similarly in a spectral amino- to carboxy-terminal sequence) or in Jmol, where coloring by chain fails to distinguish chains with the same name. Also, the additional copies are in separate models, which often complicates visualization. In contrast, coordinates for biological units available from MakeMultimer (see above), PISA or PQS (see below) are in a single model, and each chain is given a distinct name. RCSB also offers a viewer named Kiosk but this seems not to show the biological assembly.
As for author-specified biological assemblies, sometimes the specific oligomers were not known at the time the asymmetric unit was published. Also, some authors may have failed to specify the biological unit even when it was known. Rarely, the specified biological units might be incorrect. For all these reasons, it is advisable to consult other sources in addition to REMARK 350.
Software: Protein Interfaces, Surfaces and Assemblies Server (PISA)
The Protein Interfaces, Surfaces and Assemblies Server (PISA) at the European Bioinformatics Institute uses improved methods to predict the biological unit or probable quaternary assembly, compared to its predecessor PQS (see next section). These servers examine the contacts that occur in macromolecular crystals used in X-ray crystallography. They attempt to discriminate between crystal contacts (artifacts of crystallization) and contacts between chains that have co-evolved to maintain specific oligomeric binding.
Software: Probable Quaternary Structure Server (PQS)
The Probable Quaternary Structure Server (PQS) at the European Bioinformatics Institute examines the inter-chain contacts within protein crystals, and makes an educated guess (using published methods) about which contacts represent co-evolved specific oligomeric contacts, and which are artifacts of crystallization. It was usually correct, but not always. It returns models for what it deduces to be the biological units. There are many possible relationships between the asymmetric unit and the biological units returned by PQS. Examples are given in the discussion of PQS at ProteinExplorer.Org. Updates to PQS stopped in August, 2009. In 2010 it is being phased out in favor of PISA (see above).
See Also
- Biological Unit: Showing
- Crystal contacts
- X-ray crystallography
- Multi-protein assemblies
- Complexes
- Assemblies
Web Sites
- Protein Interfaces, Surfaces and Assemblies Server (PISA) at the European Bioinformatics Institute.
- Probable Quaternary Structure Server (PQS) at the European Bioinformatics Institute.
- ProtBud, a database of biological unit structures. Offers comparisons and downloads of the results from REMARK 350 vs. PQS.
- Introduction to Biological Assemblies and the PDB Archive
- COCOMAPS (bioCOmplexes COntact MAPS) is a web server for analysis and visualization of the interfaces present in biological complexes, such as protein-protein, protein-DNA and protein-RNA complexes, making use of intermolecular contact maps.
Literature Citations
Literature citations will be found at the respective servers linked above.
- [1] Freely available preprint of Xu and Dunbrack's 2019 "Principles and characteristics of biological assemblies in experimentally determined protein structures".
- [2] Xu Q, Dunbrack RL Jr. Principles and characteristics of biological assemblies in experimentally determined protein structures. Curr Opin Struct Biol. 2019 Apr;55:34-49. Epub 2019 Apr 6. PMID:30965224 doi:10.1016/j.sbi.2019.03.006