Basics of Protein Structure
Latest revision as of 04:49, 7 April 2022
<StructureSection load='3I40' size='350' side='right' caption='Structure of insulin (PDB entry 3I40) ' scene=
pspeed='8'>
This tutorial illustrates some basic properties of protein structure for a general audience. For a more in-depth discussion, please visit Introduction to protein structure. Words shown in green change the protein view in the box to the right; blue words are links to other pages.
Proteins perform many important functions in living organisms, including movement, immune responses, sensing the environment, energy acquisition, and catalyzing reactions. The protein shown to the right is insulin; when insulin isn't properly synthesized or recognized, diabetes occurs.
Proteins are long chains of amino acids, and are synthesized by the ribosome, using messenger RNA as a template. There are 20 amino acids commonly found in proteins. Amino acids contain an amino group, a central carbon atom called the alpha carbon, and a carboxylic acid. The 20 amino acids differ by what is attached to the central atom; this variable portion is referred to as the side chain. The amino acid shown is alanine; its side chain is a methyl (-CH3) group. The atoms are displayed using the CPK coloring convention: carbon (gray), hydrogen (white), oxygen (red), and nitrogen (blue).
Proteins are sometimes compared to beads on a string, where each amino acid residue is a bead. These long chains form complicated structures that allow them to perform their function. Even small alterations in any level of the structure can change how the protein does its job, and can lead to diseases.
Ways of representing protein structure
Protein structures can be displayed in many different ways. In spacefilling models, all of the non-hydrogen atoms are shown as spheres with their van der Waals radii. This view is the easiest to use to see holes, clefts or other large-scale features, but it is hard to identify individual amino acids or finer structural details. In the ball and stick model, the atoms are shown as smaller balls, connected by sticks; this is further simplified in the stick model, which only shows the bonds between atoms. Backbone representation shows only the N-Calpha-C=O repeating unit; the side chains are omitted. The cartoon representation is based upon the backbone, but highlights specific secondary structures (more on that later!).
Levels of Protein Structure
There are four different levels of protein structure. The primary structure is the amino acid sequence. The amino acids are connected by an amide bond (also called a peptide bond), made from the amino group (NH2) of one amino acid and the carboxylic acid group (COOH) of another amino acid. In the process of making the bond, a water molecule is removed. The amino acids are linked in a repeating pattern. The backbone of the protein is the repeating N-C-C=O pattern, with the side chains projecting out from the backbone. The end with the free -NH2 group is called the amino or N terminus, while the end with a free carboxylic acid is called the C terminus. Notice that most protein structure representations do not show the hydrogens.
The sequence of amino acids is written and numbered from the N terminus (where protein synthesis begins) to the C terminus (where amino acids are added during protein synthesis), so for the segment shown, the sequence would be Val-Asn-Gln, or VNQ if one-letter abbreviations are used for the amino acids. For more practice identifying peptide bonds between amino acids, please try Peptide tutorial 1 part 1 and Peptide tutorial 1 part 2.
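The naming convention above can be sketched in a few lines of Python. This is only an illustration and not part of the original tutorial: the names `THREE_TO_ONE`, `to_one_letter` and `water_molecules_removed` are hypothetical helpers, and the input is assumed to be a hyphen-separated sequence written from the N terminus to the C terminus.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert a hyphen-separated three-letter sequence to one-letter codes.

    The sequence is assumed to be written N terminus first, as in the text.
    """
    return "".join(THREE_TO_ONE[res.capitalize()] for res in sequence.split("-"))


def water_molecules_removed(sequence):
    """Count the waters released when the chain forms.

    Each peptide bond releases one water molecule, and a chain of
    n residues is held together by n - 1 peptide bonds.
    """
    return len(sequence.split("-")) - 1


segment = "Val-Asn-Gln"
print(to_one_letter(segment))            # VNQ
print(water_molecules_removed(segment))  # 2
```

For the three-residue segment in the text, two peptide bonds join the residues, so two water molecules are removed.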
The second level of structure is called secondary structure, and consists of the shapes (conformations) formed by short sequences of amino acids. This level of structure is stabilized by hydrogen bonds along the backbone. Hydrogen bonds are attractions between an N, O or F atom and a hydrogen attached to an N, O or F atom. (More about hydrogen bonds.) The two most common shapes are alpha helices and beta strands. These are favored simply because two atoms cannot occupy the same space (steric collisions). Insulin contains only alpha helices; they are shown in pink.
The third level of structure, or tertiary structure, is how the secondary structures pack together to form the overall shape of the entire peptide chain. Side chains play an important role in tertiary structure formation, especially the burying of hydrophobic ("water fearing") amino acids in the middle of the structure. In this view, hydrophobic residues are grey and polar atoms are shown in light purple. Water molecules are shown with red balls; notice that they tend to be close to the hydrophilic ("water loving") groups. Some proteins, like insulin, are also stabilized by covalent bonds between the sulfur atoms (shown in yellow) called disulfide bonds.
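The hydrophobic versus polar distinction can be illustrated with a short Python sketch. The grouping below is an assumption: it follows one common textbook convention, and other hydrophobicity scales place some residues (glycine and cysteine, for example) differently. The names `HYDROPHOBIC`, `classify` and `hydrophobic_fraction` are hypothetical, and "polar" here simply means every residue not in the hydrophobic set, including charged ones.

```python
# Assumed grouping (one common convention, not taken from the tutorial):
# Ala, Val, Leu, Ile, Met, Phe, Trp and Pro are treated as hydrophobic.
HYDROPHOBIC = set("AVLIMFWP")


def classify(sequence):
    """Label each residue of a one-letter sequence as hydrophobic or polar."""
    return ["hydrophobic" if res in HYDROPHOBIC else "polar"
            for res in sequence.upper()]


def hydrophobic_fraction(sequence):
    """Return the fraction of residues that fall in the hydrophobic set."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    labels = classify(sequence)
    return labels.count("hydrophobic") / len(labels)


print(classify("VNQ"))  # ['hydrophobic', 'polar', 'polar']
```

A per-residue label like this says nothing about where a residue sits in the folded structure; it only marks which side chains would tend to be buried away from water.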
Not all proteins have the fourth level of structure, quaternary structure. Quaternary structure is the association of more than one chain to form a larger structure. Insulin forms a hexamer. In this view, each insulin monomer is shown in a different color. Quaternary structure can be very important in how the protein functions. Minor changes in insulin's sequence lead to tighter or weaker association between the chains, and are the difference between long-lasting and quick-acting insulin. For a more in-depth discussion about insulin's structure and function, please visit the Insulin page.
Protein Structure Data
The World Wide Protein Data Bank (WWPDB) is where all published, experimentally determined protein structures are made freely available. Each model has a unique accession code, called a PDB code. One model of human insulin, shown at right, has the PDB code 3i40. Many examples are illustrated in the Atlas of Macromolecules (http://atlas.molviz.org). Looking for a model of a specific protein? See Is there an empirical model? After you find a PDB code of interest, see Introduction to molecular visualization.
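As a rough sketch of how a PDB code might be used in a script, the Python below checks that a code has the classic four-character form (a digit from 1 to 9 followed by three letters or digits) and builds a download address from it. The function name `pdb_download_url` is hypothetical, and the address pattern is an assumption about the RCSB file server rather than something stated in this tutorial. Nothing is downloaded here.

```python
import re

# Classic PDB codes: a digit 1-9 followed by three letters or digits.
PDB_CODE = re.compile(r"[1-9][A-Za-z0-9]{3}")


def pdb_download_url(code):
    """Build a download address for a PDB code such as '3i40'.

    The URL pattern is an assumption about the RCSB file server;
    check the current RCSB documentation before relying on it.
    """
    if not PDB_CODE.fullmatch(code):
        raise ValueError(f"not a four-character PDB code: {code!r}")
    return f"https://files.rcsb.org/download/{code.upper()}.pdb"


print(pdb_download_url("3i40"))
```

PDB codes are not case sensitive, so 3i40 and 3I40 refer to the same insulin model; the sketch uppercases the code only to give a consistent file name.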
Further Reading
- Four levels of protein structure (also available in Spanish as Los cuatro niveles estructurales de las proteínas)
- Secondary structure
- Quaternary structure