Hox protein: Difference between revisions

Eric Martz (talk | contribs)
Michal Harel (talk | contribs)
 
{{BAMBED
|DATE=July 20, 2012
|OLDID=1419976
|BAMBEDDOI=10.1002/bmb.20650
}}
<StructureSection load='' size='350' side='right' scene='Sandbox_Reserved_169/Complex/1' caption='Homeotic protein Sex combs reduced (Scr) in complex with homeobox protein Extradenticle (Exd) and DNA (PDB code [[2r5z]]).'>
''This is a joint project of students at La Cañada High School, La Cañada Flintridge, California USA, and students at the University of Southern California, Los Angeles, California USA, mentored by [[User:Remo Rohs|Professor Remo Rohs]].''




[[Image:Cell.jpg|thumb|right|300px|Figure 2: Hox proteins require a cofactor to achieve high binding specificity in order to execute their distinct functions in developing various parts of the fly embryo. Elsevier/Cell Press has provided permission for usage of this figure<ref name="slattery">Slattery M, Riley T, Liu P, Abe N, Gomez-Alcala P, Dror I, Zhou T, Rohs R, Honig B, Bussemaker HJ, Mann RS. Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell. 2011;147(6):1270-82. [http://www.ncbi.nlm.nih.gov/pubmed/22153072 PMID:22153072]</ref>.]]
 
{{Clear}}
'''Hox proteins''' or '''homeobox proteins''' are transcription factors that play a key role in '''embryonic development''' across species by activating and repressing genes. In ''Drosophila'', eight Hox proteins are responsible for the development of different body segments of the fly, such as its antennae, wings, or legs. Hox proteins execute their distinct functions by binding to similar but distinct ''in vivo'' binding sites<ref>Mann RS, Lelli KM, Joshi R. Hox specificity: unique roles for cofactors and collaborators. Curr Top Dev Biol. 2009;88:63-101. [http://www.ncbi.nlm.nih.gov/pubmed/19651302 PMID:19651302]</ref>. This page discusses the molecular mechanisms through which Hox proteins recognize their DNA targets with very high binding specificity.<br/>


The crystal structure of a Hox-DNA complex (Figure 1) shows that the Hox protein ''Sex combs reduced'' (Scr) binds its specific ''in vivo'' site with the help of cofactors, the ''Extradenticle (Exd)/Pbx proteins''. Hox proteins can bind DNA as monomers, but their binding specificity is enhanced when the cofactor is present, a principle called '''latent specificity''' (Figure 2). In ''Drosophila'', for instance, the eight Hox proteins bind as heterodimers with their cofactor Exd to similar but distinct target sites.<br/>


==Structural Description of Hox-DNA Complex==
===Homeodomain Architecture===


<scene name='Sandbox_Reserved_169/Complex/1'>Figure 3: 3D-Representation of Exd-Scr-DNA ternary complex with Scr specific site</scene> ([http://proteopedia.com/wiki/index.php/2r5z PDB ID# 2R5Z.])


Both <span style="background-color: black; color: yellow;">'''&nbsp;Scr&nbsp;'''</span> and <font color="blue">'''Exd'''</font> belong to the family of homeodomain proteins, which are encoded by homeoboxes. Homeodomains are helix-turn-helix motifs (<scene name='Sandbox_Reserved_169/Complex/1'>restore initial scene</scene>) composed of three alpha helices (Figure 3). The <font color="#e06800">'''DNA'''</font>-binding interface residues of both proteins are the <scene name='Sandbox_Reserved_169/Complex/2'>most evolutionarily conserved</scene>. {{Template:ColorKey_ConSurf_NoYellow_NoGray}}

The third alpha helix of the <span style="background-color: black; color: yellow;">'''&nbsp;Scr&nbsp;'''</span> and <font color="blue">'''Exd'''</font> homeodomains, the so-called <scene name='Sandbox_Reserved_169/Scr_recognition_helix/3'>recognition helix</scene>, inserts into the major groove, where hydrogen bonds form between protein side chains and base pairs. An N-terminal tail contacts the minor groove.<br/>


===Hox Protein-Cofactor Interactions===


<span style="background-color: black; color: yellow;">'''&nbsp;Scr&nbsp;'''</span> interacts with its cofactor <font color="blue">'''Exd'''</font> through hydrophobic interactions via a <scene name='Sandbox_Reserved_169/Scr_ypwm_motif/4'>YPWM motif</scene> located in its N-terminal tail<ref>Mann RS. The specificity of homeotic gene function. Bioessays. 1995 Oct;17(10):855-63. [http://www.ncbi.nlm.nih.gov/pubmed/7487967 PMID:7487967].</ref>. This interaction spans Scr's flexible N-terminal linker across the minor groove of its binding site. In the absence of the YPWM motif, Scr and Exd would not form a heterodimer.<br/>


===Major Groove Base Readout===


Hox proteins achieve a large fraction of their binding specificity through <scene name='Sandbox_Reserved_169/Scr_mg/6'>hydrogen bonds between residues of the recognition helix and base pairs in the major groove</scene>. This form of protein-<font color="#e06800">'''DNA'''</font> recognition is called '''base readout''', since hydrogen bonds in the major groove can distinguish between all four possible base pairs: A/T, T/A, C/G, and G/C. The Scr residues that engage in major groove base readout are Ile47, Gln50, Asn51, and Met54. Major groove contacts are almost identical across the Hox protein family and are therefore not sufficient to achieve specificity within this family of transcription factors.


===Minor Groove Shape Readout===


Minor groove contacts, in addition to base readout in the major groove, provide the level of specificity needed to distinguish between members of the Hox family. Minor groove contacts have been shown to be essential for achieving specificity<ref name="nature" />. Three side chains, <scene name='Sandbox_Reserved_169/Scr_mg/8'>His-12, Arg3, and Arg5, are observed to contact the minor groove</scene> of the Scr ''in vivo'' site ''fkh250''. However, this additional level of binding specificity is not achieved through hydrogen bonds between protein side chains and functional groups of the bases. Such direct interactions cannot distinguish A/T from T/A, or C/G from G/C base pairs, because the hydrogen bond donors and acceptors occupy overlapping positions.<br/>
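
The difference between the two grooves can be illustrated with a small Python sketch. It uses the simplified textbook encoding of the functional groups exposed at each groove edge (A = hydrogen bond acceptor, D = donor, H = nonpolar hydrogen, M = methyl group). This encoding and the helper name <code>distinguishable_groups</code> are illustrative assumptions and are not taken from the cited studies.

```python
# Simplified textbook encoding of the groups exposed at each groove edge.
# A = H-bond acceptor, D = H-bond donor, H = nonpolar hydrogen, M = methyl.
# Illustrative assumption, not data from the structures discussed on this page.
MAJOR_GROOVE = {
    "A/T": "ADAM",
    "T/A": "MADA",
    "G/C": "AADH",
    "C/G": "HDAA",
}
MINOR_GROOVE = {
    "A/T": "AHA",
    "T/A": "AHA",
    "G/C": "ADA",
    "C/G": "ADA",
}


def distinguishable_groups(patterns):
    """Group base pairs that expose the same pattern and so look identical."""
    groups = {}
    for base_pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(base_pair)
    return sorted(groups.values())


major_groups = distinguishable_groups(MAJOR_GROOVE)
minor_groups = distinguishable_groups(MINOR_GROOVE)

print("major groove:", major_groups)  # four separate groups
print("minor groove:", minor_groups)  # A/T with T/A, G/C with C/G
```

In this encoding the major groove yields four distinct patterns, whereas the minor groove yields only two, which is why hydrogen bonds in the minor groove alone cannot tell A/T from T/A or G/C from C/G.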


The mechanism through which these three residues recognize the DNA minor groove is called '''shape readout''', as they do not form base-specific hydrogen bonds but rather recognize the sequence-specific narrowing of the minor groove. AT-rich regions tend to have an intrinsically narrow minor groove, which enhances the negative electrostatic potential and in turn attracts basic side chains. This shape readout mechanism was found to be broadly employed by arginine residues<ref name="nature">Rohs R, West SM, Sosinsky A, Liu P, Mann RS, Honig B. The role of DNA shape in protein-DNA recognition. Nature. 2009;461(7268):1248-53. [http://www.ncbi.nlm.nih.gov/pubmed/19865164 PMID:19865164]</ref>.<br/>


[[Image:Joshi-etal-Figure7.jpg‎ |thumb|left|300px|Figure 4: Expression patterns of Scr in presence of Scr specific site (left panel) vs. Hox consensus site (right panel). Elsevier/Cell Press has provided permission for usage of this figure<ref name="joshi"/>.]]
 
{{Clear}}
''In vitro'' binding studies have shown that mutations of His-12 and Arg3 have a large effect on binding to the Scr specific site, whereas the effect is small for a Hox consensus site. The biological importance of both side chains becomes apparent in ''in vivo'' experiments. When His-12 and Arg3 are mutated to alanine, Scr expression in a fly embryo is dramatically affected (Figure 4). In comparison to wild-type Scr (A) and based on ectopic expression (B), only residual expression is detected in the thorax region of the double mutant when the Scr specific site is tested (C), whereas there is no apparent effect on expression in the presence of a Hox consensus site (D-F).<br/>


==Recognition of Scr Specific vs. Hox Consensus Site==


<scene name='Sandbox_Reserved_169/Con/3'>Figure 5: 3D-Representation of Exd-Scr-DNA ternary complex with Hox consensus site</scene> ([http://proteopedia.com/wiki/index.php/2r5y PDB ID# 2R5Y])
 
This observation can be explained by a second crystal structure of an <span style="background-color: black; color: yellow;">'''&nbsp;Scr&nbsp;'''</span>-<font color="blue">'''Exd'''</font>-<font color="#e06800">'''DNA'''</font> ternary complex in which the Hox-Exd heterodimer is bound to a Hox consensus site that is not specific to Scr (<scene name='Sandbox_Reserved_169/Con/3'>restore initial scene</scene>). In this structure, only <scene name='Sandbox_Reserved_169/Con/2'>Arg5 binds the minor groove</scene>, and the remainder of the N-terminal linker is disordered (Figure 5).<br/>


[[Image:Cell2007-Fig4.jpg‎ |thumb|left|300px|Figure 6: Comparison of DNA shape of Scr specific in vivo site (left panel) vs. Hox consensus site (right panel). Elsevier/Cell Press has provided permission for usage of this figure<ref name="joshi"/>.]]
 
{{Clear}}
A comparison of the two crystal structures of Scr-Exd-DNA ternary complexes (Figure 6) shows that three N-terminal residues contact the minor groove of the Scr specific site ''fkh250'' (A), whereas only Arg5 binds the Hox consensus site ''fkh250con'' (B). In their protein-bound states, the shapes of the two sites are distinct (dark gray, concave; green, convex surfaces). These distinct shapes, shown as minor groove width in the crystal structures of the complexes (blue plots), are already present when the protein is not bound to the DNA, with two minima in ''fkh250'' (C) vs. one minimum in ''fkh250con'' (D), as inferred from Monte Carlo simulations (green plots). Minor groove width (blue plots) and electrostatic potential (red plots) correlate and form two binding pockets in ''fkh250'' (E) but only a single binding site, for Arg5, in ''fkh250con'' (F).<br/>
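
Counting minima in a minor groove width profile can be sketched in a few lines of Python. The width values below are hypothetical numbers chosen only to mimic a "two minima" and a "one minimum" profile; they are not measurements from the 2R5Z or 2R5Y structures, and the 4.5 Å cutoff and the function name <code>local_minima</code> are likewise assumptions made for illustration.

```python
def local_minima(widths, narrow_cutoff):
    """Return indices of local minima that are narrower than narrow_cutoff."""
    minima = []
    for i in range(1, len(widths) - 1):
        is_narrow = widths[i] < narrow_cutoff
        is_local_min = widths[i] < widths[i - 1] and widths[i] <= widths[i + 1]
        if is_narrow and is_local_min:
            minima.append(i)
    return minima


# Hypothetical minor groove widths in angstroms, one value per position.
# These are NOT values taken from the crystal structures.
specific_like = [5.8, 5.0, 3.9, 4.6, 5.2, 4.4, 3.7, 4.5, 5.6]
consensus_like = [5.8, 5.5, 5.3, 5.4, 5.1, 4.3, 3.8, 4.6, 5.6]

CUTOFF = 4.5  # assumed threshold for calling a position "narrow"

specific_minima = local_minima(specific_like, CUTOFF)
consensus_minima = local_minima(consensus_like, CUTOFF)

print("specific-like profile minima at positions:", specific_minima)
print("consensus-like profile minima at positions:", consensus_minima)
```

With these illustrative numbers, the first profile has two narrow regions and the second has one, mirroring the qualitative difference described for ''fkh250'' and ''fkh250con''.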


==High-throughput Analysis of Hox-DNA Binding Specificity==


[[Image:Slattery-etal-Figure6.jpg‎ |thumb|right|300px|Figure 7:  DNA shape analysis of >650,000 sites derived from SELEX-seq experiments. Elsevier/Cell Press has provided permission for usage of this figure<ref name="slattery"/>.]]


Based on SELEX-seq data and a method for high-throughput prediction of DNA shape, the same pattern of two minima in minor groove width was predicted for the binding sites of all anterior Hox proteins, compared with a single minimum for all posterior Hox proteins (A; dark green for narrow groove, white for wide groove). Frames highlight the regions that correspond to the minima in Figure 6. Differences in minor groove width between binding sites can be visualized in a Euclidean distance dendrogram, which forms two branches representing anterior and posterior Hox proteins (B). The differences between the two groups are significant, as shown by Pearson correlation (C). Remarkably, based on the DNA shape of their selected binding sites, the eight ''Drosophila'' Hox proteins order according to their collinearity. This result indicates how Hox genes have likely differentiated throughout evolution.<br/>
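
The two comparison measures named above, Euclidean distance and Pearson correlation, can be sketched for minor groove width profiles as follows. The four profiles and their labels are hypothetical and serve only to show the arithmetic; they are not SELEX-seq results, and the nearest-neighbor lookup is a simplified stand-in for the dendrogram, not the clustering procedure used in the cited study.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equally long width profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pearson(p, q):
    """Pearson correlation coefficient between two width profiles."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_q = sum((b - mean_q) ** 2 for b in q)
    return cov / math.sqrt(var_p * var_q)


# Hypothetical minor groove width profiles (angstroms); labels are placeholders.
profiles = {
    "anterior_1": [5.0, 3.9, 4.6, 5.2, 3.7, 4.5],
    "anterior_2": [5.1, 4.0, 4.7, 5.1, 3.8, 4.6],
    "posterior_1": [5.6, 5.4, 5.3, 5.0, 3.8, 4.6],
    "posterior_2": [5.5, 5.3, 5.4, 4.9, 3.9, 4.7],
}


def nearest_neighbor(name):
    """Name of the profile with the smallest Euclidean distance to `name`."""
    others = [other for other in profiles if other != name]
    return min(others, key=lambda other: euclidean(profiles[name], profiles[other]))


for name in profiles:
    print(name, "is closest to", nearest_neighbor(name))
```

In this toy data set each profile is closest to the other member of its own group, which is the kind of separation that a distance dendrogram would display as two branches.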
==3D structure of Hox protein==
Updated on {{REVISIONDAY2}}-{{MONTHNAME|{{REVISIONMONTH}}}}-{{REVISIONYEAR}}

[[1b72]] – hHox-B1 + DNA – human<br />
[[1puf]] – hHox-PRL + mHox-A9 + DNA <br />
[[2l7z]] – hHox-A13 + DNA – NMR<br />
[[2lp0]] – hHox-A13 + geminin peptide – NMR<br />
[[2cra]] – hHox-B13 – NMR<br />
[[5edn]], [[5eea]], [[5ef6]], [[5no6]] – hHox-B13 + DNA<br />
[[5eg0]], [[5ego]] – hHox-B13 + MEIS1 + DNA<br />
[[2msy]] – hHox-C9 – NMR<br />
[[3a03]] – hHox-11L1<br />
[[1ig7]] – mHox-MSX + DNA – mouse<br />
[[1lfu]] – mHox-PBX + DNA – NMR<br />
[[2ld5]] – mHox-A13 + DNA – NMR<br />
[[4uut]] – DmHox – ''Drosophila melanogaster''<br />
[[2r5z]] – DmHox + DNA <br />
[[2r5y]] – DmHox + Scr + DNA <br />
[[4uus]], [[5cyc]] – DmHox + Ubx + DNA <br />
[[5zjq]], [[5zjr]], [[5zjs]], [[5zjt]] – DmHox + AbdB + DNA <br />

==Further Reading==
A more general discussion of the structural origins of binding specificity in protein-DNA recognition has been published, along with a suggestion for a new '''classification of protein-DNA readout modes''' that goes beyond the historical description of direct and indirect readout<ref name="annualreview">Rohs R, Jin X, West SM, Joshi R, Honig B, Mann RS. Origins of specificity in protein-DNA recognition. Annu Rev Biochem. 2010;79:233-69. [http://www.ncbi.nlm.nih.gov/pubmed/20334529 PMID:20334529]</ref>.<br/>

==Acknowledgements==
This Proteopedia page originates from the partnership of the Rohs Laboratory at the University of Southern California with La Cañada High School. This partnership was initiated by Remo Rohs and Patty Compeau in September 2011 as the '''Bioinformatics Institute''', which is part of the Institutes of the 21st Century. Advice and technical help from Proteopedia editors Eran Hodis, Eric Martz, Jaime Prilusky, and Joel Sussman are gratefully acknowledged.<br/>

==References==
<references/>

[[Category:Featured in BAMBED]]
[[Category:Topic Page]]