|
|
(71 intermediate revisions by the same user not shown) |
Line 1: |
Line 1: |
| <applet load='Dnac_from_2ggz_a.pdb' size='500' frame='true' align='right' | | <StructureSection load='' size='350' side='right' caption='' scene=''> |
| scene='User:Eric_Martz/Sandbox_4/Dnac_model_from_2ggz_a/6' /> | |
| ==Homology Model of DnaC==
| |
| The following sequence was provided for DnaC from E. coli:
| |
|
| |
|
| <tt> | | *1 <scene name='31/319439/Surface_test/1'>Surface test 1</scene>: simple surface on chain A of 1d66. Colored translucent green. This surface was generated with the '''representations''' tab of the SAT. |
| MKNVGDLMQR LQKMMPAHIK PAFKTGEELL AWQKEQGAIR SAALERENRA
| |
| <br>
| |
| MKMQ<b>RTFNRS GIRPLHQNCS FENYRVECEG QMNALSKARQ YVEEFDGNIA
| |
| <br>
| |
| SFIFSGKPGT GKNHLAAAIC NELLLRGKSV LIITVADIMS AMKDTFRNSG
| |
| <br>
| |
| TSEEQLLNDL SNVDLLVIDE IGVQTESKYE KVIINQIVDR RSSSKRPTGM
| |
| <br>
| |
| LTNSNMEEMT KLLGERVMDR MRLGNSLWVI FNWDSYR</b>SRV TGKEY
| |
| </tt> | |
|
| |
|
| This sequence was submitted to Swiss Model, which [http://tinyurl.com/4nek2q generated the homology model] shown here (<scene name='User:Eric_Martz/Sandbox_4/Dnac_model_from_2ggz_a/6'>restore initial scene</scene>) using chain A of [[2qgz]] as the template, with which the sequence shares 18.6% identity. Swiss Model apparently used predicted secondary structure to guide the sequence alignment, but the details are not clear to me. The homology model covers residues 55-237, shown in boldface in the above sequence. ''Because of the low sequence identity, this model may well contain major errors, or even be wholly incorrect.''
| |
|
| |
|
| Swiss Model has apparently used the [[temperature value]] field in the PDB file to flag highly unreliable regions, namely those that are <font color="red"><b>red</b></font> when the model is <scene name='User:Eric_Martz/Sandbox_4/Dnac_model_from_2ggz_a/4'>colored by temperature</scene>. These regions are shown as '''translucent white''' in the initial scene (selected with the Jmol command <i>select temperature >50</i>). The uncertainty in three of these regions is explained by gaps in the template model (see below). Although the details of these regions are even more uncertain than those of other regions, it seems likely that these loops are on the surface, if the homology model turns out to be substantially correct.
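The same selection can be approximated outside Jmol. The following is a minimal Python sketch, not part of the original page, that lists the residues having at least one atom whose temperature value exceeds a cutoff, reading the fixed-column ATOM records of a PDB file. The helper <tt>make_atom_line</tt> and the three sample atoms are hypothetical and exist only to make the example self-contained; with the real model one would pass the lines of the downloaded PDB file instead.

```python
def make_atom_line(serial, name, resname, chain, resseq, b_factor):
    """Build a fixed-column PDB ATOM record (hypothetical helper; dummy coordinates)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}")


def residues_above(pdb_lines, cutoff=50.0):
    """Return (chain, residue number) pairs with any atom whose temperature value > cutoff."""
    flagged = []
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other record types
        b_factor = float(line[60:66])          # columns 61-66: temperature value
        key = (line[21], int(line[22:26]))     # column 22: chain; columns 23-26: residue number
        if b_factor > cutoff and key not in seen:
            seen.add(key)
            flagged.append(key)
    return flagged


# Hypothetical sample records, not taken from the actual homology model.
SAMPLE = [
    make_atom_line(1, "CA", "ARG", "A", 55, 20.0),
    make_atom_line(2, "CA", "THR", "A", 56, 50.0),   # exactly 50 is not selected
    make_atom_line(3, "CA", "PHE", "A", 57, 99.0),
    "REMARK this line is ignored",
]

if __name__ == "__main__":
    print(residues_above(SAMPLE))
```

Note that Jmol selects individual atoms, whereas this sketch reports whole residues, so the two are equivalent only when all atoms of a residue carry the same temperature value.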
| | *2 <scene name='31/319439/6zgi_translucen_backbone/1'>6zgi as translucent backbone, chain colors</scene> (no isosurfaces). After clicking this standard SAT-generated green link, click this button: |
| | <jmol> |
| | <jmolButton> |
| | <script> |
| | isosurface delete; isosurface "http://proteopedia.org/wiki/images/6/67/6zgi-cavities.jvxl"; |
| | </script> |
| | <text>Load isosurface cavities</text> |
| | </jmolButton> |
| | </jmol> |
| | (From uploaded file http://proteopedia.org/wiki/images/6/67/6zgi-cavities.jvxl -- see [[Image:6zgi-cavities.jvxl]]) |
|
| |
|
| The [[Conservation, Evolutionary|evolutionary conservation]] pattern, revealed by ConSurf, is quite interesting, showing <scene name='User:Eric_Martz/Sandbox_4/Dnac_model_from_2ggz_a/7'>two conserved patches</scene>.<ref>ConSurf found only 10 sequences in SwissProt, with an Average Pairwise Distance of 1.6. The [http://consurf.tau.ac.il/results/1222995227/output.html run shown here] used 100 sequences from Uniprot, with an APD of 1.4.</ref>
| |
|
| |
|
| In order to find specific residues, or see charge distribution or other aspects of this homology model, please use:
| | *3 <jmol> |
| | <jmolButton> |
| | <script> |
| | load "http://proteopedia.org/wiki/images/c/c6/6zgi-cavities.pngj"; |
| | </script> |
| | <text>Load .PNGJ file</text> |
| | </jmolButton> |
| | </jmol> |
| | (From uploaded file http://proteopedia.org/wiki/images/c/c6/6zgi-cavities.pngj -- see [http://proteopedia.org/wiki/index.php/Image:6zgi-cavities.pngj Image:6zgi-cavities.pngj]) |
|
| |
|
| [http://oca.weizmann.ac.il/oca-docs/fgij/fg.htm?mol=http%3A//proteopedia.org/wiki/images/3/3e/Dnac_from_2ggz_a.pdb View DnaC Homology Model in FirstGlance in Jmol]
| |
|
| |
|
| ==Crystal Structure of DnaC==
| | *4 <jmol> |
| | <jmolLink> |
| | <script> |
| | load "http://proteopedia.org/wiki/images/c/c6/6zgi-cavities.pngj"; |
| | </script> |
| | <text>Green link loading uploaded PNGJ file</text> |
| | </jmolLink> |
| | </jmol>. SAT not involved. PNGJ file generated externally in the [[Jmol/Application|Jmol Java application]], then uploaded to Proteopedia as for the previous item above. |
|
| |
|
| A sequence-based search at the international [http://targetdb.pdb.org/ Structural Genomics TargetDB] reveals that the closest completed structure is 2QGZ, the one chosen by Swiss Model as a template. A number of crystal and NMR structures have sequence identities up to 37%, but over shorter stretches and with higher E values.
| |
|
| |
|
| Diffraction data have been obtained (but the solved structure has not yet been deposited) for a ''Listeria monocytogenes'' sequence of 307 residues, pI 5.2, with an E value of 1.6e-05, though with only 21% sequence identity. No sequence with a comparably low E value is listed at the preceding stage, diffraction-quality crystals without diffraction data.
| | *5 <scene name='31/319439/6zgi_cavities_from_pngj/1'>Test dropping pngj with cavities into SAT</scene>, then saving scene. (No molecule loaded with the SAT "load molecule" tab.) |
|
| |
|
| ''E. coli'' DnaC (245 residues, pI 9.4) has been crystallized by RIKEN Structural Genomics Initiative (Japan), but the crystals may not be of diffraction quality. It has been cloned, expressed as a soluble protein, and purified (but not yet crystallized) by three structural genomics groups (RIKEN Structural Genomics Initiative (Japan), Montreal-Kingston Bacterial Structural Genomics Initiative, and Midwest Center for Structural Genomics), as have several proteins with >40% sequence identity. So there is reason for optimism that either a crystal structure, or a more suitable template for homology modeling, will be forthcoming soon. One might consider contacting the groups that have reported purification of DnaC to inquire about progress, and possibly to request priority for DnaC.
| |
| {{Clear}}
| |
| ==Gaps in the Template Model==
| |
| <applet load='Dnac_from_2ggz_a.pdb' size='500' frame='true' align='right'
| |
| scene='User:Eric_Martz/Sandbox_4/2qgz/3' />
| |
| The template was 2QGZ (<scene name='User:Eric_Martz/Sandbox_4/2qgz/3'>initial scene</scene>). The portion of the template used was Glu107-Arg300. Only the amino-terminal 6 residues were not used as template (translucent). Note that there are <scene name='User:Eric_Martz/Sandbox_4/2qgz/5'>three loops</scene> in this segment of the template that lack coordinates due to [[disorder]] in the crystal (marked with spacefilled alpha-carbon atoms).
| |
|
| |
|
| The missing loops are 202-205 (NGSV), 226-231 (EQATSW), and 268-275 (TIKGSDET). These gaps, which occur between the residues marked /\ below, were apparently ignored in making the model, which has a continuous main chain.
| | Tested in Firefox (current version) on macOS Mojave, mid-2014 MacBook Pro, 2.2 GHz Intel Core i7. |
|
| |
|
| Below is the alignment produced by Swiss Model, used in making the 3D model. Vertical bars for identity were inserted by hand (I may have missed some).
| | *6 <scene name='31/319439/6zgi_cavities/1'>Cavities in coronavirus spike protein 6zgi</scene>. It took 8.5 min to generate the cavity isosurfaces after I entered the command "isosurface minset 100 interior cavity 3.0 10.0" into the command slot in the SAT. (This command takes ~45 sec in the Jmol Java app, which is therefore about 11-fold faster for this operation.) After clicking this green link, I waited 11 min and the scene still did not appear, so I think the isosurfaces are being generated again when the green link is clicked. |
| <pre>
| |
| | | | | ||
| |
| TARGET 55 R TFNRSGIRPL HQNCSFENYR VECEGQMNAL SKARQYVEEF
| |
| 2qgzA 100 qkqaais--e riqlvslpks yrhihlsdid vnnasrmeaf saildfveqy
| |
|
| |
| TARGET sssss h h hhhhhhh hhhhhhhhh
| |
| 2qgzA hhh h sss h h hhhhhhh hhhhhhhhh
| |
|
| |
|
| | | || || | | |
| |
| TARGET 96 DGN-IASFIF SGKPGTGKNH LAAAICNELL L-RGKSVLII TVADIMSAMK
| |
| 2qgzA 148 psaeqkglyl ygdmgigksy llaamahels ekkgvsttll hfpsfaidvk
| |
|
| |
| TARGET ssss ss hhh hhhhhhhhhh h h ssss sshhhhhhh
| |
| 2qgzA ssss ss hhh hhhhhhhhhh hh ssss sshhhhhhh
| |
|
| |
|
| || | | || |
| |
| TARGET 144 DTFRNSGTSE EQLLNDLSNV DLLVIDEIGV QTESKYEKVI INQIVDRRSS
| |
| 2qgzA 198 naiske---- --eidavknv pvlilddiga vrde-----v lqvilqyrml
| |
| /\ / \
| |
| TARGET hhh ssssss hhhhhhhhhh
| |
| 2qgzA hh h ssssss hhhhhhhhhh
| |
|
| |
|
| | | ||| | | |
| | </StructureSection> |
| TARGET 194 SKRPTGMLTN SNMEEMTKLL ---GERVMDR MRLGNSLWVI FNWDSYR
| |
| 2qgzA 247 eelptfftsn ysfadlerkw awqakrvmer vr-ylarefh leganrr-
| |
| /\
| |
| TARGET h ssssss hhhhh hhhh hh ssssss s
| |
| 2qgzA h ssssss hhhh hhhh hh hh ssss s
| |
| </pre>
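Identities in a gapped alignment like the one above can also be counted programmatically rather than by hand. The sketch below is one possible Python approach, under the assumption that percent identity is defined as identical residues divided by the number of columns in which neither sequence has a gap; Swiss Model and TargetDB may use different denominators, so the result need not reproduce the figures they report. The function name is hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Comparison is case-insensitive, because the template row above is lower case.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Second alignment block from above (TARGET 96 / 2qgzA 148), spaces removed.
    target = "DGN-IASFIFSGKPGTGKNHLAAAICNELLL-RGKSVLIITVADIMSAMK"
    template = "psaeqkglylygdmgigksyllaamahelsekkgvsttllhfpsfaidvk"
    print(f"{percent_identity(target, template):.1f}% identity in this block")
```

Running the function over all four blocks concatenated would give a whole-alignment figure under the same assumed definition.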
| |
| | |
| Below is the sequence with ATOM records (coordinates) from 2QGZ, numbered 100-300, showing the gaps as "...". This sequence listing was used to locate the positions marked /\ above.
| |
| <pre>
| |
| 1 .......... .......... .......... .......... ..........
| |
| 51 .......... .......... .......... .......... .........Q
| |
| 101 KQAAISERIQ LVSLPKSYRH IHLSDIDVNN ASRMEAFSAI LDFVEQYPSA
| |
| 151 EQKGLYLYGD MGIGKSYLLA AMAHELSEKK GVSTTLLHFP SFAIDVKNAI
| |
| 201 S....KEEID AVKNVPVLIL DDIGA..... .VRDEVLQVI LQYRMLEELP
| |
| | |
| 251 TFFTSNYSFA DLERKWA... .....WQAKR VMERVRYLAR EFHLEGANRR
| |
| </pre>
| |
| (Copied from Protein Explorer's sequence display.)
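The gap positions can be read off such a listing mechanically. As a minimal Python sketch (the function name and the row representation are assumptions made for illustration), each row is given as its starting residue number plus its text, with "." standing for a residue that lacks coordinates; runs of consecutive missing residues are merged into ranges. Only the rows from 101 onward are included, so the absent amino-terminal residues are not reported.

```python
def gap_ranges(rows):
    """Return (first, last) residue-number ranges that are shown as '.' in the listing."""
    missing = []
    for start, text in rows:
        residues = text.replace(" ", "")      # drop the spaces between blocks of ten
        for offset, symbol in enumerate(residues):
            if symbol == ".":
                missing.append(start + offset)
    ranges = []
    for number in missing:
        if ranges and number == ranges[-1][1] + 1:
            ranges[-1][1] = number            # extend the current run
        else:
            ranges.append([number, number])   # start a new run
    return [tuple(r) for r in ranges]


# Rows copied from the sequence listing above (residues 101-300).
ROWS = [
    (101, "KQAAISERIQ LVSLPKSYRH IHLSDIDVNN ASRMEAFSAI LDFVEQYPSA"),
    (151, "EQKGLYLYGD MGIGKSYLLA AMAHELSEKK GVSTTLLHFP SFAIDVKNAI"),
    (201, "S....KEEID AVKNVPVLIL DDIGA..... .VRDEVLQVI LQYRMLEELP"),
    (251, "TFFTSNYSFA DLERKWA... .....WQAKR VMERVRYLAR EFHLEGANRR"),
]

if __name__ == "__main__":
    print(gap_ranges(ROWS))  # [(202, 205), (226, 231), (268, 275)]
```

The three ranges agree with the missing loops 202-205, 226-231, and 268-275 named earlier in this section.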
| |
| | |
| Below is the alignment of dnaC with 2QGZ according to TargetDB (see above).
| |
| <pre>
| |
| ID: DR58 Center: NESGC
| |
| E-value: 0.00028 Identity: 19.737%
| |
| | |
| 10 20 30
| |
| Query MKNVGDLMQRLQKMMPAHIKPAFKTGEELLAWQKEQGA
| |
| Q+ Q P++I +++ + + +
| |
| Subjct EVASFISQHHLSQEQINLSLSKFNQFLVERQKYQLKDPSYIAKGYQPILAMNEGYADVSY
| |
| 40 50 60 70 80 90
| |
| | |
| 40 50 60 70 80 90
| |
| Query IRSAALERENRAMKMQRTFNRSGIRPLHQNCSFENYRVECEGQMNALSKARQYVEEF-DG
| |
| +++ L + ++ +++ ++ ++ +++ + + V+ ++M+A+S ++VE++ ++
| |
| Subjct LETKELVEAQKQAAISERIQLVSLPKSYRHIHLSDIDVNNASRMEAFSAILDFVEQYPSA
| |
| 100 110 120 130 140 150
| |
| | |
| 100 110 120 130 140 150
| |
| Query NIASFIFSGKPGTGKNHLAAAICNELLLR-GKSVLIITVADIMSAMKDTFRNSGTSEEQL
| |
| + ++ + G G GK++L AA+ +EL + G S+ ++ ++ +K+++ N++++EE
| |
| Subjct EQKGLYLYGDMGIGKSYLLAAMAHELSEKKGVSTTLLHFPSFAIDVKNAISNGSVKEE--
| |
| 160 170 180 190 200
| |
| | |
| 160 170 180 190 200 210
| |
| Query LNDLSNVDLLVIDEIGV-QTESKYEKVIINQIVDRRSSSKRPTGMLTNSNMEEMTK----
| |
| ++ ++NV +L++D+IG+ Q+ S + +++ I++ R + PT + +N ++ ++ +
| |
| Subjct IDAVKNVPVLILDDIGAEQATSWVRDEVLQVILQYRMLEELPTFFTSNYSFADLERKWAT
| |
| 210 220 230 240 250 260
| |
| | |
| 220 230 240
| |
| Query LLG-------ERVMDRMRLGNSLWVIFNWDSYRSRVTGKEY
| |
| + G +RVM+R+R
| |
| Subjct IKGSDETWQAKRVMERVRYLAREFHLEGANRR
| |
| 270 280 290 300
| |
| </pre>
| |
| | |
| ==Notes==
| |
| <references /> | |