Jmol/PDB file editing with Jmol

The [[Jmol/Application|Jmol.jar application]] can be used to edit the contents of [[PDB files]]. For example, you could change atom serial numbers, chain names, or sequence numbers. A [[Jmol#Command_Scripting_Language|command script file]] can be written in a [[Help:Plain text editors|plain text editor]] to make specific changes, following the principles outlined below.
 
==Identify yourself and changes made==
 
===What changes were made, by whom?===
Before you make any edited PDB file public, such as by uploading it to Proteopedia, PLEASE insert REMARK lines that give your name and professional affiliation, and summarize what changes you made. Inserting REMARK lines can most easily be done as a last step, using a [[Help:Plain text editors|plain text editor]].
 
===Use a distinctive PDB file name===
 
Do not give your PDB file a name that is easily confused with the version published at the [[PDB]], such as 1D66.pdb. Use a name that makes it clear that it has been modified, such as 1D66_chains_renamed.pdb.


==Put the PDB file in a variable==
First, run the [[Jmol/Application|Jmol.jar Java application]] and load any PDB file. If you have a PDB file saved on your disk (for example, downloaded from [http://rcsb.org RCSB.Org]), drag and drop it into the Jmol graphics window. Alternatively, to load 1d66 directly from the [[PDB]]:
load =1d66  # No space between = and the 4-character [[PDB code]].
: ''Note:'' Anything between '#' and the end of a line (or a semicolon) is a comment.


After you have the PDB file loaded and the molecule is displayed:
 
mypdb = getproperty("filecontents")
<code>mypdb</code> is a variable inside the Jmol scripting context that now has one long string (including newline characters) for the entire PDB file, unmodified, including the header.


==Edit line by line==
It is usually easiest to loop through the PDB file line by line. So, let's define a new variable:


mypdblines = mypdb.lines


Now, <code>mypdblines</code> is an array with one PDB line per element. So you can loop line by line:
for (i=1; i<=mypdblines.length; i++)
{
   mypdblines[i] ... # do something with this
}


Jmol has plentiful commands for finding and editing lines. For example, to operate only on lines beginning "ATOM ...",


if (mypdblines[i].find("^ATOM ", "")) ...


The second parameter "" signals that the first parameter should be interpreted as a [http://www.regular-expressions.info/ regular expression], where "^" means "beginning of the line".

Since PDB format has fixed column positions, you can, for example, change the chain name, which is in column 22:


if ((mypdblines[i])[22][23] == "G") {(mypdblines[i])[22][23] = "D";}
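
For readers who want to check the column logic outside Jmol, the same fixed-column edit can be sketched in Python. This is only an illustration under stated assumptions: the function name <code>rename_chain</code> and the two sample ATOM lines are hypothetical and are not taken from 1D66. It relies on the chain name being in column 22, as stated above.

```python
def rename_chain(pdb_lines, old_chain, new_chain):
    """Return a copy of pdb_lines with old_chain renamed in ATOM records."""
    edited = []
    for line in pdb_lines:
        # Column 22 (counting from 1) holds the chain name: index 21 in Python.
        if line.startswith("ATOM ") and len(line) > 21 and line[21] == old_chain:
            line = line[:21] + new_chain + line[22:]
        edited.append(line)
    return edited


# Hypothetical sample lines, laid out in PDB fixed columns.
example = [
    "HEADER    EXAMPLE",
    "ATOM      1  N   MET G   1      11.000  12.000  13.000",
    "ATOM      2  CA  MET A   1      12.000  13.000  14.000",
]
renamed = rename_chain(example, "G", "D")
```

Only lines beginning "ATOM " are touched, mirroring the <code>find("^ATOM ", "")</code> test above. HETATM records would need the same treatment if they carry the chain being renamed.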


(The atom property "chain" is not writable in Jmol, nor are "resno" or "seqcode". So you can't simply assign new values to these properties.)

When finished, you write the PDB file like this:


write var mypdblines "someFileName.pdb"

"Var" means you are writing the contents of a variable into a disk file.


It is important to note that another command you might think of using,
<code>write "someFileName.pdb"</code>,
writes a file without the original header (and containing only the currently selected atoms). By using the variable <code>mypdblines</code>, you preserve the header and write all atoms.
 
==Saving key information in the header==
 
Custom information can be inserted into the header section of <code>mypdblines</code>.
 
Jmol uses the first line of a PDB file to recognize PDB format, so it is important not to put your custom lines first. Not only Jmol, but [[PyMOL]] and [[Chimera]] and likely other popular [[Molecular modeling and visualization software|molecular visualization apps]] (including [[FirstGlance in Jmol|FirstGlance]]) happily ignore lines in a PDB file that do not begin with a recognizable record name such as REMARK or ATOM. (FirstGlance recognizes lines beginning "!" as custom information from the ConSurf server.)
 
===Method #1===


For example, if Jmol has calculated things that you would like to have available (without re-calculating) in the output PDB file, you can insert lines between the first and second lines of <code>mypdblines</code> like this:


HEADER    TRANSCRIPTION/DNA                      06-MAR-92  1D66           
@ Custom information in lines beginning "@ ".
TITLE    DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX
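
The placement rule (custom lines go after the first line, never before it) can be sketched outside Jmol in Python. This is a minimal sketch, not part of the Jmol workflow described here: the function name <code>insert_after_first_line</code> and its <code>prefix</code> parameter are hypothetical.

```python
def insert_after_first_line(pdb_lines, custom_lines, prefix="@ "):
    """Return a new list with prefixed custom lines placed after line 1."""
    if not pdb_lines:
        raise ValueError("pdb_lines is empty; there is no first line to keep in place")
    tagged = [prefix + text for text in custom_lines]
    # The first line stays first so that the file is still recognized as PDB format.
    return pdb_lines[:1] + tagged + pdb_lines[1:]


header = [
    "HEADER    TRANSCRIPTION/DNA                      06-MAR-92  1D66",
    "TITLE    DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX",
]
with_note = insert_after_first_line(
    header, ['Custom information in lines beginning "@ ".']
)
```

Passing <code>prefix="REMARK "</code> instead would produce REMARK lines at the same position.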


It is even possible to put Jmol scripts (perhaps to define a function, or specify custom variable values [variables are not saved in PDB nor in PNGJ files]) for later use. For example, this could be inserted into the header of <code>mypdblines</code>:
 
@ # Jmol script.
@ myvar = 12.6
@ function f1()
@ {
@  print _arguments
@ }
@ # End Jmol script.


After loading the saved PDB or PNGJ file with this in its header, you can drag and drop in a script file that (i) extracts the @ lines into a variable, (ii) removes the leading "@ " from each line, then (iii) executes the variable using "<code>script inline @variable</code>".
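
Steps (i) and (ii) amount to filtering lines by prefix and then removing that prefix. The Python sketch below illustrates only that text handling; step (iii), executing the result, still has to happen inside Jmol. The function name <code>extract_embedded_script</code> is hypothetical, and the sample text is a shortened version of the header example above.

```python
def extract_embedded_script(pdb_text, prefix="@ "):
    """Collect lines starting with prefix and return them, prefix removed."""
    script_lines = [
        line[len(prefix):]
        for line in pdb_text.splitlines()
        if line.startswith(prefix)
    ]
    return "\n".join(script_lines)


sample = "\n".join([
    "HEADER    TRANSCRIPTION/DNA                      06-MAR-92  1D66",
    "@ # Jmol script.",
    "@ myvar = 12.6",
    "@ # End Jmol script.",
    "TITLE    DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX",
])
script_text = extract_embedded_script(sample)
```

Matching on the two-character prefix "@ " (with the trailing space) means that a bare "@" line is skipped. That is a choice made for this sketch, not a rule from Jmol.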


===Method #2===


If you prefer your files to be closer to PDB format standards (and so prevent potential problems if those files are read into other software), any extra custom lines should always start with the PDB keyword <code>REMARK</code>. In fact, Jmol is designed to read and apply any Jmol scripts embedded in the file, when a line starts with <code>REMARK jmolscript:</code> (as described in [http://wiki.jmol.org/index.php/File_formats/Scripting#Script_inline_within_a_molecular_coordinates_file this page]). You must put your whole script of commands into that single line, but several such lines in a file are also supported. Taking the example above, this would look like:


REMARK jmolscript: myvar = 12.6; function f1() { print _arguments }


==Writing a PNGJ file containing the edited lines==
PNGJ files contain a PNG (Portable Network Graphics) static image of the scene Jmol was displaying when the PNGJ file was written, and also the complete information to reproduce the scene in Jmol. When you drag a PNGJ file and drop it into Jmol's graphics window, the scene appears in interactive form that can be rotated, zoomed, and further modified with Jmol commands.


In addition to a PDB file, you can save a PNGJ file with customized PDB lines and a customized header, though it is slightly trickier. Unlike a PDB file, a PNGJ file can't be saved from a variable. Here is one scripting method that works, using only 3 commands:


# Customize mypdblines as desired before this line.
zap # Deletes all atoms and defined atom sets. Preserves variables and functions.
load var mypdblines # loads the PDB file data contained in the variable, including header, optionally modified.
# Render, color, center, orient and zoom as desired.
write someFileName.pngj


The PNGJ file will have all the customized PDB lines as well as the view at the time it was saved. It does not include variables or functions that were defined at the time it was saved. If any of these are needed, define them in custom header @ lines and write a script to use them as described above, or use <code>REMARK jmolscript:</code> lines to include the definition of those variables.


==A PDB file editing server?==


One existing option is [https://wenmr.science.uu.nl/pdbtools/ PDB-Tools Web], a user interface for the [http://www.bonvinlab.org/pdb-tools/ pdb-tools] Python package.


For those who might be interested in writing a server to modify PDB files, JmolData.jar is a variant of Jmol that runs without a graphics window. It is perfect for these kinds of operations. [[User:Jaime Prilusky|Jaime Prilusky]] used it in Proteopedia.org to generate a series of image files after small rotations. These are then assembled into a multi-GIF movie using other free software (ImageMagick.org) in the web server. See the link "Export Animated Image" under any JSmol box in Proteopedia.Org. In collaboration with Prilusky, [[User:Eric Martz|Eric Martz]] adapted these server routines to make such animations within [http://FirstGlance.Jmol.Org FirstGlance.Jmol.Org] (with a simplified user interface). There, under JSmol, click "Save Image or Animation for Powerpoint".


If you know of any PDB file editing servers, please link them here!

Proteopedia Page Contributors and Editors

Eric Martz, Angel Herraez, Jaime Prilusky