Jmol/PDB file editing with Jmol: Difference between revisions

From Proteopedia
Angel Herraez (talk | contribs)
change to more proper WikiCode formatting and adding extra info about scripts inside pdb files
Revision as of 13:03, 12 September 2020

The Jmol.jar application can be used to edit the contents of PDB files. For example, you could change atom serial numbers, chain names, sequence numbers, and so forth. A command script file can be written to make specific changes, following the principles outlined below, using a plain text editor.

Identify yourself and changes made

What changes were made, by whom?

Before you make any edited PDB file public, such as by uploading it to Proteopedia, PLEASE insert REMARK lines that give your name and professional affiliation, and summarize what changes you made. Inserting REMARK lines can most easily be done as a last step, using a plain text editor.

Use a distinctive PDB file name

Do not give your PDB file a name that is easily confused with the version published at the PDB, such as 1D66.pdb. Use a name that makes it clear that it has been modified, such as 1D66_chains_renamed.pdb.

Put the PDB file in a variable

First, run the Jmol.jar Java application, and load any PDB file. If you have a PDB file saved to your disk (for example, downloaded from RCSB.Org), drag and drop it into the Jmol graphics window. Alternatively, to load 1d66 directly from the PDB:

load =1d66   # No space between = and the 4-character PDB code.
Note: Anything between '#' and the end of a line (or a semicolon) is a comment.

After you have the PDB file loaded and the molecule is displayed:

mypdb = getproperty("filecontents")

mypdb is a variable inside the Jmol scripting context that now has one long string (including newline characters) for the entire PDB file, unmodified, including the header.

Edit line by line

It is usually easiest to loop through the PDB file line by line. So, let's define a new variable:

mypdblines = mypdb.lines

Now, mypdblines is an array with one PDB line per element. So you can loop line by line:

for (i=1; i<=mypdblines.length; i++)
{
     mypdblines[i] ...  # do something with this
}

Jmol has plenty of commands for finding and editing lines. For example, to operate only on lines beginning "ATOM ...",

if (mypdblines[i].find("^ATOM ", "")) ...

The second parameter "" signals that the first parameter should be interpreted as a regular expression, where "^" means "beginning of the line".

Most of Jmol's built-in functions for operations on character strings are listed in this section of the Jmol documentation.

Since PDB format has fixed column positions, you can, for example, change the chain name, which is in column 22:

if ((mypdblines[i])[22][23] == "G") {(mypdblines[i])[22][23] = "D";}

(The atom property "chain" is not writable in Jmol, and neither are "resno" or "seqcode". So you can't simply assign new values to these properties.)
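The fixed-column idea can also be tried outside Jmol. The following is a minimal Python sketch, not Jmol code: the function name `rename_chain` is hypothetical, the two coordinate records are made up for illustration, and it assumes the chain identifier sits in column 22 (zero-based string index 21) of both ATOM and HETATM records.

```python
def rename_chain(pdb_lines, old_chain, new_chain):
    """Return a copy of pdb_lines with the chain ID changed in coordinate records.

    Assumes standard PDB fixed columns: the chain identifier is column 22,
    which is index 21 in a zero-based Python string.
    """
    if len(old_chain) != 1 or len(new_chain) != 1:
        raise ValueError("chain identifiers must be single characters")
    edited = []
    for line in pdb_lines:
        # Only coordinate records carry a chain ID in this column.
        is_coordinate_record = line.startswith(("ATOM  ", "HETATM"))
        if is_coordinate_record and len(line) > 21 and line[21] == old_chain:
            line = line[:21] + new_chain + line[22:]
        edited.append(line)
    return edited


# Made-up records for illustration only (not taken from 1D66).
example = [
    "HEADER    TRANSCRIPTION/DNA                       06-MAR-92   1D66",
    "ATOM      1  N   MET G   1      11.000  22.000  33.000  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.000  23.000  34.000  1.00 20.00           C",
]
renamed = rename_chain(example, "G", "D")
for line in renamed:
    print(line)
```

The sketch returns a new list rather than editing in place, so the original lines stay available for comparison. Header lines are passed through untouched.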

Write a PDB file containing the edited lines

When finished, you write the PDB file like this:

write var @mypdblines "filename.pdb"

"Var" means you are writing the contents of a variable into a disk file.

It is important to note that another command you might think of using, write "filename.pdb", writes a file without the original header (and containing only the currently selected atoms). By using the variable mypdblines, you preserve the header and write all atoms.

Saving key information in the header

Custom information can be inserted into the header section of mypdblines.

Jmol uses the first line of a PDB file to recognize PDB format, so it is important not to put your custom lines first. Not only Jmol, but PyMOL and Chimera and likely other popular molecular visualization apps (including FirstGlance) happily ignore lines in a PDB file that do not begin with a recognizable record name such as REMARK or ATOM. (FirstGlance recognizes lines beginning "!" as custom information from the ConSurf server.)

Method #1

For example, if Jmol has calculated things that you would like to have available (without re-calculating) in the output PDB file, you can insert lines between the first and second lines of mypdblines like this:

HEADER    TRANSCRIPTION/DNA                       06-MAR-92   1D66             
@ Custom information in lines beginning "@ ".
TITLE     DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX
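The insertion shown above can also be prepared outside Jmol. This is a minimal Python sketch under stated assumptions: `insert_custom_lines` is a hypothetical helper name, and the "@ " prefix follows the convention used on this page.

```python
def insert_custom_lines(pdb_lines, notes, prefix="@ "):
    """Return a new list with prefixed note lines placed right after line 1.

    The first line (normally HEADER) stays first, because Jmol uses it
    to recognize the PDB format.
    """
    if not pdb_lines:
        raise ValueError("pdb_lines is empty; there is no first line to keep in place")
    custom = [prefix + note for note in notes]
    return pdb_lines[:1] + custom + pdb_lines[1:]


header = [
    "HEADER    TRANSCRIPTION/DNA                       06-MAR-92   1D66",
    "TITLE     DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX",
]
for line in insert_custom_lines(header, ['Custom information in lines beginning "@ ".']):
    print(line)
```

Refusing an empty list is a deliberate choice: without a first record to keep in place, the custom lines would end up first, which is exactly what the text above warns against.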

It is even possible to put Jmol scripts (perhaps to define a function, or specify custom variable values [variables are not saved in PDB nor in PNGJ files]) for later use. For example, this could be inserted into the header of mypdblines:

@ # Jmol script.
@ myvar = 12.6
@ function f1()
@ {
@   print _arguments
@ }
@ # End Jmol script.

After loading the saved PDB or PNGJ file with this in its header, you can drag and drop in a script file that (i) extracts the @ lines into a variable, (ii) removes the leading "@ " from each line, then (iii) executes the variable using "script inline @variable".
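Steps (i) and (ii) are plain text processing, so they can be illustrated outside Jmol. The following Python sketch is only an illustration of the extraction logic, with the hypothetical name `extract_embedded_script`; step (iii), running the result, still has to happen inside Jmol.

```python
def extract_embedded_script(pdb_lines, prefix="@ "):
    """Collect lines starting with the prefix, strip it, and join into one script."""
    script_lines = [line[len(prefix):] for line in pdb_lines if line.startswith(prefix)]
    return "\n".join(script_lines)


saved = [
    "HEADER    TRANSCRIPTION/DNA                       06-MAR-92   1D66",
    "@ # Jmol script.",
    "@ myvar = 12.6",
    "@ function f1()",
    "@ {",
    "@   print _arguments",
    "@ }",
    "@ # End Jmol script.",
    "TITLE     DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX",
]
print(extract_embedded_script(saved))
```

Only the two-character prefix is removed, so any indentation inside the embedded script survives.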

Method #2

If you prefer your files to be closer to PDB format standards (and so avoid potential problems if those files are read into other software), any extra custom lines should always start with the PDB keyword REMARK. In fact, Jmol is designed to read and apply any Jmol script embedded in the file when a line starts with REMARK jmolscript: (as described in this page). Each script must fit entirely on one such line, although a file may contain several of these lines. Taking the example above, this would look like:

REMARK jmolscript: myvar = 12.6; function f1() { print _arguments }
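A line like the one above can be assembled programmatically. This is a minimal Python sketch, assuming each statement passed in is already complete on its own (for example, a whole function definition); `jmolscript_remark` is a hypothetical helper name, not part of Jmol.

```python
def jmolscript_remark(statements):
    """Join complete script statements into a single 'REMARK jmolscript:' line."""
    for statement in statements:
        if "\n" in statement:
            raise ValueError("each statement must fit on one line")
    return "REMARK jmolscript: " + "; ".join(s.strip() for s in statements)


print(jmolscript_remark(["myvar = 12.6", "function f1() { print _arguments }"]))
# prints: REMARK jmolscript: myvar = 12.6; function f1() { print _arguments }
```

The sketch does not check line length. Strict PDB format limits records to 80 columns, so a long script line may still cause trouble in other software.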

Writing a PNGJ file containing the edited lines

PNGJ files contain a PNG (Portable Network Graphics) static image of the scene Jmol was displaying when the PNGJ file was written, and also the complete information to reproduce the scene in Jmol. When you drag a PNGJ file and drop it into Jmol's graphics window, the scene appears in interactive form that can be rotated, zoomed, and further modified with Jmol commands.

In addition to a PDB file, you can save a PNGJ file with customized PDB lines and a customized header, though it is slightly trickier. Unlike a PDB file, you can't save a PNGJ file from a variable. So here is one scripting method that works, using only three commands:

# Customize mypdblines as desired previous to this line.
zap # Deletes all atoms and defined atom sets. Preserves variables and functions.
load var mypdblines # loads the PDB file data contained in the variable, including header, optionally modified.
# Render, color, center, orient and zoom as desired.
write someFileName.pngj

The PNGJ file will have all the customized PDB lines as well as the view at the time it was saved. It does not include variables or functions that were defined at the time it was saved. If any of these are needed, define them in custom header @ lines and write a script to use them as described above, or use REMARK jmolscript: lines to include the definition of those variables.

A PDB file editing server?

For those who might be interested in writing a server to modify PDB files, JmolData.jar is a variant of Jmol that runs without a graphics window. It is well suited to these kinds of operations. Jaime Prilusky used it in Proteopedia.org to generate a series of image files after small rotations. These are then assembled into a multi-GIF movie on the web server using other free software (ImageMagick.org). See the link "Export Animated Image" under any JSmol box in Proteopedia.Org. In collaboration with Prilusky, Eric Martz adapted these server routines to make such animations within FirstGlance.Jmol.Org (with a simplified user interface). There, under JSmol, click "Save Image or Animation for Powerpoint".

If you know of any PDB file editing servers, please link them here!

Proteopedia Page Contributors and Editors

Eric Martz, Angel Herraez, Jaime Prilusky