Jmol/Visualizing large molecules

Inadequate Memory May Preclude Display

Some molecular models ("molecules") are so large that they will not fit within the default amount of computer memory allocated to Jmol (which is the default amount of memory allocated to Java). While it is possible to increase the memory allocated to Java, most users will not do this, and hence will not be able to display, in Proteopedia or Jmol, molecules that exceed a certain size.

Maximum Size Per Model

Strictly speaking, the format of PDB files is limited to 99,999 atoms in a single model, because there are only 5 columns allocated to atom serial numbers. (Files in the mmCIF format can be read by Jmol, and do not suffer from this limitation.) 3cc2 is a model of a large ribosomal subunit containing 99,049 atoms (close to the limit for a single PDB file). Most likely it will display in Jmol when you go to that page. Jmol ignores the atom serial number, columns 7-11 in the PDB file, instead assigning its own atomIndex number, unique for each atom, and not redundant between models. Jmol can handle PDB files containing >100,000 atoms.
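The fixed-column limit described above can be illustrated with a short Python sketch. It is not part of Jmol; the function names and the example record are hypothetical, and only the column positions (7-11 for the atom serial number) come from the text.

```python
# Largest serial number that fits in the 5-column field (columns 7-11).
MAX_SERIAL = 10**5 - 1


def atom_serial(line):
    """Read the atom serial number from columns 7-11 of an ATOM/HETATM record."""
    # PDB columns are 1-based, so columns 7-11 are the Python slice [6:11].
    return int(line[6:11])


def fits_in_serial_field(serial):
    """Return True if a positive serial number can be written in 5 columns."""
    return len(f"{serial:>5d}") == 5


# Hypothetical record: only the leading columns are laid out here.
example = "ATOM  99049  CA  ALA A   1"

print(atom_serial(example))
print(fits_in_serial_field(MAX_SERIAL), fits_in_serial_field(MAX_SERIAL + 1))
```

A serial number of 100,000 needs six characters, which is why it cannot be stored in a strictly formatted PDB file.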

The 99,999-atom limit of the PDB format requires that models containing 100,000 or more atoms be split into two or more PDB files, or else represented as artificially separated models in a single PDB file. These work-arounds are awkward for visualization. An example is the combination of portions of the two files 1jgo and 1giy for visualization of a complete ribosome.

Multiple Model Files

The largest PDB files in the Protein Data Bank are those containing multiple models of large molecules. Since the atom serial numbers start at 1 in each model, these files can get very large (>1,000,000 atoms is possible). An example is 3ezb, which contains 40 models (determined by solution NMR). Each model contains 5,323 atoms (including 2,694 hydrogen atoms); the 40-model file contains 212,920 atoms, and the PDB file is 16.5 megabytes in size. When you visit the page 3ezb, the ensemble will fail to display, producing an "out of memory" error (unless you have allocated more than the default amount of memory to Java on your computer). There are files in the PDB several-fold larger than 3ezb. For example, 2hyn is a 64 megabyte file containing 826,896 atoms in 184 models.
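To check how many atoms a multiple-model file holds before trying to display it, one could count the records per model. The following Python sketch assumes the standard MODEL, ATOM and HETATM record names; the function name and the tiny sample text are hypothetical.

```python
def atoms_per_model(lines):
    """Count ATOM/HETATM records in each model of a PDB file.

    A file without MODEL records is treated as a single model.
    """
    counts = []
    for line in lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "MODEL":
            counts.append(0)  # start counting a new model
        elif record in ("ATOM", "HETATM"):
            if not counts:
                counts.append(0)  # no MODEL record seen: single-model file
            counts[-1] += 1
    return counts


# Hypothetical two-model file with two atoms per model.
sample = """MODEL        1
ATOM      1  N   MET A   1
ATOM      2  CA  MET A   1
ENDMDL
MODEL        2
ATOM      1  N   MET A   1
ATOM      2  CA  MET A   1
ENDMDL
""".splitlines()

counts = atoms_per_model(sample)
print(len(counts), sum(counts))

# The 3ezb figures quoted above: 40 models of 5,323 atoms each.
print(40 * 5323)
```

Because the serial numbers restart in each model, the total (here 212,920 for 3ezb) is not bounded by the 5-column serial number field.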

Displaying Only The First Model

Jmol can be instructed to load only the first model of a multiple-model PDB file. This is best done with the Jmol application (outside of Proteopedia). Later, the single model could be uploaded to Proteopedia for use in a scene.

Put Jmol.jar in your working folder: Create a folder (directory) on your computer in which to work. Go to [Jmol.Org], select Download (upper right), go to the Jmol Downloads Page, and download the current version as a binary.zip file. Unzip the file (this is automatic on Macs), and copy the file Jmol.jar (one of a large number of files in the zipped download). This is the Jmol application. Paste it into your working folder.

Run the Jmol application: Double-click Jmol.jar in your working folder, and a black window will appear titled "Jmol". If the white "Jmol Script Console" window does not appear, use Jmol's File menu (at the top), Console to open it.

Demonstrate Out Of Memory: Type the following command into the white console window:

load =2hyn

A red "OutOfMemory" error message should appear in Jmol in less than 30 seconds (depending on the speed of your Internet connection).

Load The First Model: Type the following two commands into the white console window:

zap

load models {1 1 1} =2hyn

In less than 30 seconds, the first model from the ensemble in 2hyn should appear in Jmol.

Save The First Model: Type this command:

write pdb 2hyn_model1.pdb

Now you should find a new file 2hyn_model1.pdb in your working folder. You can load it with this command:

load 2hyn_model1.pdb

You can also upload it to Proteopedia for use in molecular scenes generated with Proteopedia's SAT.
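If the Jmol application is not at hand, the first model can also be cut out of an already downloaded PDB file with a few lines of Python. This is only a sketch, not what Jmol does internally: it assumes each model is terminated by an ENDMDL record, and the function name and sample text are hypothetical.

```python
def first_model_lines(lines):
    """Keep everything up to and including the first ENDMDL record.

    Header records and the first model are kept; later models and any
    trailing records (such as END) are dropped.
    """
    kept = []
    for line in lines:
        kept.append(line)
        if line.startswith("ENDMDL"):
            break  # the first model is complete
    return kept


# Hypothetical two-model file.
sample = """HEADER    EXAMPLE
MODEL        1
ATOM      1  CA  MET A   1
ENDMDL
MODEL        2
ATOM      1  CA  MET A   1
ENDMDL
END
""".splitlines()

for line in first_model_lines(sample):
    print(line)
```

Unlike the Jmol load command above, this approach requires the complete file to be on disk first, but it reads the file line by line and so never holds more than the first model in memory if `lines` is an open file object.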

Displaying Only Alpha Carbon Atoms

With large multiple-chain assemblies, or multiple-model ensembles, typically you want to see only the backbone traces. Backbone traces can be visualized from only the alpha carbon atoms (or for nucleic acids, the phosphorus atoms). Jmol can extract ("filter") specified atoms from the PDB file, thereby saving memory. For example, 2hyn contains 4,494 atoms/model (half of which are hydrogen atoms), and 184 models, totalling 826,896 atoms. There are 260 alpha carbon atoms/model, or a total of 47,840 atoms. The alpha carbons represent less than 6% of the original atoms, or a roughly 17-fold reduction in memory requirements.
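The idea of keeping only alpha carbons can be sketched in Python as a filter on the atom name field (columns 13-16 of a PDB record). This is an illustration of the selection, not Jmol's own filtering code; the helper `atom_line` and the function `alpha_carbons` are hypothetical names.

```python
def atom_line(serial, name, res="MET"):
    """Hypothetical helper: lay out only the leading columns of an ATOM record."""
    # Columns 1-6 record name, 7-11 serial, 13-16 atom name, 18-20 residue name.
    return f"ATOM  {serial:>5d} {name:<4} {res:>3}"


def alpha_carbons(lines):
    """Keep ATOM records whose atom name (columns 13-16) is CA.

    HETATM records are skipped, so calcium ions (also named CA) are not kept.
    """
    return [
        line
        for line in lines
        if line[:6].strip() == "ATOM" and line[12:16].strip() == "CA"
    ]


# Hypothetical residue with four backbone atoms.
residue = [
    atom_line(1, " N  "),
    atom_line(2, " CA "),
    atom_line(3, " C  "),
    atom_line(4, " O  "),
]

kept = alpha_carbons(residue)
print(len(kept), "of", len(residue), "atoms kept")
```

Applied to the 2hyn figures above, the same selection keeps 260 of 4,494 atoms in each of the 184 models.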

Using the Jmol application from your working folder (see instructions above), enter this command:

load =2hyn filter "*.ca"

After about a full minute (depending on the speed of your Internet connection), a backbone trace of the first model will appear, which means that loading and filtering are complete. These commands will display the backbone traces for all 184 models:

frame all

color chain

Biological Assemblies

Biological unit assemblies generated from PDB files often exceed the 99,999-atom limit of the PDB format, which they work around by using redundant serial numbers for copied chains. Luckily, Jmol ignores the serial numbers in the PDB file, and can handle models with >100,000 atoms.

Proteopedia Page Contributors and Editors

Eric Martz, Wayne Decatur