Tuesday, June 16, 2009

DNA

By Eric Keller

Scientific animation is a great source of work, and also a great opportunity to flex your creative muscle. It's often a challenge to create something that is simultaneously instructive, visually engaging, and scientifically accurate. However, accuracy in scientific animations is a tricky thing. More often than not, accuracy means creating a non-misleading visual metaphor rather than a perfect atom-for-atom representation. Many times this translates into sacrificing realism for visual and conceptual clarity. That being said, when the situation does call for accuracy in the form of a molecular model, it's good to know that you can achieve this in Maya fairly painlessly, with the help of a few scripts, some free applications, and as much artistic license as the situation will allow.

This tutorial is broken down into several mini-tutorials designed to give you some techniques to build on when bringing macromolecular data into Maya. Specifically, we'll be working with the ever-popular, and often misrepresented, DNA. Of course these techniques can also be used on any of the thousands of molecules and proteins whose structure is freely available online at the Protein Data Bank website (www.rcsb.org/pdb/).

Most of the time the DNA depicted in movies, on TV, and even in publications is wrong. DNA is often portrayed as a simple twisted ladder. Other times its structure is shown twisting to the left instead of to the right, or it has too many or too few nucleotides per turn. Sometimes it is shown with some horrible combination of all three mistakes. The structure of DNA is very important; so important, in fact, that discovering it earned James Watson and Francis Crick a Nobel Prize. Think of it this way: you wouldn't throw an F-16 flying backwards into a World War II movie, would you? So why be just as careless with the way you represent DNA? As an animator, you are responsible for correctly representing DNA's structure; fortunately, this doesn't require a Ph.D. By bringing in the actual DNA crystallography data from the protein databank, you can start out with the real McCoy and be confident that your final DNA animation will turn out to be correct.

OK, let's get to it...
Here's what you need to complete this tutorial:

1) Maya 6.0+ on any platform (I've tested this on NT, XP, and OS X, and I believe other platforms should work too)

2) These MEL scripts:

- pdbReader.mel v1.4 by Tom Doeden available on highend3d.com.

- jPivToParticle.mel by Julian Mann available on highend3d.com.

- particleDeformationPoly.mel by Alex Bigott, available on highend3d.com (only needed if you're using Maya 5.0 or earlier).

- ballAndStick.mel by Geordie Martinez available at www.negative13.com

3) One of these (free) programs:

- Chime (www.mdl.com/downloads/downloadable/)
- RasMol (www.umass.edu/microbio/rasmol/)
- Protein Explorer (http://molvis.sdsc.edu/protexpl/frntdoor.htm)
- RasMac (http://mc2.cchem.berkeley.edu/Rasmol/)
- PyMol (http://pymol.sourceforge.net/).

4) A PDB file of DNA

5) This website:

- www.rcsb.org/pdb/


6) And an intermediate-level understanding of Maya, especially installing and using MEL scripts, plus some experience with dynamics. Knowing how to write MEL scripts is not necessary.

Part One: Getting Macromolecular Data into Maya

I am NOT a scientist, but I've often worked on projects where there was a scientist nearby. That's great when I have questions or need to have something checked out. You'll definitely want to find a knowledgeable person to proofread the structure of your molecule, and it would be even better if he or she can help you find the molecular structure you need in the protein databank.

So where do you get this data? The Protein Data Bank, located at www.rcsb.org/pdb/, is an online database populated with thousands of molecules (22,516 at last count). These files are uploaded by scientists and are accessible to anyone. The files contain the actual crystallography data (meaning they contain the positions of the atoms in the molecules). The files themselves are just text files with comments and coordinates. Most often, scientists use a program like Chime, RasMol, or Protein Explorer to view these structures. Thanks to a handy script by Tom Doeden called pdbReader.mel, you can easily import the data into Maya.

Using the script is easy. However, before I get to that, let me give a few words of advice on using the protein databank. It's a good idea to know what exactly you're looking for before downloading the files. Plugging in a general term like "DNA" in the search field on the website is going to generate hundreds of results, many of which include molecules and data you may not want or need. This is true for any protein structure you may look for. The best advice I can give is to get some help from a scientist on your search. For instance, if you're working for a scientist, make sure he or she is very specific in what he or she wants. Furthermore, the databank has a lot of stuff, but it doesn't have everything. Some molecules haven't been structurally analyzed and entered in the database yet.

The databank website also has a great feature called "Molecule of the Month" that is definitely worth a read. The archives contain articles on DNA, bacteriophage, p53, and many of your other favorites. I found the article on DNA to be particularly useful. In-depth information on what a PDB file is can be found on the site at www.rcsb.org/pdb/info.html.

It's also a good idea to get one of the free programs used to view the PDBs (Chime, RasMol, RasMac, or PyMol). I use PyMol on a Mac both to view the structure (to verify what I bring into Maya) and to strip out unnecessary data like comments (I just save the file under a new name, which often removes the extra lines automatically). Chime allows you to view the structure in a web browser, like Internet Explorer.
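If you would rather do that clean-up by hand, the idea can be sketched in a few lines of Python run outside Maya. This is only an illustration, not what PyMol does internally: it assumes that the rows worth keeping start with ATOM or HETATM, and whether pdbReader.mel accepts HETATM rows is something you would need to test. The function name strip_pdb_text and the placeholder header rows are made up for this example.

```python
# Record types that carry atom coordinates in a PDB file.
COORD_RECORDS = ("ATOM", "HETATM")


def strip_pdb_text(pdb_text):
    """Return only the coordinate rows from the text of a PDB file."""
    kept = [
        line.rstrip()
        for line in pdb_text.splitlines()
        if line.startswith(COORD_RECORDS)
    ]
    return "\n".join(kept) + ("\n" if kept else "")


# Made-up sample: placeholder comment rows around two ATOM rows from dna.pdb.
sample = (
    "HEADER placeholder header row\n"
    "REMARK placeholder comment row\n"
    "ATOM 21 P G A 2 22.409 31.286 21.483 1.00 58.85 1BNA 83\n"
    "ATOM 22 O1P G A 2 21.822 31.459 20.139 1.00 78.33 1BNA 84\n"
    "END\n"
)
print(strip_pdb_text(sample), end="")
```

To use it on a real file, read the file's text, pass it through strip_pdb_text, and write the result under a new name so the original download stays untouched.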

To make this tutorial easier to follow, I have linked to a PDB file of DNA. Please feel free to download it. I've already stripped out the comments so you can bring it into Maya without doing anything extra.

Okay, let's really get to it. Download dna.pdb and open it in a text editor. Take a look at how the file is arranged. Some typical rows from our PDB file look like this:

ATOM 21 P G A 2 22.409 31.286 21.483 1.00 58.85 1BNA 83
ATOM 22 O1P G A 2 21.822 31.459 20.139 1.00 78.33 1BNA 84
ATOM 23 O2P G A 2 23.536 32.157 21.851 1.00 57.82 1BNA 85
ATOM 24 O5* G A 2 22.840 29.751 21.498 1.00 40.36 1BNA 86
ATOM 25 C5* G A 2 23.543 29.175 22.594 1.00 47.19 1BNA 87
ATOM 26 C4* G A 2 23.494 27.709 22.279 1.00 47.81 1BNA 88
ATOM 27 O4* G A 2 22.193 27.252 22.674 1.00 38.76 1BNA 89
ATOM 28 C3* G A 2 23.693 27.325 20.807 1.00 28.58 1BNA 90
ATOM 29 O3* G A 2 24.723 26.320 20.653 1.00 40.44 1BNA 91

Each row in the file corresponds to a single atom in the structure. All we're really interested in is the atom's symbol and its X, Y, and Z coordinates. PDB Reader uses these coordinates to position the spheres it creates when you run the script. The atomic symbol (C for carbon, H for hydrogen, N for nitrogen, P for phosphorus, and O for oxygen) determines which shader is applied to each sphere and its relative size. The rest of the data is ignored.
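To make the row layout concrete, here is a minimal Python sketch that pulls the symbol and coordinates out of one row. It is not how pdbReader.mel is written; it assumes whitespace-separated rows laid out like the ones shown above (atom name in the third field, X, Y, and Z in the seventh through ninth), and it guesses the element from the first letter of the atom name, which would be wrong for two-letter symbols. The function name parse_atom_row is made up for this example.

```python
def parse_atom_row(row):
    """Return (element, (x, y, z)) for one whitespace-separated ATOM row."""
    fields = row.split()
    if len(fields) < 9 or fields[0] != "ATOM":
        raise ValueError("not an ATOM row in the expected layout: %r" % row)

    atom_name = fields[2]  # e.g. "P", "O1P", "C5*"
    # Assumption: the first letter of the atom name is the element symbol.
    letters = [ch for ch in atom_name if ch.isalpha()]
    if not letters:
        raise ValueError("no element letter in atom name: %r" % atom_name)
    element = letters[0].upper()

    # Assumption: X, Y, Z are the 7th, 8th, and 9th fields.
    x, y, z = (float(value) for value in fields[6:9])
    return element, (x, y, z)


row = "ATOM 25 C5* G A 2 23.543 29.175 22.594 1.00 47.19 1BNA 87"
element, position = parse_atom_row(row)
print(element, position)
```

Everything after the Z coordinate is simply left unread, which mirrors the point that the rest of the row is ignored.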

Tip 1: What does "C3*" mean? In this case, the asterisk marks the atoms that belong to the backbone of the double helix (it stands in for the prime symbol, so C3* is the sugar atom usually written C3'). The numbers are merely a reference, which pdbReader.mel ignores. If you strip the rows with asterisks out of the PDB file, along with the rows containing P, O1P, and O2P (a repeating phosphorus with two oxygens), you'll be left with just the base pairs of the DNA model. Save these out to a new PDB file if you decide you want to do something special with them later. Remember to also make reciprocal PDB files of the data you've stripped out, so you have PDB files with just the backbone and no base pairs.
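The split described in Tip 1 can be sketched in Python as follows. This is a rough illustration that assumes whitespace-separated rows with the atom name in the third field, as in the rows shown earlier. The function names are made up, and the N9 row is a hypothetical base atom with placeholder coordinates, added only so the example has something to put in the second list.

```python
# Phosphate atom names that belong to the backbone but carry no asterisk.
PHOSPHATE_NAMES = {"P", "O1P", "O2P"}


def is_backbone(row):
    """True if the row's atom name marks it as part of the backbone."""
    atom_name = row.split()[2]
    return "*" in atom_name or atom_name in PHOSPHATE_NAMES


def split_backbone_and_bases(rows):
    """Sort ATOM rows into (backbone_rows, base_rows), skipping other rows."""
    backbone, bases = [], []
    for row in rows:
        if not row.startswith("ATOM") or len(row.split()) < 3:
            continue
        (backbone if is_backbone(row) else bases).append(row)
    return backbone, bases


rows = [
    "ATOM 21 P G A 2 22.409 31.286 21.483 1.00 58.85 1BNA 83",
    "ATOM 25 C5* G A 2 23.543 29.175 22.594 1.00 47.19 1BNA 87",
    "ATOM 29 O3* G A 2 24.723 26.320 20.653 1.00 40.44 1BNA 91",
    # Hypothetical base atom row; the coordinates are placeholders.
    "ATOM 30 N9 G A 2 0.000 0.000 0.000 1.00 0.00 1BNA 92",
]
backbone_rows, base_rows = split_backbone_and_bases(rows)
print(len(backbone_rows), "backbone rows,", len(base_rows), "base rows")
```

Writing each list to its own file gives you the pair of reciprocal PDB files, one with only the backbone and one with only the bases.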


Tip 2: I have found that some atomic symbols, like Si (for silicon), will cause the script to choke. When this happens, I edit the PDB file itself and replace the offending symbol with one I know the script will accept but that doesn't occur too often in the molecule (like phosphorus). Once the file has been loaded, I change these phosphorus atoms back to whatever atom I need. It's an ugly workaround, but it gets the job done.
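If there are many atoms to swap, the find-and-replace in Tip 2 can be scripted. The Python sketch below is one possible approach under stated assumptions: the atom name is the first token after the serial number, the stand-in symbol is no longer than the one it replaces (so it can be padded to the same width), and the sample rows are invented for illustration. The function name swap_symbol is also made up. It returns the serial numbers of the swapped atoms so you know which ones to change back inside Maya.

```python
import re

# Record name, serial number, atom name, then the rest of the row.
ROW_PATTERN = re.compile(r"^((?:ATOM|HETATM)\s*(\d+)\s+)(\S+)(.*)$")


def swap_symbol(rows, unsupported="SI", stand_in="P"):
    """Replace one atom name with a stand-in; return (rows, swapped serials)."""
    new_rows, swapped_serials = [], []
    for row in rows:
        match = ROW_PATTERN.match(row)
        if match and match.group(3).upper() == unsupported.upper():
            # Pad the stand-in so the row keeps its original length.
            padded = stand_in.ljust(len(match.group(3)))
            new_rows.append(match.group(1) + padded + match.group(4))
            swapped_serials.append(int(match.group(2)))
        else:
            new_rows.append(row)
    return new_rows, swapped_serials


# Invented rows: one silicon atom and one ordinary phosphorus atom.
rows = [
    "HETATM 1 SI XXX A 1 0.000 0.000 0.000 1.00 0.00",
    "ATOM 21 P G A 2 22.409 31.286 21.483 1.00 58.85 1BNA 83",
]
new_rows, swapped = swap_symbol(rows)
print(new_rows[0])
print("atoms to change back after import:", swapped)
```

Picking a stand-in that does not already appear in the file would make the swapped atoms easier to find after import, at the cost of needing a symbol the script is known to accept.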

It's important to know that not all PDB files are formatted exactly the same way, and that sometimes these numbers appear in different columns. A little bit of testing will most likely be necessary in some cases.

Run the pdbReader.mel script in Maya and click the "Get PDB File" button. Browse to find the dna.pdb file you downloaded and select the file.
