In recent years, there have been remarkable advances in the determination of protein and nucleic acid structures by X-ray crystallography. These include the use of synchrotron X-ray sources, which allow the
collection of a complete data set in about five minutes, and the development of sophisticated software that automates many of the steps involved in converting X-ray diffraction data into the final structure.
The X-ray diffraction pattern is determined by the electron density of the crystal, and knowledge of this
electron density "map" allows one to identify the spatial positions of the atoms
in the structure.
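The link between diffraction data and the map is a Fourier transform, and it can be illustrated with a one-dimensional toy example. The sketch below is not part of MAID; the grid size, atom weights, and function names are assumptions chosen for illustration. It shows that the map is recovered when both the amplitudes and the phases of the diffracted waves are known, and that amplitudes alone (which are all a diffraction experiment measures directly) do not place the atoms correctly. This is why an initial estimate of the map has to be obtained by additional experiments.

```python
import cmath
import math

def structure_factors(density):
    """Forward transform of a sampled 1-D density: F(h) = sum_x rho(x) exp(2*pi*i*h*x/N)."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]

def fourier_synthesis(factors):
    """Inverse transform: rho(x) = (1/N) sum_h F(h) exp(-2*pi*i*h*x/N)."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * math.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]

# Toy "unit cell" (hypothetical): two atoms of different weight on a 16-point grid.
density = [0.0] * 16
density[3] = 6.0
density[10] = 8.0

factors = structure_factors(density)
amplitudes = [abs(f) for f in factors]      # measurable in a diffraction experiment
phases = [cmath.phase(f) for f in factors]  # not measured directly

# Amplitudes plus phases reproduce the original map.
recovered = fourier_synthesis([cmath.rect(a, p) for a, p in zip(amplitudes, phases)])

# Amplitudes with all phases set to zero give a map whose strongest peak sits at the origin.
no_phase = fourier_synthesis([complex(a, 0.0) for a in amplitudes])
```

Comparing `recovered` and `no_phase` with `density` shows how much of the positional information is carried by the phases.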
The first step in determining a new protein structure is to obtain a rough estimate of this electron density map. This estimate is obtained either by combining data from native crystals and heavy-atom
derivatives or, if synchrotron radiation is available, by collecting data at two
different wavelengths from a protein in which selenomethionine has been substituted for
the normal amino acid methionine.
The second step is to position the amino acid residues of the protein in this electron density map using the previously determined amino acid sequence. This is currently a time-consuming step that requires
building the protein manually, one residue at a time, into the map using a high-resolution monitor. Depending on the size of the protein, this can take from several weeks to more than a month of a skilled
investigator's time.
Figure: Snapshot of a screen generated by MAID showing the agreement between the automated fit (maroon and white) and the final refined protein structure (yellow and blue).
The final step is to "refine" this initial structure by adjusting it until it yields the optimal fit to the X-ray diffraction data. Sophisticated computer programs have been developed to
automate this refinement.
Professor David Levitt of the Physiology Department at the University of Minnesota has been developing a program (MAID) to automate the second step in this procedure: the building of the protein into
the initial map. He has broken the problem into two steps. In the first step,
the map is searched to find regions that are either α-helices or β-sheets. These are regions in which there are strong
constraints on the possible positions of the atoms. Once these regions are located,
a generic amino acid sequence is optimally fitted into the map. The second step
is to extend these fits into the "loop" regions. This is a much more difficult problem because there are fewer constraints on the possible structures and because the map is usually poor in
these regions.
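The idea of fitting a constrained, generic structure into the map can be sketched with a toy example. The code below is not MAID's algorithm. It assumes standard ideal α-helix geometry for the Cα atoms (a rise of 1.5 Å and a rotation of 100° per residue, at about 2.3 Å from the helix axis), replaces the experimental map with a sum of Gaussians, and searches only one parameter, the rotation of the helix about its axis. All function names and the Gaussian width are hypothetical.

```python
import math

RISE = 1.5                  # Angstrom per residue along the helix axis
TURN = math.radians(100.0)  # rotation per residue (3.6 residues per turn)
RADIUS = 2.3                # approximate C-alpha distance from the axis, Angstrom

def ideal_helix(n_residues, phase=0.0):
    """C-alpha positions of an ideal alpha-helix running along the z axis."""
    return [
        (RADIUS * math.cos(phase + i * TURN),
         RADIUS * math.sin(phase + i * TURN),
         i * RISE)
        for i in range(n_residues)
    ]

def density_at(point, atoms, width=0.7):
    """Toy map: a sum of Gaussians centered on the atoms (stand-in for a real map)."""
    return sum(math.exp(-math.dist(point, a) ** 2 / (2 * width ** 2)) for a in atoms)

def fit_score(model, atoms):
    """Sum of the map values at the model atom positions; higher means a better fit."""
    return sum(density_at(p, atoms) for p in model)

def best_phase(atoms, n_residues, step_deg=10):
    """Grid search over the rotation about the helix axis, in steps of step_deg degrees."""
    candidates = [math.radians(d) for d in range(0, 360, step_deg)]
    return max(candidates, key=lambda ph: fit_score(ideal_helix(n_residues, ph), atoms))

# "True" helix hidden in the toy map, rotated by 40 degrees about its axis.
true_atoms = ideal_helix(12, phase=math.radians(40.0))
found = best_phase(true_atoms, 12)
```

Because the helix geometry fixes the relative positions of all the atoms, a single rotation angle is enough to place twelve residues here. Loop regions have no such template, which is consistent with the text's point that they are much harder to fit.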
Professor Levitt has developed a complete graphic visualization program to aid in the writing and testing of the routines required for MAID. The program is written in C++ and uses OpenGL and Motif.
The figure shows a snapshot of the application of MAID to an experimental preliminary electron density map (not shown in the figure) that was actually used to solve for the final structure
using the conventional manual fitting technique. The final protein structure, obtained after intensive refinement procedures, is indicated by the yellow (main chain atoms) and blue (side chain atoms)
lines in the figure. The test of MAID is to see if the automatically fit protein structure, indicated by the maroon (main chain) and white (side chain) lines, agrees with this final refined structure.
One example of the type of fit produced by MAID is shown in the figure, which illustrates the fit to one helix region (amino acids 53 to 68) and its extension into the loop region (amino acids 68 to 77).
This fit is very good, accurately fitting all the main chain and most of the
side chain atoms.
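Agreement between an automated fit and a refined structure can be quantified in several ways. The sketch below shows two common measures: the root-mean-square deviation (RMSD) between matching atoms, and the fraction of atoms that lie within a distance cutoff of their refined positions. The text does not state which criterion is used to judge MAID's fits, so the 1.0 Å cutoff, the function names, and the coordinates are assumptions for illustration. Both structures are assumed to be in the same coordinate frame already, as they would be when built into the same map, so no superposition is performed.

```python
import math

def rmsd(fit, reference):
    """Root-mean-square deviation between two equally long lists of (x, y, z) positions."""
    if len(fit) != len(reference):
        raise ValueError("both structures must contain the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(fit, reference))
    return math.sqrt(squared / len(fit))

def fraction_within(fit, reference, cutoff=1.0):
    """Fraction of atoms lying within `cutoff` Angstrom of their reference positions."""
    close = sum(1 for a, b in zip(fit, reference) if math.dist(a, b) <= cutoff)
    return close / len(fit)

# Hypothetical coordinates in Angstrom: three atoms fit well, the last one is misplaced.
refined = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
automated = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0), (11.4, 3.0, 0.0)]

overall = rmsd(automated, refined)
accurate = fraction_within(automated, refined)
```

The two measures answer different questions: one badly placed atom dominates the RMSD, while the fraction within the cutoff reports how many atoms are placed acceptably.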
At present, MAID can accurately fit more than 95% of the α-helix and β-sheet regions in the map. However, the extension into the loops is much poorer, accurately fitting fewer than 50% of the amino acids.
In the future, the focus will be on improving the fits in the loop regions, making
it possible to completely automate what is now the most time-consuming step
in the solution of protein structures.