The research group of Professor Barry Finzel (Medicinal Chemistry) is developing a centralized database and associated application suite that support searching for complex substructures in macromolecules, storing and sharing user-provided macromolecular data, and presenting search results. Post-Doctoral Associate Dr. Jeff Van Voorst presented a poster about this project at the MSI 2012 Research Exhibition on April 13. The poster was one of the finalists in the competition.
A macromolecule’s structure is represented by all the pairwise distances between the amino acids and nucleotides in the structure. Given the distance representation of a complex substructure, one can search for similar distance patterns in other structures; such searching is called distance geometry matching.
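To make the idea concrete, here is a minimal sketch of distance geometry matching, not the project's actual implementation. It assumes each residue or nucleotide is reduced to a single 3D point, and the function names (`distance_pattern`, `matches`, `search`) and the tolerance `tol` are hypothetical choices for illustration. The brute-force search is only practical for tiny inputs.

```python
import math
from itertools import combinations, permutations


def distance_pattern(points):
    """Return all pairwise distances for an ordered list of 3D points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def matches(query, candidate, tol=0.5):
    """True if every pairwise distance agrees within tol (same point order)."""
    dq = distance_pattern(query)
    dc = distance_pattern(candidate)
    return len(dq) == len(dc) and all(abs(x - y) <= tol for x, y in zip(dq, dc))


def search(query, structure, tol=0.5):
    """Brute force: test every ordered selection of len(query) points."""
    k = len(query)
    return [
        idx
        for idx in permutations(range(len(structure)), k)
        if matches(query, [structure[i] for i in idx], tol)
    ]


# Hypothetical data: a 3-4-5 triangle as the query, and a structure that
# contains a translated copy of it plus one unrelated point.
query = [(0, 0, 0), (3, 0, 0), (0, 4, 0)]
structure = [(10, 10, 10), (13, 10, 10), (10, 14, 10), (50, 50, 50)]
hits = search(query, structure)
```

Because only distances are compared, a match is found regardless of how the substructure is translated or rotated. The same property means a mirror image also matches, which a real system may need to filter out.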
The user interface for this database is HTML-based and is generated by a Python web framework (Flask). A web interface makes the application accessible from anywhere with a suitable network connection to the web host. Other reasons for a centralized system include its reliance on complex software such as the database server (e.g., MySQL), the ease of sharing data between parties, and the relatively large size of the application’s data.
The image above depicts an example of a complex substructure involving an intermolecular interaction, along with search results. A) illustrates the local interaction between a single helix of a transcriptional activator and the DNA duplex to which it binds in a known crystal structure. B) shows the skeleton of backbone atoms (red) used for substructure geometry constraints. C) shows an ensemble of overlaid substructures that include this motif, resulting from a search of all structures in the PDB. D-F) show three very different examples of the protein:DNA complexes in which this motif can occur: D) homeodomain transcription factors; E) Holliday junction replicase; F) basic region leucine zipper (bZIP) proteins.
A traditional website for this work, Drugsite, is already online. Work is ongoing to build the new web-application architecture, which will distribute the search process over multiple processors to reduce search time and return results to any user in seconds.
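One way to picture distributing a search over multiple processors is to treat each stored structure as an independent job and hand the jobs to a pool of workers. The sketch below is an illustration under that assumption, not a description of the group's architecture; the names `search_one` and `search_all`, the dictionary layout of `structures`, and the tolerance `tol` are all hypothetical.

```python
import concurrent.futures as cf
import math
from itertools import combinations, permutations


def pairwise(points):
    """All pairwise distances for an ordered list of 3D points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def search_one(job):
    """Worker: search a single structure; job is (name, points, query, tol)."""
    name, points, query, tol = job
    target = pairwise(query)
    hits = []
    for idx in permutations(range(len(points)), len(query)):
        found = pairwise([points[i] for i in idx])
        if all(abs(f - t) <= tol for f, t in zip(found, target)):
            hits.append(idx)
    return name, hits


def search_all(structures, query, executor, tol=0.5):
    """Send one job per structure to the executor; keep non-empty results."""
    jobs = [(name, pts, query, tol) for name, pts in structures.items()]
    return {name: hits for name, hits in executor.map(search_one, jobs) if hits}
```

`search_all` accepts any `concurrent.futures` executor. Passing a `cf.ProcessPoolExecutor()` (created under an `if __name__ == "__main__":` guard) spreads the jobs over separate CPU cores, while a `cf.ThreadPoolExecutor()` is convenient for quick local checks. Since the jobs share no state, the same pattern extends naturally to more workers.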