Minnesota Supercomputer Institute

Current Database Projects

The Minnesota Supercomputing Institute hosts and develops numerous databases. The recent projects listed below include public, project-specific, and commercial databases that serve various fields, among them:

  1. Flat file public databases (GenBank, EMBL, Swiss-Prot, PDB, UniProt),
  2. NCBI BLAST databases (nt, est_human, est_mouse, est_others, nr, swissprot, etc.),
  3. Genome databases (Human, Mouse, Rat, Chicken, Bovine, Fugu, Zebrafish, Drosophila, Yeast, E. coli, etc.),
  4. MySQL databases such as Ensembl, Gene Ontology, and UCSC Genome,
  5. MySQL and Oracle databases for user-specific projects.

Many of these databases are integrated into specific software for researchers. Public databases are updated regularly. MSI staff provide hardware, software, and advanced user support for maintaining and using these databases. Below is a list of some of the database-application projects at MSI in three categories:

MSI Developed and Hosted Database Projects

Microarray and Genotyping Data System

URL:
http://www.affycore.msi.umn.edu
URL:
http://www.snp.msi.umn.edu
DBMS:
UNIX and files
Client:
Microarray and Genotyping Facilities at the Biomedical Genomics Center
Phase:
Operation

Description

This Web-based system uses x500 authorization to distribute Affymetrix microarray and Sequenom genotyping data from the microarray and genotyping facilities at the Biomedical Genomics Center to the University research community.

Sanger and 454 Sequencing Data System

URL:
https://dbw.msi.umn.edu/bmgcseq
DBMS:
UNIX and files
Client:
Sequencing Facility at the Biomedical Genomics Center
Phase:
Operation

Description

This Web-based system uses x500 authorization for distribution of Sanger DNA sequencing data, Roche/454 GS FLX sequencing data, and SAGE data at the Biomedical Genomics Center.

Computational Proteomics and Mass Spectrometry System

URL:
http://www.mass.msi.umn.edu
DBMS:
Oracle, UNIX and files
Client:
The Center for Mass Spectrometry and Proteomics (MSP)
Phase:
Operation

Description

This Web-based system uses x500 authorization for distribution of mass spectrometry and proteomics data at the Center for Mass Spectrometry and Proteomics at the University of Minnesota. The system is also integrated with the commercial Mascot search engine to streamline proteomics data searches.

Image Data System

URL:
http://www.cbs.umn.edu/ic/data/
DBMS:
UNIX and files
Client:
Imaging Center at College of Biological Sciences
Phase:
Operation

Description

This Web-based system uses x500 authorization for distribution of bioimaging data at the Imaging Center, College of Biological Sciences at the University of Minnesota.

Microarray Database for Tissue Engineering

URL:
Research Group Database
DBMS:
Oracle
Client:
Research Group at the University of Minnesota
Phase:
Operation, under expansion

Description

This Microarray Data Management System is the centralized repository for microarray data and related information. It can store experimental, cell, RNA, and hybridization information, raw images, and normalized and finalized results data. It is mainly used for two-channel data and is currently being expanded to handle Affymetrix and NimbleGen data.

Sequence Management and Annotation Database

URL:
Research Group Database
DBMS:
MySQL
Client:
Research Group at the University of Minnesota
Phase:
Operation, under expansion

Description

A sequencing data analysis and annotation database for large-scale EST sequencing projects.

Mass Data Application System

URL:
https://dbw.msi.umn.edu:8443/umn_mass/
URL:
https://dbw.msi.umn.edu:8443/umn_mass_test/
DBMS:
Oracle
PI:
Vivek Kapur, Microbiology
Developers:
Wayne Xu
Phase:
Developing

Description

The Mass Data Application System is a repository of raw Mass data and identified information. It is an online application for depositing, browsing, querying, and retrieving Mass data.

Laboratory Data Management System

URL:
https://dbw.msi.umn.edu/wackettldms
Please use Firefox, Safari, or Netscape, not IE.
DBMS:
File system, Oracle
PI:
Larry Wackett, Biochemistry
Developers:
Wen Dong, Wayne Xu
Phase:
Operation

Description

The Laboratory Data Management System (LDMS) is a Web application for lab research data storage, secure data transfer, and data management. The data are transferred and stored in various flat file formats, with associated metadata in an Oracle database.
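
The flat-file-plus-metadata arrangement described above can be illustrated with a minimal sketch. This is not the LDMS implementation: the `deposit` function, the `file_metadata` table, and its columns are hypothetical, and Python's built-in `sqlite3` module stands in for the Oracle database so that the example runs on its own.

```python
import hashlib
import shutil
import sqlite3
import tempfile
from pathlib import Path


def deposit(conn, store_dir, src_path, owner):
    """Copy a flat file into the store and record its metadata in the database."""
    src = Path(src_path)
    data = src.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    # Prefix the stored name with part of the checksum to avoid name clashes.
    dest = Path(store_dir) / f"{digest[:12]}_{src.name}"
    shutil.copyfile(src, dest)
    conn.execute(
        "INSERT INTO file_metadata "
        "(stored_path, original_name, owner, size_bytes, sha256) "
        "VALUES (?, ?, ?, ?, ?)",
        (str(dest), src.name, owner, len(data), digest),
    )
    conn.commit()
    return dest


# sqlite3 is an assumption used here as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_metadata ("
    "stored_path TEXT, original_name TEXT, owner TEXT, "
    "size_bytes INTEGER, sha256 TEXT)"
)

# Create a small sample file and deposit it (all names are illustrative).
workdir = Path(tempfile.mkdtemp())
store = workdir / "store"
store.mkdir()
sample = workdir / "run01.csv"
sample.write_bytes(b"sample,value\nA,1\n")

stored = deposit(conn, store, sample, owner="hypothetical_user")
row = conn.execute(
    "SELECT original_name, size_bytes FROM file_metadata"
).fetchone()
print(stored.name, row)
```

Keeping the file itself on disk and only its description in the database means any file format can be stored unchanged, while the metadata remains searchable with ordinary SQL queries.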

Laboratory Data Management System

URL:
https://dbw.msi.umn.edu/dunnyldms
Please use Firefox, Safari, or Netscape, not IE.
DBMS:
File system, Oracle
PI:
Gary Dunny, Microbiology
Developers:
Wen Dong, Wayne Xu
Phase:
Operation

Description

The Laboratory Data Management System (LDMS) is a Web application for lab research data storage, secure data transfer, and data management. The data are transferred and stored in various flat file formats, with associated metadata in an Oracle database.

Laboratory Data Management System

URL:
https://dbw.msi.umn.edu/hallldms
Please use Firefox, Safari, or Netscape, not IE.
DBMS:
File system, Oracle
PI:
Jennifer Hall, Medicine
Developers:
Wen Dong, Wayne Xu
Phase:
Operation

Description

The Laboratory Data Management System (LDMS) is a Web application for lab research data storage, secure data transfer, and data management. The data are transferred and stored in various flat file formats, with associated metadata in an Oracle database.

MSI Hosted Database Projects

University of Minnesota Biocatalysis/Biodegradation Database

URL:
http://umbbd.msi.umn.edu
Host/Lab:
SDML
DBMS:
MySQL
Phase:
Operation

Description

This database contains information on microbial biocatalytic reactions and biodegradation pathways, primarily for xenobiotic chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology.

The Vertebrate Secretome & CTT-ome Database (VSDB)

URL:
http://www.secretomes.msi.umn.edu
Host/Lab:
SDML
DBMS:
MySQL
Phase:
Operation

Description

The Vertebrate Secretome & CTT-ome Database (VSDB) contains sequence data on vertebrate CoTranslationally Translocated (CTT) proteins based on the RefSeq database. This database provides a real-time BLAST search for the worldwide user community.

The VSDB processes RefSeq vertebrate proteins through a sequence analysis pipeline. The pipeline had 0.96 sensitivity, specificity, and Matthews Correlation Coefficient when tested using 372 vertebrate protein sequences with known secretion status.
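
For readers unfamiliar with these measures, the sketch below shows how sensitivity, specificity, and the Matthews Correlation Coefficient are computed from the counts of true and false positives and negatives. The function name and the counts are hypothetical and chosen only for illustration; they are not the VSDB test results and do not reproduce the figures reported above.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative.
    """
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts, NOT taken from the VSDB evaluation.
sens, spec, mcc = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Unlike sensitivity and specificity, which each look at one class, the Matthews Correlation Coefficient combines all four counts into a single value between -1 and 1, which makes it informative even when secreted and non-secreted proteins are unevenly represented in a test set.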

The MOrpholino DataBase

URL:
http://www.secretomes.msi.umn.edu/MODB/
Host/Lab:
SDML
DBMS:
MySQL
Phase:
Operation

Description

The MOrpholino DataBase was established as a Web-based database for our morpholino screen. It currently contains over 700 morpholinos, including controls and multiple morpholinos against the same target. A publicly accessible sequence-based search lets the zebrafish community look up morpholinos against a particular target.

Chimpz Movie and File Database

URL:
http://www.goodall.msi.umn.edu
DBMS:
Oracle
Phase:
Operation

Description

This Oracle-based database is used to manage chimpanzee movies.

Stanford Microarray Database Local Installation

URL:
http://www.smd2.msi.umn.edu/
DBMS:
Oracle
Phase:
Operation

Description

This database is a local installation of the Stanford Microarray Database (SMD). It has been used for storing microarray experiment data.

Blood Utilization Collaborative

URL:
http://www.buc.msi.umn.edu/
DBMS:
Oracle
Phase:
Operation

Description

The Blood Utilization Collaborative is a collection of health care organizations banding together to share de-identified blood usage information in order to assess community practice within Minnesota and eastern North Dakota.

Cedar Creek Ecosystem Science Reserve with Data Document System

URL:
http://www.cedarcreek.umn.edu
URL:
http://www.lter.umn.edu
Host/Lab:
SDML
DBMS:
Oracle and MySQL
Phase:
Operation

Description

This is a National Science Foundation Long Term Ecological Research site. The databases are used to organize and distribute decades of ecological experiment data to the ecology research community.

Caenorhabditis Genetics Database Management System

URL:
https://dbw.msi.umn.edu/cgcdb/cgc_login.php
Host/Lab:
SDML
DBMS:
MySQL
Phase:
Operation

Description

This MySQL-based database application handles sample ordering for the Caenorhabditis Genetics Center, directed by Ann Rougvie in the GCD department.

National Center for Earth-Surface Dynamics

URL:
http://www.nced.umn.edu
Host/Lab:
SDML
DBMS:
MySQL
Phase:
Operation

Description

NCED is an NSF Science and Technology Center. Its goal is to catalyze development of an integrated, predictive science of the processes shaping the surface of the Earth in order to transform management of ecosystems, resources, and land use.

PlantDB

URL:
Research Group Specific Database
DBMS:
MySQL
Phase:
Under development

Description

A database for plant microarray experiments.

Wiki Site for Structural Genomics

URL:
Research Group Specific Database
Host/Lab:
SDML
DBMS:
MySQL and others
Phase:
Under development

Description

This project is a custom wiki set up for the structural genomics group.

Commercial Web-Database-Applications Hosted and Supported at MSI

GeneData Expressionist for Microarray Data Analysis

URL:
http://dbexpr.msi.umn.edu:16000
Client:
Microarray users, especially Affymetrix users, at the University of Minnesota
Vendor:
GeneData Inc.

Description

GeneData Expressionist is a comprehensive microarray data analysis system. It consists of three closely integrated modules:

Refiner:
A workflow for data loading and quality assurance.
CoBi:
Oracle database for expression data and annotation information storage and management.
Analyst:
Statistical tools for data analysis.

MSI staff provide support for this software, including the GeneChip library, maintenance, hardware and software, client access, and tutorials. It has been widely used by the University of Minnesota microarray community.

Mascot for Proteomics Search

URL:
http://sequest7.msi.umn.edu/mascot/
Client:
Proteomics Research Community at the University of Minnesota
Vendor:
Matrix Science Ltd.

Description

Mascot from Matrix Science is a search engine that uses mass spectrometry data to identify proteins from primary sequence databases. MSI provides support for the program and databases including user-specific database generation.

PEAKS Online for Proteomics Search

URL:
http://sequest5.msi.umn.edu:8080/peaksonline/
Client:
Proteomics Research Community at the University of Minnesota
Vendor:
Bioinformatics Solutions Inc.

Description

PEAKS is software for analysis of peptide mass spectrometry data, especially for de novo MS/MS searches. It has been used by the proteomics user community at the University of Minnesota.

GeneTraffic for Microarray Data Analysis

URL:
http://cgl1.msi.umn.edu
Client:
Microarray User Community at the University of Minnesota
Vendor:
Iobion/Stratagene

Description

Iobion GeneTraffic is a Web-based microarray data analysis system.