Associate Professor Baolin Wu

PUBHL Biostatistics Division
School of Public Health
Twin Cities
Project Title: Statistical Methods Development for Analyzing Whole-Genome Sequencing Data

These researchers are developing novel statistical methods and efficient computational tools for discovering disease-associated variants (primarily rare or less frequent ones) using whole-genome sequencing (WGS) data from the UK Biobank, the UK10K project, and the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. These datasets consist of whole-genome sequenced samples and provide more comprehensive coverage of genetic variants than traditional GWAS chips. The TOPMed data have been released to the scientific community through dbGaP.

The researchers are currently developing several statistical methods for efficiently testing rare variant set associations with multiple traits and for robust risk prediction. Methods for rare variant set testing across the genome and across multiple traits are under active development. The group will explore several adaptive testing methods to improve overall power to detect rare variants. A common theme in these analyses is the large volume of data and the large number of tests. Significance is typically declared only at very small p-values; for example, 5E-8 is a common genome-wide significance cutoff. As a result, to compare and validate newly developed methods, the researchers need to perform hundreds of millions of simulation experiments to verify that such small p-values are accurate. HPC resources are therefore essential to the project's success.
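The scale of simulation involved can be illustrated with a rough calculation. The sketch below is not the group's actual procedure; it assumes a plain Monte Carlo check in which each simulated null dataset either rejects or does not reject at the cutoff, so the rejection count is binomial. The function names and the 20% relative-error target are illustrative assumptions.

```python
import math


def mc_relative_error(p, n_sims):
    """Relative standard error of an empirical rejection rate.

    Assumes n_sims independent null simulations, each rejecting with
    probability p (a binomial model; an assumption of this sketch).
    """
    return math.sqrt((1.0 - p) / (n_sims * p))


def sims_needed(p, rel_err):
    """Approximate number of simulations needed so that the relative
    standard error of the estimated rate is at most rel_err."""
    return math.ceil((1.0 - p) / (p * rel_err ** 2))


ALPHA = 5e-8  # genome-wide significance cutoff mentioned in the text

# Relative error of the estimated type I error rate at several budgets.
for n in (10**6, 10**8, 10**9):
    print(f"n_sims={n:>13,d}  relative SE={mc_relative_error(ALPHA, n):.2f}")

# Simulations needed for a (hypothetical) 20% relative standard error.
print(f"needed for 20% relative SE: {sims_needed(ALPHA, 0.2):,d}")
```

Under these assumptions, a million simulations would be expected to produce only 0.05 rejections at the 5E-8 cutoff, so the estimate is dominated by noise, while a 20% relative standard error needs on the order of 500 million simulations. That is consistent with the "hundreds of millions" figure above.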

Research by this group was featured on the MSI website in February 2016: Finding Genetic Markers of Transplant Rejection.

Project Investigators

Bin Guo
Sharon Ling
Jiawei Liu
Maria Masotti
Associate Professor Baolin Wu