The Data Science and Health Initiative (DASHI) is a new partnership between the Data Science Institute and Columbia University Irving Medical Center to build collaborative research projects that leverage foundational data science for new clinical advances.

On the biomedical side, there is emerging access to large-scale, complex datasets due to recently deployed technologies, e.g. in imaging, genomics, and electronic health records. On the engineering side, method developers seek data to test their developments in real-world settings. DASHI aims to bridge this gap and create synergy between the University’s institutional strengths.

The following teams have received seed funding for their pilot projects.

Advancing Health through Data Integration: Creating a Neighborhood Environmental Vulnerability Index for Childhood Asthma Research and Clinical Care

Jeanette Stingone, Public Health

Stephanie Lovinsky-Desir, Pediatrics

Epidemiologic research demonstrates that structural and social neighborhood environmental factors contribute to childhood asthma, although the combined effect of these factors is rarely investigated. This team will combine data integration tools with clinical knowledge of childhood asthma to construct a neighborhood environmental vulnerability index that can help identify children at greatest risk of developing asthma and point to potential neighborhood interventions for reducing asthma morbidity.

A New Statistical Development for Analyzing Single-Cell RNA-seq Data in GBM

Jianhua Hu, Biostatistics

Jeffrey Bruce, Neurosurgery

Peter Sims, Systems Biology

The emergence of single-cell RNA sequencing (scRNA-seq) enables the study of cell-level heterogeneity, in terms of genomic variability, in complex diseases, including cancers. This team will develop a set of systematic and effective analysis tools for scRNA-seq data, based on formal statistical modeling and machine learning methods, with applications to glioblastoma (GBM).

Privacy-preserving Detection of Delayed Cerebral Ischemia using Federated Learning with Differential Privacy

Gamze Gürsoy, Biomedical Informatics

Soojin Park, Biomedical Informatics

This team will develop a federated learning approach that incorporates differential privacy to predict the onset of delayed cerebral ischemia (DCI), which is a leading cause of death and disability in stroke patients. This approach will allow the researchers to increase the generalizability and statistical power of their machine learning model by enabling the use of data from multiple institutions in a privacy-preserving manner. The resulting predictive model may be used to alert clinicians to DCI, which could reduce neurological injury.

The DASHI steering committee includes Peter Canoll, Elham Azizi, Itsik Pe’er, Elias Bareinboim, Lawrence H. Schwartz, and Sarah Collins Rossetti.