About the Focus Area

Data science is a driver of health care transformation.

From the development of new technologies and devices to targeted interventions to improve health outcomes, data-based solutions are reshaping the health care landscape and improving the efficiency and effectiveness of medicine. Researchers and practitioners apply data science principles and techniques to better understand health processes and transform health care delivery for better diagnoses, better care, and better cures.

Our Health Analytics Center, which is located at the Columbia University Irving Medical Center, facilitates collaborations between researchers from medicine, biology, public health, informatics, computer science, applied mathematics, and statistics. These thought leaders combine techniques from the growing field of data science with subject-matter expertise from their respective disciplines to improve the health of individuals and health care systems.

For example, precision medicine aims to find the right drug for the right patient at the right moment. Such precision is especially crucial in cancer treatment: the exact causes vary from tumor to tumor, and no two tumors have the same set of alterations. DSI-affiliated researchers from statistics, biomedical informatics, and cell biology are mapping a comprehensive set of causes to model, predict, and target therapeutic sensitivity and resistance of cancer for better treatment. Other experts in biomedical informatics, biomedical engineering, radiology, urology, and pathology have collaborated to combine data science with magnetic resonance imaging physics to improve prostate cancer diagnosis and staging. Yet another team sought to understand tumor microbiology and determined that bacteria in pancreatic tumors actually degrade a popular chemotherapy drug.

Data is also considered an organizational asset and plays an increasingly vital role in clinical administrative decision-making. Virtually every hospital and clinic collects detailed medical records about its patients, but hospitals are wary of sharing data with other institutions due to privacy concerns. Researchers in engineering, computer science, and biomedical informatics are building an infrastructure for sharing machine learning models of large-scale clinical datasets to rapidly advance innovation in clinical data research while safeguarding patients’ privacy.

DSI graduate students also collaborate with faculty to complete capstone projects and apply data science techniques to real challenges in the health professions. A recent student team used deep learning methods to build a model that predicts whether high-resolution images of lung tissue show evidence of pulmonary fibrosis. Another group partnered with New York City’s Department of Health and Mental Hygiene to scrape public social media posts from Twitter and Reddit and build a web app in R Shiny to flag posts associated with depression. By determining which genes influence biofilm structures, our M.S. students have also suggested ways to make conventional antibiotics more effective and helped develop tools hospitals may use to diagnose bacterial conditions and treat infections.

Related Centers