Our work builds on that of teams of Columbia researchers in medicine, biology, public health, informatics, computer science, applied mathematics, and statistics. The Health Analytics Center is located at the Columbia University Medical Center.
J. Michael Schmidt (Neurology)
Clinicians in the Neuro-ICU may be confronted daily by over 200 time-related variables for each patient; yet we know from cognitive science that, without help, people can understand the relatedness of only two variables. We are investigating how to help clinicians make sense of real-time streams of physiological data as well as of their relationships and trends. The objective of this project is to demonstrate that interactive data visualizations designed to transform and consolidate complex multimodal physiological data into integrated interactive displays will reduce clinician cognitive load and will result in reductions in medical error and improvements in patient care, safety, and efficiency. This project is a collaboration with the Draper Laboratory and is funded by the DoD Telemedicine and Advanced Technology Research Center (TATRC) and the Dana Foundation.
Noémie Elhadad (Biomedical Informatics), Chris Wiggins (Applied Mathematics and Applied Physics)
Physicians treating patients in the clinic, on the floor, or in the emergency room are faced with an overwhelming amount of complex information about their patients, with little time to review it. HARVEST is an interactive patient record summarization system that aims to support physicians in their information workflow. It extracts content from the patient notes, where key clinical information resides, then aggregates that information and presents it through time. HARVEST is currently deployed at NewYork-Presbyterian Hospital. It relies on a distributed platform for processing data as they get pushed into the electronic health record. We are now investigating summarization models of patient records that identify their co-morbidities and their status through time, by modeling all observations in the record, from the notes to laboratory test measurements and other structured information like billing codes.
Sean Luo (Physicians and Surgeons, Psychiatry), Min Qian (Public Health, Biostatistics), Kara Rudolph (Public Health, Epidemiology)
Pharmacologic treatment of opioid use disorder is complicated by the likely absence of a one-size-fits-all best approach; rather, “optimal” dose and dose adjustment are hypothesized to depend on person-level factors, including factors that change over time, reflecting how well the individual is responding to treatment. This team will use harmonized data from multiple existing clinical trials with natural variability in medication dose adjustments over time to 1) learn optimal dosing strategies, and 2) estimate the extent to which such optimal dosing strategies could reduce risk of treatment dropout and relapse.
Billy Caceres (Nursing), Ipek Ensari (DSI), Kasey Jackman (Nursing)
This pilot study will use data science techniques to leverage ecological momentary assessment and consumer sleep technology to phenotype sleep health profiles in Black and Latinx sexual and gender minority adults. The investigators will use 30 days of daily electronic diaries and actigraphy to examine the associations of daily exposure to minority stressors (such as experiences of discrimination and anticipated discrimination) with sleep health among Black and Latinx sexual and gender minority adults.
Aviv Landau (DSI), Desmond Patton (DSI and Social Work), Maxim Topaz (Nursing)
Child abuse and neglect is a social problem that has reached epidemic proportions. The broad adoption of electronic health records in clinical settings offers a new avenue for addressing this epidemic. This team will develop an innovative artificial intelligence system to detect and assess risk for child abuse and neglect within hospital settings that would prioritize the prevention and reduction of bias against Black and Latinx communities.
Itsik Pe’er (Engineering, Computer Science), Anne-Catrin Uhlemann (Physicians and Surgeons, Medicine)
This project will develop methods for temporal analysis of gut microbiome compositions to better define the risk of infections in liver transplant recipients. The project team will integrate existing coarse-resolution data with newly collected deep metagenomics and metabolomics data.
Piero Dalerba (Physicians and Surgeons, Pathology and Cell Biology), Jianhua Hu (Public Health, Biostatistics), Mary Beth Terry (Public Health, Epidemiology), Wan Yang (Public Health, Epidemiology)
Using multiple nationally representative large-scale exposure and cancer incidence datasets, this project will build a novel model-inference system to study the dynamics of colorectal cancer, test a range of risk mechanisms over the life course, and identify key risk factors underlying the recent increase in young-onset colorectal cancer incidence in the United States to support more effective early prevention. This project is jointly funded with Cancer Dynamics.
Elham Azizi (Engineering, Biomedical Engineering and Cancer Dynamics), Jellert Gaublomme (Arts and Sciences, Biological Sciences), Brent Stockwell (Arts and Sciences, Biological Sciences)
This project will leverage machine learning techniques to combine two types of single-cell data modalities with the goal of achieving a more comprehensive characterization of heterogeneous cell states in the tumor microenvironment. Specifically, the team will develop probabilistic models to elucidate the role of intercellular interactions in driving susceptibility of treatment-resistant mesenchymal tumor cells to a newly discovered ferroptotic vulnerability, which could offer a therapeutic avenue to prevent survival of these cancer cells that are prone to metastasis. This project is jointly funded with Cancer Dynamics.
Sergey Kalachikov (Engineering, Chemical Engineering), Rene Hen (Physicians and Surgeons, Neuroscience)
This team will incorporate data on antidepressant resistance and drug response profiles, their own behavioral and RNA sequence data, and publicly available large-scale datasets to help identify candidate genes that implicate specific morphological changes in the brain. The long-term aim of this research is to reveal specific gene pathways and regulatory networks associated with treatment-resistant Major Depressive Disorder.
Marianthi-Anna Kioumourtzoglou (Public Health, Environmental Health Sciences), John Paisley (Engineering, Electrical Engineering), Kai Ruggeri (Public Health, Health Policy and Management)
Personalized approaches to behavioral interventions, known as nudges, may improve access to health care in low-income communities. Using health, environment, transportation, and financial data, this project will build smart nudges that adapt to individual needs by using innovative methods in machine learning and data science.
Roxana Geambasu (Engineering, Computer Science), Daniel Hsu (Engineering, Computer Science), Nicholas Tatonetti (Physicians and Surgeons, Biomedical Informatics)
Today, virtually every clinic and hospital, small or large, collects clinical information about its patients and aims to use these data to predict disease trajectories and discover new treatments. Unfortunately, these datasets, which vary vastly in size and in the type of information they contain, are almost always siloed behind institutional walls because of privacy concerns. This limits the scope and rigor of the research that can be done on these datasets. We are building an infrastructure system for sharing privacy-preserving machine learning models of large-scale, dynamic clinical datasets. The system will enable medical researchers in small clinics or pharmaceutical companies to incorporate multitask feature models learned from big clinical datasets, such as NewYork-Presbyterian’s Clinical Data Warehouse, to bootstrap their own machine learning models on top of their (potentially much smaller) clinical datasets. The multitask feature models protect the privacy of individual records in the large datasets through a rigorous method called differential privacy. We anticipate the system will vastly improve the pace of innovation in clinical data research while alleviating privacy concerns.
David Blei (Arts and Sciences, Statistics; and Engineering, Computer Science), Anna Lasorella (Physicians and Surgeons, Pediatrics), Raul Rabadan (Physicians and Surgeons, Systems Biology), Wesley Tansey (Physicians and Surgeons, Systems Biology)
Precision medicine aims to find the right drug, for the right patient, at the right moment and at the right dose. This aim is particularly relevant in cancer, where standard therapies elicit very different responses across patients. This project’s goal is to model, predict, and target therapeutic sensitivity and resistance in cancer. The project will work to integrate Bayesian modeling with recently developed variational inference and deep learning methods, and apply them to large-scale genomic and drug sensitivity data across many cancer types. The project will leverage the strong expertise of two leading teams in computational genomics and machine learning together with experimental labs across the Medical and Morningside campuses.