

Jeannette M. Wing was a keynote speaker at the recent Future Forum Technology Summit held in Shenzhen, China, where she detailed how Columbia researchers are using data science to combat a host of societal problems.
The Future Forum, sponsored by China’s scientific, educational, and investment leaders, is a scientific public welfare platform whose mission is to “change the future with science.”
During her talk, Wing highlighted the most advanced data science research at Columbia and discussed how it is enhancing all fields at the university, as well as the many sectors of society that benefit from data-driven breakthroughs in areas such as medicine and health, business and technology, and education and climate science.
“Our mission at DSI is to advance the state of the art in data science while using it to transform all fields, professions and sectors as we ensure the responsible use of data to benefit society,” said Wing, Avanessians Director of the Data Science Institute and Professor of Computer Science at Columbia.
As an example of research that uses data for good, Wing cited an NSF-supported effort to develop a software platform that will help climate scientists confront the challenges of big data and deepen their understanding of global warming. The platform, Pangeo: An Open Source Big Data Climate Science Platform, is designed to solve one of climate science’s most pressing challenges: accessing and using climate datasets, which have grown explosively in size and have become an indispensable tool for scientific inquiry in climate-change research. Pangeo will integrate a suite of open-source software packages in the Python programming language to produce a toolkit for analysis of climate datasets, which will greatly enhance the work of current and future geoscientists.
In terms of using data science to enhance healthcare, Wing discussed the Observational Health Data Sciences and Informatics project, known as “Odyssey.” The data-driven project aims to collect one billion medical records, an effort spanning 25 countries, 80 databases and 600 million patient records. Currently, medical error is an enormous problem in health-related fields, and the Odyssey initiative is applying advanced data science techniques to a half billion patient records. Evaluating those records is helping doctors and healthcare workers answer questions about hypertension, seizures, and many other diseases, reducing the risk of medical error.
The vision of creating accessible, reliable clinical evidence by accessing the clinical experience of hundreds of millions of patients across the globe is a reality. The Observational Health Data Sciences and Informatics (OHDSI) program has built on learnings from the Observational Medical Outcomes Partnership to turn methods research and insights into a suite of applications and exploration tools that move the field closer to the ultimate goal of generating evidence about all aspects of healthcare to serve the needs of patients, clinicians and all other decision-makers around the world.
And regarding cancer research, Wing noted the work of a team supported by DSI whose pioneering studies of tumor microbiology could improve chemotherapy treatments for cancer patients. In one study, the team found that bacteria in pancreatic tumors degrade gemcitabine, the chemotherapy drug most commonly used to treat patients with pancreatic cancer. And if antibiotics are found to kill the bacteria in patients with pancreatic cancer, chemotherapy may become more effective, giving patients more hope and more life.
Doctors frequently have questions about which drug is best to use, what side effects might appear, or whether giving two drugs together will cause a problem. Yet the vast majority of these questions go unanswered. Nowadays, medical records and insurance data make it possible to answer these questions, but with the challenge that it is also possible to get the wrong answer. For example, if healthier patients tend to take one drug rather than another, then the first drug may appear to work better. The Observational Health Data Sciences and Informatics (OHDSI) initiative applies advanced data science techniques to avoid those errors. With half a billion patient records, OHDSI answers questions about hypertension, seizures, and many other diseases. This talk will illustrate OHDSI’s approach.
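To make the "wrong answer" pitfall concrete, here is a minimal Python sketch built on invented counts. It does not use OHDSI data and does not represent OHDSI's actual methods or tools. It assumes two hypothetical drugs, A and B, that work equally well within each health group, with healthier patients mostly receiving drug A. Comparing overall recovery rates makes drug A look better, while comparing within each health group (a simple form of stratification) shows no difference.

```python
# Hypothetical counts invented for illustration: (patients, recovered).
# Assumption: both drugs have the same recovery rate within each health group.
counts = {
    ("A", "healthier"): (800, 720),
    ("A", "sicker"): (200, 100),
    ("B", "healthier"): (200, 180),
    ("B", "sicker"): (800, 400),
}


def recovery_rate(drug, group=None):
    """Recovery rate for a drug, optionally restricted to one health group."""
    rows = [
        value
        for (d, g), value in counts.items()
        if d == drug and (group is None or g == group)
    ]
    patients = sum(n for n, _ in rows)
    recovered = sum(r for _, r in rows)
    return recovered / patients


# Naive comparison that ignores how healthy the patients were to begin with.
crude_gap = recovery_rate("A") - recovery_rate("B")

# Comparison within each health group, so like is compared with like.
stratified_gaps = {
    group: recovery_rate("A", group) - recovery_rate("B", group)
    for group in ("healthier", "sicker")
}

print(f"Crude gap (A minus B): {crude_gap:.2f}")
for group, gap in stratified_gaps.items():
    print(f"Gap among {group} patients: {gap:.2f}")
```

In this toy data the apparent advantage of drug A comes entirely from who received it, not from the drug itself. Real observational studies face many such factors at once, which is why more careful techniques are needed.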
Observational Health Data Sciences and Informatics (OHDSI, pronounced “Odyssey”) [1] is an international collaborative whose goal is to create and apply open-source data analytic solutions to a large network of health databases to improve human health and wellbeing. The OHDSI team comprises academics, industry scientists, health care providers, and regulators whose formal mission is to transform medical decision making by creating reliable scientific evidence about disease natural history, healthcare delivery, and the effects of medical interventions through large-scale analysis of observational health databases for population-level estimation and patient-level predictions[2]. Over 90 participants from around the world have joined the collaborative with a vision to access a network of one billion patients to generate evidence about all aspects of healthcare, where patients, clinicians and all other decision-makers around the world use OHDSI tools and evidence every day.
The Future Forum is the only scientific public welfare platform for cross-border business in China. It serves not only as the “communication person” of science to the public and the “docking person” of the scientific and business circles, but also as the “promoting person” that stimulates scientific breakthroughs with private capital.
— Robert Florida