The DSI Seed Funds Program supports new collaborations that will lead to longer-term, deeper relationships among faculty in different disciplines across campus. Aimed at advancing research that combines data science expertise with domain expertise, the program’s funded research should embody the spirit of the Institute’s mission statement.
We are pleased to announce a call for proposals from Columbia University faculty and research staff for DSI Seed Funds. We are actively seeking proposals that represent new collaborations, which ideally lead to future proposal submissions to government, industry, or foundations. We are particularly interested in proposals that address one or more of DSI’s focus areas—Business and Finance, Climate, Foundations of Data Science, Health, and Social Justice—and encourage proposals that address the technical challenges in the fair and ethical use of data.
Statement on Racial Equity
The Data Science Institute is committed to racial equity and justice. Proposals should explicitly state that the project will uphold these values, e.g., by stating that the methods used to collect and analyze project data and the project outcomes reported are fair, just, and ethical.
Funding
This year, DSI will provide two levels of funding. We will accept proposals with budgets of up to $25,000 and proposals with budgets of up to $100,000, annually, for a maximum of two years. The $25,000 grants are intended for projects that do not need significant salary support for Ph.D. students, post-docs, or research scientists. As a condition of funding, awardees will be required to submit quarterly financial reviews and biannual progress reports. Eligibility for continued funding for a second year will also require a progress report. All reports must include progress on external funding proposal submission(s) and other related activities (presentations, publications, etc.). Budgets are encouraged to request support for DSI research scientists (as in last year’s call) and for the University’s Bridge to the Ph.D. Program in STEM (new for this year’s call).
The Data Science Institute’s research scientists and scholars represent a wide range of expertise, from the foundations of data science to a domain where data science is heavily used. Collaborating with a DSI research scientist or scholar may accelerate your research project. The Bridge to the PhD Program in STEM is a structured post-baccalaureate opportunity with an aim to diversify the STEM professoriate and workforce. By including one of their scholars as part of your DSI Seed Fund research proposal, you are contributing towards increasing pathways for underrepresented students to advance in STEM disciplines. The Office of the Vice Provost for Faculty Advancement covers 70% of the scholar’s salary and fringe, with 30% (~$17K) expected from the sponsoring PI. Your DSI Seed Fund budget is eligible to cover the PI’s expected cost for sponsoring a scholar.
The Herbert & Florence Irving Institute for Cancer Dynamics
The Data Science Institute and the Herbert and Florence Irving Institute for Cancer Dynamics are collaborating to support up to two seed grants, of up to $100,000 annually for a maximum of two years, that align with their shared mission of improving the understanding of cancer biology, origins, treatment, and prevention through data-driven methods and processes. The Institute is particularly interested in candidates that may further advance core research in Statistical and Probabilistic Modeling. If you wish to have your Seed Fund application reviewed for this opportunity as well, please check the corresponding box on the application cover page.
Proposal Process
The deadline for proposal submission is Monday, November 9, 2020, 3:00 p.m. We will not accept incomplete or late submissions. We anticipate notifying award recipients by December 21, 2020. Please submit materials via email, in .doc or .pdf format, to dsi-seed@columbia.edu by the deadline.
Szabolcs Marka, Physics; Zsuzsanna Marka, Physics; Zelda Moran, Public Health; John Wright, Electrical Engineering
This team is pioneering a machine-learning-based imaging and sorting solution that aims to drastically reduce Africa’s tsetse population. The solution allows male and female tsetse flies to be sorted in support of the Sterile Insect Technique, which the IAEA has used to eradicate tsetse populations in Zanzibar and other countries.
Pierre Gentine, Earth and Environmental Engineering; Marco Giometto, Civil Engineering and Engineering Mechanics; Mostaf Momen, Civil Engineering and Engineering Mechanics; Carl Vondrick, Computer Science
This team is developing machine-learning models and improved satellite-imaging techniques that will help environmental officials locate and characterize hazardous pollutants in the lower atmosphere, allowing them to design strategies to mitigate pollution.
Marianthi-Anna Kioumourtzoglou, Environmental Health Sciences; John Paisley, Electrical Engineering; Kai Ruggeri, Health Policy and Management
This research team intends to reduce missed appointments at community clinics by using big data and Bayesian machine learning techniques to understand why patients miss appointments and what can be done to help them keep their appointments.
Aviv Landau, Data Science Institute; Desmond Patton, Social Work; Maxim Topaz, Nursing
This team is developing an innovative artificial intelligence system to detect and assess risk for child abuse and neglect within hospital settings that would prioritize the prevention and reduction of bias against Black and Latinx communities.
Matthias Preindl, Electrical Engineering; Alan West, Chemical Engineering
This engineering team is developing a machine-learning model that can estimate a Li-Ion battery’s charge level with greater accuracy, aiming for an error rate of just one percent.
David Blei, Statistics; Anna Lasorella, Pediatrics; Raul Rabadan, Systems Biology; Wesley Tansey, Systems Biology
This team aims to model, predict, and target therapeutic sensitivity and resistance of cancer. They will integrate Bayesian modeling with recently developed variational inference and deep learning methods and apply them to large scale genomic and drug sensitivity data across many cancer types.
Xi Chen, Computer Science; Sharon Di, Civil Engineering and Engineering Mechanics; Qiang Du, Applied Physics and Applied Mathematics; Eric Talley, Law
This team is developing a fundamental framework that uses a game-theoretic approach to model the strategic interactions of conventional human-driven vehicles and autonomous and/or connected vehicles. Beyond its technical advances, this project will also address the Trolley Problem (i.e., ethical sense development) in AV algorithm design.
Michael Collins, Computer Science; David Kipping, Astronomy
This team will build predictive models capable of intelligently optimizing telescope resources, and uncover the rules and regularities in planetary systems, specifically through the application of grammar induction methods used in computational linguistics.
Roxana Geambasu, Computer Science; Daniel Hsu, Computer Science; Nicholas Tatonetti, Biomedical Informatics
This team is building an infrastructure system for sharing privacy-preserving machine learning models of large-scale, dynamic, clinical datasets. The system will enable medical researchers in small clinics or pharmaceutical companies to incorporate multitask feature models learned from big clinical datasets to bootstrap their own machine learning models on top of their (potentially much smaller) clinical datasets. The multitask feature models protect the privacy of individual records in the large datasets through a rigorous method called differential privacy.
Trenton Jerde, Zuckerman Institute; Nikolaus Kriegeskorte, Zuckerman Institute; Nima Mesgarani, Electrical Engineering; Chris Wiggins, Applied Physics and Applied Mathematics
This team will build a complementary mechanism for web-based sharing of reasoned judgments to perform probabilistic inference on contentious claims with machine learning algorithms and bring rationality to the social web.
Ruth DeFries, Ecology, Evolution and Environmental Biology; Arlene Fiore, Earth and Environmental Sciences; Jeff Goldsmith, Biostatistics; Marianthi-Anna Kioumourtzoglou, Environmental Health Sciences; Daniel Westervelt, Lamont-Doherty Earth Observatory; John Wright, Electrical Engineering
This team will develop methods to extract patterns from multiple datasets and identify the dominant sources of air pollution across India and how they vary in space and time. Their work is a step towards the overarching goal of informing effective clean air solutions and reducing public health burdens associated with exposure to air pollution in India.
Kriste Krstovski, Data Science Institute; Yao Lu, Sociology
This team combines new sources of labor market data with data science methods to identify factors and environments that shape gender and racial inequality in the high-skilled labor market. The team will chart the long-term career trajectories of a large number of high-skilled American workers and examine gender and racial variations. It will also construct measures of company environment, especially as it pertains to gender and racial equity, and assess its consequences for the career paths of different groups of skilled workers.
Itsik Pe’er, Computer Science; Anne-Catrin Uhlemann, Medicine
This team is developing methods for temporal analysis of gut microbiome compositions to better define the risk of infections in liver transplant recipients. They will integrate existing coarse resolution data with newly collected deep metagenomics and metabolomics data.
Piero Dalerba, Pathology and Cell Biology; Jianhua Hu, Biostatistics; Mary Beth Terry, Epidemiology; Wan Yang, Epidemiology
This team will build a novel model-inference system to study the dynamics of colorectal cancer, test a range of risk mechanisms over the life course, and identify key risk factors underlying the recent increase in young onset colorectal cancer incidence in the United States to support more effective early prevention.
Elham Azizi, Biomedical Engineering; Jellert Gaublomme, Biological Sciences; Brent Stockwell, Biological Sciences
This team will develop probabilistic models to elucidate the role of intercellular interactions in driving susceptibility of treatment-resistant mesenchymal tumor cells to a newly discovered ferroptotic vulnerability, which could offer a therapeutic avenue to prevent survival of these cancer cells that are prone to metastasis.