Funding Opportunities
The DSI Seed Funds Program supports research collaborations between data scientists and domain experts.
Proposal Deadline: Tuesday, November 15, 2022
Program Goals
The DSI Seed Funds Program supports new collaborations with the goal of developing longer-term and deeper interdisciplinary relationships among faculty at Columbia. The program aims to advance research that combines data science expertise with domain expertise.
Seed fund proposals should embody the spirit of DSI’s mission and address the technical strengths needed to enable fairer and more ethical use of data. Proposals that align with one or more of DSI’s focus areas are preferred.
Proposals should represent new collaborations, which have the potential to lead to future funding opportunities with government, industry, or foundations. DSI Seed Funds should be viewed as planning grants for upcoming solicitations from DARPA, NIH, NSF, and others.
Proposal Process
Proposal Deadline: Tuesday, November 15, 2022
Please submit your proposal using the submission form. Incomplete or late submissions cannot be accepted. The form requires the following materials:
- Contact Information for all PIs and Collaborators
- Project Proposal (five-page maximum, single spaced, 12 pt. font, Times New Roman)
- Explicit statement on how the project will uphold DSI’s commitment to racial equity and justice
- Budget with narrative
- CVs for faculty / collaborators (two-page, NSF-style format)
We anticipate notifying award recipients by early January 2023.
Funding and Terms
DSI provides two levels of funding. Proposals should indicate which level of funding is being requested.
Funding Level | Details |
---|---|
$25,000 (Maximum of 2 Years) | Intended for projects where significant salary support of Ph.D. students, postdoctoral researchers, or research scientists is not needed. |
$75,000 (Maximum of 2 Years) | Proposed budget may include salary expenses for research support staff. |
Seed Grant Terms (Both Levels)
- Awardees will be required to submit quarterly financial reviews and biannual progress reports.
- Eligibility for continued funding in Year 2 is determined after review of the first-year progress report.
- Progress reports must include details on external funding proposal submission(s) and other related activities (presentations, publications, etc.).
Additional Avenues for Support:
Proposed budgets are encouraged to request support from 1) DSI Research Scientists and Scholars; and 2) Columbia’s Bridge to the Ph.D. Program in STEM:
- DSI Research Scientists and Scholars represent a wide range of expertise, from the foundations of data science to domains where data science is heavily used. Collaborating with a DSI research scientist or scholar may accelerate your research project.
- The Bridge to the Ph.D. Program in STEM is a structured, post-baccalaureate opportunity that aims to diversify the STEM professoriate and workforce. By including one of its scholars in your DSI Seed Funds research proposal, you help increase pathways for underrepresented students to advance in STEM disciplines. The Office of the Vice Provost for Faculty Advancement covers 70% of the scholar’s salary and fringe, with 30% (~$17K) expected from the sponsoring principal investigator (PI). Your DSI Seed Funds budget is eligible to cover the PI’s expected cost for sponsoring a scholar.
Statement on Racial Equity
DSI is committed to racial equity and justice. Proposals should explicitly state that the project will uphold these values, e.g., stating that the methods used to collect and analyze project data, and the project outcomes reported are fair, just, and ethical.
Criteria for Proposal
Proposals will be assessed based on the criteria below. Please consider addressing these questions in your proposal.
- Why is the proposed project novel? Additionally, describe the novelty of the collaboration in terms of people, disciplines, and/or schools. Contrast to prior work is recommended.
- Why is seed funding essential to the success of this project?
- How is the project necessarily inter-/multi-disciplinary?
- What is the intended follow-up for this project to obtain future funds, especially plans to submit to large-scale funding opportunities?
- Describe the project’s adherence to DSI’s commitment to racial equity and justice.
All projects must be relevant to advancing and/or applying data science as a field.
Questions can be directed to dsi-seed@columbia.edu or to Radhika Patel, Chief Operating Officer at the Data Science Institute.
Recent Seed Fund Projects
-
Szabolcs Marka, Physics; Zsuzsanna Marka, Physics; Zelda Moran, Public Health; John Wright, Electrical Engineering
This team is pioneering a machine-learning-based imaging and sorting solution that aims to drastically reduce Africa’s tsetse population. The solution allows male and female tsetse flies to be sorted in support of the Sterile Insect Technique, which the IAEA has used to eradicate tsetse populations in Zanzibar and other countries.
-
Pierre Gentine, Earth and Environmental Engineering; Marco Giometto, Civil Engineering and Engineering Mechanics; Mostaf Momen, Civil Engineering and Engineering Mechanics; Carl Vondrick, Computer Science
This team is developing machine-learning models and improved satellite-imaging techniques that will help environmental officials locate and characterize hazardous pollutants in the lower atmosphere, allowing them to design strategies to mitigate pollution.
-
Marianthi-Anna Kioumourtzoglou, Environmental Health Sciences; John Paisley, Electrical Engineering; Kai Ruggeri, Health Policy and Management
This research team intends to reduce missed appointments at community clinics by using big data and Bayesian machine learning techniques to understand why patients miss appointments and what can be done to help them keep them.
-
Aviv Landau, Data Science Institute; Desmond Patton, Social Work; Maxim Topaz, Nursing
This team is developing an innovative artificial intelligence system to detect and assess risk for child abuse and neglect within hospital settings that would prioritize the prevention and reduction of bias against Black and Latinx communities.
-
Matthias Preindl, Electrical Engineering; Alan West, Chemical Engineering
This engineering team is developing a machine-learning model that can estimate a Li-Ion battery’s charge level with greater accuracy, aiming for an error rate of just one percent.
-
David Blei, Statistics; Anna Lasorella, Pediatrics; Raul Rabadan, Systems Biology; Wesley Tansey, Systems Biology
This team aims to model, predict, and target therapeutic sensitivity and resistance of cancer. They will integrate Bayesian modeling with recently developed variational inference and deep learning methods and apply them to large scale genomic and drug sensitivity data across many cancer types.
-
Xi Chen, Computer Science; Sharon Di, Civil Engineering and Engineering Mechanics; Qiang Du, Applied Physics and Applied Mathematics; Eric Talley, Law
This team is developing a fundamental framework that uses a game-theoretic approach to model the strategic interactions of conventional human-driven vehicles and autonomous and/or connected vehicles. Beyond technical advances, this project will also address the Trolley Problem (i.e., ethical sense development) in AV algorithm design.
-
Michael Collins, Computer Science; David Kipping, Astronomy
This team will build predictive models capable of intelligently optimizing telescope resources, and uncover the rules and regularities in planetary systems, specifically through the application of grammar induction methods used in computational linguistics.
-
Roxana Geambasu, Computer Science; Daniel Hsu, Computer Science; Nicholas Tatonetti, Biomedical Informatics
This team is building an infrastructure system for sharing privacy-preserving machine learning models of large-scale, dynamic, clinical datasets. The system will enable medical researchers in small clinics or pharmaceutical companies to incorporate multitask feature models learned from big clinical datasets to bootstrap their own machine learning models on top of their (potentially much smaller) clinical datasets. The multitask feature models protect the privacy of individual records in the large datasets through a rigorous method called differential privacy.
-
Trenton Jerde, Zuckerman Institute; Nikolaus Kriegeskorte, Zuckerman Institute; Nima Mesgarani, Electrical Engineering; Chris Wiggins, Applied Physics and Applied Mathematics
This team will build a complementary mechanism for web-based sharing of reasoned judgments to perform probabilistic inference on contentious claims with machine learning algorithms and bring rationality to the social web.
-
Ruth DeFries, Ecology, Evolution and Environmental Biology; Arlene Fiore, Earth and Environmental Sciences; Jeff Goldsmith, Biostatistics; Marianthi-Anna Kioumourtzoglou, Environmental Health Sciences; Daniel Westervelt, Lamont-Doherty Earth Observatory; John Wright, Electrical Engineering
This team will develop methods to extract patterns from multiple datasets and identify the dominant sources of air pollution across India and how they vary in space and time. Their work is a step towards the overarching goal of informing effective clean air solutions and reducing public health burdens associated with exposure to air pollution in India.
-
Kriste Krstovski, Data Science Institute; Yao Lu, Sociology
This team combines new sources of labor market data with data science methods to identify factors and environments that shape gender and racial inequality in the high-skilled labor market. The team will chart long-term career trajectories of a large number of high-skilled American workers and examine gender and racial variations. It will also construct measures of company environment, especially as it pertains to gender and racial equity, and assess its consequences for the career paths of different groups of skilled workers.
-
Itsik Pe’er, Computer Science; Anne-Catrin Uhlemann, Medicine
This team is developing methods for temporal analysis of gut microbiome compositions to better define the risk of infections in liver transplant recipients. They will integrate existing coarse resolution data with newly collected deep metagenomics and metabolomics data.
-
Piero Dalerba, Pathology and Cell Biology; Jianhua Hu, Biostatistics; Mary Beth Terry, Epidemiology; Wan Yang, Epidemiology
This team will build a novel model-inference system to study the dynamics of colorectal cancer, test a range of risk mechanisms over the life course, and identify key risk factors underlying the recent increase in young onset colorectal cancer incidence in the United States to support more effective early prevention.
-
Elham Azizi, Biomedical Engineering; Jellert Gaublomme, Biological Sciences; Brent Stockwell, Biological Sciences
This team will develop probabilistic models to elucidate the role of intercellular interactions in driving susceptibility of treatment-resistant mesenchymal tumor cells to a newly discovered ferroptotic vulnerability, which could offer a therapeutic avenue to prevent survival of these cancer cells that are prone to metastasis.
Other Recently Funded Programs
-
The Columbia-IBM Center for Blockchain and Data Transparency supports research that advances innovation in blockchain, data transparency, data sharing, fair use of data, and related technologies for the good of society. The Center has funded several research projects to develop thought leadership and influence policy.
For an overview of current and past research projects, visit the Columbia-IBM Center for Blockchain and Data Transparency here.
-
The Data Science and Health Initiative (DASHI) is a partnership between the Data Science Institute and Columbia University Irving Medical Center to build collaborative research projects that leverage foundational data science for new clinical advances.
Three projects were awarded in 2022. Learn more about these projects here.