Diversity
The Data Science Institute strives to be a force for change. We acknowledge that DSI should do more within our own unique community to address racial equity gaps and to increase diverse representation in data science research and education. We will engage in proactive measures to dismantle systemic racism, racial inequity in data science, racial inequity across the Columbia community, and racial inequity across academia more broadly.
The initial DSI Racial Equity Action Plan includes the following:
- Create a faculty-led DSI Race and Equity Advisory Council by January 2021.
- Add a DSI Racial Equity Statement to the DSI website.
- Require proposals to the DSI Seed Grant program to state explicitly how projects will ensure that data are collected and analyzed in a fair, just, and ethical manner.
- Promote research that addresses issues of racial equity and fairness in data science, e.g., in training data, machine learning algorithms and models, and automated decision making.
- Instill a culture of shared responsibility among DSI staff, faculty, and researchers for upholding the DSI commitment to racial equity and justice.
- Actively seek racially diverse individuals to apply to DSI programs.
- Collaborate with university partners, including the School of Engineering and Applied Science and Arts and Sciences, on actions to support racial diversity, equity, and inclusion for the M.S. in data science program.
- Continue to support the DSI Task Force on Racial Equity through Fall 2020: (a) to establish partnerships with external organizations toward achieving shared goals on racial equity; (b) to address issues of climate and culture; and (c) to develop plans for a DSI Working Group on Race and Equity.
This initial plan is just a small step toward making progress for the long term. We commit to executing this plan with the same rigor as our other research, education, and outreach efforts.
Taking Action: A Message from DSI Director Jeannette M. Wing
WATCH: Centering Race in Data Science @ Harvard Data Science Initiative
Data Science Institute Racial Equity Action Plan
Making a Commitment to Critical Discussions on Race and Data Science
In Support of Our Asian and Asian American Community: A Message from DSI Director Jeannette M. Wing
Research Highlights
-
Colin Wayne Leach, Psychology, Africana Studies
Courtney Cogburn, Social Work
Sining Chen, Industrial Engineering and Operations Research
Kathleen McKeown, Computer Science
Susan McGregor, Data Science Institute
Social media is a powerful means of individual expression and of collectively consolidating people’s sentiment about the most important issues in our society. This transdisciplinary project marries the latest advances in computational and statistical techniques for analyzing language use over time with social behavioral theories of emotion and stress to examine the temporal dynamics of tweets surrounding police killings of Black people and subsequent protests (e.g., Black Lives Matter).
-
Billy Caceres, Nursing
Ipek Ensari, Data Science Institute
Kasey Jackman, Nursing
This pilot study will use data science techniques to leverage ecological momentary assessment and consumer sleep technology to phenotype sleep health profiles in Black and Latinx sexual and gender minority adults. The investigators will use 30 days of daily electronic diaries and actigraphy to examine the associations of daily exposure to minority stressors (such as experiences of discrimination and anticipated discrimination) with sleep health among Black and Latinx sexual and gender minority adults.
-
Maxim Topaz, Nursing
Aviv Landau, Data Science Institute
Desmond Patton, Social Work
Child abuse and neglect is a social problem that has reached epidemic proportions. The broad adoption of electronic health records in clinical settings offers a new avenue for addressing this epidemic. This team is developing an innovative artificial intelligence system that detects and assesses risk for child abuse and neglect within hospital settings while prioritizing the prevention and reduction of bias against Black and Latinx communities.
-
Kriste Krstovski, Data Science Institute, Business
Yao Lu, Sociology
This research team combines new sources of labor market data, including online resumes and employee reviews, with data science methods to identify factors and environments that shape gender and racial inequality in the high-skilled labor market. The team is charting the long-term career trajectories of a large number of high-skilled American workers, examining gender and racial variations, constructing measures of company environment that pertain to gender and racial equity, and assessing consequences for the career paths of different groups of skilled workers.
-
Smaranda Muresan, Data Science Institute
Reducing the achievement gaps in STEM disciplines among subpopulations of students is important for the U.S. to meet its 21st-century science and technology needs. This project focuses on environmental factors that devalue, marginalize, or discriminate against students based on a social identity like race, gender, disability status, or socioeconomic status. To date, the research has synthesized and systematically analyzed data from interventions shown to help reduce the impact of social identity threats on student participation in STEM, and has applied the results of the synthesis and analyses to enhance existing interventions.
-
Colin Wayne Leach, Psychology, Barnard College
Courtney Cogburn, Social Work
This team applied topic modeling and sentiment analysis techniques to Twitter activity before and after events related to police use of force against unarmed Black victims. They focused on more than 8.5 million tweets from August 2014 covering the shooting of Michael Brown, a Black victim, by police officer Darren Wilson in Ferguson, MO. Their study demonstrated that there are lags between events and the emotional responses to them on Twitter.
-
Hardeep Johar, Industrial Engineering and Operations Research
Patrice Derrington, Architecture, Planning and Preservation
This team was tasked with defining gentrification trends using publicly available datasets that offer insight into the spread of gentrification across 55 Public Use Microdata Areas (PUMAs) in New York City. The four most important factors were real estate price, race of inhabitants, employment, and education.
-
Desmond Patton, Social Work
This partnership with the NYC Center for Court Innovation organized a youth technology advisory council, TechAdvise, composed of ten youths from groups underrepresented in tech. The project’s goal is not only to improve the technology we develop to address underserved populations, but also to inspire these youths and those they represent to learn computing and data science.
Data Science Racial Equity Advisory Committee
Augustin Chaintreau
-
The Fu Foundation School of Engineering and Applied Science
Associate Professor of Computer Science
Abby Kamen
-
School of Social Work
M.S. in Social Work, Class of 2021
Colin Wayne Leach
-
Barnard College
Professor of Psychology
Kathleen R. McKeown
-
The Fu Foundation School of Engineering and Applied Science
Henry and Gertrude Rothschild Professor of Computer Science
-
Data Science Institute
Founding Director
Dennis Mitchell
- Executive Vice President for University Life and Senior Vice Provost for Faculty Advancement
-
College of Dental Medicine
Professor of Dental Medicine (Community Health and Periodontics)
Mutale Nkonde
-
Graduate School of Arts and Sciences
M.A. Candidate American Studies, Class of 2023; Center for the Study of Ethnicity and Race
Amber E.H. Tingle
-
Data Science Institute
Executive Director of Strategic Communications and Media Relations
Kevin Womack
-
Data Science Institute
M.S. in Data Science, Class of 2021
Ming Yuan
-
Data Science Institute
Associate Director for Research
-
Faculty of Arts and Sciences
Professor of Statistics