The Data Science Institute strives to be a force for change. We acknowledge that DSI should do more within our own unique community to address racial equity gaps and to increase diverse representation in data science research and education. We will engage in proactive measures to dismantle systemic racism, racial inequity in data science, racial inequity across the Columbia community, and racial inequity across academia more broadly.
The initial DSI Racial Equity Action Plan includes the following:
This initial plan is just a small step toward making progress for the long term. We commit to executing this plan with the same rigor as our other research, education, and outreach efforts.
A Message from DSI Director Jeannette M. Wing
Data Science Institute Racial Equity Action Plan
We are pleased to offer fellowships for M.S. in Data Science students from historically underserved or underprivileged populations through the JPMorgan Columbia University Data Science Institute Master’s Diversity Fellowship Award Program. Support is guaranteed for two semesters of full-time study, including full tuition, fees, and a stipend to cover living expenses in New York City. Fellows will also be invited to complete an internship with JPMorgan. Renewal for the third and final semester of the program is contingent on completing the internship. As part of the admissions application, prospective students must submit a 250- to 500-word reflection on how their identity, life experiences, and perspective would advance diversity in the data science community. Please note: Applicants for these fellowships must be U.S. citizens or permanent residents.
Colin Wayne Leach, Psychology, Africana Studies
Courtney Cogburn, Social Work
Sining Chen, Industrial Engineering and Operations Research
Kathleen McKeown, Computer Science
Susan McGregor, Data Science Institute
Social media is a powerful means of individual expression, and collective consolidation, of people’s sentiment about the most important issues in our society. This transdisciplinary project marries the latest advances in computational and statistical techniques of language use over time with social behavioral theories of emotion and stress to examine the temporal dynamics of tweets surrounding police killings of Black people and subsequent protests (e.g., Black Lives Matter).
Billy Caceres, Nursing
Ipek Ensari, Data Science Institute
Kasey Jackman, Nursing
This pilot study will use data science techniques to leverage ecological momentary assessment and consumer sleep technology to phenotype sleep health profiles in Black and Latinx sexual and gender minority adults. The investigators will use 30 days of daily electronic diaries and actigraphy to examine the associations of daily exposure to minority stressors (such as experiences of discrimination and anticipated discrimination) with sleep health among Black and Latinx sexual and gender minority adults.
Maxim Topaz, Nursing
Aviv Landau, Data Science Institute
Desmond Patton, Social Work
Child abuse and neglect is a social problem that has reached epidemic proportions. The broad adoption of electronic health records in clinical settings offers a new avenue for addressing this epidemic. This team is developing an innovative artificial intelligence system that detects and assesses risk for child abuse and neglect within hospital settings while prioritizing the prevention and reduction of bias against Black and Latinx communities.
Kriste Krstovski, Data Science Institute, Business
Yao Lu, Sociology
This research team combines new sources of labor market data, including online resumes and employee reviews, with data science methods to identify factors and environments that shape gender and racial inequality in the high-skilled labor market. The team is charting the long-term career trajectories of a large number of high-skilled American workers, examining gender and racial variations, constructing measures of company environment that pertain to gender and racial equity, and assessing consequences for the career paths of different groups of skilled workers.
Smaranda Muresan, Data Science Institute
Reducing the achievement gaps in STEM disciplines among subpopulations of students is important for the U.S. to meet its 21st century science and technology needs. This project focuses on environmental factors that devalue, marginalize, or discriminate against students based on a social identity like race, gender, disability status, or socioeconomic status. To date, the project has synthesized and systematically analyzed data from interventions shown to help reduce the impact of social identity threats on student participation in STEM, and has applied the results to enhance existing interventions.
Colin Wayne Leach, Psychology, Barnard College
Courtney Cogburn, Social Work
This team applied topic modeling and sentiment analysis techniques to Twitter activity before and after events related to police use of force against unarmed Black victims. They focused on more than 8.5 million tweets from August 2014, covering the shooting of Michael Brown, a Black victim, by police officer Darren Wilson in Ferguson, MO. Their study demonstrated that the emotional response on Twitter lags behind the events themselves.
Hardeep Johar, Industrial Engineering and Operations Research
Patrice Derrington, Architecture, Planning and Preservation
This team was tasked with defining gentrification trends using publicly available datasets, uncovering insights on the spread of gentrification across 55 Public Use Microdata Areas (PUMAs) in New York City. The four most important factors were real estate price, race of inhabitants, employment, and education.
This partnership with the NYC Center for Court Innovation organized a youth technology advisory council, TechAdvise, composed of ten youths from groups underrepresented in tech. The project’s goal is not only to improve the technology we develop for underserved populations, but also to inspire these youths and those they represent to learn computing and data science.