Leaders across all sectors increasingly embrace the power of data to tackle pressing social justice challenges. From income inequality and incarceration, to immigration and beyond, data science methods are used to gain insights, to improve decision-making, and to support the creation of scalable solutions that will positively impact society.
Automated decision systems are intended to ensure fair and equitable treatment of people, i.e., decision-making that is not subject to human bias, emotion, fatigue, faults, years of experience, etc. (e.g., some judges are more lenient than others). Ironically, these very solutions use historical data (e.g., all past court judgments) and thus continue to reflect human frailties, most notably historical and systemic bias.
The effective use of data science methods to address systemic inequality and to reduce variations in judgment across humans requires more than the adoption of the right technical approaches to the right data sets about people, institutions, communities, and systems. Collaborations between data scientists and domain experts in other disciplines, particularly the social sciences, are essential to design multifaceted, human-centered, and ethical solutions and prevent or minimize biased, inappropriate, or unintended outcomes. Interdisciplinary approaches to social justice shift the very questions we ask and how we interpret the results.
Through DSI, thought leaders from across Columbia combine techniques from the growing field of data science with their own subject-matter expertise to reshape thinking and co-create practical solutions for a more just world.
For example, an interdisciplinary research team including researchers from social work, nursing, and data science is designing an AI system to detect and assess risk for child abuse and neglect, while another team with health policy, environmental health, and electrical engineering expertise is creating personalized behavioral interventions to improve access to health care in low-income communities. Our research scientists are also developing tools to authenticate digital media in the “fake news” era, gaining new insights on gender and racial/ethnic inequality in the labor market, and using natural language processing to help reduce racial and gender achievement gaps in STEM fields.
DSI-affiliated faculty have also developed innovative, interdisciplinary curricula to embed data science into social justice-related courses for undergraduate and graduate students, including history, social work, and public policy courses. Teams of M.S. in data science students have partnered with social work and psychology faculty to understand Twitter activity before and after police use of force against unarmed Black victims, developed a cost-benefit model to measure violence intervention efforts in Baltimore, Chicago, and New York, and defined gentrification trends and uncovered insights on how gentrification spreads throughout New York City.
Social Justice
Disaster events often occur in remote, hard-to-access regions, with conditions made more difficult by the evolving crisis. Groups such as the International Organization for Migration and the Internal Displacement Monitoring Centre rely on field-based estimates from humanitarian groups, media reports, and, in rare instances, survey data to estimate the numbers and characteristics of those displaced when planning humanitarian assistance. However, data are often delayed and fragmentary and require triangulation, which hampers decision-making and the ability to track trends in displacement as well as return or local integration. This research processes data from multiple sources to provide a synoptic view of disaster displacement globally, broken down by type and location.
The course introduces School of International and Public Affairs students to computational thinking, including Python programming, and teaches them to apply that way of thinking to public policy issues. Course participants have applied this new knowledge in their capstone projects, in which they address real-world policy and management challenges for external clients.
This program adds ethics modules to existing undergraduate computer science courses and is compiling these modules into a textbook. The initial set of courses augmented by ethics modules included Machine Learning, Human Computer Interaction, and Networks and Crowds.
This series of one-hour talks is open to the entire Columbia community and features distinguished speakers who are grappling with the challenge of ensuring that data science serves the public good. Topics include financial systems risk, interpretability and discrimination in machine learning, different definitions of fairness and privacy, and equitable access to digital technology.
Desmond Patton, Social Work