The idea that we can and should use data to solve urban policy problems is widely accepted among policy makers, academic researchers and the business community. Most mayors have accepted the value of data-driven policy by signing on to what might be called the “smart city movement.” Not all urban leaders actually know what it means to be a smart city, though. Some envision a futuristic city that looks like a scene out of an old Jetsons cartoon: a place where we would just press buttons on a computer and services would be delivered in an effective and efficient way. Others are more realistic in their understanding of what data and technology can do to improve city services. They would be happy, for example, with street lights that turn on at night and go off in the morning. However, there is one thing all mayors agree upon: no one wants to live in a “stupid city.” A successful city in the 21st century has become, by definition, a smart city.

“Smart city” has, in some sense, become an aspirational term, one in which we have deeply embedded the idea that we can develop and implement public policy without making mistakes. Since that is completely unrealistic, we need a more practical and empirically based definition to work with. For our purpose, a smart city makes data and technology usable and accessible to solve urban problems with the promise of efficiency and effectiveness in delivering public services. By focusing on the utility of data in improving public services (as opposed to serving private interests), we are using data for “good.” Data can be used in every aspect of urban governance: to enhance citizen engagement; design policy; analyze the budget implications of policy alternatives; implement programs; monitor program operations for effectiveness and equity; and improve the efficiency of service delivery.

How might this work in practice? While all cities would like to use data to solve pressing problems, not all cities have the resources or capacity to collect data, to analyze data or to choose the “right” data to use in their analysis. Even when cities have data, the data itself does not really provide the policy solutions. City officials must make decisions about what the data means in the context of policy implementation. The data may provide policymakers with choices, but how they value those choices will generally be part of a political process. Even in a democracy with built-in systems of public accountability, we cannot ensure that data will be used for the public’s benefit, for good. And in an environment where there are competing – and often scarce – resources, the data cannot decide policy priorities.

Cities that are effectively using data to solve policy problems often choose to partner with universities or contract out to businesses for technical assistance in collecting and analyzing data. This can be a complex and often overwhelming process for cities without their own capacity to evaluate technology and analyze data.

Recently, Columbia University’s School of International and Public Affairs and Data Science Institute had the opportunity to partner with New York City’s Department of Environmental Protection on a project called “Stopping Trash Where It Starts.” The City’s goal was to limit floatable trash in the waterways as part of an overall plan to reduce water pollution. Since the City’s data collection efforts indicated that street litter is the primary source of floatable trash, our project focused on helping the City develop a better understanding of the causes of street litter. Our data analysis was intended to inform policy recommendations and new initiatives for reducing litter on the streets.

Professor Patricia Culligan and I, together with our graduate students, used a multi-method approach to design the research and collect data, which also included an analytic policy model based on multiple data sources. We used 311 customer complaint data from NYC’s Open Data portal; Sidewalk and Street Score Cards from the NYC Department of Environmental Protection; administrative data from the City of New York Department of Sanitation (DSNY); demographic and land use data from the NYC Department of City Planning; U.S. weather station data; and data from the NYC Tree Census. We also augmented the existing data by developing a survey instrument and app and collecting original data to identify the type, quantity and source of street litter, as well as neighborhood characteristics that might be affecting street litter hotspots.

The City is now using many of the policy recommendations we made, and other cities have requested our survey protocol and analytic model to replicate our research. This project was a successful partnership between city government and university researchers. The most important factors that allowed us to use data to help develop policy for reducing the trash in NYC’s waterways were a clear understanding of the policy problem; the availability of a variety of relevant databases; and our ability to augment existing data with the collection of original data.

This post was written by Ester R. Fuchs, Professor of International and Public Affairs and Political Science at Columbia University. She is a member of the Data Science Institute’s Executive Committee.