We use the tagline “Data for Good” to capture succinctly the who, what, when, why, and how of data science at Columbia.

The recent convergence of big data, cloud computing, and novel machine learning algorithms and statistical methods is driving explosive interest in data science and its applicability to all fields. This convergence has already enabled the automation of some tasks at a level that surpasses human performance.

The innovations we derive from data science will drive our cars, treat disease, and keep us safe. At the same time, such capabilities risk leading to biased, inappropriate, or unintended action. The design of data science solutions requires both excellence in the fundamentals of the field and expertise to develop applications that meet human challenges without creating even greater risk.

DSI advances the state of the art in data science; transforms all fields, professions, and sectors through the application of data science; and ensures the responsible use of data to benefit society.

By “responsible use,” we mean the fair and ethical use of data; the transparency and accountability of our data science techniques and processes; and the safety and security of the systems we build that rely on these techniques and models. We also need to understand the ethical issues and implications of our technology.

We aim to have a positive impact on society by tackling societal grand challenges, such as climate change, education, energy, environment, healthcare, inequality, and social justice. No single discipline can tackle such challenges alone, and given the kinds and amounts of data amassed in these sectors, data science will be at the heart of addressing them.