By Jeannette M. Wing

I use the tagline “Data for Good” to state paronomastically what data science at Columbia stands for.

First, we should use data science for the good of humanity and society.  Data science should be used to better people’s lives.  Data science should be used to improve relationships among people, organizations, and institutions.  Data science, in collaboration with other disciplines, should be used to help tackle societal grand challenges such as climate change, education, energy, environment, healthcare, inequality, and social justice.

Second, we should use data in a good manner.  Here, I like to use the acronym FATES to suggest what “good” means.  Fairness means that the models we build are used to make unbiased decisions or predictions.  Accountability means determining and assigning responsibility, to someone or to something, for a judgment made by a machine.  Transparency means being open and clear to the end user about how an outcome, e.g., a classification, a decision, or a prediction, is made.  Ethics for data science means paying attention both to the ethical and privacy-preserving collection and use of data and to the ethical decisions that the automated systems we build will make.  Safety and security (yes, two words for one “S”) means ensuring that the systems we build are safe (do no harm) and secure (guard against malicious behavior).