Researchers estimate that only 5% of languages are in active use in the digital sphere. As many as 98% are digitally-disadvantaged: they are not supported on leading devices, operating systems, mobile apps, and browsers, and they lack essential digital tools such as fonts, keyboards, spell checkers, autocorrect, and voice activation. Efforts to expand support to more languages have largely excluded small and poor language communities.
As communication throughout the world becomes increasingly mediated by digital technologies, technology can affect whether a language survives the digital age. Data Science Institute (DSI) postdoctoral research scholar Isabelle Zaugg is a communication scholar and filmmaker who explores how new technologies can both aid and hinder the survival of vulnerable languages. “While cyberspace was built on the promise of serving all the people of the world, without awareness and course correction, we may instead find that language diversity becomes collateral damage of digital tech,” she said.
Zaugg, who received a Ph.D. in communication from American University, examines how East African languages that use the Ethiopic script are supported in the digital sphere, including their digital history, online vitality, and battle against digital extinction. She draws on this expertise to develop a range of policy, governance, and advocacy solutions with the goal of creating a digital public sphere that is both linguistically inclusive and just. Her work also addresses the risks of surveillance and the need for robust content moderation as digitally-disadvantaged languages move up the ladder of full-stack digital support.
At Columbia, Zaugg leverages her interdisciplinary practice through the Collaboratory, which is a partnership between DSI and Columbia Entrepreneurship to integrate data and computer science into other domains across the university. “Columbia recognizes the need for graduates in every discipline to have data science skills and intends to be at the forefront of this movement. This type of interdisciplinary work has ethics and diversity ‘baked in’ as it brings in a variety of backgrounds and the overlapping ethics of different domains,” she explained.
Zaugg co-developed a cross-disciplinary Collaboratory course, Multilingual Language Technologies and Language Diversity, to address the challenge of scaling natural language processing technologies developed mostly for English to the rich diversity of human languages. She worked with humanities professor Lydia Liu and computer scientist Smaranda Muresan to combine data and computational training with education on interrelated ethical, cultural, business, and policy issues.
“The collaboration with Dr. Zaugg in teaching this course has been one of the most rewarding teaching experiences for me and one of the most valuable learning opportunities for our students,” Muresan said. “The [computer science] students are exposed to a very rigorous and insightful discussion around the ethical issues of developing multilingual technologies, while the humanities students are introduced to state-of-the-art language technologies and the computational models behind them.”
In addition to her work with DSI and the Collaboratory, Zaugg was a Mellon-Sawyer postdoctoral fellow (2017-2019) and is a lecturer at Columbia’s Institute for Comparative Literature and Society. She helped lead the Sawyer Seminar on Global Language Justice, which focused on language rights as a facet of human rights, and developed and taught the course Global Language Justice in the Digital Sphere. The course was listed in the Columbia Daily Spectator as one of the “Courses We Loved: Staff Picks for 2019.”
Zaugg considers herself a global citizen and conducts research in New York City and Addis Ababa. She describes a vision for a future internet that is simultaneously truly global and truly local. “New tools such as machine translation mean more language communities can access information in any language, translated into their own,” she said. “Communities will be able to communicate without losing their mother tongue, and will also be able to make contributions to the global internet in their own language.”
— Karina Alexanyan, Ph.D.