Data Science Day 2023
Wednesday, April 19, 2023
8:00 am - 5:00 pm
Data Science Day provides a forum for innovators in academia, industry, and government to connect. The in-person conference will host keynote presentations from leading voices in data-driven innovation, lightning talks from Columbia University researchers, and interactive posters and technology demonstrations.
Clifford Stein, Interim Director of the Data Science Institute, Wai T. Chang Professor of Industrial Engineering and Operations Research, and Professor of Computer Science at Columbia University, will be the master of ceremonies.
Manuela Veloso is Head of J.P. Morgan Chase AI Research and Herbert A. Simon University Professor Emerita at Carnegie Mellon University, where she was previously Faculty in the Computer Science Department and Head of the Machine Learning Department. Her recent interests are in Artificial Intelligence (AI), Symbiotic Human-Robot Autonomy, Continuous Learning Systems, and AI in Finance. She is past President of the Association for the Advancement of Artificial Intelligence (AAAI), and the co-founder and a past President of the RoboCup Federation. In her career she has received numerous awards and honors, including: National Science Foundation CAREER Award, Allen Newell Medal for Excellence in Research, Radcliffe Fellow, Einstein Chair Professor of the Chinese Academy of Sciences, and the ACM/SIGART Autonomous Agents Research Award. Veloso is a Fellow of AAAI, AAAS, ACM, and IEEE. She was elected in 2022 to the National Academy of Engineering for her “contributions to artificial intelligence and its applications in robotics and the financial service industry.”
Moderated By: Jeannette M. Wing, Executive Vice President for Research and Professor of Computer Science, Columbia University
Lee C. Bollinger, President, Columbia University, joins the event to give remarks on the impact of data science and the Data Science Institute. Bollinger will be joined on stage by Clifford Stein (current DSI Interim Director), Jeannette M. Wing (former DSI Avanessians Director), and Kathleen R. McKeown (DSI’s founding Director) for a special ceremony to recognize his presidency and his contributions to the establishment of the Data Science Institute.
Photo Credit: Eileen Barroso
Refund Policy: The deadline to secure a refund is Friday, April 7. Refunds may be requested up to 7 days prior to the event and will be considered on a case-by-case basis. No refunds will be issued after Wednesday, April 12.
Accessibility: Please find campus maps and accessibility information via the Columbia Visitor Service Center.
Postdoctoral Research Scientist, Data Science Institute, Columbia University
Talk Title: Global Monitoring of Emotional Responses to Climate Extremes: Evidence from Eight Billion Social Media Posts
Abstract: Climate change is intensifying regional heat and precipitation extremes, posing complex risks to human well-being on a planetary scale. Can pairing digital data streams with NLP provide a tool to track the hidden human impacts of climate stressors on daily life? In this talk, I’ll share key insights from a global-scale natural experiment that linked the lexical content of ~8 billion geolocated tweets across 190 countries and 13 languages with daily data on local climate extremes and weather conditions. Constructing historical sentiment atlases for nearly every county in the world, I’ll assess whether local exposure to randomly timed climate hazards alters positive and negative online expressions compared to local baselines. Lastly, I’ll describe societal sentiment responses to two events statistically attributed to human-caused climate change: the 2021 U.S. Pacific Northwest heatwave and the Western European extreme rainfall event. These results starkly reveal a fundamental aspect of human responses to emerging climatic extremes: future psychosocial impacts may far exceed those registered in the recent past, barring adaptation beyond what society has already achieved.
Assistant Professor of Ophthalmic Science (in Ophthalmology), Department of Ophthalmology, Columbia University Irving Medical Center
Talk Title: Creating a Robust, Interpretable, and Portable Medical-Expert–AI Team for Eye Disease Detection
Abstract: The focus of our Artificial Intelligence for Vision Science (AI4VS) Lab is to develop AI ‘partners’ to work in tandem with clinicians to expedite eye disease detection. Our lab has three key goals: (1) to robustly handle data collected from different sites and patient populations, (2) to ensure the mechanisms behind AI’s predictions are interpretable by medical experts, and (3) to create AI technology that is portable so it can reach the populations most in need. In this lightning talk, I will give an overview of our ongoing work toward tackling these three challenges, showcasing how symbiotic expert-AI teammates may be able to achieve better disease detection accuracy and interpretability than either one alone.
Elizabeth Standish Gill Associate Professor of Nursing, School of Nursing, Columbia University Medical Center
Talk Title: Transforming Patient Care at Home with AI
Abstract: This presentation will cover the current trends in using AI in healthcare with examples in home healthcare. It will specifically look at how AI is being used to identify high-risk patients who need priority for nursing visits and automatically identify patients who are deteriorating. The presentation will also provide examples of studies using AI for speech recognition technologies to identify at-risk patients. Recommendations on future directions for research will be provided.
Sandra C. Matz
Daniel W. Zalaznick Associate Professor of Business, Columbia Business School
Talk Title: Using Big Data as a Window into People’s Psychology
Abstract: Every step you take online leaves a digital footprint. What can these footprints teach us about their owner’s preferences, needs and motivations – in short, their personality? How can such insights be used (or abused) to influence people’s behavior? And what might a future look like in which individuals benefit more from their data than they currently do?
Moderator: Anthony Vanky
Assistant Professor, Graduate School of Architecture, Planning, and Preservation, Columbia University
Professor of Professional Practice in the Faculty of Business; and Faculty Director, Program for Financial Studies, Columbia Business School
Talk Title: Credit Information in Earnings Calls
Abstract: We develop a novel technique to extract credit-relevant information from the text of quarterly earnings calls. This information is not spanned by fundamental or market variables and forecasts future credit spread changes. One reason for such forecastability is that our text-based measure predicts future credit spread risk and firm fundamentals. Greater firm- and call-level complexity increases the forecasting power of our measure for spread changes. Out-of-sample portfolio tests show the information in our measure is valuable for investors. Our results suggest that investors do not fully internalize the credit-relevant information contained in earnings calls.
Associate Professor of Industrial Engineering and Operations Research, Columbia Engineering
Talk Title: Mutual Funds: First-Mover Investors, Redemptions, and Spillover Risk
Abstract: We study the vulnerability of mutual funds to fire-sale spillover losses. We account for the first-mover incentive that results from the mismatch between the liquidity offered to redeeming investors and the liquidity of assets held by the funds. We show that a higher concentration of first movers increases the aggregate vulnerability of the mutual fund system. When calibrated to U.S. mutual funds, our model shows that, in stressed market scenarios, spillover losses are significantly amplified through a nonlinear response to initial shocks that results from the first-mover incentive. Higher spillover losses provide a stronger incentive to redeem early, further increasing fire-sale losses and the transmission of shocks through overlapping portfolio holdings. (joint work with Paul Glasserman and Marko Weber)
Quetelet Professor of Social Science, Faculty of Arts and Sciences, Columbia University
Talk Title: Financial Institutions, Neighborhoods, and Racial Inequality
Abstract: Does living in a minority neighborhood make conventional banking harder? Based on more than 6 million queries, we compute the difference in the time required to walk, drive, or take public transit to the nearest bank vs. the nearest alternative financial institution (AFI, such as a payday lender) from the middle of every block in each of 19 of the nation’s largest cities. We find that race is strikingly more important than class, as the AFI is more often closer than the bank in well-off minority neighborhoods than in poor white ones. I present some ideas about why.
Moderator: Yao Lu
Professor of Sociology, Faculty of Arts and Sciences, Columbia University
Associate Professor of Computer Science, Columbia Engineering
Talk Title: Programming Language Processing: How AI can Revolutionize Software Development?
Abstract: The past decade has seen unprecedented growth in Software Engineering: developers spend enormous time and effort to create new products. With such growth comes the responsibility of producing and maintaining robust, high-quality software. However, developing such software is non-trivial: 50% of software developers’ valuable time is wasted on finding and fixing bugs, costing the global economy around US$1.1 trillion. Today, I will discuss how AI can help in different stages of the software development life cycle for developing quality products. In particular, I will talk about Programming Language Processing (PLP), an emerging research field that can model different aspects of code (source, binary, execution, etc.) to automate diverse Software Engineering tasks, including code generation, bug finding, and security analysis.
Associate Professor of Computer Science, Columbia Engineering
Talk Title: Seamless Natural Communication
Abstract: My research focuses on how to enable machines to interact with different users in a seamless fashion. To achieve that, I work on multimodal user modeling, dialog system planning, natural language understanding and generation, and human-computer interaction.
Associate Professor of Computer Science, Columbia Engineering
Talk Title: Connecting Vision, Language, and Code for Explainable and Reprogrammable AI
Abstract: Vision-language models (VLMs) such as CLIP have shown promising performance on a variety of recognition tasks using the standard zero-shot classification procedure — computing similarity between the query image and the embedded words for each category. By only using the category name, they neglect to make use of the rich context of additional information that language affords. The procedure gives no intermediate understanding of why a category is chosen, and furthermore provides no mechanism for adjusting the criteria used towards this decision. We present an alternative framework for classification with VLMs, which we call classification by description. We ask VLMs to check for descriptive features rather than broad categories: to find a tiger, look for its stripes; its claws; and more. By basing decisions on these descriptors, we can provide additional cues that encourage using the features we want to be used. In the process, we can get a clear idea of what features the model uses to construct its decision; it gains some level of inherent explainability. We query large language models (e.g., GPT-3) for these descriptors to obtain them in a scalable way. Extensive experiments show our framework has numerous advantages past interpretability. We show improvements in accuracy on ImageNet across distribution shifts; demonstrate the ability to adapt VLMs to recognize concepts unseen during training; and illustrate how descriptors can be edited to effectively mitigate bias compared to the baseline.
Moderator: Eric L. Talley
Isidor and Seville Sulzbacher Professor of Law, Columbia Law School
We encourage members of the Columbia University community to submit a poster or demonstration for exhibition at Data Science Day 2023. The session is a great opportunity to network and receive feedback from senior academic, industry, and government representatives. We anticipate that 700+ guests will attend and explore Columbia’s data science research.
Eligibility: Columbia University faculty members, affiliated researchers, currently enrolled undergraduate and graduate students, Ph.D. candidates, and postdoctoral researchers may apply to exhibit at Data Science Day. Please indicate your Columbia UNI when applying.
Complimentary Registration: If selected, your team will receive free admission to Data Science Day.
Questions? Contact firstname.lastname@example.org
Data Science Day is made possible by the support of the DSI Industry Affiliates Program.