We often hear about the successes of machine learning in consumer services, such as search, online shopping, speech recognition, and image classification, but the kinds of data that scientists and engineers collect and need to analyze are different. How can these machine learning methods be applied to scientific data? Machine Learning in Science and Engineering, or MLSE 2020, aimed to bring together machine learning experts with experts from all science and engineering disciplines to answer this question.

More than 130 speakers and 1,300 attendees gathered on December 14 and 15 to explore how artificial intelligence and machine learning can help solve emerging challenges. The two-day virtual conference offered 70 hours of concurrent programming across 11 dedicated tracks, each with its own program and participating research. More than 420 members of the Columbia community registered for this year’s conference, which also saw interest from 328 researchers from international universities and organizations and attendees from 266 unique American affiliations, including universities, companies, and other organizations.

The conference was hosted by the Data Science Institute (DSI) at Columbia University, supported by a National Science Foundation (NSF) TRIPODS+X award, and co-sponsored by DSI’s Industry Affiliates Program, IEEE Brain Initiative, Northeastern University Department of Chemical Engineering, and Calico Life Sciences.

Jeannette M. Wing, Avanessians Director of the Data Science Institute and Professor of Computer Science, and Qiang Du, Fu Foundation Professor of Applied Mathematics, Department of Applied Physics and Applied Mathematics, Columbia Engineering, served as conference chairs. 

“I’m impressed at how deeply the different science and engineering communities have embraced machine learning. We had speakers from a Nobel Laureate to undergraduates, from academia to industry to government,” Wing said during her closing remarks.   

Columbia faculty chaired or co-chaired each of the conference’s 11 tracks—astronomy, astrophysics, and physics; biology; chemistry, chemical engineering, and materials science; computing systems; earth and environmental sciences; health sciences; mechanical engineering, engineering mechanics, and civil engineering; methods and algorithms; neuroscience; quantum; and transportation—evidence of “the breadth and depth of the use of machine learning throughout the university in all science and engineering disciplines,” according to Wing.  

MLSE 2020 marks the third annual MLSE conference. The first open MLSE conference was organized by Newell Washburn at Carnegie Mellon University in 2018 in partnership with Georgia Tech. The second MLSE, which was also supported by an NSF TRIPODS+X award, was organized by Dana Randall at Georgia Tech in 2019 in conjunction with CMU and Columbia University.

Columbia was able to pivot from an in-person to a completely virtual event in the midst of the coronavirus pandemic, and expanded the scope of the previous conferences by adding new tracks in science and engineering fields, foundations of machine learning, and the design of computational systems. 

Du noted that the virtual event’s success was due to a considerable team effort. “The planning for MLSE 2020 took more than a year. The conference was able to reach out to a much greater community than before, providing a forum for experts with different backgrounds to discuss critical challenges, common issues, and innovative solutions,” he said.

Conference highlights included:

  • Cynthia Rudin, professor of computer science, electrical and computer engineering, and statistical science at Duke University, discussed how to create an interpretable deep learning model for computer vision, and how to project high-dimensional data onto two dimensions in order to visualize it while preserving its structure. She presented ProtoPNet, a deep learning method for case-based reasoning in images, and concept whitening, a technique that disentangles the latent space of a neural network by forcing all information about a given concept to travel through exactly one neuron.

  • Barbara Engelhardt, associate professor of computer science at Princeton University, reviewed the challenges and opportunities that arise when using machine learning to assist hospital systems in responding to pandemics. She discussed modeling patient time-course and understanding the trajectory of diseases; the abilities of machine learning methods to perform patient triage; how to develop policies for care with limited patient data sets; an application of these approaches to resource allocation; and how demographic traits impact machine learning methods for hospital patient care.

  • William Dally, chief scientist and senior vice president of research for Nvidia and professor-research of computer science and electrical engineering at Stanford University, described current hardware for deep learning and research to continue performance scaling in the absence of Moore’s Law, and discussed dedicated accelerators, special instructions, data representation, sparsity, and analog methods. 

  • Nobel laureate Barry C. Barish, Ronald and Maxine Linde Professor of Physics, Emeritus, Division of Physics, Mathematics and Astronomy at Caltech, suggested various areas where machine learning may both improve LIGO interferometer performance and play a role in searches for gravitational waves.

  • The youngest presenter was Varun Agarwal, a high school student from Houston, TX, who shared his research on tuberculosis severity analysis through transfer learning during the health track’s poster session.

  • 50 research posters from teams representing national and international universities and organizations.
  • 95 job and research opportunities posted on the MLSE 2020 community board.
  • 575+ networking conversations via the MLSE 2020 app.

For more information, please visit MLSE2020.com.