Florence Hudson always wanted to be a scientist. In the 1960s and 1970s, she was transfixed by the Apollo missions and dreamed of becoming a rocket scientist, a dream that would become a reality. She was one of the first women to study mechanical and aerospace engineering at Princeton, after which she became an aerospace engineer at NASA and designed a propellant production system for sample-return missions from Jupiter’s Galilean satellites. She later climbed the ranks of corporate America, working as vice president and chief technology officer at IBM.

Hudson was recently named executive director of the Northeast Big Data Innovation Hub, one of four regional big data hubs funded by the National Science Foundation. The Hub, which is hosted by the Data Science Institute at Columbia University, works with industry, academia, nonprofits, and government agencies to address societal and scientific challenges, spur economic development, and accelerate innovation in the national big data ecosystem.

Here, Hudson discusses her background and vision for the Northeast Hub.

What drew you to your new role as executive director of the Hub?

I love working with the big data ecosystem community of academic, industry, not-for-profit, and government researchers and leaders to be a catalyst in the application of innovative data science approaches to solve societal and scientific challenges. Being a community convener, collaboration hub, and catalyst are natural tendencies for me, so the mission of the Hub and my personal mission are very well aligned. Data science is a critical skill across all areas of our world now, and to be a leader in enhancing human and institutional understanding of the value of data science, and to enable people and institutions to develop valuable insights from the data, is a great opportunity.

You have supported the Northeast Hub since its inception. How did you first get involved?

I participated in the Northeast Hub kickoff meeting in 2015, so I have been involved from the beginning, and I have been on the executive committee and advisory board since its inception. I’ve also been on the strategic planning group for the Northeast Hub and the All Hubs team. I served as special adviser for Next Generation Internet, leading our Hub collaboration with the European Commission’s Horizon 2020 Next Generation Internet initiative, including helping to write a Policy Brief on Surveillance and Analytics in the Deep Web: A Priority for Collaboration in AI and Cybersecurity between EU and the US. I’ve been a co-facilitator and participant in the Northeast Hub’s Data Sharing workshop and the Cybersecurity IoT workshop as well. Katie Naum, the Hub operations manager, and I participated in cross-Hub education and data literacy events such as JupyterCon.

You are trained as a mechanical and aerospace engineer and worked for major corporations. Have you always been interested in data science?

I have always been a data scientist, focused on pattern recognition and anomaly detection. For an aerospace engineer, data is key. Whether designing aircraft or designing space missions, we need to leverage data. Now data science is applicable and important in nearly every use case, from government elections to agriculture to Covid-19. With increased data-sharing requirements for endeavors such as precision medicine, we are uniting data science with various disciplines and information technology to crack the code that advances and deepens scientific understanding.

You mentioned Covid-19. What are your thoughts about data and Covid-19 and what has it taught us about the central importance of data?

This is an urgent example of how to use data for good to address societal challenges, in this case related to the Covid-19 pandemic. In our role as community convener, collaboration hub, and catalyst for data science in the Northeast region, the Hub has launched a Covid-19 Resources guide on its website. The guide includes funding resources for researchers, links to useful datasets, collaboration opportunities and virtual events we are co-sponsoring with community members, as well as links to hundreds of National Science Foundation RAPID grants funded for Covid research in our region and across the U.S.

What value does the Northeast Hub bring to society?

In essence, we work with the community to develop and leverage data science approaches to catalyze action from knowledge and insight, enabling the development of new solutions to societal problems through community-led efforts. We work with partners such as the New York Hall of Science to teach data science to underserved communities, addressing the need for more data science skills while enhancing community opportunities for new success.

As incoming executive director, what is your plan for the Hub?

My plan is to increase our positive impact as a community convener, collaboration hub, and catalyst to promote and apply translational data science in the Northeast Region. We will enable the members of the data-science ecosystem in the Northeast U.S. to work together to achieve our mutual goals and amplify community success in the Hub’s four focus areas: health, responsible data science, urban to rural communities, and education + data literacy.

Which successful projects has the Hub worked on?

We have had a lot of success in the education + data literacy focus area in particular. The Big Data for Education team has conducted workshops in New York City, Buffalo, Philadelphia, Pittsburgh, and Boston, including tutorials by senior researchers, who taught how and when to use key methods for educational data mining and learning analytics. In addition, the Big Data for Education team has supported the ongoing run of the Big Data and Education Massive Open Online Course (BDEMOOC), which has had more than 100,000 learners across its several iterations. BDEMOOC helps learners develop skills both in using standard machine learning and data mining algorithms, and in working in areas more specific to education such as latent knowledge estimation. The Big Data for Education project has resulted in the publication of more than 20 scientific papers, many of them first-authored by graduate students. We will also be publishing success stories and project outcomes on the Hub website, linking to papers produced by projects such as Behavioral Predictors of MOOC Post-Course Development, and highlighting workshop information that could be leveraged to inform more workshops such as the Longitudinal Data Competition, which invited data scientists from around the world to analyze and develop insights from student data.

What kinds of projects will you focus on?

The projects we focus on are innovative and transformational in leveraging data science for the good of society. These include the Exposome Data Insight project, which is aligned with both the health and the urban to rural focus areas. The goal is to address problems in human health and disease by developing computational and bioinformatics methods to reproducibly and efficiently reason over high-throughput data streams spanning molecular data to population data. This project is a collaboration between researchers from multiple universities with expertise in health, data science, machine learning, and smart-city sensor testbeds. The team already leverages data from five federal agencies, and we are growing the collaboration to include city-level environmental data from sensor testbeds and regional health data. The researchers can then correlate health, demographic, and environmental data to understand health disparities due to environmental exposure to various particles, and develop a plan to share the data to inform potential policies and plans to address the disparities. In the responsible data science focus area, the Framework for Integrative Data Equity Systems (FIDES) project is a collaboration begun by the Northeast, Midwest and West Hubs to enable collaborative development of data science practices, leveraging data in an ethical and equitable way to benefit society.

We encourage the Northeast Hub community to contact the Hub to collaborate in any of the four focus areas, including projects they are pursuing already, and projects they would like to engage in which are or could be in the Hub portfolio. The Hub includes public and private institutions in the states of Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island and Vermont. We highlight the collaborators in the Hub on our website, and want to attract more collaborators. A diverse and inclusive community is a great source for collaborative innovation.