A writer once dubbed Adam Kelleher, a professor at the Data Science Institute, a “data genius.”
Kelleher, also the principal data scientist at BuzzFeed, has indeed designed many successful data-driven technologies. But he’s perhaps best known for leading the development of Pound, a technology that tracks how BuzzFeed’s content travels through the web. He developed Pound (Process for Optimizing and Understanding Network Diffusion) along with his twin brother, Andrew Kelleher.
For the first time, Pound data showed how a post spread to millions of viewers across social networks, news sites, emails and blogs. BuzzFeed’s content is shared by tens of millions of people every month, so it’s not easy to track its trajectory. Kelleher led the analysis of Pound data to help BuzzFeed create and track content that’s perfectly designed to spread virally.
At the Data Science Institute, Kelleher teaches two classes – Causal Inference and Machine Learning – where he introduces students to data-science techniques used in industry. He knows first-hand what data skills students must know if they are to excel in their careers and that is what he teaches them.
Before he was hired by BuzzFeed, Kelleher was an academic, so he’s a natural fit for DSI. He studied physics at the University of North Carolina and earned a doctorate in theoretical gravity and cosmology. While a student, he took an interest in applied information diffusion, which is essentially the study of how information spreads. At the time, his brother Andrew was a data engineer at BuzzFeed, where he was leading a team to measure how the company’s content spread. Kelleher joined the team and has been at BuzzFeed ever since.
In this interview, he discusses data science from the perspective of a practitioner and a professor. He talks about what skills a data scientist needs to succeed in industry as well as how he has been able to excel at BuzzFeed, where he has built a reputation as a “data genius.”
**************
What is your main responsibility at BuzzFeed?
I wear many hats and my job is the summary of all my projects. My main job is to generate the most impactful high-level projects. For example, I’ve established a framework for machine learning to make it easier for the data science team to use machine learning. We have about 20 people on our team. We use machine learning to select some of the content on our feeds and the modules around the site.
Do you enjoy your job?
I do! I get to build great stuff. I’d love to have more time to get more in-depth on projects, but overall it’s awesome. In industry it seems the future of science is individual data sets. The scale of the data is huge and there’s always incentive to collect more. And we need the right resources to study the data. It’s not like academic research; it’s more project driven. It might not be likely to get a grant to do the kind of work I do but I’m building cool stuff every week, so it’s fun.
Do you like teaching at DSI?
I really enjoy it. The students are bright, talented and curious and many of them also have leadership qualities. They have the potential to define the future of data science.
Does studying data science enhance a student’s marketability?
It’s definitely a huge bonus. You will learn the skills that come with the degree and that is what will sell you. The DSI master’s program focuses on helping students develop the right skills. Other data science programs have come and gone because they taught the wrong skills. You need to have the right skills, you need to have insight into how those skills relate to the overall project and you need to know how to apply the skills to certain applications. Students learn all this at DSI, which is why I teach here.
What advice do you have for the master’s students in DSI?
First, they should do internships. Hands-on experience is incredibly important. A lot of workplace skills are hard to learn in school, such as how to work with a team, how to divide the work among the members of the team and how to be efficient in meetings. It’s also important to learn how to coordinate projects among different teams, and how to work with different groups and colleagues within a company. We have interns on our data science team, and I definitely encourage my students to apply.
Do you work in this way at your job?
Yes, my team has to work across all departments at BuzzFeed, with the editorial team and content producers. They don’t have technical backgrounds so we need to explain our work to them in ways they’ll understand. We must also design our projects to satisfy editorial needs. We make sure to do a lot of follow-up to ensure that what we build works. We build software to specs, but oftentimes colleagues will change the requirements, so my team must work closely with the teams to make changes in the code as they build it out. That involves a lot of back and forth, but keeping good relationships with colleagues, the stakeholders in the company, is key. Our projects must work for them.
Should graduate students follow their interests or focus on furthering their careers?
For me it was a bit of both, but I mostly did what I was interested in. If you follow your interests, you’ll work harder and perform better. You should also work on improving your weaknesses and learn new things in grad school. In industry there’s often no time to go into great depth on projects. Graduate school is the time to develop a depth of understanding on subjects you love and learn skills that will also further your career.
What were your interests in high school?
I liked physics a lot, but I also liked the natural sciences. And I wanted to understand social science, economics and public policy. I followed my interests and I read a lot.
What do you like to read?
I read an assortment of news sources: The New York Times, Fast Company, Wired and BuzzFeed. I read a lot off my Facebook and Twitter feeds. In terms of books, I love Isaac Asimov’s Foundation series. It’s a series about Hari Seldon, a mathematician who develops a branch of mathematics called psychohistory. The series has inspired social scientists and physicists and it even inspired my interest in causal inference. I also read about issues I’m interested in, such as how the criminal justice system works and how it could be reformed.
After getting a doctorate, why did you decide to work in industry?
At UNC, I worked on a research project on network analysis in public policy for the Carolina’s Network Group. My role was to collect all articles in policy, going from blog to blog, and I built a data system to analyze all the info I collected. That was the applied information diffusion project that coincided with the work my twin brother Andrew was doing at BuzzFeed. He was the 150th employee in the company and I was the 300th. He has since left the company, but I’m still at BuzzFeed, happily.
You work with data scientists from different backgrounds. Do you see differences in their abilities based on their backgrounds?
Yes, I have seen the different types of data scientists. Some have engineering backgrounds but may not know statistics; some studied statistics but may not know computer science; and others studied science, math or physics, but might not know how to keep the big picture in mind or how to keep it simple. Oftentimes, for instance, it’s better to pick the simple algorithm than to agonize over finding the best possible one. When I was a student I was like that: I’d try for the perfect algorithm to fit my data. But working in industry taught me that simple is often better. I explain this to my students all the time and try to make them well rounded. Data science is a relatively new field that’s evolved over a few decades and is driven in part by industry need.
Why did you decide to return to academia to teach?
I loved teaching when I was a graduate student. And I thought I could add value in the field of data science by bringing what I’ve learned in industry back to the classroom. My goal is to make people better at data science. In part, I also think causal inference should play a bigger role in academics, so I wanted to teach that class. I have a blog dedicated to causal inference and I use it at work. It’s great for taking observational data and using it to inform policy recommendations in a number of governmental areas. When you think about it, that’s what data science is: You inform people’s actions by the smart use of data. You use data to tell them, ‘if you take this action, this will be the result.’ This is what causal inference is, and that is fundamentally what we do as data scientists: We use data to help people make the right decisions and do good.