A Q & A with Gary Sztajnman | Exploring Applications in Data Science


Gary Sztajnman came to New York from his native France to study for a master's in data science at Columbia Engineering. He spoke with us about his recent internship at Amper Music, a New York-based startup using artificial intelligence to compose music. He is currently vice president of the Columbia Data Science Society, a student organization.

How did you get interested in data science?

I was working at Linkfluence, a social media analysis firm. My team was researching public attitudes toward Stevia, a sugar substitute used by one of our clients, a multinational food company. My task was to mine tens of thousands of posts on Twitter and Facebook and interpret their emotional content using sentiment-analysis software. I found that people had a more positive view of Stevia than we expected. That project helped me see the value in developing data science skills. I applied to multiple data science programs and was accepted by Columbia Engineering, my first choice.

You won second place at the 2015 Columbia Data Science Student Challenge using New York City data to predict which restaurants were most likely to have rats. How did you come up with that idea?

A few days before the hackathon my friends and I had dinner at a small restaurant in SoHo and all of us got sick. We felt that we didn’t have enough data to pick a good restaurant for the next time. That was the inspiration for the Ratspector.

At a hackathon hosted earlier this year by Audible, Sztajnman (right) and three classmates presented their project results.

How does it work?

The Ratspector draws on data strongly correlated with 311 rat complaints, for example, 311 complaints about stagnant water and large volumes of trash. We visualized the correlation on a map to test our hypothesis, and then calculated each restaurant’s distance from these features. We fitted a Random Forest model in Azure ML, using rat complaints as labels, to compute a “Ratspector score.” After training and optimizing the model, we could predict how likely a New York restaurant was to have rats.
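To make the distance step concrete, here is a minimal sketch of how such features might be computed. It is not the team's code: the coordinates are made up, the function names are hypothetical, and the use of haversine (great-circle) distance to the nearest complaint is an assumption, since the interview does not say how distance was measured.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_distance_km(point, complaints):
    """Distance from one location to the closest complaint in a list."""
    return min(haversine_km(point[0], point[1], lat, lon)
               for lat, lon in complaints)


def build_features(restaurant, water_complaints, trash_complaints):
    """Feature row: distance to nearest stagnant-water and trash complaint."""
    return [
        nearest_distance_km(restaurant, water_complaints),
        nearest_distance_km(restaurant, trash_complaints),
    ]


# Made-up coordinates, for illustration only.
restaurant = (40.7233, -74.0030)
water = [(40.7240, -74.0020), (40.7500, -73.9900)]
trash = [(40.7233, -74.0030), (40.7300, -74.0100)]

features = build_features(restaurant, water, trash)
print(features)
```

Rows like this, one per restaurant and labeled with rat-complaint data, are the kind of input a random forest classifier would be trained on. The interview says that step was done in Azure ML.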

What did you do this past summer at Amper Music?

I worked with a team applying artificial intelligence to the music-writing process. Let’s say you’re creating an ad for Pepsi and want your soundtrack to convey a range of emotions: longing, excitement or satisfaction. You used to have to hire a composer to write that music for you. Amper is developing software that can do it in a minute.

What was your role as an intern and what tools did you use?

My job was to help tune the recommendation system; the more you use Amper, the better it gets at predicting your preferences and making music that you’ll like. I used Python, with some R and SQL. I learned how to write production-quality code efficiently using packages such as NumPy, Pandas, scikit-learn and Flask.

Amper Music wrote the score for this short film entered in the 2016 TIFF x Instagram Shorts Festival. (Grange Productions)

What skills did you learn at Columbia that helped you at Amper Music?

The machine learning course taught me how to solve a range of problems. I would regularly consult my slides from class to find the right algorithm or best implementation to match music-composition challenges that came up.

How does Amper create music? Walk us through the process.

Amper functions as a digital composer and conductor. As the composer, Amper writes a musical score. Then, as the conductor, Amper performs and produces the music to share with the user.

Was there a favorite piece of music you helped develop?

In July, we generated a Bach-inspired piece of classical music. It was amazing to see a computer learn to make its own music. By the end, I couldn’t distinguish between the music written by Bach and the music generated by the computer.

Do you think that computers will eventually become better composers than humans? 

It all depends on how you judge a song. Today, a deep learning algorithm can precisely copy a Van Gogh painting or a piece by Bach. But it is still only reproducing music made by humans. I'm not sure that algorithms will ever be truly creative.

— Daniel First

©2018 Columbia University