DSI Grad to Work at Capital One As a Machine Learning Data Scientist

October 19, 2017

UPDATE:

Jonathan Galsurkar recently accepted a full-time position at Capital One, working as a Senior Associate Machine Learning Data Scientist, Natural Language Processing. He will be based in the firm’s Union Square office.

The skills he attained in his DSI coursework, he said, helped him land this job as well as his earlier internship at IBM (see the 2017 story below, which focuses on that internship). At Capital One, he’ll focus on using Natural Language Processing to help the firm reach its goals. His skills in the area were developed through his internship at IBM, along with his DSI coursework in Machine Learning, Natural Language Processing, and Deep Learning.

During his time at DSI, Jonathan had many successes.

“I have met some amazing friends, many of which I’m sure will be lifelong,” said Jonathan, who graduated with a DSI master’s degree in May 2018. “I have developed great relationships with faculty and advisers, both of which have helped shape my path and teach me more than I could have imagined. I also won first place at the 2017 Data Science Hackathon. The Capstone project allowed my team and I to develop a product that we believe could really help save lives, while also working with very cool technologies to do so. It’s been a blast!”

************

Jonathan Galsurkar worked a summer internship with an intriguing twist: its objective was to use data for social good.

He interned for IBM’s Science for Social Good initiative, where the company’s research scientists and engineers partner with non-governmental organizations to solve some of society’s most intractable problems.

Jonathan, a master’s student at DSI, worked on a project for the United Nations Development Programme (UNDP). The Programme helps countries adopt the U.N.’s sustainable development goals, which include ending poverty, protecting the planet and ensuring prosperity for all.

Using a Rapid Integrated Assessment, UNDP policy experts assess a country’s national development plans and align them with the U.N.’s sustainable development goals, which comprise a total of 17 goals and 169 targets. It’s a labor-intensive undertaking, since experts must manually review thousands of pages from national plans and match sentences or paragraphs with U.N. targets. Jonathan led a team that developed a natural language processing-based method to expedite the work by automating the assessments.

It was an ideal internship for Jonathan, whose goal after he graduates is to use data science for social good. In this interview, he talks about his internship at IBM, his studies at DSI and how he intends to use data science to improve society.

***********

Can you talk about your internship?

My project was to develop a way to accelerate the work of the U.N. experts. Specifically, we used deep-learning embedding techniques and semantic searching to find sentences or paragraphs in the national development documents that match the semantic concepts of each of the sustainable goal targets. We successfully piloted, for instance, an algorithm to perform the assessment of Papua New Guinea’s national plan. The U.N. estimated that the algorithm will help reduce the time of the assessment from three or four weeks to just three days. With this reduction in time, the U.N. can more quickly help countries ensure the coherence of their national development plans.

Was the internship fun?

The internship was definitely fun with many activities such as a ping-pong tournament, a tennis tournament, many barbeques, etc. I was placed in a large office space with the other social-good interns. This was great because although we worked on different projects, we bounced ideas off of each other, had lunch together, and by the end of the summer became great friends. I was in IBM’s Yorktown Heights location. IBM paid for me to live in White Plains and provided a shuttle that took interns living at our location to and from work.

What data techniques did you use?

I was the lead on the project, and the techniques we used were mostly deep-learning-based sentence and paragraph embedding methods, together with what is known as a nearest-neighbor-based semantic search system. One novel contribution we made was our use of historical assessment data for neighborhood supervision, which means we managed to incorporate some supervision in an otherwise unsupervised model (the embedding model). The work was done in Python using libraries such as Gensim, NumPy, pandas and scikit-learn.
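To make the idea of nearest-neighbor semantic search concrete, here is a minimal sketch in Python. It is not the team's method: the article does not describe the embedding model, so a simple bag-of-words count vector stands in for the deep-learning embeddings, and the neighborhood supervision step is left out entirely. The function names (`embed`, `cosine`, `nearest_sentences`) and the example sentences are hypothetical, invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: a bag-of-words count vector.

    Assumption: the real system used learned sentence/paragraph embeddings;
    word counts are used here only to keep the sketch self-contained.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_sentences(target, sentences, k=1):
    """Return the k sentences most similar to the target text, best first."""
    target_vec = embed(target)
    ranked = sorted(
        sentences,
        key=lambda s: cosine(target_vec, embed(s)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical national-plan sentences, invented for illustration only.
plan_sentences = [
    "The government will expand access to clean drinking water for rural communities.",
    "New roads will connect the capital to the northern provinces.",
    "The plan aims to reduce poverty among rural households by 2030.",
]

# Illustrative goal text to match against the plan.
target = "End poverty in all its forms everywhere."

best_match = nearest_sentences(target, plan_sentences, k=1)[0]
print(best_match)
```

With word counts, a match depends on shared vocabulary (here, the word "poverty"). Learned embeddings are meant to relax that limitation by placing sentences with similar meaning close together even when they share few words, which is presumably why an embedding model was used for this task.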

Did you learn new techniques during the internship?

IBM wanted to ensure that we get the most out of our internships, especially when it comes to learning new things. My project was mostly natural language processing, something I had very little experience with. Through my manager and mentors, I learned more than I could imagine in the course of the summer, which increased my interest in natural language processing. I’m thus currently taking Professor McKeown’s NLP class and really enjoying it.

Was your IBM internship extended?

Yes. IBM had another project similar to the one I worked on over the summer, and we arranged it so I could stay on. I enjoy the culture at IBM, and the many things I have learned from my manager and mentors are invaluable. So I couldn’t pass up continuing the internship through the fall.

How’d you get interested in data science?

My first experience working with Big Data was during a Research Experience for Undergraduates program I participated in at the University of Southern California. I was one of three students selected to work on a project called Scalable Machine Learning Models for Smart Healthcare Informatics, an experience that truly changed my life. My research team worked on tools that analyzed large databases of health records and could be used by doctors to make fast diagnoses and recommend treatments for critically ill children. For me, it was an honor to work on a project as significant as this. And while working on the project I realized that data, if used creatively, could be used to solve life-threatening problems in society. I was inspired and motivated to continue my education by pursuing a master’s degree in data science at DSI.

Are you enjoying the master’s program?

The program is well balanced, offering classes in the main areas of data science. The required classes are key to understanding data science, while the electives will allow me to choose what specific field I want to focus on. And the cherry on top is the DSI capstone project, which will allow me to take all of the knowledge I’ve amassed in classes and apply it in a real-world setting, an experience that will truly put my education into perspective.

Can you talk about your background and your accomplishments?

I’m a first-generation American, born and raised in Brooklyn. My mother is Ukrainian and my father is Indian-Israeli. I studied as an undergrad at Hunter College, double majoring in computer science and mathematics. Some of my achievements include winning first place in the 2017 Columbia Annual Data Science Hackathon; receiving the $25,000 Monsanto Graduate Scholarship; and graduating from Hunter College summa cum laude. I have also made an app (Gathr), not currently published, and founded a startup. I also presented the work I did for my IBM internship during the Data Science for Social Good Conference in Chicago. For fun I like to play guitar, snowboard, and most of all travel. I’ve been to about 25 countries thus far and have plenty more I plan to visit.

What are your plans after you graduate?

Initially I’d like to work for a big tech company and advance the state of the art of data science while working on interesting problems. Ultimately, though, my goal is to create my own data-science-driven startup with the intent of making this a better world. There are so many others in the world less fortunate than ourselves. If we can leverage technology and data to better their lives, I feel it is our obligation to do so.