Four DSI Graduates Offered Career Advice to Students

Alumni Panel

Four DSI graduates returned to campus for a panel discussion where they offered career and academic advice to current students. The graduates, all of whom work in New York City, spoke for an hour on the panel, which was moderated by Simran Lamba, a current master’s student. After the panel discussion, the students had the chance to ask the panelists questions and later meet them personally in Columbia’s Shapiro Hall.   

The alumni panel discussion was hosted by the Data Science Institute and DSI's Student Council. Rachel Cohen, DSI's Assistant Director of Student Services and Career Development, said that the four panelists “have been through our rigorous master’s program and successfully attained roles as data scientists at great companies.”

“And the more opportunities we have to bring them back to campus to interact with current students,” she added, “the better.”

The panelists were Xizewen Han, a data scientist for Fareportal; Francisco Arceo, a data scientist for Commonwealth Bank; Tony Paek, a data scientist for Conde Nast; and Gary Sztajnman, a data scientist in product development for Verizon. What follows is an edited summary of the panel discussion.


Talk about your jobs and what they entail.

Gary: I started the program in 2015 and began at Verizon in 2017. Verizon has many media products and I use data science to design better products and experiences for the users.

Tony: I graduated in December 2015. I worked for American Express for a year and a half and then joined Conde Nast’s digital team. I make prediction models for advertisers.

Francisco: I started as a part-time student at DSI in 2014 and graduated in December 2016. It was my second master’s degree. I had a really wonderful time at DSI and I learned a lot. I first had a job that was a poor fit and didn’t provide me with the analytical work which I was interested in. But now I love my job working as a data scientist for the Commonwealth Bank of Australia.

Xizewen: I graduated last December (2016). I use neural networks for fraud detection. My work has helped the company significantly reduce its losses due to fraud. Before this, I had a great internship at Mediamath, an ad-tech company. I credit that to DSI, which introduced me to the opportunity. I had a great experience there and learned a lot, which helped me succeed in my current job.

Which DSI courses have been most useful to you on your jobs?

Francisco: In general, studying algorithms has been the most helpful for me. It’s important to understand statistics and have good statistical intuition, but in everyday work algorithms will help you the most. That and your ability to write readable code.

Tony: I really liked my modeling class with Chris Wiggins; that’s been most applicable in my job. I also studied natural language processing with Mike Collins, which helps me a lot in my work. These were core classes that were deep and interesting.

Gary: My favorite class was neural networks and deep learning. But my advice to students is to take the classes that interest you. Think about what you want to do and choose your classes based on your interests.

Xizewen: I loved Bayesian Models for Machine Learning with Professor John Paisley. It gave me a new perspective in machine learning and I found it fascinating. The math was also very elegant. In the fall of 2018, I will leave my job to begin a doctorate in statistics at the University of Texas at Austin. Professor Paisley’s class motivated me to do the Ph.D. He even wrote me a letter of recommendation.

What tools do you use on a regular basis?

Francisco: I use the core data science languages, Python and R. I also spend a lot of time building out data sets (i.e., clean data sets that can be used for analytical work) using HQL, SQL, and other standard database platforms. You should certainly learn how to build predictive models (even with simple algorithms), and if you want to work on the more technical engineering side, you’ll probably need to be strong in Java, Scala, or C/C++.

Tony: I agree with Francisco about the importance of core languages such as SQL. You should all be comfortable with programming languages. You might start with Python and then learn Spark, but it will depend on the needs of your companies.

Xizewen: Knowing SQL is important. You should also be good at data visualization -- get good at using Tableau or Power BI. Technical skills are key, but don’t underestimate communication skills. They play such an important role during the interview. Turning the projects you worked on into an attractive and accessible story could benefit you much more than simply throwing out fancy technical terms.

Do you have advice for students who are searching for internships and jobs?

Tony: The most important thing is to understand the distinctions among different roles or jobs. Most companies use the phrase data science broadly. There is data analytics, machine learning, and data engineering. When looking at jobs, research what skills are required for the job and what your main tasks will be. Read the job description carefully and pick a job that aligns with your interests.

Francisco: I agree with Tony. Data science is broad -- there’s an engineering focus, a business focus, a modeling focus, and others. So when you are looking for internships or jobs, you should certainly apply to a ton of places, but you should also try to prioritize places that are aligned with your interests. Ask the people you interview with hard questions. I have done data science interviews for my current and past employers, and personally I like it when candidates ask me hard questions. It shows they did their homework. Some job environments might be bad for you. I hated my first job -- I was looking to run away. I got lucky because I was able to find a different opportunity that better aligned with my interests, and I really fell in love with my job. It’s really worth finding out what you’ll do on an hour-by-hour basis on the job, from the mundane (e.g., frequent team meetings) to the exciting (head-down coding time). You are interviewing the company, too, so again, ask hard questions. Make sure it’s the right fit so you like your work and will grow at your job.

Gary: First, decide what you want to do and focus on finding a job in that area of data science, which is a broad and fast-changing field. And use your network of friends at Columbia to get internships and jobs.

Francisco: Exactly. Use your network. You are at Columbia -- a school in New York City, where all the jobs are. You are studying at an elite university. Always keep the bigger picture in mind when you’re working on your problems -- don’t lose sight of the forest while navigating amongst the trees. When you are talking about the work that you’ve done for a class, a research project, or a hackathon, always remember to ask yourself “So what?” What was the point of your effort? In my opinion, having that high-level understanding of your work is critical to being an effective data scientist. Lastly, make sure to leverage your network. So many of my job offers have been a consequence of my network, developed either through my colleagues or Columbia -- you have an extraordinary group of peers to help you.

Xizewen: For me, I think it would be great to keep in mind that there’s no need to rush. If you find that there’s a skill that could make you more attractive in the job market, even if you have to start from scratch, go ahead and learn it, and take your time to really develop this skill rather than do it quick and dirty just for one interview. Take this time as a great learning opportunity. As an international student, I always had this fear that I needed to find a job sooner rather than later. But in retrospect, I found that people in my year went on to find jobs at any time throughout the school year, and many students received great offers towards the end of the master’s program. DSI is a great institute at a great university, so you’ll do more than fine.

--By Robert Florida

©2017 Columbia University