Arushi Arora | Data Science for Predicting Risk
Data scientist, student advocate, and teaching assistant. These are a few of the roles Arushi Arora held as a master's student at the Data Science Institute.
Before coming to Columbia, Arora wavered between pursuing a career in computer science and one in data science, but an internship with Microsoft the summer before her senior year in college helped her reach a decision. Working on a project to examine India’s capacity to switch from fossil fuels to renewables, she became fascinated by the power of data science to make predictions and answer complicated questions. After graduating in 2015 with a degree in computer science from an engineering school in her native India, she enrolled in Columbia’s master’s in data science program.
One highlight of Columbia’s program, which she finished last semester, was an internship with MillimanMAX, the data-analytics arm of Milliman, a global actuarial consulting firm based in Seattle. From the MillimanMAX office in Cambridge, Mass., she worked on projects tied to predicting risk in the insurance industry and got to work with individual corporate clients. She looked forward to weekly lectures on topics in machine learning, which gave her an understanding of how advanced algorithms can be applied to prediction problems in the risk-assessment industry.
While building a data science portfolio at Columbia, Arora worked as a teaching assistant at Columbia Engineering and the Columbia Journalism School. She also helped launch a student council to advocate for changes in the program and, on a lighter note, learned how to dance like a Bollywood star. After graduation she returned to India, where she is currently working with the big data analytics and consulting firm TransOrg Analytics. She recently spoke with us about her experience at Columbia and as a data scientist.
What do you like most about working with massive data sets?
You can pick up any data set, and it’s not just the information there that’s important but also external factors, like weather, city noise complaints or regional geography, that can influence your interpretations. You can learn innumerable things about the world from a single data set.
Many data scientists dread data-cleaning. What’s the appeal for you?
It is highly satisfying to see an organized chart appear when all the messy data has been sorted. I wrote a guest article for a data-cleaning software company, Trifacta Wrangler, after using the tool for two of my classes. I walked readers through the cleaning of a set of global flood records for my Exploratory Data Analysis and Visualization course.
What was your favorite project as a student at DSI?
My first project, for my Storytelling with Data course. My teammates and I built a platform to visualize check-ins on Foursquare, allowing us to track the day-to-day behavior of millions of New Yorkers. I was new to New York, and this project gave me insights into my new home.
What did you find out?
After arriving at work, many people crave coffee, especially on Monday mornings. We noticed that neighborhood coffee shops hit their peak popularity at this time. We noticed two daily peaks at fitness centers—in the morning before work, and in the evening, just after. Not a surprise, but it was fun seeing that while subway check-ins are a useful way to see when people arrive at work, check-ins at bars and gyms are a good way to see where people go after work. The data also confirmed that New Yorkers love Sunday brunch.
What did you work on specifically at MillimanMAX?
I helped build an underwriting model for a client handling professional liability insurance cases across the country. I had no prior industry knowledge, and this internship helped me understand how data science is being applied to very specific real-world problems such as pricing risk.
Any advice for students choosing between multiple internships?
I was lucky enough to have had this problem and to have picked the right place for me. I’d recommend going to a company that will help you grow in your specific area of interest. Don’t be swayed by money or an important-sounding title. Go for the internship that best matches the skills you want to learn. A title lives on your resume. The experience is what you’ll remember.
After your internship, you and several classmates founded a student council chapter for data science students at Columbia Engineering. What did you hope to accomplish?
It began informally with five students as we discussed ways to improve the program. We successfully applied last fall to Columbia Engineering for official recognition. One of our goals was to increase networking opportunities: most of us are in a range of classes with limited opportunity to mingle. DSI and the DSISC (Student Council) have since increased the number of social events for data science students. We organized a mid-exam break, an end-of-semester social and ice-breaker events for new students.
How can students become involved?
The council has a governing board, but any DSI student can attend weekly meetings. Some students even volunteer to take minutes. At a Town Hall we organized last fall, students were able to ask the Dean, Kathleen McKeown, questions about courses and opportunities after graduation.
What other opportunities did you find at Columbia?
I discovered the Columbia Bollywood Dance Team, a campus group that performs expressive, high-energy mashups of Indian classical and Western dancing. I also helped teach an investigative journalism class, covering data analysis and statistics for journalism students. We explored how data is incorporated into a news piece, and whether the data or the news hook comes first. I also TA’d for a cloud computing and big data course in Columbia’s computer science department.
— Courtney Belle