PhD Specialization in Data Science

The PhD specialization in Data Science is an option within each participating department's PhD program. The participating departments are currently Applied Mathematics, Computer Science, Electrical Engineering, Industrial Engineering and Operations Research, and Statistics. PhD students in these departments may elect this specialization. To complete it, students must fulfill the requirements below in addition to those of their home department's PhD program. Students should discuss the specialization with their PhD advisor and their department's Director for Graduate Studies. Further questions about the specialization requirements should be directed to Professor David Blei (Computer Science and Statistics).


  1. The specialization consists of either 5 courses from the lists below, or 4 courses plus 1 additional course approved by the curriculum committee.
  2. At least 3 of the courses should come from outside the student’s home department.
  3. At least 1 course has to come from each of the three thematic areas listed below.
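
For students who want to sanity-check a course plan against the three requirements above, the rules can be sketched as a small Python check. This is only an illustrative sketch, not an official tool: the `Course` class, the `unmet_requirements` function, and the numeric area labels 1, 2, 3 (standing for the three course lists below, in order) are hypothetical names introduced here. The department assigned to each course in the example is an assumption based on its course prefix, and the sketch also assumes that a committee-approved course may count toward the outside-department requirement. The PhD advisor and the curriculum committee remain the authority on whether a plan qualifies.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder labels for the three thematic areas (the three course lists, in order).
AREAS = {1, 2, 3}


@dataclass(frozen=True)
class Course:
    code: str
    department: str       # department offering the course (assumed by the caller)
    area: Optional[int]   # 1, 2, or 3 for listed courses; None for a committee-approved course


def unmet_requirements(courses: List[Course], home_department: str) -> List[str]:
    """Return short labels for the requirements the plan does not yet satisfy."""
    listed = [c for c in courses if c.area is not None]
    approved = [c for c in courses if c.area is None]
    problems = []

    # Requirement 1: 5 listed courses, or 4 listed courses plus 1 approved course.
    if not (len(listed) >= 5 or (len(listed) >= 4 and len(approved) >= 1)):
        problems.append("course count")

    # Requirement 2: at least 3 courses from outside the home department.
    outside = [c for c in listed + approved if c.department != home_department]
    if len(outside) < 3:
        problems.append("outside home department")

    # Requirement 3: at least 1 listed course from each thematic area.
    if AREAS - {c.area for c in listed}:
        problems.append("thematic areas")

    return problems


# Example plan for a hypothetical Statistics student.
plan = [
    Course("COMS 4231", "Computer Science", 1),
    Course("COMS 4771", "Computer Science", 1),
    Course("IEOR E6613", "IEOR", 2),
    Course("STAT 6301", "Statistics", 3),
    Course("STAT 6104", "Statistics", 3),
]

print(unmet_requirements(plan, "Statistics"))
```

An empty list means that, under the assumptions above, the plan meets all three requirements; otherwise the list names the requirements still to be addressed.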


  • COMS 4231 Analysis of Algorithms I
  • COMS 6232 Analysis of Algorithms II
  • COMS 4111 Introduction to Databases
  • COMS 4113 Distributed Systems Fundamentals
  • EECS 6720 Bayesian Models for Machine Learning
  • COMS 4771 Machine Learning
  • COMS 4772 Advanced Machine Learning


  • IEOR E6613 Optimization I
  • IEOR E6614 Optimization II
  • IEOR E6711 Stochastic Modeling I
  • EEOR E6616 Convex Optimization


  • STAT 6301 Probability Theory I
  • STAT 6201 Theoretical Statistics I
  • STAT 6101 Applied Statistics I
  • STAT 6104 Computational Statistics
  • STAT 5224 Bayesian Statistics
  • STCS 6701 Foundations of Graphical Models (joint with CS)


550 W. 120th St., Northwest Corner 1401, New York, NY 10027    212-854-5660
©2019 Columbia University