Program Description
The Master of Science in Data Science is a highly selective program for students with a strong background in mathematics, computer science, and applied statistics. The degree focuses on the development of new methods for data science.
Our networked world is generating a deluge of data that no human, or group of humans, can process fast enough. This data deluge has the potential to transform the way business, government, science, and healthcare are carried out. But too few possess the skills needed to use automated analytical tools and cut through the noise to create knowledge from big data.
A new discipline has emerged to address the need for professionals and researchers to deal with the “data tidal wave.” Its objective is to provide the underlying theory and methods of the data revolution. This emergent discipline is known by several names. We call it “data science,” and we have created the world’s first MS degree program devoted to it.
The curriculum comprises 36 credits and offers two ways to structure the graduate program, giving students the opportunity to pursue a specialization through the Industry Concentration or through tracks.
Admissions
All applicants to the Graduate School of Arts and Science (GSAS) are required to submit the general application requirements.
Admission to the Master of Science in Data Science requires substantial but specific mathematical competencies, typical of a major in mathematics, statistics, engineering, physics, theoretical economics, or computer science with sufficient mathematical training. In addition, applicants should have some training in programming and basic computer science. Preference is given to applicants with prior exposure to machine learning, computational statistics, data mining, large-scale scientific computing, or operations research (in either an academic or a professional context), as well as to applicants with significantly more mathematical and/or computer science training than the minimum requirements listed above.
See Data Science for admission requirements and instructions specific to this program.
Program Requirements
The program requires the completion of 36 credits, and offers a concentration in Industry. See below for concentration details and requirements.
Course List
Course | Title | Credits |
DS-GA 1001 | Introduction to Data Science | 3 |
DS-GA 1002 | Probability and Statistics for Data Science | 3 |
DS-GA 1003 | Machine Learning | 3 |
DS-GA 1004 | Big Data | 3 |
DS-GA 1006 | Capstone Project and Presentation | 3 |
| 3 |
| Inference and Representation | |
| Deep Learning | |
| Fundamentals of Natural Language Processing | |
| Large Language Models: Evaluation and Applications | |
| Mathematical Tools for Data Science | |
| Optimization and Computational Linear Algebra | |
| Text as Data | |
| Computational Cognitive Modeling | |
| Responsible Data Science | |
| Probabilistic Time Series Analysis | |
| Mathematical Statistics | |
| Probability and Statistics 2 | |
Total Credits | 36 |
Industry Concentration
Students have the opportunity to pursue a specialization through the Industry Concentration. The concentration is designed to respond to the needs of, and input from, companies, and it allows MS in Data Science students to apply the knowledge and skills obtained in their coursework to industry during the degree program. It requires more industry-targeted coursework and a Practical Training experience. Students in the concentration are required to take the courses below as part of the 36-credit degree requirement.
Concentration Requirements
Course List
Course | Title | Credits |
DS-GA 1009 | Practical Training for Data Science (taken within the first year of the program) | 3 |
| 6 |
| Large Language Models: Evaluation and Applications | |
| Database Systems | |
| Principles of Database Systems | |
| Information Visualization | |
| Large-Scale Visual Analytics | |
| Advanced Topics Database Systems | |
| Realtime and Big Data Analytics | |
| Big Data Application Development | |
| Spec Top Computer SCI: | |
| Spec Top Computer SCI: (Cloud and Machine Learning) | |
| Spec Top Computer SCI: (Introduction to Deep Learning Systems) | |
| Database Management & Modeling | |
| Trends in Financial Data Science | |
| Robo Advisors & Systematic Trading | |
| Inference and Representation | |
| Deep Learning | |
| Fundamentals of Natural Language Processing | |
| Large Language Models: Evaluation and Applications | |
| Text as Data | |
| Natural Lang Processing | |
| Spec Top Computer SCI: (Learning with Large Language and Vision Models) | |
| Spec Top Computer SCI: (Statistical NLP) | |
Capstone (DS-GA 1006)
One of the key features of the MS in Data Science curriculum is a capstone project that makes the theoretical knowledge gained in the program operational in realistic settings. During the project, students go through the entire process of solving a real-world problem: collecting and processing real-world data, designing the best method to solve the problem, and finally implementing a solution. The problems and datasets come from real-world settings identical to those that might be encountered in industry, academia, or government.
Learning Outcomes
Upon successful completion of the program, graduates will have:
- The knowledge and skills needed to develop new methods for data science.
- The skills needed to use automated analytical tools and cut through the noise to create knowledge from big data.
- The ability to solve a real-world problem end to end: collecting and processing real-world data, designing the best method to solve the problem, and finally implementing a solution.
- An awareness of the social and ethical implications of data-driven methods, alongside tools to address and mitigate associated biases.
Policies
NYU Policies
University-wide policies can be found on the New York University Policy pages.
Graduate School of Arts and Science Policies
Academic Policies for the Graduate School of Arts and Science can be found on the Academic Policies page.