Program Description
The Master of Science in Data Science is a highly selective program for students with a strong background in mathematics, computer science, and applied statistics. The degree focuses on the development of new methods for data science.
Our networked world is generating a deluge of data that no human, or group of humans, can process fast enough. This data deluge has the potential to transform the way business, government, science, and healthcare are carried out. But too few people possess the skills needed to use automated analytical tools and cut through the noise to create knowledge from big data.
A new discipline has emerged to address the need for professionals and researchers to deal with the “data tidal wave.” Its objective is to provide the underlying theory and methods of the data revolution. This emergent discipline is known by several names. We call it “data science,” and we have created the world’s first MS degree program devoted to it.
The curriculum comprises 36 credits and can be structured in two ways, giving students the opportunity to pursue a specialization through the Industry Concentration or tracks.
Admissions
All applicants to the Graduate School of Arts and Science (GSAS) are required to submit the general application requirements.
Admission to the Master of Science in Data Science requires substantial but specific mathematical competencies, typical of a major in mathematics, statistics, engineering, physics, theoretical economics, or computer science with sufficient mathematical training. In addition, applicants should have some training in programming and basic computer science. Preference is given to applicants with prior exposure to machine learning, computational statistics, data mining, large-scale scientific computing, or operations research (in either an academic or a professional context), as well as to applicants with significantly more mathematical and/or computer science training than the minimum requirements described above.
See Data Science for admission requirements and instructions specific to this program.
Program Requirements
The program requires the completion of 36 credits and offers a concentration in Industry.
Course List
Course | Title | Credits |
DS-GA 1001 | Introduction to Data Science | 3 |
DS-GA 1002 | Probability and Statistics for Data Science | 3 |
DS-GA 1003 | Machine Learning | 3 |
DS-GA 1004 | Big Data | 3 |
DS-GA 1006 | Capstone Project and Presentation | 3 |
| 3 |
| |
| Inference and Representation | |
| Deep Learning | |
| Natural Language Processing with Representation Learning | |
| Natural Language Understanding and Computational Semantics | |
| Mathematical Tools for Data Science | |
| Optimization and Computational Linear Algebra | |
| Text as Data | |
| Computational Cognitive Modeling | |
| Responsible Data Science | |
| Probabilistic Time Series Analysis | |
| 18 |
Total Credits | 36 |
Industry Concentration
Students have the opportunity to pursue a specialization through the Industry Concentration. This concentration is designed to respond to the needs of, and input from, companies, and it allows MS in Data Science students to apply the knowledge and skills obtained in their coursework to industry during the degree program. It requires more industry-targeted coursework and a Practical Training experience. Students in this concentration are required to take the following courses as part of the 36-credit requirement:
- DS-GA 1009 Practical Training for Data Science within the first year of the program (3 credits in fall, spring, or summer), and
- Two electives within the Big Data or Natural Language Processing subject areas (6 credits).
  - The courses below fall within the Big Data subject area. This list is approved and reviewed annually.
    - DS-GA 1012 Natural Language Understanding and Computational Semantics
    - CS-GY 6313 Information Visualization
    - CS-GY 6323 Large-Scale Visual Analytics
    - CS-GY 6083 Principles of Database Systems
    - DS-GA / CSCI-GA 2433 Database Systems
    - CS-GY 6093 Advanced Database Systems
    - CSCI-GA 2434 Advanced Database Systems
    - CSCI-GA 2436 Realtime and Big Data Analytics
    - CSCI-GA 2437 Big Data Application Development
    - CSCI-GA 3033 Special Topics in Computer Science: Cloud and Machine Learning
    - CSCI-GA 3033 Special Topics in Computer Science: Introduction to Deep Learning Systems
    - INTG1-GC 1025 Database Management & Modeling
    - TECH-GB 2350 Robo Advisors & Systematic Trading
    - MATH-GA 2047 Trends in Financial Data Science
  - The courses below fall within the Natural Language Processing subject area. This list is approved and reviewed annually.
    - DS-GA 1011 Natural Language Processing with Representation Learning
    - DS-GA 1012 Natural Language Understanding and Computational Semantics
    - CSCI-GA 3033 Statistical NLP
    - DS-GA 1005 Inference and Representation
    - DS-GA 1008 / CSCI-GA 2572 Deep Learning
    - DS-GA 1015 Text as Data
    - CSCI-GA 2590 Natural Language Processing
    - CSCI-GA 3033 Special Topics in Computer Science: Learning with Large Language and Vision Models
All other requirements remain the same.
Capstone (DS-GA 1006)
One of the key features of the MS in Data Science curriculum is a capstone project that makes the theoretical knowledge gained in the program operational in realistic settings. During the project, students go through the entire process of solving a real-world problem: collecting and processing real-world data, designing the best method to solve the problem, and finally implementing a solution. The problems and datasets come from real-world settings identical to those that might be encountered in industry, academia, or government.
Learning Outcomes
Upon successful completion of the program, graduates will have:
- The knowledge and skills needed to develop new methods for data science.
- The skills needed to use automated analytical tools and cut through the noise to create knowledge from big data.
- The ability to solve a real-world problem from start to finish: collecting and processing real-world data, designing the best method to solve the problem, and finally implementing a solution.
- An awareness of the social and ethical implications of data-driven methods, alongside tools to address and mitigate associated biases.
Policies
NYU Policies
University-wide policies can be found on the New York University Policy pages.
Graduate School of Arts and Science Policies
Academic Policies for the Graduate School of Arts and Science can be found on the Academic Policies page.