Standardization and Transparency in Data Science Masters Degree Programs

STIDS Pilot Study

STIDS WORKING GROUP OVERVIEW

With the rapid emergence of data science degree programs, it has become challenging for students, faculty, staff, and administrators to meaningfully compare program offerings. To start addressing this challenge, the Academic Data Science Alliance (ADSA) established the Standardization and Transparency in Data Science Degrees (STIDS) Working Group to characterize data science education programs at a variety of levels and create a careful taxonomy of data science competencies.

From 2021 to 2022, the STIDS Working Group sought to define a roadmap for program transparency to better enable students, employers, and the universities themselves to understand the skills, student learning outcomes, and program best practices available across the spectrum of programs and degrees. The working group started by surveying the landscape of Masters in Data Science programs and educational frameworks, then turned the resulting information into a survey for those implementing Masters programs in data science (or similarly named degrees).

The ultimate goal of the survey and this project is to provide a venue for data science programs to illustrate how their programs meet the core needs described by existing frameworks, and to highlight what makes their programs unique. Providing this information in an open venue can create a dialog among programs, and with learners and employers, about best practices for data science education.

Below, we present the results of our initial pilot project: information from 21 unique Data Science Masters degree programs. If you would like to contribute information about your institution's master's program, please respond via this form. We plan to continue to collect data from institutions and offer new ways for potential students, program administrators, and employers to browse and search the data.

If you have any questions or comments, please reach out to us at steve@academicdatascience.org.

PILOT PROJECT SUMMARY

The Academic Data Science Alliance’s Standardization and Transparency in Data Science Degrees (STIDS) Working Group seeks to help clarify the landscape of degree programs in the field. As part of this process, ADSA distributed a survey to data science programs in the US, asking for information about programs across a variety of topic areas. In a number of the topic areas, especially the more traditional and technical ones (e.g. mathematics, statistics), there is strong uniformity in degree requirements. However, in areas that are less technically focused (e.g. ethics, collaboration), there is a great deal more variability in degree requirements. By presenting these data in a transparent manner, we hope to initiate conversations among degree program administrators about the structure of their programs. We also hope to provide a venue that allows future data science students to better understand program offerings at institutions of interest.

Pie chart of student area of knowledge

PREREQUISITES

Most of the responding programs require some background in statistics prior to starting the program (17 of 21), while fewer require background knowledge in calculus, linear algebra, and programming (15, 12, and 12 of 21, respectively). Beyond these core areas, programs reported additional prerequisites including data structures, English composition, databases, probability and inference, and domain-specific experience.
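To make the prerequisite counts easier to compare, the short sketch below converts them into percentages of the 21 responding programs. The counts are taken directly from the paragraph above; the variable and function names are illustrative only and are not part of the survey or its published data.

```python
# Counts reported above: number of programs (out of 21 respondents)
# that require background in each prerequisite area.
TOTAL_PROGRAMS = 21
prerequisite_counts = {
    "statistics": 17,
    "calculus": 15,
    "linear algebra": 12,
    "programming": 12,
}


def share_requiring(count, total=TOTAL_PROGRAMS):
    """Return the percentage of programs requiring an area, rounded to one decimal."""
    return round(100 * count / total, 1)


# Illustrative helper structure: area -> percentage of responding programs.
prerequisite_shares = {
    area: share_requiring(count) for area, count in prerequisite_counts.items()
}

for area, pct in prerequisite_shares.items():
    print(f"{area}: {prerequisite_counts[area]} of {TOTAL_PROGRAMS} ({pct}%)")
```

Read this way, roughly four in five responding programs require statistics, while a little over half require linear algebra or programming.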

FOUNDATIONS OF DATA ANALYSIS

Among Data Analytics, Statistics, and Data Modeling, most of the topic areas are required by most programs. The exceptions are Artificial Intelligence, Data Collection Design, and Model Risk and Mitigation Strategies, which have a more even spread among required, elective, and required for a specialization. We see major differences among programs in the Mathematics topic areas, many of which are not required, and most of which are elective. Only Calculus and Matrices and Basic Linear Algebra show strong consensus as requirements.

Bar graph of data analytics and statistics

Bar graph of mathematics

SYSTEMS AND IMPLEMENTATION

Data Structures is the most commonly required topic area for Computing and Computer Fundamentals, with other areas more commonly offered as electives or not required at all. Most Data Engineering topic areas are fairly evenly split between being required for the degree and being offered as an elective, with the exception of Data Preparation and Cleaning, which is almost universally required. The two topic areas offered for Software Development and Maintenance are also usually required.

Bar graph of computing fundamentals, data engineering, software development

DATA SCIENCE IN PRACTICE: PROFESSIONAL PRACTICE AND RESPONSIBLE DATA SCIENCE

More than half of programs require some coursework in Responsible Practices with Data and Ethics, though the breadth of topic areas in our framework is not often covered completely by required coursework. For example, Data Analysis for Security is relatively uncommon, as are FAIR and CARE principles. We see large differences among programs in the area of Effective Collaboration in Teams, with most programs requiring coursework in project management and collaborating with partners, but relatively few requiring coursework in infrastructure costs, product management, and DevOps. Basic communication skills are commonly required, though the range of courses through which these requirements are met is very broad (e.g. Statistical Programming, Communicating with Data, or Capstone courses).

Bar chart of responsible practices with data

DATA SCIENCE PROJECT DESIGN

Results around Data Science Project Design are similarly mixed with respect to degree requirements. Topic areas under Users and Impacted Group and Open Science by Design typically have mixed levels of requirement, while those under Research Methods, Data, and Visualization tend to be required to complete the degree.

Bar graph of results of data science project.

CONCLUSION

We hope these results, and the data we share, can provide a window into the differences and similarities among master's-level data science programs. We encourage other institutions to share information about their programs as well, to help the academic data science community better understand the current state of these programs and to help those developing programs in the future.

REFERENCES

Computing Competencies for Undergraduate Data Science Curricula, ACM Data Science Task Force

National Academies of Sciences, Engineering, and Medicine 2018. Data Science for Undergraduates: Opportunities and Options. Washington, DC: The National Academies Press. https://doi.org/10.17226/25104.

EDISON Data Science Framework: Part 1. Data Science Competence Framework (CF-DS) Release 3

Learning-outcomes-for-Masters-level-Data-Science-Programs(2021-03-17) (unpublished survey responses collected by this working group)

Practice of Data Science, Defining a Field, a School and a Curriculum, Brian Wright, University of Virginia

Fayyad, U., & Hamutcu, H. (2020). Toward Foundations for Data Science and Analytics: A Knowledge Framework for Professional Standards. Harvard Data Science Review, 2(2). https://doi.org/10.1162/99608f92.1a99e67a

CONTRIBUTORS

Cathy Anderson (University of Virginia)

Purush Papatla (University of Wisconsin, Milwaukee)

Brian Wright (University of Virginia)

Alexandra Johnson (Washington State University)

Micaela Parker (ADSA)

Sungjune Park (UNC Charlotte)

H. V. Jagadish (University of Michigan)

Nairanjana Dasgupta (Washington State University)

Abani Patra (Tufts University)