Data Science Ethos Lifecycle

New Data Science Ethos Paper!

Members of the Data Science Ethos Working Group have published a paper describing our work. Congrats to Margarita Boenig-Liptsin (Institut d'Études Avancées de Paris), Anissa Tanweer (University of Washington), and Ari Edmundson (University of California, Berkeley) on the publication, and thanks to the entire working group for all the hard work!

Overview

The data science lifecycle model is a ubiquitous tool for describing the stages of research in a typical data science project. While helpful for illustrating parts of the research process, lifecycle tools almost universally omit ethical considerations and societal contexts. By abstracting away these broader contexts, lifecycle models fail to capture how data scientists think and the kinds of questions they must address in real-world data science work. We see a need for a data science framework that includes explicit societal contexts and makes questions of social good actionable. The result will be a more true-to-life model of the data science lifecycle that shows how societal questions are a constitutive part of the day-to-day work of a data scientist.

Data Science Ethos Lifecycle screen capture

A mockup of the Data Science Ethos Lifecycle

ADSA’s Data Science Ethos Working Group has designed this new lifecycle model, including example case studies, with participation from community members who are experts in data science and the social sciences. Our model has applications in formal and informal training, as guidance for research administrators seeking to move beyond regulatory frameworks for research compliance, and as a reference for practicing data scientists.

Pilot Phase

Elements of the Data Science Ethos Lifecycle tool have already been piloted in classrooms at multiple institutions and in a workshop for the UC Berkeley Human Contexts and Ethics Program. A description of the tool is currently under review at the Journal of Statistics and Data Science Education. The next major phase of this project is to continue gathering and refining an initial set of case studies to be included at launch, to complete the build-out of the interactive tool, and to partner with community members for piloting in classroom or research group settings. Efforts are underway to include a third case study, focused on clean water, in the initial launch of the lifecycle tool.

Example of the lenses through which we can interpret stages of the research lifecycle

Long Term Vision

ADSA’s vision for this project is that it will become a common touchpoint for data science instruction in higher education, starting with members of the ADSA community. To make the tool relevant to the needs of instructors and practitioners, future development should allow users to submit new case studies and match curricular artifacts to case studies, and should offer versions of the tool that can be used offline.

Contributors

  • Charlotte Cabasse-Mazel, Chair (dhCenter UNIL-EPFL)
  • Margo Boenig-Liptsin (UC Berkeley)
  • Cathryn Carson (UC Berkeley)
  • Caren Cooper (NC State University)
  • Alycia Crall (The Carpentries)
  • Tiana Curry (ADSA)
  • Ari Edmundson (UC Berkeley)
  • Anna-Maria Gueorguieva (UC Berkeley)
  • Eva Newsom (UC Berkeley)
  • Carlos Ortiz (UC Berkeley)
  • Stephanie Shipp (University of Virginia)
  • Anissa Tanweer (University of Washington)