MSDSE Summit - 2019

November 5-7, 2019

Santa Fe, New Mexico


The Moore-Sloan Data Science Environments' annual summit will be held on November 5-7, 2019 in beautiful Santa Fe, NM. This year, the event is brought to you by the Academic Data Science Alliance (ADSA), together with our partners at the Moore-Sloan Data Science Environments. This meeting brings together academic data scientists and researchers from all disciplines and career stages to share research advances in methodologies and domain applications, as well as programmatic and educational approaches to data science. The Summit will include lightning talks, demos, poster sessions, and collaborative discussions.

This meeting is a great opportunity to connect with colleagues, make new connections, and share research advances. This year we will have a few new faces as we start to expand our community to other universities and institutions.

This year’s Summit will be hosted at the Drury Plaza Hotel in Santa Fe, NM, about a one-hour drive from the Albuquerque international airport (ABQ).

2019 MSDSE Summit Program

Tuesday November 5

after 3:00pm



Registration opens (outside Palace)

5:30pm - 6:30pm

Dinner and Welcome (Palace)

6:30pm - 7:00pm

Pub quiz ice-breaker

7:15pm - 8:00pm

Keynote speaker: Andy Rominger, Santa Fe Institute (Palace)

Wednesday November 6

7:30am - 8:55am

Breakfast at your hotel (Drury guests: 2nd floor mezzanine)


Registration desk opens (outside Palace)


9:00am - 9:55am

Breakout 1

Career Paths in Data Science - Michael Laver (O'Keefe)

The Pangeo Project - Amanda Tan (Riviera)

Data Science Capstone Research Programs - Anthony Suen (Lamy)

10:00am - 10:20am

Break (outside Palace)

10:25am - 11:20am

Breakout 2

Diversity and Inclusion in Data Science - Sara Stoudt (Lamy)

Managing a Productive Data Science Team - Ciera Martinez (Riviera)

Goodhart’s Law: Are Academic Metrics Being Gamed? - Michael Fire (O'Keefe)

11:25am - 11:40am

Group Photo (Meet in Drury Lobby)

11:45am - 12:55pm

Lunch with Poethon (aka Poetry Slam) (Palace)

1:00pm - 1:55pm

Tools and Demos - TagWorks: Best ever textual training data by the crowd - Nick Adams (O'Keefe)

Tools and Demos - apricot: Submodular selection for summarization of large data sets - Jacob Schreiber (Riviera)

Tools and Demos - Making open datasets more accessible with Gigantum - Dav Clark (Lamy)

Lightning Talks (Palace):

  • Wiki-Atlas: An augmented reality approach to viewing wikipedia content on the go - Anastasios Noulas
  • The challenge of analyzing not-your-own-data: documentation lessons and opportunities from high-energy physics to homelessness support - Matt Bellis
  • A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution - Qing Qu
  • Mapping Marine Picophytoplankton Biogeography using Statistical Learning - Corinne Jones
  • What Systems Biology Can Learn From Software Engineering - Joe Hellerstein
  • Data-driven de-identification of clinical notes - Vikas Pejaver
  • Enabling science through better data management - Diya Das

2:00pm - 4:00pm

Free Time - afternoon snacks available (outside Palace); optional hikes, art walk, SOS Trip (*sign-up)

4:00pm - 4:55pm

Breakout 3

Unifying deep learning with item response theory: interval measurement, debiasing, efficiency, and explainability - Chris Kennedy (Riviera)

Best Practices for Best Practices - Stuart Geiger (O'Keefe)

Building Local Research Communities through Special Interest Research Groups - Valentina Staneva (Lamy)

5:30pm - 6:30pm

Dinner (Palace)

6:30pm - 8:00pm

Poster Session (Palace)

  • Gradient Group Lasso Identifies Sparse Functional Basis for Molecular Manifolds - Samson Koelle
  • Using Data Science to Understand the Film Industry's Gender Gap - Dima Kagan
  • 3D Organ Shape Reconstruction from Topogram Images - Elena Sizikova
  • Nonconvex approaches for sparse deconvolution problems - Qing Qu

Demo (Palace)

  • Reproducible Open Benchmarks for Data Analysis - Heiko Mueller, Irina Espejo, Kyle Cranmer

Thursday November 7

7:30am - 8:55am

Breakfast at your hotel (Drury guests: 2nd floor mezzanine)


Registration desk opens (outside Palace)


Seed grant proposal review (private meeting; Meem)

9:00am - 9:55am

Breakout 4

Collecting resources on best practices for scientific software development - Lindsey Heagy (O'Keefe)

Open Long-Tailed Recognition in the Real World - Stella Yu (Riviera)

Data Science Consulting as a University Service - Kyle Cranmer (Lamy)

10:00am - 10:20am

Break (outside Palace)

10:25am - 11:20am

Breakout 5

The side project you love and ignore - Ciera Martinez (Riviera)

Exploring the future of the hackweek - Anthony Arendt (Lamy)

Transitioning Open Source projects to a self-sustaining organizational model - Matti Picus (O'Keefe)

11:30am - 12:45pm

Lunch (Palace); Core Team Lunch Meeting (Meem)

1:00pm - 1:55pm

Tools and Demos - Docker for Research and Pedagogy in Data Science - Sang-Yun Oh and Alex Franks (O'Keefe)

Tools and Demos - Intro to Julia - Stefan Karpinski (Riviera)

Tools and Demos - Public Editor: The Collaborative Way to Credible News - Nick Adams (Lamy)

Lightning Talks (Palace):

  • Learning on the job at a hypergrowth unicorn - Allison Smith
  • Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening - Krzysztof Geras
  • An artificial neural network controller for insect flight - Callin Switzer
  • GEOME Evolution Blockchain - Neil Davies
  • Echopype: Enhancing the Interoperability and Scalability of Ocean Sonar Data Processing for Biological Information - Valentina Staneva
  • Reflections on machine learning for (bio)acoustics - Justin Kitzes
  • Investigating and Archiving the Scholarly Git Experience - Vicky Steeves

2:00pm - 2:55pm

Breakout 6

Special Interest Group in Quantitative Cell Biology and Communities (QCBC) - Joseph Hellerstein (Riviera)

Communities of practice for Jupyter deployments on shared infrastructure - Jim Colliander, Fernando Pérez (O'Keefe)

Talking With the Public About Data Science - Meredith Broussard (moderator), Joshua Tucker, Andrea Jones-Rooy, Sara Stoudt (Lamy)

3:00pm - 5:30pm

Free Time - afternoon snacks available (outside Palace); optional hike/walk, trip to Meow Wolf (*sign-up)

6:00pm - 9:00pm

Dinner, Joint Summits Evening Program with Special Lightning Talk Session (Palace)

Lightning Talks:

  • How Universities can Attract and Retain Data Scientists - Nick Adams
  • Sustainability and profitability - Dav Clark
  • Making the Most of Your Data Science Institute - Diya Das and Daniela Huppenkothen
  • Scoping Research Engineer work - Andreas Mueller
  • Is academic data science diverse? Data from 42 data science centers, institutes and think tanks - Laura Noren

Friday November 8

7:30am - 8:55am

Breakfast at your hotel (Drury guests: 2nd floor mezzanine)

9:00am - 11:00am

Informal Meeting Time



Program Abstracts

Explore abstracts from talks given at the 2019 MSDSE Summit


ADSA Executive Director 
Micaela Parker - micaela at academicdatascience dot org

ADSA Community Coordinator
Steve Van Tuyl - steve at academicdatascience dot org

ADSA Program Assistant
Megan Atkinson - megan at academicdatascience dot org

Berkeley Summit contact
Marsha Fenner - mwfenner at berkeley dot edu

NYU Summit contact
Emily Corona - emily.mathis at nyu dot edu

UW Summit contact
Sarah Stone - exec-director at escience dot washington dot edu