CDN Seed Grants

ADSA's Career Development Network (CDN) runs an annual Seed Grant program designed to promote and support efforts by members of the CDN to thoughtfully incorporate data science best practices into university education, training and research programs.


Competitive proposals describe cross-disciplinary activities that will launch long-term endeavors to create new data science communities or broaden participation in existing ones. All projects are designed to ultimately benefit the academic data science community at one or more institutions, in one or more disciplines.


Examples of activities we fund include, but are not limited to: launching a Hackweek or XD (“across domain”) workshop in a discipline or at a university that has not previously hosted a similar event; hosting a meeting on topics such as ethics, privacy, career development, or data sharing; piloting a high-risk, high-reward data science project that other funders would not support and that helps create or grow a data science community; and developing open curricula for workshops or courses that will be broadly shared.

We are pleased to announce the 2021 Seed Grant winners!

Caselet+: Inclusive Data Science Problem Solving Skills Building at Scale

Lujie (Karen) Chen & Shimei Pan

Department of Information Systems, University of Maryland Baltimore County

Data science problem solving (DSPS) enables learners to leap from the commonly taught component knowledge in methods (“know-what”) or tools (“know-how”) to higher-order decision making on “what to do, when, and why”. Mastery of DSPS requires a large amount of deliberate practice in a variety of application domains in order to realize skill transfer in real-world settings. It often takes years of apprenticeship with experienced data scientists, an approach that is hard to scale and to make inclusive given the limited supply of qualified mentors in low-resourced communities. In this proposed project, we aim to develop a scalable training platform that may facilitate inclusive skill building for DSPS. This project builds upon the concept of “caselets”, which are “bite-sized” case studies authored by experienced data scientists, featuring learning science-inspired and auto-graded exercises to support the deliberate practice of DSPS at a large scale. The proposed work will expand the caselet repository with aspects of DSPS that ensure positive societal impact (e.g., algorithmic fairness, transparency, accountability, privacy, and security). In addition, the project supports the development of a new set of “culturally relevant” caselets based upon structured interviews and a pilot study of the existing caselets with students from underrepresented groups at the research team’s home institution. The extension to problem solving that ensures positive societal impact may help train data scientists to develop inclusive data-driven solutions, while the “culturally relevant” caselets may encourage participation by students from underrepresented groups.

Data Science By Design: Best Practices of Visual Storytelling

Ciera C. Martinez

Berkeley Institute for Data Science, University of California, Berkeley

Sara Stoudt

Statistical & Data Sciences, Smith College

Váleri N. Vásquez

Energy and Resources Group, University of California, Berkeley

Data science has a critical visual component. An understanding of narrative through data visualization can help us share data science as a practice, through curricula, guides, and tutorials in data science education. With funding from ADSA, we will organize two one-day virtual events (in Spring and Fall 2021) that will support the creation of informative and design-conscious content for sharing best practices in data science. We are calling this event “Data Science by Design” because, in addition to a focus on aesthetic visual design principles, we believe that as a community we should be conscientiously developing the future of data science: a future built with diversity, inclusion, and open education as its guiding principles. Our events will empower and support attendees to produce visual content that effectively and artfully communicates the practice of data science research in the form of how-to guides, infographics, and zines. Further, these events will create a community. We will compile our community-developed products into a self-published anthology that will be freely available at datasciencebydesign.org and made available to the wider data science community in both digital and print form.

An Inaugural Data Science Summit at UC Santa Barbara

Alexander Franks

Department of Probability and Statistics

Allison Horst

Environmental Science and Management

Michael Beyeler

Computer Science and Psychological & Brain Sciences

University of California, Santa Barbara

Data-driven science is ubiquitous at the University of California, Santa Barbara (UCSB), with a growing focus on undergraduate and graduate education, research, and communities of practice. With funding from ADSA, we will organize and host a one-day inaugural Data Science Summit at UCSB in the Fall of 2021. The summit will, for the first time, convene data science practitioners, researchers, and educators from all departments across campus, with an overarching goal of creating a more cohesive, collaborative, and inclusive data science community. A combination of keynote talks, short lightning talks, a panel discussion, and a poster session will highlight the breadth of data science-related research, methods, pedagogy, and experiences at UCSB and in data science more broadly. Breakout “Birds of a Feather” sessions will bring together participants with similar interests and expertise (e.g., Machine Learning, Education, Diversity and Inclusion) to facilitate interdisciplinary collaboration and to serve as a launchpad for working groups that continue meeting beyond the summit for enduring engagement and impact. More information is forthcoming at datascience.ucsb.edu/summit21.

Our 2020 Winners

Due to the COVID-19 pandemic, our 2020 winners have all received no-cost extensions. Look out for these events and activities in 2021!

Building a Computational Social Science Research Group at Atlanta Area Research Institutions 

Emily Kalah Gade 

Department of Political Science, Emory University 

Nora Webb Williams 

Department of International Affairs, University of Georgia 

Diyi Yang 

School of Interactive Computing, Georgia Tech 

This project will kick-start data science and social science research collaborations in the Atlanta area through a three-phase strategy involving workshops, a datathon, and extended research consultations. First, we will host a week-long basic NLP/computer vision training coupled with a datathon that brings together graduate students from across the social and computer sciences at Emory University, Georgia Tech, the University of Georgia, and Georgia State. We will host a Software Carpentry workshop the week prior so that students who have no computational social science background can gain their footing before our specialized workshop. Second, following the datathon, we will award five mini seed grants to the most promising research teams to enable them to move toward project completion. Each seed grant will come with data science consultations with the PIs. Finally, eight months later, we will host a one-day workshop for papers that originate from the datathon, to provide formal feedback and move toward publication. Together, these aims will expose up to 50 graduate students in the Atlanta area to data science skills, provide in-depth training and mentorship for up to 25 students who continue working on their data science products, and provide three junior women scholars at premier Atlanta area institutions with the opportunity to build out computational social science courses that will then be offered at their home institutions. Additionally, the project will strengthen interdisciplinary and inter-institutional bridges between computer and social scientists at Emory, Georgia Tech, and UGA, and provide research funding for promising computational social science graduate student projects.

Advanced Scientific Programming in Python

Nelle Varoquaux

TIMC, CNRS, France

Tiziano Zito

Psychology Department, Humboldt-Universität, Germany

Juan Nunez-Iglesias

Monash Micro Imaging, Monash University, Australia

Software development plays an increasingly important role in data-driven discoveries. Yet most data scientists are not trained in modern software engineering concepts and best practices. This is true even of those who regularly write scripts and software to achieve their research goals. With support from ADSA, we will run a week-long summer school and create online materials to address this disparity. This course fills a gap in the curricula and workshops currently offered to graduate students and postdocs by teaching advanced programming skills, such as parallelization, advanced numerical programming, and profiling and optimization. We will provide hands-on experience with the subset of these techniques that matter most to the participants, teaching them clean language design, extensibility of code, and good practices in scientific computing and data visualization.

AudioXD: Seeding a new multi-disciplinary, multi-university network of data scientists working with audio recordings

Justin Kitzes

Department of Biological Sciences, University of Pittsburgh

Brian McFee

Department of Music and Performing Arts Professions and Center for Data Science, New York University

Daniel Turek

Department of Mathematics and Statistics, Williams College

One of the central goals of the MSDSEs has been the creation of multi-disciplinary and multi-university networks of researchers that are united by common data science interests, methods, and challenges. Three alumni of the MSDSEs were awarded funding to seed a multi-disciplinary, multi-university network of researchers focused on the challenges of collecting and analyzing audio data. The proposed work has two primary objectives: the formation of a cross-domain professional network of academic data scientists analyzing audio data, and the submission of a large grant proposal to sustain the future work of this network. The first objective will be met by a 25-30 person AudioXD workshop that will be designed explicitly to foster the creation of new collaborative research project ideas. The second objective will be met by follow-up activities completed by a multi-disciplinary core team, including the PIs, drawn from those who attended the AudioXD workshop. Over the 3-6 months following the AudioXD workshop, this smaller core team will draft at least one large grant proposal to support future collaborative work and then convene for a second meeting, a hackathon-style working session where they will complete the proposal.