Who IS the Data Science Community?

15 June 2021

From the desk of Laura Norén

We call it the Data Science Community Newsletter because we believe that when all of us read the same updates, research advances, and commentary, we develop a shared imagination of what counts as data science and who gets to have a seat at the data science table. The DSCN is our almost literal attempt to get data scientists on the same page. In order to stay community-focused, we source a lot of our material from readers’ Twitter accounts and the departmental home pages where readers work. Your posts about data science are a good proxy for your thoughts about data science. We like to signal boost what’s already on your mind.

What do we know about who reads the newsletter?

There are over 8,500 subscribers, mostly based in the US (75%).

Canadians, Brits, and Germans make up 3% each of readers opening the newsletter with another 2% of newsletter opens coming from France. We also see readers in India, Spain, Australia, Brazil, Chile, and Ireland.

Speaking of opening the newsletter, anywhere from 25% to 30% of you open any given newsletter, which is pretty good engagement for an email newsletter. Still, we see a lot of room for improvement in our “open” rate.

DSCN readers ❤️ higher education 

Our readership is dedicated to formal education. Readers, congratulations on all of your formal education.

Almost half of you have PhDs (47%) or MDs (1%). That’s way above the US national level of about 2% of the population having either a PhD or MD. Another ~40% of you have some flavor of master’s degree (compare that to 11% of the overall US population). Add it up and 86% of the responding readers hold graduate degrees. Way to write those dissertations and theses. (You are my people - there’s no doubt about that.)

Our readers also have strong intentions to continue their data science training. Considering how well-educated readers already are, it is surprising that 17% plan to get another formal university degree and 7% plan to get a certificate from a university.

A quarter plan to attend a 1-6 day workshop or other training session, about half (47%) plan to take an online course (Coursera, Udacity), and almost 70% plan to learn more about data science on StackOverflow, GitHub, or YouTube as needed. 

Takeaway: With all the continuing training among a group of highly educated people, I sense the need for a professional data science organization that can identify relevant material for continuing professional education (CPE) credits, similar to what physicians, dentists, and architects have to do to retain their professional standing. There has to be a good balance between informal training on StackOverflow and YouTube and spending the time and money to get another formal university certificate or degree.

Where do DSCN readers work?

Almost half (47%) of our readers work in higher education, well above the US national rate (2%). Makes sense. We write for the academic data science audience. Readers working in tech and finance are also over-represented in the DSCN readership, and there is healthy representation from non-profits and government agencies.

The DSCN readership is radically interdisciplinary

While the Computer Science + Data Science (CS+DS) category is where the largest number of readers got their highest degree (29%), that group does not constitute a numerical majority of readers. That group doesn’t even make up one-third of our readers. This is one key differentiating fact about our readership - it is radically interdisciplinary*. The social science group is roughly tied for second place with the stats/math group at ~13%. Then there are six disciplines that are similar in size - between 4 and 8% of the total. None of this surprises us, but it often surprises outsiders who assume that data science/AI is done by people with CS degrees. 

*Radically interdisciplinary groups are groups in which no single discipline makes up 50% of the membership, there are at least five disciplines representing at least 5% of the membership each, and the disciplines in the group use different methods and bodies of theory. In our readership, many groups use data science in their methodology, but there is methodological divergence in terms of how data are collected, in how data relate to theory, in whether/how data science is applied, and in which literatures are considered canonical. There can also be differences in the way data are seen to be related to human subjects with some disciplines seeing data as intrinsically and inextricably linked to human identity, thus potentially being covered by human rights law and other ethical expectations designed to cover humans. Other disciplines may rarely use data derived from humans (astronomers) or may see data as wholly distinct from the humans who generate it, thus outside the purview of laws and ethical principles designed to address humans and human rights. 
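For readers who would rather see that definition as code, here is a minimal Python sketch of its two quantitative criteria. The function name, parameter names, and discipline labels are illustrative assumptions. Only the 29% and ~13% shares come from the survey; the six smaller values are placeholders picked from within the 4-8% range reported above, not actual results. The third criterion (different methods and bodies of theory) is qualitative and is not checked here.

```python
def is_radically_interdisciplinary(shares, majority=0.50, min_share=0.05, min_disciplines=5):
    """Check the two numeric criteria from the definition above.

    shares: dict mapping a discipline label to its fraction of the membership (0 to 1).
    """
    # Criterion 1: no single discipline makes up 50% of the membership.
    no_majority = all(share < majority for share in shares.values())
    # Criterion 2: at least five disciplines hold at least 5% of the membership each.
    sizable = [name for name, share in shares.items() if share >= min_share]
    return no_majority and len(sizable) >= min_disciplines


# Hypothetical input: the first three values are from the survey,
# the remaining six are placeholders within the reported 4-8% range.
example_shares = {
    "cs_ds": 0.29,
    "social_science": 0.13,
    "stats_math": 0.13,
    "discipline_a": 0.08,
    "discipline_b": 0.07,
    "discipline_c": 0.06,
    "discipline_d": 0.05,
    "discipline_e": 0.05,
    "discipline_f": 0.04,
}

readership_qualifies = is_radically_interdisciplinary(example_shares)
```

A group in which one discipline holds 60% of the membership would fail the first check no matter how many smaller disciplines are present.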

The DSCN readership is possibly ready to get out again

We wanted to know when we might start seeing DSCN readers at events as we emerge from what was hopefully the worst stretch of the pandemic. We realize that pandemic recovery is proceeding at different paces in different places and that the pandemic is not over yet. We went back and forth on differentiating between in-person events and web-based interaction, but eventually gave up and simply asked about readers’ expectations for any kind of event attendance in the next 12 months. The poll was conducted in April/May, so there was still a fair amount of confusion about the state of the pandemic, WFH policies, and what mental/physical/emotional/sartorial/misanthropic state you’d be in after what may have been the most challenging year of your professional life.

It looks like many of you are planning to get back to your top 2-4 events, though 15% of you didn’t know enough about your plans for the next 12 months to hazard a guess. 

There are still more people canceling everything (9%) than going with the ‘say yes to everything’ approach (5%), but the majority appears to be ready for at least one event. 

As for organizing events, over a third of our readers are pitching in to help organize at least one event. Thank you for your service to our community. 

Summary - DSCN readers are radically interdisciplinary, mostly from the US, highly educated, probably working in higher education or tech, very interested in getting more education and training, and ready to attend events.

Now that you know who our readers are, you may want to know what they want more of and less of from the DSCN crew. We do love your funny, creative, occasionally cranky responses. Every last one of them.

Reaching the DSCN audience

Perhaps you want to reach our super smart, passionate, opinionated, event-attending audience. They have a lot going for them. We do have a few opportunities for sponsors. Please email DSCN editor laura.noren@nyu.edu to find out more. You may also want to read the “What DSCN readers want” blog post while you’re waiting for a response. That post has even more insights into our audience and our plans for continuing to build the academic data science community.
