The Duties of Data Journalism

Data journalism is duty-bound to the public to tell their stories in ways that challenge rather than reify existing harmful social structures, ideologies, and constructs.

Journalists who use data analysis tools and data visualizations, or who create their own datasets to analyze, have a responsibility to the communities whose stories they are telling. These communities are sometimes much wider and farther-reaching than those traditional journalism typically deals with: a dataset can include thousands of people, while traditional journalism involves interviewing a small number of people, as Alex Howard points out in his keynote on “Data Journalism in the Second Machine Age.”

This extends the journalist’s responsibility across entire segments of the population. That responsibility must be taken with the utmost gravity, because data-driven journalism is often perceived as the definitive answer, and numerical statistics – no matter how they’ve been manipulated – tend to be cited as absolute, irrefutable proof of a fact. This is significant because social issues change and evolve; they are not fixed physical constants. Our social mores, constructs, and contracts, and our society itself, are ever-changing and must be presented as such – as variable and malleable instead of static.

As Kevin Guyan points out, these same factors are themselves shaped by the collection of data; indeed, “social identities” can be and often “are partly constructed through data collection practices” (Guyan, 51). This intensifies the magnitude of data journalism’s obligation to the communities it represents and reports on. Unethical and biased data collection practices do not merely cause harm to, say, queer communities – they actively shape the wider public’s perception of the LGBTQIA2S+ community and its members’ day-to-day lived experiences, and indeed help mold the social identities of queer and trans individuals.

In “Data Set Failures and Intersectional Data,” Nikki Stevens asks us to consider the question, “Can data, as a concept and/or as a material object, be intersectional?” (Stevens, 13). The question arises in an analysis of the many failures of a survey that limited people to several options for gender, sex, and sexuality, and that used terms inherently laden with bias, such as “gender non-conforming.” Terms like these suggest that those who are cisgender are “conforming” to society’s preset gender binary, while those who do not “conform” are othered by mere virtue of the wording of the question.

Stevens adds that the team was enlightened most by a final question which “asked users to add any other aspects of their identity they felt were important” (13). The realization that there is more to people than a limited number of data points, chosen from a limited, biased list, “reinforced for [them] the importance of self-identify boxes” (13). Often, what people find most important about themselves cannot be quantified; it’s qualitative. Such data can be compared to other datasets about different people, and attempts can be made to measure such variables, but ultimately people are complex systems, and their social identities are even more complex and can elude quantification or qualification.

Data journalism has an obligation to realize that within any given social system, there will be variables that remain unmeasured and others that are skewed by pre-existing biases. Transparency at all stages of data collection, analysis, and journalism, including being candid about what is being left out or sidelined, is paramount. The goal of journalism is to tell truths, and we must be cognizant that the truth can be biased towards existing social structures, constructs, and contracts.