Data Journalism as a Mechanism for Unraveling Systems of Oppression

This week’s readings about data journalism further complicated my understanding of data and how we use it. In the podcast interview with Mimi Onuoha and Lam Thuy Vo, they discussed the ways in which documentation can be a form of violence to already-marginalized communities. Documentation, in itself, is a way of reducing people or stories into something more easily understandable or measurable. It strips data of crucial context that is necessary to understanding the complexity of human systems, and it works to actively disempower people and knowledge. They also note that many of the questions we have as data journalists ultimately center on power: Who holds power in this data? Who loses power? Who holds power over this data? These questions are incredibly important to keep at the forefront when collecting and analyzing data, because data does not exist in isolation from human beings and their communities.

Building upon this argument, Nikki Stevens talks about the ways in which the concepts of “dirty” and “clean” data can further reinforce harmful stereotypes. She states that “a focus on a data’s cleanliness is a way of controlling which knowledge is ‘valid’ and is directly counter to intersectional aims” (p. 12). This quote struck me, as did our conversation about raw data as an oxymoron last week, because I’ve been so primed to accept the clean/messy data binary as an academic. Consequently, I never understood the ways in which this standardization can be used to eliminate some of the nuance that exists within and between human relationships. Taking an intersectional approach to data collection and analysis is a substantial goal; however, the author notes that the positivist assumptions made in quantification are somewhat incompatible with the broader intersectionality framework. The question I was left with from this article was: how can we reimagine data journalism in a completely new, black feminist structure that seeks to flip conventional power dynamics on their head?

During his talk, Alex Howard explained that data + journalism + activism + responsive institutions = social change. The articles previously discussed tackle the first three components of this equation by discussing how data journalism can be used as a form of activism that empowers vulnerable communities. However, I believe responsive institutions are one of the most important ingredients of social change. How do we make institutions responsive to us as the public? Understanding this accountability mechanism disrupts the status quo and requires the people to take power in a way that they previously were unable to. This, I believe, is the core role of data journalism: to shift the narratives that focus on outdated social and power structures so that they center voices that have previously been silenced. Data journalism has the capacity to reinvent how we think about numbers representing people while also highlighting the limitations of this logic.

Considerations of data collection and usage in data journalism

Data journalism can be valuable in aiding the understanding of complex issues. It has the potential to create policy change and hold wrongdoers accountable, using data to back its claims and build credibility. However, data can be used to cause harm if not handled properly. The readings below explore further concepts that should be considered in data collection and usage.

The Becoming Data: Data and Humanity podcast episode dives into the ethical and moral uses of data and how data can be used for good or evil. It uncovers how data collection practices are often not legal. The data collected is then subject to the collector’s own morality, and through this the podcast exposes that data can be used for harm. The speakers bring a vital perspective by asking how we can use data to nurture instead of punish. They raise important aspects for data journalism to consider: where the data is coming from and, ultimately, whether it will harm certain groups when exposed.

Alex Howard’s talk on Data Journalism in the Second Machine Age discusses the new technologies that can be applied and how, as technology evolves, data is being integrated in place of people. Like the Becoming Data podcast, he raises valid points about the ethics of using people’s data. Howard uses several examples to highlight that the privacy and security of sensitive data must still be protected and that such data cannot be released, much as a journalist would keep a source anonymous to protect their identity.

Data Set Failures and Intersectional Data by Nikki Stevens discusses intersectionality, an important analytical framework that can be used in data journalism. Stevens discusses the eight common lifecycle phases of data and where failures have been seen. Intersectional research can aid data journalism by exposing power structures and systemic and social inequalities.

Data journalism often addresses complex issues, and because of this it is important to ensure the data is credible, fact-checked, and transparent while protecting the identity of those involved. It is important that data is not used to perpetuate systemic biases through collection methods. Data can be used for harm, but it can also be used to hold certain groups accountable (e.g., governments) or to facilitate public policy, while aiding and elevating marginalized voices.

Accountability and Data Journalism

Data journalism has an important role in redefining data as non-neutral, and therefore assumes a huge responsibility to the public. Nikki Stevens discusses the lack of “neutrality” inherent to data in their article “Data Set Failures and Intersectional Data” – even in cases where data is collected for more ethical and intersectional goals. Stevens asks us to consider if data can itself be intersectional – to which I say that data as it interacts with other data can perhaps be intersectional, but likely not on its own. In “Moving Targets: Collecting Queer Data,” Kevin Guyan raises some critical questions related to the data collection process in writing about the necessary contextual framing and interpretation of data with regard to space, time, other participants, and researchers. In their piece, Guyan asks: “in moments when data is captured, whose interests are prioritized? The interests of individuals or groups about whom the data relates (in other words LGBTQ people) or the interests of those who possess the power and resources to collect the data?” I believe that more often than not, data is captured and shaped by the interests of those who possess the power and resources to collect the data. The mere act of collecting the data leads many to assume that the interests of the groups to whom the data relates are reflected in the data, even if that may not be the case… which can have dangerous implications in attempts to relay an “objective truth” that is at the same time being constructed and self-validated. There is no denying that data is always situated in the context of the collector. The “queering” of collection methods in data journalism involves redefining the relationship between researcher and participant, through communal and consensual processes of building knowledge in non-hierarchical ways.
It also asks for more fluidity in the “moving targets” of identity, reinforcing the dynamic nature of who we are and how we change as humans throughout our lives. Above all, data journalism has a responsibility to take time to think and share about how journalistic intent shapes the data collected. It’s crucial to be honest about positionalities when presenting a data journalism project, and to consider the context in which participants may view themselves in relation to others in the act of attempting to codify one’s own identity. Given the ability of data to not only shape our reality but also present itself as “natural”, there is an immense responsibility to the public to be accountable when we recognize failures in data collection practices. Stevens’s article is an example of what this accountability can look like. “One Size Fits Man” by Caroline Criado-Perez exemplifies the wide-ranging consequences of misogyny and patriarchy as they relate to women’s experiences with technology that was simply not built for them. The power dynamics of domination are reinforced in datasets that center men more than women, and misrepresent women in the representation that does exist. Sometimes, the existence of data at all creates attitudes that disregard the very real critiques made in hopes of data collection that enables building technology for all people, as shown in the example given by Criado-Perez about Tom Schalk’s offensive and lazy response to reports of faulty voice technology in car navigation systems. A nuanced approach in data journalism is essential to reinforcing the importance of intersectional voices.

The Role of Data Journalism

The role of data journalism is to provide qualitative and quantitative reporting, empower the public, and hold institutions accountable. There is a negotiation and balancing of the relationship to power between journalists, institutions, and the public. In an ideal world, data journalism would be “the collection, protection, and interrogation of data as a source complementing traditional investigative reporting (witness, experts, and authorities),” in service of creating a healthier existence for humanity. This extended definition from Alex Howard’s lightning talk “Data Journalism in the Second Machine Age”, along with his declaration, “data plus journalism plus activism plus responsive institutions equals social change,” establishes a way to look at the interconnected nodes in data journalism that require it to keep a power-based lens central.

In Episode 1 of the Data & Society Podcast “Becoming Data: Data and Humanity,” Lam Thuy Vo discusses two ways of looking at data/datasets. One type of data is created naturally as people live and is collected by platforms, institutions, and individuals, while the other is intentionally made by individuals to shape their public self. Since the formation of this data is contingent on the public, concerns about and frameworks for maintaining privacy, security, ethics, and transparency should be present. Data is not neutral; how it is collected, used, analyzed, and preserved indicates the relations that exist around it. In collection, we consider the “‘reality’ of what collections methods can uncover and the impact of these methods on the data collected, participants, and researchers” (Guyan, 61). The ‘reality’ and impact are present during each part of the data lifecycle. While the data itself is paramount to data journalism, institutions play just as great a role.

Institutions that do not embrace intersectionality, and that are built upon colonial, extractive practices, will not aid data journalism in its role to understand our world and make it more just. If institutions are not questioning accessibility, the financial interests present, the normative implications built into how the public interacts with them, the rigidity of how the public must identify, and so on, how can they be sure that the data they collect and house is accurate to the ways in which the public wants to be represented? Dismantling present structures of inequality in the institutions that data journalists rely on when not making their own datasets requires an institutional rethinking of documentation practices. How is the data constructed, what is deemed knowledge, and are we removing the context from the data?

Data journalists in the role of change agents (for the betterment of society) are then tasked with creating and maintaining relationships with the public they wish to analyze, with holding institutions accountable to the public, and with creating pathways for the public to advocate for themselves. Data journalists and the field of data journalism must navigate these challenges and complexities to contribute to the greater good.

Embedded Bias and the Data Journalist’s Role to Provide Clarity

One key aspect that underscores the importance of data journalism is the role it has undertaken in society. One of the main functions of a journalist is to always question. And data, in the context of journalism, provides a substantive resource that fills the void created when questioning whether something is credible or not. It provides a sense of rationality to the reader when the journalist makes a claim. However, it’s important to note that as a data journalist, when you begin to work with data you are inherently using a tool rooted in bias: culture, context, and society are woven into data, whether recognized or not. And when presenting this data to the public, through a medium that is often perceived as unbiased, it is your responsibility to address this. The role of the data journalist in today’s society is to provide quantitative and qualitative evidence for any particular claim, and it is their responsibility to be as inclusive as possible, particularly when addressing social issues.

In Chapter 2 of Queer Data, Kevin Guyan supports this with the quote, “…data collection processes… are productive and highly political practices through which (only) certain LGB(TI)Q populations are counted.” As a data journalist, it is important to recognize this, especially if you are covering topics concerning the community. It should be noted that simply recognizing this bias is not sufficient. Ethical collection, awareness of gaps in the data, protecting at-risk communities from harm, and advocating for inclusion within the dataset are all factors a data journalist should keep in mind. Failing to account for these only furthers the cultural paradigm that pushes these communities to the fringes of the social spotlight.

This idea directly leads to what Criado-Perez describes in “Invisible Women: Data Bias in a World Designed for Men”. When we don’t address these “cultural norms” embedded into data, it is possible to create situations that directly harm individuals. Criado-Perez notes that in a 2016 paper there were “significantly higher transcription error rates for women than men.” Not addressing gender disparities has a significant impact on the livelihood of women. As a data journalist, if you were to use this paper, it’s your responsibility to uncover and address this disparity to avoid lending further “credibility” to that dataset. If you are not questioning the data, you are only doing half of your job as a data journalist.

Alex Howard highlights this concept in his talk, “Data Journalism in the Second Machine Age”. He describes data journalism as “gathering, cleaning, organizing, analyzing, visualizing and publishing data to support the creation of acts in journalism.” If we use data left unchecked, we are not fulfilling our role as data journalists in modern society. It is the data journalist’s responsibility to account for any embedded bias and to ensure safe and fair use of data and the populations therein.

Data Journalism: A Tool of Responsibility

In our current digital age, data journalism plays a highly important role. Through data gathering, analysis, and reporting, journalists can unveil truths, confront biases, and advocate for reform. This influence, however, demands an unwavering commitment to data integrity.

Consider Caroline Criado-Perez’s “Invisible Women.” She exposes a “male-fits-all” bias, spotlighting data omissions that bypass women. Whether in medical studies or city planning, data often mirrors male-centric standards. These tendencies emphasize the need for inclusive reporting and addressing imbalances.

Kevin Guyan’s “Queer Data” underlines the importance of capturing data on underrepresented groups, especially those with diverse sexual and gender identities. This isn’t just about representation; it’s about grasping our community’s multifaceted realities.

Nikki Stevens’ “Data Set Failures” offers insights into the challenges of collecting data from diverse open-source communities. She highlights the nuances in data-gathering techniques, reminding reporters of data’s multifaceted essence and the obligation to handle it judiciously.

The Data & Society podcast “Becoming Data” notes that data, while enlightening, can misguide if stripped of context. The power structures inherent in data suggest that tools may be impartial, but their usage isn’t. Reporters must ensure data amplifies truth, not just power dynamics.

So, what is data journalism’s duty? It’s a relentless quest for authenticity, translating complex data into lucid tales, and safeguarding against data misuse. In essence, it is at the nexus of digital innovation, truth, and public trust. It can spotlight marginalized narratives and champion the unheard, but with great power comes the duty to ensure data’s genuine portrayal of our shared human journey.

Purpose of Data Journalism within Society

Data journalism has many roles and responsibilities within society: it questions the past, present, and future states and processes of the world. Data journalism must use both quantitative and qualitative methods of research to tell a holistic narrative. Its interdisciplinary nature allows people of various cultures, research and professional fields, ages, backgrounds, etc. to partner together in revealing and addressing the wrongdoings and faults of various parties (no matter how “good” their intentions are). In Data Set Failures and Intersectional Data, Nikki Stevens exposes how traditional methods and corporate interests seeped through their OSC project, despite it being “the first step in a project to create safer spaces within OSC for individuals from marginalized groups”. Another case where data journalism uncovers the truth is shown in “One Size Fits Man”, where it’s revealed that smartphones and pianos are built for the bodies of unmarked end users, usually those of affluent white men.

It is responsible for being grounded in black feminist scholarship, disability justice, design justice, and queer theory. It is at the helm of empowering the public to think critically about the topic at hand and data ethics in general. Moreover, as mentioned in the podcast Becoming Data: Data and Humanity, data journalism fights against techno-chauvinism, a term coined by Meredith Broussard for the widespread belief that accelerated technology will save the world from all its problems. For instance, Lam Thuy Vo mentions how the current landscape of data collection about the most marginalized groups favors the deficit narrative. Lam provides a hypothetical example of a woman of color’s profile consisting of more “negative datasets,” like records with law enforcement and child services, than “positive datasets” of the measures she has taken to sustain her family’s well-being (e.g. maintaining the family finances).

It is the duty of data journalism to use visual storytelling and narratives to help the public more easily understand the context and key players, and to use stories of anonymous people’s experiences to connect with the public’s humanity. As demonstrated in the OSC paper, the processes and methods that made a data journalism project possible must be open to the public so that they can be challenged and/or replicated, and so that the field of data journalism progresses in the right direction. Furthermore, it should help members of the public know and use their own power in holding entities accountable.

This is a tangent, but I agree with what my colleagues have shared, that the government could provide funding to programs (e.g. digital coalitions and libraries) focused on educating and uplifting the public in data literacy, so that people can understand and engage in various data journalism projects. Additionally, I think members of the public who have access to the Internet and privileges such as time, money, and holistic health are responsible for actively taking ownership of data being collected about them and for going beyond raising awareness about missing data sets, whose absence may protect vulnerable communities from being targeted or protect the NYPD from being held accountable for racial profiling.

Data Journalism’s Responsibility to the Public

Data journalism is a critical component of the modern media landscape, offering a unique blend of investigative rigor and the power of data analysis and visualization to shed light on complex issues. As the world grapples with an ever-expanding sea of data, it is the responsibility of data journalists to ensure that this information is accurate, inclusive, and respects individuals’ rights. Two important readings and one recording help us understand why this responsibility to the public is paramount.

Caroline Criado-Perez’s “Invisible Women: Data Bias in a World Designed for Men” underscores the serious consequences of data bias in design and products, often favoring a “one-size-fits-men” approach. Medical research, safety equipment, public transportation, and office environments have historically ignored women’s unique needs, jeopardizing their safety, health, and comfort. Data journalism can address these issues by uncovering and highlighting such biases. By making these discrepancies visible to the public, data journalists hold companies and institutions accountable for better gender-aware design and data collection.

Nikki Stevens’s “Data Set Failures and Intersectional Data” delves into the challenges of collecting intersectional demographic data. It reveals the tension between quantification and the complexity of individual identities. Data journalists need to navigate this complexity and provide nuanced narratives rather than reducing diverse experiences to mere statistics. Furthermore, the ethical considerations surrounding data collection are vital. Data journalists must be vigilant about the source of funding and ensure their work respects privacy, transparency, and individuals’ rights. This means striving for a balance between data ownership, privacy, and the benefits to human rights.

Alex Howard’s perspective on “Data Journalism in the Second Machine Age” highlights the evolution of journalism and the role of data journalists in informing the public. Data journalists use technology and data to uncover stories, making information accessible and understandable to a wider audience. They help hold the powerful accountable through empirical analysis and transparent reporting. However, as the digital age advances, data journalists must also grapple with the responsibility of ensuring data privacy and data ethics.

The reading materials collectively emphasize the role of data journalism in ensuring a responsible, inclusive, and ethical use of data. Data journalists have a unique responsibility to expose biases, protect individual rights, and provide accurate and informative narratives to the public. They must navigate the complexities of intersectional data, maintain transparency, and strike a balance between data ownership and human rights benefits. Furthermore, data journalism plays a vital role in advancing gender-aware design and data collection by holding institutions and organizations accountable for their one-size-fits-men approaches. In the era of big data, data journalism serves as a bridge between the complexities of data and the public’s need for accurate and ethical information. It is a cornerstone of modern journalism, responsible for ensuring that data works for the betterment of society and its diverse population.

The Duties of Data Journalism

Data journalism is duty-bound to the public to tell their stories in ways that challenge rather than reify existing harmful social structures, ideologies, and constructs.

Journalists who implement data analysis tools and use data visualizations or create their own datasets to analyze have a responsibility to the communities whose stories they are telling. Sometimes, these communities are much wider and further reaching than traditional journalism typically deals with, as a dataset can include thousands of people, while traditional journalism involves interviewing a small number of people, as Alex Howard points out in his keynote on “Data Journalism in the Second Machine Age.”

This extends the responsibility of the journalist across entire segments of the population, and furthermore, this responsibility must be taken with the utmost gravity, as data-driven journalism is often perceived as the definitive answer, and numerical statistics – no matter how they’ve been manipulated – tend to be cited as absolute, irrefutable proof of a fact. This is significant because social issues are often changing and evolving and are not fixed physical constants. Our social mores, constructs, contracts, and our society itself are ever-changing and must be presented as such – as variable and malleable instead of static.

As Kevin Guyan points out, these same factors are not untouched by the collection of data, and indeed, “social identities” can be and often “are partly constructed through data collection practices.” (Guyan, 51) This intensifies the magnitude of data journalism’s obligation to the communities it represents and reports on. Unethical and biased data collection practices do not merely cause harm to, say, queer communities – they actively shape the wider public’s perception of the LGBTQIA2S+ community and their day-to-day lived experiences, and indeed help mold the social identities of queer and trans individuals.

In “Data Set Failures and Intersectional Data,” Nikki Stevens asks us to consider the question, “Can data, as a concept and/or as a material object, be intersectional?” (Stevens, 13) This is presented in the context of an analysis of the many failures of a survey that limited people to several options for gender, sex, and sexuality, and which used words and terms that are inherently laden with bias, such as “gender non-conforming.” Terms like these suggest that those who are cisgender are “conforming” to society’s preset gender binary, while those who do not “conform” are othered by mere virtue of the wording of the question.

Stevens adds that the team was enlightened most by a final question which “asked users to add any other aspects of their identity they felt were important.” (13) The realization that there is more to people than a limited number of data points, chosen from a limited, biased list, “reinforced for (them) the importance of self-identify boxes.” (13) Often, what people find most important about themselves cannot be quantified; it’s qualitative, and while such data can be compared to other datasets about different people, and attempts can be made to measure such variables, ultimately, people are complex systems, and their social identities are even more complex and can elude quantification or qualification.

Data journalism has an obligation to realize that within any given social system, there will be variables that remain unmeasured and others that are skewed by pre-existing biases. Transparency at all stages of data collection, analysis, and journalism, including being candid about what is being left out or sidelined, is paramount. The goal of journalism is to tell truths, and we must be cognizant that the truth can be biased towards existing social structures, constructs, and contracts.

The Case for Data Journalism

Data journalism serves as a cornerstone in data acquisition, analysis, dissemination, advocacy, and public education, among other crucial functions. From the readings and recordings, a robust case emerges for data journalism’s pivotal role and its responsibility to the public in the modern information age.

Methodological Contributions to Research

Kevin Guyan’s “Queer Data” raises pertinent questions concerning various study designs, encompassing methodologies and data collection tools. These salient questions underline the potential for researchers to inadvertently introduce biases, particularly against minority and marginalized populations. Beyond spotlighting these methodological limitations, data journalists champion novel approaches to data collection and analysis, often breaking away from academic conventions. These avant-garde approaches establish fresh methodological avenues for comprehensively understanding our world. Moreover, they set an innovative precedent for defining research methodologies, thus empowering researchers and data enthusiasts. Nikki Stevens’s work in “Data Set Failures and Intersectional Data” further advances methodological contributions by exploring intersectionality and novel approaches tailored to specific research contexts.

Facilitating Data Access for the Public

Alex Howard’s discourse on “Data Journalism in the Second Machine Age” artfully showcases how data journalism creatively sources, analyzes, and disseminates data. This creative process encompasses digitizing paper records into searchable online archives and employing data-driven methods to illuminate intricate societal issues. It spans concerns such as air pollution, corruption, food security, national security, and healthcare, delivering them to the public with flair and impact. This not only educates the public but also serves as a potent advocacy tool.

Advocacy and Ensuring Accountability

Inextricably linked to data accessibility, data journalism plays a pivotal role in advocacy and fostering accountability. By uncovering concealed issues and underscoring those previously disregarded, data journalists mobilize the necessary attention for policy shifts and concrete actions. A striking instance is the controversial publication of personally identifiable information about gun owners by the New York-based Journal News, which, despite causing public outrage, catalyzed substantial policy alterations in New York State’s gun laws. Additional examples from renowned sources like Wikileaks, The New York Times, The Los Angeles Times, La Nación, and more, covering topics like corruption, ambulance response times, and fundraising, serve as invaluable assets for advocacy and structural accountability.

The Dark Side of Data Journalism

Nonetheless, data journalism is not immune to ethical dilemmas. As illuminated in the podcast “Becoming Data: Data and Humanity (A Data & Society Podcast) Episode 1,” data journalists may sometimes unintentionally publish information about vulnerable groups, inadvertently perpetuating oppression within societal power dynamics. A concerning example revolves around the publication of eviction data, which landlords and real estate agents could exploit when selecting tenants, potentially perpetuating inequality and structural injustices that society seeks to alleviate.

The discourse above demonstrates that data journalism is an indispensable societal cornerstone, extending its influence across data utilization, accessibility, advocacy, and accountability. While its capacity to empower and enlighten is profound, data journalism must tread carefully, acknowledging the complex terrain of ethical responsibilities and the potential consequences that could either uplift or harm the public it serves.