Putting Data Ethics into Practice
Michele Claibourn is the Director of Equitable Analysis at The Equity Center and an Assistant Professor in the Batten School of Leadership and Public Policy, where she teaches classes on equitable policy, data ethics and practice, and data visualization.
We’ve all seen stories about the misuse of data or the ways data science is applied toward harmful ends (see, for example, ProPublica’s investigation into racially biased predictive policing algorithms). The calls to use data and make data-driven decisions in the public realm, though, continue to rise. If we are to reduce the disparate risks of data in the civic sphere, we need to understand what a practice of data analysis that embodies principles of feminism and liberatory justice entails. That is the question that framed the work of the Public Interest Data: Ethics & Practice course in the Batten School of Leadership and Public Policy this past spring.
The course had two big goals for the semester: (1) to make progress on a project that advances social justice and policy understanding; and (2) to give public policy students experience using and evaluating the use of data for important policy questions.
Towards the first goal, past iterations of the class have partnered with the local Department of Social Services to examine processes and outcomes in the child welfare system.[1] This year, we began a new partnership with the UVA Legal Data Lab to contribute to a larger effort to make de-identified court case information from Virginia’s State Court system more accessible for researchers. Students in the class undertook a series of projects to help us better understand and document the state court data: exploring what information this large database contains and what is missing, and describing the meaning, uses, and limitations of key variables.
To meet the second goal, we read and discussed works on data ethics, data feminism, and data justice. As a class, we deliberated on ideas – about who has the power to collect and use data and to what ends, about the dehumanizing effect of data aggregation, about the reductionism of binary and limited categorizations, about our own positionality in the data science ecosystem, and more – and then considered the implications for how we used, analyzed, and communicated about the state criminal court data.
We drew a variety of lessons and conclusions throughout the semester, like recognizing that this court administrative data exists in response to an exercise of state power. Our intent in using this data, thus, needed to be clear: we meant to turn the lens onto the operations of state power, not treat these records as another form of state surveillance of citizens. We regularly reminded ourselves that a criminal charge is not equivalent to a crime; rather, these records represent decisions to charge individuals with a crime. Thus, we needed to avoid treating these as proxies for the occurrence of criminal activity. We read about experiences with carceral systems to help keep the humanity of those represented in the data at the forefront, recognizing that each record captures a potentially traumatic moment for a person. In response, we would be intentional about the language we used to talk about justice-impacted individuals. And we learned about our own gaps in knowledge and experience, further underscoring the need to do this analysis in partnership with our neighbors more directly touched by these systems. With a clearer sense of what kinds of questions can and cannot be addressed with these data, we are in a better position to develop those partnerships in ways that won’t overstate what we can do as a class.
The first round of our exploratory work can be found on our class website, including:
- A look at traffic and vehicular cases
- An exploration of traffic fines and the revenues from fines across localities
- An examination of probation in four localities
- A comparison of sentencing variation for charges with and without mandatory minimums
As part of developing an ethical practice, we sought to make our work collaborative and open, providing the code for all of our work both within the project pages and on GitHub.
We hope these first drafts will serve as examples of the kinds of outcomes and processes researchers, journalists, and advocates could address with the Virginia State Court data. Our explorations have helped us form some questions for the next version of the class, focusing on the effects of recent reforms in the criminal code: for example, not allowing suspension of a driver’s license for unpaid court fees, separating a jury trial from jury sentencing, and legalizing possession of marijuana. If you have questions that could be addressed with these data or communities that could benefit, reach out to mclaibourn@virginia.edu. And look for the launch of the State Court Data website later this summer.
[1] See, for example, the 2020 report, Charlottesville Child Welfare Study: Understanding Referral Disproportionality, https://doi.org/10.18130/v3-rcah-yq04.