As we saw recently, when it comes to big data for public services, there needs to be algorithmic accountability. People need to understand not only what data is being used, but what analysis is being performed on it and for what purpose.
Further, complementing big data with thick, adjacent and lean data helps to tell a more complete story of analysis. These posts piqued much interest, and so this third and final installment on data offers a social welfare case study on algorithmic accountability: how to be transparent with algorithms.
A Predictive Data Tool for a Big Problem
The Allegheny County Department of Human Services (DHS), Pennsylvania, USA, screens calls about the welfare of local children. The DHS receives around 15,000 calls per year for a county of 1.2 million people. With limited resources to deal with this volume of calls, limited data to work with, and each decision a tough and important one to make, it is critical to prioritize the highest-need cases for investigation.
To help, the Allegheny Family Screening Tool was developed. It’s a predictive-risk modeling algorithm built to make better use of data already available and to help improve decision-making by social workers.
For each call, the tool draws on a number of different data sources, including databases from local housing authorities, the criminal justice system and local school districts, and produces a Family Screening Score: a prediction of the likelihood of future abuse.
The tool is there to help analyse and connect a large number of data points to better inform human decisions. Importantly, the algorithm doesn’t replace clinical judgement by social workers – except when the score is at the highest levels, in which case the call must be investigated.
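To make the idea concrete, here is a minimal sketch of how a predictive-risk screening score with a mandatory screen-in band might work. It is an illustration only, not the county's model: the actual features, weights and thresholds are described in the project's public methodology paper, and every name, weight and cut-off below is hypothetical.

```python
# Illustrative sketch only. Feature names, weights and the mandatory
# screen-in cut-off are hypothetical, not the Allegheny tool's values.
from dataclasses import dataclass


@dataclass
class CallRecord:
    # Hypothetical indicators drawn from linked county databases
    prior_referrals: int        # earlier welfare calls about the family
    housing_instability: float  # 0-1 index from housing-authority data
    justice_involvement: float  # 0-1 index from criminal-justice data
    school_absence_rate: float  # 0-1 rate from school-district data


# Hypothetical weights; a real predictive-risk model would learn these
# from historical outcome data and validate them before deployment.
WEIGHTS = {
    "prior_referrals": 0.40,
    "housing_instability": 0.25,
    "justice_involvement": 0.20,
    "school_absence_rate": 0.15,
}


def screening_score(call: CallRecord) -> int:
    """Combine the linked data points into a 1-20 style risk score."""
    raw = (
        WEIGHTS["prior_referrals"] * min(call.prior_referrals, 5) / 5
        + WEIGHTS["housing_instability"] * call.housing_instability
        + WEIGHTS["justice_involvement"] * call.justice_involvement
        + WEIGHTS["school_absence_rate"] * call.school_absence_rate
    )
    return max(1, round(raw * 20))  # scale to a 1-20 score


def must_investigate(score: int, mandatory_band: int = 18) -> bool:
    """Scores in the top band trigger a mandatory screen-in; anything
    below is advisory and the social worker's judgement decides."""
    return score >= mandatory_band


call = CallRecord(prior_referrals=3, housing_instability=0.7,
                  justice_involvement=0.4, school_absence_rate=0.5)
score = screening_score(call)
print(score, must_investigate(score))
```

The design point to notice is the split between the score, which summarises the linked data, and the decision rule, which stays with the social worker except in the top band.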
As the New York Times reports, before the tool 48% of the lowest-risk families were being flagged for investigation, while 27% of the highest-risk families were not. At best, misjudgements like these put an unnecessary strain on limited resources; at worst, they leave severe child abuse uninvestigated.
How to Be Algorithmically Accountable
Given the sensitivity of screening child welfare calls, the system had to be as robust and transparent as possible. Mozilla reports how, over multiple years, the tool was designed to meet that standard:
- A rigorous public procurement process.
- A public paper describing all data going into the algorithm.
- Public meetings to explain the tool, where community members could ask questions, provide input and influence the process. Professor Rhema Vaithianathan is the rock star data storyteller on the project.
- An independent ethical review of implementing, or failing to implement, a tool such as this.
- A validation study.
The algorithm is open to scrutiny, owned by the county and constantly being reviewed for improvement. According to the Wall Street Journal, the trailblazing approach and the tech are being watched with much interest by other counties.
Algorithmic Accountability with Extreme Transparency
It takes boldness to build and use a tool in this way. Erin Dalton, a deputy director of the county’s DHS and leader of its data-analysis department, says that “nobody else is willing to be this transparent.” The exercise is obviously an expensive and time-consuming one, but it’s possible.
During recent discussions on AI at the World Bank, the point was raised that because some big data analysis methods are opaque, policymakers may need a lot of convincing to use them. Policymakers may be afraid of the media fallout when algorithms get it badly wrong.
It’s not just the opaqueness; the whole data chain is complex. In education, Michael Trucano of the World Bank asks: “What is the net impact on transparency within an education system when we advocate for open data but then analyze these data (and make related decisions) with the aid of ‘closed’ algorithms?”
In short, it’s complicated and it’s sensitive. A lot of convincing is needed, both at the top and at the bottom. But, as Allegheny County DHS has shown, it’s possible. For ICT4D, their tool demonstrates that public-service algorithms can be developed ethically, openly and with the community.
Stanford University is currently examining the impact of the tool on the accuracy of decisions, overall referral rates and workload, and more. Like many others, we should keep a close watch on this case.
Thanks for an interesting post! How can we find out more about the Stanford study? Is there a research protocol available?
Thanks for your question, Annette. I’m not sure if a research protocol is available. I would suggest you contact someone listed above from the project, or Jeremy Goldhaber-Fiebert, Assistant Professor of Medicine at Stanford University, who is conducting the study.
If you come across an online resource of interest, please share it here. Thanks.