Big data in the courtroom: Bug or Feature?
Anna Sagana & Mario Senden
The advent of big data, together with recent developments in machine learning (ML), has had a vast impact on many aspects of our lives; recommendation engines help companies like Netflix and Amazon anticipate (or perhaps create) our desires, while Google Maps predicts traffic to suggest better routes. It is thus little surprise that machine learning algorithms are already being tested [1] and employed in courtrooms [2]. In the US, the number of law firms using ML is increasing, with prominent examples including companies like LexisNexis and Westlaw. But law firms are not the only ones interested in big data and ML [3]; judges seem equally allured, and prominent law professors have begun advocating their use [4,5].
This demand seems reasonable, since the legal system is a rich source of data. More than 350,000 cases pass through US courts each year. Similarly, the Dutch courts issue more than 80,000 sentences each year on first-instance cases alone [6]. Meanwhile, an increasing amount of case information is stored electronically, ranging from emails and social media interactions to CCTV footage and GPS tracking. The processing demands that accompany the vast amounts of information and evidence generated in this way push human cognitive abilities to their limits. Machine learning can help to reduce cognitive load by detecting patterns in data that are invisible to human observers, thereby reducing the likelihood of accidental errors [7,8]. Furthermore, ML algorithms carry the promise of unearthing new and better decision patterns from the data, and thus of improving judicial decision making.
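To make the notion of pattern detection concrete, the sketch below trains a toy classifier on synthetic case features. This is a minimal illustration only: the feature names, the data, and the hidden interaction are all invented, and it assumes Python with NumPy and scikit-learn rather than any tool mentioned above.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_cases = 1000

# Hypothetical case features: prior convictions, defendant age, evidence items.
priors = rng.poisson(1.5, n_cases)
age = rng.integers(18, 70, n_cases)
evidence_items = rng.poisson(4, n_cases)
X = np.column_stack([priors, age, evidence_items])

# Synthetic outcome with a subtle interaction (priors matter mostly for young
# defendants) that a human reading thousands of files could easily miss.
logits = 0.8 * priors * (age < 25) + 0.1 * evidence_items - 1.0
y = (rng.random(n_cases) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("feature importances:", model.feature_importances_)

The classifier recovers the interaction from the data alone; on real case files, the same mechanism is what would allow ML to flag regularities that overloaded human readers overlook.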
While machine learning can be advantageous for the legal domain, it also carries dangers. Cases such as that of Wisconsin v. Loomis [9] in the US ought to fuel our concern. In that case, the defendant Eric Loomis was found guilty for his role in a drive-by shooting. However, his punishment was based not merely on his crime but also on a computerized risk assessment. Mr. Loomis received an unusually harsh sentence because a risk-assessment tool (COMPAS [10]) deemed him highly likely to reoffend. The algorithms used by this tool are a well-kept secret known only to their developers and hence accessible neither to the defense attorneys nor to the State of Wisconsin. Such a lack of transparency not only puts due process in jeopardy but also places important legal decisions in the hands of private companies.
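The transparency problem can likewise be made concrete. In the hedged sketch below (again Python with scikit-learn; the features and models are invented stand-ins, not COMPAS or any real tool), an opaque model returns only a risk score, whereas an interpretable model exposes weights that a defense team or a court could at least scrutinize.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical features: prior convictions, age, employment status (all fake).
X = rng.normal(size=(500, 3))
y = (X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

black_box = GradientBoostingClassifier(random_state=1).fit(X, y)
interpretable = LogisticRegression().fit(X, y)

defendant = X[:1]
# The black box yields only a number; there is little for the defense to inspect.
print("risk score:", black_box.predict_proba(defendant)[0, 1])
# The interpretable model at least exposes how each feature moves the score.
print("weights (priors, age, employment):", interpretable.coef_[0])

With a proprietary tool, even this much is unavailable: neither the model nor its inputs can be examined, which is exactly the situation described above.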
We are currently at a crossroads, with one path leading to machine-aided, improved human decision making and the other to black-box systematization in which legal decisions can neither be made nor comprehended by legal experts. Big data and machine learning will arrive in the judicial system along one of these paths; it is up to legal professionals, legal psychologists, and data scientists to make sure they do so along the right one and head towards a bright future.
References
1 http://www.ucl.ac.uk/news/news-articles/1016/241016-AI-predicts-outcomes-human-rights-trials
4 Katz DM. Quantitative legal prediction-or-how I learned to stop worrying and start preparing for the data-driven future of the legal services industry. Emory Law Journal. 2012;62:909.
5 Surden H. Machine learning and law. Washington Law Review. 2014;89(1):87-115.
6 https://longreads.cbs.nl/trends17-eng/society/figures/security_and_justice/
7 Danziger S, Levav J, Avnaim-Pesso L. Extraneous factors in judicial decisions. Proceedings of the National Academy of Sciences. 2011;108(17):6889-6892.
8 Cho K, Barnes CM, Guanara CL. Sleepy punishers are harsh punishers: Daylight saving time and legal sentences. Psychological Science. 2017;28(2):242-247.