Mart, Susan Nevelow, The Algorithm as a Human Artifact: Implications for Legal {Re}Search (October 26, 2016). Available for download at SSRN: https://ssrn.com/abstract=2859720
“When legal researchers search in online databases for the information they need to solve a legal problem, they need to remember that the algorithms that are returning results to them were designed by humans. The world of legal research is a human-constructed world, and the biases and assumptions the teams of humans that construct the online world bring to the task are imported into the systems we use for research. This article takes a look at what happens when six different teams of humans set out to solve the same problem: how to return results relevant to a searcher’s query in a case database. When comparing the top ten results for the same search entered into the same jurisdictional case database in Casetext, Fastcase, Google Scholar, Lexis Advance, Ravel, and Westlaw, the results are a remarkable testament to the variability of human problem solving. There is hardly any overlap in the cases that appear in the top ten results returned by each database. An average of forty percent of the cases were unique to one database, and only about 7% of the cases were returned in search results in all six databases. It is fair to say that each different set of engineers brought very different biases and assumptions to the creation of each search algorithm. One of the most surprising results was the clustering among the databases in terms of the percentage of relevant results. The oldest database providers, Westlaw and Lexis, had the highest percentages of relevant results, at 67% and 57%, respectively. The newer legal database providers, Fastcase, Google Scholar, Casetext, and Ravel, were also clustered together at a lower relevance rate, returning approximately 40% relevant results. Legal research has always been an endeavor that required redundancy in searching; one resource does not usually provide a full answer, just as one search will not provide every necessary result. 
The study clearly demonstrates that the need for redundancy in searches and resources has not faded with the rise of the algorithm. From the law professor seeking to set up a corpus of cases to study, the trial lawyer seeking that one elusive case, the legal research professor showing students the limitations of algorithms, researchers who want full results will need to mine multiple resources with multiple searches. And more accountability about the nature of the algorithms being deployed would allow all researchers to craft searches that would be optimally successful.”
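The overlap figures quoted above (cases unique to one database, cases returned by all six) come down to simple set arithmetic over each database's top-ten list. The sketch below is illustrative only: the abstract does not say how the percentages were computed, so the denominator used here (distinct cases across all databases) is an assumption, and the function name `overlap_summary`, the database names, and the case identifiers are all hypothetical toy values, not data from the study.

```python
from collections import Counter


def overlap_summary(results):
    """Summarize overlap among result lists, one list per database.

    `results` maps a database name to the case identifiers it returned.
    Returns a pair: (share of distinct cases returned by exactly one
    database, share of distinct cases returned by every database).
    Assumption: shares are taken over distinct cases across all databases.
    """
    # Count how many databases returned each distinct case.
    counts = Counter(case for cases in results.values() for case in set(cases))
    total = len(counts)
    if total == 0:
        return 0.0, 0.0
    unique = sum(1 for n in counts.values() if n == 1)
    shared = sum(1 for n in counts.values() if n == len(results))
    return unique / total, shared / total


# Hypothetical toy data: three made-up databases, four results each.
toy = {
    "db_a": ["case1", "case2", "case3", "case4"],
    "db_b": ["case1", "case2", "case5", "case6"],
    "db_c": ["case1", "case7", "case8", "case9"],
}
unique_share, shared_share = overlap_summary(toy)
print(f"unique to one database: {unique_share:.0%}")
print(f"returned by all databases: {shared_share:.0%}")
```

Running the same calculation over the real top-ten lists from several research platforms would give a researcher a rough sense of how much a second or third search adds to the first.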