New research challenges the accuracy and fairness of so-called “risk assessment” tools used by corrections departments to predict an offender’s likelihood of reoffending.
In a January article in the journal Science Advances, researchers presented the results of two analyses that undercut the predictive algorithm behind the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), in use nationwide since 2000.
“Proponents of these systems argue that big data and advanced machine learning make these analyses more accurate and less biased than humans,” wrote Julia Dressel and Hany Farid of Dartmouth College. “We show, however, that the widely used commercial risk assessment software COMPAS is no more accurate or fair than predictions made by people with little or no criminal justice expertise.”
One analysis compared the COMPAS scores of about 7,000 defendants in Florida between 2013 and 2014 with their actual recidivism during the first two years following their release.
“This analysis indicated that the predictions were unreliable and racially biased,” Dressel and Farid wrote. “COMPAS scores appeared to favor White defendants over Black defendants by underpredicting recidivism for White and overpredicting recidivism for Black defendants.”
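In plain terms, “overpredicting” recidivism for a group means the tool has a higher false positive rate there (people flagged high risk who did not reoffend), while “underpredicting” means a higher false negative rate. A minimal sketch of how such per-group error rates can be computed follows; all data here is invented for illustration and does not come from the study.

```python
# Sketch: per-group false positive and false negative rates, the error
# measures behind the "overpredicting"/"underpredicting" finding.
# True = flagged high risk (predictions) or actually reoffended (outcomes).
# All values below are made up for illustration only.

def error_rates(predicted_high_risk, reoffended):
    """Return (false_positive_rate, false_negative_rate) for one group."""
    fp = sum(1 for p, r in zip(predicted_high_risk, reoffended) if p and not r)
    fn = sum(1 for p, r in zip(predicted_high_risk, reoffended) if not p and r)
    did_not_reoffend = sum(1 for r in reoffended if not r)
    did_reoffend = sum(1 for r in reoffended if r)
    return fp / did_not_reoffend, fn / did_reoffend

# Hypothetical predictions and outcomes for two groups.
group_a_pred = [True, True, False, True, False, False]
group_a_true = [True, False, False, False, False, True]
group_b_pred = [False, False, True, False, False, True]
group_b_true = [False, True, True, False, True, True]

fpr_a, fnr_a = error_rates(group_a_pred, group_a_true)
fpr_b, fnr_b = error_rates(group_b_pred, group_b_true)
# Here group A's false positive rate (0.5) exceeds group B's (0.0):
# the kind of asymmetry the researchers reported between Black and
# White defendants, shown on fabricated numbers.
```

A disparity like this can exist even when overall accuracy is identical for both groups, which is why the researchers examined error rates rather than accuracy alone.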
Dressel and Farid themselves conducted a study in which randomly recruited online participants were asked to predict a person’s recidivism given minimal information: the offender’s sex, age, and criminal history, but not race.
“With considerably less information than COMPAS (only seven features compared to COMPAS’ 137), a small crowd of non-experts is as accurate as COMPAS at predicting recidivism,” Dressel and Farid asserted.
Northpointe, the private software company that created COMPAS, refuses to disclose the specifics of its prediction process, citing the risk of sharing its method with rival firms. But whatever algorithmic functions are involved, Dressel and Farid determined that COMPAS is, at best, about 65 percent accurate.
“Given that our participants, our classifiers, and COMPAS all seemed to reach a performance ceiling of around 65 percent accuracy, it is important to consider whether any improvement is possible,” they said of their results. “…it remains to be seen whether a larger pool of participants will yield even higher accuracy, or whether participants with criminal justice expertise would outperform those without.”
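The accuracy being measured here is simply the fraction of predictions that match what actually happened. The sketch below shows that calculation with a deliberately crude hand-written rule using just two features, age and prior convictions; the rule and every record are fabricated for illustration and bear no relation to the proprietary COMPAS model.

```python
# Sketch: how prediction accuracy is scored. The rule and records are
# invented; this is NOT the COMPAS algorithm, whose internals are secret.

def predict(age, priors):
    """Toy rule: flag as likely to reoffend if young or with several priors."""
    return age < 25 or priors > 2

# (age, prior_convictions, actually_reoffended) -- hypothetical records.
records = [
    (22, 3, True), (45, 0, False), (31, 4, True), (19, 1, False),
    (52, 1, True), (27, 5, True), (38, 2, False), (24, 0, False),
]

correct = sum(1 for age, priors, truth in records
              if predict(age, priors) == truth)
accuracy = correct / len(records)  # fraction of predictions that were right
# On these eight made-up records the toy rule scores 5/8, or 62.5 percent,
# in the neighborhood of the ~65 percent ceiling the researchers observed.
```

The point of the comparison in the study is that very simple predictors built from a handful of features landed in the same accuracy range as the 137-feature commercial tool.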
Separate from COMPAS, Dressel and Farid examined eight other algorithmic programs used across the country to predict recidivism, only one of which was found to make accurate predictions.
“It is valuable to ask whether we would put these decisions in the hands of random people who respond to an online survey because, in the end, the results from these two approaches appear to be indistinguishable,” Dressel and Farid concluded.