November 12, 2020

Public Safety Assessment Validation Studies

Validating the Public Safety Assessment

The PSA was built using historical data from around the country. It is important to test whether the PSA is accurate in the current era and in different communities. Creating PSA-DMF System reports can be costly, so communities should spend their limited resources on tools that have been proven accurate. Validation will also help identify whether the PSA accurately categorizes risk across different races and genders.

The Studies

Unlike most of the A2J Lab’s studies, these studies are not RCTs. They are validation studies. A validation study uses statistical analysis to determine how well the PSA classifies what it was designed to classify, i.e., the risk that certain outcomes will occur. In this case, the outcomes of interest are failure to appear (“FTA”), new criminal activity (“NCA”), and new violent criminal activity (“NVCA”). Validation studies assess how accurately PSA scores classified individuals (or cases) in the past according to risk of FTA and N(V)CA. The closer the relationship between the classification and what subsequently happened, the stronger the evidence of validity.
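To make the idea concrete, here is a minimal sketch of two of the simpler checks a validation study might run: the observed misbehavior rate at each score, and the area under the curve (AUC). The data are synthetic and the function names (rate_by_score, auc) are hypothetical; this is an illustration of the general technique, not the A2J Lab's code or data.

```python
from collections import defaultdict

def rate_by_score(scores, outcomes):
    """Observed outcome rate (e.g., FTA) at each score value."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for s, y in zip(scores, outcomes):
        totals[s] += 1
        events[s] += y
    return {s: events[s] / totals[s] for s in sorted(totals)}

def auc(scores, outcomes):
    """Probability that a randomly chosen case with the outcome has a
    higher score than a randomly chosen case without it (ties count half)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case with and one without the outcome")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Synthetic example: four cases at each score from 1 to 6 (assumed, not real data).
scores = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
          4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6]
outcomes = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
            0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0]

rates = rate_by_score(scores, outcomes)
for score, rate in rates.items():
    print(f"score {score}: observed rate {rate:.2f}")
print(f"AUC: {auc(scores, outcomes):.3f}")
```

An AUC of 0.5 means the score does no better than chance at ranking cases, and 1.0 means it ranks them perfectly. Rates that rise steadily with the score point toward validity; flat or reversed rates point against it.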

The A2J Lab is working with four counties to perform validation studies based on their use of the PSA. The counties have provided (or are currently providing) the A2J Lab with data related to the PSA and to misbehavior rates. The A2J Lab is organizing and analyzing the data to determine the PSA’s validity for various groups of people.


The A2J Lab is studying how well the PSA classifies risk of FTA, NCA, and NVCA.

What We’ve Learned

Harris County, Texas: There was moderate evidence that the PSA was valid overall in Harris. Some validation techniques (e.g., simple correlations, area under the curve, balanced accuracy) provided weak evidence of validity; others (e.g., simple plots, logistic regression) provided strong evidence of validity. No technique suggested invalidity. There was no substantial evidence that the PSA scales performed differently for different racial and gender groups.

McLean County, Illinois: There was moderate evidence for the overall validity of the PSA scales in McLean. Most validation techniques (simple plots, logistic regression, balanced accuracy measures, and area under the curve for NCA and FTA) provided either moderate or strong evidence for validity, while the remainder (correlations, area under the curve for NVCA) provided weak evidence of validity. No technique suggested invalidity. Analyses by gender and race were inconclusive.

Kane County, Illinois: There were three primary results. First, there was moderate evidence of overall validity in Kane, in that, with one exception, increasing scores on the new criminal activity (NCA), new violent criminal activity (NVCA), and failure to appear (FTA) scales generally corresponded with increasing rates of misbehavior among released individuals. The exception occurred at the high values of the NCA scale; there was no relevant difference in misbehavior rates at NCA values of 4, 5, and 6. Second, increases in the NCA and FTA scales sometimes corresponded to markedly different increases in misbehavior rates, providing evidence against the uniform validity of these scales in Kane County. Third, there was no evidence suggesting marked racial or gender differences in the validity of the PSA scales; while some metrics showed inequity with respect to a particular race or gender, other metrics showed inequity in the reverse direction. Thus, although we have no evidence of equitable invalidity, we cannot confirm equitable validity.
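Balanced accuracy, one of the techniques named in the county summaries above, can also be compared across groups as a rough check on whether a scale performs differently for different people. The sketch below assumes a cutoff that flags scores of 4 or higher; that cutoff, the group labels "A" and "B", and the records are all hypothetical and chosen only for illustration. It is not the method used in the reports.

```python
def balanced_accuracy(scores, outcomes, threshold):
    """Average of sensitivity and specificity when scores >= threshold
    are treated as 'flagged' (the threshold is an assumption)."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, outcomes):
        flagged = s >= threshold
        if y == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the outcome")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Hypothetical records: (group label, score, outcome). Not real data.
records = [
    ("A", 1, 0), ("A", 2, 0), ("A", 5, 1), ("A", 6, 1),
    ("B", 1, 0), ("B", 5, 0), ("B", 2, 1), ("B", 6, 1),
]

by_group = {}
for group in sorted({g for g, _, _ in records}):
    group_scores = [s for g, s, _ in records if g == group]
    group_outcomes = [y for g, _, y in records if g == group]
    by_group[group] = balanced_accuracy(group_scores, group_outcomes, threshold=4)

for group, value in by_group.items():
    print(f"group {group}: balanced accuracy {value:.2f}")
```

A large gap between groups on one metric is not conclusive on its own. As the Kane County results illustrate, different metrics can point in opposite directions, so several measures are usually examined together.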

One additional report will be posted when available.

The Research Team

Jim Greiner, Faculty Director, Access to Justice Lab; Professor of Law, Harvard Law School

Matthew Stubenberg, Associate Director of Legal Technology, Access to Justice Lab

Ryan Halen, Data Analyst, Access to Justice Lab