Blog

Online Dispute Resolution of Low-Level Court Proceedings: Two Broken Field Experiments, One Unexpected Result

March 8, 2023 by Alice Hu

The COVID-19 pandemic forced courts to change overnight the way they heard cases, from requiring masks in courtrooms to holding hearings online or by telephone. Online dispute resolution (ODR) systems became a popular method for courts to dispense justice. Proponents have long argued that ODR increases access to justice, mitigates procedural and substantive errors, and conserves court resources, particularly for self-represented litigants. Is this true? The A2J Lab conducted two randomized controlled trials (RCTs) to assess newly implemented ODR programs. This post discusses those RCTs and one useful and unexpected finding.

What Is ODR, and What Is the Current Landscape?

ODR was rising in popularity even before the pandemic. In 2019, the ABA reported that sixty-six courts in twelve states maintained ODR platforms for settling civil and criminal disputes. Between 2018 and 2019, the number of court-annexed ODR platforms more than doubled.

ODR platforms are often classified into two types: instrumental ODR systems offer a virtual platform for facilitating otherwise face-to-face dispute resolution, while principal ODR systems take a “proactive role facilitating the resolution of the dispute.” Courts may implement ODR as a mandatory pretrial process or as an optional alternative to trial. ODR can occur synchronously in real-time videoconferences or asynchronously in chat rooms. ODR systems may operate differently in disputes between an individual and the state versus disputes between private individuals, and they may be free to use or charge fees.

Two vendors, Matterhorn and Modria (the latter recently purchased by Tyler Technologies), currently dominate commercial ODR systems in the United States. 

Purported Benefits of ODR

Increasing Access to Justice

Advocates argue that ODR offers parties who lack the time and money to attend an in-person hearing a cheaper and more accessible online mediation process. If individuals can negotiate small claims or contest citations through an asynchronous online platform, they may be less likely to default.

However, barriers include a lack of willingness to use ODR as well as digital divides. Low-income and disadvantaged communities are more likely to lack the necessary broadband connection for video conferencing.

Mitigating Procedural and Substantive Mistakes

Proponents argue that ODR reduces the risk of procedural and substantive errors on the part of participants. Pro se (self-represented) litigants face substantial obstacles in following correct court procedures and adequately presenting their cases. ODR’s informality and physical separation might make self-represented litigants less nervous and mitigate biases that arise in face-to-face hearings.

However, some researchers argue that parties who suffer disadvantages in traditional hearings or alternative dispute resolution may suffer similar disadvantages in ODR. Further, efficiency gains may hinder equity goals. For instance, more efficient resolution of eviction suits by landlords against tenants may exacerbate landlords’ comparative power over tenants.

Conserving Court Resources

Advocates argue that ODR may reduce resource demands on the court. By using ODR, courts might avoid the costs of physical spaces and staff needed for face-to-face hearings. ODR may also reduce the administrative costs of procedural errors by self-represented litigants, such as failure to file documents on time or appear in court. 

There has been little testing of any of the popularly cited benefits of ODR, and further research is needed.

Our Randomized Controlled Trials (RCTs)

There are currently no RCTs evaluating whether ODR serves the equity, access to justice, or efficiency goals raised by proponents. RCTs are considered the “gold standard” for empirical testing because randomization creates groups that are statistically equivalent but for the tested intervention, minimizing the possibility that observed differences are due to anything other than the intervention. Unlike studies that compare pre-intervention outcomes with post-intervention outcomes, RCTs do not run the risk that some other contemporaneous change influenced the results.

Design

The most straightforward way to assess the claimed benefits of any ODR program is to randomize its usage. The randomization creates two statistically identical groups that differ in a single respect: one uses ODR and the other does not. As a result, any observed differences in access to justice, budgetary, or other outcomes can be attributed to the ODR. Depending on the court’s usage of ODR, the study might randomize either access to the platform (if usage is optional) or orders compelling litigants to use it (if usage is mandatory). Either design, if implemented properly, would produce credible results.
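To make the design concrete, here is a minimal sketch of what case-level randomization might look like. The case identifiers, the 50/50 split, and the function name are illustrative assumptions, not either court’s actual procedure.

```python
import random

# Minimal sketch of case-level random assignment for an ODR evaluation.
# The 50/50 split and the group labels are illustrative assumptions.
def assign_cases(case_ids, seed=2023):
    rng = random.Random(seed)  # fixed seed so the assignment is auditable
    assignments = {}
    for case_id in case_ids:
        # Each case gets an independent "coin flip," so the two groups are
        # statistically equivalent in expectation on every characteristic.
        assignments[case_id] = "odr" if rng.random() < 0.5 else "control"
    return assignments

groups = assign_cases(["2023-TR-0001", "2023-TR-0002", "2023-TR-0003"])
print(groups)
```

Because the assignment depends only on the random draw, and not on anything about the case or the litigant, any later difference between the groups can be attributed to the intervention.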

However, the two courts (in different jurisdictions) in our studies declined to implement either design. Instead, in Court A, all individuals who received traffic citations received information about the ODR platform on the citation form. Users who registered for the ODR platform were randomized to receive or not receive additional information about negotiating their charges and alternative options like payment plans. In Court B, all individuals who received traffic citations received a web address on the citation form that directed them to either pay the citation and plead guilty, use the ODR platform, or appear in court. Eligible users were randomized to receive or not receive a postcard encouraging them to use the ODR platform.

Obstacles

With Court A, we faced four major barriers: 1) Court A abandoned plans to advertise the ODR pilot; 2) the court buried ODR information in small font, hidden within complex boilerplate language on the citation; 3) the court crafted narrow eligibility guidelines, and some individuals who attempted to use the platform were found ineligible; and 4) Court A’s citation form both provided information about the ODR platform and scheduled a hearing, causing potential confusion over whether the ODR platform was available to defendants. Court A also set an internal deadline for ODR registration that it did not communicate to defendants.

With Court B, we faced three key obstacles: 1) Court B moved in-person hearings online at the start of the study, which coincided with the onset of the COVID-19 pandemic, potentially making the platform less attractive; 2) Court B paid for its ODR vendor to process 1,500 cases, after which the court passed the cost on to users; and 3) the ODR vendor administered our survey incorrectly, missing some participants and allowing others to make multiple, inconsistent submissions.

Results

In Court A, only one individual registered to use the ODR platform over four months. Facing such low usage, we recommended terminating the study, and Court A accepted our recommendation.

In Court B, the encouragement deployed failed to encourage. The ODR usage rate in the group randomized to receive the postcard was 72.5%, compared to 72.6% in the no-postcard group; the corresponding p-value was .996, with a frequentist confidence interval for the treatment effect of (-.075, .075). (A p-value quantifies the probability of observing a difference at least as large as the one found if the intervention had no true effect; a low p-value, such as one below .05, indicates that the difference is unlikely to be due to chance.) With no discernible difference in ODR usage rates, we could infer nothing about the effect of ODR.
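For readers who want to see the arithmetic, the sketch below runs this kind of comparison as a standard two-proportion z-test. The exact counts were not published, so the counts here are approximations reconstructed from the reported rates and the group sizes of 289 and 274.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Approximate counts reconstructed from the reported usage rates
# (assumption: 210/289 ~ 72.7%, 199/274 ~ 72.6%; exact counts unpublished).
used_odr = np.array([210, 199])     # postcard group, no-postcard group
group_sizes = np.array([289, 274])

stat, pval = proportions_ztest(used_odr, group_sizes)

# Normal-approximation 95% confidence interval for the difference.
p1, p2 = used_odr / group_sizes
se = np.sqrt(p1 * (1 - p1) / group_sizes[0] + p2 * (1 - p2) / group_sizes[1])
diff = p1 - p2
print(f"p-value: {pval:.3f}")  # near 1: no evidence of a difference
print(f"95% CI: ({diff - 1.96 * se:.3f}, {diff + 1.96 * se:.3f})")
```

Under these reconstructed counts, the test returns a p-value near 1 and an interval close to the reported (-.075, .075).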

An Unexpected Finding

While not part of our research design, we discovered a large and statistically significant reminder effect. Litigants to whom we sent the encouragement postcard were 23 percentage points more likely to have resolved their cases within the experiment’s timeframe, regardless of whether they used the ODR platform (p < .0005, 95% confidence interval (.16, .31)). In addition, litigants to whom we sent the postcard were 12 percentage points more likely to appear for their hearings (p < .002, 95% confidence interval (.05, .19)). The size of the reminder effect comfortably exceeded the effects of postcard reminders typically found in the nudge literature.

We postulate that the reminder effect arose from the combination of the encouragement and the presence of the ODR platform, especially given that the platform permitted asynchronous negotiation and resolution. The postcard may prompt participants to “take care of this quickly right now,” and the ODR platform gives them a means, not otherwise available, to do so from home.

Suggestions for Courts Implementing ODR

We encourage courts to conduct RCTs in which users are randomly assigned to ODR platforms directly, rather than through the proxy of an encouragement postcard. Direct assignment enables direct observation of the effects of ODR. While many in the U.S. legal profession remain skeptical of randomization, it is the gold standard for research. We hope that the results here encourage additional research into the efficacy of ODR.

We have three additional suggestions. First, courts should carefully weigh cost against utility before deploying ODR platforms. Many vendors charge on a per-case basis, so courts that craft broad eligibility guidelines may see higher costs if many litigants elect to use ODR. Courts may accept this cost if the ODR “buys” more access to justice for litigants who otherwise would not have responded.

Some have considered potential future “open source” solutions to reduce the costs of ODR systems, but open-source software carries hidden costs, given developer unfamiliarity with the code, a lack of support staff, and issues with scaling. Thus, courts should not look to open source as an immediate fix for the cost of ODR platforms.

Second, courts desiring greater ODR adoption should craft broad, clear eligibility criteria. As our studies show, narrowly crafted eligibility requirements, failure to remind users about ODR, and unintentionally hidden registration information are all likely to limit the number of users. Our research suggests that reminding users and reducing registration friction increase participation.

Lastly, courts should carefully evaluate compatibility between court use cases and vendor features before adopting any ODR system. Various technical issues have delayed different courts’ implementation of ODR systems and increased overhead costs, including issues in Application Programming Interface (API) and database integration. Responding to these delays, courts have recommended establishing a project management team to oversee the ODR deployment process.

Lessons in A2J Techniques from ODR Evaluations

June 22, 2022 by Renee Danser

The rise in court-hosted online dispute resolution (“ODR”) is noticeable. But does it work? It seems intuitive that if courts are widely deploying ODR, touted to improve access to justice by removing the inconvenience of (some argue, any necessity of) visiting a physical courthouse to participate in one’s case, we would find strong evidence to support such wide adoption.

That strong evidence base does not currently exist.[1] In fact, no credible evidence as to ODR’s effectiveness, one way or another, exists. We tried ourselves to investigate whether ODR works. Unfortunately, we still don’t know.

If you are reading this post, you likely know already that at the Access to Justice Lab at Harvard Law School, we focus on credible, meaning almost always randomized, evaluation, using the randomized controlled trial to understand the direct causal impact of the interventions we study.[2] In this post, we summarize two evaluations completed in partnership with two invaluable partners: the Iowa Judicial Branch and the 11th Judicial Circuit of Florida.

In the coming months, watch for a longer publication going into more detail. Here we will proceed as follows: first, we will remind readers why randomization is an important component of a credible evaluation design. Then we will summarize our two studies, which differ in goals and design, but neither of which tested the key question of whether ODR (versus no ODR) affects relevant outcomes. Finally, we will end with lessons learned and opportunities for future research/court partnerships.

The Importance of Randomization in Evaluation Designs
Why are we so committed to evaluations incorporating randomization? Randomized studies create groups statistically identical to one another except that one is not exposed to an intervention or program (here, the availability of an ODR platform), allowing us to know, with as much certainty as our current knowledge systems allow, that the reason for any observed differences in outcomes is the intervention or program. By contrast, a commonly used methodology that compares outcomes prior to an intervention’s implementation with outcomes after the implementation can be rendered of little value by changes, fast or evolutionary, occurring at about the same time as the intervention. Such factors might include a change in presiding judge, a new crop of mediators or lawyers working on these cases, a change in the mechanism for accessing the court, such as by phone or by synchronous or asynchronous online interaction (uncoincidentally, similar to how ODR works), a change in filing fee amounts, a change in the way cases move through the court, or a change in thinking among members of the bar regarding what is trial-worthy and what is better to settle. The gold-standard RCT neutralizes these potentially influencing factors as much as we currently know how to neutralize them.

It was the Best of Times, it was the Worst of Times: A Tale of Two Studies
As suggested above, the threshold question of whether ODR works or not is not yet answered. Before we understand what components of ODR make it better or worse, we need to know if the concept overall works. We went into these evaluations hoping to investigate that important first inquiry. We came closer in Florida.

Florida
In Florida we attempted an encouragement design.[3] We sought to answer whether providing encouragement to use an ODR platform to resolve traffic compliance matters results in more use of the platform and, if so, whether those who use the platform experience better outcomes compared to those who do not. The hope was that people receiving encouragement would do the thing we encouraged them to do at much higher rates than those who did not receive the encouragement. If that hope had been realized, then by randomizing encouragement to use the intervention—giving encouragement to some and not others, randomly selected—we would have effectively been randomizing the intervention.[4]

Encouragement came in the form of a postcard. Individuals with eligible alleged traffic infractions were randomly assigned to receive this encouragement or not. Nothing else about their case changed: law enforcement still issued the same citations, cases were still scheduled as a matter of course with the court, and they proceeded if no other action was taken prior to that scheduled date (such as paying the ticket or using the ODR platform to show remediation of noncompliance). We ended with 289 study participants getting the encouragement and 274 not.

Encouragement designs work only if the encouragements . . . er . . . encourage. Our postcard didn’t. We weren’t terribly optimistic that it would. But we were unable to persuade stakeholders to adopt a stronger design.

In other words, our two groups used the ODR platform at nearly the same rate, meaning we are unable to untangle the effect of ODR itself from other possible influences.
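To see why a failed encouragement is fatal, consider the Wald (instrumental-variable) estimator that underlies an encouragement design: the effect of the treatment is estimated as the encouragement’s effect on outcomes divided by its effect on uptake. In the sketch below, the uptake rates are the ones we observed; the resolution rates are hypothetical, included only to show the arithmetic.

```python
# Wald/IV estimator behind an encouragement design: postcard assignment is
# the instrument, ODR use is the treatment of interest.
def wald_estimate(outcome_encouraged, outcome_control,
                  uptake_encouraged, uptake_control):
    """LATE = (ITT effect on the outcome) / (ITT effect on uptake)."""
    return (outcome_encouraged - outcome_control) / (
        uptake_encouraged - uptake_control)

uptake = (0.725, 0.726)    # observed ODR usage rates: nearly identical
resolution = (0.80, 0.55)  # hypothetical resolution rates, for illustration

late = wald_estimate(resolution[0], resolution[1], uptake[0], uptake[1])
print(late)  # about -250: dividing by a near-zero uptake difference
             # yields a meaningless estimate
```

When the denominator is essentially zero, the estimator tells us nothing, which is precisely the situation our postcard produced.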

What we did see in our data is a possibly large reminder effect.[5] Those who received the encouragement postcard were twelve percentage points more likely to appear at their subsequent court events. And, perhaps as a product of that appearance, those receiving the postcard were twenty-five percentage points more likely to resolve their cases. Previous researchers have observed reminder effects, usually from postcards or text messages about hearings, but those effects have tended to be in the five-to-eight-percentage-point range.[6] The fact that we observed larger effects leads us to hypothesize that the combined effect of a reminder and a convenient method of resolution may be larger than the reminder effect alone.

What we think we see in these data is that a lot of people were able to access the ODR platform and did so regardless of encouragement. But, subsequently, those who received the encouragement paid more attention to their cases than those who did not. We cannot be definitive about this finding. It is not the outcome for which we were testing. But it is a hypothesis that emerged and deserves more attention from future research.

Iowa
The Iowa study differed in goals and design. The Iowa Judicial Branch deployed the ODR platform, for the purposes of this evaluation, for a handful of pre-selected traffic infractions. The Iowa Judicial Branch agreed to randomize neither access to the platform nor encouragement to use it. Instead, the Branch agreed to randomize information about payment plans available via the court system as well as information about what prosecutors ordinarily negotiate. Randomization would occur for litigants who created accounts to use the platform.

In the case of information about payment plans, that information was not secret, but it was not openly available either. Usually a litigant had to affirmatively ask for a payment plan rather than being presented with it as an option to select.

In the case of reasonable expectations, the idea was that giving the litigant some information about what a prosecutor might offer in a plea could help the litigant decide to pursue negotiation or resolve the matter more quickly (presumably by paying or expediting a not guilty plea).

We were not particularly optimistic that randomizing information of this type would produce much of a treatment contrast. As it turned out, the issue was not the treatment contrast but the platform. During the several months of enrollment, almost no one used it; only one participant successfully made it through the platform. Volume did tick up slightly after enrollment closed, but it remained too low for an evaluation.

Lessons Learned and Opportunities
Some themes emerged from these evaluation attempts that we think are useful to courts as they move to ODR 2.0. We will use this section to highlight a few. Not all applied in each jurisdiction; some applied in both.

Informing the User about the Option
It seemed that courts did not make substantial efforts to inform users about the option to use ODR. Neither of our evaluation jurisdictions implemented a program mandating the use of ODR to resolve the selected case types, which likely would have de facto informed users about the platform. Instead, both made ODR opt-in. Litigants cannot opt into something unless they know it exists. In traffic matters, most courts attempt to alert litigants to the existence of ODR by including text about it on citation forms. In most jurisdictions that we have observed, in this study and others, citations are not a good forum for notifying anyone of anything important. They are packed with dense text and numerous unintelligible statutory citations. The notification of the ODR option amounts to a URL that may not intuitively appear connected to the court. Notwithstanding the URL and the option to resolve one’s case without attending a court hearing, the citation lists a court date at which, the citation says, the alleged offender is compelled to appear. It is easy to see how litigants may get confused and/or disregard the ODR option.

Expanding Eligibility Guidelines
In implementing ODR, some courts fear that a deluge of users will flood the platform, making it cumbersome for the court to manage. This fear can result in narrowly drawn eligibility guidelines. That could be a method to thin the herd of users. But attempts to clarify who can and cannot use an ODR platform may also confuse potential users, leading some eligible users to conclude that they cannot use the platform and some ineligible users to conclude the opposite. When determining eligibility for ODR, and perhaps any program, we should consider what will make logical sense to the user. Particularly given the Iowa experience, broad rather than narrow eligibility guidelines may help increase usage.

Clear Timelines
Courts often develop processes around the idea that judicial time should not be wasted. The process discussed above, in which the citation both provides information about ODR and schedules the next hearing, likely exists to preserve judicial time and to keep cases moving. Internal deadlines for users to complete ODR likely developed with the same goal in mind. However, these ODR-specific internal deadlines are not, as far as we were able to observe, communicated to the user, and they differed from the live-appearance deadlines. The result is that individuals otherwise eligible to use the platform are unable to do so because an undisclosed deadline for platform use has passed. After that deadline passes, the only option is to appear in court to resolve the matter. ODR-specific deadlines should be communicated. Indeed, we wonder what harm there would be in either letting someone resolve their case using the platform right up until the day of the hearing, or only scheduling a hearing after a disclosed deadline passes; the latter option seems to accommodate both robust use of the platform and the preservation of judicial time.

Increasing Access to Justice with Simplification and Reminders
Not everything went as we had hoped in these two evaluations. We came away with some thoughts on improving court-installed ODR platforms. We also came away with a hypothesis that could be a boon to the access to justice community: combining tools that are separately thought of as improving access to justice (here, reminders and a tool to ease case resolution) might work better than either alone. The takeaway is that simplified processes coupled with reminders may significantly increase usage of tools designed, we think, to improve access to justice. We look forward to testing, and observing others test, this hypothesis in the future.

Support for this project was provided in part by The Pew Charitable Trusts. The views expressed herein are those of the author(s) and do not necessarily reflect the views of The Pew Charitable Trusts.

[1] There are a handful of empirical studies that investigated user experiences in ODR processes. See Martin Gramatikov & Laura Klaming, Getting Divorced Online: Procedural and Outcome Justice in Online Divorce Mediation, 14 J.L. & Fam. Stud. 97, 117–18 (2012) (“finding high levels of satisfaction with online divorce procedures and quality of outcomes of both male and female divorcees in the Netherlands, although the former focused more on monetary and time costs while the latter focused on negative emotions”); Katalien Bollen & Martin Euwema, The Role of Hierarchy in Face-to-Face and E-Supported Mediations: The Use of an Online Intake to Balance the Influence of Hierarchy, 6 Negotiation & Conflict Mgmt. Res. 4:305–19, 313 (2013) (“finding that a hybrid process combining online intake with face-to-face mediation had an equalizing effect in hierarchical labor settings on parties’ fairness and satisfaction perceptions”); Marc Mason & Avrom H. Sherr, Evaluation of the Small Claims Online Dispute Resolution Pilot, Institute of Advanced Legal Studies, at 19 (Sept. 1, 2008) (finding a lower settlement rate than offline small claims mediations, as well as problems such as the online system timing out, the registration process, spam filtering, a lack of transparency, and digital access and competency, although the study was limited in scope, with a sample size of only 25 cases in the UK); Laura Klaming, Jelle van Veenen & Ronald Leenes, I Want the Opposite of What You Want: Reducing Fixed-Pie Perceptions in Online Negotiations, 2009 J. Disp. Resol. 139:85–94, 92–93 (finding that “providing negotiators with incentives independent from the resources that have to be divided, as well as providing them with information about the opponent’s preferences, led to more agreements”); Udechukwu Ojiako et al., An Examination of the ‘Rule of Law’ and ‘Justice’ Implications in Online Dispute Resolution in Construction Projects, 36 Int’l J. Project Mgmt. 301, 305, 308 (2018) (“finding that the ODR process does not affect parties’ satisfaction with the ‘rule of law’ or ‘justice’ in small claims ODR in construction projects, while suggesting further research on the cultural contexts around these concepts”). Due to the limited scale on which ODR has been implemented in American courts, there are few independent efforts to quantify the outcomes of ODR initiatives in the public sector. There are “some self-reported pre-ODR and post-ODR datasets, mostly compiled by courts and private platforms and unconfirmed by independent research.” See Joint Technology Committee Resource Bulletin, Case Studies in ODR for Courts: A View from the Front Lines, 3–18 (2017); Amy Schmitz, Expanding Access to Remedies Through E-Court Initiatives, 67 Buff. L. Rev. 89, 158 (2019); Kevin Bowling, Jennell Challa & Di Graski, Improving Child Support Enforcement Outcomes with Online Dispute Resolution, Trends in State Courts 43–48, 46 (2019); Avital Mentovich, J.J. Prescott & Orna Rabinovich-Einy, Are Litigation Outcome Disparities Inevitable? Courts, Technology, and the Future of Impartiality, 73 Ala. L. Rev. 893 (2020).

[2] See Joshua D. Angrist, Instrumental Variables Methods in Experimental Criminological Research: What, Why and How, 2 J. Experimental Criminology 23–44, 24 (2006) (arguing that randomized studies are considered the gold standard for scientific evidence).

[3] Conner Mullally, Steve Boucher & Michael Carter, Encouraging Development: Randomized Encouragement Designs in Agriculture, 95 Am. J. Agric. Econ. 5:1352–58, 2 (2013) (defining the encouragement design); Paul J. Ferraro, Counterfactual Thinking and Impact Evaluation in Environmental Policy, in Environmental Program and Policy Evaluation: Addressing Methodological Challenges, New Directions for Evaluation 75–84, 80 (2009) (suggesting the encouragement design may be appropriate when randomly restricting access to an intervention cannot be done). One assumes Ferraro means literally cannot, in other words that it would be ethically impermissible, rather than a preference not to randomly restrict access. As an aside, Ferraro’s description of the need for rigorous evaluation in environmental policy is directly analogous to the need for the same in the legal sphere. See generally D. James Greiner & Andrea Matthews, Randomized Control Trials in the United States Legal Profession, 12 Annu. Rev. Law Soc. Sci. 295–312 (2016).

[4] Mullally et al., supra note 3 (“Assignment to the ‘encouraged’ group is then used as an instrumental variable in order to estimate the impact of the treatment”).

[5] For more information about reminder effects, see, e.g., Brian H. Bornstein et al., Reducing Courts’ Failure to Appear Rate: A Procedural Justice Approach (May 2011), https://www.ojp.gov/pdffiles1/nij/grants/234370.pdf (evaluating the effectiveness of different messaging approaches in mailed postcards); Timothy R. Schnacke et al., Increasing Court Appearance Rates and Other Benefits of Live-Caller Telephone Court-Date Reminders: The Jefferson County, Colorado, FTA Pilot Project and Resulting Court Date Notification Program, 393 Ct. Rev.: J. Am. Judges 86 (2012) (evaluating the effectiveness of providing information about the consequences of failing to appear via live calls); Christopher T. Lowenkamp et al., Assessing the Effects of Court Date Notifications within Pretrial Case Processing, 43(2) Am. J. Crim. Just. 167, 173 (2017) (evaluating the effectiveness of different messaging approaches and different methods of delivering notifications); Brice Cooke et al., Using Behavioral Science to Improve Criminal Justice Outcomes: Preventing Failures to Appear in Court (January 2018), https://www.prisonpolicy.org/scans/Using_Behavioral_Science_to_Improve_Crimina_Justice_Outcomes_Cooke_et_al_2018.pdf (evaluating the effectiveness of timing and messaging approaches in text notifications); Stephen H. Taplin et al., Testing Reminder and Motivational Telephone Calls to Increase Screening Mammography: A Randomized Study, 92(3) J. Nat’l Cancer Inst. 233 (2000) (finding reminders to be as efficacious as addressing barriers, with phone call reminders performing better than postcards); Susan Maxwell et al., Effectiveness of Reminder Systems on Appointment Adherence Rates, 12(4) J. Health Care for the Poor & Underserved 504, 508 (2001) (finding a show rate of 49.9% for appointments among those who received no reminder, compared to 52.1% among those who received a mailed reminder); Mary Elaine Koren et al., Interventions to Improve Patient Appointments in an Ambulatory Care Facility, 15(4) J. Ambulatory Care Mgmt. 76 (1994) (finding an insignificant difference between phone and mail reminders, but finding some reminder to be more effective than no reminder).

[6] See id.

Illustrations and Studies Are Not the Same

January 25, 2021 by James Greiner

Recently, we became aware of at least two blog posts (see here and here) lifting passages from, and selectively highlighting a result of, a draft paper the five of us authored. These posts provide a distorted picture of what our paper said and did.  The paper is not intended as an analysis of the Public Safety Assessment-Decision Making Framework (“PSA-DMF”) System risk assessment, nor is the data the paper analyzes anywhere close to final.  Rather, the paper proposes new statistical methodology.  It uses, for illustration purposes, partial data made available to us in the middle of a still-ongoing field experiment of the PSA-DMF System.  The study that produced the partial, interim data used to illustrate our statistical methods is still ongoing and our illustration uses less than 20% (in a rough sense) of the information the study will ultimately produce.  An interim report on the study, which has been public for some time now, makes all of this clear.  Moreover, there were many results from the paper’s illustrative application, only one of which the blog posts highlight.

We are scientists.  We want our work, both our new methodologies and our applied findings, to inform public debates and decision making.  But it is not helpful to mistake illustrative applications based on partial interim data in a paper intended to propose new statistical methods for final study results, nor to highlight selectively only certain results of an overall analysis.  We hope this post will clarify mistaken impressions.

Here are some details.

Our paper proposes new statistical methodology for evaluating risk assessment instruments and uses, as its applied example, interim data from a still-ongoing randomized control trial (“RCT”) study.  The RCT study evaluates the use of a predisposition risk assessment instrument called the Public Safety Assessment (“PSA”) and the accompanying, jurisdiction-specific, Decision Making Framework (“DMF”) in Dane County, WI.  Arnold Ventures supported the development of the PSA-DMF System.

The two blog posts purport to make much of one of the several results stemming from the illustration of our new methodology via the application of our methods to interim data from the Dane County study.  The results of our illustration are many and varied.  Here is a sampling of some of these varied results as applied to the interim data from this one RCT:

  • The availability of the PSA-DMF System had no statistically significant effect on the prevalence of predisposition new criminal activity (“NCA”) in any of three conceptually defined classes of individuals appearing at a first appearance hearing.
  • The availability of the PSA-DMF System had no statistically significant effect on the prevalence of predisposition new violent criminal activity (“NVCA”) in any of three conceptually defined classes of individuals appearing at a first appearance hearing.
  • The availability of the PSA-DMF System had no statistically significant effect on the prevalence of predisposition failure to appear (“FTA”) in any of three conceptually defined classes of individuals appearing at a first appearance hearing.
  • The availability of the PSA-DMF System had no statistically significant effect on the measure of the racial fairness of the Dane County judges (actually, “Commissioners” in Dane) that two authors of our paper proposed in separate work, called “principal fairness.”
  • The availability of the PSA-DMF System had a statistically significant effect on the principal fairness measure with respect to gender comparisons, in that it increased the strictness of Commissioner decisions for men while decreasing the corresponding strictness for women, thus widening somewhat an already-existing gender difference in those decisions.

The two blog posts seize upon the last of these results (ignoring almost all of the others) to articulate an attack on the PSA-DMF System and upon the use of algorithms or risk classification instruments in criminal justice more generally.

Advocates for a particular position often simplify, distort, and selectively quote.  In a democracy such as ours, committed to free expression of ideas, it is not a mistake for them to do so.  The mistake is for anyone else to pay attention.  Good decision making in a democracy requires readers to distinguish between what is worthy of attention and belief and what is not.

One indication of whether a report on research deserves attention is whether the report’s authors contacted the relevant researchers before publishing, to request comment or clarification about the nature of the research.  To our knowledge, the authors of neither blog post did so here.  Had they done so, we would have been happy to bring certain facts to their attention.  First, the paper they quote is about statistical methodology, as a cursory review of it reveals.  It is quite nerdy.  The “results” in the paper are intended to illustrate how the statistical methodology works on a dataset.  They are not intended to form the basis of conclusions relevant for policy.  That is why, for example, we made no attempt in the paper to adjust for the fact that we conducted multiple tests on the same data.  Including an illustrative application in a paper proposing new statistical techniques is traditional in the field, and again, the results of an illustration are not intended to form the basis of policy making.  Second, as a lengthy report and a more accessible FAQ sheet (both publicly available on the website of the Access to Justice Lab, where two of us work) make clear, the data used for the paper were an incomplete subset of about 20% of the data the Dane County RCT will eventually produce.  Moreover, the Dane County RCT is one of five field RCT studies the A2J Lab has underway (a sixth RCT field operation will produce a technical report only, due to IRB restrictions).  In a rough, hand-wavy sense, we are talking about one-fifth of one-sixth of the studies underway in this area.  Third, as the report and the FAQ sheet both discuss, based on the data produced from the Dane County study at this time, the availability of the PSA-DMF System had no statistically significant effect on the number of predisposition days that individuals appearing at a first appearance hearing spent incarcerated, suggesting (although not proving) that any disparity in the strictness or permissiveness of Commissioner decisions (which is what the blog posts focus on) did not translate into a difference in predisposition jail time.

Stepping back, it is not accurate to say that any statistically significant difference between two groups immediately means actionable discrimination.  Even taking the illustrative gender result upon which the two blog posts focus as the final word (it isn’t) from the only field RCT (it isn’t) on the only risk assessment instrument used in criminal law (not even close), one should be cautious.  Gender disparities have long been present in criminal justice, with women generally being arrested with less frequency than men and receiving more lenient treatment from the court system.  Saying that gender disparities exist, or even increase, because of some intervention requires careful thought about whether we want to do anything about those disparities, and if so, what we might want to do.  In this case, suppose the presence of the PSA-DMF System increased gender disparities by increasing the leniency of the criminal justice system’s treatment of women vis-à-vis similarly situated men (as appears to be partially the case from our illustrative result).  One way the criminal justice system could solve that “problem” would be by treating women more like men, i.e., treating women more strictly/harshly.  Is that what we want?

Good science is slow, sometimes maddeningly so.  Credible research takes time.  Useful inference requires careful attention to context, and policy decisions require the weighing of alternatives.  It is not helpful to any of these processes for those with a political ax to grind to present results designed to illustrate the application of new statistical methods, results based on a subset of preliminary data from one of several ongoing studies, as though those results were the last word on anything, including that one study.

One final thought:  The Access to Justice Lab, where two of us work and which conducted the Dane County field operation, is supported by Arnold Ventures, and the Dane County study itself is supported by Arnold Ventures.  That said, all five of us are agnostic at this stage about whether any criminal justice risk assessment instrument, including the PSA-DMF System, is a good or a bad thing.  In our view, credible evidence one way or the other does not yet exist.  What little does exist suggests to us that the sturm und drang about risk assessments, from proponents and opponents alike, may be overblown.  Moreover, what one thinks about whether it is a good or bad idea to use risk assessment instruments of any kind should turn in huge part on what one would do if risk assessments are not used.  The most common alternative to the use of risk assessments, in criminal justice at least, is unguided, or loosely guided, or less guided human decision making.  And those opposed to the use of risk assessments in criminal justice apparently prefer these unguided human decisions.  The United States has had decades of experience with unguided human decision making in its criminal justice systems.  How has that gone?

James Greiner
Ryan Halen
Kosuke Imai
Zhichao Jiang
Sooahn Shin

* Note:  authors are listed in alphabetical order by last name

How we’re learning more about ways to improve access to justice across the U.S.

December 12, 2020 by Sandy North

We’re closing out 2020 with a bang: We have three new studies in the field. Our amazing partners have worked with us to prepare and launch these projects in the face of the unprecedented challenges posed by COVID-19. Focusing on different topics and in different geographies, these new studies have the potential to improve access to justice across the U.S.


Transformative Justice Program (Williamson County, TX)

The Problem

In Texas, young adults aged 17–24 are overrepresented in the adult criminal justice system, accounting for 29% of the state’s arrests while making up only 11% of the population. Emerging adults also have the highest short-term recidivism of any age group, due to underlying factors that the current criminal justice system does not address. These factors include, among others, mental health struggles, substance abuse, and co-occurring disorders.

The Program

The program is Williamson County’s first felony diversion program. It diverts emerging adults charged with a low-level felony offense from the traditional criminal justice system. The program is a court-based system that combines release into the community with developmentally appropriate, intensive, community-based services. County staff work with participants to create individualized plans to address health, housing, educational, and other needs.

Eligible participants can be part of the program for up to 18 months, and they will be connected with community social services providers to help them meet their individualized program goals. Participants who successfully meet the goals identified in their individual plan will graduate from the program and will be eligible for expungement of the record of the arrest and charges.

The Study

In partnership with Williamson County, the Public Policy Research Institute at Texas A&M University, and the University of Texas Health Science Center at Houston (UTHealth) School of Public Health, the A2J Lab has launched a randomized evaluation of the diversion program.

The research team will conduct an RCT to identify the effects of the program on several outcomes, including recidivism rates and health outcomes. The study will also supplement criminal justice data with quarterly surveys and other data sources.

What We’ll Learn

This study will provide important evidence about the impact of community-based services. The population has a high level of need. Because they are so young, any benefit they experience could improve their lives and their communities for decades to come. With this research, we will learn the extent of this potential improvement.


Online Dispute Resolution (Carroll County, IA and Miami-Dade County, FL)

The Problem

Miami-Dade will likely issue tens of thousands of traffic compliance tickets in 2020. These are minor infractions that often simply require a motorist to show proof of license, registration, or insurance to resolve the case. The complicating factor is that the motorist needs to get to the courthouse on the right day and time and wait until it is their turn to give this proof. Similarly, the three different law enforcement agencies that police the Carroll County roadways issued thousands of traffic tickets last year. All of these tickets require an in-person appearance if there’s a discrepancy to resolve.

Attendance can be difficult, and the consequences for not being in the right place at the right time can be expensive. While the process of having these tickets resolved is often inexpensive, leaving the issue unattended can result in increased fines and fees. The courts’ solution to this?  Put it online.

The Program

With the installation of Online Dispute Resolution (ODR), motorists can now avoid the hassle of getting to the courthouse (with all of the typical inconveniences of parking, reliable transportation, taking time off of work, and finding childcare) and instead upload their compliance documents or negotiate discrepancies online. Prosecutors and court staff can then review what is submitted and handle the ticket without the motorist ever setting foot in the courtroom.

The Study

The A2J Lab is running RCTs in both counties. The research team will measure outcomes such as perceptions of the justice system, time to disposition, settlement success, and failure to appear.

What We’ll Learn

This change is happening at a critical time for public perception of state courts. As courts move practices online to cope with the COVID-19 pandemic, they have very little information about how those changes affect people’s perceptions of court experiences and other outcomes. This study will collect data on how using ODR affects users’ overall perception of the justice system. These data, along with information about how cases resolve, will create evidence about how ODR affects the experiences of people in Miami-Dade and Carroll counties.


Plain Language Court Forms (DuPage County, IL)

The Problem

Reducing technical jargon and “legal-ese” is of major interest to access to justice advocates. For pro se litigants (people without lawyers), forms can be so complex that they require large sets of self-help materials to understand. Making the forms themselves easy-to-read documents is a more efficient way to help people without attorneys file the paperwork they need.

The Program

The Illinois Supreme Court Commission on Access to Justice created a Standardized Forms Committee to explore the creation of pro se forms that implement standardization and plain language practices. The new forms attempt to provide better guidance and support to pro se litigants by integrating plain language and simplification.

The study will include individuals who attempt to download a pro se divorce form for filing in DuPage County, Illinois. Any attempt to download a relevant form through the various websites, such as the DuPage County website (www.dupageco.org), the computers in the self-help center at the courthouse, or local legal aid organization websites, will redirect to a study-based webpage. This webpage will verify eligibility for the study, provide information about the study, and then provide, at random, either the new standardized pro se form or the previously created DuPage-specific pro se divorce form.
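As a rough illustration of that last step, the sketch below shows one way a study webpage might randomly serve one of the two forms while keeping each visitor’s assignment stable. The file names and the persistence mechanism are assumptions for illustration, not the study’s actual implementation.

```python
import random

# Hypothetical file names; the real study serves the new standardized form
# or the previously created DuPage-specific form.
FORMS = ("standardized_plain_language_divorce.pdf",
         "dupage_specific_divorce.pdf")

def serve_form(visitor_id, assignments, rng=random.Random(7)):
    # Assign each eligible visitor once, so repeat downloads return the
    # same form and the randomization is not diluted.
    if visitor_id not in assignments:
        assignments[visitor_id] = rng.choice(FORMS)
    return assignments[visitor_id]

assignments = {}
print(serve_form("visitor-001", assignments))
print(serve_form("visitor-001", assignments))  # same form on a repeat visit
```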

The Study

Through metadata and time-stamp information attached to the downloaded forms, as well as information documented in the court’s case management system, the A2J Lab will track a number of outcomes. These outcomes will include success of filing the form, time to filing, time to disposition, number of procedural errors, and whether the study participants use the downloaded form or end up reaching out to legal aid for assistance. The A2J Lab expects to conduct the field randomization for approximately six months and will track each case through disposition.

What We’ll Learn

This evaluation will give concrete evidence about what impact, if any, access to plain language forms has. Increasing the body of evidence about their efficacy will help courts decide how to deploy their resources.


Preliminary data for all of these studies will be available in the next year.

A2J Lab presents an interim report on the Public Safety Assessment–Decision Making Framework System RCT in Dane County, Wisconsin

September 24, 2020 by Sandy North

Today, the A2J Lab presented the findings from the interim report of the randomized evaluation of the Public Safety Assessment–Decision Making Framework System (PSA-DMF), a pretrial risk assessment tool and related decision-making framework, in Dane County, Wisconsin. The full interim report is now available on the A2J Lab website.

The presentation at the Dane County Criminal Justice Council was the first result of a years-long study, which is not yet complete. In the late spring of 2017, Dane County started providing the PSA-DMF System information to judicial officers deciding how much and what kind of bail and supervision to assign to individuals who have been arrested; such decisions affect whether the individual will be released or remain in jail until trial. Working with the A2J Lab and Arnold Ventures, Dane County began its randomized field experiment a month later.

The PSA-DMF System is one tool in the toolbox that the judicial officer can draw upon in the exercise of professional discretion. Scientists supported by Arnold Ventures produced the PSA by reviewing past data on criminal history, demographics, new crimes, rates at which participants failed to appear at their court hearings, and other potential risk factors. The PSA scores feed into the Decision Making Framework, which combines the information from the PSA with a community’s local policies and values, its laws, and its resources to provide a recommendation regarding pretrial release and, if release is granted, a supervision level. Judicial officers may use the PSA-DMF System report when deciding whether to release an individual before trial, and this decision always rests with the judicial officer.

The experiment used something like a coin flip to divide cases into two groups. In the treated group, the judicial officer who was deciding how much and what kind of bail and pretrial supervision to assign received a paper printout with the PSA-DMF System information on it. In the control group, the judicial officer did not receive the paper printout. In other words, the control group received standard practice, and the treated group received the new system in which the judicial officer got the PSA-DMF System printout.

The report released this week analyzes data from one year of follow-up for cases that were included in the study between the start of the study and the middle of 2018. Dane County randomized cases from the middle of 2017 until the end of 2019. In this interim report, the A2J Lab analyzed data the County provided to compare the treated and control groups on (among other things) measurements of racial fairness, number of days incarcerated (if any), rates at which participants failed to appear at their court hearing, new criminal activity, and new violent criminal activity. Criminal justice officials in Dane County worked to provide information to the A2J Lab to facilitate the A2J Lab’s independent evaluation in a spirit of learning and a desire to improve.

Quantitatively, the A2J Lab does not yet have enough cases, or enough of a follow-up period on those cases, to make firm conclusions about whether it is better or worse to make the PSA-DMF System printout available to the judicial officer before assignment of bail (if any) and release conditions. The A2J Lab’s studies, like all randomized controlled trials, require enough participants to detect policy-relevant, statistically significant differences between treatment groups (here, the PSA-DMF System report versus business as usual). Because of the two-year follow-up period, the County will provide the A2J Lab full information on all arrestees in the study sometime in early 2022. If the data are provided then, the A2J Lab’s final report will be available in the summer of 2022.
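As a rough illustration of this sample-size logic, the sketch below computes the participants needed per group to detect a difference between two hypothetical rates. The 10% versus 15% rates, the significance level, and the power target are all illustrative assumptions, not figures from the Dane County study.

```python
# Sample-size sketch for comparing two proportions (e.g., failure-to-appear
# rates of 10% vs. 15%); all numbers are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

effect = proportion_effectsize(0.15, 0.10)  # Cohen's h for the two rates

n_per_group = NormalIndPower().solve_power(
    effect_size=effect,
    alpha=0.05,   # conventional 5% significance level
    power=0.8,    # 80% chance of detecting a true difference this size
    ratio=1.0,    # equal group sizes
)
print(round(n_per_group))  # roughly 340 per group under these assumptions
```

Smaller true differences require substantially more cases, which is why a study of this kind cannot draw firm conclusions until enrollment and follow-up are complete.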

The limited data available thus far, not enough to draw firm conclusions, suggested several findings:

  • There is some evidence that providing the PSA-DMF System printout to the judicial officer caused a change in the officer’s decisions.
  • Generally, when the printout indicated that an individual presented lower risk, the judicial officer was less likely to require cash bail or, if cash bail was required, the amount was lower than in comparable control group cases. The opposite was generally true in the treated group when the printout indicated that the individual presented higher risk, as compared to the control group. This change was statistically significant but mild, and we cannot yet tell whether the change is policy-relevant.
  • Treated group cases varied less in bail types and amounts than did control group cases. This change was strong and statistically significant.

As of the time of the interim report, there was no statistically significant difference between treated versus control group cases with respect to:

  • various measures of the racial fairness of the judicial officer’s decisions;
  • the number of days (if any) of pretrial incarceration;
  • the frequency with which arrestees failed to appear at court dates; or
  • the frequency with which arrestees were arrested for new crimes, including new violent crimes, during the pretrial period.

These findings might change, however, when the A2J Lab finishes analysis of the final dataset.

While it is too early to draw conclusions about whether the PSA-DMF System is positive, negative, or neutral for Dane County, this data provides a first look at the type of report that the final dataset will support.

