February 8, 2023
Dr. Maria Altieri
As surgeons often make high-stakes, time-sensitive decisions, there is a growing interest in the use of clinically actionable analytics to augment surgical decision-making.
Among the many decision-support tools available, the POTTER (Predictive OpTimal Trees in Emergency Surgery Risk) artificial intelligence (AI)-based calculator has yielded promising results and holds distinctive potential for clinical application. POTTER, which leverages a flexible decision-tree-based machine-learning approach, has predictive accuracy that is on par with, or greater than, that of many other risk calculators. Its algorithm was validated using ACS National Surgical Quality Improvement Program (NSQIP®) data from 2014. The Surgical Risk Preoperative Assessment System (SURPAS) model (developed and validated via ACS NSQIP data) and several algorithms reported by Chiew and colleagues have state-of-the-art accuracy for predicting mortality and postoperative intensive care unit (ICU) admission for broad, heterogeneous surgical patient populations.1,2
Figure 1. POTTER Calculator
Although POTTER use has been reported for specific patient populations such as emergency surgical patients and elderly patients, the evidence suggests that this tool is broadly generalizable.3-5 Perhaps most importantly, POTTER is available in a user-friendly mobile application, making it ready for deployment in clinical settings to assess operative risk or to aid in counseling patients.
POTTER has some opportunities for improvement. The application requires manual data entry, which could be a small hindrance for some users, although fewer than 10 variables must be entered (see Figure 1).6
In contrast, AI predictive analytic platforms that use automated electronic health record (EHR) data inputs obviate manual data entry.7 Historically, there has been a lack of high-level evidence from prospective studies supporting automated EHR data entry, but recent evidence suggests that this approach is effective and can be deployed as a mobile device application.8
In addition, POTTER uses preoperative data and does not incorporate intraoperative data that are potentially informative in predicting postoperative complications. POTTER algorithms were trained primarily on US outcomes from 2007 to 2013, so they may not generalize well to other countries and may not accurately represent risk in 2023. Like most similar risk calculators, POTTER requires prospective validation and assessments of its effects on decision-making and patient outcomes.
Few studies have robustly tested the effects of AI-enabled decision support on clinical decision-making. The Hypotension Prediction (HYPE) trial9 is a notable exception. In a randomized study, the authors deployed an AI algorithm that predicted impending intraoperative hypotension. The HYPE trial showed that anesthesiologists using the algorithm acted earlier, differently, and more frequently, and their patients experienced fewer hypotensive events and less time-weighted hypotension.
Although unproven, it remains plausible that on a larger scale, the HYPE trial algorithm could decrease complications related to intraoperative hypotension (e.g., acute kidney injury), and thus improve patient outcomes.
Likewise, it remains to be determined whether the POTTER app and similar surgical AI decision support systems improve patient outcomes. Lupei, Sun, and colleagues10,11 have previously reported AI model degradation from internal or external validation to real-time validation, as well as the impact of AI-enabled tools that are integrated into real-time clinical workflows. Although AI offers greater potential than basic statistical modeling to accurately represent complex, nonlinear pathophysiology, recent studies have demonstrated no substantial superiority of deep learning over regression in classifying the illness severity of individual patients using readily available clinical data.12,13
For cases in which AI offers no predictive performance advantage, it may be preferable to use regression-based algorithms that are more easily interpreted by clinicians and have a longer, stronger record of success in clinical settings.
Our personal, anecdotal experience with POTTER is that it provides accurate, data-driven predictions of postoperative complications, which can be useful adjuncts to shared decision-making processes and prognostic conversations with patients and caregivers, especially when the prognosis is poor.
The POTTER app helps predict postoperative morbidity and mortality following emergency surgery, as compared with similar elective surgery. After inputting information regarding the patient, the user can select the outcome for which a risk estimate is desired. A series of questions follows. Because the questions form a decision tree, each new question depends on the answer to the previous one, and the app calculates the risk from the full sequence of responses. The final result predicts the risk of death and of 18 postoperative complications for patients undergoing emergency general surgery procedures.
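The question-by-question flow described above can be illustrated with a minimal Python sketch. This is not POTTER's actual tree: the variable names (`age_over_65`, `on_ventilator`, `sepsis`), the tree shape, and the risk values are hypothetical placeholders chosen only to show how each answer determines the next question and how the path ends in a single risk estimate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Leaf:
    risk: float  # placeholder probability, NOT a clinical estimate


@dataclass
class Question:
    prompt: str  # hypothetical variable name, not a POTTER input
    branches: Dict[str, Union["Question", Leaf]]


def walk(node: Union[Question, Leaf],
         answer: Callable[[str], str]) -> Tuple[float, List[str]]:
    """Follow one path through the tree, asking only the questions on it."""
    asked: List[str] = []
    while isinstance(node, Question):
        asked.append(node.prompt)
        node = node.branches[answer(node.prompt)]
    return node.risk, asked


# Toy tree with invented structure and numbers, for illustration only.
TOY_TREE = Question("age_over_65", {
    "yes": Question("on_ventilator", {"yes": Leaf(0.40), "no": Leaf(0.10)}),
    "no": Question("sepsis", {"yes": Leaf(0.15), "no": Leaf(0.02)}),
})

answers = {"age_over_65": "yes", "on_ventilator": "no"}
risk, asked = walk(TOY_TREE, answers.__getitem__)
print(risk, asked)  # 0.1 ['age_over_65', 'on_ventilator']
```

In this toy example the `sepsis` question is never asked, because the first answer routes the user down the other branch. That is why a tree-based calculator can reach an estimate with only a handful of inputs per patient.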
By augmenting, rather than replacing, the knowledge, intuition, and skills that surgeons offer their patients, clinically actionable predictive analytics can anchor decision-making and prognostication in objective data and reduce the provider-specific variability inherent to hypothetico-deductive reasoning, the hallmark of current surgical decision-making practices.
The authors have no conflicts of interest related to the POTTER application.
The thoughts and opinions expressed in this viewpoint article are solely those of the authors and do not necessarily reflect those of the ACS.
Dr. Maria Altieri is the section chief of gastrointestinal surgery at the Hospital of the University of Pennsylvania (Penn) in Philadelphia and assistant professor of surgery at the Perelman School of Medicine at Penn.