Use of Machine Learning in Pediatric Surgical Clinical Prediction Tools: A Systematic Review

  • Amanda Bianco
    Footnotes
    1 Authors contributed equally to this work.
    Affiliations
    Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada
  • Zaid A.M. Al-Azzawi
    Footnotes
    1 Authors contributed equally to this work.
    Affiliations
    Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada
  • Elena Guadagno
    Affiliations
    Harvey E. Beardmore Division of Pediatric Surgery, The Montreal Children's Hospital, McGill University Health Centre, Montreal, Quebec, Canada
  • Esli Osmanlliu
    Affiliations
    Department of Pediatrics, McGill University Health Centre, Montreal, Quebec, Canada
  • Jocelyn Gravel
    Affiliations
    Department of Pediatric Emergency Medicine, Sainte-Justine Hospital, Université de Montréal, Montreal, Quebec, Canada
  • Dan Poenaru
    Correspondence
    Corresponding author. Harvey E. Beardmore Division of Pediatric Surgery, The Montreal Children’s Hospital, Room B04.2022, 1001 Decarie Boulevard, Montreal, Quebec H4A 3J1, Canada.
    Affiliations
    Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada

    Harvey E. Beardmore Division of Pediatric Surgery, The Montreal Children's Hospital, McGill University Health Centre, Montreal, Quebec, Canada
  • Author Footnotes
    1 Authors contributed equally to this work.
Open Access. Published: January 18, 2023. DOI: https://doi.org/10.1016/j.jpedsurg.2023.01.020

      Highlights

      • Clinical prediction tools (CPTs) are decision-making instruments utilizing patient data to predict specific outcomes, and were first developed using statistical models, but are now increasingly being supplemented by machine learning (ML).
      • This systematic review investigated the clinical validity and applicability of ML-based CPTs compared to statistical CPTs in pediatric surgery.

      Summary

      Purpose

      Clinical prediction tools (CPTs) are decision-making instruments utilizing patient data to predict specific clinical outcomes, risk-stratify patients, or suggest personalized diagnostic or therapeutic options. Recent advancements in artificial intelligence have resulted in a proliferation of CPTs created using machine learning (ML), yet the clinical applicability of ML-based CPTs and their validation in clinical settings remain unclear. This systematic review aims to compare the validity and clinical efficacy of ML-based CPTs with those of traditional CPTs in pediatric surgery.

      Methods

      Nine databases were searched from 2000 until July 9, 2021 to retrieve articles reporting on CPTs and ML for pediatric surgical conditions. PRISMA standards were followed, and screening was performed by two independent reviewers in Rayyan, with a third reviewer resolving conflicts. Risk of bias was assessed using the PROBAST.

      Results

      Out of 8,300 studies, 48 met the inclusion criteria. The most represented surgical specialties were pediatric general (14), neurosurgery (13) and cardiac surgery (12). Prognostic (26) CPTs were the most represented type of surgical pediatric CPTs followed by diagnostic (10), interventional (9), and risk stratifying (2). One study included a CPT for diagnostic, interventional and prognostic purposes. 81% of studies compared their CPT to ML-based CPTs, statistical CPTs, or the unaided clinician, but lacked external validation and/or evidence of clinical implementation.

      Conclusions

      While most studies claim significant potential improvements by incorporating ML-based CPTs in pediatric surgical decision-making, both external validation and clinical application remain limited. Further studies must focus on validating existing instruments or developing validated tools, and incorporating them in the clinical workflow.

      Keywords

      Abbreviations:

      CPT (Clinical Prediction Tool), PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), PROBAST (Prediction model Risk Of Bias ASsessment Tool), ML (Machine Learning)

      Category of the manuscript

      This manuscript is a systematic review.

      Conflicts of interest

      The authors of this manuscript have no conflicts of interest to disclose.

      Previous communication

      This manuscript is based on the abstract that was accepted for the 2022 Canadian Association of Pediatric Surgeons (CAPS) Annual Meeting.

      Financial support statement

      This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

      1. Introduction

      Clinical prediction tools (CPTs) have become ubiquitous in the diagnosis and treatment of many conditions. CPTs are defined as clinical decision-making instruments which utilize different aspects of the patient's clinical history, physical examination and various biologic and imaging test results to predict a specific clinical outcome, risk-stratify patients, or suggest a personalized diagnostic or therapeutic course of action [1]. Historically, CPTs were first developed using statistical models, often regression models (e.g. logistic regression or recursive partitioning), in order to create a clinical decision-making framework [2]. Their overall aim was to standardize patient treatments while simultaneously improving outcomes.
      Recently, multiple CPTs have been created using machine learning (ML, a subset of artificial intelligence), rather than traditional statistical methods, due to advancements in electronic information storage and sharing [3]. Traditional statistical CPTs often rely on linear regression models and subjective human input to identify variables of interest correlated with a specific outcome. On the other hand, ML focuses on computer algorithms that “learn” from input data to predict a specific outcome [4]. ML-based CPTs will extrapolate, from a limited set of inputs regarding a patient encounter, a possible diagnosis or risk assessment to guide future clinical decision-making. Therefore, they leverage the ability of ML to deal with large and complex datasets such as the increasingly available electronic health records data to extrapolate simple decision-making frameworks.
      While most ML-based CPTs have compared favourably to traditional tools in silico (i.e. outside clinical settings), there is limited evidence of superiority of such tools in actual clinical workflows [5]. For example, Marcinkevics et al. found that ML-based CPT algorithms achieved better diagnostic performance than either the Alvarado or the Pediatric Appendicitis Scores in a pediatric population with suspected appendicitis. However, the clinical applicability of these ML-based CPTs and their validation in various clinical settings remain to be determined [6].
      Whilst many ML-based CPTs have recently been devised for a variety of different surgical pediatric conditions, it remains unclear if they were externally validated or implemented in the clinical setting. Therefore, this systematic review aims to examine the clinical efficacy and validity of ML-based CPTs as compared to statistical CPTs currently in use in the pediatric population. Our results are expected to enhance the development, validation and integration of ML-based CPTs in pediatric surgery, and shed light on the current gaps that exist in their clinical implementation and validation.

      2. Methods

      We conducted a systematic review of the published literature interrogating the use of CPTs and ML in pediatric surgery. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and checklist for conducting systematic reviews were used [Liberati A., Altman D.G., Tetzlaff J., et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration.] (see Supplementary Material). The review was registered with the National Institute for Health Research’s PROSPERO website (CRD42021268036) and Open Science Framework (https://doi.org/10.17605/OSF.IO/J8M9D). A senior medical librarian (EG) searched the following databases from 2000 until July 9, 2021: Medline (Ovid), Embase (Ovid), Cochrane (Wiley), Global Health (Ovid), Web of Science (Clarivate Analytics), ProQuest Central, Inspec – Engineering Village (Elsevier), Africa Wide Information (Ebsco) and Global Index Medicus (WHO) with no language restrictions. The search strategy used variations in text words found in the title, abstract or keyword fields, and relevant subject headings to retrieve articles looking at CPTs and ML in the pediatric setting. Animal studies were excluded (see Supplementary Material for the full search strategy and PRISMA-S extension for searching).
      References found were imported into EndNote X9, where duplicates were removed. Records were then imported into the online platform Rayyan [Ouzzani M., Hammady H., Fedorowicz Z., et al. Rayyan—a web and mobile app for systematic reviews.] and screened by two independent reviewers (AB & ZA), with a third reviewer (DP/EG) resolving conflicts. Inter-rater reliability was measured using the first 50 articles, aiming for a kappa score above 0.80 (80%) prior to commencing the two-step screening process. Articles were first screened by title and abstract, followed by full-text reviews of included articles. The primary reason for exclusion was documented in a Google Sheet.
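
      The agreement check described above can be illustrated with a minimal sketch. It assumes the kappa score is Cohen's kappa for two raters (the text only says "kappa"), and the reviewer decisions below are hypothetical, not data from this review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions on 50 pilot abstracts.
reviewer_1 = ["include"] * 10 + ["exclude"] * 40
reviewer_2 = ["include"] * 9 + ["exclude"] * 41
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}, meets 0.80 threshold: {kappa > 0.80}")
```

      Kappa corrects raw agreement for the agreement expected by chance, which matters in screening because most records are excluded by both reviewers.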

      2.1 Inclusion and Exclusion Criteria

      Studies were included if (1) the participant group was composed of children (birth to 18 years of age), (2) data from the pediatric population were analyzed separately from those of the adult population if both samples were assessed, (3) data from the surgical population were analyzed separately from those of the non-surgical population if both samples were assessed, and (4) the study compared the validity and clinical efficacy of CPTs in pediatric surgery.
      Studies were excluded if (1) the participant group was solely composed of preterm neonates or adults (over 18 years of age), (2) the study involved animals, (3) the condition studied was not surgical, (4) the article was a conference abstract or conference paper, clinical trial or a methodological study based on an ML-based CPT, (5) the full text could not be obtained, (6) patient demographics were absent and/or the CPT-related methodology was incomplete, or (7) the study used only images, signals, gene expression profiles or other genetic data. Literature reviews (narrative, scoping or systematic) were also excluded; however, individual studies included in literature reviews were kept if they met the inclusion criteria.

      2.2 Data Extraction and Analysis

      The following data were extracted from all studies: country of origin, study type, surgical specialty, condition studied, demographics of study population (e.g. sample size, average age and female-to-male ratio), CPT type (e.g. diagnostic, prognostic, interventional, or risk stratifying), as well as ML model studied and comparator studied (e.g. no comparator, ML-based CPT, traditional statistical CPT or unaided clinician).
      To analyze how the CPTs were trained and validated, the source and country of origin of the datasets, average number of input variables studied, validation method used, train-test split ratio used (if applicable), model performance measures used, and best overall models were extracted from all studies along with the aforementioned demographics of the study population. Data were extracted for both internal and external validation, if the latter was performed. Future plans or next steps for each CPT were identified to assess their clinical applicability.

      2.3 Quality Assessment

      The risk of bias and applicability of the included studies were assessed by two independent reviewers (AB & ZA) using the Prediction model Risk Of Bias ASsessment Tool (PROBAST) [Wolff R.F., Moons K.G.M., Riley R.D., et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies.]. The PROBAST includes 20 signaling questions across 4 domains: participants, predictors (i.e. input variables), outcome, and analysis.

      3. Results

      The initial search yielded 9291 studies, of which 8300 remained for title and abstract screening after duplicate removal. Out of the 139 studies included for full-text review, 48 studies were retained for final data extraction (Fig. 1 and Table 1). These included 41 retrospective, 6 prospective, and one study that was both retrospective and prospective. Table 1 highlights general study characteristics and surgical specialties in which ML-based CPTs were used in pediatric surgical decision-making. Information regarding the specific ML models used in each study is summarized in Table S1. Most studies were performed in high-income countries, the United States being most represented (28/48). The application of ML-based CPTs spanned many different surgical specialties, the most represented being pediatric general (14), neurosurgery (13) and cardiac surgery (12) (Table 1). The most utilized ML methods were random forest (54.2%) followed by decision trees (37.5%) and support vector machines (25%) (Fig. 2 (A) and Table S1). CPTs utilizing ML models had fewer than 50 input variables per model in 31 out of 48 studies (Fig. 3).
      Fig. 1. PRISMA Flow Diagram. From: Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. https://doi.org/10.1136/bmj.n71. For more information, visit: http://www.prisma-statement.org/.
      Table 1. Included studies.
      Author | Country of Origin | Study Type | Surgical Specialty | Condition Studied | CPT Type
      Marcinkevics et al. [Marcinkevics R., Reis Wolfertstetter P., Wellmann S., et al. Using Machine Learning to Predict the Diagnosis, Management and Severity of Pediatric Appendicitis.] | Germany, Switzerland | Retrospective | General | Appendicitis | Diagnostic, Prognostic, Interventional
      Hu et al. [Hu Y., Gong X., Shu L., et al. Understanding risk factors for postoperative mortality in neonates based on explainable machine learning technology.] | China, USA | Retrospective | General, Cardiac, Thoracic, Neurosurgery, Cranioplasty, Gynecological, Plastic, Other | Neonatal postoperative mortality | Prognostic
      Lure et al. [Lure A.C., Du X., Black E.W., et al. Using machine learning analysis to assist in differentiating between necrotizing enterocolitis and spontaneous intestinal perforation: A novel predictive analytic tool.] | USA | Retrospective | General | Necrotizing enterocolitis, spontaneous intestinal perforation | Diagnostic
      Stiel et al. [Stiel C., Elrod J., Klinke M., et al. The Modified Heidelberg and the AI Appendicitis Score Are Superior to Current Scores in Predicting Appendicitis in Children: A Two-Center Cohort Study.] | Germany | Retrospective | General | Appendicitis | Diagnostic
      Troesch et al. [Troesch V.L., Wald M., Bonnett M.A., et al. The additive impact of the distal ureteral diameter ratio in predicting early breakthrough urinary tract infections in children with vesicoureteral reflux.] | USA | Retrospective | Urology | Febrile breakthrough urinary tract infection within 13 months of starting prophylactic antibiotics in children with vesicoureteral reflux | Prognostic
      Cooper et al. [Cooper J.N., Wei L., Fernandez S.A., et al. Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.] | USA | Retrospective | General, Thoracic, Otolaryngology, Orthopedic, Urology, Neurosurgery, Plastic | 30-day surgical morbidity | Prognostic
      Chen et al. [Chen C.K., Manlhiot C., Mital S., et al. Prelisting predictions of early postoperative survival in infant heart transplantation using classification and regression tree analysis.] | Canada, Singapore | Retrospective | Cardiac | Infants listed for heart transplantation who would survive at least 3 months post-transplantation | Prognostic
      Lorenzo et al. [Lorenzo A.J., Rickard M., Braga L.H., et al. Predictive Analytics and Modeling Employing Machine Learning Technology: The Next Step in Data Sharing, Analysis, and Individualized Counseling Explored With a Large, Prospective Prenatal Hydronephrosis Database.] | Canada | Prospective | Urology | Infants who are most likely to undergo a surgical intervention for prenatal hydronephrosis | Interventional
      Ward et al. [Ward A., Jani T., De Souza E., et al. Prediction of Prolonged Opioid Use After Surgery in Adolescents: Insights From Machine Learning.] | USA | Retrospective | Anesthesiology | Prolonged opioid use after surgery in adolescents | Prognostic
      Zhang et al. [Zhang K., Liu X., Jiang J., et al. Prediction of postoperative complications of pediatric cataract patients using data mining.] | China | Retrospective | Ophthalmology | Postoperative complications of pediatric cataract patients within 1 year after surgery | Prognostic
      Jalali et al. [Jalali A., Buckley E.M., Lynch J.M., et al. Prediction of periventricular leukomalacia occurrence in neonates after heart surgery.] | USA | Prospective | Cardiac | Occurrence of periventricular leukomalacia in neonates post-cardiac surgery | Prognostic
      Jalali et al. [Jalali A., Simpao A.F., Galvez J.A., et al. Prediction of Periventricular Leukomalacia in Neonates after Cardiac Surgery Using Machine Learning Algorithms.] | USA | Retrospective | Cardiac | Occurrence of periventricular leukomalacia in neonates post-cardiac surgery | Prognostic
      Miller et al. [Miller R., Tumin D., Cooper J., et al. Prediction of mortality following pediatric heart transplant using machine learning algorithms.] | USA | Retrospective | Transplant | Predicting mortality post-transplant surgery within 1, 3 or 5 years | Prognostic
      Sun et al. [Sun H., Liu Y., Song B., et al. Prediction of arrhythmia after intervention in children with atrial septal defect based on random forest.] | China | Retrospective | Cardiac | Arrhythmia after interventional closure in children with atrial septal defect | Prognostic
      Wilson et al. [Wilson T.J., Chang K.W.C., Yang L.J.S. Prediction Algorithm for Surgical Intervention in Neonatal Brachial Plexus Palsy.] | USA | Retrospective | Neurosurgery | Persistent neonatal brachial plexus palsy patients that would benefit from surgery | Interventional
      Habibi et al. [Habibi Z., Ertiaei A., Nikdad M.S., et al. Predicting ventriculoperitoneal shunt infection in children with hydrocephalus using artificial neural network.] | Iran | Retrospective | Neurosurgery | Ventriculoperitoneal shunt infection in children with hydrocephalus | Prognostic
      Guo et al. [Guo K., Fu X., Zhang H., et al. Predicting the postoperative blood coagulation state of children with congenital heart disease by machine learning based on real-world data.] | China | Retrospective | Cardiac | Postoperative blood coagulation state of children with congenital heart disease | Prognostic
      Bertoni et al. [Bertoni D., Sterni L.M., Pereira K.D., et al. Predicting polysomnographic severity thresholds in children using machine learning.] | USA | Retrospective | Otolaryngology | Children needing postoperative overnight monitoring based on the polysomnographic severity of obstructive sleep disordered breathing | Interventional
      Wadhwani et al. [Wadhwani S.I., Hsu E.K., Shaffer M.L., et al. Predicting ideal outcome after pediatric liver transplantation: An exploratory study using machine learning analyses to leverage Studies of Pediatric Liver Transplantation Data.] | Canada, USA | Prospective | Transplant | Ideal outcome at 3 years post-liver transplant | Prognostic
      Azimi et al. [Azimi P., Mohammadi H.R. Predicting endoscopic third ventriculostomy success in childhood hydrocephalus: an artificial neural network analysis.] | Iran | Retrospective | Neurosurgery | Endoscopic third ventriculostomy success at 6 months | Prognostic
      Skoch et al. [Skoch J., Tahir R., Abruzzo T., et al. Predicting symptomatic cerebral vasospasm after aneurysmal subarachnoid hemorrhage with an artificial neural network in a pediatric population.] | USA | Retrospective | Neurosurgery | Symptomatic cerebral vasospasm in children with aneurysmal subarachnoid haemorrhage | Prognostic
      Cooper et al. [Cooper J.N., Minneci P.C., Deans K.J. Postoperative neonatal mortality prediction using superlearning.] | USA | Retrospective | General/Thoracic, Otolaryngology, Orthopedic, Urology, Neurosurgery, Plastic | Postoperative neonatal mortality | Prognostic
      Cohen et al. [Cohen K.B., Glass B., Greiner H.M., et al. Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning.] | USA | Retrospective | Neurosurgery | Candidates for pediatric epilepsy surgery | Interventional
      Pasha et al. [Pasha S., Shah S., Newton P., et al. Machine Learning Predicts the 3D Outcomes of Adolescent Idiopathic Scoliosis Surgery Using Patient-Surgeon Specific Parameters.] | USA | Retrospective | Orthopedic | 3D global spinal alignment 2 years after posterior fusion spinal surgery in patients with adolescent idiopathic scoliosis | Prognostic
      Hale et al. [Hale A.T., Riva-Cambrin J., Wellons J.C., et al. Machine learning predicts risk of cerebrospinal fluid shunt failure in children: a study from the hydrocephalus clinical research network.] | Canada, USA | Prospective | Neurosurgery | Cerebrospinal fluid shunt failure | Prognostic
      Killian et al. [Killian M.O., Payrovnaziri S.N., Gupta D., et al. Machine learning-based prediction of health outcomes in pediatric organ transplantation recipients.] | USA | Retrospective | Transplant | Post-transplant hospitalization | Prognostic
      Wissel et al. [Wissel B.D., Greiner H.M., Glauser T.A., et al. Early identification of epilepsy surgery candidates: A multicenter, machine learning study.] | USA | Retrospective | Neurosurgery | Candidates for epilepsy surgery | Interventional
      Saltzman et al. [Saltzman A.F., Carrasco Jr., A., Hecht S., et al. A decision tree to guide long term venous access placement in children and adolescents undergoing surgery for renal tumors.] | USA | Retrospective | Urology | Candidates for venous access placement | Interventional
      Bertsimas et al. [Bertsimas D., Zhuo D., Dunn J., et al. Adverse Outcomes Prediction for Congenital Heart Surgery: A Machine Learning Approach.] | Greece, Poland, Portugal, USA | Retrospective | Cardiac | Adverse outcomes for congenital heart surgery | Prognostic
      Avila-George et al. [Avila-George H., De-la-Torre M., Castro W., et al. A Hybrid Intelligent Approach to Predict Discharge Diagnosis in Pediatric Surgical Patients.] | Peru | Retrospective | General | Discharge diagnosis in surgical patients | Diagnostic
      Ruiz-Fernandez et al. [Ruiz-Fernandez D., Monsalve Torra A., Soriano-Paya A., et al. Aid decision algorithms to estimate the risk in congenital heart surgery.] | Colombia, Spain | Retrospective | Cardiac | Risk in congenital heart surgery | Risk Stratifying
      Reddan et al. [Reddan T., Corness J., Harden F., et al. Analysis of the predictive value of clinical and sonographic variables in children with suspected acute appendicitis using decision tree algorithms.] | Australia | Prospective, Retrospective | General | Acute appendicitis | Diagnostic
      Aydin et al. [Aydin E., Turkmen I.U., Namli G., et al. A novel and simple machine learning algorithm for preoperative diagnosis of acute appendicitis in children.] | Turkey | Retrospective | General | Acute appendicitis | Diagnostic
      Lin et al. [Lin D., Chen J., Lin Z., et al. A practical model for the identification of congenital cataracts using machine learning.] | China | Retrospective | Ophthalmology | Congenital cataracts | Diagnostic
      Shahi et al. [Shahi N., Shahi A.K., Phillips R., et al. Decision-making in pediatric blunt solid organ injury: A deep learning approach to predict massive transfusion, need for operative management, and mortality risk.] | USA | Retrospective | Trauma | Need for massive transfusion (MT), failure of non-operative management (NOM), mortality, and successful non-operative management without intervention in the setting of blunt solid organ injury | Interventional
      Dong et al. [Dong R., Jiang J., Zhang S., et al. Development and Validation of Novel Diagnostic Models for Biliary Atresia in a Large Cohort of Chinese Patients.] | China | Retrospective | General | Biliary atresia | Diagnostic
      DiRusso et al. [DiRusso S.M., Chahine A.A., Sullivan T., et al. Development of a model for prediction of survival in pediatric trauma patients: comparison of artificial neural networks and logistic regression.] | USA | Retrospective | Trauma | Survival | Prognostic
      Guo et al. [Guo Y., Liu Y., Ming W., et al. Distinguishing Focal Cortical Dysplasia From Glioneuronal Tumors in Patients With Epilepsy by Machine Learning.] | China | Retrospective | Neurosurgery | Focal cortical dysplasia, glioneuronal tumours | Diagnostic
      Liu et al. [Liu J., Dai S., Chen G., et al. Diagnostic Value and Effectiveness of an Artificial Neural Network in Biliary Atresia.] | China | Retrospective | General | Biliary atresia | Diagnostic
      Ruiz et al. [Ruiz V.M., Saenz L., Lopez-Magallon A., et al. Early prediction of critical events for infants with single-ventricle physiology in critical care using routinely collected data.] | USA | Retrospective | Cardiac | Critical events in infants with single-ventricle physiology before second-stage surgery | Risk stratifying
      Bartz-Kurycki et al. [Bartz-Kurycki M.A., Green C., Anderson K.T., et al. Enhanced neonatal surgical site infection prediction model utilizing statistically and clinically significant variables in combination with a machine learning algorithm.] | USA | Retrospective | General | Surgical site infection | Prognostic
      Jamshidnezhad et al. [Jamshidnezhad A., Azizi A., Shirali S., et al. Evaluation of Suspected Pediatric Appendicitis with Alvarado Method Using a Computerized Intelligent Model.] | Iran | Retrospective | General | Appendicitis | Diagnostic
      Schwartz et al. [Schwartz M.H., Rozumalski A., Novacheck T.F. Femoral derotational osteotomy: surgical indications and outcomes in children with cerebral palsy.] | USA | Retrospective | Orthopedic | Surgical indications for femoral derotational osteotomy | Interventional
      Grundmeier et al. [Grundmeier R.W., Xiao R., Ross R.K., et al. Identifying surgical site infections in electronic health data using predictive models.] | USA | Prospective | General | Surgical site infection | Prognostic
      Chang Junior et al. [Chang Junior J., Binuesa F., Caneo L.F., et al. Improving preoperative risk-of-death prediction in surgery congenital heart defects using artificial intelligence model: A pilot study.] | Brazil | Retrospective | Cardiac | Mortality among patients with congenital heart disease undergoing cardiac surgery | Prognostic
      Peltri et al. [Peltri G., Bitterlich N. Increased predictive value of parameters by fuzzy logic-based multiparameter analysis.] | Germany | Retrospective | Cardiac | Postoperative effusion and edema | Prognostic
      Hale et al. [Hale A.T., Stonko D.P., Brown A., et al. Machine-learning analysis outperforms conventional statistical models and CT classification systems in predicting 6-month outcomes in pediatric patients sustaining traumatic brain injury.] | USA | Retrospective | Neurosurgery | 6-month outcomes in pediatric patients sustaining traumatic brain injury | Prognostic
      Jalali et al. [Jalali A., Lonsdale H., Zamora L.V., et al. Machine Learning Applied to Registry Data: Development of a Patient-Specific Prediction Model for Blood Transfusion Requirements During Craniofacial Surgery Using the Pediatric Craniofacial Perioperative Registry Dataset.] | USA | Prospective | Craniofacial | Blood transfusion requirement | Interventional
      Legend: CPT - Clinical Prediction Tool; USA - United States of America.
      Fig. 2. (A) Most common ML models, (B) model comparator, (C) best performing models when ML and LR (or a variation of LR) are compared, (D) internal validation methods, (E) external validation status, (F) future steps or next steps score, all from the included studies. Refer to for more details.
      Fig. 3. Average number of input variables for the included studies.
      Thirty-nine of the included studies compared ML-based CPTs with either other ML models or logistic regression (representative of statistical CPTs). Nine out of 48 (18.8%) studies did not include a comparator (Fig. 2 (B) and Table S1). Internal validation of the ML model was a prerequisite for study inclusion; studies therefore utilized a train-test split method (23), k-fold cross-validation (11), or both (4), among others (Fig. 2 (D) and Table S1). However, measures of performance varied widely among studies: Table 2 outlines the most common performance measures used. External validation was only performed in 2 studies (Fig. 2 (E) and Table S1). A score from 0 to 2 was assigned to each study summarizing future development plans for their ML-based CPT, where 0 denotes studies that were only proof-of-concepts and 2 denotes studies in which external validation was performed and clinical integration is the next step (Fig. 2 (F) and Table S1). Ten (20.8%) studies had a score of 0. Twenty-two (45.8%) stated that validation with a larger dataset was the next step without specifying whether it would be external validation, and were assigned a score of 0.5. Thirteen (27.1%) studies were assigned a score of 1 for external validation being the next step, and the only studies with external validation (3, 4.3%) were assigned a score of 1.5 as they were not ready for clinical integration. As seen in Table S1, the majority of studies lacking external validation had future plans of applying the ML model on a separate population, yet such follow-up studies were rarely identified in our search.
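
      The two internal validation schemes named above can be sketched as follows. This is a minimal illustration, not the procedure of any included study: the function names, the 80/20 ratio, the choice of 5 folds and the fixed seed are all assumptions made for the example.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices once and hold out a fixed fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


# Hypothetical cohort of 100 patients.
train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2)
cv_splits = list(k_fold_indices(100, k=5))
print(len(train_idx), len(test_idx), len(cv_splits))
```

      Both schemes reuse data from the same source population, which is why neither substitutes for external validation on an independent cohort.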
      Table 2. Most common performance measures used in the included studies (N = 46). Note that some studies used more than one performance measure.
      Performance Measure            N (%)
      AUC/AUROC                      38 (82.6%)
      Sensitivity/Recall             29 (63%)
      Specificity                    24 (52.1%)
      PPV/Precision                  18 (39.1%)
      Accuracy                       14 (30.4%)
      NPV                            13 (28.3%)
      F1-Score/F-Score/F-Measure     9 (19.6%)
      AUPR                           3 (6.5%)
      Legend: AUC/AUROC - Area Under the Receiver Operating Characteristic Curve; PPV - Positive Predictive Value; NPV - Negative Predictive Value; AUPR - Area Under the Precision-Recall Curve.
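      The measures in Table 2 other than AUROC and AUPR are all derived from the same 2x2 table of predicted versus observed outcomes. As an illustration only, the following minimal Python sketch computes them from binary labels. The function names and the example labels are hypothetical and are not taken from any included study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Tally the 2x2 table from binary labels (1 = outcome present, 0 = absent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # An undefined ratio (zero denominator) is reported as NaN instead of raising.
    return numerator / denominator if denominator else math.nan


def performance_measures(y_true, y_pred):
    """Return the threshold-based measures listed in Table 2."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": _ratio(tp, tp + fn),   # also called recall
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),           # also called precision
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical example: 4 patients with the outcome, 6 without.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
measures = performance_measures(observed, predicted)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

      Because accuracy and the predictive values depend on how common the outcome is, they are not directly comparable across studies with different case mixes, which is one reason heterogeneous reporting complicates comparison.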
      The majority of studies in which ML-based CPTs were compared to statistical CPTs reported higher performance measures with the ML-based approach (Fig. 2 (B) and Table S1). Out of 22 studies in which ML-based CPTs were compared to statistical CPTs (logistic regression or variants of it), statistical CPTs outperformed ML in only 2 studies. Two other studies found comparable performance for ML-based versus statistics-based CPTs. One study, by Bertoni et al. [
      • Bertoni D.
      • Sterni L.M.
      • Pereira K.D.
      • et al.
      Predicting polysomnographic severity thresholds in children using machine learning.
      ], found that both ML and statistics-based CPTs performed poorly for children needing postoperative overnight monitoring. Overall, ML-based CPTs had higher discriminative power and greater accuracy in most studies in which they were compared to statistical approaches (Fig. 2 (C) and Table S1). In some studies where different ML models performed similarly, a slightly less accurate model was chosen for its ease of implementation and integration into clinical workflows.
      Based on established CPT categories [
      • Cowley L.E.
      • Farewell D.M.
      • Maguire S.
      • et al.
      Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature.
      ], ML-based CPTs were divided into diagnostic, interventional, risk-stratifying, or prognostic categories depending on the main outcome(s) of interest (Table 1). Diagnostic CPTs (10) were used to identify surgical candidates or surgery-requiring conditions, interventional CPTs (9) evaluated the need for a surgical intervention or interventions relating to surgery, and risk-stratifying CPTs (2) evaluated surgery-related risks. The majority of studies (26) developed prognostic CPTs, which focused on predicting outcomes after surgery or any factors associated with the post-operative period. Lastly, one study by Marcinkevics et al. [
      • Marcinkevics R.
      • Reis Wolfertstetter P.
      • Wellmann S.
      • et al.
      Using Machine Learning to Predict the Diagnosis, Management and Severity of Pediatric Appendicitis.
      ] developed an ML-based CPT for diagnostic, interventional and prognostic purposes.
      To address the question of validity and applicability, a score was assigned to each study reflecting the future directions stated for that specific CPT. Generally, ML-based CPTs lacked external validation, i.e. the scores assigned to most studies were ≤1. It is worth noting that some studies described as external validation the practice of “holding out” or “splitting” a part of the study population for testing. We have considered this to be internal, rather than external, validation, as the latter was defined as the implementation of the CPT in a population completely separate from the starting population [
      • Maleki F.
      • Muthukrishnan N.
      • Ovens K.
      • et al.
      Machine Learning Algorithm Validation: From Essentials to Advanced Applications and Implications for Regulatory Certification and Deployment.
      ].

      3.1 Risk of Bias

      Risk of bias analysis showed that almost all of the included studies were applicable to the review question, but 90% of them had a high risk of bias, mainly in the analysis domain (Fig. 4 and Table S2). The main reasons were that studies lacked appropriate internal validation techniques (e.g. a train-test split only instead of cross-validation or bootstrapping), the performance measures did not evaluate model calibration or discrimination, missing data were not handled appropriately, and/or the number of participants with the outcome relative to the number of input variables studied was insufficient. Risk of bias was unclear in the predictor and outcome domains for 71% of the included studies, primarily because authors did not clearly state whether predictors were assessed without knowledge of outcome data and vice versa, bearing in mind that most of the studies were retrospective in nature. Furthermore, a pre-planned meta-analysis of the overall performance advantage of ML-based CPTs had to be abandoned because of the unacceptably high risk of bias and the heterogeneity of the included articles. It is worth noting that the PROBAST was designed to evaluate studies in which the prediction models are either diagnostic only or prognostic only. However, all signaling questions could still be answered for the included studies in which the CPT type differed.
      Fig. 4. Assessment of risk of bias and applicability of the included studies using the PROBAST.
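      One of the analysis-domain concerns noted above, the number of participants with the outcome relative to the number of input variables, is often summarized as events per variable (EPV). The sketch below is illustrative only: the function names, the example counts, and the default threshold of 10 (a commonly quoted rule of thumb, exposed as a parameter and not a criterion taken from this review) are all assumptions.

```python
def events_per_variable(n_with_outcome, n_without_outcome, n_candidate_predictors):
    """EPV = size of the smaller outcome group / number of candidate predictors."""
    if n_candidate_predictors <= 0:
        raise ValueError("at least one candidate predictor is required")
    if n_with_outcome < 0 or n_without_outcome < 0:
        raise ValueError("group sizes cannot be negative")
    smaller_group = min(n_with_outcome, n_without_outcome)
    return smaller_group / n_candidate_predictors


def flag_low_epv(epv, threshold=10.0):
    """Return True when EPV falls below the chosen (assumed) threshold."""
    return epv < threshold


# Hypothetical study: 500 patients, 40 with the outcome, 25 candidate predictors.
epv = events_per_variable(n_with_outcome=40, n_without_outcome=460,
                          n_candidate_predictors=25)
print(f"EPV = {epv:.1f}, flagged as low: {flag_low_epv(epv)}")
```

      A low EPV does not by itself invalidate a model, but it signals a higher chance of overfitting, which internal validation alone may not reveal.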

      4. Discussion

      CPTs use patient data or information regarding the patient encounter to arrive at a simplified decision-making framework. CPTs have had a number of uses within the pediatric and adult populations across different fields [
      • Adams S.T.
      • Leveson S.H.
      Clinical prediction rules.
      ]. Traditionally, these tools were created using a mix of linear statistics, subjective clinical expertise, and patient data. Clinicians interested in a particular condition would review retrospective or prospective cohorts of patients, then analyze the results using simple linear statistics to highlight predictive correlations between specific variables and the desired outcome(s) [
      • Alvarado A.
      A practical score for the early diagnosis of acute appendicitis.
      ,
      • Samuel M.
      Pediatric appendicitis score.
      ]. However, due to the subjective nature of their design, traditional statistics-based CPTs have had a number of shortfalls - including performance (missed diagnoses or incorrect diagnoses) and reproducibility. Moreover, mutually contradicting versions of the same CPT could be devised based on the patient population chosen, the variables included for analysis, and the type of analyses performed. For instance, the earliest validated published CPT on the diagnosis of appendicitis was the Alvarado Score - however, over the years, a number of similar or modified scores have been proposed to address some of its shortcomings [
      • Alvarado A.
      A practical score for the early diagnosis of acute appendicitis.
      ,
      • Samuel M.
      Pediatric appendicitis score.
      ,
      • Andersson M.
      • Andersson R.E.
      The appendicitis inflammatory response score: a tool for the diagnosis of acute appendicitis that outperforms the Alvarado score.
      ,
      • Mikaere H.
      • Zeng I.
      • Lauti M.
      • et al.
      Derivation and validation of the APPEND score: an acute appendicitis clinical prediction rule.
      ,
      • Díaz-Barrientos C.Z.
      • Aquino-González A.
      • Heredia-Montaño M.
      • et al.
      The RIPASA score for the diagnosis of acute appendicitis: A comparison with the modified Alvarado score.
      ]. Therefore, while most CPTs have proven clinical utility, the very presence of multiple alternative CPTs for any clinical question points to the need for improved tools.
      In order to remove (or at least decrease) the subjectivity inherent in the design and development of CPTs, ML-based approaches have been more recently favoured [
      • Marcinkevics R.
      • Reis Wolfertstetter P.
      • Wellmann S.
      • et al.
      Using Machine Learning to Predict the Diagnosis, Management and Severity of Pediatric Appendicitis.
      ]. ML is a subfield of artificial intelligence and encompasses a number of different methods of data analysis. The basic principle of ML is to use a computer algorithm that “learns” from the data, given a specific set of inputs and outputs. The inputs, such as patient laboratory values, imaging, or demographic information, are selected broadly in order to limit subjectivity in variable choice [
      • Cowley L.E.
      • Farewell D.M.
      • Maguire S.
      • et al.
      Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature.
      ,
      • Maleki F.
      • Muthukrishnan N.
      • Ovens K.
      • et al.
      Machine Learning Algorithm Validation: From Essentials to Advanced Applications and Implications for Regulatory Certification and Deployment.
      ,
      • Adams S.T.
      • Leveson S.H.
      Clinical prediction rules.
      ]. The typical workflow of developing ML-based CPTs includes the following: (1) determining the outcome to be predicted for a given population, (2) gathering input-output data from a sample of the population (termed the “internal dataset”), (3) preprocessing (“cleaning”) the internal dataset, and (4) testing and validating the resulting CPT using the internal dataset to evaluate its performance with respect to the desired outcome to be predicted [
      • Maleki F.
      • Muthukrishnan N.
      • Ovens K.
      • et al.
      Machine Learning Algorithm Validation: From Essentials to Advanced Applications and Implications for Regulatory Certification and Deployment.
      ]. The optimization of the internal dataset for testing and validation is arguably the most important step in this workflow, ensuring that the resulting CPT is as accurate and precise as possible. However, in order to ensure the reliability, robustness and generalizability of these CPTs and detect any inherent biases in the internal dataset, it is imperative that, prior to clinical integration, they are also validated using an external dataset. The latter refers to a dataset for a population different from the one sampled for the internal dataset [
      • Maleki F.
      • Muthukrishnan N.
      • Ovens K.
      • et al.
      Machine Learning Algorithm Validation: From Essentials to Advanced Applications and Implications for Regulatory Certification and Deployment.
      ].
      In this systematic review, we sought to determine the clinical validity and applicability of ML-based CPTs as compared to traditional statistical CPTs in pediatric surgery. To this end, we have identified 48 articles that have met our inclusion criteria and qualified for full-text screening. In addition to lacking external validation, ML-based CPTs are very heterogeneous in design. The sample size used for each CPT and the number of features included as input variables varied drastically across studies. Only a handful of studies addressed their choice for the number of included input variables. The use of ML, however, allowed investigators to include large numbers of input variables in their datasets, well outside the capabilities of traditional statistics.
      Most studies were retrospective in nature, reflecting the relatively easy access to electronic health records data, typically without a need for individual consent. However, in keeping with a general lack of uniformity, the studies reviewed used different performance measures for their chosen CPT model. Some common measures included the area under the receiver operating characteristic curve, sensitivity, and specificity, which are most relevant to clinicians who might utilize the CPT (Table 2). It appears, however, that having the best performance measure is not absolutely required, as some ML models are more useful than others depending on the context in which they are used. ML models such as random forest, support vector machines, and artificial neural networks are known for their high predictive performance for nonlinear problems and ability to find complex interactions among input variables, while other models might be chosen for their simplicity, which can improve model understanding and interpretation [
      • Liu L.
      • Ni Y.
      • Zhang N.
      • et al.
      Mining patient-specific and contextual data with machine learning technologies to predict cancellation of children's surgery.
      ].
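      The AUROC mentioned above can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it directly from that definition by pairwise comparison, counting ties as one half. It is a minimal illustration; the function name and the example scores are hypothetical.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    case_scores = [s for t, s in zip(y_true, scores) if t == 1]
    control_scores = [s for t, s in zip(y_true, scores) if t == 0]
    if not case_scores or not control_scores:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0   # case correctly ranked above control
            elif case == control:
                concordant += 0.5   # tie counts as half
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for 3 patients with the outcome and 4 without.
observed = [1, 1, 1, 0, 0, 0, 0]
predicted_risk = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
print(f"AUROC = {auroc(observed, predicted_risk):.3f}")
```

      Unlike sensitivity and specificity, this measure does not depend on a decision threshold, which is one reason it is reported so often; it does not, however, say anything about calibration.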
      Similarly, different groups chose different internal validation methods. The most commonly used validation methods were train-test splitting and k-fold cross-validation. In train-test splitting, a fixed percentage of the data is used for training and the remaining data is used for testing. The train-test split method was by far the most utilized in our included studies, but the ratio between the training and testing samples varied widely, often without any justification. In k-fold cross-validation, the data are split into k sets (folds); the algorithm is trained on k - 1 of the folds and tested on the remaining fold, and this is repeated k times so that each fold serves once as the test set, with performance averaged across the k repetitions. Lastly, in studies lacking a comparator to the ML model, it was impossible to determine whether the CPT had any clinical validity or was rather a simple proof of concept of the predictive ability of ML-based CPTs. It is clear throughout our review that there is a need for standardization and transparency in how ML-based CPTs are being developed and tested, in order to ensure their safety and applicability in clinical settings. Tools like the PROBAST can be used to guide standardization efforts.
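      The difference between the two internal validation methods can be made concrete with a short sketch that only generates the patient indices used for training and testing. It is a minimal illustration using the Python standard library; the function names, the default split ratio, and the seed are assumptions rather than choices made by any included study.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Single split: shuffle once, hold out a fixed fraction for testing."""
    n_test = round(n_samples * test_fraction)
    if not 0 < n_test < n_samples:
        raise ValueError("test_fraction leaves an empty training or test set")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]  # (train, test)


def k_fold_indices(n_samples, k=5, seed=0):
    """k-fold: every sample is used for testing exactly once across k rounds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for j, fold in enumerate(folds) if j != held_out
                 for idx in fold]
        yield train, test


# Hypothetical dataset of 10 patients.
train, test = train_test_split_indices(10, test_fraction=0.3)
print("single split:", len(train), "train /", len(test), "test")
for round_number, (train_idx, test_idx) in enumerate(k_fold_indices(10, k=5), 1):
    print(f"fold {round_number}: test on {sorted(test_idx)}")
```

      With a single split the performance estimate rests on one arbitrary partition, whereas k-fold cross-validation averages over k partitions. Neither replaces external validation, since every index still comes from the same internal dataset.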
      This review has several limitations. The distinction between ML-based CPTs and traditional statistical CPTs is not clear-cut. During the literature search and screening, many studies defined variants of logistic regression as ML, but to arrive at a reasonable number of studies, all logistic regression variants were considered non-ML. Due to the heterogeneity of studies and their design, it was very difficult to obtain information regarding model development and validation, which led to studies being excluded. Articles that might have met the inclusion criteria, but for which the full text was unobtainable, were not included in this study. The study is concerned with the pediatric population; however, the definition of pediatric in our study included only term patients, excluding all premature populations. A limitation of the PROBAST is that some of the signaling questions require clinical or statistical expertise.

      5. Conclusions

      Our review confirms that, despite the enthusiasm for ML-based CPTs in the literature, their external validation remains elusive, and their actual clinical implementation is rare. While we do not question the potential utility of ML-based CPTs, much work remains to be done in terms of validating these methods in a standardized fashion and transitioning them into the clinical environment. We believe that this is a missed opportunity since ML-based CPTs leverage the advantages of artificial intelligence in handling and analyzing large datasets. In an age where electronic health records have eased data gathering, integration of patient profiles, and information sharing, ML-based CPTs could be the future frontier for establishing efficient decision-making frameworks that improve patient outcomes in pediatric surgery as they have the power to handle such complex datasets. Regardless of the surgical specialty, they have the potential to identify surgical candidates or conditions, evaluate the need for surgery and surgery-related risks as well as predict postoperative outcomes, all of which are prime targets for expanded research and efforts in the pediatric setting.

      Uncited References

      [
      • Maguire J.L.
      • Kulik D.M.
      • Laupacis A.
      • et al.
      Clinical prediction rules for children: a systematic review.
      ]; [
      • Goldstein B.A.
      • Navar A.M.
      • Pencina M.J.
      • et al.
      Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review.
      ]; [
      • Bi Q.
      • Goodman K.E.
      • Kaminsky J.
      • et al.
      What is Machine Learning? A Primer for the Epidemiologist.
      ]; [
      • Christodoulou E.
      • Ma J.
      • Collins G.S.
      • et al.
      A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models.
      ].

      Acknowledgements

      N/A for this systematic review.

      Appendix A. Supplementary data

      References

        • Maguire J.L.
        • Kulik D.M.
        • Laupacis A.
        • et al.
        Clinical prediction rules for children: a systematic review.
        Pediatrics. 2011; 128: 666-677. https://doi.org/10.1542/peds.2011-0043
        • Cowley L.E.
        • Farewell D.M.
        • Maguire S.
        • et al.
        Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature.
        Diagnostic and prognostic research. 2019; 3. https://doi.org/10.1186/s41512-019-0060-y
        • Goldstein B.A.
        • Navar A.M.
        • Pencina M.J.
        • et al.
        Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review.
        Journal of the American Medical Informatics Association. 2017; 24: 198-208. https://doi.org/10.1093/jamia/ocw042
        • Bi Q.
        • Goodman K.E.
        • Kaminsky J.
        • et al.
        What is Machine Learning? A Primer for the Epidemiologist.
        American journal of epidemiology. 2019; 188: 2222-2239. https://doi.org/10.1093/aje/kwz189
        • Christodoulou E.
        • Ma J.
        • Collins G.S.
        • et al.
        A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models.
        Journal of Clinical Epidemiology. 2019; 110: 12-22. https://doi.org/10.1016/j.jclinepi.2019.02.004
        • Marcinkevics R.
        • Reis Wolfertstetter P.
        • Wellmann S.
        • et al.
        Using Machine Learning to Predict the Diagnosis, Management and Severity of Pediatric Appendicitis.
        Frontiers in Pediatrics. 2021; 9. https://doi.org/10.3389/fped.2021.662183
        • Liberati A.
        • Altman D.G.
        • Tetzlaff J.
        • et al.
        The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration.
        BMJ. 2009; 339: b2700. https://doi.org/10.1136/bmj.b2700
        • Ouzzani M.
        • Hammady H.
        • Fedorowicz Z.
        • et al.
        Rayyan—a web and mobile app for systematic reviews.
        Systematic Reviews. 2016; 5: 210. https://doi.org/10.1186/s13643-016-0384-4
        • Wolff R.F.
        • Moons K.G.M.
        • Riley R.D.
        • et al.
        PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies.
        Annals of Internal Medicine. 2019; 170: 51-58. https://doi.org/10.7326/M18-1376
        • Bertoni D.
        • Sterni L.M.
        • Pereira K.D.
        • et al.
        Predicting polysomnographic severity thresholds in children using machine learning.
        Pediatric Research. 2020; 88: 404-411. https://doi.org/10.1038/s41390-020-0944-0
        • Maleki F.
        • Muthukrishnan N.
        • Ovens K.
        • et al.
        Machine Learning Algorithm Validation: From Essentials to Advanced Applications and Implications for Regulatory Certification and Deployment.
        Neuroimaging Clin N Am. 2020; 30: 433-445. https://doi.org/10.1016/j.nic.2020.08.004
        • Adams S.T.
        • Leveson S.H.
        Clinical prediction rules.
        BMJ. 2012; 344: d8312. https://doi.org/10.1136/bmj.d8312
        • Alvarado A.
        A practical score for the early diagnosis of acute appendicitis.
        Ann Emerg Med. 1986; 15: 557-564. https://doi.org/10.1016/s0196-0644(86)80993-3
        • Samuel M.
        Pediatric appendicitis score.
        J Pediatr Surg. 2002; 37: 877-881. https://doi.org/10.1053/jpsu.2002.32893
        • Andersson M.
        • Andersson R.E.
        The appendicitis inflammatory response score: a tool for the diagnosis of acute appendicitis that outperforms the Alvarado score.
        World J Surg. 2008; 32: 1843-1849. https://doi.org/10.1007/s00268-008-9649-y
        • Mikaere H.
        • Zeng I.
        • Lauti M.
        • et al.
        Derivation and validation of the APPEND score: an acute appendicitis clinical prediction rule.
        ANZ J Surg. 2018; 88 (e7): E303. https://doi.org/10.1111/ans.14022
        • Díaz-Barrientos C.Z.
        • Aquino-González A.
        • Heredia-Montaño M.
        • et al.
        The RIPASA score for the diagnosis of acute appendicitis: A comparison with the modified Alvarado score.
        Rev Gastroenterol Mex (Engl Ed). 2018; 83: 112-116. https://doi.org/10.1016/j.rgmx.2017.06.002
        • Liu L.
        • Ni Y.
        • Zhang N.
        • et al.
        Mining patient-specific and contextual data with machine learning technologies to predict cancellation of children's surgery.
        Int J Med Inf. 2019; 129: 234-241. https://doi.org/10.1016/j.ijmedinf.2019.06.007
        • Hu Y.
        • Gong X.
        • Shu L.
        • et al.
        Understanding risk factors for postoperative mortality in neonates based on explainable machine learning technology.
        J Pediatr Surg. 2021; 5: 5. https://doi.org/10.1016/j.jpedsurg.2021.03.057
        • Lure A.C.
        • Du X.
        • Black E.W.
        • et al.
        Using machine learning analysis to assist in differentiating between necrotizing enterocolitis and spontaneous intestinal perforation: A novel predictive analytic tool.
        J Pediatr Surg. 2020; 13: 13. https://doi.org/10.1016/j.jpedsurg.2020.11.008
        • Stiel C.
        • Elrod J.
        • Klinke M.
        • et al.
        The Modified Heidelberg and the AI Appendicitis Score Are Superior to Current Scores in Predicting Appendicitis in Children: A Two-Center Cohort Study.
        Frontiers in Pediatrics. 2020; 8: 592892. https://doi.org/10.3389/fped.2020.592892
        • Troesch V.L.
        • Wald M.
        • Bonnett M.A.
        • et al.
        The additive impact of the distal ureteral diameter ratio in predicting early breakthrough urinary tract infections in children with vesicoureteral reflux.
        Journal of pediatric urology. 2021; 17 (e5): 208.e1. https://doi.org/10.1016/j.jpurol.2021.01.003
        • Cooper J.N.
        • Wei L.
        • Fernandez S.A.
        • et al.
        Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.
        Comput Biol Med. 2015; 57: 54-65. https://doi.org/10.1016/j.compbiomed.2014.11.009
        • Chen C.K.
        • Manlhiot C.
        • Mital S.
        • et al.
        Prelisting predictions of early postoperative survival in infant heart transplantation using classification and regression tree analysis.
        Pediatric Transplantation. 2018; 22: 3. https://doi.org/10.1111/petr.13105
        • Lorenzo A.J.
        • Rickard M.
        • Braga L.H.
        • et al.
        Predictive Analytics and Modeling Employing Machine Learning Technology: The Next Step in Data Sharing, Analysis, and Individualized Counseling Explored With a Large, Prospective Prenatal Hydronephrosis Database.
        Urology. 2019; 123: 204-209. https://doi.org/10.1016/j.urology.2018.05.041
        • Ward A.
        • Jani T.
        • De Souza E.
        • et al.
        Prediction of Prolonged Opioid Use After Surgery in Adolescents: Insights From Machine Learning.
        Anesth Analg. 2021; 3: 3. https://doi.org/10.1213/ANE.0000000000005527
        • Zhang K.
        • Liu X.
        • Jiang J.
        • et al.
        Prediction of postoperative complications of pediatric cataract patients using data mining.
        J Transl Med. 2019; 17: 2. https://doi.org/10.1186/s12967-018-1758-2
        • Jalali A.
        • Buckley E.M.
        • Lynch J.M.
        • et al.
        Prediction of periventricular leukomalacia occurrence in neonates after heart surgery.
        IEEE J Biomed Health Inform. 2014; 18: 1453-1460. https://doi.org/10.1109/JBHI.2013.2285011
        • Jalali A.
        • Simpao A.F.
        • Galvez J.A.
        • et al.
        Prediction of Periventricular Leukomalacia in Neonates after Cardiac Surgery Using Machine Learning Algorithms.
        J Med Syst. 2018; 42: 177. https://doi.org/10.1007/s10916-018-1029-z
        • Miller R.
        • Tumin D.
        • Cooper J.
        • et al.
        Prediction of mortality following pediatric heart transplant using machine learning algorithms.
        Pediatric Transplantation. 2019; 23: e13360. https://doi.org/10.1111/petr.13360
        • Sun H.
        • Liu Y.
        • Song B.
        • et al.
        Prediction of arrhythmia after intervention in children with atrial septal defect based on random forest.
        BMC Pediatr. 2021; 21: 280. https://doi.org/10.1186/s12887-021-02744-7
        • Wilson T.J.
        • Chang K.W.C.
        • Yang L.J.S.
        Prediction Algorithm for Surgical Intervention in Neonatal Brachial Plexus Palsy.
        Neurosurgery. 2018; 82: 335-342. https://doi.org/10.1093/neuros/nyx190
        • Habibi Z.
        • Ertiaei A.
        • Nikdad M.S.
        • et al.
        Predicting ventriculoperitoneal shunt infection in children with hydrocephalus using artificial neural network.
        Childs Nerv Syst. 2016; 32: 2143-2151. https://doi.org/10.1007/s00381-016-3248-2
        • Guo K.
        • Fu X.
        • Zhang H.
        • et al.
        Predicting the postoperative blood coagulation state of children with congenital heart disease by machine learning based on real-world data.
        Transl Pediatr. 2021; 10: 33-43. https://doi.org/10.21037/tp-20-238
        • Wadhwani S.I.
        • Hsu E.K.
        • Shaffer M.L.
        • et al.
        Predicting ideal outcome after pediatric liver transplantation: An exploratory study using machine learning analyses to leverage Studies of Pediatric Liver Transplantation Data.
        Pediatric Transplantation. 2019; 23: e13554. https://doi.org/10.1111/petr.13554
        • Azimi P.
        • Mohammadi H.R.
        Predicting endoscopic third ventriculostomy success in childhood hydrocephalus: an artificial neural network analysis.
        J Neurosurg Pediatrics. 2014; 13: 426-432. https://doi.org/10.3171/2013.12.PEDS13423
        • Skoch J.
        • Tahir R.
        • Abruzzo T.
        • et al.
        Predicting symptomatic cerebral vasospasm after aneurysmal subarachnoid hemorrhage with an artificial neural network in a pediatric population.
        Childs Nerv Syst. 2017; 33: 2153-2157. https://doi.org/10.1007/s00381-017-3573-0
        • Cooper J.N.
        • Minneci P.C.
        • Deans K.J.
        Postoperative neonatal mortality prediction using superlearning.
        J Surg Res. 2018; 221: 311-319. https://doi.org/10.1016/j.jss.2017.09.002
        • Cohen K.B.
        • Glass B.
        • Greiner H.M.
        • et al.
        Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning.
        Biomedical Informatics Insights. 2016; 8: 11-18. https://doi.org/10.4137/BII.S38308
        • Pasha S.
        • Shah S.
        • Newton P.
        • et al.
        Machine Learning Predicts the 3D Outcomes of Adolescent Idiopathic Scoliosis Surgery Using Patient-Surgeon Specific Parameters.
        Spine. 2021; 46: 579-587. https://doi.org/10.1097/BRS.0000000000003795
        • Hale A.T.
        • Riva-Cambrin J.
        • Wellons J.C.
        • et al.
        Machine learning predicts risk of cerebrospinal fluid shunt failure in children: a study from the hydrocephalus clinical research network.
        Childs Nerv Syst. 2021; 37: 1485-1494. https://doi.org/10.1007/s00381-021-05061-7
        • Killian M.O.
        • Payrovnaziri S.N.
        • Gupta D.
        • et al.
        Machine learning-based prediction of health outcomes in pediatric organ transplantation recipients.
        JAMIA open. 2021; 4: ooab008. https://doi.org/10.1093/jamiaopen/ooab008
        • Wissel B.D.
        • Greiner H.M.
        • Glauser T.A.
        • et al.
        Early identification of epilepsy surgery candidates: A multicenter, machine learning study.
        Acta Neurol Scand. 2021; 144: 41-50. https://doi.org/10.1111/ane.13418
        • Saltzman A.F.
        • Carrasco Jr., A.
        • Hecht S.
        • et al.
        A decision tree to guide long term venous access placement in children and adolescents undergoing surgery for renal tumors.
        J Pediatr Surg. 2020; 55: 1334-1338. https://doi.org/10.1016/j.jpedsurg.2019.04.034
        • Bertsimas D.
        • Zhuo D.
        • Dunn J.
        • et al.
        Adverse Outcomes Prediction for Congenital Heart Surgery: A Machine Learning Approach.
        World J Pediatr Congenit Heart Surg. 2021; 21501351211007106. https://doi.org/10.1177/21501351211007106
        • Avila-George H.
        • De-la-Torre M.
        • Castro W.
        • et al.
        A Hybrid Intelligent Approach to Predict Discharge Diagnosis in Pediatric Surgical Patients.
        Appl Sci-Basel. 2021; 11: 17. https://doi.org/10.3390/app11083529
        • Ruiz-Fernandez D.
        • Monsalve Torra A.
        • Soriano-Paya A.
        • et al.
        Aid decision algorithms to estimate the risk in congenital heart surgery.
        Comput Methods Programs Biomed. 2016; 126: 118-127. https://doi.org/10.1016/j.cmpb.2015.12.021
        • Reddan T.
        • Corness J.
        • Harden F.
        • et al.
        Analysis of the predictive value of clinical and sonographic variables in children with suspected acute appendicitis using decision tree algorithms.
        Sonography. 2018; 5: 157-163. https://doi.org/10.1002/sono.12156
        • Aydin E.
        • Turkmen I.U.
        • Namli G.
        • et al.
        A novel and simple machine learning algorithm for preoperative diagnosis of acute appendicitis in children.
        Pediatr Surg Int. 2020; 36: 735-742. https://doi.org/10.1007/s00383-020-04655-7
        • Lin D.
        • Chen J.
        • Lin Z.
        • et al.
        A practical model for the identification of congenital cataracts using machine learning.
        EBioMedicine. 2020; 51: 102621. https://doi.org/10.1016/j.ebiom.2019.102621
        • Shahi N.
        • Shahi A.K.
        • Phillips R.
        • et al.
        Decision-making in pediatric blunt solid organ injury: A deep learning approach to predict massive transfusion, need for operative management, and mortality risk.
        J Pediatr Surg. 2021; 56: 379-384. https://doi.org/10.1016/j.jpedsurg.2020.10.021
        • Dong R.
        • Jiang J.
        • Zhang S.
        • et al.
        Development and Validation of Novel Diagnostic Models for Biliary Atresia in a Large Cohort of Chinese Patients.
        EBioMedicine. 2018; 34: 223-230. https://doi.org/10.1016/j.ebiom.2018.07.025
        • DiRusso S.M.
        • Chahine A.A.
        • Sullivan T.
        • et al.
        Development of a model for prediction of survival in pediatric trauma patients: comparison of artificial neural networks and logistic regression.
        J Pediatr Surg. 2002; 37 (discussion -104): 1098-1104
        • Guo Y.
        • Liu Y.
        • Ming W.
        • et al.
        Distinguishing Focal Cortical Dysplasia From Glioneuronal Tumors in Patients With Epilepsy by Machine Learning.
        Front Neurol. 2020; 11. https://doi.org/10.3389/fneur.2020.548305
        • Liu J.
        • Dai S.
        • Chen G.
        • et al.
        Diagnostic Value and Effectiveness of an Artificial Neural Network in Biliary Atresia.
        Frontiers in Pediatrics. 2020; 8. https://doi.org/10.3389/fped.2020.00409
        • Ruiz V.M.
        • Saenz L.
        • Lopez-Magallon A.
        • et al.
        Early prediction of critical events for infants with single-ventricle physiology in critical care using routinely collected data.
        J Thorac Cardiovasc Surg. 2019; 158: 234-243.e3 https://doi.org/10.1016/j.jtcvs.2019.01.130
        • Bartz-Kurycki M.A.
        • Green C.
        • Anderson K.T.
        • et al.
        Enhanced neonatal surgical site infection prediction model utilizing statistically and clinically significant variables in combination with a machine learning algorithm.
        Am J Surg. 2018; 216: 764-777 https://doi.org/10.1016/j.amjsurg.2018.07.041
        • Jamshidnezhad A.
        • Azizi A.
        • Shirali S.
        • et al.
        Evaluation of Suspected Pediatric Appendicitis with Alvarado Method Using a Computerized Intelligent Model.
        Int J Pediatr. 2016; 4: 1465-1473
        • Schwartz M.H.
        • Rozumalski A.
        • Novacheck T.F.
        Femoral derotational osteotomy: surgical indications and outcomes in children with cerebral palsy.
        Gait Posture. 2014; 39: 778-783 https://doi.org/10.1016/j.gaitpost.2013.10.016
        • Grundmeier R.W.
        • Xiao R.
        • Ross R.K.
        • et al.
        Identifying surgical site infections in electronic health data using predictive models.
        J Am Med Inform Assoc. 2018; 25: 1160-1166 https://doi.org/10.1093/jamia/ocy075
        • Chang Junior J.
        • Binuesa F.
        • Caneo L.F.
        • et al.
        Improving preoperative risk-of-death prediction in surgery congenital heart defects using artificial intelligence model: A pilot study.
        PLoS ONE. 2020; 15: e0238199 https://doi.org/10.1371/journal.pone.0238199
        • Peltri G.
        • Bitterlich N.
        Increased predictive value of parameters by fuzzy logic-based multiparameter analysis.
        Cytometry B Clin Cytom. 2003; 53: 75-77
        • Hale A.T.
        • Stonko D.P.
        • Brown A.
        • et al.
        Machine-learning analysis outperforms conventional statistical models and CT classification systems in predicting 6-month outcomes in pediatric patients sustaining traumatic brain injury.
        Neurosurg Focus. 2018; 45: E2 https://doi.org/10.3171/2018.8.FOCUS17773
        • Jalali A.
        • Lonsdale H.
        • Zamora L.V.
        • et al.
        Machine Learning Applied to Registry Data: Development of a Patient-Specific Prediction Model for Blood Transfusion Requirements During Craniofacial Surgery Using the Pediatric Craniofacial Perioperative Registry Dataset.
        Anesth Analg. 2021; 132: 160-171 https://doi.org/10.1213/ANE.0000000000004988