Using machine learning to guide targeted and locally-tailored empiric antibiotic prescribing in a children's hospital in Cambodia
Oonsivilai M., Yin M., Luangasanatip N., Lubell Y., Miliya T., Tan P., Loeuk L., Turner P., Cooper B.
Background: Early and appropriate empiric antibiotic treatment of patients suspected of having sepsis is associated with reduced mortality. The increasing prevalence of antimicrobial resistance risks eroding the benefits of such empiric therapy. This problem is particularly severe for children in developing country settings. We hypothesized that by applying machine learning approaches to readily collected patient data, it would be possible to obtain actionable, patient-specific predictions of antibiotic susceptibility. If sufficient discriminatory power can be achieved, such predictions could substantially improve the chances of choosing an appropriate antibiotic for empiric therapy, while minimizing the risk of increased selection for resistance due to use of antibiotics usually held in reserve. Methods and Findings: We analyzed blood culture data collected from a 100-bed children's hospital in North-West Cambodia between February 2013 and January 2016. Clinical, demographic and living condition information for each child was captured with 35 independent variables. We applied a suite of machine learning algorithms to these variables to predict Gram stain results and whether bacterial pathogens could be treated with standard empiric antibiotic therapies: i) ampicillin and gentamicin; ii) ceftriaxone; iii) at least one of the above. 243 cases of bloodstream infection were available for analysis. We used 195 (80%) to train the algorithms, and 48 (20%) for evaluation. We found that the random forest method had the best predictive performance overall as assessed by the area under the receiver operating characteristic curve (AUC), though the support vector machine with a radial kernel had similar performance for predicting Gram stain and ceftriaxone susceptibility. The predictive performance of logistic regression, simple and boosted decision trees, and k-nearest neighbors was poor in comparison.
The random forest method gave an AUC of 0.91 (95% CI 0.81-1.00) for predicting susceptibility to ceftriaxone, 0.75 (0.60-0.90) for susceptibility to ampicillin and gentamicin, 0.76 (0.59-0.93) for susceptibility to neither, and 0.69 (0.53-0.85) for Gram stain result. The most important variables for predicting susceptibility were time from admission to blood culture, patient age, hospital versus community-acquired infection, and age-adjusted weight score. Conclusions: Applying machine learning algorithms to patient data that are readily available even in resource-limited hospital settings can provide highly informative predictions of pathogen susceptibilities to guide appropriate empiric antibiotic therapy. Used as a decision support tool, such approaches have the potential to improve the targeting of empiric therapy, improve patient outcomes, and reduce the burden of antimicrobial resistance.
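For readers unfamiliar with the evaluation metric, the AUC reported above can be read as the probability that a randomly chosen positive case (for example, a susceptible isolate) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The following is a minimal sketch of that pairwise definition in plain Python. It is not the authors' code: the function name `auc` and the toy labels and scores are hypothetical and chosen only for illustration, and the confidence intervals reported in the abstract are not reproduced here.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1 outcomes (1 = positive class).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions (illustrative values, not study data).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to a model that ranks cases no better than chance, and 1.0 to perfect ranking, which is the scale on which the values above should be read.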