News | Artificial Intelligence | November 07, 2018

Artificial Intelligence May Fall Short Analyzing Data Across Multiple Health Systems

Study shows deep learning models must be carefully tested across multiple environments before being put into clinical practice

November 7, 2018 - Artificial intelligence (AI) tools trained to detect pneumonia on chest X-rays suffered significant decreases in performance when tested on data from outside health systems, according to a new study. The study, conducted at the Icahn School of Medicine at Mount Sinai, was published in a special issue of PLOS Medicine on machine learning and healthcare [1]. These findings suggest that artificial intelligence in the medical space must be carefully tested for performance across a wide range of populations; otherwise, the deep learning models may not perform as accurately as expected.

As interest grows in using deep learning models called convolutional neural networks (CNNs) to analyze medical imaging and provide computer-aided diagnosis, recent studies have suggested that AI image classification may not generalize to new data as well as commonly portrayed.

Researchers at the Icahn School of Medicine at Mount Sinai assessed how AI models identified pneumonia in 158,000 chest X-rays across three medical institutions: the National Institutes of Health; The Mount Sinai Hospital; and Indiana University Hospital. Researchers chose to study the diagnosis of pneumonia on chest X-rays for its common occurrence, clinical significance and prevalence in the research community.

In three out of five comparisons, the CNNs’ performance in diagnosing diseases on X-rays from hospitals outside their own network was significantly lower than on X-rays from the original health system. However, the CNNs were able to detect the hospital system where an X-ray was acquired with a high degree of accuracy, and they cheated at their predictive task by relying on the prevalence of pneumonia at the training institution. Researchers found that one difficulty of using deep learning models in medicine is that they use a massive number of parameters, making it challenging to identify the specific variables driving predictions, such as the types of computed tomography (CT) scanners used at a hospital and the resolution quality of imaging.
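The shortcut described above can be illustrated with a small simulation. This is a minimal sketch, not the study's code or data: the two site names and their pneumonia prevalences are hypothetical values chosen only for illustration, and the "model" never looks at an image. It simply scores every X-ray with the pneumonia rate of the hospital it came from, which is roughly what a network could do once it has learned to recognize the source hospital.

```python
import random


def auc(scores, labels):
    """Chance that a random positive case outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(prevalence_by_site, n_per_site, seed=0):
    """Draw synthetic pneumonia labels for each site from its prevalence."""
    rng = random.Random(seed)
    sites, labels = [], []
    for site, prevalence in prevalence_by_site.items():
        for _ in range(n_per_site):
            sites.append(site)
            labels.append(1 if rng.random() < prevalence else 0)
    return sites, labels


# Hypothetical prevalences, NOT figures from the study.
PREVALENCE = {"site_a": 0.30, "site_b": 0.02}
sites, labels = simulate(PREVALENCE, n_per_site=500)

# Shortcut "model": the score depends only on where the image was acquired.
shortcut_scores = [PREVALENCE[s] for s in sites]

# Pooled across sites, the shortcut looks informative (AUC well above 0.5).
pooled_auc = auc(shortcut_scores, labels)

# Within one site every score is identical, so the shortcut is no better than chance.
site_a_labels = [y for s, y in zip(sites, labels) if s == "site_a"]
site_a_scores = [PREVALENCE["site_a"]] * len(site_a_labels)
within_auc = auc(site_a_scores, site_a_labels)

print(f"pooled AUC: {pooled_auc:.2f}, within-site AUC: {within_auc:.2f}")
```

Under these assumptions, the pooled score looks respectable even though the model has learned nothing about pneumonia, which is why evaluation on data from each deployment site separately matters.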

“Our findings should give pause to those considering rapid deployment of artificial intelligence platforms without rigorously assessing their performance in real-world clinical settings reflective of where they are being deployed,” said senior author Eric Oermann, M.D., instructor in neurosurgery at the Icahn School of Medicine at Mount Sinai. “Deep learning models trained to perform medical diagnosis can generalize well, but this cannot be taken for granted since patient populations and imaging techniques differ significantly across institutions.”

“If CNN systems are to be used for medical diagnosis, they must be tailored to carefully consider clinical questions, tested for a variety of real-world scenarios and carefully assessed to determine how they impact accurate diagnosis,” said first author John Zech, a medical student at the Icahn School of Medicine at Mount Sinai.

This research builds on papers published earlier this year in the journals Radiology and Nature Medicine, which laid the framework for applying computer vision and deep learning techniques, including natural language processing algorithms, for identifying clinical concepts in radiology reports for CT scans.

For more information: www.journals.plos.org/plosmedicine

Reference

1. Zech J.R., Badgeley M.A., Liu M., et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLOS Medicine, Nov. 6, 2018. https://doi.org/10.1371/journal.pmed.1002683
