- Yoon, Jung;
- Strand, Fredrik;
- Baltzer, Pascal;
- Conant, Emily;
- Gilbert, Fiona;
- Lehman, Constance;
- Mullen, Lisa;
- Nishikawa, Robert;
- Sharma, Nisha;
- Vejborg, Ilse;
- Moy, Linda;
- Mann, Ritse;
- Morris, Elizabeth
Background There is considerable interest in the potential use of artificial intelligence (AI) systems in mammographic screening. However, it is essential to critically evaluate the performance of AI before it can become a modality used for independent mammographic interpretation.

Purpose To evaluate the reported standalone performances of AI for interpretation of digital mammography and digital breast tomosynthesis (DBT).

Materials and Methods A systematic search was conducted in PubMed, Google Scholar, Embase (Ovid), and Web of Science databases for studies published from January 2017 to June 2022. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) values were reviewed. Study quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 and Comparative (QUADAS-2 and QUADAS-C, respectively). A random effects meta-analysis and meta-regression analysis were performed for overall studies and for different study types (reader studies vs historic cohort studies) and imaging techniques (digital mammography vs DBT).

Results In total, 16 studies that include 1 108 328 examinations in 497 091 women were analyzed (six reader studies, seven historic cohort studies on digital mammography, and four studies on DBT). Pooled AUCs were significantly higher for standalone AI than radiologists in the six reader studies on digital mammography (0.87 vs 0.81, P = .002), but not for historic cohort studies (0.89 vs 0.96, P = .152). Four studies on DBT showed significantly higher AUCs in AI compared with radiologists (0.90 vs 0.79, P < .001). Higher sensitivity and lower specificity were seen for standalone AI compared with radiologists.

Conclusion Standalone AI for screening digital mammography performed as well as or better than radiologists. Compared with digital mammography, there is an insufficient number of studies to assess the performance of AI systems in the interpretation of DBT screening examinations.
© RSNA, 2023 Supplemental material is available for this article. See also the editorial by Scaranelo in this issue.
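
The random effects pooling named in Materials and Methods can be illustrated with a minimal sketch. The abstract does not state which estimator was used, so the DerSimonian-Laird method below is an assumption, as is pooling on the raw AUC scale. The function name `pool_random_effects` and the example inputs are hypothetical and are not data from the reviewed studies.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-study estimates with the DerSimonian-Laird random effects method.

    Assumption: this estimator is chosen for illustration only; the source
    does not say which random effects estimator the authors used.
    Returns (pooled_estimate, pooled_standard_error, tau_squared).
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    pooled_se = math.sqrt(1.0 / sum_re)
    return pooled, pooled_se, tau2


# Hypothetical per-study AUCs and standard errors, for illustration only.
example_aucs = [0.84, 0.88, 0.91, 0.86]
example_ses = [0.02, 0.03, 0.025, 0.015]
pooled_auc, pooled_se, tau2 = pool_random_effects(example_aucs, example_ses)
print(f"pooled AUC={pooled_auc:.3f}, SE={pooled_se:.3f}, tau^2={tau2:.5f}")
```

When the between-study variance estimate is zero, the result reduces to the ordinary inverse-variance (fixed-effect) pooled estimate. Larger heterogeneity pulls the study weights toward equality and widens the pooled standard error.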