Krogue, Justin D; Cheng, Kaiyang V; Hwang, Kevin M; Toogood, Paul; Meinberg, Eric G; Geiger, Erik J; Zaid, Musa; McGill, Kevin C; Patel, Rina; Sohn, Jae Ho; Wright, Alexandra; Darger, Bryan F; Padrez, Kevin A; Ozhinsky, Eugene; Majumdar, Sharmila; Pedoia, Valentina

doi:10.1148/ryai.2020190023


Automatic Hip Fracture Identification and Functional Subclassification with Deep Learning.

2020

Published Web Location

https://doi.org/10.1148/ryai.2020190023

Abstract

Purpose

To investigate the feasibility of automatic identification and classification of hip fractures using deep learning, which may improve outcomes by reducing diagnostic errors and decreasing time to operation.

Materials and methods

Hip and pelvic radiographs from 1118 studies were reviewed, and 3026 hips were labeled via bounding boxes and classified as normal, displaced femoral neck fracture, nondisplaced femoral neck fracture, intertrochanteric fracture, previous open reduction and internal fixation, or previous arthroplasty. A deep learning-based object detection model was trained to automate the placement of the bounding boxes. A Densely Connected Convolutional Neural Network (DenseNet) was trained on a subset of the bounding box images. Its performance was evaluated on a held-out test set and, on a 100-image subset, by comparison with two groups of human observers: (a) fellowship-trained radiologists and orthopedists and (b) senior residents in emergency medicine, radiology, and orthopedics.
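As an illustration only, the six-way label set described above, and one way to collapse it to a binary fracture label, might be sketched as follows. The names `HipLabel`, `FRACTURE_LABELS`, and `is_fracture` are hypothetical. Treating only the three fracture classes as positive, and counting previously operated hips as non-fracture, is an assumption that the abstract does not spell out.

```python
from enum import Enum


class HipLabel(Enum):
    """The six classes named in the abstract (identifiers are hypothetical)."""
    NORMAL = "normal"
    DISPLACED_FEMORAL_NECK = "displaced femoral neck fracture"
    NONDISPLACED_FEMORAL_NECK = "nondisplaced femoral neck fracture"
    INTERTROCHANTERIC = "intertrochanteric fracture"
    PREVIOUS_ORIF = "previous open reduction and internal fixation"
    PREVIOUS_ARTHROPLASTY = "previous arthroplasty"


# Assumption: only these three classes count as "fracture" in the binary task.
FRACTURE_LABELS = frozenset({
    HipLabel.DISPLACED_FEMORAL_NECK,
    HipLabel.NONDISPLACED_FEMORAL_NECK,
    HipLabel.INTERTROCHANTERIC,
})


def is_fracture(label: HipLabel) -> bool:
    """Collapse a six-way label to the binary fracture / no-fracture label."""
    return label in FRACTURE_LABELS
```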

Results

The model's binary accuracy for detecting a fracture was 93.7% (95% confidence interval [CI]: 90.8%, 96.5%), with a sensitivity of 93.2% (95% CI: 88.9%, 97.1%) and a specificity of 94.2% (95% CI: 89.7%, 98.4%). Multiclass classification accuracy was 90.8% (95% CI: 87.5%, 94.2%). When compared with human observers, the model achieved at least expert-level classification accuracy under all conditions. Additionally, when the model was used as an aid, human performance improved, with aided resident performance approximating unaided fellowship-trained expert performance in the multiclass classification.
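For readers unfamiliar with these metrics, the sketch below shows how binary accuracy, sensitivity, and specificity can be computed from paired true and predicted labels, together with a percentile-bootstrap 95% CI. The abstract does not state how its confidence intervals were obtained, so the bootstrap here is an assumption, and the function names and toy labels are hypothetical and are not the study's data.

```python
import random


def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # fractures correctly flagged
    tn = sum(1 for t, p in pairs if not t and not p)  # non-fractures correctly cleared
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    def ratio(num, den):
        # None marks a metric that is undefined for this sample.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def bootstrap_ci(y_true, y_pred, metric, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for one metric (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        value = binary_metrics(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )[metric]
        if value is not None:  # skip resamples where the metric is undefined
            values.append(value)
    values.sort()
    lower = values[int(alpha / 2 * len(values))]
    upper = values[int((1 - alpha / 2) * len(values)) - 1]
    return lower, upper


# Toy labels, made up for illustration: 10 fractures and 10 non-fractures.
y_true = [True] * 10 + [False] * 10
y_pred = [True] * 9 + [False] + [True] * 2 + [False] * 8

print(binary_metrics(y_true, y_pred))
# accuracy 0.85, sensitivity 0.9, specificity 0.8
print(bootstrap_ci(y_true, y_pred, "accuracy"))
```

With only 20 toy cases the interval is wide; the narrower intervals reported above reflect a much larger test set.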

Conclusion

A deep learning model identified and classified hip fractures with expert-level performance, at the very least, and when used as an aid, improved human performance, with aided resident performance approximating that of unaided fellowship-trained attending physicians.

Supplemental material is available for this article.

© RSNA, 2020.


UCSF