eScholarship
Open Access Publications from the University of California

UCSF

UC San Francisco Previously Published Works

Computer‐Aided Detection AI Reduces Interreader Variability in Grading Hip Abnormalities With MRI

Abstract

Background

Accurate interpretation of hip MRI is time-intensive and difficult, prone to inter- and intrareviewer variability, and lacks a universally accepted grading scale to evaluate morphological abnormalities.

Purpose

To 1) develop and evaluate a deep-learning-based model for binary classification of hip osteoarthritis (OA) morphological abnormalities on MR images, and 2) develop an artificial intelligence (AI)-based assist tool to determine whether using the model predictions improves interreader agreement in hip grading.

Study type

Retrospective study aimed to evaluate a technical development.

Population

A total of 764 MRI volumes (364 patients) obtained from two studies (242 patients from LASEM [FORCe] and 122 patients from UCSF), split 65%/25%/10% into training, validation, and test sets for network training.
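
The abstract does not say whether the 65/25/10 split was made over patients or over volumes. As a minimal sketch, assuming a patient-level split (so that volumes from one patient never land in two different sets), the partition could look like the following. The function name `split_by_patient`, the toy identifiers, and the seed are all hypothetical and not taken from the study.

```python
import random

def split_by_patient(volume_to_patient, fractions=(0.65, 0.25, 0.10), seed=0):
    """Assign volumes to train/val/test so that each patient appears in one set only.

    Assumption: the split is done per patient; the abstract does not confirm this.
    """
    patients = sorted(set(volume_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)

    n = len(patients)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),  # remainder goes to test
    }

    split = {name: [] for name in groups}
    for volume, patient in sorted(volume_to_patient.items()):
        for name, members in groups.items():
            if patient in members:
                split[name].append(volume)
    return split

# Toy data: 20 hypothetical patients with two volumes each.
toy = {f"vol_{p:02d}_{k}": f"patient_{p:02d}" for p in range(20) for k in range(2)}
toy_split = split_by_patient(toy)
print({name: len(vols) for name, vols in toy_split.items()})
```

Splitting by patient rather than by volume avoids leaking near-duplicate anatomy between training and test data, at the cost of set sizes that only approximate the target fractions when patients contribute unequal numbers of volumes.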

Field strength/sequence

3T MRI, 2D T2 FSE, PD SPAIR.

Assessment

Automatic binary classification of cartilage lesions, bone marrow edema-like lesions, and subchondral cyst-like lesions using the MRNet; interreader agreement before and after using network predictions.

Statistical tests

Receiver operating characteristic (ROC) curve, area under curve (AUC), specificity and sensitivity, and balanced accuracy.
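
For readers less familiar with these metrics, the sketch below computes sensitivity, specificity, balanced accuracy, and AUC for a binary classifier using their standard definitions. The labels, scores, and the 0.5 threshold are toy values chosen for illustration, not data or settings from the study, and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """True positive rate: fraction of abnormal cases flagged as abnormal."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: fraction of normal cases flagged as normal."""
    return tn / (tn + fp)

def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))

def auc(y_true, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation:
    the probability that a random positive scores above a random negative,
    counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), balanced_accuracy(tp, fp, tn, fn))
print(auc(y_true, scores))
```

Sensitivity, specificity, and balanced accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds.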

Results

For cartilage lesions, bone marrow edema-like lesions, and subchondral cyst-like lesions, the AUCs were 0.80 (95% confidence interval [CI] 0.65, 0.95), 0.84 (95% CI 0.67, 1.00), and 0.77 (95% CI 0.66, 0.85), respectively. The sensitivity and specificity of the radiologist for binary classification were 0.79 (95% CI 0.65, 0.93) and 0.80 (95% CI 0.59, 1.02); 0.40 (95% CI -0.02, 0.83) and 0.72 (95% CI 0.59, 0.86); and 0.75 (95% CI 0.45, 1.05) and 0.88 (95% CI 0.77, 0.98), respectively. The interreader balanced accuracy increased from 53%, 71%, and 56% to 60%, 73%, and 68%, respectively, after using the network predictions and saliency maps.
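
Some of the reported interval bounds fall outside the range [0, 1]. The abstract does not state how the intervals were computed. Purely as an illustration, the sketch below shows that a normal-approximation (Wald) interval for a proportion can produce such bounds when the sample is small, whereas a Wilson score interval cannot. The counts used (9 of 10) are hypothetical and are not taken from the study.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval: p +/- z * sqrt(p * (1 - p) / n).
    Not clipped, so the bounds can leave [0, 1] for small n or extreme p."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval, which always stays inside [0, 1]."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical counts: 9 correct calls out of 10 cases.
wald_lo, wald_hi = wald_interval(9, 10)
wilson_lo, wilson_hi = wilson_interval(9, 10)
print(f"Wald:   ({wald_lo:.2f}, {wald_hi:.2f})")
print(f"Wilson: ({wilson_lo:.2f}, {wilson_hi:.2f})")
```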

Data conclusion

We have shown that a deep-learning approach achieved high performance in clinical classification tasks on hip MR images, and that using the predictions from the deep-learning model improved the interreader agreement in all pathologies.

Level of evidence

3

Technical efficacy

Stage 1

J. Magn. Reson. Imaging 2020;52:1163-1172.
