eScholarship
Open Access Publications from the University of California

UCLA

UCLA Previously Published Works

Robustness-Driven Feature Selection in Classification of Fibrotic Interstitial Lung Disease Patterns in Computed Tomography Using 3D Texture Features

Abstract

Lack of classifier robustness is a barrier to widespread adoption of computer-aided diagnosis systems for computed tomography (CT). We propose a novel Robustness-Driven Feature Selection (RDFS) algorithm that preferentially selects features robust to variations in CT technical factors. We evaluated RDFS in CT classification of fibrotic interstitial lung disease using 3D texture features. CTs were collected for 99 adult subjects separated into three datasets: training, multi-reconstruction, testing. Two thoracic radiologists provided cubic volumes of interest corresponding to six classes: pulmonary fibrosis, ground-glass opacity, honeycombing, normal lung parenchyma, airway, vessel. The multi-reconstruction dataset consisted of CT raw sinogram data reconstructed by systematically varying slice thickness, reconstruction kernel, and tube current (using a synthetic reduced-tube-current algorithm). Two support vector machine classifiers were created, one using RDFS ("with-RDFS") and one not ("without-RDFS"). Classifier robustness was compared on the multi-reconstruction dataset, using Cohen's kappa to assess classification agreement against a reference reconstruction. Classifier performance was compared on the testing dataset using the extended g-mean (EGM) measure. With-RDFS exhibited superior robustness (kappa 0.899-0.989) compared to without-RDFS (kappa 0.827-0.968). Both classifiers demonstrated similar performance on the testing dataset (EGM 0.778 for with-RDFS; 0.785 for without-RDFS), indicating that RDFS does not compromise classifier performance when discarding nonrobust features. RDFS is highly effective at improving classifier robustness against slice thickness, reconstruction kernel, and tube current without sacrificing performance, a result that has implications for multicenter clinical trials that rely on accurate and reproducible quantitative analysis of CT images collected under varied conditions across multiple sites, scanners, and timepoints.
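The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. Cohen's kappa follows its standard definition (observed agreement corrected for chance agreement). The abstract does not define the extended g-mean, so the sketch assumes it is the geometric mean of per-class recalls; that definition, the function names `cohens_kappa` and `extended_g_mean`, and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter
from math import prod


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two label sequences, e.g. classifier output on a
    reference reconstruction versus output on a varied reconstruction."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both sequences use one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def extended_g_mean(y_true, y_pred):
    """ASSUMED definition: geometric mean of per-class recall over the classes
    present in y_true. The abstract does not spell out the formula."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(t == c for t in y_true)
        hits = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        recalls.append(hits / total)
    return prod(recalls) ** (1.0 / len(recalls))


# Toy labels (hypothetical, not study data).
reference = ["fibrosis", "fibrosis", "normal", "normal"]
varied = ["fibrosis", "normal", "normal", "normal"]
kappa = cohens_kappa(reference, varied)
egm = extended_g_mean(reference, varied)
```

Under this assumed definition, a single class with zero recall drives the extended g-mean to zero, which is why such measures are often preferred over plain accuracy when classes are imbalanced.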
