- Mullooly, Maeve;
- Ehteshami Bejnordi, Babak;
- Pfeiffer, Ruth;
- Fan, Shaoqi;
- Palakal, Maya;
- Hada, Manila;
- Vacek, Pamela;
- Weaver, Donald;
- Shepherd, John;
- Fan, Bo;
- Mahmoudzadeh, Amir;
- Wang, Jeff;
- Malkov, Serghei;
- Johnson, Jason;
- Herschorn, Sally;
- Sprague, Brian;
- Hewitt, Stephen;
- Brinton, Louise;
- Karssemeijer, Nico;
- van der Laak, Jeroen;
- Beck, Andrew;
- Sherman, Mark;
- Gierach, Gretchen
Breast density, a breast cancer risk factor, is a radiologic feature that reflects fibroglandular tissue content relative to breast area or volume. Its histology is incompletely characterized. Here we use deep learning approaches to identify histologic correlates in radiologically guided biopsies that may underlie breast density and distinguish cancer among women with elevated and low density. We evaluated hematoxylin and eosin (H&E)-stained digitized images from image-guided breast biopsies (n = 852 patients). Breast density was assessed as global and localized fibroglandular volume (%). A convolutional neural network characterized H&E composition. In total, 37 features describing tissue quantities and morphological structure were extracted from the network output. A random forest regression model was trained to identify the correlates most predictive of fibroglandular volume (n = 588). Correlations between predicted and radiologically quantified fibroglandular volume were assessed in 264 independent patients. A second random forest classifier was trained to predict diagnosis (invasive vs. benign); performance was assessed using the area under the receiver operating characteristic curve (AUC). Using the extracted features, regression models predicted global (r = 0.94) and localized (r = 0.93) fibroglandular volume, with fat and non-fatty stromal content representing the strongest correlates, followed by epithelial organization rather than quantity. For predicting cancer among patients with high and low fibroglandular volume, the classifier achieved AUCs of 0.92 and 0.84, respectively, with epithelial organizational features ranking most important. These results suggest that non-fatty stroma, fat tissue quantities and epithelial region organization predict fibroglandular volume. The model holds promise for identifying histological correlates of cancer risk in patients with high and low density and warrants further evaluation.
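
The two evaluation metrics named in the abstract, the correlation r between predicted and radiologically quantified fibroglandular volume and the AUC for the invasive vs. benign classifier, can be illustrated with a minimal standard-library sketch. This is not the authors' code: it assumes r denotes the Pearson correlation (the abstract does not say which coefficient was used), the function names `pearson_r` and `roc_auc` are hypothetical, and the toy values are invented purely to exercise the functions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    Labels are 1 (e.g. invasive) or 0 (e.g. benign).
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy values (NOT study data): measured vs. predicted volume (%).
measured_fgv = [10.0, 20.0, 30.0, 40.0]
predicted_fgv = [12.0, 18.0, 33.0, 39.0]
r = pearson_r(measured_fgv, predicted_fgv)

# Invented toy labels and classifier scores (NOT study data).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

print(f"r = {r:.3f}, AUC = {auc:.2f}")
```

The pairwise AUC above is quadratic in the number of samples, which is fine for an illustration; a rank-based formulation would be the usual choice for larger held-out sets.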