eScholarship
Open Access Publications from the University of California

Hierarchical models for screening of iron deficiency anemia

Abstract

We investigate the problem of classifying individuals based on a density function estimated for each individual. Given labelled histograms characterizing the red blood cells (RBCs) of different individuals, the learning problem is to build a classifier that assigns new unlabelled histograms to the normal or iron-deficient class. The problem thus resembles conventional classification in that labelled training data are available, but differs in that the underlying measurements are histograms or density estimates rather than feature vectors. We describe a general framework based on probabilistic hierarchical models for modelling such data and illustrate how the model lends itself to classification. We contrast this approach with two alternatives: (1) directly defining a distance between densities using a cross-entropy distance measure, and (2) using the parameters of the estimated densities as feature vectors in a standard discriminative classification framework. We evaluate all three methods on a real-world data set of 180 subjects. The hierarchical modelling and density-distance approaches are the most accurate, yielding cross-validated error rates in the range of 1 to 2%. We conclude by discussing the relative merits of each approach, including the interpretability of each model from a clinical diagnostic viewpoint.
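To make the density-distance alternative (1) concrete, the following is a minimal sketch only, not the method used in the paper. The abstract does not specify the exact cross-entropy measure or the classification rule, so this example assumes a symmetrized Kullback-Leibler divergence between smoothed histograms and a nearest-neighbour rule. The function names, the smoothing constant `eps`, and the toy bin counts are all hypothetical.

```python
import math

def normalize(counts, eps=1e-9):
    """Turn raw bin counts into a probability vector.
    eps is an assumed smoothing constant that avoids log(0) for empty bins."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for two probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def sym_kl_distance(counts_a, counts_b):
    """Symmetrized KL divergence between two histograms (an assumed stand-in
    for the paper's unspecified cross-entropy distance)."""
    p, q = normalize(counts_a), normalize(counts_b)
    return kl(p, q) + kl(q, p)

def classify_nearest(query, labelled):
    """Return the label of the labelled histogram closest to the query.
    `labelled` is a list of (counts, label) pairs."""
    return min(labelled, key=lambda item: sym_kl_distance(query, item[0]))[1]

# Made-up toy histograms (5 bins each); not data from the study.
train = [
    ([1, 5, 20, 5, 1], "normal"),
    ([10, 15, 5, 1, 1], "iron deficient"),
]
print(classify_nearest([2, 6, 18, 4, 2], train))
```

The symmetrized form is used here only so that the measure does not depend on argument order; a one-directional cross-entropy would fit the same abstract description equally well.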
