Objective: Biomedical imaging research relies heavily on the subjective and semi-quantitative reader analysis of images. Current methods are limited by interreader variability and fixed upper and lower limits. The purpose of this study was to compare the performance of two assessment methods, pairwise comparison and Likert scale, for improved analysis of biomedical images.
Materials and methods: A set of 10 images with varying degrees of image sharpness was created by digitally blurring a normal clinical chest radiograph. Readers assessed the degree of image sharpness using two different methods: pairwise comparison and a 10-point Likert scale. Reader agreement with actual chest radiograph sharpness was calculated for each method by use of the Lin concordance correlation coefficient (CCC).
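
For readers unfamiliar with the agreement measure named above, the following is a minimal sketch of the standard Lin CCC formula, 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), using population (1/n) variances. The function name `lin_ccc` and the example scores are hypothetical illustrations; they are not the study's code or data.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Lin concordance correlation coefficient between two equal-length sequences.

    Uses population (1/n) variance and covariance, as in Lin's original definition.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are constant and equal")
    return 2 * cov_xy / denom


# Hypothetical example: true sharpness ranks 1..10 versus made-up reader scores.
true_sharpness = list(range(1, 11))
reader_scores = [1, 2, 2, 4, 6, 5, 7, 9, 9, 10]  # illustrative only
print(round(lin_ccc(true_sharpness, reader_scores), 3))
```

Unlike the Pearson correlation, the CCC penalizes systematic offsets: a reader whose scores are all one point too high has perfect correlation with the truth but a CCC below 1.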
Results: Reader accuracy was highest for pairwise comparison (CCC, 1.0) and ranked Likert (CCC, 0.99) scores and lowest for nonranked Likert scores (CCC, 0.83). Accuracy improved slightly when readers repeated their assessments (CCC, 0.87) or had reference images available (CCC, 0.91).
Conclusion: Pairwise comparison and ranked Likert scores yield more accurate reader assessments than nonranked Likert scores.