In a categorization task involving both labeled and unlabeled
data, it has been shown that humans make use of the underlying
distribution of the unlabeled examples. It has also been shown
that humans are sensitive to shifts in this distribution, and will
change predicted classifications based on these shifts. It is not
immediately obvious what drives these shifts, that is, which specific
properties of these distributions humans are sensitive to. Assuming
a parametric model of human categorization learning,
we can ask which parameters or sets of parameters humans fix
after exposure to labeled data and which are adjustable to fit
subsequent unlabeled data. We formulate models describing the
different parameter sets to which humans may be sensitive, and
construct a dataset that optimally discriminates among these models.
Experimental results indicate that humans are sensitive to all
parameters, with the closest model fit being an unconstrained
version of semi-supervised learning using expectation maximization.
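The unconstrained semi-supervised model referenced above can be illustrated with a minimal sketch: EM on a two-component one-dimensional Gaussian mixture in which labeled examples have clamped one-hot responsibilities while all parameters (means, variances, mixing weights) remain free to adapt to the unlabeled data. This is an illustrative assumption about the model class, not the paper's actual implementation; the function name and data are hypothetical.

```python
import numpy as np

def semi_supervised_em(x_lab, y_lab, x_unlab, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture.
    Labeled points keep fixed (one-hot) responsibilities;
    unlabeled points get soft responsibilities each E-step.
    All parameters stay free, mirroring an 'unconstrained' model."""
    x = np.concatenate([x_lab, x_unlab])
    n_lab = len(x_lab)
    # Responsibilities: labeled rows clamped, unlabeled start uniform.
    r = np.full((len(x), 2), 0.5)
    r[:n_lab] = np.eye(2)[y_lab]
    for _ in range(n_iter):
        # M-step: re-estimate means, variances, and mixing weights.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
        # E-step: update responsibilities of unlabeled points only.
        lik = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
              / np.sqrt(2 * np.pi * var)
        r[n_lab:] = lik[n_lab:] / lik[n_lab:].sum(axis=1, keepdims=True)
    return mu, var, pi

# Two labeled seeds near the origin; unlabeled data drawn from a
# shifted distribution pulls the fitted category means outward.
rng = np.random.default_rng(0)
x_lab = np.array([-1.0, 1.0])
y_lab = np.array([0, 1])
x_unlab = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])
mu, var, pi = semi_supervised_em(x_lab, y_lab, x_unlab)
```

Because no parameter is held fixed, the decision boundary (the midpoint between the fitted means, in the equal-variance case) tracks the unlabeled distribution rather than staying anchored at the labeled examples, which is the behavioral signature the experiments probe.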