This work explores linear dimensionality reduction techniques that preserve information relevant to specific classification tasks. We propose a Gaussian latent variable model trained to maximize the likelihood of the observed data, subject to the constraint that a prediction loss computed on the low-dimensional representation stays below a chosen threshold. We augment the log-likelihood objective with auxiliary losses that enforce the prediction constraint via Lagrange multipliers. The resulting prediction-constrained (PC) training objective effectively integrates supervisory information even when only a small fraction of training samples are labeled. We analyzed the performance of our PC approach on predicting emotions from face images. It improved prediction quality over a multinomial logistic regression model fitted to the output of standard linear dimensionality reduction techniques, and achieved competitive performance against a multinomial logistic regression model trained on full-resolution images. The learned parameters project images to a low-dimensional space while capturing the defining features of each class; in particular, images reconstructed with these parameters captured facial expressions substantially better than reconstructions from Factor Analysis and Probabilistic Principal Component Analysis.
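As a minimal sketch of the training objective described above, the snippet below combines a PPCA-style Gaussian negative log-likelihood with a softmax prediction loss on the latent codes, joined through a Lagrange-multiplier penalty on the constraint. All names here (`W`, `V`, `lam`, `epsilon`, and the use of a PPCA marginal likelihood) are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def pc_objective(X, y, W, sigma2, V, lam, epsilon):
    """Hypothetical prediction-constrained objective (illustrative only).

    X: (n, d) data, y: (n,) integer labels,
    W: (d, k) factor loadings, sigma2: noise variance,
    V: (k, c) softmax classifier weights on the latent codes,
    lam: Lagrange multiplier, epsilon: prediction-loss threshold.
    """
    n, d = X.shape
    # PPCA-style marginal covariance: C = W W^T + sigma2 I
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    Cinv = np.linalg.inv(C)
    # Negative log-likelihood of x ~ N(0, C), up to additive constants
    nll = 0.5 * (n * logdet + np.trace(X @ Cinv @ X.T))

    # Posterior-mean latent codes: E[z|x] = M^{-1} W^T x, using C^{-1} W = W M^{-1}
    Z = X @ Cinv @ W  # (n, k)

    # Multiclass cross-entropy of a softmax classifier on the codes
    logits = Z @ V
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    pred_loss = -np.log(p[np.arange(n), y] + 1e-12).mean()

    # Lagrangian relaxation of the constraint pred_loss <= epsilon
    return nll + lam * max(0.0, pred_loss - epsilon)
```

Minimizing this objective over `W`, `sigma2`, and `V` (e.g., by gradient descent) trades off data likelihood against prediction quality; larger multipliers push the solution toward satisfying the prediction constraint.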