Large-scale screening for depression typically relies on norms developed from a particular population at a particular time. Researchers have attempted to adjust the cutoff scores over time and for different populations, but such efforts are too few and far between to be sensitive to temporal and regional variations. In this study, we propose an unsupervised machine learning approach to constructing depression classifications to overcome these limitations of the traditional norm-based method. Data were collected from 8,063 Chinese middle and high school students. Using k-means clustering, we generated four levels of depressive symptoms to match the norm-based classifications. We then evaluated the validity of the classifications by comparing them with the norm-based method (and its variations) in terms of robustness, model performance (accuracy, AUC, and sensitivity), and convergent construct validity (i.e., associations with known correlates). The results showed that our automatic classification system performed well relative to the norm-based method.
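
To illustrate the clustering step described above, the following is a minimal sketch of k-means with four clusters, ordered from lowest to highest symptom level. It is not the study's implementation. It assumes, for simplicity, that each student is represented by a single total scale score; the study's actual input features are not stated here. The scores are synthetic, and the names `kmeans_1d` and `assign_level` are hypothetical.

```python
import random


def kmeans_1d(scores, k=4, iters=100, seed=0):
    """Plain Lloyd's k-means on one-dimensional scores; returns sorted centroids."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct observed scores.
    centroids = sorted(rng.sample(sorted(set(scores)), k))
    for _ in range(iters):
        # Assignment step: group each score with its nearest centroid.
        groups = [[] for _ in range(k)]
        for s in scores:
            nearest = min(range(k), key=lambda j: abs(s - centroids[j]))
            groups[nearest].append(s)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = sorted(
            sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)
        )
        if updated == centroids:
            break
        centroids = updated
    return centroids


def assign_level(score, centroids):
    """Map a score to a level 0..k-1, where 0 is the lowest-symptom cluster."""
    return min(range(len(centroids)), key=lambda j: abs(score - centroids[j]))


# Synthetic total scores (assumed data, not the study's sample).
data_rng = random.Random(42)
scores = [
    min(60, max(0, round(data_rng.gauss(mu, 3))))
    for mu in (8, 20, 32, 45)
    for _ in range(200)
]

centroids = kmeans_1d(scores)
levels = [assign_level(s, centroids) for s in scores]
```

Because the centroids are sorted, the boundaries between adjacent levels fall at the midpoints between neighbouring centroids. In this sketch the cutoffs are therefore derived from the sample itself rather than from an external norm.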