Individuals with serious mental illness experience changes in their clinical states over time that are difficult to assess and that result in increased disease burden and care utilization. It is not known if features derived from speech can serve as a transdiagnostic marker of these clinical states. This study evaluates the feasibility of collecting speech samples from people with serious mental illness and explores the potential utility for tracking changes in clinical state over time. Patients (n = 47) were recruited from a community-based mental health clinic with diagnoses of bipolar disorder, major depressive disorder, schizophrenia or schizoaffective disorder. Patients used an interactive voice response system for at least 4 months to provide speech samples. Clinic providers (n = 13) reviewed responses and provided global assessment ratings. We computed features of speech and used machine learning to create models of outcome measures trained using either population data or an individual's own data over time. The system was feasible to use, recording 1101 phone calls and 117 hours of speech. Most (92%) of the patients agreed that it was easy to use. The individually-trained models demonstrated the highest correlation with provider ratings (rho = 0.78, p<0.001). Population-level models demonstrated statistically significant correlations with provider global assessment ratings (rho = 0.44, p<0.001), future provider ratings (rho = 0.33, p<0.05), BASIS-24 summary score, depression sub score, and self-harm sub score (rho = 0.25,0.25, and 0.28 respectively; p<0.05), and the SF-12 mental health sub score (rho = 0.25, p<0.05), but not with other BASIS-24 or SF-12 sub scores. This study brings together longitudinal collection of objective behavioral markers along with a transdiagnostic, personalized approach for tracking of mental health clinical state in a community-based clinical setting.