It is estimated that nearly one in five adults in the United States lives with mental illness, and for individuals who struggle with mental health, the experience can be excruciating. The rise of mobile devices presents a unique opportunity to improve mental health outcomes, in part by empowering mental health professionals. Because many individuals carry their smartphones with them at all times, smartphones may enable health professionals to identify when an individual is in need and provide immediate care, rather than delaying care until a scheduled appointment as the current model does. To this end, I investigate the feasibility of two new data-driven tools: one to identify when care is needed, and a second to help train counselors to intervene when care is needed. The first tool seeks to use a smartphone to sense an individual's well-being. Such a tool could be used to inform health professionals of patients' states, evaluate the efficacy of therapies, and deliver just-in-time interventions. To evaluate the potential accuracy of such a tool, I collect students' passive smartphone data and self-reported well-being measures, and then consider two tasks: predicting well-being on a daily basis and detecting significant changes over a period of time. I find that while correlations between smartphone-sensed measures and reported well-being scores exist, these relationships are often too weak to reliably predict well-being. As this approach seems unreliable for most individuals, I further explore for which individuals it may be reliable and develop a framework for evaluating longitudinal sensing quality. The second tool seeks to help suicide prevention counselors practice intervening over chat with individuals in crisis. For this, I collect and leverage synthetic conversation transcripts and show how to evaluate a baseline system that lets counselors practice crisis de-escalation strategies in a no-risk environment.
While text retrieval and generation methods can return responses that make sense in limited context most of the time, i.e., in greater than 50% of examples, generated responses are shorter than retrieved full messages, suggesting that generation may be a less engaging approach. Overall, I find that significant consideration of context is needed to meaningfully evaluate methods for the tools envisioned. While popular algorithms and methods may hold potential for developing the tools discussed, rigorous evaluation and further work are needed to ensure reliability within the application context.