Recent research highlights a growing demand for youth mental health services (Barican et al., 2022; Kazdin, 2019; USPSTF, 2022), creating a need to enhance mental health workforce capacity. Improving workforce capacity entails strengthening critical decision-making activities, including considering client problems, prioritizing them, and selecting the most suitable practices to address them. Clinical supervision, involving dyads of qualified mental health professionals ("supervisors") and direct service providers ("supervisees"), aims to improve these activities (Milne, 2007; Proctor, 1986). Challenges to effective supervision include time constraints, varying competency levels, and difficulty incorporating new scientific findings, all compounded by high turnover rates (Bernstein et al., 2015; Brabson et al., 2020; Chorpita et al., 2021; Collatz & Wetterling, 2012; Dorsey et al., 2017; Powell & York, 1992; Simon & Greenberger, 1971). Integrating decision-support systems into clinical supervision could address these challenges by promoting the use of evidence and supporting sustained skill retention among supervisory dyads (Bjork & Bjork, 2020). In the context of a decision-support system integrated into clinical supervision, this dissertation investigated the reliability of quality, effort, and efficiency metrics, and then examined how the ordinal repetition of activities and the passage of time were associated with the quality and effort metrics. As such, it explored whether time or repetition was associated with improvement, deterioration, or no change in these metrics.
The study analyzed existing data from a multi-site randomized implementation trial aimed at promoting the use of evidence-based methods for engaging youth and families in treatment. We audio-recorded and transcribed supervision events in which mental health workers discussed cases at risk for poor treatment engagement. For part one, 26 supervisees and 17 supervisors discussed 30 cases; for part two, 48 supervisees and 16 supervisors, who had been trained in and were using a decision-support system, discussed 118 cases.
Observational coders rated efficiency and the extensiveness of decision-making activities using a subset of the ACE-BOCS coding system (Chorpita et al., 2018). Efficiency was rated holistically for each event on a 5-point scale, ranging from extensive discussion of unnecessary topics (1) to swift and organized decision-making and planning (5). Quality was evaluated on a dichotomous scale, based on whether each activity met sufficient quality criteria, primarily indicating the presence of the activity. Effort was measured by the total number of words spoken for each activity. Two overall effort scores were also calculated for each event: the total number of words spoken and the duration of the entire event. The total number of supervisory events per supervisory dyad served as an indicator of repetition of supervisory activities, and the total number of weeks since training in the decision-support system measured the passage of time.
To assess interrater reliability across all coders, we used Fleiss' kappa (κ) for the four dichotomous quality metrics and intraclass correlation coefficients (ICCs; model [2,1], consistency) for the ordinal efficiency metric. To examine possible change in outcomes, we used mixed-effects regression models with three hierarchical levels: cases nested within supervisees nested within supervisors. Thus, supervisors were the main level of analysis. We assessed the contribution of each level to the results and simplified the model by dropping a level when including it did not improve the model. Because the quality and effort measures were skewed, with excess zeros, we applied corrections such as Firth logistic regression to the quality measures and specialized models such as the hurdle model to the effort measures. These strategies helped mitigate bias and stabilize parameter estimates.
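The two reliability statistics named above can be illustrated with a minimal sketch. It is not the study's analysis code: the function names `fleiss_kappa` and `icc_consistency_single` are hypothetical, the example ratings are invented for illustration, and the ICC shown is the two-way, single-rater, consistency form, which is one common reading of "model [2,1], consistency".

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of coders who placed subject i in category j.
    Every subject must be rated by the same number of coders (at least two).
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_subjects = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")

    # Observed agreement for each subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall proportion of ratings per category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)


def icc_consistency_single(ratings: Sequence[Sequence[float]]) -> float:
    """Two-way, single-rater, consistency ICC from a subjects x raters table.

    Computed from two-way ANOVA mean squares as
    (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error).
    """
    n = len(ratings)      # subjects (here: supervision events)
    k = len(ratings[0])   # raters (here: coders)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


if __name__ == "__main__":
    # Invented data: 5 events, 3 coders, quality coded absent/present.
    quality_counts = [[0, 3], [1, 2], [3, 0], [0, 3], [2, 1]]
    print("Fleiss' kappa:", round(fleiss_kappa(quality_counts), 3))

    # Invented data: 4 events rated by 3 coders on the 1-5 efficiency scale.
    efficiency = [[4, 5, 4], [2, 3, 2], [3, 3, 4], [5, 5, 5]]
    print("ICC (consistency):", round(icc_consistency_single(efficiency), 3))
```

Because the consistency form removes systematic differences between coders, a coder who rates every event one point higher than another does not lower the ICC; an absolute-agreement ICC would penalize that offset.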
Interrater reliability estimates showed that coders rated both the decision-making activities and overall efficiency reliably. A strong positive correlation provided initial evidence for the validity of the effort measure. Findings revealed changes in efficiency, the presence of quality, and the likelihood of putting in effort as dyads moved through each type of supervision event for their cases (for example, from the first supervision event type to the second and then to the third). Within each supervision event type, neither increasing repetition of supervision events nor the passage of time predicted whether dyads improved in these outcomes.
This study underscores the sustainability of quality, effort, and efficiency across repeated supervision events within different supervision types and over time. It also identifies areas for further investigation, including the need for more nuanced and robust measures of quality and effort. Future research should address these issues and explore alternative assessment methods to gain a deeper understanding of workforce learning. This understanding will inform strategies aimed at maximizing workforce capacity to meet the growing demand for high-quality youth mental health services.