Psychiatric Labels: Exploring Indirect and Direct Assessments of Task Performance

ABSTRACT We explore the idea that performance expectations in problem-solving groups (e.g., juries, planning groups) are partially outside of group members’ awareness. We first identify a divergence between indirect and direct teammate performance assessments among participants who are working with a teammate with schizophrenia in a two-person task group. The indirect indicator is the participant’s resistance to the teammate’s problem-solving suggestions, and the direct indicator is the participant’s subsequent and private responses to a series of questions about the teammate’s task performance. We explore the divergence further by assessing the extent to which participants’ political beliefs differentially affect the two measures. Liberals are likely to hold less explicitly prejudicial views of individuals with a mental illness than do conservatives. But, if performance expectations are driven by fairly uniform status beliefs, liberals’ resistance to influence from individuals with a mental illness should be similar to conservatives’. Consistent with that expectation, liberals’ direct assessment of the task performance of teammates with schizophrenia is more positive than conservatives’, but their indirect assessment (i.e., their resistance to their influence) is the same as conservatives’. All the findings hold with controls for stigmatized behavior toward the teammate (social and physical distance), stigmatized perceptions of the teammate (teammate evaluation and teammate likability), and social desirability bias. The findings are generally consistent with the idea that deference behaviors are sometimes rooted in performance expectations that are subconsciously held. They also illuminate status processes related to mental health and suggest a new way to infer the extent to which explicit performance assessments differ from performance expectations.

theory makes no claims one way or another about individuals' awareness of these expectations, an ambiguity linked to the challenges of measuring this potentially subconscious construct. Nonetheless, several recent studies have explored this question, seeking to determine if and when individuals are conscious of the expectations (e.g., Dippong 2020; Kalkhoff et al. 2020; Melamed et al. 2019). We build on those efforts with analyses that offer a new way to infer the extent to which individuals' explicit assessments diverge from their performance expectations. We explore the new strategy with mental health, an attribute that functions as a status characteristic within collectively oriented task groups (e.g., Lucas and Phelan 2012, 2019).

Status Characteristics Theory
According to SCT, when individuals work together on a valued task, the diffuse status characteristics that differentiate them shape their expectations about how they and others will perform on the task (Berger et al. 1977; Berger, Cohen, and Zelditch 1966, 1972; Berger, Wagner, and Zelditch 1985). Diffuse status characteristics are culturally defined characteristics (e.g., gender) whose states (e.g., man, woman) are given different degrees of esteem in widely shared status beliefs in the dominant culture. According to SCT, those widely shared status beliefs shape performance expectations in fairly uniform ways, leading most group members to expect those in the status-advantaged category to perform better than those in the status-disadvantaged category. Those expectations then create self-fulfilling prophecies: individuals in the status-disadvantaged category, sensing that they have less to contribute than those in the status-advantaged category, participate less frequently and defer to those in the advantaged category more frequently, while those in the status-advantaged category, sensing that they have more to offer, participate more often and defer less readily. Thus, SCT proposes a causal chain to explain status processes within problem-solving groups: widely shared status beliefs shape performance expectations, which, in turn, shape deference behavior.
Yet, the nature of the performance expectations that mediate these processes is unclear. When describing status processes, Berger, Wagner, and Zelditch (1985:37; also see Berger and Conner 1974:87) report that they "do not think of these as consciously guided processes, or processes that the actor monitors, or processes that the actor may even be aware of." Yet, they also do not describe the performance expectations as definitely subconscious or conscious, explaining instead that "expectation-states theories in general make no assumptions, which are formal parts of these theories, that relate the formation of the interactant's expectation states to conscious processes." Thus, although status characteristics theorists suspect that the expectations are outside of individuals' awareness, they have not made that idea a formal part of the theory. This ambiguity is reflected in the varied ways that researchers describe the expectations, with some describing them as "often" subconscious (e.g., Ridgeway 2019; Wagner 2007), others as "mostly" (Webster and Slattery Walker 2022) or "usually" subconscious (e.g., Doerer, Webster, and Walker 2017; Kalkhoff et al. 2020; Rashotte and Webster 2005; Ridgeway and Walker 1995), and still others as "assumed to be" subconscious (e.g., Melamed et al. 2019).
The uncertainty about this feature of performance expectations is related to the challenges of measuring the expectations. As others have noted (e.g., Kalkhoff et al. 2020; Melamed et al. 2019), overt measures of performance expectations present two interrelated problems: (1) if the expectations are subconscious, they are, presumably, not accessible through overt questions (the awareness problem); and (2) if they are within individuals' awareness, overt measures may alter them and/or allow participants to disguise them (the demand effect problem). Thus, as Berger, Wagner, and Zelditch (1985:35) suggest, performance expectations are unobservable phenomena that must be inferred rather than measured directly.
These challenges have led researchers to develop techniques designed to measure the expectations indirectly and implicitly. The strategies include measures of vocal frequency accommodation (e.g., Dippong 2020; Gallagher et al. 2005), magnetic resonance imaging (Melamed et al. 2017), electroencephalogram (Kalkhoff et al. 2020), and the implicit association test of status beliefs as a proxy for expectations (Melamed et al. 2019). They also include more traditional strategies that contrast status-related behavior or reports when participants are prompted to explain or think about their expectations with status-related behavior or reports when they are not given such prompts (e.g., Doerer, Webster, and Walker 2017).
Yet, each of these indirect measurement strategies has limitations, including reactivity, cost, and questions of measurement validity (Webster and Dippong 2022). Furthermore, some studies show that overtly measured performance expectations correspond to deference behavior (Driskell and Mullen 1988; Savage, Dippong, and Melamed 2020; Walker and Gur 2017), although the correspondence tends to occur for status characteristics closely connected to knowledge (e.g., education, military rank, and laboratory-created characteristics, for which participants are told that one group is more skilled than another group at an ability that participants just learned about in the lab).¹ Thus, despite progress on measuring and understanding performance expectations, questions remain about whether and when they are subconscious and, more fundamentally, about how to measure them.
We explore these questions for one status characteristic, mental health, within a collectively oriented two-person task group. We first identify a divergence between the indirect and direct measures of the participants' perceptions of their teammate's task performance in conditions with a teammate who discloses a history of schizophrenia. The indirect indicator is the participant's resistance to the teammate's problem-solving suggestions, a measure used for decades in the SCT literature to infer participants' assessments of their teammate's problem-solving suggestions (Berger and Webster 2018), and the direct indicator is the participant's subsequent and private responses to a series of questions about the teammate's task performance.²
We then explore the divergence further by assessing the extent to which political beliefs differentially affect the two types of measures. Political liberals hold less explicitly prejudicial views of disadvantaged individuals than do political conservatives (e.g., Haidt 2012), suggesting that liberals' direct assessments of the performance of individuals with a mental illness will be more positive than conservatives'. But, if performance expectations are subconsciously held and emerge from widely shared and fairly uniform status beliefs, those expectations should also be fairly uniform and, therefore, unrelated to political beliefs; thus, liberals' scores on the indirect indicator of performance expectations (i.e., their deference to the teammate's problem-solving suggestions) should be similar to conservatives'. Thus, if performance expectations are subconscious and fairly uniform, political beliefs should moderate the effect of teammate mental illness on the direct measure of teammate performance but fail to moderate the effect of teammate mental illness on the indirect measure. In the next section, we review past studies of status processes related to mental illness.

Status Effects of Mental Illness
The stigma of mental illness is widespread. Numerous studies in the U.S. (e.g., Hipes and Gemoets 2019; Hipes et al. 2016; Kroska et al. 2014; Lucas and Phelan 2012; Markowitz and Engelman 2017; Thibodeau and Principino 2019) and elsewhere (Schomerus et al. 2012) suggest that people tend to fear, negatively evaluate, and seek social and physical distance from individuals known to have a mental illness, patterns that have held steady for several decades. And the stigma extends to perceptions of incompetence and behaviors rooted in that perception, patterns evident in recent online studies (Phelan et al. 2019; Sadler, Meagor, and Kaye 2012), vignette experiments (Hipes and Gemoets 2019), field experiments (Hipes et al. 2016), and laboratory experiments (Kroska et al. 2015; Lucas and Phelan 2012, 2019; Manago and Mize 2022). The laboratory experiments generally show that individuals working in two-person task groups reject problem-solving suggestions from teammates with a mental illness more often than they reject others' suggestions, although the patterns vary somewhat by diagnosis and participant gender (for a review, see Lucas and Hipes 2022).
Yet, this discrimination is most evident when it is measured indirectly. When it is measured with explicit assessments, particularly explicit assessments of individuals with whom participants are interacting, the evidence is weaker or non-existent.³ Importantly, the divergence in results between direct and indirect measures occurs for both stigma-related measures (e.g., social distance, likability) (Kroska et al. 2014; Stier and Hinshaw 2007) and status-related measures (e.g., measures of competence) (Kroska et al. 2015). The divergence in results suggests that the direct and indirect measures are gauging different types of perceptions, with direct verbal measures tapping perceptions that are within individuals' awareness and the indirect behavioral measures tapping perceptions that are, at least for some participants, below individuals' awareness (Dovidio, Kawakami, and Beach 2001; Greenwald and Banaji 1995). The divergence also aligns with the SCT idea that hierarchical task-group behaviors (e.g., resistance to influence) are rooted in performance expectations that are often inaccessible when measured directly.⁴
Drawing on these studies, we expect a teammate's disclosure of a psychiatric hospitalization to reduce assessments of that teammate's task performance when that assessment is measured indirectly through participants' willingness to accept their problem-solving suggestions. But, we do not expect that disclosure to affect assessments of that teammate's task performance when it is later measured directly with explicit questions. Thus, we advance the following two hypotheses:
Hypothesis 1a: A teammate's mental illness will increase participants' resistance to that teammate's influence.
Hypothesis 1b: A teammate's mental illness will be unrelated to participants' direct assessments of that teammate's task performance.⁵
We examine these processes with two psychiatric diagnoses: schizophrenia and depression. We also examine these processes with a teammate who discloses a history of hospitalization for a non-psychiatric procedure, leg surgery, to determine the extent to which the patterns apply to a non-psychiatric medical problem that does not have a documented pattern of stigma connected to it.
As discussed earlier, individuals seek physical and social distance from individuals with a mental illness at a higher rate than they do from others. These tendencies, sometimes described as "stigma processes" to distinguish them from status processes (Lucas and Phelan 2012), grow out of fears of danger (e.g., Markowitz and Engelman 2017), disgust (e.g., Oaten, Stevenson, and Case 2009), discomfort (e.g., Cahill and Eggleston 1994), and concerns about stigma by association, or "courtesy stigma" (e.g., Corrigan and Miller 2004). The urge for physical and social distance could co-occur with and potentially exacerbate negative performance assessments if, for example, the assessments required in-person interactions or a public appearance with an individual with a mental illness. We address this concern in two ways. First, both measures of task performance are distinct from social and physical distance measures, because, as we discuss below, the teammate interaction is done over a computer (reducing danger fears, disgust reactions, and discomfort) and the interaction is not public (reducing courtesy stigma concerns). Second, our models control for both a behavioral measure of stigma (efforts to seek physical and social distance from the teammate) and two verbal measures of stigma (evaluation of the teammate and teammate likability), allowing us to evaluate these processes net of stigma-related behaviors and perceptions.

The Role of Political Beliefs
A demonstration of a divergence between direct and indirect assessments of task performance would be consistent with the SCT idea that task-group behavior is rooted in expectations that are often inaccessible when measured explicitly. But, we investigate the robustness of this possibility further by also examining how an explicitly measured belief that is likely to be related to prejudicial perceptions (namely, political liberalism) affects these patterns. More specifically, we use political beliefs as a way to identify individuals whose performance expectations should differ from their explicitly reported task assessment if, in fact, performance expectations are implicitly held.
Numerous studies in the psychology of morality suggest that in the U.S., political liberals are more likely than political conservatives to feel concern for disadvantaged individuals (the "ethic of care"), support equality and civil rights (the "ethic of equality") (Graham et al. 2013; Haidt 2012), and value communion (i.e., the maintenance of relationships and social functioning) and its concomitant traits, such as empathy and kindness (Eriksson 2018). Political liberals are also less likely than political conservatives to attribute class disadvantage to personal failings, such as laziness and dishonesty (Hunzaker and Valentino 2019). Together these patterns suggest that political liberalism will increase participants' direct assessments of the performance of individuals with a mental illness. But, if task-group behavior, such as resistance to a teammate's problem-solving suggestions, is driven by implicitly held performance expectations that emerge from widely shared and fairly uniform status beliefs, political views should not affect the task-group behavior (our indirect measure). Thus, we test the following two hypotheses:
Hypothesis 2a: Participants' political liberalism will not moderate the effect of teammate mental illness on resistance to teammate influence.
Hypothesis 2b: Participants' political liberalism will moderate the effect of teammate mental illness on direct assessments of teammate task performance, increasing the assessment of teammates with a mental illness but having no effect on the assessments of other teammates.
We also explore these processes with a somewhat different analysis strategy by examining the way that liberalism directly affects the discrepancy between the direct and indirect assessments. If liberalism increases directly measured assessments but not indirectly measured expectations, the direct scores should exceed the indirect scores for liberals but not for conservatives. Thus, we test the following moderation hypothesis:
Hypothesis 3: Political liberalism will moderate the effect of teammate mental illness on the discrepancy between direct and indirect measures of teammate performance. More specifically, liberalism will increase the discrepancy for teammates with a mental illness but have no effect on the discrepancy for other teammates.
As we explain below, the discrepancy scores are the residuals from the regression of the direct measure on the indirect measure (resistance to influence).
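One way to read Hypothesis 3 is as a statement about an interaction coefficient. The equation below is a simplified sketch rather than the model reported in the tables: it assumes a single mental-illness indicator (the study itself has separate hospitalization conditions), and the symbols D, MI, Lib, the control vector c, and the coefficients are notation introduced here for illustration only.

```latex
% D_i   : assessment-resistance discrepancy for participant i (regression residual)
% MI_i  : 1 if the teammate disclosed a mental illness, 0 otherwise (simplifying assumption)
% Lib_i : participant's political liberalism score
% c_i   : vector of control variables
\[
D_i = \beta_0 + \beta_1\,\mathit{MI}_i + \beta_2\,\mathit{Lib}_i
      + \beta_3\,(\mathit{MI}_i \times \mathit{Lib}_i)
      + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i
\]
% Under this sketch, Hypothesis 3 corresponds to \beta_3 > 0 (liberalism raises the
% discrepancy for teammates with a mental illness) together with \beta_2 \approx 0
% (no effect of liberalism for other teammates).
```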

Social Desirability Bias
Studies of stigma processes are, of course, vulnerable to social desirability bias, particularly when discrimination and prejudice are measured explicitly. This bias can be revealed by comparing the responses from participants who have a strong tendency to give socially desirable responses with those who do not. We explore the possibility of social desirability bias by interacting the hospitalization conditions with a measure of the tendency to give socially desirable responses. As we report in the final section of the results, none of the interaction terms involving the mental health conditions reach significance, suggesting that the patterns we identify are not a function of social desirability bias, at least not as measured with our social desirability index.

Sample
We collected data from 559 students who were taking an undergraduate course at a public university in the south between the fall of 2013 and the fall of 2015. Participants were given a description of the study in the informed consent sheet that they signed before participating in the study. Five of the participants were high school students who were taking a class for college credit. We have parental approval for their participation but dropped them for methodological reasons explained below. Three students elected to have their data destroyed, a standard option in the debriefing form, which left us with 551 undergraduate students who were willing to be included.
Motivation for success on the joint task is an important criterion for inclusion in SCT studies, so we excluded the cases in the bottom 3% (16 cases) on a composite measure of motivation for success ("How important was it to you that your team obtained correct answers on the contrast sensitivity tasks?" and "How important was it to you to succeed on the contrast sensitivity tasks?"). These 16 cases had composite motivation scores that ranged from 0 to 15.5 on a scale that ranged from 0 to 100 (mean = 60.9, sd = 19.9). The results are substantively the same with higher and lower cutoffs. In fact, the focal coefficients remain significant/not significant if we drop only the five participants with a zero on the motivation-to-succeed composite. In the debriefing, 26 of the remaining 535 participants (16 men and 10 women) reported a very clear and early suspicion that there was no teammate and/or that the joint task was not real, leaving 509 non-suspicious participants who were willing to have their data retained and were motivated to succeed. Thus, we excluded 42/551 (7.6%) due to suspicion and/or lack of motivation, an exclusion rate that is below the average rate (14.53%) among SCT studies that report doing exclusions according to Dippong's (2012) meta-analysis.
Rates of exclusion by condition are 7.2% in the schizophrenia condition, 5.2% in the depression condition, 6.6% in the leg surgery condition, and 11.2% in the non-patient condition. Rates of exclusion by gender are 10.6% among men and 6.2% among women. The differences in exclusion rates between the depression and non-patient conditions (p = .079; two-tailed test) and by gender (p = .074; two-tailed test) are close to significance. The results are highly similar when all 551 cases are retained and, as noted above, when using other cutoffs for the motivation-to-succeed composite (available from the first author on request).
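The sample bookkeeping above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this section; the variable and function names are illustrative and are not taken from the authors' materials.

```python
# Counts reported in the Sample section.
RECRUITED = 559
HIGH_SCHOOL = 5        # dropped: could not be status-matched on year in college
DATA_DESTROYED = 3     # opted out at debriefing
LOW_MOTIVATION = 16    # bottom ~3% on the motivation-to-succeed composite
SUSPICIOUS = 26        # doubted that the teammate or the joint task was real


def sample_flow():
    """Return the analytic-sample bookkeeping as a dict (hypothetical helper)."""
    willing = RECRUITED - HIGH_SCHOOL - DATA_DESTROYED
    motivated = willing - LOW_MOTIVATION
    analytic = motivated - SUSPICIOUS
    excluded = LOW_MOTIVATION + SUSPICIOUS
    return {
        "willing": willing,
        "motivated": motivated,
        "analytic": analytic,
        "excluded": excluded,
        "exclusion_rate_pct": round(100 * excluded / willing, 1),
    }


if __name__ == "__main__":
    print(sample_flow())
```

The resulting counts (551 willing, 535 motivated, 509 analytic, 42 excluded) match the figures given in the text.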

Teammate Hospitalization History and Gender
We manipulated the participant's teammate's hospitalization history and gender through an information exchange. At the beginning of the computerized instructions, participants learned that they would be working with a teammate on 25 "contrast sensitivity tasks." The instructions then asked them to fill out an electronic information sheet that would be exchanged with the teammate. The instructions explained that "The educational, employment, and demographic information you exchange will be similar to the information you might obtain from coworkers at a job" and asked them to "Please answer the following questions about yourself carefully and accurately." The form asked participants their gender, age, year in college, years of work experience, type of work experience, and whether they had had to take a leave of absence from school or work, and, if so, the reason. The teammate's response to the gender question served as the manipulation of teammate gender, and the teammate's responses to the last two questions served as the manipulation of the teammate hospitalization history. These responses were randomly assigned by the computer program. In the non-patient condition, the teammate response to the leave-of-absence question was simply "No." In the depression, schizophrenia, and leg surgery conditions, the answer was "Yes," and the answer to the follow-up question about the reason for the absence was "Last year I was hospitalized for depression/schizophrenia/leg surgery, so I took a little time off." Table 1 shows the descriptive statistics for this and the other variables in the analyses. The teammate's responses were matched with the participant's on the other information sheet questions so as not to introduce any other status differences, and we used broad response categories for all the response options except year in college so that the matching responses did not arouse suspicion. After participants were shown the teammate's responses, the instructions asked them to write the teammate's responses down on a Partner Information Sheet beside the computer, a task designed to ensure that participants saw their teammate's leave-of-absence and gender responses.

High School Students
Our response options for year in college (freshman, sophomore, junior, senior, and postbaccalaureate) did not, unfortunately, include high school student. Yet, we know from demographic questions at the beginning of the study that five participants were high school students. Those five students selected "freshman" as did, of course, the computerized teammate, so those five students thought they were working with someone with higher educational attainment. Consequently, we dropped those five cases to ensure that all participants perceived their teammate as equal in status on all factors except the manipulated variables.

Contrast Sensitivity Task
After exchanging information with the teammate, participants learned more about the contrast sensitivity tasks, a standard task used for investigating status-organizing processes (Berger 2014). Participants learned that on each of the 25 tasks, the two teammates would be presented with images and that their task was to determine which of the two images included more white area. Through an example trial, the teammates learned that they would provide an initial answer that was shared and that each teammate would then privately enter his or her final answer. In reality, all sets of images had an equal proportion of white, and the teammate was computerized and programmed to give an initial answer that differed from the participant's on 20 of the trials (all but trials 1, 6, 13, 17, and 22). The 20/25 rate of disagreement is a standard rate of disagreement used to investigate expectation states theory hypotheses (Berger 2014:274). Participants were told that the two teammates' final choices on each trial would be combined and that teams with scores in the top 25% would split a $20 bonus. This joint reward was designed to create a valued outcome and to motivate participants to work with the teammate to find the correct answer, contributing to the fulfillment of three SCT scope conditions (valued outcome, motivation to succeed, and collective orientation).⁶ After the 25 trials, participants completed a post-experimental questionnaire.

Participant attributes
Female participant is dummy coded (0 = male). The teammate's year in college was matched to the participant's during the information exchange, so participant and teammate education reflects both the participant's and the teammate's education. The options ranged from freshman to post-baccalaureate, but only one participant selected post-baccalaureate, so that case is folded into the senior category. These are dummy coded (0 = freshman). We measured the tendency toward social desirability at the end of the study using a shortened (10-item) version of the Marlowe-Crowne Social Desirability Scale that has strong psychometric properties (Fischer and Fick 1993). The items ask participants to give true or false answers to statements such as "I have never intensely disliked anyone." High scores indicate socially desirable responses.
We measured political liberalism before participation in the joint task, using the average of participants' self-identification on two 101-point sliders placed below the following prompts: "Politically, I am:," with "Extremely Liberal" on the left end and "Extremely Conservative" on the right end, and "I see myself as:," with "100% Democrat" on the left and "100% Republican" on the right. The items are correlated at .80. We coded this so high values indicate liberalism and divided the score by 10, so values range from 0 to 10.
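The scoring rule for political liberalism can be sketched as follows. This is an illustration under one stated assumption: that each slider's raw value runs from 0 at the left end (liberal/Democrat) to 100 at the right end (conservative/Republican). The text does not report the raw coding direction, and the function name is hypothetical.

```python
def liberalism_score(ideology_slider, party_slider):
    """Average two 0-100 sliders, reverse them so high = liberal, rescale to 0-10.

    Assumes raw values of 0 = left end (Extremely Liberal / 100% Democrat) and
    100 = right end (Extremely Conservative / 100% Republican).
    """
    for value in (ideology_slider, party_slider):
        if not 0 <= value <= 100:
            raise ValueError("slider values must lie between 0 and 100")
    mean_conservatism = (ideology_slider + party_slider) / 2
    return (100 - mean_conservatism) / 10


if __name__ == "__main__":
    # A participant near the liberal end of both sliders.
    print(liberalism_score(20, 40))
```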

Semester
We dummy coded semester, with the first semester (fall of 2013) omitted.

Stigma
We include a behavioral and two verbal measures of stigma. Social and physical distance, the behavioral measure, is the average of three dichotomous items (no = 1; yes = 0) that ask participants if they would like to: (1) stay after for 5 minutes to meet their teammate (mean = .28), (2) give their teammate their name and e-mail address (mean = .42), and (3) get to know their teammate socially (mean = .82).
The instructions for the first item read: The [university name] Department of Sociology encourages its researchers to give study participants who work on teams the opportunity to meet one another after the study is over. Therefore, if you have time, we want to give you the opportunity to meet your partner. The meeting will take about 5 minutes.
The instructions for the second item read: Would you like to provide your partner with your name and [university name] email address? If so, please provide that information below and we will give it to your partner after the study is over.
A full name or a correct university e-mail address was enough information to identify the teammate in the university directory, so we coded participants with a 0 on the second item if they gave: (1) their full name only, (2) a correct e-mail address only, (3) a first name and a correct e-mail address, or (4) a full name and a correct e-mail address.⁷ All others were coded with a 1. The instructions for the third item read: In addition to giving you the opportunity to meet your partner after the study, we also want to give you the opportunity to set up a future meeting with your partner. Indicate below if you would like us to tell your partner that you would like to get to know him or her socially outside of this study. The response you give here will be shared with your partner after the study is over.
If participants said "yes," they were told that "We can facilitate this meeting. Which type of meeting you would like us to arrange? The response you give here will be shared with your partner after the study is over. Select all that apply." The options included: "conversation on-line; conversation at a local coffee shop; no arrangement, because I changed my mind." Participants who selected "no" initially or in the follow-up question were coded with a 1. The alpha reliability is .412. We also ran all the models with the three items included separately, and the results are highly similar, with all the focal coefficients remaining significant/not significant.
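The coding rules for the social and physical distance index can be sketched in code. This is a minimal illustration of the rules as described above, not the authors' scoring script; the function and argument names are hypothetical.

```python
def code_contact_item(gave_full_name, gave_correct_email):
    """Item 2: 0 if the information given is identifying, else 1.

    Per the text, a full name or a correct university e-mail address
    (alone or in combination) counts as identifying.
    """
    return 0 if (gave_full_name or gave_correct_email) else 1


def code_social_item(initial_yes, changed_mind=False):
    """Item 3: 1 if the participant said no initially or in the follow-up."""
    return 0 if (initial_yes and not changed_mind) else 1


def distance_index(stay_item, contact_item, social_item):
    """Average of the three dichotomous items (no = 1; yes = 0)."""
    items = (stay_item, contact_item, social_item)
    if any(item not in (0, 1) for item in items):
        raise ValueError("each item must be coded 0 or 1")
    return sum(items) / len(items)


if __name__ == "__main__":
    # Declined to stay, gave only a first name, and changed mind on meeting socially.
    score = distance_index(1, code_contact_item(False, False), code_social_item(True, True))
    print(score)
```

Higher values indicate more distance seeking, which is why a refusal is coded 1 on every item.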
Teammate evaluation is the rating of "my partner" on a 9-point semantic differential scale anchored with "good" and "bad." The middle point of the scale was marked "neutral" (coded with 0), and the points between the midpoint and the endpoints were marked "slightly" (coded with −1/1), "quite" (coded with −2/2), "extremely" (coded with −3/3), and "infinitely" (coded with −4.3/4.3). The instructions introducing the scales emphasized that the ratings would not be shared with the partner. This measure is the evaluation component of stigma sentiments (see Kroska and Harkness (2006) for a report on measurement validity).⁸
Teammate likability is the rating of "my partner" on a 101-point slider that was anchored with "unlikable" and "likable." The instructions also emphasized that ratings would not be shared with the partner. The values were divided by 10, so they range from 0 to 10.

Dependent Variables
Resistance to influence is operationalized with participants' percentage of stays: the percentage of the 20 disagreement trials in which participants stay with their initial choice for their final choice in the contrast sensitivity tasks. The variable is left skewed, with a chi-square of 11.26 (p = .004) for the joint test of skewness and kurtosis, but no transformations improve this.
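The percentage-of-stays measure can be sketched as follows. The agreement trials (1, 6, 13, 17, and 22) come from the task description; the data layout (dictionaries keyed by trial number) and the function name are assumptions made for illustration.

```python
# Trials on which the programmed teammate agreed with the participant.
AGREEMENT_TRIALS = {1, 6, 13, 17, 22}
# The remaining 20 of the 25 trials are disagreement trials.
DISAGREEMENT_TRIALS = [t for t in range(1, 26) if t not in AGREEMENT_TRIALS]


def percent_stays(initial, final):
    """Percentage of disagreement trials on which the final choice equals the initial one.

    `initial` and `final` map trial numbers (1-25) to the participant's choice.
    """
    stays = sum(1 for t in DISAGREEMENT_TRIALS if final[t] == initial[t])
    return 100 * stays / len(DISAGREEMENT_TRIALS)


if __name__ == "__main__":
    initial = {t: "left" for t in range(1, 26)}
    final = dict(initial)
    for t in (2, 3, 4, 5):          # participant defers to the teammate four times
        final[t] = "right"
    print(percent_stays(initial, final))
```

Agreement trials are ignored because a participant cannot resist or accept influence when the teammate's initial answer matches their own.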
Assessment of teammate's task performance is the factor score extracted from principal factor analysis of five items measured with 101-point sliders: indicate how useful your partner's ideas were (not useful/useful); rate the quality of the contributions that your partner made during the contrast sensitivity tasks (very low quality/very high quality); indicate how skilled your partner was at the contrast sensitivity tasks (unskilled/skilled); indicate who you think has the most contrast sensitivity-you or your partner (me/my partner); indicate how responsible you felt your partner was when making the final selection on the contrast sensitivity tasks (not responsible/responsible), with the order of the items and the direction of the adjective pairs randomized across participants. The instructions introducing the scales emphasized that the ratings would be private. The items load on a single factor, with the following loadings from principal factor analysis: .39 (partner has most), .55 (responsible), .71 (useful), .80 (skilled), .83 (high quality contribution). The alpha reliability is .78. The variable approaches normality, with a chi-square of 5.28 (p = .071) for the joint test of skewness and kurtosis.⁹
Assessment-resistance discrepancy is the variance in the direct assessment of task performance that is unexplained by the resistance to the teammate's influence. Specifically, it is the residuals from the OLS regression of the assessment of task performance on the resistance to teammate influence, displayed in Model 1 of Table 2. In regression models, a case has a positive residual value when its value on the dependent variable is high relative to other cases with similar values on the independent variables, and a case has a negative residual value when its value on the dependent variable is low relative to other cases with similar values on the independent variables. Thus, a high value on a discrepancy score indicates that a participant's direct assessment of teammate task performance is high relative to the participant's resistance to influence, while a negative discrepancy means the reverse. We ran analyses for discrepancies separately by condition, so Table 1 includes the descriptive data by condition.
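The residual-based discrepancy score can be illustrated with a small sketch. It fits a bivariate OLS regression of the direct assessment on resistance to influence and returns the residuals; any additional terms in Model 1 of Table 2 are omitted here, which is a simplifying assumption. The function name and the example values are made up for illustration and are not study data.

```python
def ols_residuals(resistance, assessment):
    """Residuals from the bivariate OLS regression of assessment on resistance.

    A positive residual means the direct assessment is high relative to what
    the participant's resistance to influence would predict.
    """
    n = len(resistance)
    if n != len(assessment) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 cases")
    mean_x = sum(resistance) / n
    mean_y = sum(assessment) / n
    sxx = sum((x - mean_x) ** 2 for x in resistance)
    if sxx == 0:
        raise ValueError("resistance has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(resistance, assessment))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(resistance, assessment)]


if __name__ == "__main__":
    # Hypothetical values: percentage of stays and factor scores for five cases.
    stays = [40.0, 55.0, 60.0, 75.0, 90.0]
    scores = [0.9, 0.2, 0.4, -0.3, -0.6]
    print([round(r, 3) for r in ols_residuals(stays, scores)])
```

By construction the residuals average zero and are uncorrelated with resistance, so the discrepancy captures only the part of the direct assessment that resistance does not predict.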

Analysis Plan
We first review the relationship between the direct and indirect measures of teammate task performance in Table 2. We then evaluate our hypotheses with the models presented in Tables 3-5. We conclude by reviewing additional analyses that explore the role of social desirability bias. As shown in the table notes, we control for teammate gender in all models and for participant attributes (gender, education, social desirability), semester, and behavioral and verbal measures of stigma (social and physical distance, teammate evaluation, and teammate likability) in most models. We control for participant attributes in case the random assignment did not distribute these attributes in a perfectly random way. Due to space limitations, we do not display the coefficients for the controls in the tables, but we provide the full models in Tables A1-A3 in the Online Appendix.
We tested for interactions between the two types of conditions (i.e., teammate hospitalization history x teammate gender) and between the conditions and participant gender. The participant gender by teammate gender interaction term reached significance in two of the assessment of task performance models (Models 2 and 6 in Table 4), so we retained that term in all of the task performance models. Participant gender also moderates the leg surgery condition (but not the schizophrenia or depression conditions) in four task performance models (Models 3-6 in Table 4). Therefore, we present the final task performance model (Model 6) separately by participant gender in Table A3 of the Online Appendix, so readers can see the gender-specific coefficients for that model.

Correlation Between Indirect and Direct Measures
The Table 2 models show the strength and significance of the relationship between the direct and indirect measures of task performance overall (Model 1), within each hospitalization condition (Models 3, 5, 7, and 9), and with a tendency toward social desirability controlled (Models 2, 4, 6, 8, and 10). The bottom row shows the relationship as a correlation. As shown, the two measures are significantly related, and the relationships decline very little with the control for social desirability. The correlations between the measures fall below the median in Nosek's (2007) examination of correlations between implicit and explicit attitudes (r = .37 before adjustment for internal consistency and r = .48 after adjustment), but they vary by condition, with the strongest relationship in the non-patient condition (r = −.358) and the weakest in the schizophrenia condition (r = −.210). Thus, this preliminary analysis suggests that the direct and indirect measures are capturing related but not identical information and that the correspondence between the measures is weakest in the schizophrenia condition.

Resistance to Influence: H1a and H2a
Table 3 presents coefficients from OLS regressions of the resistance to influence on conditions and controls. According to H1a, a teammate's history of psychiatric hospitalization will increase a participant's resistance to that teammate's influence. Consistent with that hypothesis, Model 1 shows that teammate hospitalization for both schizophrenia and depression increases participants' resistance to influence. The effects hold in Models 2-4, which control for participant attributes and semester (Model 2), behavioral and verbal measures of stigma (added in Model 3), and liberalism (added in Model 4). The four models also show, by contrast, that teammate hospitalization for leg surgery is unrelated to resistance to influence. Together these results suggest that mental illness functions as a status characteristic, but that physical illness (at least a leg problem that warrants surgery and hospitalization) does not. According to H2a, participants' political beliefs will not moderate the effect of teammate mental illness on resistance to teammate influence. Consistent with that hypothesis, Model 5 shows that the political liberalism by hospitalization coefficients are not significant, suggesting that conservatives and liberals resist influence from teammates with a mental illness at a similar rate.

Assessment of Task Performance: H1b and H2b
Table 4 presents coefficients from OLS regressions of the participant's assessment of teammate task performance on conditions and controls. According to H1b, a teammate's history of psychiatric hospitalization will not affect direct assessments of a teammate's task performance. Consistent with that hypothesis, the schizophrenia and depression hospitalization coefficients are not significant in Model 1, and the non-significance holds in Models 2-4, which control for participant attributes and semester (Model 2), behavioral and verbal measures of stigma (added in Model 3), and liberalism (added in Model 4).
According to H2b, participants' political beliefs will moderate the effect of teammate mental illness on participants' direct assessment of their teammate's task performance. Consistent with that hypothesis, Model 5 shows that liberalism moderates the effect of teammate schizophrenia: it increases the performance assessment of teammates hospitalized for schizophrenia (b = .076, p = .010) but has no effect on the performance assessment of teammates with no hospitalization history (b = −.026, p = .347), and the slope difference is significant (b = .102, p = .011). In Model 6 we control for the resistance to influence, thereby showing the effect of political beliefs on the direct measure of task performance net of the indirect measure of task performance. As shown, the results are highly similar. Liberalism increases the performance assessment of teammates with schizophrenia (b = .080, p = .005) but has no effect on the assessment of teammates with no hospitalization history (b = −.035, p = .193), and the slope difference is significant (b = .115, p = .003). Figure 1 provides a plot of the Model 6 equation with the covariates held at the means.
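The relationship among the three reported coefficients can be checked with simple arithmetic. The sketch below assumes that the non-patient condition is the reference category, so that the liberalism slope within the schizophrenia condition equals the liberalism main effect plus the liberalism by schizophrenia interaction coefficient. The function name is hypothetical; the coefficients are those reported above for Models 5 and 6.

```python
def liberalism_slope(b_liberalism, b_interaction, in_condition):
    """Conditional slope of liberalism in a model with a dummy interaction.

    b_liberalism:  slope in the reference (non-patient) condition
    b_interaction: liberalism x condition coefficient (the slope difference)
    in_condition:  True for the focal condition, False for the reference
    """
    indicator = 1 if in_condition else 0
    return b_liberalism + b_interaction * indicator


# Model 5: reference slope -.026, slope difference .102
model5_schizophrenia = liberalism_slope(-0.026, 0.102, True)

# Model 6 (adds the control for resistance): reference slope -.035, difference .115
model6_schizophrenia = liberalism_slope(-0.035, 0.115, True)
```

Rounded to three decimals, the two sums reproduce the schizophrenia-condition slopes reported in the text (.076 and .080), which is why a significant interaction coefficient is read as a significant difference between the schizophrenia and non-patient slopes.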
We do not find support for H2b in the depression condition. Contrary to H2b, liberalism does not moderate the effect of teammate mental illness on task-performance assessment. Perhaps depression, which is less severely disabling than schizophrenia, does not elicit the same degree of concern that emerges from liberals' ethic of care (Haidt 2012), thus reducing the relevance of political beliefs to direct assessments of individuals with depression. We return to this topic in the discussion.

Assessment-Resistance Discrepancy: H3
According to H3, liberalism will increase the direct measure of task performance relative to the indirect measure (resistance to influence) when working with a teammate with a mental illness. Table 5 shows coefficients from OLS regressions of those discrepancies regressed on liberalism and controls. To simplify interpretation, we ran regressions separately within each condition. All models control for teammate gender; Models 2, 4, 6, and 8 also control for participant attributes, semester, and the behavioral and verbal measures of stigma. Consistent with H3, liberalism increases the assessment-resistance discrepancy in the schizophrenia condition but not in the non-patient or leg surgery conditions. When the full sample is analyzed with liberalism by hospitalization interactions, the difference between the schizophrenia and non-patient slopes is significant (b = .112, se = .039, p = .004). Contrary to H3, however, liberalism is not significant in the depression condition models (Models 3 and 4). Thus, as with the earlier models, the hypotheses are supported for a schizophrenia hospitalization but not a depression hospitalization.

Exploring Social Desirability Bias
Social desirability bias could have shaped participants' responses, particularly their direct assessments of their teammates' performances. Table 2 shows that the tendency toward social desirability is significant only in the non-patient condition, suggesting a weak role for social desirability bias. We explored this possibility more fully by adding social desirability by condition interactions to all the models in Tables 3 and 4, but none of the terms reached significance. We also considered the possibility that social desirability moderated the liberalism by condition interactions, so we added social desirability by liberalism by hospitalization history three-way terms (and all the lower-order two-way terms) to the final model in Table 3 and the last two models of Table 4. The leg surgery three-way term and the leg surgery by social desirability two-way term reached significance in Model 6 of Table 4, but none of the other three-way or two-way terms reached significance. We also added a social desirability by liberalism term to the final models in Table 5. Again, the only term that reached significance was in the leg surgery model. Together these results suggest that social desirability bias did not differentially affect responses in the mental health conditions or differentially shape the way that liberals and conservatives responded in the mental health conditions, although future investigations with additional measures of social desirability bias will be valuable.

Discussion
According to status characteristics theory, when individuals work jointly on a valued task, the diffuse status characteristics that differentiate them shape their expectations about how they and others will perform on the task, and those expectations then guide their willingness to accept others' task-related suggestions. Yet, researchers make no assumptions about the extent to which those performance expectations are within individuals' awareness. This uncertainty has prompted recent studies aimed at illuminating their nature and role in the status process (e.g., Kalkhoff et al. 2020; Melamed et al. 2019). We extend those efforts by using political beliefs to identify individuals whose directly reported assessments of their teammates would likely differ from their behavior toward their teammates if the expectations driving their behavior are, indeed, outside of their awareness. The analyses suggest a new way to infer the extent to which individuals' explicit assessments diverge from their performance expectations.
We found, as predicted, that participants ignored problem-solving suggestions from teammates with a history of psychiatric hospitalization more frequently than they ignored suggestions from teammates with no such history, replicating other findings suggesting that some psychiatric diagnoses function as a status characteristic (e.g., Lucas and Phelan 2012, 2019). But, as predicted, a teammate's history of psychiatric hospitalization did not affect participants' subsequent explicit, but private, assessments of their teammates' task performance. Both sets of results held even with controls for both behavioral and verbal measures of stigma and with controls for social desirability bias. This divergence between the indirect behavioral and the direct verbal assessments of task performance is consistent with SCT researchers' suspicion that task-group behavior may be guided by implicitly held performance expectations (e.g., Berger, Wagner, and Zelditch 1985).
Yet, the divergence is only one set of data points. Therefore, we sought to explore the pattern further by examining how a belief that is related to prejudicial perceptions, namely political liberalism, affects these relationships. In essence, we used political beliefs as a way to identify individuals whose deference behavior should differ from their direct assessments if, in fact, the performance expectations underlying deference behavior are subconscious. We found, as predicted, divergence in the direct and indirect measures among liberal participants who were working with teammates with schizophrenia. Political liberalism increased direct performance assessments of teammates with schizophrenia but had no effect on the performance assessment implied by their deference behavior, and again these results held with controls for both behavioral and verbal measures of stigma and with controls for social desirability bias. These results suggest that political views do not override the tendency to behave in discriminatory ways when seeking to receive a valued outcome and that task group behavior may, indeed, be driven by perceptions outside of individuals' awareness.
Next, we examined the effect of political views on the discrepancy between the two types of assessments. Here we found the same pattern: liberalism increased participants' direct assessments relative to the assessment implied by their behavior. Together these results support the idea that the deference behaviors identified in collectively oriented task groups are sometimes grounded, at least in part, in expectations that are not fully accessible when measured directly. The results also suggest that a divergence between direct and indirect assessments (something frequently found in SCT studies) can mask a more complicated pattern, with an outside factor (in this case, political views) determining the participants for whom the two types of measures diverge. Future work examining other factors that differentially shape the two types of measures would deepen our understanding of performance expectations and status processes more generally.
Yet, our results did not support our hypotheses in the depression condition. Contrary to predictions, political liberalism was unrelated to both the indirect behavioral and the direct verbal measures of the task performance of teammates with depression. Depression, which is more common and less severely disabling than schizophrenia, may not elicit the same feelings of concern that grow out of the ethic of care so common among political liberals (Graham et al. 2013; Haidt 2012), which, in turn, may reduce the relevance of political beliefs to assessments of individuals with depression. Future work examining performance expectations could explore the role of beliefs that are more closely linked to the relevant performance assessments, which in the case of mental illness may be beliefs about the competence of individuals with a mental illness.

Two Interpretations
Researchers studying performance expectations share concerns about the validity of direct measures of the expectations. But, there are two ways to understand the problem: (1) explicit measures could be problematic because performance expectations are subconscious and, therefore, inaccessible when measured directly, or (2) explicit measures could be problematic because performance expectations are within individuals' awareness and, therefore, can be censored when measured directly. Both are possible, but several of our findings suggest the first interpretation provides the better explanation. First, if performance expectations are within individuals' awareness and are censorable (possibility #2), it is not clear why participants were not censoring them during the indirect behavioral measure. The divergence in results between the direct and indirect measures is consistent with the idea that the perceptions driving the behaviors are implicitly held. Second, the divergence between direct and indirect measures happens only for the liberal participants, the participants we would expect to have divergent scores if, in fact, performance expectations are subconsciously held. Finally, if censoring were happening (possibility #2), we would expect it to happen more often among those with a tendency toward social desirability. But, that does not appear to be happening: the social desirability index does not moderate the effect of the mental health conditions on either of the outcomes, nor does it moderate the mental health conditions among liberals, the group of participants whose direct and indirect assessments are most likely to diverge.

Theoretical and Methodological Implications
Our findings have implications for several lines of status-related research. First, the findings are relevant to debates regarding the generalizability of status processes. According to SCT, the status beliefs that shape performance expectations and, in turn, deference behaviors are beliefs about how "most people feel," termed "third-order beliefs" (Ridgeway and Correll 2006) and, more recently, "generalized second-order beliefs" (Mize 2019). These beliefs have been shown to override first-order beliefs (how individuals personally feel) when the two types of beliefs conflict (Melamed et al. 2019), suggesting that the link between status characteristics and deference behavior will be fairly uniform within a given culture. Our study is consistent with that pattern. Despite differences between liberals and conservatives in their direct assessments of the performance of individuals with schizophrenia (likely a reflection of their first-order beliefs), the two groups resisted influence from individuals with schizophrenia at a similar rate.
Our findings also extend the literature on the status consequences of a mental illness diagnosis. The computerized teammates interacted with the participant in the same way across conditions, so our findings suggest that information about psychiatric hospitalization alone is enough to prompt resistance to influence and, unlike other investigations (Lucas and Phelan 2019), to do so for both a schizophrenia and a depression hospitalization. We also found that the resistance to suggestions did not occur with patients hospitalized for leg surgery, suggesting that the resistance effects are unique to mental illness.
Our analyses also suggest an avenue for unobtrusively exploring the extent to which the performance expectations underlying influence behaviors are subconscious, building on other unobtrusive techniques designed to illuminate the similarities and differences between indirectly and directly measured performance assessments and expectations (e.g., Doerer, Webster, and Walker 2017; Rashotte and Webster 2005). As noted above, future studies could extend our strategy by exploring the moderating role of beliefs that are more closely linked to the relevant status perceptions. In studies of the mental health status characteristic, these may be beliefs about the capabilities of individuals with a mental illness.

Future Research
Future work examining performance expectations within SCT could contrast indirectly and directly measured performance expectations for other established status characteristics (e.g., race, gender) and with additional types of indirect measures, such as sequential priming (Cameron, Brown-Iannuzzi, and Payne 2012), which assesses the effects of a stimulus on subsequent tasks, or the affect misattribution procedure (Payne et al. 2005), which measures automatic affective reactions toward stimuli based on how these reactions influence judgments of ambiguous stimuli. Future studies could also examine the role of factors shown to moderate the relationship between direct and indirect measures, such as interpersonal factors (e.g., perceived social consequences), intrapersonal factors (e.g., strength of the attitude), and measurement factors (Nosek 2007). It will also be valuable to examine the way that other beliefs and values relate to indirectly and directly measured performance assessments. Researchers could, for example, examine the way that the five foundations of moral beliefs (ethics of care, fairness, loyalty, authority, and purity) (Haidt 2012) relate to indirectly and directly measured performance assessments of individuals with varied status characteristics. Finally, it would be valuable to explore these processes with probability samples. Although such samples are not feasible with laboratory experiments, recent progress with online studies of status processes (Manago, Mize, and Doan 2021) suggests it may be possible to obtain broader and potentially representative samples with online experiments. Together all of these types of studies should deepen our understanding of performance expectations and status processes more generally.

Figure 1. Effect of Liberalism on Assessment of Teammate Performance by Condition.

Table 2. OLS regressions of assessment of teammate task performance on resistance to influence.

Table 3. OLS regressions of resistance to teammate influence on conditions, participant liberalism, and controls (N = 509). Models 2-5 also control for participant gender, participant and teammate education, a tendency toward social desirability, and semester. Models 3-5 also control for social and physical distance, teammate evaluation, and teammate likability. We report Model 2 in another study (Kroska et al. 2023).

Table 4. Coefficients from OLS regressions of assessment of teammate task performance on conditions, participant liberalism, and controls (N = 509).

Table 5. Coefficients from OLS regressions of assessment-resistance discrepancy on participant liberalism and controls in each hospitalization condition.