A hierarchical classes analysis (HICLAS) of primary care patients with medically unexplained somatic symptoms

This study used a clustering model, Hierarchical Classes Analysis (HICLAS), to examine patient groupings in a multiethnic sample of 1456 patients using primary care services at a university-affiliated community clinic in southern California. Somatic symptoms, psychiatric diagnoses and disability were studied using a survey instrument that included portions of the Composite International Diagnostic Interview (CIDI), the Diagnostic Interview Schedule (DIS) and the RAND-MOS Short Form Health Survey's (SF-36) 'physical functioning' dimension. HICLAS identified 11 clusters of patients with distinct patterns of medically unexplained somatic symptoms. These patient clusters varied with respect to psychiatric diagnoses and symptoms, gender, immigration status and disability. Results of this study suggest that the type of presenting symptom(s) and their various combinations may have diagnostic and prognostic value in primary care settings. These new findings may lead to further refinement of current diagnostic constructs for somatizing syndromes. © 1998 Elsevier Science Ireland Ltd. All rights reserved.


Introduction
Somatizing syndromes have been described throughout the years under protean labels (e.g. 'ennui', 'hysteria', 'hypochondriasis', 'neurasthenia') shaped by the medical model dominant at the time (Shorter, 1993). Historically, regardless of fashion or prevailing paradigms, somatic presentations have had a place of their own in descriptive psychopathology.
Because of the pejorative connotation of some old terms (e.g. 'hysteria'), the word 'somatization', denoting psychological causality, was incorporated into clinical discourse. In current diagnostic systems, 'somatoform' became the term used to signify an overarching category that subsumed somatization disorder as well as other disorders, such as hypochondriasis, that involved medically unexplained physical symptoms. However, the acceptance of the two terms remains confined to the area of psychiatry and clinical psychology. Because hypochondriasis and somatization disorder (the two most distinctive and valid somatoform diagnoses) have low prevalence rates and fail to capture the large majority of patients presenting with unexplained medical symptoms, there is a need to develop systems of classification that are more 'user-friendly' and that can be shared with and accepted by primary care physicians.

Lists of somatic symptoms
The traditional lists of somatic symptoms used to elicit diagnoses of hysteria, and later somatization disorder, were actually quite comprehensive. Originally, these included not only somatic symptoms, but also other attitudinal and clinical features often seen in these patients, such as dramatic demeanor and depressive, anxiety, and even psychotic symptoms. For example, in the 1960s, Perley and Guze (1962) used a list with 59 different symptoms to diagnose hysteria in their prospective studies. In the more recent classifications (DSM-III, DSM-III-R and DSM-IV), the symptoms have been restricted to somatic manifestations. The total number of 'possible' symptoms has decreased from 59 to 37 in DSM-III, 35 in DSM-III-R and now 34 in DSM-IV, and the symptom cut-off for defining a case has followed suit.
Many of the somatic symptoms listed in the above nomenclatures and included in diagnostic instruments such as the Diagnostic Interview Schedule (DIS) (Robins et al., 1981) and the Composite International Diagnostic Interview (CIDI) (Robins et al., 1988) are relatively common in general and clinical populations (e.g. gastrointestinal, cardiorespiratory). Other symptoms, while rare, provide a distinctive rubric of psychopathologic 'caseness' (e.g. pseudoneurologic or 'conversion' symptoms), and therefore remain useful for systematic studies.

Somatic symptom typologies
Clustering methods have been used extensively to inform taxonomic work in psychiatry in general as well as in other areas of medicine and in biological science (such as the identification of species). In addition to clinical observations, empirical clustering studies of symptoms and signs have proved useful for the purpose of developing new diagnoses and/or validating old ones. In a thorough review of clustering applications in psychiatry, Blashfield (1986) cited over 500 such studies.
To our knowledge, only two studies have examined empirically how somatic symptoms (among other psychiatric symptoms) cluster, and both were carried out in non-clinical populations. Both investigations used computerized clustering algorithms to examine clustering of DIS-elicited somatic symptoms, as well as psychiatric symptoms such as anxiety and depression. The first of these studies (Swartz et al., 1986) used a procedure developed for the analysis of medical classifications called 'grade of membership analysis' or GOM (Woodbury and Manton, 1982). GOM produces 'fuzzy' categories, as opposed to the discrete categories generated by more traditional clustering models. According to the developers of GOM, the use of fuzzy sets may be a more appropriate way to represent 'gradations' in the individual expression of symptoms, which are so characteristic of psychiatric syndromes and classifications. The sample used for GOM analysis included approx. 4000 community respondents interviewed with the DIS as part of the ECA study in the Piedmont region of North Carolina. The version of the DIS used in the study asked about and probed 47 individual symptom items for eliciting a diagnosis of somatization disorder. Of these items, 38 were somatic symptoms, four were symptoms of sexual dysfunction, two were measures of 'sickliness' and three were symptoms of depression. To this list the investigators added one psychotic symptom (hallucination or delusion), three depressive symptoms and one anxiety symptom ('nervousness') in efforts to round up all possible features of somatization disorder as described in DSM-III. Hence, the final list consisted of 52 symptoms, thus approximating the original list used by the Washington University group (Perley and Guze, 1962). Respondents endorsing at least three of the symptoms (approx. 45% of the sample) were included in the clustering analyses of somatic symptoms.
The results of the GOM analyses yielded seven 'pure' types of 'naturally occurring' somatic symptom clusters. These clusters had different symptom admixtures and sociodemographic characteristics. A brief outline of these clusters is provided in Table 1.
In the other study, Rubio-Stipec et al. (1989) analyzed DIS interviews of approx. 3500 respondents at the Los Angeles ECA and Puerto Rican survey sites. The authors used factor analysis, which has an underlying dimensional model, as a way to represent symptom clusters. These analyses yielded five clusters in the Puerto Rican sample (alcohol, affective, phobic, psychotic and somatization disorder symptoms), all of which, except somatization, could be replicated in the Los Angeles sample. The somatization cluster found in the Puerto Rican community included 14 symptoms, five pseudoneurologic (dizziness, amnesia, fainting, paralysis, double vision), four gastrointestinal (abdominal pain, vomiting, nausea, excessive gas), two cardiorespiratory (chest pain and palpitations), one musculoskeletal (muscle weakness) and one non-specific (lifetime sickliness).

HICLAS clustering model
The present study makes extensive use of the Hierarchical Classes Model (HICLAS) to represent how primary care patients cluster with respect to groupings of medically unexplained somatic symptoms. HICLAS is a recently developed two-way, two-mode clustering model, well suited to binary data such as in the present application (De Boeck and Rosenberg, 1988; Gara et al., 1992; Van Mechelen et al., 1995; Rosenberg et al., 1996). The model has the capability of representing clustering in the rows (in this application, patients) and columns (symptoms) of a binary data array. Most other (but not all) clustering algorithms represent either row clusters or column clusters, but not both. In addition, HICLAS is unique in that it explicitly represents set-theoretical (superset-subset) relations among the row and column clusters.
The first published application of the HICLAS model in the area of psychiatric nosology was an analysis of how symptoms were distributed among disorders in the DSM-III-R manual itself (Gara et al., 1992). In effect, this was a quantitative analysis of the underlying taxonomy tacit in the DSM-III-R. This HICLAS analysis revealed not only several well-defined discrete symptom classes (e.g. delusions, depression, somatic symptoms) in the DSM-III-R but also several well-defined clusters of psychiatric disorders. The latter were modeled in HICLAS as combinations of one or more symptom classes and seemed to match traditional categories in descriptive psychopathology that have been incorporated in psychiatric nosologies throughout the years. Listed in hierarchical order on the basis of how well they were defined by the HICLAS program, these syndromes included: Psychoses; Mood Disorders; Organic Mental Disorders; Sleep Disorders; Addictive Disorders; Somatoform Disorders; Schizophrenia; and Anxiety Disorders. Interestingly, other less 'traditional' categories, such as Adjustment Disorders, Childhood Disorders, Personality Disorders and Sexual Disorders, were not well-defined and fit the HICLAS model quite poorly. An interesting validation of the HICLAS model was the finding that the 'goodness of fit' (Jaccard measure) of disorder classes to the HICLAS model was significantly correlated to their inter-rater reliability in the DSM-III-R field trials (Gara et al., 1992).
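The superset-subset idea can be illustrated with a minimal Python sketch. This is not the HICLAS algorithm; it only shows, under the assumption that each cluster is described by the set of symptom classes that define it, how a hierarchy follows from set inclusion. The cluster names and symptom-class codes are hypothetical toy values, not results from the study.

```python
# Toy illustration of superset-subset relations among clusters.
# All cluster names and symptom-class codes below are hypothetical.

def strict_subclusters(clusters):
    """Map each cluster to the clusters whose symptom classes it strictly contains."""
    hierarchy = {}
    for name, classes in clusters.items():
        hierarchy[name] = sorted(
            other
            for other, other_classes in clusters.items()
            if other != name and other_classes < classes  # proper-subset test
        )
    return hierarchy


toy_clusters = {
    "top": frozenset({"CR", "GU", "PN"}),
    "mid": frozenset({"CR", "GU"}),
    "cr_only": frozenset({"CR"}),
    "gu_only": frozenset({"GU"}),
}

hierarchy = strict_subclusters(toy_clusters)
for name, below in hierarchy.items():
    print(name, "subsumes", below)
```

In this toy layout the cluster defined by the most symptom classes sits at the top of the hierarchy and subsumes every cluster defined by a subset of those classes.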

Subjects
The subject sample consisted of 1456 new patients who sought primary care services at a University-affiliated outpatient clinic (North Orange County Community Clinic) located in Anaheim, CA. Following completion of informed consent procedures, and in temporal proximity with their clinical examination by a physician, the patients participated in a structured interview administered by trained bilingual interviewers that included detailed questions on general demographics, psychopathology, and physical functioning. Fifty percent of those patients initially approached for the study agreed to participate. There were no demographic differences between study participants and those who declined participation, except for level of education. Those who agreed to participate had, on average, one more year of education than those who did not.

Instruments
Assessment of psychopathology was made with the Composite International Diagnostic Interview (CIDI) (Robins et al., 1988). Diagnoses examined included: Somatization Disorder; Generalized Anxiety; Dysthymia; and Major Depression (including melancholic subtypes). In addition, the 'physical functioning' dimension of the RAND-MOS Short Form Health Survey (SF-36) was used as a measure of disability (Brook et al., 1979). Total scores in this dimension range between 10 (severe disability) and 30 (no disability).
Bilingual (Spanish/English) research interviewers were trained in the use of the CIDI, adhering to the official CIDI training guidelines as done at the US training site located in the Department of Psychiatry, Washington University in St. Louis. All instruments were translated, pre-tested and adapted for use with Spanish-speaking subjects.
Following the standard probing system in the CIDI, symptoms were scored as 'present' if they met the severity criteria and remained medically unexplained after detailed questioning. For example, if the respondent answered 'yes' to the question 'have you ever had abdominal or belly pain?', the interviewer proceeded with a specific set of questions to determine symptom severity, which included probes regarding physician visits, medication intake, or significant interference with daily life or functioning. If these criteria were met, the interviewer asked about the physician's diagnosis and probed whether the symptom was ever due to physical illness or injury, or followed the use of medications, drugs or alcohol. If these inquiries proved negative for medical explanations, the symptom was scored as a positive somatization symptom. Obviously, the four female reproductive items were skipped in the case of male patients. Thus, there were only 37 symptoms applicable to males.
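The probing sequence described above can be sketched as a small scoring function. This is an illustrative reading of the text, not the CIDI's actual scoring code: the `ProbeAnswers` container and its field names are hypothetical, and treating the severity probes as alternatives (any one suffices) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProbeAnswers:
    """Hypothetical container for one symptom's probe answers."""
    ever_had: bool
    saw_physician: bool
    took_medication: bool
    interfered_with_life: bool
    medically_explained: bool  # illness, injury, medication, drugs or alcohol


def score_symptom(p: ProbeAnswers) -> int:
    """Return 1 for a positive somatization symptom, otherwise 0."""
    if not p.ever_had:
        return 0
    # Assumption: meeting any one severity probe satisfies the severity criteria.
    severe = p.saw_physician or p.took_medication or p.interfered_with_life
    if not severe:
        return 0
    # Only symptoms without a medical explanation count.
    return 0 if p.medically_explained else 1


example = ProbeAnswers(
    ever_had=True,
    saw_physician=True,
    took_medication=False,
    interfered_with_life=False,
    medically_explained=False,
)
print(score_symptom(example))
```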

HICLAS and other statistical analyses used in this study
A hierarchical classes approach to psychiatric taxonomy presupposes the assemblage of a matrix of patients × symptoms. In the present analyses, the matrix has 1455 rows corresponding to the 1455 patients with complete symptom data and 41 columns representing the various CIDI somatic symptoms. A cell entry in the matrix is '1' if the patient has the symptom and if the symptom is judged to be disruptive and remains medically unexplained after medical consultation. Otherwise the cell entry is '0'. In this model various relationships among symptoms and across subjects can be represented. For example, two or more patients can be allocated by HICLAS to the same patient cluster if they share the same pattern of symptoms (e.g. symptoms from the same organ systems). Similar patterns can also be detected for the various symptoms, and these patterns can be represented as symptom clusters that are arrayed hierarchically. Thus, HICLAS defines two hierarchical class structures for a patient × symptom matrix. One represents cluster relations among patients; the other, cluster relations among symptoms. The two structures and their relations constitute a formal hierarchical classes model for the entire matrix.
Because the resulting solution from the primary HICLAS analyses yielded an extraordinary number of low frequency (N = 1-2) patient classes, we also performed a second order HICLAS analysis on the results of the first analysis, in order to reduce the number of low frequency clusters. This second order analysis, analogous to a second-order factor analysis, led to the merging of various low-frequency symptom clusters that had similar patterns into larger clusters, thus bypassing the need to collapse these low frequency clusters by eyeballing or by other a priori, heuristic procedures.
The results of HICLAS analyses were related statistically (e.g. using t-tests) to psychiatric diagnoses such as major depression, melancholic depression, and generalized anxiety, and to disability. For the latter, an index of 'disability' was derived from the RAND-MOS 'physical functioning' dimension. Scores in this scale range between 10 (no disability) and 30 (severe disability).
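A minimal sketch of the data layout may help. The Python below assembles a small patient-by-symptom binary matrix and groups patients whose rows are identical. The patient identifiers, symptom names and scored pairs are hypothetical, and grouping identical rows is only the simplest case of 'sharing a pattern'; HICLAS itself fits an approximate model and is not reproduced here.

```python
from collections import defaultdict


def build_matrix(scored, patients, symptoms):
    """Rows are patients, columns are symptoms, entries are 0 or 1."""
    return [[1 if (p, s) in scored else 0 for s in symptoms] for p in patients]


def group_identical_rows(matrix, patients):
    """Group patients whose symptom patterns are exactly the same."""
    groups = defaultdict(list)
    for patient, row in zip(patients, matrix):
        groups[tuple(row)].append(patient)
    return dict(groups)


# Hypothetical toy data: (patient, symptom) pairs already scored as positive.
patients = ["p1", "p2", "p3"]
symptoms = ["chest pain", "palpitations", "abdominal pain"]
scored = {("p1", "chest pain"), ("p2", "chest pain"), ("p3", "abdominal pain")}

matrix = build_matrix(scored, patients, symptoms)
groups = group_identical_rows(matrix, patients)
for pattern, members in groups.items():
    print(pattern, members)
```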

Results
The patients were 55% female, and their ages ranged between 18 and 67 years (mean = 36.4; S.D. = 11.8). The sample included predominantly four ethnic groups: US Non-Hispanics (N = 533, a majority of them white); US born Hispanics (N = 204, a majority of Mexican origin); Mexican immigrants (N = 593) and Central American immigrants (N = 125, most of them from El Salvador and Guatemala). One of the 1456 subjects had incomplete data and was dropped from all further analyses reported in this article. The average number of years of completed schooling was 9.9 (S.D. = 4.1). Not surprisingly, immigrants had less schooling (7.3 years) than non-immigrants (12.6).

First order HICLAS analyses
A partitioning of the 41 CIDI somatic symptoms into eight clusters (as described below) was used as the initial configuration for the HICLAS analyses of the 1455 × 41 matrix. This a priori grouping of symptoms is necessary for HICLAS analysis. That is, HICLAS requires an initial configuration (clustering) of either the rows or the columns of a two-way matrix in order to minimize the possibility of local minima in the final solution (De Boeck and Rosenberg, 1988). As the focus here is on patient clusters and their properties, we thought it better to make a priori assumptions about symptom clusters than about patient clusters.
We grouped the 41 symptoms into eight speci- [...] The resulting HICLAS solution fits the original data matrix fairly well (goodness of fit = 0.73). HICLAS also calculates a Jaccard measure of fit for each individual symptom, which is interpreted in a way that is roughly comparable to interpreting a factor loading. For example, the goodness of fit of 'blurred vision' to its associated cluster is 0.385. The higher the fit, the more representative the symptom is for the particular class or cluster in which it is placed. Some organ system clusters (headache, genito-urinary, cardiorespiratory, female-reproductive, musculoskeletal) generated particularly high fits (> 0.50). However, pseudoneurological symptoms had lower fits and the 'lone' skin symptom from the CIDI did not fit the HICLAS model at all. This may be a consequence of the very low overall prevalence of these symptoms, their heterogeneity, and in the case of the skin symptom, rare co-occurrence with other symptoms in the data set.
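For readers unfamiliar with the Jaccard index, the following Python sketch shows how such a fit could be computed for a single symptom. It assumes the fit compares the symptom's observed 0/1 column with the 0/1 column implied by the fitted model; the exact computation inside the HICLAS program may differ, and the example vectors are hypothetical.

```python
def jaccard_fit(observed, predicted):
    """Jaccard index between two equal-length 0/1 sequences.

    Counts patients where both the data and the model show the symptom,
    divided by patients where at least one of them shows it.
    """
    both = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    either = sum(1 for o, p in zip(observed, predicted) if o == 1 or p == 1)
    # Convention chosen for this sketch: two all-zero columns agree perfectly.
    return both / either if either else 1.0


# Hypothetical columns for four patients.
observed_column = [1, 1, 0, 0]
model_column = [1, 0, 1, 0]
print(jaccard_fit(observed_column, model_column))
```

Under this reading, a symptom that the model never predicts for any patient who reports it has a fit of zero, which is one way a symptom can fail to fit the model at all.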

Second order HICLAS analyses
These analyses allocated each of the 1455 patients to one of 11 major clusters, effectively reducing the large number of clusters obtained in the first-order analyses and making the clustering results more manageable.
The 11 patient clusters identified by HICLAS are labeled 'A-K' in Fig. 1. Starting at the bottom of the figure, the reader will notice six clusters (boxes F, G, H, K, I, J) corresponding to patient clusters. As is characteristic of HICLAS, patient clusters are always defined with respect to symptom clusters. For example, patient Cluster F consists exclusively of patients who have only cardiorespiratory (CR) symptoms, Cluster G of patients who have only genito-urinary (GU) symptoms, and so forth. The notation 'n = 52' within Cluster F means that 52 patients exhibited this class of symptoms. Note that Cluster 'K' is [...] (Fig. 1). The way Fig. 1 is configured also serves to illustrate that pseudoneurological symptoms tend not to occur in patients by themselves as distinct clusters, but instead always co-occur with other symptom clusters.

Implications of the HICLAS model of unexplained somatic symptoms: the case of pseudoneurological symptoms
Two logical propositions (P1 and P2) can be constructed based on the HICLAS configuration in Fig. 1. The first proposition, dubbed 'P1', is the following: when there are several pseudoneurological symptoms present, it is likely that a patient will meet Escobar et al.'s (1989) criteria for abridged somatization (have four or more unexplained physical symptoms if male; six or more if female). This proposition P1 is based on the fact that in Fig. 1 a patient with several pseudoneurological symptoms is likely to be found in Cluster A, and patients in Cluster A also have all symptom clusters that are beneath Cluster A in the hierarchy. The second proposition based on the figure, dubbed P2, asserts that the relation between pseudoneurological symptoms and abridged criteria is asymmetric: given that a patient meets abridged criteria, the likelihood that he or she will also have several pseudoneurological symptoms is smaller than the likelihood that a patient will meet abridged criteria given the presence of several pseudoneurological symptoms.
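The abridged criteria as stated above reduce to a sex-specific threshold on the count of unexplained symptoms. A minimal Python sketch follows; the function name and the string coding of sex are illustrative choices, not part of the original instrument.

```python
def meets_abridged_criteria(n_unexplained: int, sex: str) -> bool:
    """Abridged somatization as described in the text.

    Four or more unexplained physical symptoms if male,
    six or more if female.
    """
    thresholds = {"male": 4, "female": 6}  # hypothetical coding of sex
    return n_unexplained >= thresholds[sex]


print(meets_abridged_criteria(5, "male"))
print(meets_abridged_criteria(5, "female"))
```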
In order to test whether propositions P1 and P2 were valid, and not some artifact of HICLAS analysis, we assessed the likelihoods associated with P1 and P2 using an alternative set of statistical analyses. These included cross-tabulation analysis coupled with the Somers' D statistic. The latter statistic explicitly represents asymmetric predictive relationships when such are present in actual data. The cross-tabulation analysis that we used involved two binary variables: (a) a variable indicating whether or not a given patient met abridged criteria for somatization; and (b) a variable indicating whether or not a patient had three or more pseudoneurological symptoms. A total of 111 patients (7.6% of the sample) had three or more pseudoneurological symptoms; we chose three symptoms as the cutoff because patients in Cluster A averaged 2.4 such symptoms.
The results of the cross-tabulation described above were as follows. The value of Somers' D was 0.74 (P < 0.001) in that instance when meeting abridged criteria was 'predicted' by the presence of three or more pseudoneurological symptoms. The value of D was only 0.30 the other way around (i.e. 'predicting' three or more pseudoneurological symptoms when abridged criteria were met). The magnitude of the first Somers' D (0.74), as well as the asymmetry of the two D statistics when considered together (0.74 vs. 0.30), validates propositions P1 and P2. Hence, the results confirm the utility of pseudoneurological symptoms in flagging cases of somatoform disorder.
It is possible that eliciting pseudoneurological symptoms will prove to be a parsimonious way to screen for DSM-IV Somatization Disorder (SD), a diagnosis for which only eight of the 1455 patients in the present study met the criteria. The presence of three or more pseudoneurological symptoms identifies half (n = 4) of these SD patients, with a false positive rate of 7.4% and a false negative rate of 0.3%. An index based on four or more pseudoneurological symptoms performs even better as a screen, flagging the same four SD patients, but yielding lower rates of false positives (3.6%) and an identical rate of false negatives (0.3%). This four-symptom pseudoneurological index also predicts abridged criteria for somatization even more strongly than the three-symptom index (0.81 vs. 0.74).
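To make the asymmetry concrete, the sketch below computes both directions of Somers' D for a 2 × 2 table using the standard definition (concordant minus discordant pairs, divided by the pairs not tied on the predictor). The cell counts are hypothetical, since the study's full cross-tabulation is not reported here, so the output is not expected to match the published values.

```python
def somers_d_2x2(a, b, c, d):
    """Somers' D in both directions for a 2x2 table.

    Rows are the predictor X (present, absent); columns are the outcome Y
    (present, absent):
        a = X present, Y present    b = X present, Y absent
        c = X absent,  Y present    d = X absent,  Y absent
    Returns (D with Y predicted from X, D with X predicted from Y).
    """
    concordant_minus_discordant = a * d - b * c
    pairs_untied_on_x = (a + b) * (c + d)
    pairs_untied_on_y = (a + c) * (b + d)
    return (
        concordant_minus_discordant / pairs_untied_on_x,
        concordant_minus_discordant / pairs_untied_on_y,
    )


# Hypothetical counts: X = several pseudoneurological symptoms,
# Y = meets abridged criteria.
d_y_given_x, d_x_given_y = somers_d_2x2(a=30, b=10, c=60, d=300)
print(round(d_y_given_x, 3), round(d_x_given_y, 3))
```

For a 2 × 2 table the first value equals the difference between the proportion meeting the outcome when the predictor is present and the proportion meeting it when the predictor is absent, which is why a rare but highly specific predictor can show a large D in one direction and a small D in the other.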
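The screening figures can be checked with simple arithmetic. The Python sketch below uses only counts stated in the text (1455 patients, eight SD cases, four of them flagged, 111 patients with three or more pseudoneurological symptoms, and the 53 non-SD patients with four or more such symptoms reported later in the text). Using the whole sample as the denominator is an assumption made here; it is not stated by the authors.

```python
N_TOTAL = 1455          # patients with complete data
N_SD = 8                # patients meeting DSM-IV SD criteria
SD_FLAGGED = 4          # SD patients flagged by either index
FLAGGED_3_PLUS = 111    # patients with three or more pseudoneurological symptoms
NON_SD_4_PLUS = 53      # non-SD patients with four or more such symptoms


def pct(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * count / denominator, 1)


false_pos_3 = FLAGGED_3_PLUS - SD_FLAGGED   # flagged but not SD
false_pos_4 = NON_SD_4_PLUS
missed_sd = N_SD - SD_FLAGGED               # SD but not flagged

# Assumption: rates expressed as a share of the whole sample.
rates = {
    "false_pos_3_plus": pct(false_pos_3, N_TOTAL),
    "false_pos_4_plus": pct(false_pos_4, N_TOTAL),
    "false_neg": pct(missed_sd, N_TOTAL),
    "sd_cases_detected": pct(SD_FLAGGED, N_SD),
}
print(rates)
```

Under that assumption the sketch yields 7.4%, 3.6% and 0.3%, with half of the SD cases detected; dividing false positives by non-SD patients only (1447) would give slightly different values.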

We also compared, using an additional set of variables, the eight patients who met criteria for SD with the 53 patients who did not meet criteria for SD but who did report having four or more pseudoneurological symptoms. We found no statistically significant or even nominal differences between the two types of patients in terms of number of depressed symptoms, total score on the SF-36 physical functioning scale, and number of anxiety symptoms (P values associated with all t-tests were ≥ 0.40). Interestingly, the percentage of patients presenting four or more pseudoneurological symptoms who met criteria for lifetime major depression was 58.4%, while the percentage of patients with SD who met depression criteria was 62.0%. Again, the between-group differences in lifetime depression were not statistically significant (Fisher's exact P = 0.83). In fact, over 82% of the pseudoneurological patients and 87% of the SD patients met criteria for at least one lifetime DSM-IV axis I diagnosis other than a somatoform disorder. Of the 487 patients (33% of the total sample) who were diagnosed with any DSM-IV axis I diagnosis, the 53 patients who met the pseudoneurological criteria were more disabled according to the SF-36 data than those patients who did not meet the criteria [t(483) = 3.37, P < 0.001].

Closer scrutiny of the patients in HICLAS Cluster A
One way of validating the HICLAS solution is to compare patients with multiple pseudoneurological and other somatic symptoms (i.e. the patients in Cluster A) with patients in the other HICLAS clusters, on a variety of axis I, symptom and demographic variables. Table 2 shows this comparison. As Table 2 indicates, Cluster A patients are quite distinct from the other primary care patients. That is, Cluster A patients are 2-3 times more likely to have a lifetime axis I diagnosis, particularly major depression. These patients also evince significantly greater physical disability, as well as a greater number of depressed and anxious symptoms. In terms of demographics, these patients are more likely to be female, born in the US, and somewhat older than the patients in other clusters.

Discussion
This study represents the first application of HICLAS for examining how primary care patients cluster with respect to medically unexplained somatic complaints. Eleven patient clusters were identified with distinct patterns of somatic symptoms. The HICLAS model that identified these clusters also fit the data fairly well. In addition, it was demonstrated in this study that an asymmetric relationship held between numerous pseudoneurological symptoms and somatization (abridged concept), such that the former predicted the latter but not vice-versa. We were able to detect this asymmetry because patients with several pseudoneurological symptoms were located by HICLAS in a superset class (i.e. Class A) at the upper-most level of the hierarchy, subsuming all clusters at lower levels. This important role of pseudoneurological symptoms was also confirmed by analyses showing that the presence of four or more such symptoms was strongly associated with somatization, whether defined by DSM-IV criteria or by abridged criteria.
With respect to the external validity of the HICLAS result, it was found that patients with pseudoneurological and multiple other somatic symptoms (Cluster A) differed from other patients in terms of psychiatric comorbidity, physical disability, and demographic factors. This clearly represents a more severe form of somatization, less likely to be seen in the case of immigrant patients compared to the US born. Naturally there are certain characteristics of the study (e.g. attrition, unique ethnic makeup of primary care patients sampled, and use of lay interviewers) that limit the generalizability of the clustering results to all populations.

The relationship of HICLAS to grade of membership (GOM) analysis
It is interesting to note that the current results using HICLAS replicate, at least in part, those of Swartz et al. (1986) using a different analytic technique (GOM) on a general population sample. For example, their 'pure' type V (see Table 1) very closely resembles our HICLAS Cluster 'A' (Fig. 1). Thus, both of these clusters are seen almost exclusively among females (Cluster A is 82% female), include high levels of unexplained symptoms coming from multiple organ systems and augur high levels of disability or health services use. Also, their 'pure' type I may be akin to HICLAS Cluster 'K' ('no pattern/few symptoms'), and their single organ symptom clusters (types VI and VII) seem very similar to HICLAS Clusters 'F' and 'H', which were made up almost exclusively of gastrointestinal and cardiorespiratory symptoms.
HICLAS is based on an underlying discrete model of category membership, as opposed to the GOM model, which is based on fuzzy set theory. The novelty of HICLAS lies in the fact that hierarchical (superset/subset) arrangements among classes are explicitly represented. For example, we have seen already that the location of pseudoneurological symptoms in a superordinate class ('A') by HICLAS has important implications for classifying somatoform syndromes. However, while GOM does locate pseudoneurological symptoms in two distinct classes, a male class (IV) and a female class (V), and does show their high levels of co-occurrence with other somatic symptoms, the representation of superset-subset relationships between pseudoneurological and other symptom clusters is beyond the purview of GOM analysis. Nonetheless, GOM analysis is quite useful in other respects, and the fact that there is considerable convergence between the symptom clusters (e.g. pseudoneurological; cardiorespiratory) identified by GOM in one large general population sample and those identified by HICLAS in another large, albeit clinical, sample bodes well for the development of a method-independent, general typology of medically unexplained somatic symptoms.

Revised somatization construct
The present data may contribute to the further refinement of diagnostic indices of somatization such as the abridged somatization construct (Escobar et al., 1989) by adding type and number of organ systems to the more generic high levels of unexplained symptoms. These new observations may give the construct more precision in discriminating among various psychopathologies, thus improving the detection of 'pure' and 'mixed' somatizing syndromes. In addition, the finding that four or more pseudoneurological symptoms were highly predictive of DSM-IV criteria for SD and abridged criteria for somatoform disorder is intriguing, and deserves replication in other large samples of primary care patients. While awaiting such future research, a tentative suggestion for primary care providers is to pay close attention to medically unexplained pseudoneurological symptoms. Upon observing the co-occurrence of several of these symptoms in a patient, the provider should consider a somatoform diagnosis as well as other co-morbid axis I diagnoses, such as major depression. He or she should also be apprised of the fact that the coupling of a psychiatric diagnosis with several pseudoneurological symptoms is associated with considerable functional impairment.