Abstracting height and weight from medical records, and breast cancer pathologic factors

Cancer registries routinely collect data on clinicopathologic factors, but rarely abstract anthropometric variables. We conducted a chart review study examining the feasibility of abstracting weight, height, alcohol use, and smoking from medical records in women (n = 1,974) diagnosed with invasive breast cancer, and investigated the association between the abstracted variables and clinicopathologic features. Qualitative data were reviewed and categorized. Frequencies of the abstracted data and of the demographic and clinicopathologic variables were calculated. Logistic regression models measured the relationship between the outcome variables (tumor size, stage of disease, and estrogen/progesterone receptor (ER/PR) status) and the abstracted variables. Data on current alcohol use/no use, current smoker/non-smoker status, and height/weight were obtained on 96%, 97%, and 88–89% of the participants, respectively. The multivariate analysis showed that overweight (≥25 kg/m2) women were significantly more likely to have a larger (≥2 cm) tumor than normal-weight women, both for women <50 years (OR = 1.79; 95% CI = 1.14–2.81; p ≤ 0.05) and for women ≥50 years at diagnosis (OR = 1.58; 95% CI = 1.19–2.09; p ≤ 0.05). These results suggest that abstracting current height and weight from medical records is feasible and that, at minimum, current alcohol use and smoking status can be ascertained. In addition, being overweight was significantly associated with cancer clinicopathologic/prognostic factors, which has implications for monitoring etiologic factors that could be associated with cancer trends, incidence, and survival. Therefore, routine collection of height and weight via cancer registries should be explored further.


Introduction
Cancer registries regularly abstract data items pertaining to breast cancer clinicopathologic factors, including stage of disease, age at diagnosis, tumor size, estrogen receptor status, and several other factors. However, lifestyle and anthropometric factors including height, weight, smoking, and alcohol use, which have been shown to be important etiologic and prognostic factors for breast cancer, are not routinely collected by cancer registries. The etiology of breast cancer and clinicopathologic factors, such as age at diagnosis, tumor size, and estrogen/progesterone receptor (ER/PR) status, have been associated with height, weight, smoking, and alcohol use in some studies, but not in others [1][2][3].
Several studies have investigated the association between height, weight, and/or obesity and breast cancer risk [4][5][6][7][8]. Obesity has been shown to increase risk in post-menopausal women (RR: 1.59, 95% CI: 1.09-2.32), while obesity in pre-menopausal women reduces risk (RR: 0.61, 95% CI: 0.42-0.89) [5,7,9]. Some studies also suggest that tumor size is related to obesity; specifically, obese patients present with larger tumor size and/or diameter [1][2][3]. Maehle and colleagues found a significant association between larger tumor size (>2.0 cm) and both body weight and body mass index (BMI) (2001). The relationship between obesity and tumor size was more pronounced in ER-/PR- tumors. However, other investigations have not found an association between body weight and clinicopathologic factors, and therefore further studies need to be conducted [5,10].
Other lifestyle factors, including smoking and alcohol use, have been associated with breast cancer clinicopathologic factors; however, results have been equivocal, particularly for smoking. Some studies have suggested a weak association between smoking and poor prognosis after breast cancer diagnosis [11][12][13]. Manjer reported an increased risk of being diagnosed with ER- tumors for current smokers [RR: 2.21 (95% CI: 1.23-3.96)] and for ex-smokers [RR: 2.67 (95% CI: 1.41-5.06)] compared with never smokers (2001). A recent study from the California Teachers cohort (n = 116,544) indicated that current smokers have a 32% increased risk for breast cancer compared with never smokers. This contrasts with previous studies that showed no association between smoking and breast cancer, either because passive smoking was not accounted for and/or because the relationship occurred only in certain subgroups (younger women and/or women with a family history) [14][15][16][17][18]. Alcohol use has also been associated with ER+/- status, with a recent study showing that, among post-menopausal women, alcohol users were significantly more likely (RR: 1.35, 95% CI 1.02-1.80) to be diagnosed with ER+ tumors, regardless of PR status, compared with non-drinkers [19]. But other studies have failed to show a relationship between alcohol use and estrogen receptor status [20][21][22][23].
Cancer prognosis has been divergent for non-Hispanic (NH) White women compared with other ethnicities [24][25][26]. Overall prognosis for breast cancer during the 1980s remained fairly steady, but after 1989 a dramatic decline in overall cancer mortality rates for non-Hispanic White women has been observed [26]. Furthermore, from 1980 to 1989 and more recently, breast cancer mortality rates have declined 2.5% each year for White women [25,27]. However, mortality for African American women steadily increased from 1973 to 1991, and has only recently slowed [24]. It is possible that access to care, genetic differences, and/or lifestyle differences contribute to the observed differences in survival among various ethnic groups, but only one other study has investigated whether lifestyle differences, particularly height, weight, smoking, and alcohol use, could contribute to observed clinicopathologic differences (which may influence prognosis) between race/ethnic groups [28].
The present study assessed the feasibility and importance of routinely abstracting height and weight data by cancer registries. The purpose of the present study was to examine the feasibility of abstracting variables not routinely collected by cancer registries (height, weight, smoking, and alcohol use) via medical chart review in women diagnosed with breast cancer. In addition, we investigated whether these factors are associated with clinicopathologic features of breast cancer, including tumor size, stage, tumor grade, and ER/PR status. A secondary aim was to further investigate the association between clinicopathologic characteristics and the abstracted items (height, weight, smoking, and alcohol use) by race/ethnicity.

Study population and abstraction
We conducted chart reviews in 2003, and abstracted height, weight, smoking, and alcohol use data at time of diagnosis from existing medical records in a population-based cancer registry, the Cancer Surveillance Program of Orange County (CSPOC) [29][30][31][32][33]. Breast cancer cases diagnosed during a 1.5-year period between 1 July 1996 and 31 December 1997 were included in the present study. During this 1.5-year period, 2,275 cases were diagnosed with invasive breast cancer in Orange County. Of the 2,275 breast cancer cases, we were able to retrospectively abstract medical records on 1,974 patients. Abstraction was not completed on the remaining 301 patients due to missing charts, closed hospital and/or record facilities, or lack of on-site hospital resources (the hospital did not have either sufficient staff or space to accommodate our medical record abstractor).
Abstraction of height, weight, smoking, and alcohol use was conducted by an experienced medical records abstractor. The abstractor had previous experience in abstracting data from existing medical record charts and held IRB certification, which includes training in confidentiality of data, and HIPAA certification. Height, weight, smoking, and alcohol use data, together with the date of data ascertainment from the chart, were collected on a standardized form. Height in inches and weight in pounds were recorded. Smoking information, including current smoking status, previous smoking history, cigarettes per day and years, and current and previous alcohol use, was also abstracted. Also, for a sub-sample of patients whose data were not available in the medical charts, we were able to obtain smoking and alcohol use status (collected for previous studies) from the registry database. In addition to abstracting quantitative data for smoking and alcohol use, we also recorded qualitative data on smoking and alcohol use on a sub-sample of 25 patients (selected sequentially) in order to more clearly characterize how the quantity and frequency of use are recorded in medical charts. We categorized the qualitative data for smoking as current smokers and non-smokers, and for alcohol use as current drinkers and non-drinkers. The number of charts abstracted per day was also recorded. The study protocol was approved by the Institutional Review Board (IRB) of the University of California, Irvine (HSR#2003-3283), and by the California State University, Fullerton (HSR#07-0103).
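Because height was recorded in inches and weight in pounds, BMI in kg/m2 has to be derived by unit conversion. The following is a minimal sketch of that derivation, not the authors' code: the function names are hypothetical, the 18.5 and 25.0 kg/m2 cut points are the ones used in this paper, and the 30.0 kg/m2 obesity cut point is an assumption (the conventional WHO threshold), since the paper does not state the one it used.

```python
LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)
IN_TO_M = 0.0254       # metres per inch (exact by definition)


def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Return BMI in kg/m^2 from weight in pounds and height in inches."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Classify BMI; values between 24.9 and 25.0 are treated as normal."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:  # assumed obesity threshold, not stated in the paper
        return "overweight"
    return "obese"


if __name__ == "__main__":
    # Hypothetical chart entry: 150 lb, 65 in.
    bmi = bmi_from_imperial(150, 65)
    print(f"{bmi:.1f} kg/m^2 -> {bmi_category(bmi)}")
```

For the regression models, the overweight and obese categories would be collapsed into a single 25.0+ kg/m2 group.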

Other study measures
Race/ethnicity and socioeconomic status (SES) data were ascertained via the cancer registry. SES score was based on various factors collected by the registry, with a larger score indicating higher SES [34]. Clinicopathologic features including stage, tumor size, tumor grade, ER/PR status, previous cancer history, and age at cancer diagnosis were also obtained through the cancer registry database. Stage of disease at diagnosis was the summary stage defined by the Surveillance, Epidemiology and End Results (SEER) program of the National Cancer Institute as follows: localized disease was defined as invasive carcinoma confined to the breast; regional stage was defined as invasive carcinoma spread beyond the breast, by direct extension and/or to regional lymph nodes; and distant disease was defined as direct extension beyond adjacent organs specified as regional, metastasis to distant lymph nodes, or development of discontinuous secondary or metastatic tumors. In terms of TNM classification, localized disease includes tumors T1-T3, N0, M0. Regional disease includes tumors T4, N0, M0 or any T, N1-N3, M0, and distant disease corresponds to any T, any N, M1.
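The TNM correspondence described above can be written as a small lookup. This is an illustrative sketch only: the function name is hypothetical, and it handles just the integer categories named in the text (T1-T4, N0-N3, M0-M1), ignoring sub-stages and unknown (TX/NX) values that a registry would also have to handle.

```python
def summary_stage(t: int, n: int, m: int) -> str:
    """Map TNM categories to SEER summary stage as described in the text.

    localized: T1-T3, N0, M0
    regional:  T4, N0, M0, or any T with N1-N3, M0
    distant:   any T, any N, M1
    """
    if not (1 <= t <= 4 and 0 <= n <= 3 and m in (0, 1)):
        raise ValueError("expected T1-T4, N0-N3, M0-M1")
    if m == 1:
        return "distant"
    if n >= 1 or t == 4:
        return "regional"
    return "localized"


if __name__ == "__main__":
    for tnm in [(2, 0, 0), (4, 0, 0), (1, 2, 0), (3, 3, 1)]:
        print(tnm, summary_stage(*tnm))
```

Checking M first, then N and T4, mirrors the hierarchy in the definition: any distant metastasis overrides nodal status, and nodal involvement or T4 extension overrides tumor size.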
We conducted logistic regression analysis to assess the relationship between the abstracted variables (BMI, smoking, and alcohol use) and prognostic variables, including tumor size, stage at diagnosis, and ER status. We excluded women diagnosed with more than one cancer (n = 450) and women who were underweight (<18.5 kg/m2; n = 46), as this may indicate other existing co-morbid conditions. Three separate models were fitted for the outcome variables tumor size (≥2 cm vs. <2 cm), stage at diagnosis (regional/distant vs. localized), and ER status (ER- vs. ER+). Independent variables included in these models were: BMI (18.5-24.9 vs. 25.0+ kg/m2), smoking (non-smoker vs. smoker), alcohol use (use vs. no-use), age at diagnosis (continuous), race/ethnicity (NH White vs. other), and SES (continuous). All three models were stratified by age at diagnosis (<50 years and ≥50 years) in order to adjust for potential menopausal influences on these clinicopathologic factors. In addition, we conducted a chi-square analysis to assess the relationship of the abstracted variables and prognostic variables with race/ethnicity (NH White vs. other).
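Logistic regression models such as the three described above produce a coefficient and standard error for each independent variable, and an odds ratio (OR) with its 95% confidence interval follows by exponentiation. The sketch below assumes Wald-type intervals, which is an assumption on our part since the paper does not say how its intervals were computed; the function names are hypothetical. As a consistency check, it recovers the coefficient and standard error implied by the estimate reported in the abstract for overweight women <50 years (OR = 1.79; 95% CI = 1.14–2.81) and converts them back.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) from a logistic coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def coef_from_or_ci(odds_ratio: float, lower: float, upper: float,
                    z: float = Z_95):
    """Back out (beta, SE) from a reported OR and Wald-type 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


if __name__ == "__main__":
    beta, se = coef_from_or_ci(1.79, 1.14, 2.81)
    print(f"beta = {beta:.3f}, SE = {se:.3f}")
    print("OR, CI = %.2f (%.2f-%.2f)" % odds_ratio_ci(beta, se))
```

The reported interval is close to symmetric around the OR on the log scale, which is what a Wald interval would look like, so the round trip reproduces the published bounds to two decimals.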

Results
Demographic, height and weight, and alcohol and smoking data are shown in Table 1. Twenty-six percent were diagnosed with breast cancer at age <50 years, while a majority of the women were diagnosed at age ≥50 years (74%). A majority of the women were NH White (85.9%), followed by Asian/PI (6.2%) and Hispanic (6.2%). Of the women, 2.3% were underweight, 42.6% were normal weight, 27.3% were overweight, and 15.9% were obese. A majority of the women were non-smokers (84.7%), while nearly half of the women were current alcohol users (46.4%).
Clinicopathologic features showed that 64.5% were diagnosed with localized stage, 29.8% had regional stage, and 3.7% had distant metastasis (Table 2). Tumor grade was distributed across grade I (20.9%), grade II (34.4%), and grade III (27.6%), while 2.1% had grade IV disease. Approximately 54% were diagnosed with a tumor size of <2 cm, and 40.3% were diagnosed with a tumor size of ≥2 cm. A majority had ER+ (56.5%) and PR+ (44.7%) tumors, and most had no history of previous cancers (76.7%).
Abstraction of height, weight, smoking, and alcohol use data from medical charts was conducted at a rate of six charts per hour. Height and weight abstraction was feasible on nearly 89% of the patients, and smoking and alcohol use were abstracted from medical records on 80.5% and 77.9%, respectively (Table 3). Height and weight data were found in multiple areas of the charts, usually in the anesthesiology sheet if the patient had surgery. Also, height and weight data were recorded in admission summary sheets, and MD and chemotherapy notes (data not shown). With the inclusion of already existing data collected by the registry, smoking and alcohol use reached 97% and 96% completion (Table 3). The qualitative data on smoking and alcohol use show that there is wide variation in recording smoking and alcohol use history, with no standard method of recording years of smoking, number of cigarettes, and/or how often and how much a patient drinks alcohol in medical charts. Chart review indicated that qualitatively data for smoking history varied and included responses
in women who were overweight compared with women who were normal weight (18.5-24.9 kg/m2). Also, alcohol drinkers were significantly (p < 0.05) less likely (OR: 0.58, 95% CI: 0.39-0.88) to be ER- compared to non-drinkers, for women ≥50 years at diagnosis. No relationship was observed between PR+/- status and alcohol use (data not shown).
Table 5 shows the clinicopathologic and lifestyle variable proportions by race/ethnicity. When assessing the clinicopathologic features stratified by race/ethnicity, the data suggest that a significantly higher proportion of the other ethnic group (all women other than NH White) were diagnosed at a younger age (p < 0.0001), were diagnosed with regional/metastatic disease (p = 0.07), have larger tumor size (p = 0.005), have higher tumor grade (p = 0.005), and have a higher proportion of ER- and PR- tumor status (p < 0.0001), compared with NH White women. When examining the abstracted lifestyle/anthropometric variables, a lower proportion of the other ethnic group smoked (p = 0.09), drank less (p < 0.001), were shorter (p < 0.001) and were more overweight, but not significantly (p = 0.12), compared with the NH White group.

Discussion
In the present study, abstraction of height and weight data was feasible on nearly 90% of the sample, and we obtained current smoking and alcohol use data at time of diagnosis on approximately 97% of the sample from medical charts and registry data. In terms of BMI, approximately half were normal weight and the other half were either overweight or obese. A majority of the women were non-smokers and nearly half (46%) drank alcohol. Younger women (<50 years at diagnosis) who were overweight were 80% more likely to be diagnosed with larger tumors compared with women who were normal weight. Similarly, women diagnosed at age ≥50 years who were overweight were 58% more likely to be diagnosed with larger tumors. In the same age group, women who drank were 42% less likely to be diagnosed with ER- tumors (vs. ER+ tumors) compared with non-drinkers. In addition, we found that NH White women have overall better clinicopathologic markers/status at diagnosis compared with other ethnic groups, but the other ethnic groups were more likely to be overweight and less likely to drink or smoke.
Cancer registries routinely collect clinicopathologic features, including stage of disease, age at diagnosis, tumor size, and tumor grade; however, few studies have assessed the feasibility of abstracting modifiable risk factors, including weight, smoking, and alcohol use, from medical records. One earlier study that assessed obesity and breast cancer recurrence and survival in African-American women reported that height, weight, ER/PR status, tumor size, and other prognostic factors were not routinely found in medical records [35]. In contrast, for the present study, abstraction of height and weight data reached nearly 90%, and with previous data from the registry, smoking and alcohol use data were abstracted on nearly 100% of the charts. It is highly possible that we were unable to abstract height and weight data on nearly all patients because of the six-year gap between diagnosis and abstraction (we reviewed charts in 2003 for women diagnosed between 1996 and 1997), during which charts went missing at storage facilities and hospitals closed. Our results also indicate that we could collect only whether patients were current smokers and/or alcohol users, because more specific details on number of cigarettes, pack-years, or type/frequency of alcohol used were not consistently recorded in medical charts.
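The percentages quoted in this discussion follow from the reported odds ratios as (OR − 1) × 100. The short sketch below makes that arithmetic explicit; the function name is hypothetical, and note that the result is a percent change in odds, which the text reads as "more likely" or "less likely".

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio (negative = lower)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    reported = {
        "overweight, <50 years, tumor >=2 cm": 1.79,
        "overweight, >=50 years, tumor >=2 cm": 1.58,
        "alcohol use, >=50 years, ER-": 0.58,
    }
    for label, odds_ratio in reported.items():
        print(f"{label}: {percent_change_in_odds(odds_ratio):+.0f}%")
```

An OR of 1.79 corresponds to 79% higher odds, which the discussion rounds to 80%; ORs of 1.58 and 0.58 give the 58% increase and 42% decrease stated above.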
Several studies have shown that being overweight and/or obese increases the risk for breast cancer in post-menopausal women by 20% [36][37][38]. Our results showed that being overweight increased the risk of being diagnosed with a larger tumor size in both younger (<50 years) and older women (≥50 years). Similar to our results, a population-based study of 1,177 women younger than 45 years (pre-menopausal) showed that women in the highest BMI quartile had a significantly larger tumor size (2 to <5 cm: OR, 2.3; 95% CI, 1.5-3.1; or ≥5 cm: OR, 2.7; 95% CI, 1.5-4.8) compared with women in the lowest BMI quartile [39]. Other studies have shown a relationship between higher BMI and larger tumor size in post-menopausal women alone [3,38]. The Iowa Women's Health Study revealed that post-menopausal women in the highest BMI tertile were more likely to be diagnosed with a larger tumor (≥2 cm) compared with women in the lowest tertile [40]. Maehle and colleagues followed up women enrolled in the Norwegian Cancer Registry and showed that patients who were overweight (26.2 kg/m2) had a significantly larger tumor diameter (≥2 cm) compared with women with a lower BMI (25.2 kg/m2) [3]. The relationship between tumor size and BMI was significantly present only in women >50 years and in ER negative and PR negative tumors. It is possible that the relationship between tumor size and weight could be mediated by hormonal fluctuations in women who are overweight, and studies have suggested this [3,33]; however, similar to our results, another study showed no relationship between obesity and/or BMI and ER/PR status [11].
In our study, alcohol use, a modifiable risk factor, was associated with a reduced risk of ER negative tumors when compared with ER positive tumors in post-menopausal women, indicating a relationship between alcohol use and ER positive status. A population-based study in 1,188 breast cancer patients showed that post-menopausal women in the highest alcohol intake stratum had a 35% increased risk (RR: 1.35, 95% CI: 1.02-1.80) of being diagnosed with ER positive tumors [19]. A recent cohort study with a follow-up of 10 years in 38,454 women suggested that women drinkers were at increased risk (RR: 1.11, 95% CI: 1.03-1.20) of breast cancer for ER positive/PR positive tumors only, while no relationship was found with alcohol use in women diagnosed with ER negative tumors [41]. Both hormone-dependent and non-hormone-dependent mechanisms have been suggested in relation to alcohol use and breast cancer risk [42][43][44]. For the present study, it is possible that we found a relationship between alcohol use and ER positive status because alcohol increases estrogen production and also increases the expression of ER positive tumors [43,44].
Our results showed that NH White women compared with other ethnic groups (includes Hispanic, Asian, NH Black, and other) had better clinicopathologic indicators, including lower stage, tumor size, and tumor grade at diagnosis, and were more likely to be ER positive, PR positive, and ER/PR positive. In contrast to the NH White women, the other ethnic group had better lifestyle behaviors, including less current smoking and drinking. However, the other ethnic group had higher BMI compared with NH White women (not statistically significant). Several other studies have suggested differences in breast cancer incidence and outcomes among various ethnic groups [28,45,46]. However, few studies have compared modifiable risk factors and clinicopathologic factors among ethnic groups. The Women's Health Initiative followed 156,570 women and compared breast cancer characteristics in 3,938 women among six ethnic groups and reported that compared to White women, African American and Hispanic women were significantly less likely (p < 0.001) to be diagnosed with ER positive and PR positive tumors, and had significantly higher frequency of poorly differentiated tumors [28]. However, no significant differences in tumor size and stage were reported among the five ethnic groups. Similar to our results, the same study reported that compared to Whites, the other ethnic groups, except for Asian/Pacific Islander, had significantly higher BMI, and women of every minority group were significantly (p < 0.001) less likely to drink alcohol compared to White women. It is widely recognized that early screening through mammography may reduce differences in stage and other clinicopathologic factors among the various ethnic groups [47,48]. But these studies did not adjust for BMI and/or other modifiable factors when assessing differences in clinicopathologic factors among the ethnic groups.
Limitations of the present study should be recognized. First, abstraction of alcohol use and smoking data via medical charts revealed that, at most, only never and/or current use data for these modifiable factors were available. Number of cigarettes smoked, pack-years, alcohol drinks per day, and type of alcohol were not quantifiable via medical charts. Therefore, associations of alcohol use and smoking status with clinicopathologic factors were limited to current use versus no use. Second, even though we abstracted variables at time of diagnosis, there was a six-year gap between when the participants were diagnosed (in 1996-1997) and when we conducted the chart review (in 2003); some hospitals had closed and/or charts were missing, and therefore height and weight data were obtained on only 89% of patients. However, based on abstraction of alcohol and smoking data (which included collection via the registry for previous studies), if height and weight data were abstracted along with the standard clinicopathologic variables, then most likely nearly 100% of height and weight data could be ascertained. Last, we did not consistently record the location of the abstracted variables in the medical charts, which could affect the feasibility of collecting these data. However, obtaining the precise location of all the abstracted variables in the charts was beyond the scope of the present study and should be addressed in future investigations.
There are several strengths of the present study. The data for the present study were obtained via a population-based cancer registry and include an ethnically diverse population. Additionally, weight was measured and then recorded in the anesthesiology records/medical charts, as were the clinicopathologic variables, which provides objective data. Compared with self-reported weight, which might be subject to socially desirable responses, objective measures reduce the chances of spurious findings. Importantly, this is one of the first studies to assess the feasibility of abstracting height and weight data via chart review.
The continuing upward trend in obesity in the United States, and the relationship between being overweight and/or obese and increased cancer incidence, should emphasize the importance of collecting measurements of height and weight via cancer registries, making this a public health priority. The results of the present study indicate that abstracting height and weight data from medical charts is feasible and can be conducted in a relatively short period of time. However, smoking and alcohol use are not consistently recorded and may be difficult to report systematically. Also, these data suggest that BMI (and the other lifestyle variables) are associated with clinicopathologic factors at breast cancer diagnosis, and that these modifiable risk factors do vary by race/ethnicity. Because registries routinely collect important clinicopathologic factors and follow-up data associated with cancer diagnosis, adding routine abstraction of height and weight data may increase our understanding of changes in cancer incidence trends, disparities among ethnic groups, and etiologic factors associated with cancer. Therefore, cancer registries should, at least preliminarily, begin to explore collection of height and weight data, and further investigate implementation of routine abstraction of these two data items, which have significant public health impact and association with cancer risk.

Table 1
Distribution of demographic, BMI, alcohol use, and smoking data in women with invasive breast cancer (n = 1,974)

Table 2
Distribution of clinicopathologic variables in women diagnosed with invasive breast cancer (n = 1,974)

day, 30 pack-years, 2 packs/month × 20, and smokes 1.5 ppd. Alcohol use was recorded in the charts as drinks alcohol, drinks socially, and drinks rarely. Therefore, when collecting smoking and alcohol use from medical records, smoking and alcohol use can only be categorized into current smoker/non-smoker and current use/no alcohol use. Multivariate logistic regression (Table 4), adjusted for age at diagnosis, race/ethnicity, and SES, showed that being overweight was significantly (OR: 1.79, 95%

Table 3
Feasibility of data collection, and completeness of quantitative and qualitative data (n = 1,974)

CI: 1.14-2.81, p ≤ 0.05) associated with larger (≥2 cm) tumor size in women <50 years at age of diagnosis. Similarly, in women ≥50 years at diagnosis, the odds ratio for a larger tumor size was 1.58 (95% CI: 1.19-2.09, p ≤ 0.05)

Table 4
Logistic regression models a,b of clinicopathologic outcomes associated with BMI, smoking and alcohol use
a Adjusted for age at diagnosis, race/ethnicity, and socioeconomic status (SES)
b Excludes those with BMI < 18.5 kg/m2 and those with two or more cancers
c p < 0.05

Table 5
Distribution a of abstracted and clinicopathologic variables stratified by NH White and other race/ethnicity