Residential mobility in the California Teachers Study: implications for geographic differences in disease rates.

BACKGROUND
Especially for cancers with long latency periods, such as breast cancer, the issue of residential mobility hinders ecologic analyses seeking to examine the role of environmental contaminants in chronic disease etiology. This study describes and evaluates characteristics associated with residential mobility in a sub-sample of the California Teachers Study (CTS) cohort.


METHODS
In 2000, lifetime residential histories were collected for a sub-sample of 328 women enrolled in the CTS; women's degree of residential mobility and associated factors were analyzed.


RESULTS
While most women moved many times during their lives (average = 8.9), the average number of years at their residence when they enrolled in the study was reasonably long (15.1 years). Age strongly predicted duration at current residence but was not related to the number of lifetime residences. After adjusting for age, California-born women and women living in high socioeconomic status (SES) neighborhoods were significantly more residentially stable. Agreement between self-reported urbanization of recent residences and that based on census data of the geocoded residences was very good (85% concordant). Among women currently living in urban areas, an average of 43.3 years, or 77%, of their lifetimes were spent in urban residences; among women currently living in a rural area, an average of 37.3 years, or 67%, of their lifetimes were spent in rural residences.


CONCLUSIONS
This suggests that analyses of incidence rates based on current residence, while not capturing a woman's full exposure history, may reasonably reflect some aspect of longer term chronic exposures, especially those related to urbanization, at least in professional women.


Background
Ever since John Snow's legendary investigation of the 1850s London cholera epidemic, mapping disease incidence has been one of the fundamental tools of epidemiology. Today, with the advent of Geographic Information System (GIS) technologies, disease mapping has emerged as an especially powerful epidemiologic tool. Many cancers display marked geographic incidence patterns with strong urban/rural gradients (Melton, Brian, & Williams, 1980; Haenszel, Marcus, & Aimmerer, 1956; Nasca et al., 1980; Doll, 1991; Vassallo, De Stefani, Ronco, & Barrios, 1994; Blondell, 1988; Howe, Keller, & Lehnherr, 1993; Valerianova, Gill, Duffy, & Danon, 1994). By way of example, geographic location is among the strongest predictors of breast cancer incidence, with more than a ten-fold difference among regions of the world (Parkin, Whelan, Ferlay, Raymond, & Young, 1997). The observation that incidence rates tend to be higher in urbanized areas has fueled speculation that environmental contaminants may play a role in the etiology of breast and other cancers (Laden & Hunter, 1998; Wolff, Collman, Barrett, & Huff, 1996; John & Kelsy, 1993; Johnson-Thompson & Guthrie, 2000). Such speculation, however, is predicated on incidence patterns based on residence at diagnosis; it does not take into account previous residences. Thus, as modern-day epidemiologists struggle to understand the potential role of environmental exposures in chronic disease etiology, especially in cancers that tend to have long latency periods, they are forced to grapple with the issue of residential mobility.
As previous researchers have noted (Rogerson & Han, 2002; Polissar, 1980; Kliewer, 1992), the degree to which residential mobility impacts our ability to detect geographic differences in disease rates can depend on a number of factors, including the latency of the disease of interest, the scale of the analysis, the distance of migration and the degree to which migration is driven by health status. To avoid these problems, Polissar (1980) recommended studying relatively large regions and diseases with short latencies. Unfortunately, limiting studies to large regions precludes our ability to examine potential links to some environmental exposures that may be limited to fairly small areas. Furthermore, most cancers, and many other chronic diseases, tend to have fairly long latency periods.
Research conducted to date has a number of limitations. Studies that have relied on area measures of mobility based on US census data cannot take into account individual determinants of mobility (Kliewer, 1992). Furthermore, studies often define mobility over very large areas, such as US states (Kliewer, 1992), and/or over long periods of time, such as the place of birth compared to the place at diagnosis (Haenszel & Dawson, 1965; Kliewer, 1992).
This analysis describes and evaluates characteristics associated with residential mobility in a sub-sample of the California Teachers Study (CTS) cohort, a large prospective study of breast cancer. In this study, extensive lifetime residential histories were collected and residences for the most recent decade were geocoded to a very small geographic area (US census block groups). The objectives of our study were to: (1) describe the degree of lifetime residential mobility in study subjects; (2) identify factors associated with residential mobility; (3) conduct a validation of self-reported urban location against urban location based on geocoding of street addresses; and (4) describe the degree to which study subjects reside in exclusively urban or rural areas during their lifetime.

Study population
Subjects in our analysis were women selected to participate in a biomarker sub-study nested within the CTS cohort. The CTS was established from 1995 baseline questionnaires returned by 133,479 active and retired female enrollees in the State Teachers Retirement System. The creation and characteristics of the cohort are described in more detail elsewhere. The biomarker sub-study was designed to examine urban/rural differences in dietary and environmental exposures of potential interest to breast cancer etiology in the CTS cohort. As part of the sub-study, participants were asked to provide a detailed lifetime residential history. A convenience sample of 544 women (271 randomly selected from urban areas and 273 randomly selected from rural areas), who were under the age of 85, was identified from the CTS cohort for participation in the sub-study. Of the 544 women approached, 328 participated (60.3%). The participation rate was higher among rural residents (71%) than among urban residents (49%). Use of human subjects was reviewed by the Human Subjects Research Committee of the Northern California Cancer Center and the California Health and Human Services Agency, Committee for the Protection of Human Subjects and found to be in compliance with their ethical standards as well as with the US Code of Federal Regulations, Title 45, Part 46 on the Protection of Human Subjects.

Residential history data
Trained interviewers collected the sub-study participants' residential history information in person.
Women were asked to provide information on all residences of 6 months or more, from their time of birth up to their current residence. Interviewers collected specific address information (street, city, state, country) for all places women had lived in the most recent 10 years (i.e., since 1990). Because the interview was very long, women were asked to provide only the city in which they lived, not the actual street address, for all their residences prior to 1990. In addition to address information, the women also reported the year, or age at which, they moved to each address as well as the urban (large city, suburb) or rural (small town, rural) attribute of the area at the time they had lived there. The 328 women who supplied residential histories as part of this sub-study constitute the study sample of the present analysis.

Geocoding
We geocoded women's street addresses for the past 10 years, assigning each address both a latitude/longitude and a US Census block group. Prior to geocoding, we used ZP4 software (Semaphore Corporation, 2002) to standardize and validate the addresses. We used a GIS to automatically match addresses with a road network and assign a latitude/longitude, which we then used to determine the corresponding census block group. When possible, we manually located all addresses that could not automatically be matched. Geocoding was performed using ArcView GIS software (Environmental Systems Research Institute, Inc., 2000) and street databases from GDT (Geographic Data Technologies, 2002), US Census Bureau (TIGER2000) (US Census Bureau, 2002) and NavTech (Navigational Technologies, 2002). We did not attempt to geocode addresses that fell outside California (N=7) or older addresses for which only city, state and country information were gathered.

US census data
We used data from the US Census Bureau to derive neighborhood measures of urbanization and socioeconomic status (SES). We assigned degree of urbanization (in four categories) to all residences based on US Census 1990 block group data. The first category, "most urban," we defined as cities of ≥100,000 people within a Metropolitan Statistical Area (MSA) or Consolidated Metropolitan Statistical Area (CMSA) of ≥1 million people. Cities of <100,000 people in MSA/CMSAs of ≥1 million people fell into a second category, defined as "suburban." A third category included cities in MSA/CMSAs of <1 million people and was defined as "medium and small metropolitan areas." The final category included small towns (<50,000 people) and rural areas outside of census-designated urbanized areas.
We then compared the women's self-reported degree of urbanization to our GIS-derived census-based measures.
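The four-category urbanization scheme described above can be written as a small decision rule. The following is a minimal sketch rather than the procedure actually used in the study: it reads the thresholds as "at least" and "fewer than", and the function name, its arguments, and the use of a missing MSA/CMSA population as a stand-in for "outside a census-designated urbanized area" are assumptions made for illustration.

```python
from typing import Optional


def urbanization_category(city_pop: int, msa_pop: Optional[int]) -> str:
    """Assign one of the four census-based urbanization categories.

    city_pop: population of the city containing the residence.
    msa_pop:  population of the surrounding MSA/CMSA, or None when the
              residence lies outside any MSA/CMSA (assumed proxy for
              small towns and rural areas).
    """
    if msa_pop is None:
        return "small town/rural"
    if msa_pop >= 1_000_000:
        # Large metropolitan area: split on the size of the city itself.
        return "most urban" if city_pop >= 100_000 else "suburban"
    # Any city inside a metropolitan area of fewer than 1 million people.
    return "medium and small metropolitan"


# Hypothetical places, for illustration only.
examples = [(750_000, 6_000_000), (40_000, 6_000_000), (60_000, 400_000), (3_000, None)]
labels = [urbanization_category(city, msa) for city, msa in examples]
```

In practice the inputs would come from the 1990 census block group attached to each geocoded address; the sketch leaves that lookup out.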
We created a summary SES metric based on the women's 1990 census block group, incorporating block group measures of occupation, education and income (US Bureau of the Census, 1992). To do this, we first ranked all California block groups separately by education level (percentage of adults over age 25 having completed a college degree or higher), income (median family income) and occupation (percentage of adults employed in managerial or professional occupations) according to quartiles based on the statewide adult population. This resulted in a score of one through four for each of these SES attributes. We then summed the scores across each attribute and categorized them into four groups based on the quartiles of this score for the statewide population.
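The construction of the summary SES metric can be illustrated with a short sketch. It assumes that the statewide quartile boundaries have already been computed; every cutpoint value below is hypothetical, as are the function names and the handling of values that fall exactly on a boundary.

```python
from bisect import bisect_right


def quartile_score(value, cutpoints):
    """Return 1-4 given three ascending quartile boundaries."""
    return bisect_right(cutpoints, value) + 1


def ses_group(education, income, occupation, cuts, summed_cuts):
    """Sum the three attribute scores (range 3-12), then re-bin into four groups."""
    total = (
        quartile_score(education, cuts["education"])
        + quartile_score(income, cuts["income"])
        + quartile_score(occupation, cuts["occupation"])
    )
    return total, quartile_score(total, summed_cuts)


# Hypothetical statewide quartile boundaries (not taken from the study).
HYPOTHETICAL_CUTS = {
    "education": [10.0, 20.0, 35.0],      # % of adults with a college degree
    "income": [30_000, 45_000, 60_000],   # median family income
    "occupation": [15.0, 25.0, 40.0],     # % managerial/professional
}
HYPOTHETICAL_SUMMED_CUTS = [5, 8, 10]     # boundaries for the summed score

total, group = ses_group(15.0, 50_000, 30.0, HYPOTHETICAL_CUTS, HYPOTHETICAL_SUMMED_CUTS)
```

Because the quartiles in the study were based on the statewide adult population, real boundaries would be population-weighted rather than simple block-group quartiles.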

Analysis
We calculated the number of women's lifetime residences and the number of years at their current residence. These served as the primary measures of residential mobility in our analysis. The distributions of these measures were compared across categories of age, race, birthplace, neighborhood SES and urbanization. Generalized linear regression models were run to identify which of these factors were predictors of residential mobility, after adjusting for age. These models were run using the GENMOD procedure in SAS (SAS Institute, Inc., 1999) assuming a Poisson distribution, a standard assumption in analyzing discrete count data (McCullagh & Nelder, 1989). Because the data were over-dispersed, we multiplied the covariance matrix by a dispersion parameter, which was estimated based on the Pearson chi-square statistic. This does not change the parameter estimates but adjusts the standard errors for the extra dispersion in the data. We did this using the PSCALE option in SAS (SAS Institute, Inc., 1999). These regression models generated rate ratios that represent residential mobility among a subgroup of the respondents relative to a referent group, after taking age into account.
Table 1 shows the demographic characteristics of the 328 women included in this analysis compared to those of the full CTS cohort (N=133,479). The women in our analysis were, on average, slightly younger (mean=51.0 years of age) compared to the entire CTS cohort (mean=54.1 years of age), with comparatively fewer women in the oldest age category. The racial/ethnic composition of the two groups was roughly the same, with most women being non-Hispanic white (85% in biomarker study participants; 87% in the CTS cohort). A greater proportion of women in our analysis were born in California (52%) compared to the entire CTS cohort (44%). The degree of urbanization for participants' current residence was markedly different from that of the entire CTS cohort. This reflects the current study's sampling scheme, which over-sampled in rural areas in order to ensure adequate numbers for conducting urban/rural comparisons, and the greater participation rate in the present study among rural subjects. Recent residential mobility (i.e., since the inception of the CTS cohort) appeared to be similar between the two groups, as did neighborhood SES.
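The overdispersion adjustment described under Analysis can be illustrated for the simplest case, a Poisson model with a single two-level predictor. This is a sketch only: the study fitted age-adjusted models in SAS, whereas the code below is an unadjusted two-group comparison written in plain Python, and the function name and the data are hypothetical. In this special case the fitted means are the group means, the rate ratio is their ratio, and the model-based variance of the log rate ratio is the sum of the reciprocals of the two group totals; Pearson scaling multiplies that variance by the Pearson chi-square divided by its degrees of freedom.

```python
import math


def scaled_poisson_rate_ratio(ref, grp, z=1.96):
    """Rate ratio of grp vs. ref with a Pearson-scaled confidence interval."""
    mu_ref = sum(ref) / len(ref)
    mu_grp = sum(grp) / len(grp)
    rr = mu_grp / mu_ref

    # Pearson chi-square: sum of (observed - fitted)^2 / fitted.
    pearson = sum((y - mu_ref) ** 2 / mu_ref for y in ref)
    pearson += sum((y - mu_grp) ** 2 / mu_grp for y in grp)
    df = len(ref) + len(grp) - 2          # observations minus fitted parameters
    dispersion = pearson / df

    # Model-based SE of log(RR), then inflated by sqrt(dispersion).
    se_model = math.sqrt(1.0 / sum(ref) + 1.0 / sum(grp))
    se_scaled = se_model * math.sqrt(dispersion)

    ci = (rr * math.exp(-z * se_scaled), rr * math.exp(z * se_scaled))
    return rr, dispersion, ci


# Hypothetical years at current residence for two age groups.
younger = [2, 5, 8, 12, 7]
older = [10, 25, 30, 18, 37]
rr, dispersion, (ci_low, ci_high) = scaled_poisson_rate_ratio(younger, older)
```

The point estimate is unchanged by the scaling; only the width of the interval grows when the dispersion exceeds one.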

Results
The 328 study participants reported 2910 lifetime residences, of which 2078 (71.4%) fell within California. Ninety percent of these 328 women were able to recall the full street addresses for all their residences within the past 10 years, recounting 478 California addresses (as well as seven addresses outside of California). Of the 478 California addresses, we were able to geocode 424 (89%). Ninety-nine percent of participants were able to recall at least the city name for all their lifetime residences (data not shown).
Table 2 summarizes the degree of residential mobility among study participants. On average, women lived at their current address for approximately 15 years. Nearly 40% reported having lived at their current residence for 15 years or more. At the extremes, 5% had lived at the same address for 1 year or less and 4% had lived at their current address for 40 years or longer. None of the women reported living at the same address for their entire lives. On average, women reported living in approximately nine different residences (average=8.9) during the course of their lives, with 50% reporting between six and ten lifetime residences.
While our primary analyses here are focused on stability of residence at current address, using the geocoded data from the most recent 10 years, we expanded our analysis to evaluate the degree of mobility when measured at different scales (Fig. 1). While 62% of subjects lived at the same residence for at least 10 years, 82% remained in the same county. Further broadening our scale to three large regions of the state (San Francisco Bay Area, South Coast, and Remainder of California), recently defined in an analysis of regional variations in breast cancer rates among the CTS cohort (Reynolds et al., 2004a), we found that 90% have lived in the same region for 10 years or more. Finally, 98% of study subjects reported living in California for at least 10 years.
Table 3 summarizes our results aimed at identifying determinants of residential mobility. As expected, the strongest predictor of duration at current residence was age. The average number of years at current address was 6.8 for younger women (those under 44 years of age) compared to 24.1 for women age 65 and older. This represents a rate ratio (RR)=3.67, meaning that compared to women under 44 years of age, women age 65 and older have lived at their current residence nearly four times (3.67) as long. The number of years at current address was highest among African American women (mean=23.3 years) and lowest among Hispanics (mean=11.5 years). After adjusting for age, however, no statistically significant differences in duration at current residence between races were detected. The average number of years at current address was highest among California-born women (15.5 years) and remained significantly higher after adjusting for age. Duration at current address increased with increasing neighborhood SES (with the top two quartiles averaging approximately 16 years versus an average of about 13 years in the lower SES quartile). The differences in duration by neighborhood SES persisted even after adjusting for age.
Duration of residence at current address was modestly longer for women living in urban/suburban neighborhoods (16 years) than for those living in small town/rural areas (14.4 years), although after taking age into account, this difference was not significant (RR=1.09, 95% CI: 0.95-1.26). While the number of lifetime residences did increase slightly with age (average=8.5 in the youngest age group compared to 9.2 in the oldest age group), these differences were not statistically significant (RR=1.08, 95% CI: 0.91-1.28). In general, the number of lifetime residences was lower among women of color than among non-Hispanic whites; after adjusting for age, however, only Hispanics had significantly fewer residences (RR=0.78, 95% CI: 0.62-0.97). Both California-born women (age-adjusted RR=0.88, 95% CI: 0.79-0.97) and foreign-born women (age-adjusted RR=0.89, 95% CI: 0.69-1.17) reported fewer lifetime residences than women born in Canada or other areas of the US.

ARTICLE IN PRESS
For the 424 geocoded recent residences (post-1990), we examined the concordance between women's self-reported urban/rural designation and our designation based on 1990 census data. For the purposes of this analysis, we collapsed our four urbanization categories into two (urban=urban and suburban; rural=small town and rural). Among the geocoded addresses, agreement between our 1990 census-based urban/rural designations and that reported by participants occurred 85% of the time (data not shown).
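The concordance figure is a simple percent agreement over address-level pairs. The sketch below shows one way such a figure could be computed, assuming the census-based designation has already been collapsed to urban or rural; the self-report grouping follows the interview categories described earlier (large city and suburb as urban, small town and rural as rural), while the function name and the example pairs are hypothetical.

```python
# Collapse of the four self-reported categories into two levels.
SELF_REPORT_TO_TWO_LEVEL = {
    "large city": "urban",
    "suburb": "urban",
    "small town": "rural",
    "rural": "rural",
}


def percent_agreement(pairs):
    """pairs: iterable of (self_reported_category, census_two_level_label)."""
    pairs = list(pairs)
    matches = sum(
        1 for self_rep, census in pairs
        if SELF_REPORT_TO_TWO_LEVEL[self_rep] == census
    )
    return 100.0 * matches / len(pairs)


# Hypothetical addresses, for illustration only.
example_pairs = [
    ("large city", "urban"),
    ("suburb", "urban"),
    ("small town", "urban"),   # a discordant pair
    ("rural", "rural"),
]
agreement = percent_agreement(example_pairs)
```

Percent agreement does not correct for agreement expected by chance, so it is best read alongside the marginal distribution of urban and rural addresses.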
Finally, we evaluated the degree to which the urban/rural attributes of women's residences varied throughout their lifetimes. Based on their self-reported assessment of all their lifetime residences, 7.3% of women reported having always lived in an urban area and 3.6% reported having always lived in a rural area. Of the women currently living in urban areas, 16.6% were born in a rural area; of those currently living in rural areas, 27.0% were born in an urban area. Among women currently living in urban areas (N=133), an average of 43.3 years, or 77%, of their lifetimes were spent in urban residences; among women currently living in a rural area (N=193), an average of 37.3 years, or 67%, of their lifetimes were spent in rural residences (data not shown).
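The lifetime shares reported above can be derived from a residential history by summing the time spent at each residence within each urban/rural class. The following sketch assumes a history recorded as the age at move-in plus a two-level urban/rural label for each residence; the record layout, function names, and example history are hypothetical.

```python
def years_by_setting(history, current_age):
    """Sum years lived in each setting.

    history: list of (age_moved_in, setting) tuples, setting in {"urban", "rural"}.
    current_age: age at interview, which closes the final residence.
    """
    totals = {"urban": 0.0, "rural": 0.0}
    ordered = sorted(history)
    # Each residence ends when the next one begins; the last ends at current_age.
    end_ages = [start for start, _ in ordered[1:]] + [current_age]
    for (start, setting), end in zip(ordered, end_ages):
        totals[setting] += end - start
    return totals


def lifetime_share(history, current_age, setting):
    """Fraction of lifetime spent in the given setting."""
    totals = years_by_setting(history, current_age)
    return totals[setting] / sum(totals.values())


# Hypothetical woman: rural childhood, urban early adulthood, rural since age 30.
example_history = [(0, "rural"), (18, "urban"), (30, "rural")]
totals = years_by_setting(example_history, current_age=60)
rural_share = lifetime_share(example_history, 60, "rural")
```

The same traversal also yields the two mobility measures used in the analysis: the number of records is the number of lifetime residences, and the last interval is the duration at current residence.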

Discussion
Ecologic studies that rely on geographic differences in disease rates, particularly for long latency diseases such as cancer, are often compromised by a lack of information on length of residence. A number of researchers have noted that the effect of residential mobility on geographic differences in disease rates depends upon the scale of the analysis, the distance of migration, and the latency of the disease of interest (Rogerson & Han, 2002; Polissar, 1980; Kliewer, 1992). Thus, it is important to discuss our results in the context of these issues.
In our study, residential mobility was measured at the smallest scale possible (i.e., any change in street address was considered to be a move). Based on this definition of location, study participants on average lived at their current address for 15 years, with approximately 62% residing at that location for 10 years or more. As shown in Fig. 1, residential stability increases dramatically as we increase the geographic scale of our analysis. Thus, whether one finds our results encouraging or disheartening depends largely on the geographic scale of interest. For example, in a related study of geographic patterns of breast cancer incidence in the CTS cohort, Reynolds and colleagues (Reynolds et al., 2004a) reported approximately 20% higher rates of breast cancer in the San Francisco Bay and South Coast areas compared to the remainder of the state of California. Our results presented here (showing that 90% have resided in the same region for 10 years or more) suggest that these estimates are probably not too dramatically affected by residential mobility. Thus it is unlikely that the effects observed in the Reynolds study (Reynolds et al., 2004a) are due to recent in-migration of high-risk individuals into these areas or are dramatically attenuated by high-risk individuals moving out (or low-risk individuals moving in). In contrast, we currently are conducting a study of pesticide exposures and breast cancer incidence in the CTS cohort (Reynolds et al., 2004b). In that study, a GIS was used to assign potential exposures to agricultural pesticide use in a half-mile buffer around subjects' residence at diagnosis. The results of the present analyses, demonstrating that only 62% have resided at their current residence for 10 years or longer, are not as reassuring for the pesticide study and must be considered a limitation of that study.
As previous authors have noted, if residential mobility is non-selective between regions of interest, then regional differences in disease rates associated with regionally based factors will be attenuated (Polissar, 1980; Rogerson & Han, 2002). While non-selective migration leads to a decline in geographic variability of disease rates, selective migration that is driven by the health status of migrants may lead to exaggerated rate differences between regions (Rogerson & Han, 2002). A greater percentage of women in our study moved during their lifetime from an urban to a rural area (27.0%) than from a rural to an urban area (16.6%). It is difficult to say how this differential migration might affect disease rate ratios between urban and rural cohort members because such an effect depends on whether migration patterns are related to health status, something we could not evaluate in the current study.

In general, the women participating in our study demonstrated reasonable ability to accurately describe (compared to a standard classification based on census data) the degree of urbanization of the neighborhoods in which they resided. Concordance between self-reported and GIS-derived, census-based urban/rural designations was high (85%). This suggests that, in cases where address information or GIS technologies are not available, reliance on self-reported degree of urbanization may reasonably approximate this neighborhood attribute.
While very few study participants spent their entire lives living exclusively in either an urban or a rural area, the degree of urbanization of their current residences largely reflected the urban/rural designation where they lived for the majority of their lives. On average, among women currently living in urban areas, three-quarters of their lifetime was spent in urban areas. Similarly, among women currently living in rural areas, two thirds (on average) of their lifetime was spent living in rural areas. If this pattern is reflective of the general population, it would suggest that urban/rural gradients in disease rates may truly be capturing some risk factor or risk factor profile associated with extended exposures to urban environments.
In his evaluation of the effect of migration on geographic comparisons of disease rates, Polissar (1980) recommended studying relatively large regions and diseases with short latencies. Our results support the idea that expanding the geographic scale of analysis can dramatically reduce the effect of residential mobility on such studies. Unfortunately, limiting studies to large regions also makes it difficult to identify potential environmental determinants of disease, unless they are exposures which occur over large areas. Furthermore, while it is generally accepted that many chronic diseases, especially cancers, have long latency periods, it is becoming increasingly problematic to define disease-specific 'latency periods'. For example, while there is increasing evidence that the breast may be especially vulnerable to environmental insults early in life (perhaps even in utero) (Potischman & Troisi, 1999), there is also evidence that more recent exposures are important in determining breast cancer risk (Brody & Rudel, 2003). Especially with respect to exogenous hormone use, it may be that the more recent exposures are most important in determining risk (Bernstein, 2002).
There are some limitations to the current study worth noting. These analyses are based on a sub-sample of the CTS and may not reflect residential mobility for the general California female population. Because it is, to some extent, an occupational cohort, CTS study participants are more homogeneous than the statewide population. All members of the cohort have at least a college degree and all either work or have worked in a public school system, although many also have now or in the past had other occupations. In an attempt to assess the representativeness of our sample to the California population, we compared the residential mobility of our sample to data provided by the US census. Unfortunately, the US census does not provide estimates of residential stability for individuals, but rather for occupied households, so a direct comparison cannot easily be made. However, based on this census variable, only 31% of occupied households in California in 2000 were occupied by the same householder for more than 10 years, which is a considerably smaller proportion than the 62% of women in our study who have resided at their current address for ten years or more. This difference is probably in part due to the differing age structure of our sample compared to the universe of California householders. Furthermore, we would argue that our results are more informative to geographic studies of breast cancer (and other diseases more common among women of higher SES), since the demographics of our study sample better reflect the women at risk of developing breast cancer (i.e., older, educated, primarily non-Hispanic white women).
Very little has been published about the residential mobility of adult women. A recent case-control study of breast cancer in Marin County, California, reported that participants had lived in the same county for an average of 26 years; however, the study did not report on length of residence in the same home (Wrensch et al., 2003). In a recent case-control study of breast cancer on Long Island, New York, Gammon et al. reported that 58% of study participants had lived at their current home for at least 15 years (Gammon et al., 2002), a percentage considerably greater than what we observed (39%) in our study sample. As in our analyses, Gammon et al. also noted increased residential stability among older women. However, in contrast to our findings, they reported greater residential stability among lower SES women.
In summary, although most study participants moved several times during their lifetimes, the average number of years at their current address was reasonably long (15.1 years). This suggests that geographic patterns of disease incidence rates based on current residence, while not capturing a woman's full exposure history, may reflect some aspect of longer term chronic exposures. Whether this degree of residential mobility is sufficient to substantially bias risk estimates based on exposures linked to residence at diagnosis depends largely on the presumed latency between exposure and disease development, which for some diseases may be difficult to estimate. These issues need to be weighed in the context of the disease and exposure of interest. To the degree that the residential stability seen in our detailed analyses represents that of the population of women most at risk for breast cancer, ecologic studies designed to examine potential environmental causes of this disease, based on address at diagnosis, should be able to reliably characterize some aspect of long term chronic exposures, albeit not early life exposures.