Source contributions to ambient VOCs and CO at a rural site in eastern China

Ambient data on volatile organic compounds (VOCs) and carbon monoxide (CO) obtained at a rural site in eastern China are analyzed to investigate the nature of emission sources and their relative contributions to ambient concentrations. A principal component analysis (PCA) showed that vehicle emissions and biofuel burning, biomass burning and industrial emissions were the major sources of VOCs and CO at the rural site. The source apportionments were then evaluated using an absolute principal component scores (APCS) technique combined with multiple linear regressions. The results indicated that 71% ± 5% (average ± standard error) of the total VOC emissions were attributed to a combination of vehicle emissions and biofuel burning, and 7% ± 3% to gasoline evaporation and solvent emissions. Biomass burning and industrial emissions contributed 11% ± 1% and 11% ± 0.03% of the total VOC emissions, respectively. In addition, vehicle emissions and biomass and biofuel burning accounted for 96% ± 6% of the total CO emissions at the rural site, of which biomass burning was responsible for 18% ± 3%. The results based on PCA/APCS are generally consistent with those from the emission inventory, although our analysis indicates a larger relative contribution to CO from biomass burning.


Introduction
Atmospheric volatile organic compounds (VOCs) are important species affecting air chemistry on regional and global scales (Singh, 1999). Enhanced emissions of VOCs from various anthropogenic sources have not only reduced the air quality within source regions, but also have altered the composition of the atmosphere in remote regions through medium-and long-distance transport (Singh and Zimmerman, 1992;Blake et al., 2003). Thus, understanding emission sources of air pollutants and their source apportionment is critical to atmospheric chemistry research.
Recent studies have revealed that the burning of biomass and biofuel is a major source of air pollutants in rural China, releasing significant amounts of VOCs and CO into the atmosphere, particularly after harvest seasons (Wang et al., 2002; Carmichael et al., 2003). To estimate source contributions including biomass and biofuel burning, a detailed inventory of air pollutant emissions in China has been recently developed by Streets et al. (2003) in support of the airborne Transport and Chemical Evolution over the Pacific (TRACE-P) observations. However, evaluations of these emission estimates using surface measurements and the TRACE-P observations suggest that CO could be underestimated by 50% or more in China (Wang et al., 2002; Carmichael et al., 2003; Palmer et al., 2003). The VOC emission estimates are also thought to be highly uncertain (Carmichael et al., 2003). Therefore, further studies using different approaches and techniques are required in order to reconcile the source emission inventories, to improve our knowledge of atmospheric chemistry, and to develop environmental management strategies. In particular, accurate quantifications of biomass and biofuel burning sources are required. For these purposes, a principal component analysis/absolute principal component score (PCA/APCS) receptor model is utilized to evaluate and characterize rural source emissions in eastern China.
It is well known that vehicle exhaust and industrial sources contribute to ambient VOC levels in urban areas (Singh and Zimmerman, 1992). However, relatively less is known about the source characteristics of VOCs in rural areas (Sexton and Westberg, 1984; Borbon et al., 2002) due to a lack of field measurements. In China, approximately 80% of the population lives in the rural/agricultural regions. The main energy source for cooking and heating in many rural regions is biofuel, unlike in urban areas where liquefied petroleum gas or natural gas is the primary domestic fuel source. Open burning of agricultural residues is a common activity in rural areas, particularly during harvest seasons. In addition, there are many small- and medium-sized factories spread around the rural/agricultural areas of China due to rapid economic development in recent years. Consequently, human activities and the resulting VOC and CO emissions in rural areas are expected to have a significant influence on air quality. Furthermore, rural locations downwind of urban areas will receive air masses containing anthropogenic emissions. Thus, to reduce the VOC emissions and subsequently ozone levels, it is necessary to identify and quantify the major sources of VOCs and CO in the atmosphere of rural China.
From June 1999 to July 2000 and from February to June 2001, measurement campaigns were conducted at a rural site in eastern China (Lin'an). The measured air pollutants include O3, CO, SO2, total reactive nitrogen (NOy), nitric oxide (NO) and VOCs. The overall seasonal variations of the measured gases and the relationships between O3, NOy and CO at this site have been well characterized in earlier work. Cheung and Wang (2001) found that elevated ozone concentrations in this region had adverse effects on human health and agricultural crops. The emission patterns and potential sources of CO, NOy and SO2, and the relationships of trace gases and aerosols at this rural site are presented in Wang et al. (2002, 2004). This paper focuses on source identification and apportionment of VOCs and CO measured at this rural site with the application of a receptor model to the data set. The model-derived source apportionments are compared with estimates obtained from emission inventories in the study region.

Experimental
The ground-level field measurements were conducted at the Lin'an Baseline Air Pollution Monitoring Station (30°25′N, 119°44′E, 132 m) in the Zhejiang Province of China (Fig. 1). A detailed description of the site is given by Wang et al. (2002, 2004). Briefly, the station is 53 km west and 210 km southwest of Hangzhou and Shanghai, respectively. Lin'an County, with a population of about 50,000, is 10 km to the south. To the west and farther south are less developed and sparsely populated mountainous regions. Several small villages are scattered around the station within a 2 km range. The sampling site is surrounded by hills with pines, mixed deciduous trees and bamboo, and crop fields lie in the valleys among the hills. The land-use pattern is typical for rural areas of eastern China. Thus, the measurements are expected to represent a large region instead of isolated emission hotspots.

ARTICLE IN PRESS
Whole air samples subsequently analyzed for VOCs and CO were collected at the study site in October-November 1999 (16 samples) and March-June 2001 (46 samples). One sample was usually collected each day at noon. However, on 29 and 30 October 1999, and 1 and 7 June 2001, five samples were collected at 3-h intervals each day for the purpose of exploring diurnal variations. Most of the air samples were acquired in October 1999, March 2001 and June 2001, representing autumn, spring and summer, respectively. We combined the data collected in autumn 1999 with those collected in spring and summer 2001, rather than taking more samples in autumn 2001, because long-term field observations and site visits showed no major change in emission sources in the region between 1999 and 2001. Also, the PCA/APCS model depends on the inter-correlations among species rather than on their absolute concentrations. Thus, it is appropriate to pool the data from different periods at the same sampling site for the PCA/APCS analysis. With regard to the data points, samples were taken in three different seasons, and the number of air samples collected in the cold season (spring) was similar to that in the warm seasons (summer and autumn). The air samples were collected in evacuated 2 L electro-polished stainless steel canisters, each equipped with a bellows valve. Prior to sampling, the canisters were cleaned and evacuated at the University of California at Irvine. Details of the preparation and pre-conditioning of the canisters prior to sampling are described in Blake et al. (1994).
During sampling the canister valve was slightly opened, allowing about 1 min for the collection of the "integrated" samples. It is important that these samples be representative of the air at the sampling site. Although the determination of an average concentration is desirable, the grab samples will be reliable as long as they are representative of the actual physical and chemical characteristics of the pollutants. To maximize the representativeness of the air samples, we took samples at noon each sampling day, when we believe the air masses were well mixed. The canisters were then shipped to the University of California at Irvine for chemical analysis using gas chromatography (GC) with flame ionization detection (FID), electron capture detection (ECD), and mass spectrometer detection (MSD). A 6-column multiple GC-FID/GC-ECD/GC-MSD system was used to identify and quantify the VOCs. Detailed descriptions of the chemical analysis, the relevant quality assurance/quality control, and the accuracy and precision for each reported species are given by Colman et al. (2001).

The PCA/APCS receptor model
The statistical analysis of the collected data was performed using SPSS statistical software packages. A detailed description of the receptor model is given elsewhere (Thurston and Spengler, 1985; Swietlicki et al., 1996; Guo et al., 2004). In brief, PCA is a well-established tool for analyzing structure in multivariate data sets (Miller et al., 2002). It starts with a large number of correlated variables and seeks to identify a smaller number of independent factors that can be used to explain the variance in the data. It should be noted that a factor does not necessarily represent a specific emission or depletion mechanism, but rather a pattern of association (Derwent et al., 1995). Thus, a factor may contain more than one emission source. The derived variables are simply linear combinations of the original variables.
The first step in APCS is the normalization of all species concentrations as $Z_{ik}$:

$$Z_{ik} = \frac{C_{ik} - \bar{C}_i}{s_i} \qquad (1)$$

where $C_{ik}$ is the concentration of variable $i$ (in our case VOCs and CO) in sample $k$; $\bar{C}_i$ is the arithmetic mean concentration of variable $i$; and $s_i$ is the standard deviation of variable $i$ for all samples included in the analysis.
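The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration only: the CO values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics

def normalize(concentrations):
    """Return Z_ik = (C_ik - mean_i) / s_i for one species across all samples."""
    mean_c = statistics.fmean(concentrations)
    # Assumption: sample standard deviation (n - 1 denominator).
    std_c = statistics.stdev(concentrations)
    return [(c - mean_c) / std_c for c in concentrations]

# Hypothetical CO mixing ratios (ppbv) in five samples, for illustration only.
co_ppbv = [420.0, 515.0, 610.0, 480.0, 725.0]
z_co = normalize(co_ppbv)
```

After this step every species has zero mean and unit standard deviation, so abundant species such as CO do not dominate the PCA simply because of their magnitude.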
As the factor scores obtained from PCA are normalized, with a mean of zero and standard deviation equal to unity, the true zero for each factor score is calculated by introducing an artificial sample with concentrations equal to zero for all variables:

$$(Z_0)_i = \frac{0 - \bar{C}_i}{s_i} = -\frac{\bar{C}_i}{s_i} \qquad (2)$$

The factor scores of the variables are obtained from PCA by analysis of normalized VOC concentrations. The APCS for each component is then estimated by subtracting the factor scores for this artificial sample from the factor scores of each true sample.
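The zero-sample correction described above can be sketched with NumPy. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the data are synthetic, the two "sources" and their profiles are invented, and the factors are left unrotated, whereas the receptor model works with rotated factors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 58 samples x 4 species, built from two made-up sources.
sources = rng.uniform(0.5, 3.0, size=(58, 2))
profiles = np.array([[1.0, 0.8, 0.1, 0.3],
                     [0.1, 0.2, 1.0, 0.6]])
conc = sources @ profiles + rng.normal(0.0, 0.02, size=(58, 4))

mean = conc.mean(axis=0)
std = conc.std(axis=0, ddof=1)
z = (conc - mean) / std          # normalized concentrations of the real samples
z0 = (0.0 - mean) / std          # artificial sample with zero concentrations

# PCA on the correlation matrix; keep the two leading components (unrotated).
corr = np.corrcoef(conc, rowvar=False)
eigval, eigvec = np.linalg.eigh(corr)
order = np.argsort(eigval)[::-1][:2]
# Score coefficients scaled so that each factor score has unit variance.
coef = eigvec[:, order] / np.sqrt(eigval[order])

scores = z @ coef                # normalized factor scores (mean 0, std 1)
scores0 = z0 @ coef              # factor scores of the zero sample
apcs = scores - scores0          # absolute principal component scores
```

Because the mean terms cancel in the subtraction, the APCS in this sketch equal `(conc / std) @ coef`, which shows directly that a sample with zero concentrations receives an APCS of zero.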
Regressing the VOC concentration data on these APCS gives estimates of the coefficients which convert the APCS into pollutant source mass contributions from each source for each sample. The source contributions to $C_i$ can be calculated by using a multiple linear regression procedure according to the relationship:

$$C_i = (b_0)_i + \sum_{p} \left( \mathrm{APCS}_p \times b_{pi} \right) \qquad (3)$$

where $(b_0)_i$ is the constant term of multiple regression for pollutant $i$; $b_{pi}$ is the coefficient of multiple regression of the source $p$ for pollutant $i$; and $\mathrm{APCS}_p$ is the scaled value of the rotated factor $p$ for the considered sample. $\mathrm{APCS}_p \times b_{pi}$ represents the contribution of source $p$ to $C_i$. The mean of the product $\mathrm{APCS}_p \times b_{pi}$ over all samples represents the average contribution of the sources. The PCA/APCS approach represents a source apportionment technique which requires a minimum of inputs regarding source characteristics, but provides quantitative information regarding source profiles and their impacts. When the receptor model is used, it must be assumed that the sampled air is well mixed with species from different sources and that the species are relatively stable during transport from the emission sources to the receptor site. While the theoretical aspects of the approach are somewhat complex, PCA/APCS is simple to perform in practice, requiring only principal component and regression procedures routinely available in standard statistical computer packages. The receptor model calculates the source profiles and source strengths in absolute concentrations. It does not require knowledge of the number of active sources or their composition (source profiles) in advance. This important feature enables the model hypothesis to be validated by comparison of estimated source profiles with literature data. On the other hand, like all other multivariate receptor models, the PCA/APCS model needs adequate degrees of freedom for an accurate statistical analysis (Thurston and Spengler, 1985; Swietlicki et al., 1996).
Furthermore, these multivariate receptor models may not be able to separate sources that are strongly correlated. In this study, PCA was carried out on CO and 18 VOCs which were among the most abundant trace compounds in the air and played important roles in atmospheric chemistry. The results of any PCA depend on the sampling duration, the number of pollutants and data points included.
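The regression step of the receptor model, in which each species concentration is regressed on the APCS and the per-sample products of APCS and regression coefficient are averaged, can be sketched as follows. The APCS values, the coefficients used to synthesize the test species, and the noise level are all hypothetical; expressing each source as a percentage of the modeled mean (intercept included) is one possible convention, assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 58
# Hypothetical APCS for three factors (arbitrary scale).
apcs = rng.uniform(0.5, 4.0, size=(n_samples, 3))
# Hypothetical coefficients, used only to synthesize one test species.
true_b0, true_b = 20.0, np.array([150.0, 10.0, 35.0])
conc = true_b0 + apcs @ true_b + rng.normal(0.0, 5.0, size=n_samples)

# Multiple linear regression: C_i = (b0)_i + sum_p APCS_p * b_pi
design = np.column_stack([np.ones(n_samples), apcs])
coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
b0, b = coef[0], coef[1:]

# Contribution of each source in each sample, then its average over samples.
contrib = apcs * b
mean_contrib = contrib.mean(axis=0)
# Assumption: shares are expressed relative to the modeled mean concentration.
share = 100.0 * mean_contrib / (b0 + mean_contrib.sum())
```

In this sketch the intercept `b0` collects whatever the factors do not explain, so the three shares sum to slightly less than 100% unless the intercept is zero.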
It is worth pointing out that insufficient air samples would leave inadequate degrees of freedom for an accurate multivariate statistical analysis. In order to obtain stable PCA results, the sample number must greatly exceed the number of selected species. Thurston and Spengler (1985) recommend a sample excess of 50. Choi et al. (2003) have done extensive sensitivity tests on this topic and found that stable results can be sometimes achieved with a sample number excess of as little as 25, but below this the different principal components begin to group together and the PCA results become unstable. In this study, three outliers were removed from summer because the mixing ratios of ethylbenzene and xylene isomers in these three samples were notably higher than in other samples, whereas other species in the three samples were comparable to corresponding species in the remaining samples. Similarly, one outlier with the highest CO mixing ratio (1390 ppbv) was removed from the spring samples. Consequently, 58 air samples and 19 chemical species were selected for PCA, and the excess of samples to species was 39. Thus, we believe that the PCA will give us robust results.
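The sample-excess bookkeeping in this paragraph can be checked with simple arithmetic. The sketch below uses only the counts quoted in the text; the variable names are illustrative.

```python
collected = 16 + 46      # autumn 1999 samples + spring/summer 2001 samples
outliers = 3 + 1         # three summer samples and one spring sample removed
species = 18 + 1         # 18 VOCs plus CO

retained = collected - outliers
excess = retained - species
print(retained, species, excess)   # 58 19 39
```

The resulting excess of 39 lies above the lower limit of about 25 reported by Choi et al. (2003) but below the 50 recommended by Thurston and Spengler (1985).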

Seasonal patterns of ambient VOCs and CO
Average concentrations of VOCs and CO in autumn 1999, spring and summer 2001 are presented in Table 1. To investigate whether there are statistical differences among the three seasons, a t-test is generally used. If the p-value is less than 0.001, there is a significant difference between the two data sets at a confidence level of 99.9%. By contrast, a p-value > 0.05 means no significant difference between the two data sets. For i-butane, n-butane, n-pentane, ethyne, ethylbenzene, xylenes and C2Cl4, there was no significant difference in mixing ratios among the three seasons. For the C2-C3 alkenes, benzene (p < 0.01) and toluene (p < 0.05), the average mixing ratio was higher in autumn than in spring and in summer, with no significant difference between spring and summer. The average i-pentane mixing ratio in summer was statistically higher than that in spring and autumn (p < 0.01). Since i-pentane is a component of gasoline, the higher i-pentane level in summer suggests more evaporation of gasoline in the hot season. The t-test showed that more isoprene was emitted in summer, followed by autumn and spring, as its emission from vegetation is a function of light intensity and temperature (Jobson et al., 1994; McLaren et al., 1996). Significantly higher CH3Cl and CO (p < 0.05) mixing ratios were found in summer and autumn in spite of a more efficient dispersion of the pollutants and a more active photochemistry in these hot seasons. This is likely due to the variations of VOC emission source strength in different seasons. Biomass and biofuel burning has already been observed at the site, particularly in summer and autumn, in our previous papers (Wang et al., 2002). Since CH3Cl is a tracer of biomass and biofuel burning (Lobert et al., 1991), the results suggest that there were more biomass and biofuel burning activities in summer/autumn than in spring at the site.
More evidence can be found by comparing the CO mixing ratios during the spring and summer/autumn seasons. The higher CO level in summer and autumn suggests a more frequent occurrence of combustion. In addition, the mixing ratios of ethane in spring and autumn were higher than that in summer (p < 0.01). A significant difference was found for propane among the three seasons, with the highest value in autumn and the lowest in summer. It can be seen that the average concentration of C2Cl4, mainly emitted from industrial sources, was almost the same in spring as that in summer and autumn. Considering that dispersion and photochemistry are more active in summer/autumn than in spring, we would expect more summer and autumn emissions of C2Cl4 to keep the concentration the same as in spring.
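A seasonal comparison of this kind can be sketched with a two-sample t statistic. The text does not say which variant of the t-test was applied, so Welch's unequal-variance form is an assumption here, and the mixing ratios are hypothetical. Only the statistic and the approximate degrees of freedom are computed; the p-value would then be read from the t distribution (for example with `scipy.stats.ttest_ind(..., equal_var=False)`).

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical i-pentane mixing ratios (pptv), for illustration only.
summer = [310.0, 280.0, 350.0, 400.0, 330.0, 365.0]
spring = [190.0, 220.0, 160.0, 240.0, 205.0, 180.0]
t_stat, dof = welch_t(summer, spring)
```

A positive `t_stat` indicates that the first group (summer in this example) has the higher mean.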

Source identification of VOCs and CO at the rural site
In this section, we focus on source identifications of ambient VOCs and CO at the site during the sampling periods. As discussed in Section 3, four out of 62 samples were removed due to unusually high mixing ratios of various compounds. Nineteen out of 54 measured species were selected for further analysis because these species are among the most abundant trace compounds in the atmosphere and are the tracers of major anthropogenic and natural sources.
The data acquired at the site were used to conduct PCA, and the statistical results are shown in Table 2. Three factors were extracted. The first factor (F1) explained 63% of the total variance and the second factor (F2) accounted for 11%. The last factor (F3) was responsible for around 8%.
F1 and F3: A high CO factor loading was found in F1 and F3. CO is a product of incomplete fossil fuel combustion and/or biomass and biofuel burning. High factor loadings were also found in F1 for ethene, propene, ethyne and benzene. These chemicals are mainly emitted from vehicle exhaust in urban areas and may also be released from biomass and biofuel burning in rural areas. On the other hand, CH3Cl correlated with F1 and F3 fairly well. Together, this suggests that both vehicle exhaust and biomass or biofuel burning are mixed in F1. Considering the daily emission patterns and wide use of biofuel in rural China, and the seasonal characteristics of biomass burning, it is more likely that the vehicle emissions were correlated with biofuel rather than biomass burning. Moreover, C2Cl4 had a high factor loading in F1 as well. C2Cl4 is a manufactured chemical that is widely used for dry cleaning of fabrics and for metal degreasing. Therefore, the VOC and CO sources in F1 were mainly vehicle exhaust, biofuel burning and industrial sources. In addition to CO and CH3Cl, isoprene had a high factor loading in F3. Since isoprene is released from biogenic sources in rural areas and its emission is related to temperature, the good correlation with CH3Cl indicates the co-location of biomass burning and biogenic emissions. Therefore, F3 was likely associated with biomass burning and biogenic emissions.

The source profiles of F1 and F3 and the factor scores of selected air samples were used to check these interpretations of F1 and F3. Since factor scores are related to source contributions, air samples with higher factor scores would likely be closer to the emission sources. Therefore, the source profiles of the three extracted factors (F1, F2 and F3) were obtained by selecting the 4-5 air samples with the highest factor scores. Fig. 2 shows the source profiles of F1 and F3. It can be seen that the mixing ratios of the vehicle tracers ethene, propene, n-butane, ethyne and benzene in F1 were statistically higher than those in F3. The levels of the biofuel or biomass markers C2-C3 alkenes, ethyne and benzene in F1 were also generally much higher than those in F3, but the average mixing ratio of CH3Cl (a typical biomass and biofuel burning tracer) did not show a significant difference between F1 and F3. In addition, the mixing ratio of the industrial/urban tracer C2Cl4 was much higher in F1 than in F3. The results obtained from these source profiles indicate that F1 is dominated by a mix of urban plumes and rural emissions whereas F3 is mainly related to rural emission sources. To investigate whether F1 was a mix of urban emissions and biofuel or biomass burning, the time series of factor scores for the three extracted factors is presented (Fig. 3). For F1, the highest factor scores (circled points) were observed in October 1999 and March 2001. By contrast, F3 had the highest factor scores in summer 2001. Since biofuel burning occurs every day in rural China while the open burning of crop residues (biomass burning) usually occurs in summer, the above observations support the interpretation that F1 was associated with urban emissions and biofuel burning while F3 was more related to biomass burning.
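The way a source profile is built from the samples with the highest factor scores can be sketched as follows. The function name, the factor scores and the mixing ratios are hypothetical, and a simple arithmetic mean over the selected samples is assumed.

```python
def source_profile(factor_scores, mixing_ratios, top_n=5):
    """Average each species over the top_n samples with the highest factor score."""
    ranked = sorted(range(len(factor_scores)),
                    key=lambda k: factor_scores[k], reverse=True)[:top_n]
    species = mixing_ratios[0].keys()
    return {s: sum(mixing_ratios[k][s] for k in ranked) / len(ranked)
            for s in species}

# Hypothetical F1 factor scores and mixing ratios (pptv) for six samples.
f1_scores = [0.2, 2.1, -0.5, 1.7, 0.9, -1.1]
samples = [
    {"ethene": 900.0, "CH3Cl": 700.0},
    {"ethene": 3200.0, "CH3Cl": 820.0},
    {"ethene": 600.0, "CH3Cl": 650.0},
    {"ethene": 2800.0, "CH3Cl": 780.0},
    {"ethene": 1500.0, "CH3Cl": 760.0},
    {"ethene": 400.0, "CH3Cl": 640.0},
]
profile_f1 = source_profile(f1_scores, samples, top_n=2)
```

Here the two samples with the highest scores (the second and fourth) are averaged; with real data `top_n` would be 4 or 5, as in the text.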
F2: i-Pentane was found to correlate highly with ethylbenzene and xylenes in the second factor. i-Pentane is known to be a tracer compound for gasoline evaporation (Morikawa et al., 1998). Aromatics are also emitted by gasoline evaporation and solvent emissions. Thus, gasoline evaporation/solvent usage could be a main contributor to the second factor. This is consistent with the source profile of F2, in which the major components of gasoline, i.e. i-pentane and ethylbenzene, were much higher than those in F3 but comparable to those in F1 (Fig. 2).
Table 3 shows the source apportionment of individual VOCs and CO at the site. In APCS, the source contribution estimates can be negative (Miller et al., 2002). The R² values in the table represent the correlation between the measured and calculated concentrations. Most of the VOCs had an R² > 0.80, indicating a good fit between the observed and calculated concentrations. It was found that F1 accounted for about 78% of the total CO emissions. For VOCs, over 65% of the total emissions for individual VOCs were attributed to the co-contributions of vehicle emissions, biofuel burning and industrial sources, with the exception of o- and p-xylenes and isoprene. Gasoline evaporation and/or solvent usage (F2) explained 14-63% of the total emissions for i-pentane, ethylbenzene and xylenes, and < 8% for the other species. Biomass burning and biogenic emissions (F3) contributed 18% of the total CO emissions, 11% of benzene, 16% of ethyne, 17% of i-pentane and 27% of CH3Cl. By comparison, about 68% of ambient CH3Cl at the rural site was due to biofuel emissions. All of the ambient isoprene was emitted from biogenic sources in F3.

The mass contributions of each source to the total ambient VOC concentrations were next estimated (Fig. 4). Here the unit of all VOC concentrations was μg m⁻³. For factors containing more than one source, the tracers for a source were assumed to be exclusively emitted from this specific source and their contributions were assigned to this source. For instance, C2Cl4 was exclusively allotted to the industrial emissions in F1. Furthermore, the contributions from similar sources were summed and represented as a common source. For example, as stated earlier, the contribution of vehicle emissions was not distinguished from that of biofuel burning. This is because both sources shared common fingerprints such as ethene, propene, benzene and ethyne, and neither source presented a strong enough signature to be separated from the other. The results implied that F1 originated from a combination of biofuel burning and vehicle exhausts. This is consistent with our previous findings based on correlation analysis. The mass contribution of the combination of vehicle emissions and biofuel burning to total VOCs was 71% ± 5%. The contribution of biomass burning to total VOCs was found to be 11% ± 1%, whereas industrial emissions accounted for 11% ± 0.03% and gasoline evaporation/solvent emissions for 7% ± 3%. The contribution of biogenic emissions to the total VOCs was only 0.02% ± 0.51%, which is negligible. This is due to the fact that among the 18 VOC species selected for PCA/APCS analysis, only isoprene is a product of biogenic emissions while the other species are emitted from anthropogenic sources. Looking at the relative contributions of emission sources to individual VOC species, we found that isoprene was exclusively emitted from biogenic sources (Table 3). However, since the average isoprene level was much lower than the total concentration of anthropogenic VOC species, it is not surprising that the biogenic contribution to the total VOCs is negligible at the site.
In fact, in this region there were some anthropogenic sources which largely affected the abundance and speciation of ambient VOCs at the site.
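The mass contributions in this section are expressed as mass concentrations, whereas the measurements are reported as mixing ratios. A minimal sketch of the ideal-gas conversion from ppbv to μg m⁻³ is given below. The reference temperature (298.15 K) and pressure (101325 Pa) are assumptions, since the conditions used in the study are not stated, and the mixing ratios are hypothetical.

```python
R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def ppbv_to_ug_m3(ppbv, molar_mass, temp_k=298.15, pressure_pa=101325.0):
    """Convert a mixing ratio in ppbv to a mass concentration in ug m^-3."""
    # Ideal-gas molar volume in litres per mole at the assumed conditions.
    molar_volume_l = R * temp_k / pressure_pa * 1000.0
    return ppbv * molar_mass / molar_volume_l

# Hypothetical mixing ratios (ppbv) paired with molar masses (g mol^-1).
species = {"ethene": (2.0, 28.05), "benzene": (0.8, 78.11), "toluene": (1.1, 92.14)}
mass = {name: ppbv_to_ug_m3(x, m) for name, (x, m) in species.items()}
total_voc = sum(mass.values())
fractions = {name: 100.0 * v / total_voc for name, v in mass.items()}
```

Because heavier molecules weigh more per unit mixing ratio, a species such as toluene takes a larger share of the total VOC mass than its mixing ratio alone would suggest.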

Model evaluation
To evaluate the performance of the PCA/APCS model, the correlation between the field measurements and the corresponding modeling results was statistically estimated. Fig. 5 shows the scatter plot of the calculated and measured CO mixing ratios at the site (n = 58). It was found that the calculated values are close to the measured CO mixing ratios, with a squared correlation coefficient of 0.84. The model underestimated CO levels by 21% [Uncertainty = (CO_measured − CO_modeled)/CO_measured] (Table 4). This suggests that the receptor model can reasonably estimate source apportionments of CO. The uncertainty of the model prediction was also evaluated for individual VOCs. It was found that C2-C3 alkenes, C4-C5 alkanes, benzene, toluene, ethylbenzene, m-xylene and C2Cl4 were overestimated by the model by 2-63%. In contrast, the model underestimated emissions of C2-C3 alkanes, o- and p-xylenes, ethyne, CH3Cl and isoprene by 12-61%. Furthermore, the model underestimated the total VOCs by 4%. Model evaluation studies suggest that the data input error is often a major contributor to total uncertainty (Hanna, 1988). In this study, the parameter $\mathrm{APCS}_p$ in Eq. (3) was modeled from field measurements and then used to calculate source contributions to ambient air pollutants. The uncertainties in the APCS parameter caused errors in the prediction.
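The two evaluation metrics used above, the squared correlation coefficient and the relative uncertainty, can be sketched as follows. The CO values are hypothetical, and evaluating the uncertainty on the sample means is an assumption, since the text gives only the general form of the expression.

```python
import statistics

def r_squared(measured, modeled):
    """Squared Pearson correlation coefficient between two series."""
    mx, my = statistics.fmean(measured), statistics.fmean(modeled)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, modeled))
    sxx = sum((x - mx) ** 2 for x in measured)
    syy = sum((y - my) ** 2 for y in modeled)
    return sxy ** 2 / (sxx * syy)

def relative_uncertainty(measured, modeled):
    """(measured - modeled) / measured, evaluated on the sample means."""
    m = statistics.fmean(measured)
    return (m - statistics.fmean(modeled)) / m

# Hypothetical CO mixing ratios (ppbv), for illustration only.
co_measured = [500.0, 620.0, 710.0, 450.0, 830.0]
co_modeled = [400.0, 496.0, 568.0, 360.0, 664.0]
r2 = r_squared(co_measured, co_modeled)
bias = relative_uncertainty(co_measured, co_modeled)
```

In this example the modeled values are exactly 80% of the measured ones, so the correlation is perfect while the relative uncertainty is 0.2 (a 20% underestimate). The two metrics therefore answer different questions: one measures how well the variability is reproduced, the other the systematic offset.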

Comparison with the emission inventories
The results obtained in this study were compared with the emission inventories developed by Streets et al. (2003). Our model estimate suggests that 18% ± 3% of CO and 11% ± 1% of total VOCs were caused by biomass burning (Table 5). The estimated CO emission due to biomass burning at the rural site (18%) was twice that in emission inventories for Zhejiang province (9%), 6 times that for Jiangsu (3%) and comparable to that for Anhui province (15%). On the other hand, the contribution of the combined vehicle emissions and biofuel burning to the total CO emissions in this study (78% ± 5%) was consistent with the emission inventories (71% for Zhejiang, 76% for Jiangsu and 61% for Anhui). Table 5 also lists the source contributions to total VOCs derived from the emission inventory for Zhejiang province and neighboring provinces. It can be seen that the contribution of biomass burning to VOCs at Lin'an (11% ± 1%) was comparable to that in Zhejiang (9%). Similar to the case for CO, the combined contribution of vehicle emissions and biofuel burning to VOCs (71% versus 68%) was comparable.
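The receptor-to-inventory ratios quoted for the biomass burning contribution to CO follow directly from the percentages in the text, as the short check below shows. The variable names are illustrative.

```python
receptor_co_biomass = 18.0   # % of CO from biomass burning, this study
inventory_co_biomass = {"Zhejiang": 9.0, "Jiangsu": 3.0, "Anhui": 15.0}

# Ratio of the receptor-model estimate to each provincial inventory value.
ratios = {province: receptor_co_biomass / pct
          for province, pct in inventory_co_biomass.items()}
```

The ratios are 2 for Zhejiang, 6 for Jiangsu and 1.2 for Anhui, matching the "twice", "6 times" and "comparable" descriptions above.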

Summary and conclusions
Measurements of VOCs and CO at a rural/agricultural site in eastern China were analyzed to better understand their emission characteristics and relative contribution from major source sectors. Based on the analysis using the PCA/APCS model, the dominant VOC and CO sources were a combination of vehicle emissions, biofuel burning and industrial sources, biomass burning and biogenic emissions, and gasoline evaporation and solvent emissions. Model-derived source apportionment showed that the combination of vehicle emissions and biofuel burning explained 71% of the total VOC emissions, 11% for biomass burning and 0.02% for biogenic emissions. In addition, 78% of CO emissions were from vehicle emissions and biofuel burning, and 18% from biomass burning. These results are generally consistent with those from the emission inventories, although our analysis suggests a larger relative contribution to CO from biomass burning (18% versus 9%).
Signals from biofuel burning and urban emissions coexisted in the data set. As a result, the PCA/APCS model was unable to determine the relative importance of these two sources. Under such circumstances, chemical transport models and/or chemical mass balance models, which require detailed emission inventories and/or source profiles, respectively, will be needed for more accurate source apportionments of ambient VOCs and CO.

a In the emission inventory, for the total VOC emissions, the domestic combustion sources are assumed to be biofuel burning.