Contribution of biomass and biofuel emissions to trace gas distributions in Asia during the TRACE-P experiment

[1] A comprehensive emission inventory with enhanced spatial and temporal resolution is used to help quantify the contribution from three source categories (fossil, biofuel, and biomass burning) during the NASA TRACE-P experiment. Daily biomass burning emissions are developed to support this analysis. Emissions of 27 species and their ratios, by sector, region, and source category are presented. The emission distributions and chemical composition are further analyzed using various statistical techniques. Using cluster analysis, the 27 chemical species are combined into 8 groups that have similar regional distribution, and 52 regions are assembled into 11 regional groups that have similar chemical composition. These groups are used in Chemical Mass Balance analysis to characterize air masses and to quantify the contribution of the three source categories to the observed species distributions. Five DC8 flights with 16 flight segments associated with outflow events are analyzed. In general, Asian outflow is a complex mixture of biofuel, biomass, and fossil sources. Flights in the post frontal regions at high latitudes and low altitudes have a high contribution of fossil fuel emissions. Flights in the warm sector of cold fronts are dominated by biomass burning contributions (about 70%). Biofuel contributions are high (about 70%) when air masses come from central China. The receptor model results are shown to be consistent with other 3-D chemical model sensitivity studies and analysis using ratios of indicator species (e.g., dK+/dSO4^2−, CH3CN/SOy, SOy/CO, and C2Cl4/CO).
Composition and Structure: Constituent sources and sinks; 0345 Atmospheric Composition and Structure: Pollution - urban and regional (0305); 0365 Atmospheric Composition and Structure: Troposphere - composition


Introduction
[2] Biomass burning and biofuel combustion together produce large amounts of CO, NMVOC, BC, and OC [Andreae and Merlet, 2001; Crutzen and Andreae, 1990; Duncan et al., 2003; Galanter et al., 2000; Streets et al., 2003b]. At a more local scale they cause smoke and haze, which can adversely affect human health. They also affect global climate through their emissions of greenhouse gases and aerosols, and by modifying the capacity of the vegetation cover to act as a carbon sink. Characterizing the role of biomass burning in determining the composition, location, and fluxes of trace gases and aerosols in the Asian outflow was an important objective of TRACE-P [Tang et al., 2003a, 2003b]. However, quantifying the contribution of emissions from biofuel and biomass burning is challenging because biofuels have a chemical composition similar to that of biomass burning materials, making it difficult to distinguish between the two [Hao and Liu, 1994; Streets and Waldhoff, 1999].
[3] Chemical distributions can be used to help identify emission signals. In the Indian Ocean Experiment (INDOEX), mixing ratios of CO2, ozone, and medium- and long-lived hydrocarbons (CH4, C2H2, CH3CH3, and CH3CH2CH3) [De Laat et al., 2001; Dickerson et al., 2002], as well as aerosol composition (NO3−, SO4^2−, NH4+, K+, OC, and BC) were used to characterize air masses. However, most of these studies have focused on air mass characterization from the measurement side. Aircraft studies focused on chemical signatures in the source areas are relatively few, even though this information is of fundamental importance to the characterization of source-receptor relationships in outflow events. The lack of such studies in Asia is due in part to limited emissions information.
[4] Recently, comprehensive emission inventories were developed in support of the ACE-Asia and TRACE-P experiments. This information was combined into the ACE-Asia and TRACE-P Modeling and Emission Support System (ACESS) to support these field studies [Streets et al., 2003a, 2003b]. These inventories contain source information on 52 regions/countries, 27 chemical species, uncertainty estimates, and detailed source sector-activity information.
[5] In this paper we analyze this detailed emission information to identify regional signals, and to characterize and distinguish biomass, biofuel and fossil fuel sources. We then apply the result of the analysis to help quantify the contributions of these sources to the air masses observed during the TRACE-P experiment.
[6] In the first part of this paper we describe the methodologies used to develop biomass and biofuel emission inventories, with the main focus on improving their spatial and temporal resolution. The inventories from Streets et al. [2003a] and Streets et al. [2003b] present seasonal biomass burning emissions formulated for the base year 2000. For analysis of the TRACE-P observations, we need emissions that are representative of March 2001. For this purpose we build a daily biomass emission inventory for the TRACE-P period. The second part of this paper is focused on identification of chemical signatures derived from the enhanced emission inventory. In this process, we explore various chemical composition analyses by source category and region. Chemicals and regions are grouped by their characteristics using statistical analysis.
[7] The last section of the paper is devoted to the characterization of air masses using the source and receptor information. Chemical Mass Balance (CMB) analysis is performed using the aircraft observations and the source information, and the results are used to estimate the contribution of biomass, biofuel, and fossil fuel emissions to the various outflow regions.

Domain and Data
[8] The emission inventory was developed for the domain shown in Figure 1. Regional emissions were estimated from Streets et al. [2003a] for the year 2000. Both regional and gridded forms of the emissions were compiled for several "activity" sectors. We combined the activity sectors into three source categories: biofuel, fossil fuel (including the non-fossil fuels CH4 and NH3), and "open" biomass burning. The inventory includes nine major chemical species: SO2, NOx, CO, non-methane volatile organic compounds (NMVOCs), NH3, CH4, CO2, black carbon aerosol (BC), and organic carbon aerosol (OC). The NMVOC emissions are speciated into 19 sub-categories based on chemical reactivity and functional groups [Streets et al., 2003a]. The NMVOC subspecies included in the analysis are: ethane, propane, butanes, pentanes, other alkanes, ethene, propene, terminal alkenes, internal alkenes, acetylene, benzene, toluene, xylenes, other aromatics, formaldehyde, other aldehydes, ketones, halocarbons, and all other NMVOCs.
[9] For spatial and temporal allocation we used three categories of data sets. The first category is sectoral energy/emission data organized by countries/regions [Streets et al., 2003a, 2003b]. The second is geographical data for spatial allocation to model grids. Administrative boundary and road map data were extracted from the RAINS-ASIA model and the Digital Chart of the World (DCW) (see the National Imagery and Mapping Agency (NIMA), The Digital Chart of the World, available at http://www.nima.mil/, 1998). High-resolution (30 second by 30 second) population and land cover data were extracted from the LandScan Global Population database from Oak Ridge National Laboratory (ORNL) [ORNL, 1999]. These data were used for allocating anthropogenic emissions. The third type of data is satellite information. We used data from the Advanced Very High Resolution Radiometer (AVHRR) satellite for cloud and fire count information. We also used the NASA Total Ozone Mapping Spectrometer (TOMS) for aerosol information (see TOMS web site at http://toms.gsfc.nasa.gov/aerosols/aerosols.html, 2001; see also the World Fire Web (WFW) web site at http://www.gvm.jrc.it/tem/wfw/wfw.htm) [Streets et al., 2003b]. The data from the NOAA AVHRR were mainly used to generate spatial and temporal allocation factors for the biomass burning emission estimates. The TOMS Aerosol Index data were used to help overcome cloud interferences in the fire count data.

Spatial and Temporal Distributions
[10] The technical aspects of the development of the total emissions for Asia are described elsewhere [Klimont et al., 2002; Streets et al., 2000a, 2000b, 2001, 2003a, 2003b]; here we only introduce the methodologies and focus on those aspects that we used in subsequent analysis.
[11] The emissions of a particular species are estimated as a product of the activity rate, the unabated emission factor, and the removal efficiency of any applied emission abatement technologies, using the equation [Streets et al., 2003a]:

$$E_{j,k} = \sum_{l}\sum_{m}\sum_{n} A_{k,l,m}\, ef_{j,k,l,m}\, \left(1 - \eta_{j,l,m,n}\, \alpha_{j,k,l,m,n}\right) X_{j,k,l,m,n} \qquad (1)$$

where j, k, l, m, n = species, region, sector, fuel/activity type, abatement technology; E = emissions; A = activity rate; ef = unabated emission factor; η = removal efficiency of abatement technology n; α = maximum application rate of abatement technology n; and X = actual application rate of abatement technology n. Note that the set of abatement technologies includes a "no-control" case, such that $\sum_n X = 1$. In this inventory, we have 27 chemical species (8 major species and 19 NMVOC sub-species), 3 source categories (biofuels, biomass burning, and fossil fuel), and 52 regions/countries. The number of possible combinations of chemical-categories-regions is 8.8 million. This presents a formidable informatics challenge in identifying useful and robust source indicators.
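As an illustration of equation (1), the following minimal Python sketch evaluates the control term for a single species/region/sector/fuel cell. The function name, the input layout, and the numbers are hypothetical and are not taken from the inventory. The last two lines show one reading of the combination count, assuming (our assumption, not stated in the text) that it refers to unordered pairs of species-category-region cells.

```python
from math import comb


def cell_emissions(activity, ef, controls):
    """Emissions for one (species, region, sector, fuel) cell.

    controls is a list of (eta, alpha, x) tuples, one per abatement
    technology, and must include a no-control entry with eta = 0 so that
    the application rates x sum to 1.
    """
    total_x = sum(x for _, _, x in controls)
    if abs(total_x - 1.0) > 1e-9:
        raise ValueError("application rates X must sum to 1")
    # Sum over technologies n of (1 - eta * alpha) * X, scaled by A * ef.
    return activity * ef * sum((1.0 - eta * alpha) * x for eta, alpha, x in controls)


# Hypothetical cell: 100 fuel units, emission factor 2.0 per unit,
# 60% uncontrolled and 40% using a technology with 50% removal efficiency.
example = cell_emissions(100.0, 2.0, [(0.0, 1.0, 0.6), (0.5, 1.0, 0.4)])

# Assumed reading of the combination count: unordered pairs of cells.
n_cells = 27 * 3 * 52
n_pairs = comb(n_cells, 2)
```

Summing `cell_emissions` over sectors and fuel types would give the regional total for a species. Under the pairwise assumption, `n_pairs` is 8,868,366, which is close to the 8.8 million quoted above.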

Biofuel Emissions
[12] Biofuel emissions arise from the combustion of wood, animal waste (dung), and agricultural waste as household fuel. In this study, biofuel is treated as an anthropogenic source, in contrast to biomass burning which is treated as a partly natural/partly anthropogenic source. Total emission estimates from Streets et al. [2003a] were allocated to grids using the spatial allocation procedure described here. Since biofuels are mainly used for cooking and heating in rural areas, biofuel emissions were spatially allocated to grids using rural population. The monthly variation of biofuel emissions is large in East Asia and is driven by domestic heating needs. However, the daily variation is relatively small. Biofuel emissions for March were shown to be equal to the monthly average emissions according to Streets et al. [2003a]. Geographical distributions of estimated biofuel and fossil fuel emissions are presented in Figure 2. As shown, the high biofuel emission regions are mainly located in central and east China, Southeast Asia, and South Asia.

Biomass Burning Emissions
[13] Biomass burning in our inventory is divided into three major categories: forest burning, savanna/grassland burning, and the burning of crop residues in the field after harvest.
However, in terms of chemical composition, it is difficult to distinguish between biomass and biofuels, since they burn the same material. So, in addition to chemical composition, information on spatial distribution and temporal (daily) variation of biomass burning emission is useful in distinguishing between biofuel combustion and biomass burning emissions.
[14] We used the annual-regional emission data from Streets et al. [2003b] to estimate daily emissions for the period from February 26 to May 10, 2001. The regional emission estimation procedure used to estimate biomass burning emissions is described by Streets et al. [2003a] and the seasonal variation of emission distribution is described by Streets et al. [2003b]. These estimates of total biomass burning emissions by regions were further disaggregated to provide daily, gridded emissions. Satellite data (e.g., AVHRR fire count and TOMS Aerosol Index) were used to provide spatial/temporal variability representative of March 2001. The AVHRR sensor on board the NOAA series of polar orbiting satellites provides full global daily coverage (1.1 × 1.1 km to 2.4 × 6.9 km resolution) in five visible and infrared channels, and can be used to detect active fires. The World Fire Web (WFW) uses AVHRR data to map active fire events using a special contextual algorithm. WFW's self-adapting contextual algorithm has the advantage that it provides a means of consistently mapping fires over the whole globe (see World Fire Web (WFW) at http://www.gvm.jrc.it/tem/wfw/wfw.htm). The WFW's fire count, cloud, and satellite coverage data were used to analyze daily fire events. The basic allocation method distributed the emissions within a region in time and space according to the daily fire count statistics normalized by the total number of detected fires. However, the WFW data set has two major problems due to availability of AVHRR information: (1) cloud interference; and (2) satellite coverage.
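The basic allocation step can be sketched as follows. This is a minimal Python illustration, assuming a regional total is split over (day, grid cell) pairs in direct proportion to the fire counts; the function name, the keys, and the counts are hypothetical.

```python
def allocate_by_fire_counts(regional_emission, fire_counts):
    """Split one regional total over (day, cell) in proportion to fire counts.

    fire_counts maps (day, cell) -> number of detected fires.
    """
    total_fires = sum(fire_counts.values())
    if total_fires <= 0:
        raise ValueError("no fires detected in region; allocation undefined")
    return {key: regional_emission * n / total_fires
            for key, n in fire_counts.items()}


# Hypothetical region: 10 units of CO, two days, two grid cells.
counts = {
    ("03-01", "g1"): 3,
    ("03-01", "g2"): 1,
    ("03-02", "g1"): 0,
    ("03-02", "g2"): 6,
}
gridded = allocate_by_fire_counts(10.0, counts)

# Daily totals follow by summing over grid cells.
daily_total = {}
for (day, _cell), value in gridded.items():
    daily_total[day] = daily_total.get(day, 0.0) + value
```

Because the weights are normalized by the total number of fires, the allocation conserves the regional total by construction.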
[15] To account for missing data due to cloud cover and incomplete satellite coverage, we applied a normalization factor to the fire count data. The adjustment is defined in terms of the following quantities: $FC^{adj}_{i,j}$ = adjusted fire count (ith day, jth grid); $FC_{i,j}$ = original fire count (ith day, jth grid); $Sam_{i,j}$ = satellite coverage frequency (ith day, jth grid); $Cld_{i,j}$ = cloud coverage frequency (ith day, jth grid); $DC_{max}$ = maximum data count of each grid; $\cos(lat_j)$ = latitude adjuster for $DC_{max}$ (latitude in radians, jth grid).
[16] However, this process did not account for no-data grid cells or data error conditions. For example, if $Sam_{i,j} \le Cld_{i,j}$ or $Sam_{i,j} - Cld_{i,j} \approx 0$, $FC^{adj}_{i,j}$ cannot be calculated. In this case, we applied lower and upper bounds to the adjusted fire count. In the case $FC_{i,j} = 0$ and $Sam_{i,j} = 0$, we used 3-day moving averages (applied only to zero-fire-count cells) based on $FC^{adj}_{i-1,j}$, the adjusted fire count on day i − 1 in grid j, and $FC^{adj}_{i+1,j}$, the adjusted fire count on day i + 1 in grid j.
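The adjustment logic can be sketched in Python as below. The functional form used here, FC × DC_max × cos(lat) / (Sam − Cld), is only an assumption that is consistent with the quantities listed above; it is not taken from the source, and the bounds, the tolerance `eps`, and the neighbor-averaging rule are likewise hypothetical choices.

```python
import math


def adjust_fire_count(fc, sam, cld, dc_max, lat_rad, lower=0.0, upper=None, eps=1e-6):
    """Scale a raw fire count by the share of usable (cloud-free) overpasses.

    ASSUMED form: fc * dc_max * cos(lat) / (sam - cld). Returns None when
    the usable coverage is zero or negative, so the caller can fall back.
    """
    usable = sam - cld
    if usable <= eps:
        return None
    adjusted = fc * dc_max * math.cos(lat_rad) / usable
    if upper is not None:
        adjusted = min(adjusted, upper)
    return max(adjusted, lower)


def fill_from_neighbors(series):
    """Fill None entries with the mean of the adjacent-day values.

    This is one possible reading of the 3-day moving average; entries with
    no valid neighbor are set to 0.
    """
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            neighbors = [series[k] for k in (i - 1, i + 1)
                         if 0 <= k < len(series) and series[k] is not None]
            filled[i] = sum(neighbors) / len(neighbors) if neighbors else 0.0
    return filled


# Hypothetical grid cell at the equator: 4 fires seen in 5 usable overpasses
# out of a maximum of 10.
adjusted = adjust_fire_count(4, sam=10, cld=5, dc_max=10, lat_rad=0.0)
missing = adjust_fire_count(0, sam=0, cld=0, dc_max=10, lat_rad=0.0)
daily = fill_from_neighbors([2.0, None, 4.0])
```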
[17] If there was trouble in the satellite onboard system or at the receiving station, or if clouds persisted for more than several days, the moving average scheme cannot improve the AVHRR fire count data. In this case we used TOMS-AI data as an additional information source. However, the TOMS-AI data should be used with caution because they detect all (absorbing) aerosols, including dust and manmade smoke. We applied several masks to help filter the information that is not caused by biomass burning. These masks included: (1) the classification of cloud conditions with and without rain using NCEP daily precipitation fields; (2) land cover maps to omit dust interference; and (3) maps of anthropogenic smoke sources including coal mine fires, oil wells, and gas drilling sites. More detailed methodology information is described by Streets et al. [2003b]. Because of the uncertain factors described above, allocation of emissions based on fire counts is imperfect at best. We tested our fire count adjustment methodologies using correlation analysis between regional biomass emissions and sum of fire counts within each region. Results are shown in Figure 3. In the 2001 spring case, the moving average was found to be the best method to adjust fire count because there were many dust events and the dust influence in the TOMS-AI data was a strong interferer. We decided not to use TOMS in the final analysis.
[18] Figure 4 shows the geographical distribution of temporally averaged (monthly) biomass burning CO emissions (upper) and the domain-averaged daily biomass burning CO emissions (lower) estimated using the moving average scheme. The figure shows high emission intensities in Southeast Asia, South Asia, and southern China regions. However, there are also biomass burning emissions in
[19] It is important to emphasize that this analysis is focused on spatial/temporal distributions. The absolute magnitude of emissions remains highly uncertain. The total CO emissions from open burning for March 2001 are 17.7 Tg CO/month. The spatial distribution of CO for March 2001 is similar to that of Heald et al. [2003], but the CO emission amount is lower than theirs (25.4 Tg CO/month). The similarity in spatial distribution comes from the use of the same satellite data source (WFW AVHRR fire count) for the allocation procedure. Differences in the emitted amounts are discussed by Duncan et al. [2003] and Streets et al. [2003b].

Chemical Signatures for Biofuel, Biomass Burning, and Fossil Fuel Emission
[20] Using this detailed emissions inventory developed for Asia, we searched for chemical signatures that characterize emissions from fossil fuel, biofuel, and biomass burning. The chemical mass fractions per unit fuel burned for NMVOC species are presented in Figure 5a. Ethane, ethene, internal alkenes, and acetylene are largely emitted from biofuel use; ethane, ethene, formaldehyde, and ketones are elevated in biomass burning; and propane, butanes, ethane, and formaldehyde arise principally from fossil fuel combustion. As discussed previously, biomass burning is further classified into 3 major sub-sectors in the inventory (i.e., agricultural residue burning, savanna/grassland burning, and forest burning). Forest burning is further divided into tropical forest burning and extra-tropical forest burning. Figure 5b shows the same NMVOC information for the biomass burning sub-classes. From this analysis, it appears that propane and propene can be used to distinguish agricultural residue burning from the other biomass burning emissions, and ethane can be used to identify extra-tropical forest burning. Formaldehyde appears to be a good indicator of savanna burning, but its use is complicated by the fact that it has a photochemical source as well.
[21] Information based on emissions per unit fuel burned is not sufficient to characterize regional emissions. Rather, it is necessary to focus on total amounts emitted by region. The emission intensities by species and by the three source categories are presented in Figure 6. CO, CO2, ethane, ethene, propene, terminal alkenes, and internal alkenes vary greatly between the source categories, but are not dominated by any single category. SO2, CH4, NH3, butanes, pentanes, other alkanes, toluene, xylenes and other aromatics are dominated by fossil fuel emissions. Formaldehyde, other aldehydes, and ketones are dominated by biomass burning. No species is dominated by biofuel emissions. Internal alkenes, acetylene and benzene have the highest contribution from biofuels. This information suggests that it is necessary to manipulate these chemical distributions (e.g., using ratios) to distinguish biofuel emissions from biomass and fossil fuel emissions. For example, the SO2 to internal alkenes ratio may be a good indicator of biofuel burning because SO2 is dominated by fossil fuel combustion and internal alkenes have a relatively high contribution from biofuel use.
[22] The relative importance of the source categories (i.e., biomass burning, biofuel, and fossil fuel) has regional signatures, as shown in Figure 7. For most regions, the major source of SO2 is fossil fuel. For NOx, a small contribution from biomass burning is seen in the southern regions, but very little contribution from biofuels. For CO, biomass burning and biofuel emissions play an important role in most regions, and the portion due to bio-emissions is higher in the less developed regions. Relative to CO, the fraction of bio-emissions for CO2 is smaller in most regions. Even in Southeast Asia and South Asia, the fossil fuel emission portion is greater for CO2 than for CO. This is due to burning efficiency. Low-efficiency combustion like biomass burning produces high CO/CO2 ratios. BC and OC show similar regional distributions. However, the ratio of BC/OC varies substantially. The ratio is high in fossil fuel dominated regions. This suggests that the BC/OC ratio can act to distinguish between fossil fuel and biomass burning.
[23] Emissions of CO by source category are overlaid with CO/CO2 ratios in Figure 8a. The gradient of CO/CO2 and the fraction by source category vary significantly by region. Developed countries such as Japan, South Korea, and Taiwan have the lowest CO/CO2 ratios and the highest fossil fuel emission fraction, while the less-developed countries such as Myanmar, Cambodia, and Vietnam have the highest CO/CO2 ratio and lowest fossil fuel emission fraction. Figure 8b shows the BC/OC emissions for fossil combustion overlaid with the regional BC/OC ratios. The BC/OC ratio for fossil fuel combustion is clearly higher than other source categories regardless of region. The BC/OC values of total emission show regional gradients, with the developed regions Hong Kong, Beijing, Tianjin, Japan, South Korea, and Taiwan having higher BC/OC ratios. Southeast Asia, South Asia, northernmost China, southernmost China, and Mongolia have the lowest values. The values reflect fossil/biofuel energy use described above.

Analysis of Region and Chemical Groups
[24] As discussed above, it remains a challenge to characterize regional source profiles in a definitive manner because emissions from most regions in Asia represent a mixture of biomass, biofuel and fossil fuel signals. Here we attempt to identify the similarities in regional distributions and chemical composition as indicators of these three source categories.

Pearson Correlation Analysis to Identify Relations Among Source Categories
[25] As an initial test, Pearson correlation analysis was performed for chemical species-source-category combinations to look for regional distribution patterns. We analyzed the different combinations of the 27 species in each of fossil, biofuel, and biomass burning source categories (81 × 81 combinations) that were correlated across the regions. With this test, the chemical species-source combinations that have similar regional distributions were identified. We also examined whether there were any significant differences "within" and "between" source categories. The mean correlation coefficients (R) for this analysis (6 out of 3280 Rs) are presented in Table 1. Coefficients for "within" source categories are generally higher than "between" categories. For example, the cross correlation of fossil fuel sources is 0.84. This number indicates how well the fossil fuel emissions are correlated between regions. This correlation is lower than that for biofuel and biomass burning. The analysis also indicates that fossil fuel emissions by region are more strongly correlated with biofuel than with biomass burning. Also, biofuel emissions are more strongly correlated with human energy use activities and/or burning conditions ("in-stove" burning), whereas biomass burning is affected by both human and natural causes, and has significantly different burning conditions ("open" burning). This analysis suggests that biomass burning can be distinguished from biofuel emission in terms of regional distributions.
Figure 8. CO emissions by region and source category overlaid with regional CO/CO2 emission ratios (Figure 8a), and BC/OC emission ratio by region overlaid with regional BC/OC ratios by source category (Figure 8b).
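The "within" and "between" correlation screening of paragraph [25] can be sketched in Python as below. The data layout (a dictionary keyed by species and source category, holding emissions by region) and the numbers are hypothetical; only the Pearson coefficient itself is standard.

```python
from itertools import combinations, product
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def mean_r(profiles, cat_a, cat_b):
    """Mean R over species pairs, within one category or between two."""
    keys_a = [k for k in profiles if k[1] == cat_a]
    keys_b = [k for k in profiles if k[1] == cat_b]
    pairs = combinations(keys_a, 2) if cat_a == cat_b else product(keys_a, keys_b)
    values = [pearson_r(profiles[p], profiles[q]) for p, q in pairs]
    if not values:
        raise ValueError("no pairs to correlate")
    return sum(values) / len(values)


# Hypothetical emissions for four regions, keyed by (species, category).
profiles = {
    ("CO", "fossil"): [1.0, 2.0, 3.0, 4.0],
    ("SO2", "fossil"): [2.0, 4.0, 6.0, 8.0],
    ("CO", "biomass"): [4.0, 3.0, 2.0, 1.0],
}
within_fossil = mean_r(profiles, "fossil", "fossil")
fossil_vs_biomass = mean_r(profiles, "fossil", "biomass")
```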

Cluster Analysis for Chemical Groups
[26] Structures (i.e., taxonomies) within the emission database were identified using hierarchical cluster analysis applied to the regional total emissions (i.e., without source category classification) [Afifi and Clark, 1999; Der and Everitt, 2001; Szilágyi, 1991]. Table 2 and Figure 9 show the results of the chemical cluster groups and dendrogram of the clustering, respectively. In Figure 9, the horizontal axis denotes the linkage distance (in vertical icicle plots, the vertical axis denotes the linkage distance). Thus, for each node in the graph (where a new cluster is formed) we can read off the criterion distance at which the respective elements were linked together into a new single cluster. When the data contain a clear "structure" in terms of objects that are similar to each other, then this structure will often be reflected in the hierarchical tree as distinct branches.
[27] We tried to identify which number of groups would be best for our purpose, using the rescaled distance at which clusters combine. As seen in Figure 9, cluster merging starts to become less active near a distance of 6-7 (about 25-30% of the total rescaled distance), which corresponds to 9 to 8 cluster groups. We decided to cluster the chemical species into 8 groups because this places more hydrocarbon species together with major species (e.g., CO, BC, and OC) in one group. The selected 8 groups were included in subsequent analysis. From this analysis SO2 shows up as a separate group. This reflects the fact that SO2 emissions are dominated by fossil fuel usage. NOx, CO2 and halocarbons were grouped as one, and they relate to the stage of development. CO, BC, OC, ethane, propane, ethene, propene, terminal alkenes, internal alkenes, and acetylene have similar regional distributions and were identified as a group. CH4 and NH3 are identified as a group and represent species not highly related to combustion. The hydrocarbon species were further clustered into four groups that reflect source category contributions.
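To make the grouping step concrete, a small agglomerative clustering sketch in Python follows. The text does not state the linkage rule or distance measure that was used, so average linkage with Euclidean distance is assumed here purely for illustration, and the two-region emission vectors are hypothetical.

```python
from math import dist


def average_linkage(cluster_a, cluster_b, points):
    """Mean Euclidean distance over all cross-cluster pairs of members."""
    total = sum(dist(points[a], points[b]) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical normalized emissions of four species in two regions.
species = {
    "SO2": (1.0, 0.0),
    "CO": (0.0, 1.0),
    "BC": (0.1, 0.9),
    "OC": (0.0, 0.95),
}
groups = agglomerate(species, 2)
```

In this toy case SO2 remains alone while CO, BC, and OC merge, which mirrors the qualitative pattern described in paragraph [27]. Recording the merge distance at each step would give the information plotted in a dendrogram.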
[28] We further analyzed the similarity of the chemical distributions by repeating the analysis using the source categories (not shown as a table or a figure). We found the cluster groups identified were the same as the previous analysis. This tells us that the chemical groups identified can be used as source category classifiers regardless of region.

Cluster Analysis for Regional Groups
[29] Cluster analysis was also performed to classify regions into groups using the chemical composition of total emissions. The results of this analysis reduced the 52 regions into 11 regional groups as shown in Table 3 and Figure 10. Figure 11 shows the emission profiles of the regional groupings. Provinces in central China, northeast China, and North Korea (DPRK) were grouped as regional group 1. In this group, fossil fuel emissions account for about 50%, with biofuel emissions contributing the larger share of the remainder. Regional group 2 includes highly developed regions including Japan, Beijing, Tianjin, Shanghai, and Taiwan. This group has a small fraction of bio-emissions. Regional group 3 covers Fujian, Guangxi, Jiangxi, NeiMongol, Nepal, Yunnan, Indonesia, and India. These regions have a high biomass burning fraction, but still have significant fossil fuel and biofuel emissions (about 40-50%). South China regions including Guangdong and Hainan were identified as group 4. These regions have more of their emissions from fossil fuel, but still have a significant amount of biomass burning. Hong Kong and Brunei form group 6; these regions have high fossil fuel emissions and almost no bio-emissions. Qinghai, Xizang, and Mongolia (group 7) have low emission intensities but a high biomass burning ratio. Most of the southeast Asian countries and some of the south Asian countries (Cambodia, Laos, Malaysia, Myanmar, Philippines, Thailand, Vietnam, Bangladesh, Bhutan, Sri Lanka) form group 9. These regions are dominated by biomass burning emissions. Groups 10 and 11 were Singapore and Pakistan, respectively. Singapore is a highly developed country with almost no bio-emissions while Pakistan is heavily dependent on fossil fuel and biofuels.
Figure 9. Chemical groups identified from hierarchical cluster analysis.
Figure 10. Regional groups identified from hierarchical cluster analysis (with sector).
WOO ET AL.: CONTRIBUTION OF BIO-EMISSIONS
[30] Figure 11 shows the source profiles of 8 major chemical species for the regional groups. The background map is colored by groups, and each regional group has a pie chart that shows the chemical fraction by species (size of each slice) and the total emission amount (size of the pie chart) of the 8 major species. As shown, some groups have obvious differences (e.g., Region 9 and Region 2) that reflect structural differences in the energy/fuel usage, while others are quite similar (e.g., Region 9 and Region 3).

Trace Gas Distribution and Air Mass Characterization
[31] In the previous sections we focused on the estimation of emissions and their chemical and regional characteristics.
Here we apply this information to the TRACE-P aircraft observations to identify which source regions and source categories influence different flight segments.

Receptor Oriented Models
[32] Receptor models are generally contrasted with source models that use pollutant emission rate estimates, meteorological transport, and chemical transformation mechanisms to estimate the contribution of each source to receptor concentrations. The two types of models are complementary, with each having strengths that compensate for the weaknesses of the other [United States Environmental Protection Agency (USEPA), 2001; Watson et al., 1990; Williamson and Dubose, 1983]. In our study, we used two types of receptor models: (1) back-trajectory analysis; and (2) chemical mass balance analysis.
Figure 11. Source profiles from regional groups.
[33] The Chemical Mass Balance (CMB) model is one of several receptor models that have been applied to air resources management. The CMB receptor model [Cheng and Hopke, 1989; Friedlander, 1973; Venkataraman and Friedlander, 1994; Watson, 1984] consists of a solution to linear equations that express each receptor chemical concentration as a linear sum of products of source profile abundances and source contributions. The basic equation for the CMB model is

$$C_i = \sum_{j=1}^{P} \alpha_{ij}\, F_{ij}\, M_j + E_i$$

where $C_i$ = the concentration of species i at a receptor; $\alpha_{ij}$ = fraction of species i from source j not lost in the atmosphere (0 to 1); $F_{ij}$ = the fraction of the emissions of source j that is species i; P = the number of sources (i.e., j = 1, ..., P); $M_j$ = total source contribution at the receptor from source j; $E_i$ = error term.
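A minimal numerical sketch of this linear system in Python is given below. It uses ordinary (unweighted) least squares with all atmospheric loss factors set to 1, which is a deliberate simplification: operational CMB software weights the fit by measurement and profile uncertainties. The profile matrix and contributions are hypothetical.

```python
def solve_linear(a, b):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: source profiles are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def cmb_contributions(profiles, conc):
    """Least-squares source contributions M_j from C_i = sum_j F_ij * M_j.

    profiles[i][j] is F_ij (species i, source j); conc[i] is C_i.
    Assumes alpha_ij = 1 and equal weights for all species.
    """
    n_src = len(profiles[0])
    ata = [[sum(row[p] * row[q] for row in profiles) for q in range(n_src)]
           for p in range(n_src)]
    atb = [sum(row[p] * c for row, c in zip(profiles, conc)) for p in range(n_src)]
    return solve_linear(ata, atb)


# Hypothetical profiles: 4 species (rows) by 3 sources
# (biofuel, biomass burning, fossil fuel).
F = [
    [0.50, 0.60, 0.20],
    [0.10, 0.20, 0.30],
    [0.05, 0.05, 0.40],
    [0.35, 0.15, 0.10],
]
true_m = [2.0, 1.0, 3.0]
C = [sum(f * m for f, m in zip(row, true_m)) for row in F]
estimated = cmb_contributions(F, C)
shares = [m / sum(estimated) for m in estimated]
```

With noise-free synthetic concentrations the fit recovers the contributions used to generate them; `shares` then gives the fractional contribution of each source category.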
[34] For each run of the CMB, the model fits speciated data from a single source or group of sources to a particular receptor (sample). The source profile abundances (i.e., the mass fraction of a chemical or other property in the emissions from each source type) and the receptor concentrations, with appropriate uncertainty estimates, serve as input data to CMB. The output consists of the amount contributed by each source type represented by a profile to the total mass, as well as to each chemical species. CMB calculates values for the contributions from each source and the uncertainties of those values. CMB is applicable to multi-species data sets, the most common of which are chemically characterized PM10, PM2.5, and Volatile Organic Compounds (VOC).
[35] Here we use CMB to estimate the regional and source category emission contributions to assess the impact of fossil, biofuel, and biomass burning on observed trace gas distributions.

Selection of Episodes Using DC8 Flight Measurements
[36] To demonstrate how the emission information can be used to estimate the contribution of biomass and biofuel emissions in specific air masses, we focused on the measurements from the DC-8. Backward trajectories for all DC-8 flights (not shown here), color-coded by modeled CO, were used to select episodes. We selected chemical species for this investigation according to coexistence (i.e., species that are measured and explicitly included in the emissions inventory), availability (completeness of data), and uniqueness (significance of difference for "within" and "between" source categories). After coexistence screening, we selected 18 out of 27 species for further analysis: SOx, NOy, CO, CH4, NH3, CO2, BC, ethane, propane, butanes, pentanes, ethene, propene, ethyne, benzene, toluene, xylenes, and HCHO. Hydrocarbon species are preferred in this analysis because they have relatively uniform magnitude, and do not have to be normalized. Chemical groups 3, 5, and 8 in Table 2 were selected because they each contain hydrocarbon species. It is necessary to select specific species to help distinguish the source categories from the selected chemical groups. These were selected using correlation coefficients and mean difference between groups. Table 4 shows R values and rescaled mean difference between groups. The rescaled mean differences were calculated from the data in Figure 5. The higher the absolute magnitude of the mean difference, the clearer the difference by source categories. A large R indicates how well the mean differences are maintained throughout the regions. So, the chemicals with the higher R and higher absolute mean difference represent clear source category signatures regardless of regions. Since the air masses can travel for several hours or days before they are observed by the DC-8, the fractional change of chemical species from the source regions was also considered in the analysis.
The species with longer lifetime are preferred so that the emission signatures do not have large time dependency. However, species with ''very long'' lifetimes are not optimal because their concentrations may not be sensitive to local emissions. The evaluation of the sensitivity of the findings to reactivity, and the mixing of different aged air masses, requires further study.
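The screening criteria in paragraph [36] can be sketched as a simple filter. This is only an illustration: the species names, R values, mean differences, lifetimes, and every threshold below are hypothetical placeholders, not values taken from Table 4 or Figure 5.

```python
# Hypothetical screening sketch: every number and threshold is a placeholder.
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    r: float              # correlation coefficient across regions
    mean_diff: float      # re-scaled mean difference between source categories
    lifetime_days: float  # approximate atmospheric lifetime


def screen(candidates, min_r=0.7, min_abs_diff=0.3, min_life=2.0, max_life=365.0):
    """Keep species whose source signature is clear (large |mean_diff|),
    stable across regions (large R), and whose lifetime is long enough to
    survive transport yet short enough to stay sensitive to regional emissions."""
    return [
        c.name
        for c in candidates
        if c.r >= min_r
        and abs(c.mean_diff) >= min_abs_diff
        and min_life <= c.lifetime_days <= max_life
    ]


CANDIDATES = [
    Candidate("species_A", 0.90, -0.60, 60.0),   # passes every criterion
    Candidate("species_B", 0.85, 0.50, 0.3),     # too short-lived
    Candidate("species_C", 0.40, 0.70, 10.0),    # signature not stable across regions
    Candidate("species_D", 0.80, 0.05, 10.0),    # weak contrast between categories
    Candidate("species_E", 0.95, 0.45, 3000.0),  # too long-lived
]

selected = screen(CANDIDATES)
print(selected)
```

The hard thresholds are a design convenience for the sketch; the text describes a comparative judgment ("higher R", "longer lifetime") rather than fixed cutoffs.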
[37] After final screening, 4 of the 27 species were selected for inclusion in the analysis: ethane, propane, butanes, and acetylene (ethyne). Propene and HCHO were initially selected but discarded because they are short-lived and, in the case of HCHO, have both primary and photochemical sources. Since the selected chemical species have similar regional distributions and different chemical fractions by sector, they can be used to help identify source categories.
[38] The CMB analysis procedure was as follows. For each observation point, the CMB model used the chemical fractions of these 4 species by source category, together with the measured quantities of these species, to calculate the fractional contributions of biofuel, biomass burning, and fossil fuel emissions. Five DC8 flights, with 16 individual flight segments, were selected for this analysis; these points are shown in Figure 12.
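The mass balance solved at each observation point can be sketched as follows. The text does not state the fitting method, so this sketch assumes an ordinary (unweighted) non-negative least-squares fit of measured_i = sum_j profile_ij * s_j. The source profiles and the synthetic "measurement" are made-up placeholders, not values from the inventory or the DC8 data, and the function names are hypothetical.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial
    pivoting. Returns None if the matrix is (numerically) singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def cmb_nnls(profiles, measured):
    """profiles[i][j]: fraction of species i in source category j.
    measured[i]: observed amount of species i.
    Returns non-negative source contributions minimizing the squared residual.
    With only a few sources, trying every subset of active sources is exact."""
    n_species, n_sources = len(profiles), len(profiles[0])
    best = [0.0] * n_sources
    best_rss = sum(v * v for v in measured)
    for k in range(1, n_sources + 1):
        for subset in combinations(range(n_sources), k):
            # Normal equations restricted to the active sources.
            ata = [[sum(profiles[i][p] * profiles[i][q] for i in range(n_species))
                    for q in subset] for p in subset]
            atb = [sum(profiles[i][p] * measured[i] for i in range(n_species))
                   for p in subset]
            sol = solve_linear(ata, atb)
            if sol is None or min(sol) < 0:
                continue
            s = [0.0] * n_sources
            for idx, j in enumerate(subset):
                s[j] = sol[idx]
            rss = sum((measured[i] - sum(profiles[i][j] * s[j]
                                         for j in range(n_sources))) ** 2
                      for i in range(n_species))
            if rss < best_rss:
                best, best_rss = s, rss
    return best


def source_fractions(contributions):
    """Normalize contributions so they sum to 1 (left unchanged if all zero)."""
    total = sum(contributions)
    return [c / total for c in contributions] if total > 0 else list(contributions)


# Placeholder profiles. Rows: ethane, propane, butanes, acetylene.
# Columns: fossil fuel, biofuel, biomass burning.
PROFILES = [
    [0.15, 0.40, 0.55],
    [0.25, 0.15, 0.20],
    [0.45, 0.05, 0.05],
    [0.15, 0.40, 0.20],
]
TRUE_MIX = [0.6, 0.3, 0.1]  # synthetic mixture used to build a test measurement
MEASURED = [sum(PROFILES[i][j] * TRUE_MIX[j] for j in range(3)) for i in range(4)]

fractions = source_fractions(cmb_nnls(PROFILES, MEASURED))
print([round(f, 3) for f in fractions])
```

Because the synthetic measurement is an exact mixture of the placeholder profiles, the fit recovers the mixture it was built from. For larger problems a library routine such as `scipy.optimize.nnls` could replace the subset enumeration.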

Daily Emission, Meteorological Conditions, and CMB Model Capability Measures for Selected Events
[39] Figure 13 shows the daily biomass burning emission distribution from March 1 to March 21. Daily (3-day interval) emission distributions show differences in spatial distributions as well as emission intensity. Southeast Asia (Region 9) is the only region which shows consistent biomass burning activity over this time period.
[40] Calculated CO mixing ratios and horizontal wind fields for March 4, 10, 13, and 18 at 3 GMT are shown in Figure 14. All left-hand figures are at 438 m altitude and the right-hand ones are at 2797 m. These results show the complex nature of pollution outflow from Asia.
[41] The CMB model was run to quantify source contributions during these events. Detailed contributions by the three source categories for selected flight points are presented in the next section. To evaluate the model's capability, the coefficient of determination (R^2) and the calculated/measured ratio (C/M ratio) were calculated for all selected flight points (Table 5). R^2 represents the fraction of the variance in the measured data that is explained by the model. The values ranged from 0.68 to 0.99, and in most cases the CMB model explained more than 80% of the variability in the observations. The C/M ratios ranged from 0.89 to 1.16, and 88% of the model-calculated values (sum of the 4 selected species) fall within ±10% of the observed values.
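The two capability measures can be sketched as below. This assumes the ordinary definition of the coefficient of determination (1 minus residual sum of squares over total sum of squares) and takes the C/M ratio as the sum of calculated values over the sum of measured values for the 4 species; CMB software may weight these terms differently. The input numbers are hypothetical, not data from Table 5.

```python
def r_squared(measured, calculated):
    """Coefficient of determination, assuming the ordinary unweighted form."""
    mean = sum(measured) / len(measured)
    ss_tot = sum((m - mean) ** 2 for m in measured)
    ss_res = sum((m - c) ** 2 for m, c in zip(measured, calculated))
    return 1.0 - ss_res / ss_tot


def cm_ratio(measured, calculated):
    """Calculated/measured ratio of the summed species."""
    return sum(calculated) / sum(measured)


def within_ten_percent(ratio):
    """True when the calculated total is within +/-10% of the measured total."""
    return abs(ratio - 1.0) <= 0.10


# Hypothetical values for the 4 species at one flight point (arbitrary units).
measured = [100.0, 50.0, 30.0, 60.0]
calculated = [95.0, 55.0, 28.0, 68.0]

r2 = r_squared(measured, calculated)
cm = cm_ratio(measured, calculated)
print(round(r2, 3), round(cm, 3), within_ten_percent(cm))
```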

Source Identification for Selected Flights

4.4.1. DC8 Flight 6 (March 4)
[42] One of the important objectives of TRACE-P was characterizing the role of biomass burning in determining the composition, location, and fluxes of trace gases and aerosols in the Asian outflow. On March 4, the DC8 sampled pollution outflow as it flew into Hong Kong. To aid in the analysis we calculated three-dimensional (3-D) 5-day backward trajectories for each 5-minute segment along the flight paths using calculated meteorology [Tang et al., 2003a, 2003b]. We combined these trajectories with output from our 3-D chemical model (STEM 2K1).
[43] Figure 15a presents 3-D features of the flight track with sampled data points colored by measured acetylene mixing ratios, and emissions. In terms of emissions, the light-blue dots represent fossil fuel, yellow dots biofuel, and red dots biomass. The 2-D trajectories are colored by modeled CO, and the source category contributions (pie chart) for each selected flight segment are presented in Figure 15b. The pie charts represent the source contributions calculated using the CMB model, and the numbers on the side of the pie charts are times (GMT).
[44] The CMB model found a large contribution of fossil fuel emissions (57-60%) for the flight points at 4.6 and 5.2 GMT. Biofuel and biomass burning emissions contributed 26-29% and 11-17%, respectively. For the 6.9 GMT point, the fraction of biomass burning emissions was high (63%). The fractional contributions were quite different for the 7.0 GMT point, which had a high biofuel fraction (67%).
[45] For the first two points above, the 2-D trajectories show that these air masses passed over region 1 (Central China), region 5 (Xinjiang and Shanxi) and region 3 (Shanghai). The CO mixing ratio along the trajectory increased as the air masses passed over China's coastal region. The 3-D trajectories show that the air masses were descending as they traveled off the Asian continental land mass.
[46] As we described before, fossil fuel and biofuel emissions are higher than biomass emissions in region 1,

4.4.2. DC8 Flight 8 (March 9)
[47] Flight 8 is one of the best flights for analyzing the role of biomass burning in the Southeast Asia region because there were large biomass emissions from March 4 to March 9. The outflow in the lower layer in Figure 14c shows relatively weak transport to the east, but the upper layer shows strong eastward outflow in the warm sector of the front.
[48] As shown in Figure 16, the CMB model found a high contribution of biomass burning (about 72%) for the flight points at 3.3 and 3.4 GMT, with biofuels contributing 19-22% and fossil fuel emissions only 6-9%. For the 2.5 GMT point, biofuel (48%) and biomass burning (39%) were both important. The trajectories show that these air masses came from region 9 (Southeast Asia). The only difference between the 3.3/3.4 GMT points and the 2.5 GMT point is the time and location of the trajectory over Southeast Asia. It appears that the 2.5 GMT point encountered a mixture of biofuel emissions and fresh biomass burning emissions, whereas the trajectories for the 3.3/3.4 GMT points were dominated by Southeast Asian biomass emissions.
[49] Ma et al. [2003] identified the contributions of biomass plumes (e.g., biomass and biofuel) using the P-3B flight aerosol measurement data. They used dK+/dSO4^2- slopes from biomass (biofuel + biomass burning) and fossil plumes to analyze the contributions of source categories. P-3B flight 10 and DC8 flight 8 flew similar paths along 20°N latitude on the same day (March 9). The results using dK+/dSO4^2- slopes indicate that the biomass contribution was greater than 80% of the total mass in the measured plume. This result is consistent with our estimate of 87-96%.
[50] Acetonitrile (CH3CN) is a good indicator of biomass combustion sources [Reiner et al., 2001], tetrachloroethene (C2Cl4) and SOy (SO2 + SO4^2-) are good fossil fuel markers [Ma et al., 2003], and CO is a useful marker for general combustion sources. We compared CH3CN/SOy and SOy/CO ratios for the 4.6/5.2 GMT data points of DC8 flight 6 (high fossil fuel source contribution) and for the 2.5/3.3/3.4 GMT data points of DC8 flight 8 (high biomass combustion source contribution). CH3CN/SOy ratios for flight 6 are significantly lower (mean = 0.1) than those for flight 8 (mean = 11.8). In the case of SOy/CO ratios, flight 6 shows higher values (mean = 9.3) than flight 8 (mean = 4.4). C2Cl4/CO ratios for flight 6 were also higher (mean = 0.061) than for flight 8 (mean = 0.026). These ratios are consistent with our estimates using the emission inventory approach. The added value of the emission-based analysis is the ability to distinguish biomass and biofuel contributions.
[51] Biomass burning emission sensitivity tests using the 3-D chemical model (STEM 2K1) were conducted by Tang et al. [2003a, 2003b]. Model calculations with and without biomass burning emissions were compared to observations, using the same emissions inventory as in the CMB analysis. For the two flight measurement points at 3.3-3.4 GMT, the 3-D chemical model sensitivity study showed that biomass burning contributed about 60% to modeled CO levels. This model-based result is consistent with, though lower than, our CMB estimate (71-73%).

4.4.3. DC8 Flight 9 (March 10)
[52] For Flight 9, the DC8 flew from the South China Sea to the Yellow Sea, making this a good flight for analyzing the roles of fossil fuel, biofuel, and biomass burning. By this time the biomass burning in the Southeast Asia region had declined significantly.
[53] As shown in Figure 17, five points along the flight track were selected for analysis. The 3.5 GMT and 5.2 GMT points have strong fossil fuel emission contributions (71-72%), with biomass burning contributing 15-21% and biofuel emissions contributing 7-14%. The 3-D trajectories show that the air masses passed at low altitude over Region 2, which is dominated by fossil fuel emissions. Figure 13 shows biomass burning activity in central China on March 10, which can explain the biomass burning contribution for these two points. The SOy/CO ratios for the aircraft observations were high (mean = 23.0), indicating a strong fossil fuel combustion signal.
[54] For the 3.3 GMT point the trajectories missed the large cities and were influenced more strongly by Region 4, which has a high biofuel contribution. The fraction of biofuel emission was high (61%), with fossil fuel emissions the next largest fraction (29%).
[55] The 7.6 GMT and 7.9 GMT points show a high contribution of biomass burning emission (about 65-69%). Fossil fuel emissions contribute 21-22% and biofuel emissions contribute only 10-13%. The trajectories show that these air masses were descending in the leading edge of the
[56] Point 5.5 GMT on Flight 10 has a biomass burning contribution of 72% and a fossil fuel contribution of 24%. The trajectory shows high concentration of CO in the air mass that passed over central China (region 1). There was biomass burning in the region along the transport path (Figure 18a, yellow point/line).
[57] Flight 12 flew from Hong Kong to Tokyo. On this flight three points (3.5, 5.2, and 5.7 GMT) were selected to examine the gradient in composition among the source categories (Figure 18b). For the 3.5 GMT point, the CMB model shows that biofuel and biomass burning each contribute 45%. The 2-D back trajectories show high concentrations of CO in the source region and reduced concentrations along the trajectory. The 3-D back trajectory shows that the altitude was between 500 m and 2.8 km. Note that the biomass emissions shown in the 3-D figure are for Flight 10. Figure 13f shows strong biomass burning in the Southeast Asia region on March 16, which decreased significantly by March 18 (Figure 4a). There was little biomass burning in the central China region. The wind field and CO at 438 m altitude (Figures 14g and 14h) show a weak high-pressure system over the central China region that mixed biomass and biofuel emissions together. In the upper layer, stronger outflow to the western region is shown. For the 5.2 GMT point, biofuel was dominant (70%), and biomass burning contributed 20%. The 2-D back trajectory shows that the air masses mainly passed over central China (region 1). The 3-D back trajectory shows that the air was in the free troposphere and descended over the coast. Emissions in central China have a strong biofuel component. The wind field and CO in the 2.8 km layer (Figures 14g and 14h) show very strong eastward outflow in the mid-east China region, which carried biofuel and biomass emissions to the east. The 5.7 GMT point shows nearly equal contributions of fossil fuel (46%) and biofuel (42%). The 2-D back trajectory shows air masses passing over the Shandong area and then moving to lower altitude. The lower-layer wind field and CO distribution show a strong influence from megacities (Region 2) and central China.
[58] Biomass burning emission sensitivity tests using a 3-D chemical model (the Models-3 Community Multi-Scale Air Quality modeling system (CMAQ)) were conducted by Zhang et al. [2003] for these flights. The vertical distribution plot of the biomass burning contribution shows that biomass burning plumes contribute about 50% at the 3.5 GMT point, 30% at the 5.2 GMT point, and 20% at the 5.7 GMT point along the DC8 Flight 12 path. This result is consistent with the CMB model result (45%, 20%, and 12% contribution, respectively).
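The comparison in paragraph [58] can be tabulated directly. In the sketch below the percentages are those quoted in the text (the CMAQ values are approximate, "about 50%", and so on). The helper `brute_force_fraction` is an assumed generic formulation of a with/without-source sensitivity test and is not necessarily the exact calculation used by Zhang et al. [2003].

```python
def brute_force_fraction(base, without_source):
    """Fractional contribution of a source from two model runs (assumed form):
    (base - without_source) / base."""
    if base <= 0:
        raise ValueError("base concentration must be positive")
    return (base - without_source) / base


POINTS_GMT = [3.5, 5.2, 5.7]
CMAQ_PCT = [50, 30, 20]  # approximate biomass burning contribution, sensitivity test
CMB_PCT = [45, 20, 12]   # biomass burning contribution, CMB receptor model

# Difference in percentage points (sensitivity test minus CMB) at each point.
DIFF = [a - b for a, b in zip(CMAQ_PCT, CMB_PCT)]

for t, a, b, d in zip(POINTS_GMT, CMAQ_PCT, CMB_PCT, DIFF):
    print(f"{t} GMT: CMAQ ~{a}%, CMB {b}%, difference {d} points")
```

Both approaches show the biomass burning share decreasing from the 3.5 GMT point to the 5.7 GMT point, with the sensitivity test giving the higher value at each point.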

Summary
[59] In this paper we analyzed detailed emission information to identify regional signals, and to characterize and distinguish biomass, biofuel and fossil fuel sources. We then applied this information to help interpret aircraft observations during the NASA TRACE-P experiment in Asia.
[60] Such an analysis requires total emissions as well as estimates of daily spatial and temporal distribution. We estimated daily emissions for the period from
[61] Within the inventory, we searched for chemical signatures that characterize emissions from fossil fuel, biofuel, and biomass burning. This includes analyzing the composition and distribution of each chemical species. Eight major species, 19 NMVOC sub-species, and two ratios (CO/CO2 and BC/OC) were analyzed. We found distinct gradients in the regional CO/CO2 ratio that can be related to the regional economic development. We also found that differences between fossil fuel and biomass burning can be reflected in the BC/OC ratios.
[62] We further explored how the emission distribution and chemical composition could be classified using statistical analysis, including Pearson correlation analysis and cluster analysis. Significant correlations of source categories across regions were found, indicating that the three source sectors could be classified. Fossil fuel and biofuel emissions were shown to be more strongly correlated than biomass burning emissions. This analysis suggests that biomass burning can be distinguished from biofuel emissions in terms of regional distribution. From cluster analysis, 27 chemical species were combined into 8 groups that have similar regional distributions, and 52 regions were clustered into 11 regional groups that have similar chemical composition.
[63] These groups were used in Chemical Mass Balance (CMB) analysis to characterize air masses and to quantify the contribution of the three source categories to the observed species distributions. Five DC8 flights (Flights 6, 8, 9, 10, and 12) with 16 flight segments were selected as outflow events. Four chemical species (ethane, propane, butanes, and acetylene) out of 27 were selected as CMB model input. We used spatially and temporally resolved emission data, backward trajectory analysis, the 3-D chemical source model, and wind field information to interpret the source contributions from the CMB model for each selected outflow event. In general, Asian outflow is a complex mixture of biofuel, biomass, and fossil sources. The CMB was able to estimate the relative contributions from these sources. Flights in the post frontal regions at high latitudes and low altitudes were found to have a high contribution of fossil fuel emissions. Flights in the warm sector of cold fronts were dominated by biomass burning contributions (about 70%). Biofuel contributions were high (about 70%) when air masses came from central China. The receptor model results were shown to be consistent with 3-D chemical model sensitivity studies for two common flight cases.
[64] Our receptor-based approach showed consistency with biomass burning emission sensitivity tests using 3-D chemical "source" models [Tang et al., 2003a, 2003b; Zhang et al., 2003]. In addition, the results were consistent with source indicators. The aerosol species slope (dK+/dSO4^2-) approach using the P-3B flight measurement data indicated similar contributions of bio-emissions (e.g., biomass and biofuel) on the co-flight day (March 9) [Ma et al., 2003a]. The ratios of traditional source tracers, including acetonitrile (CH3CN: biomass combustion sources), tetrachloroethene (C2Cl4: fossil fuel sources), SOy (fossil fuel sources), and CO (general sources), were also analyzed. CH3CN/SOy, SOy/CO, and C2Cl4/CO ratios also supported the CMB analysis for the selected data points. The CMB receptor model, 3-D chemical model, and source tracer ratios showed consistent results for the selected flight cases.
[65] The results presented show how comprehensive emission information during the NASA TRACE-P experiment, when integrated with modeling analysis and measurements, can provide valuable information to help characterize the source contributions in individual air masses. Further work using more detailed aerosol information is planned.
[66] We plan to extend the emissions inventory to include acetonitrile and particulate K, Ca, and Hg, among others, and to repeat this analysis using measurements of aerosol composition, to provide further information to assess the contribution of source categories and further resolve fuels (e.g., coal versus dung). Also, the evaluation of the sensitivity of the chemical species selection to reactivity and to the mixing of different aged air masses requires further study.

Figure 8. CO emissions by region and source category overlaid with regional CO/CO2 emission ratios (Figure 8a), and BC/OC emission ratio by region overlaid with regional BC/OC ratios by source category (Figure 8b).
Figure 11. Source profiles from regional groups.