Global-scale flow routing using a source-to-sink algorithm

. In this paper, the development and global application of a new approach to large-scale river routing is described. It differs from previous methods by the extent to which the information content of high-resolution global digital elevation models is exploited in a computationally efficient framework. The model transports runoff directly from its source of generation in a land model cell to its sink on a continental margin or in an internally draining basin (and hence is referred to as source-to-sink routing) rather than from land cell to land cell (which we call cell-to-cell routing). It advances the development of earlier source-to-sink models by allowing for spatially distributed flow velocities, attenuation coefficients, and loss parameters. The method presented here has been developed for use in climate system models, with a specific goal of generating hydrographs at continental margins for input into an ocean model. However, the source-to-sink approach is flexible and can be applied at any space-time scale and in a number of other types of large-scale hydrological and Earth system models. Hydrographs for some of the world's major river basins resulting from a global application, as well as hydrographs for the Nile River from a more detailed application, are discussed.


Introduction
Large-scale river-routing algorithms are required for a range of modeling applications in global hydrology and Earth system science, including macroscale hydrological modeling, fully coupled land-ocean-atmosphere climate system modeling, terrestrial biogeochemical and ecosystem modeling, and dynamic global vegetation modeling. Their purpose is to simulate the transport of runoff generated within modeling units on land (e.g., grid cells, watersheds, or other spatially defined units), through river networks, across the landscape, to its associated delivery point on the continental margin in order to produce realistic streamflow hydrographs at any location along the length of the channel. While the representation of the vertical movement of moisture through the soil-vegetation-atmosphere system has received considerable attention in large-scale modeling efforts [Henderson-Sellers et al., 1993], lateral transport has received comparatively less.
Because streamflow is an integral component of the climate system, the absence of river-routing algorithms in Earth system models represents a significant shortcoming. For example, water cycle closure is required in fully coupled climate system models (CSMs) (e.g., to maintain global freshwater and oceanic salinity balances), yet the representation of river transport that effectively closes the cycle is often primitive or nonexistent  [Boville and Gent, 1998]. River routing provides a means for transport of not only water but sediment, nutrients, and biogeochemical materials as well, all of which are important elements of Earth system cycles on land: Hence their river-borne transport requires an appropriate model representation of streamflow routing. From a model verification perspective, streamflow is the most observable and well documented of the land surface fluxes. Because streamflow from continental watersheds represents the outflow of water from vast regions of land surface, it is an expression of the integrated response of the land to all the Earth system processes occurring within basin boundaries. Consequently, large-scale river routing provides an opportunity to better validate model simulations of terrestrial hydrology, ecology, biogeochemistry, and climate.
The availability of high-resolution global digital elevation models (DEMs) like the 5-arc-min TerrainBase [Row et al., 1995] or the 30-arc-sec GTOPO30 [Gesch et al., 1999] offers an important opportunity to advance the development of largescale routing models by enabling progressively more realistic representations of topography and river channel networks. In this regard, Oki and Sud [1998] have developed a global raster river network with a resolution of 1 ø x 1 ø using vector maps and DEM high-resolution data. However, a significant challenge is to determine how the relatively high resolution topographic data can be utilized to enhance river routing algorithms that can be interactively coupled to regional and global climate models, which typically operate at coarser scales (e.g., 0.5 ø at high resolution to 4 ø x 5 ø at low resolution), without significantly increasing the often large computational overhead of the host models.
The purpose of this paper is to describe the development

Ij_i[L 3IT] is the inflow to cell i from each of the immediately upstream neighbor cells j, Oi[L 3/T] is the outflow from cell i, R i [L 3IT] is the runoff generated within cell i, and E i [L 3IT]
are the losses due to evaporation and infiltration from the river network within cell i. Note that the term E i accounts for losses during the routing process after the runoff has been generated.  , 1993, 1994Bonan, 1995]. However, cell-to-cell routing schemes possess some significant disadvantages that, in part, have motivated the present research. Primary among these is that the cell-to-cell method does not account for within-cell routing, which clearly impacts the outflow hydrograph for the coarser grid associated with climate models [see, e.g., Naden, 1993]. In other words, it does not consider the routing of water from the different areas of the cell to the cell outlet, from which it is routed to its downstream cell. In order to capture these important within-cell effects, due to higher-resolution variations in topography that define river networks, cell-to-cell algorithms could simply be run at the higher resolution of currently available global DEMs (e.g., roughly 1 km for GTOPO30). However, because there are more than 90,000 DEM cells in a typical climate model cell (i.e., approximately 2.8 ø resolution), this is computationally infeasible for routing schemes that will be interactively coupled within climate models, given the significant computational overhead. Additionally, running cell-to-cell algorithms at the climate model grid resolution also limits the accuracy with which river networks and watershed boundaries can be delineated, since grid cells have to be assigned entirely to one, and only one, watershed. In short, cell-to-cell methods, in their current form, cannot readily capitalize on the best available global DEMs in order to more accurately route streamflow across the continents into the oceans.

Source-to-Sink Routing
In their effort to develop more efficient routing models that do not spend resources in flow and storage calculations at locations in which the modeler is not interested, researchers developed (what is herein referred to as) source-to-sink models. In the literature, researchers refer to routing from "cells in which runoff is produced" to "watershed outlet cells." In this paper, cells in which runoff is produced are called source cells or just sources, and watershed outlet cells are called sink cells or sinks. Naden [1993] describes a routing method that implicitly includes within-source routing (i.e., routing from the different areas of the source to the source outlet) followed by sourceto-sink routing (i.e., routing from the source outlet to the watershed outlet). Runoff from each source is convolved with a source-specific response function to determine the contribution of each source to streamflow at the sink. Total streamflow at the sink at any time is the sum of the contributions from all sources in the watershed. Although presented as a large-scale hydrologic model, the approach was only tested in a 10,000km 2 watershed. Still, according to the author, the fact that network width functions can be determined from highresolution global DEMs using Geographic Information Systems (GIS) suggests that the method is likely to be applicable globally.
Lohman et al.
[1996] employ a response function for withinsource routing and the linearized St. Venant equation to trans-port streamflow from the source outflow point to the sink. The response function for within-source routing is not specific to each source and is equivalent to Mesa and Mifflin's [1986] hillslope response. It is obtained by deconvolution of the catchment response function with the river network response function, which is conceptually similar to the network width function of Mesa and Mifflin [1986]. The scheme relies heavily on the availability of runoff data globally for the deconvolution of functions and, as such, may only be applicable on a regional basis at present (e.g., Lohman et al. [1996] test the method in the 38,000-km 2 Weser River catchment in Germany, and Lohman et al. [1998] apply the scheme in the 566,000-km 2 Arkansas Red River basin). However, the authors intend to simplify and regionalize grid cell response calculations to increase the domain over which their approach can be applied.
Kite et al. [1994] described the development of a method that consists of source-to-sink routing within each source, followed by cell-to-cell routing in the watershed, and its application in the 1,600,000-km 2 Mackenzie River basin. The watershed was partitioned into a number of sources, or grouped response units (GRUs), and further subdivided into five different land cover types. The distribution of travel times to the GRU outlet was computed for each land cover type by computing the travel time from each pixel to the GRU outlet (calculated as the sum of a to-stream and in-stream travel time). Nonlinear reservoir inflow-storage and outflow-storage relationships were used to transport streamflow from GRU outlets to the next GRU downstream. The approach described in our paper most closely resembles the within-GRU component of Kite et al. [1994] with important modifications and simplifications appropriate for application at the global scale.
Advantages and disadvantages of our source-to-sink model with respect to the previously described routing schemes are as follows. The first advantage is that it uses the river network topology generated within GIS from raster high-resolution spatial data (i.e., GTOPO30 and future higher-resolution DEMs) and analyses it with built-in GIS tools for estimating the response function parameters of the lower-resolution modeling units used in global climate models. A second advantage is that our source-to-sink model accommodates spatially variable streamflow parameters. The source-to-sink models described above require uniform parameters, which is an important shortcoming for large-watershed flow routing. A third advantage of the source-to-sink model presented here is that the streamflow parameters (i.e., flow velocity, attenuation coefficient, and loss parameter) and the resulting hydrographs are scale-independent. This implies that the input parameters do not have to be recalibrated when the resolution of the modeling units changes. It also implies that the accuracy of the model results improve as the resolution of the modeling units increases. This is not necessarily true for cell-to-cell models, as it has been recently demonstrated that cell resolution affects flow attenuation [Olivera et al., 1999]. A final advantage is computational efficiency relative to the cell-to-cell approach. Because streamflow is transported directly from the point of runoff generation to the watershed outlet, routing calculations are not required within all the land cells between these two endpoints. The importance of this computational efficiency increases for larger watersheds and higher-resolution applications. An important disadvantage of the source-to-sink method is that river water stored within each cell is not explicitly tracked in time; consequently, hydrographs are only generated at watershed outlets rather than for every cell on the land surface as in the cell-to-cell method. Finally, because of the linearity of the system, the source-to-sink model requires constant in time, but not necessarily uniform in space, flow parameters (i.e., flow velocity, flow attenuation coefficient, and loss coefficient), so that response functions can be defined for each source.
In our opinions the advantages of the source-to-sink approach outweigh the disadvantages and hence justify an exploration of the feasibility of the approach at the global scale in the context of an eventual coupling with a CSM. The remainder of this paper describes the global implementation of a simple and robust source-to-sink routing scheme. A detailed intercomparison of cell-to-cell and source-to-sink methods is an active area of research by the authors, as are methods for combining the two approaches. Results of these investigations will be presented in future publications.

Methodology
The methodology presented here models the process by which the water moves across the landscape from the location where runoff is generated, the source, to the location along the continental margin where no further flow routing is necessary, the sink. The model assumes that the terrain topography is described by a DEM, and the runoff distribution, in space and time, is known from a separate land surface model. It generates hydrographs at the different sinks.
The model is supported by a Global Spatial Database, in the form of raster and vector GIS maps, specifically developed to work in conjunction with it. Streamflow parameters, such as velocity, attenuation coefficients, and loss parameters, which depend more on local than on overall topographic conditions, are not included in the database and are left for the user to define.

Global Spatial Database
In this section, the general principles followed to develop the Global Spatial Database that supports the model are explained. This Global Spatial Database is available from the authors by request.
A DEM is a sampled array of elevations for ground positions that usually are at regularly spaced intervals. DEM-based terrain analysis consists of extracting from the DEM the relevant information about the flow network for hydrologic modeling. GTOPO30, a set of 30-arc-sec (approximately 1 km) DEMs, was used for the hydrologic analysis because it is the highestresolution raster topographic data available for the entire world. The terrain analysis was carried out by applying standard GIS functions included in commercially available software that supports operations on raster data.
By identifying the downstream pixel adjacent to each terrain pixel, a network that represents the flow paths is produced, and a unique path can be traced from each pixel to the ocean (see Figure 1). In order for water to flow along the landscape without being trapped in terrain depressions, each pixel must have at least one of its eight neighbor pixels at a lower elevation. However, in some cases, terrain models (DEMs) depict fictitious depressions, produced by the interpolation schemes used to describe variation in elevation between raster points [DeVantier and Feldman, 1993]. In general, the existence of depressions in the DEM is explained by (1) fictitious terrain depressions, also called pits, which might be small for most practical purposes but which are critical for hydrologic mod- Sinks are runoff-receiving units and are defined as the areas where the water needs no further routing, because it has either discharged into the ocean or into lakes located in terrain depressions. These areas are located along the continental margin (coastline pixels) and at the lowest elevation of the inland catchments (pour point pixels). After subdividing the entire globe into square boxes by a 3 ø x 3 ø (longitude by latitude) mesh, a resolution typical of ocean circulation models, sinks are defined as those boxes containing coastline segments or inland catchment pour points. Most sinks are located along the coastline, whereas inland catchment pour points are less common. All other boxes are not considered sinks because water flows through them (see Figure 2 for an example of the African continent).
Sources are runoff-producing units and are the areas where the water enters the surface water system as runoff. To define the source polygons, three sets of polygons are intersected: (1) the drainage area of each sink delineated according to the flow network determined previously (see Figure 2 for an example of the African continent), (2) land boxes defined by subdividing the terrain into square boxes by a 0.5 ø x 0.5 ø (longitude by latitude) mesh, and (3) runoff boxes defined by a T42 mesh [Bonan, 1998], with resolution of approximately 2.8 ø x 2.8 ø (longitude by latitude), for which runoff time series are provided from a CSM. A fine 0.5 ø x 0.5 ø mesh is used for land boxes, rather than a coarse T42 mesh, to allow for better definition of the spatial distribution of flow times to the sink, which would otherwise be averaged over the large T42 cells. By doing this, each T42 cell is subdivided into more than 30 land boxes, allowing the routing model to better capture the geo-morphological characteristics of the area. Note that the smaller land boxes are not intended to improve the resolution of the runoff distribution, which is given by a CSM with a T42 resolution. Eventually, as the resolution of climate models increases, the runoff boxes and land boxes will coincide. After intersecting the drainage area of the sinks with the mesh of land boxes and the mesh of runoff boxes, the source polygons are obtained in such a way that for each source its exact location, its runoff time series, and its corresponding sink are known (see Figure 3 for an example in the Nile River basin). In

Flow-Routing Algorithm
Routing of runoff along a flow path, from the point it enters the system (source) to the point where it needs no further routing (sink), is accomplished by applying the advectiondispersion equation to flow path segments and then convolving the responses of the different segments to produce the source response at the sink [Olivera and Maidment, 1999]. The hydrograph for a specific sink is calculated as the sum of the contributions of all the sources that drain to it. Mathematically, this is written as Since the representative Peclet number II• is a measure of the relative importance of advection with respect to hydrodynamic dispersion (the cause of flow attenuation), some important conclusions can be drawn from (6): the relative importance of advection with respect to flow attenuation increases with flow path length and flow velocity and decreases with the attenuation coefficient. Therefore it is expected that flow attenuation plays a lesser role as the watershed size increases. First-order losses (i.e., flow lost proportional to actual flow) can also be included in this approach by multiplying the response function of (4) by a dimensionless loss factor tI)j [Olivera, 1996] (available at www. ce.utexas.edu/prof/olivera/ disstn/abstract.htm) equal to (7) where the loss parameter Xk [1/T] is related to the loss mechanisms such as evaporation to the atmosphere or infiltration to the deep subsurface system. The loss parameter can be understood as the fraction of water lost per unit time and reflects the fact that losses increase with flow time and that distant sources experience more losses than those located close to the sink. Note that although the loss parameter is constant, water losses are not constant and change in time as the flow changes. After accounting for losses, the source response function at the sink is equal to

Application, Results, and Discussion
In this section, we first demonstrate the applicability of the model for use in global routing calculations and then highlight its power and flexibility in an application in the Nile River basin.

Application to the Globe A 10-year daily time series of T42-resolution global runoff data, simulated using the National Center for Atmospheric Research Community Climate Model Version 3 (CCM3) [Kiehl et al., 1998] with an interactive land surface component, the Land Surface Model Version 1 (LSM) [Bonan, 1996], was used as input to our flow-routing model. While it is well known that climate model runoff does not agree well with observations, Bonan [1998] and Kiehl et al. [1998] have shown improvement in simulated hydrology over previous CCM versions in
which land surface conditions were prescribed. For our purposes, use of the CCM3 runoff output is sufficient for demonstrating the features of the routing algorithm, as this is precisely the resolution and quality of runoff data that will be input into the routing scheme when coupled to a CSM.
Routing of CSM runoff data, with specified streamflow parameters, for different basins can be seen in Figure 5. The river basins selected were the Congo in Africa, the Amazon, Ori-

noco, and Parana in South America, the Mackenzie and Mississippi in North America, and the Yangtze and Huang Hu in
Asia. In all cases, v = 0.3 m/s, D = 0, and X = 0, which implies pure translation at a rate of 0.3 m/s and no losses. Note that we have not attempted a global calibration of these streamflow parameters, because validating CCM3 runoff is not the focus of this paper. Rather, our intention is to show the capability of the model to transport streamflow over vast land regions and simulate hydrographs at continental margins for input into ocean circulation models, given a set of streamflow parameters and simulated runoff rates. Figure 5 shows, for all cases, lagging and damping of the flow hydrograph with respect to the runoff, so that peak flows are lower than peak runoff values, low flows are higher than low runoff values, and, in general, the entire flow hydrograph is shifted to the right, along the time axis, with respect to the runoff. The lags are induced by overall travel times to the coast, and the damping is induced by the differential travel times and by the flow attenuation processes. It can also be observed that, depending on the size, shape, and geomorphology of the watershed, lags can be as long as 4 to 5 months and damping can be as much as 40%. Note that these results are also sensitive to the choice of streamflow parameters and the spatial-temporal variability of the runoff input.
Owing to inadequacies of the climate model runoff mentioned above, we are not able to validate the imposed translation and attenuation. However, the predicted hydrographs are certainly reasonable given the realistic nature of the velocity and topographic input parameters. The impact of these delays on ocean circulation and ocean-atmosphere interaction is the subject of ongoing research.
The power and flexibility of the routing scheme, that is, its ability to incorporate terrain spatial variability, are discussed next in an application of the model to the Nile River basin under differing streamflow parameters.

Application to the Nile River Basin
Spatially distributed streamflow parameters, such as flow velocity, attenuation coefficient, and loss coefficient, are difficult to estimate at a global scale when detailed descriptions of the terrain are not available. Assumptions of uniformly distributed parameters, though appealing for their simplicity, may overlook well-known hydrologic processes that take place in some parts of the river basins. For example, floodplain storage, which is significantly important in flat areas such as the Sudd (see Figure 3). This area is close to that reported by Revenga et al. [1998], which was estimated using a different methodology. In this application, emphasis has been placed on the importance of accounting for the spatial variability of the streamflow parameters. For this purpose, three distinct zones were as- It is important to note that in the area-flow time curves discussed above, the relative importance of the sources, given by the total runoff depth and overall losses, was not considered. The effect of spatial variation of runoff depth plus the temporal distribution of runoff will be considered next when generating hydrographs at the ocean. Figure 9 shows 1  The discrepancies between the hydrographs, peak flows, and low flows in Figure 9 show the importance of accounting for localized areas of floodplain storage and, in general, of accounting for spatial variability of the terrain and streamflow parameters. The fact that a 108,000-km 2 localized area of floodplain storage, like the Sudd Marshes, can modify peak flows by as much as 32% in a 3,250,000-km 2 basin shows that models that do not support spatially distributed streamflow parameters are limited in their applicability.

Conclusions
A model for runoff routing at a global, continental, or largewatershed scale has been developed. The method presented here has been developed for use in CSMs, with a specific goal of generating hydrographs at continental margins for input into an ocean model so that the importance of large-scale river transport can be properly assessed. It models the transfer of runoff from its sources of generation (where it enters the surface water system) to its sinks at either the continental margin or an inland catchment pour point (where no further routing is necessary) and therefore is referred to as a sourceto-sink model.  (18). Moreover, when an insignificant kinetic species is retained, the concentration of that species is small, due to the cutoff criteria. Thus the contribution of the retained kinetic species to changes in chemical speciation is small, and the rate of reaction at this point is quite slow because of (15).
There are two precipitated kinetic species (Fe(OHh(s) and MnOz(s)) associated with the reaction network used in this effort (see Table 2). Since the precipitated species are immo bile, they do not need a transport equation. However, the precipitated species may be composed of components that are not present in all compartments (see Table 3). In this paper the component Mn is only present in compartments 2 and 3 (for Mn02(s) and associated aqueous complexes), and the compo nent Fe is only present in compartment 3 (for Fe(OHh(s) and associated aqueous complexes). Again, there is the potential for discontinuities in transport equations that could be handled with moving boundaries. This time, however, it is (16) that could encounter a discontinuity at compartment boundaries, because the transported entities (i.e., dissolved Mn and Fe) are components, not kinetic species. A strategy similar to that used for aqueous kinetic species is employed. Since the incoming water is pristine (i.e., in compartment 1), Mn and Fe are defined as having a dissolved concentration of zero at the upstream boundary but an initial condition throughout the spatial domain that consists solely of precipitated concentra tion. This allows T M n and T Fc to be continuous across the spatial domain because of (17).
2.2.3. Linkage. Inspection of (2) and (3) reveals that there is yet another potential for discontinuities in Tj during the course of redox zone development. This type of disconti nuity cannot be handled in a manner similar to the kinetic species because it occurs with components that are present in all compartments. The cause of this discontinuity is that some components may experience a large change in value at com partmental boundaries. Clearly, such discontinuities would wreak havoc on most numerical methods for solute transport.
An alternative formulation for the compartmentalized ap proach that circumvents this problem is i=l i=l n = 1, k; j = 1, Nc• When simulating perturbed redox systems, it can usually be assumed that the dissolved redox species are either initially zero or initially the maximum value for that particular setting.
When (19) is used to account for compartmental switching, one simply transports only T] for components subject to large changes between compartments (in this paper H+ and cOj-). This alternative formulation for the compartmentalized ap proach eliminates the discontinuities associated with (2)-(4); Tj is calculated from (19) and then passed to the chemical module.

2.2.4.
Flowchart of COMPTRAN. COMPTRAN was constructed with extensively modified versions of the geo chemical code KEMOD [Yeh et al., 1993[Yeh et al., , 1998  In addition to performing geochemical and solute transport calculations, COMPTRAN essentially works by making two key decisions. The first decision to be made is to assign each node to the correct compartment. This decision is made by the subroutine switch.f (Figure 2a). The sequence of operations that is followed each time ocspitf is called by gm2d.f is illus trated in the flowchart in Figure 2b. The subroutine ocspitf loops over the entire set of nodes in the boundary value prob lem and calculates a "batch" geochemical solution at each node. The second major decision is made within the geochemi cal sequence of operations. A decision is made within the subroutine truth.f as to whether the lower-energy redox reac tion within a given compartment is behaving in a kinetically limited or thermodynamically limited manner. This decision is made by determining whether the concentration of the lower energy redox product is lower in the kinetic step or in the thermodynamic step (see Figure 1): (1) If the concentration of the lower-energy redox product is lower in the kinetic step, then the reaction is kinetically limited.
(2) If the concentration of the lower-energy redox product is lower in the thermody namic step, then the reaction is being limited by free energy constraints.

Elimination of Transport Equations
Depending on how the geochemical problem is formulated, it may not be necessary to write a transport equation (i.e.,