Methods of Tail Dependence Estimation

Characterization and quantification of climate extremes and their dependencies are fundamental to the study of natural hazards. This chapter reviews various parametric and nonparametric tail dependence coefficient estimators. The tail dependence coefficient describes the dependence (degree of association) between concurrent extremes at different locations. Accurate and reliable knowledge of the spatial characteristics of extremes can help improve existing methods of modeling the occurrence probabilities of extreme events. The chapter also uses two case studies to demonstrate the application of tail dependence analysis.


Introduction
Weather and climate extremes are of particular importance due to their impacts on the economy, environment and human life. Understanding the spatial dependence structure of rare events is fundamental in risk assessment and decision making. Most measures of dependence (e.g., Pearson linear correlation, Spearman (1904) and Kendall (1962) correlation) are designed to describe the dependence of random variables over their entire distributions. The most commonly used measures are not able to correctly capture the dependence of the upper or lower parts (extremes) of the distribution (Kotz and Nadarajah 2000). For example, the Pearson correlation coefficient may not exist for random variables above a certain high (extreme) threshold (De Michele et al. 2003). The Pearson linear correlation describes how well two random variables are linearly correlated with respect to their entire distribution. However, this information cannot be used to understand how the extremes of two random variables are dependent (Serinaldi 2008).
In general, most dependence measures describe the association over the entire distribution of two or more random variables. However, the dependence in the upper part of the distribution may be different from that in the mid-range and/or lower part of the distribution (Embrechts et al. 2002). For example, two random variables may show low dependence between mid-range values but strong association among high (low) values.
In extreme value analysis, the tail dependence coefficient describes the association between the upper or lower part (tail) of two or more random variables (Schmidt 2005; Frahm et al. 2005; Ledford and Tawn 1997). The tail dependence coefficient was first introduced by Sibuya (1959) as the dependence in the upper-right and lower-left quadrants of a bivariate distribution function. Put differently, in a bivariate distribution, the tail dependence refers to the limiting proportion of one variable's margin exceeding a certain threshold given that the other variable's margin has already exceeded the same threshold.
Figure 6.1 explains the concept of tail dependence using an example. The figure displays two pairs of generated random variables with the same linear correlation coefficient of approximately 0.7. Figure 6.1 (left) is simulated using the bivariate normal distribution, and Fig. 6.1 (right) is generated using the bivariate t-distribution. In both cases the simulated variables are transformed to the uniform [0-1] distribution. The variables ($V_1$ and $V_2$) in both panels show a positive Pearson linear correlation coefficient ($\approx 0.7$). However, the upper right quadrant (above the dotted lines) is different in the left and right panels of Fig. 6.1. As shown, in Fig. 6.1 (left) the values in the upper right quadrant (upper tails of $V_1$ and $V_2$) are locally independent, while in Fig. 6.1 (right) the upper tail values seem to be locally correlated (compare the upper right corners of both panels). This indicates that the probability of occurrence of $V_2$ above a given high threshold (e.g., the dotted line in the figure), assuming $V_1$ exceeds the same threshold, is higher in the right panel than in the left panel of Fig. 6.1. For additional information and graphical examples, the interested reader is referred to Fisher and Switzer (2001) and Abberger (2005).
Parametric methods are frequently used for univariate extreme value analysis (e.g., Fisher and Tippett 1928; Gumbel 1958). In multivariate extreme value analysis, on the other hand, the joint probabilities of multiple random variables are considered. This includes the probability of occurrence (risk) of each variable based on its univariate marginal distribution and the dependence among the occurrence probabilities of the variables. Depending on the marginal distribution of random variables and their dependence structure, a parametric model may or may not be sufficient to model the characteristics of the joint extremes. Thus far, many parametric and nonparametric methods have been developed for analysis of tail dependence of random variables (Schmidt and Stadtmüller 2006; Malevergne and Sornette 2004; Poon et al. 2004; Ledford and Tawn 2003; Malevergne and Sornette 2002; Ledford and Tawn 1996).
In the past three decades, most applications of tail dependence models have been in financial risk management and dependence analysis of extremes across assets (e.g., Schmidt (2005), Frahm et al. (2005) and Embrechts et al. (2002) and references therein). In particular, many non-parametric methods have been introduced based on the concept of the empirical copula. Copulas are multivariate distribution functions that can describe the dependence of two or more random variables independent of their marginal distributions. In recent years, multivariate copulas have been applied in numerous hydrologic applications (Nazemi and Elshorbagy 2011; AghaKouchak et al. 2010b; Bárdossy and Li 2008; Serinaldi 2009; Zhang et al. 2008; AghaKouchak et al. 2010a; Favre et al. 2004; De Michele and Salvadori 2003; Kelly and Krzysztofowicz 1997). Renard and Lang (2007) investigated the usefulness of the multivariate normal copula in extreme value analysis. With several case studies, Renard and Lang (2007) demonstrated that the multivariate normal copula can be reasonably used for extreme value analysis. However, the authors acknowledge that low probabilities can be significantly underestimated if asymptotically dependent random variables are described by the normal copula, which is an asymptotically independent model. Serinaldi (2008) investigated the association of rainfall data using the non-parametric Kendall rank correlation. The study suggests a copula-based mixed model for modeling the dependence structure and marginal distributions of variables.
In general, the estimated tail dependence between variables may strongly depend on the choice of model or estimation technique (Frahm et al. 2005). This chapter reviews several parametric and non-parametric tail dependence estimators. Various aspects of modeling tail dependence between variables are discussed in detail, including the choice of extreme value threshold and the advantages and disadvantages of tail dependence models. The chapter is organized into seven sections. After the introduction, the concept of tail dependence is reviewed. The third section is devoted to parametric tail dependence analysis and copulas. In section four, non-parametric methods are discussed. Section five provides additional insight into the choice of extreme value threshold. Section six presents two case studies using the tail dependence estimators, and the last section summarizes the chapter.

Tail Dependence: Basic Definitions
Let $X_1, \ldots, X_n$ be $n$ random variables. The upper tail dependence coefficient ($\lambda_{up}$) for a multivariate distribution with $n$ random variables $X(X_1, \ldots, X_n)$ is defined as (Joe 1997; Melchiori 2003):
$$\lambda_{up} = \lim_{t \to 1^-} \Pr\left(F_1(X_1) > t \mid F_2(X_2) > t, \ldots, F_n(X_n) > t\right) \tag{6.1}$$
In Eq. 6.1, $F_1, \ldots, F_n$ are the cumulative distribution functions for the random variables $X_1, \ldots, X_n$, and $t$ is the extreme value threshold. The equation expresses the probability ($\Pr$) of occurrence of extremes (values above the threshold $t$) in $X_1$, conditioned on the occurrence of extremes (above the same threshold) in $X_2, \ldots, X_n$. Similarly, the lower tail dependence coefficient ($\lambda_{lo}$) is described as:
$$\lambda_{lo} = \lim_{t \to 0^+} \Pr\left(F_1(X_1) \le t \mid F_2(X_2) \le t, \ldots, F_n(X_n) \le t\right) \tag{6.2}$$
The multivariate distribution function is said to be upper tail dependent if $0 < \lambda_{up} \le 1$ and upper (lower) tail independent if $\lambda_{up} = 0$ ($\lambda_{lo} = 0$). For example, in Fig. 6.1 (left), the upper tail coefficient is approximately zero ($\lambda_{up} \approx 0$), while for Fig. 6.1 (right) the upper tail coefficient is approximately 0.8 ($\lambda_{up} \approx 0.8$). For a more comprehensive discussion on the theoretical concept of tail independence, the interested reader is referred to Draisma et al. (2004) and Hüsler and Li (2009).
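Two limiting cases may help fix ideas. This is a worked example added here for the bivariate case, assuming continuous marginals so that $U_i = F_i(X_i)$ is uniform on $[0, 1]$. If $X_1$ and $X_2$ are independent, conditioning on $U_2$ has no effect:
$$\Pr(U_1 > t \mid U_2 > t) = \Pr(U_1 > t) = 1 - t \;\longrightarrow\; 0 \quad \text{as } t \to 1^-$$
so that $\lambda_{up} = 0$. If instead $X_2$ is a strictly increasing function of $X_1$, then $U_1 = U_2$ and
$$\Pr(U_1 > t \mid U_2 > t) = 1 \quad \text{for all } t < 1$$
so that $\lambda_{up} = 1$. Values between these two bounds correspond to partial dependence among the extremes.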

Copulas and Tail Dependence
The upper (lower) tail coefficient can also be defined using copulas. Copulas are joint cumulative distribution functions that describe dependencies among variables independent of their marginals (Joe 1997; Nelsen 2006):
$$C^{(n)}(u_1, \ldots, u_n) = \Pr\left(U_1 \le u_1, \ldots, U_n \le u_n\right) \tag{6.3}$$
where $C^{(n)}$ is an $n$-dimensional joint cumulative distribution function (CDF) of a multivariate random variable $U(U_1, \ldots, U_n)$ whose marginals are uniform, $u \in [0, 1]$. Equation 6.1 can be alternatively presented as:
$$\lambda_{up} = \lim_{u \to 1^-} \Pr\left(X_1 > F_1^{-1}(u) \mid X_2 > F_2^{-1}(u), \ldots, X_n > F_n^{-1}(u)\right) \tag{6.4}$$
where $F_1^{-1}, \ldots, F_n^{-1}$ are the inverse CDFs of the random variables $X_1, \ldots, X_n$. Notice that the conditional probability given in Eq. 6.4 can be described as:
$$\Pr\left(X_1 > F_1^{-1}(u) \mid X_2 > F_2^{-1}(u), \ldots, X_n > F_n^{-1}(u)\right) = \frac{\Pr\left(X_1 > F_1^{-1}(u), \ldots, X_n > F_n^{-1}(u)\right)}{\Pr\left(X_2 > F_2^{-1}(u), \ldots, X_n > F_n^{-1}(u)\right)} \tag{6.5}$$
Substituting Eq. 6.3 into Eq. 6.5 with some algebraic manipulation yields the following formulation for the upper tail (Joe 1997; Frahm et al. 2005), written here for the bivariate case ($n = 2$), which is the case used in the remainder of the chapter:
$$\lambda_{up} = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u} \tag{6.6}$$
Similarly, the lower tail dependence coefficient ($\lambda_{lo}$) can be expressed as (Joe 1997):
$$\lambda_{lo} = \lim_{u \to 0^+} \frac{C(u, u)}{u} \tag{6.7}$$
There are various copula families, which have been developed for different purposes. One major difference between copula families is the upper (lower) tail association they represent. For example, copula families may differ in whether the dependence is strongest (weakest) in the upper or the lower tail of the distribution. In this study, two elliptical copulas, namely a normal copula and a t-copula, as well as a non-Gaussian (v-transformed) copula, are used for simulations. In the following sections, a number of copula families and their tail dependence behavior are discussed.
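As an illustrative check (a minimal sketch added here, not taken from the cited derivations), the standard bivariate copula expressions for the two coefficients, $(1 - 2u + C(u,u))/(1 - u)$ for the upper tail and $C(u,u)/u$ for the lower tail, can be evaluated at finite $u$ for two copulas with known behavior: the independence copula $C(u,v) = uv$ and the comonotone copula $C(u,v) = \min(u,v)$. All function names below are hypothetical.

```python
def upper_tail_ratio(copula, u):
    """Bivariate upper tail expression (1 - 2u + C(u, u)) / (1 - u) at a finite u < 1."""
    return (1.0 - 2.0 * u + copula(u, u)) / (1.0 - u)


def lower_tail_ratio(copula, u):
    """Bivariate lower tail expression C(u, u) / u at a finite u > 0."""
    return copula(u, u) / u


def independence_copula(u, v):
    """Copula of two independent variables."""
    return u * v


def comonotone_copula(u, v):
    """Copula of two perfectly positively dependent variables."""
    return min(u, v)


if __name__ == "__main__":
    # Move u towards 1 to approximate the limit in the upper tail expression.
    for u in (0.9, 0.99, 0.999):
        print(u,
              upper_tail_ratio(independence_copula, u),
              upper_tail_ratio(comonotone_copula, u))
```

For the independence copula the upper tail ratio equals $1 - u$ and therefore vanishes in the limit, whereas for the comonotone copula it equals 1 for every $u$, matching the two extreme values of the coefficient.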

Gaussian Copula
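One of the most commonly used copula families is the multivariate Gaussian (normal) copula, which is obtained from the multivariate normal distribution (Nelsen 2006):
$$C_{\Sigma}(u_1, \ldots, u_n) = F^{n}_{\Sigma}\left(F^{-1}(u_1), \ldots, F^{-1}(u_n)\right) \tag{6.8}$$
Equation 6.8 describes an $n$-dimensional multivariate Gaussian copula with correlation matrix $\Sigma_{n \times n}$ whose density function is:
$$c_{\Sigma}(u_1, \ldots, u_n) = \frac{1}{\sqrt{\det \Sigma}} \exp\left(-\frac{1}{2}\, y^{\top} \left(\Sigma^{-1} - I\right) y\right) \tag{6.9}$$
where:
$F^{n}_{\Sigma}$ = multivariate Gaussian CDF
$F^{-1}$ = inverse of the univariate standard normal CDF
$y(u_i) = F^{-1}(u_i)$
$I$ = identity matrix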
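For two variables, the Gaussian copula density can be written out explicitly. The following is a minimal sketch for the bivariate case only, assuming a correlation matrix with off-diagonal entry `rho`; the function name is hypothetical, and the standard normal quantile function is taken from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho at (u, v) in (0, 1)^2."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    std_normal = NormalDist()
    x = std_normal.inv_cdf(u)  # y(u_1) = F^{-1}(u_1)
    y = std_normal.inv_cdf(v)  # y(u_2) = F^{-1}(u_2)
    det = 1.0 - rho * rho      # determinant of the 2 x 2 correlation matrix
    # Quadratic form y' (Sigma^{-1} - I) y written out for two dimensions
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / det
    return math.exp(-0.5 * quad) / math.sqrt(det)


if __name__ == "__main__":
    print(gaussian_copula_density(0.95, 0.95, 0.7))  # both variables high
    print(gaussian_copula_density(0.95, 0.05, 0.7))  # one high, one low
```

With `rho = 0` the density is identically 1 (the independence copula). With positive `rho` the density is larger where both arguments are high or both are low than where they disagree.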

t-Copula
The t-copula (alternatively known as the Student copula) is an elliptical copula derived from the multivariate Student distribution:
$$C_{\nu,\Sigma}(u_1, \ldots, u_n) = T_{\nu,\Sigma}\left(t_{\nu}^{-1}(u_1), \ldots, t_{\nu}^{-1}(u_n)\right) \tag{6.10}$$
$$T_{\nu,\Sigma}(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} \frac{\Gamma\left(\frac{\nu + n}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) (\pi \nu)^{n/2} \sqrt{\det \Sigma}} \left(1 + \frac{z^{\top} \Sigma^{-1} z}{\nu}\right)^{-(\nu + n)/2} dz \tag{6.11}$$
For $\nu > 2$, the shape matrix ($\Sigma$) is proportional to the correlation matrix (Malevergne and Sornette 2003). The density function of the t-copula is expressed as (Malevergne and Sornette 2003):
$$c(u_1, \ldots, u_n) = \frac{1}{\sqrt{\det \Sigma}} \, \frac{\Gamma\left(\frac{\nu + n}{2}\right) \left[\Gamma\left(\frac{\nu}{2}\right)\right]^{n-1}}{\left[\Gamma\left(\frac{\nu + 1}{2}\right)\right]^{n}} \, \frac{\prod_{k=1}^{n} \left(1 + \frac{y_k^2}{\nu}\right)^{(\nu + 1)/2}}{\left(1 + \frac{y^{\top} \Sigma^{-1} y}{\nu}\right)^{(\nu + n)/2}} \tag{6.12}$$
where:
$y_k = t_{\nu}^{-1}(u_k)$
$t_{\nu}$ = univariate Student distribution with $\nu$ degrees of freedom
Both Gaussian and t-copulas are elliptical; however, they represent different tail dependencies. The Gaussian copula is upper (lower) tail independent ($\lambda_{up} = 0$) regardless of the correlation coefficient among variables (Coles 2001; Renard and Lang 2007; Mikosch and Resnick 2006). This indicates that the extreme values from the different random variables occur independently, even if the random variables exhibit a high correlation. It is worth pointing out that for independent variables, one would expect $\lambda_{up} = 0$. Note that the converse is not necessarily true, meaning that $\lambda_{up} = 0$ does not indicate that the random variables are necessarily independent (Malevergne and Sornette 2003).
Contrary to the Gaussian copula, the t-copula can capture the upper (lower) tail dependence (if it exists) among two or more random variables. The t-copula can capture the asymptotic dependence even when the variables are negatively (inversely) associated (Embrechts et al. 2001). In the t-copula formulation, as $\nu$ increases, the tail dependence weakens, and thus the probability of joint occurrence of extreme values reduces. Figure 6.2a displays the tail behavior of the bivariate t-copula with $\nu = 1$ to $10$. The figure presents occurrences of $x > 0.8$ (percentage) in both random vectors of the bivariate t-copula. One can see that an increase in $\nu$ results in fewer occurrences of extremes (values above the threshold $t$ in Eq. 6.1). Figure 6.2b shows the tail dependence of the Gaussian copula. The occurrence of joint extremes in the Gaussian copula is considerably lower than in the t-copula (threshold: 0.8). This indicates that if strong dependence exists among the extremes of multiple variables, the Gaussian copula may not be suitable for modeling the dependence of extremes. In fact, the multivariate Gaussian distribution is upper (lower) tail independent ($\lambda_{up} = 0$), meaning it cannot be used to describe dependencies of extremes (Coles 2001; Renard and Lang 2007).
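The contrast between the two elliptical copulas can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not the procedure used to produce Fig. 6.2: it draws bivariate normal pairs with correlation `rho`, optionally rescales each pair by a shared chi-square factor to obtain bivariate Student t pairs, transforms both margins to uniform scores by ranking, and reports the percentage of pairs in which both scores exceed a threshold. The function names, sample size and seed are hypothetical choices.

```python
import math
import random


def simulate_pairs(n, rho, nu=None, rng=None):
    """Draw n pairs from a bivariate normal (nu=None) or a bivariate Student t with nu dof."""
    rng = rng or random.Random()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if nu is not None:
            # chi-square(nu) is Gamma(shape=nu/2, scale=2); the same factor rescales
            # both components, which is what produces joint extremes in the t case
            w = rng.gammavariate(nu / 2.0, 2.0) / nu
            x1 /= math.sqrt(w)
            x2 /= math.sqrt(w)
        pairs.append((x1, x2))
    return pairs


def to_uniform(values):
    """Rank-transform a sample to scores in (0, 1) using rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = rank / (n + 1)
    return scores


def joint_exceedance_percent(pairs, threshold=0.8):
    """Percentage of pairs whose two uniform scores both exceed the threshold."""
    u = to_uniform([p[0] for p in pairs])
    v = to_uniform([p[1] for p in pairs])
    hits = sum(1 for a, b in zip(u, v) if a > threshold and b > threshold)
    return 100.0 * hits / len(pairs)


if __name__ == "__main__":
    rng = random.Random(2024)
    print("Gaussian", joint_exceedance_percent(simulate_pairs(20000, 0.7, None, rng)))
    for nu in (1, 3, 10):
        print("t, nu =", nu, joint_exceedance_percent(simulate_pairs(20000, 0.7, nu, rng)))
```

Because the scores are rank-based, the comparison reflects the copula only and not the marginal distributions. The exact percentages depend on the seed and sample size, so they should be read as indicative.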
It is noted that the tail behavior of a multivariate model depends solely on the type of copula and not on the marginal distribution of individual variables.Therefore, in modeling the dependencies of extremes, the choice of copula family plays a significant role.
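The independence of tail behavior from the marginals can be illustrated with ranks: a strictly increasing transformation of either variable changes its marginal distribution but leaves the ranks, and hence any rank-based tail statistic, unchanged. The following minimal sketch uses synthetic data; the helper names and the particular transformations are hypothetical choices made for illustration.

```python
import math
import random


def ranks(values):
    """Ranks 1..m of the observations (continuous data, so ties are not expected)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def joint_top_k(x, y, k):
    """Number of observations that are among the k largest of both variables."""
    m = len(x)
    rx, ry = ranks(x), ranks(y)
    return sum(1 for a, b in zip(rx, ry) if a > m - k and b > m - k)


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
y = [0.7 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Change the marginals with strictly increasing transformations
x_new = [math.exp(xi) for xi in x]  # lognormal margin
y_new = [yi ** 3 for yi in y]       # heavier-tailed margin

print(joint_top_k(x, y, 50), joint_top_k(x_new, y_new, 50))  # the two counts agree
```

The two counts agree because the transformations do not reorder the observations, which is why the choice of copula family, rather than the marginals, governs the modeled dependence of extremes.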

Gumbel-Hougaard Copula
In this section, a heavy upper tailed Archimedean copula (Nelsen 2006), known as the Gumbel-Hougaard copula, is introduced. Unlike many copula families, Archimedean copulas are not derived from standard multivariate distributions. Generally, the multivariate Archimedean copulas can be expressed as:
$$C(u_1, \ldots, u_n) = \Psi^{-1}\left(\sum_{i=1}^{n} \Psi(u_i)\right) \tag{6.13}$$
where $\Psi$ is the so-called generator function. For the Gumbel-Hougaard copula, the generator function can be expressed as:
$$\Psi(x) = \left(-\ln(x)\right)^{\theta} \tag{6.14}$$
where $\theta \ge 1$ is the copula parameter. By substituting Eq. 6.14 into Eq. 6.13, the general formulation of the bivariate Gumbel-Hougaard copula can be described as (Venter 2002):
$$C(u, v) = \exp\left\{-\left[\left(-\ln u\right)^{\theta} + \left(-\ln v\right)^{\theta}\right]^{1/\theta}\right\} \tag{6.15}$$
Equation 6.15 represents the bivariate Gumbel-Hougaard copula with variables $u \in [0, 1]$ and $v \in [0, 1]$. The Gumbel-Hougaard copula is parameterized through a single parameter $\theta$, which is to be estimated from the available data. By substituting Eq. 6.15 into Eq. 6.6, the upper tail dependence coefficient for the Gumbel-Hougaard copula can be derived as $\lambda_{up} = 2 - 2^{1/\theta}$ (Salvadori et al. 2007; Frahm et al. 2005; Nelsen 2006). A discussion on copula parameter estimation techniques is beyond the scope of this chapter. The interested reader is pointed to Genest et al. (1995), Salvadori et al. (2007) and Nelsen (2006) for more detailed discussions on parameter estimation.
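The closed-form coefficient can be checked numerically. The sketch below (function names are hypothetical) implements the bivariate Gumbel-Hougaard copula $C(u,v) = \exp\{-[(-\ln u)^{\theta} + (-\ln v)^{\theta}]^{1/\theta}\}$ and evaluates the bivariate upper tail expression $(1 - 2u + C(u,u))/(1 - u)$ at a value of $u$ close to 1, which should approach $2 - 2^{1/\theta}$.

```python
import math


def gumbel_hougaard(u, v, theta):
    """Bivariate Gumbel-Hougaard copula for u, v in (0, 1) and theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def gh_upper_tail_ratio(u, theta):
    """Finite-u version of the upper tail expression (1 - 2u + C(u, u)) / (1 - u)."""
    return (1.0 - 2.0 * u + gumbel_hougaard(u, u, theta)) / (1.0 - u)


def gh_lambda_up(theta):
    """Closed-form upper tail dependence coefficient 2 - 2^(1/theta)."""
    return 2.0 - 2.0 ** (1.0 / theta)


if __name__ == "__main__":
    for theta in (1.0, 1.5, 2.0, 5.0):
        print(theta, gh_upper_tail_ratio(1.0 - 1e-6, theta), gh_lambda_up(theta))
```

With $\theta = 1$ the copula reduces to independence and the coefficient is zero; as $\theta$ grows the coefficient approaches 1.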

Nonparametric Tail Dependence Methods
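There are different nonparametric tail dependence estimators that can be used to evaluate the significance of tail behavior. The first nonparametric approach introduced here is based on the concept of the empirical copula ($C_m$):
$$C_m(u, v) = F_m\left(F_{1m}^{-1}(u), F_{2m}^{-1}(v)\right) \tag{6.16}$$
where $F_m$ refers to the empirical distribution of the random variables, estimated from a sample of size $m$, and $F_{1m}^{-1}$ and $F_{2m}^{-1}$ are the inverses of its empirical marginals. The tail dependence estimator $\lambda^{(1)}_{up}$ is then expressed as (Schmidt and Stadtmüller 2006):
$$\lambda^{(1)}_{up} = \frac{m}{k}\, C_m\left(\left(1 - \frac{k}{m}, 1\right] \times \left(1 - \frac{k}{m}, 1\right]\right) = \frac{1}{k} \sum_{j=1}^{m} I\left(R_{1j} > m - k,\; R_{2j} > m - k\right) \tag{6.17}$$
where:
$R_{1j}, R_{2j}$ = ranks of the $j$-th observation of the first and second variable
$I$ = indicator function
$k$ = threshold parameter, $0 < k \le m$
It is worth pointing out that Eq. 6.17 is the empirical copula evaluated over the interval $\left(1 - \frac{k}{m}, 1\right] \times \left(1 - \frac{k}{m}, 1\right]$.
$\lambda^{(1)}_{up}$ is derived using the empirical tail-copula introduced by Genest et al. (1995). Huang (1992) suggested another tail dependence measure (here, $\lambda^{(2)}_{up}$) based on empirical copulas and extreme value theory:
$$\lambda^{(2)}_{up} = 2 - \frac{m}{k}\left(1 - C_m\left(1 - \frac{k}{m}, 1 - \frac{k}{m}\right)\right) = 2 - \frac{1}{k} \sum_{j=1}^{m} I\left(R_{1j} > m - k \;\text{or}\; R_{2j} > m - k\right) \tag{6.18}$$
Coles et al. (1999) proposed a different nonparametric tail dependence measure ($\lambda^{(3)}_{up}$) as follows (Frahm et al. 2005):
$$\lambda^{(3)}_{up} = 2 - \frac{\ln C_m\left(\frac{m - k}{m}, \frac{m - k}{m}\right)}{\ln\left(\frac{m - k}{m}\right)}, \quad 0 < k < m \tag{6.19}$$
where
$$C_m\left(\frac{i}{m}, \frac{j}{m}\right) = \frac{1}{m} \sum_{t=1}^{m} I\left(R_{1t} \le i,\; R_{2t} \le j\right) \tag{6.20}$$
Another nonparametric tail dependence estimator (here, $\lambda^{(4)}_{up}$) is proposed by Joe et al. (1992):
$$\lambda^{(4)}_{up} = 2 - \frac{1 - C_m\left(\frac{m - k}{m}, \frac{m - k}{m}\right)}{1 - \frac{m - k}{m}} \tag{6.21}$$
where the term $C_m$ is the empirical copula as described in Eq. 6.20. It should be noted that nonparametric methods of estimating tail dependence are not limited to the ones mentioned above (see for example Capéraà et al. 1997).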
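The four estimators can be computed directly from the ranks of a bivariate sample. The sketch below assumes the rank-based forms commonly given in the literature cited above (Schmidt and Stadtmüller 2006; Frahm et al. 2005): with $m$ observations and $k$ upper order statistics, $\lambda^{(1)}_{up}$ is the number of observations whose two ranks both exceed $m - k$ divided by $k$; $\lambda^{(2)}_{up}$ is 2 minus the number of observations with at least one rank above $m - k$ divided by $k$; $\lambda^{(3)}_{up}$ and $\lambda^{(4)}_{up}$ use the empirical copula on the diagonal, $C_m((m-k)/m, (m-k)/m)$, taken here as the fraction of observations whose two ranks are both at most $m - k$. These forms and all names are assumptions of this sketch.

```python
import math


def ranks(values):
    """Ranks 1..m of the observations (ties are not handled in this sketch)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def tail_dependence_estimators(x, y, k):
    """Return the four upper tail estimators (lam1, lam2, lam3, lam4) for paired samples."""
    m = len(x)
    if len(y) != m or not 0 < k < m:
        raise ValueError("need paired samples and 0 < k < m")
    r1, r2 = ranks(x), ranks(y)
    both = sum(1 for a, b in zip(r1, r2) if a > m - k and b > m - k)
    either = sum(1 for a, b in zip(r1, r2) if a > m - k or b > m - k)
    # Empirical copula on the diagonal: C_m((m - k)/m, (m - k)/m)
    c_diag = sum(1 for a, b in zip(r1, r2) if a <= m - k and b <= m - k) / m
    q = (m - k) / m
    lam1 = both / k
    lam2 = 2.0 - either / k
    # The log-based estimator is undefined when no point lies in the lower-left block
    lam3 = 2.0 - math.log(c_diag) / math.log(q) if c_diag > 0.0 else float("nan")
    lam4 = 2.0 - (1.0 - c_diag) / (1.0 - q)
    return lam1, lam2, lam3, lam4


if __name__ == "__main__":
    x = [float(i) for i in range(100)]
    print(tail_dependence_estimators(x, x, 10))        # perfectly dependent sample
    print(tail_dependence_estimators(x, x[::-1], 10))  # perfectly opposed sample
```

For tie-free data and this rank-based empirical copula, the first, second and fourth estimators return the same number up to rounding, since each variable has exactly $k$ observations above rank $m - k$. The log-based estimator can differ and may be slightly negative in finite samples.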

Extreme Value Threshold
Estimation of the tail dependence coefficient requires assuming a threshold above (or below) which values are considered extreme (Frahm et al. 2005). For tail dependence analysis, one can use a fixed (e.g., above 95 % of data) or variable threshold approach. The so-called optimal threshold approach (Frahm et al. 2005; Peng 1998) uses a kernel plateau-finding algorithm to estimate the optimal extreme value threshold. In this method, the optimal plateau is estimated in four steps: (1) a kernel box with a bandwidth of $b$ (e.g., $b = \mathrm{int}(0.05n)$) is selected; (2) the coefficients that fall within each box are averaged, which results in $n - 2b$ values; (3) for a moving plateau with a length of $l = \sqrt{n - 2b}$, the corresponding values are collected ($\lambda_k, \ldots, \lambda_{k+l-1}$, where $k = 1, \ldots, n - 2b - l + 1$); (4) the optimal plateau (extreme value threshold) is the first one that fulfills the following condition (for a more detailed description, the reader is referred to Frahm et al. (2005) and Peng (1998)):
$$\sum_{i=k+1}^{k+l-1} \left|\lambda_i - \lambda_k\right| \le 2\sigma \tag{6.22}$$
where $\sigma$ is the standard deviation of the $\lambda_i$ values (means of coefficients that fall within each box). The optimal tail dependence coefficient is then expressed as:
$$\lambda^{opt}_{up} = \frac{1}{l} \sum_{i=1}^{l} \lambda_{k+i-1} \tag{6.23}$$
Figure 6.3 displays an example of tail dependence coefficient variability versus the choice of extreme value threshold. In this figure, the box refers to the plateau that satisfies the condition mentioned above (Eq. 6.22) and its corresponding tail dependence coefficient (TDC). Note that the box in Fig. 6.3 is not drawn to scale and is placed on the figure for illustration. For other methods of extreme value threshold estimation, the interested reader is pointed to Tancredi et al. (2006).
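The four steps can be sketched in code as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the input is a sequence of tail dependence estimates indexed by the threshold parameter; each smoothed value is taken as the mean of $2b + 1$ successive estimates; the plateau length is rounded down to an integer; the population standard deviation is used for $\sigma$; and returning zero when no plateau is found is an assumption of this sketch. Indices in the code start at 0 rather than 1, and the function name is hypothetical.

```python
import math
import statistics


def plateau_tdc(estimates, bandwidth=None):
    """Return (k, tdc): start index of the first accepted plateau and its mean value."""
    n = len(estimates)
    b = int(0.05 * n) if bandwidth is None else bandwidth
    if n - 2 * b < 4:
        raise ValueError("too few estimates for the chosen bandwidth")
    # Steps 1-2: box-kernel smoothing, giving n - 2b moving averages
    smoothed = [sum(estimates[i - b:i + b + 1]) / (2 * b + 1) for i in range(b, n - b)]
    # Step 3: plateau length l = floor(sqrt(n - 2b))
    length = int(math.sqrt(n - 2 * b))
    sigma = statistics.pstdev(smoothed)
    # Step 4: accept the first plateau whose summed deviation from its first
    # element does not exceed two standard deviations
    for k in range(len(smoothed) - length + 1):
        plateau = smoothed[k:k + length]
        deviation = sum(abs(value - plateau[0]) for value in plateau[1:])
        if deviation <= 2.0 * sigma:
            return k, sum(plateau) / length
    return None, 0.0  # assumption: no plateau found is treated as no tail dependence


if __name__ == "__main__":
    # Synthetic sequence: estimates decline and then level off at 0.5
    sequence = [1.0 - j / 60.0 for j in range(30)] + [0.5] * 70
    print(plateau_tdc(sequence))
```

In the synthetic example the algorithm skips the declining part of the sequence and stops close to the level section, so the returned coefficient lies near 0.5.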

Case Studies
Case Study A: In the following example, the tail dependence coefficient is used for analysis of the anisotropy of spatial dependencies of extremes. Figure 6.4 displays the rainfall accumulations above the 95 % threshold, normalized to [0-1]. The precipitation data used in this example are from the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN, Sorooshian et al. 2000; Hsu et al. 1997) data set, which is an infrared-based, microwave-adjusted precipitation product. The tail dependence coefficient is estimated using Eq. 6.17 ($\lambda^{(1)}_{up}$) for the two perpendicular directions shown in Fig. 6.4. In this example, the data are smoothed with a moving-average window with a bandwidth of 2 pixels and 80 % overlap. Figure 6.5 indicates that heavy precipitation rates are dependent over longer distances in the horizontal direction (see Fig. 6.5 (left)) than in the vertical direction (see Fig. 6.5 (right)). This indicates that the spatial dependence structure of heavy rainfall rates is asymmetrical, and in this example, heavy rainfall rates are spatially more dependent in the horizontal direction.
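A directional analysis of this kind can be mimicked on synthetic data. The sketch below does not use the PERSIANN data or reproduce Figs. 6.4 and 6.5; it generates artificial fields in which neighbouring pixels are related along rows only, and then applies the rank-based form of the Eq. 6.17 estimator (the share of the $k$ largest values at one pixel that coincide with the $k$ largest values at another) at increasing lags in both directions. All names, the field construction and the parameter values are assumptions made for illustration.

```python
import random


def ranks(values):
    """Ranks 1..m of the observations (continuous data, so ties are not expected)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def lambda1_up(x, y, k):
    """Share of the k largest x values that coincide in time with the k largest y values."""
    m = len(x)
    rx, ry = ranks(x), ranks(y)
    return sum(1 for a, b in zip(rx, ry) if a > m - k and b > m - k) / k


def synthetic_fields(steps, size, window, rng):
    """Synthetic (not observed) fields: white noise summed along rows only, so that
    neighbouring pixels in a row share noise terms while different rows are independent."""
    fields = []
    for _ in range(steps):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(size + window - 1)] for _ in range(size)]
        fields.append([[sum(row[c:c + window]) for c in range(size)] for row in noise])
    return fields


def directional_tdc(fields, lag, k, horizontal=True):
    """Estimator between pixel (0, 0) and the pixel `lag` steps away in one direction."""
    reference = [f[0][0] for f in fields]
    if horizontal:
        other = [f[0][lag] for f in fields]
    else:
        other = [f[lag][0] for f in fields]
    return lambda1_up(reference, other, k)


if __name__ == "__main__":
    fields = synthetic_fields(steps=4000, size=4, window=3, rng=random.Random(1))
    k = int(0.05 * len(fields))  # fixed threshold: top 5 % of each series
    for lag in (1, 2, 3):
        print(lag, directional_tdc(fields, lag, k, True), directional_tdc(fields, lag, k, False))
```

In this construction the estimates decay with lag in the row direction and stay near the level expected for unrelated series in the column direction, which is the kind of directional contrast the case study describes for observed rainfall.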
Case Study B: Many earth science variables provide excellent data for studying spatial dependencies of extreme events. This example demonstrates a nonparametric approach to evaluating the dependence structure of extreme precipitation values over a region in the southern part of the United States. Understanding the spatial dependencies and behavior of extreme precipitation on local, regional and global scales will provide enhanced insight into the spatial dependence structure of precipitation in different regions of the world. This information can then be used to assist in planning and decision making. Two nonparametric tail dependence methods based on empirical copulas are used to derive tail dependence estimators: (1) $\lambda^{(1)}_{up}$ introduced in Eq. 6.17; and (2) $\lambda^{(2)}_{up}$ introduced in Eq. 6.18. In both cases, the tail dependence estimator helps describe the dependence structure, or degree of association, between concurrent rainfall extremes at different locations. High spatial and temporal resolution precipitation data can be analyzed using these nonparametric methods to estimate the tail dependence coefficient and thus describe the dependence structure of extreme precipitation events.
The study region is over Mississippi, which is located in the southern part of the United States, with a latitude of 38N to 35.5N and longitude of 110W to 107.5W; the study period is January 1st, 2005 to December 31st, 2008. The precipitation data used in this example are from the National Centers for Environmental Prediction (NCEP) Stage 4 mosaic multi-sensor national precipitation analysis, which has a 4 km spatial resolution and hourly temporal resolution (Lin and Mitchell 2005).
By solving for $\lambda^{(1)}_{up}$ (Eq. 6.17) and $\lambda^{(2)}_{up}$ (Eq. 6.18), and smoothing the results for display purposes, one can see the dependence structure of precipitation for the data in this region. Similar to the previous example, the data are smoothed using a moving-average window with a bandwidth of 2 pixels and 80 % overlap. Figure 6.6 shows three different percentile groupings of the extreme precipitation events (75th, 90th and 95th percentiles) for $\lambda^{(1)}_{up}$ (Fig. 6.6, top) and $\lambda^{(2)}_{up}$ (Fig. 6.6, bottom). The two estimators are consistent with each other, showing the expected decrease in dependence with distance across each of the percentile groups. The figure indicates that the spatial dependence of extreme convective precipitation decreases rapidly as distance increases. This is expected because, with extreme precipitation events, spatial dependence is typically highest near the region of convective activity, which produces the largest observed precipitation. This is not always the case; for example, extra-tropical cyclones can also produce extreme precipitation events and can have spatial dependence up to a few hundred kilometers.
Figure 6.7 displays the $\lambda^{(1)}_{up}$ (Fig. 6.7, top) and $\lambda^{(2)}_{up}$ (Fig. 6.7, bottom) estimates obtained using the concept of optimal threshold (Eqs. 6.22 and 6.23) to determine the percentile for calculating the tail dependence coefficient, as well as a smoothed version for illustration purposes. Contrary to the commonly used fixed-threshold method (Fig. 6.6), this approach to tail dependence analysis is independent of a fixed (constant) threshold. In other words, this method provides a tail dependence analysis that does not require additional decisions regarding the choice of extreme value threshold (Frahm et al. 2005).

Summary and Conclusions
Extreme events (e.g., floods, droughts, heat waves) have varying spatial dependence structures across different geographic locations, and the understanding of these dependencies is fundamental to risk assessment and decision making. Understanding the different characteristics of extreme events, including spatial dependence, will provide regional planners and policy makers with information and knowledge of extreme events that impact their local and regional communities. Spatial characteristics of extreme events can be investigated through estimation of the tail dependence coefficient for different locations. This chapter reviewed several nonparametric and parametric tail dependence coefficient estimators. The tail dependence coefficient describes the degree of association between concurrent extremes. The presented nonparametric methods are based on the concept of the bivariate empirical copula of random variables, whereas the parametric approach is based on the concept of the Gumbel-Hougaard copula. The chapter also reviewed different aspects of modeling tail dependence, such as the choice of extreme value threshold.
In the first case study, the tail dependence coefficient was used for analysis of the anisotropy of spatial dependencies of extremes. The results showed that the spatial dependence structure of heavy rainfall rates was asymmetrical. In the second example, the tail dependence coefficient was used to investigate spatial dependencies of precipitation extremes on a local scale, revealing the spatial dependence structure of extreme convective precipitation. Extreme precipitation impacts many aspects of human society, such as loss of property and life due to flooding and area destruction from severe storms.
In the case studies, a kernel plateau-finding algorithm was used to obtain tail dependence coefficients, avoiding a fixed extreme value threshold. The results of previous studies (e.g., AghaKouchak et al. 2010c) suggest that using the kernel plateau-finding algorithm for tail dependence is superior to fixed threshold approaches. This method, also known as the optimal threshold approach, can obtain a measure of tail dependence that does not require additional decisions regarding the choice of extreme value threshold.
The tail dependence coefficient has numerous applications, including validation and verification of weather and climate models in reproducing extreme events, analysis of simultaneous extremes, probabilistic assessment of occurrences of extremes, and understanding climate variability. For example, by deriving tail dependence coefficients for simulations of a numerical weather prediction model or a climate model, one can evaluate whether these models produce dependencies as seen in the observations. These approaches are not limited to precipitation but also apply to a wide variety of earth science variables. The study of the tail dependence of extremes on local, regional and global scales can assist in planning and policy making as well as in validating numerical models, thus providing a valuable tool for understanding how extreme events impact society.

Fig. 6.1 (left): Upper tail values (upper right quadrant, above the dotted lines) of $V_1$ and $V_2$ are locally independent; (right): Upper tail values of $V_1$ and $V_2$ seem to be locally correlated (Modified after AghaKouchak et al. (2010c))

Fig. 6.2 Tail dependence behavior from bivariate random variables simulated using (a) t-copula and (b) Gaussian copula. The y-axes show the occurrences of joint extremes (percentage) in dependent random variables simulated using the t-copula and the Gaussian copula (modified after AghaKouchak et al. 2010b)

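Fig. 6.6 Two nonparametric methods for calculating tail dependence coefficients. (Top) $\lambda^{(1)}_{up}$ for three percentile groups, 75th, 90th and 95th, as described in Eq. 6.17; and (Bottom) $\lambda^{(2)}_{up}$ for three percentile groups, 75th, 90th and 95th, as described in Eq. 6.18

Fig. 6.7 Tail dependence analysis using $\lambda^{(1)}_{up}$ and $\lambda^{(2)}_{up}$, based on the concept of optimal threshold introduced in Eqs. 6.22 and 6.23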