Approximating the Distribution of Pareto Sums
- Author(s): Zaliapin, Ilya
- Kagan, Yan Y
- Schoenberg, Frederic P.
- et al.
Heavy-tailed random variables (rvs) have proven to be an essential element in modeling a wide variety of natural and human-induced processes, and sums of heavy-tailed rvs are a particularly important construct in such models. Oriented toward both geophysical and statistical audiences, this paper discusses the appearance of the Pareto law in seismology and addresses the problem of statistically approximating sums of independent rvs with the common Pareto distribution F(x) = 1 - x^(-α) for 1/2 < α < 2. Such variables have an infinite second moment, which prevents one from using the Central Limit Theorem to solve the problem. This paper presents five approximation techniques for Pareto sums and discusses their respective accuracy. The main focus is on the median and the upper and lower quantiles of the sum's distribution. Two of the proposed approximations are based on the Generalized Central Limit Theorem, which establishes the general limit for sums of independent identically distributed rvs in terms of stable distributions; these approximations work well for large numbers of summands. Another approximation, which replaces the sum with its maximal summand, has less than 10% relative error for the upper quantiles when α < 1. A more elaborate approach considers the two largest observations separately from the rest of the observations, and yields a relative error under 1% for the upper quantiles and less than 5% for the median. The last approximation is specially tailored for the lower quantiles, and involves reducing the non-Gaussian problem to its Gaussian equivalent; it too yields errors of less than 1%. Approximation of the observed cumulative seismic moment in California illustrates the developed methods.
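The maximal-summand approximation mentioned in the abstract can be explored numerically. The following is a minimal sketch, not the authors' procedure: it assumes the Pareto law has support x ≥ 1 (so that F(1) = 0), draws samples by inverse-transform sampling, and compares a Monte Carlo quantile of the sum with the closed-form quantile of the maximum of n independent summands. The function names, the seed, and the parameter values are illustrative assumptions.

```python
import numpy as np


def pareto_sample(alpha, size, rng):
    """Draw Pareto variates with F(x) = 1 - x**(-alpha), assuming x >= 1."""
    # Inverse transform: if U is uniform on (0, 1], then U**(-1/alpha)
    # has the required distribution.
    u = 1.0 - rng.random(size)  # values in (0, 1], avoids division by zero
    return u ** (-1.0 / alpha)


def max_quantile(alpha, n, q):
    """q-quantile of the maximum of n independent Pareto summands."""
    # The maximum has distribution F(x)**n; solve F(x)**n = q for x.
    return (1.0 - q ** (1.0 / n)) ** (-1.0 / alpha)


def simulated_sum_quantile(alpha, n, q, trials, seed=0):
    """Monte Carlo estimate of the q-quantile of a sum of n summands."""
    rng = np.random.default_rng(seed)
    sums = pareto_sample(alpha, (trials, n), rng).sum(axis=1)
    return float(np.quantile(sums, q))


if __name__ == "__main__":
    alpha, n, q = 0.7, 10, 0.98  # illustrative values with alpha < 1
    s = simulated_sum_quantile(alpha, n, q, trials=200_000)
    m = max_quantile(alpha, n, q)
    print(f"sum quantile (simulated): {s:.1f}")
    print(f"max quantile (exact):     {m:.1f}")
    print(f"relative difference:      {(s - m) / s:.3f}")
```

Because the sum is never smaller than its largest summand, the quantile of the maximum is a lower bound on the true quantile of the sum. The printed relative difference is itself a noisy estimate, since upper quantiles of heavy-tailed sums converge slowly under simulation.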