A Solution to the Crucial Problem of Population Degeneration in High-Dimensional Evolutionary Optimization

Three popular evolutionary optimization algorithms are tested on high-dimensional benchmark functions. An important phenomenon responsible for many failures, “population degeneration,” is discovered: through evolution, the population of searching particles degenerates into a subspace of the search space, and the global optimum is excluded from that subspace. Subsequently, the search tends to be confined to this subspace and eventually misses the global optimum. Principal components analysis (PCA) is introduced to detect population degeneration and to remedy its adverse effects. The experimental results reveal that an algorithm's efficacy and efficiency are closely related to the population degeneration phenomenon. Guidelines for improving evolutionary algorithms for high-dimensional global optimization are presented. An application to highly nonlinear hydrological models demonstrates the efficacy of the improved evolutionary algorithms in solving complex practical problems.

Wei Chu, Xiaogang Gao, and Soroosh Sorooshian

Index Terms: Differential evolution, evolutionary computation, high-dimensional, particle swarm optimizer, principal components analysis, shuffled complex evolution (SCE-UA).

I. INTRODUCTION
Evolutionary optimization algorithms for unconstrained optimization attempt to discover the value and location of the global optimum of an $n$-dimensional function $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^n$. In the search, the solution vector is derived solely from the values of the function, without the use of the function's derivatives. The objective function can be either a mathematical expression or the computer program of a physical model, which gives these optimization algorithms a wide variety of applications in science, engineering, and business [1]-[10]. Most current evolutionary algorithms feature particle-based evolution processes: the algorithm first randomly distributes a "population" of "particles" (or members, points) $\mathbf{x}_i$, with $i = 1, \ldots, N$, where $N$ is the size of the population (number of samples), to sample the objective function in the search space, and then sequentially evolves the population by substituting existing particles (as parents) with new ones having better function values (as offspring).
After many generations of evolution, the population is expected (but not guaranteed) to have at least one particle reaching the global optimum.
Recently, a large number of evolutionary algorithms and algorithm modifications have been developed using genetic-, swarm-, annealing-, and hybrid-based mechanisms to enhance the reliability and efficiency of the evolution processes. Supported by powerful computational capability, these algorithms have shown good performance in global optimization and have become increasingly popular. However, even with these algorithms, global optimization can still be problematic or computationally demanding for high-dimensional cases. High-dimensional problems challenge evolutionary algorithms because the feasible space grows exponentially with dimensionality, while it is impractical to increase the population size in the same manner. Many existing evolutionary algorithms do not pay enough attention to this aspect and sacrifice efficacy by implementing searches of high computational demand. Therefore, understanding the mechanisms that cause success (or failure) and efficiency (or inefficiency) of searching in high-dimensional spaces is a prerequisite to overcoming the theoretical barriers. In this study, we use high-dimensional benchmark functions to test three popular evolutionary algorithms: the Shuffled Complex Evolution (SCE-UA), an algorithm based on the simplex scheme; the Particle Swarm Optimizer (PSO), an algorithm simulating the social activity of animal groups; and Differential Evolution (DE), an algorithm using genetic hybridization.
Although the search strategies of these algorithms originate from distinct scientific fields, we show that they all possess two opposing functionalities: 1) exploitation, the process of making particles converge to the global optimum, and 2) exploration, the process of enabling particles to explore the feasible space of parameters. The exploitation process tends to drive particles to the most prominent region in the feasible space and speeds up the search. At the same time, this process increases the risk of missing the global optimum. The exploration process attempts to overcome this problem by diversifying the search directions. This increases the robustness of the search, but at the expense of slower convergence. Detailed analysis of the selected algorithms' processes and performance on high-dimensional benchmark functions revealed that the barriers that keep each algorithm from successful and efficient optimization on high-dimensional problems are closely related to the implementations of these two functionalities. Balancing them is a key aspect of constructing a successful algorithm.
In this paper, we report the results of our study, which reveals that in high-dimensional searches some exploitation processes tend to drive particles into a subspace of the feasible space. In other words, at certain stages during the evolution, all the particles in the population move into a subspace that has smaller dimensionality than the search space. Subsequently, the search tends to be confined to the subspace and eventually fails if the global optimum is excluded from the subspace. We refer to this phenomenon as "population degeneration." Principal components analysis (PCA) [11] of the particle population is used to detect the occurrence of degeneration. Furthermore, the principal components (PCs) can provide information useful for remedying the adverse effects of population degeneration.

II. SELECTED ALGORITHMS
Three of the most popular evolutionary optimization algorithms are investigated in this study. The pseudocode for the selected algorithms is summarized in Table I. The algorithms all start from an initial population (first generation) of particles randomly sampled within the search range in the feasible space. Afterwards, offspring replacements with different mechanisms are initiated in order to evolve the population.

A. Shuffled Complex Evolution (SCE-UA)
The SCE-UA algorithm [12], [13] employs the Nelder-Mead simplex scheme [14], [15] to make particle replacements for population evolution. An $n$-simplex is an $n$-dimensional "pyramid" (convex hull) in $\mathbb{R}^n$ with $n+1$ affinely independent vertices and $n(n+1)/2$ edges (e.g., a 1-D simplex is a line segment, a 2-D simplex is a triangle, and a 3-D simplex is a tetrahedron). Starting with a simplex located inside the search domain, the scheme first ranks the vertices according to their function values, so that $f(\mathbf{x}_1) \le f(\mathbf{x}_2) \le \cdots \le f(\mathbf{x}_{n+1})$; then it attempts to replace the worst vertex $\mathbf{x}_{n+1}$ with a new point on the line between $\mathbf{x}_{n+1}$ and the centroid $\mathbf{g}$ of the remaining vertices. A parameter $\alpha$ determines the location of the new point on the line, and several values (in SCE-UA, one for reflection and one for contraction) are tried to find a new point $\mathbf{x}_{new}$ having $f(\mathbf{x}_{new}) < f(\mathbf{x}_{n+1})$. If successful, the worst vertex is replaced by the new point; otherwise, $\mathbf{x}_{n+1}$ is replaced by a random point (mutation) in the search domain. In this scheme, information from a simplex's vertices on the response surface of the benchmark function is used to approximate the direction of steepest descent. Driven by this direction, the simplex can effectively find a better offspring.

To prevent simplexes from converging to local optima and to enhance the chances of finding the global optimum, SCE-UA utilizes the shuffled-complex process. At the start, the particle population is partitioned into complexes. Each complex includes a fixed number of particles and evolves independently using the simplex method. At the beginning of every iteration of evolution, a simplex is formed within each complex from $n+1$ randomly selected particles of that complex in order to perform the simplex search. Once a worst vertex is replaced, the simplex is broken down and its particles return to the complex. We refer to this procedure as the simplex subroutine. After each complex completes a certain number of simplex subroutines (fixed in this study), the particles in all complexes are mixed, sorted, and re-divided into new complexes through the shuffling procedure (see Table I), and one iteration ends. By shuffling, a new complex is likely to contain particles from all the previous complexes, and hence to carry information about the function over the region covered by the entire population instead of a single complex. This gathering of information from all the previous complexes gives the new complexes a better chance of moving towards the global optimum.
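The worst-vertex replacement described above can be sketched as follows. This is a minimal illustration, not the SCE-UA reference code: the function name `simplex_step`, the two trial points (reflection through the centroid, then the midpoint contraction), and the in-bounds check are assumptions made for the example.

```python
import numpy as np

def simplex_step(simplex, f, lower, upper, rng):
    """One worst-vertex replacement. `simplex` has one vertex per row;
    `lower` and `upper` are arrays giving the search domain."""
    values = np.array([f(v) for v in simplex])
    order = np.argsort(values)               # best vertex first, worst last
    simplex, values = simplex[order], values[order]
    worst, f_worst = simplex[-1], values[-1]
    centroid = simplex[:-1].mean(axis=0)     # centroid of all vertices except the worst

    # Assumed trial points: reflection of the worst vertex through the
    # centroid, then contraction halfway between centroid and worst vertex.
    for candidate in (2.0 * centroid - worst, 0.5 * (centroid + worst)):
        inside = np.all((candidate >= lower) & (candidate <= upper))
        if inside and f(candidate) < f_worst:
            simplex[-1] = candidate
            return simplex

    # Neither trial improved on the worst vertex: mutate with a random point.
    simplex[-1] = rng.uniform(lower, upper)
    return simplex
```

Both trial points are affine combinations of the existing vertices, so unless the mutation branch fires, the new vertex stays in the affine hull of its parents. This is the property that matters later when the population is examined for lost dimensions.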

B. Particle Swarm Optimizer (PSO)
The concept of PSO [16], [17] originates from the computer simulation of the social activities of a bird flock or fish school, in which individual members can benefit from their own experience and from other members' best discoveries during their search for food, mates, or better living conditions. The algorithm starts from a randomly selected initial population and evolves individual particles successively. Each particle's position is updated by trying a displacement (velocity) based on three sources: (1) the particle's velocity in the previous evolution ($\mathbf{v}_i$), (2) the particle's best ever position ($\mathbf{p}_i$), and (3) the population's current best position ($\mathbf{p}_g$):

$$\mathbf{v}_i \leftarrow w\,\mathbf{v}_i + c_1\,\mathbf{r}_1 \otimes (\mathbf{p}_i - \mathbf{x}_i) + c_2\,\mathbf{r}_2 \otimes (\mathbf{p}_g - \mathbf{x}_i), \qquad \mathbf{x}_i \leftarrow \mathbf{x}_i + \mathbf{v}_i$$

where $w$, $c_1$, and $c_2$ are the significance coefficients, $\mathbf{r}_1$ and $\mathbf{r}_2$ are random vectors with components uniformly generated in $[0, 1]$, and $\mathbf{a} \otimes \mathbf{b}$ is defined as producing a new vector having components $a_j b_j$. If $f(\mathbf{x}_i) < f(\mathbf{p}_i)$, then $\mathbf{p}_i$ is replaced by $\mathbf{x}_i$; otherwise no replacement occurs. A particle's velocity is bounded by the maximum velocity $V_{max}$. In the algorithm, taking the displacement with reference to the best point in the population gives a particle a better chance of finding an offspring and drives the particles towards convergence. In contrast, the particle movements are strongly diversified by the historical and current information ($\mathbf{p}_i$ and $\mathbf{v}_i$) from each individual particle's path and by the random vectors that alter the component magnitudes of the displacements.
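A single particle update built from the three sources named above can be sketched as follows. The function name `pso_step` and its argument names are hypothetical, and the sketch uses the common textbook form of the update (component-wise uniform random factors, velocity clipped to a maximum), which is assumed rather than taken from Table I.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w, c1, c2, v_max, rng):
    """Return the new position and velocity of one particle."""
    r1 = rng.uniform(0.0, 1.0, size=x.shape)   # random vector for the personal-best term
    r2 = rng.uniform(0.0, 1.0, size=x.shape)   # random vector for the population-best term
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v_new = np.clip(v_new, -v_max, v_max)      # bound each velocity component
    return x + v_new, v_new
```

Because `r1` and `r2` scale each component independently, the displacement is generally not a plain linear combination of the three source vectors, which is one reason the swarm diversifies more than a simplex-based search.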

C. Differential Evolution (DE)
The DE algorithm [18] uses the so-called "greedy" scheme to generate offspring. Similar to the PSO algorithm, it attempts to replace the population's particles one by one with an offspring. In the DE algorithm, for a particle $\mathbf{x}_i$, the candidate offspring $\mathbf{u}$ has hybridized components from $\mathbf{x}_i$ and another random vector $\mathbf{v}$ according to a randomly generated number $r_j$:

$$\mathbf{v} = \mathbf{x}_{r_1} + F\,(\mathbf{x}_{r_2} - \mathbf{x}_{r_3}), \qquad u_j = \begin{cases} v_j, & \text{if } r_j \le CR \\ x_{i,j}, & \text{otherwise} \end{cases}$$

where $CR$ is the crossover constant, $F$ is the scaling factor, $j$ is the component index, and $\mathbf{x}_{r_1}$, $\mathbf{x}_{r_2}$, $\mathbf{x}_{r_3}$ are particles randomly selected from the population. If $f(\mathbf{u}) < f(\mathbf{x}_i)$, then $\mathbf{x}_i$ is replaced by $\mathbf{u}$; otherwise the parent survives to the next evolution.
In the DE algorithm, the use of the random vector $\mathbf{v}$ is a loose process that tends to generate offspring along random directions. The hybridization process also helps the population evolve randomly.

SCE-UA, PSO, and DE are three very typical methods and possess distinct search strategies. The SCE-UA algorithm is a steepest descent method, employing particles (i.e., the simplex vertices) to approximate the slope of the objective function's response surface in order to speed up the evolution. On the other hand, the DE algorithm is a method relying heavily on randomness, such that the discovery of an offspring relies on a randomly selected set of three particles from the population. As for PSO, its search strategy falls somewhere between SCE-UA and DE. In SCE-UA, the divergence process (the shuffled-complex procedure) operates at the particle level, while in both the PSO and DE algorithms the particles are broken into components for more diverse searches. These differences in search strategy significantly affect the algorithms' global optimization performance on high-dimensional problems, as demonstrated through the benchmark function tests.
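The offspring generation and greedy replacement described above can be sketched as follows. The sketch assumes the widely used DE/rand/1/bin variant (three distinct random particles, binomial crossover with one forced mutant component); the function names `de_offspring` and `de_select` are hypothetical, and the forced component is a standard detail that the text does not mention.

```python
import numpy as np

def de_offspring(pop, i, F, CR, rng):
    """Build a candidate offspring for particle i (rows of `pop` are particles)."""
    n_particles, n_dim = pop.shape
    others = [k for k in range(n_particles) if k != i]
    r1, r2, r3 = rng.choice(others, size=3, replace=False)
    mutant = pop[r1] + F * (pop[r2] - pop[r3])   # random base plus scaled difference
    cross = rng.uniform(size=n_dim) < CR         # component-wise hybridization mask
    cross[rng.integers(n_dim)] = True            # assumed: keep at least one mutant component
    return np.where(cross, mutant, pop[i])

def de_select(parent, trial, f):
    """Greedy scheme: the offspring replaces the parent only if it is better."""
    return trial if f(trial) < f(parent) else parent
```

The component-wise mask is what breaks particles "into components": the offspring mixes coordinates of the parent and the mutant, so it is not confined to the line through them.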

III. ALGORITHM EFFICIENCY AND ROBUSTNESS

D. Benchmark Functions
We chose two benchmark test functions that represent two major types of test functions: classical and compositional. The first one is the popular Griewank function. Like many classical benchmark functions, the Griewank function has a regular and symmetrical shape. As shown in Table II, this function is constructed with two components: 1) an $n$-D sphere function that gives the global minimum at the origin, and 2) an $n$-D wave function that overlies the sphere (trend) function and creates many local minima surrounding the global minimum (see the 2-D response surface in Table II). This feature makes the Griewank function well suited for testing an algorithm's capability of escaping local minima and converging to the global minimum.
The second benchmark function is a composition function (CF1) suggested by Liang et al. [19]. As shown in Table II, the composition function is a weighted summation of ten sphere functions. By changing the shift vector and the scaling scalar of each sphere function, the sphere functions can have variable magnitudes and can be located at different positions in the feasible space. Weight functions are used to emphasize the impact of each sphere function in the region around its center. Unlike most traditional benchmark functions, the CF1 function's response surface is characterized by multiple irregular attractive regions (with distinct shapes and sizes) converging to individual minima at different levels. The lowest one is the global minimum. This irregularity poses a great challenge for evolutionary algorithms, especially when the global minimum is located within a small convergence zone close to the border.
Notably, both benchmark functions are continuous and smooth (differentiable everywhere). The points where the function reaches either a local or the global minimum have zero gradient ($\nabla f = \mathbf{0}$). These points are identified as the stationary points.
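For reference, the Griewank function in its standard textbook form can be written in a few lines. Table II is not reproduced here, so the constant 4000 and the $\sqrt{i}$ scaling are the commonly cited ones and are assumed to match the paper's definition.

```python
import numpy as np

def griewank(x):
    """Standard Griewank function: a sphere trend plus a cosine 'wave' term."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    sphere = np.sum(x**2) / 4000.0            # trend with its minimum at the origin
    wave = np.prod(np.cos(x / np.sqrt(i)))    # oscillation that creates the local minima
    return sphere - wave + 1.0
```

At the origin the sphere term is 0 and the wave term is 1, so the function value is 0, which is the global minimum.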

E. Experimental Setup
The SCE-UA, PSO, and DE algorithms are compared on both benchmark functions in 30-D and 100-D. The maximum number of function evaluations is set at 500 000 for the 30-D experiments and 1 000 000 for the 100-D experiments. Each experiment is run 30 times with different random seeds. All runs start with an initial population whose size is a function of $n$, the dimensionality of the benchmark function.
Parameters in each algorithm are specified to their default or popular values: 1) SCE-UA: The number of complex ( ): . Each complex has particle.Therefore, the population size is particles.
2) PSO: The significance coefficients: the inertia weight $w$ is held constant. (Based on preliminary experiments, we did not find a noticeable difference in performance between the use of an annealed and a constant $w$. This is in accord with the study of [20].) The two acceleration coefficients $c_1$ and $c_2$ are set at 2; the maximum velocity $V_{max}$ is half of the search range.
3) DE: The crossover constant $CR$ and the scaling factor $F$ are set to fixed values.

As reported in [21], bound handling is critical to the performance of PSO. The widely used random and absorbing bound-handling schemes may paralyze PSO when applied to high-dimensional problems. The effects of different bound-handling schemes on SCE-UA and DE are not as evident as on PSO. Therefore, in this study, we adopt the reflecting bound-handling method for all three algorithms. In this method, when a particle flies outside a bound in one of the dimensions, the bound acts like a mirror and reflects the projection of the particle's displacement.
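The mirror-like behaviour of reflecting bound handling can be sketched as follows. The function name `reflect` and the repeated-reflection loop (for steps that overshoot by more than the width of the domain) are assumptions of this example, not details given in the text.

```python
import numpy as np

def reflect(x, lower, upper):
    """Mirror out-of-bound components back into [lower, upper]."""
    x = np.array(x, dtype=float)
    for _ in range(100):                        # repeat in case of a large overshoot
        below, above = x < lower, x > upper
        if not (below.any() or above.any()):
            break
        x = np.where(below, 2 * lower - x, x)   # mirror at the lower bound
        x = np.where(above, 2 * upper - x, x)   # mirror at the upper bound
    return np.clip(x, lower, upper)             # safety net for extreme cases
```

For example, with bounds 0 and 1, a component at 1.2 is placed at 0.8 and a component at -0.3 is placed at 0.3, while in-bound components are left unchanged.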

1) Results for the Griewank Function:
As illustrated in Fig. 1   exception. The BFV decreasing speeds of the individual algorithms show patterns similar to those in the 30-D experiments.
To understand the algorithms' different behaviors, we investigated the results further. The questions posed were as follows.
• Where do all the SCE-UA runs stop?
• What impedes the SCE-UA algorithm from converging to the global minimum?
• Why is DE so robust that it succeeds in every scenario?

A. Stagnation of the SCE-UA Algorithm
As described earlier, the SCE-UA algorithm employs the Nelder-Mead simplex scheme to move the offspring along the function's steepest descent direction. Ideally, if the function is continuous and smooth, the scheme will eventually drive the simplex's vertices to converge to a stationary point. However, it has been reported that even for lower-dimensional problems, Nelder-Mead simplexes can sometimes stagnate at non-stationary points, a phenomenon named "stagnation" [22]. Our tests using the CF1 function reveal that incidents of stagnation increase when the function dimensionality increases. To illustrate where the SCE-UA runs stop, we examine the function gradients (numerically estimated) at the final points of the SCE-UA runs in both the 30-D and 100-D cases. We find that all the final points are non-stationary ($\nabla f \neq \mathbf{0}$). Fig. 2 shows the histograms of gradient norms at the final points: all the points possess significant gradients.
To identify whether these final points are located inside the attractive region of the global minimum, we further examine, for each final point, the geometric relationship between the normalized negative gradient and the normalized direction vector from the final point to the global minimum. Given that the CF1 function over the attractive region of the global minimum is a weighted sphere function, at any point in this region the negative gradient vector should point towards the global minimum. As illustrated in Fig. 3, for the 30-D runs, the corresponding components of these two normalized vectors at all the final points are highly correlated. This indicates that these points are already close to the global minimum, with negative gradients pointing towards it. For the 100-D runs, the final points are still located inside the attractive region, but the correlations are not as high as in the 30-D cases because these final points are relatively farther away from the global minimum.
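The two diagnostics used above (a numerical gradient at a final point, and the agreement between the negative gradient and the direction to the global minimum) can be sketched as follows. The central-difference scheme, the step size `h`, and the function names are assumptions of this example; the estimation method actually used in the experiments is not specified here.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x (h is assumed)."""
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = h
        grad[k] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

def alignment(f, x_final, x_star):
    """Gradient norm at x_final, and the correlation between the components of
    the normalized negative gradient and the normalized direction to x_star."""
    g = -numerical_gradient(f, x_final)
    d = np.asarray(x_star, dtype=float) - np.asarray(x_final, dtype=float)
    g_hat = g / np.linalg.norm(g)
    d_hat = d / np.linalg.norm(d)
    return np.linalg.norm(g), np.corrcoef(g_hat, d_hat)[0, 1]
```

For a pure sphere function centered at `x_star`, the negative gradient is parallel to the direction vector, so the correlation is 1 while the gradient norm stays clearly above zero away from the center. This is the signature of a non-stationary point inside the attractive region.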
A key question here is: what makes the SCE-UA algorithm lose its capability to fully converge? A possible reason is that the particles that are randomly selected from a complex to form a simplex have already lost some properties necessary for reaching the global optimum. We hypothesized that, after generations of evolution, the population might have degenerated to a subspace of the feasible space. To test this hypothesis, we need to examine the geometric structure of the population in a way that exposes the deficiencies of the particle collection. For this, we used principal components analysis.

B. Principal Components Analysis (PCA)
Principal Components Analysis (PCA) is a multivariate analysis tool that transforms a given dataset to a new orthogonal coordinate system so that the first coordinate (called the first principal component, PC) has the largest variance of projections from the dataset, the second coordinate has the second largest variance, and so on. In certain cases, some lower-rank PCs will have negligible variances, which means that the dataset exhibits "dimensionality reduction". In our case, the population's particles $\mathbf{x}_i$ are transformed to the PC coordinate system:

$$\mathbf{y}_i = \mathbf{E}^{T}\mathbf{x}_i$$

where matrix $\mathbf{E}$ has the PCs ($\mathbf{e}_j$: $\mathbf{C}\mathbf{e}_j = \lambda_j\mathbf{e}_j$) as its columns, $\mathbf{C}$ is the covariance matrix of the particle population in the original coordinate system, and the eigenvalue $\lambda_j$ is the data variance along $\mathbf{e}_j$. If we have $\lambda_j = 0$, the vector component $y_{i,j}$ is constant for every $i$. This indicates that the dataset is located within a subspace spanned by $m$ ($m < n$) independent orthogonal vectors (PCs), denoted as $S$. By applying the PCA procedure to a population and checking the PCs' relative variances, we can identify whether dimensionality reduction has happened in the population. If it has happened, we need to be aware of two issues:

1) Given the global optimum location $\mathbf{x}^*$ of the benchmark function, it is easy to detect whether $\mathbf{x}^*$ is inside the subspace. If $\mathbf{x}^*$ is outside of the subspace, the particle population may lose the ability to converge to $\mathbf{x}^*$ or may even converge to a non-stationary point. The schematic plots in Fig. 4 visualize the reduced dimensionality of a particle population in 2-D and 3-D scenarios and its consequences.
2) If vectors $\mathbf{a} \in S$ and $\mathbf{b} \in S$, then $\alpha\mathbf{a} + \beta\mathbf{b} \in S$ too. This means that, theoretically, an offspring generation mechanism based on linear operations on parents in a degenerated population cannot restore the lost dimensions.

In high-dimensional optimization problems, one cannot ensure that the particle population is able to maintain its dimensionality throughout the whole search. In fact, we discovered that in each of the failed SCE-UA runs and most of the problematic PSO runs, the population degenerated: all particles spanned a subspace with reduced dimensionality, and the benchmark function's global minimum point was excluded from the subspace. Detailed results are illustrated in the following section.

Fig. 5. The PCA results during the evolution of the SCE-UA (left column), PSO (middle column) and DE (right column) runs for the 100-D CF1 function. The horizontal coordinate represents the 100 PCs ordered according to their relative variances (the vertical axis). From top to bottom, the row panels are for the results after the first iteration, after one-third of the total iterations, after two-thirds of the total iterations, and after the last iteration, respectively.
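The PCA check described in this section amounts to an eigendecomposition of the population's covariance matrix. The following is a minimal sketch using NumPy; the function name `pc_relative_variance` is hypothetical.

```python
import numpy as np

def pc_relative_variance(pop):
    """Relative variance on each PC (largest first) and the PCs as columns.
    Rows of `pop` are particles."""
    centered = pop - pop.mean(axis=0)
    cov = centered.T @ centered / (pop.shape[0] - 1)   # covariance matrix of the population
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                  # reorder to descending variance
    eigvals = np.clip(eigvals[order], 0.0, None)       # remove tiny negative round-off
    return eigvals / eigvals.sum(), eigvecs[:, order]
```

If a population in a 5-D space is confined to a 2-D plane, only two entries of the relative-variance vector are non-negligible. Any offspring built as a combination such as `pop[0] + 0.7 * (pop[1] - pop[2])` then has the same projection on the three remaining PCs as the population mean, which is the second issue listed above.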

C. PCA Results of the Three Algorithms
In the experiments, we applied the PCA procedure to the particle populations every time a generation of evolution (iteration) was completed. Due to the characteristics of the PSO algorithm, the best historical position of each particle is also included in the PSO population. Fig. 5 illustrates how the relative variances on the PCs vary from the early population to the final one for median runs of the three algorithms on the 100-D CF1 function (results from other 100-D runs and the 30-D runs are similar). Since all algorithms start with a randomly sampled initial population, after the first iteration the populations of all three algorithms still maintain comparable variances for each individual PC (the top panels). However, with continuing evolution, the PCA results for the three algorithms diverge.
For SCE-UA (the panels in the left column), the population finally degenerates so much that only one PC has a dominant amount of variance, 99.5% of the total, after 74 iterations (about 4.5 function evaluations). In fact, after one-third of the total iterations, only four PCs have significant variances. This means that the population is confined to a 4-D subspace of the 100-D search space. Because the SCE-UA algorithm generates the candidate offspring along the steepest descent direction, which is a difference vector between a simplex's worst vertex and the centroid of the remaining vertices, if all the vertex particles are within the same subspace, the new offspring will be confined to the same subspace. The population continues reducing its dimensionality until it stops in a 1-D subspace, and the final point is not even a stationary point.
For PSO (the panels in the middle column), the degeneration of the particle population is less severe than in SCE-UA. After two-thirds of the total iterations (2120 iterations with about 8.6 function evaluations), the last 28 PCs out of all 100 PCs have only 1% of the total variance, which means that the capability of searching over the directions of these 28 PCs is greatly reduced. In other words, the algorithm has lost the capability of searching through the full 100-D parameter space. For the same reason as in the case of SCE-UA, it is difficult for the PSO algorithm to restore the lost dimensions in the following iterations. For this PSO run, it is fortunate that the population degeneration happens because the particles have reached the attractive region of the global minimum and, hence, the subspace contains the global minimum point. However, in all the failed PSO runs, the population degeneration is caused by the attractive region of a local minimum, and the subspace spanned by the population excludes the global minimum. Consequently, the particles lose their capability to approach the global minimum and are finally trapped at the local minimum.
For DE (the panels in the right column), throughout all of the 2500 iterations, every PC retains comparable variance. This means that the searches are always conducted in the full 100-D search space, which contributes to the robust performance of DE in the experiments. In fact, if degeneration had happened, DE's way of generating offspring would not have been able to recover from it either, as in the cases of the other two algorithms.

D. Geometrical Study of the Experiment Results
It is also of great interest to show how the relationship between the global minimum and the subspace spanned by the population changes during the course of evolution for all three algorithms. To do this, we must define some numerical thresholds. The first one defines numerically when a dimension is reduced. Since the dimensionality of the experiment is 100, for randomly generated populations the expected percentage of total variance projected on each dimension is 1%. Therefore, if one dimension has a variance projection of less than 1% of its expectation (0.01% of the total variance), it is numerically defined as reduced. The second one defines when the global minimum is within a subspace. In this experiment, if the distance from the global minimum $\mathbf{x}^*$ to the subspace, which is also a hyperplane, is less than a small threshold, $\mathbf{x}^*$ is treated as being on the hyperplane, namely within the subspace.
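The two thresholds can be turned into a small check, sketched below. The function name `distance_to_population_subspace` is hypothetical; the default `ratio=0.01` encodes "1% of the expected share `1/n`", and the distance threshold itself is left to the caller because its value is not given here.

```python
import numpy as np

def distance_to_population_subspace(pop, x_star, ratio=0.01):
    """Count the reduced PCs of `pop` and measure how far x_star lies from
    the hyperplane spanned by the remaining PCs (through the population mean)."""
    n = pop.shape[1]
    mean = pop.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(pop, rowvar=False))
    eigvals = np.clip(eigvals, 0.0, None)
    rel = eigvals / eigvals.sum()
    reduced = rel < ratio / n                        # below 1% of the expected share 1/n
    # The part of (x_star - mean) lying along the reduced PCs is exactly the
    # offset of x_star from the population's hyperplane.
    offset = (x_star - mean) @ eigvecs[:, reduced]
    return int(reduced.sum()), float(np.linalg.norm(offset))
```

A run would then count the global minimum as "within the subspace" whenever the returned distance falls below the chosen threshold. When no PC is reduced, the distance is zero by construction.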
If we put the results of the 30 runs in parallel, after each iteration we can determine in how many runs (out of the 30) the global minimum remains in the subspace spanned by the particle population. As illustrated in Fig. 6, in all SCE-UA runs the global minimum is excluded from the population subspace after a few early iterations. In contrast, for all DE runs, the global minimum always remains in the subspace of the population, because the population does not degenerate, as revealed by Fig. 5. From Fig. 5, it is clear that population degeneration occurs in all PSO runs. However, Fig. 7 shows that if the population degeneration is caused by the attractive region of the global minimum ($\mathbf{x}^*$), $\mathbf{x}^*$ is still contained in the subspace of the population. If instead the degeneration is caused by the attractive region of a local minimum, $\mathbf{x}^*$ is very likely to be excluded from the population subspace, and when that happens the search is doomed to fail.

E. Interpretation
Population degeneration cannot be addressed by the population diversity measures proposed in many recent studies of evolutionary algorithms. In general, these reported measures fall into two categories: 1) population diversity in the objective space [23]-[26]; and 2) population diversity in the parameter space [27]-[30]. Furthermore, the measures in the parameter space rely on the particle-dimension distance to quantify population diversity. These distance-based measures cannot identify population degeneration. For instance, in another study [21], we demonstrate that one of the most popular diversity measures cannot reflect the population degeneration that leads the PSO algorithm to fail on some complex test functions.
The difference in the severity of population degeneration among the three algorithms is closely related to the offspring-generating mechanisms used by the algorithms. To keep the population spanning the full parameter space throughout the optimization, offspring should be generated in "all directions" during evolution. Here, "all directions" means $n$ linearly independent directions in an $n$-D search space.
In the SCE-UA algorithm, offspring are generated (using the Nelder-Mead simplex scheme) along the steepest descent direction of a benchmark function. The steepest descent direction represents the most promising search direction, which makes the search effective. Meanwhile, the population converges and acquires orientations in accordance with the function features. The complex shuffling scheme and the mutation procedure alter the search directions but are not strong enough to prevent the convergence trend in the high-dimensional space. Therefore, the SCE-UA search during the evolution proceeds in "very selective directions," not in all directions.
The scheme employed by PSO generates the offspring in a direction with three components: the direction of the last displacement, the direction toward the best position of the population, and the direction toward the particle's best historical position. The first and last components help maintain spatial diversity, since each particle's path is independent at the beginning. However, if similarity among the particles' paths emerges, for example when the whole population moves from one region to another more promising region during evolution, the spatial diversity of the search directions will also be compromised.
In contrast, the DE algorithm generates offspring along directions that are defined by two randomly selected particles and diverted by the hybridization process. These mechanisms tend to generate offspring in all directions and greatly reduce the possibility of population degeneration, but at the cost of less efficient searches.

IV. REMEDY FOR POPULATION DEGENERATION
Because efficiency is essential in high-dimensional optimization, parsimonious slope-approximation processes for generating offspring, such as the simplex scheme in the SCE-UA method, are preferable in developing evolutionary optimization algorithms for high-dimensional applications. However, the potential risk from population degeneration must be properly controlled. In addition to its capability of detecting population degeneration, PCA can also help remedy its adverse effects. PCA can identify all of the reduced PCs, on which the population variance projections are trivial. Then, extra offspring can be generated along these PCs, complementing the offspring generated routinely. In this manner, offspring are generated in all directions and can span the full parameter space.
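The idea of generating extra offspring along the reduced PCs can be sketched as follows. This illustrates the principle only, not the procedure of [31], [32]: the function names, the one-point-per-PC choice, the random sign, and the default step (the standard deviation along the leading PC) are all assumptions of this example, and a real algorithm would still evaluate and select the candidates by function value.

```python
import numpy as np

def reduced_pcs(pop, ratio=0.01):
    """Indices and directions of PCs whose relative variance is below ratio/n."""
    n = pop.shape[1]
    eigvals, eigvecs = np.linalg.eigh(np.cov(pop, rowvar=False))
    eigvals = np.clip(eigvals, 0.0, None)
    rel = eigvals / eigvals.sum()
    idx = np.flatnonzero(rel < ratio / n)
    return idx, eigvecs[:, idx], eigvals

def extra_offspring_along_reduced_pcs(pop, ratio=0.01, step=None, rng=None):
    """Candidate points displaced from the population mean along each reduced PC."""
    rng = np.random.default_rng() if rng is None else rng
    idx, pcs, eigvals = reduced_pcs(pop, ratio)
    if step is None:
        step = np.sqrt(eigvals.max())            # assumed scale: spread along the leading PC
    signs = rng.choice([-1.0, 1.0], size=len(idx))
    return pop.mean(axis=0) + step * signs[:, None] * pcs.T   # one row per reduced PC
```

Appending the returned rows to a degenerated population gives it non-trivial variance along every previously reduced PC, so later linear operations on parents can again reach the full space.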
As a demonstration, we modified the SCE-UA algorithm [31], [32], by applying PCA to the population at the end of each iteration and generating additional offspring along all reduced PCs.This modified SCE-UA algorithm not only performs properly on the 100-D CF1 function with all runs successfully converging to the global minimum, but also demonstrates its efficiency and effectiveness on a suite of very complex compositional functions.As a general tool, PCA can also be applied to remedy population degeneration associated with PSO.The above experiments show that population degeneration also occurs in the PSO runs on 100-D CF1, but not causing stagnation of the swarm.However, as reported by Chu et al. [33], population degeneration triggered by one of the most popular bound-handling schemes, absorbing scheme, does drive the swarm to stagnate at the boundary of the searching space.The absorbing scheme is necessary in exploring the global optima that are close to the boundary in many practical problems.In fact, adding a PCA scheme to PSO, in the same manner as how we apply PCA to SCE-UA, can prevent PSO from stagnation caused by population degeneration when using absorbing bound-handling scheme.This can be demonstrated through a test using the composition function,CF4, in Liang et al. [19].The CF4 function (Fig. 8) constructs a dominant attractive region which converges to a local minimum (at the center of the search space), whereas the global minimum and other better local minima hide in a narrow area close to the boundary.
For comparison, we run PSO 30 times on CF4 with each of four settings: 1) reflecting bound-handling, which relocates an outside offspring particle to the position inside the domain that is symmetric about the boundary; 2) random bound-handling, which randomly relocates the outside particles within the search domain; 3) absorbing bound-handling, which sets the outside offspring particles onto the boundary; and 4) absorbing bound-handling with the PCA scheme. Results show that PSO is consistently trapped by the local minimum at the center of the search space (Fig. 9

V. PRACTICAL APPLICATION
With population degeneration remedied, evolutionary algorithms are capable of solving high-dimensional and complex practical problems. As an example, we present some results from our recent study on improving the parameter estimation and calibration of the highly nonlinear hydrologic models used for flood forecasting [34]. Specifically, the SACramento Soil-Moisture Accounting Model (SAC-SMA), which is the major component of the U.S. National Weather Service (NWS) River Forecast System and currently serves as the operational model for flood and river flow forecasting over the United States, was used to examine the SP-UCI method, an SCE-UA variant integrating PCA. The SAC-SMA model has 13 parameters that must be obtained through parameter estimation.
Simulating the rainfall-runoff process, the SAC-SMA model represents a complex system involving physical components (different soil layers) and water movement (infiltration, percolation). Since many parts of this system are underground and unobservable, the model uses a series of conceptual water storages to approximate the soil moisture conditions and to control the production of streamflow (Fig. 10). Therefore, the skill of this model relies on how well the model parameters are calibrated.
To demonstrate the effectiveness of SP-UCI, we applied it to the calibration of SAC-SMA over the Leaf River basin in the State of Mississippi. This watershed has an area of 1944 km² (Fig. 11) and has been intensively studied, so abundant hydrological data are easily accessible. We obtained 11 years (January 1, 1953 to December 31, 1963) of observation time series from the Hydrologic Research Laboratory at NWS. The data set includes mean areal precipitation (mm/6 h), potential evapotranspiration (mm/day), and streamflow (m³/s). The mean annual precipitation for the entire period is 1323.7 mm and the mean runoff is 27.14 m³/s.
For comparison, both SP-UCI and SCE-UA were run 50 independent times, with the objective function defined as the daily root-mean-square error (DRMS) of the simulated runoff against the observations. Final DRMS values are presented in Fig. 12, along with results reported by previous studies [35], [36]. It is evident that SP-UCI elevates SAC-SMA to a record-high level in terms of its ability to predict the daily runoff of the basin. SP-UCI not only retrieves the optimal parameter values responsible for more accurate runoff simulation, but also provides consistent (narrow-range) model parameter distributions, which lead to a correct understanding of the model's behavior and the watershed's hydrologic features [34].
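For readers unfamiliar with the objective, the DRMS can be sketched as an ordinary root-mean-square error over paired daily values. This is a minimal illustration that assumes both series are already aligned at a daily time step; the function name `drms` and its input checks are hypothetical, and any aggregation or preprocessing used in the study [34] is not reproduced here.

```python
import math

def drms(simulated, observed):
    """Root-mean-square error between simulated and observed daily runoff.

    Both arguments are equal-length sequences of daily values in the same
    units (for example, m^3/s).
    """
    if len(simulated) != len(observed) or len(simulated) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A calibration run would evaluate this function once per candidate parameter set, after running the hydrologic model to produce `simulated`, and the optimizer would minimize the returned value.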

VI. SUMMARY AND DISCUSSIONS
In this study, we compared three evolutionary algorithms using two typical benchmark functions, each of which was tested in 30-D and 100-D cases with 30 runs per test. From the experiments, we conclude the following.
1) All of the failed global optimizations are accompanied by population degeneration, in which the population degenerates into a subspace embedded inside the search space and the global minimum lies outside this subspace. This population degeneration is in general irrecoverable by the algorithms themselves.
2) The likelihood of population degeneration is closely related to the mechanism the algorithm uses to generate offspring. The SCE-UA algorithm has the most effective and efficient scheme for discovering better offspring among the three algorithms, but it is the most vulnerable to population degeneration. The DE algorithm shows robustness in global optimization by employing a diverse search during the evolution, which, in turn, decreases its efficiency.
3) DE showed its robustness in this study, but its low efficiency limits its application to high-dimensional optimization. On the other hand, in all of the successful runs, SCE-UA consistently exhibits remarkable efficiency compared with the other two algorithms, which suggests that it has great potential for high-dimensional optimization if population degeneration can be controlled, as reported in [31], [32].
When using evolutionary algorithms to solve real-world problems, the location of the global optimum (if it exists) is generally unknown. As the search dimensionality increases, the response surface of the test function becomes much more complicated. The particle population becomes less capable of exploring the high-dimensional landscape and more susceptible to degeneration. Therefore, the results of this study lead to the following recommendations.
1) An algorithm should maintain the dimensionality of the space spanned by the particle population during evolution, because even if only one dimension is lost, the succeeding search may be restricted to the subspace and miss the opportunity to reach the global minimum.
2) The PCA procedure is a powerful tool not only for discerning population degeneration but also for identifying reduced PCs. Hence, it can be used to remedy population degeneration and diversify the search directions.
3) As a practical instance of the No Free Lunch Theorems [37], efficiency and robustness are often at odds. To work on a wide range of real-world problems, an algorithm should enable users to control the balance between efficiency and robustness. How to realize this by designing an efficient and flexible mechanism for generating offspring will be one of the foci of our future study.
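As a diagnostic companion to these recommendations, the following minimal sketch measures how far a given point (for a benchmark, the known global minimum) lies from the affine subspace spanned by the population. A distance that grows away from zero indicates that the population has degenerated into a subspace excluding that point. The function name `distance_to_population_span` and the threshold `rel_tol` are hypothetical, and the use of a singular value decomposition is an implementation choice made for this illustration rather than the procedure used in the experiments.

```python
import numpy as np

def distance_to_population_span(population, point, rel_tol=1e-8):
    """Distance from `point` to the affine subspace spanned by the population.

    population: array of shape (n_particles, n_dims).
    point: array of shape (n_dims,).
    rel_tol: assumed threshold below which a singular value is treated as zero.
    """
    centroid = population.mean(axis=0)
    centered = population - centroid
    # Right singular vectors with non-trivial singular values span the subspace.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    if s.size == 0 or s.max() == 0:
        basis = np.empty((0, population.shape[1]))
    else:
        basis = vt[s > rel_tol * s.max()]
    offset = point - centroid
    # Remove the component of the offset that lies inside the subspace.
    residual = offset - basis.T @ (basis @ offset)
    return float(np.linalg.norm(residual))
```

In real-world problems the global optimum is unknown, so this check applies only to benchmark studies; in practice, the count of retained singular values can be monitored instead to detect a loss of dimensionality.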
(a) and (b), the three algorithms perform consistently through their 2 × 30 runs. In every SCE-UA and DE run, the best function value (BFV) converges to the global minimum. For PSO, a few runs converge to local minima (eight runs in the 30-D experiment and five runs in the 100-D experiment). SCE-UA converges at least one order of magnitude faster than PSO and DE in both the 30-D and 100-D cases. On the other hand, PSO outperforms DE in terms of convergence speed. With the increase in dimensionality of the benchmark function, the differences in the BFV decrease rates of the algorithms are more evident in 100-D than in 30-D. In particular, for SCE-UA, the number of function evaluations to convergence remains at the same order of magnitude in both the 30-D and 100-D runs, whereas the numbers for PSO and DE go up to a higher order of magnitude. These results indicate that, in the case of the Griewank function, the increase in dimensionality causes the efficiencies of PSO and DE to drop more significantly than that of SCE-UA.

2) Results for the CF1 Function: In Fig. 1(c) (the 30-D experiments), all of the SCE-UA runs reach the close vicinity of the global minimum but are unable to converge to it. As the geometric sizes of the populations become very small, all of the optimization runs terminate prematurely as though they had converged to a minimum point. These termination points are randomly distributed around the global minimum, with a mean BFV of 1.3295 and a mean distance to the global minimum of 0.1943. Most PSO runs succeed, while only 6 runs converge to one of the local minima and are trapped there within the given maximum number of function evaluations. The DE runs unanimously achieve the global minimum. Similar to the results on the Griewank function, SCE-UA shows the fastest BFV decrease rate; BFVs in the successful PSO runs drop faster than those in the DE runs, whereas in the failed PSO runs, the BFV drops even more slowly than in the DE runs. The results from the 100-D experiments (Fig. 1(d)) show the problems more clearly. All of the SCE-UA runs quickly stop at points that are still far from the global minimum. The majority of PSO runs succeed in converging to the solution, with nine runs trapped in a local minimum. The DE algorithm exhibits its robustness and discovers the global minimum without

Fig. 3. Correlation between components of normalized negative gradient and normalized direction vectors pointing to the global minimum at final points for SCE-UA runs on the 30-D (left) and 100-D (right) CF functions.

Fig. 4. Illustration of population dimensionality reduction in 2-D (left) and 3-D (right) scenarios. In 2-D, the benchmark function response surface is shown together with its contours in the coordinate plane. When the population degenerates onto a line parallel to the first PC (P1), with the variance on the second PC (P2) equal to zero, the succeeding search is restricted to the line and eventually converges to the best point on this line, which is a non-stationary point in the original feasible space. Similarly, in 3-D, all sample particles may degenerate onto a plane defined by the first two PCs (P1 and P2), with the third PC (P3) perpendicular to the plane. If the global minimum is not on this plane, the search loses the capability of converging to it. (Black dots are searching particles.)

Fig. 6. Number of runs in which the global minimum remains in the space spanned by the particle population, as a function of iteration, during optimization of the 100-D CF1 function by SCE-UA (dashed line) and DE (solid line) for the ensemble of 30 runs. (Iterations of different algorithms may involve different numbers of function evaluations.)

Fig. 7. Mean distance of the global minimum to the hyperplane defined by the particle populations for the 21 successful PSO runs (the upper panel) and 9 failed PSO runs (the lower panel) on the 100-D CF1 function.
(a)-(c)) with settings 1-3. Only with PCA, as in setting 4, can PSO escape this local minimum and obtain much better final function values.

Fig. 12. Box plots of results from 50 runs of SCE-UA and SP-UCI. The dashed line and dotted line indicate the results reported by Thiemann et al. [36] and Brazil [35], respectively. (The box has lines at the lower quartile, median, and upper quartile values. The whiskers are lines extending from each end of the box to show the extent of the rest of the data. Outliers, denoted by , are data with values beyond 100 units of interquartile range.)

TABLE I
PSEUDOCODES FOR STUDIED ALGORITHMS