A new evolutionary search strategy for global optimization of high-dimensional problems

Global optimization of high-dimensional problems in practical applications remains a major challenge to the evolutionary computation research community. This paper demonstrates the weakness of randomization-based evolutionary algorithms in searching high-dimensional spaces. A new strategy, SP-UCI, is developed to treat the complexity caused by high dimensionality. This strategy features a slope-based searching kernel and a scheme for maintaining the particle population's capability of searching over the full search space. Examinations of this strategy on a suite of sophisticated composition benchmark functions demonstrate that SP-UCI surpasses two popular algorithms, the particle swarm optimizer (PSO) and differential evolution (DE), on high-dimensional problems. Experimental results also corroborate the argument that, in high-dimensional optimization, only problems with well-informative fitness landscapes are solvable, and slope-based schemes are preferable to randomization-based ones.


Introduction
Rapid advancements in technology and science pose new challenges to the optimization community. In real-world applications, more and more complex systems, such as power systems, protein structures, medical image registration, and financial markets [4,16,19,24], are being simulated by computer models with the aid of elevated cyber infrastructures. These models have increasingly elaborate structures resulting from representing multiplex physical or conceptual processes and, hence, possess large numbers of parameters. Therefore, optimizing these models amounts to searching for solutions in the high-dimensional spaces spanned by the model parameters. High-dimensional spaces present many unique difficulties and phenomena that are absent from low-dimensional problems and, therefore, plague methods that perform well in low-dimensional spaces. For instance, the ''curse of dimensionality'', a term first coined by Bellman [1,2], describes the problems caused by the exponential increase in volume associated with adding extra dimensions to a mathematical space.
Furthermore, Mendes et al. [22] argued that only functions whose fitness landscapes provide clues to the locations of the solutions (optima) can be called problems; other functions, such as ''deceptive functions'' (where the gradients lead a hill-descender/-climber away from global optima) and ''random functions'' (where gradients exist but are unrelated to solutions), are nonsense. This argument becomes even more plausible when we deal with high-dimensional problems. The ''clues'' here mainly refer to the gradient or slope of the response surface. Fortunately, many real-world applications fall into this class [22]. Thus, algorithm developers should focus on solving problems instead of intricate benchmark functions.
However, although deceptive and random functions are of little practical importance, many (probably more and more) benchmark functions are designed to exhibit deceptiveness and randomness [11,28]. Consequently, many algorithms or algorithm variants attempting to tackle these two difficulties have been developed recently. Obviously, there is only one way to achieve success on deceptive or random functions: randomization. On deceptive or random fitness landscapes, the attractive regions of global minima can only be reached if a searching particle jumps into them by chance. Therefore, based on genetic, swarm, annealing, and hybrid mechanisms, many searching algorithms heavily integrate randomization into the offspring reproduction process, such as simulated annealing (SA) [15], the particle swarm optimizer (PSO) [14], differential evolution (DE) [26], the covariance matrix adaptation-evolution strategy (CMA-ES) [12], and their recent modifications [3,7,8,23,25,30-32,34,35]. These algorithms demonstrate outstanding performances when applied to low-dimensional deceptive and random functions and have prevailed in the recent literature. However, when problem dimensionality increases, these algorithms lose their effectiveness [11] because the power of randomization drops geometrically. The vulnerability of randomization to high dimensionality is easy to understand and can be demonstrated by a simple example: ''Assume that the global optimum is a ''quadrant'' (the analogy of the 2-D quadrant in high-dimensional space) instead of a point in an n-dimensional search space. A particle is randomly jumping in the search space. To keep it simple, we define that the optimization succeeds if the particle jumps into the correct quadrant (the global optimum). Therefore, the probability of success at every step is 2^(-n). For low-dimensional problems, this probability is acceptable if we have a large population of particles or if we can evolve particles many times, which is the case for most evolutionary optimization algorithms. However, for high-dimensional problems, the probability becomes so small that the problem cannot be solved in practice. For example, with n = 100, the probability is only 2^(-100), which is much lower than the probability of winning any lottery on Earth!''
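The geometric collapse of this per-step success probability is easy to verify numerically; the following is a minimal illustration of the quadrant argument above:

```python
# Probability that a single uniform random jump lands in the correct
# "quadrant" (all n coordinate signs right) of an n-dimensional space.
def quadrant_success_probability(n: int) -> float:
    return 2.0 ** (-n)

for n in (2, 10, 100):
    print(n, quadrant_success_probability(n))
```

Even millions of particles evolved for millions of steps cannot compensate for a per-step success probability of 2^(-100).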
Another major concern when applying evolutionary algorithms to real-world applications is efficiency. Unlike mathematical benchmark functions, the computation time for real-world problems can be substantial, especially in high-dimensional settings or when complex processes are involved. A successful optimization algorithm must be able to evolve in a parsimonious manner in many situations. Algorithms relying heavily on randomization sacrifice efficiency: the more randomness in the offspring-generating mechanism, the lower the probability of producing qualified offspring to keep the population evolving. As a result, these algorithms usually require a large number of function evaluations. For many high-dimensional real-world applications, the computations last too long to be viable.
To solve high-dimensional real-world optimization problems, an algorithm that can wisely exploit response surfaces and possesses high efficiency is greatly needed. Motivated by this consideration, we introduce an innovative evolutionary algorithm called the shuffled complex evolution with principal components analysis-University of California at Irvine (SP-UCI). Six composition functions in [20], CF1-CF6, are used to benchmark the SP-UCI method in comparison with the PSO and DE methods. Having irregular fitness landscapes and high levels of complexity, the composition functions can sufficiently represent the difficulties of high-dimensional problems. Recent studies have shown that the composition benchmark functions pose large difficulties for global optimization algorithms [20]. To test the scaling behavior of SP-UCI, the benchmark experiments are conducted in 10, 50, 100, and 1000 dimensions. Experimental results demonstrate that the SP-UCI method exhibits consistently superior performance in high-dimensional (>50-D) settings in comparison with the PSO and DE methods, and its merits become more significant as the dimensionality increases.
The no free lunch (NFL) theorems [33] state that ''for any algorithm, any elevated performance over one class of problems is offset by performance over another class''. In other words, no algorithm can outperform all others over all problems. As a practical example of NFL, Langdon and Poli [18] demonstrated that genetic programming (GP) can readily find simple functions (response surfaces) which suit one heuristic algorithm over another and vice versa. Following this line of reasoning, the development of any search algorithm should be accompanied by statements describing what kind of problems it is designed for. In contrast, many algorithm developers claim that their methods perform well in general and use a set of benchmark functions to support this claim. However, a set of benchmark functions is never sufficient to represent problems in general. Therefore, we state that the SP-UCI method is designed for high-dimensional and complex problems. Furthermore, the superior performance of this method on the benchmark functions substantiates that searching in high-dimensional spaces should rely on strategies which can cleverly and efficiently exploit fitness landscapes, such as slope-based algorithms, instead of relying heavily on randomization processes.

The SP-UCI method
The SP-UCI method is developed based on the shuffled complex evolution-University of Arizona (SCE-UA) method [9]. Since its debut, the SCE-UA method has been widely used in calibrating conceptual hydrological models, which generally have very complicated fitness landscapes with uncountable local minima, unknown roughness, and discontinuities. Studies show that the SCE-UA method has demonstrated efficiency and effectiveness on both benchmark functions and real-world applications [10,27,29]. However, SCE-UA was constructed primarily for, and tested on, low-dimensional problems. Recently, our study [6] revealed that SCE-UA tends to malfunction on high-dimensional problems due to ''population degeneration'', which is introduced in the following paragraph. SCE-UA employs the n-dimensional Nelder-Mead simplex (hereafter referred to as simplex) scheme as its searching and evolving kernel. The simplex scheme is an effective tool for reproducing qualified offspring by estimating the steepest descent direction in its proximity. As one of the pioneers of direct search, the simplex method has been intensively studied both experimentally and theoretically [13,17,21]. As a local search algorithm, the simplex exhibits high efficiency with a convergence rate close to O(n) [17]. However, as revealed in our study [6], the offspring particles reproduced through a series of simplex processes may converge into a subspace of the original search space; namely, the space spanned by the searching particles comes to have a lower dimensionality than that of the original search space. From that point on, subsequent evolution is restricted to the subspace and has little chance of recovering the full search over the parameter space. We refer to this phenomenon as ''population degeneration''; it is caused by the absence of any mechanism for maintaining the population's capability of spanning the full search space. This deficiency actually lies in many direct search algorithms, where the sample population evolves by itself without any supervision of changes in the population's dimensionality. Population degeneration may lead to disastrous consequences, such as convergence to non-stationary points or unexpected divergence [6].
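As an illustration of the slope-exploiting kernel, a single Nelder-Mead reflection step can be sketched as follows. This is a minimal sketch only; the actual MCCE module in SP-UCI adds contraction, expansion, and further safeguards:

```python
def simplex_reflect(simplex, f, alpha=1.0):
    """One reflection step of the Nelder-Mead scheme: reflect the worst
    vertex through the centroid of the remaining vertices."""
    # Sort vertices from best (lowest f) to worst (highest f).
    verts = sorted(simplex, key=f)
    worst = verts[-1]
    n = len(worst)
    # Centroid of all vertices except the worst.
    centroid = [sum(v[i] for v in verts[:-1]) / (len(verts) - 1)
                for i in range(n)]
    # Reflect the worst vertex through the centroid.
    reflected = [centroid[i] + alpha * (centroid[i] - worst[i])
                 for i in range(n)]
    # Accept the reflected point only if it improves on the worst vertex.
    if f(reflected) < f(worst):
        verts[-1] = reflected
    return verts
```

Because the move is driven by the local descent direction implied by the vertices, it exploits the slope of the response surface rather than jumping randomly.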
In response to this deficiency, we develop a scheme which utilizes principal components analysis (PCA) to identify and search along the dimensions that are not spanned by the sample population. By integrating this scheme, the SP-UCI method guarantees that the particle population is able to search the full space during every loop throughout the entire evolution.
In detail, the SP-UCI method incorporates four concepts that are expressed by individual algorithmic modules: (a) the complex shuffling scheme; (b) population dimensionality monitoring and restoration; (c) the modified competitive complex evolution (MCCE) strategy; and (d) multinormal resampling. Modules (b) and (d) are new developments to enhance the method's performance on high-dimensional and complex problems, whereas Modules (a) and (c) are inherited from the SCE-UA method with some modifications. Each of these modules is particularly designed to account for one of the major difficulties in direct search: the complex shuffling scheme, Module (a), is powerful in exploring multimodal response surfaces; Module (b) remedies potential population degeneration; Module (c) is a sophisticated implementation of the simplex method which drives the evolution in an efficient manner; and, finally, multinormal resampling, Module (d), helps search over rough fitness landscapes. Figs. 1-4 are the flowcharts of SP-UCI, and the details of each module are presented in Appendix A. The Matlab codes of the SP-UCI method are available upon request.
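The core of Module (a), inherited from SCE-UA, is to rank the whole population by fitness and deal the points into complexes so that each complex receives points of every quality level. A minimal sketch (in Python for illustration; the authors' implementation is in Matlab):

```python
def shuffle_into_complexes(points, values, k):
    """Deal points into k complexes by fitness rank, SCE-UA style: the
    best point goes to complex 0, the second best to complex 1, and so
    on cyclically, so every complex samples all quality levels."""
    order = sorted(range(len(points)), key=lambda i: values[i])
    complexes = [[] for _ in range(k)]
    for rank, idx in enumerate(order):
        complexes[rank % k].append(points[idx])
    return complexes
```

After each complex evolves independently, the complexes are merged and re-dealt, which spreads information gained in one complex across the whole population.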

Test functions
For the sake of testing the performance of our method in real-world applications and illustrating our arguments about searching in high-dimensional problems, we choose six recently developed novel composition functions, CF1-CF6, in [20] as benchmark functions. The following considerations make these functions preferable over widely used standard benchmark functions: (1) These functions all have very complicated and irregular fitness landscapes; hence, they have the ability to mimic real-world problems. (2) There are no known patterns (such as the layout of minima) hidden in these functions, which prevents the tested algorithms from taking advantage of any known properties of the response surfaces. (3) The absence of symmetry makes the complexity of these functions increase substantially with dimensionality. Therefore, they can effectively test an algorithm's performance on high-dimensional problems. (4) They are composed of a series of popular standard functions and, therefore, can represent the difficulties of a wide range of standard benchmark functions.
Generally, each of these functions is composed of ten basic test functions chosen among the Sphere, Ackley, Griewank, Rastrigin, and Weierstrass functions. The selected functions are randomly located and are biased by different magnitudes to generate the global minimum and several local minima. From the first to the 10th basic function, biases of 0, 100, . . ., 900 are added, respectively. Therefore, the minimum of the first basic function is the global optimum with a value of 0, and the minima of the other nine basic functions are local minima with altitudes of 100, . . ., 900, respectively. The minimum of the 10th basic function is set at the origin in order to trap algorithms that assume a global optimum at the center of the search space. Furthermore, instead of simple summation, the ten basic functions are combined with Gaussian functions to blur the structures of the individual functions. Details of the construction and properties of the composition functions can be found in [20].
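The Gaussian-weighted combination can be sketched as below. This is a simplified illustration only, using shifted sphere components; the actual CF1-CF6 in [20] mix five different basic functions and add rotations and normalization not shown here:

```python
import math

def sphere(z):
    # Simplest basic component: sum of squares.
    return sum(zi * zi for zi in z)

def composition(x, optima, biases, sigma=1.0):
    """Gaussian-weighted blend of shifted, biased components: near the
    j-th optimum the j-th component (and its bias) dominates, which
    blurs the structure of any individual component."""
    weights, parts = [], []
    for o, b in zip(optima, biases):
        d2 = sum((xi - oi) ** 2 for xi, oi in zip(x, o))
        weights.append(math.exp(-d2 / (2.0 * sigma * sigma)))
        parts.append(sphere([xi - oi for xi, oi in zip(x, o)]) + b)
    total = sum(weights) or 1.0   # guard against underflow far from all optima
    return sum(w * p for w, p in zip(weights, parts)) / total
```

Near the unbiased first optimum the value approaches 0 (the global minimum), while near the second it approaches its bias of 100 (a local minimum).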
For many popular basic test functions, high dimensionality does not affect their regularities, such as symmetric landscapes and uncorrelated dimensions. Therefore, the difficulty of these functions does not increase very much in high-dimensional space. In contrast, these composition functions have very irregular fitness landscapes, as shown in Fig. 5. Even in 2-D, these plots vividly display the harshness of these functions, and the harshness increases greatly with dimension. For example, the ratio of the global minimum's attractive region to the whole search region decreases geometrically with dimensionality, and the depth of local minima and the magnitude of noise increase. Thus, this group of benchmark functions provides a rigorous examination of an algorithm's capability to search high-dimensional complex problems.

Experiment settings
In addition to SP-UCI, we also tested two other widely used algorithms: the particle swarm optimizer (PSO) and differential evolution (DE) (see Appendix B for brief descriptions of PSO and DE). The reason for selecting these two algorithms is twofold: first, as two of the most popular heuristic optimization algorithms, they are good references for inter-comparisons among different studies. Second, as discussed in our previous study [6], these two algorithms, together with the SP-UCI method, represent three major types of offspring-generating schemes: DE relies heavily on randomization to enhance its robustness; SP-UCI takes advantage of the slopes of response surfaces to achieve efficiency; and PSO falls somewhere between DE and SP-UCI, combining response surface information provided by the best points (in the population and in history) with randomly generated weighting vectors. Benchmarking these three algorithms sheds light on how the performances of randomization-based and slope-based searching algorithms change with dimensionality.
Liang et al. [20] reported that CMA-ES was outperformed by PSO and DE on these sophisticated composition functions. Therefore, the CMA-ES method is excluded from the comparison.

General settings
Our experiments were conducted in 10-, 50-, and 100-D. Due to computational constraints, 1000-D experiments were only carried out for CF1. All three algorithms were run 30 times on each benchmark function. All functions had the same search range of [-5, 5] on every dimension. The locations and magnitudes of the global and local minima were set at the default values in the codes provided by the authors of the composition functions. The sample population was randomly initialized in the search range with a uniform distribution. The population size was kept the same across the three methods: 8 × d + 4, which means that there were four complexes in the SP-UCI method.
The maximum number of function evaluations was set to 10^4 × d for d = 10, 50, 100, and 10^5 × d for d = 1000. In addition to the maximum number of function evaluations, there were two additional stop criteria for SP-UCI, as specified below.

SP-UCI: two additional stop criteria were used: stop if the population converges to a region of geometric size less than 10^-6, or if the best function value has not improved by 0.1% over the last 50 loops.
PSO: the inertia weight was held constant at c0 = 0.5; c1 and c2 were set to 2; Vmax was half of the search range.
DE: the crossover constant was C = 0.5; the scaling factor was F = 0.5.
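The two extra SP-UCI stop rules can be sketched as follows. This is a hypothetical helper for illustration; the function name and the exact measure of population size are our assumptions, not the authors' code:

```python
def should_stop(population, best_history, size_tol=1e-6,
                window=50, rel_improve=1e-3):
    """Return True if either extra stop rule fires: (1) the population
    has collapsed to a region smaller than size_tol in every dimension,
    or (2) the best value improved by less than 0.1% over the last
    `window` loops."""
    d = len(population[0])
    # Extent of the population along each dimension.
    spans = [max(p[i] for p in population) - min(p[i] for p in population)
             for i in range(d)]
    if max(spans) < size_tol:
        return True
    if len(best_history) > window:
        old, new = best_history[-window - 1], best_history[-1]
        if abs(old - new) <= rel_improve * abs(old):
            return True
    return False
```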

Fig. 2. The module of population dimensionality monitoring and restoration: the number of lost dimensions is identified using principal components analysis, each lost dimension is searched over, and any point found that is better than the worst point in the complex replaces it.

Bound handling
In another recent study [5], we found that bound handling is critical to the performance of PSO. It revealed that the widely used random and absorbing bound-handling schemes may paralyze PSO when applied to high-dimensional problems. Therefore, in this study, we adopted a reflecting bound-handling method for all three algorithms. In this method, when a particle flies outside a bound in one of the dimensions, the bound acts like a mirror and reflects the projection of the particle's displacement.
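The mirror rule can be sketched per coordinate as below (a minimal sketch; repeated reflection handles overshoots larger than the range width):

```python
def reflect_into_bounds(x, lo, hi):
    """Reflecting bound handling: any coordinate outside [lo, hi] is
    mirrored back across the violated bound, repeatedly if the
    overshoot exceeds the range width."""
    out = []
    for xi in x:
        while xi < lo or xi > hi:
            if xi < lo:
                xi = 2 * lo - xi   # mirror across the lower bound
            else:
                xi = 2 * hi - xi   # mirror across the upper bound
        out.append(xi)
    return out
```

For instance, with the search range [-5, 5] used here, a coordinate of 6 is reflected back to 4.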

Fig. 3. The module of the modified competitive complex evolution strategy.
1) 10-D problems: DE performed better than PSO, with successful runs on CF1-CF3 and CF5. PSO only reached the global minimum on CF1 and CF5. CF4 and CF6 were too complex for any algorithm to work well; in particular, on CF4, most runs were trapped by the local minimum with a value of 600. 2) 50-D problems: results are presented in Fig. 7. DE was no longer outstanding in this scenario, and the performances of all three algorithms deteriorated. Only on CF1 did most runs end up at the global minimum, except for seven runs of PSO. On CF2, only one run of SP-UCI and one run of PSO succeeded, whereas all other runs were trapped by local minima. For CF5, most runs got very close to the global minimum, but no run could overcome the surrounding roughness to capture it. For CF6, all runs were captured by the local minimum defined by the basic function with a magnitude of 900. The same thing happened to PSO and SP-UCI on CF4; DE seemed to escape this fate but did not get much farther. 3) 100-D problems: as shown in Fig. 8, SP-UCI outperformed the other two on every benchmark function by achieving lower or equal function values. Due to the increased complexity resulting from increased dimensionality, only on CF1 did all algorithms approach the solution. However, a closer look reveals that, even on CF1, only SP-UCI succeeded in obtaining function values lower than 10^-6 (the termination criterion), whereas the results of PSO are all above 10^-4, and the results of DE are all larger than 10^0. In addition, seven PSO runs were trapped by a local minimum of CF1. On CF5, only

SP-UCI runs all converged to a small region around the global minimum, whereas the runs of the other two were still farther away. On CF2 and CF3, even though no run got close to the solution, SP-UCI still achieved the best results. Finally, for CF4 and CF6, the high dimensionality made them so intimidating that all runs were trapped by the minima of the 10th basic functions, which are at the origin.
From this array of results, we can clearly see that the randomization-based algorithm, DE, only works well on low-dimensional problems. As dimensionality increases, the slope-based algorithm, SP-UCI, exhibits its persistence and yields better performances. Meanwhile, the fact that, for 100-D problems, only the simplest CF1 function could be solved supports our argument that, in high-dimensional space, only problems with well-informative landscapes are solvable. CF2-CF6 all inherit more deceptiveness and randomness than does CF1 and, therefore, all three algorithms fail on them.

Comparison of efficiency on 100-D problems
As mentioned, efficiency is a key issue in many real-world applications, especially for high-dimensional problems. As illustrated in Fig. 9, the results from the 100-D experiments demonstrate SP-UCI's parsimoniousness in function evaluations. Owing to its sufficient exploitation of fitness landscapes, SP-UCI is able to evolve in a very efficient manner. The convergence rates of SP-UCI were strikingly higher than those of DE and PSO. SP-UCI also demonstrated its robustness through very consistent performances: across all of the randomly initialized runs on the same test function, SP-UCI always showed very similar evolution speed and converged to the same point or a small region, whereas PSO and DE sometimes behaved very diversely as a result of the random initialization.

CF1
Both SP-UCI and DE succeeded in consistently converging to the global optimum. However, SP-UCI converged much faster: after fewer than 10^5 function evaluations, the best function value (BFV) had dropped below 10^-6, whereas the BFV of the DE runs only dropped to the order of 10^0 after 10^6 function evaluations. PSO runs converged to both the global minimum and a local minimum; for the successful PSO runs, the convergence rate was between those of the DE and SP-UCI runs.

CF2

Twenty-seven out of 30 SP-UCI runs converged to the minimum point of the second basic function (BFV = 100), with 3 runs converging to the minimum point of the fourth basic function (BFV = 300). Every run converged to its final point within fewer than 10^5 function evaluations. PSO runs spread to more local minima, and DE runs all ended at different final points spread over the range of 150-300.

CF3
No algorithm had all of its runs converge to the same point. After around 2 × 10^5 function evaluations, SP-UCI runs stopped with BFVs in the range of 150-350. The BFVs of the DE runs dropped very slowly and fell within the range of 800-950 by the end. PSO runs diverged into two groups again: the first group evolved even more slowly than DE and had final BFVs higher than 1100; the other group evolved faster than DE and had final BFVs in the range of 250-600.

CF4
For every method, the difference between the curves of individual runs was small. All SP-UCI runs evolved very fast but converged to the origin. All PSO runs also converged to the same point, but at a much slower rate. Finally, DE evolved so slowly that its final BFVs only dropped to around 950 by the end.

CF5
On this function, all SP-UCI runs swiftly converged to a small region around the global minimum with a mean BFV of 13.2. All DE runs evolved slowly and steadily and, by the end, had reached a larger region around the global minimum with BFVs around 90. PSO runs ended up with BFVs falling into three groups; the best group had an average BFV of 18.58.

CF6
In Fig. 9, plot (f) resembles plot (d). Due to the highest complexity among the six test functions, all three methods were trapped by the local minimum of the 10th basic function, whose attractive region dominated the whole search space. SP-UCI reached the final point and terminated quickly, whereas PSO and DE both required many more function evaluations to converge to the same point.
On all of the test functions, SP-UCI consistently displayed the highest convergence or evolution speed. DE was always the slowest, and PSO generally fell in between. As discussed in the introduction, this difference comes from the difference in efficiency between slope-based and randomization-based offspring-generating schemes. The 10- and 50-D experimental results demonstrate a similar pattern.

Change of difference in efficiency with increasing dimensionality
We are also very interested in how these algorithms behave in even higher-dimensional spaces, such as 1000-D. However, due to computational constraints, we only tested on CF1, which was the only solvable problem in 100-D. Furthermore, there were only three runs for each algorithm, since we allowed a maximum of 10^8 function evaluations, an extremely large number for this sophisticated composition function. Fortunately, the 100-D results show that the behaviors of the three algorithms are consistent on CF1, so there is no need for many runs. The results are plotted in Fig. 10. As before, SP-UCI converged with remarkable speed: in all three runs, the BFV dropped below 1 after 3 × 10^6 function evaluations. PSO also converged, but much more slowly; by the end, i.e., after 10^8 function evaluations, the BFV had only approached 1 in all runs. As for DE, it was so slow that all three runs finally only approached a BFV of 900.
Clearly, the 1000-D results indicate that the difference in efficiency between these three algorithms increases substantially with dimensionality. To better illustrate this, efficiency comparisons are plotted in Fig. 11 for the experimental results on the 10-, 50-, 100-, and 1000-D CF1 functions. Data points represent the median run in each case. For 10-D, the Y-axis represents the number of function evaluations required by each algorithm to converge to the solution. For 50- and 100-D, since DE was not able to evolve the BFV below 10^-6, the Y-axis represents the number of function evaluations that SP-UCI, PSO, and DE required to achieve the final BFV of the DE run. For 1000-D, only PSO and SP-UCI were compared, and the Y-axis represents the number of function evaluations that SP-UCI and PSO needed to achieve the final BFV of the PSO run. This plot reveals that SP-UCI was generally faster than the other two by roughly an order of magnitude, and this advantage increases with problem dimensionality. For the 1000-D CF1 function, SP-UCI was 34 times faster than PSO and faster than DE by a much larger margin.

Comparison with a PSO variant for high-dimensional problems with local search schemes
To further demonstrate its efficiency and effectiveness on high-dimensional problems, we also compared SP-UCI with a recent and sophisticated variant of PSO, the dynamic multi-swarm particle swarm optimizer with local search for large scale global optimization (DMS-L-PSO), which was proposed by Zhao et al. [36]. Unlike the standard PSO, DMS-L-PSO divides the population into a large number of dynamic sub-swarms which are regrouped frequently with various regrouping schedules. The Quasi-Newton method is integrated into DMS-L-PSO to improve its local searching ability. The DMS-L-PSO code was obtained from its authors and was applied with the same settings specified in Section 3.2.1. Results are presented in Table 1, where the mean and standard deviation of the final BFVs from 30 independent runs are listed. On CF1 to CF5, SP-UCI always yields a smaller mean and variance of the BFV compared with DMS-L-PSO, whereas on CF6, the two algorithms consistently give the same results. Wilcoxon rank sum tests were conducted to test the significance of the comparison. The null hypothesis is H0: there is no difference between the results of the two algorithms. The extremely small P-values in Table 1 substantiate that SP-UCI significantly outperforms DMS-L-PSO on CF1-CF5.
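The core of the rank sum comparison is the Mann-Whitney U statistic, which counts how often a final BFV from one algorithm beats one from the other; the full Wilcoxon test then converts this count into a P-value. A minimal sketch of the statistic (illustrative only; the study presumably used a standard statistics package):

```python
def rank_sum_u(a, b):
    """Mann-Whitney U statistic for samples a and b: the number of
    pairs (x, y) with x < y, counting ties as one half.  U near
    len(a)*len(b) means a's values are systematically smaller."""
    u = 0.0
    for x in a:
        for y in b:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u
```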
Furthermore, Fig. 12 illustrates that SP-UCI always exhibits higher efficiency compared with DMS-L-PSO.

Efficiency and effectiveness
The experimental results in Section 4 serve as good examples to support our arguments regarding the difficulties of high-dimensional problems and the effectiveness of randomization-based and slope-based schemes. In the 10-D experiments, DE demonstrated its outstanding potency by succeeding on four problems and outperforming the other two algorithms. This indicates that randomization has the power to solve low-dimensional deceptive and random functions. However, with increasing dimensionality, the difficulty of the deceptiveness and randomness grows, and the power of randomization drops. As a result, fewer benchmark functions could be solved, and the performance of DE deteriorated dramatically. The failure of DE on the 1000-D CF1 shows how vulnerable randomization is to high dimensionality. On the other hand, SP-UCI excelled in the high-dimensional experiments and exhibited an impressive convergence rate and evolution speed. Moreover, the fact that SP-UCI achieved better or at least equal final BFVs compared with PSO and DE in the 100-D experiments indicates that slope-based schemes have better or, at least, comparable effectiveness on high-dimensional problems.

Flexibility
Meanwhile, we must admit that, in many situations, efficiency and effectiveness are in tension. Based on users' demands, a method must be able to strike a proper balance. This is the reason why we built SP-UCI with four independent modules: this architecture gives the method sufficient flexibility. By tuning each module, users can adapt the method to their expectations. Here, we briefly explain how to adjust each module and how the adjustment affects the optimization process: (1) In the complex shuffling scheme, the number of complexes can be changed. Increasing the number of complexes increases the chance of finding the attractive regions of global optima, but drags down the efficiency of evolution. (2) For dimension restoration, by default, we require that no dimension be neglected in any loop. However, for certain types of response surfaces (such as symmetric functions), SP-UCI can achieve higher efficiency even when population degeneration occurs. In such situations, we need not insist on no lost dimensions; instead, we can let the scheme restore the dimensionality only when more than a pre-specified number of dimensions have been lost. This pre-specified number can be used to balance efficiency and effectiveness. (3) Within the MCCE module, we can vary the maximum number of simplex iterations. A larger number of simplex iterations leads to higher convergence rates; however, if the response surface is rough, it also increases the probability of converging to local minima. (4) Multinormal resampling has the strength of overcoming local roughness, but at the expense of efficiency. For rough response surfaces, the more points that are resampled, the less likely the population is to be trapped by local minima. By changing the number of sampling iterations, users are able to strike a balance suited to their goals.
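Module (d) can be sketched as follows. This is an illustrative simplification with a diagonal covariance (independent dimensions); the names are ours, and the actual module may fit the full covariance of the complex:

```python
import random

def multinormal_resample(points, n_new, seed=None):
    """Fit a normal distribution to the current complex, dimension by
    dimension, and draw n_new fresh points from it; resampled points
    can step over local roughness that traps simplex moves."""
    rng = random.Random(seed)
    d, p = len(points[0]), len(points)
    means = [sum(pt[i] for pt in points) / p for i in range(d)]
    stds = [(sum((pt[i] - means[i]) ** 2 for pt in points) / p) ** 0.5
            for i in range(d)]
    return [[rng.gauss(means[i], stds[i]) for i in range(d)]
            for _ in range(n_new)]
```

Increasing n_new corresponds to the efficiency/effectiveness trade-off discussed above: more resampled points mean more function evaluations but a better chance of escaping local roughness.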
The above discussion is a preliminary introduction. Analytical and quantitative studies of the relationships between the algorithmic parameters of each module and the algorithm's performance will be among the focuses of our future research. The eventual goal is to make all modules self-adaptive. Nonetheless, if a user prefers, SP-UCI can always be used as a fixed method with all of the default algorithmic parameter values, as we did in the experiments presented in this study.

Conclusions
We develop a sophisticated search algorithm, the SP-UCI method, which employs a slope-based simplex strategy as well as complex shuffling and multinormal sampling, and is resistant to population degeneration. Compared with two popular and more randomization-based methods, PSO and DE, this method exhibits superior convergence and evolution speed on high-dimensional problems. On the other hand, the fact that SP-UCI also outperforms PSO and DE with better, or at least equal, final BFVs in high-dimensional experiments indicates that SP-UCI does not gain efficiency simply by sacrificing effectiveness. Rather, by integrating four potent modules, SP-UCI balances effectiveness and efficiency at the same time. Its capability of capturing the global minimum, or of improving the objective function value very quickly, makes SP-UCI suitable for a wide range of real-world problems.
Experimental results bear out the argument that, in high-dimensional spaces, only problems with well-informative fitness landscapes are solvable, and slope-based schemes are preferable to randomization-based schemes. However, we are not against the implementation of randomization in evolutionary computation. On the contrary, the shuffled complex and multinormal resampling schemes in the SP-UCI method demonstrate the power of randomization in escaping local minima and sweeping over rough response surfaces. The point we are trying to make is that, in high-dimensional spaces, slope-based schemes have more potential than randomization-based schemes to drive the evolution of the sample population effectively and efficiently, and they should be used in constructing the kernel of search algorithms for high-dimensional problems.

A.2. Dimension monitoring and restoration
During every loop of evolution, each complex first goes through module (b) to prevent population degeneration:
Step 1. Check the dimensionality of the space spanned by all points in the complex using the following procedure: (1) Let C be the matrix with the coordinates of each point as its columns. Then C has size d × p, where d is the dimensionality of the problem and p is the number of points in each complex, so that the columns x_i, i = 1, ..., p, are the points in the complex. (2) Transform the original coordinate system to a normalized coordinate system by centering and normalizing each row of C to get C': c'_ij = (c_ij − c_i)/√v_i, where c_i and v_i are the mean and variance of the ith row of C, respectively. Then x'_j, the jth column of C', is the normalized coordinate of point j. This normalization reduces the effect of differences in the units of different parameters in real problems. The following operations of this module are all carried out in this normalized space.
(3) Calculate the covariance matrix of C' and denote it as R. Obtain the eigenvectors and eigenvalues of R. Each eigenvector is a principal component (PC) of the complex, and its corresponding eigenvalue measures the variance of the points in the complex along the direction defined by that PC. (4) By examining the eigenvalues, we can determine whether any dimension is lost and, if so, how many. Theoretically, the complex should fully span the d-dimensional parameter space, which means that the points should have comparable variance along the directions defined by every PC. If the variance along the direction of one PC is too small, the complex does not span well over that direction, and that dimension is lost. On a lost dimension, we can use the centroid of the complex, c', to represent all of the points, since they have very small variance on this dimension. In a d-dimensional space, for an isotropic particle population, the expected variance along each PC is 1/d of the total variance. Therefore, in this study, if a PC has variance less than 10% of the expected variance, we treat it as a lost dimension.
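As a sketch, Step 1 can be written with NumPy's eigendecomposition. The function name and the guard against zero-variance rows are our additions; the 10% threshold follows the text above.

```python
import numpy as np

def lost_dimensions(C):
    """C: d x p matrix, one column per point of the complex."""
    d, p = C.shape
    mean = C.mean(axis=1, keepdims=True)
    std = C.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                    # guard fully-degenerate rows
    C_norm = (C - mean) / std              # normalized coordinate system
    R = np.cov(C_norm)                     # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)   # PCs and their variances
    expected = eigvals.sum() / d           # isotropic share per PC
    lost = eigvals < 0.1 * expected        # 10% rule from the text
    return eigvecs[:, lost], C_norm        # PCs spanning lost dimensions
```

For example, a complex whose third parameter is a linear combination of the first two spans only a 2-D subspace of a 3-D problem, and exactly one PC is flagged as lost.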
Step 2. Search along lost dimensions. For each lost dimension detected in Step 1, do the following random search along the PC that represents it: (1) Sample a point from the positive side of c' along the PC: x' = c' + a r l, where a is a random number generated from a normal distribution with mean 2 and variance 1, l is the unit vector representing the PC, and r is the radius of the complex. Transform x' back to the original coordinates and evaluate the function there. If the function value is smaller than that of the worst point in the complex, replace the worst point with this new point, and the search on this PC is over. Otherwise, discard this new point and continue to (2).
(2) Sample a point from the negative side of c' along the PC: x' = c' − a r l. Again, transform x' back to the original coordinates and evaluate the function there. If the function value is smaller than that of the worst point in the complex, replace the worst point with this new point; otherwise, discard this new point. The search on this PC then terminates. In summary, Step 2 is designed to quickly explore the lost dimensions to see whether there is an evident slope along them. If there is, the random sampling is likely to capture it, and the new point mingled into the complex will enable the complex to search along this formerly lost dimension.

A.3. Modified competitive complex evolution (MCCE)
To suit high-dimensional problems, we amend the original competitive complex evolution (CCE) strategy of [9].
Step 0. Initialize index i = 1 and set the maximum number of iterations in this module I = d + 1.
Step 1. Sort the complex in order of increasing function value. Assign a triangular probability to each point except the first one: q_i = 2(p − i + 1)/(p(p − 1)), i = 2, ..., p, where p is the number of points in a complex.
Step 2. Select the first point and d (the problem dimensionality) other points from the complex according to q_i. Record each point's position in the complex.
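Steps 1 and 2 can be sketched as follows: the best point is always selected, and d further points are drawn without replacement using the triangular probabilities q_i (our reconstruction of the formula above; note the weights decrease linearly with rank and sum to 1). The function name is ours.

```python
import numpy as np

def select_simplex(p, d, rng):
    """Return the 1-based positions of the d + 1 points chosen from a sorted complex."""
    ranks = np.arange(2, p + 1)                  # candidate ranks 2..p
    q = 2.0 * (p - ranks + 1) / (p * (p - 1))    # triangular pdf, sums to 1
    others = rng.choice(ranks, size=d, replace=False, p=q)
    return np.concatenate(([1], np.sort(others)))  # best point always included
```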
Step 3. The d + 1 selected points form a simplex, S. Generate an offspring from S by the following procedure:
(1) Sort the points in S and denote them as s_1, ..., s_{d+1}, with the corresponding function values f_1 < ... < f_{d+1}. Calculate the centroid of the best d points and label it s̄.
(2) Reflect: compute the reflection point s_r = s̄ + (s̄ − s_{d+1}) and get f_r. If f_1 < f_r < f_d, set the offspring s_o = s_r and go to (7).
(3) Expand: if f_r ≤ f_1, compute the expansion point s_e = s̄ + 2(s_r − s̄) and get f_e. If f_e < f_r, let s_o = s_e and go to (7); otherwise, let s_o = s_r and go to (7).
(4) Contract outside: if f_d ≤ f_r < f_{d+1}, compute the outside contraction point s_oc = s̄ + 0.5(s_r − s̄) and get f_oc. If f_oc < f_r, let s_o = s_oc and go to (7); otherwise, let s_o = s_r and go to (7).
(5) Contract inside: if f_{d+1} ≤ f_r, compute the inside contraction point s_ic = s̄ + 0.5(s_{d+1} − s̄) and get f_ic. If f_ic < f_{d+1}, let s_o = s_ic and go to (7); otherwise, continue to (6).
(6) Multinormal sampling: if no better point has been found after the above simplex operations, a point is randomly drawn from a multinormal distribution defined by the simplex; this point replaces s_{d+1} regardless of its function value. Let S be the matrix with every point of the simplex as its columns, and calculate its covariance matrix, R_s. Take the diagonal of R_s, diagonal(R_s) = [r_11, ..., r_dd], and modify it as d' = 2([r_11, ..., r_dd] + mean([r_11, ..., r_dd])). Construct the modified covariance matrix R'_s with d' as its diagonal and zeros everywhere else. Finally, sample the offspring s_o from the multinormal distribution with the centroid of the simplex as its mean and R'_s as its covariance.
(7) Replace s_{d+1} with s_o, put the simplex back into the complex, and let i = i + 1. If i ≤ I, go to Step 1; otherwise, sort the complex and return to the main routine.
In our study, we discern that the shrink step of the original Nelder-Mead simplex method jeopardizes the algorithm's ability to escape local minima, even though it can largely increase the convergence rate. Consequently, we excluded the shrink step from our algorithm. Meanwhile, we included the expansion and outside contraction steps, which are not in the CCE strategy. The random sampling operation is also different from that in CCE: we adopted the simplex-guided multinormal distribution instead of a uniform distribution. In constructing R'_s, we keep only the diagonal of R_s in order to achieve decorrelation. Adding the mean to d' reduces the condition number of R'_s, whereas multiplying by two increases the sampling range.
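Operation (6) can be sketched with NumPy as below. The function name is ours, and the choice of the simplex centroid as the distribution's mean is an assumption for illustration; the diagonal-only covariance, the added mean, and the factor of two follow the construction of R'_s described above.

```python
import numpy as np

def multinormal_offspring(S, rng):
    """S: d x (d+1) matrix, one simplex point per column (d >= 2)."""
    R_s = np.cov(S)                        # covariance of the simplex points
    var = np.diag(R_s).copy()              # keep only the diagonal (decorrelate)
    var = 2.0 * (var + var.mean())         # reduce condition number, widen range
    mean = S.mean(axis=1)                  # centre of the simplex (assumption)
    return rng.normal(mean, np.sqrt(var))  # sample with diagonal covariance
```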

A.4. Multinormal resampling
Population-driven multinormal resampling proves to be a powerful tool for overcoming local roughness. To mitigate the influence of a noisy or rough response surface, multinormal resampling is applied to the complex as well. The operation is very similar to operation (6) in Step 3 of module (c); the differences are that this resampling happens at the complex level, that we do not modify the covariance matrix, and that p (the size of the complex) new points are drawn. Note that this resampling can be repeated I (a pre-defined number of) times, depending on the user's needs. In this study, we set I = 1.

Fig. 1. The main routine of the SP-UCI method.

Fig. 2. The module of dimensionality monitoring and restoration.
1) 10-D problems: Fig. 6 presents the final best function values of the 30 runs of the three algorithms on the six benchmark functions. DE surpassed the other two on the 10-D functions: it succeeded on CF1, CF2, and CF5 in all 30 runs and on CF3 in 12 runs. On CF6, the majority of the runs of DE reached below the function value of 600. SP-UCI performed slightly

Fig. 6. Box plots of results achieved by 30 runs of 1-SP-UCI, 2-PSO, and 3-DE on 10-D benchmark functions. (a) through (f) correspond to the CF1-CF6 functions. (The box has lines at the lower quartile, median, and upper quartile values. The whiskers extend to the most extreme data points not considered outliers. Outliers (denoted by +) are data with values beyond 100 units of interquartile range.)

Fig. 7. Box plots of results achieved by 30 runs of 1-SP-UCI, 2-PSO, and 3-DE on 50-D benchmark functions. (a) through (f) correspond to the CF1-CF6 functions. (The box has lines at the lower quartile, median, and upper quartile values. The whiskers extend to the most extreme data points not considered outliers. Outliers (denoted by +) are data with values beyond 100 units of interquartile range.)

Fig. 8. Box plots of results achieved by 30 runs of 1-SP-UCI, 2-PSO, and 3-DE on 100-D benchmark functions. (a) through (f) correspond to the CF1-CF6 functions. (The box has lines at the lower quartile, median, and upper quartile values. The whiskers extend to the most extreme data points not considered outliers. Outliers (denoted by +) are data with values beyond 100 units of interquartile range.)

Fig. 11. Efficiency comparison of SP-UCI (triangle), PSO (square), and DE (circle) on benchmark function CF1 in 10, 50, 100, and 1000 dimensions. For each case, the point plotted is the result of the median run. The Y-axis represents the number of function evaluations needed to converge (for 10-D) or to reach a small BFV (for 50-, 100-, and 1000-D). See text for details.

Fig. 12. Fitness curves of 30 runs of SP-UCI (red/light) and DMS-L-PSO (black/dark) on 100-D benchmark functions. (a) through (f) correspond to CF1-CF6. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


Brief descriptions of the PSO and DE algorithms.
B.1. Particle swarm optimizer (PSO)
Each particle's position x_i is updated by trying a displacement (velocity) based on three sources: (1) the particle's velocity in the previous evolution, v⁰_i; (2) the particle's best-ever position, x̂_i; and (3) the population's current best position, ĝ:
v_i = c_0 v⁰_i + c_1 r_1 ∘ (x̂_i − x_i) + c_2 r_2 ∘ (ĝ − x_i),  x̃_i = x_i + v_i,

where c_0, c_1, and c_2 are the significance coefficients, r_1, r_2 ∈ R^m are random vectors with uniformly generated components r_p = U(0, 1), p = 1, ..., m, and ∘ denotes element-by-element vector multiplication. If f(x̃_i) ≤ f(x_i), then x_i = x̃_i; otherwise, no replacement is made. A particle's velocity is bounded by the maximum velocity V_max.
B.2. Differential evolution (DE)
In the DE algorithm, for a particle x_i, the candidate offspring x̃_i has hybridized components {x̃_ip}, p = 1, ..., m, taken from x_i and a mutant vector v_i = x_a + F(x_b − x_c), where a, b, and c are randomly chosen, mutually distinct population indices different from i, and F is the scaling factor. Each component is chosen according to a randomly generated number r_p = U(0, 1):
x̃_ip = v_ip if r_p ≤ C, and x̃_ip = x_ip if r_p > C, p = 1, ..., m,
where C < 1 is the crossover constant. If f(x̃_i) ≤ f(x_i), the offspring replaces the parent; otherwise, the parent survives to the next generation.
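The two update rules above can be sketched as follows. The symbols (c_0, c_1, c_2, V_max, F, C) follow the text; the function names are ours, and the DE/rand/1 mutant v = x_a + F(x_b − x_c) is our reconstruction of the elided mutation formula.

```python
import numpy as np

def pso_step(x, v0, xhat, ghat, c0, c1, c2, vmax, rng):
    """One PSO displacement: returns (candidate position, bounded velocity)."""
    r1 = rng.uniform(0, 1, x.size)
    r2 = rng.uniform(0, 1, x.size)
    v = c0 * v0 + c1 * r1 * (xhat - x) + c2 * r2 * (ghat - x)
    v = np.clip(v, -vmax, vmax)            # bound the velocity by V_max
    return x + v, v

def de_offspring(pop, i, F, C, rng):
    """DE mutation + crossover for particle i of the population."""
    a, b, c = rng.choice([j for j in range(len(pop)) if j != i], 3, replace=False)
    v = pop[a] + F * (pop[b] - pop[c])     # mutant vector (reconstruction)
    r = rng.uniform(0, 1, pop[i].size)
    return np.where(r <= C, v, pop[i])     # component-wise crossover
```

In both algorithms, the candidate then replaces the parent only if its function value is no worse, as stated above.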

Table 1
Comparison of the final best function values (mean ± standard deviation) retrieved by SP-UCI and DMS-L-PSO.
a P-values are from Wilcoxon rank sum tests with H0: there is no difference between the results of the two algorithms.
W. Chu et al. / Information Sciences 181 (2011) 4909-4927