Abstract
The paper introduces a new approach to kriging-based multi-objective optimization, utilizing a localized probability of improvement as the infill sampling criterion and a nearest-neighbor check to ensure diversification and a uniform distribution of the Pareto front. The proposed method is computationally fast and scales linearly with the number of objectives.
1 Introduction
Research on multi-objective optimization (MO) has been attracting significant attention from the engineering community since the 1980s; with the aid of fast computers, solutions to many complex optimization problems have become possible. The Vector Evaluated Genetic Algorithm (VEGA) [1] is one of the earliest examples of Multi-Objective Evolutionary Algorithms (MOEAs). More recent developments include NSGA-II [2] and its modified versions, as well as particle swarm based methods [3]. A comprehensive review of problem definitions and non-EA solution methods may be found in [4].
An increasing number of indicator-based MOEAs have been proposed in recent years. The indicator is used as a fitness measure for a set of Pareto points and, by optimizing the indicator function, the MO problem essentially becomes a single-objective optimization problem, as the solver only needs to locate the optimal value of the indicator and update the generation accordingly. One of the best-known indicators is the hypervolume [5]; it has been successfully applied to both EAs and surrogate-based algorithms. Despite its unique feature of being strictly monotonic with respect to Pareto improvements [6], it suffers from a high computational cost as the number of objectives grows.
EAs are generally considered advantageous for solving MO problems because they are population based, so multiple solutions can be obtained in a single run. However, evaluating practical problems may be expensive in terms of computational time and effort. In the context of electromagnetic devices, the finite element method is a common design tool; it often takes hours or even days to obtain a single solution, and therefore surrogate-model-based algorithms are often preferred.
In this study we propose a new indicator-focused Localized Probability of Improvement (LPoI) approach for MO problems. Its implementation requires the predicted mean and mean square error to be available, so it is not applicable to EAs in general; for Gaussian-process-based surrogate models (including kriging), however, it has the advantage of scaling linearly with the number of objectives.
2 Kriging theory
Modern engineering design often involves deterministic computer simulation; in electromagnetic design, time-consuming finite element models (FEM) are often built to represent the actual devices, and designs are analyzed and optimized before being put into production. In such problems the optimization can be a very time-consuming process due to the large number of FEM calls needed, and surrogate modeling techniques are therefore often used to reduce the number of expensive FEM simulations.
Kriging is one of the most commonly used surrogate techniques. An ordinary kriging model Y(x), where x is the location of any design site, consists of a constant global mean μ and a local departure Z(x); a standard form of this decomposition is shown below.
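As a sketch of the usual ordinary-kriging decomposition (with the global mean treated as an unknown constant μ),
$$Y(x) = \mu + Z(x).$$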
The local departure Z follows a Gaussian distribution with zero mean, variance σ² and non-zero covariance between design sites. One of the most commonly used correlation functions, owing to its continuity and flexibility, is the general exponential correlation function (shown below), where xi and xj are a pair of observations, k is the problem dimension, and θn and pn are hyperparameters controlling the shape of the correlation function.
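Its standard form, consistent with the hyperparameters defined above, is
$$R(x_i, x_j) = \exp\!\left(-\sum_{n=1}^{k} \theta_n \,\bigl|x_{i,n} - x_{j,n}\bigr|^{p_n}\right).$$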
The kriging parameters μ and σ², together with the hyperparameters θ and p, are obtained via Maximum Likelihood Estimation (MLE), where y denotes all observations and R is the correlation matrix; the likelihood function to be maximised is shown below.
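A sketch of the corresponding log-likelihood in its standard ordinary-kriging form (writing m for the number of observations and 1 for a column vector of ones, symbols introduced here for illustration) is
$$\ln L = -\frac{m}{2}\ln\bigl(2\pi\sigma^{2}\bigr) - \frac{1}{2}\ln\lvert R\rvert - \frac{(y-\mathbf{1}\mu)^{\mathsf T} R^{-1}(y-\mathbf{1}\mu)}{2\sigma^{2}}.$$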
Once these parameters have been estimated, the kriging prediction and the predicted mean square error (MSE) at a given location x follow directly from the fitted model; standard forms of both expressions are given below.
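A sketch of the standard ordinary-kriging expressions, writing r(x) for the vector of correlations between x and the existing design sites, is
$$\hat{y}(x) = \hat{\mu} + r(x)^{\mathsf T} R^{-1}\bigl(y - \mathbf{1}\hat{\mu}\bigr),$$
$$\hat{s}^{2}(x) = \hat{\sigma}^{2}\left[1 - r(x)^{\mathsf T} R^{-1} r(x) + \frac{\bigl(1 - \mathbf{1}^{\mathsf T} R^{-1} r(x)\bigr)^{2}}{\mathbf{1}^{\mathsf T} R^{-1}\mathbf{1}}\right].$$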
3 Localized probability of improvement
Compared with other surrogate modelling methods, kriging has the advantage of providing both the predicted mean and the associated mean square error (MSE) at an unknown location. The probability of improvement PoI at any location x is expressed in terms of the target of improvement yt, the kriging predicted mean ŷ at x, the square root ŝ of the MSE at x, and the standard normal cumulative distribution function Φ(⋅); the familiar expression is given below.
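For a minimization problem the probability of improvement takes the usual form
$$\mathrm{PoI}(x) = \Phi\!\left(\frac{y_t - \hat{y}(x)}{\hat{s}(x)}\right).$$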
The first improvement target yext is associated with the minimum value of each individual objective function; the subscript ext stands for “extreme value”. For the nth objective, yext,n is defined in terms of the known minimum value ymin,n of that objective and the percentage of improvement p (parameter p is discussed later in this section); the corresponding PoIext,n then follows from the general PoI expression above, with ŷn, ŝn and yext,n denoting the corresponding measures of the nth objective function.
For the first improvement target we therefore obtain as many values of PoI as there are objectives, because PoIext is calculated from the extreme value of each objective function. We take the maximum potential improvement over all individual objectives, as shown below.
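Consistent with this description, the aggregated measure may be written (as a sketch) as
$$\mathrm{PoI}_{ext}(x) = \max_{n}\,\mathrm{PoI}_{ext,n}(x) = \max_{n}\,\Phi\!\left(\frac{y_{ext,n} - \hat{y}_n(x)}{\hat{s}_n(x)}\right).$$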
The second improvement target yint,n(x) is associated with a reference point that is defined based on the location of x; the subscript int stands for “intermediate”. This target is calculated from the reference point yref together with the percentage of improvement p, and yref is determined as described next.
To obtain the reference point yref, the algorithm first finds the Pareto front of the existing design sites using non-dominated sorting. For each closest set of Pareto points (the number of points in the set being equal to the number of objectives) it calculates a corresponding reference point: the coordinate of this reference point in the nth dimension, Ref(xn), is equal to the maximum of Yn, where Yn is the collection of the nth objective values of all points in that Pareto set.
Taking a bi-objective problem as an example, assuming the reference point yref is to be determined for Pareto points P1 and P2, the coordinates of P1 and P2 are therefore denoted by [P1.x1, P1.x2] and [P2.x1, P2.x2], respectively. Note that xn is the nth objective value at the location in the search space associated with P. The x1 and x2 coordinates (in the objective space) of the reference point are thus described as follows
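In the notation above this amounts to taking the coordinate-wise maximum,
$$y_{ref,1} = \max\bigl(P_1.x_1,\,P_2.x_1\bigr), \qquad y_{ref,2} = \max\bigl(P_1.x_2,\,P_2.x_2\bigr).$$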
The corresponding PoIref,n is then obtained analogously to PoIext,n, with the localized target in place of the extreme one, where yref,n, ŷn, ŝn, PoIref,n and PoIext,n are the corresponding measures of the nth objective function.
We have therefore obtained n values of PoI for the second improvement target as well. However, unlike the first improvement target, the second one uses a localized target; we therefore take the minimum potential improvement over all individual objectives, as shown below.
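A sketch consistent with this description is
$$\mathrm{PoI}_{ref}(x) = \min_{n}\,\mathrm{PoI}_{ref,n}(x).$$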
Finally, the proposed indicator LPoI at any given point is the maximum of these two probability of improvement measures, LPoI(x) = max(PoIext(x), PoIref(x)).
The PoIext term is included because the minimum of each individual objective function is always present in the Pareto front; the PoI at each location x, measured against the optimal target of that objective, is therefore always considered. This term also contributes to the diversification of the Pareto front.
Furthermore, the PoIref term can be treated as a maximum of the minimum potential improvements towards a local target. It helps to improve the Pareto front both towards the origin and along the objective directions; it contributes to the diversification of the Pareto front, while the max-min construction also promotes its uniformity.
To obtain the next infill sampling point, the algorithm finds the location x associated with the maximum LPoI measure.
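As an illustration only, a minimal sketch of how the indicator could be evaluated at a single candidate site is given below. It assumes a kriging model that returns a predicted mean and MSE for every objective; the helper names and the particular forms chosen for the two targets (a shift of the extreme and reference values by p times their magnitude) are assumptions rather than the exact expressions of the method.

import numpy as np
from scipy.stats import norm

def lpoi(y_hat, s2_hat, y_min, y_ref, p):
    # y_hat, s2_hat: kriging predicted means and MSEs, one entry per objective
    # y_min: known minimum of each objective over the sampled design sites
    # y_ref: local reference point associated with this candidate
    # p: percentage of improvement controlling the target shift (assumed form)
    y_hat = np.asarray(y_hat, dtype=float)
    s_hat = np.sqrt(np.maximum(np.asarray(s2_hat, dtype=float), 1e-12))  # avoid division by zero
    y_ext = np.asarray(y_min, dtype=float) - p * np.abs(y_min)   # assumed extreme target
    y_int = np.asarray(y_ref, dtype=float) - p * np.abs(y_ref)   # assumed intermediate target
    poi_ext = norm.cdf((y_ext - y_hat) / s_hat).max()  # maximum over objectives
    poi_ref = norm.cdf((y_int - y_hat) / s_hat).min()  # minimum over objectives
    return max(poi_ext, poi_ref)

The candidate with the largest returned value would then be chosen as the next infill point.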
The parameter p, which enters both improvement targets, is associated with the magnitude of the target improvement and controls the convergence rate of the algorithm. A smaller amount of improvement guides the solver towards the existing Pareto points, while a larger value encourages exploration of the design space. It is crucial to use a proper value of p: too small a value may lead to a false Pareto front, while too large a value may result in a slow convergence rate or a zero probability of improvement at all unknown sites. It is therefore advisable to adjust the value dynamically while monitoring convergence.
We provide a simple self-adjusting scheme for parameter p in this paper: an initial improvement target percentage pinitial is defined, and p is then updated from pinitial and LPoIprev, the complete set of LPoI values obtained at the previous iteration.
The next infill point is taken at the location with the maximum LPoI; by repeatedly sampling such locations, the remaining probability of improvement is progressively reduced and the solver converges towards the Pareto front. When the design space is well explored, or p is especially small, the solver will converge towards the existing Pareto front; at this stage it is common for the LPoI to be equal, or close, to 1 at multiple unknown sites (improvement over the target point is then extremely likely). In order to obtain a uniformly distributed Pareto front, the algorithm then selects as the next infill sampling point the candidate with the largest Euclidean distance to the existing Pareto points. For this reason, the maximum value of LPoI can be capped between 0.9 and 1 for faster exploitation of the existing Pareto front without degrading the overall performance.
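A possible realisation of this selection step is sketched below; the cap value and the distance rule follow the description above, while the function and variable names are purely illustrative.

import numpy as np

def select_infill(obj_candidates, lpoi_values, pareto_front, cap=0.95):
    # obj_candidates: predicted objective values of the candidate sites, one row per site
    # lpoi_values:    LPoI value of each candidate site
    # pareto_front:   objective values of the existing non-dominated points, one row each
    # cap:            LPoI values above this level are treated as equally promising
    obj_candidates = np.asarray(obj_candidates, dtype=float)
    pareto_front = np.asarray(pareto_front, dtype=float)
    lpoi_capped = np.minimum(np.asarray(lpoi_values, dtype=float), cap)
    tied = np.flatnonzero(lpoi_capped >= lpoi_capped.max())
    # distance from each tied candidate to its nearest existing Pareto point
    diff = obj_candidates[tied, None, :] - pareto_front[None, :, :]
    nearest = np.linalg.norm(diff, axis=2).min(axis=1)
    return tied[np.argmax(nearest)]  # index of the least crowded among the best candidates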
4 Test examples
The top graph in Figure 1 shows the kriging model (solid line) after 45 iterations, with the red crosses marking the true Pareto front, while the bottom plot shows the proposed indicator value at the unknown sites. As can be seen, the algorithm has correctly converged to all four Pareto point clusters in the search space, so further sampling will lead to more exploitation of the Pareto front. The sampled design sites in the objective space are plotted in Figure 2, where the red dots indicate the location of the true Pareto front. The improvement directions imposed by the two improvement targets are also illustrated in Figure 2: the yellow arrows show the improvement direction for the first improvement target, and the blue arrows indicate the improvement direction for the second.
Figure 1: The kriging model and the LPoI criterion in the search space
Figure 2: Existing design sites in the criterion space after 20 iterations
Figure 3: Existing design sites in the criterion space after 45 iterations
5 Solving the new TEAM problem
A new TEAM problem, devoted specifically to multi-objective optimization, was proposed at the Compumag conference in Korea in June 2017 [7]. In its extended version an additional objective has been added, so there are three objectives in total. The model consists of an air-cored multi-turn winding; by arranging the current-carrying turns, a desired distribution of the magnetic flux density B(z), evaluated at points z, is to be obtained.
The problem is specified as follows: given the current density J within the coil and the prescribed flux density, find the optimal distribution of radii r(z), −d ≤ z ≤ d, that yields the prescribed flux density B0(z).
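As a rough illustration only, assuming the field is evaluated on the axis of the winding and each turn is approximated by a filamentary circular loop carrying a current I equal to the current density times the cross-sectional area of the turn (which is not necessarily the exact field model used in the benchmark), the axial flux density produced by nt turns of radii rl at axial positions zl would be
$$B(z) \approx \sum_{l=1}^{n_t} \frac{\mu_0\, I\, r_l^{2}}{2\bigl(r_l^{2} + (z - z_l)^{2}\bigr)^{3/2}}.$$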
An initial arrangement of turns was given in the extended version of [7]; the width w and the height h of each turn are 1 mm and 1.5 mm, respectively.
Figure 4: Example of radii distribution; geometry and magnetic flux lines
The model consists of 20 turns connected in series, symmetrically distributed, hence there are 10 radii which need to be optimized (the main objective f1). Two additional objectives were proposed to complement the first objective f1. The three objectives f1, f2 and f3 may be described as follows:
f1: find the optimal distribution of r, so that the discrepancy between the prescribed flux density B0 and the actual induction field B is minimized;
f2: minimize the sensitivity function;
f3: minimize the power loss related function.
The three objective functions are expressed in terms of the computed field values and their perturbed counterparts B+ = B(r(ξl + Δξ), zq) and B− = B(r(ξl − Δξ), zq), with l = 1, …, nt, q = 1, …, np and Δξ = 0.5 mm. At this stage it was suggested to consider only two objectives at a time: f1 together with either f2 or f3.
The optimization results are illustrated in Figs. 5 and 6, where objectives f2 and f3 are plotted against objective f1, respectively. The globally optimal points A and B (identified by their closeness to the respective utopia points) correspond to the radii distributions [11.4, 8.6, 9.1, 12.1, 8.9, 8.3, 7.0, 6.4, 6.8, 5.9] and [7.2, 10.6, 7.2, 6.6, 9.0, 5.2, 9.2, 5.0, 5.4, 6.9], respectively.
Figure 5: Pareto front of f1 and f2 in the objective space
Figure 6: Pareto front of f1 and f3 in the objective space
6 Conclusion
A novel approach to kriging-based multi-objective optimization, relying on the Localized Probability of Improvement, has been put forward. For illustration purposes a bi-objective test problem was presented, as well as the recently introduced TEAM benchmark problem. It has been shown that the proposed method efficiently addresses both the diversification and the uniformity of the Pareto solution, is computationally efficient and scales linearly with the number of objectives.
References
[1] Schaffer J.D., Multiple objective optimization with vector evaluated genetic algorithms, Proceedings of the 1st International Conference on Genetic Algorithms, 1985, 93-100
[2] Deb K., Agrawal S., Pratap A., Meyarivan T., A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, in Schoenauer M. et al. (eds), Parallel Problem Solving from Nature PPSN VI, 2000, doi:10.1007/3-540-45356-3
[3] Parsopoulos K.E., Vrahatis M.N., Particle swarm optimization method in multiobjective problems, SAC'02 Proceedings of the 2002 ACM Symposium on Applied Computing, 2002, 603-607, doi:10.1145/508791.508907
[4] Marler R.T., Arora J.S., Survey of multi-objective optimization methods for engineering, Structural and Multidisciplinary Optimization, 2004, 26, 6, 369, doi:10.1007/s00158-003-0368-6
[5] Zitzler E., Thiele L., Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, IEEE Transactions on Evolutionary Computation, 1999, 3, 4, 257-271, doi:10.1109/4235.797969
[6] Knowles J., Corne D., On metrics for comparing nondominated sets, Evolutionary Computation, Proceedings of CEC'02, Honolulu, 2002, 711-716
[7] Di Barba P., Mognaschi M.E., Song X., Lowther D.A., Sykulski J.K., A benchmark TEAM problem for multi-objective Pareto optimization of electromagnetic devices, IEEE Transactions on Magnetics, 2017, PP, 99
© 2017 Yinjiang Li et al.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.