Democratic and Popular Republic of Algeria

Ministry of Higher Education and Scientific Research


Ferhat Abbas University of Sétif
Department of Mathematics and Computer Science

Thesis
Presented with a view to obtaining the

PhD Degree in Mathematics (LMD)

Option : OPTIMIZATION
by
GUESSOUM Nabila

LA b-COLORATION DE CERTAINS GRAPHES

Publicly defended on ...... before the jury composed of

Mr. ..... ..... .... ......


Mr. ..... ..... .... ......
Mr. ..... ..... .... ......
Mr. ..... ..... .... .....

Academic year : 2023-2024


Abstract

In light of its straightforwardness and effectiveness, the derivative-free global search algorithm known as DIRECT (DIviding RECTangles) has captured significant attention within the optimization community. Numerous innovative concepts and extensions have been put forth in its pursuit. Nevertheless, the efficacy of numerous DIRECT-type algorithms when addressing multimodal challenges and situations demanding highly precise solutions has exhibited a decline.

In this thesis, we delve into a global optimization challenge where the objective function is assumed to exhibit Lipschitz continuity with an unspecified Lipschitz constant.

Drawing inspiration from the recently introduced BIRECT (BIsection of RECTangles) algorithm, we present a novel diagonal partitioning and sampling strategy. Our framework, named BIRECT-V (V for vertices), seamlessly combines bisection with a two-point sampling approach. In the initial hyper-rectangle, these two points are strategically positioned at 1/3 and 1 along the primary diagonal. Unlike many DIRECT-type algorithms, which find vertex-based objective function evaluations unsuitable for bisection, our strategy, when paired with bisection, furnishes a more comprehensive understanding of the objective function. However, this strategy may lead to the creation of new sampling points coinciding with existing ones at shared vertices, thereby necessitating additional evaluations of the objective function and increasing the number of function evaluations per iteration.

To overcome this challenge, we propose a modification of the original optimization domain to obtain a reliable approximation of the global solution. Empirical investigations demonstrate that this modification significantly enhances the performance of the BIRECT-V algorithm. Our proposal exhibits promise as a global optimization algorithm, particularly when compared to the original BIRECT and two popular DIRECT-type algorithms across a battery of test problems. It notably excels in scenarios involving high-dimensional problems.

Keywords : Global Optimization, DIRECT Algorithm, BIRECT Algorithm, Diagonal Partitioning Strategy, Sampling Scheme.
Résumé

À la lumière de sa simplicité et de son efficacité, l’algorithme de recherche globale sans dérivée connu sous le nom de DIRECT (DIviding RECTangles) a suscité une grande attention au sein de la communauté de l’optimisation. De nombreuses idées innovantes et extensions ont été avancées dans sa poursuite. Néanmoins, l’efficacité de nombreux algorithmes de type DIRECT lors de la résolution de défis multimodaux et de situations exigeant des solutions très précises a connu un déclin.

Dans cette thèse, nous nous plongeons dans un défi d’optimisation globale où l’on suppose que la fonction objectif présente une continuité Lipschitz avec une constante Lipschitz non spécifiée.

Puisant inspiration du récemment introduit algorithme BIRECT (BIsection of RECTangles), nous présentons une nouvelle stratégie de partitionnement diagonal et d’échantillonnage. Notre cadre, baptisé BIRECT-V (V pour sommets), combine habilement la bisection avec une approche d’échantillonnage à deux points. Dans l’hyperrectangle initial, ces deux points sont stratégiquement positionnés à 1/3 et 1 le long de la diagonale principale. Contrairement à de nombreux algorithmes de type DIRECT, qui jugent inappropriées les évaluations de la fonction objectif basées sur les sommets pour la bisection, notre stratégie, lorsqu’elle est associée à la bisection, offre une compréhension plus complète de la fonction objectif. Cependant, cette stratégie peut entraîner la création de nouveaux points d’échantillonnage coïncidant avec ceux déjà existants aux sommets partagés, ce qui nécessite des évaluations supplémentaires de la fonction objectif et augmente le nombre d’évaluations de la fonction par itération.

Pour surmonter ce défi, nous proposons une modification du domaine d’optimisation d’origine afin d’obtenir une approximation fiable de la solution globale. Les investigations empiriques démontrent que cette modification améliore significativement les performances de l’algorithme BIRECT-V. Notre proposition présente des perspectives prometteuses en tant qu’algorithme d’optimisation globale, notamment par rapport à l’algorithme BIRECT d’origine et à deux algorithmes populaires de type DIRECT sur un ensemble de problèmes de test. Elle excelle notamment dans les scénarios impliquant des problèmes de grande dimension.

Mots clés : Optimisation Globale, Algorithme DIRECT, Algorithme BIRECT, Stratégie de Partitionnement Diagonal, Schéma d’Échantillonnage.
‫ملخص‬
‫بالنظر إلى بساطة وفعالية خوارزمية البحث العالمي بدون مشتقات المعروفة باسم ( ‪DIRECT‬‬
‫كبيرا في مجتمع األمثلة‪ .‬تم تقديم العديد‬
‫‪ ،(DIviding RECTangles‬فقد لفتت هذه الخوارزمية انتباهًا ً‬
‫من المفاهيم االبتكارية والتمديدات في سبيل تطويرها‪ .‬ومع ذلك‪ ،‬فإن فعالية العديد من خوارزميات نوع‬
‫‪ DIRECT‬عند التعامل مع تحديات متعددة األوضاع وحاالت تتطلب حال دقيقًا للغاية قد شهدت انخفا ً‬
‫ضا‪.‬‬

‫في هذه الرسالة‪ ،‬نتناول تحدي األمثلية العالمية حيث يفترض أن تظهر الدالة الهدف تتابع ‪ Lipschitz‬مع‬
‫ثابت ‪ Lipschitz‬غير معروف‪.‬‬

‫مؤخرا‪ ،‬نقدم‬
‫ً‬ ‫مستلهمين من خوارزمية (‪ BIRECT (BIsection of RECTangles‬ال ُمقدمة‬
‫استراتيجية جديدة للتقسيم والعينات على شكل قطر‪ .‬إطارنا‪ ،‬الذي أطلقنا عليه اسم ‪( BIRECT-V‬‬
‫‪V‬للنقاط الزاوية( ‪ ،‬يجمع بسالسة بين االنقسام والعينة بنقطتين‪ .‬في المستطيل الفرعي األولي‪ ،‬تُوضع‬
‫هاتان النقطتان بشكل استراتيجي على ‪ 3/1‬و ‪ 3/2‬على طول القطر الرئيسي‪ .‬على عكس العديد من‬
‫خوارزميات نوع ‪ DIRECT‬األخرى‪ ،‬التي تجد أن تقييمات الدالة الهدف على أساس النقاط الزاوية غير‬
‫مناسبة لالنقسام‪ ،‬توفر استراتيجيتنا‪ ،‬عند استخدامها مع االنقسام‪ ،‬فه ًما أشمل للدالة الهدف‪ .‬ومع ذلك‪ ،‬قد‬
‫تؤدي هذه االستراتيجية إلى إنشاء نقاط عينة جديدة تتزامن مع النقاط الحالية في النقاط الزاوية المشتركة‪،‬‬
‫مما يستدعي تقييمات إضافية للدالة الهدف وزيادة عدد تقييمات الدالة في كل تكرار‪.‬‬

‫ً‬
‫تعديال على المجال األصلي لألمثلية للحصول على تقريب موثوق‬ ‫للتغلب على هذا التحدي‪ ،‬نقترح‬
‫للحالولة العالمية‪ .‬تظهر التحقيقات التجريبية أن هذا التعديل يعزز بشكل كبير من أداء خوارزمية‬
‫صا عند مقارنته بخوارزمية‬‫‪ .BIRECT-V‬اقتراحنا يظهر وعدًا كخوارزمية أمثلية عالمية‪ ،‬خصو ً‬
‫‪ BIRECT‬األصلية واثنتين من خوارزميات نوع ‪ DIRECT‬الشهيرة عبر مجموعة من مشكالت‬
‫االختبار‪ .‬يبرز أداءه بشكل ملحوظ في السيناريوهات التي تتضمن مشكالت عالية األبعاد‪.‬‬

‫الكلمات الرئيسية ‪ :‬األمثلية العالمية‪ ،‬خوارزمية ‪ ،DIRECT‬خوارزمية ‪ ،BIRECT‬استراتيجية‬


‫تقسيم قطرية‪ ،‬نظام العينات‪.‬‬

‫‪iii‬‬
Acknowledgments

First and foremost, I express my gratitude to the Almighty Allah for granting me the
courage and patience to complete this work.
This work would not have been as comprehensive and could not have come to fruition without the assistance and guidance of Prof. CHITER LAKHDAR. I thank him for the exceptional quality of his guidance, his patience, his rigor, his valuable advice, and his availability during the preparation of this thesis.
I also extend my respect and gratitude to the members of the jury who honored me by evaluating this work.
My deepest gratitude goes to my dear parents for their assistance, support, and guidance. I would also like to thank my husband for his unwavering patience and support, which have been more than indispensable to me.
A special thank you to the professors who taught us throughout our academic journey and
illuminated our lives with the light of knowledge.
Finally, I extend my thanks in advance to those who will kindly read or listen to me and,
above all, help me progress.

Dedication

With immense pleasure, I dedicate this work :

To the two noblest and dearest people in the world : My late Father and my Mother, who sacrificed the most beautiful years of their lives to see me succeed and supported me until the end.

To my brothers, who never ceased to encourage me.

To my husband and to the light of my life, my daughter LINA.

To my thesis advisor, who has truly been a great support to me.

To my very dear friends and comrades.

To all those who supported me, whether near or far.

To all those we love and those who love us.

Table of Contents

Abstract ii

Résumé i

Acknowledgments iv

Introduction 1
0.1 Context and Research Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
0.2 Object of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
0.3 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
0.4 Research Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.5 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.6 Structure of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.7 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1 Global Optimization : State of the Art 5


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Description of Global Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Classification of Global Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Classification of Global Optimization Methods . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.1 Stochastic Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.2 Heuristic methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4.3 Deterministic Global Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2 The DIRECT Algorithm 16
2.1 Lipschitz Optimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 The one-dimensional DIRECT Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.1 Area-Dividing Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2 Potentially optimal intervals (POIs) . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.3 The one-dimensional DIRECT Algorithm . . . . . . . . . . . . . . . . . . . . . . 22
2.2.4 The multidimensional DIRECT algorithm . . . . . . . . . . . . . . . . . . . . . . 22
2.2.5 Strengths of the DIRECT algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.6 Weaknesses of the DIRECT algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3 Overview of DIRECT-type derivative-free methods . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.1 Classification of DIRECT Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3 Diagonal Partitioning Strategy Using Bisection of Rectangles and a Novel Sampling Scheme (The BIRECT-V Algorithm) 44
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.1 From BIRECT to BIRECT-V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.2 Description of the new sampling scheme . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4 Conclusions and Perspectives 71

Bibliography 72

5 Appendices 82
.1 Key characteristics of the Hedar test problems [18] . . . . . . . . . . . . . . . . . . . . . 83
.2 Global minimizer found by the BIRECT-V algorithm using Hedar test problems
[18] with modified domain from Table 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
.3 Derivative-free optimization software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
List of Figures

2.2.1 The division strategy for both Piyavskii and DIRECT algorithms . . . . . . . . . . . 20
2.2.2 Dividing of a hypercube. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.3 Illustration of the main operations of BIRECT in n = 2. The gray-colored hyper-rectangles highlight those that are potentially optimal. They are selected in the first phase of the current iteration and divided in the second phase of the same iteration. Each subsequent iteration is obtained by sampling two diagonal points and dividing them along the longest direction, starting with the lowest index if multiple equally long longest directions exist. The blue points represent newly sampled points where the objective function is evaluated, while for the green points, the function values from previous iterations are reused. . . . 32
2.3.4 The maximum distance between the sampled points (solid blue circles) and
an unsampled point (open red circle) is higher in the ADC method than in the
BIRECT method. (Color figure online). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3.5 The maximum distance between the sampled points (solid blue circles) and
an unsampled point (open red circle) is higher in the ADC method than in the
BIRECT method. (Color figure online). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.2.1 Description of the initialization and the first three iterations used in the new sampling scheme on the Branin test problem. Each iteration is performed by sampling two new points (blue color) issued from the old ones (red color) and bisecting potentially optimal hyper-rectangles (shown in gray color) along the coordinate (branching variable x_br, 1 ≤ br ≤ n) having the largest side length (d^i_br, where d^i_j = b^i_j − a^i_j, j = 1, ..., n), considering first the coordinate directions with the smallest index j (if more coordinates may be chosen). . . . 49
3.2.2 Illustration of selection, sampling and partitioning schemes from iteration 4 to 5 on the Branin test problem. A situation where two adjacent hyper-rectangles share the same vertex : after bisection of the lower-left hyper-rectangle in iteration 4, the newly created point falls exactly on the one in the adjacent hyper-rectangle. This point is marked with a circle in iteration 5. . . . 50
3.2.3 Geometric interpretation of potentially optimal hyper-rectangles using the BIRECT-V
algorithm on the Branin test function in the seventh iteration : (right side),
POHs correspond to the lower-right convex hull of points marked in blue color
(left side). The position of six points (values of f (x )) obtained in BIRECT can be
clearly distinguished. We observe three sampled points at which the objective
function has been re-evaluated. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.4 Iteration progress of the BIRECT-Vl algorithm on the left-hand side and BIRECT-V
on the right-hand side while solving the Ackley (No. 3, n =10, from Table 3.3)
test problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
List of Tables

3.1 Preliminary results during the first run of the BIRECT-V algorithm. . . . . . . . . . 57
3.2 BIRECT-Vl and BIRECT-V versus BIRECT and BIRECT-l. . . . . . . . . . . . . . . 61
3.3 Comparison between BIRECT-Vl, BIRECT-V, BIRECT-l, BIRECT, DIRECT-l,
and DIRECT algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

1 Key characteristics of the Hedar test problems [18]. . . . . . . . . . . . . . . . . . . . . . 83


2 Global minimizer found by the BIRECT-V algorithm using Hedar test problems
[18] with modified domain from Table 3.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3 Derivative-free solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

List of Algorithms

1 Divide Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2 The main steps of the DIRECT Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Pseudo code of the BIRECT algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4 BIRECT-V algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Introduction

0.1 Context and Research Motivation

The field of global optimization algorithms has seen significant advancements and innovative adaptations since the seminal work of Jones et al. in 1993 [23], when they introduced the pioneering DIRECT (DIviding RECTangles) algorithm. This thesis traces the evolutionary trajectory of these optimization methods, aiming to address the inherent limitations of the original DIRECT approach.
The original DIRECT algorithm, while groundbreaking, had some recognized limitations. One key limitation was the potential to overly confine the search to local regions, which could lead to excessive function evaluations if not controlled. To address this, researchers emphasized selecting an appropriately sized algorithmic parameter, denoted as ε, to strike a balance between exploration and exploitation. Another approach, proposed by Jones in 2001 [24], involved an alternating strategy where DIRECT alternated between global and local optimization phases. However, the effectiveness of this alternation depended on the predefined quantity of function evaluations in the global phase. Previous literature introduced two-phase schemes that seamlessly balanced local and global information within a single optimization procedure.
Various extensions of the DIRECT algorithm have been suggested, with a focus on improving the selection of potentially optimal hyper-rectangles and modifying partitioning and sampling strategies. Unlike most DIRECT-type algorithms that utilize a central sampling strategy, some algorithms, such as BIRECT (BIsecting RECTangles) [45], have explored alternative approaches. BIRECT-V (V for vertices) stands out among them, introducing a novel diagonal partitioning and sampling scheme : it combines bisection with the sampling of two strategically positioned points along the main diagonal of the initial hyper-rectangle. Other approaches, such as ADC and both DISIMPL versions, employ distinct partitioning strategies.

0.2 Object of the Thesis

The objective of this thesis is to introduce an innovative diagonal partitioning and sam-
pling approach, extending the recently developed BIRECT (BIsection of RECTangles) algo-
rithm. This research addresses global optimization problems characterized by Lipschitz-continuous
objective functions, where the Lipschitz constant is unknown.

0.3 Research Objectives

The objectives of this thesis encompass the following :

1. Develop and propose a novel algorithm named BIRECT-V for global optimization.

2. Extend the BIRECT-V algorithm to create a variation called BIRECT-Vl to further enhance its performance.

3. Investigate and implement the combination of bisection with sampling on diagonal vertices, a feature not commonly found in existing DIRECT-type algorithms.

4. Perform a comprehensive numerical comparison of the BIRECT-V and BIRECT-Vl algorithms against existing optimization approaches using various test problems to assess their advantages.

5. Evaluate the impact of domain modification on the performance of the BIRECT-V algo-
rithm.

6. Explore the applicability of the proposed approach in handling global optimization problems involving Lipschitz continuous functions subject to linear constraints, addressing a challenging and relevant problem.
0.4 Research Tasks

To accomplish these aims, the following research tasks must be undertaken :

1. Develop the BIRECT-V algorithm, incorporating bisection and diagonal vertex sam-
pling.

2. Extend the BIRECT-V algorithm to create the BIRECT-Vl variation.

3. Implement numerical experiments and collect data for the numerical comparison of
BIRECT-V and BIRECT-Vl with other optimization methods.

4. Analyze the results of the numerical comparison to highlight the advantages of the proposed approach.

5. Investigate the impact of domain modification on the BIRECT-V algorithm’s performance and document the findings.

6. Explore the applicability of the proposed approach to address global optimization problems involving Lipschitz continuous functions and linear constraints.

7. Document the research findings, methodologies, and experimental results in a comprehensive thesis or research report.

0.5 Methodology

The proposed approach builds upon the BIRECT algorithm’s core principles while introducing a refined diagonal partitioning and sampling strategy. By strategically modifying the optimization domain, we aim to reduce the likelihood of redundant evaluations and improve the overall efficiency of the optimization process. This enhancement is motivated by the need to strike a balance between exploration and exploitation, ensuring that the algorithm converges to the global solution effectively.

0.6 Structure of the Dissertation

The dissertation consists of an introduction, ? ? chapters, a bibliography, and a list of publications. The total length of the dissertation is ? ? pages, including ? ? figures and ? ? tables. The dissertation was based on ? ? literature sources.

Chapter 1 provides an overview of the research context, outlines the problem statement, discusses the motivation behind the research, and presents the research’s aims and objectives. It also addresses the research questions, describes the research methods employed, and discusses the research’s approbation.

Chapter 2 delves into the theoretical foundations of DIRECT-type algorithms.

Chapter 3 focuses on the development of an extension of the BIRECT algorithm specifically tailored for black-box global optimization, named BIRECT-V (V for vertices).

Finally, at the end of the dissertation, the main results of this thesis are summarized.

0.7 Publications
CHAPTER 1

Global Optimization : State of the Art

1.1 Introduction

The resolution of global optimization problems has become a central topic in scientific research and engineering. It encompasses various fields such as computer science, mathematics, industrial engineering, and many others. This growing importance arises from the need to find optimal solutions in various contexts, whether it’s optimizing industrial processes, logistic planning, or solving complex problems in artificial intelligence. Advances in this field continue to significantly impact numerous sectors and contribute to improving efficiency, performance, and decision-making in various practical applications.

Furthermore, humans naturally seek to enhance their daily lives and pursue perfection. They continually strive to minimize expenses, rent, or vehicle consumption, while attempting to maximize their assets. Mathematicians step in to actualize these desires by modeling real-life problems using cost functions and various optimization techniques.

Today, global optimization has become essential for solving a wide range of problems, whether in industry or other sectors. Indeed, we have witnessed a rapid growth in research employing global optimization methods in recent years. This trend can be observed across all scientific domains.

In this chapter, we will explore the general definitions of global optimization methods, which can be divided into two main categories : deterministic and non-deterministic (stochastic) methods.

1.2 Description of Global Optimization

Global optimization is a field at the intersection of applied mathematics and numerical analysis, focusing on optimizing a function or a set of functions according to specific criteria. Typically, a set of bounds and, more generally, constraints are also provided, and decision variables are optimized while taking these constraints into account. Global optimization distinguishes itself from local optimization by its objective of finding the absolute maximum or minimum across the entire range of values within the domain of definition, as opposed to seeking local minima or maxima.

The problem we are studying here is the search for the global minimum of a real function f of n real variables x1 , x2 , . . . , xn . Such problems frequently arise in applications.

Let f : D ⊆ Rn → R be defined for all x ∈ D , where x = (x1 , x2 , . . . , xn ). The problem reads

min_{x ∈ D} f (x )    (1.1)

where :

- f : The objective function (or cost function, criterion, or economic function).

- D : The feasible (or admissible) domain.

The goal is to find x ∗ such that f (x ∗ ) ≤ f (x ) for all x ∈ D .

Most of the time, the global optimization problem is defined only as a minimization problem, because the maximization of the real-valued function fˆ(x ) is equivalent to the minimization of the function :

f (x ) ≡ − fˆ(x ).

Besides finding the global minimum f ∗ , one or all global minimizers x ∗ such that f (x ∗ ) = f ∗ are supposed to be found.
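
To make the formulation concrete, the short Python sketch below states a two-dimensional box-constrained instance of problem (1.1). It is an illustration only : the Rastrigin function and the bounds are chosen here for the example and are not part of the test set used later in this thesis.

import math

# A hypothetical multimodal objective used only to illustrate min f(x) over a box D.
def f(x):
    # Rastrigin function : many local minima, unique global minimum f(0, 0) = 0.
    return 10 * len(x) + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

bounds = [(-5.12, 5.12), (-5.12, 5.12)]   # the feasible domain D

x_star = (0.0, 0.0)       # the known global minimizer of this particular example
print(f(x_star))          # 0.0 : the global minimum f*
print(f((1.0, 1.0)))      # 2.0 : a strictly worse candidate point
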
1.3 Classification of Global Optimization Problems

In the domain of global optimization, we can categorize problems into two distinct types : optimization with constraints and unconstrained optimization. When the solution space X is not equivalent to Rn , we refer to the problem (1.1) as a constrained minimization problem. Conversely, if X = Rn , it is termed an unconstrained minimization problem. It is essential to note that, for a problem to possess a global minimum, it often necessitates the imposition of additional conditions on the function f , including the requirement that the function grows towards infinity as |x | approaches infinity, expressed as lim|x |→+∞ f (x ) = +∞.

To grasp the significant challenges inherent in global optimization problems and the computational expenses associated with their resolution, it is crucial to recognize that standard techniques in nonlinear optimization typically, at best, identify local minima. Furthermore, there is no local criterion to discern whether a local solution is indeed a global minimum. Consequently, conventional optimization approaches that rely on derivatives, gradients, subgradients, and similar tools are generally ill-suited to locating or identifying global optima. It is important to emphasize that, in general, a local minimum differs from a global minimum.

Hence, researchers in the field of global optimization have introduced specialized methods whose primary objective is not merely to find local minima but, more importantly, to determine which among them constitutes the global minimum. This determination heavily relies on the inherent structural characteristics of the given problem.

It’s worth noting that these classifications are not mutually exclusive, and many optimization problems can fall into multiple categories based on different criteria. The choice of classification depends on the specific problem, its attributes, and the optimization objectives. Researchers and practitioners utilize these classifications to select appropriate solution methods and algorithms tailored to the problem’s unique nature.
1.4 Classification of Global Optimization Methods

Today, in order to solve global optimization problems, numerous algorithmic strategies are available. To guide the choice of the best strategy to use, it is necessary to consider factors such as the problem size, the properties of the objective function and constraints (continuity, differentiability, linearity, convexity, etc.), and the available time for problem-solving. Here is a non-exhaustive list of various approaches :

It is certainly impossible to mention all global optimization methods, nor to classify them, as some methods may belong to multiple categories simultaneously. However, two main classes of global optimization methods can be distinguished :

1- Stochastic Methods : These methods provide the global minimum of f with a probability close to 1.

2- Deterministic Methods : Deterministic methods converge with certainty to the global minimum of f .

1.4.1 Stochastic Methods

Probabilistic methods have made significant contributions in various branches of science. The rapid development of automation and computing has given rise to new applications of probability theory since the 1950s. These applications include stochastic modeling and stochastic algorithms, which constitute a domain with numerous open problems and high relevance. The theoretical development of stochastic algorithms is currently advanced.

While, in general, they may not yet fully satisfy practical needs, their applications continue to improve and expand. They are used in the following cases :

1- The objective function is not known analytically but is known through observable
numerical evaluations subject to random perturbation.

2- The objective function is complex and can be approximated, on average, by a random quantity.
3- The objective function is known analytically but is highly oscillatory, or the feasible
set is of high dimension (n > 100).

There are stochastic methods based on pure random search, involving the evaluation of the function f at a finite set of points in D drawn randomly and independently. These algorithms are particularly useful when f is not given analytically, or when no information about f is available other than its value at every point in D ; this is known as the "Black Box Situation."

Some stochastic methods involve deterministic procedures with local strategy, such as the multi-start method, which relies entirely on local optimization techniques. It involves multiple local descents starting from randomly drawn points, independently and uniformly from D .

Another type of stochastic algorithm, known as the "random line algorithm," seeks the
global minimum using a deterministic method applied to curves generated randomly.
As a result, the algorithm transforms a multidimensional global optimization problem
into a one-dimensional problem.

Other stochastic methods are based on partitioning the feasible set D into a finite number of subsets, followed by the application of a global optimization algorithm (deterministic or stochastic) successively on these randomly drawn subsets, based on a distribution concentrated on the elements of the subdivision.

Among the techniques based on pure random search, we also mention Markovian algorithms, which represent a modification of these techniques. These techniques include adaptive elements that allow simulating a law dependent on the previous point xk and the value f (xk ) to generate xk+1 . One of the most well-known Markovian algorithms is the "simulated annealing" algorithm.
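
The Markovian character of simulated annealing, where the next iterate is drawn from a law that depends only on the current point xk and the value f (xk ), can be sketched as follows (a simplified Python version ; the geometric cooling schedule and the step size are arbitrary choices made only for the illustration).

import math, random

def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.999, n_iter=10000, seed=0):
    rng = random.Random(seed)
    x, fx, t = list(x0), f(x0), t0
    best_x, best_f = list(x), fx
    for _ in range(n_iter):
        # Propose a random perturbation of the current point (the Markovian move).
        y = [xi + rng.uniform(-step, step) for xi in x]
        fy = f(y)
        # Always accept downhill moves ; accept uphill moves with probability exp(-(fy - fx)/t).
        if fy < fx or rng.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = list(x), fx
        t *= cooling   # slowly decrease the "temperature"
    return best_x, best_f
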

A second class of stochastic methods is the class of methods based on solving stochastic
differential equations. These methods aim to find global minima of a function f defined
on Rn and sufficiently regular by considering the stochastic differential equation of Itô :

d Φ(t ) = −∇ f (Φ(t ))d t + ε(t )d β (t )

where β (t ) is a multidimensional standard Brownian motion, and ε(t ) is a function that tends to 0 as t tends to ∞.

For a function ε(t ) that slowly tends to 0 as t approaches ∞, the solution of the Itô equation is a recurrent diffusion process, whose stationary distribution concentrates on the set of global minimizers of f . The method, therefore, attempts to obtain a global minimizer of f by observing, as t → ∞, a trajectory of the diffusion computed numerically.

Drawbacks of Stochastic Methods

Stochastic algorithms provide the global minimum with a probability close to 1 when a large number of evaluations of f (x ) are conducted. They are relatively straightforward to implement on a computer but tend to be less efficient. However, these methods are deployed when deterministic approaches prove inadequate for solving global optimization problems.

Stochastic methods encounter particular challenges when confronted with equality constraints and managing a substantial number of variables, typically exceeding a dozen. Their primary limitation is their tendency to miss the global solution, and there is no assurance regarding the quality of the resulting solution. Additionally, these methods can be computationally intensive in terms of CPU time when compared to deterministic counterparts.

1.4.2 Heuristic methods

Heuristic methods are valuable problem-solving strategies in optimization that prioritize practicality and efficiency. They offer a pragmatic approach to complex problems where finding the optimal solution is often computationally infeasible. Instead of guaranteeing optimality, heuristics focus on quickly delivering reasonably good solutions, making them indispensable in various domains.

These methods exhibit several key characteristics. They provide approximate solutions that may not be optimal but are usually acceptable for practical purposes. Heuristics are designed for efficiency, making them suitable for large-scale or time-sensitive problems. They are often tailored to specific optimization challenges, leveraging problem-specific knowledge and rules to guide the search for solutions. Heuristics employ diverse search strategies, including greedy algorithms, local search, simulated annealing, genetic algorithms, and more, to navigate the solution space effectively. Many heuristics use iterative refinement, progressively improving candidate solutions.

Heuristic methods find applications in logistics, transportation, manufacturing, telecommunications, and artificial intelligence, among others. They excel in addressing real-world problems where exact solutions are elusive due to factors like problem size, uncertainty, or computational complexity. In summary, heuristics provide practical and efficient tools for tackling optimization challenges, delivering valuable solutions in diverse fields.

Drawbacks of Heuristic Methods

Heuristic methods, while providing practical solutions to complex problems, come with inherent limitations. First and foremost, they do not guarantee the discovery of optimal solutions, as their primary aim is to deliver reasonably good solutions quickly, often without reaching the best possible outcome. Additionally, some heuristics are sensitive to initial conditions, making them less reliable due to the potential for significantly different solutions arising from slight variations in initial configurations. Evaluating the quality of heuristic solutions can be challenging, given the absence of a clear optimality criterion. Moreover, heuristics are problem-specific, and adapting them to new domains can be time-consuming. They also lack theoretical guarantees regarding performance, making it difficult to predict their effectiveness. In some cases, heuristics may not converge to a solution or could run indefinitely. Computational intensity, the risk of getting stuck in local optima, parameter tuning, and limited generalizability are further drawbacks. Despite these limitations, heuristics remain valuable for solving complex real-world problems where obtaining exact solutions is impractical, with their effectiveness depending on problem characteristics and practitioner expertise.
1.4.3 Deterministic Global Methods

In this category of methods, randomness is not a factor. That is, to solve a problem,
the algorithm will always behave in the same way and provide the same result. These
algorithms can be classified based on the types of problems they can handle, including
linear programs, convex problems, quadratic problems, polynomial problems, or more
general cases.

These techniques generally have the advantage of not requiring starting points. More
importantly, they provide a deterministic answer regarding the quality of the solutions
found : Is the optimum local or global ? What is the degree of certainty ? And so on.

This precision is significantly important because it is often much less costly to find a solution than to prove that it is indeed the global optimum. Deterministic methods employ two techniques : reduction and partition.

Dimension Reduction for Multi-Extremal Problems

Multi-extremal multidimensional optimization problems are complex in applied mathematics, so it is natural to consider transforming these problems into simpler ones, for example, by reducing their dimension. There are several ideas for reducing a multi-extremal multidimensional global optimization problem to one or more optimization problems with lower dimensions, especially one-dimensional problems.

The simplest theoretical scheme for dimension reduction is called "multi-step reduction." It is based on the representation :

min_{x ∈ X} f (x ) = min_{x1 ∈ [0,1]} · · · min_{xn ∈ [0,1]} f (x1 , ..., xn )

where f is a continuous function over the unit cube X = [0, 1]n . This scheme can be used to transform the initial global optimization problem over a cube into multiple one-dimensional global optimization problems, although the number of these problems is generally very large.
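
For n = 2 the scheme can be sketched in a few lines of Python (the inner one-dimensional minimizations are done here by a crude grid search, which merely stands in for any one-dimensional global method ; the objective is a placeholder).

def grid_min(g, n_points=201):
    # Crude one-dimensional global minimization of g over [0, 1].
    pts = [i / (n_points - 1) for i in range(n_points)]
    return min(g(t) for t in pts)

def f(x1, x2):
    # Example objective on the unit square X = [0, 1]^2.
    return (x1 - 0.3) ** 2 + (x2 - 0.7) ** 2

# min over X of f  =  min over x1 of ( min over x2 of f(x1, x2) )
value = grid_min(lambda x1: grid_min(lambda x2: f(x1, x2)))
print(value)   # close to 0, attained near (0.3, 0.7)
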
Methods Based on Partition and Elimination Techniques (Branch and Bound)

Many problems, especially optimization problems, have a finite (or countable) set of
solutions. In principle, it is possible to enumerate all these solutions and select the one
that suits our needs. However, the major drawback of this approach is the prohibitively
large number of solutions ; enumerating them is not practical.

The branch and bound method (progressive evaluation and separation procedure) aims to intelligently enumerate these solutions by utilizing certain properties of the problem. This technique allows for the elimination of partial solutions that do not lead to the desired solution, making it possible to often find the sought-after solution in reasonable time. Of course, in the worst case, this method still requires explicit enumeration of all problem solutions.

To achieve this, the branch and bound method employs a function that places bounds
on certain solutions, either excluding them or retaining them as potential solutions.
The performance of the branch and bound method depends, among other things, on
the quality of this function.

General Algorithm :

For convenience, the execution of the branch and bound method is represented through a tree structure. The root of this tree represents the set of all solutions to the considered problem. Below, we summarize the branch-and-bound method for minimization problems.

To apply the branch-and-bound method, we need :

- A means to calculate a lower bound for a partial solution.

- A strategy to subdivide the search space into smaller search spaces.

- A means to calculate an upper bound for at least one solution.

The method starts by considering the initial problem with its set of solutions, referred to as the root. Procedures for lower and upper bounds are applied to the root. If these two bounds are equal, then an optimal solution is found, and the process terminates. Otherwise, the set of solutions is divided into two or more subproblems, becoming children of the root.
The method is then recursively applied to these subproblems, generating a tree structure. If an optimal solution is found for a subproblem, it is feasible but not necessarily optimal for the original problem. Since it is feasible, it can be used to eliminate all its descendants : if the lower bound of a node exceeds the value of an already known solution, it can be asserted that the global optimal solution cannot be contained in the subset of solutions represented by this node. The search continues until all nodes are either explored or eliminated.
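
The loop just described can be sketched for one-dimensional box-constrained minimization as follows (Python ; the lower bound used here is a Lipschitz centre-point bound and assumes a known constant L, while the upper bound is the best evaluated value, so both bounding procedures are placeholder choices made for the illustration).

import heapq, math

def branch_and_bound(f, lo, hi, L, tol=1e-3):
    # Nodes are intervals [a, b] ; lower bound = f(c) - L*(b - a)/2 at the centre c ;
    # the incumbent (upper bound) is the best evaluated value found so far.
    c = (lo + hi) / 2
    best_x, best_f = c, f(c)
    heap = [(best_f - L * (hi - lo) / 2, lo, hi)]
    while heap:
        bound, a, b = heapq.heappop(heap)
        if bound > best_f - tol:              # eliminate : cannot improve the incumbent
            continue
        mid = (a + b) / 2
        for sa, sb in ((a, mid), (mid, b)):   # branch : split the interval in two
            sc = (sa + sb) / 2
            fc = f(sc)
            if fc < best_f:
                best_x, best_f = sc, fc       # update the incumbent
            child_bound = fc - L * (sb - sa) / 2
            if child_bound < best_f - tol:
                heapq.heappush(heap, (child_bound, sa, sb))
    return best_x, best_f

# Example : minimize sin(3x) + 0.1 x^2 over [-3, 3] with a (generously chosen) L = 10.
x_opt, f_opt = branch_and_bound(lambda x: math.sin(3 * x) + 0.1 * x * x, -3.0, 3.0, L=10.0)
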

Drawbacks of Deterministic Global Methods

Deterministic global optimization methods offer valuable advantages in their ability to guarantee convergence to the global minimum. However, they are not without limitations. One notable drawback is their computational cost, which can be prohibitive for high-dimensional problems or those with complex constraints. Moreover, these methods are sensitive to the initial guess, and finding an appropriate starting point can be challenging, particularly in multi-modal optimization landscapes. Their performance is also contingent on problem characteristics, such as convexity and differentiability, which may not hold in real-world scenarios. Additionally, when dealing with mixed-integer variables, deterministic methods face difficulties due to the combinatorial nature of the search space, which can significantly slow down the optimization process. These limitations necessitate careful consideration of problem-specific attributes when selecting an optimization approach.

1.5 Conclusions

In conclusion, global optimization is a vital field for finding the best solutions across diverse applications. It distinguishes itself from local optimization by seeking absolute extrema. Global optimization problems can be constrained or unconstrained, with additional conditions often required for global solutions.

Global optimization methods fall into two main categories : stochastic methods, which provide high-probability solutions, and deterministic methods, which guarantee convergence to the global minimum.

Stochastic methods are suitable for complex problems, while heuristic methods prioritize practicality. Deterministic methods offer precision but can be computationally expensive.

The choice of method depends on the problem and resources available. Researchers
continually refine these methods to address real-world challenges effectively.

In the upcoming sections, we’ll explore these methods further, understanding their strengths and weaknesses. This chapter equips you with insights into the world of global optimization and its real-world applications.
CHAPTER 2

The DIRECT Algorithm

Among the hybrid methods, DIRECT is a Lipschitzian-based algorithm, originally introduced by Jones et al. [23] in 1993, designed for global optimization within the constraints of simple variable bounds, all without the need for gradient information. This algorithm guarantees convergence to the global optima under the condition that the objective function is continuous, or at least continuous in the vicinity of the global optimum [14]. It is particularly well-suited for black-box optimization (BBO) applications, as determining a Lipschitz constant can be an insurmountable challenge when a closed-form expression of the objective function is unknown.

Jones et al. introduced an innovative perspective by treating the Lipschitz constant as a weighting parameter that influences the balance between global and local search. Traditional Lipschitzian-based approaches assume a high, fixed value for the Lipschitz constant, ensuring that it is greater than the maximum rate of change of the objective function. However, the DIRECT algorithm takes a different approach by conducting multiple searches with varying Lipschitz constants. This enables the algorithm to simultaneously consider both local and global exploration, leading to faster convergence.

One notable advantage of the DIRECT algorithm is its applicability to high-dimensional problems. In cases where the objective function involves n variables with straightforward bounds, traditional Lipschitzian algorithms treat the search space as an n-dimensional hyperrectangle in Euclidean space. This approach involves dividing the entire space into smaller hyperrectangles and conducting function evaluations at their vertices. During the initialization phase, this typically necessitates 2^n function evaluations, one at each vertex. However, the DIRECT algorithm simplifies this complexity by requiring evaluations solely at the midpoints rather than the vertices. To ensure sampling at the center points, mesh refinement involves dividing the interval into thirds, and evaluations are performed at the midpoint of the left and right thirds. The initial center point simply becomes the midpoint of the middle interval, where the objective function is already known. Notably, the algorithm does not rely on a known Lipschitz constant to determine which hyperrectangles to sample next. Instead, it calculates a set of potential hyperrectangles with the lowest objective values and the greatest expected decrease in objective function values. For termination, since the Lipschitz constant remains unknown, the number of algorithm iterations is pre-specified.

In this chapter, we delve into the DIRECT algorithm, which was conceived by D. R. Jones, C. D. Perttunen, and B. E. Stuckman [23] in 1993. The name "DIRECT" stems from its fundamental principle of dividing rectangles, a core aspect of the algorithm. Before delving into the intricacies of the DIRECT algorithm, we establish the abstract problems that this algorithm aims to address. Additionally, we provide an overview of Lipschitz continuous functions and their properties, which serve as foundational concepts for many classical methods in Lipschitz optimization. However, we also highlight the limitations of these classical approaches. Towards the end of the chapter, we introduce the one-dimensional variant of the DIRECT algorithm and elucidate how it can be extended to handle problems in higher dimensions.

2.1 Lipschitz Optimization Problem

The Lipschitz optimization problem is a type of mathematical optimization problem where the objective function is assumed to have Lipschitz continuity. Lipschitz continuity is a mathematical property that characterizes how much a function can change within a given distance in its domain. Specifically, a function f is said to be Lipschitz continuous if there exists a Lipschitz constant L such that for all pairs of points x and y in its domain :

| f (x ) − f (y )| ≤ L · ∥x − y ∥ (2.1.1)

Here, L is the Lipschitz constant, and ∥x − y ∥ represents the distance between points x and y in the domain of the function. This property ensures that the function does not change too rapidly, and it provides a useful condition for certain optimization problems.
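
Numerically, inequality (2.1.1) can be probed from samples alone : the largest observed difference quotient gives a lower estimate of any admissible Lipschitz constant L (a Python sketch ; nothing is assumed about f beyond the ability to evaluate it).

import math

def lipschitz_lower_estimate(f, points):
    # max |f(x) - f(y)| / |x - y| over sampled pairs ; any valid L is at least this large.
    values = [(x, f(x)) for x in points]
    est = 0.0
    for i, (x, fx) in enumerate(values):
        for y, fy in values[i + 1:]:
            if x != y:
                est = max(est, abs(fx - fy) / abs(x - y))
    return est

pts = [i / 100 for i in range(101)]
print(lipschitz_lower_estimate(math.sin, pts))   # close to 1 = sup |cos| on [0, 1]
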

The Lipschitz optimization problem typically involves finding the minimum or maximum of a Lipschitz continuous objective function subject to certain constraints. The Lipschitz continuity property can be useful in optimization because it provides a bound on how much the function can vary, which can guide the optimization process.

Solving Lipschitz optimization problems can be challenging, as the Lipschitz continuity constraint adds an extra level of complexity to the optimization process. Various optimization algorithms and techniques have been developed to address Lipschitz optimization problems, taking into account the Lipschitz constant to guide the search for optimal solutions while ensuring convergence. These problems often arise in engineering, machine learning, and other fields where smoothness and boundedness of the objective function are critical considerations.

2.2 The one-dimensional DIRECT Algorithm

Both Piyavskii and DIRECT are global optimization algorithms that divide the search space into smaller regions. Piyavskii focuses on maximizing the reduction in the objective function’s value when dividing, while DIRECT divides into three equal parts perpendicularly to the dimension with the smallest wi value. The choice of algorithm depends on the specific characteristics of the optimization problem and the available information about the objective function. The problems of the Piyavskii algorithm are addressed by the DIRECT algorithm, where DIRECT stands for DIviding RECTangles. This algorithm was developed by D. R. Jones, C. D. Perttunen, and B. E. Stuckman [23].

To explain this algorithm, we first assume that we know a Lipschitz constant and drop this requirement later. We start with inequality (2.1.1) in its one-dimensional formulation :

| f (x ) − f (x ′ )| ≤ L |x − x ′ | ∀x , x ′ ∈ [a , b ].

Let c = (a + b )/2 and set x ′ = c in (2.1.1). Then, for all x ∈ [a , b ] :

For x ∈ [a , c ] :
f (c ) + L (x − c ) ≤ f (x ) ≤ f (c ) − L (x − c ).

For x ∈ [c , b ] :
f (c ) − L (x − c ) ≤ f (x ) ≤ f (c ) + L (x − c ).

These two inequalities define a region in which the graph of the function is contained. Furthermore, we get a lower bound D (a , b ) for f on [a , b ] by letting x = a or x = b in these inequalities. This means we evaluate the bounding function at the two endpoints of the interval :

D (a , b ) = f (c ) − L (b − a )/2.

The bounding area is large, and therefore needs to be refined. To do this, we need to sample the function at more points and divide the search area. The next section explains how DIRECT does this.
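
In code, the centre-point bound reads as follows (a direct transcription of the formula above ; f and the constant L are assumed to be given).

import math

def center_lower_bound(f, a, b, L):
    # D(a, b) = f(c) - L*(b - a)/2 with c the midpoint of [a, b] :
    # no point of [a, b] can take a value below this bound if f is L-Lipschitz.
    c = (a + b) / 2
    return f(c) - L * (b - a) / 2

# Example : bound sin on [0, 2] with L = 1 (since |sin'| <= 1) ; the true minimum is 0.
print(center_lower_bound(math.sin, 0.0, 2.0, L=1.0))   # sin(1) - 1, about -0.16
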

2.2.1 Area-Dividing Strategies

The DIRECT algorithm divides the original interval into three sub-intervals of equal length and assesses the function at the midpoints of these new intervals. As one of these midpoints coincides with the midpoint of the original interval, only two additional evaluations of the function are needed. This distinct feature sets DIRECT apart from the Piyavskii algorithm, which divides the original interval into two sub-intervals of different lengths.
Galperin [11, 12] and Shen et al. [56] have also proposed evaluating the function at the interval’s midpoint but then dividing the interval there. Similarly, Gourdin et al. [13] advocate the same division strategy as DIRECT. They extend this approach to the N-dimensional case, a topic we will discuss in Section 2.2.4. However, their method differs from DIRECT in how they decide which hyperrectangle to divide next.
The advantage of dividing the interval as done in the DIRECT algorithm is that it requires evaluation only at the midpoint of the interval. This concept readily extends to higher dimensions without an exponential increase in the number of function evaluations, as hyperrectangles in higher dimensions still have a single center. In contrast, if we were to extend the idea of evaluating the function at the endpoints of intervals to higher dimensions, we would need to evaluate the function at all corners, resulting in an exponential increase in function calls with increasing dimensionality. Moreover, the DIRECT strategy is not dependent on the Lipschitz constant since the points at which the function is evaluated are always midpoints. This property holds significance for subsequent discussions.

A visual representation of the two different dividing strategies employed by DIRECT and the Piyavskii algorithm is depicted in Figure 2.2.1. In the diagram, ’×’ marks represent the points where the function is evaluated, while ’|’ marks indicate the endpoints of the intervals.

FIGURE 2.2.1 – The division strategy for both Piyavskii and DIRECT algorithms.
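
The trisection rule itself fits in a few lines (a Python sketch ; the centre value f (c ) is passed in so that it is reused rather than recomputed, reflecting the fact that only two new evaluations are needed).

def trisect(f, a, b, fc):
    # Split [a, b] into three equal sub-intervals ; the old centre c = (a + b)/2 is
    # already the centre of the middle third, so only two new evaluations are made.
    third = (b - a) / 3
    left, middle, right = (a, a + third), (a + third, b - third), (b - third, b)
    f_left = f((left[0] + left[1]) / 2)      # new evaluation
    f_right = f((right[0] + right[1]) / 2)   # new evaluation
    return [(left, f_left), (middle, fc), (right, f_right)]
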
2.2.2 Potentially optimal intervals (POIs)

The DIRECT algorithm effectively selects intervals for division without requiring prior
knowledge of the Lipschitz constant, denoted as L . Initially, we assume the value of L
is known.

For each interval [a i , bi ], we calculate a lower bound D (a i , bi ) on the function f . The key strategy is to select the interval with the smallest bound, D (a i , bi ), for division in the next iteration. This choice is driven by the potential for significant improvement in the function value within this interval.

In many practical scenarios, the Lipschitz constant L remains unknown. To overcome this limitation, we aim to estimate L based on available data in a specific step. Specifically, we focus on points where a certain constant K̃ i > 0 exists, indicating that the corresponding interval would be chosen if L = K̃ i . This implies that we can draw a line with slope K̃ i through the point ((bi − a i )/2, f (ci )), ensuring that all other generated points lie above this line.

In cases where multiple intervals have the same length, we only consider points that represent one of these intervals with the smallest function value at their midpoint. These points, for which such a K̃ i exists, hold the potential for the most substantial function improvement if the Lipschitz constant were K̃ i .

These intervals, where K̃ i exists, are referred to as potentially optimal intervals.

The formal definition of a potentially optimal interval is as follows :

Definition 2.2.1 Let ε > 0 be a positive constant, and let fmin be the current best value of the function. An interval j is said to be potentially optimal if there exists a rate of variation K̃ such that

f (c j ) − K̃ (b j − a j )/2 ≤ f (ci ) − K̃ (bi − a i )/2, ∀i (2.2.2)

f (c j ) − K̃ (b j − a j )/2 ≤ fmin − ε| fmin |. (2.2.3)

In this definition, inequality (2.2.2) reflects the decision to choose intervals that promise the greatest improvement in the function’s value if f were Lipschitz with a Lipschitz constant K̃ .

Inequality (2.2.3) ensures that we have the potential for a significant decrease in f within the considered interval.
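
Definition 2.2.1 can be checked directly : inequality (2.2.2) confines the admissible rates K̃ to an interval [Klo , Khi ] determined by the intervals that are smaller, of equal size, or larger than interval j , and inequality (2.2.3) must then hold for some admissible rate (it suffices to test the largest one). A Python sketch, with each interval stored as the pair (di , fi ) = ((bi − a i )/2, f (ci )) :

def is_potentially_optimal(j, intervals, f_min, eps=1e-4):
    # intervals : list of pairs (d_i, f_i) with d_i = (b_i - a_i)/2 and f_i = f(c_i).
    d_j, f_j = intervals[j]
    k_lo, k_hi = 0.0, float("inf")
    for i, (d_i, f_i) in enumerate(intervals):
        if i == j:
            continue
        if d_i < d_j:                 # smaller intervals push the admissible rate up
            k_lo = max(k_lo, (f_j - f_i) / (d_j - d_i))
        elif d_i > d_j:               # larger intervals cap it from above
            k_hi = min(k_hi, (f_i - f_j) / (d_i - d_j))
        elif f_i < f_j:               # same size but strictly better value : j is dominated
            return False
    if k_lo > k_hi:
        return False
    if k_hi == float("inf"):          # no larger interval constrains the rate K~
        return True
    # Inequality (2.2.3), tested at the most favourable admissible rate K~ = k_hi.
    return f_j - k_hi * d_j <= f_min - eps * abs(f_min)
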

2.2.3 The one-dimensional DIRECT Algorithm

Let [a 1 , b1 ] be the initial interval, and set an initial accuracy threshold ε > 0. Initialize fmin as the current best function value and start the iteration.

1 : Evaluate f at the midpoints c1 = (a 1 + b1 )/2 and c2 = (a 2 + b2 )/2.

2 : Update the interval bounds based on the function evaluations :

If f (c1 ) < f (c2 ),

a 2 ← c2 ,

else,

b1 ← c 1 .

3 : Update fmin if a new minimum is found :


If f (c1 ) < fmin , set fmin ← f (c1 ).
If f (c2 ) < fmin , set fmin ← f (c2 ).

4 : Check the stopping condition : If b1 − a 1 < ", terminate the algorithm.

5 : Otherwise, return to step 1 for the next iteration.

2.2.4 The multidimensional DIRECT algorithm

In the following, we describe how to extend the DIRECT algorithm to the multi-dimensional case. We first assume that P is a hyperrectangle in Rn . The main difference in this technique, compared to the one-dimensional case, is the procedure for dividing the search space. For better understanding, we start with a hypercube and then describe the more general case of a hyperrectangle.
Dividing in higher dimensions

Dividing of a hypercube :

Let c be the center point of a hypercube. Evaluate the function at the points c ± δei ,
where δ equals 1/3 of the side length of the cube, and ei is the i -th Euclidean base vector.
Define wi as
wi = min{ f (c + δei ), f (c − δei )}.

Divide the hypercube in the order given by the wi , starting with the lowest wi .

We divide the hypercube first perpendicular to the direction with the lowest wi . Then we divide the remaining volume perpendicular to the direction of the second lowest wi , and so on until the hypercube is divided in all directions. The division we create with this strategy puts c in the center of a hypercube with side length δ. Let b = arg min_{i =1,...,N} { f (c + δei ), f (c − δei )}. Then b will be the center of a hyperrectangle with one side of length δ and N − 1 sides of length 3δ. This means that of the new hyperrectangles, the one with the smallest function value at its center has the largest volume of all new hyperrectangles.

Division of a hyperrectangle :

When it becomes necessary to subdivide one of the hyperrectangles, our approach is to exclusively split them along their longest side. This strategic choice ensures that we consistently reduce the maximum length of the hyperrectangle. Additionally, it’s worth noting that the hyperrectangles generated by the DIRECT algorithm have, at most, two distinct side lengths.

Moving forward to Figure 2.2.2 b, we encounter the subsequent step within the algorithm. Here, DIRECT proceeds to partition the shaded region. In the second box of Figure 2.2.2 b, you can observe the specific locations where DIRECT samples the function, while the third box illustrates the single division of the rectangle.

Continuing to Figure 2.2.2 c, we delve into the third stage of the algorithm within this particular example. In this step, DIRECT is set to divide the two shaded rectangles. One of these rectangles happens to be a square, thus it undergoes a double division, as explained earlier. Conversely, the larger area takes the form of a rectangle and undergoes a singular division operation.

FIGURE 2.2.2 – Dividing of a hypercube.

Consequently, we have successfully described the process and presented it as the "Divide
algorithm". The rectangle division procedure is summarized in Algorithm
Algorithm 1 Divide Algorithm
1: Identify the set I of dimensions, the maximum side length d , and set δ = d3 .
2: Evaluate f at the points c ± δei , where i ∈ I .
3: Calculate w and divide according to the values of w .

Potentially optimal hyper-rectangles(POHs)

The only remaining component needed to extend the DIRECT algorithm to n dimen-
sions is the definition of potentially optimal hyperrectangles. The selection procedure
at the initial step is trivial : there exists only one candidate. To select potentially
optimal hyper-rectangles in future iterations, DIRECT assesses the good-
ness based on the lower bound estimates for the objective function f (ci ) over each
hyper-rectangle. The requirement of potential optimality is stated formally in Defini-
tion (2.2.2).

Définition 2.2.2 Let ε > 0 be a positive constant, and let fmin be the current best value of
the function. A hyper-rectangle j is said to be potentially optimal if there exists a rate of
variation K̃ such that

f (c j ) − K̃ δ j ≤ f (ci ) − K̃ δi , ∀i ,    (2.2.4)

f (c j ) − K̃ δ j ≤ fmin − ε| fmin |.    (2.2.5)

The measure of the hyper-rectangle, denoted as δi , is calculated using the formula :

δi = (1/2) ||b − a ||2    (2.2.6)

Here, ||b − a || represents the Euclidean norm of the vector (b − a ), where a and b are
the vectors defining the bounds of the hyper-rectangle.

The hyper-rectangle j is considered potentially optimal if the lower Lipschitz bound


for the objective function, calculated using the left-hand side of (2.2.4), is the smallest
among the hyper-rectangles in the current partition and is associated with a positive
constant K̃ . The parameter ε in (2.2.5) is used to prevent excessive refinement of local
minima, with typical values ranging from 10−3 to 10−7 , as demonstrated in [21].

The only difference between this definition and Definition 2.2.1 is that, instead of the
interval length in the one-dimensional case, we need another measure of size, denoted
as δi , for the hyperrectangle. Jones et al. [3] choose δ to be the distance from the center
of the hyperrectangle to its vertices. Many extensions and modifications of the DIRECT
algorithm can be found in the literature.
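To make the selection step concrete, the following brute-force Python sketch tests Definition 2.2.2 directly from the two inequalities (with interval half-lengths as the measure it also covers Definition 2.2.1). It is our own illustrative code ; practical implementations identify the same set more efficiently through a lower convex hull, such as the modified Graham's scan mentioned later in this thesis.

import numpy as np

def potentially_optimal(f_vals, deltas, f_min, eps=1e-4):
    """Brute-force test of Definition 2.2.2: index j is potentially optimal if some
    rate of variation K > 0 satisfies inequalities (2.2.4) and (2.2.5).
    f_vals[i] is the centre value f(c_i), deltas[i] the measure of hyper-rectangle i."""
    f_vals = np.asarray(f_vals, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    poh = []
    for j in range(len(f_vals)):
        smaller = deltas < deltas[j]
        larger = deltas > deltas[j]
        same = deltas == deltas[j]
        # (2.2.4) applied to equal measures forces f_j to be the best of its size group
        if f_vals[j] > f_vals[same].min():
            continue
        # lower bound on K imposed by smaller rectangles, upper bound by larger ones
        k_low = ((f_vals[j] - f_vals[smaller]) / (deltas[j] - deltas[smaller])).max() if smaller.any() else 0.0
        k_up = ((f_vals[larger] - f_vals[j]) / (deltas[larger] - deltas[j])).min() if larger.any() else np.inf
        if max(k_low, 0.0) > k_up:
            continue
        # (2.2.5) is easiest to satisfy with the largest admissible K (here k_up)
        if np.isinf(k_up) or f_vals[j] - k_up * deltas[j] <= f_min - eps * abs(f_min):
            poh.append(j)
    return poh

# small example with three size groups and current best value f_min = 1.0
print(potentially_optimal([3.0, 1.0, 2.5, 1.8], [0.25, 0.25, 0.5, 1.0], f_min=1.0))  # -> [1, 3]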

In each iteration, DIRECT-type algorithms perform the selection of potentially optimal


hyper-rectangles, which are subsequently evaluated and divided. A concise overview of
the main steps of the multidimensional DIRECT algorithm is provided in Algorithm 2.

Algorithm 2 The main steps of the DIRECT Algorithms


Initialisation : Normalize the search space D to be the unit hyper-rectangle D̄ , but refer
to the original space D when making function calls. Evaluate the objective f at the center
point c1 . Set fmi n = f (c1 ), xmi n = c1 , and initialize algorithmic performance measures and
stopping criteria.

While Termination criteria not met do


Selection : Identify the set I of potentially optimal hyper-rectangles (subregions of D̄ )
using Definition 2.2.2.
Sampling : For each hyper-rectangle D̄ j ∈ I sample and evaluate the objective function at
new domain points. Update fmi n ,xmi n , and algorithmic performance measures.
Dividing : For each hyper-rectangle D̄ j ∈ I subdivide (trisect) and update partitioned
search space information.

endwhile
Return : fmin , xmin and algorithmic performance measures.
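One practical detail of the Initialisation step above is the normalization of D to the unit hyper-rectangle while function calls are still made in the original space. A minimal sketch of such a wrapper (our own helper, not code from any DIRECT library) is the following.

import numpy as np

def make_scaled_objective(f, lower, upper):
    """Wrap f so that the search operates on the unit hyper-cube [0, 1]^n while every
    evaluation is mapped back to the original domain D = [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    def f_scaled(x_unit):
        x = lower + np.asarray(x_unit, dtype=float) * (upper - lower)  # point in D
        return f(x)

    return f_scaled

# usage: evaluate the centre point c1 = (1/2, ..., 1/2) of the normalized domain
if __name__ == "__main__":
    rosenbrock_2d = lambda x: float(100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2)
    f_bar = make_scaled_objective(rosenbrock_2d, lower=[-5.0, -5.0], upper=[10.0, 10.0])
    print(f_bar(np.full(2, 0.5)))  # same value as rosenbrock_2d((2.5, 2.5))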

2.2.5 Strengths of the DIRECT algorithm

The DIRECT algorithm offers several notable strengths, making it a valuable tool for
global optimization. Firstly, it operates as a versatile black-box optimization algorithm,
negating the need for prior knowledge of gradients or specific characteristics of the ob-
jective function. This versatility enables its application across a wide spectrum of op-
timization problems. Secondly, DIRECT provides a strong guarantee of global conver-
gence, assuming the continuity of the objective function. This assurance is particularly
valuable when dealing with intricate, multi-modal objective functions.

Furthermore, DIRECT is a deterministic algorithm, eliminating the variability associa-


ted with multiple runs using different random seeds and ensuring consistent, depen-
dable results. Its capacity for parallelism is another advantage, as it can simultaneously
evaluate multiple rectangles in each iteration, leading to faster convergence and effi-
cient utilization of computational resources.

A distinctive feature of DIRECT is its adaptive approach to balance local and global ex-
ploration. This balance is achieved without the need for explicit hyperparameters. DI-
RECT dynamically selects rectangles with lower bounds for specific Lipschitz constants,
with smaller constants favoring local search and larger ones favoring global explora-
tion. This adaptability enhances its robustness, allowing it to perform effectively across
problems of varying complexity without manual parameter tuning.

Moreover, DIRECT exhibits a unique property wherein sampled points tend to cluster
around global minima. This clustering phenomenon can be leveraged to identify pro-
mising regions for initiating local optimization algorithms to further refine solutions.

In summary, DIRECT’s adaptability, convergence guarantee, and efficiency make it a


valuable asset in the realm of global optimization, particularly when tackling complex,
high-dimensional, and multi-modal objective functions.

2.2.6 Weaknesses of the DIRECT algorithm

Despite its notable strengths, the DIRECT algorithm does exhibit certain weaknesses.
One significant limitation is the reliance on a user-defined parameter, namely the de-
sired accuracy (ε). This parameter, although intuitive in its interpretation as the desi-
red level of accuracy, can impact performance. If set too conservatively, it may lead to
overly cautious, locally focused searches, potentially overlooking valuable global op-
tima. Conversely, if set too aggressively, it might waste computational resources explo-
ring negligible improvements. Striking the right balance for ε can be a non-trivial task,
and the algorithm’s sensitivity to this parameter poses a challenge.

Another limitation lies in DIRECT’s inability to handle problems with non-continuous


or discontinuous objective functions effectively. Since the algorithm’s core assumptions
are based on the continuity of the objective function, it may struggle or fail to converge
in scenarios where continuity is not assured. This restriction can limit its applicability
in certain real-world optimization problems.

Furthermore, while DIRECT excels in parallel evaluation of multiple rectangles, it may


face scalability issues in very high-dimensional spaces. As the number of dimensions
increases, the volume of the search space grows exponentially, leading to an exponen-
tial increase in the number of rectangles to be evaluated. This can strain computational
resources and lead to impractical runtimes for problems with a large number of dimen-
sions.

In conclusion, while DIRECT offers valuable capabilities for global optimization, users
must be mindful of its sensitivity to the ε parameter, its reliance on function continuity,
and potential scalability challenges in high-dimensional spaces when considering its
application to specific optimization tasks.

2.3 Overview DIRECT-type derivative-free methods

The DIRECT algorithm, originally introduced in [23], has undergone numerous varia-
tions and extensions over the years. These adaptations and extensions aim to overcome
the inherent limitations of the original DIRECT algorithm and enhance its effectiveness
in solving practical global optimization problems. In this section, we present a compre-
hensive survey of significant modifications and extensions derived from the DIRECT
algorithm, as documented in the existing literature.

Numerous articles have explored ways to modify and extend the DIRECT algorithm, in-
troducing innovative features to improve the selection of potentially optimal hyperrec-
tangles (POHs). These proposals often focus on novel partitioning and sampling tech-
niques, which are instrumental in enhancing the algorithm’s performance.
The DIRECTGO toolbox for box-constrained global optimization gathers the range of
algorithms surveyed in this section and serves as a useful reference.
While most of these algorithms adhere to the trisection approach for n-dimensional
POHs, it’s important to note that ADC, BIRECT, and both DISIMPL versions deviate
from this convention by employing alternative partitioning strategies.

The remainder of this section will provide a detailed exploration of these various algo-
rithms and their unique contributions to the field of global optimization. In doing so,
we aim to offer a comprehensive overview of the advancements made in the realm of
DIRECT-based optimization techniques.

2.3.1 Classification of DIRECT Variations

These variations can be categorized into several groups based on their key characteris-
tics and strategies. We’ll discuss each group separately :

1. Selection Enhancement Strategies

Adaptive Diagonal Curves (ADC) : Adaptive diagonal curves (ADC) based al-
gorithm with a new two-phase technique balancing local and global informa-
tion was introduced in [51]. Independently on dimensionality, the ADC algo-
rithm evaluates the objective function at two vertices a ik and bik of the main
diagonal. Notice that up to 2n hyper-rectangles can share the same vertex, lea-
ding (in a long sequence) to a smaller number of sampled points than the total
number of hyper-rectangles in the current partition. Furthermore, as in the re-
vised version of DIRECT [24], ADC trisects each selected POH along just one
of the longest dimensions. Such a diagonal scheme potentially obtains more
comprehensive information about the objective function than center-based
sampling, which sometimes may take many iterations to find the solution. For
example, a hyper-rectangle containing the optimum but with a bad function value
at the midpoint becomes undesirable for further selection. The ADC algo-
rithm intuitively reduces this chance for both sampling points in the hyper-
rectangle containing the optimum solution.
Bisecting Rectangles (BIRECT) : In the domain of derivative-free optimiza-
tion, Paulavičiuset al. introduced the BIRECT algorithm as a notable variant
of the well-established DIRECT method [45]. What sets BIRECT apart is its dis-
tinctive approach to sampling points within rectangles and its preference for
bisection over trisection. In BIRECT, each rectangle is carefully sampled at two
points, strategically positioned at 1/3 and 2/3 along a diagonal within the rec-
tangle (see Figure 2.3.3). This innovative sampling strategy addresses a fun-
damental issue encountered in DIRECT : the potential selection of rectangles
with unfavorable center-point function values. By sampling two points per
rectangle, BIRECT significantly reduces the likelihood of selecting rectangles
where both points align with regions of poor function values, thereby promo-
ting more favorable selections in the algorithm’s decision-making process. To
illustrate the division and sampling approach of BIRECT, consider an example
where n = 2. For the complete description, we refer to Paulavičius et al.
(2018).

Exemple 2.3.1 Let l 1 = (l 1 , l 2 ) = (1/3, 1/3)T and u 1 = (u 1 , u 2 ) = (2/3, 2/3)T si-


gnify the initial diagonal sampling points (depicted in Fig. (2.3.3 a)). Initially,
when all variables (x j , j = 1, ..., n) of the hyper-rectangle D̄ 1 = [a 1 , b 1 ] possess
identical side lengths (d ji = d ki , ∀ j , k ∈ 1, ..., n), BIRECT performs a bisection
through the midpoint of the variable with the lowest index (x1 ). This division
yields two hyper-rectangles, D̄ 2 and D̄ 3 , both of equal volume (Fig. (2.3.3 b)).
The subsequent sampling in the BIRECT algorithm involves generating new
sampling points by adding and subtracting half the side length of the coordi-
nate (edge) along which the division occurred.

To elaborate, a comparison between the coordinate values l 11 and u 11 is made,


followed by the addition of half the side length (d 11 /2 = |b11 − a 11 |/2) to the lower
coordinate (in this case, l 11 as l 11 < u 11 ). The remaining coordinates from l 1 re-
main unchanged, resulting in the new sampling point l 3 within the newly for-
med hyper-rectangle D̄ 3 . The second point of this new hyper-rectangle, u 3 = u 1 ,
remains consistent with its parent D̄ 1 . Similarly, by subtracting half the side
length (d 11 /2) from the u 11 coordinate while retaining the other inherited co-
ordinates from u 1 unchanged, the second new sampling point u 2 is obtained
within the hyper-rectangle D̄ 2 . The second sampling point for D̄ 2 , namely l 2 ,
remains unaltered : l 2 = l 1 .

As the process advances, BIRECT continues to identify and bifurcate hyper-rectangles


based on their longest coordinates, resulting in further subdivision and sam-
pling. It’s worth noting that the usage of "lower" (l) and "upper" (u) in naming
the sampling points is relative. Initially, the distinction is evident (Fig. (2.3.3
a)), but during optimization, the coordinates of the "lower" point might exceed
those of the "upper" point (as observed in Fig. (2.3.3 c) with points l 5 and u 5 ).
This emphasizes how the labeling’s nature evolves as the algorithm progresses.
FIGURE 2.3.3 – Illustration of the main operations of BIRECT in n = 2. The gray-colored hyper-
rectangles highlight those that are potentially optimal. They are selected in the first phase of
the current iteration and divided in the second phase of the same iteration. Each subsequent
iteration is obtained by sampling two diagonal points and dividing them along the longest
direction, starting with the lowest index if multiple equally long longest directions exist. The
blue points represent newly sampled points where the objective function is evaluated, while
for the green points, the function values from previous iterations are reused.

Another key advantage of BIRECT lies in its ability to establish tighter Lip-
schitzian lower bounds. In comparison to alternative methods like ADC, BI-
RECT minimizes the maximum distance between an unsampled point wi-
thin a rectangle and the closest of the two sampled points (see Figure (2.3.4
). Consequently, it offers more precise estimates of function values within the
rectangle, enhancing the efficiency of the optimization process.

FIGURE 2.3.4 – The maximum distance between the sampled points (solid blue circles) and an
unsampled point (open red circle) is higher in the ADC method than in the BIRECT method.
(Color figure online).

Furthermore, BIRECT excels in controlling the aspect ratio of rectangles. By


employing bisection, it ensures that rectangles never exceed an aspect ratio
of 2 :1. This contrasts with ADC and DIRECT, which may allow aspect ratios of
up to 3 :1. The smaller aspect ratios inherent in BIRECT contribute to a more
accurate representation of the objective function within each rectangle, po-
tentially leading to improved optimization outcomes.

However, it’s worth noting that the full impact of BIRECT, especially in com-
parison to other methods like ADC, remains an active area of research. The
trade-off between the advantages brought by BIRECT’s dual-point sampling
and any potential disadvantages, such as a lower ratio of sampled points to
rectangles, continues to be a subject of investigation.

The complete description of the BIRECT algorithm is shown in Algorithm 3.


The input for the algorithm is one (or a few) stopping criteria : the required to-
lerance (tol ), the maximal number of function evaluations (Mmax ) and the
maximal number of BIRECT iterations (Kmax ). After termination, BIRECT re-
turns the found solution : fmin and the solution point xmin together with algorith-
mic performance measures : final tolerance – percent error (p e ), the number
of function evaluations (m), the number of iterations (k ), and execution time
(t ) if needed.
Algorithm 3 Pseudo-code of the BIRECT algorithm
Input : tol , the maximal number of function evaluations : Mmax , and the maximal number
of iterations : Kmax ;
Output : Global minimum : fmin , global minimizer : xmin , and performance measures : m , k
and p e (if needed) ;
// Initialisation
Normalize the search-space D to be the unit hyper-cube D̄ ;
Initialize : k = 1, Ik = {1}, l^1 = (1/3, . . . , 1/3), u^1 = (2/3, . . . , 2/3) ;
Set : m = 2, fmin = min{ f (l^1 ), f (u^1 ) }, xmin = argmin_{x ∈ {l^1 , u^1 }} f (x ) ;
// Main loop
While p e > tol and m < Mmax and k < Kmax do
    Identify the index set Pk ⊆ Ik of potentially optimal hyper-rectangles (POHs) ;
    Set Ik = Ik \ {Pk } ;
    for i ∈ Pk do
        // Selection
        Select the branching variable b r (coordinate index) :
            b r = min{ arg max_{ j = 1,...,n } { d_j^i = | b_j^i − a_j^i | } } .
        Set l^{m+2} = l^{m+1} = l^i ; u^{m+2} = u^{m+1} = u^i ;   // copy old sample points
        // Sampling : update the new sample points
        if l_{br}^i < u_{br}^i then
            l_{br}^{m+2} = l_{br}^i + d_{br}^i / 2 ;   u_{br}^{m+1} = u_{br}^i − d_{br}^i / 2 ;
            Evaluate f at the new points l^{m+2} and u^{m+1} ;
        else
            l_{br}^{m+1} = l_{br}^i − d_{br}^i / 2 ;   u_{br}^{m+2} = u_{br}^i + d_{br}^i / 2 ;
            Evaluate f at the new points l^{m+1} and u^{m+2} ;
        Set fmin^{m+1} = min{ f (l^{m+1} ), f (u^{m+1} ) } and fmin^{m+2} = min{ f (l^{m+2} ), f (u^{m+2} ) } ;
        Set fmin = min{ fmin , fmin^{m+1} , fmin^{m+2} } , xmin = argmin_{x ∈ { xmin , xmin^{m+1} , xmin^{m+2} }} f (x ) ;
        // Dividing
        Divide D̄^i into two new hyper-rectangles D̄^{m+1} and D̄^{m+2} ;
        Update δ^{m+1} and δ^{m+2} ;
        Set Ik = Ik ∪ {m + 1, m + 2} and m = m + 2 ;
    Set k = k + 1 ;
Return : fmin , xmin and algorithmic performance measures : m , k and p e .
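A compact sketch of a single BIRECT bisection-and-sampling step may help fix the notation used in Algorithm 3. The code below is our own rendering of the rules above, under the assumption that the parent's full side length is the one halved when shifting the branching coordinate ; the helper name and return convention are illustrative only.

import numpy as np

def birect_bisect(a, b, l, u):
    """One BIRECT step for the box [a, b] with diagonal points l and u: bisect along
    the first longest side and create the two new sampling points by shifting the
    branching coordinate of the copied old points by half of the parent's side length."""
    a, b, l, u = (np.asarray(v, dtype=float).copy() for v in (a, b, l, u))
    d = b - a
    br = int(np.argmax(d))                     # first coordinate attaining the longest side
    mid = (a[br] + b[br]) / 2.0

    lo_a, lo_b = a.copy(), b.copy(); lo_b[br] = mid    # child with the smaller br-coordinates
    hi_a, hi_b = a.copy(), b.copy(); hi_a[br] = mid    # child with the larger br-coordinates

    l1, u1 = l.copy(), u.copy()                # points of the lower child (one old, one new)
    l2, u2 = l.copy(), u.copy()                # points of the upper child (one old, one new)
    if l[br] < u[br]:
        l2[br] += d[br] / 2.0                  # new point sampled in the upper child
        u1[br] -= d[br] / 2.0                  # new point sampled in the lower child
    else:
        l1[br] -= d[br] / 2.0
        u2[br] += d[br] / 2.0
    return ((lo_a, lo_b), l1, u1), ((hi_a, hi_b), l2, u2)

# unit square with the initial points l = (1/3, 1/3) and u = (2/3, 2/3)
if __name__ == "__main__":
    lower_child, upper_child = birect_bisect([0, 0], [1, 1], [1 / 3, 1 / 3], [2 / 3, 2 / 3])
    print(lower_child)   # keeps l = (1/3, 1/3); the new point is u = (1/6, 2/3)
    print(upper_child)   # keeps u = (2/3, 2/3); the new point is l = (5/6, 1/3)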
DISIMPL (DISIMPL-C and DISIMPL-V) :

DISIMPL [42] departs from the hyper-rectangle approach and uses simplicial
partitions. Initially, it divides the hyper-cube into n ! simplices using combina-
torial vertex triangulation. DISIMPL offers two distinct sampling strategies :
DISIMPL-C evaluates the objective function at the geometric center of the
simplex, while DISIMPL-V evaluates it at all unique vertices of the simplex.
However, since the number of initial simplices (n !) grows rapidly with the
dimension, DISIMPL is effectively usable only for small box-constrained
problems. Nevertheless, the DISIMPL approach is promising (among all
DIRECT-type methods) for symmetric op-
timization problems [42][66] and problems with linear constraints [67].

2. Parameter and Criteria Adaptations

DIRECT-restart Algorithm : In the DIRECT-restart algorithm [6], the authors


introduced an adaptive scheme for the ε parameter. Condition (2.2.5) is nee-
ded to stop DIRECT from wasting function evaluations on small hyper-
rectangles where only a negligible improvement can be expected. The DIRECT-
restart algorithm starts with ε = 0, and the same value of ε is maintained while
improvement is achieved. However, if five consecutive iterations bring no im-
provement in the best function value, the search may be stagnated around a
local optimum. Therefore, the algorithm switches to ε = 0.01 to prevent
an excessive local search. If the algorithm finds an improvement or fails to
see the progress within 50 iterations in this phase, DIRECT-restart switches
to " = 0. Then, if another 50 iterations pass without improvement, this may
indicate that the global minimum has been found, and one should work on
refining it to higher accuracy.

MrDIRECT and MrDIRECT075 : The authors of MrDIRECT [32] and MrDIRECT075


[45] algorithms introduced three different levels to perform the selection pro-
cedure :
At level 2, DIRECT is run as usual, with ε = 10−5 .

At level 1, the selection is limited to only 90% of the hyper-rectangles in the
current partition ; the 10% largest hyper-rectangles are ignored. Here,
ε = 10−7 is used.

At level 0, the selection is limited to the 10% largest hyper-rectangles

(those ignored at level 1), using ε = 0.

Both algorithms cycle through these levels using the “W-cycle” : 21011012. The
main difference between the two algorithms is that MrDIRECT uses the fixed
value ε = 10−4 at all levels, while MrDIRECT075 follows the above-mentioned rules.

Aggressive DIRECT :

In [1], the authors relaxed the selection criteria of POHs and proposed an ag-
gressive version of the DIRECT algorithm. Aggressive DIRECT’s main idea is to
select and divide at least one hyper-rectangle from each group of different dia-
meters (δik ) having the lowest function value. Therefore, using Aggressive DI-
RECT in the situation presented in Fig. 2, a hyper-rectangle with the smallest
measure δik would also be selected and divided. The aggressive version per-
forms more function evaluations per iteration than other DIRECT-type me-
thods. From the optimization point of view, such an approach seems less fa-
vorable since it “wastes” function evaluations by exploring unnecessary (non-
potentially optimal) hyper-rectangles. However, such a strategy is much more
appealing in a parallel environment, as was shown in [15, 16, 17, 71]. Note that
the authors did not specify which hyper-rectangle should be selected from
the same group (δik ) if more than one with identical objective function values
exist. Thus, the hyper-rectangle with the larger index value was selected in our
implementations.
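A tiny sketch of this aggressive selection rule, with the larger-index tie-break adopted above (names and data are illustrative only) :

def aggressive_selection(f_vals, sizes):
    """From every group of hyper-rectangles sharing the same measure, pick the one
    with the lowest function value, breaking ties in favour of the larger index."""
    best = {}
    for idx, (fv, sz) in enumerate(zip(f_vals, sizes)):
        if sz not in best or fv <= f_vals[best[sz]]:
            best[sz] = idx      # '<=' keeps the larger index when values are tied
    return sorted(best.values())

print(aggressive_selection([3.0, 1.5, 2.0, 1.5], [0.5, 0.5, 0.25, 0.25]))  # -> [1, 3]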

3. Measure-Based Variations

DIRECT-l :

In [10], the algorithm named DIRECT-l was proposed. In most DIRECT-type


algorithms, the measure of the hyper- rectangle is calculated by a half-length
of a diagonal (see (2.2.6)). In DIRECT-l, this measure is evaluated by the length
of its longest side. This measure corresponds to the infinity norm and allows
the DIRECT-l algorithm to group more hyper-rectangles with the same mea-
sure. Thus, there are fewer different measures, so fewer POHs are selected.
Moreover, with DIRECT-l at most one hyper-rectangle is selected from each group,
even if there is more than one POH in the same group. Such a strategy allows
a reduction in the number of divisions within a group. Once again, the same
rule as in [57] is adapted to determine which hyper-rectangle to select from several
possible ones.

DIRECT-m and DIRECT-a :

In [7], the authors concluded that the original DIRECT algorithm is sensitive to
additive scaling of the objective function. Additionally, the algorithm does not
operate well when the objective function values are very large. The authors
proposed scaling the function values by subtracting the median ( fmedian ) of
the collected function values to overcome this. DIRECT-m replaces inequa-
lity (2.2.5) in Definition 2.2.2 by :

f (c j ) − K̃ δ j ≤ fmin − ε| fmin − fmedian |.    (2.3.7)

Similarly, in [21], the authors extended the same idea in DIRECT-a to reduce
the objective function’s additive scaling. Instead of the median value, the au-
thors proposed to use the average value ( faverage ) at each iteration :

f (c j ) − K̃ δ j ≤ fmin − ε| fmin − faverage |.    (2.3.8)
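In code form, the only change with respect to the original condition (2.2.5) is the reference value subtracted inside the absolute value ; a minimal sketch (our own naming) is :

def sufficient_decrease(f_j, k_tilde, delta_j, f_min, f_ref, eps=1e-4):
    """Test of (2.3.7)/(2.3.8): f_ref is the median of the collected function values
    for DIRECT-m and their average for DIRECT-a; f_ref = 0 recovers the original
    DIRECT condition (2.2.5)."""
    return f_j - k_tilde * delta_j <= f_min - eps * abs(f_min - f_ref)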

4. Symmetry and Equivalent Subregions

glbSolve-sym :

Another extension of the DIRECT algorithm was proposed in [14]. The authors
introduced glbSolve-sym (glbSolve-sym2) as DIRECT extensions for symme-
tric Lipschitz continuous functions. When solving symmetric optimization
problems, there exist equivalent subregions in the hyper-rectangle. The algo-
rithm determines which hyper-rectangles can be safely discarded, conside-
ring the problem’s symmetrical nature, and avoids exploration over equiva-
lent subregions.

5. Reduced POH Sets

PLOR :

In the PLOR algorithm [22], the set of POHs is reduced to just two, correspon-
ding to the first and last point on the Pareto front (see the right panel in Fig. 2).
Therefore, only hyper-rectangles with the lowest function value and the most
extensive measure, breaking ties in favor of a better center-point function va-
lue, are selected.

DIRECT-GL :

DIRECT-GL [57], introduced a new approach to identifying the extended set of


POHs. Here, using a novel two-step-based strategy, the set of the best hyper-
rectangles is enlarged by adding more medium-measured hyper-rectangles
with the smallest function value at their centers and, additionally, closest to
the current minimum point. The first step of the selection procedure forces
the DIRECT-GL algorithm to work more globally (compared to the selection
used in DIRECT [23]). In contrast, the second step assures a faster and broa-
der examination around the current minimum point. The original DIRECT-GL
version performs a selection of POHs in each iteration twice [57], and the al-
gorithm separately handles the found independent sets G (using Definition
2 from [57] - DIRECT-G) and L (using Definition 3 from [57]- DIRECT-L). Fol-
lowing the same trend from [60], the version used in this paper slightly dif-
fers compared to [57]. In the current version of DIRECT-GL, identifying these
two sets is performed in succession, and the unique union of these two sets
(S = G ∪ L ) is used in Algorithm 2, Line 2. This modification was introduced
to reduce the data communication between the computational units in the
parallel algorithm version [59]. At the same time, we found that this way mo-
dified DIRECT-GL was, on average, more effective than the original one.
6. Globally Biased Versions

Several globally biased (Gb-) versions of DIRECT-type algorithms were introdu-


ced and investigated [44, 47]. Proposed approaches are primarily oriented for sol-
ving extremely difficult global optimization problems and contain a phase that
constrains itself to large subregions. The introduced step performs until a suffi-
cient number of divisions of hyper- rectangles near the current best point is done.
Once those subdivisions around the current best minima point are performed, the
neighborhood contains only small measure hyper-rectangles and all larger ones
located far away from it. Therefore, the two-phase strategy makes the DIRECT-
type algorithms examine larger hyper-rectangles and return to the general phase
only when an improved minimum is obtained.

7. Hybridized DIRECT-Type Algorithms

Finally, three different hybridized DIRECT-type algorithms are proposed (DIRECT-


rev [24], DIRMIN [33], BIRMIN [47]). In our implementation, all algorithms are
combined with the same local search routine – fmincon. The DIRMIN algorithm
suggests running a local search starting from the midpoint of every POH. However,
such an approach likely generates more local searches than necessary, as many
start points will converge to the same local optimum. The authors of the DIRECT-rev
and BIRMIN algorithms, by contrast, tried to minimize the usage of local searches.
They suggested using fmincon only when some improvement in the best current
solution is obtained. The authors in [24] additionally incorporated the following
two enhancements. First, in the DIRECT-rev algorithm, selected hyper-rectangles
are trisected only on one longest side. Second, only one POH is selected if several
equally good exist (the same measure and objective values) in Definition 2.2.2. We
have applied the same rule from [57] to determine which hyper-rectangle to select
from several ones when needed.

These extensions and variations of the DIRECT algorithm address its limitations
and enhance its performance in solving real-world global optimization problems.
Each method introduces unique strategies and modifications to improve the se-
lection of potentially optimal hyper-rectangles and the overall optimization pro-
cess.
2.4 Conclusions

In this chapter, we have discussed various global optimization methods and exa-
mined their respective advantages and disadvantages. It’s important to note that
the scope of this work does not permit an exhaustive exploration of all global opti-
mization methods. Specifically, we have highlighted the DIRECT algorithm, which
excels in efficiently discovering global optima for objective functions characteri-
zed by multimodality and Lipschitz continuity, all without relying on derivative in-
formation. This is achieved through a process of iteratively subdividing the search
space and judiciously selecting promising regions, making it well-suited for a wide
array of optimization problems. The main strength of the DIRECT-l algorithm is its capability to effi-
ciently find global optima for Lipschitz-continuous objective functions in a man-
ner that mitigates the computational cost associated with the original DIRECT al-
gorithm.

We’ve also introduced the BIRECT (BIsection of RECTangles) algorithm, emphasi-


zing its distinct advantage in comprehensively exploring the behavior of the objec-
tive function. This is accomplished by strategically placing sampling points along
the main diagonal of the initial hyper-rectangle, enhancing its ability to handle
complex multimodal optimization challenges through a combination of bisec-
tion and two-point sampling. BIRECT’s adaptability and performance make it a
promising option for global optimization, especially in scenarios involving high-
dimensional problems.

It’s worth emphasizing that determining the superiority of one method over ano-
ther is a task highly dependent on the specific problem, its context, and the desired
objectives.
CHAPITRE 3

Diagonal Partitioning Strategy Using Bisection of Rectangles and a Novel Sampling Scheme (The BIRECT-V Algorithm)

3.1 Introduction

This chapter considers a global optimization problem with a Lipschitz-continuous

objective function whose Lipschitz constant is unknown. We present a novel
diagonal partitioning and sampling strategy by building on the framework crea-
ted by the previously published BIRECT (BIsection of RECTangles) algorithm. Bi-
section and two-point sampling are utilized in our method, which we call BIRECT-V
(V for vertices). Our approach differs from many DIRECT-type methods in that
these two points, located at 1/3 and 1 along the main diagonal of
the initial hyper-rectangle, provide insightful information about the objective
function.

While most DIRECT-type algorithms don’t find evaluating the objective function
at vertices suitable for bisection, our approach defies this trend, leading to a more
comprehensive understanding of the objective function. However, introducing new
sampling points might coincide with existing ones at shared vertices, necessitating
additional objective function evaluations and consequently increasing the num-
ber of evaluations per iteration. To surmount this issue, we propose a modification

to the original optimization domain that yields a good approximation of the glo-
bal solution. Experimental investigations underscore the positive impact of this
modification on the performance of the BIRECT-V algorithm. Our proposed me-
thodology showcases promise as a global optimization algorithm, outperforming
the original BIRECT and two popular DIRECT-type algorithms across a range of
test problems, particularly excelling in high-dimensional scenarios.

As such, the primary aim of this chapter revolves around the novel sampling scheme (sampling
at vertices) integrated into the BIRECT-V algorithm framework, without introdu-
cing additional parameters. This chapter delves into the exploration of this fresh
approach, discussing both its merits and limitations.

The remainder of this chapter is organized as follows : In Sect.3.2, we provide an


overview of the original BIRECT algorithm’s principles. This sets the stage for the
upcoming BIRECT-V algorithm proposal. A detailed description of our new sam-
pling and partitioning scheme is presented in Sect.3.2.2. The implementation of
the BIRECT-V algorithm and its comparison with other DIRECT-type algorithms
are discussed in Sect.3.3. We showcase numerical investigation and comparisons
with BIRECT, BIRECT-l, and two DIRECT-type algorithms using 54 variants of He-
dar test problems [18] in Sect.3.3.2. Finally, Sect. 3.4 concludes the chapter with
reflections and potential avenues for future research.

3.2 Materials and Methods

In this section we start by giving a description of the principle of the sampling and
division strategies retained from the original BIRECT algorithm. Then we intro-
duce our suggested method with emphasis on the sampling strategy. We conclude
this section by an illustration of this new scheme.

3.2.1 From BIRECT to BIRECT-V

As we have seen in the subsection 2.3, the original BIRECT (BIsection of REC-
Tangles) algorithm, as formulated by Paulavičius et al. [45], employs a diagonal
space-partitioning technique and comprises two primary procedures : diagonal
sampling and hyper-rectangle bisection. The algorithm’s initial step involves sca-
ling the original search space D to the unit hyper-cube D̄ in which all variables are
normalized.

At the commencement of the BIRECT algorithm, the function f (x) is evaluated at


two points, termed the "lower" point l = (l 1 , . . . , l n ) = (1/3, . . . , 1/3)T and the "up-
per" point u = (u 1 , . . . , u n ) = (2/3, . . . , 2/3)T . These points are positioned along the
main diagonal of the normalized domain D̄ , equidistant between each other and
the diagonal’s endpoints. Subsequently, the hyper-cube is partitioned into smaller
hyper-rectangles, and f (x) is evaluated for each hyper-rectangle at two diagonal
points. This evaluation follows a specific sampling and partitioning strategy that
adheres to two key rules.

Selection rule

Let the partition of D̄ at iteration k be defined as

Pk = {D̄ i : i ∈ Ik },

where D̄ i = [ai , bi ] = {x ∈ Rn : ai ≤ x ≤ bi , ∀i ∈ Ik }, ai , bi ∈ [0, 1]n and Ik is the set


of indices identifying the subsets defining the current partition Pk . At the generic
k th iteration, starting from the current partition Pk of D̄ i , a new partition Pk +1 is
obtained by bisecting a set of potentially optimal hyper-rectangles from the pre-
vious partition Pk . The identification of a potentially optimal hyper-rectangle is
based on the lower bound estimates for f (x) over each hyper-rectangle by fixing
some rate of change L̃ > 0 (which has a role analogous to a Lipschitz constant). A
hyper-rectangle D̄ j , j ∈ Ik , is called potentially optimal if the
following inequalities hold

min{ f (l j ), f (u j ) } − L̃ δ j ≤ min{ f (li ), f (ui ) } − L̃ δi , ∀i ∈ Ik ,    (3.2.1)

min{ f (l j ), f (u j ) } − L̃ δ j ≤ fmin − ε| fmin |,    (3.2.2)
where the measure (distance, size) of the hyper-rectangle is given by

δi = (2/3) kbi − ai k ,    (3.2.3)

ε > 0 is a positive constant, and fmin is the current best known function value.
A hyper-rectangle j is potentially optimal if the lower bound for f computed by
the left-hand side of (3.2.1) is optimal for some fixed rate of change L̃ among the
hyper-rectangles of the current partition Pk . Inequality (3.2.2) ensures guarding
against an excessive emphasis on the local search [23].

Division and sampling rule

After the initial covering, BIRECT-V moves to the subsequent iterations by partitioning
potentially optimal hyper-rectangles and evaluating the objective function f (x ) at
their new sampling points.
New sampling points are obtained by adding and subtracting from the previous
(old) ones a distance equal to the half-side length of the branching coordinate.
This way, old sampled points from the previous iterations are re-used in descendant su-
bregions.
A vital aspect of the algorithm is how the selected hyper-rectangles D̄ i , i ∈ Ik are
divided. For every potentially optimal hyper-rectangle the set of the maximum co-
ordinates (edges) is computed, and every potentially optimal hyper-rectangle is
bisected (divided in halves of equal size), along the coordinate (branching variable
x b r , 1 ≤ b r ≤ n ), having the largest side length (d bi r ) and by first considering the
coordinate directions with the lowest index j (if more coordinates may be chosen),
where function values are more promising, [70].

b r = min{ arg max_{1 ≤ j ≤ n} { d_j^i = | b_j^i − a_j^i | } } ,    (3.2.4)

The partitioning process continues until a prescribed number of function evalua-


tions has been performed, or a stopping criterion is satisfied. The best (smallest)
found objective function value f (x̄) over all sampled points of the final partition,
and the corresponding generated point x̄, provide an approximate solution to the
problem.
Further details and comprehensive description of the original BIRECT algorithm
can be found in Paulavicius et al.[45].

3.2.2 Description of the new sampling scheme

In this subsection, we present the basic idea of the new sampling scheme in a more
general setting. An illustration is given in a two-dimensional example in Fig. 3.2.1
and Fig. 3.2.2 . Since our new method is based on the original BIRECT algorithm,
BIRECT-V follows the same hyper-rectangle selection and subdivision procedure,
unlike the sampling method which is done in a different way.
In the initialization phase, BIRECT-V normalizes the search domain to the n -dimensional
unit hyper-rectangle D̄01 , and evaluates the objective function f (x) at two different
diagonal points : the “third” point ti = (t 1i , . . . , t ni ) = (1/3, . . . , 1/3)T and the “vertex” point
vi = (v1i , . . . , vni ) = (1, . . . , 1)T . The scaled hyper-rectangle is considered as the only trivially selected POH.
In the succeeding iterations, POHs are selected and bisected in essentially the
same way as in BIRECT, with the change that in inequalities (3.2.1) and (3.2.2) the
sampled points li and ui are replaced by ti = li and vi = ui + (1/3)kbi − ai k respectively,
using the same measure of the hyper-rectangle given by Eq. (3.2.3).
Selected POHs are divided with the restriction that only along the coordinate (bran-
ching variable x b r , 1 ≤ b r ≤ n ), having the largest side length (d bi r ), and by first
considering the coordinate directions with the smallest index j (if more coordi-
nates may be chosen). This restriction guarantees that the hyper-rectangle will re-
duce on every dimension. Potentially optimal hyper-rectangles are shown in the
left-side of Fig. 3.2.3 , and correspond to the lower-right convex hull of the set of
points.
Let us formalize our sampling and partitioning schemes in a more general case. Sup-
pose that at iteration k , D̄ki = [ai , bi ] = {x ∈ D̄ : 0 ≤ a ij ≤ x j ≤ b ji ≤ 1, j = 1, ..., n, ∀i ∈
Ik } is a hyper-cube.
Since all the variables (x j , j = 1, ..., n) of D̄ki have the same side lengths (d ji =
b ji − a ij , j = 1, ..., n ), D̄ki is bisected (divided in halves) across the middle point
(1/2)(a 1i + b1i ) of the coordinate direction with the smallest index (x j , j = 1) into two
FIGURE 3.2.1 – Description of the initialization and the first three iterations of the new
sampling scheme on the Branin test problem. Each iteration is performed by sampling
two new points (blue color) issued from the old ones (red color) and bisecting potentially
optimal hyper-rectangles (shown in gray color) along the coordinate (branching variable x b r ,
1 ≤ b r ≤ n ), having the largest side length (d bi r , where d ji = b ji − a ij , j = 1, ..., n ) and by first
considering the coordinate directions with the smallest index j (if more coordinates may be
chosen).
FIGURE 3.2.2 – Illustration of selection, sampling and partitioning schemes ranging from ite-
ration 4 to 5 on the Branin test problem. A situation where two adjacent hyper-rectangles
share the same vertex. After bisection of the lower-left hyper-rectangle in iteration 4, the new
created point fall exactly with the one in the adjacent hyper-rectangle. This point is marked
with a circle in iteration 5.
hyper-rectangles D̄ki +1 , and D̄ki +2 of equal side lengths (see Fig. 3.2.1 , iteration 1 for
illustration).

After D̄ki is bisected, the first iteration is performed by sampling two new points
from the old ones.

The new point ti +2 is obtained by adding or subtracting one third of the
side length, d bi r /3, to the branching coordinate of the old point. Similarly, the
new point vi +1 is obtained from the old one by subtracting or adding the whole
side length d bi r , while keeping all the remaining coordinates inherited from ti and vi un-
changed.

In the case where D̄ki is a hyper-rectangle, new sampled points are obtained, after
identifying the branching variable (b r ), by adding or subtracting the required
side length from the coordinate on which we branch, according to the following rule :

If t_br^i < v_br^i , then

t_br^{i+2} = t_br^i + d_br^i / 3 ,  and  v_br^{i+1} = v_br^i − d_br^i ,    (3.2.5)

otherwise, i.e., if t_br^i > v_br^i , then

t_br^{i+1} = t_br^i − d_br^i / 3 ,  and  v_br^{i+2} = v_br^i + d_br^i .    (3.2.6)

The two new points are obtained as follows :

t^{i+2} = (t_1^i , . . . , t_br^i ± d_br^i / 3 , . . . , t_n^i ) = (t_1^i , . . . , t_br^i ± |b_br^i − a_br^i| / 3 , . . . , t_n^i ),

and

v^{i+1} = (v_1^i , . . . , v_br^i ± d_br^i , . . . , v_n^i ) = (v_1^i , . . . , v_br^i ± |b_br^i − a_br^i| , . . . , v_n^i ).

Each descendant hyper-rectangle D̄ki +1 and D̄ki +2 retains one sampled point, ti and
vi respectively, from their ancestor D̄ki . At the same time, old sampling points are
re-used in the descendant hyper-rectangles as ti +1 = ti and vi +2 = vi . More precisely :

t^{i+1} = t^i = (t_1^i , . . . , t_n^i )
        = (a_1^i + (1/3)(b_1^i − a_1^i ), . . . , a_n^i + (1/3)(b_n^i − a_n^i ))
        = (a_1^{i+1} + (2/3)(b_1^{i+1} − a_1^{i+1} ), . . . , a_n^{i+1} + (1/3)(b_n^{i+1} − a_n^{i+1} )),

and

v^{i+2} = v^i = (v_1^i , . . . , v_n^i )
        = (a_1^i + (b_1^i − a_1^i ), . . . , a_n^i + (b_n^i − a_n^i ))
        = (a_1^{i+2} + (b_1^{i+2} − a_1^{i+2} ), . . . , a_n^{i+2} + (b_n^{i+2} − a_n^{i+2} )).


The BIRECT-V algorithm continues in this way by sampling two new points in
each potentially optimal hyper-rectangle, by adding and subtracting the required
side-length from the old points, and bisecting through the longest coordinate until
some stopping rule is satisfied. After subdivision, each rectangle resulting from the
previous iteration retains one point from its predecessor.
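The following Python sketch condenses the BIRECT-V branching and sampling rules (Eqs. (3.2.4)-(3.2.6)) into a single helper. It is our own illustrative code, not the MATLAB implementation used for the numerical experiments.

import numpy as np

def birect_v_new_points(a, b, t, v):
    """Given a hyper-rectangle [a, b] with its 'third' point t and 'vertex' point v,
    select the branching variable (Eq. 3.2.4) and return it together with the two new
    sampling points produced by the rules (3.2.5)-(3.2.6)."""
    a, b, t, v = (np.asarray(x, dtype=float).copy() for x in (a, b, t, v))
    d = b - a
    br = int(np.argmax(d))               # first index attaining the largest side length
    t_new, v_new = t.copy(), v.copy()
    if t[br] < v[br]:                    # Eq. (3.2.5)
        t_new[br] = t[br] + d[br] / 3.0
        v_new[br] = v[br] - d[br]
    else:                                # Eq. (3.2.6)
        t_new[br] = t[br] - d[br] / 3.0
        v_new[br] = v[br] + d[br]
    return br, t_new, v_new

# initial unit square with t = (1/3, 1/3) and v = (1, 1): branching on x1,
# the new third point is (2/3, 1/3) and the new vertex point is (0, 1)
if __name__ == "__main__":
    print(birect_v_new_points([0, 0], [1, 1], [1 / 3, 1 / 3], [1, 1]))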

Notice that the sampled points ti +1 and vi +1 in D̄ki +1 belong to the same diagonal
(see Fig. 3.2.1 for illustration). This is a straightforward consequence of Theorem
1 in [45]. The same conclusion holds for hyper-rectangle D̄ki +2 .

Finally, let us emphasize that, in contrast to the naming convention used in [45] of
the sampling points as “lower” (l) and “upper” (u), to differentiate the two points
belonging to the same hyper-rectangle we label the new points as
“third” t and “vertex” v without any ambiguity. In this way, the two points
are always identified throughout the optimization process, even if they become “lower”
or “upper”.

It is also important to stress again that our new sampling scheme is distinctive in
how new sampled points are created, using different side-lengths,
in contrast to DIRECT-type algorithms and diagonal sampling
strategies, which use the same side-lengths.
FIGURE 3.2.3 – Geometric interpretation of potentially optimal hyper-rectangles using the
BIRECT-V algorithm on the Branin test function in the seventh iteration : (right side), POHs
correspond to the lower-right convex hull of points marked in blue color (left side). The posi-
tion of six points (values of f (x )) obtained in BIRECT can be clearly distinguished. We observe
three sampled points at which the objective function has been re-evaluated.

Illustration

Let t1 = (t 11 , t 21 ) = (1/3, 1/3) and v1 = (v11 , v21 ) = (1, 1)T denote two points lying on the
main diagonal (see initialization in Fig. 3.2.1 ) of hyper-rectangle D̄01 = [a1 , b1 ] =
[a 11 , b11 ] × [a 21 , b21 ].

Without losing generality, we restrict our illustration to two iterations only ; the
other situations are the same. In (Fig. 3.2.1 , iteration 2), D̄23 and D̄24 are POHs. For
hyper-rectangle D̄23 , as there is only one longest side (coordinate j = 2) with side
length d 23 = 1. Therefore using the rule in Eq. 3.2.5, the new sampling points t7 and
v6 are expressed as follows :
t^7 = (t_1^7 , t_2^7 ) = (t_1^3 , t_2^3 + d_2^3 / 3) = (2/3 , 2/3) ,

v^6 = (v_1^6 , v_2^6 ) = (v_1^3 , v_2^3 − d_2^3 ) = (1, 0) .

For hyper-rectangle D̄24 , we use the second rule given by Eq. 3.2.6. The new sam-
pling points are located at (see Fig. 3.2.1 , iteration 2) :

t^8 = (t_1^8 , t_2^8 ) = (t_1^4 − d_1^4 / 3 , t_2^4 ) = (1/6 , 2/3) ,

v^9 = (v_1^9 , v_2^9 ) = (v_1^4 + d_1^4 , v_2^4 ) = (1/2 , 1) .

However, in Fig. 3.2.2 , we encounter a situation where two adjacent hyper-rectangles

share the same vertex. After bisection of the lower-left hyper-rectangle in itera-
tion 4, the newly created point coincides exactly with the one in the adjacent hyper-
rectangle. This point is marked with a circle in iteration 5. This situation is also shown
on the right side of Fig. 3.2.3 , where we can distinguish three sampled points at which
the objective function has been evaluated twice. Such duplication
becomes more pronounced as the optimization proceeds.

Main steps of the BIRECT-V algorithm

The BIRECT-V algorithm main steps are shown in Algorithm 4, where the inputs
are problem ( f ), optimization domain (D ), and some stopping criteria : requi-
red tolerance (εp e ), the maximal number of function evaluations (M ma x ), and the
maximal number of iterations (K ma x ). BIRECT-V returns the value of the objective
function found ( fmi n ), and the point (xmi n ) as well as the algorithmic performance
measures : percent error (p e ), number of function evaluations (m ), and number
of iterations (k ) after termination.

The BIRECT-V algorithm begins the initialization phase by the normalization of


the feasible domain (D ), evaluating the objective function ( f ) at the two first sam-
pling points t1 and v1 , measuring and setting stopping conditions (see Algorithm 4,
lines 2-4). lines 5-21 of Algorithm 4 describes the main while loop, which is execu-
ted until one of the stopping conditions specified is met. As explained in the pre-
vious section (see Subsubsect. 3.2.2), the BIRECT-V algorithm, at the beginning of
each iteration, identifies the set of POHs (see Algorithm 4, line 7, excluding steps
7 (highlighted in magenta color), which are performed only on the BIRECT-Vl
algorithm).(see Algorithm 4, line 6), then bisects all POHs ( Algorithm 4, line 11)
and creates the new sampling points t i and v i of generated hyper-rectangles (see
Algorithm 4, line 12). Finally, BIRECT-V found a solution, and the performance
measures are returned (see Algorithm 4, line 22).

The structure of BIRECT-V is outlined in Algorithm 4.

Convergence

Since BIRECT-V is based on the ideas of BIRECT, the convergence of

BIRECT-V can be established in the same way as for other DIRECT-type algorithms
[23, 24, 9, 10], in the sense of the "everywhere-dense" type of convergence (see [48]).
In addition, the continuity of the objective function in the neighborhood of the global
minima is a sufficient assumption to guarantee convergence.

3.3 Results and Discussion

This section provides a description of the experimental results, their interpreta-


tion, and the experimental conclusions. We compare the performance of our newly
introduced modification, BIRECT-V, and of its variant BIRECT-Vl, which dif-
fers from BIRECT-V in that, if several rectangles are tied for being potentially opti-
mal, only one of them is selected, with the original BIRECT algorithm, BIRECT-l
[45, 46], and two other well-known DIRECT-type algorithms [23, 24, 9, 10].

3.3.1 Implementation

As the BIRECT-V algorithm is based on the original BIRECT algorithm, we use the
same measure of the size of the hyper-rectangle.

Note that in the DIRECT algorithm, this size is measured by the Euclidean distance
from its center to a corner, while in DIRECT-l, it corresponds to the infinity norm,
permitting the algorithm to collect more hyper-rectangles having the same size.
In BIRECT-Vl, the number of potentially optimal hyper-rectangles selected in each group, to be
further divided, is reduced to at most one.
Algorithm 4 The BIRECT-V algorithm
1: BIRECT-V (f , D , o p t ) ;
Input : Objective function : f , search-space : D , tolerance : εp e , the maximal
number of function evaluations : M ma x , and the maximal number of iterations : K ma x ;
Output : Global minimum : fmin , global minimizer : xmin , and performance mea-
sures : m , k and p e (if needed) ;

2: Normalize the search space D to be the unit hyper-cube D̄ ;


3: Initialize t1 = (1/3, . . . , 1/3)T and v1 = (1, . . . , 1)T , m = 1, k = 1, Ik = {1} and p e ;   // p e defined in Eq. (3.3.7)

4: Evaluate f (t1 ) and f (v1 ), and set fmin = min{ f (t1 ), f (v1 ) } , xmin = argmin_{x ∈ {t1 , v1 }} f (x ) ;

5: While p e > εp e , m < Mmax , k < K max do


6: Identify the index set Pk ⊆ Ik of potentially optimal hyper-rectangles (POHs) applying
inequalities (3.2.1) and (3.2.2) ;
7: Select at most one POH from each group ; // Only in BIRECT-Vl
8: Set Ik = Ik \{Pk } ;
9: for i ∈ Pk do
10: Select the branching variable br (coordinate index) using Eq. (3.2.4) ;
11: Divide D̄ i into a two new hyper-rectangles D̄ m +1 and D̄ m +2 ;
12: Create the new sampling points tm+1 and vm+2 ;   // see the illustration in Subsect. 3.2.2
13: Evaluate f (tm+1 ) and f (vm+2 ) ;
14: Set fmin^{m+1} = min{ f (tm+1 ), f (vm+1 ) } and fmin^{m+2} = min{ f (tm+2 ), f (vm+2 ) } ;

15: Update the partition set Ik = Ik ∪ {m + 1, m + 2} ;


16: if fmin^{m+1} ≤ fmin or fmin^{m+2} ≤ fmin then
17: Update fmin and xmin ;
18: Update performance measures : k , m and p e ;
19: Return : fmin , xmin and algorithmic performance measures : m , k and p e .
[Figure 3.3.4 content : per-iteration console logs of BIRECT-Vl (left panel) and BIRECT-V (right panel), listing the iteration number, the current fmin, and the number of function evaluations ; both variants reach fmin = 0.0000152596 at iteration 190.]

FIGURE 3.3.4 – Iteration progress of the BIRECT-Vl algorithm on the left-hand side and
BIRECT-V on the right-hand side while solving the Ackley (No. 3, n =10, from Table 3.3) test
problem.

TABLE 3.1 – Preliminary results during the first run of the BIRECT-V algorithm.

Problem Optimum BIRECT-Vl BIRECT-V

No. f∗ f ( x̄ ) f.eval. f ( x̄ ) f.eval.

1 0.0 7.42 × 10−5 198 7.42 × 10−5 342


2 0.0 9.17 × 10−5 422 9.17 × 10−5 3514
3 0.0 9.69 × 10−5 984 9.69 × 10−5 70690
4 0.0 8.77 × 10−5 640 8.77 × 10−5 1034
5 0.0 7.14 × 10−5 676 7.14 × 10−5 656
6 0.0 5.96 × 10−5 692 5.96 × 10−5 694
7 0.0 7.58 × 10−5 902 7.58 × 10−5 1062
8 0.0 6.10 × 10−5 234 6.10 × 10−5 254
9 0.39789 0.39790 656 0.39790 492
10 0.0 9.82 × 10−5 2320 9.82 × 10−5 1910
11 0.0 8.92 × 10−5 940 5.48 × 10−5 1432
12 0.0 9.34 × 10−5 28034 9.36 × 10−5 23412
13 0.0 8.79 × 10−3 > 500000 4.73 × 10−4 > 500000
14 −1.0 −0.99999 180 −0.99999 1082
15 3.0 3.00000 28 3.00000 28
16 0.0 5.13 × 10−5 8288 5.13 × 10−5 8950
17 −3.86278 −3.86244 200 −3.86244 208
18 −3.32237 −3.32214 542 −3.32214 542
19 −1.03163 −1.03154 202 −1.03154 334
20 0.0 1.44 × 10−5 188 1.44 × 10−5 226
21 0.0 7.56 × 10−5 674 7.56 × 10−5 1000
22 0.0 9.27 × 10−5 2082 9.27 × 10−5 18676
23 0.0 2.71 × 10−5 148 2.71 × 10−5 208
24 −1.80130 −1.80130 184 −1.80130 314
25 −4.68736 −4.64588 > 500000 −4.68732 339818
26 −9.66015 −8.60560 > 500000 −7.55568 > 500000
27 0.0 0.00000 80890 0.00000 62368
28 0.0 4.59 × 10−5 2786 4.59 × 10−5 1678
29 0.0 9.15 × 10−5 387440 9.15 × 10−5 467200
30 0.0 0.00000 204 0.00000 204
31 0.0 0.00000 14 0.00000 16
32 0.0 0.00000 204 0.00000 210
33 0.0 0.00000 14360 0.00000 14348
34 0.0 9.65 × 10−5 698 9.65 × 10−5 718
35 0.0 2.41 × 10−5 2444 2.41 × 10−5 2972
36 0.0 5.42 × 10−5 16506 5.42 × 10−5 39846
37 0.0 5.64 × 10−5 446 5.64 × 10−5 580
38 0.0 9.49 × 10−5 63908 9.49 × 10−5 23022
39 0.0 1.79 × 10−8 2938 3.42 × 10−5 134562
40 −10.15320 −10.15234 6618 −10.15234 5866
41 −10.40294 −10.40201 2298 −10.40201 2604
42 −10.53641 −10.53544 2498 −10.53544 3324
43 −186.73091 −186.72944 806 −186.72944 1684
44 0.0 1.15 × 10−5 112 1.15 × 10−5 190
45 0.0 2.87 × 10−5 392 2.87 × 10−5 1400
46 0.0 5.74 × 10−5 1054 5.74 × 10−5 27566
47 0.0 8.74 × 10−5 248 8.74 × 10−5 280
48 0.0 3.97 × 10−5 1354 3.97 × 10−5 1776
49 0.0 9.35 × 10−5 3394 9.35 × 10−5 9244
50 −50.0 −49.99511 1402 −49.99511 2112
51 −210.0 −209.98223 168432 −209.98155 368312
52 0.0 0.00000 78 0.00000 78
53 0.0 0.00000 22498 0.00000 24150
54 0.0 1.13284 > 500000 1.21289 > 500000

Average 52485.148 58762.852


Median 921.000 1681.000

Table 1 lists the test problems from [18] used in this comparison, which
consist in total of 54 global optimization test problems with dimensions varying
from n = 2 to n = 10, with the main attributes : problem number, problem name, di-
mension (n ), feasible domain (D ), number of local minima, and known minimum
(f ∗ ). Note that these problems can also be found in [68], and in a more detailed
version in [62] and related up-to-date versions.

Some of these test problems have several variants (e.g. Bohachevsky, Hartmann,
Shekel), while others (Ackley, Dixon and Price, Levy, Rastrigin, Rosenbrock, Schwe-
fel, Sphere, Sum squares, Zakharov) can be tested for different dimensionality.

Finally, notice that it may occur occasionally that at the initial steps of the algo-
rithm, the sampling is performed near the global minimizer. In this particular si-
tuation, the feasible domain was modified in the same way as in [45], i.e., the upper
bound was increased. For clarity, the modified test problems are marked with an
asterisk.
Implementation and comparison of the newly introduced scheme with the origi-
nal BIRECT together with other DIRECT-type algorithms were performed in the
MATLAB programming language, using MATLAB R2016a on an EliteBook with the
following hardware settings :

Intel Core i5-6300U CPU @ 2.5 GHz, 8 GB of memory, and running on the Windows
10 operating system (64-bit). Potentially optimal hyper-rectangles are identified
using modified Graham’s scan algorithm. In our implementation, the output va-
lues are rounded up to 10 decimals. A test problem is considered successful if an
algorithm returns a value of an objective function that does not exceed 10−4 error
or a minimizer xmi n that achieves a comparable value in [63].

The algorithms were stopped either when a point x̄ (also noted xmin ) was gene-
rated such that the following stopping criterion was satisfied :

pe = ( f (x̄) − f ∗ ) / | f ∗ | ≤ 10−4 if f ∗ ≠ 0 ,  or  pe = f (x̄) ≤ 10−4 if f ∗ = 0 ,    (3.3.7)

(where f ∗ is the known global optimum) or when the number of function evalua-
tions exceeds the prescribed limit of 500, 000. (The maximum number of iterations
was set to 100, 000 but usually it is supposed to be unlimited.)
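A minimal sketch of this stopping test (helper name and example values are ours, the Branin optimum being taken from Table 3.1) :

def percent_error(f_best, f_opt, tol=1e-4):
    """Stopping criterion of Eq. (3.3.7): relative error when the known optimum
    f_opt is non-zero, absolute value of the best found function value otherwise."""
    pe = (f_best - f_opt) / abs(f_opt) if f_opt != 0 else f_best
    return pe, pe <= tol

# e.g. Branin (f* = 0.39789): the value 0.39790 already satisfies the 1e-4 tolerance
print(percent_error(0.39790, 0.39789))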

The comparison is based on two criteria : the best-found function value f (x̄) and
the number of function evaluations (f.eval.). For each test problem, the average
and median numbers of function evaluations are shown at the bottom of each
table. The best number of function evaluations is shown in bold font in Table 3.3.
The number of iterations and the execution time (measured in seconds) are only
reported in Tables 3.1 and 3.2 in the link https://data.mendeley.com/datasets/
x9fpc9w7wh.

3.3.2 Discussion

In this subsection, we discuss the efficiency of the newly introduced BIRECT-V al-
gorithm and compare it with the original BIRECT, BIRECT-l (see [46, 45]) and two
DIRECT-type algorithms. In Table 3.1, we report the results obtained by BIRECT-V
and BIRECT-Vl when the algorithms are run in the usual way without additio-
nal parameters.

Table 3.2 reports the results obtained when the best objective function value
f (x̄) found by the BIRECT algorithm is used as the known optimal (minimal) value
(f ∗ ). Table 3.3 summarizes and compares the experimental results for all tested algorithms
in the case where the original domain (D ) was mo-
dified. The results related to this comparison are also presented in Table 2.

TABLE 3.2 – BIRECT-Vl and BIRECT-V versus BIRECT and BIRECT-l.

Problem Optimum BIRECT-Vl BIRECT-V

No. f∗ f ( x̄ ) f.eval. f ( x̄ ) f.eval.

1 0.00000000e + 00 2.54 × 10−5 206 2.54 × 10−5 360


2 0.00000000e + 00 2.54 × 10−5 438 2.54 × 10−5 4196
3 0.00000000e + 00 2.54 × 10−5 1020 2.54 × 10−5 73452
4 0.00000000e + 00 8.77 × 10−5 640 8.77 × 10−5 1034
5 0.00000000e + 00 4.02 × 10−5 1078 4.02 × 10−5 1040
6 0.00000000e + 00 2.19 × 10−5 1138 2.19 × 10−5 1122
7 0.00000000e + 00 3.67 × 10−5 932 3.67 × 10−5 1106
8 0.00000000e + 00 3.81 × 10−6 364 3.81 × 10−6 376
9 3.97890000e − 01 0.39790 656 0.39790 492
10 0.00000000e + 00 4.36 × 10−5 2568 4.36 × 10−5 2182
11 0.00000000e + 00 4.84 × 10−5 1268 3.31 × 10−5 1472
12 0.00000000e + 00 5.99 × 10−5 28368 4.78 × 10−5 23902
13 0.00000000e + 00 8.79 × 10−3 > 500000 4.73 × 10−4 > 500000
14 −1.00000000e + 00 −0.99999 180 −0.99999 1082
15 3.00000000e + 00 3.00000 28 3.00000 28
16 0.00000000e + 00 4.61 × 10−7 8456 4.61 × 10−7 9162
17 −3.86278000e + 00 −3.86244 200 −3.86244 208
18 −3.32237000e + 00 −3.32214 542 −3.32214 542
19 −1.03163000e + 00 −1.03152 168 −1.03152 274
20 0.00000000e + 00 1.44 × 10−5 188 1.44 × 10−5 226
21 0.00000000e + 00 1.12 × 10−5 870 1.12 × 10−5 1406
22 0.00000000e + 00 2.84 × 10−5 2642 2.84 × 10−5 24978
23 0.00000000e + 00 1.70 × 10−6 244 1.70 × 10−6 318
24 −1.80130000e + 00 −1.80130 184 −1.80130 314
25 −4.68736000e + 00 −4.645885 > 500000 −4.68732 339818
26 −9.66015000e + 00 −7.452392 646 −7.37292 2408
27 0.00000000e + 00 0.00000 80890 0.00000 62368
28 0.00000000e + 00 4.59 × 10−5 2786 4.59 × 10−5 1678
29 0.00000000e + 00 9.15 × 10−5 387440 9.15 × 10−5 467200
30 0.00000000e + 00 0.00000 204 0.00000 204
31 0.00000000e + 00 0.00000 14 0.00000 16
32 0.00000000e + 00 0.00000 204 0.00000 210
33 0.00000000e + 00 0.00000 14360 0.00000 14348
34 0.00000000e + 00 9.65 × 10−5 698 9.65 × 10−5 718
35 0.00000000e + 00 2.41 × 10−5 2444 2.41 × 10−5 2972
36 0.00000000e + 00 5.42 × 10−5 16506 5.42 × 10−5 39846
37 0.00000000e + 00 5.64 × 10−5 446 5.62 × 10−5 580
38 0.00000000e + 00 6.41 × 10−5 64414 6.41 × 10−5 26050
39 0.00000000e + 00 1.79 × 10−8 2938 3.42 × 10−5 134562
40 −1.01532000e + 01 −10.15234 6618 −10.15234 5866
41 −1.04029400e + 01 −10.40201 2298 −10.40201 2604
42 −1.05364100e + 01 −10.53544 2498 −10.53544 3324
43 −1.86730910e + 02 −186.72944 806 −186.72944 1684
44 0.00000000e + 00 1.15 × 10−5 112 1.15 × 10−5 190
45 0.00000000e + 00 2.87 × 10−5 392 2.87 × 10−5 1400
46 0.00000000e + 00 5.74 × 10−5 1054 5.74 × 10−5 27566
47 0.00000000e + 00 7.95 × 10−6 274 7.95 × 10−6 318
48 0.00000000e + 00 3.73 × 10−5 1678 3.73 × 10−5 2218
49 0.00000000e + 00 9.11 × 10−6 3636 9.11 × 10−6 9868
50 −5.00000000e + 01 −49.99218 1324 −49.99218 1942
51 −2.10000000e + 02 −209.98223 168432 −209.96002 279324
52 0.00000000e + 00 0.00000 78 0.00000 78
53 0.00000000e + 00 0.00000 22498 0.00000 24150
54 0.00000000e + 00 9.13966 1284 9.13966 1410

Average 34062.037 38966.519


Median 1037.000 1575.000

First, it is easy to observe from Table 3.1 that our proposed partitioning scheme most often requires more function evaluations than BIRECT and BIRECT-l, and for certain test problems it does not reach a minimum function value comparable to the one obtained by BIRECT. This makes the comparison unfavorable to our results; this is the case, for example, for the Ackley test problems (No. 1–3).

At the same time, it requires fewer function evaluations than the DIRECT and DIRECT-l algorithms. Moreover, the median value is smallest for BIRECT-Vl (921.000), compared to BIRECT-V (1681.000), DIRECT-l (1752), and DIRECT (3810).

On the other hand, our framework gives better results in terms of the best (minimum) function value for almost all instances compared to both versions of BIRECT; overall, BIRECT-V obtains a better objective function value on approximately 61.11% of the problems (33 out of 54). To confirm this, it can be seen from Table 3.2 that the situation changes completely when the best-found objective function value f (x̄) obtained by the BIRECT algorithm is used as the known optimal (minimal) value (f ∗). Both the BIRECT-V and BIRECT-Vl algorithms then give, on average, significantly better results than the original BIRECT and BIRECT-l algorithms.

The same is observed, especially for some problems (the n = 10 cases), such as the Michalewics (No. 26) and Zakharov (No. 54) test problems, while others reached exactly the known optimal (minimal) value (f ∗). This is the case for the following test problems : Perm (No. 27), Power Sum (No. 30), Rastrigin (No. 31–33), and Zakharov (No. 52, 53). These results are confirmed by comparing the value of the global minimizer xmin from the libraries ([18], [68], [64]) with the value of x̄ generated by the algorithm (see Table 2).

More precisely, for the Michalewics problem (No. 26) we found x(10) = [1.57079632679490]; for Perm (No. 27), the global minimizer found is xmin = [1, 2, 3, 4]; for Power Sum (No. 30), the global minimum is 0, attained at [2, 1, 3, 2]; and for the Rastrigin (No. 32) and Zakharov (No. 52–53) test problems, the global minimum is 0, attained at xmin = 0. This situation arises occasionally when, at the early stages of the sampling process, the algorithm samples near a global optimum. Moreover, for some test problems, e.g., Dixon and Price (No. 13), Michalewics (No. 25), Powell (No. 29), Schwefel (No. 39), and Trid (No. 51), we observed, as previously pointed out, an excessive number of function evaluations. In these cases, we observe the following situations :

• There is no improvement in the best function value after many consecutive iterations. The algorithm struggles to get close to a global minimizer, and the objective function seems to stagnate around a certain value, which may be a local optimum.

• An increasing number of evaluations (per iteration) is observed as the iterations progress, as shown, e.g., in Fig. 3.3.4.

Notice that these situations are typical for diagonal-based algorithms as well as for DIRECT-type algorithms; a detailed review can be found in [25]. Let us illustrate the above situations for our sampling strategy. Assume that a global minimum lies near one of the two sampled points located at 1/3 and 2/3 along one of the diagonals of a hyper-rectangle. This situation favors BIRECT, since it samples one of these two points per hyper-rectangle. The BIRECT-V algorithm, however, may produce many unnecessary samples of the objective function at vertices before this optimum is reached. Every vertex can be shared by up to 2^n hyper-rectangles, in each of which the function is re-evaluated. In this case, the algorithm takes significantly longer than usual to find a good solution close to the global optimum. This can be observed from the results given in Table 3.3, where the two algorithms reached approximately the same best function value in some situations.
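To illustrate the shared-vertex effect, the short Python sketch below computes the two sampling points of a box, one at 1/3 of the main diagonal and one at its end vertex, assuming for simplicity that each child box is sampled at the same relative positions on its own diagonal (the actual child-sampling rule of BIRECT-V is more involved):

def diagonal_samples(lower, upper):
    # Two sampling points on the main diagonal of the box [lower, upper]:
    # one at 1/3 of the diagonal and one at its end vertex (the upper corner).
    third = [l + (u - l) / 3.0 for l, u in zip(lower, upper)]
    vertex = list(upper)
    return third, vertex

# Bisecting the unit square along the first coordinate gives two children:
left  = ([0.0, 0.0], [0.5, 1.0])
right = ([0.5, 0.0], [1.0, 1.0])
print(diagonal_samples(*left))   # -> ([0.1666..., 0.3333...], [0.5, 1.0])
# The vertex sample (0.5, 1.0) of the left child is also a vertex of the right
# child (and, in n dimensions, of up to 2**n neighbouring boxes), so a naive
# implementation re-evaluates the objective function there.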

In the opposite scenario, i.e., when the global optimum is located at a vertex of a hyper-rectangle, BIRECT-V behaves in the opposite way. As the optimization proceeds, BIRECT-V requires fewer function evaluations than BIRECT, since many adjacent hyper-rectangles can share the same vertex. In contrast to the previous situations, the same objective function value can also be attained at many different points of the feasible domain, as in the case of the Branin test problem (No. 9), where xmin = [3.13965, 2.275] for BIRECT-V, while for BIRECT xmin = [9.42383, 2.471]. This situation is common for multimodal problems (having multiple global minima), symmetric problems, and (convex) quadratic test problems. Therefore, BIRECT-V requires fewer function evaluations, leading to a much larger set of selected potentially optimal hyper-rectangles having the same size and objective function value.

For the problems where BIRECT-V most often failed to converge, we suggested a modification of the original optimization domain in order to obtain a good approximation reasonably close to the real (known) global optimum. With this modification, the performance of the BIRECT-V algorithm improves compared to the original results. Clearly, this strategy does not resolve the issue in a fully satisfactory way, but it allows the algorithm to avoid unnecessary sampling of the objective function at vertices and considerably reduces the number of function evaluations.

It should be stressed that we did not adopt any specific rule or known method for modifying the optimization domain : we simply modified the domain slightly until a minimizer close to the known solution, or at least to the one obtained by BIRECT, was found. For example, for the Schwefel problem (No. 39), we obtained xmin = [420.9635416667] for BIRECT-V and xmin = [420.9686279297] for BIRECT-Vl; the domain was enlarged up to [−500, 700]^10, see [62, 61, 63].
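One way to mechanize the trial-and-error procedure described above is sketched below in Python; here solver is a hypothetical stand-in for a BIRECT-V driver and is not part of the actual implementation:

def modify_domain(solver, f, lower, upper, x_known, step, max_tries=5, tol=1e-2):
    # Enlarge the upper bound in small steps (e.g. from [-500, 500]^10 towards
    # [-500, 700]^10 for the Schwefel problem) until the returned minimizer is
    # acceptably close to the known solution.
    x_min, f_min, ub = None, None, list(upper)
    for k in range(max_tries + 1):
        ub = [u + k * step for u in upper]
        x_min, f_min = solver(f, lower, ub)
        if max(abs(a - b) for a, b in zip(x_min, x_known)) <= tol:
            break
    return lower, ub, x_min, f_min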

Note that some results reported in Table 3.3 and Table 2 could be improved further; e.g., for the Ackley problems 1, 2, and 3, the value f (x̄) = 1.27161957e−05 can be obtained with the global minimizer xmin = [0.0000031789, ...]. It also appears that some problems are sensitive to the domain modification, while others do not really require such a modification.

From Table 3.3, the numerical results show that both the BIRECT-Vl and BIRECT-V algorithms produce the best results in terms of the best-found objective function value, for about 89% (48 out of 54) of the problems for BIRECT-Vl and 87% (47 out of 54) for BIRECT-V. On the other hand, we observe that the number of function evaluations is most often smallest for BIRECT (about 33 out of 54 test problems) and BIRECT-l (30 out of 54) when compared to BIRECT-V and BIRECT-Vl, respectively, in particular for the test problems having the same minimum value.

To conclude this comparison, it is important to notice that, despite the excessive number of evaluations caused by many unnecessary sampling points at some shared vertices, BIRECT-Vl produces the best results in terms of the lowest function values and, on average, nearly the smallest number of function evaluations compared to the other algorithms.

3.4 Conclusions

This chapter proposed a new diagonal partitioning strategy for global optimization problems : a modification of the BIRECT algorithm based on bisection combined with a novel sampling scheme, contrary to most DIRECT-type algorithms, for which the evaluation of the objective function at the vertices of hyper-rectangles is not suitable for bisection. The newly introduced BIRECT-V and its variant BIRECT-Vl were compared against BIRECT, BIRECT-l, and two DIRECT-type algorithms [45, 46].

The experimental results revealed that the new sampling scheme gives significantly better results for almost all test problems, particularly when the feasible domain is modified.

Further improvements of this version may be investigated under additional assumptions. One possible improvement is to evaluate the objective function only once at each vertex of each hyper-rectangle : the objective function values at vertices could be stored in a dedicated vertex database, thus avoiding re-evaluation of the objective function at shared vertices of adjacent hyper-rectangles. Another direction, as the previous test process showed, is to find a specific rule for how the change of the original optimization domain should be applied in order to improve the performance of the BIRECT-V algorithm (see [64, 65, 67, 69]).
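A minimal sketch of such a vertex database is given below (Python, illustrative only; coordinates are rounded to form a hashable key so that numerically identical vertices of adjacent hyper-rectangles map to the same stored value):

class CachedObjective:
    # Evaluate the objective only once per distinct vertex: values are stored
    # in a dictionary keyed by the rounded coordinates of the point.
    def __init__(self, f, decimals=12):
        self.f = f
        self.decimals = decimals
        self.cache = {}
        self.n_eval = 0          # number of genuinely expensive evaluations

    def __call__(self, x):
        key = tuple(round(float(v), self.decimals) for v in x)
        if key not in self.cache:
            self.cache[key] = self.f(x)
            self.n_eval += 1
        return self.cache[key]

# fc = CachedObjective(lambda x: sum(v * v for v in x))   # Sphere, for illustration
# fc([0.5, 1.0]); fc([0.5, 1.0])                          # second call hits the cache
# fc.n_eval                                               # -> 1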

Finally, the results could also be extended to other test problems from [61]. All
these observations may be considered for future research directions.
TABLE 3.3 – Comparison between BIRECT-Vl, BIRECT-V, BIRECT-l, BIRECT, DIRECT-l, and DIRECT algorithms.

Problem BIRECT-Vl BIRECT-V BIRECT-l BIRECT DIRECT-l DIRECT

No. f ( x̄ ) f.eval. f ( x̄ ) f.eval. f ( x̄ ) f.eval. f ( x̄ ) f.eval. f ( x̄ ) f.eval. f ( x̄ ) f.eval.

1 1.52 × 10−5 218 1.52 × 10−5 260 2.54 × 10−5 176 2.54 × 10−5 202 7.53 × 10−5 135 7.53 × 10−5 255
2 1.52 × 10−5 524 1.52 × 10−5 2728 2.54 × 10−5 454 2.54 × 10−5 1268 7.53 × 10−5 1777 7.53 × 10−5 8845
3 1.52 × 10−5 1280 1.52 × 10−5 137040 2.54 × 10−5 874 2.54 × 10−5 47792 3.57445 > 500000 7.53 × 10−5 80927
4 8.77 × 10−5 640 8.77 × 10−5 1034 9.17 × 10−5 436 9.17 × 10−5 436 9.29 × 10−5 247 9.29 × 10−5 655
5 1.83 × 10−6 254 3.17 × 10−5 524 4.02 × 10−5 468 4.02 × 10−5 476 3.09 × 10−6 205 3.09 × 10−5 327
6 1.53 × 10−6 252 1.53 × 10−6 284 3.35 × 10−5 472 3.35 × 10−5 478 2.58 × 10−6 233 2.58 × 10−5 345
7 2.88 × 10−6 284 2.88 × 10−6 282 3.68 × 10−5 474 3.67 × 10−5 480 8.21 × 10−5 573 8.21 × 10−5 693
8 2.99 × 10−6 300 2.99 × 10−6 334 6.10 × 10−5 188 6.10 × 10−5 194 6.58 × 10−5 215 6.58 × 10−5 295
9 0.39791 656 0.39791 492 0.39790 242 0.39790 242 0.39789 159 0.39789 195
10 9.82 × 10−5 2320 9.82 × 10−5 1910 9.82 × 10−5 794 9.82 × 10−5 794 3.83 × 10−5 3379 6.08 × 10−5 6585
11 4.01 × 10−5 810 4.01 × 10−5 784 4.84 × 10−5 722 4.84 × 10−5 722 5.32 × 10−5 485 6.25 × 10−5 513
12 7.57 × 10−5 10872 7.57 × 10−5 8446 7.15 × 10−5 4060 7.15 × 10−5 4060 6.45 × 10−5 54843 6.45 × 10−5 19661
13 7.02 × 10−5 35492 7.60 × 10−5 50922 9.52 × 10−5 162862 9.52 × 10−5 164826 0.66667 > 500000 5.79 × 10−5 372619
14 −0.99999 180 −0.99999 1082 −0.99999 480 −0.99999 16420 −0.99999 6851 −0.99999 32845
15 3.00000 28 3.00000 28 3.00019 274 3.00019 274 3.00009 115 3.00009 191
16 4.61 × 10−7 8456 4.61 × 10−7 9162 7.76 × 10−7 5106 7.76 × 10−7 5106 4.84 × 10−6 8379 4.84 × 10−6 9215
17 −3.86245 200 −3.86245 208 −3.86242 352 −3.86242 352 −3.86245 111 −3.86245 199
18 −3.32214 542 −3.32214 542 −3.32206 764 −3.32206 764 −3.32207 295 −3.32207 571
19 −1.03154 202 −1.03154 334 −1.03154 190 −1.03154 334 −1.03162 137 −1.03162 321
20 9.03 × 10−6 136 9.03 × 10−6 154 9.09 × 10−5 152 9.09 × 10−5 152 2.10 × 10−5 77 2.10 × 10−5 105
21 1.83 × 10−5 454 1.83 × 10−5 558 1.83 × 10−5 660 1.83 × 10−5 1024 3.65 × 10−5 359 3.65 × 10−5 705
22 3.54 × 10−5 1182 3.54 × 10−5 7440 3.55 × 10−5 1698 3.55 × 10−5 7904 9.62 × 10−5 5297 6.23 × 10−5 5589
23 2.71 × 10−5 148 2.71 × 10−5 208 2.71 × 10−5 90 2.71 × 10−5 94 3.81 × 10−5 71 3.81 × 10−5 107
24 −1.80130 184 −1.80130 314 −1.80118 126 −1.80118 126 −1.80127 45 −1.80127 69
25 −4.68744 8430 −4.68744 7472 −4.68736 101942 −4.68736 73866 −4.68721 26341 −4.68721 13537
26 −8.60559 > 500000 −7.55576 > 500000 −7.32661 > 500000 −7.32661 > 500000 −7.84588 > 500000 −7.87910 > 500000
27 0.00132 > 500000 0.00189 > 500000 0.00203 > 500000 0.00203 > 500000 0.04054 > 500000 0.04355 > 500000
28 4.59 × 10−5 2786 4.59 × 10−5 1678 4.86 × 10−5 1832 4.86 × 10−5 2114 6.52 × 10−5 32331 9.02 × 10−5 14209
29 9.00 × 10−5 2872 9.00 × 10−5 3072 9.71 × 10−5 92884 9.71 × 10−5 99514 0.02488 > 500000 0.02142 > 500000
30 0.00000 204 9.97 × 10−5 40788 9.00 × 10−5 1718 9.00 × 10−5 10856 0.03524 > 500000 0.00215 > 500000
31 4.81 × 10−5 774 4.81 × 10−5 958 4.81 × 10−5 154 4.81 × 10−5 180 2.30 × 10−5 1727 2.30 × 10−5 987
32 1.29 × 10−5 9126 1.29 × 10−5 11008 1.18 × 10−5 472 1.18 × 10−5 1394 4.97479 > 500000 4.97479 > 500000
33 1.98 × 10−5 124 1.98 × 10−5 1454 2.36 × 10−5 1250 2.36 × 10−5 40254 4.97479 > 500000 9.94967 > 500000
34 9.65 × 10−5 698 9.65 × 10−5 718 9.65 × 10−5 242 9.65 × 10−5 242 9.65 × 10−5 285 9.65 × 10−5 1621
35 2.41 × 10−5 2444 2.41 × 10−5 2972 2.41 × 10−5 1494 2.41 × 10−5 1700 5.75 × 10−5 2703 8.80 × 10−5 20025
36 3.05 × 10−5 19134 3.05 × 10−5 31430 5.42 × 10−5 4590 5.42 × 10−5 10910 8.29 × 10−5 74071 8.29 × 10−5 174529
37 1.37 × 10−7 492 1.37 × 10−7 564 5.64 × 10−5 210 5.64 × 10−5 236 2.88 × 10−5 341 2.88 × 10−5 255
38 3.42 × 10−7 24272 3.42 × 10−7 16704 6.41 × 10−5 1422 6.41 × 10−5 7210 7.21 × 10−5 322039 7.21 × 10−5 31999
39 1.77 × 10−8 1492 1.77 × 10−8 86306 1.30 × 10−6 58058 1.30 × 10−6 315960 1269.34444 > 500000 1187.63199 > 500000
40 −10.15234 6618 −10.15234 5866 −10.15234 1200 −10.15234 147 −10.152234 155
41 −10.40201 2298 −10.40201 2604 −10.402269 1224 −10.40269 1180 −10.40196 141 −10.40196 145
42 −10.53544 2498 −10.53545 3324 −10.53618 1158 −10.53618 1140 −10.53539 139 −10.53539 145
43 −186.72944 806 −186.72945 1684 −186.72441 2114 −186.72441 1780 −186.72153 2043 −186.72153 2967
44 1.15 × 10−5 112 1.15 × 10−5 190 1.15 × 10−5 108 1.15 × 10−5 118 8.74 × 10−5 91 8.74 × 10−5 209
45 2.87 × 10−5 392 2.87 × 10−5 1400 2.87 × 10−5 288 2.87 × 10−5 712 7.49 × 10−5 465 9.39 × 10−5 4653
46 5.74 × 10−5 1054 5.74 × 10−5 27566 5.74 × 10−5 784 5.74 × 10−5 16974 9.63 × 10−5 2057 6.32 × 10−5 99123
47 8.74 × 10−5 248 8.74 × 10−5 280 7.94 × 10−6 226 7.94 × 10−6 244 3.53 × 10−5 77 3.52 × 10−5 107
48 3.97 × 10−5 1354 3.97 × 10−5 1776 3.97 × 10−5 836 3.97 × 10−5 1034 7.19 × 10−5 411 7.19 × 10−5 833
49 9.35 × 10−5 3394 9.35 × 10−5 9244 9.11 × 10−6 3366 9.11 × 10−6 7688 7.76 × 10−6 1809 7.76 × 10−5 8133
50 −49.99788 1312 −49.99788 1662 −49.99512 1138 −49.99512 1506 −49.99525 8731 −49.99525 5693
51 −209.98779 3114 −209.98779 11878 −209.98007 24716 −209.98007 30100 −209.92644 > 500000 −209.98085 90375
52 2.88 × 10−5 156 2.88 × 10−5 162 2.88 × 10−5 338 2.88 × 10−5 502 7.95 × 10−5 209 7.95 × 10−5 237
53 6.43 × 10−5 3810 6.43 × 10−5 4060 6.44 × 10−5 27364 6.44 × 10−5 20974 0.11921 > 500000 9.71 × 10−5 316827
54 2.607286 > 500000 2.607286 > 500000 9.41133 > 500000 9.41133 > 500000 16.47703 > 500000 28.96394 > 500000

Average 30844.296 37072.0371 37283.85 44520.52 121484.19 98677.70


Median 808.00 1681.00 789.00 1190.00 1752.00 3810.00

CHAPTER 4

Conclusions and Perspectives

This thesis delved into the extensive field of global optimization, with a specific fo-
cus on methods for solving global optimization problems, especially in the context
of black-box functions. Throughout our journey, we explored the historical evolu-
tion of global optimization methods, highlighting algorithms like BIRECT, a key
player in our study.

The BIRECT algorithm served as our starting point, and we examined it in de-
tail, revealing crucial aspects of its Diagonal Partitioning Strategy and Sampling
Scheme, which significantly improved its efficiency and precision.

Significantly, our research also shed light on the DIRECT algorithm, a well-established
pillar of global optimization. By analyzing its history and fundamental principle,
we observed that despite its undeniable advantages, it had limitations when tack-
ling complex problems. Notably, when DIRECT has to explore large hyper-rectangles, it may show performance gaps.

It is precisely in response to these challenges that the research community has un-
dertaken essential modifications to enhance the DIRECT algorithm. Researchers
have developed variants and innovative approaches to address these weaknesses.
In this context, our contribution in the form of the BIRECT-V algorithm has taken
a central place. BIRECT-V distinguishes itself by its bold use of sampling at points
located at 1/3 and 1 along the main diagonal of the initial hyperrectangle. Unlike
most DIRECT algorithms, our approach includes the evaluation of the objective
function at the vertices for bisection, providing a deeper understanding of the objective function. The results of BIRECT-V have been exceptional, demonstrating
its ability to compete with other DIRECT-type algorithms while offering a fresh
perspective on dealing with large hyper-rectangles.

In summary, this thesis contributes to the extensive body of knowledge in the field
of global optimization by highlighting significant advances made to overcome the
weaknesses of the DIRECT algorithm. Our comparison with DIRECT has unders-
cored the distinct advantages of our approaches, thus paving the way for ongoing
improvements in global optimization. This research underscores the continued
importance of innovation and exploration in this field and hopes to catalyze fur-
ther research aimed at solving complex optimization problems more effectively.
Bibliography

[1] Baker, C.A., Watson, L.T., Grossman, B., Mason, W.H., Haftka, R.T. : Parallel
Global Aircraft Configuration Design Space Exploration. Nova Science Publi-
shers, Inc., USA, 79–96.

[2] Custódio, A.L., Rocha, H., Vicente, L.N. : Incorporating minimum Frobenius
norm models in direct search. Computational Optimization and Applica-
tions. (2010), 46(2), 265-278. DOI :10.1007/s10589-009-9283-0.

[3] Dennis, J. E., Torczon,V. : Direct search methods on parallel machines. SIAM
Journal of Optimization, 1 :448–474, 1991.

[4] Di Serafino, D., Liuzzi, G., Piccialli, V., Riccio, F., Toraldo, G. : A modified DI-
viding RECTangles algorithm for a problem in astrophysics. Journal of Opti-
mization Theory and Applications. (2011), 151(1), 175-190. DOI :10.1007/
s10957-011-9856-9.
[5] Finkel, D.E. : Global optimization with the Direct algorithm. Ph.D. thesis,
North Carolina State University (2005).

[6] Finkel, D.E., Kelley, C.T. : An Adaptive Restart Implementation of DIRECT. In


Technical report CRSC-TR04-30. Center for Research in Scientific Computa-
tion, North Carolina State University (2004), Raleigh, 1–16.

[7] Finkel, D.E., Kelley, C.T. : Additive scaling and the DIRECT algorithm.
Journal of Global Optimization. (2006), 36(4), 597-608. DOI :10.1007/
s10898-006-9029-9.

[8] Floudas, C.A. : Deterministic Global Optimization : Theory, Methods and Ap-
plications. Nonconvex Optimization and Its Applications, vol. 37. Springer,
Boston, MA (1999). https://doi.org/10.1007/978-1-4757-4949-6.

[9] Gablonsky, J.M. : Modifications of the Direct algorithm. Ph.D. thesis, North
Carolina State University (2001).

[10] Gablonsky, J.M., Kelley, C.T. : A locally-biased form of the DIRECT algo-
rithm. Journal of Global Optimization. (2001), 21(1), 27-37. DOI :10.1023/A:
1017930332101.
[11] Galperin, E. A. : The beta-algorithm. Journal of Mathematical Analysis and
Applications, 126 :455–468, 1987.

[12] Galperin, E. A. : Precision, complexity, and computational schemes of


the cubic algorithm. Journal of Optimization Theory and Applications,
57(2) :223–238, May 1988.

[13] Gourdin, E., Jaumard, B., MacGibbon, B.B. : Global Optimization of Multivariate Lipschitz Functions : Survey and Computational Comparison. Les Cahiers du GERAD, May 1994.

[14] Grbić, R., Nyarko, E.K., Scitovski, R. : A modification of the DIRECT method for Lipschitz global optimization for a symmetric function. Journal of Global Optimization 57, 4 (2013), 1193–1212. https://doi.org/10.1007/s10898-012-0020-3.
[15] He, J., Verstak, A., Sosonkina, M., Watson, L.T. : Performance modeling and
analysis of a massively parallel DIRECT–Part 2. The International Journal
of High Performance Computing Applications 23, 1 (2009), 29–41. https:
//doi.org/10.1177/1094342008098463.
[16] He, J., Verstak, A., Sosonkina, M., Watson, L.T. : Performance modeling and
analysis of a massively parallel DIRECT–part 1. The International Journal
of High Performance Computing Applications 23, 1 (2009), 14–28. https:
//doi.org/10.1177/1094342008098462.
[17] He, J, Watson, L.T., Sosonkina, M., Watson, L.T. : VTDIRECT95 : Serial and Pa-
rallel Codes for the Global Optimization Algorithm Direct. ACM Trans. Math.
Softw. 36, 3, Article 17 (July 2009), 24 pages. https://doi.org/10.1145/
1527286.1527291.
[18] Hedar, A. : Test functions for unconstrained global optimization. :http:
//www-optima.amp.i.kyotou.ac.jp/member/student/hedar/
Hedar_files/TestGO.htm (2005). ((accessed on 23 August 2006)).
[19] Horst, R., Pardalos, P.M., Thoai, N.V. : Introduction to Global Optimization.
Nonconvex Optimization and Its Application. Kluwer Academic Publishers
(1995).

[20] Horst, R., Tuy, H. : Global Optimization : Deterministic Approaches. Springer,


Berlin (1996).

[21] Janka, E. : Vergleich Stochastischer Verfahren zur Globalen Optimierung. Diplomarbeit, Universität Wien, 1999.

[22] Jonas, M., Paulavičius, R., Dainius Rusakevičius, Šešok, D., Žilinskas, J. : Ap-
plication of Reduced-set Pareto-Lipschitzian Optimization to truss optimiza-
tion. Journal of Global Optimization 67, 1-2 (2017), 425–450. https://doi.
org/10.1007/s10898-015-0364-6.
[23] Jones, D.R., Perttunen, C.D., Stuckman, B.E. : Lipschitzian optimization wi-
thout the Lipschitz constant. Journal of Optimization Theory and Applica-
tion. (1993), 79(1), 157-181. DOI :10.1007/BF00941892.

[24] Jones, D.R. : The Direct global optimization algorithm. In : C.A. Floudas, P.M.
Pardalos (eds.) The Encyclopedia of Optimization, pp. (2001), 431-440. Klu-
wer Academic Publishers, Dordrect (2001).

[25] Jones, D.R., Martins, J.R.R.A. : The DIRECT algorithm : 25 years later. Journal
of Global Optimization 79, 521–566 (2021). https://doi.org/10.1007/
s10898-020-00952-6.
[26] Ma, K., Rios, L. M., Bhosekar, A., Sahinidis, N., V., Rajagopalan, S. : Branch-
and-Model : a derivative-free global optimization algorithm. Computatio-
nal Optimization and Applications. (2023), https://doi.org/10.1007/
s10589-023-00466-3.
[27] Kvasov, D.E., Sergeyev, Y.D. : Lipschitz gradients for global optimization in a
one-point-based partitioning scheme. Journal of Computational and Applied
Mathematics. (2012), 236(16), 4042-4054. DOI :10.1016/j.cam.2012.02.
020.
[28] Liberti, L., Kucherenko, S. : Comparison of deterministic and stochas-
tic approaches to global optimization. International Transactions in
Operational Research 12(3), 263–285 (2005) https://onlinelibrary.
wiley.com/doi/pdf/10.1111/j.1475-3995.2005.00503.x.https:
//doi.org/10.1111/j.1475-3995.2005.00503.x.
[29] Liu, Q. : Linear scaling and the DIRECT algorithm. Journal of Global Op-
timization 56 (2013), 1233–1245. Issue 3. https://doi.org/10.1007/
s10898-012-9952-x.
[30] Liu, Q., Cheng, W. : A modified DIRECT algorithm with bilevel partition.
Journal of Global Optimization. (2014), 60(3), 483-499. DOI :10.1007/
s10898-013-0119-1.
[31] Liu, H., Xu, S.,Wang, X.,Wu, J., Song, Y. : A global optimization algorithm for
simulation-based problems via the extended DIRECT scheme. Eng. Optim.
(2015), 47(11), 1441–1458. DOI :10.1080/0305215X.2014.971777.

[32] Liu, Q., Zeng, J., Yang, G. : MrDIRECT : a multilevel robust DIRECT algorithm
for global optimization problems. Journal of Global Optimization. (2015),
62(2), 205-227. DOI :10.1007/s10898-014-0241-8.

[33] Liuzzi, G., Lucidi, S., Piccialli, V. : A direct-based approach exploiting lo-
cal minimizations for the solution for large-scale global optimization pro-
blems. Computational Optimization and Applications. (2010), 45(2), 353-375.
DOI :10.1007/s10589-008-9217-2.

[34] Liuzzi, G., Lucidi, S., Piccialli, V. : A DIRECT-based approach exploiting local
minimizations for the solution of large-scale global optimization problems.
Computational Optimization and Applications. (2010), 45, 353-375. DOI :10.
1007/s10589-008-9217-2.
[35] Liuzzi, G., Lucidi, S., Piccialli, V. : A partition-based global optimization algo-
rithm. Journal of Global Optimization. (2010), 48(1), 113-128. DOI :10.1007/
s10898-009-9515-y.
[36] Liuzzi, G., Lucidi, S., Piccialli, V. : Exploiting derivative-free local searches in
direct-type algorithms for global optimization. Computational Optimization
and Applications pp. (2014), 1-27. DOI :10.1007/s10589-015-9741-9.

[37] Paulavičius, R., Žilinskas, J. : Analysis of different norms and corresponding


Lipschitz constants for global optimization. Information Technology and
Control. (2007), 36(4), 383-387.

[38] Piyavskii, S.A. : An algorithm for finding the absolute minimum of a func-
tion. Theory of Optimal Solutions 2, 13–24 (1967). https://doi.org/
10.1016/0041-5553(72)90115-2. in Russian. DOI :10.1080/13928619.
2006.9637758.
[39] Shubert, B.O. : A sequential method seeking the global maximum of a func-
tion. SIAM Journal on Numerical Analysis 9, 379–388 (1972). https://doi.
org/10.1137/0709036.
[40] Pintér, J.D. : Global Optimization in Action : Continuous and Lipschitz Opti-
mization : Algorithms, Implementations and Applications. Nonconvex Op-
timization and Its Applications, vol. 6. Springer, Berlin, Germany (1996).
https://doi.org/10.1007/978-1-4757-2502-5.
[41] Paulavičius, R., Žilinskas, J., Grothey, A. : Parallel branch and bound for glo-
bal optimization with combination of Lipschitz bounds. Optimization Me-
thods and Software. (2011), 26(3), 487-498. DOI :10.1080/10556788.2010.
551537 .
[42] Paulavičius, R., Žilinskas, J. : Simplicial Lipschitz optimization without the
Lipschitz constant. Journal of Global Optimization 59, 1 (2013), 23–40.
https://doi.org/10.1007/s10898-013-0089-3.
[43] Paulavičius, R., Žilinskas, J. : Simplicial Global Optimization. SpringerBriefs
in Optimization. Springer New York, New York, NY (2014). DOI :10.1007/
978-1-4614-9093-7.
[44] Paulavičius, R., Sergeyev, Y.D., Kvasov, D.E., Žilinskas, J. : Globally-biased
DISIMPL algorithm for expensive global optimization. Journal of Glo-
bal Optimization 59, 2-3 (2014), 545–567. https://doi.org/10.1007/
s10898-014-0180-4.
[45] Paulavičius, R., Chiter, L., Žilinskas, J. : Global optimization based on
bisection of rectangles, function values at diagonals, and a set of Lip-
schitz constants. J. Glob. Optim. (2018), 71(1), 5–20. DOI :10.1007/
s10898-016-0485-6.
[46] Paulavičius, R., Sergeyev, Y.D., Globally-biased BIRECT algorithm with local
accelerators for expensive global optimization, Expert Systems with Applica-
tions. November 2019.

[47] Paulavičius, R., Sergeyev, Y.D., Kvasov, D.E., Žilinskas, J. : Globally-biased BI-
RECT algorithm with local accelerators for expensive global optimization. Ex-
pert Systems with Applications 144 (2020), 11305. https://doi.org/10.
1016/j.eswa.2019.113052.
[48] Sergeyev, Y.D. : On convergence of "divide the best" global optimization algorithms. Optimization. (1998), 44(3), 303-325.

[49] Sergeyev, Y.D. : An efficient strategy for adaptive partition of N-dimensional


intervals in the framework of diagonal algorithms. Journal of Optimiza-
tion Theory and Applications. (2000), 107(1), 145-168. DOI :10.1023/A:
1004613001755.
[50] Sergeyev, Y.D. : Efficient partition of n-dimensional intervals in the framework
of one-point-based algorithms. Journal of optimization theory and applica-
tions. (2005), 124(2), 503-510. DOI :10.1007/s10957-004-0948-7.

[51] Sergeyev, Y.D., Kvasov, D.E. : Global search based on diagonal partitions and a
set of Lipschitz constants. SIAM Journal on Optimization. (2006), 16(3), 910-
937. DOI :10.1137/040621132.
[52] Sergeyev, Y.D., Kvasov, D.E. : Diagonal Global Optimization Methods. FizMat-
Lit, Moscow (2008). In Russian.

[53] Sergeyev, Y.D., Kvasov, D.E. : On deterministic diagonal methods for sol-
ving global optimization problems with Lipschitz gradients. In : Optimi-
zation, Control, and Applications in the Information Age, 130, pp . Sprin-
ger International Publishing Switzerland. (2015), 315-334. DOI :10.1007/
978-3-319-18567-5-16.
[54] Sergeyev, Y.D., Kvasov, D.E. : Lipschitz global optimization. In : Cochran, J.J.,
Cox, L.A., Keskinocak, P., Kharoufeh, J.P., Smith, J.C. (eds.) Wiley Encyclopedia
of Operations Research and Management Science (in 8 Volumes) vol. 4, pp.
2812–2828. John Wiley and Sons, New York, NY, USA (2011).

[55] Sergeyev, Y.D. ; Kvasov, D.E. Deterministic Global Optimization : An


Introduction to the Diagonal Approach ; SpringerBriefs in Optimiza-
tion ; Springer : Berlin, Germany, 2017. https://doi.org/10.1007/
978-1-4939-7199-2.
[56] Shen, Z., Zhu, Y. : An Interval Version of Shubert’s Iterative Method for the
Localization of the Global Maximum. Computing, 23 :275–280, 1987.

[57] Stripinis, L., Paulavičius, R., Žilinskas, J. : Improved scheme for selection of
potentially optimal hyperrectangles in DIRECT. Optim. Lett. (2018), 12(7),
1699–1712. DOI :10.1007/s11590-017-1228-4.

[58] Stripinis, L., Paulavičius, R., Žilinskas, J. : Penalty functions and two-step se-
lection procedure based DIRECT-type algorithm for constrained global op-
timization. Struct. Multidiscip. Optim. (2019), 59(6), 2155–2175. DOI :10.
1007/s00158-018-2181-2.
[59] Stripinis, L. : Parallel DIRECT-type algorithms for generally constrai-
ned global optimization problems in MATLAB. https://github.com/
blockchain-group/pDIRECT-GLce.
[60] Stripinis, L., Žilinskas, J., Casado, L.G., Paulavičius, R. : On MATLAB ex-
perience in accelerating DIRECT-GLce algorithm for constrained global
optimization through dynamic data structures and parallelization. Appl.
Math. Comput. 390 (2021), 1–17. https://doi.org/10.1016/j.amc.
2020.125596.
[61] Stripinis, L., Paulavičius, R. : DIRECTGOLib - DIRECT Global Optimization
test problems Library, v1.1. Zenodo (2022). https://doi.org/10.5281/
zenodo.6491951.
[62] Stripinis, L., Paulavičius, R. : DIRECTGO : A new DIRECT-type MATLAB
toolbox for derivative free global optimization. GitHub (2022). https://
github.com/blockchain-group/DIRECTGO.
[63] Stripinis, L., Paulavičius, R. : DIRECTGO : A new DIRECT-type MATLAB tool-
box for derivative free global optimization. arXiv (2022). https://arxiv.
org/abs/2107.0220.
[64] Stripinis, L., Paulavičius, R. : Lipschitz-inspired HALRECT Algorithm for
Derivative-free Global Optimization. https://doi.org/10.48550/
arXiv.2205.03015.
[65] Stripinis, L. ; Paulavičius, R. An extensive numerical benchmark study of
deterministic vs. stochastic derivative-free global optimization algorithms.
https://doi.org/10.48550/ARXIV.2209.05759.
[66] Stripinis, L. ; Paulavičius, R. An empirical study of various candidate selection
and partitioning techniques in the framework. J. Glob. Optim. 2022, 1–31.
https://doi.org/10.1007/s10898-022-01185-5.
[67] Stripinis, L. ; Paulavičius, R. Experimental Study of Excessive Local Refine-
ment Reduction Techniques for Global Optimization -Type Algorithms. Ma-
thematics 2022, 10, 3760. https://doi.org/10.3390/math10203760.

[68] Surjanovic, S., Bingham, D. : Virtual Library of Simulation Experiments : Test


Functions and Datasets. http://www.sfu.ca/~ssurjano/index.html
(2013). Online ; accessed : 2017-03-22.
[69] Tsvetkov, E.A., Krymov, R.A. : Pure Random Search with Virtual Extension of
Feasible Region. J Optim Theory Appl 195, 575—595 (2022). https://doi.
org/10.1007/s10957-022-02097-w.
[70] Tuy, H. : Convex Analysis and Global Optimization. Springer Science & Busi-
ness Media (2013).

[71] Watson, L.T., Baker, C.A. : A fully-distributed parallel global search algo-
rithm. Engineering Computations. Engineering Computations 18, 1/2 (2001),
155–169. https://doi.org/10.1108/02644400110365851.

[72] Zhigljavsky, A., Žilinskas, A. : Stochastic Global Optimization. Springer, New


York (2008).
CHAPTER 5

Appendices

.1 Key characteristics of the Hedar test problems [18]

TABLE 1 – Key characteristics of the Hedar test problems[18].

Problem No.    Problem name    Dimension n    Feasible region D = ([a_j, b_j], j = 1, ..., n)    No. of local minima    Optimum f*

1∗, 2∗, 3∗    Ackley    2, 5, 10    [−15, 35]^n    multimodal    0.0
4    Beale    2    [−4.5, 4.5]^2    multimodal    0.0
5∗    Bohachevsky 1    2    [−100, 110]^2    multimodal    0.0
6∗    Bohachevsky 2    2    [−100, 110]^2    multimodal    0.0
7∗    Bohachevsky 3    2    [−100, 110]^2    multimodal    0.0
8    Booth    2    [−10, 10]^2    unimodal    0.0
9    Branin    2    [−5, 10] × [0, 15]    3    0.39789
10    Colville    4    [−10, 10]^4    multimodal    0.0
11, 12, 13    Dixon & Price    2, 5, 10    [−10, 10]^n    unimodal    0.0
14    Easom    2    [−100, 100]^2    multimodal    −1.0
15    Goldstein & Price    2    [−2, 2]^2    4    3.0
16∗    Griewank    2    [−600, 700]^2    multimodal    0.0
17    Hartman    3    [0, 1]^3    4    −3.86278
18    Hartman    6    [0, 1]^6    4    −3.32237
19    Hump    2    [−5, 5]^2    6    −1.03163
20, 21, 22    Levy    2, 5, 10    [−10, 10]^n    multimodal    0.0
23∗    Matyas    2    [−10, 15]^2    unimodal    0.0
24    Michalewics    2    [0, π]^2    2!    −1.80130
25    Michalewics    5    [0, π]^5    5!    −4.68765
26    Michalewics    10    [0, π]^10    10!    −9.66015
27    Perm    4    [−4, 4]^4    multimodal    0.0
28, 29    Powell    4, 8    [−4, 5]^n    multimodal    0.0
30    Power Sum    4    [0, 4]^4    multimodal    0.0
31∗, 32∗, 33∗    Rastrigin    2, 5, 10    [−5.12, 6.12]^n    multimodal    0.0
34, 35, 36    Rosenbrock    2, 5, 10    [−5, 10]^n    unimodal    0.0
37, 38, 39∗    Schwefel    2, 5, 10    [−500, 500]^n    unimodal    0.0
40    Shekel, m = 5    4    [0, 10]^4    5    −10.15320
41    Shekel, m = 7    4    [0, 10]^4    7    −10.40294
42    Shekel, m = 10    4    [0, 10]^4    10    −10.53641
43    Shubert    2    [−10, 10]^2    760    −186.73091
44∗, 45∗, 46∗    Sphere    2, 5, 10    [−5.12, 6.12]^n    multimodal    0.0
47∗, 48∗, 49∗    Sum squares    2, 5, 10    [−10, 15]^n    unimodal    0.0
50    Trid    6    [−36, 36]^6    multimodal    −50.0
51    Trid    10    [−100, 100]^10    multimodal    −210.0
52∗, 53∗, 54∗    Zakharov    2, 5, 10    [−5, 11]^n    multimodal    0.0
.2 Global minimizer found by the BIRECT-V algorithm
using Hedar test problems [18] with modified domain
from Table 3.3
TABLE 2 – Global minimizer found by the BIRECT-V algorithm using Hedar test problems [18] with modified domain from
Table 3.3.

Problem Dimension Modified domain Global minimizer Globally optimal known


number (from Table 1) n D̃ found by BIRECT-V solution (Source [18, 68, 61])

1, 2, 3 2, 5, 10 [−15, 32]n [0.0000038147, [0]


4 2 −− [3.0000000000 0.4980468750] [3 ; 0.5]
5, 6, 7 2 [−100, 110.7]2 [0.0001953125, [0 ; 0]
8 2 [−10, 10.1]2 [0.9987304687 3.0008789062] [1 ; 3]
9 2 −− [3.1396484375 2.2753906250] [3.1416 ; 2.275]
10 4 −− [0.9993489583, [1 ; 1 ; 1 ; 1]

11 2 [−10, 10.4554]^2 [1.0033203125 -0.7069335937] [2^(−(2^i−2)/2^i)]
12 5 [−10.40, 12.301]^5 [1.006 0.709 0.594 0.545 0.523] [2^(−(2^i−2)/2^i)]
13 10 [−10, 12]^10 [1.002 0.708 0.595 0.544 0.521 0.510 0.505 0.502 0.501 0.501] [2^(−(2^i−2)/2^i)]

14 2 −− [3.1412760417, [π; π]
15 2 −− [0.0000000000 -1.0000000000] [0 ; -1]
16 2 −− [0.0006357829 -0.0010172526] [0]

17 3 −− [0.114 0.557 0.854] [0.115 ; 0.556 ; 0.852]


18 6 −− [0.203 0.148 0.476 0.273 0.312 0.656] [0.202 0.150 0.477 0.275 0.312 0.657]

19 2 −− [-0.0911458333 0.7096354167] [-0.090 ; 0.713]

20 2 [−10, 10.51]2 [1.0027604167 1.0027604167] [1]


21 5 [−10, 10.5]5 [0.9973958333, [1]
22 10 [−10, 10.5]10 [0.9973958333, [1]

23 2 −− [0.0260416667, [0]

24 2 −− [2.203 1.571] [ 2.203 1.571]


25 5 [1.04, π]5 [2.203 1.571 1.285 1.924 1.720] [2.203 1.571 1.285 1.923 1.720 ]
26 10 −− [2.209 1.571 1.288 1.117 0.982 1.571 0.834 2.356 0.736 1.571] [2.203 1.571 1.285 1.923 1.720 1.571 1.454 1.756 1.656 1.571]

27 4 −− [1 2 3 4] [1 2 3 4]

28 4 −− [-0.021 0.002 -0.039 -0.039] [0]


29 8 [−4, 4.01]8 [0.009 -0.001 0.005 0.005 -0.050 0.005 0.005 0.005] [0]

30 4 −− [1.001 2.000 2.000 3.000] [2.000 1.000 3.000 2.000]

31 2 −− [-0.0003483073] [0]
32 5 [−5.12, 5.30]5 [0.0001139323, [0]
33 10 [−5.12, 5.12]10 [0.0001000000, [0]

34 2 −− [1.0009765625, [1]
35 5 −− [0.9997558594, [1]
36 10 [−5, 10.1]10 [0.9998168945, [1]

37, 38 2, 5 [−519, 519]n [420.9694824219, [420.9687474737558,


39 10 [−500, 650]10 [420.9686279297, [420.9687474737558,

40 4 −− [4.001 4.001 4.001 3.997] [4.000 ; 4.000 ; 4.000 ; 4.000]


41 4 −− [4.001 4.001 3.997 3.997] [4.000 ; 4.001 ; 3.999 ; 3.999]
42 4 −− [4.001 4.001 3.997 3.997] [4.001 ; 4.000 ; 3.999 ; 3.999]

43 2 [−5.12, 512]2 [-1.426 -0.801] [4.858, -7.083]


44 2 −− [0.0023958333, [0]
45 5 −− [0.0023958333, [0]
46 10 −− [0.0023958333, [0]

47 2 [−10, 11.5]2 [0.0016276042 -0.0065104167] [0]


48 5 [−10, 10.5]5 [0.0016276042, [0]
49 10 [−10, 10.5]10 [-0.0004069010, [0]

50 6 [−36.5, 36.5]6 [5.9880 9.980 11.976 11.976 9.9800 5.988] [i ∗ (6 + 1 − i )]


51 10 [−120, 120]10 [10.000 17.969 23.984 27.969 29.922 29.922 27.969 24.062 17.969 10.000] [i ∗ (10 + 1 − i )]

52 2 [−5, 12]2 [0.0026041667 0.0026041667] [0]


53 5 [−5, 10.01774]5 [0.0010247461, [0]
54 10 [−5, 13]10 [0.000 0.125 0.000 0.250 0.250 -1.000 0.000 0.129 0.129 0.125] [0]
.3 Derivative-free optimization software

The objective of this appendix is to evaluate the current status of software imple-
mentations of the previously discussed algorithms. These software implementa-
tions have been in development or have been released since 1998. They share the
common capability of effectively handling black-box functions in a non-intrusive
manner. This implies that these optimization solvers can operate without access
to or modification of the source code of the optimization problem being addres-
sed.

The following table (Table 3) provides an overview of the considered solvers, accompanied by their key characteristics. Each solver's handling of variable bounds is indicated in the 'Bounds' column, with potential entries being :

"no" for solvers that lack support for bounds.

"required" for solvers mandating bounds on all variables.

"optional" for solvers that allow bounds but do not necessitate them.

Additionally, the ’Constraints’ column reflects the solvers’ capacity to manage li-
near algebraic constraints and general constraints available solely as black-box
executables, without an explicit functional representation. This appendix will pro-
ceed to delve into each solver's details.

ASA

Adaptive Simulated Annealing (ASA) is a C-based optimization tool designed for


unconstrained problems. It uses a unique probability distribution to explore solu-
tions far from the current point, helping it avoid getting stuck in local minima. ASA
also employs separate temperature parameters for each variable, adjusting them
as it progresses in the optimization process, making it adaptable and effective for
a wide range of optimization challenges.

BOBYQA

Bound Optimization BY Quadratic Approximation (BOBYQA) is a Fortran implementation of Powell's model-based algorithm [115]. BOBYQA is an extension of the NEWUOA algorithm to bounded problems with additional considerations on the set of rules that maintain the set of interpolating points.

CMA-ES

Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [55] is a genetic algorithm implemented in multiple languages including C, Matlab, and Python. Mutation is performed by a perturbation with expected value zero and a covariance matrix which is iteratively updated to guide the search towards areas with expected lower objective values [56].

DAKOTA solvers

Design Analysis Kit for Optimization and Terascale Applications (DAKOTA) [45] is a project at Sandia National Laboratories. DAKOTA's initial scope was to create a toolkit of black-box optimization methods. The scope was later expanded to include additional optimization methods and other engineering applications, including design of experiments and nonlinear least squares. DAKOTA contains a collection of optimization software packages featuring the COLINY library [124] that includes, among others, the following solvers that we tested :

1. DAKOTA/EA : an implementation of various genetic algorithms ;

2. DAKOTA/DIRECT : an implementation of the DIRECT algorithm ;

3. DAKOTA/PATTERN : various pattern search methods ; and

4. DAKOTA/SOLIS-WETS : greedy search, comparing the incumbent with points generated from a multivariate normal distribution.

DFO

Derivative Free Optimization (DFO) [28,125] is an open-source Fortran implementation of the trust-region-based algorithm originally developed by Conn et al. [31,32] and expanded by Conn et al. [33]. DFO is a local optimization method designed to handle very expensive function evaluations for small-dimensional problems with fewer than 50 variables. Given a set of points, DFO identifies the point with the best objective found and builds a quadratic model by interpolating a selected subset of points. The resulting model is optimized within a trust region centered at the best point. Our computational experiments used the open-source Fortran software IPOPT [29] to solve the trust-region subproblems.
FMINSEARCH

FMINSEARCH is an implementation of the Nelder-Mead simplex-based method of Lagarias et al. [81]. This code is included as a Matlab built-in function in the Optimization Toolbox and handles unconstrained optimization problems.

GLOBAL

GLOBAL [37] is a Matlab implementation of a multistart stochastic method proposed by Boender et al. [20]. GLOBAL draws random uniformly distributed points in a space of interest, selects a sample, and applies a clustering procedure. The derivative-free local search solver UNIRANDI [69], based on random direction generation and linear search, is used from points outside the cluster.

HOPSPACK

Hybrid Optimization Parallel Search PACKage (HOPSPACK) [108] is a parallel processing framework implemented in C++ that includes a GSS local solver. The user is allowed to perform an asynchronous parallel optimization run using simultaneously the embedded GSS local solver, along with user-provided solvers.

IMFIL

IMFIL [74] is a Matlab implementation of the implicit filtering algorithm [142, 50].

MCS

Multilevel coordinate search (MCS) [101] is a Matlab implementation of the algorithm proposed by Huyer and Neumaier [65] for global optimization of bound-constrained problems.

NEWUOA

NEWUOA [114] is a Fortran implementation of Powell's model-based algorithm for derivative-free optimization [113]. The inputs to NEWUOA include initial and lower bounds for the trust-region radius, and the number of function values to be used for interpolating the quadratic model.

NOMAD

NOMAD [6,82] is a C++ implementation of the LTMADS [11] and ORTHOMADS [8] methods with the extreme barrier [11], filter [2] and progressive barrier [12] approaches to handle general constraints. It is designed to solve nonlinear, nonsmooth, noisy optimization problems. NOMAD's termination criterion allows for a combination of the number of points evaluated, minimum mesh size, and the acceptable number of consecutive trials that fail to improve the incumbent. A related implementation, NOMADm [3], is a collection of Matlab functions that solve bound-constrained, linear or nonlinear optimization problems with continuous, discrete, and categorical variables.

PSWARM

PSWARM [137] is a C implementation of the particle swarm pattern search method [138]. Its search step performs global search based on the particle swarm algorithm. Its poll step relies on a coordinate search method. The poll directions coincide with positive and negative unit vectors of all variable axes. PSWARM allows the user to compile and use the solver as a stand-alone software or as a custom AMPL solver. A related Matlab implementation PSwarmM is also available [137].

SID-PSM

SID-PSM [41] is a Matlab implementation of a pattern search method with the poll
step guided by simplex derivatives [40]. The search step relies on the optimization
of quadratic surrogate models [39]. SID-PSM is designed to solve unconstrained
and constrained problems.

SNOBFIT

SNOBFIT is a Matlab implementation of the branch-and-fit algorithm proposed by Huyer and Neumaier [66].

TOMLAB solvers

TOMLAB [60] is a Matlab environment that provides access to several derivative-free optimization solvers, the following of which were tested :

1. TOMLAB/GLCCLUSTER [60, pp.109-111] : an implementation of the DIRECT algorithm [71] hybridized with clustering techniques [59] ;

2. TOMLAB/LGO [107] : a suite of global and local nonlinear solvers [106] that implements a combination of Lipschitzian-based branch-and-bound with deterministic and stochastic local search (several versions of LGO are available, for instance under Maple, and may offer different features than TOMLAB/LGO) ;

3. TOMLAB/MULTIMIN [60, pp.148–151] : a multistart algorithm ; and

4. TOMLAB/OQNLP [62] : a multistart approach that performs searches using a local NLP solver from starting points chosen by a scatter search algorithm.

Additional solvers considered

In addition to the above solvers for which detailed results are presented in the sequel, we experimented with several solvers for which results are not presented here. These solvers were :

1. PRAXIS : an implementation of the minimization algorithm of Brent [25] ;

2. TOMLAB/GLBSOLVE [60, pp.106–108] : an implementation of the DIRECT algorithm [72], specifically designed to handle box-bounded problems ;

3. TOMLAB/GLCSOLVE [60, pp.118–122] : an implementation of an extended version of the DIRECT algorithm [72] that can handle integer variables and linear or nonlinear constraints ;

4. TOMLAB/EGO [61, pp. 11–24] : an implementation of the EGO algorithm [126, 73] modified to handle linear and nonlinear constraints ;

5. TOMLAB/RBFSOLVE [61, pp.5–10] : an implementation of the radial basis function method [19,53] that can handle box-constrained global optimization problems.

The PRAXIS solver is one of the first derivative-free optimization solvers developed. This solver had below average performance and has not been updated since its release in 1973. Results for the above four TOMLAB solvers are not included in the comparisons since these solvers have been designed with the expectation for the user to provide a reasonably small bounding box for the problem variables [59].
TABLE 3 – Derivative-free solvers

Solver    URL    Version    Languages    Bounds    Constraints

ASA    www.ingber.com    26.30    C    Required    No
BOBYQA    Available by email from mjdp@cam.ac.uk    2009    Fortran    Required    No
CMA-ES    www.lri.fr/~hansen/cmaesintro.html    3.26beta    MATLAB    Optional    No
DAKOTA solvers    www.cs.sandia.gov/dakota/    4.2    C++    Required    Yes
DFO    projects.coin-or.org/Dfo    2.0    Fortran    Required    Yes
GLOBAL    www.inf.u-szeged.hu/~csendes    1.0    Matlab    Required    No
HOPSPACK    software.sandia.gov/trac/hopspack    2.0    C++    Optional    Yes
IMFIL    www4.ncsu.edu/~ctk/imfil.html    1.01    Matlab    Required    Yes
MCS    www.mat.univie.ac.at/~neum/software/mcs/    2.0    Matlab    Required    No
NEWUOA    Available by email from mjdp@cam.ac.uk    2004    Fortran    No    No
NOMAD    www.gerad.ca/nomad/    3.3    C++    Optional    Yes
PSWARM    www.norg.uminho.pt/aivaz/pswarm/    1.3    Matlab    Required    Yes∗
SID-PSM    www.mat.uc.pt/sid-psm/    1.1    Matlab    Optional    Yes∗
SNOBFIT    www.mat.univie.ac.at/~neum/software/snobfit/    2.1    Matlab    Required    Yes
TOMLAB solvers    URL    7.3    Matlab    Required    Yes
