PDF Business Analytics For Decision Making First Edition Kimbrough Ebook Full Chapter
PDF Business Analytics For Decision Making First Edition Kimbrough Ebook Full Chapter
PDF Business Analytics For Decision Making First Edition Kimbrough Ebook Full Chapter
https://textbookfull.com/product/business-analytics-data-
analysis-decision-making-6th-edition-s-christian-albright/
https://textbookfull.com/product/business-analytics-data-
analysis-decision-making-mindtap-course-list-7th-edition-
albright/
https://textbookfull.com/product/accounting-business-reporting-
for-decision-making-sixth-edition-jacqueline-birt/
https://textbookfull.com/product/accounting-business-reporting-
for-decision-making-seventh-edition-jacqueline-birt/
Metaheuristics for Business Analytics: A Decision
Modeling Approach 1st Edition Abraham Duarte
https://textbookfull.com/product/metaheuristics-for-business-
analytics-a-decision-modeling-approach-1st-edition-abraham-
duarte/
https://textbookfull.com/product/data-science-for-business-and-
decision-making-1st-edition-luiz-paulo-favero/
https://textbookfull.com/product/business-ethics-decision-making-
for-personal-integrity-social-responsibility-4th-edition-laura-
hartman/
https://textbookfull.com/product/data-analysis-for-business-
decision-making-a-laboratory-manual-2nd-edition-andres-fortino/
Business Analytics
for Decision Making
This page intentionally left blank
Business Analytics
for Decision Making
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the valid-
ity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or uti-
lized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopy-
ing, microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://
www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Preface xi
I Starters 1
1 Introduction 3
v
vi Contents
3 Linear Programming 43
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Wagner Diet Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3 Solving an LP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4 Post-Solution Analysis of LPs . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5 More than One at a Time: The 100% Rule . . . . . . . . . . . . . . . . . . 53
3.6 For Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.7 For More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
II Optimization Modeling 59
4 Simple Knapsack Problems 61
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Solving a Simple Knapsack in Excel . . . . . . . . . . . . . . . . . . . . . . 61
4.3 The Bang-for-Buck Heuristic . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 Post-Solution Analytics with the Simple Knapsack . . . . . . . . . . . . . . 64
4.4.1 Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.2 Candle Lighting Analysis . . . . . . . . . . . . . . . . . . . . . . . . 71
4.5 Creating Simple Knapsack Test Models . . . . . . . . . . . . . . . . . . . . 72
4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.7 For Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.8 For More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5 Assignment Problems 81
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 The Generalized Assignment Problem . . . . . . . . . . . . . . . . . . . . . 82
5.3 Case Example: GAP 1-c5-15-1 . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4 Using Decisions from Evolutionary Computation . . . . . . . . . . . . . . . 86
5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.6 For Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.7 For More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.3 Solution Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.3.1 Exact Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.3.2 Heuristic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.3.2.1 Construction Heuristics . . . . . . . . . . . . . . . . . . . . 101
6.3.2.2 Iterative Improvement or Local Search . . . . . . . . . . . 102
6.3.3 Putting Everything Together . . . . . . . . . . . . . . . . . . . . . . 103
6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Contents vii
V Conclusion 279
19 Conclusion 281
A Resources 289
Bibliography 291
Index 303
Preface
Business analytics is about using data and models to solve—or at least to help out on—
decision problems faced by individuals and organizations of all sorts. It is a broad, open,
and expanding concept, with many facets and with application well beyond the prototypical
context of a commercial venture. We take the term business analytics to apply in any situ-
ation in which careful decision making is undertaken based on models and data (including
text, video, graphic, etc. data, as well as standard numerical and transaction processing
data).
The two principal facets or aspects of business analytics are data analytics (associated
with the popular buzz term “big data”) and model analytics. These two facets, of course,
interact and overlap extensively. Models are needed for data analytics and model analytics
often, as we shall see, involve exploration of large corpora of data.
This book is mainly about model analytics, particularly model analytics for constrained
optimization. Moreover, our focus is unremittingly practical. To this end, we focus heavily on
parameter sweeping (a term of art in the simulation community, apt for but not much used
in optimization) and decision sweeping, a new term we introduce here. Both are methods
and conceptual tools for model analytics. The larger, governing principle is what we call
solution pluralism. It is constituted by the collection of multiple solutions to models as an
aid to deliberation with them.
Our primary topics emphasis is three-fold and distinctive in each case.
1. We focus on computationally challenging problems actually arising in business enter-
prises and contexts. This is natural for us because of where and who we teach, because
of the research we do, and simply because there are very many such problems.
2. We dwell extensively (but not exclusively) on using heuristics for solving difficult
constrained optimization problems. Modern metaheuristics—such as simulated an-
nealing and genetic algorithms, which we discuss in detail—have proved their worth
compellingly and have received enthusiastic uptake among practitioners. This book
is unusual among business analytics texts in focusing on using heuristics for solving
difficult optimization problems that are important in practice.
3. We emphasize throughout the use of constrained optimization models for decision
making. In consequence, post-solution analysis of models as it contributes to deliber-
ation for decision making is a main topic in the book. We take seriously the saying,
commonly assented to among practitioners, to the effect that after a model has been
specified, formulated, implemented, and solved, then the real work begins. This book
is very much about that real work, undertaken after a model has been solved. The em-
phasis on post-solution analysis is another distinctive feature of the book, motivated
by quite apparent practical needs.
Our emphasis on post-solution analysis is in distinction to the usual preoccupation with
model formulation and solution theory. These are important topics and we certainly touch
upon them in no small way, as is appropriate. Model formulation and model solution theory
xi
xii Preface
are topics that complement the study of post-solution analysis, which is a main concern in
this book.
Our intended audience for the book consists of those who would undertake serious de-
liberative decision making with models, in short, practitioners of business analytics both
present and future (that is, working professionals and university students). The subject is
larger than can be comprehensively treated in a single book, and we mean this book to be
introductory and accessible to a broad range of readers, having a broad range of mathe-
matical and computational preparation. We aim to present advanced ideas of post-solution
analysis and model-based deliberation in a way that is accessible to tyros and informative
for veteran practitioners.
Our approach, our strategy, is to pick a part of business analytics that is important in
itself, one that should be known by practitioners as an essential part of the background
knowledge for analytics (whether or not the analyst is working directly in the area) and
one that affords discussion of post-solution analysis of models. To that end and as already
noted, we have chosen that part of business analytics that contains model analytics for
constrained optimization models. Further to that end, the book dwells in large part on
combinatorial optimization problems, and the use of modern metaheuristics to solve them
and to obtain useful information from them for post-solution analysis. Programming is not
required to understand the material, but we make available programming examples for those
interested. We give examples in Excel, GAMS, MATLAB, and OPL. The metaheuristics
code is available online at the book’s Web site in a documented library of Python modules
(http://pulsar.wharton.upenn.edu/~sok/biz_analytics_rep), along with data, mate-
rial for homework exercises, and much else. For readers without programming skills, we
hope to communicate useful information about modeling and model deliberation with the
examples in the text. For readers chary of programming, we note that this material has
been taught and tested with many others who share your outlook. We do our best to make
valuable information for modern decision making accessible to all.
The book is organized into five parts, each containing one or more chapters that naturally
group together.
Part I, “Starters”, contains three chapters. Chapter 1, “Introduction,” frames the entire
book and introduces key concepts and terminology that are used throughout what follows.
Chapter 2, “Constrained Optimization Models: Introduction and Concepts,” is an
overview of the various kinds of constrained optimization models, categorized conventionally
for ease of reference to the literature. A higher level classification divides these models into
those that are tractable (relatively easy to solve, even for large instances) and those that
are intractable (in theory very difficult to solve when scaled up). Our attention in this book
is mainly directed at models in the intractable category and at how we can use heuristics
and metaheuristics to solve them and gain useful information from them for purposes of
decision making.
Chapter 3, “Linear Programming,” presents an overview of linear programming models.
These constrained optimization models are tractable. Large scale instances, with thousands
and even millions of decision variables, are widely deployed. Our treatment of linear pro-
gramming models is light and brief. We say what they are and give examples of how they
can be applied. We touch on solution methods and formulations. The exposition is introduc-
tory, with no attempt to be comprehensive. We aim to provide a useful point of departure
for exploration of post-solution analysis and deliberation, as well as for deeper exploration
of model formulation and solution theory. This is also the pattern we follow in Part II of
the book.
Part II, “Optimization Modeling,” consists of seven chapters, each providing an intro-
duction to an important class of (mostly) intractable constrained optimization models. As
is the case with linear programming, our treatment is brief, light, and broadly accessible.
Preface xiii
We aim to provide a useful point of departure for exploration of post-solution analysis and
deliberation, as well as for deeper exploration of model formulation and solution theory.
The seven classes of models are:
• Chapter 4, knapsack problems. These models are often used for R&D and for financial
portfolio planning.
• Chapter 5, assignment problems. These models are often used for assigning workers
to tasks in a factory or consulting firm.
• Chapter 6, traveling salesmen problems. These problems very often arise in trans-
portation, logistics, and tourism applications.
• Chapter 7, vehicle routing problems. These problems occur ubiquitously in service
firms having responsibilities that range geographically.
• Chapter 8, resource-constrained scheduling problems. These occur very often in
project scheduling for construction and manufacturing ventures, as well as elsewhere.
• Chapter 9, location analysis problems. These arise ubiquitously in siting service and
distribution centers, both for commercial ventures and governments. Related to these
are zone design problems (Chapter 18) that seek to find good service areas, given
demands. Designing sales and service districts are example applications.
• Chapter 10, two-sided matching problems. These are important for designing markets,
typically of buyers on one side and sellers on the other, when what is exchanged is
not a commodity.
Familiarity with and exposure to these seven classes of models, plus linear programming,
belongs in the intellectual armamentarium of every business analyst. Whether or not any
of these models are directly employed, familiarity with them affords a certain maturity of
mind that can recognize and act upon opportunities to improve deliberation and decision
making with models.
Part III, “Metaheuristic Solution Methods,” is about modern metaheuristics, which in
many if not most cases have become preferred methods for solving intractable optimization
problems. A heuristic is a method thought to perform well in most cases. We give exam-
ples throughout the discussion in Part II of heuristics for solving constrained optimization
problems. These traditional heuristics, however, are tailored to very specific model classes
and in many cases metaheuristics perform better. A metaheuristic may be thought of as
a general, parameterized heuristic. For example, genetic algorithms constitute a kind of
metaheuristic. They are general in that they apply very broadly and do not require (much)
problem specific information. Also, they are parameterized. For example, genetic algorithms
will typically have a mutation rate parameter (inspired by biological evolution), which can
be tuned for good performance on a particular class of problems. A genetic algorithm with
all of its parameters set—“instantiated”—constitutes a particular heuristic. Thus, genetic
algorithms in their various forms are said to constitute a metaheuristic because they may
be instantiated in indefinitely many ways to create particular heuristics. Genetic algorithms
are types of heuristics; instantiated as concrete procedures they are tokens of heuristics. The
point applies to other kinds of metaheuristics.
Part III contains three chapters. Chapter 11, “Local Search Metaheuristics,” presents,
motivates, and discusses several metaheuristics in the broader class of local search meta-
heuristics, including greedy hill climbing, simulated annealing, and tabu search. Using
Python code available on the book’s Web site, the chapter also discusses how these methods
may be used to solve exemplar problems discussed in Part II.
xiv Preface
Chapter 12, “Evolutionary Algorithms,” presents, motivates, and discusses several meta-
heuristics in the broader class of population-based metaheuristics, including evolutionary
programming and genetic algorithms. Again, using Python code available on the book’s
Web site, the chapter also discusses how these methods may be used to solve exemplar
problems discussed in Part II.
Chapter 13, “Identifying and Collecting Decisions of Interest,” bridges Parts III and
IV, “Post-Solution Analysis of Optimization Models.” The emphasis on metaheuristics is
a distinctive feature of this book, well justified by current developments in research and
trends in applications. Metaheuristics as we describe them are very commonly accepted and
deployed in practice. Chapter 13 develops connections between these two topics by showing
how metaheuristics may be used to generate and collect information that is valuable for
post-solution analysis. Metaheuristics thus receive a boost. Besides being appropriate when
exact methods fail and besides being useful for finding good starting solutions for exact
methods, metaheuristics are invaluable for post-solution analysis.
Part IV contains five chapters discussing post-solution analysis. The presentation is
example driven. Although we provide ample computational information for the interested
reader, the discussion is designed to be very broadly accessible.
Chapter 14, “Decision Sweeping,” is about using a plurality of solutions or decisions
for a model in support of decision making. It discusses how these data—the plurality of
solutions—obtained from metaheuristic solvers may be used to obtain rich corpora of deci-
sions of interest, and how these corpora may be used to support deliberative decision making
with models. Continuing to use a set of exemplar problems from Part II, the chapter presents
actual data produced by metaheuristic solution of the exemplar models and discusses its use
for decision making. An important point made by the examples is that the data so obtained
are easily understandable and afford deliberation by any general stakeholder.
Chapter 15, “Parameter Sweeping,” discusses how a plurality of solutions to a model,
obtained by varying its parameters systematically, may be used to support deliberative
decision making with the model. Again, the chapter presents actual data obtained from
exemplar models and shows how the results are easily understandable and afford deliberation
by any general stakeholder.
Chapter 16, “Multiattribute Utility Modeling,” presents the very basics of utility theory
and then discusses in detail how to build simple, practical, and useful models for comparing
entities on multiple dimensions at once. The method is quite general and has many success-
ful applications outside of model analytics. Our discussion of how to apply multiattribute
utility modeling for post-solution analysis is the first we know of. Essential to the project is
access to a corpus of solutions or decisions for the model(s) under consideration. The use of
metaheuristics not only for solving optimization models, but for generating useful corpora
of decisions, is a main theme in the book. Thus, the general method of multiattribute utility
modeling is linked innovatively with model analytics.
Chapter 17, “Data Envelopment Analysis,” shows how linear programming may be used
to filter a large number of found solutions for a model, as were produced and used in
Chapters 14 and 15, in order to distinguish efficient from inefficient solutions.
Chapter 18, “Redistricting: A Case Study in Zone Design,” is a case study of the applica-
tion of many of the post-solution analysis ideas found in this book. It describes work related
to a contest to design councilmanic districts in Philadelphia, work which was recognized
both in the original contest and in the INFORMS Wagner Prize competition.
Finally, Part V, “Conclusion,” consists of a single chapter, Chapter 19. It wraps up the
book and directs the reader to points beyond.
***
Preface xv
We wish explicitly to thank a number of people. Steven Bankes, Roberta Fallon, Mar-
garet Hall, Ann Kuo, Peter A. Miller, Fred Murphy, and Ben Wise each read portions of the
book and offered sage and useful comments. Nick Quintus did the excellent GIS work. Over
the years we have benefitted greatly from many ideas by and conversations with Steven
Bankes, Harvey Greenberg, Fred Glover, and Fred Murphy. Our heartfelt thanks to all.
This page intentionally left blank
List of Figures
xvii
xviii List of Figures
3.11 100% rule for constraint RHS values. z ∗ = current optimal value of objective
function. ∆bi amount of increase (positive or negative) in bi (right-hand-side
value of the constraint). λi = shadow price of the constraint (the change in
the objective function value per unit increase in bi ). z 0 = value of objective
function with an optimal solution to the revised problem. Valid for ∆bi s
within their allowable ranges. . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.12 100% rule for objective function coefficients. z ∗ = current optimal value
of objective function. ∆cj amount of increase (positive or negative) in cj
(objective function coefficient on decision variable xj ). x∗j s = optimal values
of the problem’s decision variables, xj s. z 0 = value of objective function to
the revised problem (for which the x∗j s are unchanged). Valid for ∆cj s within
their allowable ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.13 Custom-molder linear programming model. . . . . . . . . . . . . . . . . . 57
6.3 Example of a nearest neighbor heuristic for finding a TSP solution. . . . . 101
6.4 TSP: Cheapest versus nearest insertion heuristics. . . . . . . . . . . . . . . 102
6.5 Initial tour is 1-2-3-4-5 with cost 15. . . . . . . . . . . . . . . . . . . . . . 103
6.6 A first local improvement. . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.7 A second local improvement. . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.8 Scatter plot of 20 jobs. Depot at center as -101. . . . . . . . . . . . . . . . 105
6.9 The 20 jobs of Figure 6.8 sequenced in a random order. . . . . . . . . . . . 106
6.10 The 20 jobs of Figure 6.8 sequenced with a simple insertion procedure. . . 107
6.11 The 20 jobs of Figure 6.8 sequenced with a 2-opt procedure. . . . . . . . . 108
6.12 The best sequencing of the 20 jobs of Figure 6.8 found by 100 2-opt trials. 109
9.1 Philadelphia census tract map with high quality tracts highlighted, and #0,
the putative best, separately distinguished. . . . . . . . . . . . . . . . . . . 134
9.2 Pseudocode description of one run of the greedy hill climbing
metaheuristic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3 More specific pseudocode description of the greedy hill climbing
metaheuristic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.4 Best found decision. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.1 Pseudocode for the deferred acceptance algorithm (DAA) for the simple
marriage matching problem, Xs proposing to Y s. . . . . . . . . . . . . . . 156
15.1 Plot of the inverse tangent function from -20 to 20. . . . . . . . . . . . . . 221
15.2 Example of z ∗ changes in a linear program as a result of changes in an
objective function coefficient. . . . . . . . . . . . . . . . . . . . . . . . . . 222
xx List of Figures
18.1 Philadelphia City Council districts after the 2000 census. . . . . . . . . . . 256
18.2 Philadelphia’s 66 wards. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
18.3 Districting problem formulated as a p-median problem. . . . . . . . . . . . 260
18.4 A districting plan for Philadelphia using the p-median model of Figure 18.3. 261
18.5 GAMS listing for p-median model of Figure 18.3. . . . . . . . . . . . . . . 262
18.6 GAMS listing for p-median model of Figure 18.3 with population constraints
of expression (18.6) added. . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
18.7 A districting plan for Philadelphia using the p-median model with popula-
tion constraints of ±5%. District 1 is not contiguous. . . . . . . . . . . . . 264
18.8 Districting problem formulated as a p-median problem with population and
contiguity constraints, and known centers. After [144, page 1059]. . . . . . 265
18.9 GAMS code for districting with population and contiguity constraints and
known centers, part 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
18.10 GAMS code for districting with population and contiguity constraints and
known centers, part 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
18.11 A districting plan for Philadelphia that is population balanced (±5%), con-
tiguous, and minimizes distance to district centers. Districts are labeled
with the IDs of their ward centers. . . . . . . . . . . . . . . . . . . . . . . 270
18.12 Districting plan discovered by the evolutionary algorithm. Team Fred’s fa-
vorite because of neighborhood integrity. . . . . . . . . . . . . . . . . . . . 273
18.13 Districting plan discovered by the evolutionary algorithm. Strong on pro-
tecting neighborhoods in the “river wards” (in districts 1, 5, and 9 in the
map). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
5.1 FoIs: Interesting feasible decisions found by EC for the GAP 1-c5-15-1 prob-
lem. Decision 0 is optimal. . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.2 IoIs: Interesting infeasible decisions found by EC for the GAP 1-c5-15-1
problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
9.1 The 43 best single locations for minimizing total distance. . . . . . . . . . 131
9.2 The 43 best single locations for minimizing total weighted distance. . . . . 133
9.3 Thirty-three tracts are within 5% of the best tract on unweighted distances
and on weighted distances. . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.4 Top 43 single locations having the first 9 of 10 tracts in common, using the
naı̈ve greedy algorithm with weighted distances. . . . . . . . . . . . . . . . 137
9.5 Locating 10 service centers: Greedy hill climbing results from 43 runs, sorted
by total population-weighted distance. . . . . . . . . . . . . . . . . . . . . 141
9.6 Locating 10 service centers: Top 43 greedy hill climbing results from 500 runs
and a sampled neighborhood of 200, sorted by total population-weighted
distance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
9.7 Locating 10 service centers: Results from 50 runs, sorted by total
population-weighted distance, using a threshold accepting algorithm. . . . 143
9.8 Locating 10 service centers: Top 43 results from a trial of 120 runs of a
simulated annealing heuristic, sorted by total population-weighted distance. 144
11.1 Objective function values of the best 12 decisions found in a trial of greedy
hill climbing on the GAP 1-c5-15-1 model with numTries = 1 and numRuns
= 200000, and 630,744 decision evaluations. . . . . . . . . . . . . . . . . . 169
11.2 Objective function values of the best 12 decisions found in a trial of greedy
hill climbing on the GAP 1-c5-15-1 model with numTries = 10 and numRuns
= 10000, and 644,082 decision evaluations. . . . . . . . . . . . . . . . . . . 170
11.3 Objective function values of the best 12 decisions found in a trial of greedy
hill climbing on the GAP 1-c5-15-1 model with numTries = 50 and numRuns
= 2000, and 517,550 decision evaluations. . . . . . . . . . . . . . . . . . . 170
11.4 Results from a trial of 50 runs simulated annealing on the Philadelphia
districting problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.5 Results from a trial of 50 runs of simulated annealing on the GAP 1-c5-15-1
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
11.6 Results from a trial of 50 runs of a threshold accepting algorithm on the
GAP 1-c5-15-1 model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
xxi
xxii List of Tables
14.1 Top 50 type A decisions found in a run of the GAP model with FI2-Pop
GA. SSLKs: sum of slack values. SLKs: the slack values. i = item/decision
number. Obj = objective value of the decision. SSLKs = sum of the slack
values. SLKs = slack values for the constraints. Decisions = associated
settings of the decision variables. . . . . . . . . . . . . . . . . . . . . . . . 208
14.2 Top 50 type B decisions found in a run of the GAP model with FI2-Pop
GA, having an objective value 300 or more. i = item/decision number. Obj
= objective value of the decision. SSLKs = sum of the slack values. SLKs
= slack values for the constraints. Decisions = associated settings of the
decision variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
14.3 Top 50 type C decisions found in a run of the GAP model with FI2-Pop GA.
i = item/decision number. Obj = objective value of the decision. SNSLKs
= sum of the negative (infeasible) slack values. SLKs = slack values for the
constraints. Decisions = associated settings of the decision variables. . . . 210
14.4 Top 50 type D decisions found in a run of the GAP model with FI2-Pop
GA, having sum of negative slacks ≥ −20. i = item/decision number. Obj =
objective value of the decision. SNSLKs = sum of the negative (infeasible)
slack values. SLKs = slack values for the constraints. Decisions = associated
settings of the decision variables. . . . . . . . . . . . . . . . . . . . . . . . 211
15.1 Parameter sweep results for the Converse formula. Rows: Pa values.
Columns: Pb values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
17.1 Objective function values (z), constraint left-hand-side values (X(j), DEA
BCC efficiency values (E), and slack values (S(j)) for fifty feasible decisions
of interest (numbering 1–50) for a GAP with five agents. . . . . . . . . . . 251
17.2 Objective function values (z), constraint left-hand-side values (X(j), DEA
BCC efficiency values (E), and slack values (S(j)) for fifty feasible decisions
of interest (numbering 51–100) for a GAP with five agents. . . . . . . . . . 252
Part I
Starters
1
This page intentionally left blank
Chapter 1
Introduction
1. Encoding.
How to express an algorithm or procedure in the computational environment of our
choice and get it to run and produce useful results?
Important modern computational environments include Excel (and other spreadsheet
programs), MATLAB, Mathematica, Python, Ruby, PHP, JavaScript, R, SAS, SPSS,
NetLogo, and much else, including core programming languages such as C, C++, and
Java.
We shall discuss coding in this book, but for the most part relegate it to later chapters,
aimed at more advanced users, or at least users who would be more advanced and so
3
4 Business Analytics for Decision Making
wish to study programming for business analytics. Python and MATLAB will serve
as our focal core programming languages, although we shall have many occasions to
mention others. We will advert to Excel, GAMS, NetLogo, and OPL as needed, to the
extent helpful in the context.
2. Solution design.
Solution design is an issue that arises prior to encoding. Given a problem, how should
it be represented so that it can be coded in one’s computational environment of
choice (Python, MATLAB, Excel, etc.)? Alternatively put: How should we design the
representations—the data structures and algorithms—for solving the problem?
Solution design, including representation, efforts occur at a level of abstraction re-
moved from encoding efforts. The two are not entirely independent, since felicitous
choice of computational environment aids and abets representation enormously. Still,
good designs will ideally be accessible to, and useful for, all parties in an analytics
effort, programmers and non-programmers alike.
3. Analytics: Post-solution analysis.
Given a solution design, its encoding into a model, and successful execution of the
model (or of a solver for the model), we arrive at a solution to the model. Now the
real work begins! We need to undertake post-solution anlaysis (also known as model
analysis) in order to validate the model, test and understand its performance, obtain
information perhaps relevant to reconsidering the assumptions of the model, and so
on.
Further, how can we use data, text, and models to solve business problems (broadly
conceived)? Or given a business problem, how shall we approach its solution with
data, text, and models (assuming it is amenable to such)? In short, how can and how
should we go about “thinking with data and models” in the context of real problems?
Post-solution analysis (or model analysis) refers, then, to that variety of activities we
undertake after a model has been designed, formulated, and solved. These activities
may be directed primarily at testing and validating a model or at supporting delibera-
tive decision making with the model, although in fact it is often pointless to maintain
a distinction between the two goals. In any event, the activities characteristic of post-
solution analysis are typically useful and used both for validating a model and for
deliberating with it regarding what actions to take.
The main focus of this book is on how to address these questions with specifics, to
provide recommended and usable techniques, along with realistic examples. We emphasize
using implementations that let us exercise and explore models and data for the sake of
discovery, understanding, and decision making. To this end we dwell at length throughout
on post-solution analysis. Its techniques and methods are not only accessible to all who
would engage in thinking with models, they are also essential to actual deployment and
decision making with models.
These levels are interdependent. Coding depends on representation and representation
depends on what the business problems are. To do analytics we need to specify the problem
rigorously, meaning we need to find implementable representations and we need to code them
up. And then, we emphasize, the real work begins: We need to use our implementations to
explore and discover what our models and data can tell us. To really do analytics we need
the results of representation and encoding, but that is just the beginning.
The book is about all three levels. In the beginning, however, and for much of the book
we emphasize analytics and de-emphasize representation and encoding, so that readers will
Introduction 5
3. Solution design.
A good slogan in this context: “Algorithms + Data Structures = Programs” [158].
(a) Data structures
(b) Algorithms
have much to sink their teeth into without raising the hackles of those adverse to computer
programming.
Let us not forget the larger context of business analytics. We can embed our levels in
the larger context of a computational problem solving cycle. See Figure 1.1, which we might
frame as an updating or complement for the twenty-first century and new technology of
Peirce’s “The Fixation of Belief.”1
The last three steps in the cycle correspond to, or encompass, our three levels. The first
step—problem recognition—gets the ball rolling. Throughout, we aim to provide realistic
information on representative problem situations. This will serve to motivate the analyses
and to expose the reader to real world considerations.
The second step in the cycle framework is to find a solution concept for the recognized
problem. By this we mean a general approach that is likely to be successful in addressing an
identified aspect of the problem. Should we build a model or take a data-driven approach?
If we build a model, what sort of model should we build?
Finally, to turn the list into a cycle, we emphasize that every step taken, every decision
made, is provisional. Computational problem solving efforts iterate through the list and
revisit steps until deliberation is—provisionally—abandoned in favor of action.
1 The article [130] addresses decision making from a practical, pragmatic perspective. Indeed, it is one of
the founding documents of philosophical Pragmatism. It was written for a popular audience and remains
well worth reading today.
6 Business Analytics for Decision Making
Enough abstraction for the present! We now turn to an example that touches upon and
illustrates these principles.