Robust Linear Optimization With Recourse: Aur Elie Thiele, Tara Terry, and Marina Epelman March, 2010

Robust Linear Optimization With Recourse
Aurélie Thiele∗, Tara Terry†, and Marina Epelman‡
March, 2010
Abstract
We propose an approach to two-stage linear optimization with recourse that does not in-
volve a probabilistic description of the uncertainty and allows the decision-maker to adjust the
degree of conservativeness of the model, while preserving its linear properties. We model un-
certain parameters as belonging to a polyhedral uncertainty set and minimize the sum of the
first-stage costs and the worst-case second-stage costs over that set, i.e., taking a robust opti-
mization approach. The decision-maker’s conservatism is taken into account through a budget
of uncertainty, which determines the size of the uncertainty set around the nominal values of the
uncertain parameters. We establish that the robust problem is a linear programming problem
with a potentially very large number of constraints, and describe how a cutting plane algorithm
can be used in the the two-stage setting. We test the robust modeling approach on two ex-
amples of problems, one with simple, and one with general recourse, to illustrate the structure
and performance of robust policies as well as to evaluate the performance of the cutting plane
algorithm.
1 Introduction
Linear optimization with recourse was first introduced by Dantzig in [17] as a mathematical frame-
work for sequential decision-making under uncertainty. In that setting, the decision-maker must
make some decisions before discovering the actual value taken by the uncertain parameters, but
has the opportunity to take further action once uncertainty has been revealed, with the objective
of minimizing total expected cost. This framework later became known as stochastic programming
and is described in detail in the monographs by Birge and Louveaux [13] and Kall and Wallace [25].
However, as early as the mid-1960s, researchers such as Dupačová, née Žáčková, [37] recognized
the practical limitations of the expected-value paradigm, which requires the exact knowledge of the
underlying probability distributions. The fact that such probabilities are very hard to estimate in
practice motivated the development of a minimax approach, where the decision-maker minimizes
the maximal expected cost over a family of probability distributions. It has received significant
attention in the stochastic programming literature, for instance from Dupačová, [18, 19, 20, 37],
∗
Department of Industrial and Systems Engineering, Lehigh University, aurelie.thiele@lehigh.edu. Supported in
part by NSF Grant DMI-0540143, a P.C. Rossin Assistant Professorship and an IBM Faculty Award.
†
Industrial and Operations Engineering Department, University of Michigan, tlterry@umich.edu. Supported in
part by NSF Grant CCF-0306240 and Elizabeth C. Crosby Research Award.
‡
Industrial and Operations Engineering Department, University of Michigan, mepelman@umich.edu. Supported
in part by NSF Grant CCF-0306240 and Elizabeth C. Crosby Research Award.
1
whose work laid the foundation for subsequent research efforts. Other early references include
Jagannathan [24], who studied stochastic linear programming with simple recourse when the first
two moments of the distributions are known, and Birge and Wets [14], who focused on bounding
and approximating stochastic problems. More recently, Shapiro and Ahmed [31] and Shapiro and
Kleywegt [32] have investigated further the theoretical properties of minimax stochastic optimiza-
tion, while Takriti and Ahmed described in [35] an application to electricity contracts. The main
drawback of the stochastic minimax approach is that the solution methods proposed in the lit-
erature (a stochastic gradient technique in Ermoliev et al. [23], a bundle method in Breton and
El Hachem [15], a cutting-plane algorithm in Riis and Andersen [29], to name a few), all require
finding explicitly the worst-case probability distribution for the current candidate solution at each
step of the algorithm, and hence suffer from dimensionality problems. These are particularly acute
here as stochastic programming often yields large-scale formulations. Although Shapiro and Ahmed
[31] and Shapiro and Kleywegt [32] have studied specific classes of problems for which the minimax
framework leads to traditional stochastic formulations, no such approach has been developed to
date for the general case. Furthermore, while the field of stochastic programming has seen in recent
years a number of algorithmic advances, e.g., sampling methods (Shapiro [30]), the problem with
recourse remains significantly more difficult to solve than its deterministic linear counterpart, and
does not allow for easy insights into the impact of uncertainty on the optimal values of the decision
variables.
Therefore, the need arises to develop an approach to linear optimization with recourse that does
not rely on a probabilistic description of the uncertainty, yields insights into the way uncertainty
affects the optimal solution, and remains solvable in practice, even if not tractable in the theoretical
sense. The purpose of this paper is to present such an approach based on robust optimization.
While robust optimization has been previously used in stochastic programming as a method to
incorporate cost variability in the objective function (Takriti and Ahmed [36]), we consider here a
different methodology, which was developed independently under the same name. What we refer
to as robust optimization addresses data uncertainty in mathematical programming problems by
finding the optimal solution for the worst-case instances of unknown but bounded parameters.
Traditionally, the term “robust optimization” refers to an approach to dealing with parameter
uncertainty in one-stage optimization problems (i.e., problems without recourse). This approach
was pioneered in 1973 by Soyster [33], who proposed a model that guarantees feasibility for all
instances of the parameters within a convex set. However, the geometry of this uncertainty set
he considered to assure tractability of the resulting model lead to very conservative solutions, in
the sense that they are too far from optimality in the nominal model to be of practical interest
for real-life implementation. This issue of over-conservatism hindered the adoption of robust tech-
niques in optimization problems until the mid-1990s, when Ben-Tal and Nemirovski [3, 4, 5], El
Ghaoui and Lebret [21] and El Ghaoui et al. [22] started investigating tractable robust counterparts
of linear, semidefinite and other important convex optimization problems. They focused on ellip-
soidal uncertainty sets, which allow for important insights into the robust framework but increase
the complexity of the problem considered, e.g., yield second-order cone problems as the robust
counterpart of linear models. In contrast, Bertsimas and Sim [8] study polyhedral uncertainty sets,
which do not change the class of the problem at hand, and explicitly quantify the trade-off between
performance and conservatism in terms of probabilistic bounds of constraint violation. An advan-
tage of their approach is that it can be easily extended to integer and mixed-integer programming
problems (Bertsimas and Sim [7]). While robust optimization has been applied in the references
above as a way to address parameter uncertainty, Bertsimas and Thiele [9] use this framework to
2
model random variables and address uncertainty on the underlying distributions in a multi-period
inventory problem. Their approach highlights the potential of robust optimization for dynamic
decision-making in the presence of randomness.
A first step towards applying robust techniques in stochastic programming with recourse is taken by
Ben-Tal et al. [2], who coin the terms “adjustable decision variables” as a synonym for second-stage,
or recourse, decisions, and “adjustable robust optimization” for robust counterparts of problems
with recourse. Unfortunately, the resulting adjustable robust optimization problems are frequently
computationally intractable, which leads them to restrict the recourse variables to affine functions
of the uncertain data. Most of the following literature imposes some similar type of restriction
on recourse variables to obtain a theoretically tractable optimization problem that approximates
the adjustable robust counterpart (e.g., “deflected-linear” functions considered by Chen and Sim
in [16] and references therein), or identifies special cases when the robust adjustable model at
hand is tractable. Takeda et al. [34] discuss conditions for tractability of convex robust problems
with recourse; Ordóñez and Zhao [28], for example, consider a tractable special case of network
design under uncertain demand. Atamtürk and Zhang in [1] also propose a robust adjustable
model for two-stage network flow and design under uncertain demand, but do not invoke affinely
adjustable decision variables. Similarly, in this paper we do not impose any limitations on the
structure of the functional dependence of the recourse on the uncertain parameters. The model
presented here is broader in scope than the network design problem considered in [1], in that we
develop a robust approach for general two-stage stochastic linear problems with uncertainty on the
right-hand side. We believe that the framework proposed in this paper offers a new perspective on
linear programming with recourse that combines the decision-maker’s degree of conservatism and
the uncertainty on the probability distributions.
The algorithm proposed here can be seen as a variation of Benders’ decomposition, or delayed
constraint generation (see Section 3 for details). Previous extensions of decomposition to robust
optimization problems include work by Bienstock and Özbay [12] and Bienstock [11]. In these
papers the authors consider specific applications of single-stage, non-adjustable, robust framework
(determining a robust basestock level under uncertain demands, and robust portfolio optimization
with uncertain returns, respectively), and apply a generic decomposition algorithm that alternates
between solving a restricted master problem that includes a limited subset of possible data realiza-
tions to determine a solution, and an adversarial problem which finds, exactly or approximately,
the worst-case data realization for the solution found. The newly identified data realization is then
added to the restricted master problem, and the process is repeated. The papers focus on the
specifics of solving the master and adversarial problems for the applications and uncertainty sets
considered. While working on this manuscript, we became aware of the recent work by Mutapcic
and Boyd [27], which applies the same paradigm and its variations to robust counterparts of single-
stage convex problems and presents several examples. In contrast to the ordinary one-stage robust
problems considered in the above papers, in this paper we deal with robust problems with recourse,
i.e., two-stage problems, which present additional challenges (specifically in solving the adversarial
problem). We discuss exact solution methods of adversarial problems arising in problems with
recourse under right-hand-side uncertainty in problems with simple as well as general recourse.
Specifically, we make the following contributions:
1. We address right-hand-side uncertainty in linear programming problems with recourse by

considering an adjustable (two-stage) robust formulation with a polyhedral uncertainty set.
The level of conservatism of the optimal solution is flexibly adjusted by setting a parameter
3
called the “budget of uncertainty” to an appropriate value.
2. We propose a cutting-plane method, based on Kelley’s method, for solving adjustable robust
linear programs of this type. This method is similar to, but less computationally demanding
than, Benders’ decomposition algorithm for their stochastic counterparts.
3. We provide techniques for finding worst-case realizations of the uncertain parameters within
the polyhedral uncertainty set for problems with simple and general recourse, and demonstrate
their performance via computational experiments.
4. Finally, using the above methods we obtain solutions to robust formulations of two exam-
ple problems, which yield insight into structure and performance of such solutions under
uncertainty.
The organization of the paper is as follows. In Section 2, we define the model of uncertainty and
present the main ideas underlying the robust approach. Section 3 presents a cutting-plane algorithm
based on Kelley’s method for solving robust linear programs with recourse, along with a comparison
with the Benders’ decomposition algorithm for solving stochastic programming problems. Sections
4 and 5 discuss how one can compute objective function values and subgradients in the implemen-
tation of our algorithm for problems with simple and general recourse, respectively, focusing on
solving the adversarial problem. Sections 6 and 7 present results of computational experiments on
problems with these two types of recourse. Finally, Section 8 contains some concluding remarks.
2 Problem Overview
2.1 Optimization with Recourse
The focus of this paper is on two-stage linear optimization with right-hand side uncertainty, which
was first described by Dantzig in [17]. The deterministic problem can be formulated as:
min cT x + dT y
s.t. Ax + By = b (1)
x ∈ S, y ≥ 0
with the following notation:

x : the first-stage decision variables,
y : the second-stage decision variables,
c : the first-stage costs,
d : the second-stage costs,
A : the first-stage coefficient matrix,
B : the second-stage coefficient matrix,
b : the requirement vector.
Here S is a polyhedron described by a finite list of inequalities.
In many applications, the requirement vector is uncertain and the decision-maker implements the
first-stage (“here-and-now”) variables without knowing the actual requirements, but chooses the
second-stage (“wait-and-see”) variables only after the uncertainty has been revealed. This has
traditionally been modeled using stochastic programming techniques, i.e., by assuming that the
4
requirements obey a known probability distribution and minimizing the expected cost of the prob-
lem. In mathematical terms, if we define the recourse function, once the first-stage decisions have
been implemented and the realization of the uncertainty is known, as:
Q(x, b) = min dT y
s.t. By = b − Ax (2)
y ≥ 0,
the stochastic counterpart of problem (1) can be formulated as a nonlinear problem:
min cT x + Eb [Q(x, b)]

(3)
s.t. x ∈ S.
If the uncertainty is discrete, consisting of Ω possible requirement vectors each occurring with
probability πω , ω = 1, . . . , Ω, problem (3) can be written as a linear programming problem:
Ω
X
min cT x + π ω · dT y ω
ω=1 (4)
s.t. Ax + Byω = bω , ω = 1, . . . , Ω
x ∈ S, yω ≥ 0, ω = 1, . . . , Ω.
However, a realistic description of the uncertainty generally requires a high number of scenarios.
Therefore, the deterministic equivalent (4) is often a large-scale problem, which necessitates the
use of special-structure algorithms such as decomposition methods or Monte-Carlo simulations (see
Birge and Louveaux [13] and Kall and Wallace [25] for an introduction to these techniques). Prob-
lem (4) can thus be considerably harder to solve than problem (1), although both are linear. The
difficulty in estimating probability distributions accurately also hinders the practical implementa-
tion of these techniques.
2.2 The Adjustable Robust Approach
In contrast with the stochastic programming framework, robust optimization models uncertain
parameters using uncertainty sets rather than probability distributions. The objective is then to
minimize the worst-case cost in that set. Specifically, let B be the uncertainty set of the requirement
vector. The robust problem with recourse, or adjustable robust problem, is formulated as:
min cT x + max Q(x, b)

b∈B (5)
s.t. x ∈ S.
If B = {b̄}, where b̄ is a known mean/nominal value of the requirement vector, problem (5) is
the “nominal” problem. As B expands around b̄, the decision-maker protects the system against
more realizations of the uncertain parameters, and the solution becomes more robust, but also
more conservative. If the decision-maker does not take uncertainty into account, he might incur
very large costs once the uncertainty has been revealed. On the other hand, if he includes every
possible outcome in his model, he will protect the system against realizations that would indeed
be detrimental to his profit, but are also very unlikely to happen. The question of choosing uncer-
tainty sets that yield a good trade-off between performance and conservatism is central to robust
optimization.
5
Following the approach developed by Bertsimas and Sim [7, 8] and Bertsimas and Thiele [9], we
focus on polyhedral uncertainty sets and model the uncertain parameter bi , i = 1, . . . , m, as a
parameter of known mean, or nominal value, b̄i and belonging to the interval [b̄i − bbi , b̄i + bbi ].
Equivalently,
bi = b̄i + bbi zi , |zi | ≤ 1, ∀i.
To avoid overprotecting the system, we impose the constraint
m
X
|zi | ≤ Γ,
i=1
which bounds the total scaled deviation of the parameters from their nominal values. Such a
constraint was first proposed by Bertsimas and Sim [8] in the context of linear programming with
uncertain coefficients. The parameter Γ, which we assume to be integer, is called the budget of
uncertainty. Γ = 0 yields the nominal problem and, hence, does not incorporate uncertainty at
all, while Γ = m corresponds to interval-based uncertainty sets and leads to the most conservative
case. In summary, we will consider the following uncertainty set:
n o
B = b : bi = b̄i + bbi zi , i = 1, . . . , m, z ∈ Z , (6)
with
m
( )
X
Z= z: |zi | ≤ Γ, |zi | ≤ 1, i = 1, . . . , m . (7)
i=1
In the remainder of the paper, we investigate how problem (5) can be solved for the polyhedral
set defined in equations (6)—(7), and how the robust approach can help us gain insights into the
impact of the uncertainty on the optimal solution.
2.3 Assumptions and Preliminaries
Consider the problem (5) with Q(x, b) defined by (2). We assume that Q(x, b) > −∞ for all
x ∈ S and b ∈ B (i.e., problem (5) has sufficiently expensive recourse in stochastic programming
parlance), and thus the dual problem of (2) is feasible. By strong duality, we can write:
Q(x, b) = max (b − Ax)T p

p (8)
s.t. BT p ≤ d.
Thus, problem (5) is equivalent to:

min cT x + max T
(b − Ax) p . (9)
x∈S b∈B, p:BT p≤d
Proposition 2.1. For any b ∈ B, Q(x, b) is a convex function in x. For any x ∈ S, Q(x, b) is a
convex function in b.
Proof. The proof of the first claim is familiar from the stochastic programming literature: let b ∈ B
be fixed, let x1 , x2 ∈ S, and λ ∈ [0, 1]. If, for either of the values x1 or x2 , problem (2) (the primal
recourse problem) is infeasible, the relationship
Q(λx1 + (1 − λ)x2 , b) ≤ λQ(x1 , b) + (1 − λ)Q(x2 , b)
6
holds trivially. Otherwise, Q(xi , b) is finite (due to the sufficiently expensive recourse assumption),
with the optimal value of problem (2) attained at yi , for i = 1, 2. Then y = λy1 + (1 − λ)y2 is
feasible for problem (2) with x = λx1 + (1 − λ)x2 , and so
Q(λx1 + (1 − λ)x2 , b) ≤ dT (λy1 + (1 − λ)y2 ) = λdT y1 + (1 − λ)dT y2 = λQ(x1 , b) + (1 − λ)Q(x2 , b),
proving the first claim.

The second claim can be easily established by analyzing the dual recourse problem (8).
Throughout the rest of the paper, we make the following assumption for ease of presentation.
Assumption 2.2. Problem (5) possesses relatively complete recourse, i.e., problem (2) is feasible
for all x ∈ S and b ∈ B.
Together with the earlier assumption on feasibility of (8), Assumption 2.2 implies that Q(x, b) is
finite for all x ∈ S and b ∈ B.
3 A Cutting-Plane Approach: Kelley’s Algorithm
In this section, we describe a cutting-plane algorithm for solving problem (5) based on Kelley’s
algorithm, originally proposed in [26]. Kelley’s algorithm is designed to minimize a linear objective
function over a compact convex feasible region that is complex (possibly described by an infinite
number of constraints) or given only by a separation oracle, i.e., a subroutine that, given a point
in the variable space, either correctly asserts that the point is feasible or returns the normal vector
and intercept of some hyperplane that strictly separates the point from the feasible region. At
each iteration, the algorithm maintains a polyhedral outer approximation of the feasible region; the
objective function is minimized over the approximation, and if the resultant solution is infeasible,
adds a linear inequality (cut) derived from the separating hyperplane to the description of the
polyhedral approximation of the feasible region, thus improving the approximation. For problems
with feasible regions described by a (possibly infinite) family of differentiable convex inequality
constraints, cuts can be generated using gradients of the violated constraints; subgradients can
be used for convex non-differentiable functions. Recall that vector g is a subgradient of a convex
function f (x) at x̃ (denoted g ∈ ∂f (x̃)) if the following subgradient inequality is satisfied:
f (x) ≥ f (x̃) + gT (x − x̃) ∀x.
Problem (5) has a simple feasible region S, but the objective function cT x + maxb∈B Q(x, b) is
complex. Let us define
Q(x) = max Q(x, b) = max (b − Ax)T p. (10)

b∈B b∈B, p:BT p≤d
(Following [12], we can view the problem maxb∈B Q(x, b) as the adversarial problem, since its
goal is to find the worst-case requirement vector for the first-stage decision x.) Then (5) can be
re-written as
min cT x + α
x,α
s.t. α ≥ Q(x) (11)

x ∈ S.
7
In light of the first claim of Proposition 2.1, the function Q(x) is convex, making problem (11)
suitable for application of Kelley’s algorithm. In our implementation of the algorithm, we maintain
a piecewise linear lower approximation of the function Q(x). The approximation is improved by
adding cuts derived using values and subgradients of this function. In addition, we will maintain a
lower bound and upper bound on the optimal objective function value; let these be denoted L and
U , respectively.
Computing the value of function Q(x) = maxb∈B Q(x, b) is not an easy task, in general: since
Q(x, b) is convex in b for any x, it requires maximization of a convex function. We postpone the
discussion of this issue until Sections 4 and 5, for simple and general recourse cases, respectively.
However, once the “worst-case” right-hand side for a given value of the first-stage decision is found,
the subgradient of (10) can be found easily.
Lemma 3.1. Let b̃ ∈ arg maxb∈B Q(x̃, b). Furthermore, let p̃ be an optimal solution of (8) with
(x, b) = (x̃, b̃). Then −AT p̃ ∈ ∂Q(x̃).
Proof. For an arbitrary x,
Q(x) ≥ Q(x, b̃) ≥ (b̃ − Ax)T p̃

= Q(x̃, b̃) + (−AT p̃)T (x − x̃)
= Q(x̃) + (−AT p̃)T (x − x̃),
proving the claim.
We refer to the vector p̃ as in Lemma 3.1 as the dual recourse vector (for x̃).
We can now specify Kelley’s algorithm for problem (11):
Algorithm 3.2 (Kelley’s Algorithm for Robust Linear Program with Recourse).
Initialization Let Q0 (x) be the initial piecewise linear lower approximation of Q(x). Set L = −∞
and U = ∞; t = 0.
Iteration t Given L, U, and Qt (x),
Step 1 Solve min{cT x + α : x ∈ S, α ≥ Qt (x)}:

x,α
min cT x + α
x,α
s.t. α + pTl Ax ≥ bTl pl , l = 1, . . . , t − 1 (12)

α ≥ Q0 (x)
x ∈ S.
Let (xt , αt ) be an optimal solution and let L = cT xt + αt = cT xt + Qt (xt ).

Step 2 Compute Q(xt ), let bt and pt be the corresponding worst-case right-hand-side and
dual recourse vector, respectively. Let U = min{U, cT xt + Q(xt )}. If U − L is sufficiently
small, stop and return xt as an approximate solution.
Step 3 Define
Qt+1 (x) = max{Qt (x), pTt (bt − Ax)}.
8
Step 4 Set t ← t + 1.
Observe that, for a given x, the function max(b − Ax)T p is convex in p, and therefore, under
b∈B
Assumption 2.2, problem (11) can be re-written as the following master problem:
min cT x + α
x,α
s.t. α ≥ max(b − Ax)T pk , k = 1, . . . , K (13)
b∈B
x ∈ S,
where pk , k = 1, . . . , K are the extreme points of {p : BT p ≤ d}. Algorithm 3.2 can be seen as
a variant of Bender’s decomposition for problem (13), with relaxed master problem (12). Bender’s
decomposition is typically presented for the stochastic programming problem (4) [6]. (Bertsimas
and Tsitsiklis provide an introduction to these techniques in [10]; the reader is also referred to Birge
and Louveaux [13] and Kall and Wallace [25] for an extensive treatment of these methods in the
context of stochastic optimization.) The corresponding master problem can be written as:
Ω
X
min cT x + π ω Zω
ω=1 (14)
s.t. Zω ≥ pTk (bω − Ax) ∀k, ω
x ≥ 0,
where ω = 1, . . . , Ω are the scenarios. Here, at each iteration a relaxed master problem is solved
to obtain a first-stage solution x̃ and the corresponding value of the recourse function Z eω when
scenario ω is realized. To then check if this solution is optimal for the full master problem or to
apply a cut to the expected recourse function Ω
P
ω=1 ω Q(x, bω ), one needs to solve the recourse
π
problem (8) for each scenario ω = 1, . . . , Ω. While these problems are similar to each other and each
can be solved efficiently by applying, for instance, the dual simplex method, the large number of
subproblems is a drawback in accurately solving the stochastic programming counterpart of problem
(1) in many real-life settings. In contrast, Algorithm 3.2, which applies a similar technique to the
adjustable robust counterpart of problem (1), involves solving only one subproblem per iteration
— a computationally simpler task so long the relevant subproblem can be easily identified. The
methods for identifying the subproblem are discussed in Sections 4 and 5.
We note that if Assumption 2.2 does not hold, the algorithm can be modified by inclusion of
feasbility cuts, and can be viewed as a variant of delayed constraint generation for solving the
following modification of (13):
min cT x + α
x,α
s.t. α ≥ max(b − Ax)T pk , k = 1, . . . , K
b∈B
0 ≥ max(b − Ax)T dl , l = 1, . . . , L
b∈B
x ∈ S,
where pk , k = 1, . . . , K are the extreme points, and dl , l = 1, . . . , L are the extreme rays of
{p : BT p ≤ d}.
To obtain further perspective on Algorithm 3.2, recall that Q(x, b) is convex in b. Then, if the
9
uncertainty set B is compact, problem (11) can be re-written as follows:
min cT x + α
x,β
s.t. α ≥ Q(x, be ), be ∈ E(B)
x ∈ S,
where E(B) denotes the set of extreme points of B (or of the convex hull of B). Furthermore, using
Assumption 2.2 again, an equivalent formulation is
min cT x + α
x,β
s.t. α ≥ (be − Ax)T pk , k = 1, . . . , K, be ∈ E(B) (15)
x ∈ S.
At each iteration, Kelley’s algorithm adds one of the constraints of (15) as a cut. If the uncertainty
set B is polyhedral (e.g., if it is specified by equations (6)—(7)), (15) has a finite number of
constraints, thus assuring convergence of Algorithm 3.2 after a finite number of iterations. In the
worst case, Algorithm 3.2 may need to generate cuts corresponding to all non-redundant constraints
of (15) to find an optimal solution (depending on the structure of the problem, there may be a
connection between dual recourse vectors p that can correspond to a particular value of b ∈ E(B)
for all x ∈ S, rendering some constraints of (15) redundant). In practice, however, algorithms of
this type often perform better than this worst-case analysis suggests, finding approximate solutions
after a small number of iterations. Sections 6 and 7 assess the number of iterations needed to solve
instances of problem (15) to optimality for two example problems.
To conclude this section, we present a direct reformulation of the adjustable robust linear program
(5) with B given by (6)—(7) as a large-scale linear programming problem. In the proof, we refer
to the set
m
( )
X
Z 0 = z0 : zi0 ≤ Γ, 0 ≤ zi0 ≤ 1, i = 1, . . . , m . (16)
i=1
Theorem 3.3. If a robust linear program (5) with general recourse and B given by (6)—(7) satisfies
Assumption 2.2, it can be written as
min cT x + α
x,α,µ,λ
m
X
s.t. α ≥ (b̄ − Ax)T pk + λk Γ + µik ∀k
i=1 (17)
λk + µik ≥ bbi |pik | ∀i, k
λk , µik ≥ 0 ∀i, k
x ∈ S,
where p1 , . . . , pK are (all) the extreme points of the set {p : BT p ≤ d}.
Proof. Recall that problem (5) can be written in the form (13).
Given x and p,
m
X
max(b − Ax)T p = (b̄ − Ax)T p + max
0 0
|pi | bbi zi0 ,
b∈B z ∈Z
i=1
10
with Z 0 defined by (16). Since Z 0 is nonempty and bounded, strong duality holds, which yields:
max(b − Ax)T p = (b̄ − Ax)T p+ min λΓ + m
P
i=1 µi
b∈B λ, µ
s.t. λ + µi ≥ bbi |pi |, i = 1, . . . , m (18)
λ, µi ≥ 0, i = 1, . . . , m.
Combining this with formulation (13) yields problem (17).
The above approach to solving the robust problem is only practical if the set of extreme points of
{p : BT p ≤ d} is known and is relatively small in size.
4 Analysis of Robust Linear Programs with Simple Recourse
An important special case of linear programs with recourse are those with simple recourse, where
the decision-maker is able to address excess or shortage for each of the requirements independently.
For instance, he might pay a unit shortage penalty si for falling short of the uncertain target bi or
a unit holding cost hi for exceeding the uncertain target bi , for each i. We describe an application
to multi-item newsvendor problems in Section 6.
The deterministic model with simple recourse can be formulated as:
min cT x + sT y− + hT y+
s.t. Ax + y− − y+ = b
x ∈ S, y− , y+ ≥ 0,
and the recourse function of equation (2) becomes:
Q(x, b) = min sT y− + hT y+
s.t. y− − y+ = b − Ax (19)
y− , y+ ≥ 0.
Note that Assumption 2.2 holds for (19). We will further require that s+h ≥ 0 to ensure sufficiently
expensive recourse. It is straightforward to see that Q(x, b) is available in closed form:
m
X
Q(x, b) = [si · max{0, bi − (Ax)i } + hi · max{0, (Ax)i − bi }] .
i=1
However, we will focus on problem (19) to develop an efficient method for computing Q(x). We
obtain an equivalent characterization of the recourse function by invoking strong duality:
Q(x, b) = max (b − Ax)T p
(20)
s.t. −h ≤ p ≤ s.
Therefore, in this section we will be developing methods to solve:

T T
min c x + max (b − Ax) p , (21)
x∈S b∈B, −h≤p≤s
where B has been defined in equations (6)—(7).

The following theorem provides a simple method for computing Q(x) in problems with simple
recourse.
11
Theorem 4.1. Given x, define for i = 1, . . . , m,
n o
∆i = max (b̄i + bbi − (Ax)i )si , ((Ax)i − b̄i + bbi )hi − max (b̄i − (Ax)i )si , ((Ax)i − b̄i )hi .
Let I be the set of indices corresponding to the Γ largest ∆i ’s (ties can be broken arbitrarily). Then
Q(x) = max Q(x, b) with Q(x, b) given by (20) and B given by (6)—(7) verifies:
b∈B
X n o
Q(x) = max (b̄i + bbi − (Ax)i )si , ((Ax)i − b̄i + bbi )hi
i∈I
X
+ max (b̄i − (Ax)i )si , ((Ax)i − b̄i )hi . (22)
i∈I
/
Proof. We note that for any first-stage decision vector x:
Q(x) = max max (b − Ax)T p,

b∈B −h≤p≤s
Xm
= max max {(bi − (Ax)i )si , ((Ax)i − bi )hi } (23)
b∈B
i=1
m
X m
X
∆i zi0 ,

= max (b̄i − (Ax)i )si , ((Ax)i − b̄i )hi + max
0 0 z ∈Z
i=1 i=1
where Z 0 is defined in (16). The last equality is obtained by observing that the expression in (23)
is convex in b, and thus the worst-case value of b that attains the maximum can be found at an
extreme point of B. The extreme points of B can be enumerated by letting Γ components of b
deviate up or down (to their highest or lowest values), while keeping the remaining components
at their nominal values. Whether the worst case is reached when bi deviates up or P down (to its
highest orPlowest value) is captured by the value of ∆i . It then follows that maxz0 ∈Z 0 m 0
i=1 ∆i zi is
equal to i∈I ∆i , where I is as specified in the theorem.
Corollary 4.2. Given x, corresponding worst-case demand b can be determined as follows: for
i = 1, . . . , m

b̄i + b̂i if i ∈ I and (b̄i + b̂i − (Ax)i )si ≥ ((Ax)i − b̄i + b̂i )hi

bi = b̄i − b̂i if i ∈ I and (b̄i + b̂i − (Axt )i )si < ((Ax)i − b̄i + b̂i )hi (24)

b̄i if i 6∈ I.

Corresponding dual recourse vector p can be determined as follows: for i = 1, . . . , m




 si if i∈I and (b̄i + b̂i − (Ax)i )si ≥ ((Ax)i − b̄i + b̂i )hi

−h
i if i∈I and (b̄i + b̂i − (Ax)i )si < ((Ax)i − b̄i + b̂i )hi
pi = (25)


 si if i 6∈ I and (b̄i − (Ax)i )si ≥ (Ax)i − b̄i )hi

−h
i if i 6∈ I and (b̄i − (Ax)i )si < ((Ax)i − b̄i )hi .
The subgradient of Q(x) can now be computed as in Lemma 3.1, allowing the implementation of
Kelley’s algorithm 3.2.
12
5 Analysis of Robust Linear Programs with General Recourse
In this section we return to the analysis of the robust linear program with general recourse function
(2). Without any assumptions on the structure of recourse matrix B, evaluation of Q(x) of (10)
can no longer be done in closed form, as was the case with simple recourse. We present a general
approach for computing the value of Q(x), along with corresponding worst-case value of b and dual
recourse variable p, via a mixed integer programming problem, under the assumption of relatively
complete recourse.
Theorem 5.1. Suppose Assumption 2.2 holds for problems (5). Given x, Q(x) = max Q(x, b)
b∈B
with Q(x, b) given by (8) and B given by (6)—(7) can be computed as
Q(x) = max (b̄ − Ax)T (p+ − p− ) + b

b T (q+ + q− )
p+ ,p− ,q+ ,q− ,r+ ,r−
s.t. BT (p+ − p− ) ≤ d
0 ≤ q + ≤ p+
0 ≤ q − ≤ p−
q+ ≤ M r + (26)
q− ≤ M r −
eT (r+ + r− ) ≤ Γ
r+ + r− ≤ e
r+ , r− ∈ {0, 1}m
p+ , p− ≥ 0,
where e is the vector of all ones and M is a sufficiently large positive number.
Proof. Introducing variables p+ , p− ≥ 0 and recalling definition of B, problem (10) can be rewritten
as
m
" #
X
T + − + −
Q(x) = max (b̄ − Ax) (p − p ) + max bbi (p − p )zi
i i
p+ ,p− z∈Z
i=1 (27)
s.t. BT (p+ − p− ) ≤ d
p+ , p− ≥ 0,
where Z was defined in equation (7). Note that for a given p+ and p− , the inner maximization
problem or (27) is a linear program in variables z, and its optimal solution
Pmcan be found at one of the
extreme points of the set Z, which have the form zi ∈ {−1, 0, 1} ∀i and i=1 |zi | = Γ. Therefore, we
can re-write the inner maximization problem of (27) as the following integer program in variables
r+ , r− :
Xm
max bbi (p+ − p− )(r+ − r− )
i i i i
r+ ,r−
i=1
s.t. eT (r+ + r− ) ≤ Γ
r+ + r− ≤ e
r+ , r− ∈ {0, 1}m .
13
Substituting the above into (27), we obtain the following equivalent formulation:
m
X
Q(x) = max (b̄ − Ax)T (p+ − p− ) + bbi (p+ − p− )(r+ − r− )
i i i i
p± ,r±
i=1
s.t. BT (p+ − p− ) ≤ d
eT (r+ + r− ) ≤ Γ (28)
r+ + r− ≤ e
r+ , r− ∈ {0, 1}m
p+ , p− ≥ 0.
−
Finally, to linearize the objective function of (28), we introduce variables qi+ = p+ +
i ri and qi =
p− − + − − +
i ri ∀i (note that we can assume pi ri = pi ri = 0 ∀i without loss of generality). Making the
substitution in (28) and adding appropriate forcing constraints results in problem (26).
Corollary 5.2. Given x, let (p± , r± , q± ) solve (26). Then corresponding worst-case value of b
can be determined as follows: bi = b̄i + bbi (ri+ − ri− ), i = 1, . . . , m, and corresponding dual recourse
vector p can be determined as p = p+ − p− .
The subgradient of Q(x) can now be computed as in Lemma 3.1.
6 Computational Results: Newsvendor Problem
In this section we test the robust methodology on a multi-item newsvendor problem. The decision-
maker orders perishable items subject to a capacity constraint, faces uncertain demand, and incurs
surplus and shortage costs for each item at the end of the time period. His goal is to minimize
total cost. We use the following notation:
n: the number of items,

ci : the unit ordering cost of item i,
hi : the unit holding cost of item i,
si : the unit shortage cost of item i,
bi : the demand for item i,
A: the purchasing budget.
The deterministic problem can be formulated as:

n
X
min cT x + max {si (bi − xi ), hi (xi − bi )}
i=1
s.t. cT x ≤ A
x≥0
or equivalently, as:
min cT x + sT y− + hT y+
s.t. x + y− − y+ = b,
(29)
cT x ≤ A,
x ≥ 0.
14
Problem (29) is an example of a linear programming problem with simple recourse and therefore
can be analyzed using the techniques described in Sections 3 and 4. We consider a case with
n = 50 items and budget A = 100n, with ordering cost ci = 1, nominal demand b̄i = 8 + 2 i, and
maximum deviation of the demand from its nominal value bbi = 0.5 · b̄i , for each i = 1, . . . , 50. We
consider two different structures for the surplus and shortage penalties, resulting in two instances
of problem (29). In the first instance, items with larger nominal demand (and thus wider demand
variability by the above definitions of b̄ and b̂) have larger surplus and shortage penalties than items
with smaller nominal demand. In the second instance, surplus and shortage penalties follow the
opposite pattern. In particular, the penalties for item i are shown in the table below:
Shortage penalty si Surplus penalty hi

Instance 1 2i i
Instance 2 2(n + 1 − i) (n + 1 − i)
We applied Algorithm 3.2 to problem (29). Since the recourse value in this problem is always
nonnegative, we set Q0 (x) ≡ 0 in the initialization step. Step 2 of the algorithm was carried out
as discussed in Theorem 4.1 and Corollary 4.2. Finally, we terminated the algorithm when L = U ,
solving the robust problem to optimality.
6.1 Analysis of Problem Solutions
To understand the effect of the budget of uncertainty Γ utilized by the decision-maker in the
selection of the uncertainty set B, we solved both instances of the newsvendor problem for values
of Γ ranging from 0 to 50. Figure 1 summarizes our findings.
Figures 1(a) and 1(b) show the worst-case costs of the two instances (i.e., optimal objective value
of (5)) as a function of Γ, which, as expected, increase as the solution becomes more conservative
(the blue curves).
The decision-maker’s choice of the uncertainty set (and in particular b̄i ’s and b̂i ’s) can reflect his
knowledge or assumptions on the possible realizations of demands, including any information on
the probability distribution when the demands are perceived by the decision maker as random
variables. On the other hand, the goal of the robust approach is to be able to provide a solution
that performs reasonably well even if the distribution is not known with any great deal of precision.
To assess the performance of robust solutions, we considered an example in which demands are
indeed random, each following an independent Normal distribution with mean b̄i and standard
deviation 0.4 · b̂i for each i (recall that bbi = 0.5 · b̄i ). We created a sample of 5000 realizations of the
demands. The resulting average costs of robust solutions are depicted in Figure 1 (the red curves
in Figures 1(a) and 1(b) and, on a different scale, in Figures 1(c) and 1(d); the error bars reflect
the sample standard deviations). For both instances of the newsvendor problem, we observe from
Figures 1(c) and 1(d) that the average cost first decreases with Γ, as incorporating a small amount
of uncertainty in the model yields more robust solutions, reaches its minimum, and starts increasing
with Γ as the solution becomes overly conservative for the typical demand realization. In Figure
1(c), the best trade-off is reached at Γ = 5, and the average cost of corresponding robust solution
achieves savings of 2.82% over solution obtained for Γ = 0 (i.e., solution targeted to satisfy the
nominal demand b̄), while Figure 1(d) has an best trade-off at Γ = 11 and savings of 4.09%; both
are consistent with the guidelines provided by Bertsimas and Sim in [8], namely, that the budget
15
√ √
of uncertainty should be of the order of n (here, 50 ≈ 7.1). Note that for the first instance, the
worst case for Γ = 5 corresponds to the situation where the demand for the last 5 items (items 46
to 50) is equal to its highest value and demand for the other items is equal to its nominal value,
which makes sense as the last 5 items have the largest shortage and holding penalties. In the second
instance, the worst-case instance for Γ = 11 consists of demand for 11 products with mid-range
penalties equal to its highest value.
70000
40000
60000 35000
50000 30000
25000
40000
Cost
Cost
20000
30000
15000
20000
10000
10000
Average Cost 5000 Average Cost
Worst-case Cost Worst-case Cost
0 10 20 30 40 50 0 10 20 30 40 50
Budget of Uncertainty Budget of Uncertainty
(a) Instance 1: worst-case cost vs. Γ. (b) Instance 2: worst-case cost vs. Γ.
Average Cost Average Cost
29000 18000
28000
17000
27000
26000 16000
Cost
Cost
25000
15000
24000
23000 14000
22000
13000
0 10 20 30 40 50 0 10 20 30 40 50
(c) Instance 1: average cost vs. Γ. (d) Instance 2: average cost vs. Γ.
Figure 1: The impact of the budget of uncertainty on worst-case and average costs for two instances
of the newsvendor problem.
6.2 Algorithmic Performance
The algorithm was implemented using AMPL/CPLEX v.12.1 on a Linux machine with Intel Core
Quad CPU at 2.50GHz.
Figure 2 illustrates the effect of the budget of uncertainty on both the number of iterations and the
16
running time, in CPU seconds, of Algorithm 3.2. Neither the number of iterations nor the running
time showed any particular dependence on Γ (although problems with very small and very large
values of Γ appear easier to solve, perhaps due to relatively smaller numbers of extreme points of
B, and thus small number of non-redundant constraints in problem (15)). The maximum number
of iterations needed for either problem instance was around 180, while the maximum running time
was under 2 seconds.
170 190
180
160
170
150 160
Number of Iterations
150
140
140
130 130
120
120
110
110 100
0 5 10 15 20 25 30 35 40 45 50 0 5 10 15 20 25 30 35 40 45 50
(a) Instance 1: Iteration count vs. Γ. (b) Instance 2: Iteration count vs. Γ.
1.8 2
1.7
1.8
1.6
1.6
1.5
CPU Seconds
CPU Seconds
1.4 1.4
1.3
1.2
1.2
1
1.1
1 0.8
0 5 10 15 20 25 30 35 40 45 50 0 5 10 15 20 25 30 35 40 45 50
(c) Instance 1: Run time (CPU seconds) vs. Γ. (d) Instance 2: Run time (CPU seconds) vs. Γ.
Figure 2: The impact of the budget of uncertainty on the number of iterations and run time (sec)
for two instances of the multi-item newsvendor problem.
We next investigated how the size of the problem affects the number of iterations required by
the algorithm, as well as its running time. We compared instances of the newsvendor problem
with n = 50, 75, 100, 125 and A = 100n. One again, two cost structures were considered, and the
problems were solved for all values of Γ = 0, 1, . . . , n.
Overall, the number of iterations required, as well as the running time of the algorithm, increase
with the increase in problem size. (The number of iterations for n = 125 is above 360 for some
17
400 400
125 items 125 items
100 items 100 items
75 items 75 items
50 items 50 items
350
350
300
300
Numer of iterations
Numer of iterations
250
250
200
200
150
150
100
50 100
0 20 40 60 80 100 120 0 20 40 60 80 100 120
(a) Instance 1: Iteration count vs. Γ. (b) Instance 2: Iteration count vs. Γ.
Figure 3: The impact of problem size on the number of iterations for two instances of the multi-item
newsvendor problem.
values of Γ in both instances.) Figures 3(a) and 3(b) illustrate the dependence of the number of
iterations performed on the structure of the problem, reflected in this case by the difference in the
cost vectors in the two instances. The number of constraints in problem (15) is the same in the
two instances for the same values of n and Γ. However, in instance 1, the number of iterations
required decreases significantly for larger values of n and Γ. This is likely due to the fact that
the number of non-redundant constraints in significantly smaller for those problems (not surprising
given the relationship between worst-case demand vectors b and dual recourse vectors p highlighted
in Corollary 4.2).
Figures 4(a) and 4(b) demonstrate corresponding increases in the running time of the algorithm for
increasing problem sizes. This overall increase can be attributed to two factors: the increase in the
number of iterations, as well as the increase in the complexity of each iteration, as the number of
variables in the subproblem solved, as well as the size of vector to be sorted to solve the adversarial
problem, increases. Figures 4(c) and 4(d) depict the ratio between the running time and the number
of iterations required for each instance solved. It appears that the increased complexity of each
iteration contributes significantly to the increase in running times.
It should be noted that we did not make any special efforts to implement the individual steps of
our algorithm efficiently. E.g., we did not take advantage (beyond what is done automatically by
the CPLEX solver under default settings) of the fact that the linear program being solved at each
iteration differs from the one solved in the previous iteration by just one additional constraint, and
hence is likely easily handled by dual simplex algorithm. Such savings could potentially improve
the running time of the algorithm, but not change the number of iterations required.
7 Computational Results: Production Planning Problem
Here, we consider a production planning example where the demand is uncertain but must be met.
Once demand has been revealed, the decision-maker has the option to buy additional raw materials
18
9 14
125 items 125 items
100 items 100 items
75 items 75 items
8 50 items 50 items
12
10
8
CPU seconds
CPU seconds
5
4
6
2
1
0 0
0 20 40 60 80 100 120 0 20 40 60 80 100 120
(a) Instance 1: Run time (CPU seconds) vs. Γ. (b) Instance 2: Run time (CPU seconds) vs. Γ.
0.024 0.035
125 items 125 items
100 items 100 items
75 items 75 items
0.022 50 items 50 items
0.03
0.02
0.018 0.025
CPU seconds per iteration
CPU seconds per iteration
0.016
0.02
0.014
0.012 0.015
0.01
0.01
0.008
0.006 0.005
0 20 40 60 80 100 120 0 20 40 60 80 100 120
(c) Instance 1: CPU seconds per iteration vs. Γ. (d) Instance 2: CPU seconds per iteration vs. Γ.
Figure 4: The impact of problem size on run time (sec) for two instances of the multi-item newsven-
dor problem.
at a higher cost and re-run the production process, so that demand for all products is satisfied.
The goal is to minimize the ordering costs of raw materials and the production costs of finished
products, as well as the inventory (or disposal) costs on the materials and products remaining at
the end of the time period. We introduce the following notation:
19
m: the number of raw materials,
n: the number of finished products,
c: the first-stage unit cost of the raw materials,
d: the second-stage unit cost of the raw materials,
f: the first-stage unit production cost,
g: the second-stage unit production cost,
h: the unit inventory cost of unused raw materials,
k: the unit inventory cost of unsold finished products,
A: the productivity matrix,
b: the demand for the finished products,
x: the raw materials purchased in the first stage,
y: the raw materials purchased in the second stage,
u: the products produced in the first stage,
v: the products produced in the second stage,
We assume that all coefficients of the matrix A are nonnegative, as are all the costs. The deter-
ministic problem can be formulated as:
min cT x + dT y + f T u + gT v + hT (x + y − A(u + v)) + kT (u + v − b)

x,y,u,v
s.t. Au ≤ x
Au + Av ≤ x + y (30)
u+v ≥b
x, y, u, v ≥ 0.
Note that, according to the formulation, raw materials can be purchased in the first stage of the
time period and used in production in the second stage. This problem satisfies assumption of
relatively complete recourse.
7.1 Analysis of Production Planning Problem
The robust production planning problem is as follows:
min cT x + f T u + max Q(x, u, b)

x,u b∈B
s.t. Au ≤ x (31)
x, u ≥ 0,
where the recourse function, Q(x, u, b) is given by
Q(x, u, b) = c̄(x, u) − kT b + min (d + h)T y + (g − AT h + k)T v

y,v
s.t. y − Av ≥ Au − x (32)
v ≥b−u
y, v ≥ 0,
where c̄(x, u) = hT x−hT Au+kT u. Since this form of the recourse function is slightly different from
expression (2) in that the constraints are in inequality form, we begin by deriving a modification
of the integer program of Theorem 5.1.
20
The dual of the recourse function is:
Q(x, u, b) = c̄(x, u) − kT b + max (Au − x)T q + (b − u)T p
q,p
s.t. 0 ≤ q ≤ d + h (33)
0 ≤ p ≤ AT q + g − AT h + k.
To obtain Q(x, u) = maxb∈B Q(x, u, b), where Q(x, u, b) is defined by (33), we can write:
n
!
X
Q(x, u) = c̄(x, u) − kT b̄ + maxq,p T T
(Au − x) q + (b̄ − u) p + max bbi (pi − ki )zi
z∈Z
i=1
s.t. 0 ≤ q ≤ d + h
0 ≤ p ≤ AT q + g − AT h + k,
(34)
where Z was defined in equation (7). Applying the same logic as in Theorem 5.1, for a given q and
p, we can re-write the inner maximization problem of (34) as an integer programming problem in
variables r+ and r− :
Xm
max bbi (pi − ki )(r+ − r− )
i i
r+ , r−
i=1
s.t. eT (r+ + r− ) ≤ Γ, (35)
r+ + r− ≤ e,
r+ , r− ∈ {0, 1}m .
Substituting (35) into (34), we obtain:
n
X
T
Q(x, u) = c̄(x, u) − k b̄ + max (Au − x)T q + (b̄ − u)T p + bbi (pi − ki )(r+ − r− )
i i
q,p,r±
i=1
s.t. 0 ≤ q ≤ d + h,
0 ≤ p ≤ AT q + g − AT h + k, (36)
eT (r+ + r− ) ≤ Γ,
r+ + r− ≤ e,
r+ , r− ∈ {0, 1}n .
To remove the nonlinearity from the objective function, we introduce variables s+ +

i = pi ri and
s− −
i = pi ri ∀i. Substituting the above into (36) we conclude that we can calculate the value of
Q(x, u) by solving the following IP:
c̄(x, u) − kT b̄ + max b T (s+ − s− ) − Pn kibbi (r+ − r− )

(Au − x)T q + (b̄ − u)T p + b i=1 i i
q,p,s± ,r±
s.t. 0 ≤ q ≤ d + h
0 ≤ p ≤ AT q + g − AT h + k
0 ≤ s+ ≤ p+
0 ≤ s− ≤ p− (37)
s+ ≤ M r+
s− ≤ M r−
eT (r+ + r− ) ≤ Γ
r+ + r− ≤ 1
r+ , r− ∈ {0, 1}n ,
where M is a sufficiently large positive number.
21
Lastly, given (x, u), let (q, p, s± , r± ) solve (37). The corresponding dual recourse vector is (q, p),
and corresponding worst-case value of b can be determined as bi = b̄i + bbi (ri+ − ri− ), i = 1, . . . , m.
The subgradient of cT x + f T u + max Q(x, u, b) can now be computed as in Lemma 3.1, and is
b∈B
equal to
c−q+h
.
f + AT q − p − AT h + k
7.2 Computational Results
We considered an example of the production planning problem with m = 2 raw materials and
n = 30 finished products with the following data:
• Purchasing and inventory costs of raw materials: cj = 100, dj = 150, hj = 20 for j = 1, 2;
• Production and inventory costs of products: fi = 530, gi = 750, ki = 50 for i = 1, . . . , 30;
• Components of productivity matrix A were independently drawn from the integer-valued

uniform distribution U [0, 15];
• Product demand ranges: b̄i = 10 and b̂i = 5 for i = 1, . . . , 30.
In our example, g > AT h + k, ensuring that any production v in the second stage is performed
solely in order to satisfy demand not filled by first-stage production u (see (32)).
To initialize the algorithm, one can select arbitrary feasible values of first-stage variables (x, u),
compute the value and subgradient of Q(x, u), and use the resulting linear function as Q0 (x, u)
(we used conservative values of u = b̄ + b̂ and x = Au).
900000
850000
800000
750000
Cost
700000
650000
600000
Average Cost
Worst-case Cost
550000
0 5 10 15 20 25 30
Budget of Uncertainty
(a) Normally distributed demands. (b) Uniformly distributed demands.
Figure 5: The impact of the budget of uncertainty on worst-case and average cost of the production
planning problem, under Normal and Uniform demand distributions.
In this experiment, we considered the situation in which the decision maker assumes the demands
for individual products to be random, and has information about the average demands for the
various products, as well as some sense of the variabilities of these demands, but does not have a
22
specific probability model he trusts. Given his information, he fixes the uncertainty set by selecting
the values of b̄i and b̂i specified above, and considers a range of values of the uncertainty budget Γ.
The blue curve in Figure 5(a) (and in Figure 5(b)) shows the worst-case cost of this problem as
a function of budget of uncertainty Γ ∈ [0, 30]. To assess the expected performance of robust so-
lutions, we considered two possible distributions of demand: one in which demands for individual
products follow independent Normal distributions with mean b̄i = 10 and standard deviation b̂i = 5
(truncated at 0, to avoid negative values), and one in which demands follow independent continuous
Uniform distributions on the interval [b̄i − b̂i , b̄i + b̂i ] = [5, 15]. (Recall that the decision-maker does
not have the full knowledge of the distribution when selecting the uncertainty set.) We generated
independent samples of 5000 realizations of the demands for each distribution and plotted the re-
sulting average cost in Figures 5(a) and 5(b) for the Normal and Uniform distributions, respecitvely.
Just as in the newsvendor problem of the previous section, the average cost first decreases with
Γ, as incorporating uncertainty into the model yields more robust solutions, reaches its minimum,
and then increases with Γ as the solutions become overly conservative. The minimum average cost
in Figure 5(a) occurs at Γ = 9, and by implementing the corresponding ordering and production
planning solution, the decision-maker can achieve savings of 3.3% over the solution obtained for
Γ = 0 (i.e., the solution targeted to satisfy the demand b̄); in Figure 5(b) the minimum occurs at
Γ = 6, with savings of 2.5%. These outcomes make intuitive sense, since demands in the sample
generated from the Normal distribution exhibit higher variability than in the sample generated
from the Uniform distribution. This example illustrates the advantage of the robust optimization
approach in situations when precise estimates of probability distributions of uncertain parameters
are unavailable, or turn out to be inaccurate. Observe that implementing any of the robust so-
lutions obtained by setting Γ anywhere in the range between 5 and 10 yields a robust solution
that would perform well (as measured by the expected cost) regardless of whether the demands
follow a Normal or Uniform distribution. We performed a number of additional experiments with
demands sampled from a variety of distributions. Results presented in this section are typical of
all these experiments, with the minimum average cost occurring at Γ ∈ [5, 15], which indicates that
in the production planning problem a fair amount of uncertainty needs to be considered to obtain
solutions that perform well in expectation.
It is informative to consider the amounts of raw materials purchased and production done in the first
and second stages. The first-stage purchasing and production decisions (x and u, respectively) are
made according to the solution calculated by solving the robust problem (31). Once the first-stage
decisions are made and implemented, actual demand is revealed and the second-stage decisions (y
and v) tune themselves to the realized demands.
Figures 6(a) and 6(b) plot the total amount of raw materials purchased and products produced,
respectively, in the first and second stage as fractions of the total purchasing/production that occurs
under the worst-case demand outcome. For Γ = 0, we start with first-stage purchasing/production
levels targeted to satisfy nominal demand. As Γ increases, we split purchasing and production
between the two stages. However, for higher values of Γ worst-case demand realizations tend to
have values higher than nominal, and thus we see the second-stage purchasing and production
decreasing.
Figure 7(a) displays the sample averages of the fractions of total amounts of raw materials purchased
in the first and second stage when first-stage decisions are obtained by solving the robust formulation
for various values of Γ, and the demands are normally distributed. (Figure 7(b) captures similar
23
1 1
0.8 0.8
0.6 0.6
Fraction
Fraction
1st Stg Purchase 1st Stg Production
Worst-case 2nd Stg Purchase Worst-case 2nd Stg Production
0.4 0.4
0.2 0.2
0 0
0 5 10 15 20 25 30 0 5 10 15 20 25 30
(a) Raw materials purchased. (b) Production performed.
Figure 6: First- and second-stage purchasing and production, as fractions of the total purchasing
and production under worst-case demand outcome.
information for total production, and Figures 7(c) and 7(d) display these fractions for uniformly
distributed demands.) The errors bars show the standard deviation of these average fractions.
With uniformly distributed demands, first-stage decisions obtained using high values of Γ in the
description of the uncertainty set actually satisfy the realized demand in most cases, as the average
second-stage purchasing is zero, and second-stage production is nearly zero, as 7(c) and 7(d) show
(recall also that for these high values of Γ, the average cost of these solutions is almost equal to their
worst-case costs). With normally distributed demands, average fraction of material purchases done
in the second stage is nearly zero for Γ ∈ [20, 25], but increases for solutions obtained with higher
values of Γ; average fraction of second-stage production decreases with Γ, but never reaches zero.
Again, this behavior can partially be explained by higher variability of demands under Normal
distribution.
Figures 8(a) and 8(b) summarize our computational experience by illustrating the effect of budget
of uncertainty on the number of iterations and the running time of Algorithm 3.2 on this problem.
The number of iterations required is not particularly sensitive to the value of Γ (except for very
small and very large values of Γ), and therefore the determining factor in the running time are
the computational demands of solving the integer program (37) (the adversarial problem), which
generally increase with Γ. Therefore, the overall running time of the algorithm generally increases
with Γ up to Γ = 23 and then drops off sharply. Thus, the algorithm has fairly low computa-
tional demands for budgets of uncertainty of Γ = 15 and lower, which were most appropriate for
determining robust solutions with low average costs in this and other experiments.
It should be pointed out that, in addition to the value of Γ, the computational demands of the
adversarial problem were greatly influenced by the density of the productivity matrix A. Indeed, if
the productivity matrix is dense (as it is in the example presented here), all the raw materials would
contribute roughly equally towards the production of most or all of the products, which makes it
harder to determine which products (and implicitly which raw materials) are more sensitive to
demand fluctuations. In examples where a sparse productivity matrix with pronounced block
structure allowed the decisions to be implicitly decomposed by material, instances of the MIP (37)
24
1 1
0.8 0.8
0.6 0.6
Average Fraction
Average Fraction
Avg 1st Stg Purchase Avg 1st Stg Production

Avg 2nd Stg Purchase Avg 2nd Stg Production
0.4 0.4
0.2 0.2
0 0
0 5 10 15 20 25 30 0 5 10 15 20 25 30
(a) Raw material purchased, normally distributed de- (b) Production performed, normally distributed de-
mands. mands.
(c) Raw material purchased, uniformly distributed de- (d) Production performed, uniformly distributed de-
mands. mands.
Figure 7: Sample averages of first- and second-stage purchasing and production, as fractions of the
total purchasing and production, under Normal and Uniform demand distributions.
25
250 250
200 200
150 150
CPU Seconds
100 100
50 50
0 0
0 5 10 15 20 25 30 0 5 10 15 20 25 30
(a) Iteration count vs. Γ. (b) Run time (CPU seconds) vs. Γ.
Figure 8: The impact of the budget of uncertainty Γ on the number of iterations and run time
(CPU seconds) for production planning problem.
were easier and required shorter solution times.

Finally, we would like to remark that, depending on relative magnitudes of problem parameters
(e.g., the relative magnitudes of inventory and first- and second-stage production costs), one can
devise heuristics that would generate an approximate solution to the adversarial problem (34) (i.e.,
a “bad,” if not worst, demand realization) quite easily — almost as easily as solving the adversarial
problem to optimality in the case of simple recourse. If the demand realization found is worse, in
the adversarial sense, than realizations already considered, it can be used to produce a weak cut
that separates the current iterate from the epigraph of Q(x, u), but is typically not a supporting
hyperplane (thus, increasing the number of iterations, but possibly leading to faster overall solution
times). Alternatively, an algorithm solving (37) can be terminated prematurely once a high-quality
incumbent solution has been found. In fact, in our experiments we observed that the optimal
solution of (37) was discovered relatively quickly, and, as is often the case, the bulk of the solution
time was spent improving dual bounds and proving optimality of the incumbent solution.
8 Conclusions
We have proposed an approach to linear optimization with recourse that is robust with respect to
the underlying probabilities. Specifically, instead of relying on the actual distribution, which would
be difficult to estimate accurately, or a family of distributions, which would significantly increase
the complexity of the problem at hand, we have modeled random variables as uncertain parameters
in a polyhedral uncertainty set and analyzed the problem for the worst-case instance within that
set. We have shown that this robust formulation can be solved using a cutting-plane algorithm and
standard linear optimization software. We tested our approach on a multi-item newsvendor problem
and a production planning problem with demand uncertainties, with encouraging computational
results. Analysis of obtained solutions provides insight into appropriate levels of conservatism in
planning (as captured by the budget of uncertainty) to obtain lower average costs.
26
References
[1] Alper Atamtürk and Muhong Zhang. Two-stage robust network flow and design under demand
uncertainty. Operations Research, 55(4):662–673, 2007.
[2] A. Ben-Tal, A. Goryashko, A. Guslitzer, and A. Nemirovski. Adjustable robust solutions of

uncertain linear programs. Math. Program., 99(2):351–376, 2004.
[3] A. Ben-Tal and A. Nemirovski. Robust convex optimization. Math. Oper. Res., 23(4):769–805,
1998.
[4] A. Ben-Tal and A. Nemirovski. Robust solutions of uncertain linear programs. Operations
Research Letters, 25(1):1–13, August 1999.
[5] A. Ben-Tal and A. Nemirovski. Robust solutions of linear programming problems contaminated
with uncertain data. Math. Program., Ser. A 88:411–424, 2000.
[6] J. F. Benders. Partitioning procedures for solving mixed-variables programming problems.

Numer. Math., 4:238–252, 1962.
[7] Dimitris Bertsimas and Melvyn Sim. Robust discrete optimization and network flows. Math.
Program., 98(1-3, Ser. B):49–71, 2003. Integer programming (Pittsburgh, PA, 2002).
[8] Dimitris Bertsimas and Melvyn Sim. The price of robustness. Oper. Res., 52(1):35–53, 2004.
[9] Dimitris Bertsimas and Aurélie Thiele. A robust optimization approach to inventory theory.
Oper. Res., 54(1):150–168, 2006.
[10] Dimitris Bertsimas and John N. Tsitsiklis. Introduction to Linear Optimization. Athena
Scientific, 1997.
[11] D Bienstock. Histogram models for robust portfolio optimization. 2007.
[12] D Bienstock and N Özbay. Computing robust basestock levels. Discrete Optimization, Jan
2008.
[13] John R. Birge and François Louveaux. Introduction to stochastic programming. Springer Series
in Operations Research. Springer-Verlag, New York, 1997.
[14] John R. Birge and Roger J.-B. Wets. Computing bounds for stochastic programming problems
by means of a generalized moment problem. Math. Oper. Res., 12(1):149–162, 1987.
[15] Michèle Breton and Saeb El Hachem. Algorithms for the solution of stochastic dynamic
minimax problems. Comput. Optim. Appl., 4(4):317–345, 1995.
[16] Wenqing Chen and Melvyn Sim. Goal-driven optimization. Oper. Res., 57(2):342–357, 2009.
[17] George B. Dantzig. Linear programming under uncertainty. Management Sci., 1:197–206,
1955.
[18] J. Dupačová. On minimax decision rule in stochastic linear programming. In Studies on math-
ematical programming (Papers, Third Conf. Math. Programming, Mátrafüred, 1975), volume 1
of Math. Methods Oper. Res., pages 47–60. Akad. Kiadó, Budapest, 1980.
27
[19] J Dupačová. The minimax approach to stochastic programming and an illustrative application.
Stochastics, 20(1):73–88, 1987.
[20] J. Dupačová. Stochastic programming: minimax approach. In Ch. A Floudas and P. M.

Pardalos, editors, Encyclopedia of Optimization, volume V, pages 327–330. Kluwer Academic
Publishers, 2001.
[21] Laurent El Ghaoui and Hervé Lebret. Robust solutions to least-squares problems with uncer-
tain data. SIAM J. Matrix Anal. Appl., 18(4):1035–1064, 1997.
[22] Laurent El Ghaoui, Francois Oustry, and Hervé Lebret. Robust solutions to uncertain semidef-
inite programs. SIAM J. on Optimization, 9(1):33–52, 1998.
[23] Yu. Ermoliev, A. Gaivoronski, and C. Nedeva. Stochastic optimization problems with incom-
plete information on distribution functions. SIAM J. Control Optim., 23(5):697–716, 1985.
[24] R. Jagannathan. Minimax procedure for a class of linear programs under uncertainty. Opera-
tions Res., 25(1):173–177, 1977.
[25] Peter Kall and Stein W. Wallace. Stochastic programming. Wiley-Interscience Series in Systems
and Optimization. John Wiley & Sons Ltd., Chichester, 1994.
[26] J. E. Kelley, Jr. The cutting-plane method for solving convex programs. J. Soc. Indust. Appl.
Math., 8:703–712, 1960.
[27] A Mutapcic and S Boyd. Cutting-set methods for robust convex optimization with pessimizing
oracles. 2008.
[28] Fernando Ordóñez and Jiamin Zhao. Robust capacity expansion of network flows. Networks,
50(2):136–145, 2007.
[29] Morten Riis and Kim Allan Andersen. Applying the minimax criterion in stochastic recourse
programs. European J. Oper. Res., 165(3):569–584, 2005.
[30] Alexander Shapiro. Monte Carlo sampling methods. In Stochastic programming, volume 10 of
Handbooks Oper. Res. Management Sci., pages 353–425. Elsevier, Amsterdam, 2003.
[31] Alexander Shapiro and Shabbir Ahmed. On a class of minimax stochastic programs. SIAM J.
Optim., 14(4):1237–1249 (electronic), 2004.
[32] Alexander Shapiro and Anton Kleywegt. Minimax analysis of stochastic problems. Optim.
Methods Softw., 17(3):523–542, 2002. Stochastic programming.
[33] A. L. Soyster. Convex programming with set-inclusive constraints and applications to inexact
linear programming. Oper. Res., 21(5):1154–1157, 1973.
[34] A Takeda, S Taguchi, and R Tütüncü. Adjustable robust optimization models for a nonlinear
two-period system. Journal of Optimization Theory and Applications, Jan 2008.
[35] Samer Takriti and Shabbir Ahmed. Managing short-term electricity contracts under uncer-
tainty: a minimax approach. Technical report, Georgia Institute of Technology, 2002.
[36] Samer Takriti and Shabbir Ahmed. On robust optimization of two-stage systems. Math.
Program., 99(1, Ser. A):109–126, 2004.
28
[37] Jitka Žáčková. On minimax solutions of stochastic linear programming problems. Časopis
Pěst. Mat., 91:423–430, 1966.
29

Robust Linear Optimization With Recourse: Aur Elie Thiele, Tara Terry, and Marina Epelman March, 2010

Uploaded by

Copyright:

Available Formats

Robust Linear Optimization With Recourse: Aur Elie Thiele, Tara Terry, and Marina Epelman March, 2010

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Robust Linear Optimization With Recourse: Aur Elie Thiele, Tara Terry, and Marina Epelman March, 2010

Uploaded by

Copyright:

Available Formats

Robust Linear Optimization With Recourse

Aurélie Thiele∗, Tara Terry†, and Marina Epelman‡

1. We address right-hand-side uncertainty in linear programming problems with recourse by

2.1 Optimization with Recourse

with the following notation:

the stochastic counterpart of problem (1) can be formulated as a nonlinear problem:

min cT x + Eb [Q(x, b)]

2.2 The Adjustable Robust Approach

min cT x + max Q(x, b)

2.3 Assumptions and Preliminaries

Q(x, b) = max (b − Ax)T p

Thus, problem (5) is equivalent to:

Q(λx1 + (1 − λ)x2 , b) ≤ λQ(x1 , b) + (1 − λ)Q(x2 , b)

Q(λx1 + (1 − λ)x2 , b) ≤ dT (λy1 + (1 − λ)y2 ) = λdT y1 + (1 − λ)dT y2 = λQ(x1 , b) + (1 − λ)Q(x2 , b),

proving the first claim.

3 A Cutting-Plane Approach: Kelley’s Algorithm

f (x) ≥ f (x̃) + gT (x − x̃) ∀x.

Q(x) = max Q(x, b) = max (b − Ax)T p. (10)

s.t. α ≥ Q(x) (11)

Proof. For an arbitrary x,

Q(x) ≥ Q(x, b̃) ≥ (b̃ − Ax)T p̃

proving the claim.

Iteration t Given L, U, and Qt (x),

Step 1 Solve min{cT x + α : x ∈ S, α ≥ Qt (x)}:

s.t. α + pTl Ax ≥ bTl pl , l = 1, . . . , t − 1 (12)

Let (xt , αt ) be an optimal solution and let L = cT xt + αt = cT xt + Qt (xt ).

where p1 , . . . , pK are (all) the extreme points of the set {p : BT p ≤ d}.

4 Analysis of Robust Linear Programs with Simple Recourse

where B has been defined in equations (6)—(7).

Proof. We note that for any first-stage decision vector x:

Q(x) = max max (b − Ax)T p,

Corresponding dual recourse vector p can be determined as follows: for i = 1, . . . , m

Q(x) = max (b̄ − Ax)T (p+ − p− ) + b

The subgradient of Q(x) can now be computed as in Lemma 3.1.

6 Computational Results: Newsvendor Problem

n: the number of items,

The deterministic problem can be formulated as:

Shortage penalty si Surplus penalty hi

6.1 Analysis of Problem Solutions

Average Cost Average Cost

6.2 Algorithmic Performance

7 Computational Results: Production Planning Problem

CPU seconds per iteration

min cT x + dT y + f T u + gT v + hT (x + y − A(u + v)) + kT (u + v − b)

7.1 Analysis of Production Planning Problem

The robust production planning problem is as follows:

min cT x + f T u + max Q(x, u, b)

where the recourse function, Q(x, u, b) is given by

Q(x, u, b) = c̄(x, u) − kT b + min (d + h)T y + (g − AT h + k)T v

To remove the nonlinearity from the objective function, we introduce variables s+ +

c̄(x, u) − kT b̄ + max b T (s+ − s− ) − Pn kibbi (r+ − r− )

where M is a sufficiently large positive number.

7.2 Computational Results

• Purchasing and inventory costs of raw materials: cj = 100, dj = 150, hj = 20 for j = 1, 2;

• Production and inventory costs of products: fi = 530, gi = 750, ki = 50 for i = 1, . . . , 30;

• Components of productivity matrix A were independently drawn from the integer-valued

• Product demand ranges: b̄i = 10 and b̂i = 5 for i = 1, . . . , 30.

(a) Normally distributed demands. (b) Uniformly distributed demands.