Generalized Derivatives of Lexicographic Linear Programs


Abstract

Lexicographic linear programs are fixed-priority multiobjective linear programs, useful for modeling biological systems via flux balance analysis and for goal-programming problems. The objective function values of a lexicographic linear program, viewed as a function of its right-hand side, are nonsmooth. This work derives generalized derivative information for lexicographic linear programs, using lexicographic directional derivatives to obtain elements of the Bouligand subdifferential (limiting Jacobian). It is shown that elements of the limiting Jacobian can be obtained by solving related linear programs. A nonsmooth equation-solving problem is solved to illustrate the benefits of using elements of the limiting Jacobian of lexicographic linear programs.


References

1. Harwood, S.M., Höffner, K., Barton, P.I.: Efficient solution of ordinary differential equations with a parametric lexicographic linear program embedded. Numer. Math. 133(4), 623–653 (2016)

2. Höffner, K., Harwood, S.M., Barton, P.I.: A reliable simulator for dynamic flux balance analysis. Biotechnol. Bioeng. 110(3), 792–802 (2013)

3. Ignizio, J.P.: Linear Programming in Single- & Multiple-Objective Systems. Prentice Hall, Englewood Cliffs (1982)

4. Höffner, K., Barton, P.I.: Design of microbial consortia for industrial biotechnology. Comput. Aided Chem. Eng. 34, 65–74 (2014)

5. Von Stackelberg, H.: The Theory of the Market Economy. Oxford University Press, Oxford (1952)

6. Ferris, M.C., Pang, J.S.: Engineering and economic applications of complementarity problems. SIAM Rev. 39(4), 669–713 (1997)

7. Nesterov, Y.: Lexicographic differentiation of nonsmooth functions. Math. Program. 104(2–3), 669–700 (2005)

8. Khan, K.A., Barton, P.I.: A vector forward mode of automatic differentiation for generalized derivative evaluation. Optim. Methods Softw. 30(6), 1185–1212 (2015)

9. Scholtes, S.: Introduction to Piecewise Differentiable Equations. Springer, New York (2012)

10. Bertsimas, D., Tsitsiklis, J.N.: Introduction to Linear Optimization. Athena Scientific, Belmont (1997)

11. Khan, K.A., Barton, P.I.: Generalized derivatives for solutions of parametric ordinary differential equations with non-differentiable right-hand sides. J. Optim. Theory Appl. 163(2), 355–386 (2014)

12. Barton, P.I., Khan, K.A., Stechlinski, P., Watson, H.A.: Computationally relevant generalized derivatives: theory, evaluation and applications. Optim. Methods Softw. (2017). https://doi.org/10.1080/10556788.2017.1374385

13. Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)

14. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2013)

15. Höffner, K., Khan, K.A., Barton, P.I.: Generalized derivatives of dynamic systems with a linear program embedded. Automatica 63, 198–208 (2016)

16. Gomez, J.A., Höffner, K., Barton, P.I.: DFBAlab: a fast and reliable MATLAB code for dynamic flux balance analysis. BMC Bioinform. 15, 409 (2014)

17. IBM ILOG CPLEX: 12.7 User's Manual. https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.studio.help/pdf/usrcplex.pdf (2017)

18. Gurobi Optimization: Gurobi Optimizer Reference Manual. http://www.gurobi.com/documentation/ (2017)

19. Chen, J., Gomez, J.A., Höffner, K., Barton, P.I., Henson, M.A.: Metabolic modeling of synthesis gas fermentation in bubble column reactors. Biotechnol. Biofuels 8, 89 (2015)

20. Chen, J., Gomez, J.A., Höffner, K., Phalak, P., Barton, P.I., Henson, M.A.: Spatiotemporal modeling of microbial metabolism. BMC Syst. Biol. 10, 21 (2016)

21. Qi, L., Sun, J.: A nonsmooth version of Newton's method. Math. Program. 58(1–3), 353–367 (1993)

22. Boyd, S., Vandenberghe, L.: Subgradients: Notes for EE364b, Stanford University, Winter 2006–07. https://see.stanford.edu/materials/lsocoee364b/01-subgradients_notes.pdf (2008)


Acknowledgements

The authors would like to acknowledge Dr. Stuart Harwood for the insightful discussions that took place while preparing the paper.

Author information


Corresponding author

Correspondence to Paul I. Barton.

Additional information

Communicated by Alexander Martin.

Appendices


Appendix A

Proof of Proposition 2.1

First, we show that if \(\text {int}(F) \ne \emptyset \), then \(\mathbf {A}\) has full row rank. Assume \(\mathbf {A}\) does not have full row rank and \(\text {int}(F) \ne \emptyset \). Then at least one row of \(\mathbf {A}\) is linearly dependent on the other rows; without loss of generality, assume it is the last row. Then every \(\mathbf {b}\in F\) satisfies the same linear relation among its components: the last component of any \(\mathbf {b}= \mathbf {A}\mathbf {v}\) is determined by the first \(m-1\) components. Let \(\mathbf {b}_0 \in \text {int}(F)\). Any small perturbation of the last component of \(\mathbf {b}_0\), keeping all other components constant, violates this relation and makes LP (1) infeasible. Hence \(\mathbf {b}_0 \notin \text {int}(F)\), a contradiction.

For the converse, assume \(\mathbf {A}\) has full row rank. Let \(\mathbf {v}= \mathbf {1}\) and \(\mathbf {b}= \mathbf {A}\mathbf {v}\), so that \(\mathbf {b}\in F\). Let \(\mathbf {d}\in \mathbb {R}^m\) with \(\Vert \mathbf {d}\Vert = 1\), without loss of generality, and let \(\epsilon > 0\). We seek \(\Delta \mathbf {v}\) satisfying \(\mathbf {A}(\mathbf {v}+ \Delta \mathbf {v}) = \mathbf {b}+ \epsilon \mathbf {d}\). Since \(\mathbf {A}\mathbf {v}= \mathbf {b}\), it suffices that \(\mathbf {A}\Delta \mathbf {v}= \epsilon \mathbf {d}\). Since \(\mathbf {A}\) has full row rank, its columns span \(\mathbb {R}^m\), so some m columns of \(\mathbf {A}\) form a basis of \(\mathbb {R}^m\); collect them in \(\tilde{\mathbf {A}}\) and, without loss of generality, let these be the first m columns of \(\mathbf {A}\). Let \(\Delta {\tilde{\mathbf {v}}} = \epsilon \tilde{\mathbf {A}}^{-1}\mathbf {d}\) and set \(\Delta v_j = \Delta {\tilde{v}}_j\) if \(j \in \{1,\ldots ,m\}\) and \(\Delta v_j = 0\) otherwise. By construction, \(\Delta \mathbf {v}\) satisfies \(\mathbf {A}(\mathbf {v}+ \Delta \mathbf {v}) = \mathbf {b}+ \epsilon \mathbf {d}\). Moreover, \(\Vert \Delta \mathbf {v}\Vert = \Vert \Delta {\tilde{\mathbf {v}}} \Vert \le \epsilon \Vert \tilde{\mathbf {A}}^{-1} \Vert \Vert \mathbf {d}\Vert = \epsilon \Vert \tilde{\mathbf {A}}^{-1} \Vert \), so for small enough \(\epsilon >0\), \(\Vert \Delta \mathbf {v}\Vert < 1\). Then every component satisfies \(| \Delta v_j |<1\), so \(\mathbf {v}+ \Delta \mathbf {v}> \mathbf {0}\) and \(\mathbf {b}+ \epsilon \mathbf {d}\in F\). Since \(F = \{\mathbf {A}\mathbf {v}{:}\,\mathbf {v}\ge \mathbf {0} \}\) is a nonempty convex set [10], \(\mathbf {b}+ \delta \mathbf {d}\in F\) for all \(\delta \in ]0,\epsilon [\). Since \(\mathbf {d}\) was arbitrary, \(\mathbf {b}\in \text {int}(F)\) and \(\text {int}(F) \ne \emptyset \). \(\square \)
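The construction above is easy to check numerically. The following minimal sketch (our illustration with randomly generated data, not part of the original proof; numpy assumed available) builds \(\Delta \mathbf {v}\) from a basis submatrix exactly as in the proof and verifies that \(\mathbf {b}+ \epsilon \mathbf {d}\) stays in F:

```python
# Minimal numerical sketch of the construction in the proof of Proposition 2.1.
# All data below are illustrative assumptions, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))           # full row rank with probability 1
assert np.linalg.matrix_rank(A) == m

v = np.ones(n)                            # v = 1, so b = A v lies in F
b = A @ v

d = rng.standard_normal(m)
d /= np.linalg.norm(d)                    # ||d|| = 1, arbitrary direction

A_tilde = A[:, :m]                        # a basis: first m columns (assumed invertible here)
eps = 0.5 / np.linalg.norm(np.linalg.inv(A_tilde))   # small enough that ||dv|| < 1

dv = np.zeros(n)
dv[:m] = eps * np.linalg.solve(A_tilde, d)           # dv_tilde = eps * A_tilde^{-1} d

assert np.allclose(A @ (v + dv), b + eps * d)        # A(v + dv) = b + eps d
assert np.all(v + dv > 0)                            # v + dv > 0, so b + eps d is in F
print("b + eps*d remains in F along this direction")
```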

Proof of Proposition 2.3

Notice that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d})\) holds for all \(\epsilon \in [0,\delta ^*_\mathbf {d}[\). From Lemma 2.1 there exists \(\delta >0\) such that \(\varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}) + \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\) for all \(\Vert \epsilon \mathbf {d}\Vert < \delta \). Let \(\delta _\mathbf {d}= \min (\delta ,\delta ^*_\mathbf {d})\). Given that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d})\) for all \(\epsilon \in ]0,\delta _\mathbf {d}[\), it follows that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\rho }(\mathbf {x}) + \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\) for all \(\epsilon \in ]0,\delta _\mathbf {d}[\). Next, we show that \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\) exists and equals \(\varvec{\theta }^\prime (\mathbf {x}; \mathbf {d})\). Consider any sequence \(\{\tau _i\}^\infty _{i=0}\) with \(\tau _i \in ]0,\delta _\mathbf {d}[\) and \(\tau _i \downarrow 0\). From the definition of the directional derivative,

$$\begin{aligned} \underset{i \rightarrow \infty }{\lim }\frac{\varvec{\theta }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\theta }(\mathbf {x})}{\tau _i} =\varvec{\theta }^\prime (\mathbf {x};\mathbf {d}). \end{aligned}$$

Also, for all \(\tau _i\),

$$\begin{aligned} \frac{\varvec{\theta }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\theta }(\mathbf {x})}{\tau _i} = \frac{\varvec{\rho }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\rho }(\mathbf {x})}{\tau _i}. \end{aligned}$$

Therefore, \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\) exists and \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) = \varvec{\theta }^\prime (\mathbf {x}; \mathbf {d})\).

From Eq. (4), \(\varvec{\rho }^\prime (\mathbf {x};\epsilon \mathbf {d}) = \epsilon \varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) = \epsilon \varvec{\theta }^\prime (\mathbf {x}; \mathbf {d}) = \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\). Hence for all \(\epsilon \in ]0,\delta _\mathbf {d}[\), \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\rho }(\mathbf {x}) + \varvec{\rho }^\prime (\mathbf {x}; \epsilon \mathbf {d})\). \(\square \)

Proof of Theorem 2.1

Since \(\varvec{\rho }\) is l-smooth, it is in particular Lipschitz continuous near \(\mathbf {x}\). Since \(\varvec{\zeta }\) is locally Lipschitz continuous, the composition \(\varvec{\sigma }:= \varvec{\zeta }\circ \varvec{\rho }\) is Lipschitz continuous near \(\mathbf {x}\). First, we show that \(\varvec{\sigma }\) is directionally differentiable at \(\mathbf {x}\).

We need to show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\sigma }(\mathbf {x})}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\) for all \(\mathbf {d}\in \mathbb {R}^m\). First,

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\rho }(\mathbf {x}) - \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})}{\tau }=\mathbf {0}\end{aligned}$$

from the definition of the directional derivative. Next, given \(\mathbf {d}\in \mathbb {R}^m\),

$$\begin{aligned} \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\) by assumption. We show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }, \end{aligned}$$

which is equivalent to showing that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))}{\tau } = \mathbf {0}. \end{aligned}$$

From the existence of the directional derivative \(\varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d}))\), there exists \(\epsilon _1>0\) such that \(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) \in Z\) for all \(\tau \in ]0,\epsilon _1[\). Since \(\varvec{\zeta }\) is locally Lipschitz continuous, there exist \(\epsilon _2>0\) and \(K>0\) such that

$$\begin{aligned}&\left\| \varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})) \right\| \le K\left\| \varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})) \right\| ,\\ \end{aligned}$$

for all \(\tau \in ]0,\epsilon _1[\) such that \(\Vert \varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) \Vert < \epsilon _2\). Dividing both sides by \(\tau > 0\) gives

$$\begin{aligned}&\left\| \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| \le K \left\| \frac{\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| , \end{aligned}$$

for all \(\tau \in ]0,\epsilon _1[\) such that \(\Vert \varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) \Vert < \epsilon _2 \). By the existence of \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\),

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) }{\tau } \right\| = 0, \end{aligned}$$

and then

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| = 0. \end{aligned}$$

Since \(\mathbf {d}\in \mathbb {R}^m\) was arbitrary, \(\varvec{\sigma }\) is directionally differentiable at \(\mathbf {x}\) and

$$\begin{aligned} \varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d}) = \varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})). \end{aligned}$$

Now let us assume \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) = \varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))\) for \(j \in \{1,\ldots ,q\}\), for all \(\mathbf {M}\in \mathbb {R}^{m\times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\). Let \(\mathbf {y}:= \varvec{\rho }(\mathbf {x})\), \(\mathbf {Y}:= \varvec{\rho }^\prime (\mathbf {x};\mathbf {M})\), and \({\hat{\mathbf {m}}} := \mathbf {m}_{j+1}\). We need to show that the following limit exists in \(\mathbb {R}^p\) for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\):

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})-\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)}{\tau } \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }. \end{aligned}$$

Given \(\mathbf {M}\in \mathbb {R}^{m\times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), since \(\varvec{\zeta }\) is \(\varvec{\rho }-\)weakly l-smooth at \(\mathbf {x}\), we know the following limit exists in \(\mathbb {R}^p\):

$$\begin{aligned}&[\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}]^\prime (\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j),[\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})) \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }, \end{aligned}$$

and \(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) \in G^{(j-1)}_{\mathbf {x}}\) for \(\tau >0\) small enough. We also know from the definition of l-smoothness that:

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\left( \mathbf {m}_j + \tau {\hat{\mathbf {m}}}\right) -\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) - \tau \left[ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\right] ^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})\right) }{\tau }=\mathbf {0}. \end{aligned}$$

We will show that

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))}{\tau } \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }, \end{aligned}$$

which is equivalent to

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau } = \mathbf {0}. \end{aligned}$$

From Corollary 2.2, \(\varvec{\zeta }^{(k)}_{\mathbf {y},\mathbf {Y}}\) is globally Lipschitz continuous for all \(k \in \{0,\ldots ,q\}\). Then, there exists \(\delta >0\) such that \(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) \in G^{(j-1)}_{\mathbf {x}}\) for all \(\tau \in [0,\delta [\), and there exists \(K>0\) such that, for all \(\tau \in ]0,\delta [\),

$$\begin{aligned}&K \left\| \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) - \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})}{\tau }\right\| \\&\quad \ge \left\| \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau }\right\| . \end{aligned}$$

By the existence of the directional derivative \([\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})\),

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) - \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})}{\tau }\right\| = 0, \end{aligned}$$

and then

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau }\right\| = 0. \end{aligned}$$

Since \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\) were arbitrary, \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}\) is directionally differentiable at \(\mathbf {m}_j\) and \(\varvec{\sigma }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}) = \varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}))\) for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\). Given that \(\varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_1) = \varvec{\zeta }^{(0)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_1))\) for any \(\mathbf {M}\in \mathbb {R}^{m\times q}\), the rest of the LD-derivatives follow by induction. \(\square \)
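For the piecewise linear toy functions used below, this chain rule can be checked numerically: probing the linear piece selected by \(\mathbf {x}+ \tau \mathbf {m}_1 + \tau ^2 \mathbf {m}_2 + \cdots \) for a generic small \(\tau \) recovers an l-derivative as the Jacobian of that piece, and the LD-derivative is that Jacobian times \(\mathbf {M}\). The following sketch is our illustration; the functions rho and zeta are assumed toy examples, not from the paper:

```python
import numpy as np

def rho(x):   # piecewise linear inner function (illustrative assumption)
    return np.array([abs(x[0]), x[0] + x[1]])

def zeta(y):  # piecewise linear outer function (illustrative assumption)
    return np.array([max(y[0], y[1])])

def ld_deriv(f, x, M, tau=1e-4, h=1e-7):
    """LD-derivative f'(x; M) of a piecewise linear f, computed by probing the
    linear piece selected by x + tau*m1 + tau^2*m2 + ... and returning
    (Jacobian on that piece) @ M.  Valid for PL f and generic small tau."""
    p = x + sum(tau**(j + 1) * M[:, j] for j in range(M.shape[1]))
    fp = f(p)
    J = np.column_stack([(f(p + h * e) - fp) / h for e in np.eye(len(x))])
    return np.round(J, 6) @ M

x = np.zeros(2)
M = np.array([[1.0, 0.0],
              [-1.0, 1.0]])

lhs = ld_deriv(lambda t: zeta(rho(t)), x, M)        # sigma'(x; M), probed directly
rhs = ld_deriv(zeta, rho(x), ld_deriv(rho, x, M))   # zeta'(rho(x); rho'(x; M))
print(lhs, rhs)
assert np.allclose(lhs, rhs)                        # chain rule of Theorem 2.1
```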

Proof of Theorem 2.2

First we show the limit for the directional derivative. Given \(\mathbf {d}\in \mathbb {R}^m\) we want to show that the following limit exists in \(\mathbb {R}^p\):

$$\begin{aligned} \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }. \end{aligned}$$

We know that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\sigma }(\mathbf {x})}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\). Hence, the proof is complete if we show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }, \end{aligned}$$

which is equivalent to showing that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))}{\tau } = \mathbf {0}. \end{aligned}$$
(14)

Notice that \(\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))\) is well defined by the assumption that \(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}) \in Z\) for all \(\tau \in ]0,\delta _\mathbf {d}[\). Since \(\varvec{\zeta }\) is locally Lipschitz, the argument in the proof of Theorem 2.1 establishes (14). Also notice that

$$\begin{aligned} \varvec{\sigma }'(\mathbf {x};\mathbf {d}) = \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) =\varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d}) = \varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})). \end{aligned}$$

Given that \(\mathbf {d}\in \mathbb {R}^m\) was arbitrary, the directional derivative of \(\varvec{\zeta }\) at \(\varvec{\rho }(\mathbf {x})\) exists in all directions \(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})\) and the first function of sequence (7) is well defined with its corresponding domain \(G^{(0)}_{\mathbf {x}}= \{\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_1){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \}\). By Proposition 2.5, \(\varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}\) is globally Lipschitz on \(G^{(0)}_{\mathbf {x}}\).

Now let us assume that for \(j \in \{1,\ldots ,q\}\) and for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) = \varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)),\) and that the function \(\varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}\) is well defined in the sense of (7) and is globally Lipschitz on its domain

$$\begin{aligned} G^{(j-1)}_{\mathbf {x}}=\left\{ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_j){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \right\} . \end{aligned}$$

For brevity, let \(\mathbf {y}:= \varvec{\rho }(\mathbf {x})\) and \(\mathbf {Y}:= \varvec{\rho }^\prime (\mathbf {x};\mathbf {M})\). Also, let \({\hat{\mathbf {m}}} := \mathbf {m}_{j+1}\). We want to show that

$$\begin{aligned}&[\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}]^\prime (\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j),[\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})) \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau \left[ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\right] ^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\). We know that the following limits exist in \(\mathbb {R}^p\):

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})-\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)}{\tau }\\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})\right) -\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }. \end{aligned}$$

Analogous to the proof in Theorem 2.1, the proof is complete if we show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \;&\frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})\right) }{\tau } = \mathbf {0}. \end{aligned}$$
(15)

For \(\tau >0\) small enough, \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), and \(\mathbf {M}\in \mathbb {R}^{m \times q}\), \(\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))\) is well defined by assumption. Then, the proof of Theorem 2.1 establishes (15) and \(\varvec{\sigma }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}) = \varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}))\). Since \(\mathbf {M}\) and \(\mathbf {m}_{q+1}\) are arbitrary, the domain of \(\varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}\) is \(G^{(j)}_{\mathbf {x}}=\{\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_{j+1}){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \}\), and by Proposition 2.5 \(\varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}\) is globally Lipschitz on \(G^{(j)}_{\mathbf {x}}\). Since the case \(j=0\) was established above, the proof follows by induction. \(\square \)

Proof of Proposition 3.3

Notice that \(\mathbf {z}_0\) is not required to be in the interior of F. g is a convex, piecewise linear function on F [10] of the form \(g(\mathbf {z}) = \underset{\varvec{\lambda }\in \varLambda }{\max } \;\mathbf {z}^\text {T}\varvec{\lambda }\), where \(\varLambda \) is the finite set of all extreme points of the polyhedron \(\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0\}\). \(\varLambda \) is nonempty because \(\mathbf {A}\) has full row rank (Theorem 2.6 in [10]). Let \({\tilde{g}}{:}\,\mathbb {R}^{m} \rightarrow \mathbb {R}{:}\,\mathbf {z}\mapsto {\mathop {\max }\nolimits _{\varvec{\lambda }\in \varLambda }} \; \mathbf {z}^\text {T}\varvec{\lambda }\) and let \(\mathbf {z}\in \mathbb {R}^m\). By Proposition 2.2.7 in [13], since \({\tilde{g}}\) is a convex function that is Lipschitz near \(\mathbf {z}\), the generalized gradient \(\partial {\tilde{g}}(\mathbf {z})\) of \({\tilde{g}}\) at \(\mathbf {z}\) coincides with the subdifferential at \(\mathbf {z}\) and, for all \(\mathbf {d}\in \mathbb {R}^{m}\), \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = {\tilde{g}}^\circ (\mathbf {z};\mathbf {d})\) (the latter being the generalized directional derivative defined in Section 2 of [13]). Let \({\tilde{J}}(\mathbf {z}) := \{ \varvec{\lambda }\in \varLambda {:}\,{\tilde{g}}(\mathbf {z}) = \mathbf {z}^\text {T}\varvec{\lambda }\}\) and \(J(\mathbf {z}) := \{ \varvec{\lambda }\in \varLambda {:}\,g(\mathbf {z}) = \mathbf {z}^\text {T}\varvec{\lambda }\}\); clearly \({\tilde{J}}(\mathbf {z}) = J(\mathbf {z})\) for all \(\mathbf {z}\in F\). Given that \({\tilde{g}}\) is the pointwise maximum of convex differentiable functions, \(\partial {\tilde{g}}(\mathbf {z}) = \text {co}({\tilde{J}}(\mathbf {z}))\) (Section 3.4 in [22]) and, for \(\mathbf {z}\in F\),

$$\begin{aligned} \text{ co }({\tilde{J}}(\mathbf {z})) = \text {co}(J(\mathbf {z})) = \left\{ \varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, \mathbf {z}^\text {T}\varvec{\lambda }= g(\mathbf {z}) \right\} . \end{aligned}$$

Finally, \(\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, \mathbf {z}^\text {T}\varvec{\lambda }= g(\mathbf {z})\} =\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, -\mathbf {z}^\text {T}\varvec{\lambda }\le -g(\mathbf {z}) \}\). For \(\mathbf {z}\in F\), \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = {\tilde{g}}^\circ (\mathbf {z};\mathbf {d}) = \max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\varvec{\lambda }\in \partial {\tilde{g}}(\mathbf {z})\}\) by Proposition 2.1.2 in [13], and therefore \(\max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\varvec{\lambda }\in \partial {\tilde{g}}(\mathbf {z})\} = \max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, -\mathbf {z}^\text {T}\varvec{\lambda }\le -g(\mathbf {z})\}\). Moreover, for \(\mathbf {z}\in F\) and \(\mathbf {d}\in \mathbb {R}^{m}\) such that there exists \(\delta >0\) with \(\mathbf {z}+ \epsilon \mathbf {d}\in F\) for all \(\epsilon \in [0,\delta [\), we have \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = g^\prime (\mathbf {z};\mathbf {d})\). Hence, \(g^\prime (\mathbf {z}_0;\mathbf {d})\) is given by LP (8). \(\square \)
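LP (8) is straightforward to evaluate with an off-the-shelf LP solver. The following minimal sketch (our illustration; the data reuse the first priority of the small example in Appendix C, and scipy is assumed available) computes \(g(\mathbf {z})\), solves LP (8) for \(g^\prime (\mathbf {z}_0;\mathbf {d})\), and compares against a forward finite difference:

```python
# Numerical sketch of Proposition 3.3 / LP (8): the directional derivative of
# the LP optimal value g at z0 along d is itself an LP over the dual optimal face.
import numpy as np
from scipy.optimize import linprog

A  = np.array([[1.0, 0.0, 1.0, 0.0],
               [1.0, 1.0, 0.0, 1.0]])
c0 = np.array([-1.0, 0.0, 0.0, 0.0])
z0 = np.array([1.0, 1.0])
d  = np.array([-1.0, 0.0])

def g(z):
    # g(z) = min { c0^T v : A v = z, v >= 0 } (equals the dual max by strong duality)
    return linprog(c0, A_eq=A, b_eq=z, bounds=(0, None)).fun

# LP (8): max { d^T lam : A^T lam <= c0, -z0^T lam <= -g(z0) }
res = linprog(-d,                                    # linprog minimizes, so negate
              A_ub=np.vstack([A.T, -z0]),
              b_ub=np.append(c0, -g(z0)),
              bounds=(None, None))
g_prime = -res.fun

eps = 1e-6
print(g_prime, (g(z0 + eps * d) - g(z0)) / eps)      # both print 1.0 here
```

For this data, \(g(\mathbf {z}) = -\min (z_1,z_2)\), so both printed values equal \(-\min (d_1,d_2) = 1\), as expected.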

Proof of Proposition 3.6

Under Assumption 2.1, F is nonempty. If \(h^E_{-1}(\mathbf {z}) = 0\), then \(\mathbf {z}\in F\) because there exists \(\mathbf {v}\ge \mathbf {0}\) such that \(\mathbf {A}\mathbf {v}= \mathbf {z}\). Then LP (1) is feasible and, by Proposition 2.2, LP (2) is also feasible for all i. If \(h^E_{-1}(\mathbf {z}) = 0\), then \(\mathbf {s}_+ = \mathbf {s}_- = \mathbf {0}\), and these variables can be removed, together with the constraint \(\sum _{i=1}^m (s_{+i} + s_{-i}) = h^E_{-1}(\mathbf {z})\), from LP (12), resulting in \(h^E_i(\mathbf {z}) = h_i(\mathbf {z})\) for all \(i \in \{0,\ldots ,n_h\}\). By Proposition 2.1, F is nonempty and, for \(\mathbf {z}\in F\), \(\mathbf {h}^E(\mathbf {z}) \in \mathbb {R}^{n_h+2}\). This implies that the dual LPs of (11) and (12) are always feasible. Now let us show that (11) and (12) satisfy Assumption 3.1. Let \(n_h^p := n_h + 1\) and \(\mathbf {A}^p := \begin{bmatrix} \mathbf {A}&\mathbf {I}_m&-\mathbf {I}_m \end{bmatrix}\). Let \((\mathbf {c}_{0}^p)^\text {T}:= \begin{bmatrix}\mathbf {0}^\text {T}&\mathbf {1}^\text {T}&\mathbf {1}^\text {T}\end{bmatrix}\) and, for \(i \in \{0,\ldots ,n_h\}\), let \((\mathbf {c}_{i+1}^p)^\text {T}:= \begin{bmatrix}\mathbf {c}_i^\text {T}&\mathbf {0}^\text {T}&\mathbf {0}^\text {T}\end{bmatrix}\). Then LPs (11) and (12) can be expressed in the format of LPs (1) and (2). Since the dual LPs of (11) and (12) are always feasible, \(\mathbf {A}^p\) and \(\mathbf {c}_i^p\) for \(i \in \{0,\ldots ,n_h^p\}\) are such that \(\mathbf {h}^E(\mathbf {z})>-\varvec{\infty }\) for all \(\mathbf {z}\in \mathbb {R}^m\). Since \(\mathbf {A}\) has full row rank, \(\mathbf {A}^p\) has full row rank. Then LPs (11) and (12) satisfy Assumption 3.1, and by Proposition 3.1, \(\mathbf {h}^E\) is l-smooth at any \(\mathbf {z}\in \mathbb {R}^m\). \(\square \)
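The augmented LP behind \(h^E_{-1}\) is a standard phase-I-style construction, sketched below for concreteness (our illustration with assumed data; scipy assumed available): minimizing the total slack \(\mathbf {1}^\text {T}(\mathbf {s}_+ + \mathbf {s}_-)\) subject to \(\mathbf {A}\mathbf {v}+ \mathbf {s}_+ - \mathbf {s}_- = \mathbf {z}\) is always feasible, and the optimal value vanishes exactly when \(\mathbf {z}\in F\):

```python
# Sketch of the feasibility/penalty LP behind h^E in Proposition 3.6:
# h^E_{-1}(z) = min 1^T(s+ + s-) s.t. A v + s+ - s- = z, v, s+, s- >= 0.
# Illustrative data, not from the paper; scipy assumed available.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0]])
m, n = A.shape
Ap = np.hstack([A, np.eye(m), -np.eye(m)])            # A^p = [A  I  -I]
cp = np.concatenate([np.zeros(n), np.ones(2 * m)])    # c_0^p = [0  1  1]

def hE_minus1(z):
    return linprog(cp, A_eq=Ap, b_eq=z, bounds=(0, None)).fun

print(hE_minus1(np.array([1.0, 1.0])))    # z in F:     0.0
print(hE_minus1(np.array([-1.0, 1.0])))   # z not in F: 1.0 (infeasibility measure)
```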

Appendix B

Why Proposition 4.12 in [14] Is Not Applicable

This proposition refers to optimization problems of the form

$$\begin{aligned} \underset{\mathbf {x}\in X}{\min } \; f(\mathbf {x},\mathbf {u}), \;\; \text {s.t. } \mathbf {x}\in \varPhi , \end{aligned}$$
(16)

where \(\mathbf {u}\in U\) is a parameter vector and \(\varPhi \) is nonempty and closed. Under certain conditions, this proposition calculates the directional derivative of the optimal value function. To apply Proposition 4.12 in [14], we need to consider the duals of each LP in (1) and (2):

$$\begin{aligned} h_i(\mathbf {z}) = \underset{\varvec{\lambda }\in \mathbb {R}^{m + i}}{\max } \; [\mathbf {q}^i(\mathbf {z})]^\text {T}\varvec{\lambda },\;\; \text {s.t. } [\mathbf {A}^i]^\text {T}\varvec{\lambda }\le \mathbf {c}_i. \end{aligned}$$
(17)

For \(i \in \{0,\ldots ,n_h\}\), let \(\mathscr {D}^i\subset \mathbb {R}^{m+i}\) be the feasible set of LP (17). Since Proposition 4.12 in [14] considers a minimization problem, \(f^i(\varvec{\lambda },\mathbf {z}) = -[\mathbf {q}^i(\mathbf {z})]^\text {T}\varvec{\lambda }\). Notice that the feasible set of (17) is independent of \(\mathbf {z}\) and nonempty under Assumption 2.1 for \(i \in \{0,\ldots ,n_h\}\). The application of Proposition 4.12 in [14] requires that the inf-compactness condition is satisfied at \(\mathbf {z}\in F\). The following propositions show that the inf-compactness condition cannot be satisfied by LLPs.

Definition B.1

Consider the optimization problem (16). The inf-compactness condition holds at \(\mathbf {u}_0 \in U\) if there exists \(\alpha \in \mathbb {R}\) and a compact set \(C \subset X\) such that for every \(\mathbf {u}\) near \(\mathbf {u}_0\), the level set \(\text {lev}_\alpha f(\cdot ,\mathbf {u}) := \{ \mathbf {x}\in \varPhi {:}\,f(\mathbf {x},\mathbf {u}) \le \alpha \}\) is nonempty and contained in C.

Proposition B.1

Let Assumption 2.1 hold and consider LP (17). Then for all i, the inf-compactness condition is not satisfied at \(\mathbf {z}_0\) if \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\).

Proof

\(F_i\) is closed [1]. Assume \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\). Since \(\mathbf {q}^i(\mathbf {z}_0) \in F_i\), the solution set of LP (17) is nonempty, and since it can be described as a polyhedron, it is closed and convex. Proposition 3.5 in [15] implies that \(\mathbf {q}^i(\mathbf {z}_0) \in \text {int}(F_i)\) if and only if the solution set of LP (17) is nonempty, convex, and compact. Since \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\), the solution set of LP (17) cannot be compact and must therefore be unbounded. The solution set is the smallest nonempty level set; since it is unbounded, every nonempty level set at \(\mathbf {z}_0\) is unbounded and the inf-compactness condition is not satisfied. \(\square \)

Proposition B.2

Let Assumption 2.1 hold. For all \(i>0\), \(\mathbf {q}^i(F) \subset \text {bnd}(F_i)\).

Proof

By Proposition 2.2, \(\mathbf {q}^i(F) \subset F_i\). In addition, \(q^i_{m+i}(\mathbf {z}) = h_{i-1}(\mathbf {z})\). Let \(\mathbf {d}= -\mathbf {e}_{m+i}\). For any \(\epsilon >0\), \(\mathbf {q}^i(\mathbf {z}) + \epsilon \mathbf {d}\notin F_i\): the \((m+i)\)th component of \(\mathbf {q}^i(\mathbf {z})\) is \(h_{i-1}(\mathbf {z})\), the optimal value of the previous priority, and any value less than \(h_{i-1}(\mathbf {z})\) results in an infeasible LP. Then \(\mathbf {q}^i(\mathbf {z}) \in \text {bnd}(F_i)\) and thus \(\mathbf {q}^i(F) \subset \text {bnd}(F_i)\). \(\square \)

Hence, by Propositions B.1 and B.2, Proposition 4.12 in [14] cannot be applied to LLPs, and an alternative method to calculate the LD-derivatives of the optimal value function is required. As a simple illustration, take \(m = 1\), \(\mathbf {A}= \begin{bmatrix} 1&1 \end{bmatrix}\), and \(\mathbf {c}_0 = \mathbf {0}\): then \(F = [0,\infty [\), and at the boundary point \(z = 0\) every \(\lambda \le 0\) is optimal for the dual, so the dual solution set is unbounded and inf-compactness fails.

Appendix C

Let \(\mathbf {h}{:}\,\mathbb {R}^2_+ \rightarrow \mathbb {R}^2\) be given by:

$$\begin{aligned} \mathbf {h}(\mathbf {b}) = \text {lex } \underset{\mathbf {v}\in \mathbb {R}^2}{\min } \; \mathbf {C}^\text {T}\mathbf {v}, \text {s.t. } v_1 \le b_1, v_1 + v_2 \le b_2, v_1, v_2 \ge 0, \end{aligned}$$
(18)

with \(\mathbf {C} = -\mathbf {I}_2\), where first \(v_1\) is maximized, and then \(v_2\) is maximized. LP (18) can be transformed into standard form to obtain the form of LLP (1) and (2):

$$\begin{aligned} \mathbf {h}(\mathbf {b}) = \text {lex } \underset{\mathbf {v}\in \mathbb {R}^4}{\min } \; {\hat{\mathbf {C}}}^\text {T}\mathbf {v}, \; \text {s.t. } \begin{bmatrix} 1&0&1&0 \\ 1&1&0&1\end{bmatrix} \mathbf {v}= \mathbf {b}, \mathbf {v}\ge \mathbf {0}, \end{aligned}$$
(19)

where \({\hat{\mathbf {C}}}^\text {T} = \begin{bmatrix} -1&0&0&0 \\ 0&-1&0&0\end{bmatrix}\). For LP (18), \(F = \left\{ \mathbf {b}{:}\,\mathbf {b}\ge \mathbf {0} \right\} \). Consider \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Clearly, \(\mathbf {b}_0 \in \text {int}(F)\) and \(\mathbf {h}(\mathbf {b}_0) = [-1 \;\; 0]^\text {T}\). \(\mathbf {h}\) is not differentiable at \(\mathbf {b}_0\). Given \(\mathbf {M}\in \mathbb {R}^{2 \times 2}\) and LP (19), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M})\) can be computed with Proposition 3.4. Consider \(\mathbf {M}_1 = \mathbf {I}_2\), \(\mathbf {M}_2 = \begin{bmatrix} 1&0 \\ 0&-1\end{bmatrix}\), \(\mathbf {M}_3 = \begin{bmatrix} -1&0 \\ 0&1\end{bmatrix}\), and \(\mathbf {M}_4 = -\mathbf {I}_2\). The resulting LD-derivatives are: \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_1) = \begin{bmatrix} 0&-1 \\ 0&0\end{bmatrix}\), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_2) = \begin{bmatrix} 0&1 \\ 0&0\end{bmatrix}\), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_3) = \begin{bmatrix} 1&0 \\ -1&-1\end{bmatrix}\), and \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_4)= \begin{bmatrix} 1&0 \\ -1&1\end{bmatrix}\) (Fig. 1).

Different elements of the lexicographic subdifferential are obtained from the linear system \(\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\mathbf {M}= \mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M})\). The resulting l-derivatives are:

$$\begin{aligned}&\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_1) = \mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_2) = \begin{bmatrix} 0&-1 \\ 0&0\end{bmatrix}, \\&\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_3) =\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_4) = \begin{bmatrix} -1&0 \\ 1&-1\end{bmatrix}. \end{aligned}$$
Fig. 1 This figure shows graphically how \(\mathbf {M}_1,\mathbf {M}_2,\mathbf {M}_3\) and \(\mathbf {M}_4\) result in different LD-derivatives at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). (a) Feasible set (gray) and optimal solution point (red dot) for LP (18). If the first column of the matrix of directions is \([1 \; \; 0]^\text {T}\), (b) is obtained. In (b), \(b_1\) is increased and the optimal solution point does not change. Then, the second column of the matrix of directions can be \([0 \; \; 1]^\text {T}\) or \([0 \; \; -1]^\text {T}\), resulting in (c) and (d), respectively. In (c), the solution point changes such that \(h_0\) decreases, and in (d) it changes such that \(h_0\) increases. If the first column of the matrix of directions is \([-1 \; \; 0]^\text {T}\), (e) is obtained. In (e), the solution point changes such that \(h_0\) increases and \(h_1\) decreases. Then, the second column of the matrix of directions can be \([0 \; \; 1]^\text {T}\) or \([0 \; \; -1]^\text {T}\), resulting in (f) and (g), respectively. In (f), the solution point changes such that \(h_1\) decreases, and in (g) it changes such that \(h_1\) increases

Notice that \(\mathbf {M}_1\) and \(\mathbf {M}_2\) result in the same l-derivative matrix, as do \(\mathbf {M}_3\) and \(\mathbf {M}_4\). These two l-derivatives form the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Proposition 2.6.2 in [13] shows that for a nonscalar function \(\mathbf {h}\) evaluated at \(\mathbf {b}_0\), Clarke's generalized Jacobian is a subset of the Cartesian product of the generalized gradients of each component of \(\mathbf {h}\). In this example, the Cartesian product of the generalized gradients of \(h_0\) and \(h_1\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\) is the convex hull of \(\left\{ \begin{bmatrix} 0&-\,1 \\ 0&0 \end{bmatrix}, \begin{bmatrix} 0&-\,1 \\ 1&-\,1 \end{bmatrix}, \begin{bmatrix} -\,1&0 \\ 0&0 \end{bmatrix}, \begin{bmatrix} -\,1&0 \\ 1&-\,1 \end{bmatrix} \right\} \). However, the kinks of \(h_0\) and \(h_1\) are lined up such that the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\) contains only two matrices, and the LD-derivatives are guaranteed to produce only these two matrices (Fig. 2). The results of this example can easily be verified by expressing \(\mathbf {h}\) as:

$$\begin{aligned}&h_0(\mathbf {b}) = -\min \{b_1,b_2\} = \frac{|b_1 - b_2|-b_1 - b_2}{2} \\&h_1(\mathbf {b}) = -\max \{0,b_2-b_1\} = \frac{b_1 - b_2 - |b_1 - b_2|}{2}. \end{aligned}$$
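The following minimal sketch (our illustration; numpy assumed available) uses this closed form: because \(\mathbf {h}\) is piecewise linear, probing the linear piece selected by \(\mathbf {b}_0 + \tau \mathbf {m}_1 + \tau ^2 \mathbf {m}_2\) for a generic small \(\tau \) recovers \(\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\) by finite differences, and the LD-derivative then follows from \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}) = \mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\,\mathbf {M}\):

```python
# Numerical check of the Appendix C example via probing, using the closed form
# of h above.  Our illustration, not the authors' code.
import numpy as np

def h(b):
    return np.array([-min(b[0], b[1]), -max(0.0, b[1] - b[0])])

def lex_jacobian(b, M, tau=1e-4, h_fd=1e-7):
    """J_L h(b; M) for piecewise linear h, via probing + finite differences."""
    p = b + tau * M[:, 0] + tau**2 * M[:, 1]      # selects a linear piece
    hp = h(p)
    return np.round(np.column_stack([(h(p + h_fd * e) - hp) / h_fd
                                     for e in np.eye(2)]), 6)

b0 = np.array([1.0, 1.0])
Ms = [np.eye(2),
      np.array([[1.0, 0.0], [0.0, -1.0]]),
      np.array([[-1.0, 0.0], [0.0, 1.0]]),
      -np.eye(2)]

for M in Ms:
    JL = lex_jacobian(b0, M)
    print("J_L =\n", JL, "\nh'(b0;M) = J_L M =\n", JL @ M)
```

Running it reproduces the two l-derivative matrices and the four LD-derivatives listed above.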
Fig. 2 Surface plots of \(\mathbf {h}\) with respect to \(\mathbf {b}\). The red dots indicate the point \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Notice that both components of \(\mathbf {h}\) can be divided into two regions of differentiability with two different gradients. In particular, \(\nabla h_0(\mathbf {b}) = [-1 \; \; 0]^\text {T}\) or \(\nabla h_0(\mathbf {b}) = [0 \; \; -1]^\text {T}\), and \(\nabla h_1(\mathbf {b}) = [0 \; \; 0]^\text {T}\) or \(\nabla h_1(\mathbf {b}) = [1 \; -1]^\text {T}\). The four matrices of directions \(\mathbf {M}_1,\mathbf {M}_2,\mathbf {M}_3\) and \(\mathbf {M}_4\) probe possible combinations of these gradients at \(\mathbf {b}_0\), resulting in two different l-derivative matrices. In fact, these two matrices form the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0\). In this case, the generalized Jacobian of \(\mathbf {h}\) at \(\mathbf {b}_0\) is a strict subset of the Cartesian product of the generalized gradients of each component of \(\mathbf {h}\) at \(\mathbf {b}_0\)
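As a toy illustration of why such Jacobian elements are useful (our own sketch, not the paper's Example 4.1), a semismooth Newton iteration in the sense of [21] can use the probed l-derivative as its Newton Jacobian. Here it recovers the unique preimage of an assumed target value in one step, because the assumed starting guess already lies in the linear piece containing the solution:

```python
# Toy semismooth Newton iteration [21] on the Appendix C function h, using the
# probed l-derivative as the Newton Jacobian.  h and lex_jacobian repeat the
# sketch above so the block is self-contained.
import numpy as np

def h(b):
    return np.array([-min(b[0], b[1]), -max(0.0, b[1] - b[0])])

def lex_jacobian(b, M, tau=1e-4, h_fd=1e-7):
    p = b + tau * M[:, 0] + tau**2 * M[:, 1]
    hp = h(p)
    return np.round(np.column_stack([(h(p + h_fd * e) - hp) / h_fd
                                     for e in np.eye(2)]), 6)

y_target = h(np.array([0.5, 0.75]))      # = (-0.5, -0.25), assumed target
b = np.array([0.2, 0.9])                 # assumed starting guess

for k in range(20):
    f = h(b) - y_target
    if np.linalg.norm(f) < 1e-10:
        break
    JL = lex_jacobian(b, np.eye(2))      # an element of the B-subdifferential
    b = b - np.linalg.solve(JL, f)       # Newton step with the l-derivative

print(k, b)                              # prints: 1 [0.5  0.75]
```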

Appendix D

Model Description of Example 4.1

(a) Mass balance of biomass of C. ljungdahlii:

    $$\begin{aligned}&\frac{\partial X}{\partial t}(z,t) = \mu (z,t) X(z,t) - \frac{u_L}{\epsilon _L}\frac{\partial X}{\partial z}(z,t) + D_A \frac{\partial ^2 X}{\partial z^2}(z,t), \\&u_L X(0,t) - \epsilon _L D_A \frac{\partial X}{\partial z}(0,t) = 0, \;\; \frac{\partial X}{\partial z}(L,t) = 0, \; X(z,0) = X_0, \end{aligned}$$

    where X is the concentration of biomass, t is time, \(\mu \) is the growth rate, \(u_L\) is the liquid velocity, z is the spatial position, \(D_A\) is the diffusivity, \(\epsilon _L\) is the liquid volume fraction in the reactor, and \(X_0\) is the initial biomass concentration.

(b) Mole balances of liquid-phase CO, H\(_2\), CO\(_2\), ethanol, and acetate:

    $$\begin{aligned}&\frac{\partial G_L}{\partial t}(z,t) = v_G(z,t) X(z,t) + \frac{k_{m,G}}{\epsilon _L}(G^*(z,t) - G_L(z,t)) \\&\quad -\,\frac{u_L}{\epsilon _L} \frac{\partial G_L}{\partial z}(z,t) + D_A \frac{\partial ^2 G_L}{\partial z^2}(z,t), \\&u_L G_L(0,t) - \epsilon _L D_A \frac{\partial G_L}{\partial z}(0,t) = u_L G_{gF} H_G, \; \frac{\partial G_L}{\partial z}(L,t) = 0, \; G_L(z,0) = G_{L0}, \end{aligned}$$

where G stands for each of CO, H\(_2\), CO\(_2\), ethanol, and acetate; \(v_G\) is the exchange flux rate for species G, \(k_{m,G}\) the liquid mass transfer coefficient for species G, \(G_L\) the liquid concentration of species G, \(G^*\) the liquid concentration in equilibrium with the gas concentration of species G, \(G_{gF}\) the gas concentration of species G in the feed, \(G_{L0}\) the initial concentration of species G in the liquid, and \(H_G\) Henry's constant for species G.

(c) Mole balances of gas-phase CO, H\(_2\), and CO\(_2\):

    $$\begin{aligned} \frac{\partial G_g}{\partial t}(z,t)&= - \frac{k_{m,G}}{\epsilon _g}(G^*(z,t) - G_L(z,t)) - \frac{u_g}{\epsilon _g} \frac{\partial G_g}{\partial z}(z,t), \\ G_g(0,t)&= G_{gF}, G_g(z,0) = G_{g0}, \end{aligned}$$

    where \(G_g\) is the concentration of species G in the gas, \(\epsilon _g\) is the gas volume fraction, \(u_g\) is the gas velocity, and \(G_{g0}\) is the initial gas concentration of species G.

(d) Column pressure profile: \(P(z) = P_L + \rho _L g(L-z)\), where P is the pressure as a function of position in the column, \(P_L\) is the pressure at the top of the column, \(\rho _L\) is the density of the liquid, L is the height of the column, and g is gravitational acceleration.

The spatial dimension of the bubble column was discretized using a finite volume method with 100 nodes. The state vector for each finite volume is:

$$\begin{aligned} \mathbf {x}= [X, \text {CO}_L, \text {CO}_g, \text {H}_{2g}, \text {H}_{2L}, \text {CO}_{2L}, \text {CO}_{2g}, A, E ], \end{aligned}$$

where A stands for acetate and E for ethanol. To obtain the growth rate \(\mu \) and the exchange flux rates \(v_G\), this problem is transformed into a DFBA problem using LP (13).
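To make the discretization concrete, the following minimal method-of-lines sketch (our illustration, not the authors' code; the numerical values are placeholders, see the parameter list below) assembles the right-hand side of the biomass balance (a) on a uniform finite volume grid, with upwind advection, central diffusion, and Danckwerts-type boundary conditions:

```python
# Minimal method-of-lines sketch of balance (a) on a uniform finite volume grid
# (100 cells, as in the paper).  mu is a placeholder here; in the paper it comes
# from the embedded LLP (13).
import numpy as np

N, L = 100, 25.0                     # cells, column height [m]
dz = L / N
u_L, eps_L, D_A = 0.25, 0.5, 0.25    # eps_L is illustrative; see list below

def biomass_rhs(X, mu):
    """dX/dt for the biomass; X has one value per cell, mu per cell [1/h]."""
    a = u_L / eps_L
    # advective flux at cell faces (upwind); inlet face carries no biomass,
    # matching u_L X(0,t) - eps_L D_A dX/dz(0,t) = 0
    F_adv = np.concatenate([[0.0], a * X])            # faces 0..N
    # diffusive flux at interior faces; zero at both boundaries
    # (zero total flux at the inlet, dX/dz(L,t) = 0 at the outlet)
    F_dif = np.zeros(N + 1)
    F_dif[1:N] = -D_A * (X[1:] - X[:-1]) / dz
    F = F_adv + F_dif
    return mu * X - (F[1:] - F[:-1]) / dz             # growth minus flux divergence

X = np.full(N, 0.1)                  # initial biomass [g/L], illustrative
print(biomass_rhs(X, mu=np.full(N, 0.2))[:3])
```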

The resulting nonsmooth system of equations was solved for the following parameter values:

$$\begin{aligned}&u_g = 75\,\mathrm{m/h}, L = 25\,\mathrm{m}, u_L = 0.25\,\mathrm{m/h}, D_A = 0.25\,\mathrm{m}^2/\mathrm{h}, T = 310.15\,\mathrm{K}, \\&P_L = 1\,\mathrm{atm}, \rho _L = 1000\,\mathrm{kg/m}^3, P_{\mathrm{CO}} = 0.6\,P_0, P_{\mathrm{H}_2} = 0.4\,P_0, \\&G_{\mathrm{gF}} = \frac{P_G}{R \, T}, H_{\mathrm{CO}} = 8.0 \times 10^{-4} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, H_{\mathrm{H}_2} = 6.6 \times 10^{-4} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, \\&H_{\mathrm{CO}_2} = 2.5 \times 10^{-2} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, k_{m,\mathrm{CO}_2} = k_{m,\mathrm{CO}} = 80/\mathrm{h}, k_{m,\mathrm{H}_2} = 2.5\,k_{m,\mathrm{CO}}, \\&k_{m,E} = k_{m,A} = 0, \epsilon _g = u_g \frac{0.53}{3600(0.15 + u_g/3600)}, \epsilon _L = 1 - \epsilon _g, \\&v_{\mathrm{max},\mathrm{CO}} = v_{\mathrm{max},\mathrm{CO}_2}= 35 \frac{\mathrm{mmol}}{\mathrm{g} \times \mathrm{h}}, v_{\mathrm{max},\mathrm{H}_2} = 70 \frac{\mathrm{mmol}}{\mathrm{g} \times \mathrm{h}}, \\&K_{m,\mathrm{CO} }= K_{m,\mathrm{CO}_2} = K_{m,\mathrm{H}_2} = 0.02\,\mathrm{mmol/L}, K_{I} = 10\,\mathrm{mmol/L}, \end{aligned}$$

where \(P_0\) is the pressure at the bottom of the column, \(P_G\) is the partial pressure of species G, T is the temperature in Kelvin, and R is the universal gas constant. This system comprises 901 equations (the last equation corresponds to the penalty) and 100 LLPs, each with 682 equality constraints, 1715 variables, and 7 objective functions.


Cite this article

Gomez, J.A., Höffner, K., Khan, K.A. et al. Generalized Derivatives of Lexicographic Linear Programs. J Optim Theory Appl 178, 477–501 (2018). https://doi.org/10.1007/s10957-018-1309-2
