Generalized Derivatives of Lexicographic Linear Programs


Abstract

Lexicographic linear programs are fixed-priority multiobjective linear programs, useful for modeling biological systems via flux balance analysis and for goal-programming problems. The objective function values of a lexicographic linear program, viewed as a function of its right-hand side, are nonsmooth. This work derives generalized derivative information for lexicographic linear programs, using lexicographic directional derivatives to obtain elements of the Bouligand subdifferential (limiting Jacobian). It is shown that elements of the limiting Jacobian can be obtained by solving related linear programs. A nonsmooth equation-solving problem is solved to illustrate the benefits of using elements of the limiting Jacobian of lexicographic linear programs.


References

1. Harwood, S.M., Höffner, K., Barton, P.I.: Efficient solution of ordinary differential equations with a parametric lexicographic linear program embedded. Numer. Math. 133(4), 623–653 (2016)

2. Höffner, K., Harwood, S.M., Barton, P.I.: A reliable simulator for dynamic flux balance analysis. Biotechnol. Bioeng. 110(3), 792–802 (2013)

3. Ignizio, J.P.: Linear Programming in Single- & Multiple-Objective Systems. Prentice Hall, Englewood Cliffs (1982)

4. Höffner, K., Barton, P.I.: Design of microbial consortia for industrial biotechnology. Comput. Aided Chem. Eng. 34, 65–74 (2014)

5. Von Stackelberg, H.: The Theory of the Market Economy. Oxford University Press, Oxford (1952)

6. Ferris, M.C., Pang, J.S.: Engineering and economic applications of complementarity problems. SIAM Rev. 39(4), 669–713 (1997)

7. Nesterov, Y.: Lexicographic differentiation of nonsmooth functions. Math. Program. 104(2–3), 669–700 (2005)

8. Khan, K.A., Barton, P.I.: A vector forward mode of automatic differentiation for generalized derivative evaluation. Optim. Methods Softw. 30(6), 1185–1212 (2015)

9. Scholtes, S.: Introduction to Piecewise Differentiable Equations. Springer, New York (2012)

10. Bertsimas, D., Tsitsiklis, J.N.: Introduction to Linear Optimization. Athena Scientific, Belmont (1997)

11. Khan, K.A., Barton, P.I.: Generalized derivatives for solutions of parametric ordinary differential equations with non-differentiable right-hand sides. J. Optim. Theory Appl. 163(2), 355–386 (2014)

12. Barton, P.I., Khan, K.A., Stechlinski, P., Watson, H.A.: Computationally relevant generalized derivatives: theory, evaluation and applications. Optim. Methods Softw. (2017). https://doi.org/10.1080/10556788.2017.1374385

13. Clarke, F.H.: Optimization and Nonsmooth Analysis. SIAM, Philadelphia (1990)

14. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2013)

15. Höffner, K., Khan, K.A., Barton, P.I.: Generalized derivatives of dynamic systems with a linear program embedded. Automatica 63, 198–208 (2016)

16. Gomez, J.A., Höffner, K., Barton, P.I.: DFBAlab: a fast and reliable MATLAB code for dynamic flux balance analysis. BMC Bioinform. 15, 409 (2014)

17. IBM ILOG CPLEX: 12.7 User's Manual. https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.studio.help/pdf/usrcplex.pdf (2017)

18. Gurobi Optimization: Gurobi Optimizer Reference Manual. http://www.gurobi.com/documentation/ (2017)

19. Chen, J., Gomez, J.A., Höffner, K., Barton, P.I., Henson, M.A.: Metabolic modeling of synthesis gas fermentation in bubble column reactors. Biotechnol. Biofuels 8, 89 (2015)

20. Chen, J., Gomez, J.A., Höffner, K., Phalak, P., Barton, P.I., Henson, M.A.: Spatiotemporal modeling of microbial metabolism. BMC Syst. Biol. 10, 21 (2016)

21. Qi, L., Sun, J.: A nonsmooth version of Newton's method. Math. Program. 58(1–3), 353–367 (1993)

22. Boyd, S., Vandenberghe, L.: Subgradients: Notes for EE364b, Stanford University, Winter 2006–07. https://see.stanford.edu/materials/lsocoee364b/01-subgradients_notes.pdf (2008)


Acknowledgements

The authors would like to acknowledge Dr. Stuart Harwood for the insightful discussions that took place while preparing the paper.

Author information


Corresponding author

Correspondence to Paul I. Barton.

Additional information

Communicated by Alexander Martin.

Appendices


Appendix A

Proof of Proposition 2.1

First, we show that if \(\text {int}(F) \ne \emptyset \), then \(\mathbf {A}\) has full row rank. Assume \(\mathbf {A}\) does not have full row rank and \(\text {int}(F) \ne \emptyset \). Then at least one row of \(\mathbf {A}\) is linearly dependent on the other rows; without loss of generality, assume it is the last row. Then every \(\mathbf {b}\in F\) satisfies the same linear relation among its components: the last component of any \(\mathbf {b}= \mathbf {A}\mathbf {v}\) is determined by the first \(m-1\) components. Let \(\mathbf {b}_0 \in \text {int}(F)\). Any small perturbation of the last component of \(\mathbf {b}_0\), keeping all other components constant, violates this relation and makes LP (1) infeasible. Hence \(\mathbf {b}_0 \notin \text {int}(F)\), a contradiction.

For the converse, assume \(\mathbf {A}\) has full row rank. Let \(\mathbf {v}= \mathbf {1}\) and \(\mathbf {b}= \mathbf {A}\mathbf {v}\), so that \(\mathbf {b}\in F\). Let \(\mathbf {d}\in \mathbb {R}^m\) with \(\Vert \mathbf {d}\Vert = 1\), without loss of generality, and let \(\epsilon > 0\). We seek \(\Delta \mathbf {v}\) satisfying \(\mathbf {A}(\mathbf {v}+ \Delta \mathbf {v}) = \mathbf {b}+ \epsilon \mathbf {d}\). Since \(\mathbf {A}\mathbf {v}= \mathbf {b}\), it suffices that \(\mathbf {A}\Delta \mathbf {v}= \epsilon \mathbf {d}\). Since \(\mathbf {A}\) has full row rank, its columns span \(\mathbb {R}^m\), so some m columns of \(\mathbf {A}\) form a basis of \(\mathbb {R}^m\); collect them in \(\tilde{\mathbf {A}}\) and, without loss of generality, let these be the first m columns of \(\mathbf {A}\). Let \(\Delta {\tilde{\mathbf {v}}} = \epsilon \tilde{\mathbf {A}}^{-1}\mathbf {d}\) and set \(\Delta v_j = \Delta {\tilde{v}}_j\) if \(j \in \{1,\ldots ,m\}\) and \(\Delta v_j = 0\) otherwise. By construction, \(\Delta \mathbf {v}\) satisfies \(\mathbf {A}(\mathbf {v}+ \Delta \mathbf {v}) = \mathbf {b}+ \epsilon \mathbf {d}\). Moreover, \(\Vert \Delta \mathbf {v}\Vert = \Vert \Delta {\tilde{\mathbf {v}}} \Vert \le \epsilon \Vert \tilde{\mathbf {A}}^{-1} \Vert \Vert \mathbf {d}\Vert = \epsilon \Vert \tilde{\mathbf {A}}^{-1} \Vert \), so for small enough \(\epsilon >0\), \(\Vert \Delta \mathbf {v}\Vert < 1\). Then every component satisfies \(| \Delta v_j |<1\), so \(\mathbf {v}+ \Delta \mathbf {v}> \mathbf {0}\) and \(\mathbf {b}+ \epsilon \mathbf {d}\in F\). Since \(F = \{\mathbf {A}\mathbf {v}{:}\,\mathbf {v}\ge \mathbf {0} \}\) is a nonempty convex set [10], \(\mathbf {b}+ \delta \mathbf {d}\in F\) for all \(\delta \in ]0,\epsilon [\). Since \(\mathbf {d}\) was arbitrary, \(\mathbf {b}\in \text {int}(F)\) and \(\text {int}(F) \ne \emptyset \). \(\square \)
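The construction above is easy to check numerically. The following minimal sketch (our illustration with randomly generated data, not part of the original proof; numpy assumed available) builds \(\Delta \mathbf {v}\) from a basis submatrix exactly as in the proof and verifies that \(\mathbf {b}+ \epsilon \mathbf {d}\) stays in F:

```python
# Minimal numerical sketch of the construction in the proof of Proposition 2.1.
# All data below are illustrative assumptions, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))           # full row rank with probability 1
assert np.linalg.matrix_rank(A) == m

v = np.ones(n)                            # v = 1, so b = A v lies in F
b = A @ v

d = rng.standard_normal(m)
d /= np.linalg.norm(d)                    # ||d|| = 1, arbitrary direction

A_tilde = A[:, :m]                        # a basis: first m columns (assumed invertible here)
eps = 0.5 / np.linalg.norm(np.linalg.inv(A_tilde))   # small enough that ||dv|| < 1

dv = np.zeros(n)
dv[:m] = eps * np.linalg.solve(A_tilde, d)           # dv_tilde = eps * A_tilde^{-1} d

assert np.allclose(A @ (v + dv), b + eps * d)        # A(v + dv) = b + eps d
assert np.all(v + dv > 0)                            # v + dv > 0, so b + eps d is in F
print("b + eps*d remains in F along this direction")
```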

Proof of Proposition 2.3

Notice that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d})\) holds for all \(\epsilon \in [0,\delta ^*_\mathbf {d}[\). From Lemma 2.1 there exists \(\delta >0\) such that \(\varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}) + \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\) for all \(\Vert \epsilon \mathbf {d}\Vert < \delta \). Let \(\delta _\mathbf {d}= \min (\delta ,\delta ^*_\mathbf {d})\). Given that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\theta }(\mathbf {x}+ \epsilon \mathbf {d})\) for all \(\epsilon \in ]0,\delta _\mathbf {d}[\), it follows that \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\rho }(\mathbf {x}) + \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\) for all \(\epsilon \in ]0,\delta _\mathbf {d}[\). Next, we show that \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\) exists and equals \(\varvec{\theta }^\prime (\mathbf {x}; \mathbf {d})\). Consider any sequence \(\{\tau _i\}^\infty _{i=0}\) with \(\tau _i \in ]0,\delta _\mathbf {d}[\) and \(\tau _i \downarrow 0\). From the definition of the directional derivative,

$$\begin{aligned} \underset{i \rightarrow \infty }{\lim }\frac{\varvec{\theta }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\theta }(\mathbf {x})}{\tau _i} =\varvec{\theta }^\prime (\mathbf {x};\mathbf {d}). \end{aligned}$$

Also, for all \(\tau _i\),

$$\begin{aligned} \frac{\varvec{\theta }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\theta }(\mathbf {x})}{\tau _i} = \frac{\varvec{\rho }(\mathbf {x}+\tau _i\mathbf {d}) - \varvec{\rho }(\mathbf {x})}{\tau _i}. \end{aligned}$$

Therefore, \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\) exists and \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) = \varvec{\theta }^\prime (\mathbf {x}; \mathbf {d})\).

From Eq. (4), \(\varvec{\rho }^\prime (\mathbf {x};\epsilon \mathbf {d}) = \epsilon \varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) = \epsilon \varvec{\theta }^\prime (\mathbf {x}; \mathbf {d}) = \varvec{\theta }^\prime (\mathbf {x}; \epsilon \mathbf {d})\). Hence for all \(\epsilon \in ]0,\delta _\mathbf {d}[\), \(\varvec{\rho }(\mathbf {x}+ \epsilon \mathbf {d}) = \varvec{\rho }(\mathbf {x}) + \varvec{\rho }^\prime (\mathbf {x}; \epsilon \mathbf {d})\). \(\square \)

Proof of Theorem 2.1

Since \(\varvec{\rho }\) is l-smooth, it is in particular Lipschitz continuous near \(\mathbf {x}\). Since \(\varvec{\zeta }\) is locally Lipschitz continuous, the composition \(\varvec{\sigma }:= \varvec{\zeta }\circ \varvec{\rho }\) is Lipschitz continuous near \(\mathbf {x}\). First, we show that \(\varvec{\sigma }\) is directionally differentiable at \(\mathbf {x}\).

We need to show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\sigma }(\mathbf {x})}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\) for all \(\mathbf {d}\in \mathbb {R}^m\). First,

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\rho }(\mathbf {x}) - \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})}{\tau }=\mathbf {0}\end{aligned}$$

from the definition of the directional derivative. Next, given \(\mathbf {d}\in \mathbb {R}^m\),

$$\begin{aligned} \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\) by assumption. We show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }, \end{aligned}$$

which is equivalent to showing that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))}{\tau } = \mathbf {0}. \end{aligned}$$

From the existence of the directional derivative \(\varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d}))\), there exists \(\epsilon _1>0\) such that \(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d}) \in Z\) for all \(\tau \in ]0,\epsilon _1[\). Since \(\varvec{\zeta }\) is locally Lipschitz continuous, there exist \(\epsilon _2>0\) and \(K>0\) such that

$$\begin{aligned}&\left\| \varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})) \right\| \le K\left\| \varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d})) \right\| ,\\ \end{aligned}$$

for all \(\tau \in ]0,\epsilon _1[\) such that \(\Vert \varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) \Vert < \epsilon _2\). Dividing both sides by \(\tau > 0\) gives

$$\begin{aligned}&\left\| \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| \le K \left\| \frac{\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| , \end{aligned}$$

for all \(\tau \in ]0,\epsilon _1[\) such that \(\Vert \varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) \Vert < \epsilon _2 \). By the existence of \(\varvec{\rho }^\prime (\mathbf {x};\mathbf {d})\),

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\rho }(\mathbf {x}+\tau \mathbf {d}) - (\varvec{\rho }(\mathbf {x})+\tau \varvec{\rho }^\prime (\mathbf {x};\mathbf {d})) }{\tau } \right\| = 0, \end{aligned}$$

and then

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d})) -\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))}{\tau }\right\| = 0. \end{aligned}$$

Since \(\mathbf {d}\in \mathbb {R}^m\) was arbitrary, \(\varvec{\sigma }\) is directionally differentiable at \(\mathbf {x}\) and

$$\begin{aligned} \varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d}) = \varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})). \end{aligned}$$

Now let us assume \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) = \varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))\) for \(j \in \{1,\ldots ,q\}\), for all \(\mathbf {M}\in \mathbb {R}^{m\times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\). Let \(\mathbf {y}:= \varvec{\rho }(\mathbf {x})\), \(\mathbf {Y}:= \varvec{\rho }^\prime (\mathbf {x};\mathbf {M})\), and \({\hat{\mathbf {m}}} := \mathbf {m}_{j+1}\). We need to show that the following limit exists in \(\mathbb {R}^p\) for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\):

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})-\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)}{\tau } \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }. \end{aligned}$$

Given \(\mathbf {M}\in \mathbb {R}^{m\times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), since \(\varvec{\zeta }\) is \(\varvec{\rho }-\)weakly l-smooth at \(\mathbf {x}\), we know the following limit exists in \(\mathbb {R}^p\):

$$\begin{aligned}&[\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}]^\prime (\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j),[\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})) \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }, \end{aligned}$$

and \(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) \in G^{(j-1)}_{\mathbf {x}}\) for \(\tau >0\) small enough. We also know from the definition of l-smoothness that:

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\left( \mathbf {m}_j + \tau {\hat{\mathbf {m}}}\right) -\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) - \tau \left[ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\right] ^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})\right) }{\tau }=\mathbf {0}. \end{aligned}$$

We will show that

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))}{\tau } \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }, \end{aligned}$$

which is equivalent to

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau } = \mathbf {0}. \end{aligned}$$

From Corollary 2.2, \(\varvec{\zeta }^{(k)}_{\mathbf {y},\mathbf {Y}}\) is globally Lipschitz continuous for all \(k \in \{0,\ldots ,q\}\). Then, there exists \(\delta >0\) such that \(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) \in G^{(j-1)}_{\mathbf {x}}\) for all \(\tau \in [0,\delta [\), and there exists \(K>0\) such that, for all \(\tau \in ]0,\delta [\),

$$\begin{aligned}&K \left\| \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) - \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})}{\tau }\right\| \\&\quad \ge \left\| \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau }\right\| . \end{aligned}$$

By the existence of the directional derivative \([\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})\),

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}) - \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})}{\tau }\right\| = 0, \end{aligned}$$

and then

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \left\| \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) + \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}}))}{\tau }\right\| = 0. \end{aligned}$$

Since \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\) were arbitrary, \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}\) is directionally differentiable at \(\mathbf {m}_j\) and \(\varvec{\sigma }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}) = \varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}))\) for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\). Given that \(\varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_1) = \varvec{\zeta }^{(0)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_1))\) for any \(\mathbf {M}\in \mathbb {R}^{m\times q}\), the rest of the LD-derivatives follow by induction. \(\square \)
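For the piecewise linear toy functions used below, this chain rule can be checked numerically: probing the linear piece selected by \(\mathbf {x}+ \tau \mathbf {m}_1 + \tau ^2 \mathbf {m}_2 + \cdots \) for a generic small \(\tau \) recovers an l-derivative as the Jacobian of that piece, and the LD-derivative is that Jacobian times \(\mathbf {M}\). The following sketch is our illustration; the functions rho and zeta are assumed toy examples, not from the paper:

```python
import numpy as np

def rho(x):   # piecewise linear inner function (illustrative assumption)
    return np.array([abs(x[0]), x[0] + x[1]])

def zeta(y):  # piecewise linear outer function (illustrative assumption)
    return np.array([max(y[0], y[1])])

def ld_deriv(f, x, M, tau=1e-4, h=1e-7):
    """LD-derivative f'(x; M) of a piecewise linear f, computed by probing the
    linear piece selected by x + tau*m1 + tau^2*m2 + ... and returning
    (Jacobian on that piece) @ M.  Valid for PL f and generic small tau."""
    p = x + sum(tau**(j + 1) * M[:, j] for j in range(M.shape[1]))
    fp = f(p)
    J = np.column_stack([(f(p + h * e) - fp) / h for e in np.eye(len(x))])
    return np.round(J, 6) @ M

x = np.zeros(2)
M = np.array([[1.0, 0.0],
              [-1.0, 1.0]])

lhs = ld_deriv(lambda t: zeta(rho(t)), x, M)        # sigma'(x; M), probed directly
rhs = ld_deriv(zeta, rho(x), ld_deriv(rho, x, M))   # zeta'(rho(x); rho'(x; M))
print(lhs, rhs)
assert np.allclose(lhs, rhs)                        # chain rule of Theorem 2.1
```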

Proof of Theorem 2.2

First we show the limit for the directional derivative. Given \(\mathbf {d}\in \mathbb {R}^m\) we want to show that the following limit exists in \(\mathbb {R}^p\):

$$\begin{aligned} \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }. \end{aligned}$$

We know that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }(\mathbf {x}+ \tau \mathbf {d})-\varvec{\sigma }(\mathbf {x})}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\). Hence, the proof is complete if we show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau } = \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}))}{\tau }, \end{aligned}$$

which is equivalent to showing that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))-\varvec{\zeta }(\varvec{\rho }(\mathbf {x}+ \tau \mathbf {d}))}{\tau } = \mathbf {0}. \end{aligned}$$
(14)

Notice that \(\varvec{\zeta }(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}))\) is well defined by the assumption that \(\varvec{\rho }(\mathbf {x}) + \tau \varvec{\rho }'(\mathbf {x};\mathbf {d}) \in Z\) for all \(\tau \in ]0,\delta _\mathbf {d}[\). Since \(\varvec{\zeta }\) is locally Lipschitz, the argument in the proof of Theorem 2.1 establishes (14). Also notice that

$$\begin{aligned} \varvec{\sigma }'(\mathbf {x};\mathbf {d}) = \varvec{\zeta }^\prime (\varvec{\rho }(\mathbf {x});\varvec{\rho }'(\mathbf {x};\mathbf {d})) =\varvec{\sigma }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d}) = \varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})). \end{aligned}$$

Given that \(\mathbf {d}\in \mathbb {R}^m\) was arbitrary, the directional derivative of \(\varvec{\zeta }\) at \(\varvec{\rho }(\mathbf {x})\) exists in all directions \(\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {M}}(\mathbf {d})\) and the first function of sequence (7) is well defined with its corresponding domain \(G^{(0)}_{\mathbf {x}}= \{\varvec{\rho }^{(0)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_1){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \}\). By Proposition 2.5, \(\varvec{\zeta }^{(0)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}\) is globally Lipschitz on \(G^{(0)}_{\mathbf {x}}\).

Now let us assume that for \(j \in \{1,\ldots ,q\}\) and for all \(\mathbf {M}\in \mathbb {R}^{m \times q}\) and \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), \(\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j) = \varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)),\) and that the function \(\varvec{\zeta }^{(j-1)}_{\varvec{\rho }(\mathbf {x}),\varvec{\rho }^\prime (\mathbf {x};\mathbf {M})}\) is well defined in the sense of (7) and is globally Lipschitz on its domain

$$\begin{aligned} G^{(j-1)}_{\mathbf {x}}=\left\{ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_j){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \right\} . \end{aligned}$$

For brevity, let \(\mathbf {y}:= \varvec{\rho }(\mathbf {x})\) and \(\mathbf {Y}:= \varvec{\rho }^\prime (\mathbf {x};\mathbf {M})\). Also, let \({\hat{\mathbf {m}}} := \mathbf {m}_{j+1}\). We want to show that

$$\begin{aligned}&[\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}]^\prime (\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j),[\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}})) \\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau \left[ \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}\right] ^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j))}{\tau } \end{aligned}$$

exists in \(\mathbb {R}^p\). We know that the following limits exist in \(\mathbb {R}^p\):

$$\begin{aligned}&\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})-\varvec{\sigma }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)}{\tau }\\&\quad =\underset{\tau \downarrow 0}{\lim } \; \frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})\right) -\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)\right) }{\tau }. \end{aligned}$$

Analogous to the proof in Theorem 2.1, the proof is complete if we show that

$$\begin{aligned} \underset{\tau \downarrow 0}{\lim } \;&\frac{\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))-\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}\left( \varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j + \tau {\hat{\mathbf {m}}})\right) }{\tau } = \mathbf {0}. \end{aligned}$$
(15)

For \(\tau >0\) small enough, \(\mathbf {m}_{q+1} \in \mathbb {R}^m\), and \(\mathbf {M}\in \mathbb {R}^{m \times q}\), \(\varvec{\zeta }^{(j-1)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}(\mathbf {m}_j)+ \tau [\varvec{\rho }^{(j-1)}_{\mathbf {x},\mathbf {M}}]^\prime (\mathbf {m}_j;{\hat{\mathbf {m}}}))\) is well defined by assumption. Then, the proof of Theorem 2.1 establishes (15) and \(\varvec{\sigma }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}) = \varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}(\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {M}}({\hat{\mathbf {m}}}))\). Since \(\mathbf {M}\) and \(\mathbf {m}_{q+1}\) are arbitrary, the domain of \(\varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}\) is \(G^{(j)}_{\mathbf {x}}=\{\varvec{\rho }^{(j)}_{\mathbf {x},\mathbf {Z}}(\mathbf {z}_{j+1}){:}\,\mathbf {Z} \in \mathbb {R}^{m \times q}, \mathbf {z}_{q+1} \in \mathbb {R}^m \}\), and by Proposition 2.5 \(\varvec{\zeta }^{(j)}_{\mathbf {y},\mathbf {Y}}\) is globally Lipschitz on \(G^{(j)}_{\mathbf {x}}\). Since the case \(j=0\) was established above, the proof follows by induction. \(\square \)

Proof of Proposition 3.3

Notice that \(\mathbf {z}_0\) is not required to be in the interior of F. g is a convex, piecewise linear function on F [10] of the form \(g(\mathbf {z}) = \underset{\varvec{\lambda }\in \varLambda }{\max } \;\mathbf {z}^\text {T}\varvec{\lambda }\), where \(\varLambda \) is the finite set of all extreme points of the polyhedron \(\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0\}\). \(\varLambda \) is nonempty because \(\mathbf {A}\) has full row rank (Theorem 2.6 in [10]). Let \({\tilde{g}}{:}\,\mathbb {R}^{m} \rightarrow \mathbb {R}{:}\,\mathbf {z}\mapsto {\mathop {\max }\nolimits _{\varvec{\lambda }\in \varLambda }} \; \mathbf {z}^\text {T}\varvec{\lambda }\) and let \(\mathbf {z}\in \mathbb {R}^m\). By Proposition 2.2.7 in [13], since \({\tilde{g}}\) is a convex function that is Lipschitz near \(\mathbf {z}\), the generalized gradient \(\partial {\tilde{g}}(\mathbf {z})\) of \({\tilde{g}}\) at \(\mathbf {z}\) coincides with the subdifferential at \(\mathbf {z}\) and, for all \(\mathbf {d}\in \mathbb {R}^{m}\), \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = {\tilde{g}}^\circ (\mathbf {z};\mathbf {d})\) (the latter being the generalized directional derivative defined in Section 2 of [13]). Let \({\tilde{J}}(\mathbf {z}) := \{ \varvec{\lambda }\in \varLambda {:}\,{\tilde{g}}(\mathbf {z}) = \mathbf {z}^\text {T}\varvec{\lambda }\}\) and \(J(\mathbf {z}) := \{ \varvec{\lambda }\in \varLambda {:}\,g(\mathbf {z}) = \mathbf {z}^\text {T}\varvec{\lambda }\}\); clearly \({\tilde{J}}(\mathbf {z}) = J(\mathbf {z})\) for all \(\mathbf {z}\in F\). Given that \({\tilde{g}}\) is the pointwise maximum of convex differentiable functions, \(\partial {\tilde{g}}(\mathbf {z}) = \text {co}({\tilde{J}}(\mathbf {z}))\) (Section 3.4 in [22]) and, for \(\mathbf {z}\in F\),

$$\begin{aligned} \text{ co }({\tilde{J}}(\mathbf {z})) = \text {co}(J(\mathbf {z})) = \left\{ \varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, \mathbf {z}^\text {T}\varvec{\lambda }= g(\mathbf {z}) \right\} . \end{aligned}$$

Finally, \(\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, \mathbf {z}^\text {T}\varvec{\lambda }= g(\mathbf {z})\} =\{\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, -\mathbf {z}^\text {T}\varvec{\lambda }\le -g(\mathbf {z}) \}\). For \(\mathbf {z}\in F\), \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = {\tilde{g}}^\circ (\mathbf {z};\mathbf {d}) = \max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\varvec{\lambda }\in \partial {\tilde{g}}(\mathbf {z})\}\) by Proposition 2.1.2 in [13], and therefore \(\max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\varvec{\lambda }\in \partial {\tilde{g}}(\mathbf {z})\} = \max \{\mathbf {d}^\text {T}\varvec{\lambda }{:}\,\mathbf {A}^\text {T}\varvec{\lambda }\le \mathbf {c}_0, -\mathbf {z}^\text {T}\varvec{\lambda }\le -g(\mathbf {z})\}\). Moreover, for \(\mathbf {z}\in F\) and \(\mathbf {d}\in \mathbb {R}^{m}\) such that there exists \(\delta >0\) with \(\mathbf {z}+ \epsilon \mathbf {d}\in F\) for all \(\epsilon \in [0,\delta [\), we have \({\tilde{g}}^\prime (\mathbf {z};\mathbf {d}) = g^\prime (\mathbf {z};\mathbf {d})\). Hence, \(g^\prime (\mathbf {z}_0;\mathbf {d})\) is given by LP (8). \(\square \)
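LP (8) is straightforward to evaluate with an off-the-shelf LP solver. The following minimal sketch (our illustration; the data reuse the first priority of the small example in Appendix C, and scipy is assumed available) computes \(g(\mathbf {z})\), solves LP (8) for \(g^\prime (\mathbf {z}_0;\mathbf {d})\), and compares against a forward finite difference:

```python
# Numerical sketch of Proposition 3.3 / LP (8): the directional derivative of
# the LP optimal value g at z0 along d is itself an LP over the dual optimal face.
import numpy as np
from scipy.optimize import linprog

A  = np.array([[1.0, 0.0, 1.0, 0.0],
               [1.0, 1.0, 0.0, 1.0]])
c0 = np.array([-1.0, 0.0, 0.0, 0.0])
z0 = np.array([1.0, 1.0])
d  = np.array([-1.0, 0.0])

def g(z):
    # g(z) = min { c0^T v : A v = z, v >= 0 } (equals the dual max by strong duality)
    return linprog(c0, A_eq=A, b_eq=z, bounds=(0, None)).fun

# LP (8): max { d^T lam : A^T lam <= c0, -z0^T lam <= -g(z0) }
res = linprog(-d,                                    # linprog minimizes, so negate
              A_ub=np.vstack([A.T, -z0]),
              b_ub=np.append(c0, -g(z0)),
              bounds=(None, None))
g_prime = -res.fun

eps = 1e-6
print(g_prime, (g(z0 + eps * d) - g(z0)) / eps)      # both print 1.0 here
```

For this data, \(g(\mathbf {z}) = -\min (z_1,z_2)\), so both printed values equal \(-\min (d_1,d_2) = 1\), as expected.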

Proof of Proposition 3.6

Under Assumption 2.1, F is nonempty. If \(h^E_{-1}(\mathbf {z}) = 0\), then \(\mathbf {z}\in F\) because there exists \(\mathbf {v}\ge \mathbf {0}\) such that \(\mathbf {A}\mathbf {v}= \mathbf {z}\). Then LP (1) is feasible and, by Proposition 2.2, LP (2) is also feasible for all i. If \(h^E_{-1}(\mathbf {z}) = 0\), then \(\mathbf {s}_+ = \mathbf {s}_- = \mathbf {0}\), and these variables can be removed, together with the constraint \(\sum _{i=1}^m (s_{+i} + s_{-i}) = h^E_{-1}(\mathbf {z})\), from LP (12), resulting in \(h^E_i(\mathbf {z}) = h_i(\mathbf {z})\) for all \(i \in \{0,\ldots ,n_h\}\). By Proposition 2.1, F is nonempty and, for \(\mathbf {z}\in F\), \(\mathbf {h}^E(\mathbf {z}) \in \mathbb {R}^{n_h+2}\). This implies that the dual LPs of (11) and (12) are always feasible. Now let us show that (11) and (12) satisfy Assumption 3.1. Let \(n_h^p := n_h + 1\) and \(\mathbf {A}^p := \begin{bmatrix} \mathbf {A}&\mathbf {I}_m&-\mathbf {I}_m \end{bmatrix}\). Let \((\mathbf {c}_{0}^p)^\text {T}:= \begin{bmatrix}\mathbf {0}^\text {T}&\mathbf {1}^\text {T}&\mathbf {1}^\text {T}\end{bmatrix}\) and, for \(i \in \{0,\ldots ,n_h\}\), let \((\mathbf {c}_{i+1}^p)^\text {T}:= \begin{bmatrix}\mathbf {c}_i^\text {T}&\mathbf {0}^\text {T}&\mathbf {0}^\text {T}\end{bmatrix}\). Then LPs (11) and (12) can be expressed in the format of LPs (1) and (2). Since the dual LPs of (11) and (12) are always feasible, \(\mathbf {A}^p\) and \(\mathbf {c}_i^p\) for \(i \in \{0,\ldots ,n_h^p\}\) are such that \(\mathbf {h}^E(\mathbf {z})>-\varvec{\infty }\) for all \(\mathbf {z}\in \mathbb {R}^m\). Since \(\mathbf {A}\) has full row rank, \(\mathbf {A}^p\) has full row rank. Then LPs (11) and (12) satisfy Assumption 3.1, and by Proposition 3.1, \(\mathbf {h}^E\) is l-smooth at any \(\mathbf {z}\in \mathbb {R}^m\). \(\square \)
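The augmented LP behind \(h^E_{-1}\) is a standard phase-I-style construction, sketched below for concreteness (our illustration with assumed data; scipy assumed available): minimizing the total slack \(\mathbf {1}^\text {T}(\mathbf {s}_+ + \mathbf {s}_-)\) subject to \(\mathbf {A}\mathbf {v}+ \mathbf {s}_+ - \mathbf {s}_- = \mathbf {z}\) is always feasible, and the optimal value vanishes exactly when \(\mathbf {z}\in F\):

```python
# Sketch of the feasibility/penalty LP behind h^E in Proposition 3.6:
# h^E_{-1}(z) = min 1^T(s+ + s-) s.t. A v + s+ - s- = z, v, s+, s- >= 0.
# Illustrative data, not from the paper; scipy assumed available.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 1.0]])
m, n = A.shape
Ap = np.hstack([A, np.eye(m), -np.eye(m)])            # A^p = [A  I  -I]
cp = np.concatenate([np.zeros(n), np.ones(2 * m)])    # c_0^p = [0  1  1]

def hE_minus1(z):
    return linprog(cp, A_eq=Ap, b_eq=z, bounds=(0, None)).fun

print(hE_minus1(np.array([1.0, 1.0])))    # z in F:     0.0
print(hE_minus1(np.array([-1.0, 1.0])))   # z not in F: 1.0 (infeasibility measure)
```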

Appendix B

Why Proposition 4.12 in [14] Is Not Applicable

This proposition refers to optimization problems of the form

$$\begin{aligned} \underset{\mathbf {x}\in X}{\min } \; f(\mathbf {x},\mathbf {u}), \;\; \text {s.t. } \mathbf {x}\in \varPhi , \end{aligned}$$
(16)

where \(\mathbf {u}\in U\) is a parameter vector and \(\varPhi \) is nonempty and closed. Under certain conditions, this proposition calculates the directional derivative of the optimal value function. To apply Proposition 4.12 in [14], we need to consider the duals of each LP in (1) and (2):

$$\begin{aligned} h_i(\mathbf {z}) = \underset{\varvec{\lambda }\in \mathbb {R}^{m + i}}{\max } \; [\mathbf {q}^i(\mathbf {z})]^\text {T}\varvec{\lambda },\;\; \text {s.t. } [\mathbf {A}^i]^\text {T}\varvec{\lambda }\le \mathbf {c}_i. \end{aligned}$$
(17)

For \(i \in \{0,\ldots ,n_h\}\), let \(\mathscr {D}^i\subset \mathbb {R}^{m+i}\) be the feasible set of LP (17). Since Proposition 4.12 in [14] considers a minimization problem, \(f^i(\varvec{\lambda },\mathbf {z}) = -[\mathbf {q}^i(\mathbf {z})]^\text {T}\varvec{\lambda }\). Notice that the feasible set of (17) is independent of \(\mathbf {z}\) and nonempty under Assumption 2.1 for \(i \in \{0,\ldots ,n_h\}\). The application of Proposition 4.12 in [14] requires that the inf-compactness condition is satisfied at \(\mathbf {z}\in F\). The following propositions show that the inf-compactness condition cannot be satisfied by LLPs.

Definition B.1

Consider the optimization problem (16). The inf-compactness condition holds at \(\mathbf {u}_0 \in U\) if there exists \(\alpha \in \mathbb {R}\) and a compact set \(C \subset X\) such that for every \(\mathbf {u}\) near \(\mathbf {u}_0\), the level set \(\text {lev}_\alpha f(\cdot ,\mathbf {u}) := \{ \mathbf {x}\in \varPhi {:}\,f(\mathbf {x},\mathbf {u}) \le \alpha \}\) is nonempty and contained in C.

Proposition B.1

Let Assumption 2.1 hold and consider LP (17). Then for all i, the inf-compactness condition is not satisfied at \(\mathbf {z}_0\) if \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\).

Proof

\(F_i\) is closed [1]. Assume \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\). Since \(\mathbf {q}^i(\mathbf {z}_0) \in F_i\), the solution set of LP (17) is nonempty, and since it can be described as a polyhedron, it is closed and convex. Proposition 3.5 in [15] implies that \(\mathbf {q}^i(\mathbf {z}_0) \in \text {int}(F_i)\) if and only if the solution set of LP (17) is nonempty, convex, and compact. Since \(\mathbf {q}^i(\mathbf {z}_0) \in \text {bnd}(F_i)\), the solution set of LP (17) cannot be compact and must therefore be unbounded. The solution set is the smallest nonempty level set; since it is unbounded, every nonempty level set at \(\mathbf {z}_0\) is unbounded and the inf-compactness condition is not satisfied. \(\square \)

Proposition B.2

Let Assumption 2.1 hold. For all \(i>0\), \(\mathbf {q}^i(F) \subset \text {bnd}(F_i)\).

Proof

By Proposition 2.2, \(\mathbf {q}^i(F) \subset F_i\). In addition, \(q^i_{m+i}(\mathbf {z}) = h_{i-1}(\mathbf {z})\). Let \(\mathbf {d}= -\mathbf {e}_{m+i}\). For any \(\epsilon >0\), \(\mathbf {q}^i(\mathbf {z}) + \epsilon \mathbf {d}\notin F_i\): the \((m+i)\)th component of \(\mathbf {q}^i(\mathbf {z})\) is \(h_{i-1}(\mathbf {z})\), the optimal value of the previous priority, and any value less than \(h_{i-1}(\mathbf {z})\) results in an infeasible LP. Then \(\mathbf {q}^i(\mathbf {z}) \in \text {bnd}(F_i)\) and thus \(\mathbf {q}^i(F) \subset \text {bnd}(F_i)\). \(\square \)

Hence, by Propositions B.1 and B.2, Proposition 4.12 in [14] cannot be applied to LLPs, and an alternative method to calculate the LD-derivatives of the optimal value function is required. As a simple illustration, take \(m = 1\), \(\mathbf {A}= \begin{bmatrix} 1&1 \end{bmatrix}\), and \(\mathbf {c}_0 = \mathbf {0}\): then \(F = [0,\infty [\), and at the boundary point \(z = 0\) every \(\lambda \le 0\) is optimal for the dual, so the dual solution set is unbounded and inf-compactness fails.

Appendix C

Let \(\mathbf {h}{:}\,\mathbb {R}^2_+ \rightarrow \mathbb {R}^2\) be given by:

$$\begin{aligned} \mathbf {h}(\mathbf {b}) = \text {lex } \underset{\mathbf {v}\in \mathbb {R}^2}{\min } \; \mathbf {C}^\text {T}\mathbf {v}, \text {s.t. } v_1 \le b_1, v_1 + v_2 \le b_2, v_1, v_2 \ge 0, \end{aligned}$$
(18)

with \(\mathbf {C} = -\mathbf {I}_2\), where first \(v_1\) is maximized, and then \(v_2\) is maximized. LP (18) can be transformed into standard form to obtain the form of LLP (1) and (2):

$$\begin{aligned} \mathbf {h}(\mathbf {b}) = \text {lex } \underset{\mathbf {v}\in \mathbb {R}^4}{\min } \; {\hat{\mathbf {C}}}^\text {T}\mathbf {v}, \; \text {s.t. } \begin{bmatrix} 1&0&1&0 \\ 1&1&0&1\end{bmatrix} \mathbf {v}= \mathbf {b}, \mathbf {v}\ge \mathbf {0}, \end{aligned}$$
(19)

where \({\hat{\mathbf {C}}}^\text {T} = \begin{bmatrix} -1&0&0&0 \\ 0&-1&0&0\end{bmatrix}\). For LP (18), \(F = \left\{ \mathbf {b}{:}\,\mathbf {b}\ge \mathbf {0} \right\} \). Consider \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Clearly, \(\mathbf {b}_0 \in \text {int}(F)\) and \(\mathbf {h}(\mathbf {b}_0) = [-1 \;\; 0]^\text {T}\). \(\mathbf {h}\) is not differentiable at \(\mathbf {b}_0\). Given \(\mathbf {M}\in \mathbb {R}^{2 \times 2}\) and LP (19), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M})\) can be computed with Proposition 3.4. Consider \(\mathbf {M}_1 = \mathbf {I}_2\), \(\mathbf {M}_2 = \begin{bmatrix} 1&0 \\ 0&-1\end{bmatrix}\), \(\mathbf {M}_3 = \begin{bmatrix} -1&0 \\ 0&1\end{bmatrix}\), and \(\mathbf {M}_4 = -\mathbf {I}_2\). The resulting LD-derivatives are: \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_1) = \begin{bmatrix} 0&-1 \\ 0&0\end{bmatrix}\), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_2) = \begin{bmatrix} 0&1 \\ 0&0\end{bmatrix}\), \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_3) = \begin{bmatrix} 1&0 \\ -1&-1\end{bmatrix}\), and \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}_4)= \begin{bmatrix} 1&0 \\ -1&1\end{bmatrix}\) (Fig. 1).

Different elements of the lexicographic subdifferential are obtained from the linear system \(\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\mathbf {M}= \mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M})\). The resulting l-derivatives are:

$$\begin{aligned}&\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_1) = \mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_2) = \begin{bmatrix} 0&-1 \\ 0&0\end{bmatrix}, \\&\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_3) =\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M}_4) = \begin{bmatrix} -1&0 \\ 1&-1\end{bmatrix}. \end{aligned}$$
Fig. 1 This figure shows graphically how \(\mathbf {M}_1,\mathbf {M}_2,\mathbf {M}_3\) and \(\mathbf {M}_4\) result in different LD-derivatives at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). (a) Feasible set (gray) and optimal solution point (red dot) for LP (18). If the first column of the matrix of directions is \([1 \; \; 0]^\text {T}\), (b) is obtained. In (b), \(b_1\) is increased and the optimal solution point does not change. Then, the second column of the matrix of directions can be \([0 \; \; 1]^\text {T}\) or \([0 \; \; -1]^\text {T}\), resulting in (c) and (d), respectively. In (c), the solution point changes such that \(h_0\) decreases, and in (d) it changes such that \(h_0\) increases. If the first column of the matrix of directions is \([-1 \; \; 0]^\text {T}\), (e) is obtained. In (e), the solution point changes such that \(h_0\) increases and \(h_1\) decreases. Then, the second column of the matrix of directions can be \([0 \; \; 1]^\text {T}\) or \([0 \; \; -1]^\text {T}\), resulting in (f) and (g), respectively. In (f), the solution point changes such that \(h_1\) decreases, and in (g) it changes such that \(h_1\) increases

Notice that \(\mathbf {M}_1\) and \(\mathbf {M}_2\) result in the same l-derivative matrix, as do \(\mathbf {M}_3\) and \(\mathbf {M}_4\). These two l-derivatives form the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Proposition 2.6.2 in [13] shows that for a nonscalar function \(\mathbf {h}\) evaluated at \(\mathbf {b}_0\), Clarke's generalized Jacobian is a subset of the Cartesian product of the generalized gradients of each component of \(\mathbf {h}\). In this example, the Cartesian product of the generalized gradients of \(h_0\) and \(h_1\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\) is the convex hull of \(\left\{ \begin{bmatrix} 0&-\,1 \\ 0&0 \end{bmatrix}, \begin{bmatrix} 0&-\,1 \\ 1&-\,1 \end{bmatrix}, \begin{bmatrix} -\,1&0 \\ 0&0 \end{bmatrix}, \begin{bmatrix} -\,1&0 \\ 1&-\,1 \end{bmatrix} \right\} \). However, the kinks of \(h_0\) and \(h_1\) are lined up such that the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\) contains only two matrices, and the LD-derivatives are guaranteed to produce only these two matrices (Fig. 2). The results of this example can easily be verified by expressing \(\mathbf {h}\) as:

$$\begin{aligned}&h_0(\mathbf {b}) = -\min \{b_1,b_2\} = \frac{|b_1 - b_2|-b_1 - b_2}{2} \\&h_1(\mathbf {b}) = -\max \{0,b_2-b_1\} = \frac{b_1 - b_2 - |b_1 - b_2|}{2}. \end{aligned}$$
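The following minimal sketch (our illustration; numpy assumed available) uses this closed form: because \(\mathbf {h}\) is piecewise linear, probing the linear piece selected by \(\mathbf {b}_0 + \tau \mathbf {m}_1 + \tau ^2 \mathbf {m}_2\) for a generic small \(\tau \) recovers \(\mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\) by finite differences, and the LD-derivative then follows from \(\mathbf {h}^\prime (\mathbf {b}_0;\mathbf {M}) = \mathbf {J}_\text {L}\mathbf {h}(\mathbf {b}_0;\mathbf {M})\,\mathbf {M}\):

```python
# Numerical check of the Appendix C example via probing, using the closed form
# of h above.  Our illustration, not the authors' code.
import numpy as np

def h(b):
    return np.array([-min(b[0], b[1]), -max(0.0, b[1] - b[0])])

def lex_jacobian(b, M, tau=1e-4, h_fd=1e-7):
    """J_L h(b; M) for piecewise linear h, via probing + finite differences."""
    p = b + tau * M[:, 0] + tau**2 * M[:, 1]      # selects a linear piece
    hp = h(p)
    return np.round(np.column_stack([(h(p + h_fd * e) - hp) / h_fd
                                     for e in np.eye(2)]), 6)

b0 = np.array([1.0, 1.0])
Ms = [np.eye(2),
      np.array([[1.0, 0.0], [0.0, -1.0]]),
      np.array([[-1.0, 0.0], [0.0, 1.0]]),
      -np.eye(2)]

for M in Ms:
    JL = lex_jacobian(b0, M)
    print("J_L =\n", JL, "\nh'(b0;M) = J_L M =\n", JL @ M)
```

Running it reproduces the two l-derivative matrices and the four LD-derivatives listed above.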
Fig. 2 Surface plots of \(\mathbf {h}\) with respect to \(\mathbf {b}\). The red dots indicate the point \(\mathbf {b}_0 = [1 \; \; 1]^\text {T}\). Notice that both components of \(\mathbf {h}\) can be divided into two regions of differentiability with two different gradients. In particular, \(\nabla h_0(\mathbf {b}) = [-1 \; \; 0]^\text {T}\) or \(\nabla h_0(\mathbf {b}) = [0 \; \; -1]^\text {T}\), and \(\nabla h_1(\mathbf {b}) = [0 \; \; 0]^\text {T}\) or \(\nabla h_1(\mathbf {b}) = [1 \; -1]^\text {T}\). The four matrices of directions \(\mathbf {M}_1,\mathbf {M}_2,\mathbf {M}_3\) and \(\mathbf {M}_4\) probe possible combinations of these gradients at \(\mathbf {b}_0\), resulting in two different l-derivative matrices. In fact, these two matrices form the B-subdifferential of \(\mathbf {h}\) at \(\mathbf {b}_0\). In this case, the generalized Jacobian of \(\mathbf {h}\) at \(\mathbf {b}_0\) is a strict subset of the Cartesian product of the generalized gradients of each component of \(\mathbf {h}\) at \(\mathbf {b}_0\)
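As a toy illustration of why such Jacobian elements are useful (our own sketch, not the paper's Example 4.1), a semismooth Newton iteration in the sense of [21] can use the probed l-derivative as its Newton Jacobian. Here it recovers the unique preimage of an assumed target value in one step, because the assumed starting guess already lies in the linear piece containing the solution:

```python
# Toy semismooth Newton iteration [21] on the Appendix C function h, using the
# probed l-derivative as the Newton Jacobian.  h and lex_jacobian repeat the
# sketch above so the block is self-contained.
import numpy as np

def h(b):
    return np.array([-min(b[0], b[1]), -max(0.0, b[1] - b[0])])

def lex_jacobian(b, M, tau=1e-4, h_fd=1e-7):
    p = b + tau * M[:, 0] + tau**2 * M[:, 1]
    hp = h(p)
    return np.round(np.column_stack([(h(p + h_fd * e) - hp) / h_fd
                                     for e in np.eye(2)]), 6)

y_target = h(np.array([0.5, 0.75]))      # = (-0.5, -0.25), assumed target
b = np.array([0.2, 0.9])                 # assumed starting guess

for k in range(20):
    f = h(b) - y_target
    if np.linalg.norm(f) < 1e-10:
        break
    JL = lex_jacobian(b, np.eye(2))      # an element of the B-subdifferential
    b = b - np.linalg.solve(JL, f)       # Newton step with the l-derivative

print(k, b)                              # prints: 1 [0.5  0.75]
```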

Appendix D

Model Description of Example 4.1

(a) Mass balance of biomass of C. ljungdahlii:

    $$\begin{aligned}&\frac{\partial X}{\partial t}(z,t) = \mu (z,t) X(z,t) - \frac{u_L}{\epsilon _L}\frac{\partial X}{\partial z}(z,t) + D_A \frac{\partial ^2 X}{\partial z^2}(z,t), \\&u_L X(0,t) - \epsilon _L D_A \frac{\partial X}{\partial z}(0,t) = 0, \;\; \frac{\partial X}{\partial z}(L,t) = 0, \; X(z,0) = X_0, \end{aligned}$$

    where X is the concentration of biomass, t is time, \(\mu \) is the growth rate, \(u_L\) is the liquid velocity, z is the spatial position, \(D_A\) is the diffusivity, \(\epsilon _L\) is the liquid volume fraction in the reactor, and \(X_0\) is the initial biomass concentration.

(b) Mole balances of liquid-phase CO, H\(_2\), CO\(_2\), ethanol, and acetate:

    $$\begin{aligned}&\frac{\partial G_L}{\partial t}(z,t) = v_G(z,t) X(z,t) + \frac{k_{m,G}}{\epsilon _L}(G^*(z,t) - G_L(z,t)) \\&\quad -\,\frac{u_L}{\epsilon _L} \frac{\partial G_L}{\partial z}(z,t) + D_A \frac{\partial ^2 G_L}{\partial z^2}(z,t), \\&u_L G_L(0,t) - \epsilon _L D_A \frac{\partial G_L}{\partial z}(0,t) = u_L G_{gF} H_G, \; \frac{\partial G_L}{\partial z}(L,t) = 0, \; G_L(z,0) = G_{L0}, \end{aligned}$$

where G stands for each of CO, H\(_2\), CO\(_2\), ethanol, and acetate; \(v_G\) is the exchange flux rate for species G, \(k_{m,G}\) the liquid mass transfer coefficient for species G, \(G_L\) the liquid concentration of species G, \(G^*\) the liquid concentration in equilibrium with the gas concentration of species G, \(G_{gF}\) the gas concentration of species G in the feed, \(G_{L0}\) the initial concentration of species G in the liquid, and \(H_G\) Henry's constant for species G.

(c) Mole balances of gas-phase CO, H\(_2\), and CO\(_2\):

    $$\begin{aligned} \frac{\partial G_g}{\partial t}(z,t)&= - \frac{k_{m,G}}{\epsilon _g}(G^*(z,t) - G_L(z,t)) - \frac{u_g}{\epsilon _g} \frac{\partial G_g}{\partial z}(z,t), \\ G_g(0,t)&= G_{gF}, G_g(z,0) = G_{g0}, \end{aligned}$$

    where \(G_g\) is the concentration of species G in the gas, \(\epsilon _g\) is the gas volume fraction, \(u_g\) is the gas velocity, and \(G_{g0}\) is the initial gas concentration of species G.

(d) Column pressure profile: \(P(z) = P_L + \rho _L g(L-z)\), where P is the pressure as a function of position in the column, \(P_L\) is the pressure at the top of the column, \(\rho _L\) is the density of the liquid, L is the height of the column, and g is gravitational acceleration.

The spatial dimension of the bubble column was discretized using a finite volume method with 100 nodes. The state vector for each finite volume is:

$$\begin{aligned} \mathbf {x}= [X, \text {CO}_L, \text {CO}_g, \text {H}_{2g}, \text {H}_{2L}, \text {CO}_{2L}, \text {CO}_{2g}, A, E ], \end{aligned}$$

where A stands for acetate and E for ethanol. To obtain the growth rate \(\mu \) and the exchange flux rates \(v_G\), this problem is transformed into a DFBA problem using LP (13).
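To make the discretization concrete, the following minimal method-of-lines sketch (our illustration, not the authors' code; the numerical values are placeholders, see the parameter list below) assembles the right-hand side of the biomass balance (a) on a uniform finite volume grid, with upwind advection, central diffusion, and Danckwerts-type boundary conditions:

```python
# Minimal method-of-lines sketch of balance (a) on a uniform finite volume grid
# (100 cells, as in the paper).  mu is a placeholder here; in the paper it comes
# from the embedded LLP (13).
import numpy as np

N, L = 100, 25.0                     # cells, column height [m]
dz = L / N
u_L, eps_L, D_A = 0.25, 0.5, 0.25    # eps_L is illustrative; see list below

def biomass_rhs(X, mu):
    """dX/dt for the biomass; X has one value per cell, mu per cell [1/h]."""
    a = u_L / eps_L
    # advective flux at cell faces (upwind); inlet face carries no biomass,
    # matching u_L X(0,t) - eps_L D_A dX/dz(0,t) = 0
    F_adv = np.concatenate([[0.0], a * X])            # faces 0..N
    # diffusive flux at interior faces; zero at both boundaries
    # (zero total flux at the inlet, dX/dz(L,t) = 0 at the outlet)
    F_dif = np.zeros(N + 1)
    F_dif[1:N] = -D_A * (X[1:] - X[:-1]) / dz
    F = F_adv + F_dif
    return mu * X - (F[1:] - F[:-1]) / dz             # growth minus flux divergence

X = np.full(N, 0.1)                  # initial biomass [g/L], illustrative
print(biomass_rhs(X, mu=np.full(N, 0.2))[:3])
```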

The resulting nonsmooth system of equations was solved for the following parameter values:

$$\begin{aligned}&u_g = 75\,\mathrm{m/h}, L = 25\,\mathrm{m}, u_L = 0.25\,\mathrm{m/h}, D_A = 0.25\,\mathrm{m}^2/\mathrm{h}, T = 310.15\,\mathrm{K}, \\&P_L = 1\,\mathrm{atm}, \rho _L = 1000\,\mathrm{kg/m}^3, P_{\mathrm{CO}} = 0.6\,P_0, P_{\mathrm{H}_2} = 0.4\,P_0, \\&G_{\mathrm{gF}} = \frac{P_G}{R \, T}, H_{\mathrm{CO}} = 8.0 \times 10^{-4} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, H_{\mathrm{H}_2} = 6.6 \times 10^{-4} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, \\&H_{\mathrm{CO}_2} = 2.5 \times 10^{-2} \frac{\mathrm{mol}}{(\mathrm{L}\,\mathrm{atm})}, k_{m,\mathrm{CO}_2} = k_{m,\mathrm{CO}} = 80/\mathrm{h}, k_{m,\mathrm{H}_2} = 2.5\,k_{m,\mathrm{CO}}, \\&k_{m,E} = k_{m,A} = 0, \epsilon _g = u_g \frac{0.53}{3600(0.15 + u_g/3600)}, \epsilon _L = 1 - \epsilon _g, \\&v_{\mathrm{max},\mathrm{CO}} = v_{\mathrm{max},\mathrm{CO}_2}= 35 \frac{\mathrm{mmol}}{\mathrm{g} \times \mathrm{h}}, v_{\mathrm{max},\mathrm{H}_2} = 70 \frac{\mathrm{mmol}}{\mathrm{g} \times \mathrm{h}}, \\&K_{m,\mathrm{CO} }= K_{m,\mathrm{CO}_2} = K_{m,\mathrm{H}_2} = 0.02\,\mathrm{mmol/L}, K_{I} = 10\,\mathrm{mmol/L}, \end{aligned}$$

where \(P_0\) is the pressure at the bottom of the column, \(P_G\) is the partial pressure of species G, T is the temperature in Kelvin, and R is the universal gas constant. This system comprises 901 equations (the last equation corresponds to the penalty) and 100 LLPs, each with 682 equality constraints, 1715 variables, and 7 objective functions.


Cite this article

Gomez, J.A., Höffner, K., Khan, K.A. et al. Generalized Derivatives of Lexicographic Linear Programs. J Optim Theory Appl 178, 477–501 (2018). https://doi.org/10.1007/s10957-018-1309-2
