
Robust Algorithms for TSP and Steiner Tree

Published: 09 March 2023

Abstract

Robust optimization is a widely studied area in operations research, where the algorithm takes as input a range of values and outputs a single solution that performs well for the entire range. Specifically, a robust algorithm aims to minimize regret, defined as the maximum difference between the solution’s cost and that of an optimal solution in hindsight once the input has been realized. For graph problems in P, such as shortest path and minimum spanning tree, robust polynomial-time algorithms that obtain a constant approximation on regret are known. In this paper, we study robust algorithms for minimizing regret in NP-hard graph optimization problems, and give constant approximations on regret for the classical traveling salesman and Steiner tree problems.

1 Introduction

In many graph optimization problems, the inputs are not known precisely and the algorithm is desired to perform well over a range of inputs. For instance, consider the following situations. Suppose we are planning the delivery route of a vehicle that must deliver goods to n locations. Due to varying traffic conditions, the exact travel times between locations are not known precisely, but a range of possible travel times is available from historical data. Can we design a tour that is nearly optimal for all travel times in the given ranges? Consider another situation where we are designing a telecommunication network to connect a set of locations. We are given cost estimates on connecting every two locations in the network but these estimates might be off due to unexpected construction problems. Can we design the network in a way that is nearly optimal for all realized construction costs?
These questions have led to the field of robust graph algorithms. To define a robust graph algorithm, we start with a “standard” optimization problem \(\mathcal {P}\) defined by a set system \(\mathcal {S} \subseteq 2^{E}\) over edges E with weights \(d_e\). For example, if \(\mathcal {P}\) is the minimum spanning tree problem, \(\mathcal {S}\) would be the set of all sets of edges comprising spanning trees. Given these inputs, the goal of the standard version of \(\mathcal {P}\) is to find the set \(S\in \mathcal {S}\) that minimizes \(\sum _{e \in S} d_e\). In the robust version of \(\mathcal {P}\), given a range of weights \([\ell _e, u_e]\) for every edge e, we want a solution that is good for all realizations of edge weights simultaneously. To quantify how good a solution is, we define its regret as the maximum difference between the algorithm’s cost and the optimal cost for any vector of edge weights \({\bf d}\). In other words, the regret of \({\rm\small SOL}\) is:
\begin{equation*} \max _{{\bf d}} ({\rm\small SOL} ({\bf d}) - {\rm\small OPT} ({\bf d})), \qquad {\rm\small SOL} ({\bf d}) := \sum _{e \in {\rm\small SOL}} d_e, \qquad {\rm\small OPT} ({\bf d}) := \min _{S \in \mathcal {S}} \sum _{e \in S} d_e. \end{equation*}
Here, \({\rm\small SOL} ({\bf d})\) (resp. \({\rm\small OPT} ({\bf d})\)) denotes the cost of \({\rm\small SOL}\) (resp. the cost of the optimal solution) in instance \({\bf d}\), and \({\bf d}\) ranges over all realizable inputs, i.e., inputs such that \(\ell _e \le d_e \le u_e\) for all e. We emphasize that \({\rm\small SOL}\) is a fixed solution (independent of \({\bf d}\)) whereas the solution determining \({\rm\small OPT} ({\bf d})\) is dependent on the input \({\bf d}\).
Now, the goal of the robust version of \(\mathcal {P}\) is to find a solution that minimizes regret. The solution that achieves this minimum is called the minimum regret solution(\({\rm\small MRS}\)), and its regret is the minimum regret(\({\rm\small MR}\)). In many cases, however, minimizing regret turns out to be NP-hard, in which case one seeks an approximation guarantee. Namely, a \(\beta\)-approximation algorithm satisfies, for all input realizations \({\bf d}\), \({\rm\small SOL} ({\bf d}) - {\rm\small OPT} ({\bf d}) \le \beta \cdot {\rm\small MR}\), i.e., \({\rm\small SOL} ({\bf d}) \le {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR}\).
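As a concrete illustration of the regret objective, note that for a fixed solution \({\rm\small SOL}\) and a fixed alternative solution, the worst-case realization simply pushes edges used only by \({\rm\small SOL}\) to their upper bounds and edges used only by the alternative to their lower bounds. The following sketch (illustrative only; the function name is ours, not from the paper) computes this quantity; the regret of \({\rm\small SOL}\) is its maximum over all feasible alternatives, which in general requires solving the underlying problem.

```python
# Illustration: worst-case gap between a fixed solution SOL and a fixed
# alternative ALT over all realizations d with l_e <= d_e <= u_e.
def worst_case_gap(sol_edges, alt_edges, lower, upper):
    """sol_edges, alt_edges: sets of edges; lower, upper: dicts edge -> bound."""
    gap = sum(upper[e] for e in sol_edges - alt_edges)   # SOL pays, ALT does not
    gap -= sum(lower[e] for e in alt_edges - sol_edges)  # ALT pays, SOL does not
    return gap
```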
To the best of our knowledge, all previous work in polynomial-time algorithms for minimizing regret in robust graph optimization focused on problems in P. In this paper, we study robust graph algorithms for minimizing regret in NP-hard optimization problems. In particular, we study robust algorithms for the classical traveling salesman (tsp) and Steiner tree (stt) problems, that model e.g., the two scenarios described at the beginning of the paper. As a consequence of the NP-hardness, we cannot hope to show guarantees of the form: \({\rm\small SOL} ({\bf d}) \le {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR}\), since for \(\ell _e = u_e\) (i.e., \({\rm\small MR} = 0\)), this would imply an exact algorithm for an NP-hard optimization problem. Instead, we give guarantees: \({\rm\small SOL} ({\bf d}) \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR}\), where \(\alpha\) is (necessarily) at least as large as the best approximation guarantee for the optimization problem. We call such an algorithm an \((\alpha , \beta)\)-robust algorithm. If both \(\alpha\) and \(\beta\) are constants, we call it a constant-approximation to the robust problem. In this paper, our main results are constant approximation algorithms for the robust traveling salesman and Steiner tree problems. We hope that our work will lead to further research in the field of robust approximation algorithms, particularly for minimizing regret in other NP-hard optimization problems in graph algorithms as well as in other domains.

1.1 Related Work

Robust graph algorithms have been extensively studied in the operations research community. It is known that minimizing regret is NP-hard for shortest path [25] and minimum cut [1] problems, and using a general theorem for converting exact algorithms to robust ones, 2-approximations are known for these problems [11, 16]. In some cases, better results are known for special classes of graphs, e.g., [17]. Robust minimum spanning tree (mst) has also been studied, although in the context of making exponential-time exact algorithms more practical [24]. Moreover, robust optimization has been extensively researched for other (non-graph) problem domains in the operations research community, and has led to results in clustering [4, 5, 6, 19], linear programming [14, 20], and other areas [3, 16]. More details can be found in the book by Kouvelis and Yu [18] and the survey by Aissi et al. [2].
Other robust variants of graph optimization where one does not know the edge costs ahead of time have also been studied in the literature. In the robust combinatorial optimization model proposed by Bertsimas and Sim [7], edge costs are given as ranges as in this paper, but instead of optimizing for all realizations of costs within the ranges, the authors consider a model where at most k edge costs can be set to their maximum value and the remaining are set to their minimum value. The objective is to minimize the maximum cost over all realizations. In this setting, there is no notion of regret and an approximation algorithm for the standard problem translates to an approximation algorithm for the robust problem with the same approximation factor.
In the data-robust model [12], the input includes a polynomial number of explicitly defined “scenarios” for edge costs, with the goal of finding a solution that is approximately optimal for all given scenarios. That is, in the input one receives a graph and a polynomial number of scenarios \({\bf d}^{(1)}, {\bf d}^{(2)} \ldots {\bf d}^{(k)}\) and the goal is to find \({\rm\small ALG}\) whose maximum cost across all scenarios is at most some approximation factor times \(\min _{{\rm\small SOL}} \max _{i \in [k]} \sum _{e \in {\rm\small SOL}} d_{e}^{(i)}\). In contrast, in this paper, we have exponentially many scenarios and look at the maximum of \({\rm\small ALG} ({\bf d}) - {\rm\small OPT} ({\bf d})\) rather than \({\rm\small ALG} ({\bf d})\). A variation of this is the recoverable robust model [9], where after seeing the chosen scenario, the algorithm is allowed to “recover” by making a small set of changes to its original solution.

1.2 Problem Definition and Results

We first define the Steiner tree (stt) and traveling salesman problems (tsp). In both problems, the input is an undirected graph \(G = (V, E)\) with non-negative edge costs. In Steiner tree, we are also given a subset of vertices called terminals and the goal is to obtain a minimum cost connected subgraph of G that spans all the terminals. In traveling salesman, the goal is to obtain a minimum cost tour that visits every vertex in V. In the robust versions of these problems, the edge costs are ranges \([\ell _e, u_e]\) from which any cost may realize.
Our main results are the following:
Theorem 1.1 (Robust Approximations).
There exist constant approximation algorithms for the robust traveling salesman and Steiner tree problems.
The constants we are able to obtain for the two problems are very different: \((4.5, 3.75)\) for tsp (in Section 3) and \((2755, 64)\) for stt (in Section 5). While we did not attempt to optimize the precise constants, obtaining small constants for stt comparable to the tsp result requires new ideas beyond our work and is an interesting open problem.
We complement our algorithmic results with lower bounds. Note that if \(\ell _e = u_e\), we have \({\rm\small MR} = 0\) and thus an \((\alpha , \beta)\)-robust algorithm gives an \(\alpha\)-approximation for precise inputs. So, hardness of approximation results yield corresponding lower bounds on \(\alpha\). More interestingly, we show that hardness of approximation results also yield lower bounds on the value of \(\beta\) (see Section 6 for details):
Theorem 1.2 (APX-hardness).
A hardness of approximation of \(\rho\) for tsp (resp., stt) under \({\bf P}\not={\bf NP}\) implies that it is NP-hard to obtain \(\alpha \le \rho\) (irrespective of \(\beta\)) and \(\beta \le \rho\) (irrespective of \(\alpha\)) for robust tsp (resp., robust stt).

1.3 Our Techniques

We now give a sketch of our techniques. Before doing so, we note that for problems in P with linear objectives, it is known that running an exact algorithm using weights \(\frac{\ell _e + u_e}{2}\) gives a \((1, 2)\)-robust solution [11, 16]. One might hope that a similar result can be obtained for NP-hard problems by replacing the exact algorithm with an approximation algorithm in the above framework. Unfortunately, we show in Section 3 that this is not true in general. In particular, we give a robust tsp instance where using a 2-approximation for tsp with weights \(\frac{\ell _e + u_e}{2}\) gives a solution that is not \((\alpha , \beta)\)-robust for any \(\alpha = o(n), \beta = o(n)\). More generally, a black-box approximation run on a fixed realization could output a solution including edges that have small weight relative to \({\rm\small OPT}\) for that realization (so including these edges does not violate the approximation guarantee), but these edges could have large weight relative to \({\rm\small MR}\) and \({\rm\small OPT}\) in other realizations, ruining the robustness guarantee. This establishes a qualitative difference between robust approximations for problems in P considered earlier and NP-hard problems being considered in this paper, and demonstrates the need to develop new techniques for the latter class of problems.
LP relaxation. We denote the input graph \(G = (V, E)\). For each edge \(e\in E\), the input is a range \([\ell _e, u_e]\) where the actual edge weight \(d_e\) can realize to any value in this range. The robust version of a graph optimization problem \(\mathcal {P}\) then has the LP relaxation
\begin{equation*} \min \Big \lbrace r : {\bf x} \in P; \sum _{e \in E} d_e x_e \le {\rm\small OPT} ({\bf d}) + r,\ \forall {\bf d}\Big \rbrace , \end{equation*}
where P is the standard polytope for \(\mathcal {P}\), and \({\rm\small OPT} ({\bf d})\) denotes the cost of an optimal solution when the edge weights are \({\bf d}= \lbrace d_e: e\in E\rbrace\). That is, this is the standard LP for the problem, but with the additional constraint that the fractional solution \({\bf x}\) must have regret at most r for any realization of edge weights. We call the additional constraints the regret constraint set. Note that setting \({\bf x}\) to be the indicator vector of mrs and r to \({\rm\small MR}\) gives a feasible solution to the LP; thus, the LP optimum is at most \({\rm\small MR}\), i.e., the optimal solution to the LP gives a lower bound for the regret minimization problem.
Solving the LP. We assume that the constraints in P are separable in polynomial time (e.g., this is true for most standard optimization problems including stt and tsp). So, designing the separation oracle comes down to separating the regret constraint set, which requires checking that:
\begin{equation*} \max _{{\bf d}} \left[\sum _{e \in E} d_ex_e - {\rm\small OPT} ({\bf d})\right] = \max _{{\bf d}} \max _{{\rm\small SOL}} \left[\sum _{e \in E} d_ex_e - {\rm\small SOL} ({\bf d})\right] = \max _{{\rm\small SOL}} \max _{{\bf d}} \left[\sum _{e \in E} d_ex_e - {\rm\small SOL} ({\bf d})\right] \le r. \end{equation*}
Thus, given a fractional solution \({\bf x}\), we need to find an integer solution \({\rm\small SOL}\) and a weight vector \({\bf d}\) that maximizes the regret of \({\bf x}\) given by \(\sum _{e \in E} d_ex_e - {\rm\small SOL} ({\bf d})\). Once \({\rm\small SOL}\) is fixed, finding \({\bf d}\) that maximizes the regret is simple: If \({\rm\small SOL}\) does not include an edge e, then to maximize \(\sum _{e \in E} d_e x_e - {\rm\small SOL} ({\bf d})\), we set \(d_e = u_e\); else if \({\rm\small SOL}\) includes e, we set \(d_e = \ell _e\). Note that in these two cases, edge e contributes \(u_e x_e\) and \(\ell _e x_e - \ell _e\) respectively to the regret. The above maximization thus becomes:
\begin{equation} \max _{{\rm\small SOL}} \left[\sum _{e\notin {\rm\small SOL}} u_e x_e + \sum _{e\in {\rm\small SOL}} (\ell _e x_e - \ell _e)\right] = \sum _{e\in E} u_e x_e - \min _{{\rm\small SOL}} \sum _{e\in {\rm\small SOL}} (u_e x_e - \ell _e x_e + \ell _e). \tag{1} \end{equation}
Thus, \({\rm\small SOL}\) is exactly the optimal solution with edge weights \(a_e := u_ex_e - \ell _ex_e + \ell _e\). (For reference, we define the derived instance of a robust graph problem as the instance with edge weights \(a_e\).) Note that these weights are non-negative as \(u_e \ge \ell _e \ge 0\) and \(x_e \ge 0\).
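To make this reduction concrete, the following sketch (helper names are ours; for NP-hard problems the exact solver below can only be approximated) computes the derived instance and the corresponding adversarial realization for a given fractional solution.

```python
# Sketch of the separation step behind Equation (1): the adversary's best
# response to a fractional solution x is an optimal solution on the derived
# instance with weights a_e = u_e*x_e - l_e*x_e + l_e.
def derived_weights(x, lower, upper):
    return {e: upper[e] * x[e] - lower[e] * x[e] + lower[e] for e in x}

def max_regret_of_fractional(x, lower, upper, solve_standard_problem):
    """solve_standard_problem(weights) is assumed to return a minimum-weight
    feasible edge set (exact only for problems in P)."""
    a = derived_weights(x, lower, upper)
    sol = solve_standard_problem(a)                     # adversary's solution
    d = {e: (lower[e] if e in sol else upper[e]) for e in x}
    regret = sum(d[e] * x[e] for e in x) - sum(d[e] for e in sol)
    return regret, sol, d
```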
Now, if we were solving a problem in P, we would simply need to solve the problem on the derived instance. Indeed, we will show later that this yields an alternative technique for obtaining robust algorithms for problems in P, and recover existing results in [16]. However, we cannot hope to find an optimal solution to an NP-hard problem. Our first compromise is that we settle for an approximate separation oracle. More precisely, our goal is to show that there exists some fixed constants \(\alpha ^{\prime }, \beta ^{\prime } \ge 1\) such that if \(\sum _e d_e x_e \gt \alpha ^{\prime } \cdot {\rm\small OPT} ({\bf d}) + \beta ^{\prime } \cdot r\) for some \({\bf d}\), then we can find \({\rm\small SOL}, {\bf d}^{\prime }\) such that \(\sum _e d_e^{\prime } x_e \gt {\rm\small SOL} ({\bf d}^{\prime }) + r\). Since the LP optimum is at most \({\rm\small MR}\), we can then obtain an \((\alpha ^{\prime }, \beta ^{\prime })\)-robust fractional solution using the standard ellipsoid algorithm.
For tsp, we show that the above guarantee can be achieved by the classic 2-approximation based on the mst solution of the derived instance. The details appear in Section 3. Although stt also admits a 2-approximation based on the mst solution, this turns out to be insufficient for the above guarantee. Instead, we use a different approach here. We note that the regret of the fractional solution against any fixed solution \({\rm\small SOL}\) (i.e., the argument over which Equation (1) maximizes) can be expressed as the following difference:
\begin{equation*} \sum _{e \notin {\rm\small SOL}} (u_ex_e - \ell _e x_e + \ell _e) - \sum _{e \in E} (\ell _e - \ell _e x_e) = \sum _{e \notin {\rm\small SOL}} a_e - \sum _{e \in E} b_e, \text{ where } b_e := \ell _e - \ell _e x_e. \end{equation*}
The first term is the weight of edges in the derived instance that are not in sol. The second term corresponds to a new stt instance with different edge weights \(b_e\). It turns out that the overall problem now reduces to showing the following approximation guarantees on these two stt instances (\(c_1\) and \(c_2\) are constants):
\begin{equation*} \text{(i)} \sum _{e \in {\rm\small ALG} \setminus {\rm\small SOL}} a_e \le c_1 \cdot \sum _{e \in {\rm\small SOL} \setminus {\rm\small ALG}} a_e \qquad \text{and} \qquad \text{(ii)} \sum _{e \in {\rm\small ALG}} b_e \le c_2 \cdot \sum _{e \in {\rm\small SOL}} b_e. \end{equation*}
Note that guarantee (i) on the derived instance is an unusual “difference approximation” that is stronger than usual approximation guarantees. Moreover, we need these approximation bounds to simultaneously hold, i.e., hold for the same alg. Obtaining these dual approximation bounds simultaneously forms the most technically challenging part of our work, and is given in Section 5.
Rounding the fractional solution. After applying our approximate separation oracles, we have a fractional solution \({\bf x}\) such that for all edge weights \({\bf d}\), we have \(\sum _e d_e x_e \le \alpha ^{\prime } \cdot {\rm\small OPT} ({\bf d}) + \beta ^{\prime } \cdot {\rm\small MR}\). Suppose that, ignoring the regret constraint set, the LP we are using has integrality gap at most \(\delta\) for precise inputs. Then we can bound the difference between the cost of mrs and \(\delta {\bf x}\) in every realization: Since the integrality gap is at most \(\delta\), we have \(\delta \cdot \sum _{e \in E} d_e x_e \ge {\rm\small OPT} ({\bf d})\) for any \({\bf d}\). This implies that:
\begin{equation*} {\rm\small MRS} ({\bf d}) - \delta \cdot \sum _{e \in E} d_e x_e \le {\rm\small MRS} ({\bf d}) - {\rm\small OPT} ({\bf d}) \le {\rm\small MR}. \end{equation*}
Hence, the regret of mrs with respect to \(\delta {\bf x}\) is at most \({\rm\small MR}\). Then a natural rounding approach is to try to match this property of mrs, i.e., search for an integer solution alg that does not cost much more than \(\delta {\bf x}\) in any realization. Suppose we choose alg that satisfies:
\begin{equation} {\rm\small ALG} = \text{argmin}_{{\rm\small SOL}} \max _{{\bf d}} \left[{\rm\small SOL} ({\bf d}) - \delta \sum _{e \in E} d_e x_e\right]. \tag{2} \end{equation}
Since alg has minimum regret with respect to \(\delta {\bf x}\), alg’s regret is also at most \({\rm\small MR}\). Note that \(\delta {\bf x}\) is a \((\delta \alpha ^{\prime }, \delta \beta ^{\prime })\)-robust solution. Hence, \({\rm\small ALG}\) is a \((\delta \alpha ^{\prime }, \delta \beta ^{\prime } + 1)\)-robust solution.
If we are solving a problem in P, the \({\rm\small ALG}\) that satisfies Equation (2) is the optimal solution with weights \(\max \lbrace \ell _e, u_e - (u_e - \ell _e) \delta x_e\rbrace\) and thus can be found in polynomial time. So, using an integral LP formulation (i.e., integrality gap of 1), we get a \((1, 2)\)-robust algorithm overall for these problems. This exactly matches the results in [16], although we are using a different set of techniques. Of course, for NP-hard problems, finding a solution \({\rm\small ALG}\) that satisfies Equation (2) is NP-hard as well. It turns out, however, that we can design a generic rounding algorithm that gives the following guarantee:
Theorem 1.3.
There exists a rounding algorithm that takes as input an \((\alpha , \beta)\)-robust fractional solution to stt (resp. tsp) and outputs a \((\gamma \delta \alpha , \gamma \delta \beta + \gamma)\)-robust integral solution, where \(\gamma\) and \(\delta\) are respectively the best approximation factor and integrality gap for (classical) stt (resp., tsp).
We remark that while we stated this rounding theorem for stt and tsp here, we actually give a more general version (Theorem 2.1) in Section 2 that applies to a broader class of covering problems including set cover, survivable network design, and so on, and might be useful in future research in this domain.

1.4 Roadmap

We present the general rounding algorithm for robust problems in Section 2. In Section 3, we use this rounding algorithm to give a robust algorithm for the Traveling Salesman problem. Section 4 gives a local search algorithm for the Steiner Tree problem. Both the local search algorithm and the rounding algorithm from Section 2 are then used to give a robust algorithm for the Steiner Tree problem in Section 5. The hardness results for robust problems appear in Section 6. Finally, we conclude with some interesting directions of future work in Section 7.

2 A General Rounding Algorithm for Robust Problems

In this section we give the rounding algorithm of Theorem 1.3, which is a corollary of the following, more general theorem:
Theorem 2.1.
Let \(\mathcal {P}\) be an optimization problem defined on a set system \(\mathcal {S} \subseteq 2^{E}\) that seeks to find the set \(S\in \mathcal {S}\) that minimizes \(\sum _{e \in S} d_e\), i.e., the sum of the weights of elements in S. In the robust version of this optimization problem, we have \(d_e\in [\ell _e, u_e]\) for all \(e\in E\).
Consider an LP formulation of \(\mathcal {P}\) (called \(\mathcal {P}\)-LP) given by: \(\lbrace \min \sum _{e \in E} d_e x_e: {\bf x}\in X, {\bf x}\in [0, 1]^E\rbrace\), where X is a polytope containing the indicator vector \(\chi _{S}\) of all \(S \in \mathcal {S}\) and not containing \(\chi _{S}\) for any \(S \notin \mathcal {S}\). The corresponding LP formulation for the robust version (called \(\mathcal {P}_{\rm robust}\)-LP) is given by: \(\lbrace \min r: {\bf x}\in X, {\bf x}\in [0, 1]^E, \sum _{e \in E} d_e x_e \le {\rm\small OPT} ({\bf d}) + r ~\forall {\bf d}\rbrace\).
Now, suppose we have the following properties:
(1) There is a \(\gamma\)-approximation algorithm for \(\mathcal {P}\).
(2) The integrality gap of \(\mathcal {P}\)-LP is at most \(\delta\).
(3) There is a feasible solution \({\bf x}^*\) to \(\mathcal {P}\)-LP that satisfies: \(\forall {\bf d}: \sum _{e \in E}d_e x_e^* \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR}\).
Then, there exists an algorithm that outputs a \((\gamma \delta \alpha , \gamma \delta \beta + \gamma)\)-robust \({\rm\small SOL}\) for \(\mathcal {P}\).
Proof.
The algorithm is as follows: Construct an instance of \(\mathcal {P}\) which uses the same set system \(\mathcal {S}\) and where element e has weight \(\max \lbrace u_e(1 - \delta x_e^*), \ell _e (1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*\). Then, use the \(\gamma\)-approximation algorithm for \(\mathcal {P}\) on this instance to find an integral solution S, and output it.
Given a feasible solution S to \(\mathcal {P}\), note that:
\begin{equation*} \max _{\bf d}\left[\sum _{e \in S} d_e - \delta \sum _{e \in E} d_e x_e^*\right] = \sum _{e \in S} \max \left\lbrace u_e\left(1 - \delta x_e^*\right), \ell _e(1 - \delta x_e^*)\right\rbrace - \sum _{e \notin S}\delta \ell _e x_e^* \end{equation*}
\begin{equation*} = \sum _{e \in S} \left[\max \left\lbrace u_e\left(1 - \delta x_e^*\right), \ell _e\left(1 - \delta x_e^*\right)\right\rbrace + \delta \ell _e x_e^*\right] - \sum _{e \in E} \delta \ell _e x_e^*. \end{equation*}
Now, note that since S was output by a \(\gamma\)-approximation algorithm, for any feasible solution \(S^{\prime }\):
\begin{equation*} \sum _{e \in S} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] \le \gamma \sum _{e \in S^{\prime }} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] \end{equation*}
\begin{align*} & \Rightarrow \ \sum _{e \in S} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] - \gamma \sum _{e \in E} \delta \ell _ex_e^*\\ \le &\, \gamma \left[\sum _{e \in S^{\prime }} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] - \sum _{e \in E} \delta \ell _e x_e^* \right] \\ = \, &\gamma \max _{\bf d}\left[\sum _{e \in S^{\prime }} d_e - \delta \sum _{e \in E} d_e x_e^* \right]. \end{align*}
Since \(\mathcal {P}\)-LP has integrality gap \(\delta\), for any fractional solution \({\bf x}\), \(\forall {\bf d}: {\rm\small OPT} ({\bf d}) \le \delta \sum _{e \in E} d_e x_e\). Fixing \(S^{\prime }\) to be the set of elements used in the minimum regret solution then gives:
\begin{equation*} \max _{\bf d}\left[\sum _{e \in S^{\prime }} d_e - \delta \sum _{e \in E} d_e x_e^* \right] \le \max _{\bf d}[{\rm\small MRS} ({\bf d}) - {\rm\small OPT} ({\bf d})] = {\rm\small MR}. \end{equation*}
Combined with the previous inequality, this gives:
\begin{align*} &\sum _{e \in S} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] - \gamma \sum _{e \in E} \delta \ell _ex_e^* \le \gamma {\rm\small MR} \\ \Rightarrow &\sum _{e \in S} [\max \lbrace u_e(1 - \delta x_e^*), \ell _e(1 - \delta x_e^*)\rbrace + \delta \ell _e x_e^*] - \sum _{e \in E} \delta \ell _ex_e^* \le \gamma {\rm\small MR} + (\gamma - 1) \sum _{e \in E} \delta \ell _ex_e^* \\ \Rightarrow &\max _{\bf d}\left[\sum _{e \in S} d_e - \delta \sum _{e \in E} d_e x_e^* \right] \le \gamma {\rm\small MR} + (\gamma - 1) \sum _{e \in E} \delta \ell _ex_e^*. \end{align*}
This implies:
\begin{align*} \forall {\bf d}: {\rm\small SOL} ({\bf d}) = \sum _{e \in S} d_e &\le \delta \sum _{e \in E} d_e x_e^* + \gamma {\rm\small MR} + (\gamma - 1) \sum _{e \in E} \delta \ell _ex_e^* \\ &\le \delta \sum _{e \in E} d_e x_e^* + \gamma {\rm\small MR} + (\gamma - 1) \sum _{e \in E} \delta d_ex_e^*\\ &=\gamma \delta \sum _{e \in E} d_e x_e^* + \gamma {\rm\small MR} \\ &\le \gamma \delta [\alpha {\rm\small OPT} ({\bf d}) + \beta {\rm\small MR} ] + \gamma {\rm\small MR} \\ &= \gamma \delta \alpha \cdot {\rm\small OPT} ({\bf d}) + (\gamma \delta \beta + \gamma) \cdot {\rm\small MR}. \end{align*}
i.e., \({\rm\small SOL}\) is \((\gamma \delta \alpha , \gamma \delta \beta + \gamma)\)-robust as desired.□
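For concreteness, the rounding step analyzed above can be summarized as follows (a minimal sketch with hypothetical helper names; the \(\gamma\)-approximation algorithm for \(\mathcal {P}\) is assumed as a black box).

```python
# Sketch of the algorithm in the proof of Theorem 2.1: build the surrogate
# instance with weights max{u_e(1 - delta*x_e), l_e(1 - delta*x_e)} + delta*l_e*x_e
# and run a gamma-approximation on it.
def round_robust_fractional(x_star, lower, upper, delta, approx_solve):
    weights = {}
    for e, xe in x_star.items():
        weights[e] = max(upper[e] * (1 - delta * xe),
                         lower[e] * (1 - delta * xe)) + delta * lower[e] * xe
    return approx_solve(weights)    # assumed gamma-approximate solver for P
```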

3 Algorithm for the Robust Traveling Salesman Problem

In this section, we give a robust algorithm for the traveling salesman problem:
Theorem 3.1.
There exists a \((4.5, 3.75)\)-robust algorithm for the traveling salesman problem.
Recall that we consider the version of tsp where edges may be used multiple times. Any tsp tour must contain a spanning tree, and an Eulerian walk on a doubled mst gives a 2-approximation for tsp (the “double-tree algorithm”). One might therefore hope that, since we have a \((1, 2)\)-robust algorithm for robust mst, taking its output and applying the double-tree algorithm yields a \((2, 4)\)-robust solution to robust tsp. Unfortunately, we show in Section 3.1 that this algorithm is not \((\alpha , \beta)\)-robust for any \(\alpha = o(n), \beta = o(n)\). Nevertheless, we are able to leverage the connection to mst to arrive at a \((4.5, 3.75)\)-robust algorithm for tsp, given in Section 3.3.

3.1 Failure of Double-Tree Algorithm

The black-box reduction of [16] for turning exact algorithms into \((1, 2)\)-robust algorithms simply uses the exact algorithm to find the optimal solution when all \(d_e\) are set to \(\frac{\ell _e + u_e}{2}\) and outputs this solution (see [16] for details on its analysis). We give an example of a robust tsp instance where applying the double-tree algorithm to the \((1, 2)\)-robust mst generated by this algorithm does not give a robust tsp solution. Since the doubling of this mst is a 2-approximation for tsp when all \(d_e\) are set to \(\frac{\ell _e + u_e}{2}\), this example will also show that using an approximation algorithm instead of an exact algorithm in the black-box reduction fails to give any reasonable robustness guarantee as stated in Section 1.
Consider an instance of robust tsp with vertices \(V = \lbrace v^{\prime }, v_1 \ldots v_n\rbrace\), where there is a “type-1” edge from \(v^{\prime }\) to each \(v_i\) with length \(1 - \epsilon\) for some \(\epsilon \gt \frac{1}{2(n-1)}\), and a “type-2” edge from \(v_i\) to \(v_{i+1}\) for each \(i \lt n\), as well as from \(v_n\) to \(v_1\), with length in the range \([0, 2 - \frac{1}{n-1}]\).
Consider \({\rm\small MRS}\), which uses \(n-1\) type-2 edges and two type-1 edges to connect \(v^{\prime }\) to the rest of the tour. Its regret is maximized in the realization where all the type-2 edges it is using have length \(2 - \frac{1}{n-1}\) and the type-2 edge it is not using has length 0. Note that if a solution contains a type-2 edge of length \(2 - \frac{1}{n-1}\), we can replace it with the two type-1 edges it is adjacent to and the cost of the solution decreases since we set \(\epsilon \gt \frac{1}{2(n-1)}\). In turn, the optimum solution for this realization uses the type-2 edge with length 0, the two type-1 edges adjacent to this type-2 edge once, and then the other \(n-2\) type-1 edges twice. So \({\rm\small MRS}\) has cost \((n-1)(2 - \frac{1}{n-1})+2(1-\epsilon) \le 2(n-1)\) whereas \({\rm\small OPT}\) has cost \(2(n-1)(1-\epsilon)\). Then, the regret of this solution is at most \(n\epsilon\).
When all edge costs are set to \(\frac{\ell _e + u_e}{2}\), since \(\epsilon \gt \frac{1}{2(n-1)}\) the minimum spanning tree of the graph is the star centered at \(v^{\prime }\), i.e., all the edges of length \(1-\epsilon\). So the \((1, 2)\)-robust algorithm outputs this tree for robust mst. Doubling this tree gives a solution to the robust tsp instance that costs \(2n(1-\epsilon)\) in all realizations of demands.
Consider the realization \({\bf d}\) where all type-2 edges have length 0. \({\rm\small MRS}\) costs \(2-2\epsilon\) and is also the optimal solution. If the double-tree solution is \((\alpha , \beta)\)-robust we get that:
\begin{equation*} 2n(1-\epsilon) \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR} \le \alpha \cdot (2 - 2\epsilon) + \beta n\epsilon . \end{equation*}
Setting \(\epsilon\) to e.g., \(1/n\) gives that one of \(\alpha , \beta\) is \(\Omega (n)\).
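The gap in this example is easy to verify numerically; the following sketch (illustrative only, using the quantities derived above rather than constructing the graph) evaluates the costs for a concrete n.

```python
# With eps = 1/n (which satisfies eps > 1/(2(n-1)) for n >= 3), the doubled
# midpoint MST costs 2n(1-eps) in every realization, while the optimum in the
# realization with all type-2 edges at length 0 is 2(1-eps); the regret bound
# stated above for MRS is n*eps.
def double_tree_gap(n):
    eps = 1.0 / n
    double_tree_cost = 2 * n * (1 - eps)   # doubled star centered at v'
    opt_cost = 2 * (1 - eps)               # MRS, which is optimal here
    mr_bound = n * eps                     # bound on MR stated above
    return double_tree_cost, opt_cost, mr_bound

print(double_tree_gap(100))   # roughly (198.0, 1.98, 1.0): ratio grows with n
```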

3.2 LP Relaxation

We use the LP relaxation of robust traveling salesman in Figure 1. This is the standard subtour LP (see e.g., [22]), augmented with variables specifying the edges used to visit each new vertex, as well as with the regret constraint set. In an integral solution, \(y_{uv}\) is 1 if, when the tour is split into subpaths at each point where a vertex is visited for the first time, there is a subpath from u to v (or vice versa). That is, \(y_{uv}\) is 1 if between the first time u is visited and the first time v is visited, the tour only goes through vertices that were already visited before visiting u. \(x_{e,u,v}\) is 1 if the edge e is used on this subpath. We use \(x_e\) to denote \(\sum _{u, v \in V} x_{e,u,v}\) for brevity. In the remainder of this subsection, we discuss why the constraints other than the regret constraint set in (3) define the same polytope as the standard tsp formulation; this discussion may be skipped without affecting the readability of the rest of the paper.
Fig. 1. The robust TSP polytope.
The standard LP for tsp is the subtour LP (see e.g., [22]), which is as follows:
\begin{equation} \begin{array}{lll} \min & \sum _{(u, v) \in E} c_{uv} y_{uv}&\\ \text{s.t. }& \forall \emptyset \ne S \subset V: & \sum _{(u, v) \in \delta (S)} y_{uv} \ge 2 \\ & \forall u \in V: & \sum _{(u, v) \in E} y_{uv} = 2 \\ &\forall (u, v) \in E:& 0 \le y_{uv} \le 1 \end{array} \tag{4} \end{equation}
where \(\delta (S)\) denotes the set of edges with one endpoint in S. Note that because the graph is undirected, the order of u and v in terms such as \((u,v)\), \(c_{uv}\), and \(y_{uv}\) is immaterial, e.g., there is no distinction between edge \((u,v)\) and edge \((v,u)\) and \(y_{uv}\) and \(y_{vu}\) denote the same variable. This LP is written for the problem formulation where the triangle inequality holds, and thus we only need to consider tours that are cycles that visit every vertex exactly once. We are concerned, however, with the formulation where the triangle inequality does not necessarily hold, but tours can revisit vertices and edges multiple times. To modify the subtour LP to account for this formulation, we instead let \(y_{uv}\) be an indicator variable for whether our solution connects u to v using some path in the graph. Using this definition for \(y_{uv}\), the subtour LP constraints then tell us that we must buy a set of paths such that a set of edges directly connecting the endpoints of the paths would form a cycle visiting every vertex exactly once. Then, we introduce variables \(x_{e, u, v}\) denoting that we are using the edge e on the path from u to v. For ease of notation, we let \(x_e = \sum _{u, v \in V} x_{e, u, v}\) denote the number of times a fractional solution uses the edge e in paths. We can then use standard constraints from the canonical shortest path LP to ensure that in an integer solution \(y_{uv}\) is set to 1 only if for some path from u to v, all edges e on the path have \(x_{e, u, v}\) set to 1.
Lastly, note that the optimal tour does not use an edge more than twice. Suppose a tour uses the edge \(e = (u, v)\) thrice. By fixing a start/end vertex for the tour, we can split the tour into \(e, P_1, e, P_2, e, P_3\) where \(P_1\) is the part of the tour between the first and second use of e, \(P_2\) is the part of the tour between the second and third use of e, and \(P_3\) is the part of the tour after the third use of e. Because the tour starts and ends at the same vertex (u or v), and each of the three uses of edge e goes from u to v or vice versa, the number of \(P_1\), \(P_2\), and \(P_3\) that go from u to v or vice-versa (as opposed to going from u to u or v to v) must be odd, and hence not zero. Without loss of generality, we can assume \(P_1\) goes from u to v. Then, the tour \(\overline{P_1}, P_2, e, P_3\), where \(\overline{P_1}\) denotes the reversal of \(P_1\), is a valid tour and costs strictly less than the original tour. So any tour using an edge more than twice is not optimal. This lets us add the constraint \(x_e \le 2\) to the LP without affecting the optimal solution.
This gives the formulation for tsp without triangle inequality but with repeated edges allowed:
\begin{equation} \begin{array}{lll} \min &\sum _{e \in E} c_e x_{e}&\\ \text{s.t. } & \forall \emptyset \ne S \subset V: & \sum _{u \in S,v \in V \backslash S} y_{uv} \ge 2 \\ & \forall u \in V: & \sum _{v \ne u} y_{uv} = 2 \\ &\forall \emptyset \ne S \subset V, u \in S, v \in V \backslash S:& \sum _{e \in \delta (S)} x_{e, u, v} \ge y_{uv}\\ &\forall u,v \in V, v \ne u:& 0 \le y_{uv} \le 1\\ &\forall e \in E, u, v \in V, v \ne u:& 0 \le x_{e, u, v} \le 1\\ &\forall e \in E:& x_e \le 2 \end{array} \tag{5} \end{equation}
By integrality of the shortest path polytope, if we let \(p_{uv}\) denote the length of the shortest path from u to v, then \(\sum _{e \in E, u, v \in V} c_e x_{e, u, v} \ge \sum _{u, v \in V}p_{uv}y_{uv}\). In particular, if we fix the value of \(y_{uv}\) the optimal setting of \(x_{e, u, v}\) values is to set \(x_{e, u, v}\) to \(y_{uv}\) for every e on the shortest path from u to v. So (5) without the triangle inequality assumption is equivalent to (4) with the triangle inequality assumption. In particular, the integrality gap of (5) is the same as the integrality gap of (4), which is known to be at most 3/2 [23]. Then, adding a variable r for the fractional solution’s regret and the regret constraint set gives (3).

3.3 Approximate Separation Oracle

We now describe the separation oracle RRTSP-Oracle used to separate (3). All constraints except the regret constraint set can be separated in polynomial time by solving a min-cut problem. Recall that exactly separating the regret constraint set involves finding an “adversary” \({\rm\small SOL}\) that maximizes \(\max _{{\bf d}} [\sum _{e \in E} d_ex_e - {\rm\small SOL} ({\bf d})]\), and seeing if this quantity exceeds r. However, since TSP is NP-hard, finding this solution in general is NP-hard. Instead, we will only consider a solution \({\rm\small SOL}\) if it is a walk on some spanning tree T, and find the one that maximizes \(\max _{{\bf d}} [\sum _{e \in E} d_ex_e - {\rm\small SOL} ({\bf d})]\).
Fix any \({\rm\small SOL}\) that is a walk on some spanning tree T. For any e, if e is not in T, the regret of \({\bf x}, {\bf y}\) against \({\rm\small SOL}\) is maximized by setting e’s length to \(u_e\). If e is in T, then \({\rm\small SOL}\) is paying \(2d_e\) for that edge whereas the fractional solution pays \(d_ex_e \le 2d_e\), so to maximize the fractional solution’s regret, \(d_e\) should be set to \(\ell _e\). This gives that the regret of fractional solution \({\bf x}\) against any \({\rm\small SOL}\) that is a spanning tree walk on T is
\begin{equation*} \sum _{e \in T}(\ell _ex_e - 2\ell _e) + \sum _{e \notin T} u_ex_e = \sum _{e \in E} u_ex_e - \sum _{e \in T}(u_ex_e - (\ell _ex_e - 2\ell _e)). \end{equation*}
The quantity \(\sum _{e \in E} u_ex_e\) is fixed with respect to T, so finding the spanning tree T that maximizes this quantity is equivalent to finding T that minimizes \(\sum _{e \in T}(u_ex_e - (\ell _ex_e - 2\ell _e))\). But this is just an instance of the minimum spanning tree problem where edge e has weight \(u_ex_e - (\ell _ex_e - 2\ell _e)\), and thus we can find T in polynomial time. This gives the following lemma:
Lemma 3.2.
For any instance of robust traveling salesman there exists an algorithm RRTSP-Oracle that given a solution \(({\bf x}, {\bf y}, r)\) to (3) either:
Outputs a separating hyperplane for (3), or
Outputs “Feasible”, in which case \(({\bf x}, {\bf y})\) is feasible for the (non-robust) TSP LP and \(\forall {\bf d}: \sum _{e \in E} d_e x_e \le 2 \cdot {\rm\small OPT} ({\bf d}) + r\).
Proof of Lemma 3.2
RRTSP-Oracle is given in Figure 2. All inequalities except the regret constraint set can be checked exactly by RRTSP-Oracle. Consider the tree \(T^{\prime }\) computed in RRTSP-Oracle and \({\bf d}^{\prime }\) with \(d_e^{\prime } = \ell _e\) for \(e \in T^{\prime }\) and \(d_e^{\prime } = u_e\) for \(e \notin T^{\prime }\). The only other violated inequality RRTSP-Oracle can output is the inequality \(\sum _{e \in T^{\prime }}(\ell _ex_e - 2\ell _e) + \sum _{e \notin T^{\prime }} u_ex_e \le r\) in line 5, which is equivalent to \(\sum _{e \in E} d_e^{\prime } x_e \le 2\sum _{e \in T^{\prime }} d_e^{\prime } + r\). Since \(2\sum _{e \in T^{\prime }} d_e^{\prime }\) is the cost of a tour in realization \({\bf d}^{\prime }\) (the tour that follows a DFS on the spanning tree \(T^{\prime }\)), this inequality is implied by the inequality \(\sum _{e \in E} d_e^{\prime } x_e \le {\rm\small OPT} ({\bf d}^{\prime }) + r\) from the regret constraint set. Furthermore, RRTSP-Oracle only outputs this inequality when it is actually violated.
Fig. 2. Separation Oracle for (3).
So, it suffices to show that if there exists \({\bf d}\) such that \(\sum _{e \in E} d_e x_e \gt 2 {\rm\small OPT} ({\bf d}) + r\) then RRTSP-Oracle outputs a violated inequality on line 5. Since \({\rm\small OPT} ({\bf d})\) visits all vertices, it contains some spanning tree T, such that \({\rm\small OPT} ({\bf d}) \ge \sum _{e \in T} d_e\). Combining these inequalities gives
\begin{equation*} \sum _{e \in E}d_ex_e \gt 2 \sum _{e \in T}d_e + r. \end{equation*}
Since all \(x_e\) are at most 2, setting \(d_e = \ell _e\) for \(e \in T\) and \(d_e = u_e\) otherwise can only increase \(\sum _{e \in E}d_ex_e - 2 \sum _{e \in T}d_e\), so
\begin{equation*} \sum _{e \in T}\ell _ex_e + \sum _{e \notin T}u_ex_e \gt 2 \sum _{e \in T}\ell _e + r \Rightarrow \sum _{e \in E}u_ex_e - \sum _{e \in T}(u_ex_e - (\ell _ex_e - 2\ell _e)) \gt r. \end{equation*}
Then RRTSP-Oracle finds a minimum spanning tree \(T^{\prime }\) on \(G^{\prime }\), i.e., \(T^{\prime }\) such that
\begin{equation*} \sum _{e \in T^{\prime }}(u_ex_e - (\ell _ex_e - 2\ell _e)) \le \sum _{e \in T}(u_ex_e - (\ell _ex_e - 2\ell _e)), \end{equation*}
which combined with the previous inequality gives
\begin{equation*} \sum _{e \in E}u_ex_e - \sum _{e \in T^{\prime }}(u_ex_e - (\ell _ex_e - 2\ell _e)) \gt r \Rightarrow \sum _{e \in T^{\prime }}(\ell _ex_e - 2\ell _e) + \sum _{e \notin T^{\prime }}u_ex_e \gt r. \end{equation*}
By using the ellipsoid method with separation oracle RRTSP-Oracle and the fact that (3) has optimum at most mr, we get a \((2, 1)\)-robust fractional solution. Applying Theorem 1.3 as well as the fact that the TSP polytope has integrality gap \(3/2\) (see e.g., [22]) and the TSP problem has a \(3/2\)-approximation gives Theorem 3.1.
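The core computation of RRTSP-Oracle can be sketched as follows (illustrative only; networkx’s minimum spanning tree routine is used as an assumed black box, and the exact pseudocode of Figure 2 is not reproduced).

```python
import networkx as nx

def rrtsp_regret_check(x, lower, upper, r):
    """x, lower, upper: dicts keyed by edge tuples (u, v); r: candidate regret.
    Restricts the adversary to walks on spanning trees and checks whether some
    such walk certifies regret larger than r."""
    H = nx.Graph()
    for (u, v), xe in x.items():
        # weight u_e*x_e - (l_e*x_e - 2*l_e) from the derivation above
        H.add_edge(u, v, weight=upper[(u, v)] * xe - lower[(u, v)] * xe
                               + 2 * lower[(u, v)])
    T = nx.minimum_spanning_tree(H)
    tree_weight = sum(H[u][v]["weight"] for (u, v) in T.edges())
    regret_vs_tree_walks = sum(upper[e] * x[e] for e in x) - tree_weight
    return regret_vs_tree_walks > r, T   # True: a regret constraint is violated
```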

4 A Local Search Algorithm for Steiner Tree

In this section, we describe a local search algorithm for the Steiner tree problem, given in [13]. By a simplified version of the analysis appearing in [13], we show that the algorithm is 4-approximate. As with many local search algorithms, this algorithm could run in superpolynomial time in the worst case. Standard tricks can be used to modify this algorithm into a polynomial-time \((4+\epsilon)\)-approximation. This algorithm will serve as a primitive in the algorithms we design in Section 5.
The local moves considered by the algorithm are all path swaps, defined as follows: If the current Steiner tree is T, the algorithm can pick any two vertices \(u, v\) in T such that there exists a path from u to v where all vertices except u and v on the path are not in T (and thus all edges on the path are not in T). The algorithm may take any path f from u to v of this form; it suffices to consider only the shortest path of this form. The path f is added to T, inducing a cycle. The algorithm then picks a subpath a in the cycle and removes it from T, while maintaining that T is a feasible Steiner tree. It suffices to consider only maximal subpaths. These are just the subpaths formed by splitting the cycle at every vertex with degree at least 3 in \(T \cup \lbrace f\rbrace\). Let n denote the number of nodes and m the number of edges in the graph. Since there are at most \(\binom{n}{2}\) pairs of vertices \(u, v\), a shortest path f between u and v can be found in \(O(m + n\log n)\) time, and all maximal subpaths of the induced cycle in \(T \cup \lbrace f\rbrace\) can be found in \(O(n)\) time, if there is a move that improves the cost of T, we can find it in polynomial time.
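The following schematic sketch (assumed helper functions; not the implementation from [13]) shows one round of this local search: it scans candidate paths f and maximal subpaths a and applies the first improving swap it finds.

```python
def try_path_swap(tree_edges, terminals, candidate_paths, cycle_subpaths,
                  feasible, cost):
    """tree_edges: edge set of the current Steiner tree T.
    candidate_paths(T): paths f whose endpoints are in T and whose internal
        vertices are outside T (assumed helper, e.g., via shortest paths).
    cycle_subpaths(T, f): maximal subpaths of the unique cycle in T + f.
    feasible, cost: feasibility check and total edge weight (assumed helpers)."""
    for f in candidate_paths(tree_edges):
        for a in cycle_subpaths(tree_edges, f):
            new_tree = (tree_edges - set(a)) | set(f)
            if feasible(new_tree, terminals) and cost(new_tree) < cost(tree_edges):
                return new_tree          # improving path swap found
    return None                          # T is locally optimal under path swaps
```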
We will use the following lemma to show the approximation ratio.
Lemma 4.1.
For any tree the fraction of vertices with degree at most 2 is strictly greater than \(1/2\).
Proof.
This follows from a Markov bound on the random variable X defined as the degree of a uniformly random vertex minus one. A tree with n vertices has average degree \(\frac{2n-2}{n} \lt 2\), so \(\mathbb {E}[X] \lt 1\). In turn, the fraction of vertices with degree 3 or greater is \(\Pr [X \ge 2] \le \mathbb {E}[X]/2 \lt 1/2\).□
Theorem 4.2.
Let A be a solution to an instance of Steiner tree such that no path-swap reduces the cost of A. Then A is a 4-approximation.
Proof.
Consider any other solution F to the Steiner tree instance. We partition the edges of A into subpaths whose endpoints are (i) vertices with degree 3 or larger in A, or (ii) vertices in both A and F (which might also have degree 3 or larger in A). Besides the endpoints, all other vertices in each subpath have degree 2 and are in A but not in F. Note that a vertex may appear as the endpoint of more than one subpath. Note also that the set of vertices in F includes the terminals, which, without loss of generality, include all leaves in A. This, along with condition (i) for endpoints, ensures the partition into subpaths is well-defined, i.e., if a subpath ends at a leaf of A, that leaf is in F.
We also decompose F into subpaths, but some edges may be contained in two of these subpaths. To decompose F into subpaths, we first partition the edges of F into maximal connected subgraphs of F whose leaves are vertices in A (including terminals) and whose internal vertices are not in A. Note that some vertices may appear in more than one subgraph, e.g., an internal vertex in F that is in A becomes a leaf in multiple subgraphs. Since these subgraphs are not necessarily paths, we next take any DFS walk on each of these subgraphs starting from one of their leaves (that is, one of the vertices in A). We take the route traversed by the DFS walk, and split it into subpaths at every point where the DFS walk reaches a leaf. This now gives a set of subpaths in F such that each subpaths’ endpoints are vertices in A, no other vertices on a subpath are in A, and no edge appears in more than two subpaths. Let \(\mathcal {A}\) and \(\mathcal {F}\) denote the set of subpaths we decomposed A and F into, respectively.
For \(a \in \mathcal {A}\) let \(N(a) \subseteq \mathcal {F}\) be the set of subpaths f in \(\mathcal {F}\) such that \(A \setminus a \cup f\) is a feasible Steiner tree, i.e., f can be swapped for a, and let \(N(X) = \cup _{a \in X} N(a)\). We will show that for any \(X \subseteq \mathcal {A}\), \(|N(X)| \ge \frac{1}{2}|X|\). By an extension of Hall’s Theorem (Fact 15 in [13]) this implies the existence of a weight function \(\alpha : \mathcal {A} \times \mathcal {F} \rightarrow \mathbb {R}^+\) such that:
(1) \(\alpha (a, f) \gt 0\) only if f can be swapped for a.
(2) For all subpaths \(a \in \mathcal {A}\), \(\sum _{f \in N(a)} \alpha (a, f) = 1\).
(3) For all subpaths \(f \in \mathcal {F}\), \(\sum _{a \in N^{-1}(f)} \alpha (a, f) \le 2\).
This weight function then gives:
\begin{equation*} c(A) = \sum _{a \in \mathcal {A}} c(a) = \sum _{a \in \mathcal {A}} \sum _{f \in N(a)} c(a) \alpha (a, f) \le \sum _{f \in \mathcal {F}} \sum _{a \in N^{-1}(f)} c(f) \alpha (a, f) \le \sum _{f \in \mathcal {F}} 2c(f) \le 4c(F), \end{equation*}
where \(N^{-1}(f) = \lbrace a \in \mathcal {A}: f \in N(a)\rbrace\). The first inequality holds by the assumption in the theorem statement that no swaps reduce the cost of A, so for \(a \in \mathcal {A}\) and \(f \in N(a)\), \(c(a) \le c(f)\). The last inequality follows by the fact that every edge in F appears in at most two subpaths in \(\mathcal {F}\).
We now turn towards proving that for any \(X \subseteq \mathcal {A}\), \(|N(X)| \ge \frac{1}{2}|X|\). Fix any \(X \subseteq \mathcal {A}\). Suppose that we remove all of the edges on paths \(a \in X\) from A, and also remove all vertices on these paths except their endpoints. After removing these nodes and edges, we are left with \(|X|+1\) connected components. Let \(T^{\prime }\) be a tree with \(|X|+1\) vertices, one for each of these connected components, with an edge between any pair of vertices in \(T^{\prime }\) whose corresponding components are connected by a subpath \(a \in X\). Consider any vertex of \(T^{\prime }\) with degree at most 2. We claim the corresponding component contains a vertex in F. Let \(V^{\prime }\) denote the set of vertices in the corresponding component that are endpoints of subpaths in \(\mathcal {A}\). There must be at least one such vertex in \(V^{\prime }\). Furthermore, it is not possible that all of the vertices in \(V^{\prime }\) are internal vertices of A with degree at least 3, since at most two subpaths leave this component and there are no cycles in A. The only other option for endpoints is vertices in F, so this component must contain some vertex in F.
Applying Lemma 4.1, strictly more than \((|X|+1)/2\) (i.e., at least \(|X|/2 + 1\)) of the components have degree at most 2, and by the previous argument contain a vertex in F. These vertices are connected by F, and since each subpath in \(\mathcal {F}\) does not have internal vertices in A, no subpath in \(\mathcal {F}\) passes through more than two of these components. Hence, at least \(|X|/2\) of the subpaths in \(\mathcal {F}\) have endpoints in two different components because at least \(|X|/2\) edges are required to connect \(|X|/2+1\) vertices. In turn, any of these \(|X|/2\) paths f could be swapped for one of the subpaths \(a \in X\) that is on the path between the components containing f’s endpoints. This shows that \(|N(X)| \ge |X|/2\) as desired.□

5 Algorithm for the Robust Steiner Tree Problem

In this section, our goal is to find a fractional solution to the LP in Figure 3 for the robust Steiner tree. By Theorem 1.3 and known approximation/integrality gap results for Steiner Tree, this will give the following theorem:
Fig. 3. The robust Steiner tree polytope.
Theorem 5.1.
There exists a \((2755, 64)\)-robust algorithm for the Steiner tree problem.
It is well-known that the standard Steiner tree polytope admits an exact separation oracle (by solving the \(s, t\)-min-cut problem using edge weights \(x_e\) for all \(s, t \in T\)), so it suffices to find an approximate separation oracle for the regret constraint set. We therefore focus in this section on deriving such an oracle. Doing so is the most technically difficult part of the paper, so we break the section up into multiple parts as follows: In Section 5.1, we start with a simpler case where \(\ell _e = 0\) for all edges, and show how the local search algorithm of the previous section can help design the separation oracle in this case. In Section 5.2, we state our main technical lemma (Lemma 5.5), give a high-level overview of its proof, and show how it implies Theorem 5.1. In Section 5.3, we give the algorithm of Lemma 5.5. In Section 5.4, we analyze this algorithm’s guarantees, deferring some proofs that are highly technical and already covered at a high level in Section 5.2. In Section 5.5, we give the deferred proofs.
We believe the high-level overview in Section 5.2 captures the main ideas of Sections 5.3, 5.4, and 5.5, and thus a reader who just wants to understand the algorithm and analysis at a high level can stop reading after Section 5.2. A reader who wants a deeper understanding of the algorithm’s design choices and implementation but is not too concerned with the details of the analysis can stop reading after Section 5.3. A reader who wants a deeper understanding of the analysis can stop reading after Section 5.4 and still have a strong understanding of the analysis.

5.1 Special Case where the Lower Bounds on All Edge Lengths Are \(\ell _e = 0\)

In this section, we give a simple algorithm/analysis for the special case when \(\ell _e = 0\) for all edges.
First, we create the derived instance of the Steiner tree problem, which is a copy \(G^{\prime }\) of the input graph G with edge weights \(u_e x_e +\ell _e -\ell _e x_e\). As noted earlier, the optimal Steiner tree \(T^*\) on the derived instance maximizes the regret of the fractional solution \({\bf x}\). However, since Steiner tree is NP-hard, we cannot hope to exactly find \(T^*\). We need a Steiner tree \(\hat{T}\) such that the regret caused by it can be bounded against that caused by \(T^*\). The difficulty is that the regret corresponds to the total weight of edges not in the Steiner tree (plus an offset that we will address later), whereas standard Steiner tree approximations give guarantees in terms of the total weight of edges in the Steiner tree. We overcome this difficulty by requiring a stricter notion of “difference approximation”: that the weight of edges in \(\hat{T} \setminus T^*\) be bounded against those in \(T^* \setminus \hat{T}\). Note that this condition ensures that not only is the weight of edges in \(\hat{T}\) bounded against those in \(T^*\), but also that the weight of edges not in \(\hat{T}\) is bounded against that of edges not in \(T^*\). We show the following lemma to obtain the difference approximation:
Lemma 5.2.
For any \(\epsilon \gt 0\), there exists a polynomial-time algorithm for the Steiner tree problem such that if \({\rm\small OPT}\) denotes the set of edges in the optimal solution and \(c(S)\) denotes the total weight of edges in S, then for any input instance of Steiner tree, the output solution \({\rm\small ALG}\) satisfies \(c({\rm\small ALG} \setminus {\rm\small OPT}) \le (4 + \epsilon) \cdot c({\rm\small OPT} \setminus {\rm\small ALG})\).
Proof
The algorithm we use is the local search algorithm described in Section 4, which finds \({\rm\small ALG}\) such that \(c({\rm\small ALG}) \le 4 \cdot c({\rm\small OPT})\). Suppose that the cost of each edge \(e \in {\rm\small ALG} \cap {\rm\small OPT}\) is now changed from its initial value to 0. After this change, \({\rm\small ALG}\) remains locally optimal because for every feasible solution F that can be reached by making a local move from \({\rm\small ALG}\), the amount by which the cost of \({\rm\small ALG}\) has decreased by setting edge costs to zero is at least the amount by which F has decreased. Hence, no local move causes a decrease in cost. Thus, \({\rm\small ALG}\) remains a 4-approximation, which implies that \(c({\rm\small ALG} \setminus {\rm\small OPT}) \le 4 \cdot c({\rm\small OPT} \setminus {\rm\small ALG})\).
We also need to show that the algorithm converges in polynomially many iterations. The authors in [13] achieve this convergence by discretizing all the edge costs to the nearest multiple of \(\frac{\epsilon }{kn} c({\rm\small APX})\) for an initial solution \({\rm\small APX}\) such that \(c({\rm\small OPT}) \le c({\rm\small APX}) \le k c({\rm\small OPT})\) (e.g., a simple way to do so is to start with a solution formed by the union of shortest paths between terminals, and then remove edges which cause cycles arbitrarily. This solution has cost between \(c({\rm\small OPT})\) and \(n^2 \cdot c({\rm\small OPT})\). See Section B.3 of [13] for more details). This guarantees that the algorithm converges in \(\frac{kn}{\epsilon }\) iterations, at an additive \(\epsilon \cdot c({\rm\small OPT})\) cost. For a standard approximation algorithm this is not an issue, but for an algorithm that aims for a guarantee of the form \(c({\rm\small ALG} \setminus {\rm\small OPT}) \le O(1) \cdot c({\rm\small OPT} \setminus {\rm\small ALG})\) an additive \(\epsilon \cdot c({\rm\small OPT})\) might be too much.
We modify the algorithm as follows to ensure that it converges in polynomially many iterations: We only consider swapping out a for f if the decrease in cost is at least \(\epsilon /4\) times the cost of a, and we always choose the swap of this kind that decreases the cost by the largest amount.
We now show the algorithm converges. We rely on two claims that are proved later in the paper, so for brevity we do not include their proofs here. The first claim is that as long as \(c({\rm\small ALG} \setminus {\rm\small OPT}) \gt (4+\epsilon)c({\rm\small OPT} \setminus {\rm\small ALG})\), there is a swap between \({\rm\small ALG}\) and \({\rm\small OPT}\) whose decrease in cost is at least \(\epsilon /4\) times the cost of the path being swapped out, and at least \(\epsilon /4n^2\) times \(c({\rm\small ALG} \setminus {\rm\small OPT})\) (the proof is similar to that of Lemma 5.10 in Section 5.3). The second claim is that in any swap the quantity \(c({\rm\small ALG} \setminus {\rm\small OPT}) - c({\rm\small OPT} \setminus {\rm\small ALG})\) decreases by the same amount that \(c({\rm\small ALG})\) does (see Lemma 5.12 in Section 5.4).
So, we use \(c({\rm\small ALG} \setminus {\rm\small OPT}) - c({\rm\small OPT} \setminus {\rm\small ALG})\) as a potential to bound the number of swaps. This potential is initially at most \(n \max _e c_e\), is always at least \(\min _e c_e\) as long as \(c({\rm\small ALG} \setminus {\rm\small OPT}) \gt c({\rm\small OPT} \setminus {\rm\small ALG})\), and each swap decreases it multiplicatively by at least a factor of \((1-\epsilon /4n^2)\) as long as \(c({\rm\small ALG} \setminus {\rm\small OPT}) \gt (4+\epsilon)c({\rm\small OPT} \setminus {\rm\small ALG})\). Thus, the algorithm only needs to make \(\frac{\log (n \max _e c_e / \min _e c_e)}{-\log (1-\epsilon /4n^2)}\) swaps to arrive at a solution that is a \((4+\epsilon)\)-approximation, which is a polynomial in the input size.□
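The modified move-selection rule used in this proof can be sketched as follows (same assumed helpers as the path-swap sketch in Section 4): among all feasible swaps whose gain is at least \(\epsilon /4\) times the cost of the removed subpath, pick the one with the largest gain.

```python
def best_significant_swap(tree_edges, terminals, candidate_paths, cycle_subpaths,
                          feasible, cost, eps):
    best, best_gain = None, 0.0
    for f in candidate_paths(tree_edges):
        for a in cycle_subpaths(tree_edges, f):
            new_tree = (tree_edges - set(a)) | set(f)
            if not feasible(new_tree, terminals):
                continue
            gain = cost(tree_edges) - cost(new_tree)
            # accept only swaps whose gain is >= (eps/4) * cost of the removed path
            if gain >= (eps / 4.0) * cost(set(a)) and gain > best_gain:
                best, best_gain = new_tree, gain
    return best   # None once no swap clears the threshold
```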
Recall that the regret caused by T is not exactly the weight of edges not in T, but includes a fixed offset of \(\sum _{e \in E} (\ell _e - \ell _e x_e)\). If \(\ell _e = 0\) for all edges, i.e., the offset is 0, then we can recover a robust algorithm from Lemma 5.2 alone with much better constants than in Theorem 5.1:
Lemma 5.3.
For any instance of robust Steiner tree for which all \(\ell _e = 0\), for every \(\epsilon \gt 0\) there exists an algorithm RRST-Oracle-ZLB which, given a solution \(({\bf x}, r)\) to the LP in Figure 3, either:
Outputs a separating hyperplane for the LP in Figure 3, or
Outputs “Feasible”, in which case \({\bf x}\) is feasible for the (non-robust) Steiner tree LP and \(\forall {\bf d}: \sum _{e \in E} d_e x_e \le {\rm\small OPT} ({\bf d}) + (4 + \epsilon)r\).
RRST-Oracle-ZLB is given in Figure 4. Via the ellipsoid method this gives a \((1, 4+\epsilon)\)-robust fractional solution. Using Theorem 1.3, the fact that the integrality gap of the LP we use is 2 [21], and that there is a \((\ln 4+\epsilon) \approx 1.39\)-approximation for Steiner tree [8], with appropriate choice of \(\epsilon\) we get the following corollary:
Fig. 4.
Fig. 4. Separation Oracle for the LP in Figure 3 when \(\ell _e = 0,~\forall e\).
Corollary 5.4.
There exists a \((2.78, 12.51)\)-robust algorithm for Steiner tree when \(\ell _e = 0\) for all \(e \in E\).
Proof of Lemma 5.3
All inequalities except the regret constraint set can be checked exactly by RRST-Oracle-ZLB. Consider the tree \(T^{\prime }\) computed in RRST-Oracle-ZLB and \({\bf d}^{\prime }\) with \(d_e^{\prime } = 0\) for \(e \in T^{\prime }\) and \(d_e^{\prime } = u_e\) for \(e \notin T^{\prime }\). The only other violated inequality RRST-Oracle-ZLB can output is the inequality \(\sum _{e \notin T^{\prime }} u_e x_e \le r\) in line 5, which is equivalent to \(\sum _{e \in E} d_e^{\prime } x_e \le T^{\prime }({\bf d}^{\prime }) + r\), an inequality from the regret constraint set. Furthermore, RRST-Oracle-ZLB only outputs this inequality when it is actually violated. So, it suffices to show that if there exists \({\bf d}, {\rm\small SOL}\) such that \(\sum _{e \in E} d_e x_e \gt {\rm\small SOL} ({\bf d}) + (4 + \epsilon)r\) then RRST-Oracle-ZLB outputs a violated inequality on line 5, i.e., finds Steiner tree \(T^{\prime }\) such that \(\sum _{e \notin T^{\prime }} u_e x_e \gt r\).
Suppose there exists \({\bf d}, {\rm\small SOL}\) such that \(\sum _{e \in E} d_e x_e \gt {\rm\small SOL} ({\bf d}) + (4 + \epsilon)r\). Let \({\bf d}^*\) be the vector obtained from \({\bf d}\) by replacing \(d_e\) with \(u_e\) for edges not in \({\rm\small SOL}\) and with 0 for edges in \({\rm\small SOL}\). Replacing \({\bf d}\) with \({\bf d}^*\) can only increase \(\sum _{e \in E} d_e x_e - {\rm\small SOL} ({\bf d})\), i.e.,:
\begin{equation} \sum _{e \notin {\rm\small SOL}} u_e x_e = \sum _{e \in E} d_e^* x_e \gt {\rm\small SOL} ({\bf d}^*) + (4 + \epsilon)r = (4 + \epsilon)r. \end{equation}
(9)
Consider the graph \(G^{\prime }\) constructed by RRST-Oracle-ZLB. We partition the edges into four sets \(E_0, E_S, E_T, E_{ST}\), where \(E_0 = E \setminus ({\rm\small SOL} \cup T^{\prime })\), \(E_S = {\rm\small SOL} \setminus T^{\prime }\), \(E_T = T^{\prime } \setminus {\rm\small SOL}\), and \(E_{ST} = {\rm\small SOL} \cap T^{\prime }\). Let \(c(E^{\prime })\) for \(E^{\prime } = E_0, E_S, E_T, E_{ST}\) denote \(\sum _{e \in E^{\prime }} u_ex_e\), i.e., the total cost of the edge set \(E^{\prime }\) in \(G^{\prime }\). Since \({\bf d}^*\) has \(d_e^* = 0\) for \(e \in {\rm\small SOL}\), from (9) we get that \(c(E_0) + c(E_{T}) \gt (4 + \epsilon) r\).
Now note that \(\sum _{e \notin T^{\prime }} u_e x_e = c(E_0) + c(E_{S})\). Lemma 5.2 gives that \((4 + \epsilon)c(E_{S}) \ge c(E_{T})\). Putting it all together, we get that:
\begin{equation*} \sum _{e \notin T^{\prime }} u_e x_e = c(E_0) + c(E_S) \ge c(E_0) + \frac{c(E_T)}{4 + \epsilon } \ge \frac{c(E_0) + c(E_T)}{4 + \epsilon } \gt \frac{(4+\epsilon)r}{4 + \epsilon } = r. \end{equation*}
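The check performed by RRST-Oracle-ZLB can be summarized in the following Python sketch; the routine difference_approx_steiner_tree, standing in for the local search of Lemma 5.2 run on the weights \(u_e x_e\), is an assumed helper and not part of Figure 4.

def zlb_regret_check(x, r, u, edges, terminals, difference_approx_steiner_tree):
    # Weight each edge e by u_e * x_e and compute an approximately locally
    # optimal Steiner tree T' with respect to these weights.
    weights = {e: u[e] * x[e] for e in edges}
    t_prime = set(difference_approx_steiner_tree(weights, terminals))
    # With all lower bounds zero, the fractional solution's regret against T'
    # is exactly the u_e x_e weight of the edges outside T'.
    regret = sum(u[e] * x[e] for e in edges if e not in t_prime)
    if regret > r:
        # The regret constraint sum_{e not in T'} u_e x_e <= r is violated.
        return "violated", t_prime
    return "feasible", None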

5.2 General Case for Arbitrary Lower Bounds on Edge Lengths: High Level Overview

In this section, we give our main lemma (Lemma 5.5), a high-level overview of the algorithm and analysis proving this lemma, and show how the lemma implies Theorem 5.1.
In the general case, the approximation guarantee given in Lemma 5.2 alone does not suffice because of the offset of \(\sum _{e \in E} (\ell _e - \ell _e x_e)\). We instead rely on a stronger notion of approximation formalized in the next lemma that provides simultaneous approximation guarantees on two sets of edge weights: \(c_e = u_e x_e - \ell _e x_e + \ell _e\) and \(c_e^{\prime } = \ell _e - \ell _e x_e\). The guarantee on \(c^{\prime }\) can then be used to handle the offset.
Lemma 5.5.
Let G be a graph with terminals T and two sets of edge weights c and \(c^{\prime }\). Let \({\rm\small SOL}\) be any Steiner tree connecting T. Let \(\Gamma ^{\prime } \gt 1\), \(\kappa \gt 0\), and \(0 \lt \epsilon \lt \frac{4}{35}\) be fixed constants. Then there exists a constant \(\Gamma\) (depending on \(\Gamma ^{\prime }, \kappa , \epsilon\)) and an algorithm that obtains a collection of Steiner trees \({\rm\small ALG}\), at least one of which (let us call it \({\rm\small ALG} _i\)) satisfies:
\(c({\rm\small ALG} _i \setminus {\rm\small SOL}) \le 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} _i),\) and
\(c^{\prime }({\rm\small ALG} _i) \le (4\Gamma ^{\prime } +\kappa + 1 + \epsilon) \cdot c^{\prime }({\rm\small SOL})\).
The fact that Lemma 5.5 generates multiple solutions (but only polynomially many) is fine: as long as we can show that at least one of these solutions causes sufficient regret, our separation oracle can simply iterate over all of them until it finds such a solution.
We give a high level sketch of the proof of Lemma 5.5 here, and defer the full details to Section 5.4. The algorithm uses the idea of alternate minimization, alternating between a “forward phase” and a “backward phase”. The goal of each forward phase/backward phase pair is to keep \(c^{\prime }({\rm\small ALG})\) approximately fixed while obtaining a net decrease in \(c({\rm\small ALG})\). In the forward phase, the algorithm greedily uses local search, choosing swaps that decrease c and increase \(c^{\prime }\) at the best “rate of exchange” between the two costs (i.e., the maximum ratio of decrease in c to increase in \(c^{\prime }\)), until \(c^{\prime }({\rm\small ALG})\) has increased past some upper threshold. Then, in the backward phase, the algorithm greedily chooses swaps that decrease \(c^{\prime }\) while increasing c at the best rate of exchange, until \(c^{\prime }({\rm\small ALG})\) reaches some lower threshold, at which point we start a new forward phase.
We guess the value of \(c^{\prime }({\rm\small SOL})\) (we can run many instances of this algorithm and generate different solutions based on different guesses for this purpose) and set the upper threshold for \(c^{\prime }({\rm\small ALG})\) appropriately so that we satisfy the approximation guarantee for \(c^{\prime }\). For c we show that as long as \({\rm\small ALG}\) is not a \(4\Gamma\)-difference approximation with respect to c then a forward/backward phase pair reduces \(c({\rm\small ALG})\) by a non-negligible amount (of course, if \({\rm\small ALG}\) is a \(4\Gamma\)-difference approximation then we are done). This implies that after enough iterations, \({\rm\small ALG}\) must be a \(4\Gamma\)-difference approximation as \(c({\rm\small ALG})\) can only decrease by a bounded amount. To show this, we claim that while \({\rm\small ALG}\) is not a \(4\Gamma\)-difference approximation, for sufficiently large \(\Gamma\) the following bounds on rates of exchange hold:
For each swap in the forward phase, the ratio of decrease in \(c({\rm\small ALG})\) to increase in \(c^{\prime }({\rm\small ALG})\) is at least some constant \(k_1\) times \(\frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})}\).
For each swap in the backward phase, the ratio of increase in \(c({\rm\small ALG})\) to decrease in \(c^{\prime }({\rm\small ALG})\) is at most some constant \(k_2\) times \(\frac{c({\rm\small SOL} \setminus {\rm\small ALG})}{c^{\prime }({\rm\small ALG} \setminus {\rm\small SOL})}\).
Before we discuss how to prove this claim, let us see why this claim implies that a forward phase/backward phase pair results in a net decrease in \(c({\rm\small ALG})\). If this claim holds, suppose we set the lower threshold for \(c^{\prime }({\rm\small ALG})\) to be, say, \(101 c^{\prime }({\rm\small SOL})\). That is, throughout the backward phase, we have \(c^{\prime }({\rm\small ALG}) \gt 101 c^{\prime }({\rm\small SOL})\). This lower threshold lets us rewrite our upper bound on the rate of exchange in the backward phase in terms of the lower bound on rate of exchange for the forward phase:
\begin{equation*} \begin{aligned}k_2 \frac{c({\rm\small SOL} \setminus {\rm\small ALG})}{c^{\prime }({\rm\small ALG} \setminus {\rm\small SOL})} & \le \ k_2 \frac{c({\rm\small SOL} \setminus {\rm\small ALG})}{c^{\prime }({\rm\small ALG}) - c^{\prime }({\rm\small SOL})} \le k_2 \frac{c({\rm\small SOL} \setminus {\rm\small ALG})}{100 c^{\prime }({\rm\small SOL})} \le k_2 \frac{c({\rm\small SOL} \setminus {\rm\small ALG})}{100 c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})} \\ & \le \ k_2 \frac{1}{4\Gamma } \frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{100 c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})} = \frac{k_2}{400\Gamma k_1} \cdot k_1\frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})}. \end{aligned} \end{equation*}
In other words, the bound in the claim for the rate of exchange in the forward phase is larger than the bound for the backward phase by a multiplicative factor of \(\Omega (1) \cdot \Gamma\). While these bounds depend on \({\rm\small ALG}\) and thus will change with every swap, let us make the simplifying assumption that through one forward phase/backward phase pair these bounds remain constant. Then, the change in \(c({\rm\small ALG})\) in one phase is just the rate of exchange for that phase times the change in \(c^{\prime }({\rm\small ALG})\), which by definition of the algorithm is roughly equal in absolute value for the forward and backward phase. So this implies that the decrease in \(c({\rm\small ALG})\) in the forward phase is \(\Omega (1) \cdot \Gamma\) times the increase in \(c({\rm\small ALG})\) in the backward phase, i.e., the net change across the phases is a non-negligible decrease in \(c({\rm\small ALG})\) if \(\Gamma\) is sufficiently large. Without the simplifying assumption, we can still show that the decrease in \(c({\rm\small ALG})\) in the forward phase is larger than the increase in \(c({\rm\small ALG})\) in the backward phase for large enough \(\Gamma\) using a much more technical analysis. In particular, our analysis will show there is a net decrease as long as:
\begin{equation} \min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - \left(e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1 \right) \gt 0, \end{equation}
(10)
where
\begin{equation*} \zeta ^{\prime } = \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)(4\Gamma ^{\prime }-1)(4\Gamma -1)}. \end{equation*}
Note that for any positive \(\epsilon , \kappa , \Gamma ^{\prime }\), there exists a sufficiently large value of \(\Gamma\) for (10) to hold, since as \(\Gamma \rightarrow \infty\), we have \(\zeta ^{\prime } \rightarrow 0\), so that
\begin{equation*} \left(e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1 \right) \rightarrow 0 \text{ and } \end{equation*}
\begin{equation*} \min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace \rightarrow \min \lbrace 1/2, \kappa /(4+4\epsilon)\rbrace . \end{equation*}
So, the same intuition holds: as long as we are willing to lose a large enough \(\Gamma\) value, we can make the increase in \(c({\rm\small ALG})\) due to the backward phase negligible compared to the decrease in the forward phase, giving us a net decrease.
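To make this accounting explicit under the simplifying assumption, write \(\Delta\) for the (roughly equal in absolute value) change in \(c^{\prime }({\rm\small ALG})\) during one forward phase and the following backward phase. The two rate bounds above then give a net decrease in \(c({\rm\small ALG})\) over the phase pair of at least
\begin{equation*} k_1\frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})}\,\Delta - \frac{k_2}{400\Gamma k_1} \cdot k_1\frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})}\,\Delta = \left(1 - \frac{k_2}{400\Gamma k_1}\right) k_1\frac{c({\rm\small ALG} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})}\,\Delta , \end{equation*}
which is a strict decrease as soon as \(\Gamma \gt k_2/(400 k_1)\); this is the role played by taking \(\Gamma\) sufficiently large.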
It remains to argue that the claimed bounds on rates of exchange hold. Let us argue the claim for \(\Gamma = 4\), although the ideas easily generalize to other choices of \(\Gamma\). We do this by generalizing the analysis giving Lemma 5.2. This analysis shows that if \({\rm\small ALG}\) is a locally optimal solution, then
\begin{equation*} c({\rm\small ALG} \setminus {\rm\small SOL}) \le 4\cdot c({\rm\small SOL} \setminus {\rm\small ALG}), \end{equation*}
i.e., \({\rm\small ALG}\) is a 4-difference approximation of \({\rm\small SOL}\). The contrapositive of this statement is that if \({\rm\small ALG}\) is not a 4-difference approximation, then there is at least one swap that will improve it by some amount. We modify the approach of [13] by weakening the notion of locally optimal. In particular, suppose we define a solution \({\rm\small ALG}\) to be “approximately” locally optimal if at least half of the (weighted) swaps between paths a in \({\rm\small ALG} \setminus {\rm\small SOL}\) and paths f in \({\rm\small SOL} \setminus {\rm\small ALG}\) satisfy \(c(a) \le 2c(f)\) (as opposed to \(c(a) \le c(f)\) in a locally optimal solution; the choice of 2 for both constants here implies \(\Gamma = 4\)). Then a modification of the analysis of the local search algorithm, losing an additional factor of 4, shows that if \({\rm\small ALG}\) is approximately locally optimal, then
\begin{equation*} c({\rm\small ALG} \setminus {\rm\small SOL}) \le 16 \cdot c({\rm\small SOL} \setminus {\rm\small ALG}). \end{equation*}
The contrapositive of this statement, however, has a stronger consequence than before: if \({\rm\small ALG}\) is not a 16-difference approximation, then a weighted half of the swaps satisfy \(c(a) \gt 2c(f)\), i.e., reduce \(c({\rm\small ALG})\) by at least
\begin{equation*} c(a) - c(f) \gt c(a) - c(a)/2 = c(a)/2. \end{equation*}
The decrease in \(c({\rm\small ALG})\) due to all of these swaps together is at least \(c({\rm\small ALG} \setminus {\rm\small SOL})\) times some constant. In addition, since a swap between a and f increases \(c^{\prime }({\rm\small ALG})\) by at most \(c^{\prime }(f)\), the total increase in \(c^{\prime }\) due to these swaps is at most \(c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})\) times some other constant. An averaging argument then gives the rate of exchange bound for the forward phase in the claim, as desired. The rate of exchange bound for the backward phase follows analogously.
This completes the algorithm and proof summary, although more details are needed to formalize these arguments. Moreover, we also need to show that the algorithm runs in polynomial time.
We now formally define our separation oracle RRST-Oracle in Figure 5 and prove that it is an approximate separation oracle in the lemma below:
Fig. 5.
Fig. 5. Separation Oracle for LP in Figure 3.
Lemma 5.6.
Fix any \(\Gamma ^{\prime } \gt 1, \kappa \gt 0, 0 \lt \epsilon \lt 4/35\) and let \(\Gamma\) be the constant given in Lemma 5.5. Let \(\alpha = (4\Gamma ^{\prime }+\kappa +2+\epsilon)4\Gamma +1\) and \(\beta = 4\Gamma\). Then there exists an algorithm RRST-Oracle that given a solution \(({\bf x}, r)\) to the LP in Figure 3 either:
Outputs a separating hyperplane for the LP in Figure 3, or
Outputs “Feasible”, in which case \({\bf x}\) is feasible for the (non-robust) Steiner tree LP and
\begin{equation*} \forall {\bf d}: \sum _{e \in E} d_e x_e \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot r. \end{equation*}
Proof.
It suffices to show that if there exists \({\bf d}, {\rm\small SOL}\) such that
\begin{equation*} \sum _{e \in E} d_e x_e \gt \alpha \cdot {\rm\small SOL} ({\bf d}) + \beta \cdot r, \text{ i.e., } \sum _{e \in E} d_ex_e - \alpha \cdot {\rm\small SOL} ({\bf d}) \gt \beta \cdot r \end{equation*}
then RRST-Oracle outputs a violated inequality on line 6, i.e., the algorithm finds a Steiner tree \(T^{\prime }\) such that
\begin{equation*} \sum _{e \notin T^{\prime }} u_ex_e + \sum _{e \in T^{\prime }} \ell _ex_e - \sum _{e \in T^{\prime }} \ell _e \gt r. \end{equation*}
Notice that in the inequality
\begin{equation*} \sum _{e \in E} d_ex_e - \alpha \cdot {\rm\small SOL} ({\bf d}) \gt \beta \cdot r, \end{equation*}
replacing \({\bf d}\) with \({\bf d}^{\prime }\) where \(d_e^{\prime } = \ell _e\) when \(e \in {\rm\small SOL}\) and \(d_e^{\prime } = u_e\) when \(e \notin {\rm\small SOL}\) can only increase the left hand side. So replacing \({\bf d}\) with \({\bf d}^{\prime }\) and rearranging terms, we have
\begin{equation*} \sum _{e \in {\rm\small SOL}} \ell _ex_e + \sum _{e \notin {\rm\small SOL}} u_ex_e \gt \alpha \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r = \sum _{e \in {\rm\small SOL}} \ell _e + \left[(\alpha -1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r\right]. \end{equation*}
In particular, the regret of the fractional solution against \({\rm\small SOL}\) is at least \((\alpha -1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r\).
Let \(T^{\prime }\) be the Steiner tree satisfying the conditions of Lemma 5.5 with \(c_e = u_ex_e - \ell _ex_e + \ell _e\) and \(c_e^{\prime } = \ell _e - \ell _e x_e\). Let \(E_0\) = \(E \setminus ({\rm\small SOL} \cup T^{\prime })\), \(E_S = {\rm\small SOL} \setminus T^{\prime }\), and \(E_T = T^{\prime } \setminus {\rm\small SOL}\). Let \(c(E^{\prime }) = \sum _{e \in E^{\prime }} (u_ex_e - \ell _ex_e + \ell _e)\), i.e., the total weight of the edges \(E^{\prime }\) in \(G^{\prime }\). Now, note that the regret of the fractional solution against a solution using edges \(E^{\prime }\) is:
\begin{align*} \sum _{e \notin E^{\prime }} u_ex_e + \sum _{e \in E^{\prime }} \ell _ex_e - \sum _{e \in E^{\prime }} \ell _e &= \sum _{e \notin E^{\prime }} (u_ex_e - \ell _ex_e + \ell _e) - \sum _{e \in E}(\ell _e - \ell _ex_e) \\ &= c(E \setminus E^{\prime }) - \sum _{e \in E}(\ell _e - \ell _ex_e). \end{align*}
Plugging in \(E^{\prime } = {\rm\small SOL}\), we then get that:
\begin{equation*} c(E_0) + c(E_T) - \sum _{e \in E}(\ell _e - \ell _ex_e) \gt (\alpha - 1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r. \end{equation*}
Isolating \(c(E_T)\) then gives:
\begin{align*} c(E_T) &\gt (\alpha - 1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r - \sum _{e \in E_0} (u_ex_e - \ell _e x_e + \ell _e) + \sum _{e \in E}(\ell _e - \ell _ex_e) \\ &=(\alpha - 1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r - \sum _{e \in E_0} u_ex_e + \sum _{e \notin E_0}(\ell _e - \ell _ex_e). \end{align*}
Since \(\beta = 4\Gamma\), Lemma 5.5 along with an appropriate choice of \(\epsilon\) gives that \(c(E_{T}) \le \beta c(E_{S})\), and thus:
\begin{equation*} c(E_S) \gt \frac{1}{\beta } \left[ (\alpha - 1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r - \sum _{e \in E_0} u_ex_e + \sum _{e \notin E_0}(\ell _e - \ell _ex_e) \right]. \end{equation*}
Recall that our goal is to show that \(c(E_0) + c(E_S) - \sum _{e \in E} (\ell _e - \ell _e x_e) \gt r\), i.e., that the regret of the fractional solution against \(T^{\prime }\) is at least r. Adding \(c(E_0)- \sum _{e \in E} (\ell _e - \ell _e x_e)\) to both sides of the previous inequality, we can lower bound \(c(E_0) + c(E_S) - \sum _{e \in E} (\ell _e - \ell _e x_e)\) as follows:
\begin{align*} &c(E_0) + c(E_S) - \sum _{e \in E} (\ell _e - \ell _e x_e) \\ &\gt \frac{1}{\beta } \left[ (\alpha - 1) \sum _{e \in {\rm\small SOL}} \ell _e + \beta \cdot r - \sum _{e \in E_0} u_ex_e + \sum _{e \notin E_0}(\ell _e - \ell _ex_e) \right]\\ &\qquad +\sum _{e \in E_0} (u_e x_e - \ell _e x_e + \ell _e) - \sum _{e \in E} (\ell _e - \ell _e x_e)\\ &= r + \frac{\alpha -1}{\beta } \sum _{e \in {\rm\small SOL}} \ell _e + \frac{1}{\beta }\sum _{e \notin E_0}(\ell _e - \ell _ex_e) +\frac{\beta - 1}{\beta } \sum _{e \in E_0} u_e x_e - \sum _{e \notin E_0} (\ell _e - \ell _e x_e)\\ &\ge r + \frac{\alpha -1-\beta }{\beta } \sum _{e \in {\rm\small SOL}} \ell _e + \frac{1}{\beta }\sum _{e \notin E_0}(\ell _e - \ell _ex_e) +\frac{\beta - 1}{\beta } \sum _{e \in E_0} u_e x_e - \sum _{e \in E_T} (\ell _e - \ell _e x_e) \ge r. \end{align*}
Here, the last inequality holds because by our setting of \(\alpha\), we have
\begin{equation*} \frac{\alpha - 1 - \beta }{\beta } = 4\Gamma ^{\prime } + \kappa + 1 + \epsilon , \end{equation*}
and thus Lemma 5.5 gives that
\begin{equation*} \sum _{e \in E_T} (\ell _e - \ell _e x_e) \le \frac{\alpha - 1 - \beta }{\beta }\sum _{e \in {\rm\small SOL}} (\ell _e - \ell _e x_e) \le \frac{\alpha - 1 - \beta }{\beta }\sum _{e \in {\rm\small SOL}} \ell _e. \end{equation*}
By using Lemma 5.6 with the ellipsoid method and the fact that the LP optimum is at most \({\rm\small MR}\), we get an \((\alpha , \beta)\)-robust fractional solution. Then, Theorem 1.3 and known approximation/integrality gap results give us the following theorem, which with an appropriate choice of constants gives Theorem 5.1:
Theorem 5.7.
Fix any \(\Gamma ^{\prime } \gt 1, \kappa \gt 0, 0 \lt \epsilon \lt 4/35\) and let \(\Gamma\) be the constant given in Lemma 5.5. Let \(\alpha = (4\Gamma ^{\prime }+\kappa +2+\epsilon)4\Gamma +1\) and \(\beta = 4\Gamma\). Then there exists a polynomial-time \((2\alpha \ln 4 + \epsilon , 2 \beta \ln 4 + \ln 4+\epsilon)\)-robust algorithm for the Steiner tree problem.
Proof of Theorem 5.7
By using the ellipsoid method with Lemma 5.6 we can compute a feasible \((\alpha , \beta)\)-robust fractional solution to the Steiner tree LP (as the robust Steiner tree LP has optimum at most \({\rm\small MR}\)). Then, the theorem follows from Theorem 1.3, and the fact that the polytope in Figure 3 has integrality gap \(\delta = 2\) and there is a \(\gamma = (\ln 4 + \epsilon)\)-approximation for the Steiner tree problem due to [8] (The error parameters can be rescaled appropriately to get the approximation guarantee in the theorem statement).□
Optimizing for \(\alpha\) in Theorem 5.7 subject to the constraints in (10), we get that for a fixed (small) \(\epsilon\), \(\alpha\) is minimized by setting \(\Gamma \approx 9.284 + f_1(\epsilon), \Gamma ^{\prime } \approx 5.621 + f_2(\epsilon), \kappa \approx 2.241 + f_3(\epsilon)\) (for monotonically increasing \(f_1, f_2, f_3\) which approach 0 as \(\epsilon\) approaches 0). Plugging in these values gives Theorem 5.1.

5.3 Algorithm Description

In this section we give the algorithm description for DoubleApprox, as well as a few lemmas that motivate our algorithm’s design and certify it is efficient. We will again use local search to find moves that are improving with respect to c. However, now our goal is to show that we can do this without blowing up the cost with respect to \(c^{\prime }\). We can start to show this via the following lemma, which generalizes the arguments in Section 4. Informally, it says that as long as a significant fraction \((1/\theta)\) of the swaps (rather than all the swaps) that the local search algorithm can make between its solution A and an adversarial solution F do not improve its objective by some factor \(\lambda\) (rather than by any amount at all), A’s cost can still be bounded by \(4 \lambda \theta\) times F’s cost.
From Lemma 5.8 to Lemma 5.11 we will refer to the cost functions on edges by \(w, w^{\prime }\) instead of \(c, c^{\prime }\). This is because these lemmas are agnostic to the cost functions they are applied to and will be applied with both \(w = c, w^{\prime } = c^{\prime }\) and \(w = c^{\prime }, w^{\prime } = c\) in our algorithm. We also define \(\mathcal {A}, \mathcal {F}, \alpha , N(\cdot), N^{-1}(\cdot)\) as in the proof of Theorem 4.2 for these lemmas.
Lemma 5.8.
Let A and F be solutions to an instance of Steiner tree with edge costs w such that if all edges in \(A \cap F\) have their costs set to 0, then for \(\lambda \ge 1, \theta \ge 1\), we have
\begin{equation*} \sum _{a \in \mathcal {A}, f \in N(a): w(a) \le \lambda w(f)} \alpha (a, f)w(a) \ge \frac{1}{\theta } \sum _{a \in \mathcal {A}, f \in N(a)} \alpha (a, f)w(a). \end{equation*}
Then \(w(A \setminus F) \le 4\lambda \theta w(F \setminus A)\).
Proof.
This follows by generalizing the argument in the proof of Theorem 4.2. After setting costs of edges in \(A \cap F\) to 0, note that \(w(A) = w(A \setminus F)\) and \(w(F) = w(F \setminus A)\). Then:
\begin{align*} w(A \setminus F) &= \sum _{a \in \mathcal {A}} w(a)\\ &\le \sum _{a \in \mathcal {A}} \sum _{f \in N(a)} \alpha (a, f) w(a) \\ &= \sum _{f \in F} \sum _{a \in N^{-1}(f)} \alpha (a, f) w(a) \\ &\le \theta \sum _{f \in F} \sum _{a \in N^{-1}(f): w(a) \le \lambda w(f)} \alpha (a, f) w(a) \\ &\le \lambda \theta \sum _{f \in F} \sum _{a \in N^{-1}(f): w(a) \le \lambda w(f)} \alpha (a, f) w(f)\\ &\le \lambda \theta \sum _{f \in F} \sum _{a \in N^{-1}(f)} \alpha (a, f) w(f) \le 4\lambda \theta w(F \setminus A). \end{align*}
Corollary 5.9.
Let A, F be solutions to an instance of Steiner tree with edge costs w such that for parameters \(\lambda \ge 1, \theta \ge 1\), \(w(A \setminus F) \gt 4\lambda \theta w(F \setminus A)\). Then after setting the cost of all edges in \(A \cap F\) to 0,
\begin{equation*} \sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \lambda w(f)} \alpha (a, f)w(a) \gt \frac{\theta - 1}{\theta } \sum _{a \in \mathcal {A}, f \in N(a)} \alpha (a, f)w(a). \end{equation*}
The corollary effectively tells us that if \(w(A \setminus F)\) is sufficiently larger than \(w(F \setminus A)\), then there are many local swaps between paths a in A and f in F that decrease \(w(A)\) by a large fraction of \(w(a)\). The next lemma then shows that one of these swaps also does not increase \(w^{\prime }(A)\) by a large factor (even if, instead of swapping in f, we swap in an approximation of f), and reduces \(w(A)\) by a non-negligible amount.
Lemma 5.10.
Let A and F be solutions to an instance of Steiner tree with two sets of edge costs, w and \(w^{\prime }\), such that for parameter \(\Gamma \gt 1\), \(w(A \setminus F) \gt 4 \Gamma \cdot w(F \setminus A)\). Fix any \(0 \lt \epsilon \lt \sqrt {\Gamma } - 1\). Then there exists a swap between \(a \in \mathcal {A}\) and a path f between two vertices in A such that \(\frac{(1+\epsilon)w^{\prime }(f) - w^{\prime }(a)}{w(a) - (1+\epsilon)w(f)} \le \frac{4(1+\epsilon)\Gamma }{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)} \cdot \frac{w^{\prime }(F \setminus A)}{w(A \setminus F)}\) and \(w(a) - (1+\epsilon)w(f) \ge \frac{1}{n^2} w(A \setminus F)\).
Proof.
We use an averaging argument to prove the lemma. Consider the quantity
\begin{equation*} R = \frac{\sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f), w(a) -(1+\epsilon)w(f) \ge \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) [(1+\epsilon)w^{\prime }(f) - w^{\prime }(a)]}{\sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f), w(a) - (1+\epsilon)w(f) \ge \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)]}, \end{equation*}
which is the ratio of the weighted average increase in \(w^{\prime }\) to the weighted average decrease in w over all swaps where \(w(a) \gt \sqrt {\Gamma } w(f)\) and \(w(a) -(1+\epsilon)w(f) \ge \frac{1}{n^2} w(A \setminus F)\).
Any edge e in \(A \cap F\) is also a (single-edge) subpath in \(\mathcal {A} \cap \mathcal {F}\), and the only \(a \in \mathcal {A}\) such that \(A \cup e \setminus a\) is feasible is \(a = e\). So for all such e we can assume that \(\alpha\) is defined such that \(\alpha (e, e) = 1\), \(\alpha (e, f) = 0\) for \(f \ne e\), and \(\alpha (a, e) = 0\) for \(a \ne e\). Since \(w(e) \gt \sqrt {\Gamma }w(e)\) clearly does not hold, no swap with a positive \(\alpha\) value in either sum involves edges in \(A \cap F\). So we can now set the cost of the edges in \(A \cap F\) with respect to both w and \(w^{\prime }\) to 0, and doing so does not affect the quantity R.
Then, the numerator can be upper bounded by \(4(1+\epsilon)w^{\prime }(F \setminus A)\). For the denominator, we first observe that
\begin{equation} \begin{split} \sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f), w(a) -(1+\epsilon)w(f) \ge \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)] \\ \ge \sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)] \\ - \sum _{a \in \mathcal {A}, f \in N(a): w(a) -(1+\epsilon)w(f) \lt \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)]. \end{split} \end{equation}
(11)
The second term on the right-hand side of (11) is upper bounded by:
\begin{equation*} \sum _{a \in \mathcal {A}, f \in N(a): w(a) -(1+\epsilon)w(f) \lt \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)] \end{equation*}
\begin{equation*} \le \frac{1}{n^2} \sum _{a \in \mathcal {A}, f \in N(a): w(a) -(1+\epsilon)w(f) \lt \frac{1}{n^2} w(A \setminus F)} \alpha (a, f) w(A \setminus F)\le \frac{1}{n}w(A \setminus F). \end{equation*}
The inequality follows because there are at most n different \(a \in \mathcal {A}\), and for each one we have \(\sum _{f \in F} \alpha (a, f) = 1\). We next use Corollary 5.9 (setting both parameters to \(\sqrt {\Gamma }\)) to get the following lower bound on the first term in (11):
\begin{align*} &\sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f)} \alpha (a, f) [w(a) - (1+\epsilon)w(f)]\\ &\ge \sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f)} \alpha (a, f) \left[w(a) - \frac{1+\epsilon }{\sqrt {\Gamma }}w(a)\right]\\ &=\frac{\sqrt {\Gamma }-1-\epsilon }{\sqrt {\Gamma }} \sum _{a \in \mathcal {A}, f \in N(a): w(a) \gt \sqrt {\Gamma } w(f)} \alpha (a, f) w(a)\\ &\ge \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{\Gamma } \sum _{a \in \mathcal {A}, f \in N(a)} \alpha (a, f) w(a) \\ &=\frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{\Gamma } w(A \setminus F). \end{align*}
This lower bounds the denominator of R by \((\frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{\Gamma } - 1/n) \cdot w(A \setminus F)\). By choosing \(\epsilon\) appropriately, for sufficiently large n the \(1/n\) term can be absorbed into a slightly larger error parameter \(\epsilon ^{\prime }\). Then, combining the bounds implies that R is at most \(\frac{4(1+\epsilon ^{\prime })\Gamma }{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon ^{\prime })} \cdot \frac{w^{\prime }(F \setminus A)}{w(A \setminus F)}\). In turn, one of the swaps being summed over in R satisfies the lemma statement.□
We now almost have the tools to state our algorithm and prove Lemma 5.5. However, the local search process is now concerned with two edge costs, so just considering adding the shortest path with respect to c between each pair of vertices and deleting a subset of edges in the induced cycle will not suffice. We instead use the following lemma:
Lemma 5.11.
Given a graph \(G=(V, E)\) with edge costs w and \(w^{\prime }\), two vertices s and t, and input parameter \(W^{\prime }\), let p be the shortest path from s to t with respect to w whose cost with respect to \(w^{\prime }\) is at most \(W^{\prime }\). For all \(\epsilon \gt 0\), there is a polynomial-time algorithm that finds a path from s to t whose cost with respect to w is at most \(w(p)\) and whose cost with respect to \(w^{\prime }\) is at most \((1+\epsilon)W^{\prime }\).
Proof.
If all edge lengths with respect to \(w^{\prime }\) are multiples of \(\Delta\), an optimal solution can be found in time \(\text {poly}(|V|, |E|, W^{\prime }/\Delta)\) via dynamic programming: Let \(\ell (v, i)\) be the length of the shortest path from s to v with respect to w whose cost with respect to \(w^{\prime }\) is at most \(i \cdot \Delta\). Using \(\ell (s, i) = 0\) for all i and the recurrence \(\ell (v, i) \le \ell (u, i-(w^{\prime }_{uv}/\Delta)) + w_{uv}\) for edge \((u, v)\), we can compute \(\ell (v, i)\) for all \(v, i\) and use backtracking from \(\ell (t, \lfloor W^{\prime }/\Delta \rfloor)\) to retrieve p in \(\text {poly}(|V|, |E|, W^{\prime }/\Delta)\) time.
To get the runtime down to polynomial, we use a standard rounding trick, rounding each \(w_e^{\prime }\) down to the nearest multiple of \(\epsilon W^{\prime } / |V|\). After rounding, the runtime of the dynamic programming algorithm is \(\text {poly}(|V|, |E|, \frac{W^{\prime }}{\epsilon W^{\prime }/|V|}) = \text {poly}(|V|, |E|, \frac{1}{\epsilon })\). Any path has at most \(|V|\) edges, and so its cost decreases by at most \(\epsilon W^{\prime }\) in this rounding process, i.e., all paths considered by the algorithm have cost with respect to \(w^{\prime }\) of at most \((1+\epsilon)W^{\prime }\). Lastly, since p’s cost with respect to \(w^{\prime }\) only decreases, \(w(p)\) still upper bounds the cost of the shortest path considered by the algorithm with respect to w.□
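As an illustration of the proof of Lemma 5.11, the following Python sketch implements the rounded dynamic program on an undirected graph; the exact data layout (vertices 0..n-1, dictionaries of edge costs keyed by the edge pair) is an assumption made for the example, and path reconstruction by backtracking is omitted.

import math

def constrained_shortest_path_cost(n, edges, s, t, w, w_prime, W_prime, eps):
    # Returns the w-cost of the cheapest s-t path whose w'-cost, after rounding
    # each w'_e down to a multiple of eps * W_prime / n, is at most W_prime;
    # the true w'-cost of that path is therefore at most (1 + eps) * W_prime.
    # Assumes W_prime > 0.
    delta = eps * W_prime / n                  # rounding granularity
    budget = int(W_prime // delta)             # number of w'-budget levels, about n/eps
    INF = math.inf
    # ell[v][i]: min w-cost of an s-v path with rounded w'-cost at most i * delta.
    ell = [[INF] * (budget + 1) for _ in range(n)]
    for i in range(budget + 1):
        ell[s][i] = 0.0
    for _ in range(n - 1):                     # Bellman-Ford style relaxations
        for (u, v) in edges:
            steps = int(w_prime[(u, v)] // delta)   # rounded-down w'-cost of the edge
            for a, b in ((u, v), (v, u)):           # undirected edge, both directions
                for i in range(steps, budget + 1):
                    cand = ell[a][i - steps] + w[(u, v)]
                    if cand < ell[b][i]:
                        ell[b][i] = cand
    return ell[t][budget]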
The idea is to run a local search with respect to c starting with a good approximation with respect to \(c^{\prime }\). Our algorithm alternates between a “forward” and “backward” phase. In the forward phase, we use Lemma 5.11 to decide which paths can be added to the solution in local search moves. The local search takes any swap that causes both \(c({\rm\small ALG})\) and \(c^{\prime }({\rm\small ALG})\) to decrease if any exists. Otherwise, it picks the swap between \(a \in {\rm\small ALG}\) and f that, among all swaps where \(c(f) \lt c(a)\) and \(c^{\prime }(f) \le c^{\prime }({\rm\small SOL})\), minimizes the ratio \(\frac{c^{\prime }(f) - c^{\prime }(a)}{c(a) - c(f)}\) (we assume we know the value of \(c^{\prime }({\rm\small SOL})\); we can guess polynomially many values, and the algorithm succeeds for the correct guess).
If the algorithm only made swaps of this form, however, \(c^{\prime }({\rm\small ALG})\) might become a very poor approximation of \(c^{\prime }({\rm\small SOL})\). To control for this, when \(c^{\prime }({\rm\small ALG})\) exceeds \((4\Gamma ^{\prime }+\kappa) \cdot c^{\prime }({\rm\small SOL})\) for some constant \(\Gamma ^{\prime } \gt 1\), we begin a “backward phase”: We take the opposite approach, greedily choosing either swaps that improve both c and \(c^{\prime }\) or that improve \(c^{\prime }\) and minimize the ratio \(\frac{c(f) - c(a)}{c^{\prime }(a) - c^{\prime }(f)}\), until \(c^{\prime }({\rm\small ALG})\) has been reduced by at least \(\kappa \cdot c^{\prime }({\rm\small SOL})\). At this point, we begin a new forward phase.
The intuition for the analysis is as follows: If, throughout a forward phase, \(c({\rm\small ALG} \setminus {\rm\small SOL}) \ge 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG})\), Lemma 5.10 tells us that there is a swap where the increase in \(c^{\prime }({\rm\small ALG})\) will be very small relative to the decrease in \(c({\rm\small ALG})\). (Note that our goal is to reduce \(c({\rm\small ALG} \setminus {\rm\small SOL})\) below \(4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG})\).) Throughout the subsequent backward phase, we have \(c^{\prime }({\rm\small ALG}) \gt 4\Gamma ^{\prime } \cdot c^{\prime }({\rm\small SOL})\), which implies \(c^{\prime }({\rm\small ALG} \setminus {\rm\small SOL}) \gt 4\Gamma ^{\prime } \cdot c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG})\). So Lemma 5.10 also implies that the total increase in \(c({\rm\small ALG})\) will be very small relative to the decrease in \(c^{\prime }({\rm\small ALG})\). Since the absolute change in \(c^{\prime }({\rm\small ALG})\) is similar between the two phases, one forward and one backward phase should decrease \(c({\rm\small ALG})\) overall.
The formal description of the backward and forward phase is given as algorithm DoubleApprox in Figure 6. For the lemmas/corollaries in the following section, we implicitly assume that we know values of \(\chi\) and \(\chi ^{\prime }\) satisfying the conditions of DoubleApprox. When we conclude by proving Lemma 5.5, we will simply call DoubleApprox for every reasonable value of \(\chi , \chi ^{\prime }\) that is a power of \(1+\epsilon\), and one of these runs will have \(\chi , \chi ^{\prime }\) satisfying the conditions. Furthermore, there are multiple error parameters in our algorithm and its analysis. For simplicity of presentation, we use the same value \(\epsilon\) for all error parameters in the algorithm and its analysis.
Fig. 6.
Fig. 6. Algorithm DoubleApprox, which finds \({\rm\small ALG}\) such that \(c({\rm\small ALG} \setminus {\rm\small SOL}) \le O(1) \cdot c({\rm\small SOL} \setminus {\rm\small ALG})\) and \(c^{\prime }({\rm\small ALG}) \le O(1) \cdot c^{\prime }({\rm\small SOL})\).
Fig. 7.
Fig. 7. Algorithm GreedySwap, which finds a swap with the properties described in Lemma 5.10.
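For intuition only, the following Python sketch shows the alternation structure just described (the formal algorithm is DoubleApprox in Figure 6); forward_swap and backward_swap are assumed helpers that return the chosen swap's resulting tree and costs, or None when no admissible swap exists, and chi_prime plays the role of the guess for \(c^{\prime }({\rm\small SOL})\).

def alternate_phases(tree, c_cost, cp_cost, chi_prime, Gamma_prime, kappa,
                     forward_swap, backward_swap, max_pairs):
    upper = (4 * Gamma_prime + kappa) * chi_prime  # end-of-forward-phase threshold
    lower = 4 * Gamma_prime * chi_prime            # end-of-backward-phase threshold
    history = [tree]                               # keep all intermediate solutions
    for _ in range(max_pairs):
        # Forward phase: decrease c while c'(ALG) stays below the upper threshold.
        while cp_cost <= upper:
            step = forward_swap(tree)
            if step is None:
                break
            tree, c_cost, cp_cost = step
            history.append(tree)
        # Backward phase: decrease c' back below the lower threshold.
        while cp_cost > lower:
            step = backward_swap(tree)
            if step is None:
                break
            tree, c_cost, cp_cost = step
            history.append(tree)
    return history

Returning every intermediate tree reflects the fact that Lemma 5.5 outputs a polynomial-size collection of candidate solutions rather than a single one.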

5.4 Algorithm Analysis and Proof of Lemma 5.5

In this section we analyze DoubleApprox and give the proof of Lemma 5.5. We skip the proofs of some technical lemmas whose main ideas have already been covered. We first make some observations. The first lets us relate the decrease in the cost of a solution \({\rm\small ALG}\) to the decrease in the cost of \({\rm\small ALG} \setminus {\rm\small SOL}\).
Lemma 5.12.
Let \({\rm\small ALG}, {\rm\small ALG}^{\prime }, {\rm\small SOL}\) be any Steiner tree solutions to a given instance. Then
\begin{equation*} c({\rm\small ALG}) - c({\rm\small ALG}^{\prime }) = [c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})] - [c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]. \end{equation*}
Proof.
By symmetry, the contribution of edges in \({\rm\small ALG} \cap {\rm\small ALG}^{\prime }\) and edges in neither \({\rm\small ALG}\) nor \({\rm\small ALG}^{\prime }\) to both the left and right hand side of the equality is zero, so it suffices to show that all edges in \({\rm\small ALG} \oplus {\rm\small ALG}^{\prime }\) contribute equally to the left and right hand side.
Consider any \(e \in {\rm\small ALG} \setminus {\rm\small ALG}^{\prime }\). Its contribution to \(c({\rm\small ALG}) - c({\rm\small ALG}^{\prime })\) is \(c(e)\). If \(e \in {\rm\small ALG} \setminus {\rm\small SOL}\), then e contributes \(c(e)\) to \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) and 0 to \(-[c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\). If \(e \in {\rm\small ALG} \cap {\rm\small SOL}\), then e contributes 0 to \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) and \(c(e)\) to \(- [c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\). So the total contribution of e to \([c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})] - [c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\) is \(c(e)\).
Similarly, consider \(e \in {\rm\small ALG}^{\prime } \setminus {\rm\small ALG}\). Its contribution to \(c({\rm\small ALG}) - c({\rm\small ALG}^{\prime })\) is \(-c(e)\). If \(e \in {\rm\small SOL} \setminus {\rm\small ALG}\), then e contributes \(-c(e)\) to \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) and 0 to \([c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\). If \(e \notin {\rm\small SOL}\), then e contributes 0 to \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) and \(-c(e)\) to \(-[c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\). So the total contribution of e to \([c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})] - [c({\rm\small ALG}^{\prime } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG}^{\prime })]\) is \(-c(e)\).□
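The identity in Lemma 5.12 is purely set-theoretic and can be sanity-checked numerically; the following short Python snippet (illustrative only, with random costs and random edge sets) verifies it.

import random

def cost(c, S):
    return sum(c[e] for e in S)

random.seed(0)
edges = list(range(20))
c = {e: random.uniform(1.0, 10.0) for e in edges}
for _ in range(1000):
    alg  = {e for e in edges if random.random() < 0.5}   # plays the role of ALG
    alg2 = {e for e in edges if random.random() < 0.5}   # plays the role of ALG'
    sol  = {e for e in edges if random.random() < 0.5}   # plays the role of SOL
    lhs = cost(c, alg) - cost(c, alg2)
    rhs = (cost(c, alg - sol) - cost(c, sol - alg)) \
        - (cost(c, alg2 - sol) - cost(c, sol - alg2))
    assert abs(lhs - rhs) < 1e-9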
Lemma 5.12 is useful because Lemma 5.10 relates the ratio of change in \(c, c^{\prime }\) to \(c({\rm\small ALG} \setminus {\rm\small SOL})\), but it is difficult to track how \(c({\rm\small ALG} \setminus {\rm\small SOL})\) changes as we make swaps that improve \(c({\rm\small ALG})\). For example, \(c({\rm\small ALG} \setminus {\rm\small SOL})\) does not necessarily decrease with swaps that cause \(c({\rm\small ALG})\) to decrease (e.g., consider a swap that adds a light edge not in \({\rm\small SOL}\) and removes a heavy edge in \({\rm\small SOL}\)). Whenever \(c({\rm\small ALG} \setminus {\rm\small SOL}) \gg c({\rm\small SOL} \setminus {\rm\small ALG})\) (if this doesn’t hold, we have a good approximation and are done), \(c({\rm\small ALG} \setminus {\rm\small SOL})\) and \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) are off by a multiplicative factor that is very close to 1, and thus we can relate the ratio of changes in Lemma 5.10 to \(c({\rm\small ALG} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG})\) instead at a small loss in the constant, and by Lemma 5.12 changes in this quantity are much easier to track over the course of the algorithm, simplifying our analysis greatly.
The next lemma lets us assume that any backward phase uses polynomially many calls to GreedySwap.
Lemma 5.13.
Let \(\chi ^{\prime }\) be any value such that \(\chi ^{\prime } \in [c^{\prime }({\rm\small SOL}), (1+\epsilon)c^{\prime }({\rm\small SOL})]\), and suppose we round all \(c_e^{\prime }\) up to the nearest multiple of \(\frac{\epsilon }{n}\chi ^{\prime }\) for some \(0 \lt \epsilon \lt 1\). Then any \(\gamma\)-approximation of \({\rm\small SOL}\) with respect to \(c^{\prime }\) using the rounded \(c_e^{\prime }\) values is a \(\gamma (1+2\epsilon)\)-approximation of \({\rm\small SOL}\) with respect to \(c^{\prime }\) using the original edge costs.
Proof.
This follows because the rounding can only increase the cost of any solution, and since any solution has at most n edges, each rounded up by at most \(\frac{\epsilon }{n}\chi ^{\prime }\), its cost increases by at most \(\epsilon \chi ^{\prime } \le \epsilon (1 + \epsilon)c^{\prime }({\rm\small SOL}) \le 2 \epsilon c^{\prime }({\rm\small SOL})\).□
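The rounding used in Lemma 5.13 is simple enough to state as a one-liner; the dictionary representation of \(c^{\prime }\) below is an assumption of the example.

import math

def round_up_costs(c_prime, chi_prime, n, eps):
    # Round each c'_e up to the nearest multiple of (eps / n) * chi_prime.
    unit = eps * chi_prime / n
    return {e: math.ceil(v / unit) * unit for e, v in c_prime.items()}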
Via this lemma, we will assume all \(c^{\prime }_e\) are already rounded. The following two lemmas formalize the intuition given in Section 5.2; in particular, by using bounds on the “rate of exchange”, they show that the decrease in c in the forward phase can be lower bounded, and the increase in c in the backward phase can be upper bounded. Their proofs are highly technical and largely follow the intuition given in Section 5.2, so we defer them to the following section.
Lemma 5.14 (Forward Phase Analysis).
For any even i in algorithm DoubleApprox, let \(\rho\) be the power of \((1+\epsilon)\) times \(\min _e c_e\) such that \(\rho \in [c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}),(1+\epsilon)c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) ]\). Suppose all values of \({\rm\small ALG} ^{(i+1)}_\rho\) and the final value of \({\rm\small ALG} ^{(i)}\) in DoubleApprox satisfy \(c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho })\) and \(c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)})\). Then for \(0 \lt \epsilon \lt 2/3 - 5/(12\Gamma)\), the final values of \({\rm\small ALG} ^{(i)}, {\rm\small ALG} ^{(i+1)}_\rho\) satisfy
\begin{equation*} c \big ({\rm\small ALG} ^{(i)} \big) - c \big ({\rm\small ALG} ^{(i+1)}_\rho \big) \ge \min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace \cdot c\big ({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL} \big). \end{equation*}
Lemma 5.15 (Backward Phase Analysis).
Fix any even \(i+2\) in algorithm DoubleApprox and any value of \(\rho\). Suppose all values of \({\rm\small ALG} ^{(i+2)}_{\rho }\) satisfy \(c({\rm\small ALG} ^{(i+2)}_{\rho } \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho })\). Let \(T =\) \(\frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho }) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho })}{c^{\prime }({\rm\small SOL})}\). Then for
\begin{equation*} \zeta ^{\prime } = \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)(4\Gamma ^{\prime }-1)(4\Gamma -1)}, \end{equation*}
the final values of \({\rm\small ALG} ^{(i+1)}_\rho , {\rm\small ALG} ^{(i+2)}_\rho\) satisfy
\begin{equation*} c \big ({\rm\small ALG} ^{(i+2)}_\rho \big) - c \big ({\rm\small ALG} ^{(i+1)}_\rho \big) \le \big (e^{\zeta ^{\prime } T} - 1 \big) \cdot c \big ({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL} \big). \end{equation*}
By combining the two preceding lemmas, we can show that as long as \({\rm\small ALG}\) is a poor difference approximation, a combination of the forward and backward phase collectively decrease \(c({\rm\small ALG} ^{(i)})\).
Corollary 5.16.
Fix any positive even value of \(i+2\) in algorithm DoubleApprox, and let \(\rho\) be the power of \((1+\epsilon)\) times \(\min _e c_e\) such that \(\rho \in [c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}), (1+\epsilon)c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL})]\). Suppose all values of \({\rm\small ALG} ^{(i+1)}_\rho\) and the final value of \({\rm\small ALG} ^{(i)}\) in DoubleApprox satisfy \(c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho })\) and \(c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)})\). Then for \(0 \lt \epsilon \lt 2/3 - 5/(12\Gamma)\) and \(\zeta ^{\prime }\) as defined in Lemma 5.15, the final values of \({\rm\small ALG} ^{(i+2)}, {\rm\small ALG} ^{(i)}\) satisfy
\begin{equation*} c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+2)}) \end{equation*}
\begin{equation*} \ge \left[\min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)\right] \cdot c \big ({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL} \big). \end{equation*}
Proof.
It suffices to lower bound \(c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+2)}_\rho)\) for this value of \(\rho\), since \(c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+2)})\) must be at least this large. After rescaling \(\epsilon\) appropriately, we have
\begin{equation*} c^{\prime }\big ({\rm\small ALG} ^{(i+1)}_\rho \big) - c^{\prime } \big ({\rm\small ALG} ^{(i+2)}_\rho \big) \le c^{\prime } \big ({\rm\small ALG} ^{(i+1)}_\rho \big) \le (4\Gamma ^{\prime } + \kappa + 1 + \epsilon)c^{\prime }({\rm\small SOL}), \end{equation*}
because the algorithm can increase its cost with respect to \(c^{\prime }\) by at most \((1+\epsilon)c^{\prime }({\rm\small SOL})\) in any swap in the forward phase (by line 5 of GreedySwap, which bounds the increase \(w^{\prime }(\hat{f}) \le W \le (1+\epsilon)w^{\prime }({\rm\small SOL})\)), so it exceeds the threshold \((4\Gamma ^{\prime }+\kappa)\chi ^{\prime } \le (4\Gamma ^{\prime }+\kappa)(1+\epsilon)c^{\prime }({\rm\small SOL})\) on line 13 of DoubleApprox by at most this much. Then applying Lemma 5.14 to \(c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+1)}_\rho)\) and Lemma 5.15 to \(c({\rm\small ALG} ^{(i+1)}_\rho) - c({\rm\small ALG} ^{(i+2)}_\rho)\) (using \(T \le 4\Gamma ^{\prime } + \kappa + 1 +\epsilon\)) gives:
\begin{equation*} c \big ({\rm\small ALG} ^{(i)}\big) - c\big ({\rm\small ALG} ^{(i+2)}_\rho \big) = \big [c\big ({\rm\small ALG} ^{(i)}\big) - c\big ({\rm\small ALG} ^{(i+1)}_\rho \big)\big ] + \big [c\big ({\rm\small ALG} ^{(i+1)}_\rho \big) - c\big ({\rm\small ALG} ^{(i+2)}_\rho \big)\big ] \end{equation*}
\begin{equation*} \ge \left[\min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)\right] \cdot c \big ({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL} \big). \end{equation*}
Now, we can chain Corollary 5.16 multiple times to show that after sufficiently many iterations of DoubleApprox, if all intermediate values of \({\rm\small ALG} ^{(i)}\) are poor difference approximations, over all these iterations \(c({\rm\small ALG} ^{(i)})\) must decrease multiplicatively by more than \(\frac{n \max _e c_e}{\min _e c_e}\), which is a contradiction as this is the ratio between an upper and lower bound on the cost of every Steiner tree. In turn, some intermediate value of \({\rm\small ALG} ^{(i)}\) must have been a good difference approximation:
Lemma 5.17.
Suppose \(\Gamma , \Gamma ^{\prime }, \kappa\), and \(\epsilon\) are chosen such that for \(\zeta ^{\prime }\) as defined in Lemma 5.15,
\begin{equation*} \min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1) \gt 0, \end{equation*}
and \(0 \lt \epsilon \lt 2/3 - 5/(12\Gamma)\). Let \(\eta\) equal
\begin{equation*} \frac{\min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)}{1 + \frac{4\Gamma - 1}{4\Gamma } (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)}. \end{equation*}
Assume \(\eta \gt 0\) and let \(I = 2(\lceil \log \frac{n \max _e c_e}{\min _e c_e} / \log (1+\eta) \rceil +1)\). Then there exists some intermediate value \({\rm\small ALG} ^*\) assigned to \({\rm\small ALG} ^{(i)}_\rho\) by the algorithm for some \(i \le I\) and \(\rho\) such that \(c({\rm\small ALG} ^* \setminus {\rm\small SOL}) \le 4 \Gamma c({\rm\small SOL} \setminus {\rm\small ALG} ^*)\) and \(c^{\prime }({\rm\small ALG} ^*) \le (4\Gamma ^{\prime } +\kappa + 1 + \epsilon) c^{\prime }({\rm\small SOL})\).
Proof.
Let \(\Phi (i) := c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)})\) for even i. Assume that the lemma is false. Since algorithm DoubleApprox guarantees that \(c^{\prime }({\rm\small ALG} ^{(i)}_{\rho }) \le (4\Gamma ^{\prime } +\kappa + 1 + \epsilon) c^{\prime }({\rm\small SOL})\), if the lemma is false it must be that for all i and \(\rho\), \(c({\rm\small ALG} ^{(i)}_{\rho } \setminus {\rm\small SOL}) \gt 4 \Gamma c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)}_{\rho })\). By Corollary 5.16, and the assumption on \(\Gamma , \Gamma ^{\prime }, \kappa , \epsilon\) in the statement of this lemma, for all i \(c({\rm\small ALG} ^{(i)}) \lt c({\rm\small ALG} ^{(i-2)})\), so the while loop on Line 3 of DoubleApprox never breaks. This means that for all even \(i \le I\), \({\rm\small ALG} ^{(i)}\) is assigned a value in DoubleApprox. We will show that this implies that for the final value of \({\rm\small ALG} ^{(I)}\), \(\Phi (I) = c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(I)}) \lt \frac{4\Gamma - 1}{4\Gamma } \min _e c_e\). The inequality \(c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL}) \gt 4 \Gamma c({\rm\small SOL} \setminus {\rm\small ALG} ^{(I)})\) implies \(c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(I)}) \gt \frac{4\Gamma - 1}{4\Gamma } c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL}).\) The value of \(c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL})\) must be positive (otherwise \(c({\rm\small ALG} ^{(I)} \setminus {\rm\small SOL}) \le 4\Gamma c({\rm\small SOL} \setminus {\rm\small ALG} ^{(I)})\) trivially), and hence it must be at least \(\min _e c_e\). These two inequalities conflict, which implies a contradiction. Hence, the lemma must be true.
We now analyze how the quantity \(\Phi (i)\) changes under the assumption that the lemma is false. Of course \(\Phi (0) \le n \max _e c_e\). Lemma 5.12 gives that \(\Phi (i) - \Phi (i+2)\) is exactly equal to \(c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+2)})\). For the value of \(\rho\) such that \(\rho \in [c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}), (1+\epsilon) c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL})]\), by Corollary 5.16 and the assumption that the lemma is false, for even i we have
\begin{equation*} \Phi (i) - \Phi (i+2) \end{equation*}
\begin{equation*} \ \ge \left[\min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)\right] \cdot c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \end{equation*}
\begin{equation*} \ \ge \left[\min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)\right] \end{equation*}
\begin{equation} \ \cdot [c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_\rho)]. \end{equation}
(12)
Lemma 5.15 (using the proof from Corollary 5.16 that \(T \le 4\Gamma ^{\prime } + \kappa + 1 +\epsilon\)), Lemma 5.12, and the inequality \(c({\rm\small ALG} ^{(i+1)}_{\rho } \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho }) \gt \frac{4\Gamma - 1}{4\Gamma } c({\rm\small ALG} ^{(i+1)}_{\rho } \setminus {\rm\small SOL})\) give:
\begin{align*} &\Phi (i+2) - [c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_\rho)] \\ \le &(e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1) c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \\ \lt & \frac{4\Gamma - 1}{4\Gamma } (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1) [c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)})]\\ \Rightarrow &[c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_\rho)] \gt \frac{1}{1 + \frac{4\Gamma - 1}{4\Gamma } (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1))}\Phi (i+2). \end{align*}
Plugging this into (12) gives:
\begin{equation*} \Phi (i+2) \lt \left(1 + \frac{\min \lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\rbrace - (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)}{1 + \frac{4\Gamma - 1}{4\Gamma } (e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1)}\right)^{-1} \Phi (i) \end{equation*}
\begin{equation*} = (1+\eta)^{-1}\Phi (i). \end{equation*}
Applying this inductively gives:
\begin{equation*} \Phi (i) \le (1+\eta)^{-i/2} \Phi (0) \le (1+\eta)^{-i/2} n \max _e c_e. \end{equation*}
Plugging in \(i = I = 2(\lceil \log \frac{n \max _e c_e}{\min _e c_e} / \log (1+\eta) \rceil +1)\) gives \(\Phi (I) \le (1 + \eta)^{-1} \min _e c_e \lt \min _e c_e\) as desired.□
To prove Lemma 5.5, we now just need to certify that it suffices to guess multiple values of \(\chi , \chi ^{\prime }\), and that the algorithm is efficient.
Proof of Lemma 5.5
If we have \(\chi \in [c({\rm\small SOL}), (1+\epsilon)\cdot c({\rm\small SOL})]\) and \(\chi ^{\prime } \in [c^{\prime }({\rm\small SOL}), (1+\epsilon)\cdot c^{\prime }({\rm\small SOL})]\), and the \(c_e^{\prime }\) values are multiples of \(\frac{\epsilon }{n}\chi ^{\prime }\), then the conditions of DoubleApprox are met. As long as (10) holds, that is:
\begin{equation*} \min \left\lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\right\rbrace - \left(e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1\right) \gt 0, \end{equation*}
then we have \(\eta \gt 0\) in Lemma 5.17, thus giving the approximation guarantee in Lemma 5.5. For any positive \(\epsilon , \kappa , \Gamma ^{\prime }\), there exists a sufficiently large value of \(\Gamma\) for (10) to hold, since as \(\Gamma \rightarrow \infty\), we have \(\zeta ^{\prime } \rightarrow 0\), \((e^{\zeta ^{\prime } (4\Gamma ^{\prime }+\kappa +1+\epsilon)} - 1) \rightarrow 0\), and \(\min \lbrace \frac{4\Gamma - 1}{8\Gamma } ,\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2}\rbrace \rightarrow \min \lbrace 1/2, \kappa /(4+4\epsilon)\rbrace\), so for any fixed choice of \(\epsilon , \kappa , \Gamma ^{\prime }\), a sufficiently large value of \(\Gamma\) causes \(\eta \gt 0\) to hold as desired.
Some value in \(\lbrace \min _e c_e, (1+\epsilon)\min _e c_e, \ldots (1+\epsilon)^{\lceil \log _{1+\epsilon }\frac{n \max _e c_e}{\min _e c_e} \rceil } \min _e c_e\rbrace\) satisfies the conditions for \(\chi\), and there are polynomially many values in this set. The same holds for \(\chi ^{\prime }\) in \(\lbrace \min _e c^{\prime }_e, \ldots (1+\epsilon)^{\lceil \log _{1+\epsilon }\frac{n \max _e c^{\prime }_e}{\min _e c^{\prime }_e} \rceil } \min _e c^{\prime }_e\rbrace\). So we can run \({\rm D}{\rm\small{OUBLE}}{\rm A}{\rm\small{PPROX}}\) for all pairs of \(\chi , \chi ^{\prime }\) (paying a polynomial increase in runtime), and output the union of all outputs, giving the guarantee of Lemma 5.5 by Lemma 5.17. For each \(\chi ^{\prime }\) we choose, we can round the edge costs to the nearest multiple of \(\frac{\epsilon }{n}\chi ^{\prime }\) before running DoubleApprox, and by Lemma 5.13 we only pay an additive \(O(\epsilon)\) in the approximation factor with respect to \(c^{\prime }\). Finally, we note that by setting \(\epsilon\) appropriately in the statement of Lemma 5.17, we can achieve the approximation guarantee stated in Lemma 5.5 for a different value of \(\epsilon\).
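A minimal sketch of this guessing loop is given below. The helper names are hypothetical, and double_approx stands in for the DoubleApprox routine from the pseudocode, which is not reimplemented here; the sketch only illustrates the geometric grid of guesses and the rounding of \(c^{\prime }\) costs described above.

import math
from itertools import product

def geometric_guesses(edge_costs, n, eps):
    # candidates min_e c_e, (1+eps) min_e c_e, ..., up to n * max_e c_e
    lo, hi = min(edge_costs), n * max(edge_costs)
    k = math.ceil(math.log(hi / lo, 1 + eps))
    return [lo * (1 + eps) ** i for i in range(k + 1)]

def union_over_guesses(c, c_prime, n, eps, double_approx):
    out = set()
    for chi, chi_p in product(geometric_guesses(list(c.values()), n, eps),
                              geometric_guesses(list(c_prime.values()), n, eps)):
        # round c' costs to multiples of (eps/n) * chi_p before each run, as in the text
        step = (eps / n) * chi_p
        c_prime_rounded = {e: round(v / step) * step for e, v in c_prime.items()}
        out |= set(double_approx(c, c_prime_rounded, chi, chi_p))
    return out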
Then, we just need to show DoubleApprox runs in polynomial time. Lemma 5.17 shows that the outer while loop of DoubleApprox only needs to be run a polynomial number (I) of times. The while loop for the forward phase runs at most \(O(n^2)\) times since each call to GreedySwap decreases the cost with respect to c by at least \(\frac{1}{10n^2} \rho\), and once the total decrease exceeds \(\rho /2\) the while loop breaks. The while loop for the backward phase runs at most \((\kappa +1+\epsilon)\frac{n}{\epsilon }\) times, since the initial cost with respect to \(c^{\prime }\) is at most \((4\Gamma ^{\prime } + \kappa + 1 + \epsilon)\chi ^{\prime }\), the while loop breaks when it is less than \(4\Gamma ^{\prime }\chi ^{\prime }\), and each call to GreedySwap improves the cost by at least \(\frac{\epsilon }{n} \chi ^{\prime }\). Lastly, GreedySwap runs in polynomial time, since the maximal paths a that need to be enumerated can be computed in polynomial time as described in Section 4.□

5.5 Proofs of Lemmas 5.14 and 5.15

Proof of Lemma 5.14
Let \({\rm\small ALG} ^{(i+1)}_{\rho , j}\) denote the value of \({\rm\small ALG} ^{(i+1)}_\rho\) after j calls to GreedySwap on \({\rm\small ALG} ^{(i+1)}_\rho\), and let J be the total number of calls of GreedySwap on \({\rm\small ALG} ^{(i+1)}_\rho\). Then \({\rm\small ALG} ^{(i+1)}_{\rho , 0}\) is the final value of \({\rm\small ALG} ^{(i)}\), and the final value of \({\rm\small ALG} ^{(i+1)}_\rho\) is \({\rm\small ALG} ^{(i+1)}_{\rho , J}\). Any time GreedySwap is invoked on \({\rm\small ALG} ^{(i+1)}_\rho\), by line 6 of DoubleApprox and the assumption \(\rho \le (1+\epsilon)c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL})\) in the lemma statement, we have:
\begin{equation*} c \big ({\rm\small ALG} ^{(i+1)}_\rho \big) \gt c({\rm\small ALG} ^{(i)}) - \rho /2 \ge c({\rm\small ALG} ^{(i)}) - \frac{1+\epsilon }{2}c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}). \end{equation*}
Then, by Lemma 5.12 and the assumption \(c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)})\) in the lemma statement, we have:
\begin{align*} c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) &\ge c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho }) \\ &=c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)}) + c({\rm\small ALG} ^{(i+1)}_\rho) - c({\rm\small ALG} ^{(i)}) \\ &\ge c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)}) - \frac{1+\epsilon }{2}c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) \\ & \ge \left(\frac{1-\epsilon }{2} - \frac{1}{4\Gamma }\right)c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}). \end{align*}
For \(\epsilon \lt 2/3 - \frac{5}{12\Gamma }\), this gives \(c({\rm\small ALG} ^{(i+1)}_{\rho } \setminus {\rm\small SOL})/n^2 \ge \frac{1}{10n^2}\rho\). So by Lemma 5.10, GreedySwap never outputs a tuple where \(stops = 1\), and thus we can ignore lines 8–10 of DoubleApprox under the conditions in the lemma statement.
Suppose \({\rm\small ALG} ^{(i+1)}_{\rho , J}\) satisfies \(c({\rm\small ALG} ^{(i+1)}_{\rho , J}) \le c({\rm\small ALG} ^{(i)}) - \rho /2\), a condition that causes the while loop at line 6 of DoubleApprox to exit and the forward phase to end. Then
\begin{align*} c({\rm\small ALG} ^{(i)}) - c({\rm\small ALG} ^{(i+1)}_{\rho , J}) &\ge \rho /2 \ge \frac{1}{2} c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL})\\ &\ge \frac{1}{2} [c({\rm\small ALG} ^{(i)} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i)})]\\ &= \frac{1}{2} [c({\rm\small ALG} ^{(i+1)}_{\rho ,0} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho ,0})]\\ &\ge \frac{1}{2}[c({\rm\small ALG} ^{(i+1)}_{\rho ,J} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho ,J}) ]\\ &\ge \frac{4\Gamma - 1}{8\Gamma } c({\rm\small ALG} ^{(i+1)}_{\rho , J}\setminus {\rm\small SOL}). \end{align*}
The second-to-last inequality uses Lemma 5.12, which implies \(c({\rm\small ALG} ^{(i+1)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , j})\) is decreasing with swaps, and the last inequality holds by the assumption \(c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho })\) in the lemma statement. Thus, if \(c({\rm\small ALG} ^{(i+1)}_{\rho , J}) \le c({\rm\small ALG} ^{(i)}) - \rho /2\), the lemma holds.
Now assume instead that \(c({\rm\small ALG} ^{(i+1)}_{\rho , J}) \gt c({\rm\small ALG} ^{(i)}) - \rho /2\) when the forward phase ends. We want a lower bound on
\begin{equation*} c({\rm\small ALG} ^{(i+1)}_{\rho , 0}) - c({\rm\small ALG} ^{(i+1)}_{\rho , J})= \sum _{j = 0}^{J-1}[c({\rm\small ALG} ^{(i+1)}_{\rho , j}) - c({\rm\small ALG} ^{(i+1)}_{\rho , j+1})]. \end{equation*}
We bound each \(c({\rm\small ALG} ^{(i+1)}_{\rho , j}) - c({\rm\small ALG} ^{(i+1)}_{\rho , j+1})\) term using Lemma 5.10 and Lemma 5.11. By Lemma 5.10 and the assumption in the lemma statement that \(c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho })\), we know there exists a swap between \(a \in {\rm\small ALG} _{\rho , j}^{(i+1)}\) and \(f \in {\rm\small SOL}\) such that
\begin{equation*} \frac{(1+\epsilon)c^{\prime }(f) - c^{\prime }(a)}{c(a) - (1+\epsilon)c(f)} \le \frac{4(1+\epsilon)\Gamma }{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)} \cdot \frac{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , j})}{c({\rm\small ALG} ^{(i+1)}_{\rho , j} \setminus {\rm\small SOL})}. \end{equation*}
By Lemma 5.11, we know that when \(G^{\prime }\) is set to a value in \([c^{\prime }(f), (1+\epsilon) \cdot c^{\prime }(f)]\) in line 2 of GreedySwap, the algorithm finds a path \(f^{\prime }\) between the endpoints of f such that \(c(f^{\prime }) \le (1+\epsilon) c(f)\) and \(c^{\prime }(f^{\prime }) \le (1+\epsilon) c^{\prime }(f)\). Thus, \((a, f^{\prime }) \in swaps\) and the swap \((a^*, f^*)\) chosen by the \((j+1)\)th call to GreedySwap satisfies:
\begin{equation*} \frac{c^{\prime }(f^*) - c^{\prime }(a^*)}{c(a^*) - c(f^*)} \le \frac{c^{\prime }(f^{\prime }) - c^{\prime }(a)}{c(a) - c(f)} \le \frac{(1+\epsilon)c^{\prime }(f) - c^{\prime }(a)}{c(a) - (1+\epsilon)c(f)} \end{equation*}
\begin{equation*} \le \ \frac{4(1+\epsilon)\Gamma }{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)} \cdot \frac{c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , j})}{c({\rm\small ALG} ^{(i+1)}_{\rho , j} \setminus {\rm\small SOL})}. \end{equation*}
Rearranging terms and observing that \(c^{\prime }({\rm\small SOL}) \ge c^{\prime }({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho ,j})\) gives:
\begin{align*} c({\rm\small ALG} ^{(i+1)}_{\rho , j}) - c({\rm\small ALG} ^{(i+1)}_{\rho , j+1}) &= c(a^*) - c(f^*)\\ &\ge \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{4(1+\epsilon)\Gamma } \cdot c({\rm\small ALG} ^{(i+1)}_{\rho ,j} \setminus {\rm\small SOL}) \frac{c^{\prime }(f^*)-c^{\prime }(a^*)}{c^{\prime }({\rm\small SOL})}\\ &= \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{4(1+\epsilon)\Gamma } \cdot c({\rm\small ALG} ^{(i+1)}_{\rho ,j} \setminus {\rm\small SOL}) \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho ,j+1})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho ,j})}{c^{\prime }({\rm\small SOL})}. \end{align*}
This in turn gives:
\begin{align*} c({\rm\small ALG} ^{(i+1)}_{\rho , 0}) - c({\rm\small ALG} ^{(i+1)}_{\rho ,J})&= \sum _{j = 0}^{J-1}[c({\rm\small ALG} ^{(i+1)}_{\rho , j}) - c({\rm\small ALG} ^{(i+1)}_{\rho , j+1})] \\ &\ge \sum _{j=0}^{J-1} \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{4(1+\epsilon)\Gamma } \cdot c({\rm\small ALG} ^{(i+1)}_{\rho ,j} \setminus {\rm\small SOL}) \\ &\qquad \qquad \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j+1})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j})}{c^{\prime }({\rm\small SOL})} \\ & \ge \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{4(1+\epsilon)\Gamma } \sum _{j=0}^{J-1} [c({\rm\small ALG} ^{(i+1)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , j})]\\ &\qquad \qquad \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j+1})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j})}{c^{\prime }({\rm\small SOL})} \\ &\ge \frac{(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{4(1+\epsilon)\Gamma } \sum _{j=0}^{J-1} [c({\rm\small ALG} ^{(i+1)}_{\rho , J} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , J})]\\ &\qquad \qquad \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j+1})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j})}{c^{\prime }({\rm\small SOL})}\\ &\ge \frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{16(1+\epsilon)\Gamma ^2} c({\rm\small ALG} ^{(i+1)}_{\rho , J} \setminus {\rm\small SOL}) \\ &\qquad \qquad \cdot \sum _{j=0}^{J-1} \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j+1})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , j})}{c^{\prime }({\rm\small SOL})}\\ &=\frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)}{16(1+\epsilon)\Gamma ^2} c({\rm\small ALG} ^{(i+1)}_{\rho , J} \setminus {\rm\small SOL}) \\ &\qquad \qquad \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , J})-c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , 0})}{c^{\prime }({\rm\small SOL})}\\ &\ge \frac{(4\Gamma - 1)(\sqrt {\Gamma }-1)(\sqrt {\Gamma }-1-\epsilon)\kappa }{16(1+\epsilon)\Gamma ^2} c({\rm\small ALG} ^{(i+1)}_{\rho , J} \setminus {\rm\small SOL}). \end{align*}
The third-to-last inequality uses Lemma 5.12, which implies \(c({\rm\small ALG} ^{(i+1)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho , j})\) is decreasing with swaps. The second-to-last inequality uses the assumption \(c({\rm\small ALG} ^{(i+1)}_\rho \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+1)}_{\rho })\) in the statement of the lemma. The last inequality uses the fact that the while loop on line 6 of DoubleApprox terminates because \(c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , J}) \gt (4\Gamma ^{\prime } + \kappa)\chi ^{\prime }\) (by the assumption that \(c({\rm\small ALG} ^{(i+1)}_{\rho , J}) \gt c({\rm\small ALG} ^{(i)}) - \rho /2\)), and lines 2 and 13 of DoubleApprox give that \(c^{\prime }({\rm\small ALG} ^{(i+1)}_{\rho , 0}) \le 4\Gamma ^{\prime } \chi ^{\prime }\).□
Proof of Lemma 5.15
Because \(c^{\prime }({\rm\small ALG} ^{(i+2)}_\rho) \gt 4\Gamma ^{\prime }\chi ^{\prime }\) in every backwards phase and \(\chi ^{\prime } \ge c^{\prime }({\rm\small SOL})\), by Lemma 5.10 whenever GreedySwap is called on \({\rm\small ALG} ^{(i+2)}_\rho\) in line 14 of DoubleApprox, at least one swap is possible. Since all edge costs are multiples of \(\frac{\epsilon }{n}\chi ^{\prime }\), and the last argument to GreedySwap is \(\frac{\epsilon }{n}\chi ^{\prime }\) (which lower bounds the decrease in \(c^{\prime }({\rm\small ALG} ^{(i+2)}_\rho)\) due to any improving swap), GreedySwap always makes a swap.
Let \({\rm\small ALG} ^{(i+2)}_{\rho , j}\) denote the value of \({\rm\small ALG} ^{(i+2)}_\rho\) after j calls to GreedySwap on \({\rm\small ALG} ^{(i+2)}_\rho\), and let J be the total number of calls to GreedySwap on \({\rm\small ALG} ^{(i+2)}_\rho\). Then \({\rm\small ALG} ^{(i+2)}_{\rho , 0}\) is the final value of \({\rm\small ALG} ^{(i+1)}_\rho\), and the final value of \({\rm\small ALG} ^{(i+2)}_\rho\) is \({\rm\small ALG} ^{(i+2)}_{\rho , J}\). We want to show that
\begin{equation*} c({\rm\small ALG} ^{(i+2)}_{\rho , J}) - c({\rm\small ALG} ^{(i+2)}_{\rho , 0}) = \sum _{j = 0}^{J-1}[c({\rm\small ALG} ^{(i+2)}_{\rho , j+1}) - c({\rm\small ALG} ^{(i+2)}_{\rho , j})] \le (e^{\zeta ^{\prime } T} - 1) c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL}). \end{equation*}
We bound each \(c({\rm\small ALG} ^{(i+2)}_{\rho , j+1}) - c({\rm\small ALG} ^{(i+2)}_{\rho , j})\) term using Lemma 5.10 and Lemma 5.11. Since in a backward phase we have \(c^{\prime }({\rm\small ALG} ^{(i+2)}_\rho) \gt 4\Gamma ^{\prime } c^{\prime }({\rm\small SOL})\), by Lemma 5.10 we know there exists a swap between \(a \in {\rm\small ALG} ^{(i+2)}_{\rho , j}\) and \(f \in {\rm\small SOL}\) such that
\begin{equation*} \frac{(1+\epsilon)c(f) - c(a)}{c^{\prime }(a) - (1+\epsilon)c^{\prime }(f)} \le \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)} \cdot \frac{c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})}{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL})}. \end{equation*}
By Lemma 5.11, we know that when \(G^{\prime }\) is set to the value in \([c(f), (1+\epsilon) \cdot c(f)]\) in line 2 of GreedySwap, the algorithm finds a path \(f^{\prime }\) between the endpoints of f such that \(c^{\prime }(f^{\prime }) \le (1+\epsilon) c^{\prime }(f)\) and \(c(f^{\prime }) \le (1+\epsilon) c(f)\). Thus, \((a, f^{\prime }) \in swaps\) and we get that the swap \((a^*, f^*)\) chosen by the \((j+1)\)th call to GreedySwap satisfies:
\begin{align*} \frac{c(f^*) - c(a^*)}{c^{\prime }(a^*) - c^{\prime }(f^*)} &\le \frac{c(f^{\prime }) - c(a)}{c^{\prime }(a) - c^{\prime }(f)} \le \frac{(1+\epsilon)c(f) - c(a)}{c^{\prime }(a) - (1+\epsilon)c^{\prime }(f)} \\ &\le \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)} \cdot \frac{c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})}{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL})}\\ &\le \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)(4\Gamma ^{\prime }-1)(4\Gamma)} \cdot \frac{c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL})}{c^{\prime }({\rm\small SOL})}. \end{align*}
The last inequality is derived using the assumption \(c({\rm\small ALG} ^{(i+2)}_{\rho } \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho })\) in the statement of the lemma, as well as the fact that for all \(j \lt J\), \(c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) \ge 4\Gamma ^{\prime } c^{\prime }({\rm\small SOL}) \Rightarrow c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) \ge c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small SOL}) \ge (4\Gamma ^{\prime }-1) c^{\prime }({\rm\small SOL})\).
This in turn gives:
\begin{align} c({\rm\small ALG} ^{(i+2)}_{\rho , J}) - c({\rm\small ALG} ^{(i+2)}_{\rho , 0}) &= \sum _{j = 0}^{J-1}[c({\rm\small ALG} ^{(i+2)}_{\rho , j+1}) - c({\rm\small ALG} ^{(i+2)}_{\rho , j})] \nonumber \\ &= \sum _{j = 0}^{J-1}\frac{c({\rm\small ALG} ^{(i+2)}_{\rho , j+1}) - c({\rm\small ALG} ^{(i+2)}_{\rho , j})}{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})} \cdot [c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})] \nonumber \\ &\le \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)(4\Gamma ^{\prime }-1)(4\Gamma)} \sum _{j = 0}^{J-1} c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})} \nonumber \\ &\le \frac{4(1+\epsilon)\Gamma ^{\prime }}{(\sqrt {\Gamma ^{\prime }}-1)(\sqrt {\Gamma ^{\prime }}-1-\epsilon)(4\Gamma ^{\prime }-1)(4\Gamma -1)} \sum _{j = 0}^{J-1} [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})} \nonumber \\ &= \zeta ^{\prime } \sum _{j = 0}^{J-1} [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \tag{13} \\ &\qquad \qquad \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})}. \tag{14} \end{align}
The last inequality is proved using the assumption \(c({\rm\small ALG} ^{(i+2)}_{\rho } \setminus {\rm\small SOL}) \gt 4\Gamma \cdot c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho })\) in the statement of the lemma, which implies
\begin{align*} c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) &= \frac{4\Gamma }{4\Gamma -1}c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - \frac{1}{4\Gamma -1}c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL})\\ &\lt \frac{4\Gamma }{4\Gamma -1}c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - \frac{4\Gamma }{4\Gamma -1}c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j}). \end{align*}
It now suffices to show
\begin{equation*} \sum _{j = 0}^{J-1} [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})} \end{equation*}
\begin{equation*} \le \ \frac{e^{\zeta ^{\prime } T} - 1}{\zeta ^{\prime }} c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL}). \end{equation*}
To do so, we view the series of swaps as occurring over a continuous timeline, where for \(j = 0, 1, \ldots J-1\) the \((j+1)\)th swap takes time \(\tau (j) = \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})}\), i.e., occurs from time \(\sum _{j^{\prime } \lt j} \tau (j^{\prime })\) to time \(\sum _{j^{\prime } \le j} \tau (j^{\prime })\). The total time taken to perform all swaps in the sum is the total decrease in \(c^{\prime }\) across all swaps, divided by \(c^{\prime }({\rm\small SOL})\), i.e., exactly T. Using this definition of time, let \(\Phi (t)\) denote \(c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})\) for the value of j satisfying \(t \in [\sum _{j^{\prime } \lt j} \tau (j^{\prime }), \sum _{j^{\prime } \le j} \tau (j^{\prime }))\). Using this definition, we get:
\begin{equation*} \sum _{j = 0}^{J-1} [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \cdot \frac{c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})}{c^{\prime }({\rm\small SOL})} = \int _0^{\rightarrow T} \Phi (t)\ dt. \end{equation*}
We conclude by claiming \(\Phi (t) \le e^{\zeta ^{\prime } t} c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL})\). Given this claim, we get:
\begin{equation*} \int _0^{\rightarrow T} \Phi (t)\ dt \le c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL}) \int _0^{\rightarrow T} e^{\zeta ^{\prime } t}\ dt = \frac{e^{\zeta ^{\prime } T} - 1}{\zeta ^{\prime }}c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL}). \end{equation*}
This completes the proof of the lemma, given the claim. We now focus on proving the claim. Since \(\Phi (t)\) is constant on each interval \([\sum _{j^{\prime } \lt j} \tau (j^{\prime }), \sum _{j^{\prime } \le j} \tau (j^{\prime }))\), it suffices to prove the claim for the values of t equal to \(\sum _{j^{\prime } \lt j} \tau (j^{\prime })\) for some j, so we proceed by induction on j. The claim clearly holds for \(j = 0\) since \(\sum _{j^{\prime } \lt 0} \tau (j^{\prime }) = 0\) and \(\Phi (0) = c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , 0}) \le c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL})\).
Assume that for \(t^{\prime } = \sum _{j^{\prime } \lt j} \tau (j^{\prime })\), we have \(\Phi (t^{\prime }) \le e^{\zeta ^{\prime } t^{\prime }} c({\rm\small ALG} ^{(i+2)}_{\rho , 0} \setminus {\rm\small SOL})\). For \(t^{\prime \prime } = t^{\prime } + \tau (j)\), by induction we can prove the claim by showing \(\Phi (t^{\prime \prime }) \le e^{\zeta ^{\prime } \tau (j)} \Phi (t^{\prime })\).
To show this, we consider the quantity
\begin{align*} \Phi (t^{\prime \prime }) - \Phi (t^{\prime }) &= [c({\rm\small ALG} ^{(i+2)}_{\rho , j+1} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j+1})] - [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \\ &= [c({\rm\small ALG} ^{(i+2)}_{\rho , j+1} \setminus {\rm\small SOL}) - c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL})] + [c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j+1})]. \end{align*}
By Lemma 5.12 and reusing the bound in (14), we have:
\begin{align*} \Phi (t^{\prime \prime }) - \Phi (t^{\prime }) &= c({\rm\small ALG} ^{(i+2)}_{\rho ,j+1}) - c({\rm\small ALG} _{\rho ,j}^{(i+2)}) \\ &\le \zeta ^{\prime } \frac{[c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})]}{c^{\prime }({\rm\small SOL})} [c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j}) - c^{\prime }({\rm\small ALG} ^{(i+2)}_{\rho , j+1})]\\ &= \zeta ^{\prime } \cdot [c({\rm\small ALG} ^{(i+2)}_{\rho , j} \setminus {\rm\small SOL}) - c({\rm\small SOL} \setminus {\rm\small ALG} ^{(i+2)}_{\rho , j})] \cdot \tau (j) = \zeta ^{\prime } \cdot \Phi (t^{\prime }) \cdot \tau (j). \end{align*}
Rearranging terms we have:
\begin{equation*} \Phi (t^{\prime \prime }) \le \left(1 + \zeta ^{\prime } \cdot \tau (j)\right)\Phi (t^{\prime })\le e^{\zeta ^{\prime } \tau (j)} \Phi (t^{\prime }), \end{equation*}
where we use the inequality \(1+x \le e^x\). This completes the proof of the claim.□

6 Hardness Results for Robust Problems

We give the following general hardness result for a family of problems that includes many graph optimization problems:
Theorem 6.1.
Let \(\mathcal {P}\) be any robust covering problem whose input includes a weighted graph G where the lengths \(d_e\) of the edges are given as ranges \([\ell _e, u_e]\), and for which the non-robust version of the problem, \(\mathcal {P}^{\prime }\), has the following properties:
(1) A solution to an instance of \(\mathcal {P}^{\prime }\) can be written as a (multi-)set S of edges in G, and has cost \(\sum _{e \in S} d_e\).
(2) Given an input including G to \(\mathcal {P}^{\prime }\), there is a polynomial-time approximation-preserving reduction from solving \(\mathcal {P}^{\prime }\) on this input to solving \(\mathcal {P}^{\prime }\) on some input including \(G^{\prime }\), where \(G^{\prime }\) is the graph formed by taking G, adding a new vertex \(v^*\), and adding a single edge of weight 0 from \(v^*\) to some \(v \in V\).
(3) For any input including G to \(\mathcal {P}^{\prime }\) and any spanning tree T of G, there exists a feasible solution using only edges of T.
Then, if there exists a polynomial-time \((\alpha , \beta)\)-robust algorithm for \(\mathcal {P}\), there exists a polynomial-time \(\beta\)-approximation algorithm for \(\mathcal {P}^{\prime }\).
Before proving Theorem 6.1, we note that robust traveling salesman and robust Steiner tree are examples of problems that Theorem 6.1 implicitly gives lower bounds for. For both problems, the first property clearly holds.
For traveling salesman, given any input G, any solution to the problem on input \(G^{\prime }\) as described in Theorem 6.1 can be turned into a solution of the same cost on input G by removing the new vertex \(v^*\) (since \(v^*\) was distance 0 from v, removing \(v^*\) does not affect the length of any tour), giving the second property. For any spanning tree of G, a walk on the spanning tree gives a valid TSP tour, giving the third property.
For Steiner tree, for the input with graph \(G^{\prime }\) and the same terminal set, for any solution containing the edge \((v, v^*)\) we can remove this edge and get a solution for the input with graph G that is feasible and of the same cost. Otherwise, the solution is already a solution for the input with graph G that is feasible and of the same cost, so the second property holds. Any spanning tree is a feasible Steiner tree, giving the third property.
We now give the proof of Theorem 6.1.
Proof of Theorem 6.1
Suppose there exists a polynomial time \((\alpha , \beta)\)-robust algorithm A for \(\mathcal {P}\). The \(\beta\)-approximation algorithm for \(\mathcal {P}^{\prime }\) is as follows:
(1) From the input instance \(\mathcal {I}\) of \(\mathcal {P}^{\prime }\) where the graph is G, use the approximation-preserving reduction (which must exist by the second property of the theorem) to construct an instance \(\mathcal {I}^{\prime }\) of \(\mathcal {P}^{\prime }\) where the graph is \(G^{\prime }\).
(2) Construct an instance \(\mathcal {I}^{\prime \prime }\) of \(\mathcal {P}\) from \(\mathcal {I}^{\prime }\) as follows: For all edges in \(G^{\prime }\), their length is fixed to their length in \(\mathcal {I}^{\prime }\). In addition, we add a “special” edge from \(v^*\) to every vertex besides v with length range \([0, \infty ]\).6
(3) Run A on \(\mathcal {I}^{\prime \prime }\) to get a solution \({\rm\small SOL}\). Treat this solution as a solution to \(\mathcal {I}^{\prime }\) (we will show it only uses edges that appear in \(\mathcal {I}^{\prime }\)). Use the approximation-preserving reduction to convert \({\rm\small SOL}\) into a solution for \(\mathcal {I}\) and output this solution.
Let O denote the cost of the optimal solution to \(\mathcal {I}^{\prime }\). Then, \({\rm\small MR} \le O\). To see why, note that the optimal solution to \(\mathcal {I}^{\prime }\) has cost O in all realizations of demands since it only uses edges of fixed cost, and thus its regret is at most O. This also implies that for all \({\bf d}\), \({\rm\small OPT} ({\bf d})\) is finite. Then for all \({\bf d}\), \({\rm\small SOL} ({\bf d}) \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR}\), i.e., \({\rm\small SOL} ({\bf d})\) is finite in all realizations of demands, so \({\rm\small SOL}\) does not include any special edges, as any solution with a special edge has infinite cost in some realization of demands.
Now consider the realization of demands \({\bf d}\) where all special edges have length 0. The special edges and the edge \((v, v^*)\) span \(G^{\prime }\), so by the third property of \(\mathcal {P}^{\prime }\) in the theorem statement there is a solution using only cost-0 edges in this realization, i.e., \({\rm\small OPT} ({\bf d}) = 0\). Then in this realization, \({\rm\small SOL} ({\bf d}) \le \alpha \cdot {\rm\small OPT} ({\bf d}) + \beta \cdot {\rm\small MR} \le \beta \cdot O\). But since \({\rm\small SOL}\) does not include any special edges, and all edges besides special edges have fixed cost that is the same in \(\mathcal {I}^{\prime \prime }\) as in \(\mathcal {I}^{\prime }\), \({\rm\small SOL} ({\bf d})\) is also the cost of \({\rm\small SOL}\) in instance \(\mathcal {I}^{\prime }\), i.e., \({\rm\small SOL}\) is a \(\beta\)-approximation for \(\mathcal {I}^{\prime }\). Since the reduction from \(\mathcal {I}\) to \(\mathcal {I}^{\prime }\) is approximation-preserving, we also get a \(\beta\)-approximation for \(\mathcal {I}\).□
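For concreteness, the following small sketch illustrates the instance \(\mathcal {I}^{\prime \prime }\) constructed in step (2). The graph representation (a dict mapping edges to fixed lengths) is hypothetical, and the large finite value M stands in for \(\infty\) as allowed by footnote 6; this is an illustration of the construction, not part of the formal proof.

def build_robust_instance(g_prime_edges, vertices, v, v_star):
    """g_prime_edges maps (x, y) -> fixed length and already contains the
    weight-0 edge (v, v_star).  Returns edge -> [lower, upper] length ranges for I''."""
    M = sum(g_prime_edges.values()) + 1                       # finite stand-in for "infinity" (footnote 6)
    ranges = {e: [d, d] for e, d in g_prime_edges.items()}    # original edges: fixed lengths
    for u in vertices:
        if u not in (v, v_star):
            ranges[(v_star, u)] = [0, M]                      # "special" edges with range [0, M]
    return ranges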
From [10, 15] we then get the following hardness results:
Corollary 6.2.
Finding an \((\alpha , \beta)\)-robust solution for Steiner tree where \(\beta \lt 96/95\) is NP-hard.
Corollary 6.3.
Finding an \((\alpha , \beta)\)-robust solution for TSP where \(\beta \lt 121/120\) is NP-hard.

7 Conclusion

In this paper, we designed constant approximation algorithms for the robust Steiner tree (STT) and robust traveling salesman (TSP) problems. More precisely, our algorithms take as input a range of possible edge lengths in a graph and obtain a single solution for the problem at hand that can be compared to the optimal solution for any realization of edge lengths in the given ranges. While our approximation bounds for TSP are small constants, those for STT are very large constants. A natural question is whether these constants can be made smaller, e.g., of the same scale as classic approximation bounds for STT. While we did not seek to optimize our constants, obtaining truly small constants for STT appears to be beyond our techniques, and is an interesting open question.
More generally, robust algorithms are a key component in the area of optimization under uncertainty that is of much practical and theoretical significance. Indeed, as mentioned in our survey of related work, several different models of robust algorithms have been considered in the literature.
Optimizing over input ranges is one of the most natural models in robust optimization, but has been restricted in the past to polynomial-time solvable problems because of definitional limitations. We circumvent this by setting regret minimization as our goal, and creating the \((\alpha , \beta)\)-approximation framework, which then allows us to consider a large variety of interesting combinatorial optimization problems in this setting. We hope that our work will lead to more research in robust algorithms for other fundamental problems in combinatorial optimization, particularly in algorithmic graph theory.

Footnotes

1
We focus on minimization problems over sets of edges in this paper, but one can easily extend the definition to maximization problems and problems over arbitrary set systems.
2
There are two common and equivalent assumptions made in the TSP literature in order to achieve reasonable approximations. In the first assumption, the algorithms can visit vertices multiple times in the tour, while in the latter, the edges satisfy the metric property. We use the former in this paper.
3
They obtain \((1, 2)\)-robust algorithms by choosing \({\rm\small ALG}\) as the optimal solution with edge weights \(\ell _e + u_e\). For any \({\bf d}\), consider \({\bf d}^{\prime }\) with weights \(d_e^{\prime } = u_e + \ell _e - d_e\). By optimality of \({\rm\small ALG}\), \({\rm\small ALG} ({\bf d}) + {\rm\small ALG} ({\bf d}^{\prime }) \le {\rm\small MRS} ({\bf d}) + {\rm\small MRS} ({\bf d}^{\prime })\). Rearranging, we get \({\rm\small ALG} ({\bf d}) - {\rm\small MRS} ({\bf d}) \le {\rm\small MRS} ({\bf d}^{\prime }) - {\rm\small ALG} ({\bf d}^{\prime }) \le {\rm\small MR}\), so \({\rm\small ALG}\)'s cost exceeds \({\rm\small MRS}\)'s cost by at most \({\rm\small MR}\) in every realization.
4
We do not prove this is \({\rm\small MRS}\): Even if it is not, it suffices to upper bound \({\rm\small MR}\) by this solution’s regret.
5
Note that \(c(a) - c(f)\) and \(\frac{c(a) - c(f)}{c(a)}\) are both maximized by maximizing \(c(a)\) and minimizing \(c(f)\). Any path f from u to v that we consider adding is independent of the paths a we can consider removing, since f by definition does not intersect with our solution. So in finding a swap satisfying these conditions if one exists, it still suffices to only consider swaps between shortest paths f and longest paths a in the resulting cycles as before.
6
\(\infty\) is used to simplify the proof, but can be replaced with a sufficiently large finite number. For example, the total weight of all edges in G suffices and has small bit complexity.

References

[1] H. Aissi, C. Bazgan, and D. Vanderpooten. 2008. Complexity of the min–max (regret) versions of min cut problems. Discrete Optimization 5, 1 (2008), 66–73.
[2] Hassene Aissi, Cristina Bazgan, and Daniel Vanderpooten. 2009. Min–max and min–max regret versions of combinatorial optimization problems: A survey. European Journal of Operational Research 197, 2 (2009), 427–438.
[3] Igor Averbakh. 2001. On the complexity of a class of combinatorial optimization problems with uncertainty. Mathematical Programming 90, 2 (April 2001), 263–272.
[4] Igor Averbakh. 2005. The minmax relative regret median problem on networks. INFORMS Journal on Computing 17, 4 (2005), 451–461.
[5] I. Averbakh and Oded Berman. 1997. Minimax regret p-center location on a network with demand uncertainty. Location Science 5, 4 (1997), 247–254.
[6] Igor Averbakh and Oded Berman. 2000. Minmax regret median location on a network under uncertainty. INFORMS Journal on Computing 12, 2 (2000), 104–110.
[7] Dimitris Bertsimas and Melvyn Sim. 2003. Robust discrete optimization and network flows. Mathematical Programming 98, 1 (September 2003), 49–71.
[8] Jaroslaw Byrka, Fabrizio Grandoni, Thomas Rothvoß, and Laura Sanità. 2010. An improved LP-based approximation for Steiner tree. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC 2010), Cambridge, Massachusetts, USA. 583–592.
[9] André Chassein and Marc Goerigk. 2015. On the recoverable robust traveling salesman problem. Optimization Letters 10 (September 2015).
[10] Miroslav Chlebík and Janka Chlebíková. 2002. Approximation hardness of the Steiner tree problem on graphs. In Algorithm Theory — SWAT 2002, Martti Penttonen and Erik Meineche Schmidt (Eds.). Springer, Berlin, 170–179.
[11] Eduardo Conde. 2012. On a constant factor approximation for minmax regret problems using a symmetry point scenario. European Journal of Operational Research 219, 2 (2012), 452–457.
[12] Kedar Dhamdhere, Vineet Goyal, R. Ravi, and Mohit Singh. 2005. How to pay, come what may: Approximation algorithms for demand-robust covering problems. In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2005), Pittsburgh, PA, USA. 367–378.
[13] Martin Groß, Anupam Gupta, Amit Kumar, Jannik Matuschke, Daniel R. Schmidt, Melanie Schmidt, and José Verschae. 2018. A local-search algorithm for Steiner forest. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), Cambridge, MA, USA. 31:1–31:17.
[14] Masahiro Inuiguchi and Masatoshi Sakawa. 1995. Minimax regret solution to linear programming problems with an interval objective function. European Journal of Operational Research 86, 3 (1995), 526–536.
[15] Marek Karpinski, Michael Lampis, and Richard Schmied. 2013. New inapproximability bounds for TSP. In Algorithms and Computation, Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam (Eds.). Springer, Berlin, 568–578.
[16] Adam Kasperski and Paweł Zieliński. 2006. An approximation algorithm for interval data minmax regret combinatorial optimization problems. Information Processing Letters 97, 5 (March 2006), 177–180.
[17] Adam Kasperski and Paweł Zieliński. 2007. On the existence of an FPTAS for minmax regret combinatorial optimization problems with interval data. Operations Research Letters 35 (2007), 525–532.
[18] P. Kouvelis and G. Yu. 1996. Robust Discrete Optimization and Its Applications. Springer US.
[19] Panos Kouvelis and Gang Yu. 1997. Robust 1-Median Location Problems: Dynamic Aspects and Uncertainty. Springer US, Boston, MA, 193–240.
[20] Helmut E. Mausser and Manuel Laguna. 1998. A new mixed integer formulation for the maximum regret problem. International Transactions in Operational Research 5, 5 (1998), 389–403.
[21] V. Vazirani. 2001. Approximation Algorithms. Springer-Verlag, Berlin.
[22] Jens Vygen. [n. d.]. New Approximation Algorithms for the TSP.
[23] Laurence A. Wolsey. 1980. Heuristic analysis, linear programming and branch and bound. In Combinatorial Optimization II, V. J. Rayward-Smith (Ed.). Springer, Berlin, 121–134.
[24] H. Yaman, O. E. Karaşan, and M. Ç. Pinar. 2001. The robust spanning tree problem with interval data. Operations Research Letters 29, 1 (2001), 31–40.
[25] P. Zieliński. 2004. The computational complexity of the relative robust shortest path problem with interval data. European Journal of Operational Research 158, 3 (2004), 570–576.
