The Massively Parallel Computation (MPC) model is a popular abstraction for large-scale distributed computing. The Set Cover problem is a classical combinatorial optimization problem that has a wide variety of applications and generalizes many well-studied problems such as vertex cover, edge cover, and minimum dominating set. For a Set Cover instance with a ground set of n elements and a collection of m sets, we present two O(log n)-approximation algorithms in the MPC model. Our algorithms run in \(\tilde{O}(\log N)\) rounds in the linear-memory MPC model and in \(\tilde{O}(\log ^{1.5} N)\) rounds in the sublinear-memory MPC model, where N = n + m. These are the first O(log n)-approximation algorithms for Set Cover in this setting that run in \(o(\log ^2 N)\) rounds. Our results are obtained by repurposing the sparsified graph exponentiation technique that has been successful for the maximal independent set problem in the MPC model and applying it to a simple, distributed Set Cover algorithm by Grunau, Mitrović, Rubinfeld, and Vakilian (SODA 2020).
1 Introduction
We present fast algorithms for the SetCover problem in the Massively Parallel Computation (MPC) model [12, 16, 24, 31]. In the SetCover problem, we are given a ground set of n elements and a collection of m sets that jointly cover all elements. The goal is to find the smallest subcollection of sets that still covers all elements. SetCover is a classical combinatorial optimization problem that generalizes many well-studied problems like vertex cover, edge cover, and minimum dominating set. The MPC model is defined by a set of machines connected via an all-to-all communication network, with each machine having at most W words of local memory (one word is O(log N) bits). Communication and computation in this model are synchronous. At the beginning of the algorithm, the input is distributed across the machines. During a round, each machine receives up to W words from the other machines, performs local computation, and sends up to W words to the other machines to guide the computation in the next round. Note that in this model the bandwidth constraint is not associated with individual edges, but with each machine.
A natural, greedy sequential algorithm provides a \((\ln n + O(1))\)-approximation [5, 23, 26]. This approximation factor is essentially optimal since there can be no \(c \ln n\)-approximation to SetCover for any c < 1 unless P = NP [11]. The SetCover problem has also been extensively studied in parallel settings, where O(log n)-approximation algorithms are known [3, 30], and these algorithms have served as the basis for distributed SetCover approximation algorithms [4, 22]. For example, in the Congest model1 of distributed computation, the work of Censor-Hillel and Dory [4] implies an algorithm for the SetCover problem that guarantees an approximation ratio of O(log n) and takes \(O(\log ^2 N)\) rounds, where N = n + m. This Congest algorithm succeeds with high probability (whp), i.e., with probability at least \(1 - 1/N^c\) for a constant c > 1.
There is an active stream of research that aims to design MPC algorithms that are significantly faster than their Congest counterparts for various problems [7, 8, 14, 15, 25]. In particular, for the Maximal Independent Set (MIS) problem, the best-known round complexity (in terms of the number of vertices, N) in the Congest model is O(log N) [1, 27], whereas in the low-memory MPC model it is \(\tilde{O}(\sqrt {\log N})\)2 [15, 25] and in the linear-memory MPC model it is O(log log N) [14]. But, as far as we know, SetCover seems to have resisted such improvements thus far and the fastest known MPC algorithm simply simulates the fastest Congest algorithm [4] one round at a time, taking \(O(\log ^2 N)\) rounds. In this paper, motivated by the speedups obtained for MIS, we show that there are substantially faster MPC algorithms for SetCover. Specifically, we prove the following theorem.
Theorem (Informal Main Result) We are given a SetCover instance with a ground set of n elements and a collection of m sets. Let N = m + n. We can compute, whp, an O(log n)-approximate solution to SetCover in
• \(\tilde{O}(\log ^{1.5} N)\) rounds in the low-memory MPC model and
• \(\tilde{O}(\log N)\) rounds in the linear-memory MPC model.
1.1 SetCover in the MPC Model
The input to the SetCover problem is a ground set X, |X| = n, and a collection \(\mathcal {S} = \lbrace S_1, S_2, \ldots , S_m\rbrace\) of subsets \(S_i \subseteq X\) satisfying \(\cup _i S_i = X\). The output of the problem is required to be the smallest subcollection \(\mathcal {C} \subseteq \mathcal {S}\) such that \(\cup _{S \in \mathcal {C}} S = X\). For a SetCover instance, we use s to denote the size of the largest set and t to denote the largest number of sets an element appears in. A SetCover instance can be represented as a bipartite graph \(G = (X, \mathcal {S}, E)\) with node set \(X \cup \mathcal {S}\) of size N = m + n, and the edge set E consisting of edges {S, x}, where x ∈ X, \(S \in \mathcal {S}\), and x ∈ S.
In the MPC model, the local memory W of each machine and the number of machines \(|\mathcal {M}|\) used are assumed to be sublinear in the input size I. Specifically, \(W = I^{\alpha }\) and \(|\mathcal {M}| = I^{\beta }\), where 0 < α < 1 and 0 < β < 1 are constants. The number of machines \(|\mathcal {M}|\) is assumed to be large enough so that we have enough memory to store the entire input, that is, \(|\mathcal {M}| \cdot W \ge I\). When the input is an N-node graph, the difficulty of designing MPC algorithms seems to be tied to how the memory per machine W relates to N. For this reason, researchers have separately studied the following three memory-size regimes. Note that in all three regimes, it is still true that the local memory parameter W is sublinear in the total input size I.
Low-Memory MPC: \(W = \Theta (N^{\delta })\) local memory per machine for a constant 0 < δ < 1.
Linear-Memory MPC: W = Θ(N) local memory per machine.
Superlinear-Memory MPC: \(W = \Theta (N^{1 + \delta })\) local memory per machine for a constant 0 < δ < 1.
It is known that SetCover can be solved (approximately) in O(1) rounds in the superlinear-memory MPC model [20] (see details below in Section 3). Our paper focuses on the two more challenging memory regimes, low-memory MPC and linear-memory MPC.
In the MPC model, the standard assumption is that the input graph edges are distributed arbitrarily among the machines. But it is convenient to design algorithms assuming that the input is distributed in a node-centric manner3. That is, for each node v in the input graph, there is a machine Mv that hosts it, and Mv knows all the neighbors of v and the machines that host these neighbors. A machine can host multiple vertices as long as the local memory constraint is not violated.
An issue that immediately arises in the low-memory MPC model is that the degree of a node itself can be too large to fit in a single machine. We deal with this issue in a standard way [2, 15, 25]. We first replace the edges incident on a node v with \(\deg (v) \gt N^\delta\) by a virtual, \(N^{O(\delta)}\)-ary balanced tree of depth O(1/δ). The root of this tree is v and its leaves are the neighbors of v. Each intermediate node is a virtual node with degree \(N^{O(\delta)}\), and thus the neighborhood of every virtual node can fit in a machine. Communication between v and its neighbors occurs via this tree and takes O(1/δ) rounds, instead of just 1 round.
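For concreteness, here is a minimal sketch (in Python; the function name and data layout are illustrative, not the paper's notation) of this standard tree construction: group the neighbors into blocks of fan-out F ≈ \(N^\delta\), create a virtual parent per block, and repeat until at most F children remain.

```python
def build_virtual_tree(root, neighbors, F):
    """Replace the star around a high-degree node `root` by a balanced tree:
    blocks of F children get a virtual parent; repeating this gives depth
    O(log_F(deg)) = O(1/delta) when F = N^delta."""
    edges, level, counter = [], list(neighbors), 0
    while len(level) > F:
        next_level = []
        for j in range(0, len(level), F):
            vnode = ("virtual", root, counter)  # fresh virtual node id
            counter += 1
            for child in level[j:j + F]:
                edges.append((vnode, child))
            next_level.append(vnode)
        level = next_level
    edges.extend((root, child) for child in level)  # top level hangs off root
    return edges
```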
2 Technical Contributions
Our starting point is a simple SetCover algorithm by Grunau, Mitrović, Rubinfeld, and Vakilian [19]. For our purpose, it is best to view this as a distributed Congest model algorithm. In the pseudocode (Algorithm 1) below, s is the size of the largest set in \(\mathcal {S}\) and t is the maximum number of sets an element appears in. We use \(\deg _{i,k}(S)\) to refer to the number of uncovered elements in S just before iteration k in stage i. This algorithm produces an O(log s)-approximate solution in expectation to SetCover [19]. This algorithm can be implemented “as is” in the linear-memory MPC model in \(O(\log \mathbf {s} \log \mathbf {t})\) rounds and in the low-memory MPC model in \(O\left(\frac{1}{\delta } \log \mathbf {s} \log \mathbf {t}\right)\) rounds.
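For readers without the pseudocode at hand, the following is a minimal sequential sketch of the base algorithm, reconstructed from the description in this section (`sets` is a list of Python sets over `universe`); details of Algorithm 1 not spelled out in the text, such as loop bounds, are our assumptions.

```python
import math
import random

def base_set_cover(sets, universe):
    """Sketch of the base algorithm [19]: in stage i, a set is a candidate if it
    still covers >= s/2^i uncovered elements (Line 4), and in iteration k of the
    stage each candidate joins the cover with probability 2^k/t (Line 5)."""
    s = max(len(S) for S in sets)                              # largest set size
    t = max(sum(1 for S in sets if x in S) for x in universe)  # max element frequency
    covered, cover = set(), []
    in_cover = [False] * len(sets)
    for i in range(1, math.ceil(math.log2(s)) + 2):            # stages
        for k in range(1, math.ceil(math.log2(t)) + 2):        # iterations
            snapshot = set(covered)          # candidates join simultaneously,
            for idx, S in enumerate(sets):   # judged against the same snapshot
                if in_cover[idx]:
                    continue
                if len(S - snapshot) >= s / 2**i:                 # Line 4
                    if random.random() < min(1.0, 2**k / t):      # Line 5
                        in_cover[idx] = True
                        cover.append(idx)
                        covered |= S
    return cover

# In the final stage the threshold drops below 1 and in the final iteration the
# selection probability reaches 1, so every element ends up covered.
```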
While this algorithm is most closely related to the nearly-30-year-old parallel SetCover algorithm of Berger, Rompel, and Shor [3], it is also similar in spirit to other parallel and distributed SetCover algorithms [4, 22, 30]. All of these algorithms emulate the standard, sequential greedy SetCover algorithm, but relax its sequential nature by processing not just one “best” set, but multiple “good enough” sets in parallel. Specifically, all the sets S for which \(\deg _{i, k}(S) \ge \mathbf {s}/2^i\) (in Line 4 of Algorithm 1) are good enough and they are candidates for being added to the set cover. For example, in Stage 1, all sets with cardinality at least s/2 are considered good enough. However, adding all of these good enough sets to the set cover can lead to a very large solution and damage the approximation factor. So sets are added in log t iterations in a randomized fashion with geometrically increasing probability. For example, in iteration 1, sets are added with probability 2/t. Intuitively, this means that elements that belong to close to t sets are covered, but only by O(1) sets. This allows the algorithm to double the set selection probability to 4/t in the next iteration and so on. Adding candidate sets to the set cover in such a way that no element is covered by more than O(1) sets leads to an O(log s)-approximation. The log t iterations in each stage essentially perform this type of local symmetry breaking in parallel. Our work in this paper is motivated by the fact that for MIS, which is a classical local symmetry breaking problem, there are techniques that lead to fast MPC algorithms.
2.1 Sparsified Graph Exponentiation
A popular technique [13, 15, 21, 28] for speeding up the simulation of Congest algorithms in “all-to-all” communication models is graph exponentiation. The main idea of this technique is that if every node v knows the state of the k-neighborhood around v, then by exchanging this information with the nodes in its k-neighborhood, every node can learn the state of the 2k-neighborhood around it. If the information exchange in each iteration can be done in O(1) rounds, then nodes can learn the state of their ℓ-neighborhood in O(log ℓ) rounds. The main obstacle to obtaining this exponential speedup is that the k-neighborhoods around nodes may be so large that exchanging them takes too many rounds. However, many Congest algorithms are randomized and there is a natural sparsification that occurs, i.e., in each round only a randomly sampled subset of the nodes are active, and the rest are silent. This implies that the k-neighborhoods that are exchanged only need to involve sparse subgraphs induced by the sampled nodes. This sparsified graph exponentiation technique has been applied to the MIS problem in the low-memory MPC model [15]. Furthermore, a simulation theorem that allows us to apply sparsified graph exponentiation as a black box was presented in [25] and applied to the 2-ruling set problem and MIS. In this paper, we show that the sparsified graph exponentiation technique can be applied to Algorithm 1 to obtain fast simulations in both the low-memory and the linear-memory MPC models.
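A centralized sketch of one doubling step, under the assumption that `ball` maps each node v to B(v, r), its set of nodes within distance r (including v itself); in the MPC implementation each union below arrives as a message from the host machine of u:

```python
def double_balls(ball):
    """One graph-exponentiation step: if ball[v] = B(v, r) for every v, then
    the union of ball[u] over u in B(v, r) is exactly B(v, 2r)."""
    new_ball = {}
    for v, B in ball.items():
        merged = set(B)
        for u in B:
            merged |= ball[u]  # in MPC, machine M_u sends B(u, r) to M_v
        new_ball[v] = merged
    return new_ball
```

Applying this step O(log ℓ) times starting from ball[v] = B(v, 1) yields B(v, ℓ).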
2.2 Sparsified Graph Exponentiation Applied to SetCover
Our overall plan is to partition the log t iterations in a stage of Algorithm 1 into batches and simulate each batch fast, using sparsified graph exponentiation. Consider a batch with b iterations, k = B + 1, B + 2, …, B + b, in Stage i. At the start of this batch, it is possible to identify sets that will be “active” in this batch. Call a set S a candidate for this batch if \(\deg _{i, B+1}(S) \ge \mathbf {s}/2^i\). For each candidate set S, we toss b coins \(c_k(S)\) for k = B + 1, B + 2, …, B + b with \(\text{Prob}[c_k(S) = 1] = 2^k/\mathbf {t}\). Call a set S selected in this batch if it is a candidate for this batch and \(c_k(S) = 1\) for some B + 1 ≤ k ≤ B + b. It is clear from Lines 4-5 in Algorithm 1 that the only sets that can join the set cover in this batch are those that are selected in this batch.
Whether a selected set S is added to the set cover in this batch can be determined by examining the 2b-hop neighborhood of S in the subgraph induced by selected sets and all elements. We could get a fast simulation of Algorithm 1 if we could use the graph exponentiation technique to gather this subgraph at S, for every selected set S, in O(log b) rounds. Unfortunately, this graph is too large because while we have sparsified the sets (by keeping only selected sets), we have not sparsified the elements. To explain our idea of element sparsification, we first describe the role of elements in the sparsified graph. Consider a candidate set S for which \(c_k(S) = 1\) for some B + 1 ≤ k ≤ B + b, but \(c_{k^{\prime }}(S) = 0\) for all B + 1 ≤ k′ < k. In other words, k denotes the first iteration in this batch in which S is selected. Even though S is selected in iteration k, whether it should be added to the set cover depends on whether S is still a candidate just before iteration k, i.e., if \(\deg _{i, k}(S) \ge \mathbf {s}/2^i\). To check this, S needs to know the number of uncovered elements it contains (though not necessarily which elements are uncovered). In order to sparsify elements, we use the insight that it is enough for S to have a good enough estimate of the number of uncovered elements it contains. It turns out that it is enough to randomly sample a small number of elements so that there are roughly Θ(log N) sampled elements for each iteration in the neighborhood of each candidate set.
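To make the element sparsification concrete, here is a minimal sketch of the sampling step and the resulting degree estimator (the function names are ours; the sampling probability and rescaling rule follow Section 4 and Lemma 4.1):

```python
import math
import random

def sample_group(uncovered_elements, i, s, N, c=2):
    """Sample each not-yet-covered element with probability 2c*log(N)*2^i/s;
    one independent 'group' of samples is drawn per iteration of the batch."""
    p = min(1.0, 2 * c * math.log2(N) * 2**i / s)
    return {e for e in uncovered_elements if random.random() < p}

def estimated_degree(num_sampled_still_uncovered, num_sampled, deg_batch_start):
    """Estimate deg_{i,l}(S): rescale the exact start-of-batch uncovered degree
    of S by the fraction of its sampled neighbors that are still uncovered."""
    if num_sampled == 0:
        return 0
    return (num_sampled_still_uncovered / num_sampled) * deg_batch_start
```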
2.3 Main Results
Let N = m + n. For the low-memory MPC algorithm, each batch has a fixed length of \(\delta \sqrt {\log N}/4\) iterations and can be simulated using sparsified graph exponentiation in O(log log N) rounds whp. This leads to a low-memory MPC model algorithm that runs in \(\tilde{O}\left(\frac{1}{\delta ^2}\log \mathbf {s}\log \mathbf {t}/\sqrt {\log N}\right)\) rounds (see Theorem 4.6). In the linear-memory MPC algorithm, we simulate the first log t − O(log log N) iterations of each stage i by using the sparsification technique developed in the low-memory MPC algorithm. The interesting point here is that we do not need sparsified graph exponentiation because the active graph induced by the selected candidate sets and sampled elements in the first log t − O(log log N) iterations has size O(N) and can be sent to a single machine for local processing. Then the remaining O(log log N) iterations can be executed one by one. Our linear-memory MPC algorithm has an overall running time of O(log s · log log N) rounds whp in the linear-memory MPC model (see Theorem 5.2).
3 Related Work
Harvey, Liaw, and Liu [20] present what can be viewed as an O(1)-round superlinear-memory MPC algorithm giving a t-approximation for weighted SetCover. More precisely, their algorithm runs in \(O((c/\mu)^2)\) rounds, where the space per machine is assumed to be \(\Theta (\mathbf {t} \cdot n^{1 + \mu })\).
It is also worth mentioning that Grunau, Mitrović, Rubinfeld, and Vakilian [19] use their base algorithm to design algorithms in the Local Computation Algorithms (LCA) model with small query complexity. In the LCA model, one can perform queries on the input and learn about a small portion of the output. For example, for SetCover in the LCA model, we want to perform as few queries as possible to find out if a set S belongs to the approximate set cover solution. Some of the techniques used in the present paper (e.g., sparsified graph exponentiation) for SetCover in the MPC model may be implicit in [19], where they are used to reduce query complexity in the LCA model. However, none of the results in [19] are in the MPC model.
In a recent paper, Götte, Kolb, Scheideler, and Werthmann [17] present a SetCover approximation algorithm in the beeping model4 and also present a variant that has low message complexity in the Congest model. Even though this algorithm has many features specific to the beeping model, at its core it is similar to the base algorithm in [19]. It is important to note that even though our work is in the MPC model, [19] is in the LCA model, and [17] is in the beeping and Congest models, the common goal in all of these is to reduce the message complexity of the algorithms without worsening the quality of the approximation.
Note added to the camera-ready version: As the current paper was being reviewed, we were alerted to a paper by Dhulipala, Dinitz, Łącki, and Mitrović posted on arXiv [10] (and to appear in DISC 2024 [9]), one of whose results is very similar to our result. In fact, the “new, surprisingly simple” algorithm referred to in the abstract of this paper [10] seems quite similar to the “base algorithm” from [19] that we refer to. We note that [10] uses this algorithm as a starting point for deriving multiple algorithms (with different approximation factors), not just in the MPC model, but also in the PRAM model.
4 Low-Memory MPC Algorithm
Fix an arbitrary stage i. The log t iterations in the stage are partitioned into batches, each batch consisting of \(\delta \sqrt {\log N}/4\) iterations. To be precise, let \(P_j = (\delta \sqrt {\log N}/4)\cdot (j-1)\) denote the batch boundaries. Then batch j, for j = 1, 2, …, consists of iterations \(k = P_j + 1\) through \(k = P_{j+1}\). Recall that a set S is called a candidate for batch j if \(\deg _{i, P_j+1}(S) \ge \mathbf {s}/2^i\). Also recall that each candidate S tosses \(\delta \sqrt {\log N}/4\) coins \(c_k(S)\) for \(P_j + 1 \le k \le P_{j+1}\) with \(\text{Prob}[c_k(S) = 1] = 2^k/\mathbf {t}\) to mimic the random selection of candidates in Line 5 in Algorithm 1. A candidate set S is said to be selected if \(c_k(S) = 1\) for some \(P_j + 1 \le k \le P_{j+1}\). A candidate set S is said to be selected in iteration ℓ, \(P_j + 1 \le \ell \le P_{j+1}\), if \(c_\ell (S) = 1\) and \(c_k(S) = 0\) for all \(P_j + 1 \le k \lt \ell\). As mentioned in the previous section, we just need enough not-yet-covered elements to participate in the algorithm in batch j to obtain an up-to-date estimate of the number of not-yet-covered elements in each selected candidate set. For this purpose, we sample each element that is not covered by the start of batch j with probability \(2c\log N \cdot 2^i/\mathbf {s}\), independently for each iteration in batch j, for some constant c that will be fixed later. This is done so that we can get independent degree estimates for each iteration.
Let \(A = (\mathcal {S}_A, X_A, E_A)\) be the bipartite graph with node set \(\mathcal {S}_A \cup X_A\), where \(\mathcal {S}_A\) is the set of selected candidate sets in batch j, \(X_A\) is the set of sampled elements in batch j, and the edge set \(E_A\) consists of edges {S, e}, where \(S \in \mathcal {S}_A\), \(e \in X_A\), and e ∈ S. We call A the active graph for batch j because it includes all sets and elements that might be active in batch j. Let \(B_A(v, r)\) be the subgraph of A induced by nodes that are at most r hops from \(v \in \mathcal {S}_A \cup X_A\). Each node w in \(B_A(v, r)\) representing a set is labeled with the following information: (a) \(\deg _{i, P_j+1}(w)\) and (b) the coin toss sequence \((c_{P_j+1}(w), c_{P_j+2}(w), \ldots , c_{P_{j+1}}(w))\). The algorithm that simulates batch j in stage i is described in Algorithm 2.
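A centralized sketch of how the active graph could be assembled (the dictionary-based layout is ours; in the MPC algorithm the pieces of A are of course held distributedly by the host machines):

```python
def build_active_graph(sets, selected, sampled):
    """Bipartite active graph A = (S_A, X_A, E_A): selected candidate sets on
    one side, sampled elements on the other, edges given by containment."""
    S_A = set(selected)      # indices of selected candidate sets
    X_A = set(sampled)       # sampled elements
    E_A = [(idx, e) for idx in S_A for e in sets[idx] if e in X_A]
    return S_A, X_A, E_A
```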
As informally described in the introduction, for a set S that is selected in iteration ℓ, \(P_j + 1 \le \ell \le P_{j+1}\) (i.e., in batch j), we use a sample of elements to estimate \(\deg _{i,\ell }(S)\) and determine if \(\deg _{i, \ell }(S) \ge \mathbf {s}/2^i\). The following lemma shows that the estimator used in Line 9 of Algorithm 2 for \(\deg _{i, \ell }(S)\) is good enough. Specifically, set S may be incorrectly classified as a candidate when the number of not-yet-covered elements in S is very close, i.e., within a \(\left[\frac{1}{(1+\epsilon)}, \frac{1}{(1-\epsilon)}\right]\) factor of the Stage i threshold of \(\mathbf {s}/2^i\), for an arbitrarily small constant ϵ. However, this only worsens the approximation factor of the algorithm by a factor that depends on ϵ.
Lemma 4.1
Suppose that S is a candidate selected in iteration ℓ, \(P_j + 1 \le \ell \le P_{j+1}\). Let C(S) be the subset of S of sampled elements in batch j. Let \(C_u(S) \subseteq C(S)\) be the set of sampled neighbors of S that remain uncovered at the end of iteration ℓ − 1. Then, for any constant ϵ > 0, whp,
• if \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \ge \mathbf {s}/2^i\), then \(\deg _{i,\ell }(S) \ge \frac{\mathbf {s}}{2^i(1+\epsilon)}\), and
• if \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \lt \mathbf {s}/2^i\), then \(\deg _{i,\ell }(S) \lt \frac{\mathbf {s}}{2^i(1-\epsilon)}\).
Proof.
Let U denote the set of not-yet-covered elements in S just before the start of the batch. Thus, \(\deg _{i,P_j+1}(S) = |U|\). Since S is a candidate before the start of this batch, \(|U| \ge \mathbf {s}/2^i\) and therefore \(\mathbb {E}[|C(S)|] = (2c\log N \cdot 2^i/\mathbf {s}) \cdot |U| \ge 2c\log N\), and by Chernoff bounds
\(\text{Prob}\left[\,\big| |C(S)| - \mathbb {E}[|C(S)|] \big| \ge \frac{\epsilon }{4} \cdot \mathbb {E}[|C(S)|]\,\right] \le 1/N^3 \qquad (1)\)
for large enough constant c.
Furthermore, \(\mathbb {E}[|C_u(S)|] = (2c\log N \cdot 2^i/\mathbf {s}) \cdot \deg _{i,\ell }(S)\). We partition the rest of the proof into two cases depending on how large \(\deg _{i,\ell }(S)\) is relative to \(\mathbf {s}/2^{i+1}\).
Case (1): \(\deg _{i,\ell }(S) \ge \mathbf {s}/2^{i+1}\). In this case, \(\mathbb {E}[|C_u(S)|] \ge c \log N\) and by Chernoff bounds, for large enough constant c,
\(\text{Prob}\left[\,\big| |C_u(S)| - \mathbb {E}[|C_u(S)|] \big| \ge \frac{\epsilon }{4} \cdot \mathbb {E}[|C_u(S)|]\,\right] \le 1/N^3. \qquad (2)\)
Let \(\mathcal {E}_1\) denote the event \(\big| |C(S)| - \mathbb {E}[|C(S)|] \big| \lt \frac{\epsilon }{4} \cdot \mathbb {E}[|C(S)|]\) and similarly let \(\mathcal {E}_2\) denote the event \(\big| |C_u(S)| - \mathbb {E}[|C_u(S)|] \big| \lt \frac{\epsilon }{4} \cdot \mathbb {E}[|C_u(S)|]\). Then, by the bounds in (1) and (2), \(\text{Prob}[\mathcal {E}_1 \wedge \mathcal {E}_2] \ge 1 - 2/N^3 \ge 1 - 1/N^2\) for N ≥ 2. Therefore, with probability at least \(1 - 1/N^2\),
\((1-\epsilon) \cdot \deg _{i,\ell }(S) \le \frac{|C_u(S)|}{|C(S)|}\cdot \deg _{i, P_j+1}(S) \le (1+\epsilon) \cdot \deg _{i,\ell }(S).\)
Case (2): \(\deg _{i,\ell }(S) \lt \mathbf {s}/2^{i+1}\). In this case, \(\mathbb {E}[|C_u(S)|] \lt c \log N\). Therefore, by Chernoff bounds, \(\text{Prob}[|C_u(S)| \ge (3c/2)\log N] \le 1/N^3\). The bound in (1) also implies that \(\text{Prob}[|C(S)| \le ((3c/2)\log N \cdot 2^i/\mathbf {s}) \cdot |U|] \le 1/N^3\). Therefore, using reasoning similar to Case (1), with probability at least \(1 - 1/N^2\), both the events \(|C_u(S)| \lt (3c/2)\log N\) and \(|C(S)| \gt ((3c/2)\log N \cdot 2^i/\mathbf {s}) \cdot |U|\) occur. This implies that with probability at least \(1 - 1/N^2\),
\(\frac{|C_u(S)|}{|C(S)|}\cdot \deg _{i, P_j+1}(S) \lt \frac{(3c/2)\log N}{((3c/2)\log N \cdot 2^i/\mathbf {s}) \cdot |U|} \cdot |U| = \mathbf {s}/2^i.\)
To finish the proof we separately consider the two parts in the statement of the lemma.
• If \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \ge \mathbf {s}/2^i\), then with probability at least \(1 - 1/N^2\), Case (1) holds. This is because if Case (2) were to hold, then with probability at least \(1 - 1/N^2\), \((|C_u(S)|/|C(S)|) \cdot |U| \lt \mathbf {s}/2^i\). Thus, whp \(\deg _{i,\ell }(S) \ge \mathbf {s}/2^{i+1}\) and therefore \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \le (1+\epsilon) \cdot \deg _{i,\ell }(S)\). Hence, \(\deg _{i,\ell }(S) \ge \frac{\mathbf {s}}{2^i(1+\epsilon)}\).
• If \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \lt \mathbf {s}/2^i\), then either Case (1) or Case (2) above may hold. If Case (1) holds, then \(\deg _{i,\ell }(S) \ge \mathbf {s}/2^{i+1}\) and \((|C_u(S)|/|C(S)|)\cdot \deg _{i, P_j+1}(S) \ge (1-\epsilon) \cdot \deg _{i,\ell }(S)\), and therefore \(\deg _{i,\ell }(S) \lt \frac{\mathbf {s}}{2^i(1-\epsilon)}\). If Case (2) holds, then \(\deg _{i,\ell }(S) \lt \mathbf {s}/2^{i+1} \lt \frac{\mathbf {s}}{2^i(1-\epsilon)}\).
□
The lemma above shows that the proposed estimator for \(\deg _{i, \ell }(S)\) is good enough. To compute this estimator, a selected candidate set S needs to know the quantities |Cu(S)|, |C(S)|, and \(\deg _{i, P_j+1}(S)\). The latter two quantities are available at the start of the batch just by communicating with neighbors. The following lemma shows that Cu(S) can be computed via local computation on \(B_A(S, \delta \sqrt {\log N}/2)\).
Lemma 4.2
Suppose that a candidate set S is selected in iteration ℓ, \(P_j + 1 \le \ell \le P_{j+1}\). Further suppose that the machine hosting set S knows \(B_A(S, \delta \sqrt {\log N}/2)\). Then, the machine can determine whp, via local computation, the set \(C_u(S)\) of sampled neighbors of S that remain uncovered after ℓ − 1 iterations.
Proof.
Let \(M_S\) denote the machine hosting set S. We will show by induction that for any h, 1 ≤ h ≤ ℓ, machine \(M_S\) can perform local computation on \(B_A(S, \delta \sqrt {\log N}/2)\) to determine whp, for any selected candidate set Y in \(B_A(S, 2(\ell - h))\), the set \(C_u(Y)\) of sampled neighbors of Y that remain uncovered after k iterations, for any k ≤ h − 1. Plugging h = ℓ into this claim yields the lemma.
Base case: h = 1. Consider a selected candidate set \(Y \in B_A(S, 2(\ell - 1))\). Let C(Y) denote the set of sampled neighbors of Y. After h − 1 = 0 iterations, all nodes in C(Y) remain uncovered. Since all elements in C(Y) belong to \(B_A(S, 2(\ell -1)+1) \subseteq B_A(S, \delta \sqrt {\log N}/2)\), machine \(M_S\) can compute the set \(C_u(Y) = C(Y)\) from \(B_A(S, \delta \sqrt {\log N}/2)\) using the first group of sampled elements.
Inductive step: We assume the inductive claim is true after iteration h. In other words, for any selected candidate set \(Y \in B_A(S, 2(\ell - h))\) and k ≤ h − 1, machine \(M_S\) can compute the set \(C_u(Y)\) of sampled neighbors of Y that remain uncovered after k iterations via local computation on \(B_A(S, \delta \sqrt {\log N}/2)\) using the first h − 1 groups of sampled elements. Therefore, for each such Y and k, machine \(M_S\) knows if \((|C_u(Y)|/|C(Y)|)\cdot \deg _{i, k}(Y) \ge \mathbf {s}/2^i\) and if Y is selected in iteration k + 1. This implies that for each such Y and k, machine \(M_S\) knows if Y joins the set cover in iteration k + 1. Now consider any sampled element \(x \in B_A(S, 2(\ell - h) - 1)\) belonging to a group k′ ≤ h. All neighbors of x that are selected candidates are in \(B_A(S, 2(\ell - h))\). Therefore, for each k ≤ h − 1, machine \(M_S\) can figure out via local computation on \(B_A(S, \delta \sqrt {\log N}/2)\) if x is covered after k′ = k + 1 iterations have been completed. This implies that for any selected candidate set \(Y \in B_A(S, 2(\ell - h) - 2) = B_A(S, 2(\ell - (h + 1)))\) and any k′ ≤ h, machine \(M_S\) can compute \(C_u(Y)\), the set of sampled neighbors in group k′ that remain uncovered after k′ iterations. This establishes the inductive hypothesis after iteration h + 1. □
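The induction can be read as the following centralized sketch: given the gathered ball, replay the batch iteration by iteration using only the information stored in the ball. This is a simplification under assumed data structures (dictionaries keyed by set nodes); among other simplifications, it reuses one sample group for all iterations instead of one independent group per iteration, and it is not the paper's Algorithm 2 itself.

```python
def replay_batch(ball_sets, coins, deg_start, threshold):
    """ball_sets: set node -> its sampled neighbors inside the gathered ball;
    coins[S][step]: coin c_k(S) for the step-th iteration of the batch;
    deg_start[S]: deg_{i,P_j+1}(S); threshold: s/2^i.
    Returns the sets in the ball that join the cover during the batch."""
    covered = set()   # sampled elements covered so far within this batch
    joined = set()
    num_steps = len(next(iter(coins.values()))) if coins else 0
    for step in range(num_steps):
        newly = []
        for S, nbrs in ball_sets.items():
            if S in joined or not coins[S][step]:
                continue
            still_uncovered = sum(1 for e in nbrs if e not in covered)
            est = (still_uncovered / len(nbrs)) * deg_start[S] if nbrs else 0
            if est >= threshold:   # estimated candidacy test of Lemma 4.1
                newly.append(S)
        for S in newly:            # qualifying sets of an iteration join together
            joined.add(S)
            covered.update(ball_sets[S])
    return joined
```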
In the next lemma, we show an upper bound on the maximum degree of the active graph. Subsequently, we use this (in Lemma 4.4) to show that every \(\delta \sqrt {\log N}/2\)-neighborhood in the active graph has size \(O(N^{\delta /2})\).
Lemma 4.3
At the start of batch j, whp every not-yet-covered element x has at most \(\mathbf {t} \cdot \ln N/2^{P_j}\) candidate neighbors.
Proof.
Consider an element x ∈ X that is not yet covered at the start of batch j. Suppose x has more than \(\mathbf {t} \cdot \ln N/2^{P_j}\) candidate neighbors at the start of batch j. Each of these candidates is selected with probability \(2^{P_j+1}/\mathbf {t}\) in batch j − 1. Then the probability that none of these candidates is selected in batch j − 1 is at most
\(\left(1 - \frac{2^{P_j+1}}{\mathbf {t}}\right)^{\mathbf {t} \cdot \ln N/2^{P_j}} \le e^{-2\ln N} = \frac{1}{N^2}.\)
Thus with probability at least \(1 - 1/N^2\) such an element x is covered before the start of batch j. By the union bound, with probability at least 1 − 1/N, every element x ∈ X not yet covered at the start of batch j has at most \(\mathbf {t} \cdot \ln N/2^{P_j}\) candidate neighbors. □
Lemma 4.4
For any node x ∈ A, the number of edges in the ball \(B_A(x, \delta \sqrt {\log N}/2)\) is bounded by \(O(N^{\delta /2})\) whp.
Proof.
By Lemma 4.3, each not-yet-covered element has at most \(\mathbf {t} \ln N/2^{P_j}\) neighboring candidates with high probability. Furthermore, each candidate set is selected with probability at most \(2\cdot 2^{P_{j+1}}/\mathbf {t}\). Thus, in expectation, a not-yet-covered element has at most \(2\ln N\cdot 2^{\delta \sqrt {\log N}/4}\) selected candidate neighbors. By Chernoff bounds, a not-yet-covered node has more than \((6\log N)\cdot 2\ln N\cdot 2^{\delta \sqrt {\log N}/4}\) selected candidate neighbors with probability at most \(2^{-6\log N} = 1/N^6\). Thus, whp, every not-yet-covered element has at most \((6\log N)\cdot 2\ln N\cdot 2^{\delta \sqrt {\log N}/4} = O(\log ^2 N\cdot 2^{\delta \sqrt {\log N}/4})\) selected candidate neighbors.
On the other hand, every not-yet-covered element e is sampled with probability \(2c\log N \cdot 2^i/\mathbf {s}\). At the beginning of stage i, the number of uncovered elements of any set not currently in the set cover is at most \(\mathbf {s}/2^{i-1}\), since any set with more not-yet-covered elements would have been a candidate in the previous stage and was either added to the set cover or now has fewer than \(\mathbf {s}/2^{i-1}\) uncovered elements. Thus, the expected number of sampled not-yet-covered elements in any candidate set in this batch is at most 4clog N. By Chernoff bounds, the probability that a set not in the current set cover contains more than 24clog N sampled uncovered elements is less than \(2^{-24c\log N} = 1/N^{24c}\). Thus whp every set not in the current set cover has at most 24clog N sampled uncovered elements from one group. And since we independently sample a group of elements for each iteration in the batch, whp every set not in the current set cover has at most \(24c\log ^{1.5} N\) sampled uncovered elements overall.
Combining the previous two results, the degree of the active graph is bounded by \(\max \lbrace O(\log ^2 N\cdot 2^{\delta \sqrt {\log N}/4}), O(\log ^{1.5} N)\rbrace = O(\log ^2 N \cdot 2^{\delta \sqrt {\log N}/4})\) whp. Next, we can bound the number of edges in \(B_A(x, \delta \sqrt {\log N}/2)\) by \((C\log ^2 N \cdot 2^{\delta \sqrt {\log N}/4})^{\delta \sqrt {\log N}/2}\) for some constant C. This quantity is bounded above by \(O(N^{\delta /2})\). □
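To see the final bound, take logarithms; this short calculation is implicit in the proof above:
\[
\log _2\left(\left(C\log ^2 N \cdot 2^{\delta \sqrt {\log N}/4}\right)^{\delta \sqrt {\log N}/2}\right) = \frac{\delta \sqrt {\log N}}{2}\left(\log _2 C + 2\log _2\log N + \frac{\delta \sqrt {\log N}}{4}\right) = \frac{\delta ^2}{8}\log N + O\!\left(\delta \sqrt {\log N}\,\log \log N\right),
\]
so the ball size is \(N^{\delta ^2/8 + o(1)} = O(N^{\delta /2})\), since \(\delta ^2/8 \lt \delta /2\) for 0 < δ < 1.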
Lemma 4.5
Suppose that the total memory, in all machines, is \(\Omega (N^{1+\delta })\). Then, for every set \(S \in \mathcal {S}_A\), the subgraph \(B_A(S, \delta \sqrt {\log N}/2)\) can be gathered at S (in parallel) in O(log log N) rounds.
Proof.
Suppose the total memory is \(\Omega (N^{1+\delta })\). In that case, there are Ω(N) machines (since per-machine memory is \(O(N^{\delta })\)) and therefore we can assume without loss of generality that every machine hosts O(1) sets or elements. For a node v in the active graph A, let \(M_v\) denote the machine hosting v. Suppose that each machine \(M_v\) knows \(B_A(v, p)\) for some \(0 \le p \le \delta \sqrt {\log N}\). Each machine \(M_v\) then sends \(B_A(v, p)\) to machine \(M_u\) for every node u in \(B_A(v, p)\). After this communication is completed, each machine \(M_v\) can construct \(B_A(v, 2p)\) from the information it has received because \(B_A(v, 2p)\) is contained in the union of \(B_A(u, p)\) for all u in \(B_A(v, p)\).
We now argue that this communication can be performed in O(1) rounds. First, note that by Lemma 4.4, the size of \(B_A(v, p)\) is bounded above by \(O(N^{\delta /2})\). This also means that \(B_A(v, p)\) contains \(O(N^{\delta /2})\) nodes. Therefore, \(M_v\) needs to send a total of \(O(N^{\delta /2}) \times O(N^{\delta /2}) = O(N^{\delta })\) words. A symmetric argument shows an \(O(N^{\delta })\) bound on the number of words \(M_v\) receives. Since \(O(N^{\delta })\) words can be sent and received in each communication round by a machine, this communication can be completed in O(1) rounds. □
Theorem 4.6
There is a low-memory MPC algorithm that runs in \(\tilde{O}\left(\frac{1}{\delta ^2}\log \mathbf {s}\log \mathbf {t}/\sqrt {\log N}\right)\) rounds and produces an O(log s)-approximate set cover whp. This algorithm needs \(O(I + N^{1+\delta })\) total memory.
Proof.
The purpose of Line 1 in Algorithm 2 is for every element to know if it is covered by a set added to the set cover in the previous batch. This boolean information can be aggregated up the depth-O(1/δ) tree representing the neighborhood of each element. Hence, Line 1 can be implemented in O(1/δ) rounds. The purpose of Line 2 in Algorithm 2 is for every set S not in the set cover to know \(\deg _{i,P_j+1}(S)\), i.e., the number of elements in S not yet covered at the end of the previous batch. This count can be aggregated up the depth-O(1/δ) tree representing the neighborhood of each set. Hence, Line 2 can be implemented in O(1/δ) rounds. Lines 3-5 involve local computation only. Line 6 is the graph exponentiation step, which takes O(log log N) rounds in the low-memory MPC model according to Lemma 4.5. Lines 7-9 only require local computation. Hence, Algorithm 2, which simulates a batch of iterations, can be implemented in O(1/δ + log log N) rounds. Each batch consists of \(\delta \sqrt {\log N}/4\) iterations and thus there are \(4\log \mathbf {t}/(\delta \sqrt {\log N})\) batches in a stage. This implies a total of \(O\left(\frac{1}{\delta ^2} \log \mathbf {t} \log \log N/ \sqrt {\log N}\right)\) rounds per stage and thus a total of \(\tilde{O}\left(\frac{1}{\delta ^2}\log \mathbf {s}\log \mathbf {t}/\sqrt {\log N}\right)\) rounds.
The fact that Algorithm 1 is an O(log s)-approximation (in expectation) is shown in [18, 19]. Algorithm 2 simulates Algorithm 1 faithfully except that a set S may be incorrectly classified as a candidate when the number of not-yet-covered elements in S is very close to the Stage i threshold of \(\mathbf {s}/2^i\). Specifically, in Algorithm 1, a set S is a candidate in iteration ℓ in Stage i if \(\deg _{i, \ell }(S) \ge \mathbf {s}/2^i\). But, as per Lemma 4.1, S may be classified as a candidate even if \(\deg _{i, \ell }(S)\) is slightly smaller, i.e., in the range \(\left[\frac{\mathbf {s}}{2^i(1+\epsilon)}, \frac{\mathbf {s}}{2^i}\right)\). Similarly, S may be classified as a non-candidate even if \(\deg _{i, \ell }(S)\) is slightly larger, i.e., in the range \(\left[\frac{\mathbf {s}}{2^i}, \frac{\mathbf {s}}{2^i(1-\epsilon)} \right)\). This approximate candidate estimation only worsens the approximation factor by a factor that depends on ϵ, which we can take to be an arbitrarily small constant.
The O(log s)-approximation factor holds only in expectation. We can make the O(log s)-approximation hold whp (with a larger constant) using the standard technique of running O(log N) copies of the algorithm and returning the smallest solution. Standard Chernoff bounds show that the minimum of these solutions has an approximation ratio O(log s) whp. This increases our memory usage (both local and total) by a factor of log N, but this can be absorbed in the big-Oh notation by reducing δ to 0.9δ since \(\log N = O(N^{0.1\delta })\). This only increases the round complexity by a constant factor. □
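The boosting step is a one-liner; a sketch, assuming run_once() executes one independent copy of the algorithm and returns the chosen subcollection:

```python
def boost_whp(run_once, copies):
    """Run independent copies and keep the smallest cover: the minimum of
    O(log N) independent O(log s)-expected-approximate runs is
    O(log s)-approximate whp, by standard Chernoff bounds."""
    return min((run_once() for _ in range(copies)), key=len)
```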
4.1 Linear Total Memory
In this section, we describe how to modify Algorithm 2 to get linear total memory. This comes at the cost of a slightly higher round complexity compared to Theorem 4.6.
Theorem 4.7
There is a low-memory MPC algorithm that runs in \(\tilde{O}\left(\frac{1}{\delta ^2}\sqrt {\log \mathbf {s}}\log \mathbf {t}\right)\) rounds and produces an O(log s)-approximate set cover whp. This algorithm needs \(\tilde{O}(I)\) total memory.
Proof.
To use less total memory, we modify Algorithm 2 so that each batch in stage i executes \(R = \delta \sqrt {\log \mathbf {s}}/4\) iterations. Therefore, the size of the active graph in Lemma 4.4 is bounded above by \(O(\mathbf {s}^{\delta /2})\) whp. In stage i, only candidates with degree in the range \([\mathbf {s}/2^i, \mathbf {s}/2^{i-1})\) are active. Therefore, until stage (1 − δ/2)log s, we can charge the active graph collected at an active set S to unique input graph edges for each batch, since \(\deg (S) \ge \mathbf {s}^{\delta /2}\). These stages are simulated in \(O\left(\frac{1}{\delta ^2}\log \mathbf {t} \sqrt {\log \mathbf {s}}\right)\) rounds. Now we want to simulate the remaining (δ/2)log s stages. We shrink the batch size to R/2, and this allows us to simulate an additional (δ/4)log s stages while still charging the active graph (of size \(O(\mathbf {s}^{\delta /4})\)) to unique input edges for each batch. We keep halving the batch size at most O(log log s) times, simulating \((\delta /2^{t+1})\log \mathbf {s}\) stages with batch size \(R/2^t\), until we simulate all stages. This simulation runs in \(\tilde{O}\left(\frac{1}{\delta ^2} \sum _{t=1}^{\log \log \mathbf {s}} \delta 2^t\log \mathbf {s} \log \mathbf {t} / 2^{t+1} R\right) = \tilde{O}\left(\frac{1}{\delta ^2} \log \mathbf {t} \sqrt {\log \mathbf {s}} \right)\) rounds. Each input graph edge is charged at most O(log s log t) times, so the total memory used is \(\tilde{O}(I)\). □
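A sketch of the halving schedule used in this proof (a hypothetical helper; it just enumerates the (number of stages, batch length) pairs described above):

```python
import math

def batch_schedule(s, delta):
    """Theorem 4.7 schedule: full batches of length R = delta*sqrt(log s)/4
    for the first (1 - delta/2) log s stages, then (delta/2^(t+1)) log s
    stages at batch length R/2^t for t = 1, 2, ..., O(log log s)."""
    log_s = math.log2(s)
    R = delta * math.sqrt(log_s) / 4
    schedule = [((1 - delta / 2) * log_s, R)]
    t = 1
    while R / 2**t >= 1:   # stop once batches shrink to a single iteration
        schedule.append(((delta / 2**(t + 1)) * log_s, R / 2**t))
        t += 1
    return schedule
```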
5 Exponentially Faster Simulation in Linear-Memory MPC
In this section, we will show that in the linear-memory MPC model the simulation from the previous section can be substantially simplified and accelerated, leading to an algorithm that runs in O(log s · log log N) rounds whp. Our main insight is that the active graph induced by a large number of initial iterations in a stage has size O(N). This implies that these initial iterations can be simulated in constant rounds in the linear-memory MPC model.
In particular, for any stage i, consider iterations 1 to log t − 3log log N. Each candidate has a probability \(\sum _{k=1}^{\log \mathbf {t} - 3\log \log N} 2^k/\mathbf {t} \le 2 \cdot 2^{\log \mathbf {t} - 3\log \log N }/\mathbf {t} \le 2/(\log ^3 N)\) of being selected. In addition to selecting candidates, each element that is not covered by the start of batch j in stage i is sampled independently in log t − 3log log N groups, each with probability \(2c\log N \cdot 2^i/\mathbf {s}\), for some constant c. Let \(A = (\mathcal {S}_A, X_A, E_A)\) be the active graph for stage i and batch j, which is the bipartite graph with node set \(\mathcal {S}_A \cup X_A\), where \(\mathcal {S}_A\) is the set of selected candidate sets, \(X_A\) is the set of sampled elements, and the edge set \(E_A\) consists of edges {S, e}, where \(S \in \mathcal {S}_A\), \(e \in X_A\), and e ∈ S.
Lemma 5.1
The active graph A for iterations 1 to log t − 3log log N of stage i has O(N/log N) edges whp.
Proof.
Each candidate will select itself in the first log t − 3log log N iterations with probability at most \(2/\log ^3 N\). Therefore, the expected number of selected candidates is \(\mathbb {E}[|\mathcal {S}_A|] \le 2N/\log ^3 N\). Since the candidates select themselves independently, we can use a Chernoff bound to say that whp, \(|\mathcal {S}_A| \le 2c N/\log ^3 N\).
Now since we are in stage i, each set has degree at most \(\mathbf {s}/2^i\) and each element samples itself independently, in log t − 3log log N groups, each with probability \(2c\log N \cdot 2^i/\mathbf {s}\). Therefore, using a Chernoff bound, we can say that whp, every candidate has at most \(O(\log N\log \mathbf {t}) = O(\log ^2 N)\) sampled neighbors. Therefore, \(|E_A| \le |\mathcal {S}_A| \cdot O(\log ^2 N) \le O(N/\log N)\) whp, and the lemma holds. □
We are now ready to prove the main result of this section.
Theorem 5.2
There is a linear-memory MPC algorithm that runs in O(log s · log log N) rounds and produces an O(log s)-approximate set cover whp. This algorithm uses a total of \(\tilde{O}(I)\) memory.
Proof.
We simulate iterations 1 to log t − 3log log N of stage i by having each machine compute the local part of the active graph A and send it to a single leader machine. Then the leader machine can simulate the log t − 3log log N iterations locally using Algorithm 2 and send to each set and element its local outcome. This requires O(1) rounds.
The remaining 3log log N iterations can be simulated one by one, and then we move on to stage i + 1. Hence stage i can be simulated in O(log log N) rounds. Therefore, the overall round complexity is O(log s · log log N).
Similar to the proof of Theorem 4.6, the use of the estimators of Lemma 4.1 only worsens the expected O(log s) approximation factor of the base Algorithm 1 by a factor that depends on ϵ, which we can make an arbitrarily small constant.
The O(log s)-approximation factor we get only holds for the expected cost of the solution returned by the algorithm. We can make the O(log s)-approximation hold whp (with a larger constant) using the standard technique of running O(log N) copies of the algorithm and returning the smallest solution. For each copy, the active graph in stage i has size O(N/log N) whp, and we assign O(log N) different leader machines that gather each active graph, simulate all iterations, and send the outcome to each set and element. Therefore, at each stage, each machine sends and receives at most O((N/log N) · log N) = O(N) messages, and we have no overhead in round complexity.
We can also simulate the remaining O(log log N) iterations for each copy in parallel by noting that a faithful simulation of Algorithm 1 only requires each set added to the cover to send one bit to each element it covers, and each uncovered element to send one bit to its neighboring sets so that they can update the \(\deg _{i,k}(\cdot)\) value for the next iteration k. Since each set and element sends only one bit, we can fit the messages of the O(log N) copies in O(1) words of local memory by having an implicit ordering over all the copies. Therefore, the local memory usage for each machine is still O(N) words.
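A sketch of this bit packing (the helper names are ours): with a fixed ordering of the copies, copy c's one-bit message occupies bit position c of a single O(log N)-bit word.

```python
def pack_bits(bits):
    """bits[c] is this node's one-bit message in copy c; all copies share one word."""
    word = 0
    for c, b in enumerate(bits):
        word |= (b & 1) << c
    return word

def unpack_bit(word, c):
    """Recover copy c's bit from the packed word."""
    return (word >> c) & 1
```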
Once each set knows which of the O(log N) set covers it belongs to, we can aggregate this information into one machine that can compute all O(log N) costs. Each set sends a bit string of size O(log N) to this machine denoting all the copies in which it belongs to the set cover solution. Therefore we have O(log N) solutions generated independently, each with expected approximation ratio O(log s). Standard Chernoff bounds show that the minimum of these solutions has an approximation ratio O(log s) whp.
Note that for the entire simulation we need only O(log N) additional machines and since I = Ω(N), the additional O(Nlog N) total memory is absorbed into the \(\tilde{O}(\cdot)\) notation. □
6 Conclusion and Future Work
This paper presents the first \(\tilde{O}(\log N)\)-round and \(\tilde{O}(\log ^{1.5} N)\)-round algorithms in the linear-memory and low-memory MPC models respectively for the classical SetCover problem. These algorithms are obtained by using the technique of sparsified graph exponentiation, which was previously used to obtain faster MPC algorithms for MIS [15] and 2-ruling sets [25].
Several natural questions arise from this work.
First, is it possible to design o(log N)-round MPC algorithms for SetCover, at least in the linear-memory MPC model? Second, some impressive progress has been made on derandomizing MPC algorithms for MIS [6] and coloring [7]; can these techniques be extended to yield fast, deterministic algorithms for SetCover? Third, in combinatorial optimization, researchers have considered many generalizations of SetCover such as partial SetCover, capacitated SetCover, multi-SetCover, etc. Designing fast MPC algorithms for these generalizations would be a natural follow-up to this work.
Footnotes
1. The Congest model [29] is a classical distributed model where the input is a graph G = (V, E), which also serves as the communication network. Nodes in the graph are processors with unique IDs from a space whose size is polynomial in |V|. In each round, each node can send an O(log |V|)-bit message to each of its neighbors.
2. The \(\tilde{O}(\cdot)\) notation hides \(\text{poly}(\log \log N)\) factors.
3. Given an arbitrary distribution of edges across machines, we can convert it to a node-centric distribution as follows: (1) for each edge {u, v}, create two ordered tuples (u, v) and (v, u), and (2) sort the list of tuples by node ID and machine ID so that the earliest tuples in the ordering appear on the machines with the smallest IDs. The result is that the edges incident on each node are now on the same machine or on two machines with adjacent IDs. It is easy to go from this intermediate distribution to a node-centric distribution by having machines with adjacent IDs coordinate with each other. Since we can do sorting in MPC in constant rounds [16], a node-centric distribution can be assumed without loss of generality.
4. In the beeping model, nodes communicate with beeps. In the specific version of the beeping model in [17], in each round a node can either listen or beep. If, in a round, a node v is listening and some subset of its neighbors beep, then v hears a beep.
References
Noga Alon, László Babai, and Alon Itai. 1986. A Fast and Simple Randomized Parallel Algorithm for the Maximal Independent Set Problem. J. Algorithms 7, 4 (dec 1986), 567–583.
Soheil Behnezhad, Sebastian Brandt, Mahsa Derakhshan, Manuela Fischer, MohammadTaghi Hajiaghayi, Richard M. Karp, and Jara Uitto. 2019. Massively Parallel Computation of Matching and MIS in Sparse Graphs. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing (Toronto ON, Canada) (PODC ’19). Association for Computing Machinery, New York, NY, USA, 481–490.
Bonnie Berger, John Rompel, and Peter W. Shor. 1994. Efficient NC Algorithms for Set Cover with Applications to Learning and Geometry. In Proceedings of the 30th IEEE Symposium on Foundations of Computer Science (Research Triangle Park, North Carolina, USA). Academic Press, Inc., USA, 454–477.
Artur Czumaj, Peter Davies, and Merav Parter. 2020. Graph Sparsification for Derandomizing Massively Parallel Computation with Low Space. Association for Computing Machinery, New York, NY, USA, 175–185.
Artur Czumaj, Peter Davies, and Merav Parter. 2021. Simple, Deterministic, Constant-Round Coloring in Congested Clique and MPC. SIAM J. Comput. 50, 5 (2021), 1603–1626.
Artur Czumaj, Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski. 2018. Round compression for parallel matching algorithms. Proceedings of the Annual ACM Symposium on Theory of Computing 14, 1 (2018), 471–484.
Laxman Dhulipala, Michael Dinitz, Jakub Łącki, and Slobodan Mitrović. 2024. Parallel Set Cover and Hypergraph Matching via Uniform Random Sampling. In 38th International Symposium on Distributed Computing (DISC 2024)(Leibniz International Proceedings in Informatics (LIPIcs), Vol. 319), Dan Alistarh (Ed.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 19:1–19:23.
Irit Dinur and David Steurer. 2014. Analytical Approach to Parallel Repetition. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing (New York, New York) (STOC ’14). Association for Computing Machinery, New York, NY, USA, 624–633.
Mohsen Ghaffari. 2017. Distributed MIS via All-to-All Communication. In Proceedings of the ACM Symposium on Principles of Distributed Computing (Washington, DC, USA) (PODC ’17). Association for Computing Machinery, New York, NY, USA, 141–149.
Mohsen Ghaffari, Themis Gouleakis, Christian Konrad, Slobodan Mitrović, and Ronitt Rubinfeld. 2018. Improved massively parallel computation algorithms for mis, matching, and vertex cover. In the Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC). Association for Computing Machinery, New York, NY, USA, 129–138.
Mohsen Ghaffari and Jara Uitto. 2019. Sparsifying Distributed Algorithms with Ramifications in Massively Parallel Computation and Centralized Local Computation. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, Timothy M. Chan (Ed.). SIAM, Philadelphia, PA 19104, 1636–1653.
Michael T. Goodrich, Nodari Sitchinava, and Qin Zhang. 2011. Sorting, Searching, and Simulation in the Mapreduce Framework. In Proceedings of the 22nd International Conference on Algorithms and Computation (Yokohama, Japan) (ISAAC’11). Springer-Verlag, Berlin, Heidelberg, 374–383.
Thorsten Götte, Christina Kolb, Christian Scheideler, and Julian Werthmann. 2021. Beep-And-Sleep: Message and Energy Efficient Set Cover. In Algorithms for Sensor Systems: 17th International Symposium on Algorithms and Experiments for Wireless Sensor Networks, ALGOSENSORS 2021, Lisbon, Portugal, September 9–10, 2021, Proceedings (Lisbon, Portugal). Springer-Verlag, Berlin, Heidelberg, 94–110.
Christoph Grunau, Slobodan Mitrovic, Ronitt Rubinfeld, and Ali Vakilian. 2019. Improved Local Computation Algorithm for Set Cover via Sparsification. CoRR abs/1910.14154 (2019). arXiv:1910.14154 https://arxiv.org/abs/1910.14154
Christoph Grunau, Slobodan Mitrović, Ronitt Rubinfeld, and Ali Vakilian. 2020. Improved Local Computation Algorithm for Set Cover via Sparsification. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics, 3600 University City Science Center Philadelphia, PA, United States, 2993–3011.
Nicholas J. A. Harvey, Christopher Liaw, and Paul Liu. 2018. Greedy and Local Ratio Algorithms in the MapReduce Model. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, SPAA 2018, Vienna, Austria, July 16-18, 2018, Christian Scheideler and Jeremy T. Fineman (Eds.). ACM, New York, NY, USA, 43–52.
James W. Hegeman, Sriram V. Pemmaraju, and Vivek Sardeshmukh. 2014. Near-Constant-Time Distributed Algorithms on a Congested Clique. In Distributed Computing - 28th International Symposium, DISC 2014, Austin, TX, USA, October 12-15, 2014. Proceedings(Lecture Notes in Computer Science, Vol. 8784), Fabian Kuhn (Ed.). Springer, Berlin, Germany, 514–530.
Howard Karloff, Siddharth Suri, and Sergei Vassilvitskii. 2010. A model of computation for mapreduce. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms. SIAM, Society for Industrial and Applied Mathematics, USA, 938–948.
Kishore Kothapalli, Shreyas Pai, and Sriram V. Pemmaraju. 2020. Sample-And-Gather: Fast Ruling Set Algorithms in the Low-Memory MPC Model. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2020, December 14-18, 2020, BITS Pilani, K K Birla Goa Campus, Goa, India (Virtual Conference)(LIPIcs, Vol. 182), Nitin Saxena and Sunil Simon (Eds.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Oktavie-Allee, 66687 Wadern, Germany, 28:1–28:18.
Merav Parter and Eylon Yogev. 2018. Congested Clique Algorithms for Graph Spanners. In 32nd International Symposium on Distributed Computing, DISC 2018, New Orleans, LA, USA, October 15-19, 2018(LIPIcs, Vol. 121), Ulrich Schmid and Josef Widder (Eds.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, USA, 40:1–40:18.
David Peleg. 2000. Distributed Computing: A Locality-Sensitive Approach. Society for Industrial and Applied Mathematics, 3600 University City Science Center Philadelphia, PA, United States.
Sridhar Rajagopalan and Vijay V. Vazirani. 1999. Primal-Dual RNC Approximation Algorithms for Set Cover and Covering Integer Programs. SIAM J. Comput. 28, 2 (feb 1999), 525–540.
Grigory Yaroslavtsev and Adithya Vadapalli. 2018. Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under ℓp Distances. In Proceedings of the 35th International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol. 80), Jennifer Dy and Andreas Krause (Eds.). PMLR, US, 5600–5609. https://proceedings.mlr.press/v80/yaroslavtsev18a.html