Prophet Inequalities with Linear Correlations and Augmentations

Published: 19 December 2023

Abstract

In a classical online decision problem, a decision-maker who is trying to maximize her value inspects a sequence of arriving items to learn their values (drawn from known distributions) and decides when to stop the process by taking the current item. The goal is to prove a “prophet inequality”: that she can do approximately as well as a prophet with foreknowledge of all the values. In this work, we investigate this problem when the values are allowed to be correlated. Since nontrivial guarantees are impossible for arbitrary correlations, we consider a natural “linear” correlation structure introduced by Bateni et al. as a generalization of the common-base value model of Chawla et al. 
A key challenge is that threshold-based algorithms, which are commonly used for prophet inequalities, no longer guarantee good performance for linear correlations. We relate this roadblock to another “augmentations” challenge that might be of independent interest: many existing prophet inequality algorithms are not robust to slight increases in the values of the arriving items. We leverage this intuition to prove bounds (matching up to constant factors) that decay gracefully with the amount of correlation of the arriving items. We extend these results to the case of selecting multiple items by designing a new (1+o(1))-approximation ratio algorithm that is robust to augmentations.

1 Introduction

In classic optimal-stopping problems, a decision-maker wishes to select one of a set \([n] = \lbrace 1,\dots ,n\rbrace\) of options whose values are distributed according to a known joint distribution. Option i materializes at time i, revealing its value \(X_i\). The decision-maker must then either select option i, receiving a value of \(X_i\), or permanently reject it and continue. Her goal is to choose an option whose value, in expectation over her selection algorithm and the randomness in the problem instance, obtains at least a \(1/\alpha\)-fraction of the expected maximum value for some approximation ratio \(\alpha \ge 1\). Such an approximation is referred to as a “prophet inequality,” as it compares the decision-maker’s performance to that of a prophet who knows the realizations of all values in advance and can always stop at the maximum. Examples of optimal-stopping problems include hiring, in which an employer interviews a sequence of candidates and must make a hiring decision on the spot, or house-buying in a sellers’ market, in which a buyer must make an offer at the open house. These optimal-stopping problems have become increasingly popular over the past 15 years, particularly because of their applications in mechanism design. For example, an \(\alpha\)-prophet inequality often implies a posted-pricing mechanism that gets a \(1/\alpha\)-fraction of the maximum welfare for a sequence of bidders arriving online (see related work in Section 1.3).
The approximability of optimal stopping problems depends heavily on the correlations of the option values. In the case where all the values are independent, a tight 2-approximation ratio was shown by Krengel and Sucheston [34]. In 1984, Samuel-Cahn [45] presented a simple median-of-maximum “threshold-based” rule with the same performance: compute \(\tau\) as the median of the distribution of the maximum value, and stop at the first \(X_i\) exceeding \(\tau\). Other threshold rules are also known to obtain a 2-approximation—for example, Kleinberg and Weinberg [33] showed this for \(\tau = \tfrac{1}{2}\mathop {\mathbb {E}}_{} \left[\max _i X_i\right]\). When the values are negatively correlated, the problem intuitively becomes even easier than the independent case: observing and rejecting a low \(X_i\) increases the chances of seeing a high \(X_{i^{\prime }}\) in the future (and vice versa, accepting a large \(X_i\) decreases the chances of having missed a high \(X_{i^{\prime }}\) in the future). For example, a simple implication of threshold-based algorithms is a 2-approximation ratio for negatively associated (a form of negative correlation) values; see Appendix B. Rinott and Samuel-Cahn [39, 41] indeed show that the value of the optimal stopping algorithm for negative correlations exceeds that of the independent case, holding the marginals fixed. However, with general and positive correlation structures, no algorithm can guarantee better than an \(\Omega (n)\)-approximation ratio (as is known from Hill and Kertz [31] and will also be a special case of our lower bounds). Therefore, the question is how to impose a structure on the correlations that models interesting applications and allows for better bounds.
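To make the threshold recipe concrete, here is a minimal Monte Carlo sketch (our own illustration, not code from any of the cited works) of the Kleinberg–Weinberg rule \(\tau = \tfrac{1}{2}\mathbb{E}[\max_i X_i]\); the uniform sampler is an arbitrary stand-in distribution, and the threshold is estimated by sampling rather than computed exactly.

```python
import random

def expected_max(sampler, trials):
    return sum(max(sampler()) for _ in range(trials)) / trials

def run_threshold(values, tau):
    # Stop at the first arrival exceeding tau; reward 0 if none does.
    return next((v for v in values if v >= tau), 0.0)

def prophet_ratio(sampler, trials=50_000):
    tau = expected_max(sampler, trials) / 2   # tau = E[max_i X_i] / 2
    alg = sum(run_threshold(sampler(), tau) for _ in range(trials)) / trials
    return alg / expected_max(sampler, trials)

# n = 20 i.i.d. uniform[0,1] arrivals; the ratio should come out above 1/2.
print(prophet_ratio(lambda: [random.random() for _ in range(20)]))
```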

1.1 Linear Correlations Model

We consider a linear correlations model in which there exists a set of m independent variables \(Y_1,\dots ,Y_m\) , with each option value \(X_i\) being a positive linear combination of some subset:
\begin{equation*} {{\bf X}}= A \cdot {{\bf Y}}{,} \end{equation*}
where A is a nonnegative matrix. The algorithm is given the matrix A and the distributions of all the \(Y_j\)s, but when \(X_i\) arrives, it only observes \(X_i\) and not the realizations of any of the \(Y_j\)s. This model was introduced in an auctions context by Bateni et al. [9], where it was inspired by the common-base value model of Chawla et al. [14]. It has two natural parameters capturing the degree of correlation of an instance. If each row of A has at most \(s_{\text{row}}\) nonzero entries (row sparsity \(s_{\text{row}}\)), then each \(X_i\) depends on at most \(s_{\text{row}}\) different \(Y_j\)s. Similarly, if each column of A has at most \(s_{\text{col}}\) nonzero entries (column sparsity \(s_{\text{col}}\)), then each \(Y_j\) is relevant to at most \(s_{\text{col}}\) different \(X_i\)s.
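As a concrete illustration (a toy instance of our own, not one from the article), the snippet below realizes \({{\bf X}} = A \cdot {{\bf Y}}\) and reads off both sparsity parameters; the exponential feature distribution is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 4-arrival, 3-feature instance: X = A @ Y with nonnegative A.
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.3, 2.0],
              [0.0, 0.0, 1.0]])

s_row = int((A > 0).sum(axis=1).max())  # each X_i depends on <= s_row features
s_col = int((A > 0).sum(axis=0).max())  # each Y_j appears in <= s_col arrivals
print(s_row, s_col)                     # prints: 2 3

Y = rng.exponential(size=3)  # independent nonnegative features, hidden online
X = A @ Y                    # the decision-maker observes only X_1, ..., X_n
```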
General Applications. Linear correlations occur in applications when each option i is defined by the degree \(A_{ij}\) to which it exhibits each feature \(j\in [m]\). The value \(\sum _jA_{ij}Y_j\) of the option is then determined by the values \(Y_j\) of its features, which are unknown to the decision-maker. Of particular interest in our setting are applications with many features, such as the hiring and house-buying examples often used to motivate prophet problems. Other relevant applications include selecting hotel rooms, restaurants, and movies. Here we elaborate on the hiring and house-buying examples, noting how they naturally exhibit column or row sparsity.1
In a hiring application, the features of a candidate might include where he received his education, his major, his work experience in each relevant industry, and aspects of his personality, among others. When the employer interviews a candidate, she learns how much she likes him, but not how to attribute her value for the candidate to particular features (every school/major/industry is a different feature). If the candidate pool is diverse, in which candidates come from many different schools/majors/industries, we might expect the instance to have a low column sparsity. Similarly, in house-buying, the features of a house might include the commute time, the square footage, and various bells-and-whistles like the existence of a patio, a hot-tub, a roof-deck, and the number of parking spaces, if any. Again, the value of a house is a linear combination of the value of its features, but when seeing a house, the buyer may only be able to access and articulate an overall valuation. If each house has a limited number of bells-and-whistles, we expect the instance to have low row sparsity.
Mechanism Design Applications: Welfare for Linearly Correlated Values. Prophet inequalities can directly imply social welfare and revenue guarantees for sequential posted-price mechanisms [13, 33]. In the simplest model, a single item is for sale to a sequence of arriving bidders with values \(X_1,\dots ,X_n\), drawn from distributions known to the seller. A threshold-\(\tau\) stopping rule immediately translates to a posted price \(\tau\). The item is purchased by the first bidder whose value satisfies \(X_i \ge \tau\). In particular, social welfare is the value of the bidder who purchases the item, so a prophet inequality directly translates to a social welfare guarantee.2 Whereas we show that threshold-based policies fail for linearly correlated values, we obtain positive results with inclusion-threshold policies. These correspond to offering a fixed posted price to a predetermined subset of buyers, while the others are automatically rejected.
For linearly correlated bidder values, our positive results immediately imply social welfare guarantees using such inclusion-posted-price mechanisms. Here, linear correlations capture some component of common values in bidder preferences. Namely, there are different features \(Y_1,\dots ,Y_m\) of the object, with bidder i placing weight \(A_{ij}\) on feature j. In the mechanism design setting, it is particularly natural to make our assumption that the decision-maker (here, the seller) is not able to access \(Y_{j}\) when value \(X_i\) arrives.
Results. We start from the observation that threshold-based algorithms cannot guarantee good approximation ratios as soon as any correlations are introduced. Therefore, we define inclusion-threshold algorithms that probabilistically include a subset of the arrivals for consideration and take the first arrival in this subset to exceed a threshold.
We first design an inclusion-threshold algorithm that obtains an \(O(s_{\text{col}})\)-approximation ratio, a guarantee that degrades gracefully as the amount of correlation increases, interpolating between the known \(O(1)\) bound for the independent case and the known \(\Theta (n)\) worst-case bound. Then, we design a more complex inclusion-threshold algorithm that obtains an \(O({s_{\text{row}}})\)-approximation ratio (i.e., another gracefully degrading guarantee). Together, these prove an \(O({\min \lbrace {s_{\text{row}}}, {s_{\text{col}}}\rbrace })\)-approximation guarantee for the linear correlations model. We then design a lower bound instance and prove that this is tight up to constants—that is, no algorithm can guarantee better than an \(\Omega ({\min \lbrace {s_{\text{row}}}, {s_{\text{col}}}\rbrace })\)-approximation.
Main Result 1 (Informal Theorem 4.1).
For the linearly correlated prophet problem, there exists an \(O({\min \lbrace {s_{\text{col}}}, {s_{\text{row}}} \rbrace })\) -approximation ratio algorithm.
Finally, we extend these results to the case of selecting a subset of up to r of the arriving options with a goal of maximizing their expected sum (also known as an r-uniform matroid constraint). It is known that for independent distributions, \(1+o(1)\) -approximation ratio prophet inequality algorithms are possible for the case of large r [3, 30]. We show a similar result for linearly correlated instances with bounded column sparsity \(s_{\text{col}}\) .
Main Result 2 (Informal Theorem 5.1).
For the linearly correlated prophet problem where we select r options, there exists a \(1+o(1)\) -approximation ratio algorithm when \(r \gg s_{\text{col}}\) , where the \(o(1)\) is \(O((\frac{s_{\text{col}}}{r})^{1/5} \text{polylog} \ {r})\) .
The case of bounded row sparsity, however, turns out to be harder: regardless of r, no algorithm can guarantee better than an \(\Omega ({s_{\text{row}}})\) -approximation ratio for unbounded \(s_{\text{col}}\) , as in the \(r=1\) case.

1.2 Techniques and the Augmented Prophets Problem

A crucial technique for our results is to introduce and solve the augmented prophets problem. The idea is to suppose we have an instance with independent random variables \(Z_1,\dots ,Z_n\) and an algorithm, say a threshold rule, guaranteeing some approximation ratio. Now suppose we “augment” the instance by sending instead the values \(X_1 := Z_1+W_1, ~ \dots , ~ X_n := Z_n + W_n\), where the \(W_i\)s are nonnegative. Does the algorithm (which does not know the \(W_i\)s) continue to guarantee its original approximation ratio (measured against the maximum \(Z_i\))? One would hope so, as each arriving option has only increased, whereas the benchmark has not. However, this turns out not to be true for the median-of-maximum threshold rule. For example, if \(Z_i \sim \text{Bernoulli}(p)\) i.i.d. for \(p \ll \tfrac{1}{n}\), the median is zero, and augmenting the first arrival to a minuscule positive value causes the (strict) median threshold rule to always take it, resulting in an arbitrarily poor approximation. Luckily, we show that the half-of-expected-maximum threshold algorithm retains its approximation guarantees, even when the \(W_i\)s are chosen by an adversary depending on the past \(X_{i^{\prime }}\)s (i.e., \(i^{\prime }\lt i\)).
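This failure mode is easy to reproduce. The following small simulation (our own, with arbitrary parameter choices) implements the Bernoulli example: a single tiny augmentation of the first arrival makes the strict median rule stop immediately, while the prophet benchmark is a factor of roughly n larger.

```python
import random

def median_rule_under_augmentation(n=100, p=1e-4, delta=1e-9, trials=200_000):
    # Z_i ~ Bernoulli(p) i.i.d. with p << 1/n, so the median of max_i Z_i is 0.
    tau = 0.0
    alg, benchmark = 0.0, 0.0
    for _ in range(trials):
        Z = [1.0 if random.random() < p else 0.0 for _ in range(n)]
        X = Z[:]
        X[0] += delta                                # augment only the first arrival
        alg += next((x for x in X if x > tau), 0.0)  # strict median stopping rule
        benchmark += max(Z)
    return alg / trials, benchmark / trials

print(median_rule_under_augmentation())  # ALG ~ p + delta, benchmark ~ n * p
```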
Armed with this “augmentation lemma,” we use subsampling to obtain inclusion sets of arrivals \(\lbrace X_i\rbrace\) with significant independent portions \(\lbrace Z_i\rbrace\) . In the bounded \(s_{\text{col}}\) case, direct subsampling of arrivals succeeds. The case where \(s_{\text{row}}\) is bounded but \(Y_j\) can appear in any number of arrivals is more challenging. We show it suffices to obtain a contention-resolution style subsampling of the arrivals such that each \(Y_j\) is well represented, but only with its maximum coefficient \(A_{ij}\) . We then use a graph-theoretic argument to construct such a scheme.
The augmented prophets problem is also our key technique for the linearly correlated prophets problem with an r-uniform matroid constraint. In this case, however, we notice that none of the existing \(1+o(1)\) algorithms are robust to augmentations. Hence, we design a new \(1+o(1)\) algorithm and prove its robustness using a much more sophisticated analysis involving a sequence of thresholds and “buckets” with different cardinality constraints. By combining this augmentation result with random partitioning of the input, we obtain the \(1+o(1)\) -approximation for the r-uniform matroid problem with fixed \(s_{\text{col}}\) .
Main Result 3 (Informal Lemmas 3.2 and 3.4).
For the augmented prophets problem, there exists a 2-approximation ratio algorithm when selecting a single option and a \(1+o(1)\) -approximation ratio algorithm when selecting \(r\gg 1\) options, where the \(o(1)\) is \(O(\frac{\text {polylog}\ r}{r^{1/4}})\) .
One can also view the augmented prophets problem as capturing correlations induced by a mischievous wish-granting genie who awards bonuses \(W_i \ge 0\) at each step but tries to choose them so as to worsen the algorithm’s performance. We think this problem is of independent interest and will find further applications in designing robust prophet inequality algorithms.

1.3 Related Work

The past decade has seen significant interest in prophet inequalities motivated by their applications in mechanism design. Many works focus on multiple-choice prophet inequality problems. This includes prophet inequalities for uniform matroids (e.g., [3, 30]), for general matroids (e.g., [13, 29, 33, 36, 47]), for matchings and combinatorial auctions (e.g., [4, 22, 25, 28]), and for arbitrary packing constraints (e.g., [42, 43]). There has also been a lot of work on variants of prophet inequalities: the prophet secretary problem where the values arrive in a random order [1, 7, 16, 18, 25, 26, 36], and the limited information setting where we only have sample access to distributions [6, 17, 44]. All these works assume mutually independent values, whereas capturing correlations and designing robust algorithms is the main challenge in our work.
Rinott and Samuel-Cahn [39, 40, 41] study correlated prophet inequalities. However, their techniques are not applicable to our work because their positive results hold only for negatively correlated values. Furthermore, their benchmark is the expected maximum of independent values having the same marginal distributions. This benchmark could be a factor n larger than the expected maximum for positively correlated values.
Our approach via the augmented prophets problem is also related to the line of work on designing robust stochastic optimization algorithms. Since algorithms that assume known input distributions tend to over-fit, here the goal is to design algorithms that are robust to adversarial noise (see [10, 12, 19, 20, 21, 27, 35, 37, 38] and references therein). Our single-item and multiple-items augmentation algorithms can be seen as robust prophet inequality algorithms that retain their guarantees even when the input distributions are augmented by an adversary. Another relevant reference is that of Dütting and Kesselheim [23], which gives prophet inequalities assuming only probability distributions that are \(\epsilon\) -close (in some metric) to the true distributions. Their technical results, however, are not useful here because augmented distributions can be very far from the original distributions.
Subsequent Related Work. Since the initial version of this article appeared in the conference EC 2020, several new related papers have appeared. For combinatorial auctions with subadditive agents, we now know an \(O(\log \!\log m)\) -approximation ratio algorithm using posted prices [24] and an \(O(1)\) -approximation ratio general online algorithm [15]. Our multiple-items augmented prophets problem was extended to online packing integer programs; in particular, we know an \(O(1)\) -approximation ratio algorithm when the budget is large [5]. An interesting Markov random field correlations model for prophet inequality was recently studied by Cai and Oikonomou [11].

2 Model and Fixed-threshold Algorithms

2.1 Model and Notation

In the linear correlations model, there are n random variables \(X_1, X_2, \ldots , X_n\) that linearly depend on m independent nonnegative random variables (sometimes called features) \(Y_1, Y_2, \ldots , Y_m\) as
\begin{equation*} {{\bf X}}= A \cdot {{\bf Y}}, \end{equation*}
where matrix A only contains non-negative entries. Let \(s_{\text{row}}\) denote the row sparsity of A (maximum number of nonzero entries in any row) and \(s_{\text{col}}\) denote the column sparsity of A (maximum number of nonzero entries in any column).
An online algorithm is initially given A and the distributions of \(Y_1,\dots ,Y_m\) . Then, it observes the realizations of \(X_1,\dots ,X_n\) one at a time. After observing \(X_i\) , the algorithm decides either to stop and take the reward \(X_i\) , ending the process, or to reject \(X_i\) and continue to \(X_{i+1}\) . Given such an algorithm \(\text{ALG}\) , we abuse notation by writing \(\text{ALG}\) for the reward of the algorithm, a random variable. The algorithm has an approximation ratio of \(\alpha\) for some \(\alpha (n,s_{\text{row}},s_{\text{col}})\) if for all \(n,s_{\text{row}},s_{\text{col}}\) and all instances of the problem with these parameters,
\begin{equation*} {\mathbb {E}}[\text{ALG}] \ge \frac{1}{\alpha (n, s_{\text{row}}, s_{\text{col}})} \cdot {\mathbb {E}}[\max _i X_i] . \end{equation*}
Such a guarantee is often called a prophet inequality because it compares the algorithm to a “prophet” that can predict the realizations of all \(X_i\) in advance and take \(\max _i \lbrace X_i \rbrace\) every time.
We use the notation \((\cdot)^+\) to mean \(\max \lbrace \cdot , 0\rbrace\) .
Our examples frequently use random variables that are either zero or some fixed positive value. We say the variable is active if it takes its positive value. We sometimes say that \(X_i\) “includes” \(Y_j\) if \(A_{ij} \gt 0\) .

2.2 Fixed-Threshold Algorithms

A fixed-threshold algorithm selects a single threshold \(\tau\) and takes the first arrival \(X_i\) that exceeds \(\tau\) . We refer to such an algorithm as \(\text{ALG}_{\tau }\) . Fixed-threshold algorithms have been very successful in prophet inequality design. However, our first result shows their severe limitation for even mildly correlated prophet inequalities.
Lemma 2.1.
In the linear correlations model, even for \(s_{\text{row}}=s_{\text{col}}=2\) , there exist instances where every fixed-threshold algorithm \(\text{ALG}_{\tau }\) has an approximation ratio at least \(\Omega ({n})\) (even when \(\tau\) is a random threshold).
The full proof is deferred to Appendix D, but the instance is important and is described next. Intuitively, the problem is that an arrival \(X_i\) just crossing the threshold may be correlated with some later \(X_{i^{\prime }}\) being very large. Taking \(X_i\) prevents the algorithm from ever obtaining the gains from \(X_{i^{\prime }}\) . Our proof uses the following “tower” variables, which will also be useful later.
Definition 2.2 (Tower Y Variables)
Given \(\epsilon \gt 0\) , define the tower \({{\bf Y}}\) variables as \(Y_i = \frac{1}{\epsilon ^i}\) with probability \(\epsilon ^i\) and \(Y_i = 0\) otherwise for each \(i \in \lbrace 1,\dots ,m\rbrace\) .
Example 2.3 (\(s_{\text{row}}= s_{\text{col}}= 2\) Tower Instance).
Take the tower \({{\bf Y}}\) variables. Let A be an \(n \times n\) matrix with entry \(A_{i,i} = 1\) for all i and \(A_{i,i+1} = \epsilon\) for \(i \in [1,n-1]\) . All other entries are 0. Visually,
\begin{align*} X_1 = Y_1 + \epsilon Y_2, \qquad X_2 = Y_2 + \epsilon Y_3 , \qquad \ldots , \qquad X_n = Y_n . \end{align*}
We have \(s_{\text{col}}= s_{\text{row}}=2\) . The point is that if \(X_i\) is nonzero, then almost certainly \(X_i = \frac{1}{\epsilon ^i}\) . But in this case, a threshold algorithm cannot distinguish between the more likely case that \(Y_i = \frac{1}{\epsilon ^i}\) , in which case it should stop and take \(X_i\) , and the unlikely case that \(Y_i = 0\) and \(Y_{i+1} = \frac{1}{\epsilon ^{i+1}}\) , in which case it should wait and take \(X_{i+1}\) .
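The instance is easy to generate; the sketch below (our own code) samples the tower variables and the bidiagonal matrix of this example. A nonzero \(X_i \approx \frac{1}{\epsilon^i}\) looks essentially the same whether it comes from \(Y_i\) (stop) or from \(\epsilon Y_{i+1}\) (wait), which is what defeats any fixed threshold.

```python
import random

def tower_instance(n, eps):
    # Tower variables: Y_i = eps**(-i) with probability eps**i, else 0 (Definition 2.2).
    Y = [eps ** (-i) if random.random() < eps ** i else 0.0 for i in range(1, n + 1)]
    # Bidiagonal A: X_i = Y_i + eps * Y_{i+1}, and X_n = Y_n, so s_row = s_col = 2.
    X = [Y[i] + (eps * Y[i + 1] if i + 1 < n else 0.0) for i in range(n)]
    return X, Y

X, Y = tower_instance(n=8, eps=0.01)
```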
Indeed, these coefficients of matrix A play an important role, and in Appendix A we show that when entries of A are restricted to being only 0 or 1 (i.e., the unweighted setting), there exists a constant-factor approximation fixed-threshold algorithm. Roughly this happens because each \(Y_j\) has “limited influence” on any \(X_i\) : either \(Y_j\) appears with coefficient 0 and has no influence on \(X_i\) , or it appears with coefficient 1 and has the same influence on every such \(X_i\) .
Theorem 2.4.
The unweighted linear correlations problem has a fixed-threshold constant-factor approximation algorithm.
This raises the question of whether any policy, let alone a simple one, can achieve a better approximation for a general matrix A. Our positive results will show that relatively simple inclusion-threshold algorithms can achieve tight prophet inequalities.
Definition 2.5.
An inclusion-threshold algorithm selects a subset \(S \subseteq \lbrace 1,\dots ,n\rbrace\) and threshold \(\tau\) , possibly both at random, and selects the first \(X_i\) such that \(i \in S\) and \(X_i \ge \tau\) . In other words, it commits to a subset S of arrivals and applies a threshold policy to those \(X_i\) , ignoring the others.
For some intuition on the power of inclusion-threshold algorithms, notice that in Example 2.3 if set S contains every other \(X_i\) (say all odd i or all even i, with probability \(1/2\) each), then \(\lbrace X_i\rbrace _{i \in S}\) will be independent. Hence, we could just apply the analysis of independent prophet inequality for \(\lbrace X_i\rbrace _{i \in S}\) to obtain a threshold that achieves an \(O(1)\) -approximation ratio.

3 Our Approach of Handling Correlations Via Augmentations

In the analysis of prophet inequalities, the problem is to upper-bound the expected maximum of the variables \(X_i\) by a quantity that one’s algorithm can approximate. We first discuss why the following standard approach does not work: an important approach for prophet inequalities is to use that for any threshold \(\tau\),
\begin{align} {\mathbb {E}}[\max _i X_i] \quad \le \quad \tau + {\mathbb {E}}[\max _i \left(X_i - \tau \right)^+ ] \quad \le \quad \tau + \sum _i {\mathbb {E}}[ \left(X_i - \tau \right)^+ ] . \end{align}
(1)
When all the arrivals \(X_i\) are independent, it is known that one can always select \(\tau\) such that the left and right sides of (1) differ by at most a constant factor—that is, \(\frac{e}{e-1} \approx 1.6\) (this is related to the correlation gap [2]). In fact, the prototypical prophets analysis shows that setting some threshold \(\tau\) allows \(\text{ALG}_{\tau }\) to approximate the right-hand side up to a constant factor. However, when \(\lbrace X_i\rbrace\) are correlated, this could be a very loose upper bound, even when \(s_{\text{row}}=1\). For example, consider \(X_1 = \cdots = X_n = Y_1 \sim \text{Bernoulli}(p)\). The left side equals p, while for any \(\tau \in [0,1]\) the right side equals \(\tau + np(1-\tau) = np + \tau (1-np) \ge np\) for \(p \lt \frac{1}{n}\). So the right side can be a factor n larger than the left, and we cannot hope to approximate the right side with any algorithm.
Hence, we take a totally different approach. The first key idea is to use inclusion-threshold algorithms. To see why, consider a first attempt: discard certain \(X_i\) such that we are only left with a subset that are all independent of each other. Now a standard prophet algorithm that only considers these \(X_i\) would obtain a constant factor of the maximum in this subset. One could then hope to argue that this subset’s maximum approximates the original maximum up to a factor depending on the amount of correlation. Indeed, one can show that this approach succeeds on the tower instance in Example 2.3 with \(s_{\text{row}}= s_{\text{col}}= 2\), for example, by including every other \(X_i\). But in general, this approach cannot give tight bounds. This is because each \(X_i\) contains up to \(s_{\text{row}}\) variables, each of which can appear in up to \(s_{\text{col}}-1\) other \(X_{i^{\prime }}\), so including \(X_i\) requires eliminating \(\approx s_{\text{row}}s_{\text{col}}\) other variables. Our goal is to achieve approximations to within \(\min \lbrace s_{\text{row}},s_{\text{col}}\rbrace\) factors even if \(\max \lbrace s_{\text{row}},s_{\text{col}}\rbrace = n\).
So in addition to including only a subset of \(X_i\) , we will use a second key idea: decompose each variable as \(X_i = Z_i + W_i\) , where the \(Z_i\) s satisfy independence requirements and \(W_i\) s are viewed as “bonus” augmentations. We will show that \({\mathbb {E}}[ \max _i Z_i]\) is an approximation to \({\mathbb {E}}[ \max _i X_i]\) . Then we will compete with \({\mathbb {E}}[ \max _i Z_i]\) . However, the augmentations add an additional challenge, requiring us to solve the following problem.
Definition 3.1 (Single-Item Augmented Prophets Problem).
The algorithm is given the distributions of a set of independent nonnegative random variables \(Z_1,\dots ,Z_n\). Then, it observes one at a time the realizations of \(X_i = Z_i + W_i\) for \(i \in [1,n]\), where each \(W_i\) is nonnegative and each \(Z_i\) is independent of \(X_1,\dots ,X_{i-1}\) (but \(W_i\) could depend on \(X_1, \ldots , X_{i}\)). The algorithm chooses at each step whether to continue or stop and obtain value \(X_i\). It must compete with \({\mathbb {E}}[\max _i Z_i]\).
One can view the augmented prophets problem as capturing correlations induced by a mischievous genie who awards bonuses \(W_i \ge 0\) at each step so as to worsen the algorithm’s performance. We note that the genie cannot base her choices on the future—that is, \(W_i\) is a random variable that may be correlated with \(Z_{i^{\prime }}\) if \(i^{\prime } \le i\) but not if \(i^{\prime } \gt i\) .
Intuitively, it might seem like algorithms for prophets problems should continue to perform well, as the genie can only increase the rewards at each step. However, this is not true for the classical median stopping rule—that is, for \(\tau =\) the median of \(\max _i Z_i\) (see an example in Section 1.2). Luckily, a different threshold rule is robust.
Lemma 3.2 (Single-Item Augmentation Lemma).
For the augmented prophets problem, a fixed-threshold algorithm with \(\tau = \frac{1}{2} {\mathbb {E}}[\max _i Z_i]\) guarantees \(\frac{{\mathbb {E}}[\text{ALG}_{\tau }]}{{\mathbb {E}}[\max _i Z_i]} \ge \frac{1}{2}\) .
Proof.
We “augment” a standard prophet inequality proof. Let \(P = \Pr [ \max _i X_i \ge \tau ]\) . Now, conditioning on the time the algorithm picks an element,
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] &= P\cdot \tau + \sum _i \Pr [ X_{i^{\prime }} \lt \tau ~ (\forall i^{\prime } \lt i)] \cdot {\mathbb {E}}\big [ (X_i-\tau)^+ \mid X_{i^{\prime }} \lt \tau ~ (\forall i^{\prime } \lt i) \big ]\\ &\ge P\cdot \tau + \sum _i (1-P) \cdot {\mathbb {E}}\big [ (X_i-\tau)^+ \mid X_{i^{\prime }} \lt \tau ~ (\forall i^{\prime } \lt i) \big ], \end{align*}
where the inequality uses that \(\text{ALG}_{\tau }\) selects no element with probability \(1-P\) . Nonnegativity of \(W_i\) implies \(X_i \ge Z_i\) , so
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] &\ge P\cdot \tau + \sum _i (1-P) \cdot {\mathbb {E}}\big [ (Z_i-\tau)^+ \mid X_{i^{\prime }} \lt \tau ~ (\forall i^{\prime } \lt i) \big ] \\ & = P\cdot \tau + (1-P) \cdot {\mathbb {E}}\Big [ \sum _i (Z_i-\tau)^+ \Big ] \end{align*}
since \(Z_i\) is independent of the event \(\lbrace X_{i^{\prime }} \lt \tau ~ (\forall i^{\prime } \lt i) \rbrace\) . Since \(\sum _i (Z_i - \tau)^+ \ge \max _i(Z_i - \tau)^+\) ,
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] &\ge P\cdot \tau + (1-P) \cdot {\mathbb {E}}\big [\max _i (Z_i - \tau)^+\big ] \\ &\ge P\cdot \tau + (1-P)\cdot {\mathbb {E}}\big [ \max _i Z_i - \tau \big ] \quad = \quad P\cdot \tau + (1-P) \tau \quad = \quad \tau . \end{align*}
Using \(\tau = \frac{1}{2} {\mathbb {E}}[\max _i Z_i]\) , this proves that \({\mathbb {E}}[\text{ALG}_{\tau } ] \ge \frac{1}{2} {\mathbb {E}}[\max _i Z_i]\) , as claimed. □
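As a sanity check (our own simulation, not part of the proof), the sketch below pits the half-of-expected-maximum rule against one particular genie, a hypothetical adversary that pushes every arrival just over the threshold; the estimated ratio stays at or above 1/2, as Lemma 3.2 guarantees.

```python
import random

def augmented_ratio(Z_sampler, genie, trials=50_000):
    # tau = E[max_i Z_i] / 2; the genie adds a nonnegative bonus W_i that may
    # depend on everything observed so far, but not on the future.
    e_max = sum(max(Z_sampler()) for _ in range(trials)) / trials
    tau = e_max / 2
    total = 0.0
    for _ in range(trials):
        history = []
        for i, z in enumerate(Z_sampler()):
            x = z + genie(i, z, history, tau)   # X_i = Z_i + W_i with W_i >= 0
            history.append(x)
            if x >= tau:                        # fixed-threshold stopping rule
                total += x
                break
    return (total / trials) / e_max             # Lemma 3.2: always >= 1/2

bump = lambda i, z, history, tau: max(tau - z, 0.0)  # push just over the threshold
print(augmented_ratio(lambda: [random.random() for _ in range(20)], bump))
```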
Multiple Items. The key idea in proving our \(1+o(1)\)-approximation ratio result for selecting multiple items with bounded \(s_{\text{col}}\) is to extend the augmentation lemma to cardinality constraints.
Definition 3.3 (Multiple-Items Augmented Prophets Problem).
In the augmented prophets problem with cardinality constraint r, the algorithm is given the distributions of a set of independent nonnegative random variables \(Z_1,\dots ,Z_n\). Then, it observes one at a time the realizations of \(X_i = Z_i + W_i\) for \(i\in [1,n]\), where each \(W_i\) is nonnegative and each \(Z_i\) is independent of \(X_1,\dots ,X_{i-1}\). The algorithm chooses at each step to reject or accept \(X_i\), subject to taking at most r variables in total. It must compete with \({\mathbb {E}}[\sum _{i=1}^{r} Z^{(i)}]\), where \(Z^{(i)}\) is the ith-largest of \(\lbrace Z_1,\dots ,Z_n\rbrace\).
Since none of the prior \(1+o(1)\) -approximation ratio algorithms for multiple items is robust to augmentations, in Section 5.3 we design a new algorithm to prove the following multiple-items augmentation lemma.
Lemma 3.4 (Multiple-Items Augmentation Lemma).
There is an algorithm for the augmented prophets problem with cardinality constraint r achieving a \((1 + O(\frac{(\log r)^{3/2}}{r^{1/4}}))\) -approximation ratio.
Next, we utilize the single-item augmentation lemma, along with a careful decomposition of \(\lbrace X_i\rbrace\) into \(\lbrace Z_i + W_i\rbrace\) and the power of inclusion-threshold algorithms, to separately attack the single-item problem for the cases of bounded \(s_{\text{row}}\) and \(s_{\text{col}}\) .

4 Selecting A Single Item

In this section, we prove our main theorem. Later, we will also address cases where the algorithm can take multiple items.
Theorem 4.1.
There exists an inclusion-threshold algorithm for the linearly correlated prophet problem with approximation ratio \(O({\min \lbrace {s_{\text{col}}}, {s_{\text{row}}} \rbrace })\) .
We will first show in Proposition 4.2 that an inclusion-threshold algorithm guarantees \(O({s_{\text{col}}})\) , and then in Proposition 4.4 that an inclusion-threshold algorithm achieves \(O({s_{\text{row}}})\) . The algorithm that runs one of these according to which of \(s_{\text{col}},s_{\text{row}}\) is smaller is an inclusion-threshold algorithm achieving the claimed performance.
We will see that bounded column sparsity is the easier case, requiring a simpler algorithm and analysis. For the case of bounded row sparsity, we will need much more careful reasoning about dependencies and correlations between \(Y_j\) . This difficulty will also manifest quantitatively when we move to the cardinality-constraint setting in Section 5, where better bounds will be achievable only in the case of bounded column sparsity.

4.1 Bounded Column Sparsity

Recall that column sparsity \(s_{\text{col}}\) is the maximum number of times a given feature \(Y_j\) may appear with nonzero coefficient. We now give a relatively straightforward algorithm for obtaining an \(\Omega (\frac{1}{s_{\text{col}}})\) fraction of \({\mathbb {E}}[ \max _i X_i]\). The idea is similar to the “first attempt” described in Section 3, using the single-item augmentation lemma (Lemma 3.2) to overcome the challenges discussed there.
Proposition 4.2.
There exists an inclusion-threshold algorithm for the linearly correlated prophet problem with approximation ratio \({2e \cdot s_{\text{col}}}\) .
Proof.
Choose \(S \subseteq [n]\) by including each \(i \in [n]\) independently with probability \(\frac{1}{s_{\text{col}}}\) . This gives the inclusion subset; now we define the threshold \(\tau\) . Assign each \(Y_j\) to the first surviving \(X_i\) that includes it—that is, for each \(i \in S\) , construct a set \(T_i := \lbrace j : A_{ij} \gt 0 \text{ and } A_{i^{\prime }j} = 0 ~ (\forall i^{\prime } \in S \text{ where } i^{\prime } \lt i)\rbrace\) . Let \(Z_i = \sum _{j \in T_i} A_{ij} Y_j\) and set \(\tau = \frac{1}{2} {\mathbb {E}}[\max \lbrace Z_1, \ldots , Z_n \rbrace ]\) . If \(i \not\in S\) , then \(T_i = \emptyset\) and \(Z_i = 0\) .
By definition of an inclusion-threshold algorithm (Definition 2.5), we automatically reject any \(X_i\) such that \(i \not\in S\) , and we select the first arriving \(X_i\) such that \(i \in S\) and \(X_i \ge \tau\) .
Now, by construction, we can write \(X_i = Z_i + W_i,\) where each \(Z_i\) contains only variables \(Y_j\) not appearing in any prior \(X_{i^{\prime }}\) for \(i^{\prime } \in S\) and \(i^{\prime } \lt i\) . So, \(Z_i\) is independent of the preceding \(X_{i^{\prime }}\) under consideration. Hence, by the single-item augmentation lemma (Lemma 3.2), \({\mathbb {E}}[\text{ALG}] \ge \frac{1}{2} {\mathbb {E}}[ \max _i Z_i ].\)
Next, we show that \({\mathbb {E}}[ \max _i Z_i ]\) is comparable to \({\mathbb {E}}[ \max _i X_i ]\) .
Claim 4.3.
\({\mathbb {E}}[ \max \lbrace Z_1,\ldots ,Z_n\rbrace ] \ge \frac{1}{e \cdot s_{\text{col}}} {\mathbb {E}}[ \max \lbrace X_1,\ldots ,X_n\rbrace ]\), where the expectation is over the construction of S as well as \(Y_1,\dots ,Y_m\). □
Proof of Claim 4.3
We prove that for every fixed realization of \(Y_1,\ldots ,Y_m\), the inequality holds in expectation over S. Let \(X_{i^*} = \max _i X_i\). For each j with \(A_{i^*j} \gt 0\), we claim \(\Pr [j \in T_{i^*}] \ge \frac{1}{e \cdot s_{\text{col}}}\). This is because \(X_{i^*}\) survives with probability \(\frac{1}{s_{\text{col}}}\) and, independently, all the other variables \(X_{i^{\prime }}\) with \(A_{i^{\prime }j} \gt 0\) (there are at most \(s_{\text{col}}-1\) of them) fail to survive with probability at least3 \((1 - \frac{1}{s_{\text{col}}})^{s_{\text{col}}-1} \ge \frac{1}{e}\). (If \(s_{\text{col}}=1\), then this probability is 1.) So, taking the expectation only over the construction of S,
\begin{align*} {\mathbb {E}}_S [Z_{i^*} ] \quad = \quad \sum _j \Pr [j \in T_{i^*}] A_{i^* j} Y_j \quad \ge \quad \frac{1}{e \cdot s_{\text{col}}} \sum _j A_{i^* j} Y_j \quad = \quad \frac{1}{e \cdot s_{\text{col}}} X_{i^*} . \end{align*}
Thus \({\mathbb {E}}_S[ Z_{i^*}] \ge \frac{1}{e\cdot s_{\text{col}}}\max _i X_i\), and hence \({\mathbb {E}}_S[ \max _i Z_i ] \ge \frac{1}{e\cdot s_{\text{col}}} \max _i X_i\). This holds for each fixed realization of \(Y_1,\dots ,Y_m\), so it holds in expectation. □
Claim 4.3 completes the proof of Proposition 4.2, as we have
\begin{equation*} {\mathbb {E}}[\text{ALG}] \quad \ge \quad \frac{1}{2} {\mathbb {E}}[\max _i Z_i ] \quad \ge \quad \frac{1}{2 e \cdot s_{\text{col}}} {\mathbb {E}}[ \max _i X_i ]. □ \end{equation*}
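The entire algorithm is short in code. Here is a sketch (our own rendering of the construction in this proof, with a Monte Carlo estimate standing in for the exact computation of \(\tau\)):

```python
import numpy as np

def col_sparse_inclusion_threshold(A, Y_sampler, rng, trials=20_000):
    # Sketch of the Proposition 4.2 algorithm.
    n, m = A.shape
    s_col = int((A > 0).sum(axis=0).max())
    S = rng.random(n) < 1.0 / s_col          # include each arrival w.p. 1/s_col
    # T_i: features of X_i not claimed by any earlier included arrival.
    owned = np.zeros(m, dtype=bool)
    Zmask = np.zeros((n, m), dtype=bool)
    for i in range(n):
        if S[i]:
            Zmask[i] = (A[i] > 0) & ~owned
            owned |= A[i] > 0
    # tau = E[max_i Z_i] / 2, estimated by sampling the features.
    tau = np.mean([np.max((A * Zmask) @ Y_sampler(), initial=0.0)
                   for _ in range(trials)]) / 2
    # Online phase on a fresh realization: first included arrival above tau.
    X = A @ Y_sampler()
    for i in range(n):
        if S[i] and X[i] >= tau:
            return X[i]
    return 0.0
```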

4.2 Bounded Row Sparsity

Recall that row sparsity \(s_{\text{row}}\) implies that each \(X_i\) only depends on at most \(s_{\text{row}}\) different features \(Y_j\) ; however, a given \(Y_j\) may appear in arbitrarily many \(X_i\) s. In this section, for notational convenience, we assume without loss of generality that \(\max _i \lbrace A_{ij} \rbrace = 1\) for all j. (If this is not the case, one can renormalize each column and redefine a scaled version of \(Y_j\) .) We prove the following.
Proposition 4.4.
There exists an inclusion-threshold algorithm for the linearly correlated prophets problem achieving approximation ratio \({2 e^3 \cdot s_{\text{row}}}\) .
This case requires more care. The approach of randomly excluding arrivals, which worked for bounded column sparsity, has no direct analog here. Moreover, an important observation is that the \(X_i\) cannot be treated identically in a manner oblivious to the structure of A. For every “important” row that ought to be included, there can be many unimportant rows. Indeed, we can take any instance and prepend it with arbitrarily many variables of the form \(X_i = Y_1\) without changing the row sparsity \(s_{\text{row}}\). An oblivious inclusion-threshold algorithm would essentially keep only variables from this prefix, ignoring the actual instance.
Before the formal proof of Proposition 4.4, we develop a tool to address this challenge. The key idea is to design an inclusion scheme that, for any instance structure, allows each \(Y_j\) to be both represented and “independent” with a reasonably high probability. Here, independence refers to not sharing an \(X_i\) with any other included \(Y_{j^{\prime }}\) . Inspired by contention-resolution schemes, which have a similar flavor, we define a representative construction of a subset of the \(Y_j\) and corresponding \(X_i\) with \(A_{ij}=1\) , where \(X_i\) is matched to \(Y_{j(i)}\) .
Definition 4.5.
Consider a randomized selection of \(S \subseteq \lbrace 1,\dots ,n\rbrace\) and \(T \subseteq \lbrace 1,\dots ,m\rbrace\) of equal size with a perfect matching \(j(i)\) satisfying \(A_{ij(i)} = 1\) . Call this construction \(\alpha\) -representative or \(\alpha\) -rep. if
(1) for all \(j \in \lbrace 1,\dots ,m\rbrace\), we have \(\Pr [j \in T] \ge \alpha\); and
(2) for all \(i,i^{\prime } \in S\) with \(i\not= i^{\prime }\), we have \(A_{i^{\prime }j(i)} = 0\).
Note that we cannot hope for better than an \(O(\frac{1}{s_{\text{row}}})\) -rep. construction, as any inclusion of some \(Y_j\) can rule out \(s_{\text{row}}-1\) other features. This raises the question of whether one can achieve \(\Omega (\frac{1}{s_{\text{row}}})\) -rep.
Lemma 4.6.
For any linearly correlated instance, there exists a \(\frac{1}{e^2 \cdot s_{\text{row}}}\)-rep. construction.
Proof.
For each \(Y_j\), define its primary \(i(j)\) by picking any i such that \(A_{ij} = 1\) (by our renormalization assumption, there is at least one). Consider a directed graph G whose nodes are \(\lbrace 1,\dots ,m\rbrace\), representing the independent variables \(Y_j\). There is a directed edge \((j,j^{\prime })\) if \(A_{i(j)j^{\prime }} \gt 0\)—that is, j points to all other \(j^{\prime }\) that are included in its primary variable \(X_{i(j)}\). We note that both edges \((j,j^{\prime })\) and \((j^{\prime },j)\) might be present.
The key property is that each vertex in G has out-degree \(\le s_{\text{row}}-1\) , as \(Y_j\) has only one primary \(X_{i(j)}\) and at most \(s_{\text{row}}-1\) other variables \(j^{\prime }\) have \(A_{i(j)j^{\prime }} \gt 0\) . Because average in-degree equals average out-degree, this implies there exists a vertex with in-degree at most \(s_{\text{row}}-1\) . Applying this argument recursively, we get the following claim.
Claim 4.7.
There exists an order \(\pi\) of the vertices of G such that for every j, the induced subgraph on \(\pi (1), \ldots , \pi (j)\) satisfies that the in-degree of \(\pi (j)\) is at most \(s_{\text{row}}-1\) . □
Proof.
As shown, there exists some j with in-degree at most \(s_{\text{row}}-1\) . Set \(\pi (m) = j\) . Now delete j from the graph, including all edges to and from j. In this graph, again all out-degrees are at most \(s_{\text{row}}-1\) , so we can recursively construct \(\pi (m-1),\dots ,\pi (1)\) . □
Consider all the \(Y_j\) variables in the order \(\pi\) given by Claim 4.7. Initialize \(S,T = \emptyset\). On considering j, if j does not have an edge with any vertex \(j^{\prime } \in T\) (neither incoming nor outgoing), then independently with probability \(\frac{1}{s_{\text{row}}}\), add j to T and add \(i(j)\) (its primary variable) to S. With the remaining probability, ignore j and continue. We show that this randomized construction of \(S,T\) satisfies the two properties of a \(\frac{1}{e^2 \cdot s_{\text{row}}}\)-rep. construction.
For Property 1, note each j has at most \(2(s_{\text{row}}-1)\) total edges (both incoming and outgoing) to nodes \(j^{\prime }\) appearing prior to j in the permutation: j has at most \(s_{\text{row}}-1\) outgoing edges in total and, by construction of \(\pi\) , has at most \(s_{\text{row}}-1\) incoming edges from nodes prior to j in \(\pi\) . So when we reach j in the permutation, we consider it with probability at least the probability that all these \(2(s_{\text{row}}-1)\) neighbors have been rejected, which is at least \((1 - \frac{1}{s_{\text{row}}})^{2(s_{\text{row}}-1)} \ge \frac{1}{e^2}\) , and then we include it with probability \(\frac{1}{s_{\text{row}}}\) . This shows that each j is included with probability at least \(\frac{1}{e^2 \cdot s_{\text{row}}}\) .
Next, for Property 2, consider any \(i,i^{\prime } \in S\) with respective partners \(j,j^{\prime } \in T\) . We must show \(A_{ij^{\prime }} = 0\) . Note that by construction, \(i = i(j)\) and \(i^{\prime } = i(j^{\prime })\) (i.e., they are the primary variables for \(j,j^{\prime }\) ). Either j was selected into T before or after \(j^{\prime }\) . In either case, the second variable could only be selected if edge \((j,j^{\prime })\) did not exist in the graph, which implies \(A_{ij^{\prime }} = 0\) .
This completes the proof of Lemma 4.6. □
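The construction is also short in code; the following sketch (our own rendering of the proof above, assuming each column of A has been renormalized so that its maximum entry is 1) returns the matched sets S and T.

```python
import numpy as np

def representative_construction(A, rng):
    # Sketch of the Lemma 4.6 construction.
    n, m = A.shape
    s_row = int((A > 0).sum(axis=1).max())
    primary = [int(np.argmax(A[:, j] == 1.0)) for j in range(m)]   # i(j)
    # Directed graph on features: j -> j' iff Y_j' appears in X_{i(j)}.
    out = [set(map(int, np.nonzero(A[primary[j]] > 0)[0])) - {j} for j in range(m)]
    # Build the order pi of Claim 4.7 by repeatedly peeling a min in-degree vertex.
    alive, order = set(range(m)), []
    while alive:
        indeg = {j: sum(j in out[k] for k in alive if k != j) for j in alive}
        j = min(alive, key=indeg.get)   # in-degree <= s_row - 1 in the subgraph
        order.append(j)
        alive.remove(j)
    order.reverse()                     # peeled last = considered first
    S, T = [], []
    for j in order:                     # consider features in the order pi
        blocked = any(t in out[j] or j in out[t] for t in T)
        if not blocked and rng.random() < 1.0 / s_row:
            T.append(j)
            S.append(primary[j])        # add the primary variable i(j) to S
    return S, T   # each j lands in T with probability >= 1 / (e^2 * s_row)
```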
Given our representative construction, we are ready to complete the algorithm and proof.
Proof of Proposition 4.4
The algorithm is an inclusion-threshold algorithm. Its inclusion set S is obtained by calling our representative construction of Lemma 4.6, which also produces a choice \(j(i)\) for each \(i \in S\) . Define \(Z_i = Y_{j(i)}\) for each \(i \in S\) and \(Z_i = 0\) if \(i \not\in S\) . Set \(\tau = \frac{1}{2} {\mathbb {E}}[ \max _i Z_i]\) .
By the second property of representative constructions, \(Z_i\) is independent of \(X_{i^{\prime }}\) for all \(i^{\prime } \in S, i^{\prime } \ne i\) . Therefore, by the augmentation lemma (Lemma 3.2),
\begin{align} {\mathbb {E}}[\text{ALG}] \ge \frac{1}{2} {\mathbb {E}}[\max _i Z_i] . \end{align}
(2)
Combining this with the following Lemma 4.8 will prove Proposition 4.4.
Lemma 4.8.
\({\mathbb {E}}[ \max _i Z_i ]\ge \frac{1}{e^3 \cdot s_{\text{row}}} {\mathbb {E}}[ \max _i X_i]\) . □
Before proving Lemma 4.8, we will need one more idea. Let \(R_i = \lbrace j : A_{ij} \gt 0\rbrace\) , the variables included in \(X_i\) . Notice that we may have \(|R_i \cap T| \ge 2\) (i.e., multiple variables \(Y_j\) are members of \(X_i\) and appear in the construction T). This can occur when \(X_i\) is not primary for any of them.4 It will help to lower-bound the probability that \(Y_j\) is the unique member of \(R_i \cap T\) .
Claim 4.9.
Under the construction of Lemma 4.6, for each \(j \in R_i\) , \(\Pr [R_i \cap T = \lbrace j\rbrace ] \ge \frac{1}{e^3 \cdot s_{\text{row}}}\) .
Proof.
We have \(\Pr [j \in T] \ge \frac{1}{e^2 \cdot s_{\text{row}}}\) by the representative construction. Meanwhile, conditioned on any other decisions, each \(j^{\prime } \in R_i \setminus \lbrace j \rbrace\) (there are at most \(s_{\text{row}}-1\) of them) is included in T with probability at most \(\frac{1}{s_{\text{row}}}\), because it is considered with probability at most 1 and included independently with probability \(\frac{1}{s_{\text{row}}}\) conditioned on being considered. So, \(\Pr [R_i \cap T = \lbrace j\rbrace \mid j \in T] \ge (1 - \frac{1}{s_{\text{row}}})^{s_{\text{row}}-1} \ge \frac{1}{e}\). Multiplying the two bounds gives \(\Pr [R_i \cap T = \lbrace j\rbrace ] \ge \frac{1}{e^3 \cdot s_{\text{row}}}\). □
Proof of Lemma 4.8
Fix the realizations of all \(Y_j\) . Let \(i^* = \operatorname{arg max}_i X_i\) . First, notice that
\begin{align*} {\mathbb {E}}[\max _i Z_i ] \quad &= \quad {\mathbb {E}}\big [ \max _{j \in T} \lbrace Y_j \rbrace \big ] \quad \ge \quad {\mathbb {E}}\big [ \max _{j \in T \cap R_{i^*}} \lbrace A_{i^* j} \cdot Y_j \rbrace \big ] . \end{align*}
Now the expected maximum of elements in \(T \cap R_{i^*}\) is at least the sum over the elements of each one’s contribution to the max, which is at least the chance it is the unique survivor times its value. This allows us to relate a max to a sum, and it relies on the fact that the representative construction’s randomness is independent of the realizations of the \(\lbrace Y_j\rbrace\) . Thus,
\begin{align*} {\mathbb {E}}[\max _i Z_i ] ~ \ge ~ {\mathbb {E}}\big [ \max _{j \in T \cap R_{i^*}} \big \lbrace A_{i^* j} \cdot Y_j \big \rbrace \big ] ~ \ge ~ \sum _j \Pr \big [T \cap R_{i^*} = \lbrace j\rbrace \big ] \cdot A_{i^* j} Y_j ~ \ge ~ \sum _j \frac{1}{e^3 \cdot s_{\text{row}}} A_{i^* j} Y_j , \end{align*}
where the last inequality uses Claim 4.9. Since \(\sum _j A_{i^* j} Y_j = X_{i^*}\), we have \({\mathbb {E}}[\max _i Z_i ] \ge \frac{1}{e^3 \cdot s_{\text{row}}} X_{i^*}\). Taking expectations over \(Y_1,\dots ,Y_m\) completes the proof. □
Finally, combining (2) with Lemma 4.8 completes the proof of Proposition 4.4.

4.3 Lower Bounds

We now give a matching hardness result, showing that no algorithm can do better than our results in the previous section up to constant factors.
Example 4.10 (General Tower Instance).
Take the tower \({{\bf Y}}\) variables (recall \(Y_i = \frac{1}{\epsilon ^i}\) with probability \(\epsilon ^i\) , and 0 otherwise). Given input integer n, set \(s_{\text{row}}= s_{\text{col}}= n\) . Let A be an \(n \times n\) matrix with entry \(A_{i,j} = 0\) for \(j \lt i\) and \(A_{i,j} = \epsilon ^{j-i}\) for \(j \ge i\) . Visually,
\begin{align*} X_1 = Y_1 + \epsilon Y_2 + \cdots + \epsilon ^{n-1} Y_n , \qquad X_2 = Y_2 + \epsilon Y_3 + \cdots + \epsilon ^{n-2} Y_n, \qquad \ldots , \qquad X_n = Y_n . \end{align*}
The difficulty here amplifies that of Example 2.3. If any of \(Y_i,Y_{i+1},\dots ,Y_n\) are active, then this will cause \(X_i\) to be nonzero. Assuming only one of these variables is active (by far the most likely case), it is impossible for the algorithm to tell whether to stop or continue.
It will turn out that this instance is hard even if the algorithm is given additional power.
Definition 4.11.
In the fractional variant of the prophet problem, at each arrival i, the algorithm may choose to take a fraction \(p_i\) of the current arrival \(X_i\) subject to always taking at most one unit in total (i.e., \(\sum _{i=1}^n p_i \le 1\) with probability 1). Its reward is \(\sum _{i=1}^n p_i X_i\) .
One strategy available in the fractional prophet problem is to spend the entire budget on a single arrival, which is an algorithm for the standard prophets problem. So a lower bound for the fractional problem immediately implies a lower bound for the prophets problem.
Theorem 4.12.
In the linearly correlated prophet problem, even if fractional, no online algorithm can guarantee a smaller approximation ratio than \({\min \lbrace {s_{\text{col}}},{s_{\text{row}}}\rbrace }\) .
Proof.
We consider a family of instances of Example 4.10 where \(n = s_{\text{col}}= s_{\text{row}}\) .
Since for every i we have \(X_i \ge Y_i\) , we get that for sufficiently small \(\epsilon\) ,
\begin{align*} {\mathbb {E}}[\max _i\lbrace X_i\rbrace ] \quad \ge \quad {\mathbb {E}}[\max _j\lbrace Y_j\rbrace ] \quad = \quad \sum _{i=1}^n \epsilon ^i \prod _{j\gt i} (1-\epsilon ^j) \cdot \frac{1}{\epsilon ^i} \quad \ge \quad n (1-\epsilon)^n \quad \ge \quad n(1-n\epsilon). \end{align*}
However, we show that every online algorithm, even one that may fractionally select elements, has value at most \(1/(1-\epsilon)^2\). This implies the approximation ratio is at least \({n}(1-n\epsilon)(1-\epsilon)^2\), which tends to n as \(\epsilon \rightarrow 0\).
Lemma 4.13.
Suppose at arrival i we have \(p = \sum _{i^{\prime } \lt i} p_{i^{\prime }}\) and \(X_i = 1/\epsilon ^i\) . Conditioned on this event, any online algorithm obtains expected value from elements \(i,\dots ,n\) at most \(({1-p})/{\epsilon ^i}\) . □
Before proving Lemma 4.13, we use it to prove that every algorithm has \(O(1)\) expected value. Notice that if \(X_i\gt 1/\epsilon ^i\), then the online algorithm should never accept any fraction of the element \(X_i\): each active \(Y_j\) with \(j \ge i\) contributes exactly \(\epsilon ^{j-i} \cdot \epsilon ^{-j} = \epsilon ^{-i}\) to \(X_i\), so \(X_i \gt 1/\epsilon ^i\) means some \(Y_j\) with \(j \gt i\) is active, and hence \(X_{i+1} \ge 1/\epsilon ^{i+1}\) is guaranteed to be larger (for \(\epsilon \lt 1/n\)). Hence, by Lemma 4.13, the optimal algorithm \(\text{ALG}\) takes the smallest i for which \(X_i = 1/\epsilon ^i\), which means
\begin{align*} {\mathbb {E}}[\text{ALG}] & \le \sum _{i \ge 1} \Pr \big [ \big (X_{j} \gt 1/\epsilon ^{j} \text{ for all } j\lt i \big) ~\&~ \big (X_i = 1/\epsilon ^i \big) \big ] \cdot \frac{1}{\epsilon ^i} \\ & = \sum _{i \ge 1} \Pr [Y_{i-1} = 1/\epsilon ^{i-1} ] \cdot \big (\Pr [X_i = 1/\epsilon ^i] \big) \cdot \frac{1}{\epsilon ^i} \\ & \le \sum _{i \ge 1} \Pr [Y_{i-1} = 1/\epsilon ^{i-1} ] \cdot \left(\sum _{j\ge i} \Pr [ Y_j = 1/\epsilon ^j] \right) \cdot \frac{1}{\epsilon ^i} \\ & = \sum _{i \ge 1} \epsilon ^{i-1} \sum _{j \ge i} \epsilon ^j \cdot \frac{1}{\epsilon ^i} \quad \le \quad \sum _{i \ge 1} \frac{\epsilon ^{i-1} }{1-\epsilon } \quad \le \quad \frac{1}{(1-\epsilon)^2}. □ \end{align*}
Now we prove the missing lemma.
Proof of Lemma 4.13
We prove this lemma by reverse induction on i. It is immediately true for \(i=n\) , as spending the entire remaining budget \(1-p\) on acquiring \(X_n = \frac{1}{\epsilon ^n}\) is optimal. To prove the inductive step, notice the optimal online algorithm can be written as a convex combination of the following two algorithms: one that spends the entire remaining budget of \((1-p)\) on \(X_i\) and another one that spends no budget on \(X_i\) and plays optimally afterward. We argue that the first algorithm is better, which means its expected value is \(({1-p})/{\epsilon ^i}\) .
Observe that the second algorithm obtains positive reward only if one of the \(Y_j\)s for \(j\gt i\) is active. In this case, \(X_{i+1} = 1/\epsilon ^{i+1}\) (note it cannot be larger because \(X_i = 1/\epsilon ^i\)), and hence by the induction hypothesis the optimal online algorithm obtains expected value at most \((1-p)/\epsilon ^{i+1}\) from elements \(i+1,\dots ,n\). Thus, the expected value of the second algorithm is at most
\begin{equation*} \Pr \big [\exists j\gt i \text{ s.t. } Y_j = 1/\epsilon ^j \mid X_i = 1/\epsilon ^i \big ] \cdot (1-p)/\epsilon ^{i+1}. \end{equation*}
We show \(\Pr [\exists j\gt i \text{ s.t. } Y_j = 1/\epsilon ^j \mid X_i = 1/\epsilon ^i ] \le \epsilon\) , which implies the first algorithm is always better. To see this, notice
\begin{align*} \Pr [\exists j\gt i \text{ s.t. } Y_j = 1/\epsilon ^j \mid X_i = 1/\epsilon ^i] & = \frac{ \Pr [(\exists j\gt i \text{ s.t. } Y_j = 1/\epsilon ^j) ~\&~ (X_i = 1/\epsilon ^i)] }{\Pr [X_i = 1/\epsilon ^i ]} \\ & = \frac{\sum _{j\gt i} \Pr [(Y_j = 1/\epsilon ^j) ~\&~ (X_i = 1/\epsilon ^i)] }{\sum _{j \ge i} \Pr [(Y_j = 1/\epsilon ^j) ~\&~ (X_i = 1/\epsilon ^i) ] } \\ & = \frac{\sum _{j\gt i} \epsilon ^j/(1- \epsilon ^j)\cdot \prod _{j^{\prime } \ge i} (1-\epsilon ^{j^{\prime }}) }{\sum _{j \ge i} \epsilon ^j/(1- \epsilon ^j)\cdot \prod _{j^{\prime } \ge i } (1-\epsilon ^{j^{\prime }})} \quad = \quad \frac{\sum _{j\gt i} \epsilon ^j/(1- \epsilon ^j) }{\sum _{j \ge i} \epsilon ^j/(1- \epsilon ^j)}. \end{align*}
Now using \(1+ x \le \frac{1}{1-x} \le 1+ 2x\) for \(0 \le x\lt 0.5\) , we get
\begin{align*} \Pr [\exists j\gt i \text{ s.t. } Y_j = 1/\epsilon ^j \mid X_i = 1/\epsilon ^i ] & \le \frac{\sum _{j\ge i+1} \epsilon ^j \cdot (1 + 2\epsilon ^j) }{\sum _{j \ge i} \epsilon ^j \cdot (1 + \epsilon ^j)} \\ & = \frac{\epsilon ^{i+1} - \epsilon ^{n+1} + (2 \epsilon ^{2i+2} - 2 \epsilon ^{2n+2}) /(1+\epsilon) }{\epsilon ^i -\epsilon ^{n+1} + (\epsilon ^{2i} -\epsilon ^{2n+2})/(1+\epsilon)} \quad \le \quad \epsilon , \end{align*}
where the last inequality uses \(\epsilon \lt 1/2\). □

5 Selecting Multiple Items

In this section, we show that our approach via augmentations extends to a variant of the problem in which one may take up to \(r \in \mathbb {N}\) of the arriving variables \(X_1,\dots ,X_n\) . Let \(Q \subseteq \lbrace 1,\dots ,n\rbrace\) be the indices chosen by \(\text{ALG}\) with \(|Q| \le r\) . Let \(X^{(i)}\) denote the ith-largest realized variable (we later use notation \(Z^{(i)}\) for the ith-largest among \(\lbrace Z_1,\dots ,Z_n\rbrace\) as well). We have
\begin{equation*} \text{ALG}= \sum _{i \in Q} X_i \quad \text{ while } \quad {\rm\small OPT}= \sum _{i=1}^r X^{(i)} . \end{equation*}
We refer to this problem as a cardinality constraint of r. It is also referred to as selecting an independent set of a rank-r matroid in the special case of r-uniform matroids.
In this setting, there will be significant differences between row and column sparsity assumptions. We will show that for bounded column sparsity \(s_{\text{col}}\) , one can design \((1+o(1))\) -approximation algorithms for cardinalities \(r \rightarrow \infty\) , although this does not hold for bounded row sparsity \(s_{\text{row}}\) .

5.1 Bounded Column Sparsity

As \(r \rightarrow \infty\) , we will show for bounded column sparsity an approximation ratio approaching 1.
Theorem 5.1.
For a fixed \(s_{\text{col}}\) , the linearly correlated prophets problem with cardinality constraint r admits a \((1 + O((\frac{s_{\text{col}}}{r})^{1/5} (\log r)^{6/5}))\) -approximation.
The key idea is to prove an augmentation lemma for selecting multiple items (restated next).
Lemma 3.4 (Multiple-Items Augmentation Lemma). There is an algorithm for the augmented prophets problem with cardinality constraint r achieving a \((1 + O(\frac{(\log r)^{3/2}}{r^{1/4}}))\) -approximation ratio.
In Section 5.3, we prove this augmentation lemma, but first we use it to prove Theorem 5.1.
The idea of the reduction is that randomly partitioning the variables into \(s_{\text{col}}/\epsilon ^{\prime }\) “groups” gives us multiple independent augmented prophets problems. We think of each group as a subproblem of selecting \(\epsilon ^{\prime } r/s_{\text{col}}\) elements and use the augmentation lemma to approximately solve it.
Proof of Theorem 5.1
Formally, let there be \(c = \lceil s_{\text{col}}/\epsilon ^{\prime } \rceil\) sets, which we call groups, \(B_1,\dots ,B_c\) . For each \(X_i\) , place it in a group \(j \in \lbrace 1,\dots ,c\rbrace\) chosen uniformly at random. For each \(X_i\) , let
\begin{equation*} X_i^{\prime } = \sum _j A_{ij} Y_j \cdot \mathbf {1} [\text{$Y_j$ appears only once in the group containing $X_i$}] \end{equation*}
denote the sum of \(X_i\) ’s components \(Y_j\) that do not appear with any other variable in the group containing \(X_i\) . Let \({\rm\small OPT}_j^{\prime }\) denote the sum of the largest \(\epsilon ^{\prime } r/s_{\text{col}}\) elements \(X_i^{\prime }\) in group \(B_j\) .
Claim 5.2.
\({\mathbb {E}}[ \sum _j {\rm\small OPT}_j^{\prime } ] \ge (1- \epsilon ^{\prime }) \cdot {\mathbb {E}}[{\rm\small OPT}].\)  □
Proof.
Consider a fixed \(X_i\) and condition on the group it lands in. Each of its features \(Y_j\) has at least a \(1-\epsilon ^{\prime }\) chance of appearing only with \(X_i\) in this group: by a union bound, the at most \(s_{\text{col}}-1\) other rows containing \(Y_j\) all land in different groups with probability at least \(1 - (s_{\text{col}}-1)/c \ge 1-\epsilon ^{\prime }\). Hence, each component \(A_{ij}Y_j\) of \(X_i\) survives into \(X_i^{\prime }\), and thus contributes to \(\sum _j {\rm\small OPT}_j^{\prime }\), with probability at least \(1-\epsilon ^{\prime }\). □
Since each group \(B_j\) forms a separate instance of the augmented prophets problem, we can apply Augmentation Lemma 3.4 to each of them. Let \(\text{ALG}_j\) denote the algorithm’s performance in group j. Choosing \(\epsilon ^{\prime } = \Theta ((\frac{s_{\text{col}}}{r})^{1/5} (\log r)^{6/5})\) so that the loss \(O(\frac{(\log r)^{3/2}}{(\epsilon ^{\prime } r /s_{\text{col}})^{1/4}})\) from Lemma 3.4 is itself \(O(\epsilon ^{\prime })\), we get
\begin{equation*} \sum _j {\mathbb {E}}[ \text{ALG}_j ] \quad \ge \quad \sum _j (1- O(\epsilon ^{\prime }))\cdot {\mathbb {E}}[{\rm\small OPT}^{\prime }_j] \quad \ge \quad (1- O(\epsilon ^{\prime })) \cdot {\mathbb {E}}[{\rm\small OPT}], \end{equation*}
where the last inequality uses Claim 5.2. □

5.2 Bounded Row Sparsity

For cardinality constraints, the symmetry between bounds for row and column sparsity breaks: one cannot guarantee better than an \(\Omega ({s_{\text{row}}})\) -approximation in general, even as \(r \rightarrow \infty\) . In fact, this follows by a reduction to our previous hardness result for fractional prophets.
Theorem 5.3.
No algorithm for linearly correlated prophets with cardinality constraint can guarantee better than an \(\Omega ({s_{\text{row}}})\) -approximation, even as \(r \rightarrow \infty\) .
Finally, we show that the \(O({s_{\text{row}}})\) -approximation upper bound for the single-item case can be extended to the cardinality constraint setting. The intuition is straightforward: we instantiate r parallel versions of our previous single-item algorithm and assign arriving variables to each at random. The analysis needs to show that no more than a constant factor is lost in cases where members of OPT are sent to the same bucket.
Theorem 5.4.
For all r, there is an \(O({s_{\text{row}}})\) -approximation for the linearly correlated prophet problem with cardinality constraint r.
We now present formal proofs of the last two theorems.
Proof of Theorem 5.3
For any \(r, s_{\text{row}}\) , we construct the general tower instance of Example 4.10 with parameter \(s_{\text{row}}\) , but we simply make r copies of each variable \(X_i\) (not independent, but exact copies). The row sparsity is unchanged. Now we will make the problem easier in two steps.
In the first step, allow variables to arrive in batches of size r. The algorithm may select any subset of the variables (until fulfilling its cardinality constraint), then reject the rest and receive the next batch. This is a strictly easier problem, so an algorithm’s performance can only improve. Of course, this will not help on this instance: the r variables in each batch are identical, and it will turn out to be optimal either to take all of them or none.
In the second step, instead we send the original tower instance (with no duplication), and we give the algorithm a cardinality constraint of 1, but we allow it to pick a fractional amount of each variable \(X_i\) . In other words, we return exactly to the setting of Theorem 4.12. In each case, the algorithm sees the same information before making each decision (i.e., the value of the current \(X_i\) ). In the fractional problem, the algorithm can pick any fraction \(p_i\) of \(X_i\) , so long as the total amount picked is at most 1. In the preceding first step, the algorithm can pick any fraction \(\frac{c}{r}\) of the variables that equal the original \(X_i\) (by taking c of the r copies), as long as the total fraction picked does not exceed \(\frac{r}{r} = 1\) . The fractional problem allows more choice, and the benchmarks (normalized) are the same, since the prophet on the duplicated instance takes all r copies of the largest \(X_i\) . So the fractional problem is only easier.
We now invoke Theorem 4.12, which gives a lower bound of \(\Omega ({s_{\text{row}}})\) -approximation on the fractional problem. (Note that \(s_{\text{row}}\) did not change during the preceding reduction, although \(s_{\text{col}}\) did.) □
Proof of Theorem 5.4
Given an instance, we create r buckets \(B_1,\dots ,B_r\) and place each variable \(X_i\) in a bucket \(B_j\) uniformly at random. We then run our algorithm for the linearly correlated prophet problem with bounded \(s_{\text{row}}\) in each bucket j (call it \(\text{ALG}_j\) ), selecting one item. Given a fixed assignment of variables to buckets, the algorithm’s guarantee (Proposition 4.4) yields, for each bucket \(B_j\) and in expectation over the realizations of \(\lbrace X_i : i \in B_j\rbrace\) ,
\begin{equation*} {\mathbb {E}}_{{{\bf X}}}[\text{ALG}_j] \ge \Omega \left(\frac{1}{s_{\text{row}}}\right) \cdot {\mathbb {E}}_{{{\bf X}}}[\max _{X_i\in B_j} X_i] . \end{equation*}
Taking expectations over both buckets and variables, we have
\begin{align*} {\mathbb {E}}[\text{ALG}] ~ = ~ \sum _{j=1}^r {\mathbb {E}}_B {\mathbb {E}}_{{{\bf X}}} [\text{ALG}_j ] ~ \ge ~ \Omega \left(\frac{1}{s_{\text{row}}}\right) \sum _{j=1}^r {\mathbb {E}}_B {\mathbb {E}}_{{{\bf X}}}\big [\max _{X_i\in B_j} X_i\big ] ~=~ r \cdot \Omega \left(\frac{1}{s_{\text{row}}}\right) {\mathbb {E}}_{{{\bf X}}} {\mathbb {E}}_B\big [\max _{X_i\in B_1} X_i \big ], \end{align*}
where the last equality is by symmetry of the buckets. Now since \({\rm\small OPT}= X^{(1)} + \cdots + X^{(r)}\) where \(X^{(i)}\) is the ith largest variable, and since for fixed variable realizations \(\Pr [X^{(i)} = \max _{X_{i^{\prime }}\in B_1} X_{i^{\prime }}] = \frac{1}{r}\prod _{i^{\prime }=1}^{i-1}(1- \frac{1}{r}) \ge \frac{1}{e\cdot r}\) , we get
\begin{align*} {\mathbb {E}}[\text{ALG}] &\ge r \cdot \Omega \left(\frac{1}{s_{\text{row}}}\right) {\mathbb {E}}_{{{\bf X}}}\Big [\sum _{i=1}^r \Pr [X^{(i)} = \max _{X_{i^{\prime }}\in B_1} X_{i^{\prime }}] \cdot X^{(i)}\Big ] \\ &\ge r \cdot \Omega \left(\frac{1}{s_{\text{row}}}\right) {\mathbb {E}}_{{{\bf X}}}\Big [\sum _{i=1}^r \frac{1}{e\cdot r} X^{(i)} \Big ] \quad = \quad \Omega \left(\frac{1}{s_{\text{row}}}\right) \cdot {\mathbb {E}}[{\rm\small OPT}]. □ \end{align*}

5.3 Multiple-Items Augmentation Lemma

In Section 3, we showed a 2-approximation single-item augmentation lemma using half of the expected maximum as a threshold. In this section, we prove a \((1+o(1))\) -approximation multiple-items augmentation lemma (Lemma 3.4), assuming the cardinality constraint r is sufficiently large. We first give a surrogate benchmark \({\rm\small OPT}^{\prime }\) that competes with \({\rm\small OPT}\) . We then define the algorithm and show that it competes with \({\rm\small OPT}^{\prime }\) .

Surrogate Benchmark.

The analysis hinges on the threshold \(\tau _0 := \frac{{\mathbb {E}}[{\rm\small OPT}]}{\epsilon }\) ; we call variables \(Z_i\) and \(X_i\) that fall above this threshold heavy, and those below it light. For light variables, we also exclude the scenario where they are very small, below some threshold \(\tau _c \le \epsilon \frac{ {\mathbb {E}}[{\rm\small OPT}]}{r}\) . We will generally use the prime symbol \(\prime\) to denote a version of a variable that is zeroed out if it is too heavy (or too light). Let
\begin{align*} Z_i^{\prime } = Z_i \cdot \mathbf {1} [\tau _c \le Z_i \lt \tau _0] \quad \text{and} \quad Z^{(i)\prime } = Z^{(i)} \cdot \mathbf {1} [\tau _c \le Z^{(i)} \lt \tau _0], \end{align*}
where the latter is the truncated version of order statistic \(Z^{(i)}\) (and not the order statistic of the truncated variable \(Z_i^{\prime }\) ). Let
\begin{align*} {\rm\small OPT}^{\prime } &= {\rm\small OPT}_1^{\prime } + {\rm\small OPT}_2^{\prime }, \qquad \text{ where}\\ {\rm\small OPT}_1^{\prime } &= \max _i ~ Z_i \cdot \mathbf {1} [Z_i \ge \tau _0] \quad \text{and} \quad {\rm\small OPT}_2^{\prime } = {\textstyle \sum _{j=1}^r Z^{(j)\prime }}. \end{align*}
Note that \({\rm\small OPT}^{\prime }\) only considers one heavy variable and throws all other heavy variables away. Nevertheless, in Appendix D, we will show the following claim that it competes with \({\rm\small OPT}\) .
Claim 5.5.
\({\mathbb {E}}[{\rm\small OPT}^{\prime }] \ge (1-2\epsilon) {\mathbb {E}}[{\rm\small OPT}]\) .

Algorithm Overview.

For intuition, the problem with setting any particular fixed threshold is that variables with insignificant values of \(Z_i\) can be boosted by the adversary to some \(X_i\) just above the threshold. The algorithm would use up its r slots and be unable to take the later, larger arrivals that contribute to \({\rm\small OPT}\) .
Therefore, we will define a sequence of thresholds, each a factor of \(1-\epsilon\) apart. The algorithm will have a certain number of slots \(\tilde{r}_j\) for the “bucket” j of arrivals between any two thresholds. For this fixed bucket, such a boosting strategy by the adversary can only cost the algorithm a factor of \(1-\epsilon\) .
Roughly, this strategy will cover the case where \({\rm\small OPT}\) is concentrated. To allow for cases where most of \({\rm\small OPT}\) comes from very rare, very large variables, we will also reserve a slot for such variables and analyze it separately.

Algorithm Definition.

We define a sequence of thresholds. Recall that \({\rm\small OPT}= \sum _{j=1}^r Z^{(j)}\) , where \(Z^{(j)}\) is the jth-largest of \(Z_1,\dots ,Z_n\) . Let \(c = \lceil \frac{1}{\epsilon } \ln \frac{r}{\epsilon ^2}\rceil\) and define thresholds
\begin{align*} \tau _j &= (1-\epsilon)^j \frac{{\mathbb {E}}[{\rm\small OPT}]}{\epsilon } & (j = 0,\dots ,c) \end{align*}
Observation 5.6.
The largest threshold is \(\tau _0 = \frac{{\mathbb {E}}[{\rm\small OPT}]}{\epsilon }\) , and the smallest threshold satisfies \(\tau _c \le \frac{\epsilon \cdot {\mathbb {E}}[{\rm\small OPT}]}{r}\) .
Proof.
The value of \(\tau _0\) is immediate. For \(\tau _c\) , we have \((1-\epsilon)^c \le e^{-\epsilon c} \le \exp (- \epsilon \frac{1}{\epsilon } \ln \frac{r}{\epsilon ^2}) = \frac{\epsilon ^2}{r}\) , so \(\tau _c = (1-\epsilon)^c \frac{{\mathbb {E}}[{\rm\small OPT}]}{\epsilon } \le \frac{\epsilon \cdot {\mathbb {E}}[{\rm\small OPT}]}{r}\) . □
Now we define the size of each bucket. Recall that we abuse notation by writing \(Z_i \in {\rm\small OPT}\) if \(Z_i\) is one of the r variables included in the OPT solution. Let
\begin{align*} r_j &= {\mathbb {E}}\Big [ \left| \left\lbrace i : Z_i \in {\rm\small OPT}, \tau _j \le Z_i \le \tau _{j-1} \right\rbrace \right| \Big ] & (j = 1,\dots ,c) \\ \beta &= 3 \sqrt {r \ln (c/\epsilon)} \\ \tilde{r}_j &= r_j + \beta & (j = 1,\dots ,c) \\ \tilde{r}_0 &= 1 . \end{align*}
We first define an algorithm \(\text{ALG}^{\prime }\) that does not quite achieve the cardinality constraint r. We will then modify it to obtain \(\text{ALG}\) with only a small loss in performance.
\(\text{ALG}^{\prime }\) initializes \(b_j = 0\) for \(j=0,\dots ,c\) , where \(b_j\) indicates the (current) size of bucket number j, and proceeds as follows when a variable \(X_i\) arrives:
(1)
If \(X_i \lt \tau _c\) , we discard \(X_i\) and continue.
(2)
Otherwise, let \(j = \min \lbrace j^{\prime } : X_i \ge \tau _{j^{\prime }} \rbrace\) .
(3)
If \(b_j \lt \tilde{r}_j\) , we take \(X_i\) and increment \(b_j\) by one unit. Otherwise (bucket j is full), increment j by one unit and repeat this step. If \(j \gt c\) , stop and discard \(X_i\) .
In other words, we attempt to assign \(X_i\) to its original bucket, but if that is full, we allow it to fall into buckets reserved for smaller variables (higher indices j).
Now, the final algorithm \(\text{ALG}\) is defined as follows: run \(\text{ALG}^{\prime }\) , but each time \(\text{ALG}^{\prime }\) takes an arrival \(X_i\) , discard it independently with probability \(\epsilon\) . If a variable is not discarded but the cardinality constraint r is reached, then discard it anyway.
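For concreteness, the following is a minimal Python sketch of \(\text{ALG}^{\prime }\) together with the \(\epsilon\) -thinning that turns it into \(\text{ALG}\) . It assumes \({\mathbb {E}}[{\rm\small OPT}]\) and the expected occupancies \(r_1,\dots ,r_c\) have been computed from the (known) distributions; the function names are illustrative, not part of the formal construction.

```python
import math
import random

def make_thresholds(expected_opt, eps, r):
    """tau_j = (1-eps)^j * E[OPT]/eps for j = 0,...,c,
    with c = ceil((1/eps) * ln(r/eps^2))."""
    c = math.ceil((1.0 / eps) * math.log(r / eps ** 2))
    return [(1 - eps) ** j * expected_opt / eps for j in range(c + 1)]

def make_capacities(r_js, r, eps):
    """r_tilde_j = r_j + beta with beta = 3*sqrt(r*ln(c/eps)), and
    r_tilde_0 = 1; `r_js` lists the expected occupancies r_1..r_c."""
    c = len(r_js)
    beta = 3.0 * math.sqrt(r * math.log(c / eps))
    return [1] + [rj + beta for rj in r_js]

def run_alg(arrivals, taus, capacities, r, eps, seed=0):
    """Run ALG' with trickle-down buckets, then eps-thin it into ALG."""
    rng = random.Random(seed)
    c = len(taus) - 1
    counts = [0] * (c + 1)   # b_j: current size of bucket j
    taken = []               # values kept by ALG
    for x in arrivals:
        if x < taus[c]:
            continue         # step (1): below the smallest threshold
        # step (2): smallest j with x >= tau_j (thresholds decrease in j)
        j = min(jp for jp in range(c + 1) if x >= taus[jp])
        # step (3): trickle down to the first non-full bucket
        while j <= c and counts[j] >= capacities[j]:
            j += 1
        if j > c:
            continue         # all eligible buckets are full: discard
        counts[j] += 1       # ALG' takes x in bucket j
        # ALG: independently discard each taken arrival w.p. eps, and
        # discard anyway once the hard cardinality constraint r is hit
        if rng.random() >= eps and len(taken) < r:
            taken.append(x)
    return taken
```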

Analysis.

We show in Appendix D that \(\text{ALG}\) approximates \(\text{ALG}^{\prime }\) (i.e., the following lemma).
Lemma 5.7.
For all \(\epsilon \ge \frac{9 \left(\ln r \right)^{3/2}}{r^{1/4}}\) , we have \({\mathbb {E}}[\text{ALG}] \ge (1-2\epsilon) {\mathbb {E}}[\text{ALG}^{\prime }]\) .
Now, we analyze \(\text{ALG}^{\prime }\) .
Let us define the contributions of \(\text{ALG}^{\prime }\) and \({\rm\small OPT}^{\prime }\) bucket-by-bucket. The top bucket of \(\text{ALG}^{\prime }\) is split into cases where the corresponding \(Z_i\) is heavy or light. The following are random sets:
\begin{align*} O_j &= \lbrace i \in {\rm\small OPT}^{\prime } : \tau _j \le Z_i \lt \tau _{j-1}\rbrace & (j = 1,\dots ,c) \\ O_0 &= \lbrace i \in {\rm\small OPT}^{\prime } : \tau _0 \le Z_i\rbrace \\ B_j &= \lbrace i \in \text{ALG}^{\prime } : \tau _j \le X_i \lt \tau _{j-1}\rbrace & (j = 1,\dots ,c) \\ B_0^{\text{light}} &= \lbrace i \in \text{ALG}^{\prime } : Z_i \lt \tau _0 \le X_i\rbrace \\ B_0^{\text{heavy}} &= \lbrace i \in \text{ALG}^{\prime } : \tau _0 \le Z_i\rbrace . \end{align*}
We use the notation \(i \in {\rm\small OPT}^{\prime }\) to denote that i contributes to \({\rm\small OPT}^{\prime }\) —that is, either \(Z_i\) is the largest among \(\lbrace Z_j\rbrace\) while being at least \(\tau _0\) , or \(Z_i\) is among the r largest of \(\lbrace Z_j\rbrace\) while satisfying \(\tau _c \le Z_i \lt \tau _0\) . Similarly, we write \(i \in \text{ALG}^{\prime }\) to mean that the algorithm takes \(X_i\) .
Now, we break down \(\text{ALG}^{\prime }\) as follows:
\begin{align*} \text{ALG}^{\prime } &= \text{ALG}_1^{\prime } + \text{ALG}_2^{\prime }, \qquad \text{where}\\ \text{ALG}_1^{\prime } = \sum _{i \in B_0^{\text{heavy}}} X_i & \quad \text{and} \quad \text{ALG}_2^{\prime } = \sum _{i \in B_0^{\text{light}}} X_i + \sum _{j=1}^c \sum _{i \in B_j} X_i . \end{align*}
In other words, \(\text{ALG}_1^{\prime }\) tracks the contribution of the special “heavy” bucket, but only in the case where the underlying variable \(Z_i\) is heavy. \(\text{ALG}_2^{\prime }\) tracks the remaining case and all other buckets. Notice these definitions are only for the purpose of analysis, as \(Z_i\) is not observable to the algorithm.
Finally, we define
\begin{equation*} P := \Pr [ \max _i X_i \ge \tau _0 ] . \end{equation*}
A key point will be that if P is large, then the algorithm will often get some variable larger than \(\tau _0\) , which is good enough to compete with \({\rm\small OPT}\) .
Lemma 5.8.
\({\mathbb {E}}[\text{ALG}^{\prime }] \ge \frac{P}{\epsilon }{\mathbb {E}}[{\rm\small OPT}]\) .
Proof.
With probability P, some \(X_i\) exceeds \(\tau _0 = \frac{{\mathbb {E}}[{\rm\small OPT}]}{\epsilon }\) . Since bucket 0 is reserved for such arrivals with capacity \(\tilde{r}_0 = 1\) , the algorithm takes such an \(X_i\) whenever this occurs, so its expectation is at least \(P \tau _0 = \frac{P}{\epsilon }{\mathbb {E}}[{\rm\small OPT}]\) . □
Thus, if \(P \ge \epsilon\) , we are already done. The rest of the analysis will leverage cases where P is small.
Lemma 5.9.
\({\mathbb {E}}[\text{ALG}_1^{\prime }] \ge (1-P) {\mathbb {E}}[{\rm\small OPT}_1^{\prime }]\) .
Proof.
\begin{align*} {\mathbb {E}}[\text{ALG}_1^{\prime }] &= {\mathbb {E}}\Big [\sum _i X_i \cdot \mathbf {1} [i \in B_0^{\text{heavy}}] \Big ] \quad \ge \quad {\mathbb {E}}\Big [ \sum _i Z_i \cdot \mathbf {1} [i \in B_0^{\text{heavy}}] \Big ] \\ &= \sum _i \Pr [\text{$b_0 = 0$ when $i$ arrives}] \cdot {\mathbb {E}}\left[ Z_i\cdot \mathbf {1} [Z_i \ge \tau _0] \right] \qquad \text{(using independence)} \\ &\ge (1-P) \sum _i {\mathbb {E}}\left[ Z_i \cdot \mathbf {1} [Z_i \ge \tau _0] \right] \\ & \ge (1-P) {\mathbb {E}}[ \max _i \lbrace Z_i \cdot \mathbf {1} [Z_i \ge \tau _0] \rbrace ] ~~=~~ (1-P) \cdot {\mathbb {E}}[{\rm\small OPT}_1^{\prime }] . □ \end{align*}
The final piece of the argument is to show that the “buckets” strategy works—that is, it cannot be disrupted by augmentations. The idea is that we have reserved an accurate number of slots in each bucket for the case where there is no augmentation. An augmented variable \(X_i\) can take away a bucket slot from some \(Z_{i^{\prime }}\) , with \(Z_{i^{\prime }} \gg Z_i\) , but then it will contribute about as much to \(\text{ALG}^{\prime }\) as \(Z_{i^{\prime }}\) did to \({\rm\small OPT}_2^{\prime }\) . In this case, we should be concerned that \({\rm\small OPT}_2^{\prime }\) gets both \(Z_i\) and \(Z_{i^{\prime }}\) while \(\text{ALG}^{\prime }\) only gets \(X_i\) , with \(X_{i^{\prime }}\) disappearing thanks to the bucket being full. However, the algorithm allows such an \(X_{i^{\prime }}\) to “trickle down” into a lower-tier bucket—in particular, the slot that is not being used by \(X_i\) . And if this slot is full as well, then in any case \(\text{ALG}^{\prime }\) is competing with \({\rm\small OPT}_2^{\prime }\) .
Lemma 5.10.
For \(\epsilon \ge \frac{9 \left(\ln r \right)^{3/2}}{r^{1/4}}\) , we have \({\mathbb {E}}[\text{ALG}_2^{\prime }] \ge (1-\epsilon)^2 {\mathbb {E}}[{\rm\small OPT}_2^{\prime }]\) .
Proof.
First, let C denote the event that none of the \({\rm\small OPT}_2^{\prime }\) buckets are filled to the \(\tilde{r}_j\) capacities—that is, C is the event that \(|O_j| \le \tilde{r}_j\) for all \(j=1,\dots ,c\) .
Claim 5.11.
\(\Pr [C] \ge 1-\epsilon \) .
Proof.
Recall that \(r_j = {\mathbb {E}}| O_j|\) . Because \(|O_j|\) is a sum of independent Bernoulli random variables, we have by a standard Bernstein bound that
\begin{equation*} \Pr [ |O_j| - r_j \ge \sqrt {2 r_j t} + t] \le e^{-t} . \end{equation*}
Setting \(t = \ln \frac{c}{\epsilon }\) , we get a bound of \(\frac{\epsilon }{c}\) ; a union bound over the buckets will complete the proof. We just need to show that \({\sqrt {2 r_j \ln \tfrac{c}{\epsilon }}} + \ln \frac{c}{\epsilon } \le \beta\) , as then the probability of exceeding \(\tilde{r}_j = r_j + \beta\) is only smaller. As in the proof of Claim D.1, for this choice of \(\epsilon\) , we have \(\ln \frac{c}{\epsilon } \le \ln (r)\) , and \(\ln (r) \le r\) , so
\begin{equation*} \textstyle \sqrt {2 r_j \ln \frac{c}{\epsilon }} + \ln \frac{c}{\epsilon } \le \sqrt {2 r \ln \frac{c}{\epsilon }} + \sqrt {r \ln \frac{c}{\epsilon }} \le \beta . □ \end{equation*}
Next we argue that at each tier of thresholds, \(\text{ALG}^{\prime }\) is getting just as many variables as \({\rm\small OPT}_2^{\prime }\) , even if their identities are different. First we present a helpful property.
Claim 5.12.
Conditioned on C, suppose \(|B_j| \lt \tilde{r}_j\) for some \(j \ge 1\) . Then,
\begin{equation*} |B_0^{\text{light}}| + \sum \limits _{j^{\prime }=1}^j |B_{j^{\prime }}| \ge \sum \limits _{j^{\prime }=1}^j |O_{j^{\prime }}|. \end{equation*}
Proof.
Consider any arrival i that contributes to the right side. We claim it is also counted on the left. We know \(X_i \ge Z_i \ge \tau _j\) , because the variable contributes to the right side. So the algorithm will attempt to place \(X_i\) in some assigned bucket \(j^{\prime } \le j\) . If it does not succeed because the bucket is full, it will proceed to \(j^{\prime }+1,\dots ,\) and possibly eventually j. Because \(|B_j| \lt \tilde{r}_j\) , we know there is space for i in bucket j, so the algorithm definitely takes i in bucket j or earlier. By definition, \(Z_i \lt \tau _0\) , so i cannot be a member of \(B_0^{\text{heavy}}\) . Therefore, it is counted by the left side. □
Now we can show the key fact.
Claim 5.13.
Conditioned on C, we have for all \(j = 1,\dots ,c\) that
\begin{equation*} \left|B_0^{\text{light}}\right| + \sum _{j^{\prime }=1}^j |B_{j^{\prime }}| \ge \sum _{j^{\prime }=1}^j |O_{j^{\prime }}| . \end{equation*}
Proof.
By induction on j. For \(j=1\) , we must show \(|B_0^{\text{light}}| + |B_1| \ge |O_1|\) . Recall that an arrival i is a member of \(O_1\) if \(\tau _1 \le Z_i \lt \tau _0\) . There are two cases. If \(|B_1| = \tilde{r}_1\) (i.e., the bucket is full), then the case is proven as we have assumed event C, which implies \(|O_1| \le \tilde{r}_1\) . Otherwise, the case follows by Claim 5.12.
Now consider \(j \gt 1\) . If \(|B_j| \lt \tilde{r}_j\) (i.e., the bucket is not full), then the case follows by Claim 5.12. Otherwise (i.e., bucket j is full), then we have \(|B_j| \ge |O_j|\) because of property C. Combining this with the induction hypothesis proves that \(|B_0^{\text{light}}| + \sum _{j^{\prime }=1}^j |B_{j^{\prime }}| \ge \sum _{j^{\prime }=1}^j |O_{j^{\prime }}|\) . □
We are now ready to complete the proof of Lemma 5.10. From Claim 5.13, given event C, we can make a one-to-one mapping from contributions \(Z_i^{\prime }\) of \({\rm\small OPT}_2^{\prime }\) to contributions \(X_j\) of \(\text{ALG}_2^{\prime }\) such that \(X_j\) is in the same bucket or a higher bucket than \(Z_i^{\prime }\) . (In other words, map all elements of \(O_1\) to elements of \(B_1\) or \(B_0^{\text{light}}\) , map all elements of \(O_2\) to remaining elements of these or to elements of \(B_2\) , and so on.) For each such pair, we have \(X_j \ge (1-\epsilon) Z_i^{\prime }\) because, at worst, both are in the same bucket.5 In total, this implies that, conditioned on C, we always have \(\text{ALG}_2^{\prime } \ge (1-\epsilon){\rm\small OPT}_2^{\prime }\) , and C occurs with probability at least \(1-\epsilon\) . □
Corollary 5.14.
For \(\epsilon \ge \frac{9 (\ln r)^{3/2}}{r^{1/4}}\) , we have \({\mathbb {E}}[\text{ALG}^{\prime }] \ge (1-\epsilon)^2 \cdot {\mathbb {E}}[{\rm\small OPT}^{\prime }]\) .
Proof.
If \(P \ge \epsilon\) , then by Lemma 5.8 we have \({\mathbb {E}}[\text{ALG}^{\prime }] \ge {\mathbb {E}}[{\rm\small OPT}] \ge {\mathbb {E}}[{\rm\small OPT}^{\prime }]\) , proving the claim. Otherwise, by Lemma 5.9, \({\mathbb {E}}[\text{ALG}_1^{\prime }] \ge (1-P) {\mathbb {E}}[{\rm\small OPT}_1^{\prime }] \ge (1-\epsilon){\mathbb {E}}[{\rm\small OPT}_1^{\prime }]\) , and by Lemma 5.10, \({\mathbb {E}}[\text{ALG}_2^{\prime }] \ge (1-\epsilon)^2 {\mathbb {E}}[{\rm\small OPT}_2^{\prime }]\) . So in the case \(P \lt \epsilon\) , we have
\begin{align*} {\mathbb {E}}[\text{ALG}^{\prime }] \quad = \quad {\mathbb {E}}[\text{ALG}_1^{\prime }] + {\mathbb {E}}[\text{ALG}_2^{\prime }] \quad &\ge \quad (1-\epsilon){\mathbb {E}}[{\rm\small OPT}_1^{\prime }] + (1-\epsilon)^2{\mathbb {E}}[{\rm\small OPT}_2^{\prime }] \\ &\ge \quad (1-\epsilon)^2 {\mathbb {E}}[{\rm\small OPT}^{\prime }]. □ \end{align*}
Proof of Augmentation Lemma 3.4
For \(\frac{9 (\ln r)^{3/2}}{r^{1/4}} \le \epsilon \le \frac{1}{2}\) , we have
\begin{align*} {\mathbb {E}}[\text{ALG}] \stackrel{\text{Lemma~5.7}}{\ge } (1-2\epsilon){\mathbb {E}}[\text{ALG}^{\prime }] \stackrel{\text{Corollary~5.14}}{\ge } (1-2\epsilon)^3 {\mathbb {E}}[{\rm\small OPT}^{\prime }] \stackrel{\text{Claim~5.5}}{\ge } (1-2\epsilon)^4 {\mathbb {E}}[{\rm\small OPT}] . \end{align*}
So we obtain a \((1+O(\epsilon))\) -approximation; setting \(\epsilon = \frac{9 (\ln r)^{3/2}}{r^{1/4}}\) gives the claimed \((1 + O(\frac{(\log r)^{3/2}}{r^{1/4}}))\) ratio. □

Footnotes

1
Note, however, that our bounds are expressed in terms of the minimum of row and column sparsity of an instance, and hence apply even to instances with high row/column sparsity.
2
Revenue guarantees, at least in the classical independent- \(X_i\) model, can be obtained using a threshold in virtual value space.
3
We often use the inequality \((1-\frac{1}{N})^{N-1} = (\frac{N-1}{N})^{N-1} \ge \frac{1}{e}\) , which follows from \((\frac{N}{N-1})^{N-1} = (1 + \frac{1}{N-1})^{N-1} \le e\) .
4
An illuminating instance is \(X_i = Y_i\) for all \(i \le s_{\text{row}}\) and \(X_{s_{\text{row}}+ 1} = 0.99(Y_1 + \dots + Y_{s_{\text{row}}})\) .
5
For example, we may have \(Z_i^{\prime } \approx \tau _4\) while \(X_j = \tau _3 = (1-\epsilon)\tau _4\) .
6
We say \(\lbrace X_j\rbrace\) are negatively correlated if for all \(i,i^{\prime }\) we have \(\text{Cov}(X_i,X_{i^{\prime }})\le 0\) .

A Unweighted Linear Correlations

In this section, we consider a linear correlations model where the nonnegative matrix A from \({{\bf X}}= A \cdot {{\bf Y}}\) is unweighted—that is, each of its entries is either 0 or 1. Alternatively, for \(i\in [n]\) , there are known sets \(S_1, S_2, \ldots , S_n \subseteq [m]\) such that \(X_i = \sum _{j\in S_i }Y_j\) . Our lower bound tower instance from Section 4.3 no longer applies, as it crucially exploits that matrix A has entries that decrease exponentially in \(\epsilon\) . Can we do better than a \(\Theta \left({\min \lbrace s_{\text{col}},s_{\text{row}}\rbrace }\right)\) -approximation ratio? One might wonder whether there exists an alternate hardness instance that only has 0–1 entries in A. We show that this is not the case. In fact, there exist simple threshold-based constant approximation algorithms.
Theorem 2.4.
The unweighted linear correlations problem has a fixed-threshold constant-factor approximation algorithm.
The main intuition in the proof of Theorem 2.4 is that for the unweighted problem, each independent \(Y_j\) has limited “influence” on the \(X_i\) s. This is because either \(Y_j\) appears with coefficient 0, in which case it has no influence on the value \(X_i\) , or it appears with coefficient 1, in which case it has the same influence in the value of every such \(X_i\) . A threshold algorithm is therefore difficult to fool because unlike the tower instance, it is not possible to have a scenario where a \(Y_j\) is very large but our algorithm selects it within an \(X_i\) where it appears with a small coefficient \(\epsilon\) .
For readers familiar with the revenue-maximization result of Babaioff et al. [8] (i.e., the best of selling items individually and selling all the items together in a single bundle is a constant-factor approximation to optimal revenue), our result has a similar flavor, although the technical details are quite different. We decompose our problem instance into a “core” and a “tail” part. The tail consists of cases where any \(Y_j\) exceeds a boundary \(\tau\) ; the core, the rest. For the tail case, we show that approximating \({\mathbb {E}}[\max _j\lbrace Y_j\rbrace ]\) (the best individual item) suffices, and in the core case, we can approximate the best bundle \(X_i\) .
This argument will show that there is one fixed-threshold \(\tau _{\text{core}}\) such that the algorithm taking the first arrival above \(\tau _{\text{core}}\) achieves a constant approximation to the optimal core contribution to \(\max \lbrace X_i\rbrace\) , and similarly for \(\tau _{\text{tail}}\) and the tail part. There remains a corner case, where we show a fixed threshold equal to the boundary \(\tau\) gives a constant factor. Thus, for any given instance, one can select among \(\tau , \tau _{\text{core}}, \tau _{\text{tail}}\) to get a fixed-threshold constant-factor approximation algorithm.

A.1 Notation and Proof Overview

We first choose a real number \(\tau\) representing a boundary. Let \(p_j = \Pr [Y_j \gt \tau ]\) . We set \(\tau\) such that \(\prod _j (1-p_j) =1/2\) —that is, with probability one half, all \(Y_j\) are at most \(\tau\) . We let A be the (random) set of indices of all “heavy” variables: \(A := \lbrace j : Y_j \gt \tau \rbrace\) .
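As an aside, such a boundary \(\tau\) is easy to compute from the marginal distributions, since \(\prod _j \Pr [Y_j \le \tau ]\) is nondecreasing in \(\tau\) . The following is a minimal bisection sketch in Python; the CDF list, the initial bracket, and the assumption that the distributions are continuous (so the product attains 1/2 exactly) are assumptions of this sketch.

```python
def find_boundary(cdfs, lo=0.0, hi=1e12, iters=100):
    """Bisection for tau with prod_j Pr[Y_j <= tau] = 1/2.
    `cdfs` is a list of CDF functions F_j(t) = Pr[Y_j <= t];
    [lo, hi] is assumed to bracket the answer."""
    def all_small_prob(tau):
        p = 1.0
        for F in cdfs:
            p *= F(tau)
        return p
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if all_small_prob(mid) < 0.5:
            lo = mid   # tau too small: some Y_j exceeds it too often
        else:
            hi = mid
    return hi
```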
Recall that for each \(X_i\) , the set of active indices is \(S_i\) , so we have \(X_i = \sum _{j \in S_i} Y_j\) . We first upper-bound \({\rm\small OPT}:= \max _i X_i\) by contributions from the core event that \(A = \emptyset\) (all \(Y_j\) are small) and from the remaining tail event.
Claim A.1.
\begin{equation*} {\mathbb {E}}[{\rm\small OPT}] \le {\mathbb {E}}[\max _{i} X_i \mid A=\emptyset ] + \sum _{j} p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ] . \end{equation*}
Proof.
For any outcome of \(Y_j\) s, we separately count those in A (larger than \(\tau\) ) and the rest, and relax the objective to always take the large ones:
\begin{align*} \max _i X_i \quad = \quad \max _{i} \left\lbrace \sum _{j \in S_i}Y_j \right\rbrace \quad \le \quad \max _{i} \left\lbrace \sum _{j \in S_i \cap \overline{A}}Y_j \right\rbrace + \sum _{j\in A} Y_j. \end{align*}
Now taking expectations on both sides,
\begin{align*} {\mathbb {E}}[{\rm\small OPT}] \quad = \quad {\mathbb {E}}[\max _i X_i ] \quad & \le \quad {\mathbb {E}}\left[\max _{i} \left\lbrace \sum _{j \in S_i \cap \overline{A}}Y_j \right\rbrace \right]+ {\mathbb {E}}\left[\sum _{j\in A} Y_j\right]\\ & \le \quad {\mathbb {E}}[\max _{i} X_i \mid A = \emptyset ] + \sum _{j} p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ]. \end{align*}
To justify that \({\mathbb {E}}[\max _{i} \lbrace \sum _{j \in S_i \cap \overline{A}}Y_j \rbrace ] \le {\mathbb {E}}[\max _{i} X_i \mid A = \emptyset ]\) , use a coupling argument: first draw all \(Y_j\) from their initial distributions and consider the value of \(\max _i \lbrace \sum _{j \in S_i \cap \overline{A}} Y_j\rbrace\) . Now take any variables \(Y_j \gt \tau\) , and redraw them until they fall below \(\tau\) . The value of the inner sum can only increase, but now we are exactly obtaining \({\mathbb {E}}[\max _i X_i \mid A = \emptyset ]\) . □
Given that Claim A.1 upper-bounds \({\mathbb {E}}[{\rm\small OPT}]={\mathbb {E}}[ \max _i X_i]\) by the sum of two terms, our proof goes in two steps. First, we approximate the tail contributions \(\sum _{j} p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ]\) . In Lemma A.2, we use Augmentation Lemma 3.2 to give a simple fixed-threshold- \(\tau _{\text{tail}}\) algorithm with expected value \(\Omega (\max \lbrace Y_j\rbrace)\) . This is a bit surprising because the lower bound in Section 4.3 actually proves such a result is not possible for weighted linear correlations. In Claim A.3, we show that \(\Omega (\max \lbrace Y_j\rbrace)\) suffices to capture the tail term.
To capture the core contributions \({\mathbb {E}}[\max _{i} X_i \mid A=\emptyset ]\) , in Claim A.5 we argue that for all instances where \(Y_j\) s are bounded by \(\tau\) , we can use concentration of XOS functions (or fractional subadditive functions) [46] to argue that \(\Pr [\max _i X_i \gt 1/2 \cdot {\mathbb {E}}[\max _i X_i]]\) is at least a constant, and hence a simple fixed-threshold- \(\tau _{\text{core}}\) algorithm suffices. This second step also holds for prophets with weighted linear correlations. There is also a corner case where \(\tau\) is too large to apply concentration. But in this case, setting a threshold \(\tau\) will directly achieve a constant factor.

A.2 Proof

Lemma A.2.
For the prophet inequality problem with unweighted linear correlations, there exists a fixed-threshold algorithm with expected value \(\Omega (\max \lbrace Y_j\rbrace)\) .
Proof.
Define \(Z_i\) to be the sum of \(Y_j\) s that appear in \(X_i\) and have not appeared in any \(X_{i^{\prime }}\) for \(i^{\prime } \lt i\) . Since every \(Y_j\) appears in some \(Z_i\) , we know \(\max \lbrace Z_i\rbrace \ge \max \lbrace Y_j\rbrace\) . Augmentation Lemma 3.2 now completes the proof. □
Now we argue that \({\mathbb {E}}[\max _j Y_j]\) takes care of the second term in Claim A.1.
Claim A.3.
\begin{equation*} {\mathbb {E}}[\max _j Y_j] \ge \frac{1}{2} \cdot \sum _{j} p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ]. \end{equation*}
Proof.
We have
\begin{align*} {\mathbb {E}}[\max _j Y_j] ~~\ge ~~ \sum _{j} \Pr [A = \lbrace j\rbrace ] \cdot {\mathbb {E}}[\max _j Y_j \mid A = \lbrace j\rbrace ] ~~\ge ~~ \frac{1}{2}\sum _j {p_j} \cdot {\mathbb {E}}[ Y_j \mid A = \lbrace j\rbrace ], \end{align*}
where the second inequality uses independence to get \(\Pr [A = \lbrace j\rbrace ] = \Pr [Y_j \gt \tau ] \cdot \Pr [Y_{j^{\prime }} \le \tau (\forall j^{\prime } \ne j)] \ge (p_j)\left(\frac{1}{2}\right)\) and the fact that \(\max _j Y_j\) given \(A = \lbrace j\rbrace\) equals \(Y_j\) . Finally, by independence, \({\mathbb {E}}[ Y_j \mid A = \lbrace j\rbrace ] = {\mathbb {E}}[Y_j \mid Y_j \gt \tau ]\) , which gives the claim. □
Corollary A.4.
For any instance with unweighted linear correlations, there exists \(\tau _{\text{tail}}\) such that the algorithm setting a fixed threshold of \(\tau _{\text{tail}}\) obtains \({\mathbb {E}}[\text{ALG}] \ge \Omega (\sum _j p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ])\) .
We now turn to the core portion of contributions to \({\rm\small OPT}\) .
Claim A.5.
Let \(V = {\mathbb {E}}[\max _i X_i \mid A=\emptyset ]\) . If the boundary satisfies \(\tau \le V /10\) , there exists \(\tau _{\text{core}}\) such that the algorithm setting a fixed threshold of \(\tau _{\text{core}}\) obtains \({\mathbb {E}}[\text{ALG}] \ge \Omega (V)\) .
Proof.
Let \(Y_j^{\prime }\) be a copy of \(Y_j\) conditioned on falling into the range \([0,\tau ]\) . Let \(X_i^{\prime } = \sum _{j \in S_i} Y_j^{\prime }\) , and let \(W = \max _i X_i^{\prime }\) . We have \({\mathbb {E}}[\max _i X_i \mid A=\emptyset ] = {\mathbb {E}}[W] = V\) .
Now, note that W is an XOS function of the independent \(Y_j^{\prime }\) variables (meaning it is a maximum of weighted combinations). Thus, we can apply the concentration of XOS functions (more generally, of self-bounding functions; e.g., see [46]) to get
\begin{equation*} \Pr [W \lt (1 - \delta)\cdot {\mathbb {E}}[W]] \le \exp \Big (-\delta ^2 \cdot \frac{{\mathbb {E}}[W]}{2\tau } \Big). \end{equation*}
In particular, for \(\delta =1/2,\) we get
\begin{equation*} \Pr \Big [W \lt \frac{1}{2} {\mathbb {E}}[W] \Big ] \quad \le \quad \exp \Big (-\frac{{\mathbb {E}}[W]}{8\tau } \Big) \quad \le \gamma , \end{equation*}
for some constant \(\gamma \lt 1\) , using that \(\tau \le {\mathbb {E}}[W] / 10\) . Hence, the expected value of an algorithm that sets a threshold of \(\frac{1}{2} {\mathbb {E}}[W]\) is at least
\begin{equation*} \Pr \Big [W \ge \frac{1}{2} {\mathbb {E}}[W] \Big ] \cdot \frac{1}{2} {\mathbb {E}}[W] \ge \Omega ({\mathbb {E}}[W]). □ \end{equation*}
Now we have all the tools to prove our main theorem.
Proof of Theorem 2.4
By Claim A.1, one of the following two terms is at least \({\mathbb {E}}[{\rm\small OPT}]/2\) : \(V := {\mathbb {E}}[\max _i X_i \mid A=\emptyset ]\) or \(\sum _j p_j \cdot {\mathbb {E}}[Y_j \mid Y_j \gt \tau ]\) . Suppose it is the latter. Then by Corollary A.4, setting a fixed threshold of \(\tau _{\text{tail}}\) gives a constant-factor approximation.
So suppose we have \(V \ge {\mathbb {E}}[{\rm\small OPT}]/2\) . If \(\tau \gt V/10\) , then we can set a fixed threshold of \(\tau\) : with probability at least \(\frac{1}{2}\) , some \(X_i \ge \tau\) (because some \(Y_j \ge \tau\) ), so we obtain performance at least \(\frac{1}{2}\tau \ge \frac{1}{40}{\mathbb {E}}[{\rm\small OPT}]\) .
Finally, if \(V \ge {\mathbb {E}}[{\rm\small OPT}]/2\) and \(\tau \le V/10\) , then by Claim A.5, setting a fixed threshold of \(\tau _{\text{core}}\) gives a constant-factor approximation. □

B Negatively Associated Values

In this section, we show a 2-approximation ratio for negatively associated random values, a property that is known to imply negative correlation6 [32]. Formally, we say \(\lbrace X_j\rbrace\) are negatively associated if for all monotone increasing functions \(f,g\) and disjoint subsets \(S,S^{\prime } \subseteq \lbrace 1,\dots ,n\rbrace\) , and for all \(a,b \in \mathbb {R}\) , we have
\begin{equation*} \Pr [ f(X_j : j \in S) \ge a ] \le \Pr [ f(X_j : j \in S) \ge a \mid g(X_j : j \in S^{\prime }) \le b ] . \end{equation*}
Let \(\tau = \frac{1}{2}{\mathbb {E}}[\max _i X_i],\) and let \(P = \Pr [\max _i X_i \ge \tau ]\) . We can write down precisely the usual prophet proof with just one line requiring additional justification:
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] &= P \cdot \tau + \sum _{i=1}^n \Pr [X_{i^{\prime }}\lt \tau (\forall i^{\prime }\lt i)] \cdot {\mathbb {E}}\left[(X_i - \tau)^+ \mid X_{i^{\prime }} \lt \tau (\forall i^{\prime } \lt i)\right] \\ & \ge P\cdot \tau + (1-P) \cdot \sum _{i=1}^n {\mathbb {E}}\left[(X_i-\tau)^+ \mid X_{i^{\prime }} \lt \tau (\forall i^{\prime }\lt i)\right] \\ & \ge P\cdot \tau + (1-P) \cdot \sum _{i=1}^n {\mathbb {E}}\left[(X_i -\tau)^+\right] , \end{align*}
where the last inequality uses negative association, as \((X_i-\tau)^+\) is a monotone function, as is \(\max _{i^{\prime }\lt i} X_{i^{\prime }}\) . We can also use weaker notions of negative correlation for the last inequality (e.g., NLODS as in Section 2 in the work of Rinott and Samuel-Cahn [40]). Now repeating the standard prophet inequality analysis,
\begin{equation*} \sum _{i=1}^n {\mathbb {E}}[(X_i -\tau)^+ ] \quad \ge \quad {\mathbb {E}}\Big [ \sum _{i=1}^n (X_i -\tau)^+\Big ] \quad \ge \quad {\mathbb {E}}\Big [ \max _i X_i -\tau \Big ] \quad = \quad \tau . \end{equation*}
Thus, \({\mathbb {E}}[\text{ALG}_{\tau }] \ge P\cdot \tau + (1-P) \cdot \tau = \tau .\)
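To illustrate, here is a minimal Python sketch of this fixed-threshold rule; the Monte Carlo estimate of \(\tau\) and the joint-sampler interface are assumptions of the sketch (the analysis above uses the exact expectation).

```python
import random

def estimate_tau(sample_joint, trials=10000, seed=0):
    """Monte Carlo estimate of tau = (1/2) * E[max_i X_i].
    `sample_joint(rng)` draws one joint realization (a list of values);
    this assumes sampling access to the joint distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(sample_joint(rng))
    return 0.5 * total / trials

def threshold_stop(xs, tau):
    """Take the first arrival with value at least tau; else take nothing."""
    for x in xs:
        if x >= tau:
            return x
    return 0.0
```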

C Bounded \(s_{\text{col}}\) and Small Cardinality Constraint

For the r-uniform matroid problem with bounded \(s_{\text{col}}\) , we have shown in Section 5.1 a \((1+o(1))\) -approximation as r tends to infinity. Here, we complement that result with an approximation ratio for all r that smoothly interpolates between \(O\left({s_{\text{col}}}\right)\) for \(r=1\) (the classic single-item setting) and \(O(1)\) for \(r \ge s_{\text{col}}\) .
The approach is an extension of our algorithm for bounded column sparsity in the \(r=1\) case.
Theorem C.1.
For any \(s_{\text{col}}, r\) , the linearly correlated prophets problem with column sparsity \(s_{\text{col}}\) and cardinality constraint r admits an approximation ratio of \({2e^2} \cdot \max \lbrace 1, \frac{s_{\text{col}}}{r}\rbrace\) .
In other words, as \(r=2,3,\ldots ,s_{\text{col}}\) , the fraction of the prophet’s value that the algorithm guarantees improves to a constant times \(\frac{2}{s_{\text{col}}}, \frac{3}{s_{\text{col}}}, \ldots , 1\) .
Proof.
Let there be r sets (“buckets”) \(B_1,\dots ,B_r\) . If \(r \lt s_{\text{col}}\) , let there also be a “discard pile” \(B_0\) . Let \(c = \max \lbrace r,s_{\text{col}}\rbrace\) .
For each \(X_i\) , place it in a bucket \(j \in \lbrace 1,\dots ,r\rbrace ,\) each chosen with probability \(\frac{1}{c}\) . If \(r \lt s_{\text{col}}\) , then with the remaining probability of \(\frac{s_{\text{col}}-r}{s_{\text{col}}}\) , place \(X_i\) in the discard bucket \(B_0\) .
For each bucket \(j=1,\dots ,r\) , give the bucket a cardinality constraint of one item and run the following algorithm (based on the \(r=1\) case). Let \(S_j = \lbrace i : X_i \in B_j\rbrace\) , and note that these sets are disjoint for different j. When a variable \(X_i\) arrives, send it to the algorithm for its bucket, or if it is in \(B_0\) , discard \(X_i\) and continue. In bucket j, we run an inclusion-threshold algorithm on \(S_j\) with threshold \(\tau _j\) defined as follows. Assign each \(Y_{j^{\prime }}\) to the first \(X_i \in B_j\) that includes it—that is, let \(T_i = \lbrace j^{\prime } : A_{ij^{\prime }} \gt 0 \text{ and } A_{i^{\prime }j^{\prime }} = 0 \ (\forall i^{\prime } \lt i, i^{\prime } \in S_j)\rbrace\) . Let \(Z_i = \sum _{j^{\prime } \in T_i} A_{ij^{\prime }} Y_{j^{\prime }}\) , and let \(\tau _j = \frac{1}{2}{\mathbb {E}}[ \max _{i\in S_j} Z_i]\) .
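The bucketing and the construction of the sets \(T_i\) can be sketched as follows (a minimal Python sketch; the function names are illustrative, and the threshold rule itself is as in the \(r=1\) algorithm).

```python
import random

def assign_buckets(n, r, s_col, seed=0):
    """Send each X_i to one of buckets 1..r with probability 1/c each,
    where c = max(r, s_col); with the remaining probability, send it
    to the discard pile (bucket 0)."""
    rng = random.Random(seed)
    c = max(r, s_col)
    buckets = []
    for _ in range(n):
        u = rng.random()
        buckets.append(int(u * c) + 1 if u < r / c else 0)
    return buckets

def fresh_components(A, buckets):
    """T_i: the columns j' with A[i][j'] > 0 that appear in no earlier
    row assigned to the same bucket; Z_i is then the sum of
    A[i][j'] * Y_j' over j' in T_i.  Rows in bucket 0 are discarded
    by the caller, so their T_i values are irrelevant."""
    first_user = {}  # (bucket, column) -> first row index using it
    T = []
    for i, row in enumerate(A):
        b = buckets[i]
        T_i = []
        for jp, coeff in enumerate(row):
            if coeff > 0 and (b, jp) not in first_user:
                first_user[(b, jp)] = i
                T_i.append(jp)
        T.append(T_i)
    return T
```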
Claim C.2.
For each bucket \(j \in \lbrace 1,\dots ,r\rbrace\) , the expected value selected by the algorithm is at least \(\frac{1}{2e} \cdot {\mathbb {E}}[\max _{i\in S_{j}} X_i]\) .
Proof.
By construction, each \(Z_i\) is independent of all previous \(X_i\) in the same bucket, so by Augmentation Lemma 3.2, bucket j obtains expected reward at least \(\frac{1}{2}{\mathbb {E}}[ \max _{i \in S_j} Z_i]\) with randomness over the variables.
For each \(Y_{j^{\prime }}\) with \(A_{ij^{\prime }} \gt 0\) , we claim \(\Pr [j^{\prime } \in T_i] \ge \frac{1}{e}\) : there are at most \(s_{\text{col}}-1\) other variables \(X_{i^{\prime }}\) that include \(Y_{j^{\prime }}\) , and each misses bucket j with probability at least \(1-\frac{1}{s_{\text{col}}}\) , so they all miss bucket j with probability at least \((1-\frac{1}{s_{\text{col}}})^{s_{\text{col}}-1} \ge \frac{1}{e}\) . In this case, we must have \(j^{\prime } \in T_i\) . So for each fixed \(X_i\) , we have, in expectation over only the bucket assignments and the construction of \(S_j\) ,
\begin{align*} {\mathbb {E}}[ Z_i] \quad = \quad \sum _{j^{\prime }} \Pr [j^{\prime } \in T_i]\cdot A_{ij^{\prime }}Y_{j^{\prime }} \quad \ge \quad \frac{1}{e} \sum _{j^{\prime }} A_{ij^{\prime }} Y_{j^{\prime }} \quad = \quad \frac{1}{e} X_i . \end{align*}
Combining these facts, each bucket j obtains expected reward at least \(\frac{1}{2e}{\mathbb {E}}[ \max _{i \in S_j} X_i]\) . □
Write \({\mathbb {E}}_B\) for an expectation taken over the bucketing and \({\mathbb {E}}_{{{\bf X}}}\) for expectation over the realizations of the variables. The preceding gives that for every set of variable realizations, \({\mathbb {E}}_B[\text{ALG}] \ge \frac{1}{2e}\sum _{j=1}^r {\mathbb {E}}_B [\max _{i \in S_j} X_i]\) . By linearity of expectation and symmetry of the buckets, we have
\begin{align} {\mathbb {E}}_{{{\bf X}},B} [\text{ALG}] &\ge \frac{r}{2e} {\mathbb {E}}_{{{\bf X}},B} \big [ \max _{i \in S_1}X_i \big ]. \tag{3} \end{align}
Claim C.3.
\({\mathbb {E}}_{{{\bf X}},B} [ \max _{i \in S_1}X_i ] \ge \frac{1}{e\cdot c} \cdot {\mathbb {E}}_{{{\bf X}}}[{\rm\small OPT}]\) , where \(c = \max \lbrace r,s_{\text{col}}\rbrace\) .
Proof.
Let the random variable \(X^{(i)}\) equal the ith-largest realized variable—that is, in particular,
\begin{align*} {\rm\small OPT}&= \sum _{i=1}^r X^{(i)} . \end{align*}
Recall that any fixed variable falls into bucket \(B_1\) with probability \(\frac{1}{c}\) . Fixing realizations of \(X^{(1)},\dots ,X^{(r)}\) , we have with probability taken only over the buckets,
\begin{align*} {\mathbb {E}}_B \big [\max _{i\in S_1} X_i \big ] &\ge \sum _{i=1}^r \Pr \left[X^{(i)} \in B_1 \text{ and } X^{(i^{\prime })} \not\in B_1 (\forall i^{\prime }\lt i)\right] X^{(i)} \\ &= \sum _{i=1}^r \frac{1}{c}\left(1-\frac{1}{c}\right)^{i-1} X^{(i)} \quad \ge \quad \frac{1}{e \cdot c} \sum _{i=1}^r X^{(i)} \quad = \quad \frac{1}{e \cdot c} {\rm\small OPT}. \end{align*}
Now taking an expectation on both sides over the realizations of \({{\bf X}}\) proves the claim. □
Finally, combine Claim C.3 with Inequality (3) to get
\begin{align*} {\mathbb {E}}[\text{ALG}] \quad \ge \quad \frac{r}{2 e} {\mathbb {E}}_{{{\bf X}}} {\mathbb {E}}_B \big [ \max _{i \in S_1} X_i \big ] \quad \ge \quad \frac{r}{2e^2 c} {\mathbb {E}}_{{{\bf X}}} [ {\rm\small OPT}] , \end{align*}
which proves Theorem C.1. □

D Missing Proofs

D.1 Missing Proofs from Section 2

Lemma 2.1.
In the linear correlations model, even for \(s_{\text{row}}=s_{\text{col}}=2\) , there exist instances where every fixed-threshold algorithm \(\text{ALG}_{\tau }\) has an approximation ratio at least \(\Omega ({n})\) (even when \(\tau\) is a random threshold).
Proof of Lemma 2.1
We consider the \(s_{\text{row}}=s_{\text{col}}=2\) tower instance (Example 2.3), with sufficiently small \(\epsilon\) chosen later. We claim that \({\mathbb {E}}[\max _i X_i] = \Omega (n)\) while any fixed-threshold- \(\tau\) algorithm has \({\mathbb {E}}[\text{ALG}_{\tau }] \le 3\) .
First, we bound \({\mathbb {E}}[\text{ALG}_{\tau }]\) . Let \(p_j = \Pr [\text{$\text{ALG}_{\tau }$ takes $X_j$ and $Y_j$ is active}]\) . In this case, the algorithm’s reward includes \(Y_j = \frac{1}{\epsilon ^j}\) with coefficient 1. Let \(p_j^{\prime } = \Pr [\text{$\text{ALG}_{\tau }$ takes $X_{j-1}$ and $Y_j$ is active}]\) . In this case, its reward includes \(Y_j = \frac{1}{\epsilon ^j}\) with coefficient \(\epsilon\) . We note that \(p_j,p_j^{\prime } \le \Pr [\text{$Y_j$ is active}] = \epsilon ^j\) . By summing over \(Y_1,\dots ,Y_n\) , we have
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] \quad = \quad \sum _{j=1}^n \Big (p_j \frac{1}{\epsilon ^j} + (p_j^{\prime }) (\epsilon) \frac{1}{\epsilon ^j} \Big) \quad \le \quad \sum _{j=1}^n \Big (p_j \frac{1}{\epsilon ^j} + \epsilon \Big) \quad =\quad n\epsilon + \sum _{j=1}^n p_j \frac{1}{\epsilon ^j} . \end{align*}
We argue that \(p_j = 0\) for all but at most two terms. Let \(j^*\) satisfy \({\epsilon ^{-(j^*-1)}} \lt \tau \le {\epsilon ^{-j^*}}\) , or \(j^* = 1\) if \(\tau \le \frac{1}{\epsilon }\) . We claim \(p_j = 0\) if \(j \le j^*-2\) : assuming \(\epsilon \lt \frac{1}{2}\) , we have
\begin{equation*} X_j \quad \le \quad \frac{2}{\epsilon ^j} \quad \le \quad \frac{1}{\epsilon ^{j+1}} \quad \lt \quad \tau , \end{equation*}
so \(X_j\) is never taken. We also claim \(p_j = 0\) if \(j \ge j^*+1\) : if \(Y_j\) is active, then
\begin{equation*} X_{j-1} \quad \ge \quad \epsilon \frac{1}{\epsilon ^j} \quad = \quad \frac{1}{\epsilon ^{j-1}} \quad \ge \quad \tau , \end{equation*}
so \(X_{j-1}\) is taken and \(X_j\) is not. So we have \(p_j = 0\) unless \(j \in \lbrace j^*-1, j^*\rbrace\) , in which case \(p_j \le \epsilon ^j\) . So
\begin{align*} {\mathbb {E}}[\text{ALG}_{\tau }] \quad \le \quad n\epsilon + \sum _{j=1}^n p_j \frac{1}{\epsilon ^j} \quad \le \quad n\epsilon + 2 \quad \le \quad 3 \end{align*}
for \(\epsilon \le \frac{1}{n}\) .
For the benchmark, since \(\max _i\lbrace X_i \rbrace \ge \max _j\lbrace Y_j\rbrace\) , it suffices to show \({\mathbb {E}}[\max _j Y_j] = \Omega (n)\) . Now,
\begin{align*} {\mathbb {E}}[\max _j\lbrace Y_j\rbrace ] &= \sum _{i=1}^n \Pr [Y_j = 0 ~ (\forall j \gt i)] \cdot \Pr [Y_i \ne 0] \cdot \Big (\frac{1}{\epsilon ^i}\Big) \\ & \ge \Pr [Y_j = 0 ~ (\forall j)] \cdot \sum _{i=1}^n \Pr [Y_i \ne 0] \cdot \Big (\frac{1}{\epsilon ^i}\Big). \end{align*}
Since \(\Pr [Y_i \ne 0] = \epsilon ^i\) , we get
\begin{align*} {\mathbb {E}}[\max _j\lbrace Y_j\rbrace ] \quad &\ge \quad \Pr [Y_j = 0 ~ (\forall j)] \cdot n \\ &\ge \quad \big (1 - \sum _j \Pr [Y_j \ne 0] \big) \cdot n \quad \ge \quad \left(1 - n\epsilon \right) \cdot n \quad \ge \quad {n}/{2} \end{align*}
for any choice of \(\epsilon \le \frac{1}{2n}\) . This gives an approximation ratio of at least \(\frac{n/2}{3} = \frac{n}{6}\) for \(\epsilon \le \frac{1}{2n}\) . Randomization over threshold \(\tau\) does not help since that is only a convex combination of deterministically chosen thresholds. □

D.2 Missing Proofs from Section 5.3

Proof of Claim 5.5
First, consider supplementing \({\rm\small OPT}^{\prime }\) by including tiny elements below \(\tau _c\) when there is room—that is, let \(Z^{(i)\prime \prime } = Z^{(i)} \mathbf {1} [Z^{(i)} \lt \tau _0]\) and consider \({\rm\small OPT}_2^{\prime \prime } = \sum _{j=1}^r Z^{(j)\prime \prime }\) . Let \({\rm\small OPT}^{\prime \prime } = {\rm\small OPT}_1^{\prime } + {\rm\small OPT}_2^{\prime \prime }\) . Pointwise, \({\rm\small OPT}^{\prime \prime }\) differs from \({\rm\small OPT}^{\prime }\) by at most r elements, each at most \(\tau _c \le \frac{\epsilon }{r} {\mathbb {E}}[{\rm\small OPT}]\) . This proves that \({\mathbb {E}}[{\rm\small OPT}^{\prime }] \ge {\mathbb {E}}[{\rm\small OPT}^{\prime \prime }] - \epsilon {\mathbb {E}}[{\rm\small OPT}]\) . We next prove that \({\mathbb {E}}[{\rm\small OPT}^{\prime \prime }] \ge (1-\epsilon){\mathbb {E}}[{\rm\small OPT}]\) , which completes the proof of the claim.
Let H be the event there exists a heavy element—that is, \(\max _i Z_i \ge \tau _0 = \mathbb {E}[{\rm\small OPT}]/\epsilon\) . Let \(p = \Pr [H]\) . So,
\begin{align} {\mathbb {E}}[{\rm\small OPT}^{\prime \prime }] &= p \cdot {\mathbb {E}}[{\rm\small OPT}^{\prime \prime } \mid H] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}^{\prime \prime } \mid \lnot H] \nonumber \\ &= p \cdot {\mathbb {E}}[{\rm\small OPT}^{\prime \prime } \mid H] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}\mid \lnot H] \nonumber \\ &\ge p \cdot {\mathbb {E}}[Z^{(1)} \mid H] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}\mid \lnot H] . \tag{4} \end{align}
Now, we claim \({\mathbb {E}}[{\rm\small OPT}\mid H] \le {\mathbb {E}}[Z^{(1)} \mid H] + {\mathbb {E}}[{\rm\small OPT}]\) . Indeed, let \(M_i\) be the event that \(i = \arg \max _{i^{\prime }} Z_{i^{\prime }}\) and \(Z_i \ge \tau _0\) . Let \({\rm\small OPT}_{-i}\) be the sum of the largest \(r-1\) elements excluding \(Z_i\) . Note that conditioning on all others lying below \(Z_i\) , for any fixed \(Z_i\) , only decreases \({\rm\small OPT}_{-i}\) , as the variables are independent:
\begin{align*} {\mathbb {E}}[{\rm\small OPT}\mid H] &= {\mathbb {E}}[Z^{(1)} \mid H] + {\sum _{i=1}^n } \Pr [M_i \mid H] \cdot {\mathbb {E}}[{\rm\small OPT}_{-i} \mid M_i] \\ &\le {\mathbb {E}}[Z^{(1)} \mid H] + {\sum _{i=1}^n } \Pr [M_i \mid H] \cdot {\mathbb {E}}[{\rm\small OPT}_{-i}] \\ &\le {\mathbb {E}}[Z^{(1)} \mid H] + {\sum _{i=1}^n } \Pr [M_i \mid H] \cdot {\mathbb {E}}[{\rm\small OPT}] \quad = \quad {\mathbb {E}}[Z^{(1)} \mid H] + {\mathbb {E}}[{\rm\small OPT}]. \end{align*}
Using this,
\begin{align*} {\mathbb {E}}[{\rm\small OPT}] &= p \cdot {\mathbb {E}}[{\rm\small OPT}\mid H] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}\mid \lnot H] \\ &\le p \cdot {\mathbb {E}}[Z^{(1)} \mid H] + p \cdot {\mathbb {E}}[{\rm\small OPT}] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}\mid \lnot H] . \end{align*}
This implies
\begin{align*} (1-p)\cdot {\mathbb {E}}[{\rm\small OPT}] ~~\le ~~ p \cdot {\mathbb {E}}[ Z^{(1)} \mid H] + (1-p) \cdot {\mathbb {E}}[{\rm\small OPT}\mid \lnot H] . \end{align*}
Combining with Inequality (4) gives \({\mathbb {E}}[{\rm\small OPT}^{\prime \prime }] \ge (1-p){\mathbb {E}}[{\rm\small OPT}]\) . We have \(p \le \epsilon\) by Markov’s inequality: \(p = \Pr [\max _i Z_i \ge \tau _0] \le {\mathbb {E}}[\max _i Z_i]/\tau _0 \le {\mathbb {E}}[{\rm\small OPT}]/\tau _0 = \epsilon\) . This completes the proof. □
Proof of Lemma 5.7
Suppose \(\epsilon \le 0.5\) ; otherwise, \(1-2\epsilon \le 0\) and the lemma is immediate. Let \(\delta := \frac{1 + c \cdot \beta }{r}\) .
Claim D.1.
\(\epsilon \ge 2\delta\) .
Proof.
Using that \(\epsilon \ge r^{-1/4}\) , we have \(c := \lceil \frac{1}{\epsilon } \ln \frac{r}{\epsilon ^2} \rceil \le \frac{3}{\epsilon } \ln (r)\) . Further using that \(\epsilon \ge 3r^{-1/4}\) , this implies \(\ln \frac{c}{\epsilon } \le \ln (r^{1/2} \ln (r)) \le \ln (r)\) . Therefore, \(\beta := 3 {\sqrt {r \ln \tfrac{c}{\epsilon }}} \le 3 \sqrt {r \ln (r)}\) . Then, \(c \cdot \beta \le (\frac{3}{\epsilon }\ln (r)) (3 \sqrt {r \ln (r)}) \le \frac{9 \sqrt {r} (\ln r)^{3/2}}{\epsilon }\) . Using that \(\epsilon \ge 9r^{-1/4}(\ln r)^{3/2}\) , this gives \(c \cdot \beta \le r^{3/4}\) , so \(1 + c \cdot \beta \le 2r^{3/4}\) , so \(\delta \le 2 r^{-1/4} \le \epsilon / 2\) . □
Let K be the number of arrivals taken by \(\text{ALG}\) , a random variable.
Claim D.2.
With at least \(1-\epsilon\) probability, \(K \lt r\) (i.e., \(\text{ALG}\) does not reach its cardinality constraint).
Proof.
The number of arrivals taken by \(\text{ALG}^{\prime }\) is at most \(\sum _{j=0}^c \tilde{r}_j = 1 + c \cdot \beta + \sum _{j=1}^c r_j\) . Because \({\rm\small OPT}\) takes at most r arrivals pointwise, and thus in expectation, we have \(\sum _{j=1}^c r_j \le r\) . So \(\text{ALG}^{\prime }\) takes at most, in the worst case, \(K^{\prime } = r + 1 + c \cdot \beta = r(1+\delta)\) arrivals. Because \(\text{ALG}\) keeps each independently with probability \(1-\epsilon \le 1-2\delta \lt (1-\delta)^2\) , the chance it reaches r is upper-bounded by the chance that a Binomial( \(K^{\prime },(1-\delta)^2\) ) variable exceeds r. This is upper-bounded by the chance it exceeds \(K^{\prime }(1-\delta)^2(1+\delta) = r(1-\delta)^2(1+\delta)^2 = r(1-\delta ^2)^2 \lt r\) . So by a Chernoff bound,
\begin{align*} \Pr [K \ge r] \quad \le \quad \Pr [\text{Binom}(K^{\prime },(1-\delta)^2) \ge K^{\prime }(1-\delta)^2(1+\delta)] \quad \le \quad \exp \left(\frac{-\delta ^2 K^{\prime }(1-\delta)^2}{3}\right). \end{align*}
We have \(\delta \le \epsilon /2 \le 0.25\) , and \(K^{\prime } \ge r\) , so \(K^{\prime }(1-\delta)^2 \ge \frac{r}{2}\) . Additionally, \(\delta \ge \frac{c \cdot \beta }{r} \ge \frac{\beta }{r}\) .
\begin{align*} \Pr [K \ge r] &~\le ~ \exp \left(\frac{-\delta ^2 r}{6}\right) ~\le ~ \exp \left(\frac{-\beta ^2}{6 r}\right) ~\le ~ \exp \left(-\ln \frac{c}{\epsilon }\right) ~\le ~ \epsilon . □ \end{align*}
Now, each time \(\text{ALG}^{\prime }\) obtains some variable \(X_i\) , \(\text{ALG}\) also obtains it unless either it has reached its cardinality constraint or it independently discards \(X_i\) (with probability \(\epsilon\) ). By a union bound over these two events, when \(\text{ALG}^{\prime }\) obtains \(X_i\) , \(\text{ALG}\) also obtains it with probability at least \(1-2\epsilon\) . So, \({\mathbb {E}}[\text{ALG}] \ge (1-2\epsilon){\mathbb {E}}[\text{ALG}^{\prime }]\) , which proves Lemma 5.7. □

Acknowledgments

We are thankful to the anonymous reviewers of TEAC and EC 2020 for very helpful comments on improving the presentation of this article.

References

[1]
Melika Abolhasani, Soheil Ehsani, Hossein Esfandiari, MohammadTaghi Hajiaghayi, Robert Kleinberg, and Brendan Lucier. 2017. Beating 1-1/e for ordered prophets. In Proceedings of STOC. ACM, New York, NY, 61–71.
[2]
Shipra Agrawal, Yichuan Ding, Amin Saberi, and Yinyu Ye. 2012. Price of correlations in stochastic optimization. Operations Research 60, 1 (2012), 150–162.
[3]
Saeed Alaei. 2011. Bayesian combinatorial auctions: Expanding single buyer mechanisms to many buyers. In Proceedings of FOCS.
[4]
Saeed Alaei, MohammadTaghi Hajiaghayi, and Vahid Liaghat. 2012. Online prophet-inequality matching with applications to ad allocation. In Proceedings of EC. 18–35.
[5]
C. J. Argue, Anupam Gupta, Marco Molinaro, and Sahil Singla. 2022. Robust secretary and prophet algorithms for packing integer programs. In Proceedings of SODA. 1273–1297.
[6]
Pablo Daniel Azar, Robert Kleinberg, and S. Matthew Weinberg. 2014. Prophet inequalities with limited information. In Proceedings of SODA.
[7]
Yossi Azar, Ashish Chiplunkar, and Haim Kaplan. 2018. Prophet secretary: Surpassing the 1-1/e barrier. In Proceedings of EC.
[8]
Moshe Babaioff, Nicole Immorlica, Brendan Lucier, and S. Matthew Weinberg. 2014. A simple and approximately optimal mechanism for an additive buyer. In Proceedings of FOCS.
[9]
MohammadHossein Bateni, Sina Dehghani, MohammadTaghi Hajiaghayi, and Saeed Seddighin. 2015. Revenue maximization for selling multiple correlated items. In Algorithms—ESA 2015. Springer, 95–105.
[10]
Domagoj Bradac, Anupam Gupta, Sahil Singla, and Goran Zuzic. 2020. Robust algorithms for the secretary problem. In Proceedings of ITCS.
[11]
Yang Cai and Argyris Oikonomou. 2021. On simple mechanisms for dependent items. In Proceedings of EC. ACM, New York, NY, 242–262.
[12]
Moses Charikar, Jacob Steinhardt, and Gregory Valiant. 2017. Learning from untrusted data. In Proceedings of STOC.
[13]
Shuchi Chawla, Jason D. Hartline, David L. Malec, and Balasubramanian Sivan. 2010. Multi-parameter mechanism design and sequential posted pricing. In Proceedings of STOC.
[14]
Shuchi Chawla, David Malec, and Balasubramanian Sivan. 2015. The power of randomness in Bayesian optimal mechanism design. Games and Economic Behavior 91 (2015), 297–317.
[15]
José Correa and Andrés Cristi. 2023. A constant factor prophet inequality for online combinatorial auctions. In Proceedings of STOC. ACM, New York, NY.
[16]
José Correa, Patricio Foncea, Ruben Hoeksma, Tim Oosterwijk, and Tjark Vredeveld. 2017. Posted price mechanisms for a random stream of customers. In Proceedings of EC. 169–186.
[17]
José R. Correa, Paul Dütting, Felix A. Fischer, and Kevin Schewior. 2019. Prophet inequalities for I.I.D. random variables from an unknown distribution. In Proceedings of EC. 3–17.
[18]
José R. Correa, Raimundo Saona, and Bruno Ziliotto. 2019. Prophet secretary through blind strategies. In Proceedings of SODA. 1946–1961.
[19]
Ilias Diakonikolas. 2018. Tutorial: Algorithmic High-Dimensional Robust Statistics. (August 2018). Retrieved September 15, 2023 from http://www.iliasdiakonikolas.org/simons-tutorial-robust.html
[20]
Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2016. Robust estimators in high dimensions without the computational intractability. In Proceedings of FOCS.
[21]
Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2018. Robustly learning a Gaussian: Getting optimal error, efficiently. In Proceedings of SODA. 2683–2702.
[22]
Paul Dütting, Michal Feldman, Thomas Kesselheim, and Brendan Lucier. 2017. Prophet inequalities made easy: Stochastic optimization by pricing non-stochastic inputs. In Proceedings of FOCS. 540–551.
[23]
Paul Dütting and Thomas Kesselheim. 2019. Posted pricing and prophet inequalities with inaccurate priors. In Proceedings of EC. 111–129.
[24]
Paul Dütting, Thomas Kesselheim, and Brendan Lucier. 2020. An O(log log m) prophet inequality for subadditive combinatorial auctions. In Proceedings of FOCS. IEEE, Los Alamitos, CA, 306–317.
[25]
Soheil Ehsani, Mohammad Hajiaghayi, Thomas Kesselheim, and Sahil Singla. 2018. Prophet secretary for combinatorial auctions and matroids. In Proceedings of SODA.
[26]
Hossein Esfandiari, MohammadTaghi Hajiaghayi, Vahid Liaghat, and Morteza Monemizadeh. 2017. Prophet secretary. SIAM Journal on Discrete Mathematics 31, 3 (2017), 1685–1701.
[27]
Hossein Esfandiari, Nitish Korula, and Vahab Mirrokni. 2018. Allocation with traffic spikes: Mixing adversarial and stochastic models. ACM Transactions on Economics and Computation 6, 3-4 (2018), 14.
[28]
Michal Feldman, Nick Gravin, and Brendan Lucier. 2015. Combinatorial auctions via posted prices. In Proceedings of SODA. 123–135.
[29]
Moran Feldman, Ola Svensson, and Rico Zenklusen. 2016. Online contention resolution schemes. In Proceedings of SODA. 1014–1033.
[30]
MohammadTaghi Hajiaghayi, Robert D. Kleinberg, and Tuomas Sandholm. 2007. Automated online mechanism design and prophet inequalities. In Proceedings of AAAI.
[31]
Theodore P. Hill and Robert P. Kertz. 1992. A survey of prophet inequalities in optimal stopping theory. Contemporary Mathematics 125 (1992), 191–207.
[32]
Kumar Joag-Dev and Frank Proschan. 1983. Negative association of random variables with applications. Annals of Statistics 11, 1 (1983), 286–295.
[33]
Robert Kleinberg and S. Matthew Weinberg. 2012. Matroid prophet inequalities. In Proceedings of STOC. 123–136.
[34]
Ulrich Krengel and Louis Sucheston. 1978. On semiamarts, amarts, and processes with finite value. Advances in Probability and Related Topics 4 (1978), 197–266.
[35]
Kevin A. Lai, Anup B. Rao, and Santosh Vempala. 2016. Agnostic estimation of mean and covariance. In Proceedings of FOCS.
[36]
Euiwoong Lee and Sahil Singla. 2018. Optimal online contention resolution schemes via ex-ante prophet inequalities. In Proceedings of ESA.
[37]
Thodoris Lykouris, Vahab S. Mirrokni, and Renato Paes Leme. 2018. Stochastic bandits robust to adversarial corruptions. In Proceedings of STOC. 114–122.
[38]
Ankur Moitra. 2018. Robustness meets algorithms (invited talk). In Proceedings of SWAT. Article 3, 1 page.
[39]
Yosef Rinott and Ester Samuel-Cahn. 1991. Orderings of optimal stopping values and prophet inequalities for certain multivariate distributions. Journal of Multivariate Analysis 37, 1 (1991), 104–114.
[40]
Yosef Rinott and Ester Samuel-Cahn. 1992. Optimal stopping values and prophet inequalities for some dependent random variables. Lecture Notes–Monograph Series 22 (1992), 343–358.
[41]
Yosef Rinott and Ester Samuel-Cahn. 1987. Comparisons of optimal stopping values and prophet inequalities for negatively dependent random variables. Annals of Statistics 15, 4 (1987), 1482–1490.
[42]
Aviad Rubinstein. 2016. Beyond matroids: Secretary problem and prophet inequality with general constraints. In Proceedings of STOC. 324–332.
[43]
Aviad Rubinstein and Sahil Singla. 2017. Combinatorial prophet inequalities. In Proceedings of SODA. 1671–1687.
[44]
Aviad Rubinstein, Jack Z. Wang, and S. Matthew Weinberg. 2020. Optimal single-choice prophet inequalities from samples. In Proceedings of ITCS.
[45]
Ester Samuel-Cahn. 1984. Comparison of threshold stop rules and maximum for independent nonnegative random variables. Annals of Probability 12, 4 (1984), 1213–1216.
[46]
Jan Vondrák. 2010. A note on concentration of submodular functions. arXiv:1005.2791 (2010).
[47]
Qiqi Yan. 2011. Mechanism design via correlation gap. In Proceedings of SODA.
