Set cover, over a universe of size $n$, may be modelled as a
data-streaming problem, where the $m$ sets that comprise the instance
are to be read one by one. A semi-streaming algorithm is allowed only
$O(n \cdot \operatorname{poly}\{\log n, \log m\})$ space to process this stream. For each
$p \ge 1$, we give a very simple deterministic algorithm that makes $p$ passes
over the input stream and returns an
appropriately certified $(p+1)n^{1/(p+1)}$-approximation to the
optimum set cover. More importantly, we show that this
approximation factor is essentially tight: a factor
better than $0.99\,n^{1/(p+1)}/(p+1)^2$ is unachievable for a $p$-pass
semi-streaming algorithm, even one that uses randomisation. In particular,
this implies that achieving a $\Theta(\log n)$-approximation requires
$\Omega(\log n/\log\log n)$ passes, which is tight up to the
$\log\log n$ factor.
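To spell out this last implication, here is a short derivation; it is a sketch, with the symbols $q$ and $c$ introduced purely for illustration, logarithms taken to base 2, and $n$ assumed sufficiently large. Write $q = p+1$ and suppose some $p$-pass semi-streaming algorithm achieves a $c\log n$-approximation. The lower bound then forces

$$ \frac{0.99\,n^{1/q}}{q^2} \le c\log n \quad\Longrightarrow\quad \frac{\log n}{q} \le 2\log q + \log\log n + \log\frac{c}{0.99}. $$

If $q > \log n$ there is nothing to prove; otherwise $\log q \le \log\log n$, so

$$ q \ge \frac{\log n}{3\log\log n + \log(c/0.99)} = \Omega\!\left(\frac{\log n}{\log\log n}\right). $$

Conversely, taking $p+1 = \lceil \log n \rceil$ in the upper bound gives $n^{1/(p+1)} \le 2$, hence an approximation factor of at most $2\lceil \log n \rceil = O(\log n)$ using $O(\log n)$ passes.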
These results extend to a relaxation of the set cover problem where we
are allowed to leave an $\varepsilon$ fraction of the universe uncovered: the
tight bounds on the best approximation factor achievable in $p$ passes
turn out to be $\Theta_p(\min\{n^{1/(p+1)}, \varepsilon^{-1/p}\})$.
Our lower bounds are based on a construction of a family of high-rank
incidence geometries, which may be thought of as vast generalisations
of affine planes. This construction, based on algebraic techniques,
appears flexible enough to find other applications and is therefore
interesting in its own right.
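To make the upper bound stated above concrete, the following is a minimal sketch of one threshold-based greedy scheme whose cover size is at most $(p+1)n^{1/(p+1)}$ times the optimum. The abstract does not describe the algorithm, so the rule used here (in pass $j$, keep any set that covers at least $n^{1-j/(p+1)}$ still-uncovered elements, then patch leftovers with one remembered set per element) is an assumption and not necessarily the authors' method. The names `threshold_greedy_cover` and `stream_factory` are hypothetical, and the universe is assumed to be $\{0, \dots, n-1\}$.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

# One stream item: (set identifier, elements of that set).
StreamItem = Tuple[int, Iterable[int]]


def threshold_greedy_cover(
    n: int,
    stream_factory: Callable[[], Iterator[StreamItem]],
    p: int,
) -> List[int]:
    """Return ids of sets covering {0, ..., n-1} using p passes.

    Assumed rule (illustrative, not taken from the source): in pass j,
    pick a set as soon as it covers at least n^(1 - j/(p+1)) elements
    that are still uncovered.
    """
    if p < 1:
        raise ValueError("need at least one pass")

    uncovered: Set[int] = set(range(n))
    chosen: List[int] = []
    chosen_ids: Set[int] = set()
    # For each element, one set seen in the last pass that contains it.
    witness: Dict[int, int] = {}

    for j in range(1, p + 1):
        threshold = n ** (1 - j / (p + 1))
        # Calling stream_factory() again models a fresh pass over the input.
        for set_id, elements in stream_factory():
            gain = uncovered.intersection(elements)
            if not gain:
                continue
            if len(gain) >= threshold:
                chosen.append(set_id)
                chosen_ids.add(set_id)
                uncovered -= gain
            elif j == p:
                # Remember a fallback set for elements that may stay uncovered.
                for x in gain:
                    witness.setdefault(x, set_id)

    # Patch every element left uncovered with its remembered set.
    for x in sorted(uncovered):
        if x not in witness:
            raise ValueError(f"element {x} is contained in no set")
        if witness[x] not in chosen_ids:
            chosen.append(witness[x])
            chosen_ids.add(witness[x])
    return chosen
```

The working state is the uncovered set, the witness map, and the chosen ids, which is $O(n)$ words apart from the output. The factor follows because each pass adds at most $n^{1/(p+1)}$ times the optimum number of sets, and so does the final patching step. Each call to `stream_factory()` is expected to yield the same sets in the same order.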