Advanced Algorithms Course. Lecture Notes. Part 9: 3-SAT: How To Satisfy Most Clauses
clauses), we may apply some simple randomized algorithm and show that
the desired structure is produced with some positive probability. Hence this
structure must exist. Of course, the approach does not work for any such
problem (due to lack of a simple randomized algorithm), and it proves only
the existence of the thing we are looking for, but it does not say how we can
find it efficiently. These questions must be studied for any specific search
problem at hand.
In the case of MAX 3-SAT, how difficult is it to actually find an assign-
ment that satisfies at least 7/8 of the clauses? The obvious idea is to iterate
the above algorithm until success. We analyze the expected number of iter-
ations needed. Let pj be the probability that exactly j clauses are satisfied.
Since the expected value of j is 7k/8, we have the following chain, where
the sum is split according to whether j < 7k/8 or j ≥ 7k/8:

7k/8 = Σ_j j·p_j = Σ_{j<7k/8} j·p_j + Σ_{j≥7k/8} j·p_j
     ≤ k₀ · Σ_{j<7k/8} p_j + k · Σ_{j≥7k/8} p_j
     = k₀(1−p) + kp ≤ k₀ + kp.

Here k₀ denotes the largest integer smaller than 7k/8, and p = Σ_{j≥7k/8} p_j
is the probability that at least 7k/8 of the clauses are satisfied. Since
7k − 8k₀ is a positive integer, the chain yields kp ≥ 7k/8 − k₀ ≥ 1/8,
hence p ≥ 1/(8k). Thus the expected number of iterations until success is
1/p ≤ 8k.
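The repeat-until-success scheme can be sketched as follows. The encoding of
clauses as lists of signed integers (literal +i for variable i, −i for its
negation) is our own illustrative choice, not fixed by the notes.

```python
import random

def random_assignment(clauses, n):
    """One round: draw a uniformly random truth assignment for variables
    1..n and count how many clauses it satisfies.  A clause is a list of
    nonzero ints; literal +i means variable i, -i means its negation
    (an assumed, DIMACS-like encoding used only for illustration)."""
    assignment = [random.choice([False, True]) for _ in range(n + 1)]
    satisfied = sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
    return assignment, satisfied

def max_3sat(clauses, n):
    """Iterate until at least 7/8 of the k clauses are satisfied.
    By the analysis above, each round succeeds with probability
    at least 1/(8k), so the expected number of rounds is at most 8k."""
    k = len(clauses)
    while True:
        assignment, satisfied = random_assignment(clauses, n)
        if 8 * satisfied >= 7 * k:  # at least 7/8 of the clauses
            return assignment, satisfied
```

Note that the loop terminates with probability 1; the analysis above only
bounds the expected number of rounds, there is no worst-case guarantee.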
element with rank ⌊n/2⌋ is called the median. Median finding and Selection
have nice applications in geometry and in the analysis of statistical data.
(Side remark: In the refined greedy algorithm that gave an approxima-
tion ratio 1.5 for Load Balancing we had sorted the jobs by length. A closer
look reveals that we actually need much less: It is enough to separate the
k longest jobs from the shorter jobs, since the analysis does not use any
sorting within these two sets of jobs.)
We may first sort S in O(n log n) time, which makes the Selection prob-
lem trivial. But the nice thing is that we can bypass sorting and solve Selec-
tion directly in O(n) time. There exists a deterministic divide-and-conquer
algorithm for Selection, but it is a bit complicated and, more importantly,
the hidden constant in O(n) is rather large. It is much more advisable to
apply a simple randomized algorithm like the following.
Just for simplicity we assume that all ai are distinct. The scheme of
the algorithm is as follows. Choose an element s ∈ S called the splitter.
Compare all elements to s, in O(n) time. Now we know the rank r of s. If
r > k then throw out s and all elements larger than s. If r < k then throw
out s and all elements smaller than s, and set k := k − r. If r = k then
return s. Repeat this procedure recursively.
Correctness should be obvious. The only unspecified step is the choice of
the splitter. Let us use the above scheme with a splitter chosen uniformly at
random. That is, every element of S becomes the splitter with probability
1/n.
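Under the same distinctness assumption, the scheme with a uniformly random
splitter can be sketched as:

```python
import random

def select(S, k):
    """Return the element of rank k (1-based, k-th smallest) of the
    list S, using a uniformly random splitter.  Assumes all elements
    are distinct, as in the text."""
    s = random.choice(S)                # the splitter, uniform over S
    smaller = [x for x in S if x < s]   # one comparison pass: O(n)
    larger = [x for x in S if x > s]
    r = len(smaller) + 1                # rank of s in S
    if r == k:
        return s
    if r > k:                           # answer is among the smaller elements
        return select(smaller, k)
    return select(larger, k - r)        # discard s and everything smaller

def median(S):
    """One possible reading of the text's rank-floor(n/2) convention."""
    return select(S, len(S) // 2)
```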
Intuitively, this is a good algorithm because a random element will usu-
ally split the set in two reasonably well balanced subsets, hence the number
of elements to consider should exponentially decrease in each iteration. For
a rigorous analysis of the expected runtime we just introduce a cut-off
point that defines when the split is well balanced or not. More precisely,
we call an element central in a set if this element is smaller and larger,
respectively, than at least 1/4 of the elements. We say that the algorithm
is in phase j if the number of remaining elements is between n(3/4)j+1
and n(3/4)j . Clearly, our random splitter is central with probability 1/2.
It follows immediately that the expected number of splitters needed in any
phase j is 2 = O(1). Furthermore, since Σ_j n(3/4)^j is a geometric series
summing to O(n), the total expected time is O(n), but now with
some moderate hidden constant. Once we have chosen the right approach,
the analysis is not too complicated, is it?
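Spelled out, writing cn for the (assumed) cost of one comparison pass over n
elements, the bound reads:

```latex
\mathbb{E}[T] \;\le\; \sum_{j \ge 0} \underbrace{2}_{\substack{\text{expected splitters}\\ \text{per phase}}} \cdot \, c\,n \left(\tfrac{3}{4}\right)^{j}
\;=\; 2cn \cdot \frac{1}{1 - 3/4} \;=\; 8cn \;=\; O(n).
```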
A Simple Analysis of Quicksort
The basic version of the famous Quicksort algorithm (which we do not repeat
here) works with a random splitter in every recursion step. For the sake of
a simple analysis we slightly modify the algorithm; however, we keep it close
to the original Quicksort: After comparing the random splitter to all other
elements, we check whether it is central, and if not, we discard it altogether
and pick a new splitter. Of course, this is a certain waste of time. Hence
the original Quicksort performs no worse than this modified Quicksort.
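A minimal sketch of this modified Quicksort (assuming distinct elements, as
in the Selection section):

```python
import random

def quicksort_central(S):
    """Quicksort variant analyzed below: a random splitter is accepted
    only if it is central, i.e., at least |S|//4 elements are smaller
    and at least |S|//4 are larger; otherwise it is discarded and a
    fresh splitter is drawn (2 expected attempts per subproblem).
    Assumes distinct elements, as in the text."""
    n = len(S)
    if n <= 3:                  # tiny inputs: no splitting needed
        return sorted(S)
    while True:
        s = random.choice(S)
        smaller = [x for x in S if x < s]
        larger = [x for x in S if x > s]
        if len(smaller) >= n // 4 and len(larger) >= n // 4:
            break               # s is central: accept it
        # otherwise: discard s altogether and pick a new splitter
    return quicksort_central(smaller) + [s] + quicksort_central(larger)
```

Since an accepted splitter leaves at most 3n/4 elements on either side, the
recursion makes the phase analysis of the previous section applicable.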
We say that a subproblem is of type j if the number of elements is
between n(3/4)^(j+1) and n(3/4)^j. As in the previous section, we find a central
splitter after an expected number of only 2 attempts. Thus, the expected
time spent on any subproblem of type j is O(n(3/4)^j). Moreover, since we
accepted only central splitters, we can easily see that all subproblems of type
j are disjoint, i.e., they deal with disjoint subsets of the entire set. Hence
at most (4/3)^(j+1) subproblems of type j can exist during the execution of
the algorithm. By linearity of expectation, the expected time spent on all
subproblems of type j is therefore O(n). Since O(log n) types exist, the total
expected time is O(n log n).