1. Introduction
Cormen et al.’s book [1] is a widely used textbook in colleges worldwide and is a standard reference for researchers. When discussing the probabilistic analysis of algorithms in chapter 5, the authors introduce the assistant hiring algorithm, where $n$ candidates, with rankings from 1 (worst) to $n$ (best), scrambled in random order, are interviewed to fill an assistant position. Every time a candidate is better (i.e., has a higher ranking) than the current one, (s)he is hired and the current one is fired. The first applicant is always hired. Under the hypothesis that all $n!$ permutations of the rankings are equally likely, using indicator random variables and the linearity of the expected value, the authors compute the expected number of hires as

$$\mathbb{E}[H] = \sum_{i=1}^{n} \frac{1}{i}. \tag{1}$$
The sum above, the $n$-th harmonic number, will be denoted by $H_n$. More generally, we will use the notation

$$H_n^{(k)} = \sum_{i=1}^{n} \frac{1}{i^k},$$

so that $H_n = H_n^{(1)}$.
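Although the note proceeds analytically, formula (1) is easy to test by simulation. The following Python sketch (the helper names hires and harmonic are ours, not from the sources cited above) runs the algorithm on random permutations and compares the empirical mean with $H_n$:

```python
import random

def hires(ranking):
    """Number of hires: a candidate is hired whenever their ranking
    beats the current maximum (the first candidate is always hired)."""
    best, count = 0, 0
    for r in ranking:
        if r > best:
            best, count = r, count + 1
    return count

def harmonic(n, k=1):
    """Generalized harmonic number H_n^{(k)} = sum_{i=1}^n 1/i^k."""
    return sum(1.0 / i**k for i in range(1, n + 1))

n, trials = 100, 100_000
mean = sum(hires(random.sample(range(1, n + 1), n)) for _ in range(trials)) / trials
print(f"empirical mean = {mean:.3f}, H_n = {harmonic(n):.3f}")  # both close to 5.187
```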
The variance in the number of hires in the assistant hiring algorithm, a relevant parameter that provides a measure of the spread of the values of $H$ around their expectation, is not computed in [1]. In his seminal book [2], Donald Knuth also introduces the probabilistic analysis of algorithms by looking at the same assistant hiring problem, but refers to it by a less playful name: Algorithm M (finding the maximum; page 96 and following). He obtains both the expectation and the variance, referring the reader to [3] for the asymptotic values of the higher moments, although that article bases its analysis of the higher moments on a further reference on power series with a finite number of singularities [4]. In both [2] and [3], the authors find difference equations for the probability mass function of $H$ and derive formulas for the factorial moment generating function of $H$ through some computations involving Stirling numbers and Lagrange’s method for solving differential equations.
In this self-contained note, we revisit the assistant hiring algorithm, which essentially counts the number of times the value of the current maximum is updated when finding the maximum element of a random permutation. Our approach is different, in that we do not focus on the probability mass function of $H$, and we believe it is simpler. We first prove that $H$ can be expressed as a sum of independent indicator variables, using a well-known combinatorial identity, the hockey stick identity. Then, we obtain a reasonably compact moment generating function that allows us to find exact expressions for the first three moments of $H$, as well as asymptotic values for its higher moments, using simple differentiation and an inductive argument.
2. The Results
Let $H$ be the number of hires. Then, we can write

$$H = \sum_{i=1}^{n} X_i, \tag{2}$$

where $X_i = 1$ in case candidate $i$ is hired, and $X_i = 0$ otherwise. It is not clear that $X_i$ and $X_j$ are independent random variables: if $i < j$, a possible heuristic would say that the hiring of candidate $j$ in the future should not affect the hiring of the earlier candidate $i$, but we can turn this heuristic around and say that the hiring of candidate $i$ implies a large value for the ranking of this candidate, possibly affecting the occurrence of large rankings for the future candidate $j$. Heuristics aside, we argue that the joint behavior of the candidates $i$ and $j$, $i < j$, is only determined by the rankings $r_1, \dots, r_j$, and if we sort them in increasing order $r_{(1)} < \dots < r_{(j)}$, and we identify $r_{(k)}$ with $k$, we can simply work with the subarray of rankings $1, \dots, j$. In particular,

$$P(X_j = 1) = \frac{(j-1)!}{j!} = \frac{1}{j},$$

the reason being that, in order for candidate $j$ to be hired, their ranking should be $j$ and all previous candidates may have any of the remaining $j-1$ rankings. From this, we conclude that $\mathbb{E}[X_j] = 1/j$. This fact, and the linearity of the expectation, imply that taking expectations on both sides of (2) yields (1).
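Since the identity $P(X_j = 1) = 1/j$ is exact, it can also be verified by brute force over all $n!$ permutations of a small array; a minimal Python sketch (naming is ours):

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

n = 6
hired = [0] * (n + 1)  # hired[j] = number of permutations in which candidate j is hired
for perm in permutations(range(1, n + 1)):
    best = 0
    for j, r in enumerate(perm, start=1):
        if r > best:
            best = r
            hired[j] += 1

for j in range(1, n + 1):
    assert Fraction(hired[j], factorial(n)) == Fraction(1, j)
print("P(X_j = 1) = 1/j verified exactly for n =", n)
```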
With these preliminaries, we can prove the following.
Proposition 1. The variables $X_i$ and $X_j$, $i < j$, are independent.
Proof. We will prove that $P(X_i = 1, X_j = 1)$ equals $P(X_i = 1)P(X_j = 1) = \frac{1}{ij}$. Indeed,

$$P(X_i = 1, X_j = 1) = \frac{\text{number of favorable outcomes}}{j!}.$$

To count the number of favorable outcomes, first, we argue that in order for candidate $j$ to be hired, they must have ranking $j$. The possible values $\ell$ for the ranking of candidate $i$ are $i \le \ell \le j-1$, and the reason for this is that if we select $\ell < i$, we will force a ranking greater than or equal to $i$ in the first $i-1$ candidates, thus precluding the hiring of candidate $i$. Once we assign the ranking $\ell$ to candidate $i$, there are $\ell - 1$ possible rankings that could be distributed among the first $i-1$ candidates in $\binom{\ell-1}{i-1}(i-1)!$ ways. Finally, the remaining $j-i-1$ rankings for candidates $i+1$ through $j-1$ can be assigned in $(j-i-1)!$ ways. Putting all these facts together, we obtain

$$P(X_i = 1, X_j = 1) = \frac{(i-1)!\,(j-i-1)!}{j!} \sum_{\ell=i}^{j-1} \binom{\ell-1}{i-1}.$$

After multiplying and dividing by $i!$, and with some algebraic manipulations, we obtain that the previous probability can be written as

$$\frac{1}{ij} \cdot \frac{1}{\binom{j-1}{i}} \sum_{\ell=i}^{j-1} \binom{\ell-1}{i-1}.$$

However,

$$\sum_{\ell=i}^{j-1} \binom{\ell-1}{i-1} = \binom{j-1}{i}$$

by the well-known hockey stick identity (also known as Chu’s theorem; see [5]) and, therefore, $P(X_i = 1, X_j = 1) = \frac{1}{ij} = P(X_i = 1)P(X_j = 1)$. It is trivial to see that this is enough to prove the independence of the variables, that is, $P(X_i = a, X_j = b) = P(X_i = a)P(X_j = b)$, where $a$ and $b$ can be arbitrarily chosen as either 0 or 1. □
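The hockey stick identity used in the proof is itself easy to confirm numerically; a small Python check over an arbitrarily chosen range of indices:

```python
from math import comb

# sum_{l=i}^{j-1} C(l-1, i-1) == C(j-1, i), the form used in the proof above
for j in range(2, 40):
    for i in range(1, j):
        assert sum(comb(l - 1, i - 1) for l in range(i, j)) == comb(j - 1, i)
print("hockey stick identity verified for all 1 <= i < j < 40")
```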
As the variables $X_i$ are pairwise independent, a simple computation of $\operatorname{Var}(H)$ follows. Indeed, using the formula for the variance of the sum, and since the independence of two variables implies that their covariance vanishes, we obtain

$$\operatorname{Var}(H) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = \sum_{i=1}^{n} \left( \frac{1}{i} - \frac{1}{i^2} \right) = H_n - H_n^{(2)}.$$
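As with the expectation, this exact formula can be verified by exhaustive enumeration for small $n$; a short Python sketch using exact rational arithmetic (helper naming is ours):

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

def hires(perm):
    best, count = 0, 0
    for r in perm:
        if r > best:
            best, count = r, count + 1
    return count

n = 7
values = [hires(p) for p in permutations(range(1, n + 1))]
N = factorial(n)
EH = Fraction(sum(values), N)
EH2 = Fraction(sum(v * v for v in values), N)
Hn = sum(Fraction(1, i) for i in range(1, n + 1))
Hn2 = sum(Fraction(1, i * i) for i in range(1, n + 1))
assert EH == Hn and EH2 - EH**2 == Hn - Hn2
print("Var(H) = H_n - H_n^(2) verified exactly for n =", n)
```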
The previous derivation of the variance is simple enough to find its way into an introductory textbook on the subject. However, to find higher moments of $H$, we will need to resort to the moment generating function of the random variable, and first obtain the following.
Proposition 2. The variables $X_1$, $X_2$, …, $X_{n-1}$, $X_n$ are independent.
Proof. Let us prove that $X_i$, $X_j$, $X_k$ are independent for $i < j < k$. Again, since these variables only take two values, 0 and 1, it is enough to show that

$$P(X_i = 1, X_j = 1, X_k = 1) = \frac{1}{ijk}.$$

Arguing as before, we can restrict our attention to the rankings $1, \dots, k$. Candidate $k$ must obtain the ranking $k$; thus, the possible values of the ranking $m$ for candidate $j$ are $j \le m \le k-1$. Once the rank $m$ is fixed, it is clear that the possible values $\ell$ for the ranking of candidate $i$ are $i \le \ell \le m-1$. Once we fix $\ell$, there are $\ell - 1$ possible rankings to be placed between candidates 1 and $i-1$ in $\binom{\ell-1}{i-1}(i-1)!$ ways. Next, between candidates $i+1$ and $j-1$, we can place the $m-i-1$ available rankings in $\binom{m-i-1}{j-i-1}(j-i-1)!$ ways. Finally, between candidates $j+1$ and $k-1$ we can place the remaining $k-j-1$ unassigned rankings in $(k-j-1)!$ ways. Then, we can write

$$P(X_i = 1, X_j = 1, X_k = 1) = \frac{(i-1)!\,(j-i-1)!\,(k-j-1)!}{k!} \sum_{m=j}^{k-1} \binom{m-i-1}{j-i-1} \sum_{\ell=i}^{m-1} \binom{\ell-1}{i-1}.$$

Multiplying and dividing by $i!$, and taking this term into the inner summation, we can see that the above probability equals

$$\frac{(j-i-1)!\,(k-j-1)!}{i\,k!} \sum_{m=j}^{k-1} \binom{m-i-1}{j-i-1}\, i! \sum_{\ell=i}^{m-1} \binom{\ell-1}{i-1} = \frac{(j-i-1)!\,(k-j-1)!}{i\,k!} \sum_{m=j}^{k-1} \binom{m-i-1}{j-i-1} \frac{(m-1)!}{(m-1-i)!},$$

where the last equality holds by the hockey stick identity. Expanding the binomial coefficient and some cancelling yields that the previous expression equals

$$\frac{(k-j-1)!}{i\,k!} \sum_{m=j}^{k-1} \frac{(m-1)!}{(m-j)!}.$$

Multiplying and dividing by $(j-1)!$ and applying the hockey stick identity again renders the previous expression equal to

$$\frac{(j-1)!\,(k-j-1)!}{i\,k!} \binom{k-1}{j} = \frac{1}{ijk},$$

as desired.
It is clear that we can generalize the above argument to prove that the variables $X_{i_1}, \dots, X_{i_m}$ are independent for any choice of subindices $i_1 < i_2 < \dots < i_m$: once we have written the probability as an expression involving summations, we can work our way from the innermost to the outermost summation, applying the hockey stick identity $m-1$ times. □
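For 0/1 variables, mutual independence follows once the product rule is checked at the all-ones point for every subset of indices (the remaining cases follow by inclusion-exclusion), so the proposition can be verified exhaustively for a small $n$; a Python sketch with our own helper names:

```python
from itertools import permutations, combinations
from fractions import Fraction
from math import factorial

n = 6

def indicators(perm):
    """Hire indicators X_1, ..., X_n for one permutation of rankings."""
    best, ind = 0, []
    for r in perm:
        ind.append(1 if r > best else 0)
        best = max(best, r)
    return ind

all_ind = [indicators(p) for p in permutations(range(1, n + 1))]
N = factorial(n)
for m in range(2, n + 1):
    for subset in combinations(range(1, n + 1), m):
        hits = sum(1 for ind in all_ind if all(ind[i - 1] for i in subset))
        product = Fraction(1)
        for i in subset:
            product *= Fraction(1, i)
        assert Fraction(hits, N) == product  # P(all hired) = 1/(i_1 * ... * i_m)
print("product rule verified for every subset of indices, n =", n)
```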
The two-valued random variable $X_i$ has a simple moment generating function (m.g.f.):

$$M_{X_i}(t) = \mathbb{E}\left[e^{tX_i}\right] = 1 + \frac{e^t - 1}{i},$$

and since the $X_i$s are independent, the m.g.f. of $H$ is the product of the m.g.f.s of the $X_i$s, that is,

$$M_H(t) = \prod_{i=1}^{n} \left( 1 + \frac{e^t - 1}{i} \right). \tag{3}$$
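As a sanity check on (3), one can differentiate the product symbolically and confirm that the values at $t = 0$ reproduce the mean and variance obtained above; a sketch using the third-party sympy library:

```python
import sympy as sp

t = sp.symbols("t")
n = 8
# M_H(t) as in (3)
M = sp.prod([1 + (sp.exp(t) - 1) / i for i in range(1, n + 1)])

EH = sp.diff(M, t).subs(t, 0)      # first moment
EH2 = sp.diff(M, t, 2).subs(t, 0)  # second moment

Hn = sum(sp.Rational(1, i) for i in range(1, n + 1))
Hn2 = sum(sp.Rational(1, i**2) for i in range(1, n + 1))
assert sp.simplify(EH - Hn) == 0
assert sp.simplify(EH2 - EH**2 - (Hn - Hn2)) == 0
print("mean and variance from the m.g.f. match the closed forms for n =", n)
```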
The relatively simple expression for the m.g.f. of $H$ is a direct consequence of the independence of the $X_i$. Without this fact, finding the m.g.f. of $H$ would be a much harder task. Now, in principle, we could obtain the moments of $H$ by taking the derivatives of $M_H(t)$ and evaluating them at $t = 0$. The formulas become more involved after a few differentiations, but we can obtain some asymptotics for $\mathbb{E}[H^k]$. We first note that

$$M_H'(t) = M_H(t) \sum_{i=1}^{n} \frac{e^t}{e^t + i - 1} \tag{4}$$

and

$$M_H''(t) = M_H'(t) \sum_{i=1}^{n} \frac{e^t}{e^t + i - 1} + M_H(t) \sum_{i=1}^{n} \frac{(i-1)\,e^t}{(e^t + i - 1)^2}. \tag{5}$$

By evaluating (4) and (5) at $t = 0$, we obtain

$$\mathbb{E}[H] = H_n$$

and

$$\mathbb{E}[H^2] = H_n^2 + H_n - H_n^{(2)}.$$

Therefore, we recover (1) and the variance as

$$\operatorname{Var}(H) = \mathbb{E}[H^2] - \mathbb{E}[H]^2 = H_n - H_n^{(2)}.$$

Another differentiation of (5) yields

$$M_H'''(t) = M_H''(t) \sum_{i=1}^{n} \frac{e^t}{e^t + i - 1} + 2 M_H'(t) \sum_{i=1}^{n} \frac{(i-1)\,e^t}{(e^t + i - 1)^2} + M_H(t) \sum_{i=1}^{n} \frac{(i-1)(i - 1 - e^t)\,e^t}{(e^t + i - 1)^3}. \tag{6}$$

Evaluating at $t = 0$, we obtain

$$\mathbb{E}[H^3] = H_n^3 + 3H_n^2 - 3H_n H_n^{(2)} + H_n - 3H_n^{(2)} + 2H_n^{(3)}.$$
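The third-moment expression can be double-checked symbolically in the same way; a continuation of the sympy sketch above:

```python
import sympy as sp

t = sp.symbols("t")
n = 8
M = sp.prod([1 + (sp.exp(t) - 1) / i for i in range(1, n + 1)])

EH3 = sp.diff(M, t, 3).subs(t, 0)  # third moment from the m.g.f.
H1, H2, H3 = (sum(sp.Rational(1, i**k) for i in range(1, n + 1)) for k in (1, 2, 3))
closed = H1**3 + 3*H1**2 - 3*H1*H2 + H1 - 3*H2 + 2*H3
assert sp.simplify(EH3 - closed) == 0
print("third moment matches the closed form for n =", n)
```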
It is well-known (see [1]) that $H_n = \ln n + O(1)$, or, in a more compact and somewhat less informative way, $H_n = \Theta(\ln n)$, where the constants involved in the upper and lower bounds are both equal to 1. As the numbers $H_n^{(k)}$, for $k \ge 2$, are partial sums of convergent series, we see that the dominating term there is $H_n^3$ and, thus, $\mathbb{E}[H^3] = \Theta(\ln^3 n)$.
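To illustrate this numerically, one can evaluate the exact third moment and compare it with $\ln^3 n$; the ratio drifts toward 1, although slowly, since the relative error decays like $1/\ln n$. A small Python sketch:

```python
from math import log

def H(n, k=1):
    """Generalized harmonic number H_n^{(k)}."""
    return sum(1 / i**k for i in range(1, n + 1))

for n in (10**2, 10**4, 10**6):
    h1, h2, h3 = H(n), H(n, 2), H(n, 3)
    EH3 = h1**3 + 3*h1**2 - 3*h1*h2 + h1 - 3*h2 + 2*h3
    print(n, EH3 / log(n) ** 3)  # slowly approaches 1 as n grows
```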
By inspecting Formula (6) and proceeding recursively, we see that the $k$-th derivative of $M_H(t)$ can be written as:

$$M_H^{(k)}(t) = \sum_{j=0}^{k-1} \binom{k-1}{j} M_H^{(j)}(t)\, f_{k-1-j}(t), \tag{7}$$

where the $f_r$ are functions involving powers of $e^t$ and sums of inverse powers of $e^t + i - 1$, $i = 1, \dots, n$. After evaluating at $t = 0$, this becomes

$$\mathbb{E}[H^k] = \sum_{j=0}^{k-1} \binom{k-1}{j} \mathbb{E}[H^j]\, f_{k-1-j}(0),$$

or

$$\mathbb{E}[H^k] = H_n\, \mathbb{E}[H^{k-1}] + \sum_{j=0}^{k-2} \binom{k-1}{j} \mathbb{E}[H^j]\, f_{k-1-j}(0). \tag{8}$$

The values $f_r(0)$ involve either partial sums of convergent series, or $H_n$, and inside the summation there are only moments $\mathbb{E}[H^j]$ for $j \le k-2$. Thus, a simple inductive argument tells us that the dominant term in (8) is $H_n\, \mathbb{E}[H^{k-1}]$, or, in other words, $\mathbb{E}[H^k] = \Theta(\ln^k n)$ for $k \ge 1$, where the constants involved in the upper and lower bounds are both equal to 1.
Remark 1. The assistant hiring algorithm is a basic prototype of the so-called secretary problem, where the hiring strategy may be more involved than interviewing all candidates and being allowed to hire a candidate every time the current one is better than the preceding ones. Typically, in the secretary problem, only one hire is allowed. Candidates are interviewed individually, and the current candidate can be hired or rejected. If rejected, they cannot be summoned again in the future. The desired outcome is an optimal strategy that maximizes the probability of selecting the best candidate. Other problems target the second best candidate, the best $k$ out of $n$, etc. For more details on these generalizations, the reader may consult references [6,7].

3. Conclusions
In this note, we studied the random variable that counts the number of hires in the assistant hiring algorithm, or, in other words, the number of updates to the current maximum when searching for the maximum element in a random permutation of an array of real numbers. We found exact expressions for its first three moments, and asymptotic values for its higher moments. The technique was to write the random variable under consideration as a sum of indicator variables that take the value 1 if a hire is carried out in the $i$-th interview (or, in other words, if an update of the current maximum is performed after the inspection of the $i$-th number), and 0 otherwise. An application of the hockey stick identity showed that these indicator variables are independent, implying that the moment generating function of the variable under consideration equals the product of the simple moment generating functions of the indicator variables. We thus obtained a reasonable expression for the moment generating function, which, differentiated and evaluated at 0, yielded that the $k$-th moment of the variable under study is, asymptotically, the $k$-th power of the natural logarithm of $n$. We believe that our method, which does not rely on explicit knowledge of the probability mass function of the variable under study, is simpler than other treatments of the algorithm found in the literature. The basic idea of decomposing a random variable into a sum of indicator random variables is well known, and is used extensively in the analysis of algorithms, at least for the computation of expected values, in textbooks such as [1]. Due to the linearity of the expectation, the expected value of the original variable is simply the sum of the expected values of the indicator variables, each of which, in turn, equals the probability that the corresponding indicator variable takes the value 1. However, we must warn that the computation of the variance, or that of higher moments, can be a difficult task unless the indicator variables are independent, as was the case in the assistant hiring algorithm.