OceanofPDF - Com Train Your Brain Challenging Yet Elementa - Bogumil Kaminski
Textbooks in Mathematics
Series editors:
Al Boggess, Kenneth H. Rosen
Nonlinear Optimization
Models and Applications
William P. Fox
Linear Algebra
James R. Kirkwood, Bessie H. Kirkwood
Real Analysis
With Proof Strategies
Daniel W. Cunningham
Train Your Brain
Challenging Yet Elementary Mathematics
Bogumił Kamiński, Paweł Prałat
Contemporary Abstract Algebra, Tenth Edition
Joseph A. Gallian
Geometry and Its Applications
Walter J. Meyer
Linear Algebra
What you Need to Know
Hugo J. Woerdeman
Introduction to Real Analysis, 3rd Edition
Manfred Stoll
Discovering Dynamical Systems Through Experiment and Inquiry
Thomas LoFaro, Jeff Ford
Functional Linear Algebra
Hannah Robbins
https://www.routledge.com/Textbooks-in-Mathematics/book-series/CANDHTEXBOOMTH
Train Your Brain —
Challenging Yet Elementary
Mathematics
By
Bogumił Kamiński
Paweł Prałat
First edition published 2020
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742 and by CRC Press
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their
use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let
us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
1 Inequalities
1.1 Convexity and Concavity
1.2 Arithmetic-Geometric Inequality
1.3 Mathematical Induction
1.4 Bernoulli’s Inequality
1.5 Euler’s Number
1.6 Asymptotics
1.7 Cauchy-Schwarz Inequality
1.8 Probability
1.9 Geometry
4 Combinatorics
4.1 Enumeration
4.2 Tilings
4.3 Counting
4.4 Extremal Graph Theory
4.5 Probabilistic Methods
4.6 Probability
4.7 Combinations of Geometrical Objects
4.8 Pigeonhole Principle
4.9 Generating Functions
5 Number Theory
5.1 Greatest Common Divisors
5.2 Modular Arithmetic
5.3 Factorization
5.4 Fermat’s Little Theorem and Euler’s Theorem
5.5 Rules of Divisibility
5.6 Remainders
5.7 Aggregation
5.8 Equations
6 Geometry
6.1 Circles
6.2 Congruence
6.3 Similarity
6.4 Menelaus’s Theorem
6.5 Parallelograms
6.6 Power of a Point
6.7 Areas
6.8 Thales’ Theorem
7 Hints
7.1 Inequalities
7.2 Equalities and Sequences
7.3 Functions, Polynomials, and Functional Equations
7.4 Combinatorics
7.5 Number Theory
7.6 Geometry
8 Solutions
8.1 Inequalities
8.2 Equalities and Sequences
8.3 Functions, Polynomials, and Functional Equations
8.4 Combinatorics
8.5 Number Theory
8.6 Geometry
Further Reading
Index
Introduction
The book contains carefully selected problems that are challenging, yet only
require elementary mathematics. It is intended to prepare the readers for
rigorous mathematics, but neither prior preparation nor any mathematical
sophistication is required from them before reading this book. The book
guides the readers to think and express themselves in a rigorous,
mathematical way, to extract facts, analyze the problem, and identify main
challenges. Moreover, it shows how to draw appropriate, true conclusions
and helps to see a big picture. Despite the fact that this is not the main goal
of this book, as a by-product, the readers are provided with a firm
foundation in a diverse range of topics that might be useful in their future
work. Finally, we often use computer support to help us get a better
intuition into the discussed problems. This is still a rather uncommon approach in
mathematics, but it is getting more and more popular in the current
multidisciplinary and data-driven world.
The presented material can be seen as a means to bridge the gap between
introductory calculus/linear algebra courses and more advanced courses that
are offered at universities. It improves the ability to read, write, and think in
a rigorous, mature mathematical fashion. It provides a solid foundation of
various topics that would be useful for more advanced courses. However,
the book is not only intended for undergraduate students that would like to
become professional research mathematicians. In almost any
mathematically related work (such as computer programming, data science,
machine learning, economics, engineering, etc.), precise reasoning, and
understanding what logical steps need to be taken to transition from the
assumptions to the desired conclusion, are crucial to be successful.
The content of this book is also suitable for high school students that are
interested in competing in math competitions or simply for people of all
ages and backgrounds who want to expand their knowledge and to
challenge themselves with interesting questions. In fact, the problems are
mostly selected from an extensive collection of problems from Polish
Mathematical Olympics and a library of training problems from XIV High
School of Stanislaw Staszic in Warsaw (Poland).
This book is clearly not the only one of its type. There are three main
reasons for writing another book on this topic. First of all, we found that
many interesting problems appear only in the Polish language and are not
translated to other languages. Some of them are unique and might be
interesting for a broader, English speaking, audience. We feel that they
deserve to be popularized. More importantly, we grouped questions into six
chapters representing various disciplines of mathematics. Each chapter
consists of many sections devoted to a collection of related topics. Each of
these sections starts with a problem that is followed by the necessary
background (definitions and theorems used), careful and detailed solution,
and discussion of possible generalizations. The sections finish with a
number of additional related exercises that are solved at the end of the book.
As a result, this book can be used as a textbook for a systematic and
structured introduction to a fascinating world of high school math
competitions, or as a book preparing university students for more advanced
courses. Finally, with an increasing role of computational methods in
mathematics, we decided to show a few examples when computer aid can
be used to verify or guide the solutions to some problems. We present the
related code for a few suitable problems, discuss the implementation details
and, in the “Julia language companion” available on-line at
www.ryerson.ca/train-your-brain/, we provide a thorough introduction to
the Julia language that we use in this book, along with detailed explanations
of the codes we present.
In order to help the reader to navigate in the text, for each problem, we
clearly distinguish a few subsections, whose functions are listed below.
SOURCE
SOURCE
REMARKS
This part contains follow up exercises that use the same or similar concepts.
They should serve as a good test whether the reader “digested” the content
or needs more practice.
We tried to indicate the source for as many problems as possible. If the
source is omitted, it means that the problem is either our own, is well-
known, or we had it in our personal notes but were unable to recover the
original source. We also tried to make sure that it is clear whether the
solution is also taken from the source or is our own. We did our best to track
back all the sources but please contact us if we missed anything. We would
be more than happy to provide a more complete and accurate picture of the
sources of all the problems we have presented in a later edition of the book.
In particular, problems from Polish Mathematical Olympics and their
solutions are marked under the acronym PLMO. We would like here to
thank the organizers for granting us the right to use their translations in this
book. In the chapter on geometry, we have extensively used an excellent
collection of problems “Exercises in geometry” (in Polish) prepared by
Waldemar Pompe who also kindly agreed to include their translations in
this book.
If you find any errors or omissions in this book, then please kindly let us
know and we will reflect it in the errata that will be available at
www.ryerson.ca/train-your-brain/.
Finally, we would like to thank Calum MacRury for carefully reading the
manuscript, and Igor Kamiński for helping with selecting topics and
problems to include in this book.
Chapter 1
Inequalities
We begin the book with a chapter on inequalities. This is an exciting subject, as it very
often requires reducing the problem to some other area of mathematics which may not
initially seem to be related to the problem at hand. For example, it might turn out that one
of the sides can be interpreted as the probability that some event holds, or that the side has
some geometric interpretation.
Since this is the first chapter, let us start with introducing some basic definitions that will
be used through the entire book.
THEORY
Let R denote the set of real numbers, let N = {1, 2, …} denote the set of natural numbers,
let Z := {…, −1, 0, 1, …} denote the set of integers, and let Q := {a/b : a ∈ Z, b ∈ N}
denote the set of rational numbers. Let [n] denote the set of the first n natural numbers; that
is, [n] := {1, 2, …, n}. We use subscript + and - to restrict the set to positive and negative
numbers, respectively. For example, R+ denotes the set of positive real numbers. We will
use ln(x) to denote the natural logarithm of x.
\[
\frac{1}{a} + \frac{1}{b} + \frac{1}{c} \le \frac{1}{b + c - a} + \frac{1}{c + a - b} + \frac{1}{a + b - c}. \tag{1.1}
\]
THEORY
Triangle Inequality If a, b, and c are the lengths of the sides of some triangle, then the
triangle inequality states that
c ≤ a + b .
Note that this statement permits the inclusion of degenerate triangles; that is, when
c = a + b. However, usually this possibility is excluded, thus leaving out the possibility of
equality.
(1.2)
Intuitively, a function is convex if for all $x_1, x_2 \in D$ its graph lies below a straight line
(1.3)
Finally, let us mention that if a function is continuous, then it is enough to check that the
condition (1.2) or (1.3) holds for t = 1/2 in order to establish that the corresponding
function is convex or, respectively, concave. Using this fact, one can easily prove that
function $f(x) = 2^x$ is convex. Indeed, notice that for all $x, y \in \mathbb{R}$, we have
\[
0 \le \left(2^{x/2} - 2^{y/2}\right)^2 = 2^x - 2 \cdot 2^{(x+y)/2} + 2^y.
\]
It follows immediately from triangle inequality that all fractions on the right hand side of
(1.1) are positive. In particular,
\[
\frac{1}{b + c - a} + \frac{1}{c + a - b} = \frac{(c + a - b) + (b + c - a)}{(b + c - a)(c + a - b)} = \frac{2c}{c^2 - (a - b)^2} > 0.
\]
Since the numerator is positive (that is, $2c > 0$), it follows that the denominator is positive
too (that is, $c^2 - (a - b)^2 > 0$). Moreover, clearly $(a - b)^2 \ge 0$, and so $c^2 - (a - b)^2 \le c^2$. Hence,
\[
\frac{1}{b + c - a} + \frac{1}{c + a - b} = \frac{2c}{c^2 - (a - b)^2} \ge \frac{2}{c}. \tag{1.4}
\]
\[
\frac{1}{c + a - b} + \frac{1}{a + b - c} \ge \frac{2}{a}, \tag{1.5}
\]
and
\[
\frac{1}{b + c - a} + \frac{1}{a + b - c} \ge \frac{2}{b}. \tag{1.6}
\]
After summing the three inequalities (that is, (1.4), (1.5), and (1.6)) together and dividing
by 2, we get the desired result. We additionally notice that equality holds if and only if
a = b = c.
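Inequality (1.1) can be spot-checked numerically on random triangles. The following is a minimal Python sketch, not part of the proof; the helper names (`lhs`, `rhs`, `random_triangle`, `holds_on_random_triangles`) and the sampling range are assumptions made for illustration.

```python
import random


def lhs(a, b, c):
    # Left hand side of (1.1).
    return 1 / a + 1 / b + 1 / c


def rhs(a, b, c):
    # Right hand side of (1.1); all denominators are positive for a triangle.
    return 1 / (b + c - a) + 1 / (c + a - b) + 1 / (a + b - c)


def random_triangle(rng):
    # Rejection sampling: keep only non-degenerate triangles.
    while True:
        a, b, c = (rng.uniform(0.1, 10.0) for _ in range(3))
        if a < b + c and b < c + a and c < a + b:
            return a, b, c


def holds_on_random_triangles(trials=10_000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = random_triangle(rng)
        # Small relative slack for floating-point rounding.
        if lhs(a, b, c) > rhs(a, b, c) * (1 + 1e-12):
            return False
    return True
```

For an equilateral triangle both sides coincide, which matches the equality case $a = b = c$.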
REMARKS
Note that if one starts from the left hand side of (1.1), it is not clear how to reach the right
hand side of (1.1). (In particular, observe that 1/a + 1/b − 1/c may be greater than
1/(a + b − c); consider, for example, a = 3, b = 5, and c = 4.) However, when we look at
the right hand side, we notice that sum of any two denominators is twice some denominator
on the left hand side. This suggests that it might be easier to start from the right hand side
and try to reach the left hand side. Using our observation, it makes sense to re-write the
right hand side as
Let us also observe that a more general inequality in fact holds. For any convex function
f : R → R on a connected subset D ⊆ R, it follows that
(1.7)
provided that $a, b, c \in D$. Our problem is a specific case when $f(x) = 1/x$, a convex
function on $D = \mathbb{R}_+$.
The proof of this more general inequality follows exactly the same argument as above.
Point A has coordinates (a + b − c, f (a + b − c)) and point
B = (b + c − a, f (b + c − a)). Now, point D is the midpoint between A and B; that is, D
2
1
2
1
coordinate as D, but its second coordinate is equal to f (c). Inequality (1.4) is a special case
of the following observation illustrated in Figure 1.1: for any convex function f on
$D = [x_1, x_2]$,
\[
f\left(\frac{x_1}{2} + \frac{x_2}{2}\right) \le \frac{f(x_1)}{2} + \frac{f(x_2)}{2}. \tag{1.8}
\]
THEORY
Jensen’s Inequality Let us point out that inequality (1.8) can be easily generalized to any
number of points $x_1, \ldots, x_n$ (not only two), and to any weights (not only one half). This
generalization is known as Jensen’s inequality and can be stated as follows: for any convex
function $f : D \to \mathbb{R}$, numbers $x_1, \ldots, x_n \in D$, and weights $a_1, \ldots, a_n \in \mathbb{R}_+$, we have
that
\[
f\left(\frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i}\right) \le \frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{i=1}^n a_i}.
\]
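Jensen’s inequality can be probed numerically by computing the gap between the two sides for random points and weights. This is a hedged Python sketch; the function names `jensen_gap` and `smallest_gap`, the sampling ranges, and the choice of test functions are assumptions made for illustration.

```python
import math
import random


def jensen_gap(f, xs, weights):
    """Weighted mean of f(x_i) minus f of the weighted mean of the x_i."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_f = sum(w * f(x) for w, x in zip(weights, xs)) / total
    return mean_f - f(mean_x)


def smallest_gap(f, trials=2000, seed=0):
    """Smallest gap seen over random points in [0.1, 5] and positive weights."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        n = rng.randint(2, 6)
        xs = [rng.uniform(0.1, 5.0) for _ in range(n)]
        ws = [rng.uniform(0.1, 1.0) for _ in range(n)]
        best = min(best, jensen_gap(f, xs, ws))
    return best
```

For a convex function such as `math.exp` or `lambda x: 1 / x` the gap should never be negative (up to rounding); for the concave `math.log` the sign flips.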
\[
\frac{1}{a} - \frac{1}{b} + \frac{1}{c} \ge \frac{1}{a + c - b}.
\]
Illustrate the solution graphically. Does the same inequality hold for any function
f : R → R that is convex on some connected subset of R?
1.1.2. Prove that for any $n \in \mathbb{N}$ and any real number $s \ge 2$, the following inequality holds:
\[
\frac{\sum_{k=1}^n k^s}{\sum_{k=1}^n k} \ge \left(\frac{2}{3} n + \frac{1}{3}\right)^{s-1}.
\]
\[
\sqrt{x} + \sqrt{x + 2} < 2\sqrt{x + 1}.
\]
Problem and idea for the solution: XXXI PLMO – Phase 2 – Problem 2
PROBLEM
(1.9)
THEORY
ratio and $a$ is a scale factor, equal to the sequence’s initial value. It follows immediately
from the definition that a geometric sequence satisfies the following recursive relation: for
every integer $i \ge 1$, $a_i = r a_{i-1}$. Hence, the $i$-th term is given by $a_i = a r^{i-1}$.
\[
\sum_{i=1}^n a_i = \sum_{i=1}^n a r^{i-1} = \frac{a(1 - r^n)}{1 - r}, \tag{1.10}
\]
provided r ≠ 1. Indeed,
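\[
\begin{aligned}
\sum_{i=1}^n a_i = \sum_{i=1}^n a r^{i-1} &= \frac{1}{1-r} \cdot (1 - r) \sum_{i=1}^n a r^{i-1} \\
&= \frac{1}{1-r} \cdot \left( (a + ar + \cdots + ar^{n-1}) - (ar + ar^2 + \cdots + ar^n) \right) \\
&= \frac{a - ar^n}{1-r} = \frac{a(1 - r^n)}{1-r}.
\end{aligned}
\]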
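The closed form (1.10) can be compared against direct summation. The sketch below uses exact rational arithmetic so that the two values can be tested for equality; the function names are assumptions made for illustration.

```python
from fractions import Fraction


def geometric_sum_direct(a, r, n):
    # Adds the terms a * r^(i-1) for i = 1, ..., n one by one.
    return sum(a * r ** (i - 1) for i in range(1, n + 1))


def geometric_sum_closed(a, r, n):
    # Closed form (1.10); it is only valid for r != 1.
    if r == 1:
        raise ValueError("the closed form requires r != 1")
    return a * (1 - r ** n) / (1 - r)
```

With `Fraction` inputs such as `Fraction(3)` and `Fraction(1, 2)` no rounding occurs, so both functions should agree exactly.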
\[
A(x_1, \ldots, x_n) := \frac{1}{n} \sum_{i=1}^n x_i.
\]
\[
G(x_1, \ldots, x_n) := \left( \prod_{i=1}^n x_i \right)^{1/n}.
\]
The following inequality relates the first two means and turns out to be very useful. For
any $n$ numbers $x_1, \ldots, x_n \in \mathbb{R}_+ \cup \{0\}$,
\[
A(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^n x_i \ge \left( \prod_{i=1}^n x_i \right)^{1/n} = G(x_1, \ldots, x_n). \tag{1.11}
\]
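$i \in [n]$. Moreover, to prove inequality (1.11), it is enough to show that the following
inequality holds:
\[
\ln\left( \frac{1}{n} \sum_{i=1}^n x_i \right) \ge \ln\left( \left( \prod_{i=1}^n x_i \right)^{1/n} \right) = \frac{1}{n} \cdot \ln\left( \prod_{i=1}^n x_i \right) = \frac{\sum_{i=1}^n \ln(x_i)}{n}. \tag{1.12}
\]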
follows immediately from Jensen’s inequality applied to $f(x) = \ln(x)$, a concave function
on $D = \mathbb{R}_+$, and $a_i = 1/n$ for all $i$. From this inequality, we also get that equality holds (in
both inequalities (1.11) and (1.12)) if and only if all the $x_i$ terms are equal.
Finally, let us consider the relationship between the harmonic and the geometric mean.
We claim that
\[
H(x_1, \ldots, x_n) = \frac{n}{\sum_{i=1}^n 1/x_i} \le \left( \prod_{i=1}^n x_i \right)^{1/n} = G(x_1, \ldots, x_n).
\]
As before, equality holds if and only if all the xi terms are equal.
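The ordering of the harmonic, geometric, and arithmetic means can be illustrated with a few lines of Python. This is only a sketch for positive inputs; the function names are assumptions made for illustration, and the geometric mean is computed through logarithms, mirroring (1.12).

```python
import math


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    # exp of the mean of logarithms; requires strictly positive inputs.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    return len(xs) / sum(1 / x for x in xs)
```

For the sample `[1, 2, 4]` the geometric mean is 2 (up to rounding), and it lies between the harmonic and the arithmetic mean; for a constant list all three means coincide.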
SOLUTION
First, let us note that, without loss of generality, we may assume that $x_i \in \mathbb{R}_+ \cup \{0\}$ for all
$i \in [n]$. Indeed, if inequality (1.9) holds for all sequences $x_1, \ldots, x_n \in \mathbb{R}_+ \cup \{0\}$, then for
We will start from the left hand side of inequality (1.9), and try to reach its right hand
side. First, note that
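\[
\prod_{i=1}^n x_i = \left( 1 \cdot \prod_{i=1}^n x_i^{2^n} \right)^{1/2^n}
= \left( 1 \cdot \prod_{i=1}^n x_i^{2^{n-i} \cdot 2^i} \right)^{1/2^n}
= \left( 1 \cdot \prod_{i=1}^n \prod_{j=1}^{2^{n-i}} x_i^{2^i} \right)^{1/2^n}
\]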
\[
1 + \sum_{i=1}^n 2^{n-i} = 1 + \sum_{j=0}^{n-1} 2^j = 1 + \frac{1 - 2^n}{1 - 2} = 1 + (2^n - 1) = 2^n
\]
terms. (See equality (1.10) for the value of the geometric series.) Hence, we can apply the
theorem relating the geometric and the arithmetic mean to get that
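\[
\begin{aligned}
\left( 1 \cdot \prod_{i=1}^n \prod_{j=1}^{2^{n-i}} x_i^{2^i} \right)^{1/2^n}
&\le \frac{1}{2^n} \left( 1 + \sum_{i=1}^n \sum_{j=1}^{2^{n-i}} x_i^{2^i} \right) \\
&= \frac{1}{2^n} + \sum_{i=1}^n \frac{2^{n-i}}{2^n} x_i^{2^i} = \frac{1}{2^n} + \sum_{i=1}^n \frac{x_i^{2^i}}{2^i}.
\end{aligned}
\]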
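The chain above bounds the product $\prod_{i=1}^n x_i$ by $1/2^n + \sum_{i=1}^n x_i^{2^i}/2^i$. A hedged numerical spot check in Python follows; the function names and the sampling range $[0, 1.5]$ are assumptions made for illustration.

```python
import math
import random


def upper_bound(xs):
    # 1/2^n + sum over i = 1..n of x_i^(2^i) / 2^i, with i counted from 1.
    n = len(xs)
    return 1 / 2 ** n + sum(x ** (2 ** i) / 2 ** i for i, x in enumerate(xs, start=1))


def bound_holds(trials=5000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 6)
        xs = [rng.uniform(0.0, 1.5) for _ in range(n)]
        # Small relative slack for floating-point rounding.
        if math.prod(xs) > upper_bound(xs) * (1 + 1e-12):
            return False
    return True
```

When every $x_i$ equals 1, the bound evaluates to $1/2^n + (1 - 1/2^n) = 1$, so both sides agree.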
REMARKS
As the left hand side of inequality (1.9) is a product that is smaller than the right hand side
that is a sum of some kind, it is natural to try to apply the geometric-arithmetic inequality.
The fact that the smallest term of the right hand side is $1/2^n$ suggests that we need $2^n$
terms, and exactly one of them is equal to 1. Then, we see that $x_i^{2^i}$ is divided by $2^i$, which
means that we need $2^{n-i}$ such terms. Combining all these observations together, our goal is
to transform our inequality so that the following properties hold: 1) there are $2^n$ terms in the
sum, 2) $x_i^{2^i}$ should be present in $2^{n-i}$ identical terms. So, starting from the right hand side of
Finally, the only other thing to notice is that there are exactly $2^n$ terms added together. This
allows us to use the geometric-arithmetic inequality, thus completing the argument.
EXERCISES
1.2.2. Show that for any $n$ numbers $a_1, \ldots, a_n \in \mathbb{R}_+$, the following inequality holds:
\[
\frac{a_1}{a_2 + 1} + \frac{a_2}{a_3 + 1} + \cdots + \frac{a_{n-1}}{a_n + 1} + \frac{a_n}{a_1 + 1} \ge \frac{n^2}{n + \alpha},
\]
where $\alpha = \sum_{i=1}^n 1/a_i$.
(Source of the problem: Exam – Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
1.2.3. Prove that for any $a, b \in \mathbb{R}_+$, for which $ab = 1$, we have that
\[
a^m + b^m \ge 2,
\]
where $m \in \mathbb{R}_+$.
PROBLEM
Prove that for any integer $n \ge 2$ and any sequence of $n$ real numbers $a_1, \ldots, a_n \in (1, \infty)$,
\[
\sum_{i=1}^n \frac{a_i}{\ln(a_{i+1})} \ge \sum_{i=1}^n \frac{a_i}{\ln(a_i)}, \tag{1.13}
\]
where $a_{n+1} = a_1$.
THEORY
This method requires two things to be proven. First, one needs to check the base case; that
is, to prove that the property holds for the smallest number $n_0$. Second, one needs to prove
the induction step; that is, show that if the property holds for some $n \in \{n_0, n_0 + 1, \ldots\}$
(this assumption is often called the inductive hypothesis), then it holds for $n + 1$. These two
steps establish the property $P(n)$ for every integer $n \in \{n_0, n_0 + 1, \ldots\}$.
In order to illustrate the method, let us prove the following simple inequality (property
$P(n)$): $2n + 1 \le 2^n$ for all integers $n \ge 3$. The base case ($P(3)$) clearly holds:
\[
7 = 2 \cdot 3 + 1 \le 2^3 = 8.
\]
Suppose that $P(n)$ holds: $2n + 1 \le 2^n$ for some integer $n \ge 3$. We want to show that
$P(n + 1)$ holds: $2(n + 1) + 1 \le 2^{n+1}$. This is true since
\[
2(n + 1) + 1 = (2n + 1) + 2 \le 2^n + 2 \le 2^n + 2^n = 2^{n+1}.
\]
(The first inequality holds by the inductive hypothesis; the second one holds since $2 \le 2^n$ for
any $n \ge 3$.)
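The two ingredients of the induction argument for $2n + 1 \le 2^n$ can be checked mechanically for a finite range of $n$. This Python sketch is only an illustration (a finite check is not a proof), and the function names are assumptions.

```python
def base_case_holds(n0=3):
    # P(n0): 2*n0 + 1 <= 2^n0.
    return 2 * n0 + 1 <= 2 ** n0


def induction_step_chain(n):
    # The three quantities (2n+1)+2, 2^n+2, 2^n+2^n from the induction step.
    return (2 * n + 1) + 2, 2 ** n + 2, 2 ** n + 2 ** n


def chain_is_valid(n):
    # Checks 2(n+1)+1 = (2n+1)+2 <= 2^n+2 <= 2^n+2^n = 2^(n+1).
    first, second, third = induction_step_chain(n)
    return first == 2 * (n + 1) + 1 and first <= second <= third == 2 ** (n + 1)
```

The chain fails for $n = 2$, where the inductive hypothesis $2n + 1 \le 2^n$ itself is false, which is why the base case starts at $n_0 = 3$.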
SOLUTION
We say that property $P(n)$ holds if inequality (1.13) holds for all sequences of $n$ real
numbers $a_1, \ldots, a_n \in (1, \infty)$. The following symmetry will turn out to be useful:
$a_{n+1} = a_1$.) In particular, it implies that, without loss of generality, we may assume that $a_n$
is a smallest element in the sequence (as one can shift the initial sequence until a smallest
element is at the end).
We will prove by mathematical induction (on n) that P (n) holds for all integers n ≥ 2.
First, let us check the base case ( n = 2). We need to show that property P (2) holds; that
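is, to prove that for any $a_1, a_2 \in (1, \infty)$, we have
\[
\frac{a_1}{\ln(a_2)} + \frac{a_2}{\ln(a_1)} \ge \frac{a_1}{\ln(a_1)} + \frac{a_2}{\ln(a_2)},
\]
which is equivalent to
\[
(a_1 - a_2)\left( \frac{1}{\ln(a_2)} - \frac{1}{\ln(a_1)} \right) \ge 0. \tag{1.14}
\]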
(1.15)
We want to show that $P(n)$ holds; that is, for any $a_1, \ldots, a_n \in (1, \infty)$,
\[
\sum_{i=1}^n \frac{a_i}{\ln(a_i)} \le \sum_{i=1}^n \frac{a_i}{\ln(a_{i+1})} = \sum_{i=1}^{n-2} \frac{a_i}{\ln(a_{i+1})} + \frac{a_{n-1}}{\ln(a_n)} + \frac{a_n}{\ln(a_1)}. \tag{1.16}
\]
Fix any sequence $a_1, \ldots, a_n \in (1, \infty)$. Without loss of generality, we may assume that $a_n$ is
a smallest element. Starting from the left hand side of inequality (1.16) and using the
inductive hypothesis (inequality (1.15)), we get that
\[
\sum_{i=1}^n \frac{a_i}{\ln(a_i)} = \sum_{i=1}^{n-1} \frac{a_i}{\ln(a_i)} + \frac{a_n}{\ln(a_n)} \le \sum_{i=1}^{n-2} \frac{a_i}{\ln(a_{i+1})} + \frac{a_{n-1}}{\ln(a_1)} + \frac{a_n}{\ln(a_n)}.
\]
which is equivalent to
\[
(a_{n-1} - a_n)\left( \frac{1}{\ln(a_n)} - \frac{1}{\ln(a_1)} \right) \ge 0.
\]
Again, similarly to the argument used for inequality (1.14), it is straightforward to see that
this inequality holds since it is assumed that $a_n$ is a smallest element. The induction step is
finished and so is the proof.
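Inequality (1.13) compares a cyclically shifted sum with an aligned one, and it can be spot-checked on random sequences. The following Python sketch is an illustration only; the names `shifted_sum`, `aligned_sum`, `inequality_holds` and the sampling range are assumptions.

```python
import math
import random


def shifted_sum(a):
    # Sum of a_i / ln(a_{i+1}) with the convention a_{n+1} = a_1.
    n = len(a)
    return sum(a[i] / math.log(a[(i + 1) % n]) for i in range(n))


def aligned_sum(a):
    # Sum of a_i / ln(a_i).
    return sum(x / math.log(x) for x in a)


def inequality_holds(trials=5000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 8)
        a = [rng.uniform(1.01, 20.0) for _ in range(n)]
        # Small relative slack for floating-point rounding.
        if shifted_sum(a) < aligned_sum(a) * (1 - 1e-12):
            return False
    return True
```

For a constant sequence the shift changes nothing, so both sums coincide.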
REMARKS
In fact, one can prove more general property. For any two increasing functions f and g on D
the following is true: for any integer n ≥ 2 and any sequence of n real numbers
$a_1, \ldots, a_n \in D$,
n n
i=1 i=1
i=1 i=1
Our question is a specific case of this general inequality when $f(x) = x$ and
$g(x) = 1/\ln(x)$.
Finally, let us mention that the above two inequalities are a direct consequence of the
following rearrangement inequality: for every two monotone sequences $x_1 \le \ldots \le x_n$ and
$y_1 \le \ldots \le y_n$,
\[
x_n y_1 + \ldots + x_1 y_n \le x_{\sigma(1)} y_1 + \cdots + x_{\sigma(n)} y_n \le x_1 y_1 + \ldots + x_n y_n,
\]
where σ : [n] → [n] is any permutation of [n]. It is good to recall such general inequalities,
and that they can be proven using mathematical induction. The idea for the proof of the
initial problem then comes naturally.
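For short sequences, the rearrangement inequality can be verified by brute force over all permutations. This is a minimal Python sketch; the function names are assumptions made for illustration, and the enumeration is only practical for small $n$.

```python
from itertools import permutations


def pairing_sums(xs, ys):
    # All values of x_{sigma(1)} y_1 + ... + x_{sigma(n)} y_n over permutations sigma.
    indices = range(len(xs))
    return [sum(xs[s] * y for s, y in zip(sigma, ys)) for sigma in permutations(indices)]


def extremes_match(xs, ys):
    xs, ys = sorted(xs), sorted(ys)
    sums = pairing_sums(xs, ys)
    lowest = sum(x * y for x, y in zip(reversed(xs), ys))  # opposite order
    highest = sum(x * y for x, y in zip(xs, ys))           # same order
    return min(sums) == lowest and max(sums) == highest
```

For `xs = [1, 2, 3]` and `ys = [4, 5, 6]` the pairing sums range from 28 (opposite order) to 32 (same order).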
EXERCISES
\[
a^b b^a \le a^a b^b.
\]
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
\[
\frac{ab}{c} + \frac{bc}{a} + \frac{ca}{b} \ge a + b + c.
\]
c a b
(Source of the problem: Student Circle – High School of Stanisław Staszic in Warsaw.
Solution: our own.)
\[
a^a b^b c^c \ge (abc)^{(a+b+c)/3}.
\]
PROBLEM
\[
\sqrt[\sin(\alpha)]{2 - \cos^2(\alpha)} + \sqrt[\sin(\alpha)]{\cos^2(\alpha)} \ge 2.
\]
THEORY
for r ≤ 0 or r ≥ 1, and
\[
(1 + x)^r \le 1 + rx,
\]
SOLUTION
Fix any $0 < \alpha < \pi/2$. Recall that $\sin^2(\alpha) + \cos^2(\alpha) = 1$. Substituting
\[
r := 1/\sin(\alpha) \in (1, \infty)
\]
and
\[
x := \sin^2(\alpha) = 1 - \cos^2(\alpha) \in (0, 1),
\]
we get that
\[
\sqrt[\sin(\alpha)]{2 - \cos^2(\alpha)} + \sqrt[\sin(\alpha)]{\cos^2(\alpha)} = (1 + x)^r + (1 - x)^r.
\]
Now, since $x > -1$, $-x > -1$, and $r > 1$, we can apply Bernoulli’s inequality to get
\[
(1 + x)^r + (1 - x)^r \ge (1 + xr) + (1 - xr) = 2.
\]
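The substitution $r = 1/\sin(\alpha)$, $x = \sin^2(\alpha)$ also makes the expression easy to evaluate numerically. The following Python sketch scans a grid of angles in $(0, \pi/2)$; the function names and the grid size are assumptions made for illustration.

```python
import math


def left_side(alpha):
    # (1 + x)^r + (1 - x)^r with r = 1/sin(alpha) and x = sin(alpha)^2.
    s = math.sin(alpha)
    r = 1 / s
    x = s ** 2
    return (1 + x) ** r + (1 - x) ** r


def minimum_on_grid(steps=1000):
    # Smallest value over alpha = k * (pi/2) / steps for k = 1, ..., steps - 1.
    return min(left_side(k * (math.pi / 2) / steps) for k in range(1, steps))
```

On such a grid the minimum stays at or above 2 (up to rounding), and it approaches 2 near both ends of the interval.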
REMARKS
Note that in this example we actually showed a slightly stronger inequality. Indeed,
although x and r are related to each other (both are functions of α), the inequality is true
even if they are not related. Such an approach of trying to prove a stronger result instead of
the one we really care about is not uncommon in mathematics. It sometimes leads to a
simpler proof of the result we care about.
A tricky part in this problem is to find the substitution $x = 1 - \cos^2(\alpha)$. In order to reach
it, the first step is to check when the right hand side is equal to the left hand side, and we
immediately see that this is the case when $\cos^2(\alpha) = 1$. It is then helpful to know that a
typical trick in such cases is to consider a deviation from the equality case. From here, we
obtain:
\[
(1 + x)^{1/\sin(\alpha)} + (1 - x)^{1/\sin(\alpha)}.
\]
Now, if one remembers Bernoulli’s inequality, one immediately gets that it is at least 2 as
long as $1/\sin(\alpha) > 1$. Fortunately, this is the case in our example. Alternatively, one can
use Jensen’s inequality as
\[
\frac{(2 - \cos^2(\alpha)) + \cos^2(\alpha)}{2} = 1,
\]
and $a^r$ is convex for $r > 1$.
EXERCISES
b) $\left( \dfrac{n-1}{n} \right)^{n^2 - 1} < \dfrac{1}{n+2}$.
(Source of the problem: Exam by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
\[
\sqrt[n]{1 + x} \le 1 + \frac{x}{n}.
\]
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
PROBLEM
Let $m, n$ be any two natural numbers such that $m > n > 2$. Prove that
\[
m^n < n^m.
\]
THEORY
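for all $x \in \mathbb{R}_+$.
Note that for $n \ge -a$, it follows from the arithmetic-geometric inequality that
\[
\left( 1 + \frac{a}{n} \right)^{n/(n+1)} = \left( 1 \cdot \prod_{i=1}^n \left( 1 + \frac{a}{n} \right) \right)^{1/(n+1)} < \frac{1 + n\left(1 + \frac{a}{n}\right)}{n + 1} = 1 + \frac{a}{n + 1}. \tag{1.17}
\]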
It follows that the sequence $x_n := \left( 1 + \frac{a}{n} \right)^n$ is eventually increasing for any $a \ne 0$.
\[
= \left( \lim_{x \to \infty} \left( 1 + \frac{-a}{x + a} \right)^{(x + a) - a} \right)^{-1} = \left( e^{-a} \right)^{-1} = e^a.
\]
There is one technical and subtle issue with the argument above. At some point, we
switched from the limit of a sequence, $\lim_{n \to \infty} f(x_n)$, to the limit of a function,
$(x_n)_{n \ge 1}$
such that $\lim_{n \to \infty} x_n = a$. Note that it might be the case that $\lim_{x \to \infty} f(x)$ does
not exist but $\lim_{n \to \infty} f(x_n)$ does; consider, for example, $f(x) = \sin(x)$ and $x_n = \pi n$.
In our situation, as the constant $e$ was defined by the limit of a sequence, we should have
been slightly more careful and made sure we take a limit over integers. This is easy to
verify after noting that
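\[
\left( 1 + \frac{1}{\lceil n/a \rceil} \right)^{\lceil n/a \rceil - 1} \le \left( 1 + \frac{1}{n/a} \right)^{n/a} \le \left( 1 + \frac{1}{\lfloor n/a \rfloor} \right)^{\lfloor n/a \rfloor + 1}.
\]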
Since it is trivially true for $a = 0$, we can now safely claim that for any $a \in \mathbb{R}$,
\[
\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right)^n = e^a. \tag{1.18}
\]
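The limit (1.18) and the monotonicity of the sequence $(1 + a/n)^n$ for $n \ge -a$ can be observed numerically. The sketch below is a plain floating-point illustration in Python; the function name `term` is an assumption.

```python
import math


def term(a, n):
    # The n-th element of the sequence (1 + a/n)^n.
    return (1 + a / n) ** n


def is_increasing(a, start, stop):
    # Checks term(a, n) < term(a, n + 1) for n = start, ..., stop - 1.
    return all(term(a, n) < term(a, n + 1) for n in range(start, stop))
```

For example, `term(2.0, 10**6)` is close to `math.exp(2.0)`, and for `a = -3.0` the sequence is increasing from `n = 3` onward, in line with the condition $n \ge -a$.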
There are many important and useful inequalities involving the constant e. We mention
only a few here. For any $x \in \mathbb{R}$,
\[
1 + x \le e^x. \tag{1.19}
\]
To see this we note that for natural $n > -x$ we can apply Bernoulli’s inequality to get:
\[
e^x \ge (1 + x/n)^n \ge 1 + n \cdot x/n = 1 + x.
\]
On the other hand, for any $b \in \mathbb{R}_+$ and any $x \in [0, b]$, or for any $b \in \mathbb{R}_-$ and any $x \in [b, 0]$,
\[
1 + \frac{e^b - 1}{b} \cdot x \ge e^x. \tag{1.20}
\]
FIGURE 1.2: Illustration for inequalities (1.19) and (1.20).
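Inequalities (1.19) and (1.20) say that $e^x$ lies above its tangent line at 0 and below the chord joining $(0, 1)$ and $(b, e^b)$. A hedged Python sketch of a grid check follows; the function names and tolerances are assumptions made for illustration.

```python
import math


def tangent_line(x):
    # Right hand side of 1 + x <= e^x, read as a lower bound for e^x.
    return 1 + x


def chord_line(x, b):
    # 1 + (e^b - 1)/b * x; expm1 computes e^b - 1 accurately for small |b|.
    return 1 + math.expm1(b) / b * x


def bounds_hold(b, steps=1000):
    # Checks both bounds on a grid between 0 and b (b may be negative).
    for k in range(steps + 1):
        x = b * k / steps
        if tangent_line(x) > math.exp(x) + 1e-12:
            return False
        if math.exp(x) > chord_line(x, b) + 1e-9:
            return False
    return True
```

Calling `bounds_hold(2.0)` covers the case $b \in \mathbb{R}_+$, and `bounds_hold(-3.0)` covers $b \in \mathbb{R}_-$.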
SOLUTION
REMARKS
In many problems that involve power function, we reach terms that can be expressed in the
form $(1 + x/n)^n$. Then, it is often useful to remember that such terms (treated as sequences
or functions of $n$ with $x$ fixed) are increasing, but bounded from above by $e^x$.
EXERCISES
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
1.6 Asymptotics
SOURCE
PROBLEM
Check for which $n \in \mathbb{N}$ the following statement holds: for all $x \in \mathbb{R}_+ \cup \{0\}$,
\[
(1 + x)^n \ge 1 + nx + \frac{(nx)^2}{2}.
\]
THEORY
Binomial Theorem The binomial theorem can be written as follows: for any $n \in \mathbb{N}$ and
any $x, y \in \mathbb{R}$,
\[
(x + y)^n = \sum_{i=0}^n \binom{n}{i} x^{n-i} y^i.
\]
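The binomial theorem can be checked exactly for integer arguments. In this short Python sketch, `math.comb` supplies the binomial coefficients; the function name `binomial_expansion` is an assumption made for illustration.

```python
from math import comb


def binomial_expansion(x, y, n):
    # Sum over i = 0..n of C(n, i) * x^(n-i) * y^i.
    return sum(comb(n, i) * x ** (n - i) * y ** i for i in range(n + 1))
```

With integer inputs the comparison against `(x + y) ** n` involves no rounding, for example `binomial_expansion(2, 3, 5) == 5 ** 5`.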
Asymptotic Notation Let f (x) and g(x) be any two functions. In our applications, f (x) is
usually a complicated function whose behavior we would like to understand, and g(x) has a
simple form, and is positive for large enough x. We write:
$f(x) = O(g(x))$ if there exists a positive constant $C$ such that for all
sufficiently large $x$ we have that $|f(x)| \le C|g(x)|$,
$f(x) = \Omega(g(x))$ if there exists a positive constant $c$ such that for all
sufficiently large $x$ we have that $|f(x)| \ge c|g(x)|$,
values of $x$, the last two terms are dominant but eventually the first one becomes much
larger than both of them. Clearly, for any $x \ge 1$,
\[
f(x) = 3x^{5/2} + 10x^2 + 10^{10} x \ln x
\]
it follows that $f(x) = \Omega(x^{5/2})$ and so $f(x) = \Theta(x^{5/2})$. It is straightforward to see that, say,
$f(x) = O(x^3)$ but $f(x) = \omega(x^2)$; that is, $f(x)$ is negligible compared to $x^3$ but grows
\[
\lim_{x \to \infty} \frac{f(x)}{3x^{5/2}} = \lim_{x \to \infty} \left( 1 + \frac{10x^2}{3x^{5/2}} + \frac{10^{10} x \ln x}{3x^{5/2}} \right) = 1.
\]
Alternatively, we could have observed that
\[
f(x) = 3x^{5/2} + O(x^2) + O(x \ln x)
\]
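The slow onset of the asymptotic regime for $f(x) = 3x^{5/2} + 10x^2 + 10^{10} x \ln x$ is easy to see numerically. The Python sketch below evaluates the ratio $f(x) / (3x^{5/2})$; the function names are assumptions made for illustration.

```python
import math


def f(x):
    return 3 * x ** 2.5 + 10 * x ** 2 + 1e10 * x * math.log(x)


def ratio(x):
    # f(x) divided by its leading term 3 x^(5/2); tends to 1 as x grows.
    return f(x) / (3 * x ** 2.5)
```

At $x = 10^4$ the ratio is still in the tens of thousands because the term $10^{10} x \ln x$ dominates, whereas at $x = 10^{12}$ it differs from 1 by only a few parts in a million.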
One needs to be careful when working with asymptotic notation, as the notation presents
many counterintuitive properties. For example, note that we cannot deduce that
$\Omega(f(n)) + \Omega(g(n)) = \Omega(f(n))$. Indeed, if $f$ and $g$ are of the same order, it might not be
true: $7n^2 + 3n = \Omega(n^2)$ and $-7n^2 + 10^{10} \ln n = \Omega(n^2)$ but
\[
(7n^2 + 3n) + (-7n^2 + 10^{10} \ln n) = 3n + 10^{10} \ln n = \Omega(n).
\]
The above de nitions and examples assume that x → ∞. However, sometimes we would
like to understand the behavior of some function f (x) when x → 0. The notation
introduced above can be easily adjusted to this situation. For example, we write
f (x) = O(g(x)) if there exist positive constants C and M such that for all x ∈ (0, M ) we
(1.21)
For $x = 0$ it trivially holds. Now fix any $x \in \mathbb{R} \setminus \{0\}$. Our goal is to show that
$\lim_{n \to \infty} s_n = e^x$, where
\[
s_n := \sum_{i=0}^n \frac{x^i}{i!}.
\]
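The convergence of the partial sums $s_n$ to $e^x$ can be observed directly. This is a minimal Python sketch; the function name `partial_sum` is an assumption, and each term is obtained from the previous one by multiplying by $x/(i+1)$, the same ratio that is bounded in the argument.

```python
import math


def partial_sum(x, n):
    # s_n = sum over i = 0..n of x^i / i!, built term by term.
    total = 0.0
    term = 1.0  # x^0 / 0!
    for i in range(n + 1):
        total += term
        term *= x / (i + 1)  # turns x^i / i! into x^(i+1) / (i+1)!
    return total
```

For instance, `partial_sum(1.0, 20)` agrees with `math.e` to within floating-point accuracy.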
We may assume that $n \ge 3|x|$ is a large enough integer (but fixed) so that for any integer
$i \ge n$ we have
\[
0 \le \frac{|x|^{i+1} / (i+1)!}{|x|^i / i!} = \frac{|x|}{i + 1} \le \frac{1}{3}.
\]
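This implies, that by equality (1.10), the value of the geometric series with the scale factor
$1/3$ has
\[
\begin{aligned}
\left| \sum_{j=n+1}^m \frac{x^j}{j!} \cdot \frac{m!}{(m-j)!\, m^j} \right|
&\le \sum_{j=n+1}^m \frac{|x|^j}{j!} \cdot \frac{m!}{(m-j)!\, m^j} \\
&= \sum_{j=n+1}^m \frac{|x|^j}{j!} \cdot \frac{m}{m} \cdot \frac{m-1}{m} \cdot \ldots \cdot \frac{m-j+1}{m} \le \sum_{j=n+1}^m \frac{|x|^j}{j!} \\
&\le \sum_{j=n+1}^m \frac{|x|^n}{n!} \cdot \frac{1}{3^{j-n}} \le \frac{|x|^n}{2\, n!},
\end{aligned}
\]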
for any integer $m > n$. As a result, for any integer $m > n \ge 3|x|$,
\[
\left| e_m - \sum_{i=0}^n \frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\, m^i} \right| \le \frac{|x|^n}{2\, n!}. \tag{1.22}
\]
Next, observe that $s_n$ can be made arbitrarily close to $\sum_{i=0}^n \frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\, m^i}$ by making sure that
$m$ is large enough. To see this, note that there is a finite number of terms in the sum
(namely, $n + 1$), and that $\frac{m!}{(m-i)!\, m^i}$ tends to 1 as $m$ tends to infinity. Therefore, as $x$ and $n$
are fixed, we can choose $m$ large enough (that is, $m = f(n) \in \mathbb{N}$ for some function $f$) to
ensure that
\[
\left| s_n - \sum_{i=0}^n \frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\, m^i} \right| \le \frac{|x|^n}{2\, n!}. \tag{1.23}
\]
(Let us comment that we choose $\frac{|x|^n}{2\, n!}$ as a convenient upper bound as it matches the bound
in (1.22); however, here we could use any bound that tends to zero as $n \to \infty$.) Now,
combining inequalities (1.22) and (1.23), we get that
\[
|s_n - e_m| \le \frac{|x|^n}{n!}.
\]
Finally, as $e_m \to e^x$ as $m \to \infty$, for $m$ large enough (that is, possibly after adjusting
Now, to see that inequality (1.19) holds asymptotically, we use (1.21) and note that
\[
e^x = 1 + x + \frac{x^2}{2} + O(x^3) \ge 1 + x + \frac{x^2}{4} \ge 1 + x,
\]
provided that $x \in \mathbb{R}$ is sufficiently close to zero. (Of course, it holds for all $x \in \mathbb{R}$ but the
aim here is to understand the behavior around zero.) Similarly, to see that the first part of
inequality (1.20) holds asymptotically, note that for any $\epsilon > 0$,
\[
e^x = 1 + x + O(x^2) \le 1 + x + \epsilon x = 1 + (1 + \epsilon)x,
\]
SOLUTION
We will prove that the statement does not hold for any $n \in \mathbb{N}$; that is, for any $n \in \mathbb{N}$, there
exists $x \ge 0$ such that
\[
f(x) := 1 + nx + \frac{(nx)^2}{2} - (1 + x)^n > 0.
\]
Clearly, for $n = 1$, we have
\[
f(x) = 1 + x + \frac{x^2}{2} - (1 + x) = \frac{x^2}{2} > 0
\]
for every $x > 0$. Similarly, for $n = 2$, we get that for any $x > 0$
\[
f(x) = 1 + 2x + \frac{(2x)^2}{2} - (1 + x)^2 = x^2 > 0.
\]
Now, let us fix any $n \ge 3$. This time we need a more sophisticated argument, as the
statement clearly holds for large enough $x$ but also for $x = 0$. However, it fails for $x$
sufficiently close to zero (but not equal to zero). Using the binomial theorem, we get that
\[
\begin{aligned}
f(x) &= 1 + nx + \frac{(nx)^2}{2} - \sum_{i=0}^n \binom{n}{i} x^i \\
&= 1 + nx + \frac{n^2}{2} \cdot x^2 - \left( 1 + nx + \frac{n(n-1)}{2} \cdot x^2 + \sum_{i=3}^n \binom{n}{i} x^i \right) \\
&= \frac{n}{2} \cdot x^2 - \sum_{i=3}^n \binom{n}{i} x^i.
\end{aligned} \tag{1.24}
\]
Since $\binom{n}{i} \le \binom{n}{\lfloor n/2 \rfloor} = \binom{n}{\lceil n/2 \rceil}$, we get that for any
$x \in [0, 1/2]$,
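\[
\begin{aligned}
f(x) &= \frac{n}{2} \cdot x^2 - \sum_{i=3}^n \binom{n}{i} x^i \ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2 \rfloor} \sum_{i=3}^n x^i \\
&\ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2 \rfloor} x^3 \sum_{i=0}^{n-3} x^i \ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2 \rfloor} x^3 \sum_{i=0}^{\infty} \left( \frac{1}{2} \right)^i \\
&= \frac{n}{2} \cdot x^2 - 2 \binom{n}{\lfloor n/2 \rfloor} x^3 = x^2 \left( \frac{n}{2} - 2 \binom{n}{\lfloor n/2 \rfloor} x \right).
\end{aligned}
\]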
Hence, we can take $x_0 = n / \left( 8 \binom{n}{\lfloor n/2 \rfloor} \right) \in (0, 1/2]$ to get the desired counter-example, that is,
\[
f(x_0) \ge x_0^2 (n/2 - n/4) = x_0^2 \cdot n/4 > 0.
\]
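The counter-example can be verified with exact rational arithmetic. The Python sketch below assumes the witness $x_0 = n / (8 \binom{n}{\lfloor n/2 \rfloor})$, chosen so that $2 \binom{n}{\lfloor n/2 \rfloor} x_0 = n/4$ as in the last display; the function names `f` and `witness` are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def f(n, x):
    # f(x) = 1 + n x + (n x)^2 / 2 - (1 + x)^n; positive values break the statement.
    return 1 + n * x + (n * x) ** 2 / 2 - (1 + x) ** n


def witness(n):
    # Assumed witness x_0 = n / (8 * C(n, floor(n/2))), an exact rational number.
    return Fraction(n, 8 * comb(n, n // 2))
```

For $n = 3$ the witness is $x_0 = 1/8$ and $f(x_0) = 11/512$, which is above the guaranteed bound $x_0^2 \cdot n/4 = 3/256$.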
REMARKS
Let us come back to our original question. Using the asymptotic notation (when x → 0),
one can continue the argument from equation (1.24) as follows, avoiding tedious
calculations. Indeed, observe that
\[
\begin{aligned}
f(x) &= \frac{n}{2} \cdot x^2 - \sum_{i=3}^n \Theta(x^i) = \frac{n}{2} \cdot x^2 - \Theta(x^3) \\
&= \frac{n}{2} \cdot x^2 (1 + \Theta(x)) \sim \frac{n}{2} \cdot x^2.
\end{aligned} \tag{1.25}
\]
(Recall that $n$ is perhaps a large but fixed constant.) Hence, for sufficiently small $x$, we get
$f(x) > 0$ and we are done. Actually, this asymptotic analysis suggests how to formalize the
argument in the proof above, which only adds that we choose a specific (small) value for $x$
to avoid asymptotic notation.
As mentioned in the theory part, we used above a non-standard notation when $x \to 0$. Of
course, it is possible to avoid it and use a standard one with $y \to \infty$ by letting
$x := 1/y \to 0$. Then, instead of (1.25), we have
\[
\begin{aligned}
f(x) = f(1/y) &= \frac{n}{2} \cdot (1/y)^2 - \sum_{i=3}^n \Theta((1/y)^i) \\
&= \frac{n}{2} \cdot (1/y)^2 - \Theta((1/y)^3) \sim \frac{n}{2} \cdot (1/y)^2.
\end{aligned}
\]
1.6.1. Show that for any $n \in \mathbb{N}$, there exists a non-negative $x \in \mathbb{R}$ such that
\[
\prod_{i=1}^n (1 + x)^i < 1 + \frac{n^2 + n + 1}{2} x.
\]
1.6.2. Prove that for any polynomial $W(x)$ and sufficiently large $x$ we have that
$(1 + x/n)^n > W(x)$, if $n \in \mathbb{N}$ is greater than the degree of $W$. What does it tell us about
PROBLEM
(1.26)
THEORY
$$
\left( \sum_{i=1}^{n} a_i^2 \right) \left( \sum_{i=1}^{n} b_i^2 \right) \ge \left( \sum_{i=1}^{n} a_i b_i \right)^2 ;
$$
equality holds if and only if the two sequences are proportional; that is, there exists a
constant c ∈ R such that $a_i = c b_i$ for all i ∈ [n].
There are at least 12 different proofs of this inequality; here we present an elementary
one. Note that
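$$
\begin{aligned}
0 \le \sum_{i=1}^{n} \sum_{j=1}^{n} (a_i b_j - a_j b_i)^2 &= \sum_{i=1}^{n} \sum_{j=1}^{n} \left( a_i^2 b_j^2 - 2 a_i a_j b_i b_j + a_j^2 b_i^2 \right) \\
&= \sum_{i=1}^{n} a_i^2 \sum_{j=1}^{n} b_j^2 - 2 \sum_{i=1}^{n} a_i b_i \sum_{j=1}^{n} a_j b_j + \sum_{i=1}^{n} b_i^2 \sum_{j=1}^{n} a_j^2 \\
&= 2 \left( \sum_{i=1}^{n} a_i^2 \right) \left( \sum_{i=1}^{n} b_i^2 \right) - 2 \left( \sum_{i=1}^{n} a_i b_i \right)^2 ,
\end{aligned}
$$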
Titu’s Lemma The next inequality, known as Titu’s lemma, is a direct consequence of the
Cauchy-Schwarz inequality. For any two sequences $x_1, \dots, x_n \in \mathbb{R}$ and $y_1, \dots, y_n \in \mathbb{R}_+$,
$$
\sum_{i=1}^{n} \frac{x_i^2}{y_i} \ge \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{\sum_{i=1}^{n} y_i} .
$$
SOLUTION
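First, let us note that the left hand side of (1.26) is equal to $A_1 + A_2$, where
$$
\begin{aligned}
A_1 &:= \frac{1^2}{x_2 + x_3} + \frac{1^2}{x_3 + x_4} + \dots + \frac{1^2}{x_{n-1} + x_n} + \frac{1^2}{x_n + x_1} + \frac{1^2}{x_1 + x_2}, \\
A_2 &:= \frac{x_1^2}{x_2 + x_3} + \frac{x_2^2}{x_3 + x_4} + \dots + \frac{x_{n-2}^2}{x_{n-1} + x_n} + \frac{x_{n-1}^2}{x_n + x_1} + \frac{x_n^2}{x_1 + x_2} .
\end{aligned}
$$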
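$$
A_1 \ge \frac{\left( \sum_{i=1}^{n} 1 \right)^2}{\sum_{i=1}^{n} (x_{i+1} + x_{i+2})} = \frac{n^2}{2 \sum_{i=1}^{n} x_i} ;
$$
as before, we used the convention that $x_{n+1} = x_1$ and $x_{n+2} = x_2$. Applying the lemma one
more time, we get
$$
A_2 \ge \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{\sum_{i=1}^{n} (x_{i+1} + x_{i+2})} = \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{2 \sum_{i=1}^{n} x_i} = \frac{1}{2} \sum_{i=1}^{n} x_i .
$$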
Hence,
$$
A_1 + A_2 \ge \frac{1}{2} \left( \frac{n^2}{\sum_{i=1}^{n} x_i} + \sum_{i=1}^{n} x_i \right) \ge \left( \frac{n^2}{\sum_{i=1}^{n} x_i} \cdot \sum_{i=1}^{n} x_i \right)^{1/2} = n,
$$
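A quick numerical experiment can make the bound A1 + A2 ≥ n more tangible. The sketch below assumes, as the definitions of A1 and A2 suggest, that the left hand side of (1.26) is the cyclic sum of (1 + x_i^2)/(x_{i+1} + x_{i+2}) over positive reals; the function name `cyclic_sum` is ours.

```python
import random


def cyclic_sum(xs):
    """Sum of (1 + x_i^2) / (x_{i+1} + x_{i+2}) with indices taken cyclically."""
    n = len(xs)
    return sum(
        (1 + xs[i] ** 2) / (xs[(i + 1) % n] + xs[(i + 2) % n]) for i in range(n)
    )


rng = random.Random(0)
for n in range(3, 8):
    sample = [rng.uniform(0.1, 5.0) for _ in range(n)]
    # The value should never drop below n; equality is attained at x_1 = ... = x_n = 1.
    print(n, round(cyclic_sum(sample), 4), cyclic_sum([1.0] * n))
```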
REMARKS
If one remembers Titu’s lemma, then the solution comes to mind naturally after noticing
that fractions on the left hand side of (1.26) contain squares in their numerators but there
are no squares in their denominators. However, the question is what one needs to do
without knowing the lemma (or without realizing that it can be applied to this problem).
Here is an elementary argument. Observe that
$$
(2 - b)^2 + (2a - b)^2 \ge 0,
\tag{1.27}
$$
which is equivalent to
$$
1 + a^2 \ge b + ab - b^2/2,
$$
Now, after substituting a = xi and b = xi+1 + xi+2 , we immediately get the desired
inequality:
$$
\sum_{i=1}^{n} \frac{1 + x_i^2}{x_{i+1} + x_{i+2}} \ge \sum_{i=1}^{n} \left( 1 + x_i - \frac{x_{i+1} + x_{i+2}}{2} \right) = n.
$$
The only question remaining is how to guess the starting point, that is, inequality (1.27)?
One possible line of reasoning is as follows. We need to deal with the sum of n fractions,
each of the form $\frac{1 + a^2}{b}$, where $a = x_i > 0$ and $b = x_{i+1} + x_{i+2} > 0$. We observe that,
although it is possible that some fractions are close to zero, the sum has to be large. Indeed,
if one fraction is small, then its denominator must be large and so we expect the following
fractions to be large. Our hope is that this will balance out and, on average, fractions have
values at least 1. In fact, it is natural to conjecture that the left hand side of inequality (1.26)
reaches its minimum for $x_1 = \dots = x_n = 1$. But how do we turn it into a formal
argument? Since the quadratic function is not the easiest to work with, the goal is to bound
$f(a) = \frac{1 + a^2}{b}$ from below by a simpler, linear, function g(a) = ca + d with similar
behavior, namely, if one term is small the other terms are forced to be large. It makes sense
to make an approximation as tight as possible, so we want the line g(a) to touch the
parabola f (a). However, what should be the touching point? The answer is relatively easy:
as already mentioned, the original inequality (1.26) is tight when all $x_i$ are equal (in fact,
all are equal to 1). This implies that we want a = b/2 and so a touching point should be
(b/2, 1/b + b/4); see Figure 1.3 for an illustration. Hence,
g(a) = c(a − b/2) + 1/b + b/4.
FIGURE 1.3: Illustration for tuning functions f (a) and g(a).
It remains to calculate the constant c. Since we want f (a) ≥ g(a), the function
$$
f(a) - g(a) = \frac{1}{b} \cdot a^2 - c \cdot a + \frac{2cb - b}{4}
$$
should be a quadratic function with its discriminant equal to zero. It follows that
$$
c^2 - 4 \cdot \frac{1}{b} \cdot \frac{2cb - b}{4} = c^2 - 2c + 1 = 0,
$$
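and so c = 1. We get that f (a) ≥ g(a) = a − b/4 + 1/b. Finally, since
$$
\frac{b}{4} + \frac{1}{b} \ge 2 \sqrt{\frac{b}{4} \cdot \frac{1}{b}} = 1,
$$
we get that
$$
\frac{1 + a^2}{b} \ge a - \frac{b}{4} + \frac{1}{b} = \frac{b}{4} + \frac{1}{b} + a - \frac{b}{2} \ge 1 + a - \frac{b}{2},
$$
which is what we need to finish the proof; see (1.28).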
EXERCISES
$$
(a + b + c) \left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \ge 9.
$$
(Source of the problem: inspired by problem PLMO II – Phase 1 – Problem 6. Solution: our
own.)
(Source of the problem: Student Circle – High School of Stanisław Staszic in Warsaw.
Solution: our own.)
1.7.3. Prove that if a, b, c ∈ R are such that a + b + c = 1 and min{a, b, c} ≥ −3/4, then
$$
\frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{9}{10} .
$$
Does this inequality hold without the additional assumption that min{a, b, c} ≥ −3/4?
(Source of the problem and solution: XLVII OM – Phase 2 – Problem 3.)
1.8 Probability
SOURCE
PROBLEM
$$
\left( (x + y)^m - x^m \right)^n + \left( (x + y)^n - y^n \right)^m \ge (x + y)^{nm} .
\tag{1.29}
$$
(1.30)
THEORY
Boole’s Inequality The following elementary fact, known as Boole’s inequality but also as
the union bound, is very useful. For any collection of events $A_1, \dots, A_n$ in some probability
space,
$$
\mathbb{P} \left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbb{P}(A_i) .
\tag{1.31}
$$
We note that this inequality is sharp, since the equality holds for disjoint events.
SOLUTION
Proving the special case, inequality (1.30), is relatively easy. After dividing both sides by
, we get an equivalent inequality
2
n +1
2
n
1 1
(1 − ) ≥
2n 2
Now, we notice that the left hand side of inequality (1.32) is an increasing function of n—
see inequality (1.17). On the other hand, it is obvious that the right hand side of inequality
(1.32) is a decreasing function of n. Hence the desired inequality holds if it holds for the
smallest natural number, that is, for n = 1. For n = 1, both sides of inequality (1.32) are
equal to 1/4 and so we are done.
The proof of inequality (1.29) is more challenging. We start with dividing both sides by
$(x + y)^{nm}$ and setting p = x/(x + y) ∈ (0, 1) to get an equivalent inequality: for any
p ∈ (0, 1),
$$
f(p) := (1 - p^m)^n + \left( 1 - (1 - p)^n \right)^m \ge 1 .
\tag{1.33}
$$
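Before proving (1.33), it may help to see it hold on a grid of rational values of p. The following is only a numerical sketch of the one-variable inequality, with exact arithmetic so that the boundary case f(p) = 1 (which occurs for n = 1 or m = 1) is not blurred by rounding; the function name `f` mirrors the text, everything else is our own scaffolding.

```python
from fractions import Fraction


def f(p, n, m):
    """Left hand side of (1.33): (1 - p^m)^n + (1 - (1 - p)^n)^m."""
    return (1 - p ** m) ** n + (1 - (1 - p) ** n) ** m


grid = [Fraction(k, 20) for k in range(1, 20)]  # p in (0, 1)
smallest = min(f(p, n, m) for p in grid for n in range(1, 7) for m in range(1, 7))
print(smallest)  # the minimum over the grid, expected to be at least 1
```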
We are going to introduce a random process and two events, A and B, such that
$$
\mathbb{P}(A) = (1 - p^m)^n ,
\tag{1.34}
$$
$$
\mathbb{P}(B) = \left( 1 - (1 - p)^n \right)^m ,
\tag{1.35}
$$
and argue that no matter what the outcome of the process is, at least one of the two events
must hold. This will finish the proof, as then
It is immediately apparent that the inequality can be reduced to one variable. Then, the
question is what is the most convenient way to do it. One natural approach is to make the right
hand side of inequality (1.29) a constant. This directly gives
$$
\left( 1 - \left( \frac{x}{x + y} \right)^m \right)^n + \left( 1 - \left( \frac{y}{x + y} \right)^n \right)^m \ge 1 .
$$
If x/(x + y) is now the probability of some event, then y/(x + y) is the probability of its
complement. So we try to find a process involving n ⋅ m events, and two associated events.
Alternatively, one can solve this problem analytically. If n = 1 or m = 1, then the left
hand side of inequality (1.29) is equal to its right hand side. Hence, we need to concentrate
on the case min{n, m} > 1. It will be more convenient to focus on inequality (1.33). The
left hand side of inequality (1.33), function f (p), has two terms: $(1 - p^m)^n$ and
$(1 - (1 - p)^n)^m$. The first one is a decreasing function of p and
the second one is an increasing one, but it is not clear what the behavior of the sum of the
two is. In order to show that f (p) > 1 for p ∈ (0, 1), we will show that f (p) > 1 for some
p ∈ (0, 1) and that there is one extremum in the interval (0, 1).
For the first property, note that for given n and m that are greater than 1, if p tends to
zero, then
$$
\begin{aligned}
(1 - p^m)^n + \left( 1 - (1 - p)^n \right)^m
&= \left( 1 - n p^m + O(p^{2m}) \right) + \left( 1 - (1 - np + O(p^2)) \right)^m \\
&= \left( 1 - n p^m + O(p^{2m}) \right) + n^m p^m (1 + O(p)) \\
&= 1 + (n^m - n) p^m + O(p^{m+1}) \\
&= 1 + (n^m - n) p^m (1 + O(p)).
\end{aligned}
$$
This implies that if p is greater than zero but sufficiently small, then f (p) > 1. The first
property therefore holds.
For the second property, note that function f (p) is differentiable on [0, 1]. Clearly,
$$
f'(p) = -nm \, p^{m-1} (1 - p^m)^{n-1} + nm \, (1 - p)^{n-1} \left( 1 - (1 - p)^n \right)^{m-1} .
$$
After setting f ′(p) = 0 and rearranging terms, we get that
$$
\left( \frac{1 - (1 - p)^n}{1 - (1 - p)} \right)^{m-1} = \left( \frac{1 - p^m}{1 - p} \right)^{n-1} .
$$
Now, using the formula for geometric series (see (1.10)), we notice that it is equivalent to
$$
\left( \sum_{i=0}^{n-1} (1 - p)^i \right)^{m-1} = \left( \sum_{i=0}^{m-1} p^i \right)^{n-1} .
$$
Since the left hand side monotonically decreases from $n^{m-1}$ (for p = 0) to 1 (when p → 1)
and the right hand side monotonically increases from 1 (for p → 0) to $m^{n-1}$ (for p = 1),
there is exactly one point when the two sides are equal. It follows that f ′(p) = 0 has only
one solution in the interval (0, 1).
EXERCISES
1.8.2. Prove that for k, n ∈ N, such that k ≤ n, and p, q ∈ [0, 1], such that p < q, we have
that
$$
\sum_{i=k}^{n} \binom{n}{i} \left( q^i (1 - q)^{n-i} - p^i (1 - p)^{n-i} \right) \ge 0.
$$
1.9 Geometry
SOURCE
(1.36)
THEORY
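Heron’s Formula Heron’s formula states that the area of a triangle whose sides have
lengths a, b, and c is
$$
A = \sqrt{s (s - a)(s - b)(s - c)} ,
$$
where s is the semi-perimeter of the triangle, that is,
$$
s = \frac{a + b + c}{2} .
$$
Heron’s formula can also be written as
$$
A = \frac{1}{4} \sqrt{(a + b + c)(-a + b + c)(a - b + c)(a + b - c)} .
$$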
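The two forms of Heron's formula are easy to compare numerically. The sketch below is illustrative only; the function names `heron` and `heron_product` are ours, and the inputs are assumed to be side lengths of a genuine (non-degenerate) triangle.

```python
from math import isclose, sqrt


def heron(a, b, c):
    """Area via the semi-perimeter form sqrt(s(s-a)(s-b)(s-c))."""
    s = (a + b + c) / 2
    return sqrt(s * (s - a) * (s - b) * (s - c))


def heron_product(a, b, c):
    """Area via the product form (1/4) sqrt((a+b+c)(-a+b+c)(a-b+c)(a+b-c))."""
    return sqrt((a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c)) / 4


for sides in [(3, 4, 5), (5, 5, 6), (7, 8, 9)]:
    a, b, c = sides
    # Both forms agree, and the area never exceeds a*b/2.
    print(sides, heron(a, b, c), heron_product(a, b, c), a * b / 2)
```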
SOLUTION
First, let us observe that there is a lot of symmetry in both the left hand side of inequality
(1.36), the function , and the right hand side, the function
2
2 2
f (a, b) := (a + b )
and g(a, b, c) are not affected by the sign of variables; for example,
=: A(a, b, c, )
(1.37)
and use Heron’s formula to notice that the right hand side of inequality (9), function
A(a, b, c), is the area of the considered triangle. Now, since a triangle with arms of lengths
a and b has area less than or equal to ab/2 we get A(a, b, c) ≤ ab/2. (The equality holds if
and only if the angle between the two arms is 90°.) By the geometric-arithmetic mean
inequality, $ab/2 \le (a^2 + b^2)/4$ and so the desired inequality holds. Finally, let us note that
REMARKS
Knowing Heron’s formula turns out to be useful in this example and the way it can be
applied comes to mind naturally. However, one can solve this problem without using it.
After simplifying inequality (1.36), we get an equivalent inequality
$$
h(a, b, c) := c^4 - 2(b^2 + a^2) c^2 + 2(b^4 + a^4) \ge 0 ,
$$
and we note that h(a, b, c) is a quadratic polynomial of $c^2$. Since the discriminant, Δ,
satisfies
$$
\Delta := \left( -2(b^2 + a^2) \right)^2 - 4 \cdot 2(b^4 + a^4) = -4(b^2 - a^2)^2 \le 0,
$$
the desired inequality holds. Moreover, as before, we get that the equality holds if and only
if |a| = |b| = |c|/√2.
EXERCISES
$$
\sqrt{n^2 + \left( \sum_{i=1}^{n} a_i \right)^2} \le \sum_{i=1}^{n} \sqrt{1 + a_i^2} .
$$
$$
\frac{x}{2} \sqrt{1 - x^2} + \frac{1}{16} \sqrt{16 x^2 - 1} < \frac{4}{9} .
$$
Chapter 2
Equalities and Sequences
Suppose that you are given an equation or a set of equations involving some
number of variables. A natural question that is usually asked is to find a solution
in some given domain. In general, there are the following three possible cases:
Our goal is to find all real solutions, that is, all values of x ∈ R that satisfy this
equation. After multiplying both sides by 4a, one can re-write it as follows:
$$
(2ax - b)^2 = b^2 - 4ac .
$$
It is now clear that if $b^2 - 4ac < 0$, then the equation has no real solutions, as the
left hand side is non-negative. On the other hand, if $b^2 - 4ac = 0$, then the
equation has exactly one solution, namely, x = b/(2a). Finally, if $b^2 - 4ac > 0$,
allows deducing some properties of the roots without computing them, is called
the discriminant and is often denoted as Δ.
Another important distinction is with respect to the number of equations. The
problem we need to deal with could consist of
one equation,
It is easy to see that the sum of the second and the third equation is equal to the
first equation, so the system is not independent. After dropping the first equation
we get the following, independent and equivalent, system:
$$
\begin{cases}
2x + y + z = 3 \\
y = x.
\end{cases}
$$
$$
\begin{cases}
z = 3 - 3x \\
y = x.
\end{cases}
$$
As we are not able to further reduce the system, we conclude that there are an
infinite number of solutions: a triple (x, y, z) = (t, t, 3 − 3t) satisfies the original
system of equations for any t ∈ R.
Finally, let us mention the geometric interpretation. Let us start with a
simple example, a linear system involving two variables, say x and y. Each linear
equation determines a line on the xy-plane. Because a solution to a linear system
must satisfy all of the equations, the solution set is the intersection of these lines,
and is hence either a line (an infinite number of solutions), a single point (a
unique solution), or the empty set (no solution). The three cases are illustrated in
Figure 2.1. If there is only one equation x + y = 2, then any point
(x, y) = (t, 2 − t), t ∈ R, satisfies this equation; see Figure 2.1(a). The
following system
$$
\begin{cases}
x + y = 2 \\
x - y = 0
\end{cases}
$$
has precisely one solution, the intersection of the corresponding two lines; see
Figure 2.1(b). Finally, there is no solution to the following system
$$
\begin{cases}
x + y = 2 \\
x - y = 0 \\
y = 2,
\end{cases}
$$
as no point belongs to all of the corresponding three lines; see Figure 2.1(c). For
three variables, each linear equation determines a plane in three-dimensional
space, and the solution set is the intersection of these planes. In general, for n
variables, each linear equation determines a hyperplane in n-dimensional space.
FIGURE 2.1: Geometric interpretation of linear systems.
The situation is more complex for non-linear systems but one can still gain some
intuition by representing each equation as a family of points that satisfy it. In the
next section, we solve the following simple example algebraically:
$$
\begin{cases}
x^2 + y^2 = 2 \\
x + y = 2.
\end{cases}
$$
In Figure 2.2, we present the corresponding graphs that suggest the unique
solution (x, y) = (1, 1).
Solve the following system of equations, given that all variables involved are real
numbers:
$$
\begin{cases}
x^2 = yz + 1 \\
y^2 = zx + 2 \\
z^2 = xy + 4 .
\end{cases}
$$
THEORY
A natural approach when solving systems of equations, that often turns out to be
efficient, is to transform a given system into some other equivalent system that is
easier to deal with.
Suppose that we are given n equations with unknown variables, represented by
the vector x, that is of the form $f_i(x) = b_i$ for i ∈ [n]. For a given sequence of
weights $c_i \in \mathbb{R}$, i ∈ [n], if $c_j \ne 0$ for some j ∈ [n], then one can take the original
system of equations and replace the jth equation, $f_j(x) = b_j$, with a linear
combination of all the equations:
$$
\sum_{i=1}^{n} c_i f_i(x) = \sum_{i=1}^{n} c_i b_i .
$$
The resulting system of equations has the same solutions as the original system.
Indeed, in order to see this, let us first assume that some x is a solution to the
original system. It is clear that it is also a solution of the derived system. On the
other hand, if some x is a solution of the derived system, then it must also satisfy
$f_j(x) = b_j$ as $c_j \ne 0$ and $\sum_{i \ne j} c_i f_i(x) = \sum_{i \ne j} c_i b_i$.
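$$
\begin{cases}
x^2 + y^2 = 2 \\
x + y = 2.
\end{cases}
$$
One can replace the first equation with the first equation minus two times the
second equation to get the following equivalent system:
$$
\begin{cases}
x^2 - 2x + y^2 - 2y = 2 - 4 \\
x + y = 2,
\end{cases}
$$
or equivalently,
$$
\begin{cases}
(x - 1)^2 + (y - 1)^2 = 0 \\
x + y = 2.
\end{cases}
$$
$$
\begin{cases}
x = 1 \\
y = 1 \\
x + y = 2.
\end{cases}
$$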
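After subtracting the first equation from the second one, we get that
$$
(y - x)(x + y + z) = 1.
\tag{2.1}
$$
Similarly, after subtracting the second equation from the third one, we get that
$$
(z - y)(x + y + z) = 2.
\tag{2.2}
$$
Subtracting equation (2.1), taken twice, from equation (2.2) gives
$$
(z - 3y + 2x)(x + y + z) = 0.
$$
Since x + y + z ≠ 0 by (2.1), it follows that z = 3y − 2x and so we can reduce the
system to two equations and two variables:
$$
\begin{cases}
x^2 = y(3y - 2x) + 1 \\
y^2 = x(3y - 2x) + 2 ,
\end{cases}
$$
or equivalently,
$$
\begin{cases}
(x + y)^2 = 4y^2 + 1 \\
2(x + y)^2 = y^2 + 7yx + 2 .
\end{cases}
$$
Substituting the first equation into the second one gives us the following sequence
of equivalent equalities: $2(4y^2 + 1) = y^2 + 7yx + 2$, $y^2 = yx$, and so y = 0 or y = x.
The case y = x is impossible, as then the first equation reads $4x^2 = 4x^2 + 1$. If y = 0,
then $x^2 = 1$ and z = −2x, which gives two candidate solutions:
(x, y, z) = (−1, 0, 2) and (x, y, z) = (1, 0, −2). We can then directly check that
both of them satisfy the original system.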
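The direct check of the two candidates (−1, 0, 2) and (1, 0, −2) against the system x^2 = yz + 1, y^2 = zx + 2, z^2 = xy + 4 is a one-liner per equation. The helper name `residuals` below is ours; it returns the difference between the two sides of each equation, so a solution gives three zeros.

```python
def residuals(x, y, z):
    """Left hand side minus right hand side for each of the three equations."""
    return (
        x * x - (y * z + 1),
        y * y - (z * x + 2),
        z * z - (x * y + 4),
    )


for candidate in [(-1, 0, 2), (1, 0, -2)]:
    print(candidate, residuals(*candidate))
```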
In order to gain some more experience, let us consider another problem to show
how the technique we practice in this section can be applied. As an example, we
use the problem from OM LVIII – Phase 1 – Problem 1. We are asked to solve the
following system of equations, where variables involved are real numbers:
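$$
\begin{cases}
x^2 + 2yz + 5x = 2 \\
y^2 + 2zx + 5y = 2 \\
z^2 + 2xy + 5z = 2.
\end{cases}
$$
two cases.
$$
x_1 = \frac{-5 + \sqrt{49}}{2 \cdot 3} = \frac{1}{3} \quad \text{and} \quad x_2 = \frac{-5 - \sqrt{49}}{2 \cdot 3} = -2 .
$$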
It follows that there are two solutions: (x, y, z) = (1/3, 1/3, 1/3) and
(x, y, z) = (−2, −2, −2).
Case 2: not all the numbers are equal. Due to the symmetry, without loss of
generality, we may assume that x ≠ y; the other solutions will be obtained by
permuting the solution vector (x, y, z). After comparing the left hand sides of the
first and the second equation we get that (x − y)(x + y − 2z + 5) = 0. Since
x ≠ y, it follows that x + y − 2z + 5 = 0. We will now independently consider
or equivalently that
$$
0 = x^2 + \frac{17}{3} x + \frac{16}{9} = \left( x + \frac{1}{3} \right) \left( x + \frac{16}{3} \right) .
$$
Combining all the cases together, we deduce that there are 8 candidate
solutions:
We may then directly check that all of them meet the original system of
equations.
EXERCISES
2.1.1. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
a^3 + b = c \\
b^3 + c = d \\
c^3 + d = a \\
d^3 + a = b.
\end{cases}
$$
(Source of the problem and solution: PLMO LXIII – Phase 2 – Problem 1.)
2.1.2. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
(x - y)(x^3 + y^3) = 7 \\
(x + y)(x^3 - y^3) = 3.
\end{cases}
$$
(Source of the problem and solution: PLMO LXII – Phase 2 – Problem 1.)
2.1.3. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
x^2 - (y + z + yz)x + (y + z)yz = 0 \\
y^2 - (z + x + zx)y + (z + x)zx = 0 \\
z^2 - (x + y + xy)z + (x + y)xy = 0.
\end{cases}
$$
(Source of the problem and solution: PLMO LXI – Phase 2 – Problem 1.)
PROBLEM
Solve the following system of equations, given that all variables involved are real
numbers:
$$
\begin{cases}
a^2 = b^3 + c^3 \\
b^2 = c^3 + d^3 \\
c^2 = d^3 + e^3 \\
d^2 = e^3 + a^3 \\
e^2 = a^3 + b^3 .
\end{cases}
$$
THEORY
A system of equations
$$
\begin{cases}
f_1(x_1, x_2, \dots, x_n) = 0 \\
f_2(x_1, x_2, \dots, x_n) = 0 \\
\quad \vdots \\
f_n(x_1, x_2, \dots, x_n) = 0
\end{cases}
$$
is called cyclic if the system does not change after replacing variable $x_i$ with
variable $x_{i+1}$ for i ∈ [n − 1], and replacing variable $x_n$ with variable $x_1$.
Let us note that any cyclic system with a finite number of variables
$x_1, x_2, \dots, x_n$ has the property that a circular permutation of any solution is also a
solution. As a
result, when solving cyclic systems of equations it is often useful to start with
assuming that some variable attains the maximum or the minimum value among
the whole set of all variables. Once the solution is found, one can simply recover
the whole family of solutions that can be obtained by applying circular
permutations to the particular solution.
In order to illustrate this technique in a simple setting, let us consider the
following cyclic system of n equations and n variables $x_1, x_2, \dots, x_n \in \mathbb{R}$, where
Since
$$
x_1^3 + 2 = 3x_2 \le 3x_1 ,
$$
we get that
$$
0 \ge x_1^3 + 2 - 3x_1 = (x_1^2 - 2x_1 + 1)(x_1 + 2) = (x_1 - 1)^2 (x_1 + 2) .
$$
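The factorization used above, and the two constant solutions it points to, can be checked mechanically. The sketch assumes that the cyclic system in question has equations of the form x_i^3 + 2 = 3x_{i+1} (indices taken cyclically), which is what the first displayed relation suggests; the helper names are ours.

```python
def is_solution(xs):
    """Check x_i^3 + 2 == 3 * x_{i+1} for all i, with x_{n+1} = x_1 (assumed system)."""
    n = len(xs)
    return all(xs[i] ** 3 + 2 == 3 * xs[(i + 1) % n] for i in range(n))


def factor_gap(x):
    """Difference between x^3 + 2 - 3x and (x - 1)^2 (x + 2); zero for every x."""
    return (x ** 3 + 2 - 3 * x) - (x - 1) ** 2 * (x + 2)


print(all(factor_gap(x) == 0 for x in range(-10, 11)))
print(is_solution([1, 1, 1, 1]), is_solution([-2, -2, -2]), is_solution([1, -2, 1]))
```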
We will show that there are no more solutions. For a contradiction, suppose that
there exists a solution for which $x_1 < -2$. In this case, we get that
$$
x_2 - x_1 = \frac{x_1^3 + 2}{3} - x_1 = \frac{x_1^3 + 2 - 3x_1}{3} = \frac{(x_1 - 1)^2 (x_1 + 2)}{3} < 0.
$$
It follows that $x_2 < x_1 < -2$. Since the system is cyclic, we can keep repeating
SOLUTION
Without loss of generality, we may assume that b is the largest variable involved,
that is, b = max{a, b, c, d, e}. In particular, since function $f(x) = x^3$ is an
(2.3)
$$
a^3 + b^3 \le a^3 + |b|^3 \le 0.
$$
follows from the third and the fourth equation that $c^2 = b^2$. If c = b (that is,
variables are equal). If c = −b then we get from the first equation that a = 0, and
again all variables are equal to 0.
REMARKS
In all examples presented so far, all solutions had the property that all variables
are equal, that is, the minimum value is equal to the maximum value.
Indeed, this is often the case but, of course, it does not have to be in general.
Consider, for instance, the following cyclic system of equations where variables
$x_1$, $x_2$, $x_3$ are real numbers:
$$
\begin{cases}
x_1 (1 - x_2) = 1 \\
x_2 (1 - x_3) = 1 \\
x_3 (1 - x_1) = 1.
\end{cases}
$$
otherwise, the left hand side of one of the equations would be equal to 0. Hence,
we can re-write the system as follows:
$$
\begin{cases}
x_2 = 1 - 1/x_1 \\
x_3 = 1 - 1/x_2 \\
x_1 = 1 - 1/x_3 .
\end{cases}
$$
Note that if x ∉ {0, 1}, then f (x) := 1 − 1/x ∉ {0, 1}. Hence, after making a
substitution, we get that
$$
\begin{cases}
x_2 = 1 - 1/x_1 \\
x_3 = 1 - 1/(1 - 1/x_1) = -1/(x_1 - 1) \\
x_1 = 1 - 1/(-1/(x_1 - 1)) = x_1 .
\end{cases}
$$
solution vectors.
Finally, let us mention that the trick of assuming that one of the variables
attains the maximum or the minimum does not only apply to cyclic systems—see,
for example, Problem 2.2.3.
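For the cyclic system x1(1 − x2) = 1, x2(1 − x3) = 1, x3(1 − x1) = 1 discussed in these remarks, the substitution x2 = 1 − 1/x1, x3 = 1 − 1/x2 can be tried on concrete values. The following sketch uses exact fractions; `orbit` and `is_solution` are our own helper names, and x1 is assumed to be different from 0 and 1.

```python
from fractions import Fraction


def orbit(x1):
    """Build (x1, x2, x3) from x1 via x_{i+1} = 1 - 1/x_i (requires x1 not in {0, 1})."""
    x1 = Fraction(x1)
    x2 = 1 - 1 / x1
    x3 = 1 - 1 / x2
    return (x1, x2, x3)


def is_solution(x1, x2, x3):
    return x1 * (1 - x2) == 1 and x2 * (1 - x3) == 1 and x3 * (1 - x1) == 1


for start in [2, -3, Fraction(5, 7)]:
    triple = orbit(start)
    # The three coordinates are typically pairwise different.
    print(triple, is_solution(*triple))
```

Starting from x1 = 2 this gives the triple (2, 1/2, −1), a solution in which no two variables are equal.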
EXERCISES
2.2.1. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
(x + y)^3 = 8z \\
(y + z)^3 = 8x \\
(z + x)^3 = 8y .
\end{cases}
$$
2.2.2. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
x^5 = 5y^3 - 4z \\
y^5 = 5z^3 - 4x \\
z^5 = 5x^3 - 4y .
\end{cases}
$$
(Source of the problem and solution: PLMO LIX – Phase 1 – Problem 1.)
2.2.3. Solve the following system of equations, given that all variables involved
are positive real numbers:
$$
\begin{cases}
a^3 + b^3 + c^3 = 3d^3 \\
b^4 + c^4 + d^4 = 3a^4 \\
c^5 + d^5 + a^5 = 3b^5 .
\end{cases}
$$
PROBLEM
$$
\sum_{i=1}^{n} \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} = \sqrt{C + 2} \sum_{i=1}^{n} x_i ,
\tag{2.4}
$$
SOLUTION
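Let us note that the right hand side of (2.4) can be rewritten as follows:
$$
\begin{aligned}
\sqrt{C + 2} \sum_{i=1}^{n} x_i &= \frac{\sqrt{C + 2}}{2} \sum_{i=1}^{n} (x_i + x_{i+1}) = \frac{\sqrt{C + 2}}{2} \sum_{i=1}^{n} |x_i + x_{i+1}| - \epsilon \\
&= \sum_{i=1}^{n} \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}} - \epsilon,
\end{aligned}
$$
where
$$
\epsilon := \frac{\sqrt{C + 2}}{2} \left( \sum_{i=1}^{n} |x_i + x_{i+1}| - \sum_{i=1}^{n} (x_i + x_{i+1}) \right) \ge 0.
$$
Hence, equation (2.4) can be rewritten as
$$
\begin{aligned}
0 &= \sum_{i=1}^{n} \left( \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} - \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}} \right) + \epsilon \\
&= \sum_{i=1}^{n} \frac{x_i^2 + C x_i x_{i+1} + x_{i+1}^2 - \frac{(C + 2)(x_i + x_{i+1})^2}{4}}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \epsilon,
\end{aligned}
\tag{2.5}
$$
where in the last step we used a standard method of removing square roots, that
is, using the fact that
$$
\sqrt{a} - \sqrt{b} = (\sqrt{a} - \sqrt{b}) \cdot \frac{\sqrt{a} + \sqrt{b}}{\sqrt{a} + \sqrt{b}} = \frac{a - b}{\sqrt{a} + \sqrt{b}} .
$$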
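Starting from the right hand side of (2.5), we get that
$$
\begin{aligned}
&\sum_{i=1}^{n} \frac{x_i^2 + C x_i x_{i+1} + x_{i+1}^2 - \frac{(C + 2)(x_i + x_{i+1})^2}{4}}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \epsilon \\
&\quad = \frac{1}{4} \sum_{i=1}^{n} \frac{4x_i^2 + 4C x_i x_{i+1} + 4x_{i+1}^2 - (C + 2)x_i^2 - 2(C + 2) x_i x_{i+1} - (C + 2)x_{i+1}^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \epsilon \\
&\quad = \frac{1}{4} \sum_{i=1}^{n} \frac{(2 - C)x_i^2 - (4 - 2C) x_i x_{i+1} + (2 - C)x_{i+1}^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \epsilon \\
&\quad = \frac{2 - C}{4} \sum_{i=1}^{n} \frac{(x_i - x_{i+1})^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \epsilon \ge 0,
\end{aligned}
$$
since C ∈ (−2, 2). As (2 − C)/4 > 0, it is now obvious that the equality holds if
and only if all $x_i$ are equal, that is, $(x_1, x_2, \dots, x_n) = (t, t, \dots, t)$ for some t ∈ R,
and
$$
\epsilon = \frac{\sqrt{C + 2}}{2} (|2t| - 2t) = 0.
$$
It follows that all solutions are of the form $(x_1, x_2, \dots, x_n) = (t, t, \dots, t)$ for
some $t \in \mathbb{R}_+ \cup \{0\}$.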
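The conclusion can be probed numerically: for C in (−2, 2) the left hand side of (2.4) should never be smaller than the right hand side, with equality for constant non-negative vectors. The sketch below uses floating point and our own helper names `lhs` and `rhs`; indices are cyclic, so x_{n+1} = x_1.

```python
import random
from math import isclose, sqrt


def lhs(xs, C):
    """Sum of sqrt(x_i^2 + C x_i x_{i+1} + x_{i+1}^2) over the cycle."""
    n = len(xs)
    return sum(
        sqrt(xs[i] ** 2 + C * xs[i] * xs[(i + 1) % n] + xs[(i + 1) % n] ** 2)
        for i in range(n)
    )


def rhs(xs, C):
    return sqrt(C + 2) * sum(xs)


rng = random.Random(0)
C = 0.5  # any value in (-2, 2); chosen arbitrarily for the demonstration
sample = [rng.uniform(-3, 3) for _ in range(6)]
print(lhs(sample, C) - rhs(sample, C))            # non-negative gap
print(lhs([3.0] * 6, C) - rhs([3.0] * 6, C))      # approximately zero
print(lhs([-3.0] * 6, C) - rhs([-3.0] * 6, C))    # strictly positive
```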
REMARKS
Since the right hand side of (2.4) has the term $\sqrt{C + 2}$, we immediately see that
the assumption that C > −2 is needed and natural. On the other hand, the
assumption that C < 2 is not needed and so we deduce that it has to be an
important condition that affects the solution. Indeed, if C = 2, then the left hand
side of (2.4) is equal to $\sum_{i=1}^{n} |x_i + x_{i+1}|$ and so it is equal to $2 \sum_{i=1}^{n} x_i$, the right
$$
\sum_{i=1}^{n} \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} = \sum_{i=1}^{n} \sqrt{(x_i + x_{i+1})^2 - (2 - C) x_i x_{i+1}} ,
$$
and to analyze the problem in terms of $x_i + x_{i+1}$. Now, one can notice that the
$$
\sqrt{C + 2} \sum_{i=1}^{n} x_i \le \sum_{i=1}^{n} \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}} ,
$$
as we did in the solution. To see this we used the fact that z ≤ |z| for all z ∈ R.
The remainder of the solution follows naturally by grouping the terms using $x_i$
and $x_{i+1}$.
since C ∈ (−2, 2).
EXERCISES
$$
(x^4 + 3y^2) \sqrt{|x + 2| + |y|} = 4 |xy^2| ,
$$
provided that x, y ∈ R.
(Source of the problem: PLMO LXIV – Phase 2 – Problem 4. Solution: our own.)
2.3.2. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
3(x^2 + y^2 + z^2) = 1 \\
x^2 y^2 + y^2 z^2 + z^2 x^2 = xyz(x + y + z)^3 .
\end{cases}
$$
2.3.3. Solve the following system of equations, given that all variables involved
are real numbers:
$$
\begin{cases}
x^2 y + 2 = x + 2yz \\
y^2 z + 2 = y + 2zx \\
z^2 x + 2 = z + 2xy .
\end{cases}
$$
(Source of the problem and solution: PLMO LXIX – Phase 1 – Problem 3.)
Let n be any natural number such that n ≥ 2. Suppose that a sequence of
non-negative numbers $(c_0, c_1, \dots, c_n)$ satisfies the following condition:
$$
c_p c_s + c_r c_t = c_{p+r} c_{r+s}
$$
THEORY
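If the angle α is given, then all sides of the right-angled triangle are well
defined, up to a scaling factor. This means that the ratio of any two side lengths
depends only on α. These six ratios define six functions of α, which are the
trigonometric functions:
$$
\begin{aligned}
\sin(\alpha) &= \frac{a}{h} && \text{(sine)} \\
\cos(\alpha) &= \frac{b}{h} && \text{(cosine)} \\
\tan(\alpha) &= \frac{a}{b} && \text{(tangent)} \\
\csc(\alpha) &= \frac{1}{\sin(\alpha)} = \frac{h}{a} && \text{(cosecant)} \\
\sec(\alpha) &= \frac{1}{\cos(\alpha)} = \frac{h}{b} && \text{(secant)} \\
\cot(\alpha) &= \frac{1}{\tan(\alpha)} = \frac{b}{a} && \text{(cotangent)}.
\end{aligned}
$$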
For extending these definitions to functions whose domain is the whole real line,
one can use geometrical definitions using the standard unit circle (a circle with
radius of 1); see Figure 2.4.
FIGURE 2.4: Definition of trigonometric functions in coordinate system. Note that
$h = \sqrt{a^2 + b^2} = \sqrt{1} = 1$
and this time b is negative, so cos(α) = b/h is also negative.
In all the computations in this book, we always assume that the angle α is
measured in radians, that is, the length of an arc of a unit circle defined by the
angle and measured counter-clockwise. This is because radians are more
“natural” and give more elegant formulation of a number of important results such
as
$$
\lim_{\alpha \to 0} \frac{\sin(\alpha)}{\alpha} = 1.
$$
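The trigonometric functions also have simple and elegant series expansions when
radians are used. As a result, modern definitions express trigonometric functions
as infinite series:
$$
\begin{aligned}
\sin(\alpha) &= \alpha - \frac{\alpha^3}{3!} + \frac{\alpha^5}{5!} - \frac{\alpha^7}{7!} + \dots \\
\cos(\alpha) &= 1 - \frac{\alpha^2}{2!} + \frac{\alpha^4}{4!} - \frac{\alpha^6}{6!} + \dots .
\end{aligned}
$$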
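The series above converge quickly for moderate angles, which is easy to observe by comparing partial sums with the library functions. This is only an illustration; `sin_series` and `cos_series` are our own names and the number of terms is an arbitrary choice.

```python
from math import cos, factorial, sin


def sin_series(alpha, terms=10):
    """Partial sum alpha - alpha^3/3! + alpha^5/5! - ... with the given number of terms."""
    return sum((-1) ** k * alpha ** (2 * k + 1) / factorial(2 * k + 1) for k in range(terms))


def cos_series(alpha, terms=10):
    """Partial sum 1 - alpha^2/2! + alpha^4/4! - ... with the given number of terms."""
    return sum((-1) ** k * alpha ** (2 * k) / factorial(2 * k) for k in range(terms))


for alpha in (0.1, 1.0, 2.0):
    # Differences between partial sums and the library values should be tiny.
    print(alpha, sin_series(alpha) - sin(alpha), cos_series(alpha) - cos(alpha))
```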
By examining the unit circle, we notice that reflections in the directions 0, π/4,
π/2, and π generate similar-looking results. As a result, the following properties
hold:
$$
\begin{aligned}
\sin(-x) &= -\sin(x), & \cos(-x) &= \cos(x); \\
\sin(\pi/2 - x) &= \cos(x), & \cos(\pi/2 - x) &= \sin(x); \\
\sin(\pi - x) &= \sin(x), & \cos(\pi - x) &= -\cos(x); \\
\sin(\pi/2 + x) &= \cos(x), & \cos(\pi/2 + x) &= -\sin(x); \\
\sin(\pi + x) &= -\sin(x), & \cos(\pi + x) &= -\cos(x); \\
\sin(2\pi + x) &= \sin(x), & \cos(2\pi + x) &= \cos(x).
\end{aligned}
$$
From the above identities, one can easily deduce analogous properties of the
tangent and the cotangent functions; we leave the details for the reader.
Here are a few, slightly more complex identities. The first four identities are
known as the angle addition identities:
$$
\cot(x + y) = \frac{\cot(x) \cot(y) - 1}{\cot(x) + \cot(y)} .
$$
$$
\cos(2x) = \cos^2(x) - \sin^2(x) = 2 \cos^2(x) - 1 = 1 - 2 \sin^2(x).
$$
$$
\begin{aligned}
\sin(x) \sin(y) &= \frac{\cos(x - y) - \cos(x + y)}{2} , \\
\sin(x) \cos(y) &= \frac{\sin(x + y) + \sin(x - y)}{2} ;
\end{aligned}
$$
$$
\begin{aligned}
\sin(x) + \sin(y) &= 2 \sin\left( \frac{x + y}{2} \right) \cos\left( \frac{x - y}{2} \right), \\
\cos(x) + \cos(y) &= 2 \cos\left( \frac{x + y}{2} \right) \cos\left( \frac{x - y}{2} \right), \\
\cos(x) - \cos(y) &= -2 \sin\left( \frac{x + y}{2} \right) \sin\left( \frac{x - y}{2} \right).
\end{aligned}
$$
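Identities of this kind are easy to mistype, so a numerical spot check at arbitrary angles is a cheap safeguard. The sketch below evaluates both sides of several of the identities listed above; the helper `identity_pairs` and the sample angles are our own choices.

```python
from math import cos, isclose, sin


def cot(x):
    return cos(x) / sin(x)


def identity_pairs(x, y):
    """Return (left, right) values for several identities at the angles x and y."""
    return [
        (cot(x + y), (cot(x) * cot(y) - 1) / (cot(x) + cot(y))),
        (cos(2 * x), cos(x) ** 2 - sin(x) ** 2),
        (cos(2 * x), 1 - 2 * sin(x) ** 2),
        (sin(x) * sin(y), (cos(x - y) - cos(x + y)) / 2),
        (sin(x) * cos(y), (sin(x + y) + sin(x - y)) / 2),
        (sin(x) + sin(y), 2 * sin((x + y) / 2) * cos((x - y) / 2)),
        (cos(x) + cos(y), 2 * cos((x + y) / 2) * cos((x - y) / 2)),
        (cos(x) - cos(y), -2 * sin((x + y) / 2) * sin((x - y) / 2)),
    ]


for left, right in identity_pairs(0.7, 0.4):
    print(round(left, 12), round(right, 12), isclose(left, right, abs_tol=1e-12))
```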
SOLUTION
By considering p = n and r = s = t = 0, we get that $c_n c_0 + c_0^2 = c_n c_0$, which
$c_p = c_{n-p}$ for each 0 ≤ p ≤ n/2. In fact, by symmetry, $c_p = c_{n-p}$ for each p
$c_{r+1} \ge 1$. It follows that all $c_i$, except $c_0$ and $c_n$, are at least 1. We will not need
this property but, in order to build an intuition, let us mention that with this
stronger property at hand we get that for each r ∈ [n − 3] we have $c_{r+1}^2 \ge 2$ and
so all $c_i$, except $c_0$, $c_1$, $c_{n-1}$, and $c_n$, are at least √2. One may continually recurse
(2.6)
$c_0 = c_n = 0$, $c_1 = c_{n-1} = 1$, and (2.6) holds for each r ∈ [n − 2]. Moreover, if
there is a unique $c_2$ that satisfies the desired property, then in fact the whole
sequence is defined uniquely. Indeed, by (2.6) we see that $c_{r+2}$ is determined by
$(0, 1, d_2, d_3, \dots, d_{n-2}, 1, 0)$ that satisfy the desired property and $d_2 \ne c_2$. Without
induction that for r ∈ [n − 2] we have that $d_{r+1} > c_{r+1}$ and that
$$
> \frac{c_{r+1}}{c_r} - \frac{1}{c_r c_{r+1}} = \frac{c_{r+2}}{c_{r+1}} .
$$
But this also immediately gives that $d_{r+2} > c_{r+2}$. This finishes the inductive proof
defined uniquely.
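We will now show that the sequence $c_i = \sin(\pi i/n)/\sin(\pi/n)$, i ∈ [n] ∪ {0},
satisfies the desired property, and so this is the only solution to our problem.
Indeed, note that $c_0 = 0$, $c_1 = 1$, and for all non-negative integers p, r, s, t such
that p + r + s + t = n, we have
$$
\begin{aligned}
c_p c_s + c_r c_t
&= \frac{\cos\left( \frac{\pi(p-s)}{n} \right) - \cos\left( \frac{\pi(p+s)}{n} \right) + \cos\left( \frac{\pi(r-t)}{n} \right) - \cos\left( \frac{\pi(r+t)}{n} \right)}{2 \sin^2(\pi/n)} \\
&= \frac{\cos\left( \frac{\pi(p-s)}{n} \right) + \cos\left( \frac{\pi(r-t)}{n} \right)}{2 \sin^2(\pi/n)} \\
&= \frac{\cos\left( \frac{\pi(2p+2r-n)}{2n} \right) \cos\left( \frac{\pi(-2s-2r+n)}{2n} \right)}{\sin^2(\pi/n)} \\
&= \frac{\sin\left( \frac{\pi(p+r)}{n} \right) \sin\left( \frac{\pi(s+r)}{n} \right)}{\sin^2(\pi/n)} = c_{p+r} c_{s+r} .
\end{aligned}
$$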
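The trigonometric computation can be cross-checked by brute force for small n. The sketch below assumes that the condition c_p c_s + c_r c_t = c_{p+r} c_{r+s} is required for all non-negative integers p, r, s, t with p + r + s + t = n, which is what the derivation uses; the function names `c` and `satisfies_condition` are ours.

```python
from math import isclose, pi, sin


def c(i, n):
    """Candidate sequence c_i = sin(pi*i/n) / sin(pi/n)."""
    return sin(pi * i / n) / sin(pi / n)


def satisfies_condition(n):
    """Check c_p c_s + c_r c_t == c_{p+r} c_{r+s} for all p + r + s + t = n."""
    for p in range(n + 1):
        for r in range(n + 1 - p):
            for s in range(n + 1 - p - r):
                t = n - p - r - s
                left = c(p, n) * c(s, n) + c(r, n) * c(t, n)
                right = c(p + r, n) * c(r + s, n)
                if not isclose(left, right, abs_tol=1e-9):
                    return False
    return True


print([satisfies_condition(n) for n in range(2, 12)])
```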
REMARKS
Let us note that the solution of our problem is clearly divided into two separate
steps: the proof of the uniqueness of the solution and then finding the actual
sequence that satisfies the desired property. The main difficulty is the second part,
and initially it is not clear how to guess a possible solution. From the first part, we
have learnt that the sequence starts from $c_0 = 0$, then increases to $c_1 = 1$, and
$c_p = c_{n-p}$ for any 0 ≤ p ≤ n. It is natural to guess that the sequence is first
increasing, and then after reaching a “turning point” the monotonicity of the
sequence changes. However, at this point it is only a conjecture. Now, being
aware that trigonometric functions are possible substitutions, the natural guess is
to use $c_i = \sin(\pi i/n)$ as it satisfies the requirements and is also non-negative for
by sin(π/n). All that is left now is to check that this guess satisfies the original
equation, which we leave for the reader.
EXERCISES
where $x_{n+1} = x_1$.
$$
x_{n+1} = \frac{1 + a x_n}{a - x_n} .
$$
Find all values of a for which the sequence has a period equal to 8.
(Source of the problem and solution: PLMO LVI – Phase 1 – Problem 9.)
PROBLEM
Suppose that n is an odd natural number. Find the number of real solutions of the
following system of equations:
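$$
\begin{cases}
x_1 (x_1 + 1) = x_2 (x_2 - 1) \\
x_2 (x_2 + 1) = x_3 (x_3 - 1) \\
\quad \vdots \\
x_{n-1} (x_{n-1} + 1) = x_n (x_n - 1) \\
x_n (x_n + 1) = x_1 (x_1 - 1) .
\end{cases}
$$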
THEORY
There are many interesting problems in which one is asked to count the number of
solutions to a given system of equations without actually finding any solution. As
mentioned at the beginning of this chapter, one can distinguish four cases and
each of them usually requires a different approach: there is no solution, there is
exactly one solution, there are many solutions but a finite number of them, and
there are infinitely many solutions. When more than one solution is expected, but
finitely many, then typically the proof strategy requires the following three steps:
deriving the rule for candidate solutions (that is, finding all
solutions that meet the desired conditions of the problem);
For the last step, it is often useful to use the powerful technique called double
counting that we discuss in detail in Section 4.3. The idea is to construct a
bijection from the set of solutions to some other set that is easier to count.
In order to illustrate the double counting technique, let us consider the
following problem. Suppose that k and n are natural numbers such that k ≤ n.
Our goal is to compute the number of solutions of the equation $\sum_{i=1}^{k} x_i = n$,
provided that $x_i \in \mathbb{N}$ for each i ∈ [k]. We will show that there are $\binom{n-1}{k-1}$ distinct
solutions. Indeed, each solution determines the increasing sequence of partial sums
$s_j = \sum_{i=1}^{j} x_i$, j ∈ [k − 1], and each increasing sequence of k − 1 natural numbers,
whose terms are bounded above by n − 1, uniquely yields one solution. In other
words, there is a bijection between the set of solutions we want to count and the set
of increasing sequences of k − 1 natural numbers that are at most n − 1. The former
one seems difficult to count but the latter one is easy to count as each increasing
sequence is uniquely defined by a (k − 1)-element subset of the set [n − 1].
Since there are $\binom{n-1}{k-1}$ subsets of size k − 1 selected from the set of n − 1
elements, we get the desired count.
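The count of solutions of x_1 + ... + x_k = n in natural numbers can be confirmed by exhaustive enumeration for small parameters. The sketch below is a brute-force check, not an efficient algorithm; `count_solutions` is our own helper and natural numbers are assumed to start from 1.

```python
from itertools import product
from math import comb


def count_solutions(n, k):
    """Count k-tuples of positive integers summing to n by direct enumeration."""
    return sum(1 for xs in product(range(1, n + 1), repeat=k) if sum(xs) == n)


for n in range(1, 7):
    for k in range(1, n + 1):
        # Both numbers in each printed pair should coincide.
        print(n, k, count_solutions(n, k), comb(n - 1, k - 1))
```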
change of n cents using only ¢1 coins. Let $x^2_n$ be the number of ways one can
make change of n cents using only ¢1 and ¢5 coins. Let $x^3_n$ be the number of ways
one can make change of n cents using only ¢1, ¢5, and ¢10 coins. Finally, let $x^4_n$
be the number of ways one can make change of n cents using ¢1, ¢5, ¢10 and ¢25
coins. Our goal is to find $x^4_{100}$.
dispensed and so there are ⌊n/5⌋ + 1 ways to do it (including not using ¢5 coins
at all). The remaining amount is uniquely dealt with ¢1 coins.
We are now ready to compute $x^4_{100}$. We first observe that
$$
x^4_{100} = x^3_{100} + x^3_{75} + x^3_{50} + x^3_{25} + 1 ,
$$
$$
\begin{aligned}
x^3_{50} &= x^2_{50} + x^2_{40} + x^2_{30} + x^2_{20} + x^2_{10} + x^2_{0} \\
&= 11 + 9 + 7 + 5 + 3 + 1 = 36 \\
x^3_{75} &= x^2_{75} + x^2_{65} + x^2_{55} + x^2_{45} + x^2_{35} + x^3_{25} \\
&= 16 + 14 + 12 + 10 + 8 + 12 = 72 \\
x^3_{100} &= x^2_{100} + x^2_{90} + x^2_{80} + x^2_{70} + x^2_{60} + x^3_{50} \\
&= 21 + 19 + 17 + 15 + 13 + 36 = 121
\end{aligned}
$$
It follows that $x^4_{100} = 121 + 72 + 36 + 12 + 1 = 242$.
Finally, let us mention that Larry King said in his USA Today column that
there are 293 ways to make change for a dollar. Why did he get a different
number? He included ¢50 coins! Though not commonly used today, half-dollar
coins have a long history of heavy use alongside other denominations of coinage,
but have faded out of general circulation for many reasons. With this additional
coin available, the number of ways to make change for a dollar is
$$x_{100}^4 + x_{50}^4 + 1 = 242 + 49 + 1 = 292,$$
since
$$x_{50}^4 = x_{50}^3 + x_{25}^3 + 1 = 36 + 12 + 1 = 49.$$
We are still missing one way! The reason is that Larry King also included a dollar coin, which seems controversial. For example, Walter Wright said that a dollar coin cannot be considered change for a dollar bill, arguing after Webster’s New World Dictionary, which defines change as “a number of coins or bills whose total value equals a single larger coin or bill.”
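The three totals discussed above (242, 292, and 293) can be re-computed with a short dynamic program. This is a sketch only; the function name `count_change` and the table layout are illustrative choices, not taken from the original text.

```julia
# Count the ways to pay `amount` cents using the given coin denominations.
# ways[a + 1] holds the number of ways to pay a cents with the coins processed so far.
function count_change(amount::Int, coins::Vector{Int})
    ways = zeros(Int, amount + 1)
    ways[1] = 1                      # one way to pay 0 cents: use no coins
    for c in coins
        for a in c:amount
            ways[a + 1] += ways[a - c + 1]
        end
    end
    return ways[amount + 1]
end

println(count_change(100, [1, 5, 10, 25]))           # 242
println(count_change(100, [1, 5, 10, 25, 50]))       # 292
println(count_change(100, [1, 5, 10, 25, 50, 100]))  # 293
```

Processing one denomination at a time ensures that each multiset of coins is counted once, regardless of the order in which the coins are handed over.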
SOLUTION
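Let n be any odd natural number. For convenience, let us use the convention that $x_{n+1} = x_1$. Then, for each $i \in [n]$ we can re-write the equation $x_i(x_i + 1) = x_{i+1}(x_{i+1} - 1)$ as follows: $(x_i + x_{i+1})(x_{i+1} - x_i - 1) = 0$. It follows that, for each $i \in [n]$, either $x_{i+1} = -x_i$ or $x_{i+1} = x_i + 1$. Let $s_i = -1$ in the former case and $s_i = 1$ in the latter. Then the sequence
$x := (x_1, x_2, \ldots, x_n, x_{n+1})$ can be uniquely determined by the first term, $x_1$, and the sequence $s := (s_1, s_2, \ldots, s_n)$. Indeed, since $x_{i+1} = s_i x_i + (s_i + 1)/2$ for each $i \in [n]$, the terms can be computed one by one starting from $x_1$.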
There is clearly an infinite number of sequences $x$ of this form but not all of them satisfy our system as we also require that $x_1 = x_{n+1}$, that is,
$$x_1 = x_{n+1} = x_1 \prod_{i=1}^{n} s_i + \sum_{i=1}^{n} \frac{s_i + 1}{2} \prod_{j=i+1}^{n} s_j. \tag{2.7}$$
Recall that n is odd. We will show that the number of terms in $s$ that are equal to 1 is even. For a contradiction, suppose that it is not true, that is, the number of terms that are equal to −1 is even. In particular, $\prod_{i=1}^{n} s_i = 1$ and so (2.7) reduces to
$$0 = \sum_{i=1}^{n} \frac{s_i + 1}{2} \prod_{j=i+1}^{n} s_j.$$
However, since the number of terms for which $(s_i + 1)/2 = 1$ is odd and $(s_i + 1)/2 = 0$ otherwise, we deduce that the right hand side of the above equality is the sum of an odd number of terms, each of them from {−1, 1}, and so it is not equal to zero. We get the desired contradiction and so, indeed, the number of terms in $s$ that are equal to 1 is even. Since $\prod_{i=1}^{n} s_i = -1$, the required
are different, let $i \in [n]$ be such that $s_i \ne s'_i$. Without loss of generality, we may assume that $s_i = 1$ and $s'_i = -1$. Then
$$x_{i+1} = s_i x_i + (s_i + 1)/2 = x_i + 1.$$
On the other hand,
$$x_{i+1} = s'_i x_i + (s'_i + 1)/2 = -x_i.$$
It follows that $x_i + 1 = -x_i$ and so
and so there is a bijection between the sequences with an even number of 1’s and the set of solutions of our system.
It remains to count the number of such sequences $s$. We note that in order to generate a sequence of 1’s and −1’s with an even number of 1’s, one can take any sequence $(s_1, s_2, \ldots, s_{n-1})$ of 1’s and −1’s of length $n - 1$ (there are clearly $2^{n-1}$ of them) and then the last term, $s_n$, is uniquely determined (in order to keep the number of 1’s even). We conclude that there are $2^{n-1}$ solutions to the system of equations we deal with.
REMARKS
In the problem we deal with in this section, the key idea was to introduce the
sequence $s_i$ and write down an explicit solution to the recurrence relation for $x_i$. The sequence $s_i$ is often called a control sequence of the sequence $x_i$.
The solution we have found is relatively complex. The idea is quite straightforward but there are many places and calculations where one can easily make a mistake. In such cases, to be on the safe side, it is strongly recommended to re-check some cases manually, or use a computer to re-compute the number of solutions for some instances. Here is a program written in Julia that allows us to re-check the solution.
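function find_sequence(s)
    # x1 from (2.7): when the product of all s[i] is -1, x1 is half of the sum in (2.7)
    x1 = sum((s[i]+1)/4*prod(s[i+1:end]) for i in 1:length(s))
    x = Int[x1]
    for si in s
        # recurrence x_{i+1} = s_i * x_i + (s_i + 1)/2
        push!(x, si*x[end] + (si+1)/2)
    end
    println(x)
end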
Now, we can test our solution for all scenarios, provided n = 3:
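A test of this kind can be sketched as follows. The helper names `sequence_from_signs`, `satisfies_system`, and `solutions_for` are hypothetical; the first one mirrors `find_sequence` but uses integer arithmetic and returns the sequence instead of printing it.

```julia
# Build (x_1, ..., x_{n+1}) from a sign sequence s whose product is -1.
function sequence_from_signs(s::Vector{Int})
    n = length(s)
    # by (2.7) with prod(s) == -1, twice x_1 equals the sum below
    twice_x1 = sum(((s[i] + 1) ÷ 2) * prod(s[i+1:end]) for i in 1:n)
    x = [twice_x1 ÷ 2]
    for si in s
        push!(x, si * x[end] + (si + 1) ÷ 2)
    end
    return x
end

# Check x_i (x_i + 1) == x_{i+1} (x_{i+1} - 1) for all i, together with x_{n+1} == x_1.
function satisfies_system(x::Vector{Int})
    return x[1] == x[end] &&
           all(x[i] * (x[i] + 1) == x[i+1] * (x[i+1] - 1) for i in 1:length(x)-1)
end

# Enumerate all sign sequences of length n with an even number of 1's.
function solutions_for(n::Int)
    found = Vector{Int}[]
    for signs in Iterators.product(fill((-1, 1), n)...)
        s = collect(signs)
        iseven(count(==(1), s)) || continue
        x = sequence_from_signs(s)
        satisfies_system(x) || error("sequence $x does not satisfy the system")
        push!(found, x[1:n])
    end
    return unique(found)
end

sols = solutions_for(3)
println(sols)
println(length(sols))  # 4, that is, 2^(3 - 1)
```

For n = 3 the four sequences found are (0, 0, 0), (−1, 0, 1), (0, 1, −1), and (1, −1, 0), in some order, which agrees with the count $2^{n-1}$.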
2.5.2. Solve the following system of equations, given that all variables involved are positive real numbers:
$$(x^{2010} - 1)(y^{2009} - 1) = (x^{2009} - 1)(y^{2010} - 1).$$
(Source of the problem and solution: PLMO LXI – Phase 1 – Problem 1.)
2.5.3. Fix an integer n ≥ 2, and consider the following system of n equations: for $i \in [n]$,
$$x_{i+1}^2 + x_i^2 + 50 = 12x_{i+1} + 16x_i.$$
(As usual, we use the convention that $x_{n+1} = x_1$.) Find the number of solutions of this system, given that all the variables involved are integers.
(Source of the problem and solution: PLMO L – Phase 3 – Problem 4.)
$$a_n = -\sum_{i=0}^{n-1} \frac{a_i}{n + 1 - i}.$$
$p = p(s_i)$, such that $p(s_i)$ is true for all $i \in \mathbb{N}$. In our case we are asked to prove that the invariant is $a_i > 0$ for all $i \in \mathbb{N}$. The usual technique of proving
$$x_{n+1} = x_n - \frac{1}{n(n+1)}.$$
In this example, it is not enough to use the fact that $x_n > 0$ to prove that $x_{n+1} > 0$. Hence, this natural strategy will simply not work. On the other hand, a stronger property, namely that $x_n = 1/n$ for all $n$, is easy to prove by induction. Indeed, the base case clearly holds: $x_1 = 1 = 1/1$. For the inductive step, from
property holds by induction; in particular, all the terms of the sequence are
positive.
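The stronger property can also be checked mechanically with exact rational arithmetic. The sketch below assumes the starting value $x_1 = 1$ from the base case; the function name `check_closed_form` is hypothetical.

```julia
# Verify that x_{n+1} = x_n - 1/(n(n+1)) with x_1 = 1 gives x_n = 1/n for n = 1, ..., N.
function check_closed_form(N::Int)
    x = 1 // 1                           # x_1 = 1
    for n in 1:N
        x == 1 // n || return false      # closed form x_n = 1/n
        x -= 1 // (n * (n + 1))          # recurrence step
    end
    return true
end

println(check_closed_form(50))  # true
```

Each step is the identity $\frac{1}{n} - \frac{1}{n(n+1)} = \frac{1}{n+1}$, which is exactly the inductive step of the proof.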
Squeeze Theorem Let us mention one more useful tool that will be needed to solve one of the exercises. The squeeze theorem, also known as the sandwich theorem, is a theorem regarding the limit of a function or a sequence. It is used to confirm the limit of a function via comparison with two other functions whose limits are known or can be easily computed.
Let I be a set having the point a as a limit point, that is, there exists a sequence of elements of I which converges to a. Let f, g, and h be functions defined on I, except possibly at a itself. Suppose that for every x in I not equal to a, we have
$$g(x) \le f(x) \le h(x)$$
and also that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$.
In this case, $\lim_{x \to a} f(x) = L$. Let us note that a is not required to be an element of I. Indeed, if a is an endpoint of an open interval I, then the above limits are left-hand or right-hand limits. A similar statement holds for unbounded sets; for example, if I = (0, ∞), then the conclusion holds taking the limits as x → ∞.
As already mentioned, this theorem is also valid for sequences, which corresponds to the case with $I = \mathbb{N}$ and $a = \infty$. Let $(x_n)_{n \in \mathbb{N}}$ and $(z_n)_{n \in \mathbb{N}}$ be two sequences
property:
$$x_n \le y_n \le z_n$$
We will prove the desired property by induction on n. The base case is easy to verify: $a_1 = 1/2 > 0$. For the inductive step, suppose that $a_n > 0$ for some
$$a_{n+1} = -\sum_{i=0}^{n} \frac{a_i}{n + 2 - i} = \frac{1}{n+2} - \sum_{i=1}^{n-1} \frac{a_i}{n + 2 - i} - \frac{a_n}{2}.$$
Since
$$\frac{n+2}{n+1} \cdot \frac{1}{n + 2 - i} \le \frac{1}{n + 1 - i},$$
it follows that
$$a_{n+1} \ge \frac{1}{n+2} - \frac{n+1}{n+2} \sum_{i=1}^{n-1} \frac{a_i}{n + 1 - i} - \frac{a_n}{2},$$
and so
$$a_{n+1} \ge \frac{1}{n+2} - \frac{n+1}{n+2}\left(\frac{1}{n+1} - a_n\right) - \frac{a_n}{2} = a_n\left(\frac{n+1}{n+2} - \frac{1}{2}\right) > 0.$$
The proof by induction is finished.
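The positivity claim can be spot-checked numerically. The sketch below assumes the initial value $a_0 = -1$, which is inferred from the base case $a_1 = 1/2$ and from the term $\frac{1}{n+2}$ in the displayed formula; it is not stated in the surviving text. The name `sequence_terms` is hypothetical.

```julia
# Compute a_0, ..., a_N exactly, with a_n = -sum_{i=0}^{n-1} a_i / (n + 1 - i).
function sequence_terms(N::Int)
    a = Rational{BigInt}[-1 // 1]            # a[1] stores a_0 = -1 (assumed)
    for n in 1:N
        # a[i + 1] stores a_i
        push!(a, -sum(a[i + 1] // (n + 1 - i) for i in 0:(n - 1)))
    end
    return a
end

terms = sequence_terms(30)
println(terms[2])                      # a_1 = 1//2
println(all(>(0), terms[2:end]))       # whether a_1, ..., a_30 are all positive
```

Exact rationals over `BigInt` are used because the denominators grow quickly and floating-point rounding could hide the sign of very small terms.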
REMARKS
The key idea in the proof was to try to replace the sum involving all $a_i$ by the formula that involved only the previous term in the sequence. Such an approach is possible as the previous term was also defined in terms of the same (but one) terms $a_i$.
In our problem, in order to use this approach we had to transform $\frac{1}{n+2-i}$ into $\frac{1}{n+1-i}$ for $i \ge 1$ in a way that does not depend on i. To achieve this, we investigated the following ratio
$$\frac{1/(n + 2 - i)}{1/(n + 1 - i)} = \frac{n + 1 - i}{n + 2 - i} \le \frac{n+1}{n+2},$$
and we checked that this universal bound is enough to derive the required claim.
EXERCISES
2.6.1. Let $x_1$ be any positive real number, and for each $n \in \mathbb{N}$ let
$$x_{n+1} = x_n + \frac{1}{x_n^2}.$$
(Source of the problem: PLMO L – Phase 1 – Problem 10. Solution: our own.)
2.6.2. Consider the following sequence defined recursively: $a_1 = 4$ and for each
(Source of the problem and solution: PLMO XLIX – Phase 1 – Problem 3.)
2.6.3. You are given two numbers $a, b \in \mathbb{R}$. Let $x_1 = a$, $x_2 = b$, and for each
there are at least 2,000 distinct pairs $(k, \ell)$, $k < \ell$, such that $x_k = x_\ell$. On the
PROBLEM
m, n ∈ N, we have that
In this section, we concentrate on the following important types of sets, open sets
and closed sets. In general, such sets can be very abstract but in practice, open
sets are usually chosen to be similar to the open intervals of the real line. We will
restrict ourselves to Euclidean space.
Euclidean Space Euclidean space is the fundamental space of geometry.
Originally, this was the three-dimensional space but in modern mathematics there
are Euclidean spaces of any dimension that is a natural number, including the
three-dimensional space, the two-dimensional Euclidean plane, and one-
dimensional real line. One way to think of the Euclidean space is as a set of points
satisfying certain relationships, expressible in terms of distance and angles. In
Cartesian coordinates, if $p = (p_1, p_2, \ldots, p_n) \in \mathbb{R}^n$ and $q = (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n$ are two points in Euclidean n-space, then the distance
$$d(p, q) = d(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}.$$
all points at distance less than or equal to r away from p. In other words,
$$B_r[p] := \{x \in \mathbb{R}^n \mid d(x, p) \le r\}.$$
Note that a ball (open or closed) always includes p itself. A subset of some space
is bounded if it is contained in some ball.
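The distance formula and ball membership translate directly into code. This is a small sketch; the function names are illustrative, and the open ball is assumed to use the strict inequality $d(x, p) < r$, in contrast to the closed ball $B_r[p]$ above.

```julia
# Euclidean distance between two points given as coordinate vectors.
euclid(p, q) = sqrt(sum((q .- p) .^ 2))

# Membership tests for the closed ball B_r[p] and the open ball of radius r around p.
in_closed_ball(x, p, r) = euclid(x, p) <= r
in_open_ball(x, p, r) = euclid(x, p) < r

origin = [0.0, 0.0]
point = [3.0, 4.0]
println(euclid(origin, point))               # 5.0
println(in_closed_ball(point, origin, 5.0))  # true: the boundary belongs to the closed ball
println(in_open_ball(point, origin, 5.0))    # false: the boundary is excluded from the open ball
```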
Open and Closed Sets An open set is an abstract concept generalizing the idea of
an open interval in the real line. One of the simplest examples are sets which
contain a ball around each of their points but, as already mentioned above, an
open set, in general, can be very abstract: any collection of sets can be called
open, as long as the union of an arbitrary number of open sets is open, the
intersection of a finite number of open sets is open, and the space itself is open.
These conditions are very loose, and they allow enormous flexibility in the choice
of open sets.
We restrict ourselves to Euclidean spaces. A subset S of the Euclidean n-space $\mathbb{R}^n$ is called open if for any point $x \in S$, there exists a real number $\varepsilon = \varepsilon(x) > 0$ such that $B_\varepsilon(x) \subseteq S$. One can show that the three desired properties are satisfied: a) the union of any number of open sets, finite or infinite, is open, b) the intersection of a finite number of open sets is open, and c) $\mathbb{R}^n$ itself is open. On the other hand, note that infinite intersections of open sets need not be open. For example, the intersection of all intervals of the form (−1/n, 1/n), where n ∈ N, is the set {0} which is not open in the real line $\mathbb{R}^1$.
The closure of a set S consists of all points in S together with all limit points of
S. Formally, for a given set $S \subseteq \mathbb{R}^n$, x is a point of closure of S if every open ball centered at x contains a point of S (this point may be x itself), that is, for all ε > 0, $B_\varepsilon(x) \cap S \ne \emptyset$. The definition of a point of closure is closely related to the definition of a limit point. The difference between the two definitions is subtle but important, namely, in the definition of limit point, every neighborhood of the
point x in question must contain a point of the set other than x itself. As a result,
every limit point is a point of closure, but not every point of closure is a limit
point. Finally, the boundary of a set S is the set of points which can be approached
both from S and from the outside of S.
In general, a closed set is a set whose complement is an open set. However,
there are other equivalent definitions that can be applied to Euclidean spaces. A set is closed if and only if it coincides with its closure. Equivalently, a set is closed if and only if it contains all of its limit points. Yet another equivalent definition is that a set is closed if and only if it contains all of its boundary points. Any intersection of closed sets is closed (including intersections of infinitely many closed sets). The union of finitely many closed sets is closed.
Note that both the empty set and the whole Euclidean space are both open and
closed. Let us also mention that a set is connected if it cannot be represented as
the union of two or more disjoint non-empty open sets.
There are many interesting and important partial orders but here we only focus
on relations on real numbers or vectors of real numbers. Let us first observe that
the real numbers ordered by the standard less-than-or-equal relation ≤ is a partial
order. (In fact, it is a totally ordered set as for any x, y ∈ R, x ≤ y or y ≤ x.)
Indeed, x ≤ x for any x ∈ R, and so ≤ is reflexive. Since x ≤ y and y ≤ x
implies that x = y, the relation is also antisymmetric. Finally, if x ≤ y and y ≤ z,
then x ≤ z and so ≤ is transitive.
For $x, y \in \mathbb{R}^n$, we usually say that
$$x = (x_1, x_2, \ldots, x_n) \le (y_1, y_2, \ldots, y_n) = y$$
if and only if $x_i \le y_i$ for all $i \in [n]$. We leave it for the reader to check that this relation is a partial order. However, note that it is not a total order! For example, for $x = (1, 2) \in \mathbb{R}^2$ and $y = (2, 1) \in \mathbb{R}^2$, neither $x \le y$ nor $y \le x$.
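The coordinate-wise relation is easy to experiment with. In the sketch below, the function name `leq` is a hypothetical stand-in for the relation ≤ on vectors.

```julia
# Coordinate-wise comparison: x <= y if and only if x_i <= y_i for every coordinate i.
leq(x, y) = all(x .<= y)

x = [1, 2]
y = [2, 1]
println(leq(x, y))            # false
println(leq(y, x))            # false: x and y are incomparable
println(leq([1, 1], [1, 2]))  # true
```

The first two lines of output show that the relation is not total, even though it is reflexive, antisymmetric, and transitive.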
$\mathbb{R}^n$ that is less than or equal to all elements of S, if such an element exists. In other words, for a given $S \subseteq \mathbb{R}^n$, let $L_S \subseteq \mathbb{R}^n$ be defined as follows: $x \in L_S$ if and
points that are greater than or equal to all elements of S, again, if such an element exists.
The two definitions are symmetric and, indeed, the infimum is in a precise sense dual to the concept of a supremum. As already mentioned, infima and suprema do not necessarily exist. However, if an infimum or supremum does exist, it is unique. Moreover, it follows immediately from the definition that the infimum of $S \subseteq \mathbb{R}^n$ exists if and only if it exists for S along every dimension of
consider y := x/2). On the other hand, the infimum of $\mathbb{R}_+$ exists and is equal to 0. Indeed, it follows from the previous observation together with the fact that 0 ≤ x for all $x \in \mathbb{R}_+$. Other simple examples are
$$\inf\{x \in \mathbb{R} \mid 0 < x < 1\} = 0,$$
$$\inf\{x \in \mathbb{Q} \mid x > 0,\ x^2 > 2\} = \sqrt{2},$$
$$\inf\{(-1)^n + \tfrac{1}{n} \in \mathbb{R} \mid n \in \mathbb{N}\} = -1.$$
SOLUTION
Let $(a_n)_{n \in \mathbb{N}}$ be any sequence of positive numbers that satisfies the desired properties. In particular, recall that there exists $C_0 > 1$ such that the sequence $(a_n)_{n \in \mathbb{N}}$ satisfies $a_{m+n} \le C_0(a_m + a_n)$ for all $m, n \in \mathbb{N}$. Let S be the set of real numbers C (not necessarily greater than 1) for which $a_{m+n} \le C(a_m + a_n)$ for all m, n
$S = [C^*, \infty)$ (S is a closed set). We will show that the latter is true, that is, $C^*$, an infimum of S, belongs to S.
Let $C^*$ be an infimum of S. For a contradiction, suppose that $C^*$ does not belong to S. It follows that there exist $m, n \in \mathbb{N}$ such that $a_{m+n} > C^*(a_m + a_n)$. However, since the inequality is strict, we get that there exists some ε > 0 such that $a_{m+n} > (C^* + \delta)(a_m + a_n)$ for all $0 \le \delta \le \varepsilon$. This contradicts the fact that $C^*$ is an infimum. It follows that $C^* \in S$.
$$\begin{aligned}
a_{m+n}^2 = a_{(m+n)^2} = a_{m^2 + n^2 + 2mn} &\le C^*\left(a_{m^2} + a_{n^2 + 2mn}\right) \\
&\le C^*\left(a_{m^2} + C^*(a_{n^2} + a_{2mn})\right) = C^*\left(a_m^2 + C^*(a_n^2 + a_2 a_m a_n)\right) \\
&= C^*\left(a_m^2 + C^*(a_n^2 + 2a_m a_n)\right).
\end{aligned}$$
Since all the terms are positive, after taking a square root of both sides, we get that
$$a_{m+n} \le \sqrt{\frac{2(C^*)^2 + C^*}{3}}\,(a_m + a_n).$$
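The step from the chain of inequalities to the square-root bound is not visible in the surviving text. One way to obtain it, offered here as a plausible reconstruction rather than as the authors' own argument, is to apply the same estimate three times, splitting off $a_{m^2}$, $a_{n^2}$, and $a_{2mn}$ first in turn, and then to average the three bounds:

```latex
\begin{align*}
a_{m+n}^2 &\le C^*\bigl(a_m^2 + C^*(a_n^2 + 2a_m a_n)\bigr),\\
a_{m+n}^2 &\le C^*\bigl(a_n^2 + C^*(a_m^2 + 2a_m a_n)\bigr),\\
a_{m+n}^2 &\le C^*\bigl(2a_m a_n + C^*(a_m^2 + a_n^2)\bigr).
\end{align*}
% Adding the three inequalities:
\[
3a_{m+n}^2 \le C^*\bigl((a_m + a_n)^2 + 2C^*(a_m + a_n)^2\bigr)
            = \bigl(2(C^*)^2 + C^*\bigr)(a_m + a_n)^2 .
\]
```

Dividing by 3 and taking square roots gives the stated bound with the constant $\sqrt{(2(C^*)^2 + C^*)/3}$.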
On the other hand, since $C^*$ is the infimum of S, for any ε > 0 there must exist $m, n \in \mathbb{N}$ for which $a_{m+n} > (C^* - \varepsilon)(a_m + a_n)$. Combining this with the above inequality (that holds for any m, n ∈ N), we get that
$$(C^* - \varepsilon)(a_m + a_n) < a_{m+n} \le \sqrt{\frac{2(C^*)^2 + C^*}{3}}\,(a_m + a_n),$$
or equivalently that
$$C^* - \varepsilon < \sqrt{\frac{2(C^*)^2 + C^*}{3}}.$$
Since this holds for every ε > 0, it follows that $C^* \ge (C^*)^2$. As $C^*$ cannot be less than 1, we get that $C^* = 1$ and so, in particular, $a_n \le n$ for all $n \in \mathbb{N}$, which can be proved by induction on n.
Let us now note that for each $k \in \mathbb{N}$, $a_{2^k} = (a_2)^k = 2^k$. Using this and the bound we proved above, we get
$$2^k = a_{2^k} \le a_m + a_{2^k - m} \le m + (2^k - m) = 2^k,$$
and so $a_m + a_{2^k - m} = 2^k$ for all $1 \le m \le 2^k - 1$. But it means that for any $1 \le m \le 2^k - 1$, we get
$$2^k = a_m + a_{2^k - m} \le a_m + (2^k - m),$$
REMARKS
The key idea that leads to the proof is to notice that there exists $C^*$ that is a smallest value of C for which the desired bound for $a_{m+n}$ is met. This follows from the fact that an intersection of closed sets is a closed set. What is this family of closed sets in our case? For any pair $m, n \in \mathbb{N}$ we want to have the bound $a_{m+n} \le C(a_m + a_n)$ which clearly holds for any $C \ge a_{m+n}/(a_m + a_n)$. As a result, the set of values of C for which $a_{m+n} \le C(a_m + a_n)$ for all $m, n \in \mathbb{N}$ is the set
$$S := \bigcap_{m, n \in \mathbb{N}} \left[\frac{a_{m+n}}{a_m + a_n}, \infty\right).$$
With this observation, we know that for any ε > 0, there are natural numbers m and n such that
$$a_{m+n} > (C^* - \varepsilon)(a_m + a_n).$$
If we want to start using the fact that $a_{mn} = a_m a_n$, then it makes sense to try to square the above inequality and use the fact that $a_{(m+n)^2} = (a_{m+n})^2$. This then implies that
$$a_{(m+n)^2} = (a_{m+n})^2 > (C^* - \varepsilon)^2 (a_m + a_n)^2.$$
If we do this, then this leads us to the following observation: for all ε > 0
$$\frac{2(C^*)^2 + C^*}{3} > (C^* - \varepsilon)^2.$$
Since this holds for any ε > 0, we have that in the limit, $2(C^*)^2 + C^* \ge 3(C^*)^2$, and so $C^* \ge (C^*)^2$. Therefore, we see that $C^* = 1$, as it cannot be less than 1.
EXERCISES
2.7.1. Find the number of infinite sequences $(a_i)_{i \in \mathbb{N}}$, such that $a_i \in \{-1, 1\}$ for
2.7.3. Let n be any natural number such that n ≥ 3. Find all sequences of real numbers $(x_1, x_2, \ldots, x_n)$ that satisfy the following conditions:
$$\sum_{i=1}^{n} x_i = n \quad\text{and}\quad \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 = n,$$
(Source of the problem and solution idea: PLMO LX – Phase 2 – Problem 6.)
Chapter 3
Functions, Polynomials, and Functional Equations
single variable x can always be written (or rewritten) in the following form
$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i,$$
As a result, the degree of this polynomial is 4. Polynomials of small degrees have names; in particular, a degree 0 polynomial P(x) = C with C ≠ 0 is called a non-zero constant (the zero polynomial, C = 0, is a special case), degree 1 polynomials are called linear, degree 2 ones are called quadratic, and degree 3 ones cubic.
In order to determine the degree of a polynomial that is not in standard form, one has to put it in standard form by expanding the products and combining the terms. For example,
$$(x + 1)^2 - (x - 1)^2 = 4x$$
is of degree 1 despite the fact that each summand has degree 2. This is not needed when the polynomial is expressed as a product of polynomials. One can easily see that the degree of a product is the sum of the corresponding degrees of all the factors. Similarly, the degree of the composition of two non-constant polynomials, say P(x) and Q(x), is the product of their degrees. For example, if $P(x) = x^2 - x$ and $Q(x) = x^3 + x$, then
$$P(x) \circ Q(x) = P(Q(x)) = (x^3 + x)^2 - (x^3 + x) = x^6 + 2x^4 + x^2 - x^3 - x = x^6 + 2x^4 - x^3 + x^2 - x$$
has degree 6 = 2 ⋅ 3.
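The composition example can be reproduced with a few lines of code. This sketch stores a polynomial as a vector of integer coefficients with the constant term first; the helper names `polymul`, `polyadd`, and `compose` are hypothetical.

```julia
# Polynomials are stored as coefficient vectors, constant term first.
function polymul(p::Vector{Int}, q::Vector{Int})
    r = zeros(Int, length(p) + length(q) - 1)
    for (i, a) in enumerate(p), (j, b) in enumerate(q)
        r[i + j - 1] += a * b
    end
    return r
end

function polyadd(p::Vector{Int}, q::Vector{Int})
    r = zeros(Int, max(length(p), length(q)))
    r[1:length(p)] .+= p
    r[1:length(q)] .+= q
    return r
end

# Horner's scheme: P(Q(x)) = (...(a_n Q + a_{n-1}) Q + ...) Q + a_0.
function compose(p::Vector{Int}, q::Vector{Int})
    r = [p[end]]
    for a in reverse(p[1:end-1])
        r = polyadd(polymul(r, q), [a])
    end
    return r
end

P = [0, -1, 1]       # x^2 - x
Q = [0, 1, 0, 1]     # x^3 + x
PQ = compose(P, Q)
println(PQ)               # [0, -1, 1, -1, 2, 0, 1], i.e. x^6 + 2x^4 - x^3 + x^2 - x
println(length(PQ) - 1)   # degree 6 = 2 * 3
```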
Complex Numbers In order to make our next observation about the number of roots of a given polynomial, it will be convenient to introduce the concept of complex numbers. However, we will not use them elsewhere in this book.
A complex number is a number that can be expressed in the form a + bi, where a and b are real numbers, and i is a solution of the equation $x^2 = -1$. Clearly, no real number satisfies this equation, and so i is called an imaginary number. For a given complex number z = a + bi, a is called the real part, and b is called the imaginary part.
Complex numbers allow solutions to certain equations that have no solutions in real numbers. For example, $x^2 - 4x + 8 = 0$ has no real solution. Indeed, it can be rewritten as follows: $(x - 2)^2 = -4$, and now it is clear that the square of a real number cannot be negative. On the other hand, since $i^2 = -1$, both 2 + 2i and 2 − 2i are solutions to this equation; for example,
$$((2 - 2i) - 2)^2 = (-2i)^2 = (-2)^2 i^2 = 4(-1) = -4.$$
Roots A root (or zero) of a real or complex function f = f (x) is a number x from the
domain of f such that f (x) = 0. In particular, a root of a polynomial P (x) is a root of the
corresponding polynomial function.
The fundamental theorem of algebra states that every non-constant single-variable polynomial with complex coefficients has at least one complex root. The theorem can be alternatively stated as follows: every non-zero, single-variable, degree n polynomial with complex coefficients has, counted with multiplicity, exactly n complex roots. The equivalence of the two statements can be proven through the use of successive polynomial division.
Clearly, the fundamental theorem of algebra can be applied to polynomials with real coefficients that we are concerned with in this book, since every real number is a complex number with an imaginary part equal to zero. It follows that the number of complex roots of each such polynomial is exactly n and so the number of real roots is at most n. As observed earlier, it can be strictly less than n.
Suppose that real numbers $x_1$, $x_2$, $x_3$, and $x_4$ are roots of some polynomial W(x) of degree 4 with all integer coefficients. Prove that if $x_3 + x_4$ is rational and $x_3 x_4$ is irrational, then $x_1 + x_2 = x_3 + x_4$.
THEORY
Vieta’s Formulas Vieta’s formulas are formulas that relate the coefficients of a polynomial to sums and products of its roots. Consider a polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$, where $a_0, a_1, \ldots, a_n \in \mathbb{R}$ and $a_n \ne 0$. Since P(x) has degree n, by the fundamental theorem of
the second representation and then identifying the coefficients of each power of x in the two representations. It follows that for any $k \in [n]$, we have that
$$\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} \prod_{j=1}^{k} x_{i_j} = (-1)^k \frac{a_{n-k}}{a_n}. \tag{3.1}$$
Note that the indices $i_j$ are in increasing order to ensure that each sub-product of roots is used exactly once. In particular, for k = 1, k = 2, and k = n we get
$$\sum_{i=1}^{n} x_i = -\frac{a_{n-1}}{a_n}, \qquad \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i \cdot x_j = \frac{a_{n-2}}{a_n}, \qquad \prod_{i=1}^{n} x_i = (-1)^n \frac{a_0}{a_n}. \tag{3.2}$$
$$\left(\sum_{i=1}^{n} x_i\right)^2 = \sum_{i=1}^{n} x_i^2 + 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i x_j$$
and so
$$\sum_{i=1}^{n} x_i^2 = \left(\sum_{i=1}^{n} x_i\right)^2 - 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i x_j = \left(-\frac{a_{n-1}}{a_n}\right)^2 - 2\,\frac{a_{n-2}}{a_n}.$$
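The formulas in (3.2) and the sum-of-squares identity can be checked on a concrete polynomial. The sketch below picks the illustrative roots 1, 2, 3, 4 (an arbitrary choice) and a hypothetical helper `from_roots` that expands the product of the factors $(x - r)$.

```julia
# Expand lead * prod(x - r) into coefficients, constant term first.
function from_roots(rts::Vector{Int}; lead::Int = 1)
    coeffs = [lead]
    for r in rts
        shifted = vcat(0, coeffs)         # x * current polynomial
        scaled = vcat(-r .* coeffs, 0)    # -r * current polynomial
        coeffs = shifted .+ scaled
    end
    return coeffs
end

rts = [1, 2, 3, 4]
c = from_roots(rts)                 # x^4 - 10x^3 + 35x^2 - 50x + 24
n = length(rts)
an, an1, an2 = c[n + 1], c[n], c[n - 1]

println(c)                                             # [24, -50, 35, -10, 1]
println(sum(rts) == -an1 // an)                        # sum of roots
println(prod(rts) == (-1)^n * c[1] // an)              # product of roots
println(sum(r^2 for r in rts) == (an1 // an)^2 - 2 * (an2 // an))  # sum of squares
```

All three comparisons print `true`; in particular $1 + 4 + 9 + 16 = 30 = 10^2 - 2 \cdot 35$, without ever solving for the roots from the coefficients.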
SOLUTION
that all roots of the considered polynomial are non-zero. Clearly, $x_3$ and $x_4$ are non-zero as their product is irrational. In order to derive a contradiction, suppose that $x_1 = 0$ or $x_2 = 0$. Without loss of generality, we may assume that $x_1 = 0$; $x_2$ may or may not be equal to 0.
then the formula for k = 2 gives us that $x_3 x_4 = a_2/a_4$ is rational, which gives us the desired contradiction. Hence, all roots of the considered polynomial are indeed non-zero.
We will use Vieta’s formulas again, in fact, a few times. Observe first that, since $x_3 + x_4$
i=1
i=1
i≠j
is rational. Since
is rational,
REMARKS
The key value of Vieta’s formulas is that one can use them to prove many useful properties of roots of polynomials without actually finding them. In the problem we consider in this section, the key observation is that if the coefficients of a given polynomial are integers, then all values following from Vieta’s formulas must be rational. In fact, one could relax the assumption and only assume that the coefficients are rational (not necessarily integers). Indeed, this clearly follows from the fact that multiplying any polynomial by a non-zero constant does not change its roots so any polynomial with rational coefficients can be transformed into a polynomial with integer coefficients.
Once we observe that all values following from Vieta’s formulas are rational, the next step is to exhaustively write down all the facts about rationality or irrationality of different combinations of the $x_i$. In practice, when solving problems of this flavour, one would write down many more facts that are possible to derive, and most of them would turn out to be useless. This is hard to avoid as it is typically difficult to predict which combination of them is crucial for the problem at hand.
EXERCISES
3.1.1. Find all sets of six real numbers $a_1, a_2, a_3, b_1, b_2, b_3$ with the property that for all
let $a_4 = a_1$ and $b_4 = b_1$).
(Source of the problem and solution idea: PLMO LXX – Phase 1 – Problem 5.)
$$f(x) = x^n + \sum_{i=0}^{n-3} a_i x^i$$
PROBLEM
(3.3)
THEORY
$$f(x + y) = f(x)f(y)$$
$$f(x + y)^2 = f(x)^2 + f(y)^2$$
for all x, y ∈ R. By considering the case x = y = 0, we get that $f(0)^2 = f(0)^2 + f(0)^2$ and so $f(0)^2 = 0$. From this it follows that f(0) = 0. Now, consider any x ∈ R and let y = −x. We get
$$f(x)^2 \le f(x)^2 + f(-x)^2 = f(x - x)^2 = f(0)^2 = 0$$
Continuous Functions In order to solve our next example, we need to introduce continuous
functions. A function f : R → R is continuous if sufficiently small changes of the argument x result in arbitrarily small changes of the value f(x). Otherwise, a function is said to be discontinuous.
There are several different formal definitions of continuity of a function, all of them being equivalent. Below, we present a few which are commonly used.
Limits of functions: Function f is continuous at some point c of its domain if the limit of f(x), as x approaches c through elements of the domain of f, exists and is equal to f(c); that is,
$$\lim_{x \to c} f(x) = f(c).$$
Limits of sequences: One can instead require that for any sequence $(x_n)_{n \in \mathbb{N}}$ of points in the
is,
(3.4)
Epsilon–delta: For every ε > 0 (arbitrarily small), there exists δ = δ(ε) > 0 (which depends on ε) such that for all x in the domain of f with c − δ < x < c + δ, the function satisfies f(c) − ε < f(x) < f(c) + ε; that is,
Cauchy’s Functional Equation and Additive Functions Let us consider one particular
functional equation that is both interesting and important. Cauchy’s functional equation is
the functional equation
f (x + y ) = f (x) + f (y ) .
(3.5)
$$f(ax) = af(x), \quad \text{for any } a \in \mathbb{N},\ x \in \mathbb{R}. \tag{3.6}$$
After substituting x with x/a in (3.6) and multiplying both sides by b/a, we get that
$$\frac{b}{a} f(x) = b f\left(\frac{x}{a}\right), \quad \text{for any } a, b \in \mathbb{N},\ x \in \mathbb{R}. \tag{3.7}$$
By combining (3.6) and (3.7), we get that
$$\frac{b}{a} f(x) = b f\left(\frac{x}{a}\right) = f\left(\frac{b}{a} x\right),$$
$$f(q) = f\left(\frac{b}{a} \cdot 1\right) = \frac{b}{a} f(1) = \frac{b}{a} c = cq.$$
$$0 = f(0) = f(x - x) = f(x) + f(-x),$$
and so f(−x) = −f(x). This allows us to reduce this case to the previous one. It follows that for any $q \in \mathbb{Q}_-$,
It remains to show that, since f is continuous, f(x) = cx for any x ∈ R ∖ Q. For that we will use the “limits of sequences” variant of the definition for a function to be continuous (see (3.4)). Let x ∈ R ∖ Q. It is easy to construct a sequence $(x_n)_{n \in \mathbb{N}}$ of rational numbers
$$0 \le x - x_n = \frac{xn - \lfloor xn \rfloor}{n} < \frac{1}{n},$$
and so, indeed, $\lim_{n \to \infty} x_n = x$. From (3.4) it follows that
SOLUTION
After applying (3.3) for the specific case y = f(0) − x, we get that
$$f(x) = f(2x - f(0)) + f(x)^2$$
which implies that
$$f(2x - f(0)) = f(x) - f(x)^2 = f(x)(1 - f(x)).$$
Since function g(z) = z(1 − z) attains its maximum value, 1/4, at z = 1/2, we get that $f(2x - f(0)) \le 1/4$. Since x is arbitrary and f(0) is fixed (once the function is fixed; in fact, as argued below, it is always equal to 0 or −1), we conclude that function f has the following important property:
$$f(x) \le \frac{1}{4} \quad \text{for all } x \in \mathbb{R}. \tag{3.8}$$
Applying (3.3) for another specific case, x = f(0) and y = −f(0), we get that $f(2f(0)) = f(2f(0)) + f(f(0))^2$ and so f(f(0)) = 0. On the other hand, for x = y = 0 we get $f(f(0)) = f(0) + f(0)^2$ so, combining the two observations, we get that there are only two possible values of f(0): f(0) = 0 or f(0) = −1. We will consider these two cases independently.
Suppose first that f(0) = −1. In this case, f(−1) = f(f(0)) = 0 and (3.3) for (x, y) = (0, 1) gives us $f(f(1)) = f(-1) + f(0)^2 = 0 + (-1)^2 = 1$. But this contradicts (3.8), and so no function f satisfies both the functional equation (3.3) and f(0) = −1.
Suppose then that f(0) = 0. Clearly, the constant function f(x) = 0 for x ∈ R, satisfies (3.3); so this is certainly one solution. We will show that it is actually the only one. For a contradiction, suppose that for some z ∈ R we have f(z) ≠ 0. Considering the case x = y = z we get from (3.3) that $f(z + f(2z)) = f(z)^2$, which implies that
In particular, $f(x_1) = \delta > 0$. By applying (3.3) with $x = x_i$ and y = 0, we get that
$$f(x_{i+1}) = f(x_i + f(x_i)) = f(x_i) + f(x_i)^2 \ge f(x_i).$$
it implies that
$$f(x_{i+1}) = f(x_i) + f(x_i)^2 \ge f(x_i) + \delta^2$$
unbounded (that is, $\lim_{n \to \infty} f(x_n) = \infty$) which contradicts (3.8). Hence, indeed, f(x) = 0
REMARKS
A standard approach used to solve functional equations is to prove some specific properties of the function involved. In particular, it is often useful to try to establish the value of the function for some carefully chosen, characteristic values. In our problem, this specific value was equal to 0. Another common approach is to try to reduce the functional equation involving, say, two independent variables (in our case, x and y) into a more general equation involving only one variable. In our case, we used this approach by setting x = y. It is important to remember that if we prove some property of a more general equation (in this case involving one variable), then we always have to check if the final solution meets the original equation.
EXERCISES
for all a, b ∈ Z.
(Source of the problem and solution idea: PLMO LXV – Phase 1 – Problem 5.)
3.2.2. Find all pairs of functions f : R → R and g : R → R such that
for all x, y ∈ R.
(Source of the problem and solution idea: PLMO LXIII – Phase 2 – Problem 4.)
3.2.3. Find all functions f : R → R such that for all x, y ∈ R, we have that
(Source of the problem and solution idea: PLMO LIX – Phase 2 – Problem 3.)
for all x, y ∈ R.
THEORY
When solving functional equations, it is quite common that only some specific cases are considered. For example, one may fix some specific values in the given equation or relate one variable to some other variable or variables. Indeed, typically one starts from the original functional equation but then immediately transforms it into something more manageable and insightful. But, as a result, the resulting equations are not equivalent to the original ones. In other words, the obtained conditions are only necessary but often not sufficient. Therefore, it is always important to make sure that the final result actually satisfies the original functional equation.
In order to illustrate this issue, let us consider the following functional equation. Suppose that our goal is to find all functions $f : \mathbb{R}_+ \to \mathbb{R}_+$ that satisfy the following condition
$$\sqrt{f(x) + f(y)} = x + y,$$
for all x, y ∈ R. First, after fixing y = 0 and squaring the equation, we get that
$$f(x) + f(0) = x^2 + 0 = x^2.$$
solution we have simplified and relaxed the original condition by setting y = 0 and squaring the equation. Therefore, $f(x) = x^2$ is only a candidate solution and now we have to check that it actually satisfies the original equation. Once we substitute the function into the original equation we get $\sqrt{x^2 + y^2} = x + y$. This equation clearly does not hold unless
Let us start with rewriting the functional equation as follows: for all x, y ∈ R,
$$f(y) = f(x^2 + y) - x f(x).$$
$$f(x^2 - x) = f((x - 1)^2 + (x - 1)) = (x - 1)f(x - 1) + f(x - 1),$$
and so
$$\begin{aligned}
f(-x) &= (x - 1)f(x - 1) + f(x - 1) - x f(1^2 + (x - 1)) \\
&= -\bigl(f(1^2 + (x - 1)) - f(x - 1)\bigr)x.
\end{aligned}$$
Going back to the original equation for the last time, we get that f(−x) = −f(1)x. It follows that the only functions that potentially satisfy the given functional equation have the form f(x) = ax, where a is a fixed real number.
We directly check that if f(x) = ax, then
$$f(x^2 + y) - x f(x) = ax^2 + ay - x \cdot ax = ay = f(y).$$
However, in our problem, our goal is to find bijections and so the case a = 0 has to be ruled out. We conclude that the only functions that meet all the conditions are of the form f(x) = ax where a ≠ 0.
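The direct check for f(x) = ax can also be carried out on a grid of integer arguments. This is a sketch only: the function name `check_linear` and the grid −10, …, 10 are arbitrary illustrative choices, and a finite check is of course no substitute for the algebraic verification.

```julia
# Check f(x^2 + y) - x f(x) == f(y) for f(x) = a x on a grid of integer arguments.
function check_linear(a::Int; bound::Int = 10)
    f(t) = a * t
    return all(f(x^2 + y) - x * f(x) == f(y) for x in -bound:bound, y in -bound:bound)
end

println(check_linear(3))    # true
println(check_linear(-2))   # true

# A non-linear candidate such as f(x) = x^2 fails already at x = y = 1.
g(t) = t^2
println(g(1^2 + 1) - 1 * g(1) == g(1))   # false: 4 - 1 != 1
```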
REMARKS
In order to better understand how one can get ideas leading to the solution of similar
problems, let us make some observations that we made during the process of solving the
problem at hand. First, by fixing x = 1 we observed that for any y ∈ R,
$$f(y + 1) = f(y) + f(1).$$
It follows that function f constantly increases by the same value, f(1), when we increase its argument by 1.
Our second observation was that changing x to −x does not affect the left hand side of the original equation and so xf(x) = −xf(−x) for any x ∈ R. Hence, for x ≠ 0 we get that f(x) = −f(−x). On the other hand, f(0) = f(−1 + 1) = f(−1) + f(1) = −f(1) + f(1) = 0, and so in fact f(x) = −f(−x) for all x ∈ R.
Our next idea was to try to find two different expressions of the form $x^2 + y$ that have identical value. One such expression was given in the solution above. Another identity that could have been used is $x^2 + x = (x + 1)^2 + (-1 - x)$. Applying it we would get that
$$f(x^2 + x) = x f(x) + f(x)$$
$$f((x + 1)^2 + (-1 - x)) = (x + 1)f(x + 1) + f(-1 - x).$$
As a result,
$$\begin{aligned}
x f(x) + f(x) &= -f(x + 1) + (x + 1)f(x + 1) \\
&= x f(x + 1) = x f(x) + x f(1),
\end{aligned}$$
all x ∈ R).
(Source of the problem and solution idea: PLMO LIII – Phase 2 – Problem 1.)
3.3.2. Given that function f(x) satisfies f(1/(1 − x)) = xf(x) + 1, find the value of f(5).
(Source of the problem: question asked on Quora. Solution: our own.)
3.3.3. Suppose that a function f(x, y, z) of three real arguments satisfies the following
condition
5 5
i=1 i=1
i=1 i=1
where $x_{i+n} = x_i$.
(Source of the problem and solution: PLMO LIX – Phase 3 – Problem 2.)
3.4 Polynomials with Integer Coefficients
SOURCE
Problem and solution idea: PLMO LVII – Phase 3 – Problem 6 (modified)
PROBLEM
Find all pairs of integers (a, b) with the property that there exists a polynomial P (x) having
integer coefficients such that
$$(x^2 + ax + b) \cdot P(x) = Q(x) = \sum_{i=0}^{n} c_i x^i, \tag{3.9}$$
Rational Root Theorem Our next theorem, the rational root theorem (sometimes called the rational root test), states a constraint on rational solutions of a polynomial equation with integer coefficients. Consider any polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$ with all integer coefficients, that is, $a_i \in \mathbb{Z}$ for all $i \in [n] \cup \{0\}$. Suppose that $P(x)$ has a rational root $p/q$, where $p \in \mathbb{Z}$ and $q \in \mathbb{Z}$ are co-prime (in other words, the fraction $p/q$ is in its lowest terms). Then, $p \mid a_0$ and $q \mid a_n$.
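Indeed, since $P(p/q) = 0$, multiplying by $q^n$ gives $\sum_{i=0}^{n} a_i p^i q^{n-i} = 0$, which can be rewritten as
$$p \cdot \Big( \sum_{i=1}^{n} a_i p^{i-1} q^{n-i} \Big) + a_0 q^n = 0.$$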
Since $p$ and $q$ are co-prime, we get that $p \mid a_0$. Similarly, from the very same equation, we get that
$$a_n p^n + q \cdot \Big( \sum_{i=0}^{n-1} a_i p^i q^{n-i-1} \Big) = 0,$$
and q are co-prime, then it is possible to represent it as $(x - p/q) R(x)$, where $R(x)$ has rational coefficients. If $R(x)$ is a rational constant $r$, then $P(x) = r(x - p/q) = a(qx - p)$, where $a = r/q$ is an integer, as by assumption $r$ and $rp/q$ are integers and $p$ and $q$ are co-prime. If $R(x)$ is not a constant, then the lemma of Gauss implies that $P(x) = P'(x) Q(x)$, where $P'(x)$ and $Q(x)$ are non-constant and have integer coefficients and $P'(x)$ has a root $p/q$. If $P'(x)$ has degree 1, then clearly it can be written as $a(qx - p)$ where $a$ is an integer and we are done. Otherwise, we have a polynomial $P'(x)$ of degree less than the degree of $P(x)$ with integer coefficients that has a root $p/q$. As $P(x)$ initially had a finite degree, by replacing $P(x)$ by $P'(x)$ and repeated application of this reasoning, we get that eventually we must reach the required factorization $P(x) = (qx - p) R'(x)$, where $R'(x)$ has integer
i=0 i
i
n 0
above.
SOLUTION
Let $P(x)$ be a polynomial with integer coefficients that satisfies (3.9) for some pair of integers $(a, b)$ and some set of coefficients $c_i \in \{-1, 1\}$. Let us first observe that if $x$ is a root of $Q(x)$, then
$$|x^n| = |x|^n = \Big| \sum_{i=0}^{n-1} c_i x^i \Big|.$$
We will now show that $|x| < 2$. For a contradiction, suppose that there exists $x \in \mathbb{R}$ such that $|x| \ge 2$ and the above equality holds. Since
$$|x|^n = \Big| \sum_{i=0}^{n-1} c_i x^i \Big| \le \sum_{i=0}^{n-1} |x|^i = \frac{|x|^n - 1}{|x| - 1} \le |x|^n - 1,$$
we get that $0 \le -1$, which gives us the desired contradiction. Since $Q(x)$ does not have roots whose absolute values are greater than or equal to 2, the same property holds for the polynomial $R(x) := x^2 + ax + b$.
It follows that $R(2) = 4 + 2a + b > 0$ and $R(-2) = 4 - 2a + b > 0$, and so
$$-2 - \frac{b}{2} < a < 2 + \frac{b}{2}.$$
It follows immediately from (3.9), since $c_0 \in \{1, -1\}$ and $P(x)$ has integer coefficients, that $b$ must be equal to 1 or −1. We will independently consider these two cases.
Case 1: b = −1. The possible values of a are −1, 0, and 1. If a = −1 or a = 1, then we can clearly fix P (x) = 1 to get the desired property. If a = 0, then one can take P (x) = x + 1 to get that
$$(x^2 + ax + b) \cdot P(x) = (x^2 - 1)(x + 1) = x^3 + x^2 - x - 1,$$
Case 2: b = 1. This time there are more possible values of a to consider: −2, −1, 0, 1, and 2. If a = −1 or a = 1, then we again use P (x) = 1. If a = 0, then we use P (x) = x + 1 to get the desired property:
$$(x^2 + ax + b) \cdot P(x) = (x^2 + 1)(x + 1) = x^3 + x^2 + x + 1.$$
We are left with two cases, a = −2 and a = 2, for which we use P (x) = x + 1 and, respectively, P (x) = x − 1:
$$(x^2 - 2x + 1)(x + 1) = x^3 - x^2 - x + 1$$
and
$$(x^2 + 2x + 1)(x - 1) = x^3 + x^2 - x - 1.$$
Putting both cases together, we conclude that the set of solutions to our problem is
(a, b) ∈ {(−2, 1), (−1, −1), (−1, 1), (0, −1), (0, 1), (1, −1), (1, 1), (2, 1) } .
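The eight products used in the two cases can be re-checked mechanically. The following is a minimal sketch, not part of the original solution: polynomials are stored as coefficient lists (lowest degree first), and the names `poly_mul`, `witnesses`, and `all_pm_one` are hypothetical helpers introduced only for this illustration. It verifies that each listed pair (a, b), multiplied by the witness P(x) chosen in the text, gives a polynomial with all coefficients in {−1, 1}.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Each pair (a, b) from the final answer, with the witness P(x) used in the text.
witnesses = {
    (-2, 1): [1, 1],    # P(x) = x + 1
    (-1, -1): [1],      # P(x) = 1
    (-1, 1): [1],
    (0, -1): [1, 1],
    (0, 1): [1, 1],
    (1, -1): [1],
    (1, 1): [1],
    (2, 1): [-1, 1],    # P(x) = x - 1
}

def all_pm_one(coeffs):
    """True if every coefficient equals -1 or 1."""
    return all(c in (-1, 1) for c in coeffs)

for (a, b), P in witnesses.items():
    Q = poly_mul([b, a, 1], P)      # (x^2 + a x + b) * P(x)
    print((a, b), Q, all_pm_one(Q))
```

This only confirms that the listed pairs work; the fact that no other pair works relies on the root bound argument above.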
REMARKS
In order to solve the problem in this section, we used the geometry of the roots of a given polynomial, that is, the information about their localization (in the complex plane or on the real line, depending on whether we allow complex roots or restrict ourselves to real ones). It is perhaps surprising that one can actually deduce it from the degree and the coefficients of the polynomial. Some of these properties are important for many applications, such as upper bounds on the absolute values of the roots, which define a disk containing all roots, or lower bounds on the distance between two roots. Such bounds are widely used for root-finding algorithms for polynomials, either for limiting the regions where roots should be searched for, or for the computation of the computational complexity of these algorithms.
There are many upper bounds for the magnitudes of all complex roots. We will only mention two of them, Lagrange’s and Cauchy’s bounds. In our problem, we used the bound of Cauchy. Let $P(x) = \sum_{i=0}^{n} a_i x^i$ be a polynomial and let $z$ be its root, that is, $P(z) = 0$. Lagrange’s bound is
$$|z| \le \max\Big\{ 1, \sum_{i=0}^{n-1} \Big| \frac{a_i}{a_n} \Big| \Big\},$$
and Cauchy’s bound is
$$|z| \le 1 + \max_{0 \le i \le n-1} \Big| \frac{a_i}{a_n} \Big|. \tag{3.10}$$
Lagrange’s bound is smaller than Cauchy’s only when 1 is larger than the sum of all ratios $|a_i/a_n|$ but the largest, which is relatively rare in practice. As a result, Cauchy’s bound is more widely known and used than Lagrange’s. We will prove the bound of Cauchy below.
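To make the comparison between the two bounds concrete, here is a small sketch that evaluates both on one example. The polynomial (x − 1)(x − 2)(x + 3) = x^3 − 7x + 6 and the function names `lagrange_bound` and `cauchy_bound` are hypothetical choices made for this illustration; the roots are known by construction, so no root finder is needed.

```python
def lagrange_bound(coeffs):
    """max{1, sum |a_i / a_n|} for coeffs = [a_0, ..., a_n] with a_n != 0."""
    an = coeffs[-1]
    return max(1, sum(abs(c / an) for c in coeffs[:-1]))

def cauchy_bound(coeffs):
    """1 + max |a_i / a_n| over i < n, for coeffs = [a_0, ..., a_n] with a_n != 0."""
    an = coeffs[-1]
    return 1 + max(abs(c / an) for c in coeffs[:-1])

# P(x) = (x - 1)(x - 2)(x + 3) = x^3 - 7x + 6, with roots 1, 2 and -3.
coeffs = [6, -7, 0, 1]
roots = [1, 2, -3]

largest_root = max(abs(z) for z in roots)
print("largest |root|:", largest_root)
print("Lagrange bound:", lagrange_bound(coeffs))
print("Cauchy bound:  ", cauchy_bound(coeffs))
```

In this example the sum of the ratios is much larger than 1, so Cauchy's bound is the tighter of the two, in line with the remark above.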
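Let $z$ be any root of $P(x)$. If $|z| \le 1$, then (3.10) is trivially satisfied so suppose that $|z| > 1$. Since $P(z) = 0$, we get that
$$-a_n z^n = \sum_{i=0}^{n-1} a_i z^i,$$
and so
$$|a_n| \, |z|^n \le \max_{0 \le i \le n-1} |a_i| \sum_{j=0}^{n-1} |z|^j = \max_{0 \le i \le n-1} |a_i| \cdot \frac{|z|^n - 1}{|z| - 1} \le \max_{0 \le i \le n-1} |a_i| \cdot \frac{|z|^n}{|z| - 1},$$
which, after dividing both sides by $|a_n| \, |z|^n / (|z| - 1) > 0$, gives (3.10).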
having only integer coefficients with the property that for each n ∈ N there exists k = k(n) ∈ Z such that $P(k) = f_n$.
3.4.2. Suppose that a polynomial P (x) has all integer coefficients. Prove that if polynomials
P (P (P (x))) and P (x) have a common real root, then P (x) also has an integer root.
(Source of the problem and solution idea: PLMO LVIII – Phase 2 – Problem 1.)
prime number p, there exists k ∈ Z such that P (k) and P (k + 1) are divisible by p. Prove
that there exists m ∈ Z such that P (m) = P (m + 1) = 0.
(Source of the problem and solution idea: PLMO LVI – Phase 2 – Problem 4.)
Find all polynomials P (x) of odd degree that satisfy the following equation:
$$P(x^2 - 1) = (P(x))^2 - 1. \tag{3.11}$$
THEORY
The Lagrange polynomial is the polynomial of lowest degree that passes through all of these
n points. It is easy to see that the interpolating polynomial of the least degree is unique and
can be computed using the following formula:
$$P(x) = \sum_{j=1}^{n} y_j \prod_{\substack{1 \le k \le n \\ k \ne j}} \frac{x - x_k}{x_j - x_k}.$$
Indeed, it is easy to see that $P(x)$ goes through $(x_i, y_i)$, as for $x = x_i$ all terms in the sum but the $i$th term vanish, and in the $i$th term all fractions in the product are equal to 1. In order to prove uniqueness, consider two polynomials $P(x)$ and $Q(x)$ of degree less than $n$ that go through points $(x_i, y_i)$. But this means that $P(x) - Q(x)$ is a polynomial of degree less than $n$ and has $n$ distinct roots, namely, $x_i$, $i \in [n]$. However, by the fundamental theorem of algebra, this is only possible if $P(x) - Q(x) = 0$ for all $x$ and so $P(x) = Q(x)$ for all $x$.
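The interpolation formula translates almost literally into code. The sketch below is an illustration under stated assumptions: the function name `lagrange_eval` and the sample points (taken from 3x^2 − x + 1) are hypothetical, and exact fractions are used so that the comparison with the underlying polynomial is exact.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate at x the Lagrange polynomial through the given (x_j, y_j) points.

    The x_j values are assumed to be pairwise distinct.
    """
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for k, (xk, _) in enumerate(points):
            if k != j:
                term *= Fraction(x - xk, xj - xk)
        total += term
    return total

# Three points sampled from 3x^2 - x + 1 (a hypothetical example).
points = [(0, 1), (1, 3), (2, 11)]

for x in (0, 1, 2, 3, Fraction(1, 2)):
    print(x, lagrange_eval(points, x))
```

Since a polynomial of degree less than 3 through these three points is unique, the values returned at other arguments agree with 3x^2 − x + 1.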
In fact, the above proof of uniqueness naturally extends to the case of an infinite number of points. We get the following useful fact. Suppose that there is an infinite set of points $(x_i, y_i)$, $i \in \mathbb{N}$, with no two $x_i$ values equal to each other. Note that it might be the case that there is no polynomial that goes through all of these points (consider, for example, the set {(0, 1), (1, 0), (2, 0), (3, 0), …}; there is no polynomial P (x) that has an infinite number of roots, unless P (x) = 0 for all x, the special case). On the other hand, if there is a polynomial that passes through all of these points, then this polynomial is uniquely defined. In order to see this, consider any two polynomials P (x) and Q(x) that pass through these points. As for the finite case, by considering the polynomial P (x) − Q(x), we get that P = Q, since both P (x) and Q(x) have a finite degree.
Finally, let us discuss one more useful fact. Suppose that a polynomial $P(x)$ has a root at point $x_0$ and $x_0$ is a local extremum (either local maximum or local minimum). We will show that the multiplicity of root $x_0$ is even. In order to see this, to derive a contradiction, suppose that the multiplicity of $x_0$ is some odd natural number $k$. It follows that one can represent polynomial $P(x)$ as $P(x) = (x - x_0)^k Q(x)$, where $Q(x)$ is some polynomial and $Q(x_0) \ne 0$. Since $Q(x)$ is continuous, there exists an open interval around $x_0$, $(x_0 - \epsilon, x_0 + \epsilon)$ for some $\epsilon > 0$, such that $Q(x)$ does not change sign on that interval. On the other hand, since $k$ is odd, the polynomial $(x - x_0)^k$ does change sign at $x_0$ and so $P(x)$ also changes it at that point. As a result, there is no extremum at $x_0$ and we get the desired contradiction.
Even and Odd Functions Let us also add a remark on even and odd functions, as we will need the associated symmetry relations to solve our problem. A function f : R → R is even if f (x) = f (−x) for all x ∈ R. Geometrically speaking, the graph of an even function is symmetric with respect to the y-axis, meaning that its graph remains unchanged after reflecting it about the y-axis. Examples of even functions are $f(x) = |x|$, $f(x) = x^2$, and $f(x) = \cos x$. On the other hand, a function f : R → R is odd if −f (x) = f (−x) for all x ∈ R. Geometrically, the graph of an odd function has symmetry with respect to the origin, meaning that its graph remains unchanged after a rotation of 180 degrees about the origin. Examples of odd functions are $f(x) = x$, $f(x) = x^3$, and $f(x) = \sin x$.
Let us mention some basic but useful properties and their implications for
polynomials. The sum of two even functions is even and the sum of two odd functions is
odd. If f (x) is even, then so is −f (x); the same property holds for odd functions. As a
result, the difference between two odd functions is odd and the difference between two even
functions is even. The sum of an even and odd function is neither even nor odd, unless one
of the functions is equal to zero over the whole domain.
From these observations we get that one can represent any polynomial $P(x) = \sum_{i=0}^{n} c_i x^i$ as a sum of two polynomials $Q(x)$ and $R(x)$ such that $Q(x)$ is odd and $R(x)$ is even. Indeed, one can partition all the terms of $P(x)$ into even and odd terms, that is,
$$Q(x) = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} c_{2i+1} x^{2i+1} \quad \text{and} \quad R(x) = \sum_{i=0}^{\lfloor n/2 \rfloor} c_{2i} x^{2i}.$$
It follows that P (x) is even if and only if Q(x) = 0. Similarly, P (x) is odd if and only if R(x) = 0. Moreover, since any polynomial has finite degree, in order to determine whether a polynomial is even or odd it is enough to check the corresponding conditions (P (x) = P (−x) or, respectively, P (x) = −P (−x)) for infinitely many distinct points x (not necessarily for all x ∈ R). Finally, note that if a function is even and odd, it must be equal to 0 everywhere. As a result, the only polynomial that passes both tests for infinitely many distinct points is the polynomial P (x) = 0, x ∈ R.
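The decomposition of a polynomial into its odd and even parts is easy to carry out on a coefficient list. The following sketch is illustrative only; the names `split_even_odd` and `evaluate`, and the example polynomial 5x^4 + 2x^3 − x + 4, are hypothetical choices.

```python
def split_even_odd(coeffs):
    """Split coeffs = [c_0, ..., c_n] into (odd part Q, even part R).

    Both parts are returned as coefficient lists of the same length as coeffs.
    """
    Q = [c if i % 2 == 1 else 0 for i, c in enumerate(coeffs)]
    R = [c if i % 2 == 0 else 0 for i, c in enumerate(coeffs)]
    return Q, R

def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest degree first) at x with Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

P = [4, -1, 0, 2, 5]            # P(x) = 5x^4 + 2x^3 - x + 4
Q, R = split_even_odd(P)
print("odd part: ", Q)
print("even part:", R)

for x in range(-3, 4):
    # Q is odd, R is even, and they add up to P.
    print(x, evaluate(Q, -x) == -evaluate(Q, x),
          evaluate(R, -x) == evaluate(R, x),
          evaluate(P, x) == evaluate(Q, x) + evaluate(R, x))
```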
SOLUTION
$$(P(x))^2 = P(x^2 - 1) + 1 = P((-x)^2 - 1) + 1 = (P(-x))^2.$$
It follows that either $P(x) = P(-x)$ holds for an infinite number of values of $x$ or the equation $P(x) = -P(-x)$ does. Note that both of these properties cannot hold simultaneously, unless $P(x) = 0$ everywhere, which, as one can directly check, is impossible. Finally, as noted above, satisfying any of these properties for infinitely many points implies that it actually holds for the whole real line $\mathbb{R}$.
Since $P(x)$ has odd degree, it follows that $P(x) = -P(-x)$ for all $x \in \mathbb{R}$, that is, $P(x)$ is symmetric around the origin. In particular, we get that $P(0) = 0$. Using the original equation with $x = 0$, we get that $P(-1) = P(0^2 - 1) = (P(0))^2 - 1 = -1$ and so $P(1) = 1$.
The next property that we will prove is that $P(y) \ge -1$ for all $y \ge -1$. (However, we will only use it for $y \ge 1$.) Let $y \ge -1$ be any real number. Since $x = x(y) := \sqrt{y + 1}$ satisfies $y = x^2 - 1$, we get that
$$P(y) = P(x^2 - 1) = (P(x))^2 - 1 \ge -1.$$
Our next task is to show that $P(x) = x$ holds for infinitely many values of $x$ and so the only solution is the polynomial $P(x) = x$. In order to see this, let us recursively define the following sequence of numbers: $x_1 = 1$ and for each $n \in \mathbb{N} \setminus \{1\}$, we define $x_n := \sqrt{x_{n-1} + 1}$. This sequence is increasing, tending to $(\sqrt{5} + 1)/2 \approx 1.618$, the unique solution to the equation $x = \sqrt{x + 1}$. However, we will not need this property. We will show by induction on $n$ that for all $n \in \mathbb{N}$, we have $P(x_n) = x_n$, which will finish the proof.
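The behaviour of this sequence is easy to observe numerically. The sketch below rests on an assumption: the recurrence x_n = sqrt(x_{n−1} + 1) with x_1 = 1 is inferred from the fixed-point equation x = √(x + 1) quoted in the text and from the inductive step, and the function name `sqrt_sequence` is hypothetical. Floating-point arithmetic is used, so the output is only an approximation.

```python
import math

def sqrt_sequence(n_terms):
    """Return x_1, ..., x_{n_terms} with x_1 = 1 and x_n = sqrt(x_{n-1} + 1).

    The recurrence is an assumption inferred from the fixed-point equation
    x = sqrt(x + 1) mentioned in the text.
    """
    xs = [1.0]
    while len(xs) < n_terms:
        xs.append(math.sqrt(xs[-1] + 1.0))
    return xs

golden_ratio = (math.sqrt(5.0) + 1.0) / 2.0
xs = sqrt_sequence(15)
for n, x in enumerate(xs, start=1):
    # Print the term and its remaining distance to (sqrt(5) + 1) / 2.
    print(n, x, golden_ratio - x)
```

The terms are pairwise distinct, which is what the argument needs: a polynomial agreeing with x at infinitely many points must be x itself.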
The base case is trivial: note that $P(x_1) = P(1) = 1 = x_1$. For the inductive step, suppose that $P(x_{n-1}) = x_{n-1}$ for some $n \in \mathbb{N} \setminus \{1\}$. It follows from the original equation (3.11) that
$$(P(x_n))^2 = P(x_n^2 - 1) + 1 = P(x_{n-1}) + 1.$$
REMARKS
In our problem, we were restricted to polynomials of odd degree. Let us now relax this
assumption and consider polynomials of even degree. Constant polynomials (that is,
polynomials of degree 0) are easy to investigate. If P (x) = c for some c ∈ R, then c must
satisfy the equation $c = c^2 - 1$ and so there are only two constant polynomials that satisfy
that $P(x) = Q(x^2) = R(x^2 - 1)$. From this observation, using additionally the fact that
$$R((x^2 - 1)^2 - 1) = P(x^2 - 1) = (P(x))^2 - 1 = (R(x^2 - 1))^2 - 1.$$
$R(x)$ satisfies (3.11) for all $x \in [-1, \infty)$. Arguing as before, we get that $(R(x))^2 = (R(-x))^2$ for all $x \in [-1, 1]$ and so either $R(x) = R(-x)$ holds for an infinite number of values of $x \in [-1, 1]$, or $R(x) = -R(-x)$ does. It follows that $R(x)$ is either even or odd on the whole real line $\mathbb{R}$. It follows that $R(x)$ in fact satisfies (3.11) for all $x \in \mathbb{R}$, not only for those $x \in [-1, \infty)$.
Let us note that the degree of $R(x)$ is the same as the degree of $Q(x)$ and so it is less than the one of $P(x)$. As a result, repeating this reduction process we will eventually reach the case when $R(x)$ has an odd degree, as it is impossible that $R(x)$ is constant if $P(x)$ were not constant. Hence, if this happens, then $R(x) = x$ since we showed earlier that this is the only solution for the odd case. Therefore, all non-constant solutions of even degree have the form $T^{(n)}(x)$ for some $n \in \mathbb{N}$, where $T(x) = x^2 - 1$. Note that $T^{(n)}(x) = T \circ \ldots \circ T(x)$ is the composition of the function $T(x)$ performed $n$ times; see Section 3.7 for more details.
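One direction of this claim, namely that every iterate of T(x) = x^2 − 1 satisfies (3.11), can be confirmed with exact polynomial arithmetic. The code below is a sketch with hypothetical helper names (`trim`, `poly_add`, `poly_mul`, `poly_compose`, `iterate_T`, `satisfies_3_11`); polynomials are integer coefficient lists with the lowest degree first.

```python
def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return trim(out)

def poly_compose(p, q):
    """Return the coefficients of p(q(x)), using Horner's scheme."""
    result = [0]
    for c in reversed(p):
        result = poly_add(poly_mul(result, q), [c])
    return result

T = [-1, 0, 1]                      # T(x) = x^2 - 1

def iterate_T(n):
    """Coefficients of the n-fold composition of T with itself."""
    P = [0, 1]                      # start from the identity polynomial x
    for _ in range(n):
        P = poly_compose(T, P)
    return P

def satisfies_3_11(P):
    """Check P(x^2 - 1) == (P(x))^2 - 1 as an identity of polynomials."""
    return poly_compose(P, T) == poly_add(poly_mul(P, P), [-1])

for n in range(1, 5):
    P = iterate_T(n)
    print(n, len(P) - 1, satisfies_3_11(P))   # n, degree of the iterate, check
```

The converse direction (that there are no other non-constant solutions) is what the reduction argument in the text establishes; the code does not address it.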
EXERCISES
3.5.1. Find all polynomials P (x) with real coefficients that satisfy the following property: if
x + y is rational, then P (x) + P (y) is also rational.
(Source of the problem: PLMO LIV – Phase 1 – Problem 9. Solution: our own.)
3.5.2. Let P (x) be a polynomial with real coefficients. Prove that if there exists an integer k
such that P (k) is not an integer, then there are infinitely many such integers.
(Source of the problem and solution idea: PLMO LXVI – Phase 3 – Problem 2.)
3.5.3. Let F (x), G(x), and H (x) be some polynomials of degree at most 2n + 1 with real
coefficients. Moreover, suppose that the following properties hold:
i ∈ [n],
PROBLEM
Suppose that $P_1(x)$ and $Q_1(x)$ are two different polynomials with real coefficients that satisfy the following condition: $P_1(Q_1(x)) = Q_1(P_1(x))$ for all $x \in \mathbb{R}$. For $n \in \mathbb{N} \setminus \{1\}$, let $P_n(x) := P_1(P_{n-1}(x))$ and $Q_n(x) := Q_1(Q_{n-1}(x))$. Prove that $P_1(x) - Q_1(x)$ divides $P_n(x) - Q_n(x)$ for all $n \in \mathbb{N}$.
THEORY
i=0 i
i
0
this, observe that we can find the coefficients $q_i$ and the constant $r$ explicitly by comparing the corresponding coefficients of the two polynomials. We get that $q_{n-1} = p_n$,
$q_i$ (starting from $q_{n-1}$ and finishing with $q_0$), and at the end then compute $r$. Let us also note that $r = P(x_0)$. As a result, the above result could also be obtained in a different way.
has to be equal to zero and we get that $P(x) = (x - x_0) Q(x)$ for some polynomial $Q(x)$ of degree $n - 1$.
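The coefficient comparison described above is the classical synthetic division (Horner) scheme. Since part of the derivation is missing here, the sketch below fills in the recurrence as an assumption: writing P(x) = (x − x_0) Q(x) + r and comparing coefficients gives q_{n−1} = p_n, then q_{i−1} = p_i + x_0 q_i for i = n − 1, …, 1, and finally r = p_0 + x_0 q_0. The function name `divide_by_linear` is hypothetical.

```python
def divide_by_linear(p, x0):
    """Divide P(x) by (x - x0).

    p = [p_0, ..., p_n] (lowest degree first). Returns (q, r) such that
    P(x) = (x - x0) * Q(x) + r, where q = [q_0, ..., q_{n-1}].
    The recurrence q_{i-1} = p_i + x0 * q_i is the assumed reading of the text.
    """
    q = [0] * (len(p) - 1)
    carry = 0
    for i in range(len(p) - 1, 0, -1):
        carry = p[i] + x0 * carry   # this is q_{i-1}
        q[i - 1] = carry
    r = p[0] + x0 * carry           # the remainder, equal to P(x0)
    return q, r

# P(x) = x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3), a hypothetical example.
p = [6, -7, 0, 1]
for x0 in (2, 0, 3):
    q, r = divide_by_linear(p, x0)
    print(x0, q, r)
```

For x0 = 2 the remainder is zero, in agreement with the fact that r = P(x_0) vanishes exactly when x_0 is a root.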
SOLUTION
Let us first observe that for any two different polynomials $G(x)$ and $H(x)$, and any polynomial $F(x) = \sum_{i=0}^{n} c_i x^i$, we have that $G(x) - H(x)$ divides $F(G(x)) - F(H(x))$. Indeed,
$$F(G(x)) - F(H(x)) = \sum_{i=0}^{n} c_i \big( (G(x))^i - (H(x))^i \big) = (G(x) - H(x)) \sum_{i=0}^{n} c_i \sum_{j=0}^{i-1} (G(x))^j (H(x))^{i-1-j}.$$
Our second observation is that $P_n(Q_1(x)) = Q_1(P_n(x))$ for each $n \in \mathbb{N}$. This can be easily proved by induction on $n$. Indeed, the base case ($n = 1$) follows immediately from our assumption that $P_1(Q_1(x)) = Q_1(P_1(x))$. For the inductive step, suppose that $P_{n-1}(Q_1(x)) = Q_1(P_{n-1}(x))$ for some $n \in \mathbb{N} \setminus \{1\}$. Using the inductive hypothesis and
$P_n(x) - Q_n(x)$ for all $n \in \mathbb{N}$. We will prove it by induction on $n$. The base case ($n = 1$) is trivial. For the inductive step, suppose that the claim holds for some $n \in \mathbb{N}$, that is, $P_1(x) - Q_1(x)$ divides $P_n(x) - Q_n(x)$. Without loss of generality, we may assume that $P_1(x)$ is not constant (note that, since $P_1(x)$ and $Q_1(x)$ are different, they also cannot be
where the second equality holds because the composition of functions is associative; see Section 3.7 for more details. We will independently show that both terms are divisible by $P_1(x) - Q_1(x)$. The first observation implies that the first term, $P_n(P_1(x)) - P_n(Q_1(x))$, is divisible by $P_1(x) - Q_1(x)$ since $P_1(x)$ and $Q_1(x)$ are different and $P_1$ is not constant. Using the second observation, we may re-write the second term as follows: $P_n(Q_1(x)) - Q_n(Q_1(x)) = Q_1(P_n(x)) - Q_1(Q_n(x))$. If $P_n(x)$ and $Q_n(x)$ are identical, then this term vanishes. Otherwise, we apply the first observation one more time to get that this term is divisible by $P_n(x) - Q_n(x)$, and so also by $P_1(x) - Q_1(x)$, by the inductive hypothesis.
REMARKS
The problem we considered in this section belongs to a large and important family of
problems where one assumes that some property (or a set of properties) holds and the goal
is to show that some other property also holds. However, it is important to keep in mind that
formally the statement we aim to prove is an example of the conditional statement P → Q.
Moreover, the statement P → Q is true when P is false, regardless of whether Q is false or true. In such examples, we say that the conditional statement is vacuously true or true by default, which may lead to situations not necessarily intended by the author. As an example, consider the following two statements: 1) All the banks we robbed are in Canada, and 2) All the banks we robbed are outside of Canada. Since we actually did not rob any bank, regardless of whether in Canada or not, both statements are vacuously true.
Therefore, in practice it is important to make sure that there are objects that satisfy the assumed properties of the theorem. The problem in this section did not ask us to verify this so let us now make sure that we did not prove a statement that is vacuously true. Indeed, in our problem one can clearly take $Q_1(x) = x$ and any $P_1(x)$ different than $Q_1(x)$ to satisfy the assumptions of our problem. A less trivial example is the pair $Q_1(x) = x^3 - 3x$ and $P_1(x) = x^2 - 2$:
$$Q_1(P_1(x)) = (x^2 - 2)^3 - 3(x^2 - 2) = x^6 - 6x^4 + 9x^2 - 2 = (x^3 - 3x)^2 - 2 = P_1(Q_1(x)).$$
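The statement of the problem can be tested on this pair with exact polynomial arithmetic. The sketch below is an illustration, not part of the original proof: it takes P_1(x) = x^2 − 2 (inferred from the displayed computation) and Q_1(x) = x^3 − 3x, builds the iterates P_n and Q_n, and computes the remainder of P_n − Q_n modulo P_1 − Q_1 for the first few n. All helper names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_sub(p, q):
    return poly_add(p, [-c for c in q])

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return trim(out)

def poly_compose(p, q):
    """Return the coefficients of p(q(x)), using Horner's scheme."""
    result = [0]
    for c in reversed(p):
        result = poly_add(poly_mul(result, q), [c])
    return result

def poly_divmod(num, den):
    """Polynomial long division over the rationals: (quotient, remainder)."""
    num = [Fraction(c) for c in trim(num)]
    den = trim(den)
    quot = [Fraction(0)] * max(len(num) - len(den) + 1, 1)
    while len(num) >= len(den) and any(num):
        shift = len(num) - len(den)
        factor = num[-1] / den[-1]
        quot[shift] = factor
        for i, d in enumerate(den):
            num[shift + i] -= factor * d   # cancels the leading term of num
        num = trim(num)
    return trim(quot), num

P1 = [-2, 0, 1]        # P_1(x) = x^2 - 2 (inferred from the displayed identity)
Q1 = [0, -3, 0, 1]     # Q_1(x) = x^3 - 3x

commute = poly_compose(P1, Q1) == poly_compose(Q1, P1)

def remainders(max_n):
    """Remainders of P_n - Q_n modulo P_1 - Q_1 for n = 1, ..., max_n."""
    divisor = poly_sub(P1, Q1)
    Pn, Qn, out = P1, Q1, []
    for _ in range(max_n):
        out.append(poly_divmod(poly_sub(Pn, Qn), divisor)[1])
        Pn, Qn = poly_compose(P1, Pn), poly_compose(Q1, Qn)
    return out

print(commute)
print(remainders(3))
```

A zero remainder for each n is what the result proved in this section predicts for any commuting pair.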
On the other hand, it might be the case that there are actually no objects that satisfy the
assumed properties of the theorem. In such situations, conditional statements can often help
us to formally prove it. Indeed, one can show that the statement P → Q is true and then that
Q is false. The conclusion is that P has to be false since that is the only possibility for the
conditional statement P → Q to be true. Such reasoning is called proof by contradiction
and we often use it in this book.
In order to illustrate this technique, let us consider the following example related to the problem from this section. We will first prove the following conditional statement: if a polynomial $P : \mathbb{R} \to \mathbb{R}$ of odd degree satisfies the equation $P(x) = Q(x^2)$ for some polynomial $Q(x)$, then $P(x)$ is even. Indeed, it is clear that for each $x \in \mathbb{R}$ we have $P(x) = Q(x^2) = Q((-x)^2) = P(-x)$ so $P(x)$ is even. The conditional statement is true. However, any polynomial $P(x)$ of odd degree has the property that $\lim_{x \to \infty} P(x) = \infty$ and $\lim_{x \to -\infty} P(x) = -\infty$, or vice versa, $\lim_{x \to \infty} P(x) = -\infty$ and $\lim_{x \to -\infty} P(x) = \infty$. As a result, there exists $x \in \mathbb{R}$ such that $P(x) \ne P(-x)$, and so $P(x)$ is not even. In other words, we showed that no polynomial of odd degree is even. The conclusion is that no polynomial $P(x)$ of odd degree satisfies $P(x) = Q(x^2)$ for some polynomial $Q(x)$.
EXERCISES
(Source of the problem and solution: PLMO XXX – Phase 1 – Problem 5.)
3.6.3. Find all polynomials P (x) with real coefficients that satisfy the following property: for all x ∈ R, $P(x^2) \cdot P(x^3) = (P(x))^5$.
(Source of the problem and solution idea: PLMO LIX – Phase 1 – Problem 6.)
3.7 Polynomials and Number Theory
SOURCE
PROBLEM
Let $f(t) = t^3 + t$ for $t \in \mathbb{R}$. Consider the family of iterated functions defined as follows:
there exist rational numbers x and y and natural numbers m and n such that $xy = 3$ and $f^{(m)}(x) = f^{(n)}(y)$.
THEORY
Finally, let us mention that in order to solve the problem, we will use the concept of an
invariant, which is discussed in more detail in Section 4.2.
SOLUTION
Let us first observe that $f^{(1)}(t) = f(f^{(0)}(t)) = f(t)$ so, indeed, we deal with iterated functions. Recall also that the composition of functions is associative, that is, for each $i \in \mathbb{N} \setminus \{1\}$, we have $f^{(i)} = f^{(1)} \circ f^{(i-1)} = f^{(i-1)} \circ f^{(1)}$.
We will show that the value of $s$ does not change under transformation $f$, which is the key observation that will allow us to solve the problem. Indeed, consider any rational number $r = a/b$ expressed in lowest terms. Then,
$$s(f(r)) = s(r^3 + r) = s\Big( \frac{a^3}{b^3} + \frac{a}{b} \Big) = s\Big( \frac{a(a^2 + b^2)}{b^3} \Big).$$
Suppose first that $s(r) = 0$, that is, $3 \mid a$. Since $\gcd(a, b) = 1$, we get that $b$ is not divisible by 3. It follows that $b^3$ is not divisible by 3 whereas $3 \mid a(a^2 + b^2)$, and so we get that $s(f(r)) = 0$. On the other hand, if $s(r) = 1$, then 3 does not divide $a$. It follows that 3 also does not divide $a(a^2 + b^2)$, as $a^2 + b^2$ would be divisible by 3 if and only if both $a$ and $b$ were divisible by 3, which is not the case, and so $s(f(r)) = 1$.
Our final observation is that if $xy = 3$ for some rational numbers $x, y$, then $s(x) \ne s(y)$. Indeed, suppose that $x = a/b$ is a rational number expressed in lowest terms. If $s(x) = 0$ (that is, 3 divides $a$ but 3 does not divide $b$), then $a = 3k$ for some $k \in \mathbb{Z}$ and so $y = 3/x = 3b/a = b/k$. Since 3 does not divide $b$ we get that $s(y) = 1$. On the other hand, if $s(x) = 1$ (that is, 3 does not divide $a$), then in the fraction $y = 3b/a$ the numerator must be divisible by 3 and so $s(y) = 0$. However, this means that there is no solution to the equation defined in the problem, as no matter which $n$ and $m$ we select, function $s$ evaluated at $f^{(m)}(x)$ is different than the one at $f^{(n)}(y)$ as long as $xy = 3$. In particular, it implies that $f^{(m)}(x) \ne f^{(n)}(y)$.
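The invariant argument can be exercised on concrete rationals. In the sketch below, the definition of s is an assumption reconstructed from how it is used in the solution: s(r) = 0 if 3 divides the numerator of r written in lowest terms, and s(r) = 1 otherwise. The names `s`, `f`, and `iterate` are otherwise hypothetical helpers, and the sample ranges are arbitrary.

```python
from fractions import Fraction

def s(r):
    """Assumed invariant: 0 if 3 divides the numerator of r in lowest terms, else 1."""
    return 0 if Fraction(r).numerator % 3 == 0 else 1

def f(t):
    return t ** 3 + t

def iterate(t, n):
    """Apply f to t exactly n times."""
    for _ in range(n):
        t = f(t)
    return t

# Fraction normalizes to lowest terms automatically.
rationals = [Fraction(a, b) for a in range(-12, 13) for b in range(1, 8)]

invariant_holds = all(s(f(r)) == s(r) for r in rationals)
partners_differ = all(s(x) != s(3 / x) for x in rationals if x != 0)
no_collision = all(
    s(iterate(x, m)) != s(iterate(3 / x, n))
    for x in rationals if x != 0
    for m in range(1, 4) for n in range(1, 4)
)
print(invariant_holds, partners_differ, no_collision)
```

Such a finite check only illustrates the invariant; the divisibility argument in the text is what covers all rationals and all m, n.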
REMARKS
Let us point out that the assumption in our problem that x and y are rational numbers is crucial. Indeed, if x and y are allowed to be any real numbers, then for any n, m ∈ N, we can easily find x, y ∈ R such that $xy = 3$ and $f^{(m)}(x) = f^{(n)}(y)$.
In order to see this note that for any n ∈ N, the function $f^{(n)}(x)$ satisfies the following properties: i) $f^{(n)}(0) = 0$, ii) $f^{(n)}(x)$ is increasing on the interval $[0, \infty)$, iii)
Using the properties of $f^{(n)}(x)$ we get that $g(x)$ is continuous on $(0, \infty)$, $\lim_{x \to 0^+} g(x) = -\infty$, and $\lim_{x \to \infty} g(x) = \infty$. It follows that there exists $x_0 \in (0, \infty)$ such that $g(x_0) = 0$. Hence, there exists a pair $x = x_0 \in \mathbb{R}_+$ and $y = 3/x_0 \in \mathbb{R}_+$ such that $xy = 3$ and $f^{(m)}(x) = f^{(n)}(y)$.
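A real solution of this kind can be located numerically by bisection. The sketch below makes an assumption, since the definition of g is not visible here: it takes g(x) = f^{(m)}(x) − f^{(n)}(3/x), which is consistent with the stated limits at 0+ and at infinity. The function names, the starting bracket [0.01, 100], and the number of steps are hypothetical choices, and because floating-point values grow very quickly under iteration, the sketch is only meant for small m and n.

```python
def f(t):
    return t ** 3 + t

def iterate(t, n):
    """Apply f to t exactly n times."""
    for _ in range(n):
        t = f(t)
    return t

def find_pair(m, n, lo=0.01, hi=100.0, steps=200):
    """Bisection for a root of g(x) = f^(m)(x) - f^(n)(3/x) on (lo, hi).

    The form of g is an assumption; it is negative near 0 and positive
    for large x, so a sign change is bracketed for small m and n.
    """
    g = lambda x: iterate(x, m) - iterate(3.0 / x, n)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    x0 = (lo + hi) / 2.0
    return x0, 3.0 / x0

x, y = find_pair(2, 1)
print(x, y, x * y)
print(iterate(x, 2), iterate(y, 1))
```

For m = 2 and n = 1 the equation reduces, since f is increasing, to f(x) = 3/x, that is, x^4 + x^2 = 3, which gives an independent way to check the value found.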
EXERCISES
3.7.1. Prove that there are no polynomials $P_1(x)$, $P_2(x)$, $P_3(x)$, $P_4(x)$ with rational coefficients that satisfy
$$\sum_{i=1}^{4} (P_i(x))^2 = x^2 + 7 \quad \text{for all } x \in \mathbb{R}.$$
(Source of the problem and solution idea: PLMO LXII – Phase 3 – Problem 6.)
n | (p − q)(q − r)(r − p ) .
(Source of the problem and solution idea: PLMO LXIV – Phase 2 – Problem 1.)
3.7.3. Consider a polynomial P (x) with integer coefficients that satisfies the following
property: if a, b ∈ Q and a ≠ b, then P (a) ≠ P (b). Does it mean that P (a) ≠ P (b) for all
a, b ∈ R, a ≠ b?
(Source of the problem and solution idea: PLMO LXIV – Phase 2 – Problem 5.)
Chapter 4
Combinatorics
4.1 Enumeration
4.2 Tilings
4.3 Counting
4.4 Extremal Graph Theory
4.5 Probabilistic Methods
4.6 Probability
4.7 Combinations of Geometrical Objects
4.8 Pigeonhole Principle
4.9 Generating Functions
Graphs Some of our examples will be from graph theory and so here we introduce a few basic
definitions. A (simple) graph G = (V , E) is a pair consisting of a vertex set V = V (G) and an edge set
E = E(G) consisting of pairs of vertices; that is,
We write uv if u and v form an edge, and say that u and v are adjacent or joined. We refer to u and v as
endpoints of the edge uv. The order of a graph is n := |V (G)|, and its size is m := |E(G)|.
If u and v are the endpoints of an edge, then we say that they are neighbors. The neighborhood of a
vertex v, denoted N (v), is the set of all neighbors of v. The degree of a vertex v, written deg(v), is the
number of neighbors of v; that is, deg(v) := |N (v)|. The numbers
are the minimum degree and, respectively, the maximum degree of G. A graph is called k -regular,
provided each of its vertices has degree k.
A clique (sometimes called a complete graph) is a set of pairwise-adjacent vertices. The clique of order n is denoted by $K_n$. An independent set (sometimes called an empty graph) is a set of pairwise-nonadjacent vertices. The path on n vertices, denoted by $P_n$, consists of n vertices, $v_1, \ldots, v_n$, and $n - 1$ edges, $v_i v_{i+1}$ for $i \in [n - 1]$. The cycle on n vertices, denoted by $C_n$, consists of n vertices, $v_1, \ldots, v_n$,
$$G[V'] = (V', \{uv \in E : u, v \in V'\})$$
Bipartite Graphs A graph G is bipartite if the vertex set can be partitioned into two sets, X and Y (that is, V (G) = X ∪ Y , where X ∩ Y = ∅), and every edge is of the form xy, where x ∈ X and y ∈ Y . Here X and Y are called partite sets. This definition can be easily generalized to r-partite graphs. This time, $V(G) = X_1 \cup X_2 \cup \ldots \cup X_r$ for some $r \ge 2$ and there is no edge of the graph with both endpoints in $X_i$, for any $i \in [r]$. The complete r-partite graph $K_{n_1, n_2, \ldots, n_r}$ is the graph with partite sets $X_1, \ldots, X_r$ with $n_i = |X_i|$ ($i \in [r]$) and edges between every pair of vertices from different partite sets.
Matchings A matching in a graph G is a collection of disjoint edges. The vertices of G incident to the edges of a matching M are called saturated or matched by M; the other vertices are unsaturated or unmatched. A matching is maximal if it cannot be extended by adding an edge. A matching is maximum if it contains the largest possible number of edges. In particular, a perfect matching in a graph G is a (maximum) matching in G that saturates all vertices of G.
4.1 Enumeration
SOURCE
PROBLEM
There are 100 students at the party. Each student knows at least 67 other students. Prove that there are
at least four students that all know each other. We assume that this relationship is symmetric; that is, if
student A knows student B, then student B knows student A.
THEORY
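Indeed, if $\sqrt{2}^{\sqrt{2}}$ is rational, then we are immediately done: $x = y = \sqrt{2}$ is the pair that has the desired property. Otherwise, take $x = \sqrt{2}^{\sqrt{2}}$ and $y = \sqrt{2}$, which are both irrational, and note that
$$x^y = \big( \sqrt{2}^{\sqrt{2}} \big)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^{2} = 2.$$
Let us stress the fact that based on the above argument, we do not know which of the two pairs of x and y satisfies the desired property but we do know that precisely one of them does. In fact, it turns out that $\sqrt{2}^{\sqrt{2}}$ is irrational but this fact is not needed to claim the correctness of the statement.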
Greedy Algorithm Let us come back to constructive arguments. The easiest approach one can try is to construct the desired object by making locally optimal choices at each stage with the intent of finding a global optimum. Such an approach is often called a greedy strategy (or greedy algorithm). Let us stress the fact that a greedy strategy does not usually produce an optimal solution but it may yield one or at least a good approximation of it.
SOLUTION
We will perform a greedy search for four students that mutually know each other. Start with any student
A from the set of all students. Now, select any student B, different than A, that knows A. Clearly, it is
possible since there are at least 67 students that know A.
We will now show that there exists a student C that knows both A and B. At most
100 − 67 − 1 = 32 students do not know A and, similarly, at most 32 students do not know B. Since
there are 98 students different than A and B and 32 + 32 < 98, there must exist a student that knows
both A and B, as claimed. We select any such student and call it C.
We continue this greedy selection process to find student D that knows students A, B, and C. Arguing as before, there are 97 students to choose from but only at most 3 ⋅ 32 = 96 of them do not know at least one of A, B, or C. Hence, at least one such student exists and the process is finished.
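The greedy selection described above can be sketched in a few lines. The code below is an illustration under stated assumptions: acquaintances are modelled as adjacency sets, the test graphs are complete multipartite graphs with balanced parts (a hypothetical choice that makes degrees easy to control), and the names `balanced_multipartite` and `greedy_clique` are not from the original text.

```python
def balanced_multipartite(n, r):
    """Adjacency sets of the complete r-partite graph on n vertices, balanced parts."""
    part = [v % r for v in range(n)]
    return {v: {u for u in range(n) if part[u] != part[v]} for v in range(n)}

def greedy_clique(adj, size):
    """Greedily grow a set of mutually adjacent vertices.

    Returns the clique as a list, or None if no candidate is left
    before the requested size is reached.
    """
    clique = []
    candidates = set(adj)             # vertices adjacent to everything chosen so far
    while len(clique) < size:
        if not candidates:
            return None
        v = min(candidates)           # any candidate may be picked
        clique.append(v)
        candidates &= adj[v]          # keep only the common neighbours
    return clique

dense = balanced_multipartite(100, 4)     # every vertex knows 75 others
sparse = balanced_multipartite(100, 3)    # minimum degree is only 66

print(min(len(nb) for nb in dense.values()), greedy_clique(dense, 4))
print(min(len(nb) for nb in sparse.values()), greedy_clique(sparse, 4))
```

With 100 vertices and minimum degree at least 67, the counting argument in the solution guarantees that the candidate set is never empty before four vertices are chosen, so the choice of candidate does not matter. In the second graph the assumption fails and the search stops after three vertices.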
REMARKS
The property stated in the problem is best possible in the following sense. Suppose that there are still
100 students but this time each of them knows at least 66 other students, instead of 67. With this
slightly weaker assumption, it is possible that there are no four students who know one another. Indeed,
let us partition students into three sets of sizes 33, 33, and 34, respectively, and assume that a student
from one set knows only students from the other two sets.
This property can be reformulated in the language of graph theory as follows: there exists a graph on n = 100 vertices, minimum degree 66, and without $K_4$, the complete graph on 4 vertices, as a subgraph. Moreover, this example is, in fact, the well-known Turán graph T (100, 3), related to an important problem in extremal combinatorics. We will come back to such problems in Section 4.4. In general, the Turán graph T (n, r) is a graph formed by partitioning a set of n vertices into r subsets, with sizes as equal as possible, and connecting two vertices by an edge if and only if they belong to different subsets. The number of edges in this graph is at most $(1 - 1/r) n^2 / 2$ with equality holding if and only if
graph on n vertices that satisfies this property, while having this maximum number of edges. In order to prove this uniqueness property, suppose that G = (V , E) is such an extremal graph. We will start with proving the following property.
$S \subseteq V(G') = (V \setminus \{c\}) \cup \{a'\}$, $|S| = r + 1$, induces the complete graph in $G'$ if $a' \notin S$ (otherwise, $S$ would induce the complete graph in $G$). The same holds if both $a$ and $a'$ are in $S$ (since $a$ and $a'$ are not adjacent in $G'$). Finally, we argue that this is true if $a' \in S$ but $a \notin S$ (otherwise, $(S \setminus \{a'\}) \cup \{a\}$ would induce the complete graph in $G$). This contradicts the fact that $G$ has the maximum number of edges.
Case 2: $\deg(c) \ge \deg(a)$. This time we construct $G'$ from $G$ by removing vertices $a$ and $b$ and adding vertices $c'$ and $c''$, two copies of vertex $c$; that is, $c'$ and $c''$ are adjacent to $v \in V \setminus \{a, b\}$ if and only if $c$ is, and are not adjacent to each other. In particular, $\{c, c', c''\}$ induce no edge. Note that $G'$ has more edges than $G$. Indeed, note that we removed $\deg(a) + \deg(b) - 1 \le 2 \deg(a) - 1$ edges (since $a$ and $b$ were adjacent in $G$), less than the number of edges added, namely, $2 \deg(c) \ge 2 \deg(a)$. Moreover, arguing as in the previous case, $G'$ does not contain $K_{r+1}$ and so we get the desired contradiction.
It follows from the observation that for any three vertices $a, b, c \in V$, if $ab \notin E$ and $bc \notin E$, then $ac \notin E$. Hence, one can partition the vertex set $V$ into $k$ disjoint subsets $V_1, V_2, \ldots, V_k$ such that vertices from $V_i$ are adjacent to all vertices in $V \setminus V_i$ but to no vertex in $V_i$. Since no $r + 1$ vertices induce $K_{r+1}$, we know that $k \le r$; otherwise, one could pick one vertex from each set $V_1, V_2, \ldots, V_{r+1}$ to form $K_{r+1}$.
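Writing $n_i = |V_i|$, the number of edges of $G$ is
$$\frac{1}{2} \sum_{i=1}^{k} n_i (n - n_i) = \frac{1}{2} \Big( n \sum_{i=1}^{k} n_i - \sum_{i=1}^{k} n_i^2 \Big) = \frac{1}{2} \Big( n^2 - \sum_{i=1}^{k} n_i^2 \Big).$$
By Jensen’s inequality (see Section 1.1), the sum $\sum_{i=1}^{k} n_i^2$ is minimized for $n_i = n/k$, so the number of edges is at most
$$\frac{1}{2} \Big( n^2 - k \Big( \frac{n}{k} \Big)^2 \Big) = \Big( 1 - \frac{1}{k} \Big) \frac{n^2}{2},$$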
which is maximized for k = r. Note that it might happen that n/k is not an integer and so the above
construction cannot be achieved. Hence, in fact, the unique graph that maximizes the number of edges
has the actual sizes of the V_i’s selected in such a way that they differ by at most 1. This finishes the proof.
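The edge count of the balanced complete r-partite graph can be checked against an exhaustive search on very small vertex sets. The sketch below is illustrative only; the function names `turan_edges` and `max_edges_brute_force` are our own and do not come from the text.

```python
from itertools import combinations


def turan_edges(n: int, r: int) -> int:
    """Edges of the complete r-partite graph on n vertices whose part sizes differ by at most 1."""
    q, rem = divmod(n, r)
    sizes = [q + 1] * rem + [q] * (r - rem)
    # every pair of vertices from different parts is an edge: (n^2 - sum n_i^2) / 2
    return (n * n - sum(s * s for s in sizes)) // 2


def max_edges_brute_force(n: int, r: int) -> int:
    """Largest number of edges in a K_{r+1}-free graph on n vertices (exhaustive, tiny n only)."""
    pairs = list(combinations(range(n), 2))
    best = 0
    for mask in range(1 << len(pairs)):
        edges = {pairs[i] for i in range(len(pairs)) if mask >> i & 1}
        if len(edges) <= best:
            continue  # cannot improve the current record
        has_clique = any(
            all(p in edges for p in combinations(c, 2))
            for c in combinations(range(n), r + 1)
        )
        if not has_clique:
            best = len(edges)
    return best
```

For example, `turan_edges(5, 2)` and `max_edges_brute_force(5, 2)` both evaluate the triangle-free case on five vertices, and the value never exceeds (1 − 1/r) n²/2.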
EXERCISES
4.1.1. There are 2n members of a chess club; each member knows at least n other members (knowing a
person is a reciprocal relationship). Prove that it is possible to assign members of the club into n pairs
in such a way that in each pair both members know each other.
(Source of the problem and solution idea: PLMO XLV – Phase 1 – Problem 9, modified.)
4.1.2. There are 17 players in the tournament in which each pair of two players compete against each
other. Every game can last 1, 2, or 3 rounds. Prove that there exist three players who have played
exactly the same number of rounds with one another.
(Source of the problem and solution: well-known, classic problem related to Ramsey numbers.)
4.1.3. Consider a group of people with the following property. Some of them know each other, in which
case the corresponding pair of people mutually like each other or dislike each other. Moreover, there is
a person who knows at least six other people. Interestingly, for each person the number of people he or
she likes is equal to the number of people he or she dislikes. Prove that it is possible to remove some,
but not all, like/dislike links such that it is still the case that each person has the same number of liked
and disliked acquaintances.
(Source of the problem: PLMO LXIX – Phase 1 – Problem 7, modified. Solution: our own.)
4.2 Tilings
SOURCE
PROBLEM
THEORY
Invariant An invariant is a property held by a class of mathematical objects, which remains unchanged
when transformations of a certain type are applied to the objects. Invariants are used in diverse areas of
mathematics such as geometry, topology, algebra, and discrete mathematics.
In order to formally define an invariant we have to define an object, its property, and a transformation
under which this property is invariant. Here are some classical examples, where in each of them we
highlight the object, the property, and the transformation:
1. the distance (property) between two points on a number line (object) is not changed by adding
the same quantity to both numbers (transformation);
2. the area (property) of a figure (object) is invariant with respect to translation (transformation);
4. the measure of angle (property) based on a given circle arc (object) is invariant with respect to
the choice of location of the vertex on this arc (transformation).
In solving tiling problems we often rely on finding an insightful invariant of some mathematical
property. In the problem we deal with in this section, the invariant will ensure that, after we
appropriately assign numbers to all cells, no matter how a 5 × 1 block is placed it covers cells with the
same sum of numbers. On the other hand, the corresponding sum for the 2 × 2 block will not have this
property. This difference will turn out to be the key observation in the proof.
SOLUTION
Part a) is rather straightforward. Observe that one can easily cover the 120 × 128 rectangular grid with
3,072 tiles of size 5 × 1 (since 120 is divisible by 5). So we will be done if we can cover the remaining
8 × 128 rectangular grid. In fact, since covering the 8 × 120 grid with 192 tiles of size 5 × 1 is equally
easy, we may reduce the problem of covering the 128 × 128 grid to the one of covering the 8 × 8 grid
(this time with 12 blocks of size 5 × 1 and one block of size 2 × 2). The tiling presented in Figure 4.1
uses the allowed blocks, which finishes part a).
Part b) is more interesting. Label the grid so that the bottom left cell has label (1, 1) and the top right
one has label (128, 128). Starting from the bottom left corner, assign numbers from the set {0, 1, 2} to
all cells of the 128 × 128 square grid using the pattern presented in Figure 4.2. (For example, cells with
labels (x, y) for x and y that both give the remainder of 2 when divided by 5 will get number 2
assigned.) Observe that rows 126, 127, 128 and columns 126, 127, 128 use only part of the pattern,
namely, the part restricted to the first three rows and, respectively, the first three columns.
Let us first calculate the sum of the numbers assigned to the whole grid. It contains 25 ⋅ 25 complete
copies of our 5 × 5 pattern, 25 copies of a part of our pattern consisting of its first three bottom rows,
25 copies of a part of the pattern consisting of its three leftmost columns, and one piece containing the
first three bottom rows and three leftmost columns. Counting (independently) the sums of numbers in
the four respective strips, we get that the total sum is equal to 25² ⋅ 10 + 25 ⋅ 6 + 25 ⋅ 6 + 6 = 6,556.
Let us now notice that no matter how we place a 5 × 1 block, it will always cover numbers that sum up
to 2. Hence, regardless of how we place 3,276 such blocks, they cover numbers that sum up to
2 ⋅ 3,276 = 6,552. This is the desired invariant that leads us to the solution of this problem. It follows
that the remaining 2 × 2 block must cover numbers that sum up to 4. However, given the pattern in
Figure 4.2 that we used, this is only possible if it lies in the top right corner of the pattern. Given the way
we used the pattern to cover the 128 × 128 square grid, none of these positions lies on the border of
the square grid. This proves part b) of the problem.
In fact, we not only solved part b) of the problem but proved something stronger. Namely, there are
only 25² possible places where the 2 × 2 block can be placed. Moreover, by adjusting the process
described in part a) of this problem, we can easily see that each of those 25² locations is possible.
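The counting in part b) can be replayed numerically. Figure 4.2 is not reproduced here, so the labelling below is an assumption: a cell (x, y) gets one point if x ≡ y (mod 5) and one more if x + y ≡ 4 (mod 5). This choice agrees with the facts stated in the text (cells with both coordinates ≡ 2 get the number 2, and the grid total is 6,556), but the function `label` itself is our own reconstruction.

```python
N = 128


def label(x: int, y: int) -> int:
    """Assumed labelling of cell (x, y), 1-indexed: one point per diagonal family."""
    return int((x - y) % 5 == 0) + int((x + y) % 5 == 4)


# sum of all labels on the 128 x 128 grid
total = sum(label(x, y) for x in range(1, N + 1) for y in range(1, N + 1))

# every horizontal or vertical placement of a 5 x 1 block covers labels summing to 2
horizontal_ok = all(
    sum(label(x + d, y) for d in range(5)) == 2
    for x in range(1, N - 3) for y in range(1, N + 1)
)
vertical_ok = all(
    sum(label(x, y + d) for d in range(5)) == 2
    for x in range(1, N + 1) for y in range(1, N - 3)
)

# bottom-left corners of the 2 x 2 blocks that cover labels summing to 4
corners = [
    (x, y)
    for x in range(1, N) for y in range(1, N)
    if label(x, y) + label(x + 1, y) + label(x, y + 1) + label(x + 1, y + 1) == 4
]
touches_border = any(
    x == 1 or y == 1 or x + 1 == N or y + 1 == N for x, y in corners
)
```

Under this assumed labelling, `corners` lists every admissible position of the 2 × 2 block, and `touches_border` records whether any of them reaches the boundary of the grid.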
FIGURE 4.2: Illustration for Problem 4.2, part b).
REMARKS
The solution presented above is nice and easy to follow. However, it is not clear how to attack similar
problems in the future. Hence, a natural question is how to guess the pattern presented in Figure 4.2.
There are several possible methods of deriving such patterns but all of them aim to propose a setup that
is repeating in terms of one of the blocks; in our problem, it is the 5 × 1 block. A natural starting point is
the straightforward pattern presented in Figure 4.3 and repeating it as described in the solution. Clearly,
in this pattern a 5 × 1 block covers exactly one 1 and four 0’s.
Repeating the reasoning presented in the previous solution, we get that the number of 1’s in the
whole square grid is 25² ⋅ 5 + 25 ⋅ 3 + 25 ⋅ 3 + 3 = 3,278. On the other hand, the 5 × 1 blocks cover
3,276 squares with 1’s, which means that the 2 × 2 block must cover exactly two 1’s. It follows that if
this block lies on the border, then there exists some i ∈ {0, …, 25} such that it lies:
We observe now that one can repeat the argument when 1’s form the diagonal from the top left cell to
the bottom right one (instead of from the bottom left to the top right). In particular, the conclusion is
that if the 2 × 2 block lies on the two bottom rows, it must lie on column 2 + 5i and 3 + 5i for some
i ∈ {0, …, 25}. Hence, there is no solution with the 2 × 2 block touching the bottom border of the
square grid. The solutions for the other three borders can be eliminated the same way.
The solution presented earlier merges the two arguments by simply introducing the pattern obtained
from the two diagonals.
FIGURE 4.3: Illustration for Problem 4.2, part b)—starting pattern.
EXERCISES
4.2.1. Consider a square grid of size 25 × 25 that has a smaller square grid of size 5 × 5 cut out from
its bottom left corner. Can you cover the remaining cells with 100 blocks of size 1 × 6 or 2 × 3?
(Source of the problem: Polish Junior Mathematical Olympics X – Phase 1 – Problem 6. Solution: our
own.)
4.2.2. Prove that it is impossible to cover a square grid of size 9 × 9 with tiles of size 1 × 5 or 1 × 6.
(Source of the problem: Letters of Polish Junior Mathematical Olympics, September 2014. Solution:
our own.)
4.2.3. Can you cover a square grid of size 10 × 10 with 25 “T-shaped” blocks consisting of 4 small
squares?
(Source of the problem and solution idea: Letters of Polish Junior Mathematical Olympics, September
2014.)
4.3 Counting
SOURCE
PROBLEM
There are various clubs in a class consisting of 23 students. Each club has exactly 5 members.
Moreover, any two different clubs have at most 3 members in common. Prove that there are fewer than
2,018 clubs in the class.
THEORY
Permutations Let S be a set of n elements. We are interested in investigating various ways in which
objects from S may be selected, without replacement, to form a sequence of n elements. Each of these
possible sequences is called a permutation. Formally, a permutation π is a bijection π : [n] → S; π(i) is
the element that was selected at round i. (Recall that a bijection is a function between the elements of
two sets, say A and B, where each element of A is paired with exactly one element of B, and each
element of B is paired with exactly one element of A.)
The number of permutations of an n-element set (that is, the number of ways one can order n
elements) is equal to
\[ n! = \prod_{i=1}^{n} i = 1 \cdot 2 \cdot \ldots \cdot n . \tag{4.1} \]
One can easily prove this formula by induction. Alternatively, note that there are n ways to select the
first object from S. Since the selection is done without replacement, there are n − 1 objects left to
select from; we select any of them and continue until all elements are picked. The total number of ways
is then n ⋅ (n − 1) ⋅ … ⋅ 1 = n! and the formula (4.1) is verified.
Combinations This time k objects are selected from a set S of n elements to produce subsets without
ordering (that is, unlike permutations, the order of selection does not matter). More formally, a
k-combination of S is a subset of k distinct elements of S.
The number of k-combinations of an n-element set (provided that 1 ≤ k ≤ n) is equal to the
binomial coefficient
\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} . \tag{4.2} \]
One can prove this formula in many ways; we provide a direct counting argument that is similar to the
solution to our problem above. Select k elements, one by one, without replacement; there are
n(n − 1)⋯(n − k + 1) ways to do it. Clearly, each subset of k elements of S can be obtained in k!
different ways (we know which elements are selected and we are happy with any permutation of them)
so we are over-counting. The formula (4.2) holds.
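The counting argument behind (4.1) and (4.2) can be replayed by explicit enumeration. The following is a small sketch using Python's standard `itertools` and `math` modules; the five-element set `S` and the choice k = 3 are arbitrary illustrative values.

```python
from itertools import combinations, permutations
from math import comb, factorial

S = ["a", "b", "c", "d", "e"]
n, k = len(S), 3

all_orderings = list(permutations(S))           # every ordering of S
ordered_selections = list(permutations(S, k))   # k elements picked one by one
subsets = list(combinations(S, k))              # k-combinations, order ignored

# n(n-1)...(n-k+1) ordered selections, each k-subset counted k! times
falling_product = factorial(n) // factorial(n - k)
subsets_from_formula = falling_product // factorial(k)
```

Here `len(ordered_selections)` equals the falling product n(n − 1)⋯(n − k + 1), and dividing by k! removes the over-counting, which is exactly the step that yields (4.2).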
Double Counting Let us finish with a very useful double counting combinatorial proof technique for
showing that two expressions are equal by demonstrating that they are simply two ways of counting the
same thing. For example, note that
\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n . \]
Indeed, on the left hand side we independently count k-element subsets of an n-element set while the
right hand side counts all subsets. Alternatively, one can use the binomial theorem (see Section 1.6) to
get that
\[ 2^n = (1 + 1)^n = \sum_{k=0}^{n} \binom{n}{k} \cdot 1^k \cdot 1^{n-k} = \sum_{k=0}^{n} \binom{n}{k} . \]
As another example, note that
\[ \binom{n}{k} = \binom{n-2}{k} + 2 \binom{n-2}{k-1} + \binom{n-2}{k-2} . \]
In order to see it, let us color n − 2 elements of an n-element set S red and the remaining 2 elements
blue (arbitrarily). We observe that the left hand side counts k-element subsets of S. On the other hand,
the right hand side independently counts k-element subsets with a given number of blue elements (that
is, 0, 1, and 2, respectively).
Finally, let us show that
\[ \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n} . \]
Since \binom{n}{k} = \binom{n}{n-k}, it is enough to show that
\[ \sum_{k=0}^{n} \binom{n}{k} \cdot \binom{n}{n-k} = \binom{2n}{n} . \]
But this equality is obvious. The right hand side counts the number of n-element subsets of the set [2n].
The left hand side counts the same thing, where, for 0 ≤ k ≤ n, the term \binom{n}{k} ⋅ \binom{n}{n-k} counts the
number of subsets in which k elements are chosen from the set [n] and n − k elements are chosen from
the set [2n] ∖ [n].
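The three identities of this subsection are easy to confirm numerically for small n. The helper names `binom` and `check_identities` below are our own; `binom` extends `math.comb` with the usual convention that the coefficient is 0 when k falls outside the range 0, …, n.

```python
from math import comb


def binom(n: int, k: int) -> int:
    """Binomial coefficient, taken to be 0 when k is outside 0..n."""
    return comb(n, k) if 0 <= k <= n else 0


def check_identities(n: int) -> bool:
    """Check the three double-counting identities for a given n >= 2."""
    row_sum = sum(binom(n, k) for k in range(n + 1)) == 2 ** n
    two_blue = all(
        binom(n, k) == binom(n - 2, k) + 2 * binom(n - 2, k - 1) + binom(n - 2, k - 2)
        for k in range(n + 1)
    )
    squares = sum(binom(n, k) ** 2 for k in range(n + 1)) == binom(2 * n, n)
    return row_sum and two_blue and squares
```

A numerical check of this kind is of course not a proof; it only guards against misreading the formulas.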
SOLUTION
Let C be the set of students such that |C| = 23. Clubs can be represented by a family of subsets A_i of C
of size 5 (i ∈ [k], where k is the number of clubs). Since no two clubs have more than 3 members in
common, each subset B of C of size 4 is contained in at most one A_i. We can then label each B ⊆ C of
size 4 with i if it is contained in a unique A_i, and assign it label 0 otherwise; that is, when students from B
are not members of the same club. On the other hand, each A_i clearly has 5 distinct four-element
subsets, so 5 sets B of size 4 have label i assigned to them. Since the number of sets B of size 4 with
non-zero label is at most \binom{23}{4}, the total number of subsets of size 4, we get that k, the number of clubs,
is at most
\[ \binom{23}{4} / 5 = 1,771 < 2,018 . \]
REMARKS
The problem considered in this section is an example of a typical situation when the solution can be
obtained by a careful and appropriate counting technique. In our problem we first see that there are
\binom{23}{5} = 33,649 sets of size 5. However, clearly not all of them can form a club as it would violate the
property that clubs cannot share many members. In order to reduce the number of possible clubs, we
observe that it is enough to know only 4 members of a club to uniquely identify it. This observation
leads to the solution.
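The arithmetic used in the solution and in the remarks can be reproduced in a few lines. This is only a numerical restatement of the bound; the variable names are our own.

```python
from math import comb

four_subsets = comb(23, 4)       # all 4-element sets of students
five_subsets = comb(23, 5)       # all candidate 5-element clubs
# each club uses up 5 four-element subsets, and no subset is shared
upper_bound = four_subsets // 5
```

Since each club "consumes" five distinct 4-element subsets, `upper_bound` is the largest number of clubs the counting argument allows.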
EXERCISES
4.3.1. The class consists of 12 people. Count in how many ways one can divide them into: 6 pairs, 4
triples, 3 quadruples, and 2 six-tuples. Which option yields the largest number of possibilities?
4.3.2. Consider an n × n square grid on which we want to place k ≤ n chess rooks in such a way that
none of them attack another rook. Count the number of ways one can do it.
4.3.3. Create all possible 4-digit numbers using digits from set [9] = {1, 2, 3, 4, 5, 6, 7, 8, 9} . Find the
sum of those numbers.
4.3.4. Alice has 20 balls, all different. She first splits them into two piles and then she picks one of the
piles with at least two balls, and splits it into two. She repeats this until each pile has only one ball.
Find the number of ways in which she can carry out this procedure.
(Source of the problem and solution: Problem 1.8.27 from Discrete Mathematics by Lovász, Pelikán,
and Vesztergombi.)
PROBLEM
20 boys and 20 girls attended a high-school prom. During this event, there were 98 dances. In each
dance, one boy danced with one girl, and no pair danced more than once. Prove that there were two
boys (say, b_1, b_2) and two girls (say, g_1, g_2) such that they all danced with one another (that is, b_1 and b_2
THEORY
In this section we are interested in basic extremal graph theory that studies extremal (maximal or
minimal) graphs which satisfy a certain property. Extremality can be taken with respect to different
graph invariants, such as the number of vertices, the number of edges, or the length of a longest path.
Extremal graph theory officially began with Turán’s theorem that we already stated (and proved) in
Section 4.1.
The problem we deal with in this section is closely related to ex(n; C_4) defined next. The connection
Turán Number Given a class of graphs ℱ = {F_1, F_2, …}, let us call a graph ℱ-free if it contains no
copy of F as a subgraph for each F ∈ ℱ. Let the Turán number, denoted ex(n; ℱ), be the maximal
number of edges in an ℱ-free graph on n vertices. If the class of graphs ℱ consists of a single graph F,
then we write ex(n; F) instead of ex(n; {F}).
In Section 4.1 we considered T(n, r), the Turán graph; that is, the complete equi-partite graph
K_{n_1, n_2, …, n_r}, where ∑ n_i = n and ⌊n/r⌋ ≤ n_i ≤ ⌈n/r⌉. By Turán’s theorem we have
extremal number. In fact, the case ex(n; K_3) = ⌊n²/4⌋ was shown earlier by Mantel.
In order to show that the bound for the number of dances is, in some sense, best possible we need to
introduce a family of graphs obtained from the projective planes. We define them now and we will
explain the connection soon.
Projective Planes Given a set P of points and a set L of lines, we define the corresponding incidence
graph G(P, L) to be the bipartite graph whose vertices consist of the points (one partite set), and lines
(the second partite set), with point p ∈ P adjacent to line ℓ ∈ L if p lies on ℓ.
A projective plane consists of a set of points and lines satisfying the following axioms.
1. There is exactly one line incident with every pair of distinct points.
2. There is exactly one point incident with every pair of distinct lines.
3. There are four points such that no line is incident with more than two of them.
Finite projective planes possess q² + q + 1 points for some q ∈ N (called the order of the plane) and
the same number of lines. Projective planes of order q exist for all prime powers q, and an unsettled
conjecture claims that q must be a prime power for such planes to exist.
It follows immediately from the axioms that the corresponding incidence graph does not contain C_4,
a cycle of length 4 (nor, of course, any odd cycle, as the graph is bipartite). It is also possible to show that
this graph is (q + 1)-regular.
See Figure 4.4 for G(P , L), where (P , L) is the Fano plane (that is, the projective plane of order 2).
We note the incidence graph of the Fano plane is isomorphic to the well-known Heawood graph.
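For prime q, a projective plane of order q can be built from homogeneous coordinates modulo q, and the properties quoted above can then be checked directly. This is a standard construction, but it is not taken from the text; the function names `projective_plane` and `contains_c4` are our own, and the sketch deliberately handles primes only (prime powers would need finite-field arithmetic).

```python
from itertools import combinations, product


def projective_plane(q: int):
    """Points and lines of the projective plane of prime order q (homogeneous coordinates mod q)."""
    def normalized(v):
        # rescale so that the first non-zero coordinate equals 1
        lead = next(c for c in v if c)
        inv = pow(lead, -1, q)
        return tuple(c * inv % q for c in v)

    points = sorted({normalized(v) for v in product(range(q), repeat=3) if any(v)})
    # the line with coefficient vector l contains the points p with <p, l> = 0 (mod q)
    lines = [
        frozenset(p for p in points if sum(a * b for a, b in zip(p, l)) % q == 0)
        for l in points
    ]
    return points, lines


def contains_c4(lines) -> bool:
    """The incidence graph has a 4-cycle exactly when two lines share at least two points."""
    return any(len(a & b) >= 2 for a, b in combinations(lines, 2))
```

For q = 2 this produces the Fano plane: seven points, seven lines, three points on each line.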
SOLUTION
For a contradiction, suppose that there were no two boys and two girls that danced with each other. For
any i, j ∈ [20], let x_{i,j} = 1 if the i’th boy danced with the j’th girl; otherwise, x_{i,j} = 0. Since there were 98
dances,
\[ \sum_{i=1}^{20} \sum_{j=1}^{20} x_{i,j} = 98 . \]
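Note that g_i = ∑_{j=1}^{20} x_{i,j} is the number of girls that danced with the i’th boy; similarly, b_j = ∑_{i=1}^{20} x_{i,j}
is the number of boys that danced with the j’th girl. For the j’th girl, let f(j) be the number of other girls
who danced with at least one of the boys she danced with. Then
\[ f(j) \le \sum_{i=1}^{20} x_{i,j} (g_i - 1) = \sum_{i=1}^{20} x_{i,j} \, g_i - b_j . \tag{4.3} \]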
Indeed, one can simply consider all boys she dances with (that is, those for which x_{i,j} = 1); each of
them danced with g_i − 1 girls other than the j’th girl. In fact, the right hand side of (4.3) is not only an
upper bound for f(j) but equality holds. This is because no two girls danced with the same two
boys, so all of these girls must be unique. Finally, since there are 19 girls other than the j’th girl, we get
that f(j) ≤ 19, or equivalently that
\[ \sum_{i=1}^{20} x_{i,j} \, g_i \le 19 + b_j . \]
It follows that
\[ \sum_{j=1}^{20} \sum_{i=1}^{20} x_{i,j} \, g_i \le \sum_{j=1}^{20} (19 + b_j) = 20 \cdot 19 + \sum_{j=1}^{20} b_j = 380 + 98 = 478 . \tag{4.4} \]
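On the other hand,
\[ \sum_{j=1}^{20} \sum_{i=1}^{20} x_{i,j} \, g_i = \sum_{i=1}^{20} g_i^2 = 20 \cdot \frac{1}{20} \sum_{i=1}^{20} g_i^2 \ge 20 \Big( \frac{1}{20} \sum_{i=1}^{20} g_i \Big)^2 = 20 \Big( \frac{98}{20} \Big)^2 = \frac{2,401}{5} = 480.2 > 478 , \tag{4.5} \]
where the first inequality follows from the fact that the function f(x) = x² is convex (see Section 1.1
for more details). Inequalities (4.4) and (4.5) give us the desired contradiction.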
REMARKS
In order to see the bigger picture, we will provide an alternative solution. Let us first reformulate the
problem in the language of graph theory. Let B and G be the set of boys and the set of girls,
respectively. Dances can be represented as a bipartite graph G = (B ∪ G, E) where bg ∈ E if and only if
boy b danced with girl g. We know that n = |B| = |G| = 20 and m = |E| = 98. Our goal is to show
that G contains C_4, a cycle of length 4.
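For a contradiction, suppose that G does not contain C_4. Let F be the family of triples (b, g_1, g_2) such
that b ∈ B and g_1, g_2 are two distinct girls who both danced with b. Clearly,
\[ |F| = \sum_{b \in B} (\deg b)(\deg b - 1) = \sum_{b \in B} \deg^2 b - \sum_{b \in B} \deg b = \frac{1}{n} \Big( \sum_{b \in B} \deg^2 b \Big) \Big( \sum_{b \in B} 1^2 \Big) - \sum_{b \in B} \deg b \ge \frac{1}{n} \Big( \sum_{b \in B} \deg b \Big)^2 - \sum_{b \in B} \deg b = \frac{m^2}{n} - m , \]
where the inequality is the Cauchy–Schwarz inequality and the last equality holds as m = ∑_{b∈B} deg b.
On the other hand, since there is no cycle of length 4 in G, each pair of girls
(g_1, g_2) is associated with at most one boy b in the family F. It follows that
\[ |F| \le n(n - 1) . \]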
We get that
\[ \frac{m^2}{n} - m - n(n - 1) \le 0 \tag{4.6} \]
and so
\[ m \le \frac{1 + \sqrt{1 + 4(n - 1)}}{2/n} = \frac{n}{2} \big( 1 + \sqrt{4n - 3} \big) . \]
Since n = 20, the following bound must hold: m ≤ 97.75. We get the desired contradiction as m = 98.
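The bound m ≤ (n/2)(1 + √(4n − 3)) is easy to evaluate. The sketch below (the function name is our own) computes it for n = 20 and also for n = q² + q + 1, where 4n − 3 = (2q + 1)² and the bound simplifies to (q² + q + 1)(q + 1), the number of edges of the incidence graph of a projective plane of order q.

```python
from math import floor, sqrt


def c4_free_bipartite_bound(n: int) -> float:
    """Upper bound on the edges of a C4-free bipartite graph with n vertices on each side."""
    return n / 2 * (1 + sqrt(4 * n - 3))


prom_bound = c4_free_bipartite_bound(20)   # the 20 boys / 20 girls case
max_dances_without_c4 = floor(prom_bound)


def plane_edges(q: int) -> int:
    """Edges of the incidence graph of a projective plane of order q."""
    return (q * q + q + 1) * (q + 1)
```

Comparing `c4_free_bipartite_bound(q * q + q + 1)` with `plane_edges(q)` illustrates in what sense the bound cannot be improved in general.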
In our problem, we assumed that the graph is bipartite but, in fact, one can easily adjust the argument
for general graphs. The only difference is that ∑_{b∈B} deg b is equal to 2m, not m. Instead of (4.6) we
get
\[ \frac{4m^2}{n} - 2m - n(n - 1) \le 0 , \]
which implies that
\[ ex(n, C_4) \le \frac{2 + \sqrt{4 + 16(n - 1)}}{8/n} = \frac{n}{4} \big( 1 + \sqrt{4n - 3} \big) = \Big( \frac{1}{2} + o(1) \Big) n^{3/2} . \]
On the other hand, the incidence graph of the projective plane of order q is an example of a dense
graph without C_4. Indeed, G(P, L) has
\[ n = 2 (q^2 + q + 1) = (2 + o(1)) \, q^2 \]
vertices and
\[ m = (q^2 + q + 1)(q + 1) = (1 + o(1)) \, q^3 = \Big( \frac{1}{2^{3/2}} + o(1) \Big) n^{3/2} \]
edges. This construction works when q is a prime power. But, since it is known that for every integer n
there exists a prime p satisfying n ≤ p ≤ (1 + o(1))n, the above estimation applies to all values of n. It
follows that
\[ ex(n, C_4) \ge \Big( \frac{1}{2^{3/2}} + o(1) \Big) n^{3/2} . \]
In fact, one can show that the upper bound is sharp; that is, there is a construction that (almost) matches
this bound, and so ex(n, C_4) = (1 + o(1)) n^{3/2}/2.
EXERCISES
4.4.1. There is a club with 100 members where there are 1,000 pairs of friends. We want to pick a three
person team from the club with one team member selected as a team leader. The procedure is that one
club member first becomes a leader. The leader then chooses two followers from his/her friends and the
team is formed. Show that it is possible to pick a team from the club in at least 19,000 ways.
4.4.2. Consider the following combinatorial game between two players, Builder and Painter. The game
starts with the empty graph on 400 vertices. In each round, Builder presents an edge uv between two
non-adjacent vertices u and v which has to be immediately colored red or blue by Painter. Show that
Builder can force Painter to create a monochromatic (that is, either red or blue) path on 100 vertices in
400 rounds.
(Source of the problem and solution: well-known, classic problem related to on-line size Ramsey
numbers.)
4.4.3. Consider a chess club consisting of 4t members for some t ∈ N; some of the members know
each other. Show that there exist t members that all know each other, or there exist t members such that
no two of them know each other.
(Source of the problem and solution: well-known, classic problem related to Ramsey numbers.)
PROBLEM
There are 100 people sitting at a round table. Each person has ordered an ice cream, either vanilla or
chocolate flavoured. In total, 51 people asked for vanilla ice cream; the remaining 49 preferred
chocolate. The correct number of each flavour was prepared and placed on the table, one ice cream
in front of each person. However, the waiter has forgotten who ordered which dessert so it is not
guaranteed that everyone received the dessert he or she ordered. Fortunately, it is possible to rotate the
table and try to satisfy more customers. Prove that one can rotate the table such that at least 52 people
will get what they wanted.
THEORY
Union Bound Let us use the following elementary fact, also known as Boole’s inequality, that we
already introduced in Section 1.8. For any collection of events A_1, …, A_n,
\[ P \Big( \bigcup_{i=1}^{n} A_i \Big) \le \sum_{i=1}^{n} P(A_i) . \tag{4.7} \]
Note that this inequality is best possible: equality holds for disjoint events.
Bonferroni Inequalities Moreover, let us mention that (4.7) may be generalized to find stronger upper
and lower bounds. These bounds are known as Bonferroni inequalities. In particular,
\[ P \Big( \bigcup_{i=1}^{n} A_i \Big) \ge \sum_{1 \le i \le n} P(A_i) - \sum_{1 \le i < j \le n} P(A_i \cap A_j) . \tag{4.8} \]
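Inequalities (4.7) and (4.8) can be checked on a small finite probability space with exact rational arithmetic. The space `omega` and the three events below are arbitrary illustrative choices of ours, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations

omega = range(12)  # uniform probability space with 12 outcomes
events = [set(range(0, 6)), set(range(4, 9)), {1, 5, 8, 10}]


def prob(event) -> Fraction:
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))


p_union = prob(set().union(*events))
s1 = sum(prob(a) for a in events)                            # sum of P(A_i)
s2 = sum(prob(a & b) for a, b in combinations(events, 2))    # sum of P(A_i and A_j)
s3 = prob(events[0] & events[1] & events[2])                 # P(A_1 and A_2 and A_3)
```

With three events, adding the triple intersection back, s1 − s2 + s3, gives the probability of the union exactly.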
More generally, let
\[ S_j := \sum_{1 \le i_1 < \ldots < i_j \le n} P(A_{i_1} \cap \ldots \cap A_{i_j}) . \]
Then, for any odd k ∈ [n],
\[ P \Big( \bigcup_{i=1}^{n} A_i \Big) \le \sum_{j=1}^{k} (-1)^{j-1} S_j , \]
and for any even k ∈ [n],
\[ P \Big( \bigcup_{i=1}^{n} A_i \Big) \ge \sum_{j=1}^{k} (-1)^{j-1} S_j . \]
Boole’s inequality is recovered by setting k = 1. When k = n, then equality holds and the resulting
identity is in fact equivalent to the well-known inclusion–exclusion principle:
\[ P \Big( \bigcup_{i=1}^{n} A_i \Big) = \sum_{j=1}^{n} (-1)^{j-1} S_j . \]
Linearity of Expectation Consider a finite probability space and a (real) random variable X that takes
values from the set χ. The expectation of X is defined as
\[ E[X] := \sum_{x \in \chi} x \cdot P(X = x) . \]
For example, roll a fair die once and define X to be the number rolled. Clearly, χ = [6] and
\[ E[X] = \sum_{x \in \chi} x \cdot P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{7}{2} . \]
An important and very useful property of the expectation is that it is a linear operator; that is, for any
sequence of random variables X_1, …, X_n and any sequence of constants c_1, …, c_n ∈ R,
\[ E \Big[ \sum_{i=1}^{n} c_i X_i \Big] = \sum_{i=1}^{n} c_i \, E[X_i] . \]
For example, if you roll two fair dice, the expected sum is equal to 2 ⋅ (7/2) = 7.
In order to show how the expected value can be calculated for some more complex random variables,
we consider the following experiment (that we believe is natural and interesting on its own). Assume
you are given a 24-card deck (that is, the deck consisting of 9-10-J-Q-K-A for each of the four suits).
Let us first consider the following experiment. Draw one card from this deck at random. If it is an
Ace, then you finish the game; otherwise, you need to put the card back into the deck and restart the
experiment. Our goal is to find the expected number of draws till you draw an Ace. It is clear that the
process finishes after i rounds with probability p_i := (1/6)(5/6)^{i−1}. (In fact, the number of rounds is a
random variable following the geometric distribution with parameter p = 1/6.) It follows that the
expected number of draws is equal to
\[ \sum_{i=1}^{\infty} i \cdot p_i = \sum_{i=1}^{\infty} i \cdot \frac{1}{6} \Big( \frac{5}{6} \Big)^{i-1} = \frac{1}{6} \sum_{i=0}^{\infty} \sum_{j=i}^{\infty} \Big( \frac{5}{6} \Big)^{j} = \sum_{i=0}^{\infty} \Big( \frac{5}{6} \Big)^{i} = 6 , \tag{4.9} \]
where the formula for a sum of a geometric series was used twice. However, assuming that the expected
value exists and is equal to E, we can alternatively compute it in a simpler way. Observe that we either
draw an Ace in the first round (which happens with probability 1/6) or ‘lose’ one draw and then repeat
the identical process (this happens with probability 5/6). It follows that
\[ E = \frac{1}{6} + \frac{5}{6} (1 + E) , \]
which immediately implies that E = 6. Let us stress that in order for this reasoning to be correct, we
had to assume that the expected value of the random variable we were interested in exists (that is, it is
finite).
Let us now change the setting slightly, and assume that cards are drawn at random without
replacement. For this variant, it is obvious that the expected value exists, as it must be at most 21 (we
have 24 cards and 4 aces, so in the worst case the process finishes at the end of round 21). In order to
find the expectation, one could write down the sum over all possible values for the length of the process
(as we did above) but in this case it would be even more cumbersome. Fortunately, it is much simpler to
use the other approach, which is justified as we are guaranteed that the expected value exists. Let us
denote by E_n the expected number of draws from an n-card deck that contains 4 aces till we hit an ace.
Clearly, E_4 = 1. Now, arguing as before we get that
\[ E_n = \frac{4}{n} + \frac{n - 4}{n} (1 + E_{n-1}) = 1 + \frac{n - 4}{n} E_{n-1} . \]
It is clear that E_n = (n + 1)/5 satisfies this recursion, and so for a 24-card deck the expected value is
(24 + 1)/5 = 5. Finally, let us observe that the waiting time without replacement is smaller than with
replacement. This is what one should intuitively expect as each unsuccessful draw in the variant
without replacement increases the probability that we finish in the next round whereas in the other
variant it remains the same.
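Both expected waiting times can be verified with exact rational arithmetic. The sketch below uses our own function names; the second function is an independent check that sums over the position of the first ace directly, using the fact that the first ace sits at position k for exactly C(n − k, 3) of the C(n, 4) equally likely placements of the four aces.

```python
from fractions import Fraction
from math import comb


def expected_draws_recursive(n: int) -> Fraction:
    """E_n from the recursion E_n = 1 + (n - 4)/n * E_{n-1} with E_4 = 1 (deck with 4 aces)."""
    e = Fraction(1)
    for size in range(5, n + 1):
        e = 1 + Fraction(size - 4, size) * e
    return e


def expected_first_ace_position(n: int) -> Fraction:
    """Direct computation: sum over k of k * P(first ace is the k-th card)."""
    return sum(
        Fraction(k * comb(n - k, 3), comb(n, 4)) for k in range(1, n - 2)
    )


# with replacement: partial sum of the series in (4.9), which approaches 6 from below
partial_sum = sum(i * Fraction(1, 6) * Fraction(5, 6) ** (i - 1) for i in range(1, 200))
```

The recursion and the direct sum agree for every deck size, which is a useful sanity check on the closed form (n + 1)/5.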
independently). Now, for a given set A of n teams, the probability that a given team outside A beats
all teams in A is (1/2)^n. Hence, the probability that there is no team better than all teams in A is equal
to (1 − 1/2^n)^{m−n}. The same formula holds for another set of n teams, say set B. Clearly, there are
some correlations between the corresponding events; for example, the fact that there is a team that beats
all teams in A increases the probability that there is a team that beats all teams in B, provided that
A ∩ B ≠ ∅. However, by the union bound (see (4.7)), the probability that there is at least one set of n
teams for which there is no better team is at most
\[ q(n, m) := \binom{m}{n} \Big( 1 - \frac{1}{2^n} \Big)^{m-n} . \]
If q(n, m) < 1, then we are guaranteed that there exists a tournament with m teams such that no n
teams can be awarded without another team beating all of them.
Clearly, for any fixed n ∈ N, one can find m = m(n) large enough such that q(n, m) < 1 as \binom{m}{n} ≤ m^n
grows only polynomially in m, whereas (1 − 1/2^n)^{m−n} decays exponentially. For example,
q(3, 91) < 1 and q(10, 102653) < 1. Note also that for any natural numbers m ≥ n, \binom{m}{n} ≤ (em/n)^n
and 1 − 1/2^n ≤ exp(−1/2^n), so
\[ q(n, m) \le \Big( \frac{em}{n} \Big)^{n} \exp \Big( - \frac{m - n}{2^n} \Big) = \exp \Big( n + n (n \ln 2 + \ln n) - n^2 + \frac{n}{2^n} \Big) < 1 \]
provided m = 2^n n² and n ≥ 12.
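The quantity q(n, m) can be evaluated exactly with rational arithmetic. The sketch below (function names are our own) computes q and searches for the smallest m at which the union bound drops below 1; it is meant for small n, since the exact fractions grow quickly.

```python
from fractions import Fraction
from math import comb


def q(n: int, m: int) -> Fraction:
    """Union bound on the probability that some set of n teams has no team beating all of them."""
    return comb(m, n) * Fraction(2 ** n - 1, 2 ** n) ** (m - n)


def smallest_m(n: int) -> int:
    """Smallest m > n with q(n, m) < 1, found by linear search (practical for small n only)."""
    m = n + 1
    while q(n, m) >= 1:
        m += 1
    return m
```

For n = 3 the search stops at m = 91, in agreement with the value quoted in the text. The same function `q` can be evaluated at (10, 102653), although the exact fraction is then very large.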
Let us now switch gears and discuss another elementary probabilistic method. It is obvious that one
can use the expectation of a random variable X to estimate the minimum and maximum value X can take.
In other words, we are guaranteed that there exist x_1, x_2 ∈ χ such that
x_1 ≤ E[X] and x_2 ≥ E[X].
Indeed, E[X] ≤ x_max, where x_max is the maximum value in χ. Similarly, E[X] ≥ x_min, where x_min is the
minimum value in χ.
Surprisingly, this naive method can be used to prove many non-trivial statements.
To illustrate the method, consider any n finite sets A_1, …, A_n. Then one can pick some of them such
that at least half of the underlying elements are repeated an odd number of times. In order to see this,
let us pick each A_i with probability 1/2, independently for all i. Note that for any a ∈ A := ⋃_{i∈[n]} A_i,
the probability that a is repeated an odd number of times is equal to 1/2. Indeed, for each set A_i that a
belongs to, we toss a fair coin to decide if A_i is picked or not. Regardless of the current state of the
process, the last A_i that we need to consider causes a to be repeated an odd number of times with
probability 1/2. It follows that the expected number of elements that are repeated an odd number of
times is equal to |A|/2. By the probabilistic method, we are guaranteed that it is possible to pick some
sets so that the number of elements that are repeated an odd number of times is at least |A|/2.
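For a small family of sets, the expectation argument can be replayed by listing all 2^n possible selections. The function name `odd_counts` and the example family are our own illustrative choices.

```python
from itertools import product


def odd_counts(sets):
    """For every selection of the sets, count the elements covered an odd number of times."""
    universe = set().union(*sets)
    results = []
    for choice in product([0, 1], repeat=len(sets)):
        odd = sum(
            1
            for a in universe
            if sum(c for c, s in zip(choice, sets) if a in s) % 2 == 1
        )
        results.append(odd)
    return universe, results


family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {1, 5}]
universe, results = odd_counts(family)
average = sum(results) / len(results)   # expected value under uniform random selection
best = max(results)                     # some selection is at least as good as the average
```

Since all selections are equally likely, `average` is the expectation from the argument above, and `best` can never fall below it.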
Here is another example, this time from graph theory. We will show that in every graph G = (V, E),
one can partition V, the vertex set, into V_1 and V_2 such that the number of edges with one endpoint in
V_1 and another in V_2 is at least |E|/2. Indeed, construct a random set V_1 ⊆ V by putting each vertex of
V in V_1 independently, with probability 1/2. Let V_2 := V ∖ V_1. For a given edge e ∈ E, let X_e denote
the indicator random variable that e has exactly one endpoint in V_1; that is, X_e = 1 if e has the desired
property, and X_e = 0 otherwise. Clearly,
\[ E[X_e] = 1 \cdot P(X_e = 1) + 0 \cdot P(X_e = 0) = P(X_e = 1) = \frac{1}{2} . \]
Note that G has X = ∑_{e∈E} X_e edges with the desired property. By linearity of expectation,
\[ E[X] = E \Big[ \sum_{e \in E} X_e \Big] = \sum_{e \in E} E[X_e] = \frac{|E|}{2} , \]
and so, by the probabilistic method, the desired partition exists.
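On a small graph, the average number of crossing edges over all partitions can be computed exhaustively. The graph below (a 5-cycle with one chord) and the function name `cut_sizes` are our own illustrative choices.

```python
from itertools import product


def cut_sizes(vertices, edges):
    """Number of crossing edges for every assignment of the vertices to the two sides."""
    sizes = []
    for sides in product([0, 1], repeat=len(vertices)):
        side = dict(zip(vertices, sides))
        sizes.append(sum(1 for u, v in edges if side[u] != side[v]))
    return sizes


vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
sizes = cut_sizes(vertices, edges)
average_cut = sum(sizes) / len(sizes)   # equals |E| / 2 by linearity of expectation
best_cut = max(sizes)                   # hence some partition has at least |E| / 2 crossing edges
```

The exhaustive search takes 2^|V| steps, so it only serves as an illustration of the averaging argument, not as an algorithm for large graphs.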
SOLUTION
Observe that we have 100 possible rotations of the table (including a trivial one; that is, without
rotating at all). Let us number all possible configurations using numbers from 1 to 100; for example, in
order to be precise, configuration i ∈ [100] is obtained by rotating the table clockwise by i places.
Consider a given configuration i, and let x_i be the number of people who wanted to get chocolate ice
cream but got vanilla. Since 49 people asked for chocolate ice cream, 49 − x_i people wanted
chocolate ice cream and got what they wanted. Moreover, since 51 people got vanilla ice cream,
51 − x_i people wanted vanilla ice cream and got what they wanted. Therefore, 100 − 2x_i people got
what they asked for. It remains to show that there exists i ∈ [100] such that x_i ≤ 24 as this guarantees
that at least 52 people got what they asked for.
We are going to use the double counting argument discussed in Section 4.3. We rotate the table
investigating all 100 configurations and counting how many people in total wanted chocolate ice cream
but got vanilla. On the one hand, this is clearly equal to ∑_{i=1}^{100} x_i. Now we will count the same thing
but this time from the perspective of any person out of the 49 people who wanted chocolate ice cream.
While the table was rotating, this person saw precisely 51 vanilla ice creams. It follows that
\[ \sum_{i=1}^{100} x_i = 49 \cdot 51 = 2,499 , \]
or equivalently (1/100) ∑_{i=1}^{100} x_i = 24.99. Since the average value is 24.99, there must be at least one i for
which x_i ≤ 24.99. Moreover, since all numbers are integers, we are guaranteed that x_i ≤ 24, which
finishes the proof.
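The double counting can be tested on a concrete seating. In the sketch below the orders and the initial placement of the desserts are shuffled with a fixed seed purely for illustration; the helper name `mismatched_chocolate` and the encoding of flavours as "V" and "C" are our own.

```python
import random


def mismatched_chocolate(orders, served, shift):
    """x_i: people who ordered chocolate but face vanilla after rotating the table by `shift` places."""
    n = len(orders)
    return sum(
        1 for p in range(n) if orders[p] == "C" and served[(p + shift) % n] == "V"
    )


rng = random.Random(0)
orders = ["V"] * 51 + ["C"] * 49   # what each seat ordered
served = orders[:]                 # the desserts on the table, in some other arrangement
rng.shuffle(orders)
rng.shuffle(served)

x = [mismatched_chocolate(orders, served, i) for i in range(1, 101)]
satisfied = [100 - 2 * xi for xi in x]   # people who got what they ordered, per rotation
best_rotation = max(range(100), key=lambda i: satisfied[i]) + 1
```

Whatever the arrangement, the values in `x` add up to 49 ⋅ 51, so the rotation `best_rotation` always satisfies at least 52 people.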
REMARKS
The problem can be equivalently solved using the probabilistic method. Assume that one rotates the
table uniformly at random; that is, each configuration i ∈ [100] occurs with probability 1/100. There
are 49 people that asked for chocolate ice cream; let us mark them with labels c_1, …, c_49. For any
j ∈ [49], let X_j be the random variable that equals 1 if c_j got vanilla ice cream, and equals 0 otherwise.
(As mentioned above, such random variables are called indicators.) Clearly,
\[ E[X_j] = 1 \cdot P(X_j = 1) + 0 \cdot P(X_j = 0) = P(X_j = 1) = \frac{51}{100} , \]
as 51 vanilla ice creams were served.
Let us stress the fact that random variables X_j and X_k are correlated. Indeed, c_j and c_k are sitting
around the table, and ice creams are placed on the table; it might happen that the fact that X_j = 1
affects the probability that X_k = 1. Fortunately, the linearity of expectation holds for any sequence of
random variables, so
\[ E \Big[ \sum_{j=1}^{49} X_j \Big] = \sum_{j=1}^{49} E[X_j] = 49 \cdot \frac{51}{100} = 24.99 . \]
By the probabilistic method, we get that one can rotate the table so that ∑_{j=1}^{49} X_j ≤ 24.99 and we are
done.
EXERCISES
X := [N ] = {1, 2, …, N } into two subsets A and B such that neither A nor B contains an arithmetic
progression of length k.
(Source of the problem and solution: well-known, classic problem related to Van der Waerden
numbers.)
4.5.2. Show that for any n ∈ N there is a tournament with n basketball teams participating in which there are at least k = n!/2^{n−1} orderings t_1, …, t_n such that team t_i won against team t_{i+1}, for all i ∈ [n − 1].
(Source of the problem and solution: well-known, classic problem related to directed Hamilton paths.)
4.5.3. Consider a graph with T triangles. Show that it is possible to color the edges of this graph with
two colors so that the number of monochromatic triangles is at most T /4.
4.5.4. There are 100 people invited to the party; 450 pairs of people know each other. Show that it is
possible to select 10 people so that no two of them know each other.
(Source of the problem and solution: well-known, classic problem related to independent sets in a graph
with a given degree sequence.)
4.6 Probability
SOURCE
Problem and solution: our own (inspired by a problem from the book “Are You Smart Enough to Work
at Google?” by William Poundstone)
PROBLEM
Let n ≥ 2 be any natural number. Take a unit stick and break it in n random places. Formally, each
breaking point is chosen uniformly at random from the whole stick. Find the probability that one can
create a polygon from the n + 1 resulting pieces.
THEORY
Geometrical Probability In order to solve the special case of our problem (when n = 2), we will use
some basic geometrical probability. This field studies some basic properties of geometrical objects such
as points, lines, planes, circles, spheres, focusing on some natural and fundamental concepts such as
random points, random planes, random directions. Let us note that any rigorous discussion on
geometrical probability would require sophisticated mathematical background (such as measure theory
and integral geometry). As a result, we only scratch the surface in this book, focusing exclusively on
very simple applications.
In order to give a flavour of results in this field, let us consider the clean tile problem that is an
example of a mathematical game of chance that is concerned with dropping a circular coin at random.
This game was studied by Buffon who is famous because of another game of chance he studied, the
needle problem that is concerned with dropping a needle at random. The needle problem requires
slightly more sophisticated tools so we only discuss the clean tile problem here.
In a room tiled with equal tiles of any shape a coin is thrown upward. One of the players bets that
after its fall the coin will rest clean, that is, on one tile only. The second player bets that the coin will
rest on the crack that separates tiles. We would like to investigate if the game is fair. Buffon himself
considered tiles shaped as squares, equilateral triangles, rhombuses, and hexagons. We concentrate on
squares, the easiest case. Suppose that a coin has diameter d and the floor is tiled with squares, each of side ℓ for some ℓ > d. We assume that the center of the coin lands at a random place on the floor. It is clear that the coin touches the separating crack if and only if the center is at distance less than d/2 from the crack (see Figure 4.5). Hence, the probability for the coin to be entirely within one of the tiles is given by the ratio between the area of the inner square and that of the outer square, that is, the first player wins with probability p where
\[ p := \frac{(\ell - d)^2}{\ell^2} = \Big(1 - \frac{d}{\ell}\Big)^2. \]
For the game to be fair, the two players must win with equal probability, that is, p = 1/2, and so the following equation has to be satisfied:
\[ \ell^2 - 4d\ell + 2d^2 = 0. \]
There are two solutions, ℓ = (2 ± √2)d, but since ℓ > d, the only acceptable solution is ℓ = (2 + √2)d ≈ 3.41d. Smaller values of ℓ give an advantage to the second player and larger values favour the first player.
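The fairness threshold for square tiles can be checked numerically. The sketch below is ours and not part of Buffon's analysis: `clean_probability` evaluates the closed form (1 − d/ℓ)², and `simulate_clean` is a simple Monte Carlo estimate that assumes the center of the coin is uniform within a single tile (by periodicity of the tiling this is enough).

```julia
# Probability that a coin of diameter d lands entirely inside a square tile of side l > d.
clean_probability(l, d) = (1 - d / l)^2

# Monte Carlo estimate: drop the center uniformly in one tile and check
# whether it is at distance at least d/2 from every side of the tile.
function simulate_clean(l, d; trials = 10^6)
    hits = 0
    for _ in 1:trials
        x, y = l * rand(), l * rand()
        if d / 2 <= x <= l - d / 2 && d / 2 <= y <= l - d / 2
            hits += 1
        end
    end
    return hits / trials
end

d = 1.0
l_fair = (2 + sqrt(2)) * d
println(clean_probability(l_fair, d))   # 0.5 up to rounding
println(simulate_clean(l_fair, d))      # a random estimate close to 0.5
```

Trying a smaller side such as ℓ = 3d or a larger one such as ℓ = 4d shows the advantage shifting from the second player to the first.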
Disjoint and Mutually Exclusive Events Two events A and B are said to be disjoint if they cannot
occur at the same time, that is, A ∩ B = ∅. In particular, P(A ∩ B) = P(∅) = 0. The simplest
example of disjoint events is a coin toss. A tossed coin outcome can be either head or tails, but both
outcomes cannot occur simultaneously.
Being mutually exclusive is a slightly different property of events (sets in a probability space). Two
events are mutually exclusive if the probability of them both occurring is zero, that is, P(A ∩ B) = 0.
With that definition, disjoint sets are necessarily mutually exclusive, but mutually exclusive events are
not necessarily disjoint.
In order to illustrate the difference, suppose that a point is selected uniformly at random from the unit
square. Each coordinate is uniformly and independently distributed from the set [0, 1]. Let A be the
event that the x coordinate is greater than or equal to the y coordinate, and B be the event that the y
coordinate is greater than or equal to the x coordinate. Clearly, P(A) = P(B) = 1/2 and
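A ∩ B = {(x, x) : x ∈ [0, 1]}, and so the events are not disjoint. However, P(A ∩ B) = 0, as the area of the diagonal A ∩ B is zero.
Consider now a sequence of pairwise mutually exclusive events A_1, …, A_n. One can combine the Union Bound (4.7) and the Bonferroni inequality (4.8) to get that
\[ \sum_{i=1}^{n} P(A_i) \ge P\Big(\bigcup_{i=1}^{n} A_i\Big) \ge \sum_{i=1}^{n} P(A_i) - \sum_{1 \le i < j \le n} P(A_i \cap A_j) = \sum_{i=1}^{n} P(A_i), \]
and so
\[ P\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} P(A_i). \tag{4.10} \]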
SOLUTION
Let us first define the problem more formally. Let A_0 and A_{n+1} be the two endpoints of the unit stick. Let A_1, …, A_n be the n random breaking points. Recall that, for each i ∈ [n], A_i is chosen uniformly at random from the whole stick. It follows that X_i, the distance from A_0 to A_i, is a random variable that has a real number from the interval [0, 1] assigned uniformly at random. These random variables are independent and so we can generate them one by one or all at the same time (simultaneously); we will use this property at some point. Note also that with probability zero X_i = X_j for some i ≠ j, and so we may assume that such a situation does not happen. As a result, we may order the A_i's in increasing order of the corresponding X_i's. Formally, let π : [n] → [n] be the unique permutation such that X_{π(i)} < X_{π(i+1)} for any i ∈ [n − 1] (notice that π is also a random variable that depends on random variables X_i).
In order to keep the notation simple, let us fix X_{π(0)} = 0 and X_{π(n+1)} = 1. Once we break the stick, we get n + 1 pieces. The ith piece (i ∈ [n + 1]) has length L_i = X_{π(i)} − X_{π(i−1)}. Clearly, the desired polygon can be created if and only if no piece has length larger than 1/2, that is, L_i ≤ 1/2 for all i ∈ [n + 1]. Note that the probability that some piece has length exactly 1/2 is equal to zero and so we may or may not include this degenerate case without affecting the result.
Let A_i be the event that the ith piece is too long, that is, L_i > 1/2. Finding the probability that the first piece is too long is easy. Indeed, in this case one can simply look at the independent random variables X_i to get that
\[ P(A_1) = P(L_1 > 1/2) = P\Big(\bigcap_{i=1}^{n} (X_i > 1/2)\Big) = \prod_{i=1}^{n} P(X_i > 1/2) = \frac{1}{2^n}. \]
Similarly, the last piece is too long with the same probability, namely, P(A_{n+1}) = 1/2^n. But what about some middle piece? It is not clear. It feels that the situation can be different but it turns out that the distributions of all L_i's are the same. To see this we do the following trick. Instead of breaking the stick into n + 1 pieces by breaking it in n random places, we start with a rope that forms a circle with unit circumference (that is, we take a unit length rope and glue the two endpoints together). Now, we cut the rope in n + 1 random places. Again, we can do the cuts one by one or all at the same time. If we do the cuts one by one, then we immediately see that the two processes are equivalent. Indeed, after the first cut the situation is exactly the same as at the beginning of the process of breaking the stick. From that point on, the two processes can be coupled together. On the other hand, if we cut the rope in n + 1 places simultaneously, then, by symmetry, we see that there is nothing special about L_1 or L_{n+1}. All the random variables L_i have the same distribution; in particular, P(A_i) = P(L_i > 1/2) = 1/2^n.
We need one more observation to solve our problem. Clearly, the events A_i are not independent. If the ith piece is too long, then the chance that some other piece is too long is smaller. In fact, if i ≠ j, then the two events A_i and A_j are disjoint, that is, they cannot occur at the same time. In particular, P(A_i ∩ A_j) = 0, and so
\[ P\Big(\bigcup_{i=1}^{n+1} A_i\Big) = \sum_{i=1}^{n+1} P(A_i) = \frac{n+1}{2^n}. \]
It follows that the probability that one can create a polygon from the n + 1 pieces is equal to 1 − (n + 1)/2^n.
REMARKS
Recall that the outcome of our random experiment (that is, breaking the stick into n + 1 pieces) can be represented by n random variables X_i. Each X_i (i ∈ [n]) has a real number from the interval [0, 1] assigned uniformly at random and independently. As a result, one can alternatively think about this experiment as a process of selecting a point uniformly at random from the n-dimensional unit cube. We will use this point of view to geometrically solve our problem for the specific case of breaking the stick in two places, that is, for n = 2.
Let (x, y) ∈ [0, 1]^2 be a random point from the unit square (recall that X_1 = x, X_2 = y). If both x and y are less than 1/2, then clearly we will not be able to construct a triangle in the original problem as the third piece will be too long. Similarly, if both are more than 1/2, then the first piece will be too long. If x < 1/2 < y, then our task is doable if and only if y − x < 1/2, that is, the middle piece is short enough. The same argument applies to the situation when y < 1/2 < x. In this case, the sufficient and necessary condition for being able to achieve our task is x − y < 1/2. We present all four cases in Figure 4.6; shaded areas correspond to the two cases when the triangle can be constructed. It follows that the probability we are successful is equal to 1/4 (of course, it is consistent with our general result: 1/4 = 1 − (2 + 1)/2^2).
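The general answer 1 − (n + 1)/2^n is easy to test empirically. The following is a rough Monte Carlo sketch, not part of the original solution; the function names and the default number of trials are our own choices.

```julia
# Exact probability from the solution: 1 - (n + 1) / 2^n.
polygon_probability(n) = 1 - (n + 1) / 2^n

# Break the unit stick at n uniform points and test whether the longest
# of the n + 1 pieces is shorter than 1/2.
function simulate_polygon(n; trials = 10^5)
    successes = 0
    for _ in 1:trials
        cuts = sort!(rand(n))
        pieces = diff([0.0; cuts; 1.0])
        successes += maximum(pieces) < 0.5
    end
    return successes / trials
end

for n in 2:5
    println(n, ": ", polygon_probability(n), " vs ", simulate_polygon(n))
end
```

For n = 2 the estimate should hover around 1/4, matching the area computed from the unit square.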
EXERCISES
4.6.1. Consider an urn that initially contains one white and one black ball. We repeatedly perform the
following process. In a given round, one ball is drawn randomly from the urn and its color is observed.
The ball is then returned to the urn, and an additional ball of the same color is added to the urn. We
repeat this selection process for 50 rounds so that the urn contains 52 balls. What number of white balls
is the most probable?
(Source of the problem: PLMO L – Phase 1 – Problem 11. Solution: our own.)
4.6.2. There are 65 participants competing in a ski jumping tournament. They take turns and perform
their jumps in a given sequence. We assume that no two jumpers obtain the same result and that each
final resulting order of participants is equally probable. At each given round of the tournament, the
person that has obtained the best result thus far is called a leader. Prove that the probability that the
leader changed exactly once during the whole tournament is greater than 1/16.
(Source of the problem: PLMO XLVII – Phase 1 – Problem 11. Solution: our own.)
4.6.3. Three random events meet the following three conditions: (a) their probabilities are all equal, (b)
they are pairwise independent, and (c) all of them cannot happen at the same time. What is the
maximum probability that at least one of these three events holds?
(Source of the problem and solution idea: PLMO XXXV – Phase 1 – Problem 9.)
PROBLEM
Consider the set P of all points (x, y) on a plane with x, y ∈ Z; that is, P = Z × Z, the Cartesian
product of Z and Z. Suppose that each point in P is painted red or blue. Prove that there exists an
infinite subset of P that has a center of symmetry and consists of points having the same color.
THEORY
Recall that for any two sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b)
where a ∈ A and b ∈ B; that is,
A × B := {(a, b) : a ∈ A, b ∈ B } .
This definition extends naturally to any dimension n ∈ N, the Cartesian product A_1 × … × A_n, where instead of ordered pairs we deal with ordered n-tuples. Moreover, if all A_i's are the same, then we simply write A^n instead of A × … × A.
Point Reflection Let p = (p_1, …, p_n) ∈ R^n be any point in n-dimensional space. For any point a = (a_1, …, a_n) ∈ R^n, the reflection of a across the point p is the point
\[ \mathrm{Ref}_p(a) = 2p - a. \]
In the case where p = (0, …, 0) ∈ R^n is the origin, point reflection of a is simply the negation of vector a. In two dimensions, namely when n = 2, a point reflection is the same as a rotation of 180 degrees.
Point Symmetry A set S ⊆ R^n that is invariant under a point reflection is said to possess point symmetry. In other words, S has point symmetry if and only if there exists a point p ∈ R^n such that S = S′, where
\[ S' := \{\mathrm{Ref}_p(a) : a \in S\}. \]
SOLUTION
Consider any coloring of P with the two colors red and blue. Towards a contradiction, suppose that
there is no monochromatic infinite subset of P that has a center of symmetry. In other words, for every point p ∈ A × A, where A := {k/2 : k ∈ Z}, the set
\[ P' = P'(p) := \{a \in P : \text{both } a \text{ and } 2p - a \text{ have the same color}\} \]
is finite. In fact, we will only use this assumption for p ∈ {(0, 0), (1/2, 0)}.
Suppose first that the center of symmetry is located at p = (0, 0), the origin. It follows from our assumption that there exists M_1 ∈ N such that for all x, y ∈ Z with y ≥ M_1,
\[ (x, y) \text{ and } (-x, -y) \text{ have different colors.} \tag{4.11} \]
Similarly, if p = (1/2, 0), then we are guaranteed that there exists M_2 ∈ N such that for all x, y ∈ Z with y ≥ M_2,
\[ (x, y) \text{ and } (1 - x, -y) \text{ have different colors.} \tag{4.12} \]
Let us now fix y = max{M_1, M_2}. It follows immediately from (4.11) and (4.12) that for any x ∈ Z, points (x, y) and (x + 1, y) have the same color (since both of them have a different color than (−x, −y) and there are only two colors, red and blue).
As a result, all the points in Q := {(x, max{M_1, M_2}) : x ∈ Z} have the same color. Since Q clearly has point symmetry (in fact, any point in Q is a center of symmetry) and is infinite, we get the desired contradiction.
REMARKS
Very often problems formulated in terms of relationships of geometrical objects can be reformulated as
combinatorial problems, in which geometrical properties of the considered objects form combinatorial
constraints. The opposite situation may also occur; that is, sometimes combinatorial problems can be
solved after rephrasing them in the language of geometry and then using geometrical tools.
In order to illustrate the power of this approach, let us consider convex n-gons. Recall that n-gon is a
polygon with n sides (see Chapter 6 for more). A convex polygon is defined as a polygon with all its
interior angles less than 180 degrees. This means that all the vertices of the polygon will point
outwards, away from the interior of the shape. Assuming that there are no 3 diagonals going through
the same point, let us count how many intersection points all the diagonals have.
For simplicity, let us concentrate on 8-gons and label vertices with integers from 0 to 7, starting with
an arbitrary vertex and then proceeding clockwise. There are 8 diagonals from vertex i to vertex i + 2,
i = 0, …, 7 (using modular arithmetic), each of them intersecting 1 ⋅ 5 other diagonals. There are 8
diagonals from vertex i to vertex i + 3, i = 0, …, 7; this time, each of them intersects 2 ⋅ 4 other
diagonals. Finally, there are 4 diagonals from i to i + 4, i = 0, …, 3 and each of them intersects 3 ⋅ 3
other diagonals. Moreover, each pair of intersecting diagonals occurs twice. Hence, the total number of
intersections is equal to
\[ \frac{8 \cdot 1 \cdot 5 + 8 \cdot 2 \cdot 4 + 4 \cdot 3 \cdot 3}{2} = \frac{140}{2} = 70. \]
One can repeat this argument for any value of n but it gets complicated quickly and no general formula
seems to appear.
Alternatively, one can observe that each intersection point can be labelled with the set consisting of the labels of the endpoints of the two corresponding diagonals. For example, diagonal 15 intersects diagonal 36 at the point labelled with set {1, 3, 5, 6}. It is easy to see that no two pairs of intersecting diagonals yield the same set. On the other hand, each set of 4 labels corresponds to one intersection point. It follows that there exists a bijection between the set of intersection points and the family of 4-element subsets of {0, …, 7}, and so the two sets have the same size. Since
\[ \binom{8}{4} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2} = 70, \]
we get an alternative way of obtaining the result. More importantly, it easily generalizes to any value of n: there are \( \binom{n}{4} \) intersection points of two diagonals.
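The bijection with 4-element subsets can be confirmed by brute force for small polygons. The sketch below is ours: it relies on the standard fact that two diagonals of a convex polygon cross in the interior exactly when their endpoints interleave along the boundary, and it counts crossing pairs (which equals the number of intersection points under the assumption that no three diagonals meet at one point).

```julia
# All diagonals of a convex n-gon with vertices labelled 0, ..., n-1 in clockwise order.
function diagonals(n)
    return [(a, b) for a in 0:n-1 for b in a+2:n-1 if !(a == 0 && b == n - 1)]
end

# Two diagonals cross inside the polygon exactly when their endpoints interleave.
crosses((a, b), (c, d)) = a < c < b < d || c < a < d < b

function count_intersections(n)
    ds = diagonals(n)
    return count(crosses(ds[i], ds[j]) for i in 1:length(ds) for j in i+1:length(ds))
end

for n in 4:10
    println(n, ": ", count_intersections(n), " = ", binomial(n, 4))
end
```

For n = 8 both columns give 70, in agreement with the direct count above.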
EXERCISES
4.7.1. Let P be a set of five points on a plane with the property that no three of them lie on the same
line. Denote by a(P ) the number of obtuse triangles whose vertices lie in P. Find the minimum and the
maximum value that a(P ) can attain over all possible sets P.
4.7.2. Every point on a circle is painted with one of three colors. Prove that there are three points on the
circle that have the same color and form an isosceles triangle.
(Source of the problem and solution idea: PLMO LI – Phase 1 – Problem 4.)
4.7.3. Take a set of n ≥ 2 points with the property that no three of them lie on the same line. We paint
all line segments formed by those points in such a way that no two line segments that have a common
vertex have the same color. Find the minimum number of colors for which such coloring exists.
During the Polish Mathematical Olympics that lasts two days, participants are solving a total of 6
problems. Each participant can get 6, 5, 2, or 0 points for the solution of each problem. During one of
the competitions, the following interesting property occurred: for any two participants there were two
problems for which they obtained different scores. How many participants came to this competition, at
most?
THEORY
Pigeonhole Principle The tool we introduce and use in this section is obvious but, perhaps surprisingly, often extremely powerful. It can be stated as follows. If one has n boxes and places more than n
objects into them, then there will be at least one box that contains more than one object. In fact, one can
make the following stronger statement: if k objects are placed into n boxes, then there will be at least
one box that contains at least ⌈k/n⌉ objects.
In order to see this tool in action (in an easy scenario) let us consider the following example. We
shoot 65 shots at a square target, the side of which is 80 centimeters long. Since we are pretty good at
this, all of our shots hit the target. Prove that there are two bullet holes that are closer than 15
centimeters from each other.
Suppose that our target is an old 8 × 8 chessboard. (Formally, we say that the target is tessellated into an 8 × 8 square grid.) There are 8 ⋅ 8 = 64 squares and the board received 65 > 64 shots. Hence, by the pigeonhole principle, there must be a square that received at least two shots. We claim that these two shots are at distance less than 15 centimeters from each other. Indeed, since each square has side 10 centimeters, the distance between any two points of the same square is, by the Pythagorean theorem, at most
\[ \sqrt{10^2 + 10^2} = \sqrt{200} \approx 14.1 < 15. \]
SOLUTION
First of all, let us note that the distribution of points does not matter (well, from the perspective of our
problem). Hence, we may assume that each solution gets a score from the set P = {0, 1, 2, 3} and so the performance of each participant can be represented by a vector from the set
\[ X = \{(a_1, a_2, a_3, a_4, a_5, a_6) : a_i \in P\}. \]
Clearly, |X| = 4^6 = 4,096.
We know that participants got unique vectors from a subset A of X that satisfies the following property: any two vectors from A differ in at least two coordinates. Our goal is to provide an upper bound for the size of A. To that end, let us observe that the number of vectors of length 5 with elements from P is 4^5 = 1,024. Hence, if A contained more than 1,024 vectors, then by the pigeonhole principle it would have two vectors that coincide on the first 5 coordinates and so differ on at most one coordinate (that is, possibly the last one). This shows that |A| ≤ 1,024.
Now we will show that this upper bound is sharp; that is, one can construct a set A of size 1,024 with the desired property. Let
\[ A = \Big\{\Big(a_1, a_2, a_3, a_4, a_5, a_6 = \sum_{i=1}^{5} a_i \ (\mathrm{mod}\ 4)\Big) : a_i \in P \text{ for } 1 \le i \le 5\Big\} \subseteq X. \]
In other words, A is constructed by considering all five element vectors (a_1, …, a_5) with entries from P and adding a_6 = ∑_{i=1}^{5} a_i (mod 4) at the very last coordinate. (Note that a_6 ∈ P.) Clearly, |A| = 4^5 = 1,024, as needed. It remains to check that A has the desired property. Towards a contradiction, suppose that A contains two distinct vectors a = (a_1, …, a_6) and b = (b_1, …, b_6) that differ on at most one coordinate. Clearly, by construction, a and b differ on at least one coordinate from the first five coordinates and so a and b must differ on precisely one coordinate: a_ℓ ≠ b_ℓ for some ℓ such that 1 ≤ ℓ ≤ 5. In particular, a_6 = b_6. But this implies that
\[ \sum_{i=1}^{5} a_i = a_6 = b_6 = \sum_{i=1}^{5} b_i \pmod 4, \]
and so, since a_i = b_i for all i ≠ ℓ, we get a_ℓ = b_ℓ (mod 4). As both a_ℓ and b_ℓ belong to P = {0, 1, 2, 3}, this means that a_ℓ = b_ℓ, which is the desired contradiction.
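Since the construction is small, its key property can also be verified exhaustively. The following sketch (our own, with the hypothetical helper name `differences`) builds the set A with scores encoded as 0, 1, 2, 3 and checks all pairs of vectors.

```julia
# All vectors (a1, ..., a5, a6) with a_i in {0, 1, 2, 3} and a6 = (a1 + ... + a5) mod 4.
A = vec([(a1, a2, a3, a4, a5, (a1 + a2 + a3 + a4 + a5) % 4)
         for a1 in 0:3, a2 in 0:3, a3 in 0:3, a4 in 0:3, a5 in 0:3])

# Number of coordinates on which two score vectors differ.
differences(a, b) = count(a .!= b)

println(length(A))   # 4^5 = 1024 vectors
println(minimum(differences(A[i], A[j]) for i in 1:length(A) for j in i+1:length(A)))
```

The second printed value is the smallest number of coordinates on which two distinct vectors of A differ; the argument above says it cannot be below 2.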
REMARKS
Let us first note how one can come up with the proof that a set of size 1,024 can be constructed. The key observation was that each five element vector can be associated with one of four signatures (its sum modulo 4); moreover, if two of these vectors differ on only one coordinate, then they must have different signatures. We used these signatures to define the 6th coordinate.
Let us also mention the following three closely related problems: the Birthday Paradox, which can be viewed as a probabilistic pigeonhole principle, the Coupon Collector Problem, and the Birthday Attack.
Birthday Paradox Suppose that k people are selected at random. The convenient assumption is that
each day of the year (including February 29) is equally probable for a birthday, independently for each
person. We are interested in estimating the probability that some pair of the selected people will have
the same birthday. Since there are n = 366 possible birthdays, by the pigeonhole principle, this
probability is equal to one if k ≥ 367. On the other hand, if k = n = 366, then we are not guaranteed
that such pair exists but the probability that each person has a unique birthday (that we denote by
p(n, k)) is extremely small. Indeed, it is clear that
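\[ p(n, k) := \frac{n \cdot (n - 1) \cdots (n - k + 1)}{n^k} = \frac{n!}{n^k (n - k)!}, \]
and so p(366, 366) ≈ 5.36 ⋅ 10^{−158}. It may seem surprising that this probability is already below 50% for a group of only k = 23 people. To see why it decays so quickly, one can use the inequality 1 − x ≤ exp(−x) to get
\[ p(n, k) = \prod_{i=1}^{k-1} \Big(1 - \frac{i}{n}\Big) \le \exp\Big(-\sum_{i=1}^{k-1} \frac{i}{n}\Big) = \exp\Big(-\frac{k(k-1)}{2n}\Big) =: \hat{p}(n, k). \]
Moreover, in practice this upper bound is not too far from the true value; for example, for the more common convention of n = 365 days,
\[ 0.492703 \approx p(365, 23) \le \hat{p}(365, 23) \approx 0.499998, \]
whereas for n = 366 one gets p(366, 23) ≈ 0.4937 and p̂(366, 23) ≈ 0.5009.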
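The exact probability and its exponential upper bound are both one-liners, so they are easy to compare numerically. This is a small sketch of ours; the names `p` and `p_hat` mirror the notation p(n, k) and p̂(n, k), and both conventions n = 365 and n = 366 are evaluated so that the figures can be compared directly.

```julia
# Probability that k people all have distinct birthdays among n equally likely days (k >= 1).
p(n, k) = prod(1 - i / n for i in 0:k-1)

# Upper bound obtained from the inequality 1 - x <= exp(-x).
p_hat(n, k) = exp(-k * (k - 1) / (2n))

for n in (365, 366)
    println(n, ": ", p(n, 23), " <= ", p_hat(n, 23))
end
```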
Coupon Collector’s Problem We continue selecting k people at random but this time we would like k
to be large enough so that q(n, k), the probability that every single day of the year someone has a
birthday, is close to one. This problem is known as the coupon collector’s problem as the question can
be reformulated as the problem of collecting n unique coupons hidden in boxes of some brand of
cereals. Clearly, for any k < n we have q(n, k) = 0. If one is extremely lucky, then the group of
k = n = 366 people could have the desired property but the probability is very low; indeed,
\[ q(366, 366) = p(366, 366) \approx 5.36 \cdot 10^{-158}. \]
The exact values for q(n, k) are extremely difficult to compute (unless n and k are small or, for example, n = k) as they are related to the Stirling numbers of the second kind, the number of ways to partition a set of k objects into n non-empty subsets. Using simulations, we determined that k = 2,294 is the smallest value for which q(366, k) > 0.5.
On the other hand, it is possible to show that the expected number of people that need to be selected
in order for the desired property to hold is
\[ n \sum_{i=1}^{n} \frac{1}{i} = n H_n = n \ln n + \gamma n + \frac{1}{2} + o(1), \]
where H_n is the n-th harmonic number and γ ≈ 0.577216 is the Euler-Mascheroni constant. In particular, for n = 366 it is approximately 2,372.1245. The function q(n, k) has the following asymptotic behavior. If k = n(ln n + c_n) for some sequence (c_n)_{n∈N} of real numbers, then
\[ q(n, k) \sim \begin{cases} 0 & \text{if } c_n \to -\infty \\ e^{-e^{-c}} & \text{if } c_n \to c \in \mathbb{R} \\ 1 & \text{if } c_n \to \infty. \end{cases} \]
Note that for n = 366 and k = 2,294 the above estimate gives q(n, k) ≈ 0.4995, which is in line with the simulation results. So k is approximately 100 times larger than the 23 people needed in the birthday
paradox case. Finally, let us highlight the following weaker but often useful statement: for any ϵ > 0, q(n, (1 − ϵ) n ln n) → 0 and q(n, (1 + ϵ) n ln n) → 1 as n → ∞.
FIGURE 4.7: Plot of q(366, k). Dashed horizontal line is for 50% probability, dotted vertical line denotes expected value that is roughly equal to 2,372.1245.
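The two numerical values quoted for n = 366 can be reproduced directly from the formulas. The sketch below is ours and only evaluates the expected value n H_n and the limiting expression exp(−exp(−c)) with c = k/n − ln n; it does not compute the exact q(n, k).

```julia
# Expected number of people needed until all n birthdays are covered: n * H_n.
expected_people(n) = n * sum(1 / i for i in 1:n)

# Asymptotic estimate exp(-exp(-c)) with c = k / n - ln(n).
q_estimate(n, k) = exp(-exp(-(k / n - log(n))))

println(expected_people(366))     # about 2372.12
println(q_estimate(366, 2294))    # about 0.4995
```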
Birthday Attack A birthday attack is a type of cryptographic attack that exploits the mathematics
behind the birthday problem discussed above. It can be formulated as follows. Given a function
f = f(x), the goal of the attack is to find two different values of x, say x_1 and x_2, such that f(x_1) = f(x_2). Such a pair x_1, x_2 is called a collision. The method used to find a collision is simply to evaluate function f for many values of x, selected for example at random, until the same result is obtained more than once. Because of the birthday problem, this method is surprisingly efficient.
In particular, if function f gives any of the n different outputs uniformly at random and n is sufficiently large, then we expect to obtain a collision after evaluating the function for about 1.25√n different arguments, on average. Indeed, the probability that the first collision occurs at time i is equal to
\[ \begin{aligned} P(i) &= \prod_{t=1}^{i-2} \Big(1 - \frac{t}{n}\Big) \cdot \frac{i-1}{n} \\ &= \exp\Big(-\sum_{t=1}^{i-2} \frac{t}{n} + O(i^3/n^2)\Big) \cdot \frac{i-1}{n} \\ &= \exp\Big(-\frac{(i-1)(i-2)}{2n} + O(i^3/n^2)\Big) \cdot \frac{i-1}{n} \\ &\sim \frac{x \exp(-x^2/2)}{\sqrt{n}}, \end{aligned} \]
provided i = x√n for some x ∈ R. In order to calculate an asymptotic value for the probability of seeing a collision by time i = x√n, denoted P(≤ i), we need to use integrals. This part is not considered to be elementary mathematics so the less advanced reader can safely skip it; we will not use this result later on. Moreover, we only provide a sketch, as a formal argument is more delicate and technical.
\[ P(\le x\sqrt{n}) = \sum_{i=1}^{x\sqrt{n}} P(i) \sim \int_0^x z \exp(-z^2/2)\, dz = -\exp(-z^2/2)\Big|_0^x = 1 - \exp(-x^2/2). \]
So one needs to evaluate function f roughly √(2 ln 2) √n ≈ 1.18√n times to get the probability close to 1/2. Similarly, the expected number of values that need to be evaluated to get the first collision is equal to
\[ \sum_{i=1}^{\infty} i \cdot P(i) \sim \int_0^\infty z^2 \exp(-z^2/2)\, dz \cdot \sqrt{n} = \sqrt{\pi/2}\, \sqrt{n} \approx 1.25 \sqrt{n}. \]
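The constant √(π/2) ≈ 1.25 can be checked without any integrals by summing i ⋅ P(i) exactly. The sketch below is ours; it assumes, as in the text, that each evaluation of f returns one of n outputs uniformly at random, and it accumulates the probability that the first i − 1 outputs are all distinct.

```julia
# Exact expected number of evaluations until the first collision when each
# output is one of n values chosen uniformly at random.
function expected_first_collision(n)
    distinct = 1.0   # probability that the first i - 1 outputs are all distinct
    total = 0.0
    for i in 2:n+1
        total += i * distinct * (i - 1) / n   # adds i * P(i)
        distinct *= 1 - (i - 1) / n
    end
    return total
end

n = 10^6
println(expected_first_collision(n) / sqrt(n))   # close to sqrt(pi / 2)
println(sqrt(pi / 2))
```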
EXERCISES
4.8.1 Twenty five boys and twenty five girls sit around a table. Prove that it is always possible to find a person both of whose neighbors are girls.
(Source of the problem and solution: Interactive Mathematics Miscellany and Puzzles by Alexander
Bogomolny, https://www.cut-the-knot.org.)
4.8.2 A person takes at least one aspirin a day for 30 days. Show that if the person takes 45 aspirin
altogether, then in some sequence of consecutive days that person takes exactly 14 aspirin.
(Source of the problem and solution: Interactive Mathematics Miscellany and Puzzles by Alexander
Bogomolny, https://www.cut-the-knot.org.)
4.8.3 Prove that if we take n + 1 numbers from the set from 1 to 2n, then in this subset there exist two
numbers such that one divides the other.
Can you design two different dice so that their sums behave just like a pair of ordinary dice? That is,
there must be two ways to roll a 3, six ways to roll a 7, one way to roll a 12, and so forth. Each die must
have six sides, and each side must be labelled with a positive integer.
THEORY
The generating function of a sequence a = (a_i)_{i=0}^{∞} of real numbers is defined as follows:
\[ G(x) = G(a, x) := \sum_{i=0}^{\infty} a_i x^i. \]
Unlike an ordinary series, this formal series is allowed to diverge, meaning that the generating function
is not always a true function and the “variable” x is actually an indeterminate allowing us to perform
useful algebraic manipulations.
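Let us start with a simple and standard application of generating functions. The Fibonacci sequence a_0, a_1, a_2, … is defined recursively as follows: a_0 = 0, a_1 = 1, and
\[ a_{n+1} = a_n + a_{n-1} \quad \text{for } n \ge 1. \]
Our goal is to find an explicit formula for a_n. Instead of looking for the sequence, we will look for its generating function G(x) = ∑_{j≥0} a_j x^j. Once we get it, we will try to recover the coefficient a_n in front of x^n. To that end, we multiply both sides of the recurrence by x^n, and sum over all values of n for which the relation is valid. We get
\[ \sum_{n \ge 1} a_{n+1} x^n = \sum_{n \ge 1} a_n x^n + \sum_{n \ge 1} a_{n-1} x^n. \tag{4.13} \]
The left-hand side of (4.13) is equal to
\[ \sum_{n \ge 1} a_{n+1} x^n = a_2 x + a_3 x^2 + a_4 x^3 + \ldots = \frac{G(x) - a_1 x - a_0}{x} = \frac{G(x) - x}{x}, \]
whereas the right-hand side is equal to
\[ \sum_{n \ge 1} a_n x^n + \sum_{n \ge 1} a_{n-1} x^n = G(x) + xG(x). \]
It follows that
\[ \frac{G(x) - x}{x} = G(x) + xG(x) \]
and so
\[ G(x) = \frac{x}{1 - x - x^2} = \frac{x}{(1 - x r_+)(1 - x r_-)}, \]
where r_+ = (1 + √5)/2 and r_− = (1 − √5)/2. Our first task is done: we have an explicit formula for G(x). In order to recover the coefficients, we use partial fractions and the geometric series expansion 1/(1 − xr) = ∑_{n≥0} r^n x^n to get
\[ \begin{aligned} G(x) &= \frac{1}{r_+ - r_-} \Big( \frac{1}{1 - x r_+} - \frac{1}{1 - x r_-} \Big) \\ &= \frac{1}{\sqrt{5}} \Big( \sum_{n \ge 0} r_+^n x^n - \sum_{n \ge 0} r_-^n x^n \Big) \\ &= \sum_{n \ge 0} \frac{1}{\sqrt{5}} (r_+^n - r_-^n) x^n, \end{aligned} \]
and so
\[ a_n = \frac{1}{\sqrt{5}} (r_+^n - r_-^n) = \frac{1}{\sqrt{5}} \Big( \Big( \frac{1 + \sqrt{5}}{2} \Big)^n - \Big( \frac{1 - \sqrt{5}}{2} \Big)^n \Big). \]
Since
\[ \Big| \frac{1}{\sqrt{5}} \Big( \frac{1 - \sqrt{5}}{2} \Big)^n \Big| \le \frac{1}{\sqrt{5}} \approx 0.447 < 1/2, \]
we get that
\[ a_n = \Big\lfloor \frac{1}{\sqrt{5}} \Big( \frac{1 + \sqrt{5}}{2} \Big)^n \Big\rceil, \]
where ⌊·⌉ denotes rounding to the nearest integer.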
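The closed formula can be compared against the recurrence for small indices. The sketch below is ours and uses ordinary floating-point arithmetic, so the comparison is only meaningful while the powers of (1 + √5)/2 are represented accurately enough; the range 0 to 50 used here is a conservative choice.

```julia
# Fibonacci numbers from the recurrence a_0 = 0, a_1 = 1, a_{n+1} = a_n + a_{n-1}.
function fib_recursive(n)
    a, b = 0, 1
    for _ in 1:n
        a, b = b, a + b
    end
    return a
end

# Closed form: round ((1 + sqrt(5)) / 2)^n / sqrt(5) to the nearest integer.
fib_closed(n) = round(Int, ((1 + sqrt(5)) / 2)^n / sqrt(5))

println([fib_recursive(n) for n in 0:10])
println(all(fib_recursive(n) == fib_closed(n) for n in 0:50))
```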
Golden Ratio The constant that appeared in the formula for the nth Fibonacci number,
\[ \phi = \frac{1 + \sqrt{5}}{2} \approx 1.618, \]
is the Golden Ratio that appears in mathematics surprisingly often. Perhaps even more surprising is the fact that it appears in some patterns in nature, such as the spiral arrangement of leaves and other plant parts, as well as in music, architecture, and paintings. Two quantities a and b are in the golden ratio if their ratio is the same as the ratio of their sum to the larger of the two quantities. In other words, ϕ is defined as follows:
\[ \phi = \frac{a}{b} = \frac{a + b}{a}, \]
where a > b > 0.
SOLUTION
Consider the generating function where a_k represents the number of appearances of the number k on the die. Thus, an ordinary die would be represented by the polynomial
\[ f(x) = x + x^2 + x^3 + x^4 + x^5 + x^6 = x(x + 1)(x^2 + x + 1)(x^2 - x + 1). \]
The key observation is that the result of rolling two (or more, in general) dice is represented by the product of their generating functions. Therefore, if g(x) and h(x) are the functions associated with the first die and the second one, respectively, then we get that
\[ g(x)h(x) = f(x)^2 = \big(x(x + 1)(x^2 + x + 1)(x^2 - x + 1)\big)^2. \]
There are some constraints we need to consider: we cannot have a non-zero constant term in g(x) or h(x) (since that would imply that some sides are labelled “0”) or any negative term. It follows that we need to assign one copy of each “x” factor to g(x) and h(x). Moreover, g(1) = h(1) = 6 (the number of sides), so we need to assign one copy of each “(x^2 + x + 1)” and “(x + 1)” factor to g(x) and h(x) as well. It remains to distribute the two “(x^2 − x + 1)” factors. If we give one copy to each of g(x) and h(x), then we get two ordinary dice. Otherwise, both copies go to the same polynomial, say h(x), and we get
\[ \begin{aligned} g(x) &= x(x + 1)(x^2 + x + 1) = x + 2x^2 + 2x^3 + x^4, \\ h(x) &= x(x + 1)(x^2 + x + 1)(x^2 - x + 1)^2 = x + x^3 + x^4 + x^5 + x^6 + x^8, \end{aligned} \]
that is, one die with sides labelled 1, 2, 2, 3, 3, 4 and another with sides labelled 1, 3, 4, 5, 6, 8.
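The identity g(x)h(x) = f(x)^2 can be verified by multiplying coefficient vectors. In the sketch below (ours), a polynomial without constant term is stored as a vector whose entry k is the coefficient of x^k, and `polymul` is a hypothetical helper written for this illustration.

```julia
# Multiply two polynomials with zero constant term, given as coefficient
# vectors in which index k holds the coefficient of x^k.
function polymul(g, h)
    out = zeros(Int, length(g) + length(h))
    for (i, gi) in enumerate(g), (j, hj) in enumerate(h)
        out[i + j] += gi * hj
    end
    return out
end

f = [1, 1, 1, 1, 1, 1]          # ordinary die: x + x^2 + ... + x^6
g = [1, 2, 2, 1]                # x + 2x^2 + 2x^3 + x^4
h = [1, 0, 1, 1, 1, 1, 0, 1]    # x + x^3 + x^4 + x^5 + x^6 + x^8

println(polymul(f, f))
println(polymul(g, h) == polymul(f, f))
```

Entries 2 to 12 of either product list the number of ways to roll each sum, which is the distribution 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 of two ordinary dice.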
REMARKS
Alternatively, one could have solved this problem using a computer in the following way. Observe first that there is only one way to get the sum to be equal to 2 or 12. This means that each die must have exactly one 1 and both of them must have a unique maximum value (which can be different). Now, observe that this unique value must be greater than 3; otherwise, one of the dice would be {1, 2, 2, 2, 2, 3} and the other die would have to have 1, 9 and four numbers that are between 2 and 8, but then there are too many ways to get the sum to be equal to 11. Since the two maximum values add up to 12, this means that the maximum number on any die is at most 8. One can easily enumerate all such dice and check which of them meet the required criteria.
Here is a basic program written in the Julia language that performs this brute-force check. We did not optimize it for speed as its run-time is under one second anyway.
using Base.Iterators

function listdies()
    ref = [1,2,3,4,5,6,5,4,3,2,1] # reference distribution
    # traverse all possible die configurations with one 1
    for d1 in product(1, 2:8, 2:8, 2:8, 2:8, 2:8)
        for d2 in product(1, 2:8, 2:8, 2:8, 2:8, 2:8)
            # filter options to avoid reporting duplicates
            if issorted(d1) && issorted(d2) && d1 <= d2
                # x will be a 6x6 matrix storing possible sums
                x = [a1 + a2 for a1 in d1, a2 in d2]
                # check if counts of sums equals what we want
                if [count(v -> v==s, x) for s in 2:12] == ref
                    # print the result on the screen on success
                    println((d1, d2))
                end
            end
        end
    end
end
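Running listdies() confirms that there are only two solutions to the problem: the standard pair of dice, (1, 2, 3, 4, 5, 6) and (1, 2, 3, 4, 5, 6), and the Sicherman pair, (1, 2, 2, 3, 3, 4) and (1, 3, 4, 5, 6, 8).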
EXERCISES
4.9.1. Consider the Sicherman dice problem in which the restriction that each side is labelled with a
positive integer is relaxed to any integer, not necessarily positive. Can you design more pairs of dice?
4.9.2. Solve the recurrence $x_{n+1} = x_n + 2x_{n-1}$ for n ∈ N, with $x_0 = 0$ and $x_1 = 1$. Verify your
solution using induction.
4.9.3. Your friend wants to play the following game with you. You toss three fair 6-sided dice and
calculate the sum of outcomes. For every game you have to pay \$1. If the sum is 10 or 11
you get \$4, otherwise you get nothing. Is this game fair?
Chapter 5
Number Theory
THEORY
Divisibility For any two integers a and b, we say that a divides b (or a is a
divisor of b) if and only if b/a ∈ Z; that is, b = ak for some k ∈ Z. If a
divides b, then we write a | b; otherwise, we write a ∤ b.
For example, 5 | 15 since 15/5 = 3 ∈ Z. On the other hand, 5 ∤ 7.
Indeed, for a contradiction suppose that 5 | 7; that is, 7 = 5k for some
k ∈ Z. But this implies that k = 7/5 ∉ Z which gives the desired
contradiction.
For any b ∈ Z, we have 1 | b and b | b (since b/1 = b ∈ Z and
b/b = 1 ∈ Z). On the other hand, for any integer a > 1, a ∤ 1 (since
(5.1)
that is,
$n = \prod_{i=1}^{k} p_i^{\alpha_i} = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},$
This notation is often convenient; for example, it can be used to express the
following well-known Legendre’s formula that gives an expression for the
exponent of the largest power of a prime p that divides the factorial n!:
$\alpha_{n!}(p) = \sum_{i=1}^{\infty} \left\lfloor \frac{n}{p^i} \right\rfloor.$
(5.2)
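Legendre's formula is easy to evaluate by machine, since only finitely many terms of the sum are non-zero. The following Julia sketch illustrates this; the function name legendre_exponent is our own choice for illustration and is not part of any library.

```julia
# Exponent of the prime p in the factorization of n!, by Legendre's formula.
# `legendre_exponent` is a hypothetical helper name used only for this sketch.
function legendre_exponent(n::Integer, p::Integer)
    total = zero(n)
    q = n ÷ p              # floor(n / p^1)
    while q > 0            # terms with p^i > n are zero, so the sum is finite
        total += q
        q ÷= p             # floor(n / p^(i+1)) == floor(floor(n / p^i) / p)
    end
    return total
end

println(legendre_exponent(2018, 2))  # exponent of 2 in 2018!
println(legendre_exponent(10, 3))    # exponent of 3 in 10! = 3628800
```

The loop uses repeated integer division instead of computing powers of p, which avoids overflow for large n.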
Now, let us move to the proof. The main ingredient is the following
observation: every integer greater than 1 is divisible by a prime. The claim
is clearly true when n is a prime, as n | n. Let us then concentrate on
composite integers. For a contradiction, suppose that there are composite
numbers not divisible by any prime; let us call them bad. Let N be the
smallest bad number. (As we will see soon, it is convenient to concentrate
on the smallest potential bad number, as it means that no smaller composite
number is bad.) By the definition of composite numbers, N has at least 3
distinct divisors; in particular, there must be some divisor d | N with
d ≠ 1 and d ≠ N . By (5.1), if d | N then d ≤ N and therefore
1 < d < N . By assumption, d is not prime. Since d is a composite number
less than N, d is not bad and so there is prime p such that p | d (recall that
N is the smallest bad number). Since divisibility is transitive, we get that
p | d together with d | N implies that p | N ; rather, N is divisible by a
$n = \prod_{p \in P} p^{\alpha_n(p)}$ (the proof of its uniqueness is slightly longer and so we
skip it here). For a contradiction, suppose that there is some integer greater
than 1 for which there is no such function; let n be the smallest such
example. Note that n cannot be a prime number, for otherwise $\alpha_n(n) = 1$
$n/q = \prod_{p \in P} p^{\alpha_{n/q}(p)}.$
$n = q \prod_{p \in P} p^{\alpha_{n/q}(p)}$
and so
$\alpha_n(p) = \begin{cases} \alpha_{n/q}(p) & \text{if } p \neq q, \\ \alpha_{n/q}(q) + 1 & \text{if } p = q. \end{cases}$
PROBLEM
THEORY
d d d
n = dq + r and 0 ≤ r < d .
We call q the quotient and r the remainder of n when divided by d.
For example,
second part, we need to show that this pair is unique. We will prove these
two parts independently.
Existence: Let n ∈ Z and d ∈ N. Put q = ⌊n/d⌋ and r = n − dq. We need
to show that 0 ≤ r < d. From the definition of the floor function ⌊⋅⌋ it
follows that 0 ≤ n/d − ⌊n/d⌋ < 1, so 0 ≤ r/d < 1 and the assertion
follows.
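The existence argument is constructive, so it can be turned into a few lines of code. The sketch below is only an illustration: the name divide_with_remainder is hypothetical, and Julia's built-in fld is used to compute ⌊n/d⌋ exactly, also for negative n.

```julia
# Quotient and remainder as in the existence proof: q = floor(n/d), r = n - d*q.
# `divide_with_remainder` is a hypothetical name chosen for this sketch.
function divide_with_remainder(n::Integer, d::Integer)
    d > 0 || throw(ArgumentError("d must be a natural number"))
    q = fld(n, d)      # floored division, i.e. floor(n / d) computed exactly
    r = n - d * q
    return q, r
end

for n in (17, -17, 0, 20)
    q, r = divide_with_remainder(n, 5)
    println("$n = 5 * $q + $r")
end
```

Note that for negative n the remainder is still in the range 0 ≤ r < d, exactly as the statement requires.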
Uniqueness: Let n ∈ Z and d ∈ N and take any integers $q_1, q_2, r_1, r_2$ such that
$n = dq_1 + r_1 = dq_2 + r_2$ and $0 \le r_1, r_2 < d$. We will show first that
$r_1 = r_2$ and then that also $q_1 = q_2$. For a contradiction, suppose $r_1 \ne r_2$.
gcd(a, b) = gcd(b, r ) .
Clearly, a + b > b + r and so one can repeat this procedure until reaching
gcd(d, 0) for some d ∈ N and then the algorithm stops since gcd(d, 0) = d.
In order to prove the above key observation, let a, b ∈ N with a > b and
let d = gcd(a, b), e = gcd(b, r). Since d = gcd(a, b), d | a, d | b, and
thus also d | bq and d | a − bq = r. It follows that d is a common divisor
of b and r. Since e is the greatest such divisor, d ≤ e. Similarly, since
e = gcd(b, r), e | b, e | r, and thus also e | bq and e | bq + r = a. We
ax + by = gcd(a, b ) .
Instead of proving this property, we will show one example which not only
should convince the reader that the property above holds but also
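demonstrates how to actually find x, y which satisfy the desired equation.
Let us find integers x, y such that 425x + 112y = 1. The first step is to
find gcd(425, 112) using the Euclidean algorithm:
425 = 3 ⋅ 112 + 89 (⇒ 89 = 425 − 3 ⋅ 112)
112 = 1 ⋅ 89 + 23 (⇒ 23 = 112 − 89)
89 = 3 ⋅ 23 + 20 (⇒ 20 = 89 − 3 ⋅ 23)
23 = 1 ⋅ 20 + 3 (⇒ 3 = 23 − 20)
20 = 6 ⋅ 3 + 2 (⇒ 2 = 20 − 6 ⋅ 3)
3 = 1 ⋅ 2 + 1 (⇒ 1 = 3 − 2)
2 = 2 ⋅ 1 + 0,
so gcd(425, 112) = 1. Now, one can reverse all the operations to get:
1 = 3 − 2
= 3 − (20 − 6 ⋅ 3) = −20 + 7 ⋅ 3
= −20 + 7 ⋅ (23 − 20) = −8 ⋅ 20 + 7 ⋅ 23
= −8(89 − 3 ⋅ 23) + 7 ⋅ 23 = 31 ⋅ 23 − 8 ⋅ 89
= 31 ⋅ (112 − 89) − 8 ⋅ 89 = 31 ⋅ 112 − 39 ⋅ 89
= 31 ⋅ 112 − 39 ⋅ (425 − 3 ⋅ 112) = −39 ⋅ 425 + 148 ⋅ 112,
so x = −39 and y = 148 satisfy the equation.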
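The back-substitution for 425x + 112y = 1 can be automated by carrying the coefficients along with the remainders. The following is a minimal sketch of this idea; the name extended_gcd is ours (Julia's Base also provides a function gcdx for the same purpose).

```julia
# Extended Euclidean algorithm: returns (g, x, y) with a*x + b*y == g == gcd(a, b).
# `extended_gcd` is a hypothetical name; Julia's Base also provides `gcdx`.
function extended_gcd(a::Integer, b::Integer)
    # Invariants: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = one(a), zero(a)
    old_y, y = zero(a), one(a)
    while r != 0
        q = fld(old_r, r)
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    end
    return old_r, old_x, old_y
end

g, x, y = extended_gcd(425, 112)
println((g, x, y))
println(425x + 112y == g)
```

The sketch assumes natural-number inputs, as in the text; each pass through the loop corresponds to one line of the Euclidean algorithm.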
$\gcd(a, b) = \prod_{p \in P} p^{\min\{\alpha_a(p), \alpha_b(p)\}}.$
(5.3)
In particular, this observation shows that if one wants to compute gcd(a, b),
then each prime factor can be considered independently. Let us also notice
that gcd(a, a) = a, gcd(a, 1) = 1, gcd(a, b) = gcd(b, a) and
gcd(a, b, c) = gcd(gcd(a, b), c). The last equality follows from (5.3) and
the fact that for any i, k, ℓ ∈ N ∪ {0}, min{i, k, ℓ} = min{min{i, k}, ℓ}.
SOLUTION
We will independently consider the following three cases which will finish
the proof.
Case 1: α = min{α, β, γ, δ}. Both the left hand side and the right hand
side are clearly equal to 2α.
Case 2: β = min{α, β, γ, δ}. Since α + β = γ + δ, we get
gcd (a, c)⋅ gcd (a, d) = q⋅ gcd (a/q, c/q)⋅ gcd (a/q, d)
2
= q ⋅ (a/q)⋅ gcd (a/q, b, c/q, d)
In both cases, the desired equality holds and so the claim holds by
induction.
EXERCISES
5.1.2. You are given two natural numbers a and b. Prove that if $a + b \mid a^2$,
then $a + b \mid b^2$.
Suppose that
$a^2 + b^2 + c^2 + d^2 = 2018!$
(5.4)
a ± c ≡ b ± d ( mod n),
(5.5)
ac ≡ bd ( mod n).
(5.6)
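$= 9^{1009} \cdot 3 \equiv (-1)^{1009} \cdot 3 = -3 \equiv 7 \pmod{10}.$
$2^2 = 2 \cdot 2 \equiv 2 \cdot 2 = 4 \pmod{10}$
$2^3 = 2^2 \cdot 2 \equiv 4 \cdot 2 = 8 \pmod{10}$
$2^4 = 2^3 \cdot 2 \equiv 8 \cdot 2 = 16 \equiv 6 \pmod{10}$
$2^5 = 2^4 \cdot 2 \equiv 6 \cdot 2 = 12 \equiv 2 \pmod{10}$
$2^6 = 2^5 \cdot 2 \equiv 2 \cdot 2 = 4 \pmod{10}$
$2^k = 2^{k-1} \cdot 2 \equiv \dots$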
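These reductions are exactly what modular exponentiation does step by step. As a small illustration (a sketch using Julia's built-in powermod, not a listing from the original text), one can print the last digits of the first few powers of 2 and the last digit of $3^{2019} = 9^{1009} \cdot 3$:

```julia
# Last digits of successive powers of 2; the pattern 2, 4, 8, 6 repeats with period 4.
last_digits = [powermod(2, k, 10) for k in 1:12]
println(last_digits)

# powermod reduces after every multiplication, so huge exponents are cheap.
println(powermod(3, 2019, 10))   # the computation 9^1009 * 3 reduced modulo 10
```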
(5.7)
contradiction.
Here is another useful property: if gcd(a, n) = 1, then
(5.8)
1x ≡ 1y (mod n)
x ≡ y (mod n).
SOLUTION
For a contradiction, suppose that at least one of a, b, c, d is smaller than or
equal to $10^{250} < 16^{250} = 2^{1000}$. Let α be the largest non-negative integer
such that $2^{\alpha}$ divides each of a, b, c, d; that is, using the unique factorization
theorem,
$\alpha_{2018!}(2) = \sum_{i=1}^{\infty} \left\lfloor \frac{2018}{2^i} \right\rfloor = 2011.$
$a'^2 + b'^2 + c'^2 + d'^2 = \frac{2018!}{2^{2\alpha}}.$
(5.9)
congruent to 0, 1, or 4 ( mod 8). It is easy to see then that the left hand
side of equality (5.9) is not congruent to 0 ( mod 8), and we get the
desired contradiction.
REMARKS
The key idea in our solution is to notice that if the right hand side of
equality (5.4) is divisible by 8, then each of a, b, c, d must be divisible by 2,
as explained above. With this observation in hand, one can divide both sides
of equality (5.4) by 4, and repeat the process recursively as long as the right
hand side is divisible by 8. It shows that each of the four numbers must be
large. The proof presented above uses this idea but avoids the recursive
argument by noticing that in the unique factorization of 2018! prime
number 2 is raised to a large power.
EXERCISES
5.2.2. You are given three consecutive natural numbers (say, a, a + 1, and
a + 2) such that the middle one is a cube (that is, $a + 1 = \ell^3$ for some
5.2.3. Prove that for any natural n ∈ N that is not divisible by 10 there
exists k ∈ N such that nk has in its decimal representation the same digit at
the first and the last position.
(Source of the problem and solution idea: “Delta” monthly – January,
2017.)
5.3 Factorization
SOURCE
PROBLEM
Consider any prime number p > 2. Prove that there exists exactly one
n ∈ N such that $n^2 + np$ is a square; that is, $n^2 + np = k^2$ for some
k ∈ N.
THEORY
In this section, we will use a basic but useful fact that follows immediately
from the fundamental theorem of arithmetic:
(5.10)
For example, we will use (5.10) to show that √2 is irrational. (Recall that a
real number q is rational if q = a/b for some a, b ∈ Z and b ≠ 0; a real
number r is irrational if it is not rational.) For a contradiction, suppose that
there are integers a, b, b ≠ 0 such that √2 = a/b. We may assume that
gcd(a, b) = 1 (we may write √2 in lowest terms). Now, note that
(1, 1, pq).
SOLUTION
Suppose that $n^2 + np = k^2$ for some k ∈ N; that is,
$np = k^2 - n^2 = (k - n)(k + n).$
Since gcd(n′, k′) = 1 we have gcd(n′, k′ − n′) = 1 and gcd(n′, k′ + n′) = 1.
It follows that n′ has to divide x, that is, x′ = x/n′ ∈ N and so
p = x′(k′ − n′)(k′ + n′).
(5.11)
After noting that p is a prime and k′ − n′ < k′ + n′, we get that the product
solution of the system, provided that both n and k are natural numbers.
In order to finish the proof, we will use the fact that p is a prime greater
than 2; in particular, it is odd: p = 2ℓ + 1 for some ℓ ∈ N. We get that
$k = (p^2 - 1)/4 = (4\ell^2 + 4\ell)/4 = \ell^2 + \ell \in N$ and
$n = (p - 1)^2/4 = (2\ell)^2/4 = \ell^2 \in N$, and the proof is finished.
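The formula $n = (p - 1)^2/4$ can be sanity-checked by brute force for a few small primes. The sketch below is an illustration only: the helper square_solutions and the search bound $p^2$ are our own choices, and a finite search of course does not replace the uniqueness proof.

```julia
# Brute-force check of the claim for a few small odd primes.
# `square_solutions` is a hypothetical helper: it lists every n in 1:limit
# for which n^2 + n*p is a perfect square.
function square_solutions(p::Integer, limit::Integer)
    return [n for n in 1:limit if isqrt(n^2 + n * p)^2 == n^2 + n * p]
end

for p in (3, 5, 7, 11, 13)
    found = square_solutions(p, p^2)      # p^2 is an arbitrary search bound
    println(p, " => ", found, ", formula gives ", (p - 1)^2 ÷ 4)
end
```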
REMARKS
get that
$1 + p^2 = (2n + p - 2k) + (2n + p + 2k) = 4n + 2p$
and so $n = (p - 1)^2/4$.
EXERCISES
5.3.1. You are given two integers, a and b, and a prime p > 2. Prove that if
$p \mid a + b$ and $p \mid a^2 + b^2$, then $p^2 \mid a^2 + b^2$.
THEORY
equivalent, that is, if one of them is true then the other one must also be
true.
Fix any prime number p. We will prove Fermat’s little theorem by
induction on n. The base case is trivial: $1^p \equiv 1 \pmod p$. For the
$p \mid \binom{p}{i}$ or, equivalently, $\binom{p}{i} \equiv 0 \pmod p$. Indeed,
$\binom{p}{i} = \frac{p(p - 1) \cdots (p - i + 1)}{i(i - 1) \cdots 1};$
$\phi(q) = p^k - p^{k-1} = q\left(1 - \frac{1}{p}\right).$
To see this observe that there are exactly $p^{k-1}$ numbers between 1 and q that
are of the form i ⋅ p, where $i \in [p^{k-1}]$; these are the only numbers in that
(5.12)
d ≡ r (d ) = r (d ) ≡ d
1 q 1 q 2( mod q), we have p | d − d
2 and 1 2
Now, notice that the codomain of function f is of size |[p]| ⋅ |[q]| = pq,
exactly the same as the size of its domain. Hence, function f is not only a
one-to-one function but it is also onto (and so it is a bijection).
We are finally ready to show (5.12). Notice that d is co-prime to pq if and
only if it is co-prime to p and to q. Moreover, by the Euclidean algorithm, d
is co-prime to p if and only if $r_p(d)$ is co-prime to p. Similarly, d is co-prime
to q if and only if $r_q(d)$ is co-prime to q. Hence, d is co-prime to pq if and only if $r_p(d)$ is co-prime to
p and $r_q(d)$ is co-prime to q. The number of such pairs is ϕ(p) ⋅ ϕ(q) and the
proof of this property is finished.
i=1
ki
i
(pi ) is a
sequence of prime numbers and k ∈ N for all i ∈ [ℓ].
i
Note that Fermat’s little theorem is indeed a special case of Euler’s theorem,
because if q is a prime number, then ϕ(q) = q − 1. (See Property 1 above.)
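Both the properties of ϕ and Euler's theorem itself are easy to test numerically for small moduli. The following sketch does this by direct counting; the names totient and euler_holds are hypothetical helpers introduced only for this illustration.

```julia
# Euler's totient function by direct counting (fine for small q).
# `totient` is a hypothetical helper written for this sketch.
totient(q::Integer) = count(d -> gcd(d, q) == 1, 1:q)

# Check n^phi(q) ≡ 1 (mod q) for every n co-prime to q.
function euler_holds(q::Integer)
    phi = totient(q)
    return all(powermod(n, phi, q) == 1 for n in 1:q if gcd(n, q) == 1)
end

println(totient(7))                         # prime modulus: phi(7) = 7 - 1
println(totient(27))                        # prime power: 3^3 - 3^2
println(totient(35) == totient(5) * totient(7))
println(all(euler_holds, 2:50))
```

Counting co-prime residues one by one is slow for large q; it is used here only because it mirrors the definition of ϕ directly.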
In order to prove Euler’s theorem, let r , r , …, r be all the numbers
1 2 ϕ(q)
with q, it follows from (5.8) that r ≡ r ( mod q), which is not possible.
i j
$n \cdot r_i \equiv s_i \pmod q$ for all i, we get
$n^{\phi(q)} \prod_{i=1}^{\phi(q)} r_i \equiv \prod_{i=1}^{\phi(q)} s_i \pmod q.$
Using the fact that all $r_i$ are co-prime with q and that S = R, we can use
(5.8) to get the desired result.
$x \equiv a_1 \pmod{n_1}$
$\vdots$
$x \equiv a_k \pmod{n_k},$
$N_i N_i^{-1} \equiv 1 \pmod{n_i}$. Let
$\hat{x} = a_1 N_1 N_1^{-1} + a_2 N_2 N_2^{-1} + \dots + a_k N_k N_k^{-1}.$
Clearly, $\hat{x}$ is a simultaneous solution to all of the congruences. Indeed, for
Let us note that the proof is constructive; that is, it gives us an explicit
formula for the solution. For example, let us solve the following system of
congruences:
x ≡ 1 (mod 5)
x ≡ 2 (mod 7)
x ≡ 3 (mod 9)
x ≡ 4 (mod 11).
Note that the moduli are pairwise relatively prime, as required by the
Chinese remainder theorem. Using the notation from the proof, we have
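$N = 5 \cdot 7 \cdot 9 \cdot 11 = 3465$, $N_1 = N/5 = 693$, $N_2 = N/7 = 495$,
$N_3 = N/9 = 385$, $N_4 = N/11 = 315$, and
$N_1^{-1} = 2$ ($693 \cdot 2 \equiv 3 \cdot 2 = 6 \equiv 1 \pmod 5$),
$N_2^{-1} = 3$ ($495 \cdot 3 \equiv 5 \cdot 3 = 15 \equiv 1 \pmod 7$),
$N_3^{-1} = 4$ ($385 \cdot 4 \equiv 7 \cdot 4 = 28 \equiv 1 \pmod 9$),
$N_4^{-1} = 8$ ($315 \cdot 8 \equiv 7 \cdot 8 = 56 \equiv 1 \pmod{11}$).
Hence $\hat{x} = 1 \cdot 693 \cdot 2 + 2 \cdot 495 \cdot 3 + 3 \cdot 385 \cdot 4 + 4 \cdot 315 \cdot 8 = 19056 \equiv 1731 \pmod{3465}$.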
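Since the proof of the Chinese remainder theorem is constructive, the formula for $\hat{x}$ can be coded directly. The sketch below applies it to the system with moduli 5, 7, 9, 11 from the example; the function name crt is our own, and the built-in invmod is used to find each $N_i^{-1}$.

```julia
# Constructive Chinese remainder theorem, following the formula from the proof.
# `crt` is a hypothetical name; the moduli are assumed to be pairwise co-prime.
function crt(a::Vector{<:Integer}, n::Vector{<:Integer})
    N = prod(n)
    x = zero(N)
    for (ai, ni) in zip(a, n)
        Ni = N ÷ ni
        x += ai * Ni * invmod(Ni, ni)   # invmod(Ni, ni) is the inverse of Ni modulo ni
    end
    return mod(x, N)
end

x = crt([1, 2, 3, 4], [5, 7, 9, 11])
println(x)
println([mod(x, m) for m in (5, 7, 9, 11)])
```

Reducing the result modulo N returns the unique solution in the range 0, 1, …, N − 1.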
$2^k = \frac{(y^2 + 2^k) - (y^2 - 2^k)}{2} = \frac{17 - 1}{2} = 8$
and
$y^2 = \frac{(y^2 + 2^k) + (y^2 - 2^k)}{2} = \frac{17 + 1}{2} = 9,$
which gives us x = 2k = 6 and y = 3. One can easily verify that indeed
$2^6 + 17 = 3^4$.
REMARKS
yield an easy solution to the problem. Hence, the main difficulty in this
problem is to show that x is even. We present one way to show it but,
alternatively, one can use an argument similar to the one from Section 5.2.
We observe that for x = 1, 2, 3, 4, 5, 6, 7, 8
$2^x \equiv 2, 4, 8, 16, 15, 13, 9, 1 \pmod{17}$
and so {2, 4, 8, 16, 15, 13, 9, 1} are the only possible remainders when the
left hand side of the equation is divided by 17.
On the other hand, the only possible remainders when $y^4$, the right hand
side of the equation, is divided by 17 are 0, 1, 4, 13, and 16, which can be
verified by enumerating the remainders of $0^4, 1^4, \dots, 16^4$ when divided by
17. Here is a line of code written in the Julia language that performs this
calculation and shows the result:
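A one-liner along the following lines performs the enumeration. It is a sketch of ours using broadcasting over the range 0:16, and is not necessarily the authors' original line.

```julia
# Distinct remainders of y^4 modulo 17 for y = 0, 1, ..., 16.
println(sort(unique(mod.((0:16) .^ 4, 17))))
```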
i=1
i
∑
j=0
2
j
.
5.4.3. Find the last two digits in the decimal representation of $7^{123}$.
THEORY
with $a_k \ne 0$ and $0 \le a_i \le 9$ for 0 ≤ i ≤ k. Here are some standard rules of
divisibility:
5. 3 | n if and only if $3 \mid \sum_{i=0}^{k} a_i$ (the sum of digits is divisible by
3),
9).
The proofs are straightforward so we only show the rule for 3 in order to
illustrate the argument. Note that 10 ≡ 1 (mod 3), so for any i ∈ N we
get $10^i \equiv 1^i \equiv 1 \pmod 3$. Thus
$3 \mid n \Leftrightarrow n \equiv 0 \pmod 3$
$\Leftrightarrow 10^k a_k + 10^{k-1} a_{k-1} + \dots + 10 a_1 + a_0 \equiv 0 \pmod 3$
$\Leftrightarrow a_k + a_{k-1} + \dots + a_1 + a_0 \equiv 0 \pmod 3.$
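The rule for 3 can be checked mechanically against the definition of divisibility. In the sketch below, divisible_by_3 is a hypothetical helper that looks only at the decimal digits (obtained with Julia's built-in digits function).

```julia
# Divisibility by 3 decided from the decimal digits only.
# `divisible_by_3` is a hypothetical helper name.
divisible_by_3(n::Integer) = sum(digits(abs(n))) % 3 == 0

println(divisible_by_3(123456))    # digit sum 21
println(divisible_by_3(1000001))   # digit sum 2
println(all(divisible_by_3(n) == (n % 3 == 0) for n in 0:10_000))
```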
SOLUTION
$k_{i+1} > 2k_i$. Our goal is to show that 3 | i; however, in fact, we will show
into i/3 sets, each of the form {d, 2d, 4d} for some odd positive integer d.
We will show that if $d \le k_i$ is an odd divisor of n, then both 2d and 4d
are not only divisors of n (which is obvious) but also $2d < 4d \le k_i$. This
will finish the proof as it implies that the desired partition exists. Consider
any odd divisor $d \le k_i$ of n. By our assumption, $2d \le 2k_i < k_{i+1}$ but this
REMARKS
The key observation that leads to the solution is to notice that if $k_{i+1} > 2k_i$,
EXERCISES
5.5.1. Decide if there exists k ∈ N with the property that in the decimal
representation of $2^k$ each of the 10 digits (0, 1, 2, …, 9) is present the same
number of times.
(Source of the problem and solution: LXX – Phase 1 – Problem 1.)
5.5.2. Find the minimum of $|20^m - 9^n|$ over all natural numbers m and n.
(Source of the problem and solution idea: LXIV – Phase 1 – Problem 5.)
5.6 Remainders
SOURCE
n ∈ N ∖ {1}, $a_n = a_{n-1} + a_{\lfloor n/2 \rfloor}$. Prove that there are infinitely many
terms in the sequence $(a_i)$ that are divisible by 7.
THEORY
S = {a + ib : i ∈ Z, k + 1 ≤ i ≤ k + p } .
SOLUTION
Let us first observe that $a_2 = a_1 + a_1 = 2$, $a_3 = a_2 + a_1 = 3$,
$S = \{a_{4k-3} + i\,a_{2k-1} : i \in Z, 0 \le i \le 6\}.$
Since prime number 7 does not divide $a_{2k-1}$, it follows from the observation
we made in the theory part that all numbers in S yield unique remainders
when divided by 7. Hence, one of them is equal to 0 and so one of the
consecutive terms we consider is divisible by 7. We get the desired
contradiction and the proof is finished.
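To get a feeling for the statement, one can tabulate the remainders of the sequence modulo 7. The sketch below assumes the initial value $a_1 = 1$, which is what the computation $a_2 = a_1 + a_1 = 2$ implies; the helper name remainders_mod7 and the cut-off 200 are arbitrary choices of ours.

```julia
# Remainders modulo 7 of a_1 = 1, a_n = a_{n-1} + a_{floor(n/2)} for n >= 2.
# `remainders_mod7` is a hypothetical helper name. Working modulo 7 throughout
# keeps the numbers small even though a_n itself grows quickly.
function remainders_mod7(N::Integer)
    r = zeros(Int, N)
    r[1] = 1
    for n in 2:N
        r[n] = mod(r[n - 1] + r[n ÷ 2], 7)
    end
    return r
end

r = remainders_mod7(200)
println(r[1:10])
println(findall(==(0), r))   # indices n <= 200 with 7 | a_n
```

Such an experiment only lists finitely many terms divisible by 7, so it illustrates the claim rather than proving it.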
REMARKS
Since our task was to prove that there are infinitely many values of
n ∈ N for which $7 \mid a_n$, it is clear that one should consider the recurrence
$a_n = a_{n-1} + a_{\lfloor n/2 \rfloor}$ but concentrate on remainders when $a_n$ is divided by 7.
EXERCISES
(Source of the problem and solution idea: General Mathematics Vol. 15, No.
4 (2007), 145–148.)
i=1 i
2011
i=1
(Source of the problem and solution idea: PLMO LXII – Phase 3 – Problem
3.)
(Source of the problem and solution idea: PLMO LXI – Phase 3 – Problem
1.)
5.7 Aggregation
SOURCE
Problem and solution idea: PLMO XLII – Phase 1 – Problem 11 (modified)
PROBLEM
Let n > 1,000 be any integer. For i ∈ [n], let $r_i$ be the remainder when $2^n$ is
divided by i. Prove that $\sum_{i=1}^{n} r_i > 3.5n$.
THEORY
$\sum_{i=0}^{n-1} (a + i \cdot b) = n \cdot a + \frac{n(n - 1)}{2} \cdot b$
Similarly,
$\sum_{i=0}^{n-2} b^i = \frac{b^{n-1} - 1}{b - 1}$
i=0
i
$\sum_{i=1}^{n} r_i \ge \lceil n/2 \rceil - 1.$
In order to get the desired lower bound of 3.5n, we need to use a slightly
more delicate argument.
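Before refining the argument, it may help to see how large the sum of remainders actually is. The following sketch computes it for a few values of n; the helper name remainder_sum and the sample values are our own choices, and powermod is used so that $2^n$ is never formed explicitly.

```julia
# Sum of the remainders r_i of 2^n modulo i, for i = 1, ..., n.
# `remainder_sum` is a hypothetical helper; r_1 = 0, so the sum starts at i = 2.
remainder_sum(n::Integer) = sum(powermod(2, n, i) for i in 2:n)

for n in (1001, 2000, 5000)
    s = remainder_sum(n)
    println(n, ": sum = ", s, ", 3.5n = ", 3.5n)
end
```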
Let us first note that for any number i ∈ [n], there exist unique
a, b ∈ N ∪ {0}, such that
$i = 2^a \cdot (2b + 1).$
If b = 0, then $i = 2^a$ divides $2^n$ and so $r_i = 0$. As a result, we focus on
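$\left\lceil \frac{\lfloor n/2^a \rfloor}{2} \right\rceil - 1 \ge \frac{\lfloor n/2^a \rfloor}{2} - 1 \ge \frac{n}{2^{a+1}} - 3/2$
numbers of “type a.”
Let us now concentrate on any number of “type a”: $i = 2^a (2b + 1) \in [n]$,
b ≥ 1. The key observation is that
$r_i = 2^n - \left\lfloor \frac{2^n}{i} \right\rfloor \cdot i = \left( \frac{2^n}{i} - \left\lfloor \frac{2^n}{i} \right\rfloor \right) \cdot i$
$= \left( \frac{2^{n-a}}{2b+1} - \left\lfloor \frac{2^{n-a}}{2b+1} \right\rfloor \right) \cdot 2^a (2b + 1)$
$\ge \frac{1}{2b+1} \cdot 2^a (2b + 1) = 2^a,$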
REMARKS
Let us mention that our argument above easily gives a slightly stronger
asymptotic lower bound. Indeed,
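$\sum_{i=1}^{n} r_i \ge \sum_{a=0}^{a_{\max}} \left( \frac{n}{2} - \frac{3}{2} \cdot 2^a \right) = (a_{\max} + 1) \cdot \frac{n}{2} + O(2^{a_{\max}})$
$= (\log_2 n + O(1)) \cdot \frac{n}{2} + O(n) = \frac{n \log_2 n}{2} + O(n) \sim \frac{n \log_2 n}{2}.$
$\sum_{i=1}^{n} i/2 \sim n^2/4.$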
EXERCISES
5.7.1. Prove that if the sum of positive divisors of some natural number n is
odd, then either n is a square or n/2 is a square.
5.7.2. Find all natural numbers n for which there exist 2n pairwise different
numbers $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_n$ such that $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$ and
$\prod_{i=1}^{n} a_i = \prod_{i=1}^{n} b_i$.
(Source of the problem and solution idea: PLMO LXI – Phase 1 – Problem
10.)
5.8 Equations
SOURCE
PROBLEM
Find all pairs of integers (x, y) that satisfy the following equation:
$x^2 (y - 1) + y^2 (x - 1) = 1.$
THEORY
Ax + By = C,
where A, B, C are given integers. Although the practical applications of
Diophantine analysis have been somewhat limited in the past, this kind of
analysis has become much more important in the digital age. In particular,
Diophantine equations play an important role in the theory of public-key cryptography.
In order to warm up, we concentrate on the simplest and best-understood
equations, namely, linear equations. Let us first note that not all linear
Diophantine equations have a solution. For example, 10x + 5y = 3 does
not have a solution as for any pair of two integers x and y, the left hand side
of this equation is divisible by 5 whereas the right hand side is not.
Fortunately, there is a formal process to determine whether the equation has
a solution or not. Indeed, finding all solutions to linear Diophantine
equations involves finding an initial solution, and then altering that solution
in some way to find the remaining solutions. For the first task we are going
to use Bézout’s Identity.
Bézout’s Identity Let us recall that in Section 5.1 we used the extended
Euclidean algorithm to prove that for any A, B ∈ Z ∖ {0}, there exist
x, y ∈ Z such that
Ax + By = gcd(A, B ) .
$(x, y) = \left( \hat{x} + m \frac{B}{\gcd(A, B)},\ \hat{y} - m \frac{A}{\gcd(A, B)} \right)$
$A\left( \hat{x} + m \frac{B}{\gcd(A, B)} \right) + B\left( \hat{y} - m \frac{A}{\gcd(A, B)} \right) = A\hat{x} + B\hat{y} = C,$
which shows these are indeed solutions to the equation. On the other hand,
given any solution (x, y), we have
$Ax + By = A\hat{x} + B\hat{y}$
and so
$\frac{A}{\gcd(A,B)} (x - \hat{x}) = - \frac{B}{\gcd(A,B)} (y - \hat{y}).$
Since $\frac{A}{\gcd(A,B)}$ and $\frac{B}{\gcd(A,B)}$ are relatively prime, there exists an integer m
such that $x - \hat{x} = m \frac{B}{\gcd(A,B)}$ and $y - \hat{y} = -m \frac{A}{\gcd(A,B)}$. This shows that
there are no more solutions.
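The two steps (find one solution via Bézout's Identity, then shift it) can be sketched in code as follows. The function solve_linear is a hypothetical helper of ours; it relies on Julia's built-in gcdx, which returns the gcd together with Bézout coefficients.

```julia
# One solution of A*x + B*y = C plus the step of the general solution,
# or `nothing` when gcd(A, B) does not divide C.
# `solve_linear` is a hypothetical helper; `gcdx` is Julia's built-in extended gcd.
function solve_linear(A::Integer, B::Integer, C::Integer)
    g, u, v = gcdx(A, B)               # A*u + B*v == g
    C % g == 0 || return nothing
    scale = C ÷ g
    x0, y0 = u * scale, v * scale      # particular solution
    return (x0, y0, B ÷ g, A ÷ g)      # all solutions: (x0 + m*dx, y0 - m*dy)
end

println(solve_linear(10, 5, 3))        # no solution: 5 does not divide 3
x0, y0, dx, dy = solve_linear(425, 112, 1)
println(all(425 * (x0 + m * dx) + 112 * (y0 - m * dy) == 1 for m in -3:3))
```

The particular solution returned depends on which Bézout coefficients gcdx picks, but the whole family of solutions is the same.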
SOLUTION
Let us first observe that, due to the symmetry, without loss of generality we
may assume that x ≤ y. In other words, solutions to this equation come in
pairs: if (x, y) = (a, b) is a solution, then so is (x, y) = (b, a).
If x = 1, then y has to satisfy y − 1 = 1 and we get two solutions:
(x, y) = (2, 1) and (x, y) = (1, 2). If x = 2, then $4(y - 1) + y^2 = 1$ and
and so
$xy(x + y) + 2xy = 1 + (x + y)^2.$
Next,
xy(x + y + 2 ) = (x + y + 2)(x + y − 2) + 5
and finally
(xy − x − y + 2)(x + y + 2) = 5.
There are 4 cases to consider. We will see that none of them yields new
solutions and so the proof will be finished.
re-discover two solutions, (x, y) = (2, −5) and (x, y) = (−5, 2).
The key idea of the proposed solution is to use a factorization that reduces
the solution to 4 simple cases. However, it is not clear how to transform the
equation to get this desired form. In order to see this, it is easier to
substitute a = x − 1 and b = y − 1 to get
$(a + 1)^2 b + (b + 1)^2 a = 1,$
which expands to
ab(a + b) + 4ab + a + b = 1.
It follows that
ab(a + b + 4) + a + b = 1,
and now it is much easier to see that it is enough to add 4 to both sides to
reach the desired factorization,
(ab + 1)(a + b + 4 ) = 5.
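The factorization can be cross-checked by an exhaustive search over a finite box. This is only a sketch: the bound 100 and the helper name solutions_in_box are arbitrary choices of ours, and a bounded search illustrates the result but does not replace the proof.

```julia
# Exhaustive search for integer solutions of x^2 (y - 1) + y^2 (x - 1) = 1
# in the box |x|, |y| <= bound.
function solutions_in_box(bound::Integer)
    return [(x, y) for x in -bound:bound for y in -bound:bound
            if x^2 * (y - 1) + y^2 * (x - 1) == 1]
end

sols = solutions_in_box(100)
println(sols)

# Every solution also satisfies the factorized form (xy - x - y + 2)(x + y + 2) = 5.
println(all((x * y - x - y + 2) * (x + y + 2) == 5 for (x, y) in sols))
```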
EXERCISES
(Source of the problem and solution idea: PLMO LVI – Phase 3 – Problem
1.)
5.8.3. Find all natural numbers satisfying the following system of equations:
a + b + c = xyz,
x + y + z = abc,
6.1 Circles
6.2 Congruence
6.3 Similarity
6.4 Menelaus's Theorem
6.5 Parallelograms
6.6 Power of a Point
6.7 Areas
6.8 Thales' Theorem
goes through point C; this line intersects line AB at point C′ and is called the orthogonal
projection of C on line AB.
A polygon is a plane figure that is bounded by a finite sequence of straight line segments
closing in a loop to form a closed polygonal circuit. These segments are called its edges (or
sides), and the points where two edges meet are the polygon’s vertices. An n-gon is a
polygon with n sides; for example, a triangle is a 3-gon and a quadrilateral is a 4-gon. Finally, an
equilateral triangle is a triangle in which all three sides are equal, and an isosceles triangle is
a triangle that has two sides of equal length.
6.1 Circles
SOURCE
Problem: “Exercises in geometry” (in Polish) by Waldemar Pompe – Problem 19
Solution: our own
PROBLEM
Points E and F lie on sides AB and BC of square ABCD and |BE| = |BF|. Point S is the
orthogonal projection of B on line CE. Show that angle ∠DSF is the right angle.
THEORY
Circumcircle The circumcircle is a triangle’s circumscribed circle, that is, the unique circle
that passes through each of the triangle’s three vertices. The center of the circumcircle is
called the circumcenter, and the circle’s radius is called the circumradius. The circumcenter’s
position depends on the type of triangle.
If and only if it is a right triangle (that is, a triangle in which one angle is the
right angle), the circumcenter lies on one of its sides (namely, the
hypotenuse, the longest side of a right triangle, opposite the right angle).
If and only if a triangle is acute (all angles smaller than the right angle), the
circumcenter lies inside the triangle.
If and only if it is obtuse (has one angle larger than the right angle), the
circumcenter lies outside the triangle.
Central and Inscribed Angles A central angle of a circle is an angle whose vertex is the
center O of the circle and whose sides, called radii, are line segments from O to two points
on the circle. In Figure 6.1, ∠BOC is a central angle and we say that it intercepts the arc
BC. An inscribed angle of a circle is an angle whose vertex is a point A on the circle and
whose sides are line segments, called chords, from A to two other points on the circle. In
Figure 6.1, ∠BAC is an inscribed angle that intercepts the arc BC.
Here is a very useful relation between inscribed and central angles. If an inscribed angle
∠BAC and a central angle ∠BOC intercept the same arc, then
∠BOC = 2∠BAC.
As a result, inscribed angles which intercept the same arc are equal.
FIGURE 6.1: Relation between central and inscribed angles.
Let us stress that a central angle can be more than π. If A and O lie on the same side of
line BC, then ∠BOC is smaller than π (and so ∠BAC is acute). On the other hand, if
A and O are on different sides of BC, then it is greater than π (and so ∠BAC is obtuse).
Finally, let us mention that often we are not explicitly told that we deal with inscribed
angles. For example, perhaps we are given two triangles ABC and ABC′ where C and C′ lie
on the same side of line AB. We then make a connection and notice that
∠ACB = ∠AC′B if and only if they have the same circumcircle. Similarly, suppose
that C and C′ lie on different sides of line AB. In this case, ∠ACB + ∠AC′B = π if
Three lines bisecting angles of a triangle intersect in one point. This point is
called the incenter of the triangle and is a center of the inscribed circle; that
is, the largest circle contained within the triangle.
The three perpendicular bisectors of the sides of a triangle meet in one point,
the circumcenter of the triangle.
The altitude of a triangle is a line which passes through a vertex of the triangle and is
perpendicular to the opposite side (possibly extended). There are therefore three altitudes in
a triangle. As a potential application of the above observations, let us show that the three
altitudes of the triangle intersect in a single point called orthocenter. Note that the
orthocenter is not always inside the triangle; if the triangle is obtuse, it will be outside.
Let us first consider any acute triangle ABC. Take the orthogonal projection A′ of A on
BC and the orthogonal projection B′ of B on AC. Denote the intersection point of altitudes AA′
and BB′ as H (see Figure 6.2). First, note that ∠ABB′ < ∠ABC and
∠BAA′ < ∠BAC so H lies inside triangle ABC. Now, observe that C, A′, H and B′
Arguing similarly, we get that A, B′, A′, B lie on a circle and so ∠B′A′A = ∠B′BA.
But this means that if C′ is an intersection of CH with AB, then ∠AC′C = π/2 as
The reasoning when ∠BCA is not acute is identical with the only difference that H lies
outside of triangle ABC , if the triangle is obtuse, or H lies on the triangle, if the triangle is
right.
SOLUTION
Let F′ be the point on line segment AD such that |F′A| = |FB|. First, note that
∠ABF′ = ∠BCE. Hence, lines F′B and CE are orthogonal and so S lies on the
intersection of the two. But this means that ∠SF′F = ∠SCF. (In fact, these two
angles are equal to the previously mentioned two but we do not need this observation here.)
It follows that triangles SF′F and SCF have the same circumcircle and so it contains points
S, F, C, F′. Now, since FF′DC is a rectangle with its three vertices (F′, F, and C) lying on
the circle, the fourth vertex, D, must also lie on this circle. It follows that DF is a diameter
of the circle and, consequently, ∠DSF is the right angle.
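A quick coordinate check can make the statement more tangible. The sketch below places the square at A = (0, 0), B = (1, 0), C = (1, 1), D = (0, 1) and takes |BE| = |BF| = t; these coordinates and the helper name dsf_dot are assumptions of ours, not part of the original solution.

```julia
# Coordinate check of the statement for the unit square
# A = (0,0), B = (1,0), C = (1,1), D = (0,1) and |BE| = |BF| = t.
function dsf_dot(t::Real)
    B, C, D = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)
    E, F = (1.0 - t, 0.0), (1.0, t)
    dir = E .- C                                   # direction of line CE
    s = sum((B .- C) .* dir) / sum(dir .* dir)     # projection parameter of B
    S = C .+ s .* dir                              # foot of the perpendicular from B
    return sum((D .- S) .* (F .- S))               # zero iff angle DSF is right
end

for t in (0.1, 0.25, 0.5, 0.9)
    println(t, " => ", dsf_dot(t))
end
```

The printed dot products vanish up to floating-point rounding, in agreement with the synthetic proof.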
FIGURE 6.3: Illustration for Problem 6.1.
REMARKS
In geometric problems, very often drawing an additional point, line, or circle significantly
helps to find a solution. In our case, introducing an auxiliary point F′ turned out to be very
helpful. How can one think of such a point? It is natural to extend line segment BS beyond
point S and then to notice that the point lying on the intersection of this line with AD forms a
rectangle AF′FB whose diagonal, F′B, contains line segment BS.
′ ′
EXERCISES
6.1.1. We are given an acute triangle ABC with ∠ACB = π/3. Let A′ be the orthogonal
projection of A on BC, let B′ be the orthogonal projection of B on AC, and let M be the
middle point of line segment AB. Prove that |A′B′| = |A′M| = |B′M|.
6.1.2. Consider a square ABCD. Choose point P outside of this square such that ∠CPB
is the right angle. Denote by Q the intersection of AC and BD. Prove that
∠QPC = ∠QPB.
6.1.3. Point O is the center of a circumcircle of a triangle ABC. Point C′ is the orthogonal
projection of C on AB. Prove that ∠ACC′ = ∠OCB.
PROBLEM
Consider a rectangle ABCD. We choose point F such that triangle ABF is equilateral and
AF lies inside angle ∠BAD. Similarly, we choose point E such that triangle BCE is
equilateral and BE lies inside angle ∠ABC. Prove that triangle DEF is equilateral.
THEORY
Congruence Two figures or objects are congruent if they have the same shape and size, or if
one has the same shape and size as the mirror image of the other. It is worth remembering
that there are the following conditions for determining congruence between two triangles:
1. all three sides are equal (side-side-side condition);
2. one side and two angles are equal (angle-side-angle or angle-angle-side condition);
3. one angle and two sides associated with this angle are equal (side-angle-side
condition);
4. one angle that is not acute and any two sides are equal.
Importantly, the exception here is when we know one acute angle and two sides of the
triangle but only one of them is adjacent to the angle. In this case, there are actually two
possible triangles that meet those conditions but are not congruent.
SOLUTION
Observe that angles ∠FAD, ∠ECD, and ∠EBF are all equal to π/6. Also
|AD| = |EC| = |EB| and |FA| = |CD| = |BF|. This means that triangles ECD, FAD,
and EBF are congruent. In particular, it implies that |ED| = |FD| = |EF| and so DEF is
equilateral.
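The conclusion can also be checked in coordinates. The sketch below assumes a w-by-h rectangle with A = (0, 0), B = (w, 0), C = (w, h), D = (0, h); the coordinates of E and F follow from the equilateral triangles, and the helper name def_sides is hypothetical.

```julia
# Coordinate check for a w-by-h rectangle A = (0,0), B = (w,0), C = (w,h), D = (0,h).
# F is the apex of the equilateral triangle on AB drawn towards D, and
# E is the apex of the equilateral triangle on BC drawn towards A.
function def_sides(w::Real, h::Real)
    D = (0.0, h)
    F = (w / 2, w * sqrt(3) / 2)
    E = (w - h * sqrt(3) / 2, h / 2)
    dist(P, Q) = hypot(P[1] - Q[1], P[2] - Q[2])
    return dist(D, E), dist(E, F), dist(F, D)
end

println(def_sides(2.0, 1.0))
println(def_sides(3.0, 5.0))
```

All three distances agree up to rounding; algebraically each squared side equals $w^2 - \sqrt{3}\,wh + h^2$.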
FIGURE 6.4: Illustration for Problem 6.2.
REMARKS
A common strategy when solving geometry problems is to draw a picture and then try to
write down all lengths and angles that one can possibly calculate (or list all relationships
between them). In our solution, we simply marked the corresponding values for angles and
identified sides that have equal length. When one does it, it is often easy to spot congruences.
Congruence allows us to reason about unknown lengths of sides or unknown angles.
EXERCISES
6.2.1. Suppose that points P and Q lie on sides BC and CD of a square ABCD such that
∠PAQ = π/4. Prove that |BP| + |DQ| = |PQ|.
6.2.2. Point P lies on a diagonal AC of a square ABCD. Points Q and R are the orthogonal
projections of P on lines CD and DA, respectively. Prove that |BP| = |RQ|.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 1.
Solution: our own.)
6.2.3. Consider an acute triangle ABC where ∠ACB = π/4. Point B′ is the orthogonal
projection of B on AC and point A′ is the orthogonal projection of A on BC. Let H be the
intersection point of AA′ and BB′. Prove that |CH| = |AB|.
6.3 Similarity
SOURCE
Adaptation of a puzzle mentioned by Peter Winkler while visiting one of the authors of this
book.
PROBLEM
You are given a triangle ABC and an n-element set S of non-overlapping disks, all having
radius 1 and centers lying inside or on the triangle. Disks do not need to lie inside the
triangle as long as their centers are. Moreover, by “non-overlapping” we mean that they can
“touch” each other; that is, we allow the intersection of any two disks to be one point.
Suppose that set S is maximal; that is, there is no disk of radius 1 such that its center lies
inside or on triangle ABC and it does not overlap with any disk from S. Prove that you can
completely cover triangle ABC with 4n disks of radius 1 (of course, this time we allow
them to overlap).
THEORY
Similarities Two figures that have the same shape are said to be similar. Formally, given two
figures A and B lying on the same plane, we say that they are similar if we can transform A
into B only using the following two operations:
In particular, if two figures are similar, then the ratios of the lengths of their corresponding
sides are equal.
For instance, any two circles or any two squares are always similar. In fact, in general, all
regular n-gons are similar. If we are given two triangles, the rules of similarity are the same
as rules of congruence with the only difference that the requirement that the corresponding
lengths of sides are equal is replaced by equality of proportions. Finally, let us mention that
if figure A has to be scaled by a factor of α to be congruent to figure B, then the ratio
between the area of figure A and the area of figure B is $\alpha^2$.
SOLUTION
First, let us construct another auxiliary set of n disks; this time all of them being of radius 2.
For each disk D from S, we put disk E to R which has the same center as D (but radius 2, not
1). Note that R completely covers triangle ABC . Indeed, if point X is lying inside or on the
triangle but is not covered by R, then the distance from X to any of the centers is greater than
2. But this implies that a disk of radius 1, centered at X, would not overlap with any of the
disks in S, contradicting the maximality of set S.
Consider now a triangle A′B′C′ and a set R′ that are similar to triangle ABC and,
respectively, set R but both are shrunk by a factor of two in each dimension. Clearly, triangle
A′B′C′ is completely covered by disks of radius 1 from R′. Hence, it is enough to show that
one can cover triangle ABC with four copies of A′B′C′; indeed, if this can be done, then
triangle ABC can be covered by four copies of R′. But this is easy to do. Let D, E, and F be
the midpoints of line segments AB, AC, and BC, respectively (see Figure 6.5). Now, we
observe that triangle ABC can be partitioned into four triangles ADE, BDF, EFC and
DEF, each of which is congruent to triangle A′B′C′.
REMARKS
The key observation in our solution is that any triangle can be partitioned into four copies of
itself (scaled down by a factor of two in each dimension). Clearly, triangles are not the only
figures with this property.
The solution is cute but it feels far from optimal. Therefore, it is perhaps
surprising that, in fact, the factor 4 is best possible! In other words, if we replace 4n by
⌊(4 − ϵ)n⌋ for some ϵ > 0, then the property is no longer true, regardless of how close to zero ϵ
is. Formally, for any ϵ > 0 there exists a counter-example, a triangle ABC and an n-element
maximal set S of non-overlapping disks such that more than (4 − ϵ)n disks are needed to
cover triangle ABC . To see this, we consider a very large triangle so that the boundary
effects are negligible. Hence, for a moment, let us forget about the triangle and think about
placing disks on the plane.
First, let us consider a tiling with regular hexagonal tiles; each hexagonal tile consists of
six equilateral triangles of side lengths equal to r. We carefully choose r such that a unit disk
is just a tiny bit larger than a disk inscribed into one hexagon. Since the altitude of any of the
six equilateral triangles making up the hexagon is arbitrarily close to one, the radius of the
disk, we may assume that r is arbitrarily close to 2/√3 (but it must be a tiny bit smaller).
Formally, we set r := 2/√(3(1 + f(ϵ))) for some function f : R+ → R+ such that f(ϵ) → 0
as ϵ → 0, which will be determined soon. We put disks in every third tile such that their
centers coincide with the corresponding centers of hexagons (see Figure 6.6). This
configuration of disks just barely prevents us from adding any more disks without
overlapping and so it is maximal. Proving that this is the most efficient way to prevent the
addition of a non-overlapping disk is challenging and is way beyond the
scope of this book. The (limiting) ratio between the total area of all unit disks used and the
area of the tiling is
\[
\frac{(\pi \cdot 1^2)/3}{6(\sqrt{3}\,r^2/4)} = \frac{\pi\sqrt{3}\,(1 + f(\epsilon))}{18}.
\]
FIGURE 6.6: Maximal set of unit disks and covering with disks of radii 2.
Now, the next question is: what is the most efficient way to cover the plane by unit disks?
The answer is as follows: by circumscribing the tiles in some other hexagonal tiling, this
time each hexagon must have unit radius (also, as for triangles, often called circumradius).
This is another difficult question that is beyond the scope of this book. In such tiling, the
ratio between the total area of all unit disks used and the area of the tiling is
\[
\frac{\pi}{6(\sqrt{3}/4)} = \frac{4\pi\sqrt{3}}{18}.
\]
It follows that the ratio between the number of disks used in the second scenario and n, the
number of disks used in the first one, is 4/(1 + f(ϵ)). Hence, one needs more than (4 − ϵ)n
disks to cover the triangle, provided that f(ϵ) is sufficiently close to zero and triangle ABC
is sufficiently large so that the (finite) ratio is close to its limiting counterpart.
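The two limiting densities above, and their quotient 4/(1 + f(ϵ)), can be checked numerically. The following is a minimal sketch; the function names are ours, and it only evaluates the two area ratios stated in the text (it does not prove that either configuration is optimal).

```python
import math

def maximal_set_density(f_eps):
    """Area fraction covered by unit disks placed in every third hexagonal tile."""
    r = 2.0 / math.sqrt(3.0 * (1.0 + f_eps))       # side length of each hexagonal tile
    hexagon_area = 6.0 * (math.sqrt(3.0) * r ** 2 / 4.0)
    return (math.pi * 1.0 ** 2 / 3.0) / hexagon_area

def covering_density():
    """Total disk area per unit area when unit disks circumscribe hexagons of circumradius 1."""
    hexagon_area = 6.0 * (math.sqrt(3.0) / 4.0)
    return math.pi / hexagon_area

def covering_disks_per_disk_in_s(f_eps):
    """Limiting ratio between the number of covering disks and n."""
    return covering_density() / maximal_set_density(f_eps)

for f_eps in (0.1, 0.01, 0.001):
    print(f_eps, covering_disks_per_disk_in_s(f_eps))
```

As f(ϵ) shrinks, the printed ratio approaches 4 from below, which is the behaviour the argument relies on.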
EXERCISES
6.3.1. You are given a rectangle that can be covered with n disks of radius r. Prove that it can
be also covered by 4n disks of radius r/2.
6.3.2. You are given an acute triangle ABC . Let B′ be the projection of B on AC and C′ be
the projection of C on AB. Show that triangles ABC and AB′C′ are similar.
6.3.3. Consider two circles, o1 and o2, that intersect at two points, A and B. Let P be a point
on o1 such that AP goes through the center of o1 and Q be a point on o2 such that AQ goes
through the center of o2. Prove that if ∢PAQ = π/2, then |PB|/|BQ| = [o1]/[o2],
Consider an acute triangle ABC where |AC| < |BC|. Point D lies on line segment BC and
|BD| = |AC|. Points E and F are the middle points of line segments CD and AB,
Before we state the first theorem, we need one definition. A transversal is a line that passes
through two lines at two distinct points.
Menelaus’s Theorem Consider a triangle ABC, and a transversal line that crosses BC, AC,
and AB at points D, E, and F respectively, with D, E, and F distinct from A, B, and C (see
Figure 6.7). Menelaus’s theorem then states that the following relation holds:
\[
\frac{|AF|}{|FB|} \cdot \frac{|BD|}{|DC|} \cdot \frac{|CE|}{|EA|} = 1. \tag{6.1}
\]
The converse is also true. If points D, E, and F are chosen on BC , AC , and AB respectively
so that (6.1) holds, then D, E, and F are collinear.
In order to see this, draw the line KC parallel to AB and observe that, by similarity of the
triangles, we have
\[
\frac{|BD|}{|DC|} = \frac{|BF|}{|CK|}
\]
and
\[
\frac{|AE|}{|EC|} = \frac{|AF|}{|CK|}.
\]
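Dividing the two equalities above eliminates |CK| and gives |AF|/|FB| ⋅ |BD|/|DC| ⋅ |CE|/|EA| = 1. The sketch below checks this product numerically for one triangle and one transversal; the coordinates and the helper names are arbitrary choices of ours, and the product form is the standard unsigned statement of Menelaus's theorem, assumed here.

```python
import math

def intersect(p1, p2, q1, q2):
    """Intersection point of line p1p2 with line q1q2 (assumed not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, q1, q2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def menelaus_product(A, B, C, U, V):
    """|AF|/|FB| * |BD|/|DC| * |CE|/|EA| for the transversal through U and V."""
    D = intersect(B, C, U, V)   # transversal meets line BC
    E = intersect(A, C, U, V)   # transversal meets line AC
    F = intersect(A, B, U, V)   # transversal meets line AB
    d = math.dist
    return d(A, F) / d(F, B) * d(B, D) / d(D, C) * d(C, E) / d(E, A)

A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 5.0)
U, V = (-2.0, 1.0), (5.0, 3.0)      # two points defining the transversal
print(menelaus_product(A, B, C, U, V))
```

Up to floating-point error the printed value is 1, whichever transversal is chosen (as long as it avoids the vertices and is not parallel to a side).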
We obtain it by writing (6.1) for triangle AF C and line BE, triangle BCF and line AD,
and then dividing them side by side and rearranging the terms.
SOLUTION
(6.2)
and
(6.3)
But |BA| = 2|FA| and |BC| = 2|CE| + |AC|. After substituting these into equation (6.2) we get
and so
\[
|EG| = \frac{2|EC||EF|}{|AC|}
\]
\[
|EF|(|AC| + |CG|)(|AC| + 2|EC|) = |AC|(|AC| + |CE|)\left(\frac{2|EC||EF|}{|AC|} + |EF|\right).
\]
Finally, after simplification we get the desired equality; namely, |CG| = |CE|.
REMARKS
It is useful to remember the “arrow shaped pattern” that is obtained by two overlapping
triangles. For example, coming back to our problem, Figure 6.8 contains point A which can
be viewed as the “head of the arrow” consisting of two overlapping triangles, AFG and ABC.
In such situations, especially when the problem concerns lengths of certain sections,
Menelaus’s theorem very often turns out to be useful.
EXERCISES
6.4.1. Points D, E and F lie on sides BC , CA and AB of a triangle ABC in such a way that
lines AD, BE and CF intersect in a single point P. Prove that
|AF|/|FB| + |AE|/|EC| = |AP|/|PD|.
(Source of the problem and solution idea: “Delta” monthly, March 2011 – DeltaMi –
Problem 2.)
6.4.2. You are given a triangle ABC where ∢ACB = π/2. On side AC build a square
ACGH, externally to the triangle. Similarly, on side BC build a square CBEF, externally
to the triangle. Show that the point of intersection of AE and BH lies on the line orthogonal
to AB that goes through point C.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 105.
Solution: our own.)
6.4.3. You are given a convex quadrilateral ABCD and a line that intersects lines DA, AB,
BC , and CD in points K, L, M, and N, respectively. Prove that
|DK| ⋅ |AL| ⋅ |BM | ⋅ |CN | = |AK| ⋅ |BL| ⋅ |CM | ⋅ |DN |.
6.5 Parallelograms
SOURCE
PROBLEM
Point P lies inside parallelogram ABCD and ∢ABP = ∢ADP. Show that
∢DAP = ∢DCP.
THEORY
the diagonals of a parallelogram divide it into four triangles of equal area (in
particular, the area of a parallelogram is twice the area of a triangle created
by one of its diagonals);
any line going through the midpoint of a parallelogram bisects the area.
SOLUTION
Draw an auxiliary point P′ such that PP′ is parallel to CD (and so also to AB) and for
∢ABP = ∢BPP′, as BP is a transversal that passes the two parallel lines, PP′ and
BC. Now, using the assumption of the problem that ∢ABP = ∢ADP we get that
∢BPP′ = ∢BCP′. It follows that triangles BPP′ and BCP′ share one of the sides
(the line segment BP′) and the two angles that are opposite to BP′ are equal and lie on the
same side of BP′. Using the connection between central and inscribed angles discussed in
Section 6.1 we deduce that one can draw a circle through points B, P, C, and P′. Therefore,
∢P′PC = ∢P′BC, as they are inscribed angles which intercept the same arc. But,
clearly, ∢P′PC = ∢DCP (as CP is a transversal that passes two parallel lines, CD
REMARKS
In this example, we see one more time how useful it is to add some auxiliary object to the
figure; this time, it is point P′. The idea for adding it comes from the fact that creating
another parallelogram introduces many angles that must be preserved and so many useful
conditions must be satisfied.
EXERCISES
6.5.1. Consider a quadrilateral ABCD. Prove that the sum of distances from any point P
inside this quadrilateral to the lines AB, BC , CD, and DA is constant (that is, does not
depend on the choice of P) if and only if ABCD is a parallelogram.
6.5.2. Consider a triangle ABC such that |AB| = |AC| (that is, an isosceles triangle), AD is
the height of this triangle, and E is in the middle of AD. Let F be the orthogonal projection
of D on BE. Prove that ∢AFC = π/2.
6.5.3. Consider a triangle ABC. Outside of the triangle, on sides AB and AC, we build
squares ABDE and, respectively, ACFG. Let M and N be the middle points of DG and,
respectively, EF. What are the possible values of the ratio |MN|/|BC|?
(Source of the problem: “Wokół obrotów” book by Waldemar Pompe, Problem 4.22.
Solution: our own.)
Consider two externally disjoint circles A and B; that is, they are not only disjoint but also
neither of them lies inside the other one. There are two lines ℓ1 and ℓ2 tangent to A and B
selected so that they are not separating A and B. Line ℓi (i ∈ {1, 2}) touches circle A in
point Ai and circle B in point Bi. Now consider a line A1B2. It intersects circle A in point A3
THEORY
Tangent Line The tangent line to a curve at a given point is the straight line that “just
touches” the curve at that point. (A formal definition is outside the scope of this book.) As a
specific example, consider a circle B and a point X outside of it. Then, there are two tangent
lines, ℓ1 and ℓ2, to the circle; the intersection of ℓ1 (respectively, ℓ2) and the circle is
Due to the symmetry, without loss of generality, it is enough to focus on showing that
∢AB1O = π/2. For a contradiction, suppose that ∢AB1O ≠ π/2; that is, B1 is not
the orthogonal projection of O on line AB1. This implies that there exists another point on
line AB1 that is at the same distance from O as B1; that is, both points lie on the circle. We
assumed however that there was only one point of intersection of AB1 and the circle, and so
Case 1: Consider a circle and a point A inside it, together with any line going through A. Let
the points of intersection of this line with the circle be B and C. The product |AB| ⋅ |AC|
then does not depend on the choice of the line and is equal to the square of the radius of the
circle minus the square of the distance from A to O, the center of the circle.
To see this, let us first note that if A = O, then the desired property is trivially true so we
may assume that it is not the case. Now, let us introduce an auxiliary line going through A and
O that intersects the circle at points P and Q (see Figure 6.10). Our goal is to show that the
desired property holds for all lines passing through A (including this auxiliary line OP) but,
clearly, it holds for OP. Indeed, observe that
\[
|AP| \cdot |AQ| = (|OP| - |AO|) \cdot (|OP| + |AO|) = |OP|^2 - |AO|^2.
\]
So it is enough to show that |AB| ⋅ |AC| = |AP| ⋅ |AQ|. Now, observe that triangles AQB
and ACP are similar as ∢QBC = ∢QPC and ∢CAP = ∢QAB (see the
connection between central and inscribed angles discussed in Section 6.1). Therefore,
|AB|/|AQ| = |AP|/|AC|, which yields the desired property.
Case 2: Now, consider a circle and a point A outside of it. Consider any line going through A
that intersects with the circle. Let the points of intersection of this line with the circle be B
and C. Then, the product |AB| ⋅ |AC| does not depend on the choice of the line and is equal
to the square of the distance from A to O, the center of the circle, minus the square of the
radius of the circle.
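Both cases say that |AB| ⋅ |AC| depends only on the distance |AO| and the radius. The sketch below checks this numerically by intersecting lines through A with a circle; the function name and the sample points are our own choices, made only for illustration.

```python
import math

def chord_product(A, O, radius, angle):
    """|AB| * |AC| for the line through A with direction angle `angle`, or None if it misses the circle."""
    dx, dy = math.cos(angle), math.sin(angle)
    # Points A + t*(dx, dy) lie on the circle exactly when t^2 + 2*b*t + c = 0.
    b = (A[0] - O[0]) * dx + (A[1] - O[1]) * dy
    c = (A[0] - O[0]) ** 2 + (A[1] - O[1]) ** 2 - radius ** 2
    disc = b * b - c
    if disc < 0:
        return None
    t1 = -b + math.sqrt(disc)
    t2 = -b - math.sqrt(disc)
    # The direction vector has unit length, so |t| is a distance from A.
    return abs(t1) * abs(t2)

O, radius = (0.0, 0.0), 5.0

inside = (1.0, 2.0)      # |AO|^2 = 5, so the product should be 25 - 5
print([chord_product(inside, O, radius, a) for a in (0.0, 0.7, 2.0)])

outside = (7.0, 1.0)     # |AO|^2 = 50, so the product should be 50 - 25
towards_center = math.atan2(-outside[1], -outside[0])
print([chord_product(outside, O, radius, towards_center + a) for a in (-0.2, 0.0, 0.1)])
```

For the outside point only directions close enough to the center hit the circle, which is why the angles are taken near the direction from A to O.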
First, let us note that possibly B = C (that is, the line intersects the circle at one point and
so the line is, in fact, the tangent line) but this case is rather uninteresting and easy to deal
with. Indeed, as argued above, in this case ∢ABO is the right angle and so we get the
desired property immediately.
FIGURE 6.11: Power of a Point, Case 2: |AB| ⋅ |AC| = |AO|² − |OB|².
Now assume B ≠ C and choose a point P on the same side of line AO as points B and C
such that ∢OPA is the right angle and P lies on the circle (see Figure 6.11). Since
|OB| = |OP|, triangle BOP is an isosceles triangle and so
∢BPO = π/2 − ∢BOP/2. Since ∢OPA is the right angle,
∢BPA = π/2 − ∢BPO = ∢BOP/2. But, as ∢BCP is an inscribed angle and
∢BOP is the central angle that intercepts the same arc, ∢BCP = ∢BOP/2, and so
∢BCP = ∢BPA. This means that triangles ACP and ABP are similar. Therefore,
|AP|/|AC| = |AB|/|AP| and so |AB||AC| = |AP|². But, as ∢OPA is the right angle,
SOLUTION
\[
|B_2 A_3| \cdot |B_2 A_1| = |B_2 O_A|^2 - |A_2 O_A|^2 = |A_2 B_2|^2,
\]
line of symmetry of the two circles, A and B, together with the two lines, ℓ1 and ℓ2. Therefore,
REMARKS
In this example we used the Power of a Point property. It is a natural tool to try in situations
when we have to prove facts about lengths of sections defined by a circle.
EXERCISES
6.6.1. Two circles intersect in points A and B. Point P is selected on line AB outside of the
circles. Points C and D are locations where tangent lines going through point P touch both
circles. Prove that ∢PCD = ∢PDC.
6.6.2. Consider a convex hexagon ABCDEF such that |AB| = |BC|, |CD| = |DE|, and
|EF | = |F A|. Prove that lines containing altitudes of triangles BCD, DEF , and F AB
6.6.3. Consider two points A and B. Take two circles o1 and o2 such that o1 is tangent to AB
in point A, o2 is tangent to AB in point B, and o1 and o2 are externally tangent in point X. If
we allow o1 and o2 to vary, then what is the set of points that contains all possible locations
of X?
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 26.
Solution: our own.)
6.7 Areas
SOURCE
PROBLEM
Let A1A2A3A4 be any convex quadrilateral that has area equal to 1. To simplify the notation,
let A0 = A4. Let us introduce four new points: for i ∈ {1, 2, 3, 4}, let A′i be the point on line
AiAi−1 such that Ai−1 is the midpoint of line segment AiA′i. Calculate the area of
A′1A′2A′3A′4.
THEORY
In order to be able to deal with areas of some figures, it is often the case that one needs to use
a formula for the area of a triangle. Consider a triangle ABC and denote by H an orthogonal
projection of C on line AB (which does not have to be on the line segment AB). The area of
the triangle, denoted by [ABC], is then
\[
\frac{1}{2} \cdot |AB| \cdot |CH| = \frac{1}{2} b h. \tag{6.4}
\]
Here b = |AB| is often called the length of the base of the triangle, and h = |CH | is called
the altitude of the triangle. Although simple, this formula is only useful if the height can be
readily found, which is not always the case. Hence, there are other formulas available. For
example, the shape of the triangle is determined by the lengths of the sides. Therefore, the
area can also be derived from the lengths of the sides by Heron’s formula that we already
used in this book (see Problem 1.9).
A direct consequence of (6.4) is that if the altitude of the triangle is fixed and one only
changes the length of the base, then the area of the triangle changes proportionally. This
implies that if one angle of the triangle is fixed but the two adjacent sides are rescaled by
factors a and b, respectively, then the area of the triangle changes by a factor of a ⋅ b. For a
given angle α ∈ (0, π), this constant factor is typically defined as the area of the triangle
whose sides adjacent to angle α have lengths 1 and 2. It is called the sine of angle α and is
denoted by sin(α). It follows that for any triangle ABC we have
\[
[ABC] = \frac{1}{2} \cdot |AC| \cdot |AB| \cdot \sin(\angle BAC)
      = \frac{1}{2} \cdot |BA| \cdot |BC| \cdot \sin(\angle ABC)
      = \frac{1}{2} \cdot |CA| \cdot |CB| \cdot \sin(\angle ACB). \tag{6.5}
\]
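The Law of Sines The set of equalities in (6.5) gives us immediately the following important
observation known as the law of sines. For any triangle ABC, we have
\[
\frac{|BC|}{\sin(\angle BAC)} = \frac{|AC|}{\sin(\angle ABC)} = \frac{|AB|}{\sin(\angle ACB)}.
\]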
This ratio is equal to the diameter of the circumscribed circle of the given triangle. Another
interpretation of this observation is that every triangle with angles α, β, and γ is similar to a
triangle with side lengths equal to sin(α), sin(β), and sin(γ).
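The three expressions for the area and the common value of the law-of-sines ratio can be compared numerically. In this sketch the triangle, the helper names, and the use of the standard formula 2R = |AB||BC||CA|/(2[ABC]) for the circumscribed diameter are our assumptions, not taken from the text.

```python
import math

def angle_at(P, Q, R):
    """Angle QPR at vertex P, in radians."""
    v = (Q[0] - P[0], Q[1] - P[1])
    w = (R[0] - P[0], R[1] - P[1])
    cos_angle = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
    return math.acos(cos_angle)

def area_by_coordinates(A, B, C):
    """Triangle area from vertex coordinates (independent of any sine)."""
    return abs((B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1])) / 2.0

A, B, C = (0.0, 0.0), (7.0, 1.0), (2.0, 5.0)
dist = math.dist
alpha, beta, gamma = angle_at(A, B, C), angle_at(B, A, C), angle_at(C, A, B)

# Half the product of two sides times the sine of the angle between them.
areas = [
    0.5 * dist(A, B) * dist(A, C) * math.sin(alpha),
    0.5 * dist(B, A) * dist(B, C) * math.sin(beta),
    0.5 * dist(C, A) * dist(C, B) * math.sin(gamma),
]
# Law of sines: each side divided by the sine of the opposite angle.
ratios = [
    dist(B, C) / math.sin(alpha),
    dist(A, C) / math.sin(beta),
    dist(A, B) / math.sin(gamma),
]
diameter = dist(A, B) * dist(B, C) * dist(C, A) / (2.0 * area_by_coordinates(A, B, C))

print(areas, area_by_coordinates(A, B, C))
print(ratios, diameter)
```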
Let us finish with two more observations. From the discussion above, we get that
sin(0) = sin(π) = 0, sin(π/2) = 1, sin(α) = sin(π − α) for any α ∈ (0, π), and that the
sine function is increasing in the range from 0 to π/2, and decreasing in the range from π/2
to π.
Moreover, consider a triangle ABC such that ∢ABC = α and ∢BAC = π/2; that
is, ∢BAC is the right angle. Then, [ABC] = |AB||AC|/2 = sin(α)|AB||BC|/2, so
sin(α) = |AC|/|BC|.
SOLUTION
We will use [A1A2…An] to denote the area of an n-gon A1A2…An. Let us first observe
that
\[
[A'_1 A_4 A'_4] = 2[A_1 A_3 A_4], \qquad [A'_2 A_2 A'_3] = 2[A_1 A_2 A_3].
\]
Hence,
\[
\begin{aligned}
[A'_1 A_4 A'_4] + [A'_2 A_2 A'_3] &= 2[A_1 A_3 A_4] + 2[A_1 A_2 A_3] \\
&= 2([A_1 A_3 A_4] + [A_1 A_2 A_3]) \\
&= 2[A_1 A_2 A_3 A_4] = 2.
\end{aligned}
\]
Similarly, since
\[
[A'_1 A_1 A'_2] = 2[A_2 A_4 A_1], \qquad [A'_3 A_3 A'_4] = 2[A_2 A_3 A_4],
\]
we get [A′1A1A′2] + [A′3A3A′4] = 2. Combining all of these together, we conclude that
\[
[A'_1 A'_2 A'_3 A'_4] = [A_1 A_2 A_3 A_4] + [A'_1 A_4 A'_4] + [A'_2 A_2 A'_3] + [A'_1 A_1 A'_2] + [A'_3 A_3 A'_4] = 1 + 2 + 2 = 5.
\]
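The decomposition above can be sanity-checked with coordinates. The sketch below builds the points A′i = 2Ai−1 − Ai (which is what "Ai−1 is the midpoint of AiA′i" means in coordinates) and compares areas with the shoelace formula; the sample quadrilateral and function names are arbitrary choices of ours.

```python
def shoelace(points):
    """Area of a simple polygon given by its vertices in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def extended_quadrilateral(quad):
    """Given [A1, A2, A3, A4], return [A'1, A'2, A'3, A'4] with A'_i = 2*A_{i-1} - A_i and A_0 = A_4."""
    return [
        (2 * quad[i - 1][0] - quad[i][0], 2 * quad[i - 1][1] - quad[i][1])
        for i in range(4)   # quad[-1] is A4, which plays the role of A0
    ]

quad = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)]   # a convex quadrilateral
ratio = shoelace(extended_quadrilateral(quad)) / shoelace(quad)
print(ratio)
```

The ratio of the two areas does not depend on the scale, so the check applies equally to a quadrilateral of area 1.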
REMARKS
This problem shows that it is important to remember the following two facts.
Suppose that two triangles, say, ABC and DEF, are such that |AB| = |DE|
and |AC| = |DF|; moreover, ∢BAC + ∢EDF = π. (For example,
triangles A1A2A3 and A1A2A′3 on Figure 6.13 satisfy these properties.)
EXERCISES
6.7.1. Let P be an interior point of a triangle ABC . Let lines AP , BP , and CP intersect
sides BC , CA, and AB in points A′, B′ and, respectively, C′. Prove that
|PA|/|AA′| + |PB|/|BB′| + |PC|/|CC′| = 2.
6.7.2. Points E and F lie on sides BC and, respectively, DA of a parallelogram ABCD such
that |BE| = |DF |. Select any point K on side CD. Let P and Q be intersection points of line
F E with lines AK and, respectively, BK . Prove that [AP F ] + [BQE] = [KP Q].
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 45.
Solution: our own.)
6.7.3. Consider a convex quadrilateral ABCD. Select points K and L on side AB such that
|AK| = |KL| = |LB| = |AB|/3. Similarly, select points N and M on side DC such that
PROBLEM
Prove that KLM N is a parallelogram and its area is less than or equal to the half of the area
of ABCD.
THEORY
Thales’ Theorem Let us highlight the following observation, known as Thales’ theorem or
the intercept theorem, about the ratios of various line segments that are created if two
intersecting lines are intercepted by a pair of parallels. In fact, it is equivalent to the theorem
about ratios in similar triangles. Lines A1A2 and B1B2 are parallel if and only if
\[
\frac{|A_1 C|}{|A_2 C|} = \frac{|B_1 C|}{|B_2 C|}.
\]
Point C may lie anywhere on the plane except for being situated on the lines A1A2 or B1B2.
FIGURE 6.14: Thales’ Theorem.
SOLUTION
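However, since |KB|/|AB| = |LB|/|CB| = 1/(1 + α), the area of BKL is 1/(1 + α)² of
the area of ABD. Considering the remaining three triangles in a similar way, we get that
the total area of the four triangles is
\[
\frac{1}{(1+\alpha)^2} + \frac{\alpha^2}{(1+\alpha)^2}
\]
of the area of ABCD. Hence, the ratio of the area of KLMN to the area of ABCD is
\[
1 - \frac{1+\alpha^2}{(1+\alpha)^2} = \frac{2\alpha}{(1+\alpha)^2}
= 2\left(\sqrt{\frac{1}{1+\alpha} \cdot \frac{\alpha}{1+\alpha}}\right)^2
\le 2\left(\frac{1}{2}\left(\frac{1}{1+\alpha} + \frac{\alpha}{1+\alpha}\right)\right)^2 = \frac{1}{2},
\]
where the inequality is obtained by the geometric-arithmetic mean inequality. The equality is
obtained if α = 1.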
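The bound can be checked numerically. The sketch below assumes (the full problem statement is not reproduced here) that K, L, M, and N lie on AB, BC, CD, and DA with |KB|/|AB| = |LB|/|CB| = |MD|/|CD| = |ND|/|AD| = 1/(1 + α); the helper names and the sample quadrilateral are ours.

```python
def shoelace(points):
    """Area of a simple polygon given by its vertices in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def toward(P, Q, s):
    """Point at fraction s of the way from P to Q."""
    return (P[0] + s * (Q[0] - P[0]), P[1] + s * (Q[1] - P[1]))

def klmn_area_ratio(A, B, C, D, alpha):
    """Area of KLMN divided by the area of ABCD under the assumed ratios."""
    s = 1.0 / (1.0 + alpha)
    K = toward(B, A, s)   # on AB with |KB|/|AB| = s
    L = toward(B, C, s)   # on BC with |LB|/|CB| = s
    M = toward(D, C, s)   # on CD with |MD|/|CD| = s
    N = toward(D, A, s)   # on DA with |ND|/|AD| = s
    return shoelace([K, L, M, N]) / shoelace([A, B, C, D])

A, B, C, D = (0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)
for alpha in (0.5, 1.0, 3.0):
    print(alpha, klmn_area_ratio(A, B, C, D, alpha), 2 * alpha / (1 + alpha) ** 2)
```

The two printed columns agree for each α, and the value 1/2 is reached only at α = 1.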
FIGURE 6.15: Illustration for Problem 6.8.
REMARKS
In this problem, since the ratios of the lengths of the corresponding line segments are
preserved, it is natural to try to use Thales’ theorem. Interestingly, one could ask a question
what conclusion can be obtained if, instead, the following property holds:
(Note the reversed proportions for one pair of opposing sides.) This time, KLMN is not a
parallelogram (in general). However, we immediately see that the area of KBL is
a fraction α/(1 + α)² of the area of ABD. Similar properties are also satisfied for the three
other triangles. Therefore, the area of KLMN is equal to 1 − 2α/(1 + α)² of the area of
ABCD. Arguing as before, we get that it is at least half of the area of ABCD and the
equality holds when α = 1. In fact, we get a slightly stronger property: the area of the figure
obtained this way added to the area of the figure from our original problem is exactly equal
to the area of ABCD.
EXERCISES
6.8.1. Given a parallelogram ABCD, consider points M and N that are in the middle of sides
BC and CD, respectively. Section BD intersects with AN in point Q, and with AM in
6.8.2. Points K, L, M, and N are the middle points of sides AB, BC , CD and, respectively,
DA of a parallelogram ABCD whose area is equal to 1. Let P be the intersection point of
MA with LD, and, finally, S be the intersection point of NB with MA. Calculate the area
of P QRS .
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 59.
Solution: our own.)
6.8.3. Points E and F are on sides AB and, respectively, AD of rhombus ABCD. Lines CE
and CF intersect line BD in points K and L, respectively. Line EL intersects side CD in
point P. Line F K intersects side BC in point Q. Prove that |CP | = |CQ|.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 62.
Solution: our own.)
Chapter 7
Hints
7.1 Inequalities
7.2 Equalities and Sequences
7.3 Functions, Polynomials, and Functional Equations
7.4 Combinatorics
7.5 Number Theory
7.6 Geometry
In this chapter we provide hints for all exercises presented in the book.
7.1 Inequalities
1.1.1. Observe that a + c = b + (a + c − b) and a ≤ a + c − b ≤ c.
1.1.2. Apply Jensen’s inequality to function f(k) = k^{s−1} and weights
proportional to k.
1.1.3. Apply Jensen’s inequality to function f (x) = √x.
1.2.1. Apply the arithmetic-harmonic mean inequality and observe that equality
holds when a = b = c/2 = d/4.
1.2.2. Apply the arithmetic-geometric mean inequality to the right hand side,
rearrange the terms, and finally apply the geometric-harmonic mean
inequality to get the result.
1.2.3. Divide both sides by 2 and apply the arithmetic-geometric mean
inequality to the left hand side.
1.3.1. Take logarithm of both sides of the inequality and directly apply the
rearrangement inequality.
1.3.2. Divide both sides by abc and then apply the rearrangement inequality to
the obtained inequality.
1.3.3. Take a logarithm of both sides and then apply rearrangement inequality
after simplifying the expression.
1.4.1. In both cases, first invert both sides of the inequality and then apply
Bernoulli’s inequality, observing that n ≥ 2.
1.4.2. Raise both sides to the power of n and then apply Bernoulli’s inequality.
1.5.1. Invert both sides of the inequality and note that n² − n = n ⋅ (n − 1).
1.5.2. Raise both sides of the inequality to the power of n(n + 1) and rearrange
the obtained inequality.
1.5.3. Use the fact that (1 + a/n)^n is increasing for a ≠ 0.
1.5.5. To prove the first inequality, use Bernoulli’s inequality. For the second
part, note that the right hand side tends to 3 and the middle term is
bounded from above by e.
1.6.1. Use the binomial expansion of (1 + x)^i and observe that
(1 + x)^i = 1 + ix + O(x²).
Jensen’s inequality.
1.7.3. Bound function x/(x² + 1) from above by a linear function passing points
1.8.1. Consider flipping a fair coin 2n times and calculate the probability of
obtaining exactly n heads.
1.8.2. Consider the probability that in n coin tossings there are at least k heads,
provided that the probability of getting a head is equal to p and,
respectively, q.
1.9.1. Consider n + 1 points of the form (i, ∑_{ℓ=1}^{i} aℓ), i ∈ {0, 1, …, n}.
1.9.2. Consider the area of the pentagon with each side length equal to 1/2 with
the two diagonals adjacent to the same vertex that have the same length,
x.
\[
\begin{cases}
(x - (y + z))(x - yz) = 0 \\
(y - (z + x))(y - zx) = 0 \\
(z - (x + y))(z - xy) = 0.
\end{cases}
\]
2.3.2 Find an upper bound for (x + y + z)² and a lower bound for
x²y² + y²z² + z²x².
2.4.2 Use the identity cot(2x) = (cot(x) − tan(x))/2 and then use the
substitution y = tan(x).
2.4.3 Use the identity
1 + cot(x) cot(y)
cot(y − x ) = .
cot(x) − cot(y)
2.5.1 For each pair of the three equations, subtract one from the other to cancel
out a and one of the squares.
2.5.2 Add x^{2009} y^{2009} to both sides of the equation.
2.5.3 Re-write the equations as follows: (x_{i+1} − 6)² + (x_i − 8)² = 50.
this prove that all ai must be positive. Finally, show that all ai are equal to
1.
3.1.2 It follows immediately from Vieta’s formulas (see (3.1)) that
∑_{i=1}^{n} x_i² = 0, where xi are roots of the polynomial f(x). This implies
that if all roots are real, then all of them are equal to 0.
3.1.3 Observe that we have 1
x
4
= 9(x + 2)
2
.
3.2.1 Prove that f (0) = 0 and then that f (−a) = −f (a). Then, consider
possible values of f (2) depending on the arbitrarily chosen value of f (1).
3.2.2 Consider the equation for y = 0 and y = f (x) to get the relations needed
to derive the solution.
3.2.3 Consider pairs of (x, y) of the form (0, 0) , (0, f (0)) , (0, y) , and
(x, f (x)).
2
) .
3.3.2 Note that for x ≠ 0 we have that f (x) = (f (1/(1 − x) − 1)/x.
3.3.3 Show that
f (a, b, c) = a + f (0, 0, b) − f (0, 0, a) + f (0, b, c) − f (0, a, b) .
3.4.1 Since there exist a, b ∈ Z such that P (a) = 0 and P (b) = 1, we get that
(b − a) | (P(b) − P(a)), and so |b − a| = 1. Then, define
Q(x) := P(a + (b − a)x) and show that Q(fi) = fi for all i ∈ N.
0 ≤ ℓ < k, and Q(x) has degree less than ℓ (or Q(x) = 0 everywhere if
ℓ = 0). Then, consider the case when the polynomial P (x) has only one
term.
3.7.1 Prove first that each Pi(x) has degree at most 1. Then, represent each
7.4 Combinatorics
4.1.1. Greedily select pairs of members that know each other. If you stop
prematurely, since there are no more pairs to select from, then select any
two members that are not yet assigned, say, A1 and A2. Now, show that,
knowing that each of them knows at least n members already assigned, it
is possible to find an already selected pair of people, say, B1 and B2, in
such a way that Ai and Bi (i = 1, 2) know each other. After removing the
pair (B1, B2) and adding the pairs (A1, B1) and (A2, B2), we improve
4.4.3 Select an arbitrary member of the club. We will assign him/her 1 if he/she
knows at least half of the members, and 0 otherwise. Now, remove this
member and all the members that do not match the chosen majority,
leaving at least 4^t/2 members. Repeat the process on the remaining
subset of members. Observe that this process lasts at least 2t rounds (if
there is only one member left at the beginning of round 2t, we may
assign 0 or 1 arbitrarily). But this means that at least t members have 1
assigned or at least t members have 0 assigned. The last step is to observe
that this set of people satisfies the requirements of the problem.
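The process described in this hint can be simulated directly. The sketch below assumes a club of 4^t members with a symmetric "knows" relation (our reading of the hint; the exercise statement is not reproduced here), and the random acquaintance relation is purely illustrative.

```python
import random

def greedy_homogeneous_group(members, knows, t):
    """Run the greedy process and return (label, group) with len(group) == t.

    Label 1 means everyone in the group knows everyone else in it;
    label 0 means no two members of the group know each other.
    """
    remaining = list(members)
    labelled = []
    while remaining:
        member = remaining.pop(0)
        friends = [m for m in remaining if knows(member, m)]
        strangers = [m for m in remaining if not knows(member, m)]
        # Keep the majority side, so the pool at most halves in each round.
        if len(friends) >= len(strangers):
            labelled.append((member, 1))
            remaining = friends
        else:
            labelled.append((member, 0))
            remaining = strangers
    ones = [m for m, label in labelled if label == 1]
    zeros = [m for m, label in labelled if label == 0]
    return (1, ones[:t]) if len(ones) >= t else (0, zeros[:t])

t = 3
club = range(4 ** t)
rng = random.Random(2020)
acquainted = {(i, j) for i in club for j in club if i < j and rng.random() < 0.5}

def knows(a, b):
    return (min(a, b), max(a, b)) in acquainted

label, group = greedy_homogeneous_group(club, knows, t)
print(label, group)
```

Every member labelled 1 knows all members selected after him/her, and every member labelled 0 knows none of them, which is why the returned group is homogeneous.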
4.5.1 Estimate the number of increasing arithmetic progressions of length k in
X by \binom{N}{2} = N(N − 1)/2 < 2^{k−1}. Consider then a random partition of X
into two subsets A and B and show that the expected number of k-element
sequences in A and in B is less than 1.
4.5.2 Consider a random tournament. For each ordering, compute the
probability that it has the desired property, namely, ti won against ti+1,
4.9.1 Notice that adding 1 to all sides on one die and subtracting 1 from all
sides on the other die does not affect the distribution for their sum.
4.9.2 You should obtain xn = (2^n − (−1)^n)/3.
4.9.3 Calculate the probability of getting 10 and 11 using f(x)³, where f(x) is
the generating function for a fair die we have introduced in the solution.
You might want to use a computer to expand the resulting polynomial.
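One way to carry out the expansion suggested in hint 4.9.3 is sketched below. It assumes that f(x) = (x + x² + ⋯ + x⁶)/6, the usual probability generating function of a fair six-sided die; polynomials are stored as coefficient lists, and exact fractions avoid rounding.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    result = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

# Assumed generating function of one fair die: coefficient 1/6 at x^1, ..., x^6.
die = [Fraction(0)] + [Fraction(1, 6)] * 6

three_dice = poly_mul(poly_mul(die, die), die)   # coefficients of f(x)^3
print(three_dice[10], three_dice[11])
```

The coefficient at x^k in f(x)³ is the probability that three dice sum to k, so the two printed coefficients are the probabilities asked for.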
to get that
\[
\begin{aligned}
n^{12} + 64 &= (n^6 - 4n^3 + 8)(n^6 + 4n^3 + 8) \\
&= (n^2 + 2n + 2)(n^4 - 2n^3 + 2n^2 - 4n + 4) \\
&\quad\; (n^2 - 2n + 2)(n^4 + 2n^3 + 2n^2 + 4n + 4).
\end{aligned}
\]
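The factorization can be spot-checked with exact integer arithmetic. This is only a sketch with function names of our choosing; two polynomials of degree 12 that agree at more than 12 points are identical, so agreement on the range below confirms the identity.

```python
def original(n):
    return n ** 12 + 64

def two_factors(n):
    return (n ** 6 - 4 * n ** 3 + 8) * (n ** 6 + 4 * n ** 3 + 8)

def four_factors(n):
    return ((n ** 2 + 2 * n + 2) * (n ** 4 - 2 * n ** 3 + 2 * n ** 2 - 4 * n + 4)
            * (n ** 2 - 2 * n + 2) * (n ** 4 + 2 * n ** 3 + 2 * n ** 2 + 4 * n + 4))

# 41 sample points, more than the degree 12, so equality here means equality everywhere.
print(all(original(n) == two_factors(n) == four_factors(n) for n in range(-20, 21)))
```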
5.4.3 ϕ(100) = 40.
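The value of Euler's function used in this hint can be confirmed by counting directly from the definition. The brute-force helper below is a sketch of ours, meant only for small arguments.

```python
from math import gcd

def phi(n):
    """Euler's totient: how many k in 1..n are co-prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print(phi(100))
```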
5.5.1 Observe that 3 does not divide 2k.
5.5.2 Note that |20¹ − 9¹| = 11 and that |20^m − 9^n| ≡ 1 or 9 (mod 10).
Then, show that |20^m − 9^n| cannot be equal to 1 nor to 9.
5.6.1 Separately consider the case when x is even and when x is odd.
5.6.2 Note first that one may assume that xi and yi are co-prime for each
i ∈ [2011]. Next, consider the remainder of each term in the product when
divided by 3.
5.6.3 Select any number a ∈ S and show that it is possible to select b ∈ S so
that the remainder when dividing a + b by n is in S. Then, show that
having a and b fixed, one can select c ∈ S so that a + c, b + c and
a + b + c satisfy the desired conditions.
i=1
ℓi
i=1 j=0 i
j ∈ {0, 1, …, ℓi}, and two different divisors have different
representations.
5.7.2 Show first that there is no solution for n = 1 nor for n = 2. Next, find
solutions for n ∈ {3, 4, 5}. Finally, show that if there is a solution for n,
then there is one for n + 3.
5.7.3 Denote by D(n) the difference between the sum of white and the sum of
black divisors of n. Prove that D(p ⋅ q) = D(p) ⋅ D(q) when p and q are
co-prime. From this conclude that it is enough to show that D(q^k), where
2 2
2 2
(2y − 1) = (2x − x) − (x − 1) .
7.6 Geometry
6.1.1 Observe that points A, B′, A′, and B lie on a circle and that M is the center
of this circle.
6.1.2 Observe that P, B, Q, and C lie on a circle.
6.1.3 Compute angles ∢COB and ∢CAB.
6.2.1 Consider point R inside the angle such that
∢P AQ
6.4.3 Add line BD to the plot and apply Menelaus’s theorem twice.
6.5.1 Prove that if two half-lines, ℓ1 and ℓ2, are not parallel and have a common
|AB| = |AC| have the property that all points lying on this line segment
6.5.2 Add point X such that ADCX is a rectangle. Note that B, F, E, and P
are collinear and so ∢DF X = π/2. Finally, observe that points D, F, A,
X, and C lie on the same circle.
6.5.3 Add point P such that EAGP is a parallelogram. Observe then that
CF P E is also a parallelogram. Finally, note that N is the middle point of
the line segment P C , and M is the middle point of the line segment P B.
6.6.1 Prove that |P C| = |P D|.
6.6.2 Consider three circles: circle k1 with center in D and going through points
C and E, circle k2 with center in F and going through points E and A, and
circle k3 with a center in B and going through points A and C. Consider
now three sets of points. The first one, l12, has the same power with
respect to k1 and k2. Prove that l12 is a line that contains the altitude of
FEB going through E. Similarly, define l13 and l23, and prove that they
DM .
6.8.1 Consider triangles ABQ and DQN , and then triangles AP D and BM P .
6.8.2 Denote by C′ the point of intersection of line CK with line AD. Observe
that |AC′| = |DA|, and that NC′ and BC are parallel. Use those facts
calculate [BCP ]. Apply the same process to triangles DQC , ARD, and
BSA.
8.1 Inequalities
8.2 Equalities and Sequences
8.3 Functions, Polynomials, and Functional Equations
8.4 Combinatorics
8.5 Number Theory
8.6 Geometry
In this chapter we provide solutions for all exercises presented in the book.
8.1 Inequalities
Problem 1.1.1. Prove that for any a, b, c ∈ R such that 0 < a ≤ b ≤ c,
\[
\frac{1}{a} - \frac{1}{b} + \frac{1}{c} \ge \frac{1}{a + c - b}.
\]
Illustrate the solution graphically. Does the same inequality hold for any function f : R → R that is
convex on some connected subset of R?
Solution. Let us observe first that a + c = b + (a + c − b) and a ≤ a + c − b ≤ c (see Figure 8.1 for
the illustration of these observations; the length of the dashed line is equal to
(f(a) + f(c))/2 − (f(b) + f(a + c − b))/2). If a = c (and so, in fact, a = b = c), then both sides
are equal to 1/a and we are done. If a < c, then we note that for any convex function f (in particular,
for f(x) = 1/x, x > 0) we have
\[
\begin{aligned}
f(a) + f(c) &= \frac{c-b}{c-a} f(a) + \frac{b-a}{c-a} f(a) + \frac{b-a}{c-a} f(c) + \frac{c-b}{c-a} f(c) \\
&\ge f(b) + f(a + c - b).
\end{aligned}
\]
Hence, not only the desired inequality holds but the same is true for any convex function f. For
graphical illustration see Figure 8.1.
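The claim that f(a) + f(c) ≥ f(b) + f(a + c − b) for convex f can be probed numerically on random triples. The sketch below is a sanity check rather than a proof; the sampled range, the seed, and the three test functions are arbitrary choices of ours.

```python
import math
import random

def gap(f, a, b, c):
    """f(a) + f(c) - f(b) - f(a + c - b); non-negative when f is convex and a <= b <= c."""
    return f(a) + f(c) - f(b) - f(a + c - b)

convex_functions = {
    "1/x": lambda x: 1.0 / x,
    "exp": math.exp,
    "x^2": lambda x: x * x,
}

rng = random.Random(1)
smallest = {name: float("inf") for name in convex_functions}
for _ in range(10000):
    a, b, c = sorted(rng.uniform(0.1, 5.0) for _ in range(3))
    for name, f in convex_functions.items():
        smallest[name] = min(smallest[name], gap(f, a, b, c))

print(smallest)   # each value should be non-negative up to rounding error
```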
FIGURE 8.1: Illustration for Problem 1.1.1 (case b < a + c − b ). We take A = (a, 1/a) , B = (b, 1/b) , C = (c, 1/c) ,
D = (a + c − b, 1/(a + c − b)).
Problem 1.1.2. Prove that for any n ∈ N and any real number s ≥ 2, the following inequality holds:
\[ \frac{\sum_{k=1}^{n} k^s}{\sum_{k=1}^{n} k} \ge \left( \frac{2}{3} n + \frac{1}{3} \right)^{s-1}. \]
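Solution. Since f(x) = x^{s−1} is convex on R+ for s ≥ 2, Jensen's inequality with weights $k / \sum_{i=1}^{n} i$ gives
\[ \frac{\sum_{k=1}^{n} k^s}{\sum_{k=1}^{n} k} = \sum_{k=1}^{n} \frac{k}{\sum_{i=1}^{n} i}\, k^{s-1} \ge \left( \sum_{k=1}^{n} \frac{k}{\sum_{i=1}^{n} i}\, k \right)^{s-1} = \left( \frac{\sum_{k=1}^{n} k^2}{\sum_{i=1}^{n} i} \right)^{s-1} = \left( \frac{n(n+1)(2n+1)/6}{n(n+1)/2} \right)^{s-1} = \left( \frac{2}{3} n + \frac{1}{3} \right)^{s-1}. \]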
\[ \sqrt{x} + \sqrt{x + 2} < 2\sqrt{x + 1}. \]
Solution. Since f(x) = √x is concave, we get from Jensen's inequality that for any x ∈ R+
\[ \sqrt{x} + \sqrt{x + 2} = 2\left( \frac{1}{2}\sqrt{x} + \frac{1}{2}\sqrt{x + 2} \right) \le 2\sqrt{\frac{1}{2} x + \frac{1}{2}(x + 2)} = 2\sqrt{x + 1}. \]
Since f(x) = √x is not a linear function and we applied the inequality with $x_1 = x \ne x + 2 = x_2$, in
fact the strict inequality holds.
Problem 1.2.1. Show that for any a, b, c, d ∈ R+, the following inequality holds:
\[ (a + b + c + d)\left( \frac{1}{a} + \frac{1}{b} + \frac{4}{c} + \frac{16}{d} \right) \ge 64. \]
\[ \frac{a + b + 2 \cdot \frac{c}{2} + 4 \cdot \frac{d}{4}}{8} \ge \frac{8}{\frac{1}{a} + \frac{1}{b} + 2 \cdot \frac{1}{c/2} + 4 \cdot \frac{1}{d/4}}. \]
Problem 1.2.2. Show that for any n numbers $a_1, \ldots, a_n$ ∈ R+, the following inequality holds:
\[ \frac{a_1}{a_2 + 1} + \frac{a_2}{a_3 + 1} + \cdots + \frac{a_{n-1}}{a_n + 1} + \frac{a_n}{a_1 + 1} \ge \frac{n^2}{n + \alpha}, \]
where $\alpha = \sum_{i=1}^{n} 1/a_i$.
Solution. In order to simplify the notation, let us set $a_{n+1} = a_1$. Using the arithmetic-geometric mean
inequality, we get that
\[ \sum_{i=1}^{n} \frac{a_i}{a_{i+1} + 1} = n \cdot \frac{\sum_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}}{n} \ge n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}}. \]
However,
\[ n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}} = n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_i + 1}} = n \sqrt[n]{\prod_{i=1}^{n} \frac{1}{1/a_i + 1}}. \]
\[ \sqrt[n]{\prod_{i=1}^{n} \frac{1}{1/a_i + 1}} \ge \frac{n}{\sum_{i=1}^{n} (1 + 1/a_i)} = \frac{n}{n + \alpha}, \]
m m
a + b ≥ 2 ,
where m ∈ R . +
\[ a^b b^a \le a^a b^b. \]
Solution. The inequality we aim to prove is equivalent to the following one:
\[ b \log(a) + a \log(b) \le a \log(a) + b \log(b). \]
Without loss of generality, due to the symmetry, we may assume that a ≤ b and so log(a) ≤ log(b) .
Now, the inequality above follows immediately from the rearrangement inequality.
Problem 1.3.2. Prove that for any a, b, c ∈ R+,
\[ \frac{ab}{c} + \frac{bc}{a} + \frac{ca}{b} \ge a + b + c. \]
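Solution. After dividing both sides by abc ∈ R+ we get the following inequality:
\[ \frac{1}{c} \cdot \frac{1}{c} + \frac{1}{a} \cdot \frac{1}{a} + \frac{1}{b} \cdot \frac{1}{b} \ge \frac{1}{b} \cdot \frac{1}{c} + \frac{1}{c} \cdot \frac{1}{a} + \frac{1}{a} \cdot \frac{1}{b}. \]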
Due to the symmetry, without loss of generality, we may assume that 1/a ≤ 1/b ≤ 1/c. As in the
previous problem, the above inequality follows immediately from the rearrangement inequality.
Problem 1.3.3. Prove that for any a, b, c ∈ R+,
\[ a^a b^b c^c \ge (abc)^{(a+b+c)/3}. \]
Solution. Without loss of generality, we may assume that a ≤ b ≤ c . After taking a logarithm of both
sides of the inequality we get
\[ a \log(a) + b \log(b) + c \log(c) \ge \frac{a + b + c}{3} \left( \log(a) + \log(b) + \log(c) \right). \]
After multiplying both sides by 3 and rearranging the terms, we get
and that
b) $\left( \frac{n-1}{n} \right)^{n^2-1} < \frac{1}{n+2}$.
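Solution. In both parts, we first invert both sides of the inequality and then apply Bernoulli's
inequality. To show part a), that is, to show that
\[ 2n - 1 < \left( 1 + \frac{2}{n} \right)^{n^2-n}, \]
note that Bernoulli's inequality gives $(1 + 2/n)^{n^2-n} > 1 + (n^2 - n) \cdot 2/n = 2n - 1$.
(Since n ≥ 2, $n^2 - n = n(n - 1) \ge 2 > 1$.) Similarly, in order to show part b), that is, to show that
\[ n + 2 < \left( 1 + \frac{1}{n-1} \right)^{n^2-1}, \]
note that Bernoulli's inequality gives $(1 + 1/(n-1))^{n^2-1} > 1 + (n^2 - 1)/(n - 1) = n + 2$.
(Since n ≥ 2, $n^2 - 1 \ge 3 > 1$.)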
Problem 1.4.2. Prove that for any real number x > −1 and n ∈ N,
\[ \sqrt[n]{1 + x} \le 1 + \frac{x}{n}. \]
Solution. Raising both sides to the power of n yields an equivalent inequality that
\[ 1 + x \le \left( 1 + \frac{x}{n} \right)^n \]
n−1
2 2
4 < (1 + ) = ((1 + ) ) .
n n
Since for n > 2 we have that $(1 + 2/n)^n > (1 + 2/2)^2 = 4$, the above inequality holds. Finally, note
that since $\lim_{n \to \infty} (1 + 2/n)^n = e^2$, the constant 4 can be replaced by any number smaller than $e^2 \approx 7.39$.
\[ \sqrt[n]{n} > \sqrt[n+1]{n + 1}\,? \]
Solution. Raising both sides of the inequality to the power of n(n + 1) gives us $n^{n+1} > (n + 1)^n$.
After dividing both sides by $n^n$, we see that it is enough to show that $n > (1 + 1/n)^n$. Since
$(1 + 1/n)^n \le \exp(1/n)^n = e < 3$, this inequality holds for any n ≥ 3. The remaining two cases can
be checked by hand: it holds neither for n = 1 (as 1 < 2) nor for n = 2 (as 2 < 9/4). We get that the
inequality holds for any natural number at least 3.
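The small cases are easy to get wrong by hand, so here is a minimal sketch that tests the equivalent integer inequality $n^{n+1} > (n + 1)^n$ exactly. The function name is illustrative and not taken from the book.

```python
def root_inequality_holds(n: int) -> bool:
    """Exact test of n^(1/n) > (n+1)^(1/(n+1)).

    Both sides are raised to the power n(n+1), which turns the comparison
    into the integer inequality n^(n+1) > (n+1)^n.
    """
    return n ** (n + 1) > (n + 1) ** n


# Values of n in 1..50 for which the inequality is true.
holding = [n for n in range(1, 51) if root_inequality_holds(n)]
print(holding[:5])
```

Integer arithmetic avoids any rounding issue near n = 2 and n = 3, where the two roots are close to each other.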
Problem 1.5.3. For which n ∈ N do we have that
\[ (n - 1)^n (n + 1)^{n+1} > n^{2n+1}\,? \]
or, equivalently,
\[ \left( 1 - \frac{1}{n^2} \right)^{n^2} > \frac{1}{(1 + 1/n)^n}. \]
Since $(1 + 1/n)^n \le \exp(1/n)^n = e$, the right hand side is at least 1/e. As a result, the original inequality
Solution. After multiplying both sides by (n + 1)/n we get $n + 1 > (1 + 1/n)^n$. Since
$(1 + 1/n)^n \le \exp(1/n)^n = e < 3$, the inequality holds for any n ≥ 2. One can directly check that it
In order to show the second inequality, observe that $(1 + 1/n)^n \le (\exp(1/n))^n = e < 2.72$. On the
other hand, since 3(n + 1)/(n + 2) is increasing and tending to 3 as n → ∞, the desired inequality
holds for n large enough. In fact, it certainly holds for n ≥ 9, as 3(n + 1)/(n + 2) = 30/11 > 2.72 if
n = 9. One can easily inspect the 8 missing cases n ∈ [8] to show that the inequality holds for all
Problem 1.6.1. Show that for any n ∈ N, there exists a non-negative x ∈ R such that
\[ \prod_{i=1}^{n} (1 + x)^i < 1 + \frac{n^2 + n + 1}{2}\, x. \]
Solution. Since n is fixed, there is a finite number of elements in the product on the left hand side as
well as in the binomial expansion of $(1 + x)^i$ for any i ∈ [n]. As a result,
\[ \prod_{i=1}^{n} (1 + x)^i = \prod_{i=1}^{n} \left( 1 + ix + O(x^2) \right) = 1 + \sum_{i=1}^{n} ix + O(x^2) = 1 + \frac{n(n+1)}{2}\, x + O(x^2) < 1 + \frac{n^2 + n}{2}\, x + \frac{x}{2} = 1 + \frac{n^2 + n + 1}{2}\, x, \]
provided that x > 0 is sufficiently small.
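The asymptotic argument does not name a concrete x. As a hedged illustration, the sketch below tries the witness x = 1/N² with N = n(n + 1)/2; this particular choice is an assumption made here, not the book's, and the check uses exact rational arithmetic.

```python
from fractions import Fraction


def witness(n: int) -> Fraction:
    """Candidate x = 1/N^2 with N = n(n+1)/2 (an illustrative choice)."""
    big_n = n * (n + 1) // 2
    return Fraction(1, big_n * big_n)


def inequality_holds(n: int, x: Fraction) -> bool:
    """Exact test of prod_{i=1..n} (1+x)^i < 1 + (n^2+n+1)/2 * x."""
    lhs = Fraction(1)
    for i in range(1, n + 1):
        lhs *= (1 + x) ** i
    rhs = 1 + Fraction(n * n + n + 1, 2) * x
    return lhs < rhs


for n in range(1, 6):
    print(n, witness(n), inequality_holds(n, witness(n)))
```

Note that x = 0 is not a witness, since both sides are then equal to 1, so a strictly positive x is needed.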
Problem 1.6.2. Prove that for any polynomial W(x) and sufficiently large x we have that
$(1 + x/n)^n > W(x)$, if n ∈ N is greater than the degree of W. What does it tell us about the function
$e^x$?
Solution. Since $W(x) = O(x^{n-1})$, observe that
\[ \left( 1 + \frac{x}{n} \right)^n - W(x) = \frac{x^n}{n^n} + O(x^{n-1}) = \frac{x^n}{n^n} \left( 1 + O(1/x) \right) > \frac{x^n}{2n^n} > 0, \]
provided that x ∈ R is sufficiently large.
As $e^x = \lim_{n \to +\infty} (1 + x/n)^n$ and the sequence $(1 + x/n)^n$ is increasing, we see that function $e^x$
\[ (a + b + c)\left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \ge 9. \]
\[ \left( \sqrt{a}\sqrt{1/a} + \sqrt{b}\sqrt{1/b} + \sqrt{c}\sqrt{1/c} \right)^2 = 3^2 = 9. \]
As a remark, we note that the problem can be also solved by applying the arithmetic-harmonic
inequality (and this approach is mentioned in the remark to the original problem presented in PLMO II
– Phase 1 – Problem 6, which assumed additionally that a + b + c = 1).
Problem 1.7.2. Prove that for any a, b, c ∈ R+ such that a + b + c = 1, we have that
\[ \sqrt{2a + 1} + \sqrt{2b + 1} + \sqrt{2c + 1} \le \sqrt{15}. \]
which gives us the desired result. Alternatively, one can use Jensen’s inequality to get that
= 3√ 5/3 = √ 15.
Problem 1.7.3. Prove that if a, b, c ∈ R are such that a + b + c = 1 and min{a, b, c} ≥ −3/4, then
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{9}{10}. \]
Does this inequality hold without the additional assumption that min{a, b, c} ≥ −3/4?
Solution. It is natural to try to bound function $f(x) := x/(x^2 + 1)$ by some linear function, that is, we
search for a bound of the form
\[ f(x) = \frac{x}{x^2 + 1} \le Ax + B =: g(x) \]
for some constants A and B. Indeed, if this can be done, then the left hand side of our inequality,
f (a) + f (b) + f (c), would be bounded by g(a) + g(b) + g(c) = A(a + b + c) + 3B = A + 3B.
\[ f(1/3) = \frac{3}{10} = g(1/3) = \frac{A}{3} + B. \]
Moreover, the additional assumption that a, b, and c must be at least −3/4 suggests that the same
should be true for x = −3/4:
\[ f(-3/4) = -\frac{12}{25} = g(-3/4) = \frac{-3A}{4} + B. \]
We get the following system of equations: A/3 + B = 3/10 and −3A/4 + B = −12/25. It follows
that A = 18/25 and B = 3/50. Since A + 3B = 9/10, the result will hold once we prove that for any
x ≥ −3/4
\[ \frac{x}{x^2 + 1} \le \frac{18}{25}\, x + \frac{3}{50} \]
or, equivalently, that
\[ h(x) := 36x^3 + 3x^2 - 14x + 3 \ge 0. \]
(One can see it after multiplying both sides of the previous inequality by $50(x^2 + 1)$.) But we already
know that h(1/3) = h(−3/4) = 0 so h(x) is divisible by (3x − 1)(4x + 3) (see Section 3.6 for
additional explanations of this fact). After dividing these two polynomials we find that the remaining
factor is 3x − 1 and so the inequality is equivalent to $(3x - 1)^2 (4x + 3) \ge 0$, which is clearly true for
any x ≥ −3/4.
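The constants and the factorization above can be re-derived mechanically. The following is a minimal sketch in exact rational arithmetic; the helper `poly_mul` and the coefficient-list representation (constant term first) are illustrative choices, not part of the original solution.

```python
from fractions import Fraction as F


def f(x):
    """The function f(x) = x / (x^2 + 1) from the solution."""
    return x / (x * x + 1)


# Solve A/3 + B = 3/10 and -3A/4 + B = -12/25 by elimination.
A = (F(3, 10) - F(-12, 25)) / (F(1, 3) - F(-3, 4))
B = F(3, 10) - A / 3


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


h = [3, -14, 3, 36]  # 36x^3 + 3x^2 - 14x + 3
factored = poly_mul(poly_mul([-1, 3], [-1, 3]), [3, 4])  # (3x-1)^2 (4x+3)

print(A, B, A + 3 * B, factored == h)
```

The last printed value confirms that h(x) coincides with $(3x - 1)^2 (4x + 3)$, which is the form used to conclude the bound for x ≥ −3/4.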
We will show now that the assumption that min{a, b, c} ≥ −3/4 is not necessary. We start with
proving a few properties of function f (x)— for a graph of this function see a gray curve in Figure 8.2.
Property A: Function f(x) is odd, that is, f(x) = −f(−x) for any x ∈ R. Indeed,
\[ f(x) = \frac{x}{x^2 + 1} = \frac{-(-x)}{(-x)^2 + 1} = -f(-x). \]
\[ 0 \le f(x) \le 1/2 = f(1). \]
The lower bound of 0 is trivial. To see the upper bound of 1/2, note that $x^2 - 2x + 1 = (x - 1)^2 \ge 0$
Property C: Function f(x) is increasing on the interval [0, 1], that is, f(x) < f(y) for any
0 ≤ x < y ≤ 1. Indeed, f(x) < f(y) is equivalent to $x(y^2 + 1) < y(x^2 + 1)$ which, in turn, is
equivalent to 0 < (1 − yx)(y − x). But the last inequality is clearly true for any 0 ≤ x < y ≤ 1 as
xy ∈ [0, 1).
Property D: Function f(x) is decreasing on the interval [1, ∞), that is, f(x) > f(y) for any 1 ≤ x < y.
In order to see this, the same argument as before can be applied with the only difference being that
now xy ∈ (1, ∞).
Let us summarize what we have learned about function f (x): it is decreasing on the interval
(−∞, −1], reaching −1/2 at x = −1, then increasing on the interval [−1, 1], reaching 1/2 at x = 1,
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{2}{2^2 + 1} + \frac{1}{1^2 + 1} + \frac{0}{0^2 + 1} = \frac{9}{10}. \]
On the other hand, if c ∈ [−3, −3/4], then
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{1}{1^2 + 1} + \frac{1}{1^2 + 1} + \frac{-3}{3^2 + 1} = \frac{7}{10} < \frac{9}{10}, \]
and we are done.
Problem 1.8.1. Prove that for any n ∈ N,
\[ \frac{1}{2n + 1} \le \frac{\binom{2n}{n}}{2^{2n}} \le 1. \]
Can you improve these bounds for large n?
Solution. Consider flipping a fair coin 2n times. For any i ∈ [2n] ∪ {0}, let P(i) be the probability of getting
exactly i heads. Clearly, $\sum_{i=0}^{2n} P(i) = 1$ (the number of heads has to be between 0 and 2n) and for any
i ∈ [2n] ∪ {0}
\[ P(i) = \binom{2n}{i} \cdot \left( \frac{1}{2} \right)^{2n} \]
(there are $\binom{2n}{i}$ ways to select i rounds when heads are obtained and each such outcome occurs with
probability $(1/2)^{2n}$). Hence, our goal is to estimate P(n). Because of this connection, the upper bound
(P(n) ≤ 1) is trivial. (Let us note that we could have obtained the same upper bound without using
the above probabilistic argument by observing that $2^{2n} = (1 + 1)^{2n}$ and then using the binomial
theorem.) In order to see the lower bound (P(n) ≥ 1/(2n + 1)) we note that P(i), as a function of i,
is maximized exactly for i = n; that is, for any i ∈ [2n] ∪ {0}, P(i) ≤ P(n). To see this note that
P(i) = P(i − 1) ⋅ (2n + 1 − i)/i, and the ratio (2n + 1 − i)/i is larger than one for i ≤ n and smaller than one otherwise.
\[ P(n) \ge \frac{1}{2n + 1} \sum_{i=0}^{2n} P(i) = \frac{1}{2n + 1}, \]
\[ \frac{1}{2\sqrt{n}} \le \frac{\binom{2n}{n}}{2^{2n}} \le \frac{1}{\sqrt{2n}}. \]
\[ \frac{\binom{2n}{n}}{2^{2n}} = \frac{(2n)!}{(n!)^2\, 2^{2n}} \sim \frac{\sqrt{2\pi(2n)}\,(2n/e)^{2n}}{2\pi n\,(n/e)^{2n}\, 2^{2n}} = \frac{1}{\sqrt{\pi n}}. \]
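To see how the bounds and the Stirling estimate compare numerically, here is a small sketch. The bounds are tested in exact integer form (squaring where square roots occur); the function names are illustrative.

```python
import math


def central_probability(n: int) -> float:
    """P(n) = C(2n, n) / 2^(2n): probability of exactly n heads in 2n fair flips."""
    return math.comb(2 * n, n) / 4 ** n


def bounds_hold(n: int) -> bool:
    """Exact integer versions of
    1/(2n+1) <= P(n) <= 1  and  1/(2 sqrt(n)) <= P(n) <= 1/sqrt(2n)."""
    c = math.comb(2 * n, n)
    weak = (2 * n + 1) * c >= 4 ** n and c <= 4 ** n
    sharp = 4 * n * c * c >= 16 ** n and 2 * n * c * c <= 16 ** n
    return weak and sharp


def stirling_ratio(n: int) -> float:
    """P(n) divided by the asymptotic estimate 1/sqrt(pi*n); tends to 1."""
    return central_probability(n) * math.sqrt(math.pi * n)


for n in (1, 10, 100, 1000):
    print(n, bounds_hold(n), stirling_ratio(n))
```

The printed ratio approaching 1 illustrates that $1/\sqrt{\pi n}$ is the correct order of magnitude, so both pairs of bounds are tight only up to a constant factor.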
Problem 1.8.2. Prove that for k, n ∈ N such that k ≤ n, and p, q ∈ [0, 1] such that p < q, we have
that
\[ \sum_{i=k}^{n} \binom{n}{i} \left( q^i (1 - q)^{n-i} - p^i (1 - p)^{n-i} \right) \ge 0. \]
Solution. In order to solve this problem, we are going to use a standard but very useful proof technique
in probability theory that allows one to compare two experiments. Consider two biased coins, the first
with probability p of turning up heads and the second with probability q > p of turning up heads. For
any fixed k, the probability $p_k$ that the first coin produces at least k heads should be at most the
probability $q_k$ that the second coin produces at least k heads. Clearly,
\[ p_k = \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i} \quad \text{and} \quad q_k = \sum_{i=k}^{n} \binom{n}{i} q^i (1 - q)^{n-i}. \]
However, proving it is rather difficult with a standard counting argument. Coupling easily
circumvents this problem. Let $X_1, X_2, \ldots, X_n$ be indicator random variables for heads in a sequence
of n flips of the first coin. In other words, $X_i = 1$ if the ith flip is a head and $X_i = 0$ otherwise. It follows
that
\[ p_k = P\left( \sum_{i=1}^{n} X_i \ge k \right). \]
then $Y_i = 1$ with probability (q − p)/(1 − p). Clearly, the sequence of $Y_i$ has exactly the probability
distribution of tosses made with the second coin. Indeed, for any i ∈ [n],
\[ P(Y_i = 1) = P(X_i = 1) + P(X_i = 0) \cdot \frac{q - p}{1 - p} = p + (1 - p) \cdot \frac{q - p}{1 - p} = q. \]
P(X ≥ k) ≤ P (Y ≥ k), as expected. We typically say that X is (stochastically) bounded from above
by Y.
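The coupling argument can be cross-checked on small cases. The sketch below computes the two tail probabilities exactly and also recomputes the marginal of the coupled coin; the names `tail` and `coupled_head_probability` are illustrative.

```python
from fractions import Fraction
from math import comb


def tail(n: int, k: int, p: Fraction) -> Fraction:
    """P(at least k heads in n independent flips with head probability p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def coupled_head_probability(p: Fraction, q: Fraction) -> Fraction:
    """Marginal P(Y_i = 1) under the coupling: keep every head of the first coin,
    and turn a tail into a head with probability (q - p)/(1 - p). Requires p < 1."""
    return p + (1 - p) * ((q - p) / (1 - p))


p, q = Fraction(3, 10), Fraction(6, 10)
print(tail(5, 3, p), tail(5, 3, q), coupled_head_probability(p, q))
```

Because every head of the first coin is also a head of the coupled coin, the second tail probability can never be smaller than the first, which is what the exact values show.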
Problem 1.9.1. Prove that for all sequences of n numbers $a_1, \ldots, a_n$ ∈ R, we have that
\[ \sqrt{n^2 + \left( \sum_{i=1}^{n} a_i \right)^2} \le \sum_{i=1}^{n} \sqrt{1 + a_i^2}. \]
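\[ P_i := \left( i, \sum_{\ell=1}^{i} a_\ell \right). \]
(In particular, $P_0 = (0, 0)$.) For any two points, $P_i$ and $P_j$, let $d(P_i, P_j)$ be the distance between $P_i$ and
$P_j$. Clearly,
\[ d(P_0, P_n) = \sqrt{(n - 0)^2 + \left( \sum_{\ell=1}^{n} a_\ell - 0 \right)^2} = \sqrt{n^2 + \left( \sum_{\ell=1}^{n} a_\ell \right)^2}, \]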
the left hand side of our inequality. On the other hand, for each i ∈ [n],
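\[ d(P_{i-1}, P_i) = \sqrt{(i - (i - 1))^2 + \left( \sum_{\ell=1}^{i} a_\ell - \sum_{\ell=1}^{i-1} a_\ell \right)^2} = \sqrt{1 + a_i^2}. \]
By the triangle inequality,
\[ d(P_0, P_n) \le \sum_{i=1}^{n} d(P_{i-1}, P_i). \]
But this is exactly the inequality we wanted to prove and so we are done! Finally, let us note that the
equality holds if and only if all the $a_i$ are equal.
Problem 1.9.2. Prove that for any x ∈ R such that 1/4 ≤ x ≤ 1, we have that
\[ \frac{x}{2}\sqrt{1 - x^2} + \frac{1}{16}\sqrt{16x^2 - 1} < \frac{4}{9}. \]
Solution. Consider a pentagon ABCDE with all sides of length 1/2 and with |AD| = |BD| = x, so that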
ABD is an isosceles triangle. Since 1/2 = |AB| ≤ |AD| + |BD| = 2x, x ≥ 1/4. On the other hand,
x = |AD| ≤ |AE| + |ED| = 1. The limiting shape when x → 1 is an isosceles triangle ABD
whereas if x → 1/4 we get two isosceles triangles ADE and BCD (see Figure 8.3).
FIGURE 8.3: Illustration for Problem 1.9.2. ABCDE is the regular pentagon, A B C
(dashed grey) is the ‘limiting pentagon’ when x → 1/4.
′′ ′′
′ ′ ′
Fix any x ∈ [1/4, 1] and consider the corresponding pentagon ABCDE (including the two limiting
scenarios, x = 1/4 and x = 1). Let us first calculate the area of our pentagon ABCDE which is the
sum of the areas of three isosceles triangles. The first one, ABD, has one side of length 1/2 and two
sides of length x. The remaining two, BCD and ADE, have one side of length x and two sides of
length 1/2. Hence, the area of the pentagon is equal to
\[ \frac{1}{2}(1/2)^2 \sqrt{\frac{x^2}{(1/2)^2} - \frac{1}{4}} + 2 \cdot \frac{1}{2}\, x^2 \sqrt{\frac{(1/2)^2}{x^2} - \frac{1}{4}} = \frac{1}{16}\sqrt{16x^2 - 1} + \frac{x}{2}\sqrt{1 - x^2}, \]
which is exactly the left hand side of our inequality. On the other hand, the area of our pentagon is less
than or equal to the area of a regular pentagon of side length a = 1/2, which is equal to
\[ \frac{a^2}{4}\sqrt{5(5 + 2\sqrt{5})} = \frac{1}{16}\sqrt{5(5 + 2\sqrt{5})} < 0.44 < \frac{4}{9}. \]
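A numerical sanity check of the geometric bound: the sketch below evaluates the area expression on a fine grid of x ∈ [1/4, 1] and compares its maximum with the area of the regular pentagon of side 1/2. The grid size and function names are illustrative assumptions.

```python
import math


def pentagon_area(x: float) -> float:
    """Left hand side of the inequality: (x/2) sqrt(1-x^2) + (1/16) sqrt(16x^2-1)."""
    first = x / 2 * math.sqrt(max(0.0, 1 - x * x))
    second = math.sqrt(max(0.0, 16 * x * x - 1)) / 16
    return first + second


# Area of the regular pentagon with side a = 1/2.
regular_area = (0.5 ** 2 / 4) * math.sqrt(5 * (5 + 2 * math.sqrt(5)))

# Diagonal of that regular pentagon: golden ratio times the side length.
regular_diagonal = (1 + math.sqrt(5)) / 4

steps = 100_000
max_area = max(pentagon_area(0.25 + 0.75 * j / steps) for j in range(steps + 1))
print(max_area, regular_area, 4 / 9)
```

The grid maximum is attained near x ≈ 0.809, the diagonal of the regular pentagon, in line with the claim that the regular pentagon maximizes the area.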
Solution. By subtracting the first equation from the third one, we get that
\[ b - d = (c - a)(c^2 + ac + a^2 + 1). \]
Analogously, after subtracting the second equation from the fourth one, we get that
\[ c - a = (d - b)(d^2 + bd + b^2 + 1). \]
(8.1)
Now, by considering the term $f(d) := d^2 + bd + b^2 + 1$ as a function of d, we see that f(d) > 0 as the
that $g(c) := c^2 + ac + a^2 + 1 > 0$. Hence, we get from (8.1) that b = d. A symmetric argument can
be used to show that c = a. We can now get back to the original equations to see that $a^3 + b = a$ and
and so a ∈ {0, −√2, √2}. It follows that there are three candidate solutions:
We directly check that all of them satisfy the original system of equations.
Problem 2.1.2. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} (x - y)(x^3 + y^3) = 7 \\ (x + y)(x^3 - y^3) = 3. \end{cases} \]
Solution. We see immediately that x ≠ y and x ≠ −y. Therefore, we can divide the two equations to
get that
\[ \frac{(x - y)(x + y)(x^2 - xy + y^2)}{(x + y)(x - y)(x^2 + xy + y^2)} = \frac{7}{3}, \]
and so
\[ 0 = 7(x^2 + xy + y^2) - 3(x^2 - xy + y^2) = 4x^2 + 10xy + 4y^2 = 2(2x + y)(x + 2y). \]
If y = −2x, then we would have to have $3x^4 = -1$ which is impossible. On the other hand, if
x = −2y, then we get $3y^4 = 1$ and so $y = 1/\sqrt[4]{3}$ or $y = -1/\sqrt[4]{3}$. We conclude that there are two
candidate solutions
\[ (x, y) \in \left\{ \left( 2/\sqrt[4]{3},\, -1/\sqrt[4]{3} \right), \left( -2/\sqrt[4]{3},\, 1/\sqrt[4]{3} \right) \right\}. \]
We directly check that all of them satisfy the original system of equations.
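The direct check can be reproduced numerically. This is a minimal floating-point sketch; the function name `residuals` is illustrative.

```python
def residuals(x: float, y: float) -> tuple:
    """Left side minus right side for both equations of the system."""
    first = (x - y) * (x ** 3 + y ** 3) - 7
    second = (x + y) * (x ** 3 - y ** 3) - 3
    return first, second


r = 3 ** 0.25  # fourth root of 3
candidates = [(2 / r, -1 / r), (-2 / r, 1 / r)]

for x, y in candidates:
    print((x, y), residuals(x, y))
```

Both residual pairs vanish up to rounding error, so the two candidates do satisfy the system.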
Problem 2.1.3. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} x^2 - (y + z + yz)x + (y + z)yz = 0 \\ y^2 - (z + x + zx)y + (z + x)zx = 0 \\ z^2 - (x + y + xy)z + (x + y)xy = 0. \end{cases} \]
Solution. Let us note that the first equality can be rewritten as follows:
\[ 0 = x^2 - x(y + z) - xyz + (y + z)yz = (x - (y + z))(x - yz). \]
The same simplification can be done with the remaining equations to get the following equivalent
system:
\[ \begin{cases} (x - (y + z))(x - yz) = 0 \\ (y - (z + x))(y - zx) = 0 \\ (z - (x + y))(z - xy) = 0. \end{cases} \]
It follows that each variable is either the product or the sum of the remaining variables. We will
independently consider the following four cases.
Case 1: all variables are the sums. We have x = y + z, y = z + x, and z = x + y. We immediately get
that x = y = z = 0.
Case 2: two variables are the sums and one is the product. Without loss of generality, we may assume
that x = y + z, y = z + x, and z = xy. We immediately get that z = 0, and this implies that
x = y = 0, the solution we already discovered.
Case 3: one variable is the sum and two are the products. Without loss of generality, we may assume
that x = y + z, y = zx, and z = xy. Indeed, by symmetry we can recover the whole family of
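solutions by permuting the solution vector (x, y, z). It follows that $x = y + z = x(z + y) = x^2$ and so
x = 0 or x = 1. If x = 0, then y = z = 0, the solution we already discovered. If x = 1, then y = z and
y + z = 1, so y = z = 1/2. We get three new solutions,
namely (x, y, z) = (1, 1/2, 1/2), (x, y, z) = (1/2, 1, 1/2), and (x, y, z) = (1/2, 1/2, 1), by
permuting the variables.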
Case 4: all numbers are the products. We have x = yz, y = zx, and z = xy. If one variable is equal to
zero, then all of them must be zero and we get the particular solution (x, y, z) = (0, 0, 0) one more
time. If no variable is equal to zero, then we get that $y = z(yz) = yz^2$, and so $z^2 = 1$. The symmetric
arguments give us $y^2 = 1$, $x^2 = 1$, and so all of the variables are either 1 or −1. Since the value of x is
determined by the value of y and z (x = yz), by considering all possibilities for y and z, we see that
either all of the variables x, y, z are equal to 1 or precisely one of them is equal to 1. We directly check
that these potential solutions are feasible, giving us the following four additional solutions:
(x, y, z) = (1, 1, 1), (x, y, z) = (1, −1, −1), (x, y, z) = (−1, 1, −1), (x, y, z) = (−1, −1, 1).
Combining all the cases together, we conclude that the solution is
(x, y, z) ∈ {(0, 0, 0), (1, 1/2, 1/2), (1/2, 1, 1/2), (1/2, 1/2, 1),
(1, 1, 1), (1, −1, −1), (−1, 1, −1), (−1, −1, 1)}.
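As a cross-check of the case analysis, the sketch below verifies the eight listed triples exactly and also searches a small grid of half-integers for any further solutions. The grid range is an arbitrary illustrative choice and of course cannot replace the proof.

```python
from fractions import Fraction
from itertools import product


def is_solution(x, y, z) -> bool:
    """Exact test of the three equations of the system."""
    return (
        x * x - (y + z + y * z) * x + (y + z) * y * z == 0
        and y * y - (z + x + z * x) * y + (z + x) * z * x == 0
        and z * z - (x + y + x * y) * z + (x + y) * x * y == 0
    )


half = Fraction(1, 2)
claimed = {
    (0, 0, 0), (1, half, half), (half, 1, half), (half, half, 1),
    (1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1),
}

# All points with coordinates in {-3, -5/2, ..., 5/2, 3}.
grid = [Fraction(k, 2) for k in range(-6, 7)]
found = {point for point in product(grid, repeat=3) if is_solution(*point)}

print(len(found), found == claimed)
```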
Problem 2.2.1. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} (x + y)^3 = 8z \\ (y + z)^3 = 8x \\ (z + x)^3 = 8y. \end{cases} \]
Solution. Since $f(x) := x^3$ is an increasing function, x > z if and only if $(x + y)^3 > (y + z)^3$. Using
this observation, we get from the first equation and the second one that x = z. Similarly, from the
second equation and the third one we get that x = y. It follows that all the variables are equal. We get
that $8x = (2x)^3 = 8x^3$, so $0 = 8x^3 - 8x = 8x(x - 1)(x + 1)$. We conclude that there are three
Problem 2.2.2. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} x^5 = 5y^3 - 4z \\ y^5 = 5z^3 - 4x \\ z^5 = 5x^3 - 4y. \end{cases} \]
Solution. Since the system is cyclic, without loss of generality, we may assume that x ≥ y and x ≥ z.
We will independently consider the following two cases.
Case 1: y ≤ z. Since $f(x) := x^5$ and h(x) := x are increasing functions, we get that
It follows that x ∈ {−2, −1, 0, 1, 2}. It is straightforward to check directly that the following triplets
are solutions to our system:
(x, y, z ) ∈ {(−2, −2, −2), (−1, −1, −1), (0, 0, 0), (1, 1, 1), (2, 2, 2) } .
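The direct check of the five triplets is a one-liner with integer arithmetic. This sketch is only a verification aid; the function name is illustrative.

```python
def satisfies_system(x: int, y: int, z: int) -> bool:
    """Exact test of x^5 = 5y^3 - 4z, y^5 = 5z^3 - 4x, z^5 = 5x^3 - 4y."""
    return (
        x ** 5 == 5 * y ** 3 - 4 * z
        and y ** 5 == 5 * z ** 3 - 4 * x
        and z ** 5 == 5 * x ** 3 - 4 * y
    )


diagonal = [(t, t, t) for t in range(-2, 3)]
print([satisfies_system(*triple) for triple in diagonal])
```

On the diagonal x = y = z = t the system collapses to $t^5 - 5t^3 + 4t = t(t^2 - 1)(t^2 - 4) = 0$, which is why exactly these five values of t appear.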
Problem 2.2.3. Solve the following system of equations, given that all variables involved are positive
real numbers:
\[ \begin{cases} a^3 + b^3 + c^3 = 3d^3 \\ b^4 + c^4 + d^4 = 3a^4 \\ c^5 + d^5 + a^5 = 3b^5. \end{cases} \]
Solution. By the pigeonhole principle, at least one of the variables d, a or b attains the maximum or the
minimum value from the set {a, b, c, d}. We will independently consider the following two cases.
Case 1: a, b or d is the maximum. Let us first assume that a is the maximum. We will show that
a = b = c = d. Indeed, if this is not the case, then $3a^4 = b^4 + c^4 + d^4 < 3a^4$ which is clearly not
possible. (Let us remark that here we used the fact that $f(x) := x^4$ is an increasing function on R+.) If b
or d is the maximum, then the argument is the same but this time we need to respectively use the third
or the first equation and the fact that $g(x) := x^5$ and $h(x) := x^3$ are increasing functions.
Case 2: a, b or d is the minimum. Again, we will show that a = b = c = d. As before, the argument is
similar for each sub-case and we will present it assuming that a is the minimum. Indeed, if it is not
the case that a = b = c = d, then $3a^4 = b^4 + c^4 + d^4 > 3a^4$ which is a contradiction.
In all the cases, we get that all the variables are equal and it is easy to check that any 4-tuple
(a, b, c, d) = (t, t, t, t) satisfies all the equations for each t > 0.
\[ (x^4 + 3y^2)\sqrt{|x + 2| + |y|} = 4\left| xy^2 \right|, \]
(8.2)
provided that x, y ∈ R.
Solution. Let us first note that if (x, y) = (−2, 0) or (x, y) = (0, 0), then both sides of (8.2) are equal
to 0 and so the desired equality holds. We may then assume that (x, y) ≠ (−2, 0) and (x, y) ≠ (0, 0).
In particular, the left hand side of (8.2) is non-zero.
By the arithmetic-geometric mean inequality, we get that
\[ x^4 + 3y^2 \ge 4\sqrt[4]{x^4 \cdot y^2 \cdot y^2 \cdot y^2} = 4|xy|\sqrt{|y|}, \]
\[ (x^4 + 3y^2)\sqrt{|x + 2| + |y|} \ge 4|xy|\sqrt{|y|}\sqrt{|x + 2| + |y|} \ge 4\left| xy^2 \right|; \]
the first equality holds when $x^2 = |y|$, and the second equality holds when x = −2. Since by our
earlier assumption y ≠ 0 when x = −2, we get that (8.2) holds only when x = −2 and
$|y| = (-2)^2 = 4$, that is, when (x, y) = (−2, 4) or (x, y) = (−2, −4). We conclude that the solution
is
\[ (x, y) \in \{ (-2, 0), (0, 0), (-2, 4), (-2, -4) \}. \]
Problem 2.3.2. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} 3(x^2 + y^2 + z^2) = 1 \\ x^2y^2 + y^2z^2 + z^2x^2 = xyz(x + y + z)^3. \end{cases} \]
$\le 3(x^2 + y^2 + z^2) = 1,$
\[ (z^2x^2 + z^2y^2)/2 = z^2(x^2 + y^2)/2 \ge z^2xy. \]
(x = 0 ∨ y = z) ∧ (y = 0 ∨ z = x) ∧ (z = 0 ∨ x = y) .
For this condition to hold, clearly, if no variable is equal to 0, then x = y = z. It is also easy to see that
it is impossible that only one variable is equal to zero. Hence, we get that the equality holds if
x = y = z or at least two of the three variables x, y, z are equal to 0. But this means that
\[ x^2y^2 + y^2z^2 + z^2x^2 \ge xyz(x + y + z) \ge xyz(x + y + z)^3, \]
where in the last step we use the fact we showed at the very beginning, namely, that $(x + y + z)^2 \le 1$.
More importantly, the condition for the equality remains the same, that is, either x = y = z (since then
$(x + y + z)^2 = 1$) or at least two of the three variables x, y, z are equal to 0 (since then both sides of
the inequality are equal to 0). We will consider these two cases separately.
Case 1: x = y = z. Our system reduces to
\[ \begin{cases} 9x^2 = 1 \\ 3x^4 = 27x^6, \end{cases} \]
Case 2: at least two of the three variables x, y, z are equal to 0. Without loss of generality, we may
assume that y = z = 0 and other solutions will be obtained by permuting the variables. This time our
system reduces to
\[ \begin{cases} 3x^2 = 1 \\ 0 = 0. \end{cases} \]
This leads us to another six solutions of the system: (1/√3, 0, 0), (−1/√3, 0, 0), (0, 1/√3, 0),
(0, −1/√3, 0), (0, 0, 1/√3), and (0, 0, −1/√3).
Combining the two cases together we conclude that the solution to the system is
(x, y, z) ∈ {(1/3, 1/3, 1/3), (−1/3, −1/3, −1/3), (1/√3, 0, 0), (−1/√3, 0, 0),
(0, 1/√3, 0), (0, −1/√3, 0), (0, 0, 1/√3), (0, 0, −1/√3)}.
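The eight listed triples can be substituted back into the system numerically. The following floating-point sketch uses an illustrative helper `residuals` that returns left side minus right side for both equations.

```python
import math


def residuals(x: float, y: float, z: float) -> tuple:
    """Residuals of 3(x^2+y^2+z^2) = 1 and
    x^2 y^2 + y^2 z^2 + z^2 x^2 = xyz (x+y+z)^3."""
    first = 3 * (x * x + y * y + z * z) - 1
    second = (x * x * y * y + y * y * z * z + z * z * x * x
              - x * y * z * (x + y + z) ** 3)
    return first, second


r = 1 / math.sqrt(3)
solutions = [
    (1 / 3, 1 / 3, 1 / 3), (-1 / 3, -1 / 3, -1 / 3),
    (r, 0.0, 0.0), (-r, 0.0, 0.0),
    (0.0, r, 0.0), (0.0, -r, 0.0),
    (0.0, 0.0, r), (0.0, 0.0, -r),
]

for triple in solutions:
    print(triple, residuals(*triple))
```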
Problem 2.3.3. Solve the following system of equations, given that all variables involved are real
numbers:
\[ \begin{cases} x^2y + 2 = x + 2yz \\ y^2z + 2 = y + 2zx \\ z^2x + 2 = z + 2xy. \end{cases} \]
Solution. Let us first note that if x = 0, then from the third equation we get that z = 2 and from the
first one that y = 1/2. But this contradicts the second equation, and so x ≠ 0. Symmetric arguments
show that also y ≠ 0 and z ≠ 0.
Let us use the following substitution: a = xy ≠ 0, b = yz ≠ 0, and c = xz ≠ 0. After multiplying
the first equation by y, multiplying the second equation by 2, and adding them together we get
$a^2 + 4 = a + 4c$. Symmetric operations give us the following system of equations:
\[ \begin{cases} a^2 + 4 = a + 4c \\ b^2 + 4 = b + 4a \\ c^2 + 4 = c + 4b. \end{cases} \]
Due to the symmetry, without loss of generality, we may assume that a is a largest number from a, b, c
and then circularly shift the solution, if needed. We get that $a^2 + 4 = a + 4c \le a + 4a = 5a$, or
alternatively that $0 \ge a^2 - 5a + 4 = (a - 1)(a - 4)$. It follows that a ∈ [1, 4]. Since function
f : [1, 4] → [1, 4], $f(a) := (a^2 + 4 - a)/4$ is a bijection and c = f(a), we get that also c ∈ [1, 4].
Similarly, we get that b = f(c) ∈ [1, 4]. More importantly, function f(x) has the property that
1 < f(x) < x unless x = 1 or x = 4. Suppose that 1 < a < 4. Then 1 < c = f(a) < a, and
consequently 1 < b = f(c) < c and 1 < a = f(b) < b. This contradicts the fact that a is a largest
value. It follows that the only possible solutions are (a, b, c) = (1, 1, 1) and (a, b, c) = (4, 4, 4).
Going back to the original set of equations, we see that x = y = z and so there are only four
potential triples (x, y, z) that satisfy the original system: (1, 1, 1), (−1, −1, −1), (2, 2, 2), and
(−2, −2, −2). The last triple does not satisfy the original system and so the solution is
(x, y, z) ∈ {(1, 1, 1), (−1, −1, −1), (2, 2, 2)}.
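Substituting the four potential triples back into the system is a quick integer computation. The sketch below (with an illustrative function name) shows which of them survive.

```python
def satisfies_system(x: int, y: int, z: int) -> bool:
    """Exact test of x^2 y + 2 = x + 2yz, y^2 z + 2 = y + 2zx, z^2 x + 2 = z + 2xy."""
    return (
        x * x * y + 2 == x + 2 * y * z
        and y * y * z + 2 == y + 2 * z * x
        and z * z * x + 2 == z + 2 * x * y
    )


candidates = [(1, 1, 1), (-1, -1, -1), (2, 2, 2), (-2, -2, -2)]
surviving = [triple for triple in candidates if satisfies_system(*triple)]
print(surviving)
```

For (−2, −2, −2) the first equation reads −6 = 6, which is why this triple is discarded even though xy = yz = xz = 4 satisfies the auxiliary system in a, b, c.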
Problem 2.4.1. Let n ≥ 2 be any natural number. Find the number of sequences $(x_1, x_2, \ldots, x_n)$ of
non-negative real variables that satisfy the following system of equations: for i ∈ [n]
\[ x_{i+1} + x_i^2 = 4x_i, \]
where $x_{n+1} = x_1$.
Solution. Let us first note that for each i ∈ [n] we have
\[ x_{i+1} = 4x_i - x_i^2 = x_i(4 - x_i). \]
Since $x_{i+1}$ is non-negative, we get that $x_i \in [0, 4]$. As there exists $\alpha_i \in [0, \infty)$ such that
$x_i = 4\sin^2(\alpha_i)$, it is natural to use this substitution. In fact, there are many choices for $\alpha_i$ for a given
\[ = 4(2\sin(\alpha_i)\cos(\alpha_i))^2 = 4\sin^2(2\alpha_i), \]
and so $x_i = 4\sin^2(2^{i-1}\alpha_1)$. In particular, since $x_1 = x_{n+1}$, we get that $\sin^2(\alpha) = \sin^2(2^n\alpha)$. It
Let us first deal with a degenerate case, namely, α = 0 that yields the following particular solution:
$(x_1, x_2, \ldots, x_n) = (0, 0, \ldots, 0)$. If $\sin(\alpha) = \sin(2^n\alpha)$ for some α > 0, then there exists k ∈ N such
that either $\alpha + 2k\pi = 2^n\alpha$ or $-\alpha + \pi + 2k\pi = 2^n\alpha$. On the other hand, if $\sin(\alpha) = -\sin(2^n\alpha)$ for
some α > 0, then there exists k ∈ N such that either $\alpha + \pi + 2k\pi = 2^n\alpha$ or $-\alpha + 2k\pi = 2^n\alpha$.
Combining these two observations together, we get that $(2^n + 1)\alpha = k\pi$ for some k ∈ N or
$(2^n - 1)\alpha = k\pi$ for some k ∈ N. Since α ∈ [0, π/2], including the degenerate case, we get that
\[ \alpha \in \left\{ \frac{k\pi}{2^n + 1} : k \in [2^{n-1}] \right\} \cup \left\{ \frac{k\pi}{2^n - 1} : k \in [2^{n-1} - 1] \right\} \cup \{0\}. \]
Finally, note that the first two sets above are disjoint as $2^n + 1$ and $2^n - 1$ are co-prime for n ≥ 2. It
Using the substitution y = tan(x), our equality can be equivalently rewritten as follows:
\[ \left| y^n - \frac{1}{y^n} \right| = n \left| \frac{1}{y} - y \right| = n \left| y - \frac{1}{y} \right|. \]
(8.3)
The problem is easy if n = 1, as then (8.3) is always satisfied, provided that y ≠ 0. As a result, any
value of x ∈ R that falls into the domain of both tan(x) and cot(x) functions satisfies the original
equation. In other words, the solution is:
\[ x \in \mathbb{R} \setminus \left\{ \frac{k\pi}{2} : k \in \mathbb{Z} \right\}. \]
Suppose then that n ≥ 2. We will independently consider the following two cases.
Case 1: y = 1 or y = −1. Both sides of (8.3) are equal to zero, which yields the following family of
solutions: x = (2k + 1)π/4 for some k ∈ Z.
Case 2: y ≠ 1 and y ≠ −1. This time |y − 1/y| ≠ 0 and so after dividing both sides of (8.3) by
|y − 1/y| we get:
\[ n = \frac{\left| \frac{y^{2n} - 1}{y^n} \right|}{\left| \frac{y^2 - 1}{y} \right|} = \left| y^{1-n} \cdot \frac{(y^2 - 1)(y^{2n-2} + y^{2n-4} + \ldots + y^2 + 1)}{y^2 - 1} \right| = \left| \sum_{i=1}^{n} y^{n+1-2i} \right| = \sum_{i=1}^{n} |y|^{n+1-2i}, \]
(8.4)
where the last equality holds because either all the terms $y^{n+1-2i}$ are positive or all are negative. Using
the arithmetic-geometric mean inequality, we get that
\[ \frac{1}{n} \sum_{i=1}^{n} |y|^{n+1-2i} \ge \left( \prod_{i=1}^{n} |y|^{n+1-2i} \right)^{1/n} = \sqrt[n]{1} = 1, \]
where the equality holds if and only if |y| = 1. It follows that (8.4) holds if and only if |y| = 1, which
are excluded in Case 2 (we already considered them in Case 1).
Combining the two cases together, we conclude that for n ≥ 2 the solution is:
\[ x \in \left\{ \frac{(2k + 1)\pi}{4} : k \in \mathbb{Z} \right\}. \]
Problem 2.4.3. For a given a ∈ R, let us recursively define the following sequence: $x_0 = \sqrt{3}$ and for
all non-negative integers n,
\[ x_{n+1} = \frac{1 + a x_n}{a - x_n}. \]
Find all values of a for which the sequence has period equal to 8.
Solution. Let us first note that for any a ∈ R, there exists α = α(a) ∈ R such that a = cot(α). In
particular, for convenience we set $\alpha = \cot^{-1}(a) \in (0, \pi)$, where $\cot^{-1}(\cdot)$ is the inverse of the
cotangent function. (Note that none of the six trigonometric functions are one-to-one. They are
restricted to their principal branch in order to have inverse functions. For cotangent the principal branch
is (0, π).) Let us now recursively define another sequence: $y_0 = \pi/6$ and for all non-negative integers
The base case ( n = 1) is easy: $\cot(y_0) = \cot(\pi/6) = \sqrt{3} = x_0$. For the inductive step, assume
\[ \cot(y - x) = \frac{1 + \cot(x)\cot(y)}{\cot(x) - \cot(y)}. \]
function has period of π. Since α ∈ (0, π), in fact, k ∈ [7]. It follows that all the possible values of a
that satisfy the desired condition of the problem are of the form a = cot(kπ/8) for some k ∈ [7].
Suppose that a = cot(kπ/8) for some k ∈ [7]. We get that for each n ∈ N ∪ {0},
\[ x_n = \cot\left( \frac{\pi}{6} - n\alpha(a) \right) = \cot\left( \frac{\pi}{6} - \frac{nk\pi}{8} \right) = \cot\left( \pi \cdot \frac{4 - 3nk}{24} \right). \]
Since 4 − 3nk is not divisible by 3, (4 − 3nk)/24 is never an integer. As a result, π(4 − 3nk)/24
always belongs to the domain of the cotangent function and so for all seven identified values of a the
sequence is properly defined.
Note that in the problem we required the function to have a period of 8 but its fundamental period
can be smaller. In particular we note that for k ∈ {2, 6} the fundamental period of the sequence is 4
and for k = 4 the fundamental period of the sequence is 2.
Problem 2.5.1. For a given a ∈ R, consider the following system of equations:
\[ \begin{cases} x^2 + y^2 + z = a \\ x + y^2 + z^2 = a \\ x^2 + y + z^2 = a. \end{cases} \]
which implies that either x = y or x = 1 − y. Symmetric arguments may be applied to the remaining
two pairs of equations. Since 1 − (1 − t) = t, we conclude that all the solutions must have one of the
following four forms: (x, y, z) = (t, t, t), (x, y, z) = (t, t, 1 − t), (x, y, z) = (t, 1 − t, t), or
(x, y, z) = (1 − t, t, t), where t ∈ R. Clearly, if t = 1/2, then only one solution should be counted,
namely, (x, y, z) = (1/2, 1/2, 1/2). It means that we need to be extra careful with the case
$a = (1/2)^2 + (1/2)^2 + (1/2) = 1$. On the other hand, if t ≠ 1/2, then all solutions are distinct. More
importantly, by symmetry, the last three forms are associated in the following sense: if one of them is a
solution, then so are the remaining two. We will independently consider the two cases.
Case 1: there is a solution of the form (t, t, t) for some t ∈ R. It follows that $2t^2 + t - a = 0$. Since
the discriminant is equal to Δ = 1 + 8a, we conclude that there are no solutions of this form if
a < −1/8 (that is, Δ < 0), precisely one solution if a = −1/8, and two solutions if a > −1/8.
Case 2: there is a solution of the form (1 − t, t, t) for some t ∈ R. This time we get that
$2t^2 - t + (1 - a) = 0$ and so Δ = 1 − 8(1 − a) = 8a − 7. It follows that there are no solutions of
this form if a < 7/8, precisely one solution if a = 7/8, and two solutions if a > 7/8.
Let us now come back to the special case a = 1 that requires more attention. If a = 1, then we have
two solutions of the form (t, t, t) for some t ∈ R, namely (1/2, 1/2, 1/2) and (−1, −1, −1).
Moreover, there are two solutions of the form (1 − t, t, t) for some t ∈ R, again including
(1/2, 1/2, 1/2) which we do not want to count. The other one, namely (1, 0, 0), yields another two
solutions of the form (t, 1 − t, t) and (t, t, 1 − t). So there are 5 solutions for this special case:
(x, y, z ) ∈ {(−1, −1, −1), (1/2, 1/2, 1/2), (1, 0, 0), (0, 1, 0), (0, 0, 1) } .
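The case analysis above translates directly into a counting routine. The sketch below assumes, as established in the solution, that every solution has one of the four forms (t, t, t), (1 − t, t, t), (t, 1 − t, t), (t, t, 1 − t); it solves the two quadratics, builds the candidate triples, and counts distinct ones. Function names and the rounding tolerance are illustrative.

```python
import math
from fractions import Fraction


def real_roots(c2, c1, c0):
    """Real roots of c2*t^2 + c1*t + c0 = 0 (c2 != 0), as a sorted list of floats."""
    disc = c1 * c1 - 4 * c2 * c0
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(-c1 - root) / (2 * c2), (-c1 + root) / (2 * c2)})


def count_solutions(a) -> int:
    """Number of distinct solutions among the four forms from the case analysis."""
    a = Fraction(a)
    found = set()
    for t in real_roots(2, 1, -a):          # form (t, t, t): 2t^2 + t - a = 0
        found.add((t, t, t))
    for t in real_roots(2, -1, 1 - a):      # form (1-t, t, t): 2t^2 - t + 1 - a = 0
        found.update({(1 - t, t, t), (t, 1 - t, t), (t, t, 1 - t)})
    # Round to merge floating-point duplicates such as (1/2, 1/2, 1/2).
    return len({tuple(round(v, 9) for v in triple) for triple in found})


for a in (Fraction(-1), Fraction(-1, 8), Fraction(0), Fraction(7, 8), Fraction(1), Fraction(3)):
    print(a, count_solutions(a))
```

The value a = 1 is the only place where the two families overlap, which is exactly the special case singled out in the text.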
Let us summarize our observations. The number of the solutions of our system of equations is equal
to:
5 = 2 + 3 ⋅ 1, provided a = 7/8 or a = 1;
Problem 2.5.2. Solve the following system of equations, given that all variables involved are positive
real numbers:
\[ (x^{2010} - 1)(y^{2009} - 1) = (x^{2009} - 1)(y^{2010} - 1). \]
Solution. Let us first note that if x = 1 or y = 1, then the equality trivially holds. We will assume then
that they are not equal to 1.
Let us first re-write the equation as follows:
\[ x^{2010}(y^{2009} - 1) - y^{2009} = (x^{2009} - 1)y^{2010} - x^{2009}. \]
\[ x^{2009} \sum_{i=0}^{2008} y^i = y^{2009} \sum_{i=0}^{2008} x^i. \]
(Recall that we assumed that x ≠ 1 and y ≠ 1.) Since x and y are both non-zero (in fact, they are both
positive real numbers), this equation can be equivalently rewritten as follows:
\[ \sum_{i=1}^{2009} \frac{1}{y^i} = \sum_{i=1}^{2009} \frac{1}{x^i}. \]
The final observation is that for each i ∈ [2009], $f(t) := 1/t^i$ is a decreasing function on the domain
(0, ∞). It follows that $g(t) := \sum_{i=1}^{2009} 1/t^i$ is also a decreasing function on that domain. Since x and y
Problem 2.5.3. Fix an integer n ≥ 2, and consider the following system of n equations: for i ∈ [n]
\[ x_{i+1}^2 + x_i^2 + 50 = 12x_{i+1} + 16x_i. \]
(As usual, we use the convention that $x_{n+1} = x_1$.) Find the number of solutions of this system, given that all variables involved are integers.
Solution. Let us start with rewriting the system as follows: for i ∈ [n]
\[ (x_i - 8)^2 + (x_{i+1} - 6)^2 = 50. \]
Since both $x_i - 8$ and $x_{i+1} - 6$ are integers, we need to decompose 50 into a sum of two squares of integers. The only decompositions involving natural numbers are $1^2 + 7^2 = 50$ and $5^2 + 5^2 = 50$. It follows that
\[ (x_i, x_{i+1}) \in S := \{(1, 5), (1, 7), (3, 1), (3, 11), (7, -1), (7, 13), (9, -1), (9, 13), (13, 1), (13, 11), (15, 5), (15, 7)\}. \]
We will show now that there is no i ∈ [n] for which $(x_i, x_{i+1}) = (1, 5)$. Indeed, for a contradiction, suppose that $(x_i, x_{i+1}) = (1, 5)$ for some i ∈ [n]. Then we get that $(x_{i+1}, x_{i+2}) = (5, x_{i+2}) \in S$, but there is no pair in S with the first coordinate equal to 5 (here we extended our convention and use $x_{n+2} = x_2$). We get the desired contradiction and so the pair (1, 5) is eliminated from the set of potential pairs. Using similar arguments one can eliminate more pairs to get that for each i ∈ [n],
\[ (x_i, x_{i+1}) \in T := \{(1, 7), (7, 13), (13, 1)\}. \]
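The elimination argument can be cross-checked by brute force. The sketch below is only an illustration: the helper names and the search bound are assumptions made here. It enumerates all integer pairs satisfying one equation and then counts the cyclic sequences of length n whose consecutive pairs are all allowed.

```python
def allowed_pairs(bound=30):
    """All integer pairs (x_i, x_{i+1}) with x_{i+1}^2 + x_i^2 + 50 = 12 x_{i+1} + 16 x_i.

    Any solution satisfies |x_i - 8| <= 7 and |x_{i+1} - 6| <= 7,
    so a search window of [-bound, bound] with bound = 30 is wide enough.
    """
    return {
        (a, b)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if a * a + b * b + 50 == 16 * a + 12 * b
    }


def count_solutions(n):
    """Count integer solutions (x_1, ..., x_n) of the cyclic system (x_{n+1} = x_1)."""
    pairs = allowed_pairs()
    values = sorted({a for a, _ in pairs} | {b for _, b in pairs})
    total = 0
    for start in values:
        ways = {start: 1}  # ways[v] = number of partial sequences ending at value v
        for _ in range(n):
            step = {}
            for value, count in ways.items():
                for a, b in pairs:
                    if a == value:
                        step[b] = step.get(b, 0) + count
            ways = step
        total += ways.get(start, 0)  # the sequence must close up: x_{n+1} = x_1
    return total


if __name__ == "__main__":
    print(sorted(allowed_pairs()))
    print([(n, count_solutions(n)) for n in range(2, 10)])
```

Counting closed walks rather than enumerating all sequences keeps the check fast even for larger n.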
Our next observation is that the first pair, pair $(x_1, x_2)$, uniquely determines the sequence $(x_1, x_2, \ldots, x_n, x_{n+1})$. Moreover, the numbers form the cycle of length three. As a result, since $x_{n+1} = x_1$, we get that the solution exists if and only if 3 | n. We conclude that the system has 3
\[ x_{n+1} = x_n + \frac{1}{x_n^2}. \]
convenient to use the following substitution: for each n ∈ N, $y_n := x_n^3 > 0$. It follows that
\[ y_{n+1}^{1/3} = y_n^{1/3} + \frac{1}{y_n^{2/3}}. \]
Cubing both sides, we get that
\[ y_{n+1} = y_n + 3 + \frac{3}{y_n} + \frac{1}{y_n^2}. \]
As a result, by unrolling the recursion all the way to $y_1$, we get that
\[ y_n = y_1 + \sum_{i=1}^{n-1}\left(3 + \frac{3}{y_i} + \frac{1}{y_i^2}\right) = 3(n-1) + y_1 + \sum_{i=1}^{n-1}\left(\frac{3}{y_i} + \frac{1}{y_i^2}\right). \]
In particular, we get that $y_i > 3(i-1)$ for all i ∈ N. After switching back to $x_n$, we get the following
\[ a_n := \sqrt[3]{3 - \frac{3}{n}} < \frac{x_n}{\sqrt[3]{n}} = \sqrt[3]{\frac{y_n}{n}} < b_n, \]
where
\[ b_n := \sqrt[3]{3 + \frac{-3 + x_1^3 + 3/x_1^3 + 1/x_1^6}{n} + \frac{1}{n}\sum_{i=2}^{n-1}\left(\frac{3}{3(i-1)} + \frac{1}{(3(i-1))^2}\right)}. \]
It is clear that $a_n \to \sqrt[3]{3}$ as n → ∞. Hence, by sandwiching the sequence $(x_n/\sqrt[3]{n})_{n \in \mathbb{N}}$ between $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$, to show that $x_n/\sqrt[3]{n} \to \sqrt[3]{3}$, it is enough to show that $b_n \to \sqrt[3]{3}$ (see the
We will independently show that $\sum_{i=1}^{\infty} 1/i^2$ is finite (and so the second term tends to 0) and that $H_n = \sum_{i=1}^{n} 1/i \le 1 + \ln(n)$ (and so the first term tends to 0 as well; clearly, ln(n) tends to infinity much
\[ \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \sum_{i=2}^{n} \frac{1}{i(i-1)} = 1 + \sum_{i=2}^{n} \frac{i-(i-1)}{i(i-1)} = 1 + \sum_{i=2}^{n}\left(\frac{1}{i-1} - \frac{1}{i}\right) = 2 - \frac{1}{n}. \]
It follows that $\sum_{i=1}^{\infty} 1/i^2 = \lim_{n\to\infty} \sum_{i=1}^{n} 1/i^2$ is smaller than or equal to 2. In fact, $\sum_{i=1}^{\infty} 1/i^2 = \pi^2/6 \approx 1.6449$.
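As a quick numerical illustration of the telescoping bound, the following sketch (the function name is chosen here purely for illustration) compares partial sums of 1/i² with 2 − 1/n and with π²/6. It uses floating-point arithmetic, so it is a sanity check rather than a proof.

```python
import math


def partial_sum_inverse_squares(n):
    """Return sum_{i=1}^{n} 1/i^2 as a float."""
    return sum(1.0 / (i * i) for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        s = partial_sum_inverse_squares(n)
        # The bound derived in the text: the partial sum stays below 2 - 1/n for n >= 2.
        print(n, s, 2 - 1 / n, s < 2 - 1 / n)
    print("pi^2 / 6 =", math.pi ** 2 / 6)
```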
For the second task, let us recall that in Section 1.5 we showed that $e < (1 + 1/(i-1))^i$ and so $e^{1/i} < i/(i-1)$. It follows that 1/i < ln(i) − ln(i − 1). Since ln(1) = 0, we get that for n ≥ 2,
\[ H_n = \sum_{i=1}^{n} \frac{1}{i} < 1 + \sum_{i=2}^{n} (\ln(i) - \ln(i-1)) = 1 + \ln(n). \]
This bound is quite good as one can show that $H_n > \ln(n+1)$ (see the solution to Problem 4.6.2) and
Let us mention an alternative solution to this problem that uses the Stolz–Cesàro theorem. This theorem is a criterion for proving the convergence of a sequence and can be viewed as a generalization of L’Hôpital’s rule. Suppose that $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ are sequences of real numbers such that
$b_n/a_n$ also exists and they are equal. (Let us note that the converse of this implication is not true in general.)
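Let us now come back to our problem. We observe that $x_n \to \infty$ as n → ∞. Then, we deal with the
\[ \lim_{n\to\infty} \frac{x_n^3}{n} = \lim_{n\to\infty} \frac{x_{n+1}^3 - x_n^3}{(n+1)-n} = \lim_{n\to\infty} \frac{(x_n + 1/x_n^2)^3 - x_n^3}{(n+1)-n} = \lim_{n\to\infty}\left(3 + \frac{3}{x_n^3} + \frac{1}{x_n^6}\right) = 3. \]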
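The limit can also be observed numerically. The sketch below iterates the recursion x_{n+1} = x_n + 1/x_n² in floating point; the starting value x_1 = 1 is an assumption made here for illustration (any positive start should behave similarly), and the function name is hypothetical.

```python
def ratio_after(n_terms, x1=1.0):
    """Return x_n / n^(1/3) for n = n_terms, where x_{k+1} = x_k + 1 / x_k^2.

    x1 is an assumed positive starting value; it is not taken from the text.
    """
    x = x1
    for _ in range(n_terms - 1):
        x = x + 1.0 / (x * x)
    return x / n_terms ** (1.0 / 3.0)


if __name__ == "__main__":
    target = 3 ** (1.0 / 3.0)
    for n in (10, 1000, 100000):
        print(n, ratio_after(n), target)
```

Convergence is slow because the correction term in $x_n^3/n$ decays roughly like ln(n)/n, which matches the harmonic-sum estimate used in the sandwich argument.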
$c_n$ is bounded.
Solution. It is clear that sequence $(a_n)_{n \in \mathbb{N}}$ is increasing and $a_n \ge 4$ for each n ∈ N. We will upper
\[ a_n^2/2 \le a_{n+1} < a_n^2. \]
slightly weaker bounds, namely, 2 . It follows that 2 < 2 , and so 0 < c < 1.
n−1 n
2 2 n−1 n
< an < 2 < b n n
Problem 2.6.3. You are given two numbers a, b ∈ R. Let $x_1 = a$, $x_2 = b$, and for each n ∈ N let $x_{n+2} = x_{n+1} + x_n$. Show that there exist a, b ∈ R, a ≠ b, for which there are at least 2,000 distinct pairs (k, ℓ), k < ℓ, such that $x_k = x_\ell$. On the other hand, the number of such pairs is finite even if a = b, unless a = b = 0.
Solution. Consider any pair a, b ∈ R that is different from a = b = 0. In particular, it is allowed that a = b ∈ R ∖ {0}. We will first show that all but possibly a finite number of terms of the sequence are unique. This proves the second part of the problem. We will independently consider the following three cases.
Case 1: $x_i > 0$ and $x_{i+1} > 0$ for some i ∈ N. It is clear that the sequence $(x_{i+n})_{n \in \mathbb{N}}$ is strictly increasing and so indeed all but finitely many terms of the sequence $(x_n)_{n \in \mathbb{N}}$ are unique.
Case 2: $x_i < 0$ and $x_{i+1} < 0$ for some i ∈ N. We get the same conclusion as before since the sequence $(x_{i+n})_{n \in \mathbb{N}}$ is strictly decreasing.
Before we move to the last case, let us suppose that $x_i = 0$ for some i ∈ N. If there are more terms $x_i$ that are equal to 0, then we concentrate on the first one. Since the case a = b = 0 is excluded, we get that $x_{i+1} \ne 0$. Indeed, if $0 = x_{i+1} = x_i + x_{i-1} = 0 + x_{i-1}$ for some i ≥ 2, then $x_{i-1} = 0$ which gives us a contradiction ($x_i$ is the first term equal to 0). It follows that $x_{i+2} = x_{i+1} + x_i = x_{i+1}$ and we arrive in either Case 1 or Case 2. Hence, without loss of generality, we may assume that $x_i \ne 0$ for all i ∈ N.
Case 3: the sequence $(x_n)_{n \in \mathbb{N}}$ oscillates between positive and negative values. Suppose that for some
$0 > x_{i+3} = x_{i+2} + x_{i+1} > x_{i+1}$, and so the sequence $(x_{i+1+2n})_{n \in \mathbb{N}}$ is a strictly increasing sequence of negative numbers. As a result, all but finitely many terms of the sequence $(x_n)_{n \in \mathbb{N}}$ are unique. (In fact, with a slightly more delicate argument one can argue that all of them are unique.)
Before we move to the proof of the first part of the problem, let us make one remark. One can show (by induction on n) that for each n ∈ N,
\[ x_n = A\left(\frac{1+\sqrt{5}}{2}\right)^n + B\left(\frac{1-\sqrt{5}}{2}\right)^n \]
for some carefully chosen A = A(a, b) and B = B(a, b). (Constants A and B can be determined by considering $x_1 = a$ and $x_2 = b$.) Note that A = B = 0 only if a = b = 0, in which case we have infinitely many pairs (k, ℓ). Otherwise, at some point the sequence $(x_n)_{n \in \mathbb{N}}$ must be strictly increasing or strictly decreasing, and so the number of pairs (k, ℓ) we are interested in is finite. This also shows that Case 3 is impossible.
Let us now come back to our problem. We will show that it is possible to select a and b such that there are at least 2,000 pairs (k, ℓ), k < ℓ, with $x_k = x_\ell$. In order to see this, let us consider the classic Fibonacci sequence where $x_1 = x_2 = 1$. (See Section 4.9 for more on that sequence.) However, we will extend it to negative indices. Since we want to preserve that $x_{n+2} = x_{n+1} + x_n$ for each n ∈ Z, we get that
\[ x_n = x_{n+2} - x_{n+1}. \tag{8.5} \]
In particular, $x_0 = x_2 - x_1 = 1 - 1 = 0$, $x_{-1} = x_1 - x_0 = 1 - 0 = 1$, and $x_{-2} = x_0 - x_{-1} = 0 - 1 = -1$.
We will show by (strong) induction on i that for each i ∈ N,
\[ x_{-i} = -(-1)^i x_i. \tag{8.6} \]
\[ x_{-(i+2)} = x_{-i} - x_{-(i+1)} = -(-1)^i x_i - (-(-1)^{i+1} x_{i+1}) = -(-1)^{i+2} x_{i+2}, \]
(by applying it to i = 2k + 1). This means that it is enough to “shift” the Fibonacci sequence, that is, take $a := x_{-3999}$ and $b := x_{-3998}$ to generate the desired sequence.
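The shifting construction is easy to test with exact integer arithmetic. The sketch below is an illustration only (function names are chosen here): it runs the Fibonacci recurrence backwards to the two starting terms and then counts the pairs k < ℓ with equal terms among the first 8,000 terms of the shifted sequence.

```python
from collections import Counter


def shifted_start(shift):
    """Return (x_{-shift}, x_{-shift+1}) for the two-sided Fibonacci sequence with x_1 = x_2 = 1."""
    upper, lower = 1, 1  # (x_{index+1}, x_{index}) with index = 1
    index = 1
    while index > -shift:
        # Backward step from the text: x_n = x_{n+2} - x_{n+1}.
        upper, lower = lower, upper - lower
        index -= 1
    return lower, upper


def count_equal_pairs(a, b, terms):
    """Count pairs k < l with x_k = x_l among the first `terms` terms of x_1 = a, x_2 = b."""
    values = [a, b]
    while len(values) < terms:
        values.append(values[-1] + values[-2])
    return sum(c * (c - 1) // 2 for c in Counter(values).values())


if __name__ == "__main__":
    a, b = shifted_start(3999)
    print(a != b, count_equal_pairs(a, b, 8000))
```

Python integers have arbitrary precision, so the very large Fibonacci numbers involved are compared exactly.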
Problem 2.7.1. Find the number of infinite sequences $(a_i)_{i \in \mathbb{N}}$, such that $a_i \in \{-1, 1\}$ for all i ∈ N, $a_{mn} = a_m a_n$ for all m, n ∈ N, and each consecutive triple contains at least one 1 and one −1.
$a_7 = -1$ (as the triple $a_7, a_8, a_9$ has to contain at least one −1), $a_5 = 1$ (as the triple $a_5, a_6, a_7$ has to contain at least one 1), and $a_{10} = a_2 a_5 = 1$. But then the triple $a_8, a_9, a_{10}$ does not contain any −1,
Suppose that $a_k = a_{k+1} = x \in \{-1, 1\}$ for some k ∈ N, k ≥ 2. Using the property for consecutive
consecutive triples one more time, we get that $a_{2k+1} = x$. It follows that $a_{2k-2} = a_{2k+1} = a_{2k+4}$ (= x). If k = 3ℓ + 1 for some ℓ ∈ N, then we would get that $a_{6\ell} = a_{6\ell+3} = a_{6\ell+6}$ but, as a result, also $a_{2\ell} = a_{2\ell+1} = a_{2\ell+2}$, which is not possible (the corresponding triple does not satisfy the desired property). It follows that the following property is satisfied:
(8.7)
We will prove, by (strong) induction on ℓ, that for all ℓ ∈ N ∪ {0}, $a_{3\ell+1} = 1$ and $a_{3\ell+2} = -1$. The base case (ℓ = 0) holds: $a_1 = 1$ and $a_2 = -1$. For the inductive step, suppose that $a_{3\ell+1} = 1$ and $a_{3\ell+2} = -1$ for all non-negative integers that are less than $\ell_0 \in \mathbb{N}$. Our goal is to show that $a_{3\ell_0+1} = 1$ and $a_{3\ell_0+2} = -1$. We will independently investigate two cases, depending on the parity of $\ell_0$.
We showed above that the values of the sequence $(a_n)_{n \in \mathbb{N}}$ are determined when n ≡ 1 or n ≡ 2 (mod 3). We will now show that the value of $a_3$ uniquely determines the whole sequence. Indeed, note that any natural number n (not necessarily divisible by 3) is uniquely represented as follows: $n = 3^p(3q + r)$, where p, q ∈ N ∪ {0} and r ∈ {1, 2}. It follows that
\[ a_n = a_{3^p(3q+r)} = (a_3)^p a_{3q+r} = (a_3)^p (-1)^{r+1}. \]
Therefore, indeed, by fixing $a_3 \in \{-1, 1\}$ we uniquely define two possible sequences.
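The closed form above is simple to implement, which gives a finite sanity check of both required properties. The sketch below (helper names are chosen here for illustration) builds a prefix of each of the two candidate sequences and tests multiplicativity and the triple condition on that prefix only.

```python
def make_sequence(a3, length):
    """Return [a_1, ..., a_length] using a_n = (a_3)^p * (-1)^(r+1) for n = 3^p (3q + r), r in {1, 2}."""
    values = []
    for n in range(1, length + 1):
        p, rest = 0, n
        while rest % 3 == 0:
            rest //= 3
            p += 1
        r = rest % 3  # r is 1 or 2 because rest is no longer divisible by 3
        values.append((a3 ** p) * (-1) ** (r + 1))
    return values


def is_multiplicative(values):
    """Check a_{mk} = a_m * a_k for all m, k with m * k within the prefix."""
    n = len(values)
    return all(
        values[m * k - 1] == values[m - 1] * values[k - 1]
        for m in range(1, n + 1)
        for k in range(1, n // m + 1)
    )


def triples_are_mixed(values):
    """Check that every three consecutive terms contain both 1 and -1."""
    return all({1, -1} <= set(values[i:i + 3]) for i in range(len(values) - 2))


if __name__ == "__main__":
    for a3 in (1, -1):
        seq = make_sequence(a3, 300)
        print(a3, seq[:9], is_multiplicative(seq), triples_are_mixed(seq))
```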
It remains to show that both sequences satisfy the two desired properties. Property (8.7) guarantees that all consecutive triples contain at least one 1 and one −1. In order to show that $a_{mn} = a_m a_n$ for all
\[ mn = 3^{p_1+p_2}(9q_1q_2 + 3(q_1r_2 + q_2r_1) + r_1r_2). \]
As a result, if $r_1 = r_2$, then $mn = 3^{p_1+p_2}(3q_3 + 1)$ for some $q_3 \in \mathbb{N} \cup \{0\}$ and so
\[ a_{mn} = (a_3)^{p_1+p_2}(-1)^2 = (a_3)^{p_1+p_2}(-1)^2(-1)^{r_1+r_2} = (a_3)^{p_1}(-1)^{r_1+1}(a_3)^{p_2}(-1)^{r_2+1} = a_m a_n, \]
\[ {} = (a_3)^{p_1}(-1)^{r_1+1}(a_3)^{p_2}(-1)^{r_2+1} = a_m a_n. \]
The two desired properties are satisfied and the proof is finished.
Problem 2.7.2. Let us fix any real number a. We recursively define sequence $(a_n)_{n \in \mathbb{N}}$ as follows: let
this sequence has infinitely many non-positive elements and infinitely many non-negative elements.
Solution. Let us first note that if $a_N = 0$ for some N ∈ N, then $a_n = 0$ for all n ≥ N and so the desired property is trivially satisfied. Therefore, we may assume that $a_n \ne 0$ for all n ∈ N. It follows
equivalently as $a_{n+1}^2 + 1 = (a_{n+1} - a_n)^2$. This implies that for each n ∈ N, $|a_{n+1} - a_n| > 1$.
Combining this with the previous observation we conclude that $a_{n+1} < a_n - 1$. As a result, for some k > n we get that $a_k < 0$ (for example, it is easy to see that $k - n \le \lceil a_n \rceil$); recall that we had
The conclusion is that, regardless of the choice of a ∈ R, the sequence $(a_n)_{n \in \mathbb{N}}$ must either reach zero (and stay zero forever), or oscillate infinitely many times between positive and negative values, as required.
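Problem 2.7.3. Let n be any natural number such that n ≥ 3. Find all sequences of real numbers $(x_1, x_2, \ldots, x_n)$ that satisfy the following conditions:
\[ \sum_{i=1}^{n} x_i = n \quad \text{and} \quad \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 = n, \]
\[ \begin{aligned}
\sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1} - 1)^2
&= \sum_{i=1}^{n} \left((x_{i-1} - x_i + x_{i+1})^2 - 2(x_{i-1} - x_i + x_{i+1}) + 1\right) \\
&= \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 - 2\sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1}) + n \\
&= \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 - 2\left(\sum_{i=1}^{n} x_{i-1} - \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} x_{i+1}\right) + n \\
&= n - 2(n - n + n) + n = 0.
\end{aligned} \]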
$x_{i+2} = 1 + x_{i+1} - x_i = 1 + (1 + x_i - x_{i-1}) - x_i = 2 - x_{i-1}$. As a result, $x_{i+5} = 2 - x_{i+2} = 2 - (2 - x_{i-1}) = x_{i-1}$. It follows that $x_{i+6} = x_i$ for all i ∈ [n − 6]. We will independently consider the following two cases depending on whether n is divisible by 6 or not.
Case 1: 6 divides n. We will show that the values of $x_1$ and $x_2$ uniquely determine the whole sequence. Indeed, once $x_1$ and $x_2$ are fixed, we get that $x_3 = 1 + x_2 - x_1$, $x_4 = 1 + x_3 - x_2 = 2 - x_1$, $x_5 = 2 - x_2$, and $x_6 = 1 + x_5 - x_4 = 1 - x_2 + x_1$. Since $x_{i+6} = x_i$ for all i ∈ [n − 6], the
i=1 i
n
i=1 i
as desired. The second condition is forced by the fact that $x_{i-1} - x_i + x_{i+1} = 1$ for all i ∈ [n]. We conclude that in this case one can fix any values of $x_1$ and $x_2$, and these two values determine a sequence that satisfies the desired properties. These are the only sequences.
Case 2: 6 does not divide n. As before, we fix the values of $x_1$ and $x_2$. Arguing as before, we determine the remaining values of the sequence. However, since 6 does not divide n, we obtain additional constraints for $x_1$ and $x_2$. Depending on the remainder of n when divided by 6, we get one of the following conditions:
∙ n ≡ 1 (mod 6): $x_1 = x_2$ and $x_2 = 1 + x_2 - x_1$
∙ n ≡ 2 (mod 6): $x_1 = 1 + x_2 - x_1$ and $x_2 = 2 - x_1$
∙ n ≡ 3 (mod 6): $x_1 = 2 - x_1$ and $x_2 = 2 - x_2$
∙ n ≡ 4 (mod 6): $x_1 = 2 - x_2$ and $x_2 = 1 - x_2 + x_1$
∙ n ≡ 5 (mod 6): $x_1 = 1 - x_2 + x_1$ and $x_2 = x_1$
In each case, the only solution is $x_1 = x_2 = 1$ and then all other values are also equal to 1. As a result, if 6 does not divide n, the only solution is a constant sequence, namely, $x_i = 1$ for all i ∈ [n].
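Both cases can be checked with exact rational arithmetic. The sketch below is illustrative only (the helper names and sample starting values are assumptions made here): it extends $x_1$, $x_2$ by the recurrence $x_{i+1} = 1 + x_i - x_{i-1}$ and then tests the two original conditions with cyclic indexing.

```python
from fractions import Fraction


def build(x1, x2, n):
    """Extend (x1, x2) to length n using x_{i+1} = 1 + x_i - x_{i-1}."""
    xs = [Fraction(x1), Fraction(x2)]
    while len(xs) < n:
        xs.append(1 + xs[-1] - xs[-2])
    return xs


def satisfies_conditions(xs):
    """Check sum x_i = n and sum (x_{i-1} - x_i + x_{i+1})^2 = n with cyclic indices."""
    n = len(xs)
    total = sum(xs)
    squares = sum((xs[(i - 1) % n] - xs[i] + xs[(i + 1) % n]) ** 2 for i in range(n))
    return total == n and squares == n


if __name__ == "__main__":
    start = (Fraction(3, 10), -2)  # arbitrary sample values for x_1 and x_2
    for n in range(3, 14):
        print(n, satisfies_conditions(build(*start, n)), satisfies_conditions(build(1, 1, n)))
```

With an arbitrary start the conditions hold exactly when 6 divides n, while the constant sequence passes for every n, in line with the two cases.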
, $a_{i+1}$ and $b_{i+1}$ are two different solutions of the equation $x^2 + a_i x + b_i = 0$ (here we let $a_4 = a_1$ and $b_4 = b_1$).
Solution. Let us first observe that if $b_{i+1} = 0$ for some i, then $b_i$ is also equal to 0, as $b_{i+1}$ is a root of $x^2 + a_i x + b_i$. Therefore, we would have that all $b_i$ are equal to 0. But this would mean that
may assume that $b_i \ne 0$ for all i. Using Viète’s formulas, we get that
a1 + b1 = −a3
a2 + b2 = −a1
a3 + b3 = −a2
a1 b1 = b3
a2 b2 = b1
a3 b3 = b2 .
After multiplying the last three equations and dividing both sides by $b_1 b_2 b_3$, we get that $a_1 a_2 a_3 = 1$. This, in particular, implies that no coefficient is equal to 0. Now, calculate $b_i$ from the first three equations and substitute them into the last three equations to get that
a1 (a3 + a1 ) = a2 + a3
a2 (a1 + a2 ) = a3 + a1
a3 (a2 + a3 ) = a1 + a2 .
Since $a_1 a_2 a_3 = 1$, either all coefficients are positive or only one of them is positive. We will
\[ a_2 + a_3 = a_1(a_3 + a_1) \ge a_3 + a_1, \]
and so $a_2 \ge a_1$. However, because of our assumption that $a_1$ is a largest coefficient and the fact that the inequality above is strict when $a_1 > 1$, this is only possible when $a_1 = a_2 = 1$. But then we get that also $a_3 = 1$, which in turn implies that $b_1 = b_2 = b_3 = -2$. It is straightforward to check that, indeed,
Suppose now that two coefficients $a_i$ are negative and one of them is positive. Without loss of generality, we may assume that $a_1 > 0$, $a_2 < 0$, and $a_3 < 0$. From the first equation, we have that
\[ a_1^2 + (a_1 - 1)a_3 = a_2. \]
From the same equation we have that $a_1(a_3 + a_1) = a_2 + a_3 < 0$ which implies that $a_3 + a_1 < 0$ and so $a_3 < -a_1 < -1$. Using this inequality and the fact that $a_2 + a_3 < 0$, it follows from the third equation that
or equivalently that $a_1 + a_3 > -2a_2$. Since $a_2 < 0$, we get that $a_1 + a_3 > 0$, which is a contradiction,
Combining the two cases together we get that the only solution is $a_1 = a_2 = a_3 = 1$ and $b_1 = b_2 = b_3 = -2$.
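As a sanity check, one can search small integer values for solutions of the Viète system used above. This is only a sketch: the search is restricted to integers in an assumed window, whereas the argument in the text covers all real coefficients, and the function name is chosen here.

```python
from itertools import product


def cyclic_quadratic_solutions(bound=6):
    """Integer triples (a_1, a_2, a_3) in [-bound, bound] solving the Viete system.

    The sums of roots give b_1 = -a_3 - a_1, b_2 = -a_1 - a_2, b_3 = -a_2 - a_3;
    the products of roots must satisfy a_1 b_1 = b_3, a_2 b_2 = b_1, a_3 b_3 = b_2;
    the two roots of each equation must be different (a_i != b_i).
    """
    found = []
    for a1, a2, a3 in product(range(-bound, bound + 1), repeat=3):
        b1, b2, b3 = -a3 - a1, -a1 - a2, -a2 - a3
        products_ok = a1 * b1 == b3 and a2 * b2 == b1 and a3 * b3 == b2
        distinct_ok = a1 != b1 and a2 != b2 and a3 != b3
        if products_ok and distinct_ok:
            found.append(((a1, a2, a3), (b1, b2, b3)))
    return found


if __name__ == "__main__":
    print(cyclic_quadratic_solutions())
```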
Problem 3.1.2. Let n ≥ 3 be an integer. Prove that the polynomial
\[ f(x) = x^n + \sum_{i=0}^{n-3} a_i x^i \]
Suppose now that f(x) has n real roots: $x_i$, i ∈ [n]. It follows from (8.6) that
\[ \sum_{i=1}^{n} x_i^2 = \frac{a_{n-1}^2 - 2 \cdot a_n \cdot a_{n-2}}{a_n^2}. \]
Since $a_{n-1} = a_{n-2} = 0$, we get that $\sum_{i=1}^{n} x_i^2 = 0$. If all $x_i$ are real, then we get that $x_i = 0$ for all i. As a result, the considered polynomial is $f(x) = x^n$ (all $a_i$ are equal to 0), and the proof is finished.
\[ \sum_{i=1}^{3} \frac{1}{x_i^4}. \]
After squaring both sides and then dividing both sides by $x^4$, we get that $1/x^4 = 9(x+2)^2$. It follows that
\[ \sum_{i=1}^{3} \frac{1}{x_i^4} = 9\sum_{i=1}^{3} (x_i + 2)^2 = 9\sum_{i=1}^{3} (x_i^2 + 4x_i + 4) = 108 + 9\sum_{i=1}^{3} x_i^2 + 36\sum_{i=1}^{3} x_i. \]
\[ \sum_{i=1}^{3} x_i = \frac{-6}{3} = -2 \]
and
\[ \sum_{i=1}^{3} x_i^2 = \frac{6^2 - 2 \cdot 3 \cdot 0}{3^2} = 4. \]
Therefore, the sum we are looking for is equal to 108 + 9 ⋅ 4 + 36 ⋅ (−2) = 72.
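The value 72 can be checked numerically. The polynomial itself does not appear in this part of the text, so the sketch below assumes it is 3x³ + 6x² − 1, a choice that is consistent with the relations used above (sum of roots −2, sum of squares 4, and 1/x⁴ = 9(x + 2)²) but is not confirmed by the source. Roots are located by plain bisection on assumed sign-change intervals.

```python
def f(x):
    """Assumed polynomial (not stated in this excerpt): 3x^3 + 6x^2 - 1."""
    return 3 * x ** 3 + 6 * x ** 2 - 1


def bisect(lo, hi, iterations=200):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


# f changes sign on each of these intervals: f(-2) < 0 < f(-1), f(0) < 0 < f(1).
roots = [bisect(-2, -1), bisect(-1, 0), bisect(0, 1)]
total = sum(1 / r ** 4 for r in roots)

if __name__ == "__main__":
    print(roots)
    print(sum(roots), sum(r * r for r in roots), total)
```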
Problem 3.2.1. Find all functions f : Z → Z that satisfy the following condition:
\[ f(a+b)^3 - f(a)^3 - f(b)^3 = 3f(a)f(b)f(a+b) \]
for all a, b ∈ Z.
Solution. After setting a = b = 0, we get that $-f(0)^3 = 3f(0)^3$ and so f(0) = 0. Next, after considering any a = −b ∈ Z, we get that $f(a)^3 = -f(-a)^3$. Since the function $g(x) := x^3$ is a bijection, we get that f(a) = −f(−a), and so the function is symmetric about the origin (point (0, 0)). As a result, we may restrict our analysis to arguments that are natural numbers.
Suppose that f(1) = k for some k ∈ Z and consider x := f(2) ∈ Z which may or may not depend on k. By considering a = b = 1, we get that $x^3 - 2k^3 = 3k^2 x$ and so $(x - 2k)(x + k)^2 = 0$. It
some natural number $m_0 \ge 2$. Our goal is to show that $f(m_0 + 1) = k(m_0 + 1)$. By considering
\[ f(m_0+1)^3 - (km_0)^3 - k^3 - 3(km_0)k f(m_0+1) = 0. \]
Since our goal is to show that $f(m_0 + 1) = k(m_0 + 1)$, it will be convenient to factor out the term $f(m_0 + 1) - k(m_0 + 1)$ from the left hand side of the above equality. Guided by this, we re-write the equation as follows:
\[ \left(f(m_0+1) - k(m_0+1)\right)\left(f(m_0+1)^2 + k(m_0+1)f(m_0+1) + k^2(m_0^2 - m_0 + 1)\right) = 0. \]
(Recall that $m_0 \ge 2$.) It follows that $f(m_0 + 1) = k(m_0 + 1)$, as required, and so the proof by induction is finished.
Let us summarize our observations in this case. We obtained that one possible family of solutions is f(m) = km for some fixed integer k. It is straightforward to check that indeed this family satisfies our original equation.
Case 2: x = −k. We may assume that k ≠ 0, as this case was already considered above. After taking a = 2 and b = 1, we get that $f(3)^3 + k^3 - k^3 = 3(-k)k f(3)$, or equivalently that $f(3)(f(3)^2 + 3k^2) = 0$. Since the second term is positive, we get that f(3) = 0. Now, by considering
f(a + 3) = f(a). It follows that the only family of functions that satisfies these conditions is
\[ f(a) = \begin{cases} 0 & \text{if } a \equiv 0 \pmod 3 \\ k & \text{if } a \equiv 1 \pmod 3 \\ -k & \text{if } a \equiv 2 \pmod 3, \end{cases} \]
where k ∈ Z is some fixed integer. As usual, we directly check that this family satisfies the original condition.
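The "direct check" of both families can be mirrored by a small exhaustive test over a finite window of integers. The sketch below is an illustration with names chosen here; it confirms the functional equation on the window only, not for all integers.

```python
def satisfies_equation(f, span=15):
    """Check f(a+b)^3 - f(a)^3 - f(b)^3 == 3 f(a) f(b) f(a+b) for all |a|, |b| <= span."""
    for a in range(-span, span + 1):
        for b in range(-span, span + 1):
            left = f(a + b) ** 3 - f(a) ** 3 - f(b) ** 3
            right = 3 * f(a) * f(b) * f(a + b)
            if left != right:
                return False
    return True


def linear(k):
    """The family f(m) = k * m."""
    return lambda m: k * m


def periodic(k):
    """The family with values 0, k, -k on residues 0, 1, 2 modulo 3."""
    # Python's % returns a remainder in {0, 1, 2} even for negative m.
    return lambda m: (0, k, -k)[m % 3]


if __name__ == "__main__":
    print(all(satisfies_equation(linear(k)) for k in range(-4, 5)))
    print(all(satisfies_equation(periodic(k)) for k in range(-4, 5)))
    print(satisfies_equation(lambda m: m * m))
```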
Problem 3.2.2. Find all pairs of functions f : R → R and g : R → R such that
for all x, y ∈ R.
Solution. Let us fix y = 0 to get that for any x ∈ R, we have that
that the family of functions f(x) = g(x) = a − x for some fixed a ∈ R satisfies the original condition.
Problem 3.2.3. Find all functions f : R → R such that for all x, y ∈ R we have
Solution. Let us first set x = y = 0 to get that f(f(0)) = 2f(0). Now, set x = 0 and y = f(0) to get that f(0) = f(0) + f(f(f(0)) − f(0)). Using f(f(0)) = 2f(0) (twice!), we get that 0 = f(f(0)) = 2f(0), and so f(0) = 0. After fixing x = 0, we get that for any y ∈ R, we have f(−y) = f(f(y)). For any x ∈ R, after fixing y = f(x), we get that
as f(0) = 0. It follows that f(x) = −x, and one can directly check that this function satisfies the original condition:
f(f(x) − y) = −f(x) + y = x + y
= −x − f(y) + f(−x) + x
Problem 3.3.1. Prove that if a function f : R → R satisfies the condition f(x) = f(2x) = f(1 − x) for all x ∈ R, then it is periodic (that is, there exists some $a \in \mathbb{R}_+$ such that f(x + a) = f(x) for all x ∈ R).
Solution. Let f : R → R be any function that satisfies the condition. We will show that f is periodic with period a = 1/2. Indeed, note that for any x ∈ R, we have that
\[ f(x) = \frac{f(1/(1-x)) - 1}{x}. \]
Using this formula three times we get f(5) = (f(−1/4) − 1)/5, f(−1/4) = −4(f(4/5) − 1), and f(4/5) = 5(f(5) − 1)/4. It follows that
where $x_{i+n} = x_i$.
Solution. By considering $(x_1, x_2, x_3, x_4, x_5) = (0, 0, 0, 0, 0)$, we get that f(0, 0, 0) = 0. On the other
so there is hope that after subtracting the two, many values will cancel out. Indeed, after subtracting the two equalities and using the fact that f(0, 0, 0) = 0 we get
It follows that
Since $x_{i+n} = x_i$, all the terms in the second and the third sum cancel out and we finally get that
as required.
Problem 3.4.1. Let $f_1 = 0$, $f_2 = 1$, and $f_{n+2} = f_{n+1} + f_n$ for all n ∈ N. Find all polynomials P(x) having only integer coefficients with the property that for each n ∈ N there exists k = k(n) ∈ Z such that $P(k) = f_n$.
Solution. Suppose that P(x) is a polynomial with only integer coefficients, that is, $P(x) = \sum_{i=0}^{r} c_i x^i$ for some r ∈ N and $c_i \in \mathbb{Z}$ for all i ∈ [r] ∪ {0}. Let us start with proving the following useful property that we will use many times. Let p and q be any integers. Note that
\[ P(p) - P(q) = \sum_{i=0}^{r} c_i p^i - \sum_{i=0}^{r} c_i q^i = \sum_{i=1}^{r} c_i (p^i - q^i) = (p - q) \sum_{i=1}^{r} c_i \sum_{j=0}^{i-1} p^j q^{i-1-j}, \tag{8.8} \]
b = k(2) ∈ Z such that $P(a) = f_1 = 0$ and $P(b) = f_2 = 1$. It follows from (8.8) that (b − a) divides
Let us define the auxiliary polynomial Q(x) := P(a + (b − a)x). Clearly, Q(0) = P(a) = 0 and Q(1) = P(b) = 1. We will prove by induction that $Q(f_i) = f_i$ for all i ∈ N. This will finish the proof as the only polynomial that satisfies this property is Q(x) = x. Indeed, each polynomial R(x) of degree at least 2 has the property that |R(x)| > x for all $x \ge x_0$, where $x_0$ is a sufficiently large constant. Since Q(x) = x for infinitely many natural numbers x, we get that Q(x) has to be of degree at most 1. Constant polynomials are clearly ruled out and Q(x) = x is the only linear function that satisfies the property. Using the fact that |b − a| = 1, we get that the only polynomials that satisfy the original equation are polynomials of the form P(x) = x + c or P(x) = −x + c for some c ∈ Z. It is straightforward to check that, indeed, they satisfy the desired equation.
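Property (8.8), that p − q divides P(p) − P(q) for integer polynomials, is used repeatedly in this solution and can be spot-checked on random inputs. The sketch below is only an illustration: the helper names, the coefficient ranges, and the random seed are choices made here. It also confirms on a few values that P(x) = ±x + c attains any prescribed integer.

```python
import random


def evaluate(coeffs, x):
    """Horner evaluation; coeffs[i] is the coefficient of x**i."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def difference_is_divisible(coeffs, p, q):
    """Check that (p - q) divides P(p) - P(q) for p != q."""
    return (evaluate(coeffs, p) - evaluate(coeffs, q)) % (p - q) == 0


def linear_hits_target(sign, c, target):
    """For P(x) = sign * x + c with sign = +1 or -1, k = sign * (target - c) gives P(k) = target."""
    k = sign * (target - c)
    return evaluate([c, sign], k) == target


random.seed(0)
checks = []
for _ in range(1000):
    coeffs = [random.randint(-9, 9) for _ in range(random.randint(1, 6))]
    p, q = random.sample(range(-50, 51), 2)  # two distinct integers
    checks.append(difference_is_divisible(coeffs, p, q))

if __name__ == "__main__":
    print(all(checks))
    print(all(linear_hits_target(s, c, t) for s in (1, -1) for c in range(-3, 4) for t in (0, 1, 2, 3, 5, 8, 13)))
```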
It remains to show that $Q(f_i) = f_i$ for all i ∈ N. We already showed that this property holds for i = 1 and i = 2. For the base case, we will show that it also holds for i ∈ {3, 4, 5, 6, 7}. In fact, we will prove something stronger, namely, that $f_i$ is the only integer k that satisfies $Q(k) = f_i$.
Case: i = 3. Suppose that $Q(k) = f_3 = 2$ for some k ∈ Z. Using (8.8) we get that k − 1 divides $Q(k) - Q(1) = f_3 - f_2 = 2 - 1 = 1$, and so k − 1 = 1 or k − 1 = −1. Since Q(0) = 0, k = 0 is ruled out and we get that k = 2 is the unique solution.
Case: i = 4. Suppose that $Q(k) = f_4 = 3$ for some k ∈ Z. Using the same argument as before, we get
k − 2 = −1. Since k = 1 is ruled out (Q(1) = 1 ≠ 3), we get that $k = 3 = f_4$ is the unique solution.
ruled out. However, since also (k − 3) | (5 − 3), we get that k = 5 is the unique solution.
Case: i = 6. If $Q(k) = f_6 = 8$, then (k − 5) | (8 − 5) and so k − 5 ∈ {−3, −1, 1, 3}. It follows that k ∈ {4, 6, 8}, as k = 2 is already ruled out. Moreover, (k − 1) | (8 − 1), and so k = 8 is the only solution.
Case: i = 7. If $Q(k) = f_7 = 13$, then (k − 0) | (13 − 0) so k ∈ {−13, −1, 13} as k = 1 is ruled
Let k ∈ Z be such that $Q(k) = f_{n+1}$. Since Q(0) = 0, we get from (8.8) that k − 0 | Q(k) − Q(0) and so k | Q(k). Moreover, from the same property it follows that $k - f_n$ divides $f_{n+1} - f_n = f_{n-1}$ and so $-f_{n-1} \le k - f_n \le f_{n-1}$. We conclude that $5 = f_5 \le f_{n-2} < k \le f_{n+1}$ (note that $Q(f_{n-2}) = f_{n-2} \ne f_{n+1}$ so $k = f_{n-2}$ is ruled out). Since
Recall that our goal is to show that x = 1. For a contradiction, suppose that x > 1. Applying (8.8) twice, we get that (k − 1) | (xk − 1) and (k − 2) | (xk − 2). In other words, there exist a, b ∈ N such that b > a > 1, a(k − 1) = xk − 1, and b(k − 2) = xk − 2. It follows that a(k − 1) − b(k − 2) = 1, or equivalently that (b − a)(k − 1) = b − 1. It will be convenient to fix c := b − a ∈ N. We get that b = c(k − 1) + 1 and so b(k − 2) = (c(k − 1) + 1)(k − 2) = xk − 2. This means that k divides (c(k − 1) + 1)(k − 2) + 2 = k(ck − 3c + 1) + 2c and so we get that k | 2c.
Let us now summarize what we have learnt. We showed the following three things: x ≤ 7, k ≥ 6, and k | 2c. But using these observations, we get that
\[ {} \le \frac{6}{k - 3 + 2/k} \le \frac{6}{6 - 3 + 2/6} = \frac{9}{5}, \]
Solution. Let a ∈ R be a common root of P(P(P(x))) and P(x), that is, P(P(P(a))) = P(a) = 0. This implies that P(P(0)) = 0, that is, P(0) is also a root of P(x). But P(0) is an integer, as P(x) has all integer coefficients; in particular, the free term is an integer.
Problem 3.4.3. Consider a polynomial $P(x) = x^2 + ax + b$ with a, b ∈ Z. Suppose that for every prime number p, there exists k ∈ Z such that P(k) and P(k + 1) are divisible by p. Prove that there exists m ∈ Z such that P(m) = P(m + 1) = 0.
Solution. Fix any prime number p. By our assumption, there exists k = k(p) ∈ Z such that both P(k) and P(k + 1) are divisible by p. Our goal is to find a number which does not depend on k that is divisible by p. To that end, note that
\[ P(k+1) - P(k) = \left((k+1)^2 + a(k+1) + b\right) - (k^2 + ak + b) = 2k + (a + 1) \]
is divisible by p and so is
\[ 2P(k) - k(2k + (a+1)) = 2k^2 + 2ak + 2b - (2k^2 + k(a+1)) = k(a-1) + 2b. \]
\[ {} = -a^2 + 1 + 4b \]
is divisible by p. Since this property holds for any p, we get that $a^2 = 4b + 1$. In particular, a is odd,
Solution. Consider any polynomial P(x) with real coefficients that satisfies the desired property. Fix any rational number q and consider the polynomial Q(x) := P(q + x) + P(q − x) for all x ∈ R. Since for each x ∈ R we have that q + x + (q − x) = 2q is rational, it follows that Q(x) is rational for all x ∈ R. But, since Q(x) is continuous, it is only possible when Q(x) is constant. In particular,
\[ P(q + q) + P(q - q) = Q(q) = Q(0) = P(q + 0) + P(q - 0), \]
so P(2q) + P(0) = 2P(q). It will be convenient to represent P(x) as follows: $P(x) = x^2 R(x) + ax + b$ for some polynomial R(x). It follows that
\[ (2q)^2 R(2q) + a(2q) + b + b = 2(q^2 R(q) + aq + b) \]
so, assuming that q ≠ 0, we get that 2R(2q) = R(q). Since this argument holds for all rational numbers q, we get that $R(2^n q) = R(q)/2^n$. It follows that R(q) does not tend to +∞ or −∞ as q → +∞. This is only possible if R(q) = 0 for all rational numbers and so R(x) = 0 everywhere. As a result, we get that P(x) = ax + b. Now, after letting x = 0 we see that b must be rational, and after letting x = 1 we see that a must also be rational. Finally, one can directly check that if a and b are rational, then the desired condition is satisfied.
Problem 3.5.2. Let P(x) be a polynomial with real coefficients. Prove that if there exists an integer k such that P(k) is not an integer, then there are infinitely many such integers.
Solution. Let P(x) be any polynomial such that P(k) ∉ Z for some k ∈ Z. It will be more convenient to work with the polynomial Q(x) := P(x + k) instead of P(x). Indeed, if there are infinitely many integers ℓ such that Q(ℓ) ∉ Z, then clearly the same property holds for P(x). An advantage of working with Q(x) is that, by assumption, Q(0) = P(k) ∉ Z and evaluating polynomials at x = 0 is easy.
For a contradiction, suppose that the set A := {x ∈ Z : Q(x) ∉ Z} is finite which implies that the set B := Z ∖ A = {x ∈ Z : Q(x) ∈ Z} is infinite. Suppose that the degree of Q(x) is n ∈ N ∪ {0}. In fact, n ≠ 0 as Q(0) ∉ Z and Q(x) ∈ Z for any x ∈ B, and so Q(x) is not a constant polynomial. Since B is infinite, we may consider n + 1 points $(x_i, y_i)$, where both $x_i$ and $y_i = Q(x_i)$ are integers. From the Lagrange interpolation formula for these points, we get that all the coefficients of Q(x) are rational. It follows that $Q(x) = \sum_{i=0}^{n} \frac{n_i}{d_i} \cdot x^i$, where $n_i \in \mathbb{Z}$, $d_i \in \mathbb{N}$, and $Q(0) = n_0/d_0 \notin \mathbb{Z}$; in particular, $d_0 \ge 2$.
Let us now consider the sequence of natural numbers defined as follows: $y_t := \left(\prod_{i=0}^{n} d_i\right)^t$ for t ∈ N
\[ Q(y_t) = \sum_{i=0}^{n} \frac{n_i}{d_i} \cdot y_t^i = c_t + \frac{n_0}{d_0}, \]
where $c_t$ is some integer. It follows that $Q(y_t) \notin \mathbb{Z}$ for all t ∈ N, and so we have an infinite sequence $(y_t, Q(y_t))_{t \in \mathbb{N}}$ of distinct pairs consisting of integer and non-integer which contradicts the fact that A
Q(x) ≥ P(x) ≥ 0. In particular, it means that both polynomials are of degree at most 2n as having degree 2n + 1 would imply that either $\lim_{x\to\infty} P(x) = -\infty$ or $\lim_{x\to-\infty} P(x) = -\infty$. Moreover, from property (2) it follows that for i ∈ [n] we have $P(x_i) = Q(x_i) = 0$. But this means that all of the $x_i$ are roots of even multiplicity. Since we have 2n roots in total (including multiplicities) and P(x) and Q(x) have degree at most 2n, we get that there exists a ∈ [0, 1] such that P(x) = aQ(x), for all x ∈ R.
and so a = 1/2. It follows that for each x ∈ R, 2(G(x) − F(x)) = H(x) − F(x), and so 2G(x) = F(x) + H(x), as needed.
Problem 3.6.1. Find all real numbers m for which the polynomial $f(x) = 2x^4 - 7x^3 + mx^2 + 22x - 8$ has two real roots whose product is equal to 2.
Solution. Suppose that the polynomial f(x) has two real roots a and b such that ab = 2. It follows that $f(x) = (x - a)(x - b)(2x^2 + cx + d)$ for some a, b, c, d ∈ R. After comparing the corresponding
\[ \begin{aligned}
-2a - 2b + c &= -7 \\
2ab - c(a + b) + d &= m \\
abc - (a + b)d &= 22 \\
abd &= -8.
\end{aligned} \]
Since ab = 2, we get from the last equation that d = −4. Substituting it into the third equation and adding twice the first one, we get that c = 2. It follows that a + b = 9/2, and so m = −9 is the only possible solution. Since
\[ 2x^4 - 7x^3 - 9x^2 + 22x - 8 = 2(x - 4)(x - 1)(x - 1/2)(x + 2), \]
we get that, indeed, the polynomial f(x) has two roots, namely 4 and 1/2, whose product is equal to 2, as required.
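The factorization for m = −9 can be confirmed with exact rational arithmetic. The short sketch below (names chosen here for illustration) evaluates the quartic at the four claimed roots and lists the root pairs whose product is 2.

```python
from fractions import Fraction
from itertools import combinations


def f(x, m=-9):
    """The quartic 2x^4 - 7x^3 + m x^2 + 22x - 8 from the problem statement."""
    return 2 * x ** 4 - 7 * x ** 3 + m * x ** 2 + 22 * x - 8


claimed_roots = [Fraction(4), Fraction(1), Fraction(1, 2), Fraction(-2)]
pairs_with_product_two = [(a, b) for a, b in combinations(claimed_roots, 2) if a * b == 2]

if __name__ == "__main__":
    print([f(r) for r in claimed_roots])
    print(pairs_with_product_two)
```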
Problem 3.6.2. Given the polynomial $P(x) = x^4 - 3x^3 + 5x^2 - 9x$, x ∈ R, find all pairs of integers a and b such that a ≠ b and P(a) = P(b).
Solution. Let us first note that
\[ P(-x + 1) - P(x) = (2x - 1)(x^2 - x + 6) > 0, \]
provided that x > 2. It follows that for x ∈ N ∖ {1, 2}, we have that
As a result, there are no a, b ∈ Z ∖ {−1, 0, 1, 2} such that P(a) = P(b). We directly compute that P(−1) = 18, P(0) = 0, P(1) = −6, P(2) = −6, and P(3) = 18. Since all the values of the polynomial P(x) evaluated at integers greater than 3 or smaller than −1 are greater than P(3) = 18, we get that there are only four solutions to the problem:
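A brute-force search over a finite window of integers gives an independent check of this count. The sketch below is illustrative (the function names and the window size are choices made here); it lists unordered pairs a < b with P(a) = P(b), and each unordered pair corresponds to two ordered solutions.

```python
def P(x):
    """The polynomial x^4 - 3x^3 + 5x^2 - 9x from the problem statement."""
    return x ** 4 - 3 * x ** 3 + 5 * x ** 2 - 9 * x


def equal_value_pairs(bound=100):
    """Return all pairs (a, b) with -bound <= a < b <= bound and P(a) == P(b)."""
    seen = {}   # value of P -> list of arguments already visited
    pairs = []
    for a in range(-bound, bound + 1):
        value = P(a)
        for earlier in seen.get(value, []):
            pairs.append((earlier, a))
        seen.setdefault(value, []).append(a)
    return sorted(pairs)


if __name__ == "__main__":
    print([(x, P(x)) for x in range(-2, 5)])
    print(equal_value_pairs())
```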
Problem 3.6.3. Find all polynomials P(x) with real coefficients that satisfy the following property: for all x ∈ R, $P(x^2) \cdot P(x^3) = (P(x))^5$.
Solution. We will independently consider two cases depending on how many terms the considered polynomial has.
Let us first assume that P(x) has exactly one term (including the special case P(x) = 0 for x ∈ R), that is, $P(x) = ax^k$ for some a ∈ R and k ∈ N ∪ {0}. Substituting this into the equation we want to
in this case are: P(x) = 0, P(x) = 1, and $P(x) = x^k$ for some fixed k ∈ N. It is straightforward to directly check that all of these polynomials satisfy the original equation.
Let us now assume that P(x) has more than one term and has degree k ∈ N. Then, it can be represented as follows: $P(x) = ax^k + bx^\ell + Q(x)$, where a, b ∈ R ∖ {0}, ℓ ∈ Z such that 0 ≤ ℓ < k, and Q(x) has degree less than ℓ (or Q(x) = 0 everywhere if ℓ = 0). Substituting this form into the original equation we get that for all x ∈ R,
\[ \left(ax^{2k} + bx^{2\ell} + Q(x^2)\right)\left(ax^{3k} + bx^{3\ell} + Q(x^3)\right) = \left(ax^k + bx^\ell + Q(x)\right)^5. \]
Let us now compare the coefficients in front of the term $x^{4k+\ell}$ on both the left and the right hand side of the above equation. The first term on the left hand side is clearly $a^2 x^{5k}$ but, since ℓ < k, the next non-zero term is $abx^{3k+2\ell}$. Since 3k + 2ℓ < 4k + ℓ, there is no term we are looking for. Alternatively, we may say that the coefficient in front of $x^{4k+\ell}$ is equal to 0. On the other hand, the right hand side after expanding is equal to $a^5 x^{5k} + 5a^4 b x^{4k+\ell} + R(x)$, where R(x) has degree less than 4k + ℓ. It follows that the coefficient in front of the term $x^{4k+\ell}$ is equal to $5a^4 b \ne 0$. This contradiction proves that P(x) cannot have more than one term. We conclude that the only polynomials that satisfy the desired property are those that we found in the previous case.
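The conclusion can be illustrated with a little exact polynomial arithmetic on coefficient lists. The sketch below uses helper names chosen here and tests only a handful of sample polynomials, so it illustrates the statement rather than proving it.

```python
def multiply(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def substitute_power(p, e):
    """Return the coefficient list of P(x**e)."""
    result = [0] * ((len(p) - 1) * e + 1)
    for i, a in enumerate(p):
        result[i * e] = a
    return result


def power(p, n):
    """Return the coefficient list of P(x)**n."""
    result = [1]
    for _ in range(n):
        result = multiply(result, p)
    return result


def satisfies_property(p):
    """Check P(x^2) * P(x^3) == P(x)^5 as an identity of coefficient lists."""
    return multiply(substitute_power(p, 2), substitute_power(p, 3)) == power(p, 5)


if __name__ == "__main__":
    print(satisfies_property([0]), satisfies_property([1]), satisfies_property([0, 0, 1]))
    print(satisfies_property([1, 1]), satisfies_property([0, 2]), satisfies_property([1, 0, 1]))
```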
Problem 3.7.1. Prove that there are no polynomials $P_1(x), P_2(x), P_3(x), P_4(x)$ with rational coefficients that satisfy
\[ \sum_{i=1}^{4} (P_i(x))^2 = x^2 + 7 \quad \text{for all } x \in \mathbb{R}. \tag{8.9} \]
Solution. Due to the symmetry, without loss of generality, we may assume that $n := n_1 \ge n_2 \ge n_3 \ge n_4 \ge 0$, where $n_i$ is the degree of $P_i(x)$, i ∈ [4]. For i ∈ [4], let $c_i \in \mathbb{R}$ be the coefficient in front of the term $x^n$ in $P_i(x)$. Clearly, $c_1 \ne 0$. More importantly, after expanding the left hand side of (8.9), the coefficient in front of the term $x^{2n}$ is equal to $\sum_{i=1}^{4} c_i^2 \ge c_1^2 > 0$. Since the degree of the right hand side of (8.9) is 2, we get that n = 1. As a result, since all the coefficients are rational, we may represent each polynomial $P_i(x)$ as follows: $P_i(x) = (a_i x + b_i)/m$, where $a_i$, $b_i$ (
After comparing the coefficients in (8.9) that are in front of the term $x^k$ for k ∈ {2, 1, 0}, we get the following set of equations:
\[ \begin{aligned}
a_1^2 + a_2^2 + a_3^2 + a_4^2 &= m^2 \\
a_1 b_1 + a_2 b_2 + a_3 b_3 + a_4 b_4 &= 0 \\
b_1^2 + b_2^2 + b_3^2 + b_4^2 &= 7m^2
\end{aligned} \]
For $i \in [4]$, let $p_i := a_i + b_i$ and $q_i := a_i - b_i$. Adding the first, the third, and twice the second
equation we get that $p_1^2 + p_2^2 + p_3^2 + p_4^2 = 8m^2$. Adding the first, the third, and subtracting the second
equation twice we get that $q_1^2 + q_2^2 + q_3^2 + q_4^2 = 8m^2$. Finally, after subtracting the third equation from
the first equation, we get that $p_1 q_1 + p_2 q_2 + p_3 q_3 + p_4 q_4 = -6m^2$. Summarizing, we get the
following system of equations:
$$p_1^2 + p_2^2 + p_3^2 + p_4^2 = 8m^2$$
$$q_1^2 + q_2^2 + q_3^2 + q_4^2 = 8m^2$$
$$p_1 q_1 + p_2 q_2 + p_3 q_3 + p_4 q_4 = -6m^2.$$
We may assume that $\gcd(p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m) = 1$. Indeed, if
$(p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m)$ had some common positive divisor $d$, one could divide the three
equations by $d^2$ to get an equivalent system of equations. Note that for any $x \in \mathbb{Z}$, the remainder of $x^2$
when divided by 8 is equal to 0, 1 or 4. Hence, from the first equation we get that all $p_i$ are even,
and from the second one it follows that all $q_i$ are even. But this means that $m$ is odd, since
$\gcd(p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m) = 1$. However, if $m$ is odd and all other variables are even, then the
left hand side of the third equation is divisible by 4 whereas the right hand side is not. The conclusion is
that there are no polynomials with rational coefficients that satisfy (8.9), and so the proof is complete.
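The modular facts used in this argument can be checked exhaustively, since the remainder of a square modulo 8 depends only on the remainder of the number itself. The following is a minimal Python sketch (the function and variable names are illustrative and do not come from the text):

```python
from itertools import product

# Possible remainders of x^2 when divided by 8.
square_residues = sorted({(x * x) % 8 for x in range(8)})

def all_even_when_sum_divisible_by_8():
    """If p1^2 + p2^2 + p3^2 + p4^2 is divisible by 8, must every p_i be even?

    It is enough to test the residues 0..7 of each p_i, because both
    p_i^2 mod 8 and the parity of p_i depend only on p_i mod 8.
    """
    for ps in product(range(8), repeat=4):
        if sum(p * p for p in ps) % 8 == 0 and any(p % 2 for p in ps):
            return False
    return True

# For odd m the right hand side -6m^2 leaves remainder 2 modulo 4,
# so it is never divisible by 4.
rhs_never_divisible_by_4 = all((-6 * m * m) % 4 == 2 for m in range(1, 200, 2))

print(square_residues, all_even_when_sum_divisible_by_8(), rhs_never_divisible_by_4)
```

The last check covers only finitely many odd values of m; the general statement follows because m^2 leaves remainder 1 modulo 4 for every odd m.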
Problem 3.7.2. Consider a polynomial $f(x) := x^2 + bx + c$, where $b, c \in \mathbb{Z}$. Prove that if $n \in \mathbb{N}$
n | (p − q)(q − r)(r − p ) .
= (p − q)((p + q) + b).
$= (p - q)(rp + rq) + pq^2 - pr^2 + qr^2 - qp^2$
$= (p - q)(rp + rq) - (p - q)(pq + r^2)$
$= (p - q)(rp + rq - pq - r^2)$
Solution. We will show that this is not true for $P(x) := x^3 - 2x$. First, let us observe that it is
enough to show that $P(x)$ has the desired property, namely, that there are no two different rational
numbers a and b such that $P(a) = P(b)$. For a contradiction, suppose that there are $q_1, q_2 \in \mathbb{Q}$ such
that $q_1 \ne q_2$ and $P(q_1) = P(q_2)$. It follows that
$$0 = P(q_1) - P(q_2) = q_1^3 - 2q_1 - q_2^3 + 2q_2 = (q_1 - q_2)(q_1^2 + q_1 q_2 + q_2^2 - 2).$$
Since $q_1$ and $q_2$ are rational numbers, we may express these numbers as follows: $q_1 = a/c$ and
$q_2 = b/c$ for some $a, b, c \in \mathbb{Z}$, and $\gcd(a, b, c) = 1$. It follows that
$$a^2 + ab + b^2 = 2c^2. \qquad (8.10)$$
Note that a and b cannot be both even as then c would be odd (recall that gcd(a, b, c) = 1), and so the
left hand side of (8.10) would be divisible by 4 whereas the right hand side would not. Similarly, if both
a and b were odd, then the left hand side would be odd but the right hand side would be even. Finally,
if only one of the two numbers a and b is even and the other one is odd, then the left hand side is odd,
which is again impossible. We get the desired contradiction and so, indeed, the polynomial
$P(x) = x^3 - 2x$ is a counter-example to our problem.
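The parity argument for (8.10) amounts to a statement modulo 4, which can be checked exhaustively. The sketch below (illustrative names, not part of the original solution) lists all residue triples that are not all even and satisfy the equation modulo 4, and additionally confirms on a finite grid of rationals that $P(x) = x^3 - 2x$ takes no value twice. The grid check is only a sanity test on a sample and proves nothing by itself.

```python
from fractions import Fraction
from itertools import product

# Residue triples (a, b, c) modulo 4, not all even, with a^2 + ab + b^2 = 2c^2 (mod 4).
# A solution of (8.10) with gcd(a, b, c) = 1 would have to reduce to such a triple.
bad_triples = [
    (a, b, c)
    for a, b, c in product(range(4), repeat=3)
    if (a % 2 or b % 2 or c % 2) and (a * a + a * b + b * b - 2 * c * c) % 4 == 0
]

def P(x):
    return x ** 3 - 2 * x

# Finite sample of rationals p/q with |p| <= 30 and 1 <= q <= 12.
sample = {Fraction(p, q) for q in range(1, 13) for p in range(-30, 31)}
values = {P(x) for x in sample}

print(bad_triples, len(sample) == len(values))
```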
Let us make a final remark on how one can guess that $P(x) = x^3 - 2x$ is a counter-example to our
problem. Let us first consider polynomials with integer coefficients that are of degree 2. Such
polynomials can be expressed as follows: $P(x) := a(x - p)(x - q)$, where $a, -a(p + q), apq \in \mathbb{Z}$.
Since $P(0) = P(p + q) = apq$ and both 0 and $p + q$ are rational, no polynomial of degree 2 satisfies
the desired property.
Hence, we shift our attention to polynomials of degree 3 by considering polynomials of the form
$P(x) := a(x - p)(x - q)(x - r)$ and with integer coefficients. No two of the three roots, say p and q,
can be rational as then $P(p) = P(q) = 0$ fails the required assumption. So we have two options: all of
them are irrational or exactly one is rational. The second option seems easier to deal with and, without
loss of generality, we may assume that $p = 0$. Indeed, if $P(x)$ is a counter-example, then so is
$Q(x) := P(x - p)$. Since $P(x)$ has integer coefficients, we get that $a \in \mathbb{Z}$ and again, without loss of
generality, we may assume that $a = 1$. It follows that $P(x) = x(x - q)(x - r)$ with $q, r \in \mathbb{R} \setminus \mathbb{Q}$ but
$qr \in \mathbb{Z}$. A natural choice for q is an irrational square root of some natural number and $r = -q$ so that
$P(x) = x(x^2 - q^2)$. Choosing $q = \sqrt{2}$ is an intuitive first guess, as it is related to a well known proof
that $\sqrt{2}$ is not a rational number.
8.4 Combinatorics
Problem 4.1.1. There are 2n members of a chess club; each member knows at least n other members
(knowing a person is a reciprocal relationship). Prove that it is possible to assign members of the club
into n pairs in such a way that in each pair both members know each other.
Solution. Let us first rephrase the question in the language of graph theory. Suppose that G = (V , E) is
a graph on |V | = 2n vertices and the minimum degree, δ = δ(G) ≥ n. Our goal is to show that G has
a perfect matching.
We will construct a perfect matching in n rounds, distinguishing two phases. During the first phase,
we apply a trivial, greedy algorithm to construct a maximal matching, that is, a matching that cannot be
extended by adding an edge. We start with an empty matching $M_0 = (\emptyset, \emptyset)$. In each round i, we
consider the graph $G[V \setminus V(M_{i-1})]$ induced by unsaturated vertices. If it contains edges, then we
arbitrarily pick one of them (say, edge $a_i b_i$) and add it to the current matching; that is,
$V(M_i) = V(M_{i-1}) \cup \{a_i, b_i\}$ and $E(M_i) = E(M_{i-1}) \cup \{a_i b_i\}$. This phase ends if there are no more
edges to pick from. If all the vertices are saturated, then we are done; otherwise, we move on to the
second phase.
During the second phase, at the beginning of each round $i \le n$, the set $V \setminus V(M_{i-1})$ contains at least
two vertices and it induces an independent set. We pick any two vertices, say, p and q from that set. We
will show that there is an edge in $E(M_{i-1})$, say, rs such that $pr \in E$ and $qs \in E$. In other words, we
will show that there exists a path (p, r, s, q) of length 3 (such paths are often called augmenting paths).
The existence of such paths allows us to improve the size of our matching. Indeed, we can simply
remove rs from the matching and add edges pr and qs instead. Formally, $V(M_i) = V(M_{i-1}) \cup \{p, q\}$
and $E(M_i) = (E(M_{i-1}) \setminus \{rs\}) \cup \{pr, qs\}$.
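To finish the proof, let us note that p has at least n neighbors in $V(M_{i-1})$ (since $\delta \ge n$ and
$V \setminus V(M_{i-1})$ induces an independent set). Let $R := N(p) \subseteq V(M_{i-1})$, and let S be the set of vertices
matched with vertices from R, that is, $S = \{s \in V(M_{i-1}) : sr \in E(M_{i-1}) \text{ for some } r \in R\}$. Clearly,
$|S| = |R| \ge \delta \ge n$. Moreover, S and R can overlap (and, in fact, they do) but it causes no problem.
By the same argument, q has at least n neighbors in $V(M_{i-1})$, whereas at most
$|V(M_{i-1})| - |S| \le (2n - 2) - n = n - 2$ vertices of $V(M_{i-1})$ lie outside of S. It
follows that q has at least one neighbor in S which finishes the argument.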
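The two phases of the proof translate directly into a procedure. The following Python sketch is one possible rendering under the assumption that the graph is given as a dictionary mapping each vertex to a collection of its neighbors; the function name `perfect_matching` and the example graph are illustrative and not part of the original text.

```python
def perfect_matching(adj):
    """Return a dict `mate` describing a perfect matching.

    Assumes `adj` has 2n vertices and minimum degree at least n, as in the
    problem; otherwise the second phase may fail and raise ValueError.
    """
    mate = {}
    # Phase 1: greedy maximal matching.
    for u in adj:
        if u not in mate:
            for w in adj[u]:
                if w not in mate:
                    mate[u], mate[w] = w, u
                    break
    # Phase 2: the unsaturated vertices form an independent set; take them
    # in pairs (p, q) and look for a matching edge rs with pr and qs in E.
    free = [v for v in adj if v not in mate]
    while free:
        p, q = free.pop(), free.pop()
        for r in adj[p]:
            s = mate[r]  # every neighbor of p is already matched
            if s in adj[q]:
                # Replace the edge rs by the two edges pr and qs.
                mate[p], mate[r] = r, p
                mate[q], mate[s] = s, q
                break
        else:
            raise ValueError("no augmenting path: degree condition violated")
    return mate

# Example with n = 2: the complete graph on {0, 1, 2, 3} without the edge 03.
# The greedy phase first picks the edge 12, which leaves 0 and 3 unsaturated.
adj = {1: [2, 0, 3], 2: [1, 0, 3], 0: [1, 2], 3: [1, 2]}
mate = perfect_matching(adj)
print(sorted({tuple(sorted(pair)) for pair in mate.items()}))  # [(0, 2), (1, 3)]
```

Adjacency lists are used in the example only to make the order in which the greedy phase inspects edges predictable; sets work equally well.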
Finally, let us mention that a stronger property holds. Graph G not only contains a perfect matching,
but it in fact has to have a Hamilton cycle, that is, a cycle of length 2n whose vertex set is precisely
V (G). This is a famous sufficient condition for the existence of a Hamilton cycle due to Dirac. It is
indeed a stronger property since one can take every second edge of a Hamilton cycle to form a perfect
matching.
Problem 4.1.2. There are 17 players in a tournament in which every pair of players competes
against each other. Every game can last 1, 2, or 3 rounds. Prove that there exist three players who have
played exactly the same number of rounds with one another.
Solution. As before, let us rephrase this problem in the language of graph theory. The tournament can
be represented as coloring of the edges of K17, the complete graph on 17 vertices, with three colors
(say, red, blue, and green). Coloring edge vw red indicates that the game between players
corresponding to vertices v and w lasted one round. Similarly, blue and green indicate that the
corresponding game lasted two and, respectively, three rounds. Our goal is to show that, regardless of how
the graph is colored, it must contain a monochromatic triangle (that is, the edges of some K3 are all the
same color).
In order to warm up, let us prove something slightly easier. Suppose that only two colors are
available (it does not matter which ones; without loss of generality, we may assume that we use red and
blue). We claim that, regardless of how the edges of K6 are colored with these two selected colors, there is
a monochromatic triangle. Indeed, pick any vertex v and consider the 5 edges incident to v. Clearly, at
least three of them (say va, vb, and vc) must be of the same color, say red. If any one of ab, ac, bc is
red, then we have a red triangle. If none of these edges is red, then we have a blue triangle. This proves
the claim about two colors.
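The two-color claim is small enough to confirm by brute force: K6 has 15 edges and hence only 2^15 colorings. The sketch below (illustrative names; a sanity check rather than a replacement for the proof) also shows that the claim fails for K5.

```python
from itertools import combinations, product

def has_mono_triangle(n, colour):
    """True if some triangle of K_n has all three edges of the same color."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_colouring_has_mono_triangle(n, colours=2):
    """Check all edge colorings of K_n with the given number of colors."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, cols)))
        for cols in product(range(colours), repeat=len(edges))
    )

print(every_colouring_has_mono_triangle(6))  # True
print(every_colouring_has_mono_triangle(5))  # False
```

The same exhaustive approach is hopeless for three colors on K17, which has 136 edges; there the counting argument is essential.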
Now, let us come back to the original problem with three colors and K17. Pick any vertex v and
consider the 16 edges incident to v. Since 3 ⋅ 5 < 16, at least 6 of them must be of the same color, say,
green. Let N be the set of neighbors of v that are adjacent to v by a green edge. If any edge of G[N ],
the graph induced by N, is colored green, then we have a green triangle. If none of these edges is green,
then all the edges of G[N ] are colored red and blue. By the previous claim, this also generates a
monochromatic triangle and so we are done.
Let us mention that this result is sharp, namely, one can color the edges of K16 with three colors and
avoid a monochromatic triangle. Finally, let us mention that this is a specific case of the classic and
famous problem of Ramsey numbers (for three colors and triangles). Indeed, this observation can be
generalized to any number of colors and any order of a monochromatic complete graph.
Problem 4.1.3. Consider a group of people with the following property. Some of them know each other,
in which case the corresponding pair of people mutually like each other or dislike each other.
Moreover, there is a person who knows at least six other people. Interestingly, for each person the
number of people he or she likes is equal to the number of people he or she dislikes. Prove that it is
possible to remove some, but not all, like/dislike links such that it is still the case that each person has
the same number of liked and disliked acquaintances.
Solution. As usual, let us rephrase this problem in the language of graph theory. Note first that
acquaintances can be modelled by a graph G: if v and w know each other, then we put an edge between
v and w; otherwise, v and w are not adjacent. Then, likes and dislikes can be represented by coloring
edges red and, respectively, blue. We assume that the maximum degree is at least 6. More importantly,
we assume that the following property holds: for each vertex v ∈ V (G), the number of red edges
incident to v is equal to the number of blue edges incident to v (in particular, it implies that each vertex
has even degree). Our goal is to show that it is possible to remove some edges (but not all of them) such
that this property is preserved.
It will be convenient to use a notion of a walk on graphs. A walk $W = (v_0, \ldots, v_k)$ of length k is a
sequence of vertices such that $v_{i-1} v_i \in E$ for any $i \in [k]$. Note that walks are allowed to revisit some
vertices and edges but they do not have to. As a result, a path is a walk but not every walk is a path.
In order to show the result, let us select any vertex $v_0$ that has degree at least 6 and start walking
from there, first using a red edge and then alternating colors. Note that, because of the property of our
coloring, each time we enter some vertex $v \ne v_0$ we may continue using some other edge of the other
color. Hence, at some point, we need to get back to $v_0$; let $W_1$ be the walk we did so far. If $W_1$ has even
length, then we get the desired property after removing edges of this walk (note that each vertex on the
walk is incident to the same number of red and blue edges used in this walk). On the other hand, if the
length of $W_1$ is odd, then the two edges used by $W_1$ that are incident to $v_0$ are red. We continue walking
from $v_0$ starting from a blue edge and alternating colors. However, this time, we are not allowed to use
any edges of $W_1$. As before, we are guaranteed that we will not get stuck and we need to get back to $v_0$;
let $W_2$ be the second walk. If $W_2$ is even, then removing this walk gives us the desired property. If it is
odd, then removing both $W_1$ and $W_2$ does the trick. (Recall that $W_1$ and $W_2$ are edge disjoint.) Finally,
let us mention that the condition that the maximum degree is at least 6 is needed. One can easily
construct a counter-example when $\Delta(G) = 4$.
Problem 4.2.1. Consider a square grid of size 25 × 25 that has a smaller square grid of size 5 × 5 cut
out from its bottom left corner. Can you cover the remaining cells with 100 blocks of size 1 × 6 or
2 × 3?
Solution. Label the grid so that the bottom left cell has label (1, 1) and the top right one has label
(25, 25). Put 1 in a cell with label (i, j) if i + j is divisible by 3; otherwise, put 0—see Figure 8.4.
Observe that each block (regardless of whether it is of size 1 × 6 or 2 × 3) covers precisely two 1’s. There
are 100 such blocks but the number of 1’s to cover is 199. To see this note that in each row we have
either 8 or 9 ones (before removing 5 × 5 square). We have exactly 17 rows with 8 ones and 8 rows
with 9 ones. It follows that the 25 × 25 grid contains 17 ⋅ 8 + 8 ⋅ 9 = 208 ones. After removing 9 of
them from the bottom left 5 × 5 square grid we are left with 199 ones. The conclusion is that no matter
how hard we try, we will not be able to cover the remaining cells with 100 blocks.
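The counts used in this solution are easy to reproduce. The following minimal sketch (helper names are illustrative) counts the marked cells that remain after the 5 × 5 corner is removed and checks that every placement of a 1 × 6 or 2 × 3 block, in either orientation, covers exactly two marked cells.

```python
def is_marked(i, j):
    """Cell (i, j) carries a 1 exactly when i + j is divisible by 3."""
    return (i + j) % 3 == 0

def block_cells(i, j, height, width):
    """Cells of a height x width block whose corner with smallest labels is (i, j)."""
    return [(i + di, j + dj) for di in range(height) for dj in range(width)]

corner = {(i, j) for i in range(1, 6) for j in range(1, 6)}
board = [(i, j) for i in range(1, 26) for j in range(1, 26) if (i, j) not in corner]

ones_to_cover = sum(is_marked(i, j) for i, j in board)

# Number of marked cells covered by a block, over all shapes and positions
# inside the 25 x 25 grid (positions overlapping the removed corner included).
shapes = [(1, 6), (6, 1), (2, 3), (3, 2)]
ones_per_block = {
    sum(is_marked(a, b) for a, b in block_cells(i, j, h, w))
    for h, w in shapes
    for i in range(1, 26 - h + 1)
    for j in range(1, 26 - w + 1)
}

print(len(board), ones_to_cover, ones_per_block)  # 600 199 {2}
```

So 100 blocks would cover an even number of marked cells (namely 200), while 199 of them need to be covered.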
FIGURE 8.4: Illustration for Problem 4.2.1. We put ‘x’ in places where 1 should be placed. We also show ‘x’ in the 5 × 5 grid that was
removed.
Problem 4.2.2. Prove that it is impossible to cover a square grid of size 9 × 9 with tiles of size 1 × 5 or
1 × 6.
Solution. Let us first observe that any potential covering would have to use at least 14 blocks (as
13 ⋅ 6 = 78 < 81 = 9 ⋅ 9). Hence, there must be at least 7 blocks that are vertical or at least 7 that are
horizontal. Without loss of generality, we may assume that there are at least 7 horizontal blocks.
Each horizontal block has length at least 5 and so it covers a cell of the middle column (column 5); in
particular, no row contains two horizontal blocks. At most 2 cells of the middle column are then left
uncovered, which is too few for a vertical block, so no vertical block can intersect this column.
This means that there must be exactly 9 horizontal blocks, one in each row. Consider now the middle
row (row 5). At least 5 of its cells are covered by the horizontal block and every vertical block covers
a cell of this row, so there are at most 4 vertical blocks. But 4 + 9 = 13 < 14, a contradiction.
Problem 4.2.3. Can you cover a square grid of size 10 × 10 with 25 “T-shaped” blocks consisting of 4
small squares?
Solution. As usual, label the grid so that the bottom left cell has label (1, 1) and the top right one has
label (10, 10). Put 1 in a cell with label (i, j) if i + j is even, and 0 otherwise. The number of 1’s
covered by each block is 1 or 3. Since there are 25 blocks they are going to cover an odd number of 1’s.
On the other hand, the 10 × 10 grid contains an even number of 1’s (precisely 50). Hence, our task is
not possible.
Problem 4.3.1. The class consists of 12 people. Count in how many ways one can divide them into 6
pairs, 4 triples, 3 quadruples, and 2 six-tuples. Which option yields the largest number of possibilities?
Solution. Suppose that we have n people and we want to divide them into k groups. Assume that k | n
so that there will be s = n/k people in each group. In order to generate a division, we can first assign
unique numbers from the set [n] to all the people. This can be done in n! ways. Now, people with
numbers from 1 to s form the first group, people with numbers from s + 1 to 2s form the second group,
and so on.
The problem is that a given division is generated multiple times. First of all, we do not care if
{1, 2, …, s} form the first or the second group. That means that we can rearrange the k groups in any
way we want. There are k! ways to do it. Moreover, in any particular group such as {1, 2, …, s}, it
does not matter who has 1 assigned and who has 2. This gives us another factor of s! per group.
Combining these observations together we get that there are $\frac{n!}{k!(s!)^k}$ such divisions. (To be slightly more
formal, one can construct a bijection between the family of unique divisions and the partition of the set
of permutations into sets of size $k!(s!)^k$.) Alternatively, one can count it as $\prod_{i=0}^{k-1} \binom{n-is}{s} / k!$, because
we can iteratively select s-element sets and then observe that each division is counted k! times.
In our particular situation, we have n = 12 people so we can divide them into $\frac{12!}{6! \, 2^6} = 10{,}395$ pairs
($k = 6$), $\frac{12!}{4! \, 6^4} = 15{,}400$ triples ($k = 4$), $\frac{12!}{3! \, 24^3} = 5{,}775$ quadruples ($k = 3$), and
$\frac{12!}{2! \, 720^2} = 462$ six-tuples ($k = 2$).
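Both counting formulas from the solution can be evaluated directly. The sketch below uses Python's standard `math` module (Python 3.8 or newer is assumed for `comb` and `prod`); the function names are illustrative.

```python
from math import comb, factorial, prod

def divisions(n, k):
    """Number of ways to split n people into k unlabeled groups of size n/k."""
    s = n // k
    return factorial(n) // (factorial(k) * factorial(s) ** k)

def divisions_iterative(n, k):
    """Same count: pick the groups one by one, then forget their order."""
    s = n // k
    return prod(comb(n - i * s, s) for i in range(k)) // factorial(k)

counts = {k: divisions(12, k) for k in (6, 4, 3, 2)}
print(counts)  # {6: 10395, 4: 15400, 3: 5775, 2: 462}
```

With these values, the division into triples (k = 4) gives the largest count.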
the same column, ith rook can be placed in n − i + 1 ways. As a result, the number of ways we can
achieve our task is equal to
$$\binom{n}{k} \frac{n!}{(n-k)!} = \left( \frac{n!}{(n-k)!} \right)^2 / k!.$$
Another way to see it is to notice that each rook eliminates precisely one column and one row. Hence,
one can place rooks one by one and observe that there are $(n + 1 - i)^2$ spots available for placing the
i-th rook. Once we finish the process, there are k! duplicates because of k! possible permutations of
placing rooks.
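For small boards the formula can be compared with a direct enumeration. The sketch below assumes that the task is to place k identical rooks on an n × n board so that no two share a row or a column; this reading is inferred from the solution and is an assumption, as are the function names.

```python
from itertools import combinations
from math import factorial

def count_rook_placements(n, k):
    """Brute force: choose k cells and keep choices with distinct rows and columns."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    count = 0
    for placement in combinations(cells, k):
        rows = {r for r, _ in placement}
        cols = {c for _, c in placement}
        if len(rows) == k and len(cols) == k:
            count += 1
    return count

def rook_formula(n, k):
    """(n! / (n - k)!)^2 / k!, the closed form from the solution."""
    return (factorial(n) // factorial(n - k)) ** 2 // factorial(k)

print(count_rook_placements(4, 2), rook_formula(4, 2))  # 72 72
```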
Problem 4.3.3. Create all possible 4-digit numbers using digits from set [9] = {1, 2, 3, 4, 5, 6, 7, 8, 9}.
Find the sum of those numbers.
Solution. Let us first notice that there are $9^4$ numbers that satisfy our requirement. Then, notice that
number $c_1 c_2 c_3 c_4$ can be associated with number $d_1 d_2 d_3 d_4$, where $d_i = 10 - c_i$. As a result, we get a
bijection from the set of possible numbers to itself. Additionally, the sum of $c_1 c_2 c_3 c_4$ and $d_1 d_2 d_3 d_4$ is
equal to 11,110, independently of the number used. Therefore, the sum of all numbers is equal to
$9^4 \cdot 11{,}110/2 = 36{,}446{,}355$. (We had to divide the value by 2 because each number was counted
twice.)
It is easy to verify that our result is correct using the following one line Julia code: sum(x for x in
1000:9999 if !(0 in digits(x))).
Problem 4.3.4. Alice has 20 balls, all different. She first splits them into two piles and then she picks
one of the piles with at least two balls, and splits it into two. She repeats this until each pile has only
one ball. Find the number of ways in which she can carry out this procedure.
Solution. The number of ways this splitting procedure can be carried out is the same as the number of
ways to do it backward; that is, Alice can start with 20 piles, each of them containing only one ball, and
then keep merging piles together. Indeed, to see this let us note that any sequence of splits of one set of
20 balls that results in 20 sets can be uniquely reversed to get a sequence of merges from 20 sets to one
set. In other words, there is a bijection between the two sets corresponding to these two operations and
so it does not matter which one we concentrate on. Working backward is slightly easier. In the i-th
move, Alice has $21 - i$ sets to choose from so she can do $\binom{21-i}{2} = (21 - i)(20 - i)/2$ different
merges. As she does 19 moves in total, we get that the number of ways is equal to
$$\prod_{i=1}^{19} \frac{(21 - i)(20 - i)}{2} = 20! \cdot 19!/2^{19}.$$
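The general form of this count for n balls is $n!(n-1)!/2^{n-1}$, and for small n it can be compared with a direct enumeration of all splitting sequences. The sketch below is such a check (function names are illustrative); it enumerates splits in the forward direction, so it also illustrates the claimed bijection with merges.

```python
from functools import lru_cache
from itertools import combinations
from math import factorial

def count_split_sequences(n):
    """Count sequences of splits that turn one pile of n distinct balls into singletons."""
    @lru_cache(maxsize=None)
    def ways(piles):
        big = [p for p in piles if len(p) > 1]
        if not big:
            return 1
        total = 0
        for pile in big:
            first, *rest = sorted(pile)
            # The part containing `first` takes `size` further balls, but not all
            # of them, so each unordered split into two piles is listed once.
            for size in range(len(rest)):
                for stay in combinations(rest, size):
                    a = frozenset((first, *stay))
                    b = pile - a
                    total += ways(piles - {pile} | {a, b})
        return total

    return ways(frozenset([frozenset(range(n))]))

def split_formula(n):
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)

print([count_split_sequences(n) for n in range(1, 6)])  # [1, 1, 3, 18, 180]
```

The enumeration grows quickly and is only meant for small n; for 20 balls the closed form should be used.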
Problem 4.4.1. There is a club with 100 members where there are 1,000 pairs of friends. We want to
pick a three person team from the club with one team member selected as a team leader. The procedure
is that one club member first becomes a leader. The leader then chooses two followers from his/her
friends and the team is formed. Show that it is possible to pick a team from the club in at least 19,000
ways.
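Solution. Suppose that the ith club member has $d_i$ friends. Since there are 1,000 pairs of friends in the
club, we get that $\sum_{i=1}^{100} d_i = 2{,}000$. If we choose the ith member as a leader, he/she can form
$\binom{d_i}{2} = d_i(d_i - 1)/2$ unique teams. After taking the sum over all club members we get the number of
possible teams is equal to
$$\sum_{i=1}^{100} \frac{d_i(d_i - 1)}{2} = \frac{1}{2} \sum_{i=1}^{100} d_i^2 - \frac{1}{2} \sum_{i=1}^{100} d_i = \frac{1}{2} \sum_{i=1}^{100} d_i^2 - 1{,}000.$$
Now we see from Jensen’s inequality applied to $f(x) = x^2$ (see Section 1.1) that
$$\sum_{i=1}^{100} d_i^2 = 100 \sum_{i=1}^{100} \frac{d_i^2}{100} \ge 100 \left( \sum_{i=1}^{100} d_i/100 \right)^2 = 40{,}000.$$
Hence, the number of possible teams is at least 40,000/2 − 1,000 = 19,000.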
Let us note that this problem can be reformulated in the language of graph theory. One can consider
a “friendship graph” consisting of 100 vertices corresponding to the club members and edges that
represent friendship relationships. Our goal is to show that any graph on 100 vertices with the average
degree 20 has at least 19,000 paths of length 2. Indeed, each path abc of length 2 corresponds to a
team with b being the leader of the team.
Let us also note that the lower bound we just proved is best possible as it is achieved when every
member of the club has precisely 20 friends. The corresponding arrangement exists and an underlying
graph is called a 20-regular graph. In order to see one possible example, imagine all members of the
club sitting in a circle. Each member is a friend with 10 people to the left and with 10 people to the
right.
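The circular arrangement described above can be built explicitly to confirm that it has 1,000 pairs of friends and exactly 19,000 teams. This is a minimal sketch with illustrative names; the second degree sequence in it is a made-up example used only to show that an uneven distribution of friends gives more teams.

```python
def circle_friends(members=100, reach=10):
    """Each member is a friend of the `reach` people on either side in the circle."""
    return {
        v: {(v + d) % members for d in range(1, reach + 1)}
        | {(v - d) % members for d in range(1, reach + 1)}
        for v in range(members)
    }

def count_teams_from_degrees(degrees):
    """A leader with d friends can pick the two followers in d(d-1)/2 ways."""
    return sum(d * (d - 1) // 2 for d in degrees)

friends = circle_friends()
pairs = sum(len(f) for f in friends.values()) // 2
teams = count_teams_from_degrees(len(f) for f in friends.values())
print(pairs, teams)  # 1000 19000

# Hypothetical uneven degree sequence with the same sum of degrees (2,000).
uneven = [30] * 50 + [10] * 50
print(count_teams_from_degrees(uneven))  # 24000
```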
Problem 4.4.2. Consider the following combinatorial game between two players, Builder and Painter.
The game starts with the empty graph on 400 vertices. In each round, Builder presents an edge uv
between two non-adjacent vertices u and v which has to be immediately colored red or blue by Painter.
Show that Builder can force Painter to create a monochromatic (that is, either red or blue) path on 100
vertices in 400 rounds.
Solution. Let $r_t$ and $b_t$ be the number of vertices in a longest red and, respectively, blue path after t
rounds of the game. Clearly, both $r_t$ and $b_t$ are nondecreasing functions of t. We will show that Builder
has a strategy that in two rounds increases the sum of $r_t$ and $b_t$ by 1; that is, for each $t \in \mathbb{N}$,
$r_{2t} + b_{2t} \ge t$. In particular, it will follow that $r_{400} + b_{400} \ge 200$ and so $\max\{r_{400}, b_{400}\} \ge 100$, as
required.
In order to see this, we will prove a slightly stronger claim and insist that the two paths (red and blue)
are vertex disjoint, that is, have no common vertices. Moreover, we will require that there are two
endpoints, one in each path, that are not adjacent. At time 0, we initiate the process by picking two
different vertices. The desired property as well as the desired lower bound trivially holds:
$r_0 + b_0 = 2 \ge 0$.
Suppose now that at time 2t we have two disjoint paths, a red path $R_t = (v_1, \ldots, v_{r_t})$ and a blue
path $B_t = (w_1, \ldots, w_{b_t})$. Moreover, the desired property (there is no edge between $v_{r_t}$ and $w_{b_t}$) and the
desired condition ($r_{2t} + b_{2t} \ge t$) are met. Builder presents edge $v_{r_t} w_{b_t}$. Without loss of generality, we
may assume that Painter paints it red. Builder now presents an edge from $w_{b_t}$ to a new vertex v. If
Painter paints it blue then we discard the edge $v_{r_t} w_{b_t}$ but keep $w_{b_t} v$ to extend the blue path. We get that
the desired lower bound holds, the two paths are disjoint, and the corresponding endpoints are not
adjacent. Suppose then that Painter paints it red. This time, we discard vertex $w_{b_t}$ from the blue path,
making it shorter. If, as a result, the blue path becomes empty, we choose any unused vertex as
initialization of the blue path. We get that
vertices using two colors, say, red and blue. Edge uv is colored red if the members corresponding to u
and v know each other; otherwise, uv is colored blue. Our goal is to show that no matter how the edges
of the complete graph on $2^{2t}$ vertices are colored, there exists a set of t vertices that induces a
monochromatic graph; that is, all edges of this induced graph are red or all of them are blue.
Start the process with selecting an arbitrary vertex v. Note that v has an odd number of neighbors,
namely $2^{2t} - 1$. As a result, either at least $2^{2t}/2 = 2^{2t-1}$ of them are adjacent to v via a red edge or at
least $2^{2t-1}$ of them are adjacent to v via a blue edge. If v is adjacent to more red edges than blue ones,
then we assign label R to v, remove v and all of its neighbors that are adjacent to v via blue edges. For
simplicity, if needed, we additionally and arbitrarily remove some vertices to keep the number of them
to be exactly $2^{2t-1}$. On the other hand, if the majority of neighbors of v are blue, then v gets label B
assigned. This time, we remove v and its neighbors that are adjacent to v via red edges, and remove
some additional vertices so that the number of vertices left is $2^{2t-1}$.
We repeat the process on the remaining subset of vertices until we exhaust all of them. Since the
number of vertices decreases by a factor of 2 each time, the process lasts 2t rounds. The last round,
round 2t, is slightly different as there is only one vertex left. The last vertex can get any label assigned,
say, B. It follows that there are at least t vertices with label R assigned or at least t vertices with label B.
Due to symmetry, we may assume without loss of generality that at least t vertices have label R
assigned. It is straightforward to see that all edges in the complete graph induced by these vertices are
red. The desired property is satisfied.
Finally, let us mention that this problem is a famous and an extremely difficult problem. The
corresponding numbers that we tried to bound in this problem are called the Ramsey numbers. In fact,
with slightly more work, one can replace $4^t$ by $\binom{2t}{t} \le 4^t/\sqrt{2t}$. However, perhaps surprisingly, it is not
Problem 4.5.1. Let $k \in \mathbb{N}$ and $N = N(k) := \lfloor 2^{k/2} \rfloor$. Show that it is possible to partition set
$X := [N] = \{1, 2, \ldots, N\}$ into two subsets A and B such that neither A nor B contains an arithmetic
progression of length k.
Solution. Let us first make an obvious observation. If $a_1, a_2, \ldots, a_k$ is an arithmetic progression, then so
is $a_k, a_{k-1}, \ldots, a_1$. Hence, it is enough to consider increasing arithmetic progressions.
Since the first two terms of an increasing arithmetic progression uniquely define it, the number of
increasing arithmetic progressions of length k in X is at most
$$\binom{N}{2} = \frac{N(N-1)}{2} < N^2/2 \le 2^{k-1}.$$
Consider then a random partition of X into two subsets A and B; that is, each element of X is
independently put into A with probability 1/2. Clearly, the probability that a given k-element sequence
is in A is equal to $(1/2)^k$ and the same holds for B. It follows that the expected number of arithmetic
sequences of length k entirely contained in one of the two sets is less than $2 \cdot 2^{k-1} \cdot (1/2)^k = 1$. By the
Solution. We will compute the expected number of desired orderings in a random tournament where for
each pair A, B of teams, team A wins against team B with probability 1/2, independently from all
other games. Let us fix any of the n! permutations of teams: $t_1, \ldots, t_n$. The probability that it satisfies
the desired property is equal to $(1/2)^{n-1}$. Hence, the expected number of desired orderings is equal to
$n!/2^{n-1}$ and, by the probabilistic method, there must exist a tournament for which there are at least
$n!/2^{n-1}$ such orderings. (Surprisingly, this trivial argument gives a result that is almost best
possible. It is known that in any tournament, the number of such orderings is $O(n^{3/2} n!/2^{n-1})$.)
Problem 4.5.3. Consider a graph with T triangles. Show that it is possible to color the edges of this
graph with two colors so that the number of monochromatic triangles is at most T /4.
Solution. Let us color the edges of this graph at random, uniformly and independently. The probability
that a given triangle is monochromatic is equal to $2 \cdot (1/2)^3 = 1/4$. Therefore, the expected number of
monochromatic triangles is T /4. It follows that there must exist a coloring for which the number of
monochromatic triangles is less than or equal to T /4. (Moreover, if T is not divisible by 4, then a strict
inequality holds.)
Problem 4.5.4. There are 100 people invited to the party; 450 pairs of people know each other. Show
that it is possible to select 10 people so that no two of them know each other.
Solution. Since the acquaintances between people invited to the party can be represented as a graph,
our problem can be reformulated in the language of graph theory. Our goal is to show that any graph G
on n = 100 vertices and m = 450 edges has an independent set of size 10.
For a given permutation π of the vertices, we put vertex v into set S = S(π) if no neighbor of v
follows it in the permutation. Clearly, S forms an independent set. Let π now be a random permutation
of the vertices of G taken with uniform distribution; that is, each permutation occurs with probability
1/n!. For a given vertex $v \in V$, let $d^+(v)$ be the number of neighbors of v that follow it in the
permutation. The random variable $d^+(v)$ attains each of the values 0, 1, …, deg(v) with probability
1/(deg(v) + 1). Indeed, this follows from the fact that the random permutation π induces a uniform
random permutation on the set of deg(v) + 1 vertices consisting of v and its neighbors (to see this one
can fix the positions of non-neighbors of v first, and then fixing one of deg(v) + 1 free positions for the
vertex v will yield desired values of $d^+(v)$ with uniform distribution). Therefore the expected number
of vertices v with $d^+(v) = 0$ is equal to $C := \sum_{v \in V} 1/(\deg(v) + 1)$, and so there exists a
permutation with at least C vertices of this type, which form an independent set.
The last part is an optimization problem. Notice that the average degree is equal to
$d := \sum_{v \in V} \deg(v)/n = 2m/n = 2 \cdot 450/100 = 9$. It follows from Jensen’s inequality (see Section 1.1)
applied to function 1/t that
$$C = \sum_{v \in V} \frac{1}{\deg(v)+1} = n \cdot \frac{\sum_{v \in V} \frac{1}{\deg(v)+1}}{n} \ge n \cdot \frac{1}{\frac{\sum_{v \in V} (\deg(v)+1)}{n}} = \frac{n}{\frac{nd+n}{n}} = \frac{n}{d+1} = \frac{100}{9+1} = 10.$$
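The permutation argument is constructive enough to experiment with. The sketch below (illustrative names; the random graph with 100 vertices and 450 edges is a made-up test instance, not data from the text) builds S(π) for random permutations and evaluates the quantity C on that instance.

```python
import random
from itertools import combinations

def independent_set_from_order(adj, order):
    """S(pi): the vertices none of whose neighbors follow them in the order."""
    position = {v: i for i, v in enumerate(order)}
    return {v for v in adj if all(position[w] < position[v] for w in adj[v])}

def expected_size(adj):
    """C = sum over v of 1 / (deg(v) + 1), the expected size of S(pi)."""
    return sum(1 / (len(adj[v]) + 1) for v in adj)

# Hypothetical instance: 100 vertices and 450 edges chosen at random.
rng = random.Random(0)
vertices = list(range(100))
adj = {v: set() for v in vertices}
for u, w in rng.sample(list(combinations(vertices, 2)), 450):
    adj[u].add(w)
    adj[w].add(u)

best = set()
for _ in range(200):
    order = vertices[:]
    rng.shuffle(order)
    candidate = independent_set_from_order(adj, order)
    if len(candidate) > len(best):
        best = candidate

print(expected_size(adj), len(best))
```

The proof guarantees that some permutation yields at least C vertices; trying a few hundred random permutations is a heuristic way of finding such a permutation, not a guaranteed one.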
Problem 4.6.1. Consider an urn that initially contains one white and one black ball. We repeatedly
perform the following process. In a given round, one ball is drawn randomly from the urn and its color
is observed. The ball is then returned to the urn, and an additional ball of the same color is added to the
urn. We repeat this selection process for 50 rounds so that the urn contains 52 balls. What number of
white balls is the most probable?
Solution. Let $p_{k,n}$ be the probability of seeing exactly k white balls in an urn having n balls in total.
One could write down the recursion for $p_{k,n}$ and then solve it but it seems that it is easier to perform
calculations for the first few rounds to make a natural conjecture that can then be proved by induction.
During the first round, we select a white ball with probability 1/2 so we end up with 2 white balls with
probability 1/2 and, otherwise, we stay with 1 white ball. It follows that $p_{1,3} = p_{2,3} = 1/2$. In order to
see 1 white ball after two rounds, we have to choose black balls during the two rounds and so
$p_{1,4} = (1/2) \cdot (2/3) = 1/3$. Similarly, to see 1 black ball, we have to select white balls twice and so
$p_{3,4} = (1/2) \cdot (2/3) = 1/3$. Consequently, $p_{2,4} = 1 - 1/3 - 1/3 = 1/3$ as well. We
conjecture that for any $n \ge 2$ and any $1 \le k \le n - 1$, $p_{k,n} = 1/(n - 1)$. We also see that
$p_{0,n} = p_{n,n} = 0$, as there is always at least one black and one white ball in the urn.
We prove the claim by induction. The base case ($n = 2$) is trivial: $p_{1,2} = 1$ (in fact, we already
checked it for $n = 3$ and $n = 4$). For the inductive step, suppose that $p_{k,n} = 1/(n - 1)$ for some
$n \ge 2$ and all $1 \le k \le n - 1$. Fix k such that $1 \le k \le n$. Our goal is to show that $p_{k,n+1} = 1/n$. Note
that in order to see k white balls at the end of some round we need to have k white balls in the previous
round and select a black ball, or have k − 1 white balls and select a white one. We get that if k > 1,
then
$$p_{k,n+1} = \frac{n-k}{n} \, p_{k,n} + \frac{k-1}{n} \, p_{k-1,n} = \frac{n-k}{n} \cdot \frac{1}{n-1} + \frac{k-1}{n} \cdot \frac{1}{n-1} = \frac{1}{n}$$
(for $k = n$ the first term vanishes, as $n - k = 0$ and $p_{n,n} = 0$). Similarly, if $k = 1$, then
$$p_{1,n+1} = \frac{n-1}{n} \, p_{1,n} + \frac{0}{n} \, p_{0,n} = \frac{n-1}{n} \cdot \frac{1}{n-1} = \frac{1}{n}.$$
The inductive step holds and the proof is finished.
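The recursion used in the inductive step can also be iterated exactly with rational arithmetic, which confirms the conjectured uniform distribution for the 50 rounds of the problem. A minimal sketch (the function name is illustrative):

```python
from fractions import Fraction

def white_ball_distribution(rounds):
    """Exact distribution of the number of white balls after the given number of rounds.

    The urn starts with one white and one black ball; each round a ball is
    drawn at random and returned together with one more ball of its color.
    """
    dist = {1: Fraction(1)}
    total = 2
    for _ in range(rounds):
        new = {}
        for k, p in dist.items():
            # Draw a white ball: k -> k + 1; draw a black ball: k stays.
            new[k + 1] = new.get(k + 1, 0) + p * Fraction(k, total)
            new[k] = new.get(k, 0) + p * Fraction(total - k, total)
        dist, total = new, total + 1
    return dist

dist = white_ball_distribution(50)
print(sorted(set(dist.values())), min(dist), max(dist))  # [Fraction(1, 51)] 1 51
```

Every number of white balls from 1 to 51 has probability 1/51 in this computation, so no single value is more probable than the others.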
Finally, let us mention that this problem is an easy, specific case of the famous Pólya urn model. This
endows the urn with a self-reinforcing property sometimes expressed as the rich get richer. In some
sense, the Pólya urn model is the “opposite” of the model of sampling without replacement, where
every time a particular value is observed, it is less likely to be observed again, whereas in the Pólya urn
model, an observed value is more likely to be observed again.
Problem 4.6.2. There are 65 participants competing in a ski jumping tournament. They take turns and
perform their jumps in a given sequence. We assume that no two jumpers obtain the same result and
that each final resulting order of participants is equally probable. At each given round of the
tournament, the person that has obtained the best result thus far is called a leader. Prove that the
probability that the leader changed exactly once during the whole tournament is greater than 1/16.
Solution. Let $\pi : [n] \to [n]$ be the final order/permutation of jumpers. In particular, $\pi(1)$ is the winner
of the tournament. Our assumption is that π is a random permutation; that is, for a given permutation $\pi_0$
of [n], we have that $\pi = \pi_0$ with probability 1/n!. Let $p_{i,n}$ be the probability that the leader changed
exactly i times during the tournament of n participants. Our goal is to show that $p_{1,65} > 1/16$.
Let $q_n(k)$ be the probability that the kth participant won the tournament of n ski jumpers and that the
leader changed exactly once during the whole event. If the first participant wins, then he is the leader
from the very beginning and no change in the leadership occurs. It follows that $q_n(1) = 0$ and so
$$p_{1,n} = \sum_{k=1}^{n} q_n(k) = \sum_{k=2}^{n} q_n(k).$$
There are many ways to generate random permutations of $[n]$. The following one will be very
convenient to compute $q_n(k)$. We start with 1 and then place 2 before 1 with probability 1/2;
otherwise, it will be placed after 1. After that we place 3 in a random place and move on to 4. Formally,
given a partial (random) permutation of elements $1, \ldots, k - 1$ (for some integer $k \ge 2$), we place $k$
uniformly at random in one of the $k$ possible places. This point of view has an important implication
for our problem. We immediately get that the $k$th participant becomes a leader (at least till the next
participant jumps) with probability $1/k$. It follows that
\[ q_n(k) = \frac{1}{2} \cdot \frac{2}{3} \cdots \frac{k-2}{k-1} \cdot \frac{1}{k} \cdot \frac{k}{k+1} \cdot \frac{k+1}{k+2} \cdots \frac{n-1}{n} = \frac{1}{n(k-1)}. \]
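As a result,
\[ p_{1,n} = \sum_{k=2}^{n} \frac{1}{n(k-1)} = \frac{1}{n} \sum_{k=2}^{n} \frac{1}{k-1} = \frac{1}{n} H_{n-1}, \]
where $H_m = \sum_{i=1}^{m} 1/i$ denotes the $m$th harmonic number.
One way to prove the desired lower bound for $p_{1,n}$ is to compare the harmonic series with another
divergent series where each denominator is replaced with the next largest power of two:
\[
\begin{aligned}
H_{2^i} &= \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \ldots + \frac{1}{2^i} \\
&> \frac{1}{1} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \ldots + \frac{1}{2^i} \\
&= 1 + \frac{1}{2} + 2 \cdot \frac{1}{4} + 4 \cdot \frac{1}{8} + \ldots + 2^{i-1} \cdot \frac{1}{2^i} = 1 + \frac{i}{2}.
\end{aligned}
\tag{8.11}
\]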
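It follows that
\[ p_{1,65} = \frac{1}{65} H_{2^6} > \frac{1}{65} \left( 1 + \frac{6}{2} \right) = \frac{4}{65}, \]
which is very close to the desired bound of 1/16 but, unfortunately, slightly smaller than that.
Fortunately, it is easy to improve the bound (8.11) for $H_{2^i}$ to, for example, $H_{2^i} > 1 + i/2 + 1/3 - 1/4$
for $i \ge 3$ (we simply keep the term 1/3 instead of replacing it with 1/4). We get that
\[ p_{1,65} = \frac{1}{65} H_{2^6} > \frac{1}{65} \left( 1 + \frac{6}{2} + \frac{1}{3} - \frac{1}{4} \right) = \frac{49}{780} > 0.0628 > \frac{1}{16}. \]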
We are done with this problem but let us make two additional comments. Let us first note that
another way to lower bound $H_n$ is to do the following. We know from Section 1.5 that for any $i \in \mathbb{N}$,
\[ \left( \frac{i+1}{i} \right)^i = \left( 1 + \frac{1}{i} \right)^i < e. \]
After taking the natural logarithm of both sides of this inequality, we get
\[ \ln(i + 1) - \ln(i) < \frac{1}{i} \]
and so
\[ H_n = \sum_{i=1}^{n} \frac{1}{i} > \sum_{i=1}^{n} \left( \ln(i + 1) - \ln(i) \right) = \ln(n + 1) - \ln(1) = \ln(n + 1). \]
This bound is quite good as one can show that $H_n < \ln(n) + 1$ (see solution to Problem 2.6.1). It
follows that
\[ p_{1,65} = \frac{1}{65} H_{64} > \frac{\ln(65)}{65} > 0.0642. \]
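The bounds above are easy to confirm numerically. The following is a small sketch in Julia (the helper name `harmonic` is ours); it computes $H_{64}$ as an exact rational number, so the comparison with 1/16 does not depend on floating-point rounding.

```julia
# H_n as an exact rational number (BigInt avoids overflow in the denominators)
harmonic(n) = sum(1 // big(i) for i in 1:n)

p_1_65 = harmonic(64) / 65            # exact value of p_{1,65} = H_64 / 65

println(Float64(p_1_65))              # roughly 0.073
println(p_1_65 > 1 // 16)             # the claim of the problem
# the logarithmic bounds ln(n + 1) < H_n < ln(n) + 1 for n = 64
println(log(65) < Float64(harmonic(64)) < log(64) + 1)
```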
Let us also say a few words about $p_{i,n}$ for some values of $i \ne 1$. Clearly, $p_{0,n} = (n - 1)!/n! = 1/n$ as there
are $(n - 1)!$ permutations that correspond to situations when the first participant is the winner. On the
other hand, for $i \in \mathbb{N}$ such that $2 \le i \le n - 1$, we can get a recursive formula by independently
considering cases when the last change of the leader occurred at round $k + 1$. We get
\[ p_{i,n} = \sum_{k=i}^{n-1} p_{i-1,k} \cdot \frac{1}{k+1} \cdot \frac{k+1}{k+2} \cdot \frac{k+2}{k+3} \cdots \frac{n-1}{n} = \frac{1}{n} \sum_{k=i}^{n-1} p_{i-1,k}. \]
Using this recursion, with computer support, we can easily find $p_{i,n}$ for some given parameters. Here is
a simple program written in Julia that does this.
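function probs65()
    # cache of the values p_{j,k} computed so far
    probs = Dict{Tuple{Int, Int}, Float64}()
    function prob(j, k)
        if !haskey(probs, (j, k))
            if j == 0
                # no change of the leader: the first participant wins
                probs[(j, k)] = 1/k
            else
                # recursion over the round of the last change of the leader
                probs[(j, k)] = 1/k * sum(prob(j-1, i) for i in j:(k-1))
            end
        end
        return probs[(j, k)]
    end
    [prob(j, 65) for j in 0:64]
end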
You can run it by writing probs65() in the Julia session.
The first few probabilities are $p_{0,65} \approx 0.0154$, $p_{1,65} \approx 0.073$, $p_{2,65} \approx 0.1606$, and
$p_{3,65} \approx 0.2204$. In particular, the most probable outcome is 3
changes of the leader in the tournament, and 4 changes are only slightly less probable.
If one is unsure about our computations, it is relatively easy to check them using a computer
simulation. Here is another piece of Julia code that simulates the tournament.
function sim_tournament()
    # the jump length of the first jumper, drawn uniformly from the [0,1) interval
    best_length = rand()
    best_changes = 0
    # simulate jumps of consecutive jumpers
    # and count the number of leader changes
    for i in 2:65
        jump_length = rand()
        if jump_length > best_length
            best_length = jump_length
            best_changes += 1
        end
    end
    return best_changes
end
function run_simulation()
    # simprobs65 will hold the counts of observed tournament results
    simprobs65 = zeros(65)
    sim_runs = 10_000_000
    for i in 1:sim_runs
        # we have to add 1 to sim_tournament() result as
        # in Julia arrays are 1-based and
        # a minimal number of leader changes in the tournament is 0
        simprobs65[sim_tournament() + 1] += 1
    end
    simprobs65 / sim_runs
end
simprobs65 = run_simulation()
The first few probabilities estimated by the simulation are $p_{0,65} \approx 0.0155$, $p_{1,65} \approx 0.0729$,
$p_{2,65} \approx 0.1604$, $p_{3,65} \approx 0.2207$, $p_{4,65} \approx 0.2138$, $p_{5,65} \approx 0.1569$, and $p_{6,65} \approx 0.0914$ (you might get
slightly different results because this time we use a random simulation to generate the outputs). They are
close to the exact values calculated earlier and so we are quite confident that no mistake was made.
Problem 4.6.3. Three random events satisfy the following three conditions: (a) their probabilities are all
equal to p for some p ∈ [0, 1], (b) they are pairwise independent, and (c) all of them cannot happen at
the same time. What is the maximum value that p may take?
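Solution. Denote the events by $A_i$ (for $i \in [3]$). It follows from condition (a) that there exists
$p \in [0, 1]$ such that $p = P(A_i)$ for all $i$. By condition (b) we get that $P(A_i \cap A_j) = p^2$ for $i \ne j$.
Finally, condition (c) implies that $P(A_1 \cap A_2 \cap A_3) = 0$. It follows immediately from the inclusion–exclusion
principle that
\[
\begin{aligned}
P(A_1 \cup A_2 \cup A_3) &= P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3) \\
&= 3p - 3p^2 + 0.
\end{aligned}
\]
On the other hand, by condition (c) the three pairwise intersections are pairwise disjoint, and so
\[ 3p^2 = P((A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)) \le P(A_1 \cup A_2 \cup A_3) = 3p - 3p^2, \]
which gives $p \le 1/2$. To see that $p = 1/2$ can be attained, consider the following example.
Suppose that we roll an 8-sided fair die. $A_1$ represents the event that an even number is rolled, $A_2$
represents the event that the number rolled is less than or equal to 4, and $A_3$ represents the event that a
number from the set {1, 3, 6, 8} is rolled. It is straightforward to check that all conditions are met.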
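The "straightforward" check for the die example can be delegated to the computer. The sketch below, written in Julia, encodes the three events as sets of outcomes and tests conditions (a), (b), and (c) directly; the names `prob`, `cond_a`, `cond_b`, and `cond_c` are ours.

```julia
outcomes = 1:8                                  # faces of the fair 8-sided die
A1 = Set(filter(iseven, outcomes))              # an even number is rolled
A2 = Set(filter(x -> x <= 4, outcomes))         # the number is at most 4
A3 = Set([1, 3, 6, 8])

prob(E) = length(E) // length(outcomes)         # uniform probability of an event
events = [A1, A2, A3]

# (a) all three probabilities are equal to 1/2
cond_a = all(prob(E) == 1 // 2 for E in events)
# (b) pairwise independence: P(E ∩ F) = P(E) P(F)
cond_b = all(prob(intersect(events[i], events[j])) == prob(events[i]) * prob(events[j])
             for i in 1:3 for j in (i + 1):3)
# (c) the three events never happen at the same time
cond_c = isempty(intersect(A1, A2, A3))

println((cond_a, cond_b, cond_c))
```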
Problem 4.7.1. Let P be a set of five points on a plane with the property that no three of them lie on the
same line. Denote by a(P ) the number of obtuse triangles whose vertices lie in P. Find the minimum
and the maximum value that a(P ) can attain over all possible sets P.
Solution. First note that 5 points, A, B, C, D, and E, create 10 triangles as one can select 3 points from
the set of 5 points in $\binom{5}{3} = 10$ ways and each choice yields a unique triangle. We will first prove that
a(P ) ≥ 2 for any configuration P of 5 points. This bound is best possible as shown in Figure 8.5.
FIGURE 8.5: Configuration of five points forming two obtuse triangles. We take |AB| = |BC| = |CD| and the following angles are
right ∢DEA, ∢ABC , ∢BCD, ∢CDA, ∢DAB, ∢BDE and ∢EAC . Out of the 10 triangles created by points A, B, C, D, and E only
triangles EDC and EAB are obtuse.
We will independently consider two cases. Let us first assume that some point, say point A, lies
inside the convex hull of the remaining four points. As the points are not collinear, it must lie inside a
triangle formed by some other three points, say, B, C, and D. Note that the sum of the three angles
∢BAC , ∢BAD, and ∢CAD is equal to 2π and all of them are less than π. As a result, at least two of
them are obtuse. These two angles form the two obtuse triangles, as required. Suppose now that the five
points form a convex pentagon. Note that the sum of the interior angles of the pentagon is equal to 3π.
As each individual angle is less than π, we get that at least two of them are obtuse and, as before, at
least two obtuse triangles are present.
On the other hand, the maximum number of obtuse triangles formed by five points is 10, that is, it is
possible that all triangles are obtuse. Such a configuration is shown in Figure 8.6.
FIGURE 8.6: Configuration of five points forming ten obtuse triangles.
Problem 4.7.2. Every point on a circle is painted with one of three colors. Prove that there are three
points on the circle that have the same color and form an isosceles triangle.
Solution. In order to warm up, let us consider a much simpler version of this problem when only two
colors are available. Our goal is the same, we want to show that there are three points on the circle that
have the same color and form an isosceles triangle. In order to see this one can take any 5 points that
form a regular pentagon inscribed in the circle. It is enough to concentrate on these 5 points as no
matter how they are colored, the desired triangle has to be created. Indeed, observe that at least three of
these vertices must be painted with the same color. It follows that two of them, say A and B, must be
adjacent. If the third vertex, say C, is adjacent to either A or B, then they form an isosceles triangle
(note that they form the two sides of the pentagon). On the other hand, if C is adjacent neither to A
nor to B, then |AC| = |BC| as they are diagonals of the pentagon. See Figure 8.7 for an illustration
of both cases.
FIGURE 8.7: Two possible scenarios of two-colorings with at least three gray points.
Let us now come back to the original problem with three colors. Our proof technique is the same as
before. However, instead of a regular pentagon we will use a 13-sided regular polygon inscribed in the
circle. Let us label the vertices of this polygon with numbers from 1 to 13, anticlockwise. Clearly, at
least 5 of the vertices must be painted with the same color. We will concentrate on them and disregard
the remaining vertices of the polygon (and an infinite number of other points from the circle). Let us
denote their unique labels as $a_1, a_2, \ldots, a_5 \in [13]$. We will say that two vertices are at distance k if the
number of vertices (from our polygon) that separate them is equal to k. Note that the smallest distance
is 0 (corresponding to the situation when the two vertices are adjacent) and the largest distance is 5.
It is easy to see that three vertices form an isosceles triangle if and only if the distance from one of
them to the remaining two is the same. We will do a case analysis to show that such a situation cannot be
avoided when 5 vertices need to be selected and so the desired triangle must exist. In order to reduce
the number of cases to consider, notice that regardless of which 5 vertices are selected (out of 13 vertices),
the minimum distance between them cannot be greater than 1 (since 5 + 2 ⋅ 5 = 15 > 13).
Case 1: The minimum distance is equal to 1. Due to symmetry, without loss of generality, we may
assume that 1 and 3 are selected. Because of the distance constraint, 2, 4, and 13 cannot be chosen. In
order to avoid an isosceles triangle, 5 and 12 are also forbidden. This means that either 6 or 11 has to
be chosen. Again, due to symmetry, without loss of generality we may assume that 6 is chosen. This
disallows 7 (because of the distance constraint), 9 (because of the triangle 6-1-9), 10 (because of 6-10-
1), and 11 (because of 6-11-3). The only vertex left is 8 and so we are not able to select 5 points in
total. See Figure 8.8.
Case 2: The minimum distance is equal to 0. Without loss of generality, we may assume that 1 and 2
are chosen which eliminates 3, 8 and 13. We consider the following sub-cases.
Case 2a: 4 or 12 is selected. Without loss of generality, we may assume that 4 is chosen which
disallows 6, 7, 9, and 11. We are left with three numbers, 5, 10, and 12, but no two of them can be
selected at the same time. Indeed, 5 and 10 cannot be together because of 1, 5 and 12 cannot be
together because of 2, and finally 10 and 12 cannot be together because of 1. See Figure 8.9.
Case 2b: Neither 4 nor 12 is selected but 5 or 11 is. Without loss of generality, we may assume that 5 is
chosen which disallows 9 and 10. As before, we are left with three numbers, 6, 7, and 11, but no two of
them can be selected at the same time. See Figure 8.10.
In problems like this one that require investigating a large number of cases, it is often useful to check
the proof using the computer. Here is a simple code written in the Julia language that verifies that for any
selection of 5 vertices from the 13-sided regular polygon, there always exists an isosceles triangle.
Running this code by calling test13gon() returns true, and so we have a computational
confirmation of our claim.
using Combinatorics
function isisosceles(points)
    # make sure points are in ascending order
    # (sort returns a new vector, so the result has to be assigned)
    points = sort(points)
    # calculate their distances
    d1 = points[2] - points[1]
    d2 = points[3] - points[2]
    d3 = 13 - d1 - d2
    # check if any of their distances is equal
    return d1 == d2 || d2 == d3 || d3 == d1
end
function test13gon()
    # pick all 5 element subsets from the set 1:13
    for p5 in combinations(1:13, 5)
        # check if any 3 element subset of the picked
        # 5 element subset forms an isosceles triangle
        if !any(isisosceles(p3) for p3 in combinations(p5, 3))
            return false
        end
    end
    return true
end
Additionally, we might search for a coloring of the 13-gon that yields a minimum number of
monochromatic isosceles triangles. Below is an additional function, also written in Julia, that calculates
it.
function isosceles_count(i)
    # convert number to its representation in base 3
    c = string(i, base=3, pad=13)
    # count monochromatic triangles that are isosceles
    count(t -> c[t[1]] == c[t[2]] == c[t[3]] && isisosceles(t),
          combinations(1:13, 3)), c
end
function best13gon()
    # initialize the sequence with monochromatic colorings
    # then traverse all non monochromatic colorings
    # we may then assume (without loss of generality)
    # that they start with digits 0 and 2
    mapreduce(isosceles_count, min, 2*3^11:3^12-1, init=isosceles_count(0))
end
Running this code by calling best13gon() returns (2, "0200011022112"). We also check that
isosceles_count(0), which corresponds to a monochromatic coloring, produces a larger number. It implies
that it is almost possible to avoid isosceles triangles when coloring the 13-gon with three colors. The
returned coloring creates only two monochromatic isosceles triangles (see Figure 8.12). Let us note
that it is only one example of such a coloring; in other words, this example is not unique.
FIGURE 8.12: Optimal coloring of the 13-gon. Only two monochromatic isosceles triangles are created: 3-4-5 and 1-3-5.
Let us also notice that the 13-sided regular polygon is the smallest polygon that can be used in our
method. Indeed, one can color vertices of the 12-sided regular polygon so that it contains no
monochromatic isosceles triangle. For example, consider the following coloring: vertices {1, 2, 4, 5} are colored red,
{8, 9, 11, 12} are colored green, and {3, 6, 7, 10} are colored blue. Similar patterns can be found for
this is a sufficient condition but not a necessary one. As a result, we get a natural connection between
our problem and the famous Van der Waerden numbers.
Let us start with a striking observation made by Van der Waerden. For any given natural numbers r
and k, there is some number n = n(r, k) such that if the integers from [n] are colored, each with one of
r available colors, then there are at least k integers, all of the same color, which form an arithmetic
progression. The least such n is the van der Waerden number W (r, k).
In our problem, we do not actually need to know W (r, k); all we need is to make sure it exists. It is
guaranteed by the original observation of Van der Waerden and can be proved by induction. Indeed,
despite the fact that we have such powerful computers these days, only 6 nontrivial numbers are known:
W (2, 3) = 9 (easy exercise), W (2, 4) = 35 (Chvátal (1970)), W (2, 5) = 178 (Stevens and Shantaram
(1978)), W (2, 6) = 1,132 (Kouril and Paul (2008)), W (3, 3) = 27 (Chvátal (1970)), and
W (4, 3) = 76 (Beeler and O’Neil (1979)).
Let us come back to the original problem. We start with a regular n-gon with n = W (3, 3) = 27,
vertices of which are labelled with numbers from [n]. Regardless of how we color the vertices, a
monochromatic arithmetic sequence of length 3 must be created and the corresponding vertices form an
isosceles triangle. This argument is not optimal (a 27-gon is used instead of a 13-gon) but it trivially
generalizes to any number of colors. For an arbitrary number r of colors, one needs to start with
n = W (r, 3) and the same argument follows.
Problem 4.7.3. Take a set of n ≥ 2 points with the property that no three of them lie on the same line.
We paint all line segments formed by those points in such a way that no two line segments that have a
common vertex have the same color. Find the minimum number of colors for which such coloring
exists.
Solution. Since no three points lie on the same line, the number of line segments is equal to $f(n)$, the number of
two element subsets of an n-element set; $f(n) = \binom{n}{2} = n(n - 1)/2$. Let $g(n)$ be the maximum
number of disjoint two element subsets of such a set; $g(n) = \lfloor n/2 \rfloor$ ($g(n) = n/2$ if n is even and
$g(n) = (n - 1)/2$ if n is odd). Clearly, if the two line segments created by points a, b and,
respectively, points c, d are colored with the same color, then all of these points are different. It follows
that the number of line segments that are in the same color is at most $g(n)$. Combining the
two observations we get that the minimum number of colors for which the desired coloring exists is
then at least
\[ f(n)/g(n) = \begin{cases} n - 1 & \text{if } n \text{ is even} \\ n & \text{otherwise.} \end{cases} \]
In order to see that this bound is achievable, let us consider the following simple construction.
Suppose first that n is odd. We start with n points on the circle, equally spaced (that is, these points are
vertices of a regular n-gon). Clearly, there are n directions defined by the line segments formed by those
points, and in each direction there are exactly (n − 1)/2 line segments. All line segments associated
with the same direction receive the same color. See Figure 8.13 (left) for an example with n = 5. Since
g(n) = (n − 1)/2 line segments have the same direction, only f (n)/g(n) = n colors are used.
For an even value of n, we simply use the previous construction with n − 1 points that can be dealt
with n − 1 colors. Note that there are n − 1 colors (or, equivalently, directions) but each vertex is part
of n − 2 line segments. Observe that, as a result, each vertex is missing one unique color. We add the
n-th point in the center of the circle and connect it to the n − 1 points using the missing color. See
Figure 8.13 (right) for an example with n = 6.
FIGURE 8.13: Coloring the line segments for n = 5 (left) and n = 6 (right).
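The construction can be turned into a short program. The sketch below is one possible encoding in Julia, under our own labelling assumptions: the points on the circle are numbered 0, 1, …, m − 1 (with m = n for odd n and m = n − 1 for even n), the direction of the segment {a, b} is identified with (a + b) mod m, and for even n the centre gets the label m. The function names `segment_colors` and `is_proper` are ours.

```julia
# Colour every segment {a, b}; the result maps (a, b) with a < b to a colour.
function segment_colors(n)
    m = isodd(n) ? n : n - 1              # number of points placed on the circle
    colors = Dict{Tuple{Int, Int}, Int}()
    for a in 0:m-1, b in a+1:m-1
        colors[(a, b)] = mod(a + b, m)    # parallel chords share a + b (mod m)
    end
    if iseven(n)
        # the centre has label m; vertex a is missing exactly colour 2a (mod m)
        for a in 0:m-1
            colors[(a, m)] = mod(2a, m)
        end
    end
    return colors
end

# A colouring is proper if no vertex sees the same colour twice.
function is_proper(colors)
    seen = Set{Tuple{Int, Int}}()         # (vertex, colour) pairs used so far
    for ((a, b), c) in colors
        ((a, c) in seen || (b, c) in seen) && return false
        push!(seen, (a, c), (b, c))
    end
    return true
end

for n in (5, 6)
    cols = segment_colors(n)
    println((n, is_proper(cols), length(unique(values(cols)))))
end
```

For n = 5 and n = 6 the program reports a proper coloring with 5 colors in both cases, which matches the lower bound f (n)/g(n).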
Problem 4.8.1. Twenty-five boys and 25 girls sit around a table. Prove that it is always possible to find a
person both of whose neighbors are girls.
Solution. Let us label seats with numbers from the set [50]. Consider two subsets of people, those
sitting in even and odd positions at the table. In one of those sets there must be at least 13 girls.
However, this implies that there are two girls that are separated by one person. That person is the one
that we are looking for.
Problem 4.8.2. A person takes at least one aspirin a day for 30 days. Show that if the person takes 45
aspirin altogether, then in some sequence of consecutive days that person takes exactly 14 aspirin.
Solution. For $i \in [30]$, let $a_i$ be the cumulative number of aspirins taken up to and including day $i$. We
know that $a_1 > 0$ and for all $i$, $a_{i+1} > a_i$ (since a person takes at least one aspirin a day). Moreover,
$a_{30} = 45$ (the person takes 45 aspirin altogether). Now, for $i \in [30]$, let $b_i = a_i + 14$. The properties
that we determined for $a_i$’s imply that $b_1 > 14$, $b_{30} = 59$ and for all $i$, $b_{i+1} > b_i$. Putting these two
sequences together, we get 60 numbers in total, all of them are positive and smaller than 60. By the
pigeonhole principle we get that there are two numbers, $k$ and $\ell$, for which $a_k = b_\ell = a_\ell + 14$. It
follows that $a_k - a_\ell = 14$, so the person takes exactly 14 aspirin between day $\ell + 1$ and day $k$.
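The proof is constructive enough to be turned into a search procedure. Here is a minimal sketch in Julia; the function name `find_window` and the sample schedule `daily` are hypothetical and serve only as an illustration. The code looks for indices with $a_k = a_\ell + 14$, exactly as in the pigeonhole argument.

```julia
# daily[i] = number of aspirins taken on day i (each entry is at least 1)
function find_window(daily; target = 14)
    a = cumsum(daily)                                   # a[i] as in the solution
    day_of = Dict(total => i for (i, total) in enumerate(a))
    for (l, total) in enumerate(a)
        k = get(day_of, total + target, nothing)        # look for a_k = a_l + target
        k === nothing || return (l + 1, k)              # days l+1, ..., k
    end
    return nothing
end

daily = [fill(1, 15); fill(2, 15)]     # a hypothetical schedule: 30 days, 45 aspirins
first_day, last_day = find_window(daily)
println((first_day, last_day, sum(daily[first_day:last_day])))
```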
Problem 4.8.3. Prove that, if we take n + 1 numbers from the set {1, 2, …, 2n}, then in this subset there
exist two numbers such that one divides the other.
Solution. Each number from the set [2n] can be uniquely represented in the form $2^p q$, where $p$
is a non-negative integer and $q$ is odd; let us call $q$ the type of the number.
Clearly, two numbers of the same type have the desired property, that is, one divides the other. So our
goal is to show that regardless of which n + 1 numbers are selected from [2n], there will be two numbers
of the same type. Since the number of types is equal to n (there are n odd numbers in [2n]), this follows
immediately from the pigeonhole principle. Finally, let us mention that this result is sharp in the sense
that one can select n numbers from [2n] (namely, n + 1, n + 2, …, 2n) and avoid this property.
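For small values of n the statement can be confirmed by exhaustive search. The following Julia sketch (all function names are ours) enumerates every (n + 1)-element subset of [2n] with a bitmask and looks for a divisible pair. It also checks that the n-element set {n + 1, …, 2n} contains no such pair, since twice its smallest element already exceeds 2n.

```julia
# odd part q of m = 2^p * q (the "type" of m)
odd_part(m) = m >> trailing_zeros(m)

# does s contain two different numbers such that one divides the other?
has_divisible_pair(s) = any(a != b && b % a == 0 for a in s, b in s)

# check all (n + 1)-element subsets of 1:2n, encoded as bitmasks
function check(n)
    for mask in 0:(2^(2n) - 1)
        count_ones(mask) == n + 1 || continue
        s = [i for i in 1:2n if (mask >> (i - 1)) & 1 == 1]
        has_divisible_pair(s) || return false
    end
    return true
end

println(all(check(n) for n in 1:6))
println(all(!has_divisible_pair(collect(n+1:2n)) for n in 1:20))
```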
Problem 4.9.1. Consider the Sicherman dice problem in which the restriction that each side is labelled
with a positive integer is relaxed to any integer, not necessarily positive. Can you design more pairs of
dice?
Solution. Notice that adding 1 to all sides on one die and subtracting 1 from all sides on the other
die does not affect the distribution of their sum. So there are infinitely many solutions, for example,
((0, 1, 1, 2, 2, 3), (2, 4, 5, 6, 7, 9)) or ((2, 3, 4, 5, 6, 7), (0, 1, 2, 3, 4, 5)).
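To double-check the two example pairs, one can compare the distribution of their sums with that of two standard dice. A small Julia sketch follows; the helper `sum_distribution` is our own name.

```julia
# number of ways to obtain each sum when rolling dice d1 and d2
function sum_distribution(d1, d2)
    counts = Dict{Int, Int}()
    for a in d1, b in d2
        counts[a + b] = get(counts, a + b, 0) + 1
    end
    return counts
end

standard = sum_distribution(1:6, 1:6)
dice_pairs = [((0, 1, 1, 2, 2, 3), (2, 4, 5, 6, 7, 9)),
              ((2, 3, 4, 5, 6, 7), (0, 1, 2, 3, 4, 5))]

println(all(sum_distribution(d1, d2) == standard for (d1, d2) in dice_pairs))
```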
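Problem 4.9.2. Solve the recurrence $x_{n+1} = x_n + 2x_{n-1}$ for $n \in \mathbb{N}$, with $x_0 = 0$ and $x_1 = 1$. Verify
your solution using induction.
Solution. In order to find the corresponding generating function we follow the same strategy as for the
Fibonacci sequence (see the example above). We get that
\[ \frac{G(x) - x}{x} = G(x) + 2xG(x). \]
Therefore
\[ G(x) = \frac{1}{3} \left( \frac{1}{-1 - x} + \frac{1}{1 - 2x} \right) = \frac{1}{3} \sum_{n=0}^{\infty} \left( -(-1)^n x^n + 2^n x^n \right), \]
and so $x_n = (2^n - (-1)^n)/3$. To verify this formula by induction, note that it holds for $n = 0$ and $n = 1$.
For the inductive step, we get that
\[
\begin{aligned}
x_{n+1} &= x_n + 2x_{n-1} \\
&= (2^n - (-1)^n)/3 + 2(2^{n-1} - (-1)^{n-1})/3 \\
&= (2^{n+1} - (-1)^{n+1})/3.
\end{aligned}
\]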
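As a sanity check independent of the induction, the closed form $x_n = (2^n - (-1)^n)/3$ can be compared with the values produced directly by the recurrence. A minimal Julia sketch (function names are ours):

```julia
# closed form x_n = (2^n - (-1)^n) / 3, computed with BigInt to avoid overflow
closed_form(n) = (big(2)^n - (-1)^n) ÷ 3

# the first nmax + 1 terms x_0, ..., x_nmax obtained from the recurrence
function by_recurrence(nmax)
    x = zeros(BigInt, nmax + 1)       # x[i + 1] stores x_i
    x[2] = 1                          # x_0 = 0, x_1 = 1
    for n in 1:(nmax - 1)
        x[n + 2] = x[n + 1] + 2 * x[n]
    end
    return x
end

x = by_recurrence(50)
println(all(x[i + 1] == closed_form(i) for i in 0:50))
```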
Problem 4.9.3. Your friend wants to play the following game with you. You toss three 6-sided fair dice
and calculate the sum of outcomes. For every game you have to pay $1. If the sum is 10 or 11 you get
$4, otherwise you get nothing. Is this game fair?
Solution. We will compute the probability of getting 10 and 11 by investigating \(f(x)^3\), where \(f(x)\) is
the generating function for a fair die we have introduced in the solution. We note that
\[ f(x)^3 = \left( \sum_{i=1}^{6} x^i \right)^3 = \sum_{i=3}^{10} a_i \left( x^i + x^{21-i} \right), \]
where \((a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_{10}) = (1, 3, 6, 10, 15, 21, 25, 27)\). It follows that the probability of
winning (that is, earning $4 − $1 = $3) is equal to \(p := 2a_{10}/6^3\). On the other hand, the probability of
losing is equal to \(q := 2 \sum_{i=3}^{9} a_i / 6^3\). Note that
\[ \frac{4a_{10}}{6^3} = \frac{\sum_{i=3}^{10} a_i}{6^3} = \frac{\sum_{i=3}^{9} a_i}{6^3} + \frac{a_{10}}{6^3}, \]
and so \(q + p = 4p\) or \(q = 3p\). The expected number of dollars won in a single game is then
\[ 3p - q = 0, \]
so the game is fair.
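The coefficients of the cube of the generating function are easy to obtain by multiplying polynomials numerically. The sketch below, in Julia, represents a polynomial by its vector of coefficients (lowest degree first); the helper `polymul` and the variable names are ours.

```julia
# multiply two polynomials given as coefficient vectors (index i <-> x^(i-1))
function polymul(u, v)
    r = zeros(Int, length(u) + length(v) - 1)
    for (i, a) in enumerate(u), (j, b) in enumerate(v)
        r[i + j - 1] += a * b
    end
    return r
end

f = [0; ones(Int, 6)]                 # x + x^2 + ... + x^6
f3 = polymul(polymul(f, f), f)        # f3[s + 1] = number of ways to roll the sum s

println(f3[4:11])                     # the coefficients a_3, ..., a_10
p_win = (f3[11] + f3[12]) // 6^3      # the sum is 10 or 11
p_lose = 1 - p_win
println(3 * p_win - p_lose)           # expected number of dollars won per game
```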
\[ \gcd(a + b, a^2 + ab + b^2) = \gcd(a + b, (a + b)^2 - ab) = 1. \]
Since $a + b \mid (a + b)^2$, it follows that
\[ \gcd(a + b, (a + b)^2 - ab) = \gcd(a + b, ab). \]
Consider any prime p that divides ab. Since a/b is in lowest terms, p cannot divide both a and b. By
symmetry, without loss of generality, we may assume that it divides a but not b. It follows that p does
not divide a + b and so gcd(a + b, ab) = 1, and the proof is finished.
Problem 5.1.2. You are given two natural numbers a and b. Prove that if $a + b \mid a^2$, then $a + b \mid b^2$.
Solution. Let a and b be any two natural numbers such that $a + b \mid a^2$. Let $c = \gcd(a, b)$ and set
$a' := a/c$, $b' := b/c$ so that $\gcd(a', b') = 1$. Note that our assumption $a + b \mid a^2$ can be rewritten as
Problem 5.1.3. Consider a set A of four digit numbers whose decimal representation uses precisely two
digits; moreover, both of them are non-zero. Let f : A → A be the function such that f (a) flips the
digits of a ∈ A (for example, f (1333) = 3111). Find n > f (n) for which gcd(n, f (n)) is as large as
possible.
Solution. Let us first note that gcd(8484, 4848) = 1212. We will show that this is the maximum
possible value of gcd(n, f (n)) and so 8484 is the value of n we are looking for. In fact, we will prove
that it is the unique value of n such that n > f (n) that maximizes gcd(n, f (n)).
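Since the set A is small, the claim can be confirmed by brute force before going through the proof. The following Julia sketch enumerates all four digit numbers that use exactly two distinct non-zero digits; the names `flip_digits` and `in_A` are ours.

```julia
# swap the two digits used by n, e.g. 1333 -> 3111
function flip_digits(n)
    ds = digits(n)                      # least significant digit first
    a, b = unique(ds)                   # n uses exactly two distinct digits
    swapped = [d == a ? b : a for d in ds]
    return evalpoly(10, swapped)        # rebuild the number from its digits
end

# membership in the set A: exactly two distinct digits, both non-zero
in_A(n) = (ds = digits(n); length(unique(ds)) == 2 && !(0 in ds))

candidates = [n for n in 1000:9999 if in_A(n) && n > flip_digits(n)]
gcds = [gcd(n, flip_digits(n)) for n in candidates]
best_gcd, idx = findmax(gcds)
println((candidates[idx], best_gcd, count(==(best_gcd), gcds)))
```

The last component of the printed tuple counts how many candidates attain the maximum, which gives a direct check of the uniqueness claim.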
Suppose that n > f (n) is such that k = gcd(n, f (n)) ≥ 1212. Note that
k = gcd(n, f (n)) = gcd(n, n + f (n)) and so, in particular, k divides n + f (n). Suppose that the
representation of n uses digits a and b, 1 ≤ a, b ≤ 9 and a ≠ b. It is easy to see that the property of the
function f implies that
Therefore, n must have the form abab, as numbers of the form baaa, abaa, aaba, aaab and aabb are
not divisible by 101. In order to see this, note that
baaa: 1000b + 111a = 10(a − b) + 101(10b + a) and 0 < 10 ≤ |10(a − b)| ≤ 80 < 101,
abaa: 1011a + 100b = (a − b) + 101(10a + b) and 0 < 1 ≤ |a − b| ≤ 8 < 101,
aaba: 1101a + 10b = 10(b − a) + 101 ⋅ 11a and 0 < 10 ≤ |10(b − a)| ≤ 80 < 101,
aaab: 1110a + b = (b − a) + 101 ⋅ 11a and 0 < 1 ≤ |b − a| ≤ 8 < 101,
aabb : 1100a + 11b = 11(b − a) + 101 ⋅ 11a and 0 < 11 ≤ |11(b − a)| ≤ 88 < 101.
The remaining case that is left to deal with is when n is of the form abab which can be written as
101(10a + b). The corresponding value of f (n) (that is of the form baba) can be written as
101(10b + a) and so
a + b is not divisible by 9. Let $c := \gcd(a, b)$, $a' = a/c$, and $b' = b/c$. We get that
$\gcd(9, a' + b') \le 3$. As a result, the maximum value gcd(n, f (n)) can attain is less than or equal to
Actually, one can show that this is the unique value of n such that n > f (n) that maximizes
gcd(n, f (n)). Indeed, in order to achieve the maximum value of 1212, we must have that c = 4 and
follows that $a' + b' = 3$, or equivalently, that a + b = 12. Since 1 ≤ b < a ≤ 9 (as n > f (n)), 4 | a,
Combining all three cases together we conclude that the only solution is p = 3.
Problem 5.2.2. You are given three consecutive natural numbers (say, a, a + 1, and a + 2) such that the
middle one is a cube (that is, $a + 1 = \ell^3$ for some $\ell \in \mathbb{N}$). Prove that their product is divisible by 504.
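Before proving the statement, it may be reassuring to test it numerically. A short Julia sketch (the name `triple_product` is ours) checks the divisibility for the first thousand cubes; note that 504 = 8 ⋅ 9 ⋅ 7, which suggests handling the three factors separately.

```julia
# product of the three consecutive numbers l^3 - 1, l^3, l^3 + 1
# (BigInt is used because l^9 quickly exceeds the range of Int64)
triple_product(l) = (big(l)^3 - 1) * big(l)^3 * (big(l)^3 + 1)

println(8 * 9 * 7)
println(all(triple_product(l) % 504 == 0 for l in 2:1000))
```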
$8 \mid k$. Suppose then that $\ell$ is odd. It follows that $\ell^3$ is also odd and so $\ell^3 - 1$ and $\ell^3 + 1$ are two
consecutive even numbers. One of them must be divisible by 4 and so $8 \mid (\ell^3 - 1)(\ell^3 + 1)$. We
Case: $7 \mid k$. As before, it is easy to check that $\ell^3$ is congruent to 0, 1, or 6 modulo 7. One of the three
Problem 5.2.3. Prove that for any natural n ∈ N that is not divisible by 10 there exists k ∈ N such that
$n^k$ has in its decimal representation the same digit at the first and the last position.
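It is instructive to look for such exponents experimentally. The Julia sketch below searches for the smallest suitable k by inspecting the digits of consecutive powers; the function name `first_last_match` and the search limit `kmax` are our own choices, and the search is only an illustration, not a proof.

```julia
# smallest k <= kmax such that n^k starts and ends with the same digit
function first_last_match(n; kmax = 1000)
    n % 10 == 0 && error("n must not be divisible by 10")
    power = big(1)
    for k in 1:kmax
        power *= n
        ds = digits(power)              # least significant digit first
        first(ds) == last(ds) && return k
    end
    return nothing                      # nothing found within the search limit
end

println(first_last_match(12))           # 12^5 = 248832
println(first_last_match(25))           # 25^12 = 59604644775390625
```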
Solution. The property trivially holds for $n < 10$ as $n^1 = n$ has only one digit. In order to deal with
$n > 10$, let us first show the following useful property:
\[ n^{4k+1} \equiv n \pmod{10} \quad \text{for all } k \in \mathbb{N}. \]
As a result, we will be able to restrict ourselves to the subsequence $(n^{4k+1})_{k \in \mathbb{N}}$, that has the property
that all terms have the same last digit, and concentrate exclusively on the first digit.
Let
\[ \ell := n^{4k+1} - n = n(n^k - 1)(n^k + 1)((n^k)^2 + 1). \]
Our goal is to show that $10 \mid \ell$. We will show independently that $2 \mid \ell$ and that $5 \mid \ell$. The first task
is easy: it is clear that $2 \mid n(n^k - 1)$ and so $2 \mid \ell$. Divisibility by 5 requires considering a few cases.
For each case, we will show that 5 divides some term in the above representation of $\ell$. If
$n^k \equiv 0 \pmod{5}$, then $5 \mid n$. If $n^k \equiv 1 \pmod{5}$, then $5 \mid (n^k - 1)$. If $n^k \equiv 4$
Let us now consider numbers of the form $n^{4i}$, for $i \in [91]$, and concentrate on their first two digits.
Clearly, there are 90 possibilities for the first two digits, from 10 to 99. Hence, by the pigeonhole
principle, there exist $1 \le i_1 < i_2 \le 91$ such that $n^{4i_1}$ and $n^{4i_2}$ have the same two first digits; that is,
$n^{4i_1} = (d + r_1)10^{p_1}$ and $n^{4i_2} = (d + r_2)10^{p_2}$, for some integer $d$ such that $10 \le d \le 99$, some real
numbers $r_1, r_2$ such that $0 \le r_1, r_2 < 1$, and some integers $0 \le p_1 < p_2$. In fact $r_1, r_2 > 0$ as $n$ is not
divisible by 10 (which will be important soon). Recall that $i_1 < i_2$, $p_1 < p_2$, and note that
\[ n^{4(i_2 - i_1)} = \frac{(d + r_2)10^{p_2}}{(d + r_1)10^{p_1}} = \left( 1 + \frac{r_2 - r_1}{d + r_1} \right) 10^{p_2 - p_1}. \tag{8.12} \]
where $t = i_2 - i_1 \in \mathbb{N}$ and $s = p_2 - p_1 \in \mathbb{N}$. Let us stress it again that the assumption that $n$ is not
subsequence $(n^{4k+1})_{k \in \mathbb{N}}$. It will be convenient to represent each term in its (normalized) scientific
notation which is a standard way of expressing numbers that are too large or too small to be
conveniently written in decimal form. All terms can be written in the form $m \cdot 10^n$, where the exponent
n (called the order of magnitude) is an integer, and the coefficient m (called the significand or mantissa)
is a real number with absolute value at least one but less than ten. The first digit of the term is equal to
the floor of the corresponding mantissa. We start with the original number n (the term corresponding to
k = 0) that has the last digit c ≠ 0. To get the next term, we multiply the current term by $n^{4t} = x \cdot 10^s$.
Because of the property (8.12), the mantissa does not change much after that (unless, of course, it
“switches” from a value from the interval [1, 2) to a value from the interval [9, 10), or vice versa). As a
result, the first digit never “skips” any digit. Indeed, if 1 < x < 1.1, then the mantissa keeps
geometrically increasing (until it eventually “switches”). More importantly, since 9 ⋅ x < 9.9 < 10, it
never “skips” any digit (the extreme case is when digit 8 changes to 9). Similarly, if 0.9 < x < 1, then
the mantissa keeps geometrically decreasing (again, until it eventually “switches”). Since 10 ⋅ x > 9, it
never “skips” any digit (this time the extreme case is when digit 1 changes to 9). Hence, for some
k ∈ N, the floor of the mantissa is equal to c and so the first and the last digits of the term $n^{4tk+1}$ are
the same.
this and the fact that $p \mid a^2 + b^2$, we get that p also divides $(a + b)^2 - (a^2 + b^2) = 2ab$. As $p > 2$, it
and, as a consequence, $p^2 \mid a^2 + b^2$.
Problem 5.3.2. You are given four integers a, b, c , and d. Prove that if a − c | ab + cd , then
a − c | ad + bc.
(ab + cd) − (ad + bc) − (ab + cd) = −(ad + bc). We conclude that a − c | ad + bc.
Problem 5.3.3. Consider any natural number n ≥ 2. Prove that $n^{12} + 64$ has at least four different non-trivial
divisors, that is, divisors $a, b, c, d$ such that $1 < a < b < c < d < n^{12} + 64$.
\[ n^{12} + 64 = (n^6 - 4n^3 + 8)(n^6 + 4n^3 + 8) \]
and that
\[ n^6 + 4n^3 + 8 = (n^2 - 2n + 2)(n^4 + 2n^3 + 2n^2 + 4n + 4). \]
It is obvious that
\[ n^2 - 2n + 2 < n^2 + 2n + 2 \]
and that
\[ n^4 - 2n^3 + 2n^2 - 4n + 4 < n^4 + 2n^3 + 2n^2 + 4n + 4. \]
So, in order to finish the proof, it is enough to show that for any n ≥ 3
\[ n^2 + 2n + 2 < n^4 - 2n^3 + 2n^2 - 4n + 4, \]
or equivalently that
\[ n^4 - 2n^3 + n^2 - 6n + 2 > 0. \]
The desired inequality thus holds, as for any n ≥ 3, we have that
\[
\begin{aligned}
n^4 - 2n^3 + n^2 - 6n + 2 &= n^3(n - 3) + n(n^2 + n - 6) + 2 \\
&\ge 0 + n(9 + 3 - 6) + 2 \ge 6n + 2 \ge 20 > 0.
\end{aligned}
\]
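The factorizations used above can be verified by multiplying the polynomials out. The Julia sketch below stores each polynomial as a vector of coefficients (lowest degree first). The pairing of $n^2 + 2n + 2$ with $n^4 - 2n^3 + 2n^2 - 4n + 4$ as the two factors of $n^6 - 4n^3 + 8$ is inferred from the inequalities above, and the names `polymul`, `fa`, `fb`, `fc`, and `fd` are ours.

```julia
# multiply two polynomials given as coefficient vectors (index i <-> n^(i-1))
function polymul(u, v)
    r = zeros(Int, length(u) + length(v) - 1)
    for (i, a) in enumerate(u), (j, b) in enumerate(v)
        r[i + j - 1] += a * b
    end
    return r
end

fa = [2, -2, 1]            # n^2 - 2n + 2
fb = [2, 2, 1]             # n^2 + 2n + 2
fc = [4, -4, 2, -2, 1]     # n^4 - 2n^3 + 2n^2 - 4n + 4
fd = [4, 4, 2, 2, 1]       # n^4 + 2n^3 + 2n^2 + 4n + 4

println(polymul(fa, fd))   # coefficients of n^6 + 4n^3 + 8
println(polymul(fb, fc))   # coefficients of n^6 - 4n^3 + 8
println(polymul(polymul(fa, fd), polymul(fb, fc)) == [64; zeros(Int, 11); 1])

# the four factors are strictly increasing for n >= 3
println(all(evalpoly(n, fa) < evalpoly(n, fb) < evalpoly(n, fc) < evalpoly(n, fd)
            for n in 3:100))
```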
One first needs to notice that $n^6 - 4n^3 + 8$ has no integer roots (see Section 3.4 for a discussion on the
rational root theorem). Therefore, the next step is to try to find a factorization of the form
\[ (n^2 + a_1 n + a_0)(n^4 + b_3 n^3 + b_2 n^2 + b_1 n + b_0). \]
By comparing the coefficients associated with a given power of n, we get the following system of
equations:
\[
\begin{aligned}
b_3 + a_1 &= 0 \\
a_1 b_3 + b_2 + a_0 &= 0 \\
a_0 b_3 + a_1 b_2 + b_1 &= -4 \\
a_0 b_2 + a_1 b_1 + b_0 &= 0 \\
a_0 b_1 + a_1 b_0 &= 0 \\
a_0 b_0 &= 8.
\end{aligned}
\]
One can consecutively remove $b_i$’s from this system to get a system of only two equations in $a_1$ and $a_0$.
Then, as we know that $a_0 \mid 8$, it is enough to check 8 possible values of $a_0$ (±1, ±2, ±4, ±8) to find
In order to see that $a^{64} - a^4$ is divisible by 5, let us consider two cases. If $5 \mid a$, then we are
immediately done. Suppose then that a is not divisible by 5. It follows that $n = a^{15}$ is also not divisible
by 5 and so, since 5 is a prime, we get from Fermat’s little theorem that p = 5 divides
$n^{p-1} - 1 = (a^{15})^4 - 1$. Exactly the same argument shows that either $7 \mid a$ or $7 \mid (a^{10})^6 - 1$.
Problem 5.4.2. Prove that for any odd integer n, we have that $n \mid \prod_{i=1}^{n} \sum_{j=0}^{i} 2^j$.
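The statement is easy to test numerically, which is a useful check before reading the proof. The Julia sketch below (the function name `divides_product` is ours) uses the identity $\sum_{j=0}^{i} 2^j = 2^{i+1} - 1$ and keeps the running product reduced modulo n, so no large numbers are needed.

```julia
# does n divide the product of (2^(i+1) - 1) over i = 1, ..., n?
function divides_product(n)
    r = 1 % n                                      # running product modulo n
    for i in 1:n
        term = mod(powermod(2, i + 1, n) - 1, n)   # sum_{j=0}^{i} 2^j = 2^(i+1) - 1
        r = mod(r * term, n)
    end
    return r == 0
end

println(all(divides_product(n) for n in 1:2:999))  # all odd n below 1000
println(divides_product(2))                        # the product is odd, so this fails
```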
j i+1
∏∑2 = ∏ (2 − 1) .
If n is prime, then it follows immediately from Fermat’s little theorem that n | 2 − 1 and clearly
n−1
n = ∏
s
p
k=1
, where pk’s are unique prime numbers and wk’s are natural numbers. Since n is odd,
wk
order to deal with this case, we will use the fact that for any natural number x, 2 − 1 is divisible
x(pk −1)
by pk. Indeed, using Fermat’s little theorem one more time we get that
x(pk −1) x
2 − 1 ≡ 1 − 1 = 0 ( mod pk ) .
Clearly, for any x ∈ N, x(p − 1) ≥ 2. Hence, in order to see that the number of terms in the product
k
that are of the form 2 − 1 is at least wk, it is enough to check that w (p − 1) ≤ n + 1. But this
x(pk −1)
k k
inequality holds as
wk wk
wk (pk − 1 ) ≤ (pk − 1) ≤ p ≤ n ≤ n + 1.
k
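A quick numerical sanity check of the statement of Problem 5.4.2 can be sketched in Python. The function name and the tested range are assumptions made for illustration; the product is reduced modulo n at every step to keep the numbers small.

```python
def product_mod_n(n):
    """Return prod_{i=1}^{n} (2^(i+1) - 1) reduced modulo n."""
    result = 1 % n
    for i in range(1, n + 1):
        result = result * (pow(2, i + 1, n) - 1) % n
    return result


# check the divisibility for every odd n below an (arbitrary) bound
print(all(product_mod_n(n) == 0 for n in range(1, 2000, 2)))
```

A check of this kind is of course not a proof; it only confirms the claim on the tested range.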
$$\varphi(100) = 100\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{5}\right) = 40.$$
Hence, since 7 and 100 are co-prime, it follows from Euler's theorem that $100 \mid 7^{40} - 1$ or, equivalently, that $7^{40} \equiv 1 \pmod{100}$. Finally, since
$$7^{123} = (7^{40})^3 \cdot 7^3 \equiv 1^3 \cdot 343 = 343 \equiv 43 \pmod{100},$$
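The computation above is easy to confirm numerically. The sketch below (the helper `euler_phi` is a hypothetical name; it counts co-prime residues directly instead of using the product formula) evaluates φ(100), then uses Python's built-in three-argument `pow` for modular exponentiation.

```python
from math import gcd


def euler_phi(m):
    """Count integers in 1..m that are co-prime to m (direct definition)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


phi = euler_phi(100)
print(phi)                  # value of Euler's totient function at 100
print(pow(7, phi, 100))     # 7^phi(100) modulo 100
print(pow(7, 123, 100))     # last two digits of 7^123
```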
i=0
divisible by 3. But this is clearly impossible and we get the desired contradiction.
Problem 5.5.2. Find the minimum of $|20^m - 9^n|$ over all natural numbers m and n.
Solution. Let us first note that $|20^1 - 9^1| = 11$. We will show that it is impossible to achieve smaller values. Clearly,
$$20^m - 9^n \equiv 0 - (-1)^n = \pm 1 \pmod{10},$$
so the last digit in the decimal representation of $|20^m - 9^n|$ is either 1 or 9. Hence, the only potentially possible values of $|20^m - 9^n|$ that are less than 11 are 9 or 1. We will independently rule them out.
Clearly $|20^m - 9^n| = 9$ is not possible as $20^m$ is not divisible by 9. In order to rule out the case
7.
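The answer to Problem 5.5.2 can be cross-checked by brute force over small exponents. This is only a sketch: the bound 30 on m and n is an arbitrary assumption, and a finite search cannot replace the argument above.

```python
# smallest value of |20^m - 9^n| together with the exponents attaining it
best = min(
    (abs(20 ** m - 9 ** n), m, n)
    for m in range(1, 31)
    for n in range(1, 31)
)
print(best)
```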
Problem 5.5.3. Given m, n, d ∈ N, prove that if $m^2 n + 1$ and $mn^2 + 1$ are divisible by d, then $m^3 + 1$ and $n^3 + 1$ are also divisible by d.
Solution. If d = 1, then the desired property trivially holds. Hence, we may assume that d ≥ 2.
Suppose that $m^2 n + 1$ and $mn^2 + 1$ are divisible by d. Let us note that, due to the symmetry, it is
otherwise d would divide $m^2 n$ and so it would not divide $m^2 n + 1$. As such, we must have that d
$$m^2(m - n) + (m^2 n + 1) = m^3 + 1,$$
as desired.
Problem 5.6.1. Find all x, y ∈ N such that $2^x + 5^y$ is a square.
Solution. Suppose that $2^x + 5^y = z^2$ for some natural numbers x, y, and z. Let us first note that z is not divisible by 5. We will split the proof into two independent cases depending on the parity of x.
The case when x is odd is easy to deal with. If x = 2k + 1 for some non-negative integer k, then $2^x = 2^{2k+1}$ gives the remainder of 2 or 3 when divided by 5 ($2^1 = 2 \pmod 5$, $2^3 = 8 \equiv 3 \pmod 5$, $2^5 = 32 \equiv 2 \pmod 5$, etc.). On the other hand, $z^2$ gives the remainder of 1 or 4 ($1^2 = 1 \pmod 5$, $2^2 = 4 \pmod 5$, $3^2 = 9 \equiv 4 \pmod 5$, $4^2 = 16 \equiv 1 \pmod 5$, $6^2 \equiv 1^2 = 1 \pmod 5$, etc.). It follows that there
follows that the second term, namely, $z - 2^k$ is equal to 1 and so $5^y = 2 \cdot 2^k + 1$. Note that there exists one solution corresponding to k = 1: x = 2 and y = 1. We will show that this is the only solution.
For a contradiction, suppose that for some y, k ∈ N ∖ {1}, we have that
$$5^y - 2 \cdot 2^k = 1 = 5 - 4,$$
or equivalently that
$$5(5^{y-1} - 1) = 4(2^{k-1} - 1).$$
In order for the right hand side to be divisible by 5, we must have that k = 4t + 1 for some non-negative integer t; that is, $2^{k-1}$ has to be of the form $16^t$. But it means that both sides are also divisible by 3, since $16^t - 1 \equiv 1^t - 1 = 0 \pmod 3$. Now, in order for the left hand side to be divisible by 3, we must have that y = 2s + 1 for some s ∈ N; that is, $5^{y-1}$ has to be of the form $25^s$. Recall that the case y = 1 (s = 0) corresponds to a feasible solution and is excluded now. But this implies that the left hand side is divisible by 8, as $25^s - 1 \equiv 1^s - 1 = 0 \pmod 8$, whereas the right hand side is clearly not. We get the desired contradiction and the proof is finished.
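The conclusion of Problem 5.6.1 can be cross-checked with a small exhaustive search. In this sketch the function name and the exponent bound are illustrative assumptions; `math.isqrt` gives an exact integer square root, so there are no floating-point issues.

```python
from math import isqrt


def square_solutions(limit):
    """Return all pairs (x, y) with 1 <= x, y <= limit and 2^x + 5^y a perfect square."""
    found = []
    for x in range(1, limit + 1):
        for y in range(1, limit + 1):
            value = 2 ** x + 5 ** y
            root = isqrt(value)
            if root * root == value:
                found.append((x, y))
    return found


print(square_solutions(60))
```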
Problem 5.6.2. Prove that for any two sequences, $(x_i)_{i=1}^{2011}$ and $(y_i)_{i=1}^{2011}$, of natural numbers,
Solution. Let us first note that, without loss of generality, we may assume that $\gcd(x_i, y_i) = 1$ for all i ∈ [2011]. Indeed, it is easy to see that one could factor out $\gcd^2(x_i, y_i)$ (that is clearly a square) from the ith term and move it in front of the product. As a result, two sequences $(x_i)_{i=1}^{2011}$ and $(y_i)_{i=1}^{2011}$ satisfy
Now, assuming that $\gcd(x_i, y_i) = 1$, we will analyze the remainder of $2x_i^2 + 3y_i^2$ when divided by 3. If $3 \mid x_i$ then, by our assumption, 3 does not divide $y_i$ and so $y_i^2 \equiv 1 \pmod 3$. Indeed, if $y_i \equiv 1 \pmod 3$, then $y_i^2 \equiv 1^2 = 1 \pmod 3$ whereas if $y_i \equiv 2 \pmod 3$, then $y_i^2 \equiv 2^2 = 4 \equiv 1 \pmod 3$. It follows that $2x_i^2 + 3y_i^2 = 3(3t_i + 1)$ for some $t_i \in \mathbb{N}$. On the other
These two cases naturally define a partition of the 2011 terms of the product. Let A be the subset of [2011] that consists of those indices i for which $3 \mid x_i$, and let B = [2011] ∖ A.
For a contradiction, suppose that the product $\prod_{i=1}^{2011} (2x_i^2 + 3y_i^2)$ is a square. Since 3 does not divide the term $2x_i^2 + 3y_i^2$ when i ∈ B and each term corresponding to i ∈ A has precisely one 3 in its unique factorization, we get that |A| is even, that is, |A| = 2s for some non-negative integer s. Hence, |B| = 2011 − |A| is odd. But this means that
$$\prod_{i \in B} (2x_i^2 + 3y_i^2) \equiv \prod_{i \in B} (3t_i + 2) \equiv 2^{|B|} \equiv (-1)^{|B|} \equiv -1 \equiv 2 \pmod 3$$
and so
$$\prod_{i=1}^{2011} (2x_i^2 + 3y_i^2) = 3^{2s} \prod_{i \in A} (3t_i + 1) \prod_{i \in B} (3t_i + 2) = 3^{2s}(3p + 2)$$
for some p ∈ N. But it means that 3p + 2 is a square but this is impossible as no square gives a remainder of 2 when divided by 3. This finishes the proof.
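The parity argument above only uses the fact that the number of terms (2011) is odd. The following Python sketch, with hypothetical helper names and small arbitrary bounds, illustrates this on tiny cases: products of one or three terms of the form $2x^2 + 3y^2$ are searched for squares, and products of two terms are included for contrast.

```python
from itertools import product
from math import isqrt, prod


def is_square(value):
    root = isqrt(value)
    return root * root == value


def square_product_exists(num_terms, bound):
    """Is some product of num_terms values 2x^2 + 3y^2 (1 <= x, y <= bound) a square?"""
    terms = sorted({2 * x * x + 3 * y * y
                    for x in range(1, bound + 1)
                    for y in range(1, bound + 1)})
    return any(is_square(prod(choice))
               for choice in product(terms, repeat=num_terms))


print(square_product_exists(1, 50))   # odd number of terms
print(square_product_exists(2, 5))    # even number of terms, e.g. 5 * 5 = 25
print(square_product_exists(3, 5))    # odd number of terms
```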
Problem 5.6.3. Consider any integer n ≥ 2 and any subset S of the set N := {0, 1, 2, …, n − 1} that has more than $\frac{3}{4}n$ elements. Prove that there exist integers a, b, c such that the remainders when
Let us start with the following, simple but useful, observation that we will use a few times. Let x, y, z ∈ N be such that x < y. Then, z + x and z + y yield two different remainders when divided by
a + b when divided by n is in S.
Similarly, having a and b fixed, we observe that there are fewer than $\frac{1}{4}n$ values of x ∈ S for which the remainder of a + x when dividing by n is not in S (property used with z = a), fewer than $\frac{1}{4}n$ values that create a problem for b + x (property used with z = b), and fewer than $\frac{1}{4}n$ values not satisfying the condition for (a + b) + x (property used with z = a + b). It follows that there are fewer than $\frac{3}{4}n$ values that do not satisfy some condition but we have more than $\frac{3}{4}n$ values to choose from. Hence, we are guaranteed that there exists c ∈ S that, together with a and b, satisfies the desired conditions.
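The counting argument can be tested exhaustively for small n. Since the problem statement is truncated here, the sketch below assumes the conditions visible in the solution: a, b, c ∈ S (not necessarily distinct) and the remainders of a + b, a + c, b + c, and a + b + c modulo n all lie in S. The function names and the bound on n are likewise assumptions.

```python
from itertools import combinations


def find_triple(S, n):
    """Return (a, b, c) from S meeting the assumed conditions, or None."""
    S = set(S)
    for a in S:
        for b in S:
            if (a + b) % n not in S:
                continue
            for c in S:
                if all((s + c) % n in S for s in (a, b, a + b)):
                    return a, b, c
    return None


def check_all(max_n):
    """Check every subset S of {0, ..., n-1} with |S| > 3n/4 for 2 <= n <= max_n."""
    for n in range(2, max_n + 1):
        for size in range(n, 0, -1):
            if 4 * size <= 3 * n:      # |S| must exceed 3n/4
                break
            for S in combinations(range(n), size):
                if find_triple(S, n) is None:
                    return False
    return True


print(check_all(12))
```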
Problem 5.7.1. Prove that if the sum of positive divisors of some natural number n is odd, then either n
is a square or n/2 is a square.
Solution. Let us consider the unique factorization of n. That is, we write $n = \prod_{i=1}^{k} p_i^{\ell_i}$ for some sequence of prime numbers $2 \le p_1 < p_2 < \ldots < p_k$ and $\ell_i \in \mathbb{N}$ for i ∈ [k]. Note that each positive divisor of n has unique representation $\prod_{i=1}^{k} p_i^{j_i}$, where $j_i \in \{0, 1, \ldots, \ell_i\}$, and two different divisors have different representations. It follows that the sum of all positive divisors of n is equal to
$$S := \sum_{j_1=0}^{\ell_1} \cdots \sum_{j_k=0}^{\ell_k} \prod_{i=1}^{k} p_i^{j_i} = \prod_{i=1}^{k} \sum_{j=0}^{\ell_i} p_i^{j}.$$
Since our assumption is that S is odd, we get that $\sum_{j=0}^{\ell_i} p_i^{j}$ is odd for each i ∈ [k].
Consider any i ∈ [k]. Suppose first that $p_i > 2$ and so it is odd. Since $\sum_{j=0}^{\ell_i} p_i^{j}$ is odd, the sum has $\ell_i + 1$ terms, and each term is odd, it follows that the number of terms is odd, that is, $\ell_i$ is even. In this case we get that $p_i^{\ell_i}$ is a square. On the other hand, if $p_1 = 2$, then the first term in the corresponding sum is odd ($2^0 = 1$) and the remaining terms are even. As a result, the sum is always odd and $\ell_1$ could
$2^{\ell_1 - 1}$ is a square. Putting these observations together we conclude that if n is odd (that is, $p_1 \ne 2$), then n is a square. If n is even (
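The statement of Problem 5.7.1 is easy to test on a range of integers. The sketch below uses a naive divisor sum; the function names and the bound are assumptions chosen for illustration.

```python
from math import isqrt


def divisor_sum(n):
    """Sum of all positive divisors of n (naive O(n) loop)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def is_square(value):
    root = isqrt(value)
    return root * root == value


def property_holds(limit):
    """If the divisor sum of n is odd, then n or n/2 must be a square."""
    for n in range(1, limit + 1):
        if divisor_sum(n) % 2 == 1:
            if not (is_square(n) or (n % 2 == 0 and is_square(n // 2))):
                return False
    return True


print(property_holds(2000))
```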
Problem 5.7.2. Find all natural numbers n for which there exist 2n pairwise different numbers $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n$ such that $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$ and $\prod_{i=1}^{n} a_i = \prod_{i=1}^{n} b_i$.
Solution. It is clear that there is no solution for n = 1. For n = 2, we have the following two conditions: $a_1 + a_2 = b_1 + b_2$ and $a_1 a_2 = b_1 b_2$. From the first equation we have that $a_2 = b_1 + b_2 - a_1$. After substituting this to the second equation we get that
We will now show that if there is a solution for $n = n_0$, then there is one for $n = n_0 + 3$ (inductive step). Since we already showed that there is a solution for n ∈ {3, 4, 5} (base case), by mathematical induction we will get that there is a solution for any natural number n ≥ 3.
In order to prove the claim, let us make two simple observations. First of all, let us note that the solution that we have for n = 3 ($(a_1, a_2, a_3) = (2, 8, 9)$ and $(b_1, b_2, b_3) = (3, 4, 12)$) can be easily generalized to get an infinite family of solutions. Indeed, it is obvious that for any x ∈ N, $(a_1, a_2, a_3) = (2x, 8x, 9x)$ and $(b_1, b_2, b_3) = (3x, 4x, 12x)$ is also a solution to our problem. The second ingredient that we need is the fact that any two solutions that consist of non-overlapping values can be merged together to get another solution. Formally, suppose that the pair $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ is the solution for some n ≥ 3. Then, one can take x large enough such that
Let us make some final remarks. Finding solutions for n ∈ {3, 4, 5} by hand can be tedious. However, with access to a computer, one can easily do it. Below is a short Julia script that was used to find the solutions given above.
using Combinatorics

# Print all pairs (x, y) of disjoint n-element subsets of 1:k
# that have equal sums and equal products.
function f(n, k)
    for x in combinations(1:k, n),
        y in combinations(setdiff(1:k, x), n)
        if x < y # avoid printing duplicates
            if sum(x) == sum(y) && prod(x) == prod(y)
                println((x, y))
            end
        end
    end
end
and now we can run it to get the desired solutions:
$$D(x) := \sum_{w_x \in W(x)} w_x - \sum_{b_x \in B(x)} b_x.$$
We will show that D(x) ≠ 0 which gives a negative answer to the question; that is, there is no integer for which the sum of its white divisors is equal to the sum of its black divisors.
Let us start with proving the following useful property. For any p and q that are co-prime, we have that
$$D(p \cdot q) = D(p) \cdot D(q). \tag{8.13}$$
Indeed,
$$\begin{aligned}
D(p) \cdot D(q) &= \Bigl(\sum_{w_p \in W(p)} w_p - \sum_{b_p \in B(p)} b_p\Bigr)\Bigl(\sum_{w_q \in W(q)} w_q - \sum_{b_q \in B(q)} b_q\Bigr) \\
&= \Bigl(\sum_{w_p \in W(p)} w_p \sum_{w_q \in W(q)} w_q + \sum_{b_p \in B(p)} b_p \sum_{b_q \in B(q)} b_q\Bigr) \\
&\quad - \Bigl(\sum_{w_p \in W(p)} w_p \sum_{b_q \in B(q)} b_q + \sum_{b_p \in B(p)} b_p \sum_{w_q \in W(q)} w_q\Bigr).
\end{aligned}$$
Note also that $w_p w_q$ and $b_p b_q$ are white divisors of p ⋅ q (as both the sum of two even numbers and the sum of two odd numbers is even), and $w_p b_q$ and $b_p w_q$ are black divisors of p ⋅ q (as the sum of an even and an odd number is odd). Also, in the expression above, all divisors of p ⋅ q are present exactly once since p and q are co-prime. This shows that, indeed, (8.13) holds.
Let us now come back to our task of showing that D(x) ≠ 0. Let $x = \prod_{i=1}^{t} p_i^{s_i}$ be the unique prime
$$D(x) = D\Bigl(\prod_{i=1}^{t} p_i^{s_i}\Bigr) = \prod_{i=1}^{t} D(p_i^{s_i}).$$
Finally, note that all positive divisors of $p_i^{s_i}$ have the form $p_i^{k}$ for $0 \le k \le s_i$. Moreover, the
$$(2y - 1)^2 = (2x^2 - x)^2 - (x^2 - 1).$$
$$(2y - 1)^2 \le (2x^2 - x)^2 - (2^2 - 1) < (2x^2 - x)^2.$$
$$(2x^2 - x - 1)^2 = (2x^2 - x)^2 - (x^2 - 1) - x(3x - 2) < (2x^2 - x)^2 - (x^2 - 1) = (2y - 1)^2.$$
But there is no natural number such that its square is between squares of two consecutive natural numbers and so we get the desired contradiction. It follows that x = 1.
If x = 1, then we get that $(2y - 1)^2 = 1$ which implies that y = 1. Therefore, the only solution of
assume that this is a smallest example (in terms of variable x), that is, there is no other pair x′, y′ ∈ N that satisfies the desired equality and x′ < x. Suppose first that both x and y are divisible by 3. It is easy to see that x′ = x/3 ∈ N and y′ = y/3 ∈ N also form a solution, which contradicts our assumption. Similarly, it is not possible that one of the numbers is divisible by 3 and the other is not, as then the left hand side is not divisible by 3 while the right hand side is. Finally, if both x and y are not divisible by 3, then $x^2 + y^2$ gives the remainder of 2 when divided by 3 whereas the right hand side is clearly divisible
to the symmetry, we may assume that x > y and potential solutions will come in pairs; that is, if $(x, y) = (x_0, y_0)$ is a solution, then so is $(x, y) = (y_0, x_0)$. Either way, we may assume that z ≥ 1.
After substitution, our equation becomes $z^n = y^2 + zy$. Now, multiply both sides by 4 and add $z^2$ to
Note that the right hand side of this equation is a square. Since $z^2$ is a square, it follows that $4z^{n-2} + 1$ is also a square. As $4z^{n-2} + 1$ is clearly an odd number, it must be a square of an odd natural number, that is, $4z^{n-2} + 1 = (2t + 1)^2$ for some t ∈ N. It follows that $z^{n-2} = t(t + 1)$. Since t and t + 1 are co-prime, $t = a^{n-2}$ and $t + 1 = b^{n-2}$ for some a, b ∈ N. We get that $b^{n-2} - a^{n-2} = (t + 1) - t = 1$.
$$(t^2(t + 1))(t(t + 1)^2) = t^3(t + 1)^3 = y^2 + t(t + 1)y = y^2 + (t(t + 1)^2 - t^2(t + 1))y$$
$$(y - t^2(t + 1))(y + t(t + 1)^2) = 0.$$
Since both y and t are at least 1 (both are natural numbers), we get that $y = t^2(t + 1)$. Then
Finally, we have to check that $y = t^2(t + 1)$ and $x = (t + 1)^2 t$ do, in fact, yield a solution. We get
Problem 5.8.3. Find all natural numbers satisfying the following system of equations:
a + b + c = xyz,
x + y + z = abc,
a + b + c + x + y + z = abc + xyz .
Observe that
abc − (a + b + c) = c(ab − 1) − a − b
= c(ab − 1) − a − b + ab + 1 − (ab − 1) − 2
= (c − 1)(ab − 1) + (a − 1)(b − 1) − 2.
Similarly,
Let us note that all the 4 terms at the right hand side are non-negative.
Now, observe that if c ≥ 2 (and so a and b are also at least 2), then
(c − 1)(ab − 1) + (a − 1)(b − 1 ) ≥ 4.
(a − 1)(b − 1) + (x − 1)(y − 1 ) = 4.
Now, if b = 1, then we have (x − 1)(y − 1) = 4 and so (x, y) = (3, 3) or (x, y) = (5, 2). If
(x, y) = (3, 3), then xyz = 9 and x + y + z = 7. But this would mean that a + 2 = 9 and 2a = 7,
which is not possible. If (x, y) = (5, 2), then xyz = 10 and x + y + z = 8. But this would mean that
a + 2 = 10 and 2a = 8, which is also not possible. Therefore, we conclude that b ≥ 2 and, by
8.6 Geometry
Problem 6.1.1. We are given an acute triangle ABC with ∢ACB = π/3. Let A′ be the orthogonal projection of A on BC, let B′ be the orthogonal projection of B on AC, and let M be the middle point of line segment AB. Prove that |A′B′| = |A′M| = |B′M|.
Solution. Since A′AC is a right triangle, ∢B′AA′ = ∢CAA′ = π/2 − π/3 = π/6. Since AA′B and AB′B are right triangles, points A, B, A′, and B′ lie on a circle whose center is M. But this means that
Problem 6.1.3. Point O is the center of a circumcircle of a triangle ABC. Point C′ is the orthogonal projection of C on AB. Prove that ∢ACC′ = ∢OCB.
Solution. Let us first note that ∢OCB = π/2 − ∢COB/2 = π/2 − ∢CAB. On the other hand, since triangle ACC′ is a right triangle, ∢ACC′ = π/2 − ∢CAC′ = π/2 − ∢CAB. It follows that ∢ACC′ = ∢OCB.
Problem 6.2.1. Suppose that points P and Q lie on sides BC and CD of a square ABCD such that ∢PAQ = π/4. Prove that |BP| + |DQ| = |PQ|.
Solution. Consider point R inside the square such that |AR| = |AB| = |AD| and ∢BAP = ∢PAR. Note that R lies inside the angle ∢PAQ. After considering congruent triangles BAP and PAR, we get that |BP| = |PR|. Now, notice that ∢BAP + ∢DAQ = π/2 − π/4 = π/4. Using this we have ∢QAR = π/4 − ∢PAR = π/4 − ∢BAP = π/4 − (π/4 − ∢DAQ) = ∢DAQ. It follows that |DQ| = |QR|.
It is left to show that R lies on the line segment PQ, as then we will conclude that |BP| + |DQ| = |PR| + |QR| = |PQ|. But ∢QRA = ∢QDA = π/2. Similarly, ∢PRA = ∢PBA = π/2, and so ∢PRQ = π, as required.
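The identity of Problem 6.2.1 can be checked numerically in coordinates. The sketch below assumes the unit square A = (0, 0), B = (1, 0), C = (1, 1), D = (0, 1); the function name and the sample positions of P are illustrative choices. Point Q is obtained by rotating the ray AP by π/4 and intersecting it with side CD.

```python
import math


def both_sides(p):
    """For P = (1, p) on BC, return (|BP| + |DQ|, |PQ|) in the unit square."""
    B, D = (1.0, 0.0), (0.0, 1.0)
    P = (1.0, p)
    angle_AQ = math.atan2(P[1], P[0]) + math.pi / 4   # direction of ray AQ
    # Q is the point of ray AQ with y-coordinate 1, i.e. on side CD
    Q = (math.cos(angle_AQ) / math.sin(angle_AQ), 1.0)
    return math.dist(B, P) + math.dist(D, Q), math.dist(P, Q)


for p in (0.1, 0.25, 0.5, 0.9):
    lhs, rhs = both_sides(p)
    print(p, math.isclose(lhs, rhs))
```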
Problem 6.2.2. Point P lies on a diagonal AC of a square ABCD. Points Q and R are the orthogonal projections of P on lines CD and DA, respectively. Prove that |BP| = |RQ|.
Solution. Since RPQD is a rectangle, |RQ| = |PD|. Since ∢DCP = ∢BCP (= (π/2)/2 = π/4) and |DC| = |BC|, triangles PDC and PBC are congruent. It follows that |PB| = |PD| = |RQ|, and the proof is finished.
Problem 6.2.3. Consider an acute triangle ABC where ∢ACB = π/4. Point B′ is the orthogonal projection of B on AC and point A′ is the orthogonal projection of A on BC. Let H be the intersection point of AA′ and BB′. Prove that |CH| = |AB|.
Solution. Since triangle BB′C is a right triangle and ∢B′CB = π/4, we get that |BB′| = |CB′|. Since ∢CB′H = ∢CA′H = π/2, points H, B′, C, and A′ lie on a circle. It follows that ∢B′CH = ∢B′A′H. Similarly, since ∢AB′B = ∢AA′B = π/2, points A, B′, A′, and B lie on a circle. It follows that ∢B′A′H = ∢B′BA. As a result, we get that triangles BB′A and HB′C are
Solution. Since ∢BB′C = ∢BC′C = π/2, points B, C, B′, and C′ lie on a circle. Thus,
CB = ∢C BC . This means that also ∢AC B = ∢BCB .
′ ′ ′ ′ ′ ′ ′ ′ ′
AB C = π/2 − ∢C B B = π/2 − ∢C
figure x.
Solution. Let us first note that the centers of $o_1$ and $o_2$ cannot lie inside the other circle as then ∢PAQ could not be equal to π/2. Note then that ∢ABQ = ∢ABP = π/2, and so Q, B and P are collinear. It follows that AQP is a right triangle and B is an orthogonal projection of A on PQ. So |QB|/|AB| = |QA|/|AP|. Similarly |PB|/|AB| = |PA|/|AQ|. We conclude that $|PB|/|BQ| = |PA|^2/|AQ|^2 = [o_1]/[o_2]$, and the proof is finished.
Alternatively, for the last step one could use the power of the point property we introduce in Section 6.6. Using it one gets that $|AQ|^2 = |QB||QP|$ and that $|AP|^2 = |PB||PQ|$, and so
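Problem 6.4.1. Points D, E, and F lie on sides BC, CA, and AB of a triangle ABC in such a way that lines AD, BE, and CF intersect in a single point P. Prove that |AF|/|FB| + |AE|/|EC| = |AP|/|PD|.
Solution. After applying Menelaus's theorem to triangle ABD and line FP, we get that
$$\frac{|BC|}{|DC|} \cdot \frac{|DP|}{|PA|} \cdot \frac{|AF|}{|FB|} = 1,$$
and so
$$\frac{|AF|}{|FB|} = \frac{|DC|}{|BC|} \cdot \frac{|PA|}{|DP|}.$$
Similarly, after applying Menelaus's theorem to triangle ADC and line EP, we get that
$$\frac{|CB|}{|DB|} \cdot \frac{|DP|}{|PA|} \cdot \frac{|AE|}{|EC|} = 1,$$
and so
$$\frac{|AE|}{|EC|} = \frac{|DB|}{|CB|} \cdot \frac{|PA|}{|DP|}.$$
It follows that
$$\frac{|AF|}{|FB|} + \frac{|AE|}{|EC|} = \frac{|DC|}{|BC|} \cdot \frac{|PA|}{|DP|} + \frac{|DB|}{|CB|} \cdot \frac{|PA|}{|DP|} = \frac{|DC| + |DB|}{|BC|} \cdot \frac{|PA|}{|DP|} = \frac{|PA|}{|DP|},$$
as required.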
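The identity of Problem 6.4.1 can be verified numerically for a concrete configuration. In the sketch below the triangle, the interior point P, and the helper names are arbitrary assumptions; points D, E, and F are computed as intersections of the cevians with the opposite sides.

```python
import math


def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (lines assumed non-parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


def ratio_sides(A, B, C, P):
    """Return (|AF|/|FB| + |AE|/|EC|, |AP|/|PD|) for an interior point P."""
    D = intersect(A, P, B, C)   # AD meets BC
    E = intersect(B, P, C, A)   # BE meets CA
    F = intersect(C, P, A, B)   # CF meets AB
    lhs = math.dist(A, F) / math.dist(F, B) + math.dist(A, E) / math.dist(E, C)
    rhs = math.dist(A, P) / math.dist(P, D)
    return lhs, rhs


lhs, rhs = ratio_sides((0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (1.5, 1.0))
print(math.isclose(lhs, rhs))
```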
Problem 6.4.2. You are given a triangle ABC where ∢ACB = π/2. On side AC build a square ACGH, externally to the triangle. Similarly, on side BC build a square CBEF, externally to the triangle. Show that the point of intersection of AE and BH lies on the line orthogonal to AB that goes through point C.
Solution. Let A′ be the intersection point of AE and BC, B′ be the intersection point of BH and AC, and C′ be the orthogonal projection of C on AB. Since AC is parallel to BE, |CA′|/|A′B| = |AC|/|BE| = |AC|/|BC|. Similarly, we argue that |AB′|/|B′C| = |AC|/|BC|. Let us now observe that triangles AC′C and BC′C are similar, and so |AC′|/|CC′| = |CC′|/|BC′|. It follows that $|BC'|/|C'A| = (|BC'|/|CC'|)^2$. But BC′C and ACB are similar, and so $|BC'|/|C'A| = (|CB|/|AC|)^2$. We get that |CA′|/|A′B| ⋅ |BC′|/|C′A| ⋅ |AB′|/|B′C| = 1. Using Ceva's theorem, we conclude that lines AA′, BB′, and CC′ intersect in one point, which finishes the proof.
Problem 6.4.3. You are given a convex quadrilateral ABCD and a line that intersects lines DA, AB, BC, and CD in points K, L, M, and N, respectively. Prove that |DK| ⋅ |AL| ⋅ |BM| ⋅ |CN| = |AK| ⋅ |BL| ⋅ |CM| ⋅ |DN|.
Solution. Let us add an auxiliary line BD to the plot. Let X be the intersection point of line BD with the new line going through K, L, M, and N. Applying Menelaus's theorem twice, the first time to triangle ABD, and the second time to triangle BDC, we get that
point A. We will show that for any two points $B \in \ell_1$ and $C \in \ell_2$ with |AB| = |AC| the following property holds: all points lying on the line segment BC have the same total distance to lines $\ell_1$ and $\ell_2$. To see this, let us consider any point P on the line segment BC. Clearly, [ABC] = [ABP] + [ACP], where [x] denotes the area of figure x. Since |AB| = |AC|, we immediately get that the sum of heights of the two triangles ABP and ACP, projected from P to AB and, respectively, from P to AC is constant (namely, equal to 2[ABC]/|AB| = 2[ABC]/|AC|). From this argument, we immediately get the following important observation. Any point P lies on the unique line segment BC defined as above; in particular |AB| = |AC|. More importantly, for any two points $P_i$ (i ∈ {1, 2}) and the associated line segments $B_iC_i$, the total distances from $P_i$ to $\ell_1$ and $\ell_2$ are equal if and only if the two corresponding line segments $B_1C_1$ and $B_2C_2$ are identical. Moreover, if we extend half-lines $\ell_1$ and $\ell_2$ to lines, then the set of all points having the same distance from these two lines forms a rectangle with point A being the intersection of its diagonals. All points inside this rectangle have sums of the distances from these two lines strictly smaller than the points on this rectangle.
Let us now go back to our problem. Clearly, if ABCD is a parallelogram, then the desired property holds. Suppose then that ABCD is not a parallelogram. Our goal is to show that there are two points inside of ABCD with different sums of distances.
Let us first deal with convex quadrilaterals. Without loss of generality, we may assume that AB and CD are not parallel. Select any point P inside of ABCD. From the observation above it follows that the set of points that are at the same distance as P from the two lines yielded by line segments AB and CD lie on some rectangle R. We will independently consider the following two cases.
Case 1: AD and BC are parallel. Note that the sum of distances from AD and BC for all points inside of ABCD is the same. Thus, we may select any point P′ not lying on rectangle R but lying inside of ABCD (note that such a point always exists) to conclude that its total distance from the sides of ABCD
segment CB. Observe now that AB′CD′ is defined by the same lines as ABCD, but is convex; in particular, AB′CD′ is not a parallelogram as ABCD is not. By the previous argument, we get that there are points inside of AB′CD′ (and so also inside of ABCD) with different sums of the distances
Solution. Let us introduce an auxiliary point X such that ADCX forms a rectangle. Note that 2|DE| = |CX| and 2|BD| = |BC| so B, F, E, and X lie on the same line; in particular, ∢DFX = ∢DFE = π/2. It follows that points D, F, A, and X lie on some circle. But C lies on the circle on which A, D, and X lie. It follows that they all lie on the same circle and so ∢CFA = ∢CDA = π/2.
Problem 6.5.3. Consider a triangle ABC. Outside of the triangle, on sides AB and AC, we build squares ABDE and, respectively, ACFG. Let M and N be the middle points of DG and, respectively, EF. What are the possible values of the ratio |MN|/|BC|?
Solution. Let us add an auxiliary point P such that EAGP is a parallelogram; in particular, PE and GA are parallel and have equal length. On the other hand, since ACFG is a square, FC and GA are also parallel and have equal length. It follows that CFPE is a parallelogram. But this means that N lies in the middle of the line segment PC as the diagonals of a parallelogram intersect in their middles. Using the same argument we get that GPDB is a parallelogram and M lies in the middle of the line segment BP. It follows that |PN|/|PC| = |PM|/|PB| = 1/2 which means that triangles PBC and PMN are similar and thus |NM|/|CB| = 1/2. Hence, this is the only possible ratio.
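The ratio found in Problem 6.5.3 can be confirmed numerically by representing points as complex numbers. The sketch below assumes that the vertices A, B, C are given in counterclockwise order, so that multiplying by −i (respectively i) rotates a side vector away from the triangle; the function name and the sample triangles are illustrative.

```python
def midpoint_ratio(A, B, C):
    """Return |MN| / |BC| for squares ABDE and ACFG built outside triangle ABC.

    A, B, C are complex numbers listed counterclockwise (an assumption
    of this sketch).
    """
    D = B - 1j * (B - A)   # square ABDE lies on the side of AB opposite to C
    E = A - 1j * (B - A)
    F = C + 1j * (C - A)   # square ACFG lies on the side of AC opposite to B
    G = A + 1j * (C - A)
    M = (D + G) / 2        # middle point of DG
    N = (E + F) / 2        # middle point of EF
    return abs(M - N) / abs(B - C)


for triangle in [(0, 1, 1j), (0, 4, 1 + 3j), (-2 + 1j, 3, 0.5 + 5j)]:
    print(midpoint_ratio(*triangle))
```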
Problem 6.6.1. Two circles intersect in points A and B. Point P is selected on line AB outside of the circles. Points C and D are locations where tangent lines going through point P touch both circles. Prove that ∢PCD = ∢PDC.
Solution. Let us first note that points C and D are not uniquely defined (there are two possible locations). However, regardless of their location, $|PC|^2 = |PA| \cdot |PB| = |PD|^2$. It follows that PCD
altitude of a triangle FED as DEFE′ is a kite (|FE| = |FE′| and |DE| = |DE′|). The crucial observation now is the fact that all points lying on a line going through E and E′ have the same power with respect to circles $k_1$ and $k_2$, as can be seen by calculating this power along the line EE′.
Let us now consider a circle $k_1$ and a circle $k_3$ with center in B and radius |BA| = |BC|. As before, we define point C′ that is a second intersection (the first one is C) of $k_1$ and $k_3$. We note that CC′ contains the altitude of BCD, and all points on the line CC′ have the same power with respect to circles $k_1$ and $k_3$. Let us now observe that it is not possible that EE′ and CC′ are parallel as then ∢FDB would have to be 0, which is not the case. Therefore, lines EE′ and CC′ have a unique intersection point Z.
It follows that the power of point Z with respect to circles $k_2$ and $k_3$ is the same. Let us draw a line going through Z and A. Because of the above fact, it must also go through point A′ that is the other intersection point of circles $k_2$ and $k_3$ (the first one is A). We conclude that AA′ contains the altitude of
center in Y and radius |AY| = |BY| (excluding points A and B). It remains to show that for each such point, it is possible to generate the two circles that satisfy the desired properties. (Let us mention that, indeed, points A and B are excluded, as for them one of the circles would degenerate to a point.)
Select any point X on such a semi-circle (again, excluding points A and B). It is easy to see that it is possible to select then a point P such that |PA| = |PX| and ∢YAP = ∢YXP = π/2. Similarly, we select Q such that |QB| = |QX| and ∢YBQ = ∢YXQ = π/2. It remains to show that the two circles with centers in P and Q and radii |PA| and, respectively, |QB| are tangent. In order to prove it, it is enough to show that X lies on a line segment PQ. But this is indeed true as ∢PXY + ∢QXY = π/2 + π/2 = π.
Problem 6.7.1. Let P be an interior point of a triangle ABC. Let lines AP, BP, and CP intersect sides BC, CA, and AB in points A′, B′ and, respectively, C′. Prove that |PA|/|AA′| + |PB|/|BB′| + |PC|/|CC′| = 2.
It follows that
$$\frac{|PA|}{|AA'|} + \frac{|PB|}{|BB'|} + \frac{|PC|}{|CC'|} = \left(1 - \frac{|PA'|}{|AA'|}\right) + \left(1 - \frac{|PB'|}{|BB'|}\right) + \left(1 - \frac{|PC'|}{|CC'|}\right) = 3 - 1 = 2.$$
Problem 6.7.2. Points E and F lie on sides BC and, respectively, DA of a parallelogram ABCD such that |BE| = |DF|. Select any point K on side CD. Let P and Q be intersection points of line FE with lines AK and, respectively, BK. Prove that [APF] + [BQE] = [KPQ].
Solution. Since |BE| = |FD|, |BC| = |AD|, and BC and AD are parallel, we get that ABEF and CDFE are congruent trapezoids. It follows that [ABEF] = [CDFE] = [ABCD]/2. On the other hand, since triangle AKB has the same base (namely, AB) and the height projected on this base as the parallelogram ABCD, [AKB] = [ABCD]/2. It follows that
Let us now note that [KLM] = [LBM] as these triangles have the same height projected on bases of equal length. Similarly, [MKN] = [KND]. But this implies that
$$\begin{aligned}
{} &= \frac{1}{2}\bigl([KLM] + [LBM] + [MKN] + [KND]\bigr) \\
&= \frac{1}{2}[KBMD] = \frac{1}{2} \cdot \frac{2}{3}[ABCD] = [ABCD]/3.
\end{aligned}$$
Problem 6.8.1. Given a parallelogram ABCD, consider points M and N that are in the middle of sides
BC and CD, respectively. Section BD intersects with AN in point Q, and with AM in point P. Prove
N B, Q be the intersection point of LD with KC , R be the intersection point of M A with LD, and,
C′PN and BCP are similar with ratio of 3/2. Consider now heights of these triangles projected from P onto C′N and BC. Denote their lengths as $h_1$ and, respectively, as $h_2$. Since $h_1/h_2 = 3/2$ (by
$|BC|h_2/2 = 1/5$, and so [BPC] = 1/5. Similarly, we conclude that [DQC] = [ARD] = [BSA] = 1/5, and so [PQRS] = 1 − 4/5 = 1/5.
Problem 6.8.3. Points E and F are on sides AB and, respectively, AD of rhombus ABCD. Lines CE and CF intersect line BD in points K and L, respectively. Line EL intersects side CD in point P. Line FK intersects side BC in point Q. Prove that |CP| = |CQ|.
Solution. Consider triangles FLD and BCL. By Thales' theorem we get that |FD|/|LD| = |BC|/|LB|. Consider now triangles LPD and LEB. Using Thales' theorem one more time we get that |DP|/|LD| = |BE|/|LB|. Combining those two facts together, we conclude that |DP| = |BE| ⋅ |LD|/|LB| = |BE| ⋅ |FD|/|BC|. Analogously, by analyzing triangles EBK and DKC, and then triangles BKQ and FKD, we get that |BQ| = |BE| ⋅ |FD|/|DC|. We conclude that
We do hope that our book increased the appetite for more problems to solve
and that readers will search for more books to expand their knowledge
and skills. Here is a list of books that we have on our shelves and like to
read but, of course, this list is not complete. There are many more books
that are worth reading. Moreover, the mathematical level of these books
varies a lot. In any case, we hope that the readers will enjoy reading some of
them and keep solving interesting problems.
Good luck!
Index
Degree of a Polynomial, 73
Diophantine Equations, 176
Discriminant, 35
Disjoint Events, 130
Divisibility, 149
Double Counting, 58, 115
Double-angle Identities, 54
Gauss’s Lemma, 88
Generating Function, 144
Geometric Distribution, 123
Geometric Mean, 7
Geometric Sequence, 6
Geometric Series, 6
Geometrical Probability, 129
Golden Ratio, 146
Graphs, 103
Greatest Common Divisors, 152
Greedy Algorithm, 106
Harmonic Mean, 7
Heawood Graph, 118
Heron’s Formula, 32
Hypotenuse, 51, 182
Imaginary Number, 74
Incidence Graph, 117
Inclusion–Exclusion Principle, 123
Independent Set, 104
Induced Subgraph, 104
Induction Step, 10
Inductive Hypothesis, 10
Infimum, 68
Inscribed Angles, 182
Intercept Theorem, 207
Intersecting Lines, 181
Invariant, 63, 109
Isosceles Triangle, 181
Iterated Function, 100
Jensen’s Inequality, 5
Lagrange Polynomials, 92
Lagrange’s Bound, 90
Legendre’s Formula, 150
Limit Point, 67
Linearity of Expectation, 123
Linearization, 36
Matchings, 104
Mathematical Induction, 10
Maximal Matching, 104
Maximum Degree, 103
Maximum Matching, 104
Menelaus’s Theorem, 193
Minimum Degree, 103
Multiplicative Inverse, 158
Mutually Exclusive Events, 130
Needle Problem, 129
Neighborhood, 103
Non-constructive Argument, 105
Parallelogram, 197
Partially Ordered Sets, 68
Path, 104
Perfect Matching, 104
Permutations, 114
Pigeonhole Principle, 138
Point Reflection, 135
Point Symmetry, 135
Polygon, 181
Polynomials, 73
Power of a Point, 199
Prime Numbers, 149
Probabilistic Method, 124
Product-to-sum Identities, 54
Projective Planes, 117
Proof by Contradiction, 99
Pythagorean Identity, 53
Quadrilateral, 181
Quotient-Remainder Theorem, 152
Quotients, 152
Radian, 52
Rational Root Theorem, 87
Rearrangement Inequality, 12
Rectangle, 197
Relatively Prime Numbers, 152
Remainders, 152
Rhombus, 197
Right Angle, 181
Right Triangle, 182
Roots, 74
Sandwich Theorem, 63
Scale Factor, 6
Secant Function, 51
Similarities, 189
Sine Function, 51
Square, 197
Squeeze Theorem, 63
Stolz–Cesàro Theorem, 251
Subgraph, 104
Sum-to-product Identities, 54
Supremum, 68
System of Equations, 36
System of Linear Equations, 36
Tangent Function, 51
Tangent Line, 199
Thales’ Theorem, 207
The Law of Sines, 204
Titu’s Lemma, 25
Transversal, 193
Trapezoid, 197
Triangle, 181
Triangle Inequality, 2
Trigonometric Functions, 51
Trigonometric Identities, 53
Turán Graph, 106
Turán Number, 117