Train Your Brain

Textbooks in Mathematics

Series editors:
Al Boggess, Kenneth H. Rosen

Nonlinear Optimization
Models and Applications
William P. Fox
Linear Algebra
James R. Kirkwood, Bessie H. Kirkwood
Real Analysis
With Proof Strategies
Daniel W. Cunningham
Train Your Brain
Challenging Yet Elementary Mathematics
Bogumił Kamiński, Paweł Prałat
Contemporary Abstract Algebra, Tenth Edition
Joseph A. Gallian
Geometry and Its Applications
Walter J. Meyer
Linear Algebra
What you Need to Know
Hugo J. Woerdeman
Introduction to Real Analysis, 3rd Edition
Manfred Stoll
Discovering Dynamical Systems Through Experiment and Inquiry
Thomas LoFaro, Jeff Ford
Functional Linear Algebra
Hannah Robbins
https://www.routledge.com/Textbooks-in-Mathematics/book-series/CANDHTEXBOOMTH
Train Your Brain —
Challenging Yet Elementary
Mathematics

By

Bogumił Kamiński
Paweł Prałat
First edition published 2020
by CRC Press

6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742 and by CRC Press

2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2021 Taylor & Francis Group, LLC

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their
use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let
us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.

ISBN: 978-0-367-56487-2 (pbk)


ISBN: 978-1-003-09798-3 (ebk)
ISBN: 978-0-367-67935-4 (hbk)

Typeset in Computer Modern font


by KnowledgeWorks Global Ltd.
Contents
Introduction

1 Inequalities
1.1 Convexity and Concavity
1.2 Arithmetic-Geometric Inequality
1.3 Mathematical Induction
1.4 Bernoulli’s Inequality
1.5 Euler’s Number
1.6 Asymptotics
1.7 Cauchy-Schwarz Inequality
1.8 Probability
1.9 Geometry

2 Equalities and Sequences


2.1 Combining Equalities
2.2 Extremal Values
2.3 Solving via Inequalities
2.4 Trigonometric Identities
2.5 Number of Solutions
2.6 Sequence Invariants
2.7 Solving Sequences

3 Functions, Polynomials, and Functional Equations


3.1 Vieta’s Formulas
3.2 Functional Equations, Exploration
3.3 Functional Equations, Necessary Conditions
3.4 Polynomials with Integer Coefficients
3.5 Unique Representation of Polynomials
3.6 Polynomial Factorization
3.7 Polynomials and Number Theory

4 Combinatorics
4.1 Enumeration
4.2 Tilings
4.3 Counting
4.4 Extremal Graph Theory
4.5 Probabilistic Methods
4.6 Probability
4.7 Combinations of Geometrical Objects
4.8 Pigeonhole Principle
4.9 Generating Functions

5 Number Theory
5.1 Greatest Common Divisors
5.2 Modular Arithmetic
5.3 Factorization
5.4 Fermat’s Little Theorem and Euler’s Theorem
5.5 Rules of Divisibility
5.6 Remainders
5.7 Aggregation
5.8 Equations

6 Geometry
6.1 Circles
6.2 Congruence
6.3 Similarity
6.4 Menelaus’s Theorem
6.5 Parallelograms
6.6 Power of a Point
6.7 Areas
6.8 Thales’ Theorem
7 Hints
7.1 Inequalities
7.2 Equalities and Sequences
7.3 Functions, Polynomials, and Functional Equations
7.4 Combinatorics
7.5 Number Theory
7.6 Geometry

8 Solutions
8.1 Inequalities
8.2 Equalities and Sequences
8.3 Functions, Polynomials, and Functional Equations
8.4 Combinatorics
8.5 Number Theory
8.6 Geometry

Further Reading

Index
Introduction

The book contains carefully selected problems that are challenging, yet only
require elementary mathematics. It is intended to prepare the readers for
rigorous mathematics, but neither prior preparation nor any mathematical
sophistication is required from them before reading this book. The book
guides the readers to think and express themselves in a rigorous,
mathematical way, to extract facts, analyze the problem, and identify main
challenges. Moreover, it shows how to draw appropriate, true conclusions
and helps to see a big picture. Although this is not the main goal
of this book, as a by-product, the readers are provided with a firm
foundation in a diverse range of topics that might be useful in their future
work. Finally, we often use computer support to help us get a better
intuition into discussed problems. This is still a rather uncommon approach in
mathematics but is getting more and more popular in the current
multidisciplinary and data-driven world.
The presented material can be seen as a means to bridge the gap between
introductory calculus/linear algebra courses and more advanced courses that
are offered at universities. It improves the ability to read, write, and think in
a rigorous, mature mathematical fashion. It provides a solid foundation of
various topics that would be useful for more advanced courses. However,
the book is not only intended for undergraduate students that would like to
become professional research mathematicians. In almost any
mathematically related work (such as computer programming, data science,
machine learning, economics, engineering, etc.), precise reasoning, and
understanding what logical steps need to be taken to transition from the
assumptions to the desired conclusion, are crucial to be successful.
The content of this book is also suitable for high school students that are
interested in competing in math competitions or simply for people of all
ages and backgrounds who want to expand their knowledge and to
challenge themselves with interesting questions. In fact, the problems are
mostly selected from an extensive collection of problems from Polish
Mathematical Olympics and a library of training problems from XIV High
School of Stanislaw Staszic in Warsaw (Poland).
This book is clearly not the only one of its type. There are three main
reasons for writing another book on this topic. First of all, we found that
many interesting problems appear only in the Polish language and are not
translated to other languages. Some of them are unique and might be
interesting for a broader, English speaking, audience. We feel that they
deserve to be popularized. More importantly, we grouped questions into six
chapters representing various disciplines of mathematics. Each chapter
consists of many sections devoted to a collection of related topics. Each of
these sections starts with a problem that is followed by the necessary
background (definitions and theorems used), a careful and detailed solution,
and a discussion of possible generalizations. The sections finish with a
number of additional related exercises that are solved at the end of the book.
As a result, this book can be used as a textbook for a systematic and
structured introduction to a fascinating world of high school math
competitions, or as a book preparing university students for more advanced
courses. Finally, with an increasing role of computational methods in
mathematics, we decided to show a few examples when computer aid can
be used to verify or guide the solutions to some problems. We present the
related code for a few suitable problems, discuss the implementation details
and, in the “Julia language companion” available on-line at
www.ryerson.ca/train-your-brain/, we provide a thorough introduction to
the Julia language that we use in this book, along with detailed explanations
of the codes we present.
In order to help the reader to navigate in the text, for each problem, we
clearly distinguish a few subsections, whose functions are listed below.

SOURCE

In this part, we provide the source of the problem.

PROBLEM

This part contains the statement of the problem.


THEORY
These parts are “sprinkled” across the whole book and appear “as needed”;
that is, the first time a given definition or theorem is used, it is introduced
and later on it is only referenced. Let us also mention that for some
theorems presented in the book we provide proofs (especially if they are
easy, insightful, or potentially useful for solving the problems) but often the
proofs are omitted. Similarly, some definitions are accompanied by
examples but some are not. As always, we try to select material that is
helpful for the reader to prepare for future related questions rather than
trying to be exhaustive.
SOLUTION

In this part, we provide a detailed solution. For some problems we provide
more than one solution as the aim of the book is not to simply solve all
problems, but rather to highlight the most important and common
approaches and tricks needed to be successful in solving similar questions
in the future.

REMARKS

Here we try to explain the typical thought process of the (successful)
person trying to solve the question. It is often the case in mathematics that
the solution of the question itself does not give us any insight on how it is
actually found. After reading the solution, the reader is convinced that the
claim holds but the reason for that and the process of “discovering” the
solution may remain mysterious. Hence, this section is as important as the
solution itself and should not be skipped.
EXERCISES

This part contains follow up exercises that use the same or similar concepts.
They should serve as a good test whether the reader “digested” the content
or needs more practice.
We tried to indicate the source for as many problems as possible. If the
source is omitted, it means that the problem is either our own, is well-
known, or we had it in our personal notes but were unable to recover the
original source. We also tried to make sure that it is clear whether the
solution is also taken from the source or is our own. We did our best to track
back all the sources but please contact us if we missed anything. We would
be more than happy to provide a more complete and accurate picture of the
sources of all the problems we have presented in a later edition of the book.
In particular, problems from Polish Mathematical Olympics and their
solutions are marked under the acronym PLMO. We would like here to
thank the organizers for granting us the right to use their translations in this
book. In the chapter on geometry, we have extensively used an excellent
collection of problems “Exercises in geometry” (in Polish) prepared by
Waldemar Pompe who also kindly agreed to include their translations in
this book.
If you find any errors or omissions in this book, then please kindly let us
know and we will reflect it in the errata that will be available at
www.ryerson.ca/train-your-brain/.
Finally, we would like to thank Calum MacRury for carefully reading the
manuscript, and Igor Kamiński for helping with selecting topics and
problems to include in this book.
Chapter 1
Inequalities

1.1 Convexity and Concavity


1.2 Arithmetic-Geometric Inequality
1.3 Mathematical Induction
1.4 Bernoulli's Inequality
1.5 Euler's Number
1.6 Asymptotics
1.7 Cauchy-Schwarz Inequality
1.8 Probability
1.9 Geometry

We begin the book with a chapter on inequalities. This is an exciting subject, as it very
often requires reducing the problem to some other area of mathematics which may not
initially seem to be related to the problem at hand. For example, it might turn out that one
of the sides can be interpreted as the probability that some event holds, or that the side has
some geometric interpretation.
Since this is the first chapter, let us start with introducing some basic definitions that will
be used throughout the entire book.

THEORY
Let $\mathbb{R}$ denote the set of real numbers, let $\mathbb{N} := \{1, 2, \ldots\}$ denote the set of natural numbers,
let $\mathbb{Z} := \{\ldots, -1, 0, 1, \ldots\}$ denote the set of integers, and let $\mathbb{Q} := \{a/b : a \in \mathbb{Z}, b \in \mathbb{N}\}$
denote the set of rational numbers. Let $[n]$ denote the set of the first $n$ natural numbers; that
is, $[n] := \{1, 2, \ldots, n\}$. We use subscripts $+$ and $-$ to restrict the set to positive and negative
numbers, respectively. For example, $\mathbb{R}_+$ denotes the set of positive real numbers. We will
use $\ln(x)$ to denote the natural logarithm of $x$.

1.1 Convexity and Concavity


SOURCE
Problem: XLV PLMO – Phase 1 – Problem 3
Solution: our own
PROBLEM
Prove that if a, b, and c are the lengths of the sides of a triangle, then

\[ \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \le \frac{1}{b + c - a} + \frac{1}{c + a - b} + \frac{1}{a + b - c}. \tag{1.1} \]

THEORY
Triangle Inequality If a, b, and c are the lengths of the sides of some triangle, then the
triangle inequality states that

\[ c \le a + b. \]

Note that this statement permits the inclusion of degenerate triangles; that is, when
$c = a + b$. However, usually this possibility is excluded, thus leaving out the possibility of
equality.

Cartesian Coordinate System A Cartesian coordinate system is a coordinate system that
specifies each point uniquely in a plane by a pair of numerical coordinates, which are the
signed distances to the point from two fixed perpendicular directed lines, measured in the
same unit of length. Each reference line is called a coordinate axis (plural axes) of the
system, and the point where they meet is its origin, the ordered pair (0, 0).

Convex and Concave Functions A function $f : \mathbb{R} \to \mathbb{R}$ is said to be convex on a
connected set (interval) $D \subseteq \mathbb{R}$ if for all $x_1, x_2 \in D$ and $t \in [0, 1]$, we have that

\[ f(t x_1 + (1 - t) x_2) \le t f(x_1) + (1 - t) f(x_2). \tag{1.2} \]

Intuitively, a function is convex if for all $x_1, x_2 \in D$ its graph between $x_1$ and $x_2$ lies
on or below the straight line segment connecting points $(x_1, f(x_1))$ and $(x_2, f(x_2))$. For
example, function $f(x) = 1/x$ is convex on $D = \mathbb{R}_+$ and functions $g(x) = x^2$ and
$h(x) = 2^x$ are convex on $D = \mathbb{R}$.

Similarly, a function $f : \mathbb{R} \to \mathbb{R}$ is said to be concave on a connected set (interval)
$D \subseteq \mathbb{R}$ if for all $x_1, x_2 \in D$ and $t \in [0, 1]$, we have that

\[ f(t x_1 + (1 - t) x_2) \ge t f(x_1) + (1 - t) f(x_2). \tag{1.3} \]

Examples of concave functions are: $f(x) = -2(x + 7)^2 + 4$ on $D = \mathbb{R}$ or $f(x) = \ln x$ on
$D = \mathbb{R}_+$.

Finally, let us mention that if a function is continuous, then it is enough to check that the
condition (1.2) or (1.3) holds for $t = 1/2$ in order to establish that the corresponding
function is convex or, respectively, concave. Using this fact, one can easily prove that
function $f(x) = 2^x$ is convex. Indeed, notice that for all $x, y \in \mathbb{R}$, we have

\[ 0 \le \left(2^{x/2} - 2^{y/2}\right)^2 = 2^x - 2 \cdot 2^{(x+y)/2} + 2^y. \]

Rearranging this inequality, we obtain $(2^x + 2^y)/2 \ge 2^{(x+y)/2}$, but this is precisely the
condition (1.2) with $t = 1/2$.
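The midpoint criterion can also be explored numerically. The following is a minimal sketch in Python, written purely for illustration; the helper name `is_midpoint_convex`, the sampling intervals, and the tolerance are our own assumptions. Passing the test is only evidence of convexity, not a proof.

```python
import math
import random


def is_midpoint_convex(f, lo, hi, trials=10_000, seed=0):
    """Hypothetical helper: sample pairs x, y in [lo, hi] and test
    f((x + y) / 2) <= (f(x) + f(y)) / 2, i.e. condition (1.2) with t = 1/2."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        mid_value = f((x + y) / 2)
        chord_value = (f(x) + f(y)) / 2
        # A small relative slack guards against floating-point rounding.
        if mid_value > chord_value + 1e-9 * max(1.0, abs(chord_value)):
            return False  # counterexample found: not convex on [lo, hi]
    return True  # no counterexample found among the sampled pairs


print(is_midpoint_convex(lambda x: 2.0 ** x, -10.0, 10.0))  # convex example
print(is_midpoint_convex(lambda x: 1.0 / x, 0.1, 10.0))     # convex on R_+
print(is_midpoint_convex(math.log, 0.1, 10.0))              # concave, so the test fails
```

The first two calls find no counterexample, while the logarithm, being concave, is rejected.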
SOLUTION

It follows immediately from the triangle inequality that all fractions on the right hand side of
(1.1) are positive. In particular,

\[ \frac{1}{b + c - a} + \frac{1}{c + a - b} = \frac{(c + a - b) + (b + c - a)}{(b + c - a)(c + a - b)} = \frac{2c}{c^2 - (a - b)^2} > 0. \]

Since the numerator is positive (that is, $2c > 0$), it follows that the denominator is positive
too (that is, $c^2 - (a - b)^2 > 0$). Moreover, clearly $(a - b)^2 \ge 0$, and so $c^2 - (a - b)^2 \le c^2$.
Putting all of these observations together, we get that

\[ \frac{1}{b + c - a} + \frac{1}{c + a - b} = \frac{2c}{c^2 - (a - b)^2} \ge \frac{2}{c}. \tag{1.4} \]

Moreover, the equality holds if and only if $a = b$. Similarly, we get

\[ \frac{1}{c + a - b} + \frac{1}{a + b - c} \ge \frac{2}{a}, \tag{1.5} \]

and

\[ \frac{1}{b + c - a} + \frac{1}{a + b - c} \ge \frac{2}{b}. \tag{1.6} \]

After summing the three inequalities (that is, (1.4), (1.5), and (1.6)) together and dividing
by 2, we get the desired result. We additionally notice that equality holds if and only if
a = b = c.
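As a sanity check of (1.1), one can evaluate both sides exactly on triangles with integer sides. The sketch below is only an illustration; the function names `sides_gap` and `random_triangle` and the range of side lengths are our own choices.

```python
from fractions import Fraction
import random


def sides_gap(a, b, c):
    """Right hand side minus left hand side of (1.1), computed exactly."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    lhs = 1 / a + 1 / b + 1 / c
    rhs = 1 / (b + c - a) + 1 / (c + a - b) + 1 / (a + b - c)
    return rhs - lhs


def random_triangle(rng, max_side=1000):
    """Draw integer sides until the strict triangle inequality holds."""
    while True:
        a, b, c = (rng.randint(1, max_side) for _ in range(3))
        if a < b + c and b < c + a and c < a + b:
            return a, b, c


rng = random.Random(1)  # fixed seed for reproducibility
gaps = [sides_gap(*random_triangle(rng)) for _ in range(1000)]
print(min(gaps) >= 0)       # the gap is never negative on the sample
print(sides_gap(1, 1, 1))   # equilateral triangle: equality, gap 0
print(sides_gap(3, 4, 5))   # gap 2/15 for the 3-4-5 triangle
```

Exact rational arithmetic is used so that the equality case $a = b = c$ is detected without rounding issues.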

REMARKS
Note that if one starts from the left hand side of (1.1), it is not clear how to reach the right
hand side of (1.1). (In particular, observe that 1/a + 1/b − 1/c may be greater than
1/(a + b − c); consider, for example, a = 3, b = 5, and c = 4.) However, when we look at

the right hand side, we notice that sum of any two denominators is twice some denominator
on the left hand side. This suggests that it might be easier to start from the right hand side
and try to reach the left hand side. Using our observation, it makes sense to re-write the
right hand side as

\[ \left( \frac{1/2}{b + c - a} + \frac{1/2}{c + a - b} \right) + \left( \frac{1/2}{c + a - b} + \frac{1/2}{a + b - c} \right) + \left( \frac{1/2}{b + c - a} + \frac{1/2}{a + b - c} \right) \]

and continue from there.

Let us also observe that a more general inequality in fact holds. For any function
$f : \mathbb{R} \to \mathbb{R}$ that is convex on a connected subset $D \subseteq \mathbb{R}$, it follows that

\[ f(a) + f(b) + f(c) \le f(b + c - a) + f(c + a - b) + f(a + b - c), \tag{1.7} \]

provided that $a, b, c \in D$ and that the arguments $b + c - a$, $c + a - b$, and $a + b - c$ also
lie in $D$. Our problem is a specific case when $f(x) = 1/x$, a convex function on $D = \mathbb{R}_+$.

The proof of this more general inequality follows exactly the same argument as above.
Point $A$ has coordinates $(c + a - b, f(c + a - b))$ and point
$B = (b + c - a, f(b + c - a))$. Now, point $D$ is the midpoint between $A$ and $B$; that is, $D$
has the first coordinate equal to $\frac{1}{2}(c + a - b) + \frac{1}{2}(b + c - a) = c$, and the second
coordinate equal to $\frac{1}{2} f(c + a - b) + \frac{1}{2} f(b + c - a)$. Finally, point $C$ has the same first
coordinate as $D$, but its second coordinate is equal to $f(c)$. Inequality (1.4) is a special case
of the following observation illustrated in Figure 1.1: for any convex function $f$ on
$D = [x_1, x_2]$,

\[ f\left( \frac{x_1}{2} + \frac{x_2}{2} \right) \le \frac{f(x_1)}{2} + \frac{f(x_2)}{2}. \tag{1.8} \]

We apply this observation with $x_1 = b + c - a$ and $x_2 = c + a - b$ to obtain the desired
inequality.
FIGURE 1.1: Illustration for proving (1.8).

THEORY

Jensen’s Inequality Let us point out that inequality (1.8) can be easily generalized to any
number of points $x_1, \ldots, x_n$ (not only two), and to any weights (not only one half). This
generalization is known as Jensen’s inequality and can be stated as follows: for any convex
function $f : D \to \mathbb{R}$, numbers $x_1, \ldots, x_n \in D$, and weights $a_1, \ldots, a_n \in \mathbb{R}_+$, we have
that

\[ f\left( \frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i} \right) \le \frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{i=1}^n a_i}. \]

The inequality is reversed if $f$ is concave; that is,

\[ f\left( \frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i} \right) \ge \frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{i=1}^n a_i}. \]

In both cases, equality holds if and only if $x_1 = \ldots = x_n$ or $f$ is linear.
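To make the two sides of Jensen’s inequality concrete, here is a small Python sketch that evaluates them for given points and weights. The helper name `jensen_sides` and the sample data are assumptions made for this illustration only.

```python
import math


def jensen_sides(f, xs, weights):
    """Return (f(weighted mean of xs), weighted mean of f(xs)).

    For convex f the first value is at most the second; for concave f
    the order is reversed. Weights are assumed to be positive.
    """
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_f = sum(w * f(x) for w, x in zip(weights, xs)) / total
    return f(mean_x), mean_f


xs, weights = [1.0, 2.0, 4.0], [1.0, 1.0, 2.0]
print(jensen_sides(lambda x: x * x, xs, weights))  # convex: (7.5625, 9.25)
print(jensen_sides(math.log, xs, weights))         # concave: first value is larger
```

With these weights the weighted mean of the points is 2.75, so the convex example compares $2.75^2 = 7.5625$ with $(1 + 4 + 2 \cdot 16)/4 = 9.25$.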


EXERCISES

1.1.1. Prove that for any $a, b, c \in \mathbb{R}$ such that $0 < a \le b \le c$,

\[ \frac{1}{a} - \frac{1}{b} + \frac{1}{c} \ge \frac{1}{a + c - b}. \]

Illustrate the solution graphically. Does the same inequality hold for any function
f : R → R that is convex on some connected subset of R?

1.1.2. Prove that for any $n \in \mathbb{N}$ and any real number $s \ge 2$, the following inequality holds:

\[ \frac{\sum_{k=1}^n k^s}{\sum_{k=1}^n k} \ge \left( \frac{2}{3} n + \frac{1}{3} \right)^{s-1}. \]

1.1.3. Prove that for any $x \in \mathbb{R}_+$,

\[ \sqrt{x} + \sqrt{x + 2} < 2\sqrt{x + 1}. \]

1.2 Arithmetic-Geometric Inequality


SOURCE

Problem and idea for the solution: XXXI PLMO – Phase 2 – Problem 2
PROBLEM

Show that the following inequality holds for all $x_1, \ldots, x_n \in \mathbb{R}$:

\[ \prod_{i=1}^n x_i \le \frac{1}{2^n} + \sum_{i=1}^n \frac{x_i^{2^i}}{2^i}. \tag{1.9} \]

THEORY

Geometric Sequence A geometric sequence is a sequence of numbers where each term
after the first is found by multiplying the previous one by a fixed, non-zero number called
the common ratio. For example, the sequence 5, 10, 20, 40, … is a geometric sequence
with common ratio 2. Similarly 45, 15, 5, 5/3, … is a geometric sequence with common
ratio 1/3. The common ratio of a geometric sequence may be negative, resulting in an
alternating sequence; for example, 5, −10, 20, −40, … is a geometric sequence with
common ratio −2.
The general form of a geometric sequence is $a, ar, ar^2, \ldots$, where $r \ne 0$ is the common
ratio and $a$ is a scale factor, equal to the sequence’s initial value. It follows immediately
from the definition that a geometric sequence satisfies the following recursive relation: for
every integer $i \ge 2$, $a_i = r a_{i-1}$. Hence, the $i$-th term is given by $a_i = a r^{i-1}$.

Geometric Series A geometric series is defined as $\sum_{i=1}^n a_i$. It is straightforward to see that

\[ \sum_{i=1}^n a_i = \sum_{i=1}^n a r^{i-1} = \frac{a(1 - r^n)}{1 - r}, \tag{1.10} \]

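provided $r \ne 1$. Indeed,

\begin{align*}
\sum_{i=1}^n a_i = \sum_{i=1}^n a r^{i-1} &= \frac{1}{1 - r} \cdot (1 - r) \sum_{i=1}^n a r^{i-1} \\
&= \frac{1}{1 - r} \cdot \left( (a + ar + \ldots + ar^{n-1}) - (ar + ar^2 + \ldots + ar^n) \right) \\
&= \frac{a - ar^n}{1 - r} = \frac{a(1 - r^n)}{1 - r}.
\end{align*}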
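The closed form (1.10) is easy to compare with a term-by-term sum. The following Python sketch uses exact fractions; the function names are ours and serve only as an illustration.

```python
from fractions import Fraction


def geometric_sum_direct(a, r, n):
    """Add the first n terms a, ar, ..., ar^(n-1) one by one."""
    a, r = Fraction(a), Fraction(r)
    return sum(a * r ** (i - 1) for i in range(1, n + 1))


def geometric_sum_closed(a, r, n):
    """Closed form a(1 - r^n)/(1 - r) from (1.10); requires r != 1."""
    a, r = Fraction(a), Fraction(r)
    return a * (1 - r ** n) / (1 - r)


# The two example sequences from the text: 5, 10, 20, 40 and 45, 15, 5, 5/3.
print(geometric_sum_direct(5, 2, 4), geometric_sum_closed(5, 2, 4))  # 75 75
print(geometric_sum_direct(45, Fraction(1, 3), 4),
      geometric_sum_closed(45, Fraction(1, 3), 4))                   # 200/3 200/3
```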

Arithmetic, Geometric, and Harmonic Means For any sequence of $n$ numbers
$x_1, \ldots, x_n \in \mathbb{R}$, the arithmetic mean is defined as

\[ A(x_1, \ldots, x_n) := \frac{1}{n} \sum_{i=1}^n x_i. \]

For any sequence of $n$ numbers $x_1, \ldots, x_n \in \mathbb{R}_+ \cup \{0\}$, the geometric mean is defined as

\[ G(x_1, \ldots, x_n) := \left( \prod_{i=1}^n x_i \right)^{1/n}. \]

Finally, for any sequence of $n$ numbers $x_1, \ldots, x_n \in \mathbb{R}_+$, the harmonic mean is defined as

\[ H(x_1, \ldots, x_n) := \frac{n}{\sum_{i=1}^n 1/x_i}. \]

The following inequality relates the first two means and appears to be very useful. For
any $n$ numbers $x_1, \ldots, x_n \in \mathbb{R}_+ \cup \{0\}$,

\[ A(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^n x_i \ge \left( \prod_{i=1}^n x_i \right)^{1/n} = G(x_1, \ldots, x_n). \tag{1.11} \]

The equality holds if and only if all $x_i$ are equal.


In order to verify inequality (1.11), let us first note that the inequality trivially holds if
$x_i = 0$ for some $i$. Hence, without loss of generality, we may assume that $x_i \in \mathbb{R}_+$ for all
$i \in [n]$. Moreover, to prove inequality (1.11), it is enough to show that the following
inequality holds:

\[ \ln\left( \frac{1}{n} \sum_{i=1}^n x_i \right) \ge \ln\left( \left( \prod_{i=1}^n x_i \right)^{1/n} \right) = \frac{1}{n} \cdot \ln\left( \prod_{i=1}^n x_i \right) = \frac{\sum_{i=1}^n \ln(x_i)}{n}. \tag{1.12} \]

This is because $f(x) = \ln(x)$ is an increasing function on $D = \mathbb{R}_+$. But, inequality (1.12)
follows immediately from Jensen’s inequality applied to $f(x) = \ln(x)$, a concave function
on $D = \mathbb{R}_+$, and $a_i = 1/n$ for all $i$. From this inequality, we also get that equality holds (in
both inequalities (1.11) and (1.12)) if and only if all the $x_i$ terms are equal.

Finally, let us consider the relationship between the harmonic and the geometric mean.
We claim that

\[ H(x_1, \ldots, x_n) = \frac{n}{\sum_{i=1}^n 1/x_i} \le \left( \prod_{i=1}^n x_i \right)^{1/n} = G(x_1, \ldots, x_n). \]

Indeed, by substituting $y_i = 1/x_i$, we use inequality (1.11) to get

\[ \frac{n}{\sum_{i=1}^n 1/x_i} = \left( \frac{1}{n} \sum_{i=1}^n y_i \right)^{-1} \le \left( \prod_{i=1}^n y_i \right)^{-1/n} = \left( \prod_{i=1}^n x_i \right)^{1/n}. \]

As before, equality holds if and only if all the $x_i$ terms are equal.
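The chain $H \le G \le A$ can be observed on concrete data. Below is a minimal Python sketch; the three function names are our own, and the geometric mean is computed through logarithms, which assumes strictly positive inputs.

```python
import math


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    # exp of the average logarithm; assumes every x is strictly positive
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    return len(xs) / sum(1 / x for x in xs)


xs = [1.0, 2.0, 4.0]
# Harmonic 12/7, geometric 2 (up to rounding), arithmetic 7/3.
print(harmonic_mean(xs), geometric_mean(xs), arithmetic_mean(xs))
```

For a constant sequence such as 3, 3, 3 all three means coincide, which matches the equality condition stated above.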

SOLUTION

First, let us note that, without loss of generality, we may assume that $x_i \in \mathbb{R}_+ \cup \{0\}$ for all
$i \in [n]$. Indeed, if inequality (1.9) holds for all sequences $x_1, \ldots, x_n \in \mathbb{R}_+ \cup \{0\}$, then for
any sequence $y_1, \ldots, y_n \in \mathbb{R}$, we get that

\[ \prod_{i=1}^n y_i \le \prod_{i=1}^n x_i \le \frac{1}{2^n} + \sum_{i=1}^n \frac{x_i^{2^i}}{2^i} = \frac{1}{2^n} + \sum_{i=1}^n \frac{y_i^{2^i}}{2^i}, \]

by setting $x_i = |y_i|$ for all $i \in [n]$.

We will start from the left hand side of inequality (1.9), and try to reach its right hand
side. First, note that

\[ \prod_{i=1}^n x_i = \left( 1 \cdot \prod_{i=1}^n x_i^{2^n} \right)^{1/2^n} = \left( 1 \cdot \prod_{i=1}^n x_i^{2^i \cdot 2^{n-i}} \right)^{1/2^n} = \left( 1 \cdot \prod_{i=1}^n \prod_{j=1}^{2^{n-i}} x_i^{2^i} \right)^{1/2^n}. \]

The product under the root has

\[ 1 + \sum_{i=1}^n 2^{n-i} = 1 + \sum_{j=0}^{n-1} 2^j = 1 + \frac{1 - 2^n}{1 - 2} = 1 + (2^n - 1) = 2^n \]

terms. (See equality (1.10) for the value of the geometric series.) Hence, we can apply the
theorem relating the geometric and the arithmetic mean to get that

\begin{align*}
\left( 1 \cdot \prod_{i=1}^n \prod_{j=1}^{2^{n-i}} x_i^{2^i} \right)^{1/2^n} &\le \frac{1}{2^n} \left( 1 + \sum_{i=1}^n \sum_{j=1}^{2^{n-i}} x_i^{2^i} \right) \\
&= \frac{1}{2^n} + \sum_{i=1}^n \frac{2^{n-i}}{2^n} x_i^{2^i} = \frac{1}{2^n} + \sum_{i=1}^n \frac{x_i^{2^i}}{2^i}.
\end{align*}

This finishes the proof of the result.
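Both the count of $2^n$ factors and inequality (1.9) itself can be checked numerically. The Python sketch below is an illustration under our own assumptions: the function names, the sampling range $[-2, 2]$, and the tolerance are not taken from the book.

```python
import random


def lhs(xs):
    """Left hand side of (1.9): the product of all x_i."""
    product = 1.0
    for x in xs:
        product *= x
    return product


def rhs(xs):
    """Right hand side of (1.9): 1/2^n plus the sum of x_i^(2^i) / 2^i."""
    n = len(xs)
    return 1 / 2 ** n + sum(x ** (2 ** i) / 2 ** i for i, x in enumerate(xs, start=1))


def term_count(n):
    """Number of factors under the root: one factor 1 and 2^(n-i) copies per index i."""
    return 1 + sum(2 ** (n - i) for i in range(1, n + 1))


rng = random.Random(2)  # fixed seed for reproducibility
samples = [[rng.uniform(-2.0, 2.0) for _ in range(rng.randint(1, 6))] for _ in range(2000)]
print(all(lhs(xs) <= rhs(xs) + 1e-9 * max(1.0, rhs(xs)) for xs in samples))
print([term_count(n) for n in range(1, 6)])  # [2, 4, 8, 16, 32]
print(lhs([1.0] * 4), rhs([1.0] * 4))        # equality when every x_i equals 1
```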

REMARKS

As the left hand side of inequality (1.9) is a product that is smaller than the right hand side
that is a sum of some kind, it is natural to try to apply the geometric-arithmetic inequality.
The fact that the smallest term of the right hand side is $1/2^n$ suggests that we need $2^n$
terms, and exactly one of them is equal to 1. Then, we see that $x_i^{2^i}$ is divided by $2^i$, which
means that we need $2^{n-i}$ such terms. Combining all these observations together, our goal is
to transform our inequality so that the following properties hold: 1) there are $2^n$ terms in the
sum, 2) $x_i^{2^i}$ should be present in $2^{n-i}$ identical terms. So, starting from the right hand side of
inequality (1.9), we get that

\[ \frac{1}{2^n} + \sum_{i=1}^n \frac{x_i^{2^i}}{2^i} = \frac{1}{2^n} + \sum_{i=1}^n \sum_{j=1}^{2^{n-i}} \frac{1}{2^n} x_i^{2^i}. \]

Finally, the only other thing to notice is that there are exactly $2^n$ terms added together. This
allows us to use the geometric-arithmetic inequality, thus completing the argument.

EXERCISES

1.2.1. Show that for any $a, b, c, d \in \mathbb{R}_+$, the following inequality holds:

\[ (a + b + c + d) \left( \frac{1}{a} + \frac{1}{b} + \frac{4}{c} + \frac{16}{d} \right) \ge 64. \]

When does equality hold?


(Source of the problem: Student Circle – High School of Stanisław Staszic in Warsaw.
Solution: our own.)

1.2.2. Show that for any $n$ numbers $a_1, \ldots, a_n \in \mathbb{R}_+$, the following inequality holds:

\[ \frac{a_1}{a_2 + 1} + \frac{a_2}{a_3 + 1} + \cdots + \frac{a_{n-1}}{a_n + 1} + \frac{a_n}{a_1 + 1} \ge \frac{n^2}{n + \alpha}, \]

where $\alpha = \sum_{i=1}^n 1/a_i$.

(Source of the problem: Exam – Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)
1.2.3. Prove that for any $a, b \in \mathbb{R}_+$, for which $ab = 1$, we have that

\[ a^m + b^m \ge 2, \]

where $m \in \mathbb{R}_+$.

1.3 Mathematical Induction


SOURCE

Problem and solution: well-known problem

PROBLEM

Prove that for any integer $n \ge 2$ and any sequence of $n$ real numbers $a_1, \ldots, a_n \in (1, \infty)$,

\[ \sum_{i=1}^n \frac{a_i}{\ln(a_{i+1})} \ge \sum_{i=1}^n \frac{a_i}{\ln(a_i)}, \tag{1.13} \]

where $a_{n+1} = a_1$.

THEORY

Mathematical Induction Mathematical induction is a powerful proof technique. It is
typically used to prove that a property $P(n)$ holds for every integer $n \ge n_0$, where $n_0 \in \mathbb{Z}$.
This method requires two things to be proven. First, one needs to check the base case; that
is, to prove that the property holds for the smallest number $n_0$. Second, one needs to prove
the induction step; that is, show that if the property holds for some $n \in \{n_0, n_0 + 1, \ldots\}$
(this assumption is often called the inductive hypothesis), then it holds for $n + 1$. These two
steps establish the property $P(n)$ for every integer $n \in \{n_0, n_0 + 1, \ldots\}$.
In order to illustrate the method, let us prove the following simple inequality (property
$P(n)$): $2n + 1 \le 2^n$ for all integers $n \ge 3$. The base case ($P(3)$) clearly holds:

\[ 7 = 2 \cdot 3 + 1 \le 2^3 = 8. \]

Suppose that $P(n)$ holds: $2n + 1 \le 2^n$ for some integer $n \ge 3$. We want to show that
$P(n + 1)$ holds: $2(n + 1) + 1 \le 2^{n+1}$. This is true since

\[ 2(n + 1) + 1 = (2n + 1) + 2 \le 2^n + 2 \le 2^n + 2^n = 2^{n+1}. \]

(The first inequality holds by the inductive hypothesis; the second one holds since $2 \le 2^n$ for
any $n \ge 3$.)
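A quick computation shows why the base case of this example is $n_0 = 3$ and not a smaller number. The short Python sketch below uses a helper name of our own choosing; checking finitely many cases does not replace the induction argument.

```python
def property_holds(n):
    """P(n) from the example above: 2n + 1 <= 2^n."""
    return 2 * n + 1 <= 2 ** n


print([n for n in range(0, 6) if not property_holds(n)])  # [1, 2]
print(all(property_holds(n) for n in range(3, 200)))      # True
```

The property fails for $n = 1$ and $n = 2$, so the induction has to start at 3.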

SOLUTION

We say that property $P(n)$ holds if inequality (1.13) holds for all sequences of $n$ real
numbers $a_1, \ldots, a_n \in (1, \infty)$. The following symmetry will turn out to be useful:
inequality (1.13) applied to a sequence $a_1, \ldots, a_n \in (1, \infty)$ is equivalent to its application
to the sequence $b_1, \ldots, b_n \in (1, \infty)$, where $b_i = a_{i+1}$. (Recall the convention that
$a_{n+1} = a_1$.) In particular, it implies that, without loss of generality, we may assume that $a_n$
is a smallest element in the sequence (as one can shift the initial sequence until a smallest
element is at the end).
We will prove by mathematical induction (on $n$) that $P(n)$ holds for all integers $n \ge 2$.
First, let us check the base case ($n = 2$). We need to show that property $P(2)$ holds; that
is, to prove that for any $a_1, a_2 \in (1, \infty)$, we have

\[ \frac{a_1}{\ln(a_2)} + \frac{a_2}{\ln(a_1)} \ge \frac{a_1}{\ln(a_1)} + \frac{a_2}{\ln(a_2)}. \]

Clearly, this inequality holds if and only if

\[ (a_1 - a_2) \left( \frac{1}{\ln(a_2)} - \frac{1}{\ln(a_1)} \right) \ge 0. \tag{1.14} \]

By our assumption that $a_2$ is a smallest element, we have that $a_1 - a_2 \ge 0$ and that
$1/\ln(a_2) - 1/\ln(a_1) \ge 0$, so inequality (1.14) holds and the base case is finished.
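For the induction step, assume that $P(n - 1)$ holds for some integer $n \ge 3$; that is, for
any $a_1, \ldots, a_{n-1} \in (1, \infty)$,

\[ \sum_{i=1}^{n-1} \frac{a_i}{\ln(a_i)} \le \sum_{i=1}^{n-1} \frac{a_i}{\ln(a_{i+1})} = \sum_{i=1}^{n-2} \frac{a_i}{\ln(a_{i+1})} + \frac{a_{n-1}}{\ln(a_1)}. \tag{1.15} \]

We want to show that $P(n)$ holds; that is, for any $a_1, \ldots, a_n \in (1, \infty)$,

\[ \sum_{i=1}^{n} \frac{a_i}{\ln(a_i)} \le \sum_{i=1}^{n} \frac{a_i}{\ln(a_{i+1})} = \sum_{i=1}^{n-2} \frac{a_i}{\ln(a_{i+1})} + \frac{a_{n-1}}{\ln(a_n)} + \frac{a_n}{\ln(a_1)}. \tag{1.16} \]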

Fix any sequence $a_1, \ldots, a_n \in (1, \infty)$. Without loss of generality, we may assume that $a_n$ is
a smallest element. Starting from the left hand side of inequality (1.16) and using the
inductive hypothesis (inequality (1.15)), we get that

\[ \sum_{i=1}^{n} \frac{a_i}{\ln(a_i)} = \sum_{i=1}^{n-1} \frac{a_i}{\ln(a_i)} + \frac{a_n}{\ln(a_n)} \le \sum_{i=1}^{n-2} \frac{a_i}{\ln(a_{i+1})} + \frac{a_{n-1}}{\ln(a_1)} + \frac{a_n}{\ln(a_n)}. \]

Hence, to get inequality (1.16) it is enough to show that

\[ \frac{a_{n-1}}{\ln(a_1)} + \frac{a_n}{\ln(a_n)} \le \frac{a_{n-1}}{\ln(a_n)} + \frac{a_n}{\ln(a_1)}, \]

which is equivalent to

\[ (a_{n-1} - a_n) \left( \frac{1}{\ln(a_n)} - \frac{1}{\ln(a_1)} \right) \ge 0. \]

Again, similarly to the argument used for inequality (1.14), it is straightforward to see that
this inequality holds since it is assumed that $a_n$ is a smallest element. The induction step is
finished and so is the proof.
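Inequality (1.13) can be tested on random sequences before (or after) proving it. The following Python sketch is only an illustration; the function names, the sampling range $(1.01, 50)$, and the tolerance are our own assumptions.

```python
import math
import random


def shifted_sum(a):
    """Left hand side of (1.13): a_i / ln(a_{i+1}) with the convention a_{n+1} = a_1."""
    n = len(a)
    return sum(a[i] / math.log(a[(i + 1) % n]) for i in range(n))


def aligned_sum(a):
    """Right hand side of (1.13): a_i / ln(a_i)."""
    return sum(x / math.log(x) for x in a)


rng = random.Random(3)  # fixed seed for reproducibility
ok = True
for _ in range(2000):
    a = [rng.uniform(1.01, 50.0) for _ in range(rng.randint(2, 8))]
    # Relative slack guards against floating-point rounding.
    if shifted_sum(a) < aligned_sum(a) - 1e-9 * aligned_sum(a):
        ok = False
print(ok)
```

For the two-element sequence 2, 4 the sides are $5/\ln 2$ and $4/\ln 2$, in agreement with the base case.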

REMARKS

In fact, one can prove a more general property. For any two increasing functions $f$ and $g$ on $D$
the following is true: for any integer $n \ge 2$ and any sequence of $n$ real numbers
$a_1, \ldots, a_n \in D$,

\[ \sum_{i=1}^n f(a_i) g(a_{i+1}) \le \sum_{i=1}^n f(a_i) g(a_i). \]

Similarly, for any increasing function $f$ and any decreasing function $g$ on $D$,

\[ \sum_{i=1}^n f(a_i) g(a_{i+1}) \ge \sum_{i=1}^n f(a_i) g(a_i). \]

Our question is a specific case of this general inequality when $f(x) = x$ and
$g(x) = 1/\ln(x)$.

Finally, let us mention that the above two inequalities are a direct consequence of the
following rearrangement inequality: for every two monotone sequences $x_1 \le \ldots \le x_n$ and
$y_1 \le \ldots \le y_n$,

\[ x_n y_1 + \ldots + x_1 y_n \le x_{\sigma(1)} y_1 + \cdots + x_{\sigma(n)} y_n \le x_1 y_1 + \ldots + x_n y_n, \]

where $\sigma : [n] \to [n]$ is any permutation of $[n]$. It is good to recall such general inequalities,
and that they can be proven using mathematical induction. The idea for the proof of the
initial problem then comes naturally.
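For short sequences the rearrangement inequality can be confirmed by brute force over all permutations. The Python sketch below is an illustration with a function name of our own; it is feasible only for small $n$ because it enumerates $n!$ permutations.

```python
from itertools import permutations


def rearrangement_sums(xs, ys):
    """Return (reversed-order sum, smallest sum, largest sum, same-order sum)
    taken over all permutations sigma of the sorted sequences."""
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    all_sums = [
        sum(xs[sigma[i]] * ys[i] for i in range(n))
        for sigma in permutations(range(n))
    ]
    reversed_sum = sum(x * y for x, y in zip(reversed(xs), ys))
    same_order_sum = sum(x * y for x, y in zip(xs, ys))
    return reversed_sum, min(all_sums), max(all_sums), same_order_sum


print(rearrangement_sums([1, 2, 3], [4, 5, 6]))  # (28, 28, 32, 32)
```

The smallest sum is attained by the oppositely ordered pairing and the largest by the identically ordered one, as the inequality states.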

EXERCISES

1.3.1. Prove that for any $a, b \in \mathbb{R}_+$,

\[ a^b b^a \le a^a b^b. \]
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.3.2. Prove that for any $a, b, c \in \mathbb{R}_+$,

\[ \frac{ab}{c} + \frac{bc}{a} + \frac{ca}{b} \ge a + b + c. \]
(Source of the problem: Student Circle – High School of Stanisław Staszic in Warsaw.
Solution: our own.)

1.3.3. Prove that for any $a, b, c \in \mathbb{R}_+$,

\[ a^a b^b c^c \ge (abc)^{(a+b+c)/3}. \]

1.4 Bernoulli’s Inequality


SOURCE

Problem and solution: our own

PROBLEM

Show that for any $0 < \alpha < \pi/2$,

\[ \sqrt[\sin(\alpha)]{2 - \cos^2(\alpha)} + \sqrt[\sin(\alpha)]{\cos^2(\alpha)} \ge 2. \]

THEORY

Bernoulli’s Inequality If $x > -1$, then

\[ (1 + x)^r \ge 1 + rx, \]

for $r \le 0$ or $r \ge 1$, and

\[ (1 + x)^r \le 1 + rx, \]

for $0 \le r \le 1$. Strict inequalities hold, provided that $x \ne 0$ and $r \ne 0, 1$.

SOLUTION
Fix any $0 < \alpha < \pi/2$. Recall that $\sin^2(\alpha) + \cos^2(\alpha) = 1$. Substituting

\[ r := 1/\sin(\alpha) \in (1, \infty) \]

and

\[ x := \sin^2(\alpha) = 1 - \cos^2(\alpha) \in (0, 1), \]

we get that

\[ \sqrt[\sin(\alpha)]{2 - \cos^2(\alpha)} + \sqrt[\sin(\alpha)]{\cos^2(\alpha)} = (1 + x)^r + (1 - x)^r. \]

Now, since $x > -1$, $-x > -1$, and $r > 1$, we can apply Bernoulli’s inequality to get

\[ (1 + x)^r + (1 - x)^r \ge (1 + xr) + (1 - xr) = 2. \]

This finishes the proof.
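The substitution used in the solution translates directly into a short numerical experiment. The Python sketch below evaluates $(1 + x)^r + (1 - x)^r$ with $r = 1/\sin(\alpha)$ and $x = \sin^2(\alpha)$ on a grid of angles; the function name and the grid are our own assumptions.

```python
import math


def problem_lhs(alpha):
    """Left hand side of the problem, written as (1 + x)^r + (1 - x)^r."""
    r = 1 / math.sin(alpha)   # exponent r, which exceeds 1 on (0, pi/2)
    x = math.sin(alpha) ** 2  # x = 1 - cos^2(alpha), in (0, 1)
    return (1 + x) ** r + (1 - x) ** r


# 999 angles strictly between 0 and pi/2.
grid = [k * (math.pi / 2) / 1000 for k in range(1, 1000)]
print(min(problem_lhs(alpha) for alpha in grid) >= 2)
print(problem_lhs(math.pi / 6))  # approximately 2.125, since r = 2 and x = 1/4
```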

REMARKS

Note that in this example we actually showed a slightly stronger inequality. Indeed,
although $x$ and $r$ are related to each other (both are functions of $\alpha$), the inequality is true
even if they are not related. Such an approach of trying to prove a stronger result instead of
the one we really care about is not uncommon in mathematics. It sometimes leads to a
simpler proof of the result we care about.
A tricky part in this problem is to find a substitution $x = 1 - \cos^2(\alpha)$. In order to reach
it, the first step is to check when the right hand side is equal to the left hand side, and we
immediately see that this is the case when $\cos^2(\alpha) = 1$. It is then helpful to know that a
typical trick in such cases is to consider a deviation from the equality case. From here, we
obtain:

\[ (1 + x)^{1/\sin(\alpha)} + (1 - x)^{1/\sin(\alpha)}. \]

Now, if one remembers Bernoulli’s inequality, one immediately gets that it is at least 2 as
long as $1/\sin(\alpha) > 1$. Fortunately, this is the case in our example. Alternatively, one can
use Jensen’s inequality as

\[ \frac{(2 - \cos^2(\alpha)) + \cos^2(\alpha)}{2} = 1, \]

and $a^r$ is convex for $r > 1$.

EXERCISES

1.4.1. Prove that for any integer $n > 1$,

a) $\left(\dfrac{n}{n+2}\right)^{n^2-n} < \dfrac{1}{2n-1}$,

b) $\left(\dfrac{n-1}{n}\right)^{n^2-1} < \dfrac{1}{n+2}$.
(Source of the problem: Exam by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.4.2. Prove that for any real number $x > -1$ and $n \in \mathbb{N}$,

$$\sqrt[n]{1+x} \le 1 + \frac{x}{n}.$$
(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.5 Euler’s Number


SOURCE

Problem and solution: classic, well-known, problem

PROBLEM

Let m, n be any two natural numbers such that $m > n > 2$. Prove that

$$m^n < n^m.$$

THEORY

Constant e The constant $e \approx 2.71828$ is a mathematical constant which appears in many different settings throughout mathematics. It can be defined as follows:

$$e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n \quad\text{or}\quad e = \sum_{i=0}^{\infty}\frac{1}{i!}.$$

Moreover, the constant e is the unique real number such that

$$\left(1 + \frac{1}{x}\right)^x < e < \left(1 + \frac{1}{x}\right)^{x+1}$$

for all $x \in \mathbb{R}_+$.

Note that for $n \ge -a$, it follows from the arithmetic-geometric inequality that

$$\left(1 + \frac{a}{n}\right)^{n/(n+1)} = \left(1 \cdot \prod_{i=1}^{n}\left(1 + \frac{a}{n}\right)\right)^{1/(n+1)} < \frac{1 + n\left(1 + \frac{a}{n}\right)}{n+1} = 1 + \frac{a}{n+1}.$$

So, after raising both sides to the power of $n + 1$, we get that

$$\left(1 + \frac{a}{n}\right)^n < \left(1 + \frac{a}{n+1}\right)^{n+1}. \tag{1.17}$$

It follows that the sequence $x_n := \left(1 + \frac{a}{n}\right)^n$ is eventually increasing for any $a \ne 0$.
n

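Additionally, for $a > 0$,

$$\lim_{x\to\infty}\left(1 + \frac{a}{x}\right)^x = \lim_{x\to\infty}\left(\left(1 + \frac{1}{x/a}\right)^{x/a}\right)^a = \left(\lim_{x\to\infty}\left(1 + \frac{1}{x/a}\right)^{x/a}\right)^a = e^a,$$

and similarly for $a < 0$,

$$\begin{aligned}
\lim_{x\to\infty}\left(1 + \frac{a}{x}\right)^x &= \lim_{x\to\infty}\left(\frac{x}{a+x}\right)^{-x} = \lim_{x\to\infty}\left(1 + \frac{-a}{x+a}\right)^{-x} \\
&= \left(\lim_{x\to\infty}\left(1 + \frac{-a}{x+a}\right)^{(x+a)-a}\right)^{-1} = \left(e^{-a}\right)^{-1} = e^a.
\end{aligned}$$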

There is one technical and subtle issue with the argument above. At some point, we switched from the limit of a sequence, $\lim_{n\to\infty} f(x_n)$, to the limit of a function, $\lim_{x\to\infty} f(x)$. There is a sequential characterization of limits of functions, namely, $\lim_{x\to a} f(x) = L$ ($a$ could be infinity) if and only if $\lim_{n\to\infty} f(x_n) = L$ for every sequence $(x_n)_{n\ge 1}$ such that $\lim_{n\to\infty} x_n = a$. Note that it might be the case that $\lim_{x\to\infty} f(x)$ does not exist but $\lim_{n\to\infty} f(x_n)$ does; consider, for example, $f(x) = \sin(x)$ and $x_n = \pi n$.

In our situation, as the constant e was defined by the limit of a sequence, we should have been slightly more careful and made sure we take a limit over integers. This is easy to verify after noting that

$$\left(1 + \frac{1}{\lceil n/a\rceil}\right)^{\lceil n/a\rceil - 1} \le \left(1 + \frac{1}{n/a}\right)^{n/a} \le \left(1 + \frac{1}{\lfloor n/a\rfloor}\right)^{\lfloor n/a\rfloor + 1}.$$

Since it is trivially true for $a = 0$, we can now safely claim that for any $a \in \mathbb{R}$,

$$\lim_{n\to\infty}\left(1 + \frac{a}{n}\right)^n = e^a. \tag{1.18}$$
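Both facts established above, the monotonicity in (1.17) and the limit in (1.18), are easy to observe numerically. The sketch below is only an illustration; the function name `x_n` mirrors the notation of the text, and the chosen values of a and n are arbitrary assumptions.

```python
import math

def x_n(a: float, n: int) -> float:
    """The n-th term of the sequence (1 + a/n)^n."""
    return (1.0 + a / n) ** n

# Arbitrary sample values of a; for each, the terms should approach exp(a).
for a in (1.0, -2.0, 0.5):
    values = [x_n(a, n) for n in (10, 100, 1000, 10000)]
    print(a, values, math.exp(a))
```

Floating-point arithmetic limits how large n can usefully be taken here, so the check is indicative only.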

There are many important and useful inequalities involving the constant e. We mention only a few here. For any $x \in \mathbb{R}$,

$$1 + x \le e^x. \tag{1.19}$$

To see this we note that for natural $n > -x$ we can apply Bernoulli’s inequality to get:

$$e^x \ge \left(1 + \frac{x}{n}\right)^n \ge 1 + n \cdot \frac{x}{n} = 1 + x.$$

On the other hand, for any $b \in \mathbb{R}_+$ and any $x \in [0, b]$, or for any $b \in \mathbb{R}_-$ and any $x \in [b, 0]$,

$$1 + \frac{e^b - 1}{b} \cdot x \ge e^x. \tag{1.20}$$

FIGURE 1.2: Illustration for inequalities (1.19) and (1.20).

Indeed, one can use Jensen’s inequality to show that

$$1 + \frac{e^b - 1}{b} \cdot x = \left(1 - \frac{x}{b}\right) \cdot e^0 + \frac{x}{b} \cdot e^b \ge e^{0(1 - x/b) + b(x/b)} = e^x.$$
Both inequalities are illustrated on Figure 1.2. Once we introduce asymptotic notation, we
will come back to these inequalities and prove inequalities (1.19) and (1.20) once more.
However, as appropriate for the technique, we will concentrate on values of x close to zero
and so the results obtained will be weaker.

SOLUTION

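Inequality (1.19) is all we need to solve this problem:

$$\begin{aligned}
m^n &= n^n\left(\frac{m}{n}\right)^n = n^n\left(\frac{n + m - n}{n}\right)^n = n^n\left(1 + \frac{m-n}{n}\right)^n \\
&\le n^n\left(e^{\frac{m-n}{n}}\right)^n = n^n e^{m-n} \le n^n n^{m-n} = n^m.
\end{aligned}$$

(The last inequality holds since $e < 3 \le n$; it is in fact strict because $m - n > 0$, which gives the strict inequality required in the problem.)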
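Since both sides are integers, the statement can be confirmed exactly for small cases. The sketch below is an illustration only; the helper `holds` and the search bound 40 are assumptions, and the last line shows why the hypothesis n > 2 cannot be dropped.

```python
def holds(m: int, n: int) -> bool:
    """Check m^n < n^m using exact integer arithmetic."""
    return m ** n < n ** m

# All pairs with 2 < n < m <= 40 (the bound 40 is arbitrary).
pairs = [(m, n) for n in range(3, 40) for m in range(n + 1, 41)]
print(all(holds(m, n) for m, n in pairs))

# For n = 2 the inequality can fail: 3^2 > 2^3 and 4^2 = 2^4.
print(holds(3, 2), holds(4, 2))
```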

REMARKS

In many problems that involve power functions, we reach terms that can be expressed in the form $(1 + x/n)^n$. Then, it is often useful to remember that such terms (treated as sequences or functions of n with x fixed) are eventually increasing (see (1.17)), but bounded from above by $e^x$.
EXERCISES

1.5.1. Prove that for any integer $n > 2$,

$$\left(\frac{n}{n+2}\right)^{n^2-n} < \frac{1}{4^{n-1}}.$$

Can the constant 4 be improved for large n?

1.5.2. For which $n \in \mathbb{N}$ do we have that

$$\sqrt[n]{n} > \sqrt[n+1]{n+1}\,?$$

(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.5.3. For which $n \in \mathbb{N}$ do we have that

$$(n-1)^n (n+1)^{n+1} > n^{2n+1}\,?$$

1.5.4. For which $n \in \mathbb{N}$ do we have that

$$n^n > (n+1)^{n-1}\,?$$

(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.5.5. Prove that for any $n \in \mathbb{N}$,

$$2 \le \left(1 + \frac{1}{n}\right)^n \le 3 \cdot \frac{n+1}{n+2}.$$

(Source of the problem: Lecture by Paweł Bechler – High School of Stanisław Staszic in
Warsaw. Solution: our own.)

1.6 Asymptotics
SOURCE

Problem and solution: well-known problem

PROBLEM

Check for which $n \in \mathbb{N}$ the following statement holds: for all $x \in \mathbb{R}_+ \cup \{0\}$,

$$(1 + x)^n \ge 1 + nx + \frac{(nx)^2}{2}.$$
THEORY
Binomial Theorem The binomial theorem can be written as follows: for any $n \in \mathbb{N}$ and any $x, y \in \mathbb{R}$,

$$(x + y)^n = \sum_{i=0}^{n}\binom{n}{i}x^{n-i}y^i.$$

Asymptotic Notation Let f (x) and g(x) be any two functions. In our applications, f (x) is
usually a complicated function whose behavior we would like to understand, and g(x) has a
simple form, and is positive for large enough x. We write:

f (x) = O(g(x)) if there exists a positive constant C such that for all
sufficiently large x we have that |f (x)| ≤ C|g(x)|,

f (x) = Ω(g(x)) if there exists a positive constant c such that for all
sufficiently large x we have that |f (x)| ≥ c|g(x)|,

f (x) = Θ(g(x)) if f (x) = O(g(x)) and f (x) = Ω(g(x)),

f (x) = o(g(x)) if lim x→∞ f (x)/g(x) = 0 ,

f (x) = ω(g(x)) if lim x→∞ |f (x)|/|g(x)| = ∞ ,

f (x) ∼ g(x) if lim x→∞ f (x)/g(x) = 1 .

As a simple example, consider $f(x) = 3x^{5/2} + 10x^2 + 10^{10}x\ln x$. For some moderate values of x, the last two terms are dominant but eventually the first one becomes much larger than both of them. Clearly, for any $x \ge 1$,

$$f(x) = 3x^{5/2} + 10x^2 + 10^{10}x\ln x \le 3x^{5/2} + 10x^{5/2} + 10^{10}x^{5/2} \le 10^{11}x^{5/2};$$

it follows that $f(x) = O(x^{5/2})$. On the other hand, for any $x \ge 1$,

$$f(x) = 3x^{5/2} + 10x^2 + 10^{10}x\ln x \ge 3x^{5/2};$$

it follows that $f(x) = \Omega(x^{5/2})$ and so $f(x) = \Theta(x^{5/2})$. It is straightforward to see that, say, $f(x) = o(x^3)$ but $f(x) = \omega(x^2)$; that is, $f(x)$ is negligible compared to $x^3$ but grows faster than $x^2$. Finally, note that $f(x) \sim 3x^{5/2}$ as

$$\lim_{x\to\infty}\frac{f(x)}{3x^{5/2}} = \lim_{x\to\infty}\left(1 + \frac{10x^2}{3x^{5/2}} + \frac{10^{10}x\ln x}{3x^{5/2}}\right) = 1.$$

Alternatively, we could have observed that

$$f(x) = 3x^{5/2} + O(x^2) + O(x\ln x) = 3x^{5/2} + O(x^2) = 3x^{5/2} + o(x^{5/2}) \sim 3x^{5/2}.$$
Here are some useful properties.

O(f (x)) + O(g(x)) = O(|f (x)| + |g(x)|) = O(max{|f (x)|, |g(x)|}) ,

O(f (x)) ⋅ O(g(x)) = O(f (x) ⋅ g(x)) ,

Ω(f (x)) ⋅ Ω(g(x)) = Ω(f (x) ⋅ g(x)) ,

if f : N → R and f (n) = O(1), then f (n) is bounded by a constant.

One needs to be careful when working with asymptotic notation, as the notation presents many counterintuitive properties. For example, note that we cannot deduce that $\Omega(f(n)) + \Omega(g(n)) = \Omega(f(n))$. Indeed, if f and g are of the same order, it might not be true: $7n^2 + 3n = \Omega(n^2)$ and $-7n^2 + 10^{10}\ln n = \Omega(n^2)$ but

$$(7n^2 + 3n) + (-7n^2 + 10^{10}\ln n) = 3n + 10^{10}\ln n = \Omega(n),$$

and this sum is not $\Omega(n^2)$.

The above definitions and examples assume that x → ∞. However, sometimes we would like to understand the behavior of some function f (x) when x → 0. The notation introduced above can be easily adjusted to this situation. For example, we write f (x) = O(g(x)) if there exist positive constants C and M such that for all x ∈ (0, M ) we have |f (x)| ≤ C|g(x)|.


To illustrate this variant, let us come back to inequalities (1.19) and (1.20), and show their asymptotic counterparts. However, before we move to this task, let us show the following well-known fact: for any $x \in \mathbb{R}$,

$$e^x = \sum_{i=0}^{\infty}\frac{x^i}{i!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots. \tag{1.21}$$

For $x = 0$ it trivially holds. Now fix any $x \in \mathbb{R} \setminus \{0\}$. Our goal is to show that $\lim_{n\to\infty} s_n = e^x$, where

$$s_n := \sum_{i=0}^{n}\frac{x^i}{i!}.$$
i=0

We will relate the sequence $s_n$ to the sequence $e_n$ defined as

$$e_n := \left(1 + \frac{x}{n}\right)^n,$$

for $n \in \mathbb{N}$. We also recall equality (1.18), whose importance will soon be seen:

$$e^x = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n = \lim_{n\to\infty} e_n.$$

For a fixed $n \in \mathbb{N}$, we use the binomial theorem to get that

$$e_n := \left(1 + \frac{x}{n}\right)^n = \sum_{i=0}^{n}\binom{n}{i}\frac{x^i}{n^i} = \sum_{i=0}^{n}\frac{x^i}{i!} \cdot \frac{n!}{(n-i)!\,n^i}.$$

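We may assume that $n \ge 3|x|$ is a large enough (but fixed) integer so that for any integer $i \ge n$ we have

$$0 \le \frac{|x|^{i+1}/(i+1)!}{|x|^i/i!} = \frac{|x|}{i+1} \le \frac{1}{3}.$$

This implies, by the formula for the sum of a geometric series with ratio 1/3 (see equality (1.10)), that

$$\begin{aligned}
\left|\sum_{j=n+1}^{m}\frac{x^j}{j!} \cdot \frac{m!}{(m-j)!\,m^j}\right| &\le \sum_{j=n+1}^{m}\frac{|x|^j}{j!} \cdot \frac{m!}{(m-j)!\,m^j} \\
&= \sum_{j=n+1}^{m}\frac{|x|^j}{j!} \cdot \frac{m}{m} \cdot \frac{m-1}{m} \cdots \frac{m-j+1}{m} \le \sum_{j=n+1}^{m}\frac{|x|^j}{j!} \\
&\le \sum_{j=n+1}^{m}\frac{|x|^n}{n!} \cdot \frac{1}{3^{j-n}} \le \frac{|x|^n}{2\,n!},
\end{aligned}$$

for any integer $m > n$. As a result, for any integer $m > n \ge 3|x|$,

$$\left|e_m - \sum_{i=0}^{n}\frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\,m^i}\right| \le \frac{|x|^n}{2\,n!}. \tag{1.22}$$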

Next, observe that $s_n$ can be made arbitrarily close to $\sum_{i=0}^{n}\frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\,m^i}$ by making sure that m is large enough. To see this, note that there is a finite number of terms in the sum (namely, $n + 1$), and that $\frac{m!}{(m-i)!\,m^i}$ tends to 1 as m tends to infinity. Therefore, as x and n are fixed, we can choose m large enough (that is, $m = f(n) \in \mathbb{N}$ for some function f) to ensure that

$$\left|s_n - \sum_{i=0}^{n}\frac{x^i}{i!} \cdot \frac{m!}{(m-i)!\,m^i}\right| \le \frac{|x|^n}{2\,n!}. \tag{1.23}$$

(Let us comment that we choose $\frac{|x|^n}{2\,n!}$ as a convenient upper bound as it matches the bound in (1.22); however, here we could use any bound that tends to zero as $n \to \infty$.) Now, combining inequalities (1.22) and (1.23), we get that

$$|s_n - e_m| \le \frac{|x|^n}{n!}.$$
Finally, as $e_m \to e^x$ as $m \to \infty$, for m large enough (that is, possibly after adjusting function f), we are guaranteed that

$$|e_m - e^x| \le \frac{|x|^n}{n!}.$$

It follows that

$$|s_n - e^x| \le \frac{2|x|^n}{n!},$$

which finishes the argument, as it implies that $s_n \to e^x$ as $n \to \infty$ (since, clearly, $2|x|^n/n! \to 0$ as $n \to \infty$).
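The final bound on the distance between $s_n$ and $e^x$ can be compared with actual partial sums. The sketch below is an illustration under stated assumptions: the helper `partial_sum` and the sample value x = -2.5 are hypothetical choices, and only indices with n ≥ 3|x| are used, as in the argument above.

```python
import math

def partial_sum(x: float, n: int) -> float:
    """s_n = sum of x^i / i! for i = 0, ..., n."""
    term, total = 1.0, 1.0
    for i in range(1, n + 1):
        term *= x / i          # turns x^(i-1)/(i-1)! into x^i/i!
        total += term
    return total

x = -2.5                       # arbitrary sample point, so 3|x| = 7.5
for n in (8, 12, 16, 20):      # every n here satisfies n >= 3|x|
    error = abs(partial_sum(x, n) - math.exp(x))
    bound = 2 * abs(x) ** n / math.factorial(n)
    print(n, error, bound)
```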

Now, to see that inequality (1.19) holds asymptotically, we use (1.21) and note that

$$e^x = 1 + x + \frac{x^2}{2} + O(x^3) \ge 1 + x + \frac{x^2}{4} \ge 1 + x,$$

provided that $x \in \mathbb{R}$ is sufficiently close to zero. (Of course, it holds for all $x \in \mathbb{R}$ but the aim here is to understand the behavior around zero.) Similarly, to see that the first part of inequality (1.20) holds asymptotically, note that for any $\epsilon > 0$,

$$e^x = 1 + x + O(x^2) \le 1 + x + \epsilon x = 1 + (1 + \epsilon)x,$$

again, provided that $x \in \mathbb{R}_+$ is small enough.

SOLUTION

We will prove that the statement does not hold for any $n \in \mathbb{N}$; that is, for any $n \in \mathbb{N}$, there exists $x \ge 0$ such that

$$f(x) := 1 + nx + \frac{(nx)^2}{2} - (1 + x)^n > 0.$$

Clearly, for $n = 1$, we have

$$f(x) = 1 + x + \frac{x^2}{2} - (1 + x) = \frac{x^2}{2} > 0$$

for every $x > 0$. Similarly, for $n = 2$, we get that for any $x > 0$

$$f(x) = 1 + 2x + \frac{(2x)^2}{2} - (1 + x)^2 = x^2 > 0.$$
Now, let us fix any $n \ge 3$. This time we need a more sophisticated argument, as the statement clearly holds for large enough x but also for $x = 0$. However, it fails for x sufficiently close to zero (but not equal to zero). Using the binomial theorem, we get that

$$\begin{aligned}
f(x) &= 1 + nx + \frac{(nx)^2}{2} - \sum_{i=0}^{n}\binom{n}{i}x^i \\
&= 1 + nx + \frac{n^2}{2} \cdot x^2 - \left(1 + nx + \frac{n(n-1)}{2} \cdot x^2 + \sum_{i=3}^{n}\binom{n}{i}x^i\right) \\
&= \frac{n}{2} \cdot x^2 - \sum_{i=3}^{n}\binom{n}{i}x^i.
\end{aligned} \tag{1.24}$$

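Noting that for any $i \in \{3, \dots, n\}$, $\binom{n}{i} \le \binom{n}{\lfloor n/2\rfloor} = \binom{n}{\lceil n/2\rceil}$, we get that for any $x \in [0, 1/2]$,

$$\begin{aligned}
f(x) &= \frac{n}{2} \cdot x^2 - \sum_{i=3}^{n}\binom{n}{i}x^i \ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2\rfloor}\sum_{i=3}^{n}x^i \\
&\ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2\rfloor}x^3\sum_{i=0}^{n-3}x^i \ge \frac{n}{2} \cdot x^2 - \binom{n}{\lfloor n/2\rfloor}x^3\sum_{i=0}^{\infty}\left(\frac{1}{2}\right)^i \\
&= \frac{n}{2} \cdot x^2 - 2\binom{n}{\lfloor n/2\rfloor}x^3 = x^2\left(\frac{n}{2} - 2\binom{n}{\lfloor n/2\rfloor}x\right).
\end{aligned}$$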

Finally, we set $x_0 = n/\left(8\binom{n}{\lfloor n/2\rfloor}\right) \in (0, 1/2]$ to get the desired counter-example, that is,

$$f(x_0) \ge x_0^2\,(n/2 - n/4) = x_0^2 \cdot n/4 > 0.$$
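The counter-example can be verified with exact rational arithmetic, which avoids any doubt about rounding for such small values of x. The sketch below is illustrative; the helper names `f` and `x0` follow the notation of the solution, and the range of n checked is an arbitrary assumption.

```python
from fractions import Fraction
from math import comb

def f(n: int, x: Fraction) -> Fraction:
    """f(x) = 1 + n x + (n x)^2 / 2 - (1 + x)^n, computed exactly."""
    return 1 + n * x + (n * x) ** 2 / 2 - (1 + x) ** n

def x0(n: int) -> Fraction:
    """The point x_0 = n / (8 * C(n, floor(n/2))) chosen in the solution."""
    return Fraction(n, 8 * comb(n, n // 2))

# A few sample values of n >= 3: f(x_0) should be strictly positive.
for n in (3, 5, 10):
    print(n, x0(n), f(n, x0(n)) > 0)
```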

REMARKS

Let us come back to our original question. Using the asymptotic notation (when x → 0), one can continue the argument from equation (1.24) as follows, avoiding tedious calculations. Indeed, observe that

$$\begin{aligned}
f(x) &= \frac{n}{2} \cdot x^2 - \sum_{i=3}^{n}\Theta(x^i) = \frac{n}{2} \cdot x^2 - \Theta(x^3) \\
&= \frac{n}{2} \cdot x^2\,(1 + \Theta(x)) \sim \frac{n}{2} \cdot x^2.
\end{aligned} \tag{1.25}$$

(Recall that n is a possibly large but fixed constant.) Hence, for sufficiently small x, we get $f(x) > 0$ and we are done. Actually, this asymptotic analysis suggests how to formalize the argument in the proof above, which only adds that we choose a specific (small) value for x to avoid asymptotic notation.
As mentioned in the theory part, we used above a non-standard notation when x → 0. Of course, it is possible to avoid it and use a standard one with y → ∞ by letting x := 1/y → 0. Then, instead of (1.25), we have

$$\begin{aligned}
f(x) = f(1/y) &= \frac{n}{2} \cdot (1/y)^2 - \sum_{i=3}^{n}\Theta\left((1/y)^i\right) \\
&= \frac{n}{2} \cdot (1/y)^2 - \Theta\left((1/y)^3\right) \sim \frac{n}{2} \cdot (1/y)^2.
\end{aligned}$$

The desired conclusion holds for sufficiently large y.


EXERCISES

1.6.1. Show that for any $n \in \mathbb{N}$, there exists a non-negative $x \in \mathbb{R}$ such that

$$\prod_{i=1}^{n}(1 + x)^i < 1 + \frac{n^2 + n + 1}{2}\,x.$$

1.6.2. Prove that for any polynomial $W(x)$ and sufficiently large x we have that $(1 + x/n)^n > W(x)$, if $n \in \mathbb{N}$ is greater than the degree of W. What does it tell us about the function $e^x$?

1.7 Cauchy-Schwarz Inequality


SOURCE

Problem: LXIX PLMO – Phase 1 – Problem 10

Solution: our own

PROBLEM

Prove that for any integer $n \ge 3$ and any sequence of n numbers $x_1, \dots, x_n \in \mathbb{R}_+$,

$$\frac{1 + x_1^2}{x_2 + x_3} + \frac{1 + x_2^2}{x_3 + x_4} + \dots + \frac{1 + x_{n-2}^2}{x_{n-1} + x_n} + \frac{1 + x_{n-1}^2}{x_n + x_1} + \frac{1 + x_n^2}{x_1 + x_2} \ge n. \tag{1.26}$$

THEORY

Cauchy-Schwarz Inequality The Cauchy-Schwarz inequality is an elementary inequality, and at the same time a powerful observation, which can be stated as follows. For any two sequences $a_1, \dots, a_n \in \mathbb{R}$ and $b_1, \dots, b_n \in \mathbb{R}$,

$$\left(\sum_{i=1}^{n}a_i^2\right)\left(\sum_{i=1}^{n}b_i^2\right) \ge \left(\sum_{i=1}^{n}a_i b_i\right)^2;$$

equality holds if and only if the two sequences are proportional; that is, there exists a constant $c \in \mathbb{R}$ such that $a_i = cb_i$ for all $i \in [n]$.

There are at least 12 different proofs of this inequality; here we present an elementary one. Note that

$$\begin{aligned}
0 &\le \sum_{i=1}^{n}\sum_{j=1}^{n}(a_i b_j - a_j b_i)^2 = \sum_{i=1}^{n}\sum_{j=1}^{n}\left(a_i^2 b_j^2 - 2a_i a_j b_i b_j + a_j^2 b_i^2\right) \\
&= \sum_{i=1}^{n}a_i^2\sum_{j=1}^{n}b_j^2 - 2\sum_{i=1}^{n}a_i b_i\sum_{j=1}^{n}a_j b_j + \sum_{i=1}^{n}b_i^2\sum_{j=1}^{n}a_j^2 \\
&= 2\left(\sum_{i=1}^{n}a_i^2\right)\left(\sum_{i=1}^{n}b_i^2\right) - 2\left(\sum_{i=1}^{n}a_i b_i\right)^2,
\end{aligned}$$

which immediately implies the desired inequality.

Titu’s Lemma The next inequality, known as Titu’s lemma, is a direct consequence of the Cauchy-Schwarz inequality. For any two sequences $x_1, \dots, x_n \in \mathbb{R}$ and $y_1, \dots, y_n \in \mathbb{R}_+$,

$$\sum_{i=1}^{n}\frac{x_i^2}{y_i} \ge \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{\sum_{i=1}^{n}y_i}.$$

It is obtained by applying the substitutions $a_i = x_i/\sqrt{y_i}$ and $b_i = \sqrt{y_i}$ into the Cauchy-Schwarz inequality (the square roots require the $y_i$ to be positive).

SOLUTION

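First, let us note that the left hand side of (1.26) is equal to $A_1 + A_2$, where

$$\begin{aligned}
A_1 &:= \frac{1^2}{x_2 + x_3} + \frac{1^2}{x_3 + x_4} + \dots + \frac{1^2}{x_{n-1} + x_n} + \frac{1^2}{x_n + x_1} + \frac{1^2}{x_1 + x_2}, \\
A_2 &:= \frac{x_1^2}{x_2 + x_3} + \frac{x_2^2}{x_3 + x_4} + \dots + \frac{x_{n-2}^2}{x_{n-1} + x_n} + \frac{x_{n-1}^2}{x_n + x_1} + \frac{x_n^2}{x_1 + x_2}.
\end{aligned}$$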

It follows from Titu’s lemma (applied to two sequences, $1, 1, \dots, 1$ and $x_2 + x_3, x_3 + x_4, \dots, x_n + x_1, x_1 + x_2$) that

$$A_1 \ge \frac{\left(\sum_{i=1}^{n}1\right)^2}{\sum_{i=1}^{n}(x_{i+1} + x_{i+2})} = \frac{n^2}{2\sum_{i=1}^{n}x_i};$$

as before, we used the convention that $x_{n+1} = x_1$ and $x_{n+2} = x_2$. Applying the lemma one more time (this time the first sequence is $x_1, x_2, \dots, x_n$), we get that

$$A_2 \ge \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{\sum_{i=1}^{n}(x_{i+1} + x_{i+2})} = \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{2\sum_{i=1}^{n}x_i} = \frac{1}{2}\sum_{i=1}^{n}x_i.$$

Hence,

$$A_1 + A_2 \ge \frac{1}{2}\left(\frac{n^2}{\sum_{i=1}^{n}x_i} + \sum_{i=1}^{n}x_i\right) \ge \left(\frac{n^2}{\sum_{i=1}^{n}x_i} \cdot \sum_{i=1}^{n}x_i\right)^{1/2} = n,$$

where the last inequality follows from the arithmetic-geometric mean inequality.
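A quick randomized experiment can build confidence in inequality (1.26) and in the conjectured equality case. The following is a sketch only; the helper `cyclic_sum`, the seed, and the sampling ranges are assumptions made for illustration, and random testing is of course no substitute for the proof.

```python
import random

def cyclic_sum(xs):
    """Left-hand side of (1.26) with indices taken cyclically."""
    n = len(xs)
    return sum((1 + xs[i] ** 2) / (xs[(i + 1) % n] + xs[(i + 2) % n])
               for i in range(n))

random.seed(0)  # fixed seed so that the run is reproducible
smallest_gap = float("inf")
for _ in range(2000):
    n = random.randint(3, 8)
    xs = [random.uniform(0.01, 5.0) for _ in range(n)]
    smallest_gap = min(smallest_gap, cyclic_sum(xs) - n)

print(smallest_gap)             # smallest observed value of LHS - n
print(cyclic_sum([1.0] * 5))    # the case x_1 = ... = x_n = 1
```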

REMARKS

If one remembers Titu’s lemma, then the solution comes to mind naturally after noticing
that fractions on the left hand side of (1.26) contain squares in their numerators but there
are no squares in their denominators. However, the question is what one needs to do
without knowing the lemma (or without realizing that it can be applied to this problem).
Here is an elementary argument. Observe that

$$(2 - b)^2 + (2a - b)^2 \ge 0, \tag{1.27}$$

which is equivalent to

$$1 + a^2 \ge b + ab - b^2/2,$$

and if $b > 0$, we get that

$$\frac{1 + a^2}{b} \ge 1 + a - \frac{b}{2}. \tag{1.28}$$

Now, after substituting $a = x_i$ and $b = x_{i+1} + x_{i+2}$, we immediately get the desired inequality:

$$\sum_{i=1}^{n}\frac{1 + x_i^2}{x_{i+1} + x_{i+2}} \ge \sum_{i=1}^{n}\left(1 + x_i - \frac{x_{i+1} + x_{i+2}}{2}\right) = n.$$

The only question remaining is how to guess the starting point, that is, inequality (1.27). One possible line of reasoning is as follows. We need to deal with the sum of n fractions, each of the form $\frac{1 + a^2}{b}$, where $a = x_i > 0$ and $b = x_{i+1} + x_{i+2} > 0$. We observe that, although it is possible that some fractions are close to zero, the sum has to be large. Indeed, if one fraction is small, then its denominator must be large and so we expect the following fractions to be large. Our hope is that this will balance out and, on average, fractions have values at least 1. In fact, it is natural to conjecture that the left hand side of inequality (1.26) reaches its minimum for $x_1 = \dots = x_n = 1$. But how do we turn it into a formal argument? Since the quadratic function is not the easiest to work with, the goal is to bound $f(a) = \frac{1 + a^2}{b}$ from below by a simpler, linear, function $g(a) = ca + d$ with similar behavior, namely, if one term is small the other terms are forced to be large. It makes sense to make an approximation as tight as possible, so we want the line $g(a)$ to touch the parabola $f(a)$. However, what should be the touching point? The answer is relatively easy: as already mentioned, the original inequality (1.26) is tight when all $x_i$ are equal (in fact, all are equal to 1). This implies that we want $a = b/2$ and so a touching point should be $(b/2, 1/b + b/4)$; see Figure 1.3 for an illustration. Hence,

$$g(a) = c(a - b/2) + 1/b + b/4.$$

FIGURE 1.3: Illustration for tuning functions f (a) and g(a).

It remains to calculate the constant c. Since we want $f(a) \ge g(a)$, the function

$$f(a) - g(a) = \frac{1}{b} \cdot a^2 - c \cdot a + \frac{2cb - b}{4}$$

should be a quadratic function with its discriminant equal to zero. It follows that

$$c^2 - 4 \cdot \frac{1}{b} \cdot \frac{2cb - b}{4} = c^2 - 2c + 1 = 0,$$

and so $c = 1$. We get that $f(a) \ge g(a) = a - b/4 + 1/b$. Finally, since

$$\frac{b}{4} + \frac{1}{b} \ge 2\sqrt{\frac{1}{b} \cdot \frac{b}{4}} = 1,$$

we get that

$$\frac{1 + a^2}{b} \ge a - \frac{b}{4} + \frac{1}{b} = \frac{b}{4} + \frac{1}{b} + a - \frac{b}{2} \ge 1 + a - \frac{b}{2},$$

which is what we need to finish the proof; see (1.28).

EXERCISES

1.7.1. Prove that for $a, b, c \in \mathbb{R}_+$,

$$(a + b + c)\left(\frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right) \ge 9.$$

(Source of the problem: inspired by problem PLMO II – Phase 1 – Problem 6. Solution: our
own.)

1.7.2. Prove that for any $a, b, c \in \mathbb{R}_+$ such that $a + b + c = 1$, we have that

$$\sqrt{2a + 1} + \sqrt{2b + 1} + \sqrt{2c + 1} \le \sqrt{15}.$$

(Source of the problem: Student Circle – High School of Stanisław Staszic in Warsaw.
Solution: our own.)

1.7.3. Prove that if $a, b, c \in \mathbb{R}$ are such that $a + b + c = 1$ and $\min\{a, b, c\} \ge -3/4$, then

$$\frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{9}{10}.$$
Does this inequality hold without the additional assumption that min{a, b, c} ≥ −3/4?
(Source of the problem and solution: XLVII OM – Phase 2 – Problem 3.)

1.8 Probability
SOURCE

Problem and solution: inspired by XLV PLMO – Phase 1 – Problem 10

PROBLEM

Prove that for any $x, y \in \mathbb{R}_+$ and any $m, n \in \mathbb{N}$,

$$\left((x + y)^m - x^m\right)^n + \left((x + y)^n - y^n\right)^m \ge (x + y)^{nm}. \tag{1.29}$$

In particular, when $x = y = 1$ and $n = m \in \mathbb{N}$, we get that

$$2\left(2^n - 1\right)^n \ge 2^{n^2}. \tag{1.30}$$

THEORY

Boole’s Inequality The following elementary fact, known as Boole’s inequality but also as the union bound, is very useful. For any collection of events $A_1, \dots, A_n$ in some probability space,

$$\mathbb{P}\left(\bigcup_{i=1}^{n}A_i\right) \le \sum_{i=1}^{n}\mathbb{P}(A_i). \tag{1.31}$$

We note that this inequality is sharp, since the equality holds for disjoint events.

SOLUTION
Proving the special case, inequality (1.30), is relatively easy. After dividing both sides by $2^{n^2+1}$, we get an equivalent inequality

$$\left(1 - \frac{1}{2^n}\right)^n \ge \frac{1}{2}$$

that, in turn, after raising both sides to the power of $2^n/n$, is equivalent to

$$\left(1 - \frac{1}{2^n}\right)^{2^n} \ge \frac{1}{2^{2^n/n}}. \tag{1.32}$$

Now, we notice that the left hand side of inequality (1.32) is an increasing function of n (see inequality (1.17)). On the other hand, it is obvious that the right hand side of inequality (1.32) is a decreasing function of n. Hence the desired inequality holds if it holds for the smallest natural number, that is, for $n = 1$. For $n = 1$, both sides of inequality (1.32) are equal to 1/4 and so we are done.
The proof of inequality (1.29) is more challenging. We start with dividing both sides by $(x + y)^{nm}$ and setting $p = x/(x + y) \in (0, 1)$ to get an equivalent inequality: for any $p \in (0, 1)$,

$$f(p) := (1 - p^m)^n + (1 - (1 - p)^n)^m \ge 1. \tag{1.33}$$

We are going to introduce a random process and two events, A and B, such that

$$\mathbb{P}(A) = (1 - p^m)^n, \tag{1.34}$$

$$\mathbb{P}(B) = (1 - (1 - p)^n)^m, \tag{1.35}$$

and argue that no matter what the outcome of the process is, at least one of the two events must hold. This will finish the proof, as then

$$1 = \mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B) \le \mathbb{P}(A) + \mathbb{P}(B).$$

The last inequality is a specific case of Boole’s inequality (see inequality (1.31)).


Suppose that there are n students and m elective courses. Each student s is taking course
c with probability p; all n ⋅ m events are independent of one another. Let A be the event
that none of the students take all the courses. It is straightforward to see that equality (1.34)
holds. Let B be the event that each course has at least one student taking it. Again, it is clear
that equality (1.35) holds. The last piece missing is to show that P(A ∪ B) = 1. Suppose
that B does not hold; that is, some course c is not taken by any student. This means that no
student takes all the courses, and so A holds. Thus, at least one of the two events must hold,
and so the proof is finished.
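For small n and m, the probabilities of A, B, and their union can be computed exactly by enumerating every possible enrolment table. The sketch below is illustrative only: the function `event_probabilities` and the sample values of n, m, p are assumptions, and the enumeration grows as $2^{nm}$, so it is feasible only for tiny cases.

```python
from fractions import Fraction
from itertools import product

def event_probabilities(n: int, m: int, p: Fraction):
    """Exact (P(A), P(B), P(A or B)) for n students and m courses."""
    pa = pb = pab = Fraction(0)
    for bits in product((0, 1), repeat=n * m):
        # One row per student; entry 1 means the student takes that course.
        table = [bits[s * m:(s + 1) * m] for s in range(n)]
        ones = sum(bits)
        weight = p ** ones * (1 - p) ** (n * m - ones)
        a = all(not all(row) for row in table)                    # event A
        b = all(any(row[c] for row in table) for c in range(m))   # event B
        if a:
            pa += weight
        if b:
            pb += weight
        if a or b:
            pab += weight
    return pa, pb, pab

p = Fraction(1, 3)   # arbitrary sample probability
print(event_probabilities(2, 2, p))
```

Comparing the first two returned values with the closed forms in (1.34) and (1.35) is a useful check that the events were modelled as intended.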
REMARKS

It is immediately apparent that the inequality can be reduced to one variable. Then, the question is what is the most convenient way to do it. One natural approach is to make the right hand side of inequality (1.29) a constant. This directly gives

$$\left(1 - \left(\frac{x}{x + y}\right)^m\right)^n + \left(1 - \left(\frac{y}{x + y}\right)^n\right)^m \ge 1.$$

If x/(x + y) is now the probability of some event, then y/(x + y) is the probability of its complement. So we try to find a process involving n ⋅ m events, and two associated events.
Alternatively, one can solve this problem analytically. If $n = 1$ or $m = 1$, then the left hand side of inequality (1.29) is equal to its right hand side. Hence, we need to concentrate on the case $\min\{n, m\} > 1$. It will be more convenient to focus on inequality (1.33). The left hand side of inequality (1.33), function $f(p)$, has two terms: $(1 - p^m)^n$ and $(1 - (1 - p)^n)^m$. Clearly, $f(0) = f(1) = 1$. The first term is a decreasing function of p, the second one is an increasing one, but it is not clear what the behavior of the sum of the two is. In order to show that $f(p) > 1$ for $p \in (0, 1)$, we will show that $f(p) > 1$ for some $p \in (0, 1)$ and that there is one extremum in the interval $(0, 1)$.

For the first property, note that for given n and m that are greater than 1, if p tends to zero, then

$$\begin{aligned}
(1 - p^m)^n + (1 - (1 - p)^n)^m &= \left(1 - np^m + O(p^{2m})\right) + \left(1 - \left(1 - np + O(p^2)\right)\right)^m \\
&= \left(1 - np^m + O(p^{2m})\right) + n^m p^m\,(1 + O(p)) \\
&= 1 + (n^m - n)p^m + O(p^{m+1}) \\
&= 1 + (n^m - n)p^m\,(1 + O(p)).
\end{aligned}$$

This implies that if p is greater than zero but sufficiently small, then $f(p) > 1$. The first property therefore holds.
For the second property, note that function $f(p)$ is differentiable on $[0, 1]$. Clearly,

$$f'(p) = mn\left(\left(1 - (1 - p)^n\right)^{m-1}(1 - p)^{n-1} - p^{m-1}\left(1 - p^m\right)^{n-1}\right).$$

After setting $f'(p) = 0$ and rearranging terms, we get that

$$\left(\frac{1 - (1 - p)^n}{1 - (1 - p)}\right)^{m-1} = \left(\frac{1 - p^m}{1 - p}\right)^{n-1}.$$

Now, using the formula for geometric series (see (1.10)), we notice that it is equivalent to

$$\left(\sum_{i=0}^{n-1}(1 - p)^i\right)^{m-1} = \left(\sum_{i=0}^{m-1}p^i\right)^{n-1}.$$

Since the left hand side monotonically decreases from $n^{m-1}$ (for $p = 0$) to 1 (when $p \to 1$) and the right hand side monotonically increases from 1 (for $p \to 0$) to $m^{n-1}$ (for $p = 1$), there is exactly one point when the two sides are equal. It follows that $f'(p) = 0$ has only one solution in $(0, 1)$, which finishes the proof.

EXERCISES

1.8.1. Prove that for any $n \in \mathbb{N}$,

$$\frac{1}{2n + 1} \le \frac{\binom{2n}{n}}{2^{2n}} \le 1.$$
Can you improve these bounds for large n?

1.8.2. Prove that for $k, n \in \mathbb{N}$ such that $k \le n$ and $p, q \in [0, 1]$ such that $p < q$, we have that

$$\sum_{i=k}^{n}\binom{n}{i}\left(q^i(1 - q)^{n-i} - p^i(1 - p)^{n-i}\right) \ge 0.$$

1.9 Geometry
SOURCE

Generalization of the problem from British Mathematical Olympiad 2006/7 – Phase 1 – Problem 5
PROBLEM

Prove that for any $a, b, c \in \mathbb{R}$,

$$\left(a^2 + b^2\right)^2 \ge (a + b + c)(a + b - c)(b + c - a)(c + a - b). \tag{1.36}$$

THEORY

Heron’s Formula Heron’s formula states that the area of a triangle whose sides have lengths a, b, and c is

$$A = \sqrt{s(s - a)(s - b)(s - c)},$$

where s is the semi-perimeter of the triangle; that is,

$$s = \frac{a + b + c}{2}.$$

Heron’s formula can also be written as

$$A = \frac{1}{4}\sqrt{(a + b + c)(-a + b + c)(a - b + c)(a + b - c)}.$$
SOLUTION

First, let us observe that there is a lot of symmetry in both the left hand side of inequality (1.36), the function $f(a, b) := (a^2 + b^2)^2$, and the right hand side, the function $g(a, b, c) := (a + b + c)(-a + b + c)(a - b + c)(a + b - c)$. In particular, both $f(a, b)$ and $g(a, b, c)$ are not affected by the sign of variables; for example,

$$\begin{aligned}
g(-a, b, c) &= (-a + b + c)(-(-a) + b + c)(-a - b + c)(-a + b - c) \\
&= (-a + b + c)(a + b + c)(a + b - c)(a - b + c) = g(a, b, c).
\end{aligned}$$

Hence, without loss of generality, it is enough to concentrate on non-negative values of a, b, c.

As in many earlier examples, it is convenient to translate the problem to another domain. We observe that we may assume that a, b, c are sides of some triangle, as otherwise $g(a, b, c) \le 0$ and so the desired inequality trivially holds (since $f(a, b) \ge 0$ for any a, b). Hence, we can re-write the inequality as follows:

$$\frac{a^2 + b^2}{4} \ge \frac{1}{4}\sqrt{(a + b + c)(a + b - c)(b + c - a)(c + a - b)} =: A(a, b, c) \tag{1.37}$$

and use Heron’s formula to notice that the right hand side of inequality (1.37), function $A(a, b, c)$, is the area of the considered triangle. Now, since a triangle with arms of lengths a and b has area less than or equal to $ab/2$ we get $A(a, b, c) \le ab/2$. (The equality holds if and only if the angle between the two arms is 90°.) By the geometric-arithmetic mean inequality, $ab/2 \le (a^2 + b^2)/4$ and so the desired inequality holds. Finally, let us note that $ab/2 = (a^2 + b^2)/4$ if and only if $a = b$. So putting these observations together, we get that $f(a, b) = g(a, b, c)$ if and only if $a = b$ and $c = \sqrt{2}\,a$.
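Inequality (1.36) and its equality case can be probed numerically. The sketch below is an illustration under stated assumptions: the helpers `f` and `g` mirror the functions named in the solution, while the seed, the number of samples, and the sampling range are arbitrary.

```python
import math
import random

def f(a: float, b: float) -> float:
    """Left-hand side of (1.36)."""
    return (a * a + b * b) ** 2

def g(a: float, b: float, c: float) -> float:
    """Right-hand side of (1.36)."""
    return (a + b + c) * (a + b - c) * (b + c - a) * (c + a - b)

random.seed(1)  # fixed seed so that the run is reproducible
triples = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(5000)]

# Smallest observed gap f - g over the random sample.
print(min(f(a, b) - g(a, b, c) for a, b, c in triples))

# Candidate equality case: a = b = 1 and c = sqrt(2), a right isosceles triangle.
print(f(1.0, 1.0), g(1.0, 1.0, math.sqrt(2.0)))
```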

REMARKS

Knowing Heron’s formula turns out to be useful in this example and the way it can be applied comes to mind naturally. However, one can solve this problem without using it. After simplifying inequality (1.36), we get an equivalent inequality

$$h(a, b, c) := c^4 - 2(b^2 + a^2)c^2 + 2(b^4 + a^4) \ge 0,$$

and we note that $h(a, b, c)$ is a quadratic polynomial of $c^2$. Since the discriminant, Δ, satisfies

$$\Delta := \left(-2(b^2 + a^2)\right)^2 - 4 \cdot 2(b^4 + a^4) = -4(b^2 - a^2)^2 \le 0,$$

the desired inequality holds. Moreover, as before, we get that the equality holds if and only if $|a| = |b| = |c|/\sqrt{2}$.

EXERCISES

1.9.1. Prove that for all sequences of n numbers $a_1, \dots, a_n \in \mathbb{R}$ we have

$$\sqrt{n^2 + \left(\sum_{i=1}^{n}a_i\right)^2} \le \sum_{i=1}^{n}\sqrt{1 + a_i^2}.$$

When does equality hold?


(Source of the problem and solution: inspired by PLMO XXXVIII – Phase 1 – Problem 7.)

1.9.2. Prove that for any $x \in \mathbb{R}$ such that $1/4 \le x \le 1$ we have

$$\frac{x}{2}\sqrt{1 - x^2} + \frac{1}{16}\sqrt{16x^2 - 1} < \frac{4}{9}.$$
Chapter 2
Equalities and Sequences

2.1 Combining Equalities


2.2 Extremal Values
2.3 Solving via Inequalities
2.4 Trigonometric Identities
2.5 Number of Solutions
2.6 Sequence Invariants
2.7 Solving Sequences

As usual, we start the chapter with some basic definitions.


THEORY

Suppose that you are given an equation or a set of equations involving some
number of variables. A natural question that is usually asked is to find a solution
in some given domain. In general, there are the following three possible cases:

no feasible solution exists,

there exists a unique solution,

there are multiple solutions (the number of them could be finite or infinite).

For example, consider a quadratic equation $ax^2 + bx + c = 0$, where $a \ne 0$. Our goal is to find all real solutions, that is, all values of $x \in \mathbb{R}$ that satisfy this equation. After multiplying both sides by 4a, one can re-write it as follows:

$$(2ax + b)^2 = b^2 - 4ac.$$

It is now clear that if $b^2 - 4ac < 0$, then the equation has no real solutions, as the left hand side is non-negative. On the other hand, if $b^2 - 4ac = 0$, then the equation has exactly one solution, namely, $x = -b/(2a)$. Finally, if $b^2 - 4ac > 0$, then we get two different solutions: $x_1 = (-b - \sqrt{b^2 - 4ac})/(2a)$ and $x_2 = (-b + \sqrt{b^2 - 4ac})/(2a)$. Let us mention that the value $b^2 - 4ac$, which allows deducing some properties of the roots without computing them, is called the discriminant and is often denoted as Δ.
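The three cases distinguished by the discriminant translate directly into a short routine. The following is a minimal sketch; the function name `real_roots` is a hypothetical choice, and exact comparison of the discriminant with zero is reliable only when the coefficients are such that it is computed without rounding error.

```python
import math

def real_roots(a: float, b: float, c: float) -> list:
    """Real solutions of a x^2 + b x + c = 0 (a != 0), in increasing order."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be non-zero")
    delta = b * b - 4 * a * c            # the discriminant
    if delta < 0:
        return []                        # no real solutions
    if delta == 0:
        return [-b / (2 * a)]            # exactly one solution
    root = math.sqrt(delta)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

print(real_roots(1, -3, 2))   # two solutions
print(real_roots(1, 2, 1))    # one solution
print(real_roots(1, 0, 1))    # no real solutions
```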
Another important distinction is with respect to the number of equations. The
problem we need to deal with could consist of

one equation,

more than one equation but finitely many; in this case we
typically say that we have a system of equations (the word
“system” indicates that the equations are to be considered
collectively, rather than individually),

infinitely many equations; this case is often represented as a
recursive sequence of equations specifying the relationships
between the involved variables.

Let us give a simple example of an infinite sequence of recursive equations, as this is not a typical situation. Let $x_0 = a$ for some non-zero real number a, and for $i \in \mathbb{N} \cup \{0\}$, let $x_{i+1} = 2x_i$. It is straightforward to see that $x_n = 2^n a$ is the only solution to this system of equations. Formally, one could prove it by induction on n.
Let us mention a specific family of equations which play a fundamental role in
linear algebra, a subject which is used in most areas of modern mathematics. A
system of linear equations is a collection of two or more linear equations
involving the same set of variables. A system of non-linear equations can often be
approximated by a linear system (such a process is called linearization), a helpful
technique when designing a mathematical model or computer simulation of a
complex system.
The equations of a linear system are independent if none of the equations can
be derived algebraically from the others. In other words, when the equations are
independent, each equation contains new information about the variables, and
removing any of the equations increases the size of the solution set. Any system
of n independent equations involving n variables has a unique solution.
The simplest method for solving a system of linear equations is to repeatedly eliminate variables. Consider, for example, the following system of linear equations:

$$\begin{cases} x + 2y + z = 3 \\ 2x + y + z = 3 \\ x - y = 0. \end{cases}$$

It is easy to see that the first equation is equal to the second equation minus the third one, so the system is not independent. After dropping the first equation we get the following, independent and equivalent, system:

$$\begin{cases} 2x + y + z = 3 \\ y = x. \end{cases}$$

We can now eliminate variable y from the first equation to get:

$$\begin{cases} z = 3 - 3x \\ y = x. \end{cases}$$

As we are not able to further reduce the system, we conclude that there are an infinite number of solutions: a triple $(x, y, z) = (t, t, 3 - 3t)$ satisfies the original system of equations for any $t \in \mathbb{R}$.
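The parametrized family of solutions can be substituted back into all three original equations. The sketch below does this with exact rational arithmetic; the helper name `satisfies_system` and the sample values of t are assumptions made for illustration.

```python
from fractions import Fraction

def satisfies_system(x, y, z) -> bool:
    """Check the three original equations of the example system."""
    return (x + 2 * y + z == 3) and (2 * x + y + z == 3) and (x - y == 0)

# Arbitrary sample parameters t; each triple (t, t, 3 - 3t) should pass.
for t in (Fraction(0), Fraction(1), Fraction(-7, 3), Fraction(5, 2)):
    print(t, satisfies_system(t, t, 3 - 3 * t))

# A triple outside the family, for contrast.
print(satisfies_system(1, 2, 0))
```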
Finally, let us mention the geometric interpretation. Let us start with a simple example, a linear system involving two variables, say x and y. Each linear equation determines a line on the xy-plane. Because a solution to a linear system must satisfy all of the equations, the solution set is the intersection of these lines, and is hence either a line (an infinite number of solutions), a single point (a unique solution), or the empty set (no solution). The three cases are illustrated in Figure 2.1. If there is only one equation $x + y = 2$, then any point $(x, y) = (t, 2 - t)$, $t \in \mathbb{R}$, satisfies this equation; see Figure 2.1(a). The following system

$$\begin{cases} x + y = 2 \\ x - y = 0 \end{cases}$$

has precisely one solution, the intersection of the corresponding two lines; see Figure 2.1(b). Finally, there is no solution to the following system

$$\begin{cases} x + y = 2 \\ x - y = 0 \\ y = 2, \end{cases}$$

as no point belongs to all of the corresponding three lines; see Figure 2.1(c). For three variables, each linear equation determines a plane in three-dimensional space, and the solution set is the intersection of these planes. In general, for n variables, each linear equation determines a hyperplane in n-dimensional space.

FIGURE 2.1: Geometric interpretation of linear systems.

The situation is more complex for non-linear systems, but one can still gain some
intuition by representing each equation as the set of points that satisfy it. In the
next section, we solve the following simple example algebraically:

\[
\begin{cases}
x^2 + y^2 = 2\\
x + y = 2.
\end{cases}
\]

In Figure 2.2, we present the corresponding graphs that suggest the unique
solution $(x, y) = (1, 1)$.

FIGURE 2.2: Graphs of $x^2 + y^2 = 2$ and $x + y = 2$.

2.1 Combining Equalities


SOURCE
Problem: PLMO LVI – Phase 1 – Problem 1
Solution: our own
PROBLEM

Solve the following system of equations, given that all variables involved are real
numbers:
\[
\begin{cases}
x^2 = yz + 1\\
y^2 = zx + 2\\
z^2 = xy + 4.
\end{cases}
\]

THEORY

A natural approach when solving systems of equations, which often turns out to be
efficient, is to transform a given system into some other equivalent system that is
easier to deal with.
Suppose that we are given $n$ equations with unknown variables, represented by
the vector $x$, of the form $f_i(x) = b_i$ for $i \in [n]$. For a given sequence of
weights $c_i \in \mathbb{R}$, $i \in [n]$, if $c_j \neq 0$ for some $j \in [n]$, then one can take the original
system of equations and replace the $j$th equation, $f_j(x) = b_j$, with a linear
combination of all equations, that is, with

\[
\sum_{i=1}^{n} c_i f_i(x) = \sum_{i=1}^{n} c_i b_i .
\]

The resulting system of equations has the same solutions as the original system.
Indeed, in order to see this, let us first assume that some $x$ is a solution to the
original system. It is clear that it is also a solution of the derived system. On the
other hand, if some $x$ is a solution of the derived system, then it must also satisfy
$f_j(x) = b_j$, as $c_j \neq 0$ and $\sum_{i \neq j} c_i f_i(x) = \sum_{i \neq j} c_i b_i$.

To illustrate this technique, let us consider the following simple example of
such a system, where $x, y \in \mathbb{R}$:

\[
\begin{cases}
x^2 + y^2 = 2\\
x + y = 2.
\end{cases}
\]

One can replace the first equation with the first equation minus two times the
second equation to get the following equivalent system:

\[
\begin{cases}
x^2 - 2x + y^2 - 2y = 2 - 4\\
x + y = 2,
\end{cases}
\]

or equivalently,

\[
\begin{cases}
(x - 1)^2 + (y - 1)^2 = 0\\
x + y = 2.
\end{cases}
\]

Since $(x - 1)^2 \geq 0$ and $(y - 1)^2 \geq 0$, the first equality holds only for $x = 1$ and
$y = 1$. As a result, we may equivalently re-write the system as follows:

\[
\begin{cases}
x = 1\\
y = 1\\
x + y = 2.
\end{cases}
\]

It is obvious now that this system of equations is satisfied for $x = y = 1$. It
follows that this is the only solution of the original system.
SOLUTION

After subtracting the first equation from the second one, we get that

\[
(y - x)(x + y + z) = 1. \tag{2.1}
\]

Similarly, from the second and the third equation, we get that

\[
(z - y)(x + y + z) = 2. \tag{2.2}
\]

We observe that $x + y + z \neq 0$ (otherwise the left hand side of (2.1) would be
equal to 0) and, after subtracting (2.1) twice from (2.2), we get that

\[
(z - 3y + 2x)(x + y + z) = 0.
\]

It follows that $z = 3y - 2x$ and so we can reduce the system to two equations and
two variables:

\[
\begin{cases}
x^2 = y(3y - 2x) + 1\\
y^2 = x(3y - 2x) + 2,
\end{cases}
\]

or equivalently,

\[
\begin{cases}
(x + y)^2 = 4y^2 + 1\\
2(x + y)^2 = y^2 + 7yx + 2.
\end{cases}
\]

Substituting the first equation into the second one gives us the following sequence
of equivalent equalities: $2(4y^2 + 1) = y^2 + 7yx + 2$, $y^2 = yx$, and so
$y(y - x) = 0$. It follows that $x = y$ or $y = 0$. Note that it is not the case that
$x = y$, as then the left hand side of (2.1) is equal to 0 but the right hand side is
equal to 1. If $y = 0$, then from $z = 3y - 2x$ we get that $z = -2x$ and from
$(x + y)^2 = 4y^2 + 1$ we have $x^2 = 1$. This gives us two candidate solutions:
$(x, y, z) = (-1, 0, 2)$ and $(x, y, z) = (1, 0, -2)$. We can then directly check that
both of them satisfy the original system of equations.
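
The final direct check is easy to automate. The sketch below substitutes the two candidates into the original system and, purely as a sanity check, scans a small integer grid (the range $[-10, 10]$ is an arbitrary choice of ours); the function name `residuals` is illustrative.

```python
from itertools import product

def residuals(x, y, z):
    """Left side minus right side for each of the three equations."""
    return (x*x - (y*z + 1), y*y - (z*x + 2), z*z - (x*y + 4))

candidates = [(-1, 0, 2), (1, 0, -2)]
candidates_ok = all(residuals(*c) == (0, 0, 0) for c in candidates)

# Scan integer triples in a small box; this is only a sanity check,
# the proof above is what rules out other real solutions.
grid = range(-10, 11)
found = [p for p in product(grid, repeat=3) if residuals(*p) == (0, 0, 0)]
print(candidates_ok, found)
```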


REMARKS

In order to gain some more experience, let us consider another problem to show
how the technique we practice in this section can be applied. As an example, we
use the problem from OM LVIII – Phase 1 – Problem 1. We are asked to solve the
following system of equations, where variables involved are real numbers:
\[
\begin{cases}
x^2 + 2yz + 5x = 2\\
y^2 + 2zx + 5y = 2\\
z^2 + 2xy + 5z = 2.
\end{cases}
\]

Adding all equations together we get that $(x + y + z)(x + y + z + 5) = 6$, so
$x + y + z = 1$ or $x + y + z = -6$. We will independently consider the following
two cases.

Case 1: all the numbers are equal. We get that $3x^2 + 5x - 2 = 0$ and so there are
two possible values of $x$:

\[
x_1 = \frac{-5 + \sqrt{49}}{2 \cdot 3} = \frac{1}{3}
\qquad \text{and} \qquad
x_2 = \frac{-5 - \sqrt{49}}{2 \cdot 3} = -2 .
\]

It follows that there are two solutions: $(x, y, z) = (1/3, 1/3, 1/3)$ and
$(x, y, z) = (-2, -2, -2)$.

Case 2: not all the numbers are equal. Due to the symmetry, without loss of
generality, we may assume that $x \neq y$; the other solutions will be obtained by
permuting the solution vector $(x, y, z)$. After comparing the left hand sides of the
first and the second equation we get that $(x - y)(x + y - 2z + 5) = 0$. Since
$x \neq y$, it follows that $x + y - 2z + 5 = 0$. We will now independently consider
two sub-cases, depending on the value of $x + y + z$.

Sub-case 2a: $x + y + z = 1$. Since $x + y - 2z + 5 = 0$, we get $z = 2$ and $y = -1 - x$.
After substituting this into the first equation, we get that

\[
x^2 + 2(-1 - x) \cdot 2 + 5x = 2,
\]

or equivalently that $0 = x^2 + x - 6 = (x - 2)(x + 3)$. We get two solutions:
$(x, y, z) = (2, -3, 2)$ or $(x, y, z) = (-3, 2, 2)$.

Sub-case 2b: $x + y + z = -6$. This time, after combining this with
$x + y - 2z + 5 = 0$, we get that $z = -1/3$ and $y = -17/3 - x$. After
substituting this into the first equation, we get that

\[
x^2 + 2(-17/3 - x) \cdot (-1/3) + 5x = 2,
\]

or equivalently that

\[
0 = x^2 + \frac{17}{3} x + \frac{16}{9} = \left( x + \frac{1}{3} \right) \left( x + \frac{16}{3} \right).
\]

We get two more solutions: $(x, y, z) = (-1/3, -16/3, -1/3)$ and
$(x, y, z) = (-16/3, -1/3, -1/3)$.

Combining all the cases together, we deduce that there are 8 candidate
solutions:

\[
\begin{aligned}
(x, y, z) \in \{ & (1/3, 1/3, 1/3),\ (-2, -2, -2), \\
& (-3, 2, 2),\ (2, -3, 2),\ (2, 2, -3), \\
& (-16/3, -1/3, -1/3),\ (-1/3, -16/3, -1/3),\ (-1/3, -1/3, -16/3) \}.
\end{aligned}
\]

We may then directly check that all of them satisfy the original system of
equations.
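
Because several candidates involve thirds, the direct check is best done in exact arithmetic. A minimal sketch follows; the names `F` and `satisfies` are our own.

```python
from fractions import Fraction as F

def satisfies(x, y, z):
    """Check the three equations x^2 + 2yz + 5x = 2 (and cyclic variants)."""
    return (x*x + 2*y*z + 5*x == 2
            and y*y + 2*z*x + 5*y == 2
            and z*z + 2*x*y + 5*z == 2)

third = F(1, 3)
candidates = [
    (third, third, third), (F(-2), F(-2), F(-2)),
    (F(-3), F(2), F(2)), (F(2), F(-3), F(2)), (F(2), F(2), F(-3)),
    (F(-16, 3), -third, -third), (-third, F(-16, 3), -third),
    (-third, -third, F(-16, 3)),
]
all_satisfy = all(satisfies(*c) for c in candidates)
# Each candidate has coordinate sum 1 or -6, as derived above.
sums = sorted({sum(c) for c in candidates})
print(all_satisfy, sums)
```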
EXERCISES

2.1.1. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
a^3 + b = c\\
b^3 + c = d\\
c^3 + d = a\\
d^3 + a = b.
\end{cases}
\]

(Source of the problem and solution: PLMO LXIII – Phase 2 – Problem 1.)

2.1.2. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
(x - y)(x^3 + y^3) = 7\\
(x + y)(x^3 - y^3) = 3.
\end{cases}
\]

(Source of the problem and solution: PLMO LXII – Phase 2 – Problem 1.)

2.1.3. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
x^2 - (y + z + yz)x + (y + z)yz = 0\\
y^2 - (z + x + zx)y + (z + x)zx = 0\\
z^2 - (x + y + xy)z + (x + y)xy = 0.
\end{cases}
\]

(Source of the problem and solution: PLMO LXI – Phase 2 – Problem 1.)

2.2 Extremal Values


SOURCE

Problem and method of solution: PLMO LVII – Phase 3 – Problem 1

PROBLEM

Solve the following system of equations, given that all variables involved are real
numbers:
\[
\begin{cases}
a^2 = b^3 + c^3\\
b^2 = c^3 + d^3\\
c^2 = d^3 + e^3\\
d^2 = e^3 + a^3\\
e^2 = a^3 + b^3.
\end{cases}
\]

THEORY

Cyclic Systems of Equations A system of $n$ equations with $n$ variables
$x_1, x_2, \ldots, x_n$ of the form

\[
\begin{cases}
f_1(x_1, x_2, \ldots, x_n) = 0\\
f_2(x_1, x_2, \ldots, x_n) = 0\\
\qquad \vdots\\
f_n(x_1, x_2, \ldots, x_n) = 0
\end{cases}
\]

is called cyclic if the system does not change after replacing variable $x_i$ with
variable $x_{i+1}$ for $i \in [n - 1]$, and replacing variable $x_n$ with variable $x_1$.
Let us note that any cyclic system with a finite number of variables
$x_1, x_2, \ldots, x_n$ has the property that a circular permutation of any solution is also a
solution. In other words, if $(x_1, x_2, \ldots, x_n) = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ is a solution, then
so is $(x_1, x_2, \ldots, x_n) = (\bar{x}_k, \bar{x}_{k+1}, \ldots, \bar{x}_n, \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{k-1})$, where $k \in [n]$. As a
result, when solving cyclic systems of equations it is often useful to start by
assuming that some variable attains the maximum or the minimum value among
all the variables. Once such a solution is found, one can simply recover
the whole family of solutions by applying circular
permutations to the particular solution.
In order to illustrate this technique in a simple setting, let us consider the
following cyclic system of $n$ equations and $n$ variables $x_1, x_2, \ldots, x_n \in \mathbb{R}$, where
$n \in \mathbb{N} \setminus \{1\}$: for $i \in [n]$, $x_i^3 + 2 = 3x_{i+1}$, assuming $x_{n+1} = x_1$. Using our
observation, without loss of generality, we may assume that $x_1 = \max_{i \in [n]} x_i$.
Since

\[
x_1^3 + 2 = 3x_2 \leq 3x_1,
\]

we get that

\[
0 \geq x_1^3 + 2 - 3x_1 = (x_1^2 - 2x_1 + 1)(x_1 + 2) = (x_1 - 1)^2 (x_1 + 2).
\]

It follows that $x_1 = 1$ or $x_1 \leq -2$.
If $x_1 = 1$, then $x_2 = (x_1^3 + 2)/3 = (1 + 2)/3 = 1$ and, since the system is
cyclic, we conclude that in fact $x_i = 1$ for each $i \in [n]$. Hence,
$(x_1, x_2, \ldots, x_n) = (1, 1, \ldots, 1)$ is one solution to the system. Similarly, if
$x_1 = -2$, then $x_2 = (x_1^3 + 2)/3 = (-8 + 2)/3 = -2$, and arguing as before we
get that $(x_1, x_2, \ldots, x_n) = (-2, -2, \ldots, -2)$ is another solution to the system.
We will show that there are no more solutions. For a contradiction, suppose that
there exists a solution for which $x_1 < -2$. In this case, we get that

\[
x_2 - x_1 = \frac{x_1^3 + 2}{3} - x_1 = \frac{x_1^3 + 2 - 3x_1}{3} = \frac{(x_1 - 1)^2 (x_1 + 2)}{3} < 0.
\]

It follows that $x_2 < x_1 < -2$. Since the system is cyclic, we can keep repeating
this argument to conclude that

\[
-2 > x_1 > x_2 > \ldots > x_n > x_1,
\]

which is clearly a contradiction.
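
The ingredients of this argument can be spot-checked numerically. The sketch below uses exact rationals; the helper names and the sample points are illustrative choices of ours, not part of the original argument.

```python
from fractions import Fraction

def is_cyclic_solution(xs):
    """Check x_i^3 + 2 == 3 * x_{i+1} for all i, with x_{n+1} = x_1."""
    n = len(xs)
    return all(xs[i]**3 + 2 == 3 * xs[(i + 1) % n] for i in range(n))

# The two constant vectors are solutions for every tested length n >= 2.
constant_ok = all(
    is_cyclic_solution([c] * n) for n in range(2, 8) for c in (1, -2)
)

# The factorization used in the argument, checked on a range of rationals.
points = [Fraction(k, 4) for k in range(-40, 41)]
factorization_ok = all(
    x**3 + 2 - 3*x == (x - 1)**2 * (x + 2) for x in points
)

def next_value(x):
    """Solve x^3 + 2 = 3 * x_next for x_next."""
    return (x**3 + 2) / 3

# Starting below -2, the forced successor is strictly smaller, as in the proof.
start = Fraction(-5, 2)
descends = next_value(start) < start
print(constant_ok, factorization_ok, descends)
```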

SOLUTION
Without loss of generality, we may assume that $b$ is the largest variable involved,
that is, $b = \max\{a, b, c, d, e\}$. In particular, since the function $f(x) = x^3$ is an
increasing function, $b^3 \geq d^3$. By subtracting the second equation from the first
one, we get that

\[
a^2 - b^2 = (b^3 + c^3) - (c^3 + d^3) = b^3 - d^3 \geq 0, \tag{2.3}
\]

and so $a^2 \geq b^2$. Since $b \geq a$, we see that $a^2 \geq b^2$ holds only if $a \leq -|b|$ or $a = b$.
We will independently consider both cases.

Case 1: $a \leq -|b|$. We get that $a^3 \leq (-|b|)^3 = -|b|^3$ and so

\[
a^3 + b^3 \leq a^3 + |b|^3 \leq 0.
\]

On the other hand, the fifth equation implies that $a^3 + b^3 = e^2 \geq 0$ and so $e = 0$.
As a result, the fourth equation implies that $a^3 = d^2 \geq 0$ and so $a = 0$ and $b = 0$.
Coming back to the original equations, we deduce that $d = 0$ and consequently
that $c = 0$.

Case 2: $a = b$. It follows from (2.3) that $b^3 = d^3$ and so we also get that $b = d$. It
follows from the third and the fourth equation that $c^2 = b^2$. If $c = b$ (that is,
$a = b = c = d$), then we get from the second equation that
$0 = 2b^3 - b^2 = b^2(2b - 1)$ and so $b = 0$ or $b = 1/2$ (and in both cases all other
variables are equal). If $c = -b$, then we get from the first equation that $a = 0$, and
again all variables are equal to 0.

In summary, there are only two solutions to our system of equations:

\[
(a, b, c, d, e) \in \{ (0, 0, 0, 0, 0),\ (1/2, 1/2, 1/2, 1/2, 1/2) \}.
\]
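
As a sanity check of this conclusion, one can search a small grid of rational values exhaustively. The sketch below uses half-integers in $[-2, 2]$, which is an arbitrary choice made here for illustration; it cannot replace the proof, but it agrees with it on this grid.

```python
from fractions import Fraction
from itertools import product

def is_solution(a, b, c, d, e):
    """Check the five equations of the cyclic system exactly."""
    return (a**2 == b**3 + c**3 and b**2 == c**3 + d**3
            and c**2 == d**3 + e**3 and d**2 == e**3 + a**3
            and e**2 == a**3 + b**3)

# All 5-tuples with entries in {-2, -3/2, ..., 3/2, 2}.
grid = [Fraction(k, 2) for k in range(-4, 5)]
found = [v for v in product(grid, repeat=5) if is_solution(*v)]
print(found)
```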

REMARKS

In all examples presented so far, all solutions had the property that all variables
are equal, that is, the minimum value is equal to the maximum value.
Indeed, this is often the case but, of course, it does not have to be so in general.
Consider, for instance, the following cyclic system of equations where variables
$x_1, x_2, x_3$ are real numbers:

\[
\begin{cases}
x_1(1 - x_2) = 1\\
x_2(1 - x_3) = 1\\
x_3(1 - x_1) = 1.
\end{cases}
\]

We get immediately that none of the variables $x_1, x_2, x_3$ is equal to 0 or 1;
otherwise, the left hand side of one of the equations would be equal to 0. Hence,
we can re-write the system as follows:

\[
\begin{cases}
x_2 = 1 - 1/x_1\\
x_3 = 1 - 1/x_2\\
x_1 = 1 - 1/x_3.
\end{cases}
\]

Note that if $x \notin \{0, 1\}$, then $f(x) := 1 - 1/x \notin \{0, 1\}$. Hence, after making a
substitution, we get that

\[
\begin{cases}
x_2 = 1 - 1/x_1\\
x_3 = 1 - 1/(1 - 1/x_1) = -1/(x_1 - 1)\\
x_1 = 1 - 1/(-1/(x_1 - 1)) = x_1.
\end{cases}
\]

It follows that any triple of the form $(x_1, x_2, x_3) = (t, 1 - 1/t, -1/(t - 1))$ is a
solution to our system, provided that $t \in \mathbb{R}$ is equal to neither 0 nor 1. In particular,
$(x_1, x_2, x_3) = (1/2, -1, 2)$ is a solution so, indeed, there are non-constant
solution vectors.
Finally, let us mention that the trick of assuming that one of the variables
attains the maximum or the minimum does not only apply to cyclic systems; see,
for example, Problem 2.2.3.
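
The one-parameter family is easy to test with exact rationals. In the sketch below the function names and the sample parameter values are our own illustrative choices.

```python
from fractions import Fraction

def triple(t):
    """The family (t, 1 - 1/t, -1/(t - 1)), defined for t not in {0, 1}."""
    return (t, 1 - 1/t, -1/(t - 1))

def satisfies(x1, x2, x3):
    """Check the three cyclic equations x_i * (1 - x_{i+1}) = 1."""
    return x1*(1 - x2) == 1 and x2*(1 - x3) == 1 and x3*(1 - x1) == 1

# Rational parameters k/3, skipping the excluded values 0 and 1.
params = [Fraction(k, 3) for k in range(-9, 10) if k not in (0, 3)]
family_ok = all(satisfies(*triple(t)) for t in params)
print(family_ok, triple(Fraction(1, 2)))
```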
EXERCISES

2.2.1. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
(x + y)^3 = 8z\\
(y + z)^3 = 8x\\
(z + x)^3 = 8y.
\end{cases}
\]

(Source of the problem: OM LXIII – Phase 1 – Problem 1. Solution: our own.)

2.2.2. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
x^5 = 5y^3 - 4z\\
y^5 = 5z^3 - 4x\\
z^5 = 5x^3 - 4y.
\end{cases}
\]

(Source of the problem and solution: PLMO LIX – Phase 1 – Problem 1.)
2.2.3. Solve the following system of equations, given that all variables involved
are positive real numbers:
\[
\begin{cases}
a^3 + b^3 + c^3 = 3d^3\\
b^4 + c^4 + d^4 = 3a^4\\
c^5 + d^5 + a^5 = 3b^5.
\end{cases}
\]

(Source of the problem and solution: PLMO LV – Phase 2 – Problem 1.)

2.3 Solving via Inequalities


SOURCE

Problem: PLMO LVI – Phase 3 – Problem 4 (slightly modified)
Solution: our own

PROBLEM

Suppose that $n \in \mathbb{N} \setminus \{1\}$ and $C \in (-2, 2)$. Find all solutions $(x_1, x_2, \ldots, x_n)$ of
the following equation:

\[
\sum_{i=1}^{n} \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} = \sqrt{C + 2} \sum_{i=1}^{n} x_i, \tag{2.4}
\]

where $x_i \in \mathbb{R}$ for $i \in [n]$ and $x_{n+1} = x_1$.


THEORY

Consider an equation $f(x) = g(x)$, where $x = (x_1, x_2, \ldots, x_n)$ is a vector
consisting of $n$ unknown variables. Suppose that our goal is to find all vectors $x$
that satisfy the equation. One possible approach that often turns out to be useful is
to start by showing that $f(x) \leq g(x)$ (or that $f(x) \geq g(x)$). There are a lot of
techniques for achieving this, many of which we already discussed in Chapter 1.
More importantly, if one does this step carefully, it often turns out that the result
is sharp, that is, best possible. In other words, there are vectors
$x = (x_1, x_2, \ldots, x_n)$ that satisfy $f(x) = g(x)$. But this is exactly what we
wanted! Hence, after careful investigation of the obtained inequality, we usually
see all the “bottlenecks” that prevented us from improving the bound even further,
and so all the necessary conditions for equality become apparent.

SOLUTION

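Let us note that the right hand side of (2.4) can be rewritten as follows:

\[
\begin{aligned}
\sqrt{C + 2} \sum_{i=1}^{n} x_i
&= \frac{\sqrt{C + 2}}{2} \sum_{i=1}^{n} (x_i + x_{i+1})
= \frac{\sqrt{C + 2}}{2} \sum_{i=1}^{n} |x_i + x_{i+1}| - \varepsilon \\
&= \sum_{i=1}^{n} \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}} - \varepsilon,
\end{aligned}
\]

where

\[
\varepsilon := \frac{\sqrt{C + 2}}{2} \left( \sum_{i=1}^{n} |x_i + x_{i+1}| - \sum_{i=1}^{n} (x_i + x_{i+1}) \right) \geq 0,
\]

as $C > -2$. Our goal is to solve the following equation:

\[
\begin{aligned}
0 &= \sum_{i=1}^{n} \left( \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} - \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}} \right) + \varepsilon \\
&= \sum_{i=1}^{n} \frac{x_i^2 + C x_i x_{i+1} + x_{i+1}^2 - \frac{(C + 2)(x_i + x_{i+1})^2}{4}}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \varepsilon,
\end{aligned} \tag{2.5}
\]

where in the last step we used a standard method of removing square roots, that
is, using the fact that

\[
\sqrt{a} - \sqrt{b} = (\sqrt{a} - \sqrt{b}) \cdot \frac{\sqrt{a} + \sqrt{b}}{\sqrt{a} + \sqrt{b}} = \frac{a - b}{\sqrt{a} + \sqrt{b}}.
\]

Starting from the right hand side of (2.5), we get that

\[
\begin{aligned}
& \sum_{i=1}^{n} \frac{x_i^2 + C x_i x_{i+1} + x_{i+1}^2 - \frac{(C + 2)(x_i + x_{i+1})^2}{4}}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \varepsilon \\
&= \frac{1}{4} \sum_{i=1}^{n} \frac{4x_i^2 + 4C x_i x_{i+1} + 4x_{i+1}^2 - (C + 2)x_i^2 - 2(C + 2) x_i x_{i+1} - (C + 2)x_{i+1}^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \varepsilon \\
&= \frac{1}{4} \sum_{i=1}^{n} \frac{(2 - C)x_i^2 - (4 - 2C) x_i x_{i+1} + (2 - C)x_{i+1}^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \varepsilon \\
&= \frac{2 - C}{4} \sum_{i=1}^{n} \frac{(x_i - x_{i+1})^2}{\sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} + \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}}} + \varepsilon \geq 0,
\end{aligned}
\]

since $C \in (-2, 2)$. As $(2 - C)/4 > 0$, it is now obvious that the equality holds if
and only if all $x_i$ are equal, that is, $(x_1, x_2, \ldots, x_n) = (t, t, \ldots, t)$ for some $t \in \mathbb{R}$,
and

\[
\varepsilon = \frac{\sqrt{C + 2}}{2} \cdot n \left( |2t| - 2t \right) = 0.
\]

It follows that all solutions are of the form $(x_1, x_2, \ldots, x_n) = (t, t, \ldots, t)$ for
some $t \in \mathbb{R}_+ \cup \{0\}$.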
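
A quick floating-point experiment illustrates the conclusion. In the sketch below the function names, the tolerance, and the sample vectors are our own illustrative choices; the comparison uses `math.isclose` because the computation is not exact.

```python
import math

def lhs(xs, C):
    """Left hand side of (2.4): sum of square roots over cyclic neighbours."""
    n = len(xs)
    return sum(
        math.sqrt(xs[i]**2 + C * xs[i] * xs[(i + 1) % n] + xs[(i + 1) % n]**2)
        for i in range(n)
    )

def rhs(xs, C):
    """Right hand side of (2.4)."""
    return math.sqrt(C + 2) * sum(xs)

def is_solution(xs, C):
    return math.isclose(lhs(xs, C), rhs(xs, C), rel_tol=1e-12, abs_tol=1e-12)

for C in (-1.0, 1.0):
    print(C,
          is_solution([3.0] * 4, C),              # constant, t > 0
          is_solution([-1.0] * 4, C),             # constant, t < 0
          is_solution([1.0, 2.0, 3.0, 4.0], C))   # non-constant
```

For both sample values of $C$, only the constant vector with non-negative entries passes the test, in line with the analysis above.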

REMARKS

Since the right hand side of (2.4) has the term $\sqrt{C + 2}$, we immediately see that
the assumption that $C > -2$ is needed and natural. On the other hand, the
assumption that $C < 2$ is not needed and so we deduce that it has to be an
important condition that affects the solution. Indeed, if $C = 2$, then the left hand
side of (2.4) is equal to $\sum_{i=1}^{n} |x_i + x_{i+1}|$ and so it is equal to $2 \sum_{i=1}^{n} x_i$, the right
hand side, provided that $x_i + x_{i+1} \geq 0$ for all $i$. In particular, any vector
$(x_1, x_2, \ldots, x_n)$ with non-negative coordinates is a solution to the equation.
This reasoning leads us to consider the following equality

\[
\sum_{i=1}^{n} \sqrt{x_i^2 + C x_i x_{i+1} + x_{i+1}^2} = \sum_{i=1}^{n} \sqrt{(x_i + x_{i+1})^2 - (2 - C) x_i x_{i+1}},
\]

and to analyze the problem in terms of $x_i + x_{i+1}$. Now, one can notice that the
right hand side is easy to transform to the following form:

\[
\sqrt{C + 2} \sum_{i=1}^{n} x_i = \frac{\sqrt{C + 2}}{2} \sum_{i=1}^{n} (x_i + x_{i+1}).
\]

Since the sum on the left hand side has each term in the form of a square root, it
is natural to represent the right hand side the same way, obtaining

\[
\sqrt{C + 2} \sum_{i=1}^{n} x_i \leq \sum_{i=1}^{n} \sqrt{\frac{(C + 2)(x_i + x_{i+1})^2}{4}},
\]

as we did in the solution. To see this we used the fact that $z \leq |z|$ for all $z \in \mathbb{R}$.
The remainder of the solution follows naturally by grouping the terms using $x_i$
and $x_{i+1}$.
Finally, let us note that for $C \in (-2, 2)$ we have that $x_i^2 + C x_i x_{i+1} + x_{i+1}^2$ is
always non-negative, so the original problem is well defined for all $x_i \in \mathbb{R}$. In
order to see this, note that

\[
x_i^2 + C x_i x_{i+1} + x_{i+1}^2 = \left( x_i + \frac{C}{2} x_{i+1} \right)^2 + \left( 1 - \frac{C^2}{4} \right) x_{i+1}^2 \geq 0,
\]

since $C \in (-2, 2)$.
EXERCISES

2.3.1. Solve the following equation

\[
(x^4 + 3y^2) \sqrt{|x + 2| + |y|} = 4 |x y^2|,
\]

provided that x, y ∈ R.
(Source of the problem: PLMO LXIV – Phase 2 – Problem 4. Solution: our own.)

2.3.2. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
3(x^2 + y^2 + z^2) = 1\\
x^2 y^2 + y^2 z^2 + z^2 x^2 = xyz(x + y + z)^3.
\end{cases}
\]

(Source of the problem and solution approach: PLMO XLVIII – Phase 3 –


Problem 2.)

2.3.3. Solve the following system of equations, given that all variables involved
are real numbers:
\[
\begin{cases}
x^2 y + 2 = x + 2yz\\
y^2 z + 2 = y + 2zx\\
z^2 x + 2 = z + 2xy.
\end{cases}
\]
(Source of the problem and solution: PLMO LXIX – Phase 1 – Problem 3.)

2.4 Trigonometric Identities


SOURCE

Problem and solution: PLMO LX – Phase 3 – Problem 6


PROBLEM

Let $n$ be any natural number such that $n \geq 2$. Suppose that a sequence of
non-negative numbers $(c_0, c_1, \ldots, c_n)$ satisfies the following condition:

\[
c_p c_s + c_r c_t = c_{p+r} c_{r+s}
\]

for all non-negative integers $p, r, s, t$ such that $p + r + s + t = n$. Find $c_2$ under
the assumption that $c_1 = 1$.

THEORY

If one encounters a cyclic system of equations, it is often useful to transform it


into another system using variable substitution with trigonometric functions. The
reason why such substitution has a chance to work is that trigonometric functions
are periodic which can be used to exploit a cyclic nature of the system of
equations at hand. This is especially useful when one has to deal with a system of
equations for which the number of equations is a variable parameter.
In the problem we aim to solve in this section, the system is not cyclic but
exhibits enough other similarities to justify trying such substitutions of
trigonometric functions.

Trigonometric Functions The trigonometric functions are real functions which
relate an angle of a right-angled triangle to ratios of two side lengths. They are
among the simplest periodic functions, and as such are also widely used for
studying periodic phenomena. The most widely used trigonometric functions are
the sine, the cosine, and the tangent. Their reciprocals are respectively the
cosecant, the secant, and the cotangent.
The oldest definitions of trigonometric functions, related to right-angled
triangles, define them only for acute angles. Given an acute angle $\alpha$ of a
right-angled triangle, the hypotenuse $h$ is the side that connects the two acute angles;
see Figure 2.3. The side $b$ adjacent to $\alpha$ is the side of the triangle that connects $\alpha$
to the right angle. The third side $a$ is said to be opposite to $\alpha$.

FIGURE 2.3: Classical definition of trigonometric functions.

If the angle $\alpha$ is given, then all sides of the right-angled triangle are well
defined, up to a scaling factor. This means that the ratio of any two side lengths
depends only on $\alpha$. These six ratios define six functions of $\alpha$, which are the
trigonometric functions:

\[
\sin(\alpha) = \frac{a}{h} \ \text{(sine)}, \qquad
\cos(\alpha) = \frac{b}{h} \ \text{(cosine)}, \qquad
\tan(\alpha) = \frac{a}{b} \ \text{(tangent)}.
\]

As a result, the reciprocal functions of sine, cosine, and tangent are:

\[
\csc(\alpha) = \frac{1}{\sin(\alpha)} = \frac{h}{a} \ \text{(cosecant)}, \qquad
\sec(\alpha) = \frac{1}{\cos(\alpha)} = \frac{h}{b} \ \text{(secant)}, \qquad
\cot(\alpha) = \frac{1}{\tan(\alpha)} = \frac{b}{a} \ \text{(cotangent)}.
\]

To extend these definitions to functions whose domain is the whole real line,
one can use geometrical definitions based on the standard unit circle (a circle with
radius 1); see Figure 2.4.

FIGURE 2.4: Definition of trigonometric functions in a coordinate system. Note that
$h = \sqrt{a^2 + b^2} = \sqrt{1} = 1$ and this time $b$ is negative, so $\cos(\alpha) = b/h$ is also negative.

In all the computations in this book, we always assume that the angle $\alpha$ is
measured in radians, that is, by the length of the arc of a unit circle defined by the
angle and measured counter-clockwise. This is because radians are more
“natural” and give a more elegant formulation of a number of important results such
as

\[
\lim_{\alpha \to 0} \frac{\sin(\alpha)}{\alpha} = 1.
\]

The trigonometric functions also have simple and elegant series expansions when
radians are used. As a result, modern definitions express trigonometric functions
as infinite series:

\[
\begin{aligned}
\sin(\alpha) &= \alpha - \frac{\alpha^3}{3!} + \frac{\alpha^5}{5!} - \frac{\alpha^7}{7!} + \ldots \\
\cos(\alpha) &= 1 - \frac{\alpha^2}{2!} + \frac{\alpha^4}{4!} - \frac{\alpha^6}{6!} + \ldots .
\end{aligned}
\]

Trigonometric Identities The most useful trigonometric identities that can be
exploited in such cases include the following basic relationship between the sine
and the cosine that is called the Pythagorean Identity:

\[
\sin^2(x) + \cos^2(x) = 1.
\]

Dividing this identity by either $\cos^2(x)$ or $\sin^2(x)$ yields the other two
Pythagorean identities:

\[
1 + \tan^2(x) = \sec^2(x) \qquad \text{and} \qquad 1 + \cot^2(x) = \csc^2(x).
\]

By examining the unit circle, we notice that reflections in the directions $0$, $\pi/4$,
$\pi/2$, and $\pi$ map the circle onto itself. As a result, the following properties
of the trigonometric functions can be established:

\[
\begin{aligned}
\sin(-x) &= -\sin(x), & \cos(-x) &= \cos(x); \\
\sin(\pi/2 - x) &= \cos(x), & \cos(\pi/2 - x) &= \sin(x); \\
\sin(\pi - x) &= \sin(x), & \cos(\pi - x) &= -\cos(x); \\
\sin(2\pi - x) &= -\sin(x) = \sin(-x), & \cos(2\pi - x) &= \cos(x) = \cos(-x).
\end{aligned}
\]

By shifting the arguments of trigonometric functions by a quarter turn, a half turn,
or a full turn, one can get more identities:

\[
\begin{aligned}
\sin(\pi/2 + x) &= \cos(x), & \cos(\pi/2 + x) &= -\sin(x); \\
\sin(\pi + x) &= -\sin(x), & \cos(\pi + x) &= -\cos(x); \\
\sin(2\pi + x) &= \sin(x), & \cos(2\pi + x) &= \cos(x).
\end{aligned}
\]

From the above identities, one can easily deduce analogous properties of the
tangent and the cotangent functions; we leave the details for the reader.
Here are a few slightly more complex identities. The first four identities are
known as the angle addition identities:

\[
\begin{aligned}
\sin(x + y) &= \sin(x)\cos(y) + \cos(x)\sin(y), \\
\cos(x + y) &= \cos(x)\cos(y) - \sin(x)\sin(y); \\
\tan(x + y) &= \frac{\tan(x) + \tan(y)}{1 - \tan(x)\tan(y)}, \\
\cot(x + y) &= \frac{\cot(x)\cot(y) - 1}{\cot(x) + \cot(y)}.
\end{aligned}
\]

The above formulas yield the double-angle identities by setting $x = y$. In
particular,

\[
\begin{aligned}
\sin(2x) &= 2\sin(x)\cos(x) = \frac{2\tan(x)}{1 + \tan^2(x)}, \\
\cos(2x) &= \cos^2(x) - \sin^2(x) = 2\cos^2(x) - 1 = 1 - 2\sin^2(x).
\end{aligned}
\]

Let us finish with the product-to-sum and the sum-to-product identities:

\[
\begin{aligned}
\cos(x)\cos(y) &= \frac{\cos(x - y) + \cos(x + y)}{2}, \\
\sin(x)\sin(y) &= \frac{\cos(x - y) - \cos(x + y)}{2}, \\
\sin(x)\cos(y) &= \frac{\sin(x + y) + \sin(x - y)}{2}; \\
\sin(x) + \sin(y) &= 2\sin\left(\frac{x + y}{2}\right)\cos\left(\frac{x - y}{2}\right), \\
\cos(x) + \cos(y) &= 2\cos\left(\frac{x + y}{2}\right)\cos\left(\frac{x - y}{2}\right), \\
\cos(x) - \cos(y) &= -2\sin\left(\frac{x + y}{2}\right)\sin\left(\frac{x - y}{2}\right).
\end{aligned}
\]

SOLUTION
By considering $p = n$ and $r = s = t = 0$, we get that $c_n c_0 + c_0^2 = c_n c_0$, which
implies that $c_0 = 0$. Now, by considering $0 \leq p \leq n/2$, $s = p$, $t = 0$, and
$r = n - 2p$, we get that $c_p^2 = c_{n-p}^2$. Since the $c_i$ are non-negative, we get that
$c_p = c_{n-p}$ for each $0 \leq p \leq n/2$. In fact, by symmetry, $c_p = c_{n-p}$ for each
$0 \leq p \leq n$. In particular, we have $c_n = c_0 = 0$ and $c_{n-1} = c_1 = 1$. Finally,
consider $p = s = 1$, $r \in [n - 2] \cup \{0\}$, and $t = n - r - 2$ to get that

\[
c_{r+1}^2 = c_1^2 + c_r c_{n-r-2} = 1 + c_r c_{r+2}.
\]

In particular, since all $c_i$ are non-negative, it implies that $c_{r+1}^2 \geq 1$ and so
$c_{r+1} \geq 1$. It follows that all $c_i$, except $c_0$ and $c_n$, are at least 1. We will not need
this property but, in order to build an intuition, let us mention that with this
stronger property at hand we get that for each $r \in [n - 3]$ we have $c_{r+1}^2 \geq 2$ and
so all $c_i$, except $c_0$, $c_1$, $c_{n-1}$, and $c_n$, are at least $\sqrt{2}$. One may keep iterating
this argument to get a sequence of lower bounds.


From the identity above we get that for each $r \in [n - 2]$, we have that

\[
\frac{c_{r+2}}{c_{r+1}} = \frac{c_{r+1} - 1/c_{r+1}}{c_r} \tag{2.6}
\]

(note that we excluded $r = 0$ as $c_0 = 0$). Let us summarize what we know so far:
$c_0 = c_n = 0$, $c_1 = c_{n-1} = 1$, and (2.6) holds for each $r \in [n - 2]$. Moreover, if
there is a unique $c_2$ that satisfies the desired property, then in fact the whole
sequence is defined uniquely. Indeed, by (2.6) we see that $c_{r+2}$ is determined by
$c_{r+1}$ and $c_r$.


We will show that indeed the sequence is unique. For a contradiction, let us
suppose that there are two sequences $(0, 1, c_2, c_3, \ldots, c_{n-2}, 1, 0)$ and
$(0, 1, d_2, d_3, \ldots, d_{n-2}, 1, 0)$ that satisfy the desired property and $d_2 \neq c_2$. Without
loss of generality, we may assume that $d_2 > c_2$. We will prove by (strong)
induction that for $r \in [n - 2]$ we have that $d_{r+1} > c_{r+1}$ and that
$d_{r+1}/d_r > c_{r+1}/c_r$. This, in particular, will imply that $d_{n-1} > c_{n-1}$, which will
give us the desired contradiction as $d_{n-1} = c_{n-1} = 1$.
The base case ($r = 1$) clearly holds; by our assumption,
$d_2/d_1 = d_2 > c_2 = c_2/c_1$ and so both inequalities hold. For the inductive step,
assume that both $d_{s+1} > c_{s+1}$ and $d_{s+1}/d_s > c_{s+1}/c_s$ hold for all
$1 \leq s \leq r \leq n - 3$. In this case, we can use (2.6) to get that

\[
\begin{aligned}
\frac{d_{r+2}}{d_{r+1}} &= \frac{d_{r+1} - 1/d_{r+1}}{d_r} = \frac{d_{r+1}}{d_r} - \frac{1}{d_{r+1} d_r} \\
&> \frac{c_{r+1}}{c_r} - \frac{1}{c_{r+1} c_r} = \frac{c_{r+2}}{c_{r+1}}.
\end{aligned}
\]

But this also immediately gives that $d_{r+2} > c_{r+2}$. This finishes the inductive proof
and gives the desired contradiction. It follows that the sequence $(c_0, c_1, \ldots, c_n)$ is
defined uniquely.
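We will now show that the sequence $c_i = \sin(\pi i/n)/\sin(\pi/n)$, $i \in [n] \cup \{0\}$,
satisfies the desired property, and so this is the only solution to our problem.
Indeed, note that $c_0 = 0$, $c_1 = 1$, and for all non-negative integers $p, r, s, t$ such
that $p + r + s + t = n$, we have that

\[
\begin{aligned}
c_p c_s + c_r c_t
&= \frac{\sin(\pi p/n)\sin(\pi s/n) + \sin(\pi r/n)\sin(\pi t/n)}{\sin^2(\pi/n)} \\
&= \frac{\cos\left(\frac{\pi(p-s)}{n}\right) - \cos\left(\frac{\pi(p+s)}{n}\right) + \cos\left(\frac{\pi(r-t)}{n}\right) - \cos\left(\frac{\pi(r+t)}{n}\right)}{2\sin^2(\pi/n)} \\
&= \frac{\cos\left(\frac{\pi(p-s)}{n}\right) + \cos\left(\frac{\pi(r-t)}{n}\right)}{2\sin^2(\pi/n)},
\end{aligned}
\]

since $\cos(\pi(r + t)/n) = \cos(\pi - \pi(p + s)/n) = -\cos(\pi(p + s)/n)$. We continue
in the following way, noting that $t = n - p - r - s$:

\[
\begin{aligned}
c_p c_s + c_r c_t
&= \frac{\cos\left(\frac{\pi(p - s + r - (n - p - s - r))}{2n}\right) \cos\left(\frac{\pi(p - s - r + (n - p - s - r))}{2n}\right)}{\sin^2(\pi/n)} \\
&= \frac{\cos\left(\frac{\pi(2p + 2r - n)}{2n}\right) \cos\left(\frac{\pi(-2s - 2r + n)}{2n}\right)}{\sin^2(\pi/n)} \\
&= \frac{\sin\left(\frac{\pi(p + r)}{n}\right) \sin\left(\frac{\pi(s + r)}{n}\right)}{\sin^2(\pi/n)} = c_{p+r} c_{s+r}.
\end{aligned}
\]

This shows the desired equality, thus completing the proof.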
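
The identity can also be confirmed numerically for small $n$. The sketch below is a floating-point check with an ad hoc tolerance, and the function names are ours. It also compares $c_2$ with $2\cos(\pi/n)$, the closed form that the double-angle identity $\sin(2x) = 2\sin(x)\cos(x)$ gives for $\sin(2\pi/n)/\sin(\pi/n)$.

```python
import math

def sine_sequence(n):
    """The sequence c_i = sin(pi*i/n) / sin(pi/n) for i = 0, ..., n."""
    return [math.sin(math.pi * i / n) / math.sin(math.pi / n)
            for i in range(n + 1)]

def max_violation(n):
    """Largest |c_p c_s + c_r c_t - c_{p+r} c_{r+s}| over p + r + s + t = n."""
    c = sine_sequence(n)
    worst = 0.0
    for p in range(n + 1):
        for r in range(n + 1 - p):
            for s in range(n + 1 - p - r):
                t = n - p - r - s
                gap = abs(c[p]*c[s] + c[r]*c[t] - c[p + r]*c[r + s])
                worst = max(worst, gap)
    return worst

for n in range(2, 13):
    c2_gap = abs(sine_sequence(n)[2] - 2 * math.cos(math.pi / n))
    print(n, max_violation(n) < 1e-9, c2_gap < 1e-12)
```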

REMARKS

Let us note that the solution of our problem is clearly divided into two separate
steps: the proof of the uniqueness of the solution and then finding the actual
sequence that satisfies the desired property. The main difficulty is the second part,
and initially it is not clear how to guess a possible solution. From the first part, we
have learnt that the sequence starts from $c_0 = 0$, then increases to $c_1 = 1$, and
then $c_2 \geq \sqrt{2}$. However, at some point it must start decreasing as it is symmetric:
$c_p = c_{n-p}$ for any $0 \leq p \leq n$. It is natural to guess that the sequence is first
increasing, and then after reaching a “turning point” the monotonicity of the
sequence changes. However, at this point it is only a conjecture. Now, being
aware that trigonometric functions are possible substitutions, the natural guess is
to use $c_i = \sin(\pi i/n)$ as it satisfies the requirements and is also non-negative for
each $i \in [n] \cup \{0\}$. However, since $c_1 = 1$, we have to normalize it by dividing
by $\sin(\pi/n)$. All that is left now is to check that this guess satisfies the original
equation, which we leave for the reader.

EXERCISES

2.4.1. Let $n \geq 2$ be any natural number. Find the number of sequences
$(x_1, x_2, \ldots, x_n)$ of non-negative real variables that satisfy the following system of
equations: for $i \in [n]$

\[
x_{i+1} + x_i^2 = 4x_i,
\]

where $x_{n+1} = x_1$.

(Source of the problem and solution: PLMO LI – Phase 3 – Problem 1.)

2.4.2. Let $n \in \mathbb{N}$. Find all solutions of the equation

\[
|\tan^n(x) - \cot^n(x)| = 2n |\cot(2x)|.
\]

(Source of the problem and solution: PLMO XLIX – Phase 1 – Problem 5,
slightly modified.)

2.4.3. For a given $a \in \mathbb{R}$, let us recursively define the following sequence:
$x_0 = \sqrt{3}$ and for all non-negative integers $n$,

\[
x_{n+1} = \frac{1 + a x_n}{a - x_n}.
\]

Find all values of $a$ for which the sequence has a period equal to 8.
(Source of the problem and solution: PLMO LVI – Phase 1 – Problem 9.)

2.5 Number of Solutions


SOURCE

Problem and solution approach: PLMO LXII – Phase 3 – Problem 3

PROBLEM
Suppose that n is an odd natural number. Find the number of real solutions of the
following system of equations:

\[
\begin{cases}
x_1(x_1 + 1) = x_2(x_2 - 1)\\
x_2(x_2 + 1) = x_3(x_3 - 1)\\
\qquad \vdots\\
x_{n-1}(x_{n-1} + 1) = x_n(x_n - 1)\\
x_n(x_n + 1) = x_1(x_1 - 1).
\end{cases}
\]

THEORY

There are many interesting problems in which one is asked to count the number of
solutions to a given system of equations without actually finding any solution. As
mentioned at the beginning of this chapter, one can distinguish four cases and
each of them usually requires a different approach: there is no solution, there is
exactly one solution, there are many solutions but a finite number of them, and
there are infinitely many solutions. When more than one solution is expected, but
finitely many, then typically the proof strategy requires the following three steps:

• deriving the rule for candidate solutions (that is, finding all
solutions that meet the desired conditions of the problem);

• showing that any two candidate solutions are distinct;

• counting the candidate solutions.

For the last step, it is often useful to use the powerful technique called double
counting that we discuss in detail in Section 4.3. The idea is to construct a
bijection from the set of solutions to some other set that is easier to count.
In order to illustrate the double counting technique, let us consider the
following problem. Suppose that $k$ and $n$ are natural numbers such that $k \leq n$.
Our goal is to compute the number of solutions of the equation $\sum_{i=1}^{k} x_i = n$,
provided that $x_i \in \mathbb{N}$ for each $i \in [k]$. We will show that there are $\binom{n-1}{k-1}$ distinct
solutions. In order to see this, let us consider any solution $(x_1, x_2, \ldots, x_k)$ of the
equation $\sum_{i=1}^{k} x_i = n$. For any $j \in [k]$, let us define $y_j := \sum_{i=1}^{j} x_i$. It is clear that
$(y_1, y_2, \ldots, y_k)$ is an increasing sequence of natural numbers and $y_k = n$. More
importantly, each increasing sequence $(y_1, y_2, \ldots, y_{k-1})$ of natural numbers,
whose terms are bounded above by $n - 1$, uniquely yields one solution. In other
words, there is a bijection between the set of solutions we want to count and the set
of increasing sequences of $k - 1$ natural numbers that are at most $n - 1$. The former
one seems difficult to count but the latter one is easy to count as each increasing
sequence is uniquely defined by a $(k - 1)$-element subset of the set $[n - 1]$.
Since there are $\binom{n-1}{k-1}$ subsets of size $k - 1$ selected from the set of $n - 1$
elements, the proof is finished.
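
For small parameters the formula can be compared against plain enumeration. The following sketch is a brute-force check; the function name `count_compositions` and the bound $n \leq 6$ are our own choices.

```python
from itertools import product
from math import comb

def count_compositions(n, k):
    """Count k-tuples of positive integers summing to n by brute force."""
    return sum(
        1 for xs in product(range(1, n + 1), repeat=k) if sum(xs) == n
    )

# Compare with the binomial coefficient C(n-1, k-1) for all 1 <= k <= n <= 6.
matches = all(
    count_compositions(n, k) == comb(n - 1, k - 1)
    for n in range(1, 7) for k in range(1, n + 1)
)
print(matches, count_compositions(5, 3))
```

For example, $n = 5$ and $k = 3$ gives $\binom{4}{2} = 6$ solutions: the three permutations of $(1, 1, 3)$ and the three permutations of $(1, 2, 2)$.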


Another standard and well-known counting problem is to count the number of
ways to make change for a dollar. Our goal is to make change for $1 using ¢1,
¢5, ¢10, and ¢25 coins. There are many ways to solve this problem, including the
one using generating functions which we discuss in Section 4.9. Here, we use a
longer but elementary approach. Let $x^1_n$ be the number of ways one can make
change of $n$ cents using only ¢1 coins. Let $x^2_n$ be the number of ways one can
make change of $n$ cents using only ¢1 and ¢5 coins. Let $x^3_n$ be the number of ways
one can make change of $n$ cents using only ¢1, ¢5, and ¢10 coins. Finally, let $x^4_n$
be the number of ways one can make change of $n$ cents using ¢1, ¢5, ¢10 and ¢25
coins. Our goal is to find $x^4_{100}$.
Clearly, $x^1_n = 1$ for each non-negative integer $n$ (including $x^1_0 = 1$, which holds
trivially: there is exactly one way to make change of 0 cents, namely, do
nothing). Moreover, we have $x^2_n = \lfloor n/5 \rfloor + 1$, as at most $\lfloor n/5 \rfloor$ ¢5 coins can be
dispensed and so there are $\lfloor n/5 \rfloor + 1$ ways to do it (including not using ¢5 coins
at all). The remaining amount is uniquely dealt with using ¢1 coins.
We are now ready to compute $x^4_{100}$. We first observe that

\[
x^4_{100} = x^3_{100} + x^3_{75} + x^3_{50} + x^3_{25} + 1,
\]

by independently considering the 5 possible numbers of ¢25 coins. Similarly, by considering
¢10 coins, we get that

\[
\begin{aligned}
x^3_{25} &= x^2_{25} + x^2_{15} + x^2_{5} = 6 + 4 + 2 = 12 \\
x^3_{50} &= x^2_{50} + x^2_{40} + x^2_{30} + x^2_{20} + x^2_{10} + x^2_{0} \\
&= 11 + 9 + 7 + 5 + 3 + 1 = 36 \\
x^3_{75} &= x^2_{75} + x^2_{65} + x^2_{55} + x^2_{45} + x^2_{35} + x^3_{25} \\
&= 16 + 14 + 12 + 10 + 8 + 12 = 72 \\
x^3_{100} &= x^2_{100} + x^2_{90} + x^2_{80} + x^2_{70} + x^2_{60} + x^3_{50} \\
&= 21 + 19 + 17 + 15 + 13 + 36 = 121.
\end{aligned}
\]

It follows that $x^4_{100} = 121 + 72 + 36 + 12 + 1 = 242$.
Finally, let us mention that Larry King said in his USA Today column that
there are 293 ways to make change for a dollar. Why did he get a different
number? He included ¢50 coins! Though not commonly used today, half-dollar
coins have a long history of heavy use alongside other denominations of coinage,
but have faded out of general circulation for many reasons. With this additional
coin available, the number of ways to make change for a dolar is
4 4
x100 + x50 + 1 = 242 + 49 + 1 = 292 ,

since
4 3 3
x50 = x50 + x25 + 1 = 12 + 36 + 1 = 49 .

We are still missing one way! The reason for that is Larry King included also a
dollar coin which seems controversial. For example, Walter Wright said that a
dollar coin cannot be considered change for a dollar bill, arguing after Webster’s
New World Dictionary that de nes change as “a number of coins or bills whose
total value equals a single larger coin or bill.”
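The three counts discussed here (242, 292, and 293) can be re-checked with a few lines of Julia. The sketch below is our own illustration and uses a standard dynamic-programming table rather than the hand computation in the text; the function name `count_change` is hypothetical.

```julia
# Number of ways to make change of `amount` cents using the given coin values.
function count_change(amount::Int, coins::Vector{Int})
    ways = zeros(Int, amount + 1)   # ways[a + 1] = number of ways to pay a cents
    ways[1] = 1                     # exactly one way to pay 0 cents: use no coins
    for c in coins                  # allow one more denomination at a time
        for a in c:amount
            ways[a + 1] += ways[a - c + 1]
        end
    end
    return ways[amount + 1]
end

println(count_change(100, [1, 5, 10, 25]))           # 242
println(count_change(100, [1, 5, 10, 25, 50]))       # 292
println(count_change(100, [1, 5, 10, 25, 50, 100]))  # 293
```

Processing one denomination at a time counts each multiset of coins once, which matches the way the quantities $x_n^1, \ldots, x_n^4$ are built up.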
SOLUTION

Let $n$ be any odd natural number. For convenience, let us use the convention that $x_{n+1} = x_1$. Then, for each $i \in [n]$ we can re-write the equation $x_i(x_i + 1) = x_{i+1}(x_{i+1} - 1)$ as follows: $(x_{i+1} + x_i)(x_{i+1} - x_i - 1) = 0$. It follows that $x_{i+1} = -x_i$ or $x_{i+1} = x_i + 1$. We may set $s_i = -1$ if $x_{i+1} = -x_i$ and set $s_i = 1$ if $x_{i+1} = x_i + 1$. As a result, the sequence $x := (x_1, x_2, \ldots, x_n, x_{n+1})$ can be uniquely determined by the first term, $x_1$, and the sequence $s := (s_1, s_2, \ldots, s_n)$. Indeed, since $x_{i+1} = s_i x_i + (s_i + 1)/2$ for each $i \in [n]$, we get that for each $\ell \in [n]$

$$x_{\ell+1} = x_1 \prod_{i=1}^{\ell} s_i + \sum_{i=1}^{\ell} \frac{s_i + 1}{2} \prod_{j=i+1}^{\ell} s_j.$$

There is clearly an infinite number of sequences $x$ of this form but not all of them satisfy our system as we also require that $x_1 = x_{n+1}$, that is,

$$x_1 = x_{n+1} = x_1 \prod_{i=1}^{n} s_i + \sum_{i=1}^{n} \frac{s_i + 1}{2} \prod_{j=i+1}^{n} s_j.$$

(2.7)
Recall that $n$ is odd. We will show that the number of terms in $s$ that are equal to 1 is even. For a contradiction, suppose that it is not true, that is, the number of terms that are equal to 1 is odd and so (as $n$ is odd) the number of terms that are equal to $-1$ is even. In particular, $\prod_{i=1}^{n} s_i = 1$ and so (2.7) reduces to

$$0 = \sum_{i=1}^{n} \frac{s_i + 1}{2} \prod_{j=i+1}^{n} s_j.$$

However, since the number of terms for which $(s_i + 1)/2 = 1$ is odd and $(s_i + 1)/2 = 0$ otherwise, we deduce that the right hand side of the above equality is the sum of an odd number of terms, each of them from $\{-1, 1\}$, and so it is not equal to zero. We get the desired contradiction and so, indeed, the number of terms in $s$ that are equal to 1 is even. Since then the number of terms equal to $-1$ is odd, we have $\prod_{i=1}^{n} s_i = -1$, and the required condition (2.7) implies that

$$x_1 = \sum_{i=1}^{n} \frac{s_i + 1}{4} \prod_{j=i+1}^{n} s_j,$$

and so $x_1$ is uniquely determined by the sequence $s$ (and so is the whole sequence $x$). In fact, $x_1$ is the sum of an even number of non-zero terms, each of which is equal to $1/2$ or $-1/2$, so $x_1$ is an integer (and so are all the entries of $x$).
Our next task is to show that different sequences $s$ with an even number of 1’s yield different sequences $x$. For a contradiction, suppose that two different sequences $s = (s_1, s_2, \ldots, s_n)$ and $s' = (s'_1, s'_2, \ldots, s'_n)$ yield the same sequence $x = (x_1, x_2, \ldots, x_n)$. Since $s$ and $s'$ are different, let $i \in [n]$ be such that $s_i \ne s'_i$. Without loss of generality, we may assume that $s_i = 1$ and $s'_i = -1$. On the one hand, we get that $x_{i+1} = s_i x_i + (s_i + 1)/2 = x_i + 1$. On the other hand, $x_{i+1} = s'_i x_i + (s'_i + 1)/2 = -x_i$. It follows that $x_i + 1 = -x_i$ and so $x_i = -1/2 \notin \mathbb{Z}$. We get the desired contradiction (since $x_i \in \mathbb{Z}$ for all $i \in [n]$) and so there is a bijection between the sequences with an even number of 1’s and the set of solutions of our system.
It remains to count the number of such sequences $s$. We note that in order to generate a sequence of 1’s and $-1$’s with an even number of 1’s, one can take any sequence $(s_1, s_2, \ldots, s_{n-1})$ of 1’s and $-1$’s of length $n - 1$ (there are clearly $2^{n-1}$ of them) and then the last term, $s_n$, is uniquely determined (in order to keep the number of 1’s even). We conclude that there are $2^{n-1}$ solutions to the system of equations we deal with.
REMARKS
In the problem we deal with in this section, the key idea was to introduce the sequence $s_i$ and write down an explicit solution to the recurrence relation for $x_i$. The sequence $s_i$ is often called a control sequence of the sequence $x_i$.
The solution we have found is relatively complex. The idea is quite straightforward but there are many places and calculations where one can easily make a mistake. In such cases, to be on the safe side, it is strongly recommended to re-check some cases manually, or use a computer to re-compute the number of solutions for some instances. Here is a program written in Julia that allows us to re-check the solution.

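function find_sequence(s)
    # x1 is determined by s through the closed-form expression derived from (2.7)
    x1 = sum((s[i] + 1) / 4 * prod(s[i+1:end]) for i in 1:length(s))
    x = Int[x1]
    # rebuild the remaining terms using x_{i+1} = s_i * x_i + (s_i + 1) / 2
    for si in s
        push!(x, si * x[end] + (si + 1) / 2)
    end
    println(x)
end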
Now, we can test our solution for all scenarios, provided n = 3:

julia> find_sequence([1, 1, -1])
[-1, 0, 1, -1]
julia> find_sequence([1, -1, 1])
[0, 1, -1, 0]
julia> find_sequence([-1, 1, 1])
[1, -1, 0, 1]
julia> find_sequence([-1, -1, -1])
[0, 0, 0, 0]
And let us check just one more solution, this time for n = 7:

julia> find_sequence([1, 1, 1, 1, -1, 1, 1])
[-1, 0, 1, 2, 3, -3, -2, -1]
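The checks so far confirm that individual control sequences produce valid solutions. To re-check the count itself, one can also enumerate integer candidates directly. The sketch below is our own addition (the function name `count_solutions` and the search window `-bound:bound` are assumptions made for illustration); it relies on the fact, shown in the solution, that all solutions are integers.

```julia
# Count integer solutions of x_i (x_i + 1) = x_{i+1} (x_{i+1} - 1) for i in [n],
# with x_{n+1} = x_1, where every x_i is restricted to the window -bound:bound.
function count_solutions(n::Int, bound::Int)
    total = 0
    for x in Iterators.product(ntuple(_ -> -bound:bound, n)...)
        # mod1 wraps the index n + 1 back to 1, which encodes x_{n+1} = x_1
        ok = all(x[i] * (x[i] + 1) == x[mod1(i + 1, n)] * (x[mod1(i + 1, n)] - 1) for i in 1:n)
        total += ok
    end
    return total
end

println(count_solutions(3, 3))  # 4  = 2^(3-1)
println(count_solutions(5, 3))  # 16 = 2^(5-1)
```

The window must be wide enough to contain every solution; for $n = 3$ and $n = 5$ the closed-form expression for $x_1$ shows that all entries lie between $-2$ and $2$, so `bound = 3` suffices.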
EXERCISES
2.5.1. For a given $a \in \mathbb{R}$, consider the following system of equations:

$$\begin{cases}
x^2 + y^2 + z = a \\
x^2 + y + z^2 = a \\
x + y^2 + z^2 = a.
\end{cases}$$

Find the number of real solutions $(x, y, z)$ of this system as a function of $a$.


(Source of the problem: PLMO XLVIII – Phase 2 – Problem 1. Solution: our
own.)

2.5.2. Solve the following equation, given that all variables involved are positive real numbers:

$$(x^{2010} - 1)(y^{2009} - 1) = (x^{2009} - 1)(y^{2010} - 1).$$

(Source of the problem and solution: PLMO LXI – Phase 1 – Problem 1.)

2.5.3. Fix an integer $n \ge 2$, and consider the following system of $n$ equations: for $i \in [n]$

$$x_{i+1}^2 + x_i^2 + 50 = 12x_{i+1} + 16x_i.$$

(As usual, we use the convention that $x_{n+1} = x_1$.) Find the number of solutions of this system, given that all the variables involved are integers.
(Source of the problem and solution: PLMO L – Phase 3 – Problem 4.)

2.6 Sequence Invariants


SOURCE

Problem and solution approach: PLMO LVIII – Phase 3 – Problem 6


PROBLEM

Let $a_0 = -1$ and for each $n \in \mathbb{N}$ let

$$a_n = -\sum_{i=0}^{n-1} \frac{a_i}{n + 1 - i}.$$

Show that $a_n > 0$ for all $n \in \mathbb{N}$.


THEORY
An invariant of a sequence $s = (s_i)_{i \ge 1}$ is a statement or predicate, denoted $p = p(s_i)$, such that $p(s_i)$ is true for all $i \in \mathbb{N}$. In our case we are asked to prove the invariant $a_i > 0$ for all $i \in \mathbb{N}$. The usual technique of proving invariants is via induction.

Perhaps surprisingly, it is quite common that proving a stronger invariant than the one originally suggested is simpler. Let us consider an example. The following sequence is defined recursively: $x_1 = 1$ and for each $n \in \mathbb{N}$ we have

$$x_{n+1} = x_n - \frac{1}{n(n + 1)}.$$

Our goal is to prove that $x_n > 0$ for all $n$.

In this example, it is not enough to use the fact that $x_n > 0$ to prove that $x_{n+1} > 0$. Hence, this natural strategy will simply not work. On the other hand, a stronger property, namely that $x_n = 1/n$ for all $n$, is easy to prove by induction. Indeed, the base case clearly holds: $x_1 = 1 = 1/1$. For the inductive step, from the fact that $x_n = 1/n$ we easily get that $x_{n+1} = x_n - 1/(n(n + 1)) = 1/n - 1/(n(n + 1)) = 1/(n + 1)$. The stronger property holds by induction; in particular, all the terms of the sequence are positive.
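The stronger invariant is also easy to re-check numerically. The following Julia sketch is our own illustration (the function name `check_invariant` is hypothetical); it uses exact rational arithmetic so that no rounding error can hide a discrepancy.

```julia
# Exact check of the stronger invariant x_n = 1/n for the recursion
# x_1 = 1, x_{n+1} = x_n - 1/(n(n+1)), using rational numbers.
function check_invariant(N::Int)
    x = 1 // 1                       # x_1
    for n in 1:N
        x == 1 // n || return false  # the invariant fails at index n
        x -= 1 // (n * (n + 1))      # move on to x_{n+1}
    end
    return true
end

println(check_invariant(1000))  # true
```

Such a check does not replace the induction proof, but it is a quick way to test a conjectured closed form before trying to prove it.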

Squeeze Theorem Let us mention one more useful tool that will be needed to solve one of the exercises. The squeeze theorem, also known as the sandwich theorem, is a theorem regarding the limit of a function or a sequence. It is used to confirm the limit of a function via comparison with two other functions whose limits are known or can be easily computed.
Let $I$ be a set having the point $a$ as a limit point, that is, there exists a sequence of elements of $I$, all different from $a$, which converges to $a$. Let $f$, $g$, and $h$ be functions defined on $I$, except possibly at $a$ itself. Suppose that for every $x$ in $I$ not equal to $a$, we have

$$g(x) \le f(x) \le h(x)$$

and also suppose that

$$\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L.$$

In this case, $\lim_{x \to a} f(x) = L$. Let us note that $a$ is not required to be an element of $I$. Indeed, if $a$ is an endpoint of an open interval $I$, then the above limits are left-hand or right-hand limits. A similar statement holds for unbounded sets; for example, if $I = (0, \infty)$, then the conclusion holds taking the limits as $x \to \infty$.
As already mentioned, this theorem is also valid for sequences, which corresponds to the case with $I = \mathbb{N}$ and $a = \infty$. Let $(x_n)_{n \in \mathbb{N}}$ and $(z_n)_{n \in \mathbb{N}}$ be two sequences converging to $L$. Suppose that $(y_n)_{n \in \mathbb{N}}$ is a sequence that satisfies the following property:

$$x_n \le y_n \le z_n$$

for all $n \ge N$, where $N \in \mathbb{N}$. Then, $(y_n)_{n \in \mathbb{N}}$ also converges to $L$.

Finally, let us mention that in many languages (for example, French, German, Italian and Russian), the squeeze theorem is also known as the two policemen and a drunk theorem. The story is that if two policemen are escorting a drunk prisoner between them, and both officers go to a cell, then (regardless of the path taken, and the fact that the prisoner may be wobbling about between the policemen) the prisoner must also end up in the cell.
SOLUTION

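We will prove the desired property by (strong) induction on $n$. The base case is easy to verify: $a_1 = 1/2 > 0$. For the inductive step, suppose that $a_i > 0$ for all $i \in [n]$, for some $n \in \mathbb{N}$. Our goal is to show that $a_{n+1} > 0$. Since $a_0 = -1$, we get that

$$a_{n+1} = -\sum_{i=0}^{n} \frac{a_i}{n + 2 - i} = \frac{1}{n + 2} - \sum_{i=1}^{n-1} \frac{a_i}{n + 2 - i} - \frac{a_n}{2}.$$

Since $a_i > 0$ for each $i \in [n-1]$ and

$$\frac{n + 2}{n + 1} \cdot \frac{1}{n + 2 - i} \le \frac{1}{n + 1 - i},$$

it follows that

$$a_{n+1} \ge \frac{1}{n + 2} - \frac{n + 1}{n + 2} \sum_{i=1}^{n-1} \frac{a_i}{n + 1 - i} - \frac{a_n}{2}.$$

Since $a_0 = -1$, we get that

$$\sum_{i=1}^{n-1} \frac{a_i}{n + 1 - i} = \frac{1}{n + 1} + \sum_{i=0}^{n-1} \frac{a_i}{n + 1 - i} = \frac{1}{n + 1} - a_n,$$

and so

$$a_{n+1} \ge \frac{1}{n + 2} - \frac{n + 1}{n + 2}\left(\frac{1}{n + 1} - a_n\right) - \frac{a_n}{2} = a_n\left(\frac{n + 1}{n + 2} - \frac{1}{2}\right) > 0.$$

The proof by induction is finished.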
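As with the previous problem, it is reassuring to re-compute the first few terms. The Julia sketch below is our own addition (the function name `sequence_a` is hypothetical); it evaluates the recursion exactly with rational numbers and checks positivity for $n \le 30$.

```julia
# Compute a_0, ..., a_N exactly from a_0 = -1 and
# a_n = -sum_{i=0}^{n-1} a_i / (n + 1 - i).
function sequence_a(N::Int)
    a = Rational{BigInt}[-1]          # a[i + 1] stores a_i
    for n in 1:N
        push!(a, -sum(a[i + 1] // (n + 1 - i) for i in 0:(n - 1)))
    end
    return a
end

a = sequence_a(30)
println(a[2])                              # a_1 = 1//2
println(all(a[n + 1] > 0 for n in 1:30))   # true
```

`BigInt` is used only as a precaution, since the denominators of the exact terms may grow quickly.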

REMARKS

The key idea in the proof was to try to replace the sum involving all $a_i$ by a formula that involves only the previous term in the sequence. Such an approach is possible as the previous term was also defined in terms of the same terms $a_i$ (except one).
In our problem, in order to use this approach we had to transform $\frac{1}{n+2-i}$ into $\frac{1}{n+1-i}$ for $i \ge 1$ in a way that does not depend on $i$. To achieve this, we investigated the following ratio

$$\frac{1/(n + 2 - i)}{1/(n + 1 - i)} = \frac{n + 1 - i}{n + 2 - i} \le \frac{n + 1}{n + 2},$$

and we checked that this universal bound is enough to derive the required claim.

EXERCISES

2.6.1. Let $x_1$ be any positive real number, and for each $n \in \mathbb{N}$ let

$$x_{n+1} = x_n + \frac{1}{x_n^2}.$$

Prove that $x_n / \sqrt[3]{n}$ has a limit and then find it.

(Source of the problem: PLMO L – Phase 1 – Problem 10. Solution: our own.)

2.6.2. Consider the following sequence defined recursively: $a_1 = 4$ and for each $n \in \mathbb{N}$, let $a_{n+1} = a_n(a_n - 1)$. Moreover, for each $n \in \mathbb{N}$, let $b_n = \log_2(a_n)$ and $c_n = n - \log_2(b_n)$. Prove that $c_n$ is bounded.

(Source of the problem and solution: PLMO XLIX – Phase 1 – Problem 3.)

2.6.3. You are given two numbers $a, b \in \mathbb{R}$. Let $x_1 = a$, $x_2 = b$, and for each $n \in \mathbb{N}$ let $x_{n+2} = x_{n+1} + x_n$. Show that there exist $a, b \in \mathbb{R}$, $a \ne b$, for which there are at least 2,000 distinct pairs $(k, \ell)$, $k < \ell$, such that $x_k = x_\ell$. On the other hand, the number of such pairs is finite even if $a = b$, unless $a = b = 0$.

(Source of the problem: PLMO LII – Phase 3 – Problem 3, slightly modified. Solution: our own.)
2.7 Solving Sequences
SOURCE

Problem and solution idea: PLMO LXI – Phase 3 – Problem 6

PROBLEM

Let $C_0 > 1$ be a real number. Suppose that a sequence $(a_n)_{n \in \mathbb{N}}$ of positive real numbers satisfies the following properties: $a_1 = 1$, $a_2 = 2$, and for each $m, n \in \mathbb{N}$, we have that

$$a_{mn} = a_m a_n \quad \text{and} \quad a_{m+n} \le C_0(a_m + a_n).$$

Prove that for each $n \in \mathbb{N}$, $a_n = n$.


THEORY

In this section, we concentrate on the following important types of sets, open sets
and closed sets. In general, such sets can be very abstract but in practice, open
sets are usually chosen to be similar to the open intervals of the real line. We will
restrict ourselves to Euclidean space.
Euclidean Space Euclidean space is the fundamental space of geometry. Originally, this was the three-dimensional space but in modern mathematics there are Euclidean spaces of any dimension that is a natural number, including the three-dimensional space, the two-dimensional Euclidean plane, and the one-dimensional real line. One way to think of the Euclidean space is as a set of points satisfying certain relationships, expressible in terms of distance and angles. In Cartesian coordinates, if $p = (p_1, p_2, \ldots, p_n) \in \mathbb{R}^n$ and $q = (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n$ are two points in Euclidean $n$-space, then the distance from $p$ to $q$ (or from $q$ to $p$) is given by the Pythagorean formula:

$$d(p, q) = d(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}.$$

The following definitions will play an important role. An (open) $n$-ball of radius $r \in \mathbb{R}_+$ and centered at a point $p \in \mathbb{R}^n$, usually denoted by $B_r(p)$, is the set of all points at distance less than $r$ from $p$. That is,

$$B_r(p) := \{x \in \mathbb{R}^n \mid d(x, p) < r\}.$$

A closed $n$-ball of radius $r \in \mathbb{R}_+ \cup \{0\}$, which is denoted by $B_r[p]$, is the set of all points at distance less than or equal to $r$ away from $p$. In other words,

$$B_r[p] := \{x \in \mathbb{R}^n \mid d(x, p) \le r\}.$$

Note that a ball (open or closed) always includes $p$ itself. A subset of some space is bounded if it is contained in some ball.

Open and Closed Sets An open set is an abstract concept generalizing the idea of an open interval in the real line. One of the simplest examples are sets which contain a ball around each of their points but, as already mentioned above, an open set, in general, can be very abstract: any collection of sets can be called open, as long as the union of an arbitrary number of open sets is open, the intersection of a finite number of open sets is open, and the space itself is open. These conditions are very loose, and they allow enormous flexibility in the choice of open sets.
We restrict ourselves to Euclidean spaces. A subset $S$ of the Euclidean $n$-space $\mathbb{R}^n$ is called open if for any point $x \in S$, there exists a real number $\varepsilon = \varepsilon(x) > 0$ such that $B_\varepsilon(x) \subseteq S$. One can show that the three desired properties are satisfied: a) the union of any number of open sets (finitely or infinitely many) is open, b) the intersection of a finite number of open sets is open, and c) $\mathbb{R}^n$ itself is open. On the other hand, note that infinite intersections of open sets need not be open. For example, the intersection of all intervals of the form $(-1/n, 1/n)$, where $n \in \mathbb{N}$, is the set $\{0\}$ which is not open in the real line $\mathbb{R}^1$.
The closure of a set $S$ consists of all points in $S$ together with all limit points of $S$. Formally, for a given set $S \subseteq \mathbb{R}^n$, $x$ is a point of closure of $S$ if every open ball centered at $x$ contains a point of $S$ (this point may be $x$ itself), that is, for all $\varepsilon > 0$, $B_\varepsilon(x) \cap S \ne \emptyset$. The definition of a point of closure is closely related to the definition of a limit point. The difference between the two definitions is subtle but important, namely, in the definition of limit point, every neighborhood of the point $x$ in question must contain a point of the set other than $x$ itself. As a result, every limit point is a point of closure, but not every point of closure is a limit point. Finally, the boundary of a set $S$ is the set of points which can be approached both from $S$ and from the outside of $S$.
In general, a closed set is a set whose complement is an open set. However, there are other equivalent definitions that can be applied to Euclidean spaces. A set is closed if and only if it coincides with its closure. Equivalently, a set is closed if and only if it contains all of its limit points. Yet another equivalent definition is that a set is closed if and only if it contains all of its boundary points. Any intersection of closed sets is closed (including intersections of infinitely many closed sets). The union of finitely many closed sets is closed.
Note that both the empty set and the whole Euclidean space are both open and closed. Let us also mention that a set is connected if it cannot be represented as the union of two or more disjoint non-empty open sets.

Partially Ordered Sets In order to be able to introduce the definitions of infimum and supremum, we need to formalize the intuitive concept of an ordering of the elements of a given set. It will allow us to say that, for certain pairs of elements in the set, one of the elements precedes the other in the ordering. Note that the relation is called a “partial order” which immediately indicates that not every pair of elements needs to be comparable, that is, there may be pairs of elements for which neither element precedes the other in the partially ordered set. Partial orders thus generalize total orders, in which every pair is comparable.
Formally, let $R$ be any binary relation, that is, a relation from some set $S$ to the same set $S$. Such relations can be represented as a subset of $S \times S$. Indeed, element $x \in S$ is related to element $y \in S$ if and only if $(x, y) \in R$. A partial order is any binary relation $R \subseteq S \times S$ that satisfies the following three properties (the set $S$ together with such a relation is called a partially ordered set, or poset):

$R$ is reflexive: for any $x \in S$, $x$ is related to $x$ (each element is comparable to itself),

$R$ is antisymmetric: for any $x, y \in S$, if $x$ is related to $y$ and $y$ is related to $x$, then $x = y$ (no two different elements precede each other),

$R$ is transitive: for any $x, y, z \in S$, if $x$ is related to $y$ and $y$ is related to $z$, then $x$ is related to $z$ (the start of a chain of precedence relations must precede the end of the chain).

There are many interesting and important partial orders but here we only focus on relations on real numbers or vectors of real numbers. Let us first observe that the real numbers ordered by the standard less-than-or-equal relation $\le$ form a partially ordered set. (In fact, it is a totally ordered set as for any $x, y \in \mathbb{R}$, $x \le y$ or $y \le x$.) Indeed, $x \le x$ for any $x \in \mathbb{R}$, and so $\le$ is reflexive. Since $x \le y$ and $y \le x$ implies that $x = y$, the relation is also antisymmetric. Finally, if $x \le y$ and $y \le z$, then $x \le z$ and so $\le$ is transitive.
For $x, y \in \mathbb{R}^n$, we usually say that

$$x = (x_1, x_2, \ldots, x_n) \le (y_1, y_2, \ldots, y_n) = y$$

if and only if $x_i \le y_i$ for all $i \in [n]$. We leave it for the reader to check that this relation is a partial order. However, note that it is not a total order! For example, for $x = (1, 2) \in \mathbb{R}^2$ and $y = (2, 1) \in \mathbb{R}^2$, neither $x \le y$ nor $y \le x$.
A set $S \subseteq \mathbb{R}^n$ is said to be bounded from below if there exists $y \in \mathbb{R}^n$ such that $y \le x$ for all $x \in S$. Similarly, a set $S \subseteq \mathbb{R}^n$ is said to be bounded from above if there exists $y \in \mathbb{R}^n$ such that $x \le y$ for all $x \in S$.

Infimum and Supremum The infimum of a set $S \subseteq \mathbb{R}^n$ is the greatest element in $\mathbb{R}^n$ that is less than or equal to all elements of $S$, if such an element exists. In other words, for a given $S \subseteq \mathbb{R}^n$, let $L_S \subseteq \mathbb{R}^n$ be defined as follows: $x \in L_S$ if and only if $x \le y$ for all $y \in S$. If there exists $z \in L_S$ such that $x \le z$ for all $x \in L_S$, then $z$ is the infimum of $S$; otherwise, there is no infimum. Similarly, the supremum of a set $S \subseteq \mathbb{R}^n$ is the least element in the subset of $\mathbb{R}^n$ that contains points that are greater than or equal to all elements of $S$, again, if such an element exists.
The two definitions are symmetric and, indeed, the infimum is in a precise sense dual to the concept of a supremum. As already mentioned, infima and suprema do not necessarily exist. However, if an infimum or supremum does exist, it is unique. Moreover, it follows immediately from the definition that the infimum of $S \subseteq \mathbb{R}^n$ exists if and only if it exists for $S$ along every dimension of the $\mathbb{R}^n$ space. This, in particular, implies that $S$ has to be bounded from below for the infimum to exist. By symmetry, a similar observation holds for the supremum of $S$: the set $S$ has to be bounded from above for the supremum to exist. Finally, let us mention that the two definitions (infimum and supremum) naturally generalize to any partially ordered set, not just the relation $\le$ on $\mathbb{R}^n$.
If an infimum (supremum) of a set $S$ exists and it belongs to $S$, then we call it a minimum (respectively, maximum). The concepts of infimum and supremum are then closely related to minimum and maximum, but are more useful in analysis because they better characterize special sets which may have no minimum or maximum. For instance, the set of positive real numbers $\mathbb{R}_+$ does not have a minimum. Indeed, for any $x \in \mathbb{R}_+$, there exists $y \in \mathbb{R}_+$ such that $y < x$ (say, consider $y := x/2$). On the other hand, the infimum of $\mathbb{R}_+$ exists and is equal to 0. Indeed, it follows from the previous observation together with the fact that $0 \le x$
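for all $x \in \mathbb{R}_+$. Other simple examples are

$$\begin{aligned}
\inf \{x \in \mathbb{R} \mid 0 < x < 1\} &= 0 \\
\inf \{x \in \mathbb{Q} \mid x > 0,\ x^2 > 2\} &= \sqrt{2} \\
\inf \left\{(-1)^n + \tfrac{1}{n} \mid n \in \mathbb{N}\right\} &= -1.
\end{aligned}$$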
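For finite sets of points in $\mathbb{R}^n$ the notions above are easy to experiment with. The following Julia sketch is our own illustration (the helper names `leq`, `infimum`, and `supremum` are hypothetical); it implements the componentwise order and uses the fact that, for a finite non-empty set, the infimum and supremum are the componentwise minimum and maximum.

```julia
# Componentwise partial order on R^n: x <= y iff x_i <= y_i for every coordinate i.
leq(x, y) = all(x .<= y)

# For a finite non-empty set of points, the infimum is the componentwise minimum
# and, dually, the supremum is the componentwise maximum.
infimum(points) = reduce((p, q) -> min.(p, q), points)
supremum(points) = reduce((p, q) -> max.(p, q), points)

x, y = [1, 2], [2, 1]
println(leq(x, y) || leq(y, x))   # false: x and y are incomparable
println(infimum([x, y]))          # [1, 1]
println(supremum([x, y]))         # [2, 2]
```

Note that here neither the infimum nor the supremum belongs to the set $\{x, y\}$, so this set has no minimum and no maximum with respect to the componentwise order.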

SOLUTION

Let $(a_n)_{n \in \mathbb{N}}$ be any sequence of positive numbers that satisfies the desired properties. In particular, recall that there exists $C_0 > 1$ such that the sequence $(a_n)_{n \in \mathbb{N}}$ satisfies $a_{m+n} \le C_0(a_m + a_n)$ for all $m, n \in \mathbb{N}$. Let $S$ be the set of real numbers $C$ (not necessarily greater than 1) for which $a_{m+n} \le C(a_m + a_n)$ for all $m, n \in \mathbb{N}$. Note that $S$ is non-empty, as $C_0 \in S$, and bounded from below, as no $C \le 0$ belongs to $S$. Observe that if $C \in S$, then $C' \in S$ provided that $C' > C$. It follows that there exists $C^* \in \mathbb{R}$ so that either $S = (C^*, \infty)$ ($S$ is an open set) or $S = [C^*, \infty)$ ($S$ is a closed set). We will show that the latter is true, that is, $C^*$, the infimum of $S$, belongs to $S$.
Let $C^*$ be the infimum of $S$. For a contradiction, suppose that $C^*$ does not belong to $S$. It follows that there exist $m, n \in \mathbb{N}$ such that $a_{m+n} > C^*(a_m + a_n)$. However, since the inequality is strict, we get that there exists some $\varepsilon > 0$ such that $a_{m+n} > (C^* + \delta)(a_m + a_n)$ for all $0 \le \delta \le \varepsilon$. Hence no element of $[C^*, C^* + \varepsilon]$ belongs to $S$, which contradicts the fact that $C^*$ is the infimum. It follows that $C^* \in S$.

Let us note that $C^* \ge 1$, as $2 = a_2 \le C^*(a_1 + a_1) = 2C^*$. We will show that, in fact, $C^* = 1$. Let us first observe that for any $n, m \in \mathbb{N}$,

$$\begin{aligned}
a_{m+n}^2 &= a_{(m+n)^2} = a_{m^2 + n^2 + 2mn} \le C^*\left(a_{m^2} + a_{n^2 + 2mn}\right) \\
&\le C^*\left(a_m^2 + C^*\left(a_{n^2} + a_{2mn}\right)\right) = C^*\left(a_m^2 + C^*\left(a_n^2 + a_2 a_m a_n\right)\right) \\
&= C^*\left(a_m^2 + C^*\left(a_n^2 + 2a_m a_n\right)\right).
\end{aligned}$$

Similar arguments give us the following system of inequalities:

$$\begin{cases}
a_{m+n}^2 \le C^*\left(a_m^2 + C^*\left(a_n^2 + 2a_m a_n\right)\right) \\
a_{m+n}^2 \le C^*\left(a_n^2 + C^*\left(a_m^2 + 2a_m a_n\right)\right) \\
a_{m+n}^2 \le C^*\left(2a_m a_n + C^*\left(a_m^2 + a_n^2\right)\right).
\end{cases}$$

After adding the three inequalities together, we get that

$$3a_{m+n}^2 \le \left(C^* + 2(C^*)^2\right)(a_m + a_n)^2.$$
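For completeness, here is the summation step spelled out; this is only a sketch of the routine algebra, and the grouping of terms is ours. Adding the right hand sides of the three inequalities and collecting the terms with $C^*$ and with $(C^*)^2$ gives:

```latex
\begin{align*}
3a_{m+n}^2
&\le C^*\bigl(a_m^2 + a_n^2 + 2a_m a_n\bigr)
   + (C^*)^2\bigl(2a_m^2 + 2a_n^2 + 4a_m a_n\bigr) \\
&= C^*(a_m + a_n)^2 + 2(C^*)^2(a_m + a_n)^2 \\
&= \bigl(C^* + 2(C^*)^2\bigr)(a_m + a_n)^2 .
\end{align*}
```

Each of the three terms $a_m^2$, $a_n^2$, and $2a_m a_n$ appears once with coefficient $C^*$ and twice with coefficient $(C^*)^2$, which is why the three ways of splitting $(m+n)^2$ are used together.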

Since all the terms are positive, after taking a square root of both sides, we get that

$$a_{m+n} \le \sqrt{\frac{2(C^*)^2 + C^*}{3}}\,(a_m + a_n).$$

On the other hand, since $C^*$ is the infimum of $S$, for any $\varepsilon > 0$ we have $C^* - \varepsilon \notin S$ and so there must exist $m, n \in \mathbb{N}$ for which $a_{m+n} > (C^* - \varepsilon)(a_m + a_n)$. Combining this with the above inequality (that holds for any $m, n \in \mathbb{N}$), we get that

$$(C^* - \varepsilon)(a_m + a_n) < a_{m+n} \le \sqrt{\frac{2(C^*)^2 + C^*}{3}}\,(a_m + a_n),$$

or equivalently that

$$C^* - \varepsilon < \sqrt{\frac{2(C^*)^2 + C^*}{3}}.$$

Since this inequality holds for all $\varepsilon > 0$, we conclude that $\sqrt{\frac{2(C^*)^2 + C^*}{3}} \ge C^*$, or equivalently that $2(C^*)^2 + C^* \ge 3(C^*)^2$. It follows that $0 \ge (C^*)^2 - C^* = C^*(C^* - 1)$ and so $C^* \le 1$. Since $C^* \ge 1$, we get that $C^* = 1$ and so, in particular, $a_n \le n$ for all $n \in \mathbb{N}$, which can be proved by induction on $n$.
Let us now note that for each $k \in \mathbb{N}$, $a_{2^k} = (a_2)^k = 2^k$. Using this and the bound we proved above, we get

$$2^k = a_{2^k} \le a_m + a_{2^k - m} \le m + (2^k - m) = 2^k,$$

and so $a_m + a_{2^k - m} = 2^k$ for all $1 \le m \le 2^k - 1$. But it means that for any $1 \le m \le 2^k - 1$, we get

$$2^k = a_m + a_{2^k - m} \le a_m + (2^k - m),$$

and so $a_m \ge m$. The conclusion ($a_m = m$ for all $m \in \mathbb{N}$) follows as $a_m \le m$.

REMARKS

The key idea that leads to the proof is to notice that there exists $C^*$ that is the smallest value of $C$ for which the desired bound for $a_{m+n}$ is met. This follows from the fact that an intersection of closed sets is a closed set. What is this family of closed sets in our case? For any pair $m, n \in \mathbb{N}$ we want to have the bound $a_{m+n} \le C(a_m + a_n)$ which clearly holds for any $C \ge a_{m+n}/(a_m + a_n)$. As a result, the set of values of $C$ for which $a_{m+n} \le C(a_m + a_n)$ for all $m, n \in \mathbb{N}$ is the set

$$S := \bigcap_{m, n \in \mathbb{N}} \left[\frac{a_{m+n}}{a_m + a_n}, \infty\right).$$

It follows that $S$ is closed as it is an intersection of closed sets. In particular, $C^* \in S$.
With this observation, we know that for any $\varepsilon > 0$, there are natural numbers $m$ and $n$ such that

$$a_{m+n} > (C^* - \varepsilon)(a_m + a_n).$$

If we want to start using the fact that $a_{mn} = a_m a_n$, then it makes sense to try to square the above inequality and use the fact that $a_{(m+n)^2} = (a_{m+n})^2$. This then implies that

$$a_{(m+n)^2} = (a_{m+n})^2 > (C^* - \varepsilon)^2 (a_m + a_n)^2.$$

In order to use the fact that $a_{m+n} \le C^*(a_m + a_n)$, consider three ways to distribute the terms in $(m + n)^2 = m^2 + n^2 + 2mn$, exactly like in the solution. If we do this, then this leads us to the following observation: for all $\varepsilon > 0$

$$\frac{2(C^*)^2 + C^*}{3} > (C^* - \varepsilon)^2.$$

Since this holds for any $\varepsilon > 0$, we have that in the limit, $2(C^*)^2 + C^* \ge 3(C^*)^2$, and so $C^* \ge (C^*)^2$. Therefore, we see that $C^* = 1$, as it cannot be less than 1.
EXERCISES

2.7.1. Find the number of infinite sequences $(a_i)_{i \in \mathbb{N}}$, such that $a_i \in \{-1, 1\}$ for all $i \in \mathbb{N}$, $a_{mn} = a_m a_n$ for all $m, n \in \mathbb{N}$, and each consecutive triple contains at least one 1 and one $-1$.


(Source of the problem and solution: PLMO LV – Phase 2 – Problem 3.)

2.7.2. Let us fix any real number $a$. We recursively define the sequence $(a_n)_{n \in \mathbb{N}}$ as follows: let $a_1 = a$ and for each $n \in \mathbb{N}$, let $a_{n+1} = (a_n - 1/a_n)/2$ if $a_n \ne 0$ and $a_{n+1} = 0$ if $a_n = 0$. Prove that this sequence has infinitely many non-positive elements and infinitely many non-negative elements.


(Source of the problem and solution idea: PLMO XXIX – Phase 3 – Problem 5.)

2.7.3. Let $n$ be any natural number such that $n \ge 3$. Find all sequences of real numbers $(x_1, x_2, \ldots, x_n)$ that satisfy the following conditions:

$$\sum_{i=1}^{n} x_i = n \quad \text{and} \quad \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 = n,$$

where we set $x_0 = x_n$ and $x_{n+1} = x_1$.
n+1 1

(Source of the problem and solution idea: PLMO LX – Phase 2 – Problem 6.)
Chapter 3
Functions, Polynomials, and Functional Equations

3.1 Vieta's Formulas


3.2 Functional Equations, Exploration
3.3 Functional Equations, Necessary Conditions
3.4 Polynomials with Integer Coefficients
3.5 Unique Representation of Polynomials
3.6 Polynomial Factorization
3.7 Polynomials and Number Theory

As usual, we start the chapter with some basic definitions.


THEORY

Polynomials A polynomial is an expression consisting of variables and coefficients that involves only the operations of addition, subtraction, multiplication, and non-negative integer exponents of variables. An example of a polynomial of a single variable $x$ is $x^2 - 7x + 3$. An example in three variables is $x^3 + xy^2z - xz + 7$. A polynomial in a single variable $x$ can always be written (or rewritten) in the following form

$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i,$$

where $a_0, a_1, \ldots, a_n$ are real constants and $a_n \ne 0$.

The degree of a polynomial is the highest degree of its monomials (individual terms) with non-zero coefficients. The degree of a term is the sum of the exponents of the variables that appear in it, and thus is a non-negative integer. For example, the polynomial $x^3 + xy^2z - xz + 7$ has 4 terms. The 4 terms have, correspondingly, degrees 3, 4, 2, and 0. As a result, the degree of this polynomial is 4. Polynomials of small degrees have names; in particular, a degree 0 polynomial $P(x) = C$ with $C \ne 0$ is called a non-zero constant (the zero polynomial, $C = 0$, is treated as a special case), degree 1 polynomials are called linear, degree 2 ones are called quadratic, and degree 3 ones cubic.
In order to determine the degree of a polynomial that is not in standard form, one has to put it in standard form by expanding the products and combining the terms. For example,

$$(x + 1)^2 - (x - 1)^2 = 4x$$

is of degree 1 despite the fact that each summand has degree 2. This is not needed when the polynomial is expressed as a product of polynomials. One can easily see that the degree of a product is the sum of the corresponding degrees of all the factors. Similarly, the degree of the composition of two non-constant polynomials, say $P(x)$ and $Q(x)$, is the product of their degrees. For example, if $P(x) = x^2 - x$ and $Q(x) = x^3 + x$, then

$$\begin{aligned}
P(x) \circ Q(x) = P(Q(x)) &= (x^3 + x)^2 - (x^3 + x) \\
&= x^6 + 2x^4 + x^2 - x^3 - x = x^6 + 2x^4 - x^3 + x^2 - x
\end{aligned}$$

has degree $6 = 2 \cdot 3$.

Complex Numbers In order to make our next observation about the number of roots of a given polynomial, it will be convenient to introduce the concept of complex numbers. However, we will not use them anymore in this book.
A complex number is a number that can be expressed in the form $a + bi$, where $a$ and $b$ are real numbers, and $i$ is a solution of the equation $x^2 = -1$. Clearly, no real number satisfies this equation, and so $i$ is called an imaginary number. For a given complex number $z = a + bi$, $a$ is called the real part, and $b$ is called the imaginary part.

Complex numbers allow solutions to certain equations that have no solutions in real numbers. For example, $x^2 - 4x + 8 = 0$ has no real solution. Indeed, it can be rewritten as follows: $(x - 2)^2 = -4$, and now it is clear that the square of a real number cannot be negative. On the other hand, since $i^2 = -1$, both $2 + 2i$ and $2 - 2i$ are solutions to this equation, as demonstrated below:

$$\begin{aligned}
((2 + 2i) - 2)^2 &= (2i)^2 = 2^2 i^2 = 4(-1) = -4 \\
((2 - 2i) - 2)^2 &= (-2i)^2 = (-2)^2 i^2 = 4(-1) = -4.
\end{aligned}$$
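Julia has built-in complex numbers (the imaginary unit is written `im`), so this example can be re-checked directly. The snippet below is our own illustration; the name `p` for the polynomial is an assumption.

```julia
# Check that 2 + 2i and 2 - 2i are roots of x^2 - 4x + 8.
p(x) = x^2 - 4x + 8

for z in (2 + 2im, 2 - 2im)
    println(z, " -> ", p(z))
end

# A real argument can never give zero, since p(x) = (x - 2)^2 + 4 >= 4.
println(minimum(p(x) for x in -10:0.5:10))
```

Integer complex arithmetic is exact here, so both evaluations give zero with no rounding involved.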

Roots A root (or zero) of a real or complex function $f = f(x)$ is a number $x$ from the domain of $f$ such that $f(x) = 0$. In particular, a root of a polynomial $P(x)$ is a root of the corresponding polynomial function.
The fundamental theorem of algebra states that every non-constant single-variable polynomial with complex coefficients has at least one complex root. The theorem can be alternatively stated as follows: every non-zero, single-variable, degree $n$ polynomial with complex coefficients has, counted with multiplicity, exactly $n$ complex roots. The equivalence of the two statements can be proven through the use of successive polynomial division.
Clearly, the fundamental theorem of algebra can be applied to polynomials with real coefficients that we are concerned with in this book, since every real number is a complex number with an imaginary part equal to zero. It follows that the number of complex roots of each such polynomial of degree $n$ is exactly $n$ and so the number of real roots is at most $n$. As observed earlier, it can be strictly less than $n$.

3.1 Vieta’s Formulas


SOURCE
Problem and solution idea: PLMO LXVI – Phase 2 – Problem 4
PROBLEM

Suppose that real numbers $x_1$, $x_2$, $x_3$, and $x_4$ are roots of some polynomial $W(x)$ of degree 4
with all integer coefficients. Prove that if $x_3 + x_4$ is rational and $x_3 x_4$ is irrational, then

\[ x_1 + x_2 = x_3 + x_4. \]

THEORY

Vieta’s Formulas. Vieta’s formulas are formulas that relate the coefficients of a polynomial
to sums and products of its roots. Consider a polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$, where
$a_0, a_1, \ldots, a_n \in \mathbb{R}$ and $a_n \neq 0$. Since $P(x)$ has degree $n$, by the fundamental theorem of
algebra it has $n$ (complex) roots $x_1, x_2, \ldots, x_n$ and can be written as
$P(x) = a_n \prod_{i=1}^{n} (x - x_i)$. Vieta’s formulas can be obtained by multiplying the factors in
the second representation and then identifying the coefficients of each power of $x$ in the two
representations. It follows that for any $k \in [n]$, we have that

\[ \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} \; \prod_{j=1}^{k} x_{i_j} = (-1)^k \frac{a_{n-k}}{a_n}. \tag{3.1} \]

Note that the indices $i_1, i_2, \ldots, i_k$ are in increasing order to ensure that each sub-product of roots is
used exactly once. In particular, for $k = 1$, $k = 2$, and $k = n$ we get

\[ \sum_{i=1}^{n} x_i = -\frac{a_{n-1}}{a_n} \]

\[ \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i \cdot x_j = \frac{a_{n-2}}{a_n} \]

\[ \prod_{i=1}^{n} x_i = (-1)^n \frac{a_0}{a_n}. \]

The following useful observation is an easy consequence of Vieta’s formulas:

\[ \sum_{i=1}^{n} x_i^2 = \frac{a_{n-1}^2 - 2 \cdot a_n \cdot a_{n-2}}{a_n^2}. \tag{3.2} \]

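Indeed, note that

\[ \Big( \sum_{i=1}^{n} x_i \Big)^2 = \sum_{i=1}^{n} x_i^2 + 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i x_j \]

and so

\[ \sum_{i=1}^{n} x_i^2 = \Big( \sum_{i=1}^{n} x_i \Big)^2 - 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} x_i x_j = \Big( -\frac{a_{n-1}}{a_n} \Big)^2 - 2 \, \frac{a_{n-2}}{a_n}. \]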
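Formulas (3.1) and (3.2) can be sanity-checked numerically. The following is a minimal sketch, not part of the original text: it builds a polynomial from a hypothetical list of rational roots (the roots, the leading coefficient 3, and the helper names `poly_from_roots` and `elementary_symmetric` are all assumptions chosen for illustration) and compares both sides of each formula using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def poly_from_roots(leading, roots):
    """Return [a_0, ..., a_n], the coefficients of leading * prod(x - r)."""
    coeffs = [Fraction(leading)]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c      # contribution of c * x^i * x
            new[i] -= c * r      # contribution of c * x^i * (-r)
        coeffs = new
    return coeffs


def elementary_symmetric(roots, k):
    """Sum of products of all k-element sub-collections of the roots."""
    return sum(prod(chosen) for chosen in combinations(roots, k))


# Illustrative data (an assumption, not taken from the text).
roots = [Fraction(1), Fraction(-2), Fraction(3, 2), Fraction(5)]
a = poly_from_roots(3, roots)
n = len(roots)

# Formula (3.1) for every k in [n].
vieta_holds = all(
    elementary_symmetric(roots, k) == (-1) ** k * a[n - k] / a[n]
    for k in range(1, n + 1)
)

# Formula (3.2) for the sum of squares of the roots.
sum_of_squares = sum(r * r for r in roots)
formula_3_2 = (a[n - 1] ** 2 - 2 * a[n] * a[n - 2]) / a[n] ** 2
```

Exact fractions are used instead of floating-point numbers so that the comparisons are equalities rather than approximate checks.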

SOLUTION

Suppose that $W(x) = \sum_{i=0}^{4} a_i x^i$ with $a_0, a_1, \ldots, a_4 \in \mathbb{Z}$ and $a_4 \neq 0$. We will first show
that all roots of the considered polynomial are non-zero. Clearly, $x_3$ and $x_4$ are non-zero as
their product is irrational. In order to derive a contradiction, suppose that $x_1 = 0$ or $x_2 = 0$.
Without loss of generality, we may assume that $x_1 = 0$; $x_2$ may or may not be equal to 0.
Using Vieta’s formula (3.1) for $k = 1$, we get that $x_2 + x_3 + x_4 = -a_3/a_4$ is rational. As a
result, since $x_3 + x_4$ is rational, $x_2$ is rational too. Similarly, Vieta’s formula for $k = 3$
implies that $x_2 x_3 x_4 = -a_1/a_4$ is rational. Since $x_3 x_4$ is irrational, we get that $x_2 = 0$. But
then the formula for $k = 2$ gives us that $x_3 x_4 = a_2/a_4$ is rational, which gives us the
desired contradiction. Hence, all roots of the considered polynomial are indeed non-zero.
We will use Vieta’s formulas again, in fact, a few times. Observe first that, since $x_3 + x_4$
is rational, $x_1 + x_2$ is also rational, as

\[ (x_1 + x_2) + (x_3 + x_4) = \sum_{i=1}^{4} x_i = -a_3/a_4 \]

is rational. Similarly, since $x_3 x_4$ is irrational, $x_1 x_2$ must also be irrational, as

\[ (x_1 x_2)(x_3 x_4) = \prod_{i=1}^{4} x_i = a_0/a_4 \]

is rational and $x_1 x_2 \neq 0$. Observe next that $x_1 x_2 + x_3 x_4$ is rational, as

\[ (x_1 + x_2)(x_3 + x_4) + x_1 x_2 + x_3 x_4 = \sum_{1 \le i < j \le 4} x_i x_j = a_2/a_4 \]

is rational. Since

\[ (x_1 + x_2) x_3 x_4 + (x_3 + x_4) x_1 x_2 = \sum_{1 \le i_1 < i_2 < i_3 \le 4} x_{i_1} x_{i_2} x_{i_3} = -a_1/a_4 \]

is rational,

\[ (x_1 + x_2) x_3 x_4 + (x_3 + x_4) x_1 x_2 - (x_1 + x_2)(x_1 x_2 + x_3 x_4) = (x_3 + x_4 - x_1 - x_2) x_1 x_2 \]

is rational too. Finally, since $x_3 + x_4$ and $x_1 + x_2$ are both rational and $x_1 x_2$ is irrational,
we conclude that $x_3 + x_4 - x_1 - x_2 = 0$ and so $x_1 + x_2 = x_3 + x_4$.

REMARKS

The key value of Vieta’s formulas is that one can use them to prove many useful properties
of roots of polynomials without actually finding them. In the problem we consider in this
section, the key observation is that if the coefficients of a given polynomial are integers,
then all values following from Vieta’s formulas must be rational. In fact, one could relax the
assumption and only assume that the coefficients are rational (not necessarily integers).
Indeed, this clearly follows from the fact that multiplying any polynomial by a non-zero constant does
not change its roots, so any polynomial with rational coefficients can be transformed into a
polynomial with integer coefficients.
Once we observe that all values following from Vieta’s formulas are rational, the next
step is to exhaustively write down all the facts about rationality or irrationality of different
combinations of the $x_i$. In practice, when solving problems of this flavour, one would write
down many more facts than are needed, and most of them would turn out to be useless.
This is hard to avoid, as it is typically difficult to predict which combination of them is crucial
for the problem at hand.

EXERCISES

3.1.1. Find all sets of six real numbers $a_1, a_2, a_3, b_1, b_2, b_3$ with the property that for all
$i \in [3]$, $a_{i+1}$ and $b_{i+1}$ are two different solutions of the equation $x^2 + a_i x + b_i = 0$ (here we
let $a_4 = a_1$ and $b_4 = b_1$).

(Source of the problem and solution idea: PLMO LXX – Phase 1 – Problem 5.)

3.1.2. Let $n \ge 3$ be an integer. Prove that the polynomial

\[ f(x) = x^n + \sum_{i=0}^{n-3} a_i x^i \]

has $n$ real roots if and only if all $a_i$ are equal to 0.


(Source of the problem and solution idea: PLMO LII – Phase 2 – Problem 3.)

3.1.3. Let $x_1, x_2, x_3$ be the roots of the equation $3x^3 + 6x^2 - 1 = 0$. Find the value of
$\sum_{i=1}^{3} \frac{1}{x_i^4}$.

(Source of the problem and solution idea: Delta – ZM-1506.)

3.2 Functional Equations, Exploration


SOURCE

Problem and solution idea: PLMO LXIII – Phase 1 – Problem 8

PROBLEM

Find all functions $f : \mathbb{R} \to \mathbb{R}$ such that for all $x, y \in \mathbb{R}$, we have that

\[ f(x + f(x + y)) = f(x - y) + f(x)^2. \tag{3.3} \]
THEORY

Functional Equations. A functional equation is any equation in which the unknown
represents a function. Typically, such equations relate the value of a function at some point
with its values at other points. For example,

\[ f(x + y) = f(x) f(y) \]

is satisfied by all exponential functions,

\[ f(xy) = f(x) + f(y) \]

is satisfied by all logarithmic functions, and

\[ f(xy) = f(x) f(y) \]

is satisfied by all power functions.


Solving functional equations can be very difficult. In order to warm up, let us consider the
following very simple case. Suppose that our goal is to find all real-valued functions
$f : \mathbb{R} \to \mathbb{R}$ that satisfy

\[ f(x + y)^2 = f(x)^2 + f(y)^2 \]

for all $x, y \in \mathbb{R}$. By considering the case $x = y = 0$, we get that $f(0)^2 = f(0)^2 + f(0)^2$
and so $f(0)^2 = 0$. From this it follows that $f(0) = 0$. Now, consider any $x \in \mathbb{R}$ and let
$y = -x$. We get

\[ f(x)^2 \le f(x)^2 + f(-x)^2 = f(x - x)^2 = f(0)^2 = 0 \]

and so $f(x)^2 = 0$ and finally $f(x) = 0$.

Continuous Functions. In order to solve our next example, we need to introduce continuous
functions. A function $f : \mathbb{R} \to \mathbb{R}$ is continuous if sufficiently small changes of the
argument $x$ result in arbitrarily small changes of the value $f(x)$. Otherwise, a function is
said to be discontinuous.
There are several different formal definitions of continuity of a function, all of them being
equivalent. Below, we present a few which are commonly used.

Limits of functions: Function $f$ is continuous at some point $c$ of its domain if the limit of
$f(x)$, as $x$ approaches $c$ through elements of the domain of $f$, exists and is equal to $f(c)$; that
is,

\[ \lim_{x \to c} f(x) = f(c). \]

Limits of sequences: One can instead require that for any sequence $(x_n)_{n \in \mathbb{N}}$ of points in the
domain that converges to $c$, the corresponding sequence $(f(x_n))_{n \in \mathbb{N}}$ converges to $f(c)$; that
is,

\[ \lim_{n \to \infty} x_n = c \;\Rightarrow\; \lim_{n \to \infty} f(x_n) = f(c). \tag{3.4} \]

Epsilon–delta: For every $\epsilon > 0$ (arbitrarily small), there exists $\delta = \delta(\epsilon) > 0$ (which depends
on $\epsilon$) such that for all $x$ in the domain of $f$ with $c - \delta < x < c + \delta$, the function satisfies
$f(c) - \epsilon < f(x) < f(c) + \epsilon$; that is,

\[ |x - c| < \delta \;\Rightarrow\; |f(x) - f(c)| < \epsilon. \]

Finally, a function is continuous if it is continuous at all elements of its domain.


It is easy to see that the sum of two functions, continuous on some domain, is also
continuous on this domain. The same holds for the product of continuous functions.
Combining this with an easy fact that functions of the form f (x) = Ax + B (for some
constants A, B ∈ R) are continuous, we get in particular that all polynomials are
continuous functions.

Cauchy’s Functional Equation and Additive Functions Let us consider one particular
functional equation that is both interesting and important. Cauchy’s functional equation is
the functional equation

f (x + y ) = f (x) + f (y ) .

(3.5)

Solutions to this equation are called additive functions.


It is clear that any linear function $f : \mathbb{R} \to \mathbb{R}$ (that is, a function of the form $f(x) = cx$
for some fixed constant $c \in \mathbb{R}$) is a solution. However, there are some other solutions that
are extremely complicated. Having said that, if one is restricted to continuous functions,
then linear functions are the only solutions to Cauchy’s functional equation. Hence, any
other solution must be, in some sense, highly pathological.
Let $f : \mathbb{R} \to \mathbb{R}$ be any additive function that is also continuous, and let $c = f(1)$. We
will first show that $f(q) = cq$ for any $q \in \mathbb{Q}$. It is convenient to independently consider the
following three cases: $q = 0$, $q > 0$, and $q < 0$.

Case 1: q = 0 . By setting x = 0 (or, in fact, any value) and y = 0 in (3.5) we get


f (0) = f (0) + f (0) and so f (0) = 0.

Case 2: $q > 0$. After applying (3.5) to $f(ax) = f(x + x + \cdots + x)$ repeatedly, with $a$ summands, we
get that

\[ f(ax) = a f(x), \quad \text{for any } a \in \mathbb{N}, \; x \in \mathbb{R}. \tag{3.6} \]

After substituting $x$ with $x/a$ in (3.6) and multiplying both sides by $b/a$, we get that

\[ \frac{b}{a} f(x) = b f\Big( \frac{x}{a} \Big), \quad \text{for any } a, b \in \mathbb{N}, \; x \in \mathbb{R}. \tag{3.7} \]

By combining (3.6) and (3.7), we get that

\[ \frac{b}{a} f(x) = b f\Big( \frac{x}{a} \Big) = f\Big( \frac{b}{a} x \Big), \]

and finally, after setting $x = 1$, we have that for any $q = b/a \in \mathbb{Q}_+$,

\[ f(q) = f\Big( \frac{b}{a} \cdot 1 \Big) = \frac{b}{a} f(1) = \frac{b}{a} c = cq. \]

Case 3: $q < 0$. After applying (3.5) with $y = -x$, we get that

\[ 0 = f(0) = f(x - x) = f(x) + f(-x), \]

and so $f(-x) = -f(x)$. This allows us to reduce this case to the previous one. It follows
that for any $q \in \mathbb{Q}_-$,

\[ f(q) = f(-(-q)) = -f(-q) = -(c(-q)) = cq, \]

since $-q \in \mathbb{Q}_+$. The desired property follows, namely, $f(q) = cq$ for all $q \in \mathbb{Q}$.

It remains to show that, since $f$ is continuous, $f(x) = cx$ for any $x \in \mathbb{R} \setminus \mathbb{Q}$. For that we
will use the “limits of sequences” variant of the definition of continuity; see (3.4).
Let $x \in \mathbb{R} \setminus \mathbb{Q}$. It is easy to construct a sequence $(x_n)_{n \in \mathbb{N}}$ of rational numbers
that converges to $x$; for example, let $x_n = \lfloor xn \rfloor / n \in \mathbb{Q}$. It is clear that

\[ 0 \le x - x_n = \frac{xn - \lfloor xn \rfloor}{n} < \frac{1}{n}, \]

and so, indeed, $\lim_{n \to \infty} x_n = x$. From (3.4) it follows that

\[ f(x) = \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} c x_n = c \lim_{n \to \infty} x_n = cx. \]
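The rational sequence used in the last step can be illustrated with a short sketch. It is only an illustration under stated assumptions: the irrational number is taken to be $\sqrt{2}$ (represented by its floating-point approximation), and the helper name `rational_approximation` and the chosen values of $n$ are hypothetical.

```python
from fractions import Fraction
from math import floor, sqrt


def rational_approximation(x, n):
    """Return x_n = floor(x * n) / n as an exact fraction."""
    return Fraction(floor(x * n), n)


# Assumption: the float sqrt(2) stands in for the irrational number x.
x = sqrt(2)
indices = (1, 10, 100, 1000, 10**6)
approximations = {n: rational_approximation(x, n) for n in indices}

# Gap x - x_n, computed exactly from the float's binary value.
gaps = {n: Fraction(x) - x_n for n, x_n in approximations.items()}

for n in indices:
    print(n, approximations[n], float(gaps[n]))
```

For each listed $n$ the gap lies in the interval $[0, 1/n)$, in agreement with the displayed inequality, so the approximations converge to $x$ as $n$ grows.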

SOLUTION

After applying (3.3) for the specific case $y = f(0) - x$, we get that
$f(x + f(f(0))) = f(2x - f(0)) + f(x)^2$. Since $f(f(0)) = 0$ (this is shown in the next paragraph,
independently of the present argument), the left hand side is equal to $f(x)$, which implies that

\[ f(2x - f(0)) = f(x) - f(x)^2 = f(x)(1 - f(x)). \]

Since function $g(z) = z(1 - z)$ attains its maximum value, $1/4$, at $z = 1/2$, we get that
$f(2x - f(0)) \le 1/4$. Since $x$ is arbitrary and $f(0)$ is fixed (once the function is fixed; in fact, as
argued below, it is always equal to 0 or $-1$), we conclude that function $f$ has the following
important property:

\[ \text{for all } x \in \mathbb{R}, \quad f(x) \le 1/4. \tag{3.8} \]

Applying (3.3) for another specific case, $x = f(0)$ and $y = -f(0)$, we get that
$f(2f(0)) = f(2f(0)) + f(f(0))^2$ and so $f(f(0)) = 0$. On the other hand, for $x = y = 0$
we get $f(f(0)) = f(0) + f(0)^2$ so, combining the two observations, we get that there are
only two possible values of $f(0)$: $f(0) = 0$ or $f(0) = -1$. We will consider these two cases
independently.
Suppose first that $f(0) = -1$. In this case, $f(-1) = f(f(0)) = 0$ and (3.3) for
$(x, y) = (0, 1)$ gives us $f(f(1)) = f(-1) + f(0)^2 = 0 + (-1)^2 = 1$. But this contradicts
(3.8), and so no function $f$ satisfies both the functional equation (3.3) and $f(0) = -1$.
Suppose then that $f(0) = 0$. Clearly, the constant function $f(x) = 0$ for $x \in \mathbb{R}$ satisfies
(3.3); so this is certainly one solution. We will show that it is actually the only one. For a
contradiction, suppose that for some $z \in \mathbb{R}$ we have $f(z) \neq 0$. Considering the case
$x = y = z$ we get from (3.3) that $f(z + f(2z)) = f(0) + f(z)^2 = f(z)^2$, which implies that
$\delta := f(z + f(2z)) > 0$. We define recursively a sequence of numbers as follows:

\[ x_1 = z + f(2z), \qquad x_{i+1} = x_i + f(x_i) \quad \text{for any } i \in \mathbb{N}. \]

In particular, $f(x_1) = \delta > 0$. By applying (3.3) with $x = x_i$ and $y = 0$, we get that

\[ f(x_{i+1}) = f(x_i + f(x_i)) = f(x_i) + f(x_i)^2 \ge f(x_i). \]

It follows that for any $i \in \mathbb{N}$, $f(x_i) \ge f(x_{i-1}) \ge \ldots \ge f(x_1) = \delta > 0$. In particular, it
means that both $(x_i)_{i \in \mathbb{N}}$ and $(f(x_i))_{i \in \mathbb{N}}$ are strictly increasing sequences. More importantly,
it implies that

\[ f(x_{i+1}) = f(x_i) + f(x_i)^2 \ge f(x_i) + \delta^2 \]

and so for any $i \in \mathbb{N}$, $f(x_i) \ge f(x_1) + (i - 1)\delta^2$. As a result, the sequence $(f(x_i))_{i \in \mathbb{N}}$ is
unbounded (that is, $\lim_{n \to \infty} f(x_n) = \infty$) which contradicts (3.8). Hence, indeed, $f(x) = 0$
($x \in \mathbb{R}$) is the unique solution to our problem.

REMARKS

A standard approach used to solve functional equations is to prove some specific properties
of the function involved. In particular, it is often useful to try to establish the value of the
function for some carefully chosen, characteristic values. In our problem, this specific value
was equal to 0. Another common approach is to try to reduce the functional equation
involving, say, two independent variables (in our case, $x$ and $y$) into a more general equation
involving only one variable. In our case, we used this approach by setting $x = y$. It is
important to remember that if we prove some property of a more general equation (in this
case involving one variable), then we always have to check if the final solution meets the
original equation.

EXERCISES

3.2.1. Find all functions $f : \mathbb{Z} \to \mathbb{Z}$ which satisfy the following condition:

\[ f(a + b)^3 - f(a)^3 - f(b)^3 = 3 f(a) f(b) f(a + b) \]

for all $a, b \in \mathbb{Z}$.
(Source of the problem and solution idea: PLMO LXV – Phase 1 – Problem 5.)
3.2.2. Find all pairs of functions f : R → R and g : R → R such that

g(f (x) − y ) = f (g(y)) + x

for all x, y ∈ R.
(Source of the problem and solution idea: PLMO LXIII – Phase 2 – Problem 4.)

3.2.3. Find all functions f : R → R such that for all x, y ∈ R, we have that

f (f (x) − y ) = f (x) + f (f (y) − f (−x)) + x .

(Source of the problem and solution idea: PLMO LIX – Phase 2 – Problem 3.)

3.3 Functional Equations, Necessary Conditions


SOURCE

Problem and solution idea: PLMO LV – Phase 1 – Problem 3 (modified)


PROBLEM

Find all bijections $f : \mathbb{R} \to \mathbb{R}$ that satisfy

\[ f(x^2 + y) = x f(x) + f(y) \]

for all $x, y \in \mathbb{R}$.
THEORY

When solving functional equations, it is quite common that only some specific cases are
considered. For example, one may fix some specific values in the given equation or relate
one variable to some other variable or variables. Indeed, typically one starts from the
original functional equation but then immediately transforms it into something more
manageable and insightful. But, as a result, the resulting equations are not equivalent to the
original ones. In other words, the obtained conditions are only necessary but often not
sufficient. Therefore, it is always important to make sure that the final result actually
satisfies the original functional equation.
In order to illustrate this issue, let us consider the following functional equation. Suppose
that our goal is to find all functions $f : \mathbb{R}_+ \to \mathbb{R}_+$ that satisfy the following condition

\[ \sqrt{f(x) + f(y)} = x + y, \]

for all $x, y \in \mathbb{R}_+$. First, after fixing $y = 0$ and squaring the equation, we get that

\[ f(x) + f(0) = (x + 0)^2 = x^2. \]

Now, we can set $x = 0$ to learn that $f(0) = 0$. We conclude that $f(x) = x^2$.
Let us stress again that at this point we only proved that if the original equation has a
solution, then this solution must be $f(x) = x^2$. However, in order to get this potential
solution we have simplified and relaxed the original condition by setting $y = 0$ and squaring
the equation. Therefore, $f(x) = x^2$ is only a candidate solution and now we have to check
that it actually satisfies the original equation. Once we substitute the function into the
original equation we get $\sqrt{x^2 + y^2} = x + y$. This equation clearly does not hold unless
$x = 0$ or $y = 0$. As function $f(x) = x^2$ was the only potential solution, we conclude that
there is no function $f$ which satisfies the original functional equation.
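The verification step described above can be mimicked numerically. The sketch below is an illustration only: the helper names `candidate` and `satisfies` and the test points are assumptions, and floating-point comparison with a tolerance stands in for exact equality.

```python
from math import isclose, sqrt


def candidate(x):
    """The only candidate solution derived above, f(x) = x^2."""
    return x * x


def satisfies(f, x, y):
    """Check sqrt(f(x) + f(y)) == x + y up to floating-point tolerance."""
    return isclose(sqrt(f(x) + f(y)), x + y, rel_tol=1e-12, abs_tol=1e-12)


# The candidate passes whenever one of the arguments is 0 ...
passes_on_axes = satisfies(candidate, 3.0, 0.0) and satisfies(candidate, 0.0, 2.5)

# ... but fails for a generic pair: sqrt(9 + 16) = 5, whereas 3 + 4 = 7.
fails_in_general = not satisfies(candidate, 3.0, 4.0)
```

A single failing pair is enough to reject the candidate, which matches the conclusion that the equation has no solution.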


SOLUTION

Let us start with rewriting the functional equation as follows: for all $x, y \in \mathbb{R}$,

\[ f(y) = f(x^2 + y) - x f(x). \]

For $y = -x \in \mathbb{R}$, we get that

\[ f(-x) = f(x^2 - x) - x f(x) = f(x^2 - x) - x f(1^2 + (x - 1)). \]

Since $x^2 - x = (x - 1)^2 + (x - 1)$, using the original equation we get that

\[ f(x^2 - x) = f((x - 1)^2 + (x - 1)) = (x - 1) f(x - 1) + f(x - 1), \]

and so

\[ f(-x) = (x - 1) f(x - 1) + f(x - 1) - x f(1^2 + (x - 1)) = -\big( f(1^2 + (x - 1)) - f(x - 1) \big) x. \]

Going back to the original equation for the last time, we get that $f(-x) = -f(1)x$. It
follows that the only functions that potentially satisfy the given functional equation have the
form $f(x) = ax$, where $a$ is a fixed real number.
We directly check that if $f(x) = ax$, then

\[ f(x^2 + y) - x f(x) = ax^2 + ay - x \cdot ax = ay = f(y). \]

However, in our problem, our goal is to find bijections and so the case $a = 0$ has to be ruled
out. We conclude that the only functions that meet all the conditions are of the form
$f(x) = ax$ where $a \neq 0$.
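The direct check that linear functions satisfy the equation can also be run on a grid of rational points. This is a minimal sketch under assumptions: the grid, the sample slopes, and the helper names `make_linear` and `satisfies_equation` are hypothetical, and passing on a finite grid is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product


def make_linear(a):
    """Return the function f(x) = a * x."""
    return lambda x: a * x


def satisfies_equation(f, points):
    """Check f(x^2 + y) == x * f(x) + f(y) for all pairs from the grid."""
    return all(
        f(x * x + y) == x * f(x) + f(y)
        for x, y in product(points, repeat=2)
    )


# Illustrative grid of rational points in [-2, 2].
points = [Fraction(n, 3) for n in range(-6, 7)]

# Linear functions pass for every sampled slope.
linear_pass = all(
    satisfies_equation(make_linear(Fraction(a)), points)
    for a in (-2, 1, Fraction(5, 7))
)

# Non-linear functions such as x^2 or x + 1 fail on the same grid.
square_fails = not satisfies_equation(lambda x: x * x, points)
shift_fails = not satisfies_equation(lambda x: x + 1, points)
```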

REMARKS

In order to better understand how one can get ideas leading to the solution of similar
problems, let us make some observations that we made during the process of solving the
problem at hand. First, by fixing $x = 1$ we observed that for any $y \in \mathbb{R}$,

\[ f(y + 1) = f(y) + f(1). \]

It follows that function $f$ constantly increases by the same value, $f(1)$, when we increase its
argument by 1.
Our second observation was that changing $x$ to $-x$ does not affect the left hand side of
the original equation and so $x f(x) = -x f(-x)$ for any $x \in \mathbb{R}$. Hence, for $x \neq 0$ we get
that $f(x) = -f(-x)$. On the other hand,
$f(0) = f(-1 + 1) = f(-1) + f(1) = -f(1) + f(1) = 0$, and so in fact $f(x) = -f(-x)$
for all $x \in \mathbb{R}$.
Our next idea was to try to find two different expressions of the form $x^2 + y$ that have
identical value. One such expression was given in the solution above. Another identity that
could have been used is $x^2 + x = (x + 1)^2 + (-1 - x)$. Applying it we would get that

\[ f(x^2 + x) = x f(x) + f(x) \]

\[ f((x + 1)^2 + (-1 - x)) = (x + 1) f(x + 1) + f(-1 - x). \]

As a result,

\[ f(x) + x f(x) = f(-1 - x) + (x + 1) f(x + 1) = -f(x + 1) + (x + 1) f(x + 1) = x f(x + 1) = x f(x) + x f(1), \]

which yields $f(x) = f(1)x$, as required.


EXERCISES

3.3.1. Prove that if a function $f : \mathbb{R} \to \mathbb{R}$ satisfies the condition $f(x) = f(2x) = f(1 - x)$
for all $x \in \mathbb{R}$, then it is periodic (that is, there exists $a \in \mathbb{R}_+$ such that $f(x + a) = f(x)$ for
all $x \in \mathbb{R}$).
(Source of the problem and solution idea: PLMO LIII – Phase 2 – Problem 1.)

3.3.2. Given that function $f(x)$ satisfies $f(1/(1 - x)) = x f(x) + 1$, find the value of $f(5)$.
(Source of the problem: question asked on Quora. Solution: our own.)

3.3.3. Suppose that a function $f(x, y, z)$ of three real arguments satisfies the following
condition

\[ \sum_{i=1}^{5} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{5} x_i, \]

where $x_{i+5} = x_i$. Prove that for all $n \ge 5$ we have

\[ \sum_{i=1}^{n} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{n} x_i, \]

where $x_{i+n} = x_i$.

(Source of the problem and solution: PLMO LIX – Phase 3 – Problem 2.)
3.4 Polynomials with Integer Coefficients

SOURCE

Problem and solution idea: PLMO LVII – Phase 3 – Problem 6 (modified)

PROBLEM

Find all pairs of integers $(a, b)$ with the property that there exists a polynomial $P(x)$ having
integer coefficients such that

\[ (x^2 + ax + b) \cdot P(x) = Q(x) = \sum_{i=0}^{n} c_i x^i, \tag{3.9} \]

where $c_i \in \{1, -1\}$ for all $i \in [n] \cup \{0\}$.


THEORY

Rational Root Theorem. Our next theorem, the rational root theorem (sometimes called the
rational root test), states a constraint on rational solutions of a polynomial equation with
integer coefficients. Consider any polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$ with all integer
coefficients, that is, $a_i \in \mathbb{Z}$ for all $i \in [n] \cup \{0\}$. Suppose that $P(x)$ has a rational root $p/q$,
where $p \in \mathbb{Z}$ and $q \in \mathbb{Z}$ are co-prime (in other words, the fraction $p/q$ is in its lowest
terms). Then, $p \mid a_0$ and $q \mid a_n$.
Indeed, suppose that

\[ 0 = P(p/q) = \sum_{i=0}^{n} a_i (p/q)^i. \]

After multiplying this equation by $q^n$, we get that

\[ p \cdot \Big( \sum_{i=1}^{n} a_i p^{i-1} q^{n-i} \Big) + a_0 q^n = 0. \]

Since $p$ and $q$ are co-prime, we get that $p \mid a_0$. Similarly, from the very same equation, we
get that

\[ a_n p^n + q \cdot \Big( \sum_{i=0}^{n-1} a_i p^i q^{n-i-1} \Big) = 0, \]

and we conclude that $q \mid a_n$.


An important application of this theorem is that it can be used to find all rational roots of
a given polynomial. Indeed, it gives a finite number of possible fractions which can be
checked to see if they are roots or not. If a rational root $x = r$ is found, a linear polynomial
$(x - r)$ can be factored out of the polynomial using polynomial long division, resulting in a
polynomial of a lower degree whose roots are also roots of the original polynomial.
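The search for rational roots described above can be sketched in a few lines. This is an illustrative sketch under assumptions: coefficients are given as a list `[a_0, ..., a_n]` of integers with $a_n \neq 0$, the helper names (`divisors`, `evaluate`, `rational_roots`) are hypothetical, and the example polynomial $2x^3 - 3x^2 - 3x + 2 = (x + 1)(2x - 1)(x - 2)$ is chosen only for demonstration.

```python
from fractions import Fraction


def divisors(m):
    """Positive divisors of a non-zero integer m."""
    m = abs(m)
    return [d for d in range(1, m + 1) if m % d == 0]


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n using Horner's scheme."""
    result = Fraction(0)
    for a in reversed(coeffs):
        result = result * x + a
    return result


def rational_roots(coeffs):
    """All rational roots of a non-zero integer polynomial [a_0, ..., a_n]."""
    coeffs = list(coeffs)
    roots = set()
    # If a_0 = 0, then x = 0 is a root; factor out x before using the theorem.
    while len(coeffs) > 1 and coeffs[0] == 0:
        roots.add(Fraction(0))
        coeffs.pop(0)
    # Candidates are +-p/q with p | a_0 and q | a_n.
    for p in divisors(coeffs[0]):
        for q in divisors(coeffs[-1]):
            for candidate in (Fraction(p, q), Fraction(-p, q)):
                if evaluate(coeffs, candidate) == 0:
                    roots.add(candidate)
    return roots


example_roots = rational_roots([2, -3, -3, 2])   # 2x^3 - 3x^2 - 3x + 2
no_rational_roots = rational_roots([-2, 0, 1])   # x^2 - 2
```

For the second polynomial, $x^2 - 2$, the only candidates are $\pm 1$ and $\pm 2$, none of which is a root, so its real roots must be irrational.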
Finally, let us mention that the rational root theorem is a special case (for a single linear
factor) of Gauss’s Lemma on the factorization of polynomials. For our purposes it is enough
to use the following variant. If a polynomial $P(x)$ of degree greater than 1 with integer
coefficients cannot be represented as a product of two non-constant polynomials with
integer coefficients, then it also cannot be represented as a product of two non-constant
polynomials with rational coefficients.
Now, if a polynomial $P(x)$ with integer coefficients has a rational root $x_0 = p/q$, where $p$
and $q$ are co-prime, then it is possible to represent it as $(x - p/q)R(x)$, where $R(x)$ has
rational coefficients. If $R(x)$ is a rational constant $r$, then $P(x) = r(x - p/q) = a(qx - p)$,
where $a = r/q$ is an integer, as by assumption $r$ and $rp/q$ are integers and $p$ and $q$ are
co-prime. If $R(x)$ is not a constant, then the lemma of Gauss implies that $P(x) = P'(x)Q(x)$,
where $P'(x)$ and $Q(x)$ are non-constant and have integer coefficients and $P'(x)$ has a root
$p/q$. If $P'(x)$ has degree 1, then clearly it can be written as $a(qx - p)$ where $a$ is an integer
and we are done. Otherwise, we have a polynomial $P'(x)$ of degree less than the degree of
$P(x)$ with integer coefficients that has a root $p/q$. As $P(x)$ initially had a finite degree, by
replacing $P(x)$ by $P'(x)$ and repeated application of this reasoning, we get that eventually
we must reach the required factorization $P(x) = (qx - p)R'(x)$, where $R'(x)$ has integer
coefficients. As a consequence, if $P(x) = \sum_{i=0}^{n} a_i x^i$, then $q \mid a_n$ and $p \mid a_0$, as stated
above.

SOLUTION

Let $P(x)$ be a polynomial with integer coefficients that satisfies (3.9) for some pair of
integers $(a, b)$ and some set of coefficients $c_i \in \{-1, 1\}$. Let us first observe that if
$Q(x) = 0$ for some $x \in \mathbb{R}$, then we have $-c_n x^n = \sum_{i=0}^{n-1} c_i x^i$ and so, in particular,

\[ |x^n| = |x|^n = \Big| \sum_{i=0}^{n-1} c_i x^i \Big|. \]

We will now show that $|x| < 2$. For a contradiction, suppose that there exists $x \in \mathbb{R}$ such
that $|x| \ge 2$ and the above equality holds. Since

\[ |x|^n = \Big| \sum_{i=0}^{n-1} c_i x^i \Big| \le \sum_{i=0}^{n-1} |x|^i = \frac{|x|^n - 1}{|x| - 1} \le |x|^n - 1, \]

we get that $0 \le -1$ which gives us the desired contradiction. Since $Q(x)$ does not have
roots whose absolute values are greater than or equal to 2, the same property holds for
polynomial $R(x) := x^2 + ax + b$. It follows that $R(2) = 4 + 2a + b > 0$,
$R(-2) = 4 - 2a + b > 0$, and so

\[ -2 - \frac{b}{2} < a < 2 + \frac{b}{2}. \]

It follows immediately from (3.9), since $c_0 \in \{1, -1\}$ and $P(x)$ has integer coefficients,
that $b$ must be equal to 1 or $-1$. We will independently consider these two cases.

Case 1: $b = -1$. The possible values of $a$ are $-1$, 0, and 1. If $a = -1$ or $a = 1$, then we can
clearly fix $P(x) = 1$ to get the desired property. If $a = 0$, then one can take $P(x) = x + 1$
to get that

\[ (x^2 + ax + b) \cdot P(x) = (x^2 - 1)(x + 1) = x^3 + x^2 - x - 1, \]

and the desired property holds.

Case 2: $b = 1$. This time there are more possible values of $a$ to consider: $-2$, $-1$, 0, 1, and
2. If $a = -1$ or $a = 1$, then we again use $P(x) = 1$. If $a = 0$, then we use $P(x) = x + 1$ to
get the desired property:

\[ (x^2 + ax + b) \cdot P(x) = (x^2 + 1)(x + 1) = x^3 + x^2 + x + 1. \]

We are left with two cases, $a = -2$ and $a = 2$, for which we use $P(x) = x + 1$ and, respectively,
$P(x) = x - 1$:

\[ (x^2 - 2x + 1)(x + 1) = x^3 - x^2 - x + 1 \]

and

\[ (x^2 + 2x + 1)(x - 1) = x^3 + x^2 - x - 1. \]

Putting both cases together, we conclude that the set of solutions to our problem is

\[ (a, b) \in \{ (-2, 1), (-1, -1), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 1), (2, 1) \}. \]
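The eight witnesses used in the two cases can be verified mechanically by multiplying the polynomials. The sketch below is only a check of the products stated above; the helper name `multiply` and the dictionary layout (coefficient lists ordered from the constant term upwards) are assumptions made for illustration.

```python
def multiply(p, q):
    """Multiply two polynomials given as coefficient lists [c_0, c_1, ...]."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


# For each pair (a, b), the polynomial P(x) chosen in the solution.
witnesses = {
    (-1, -1): [1],       # P(x) = 1
    (1, -1): [1],        # P(x) = 1
    (0, -1): [1, 1],     # P(x) = x + 1
    (-1, 1): [1],        # P(x) = 1
    (1, 1): [1],         # P(x) = 1
    (0, 1): [1, 1],      # P(x) = x + 1
    (-2, 1): [1, 1],     # P(x) = x + 1
    (2, 1): [-1, 1],     # P(x) = x - 1
}

# Q(x) = (x^2 + a x + b) * P(x) for every witness.
products = {
    (a, b): multiply([b, a, 1], P) for (a, b), P in witnesses.items()
}

all_plus_minus_one = all(set(Q) <= {1, -1} for Q in products.values())
```

This confirms only that each listed pair works; the argument that no other pair is possible relies on the bound $|x| < 2$ derived in the solution.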

REMARKS

In order to solve the problem in this section, we used the geometry of the roots of a given
polynomial, that is, the information about their localization (in the complex plane or on the
real line, depending on whether we allow complex roots or restrict ourselves to real ones). It is
perhaps surprising that one can actually deduce it from the degree and the coefficients of the
polynomial. Some of these properties are important for many applications, such as upper
bounds on the absolute values of the roots, which define a disk containing all roots, or lower
bounds on the distance between two roots. Such bounds are widely used in root-finding
algorithms for polynomials, either for limiting the regions where roots should be searched
for, or for analyzing the computational complexity of these algorithms.
There are many upper bounds for the magnitudes of all complex roots. We will only
mention two of them, Lagrange’s and Cauchy’s bounds. In our problem, we used the bound
of Cauchy. Let $P(x) = \sum_{i=0}^{n} a_i x^i$ be a polynomial and let $z$ be its root, that is, $P(z) = 0$.
Lagrange’s Bound is

\[ |z| \le \max \Big\{ 1, \sum_{i=0}^{n-1} \Big| \frac{a_i}{a_n} \Big| \Big\}, \]

whereas Cauchy’s Bound is

\[ |z| \le 1 + \max_{0 \le i \le n-1} \Big| \frac{a_i}{a_n} \Big|. \tag{3.10} \]

Lagrange’s bound is smaller than Cauchy’s one only when 1 is larger than the sum of all
ratios $|a_i/a_n|$ but the largest, which is relatively rare in practice. As a result, Cauchy’s bound
is more widely known and used than Lagrange’s. We will prove the bound of Cauchy below.
Let $z$ be any root of $P(x)$. If $|z| \le 1$, then (3.10) is trivially satisfied so suppose that
$|z| > 1$. Since $P(z) = 0$, we get that

\[ -a_n z^n = \sum_{i=0}^{n-1} a_i z^i, \]

and so

\[ |a_n| |z|^n = \Big| \sum_{i=0}^{n-1} a_i z^i \Big| \le \sum_{i=0}^{n-1} |a_i| |z|^i \le \Big( \max_{0 \le i \le n-1} |a_i| \Big) \cdot \sum_{i=0}^{n-1} |z|^i = \max_{0 \le i \le n-1} |a_i| \cdot \frac{|z|^n - 1}{|z| - 1} \le \max_{0 \le i \le n-1} |a_i| \cdot \frac{|z|^n}{|z| - 1}, \]

as it is assumed that $|z| > 1$. It follows that

\[ |a_n| (|z| - 1) \le \max_{0 \le i \le n-1} |a_i| \]

which yields the desired bound (3.10).
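To see the two bounds side by side, one can evaluate them for a polynomial whose roots are known in advance. The following sketch assumes the example $x^3 - 2x^2 - 5x + 6 = (x - 1)(x + 2)(x - 3)$, which is not taken from the text; the function names `lagrange_bound` and `cauchy_bound` are likewise hypothetical.

```python
from fractions import Fraction


def lagrange_bound(coeffs):
    """max{1, sum of |a_i / a_n| over i < n} for coeffs = [a_0, ..., a_n]."""
    a_n = coeffs[-1]
    return max(Fraction(1), sum(abs(Fraction(a, a_n)) for a in coeffs[:-1]))


def cauchy_bound(coeffs):
    """1 + max of |a_i / a_n| over i < n, as in (3.10)."""
    a_n = coeffs[-1]
    return 1 + max(abs(Fraction(a, a_n)) for a in coeffs[:-1])


def evaluate(coeffs, x):
    """Evaluate the polynomial using Horner's scheme."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Assumed example: x^3 - 2x^2 - 5x + 6 with roots 1, -2, and 3.
coeffs = [6, -5, -2, 1]
known_roots = [1, -2, 3]

roots_are_valid = all(evaluate(coeffs, r) == 0 for r in known_roots)
largest_root = max(abs(r) for r in known_roots)
lagrange = lagrange_bound(coeffs)   # max{1, 6 + 5 + 2}
cauchy = cauchy_bound(coeffs)       # 1 + 6
```

In this example Cauchy's bound is the tighter of the two, while both exceed the largest root magnitude, as they must.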


EXERCISES

3.4.1. Let $f_1 = 0$, $f_2 = 1$, and $f_{n+2} = f_{n+1} + f_n$ for all $n \in \mathbb{N}$. Find all polynomials $P(x)$
having only integer coefficients with the property that for each $n \in \mathbb{N}$ there exists
$k = k(n) \in \mathbb{Z}$ such that $P(k) = f_n$.

(Source of the problem: PLMO LX – Phase 1 – Problem 7. Solution: our own.)

3.4.2. Suppose that a polynomial $P(x)$ has all integer coefficients. Prove that if polynomials
$P(P(P(x)))$ and $P(x)$ have a common real root, then $P(x)$ also has an integer root.

(Source of the problem and solution idea: PLMO LVIII – Phase 2 – Problem 1.)

3.4.3. Consider a polynomial $P(x) = x^2 + ax + b$ with $a, b \in \mathbb{Z}$. Suppose that for every
prime number $p$, there exists $k \in \mathbb{Z}$ such that $P(k)$ and $P(k + 1)$ are divisible by $p$. Prove
that there exists $m \in \mathbb{Z}$ such that $P(m) = P(m + 1) = 0$.
(Source of the problem and solution idea: PLMO LVI – Phase 2 – Problem 4.)

3.5 Unique Representation of Polynomials


SOURCE

Problem and solution idea: PLMO LI – Phase 3 – Problem 6


PROBLEM

Find all polynomials $P(x)$ of odd degree that satisfy the following equation:

\[ P(x^2 - 1) = (P(x))^2 - 1. \tag{3.11} \]

THEORY

Lagrange Polynomials. Polynomials can be used to approximate complicated functions (for
example, trigonometric functions) that are computationally difficult to deal with. Indeed,
one can pick a few known data points, create a lookup table, and use polynomials to
interpolate between those data points. This approach results in significantly faster
computations. There are fast algorithms to compute numerically stable solutions, much
faster than what is required by standard Gaussian elimination. Alternatively, one may write
down the polynomial immediately in the form of Lagrange polynomials discussed next.
Suppose that one is given a set of $n$ points $(x_i, y_i)$, $i \in [n]$, with no two $x_i$ values equal.
The Lagrange polynomial is the polynomial of lowest degree that passes through all of these
$n$ points. It is easy to see that the interpolating polynomial of the least degree is unique and
can be computed using the following formula:

\[ P(x) = \sum_{j=1}^{n} y_j \prod_{\substack{1 \le k \le n \\ k \neq j}} \frac{x - x_k}{x_j - x_k}. \]

Indeed, it is easy to see that $P(x)$ goes through $(x_i, y_i)$, as for $x = x_i$ all terms in the sum
but the $i$th term vanish, and in the $i$th term all fractions in the product are equal to 1. In order
to prove uniqueness, consider two polynomials $P(x)$ and $Q(x)$ of degree less than $n$ that go
through points $(x_i, y_i)$. But this means that $P(x) - Q(x)$ is a polynomial of degree less
than $n$ and has $n$ distinct roots, namely, $x_i$, $i \in [n]$. However, by the fundamental theorem of
algebra, this is only possible if $P(x) - Q(x) = 0$ for all $x$ and so $P(x) = Q(x)$ for all $x$.
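The interpolation formula translates almost literally into code. The sketch below is a direct, unoptimized evaluation of the formula using exact rational arithmetic; the function name `lagrange_interpolate` and the sample points (taken from the cubic $x^3 + x + 1$) are assumptions chosen for illustration.

```python
from fractions import Fraction


def lagrange_interpolate(points, x):
    """Evaluate at x the Lagrange polynomial through the given (x_i, y_i)."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for k, (xk, _) in enumerate(points):
            if k != j:
                # Factor (x - x_k) / (x_j - x_k); requires distinct x values.
                term *= Fraction(x - xk, xj - xk)
        total += term
    return total


# Four points sampled from x^3 + x + 1 (an illustrative choice).
points = [(0, 1), (1, 3), (2, 11), (3, 31)]

passes_through_points = all(
    lagrange_interpolate(points, xi) == yi for xi, yi in points
)

# By uniqueness, the interpolant of degree < 4 must be x^3 + x + 1 itself.
value_at_4 = lagrange_interpolate(points, 4)
```

Since four points determine a polynomial of degree less than 4 uniquely, evaluating the interpolant at a new point reproduces the value of the cubic there.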
In fact, the above proof of uniqueness naturally extends to the case of an infinite number of
points. We get the following useful fact. Suppose that there is an infinite set of points
$(x_i, y_i)$, $i \in \mathbb{N}$, with no two $x_i$ values equal to each other. Note that it might be the case that
there is no polynomial that goes through all of these points (consider, for example, the set
$\{(0, 1), (1, 0), (2, 0), (3, 0), \ldots\}$; there is no polynomial $P(x)$ that has an infinite number of
roots, unless $P(x) = 0$ for all $x$, the special case). On the other hand, if there is a
polynomial that passes through all of these points, then this polynomial is uniquely defined.
In order to see this, consider any two polynomials $P(x)$ and $Q(x)$ that pass through these
points. As for the finite case, by considering the polynomial $P(x) - Q(x)$, we get that
$P = Q$, since both $P(x)$ and $Q(x)$ have a finite degree.
Finally, let us discuss one more useful fact. Suppose that a polynomial $P(x)$ has a root at
point $x_0$ and $x_0$ is a local extremum (either local maximum or local minimum). We will
show that the multiplicity of root $x_0$ is even. In order to see this, to derive a contradiction,
suppose that the multiplicity of $x_0$ is some odd natural number $k$. It follows that one can
represent polynomial $P(x)$ as $P(x) = (x - x_0)^k Q(x)$, where $Q(x)$ is some polynomial
and $Q(x_0) \neq 0$. Since $Q(x)$ is continuous, there exists an open interval around $x_0$,
$(x_0 - \epsilon, x_0 + \epsilon)$ for some $\epsilon > 0$, such that $Q(x)$ does not change sign on that interval. On
the other hand, since $k$ is odd, the polynomial $(x - x_0)^k$ does change sign at $x_0$ and so
$P(x)$ also changes it at that point. As a result, there is no extremum at $x_0$ and we get the
desired contradiction. The statement holds and the proof is finished.

Even and Odd Functions. Let us also add a remark on even and odd functions, as we will
need the associated symmetry relations to solve our problem. A function $f : \mathbb{R} \to \mathbb{R}$ is even
if $f(x) = f(-x)$ for all $x \in \mathbb{R}$. Geometrically speaking, the graph of an even function is
symmetric with respect to the $y$-axis, meaning that its graph remains unchanged after
reflecting it about the $y$-axis. Examples of even functions are $f(x) = |x|$, $f(x) = x^2$, and
$f(x) = \cos x$. On the other hand, a function $f : \mathbb{R} \to \mathbb{R}$ is odd if $-f(x) = f(-x)$ for all
$x \in \mathbb{R}$. Geometrically, the graph of an odd function has symmetry with respect to the
origin, meaning that its graph remains unchanged after a rotation of 180 degrees about the
origin. Examples of odd functions are $f(x) = x$, $f(x) = x^3$, and $f(x) = \sin x$.
Let us mention some basic but useful properties and their implications for
polynomials. The sum of two even functions is even and the sum of two odd functions is
odd. If $f(x)$ is even, then so is $-f(x)$; the same property holds for odd functions. As a
result, the difference between two odd functions is odd and the difference between two even
functions is even. The sum of an even and an odd function is neither even nor odd, unless one
of the functions is equal to zero over the whole domain.
From these observations we get that one can represent any polynomial $P(x) = \sum_{i=0}^{n} c_i x^i$
as a sum of two polynomials $Q(x)$ and $R(x)$ such that $Q(x)$ is odd and $R(x)$ is even.
Indeed, one can partition all the terms of $P(x)$ into even and odd terms, that is,

$$Q(x) = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} c_{2i+1} x^{2i+1} \quad \text{and} \quad R(x) = \sum_{i=0}^{\lfloor n/2 \rfloor} c_{2i} x^{2i}.$$

It follows that $P(x)$ is even if and only if $Q(x) = 0$. Similarly, $P(x)$ is odd if and only if
$R(x) = 0$. Moreover, since any polynomial has finite degree, in order to determine whether
a polynomial is even or odd it is enough to check the corresponding conditions
($P(x) = P(-x)$ or, respectively, $P(x) = -P(-x)$) for infinitely many distinct points $x$ (not
necessarily for all $x \in \mathbb{R}$). Finally, note that if a function is even and odd, it must be equal
to 0 everywhere. As a result, the only polynomial that passes both tests for infinitely many
distinct points is the polynomial $P(x) = 0$, $x \in \mathbb{R}$.
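The decomposition into odd and even terms can be illustrated with a short sketch. It assumes a polynomial is stored as a list `coeffs` with `coeffs[i]` the coefficient of $x^i$; this representation and the helper names `split_even_odd`, `evaluate`, `is_even`, and `is_odd` are illustrative choices, not part of the text.

```python
def split_even_odd(coeffs):
    """Split P(x) = sum(coeffs[i] * x**i) into its odd part Q and even part R."""
    odd = [c if i % 2 == 1 else 0 for i, c in enumerate(coeffs)]
    even = [c if i % 2 == 0 else 0 for i, c in enumerate(coeffs)]
    return odd, even


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def is_even(coeffs):
    """P is even if and only if its odd part Q vanishes."""
    odd, _ = split_even_odd(coeffs)
    return all(c == 0 for c in odd)


def is_odd(coeffs):
    """P is odd if and only if its even part R vanishes."""
    _, even = split_even_odd(coeffs)
    return all(c == 0 for c in even)


# Hypothetical example: P(x) = 4 + 3x - 2x^2 + x^3
P = [4, 3, -2, 1]
Q, R = split_even_odd(P)  # Q(x) = 3x + x^3, R(x) = 4 - 2x^2
```

For this example, `evaluate(Q, x) + evaluate(R, x)` agrees with `evaluate(P, x)` at every point, and neither `is_even(P)` nor `is_odd(P)` holds because both parts are non-zero.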
SOLUTION

Let us first observe that

$$(P(x))^2 = P(x^2 - 1) + 1 = P((-x)^2 - 1) + 1 = (P(-x))^2.$$

It follows that either $P(x) = P(-x)$ holds for an infinite number of values of $x$ or the
equation $P(x) = -P(-x)$ does. Note that both of these properties cannot hold
simultaneously, unless $P(x) = 0$ everywhere, which, as one can check directly, is impossible.
Finally, as noted above, satisfying any of these properties for infinitely many points implies
that it actually holds for the whole real line $\mathbb{R}$.
Since $P(x)$ has odd degree, it follows that $P(x) = -P(-x)$ for all $x \in \mathbb{R}$, that is, $P(x)$
is symmetric around the origin. In particular, we get that $P(0) = 0$. Using the original
equation with $x = 0$, we get that $P(-1) = P(0^2 - 1) = (P(0))^2 - 1 = -1$ and so
$P(1) = 1$.

The next property that we will prove is that $P(y) \ge -1$ for all $y \ge -1$. (However, we
will only use it for $y \ge 1$.) Let $y \ge -1$ be any real number. Since $x = x(y) := \sqrt{y + 1}$
satisfies $y = x^2 - 1$, we get that

$$P(y) = P(x^2 - 1) = (P(x))^2 - 1 \ge -1.$$

Our next task is to show that $P(x) = x$ holds for infinitely many values of $x$ and so the
only solution is the polynomial $P(x) = x$. In order to see this, let us recursively define the
following sequence of numbers: $x_1 = 1$ and for each $n \in \mathbb{N} \setminus \{1\}$, we define
$x_n = \sqrt{x_{n-1} + 1}$. It is straightforward to see that this sequence is increasing. In fact, it
tends to $(\sqrt{5} + 1)/2 \approx 1.618$, the unique solution to the equation $x = \sqrt{x + 1}$. However,
we will not need this property. We will show by induction on $n$ that for all $n \in \mathbb{N}$, we have
$P(x_n) = x_n$, which will finish the proof.
The base case is trivial: note that $P(x_1) = P(1) = 1 = x_1$. For the inductive step,
suppose that $P(x_{n-1}) = x_{n-1}$ for some $n \in \mathbb{N} \setminus \{1\}$. It follows from the original equation
(3.11) that

$$(P(x_n))^2 = P(x_n^2 - 1) + 1 = P(x_{n-1}) + 1.$$

From the inductive hypothesis, we get that

$$(P(x_n))^2 = P(x_{n-1}) + 1 = x_{n-1} + 1 = x_n^2,$$

and so $P(x_n) = x_n$ or $P(x_n) = -x_n$. Finally, since $x_n > x_1 = 1$ and $P(x_n) \ge -1$, we get
that $P(x_n) = x_n$ and so the proof is finished.
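The behaviour of the sequence $x_n$ can be explored numerically. The sketch below is only a floating-point illustration, not a proof: it generates the first terms, compares the last one with $(\sqrt{5} + 1)/2$, and checks that the candidate $P(x) = x$ satisfies $P(x^2 - 1) = (P(x))^2 - 1$ at those points. The function names and the tolerance `tol` are assumptions made for this example.

```python
import math


def sequence(count):
    """First `count` terms of x_1 = 1, x_n = sqrt(x_{n-1} + 1)."""
    xs = [1.0]
    while len(xs) < count:
        xs.append(math.sqrt(xs[-1] + 1.0))
    return xs


def satisfies_equation(P, x, tol=1e-9):
    """Check P(x^2 - 1) = P(x)^2 - 1 at a single point, up to rounding error."""
    return abs(P(x * x - 1.0) - (P(x) ** 2 - 1.0)) <= tol


xs = sequence(30)
golden = (1.0 + math.sqrt(5.0)) / 2.0

for n, x in enumerate(xs[:6], start=1):
    print(f"x_{n} = {x:.6f}")
print("distance to the limit:", abs(xs[-1] - golden))
print("P(x) = x passes:", all(satisfies_equation(lambda t: t, x) for x in xs))
```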

REMARKS

In our problem, we were restricted to polynomials of odd degree. Let us now relax this
assumption and consider polynomials of even degree. Constant polynomials (that is,
polynomials of degree 0) are easy to investigate. If P (x) = c for some c ∈ R, then c must
satisfy the equation $c = c^2 - 1$ and so there are only two constant polynomials that satisfy
the equation (3.11): $P(x) = (1 + \sqrt{5})/2$ and $P(x) = (1 - \sqrt{5})/2$.


Let us now consider any non-constant polynomial $P(x)$ of even degree that satisfies the
equation (3.11). Arguing as before, we get that $(P(x))^2 = (P(-x))^2$ for all $x \in \mathbb{R}$. This
time, it follows that $P(x) = P(-x)$ for all $x \in \mathbb{R}$ (that is, $P(x)$ is even), and so $P(x)$ does
not have any non-zero odd terms. As a result, there exists a polynomial $Q(x)$ such that
$P(x) = Q(x^2)$. It is convenient to introduce the polynomial $R(x) := Q(x + 1)$, $x \in \mathbb{R}$, so
that $P(x) = Q(x^2) = R(x^2 - 1)$. From this observation, using additionally the fact that
$P(x)$ satisfies the equation (3.11), we get that for all $x \in \mathbb{R}$,

$$R((x^2 - 1)^2 - 1) = P(x^2 - 1) = (P(x))^2 - 1 = (R(x^2 - 1))^2 - 1.$$

Substituting $y = x^2 - 1$ we get that $R(y^2 - 1) = (R(y))^2 - 1$ and, since the function
$f : [0, \infty) \to [-1, \infty)$ defined as $f(x) := x^2 - 1$ is a bijection, we get that the polynomial
$R(x)$ satisfies (3.11) for all $x \in [-1, \infty)$. Arguing as before, we get that
$(R(x))^2 = (R(-x))^2$ for all $x \in [-1, 1]$ and so either $R(x) = R(-x)$ holds for an infinite
number of values of $x \in [-1, 1]$, or $R(x) = -R(-x)$ does. It follows that $R(x)$ is either
even or odd on the whole real line $\mathbb{R}$. It follows that $R(x)$ in fact satisfies (3.11) for all
$x \in \mathbb{R}$, not only for those $x \in [-1, \infty)$.

Let us note that the degree of $R(x)$ is the same as the degree of $Q(x)$ and so it is less than
that of $P(x)$. As a result, repeating this reduction process we will eventually reach the
case when $R(x)$ has an odd degree, since $R(x)$ cannot be constant when $P(x)$ is
not constant. When this happens, $R(x) = x$ since we showed earlier that this is the
only solution for the odd case. Therefore, all non-constant solutions of even degree have the
form $T^{(n)}(x)$ for some $n \in \mathbb{N}$, where $T(x) = x^2 - 1$. Note that $T^{(n)}(x) = T \circ \ldots \circ T(x)$ is
the composition of the function $T(x)$ performed $n$ times; see Section 3.7 for more details.
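As a sanity check of this conclusion, one can test that the iterates of $T(x) = x^2 - 1$ satisfy $P(x^2 - 1) = (P(x))^2 - 1$ at a handful of rational points. The sketch uses exact arithmetic with `fractions.Fraction`; the sample points and helper names are arbitrary choices for illustration, and agreement at finitely many points is of course not a proof.

```python
from fractions import Fraction


def T(x):
    return x * x - 1


def iterate(func, n, x):
    """Apply func to x exactly n times."""
    for _ in range(n):
        x = func(x)
    return x


def satisfies_3_11(P, x):
    """Exact check of P(x^2 - 1) = P(x)^2 - 1 at the point x."""
    return P(x * x - 1) == P(x) ** 2 - 1


samples = [Fraction(k, 3) for k in range(-6, 7)]

# results[n] is True when the n-fold composition of T passes at every sample point
results = {
    n: all(satisfies_3_11(lambda t, n=n: iterate(T, n, t), x) for x in samples)
    for n in range(1, 5)
}
print(results)
```

A polynomial outside this family, for example $x + 1$, already fails the check at $x = 1$.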

EXERCISES

3.5.1. Find all polynomials $P(x)$ with real coefficients that satisfy the following property: if
$x + y$ is rational, then $P(x) + P(y)$ is also rational.

(Source of the problem: PLMO LIV – Phase 1 – Problem 9. Solution: our own.)

3.5.2. Let $P(x)$ be a polynomial with real coefficients. Prove that if there exists an integer $k$
such that $P(k)$ is not an integer, then there are infinitely many such integers.
(Source of the problem and solution idea: PLMO LXVI – Phase 3 – Problem 2.)

3.5.3. Let $F(x)$, $G(x)$, and $H(x)$ be some polynomials of degree at most $2n + 1$ with real
coefficients. Moreover, suppose that the following properties hold:

(1) for all $x \in \mathbb{R}$, $F(x) \le G(x) \le H(x)$,

(2) there exist $n$ different numbers $x_i \in \mathbb{R}$, $i \in [n]$, such that $F(x_i) = H(x_i)$ for all
$i \in [n]$,

(3) there exists $x_0 \in \mathbb{R}$, different from $x_i$ for $i \in [n]$, such that
$F(x_0) + H(x_0) = 2G(x_0)$.

Prove that for all x ∈ R, F (x) + H (x) = 2G(x).


(Source of the problem and solution: XVIII Mathematical Olympics of Baltic Countries –
Problem 3.)

3.6 Polynomial Factorization


SOURCE

Problem and solution: PLMO LXII – Phase 2 – Problem 6

PROBLEM

Suppose that $P_1(x)$ and $Q_1(x)$ are two different polynomials with real coefficients that
satisfy the following condition: $P_1(Q_1(x)) = Q_1(P_1(x))$ for all $x \in \mathbb{R}$. For $n \in \mathbb{N} \setminus \{1\}$,
let $P_n(x) := P_1(P_{n-1}(x))$ and $Q_n(x) := Q_1(Q_{n-1}(x))$. Prove that $P_1(x) - Q_1(x)$ divides
$P_n(x) - Q_n(x)$ for all $n \in \mathbb{N}$.

THEORY

Consider any non-zero polynomial $P(x) = \sum_{i=0}^{n} p_i x^i$ of degree $n$ and any point $x_0 \in \mathbb{R}$.
We can then re-write $P(x)$ as follows: $P(x) = (x - x_0)Q(x) + r$, where
$Q(x) = \sum_{i=0}^{n-1} q_i x^i$ is a polynomial of degree $n - 1$ and $r \in \mathbb{R}$ is a constant. In order to see
this, observe that we can find the coefficients $q_i$ and the constant $r$ explicitly by comparing
the corresponding coefficients of the two polynomials. We get that $q_{n-1} = p_n$,
$q_{i-1} = x_0 q_i + p_i$ for $i \in [n - 1]$, and $r = p_0 + q_0 x_0$. Hence, we can successively calculate
$q_i$ (starting from $q_{n-1}$ and finishing with $q_0$), and then compute $r$ at the end. Let us also note
that $r = P(x_0)$. As a result, the above result could also be obtained in a different way.
Consider the polynomial $R(x) := P(x) - P(x_0)$. Since $R(x_0) = 0$, we get that
$R(x) = (x - x_0)Q(x)$ for some polynomial $Q(x)$, and so $P(x) = (x - x_0)Q(x) + P(x_0)$.

In particular, if $x_0 \in \mathbb{R}$ is a root of some non-zero polynomial $P(x)$, then the constant $r$
has to be equal to zero and we get that $P(x) = (x - x_0)Q(x)$ for some polynomial $Q(x)$ of
degree $n - 1$.
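The recurrence for the coefficients $q_i$ and the constant $r$ translates directly into code. The sketch below assumes the polynomial is given as a list `p` with `p[i]` the coefficient of $x^i$ and that its degree is at least 1; the function name `divide_by_linear` is made up for this example.

```python
def divide_by_linear(p, x0):
    """Return (q, r) such that P(x) = (x - x0) * Q(x) + r.

    p[i] is the coefficient of x**i; the degree n = len(p) - 1 must be >= 1.
    """
    n = len(p) - 1
    q = [0] * n
    q[n - 1] = p[n]                    # q_{n-1} = p_n
    for i in range(n - 1, 0, -1):      # i = n-1, ..., 1
        q[i - 1] = x0 * q[i] + p[i]    # q_{i-1} = x0 * q_i + p_i
    r = p[0] + q[0] * x0               # r = p_0 + q_0 * x0, which equals P(x0)
    return q, r


# Hypothetical example: P(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
p = [-6, 11, -6, 1]
q_root, r_root = divide_by_linear(p, 2)    # 2 is a root, so the remainder is 0
q_other, r_other = divide_by_linear(p, 4)  # 4 is not a root; the remainder is P(4)
```

In the first call the quotient is $x^2 - 4x + 3 = (x - 1)(x - 3)$ and the remainder vanishes, in line with the observation that $r = P(x_0)$.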

SOLUTION

Let us first observe that for any two different polynomials $G(x)$ and $H(x)$, and any
polynomial $F(x) = \sum_{i=0}^{n} c_i x^i$, we have that $G(x) - H(x)$ divides $F(G(x)) - F(H(x))$.
Indeed, note that

$$F(G(x)) - F(H(x)) = \sum_{i=0}^{n} c_i (G(x))^i - \sum_{i=0}^{n} c_i (H(x))^i$$

$$= \sum_{i=0}^{n} c_i \left( (G(x))^i - (H(x))^i \right)$$

$$= (G(x) - H(x)) \sum_{i=0}^{n} c_i \sum_{j=0}^{i-1} (G(x))^j (H(x))^{i-1-j}.$$
Our second observation is that $P_n(Q_1(x)) = Q_1(P_n(x))$ for each $n \in \mathbb{N}$. This can be
easily proved by induction on $n$. Indeed, the base case ($n = 1$) follows immediately from
our assumption that $P_1(Q_1(x)) = Q_1(P_1(x))$. For the inductive step, suppose that
$P_{n-1}(Q_1(x)) = Q_1(P_{n-1}(x))$ for some $n \in \mathbb{N} \setminus \{1\}$. Using the inductive hypothesis and
our assumption that $P_1(Q_1(x)) = Q_1(P_1(x))$, we get that

$$P_n(Q_1(x)) = P_1(P_{n-1}(Q_1(x))) = P_1(Q_1(P_{n-1}(x)))$$

$$= Q_1(P_1(P_{n-1}(x))) = Q_1(P_n(x)),$$

and the second claim holds.


We are now ready to come back to our main task of showing that $P_1(x) - Q_1(x)$ divides
$P_n(x) - Q_n(x)$ for all $n \in \mathbb{N}$. We will prove it by induction on $n$. The base case ($n = 1$) is
trivial. For the inductive step, suppose that the claim holds for some $n \in \mathbb{N}$, that is,
$P_1(x) - Q_1(x)$ divides $P_n(x) - Q_n(x)$. Without loss of generality, we may assume that
$P_1(x)$ is not constant (note that, since $P_1(x)$ and $Q_1(x)$ are different, they also cannot
both be constant). Note that

$$P_{n+1}(x) - Q_{n+1}(x) = P_1(P_n(x)) - Q_1(Q_n(x))$$

$$= P_n(P_1(x)) - Q_n(Q_1(x))$$

$$= \left( P_n(P_1(x)) - P_n(Q_1(x)) \right) + \left( P_n(Q_1(x)) - Q_n(Q_1(x)) \right),$$

where the second equality holds because the composition of functions is associative (see
Section 3.7 for more details). We will independently show that both terms are divisible by
$P_1(x) - Q_1(x)$. The first observation implies that the first term, $P_n(P_1(x)) - P_n(Q_1(x))$,
is divisible by $P_1(x) - Q_1(x)$ since $P_1(x)$ and $Q_1(x)$ are different and $P_1$ is not constant.
Using the second observation, we may re-write the second term as follows:
$P_n(Q_1(x)) - Q_n(Q_1(x)) = Q_1(P_n(x)) - Q_1(Q_n(x))$. If $P_n(x)$ and $Q_n(x)$ are identical,
then this term vanishes. Otherwise, we apply the first observation one more time to get that
this term is divisible by $P_n(x) - Q_n(x)$, and so also by $P_1(x) - Q_1(x)$, by the inductive
hypothesis.

REMARKS

The problem we considered in this section belongs to a large and important family of
problems where one assumes that some property (or a set of properties) holds and the goal
is to show that some other property also holds. However, it is important to keep in mind that
formally the statement we aim to prove is an example of the conditional statement P → Q.
Moreover, the statement P → Q is true when P is false, regardless of whether Q is false or
true. In such examples, we say that the conditional statement is vacuously true or true by
default, which may lead to situations not necessarily intended by the author. As an example,
consider the following two statements: 1) All the banks we robbed are in Canada, and 2) All
the banks we robbed are outside of Canada. Since we actually did not rob any bank,
regardless of whether in Canada or not, both statements are vacuously true.
Therefore, in practice it is important to make sure that there are objects that satisfy the
assumed properties of the theorem. The problem in this section did not ask us to verify this
so let us now make sure that we did not prove a statement that is vacuously true. Indeed, in
our problem one can clearly take $Q_1(x) = x$ and any $P_1(x)$ different from $Q_1(x)$ to satisfy
the assumptions of our problem. A less trivial example is the pair $Q_1(x) = x^3 - 3x$ and
$P_1(x) = x^2 - 2$. Indeed, for this choice we get that

$$Q_1(P_1(x)) = (x^2 - 2)^3 - 3(x^2 - 2)$$

$$= x^6 - 6x^4 + 9x^2 - 2$$

$$= (x^3 - 3x)^2 - 2 = P_1(Q_1(x)).$$
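For the pair $P_1(x) = x^2 - 2$ and $Q_1(x) = x^3 - 3x$, both the commutation property and the divisibility claim of the problem can be checked by direct computation. The sketch below stores a polynomial as a list of coefficients with the lowest degree first and uses exact rational arithmetic for the division; all helper names are illustrative, and checking $n = 1, 2, 3$ only illustrates the statement.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return trim([a + b for a, b in zip(p, q)])


def sub(p, q):
    return add(p, [-c for c in q])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def compose(p, q):
    """Coefficients of p(q(x)), computed with Horner's scheme."""
    out = [0]
    for c in reversed(p):
        out = add(mul(out, q), [c])
    return out


def remainder(p, d):
    """Remainder of p divided by the non-zero polynomial d."""
    p = [Fraction(c) for c in trim(p)]
    d = trim(d)
    while len(p) >= len(d) and p != [0]:
        factor = p[-1] / d[-1]
        shift = len(p) - len(d)
        for i, c in enumerate(d):
            p[shift + i] -= factor * c
        p = trim(p)
    return p


P1 = [-2, 0, 1]          # x^2 - 2
Q1 = [0, -3, 0, 1]       # x^3 - 3x
difference = sub(P1, Q1)  # P_1 - Q_1

commute = compose(P1, Q1) == compose(Q1, P1)

Pn, Qn = P1, Q1
remainders = []
for n in range(1, 4):
    # remainder of P_n - Q_n after division by P_1 - Q_1
    remainders.append(remainder(sub(Pn, Qn), difference))
    Pn, Qn = compose(P1, Pn), compose(Q1, Qn)
```

Each entry of `remainders` is the zero polynomial `[0]`, as the statement of the problem predicts.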

On the other hand, it might be the case that there are actually no objects that satisfy the
assumed properties of the theorem. In such situations, conditional statements can often help
us to formally prove it. Indeed, one can show that the statement P → Q is true and then that
Q is false. The conclusion is that P has to be false since that is the only possibility for the
conditional statement P → Q to be true. Such reasoning is called proof by contradiction
and we often use it in this book.
In order to illustrate this technique, let us consider the following example related to the
problem from this section. We will first prove the following conditional statement: if a
polynomial $P : \mathbb{R} \to \mathbb{R}$ of odd degree satisfies the equation $P(x) = Q(x^2)$ for some
polynomial $Q(x)$, then $P(x)$ is even. Indeed, it is clear that for each $x \in \mathbb{R}$ we have
$P(x) = Q(x^2) = Q((-x)^2) = P(-x)$ so $P(x)$ is even. The conditional statement is true.
However, any polynomial $P(x)$ of odd degree has the property that $\lim_{x \to \infty} P(x) = \infty$ and
$\lim_{x \to -\infty} P(x) = -\infty$, or vice versa, $\lim_{x \to \infty} P(x) = -\infty$ and $\lim_{x \to -\infty} P(x) = \infty$. As
a result, there exists $x \in \mathbb{R}$ such that $P(x) \neq P(-x)$, and so $P(x)$ is not even. In other
words, we showed that no polynomial of odd degree is even. The conclusion is that no
polynomial $P(x)$ of odd degree satisfies $P(x) = Q(x^2)$ for some polynomial $Q(x)$.

EXERCISES

3.6.1. Find all real numbers $m$ for which the polynomial
$f(x) = 2x^4 - 7x^3 + mx^2 + 22x - 8$ has two real roots whose product is equal to 2.

(Source of the problem and solution: PLMO XXX – Phase 1 – Problem 5.)

3.6.2. Given the polynomial $P(x) = x^4 - 3x^3 + 5x^2 - 9x$, $x \in \mathbb{R}$, find all pairs of integers
$a$ and $b$ such that $a \neq b$ and $P(a) = P(b)$.

(Source of the problem and solution idea: PLMO LIV – Phase 2 – Problem 3.)

3.6.3. Find all polynomials $P(x)$ with real coefficients that satisfy the following property:
for all $x \in \mathbb{R}$, $P(x^2) \cdot P(x^3) = (P(x))^5$.

(Source of the problem and solution idea: PLMO LIX – Phase 1 – Problem 6.)
3.7 Polynomials and Number Theory
SOURCE

Problem: PLMO LXX – Phase 2 – Problem 3


Solution: our own

PROBLEM

Let $f(t) = t^3 + t$ for $t \in \mathbb{R}$. Consider the family of iterated functions defined as follows:
$f^{(0)}(t) = t$, $t \in \mathbb{R}$, and for each $i \in \mathbb{N}$ we define $f^{(i)}(t) = f(f^{(i-1)}(t))$, $t \in \mathbb{R}$. Decide if
there exist rational numbers $x$ and $y$ and natural numbers $m$ and $n$ such that $xy = 3$ and
$f^{(m)}(x) = f^{(n)}(y)$.

THEORY

Function Composition The problem in this section requires considering repeated application of
the same function to itself. In general, consider any two functions $f : X \to Y$ and
$g : Y \to Z$. Then, the function $g$ can be applied to the result of applying the function $f$ to $x$.
Formally, the composition of these two functions is the function $g \circ f : X \to Z$ defined as
follows: $(g \circ f)(x) := g(f(x))$ for all $x \in X$. This process is called function composition.
If $f : X \to Y$ and $Y \subseteq X$, then one may compose the function $f$ with itself. The result is
often denoted by $f^{(2)}$, that is, $f^{(2)} = f \circ f$. More generally, for any $n \in \mathbb{N} \setminus \{1\}$, the $n$th
functional power is defined inductively by $f^{(n)} := f \circ f^{(n-1)}$. The result of repeatedly composing
such a function with itself is called an iterated function.


Function composition has several useful properties. First of all, the composition of
functions is always associative; that is, if $f$, $g$, and $h$ are any three functions with suitably
chosen domains and codomains, then $h \circ (g \circ f) = (h \circ g) \circ f$. An implication for iterated
functions is that for any $k, \ell \in \mathbb{N}$, we have

$$f^{(k+\ell)} = f^{(k)} \circ f^{(\ell)} = f^{(\ell)} \circ f^{(k)}.$$

Moreover, it is easy to show that if $f$ and $g$ are one-to-one, then $g \circ f$ is also one-to-one.
Similarly, the composition of two onto functions is always onto. As a result, if $f$ and $g$ are
bijections, then $g \circ f$ is a bijection. The inverse function of a composition (assuming it is
invertible) has the property that

$$(g \circ f)^{-1} = f^{-1} \circ g^{-1}.$$

Finally, let us mention that in order to solve the problem, we will use the concept of an
invariant, which is discussed in more detail in Section 4.2.
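The properties of composition listed above are easy to experiment with. The following sketch builds functional powers by repeated composition and checks, at a single integer point, that $f^{(2)} \circ f^{(3)}$ and $f^{(3)} \circ f^{(2)}$ both agree with $f^{(5)}$; it also checks the order of the factors in the inverse of a composition on a simple pair of bijections. The helper names and the example functions `double` and `shift` are assumptions made for illustration.

```python
from fractions import Fraction


def compose(g, f):
    """Return the composition g o f, that is, x -> g(f(x))."""
    return lambda x: g(f(x))


def power(f, n):
    """Return the n-th functional power f^(n) for n >= 1."""
    result = f
    for _ in range(n - 1):
        result = compose(f, result)   # f^(k) = f o f^(k-1)
    return result


f = lambda t: t**3 + t

# f^(5) compared with both orders of composing f^(2) and f^(3)
direct = power(f, 5)(2)
two_after_three = compose(power(f, 2), power(f, 3))(2)
three_after_two = compose(power(f, 3), power(f, 2))(2)

# Inverse of a composition: (g o f)^(-1) = f^(-1) o g^(-1)
double = lambda x: 2 * x               # plays the role of f
shift = lambda x: x + 3                # plays the role of g
double_inv = lambda y: Fraction(y) / 2
shift_inv = lambda y: y - 3
inverse = compose(double_inv, shift_inv)
```

Applying `inverse` to `compose(shift, double)(7)` returns 7, whereas composing the two inverses in the opposite order does not.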

SOLUTION

Let us first observe that $f^{(1)}(t) = f(f^{(0)}(t)) = f(t)$ so, indeed, we deal with iterated
functions. Recall also that the composition of functions is associative, that is, for each
$i \in \mathbb{N} \setminus \{1\}$, we have $f^{(i)} = f^{(1)} \circ f^{(i-1)} = f^{(i-1)} \circ f^{(1)}$.

Let us now define a function $s : \mathbb{Q} \to \{0, 1\}$ in the following way. Let $r = a/b$ be a
rational number expressed in lowest terms; that is, $a \in \mathbb{Z}$, $b \in \mathbb{N}$, and $\gcd(a, b) = 1$. Then,
$s(r) = 0$ if $3 \mid a$; otherwise, $s(r) = 1$.

We will show that the value of $s$ does not change under transformation $f$, which is the key
observation that will allow us to solve the problem. Indeed, consider any rational number
$r = a/b$ expressed in lowest terms. Then,

$$s(f(r)) = s(r^3 + r) = s\left( \frac{a^3}{b^3} + \frac{a}{b} \right) = s\left( \frac{a(a^2 + b^2)}{b^3} \right).$$

Suppose first that $s(r) = 0$, that is, $3 \mid a$. Since $\gcd(a, b) = 1$, we get that $b$ is not
divisible by 3. It follows that $b^3$ is not divisible by 3 whereas $3 \mid a(a^2 + b^2)$, and so we get
that $s(f(r)) = 0$. On the other hand, if $s(r) = 1$, then 3 does not divide $a$. It follows that
3 also does not divide $a(a^2 + b^2)$, as $a^2 + b^2$ would be divisible by 3 if and only if both $a$
and $b$ were divisible by 3, which is not the case, and so $s(f(r)) = 1$.
Our final observation is that if $xy = 3$ for some rational numbers $x, y$, then $s(x) \neq s(y)$.
Indeed, suppose that $x = a/b$ is a rational number expressed in lowest terms. If $s(x) = 0$
(that is, 3 divides $a$ but 3 does not divide $b$), then $a = 3k$ for some $k \in \mathbb{Z}$ and so
$y = 3/x = 3b/a = b/k$. Since 3 does not divide $b$ we get that $s(y) = 1$. On the other hand,
if $s(x) = 1$ (that is, 3 does not divide $a$), then in the fraction $y = 3b/a$ the numerator must
be divisible by 3 and so $s(y) = 0$. However, this means that there is no solution to the
equation defined in the problem, as no matter which $n$ and $m$ we select, the function $s$ evaluated
at $f^{(m)}(x)$ is different from the one at $f^{(n)}(y)$ as long as $xy = 3$. In particular, it implies that
$f^{(m)}(x) \neq f^{(n)}(y)$.
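The invariant can be observed experimentally with exact rational arithmetic. In the sketch below, `fractions.Fraction` keeps every number in lowest terms, so `s` only needs to inspect the numerator; the sample values of `x` are arbitrary and the helper names are illustrative. The experiment does not replace the proof above.

```python
from fractions import Fraction


def f(t):
    return t**3 + t


def s(r):
    """Return 0 if 3 divides the numerator of r in lowest terms, and 1 otherwise."""
    return 0 if Fraction(r).numerator % 3 == 0 else 1


def iterate(x, count):
    """Apply f to x exactly `count` times."""
    for _ in range(count):
        x = f(x)
    return x


samples = [Fraction(3, 5), Fraction(-6, 7), Fraction(2, 9), Fraction(5), Fraction(1, 3)]

for x in samples:
    y = 3 / x  # the partner with x * y = 3
    orbit_x = [s(iterate(x, m)) for m in range(5)]
    orbit_y = [s(iterate(y, n)) for n in range(5)]
    print(x, orbit_x, y, orbit_y)
```

For every sample, the values of `s` along the orbit of `x` are constant and differ from those along the orbit of `y`.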

REMARKS

Let us point out that the assumption in our problem that $x$ and $y$ are rational numbers is
crucial. Indeed, if $x$ and $y$ are allowed to be any real numbers, then for any $n, m \in \mathbb{N}$, we
can easily find $x, y \in \mathbb{R}$ such that $xy = 3$ and $f^{(m)}(x) = f^{(n)}(y)$.
In order to see this, note that for any $n \in \mathbb{N}$, the function $f^{(n)}(x)$ satisfies the following
properties: i) $f^{(n)}(0) = 0$, ii) $f^{(n)}(x)$ is increasing on the interval $[0, \infty)$, iii)
$\lim_{x \to \infty} f^{(n)}(x) = \infty$, and iv) $f^{(n)}(x)$ is a polynomial and so it is a continuous function.
Now, fix any $n, m \in \mathbb{N}$ and consider the function

$$g(x) := f^{(m)}(x) - f^{(n)}(3/x).$$

Using the properties listed above we get that $g(x)$ is continuous on $(0, \infty)$,
$\lim_{x \to 0^+} g(x) = -\infty$, and $\lim_{x \to \infty} g(x) = \infty$. It follows that there exists $x_0 \in (0, \infty)$ such
that $g(x_0) = 0$. Hence, there exists a pair $x = x_0 \in \mathbb{R}_+$ and $y = 3/x_0 \in \mathbb{R}_+$ such that
$xy = 3$ and $f^{(m)}(x) = f^{(n)}(y)$.
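A real solution can also be located numerically. The sketch below applies bisection to $g(x) = f^{(m)}(x) - f^{(n)}(3/x)$, which is increasing on $(0, \infty)$. The bracket $[0.1, 30]$, the number of steps, and the restriction to small $m$ and $n$ (to keep floating-point values moderate) are assumptions of this illustration.

```python
def f(t):
    return t**3 + t


def iterate(x, count):
    """Apply f to x exactly `count` times."""
    for _ in range(count):
        x = f(x)
    return x


def solve(m, n, lo=0.1, hi=30.0, steps=200):
    """Bisection for g(x) = f^(m)(x) - f^(n)(3/x) = 0 on the bracket [lo, hi]."""
    g = lambda x: iterate(x, m) - iterate(3.0 / x, n)
    assert g(lo) < 0 < g(hi), "bracket does not contain a sign change"
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


x_equal = solve(2, 2)   # for m = n the solution is x = y = sqrt(3)
x_mixed = solve(1, 2)   # here f(x) = f(f(3/x)), so x = f(3/x)
print(x_equal, x_mixed, 3.0 / x_mixed)
```

In the second case $x = f(3/x)$ is equivalent to $x^4 - 3x^2 - 27 = 0$, whose positive root is irrational, in agreement with the solution of the problem.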

EXERCISES
3.7.1. Prove that there are no polynomials $P_1(x), P_2(x), P_3(x), P_4(x)$ with rational
coefficients that satisfy

$$\sum_{i=1}^{4} (P_i(x))^2 = x^2 + 7 \quad \text{for all } x \in \mathbb{R}.$$

(Source of the problem and solution idea: PLMO LXII – Phase 3 – Problem 6.)

3.7.2. Consider a polynomial $f(x) := x^2 + bx + c$, where $b, c \in \mathbb{Z}$. Prove that if $n \in \mathbb{N}$
divides $f(p)$, $f(q)$, and $f(r)$ for some $p, q, r \in \mathbb{Z}$, then

$$n \mid (p - q)(q - r)(r - p).$$

(Source of the problem and solution idea: PLMO LXIV – Phase 2 – Problem 1.)

3.7.3. Consider a polynomial $P(x)$ with integer coefficients that satisfies the following
property: if $a, b \in \mathbb{Q}$ and $a \neq b$, then $P(a) \neq P(b)$. Does it mean that $P(a) \neq P(b)$ for all
$a, b \in \mathbb{R}$, $a \neq b$?

(Source of the problem and solution idea: PLMO LXIV – Phase 2 – Problem 5.)
Chapter 4
Combinatorics

4.1 Enumeration
4.2 Tilings
4.3 Counting
4.4 Extremal Graph Theory
4.5 Probabilistic Methods
4.6 Probability
4.7 Combinations of Geometrical Objects
4.8 Pigeonhole Principle
4.9 Generating Functions

As usual, we start the chapter with some basic definitions.


THEORY

Graphs Some of our examples will be from graph theory and so here we introduce a few basic
definitions. A (simple) graph G = (V , E) is a pair consisting of a vertex set V = V (G) and an edge set
E = E(G) consisting of pairs of vertices; that is,

E(G) ⊆ {{u, v} : u, v ∈ V (G), u ≠ v } .

We write uv if u and v form an edge, and say that u and v are adjacent or joined. We refer to u and v as
endpoints of the edge uv. The order of a graph is n := |V (G)|, and its size is m := |E(G)|.
If u and v are the endpoints of an edge, then we say that they are neighbors. The neighborhood of a
vertex v, denoted N (v), is the set of all neighbors of v. The degree of a vertex v, written deg(v), is the
number of neighbors of v; that is, deg(v) := |N (v)|. The numbers

$$\delta = \delta(G) := \min_{v \in V(G)} \deg(v)$$

$$\Delta = \Delta(G) := \max_{v \in V(G)} \deg(v)$$

are the minimum degree and, respectively, the maximum degree of G. A graph is called k -regular,
provided each of its vertices has degree k.
A clique (sometimes called a complete graph) is a set of pairwise-adjacent vertices. The clique of
order $n$ is denoted by $K_n$. An independent set (sometimes called an empty graph) is a set of
pairwise-nonadjacent vertices. The path on $n$ vertices, denoted by $P_n$, consists of $n$ vertices,
$v_1, \ldots, v_n$, and $n - 1$ edges, $v_i v_{i+1}$ for $i \in [n - 1]$. The cycle on $n$ vertices, denoted by $C_n$,
consists of $n$ vertices, $v_1, \ldots, v_n$, and $n$ edges, $v_n v_1$ and $v_i v_{i+1}$ for $i \in [n - 1]$.
A graph $G' = (V', E')$ is a subgraph of $G = (V, E)$ if $V' \subseteq V$ and $E' \subseteq E$. If $V' \subseteq V$, then

$$G[V'] = (V', \{uv \in E : u, v \in V'\})$$

is the subgraph of G induced by V′.

Bipartite Graphs A graph G is bipartite if the vertex set can be partitioned into two sets, X and Y (that
is, V (G) = X ∪ Y , where X ∩ Y = ∅), and every edge is of the form xy, where x ∈ X and y ∈ Y .
Here X and Y are called partite sets. This definition can be easily generalized to r-partite graphs. This
time, $V(G) = X_1 \cup X_2 \cup \ldots \cup X_r$ for some $r \ge 2$ and there is no edge of the graph with both
endpoints in $X_i$, for any $i \in [r]$. The complete r-partite graph $K_{n_1, n_2, \ldots, n_r}$ is the graph with partite sets
$X_1, \ldots, X_r$ with $n_i = |X_i|$ ($i \in [r]$) and edges between every pair of vertices from different partite
sets.

Matchings A matching in a graph G is a collection of disjoint edges. The vertices of G incident to the
edges of a matching M are called saturated or matched by M; the other vertices are unsaturated or
unmatched. A matching is maximal if it cannot be extended by adding an edge. A matching is
maximum if it contains the largest possible number of edges. In particular, a perfect matching in a
graph G is a (maximum) matching in G that saturates all vertices of G.

4.1 Enumeration
SOURCE

Problem and solution idea: PLMO XVIII – Phase 3 – Problem 3

PROBLEM

There are 100 students at the party. Each student knows at least 67 other students. Prove that there are
at least four students that all know each other. We assume that this relationship is symmetric; that is, if
student A knows student B, then student B knows student A.

THEORY

Constructive Argument A constructive argument is a method of proving a statement that demonstrates
the existence of a mathematical object by creating (or providing a method for creating) the object. For
example, a common way of showing that the set of prime numbers is infinite is a famous, constructive
argument due to Euclid. For a contradiction, suppose that the set of prime numbers is finite, in which
case there is a largest prime number that we denote by n. But then, since n! + 1 > n, the number n! + 1 is not
prime. On the other hand, clearly, all of its prime factors are greater than n, which gives us the desired
contradiction.
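The key step of Euclid's argument, that every prime factor of n! + 1 exceeds n, can be checked for small n. The sketch below uses plain trial division; the helper name `smallest_prime_factor` and the range of n are choices made for this illustration.

```python
import math


def smallest_prime_factor(k):
    """Smallest prime factor of an integer k >= 2, found by trial division."""
    d = 2
    while d * d <= k:
        if k % d == 0:
            return d
        d += 1
    return k


# Map n to the smallest prime factor of n! + 1
witnesses = {n: smallest_prime_factor(math.factorial(n) + 1) for n in range(1, 11)}

for n, p in witnesses.items():
    print(f"n = {n}: smallest prime factor of n! + 1 is {p}")
```

None of the numbers 2, …, n divides n! + 1, since each of them divides n!, which is why the smallest prime factor found is always larger than n.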

Non-constructive Argument The approach mentioned above is in contrast to a non-constructive
argument (also known as an existence proof) which proves the existence of a particular kind of object
without providing an explicit example. To illustrate this method, we will show that there exist two
irrational numbers, $x$ and $y$, such that $x^y$ is rational. (Recall that $x$ is rational if $x = a/b$ for some $a \in \mathbb{Z}$
and $b \in \mathbb{N}$; otherwise, $x$ is irrational. We will also use the fact that $\sqrt{2}$ is irrational.)

Indeed, if $\sqrt{2}^{\sqrt{2}}$ is rational, then we are immediately done: $x = y = \sqrt{2}$ is the pair that has the
desired property. Otherwise, $x = \sqrt{2}^{\sqrt{2}}$ and $y = \sqrt{2}$ have the desired property as

$$x^y = \left( \sqrt{2}^{\sqrt{2}} \right)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^{2} = 2.$$

Let us stress the fact that based on the above argument, we do not know which of the two pairs of $x$ and
$y$ satisfies the desired property but we do know that precisely one of them does. In fact, it turns out that
$\sqrt{2}^{\sqrt{2}}$ is irrational but this fact is not needed to claim the correctness of the statement.

Greedy Algorithm Let us come back to constructive arguments. The easiest approach one can try is to
construct the desired object by making locally optimal choices at each stage with the intent of finding a
global optimum. Such an approach is often called a greedy strategy (or greedy algorithm). Let us stress the
fact that a greedy strategy does not usually produce an optimal solution but it may yield one or at least a
good approximation of it.

SOLUTION

We will perform a greedy search for four students that mutually know each other. Start with any student
A from the set of all students. Now, select any student B, different than A, that knows A. Clearly, it is
possible since there are at least 67 students that know A.
We will now show that there exists a student C that knows both A and B. At most
100 − 67 − 1 = 32 students do not know A and, similarly, at most 32 students do not know B. Since

there are 98 students different than A and B and 32 + 32 < 98, there must exist a student that knows
both A and B, as claimed. We select any such student and call it C.
We continue this greedy selection process to find a student D that knows students A, B, and C.
Arguing as before, there are 97 students to choose from but only at most 3 ⋅ 32 = 96 of them do not
know at least one of A, B, or C. Hence, at least one such student exists and the process is finished.
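The greedy selection can be written as a short procedure that repeatedly picks any common neighbour of the students chosen so far. To have something concrete to run it on, the sketch builds a hypothetical acquaintance graph on 100 vertices in which each vertex is joined to the 34 vertices on either side of it on a circle, so every degree is 68 and the assumption of the problem holds. The graph and the function names are illustrative only.

```python
def circulant_graph(n, reach):
    """Adjacency sets: i and j are adjacent iff their circular distance is in 1..reach."""
    return {
        i: {(i + d) % n for d in range(-reach, reach + 1) if d != 0}
        for i in range(n)
    }


def greedy_clique(adj, size):
    """Grow a clique by always taking any common neighbour of the current members."""
    clique = []
    candidates = set(adj)
    while len(clique) < size:
        if not candidates:
            return None               # the greedy process got stuck
        v = min(candidates)           # any choice works; min keeps the run deterministic
        clique.append(v)
        candidates &= adj[v]          # keep only common neighbours of all members
    return clique


adj = circulant_graph(100, 34)
min_degree = min(len(neighbours) for neighbours in adj.values())
clique = greedy_clique(adj, 4)
print(min_degree, clique)
```

The counting argument in the solution guarantees that `candidates` is non-empty before each of the first four selections whenever every degree is at least 67, so the procedure cannot return `None` for `size=4` on such a graph.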

REMARKS

The property stated in the problem is best possible in the following sense. Suppose that there are still
100 students but this time each of them knows at least 66 other students, instead of 67. With this
slightly weaker assumption, it is possible that there are no four students who know one another. Indeed,
let us partition students into three sets of sizes 33, 33, and 34, respectively, and assume that a student
from one set knows only students from the other two sets.
This property can be reformulated in the language of graph theory as follows: there exists a graph on
$n = 100$ vertices with minimum degree 66 and without $K_4$, the complete graph on 4 vertices, as a
subgraph. Moreover, this example is, in fact, the well-known Turán graph $T(100, 3)$, related to an
important problem in extremal combinatorics. We will come back to such problems in Section 4.4. In
general, the Turán graph $T(n, r)$ is a graph formed by partitioning a set of $n$ vertices into $r$ subsets, with
sizes as equal as possible, and connecting two vertices by an edge if and only if they belong to different
subsets. The number of edges in this graph is at most $(1 - 1/r)n^2/2$ with equality holding if and only
if $n$ is divisible by $r$; that is, all sets have equal sizes.
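The claims about $T(100, 3)$ can be verified by brute force. The sketch below places vertex `v` in part `v % r`, which gives parts of sizes 34, 33, and 33, and then checks the minimum degree, the number of edges, and the absence of $K_4$ with a simple backtracking search. The construction by residues and the function names are assumptions made for this illustration.

```python
def turan_graph(n, r):
    """Adjacency sets of T(n, r): vertex v belongs to part v % r."""
    return {
        v: {u for u in range(n) if u % r != v % r}
        for v in range(n)
    }


def has_clique(adj, size):
    """Backtracking search for a set of `size` pairwise-adjacent vertices."""
    def extend(members, candidates):
        if len(members) == size:
            return True
        for v in sorted(candidates):
            # only look at larger common neighbours, so each clique is built once
            if extend(members + [v], {u for u in candidates & adj[v] if u > v}):
                return True
        return False

    return extend([], set(adj))


n, r = 100, 3
adj = turan_graph(n, r)
min_degree = min(len(neighbours) for neighbours in adj.values())
edges = sum(len(neighbours) for neighbours in adj.values()) // 2
bound = (1 - 1 / r) * n**2 / 2

print(min_degree, edges, bound)
print(has_clique(adj, 3), has_clique(adj, 4))
```

Since 100 is not divisible by 3, the number of edges is strictly below the bound $(1 - 1/r)n^2/2$.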


It is well-known (and, for completeness, we will prove it now) that the Turán graph has the
maximum possible number of edges among all graphs on $n \ge r$ vertices with the property that no $r + 1$
vertices induce $K_{r+1}$, the complete graph on $r + 1$ vertices. Moreover, the Turán graph is the unique
graph on $n$ vertices that satisfies this property, while having this maximum number of edges. In order to
prove this uniqueness property, suppose that $G = (V, E)$ is such an extremal graph. We will start with
proving the following property.

Observation: There are no $a, b, c \in V$ such that $ab \in E$, $ac \notin E$, and $bc \notin E$.

Proof: For a contradiction, suppose that the opposite is true; that is, that there exist $a, b, c \in V$ such that
$ab \in E$, $ac \notin E$, and $bc \notin E$. Without loss of generality, we may assume that $\deg(a) \ge \deg(b)$. Let us
independently consider the following two cases.


Case 1: $\deg(c) < \deg(a)$. Construct $G'$ from $G$ by replacing vertex $c$ with vertex $a'$, a copy of vertex
$a$; that is, $a'$ is adjacent to $v \in V \setminus \{c\}$ if and only if $a$ is. In particular, $a$ and $a'$ are not adjacent. Note
that $G'$ has more edges than $G$. Moreover, $G'$ does not contain $K_{r+1}$. Indeed, no
$S \subseteq V(G') = (V \setminus \{c\}) \cup \{a'\}$, $|S| = r + 1$, induces the complete graph in $G'$ if $a' \notin S$ (otherwise, $S$
would induce the complete graph in $G$). The same holds if both $a$ and $a'$ are in $S$ (since $a$ and $a'$ are not
adjacent in $G'$). Finally, we argue that this is true if $a' \in S$ but $a \notin S$ (otherwise, $(S \setminus \{a'\}) \cup \{a\}$
would induce the complete graph in $G$). This contradicts the fact that $G$ has the maximum number of
edges.
Case 2: $\deg(c) \ge \deg(a)$. This time we construct $G'$ from $G$ by removing vertices $a$ and $b$ and
adding vertices $c'$ and $c''$, two copies of vertex $c$; that is, $c'$ and $c''$ are adjacent to $v \in V \setminus \{a, b\}$ if and
only if $c$ is, and are not adjacent to each other. In particular, $\{c, c', c''\}$ induce no edge. Note that $G'$ has
more edges than $G$. Indeed, note that we removed $\deg(a) + \deg(b) - 1 \le 2\deg(a) - 1$ edges (since $a$
and $b$ were adjacent in $G$), fewer than the number of edges added, namely, $2\deg(c) \ge 2\deg(a)$.
Moreover, arguing as in the previous case, $G'$ does not contain $K_{r+1}$ and so we get the desired
contradiction.
It follows from the observation that for any three vertices $a, b, c \in V$, if $ab \notin E$ and $bc \notin E$, then
$ac \notin E$. Hence, one can partition the vertex set $V$ into $k$ disjoint subsets $V_1, V_2, \ldots, V_k$ such that
vertices from $V_i$ are adjacent to all vertices in $V \setminus V_i$ but to no vertex in $V_i$. Since no $r + 1$ vertices
induce $K_{r+1}$, we know that $k \le r$; otherwise, one could pick one vertex from each set $V_1, V_2, \ldots, V_{r+1}$
to form $K_{r+1}$.
Using the notation $n_i = |V_i|$, we clearly have $n = \sum_{i=1}^{k} n_i$. The number of edges of $G$ is

$$\frac{1}{2} \sum_{i=1}^{k} n_i (n - n_i) = \frac{1}{2} \left( n \sum_{i=1}^{k} n_i - \sum_{i=1}^{k} n_i^2 \right) = \frac{1}{2} \left( n^2 - \sum_{i=1}^{k} n_i^2 \right).$$

By Jensen’s inequality (see Section 1.1), the sum $\sum_{i=1}^{k} n_i^2$ is minimized for $n_i = n/k$, so the number
of edges of $G$ is less than or equal to

$$\frac{1}{2} \left( n^2 - k \left( \frac{n}{k} \right)^2 \right) = \left( 1 - \frac{1}{k} \right) \frac{n^2}{2},$$

which is maximized for $k = r$. Note that it might happen that $n/k$ is not an integer and so the above
construction cannot be achieved. Hence, in fact, the unique graph that maximizes the number of edges
has the actual sizes of the sets $V_i$ selected in such a way that they differ by at most 1. This finishes the proof.

EXERCISES

4.1.1. There are 2n members of a chess club; each member knows at least n other members (knowing a
person is a reciprocal relationship). Prove that it is possible to assign members of the club into n pairs
in such a way that in each pair both members know each other.
(Source of the problem and solution idea: PLMO XLV – Phase 1 – Problem 9, modified.)

4.1.2. There are 17 players in the tournament in which each pair of two players compete against each
other. Every game can last 1, 2, or 3 rounds. Prove that there exist three players who have played
exactly the same number of rounds with one another.
(Source of the problem and solution: well-known, classic problem related to Ramsey numbers.)
4.1.3. Consider a group of people with the following property. Some of them know each other, in which
case the corresponding pair of people mutually like each other or dislike each other. Moreover, there is
a person who knows at least six other people. Interestingly, for each person the number of people he or
she likes is equal to the number of people he or she dislikes. Prove that it is possible to remove some,
but not all, like/dislike links such that it is still the case that each person has the same number of liked
and disliked acquaintances.
(Source of the problem: PLMO LXIX – Phase 1 – Problem 7, modified. Solution: our own.)

4.2 Tilings
SOURCE

Problem: PLMO LXX – Phase 1 – Problem 3 (slight modification)


Solution: our own

PROBLEM

You are given a square grid of size 128 × 128.


a) Prove that it is possible to tile it with 3,276 blocks of size 5 × 1 and one block of size 2 × 2.
b) Prove that it is impossible to do it when the block of size 2 × 2 touches the border of the square
grid.

THEORY

Invariant An invariant is a property held by a class of mathematical objects, which remains unchanged
when transformations of a certain type are applied to the objects. Invariants are used in diverse areas of
mathematics such as geometry, topology, algebra, and discrete mathematics.
In order to formally define an invariant we have to define an object, its property, and a transformation
under which this property is invariant. Here are some classical examples; in each of them we
highlight the object, the property, and the transformation:

1. the distance (property) between two points on a number line (object) is not changed by adding
the same quantity to both numbers (transformation);

2. the area (property) of a figure (object) is invariant with respect to translation (transformation);

3. the degree (property) of a polynomial (object) is invariant under multiplication by a non-zero
number (transformation);

4. the measure of angle (property) based on a given circle arc (object) is invariant with respect to
the choice of location of the vertex on this arc (transformation).

In solving tiling problems we often rely on finding an insightful invariant. In the problem we deal
with in this section, the invariant will ensure that, after we appropriately assign numbers to all cells,
no matter how a 5 × 1 block is placed it covers cells with the same sum of numbers. On the other hand,
the corresponding sum for the 2 × 2 block will not have this property. This difference will turn out to
be the key observation in the proof.

SOLUTION
Part a) is rather straightforward. Observe that one can easily cover the 120 × 128 rectangular grid with
3,072 tiles of size 5 × 1 (since 120 is divisible by 5). So we will be done if we can cover the remaining
8 × 128 rectangular grid. In fact, since covering the 8 × 120 grid with 192 tiles of size 5 × 1 is equally
easy, we may reduce the problem of covering the 128 × 128 grid to the one of covering the 8 × 8 grid
(this time with 12 blocks of size 5 × 1 and one block of size 2 × 2). The tiling presented in Figure 4.1
uses the allowed blocks, which finishes part a).

FIGURE 4.1: Illustration for Problem 4.2, part a).

Part b) is more interesting. Label the grid so that the bottom left cell has label (1, 1) and the top right
one has label (128, 128). Starting from the bottom left corner, assign numbers from the set {0, 1, 2} to
all cells of the 128 × 128 square grid using the pattern presented in Figure 4.2. (For example, cells with
labels (x, y) for x and y that both give the remainder of 2 when divided by 5 will get number 2
assigned.) Observe that rows 126, 127, 128 and columns 126, 127, 128 use only part of the pattern,
namely, the part restricted to the first three rows and, respectively, the first three columns.
Let us first calculate the sum of the numbers assigned to the whole grid. It contains 25 ⋅ 25 complete
copies of our 5 × 5 pattern, 25 copies of a part of our pattern consisting of its three bottom rows,
25 copies of a part of the pattern consisting of its three leftmost columns, and one piece containing the
cells that lie in both the three bottom rows and the three leftmost columns. Counting (independently)
the sums of numbers in these four regions, we get that the total sum is equal to
$25^2 \cdot 10 + 25 \cdot 6 + 25 \cdot 6 + 6 = 6{,}556$.

Let us now notice that no matter how we place a 5 × 1 block, it will always cover numbers that sum up
to 2. Hence, regardless of how we place 3,276 such blocks, they cover numbers that sum up to
2 ⋅ 3,276 = 6,552. This is the desired invariant that leads us to the solution of this problem. It follows
that the remaining 2 × 2 block must cover numbers that sum up to 6,556 − 6,552 = 4. However, given the
pattern in Figure 4.2 that we used, this is only possible if the block lies in the top right corner of the
pattern. Given the way we used the pattern to cover the 128 × 128 square grid, none of these positions
lies on the border of the square grid. This proves part b) of the problem.
In fact, we not only solved part b) of the problem but proved something stronger. Namely, there are
only $25^2 = 625$ possible places where the 2 × 2 block can be placed. Moreover, by adjusting the process
described in part a) of this problem, we can easily see that each of those $25^2$ locations is possible.
FIGURE 4.2: Illustration for Problem 4.2, part b).
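
The counting in part b) can also be checked mechanically. The sketch below rests on an assumption, since Figure 4.2 is not reproduced here: we take the number assigned to cell (x, y) to be the count of the two congruences x ≡ y (mod 5) and x + y ≡ 4 (mod 5) that the cell satisfies. This choice agrees with the sums 10, 6, 6, 6 quoted above and gives the number 2 to cells whose coordinates both leave remainder 2. The function names are ours.

```python
N = 128  # side of the square grid


def label(x, y):
    """Number assigned to cell (x, y), counted from the bottom left cell (1, 1).

    Assumption: the pattern is the sum of two diagonals of the 5 x 5 pattern,
    x = y (mod 5) and x + y = 4 (mod 5).
    """
    return int((x - y) % 5 == 0) + int((x + y) % 5 == 4)


def grid_total():
    """Sum of the labels over the whole N x N grid."""
    return sum(label(x, y) for x in range(1, N + 1) for y in range(1, N + 1))


def strip_sums():
    """Set of label sums over all horizontal and vertical 5 x 1 placements."""
    sums = set()
    for a in range(1, N - 3):  # first cell of the strip ranges over 1..N-4
        for b in range(1, N + 1):
            sums.add(sum(label(a + t, b) for t in range(5)))  # horizontal strip
            sums.add(sum(label(b, a + t) for t in range(5)))  # vertical strip
    return sums


def square_positions(target=4):
    """Bottom left corners (x, y) of the 2 x 2 blocks whose labels sum to target."""
    return [
        (x, y)
        for x in range(1, N)
        for y in range(1, N)
        if label(x, y) + label(x + 1, y) + label(x, y + 1) + label(x + 1, y + 1)
        == target
    ]


def touches_border(x, y):
    """True if the 2 x 2 block with bottom left corner (x, y) touches the border."""
    return x == 1 or y == 1 or x + 1 == N or y + 1 == N
```

Under this assumption, `strip_sums()` contains the single value 2, which is exactly the invariant used in the proof, and `square_positions()` lists the admissible places for the 2 × 2 block.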

REMARKS

The solution presented above is nice and easy to follow. However, it is not clear how to attack similar
problems in the future. Hence, a natural question is how to guess the pattern presented in Figure 4.2.
There are several possible methods of deriving such patterns but all of them aim to propose a setup that
is repeating in terms of one of the blocks; in our problem, it is the 5 × 1 block. A natural starting point is
the straightforward pattern presented in Figure 4.3, repeated as described in the solution. Clearly,
in this pattern a 5 × 1 block covers exactly one 1 and four 0’s.
Repeating the reasoning presented in the previous solution, we get that the number of 1’s in the
whole square grid is $25^2 \cdot 5 + 25 \cdot 3 + 25 \cdot 3 + 3 = 3{,}278$. On the other hand, the 5 × 1 blocks cover
3,276 squares with 1’s, which means that the 2 × 2 block must cover exactly two 1’s. It follows that if
this block lies on the border, then there exists some i ∈ {0, …, 25} such that it lies:

1. on the two bottom rows and columns 1 + 5i and 2 + 5i; or

2. on the two leftmost columns and rows 1 + 5i and 2 + 5i; or

3. on the two top rows and columns 2 + 5i and 3 + 5i; or

4. on the two rightmost columns and rows 2 + 5i and 3 + 5i.

We observe now that one can repeat the argument when 1’s form the diagonal from the top left cell to
the bottom right one (instead of from the bottom left to the top right). In particular, the conclusion is
that if the 2 × 2 block lies on the two bottom rows, it must lie on columns 2 + 5i and 3 + 5i for some
i ∈ {0, …, 25}. This is incompatible with the condition obtained from the first diagonal (columns
1 + 5i and 2 + 5i). Hence, there is no solution with the 2 × 2 block touching the bottom border of the
square grid. The solutions for the other three borders can be eliminated the same way.
The solution presented earlier merges the two arguments by simply introducing the pattern obtained
from the two diagonals.
FIGURE 4.3: Illustration for Problem 4.2, part b)—starting pattern.

EXERCISES

4.2.1. Consider a square grid of size 25 × 25 that has a smaller square grid of size 5 × 5 cut out from
its bottom left corner. Can you cover the remaining cells with 100 blocks of size 1 × 6 or 2 × 3?
(Source of the problem: Polish Junior Mathematical Olympics X – Phase 1 – Problem 6. Solution: our
own.)

4.2.2. Prove that it is impossible to cover a square grid of size 9 × 9 with tiles of size 1 × 5 or 1 × 6.
(Source of the problem: Letters of Polish Junior Mathematical Olympics, September 2014. Solution:
our own.)

4.2.3. Can you cover a square grid of size 10 × 10 with 25 “T-shaped” blocks consisting of 4 small
squares?
(Source of the problem and solution idea: Letters of Polish Junior Mathematical Olympics, September
2014.)

4.3 Counting
SOURCE

Problem and solution: PLMO LXIX – Phase 2 – Problem 5

PROBLEM
There are various clubs in a class consisting of 23 students. Each club has exactly 5 members.
Moreover, any two different clubs have at most 3 members in common. Prove that there are fewer than
2,018 clubs in the class.

THEORY

Permutations Let S be a set of n elements. We are interested in investigating various ways in which
objects from S may be selected, without replacement, to form a sequence of n elements. Each of these
possible sequences is called a permutation. Formally, a permutation π is a bijection π : [n] → S; π(i) is
the element that was selected at round i. (Recall that a bijection is a function between the elements of
two sets, say A and B, where each element of A is paired with exactly one element of B, and each
element of B is paired with exactly one element of A.)
The number of permutations of an n-element set (that is, the number of ways one can order n
elements) is equal to

$$n! = \prod_{i=1}^{n} i = 1 \cdot 2 \cdot \ldots \cdot n. \tag{4.1}$$

One can easily prove this formula by induction. Alternatively, note that there are n ways to select the
first object from S. Since the selection is done without replacement, there are n − 1 objects left to
select from; we select any of them and continue until all elements are picked. The total number of ways
is then n ⋅ (n − 1) ⋅ … ⋅ 1 = n! and the formula (4.1) is verified.
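
For small sets the formula can be confirmed by brute force. The following minimal sketch lists every ordering with the standard library and counts them; the function name `count_permutations` is ours.

```python
from itertools import permutations
from math import factorial


def count_permutations(elements):
    """Count the orderings of `elements` by listing every one of them."""
    return sum(1 for _ in permutations(elements))
```

For example, `count_permutations("abcd")` lists the 4! orderings of four letters, and the result can be compared with `factorial(4)`.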

Combinations This time k objects are selected from a set S of n elements to produce subsets without
ordering (that is, unlike permutations, the order of selection does not matter). More formally, a
k-combination of S is a subset of k distinct elements of S.
The number of k-combinations of an n-element set (provided that 1 ≤ k ≤ n) is equal to the
binomial coefficient

$$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}. \tag{4.2}$$

One can prove this formula in many ways; we provide a direct counting argument that is similar to the
solution to our problem above. Select k elements, one by one, without replacement; there are
n(n − 1)⋯(n − k + 1) ways to do it. Clearly, each subset of k elements of S can be obtained in k!
different ways (we know which elements are selected and we are happy with any permutation of them)
so we are over-counting by a factor of k!. Dividing by k! shows that the formula (4.2) holds.
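
The over-counting argument can be illustrated numerically. In the sketch below (function names are ours) we count ordered selections and unordered subsets separately and compare them.

```python
from itertools import combinations, permutations
from math import comb, factorial


def ordered_selections(n, k):
    """Number of ways to pick k of n elements one by one, without replacement."""
    return sum(1 for _ in permutations(range(n), k))


def unordered_subsets(n, k):
    """Number of k-element subsets of an n-element set, found by enumeration."""
    return sum(1 for _ in combinations(range(n), k))
```

Each subset is produced by exactly `factorial(k)` ordered selections, so `ordered_selections(n, k)` should equal `unordered_subsets(n, k) * factorial(k)`, and `unordered_subsets(n, k)` should agree with `comb(n, k)`.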

Double Counting Let us finish with a very useful combinatorial proof technique, double counting, which
shows that two expressions are equal by demonstrating that they are simply two ways of counting the
same thing. For example, note that

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n.$$

Indeed, on the left hand side we independently count the k-element subsets of an n-element set for each k,
while the right hand side counts all subsets. Alternatively, one can use the binomial theorem (see
Section 1.6) to get that

$$2^n = (1 + 1)^n = \sum_{k=0}^{n} \binom{n}{k} \cdot 1^k \cdot 1^{n-k} = \sum_{k=0}^{n} \binom{n}{k}.$$

As another example, note that

$$\binom{n}{k} = \binom{n-2}{k} + 2\binom{n-2}{k-1} + \binom{n-2}{k-2}.$$

In order to see it, let us color n − 2 elements of an n-element set S red and the remaining 2 elements
blue (arbitrarily). We observe that the left hand side counts k-element subsets of S. On the other hand,
the right hand side independently counts k-element subsets with a given number of blue elements (that
is, 0, 1, and 2, respectively).
Finally, let us show that

$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$$

Since $\binom{n}{k} = \binom{n}{n-k}$, it is enough to show that

$$\sum_{k=0}^{n} \binom{n}{k} \cdot \binom{n}{n-k} = \binom{2n}{n}.$$

But this equality is obvious. The right hand side counts the number of n-element subsets of the set [2n].
The left hand side counts the same thing, where, for 0 ≤ k ≤ n, the term $\binom{n}{k} \cdot \binom{n}{n-k}$ counts the
number of subsets in which k elements are chosen from the set [n] and n − k elements are chosen from
the set [2n] ∖ [n].
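
The three identities of this subsection are easy to test for small parameters. The sketch below evaluates each side with `math.comb`; the function names are ours, and the second identity is only evaluated for 2 ≤ k ≤ n so that every lower index is non-negative.

```python
from math import comb


def sum_of_binomials(n):
    """Left hand side of: sum over k of C(n, k) = 2^n."""
    return sum(comb(n, k) for k in range(n + 1))


def split_by_blue(n, k):
    """Right hand side of the red/blue identity, valid for 2 <= k <= n."""
    # comb returns 0 when the lower index exceeds the upper one.
    return comb(n - 2, k) + 2 * comb(n - 2, k - 1) + comb(n - 2, k - 2)


def sum_of_squares(n):
    """Left hand side of: sum over k of C(n, k)^2 = C(2n, n)."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))
```

Such a check is of course not a proof, but it is a quick way to catch a misremembered identity before trying to prove it.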

SOLUTION

Let C be the set of students, so that |C| = 23. Clubs can be represented by a family of subsets $A_i$ of C
of size 5 (i ∈ [k], where k is the number of clubs). Since no two clubs have more than 3 members in
common, each subset B of C of size 4 is contained in at most one $A_i$. We can then label each B ⊆ C of
size 4 with i if it is contained in a (unique) $A_i$, and assign it label 0 otherwise; that is, when the students
from B are not all members of the same club. On the other hand, each $A_i$ clearly has 5 distinct four-element
subsets, so exactly 5 sets B of size 4 have label i assigned to them, and 5k sets have a non-zero label in total.
Since the number of sets B of size 4 with non-zero label is at most $\binom{23}{4} = 8{,}855$, the total number of
subsets of size 4, we get that k, the number of clubs, is at most

$$\binom{23}{4} / 5 = 1{,}771 < 2{,}018.$$
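
The arithmetic of the bound can be reproduced in a couple of lines. The helper below is a sketch of the counting argument for general parameters; the name `club_bound` and the generalization beyond the values 23 and 5 are ours.

```python
from math import comb


def club_bound(students, club_size):
    """Upper bound on the number of clubs when any two clubs share at most
    club_size - 2 members.

    Each (club_size - 1)-subset lies in at most one club, and every club
    contains exactly club_size such subsets.
    """
    return comb(students, club_size - 1) // club_size
```

With the values from the problem, `club_bound(23, 5)` reproduces the bound obtained above.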

REMARKS

The problem considered in this section is an example of a typical situation when the solution can be
obtained by a careful and appropriate counting technique. In our problem we first see that there are
$\binom{23}{5} = 33{,}649$ sets of size 5. However, clearly not all of them can form a club, as this would violate the
property that clubs cannot share many members. In order to reduce the number of possible clubs, we
observe that it is enough to know only 4 members of a club to uniquely identify it. This observation
leads to the solution.

EXERCISES

4.3.1. The class consists of 12 people. Count in how many ways one can divide them into: 6 pairs, 4
triples, 3 quadruples, and 2 six-tuples. Which option yields the largest number of possibilities?

4.3.2. Consider an n × n square grid on which we want to place k ≤ n chess rooks in such a way that
none of them attack another rook. Count the number of ways one can do it.

4.3.3. Create all possible 4-digit numbers using digits from set [9] = {1, 2, 3, 4, 5, 6, 7, 8, 9} . Find the
sum of those numbers.

4.3.4. Alice has 20 balls, all different. She first splits them into two piles and then she picks one of the
piles with at least two balls, and splits it into two. She repeats this until each pile has only one ball.
Find the number of ways in which she can carry out this procedure.
(Source of the problem and solution: Problem 1.8.27 from Discrete Mathematics by Lovász, Pelikán,
and Vesztergombi.)

4.4 Extremal Graph Theory


SOURCE

Problem: LXVIII OM – Phase 1 – Problem 6 (modified)


Solution: our own

PROBLEM

20 boys and 20 girls attended a high-school prom. During this event, there were 98 dances. In each
dance, one boy danced with one girl, and no pair danced more than once. Prove that there were two
boys (say, $b_1, b_2$) and two girls (say, $g_1, g_2$) such that they all danced with one another (that is, $b_1$ and $b_2$
both danced with $g_1$ and $g_2$).

THEORY

In this section we are interested in basic extremal graph theory, which studies extremal (maximal or
minimal) graphs that satisfy a certain property. Extremality can be taken with respect to different
graph invariants, such as the number of vertices, the number of edges, or the length of a longest path.
Extremal graph theory officially began with Turán’s theorem that we already stated (and proved) in
Section 4.1.
The problem we deal with in this section is closely related to $\mathrm{ex}(n; C_4)$ defined next. The connection
will be explained later.

Turán Number Given a class of graphs $\mathcal{F} = \{F_1, F_2, \ldots\}$, let us call a graph $\mathcal{F}$-free if it contains no
copy of F as a subgraph for each $F \in \mathcal{F}$. Let the Turán number, denoted $\mathrm{ex}(n; \mathcal{F})$, be the maximal
number of edges in an $\mathcal{F}$-free graph on n vertices. If the class of graphs $\mathcal{F}$ consists of a single graph F,
then we write $\mathrm{ex}(n; F)$ instead of $\mathrm{ex}(n; \{F\})$.
In Section 4.1 we considered T(n, r), the Turán graph; that is, the complete equi-partite graph
$K_{n_1, n_2, \ldots, n_r}$, where $\sum_i n_i = n$ and $\lfloor n/r \rfloor \le n_i \le \lceil n/r \rceil$. By Turán’s theorem we have
$\mathrm{ex}(n; K_{r+1}) = e(T(n, r))$. Furthermore, T(n, r) is the unique $K_{r+1}$-free graph that attains the
extremal number. In fact, the case $\mathrm{ex}(n; K_3) = \lfloor n^2/4 \rfloor$ was shown earlier by Mantel.
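
The number of edges of the Turán graph is easy to compute from the part sizes. The following sketch (function names are ours) builds the sizes $n_1, \ldots, n_r$ and counts the pairs of vertices that lie in different parts.

```python
def turan_part_sizes(n, r):
    """Sizes of the r parts of T(n, r); any two of them differ by at most 1."""
    q, rem = divmod(n, r)
    return [q + 1] * rem + [q] * (r - rem)


def turan_edges(n, r):
    """Number of edges of the complete multipartite graph T(n, r)."""
    sizes = turan_part_sizes(n, r)
    # Two vertices are adjacent exactly when they lie in different parts, so
    # we remove the ordered pairs inside a part from all n * n ordered pairs.
    return (n * n - sum(s * s for s in sizes)) // 2
```

For r = 2 the value `turan_edges(n, 2)` can be compared with Mantel's bound, and for general r with the estimate $(1 - 1/r)\,n^2/2$ from Section 4.1.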

In order to show that the bound on the number of dances is, in some sense, best possible we need to
introduce a family of graphs obtained from the projective planes. We define them now and we will
explain the connection soon.

Projective Planes Given a set P of points and a set L of lines, we define the corresponding incidence
graph G(P, L) to be the bipartite graph whose vertices consist of the points (one partite set), and lines
(the second partite set), with point p ∈ P adjacent to line ℓ ∈ L if p lies on ℓ.
A projective plane consists of a set of points and lines satisfying the following axioms.

1. There is exactly one line incident with every pair of distinct points.
2. There is exactly one point incident with every pair of distinct lines.

3. There are four points such that no line is incident with more than two of them.

Finite projective planes possess $q^2 + q + 1$ points for some q ∈ N (called the order of the plane) and
the same number of lines. Projective planes of order q exist for all prime powers q, and an unsettled
conjecture claims that q must be a prime power for such planes to exist.
It follows immediately from the axioms that the corresponding incidence graph does not contain $C_4$,
a cycle of length 4 (nor, of course, any odd cycle, as the graph is bipartite). It is also possible to show that
this graph is (q + 1)-regular.
See Figure 4.4 for G(P, L), where (P, L) is the Fano plane (that is, the projective plane of order 2).
We note that the incidence graph of the Fano plane is isomorphic to the well-known Heawood graph.

FIGURE 4.4: The Fano plane and its incidence graph.
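
The properties claimed for the incidence graph can be verified directly for the Fano plane. The sketch below uses one standard labelling of its seven lines by the points 1, …, 7; this particular labelling, like the function names, is our choice and need not match the one drawn in Figure 4.4.

```python
from itertools import combinations

POINTS = range(1, 8)
# One standard labelling of the seven lines of the Fano plane (an assumption).
FANO_LINES = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
    {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
]


def lines_through(p, q):
    """All lines that contain both points p and q."""
    return [line for line in FANO_LINES if p in line and q in line]


def incidence_edges():
    """Edges (point, line index) of the incidence graph G(P, L)."""
    return [(p, i) for i, line in enumerate(FANO_LINES) for p in line]


def has_four_cycle():
    """In a point-line incidence graph, a 4-cycle consists of two points
    that lie together on two different lines."""
    return any(len(lines_through(p, q)) >= 2 for p, q in combinations(POINTS, 2))
```

Here q = 2, so the graph has $2(q^2 + q + 1) = 14$ vertices, every vertex should have degree q + 1 = 3, and `has_four_cycle()` should return `False`.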

SOLUTION

For a contradiction, suppose that there were no two boys and two girls that all danced with one another. For
any i, j ∈ [20], let $x_{i,j} = 1$ if the i’th boy danced with the j’th girl; otherwise, $x_{i,j} = 0$. Since there were 98
dances and no pair danced more than once, we have that

$$\sum_{i=1}^{20} \sum_{j=1}^{20} x_{i,j} = 98.$$

Note that $g_i = \sum_{j=1}^{20} x_{i,j}$ is the number of girls that danced with the i’th boy; similarly, $b_j = \sum_{i=1}^{20} x_{i,j}$
is the number of boys that danced with the j’th girl.


Let us fix j ∈ [20] and concentrate on the j’th girl. Our goal is to estimate f(j), the number of other
girls who danced with boys who danced with the j’th girl. Clearly,

$$f(j) \le \sum_{i=1}^{20} x_{i,j} \cdot (g_i - 1) = \left(\sum_{i=1}^{20} x_{i,j} \cdot g_i\right) - b_j. \tag{4.3}$$

Indeed, one can simply consider all boys she danced with (that is, those for which $x_{i,j} = 1$); each of
them danced with $g_i - 1$ girls other than the j’th girl. In fact, the right hand side of (4.3) is not only an
upper bound for f(j) but equality holds: since no two girls danced with the same two
boys, all of these girls must be distinct. Finally, since there are 19 girls other than the j’th girl, we get
that f(j) ≤ 19, or equivalently that

$$\sum_{i=1}^{20} x_{i,j} \cdot g_i \le 19 + b_j.$$

It follows that

$$\sum_{j=1}^{20} \sum_{i=1}^{20} x_{i,j} \cdot g_i \le \sum_{j=1}^{20} (19 + b_j) = 20 \cdot 19 + \sum_{j=1}^{20} b_j = 380 + 98 = 478. \tag{4.4}$$

On the other hand,

$$\begin{aligned}
\sum_{j=1}^{20} \sum_{i=1}^{20} x_{i,j} \cdot g_i &= \sum_{i=1}^{20} \sum_{j=1}^{20} x_{i,j} \cdot g_i = \sum_{i=1}^{20} g_i \sum_{j=1}^{20} x_{i,j} \\
&= \sum_{i=1}^{20} g_i^2 = 20 \cdot \frac{1}{20} \sum_{i=1}^{20} g_i^2 \ge 20 \left(\frac{1}{20} \sum_{i=1}^{20} g_i\right)^2 \\
&= 20 \left(\frac{98}{20}\right)^2 = \frac{2{,}401}{5} = 480.2 > 478,
\end{aligned} \tag{4.5}$$

where the first inequality follows from the fact that the function $f(x) = x^2$ is convex (see Section 1.1
for more details). Inequalities (4.4) and (4.5) give us the desired contradiction.

REMARKS

In order to see the bigger picture, we will provide an alternative solution. Let us first reformulate the
problem in the language of graph theory. Let B and G be the set of boys and the set of girls,
respectively. Dances can be represented as a bipartite graph G = (B ∪ G, E) where bg ∈ E if and only if
boy b danced with girl g. We know that n = |B| = |G| = 20 and m = |E| = 98. Our goal is to show
that G contains $C_4$, a cycle of length 4.
For a contradiction, suppose that G does not contain $C_4$. Let F be the family of triples $(b, g_1, g_2)$
in which b is a boy and $g_1 \ne g_2$ are two girls that both danced with b. Clearly,

$$|F| = \sum_{b \in B} (\deg b)(\deg b - 1) = \sum_{b \in B} \deg^2 b - \sum_{b \in B} \deg b = \frac{1}{n} \left(\sum_{b \in B} \deg^2 b\right)\left(\sum_{b \in B} 1^2\right) - \sum_{b \in B} \deg b.$$

By the Cauchy-Schwarz inequality (see Section 1.7),

$$|F| \ge \frac{1}{n} \left(\sum_{b \in B} (\deg b) \cdot 1\right)^2 - \sum_{b \in B} \deg b = \frac{m^2}{n} - m,$$

as $m = \sum_{b \in B} \deg b$. On the other hand, since there is no cycle of length 4 in G, each pair of girls
$(g_1, g_2)$ is associated with at most one boy b in the family F. It follows that

$$|F| \le n(n - 1).$$

We get that

$$\frac{m^2}{n} - m - n(n - 1) \le 0 \tag{4.6}$$

and so

$$m \le \frac{1 + \sqrt{1 + 4(n - 1)}}{2/n} = \frac{n}{2}\left(1 + \sqrt{4n - 3}\right).$$

Since n = 20, the following bound must hold: m ≤ 97.75. We get the desired contradiction as m = 98.
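
The numerical step is quickly reproduced. The sketch below evaluates the bound $(n/2)(1 + \sqrt{4n - 3})$ and, separately, tests inequality (4.6) in integer arithmetic after multiplying it by n; the function names are ours.

```python
from math import sqrt


def c4_free_bipartite_bound(n):
    """Largest real m with m^2/n - m - n(n - 1) <= 0."""
    return n / 2 * (1 + sqrt(4 * n - 3))


def violates_4_6(m, n):
    """True if m^2/n - m - n(n - 1) > 0, checked without rounding errors."""
    return m * m - n * m - n * n * (n - 1) > 0
```

For n = 20 the bound evaluates to roughly 97.75, so 97 dances are compatible with (4.6) while 98 dances are not.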
In our problem, we assumed that the graph is bipartite but, in fact, one can easily adjust the argument
for general graphs. The only difference is that the sum of the degrees, now taken over all n vertices,
is equal to 2m, not m. Instead of (4.6) we get

$$\frac{4m^2}{n} - 2m - n(n - 1) \le 0$$

which implies that

$$\mathrm{ex}(n, C_4) \le \frac{2 + \sqrt{4 + 16(n - 1)}}{8/n} = \frac{n}{4}\left(1 + \sqrt{4n - 3}\right) = \left(\frac{1}{2} + o(1)\right) n^{3/2}.$$

On the other hand, the incidence graph of the projective plane of order q is an example of a dense
graph without $C_4$. Indeed, G(P, L) has

$$n = 2(q^2 + q + 1) = (2 + o(1))\,q^2$$

vertices and

$$m = (q^2 + q + 1)(q + 1) = (1 + o(1))\,q^3 = \left(\frac{1}{2^{3/2}} + o(1)\right) n^{3/2}$$

edges. This construction works when q is a prime power. But, since it is known that for every integer n
there exists a prime p satisfying n ≤ p ≤ (1 + o(1))n, the above estimation applies to all values of n. It
follows that

$$\mathrm{ex}(n, C_4) \ge \left(\frac{1}{2^{3/2}} + o(1)\right) n^{3/2}.$$

In fact, one can show that the upper bound is sharp: there is a construction that (almost) matches
this bound, and so $\mathrm{ex}(n, C_4) = (1 + o(1))\,n^{3/2}/2$.

EXERCISES

4.4.1. There is a club with 100 members where there are 1,000 pairs of friends. We want to pick a three
person team from the club with one team member selected as a team leader. The procedure is that one
club member first becomes a leader. The leader then chooses two followers from his/her friends and the
team is formed. Show that it is possible to pick a team from the club in at least 19,000 ways.

4.4.2. Consider the following combinatorial game between two players, Builder and Painter. The game
starts with the empty graph on 400 vertices. In each round, Builder presents an edge uv between two
non-adjacent vertices u and v which has to be immediately colored red or blue by Painter. Show that
Builder can force Painter to create a monochromatic (that is, either red or blue) path on 100 vertices in
400 rounds.
(Source of the problem and solution: well-known, classic problem related to on-line size Ramsey
numbers.)
4.4.3. Consider a chess club consisting of $4^t$ members for some t ∈ N; some of the members know
each other. Show that there exist t members that all know each other, or there exist t members such that
no two of them know each other.
(Source of the problem and solution: well-known, classic problem related to Ramsey numbers.)

4.5 Probabilistic Methods


SOURCE

Problem: PLMO LXX – Phase 1 – Problem 6


Solution: our own

PROBLEM

There are 100 people sitting at a round table. Each person has ordered an ice cream, either vanilla or
chocolate flavoured. In total, 51 people asked for vanilla ice cream; the remaining 49 preferred
chocolate. The correct number of each flavour was prepared and placed on the table, one ice cream
in front of each person. However, the waiter has forgotten who ordered which dessert, so it is not
guaranteed that everyone received the dessert he or she ordered. Fortunately, it is possible to rotate the
table and try to satisfy more customers. Prove that one can rotate the table such that at least 52 people
will get what they wanted.

THEORY

Union Bound Let us use the following elementary fact, also known as Boole’s inequality, that we
already introduced in Section 1.8. For any collection of events $A_1, \ldots, A_n$,

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} \mathbb{P}(A_i). \tag{4.7}$$

Note that this inequality is best possible: equality holds for disjoint events.
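
As a small illustration, the union bound can be checked on a finite probability space. The sketch below uses one roll of a fair die with events given as sets of outcomes; the sample space and the function names are our choices.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # one roll of a fair die


def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(set(event) & set(OUTCOMES)), len(OUTCOMES))


def union_bound_holds(events):
    """Check that the probability of the union is at most the sum of the
    probabilities of the individual events."""
    union = set().union(*events)
    return prob(union) <= sum(prob(a) for a in events)
```

For overlapping events such as {1, 2, 3}, {3, 4}, {2, 4, 6} the inequality is strict, whereas for pairwise disjoint events both sides coincide.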

Bonferroni Inequalities Moreover, let us mention that (4.7) may be generalized to find stronger upper
and lower bounds. These bounds are known as Bonferroni inequalities. In particular,

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \ge \sum_{1 \le i \le n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i \cap A_j). \tag{4.8}$$

In general, for any j ∈ [n] we define

$$S_j := \sum_{1 \le i_1 < \ldots < i_j \le n} \mathbb{P}(A_{i_1} \cap \ldots \cap A_{i_j}).$$

Then, for odd k ∈ [n],

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{j=1}^{k} (-1)^{j-1} S_j,$$

and for even k ∈ [n],

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \ge \sum_{j=1}^{k} (-1)^{j-1} S_j.$$

Boole’s inequality is recovered by setting k = 1. When k = n, then equality holds and the resulting
identity is in fact equivalent to the well-known inclusion–exclusion principle:

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j-1} S_j.$$

Linearity of Expectation Consider a finite probability space and a (real) random variable X that takes
values from the set χ. The expectation of X is defined as

$$\mathbb{E}[X] := \sum_{x \in \chi} x \cdot \mathbb{P}(X = x).$$

For example, roll a fair die once and define X to be the number rolled. Clearly, χ = [6] and

$$\mathbb{E}[X] = \sum_{x \in \chi} x \cdot \mathbb{P}(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{7}{2}.$$

An important and very useful property of the expectation is that it is a linear operator; that is, for any
sequence of random variables $X_1, \ldots, X_n$ and any sequence of constants $c_1, \ldots, c_n \in \mathbb{R}$,

$$\mathbb{E}\left[\sum_{i=1}^{n} c_i X_i\right] = \sum_{i=1}^{n} c_i\, \mathbb{E}[X_i].$$

For example, if you roll two fair dice, the expected sum is equal to 2 ⋅ (7/2) = 7.
In order to show how the expected value can be calculated for some more complex random variables,
we consider the following experiment (that we believe is natural and interesting on its own). Assume
you are given a 24-card deck (that is, the deck consisting of 9-10-J-Q-K-A for each of the four suits).
Let us first consider the following experiment. Draw one card from this deck at random. If it is an
Ace, then you finish the game; otherwise, you need to put the card back into the deck and restart the
experiment. Our goal is to find the expected number of draws till you draw an Ace. It is clear that the
process finishes after i rounds with probability $p_i := \left(\frac{5}{6}\right)^{i-1} \frac{1}{6}$. (In fact, the number of rounds is a
random variable following the geometric distribution with parameter p = 1/6.) It follows that the
expected number of draws is equal to

$$\sum_{i=1}^{\infty} i \cdot p_i = \sum_{i=1}^{\infty} i \cdot \left(\frac{5}{6}\right)^{i-1} \frac{1}{6} = \frac{1}{6} \sum_{i=0}^{\infty} \sum_{j=i}^{\infty} \left(\frac{5}{6}\right)^{j} = \sum_{i=0}^{\infty} \left(\frac{5}{6}\right)^{i} = 6, \tag{4.9}$$

where the formula for a sum of a geometric series was used twice. However, assuming that the expected
value exists and is equal to E, we can alternatively compute it in a simpler way. Observe that we either
draw an Ace in the first round (which happens with probability 1/6) or ‘lose’ one draw and then repeat
the identical process (this happens with probability 5/6). It follows that

$$E = \frac{1}{6} + \frac{5}{6}(1 + E),$$

which immediately implies that E = 6. Let us stress that in order for this reasoning to be correct, we
had to assume that the expected value of the random variable we were interested in exists (that is, it is
finite).
Let us now change the setting slightly, and assume that cards are drawn at random without
replacement. For this variant, it is obvious that the expected value exists, as it must be less than 21 (we
have 24 cards and 4 aces, so in the worst case the process finishes at the end of round 21). In order to
find the expectation, one could write down the sum over all possible values for the length of the process
(as we did above) but in this case it would be even more cumbersome. Fortunately, it is much simpler to
use the other approach, which is justified as we are guaranteed that the expected value exists. Let us
denote by $E_n$ the expected number of draws from an n-card deck that contains 4 aces till we hit an ace.
Clearly, $E_4 = 1$. Now, arguing as before we get that

$$E_n = \frac{4}{n} + \frac{n - 4}{n}(1 + E_{n-1}) = 1 + \frac{n - 4}{n} E_{n-1}.$$

It is clear that $E_n = (n + 1)/5$ satisfies this recursion, and so for a 24-card deck the expected value is
(24 + 1)/5 = 5. Finally, let us observe that the waiting time without replacement is smaller than with
replacement. This is what one should intuitively expect as each unsuccessful draw in the variant
without replacement increases the probability that we finish in the next round whereas in the other
variant it remains the same.
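
Both expectations can be recomputed with a few lines of code. In the sketch below the first function sums a long prefix of the series for the geometric distribution, and the second one iterates the recursion for $E_n$ in exact rational arithmetic. The function names, the truncation length, and the extra parameter `aces` are ours.

```python
from fractions import Fraction


def expected_draws_with_replacement(p, terms=2000):
    """Partial sum of the series sum_{i >= 1} i * (1 - p)^(i - 1) * p."""
    return sum(i * (1 - p) ** (i - 1) * p for i in range(1, terms + 1))


def expected_draws_without_replacement(n, aces=4):
    """E_n from the recursion E_n = 1 + (n - aces)/n * E_{n-1}, with E_aces = 1."""
    e = Fraction(1)
    for size in range(aces + 1, n + 1):
        e = 1 + Fraction(size - aces, size) * e
    return e
```

With p = 1/6 the partial sum is numerically indistinguishable from 6, and `expected_draws_without_replacement(24)` returns exactly 5, in agreement with the formula $E_n = (n + 1)/5$.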

Probabilistic Method The probabilistic method is an example of a non-constructive argument for
proving the existence of a prescribed kind of mathematical object (see Section 4.1 for more on
non-constructive arguments). It works by showing that if one randomly chooses objects, the probability that
the result is of the prescribed kind is strictly greater than zero. The conclusion is that the desired object
exists; indeed, the probability would have to be equal to 0 otherwise. Similarly, showing that the
probability is (strictly) less than 1 can be used to prove the existence of an object that does not satisfy
the prescribed properties. Let us stress the fact that, although the proof uses probability, the final
conclusion is deterministic, without any possible error.
Here is an example of a classical result. Suppose that m basketball teams compete in a tournament
and any two teams play each other exactly once. The organizers would like to select n teams and give
them prizes at the end of the tournament. It is clearly a challenging task to select the best teams and it
would certainly be embarrassing if the organizers ended up selecting n teams while another team,
without a prize, had beaten all n teams that won a prize. It seems that the organizers should be safe.
It feels unlikely that every selection of n teams will have some unselected team better than all of them.
Perhaps surprisingly, our intuition is wrong: it is quite possible that this will be the case, at least if m is
large enough. Constructing such a tournament explicitly (deterministically) is not easy. On the other
hand, one can easily use the probabilistic method to show that such a tournament exists.
Indeed, for any fixed m, the results of all $\binom{m}{2}$ games are chosen randomly (and uniformly and
independently). Now, for a given set A of n teams, the probability that a given team outside A beats
all teams in A is $(1/2)^n$. Hence, the probability that there is no team better than all teams in A is equal
to $(1 - 1/2^n)^{m-n}$. The same formula holds for another set of n teams, say set B. Clearly, there are
some correlations between the corresponding events; for example, the fact that there is a team that beats
all teams in A increases the probability that there is a team that beats all teams in B, provided that
A ∩ B ≠ ∅. However, by the union bound (see (4.7)), the probability that there is at least one set of n
teams for which there is no better team is at most

$$q(n, m) := \binom{m}{n}\left(1 - \frac{1}{2^n}\right)^{m-n}.$$

If q(n, m) < 1, then we are guaranteed that there exists a tournament with m teams such that no n
teams can be awarded without another team beating all of them.
Clearly, for any fixed n ∈ N, one can find m = m(n) large enough such that q(n, m) < 1, as $\binom{m}{n}$
grows polynomially and $(1 - 2^{-n})^{m-n}$ decreases exponentially as a function of m. In particular,
q(3, 91) < 1 and q(10, 102653) < 1. Note also that for any natural numbers m ≥ n, $\binom{m}{n} \le (em/n)^n$
and for any x ∈ R, 1 + x ≤ exp(x). It follows that

$$\begin{aligned}
q(n, m) &\le \left(\frac{em}{n}\right)^n \exp\left(-\frac{m - n}{2^n}\right) = \exp\left(n + n \ln(m/n) - \frac{m - n}{2^n}\right) \\
&= \exp\left(n + n(n \ln 2 + \ln n) - n^2 + \frac{n}{2^n}\right) < 1
\end{aligned}$$

provided $m = n^2 2^n$ and n ≥ 12.
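
The quantity q(n, m) involves very large and very small factors, so it is convenient to evaluate its logarithm. The sketch below does this with the standard library; the function name `log_q` is ours, and q(n, m) < 1 corresponds to a negative return value.

```python
from math import comb, log, log1p


def log_q(n, m):
    """Natural logarithm of q(n, m) = C(m, n) * (1 - 2^(-n))^(m - n)."""
    # log accepts arbitrarily large integers; log1p keeps precision for
    # arguments close to zero.
    return log(comb(m, n)) + (m - n) * log1p(-(2.0 ** -n))
```

For instance, `log_q(3, 91)` is negative while `log_q(3, 90)` is positive, so for n = 3 the union bound argument starts to work at m = 91. The general choice $m = n^2 2^n$ can be examined in the same way, for example with `log_q(12, 12**2 * 2**12)`.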
Let us now switch gears and discuss another elementary probabilistic method. It is obvious that one
can use the expectation of a random variable X to estimate the minimum and maximum value X can take.
In other words, we are guaranteed that there exist $x_1, x_2 \in \chi$ such that

$$x_1 \le \mathbb{E}[X] \quad \text{and} \quad x_2 \ge \mathbb{E}[X].$$

Indeed, it follows immediately from the fact that

$$\mathbb{E}[X] = \sum_{x \in \chi} x \cdot \mathbb{P}(X = x) \le \sum_{x \in \chi} x_{\max} \cdot \mathbb{P}(X = x) = x_{\max} \sum_{x \in \chi} \mathbb{P}(X = x) = x_{\max},$$

where $x_{\max}$ is the maximum value in χ. Similarly, $\mathbb{E}[X] \ge x_{\min}$, where $x_{\min}$ is the minimum value in χ.
Surprisingly, this naive method can be used to prove many non-trivial statements.
To illustrate the method, consider any n finite sets $A_1, \ldots, A_n$. Then one can pick some of them such
that at least half of the underlying elements are repeated an odd number of times. In order to see this,
let us pick each $A_i$ with probability 1/2, independently for all i. Note that for any $a \in A := \bigcup_{i \in [n]} A_i$,
the probability that a is repeated an odd number of times is equal to 1/2. Indeed, for each set $A_i$ that a
belongs to, we toss a fair coin to decide if $A_i$ is picked or not. Regardless of the current state of the
process, the last $A_i$ that we need to consider causes a to be repeated an odd number of times with
probability 1/2. It follows by linearity of expectation that the expected number of elements that are
repeated an odd number of times is equal to |A|/2. By the probabilistic method, we are guaranteed that
it is possible to pick some sets so that the number of elements that are repeated an odd number of times
is at least |A|/2.
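
For a small family of sets the claim can be checked exhaustively. The sketch below goes through all subfamilies, which plays the role of the random choice with all outcomes equally likely; the function names are ours.

```python
from itertools import combinations


def odd_count(chosen):
    """Number of elements that belong to an odd number of the chosen sets."""
    counts = {}
    for s in chosen:
        for a in s:
            counts[a] = counts.get(a, 0) + 1
    return sum(1 for c in counts.values() if c % 2 == 1)


def all_selections(sets):
    """Every subfamily of `sets`, including the empty one."""
    for r in range(len(sets) + 1):
        yield from combinations(sets, r)


def best_selection(sets):
    """A subfamily maximizing the number of elements repeated an odd number of times."""
    return max(all_selections(sets), key=odd_count)
```

Averaging `odd_count` over all subfamilies gives exactly |A|/2, so the subfamily returned by `best_selection` must reach at least that value.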
Here is another example, this time from graph theory. We will show that in every graph G = (V, E),
one can partition V, the vertex set, into $V_1$ and $V_2$ such that the number of edges with one endpoint in
$V_1$ and another in $V_2$ is at least |E|/2. Indeed, construct a random set $V_1 \subseteq V$ by putting each vertex of
V in $V_1$ independently, with probability 1/2. Let $V_2 := V \setminus V_1$. For a given edge e ∈ E, let $X_e$ denote
the indicator random variable that e has exactly one endpoint in $V_1$; that is, $X_e = 1$ if e has the desired
property and $X_e = 0$ otherwise. Clearly,

$$
\mathbb{E}[X_e] = 1 \cdot \mathbb{P}(X_e = 1) + 0 \cdot \mathbb{P}(X_e = 0) = \mathbb{P}(X_e = 1) = \frac{1}{2}.
$$

Note that G has $X = \sum_{e \in E} X_e$ edges with the desired property. By linearity of expectation,

$$
\mathbb{E}[X] = \mathbb{E}\Big[\sum_{e \in E} X_e\Big] = \sum_{e \in E} \mathbb{E}[X_e] = \frac{|E|}{2},
$$

and so the result holds by the probabilistic method.
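To see the averaging argument at work, one can enumerate all $2^{|V|}$ partitions of a small graph and compare the average and the maximum number of crossing edges with |E|/2. The sketch below does exactly that; the graph `edges` is a hypothetical example and the function names are our own.

```python
from itertools import product


def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides of the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])


def cut_statistics(n, edges):
    """Average and maximum cut size over all 2^n assignments of n vertices."""
    sizes = [cut_size(edges, side) for side in product((0, 1), repeat=n)]
    return sum(sizes) / len(sizes), max(sizes)


# Hypothetical graph on vertices 0..4: a 5-cycle with two chords.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2), (1, 3)]
average, best = cut_statistics(5, edges)
print(average, best, len(edges) / 2)
```

The average equals |E|/2 exactly, because every edge is cut in precisely half of the assignments, and so the best partition cannot be worse than that.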


In fact, it is possible to improve this result slightly by considering a random set $V_1$ consisting of
precisely ⌊n/2⌋ vertices, where n = |V|. The number of edges between $V_1$ and $V_2$ that we are guaranteed
to have is at least

$$
\frac{|E|}{2} \cdot \frac{|V|}{|V| - 1} > \frac{|E|}{2}
$$

if |V| is even, and at least

$$
\frac{|E|}{2} \cdot \frac{|V| + 1}{|V|} > \frac{|E|}{2}
$$

if |V| is odd.

SOLUTION

Observe that we have 100 possible rotations of the table (including a trivial one; that is, without
rotating at all). Let us number all possible configurations using numbers from 1 to 100; to be precise,
configuration i ∈ [100] is obtained by rotating the table clockwise by i places.
Consider a given configuration i, and let $x_i$ be the number of people who wanted to get chocolate ice
cream but got a vanilla one. Since 49 people asked for chocolate ice cream, $49 - x_i$ people wanted
chocolate ice cream and got what they wanted. Moreover, since 51 people got vanilla ice cream,
$51 - x_i$ people wanted vanilla ice cream and got what they wanted. Therefore, $100 - 2x_i$ people got
what they asked for. It remains to show that there exists i ∈ [100] such that $x_i \le 24$, as this guarantees
that $100 - 2x_i \ge 52$.

We are going to use the double counting argument discussed in Section 4.3. We rotate the table,
investigating all 100 configurations and counting how many people in total wanted chocolate ice cream
but got a vanilla one. On the one hand, this is clearly equal to $\sum_{i=1}^{100} x_i$. Now we will count the same thing
but this time from the perspective of any person out of the 49 people who wanted chocolate ice cream.
While the table was rotating, this person saw precisely 51 vanilla ice creams. It follows that

$$
\sum_{i=1}^{100} x_i = 49 \cdot 51 = 2{,}499,
$$

or equivalently $\frac{1}{100} \sum_{i=1}^{100} x_i = 24.99$. Since the average value is 24.99, there must be at least one i for
which $x_i \le 24.99$. Moreover, since all numbers are integers, we are guaranteed that $x_i \le 24$, which
finishes the proof.
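The bound can also be checked experimentally. The sketch below assumes the setting used in the solution: 100 seats, 49 requests for chocolate and 51 for vanilla, and 49 chocolate and 51 vanilla ice creams placed on the table. It shuffles both arrangements at random and records the best rotation. The function name, the seed, and the number of trials are our own choices.

```python
import random


def best_rotation(requests, servings):
    """Maximum, over all rotations, of the number of people who get what they asked for."""
    n = len(requests)
    best = 0
    for shift in range(n):
        satisfied = sum(
            1 for seat in range(n) if requests[seat] == servings[(seat + shift) % n]
        )
        best = max(best, satisfied)
    return best


rng = random.Random(0)  # fixed seed, so that the experiment is reproducible
requests = ["C"] * 49 + ["V"] * 51
servings = ["C"] * 49 + ["V"] * 51
worst_best = 100
for _ in range(200):
    rng.shuffle(requests)
    rng.shuffle(servings)
    worst_best = min(worst_best, best_rotation(requests, servings))
print("smallest best-rotation value seen:", worst_best)
```

In every trial the best rotation satisfies at least 52 people, as the double counting argument guarantees.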

REMARKS

The problem can be equivalently solved using the probabilistic method. Assume that one rotates the
table uniformly at random; that is, each configuration i ∈ [100] occurs with probability 1/100. There
are 49 people that asked for chocolate ice cream; let us mark them with labels $c_1, \ldots, c_{49}$. For any
j ∈ [49], let $X_j$ be the random variable that equals 1 if $c_j$ got vanilla ice cream, and equals 0 otherwise.
(As mentioned above, such random variables are called indicators.) Clearly,

$$
\mathbb{E}[X_j] = 1 \cdot \mathbb{P}(X_j = 1) + 0 \cdot \mathbb{P}(X_j = 0) = \mathbb{P}(X_j = 1) = \frac{51}{100},
$$

as 51 vanilla ice creams were served.
Let us stress the fact that random variables $X_j$ and $X_k$ may be correlated. Indeed, $c_j$ and $c_k$ are sitting
around the same table, and the ice creams are placed on the table; it might happen that the fact that $X_j = 1$
affects the probability that $X_k = 1$. Fortunately, the linearity of expectation holds for any sequence of
random variables, independent or not. We get that

$$
\mathbb{E}\Big[\sum_{j=1}^{49} X_j\Big] = \sum_{j=1}^{49} \mathbb{E}[X_j] = 49 \cdot \frac{51}{100} = 24.99.
$$

By the probabilistic method, we get that one can rotate the table so that $\sum_{j=1}^{49} X_j \le 24.99$ and we are
done.

EXERCISES

4.5.1. Let k ∈ N and fix $N = N(k) := \lfloor 2^{k/2} \rfloor$. Show that it is possible to partition the set
$X := [N] = \{1, 2, \ldots, N\}$ into two subsets A and B such that neither A nor B contains an arithmetic
progression of length k.
(Source of the problem and solution: well-known, classic problem related to Van der Waerden
numbers.)

4.5.2. Show that for any n ∈ N there is a tournament with n basketball teams participating in which
there are at least $k = n!/2^{n-1}$ orderings $t_1, \ldots, t_n$ such that team $t_i$ won against team $t_{i+1}$, for all
$i \in [n-1]$.

(Source of the problem and solution: well-known, classic problem related to directed Hamilton paths.)

4.5.3. Consider a graph with T triangles. Show that it is possible to color the edges of this graph with
two colors so that the number of monochromatic triangles is at most T /4.

4.5.4. There are 100 people invited to the party; 450 pairs of people know each other. Show that it is
possible to select 10 people so that no two of them know each other.
(Source of the problem and solution: well-known, classic problem related to independent sets in a graph
with a given degree sequence.)

4.6 Probability
SOURCE

Problem and solution: our own (inspired by a problem from the book “Are You Smart Enough to Work
at Google?” by William Poundstone)

PROBLEM

Let n ≥ 2 be any natural number. Take a unit stick and break it in n random places. Formally, each
breaking point is chosen uniformly at random from the whole stick. Find the probability that one can
create a polygon from the n + 1 resulting pieces.

THEORY

Geometrical Probability In order to solve the special case of our problem (when n = 2), we will use
some basic geometrical probability. This field studies some basic properties of geometrical objects such
as points, lines, planes, circles, spheres, focusing on some natural and fundamental concepts such as
random points, random planes, random directions. Let us note that any rigorous discussion on
geometrical probability would require sophisticated mathematical background (such as measure theory
and integral geometry). As a result, we only scratch the surface in this book, focusing exclusively on
very simple applications.
In order to give a flavour of results in this field, let us consider the clean tile problem that is an
example of a mathematical game of chance that is concerned with dropping a circular coin at random.
This game was studied by Buffon who is famous because of another game of chance he studied, the
needle problem that is concerned with dropping a needle at random. The needle problem requires
slightly more sophisticated tools so we only discuss the clean tile problem here.

FIGURE 4.5: The clean tile problem of Buffon.

In a room tiled with equal tiles of any shape a coin is thrown upward. One of the players bets that
after its fall the coin will rest clean, that is, on one tile only. The second player bets that the coin will
rest on the crack that separates tiles. We would like to investigate if the game is fair. Buffon himself
considered tiles shaped as squares, equilateral triangles, rhombuses, and hexagons. We concentrate on
squares, the easiest case. Suppose that a coin has diameter d and the floor is tiled with squares, each of
side ℓ for some ℓ > d. We assume that the center of the coin lands at a random place on the floor. It is
clear that the coin touches the separating crack if and only if the center is at distance less than d/2 from
the crack (see Figure 4.5). Hence, the probability for the coin to be entirely within one of the tiles is
given by the ratio between the area of the inner square (of side ℓ − d) and that of the outer square
(of side ℓ); that is, the first player wins with probability p where

$$
p := \frac{(\ell - d)^2}{\ell^2} = \left(1 - \frac{d}{\ell}\right)^2.
$$

It follows that the second player wins with probability

$$
1 - p = 1 - \left(1 - \frac{d}{\ell}\right)^2.
$$

For the game to be fair these two probabilities must be equal, that is, p = 1/2. Since this is equivalent
to $2(\ell - d)^2 = \ell^2$, the following equation has to be satisfied:

$$
\ell^2 - 4d\ell + 2d^2 = 0.
$$

There are two solutions, $\ell = (2 \pm \sqrt{2})d$, but since ℓ > d, the only acceptable solution is

$$
\ell = (2 + \sqrt{2})d \approx 3.41\,d.
$$

Smaller values of ℓ give an advantage to the second player and larger values favour the first player.
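The fair tile size can be double-checked numerically. The following sketch evaluates the formula for p and also estimates it by dropping the centre of the coin uniformly at random on a single tile; the function names, the seed, and the number of trials are assumptions made only for this illustration.

```python
import math
import random


def clean_probability(side, diameter):
    """Exact probability that the coin rests on one square tile only."""
    return (1 - diameter / side) ** 2


def simulate_clean(side, diameter, trials, seed=0):
    """Monte Carlo estimate: drop the coin centre uniformly on one tile."""
    rng = random.Random(seed)
    radius = diameter / 2
    clean = 0
    for _ in range(trials):
        x = rng.uniform(0, side)
        y = rng.uniform(0, side)
        # The coin is clean if its centre is at least one radius away from every edge.
        if radius <= x <= side - radius and radius <= y <= side - radius:
            clean += 1
    return clean / trials


d = 1.0
fair_side = (2 + math.sqrt(2)) * d
print(clean_probability(fair_side, d), simulate_clean(fair_side, d, 100_000))
```

Both numbers should be close to 1/2; replacing `fair_side` with a smaller value makes the estimate drop below 1/2, in favour of the second player.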

Disjoint and Mutually Exclusive Events Two events A and B are said to be disjoint if they cannot
occur at the same time, that is, A ∩ B = ∅. In particular, P(A ∩ B) = P(∅) = 0. The simplest
example of disjoint events is a coin toss. The outcome of a tossed coin can be either heads or tails, but
both outcomes cannot occur simultaneously.
Being mutually exclusive is a slightly different property of events (sets in a probability space). Two
events are mutually exclusive if the probability of them both occurring is zero, that is, P(A ∩ B) = 0.
With that definition, disjoint events are necessarily mutually exclusive, but mutually exclusive events are
not necessarily disjoint.
In order to illustrate the difference, suppose that a point is selected uniformly at random from the unit
square. Each coordinate is uniformly and independently distributed from the set [0, 1]. Let A be the
event that the x coordinate is greater than or equal to the y coordinate, and B be the event that the y
coordinate is greater than or equal to the x coordinate. Clearly, P(A) = P(B) = 1/2 and
A ∩ B = {(x, x) : x ∈ [0, 1]}, and so the events are not disjoint. However, P(A ∩ B) = 0, as the area

of the set A ∩ B is equal to 0, and so they are mutually exclusive.


Note that mutually exclusive events can fail to be disjoint only if the considered probability space
contains non-empty events that have probability equal to 0. Such situations are typical when the probability
space on which possible events are defined is continuous. In our example above, the space is a unit
square and we defined the probability of a given event as the area of the corresponding sample points
that are included in this event. Geometrical objects that have zero area define valid events but the
associated probabilities are equal to 0. In our example, such an event was the line segment from point
(0, 0) to point (1, 1).

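Consider now a sequence of pairwise mutually exclusive events $A_1, \ldots, A_n$. One can combine the
Union Bound (4.7) and the Bonferroni inequality (4.8) to get that

$$
\sum_{1 \le i \le n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i \cap A_j) \le \mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) \le \sum_{1 \le i \le n} \mathbb{P}(A_i),
$$

and so, since $\mathbb{P}(A_i \cap A_j) = 0$ for all i ≠ j,

$$
\mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} \mathbb{P}(A_i). \tag{4.10}
$$

Alternatively, one can use the Inclusion–Exclusion Principle (4.9).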

SOLUTION

Let us first define the problem more formally. Let $A_0$ and $A_{n+1}$ be the two endpoints of the unit stick.
Let $A_1, \ldots, A_n$ be the n random breaking points. Recall that, for each i ∈ [n], $A_i$ is chosen uniformly at
random from the whole stick. It follows that $X_i$, the distance from $A_0$ to $A_i$, is a random variable that
has a real number from the interval [0, 1] assigned uniformly at random. These random variables are
independent and so we can generate them one by one or all at the same time (simultaneously); we will
use this property at some point. Note also that with probability zero $X_i = X_j$ for some i ≠ j, and so
we may assume that such a situation does not happen. As a result, we may order the $A_i$’s in increasing
order of the corresponding $X_i$’s. Formally, let π : [n] → [n] be the unique permutation such that
$X_{\pi(i)} < X_{\pi(i+1)}$ for any i ∈ [n − 1] (notice that π is also a random variable that depends on the random
variables $X_i$).
In order to keep the notation simple, let us fix $X_{\pi(0)} = 0$ and $X_{\pi(n+1)} = 1$. Once we break the stick,
we get n + 1 pieces. The ith piece (i ∈ [n + 1]) has length $L_i = X_{\pi(i)} - X_{\pi(i-1)}$. Clearly, the desired
polygon can be created if and only if no piece has length larger than 1/2, that is, $L_i \le 1/2$ for all
i ∈ [n + 1]. Note that the probability that some piece has length exactly 1/2 is equal to zero and so we
may or may not include this degenerate case without affecting the result.
Let $A_i$ be the event that the ith piece is too long, that is, $L_i > 1/2$. Finding the probability that the
first piece is too long is easy. Indeed, in this case one can simply look at the independent random
variables $X_i$: the first piece is longer than 1/2 if and only if every breaking point lies in the second
half of the stick. We get that

$$
\mathbb{P}(A_1) = \mathbb{P}(L_1 > 1/2) = \mathbb{P}\Big(\bigcap_{i=1}^{n} (X_i > 1/2)\Big) = \left(\frac{1}{2}\right)^{n}.
$$

Similarly, the last piece is too long with the same probability, namely, $\mathbb{P}(A_{n+1}) = 1/2^n$. But what
about some middle piece? It is not clear. It feels that the situation can be different but it turns out that
the distributions of all $L_i$’s are the same. To see this we do the following trick. Instead of breaking the
stick into n + 1 pieces by breaking it in n random places, we start with a rope that forms a circle with
unit circumference (that is, we take a unit length rope and glue the two endpoints together). Now, we
cut the rope in n + 1 random places. Again, we can do the cuts one by one or all at the same time. If we
do the cuts one by one, then we immediately see that the two processes are equivalent. Indeed, after the
first cut the situation is exactly the same as at the beginning of the process of breaking the stick. From
that point on, the two processes can be coupled together. On the other hand, if we cut the rope in n + 1
places simultaneously, then, by symmetry, we see that there is nothing special about $L_1$ or $L_{n+1}$. All the
random variables $L_i$ have the same distribution; in particular, $\mathbb{P}(A_i) = \mathbb{P}(L_i > 1/2) = 1/2^n$.
We need one more observation to solve our problem. Clearly, the events $A_i$ are not independent. If
the ith piece is too long, then the chance that some other piece is too long is smaller. In fact, if i ≠ j,
then the two events $A_i$ and $A_j$ are disjoint, that is, they cannot occur at the same time. In particular,
$\mathbb{P}(A_i \cap A_j) = 0$. We use (4.10) to get

$$
\mathbb{P}\Big(\bigcup_{i=1}^{n+1} A_i\Big) = \sum_{i=1}^{n+1} \mathbb{P}(A_i) = \frac{n+1}{2^{n}}.
$$

It follows that the probability that one can create a polygon from the n + 1 pieces is equal to
$1 - (n+1)/2^{n}$.

REMARKS

Recall that the outcome of our random experiment (that is, breaking the stick into n + 1 pieces) can be
represented by n random variables $X_i$. Each $X_i$ (i ∈ [n]) has a real number from the interval [0, 1]
assigned uniformly at random and independently. As a result, one can alternatively think about this
experiment as a process of selecting a point uniformly at random from the n-dimensional unit cube. We
will use this point of view to geometrically solve our problem for the specific case of breaking the stick
into three parts (n = 2).

Let $(x, y) \in [0, 1]^2$ be a random point from the unit square (recall that $X_1 = x$, $X_2 = y$). If both x
and y are less than 1/2, then clearly we will not be able to construct a triangle in the original problem
as the third piece will be too long. Similarly, if both are more than 1/2, then the first piece will be too
long. If x < 1/2 < y, then our task is doable if and only if y − x < 1/2, that is, the middle piece is
short enough. The same argument applies to the situation when y < 1/2 < x. In this case, the sufficient
and necessary condition for being able to achieve our task is x − y < 1/2. We present all four cases in
Figure 4.6; shaded areas correspond to the two cases when the triangle can be constructed. It follows
that the probability we are successful is equal to 1/4 (of course, it is consistent with our general result:
$1/4 = 1 - (2 + 1)/2^2$).

FIGURE 4.6: Breaking stick into three parts.
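The value $1 - (n+1)/2^n$ obtained in the solution can be compared with a direct simulation of the experiment. The sketch below breaks the unit stick at n independent uniform points and checks whether the longest piece is shorter than 1/2; the function names, the seed, and the number of trials are our own assumptions.

```python
import random


def polygon_probability(n):
    """Closed-form probability from the solution: 1 - (n + 1) / 2^n."""
    return 1 - (n + 1) / 2 ** n


def simulate(n, trials, seed=0):
    """Estimate the probability that the n + 1 pieces can form a polygon."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        cuts = sorted(rng.random() for _ in range(n))
        points = [0.0] + cuts + [1.0]
        longest = max(b - a for a, b in zip(points, points[1:]))
        if longest < 0.5:  # no piece is too long
            successes += 1
    return successes / trials


for n in (2, 3, 4):
    print(n, polygon_probability(n), simulate(n, 20_000))
```

For n = 2 the estimate should be close to 1/4, in agreement with the geometric argument for the triangle.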

EXERCISES

4.6.1. Consider an urn that initially contains one white and one black ball. We repeatedly perform the
following process. In a given round, one ball is drawn randomly from the urn and its color is observed.
The ball is then returned to the urn, and an additional ball of the same color is added to the urn. We
repeat this selection process for 50 rounds so that the urn contains 52 balls. What number of white balls
is the most probable?
(Source of the problem: PLMO L – Phase 1 – Problem 11. Solution: our own.)

4.6.2. There are 65 participants competing in a ski jumping tournament. They take turns and perform
their jumps in a given sequence. We assume that no two jumpers obtain the same result and that each
final resulting order of participants is equally probable. At each given round of the tournament, the
person that has obtained the best result thus far is called a leader. Prove that the probability that the
leader changed exactly once during the whole tournament is greater than 1/16.
(Source of the problem: PLMO XLVII – Phase 1 – Problem 11. Solution: our own.)

4.6.3. Three random events meet the following three conditions: (a) their probabilities are all equal, (b)
they are pairwise independent, and (c) they cannot all happen at the same time. What is the
maximum probability that at least one of these three events holds?
(Source of the problem and solution idea: PLMO XXXV – Phase 1 – Problem 9.)

4.7 Combinations of Geometrical Objects


SOURCE

Problem and solution: PLMO LIX – Phase 3 – Problem 3

PROBLEM

Consider the set P of all points (x, y) on a plane with x, y ∈ Z; that is, P = Z × Z, the Cartesian
product of Z and Z. Suppose that each point in P is painted red or blue. Prove that there exists an
infinite subset of P that has a center of symmetry and consists of points having the same color.

THEORY
Recall that for any two sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b)
where a ∈ A and b ∈ B; that is,

A × B := {(a, b) : a ∈ A, b ∈ B } .

This definition extends naturally to any dimension n ∈ N: in the Cartesian product $A_1 \times \ldots \times A_n$,
instead of ordered pairs we deal with ordered n-tuples. Moreover, if all $A_i$’s are the same set A, then we
simply write $A^n$ instead of $A \times \ldots \times A$.

Point Reflection Let $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ be any point in n-dimensional space. For any point
$a = (a_1, \ldots, a_n) \in \mathbb{R}^n$, the reflection of a across the point p is the point

$$
\mathrm{Ref}_p(a) := (p_1 - (a_1 - p_1), \ldots, p_n - (a_n - p_n)) = (2p_1 - a_1, \ldots, 2p_n - a_n) = 2p - a.
$$

In the case where $p = (0, \ldots, 0) \in \mathbb{R}^n$ is the origin, point reflection of a is simply the negation of
vector a. In two dimensions, namely when n = 2, a point reflection is the same as a rotation of 180
degrees about p.

Point Symmetry A set $S \subseteq \mathbb{R}^n$ that is invariant under a point reflection is said to possess point
symmetry. In other words, S has point symmetry if and only if there exists a point $p \in \mathbb{R}^n$ such that
$S = S'$, where

$$
S' := \{\mathrm{Ref}_p(a) : a \in S\}.
$$

Point p is often called the centre of the symmetry.

Playing cards often have point symmetry, so that they look exactly the same from the top or the
bottom. Some letters of the English alphabet (namely, H, I, N, O, S, X, Z) exhibit point symmetry, as do
many geometrical figures such as circles and rectangles. Finally, graphs of many functions, such as
$f(x) = x^3$ or $f(x) = 1/x$, have point symmetry.
3

SOLUTION

Consider any coloring of P with the two colors red and blue. Towards a contradiction, suppose that
there is no monochromatic infinite subset of P that has a center of symmetry. In other words, for every
point p ∈ A × A, where A := {k/2 : k ∈ Z}, the set

$$
P' = P'(p) := \{a \in P : \text{both } a \text{ and } 2p - a \text{ have the same color}\}
$$

is finite. In fact, we will only use this assumption for p ∈ {(0, 0), (1/2, 0)}.
Suppose first that the center of symmetry is located at p = (0, 0), the origin. It follows from our
assumption that there exists $M_1 \in \mathbb{N}$ such that for all x, y ∈ Z with $y \ge M_1$,

$$
(x, y) \text{ and } (-x, -y) \text{ have different colors.} \tag{4.11}
$$

Similarly, if p = (1/2, 0), then we are guaranteed that there exists $M_2 \in \mathbb{N}$ such that for all x, y ∈ Z
with $y \ge M_2$,

$$
(x + 1, y) \text{ and } (-x, -y) \text{ have different colors.} \tag{4.12}
$$

Let us now fix $y = \max\{M_1, M_2\}$. It follows immediately from (4.11) and (4.12) that for any x ∈ Z,
points (x, y) and (x + 1, y) have the same color (since both of them have a different color than
(−x, −y) and there are only two colors, red and blue).

As a result, all the points in $Q := \{(x, \max\{M_1, M_2\}) : x \in \mathbb{Z}\}$ have the same color.
Since Q clearly has point symmetry (in fact, any point in Q is a center of the symmetry) and is infinite,
we get the desired contradiction.
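The two reflections used in the proof are easy to verify mechanically. The following small sketch implements the map a ↦ 2p − a with exact rational arithmetic and checks that reflecting (−x, −y) across (0, 0) and across (1/2, 0) gives (x, y) and (x + 1, y), respectively; the helper name `reflect` and the sample values are our own.

```python
from fractions import Fraction


def reflect(p, a):
    """Point reflection of a across p, that is, 2p - a computed coordinate-wise."""
    return tuple(2 * p_i - a_i for p_i, a_i in zip(p, a))


origin = (Fraction(0), Fraction(0))
half = (Fraction(1, 2), Fraction(0))

x, y = 3, 7  # arbitrary sample point, chosen for illustration
a = (-x, -y)
print(reflect(origin, a))  # the partner of (-x, -y) in (4.11)
print(reflect(half, a))    # the partner of (-x, -y) in (4.12)
```

Applying `reflect` twice with the same centre returns the original point, which reflects the fact that a point reflection is an involution.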

REMARKS

Very often problems formulated in terms of relationships of geometrical objects can be reformulated as
combinatorial problems, in which geometrical properties of the considered objects form combinatorial
constraints. The opposite situation may also occur; that is, sometimes combinatorial problems can be
solved after rephrasing them in the language of geometry and then using geometrical tools.
In order to illustrate the power of this approach, let us consider convex n-gons. Recall that an n-gon is a
polygon with n sides (see Chapter 6 for more). A convex polygon is defined as a polygon with all its
interior angles less than 180 degrees. This means that all the vertices of the polygon point
outwards, away from the interior of the shape. Assuming that no three diagonals go through
the same interior point, let us count how many intersection points all the diagonals have.
For simplicity, let us concentrate on 8-gons and label vertices with integers from 0 to 7, starting with
an arbitrary vertex and then proceeding clockwise. There are 8 diagonals from vertex i to vertex i + 2,
i = 0, …, 7 (using modular arithmetic), each of them intersecting 1 ⋅ 5 other diagonals. There are 8

diagonals from vertex i to vertex i + 3, i = 0, …, 7; this time, each of them intersects 2 ⋅ 4 other
diagonals. Finally, there are 4 diagonals from i to i + 4, i = 0, …, 3 and each of them intersects 3 ⋅ 3
other diagonals. Moreover, each pair of intersecting diagonals occurs twice. Hence, the total number of
intersections is equal to

$$
\frac{8 \cdot 1 \cdot 5 + 8 \cdot 2 \cdot 4 + 4 \cdot 3 \cdot 3}{2} = \frac{140}{2} = 70.
$$
One can repeat this argument for any value of n but it gets complicated quickly and no general formula
seems to appear.
Alternatively, one can observe that each intersection point can be labelled with the set consisting of
the labels of the endpoints of the two corresponding diagonals. For example, diagonal 15 intersects
diagonal 36 at the point labelled with set {1, 3, 5, 6}. It is easy to see that no two pairs of diagonals
yield the same set. On the other hand, each set of 4 labels corresponds to one intersection point. It
follows that there exists a bijection between the set of points of intersection and the family of 4-element
subsets of the set {0, …, 7}, and so the two sets have the same size. Since

$$
\binom{8}{4} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2} = 70,
$$

we get an alternative way of obtaining the result. More importantly, it easily generalizes to any value of
n: there are $\binom{n}{4}$ intersection points of two diagonals in a convex n-gon.
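The bijection can be confirmed by brute force for small n. The sketch below lists all diagonals of a convex n-gon with vertices labelled 0, …, n − 1, counts the pairs that cross in the interior, and compares the result with the binomial coefficient. It relies on the standard fact that two chords of a convex polygon cross if and only if their endpoints interleave; the function names are ours.

```python
from itertools import combinations
from math import comb


def diagonals(n):
    """All diagonals (i, j), i < j, of a convex n-gon (sides are excluded)."""
    return [(i, j) for i, j in combinations(range(n), 2) if j - i not in (1, n - 1)]


def cross(d1, d2):
    """True if the two diagonals intersect at an interior point."""
    (a, b), (c, d) = d1, d2
    if len({a, b, c, d}) < 4:
        return False  # a shared endpoint is not an interior intersection
    # Endpoints interleave: exactly one of c, d lies strictly between a and b.
    return (a < c < b) != (a < d < b)


def count_intersections(n):
    return sum(1 for d1, d2 in combinations(diagonals(n), 2) if cross(d1, d2))


for n in range(4, 11):
    print(n, count_intersections(n), comb(n, 4))
```

The count ignores the geometric assumption that no three diagonals meet at one point; it counts crossing pairs, which coincides with the number of intersection points exactly under that assumption.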

EXERCISES

4.7.1. Let P be a set of five points on a plane with the property that no three of them lie on the same
line. Denote by a(P ) the number of obtuse triangles whose vertices lie in P. Find the minimum and the
maximum value that a(P ) can attain over all possible sets P.

4.7.2. Every point on a circle is painted with one of three colors. Prove that there are three points on the
circle that have the same color and form an isosceles triangle.
(Source of the problem and solution idea: PLMO LI – Phase 1 – Problem 4.)
4.7.3. Take a set of n ≥ 2 points with the property that no three of them lie on the same line. We paint
all line segments formed by those points in such a way that no two line segments that have a common
vertex have the same color. Find the minimum number of colors for which such coloring exists.

4.8 Pigeonhole Principle


SOURCE

Problem and solution idea: LVII PLMO – Phase 1 – Problem 4


PROBLEM

During the Polish Mathematical Olympiad, which lasts two days, participants solve a total of 6
problems. Each participant can get 6, 5, 2, or 0 points for the solution of each problem. During one of
the competitions, the following interesting property occurred: for any two participants there were at
least two problems for which they obtained different scores. What is the maximum possible number of
participants in this competition?
THEORY

Pigeonhole Principle The tool we introduce and use in this section is obvious but, perhaps surprisingly,
often extremely powerful. It can be stated as follows. If one has n boxes and places more than n
objects into them, then there will be at least one box that contains more than one object. In fact, one can
make the following stronger statement: if k objects are placed into n boxes, then there will be at least
one box that contains at least ⌈k/n⌉ objects.
In order to see this tool in action (in an easy scenario) let us consider the following example. We
shoot 65 shots at a square target, the side of which is 80 centimeters long. Since we are pretty good at
this, all of our shots hit the target. We want to prove that there are two bullet holes that are less than 15
centimeters apart.
Suppose that our target is an old 8 × 8 chessboard. (Formally, we say that the target is tessellated
into an 8 × 8 square grid.) There are 8 ⋅ 8 = 64 squares and the board received 65 > 64 shots. Hence, by
the pigeonhole principle, there must be a square that received at least two shots. We claim that these
two shots are at distance less than 15 centimeters from each other. Indeed, since the side of each square is
10 centimeters long, the distance between any two points of the same square is, by the Pythagorean
theorem, at most

$$
\sqrt{10^2 + 10^2} = \sqrt{200} \approx 14.1 < 15.
$$
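The chessboard argument translates directly into a short procedure: assign every shot to its 10 cm cell and stop as soon as a cell is hit for the second time. The sketch below is one possible rendering of this idea; the random shots, the seed, and the function name are assumptions made for illustration.

```python
import math
import random


def close_pair(shots, side=80.0, cells=8):
    """Return two shots that landed in the same grid cell, or None if there are none."""
    cell = side / cells
    seen = {}
    for x, y in shots:
        # Clamp so that points on the far boundary fall into the last cell.
        key = (min(int(x // cell), cells - 1), min(int(y // cell), cells - 1))
        if key in seen:
            return seen[key], (x, y)
        seen[key] = (x, y)
    return None


rng = random.Random(1)
shots = [(rng.uniform(0, 80), rng.uniform(0, 80)) for _ in range(65)]
p, q = close_pair(shots)  # guaranteed to exist: 65 shots, only 64 cells
print(p, q, math.dist(p, q))
```

With at most 64 shots the function may return `None`, so the unpacking in the last lines is safe only because 65 shots are used.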
SOLUTION

First of all, let us note that the actual point values do not matter (at least from the perspective of our
problem); only the fact that there are four possible scores is relevant. Hence, we may assume that each
solution gets a score from the set P = {0, 1, 2, 3} and so the performance of each participant can be
represented by a vector from the set
$X = \{(a_1, a_2, a_3, a_4, a_5, a_6) : a_i \in P\}$. Clearly, $|X| = 4^6 = 4{,}096$.

We know that participants got unique vectors from a subset A of X that satisfies the following
property: any two vectors from A differ in at least two coordinates. Our goal is to provide an upper
bound for the size of A. To that end, let us observe that the number of vectors of length 5 of elements
from P is $4^5 = 1{,}024$. Hence, if A contained more than 1,024 vectors, then by the pigeonhole principle it
would have two vectors that coincide on the first 5 coordinates and so differ on at most one coordinate
(that is, possibly the last one). This shows that |A| ≤ 1,024.
Now we will show that this upper bound is sharp; that is, one can construct a set A of size 1,024 with
the desired property. Let

$$
A = \Big\{\Big(a_1, a_2, a_3, a_4, a_5, a_6 = \sum_{i=1}^{5} a_i \ (\mathrm{mod}\ 4)\Big) : a_i \in P \text{ for } 1 \le i \le 5\Big\} \subseteq X.
$$

In other words, A is constructed by considering all five-element vectors $(a_1, \ldots, a_5)$ with entries from P
and adding $a_6 = \sum_{i=1}^{5} a_i \ (\mathrm{mod}\ 4)$ at the very last coordinate. (Note that $a_6 \in P$.) Clearly,
$|A| = 4^5 = 1{,}024$, as the last coordinate is determined by the first five coordinates.


For a contradiction, suppose that there are two distinct vectors in A, say, $a = (a_1, \ldots, a_6)$ and
$b = (b_1, \ldots, b_6)$, that differ on at most one coordinate. Clearly, by construction, a and b differ on at least
one coordinate from the first five coordinates and so a and b must differ on precisely one coordinate:
$a_\ell \ne b_\ell$ for some ℓ such that 1 ≤ ℓ ≤ 5. In particular, $a_6 = b_6$. But this implies that

$$
\sum_{i=1}^{5} a_i \equiv a_6 = b_6 \equiv \sum_{i=1}^{5} b_i \pmod{4},
$$

which gives $a_\ell \equiv b_\ell \pmod{4}$, as all the other summands coincide. Since P = {0, 1, 2, 3}, it follows that
$a_\ell = b_\ell$ and so we get the contradiction, which finishes the proof that A has the desired property.
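Since the construction is finite, the desired property can also be verified exhaustively. The sketch below builds the set A (each vector of five scores extended by the sum of its entries modulo 4) and computes the minimum number of coordinates on which two distinct vectors differ; the function names are our own.

```python
from itertools import combinations, product

P = (0, 1, 2, 3)


def build_vectors():
    """All 5-tuples over P, each extended by the sum of its entries modulo 4."""
    return [v + (sum(v) % 4,) for v in product(P, repeat=5)]


def differing_coordinates(u, v):
    """Number of coordinates on which u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


A = build_vectors()
min_difference = min(differing_coordinates(u, v) for u, v in combinations(A, 2))
print(len(A), min_difference)
```

The check goes through roughly half a million pairs, which takes a moment but remains perfectly feasible.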

REMARKS

Let us first note how one can come up with the proof that a set of size 1,024 can be constructed. The key
observation was that each five-element vector can be associated with one of four signatures (the sum
of its coordinates modulo 4); moreover, if two of these vectors differ on only one coordinate, then they
must have different signatures. We used these signatures to define the 6th coordinate.
Let us also mention the following three closely related problems: the Birthday Paradox, which can
be viewed as a probabilistic pigeonhole principle, the Coupon Collector’s Problem, and the Birthday Attack.

Birthday Paradox Suppose that k people are selected at random. The convenient assumption is that
each day of the year (including February 29) is equally probable for a birthday, independently for each
person. We are interested in estimating the probability that some pair of the selected people will have
the same birthday. Since there are n = 366 possible birthdays, by the pigeonhole principle, this
probability is equal to one if k ≥ 367. On the other hand, if k = n = 366, then we are not guaranteed
that such a pair exists but the probability that each person has a unique birthday (that we denote by
p(n, k)) is extremely small. Indeed, it is clear that

$$
p(n, k) := \frac{n \cdot (n-1) \cdots (n-k+1)}{n^{k}} = \frac{n!}{n^{k}\,(n-k)!},
$$

and so $p(366, 366) \approx 5.36 \cdot 10^{-158}$. It may seem surprising that the probability p(n, k) of all
birthdays being distinct is below 50% for a group as small as 23 individuals; p(366, 23) ≈ 0.49.

Since 1 + x ≤ exp(x) for any x ∈ R, we can estimate p(n, k) as follows.

$$
p(n, k) = \prod_{i=1}^{k-1} \frac{n-i}{n} = \prod_{i=1}^{k-1} \left(1 - \frac{i}{n}\right) \le \exp\left(-\sum_{i=1}^{k-1} \frac{i}{n}\right) = \exp\left(-\frac{k(k-1)}{2n}\right) =: \hat{p}(n, k).
$$

Moreover, in practice this upper bound is not too far from the true value; for example,
$0.4937 \approx p(366, 23) \le \hat{p}(366, 23) \approx 0.5009$.
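Both the exact probability p(n, k) and its exponential upper bound are straightforward to evaluate, which makes it easy to experiment with other values of n and k. The following is a minimal sketch with function names of our own choosing.

```python
import math


def p_distinct(n, k):
    """Exact probability that k people have pairwise distinct birthdays among n days."""
    prob = 1.0
    for i in range(1, k):
        prob *= 1 - i / n
    return prob


def p_hat(n, k):
    """Upper bound exp(-k(k-1)/(2n)) obtained from 1 + x <= exp(x)."""
    return math.exp(-k * (k - 1) / (2 * n))


for n in (365, 366):
    print(n, p_distinct(n, 23), p_hat(n, 23))
```

Running the loop for both n = 365 and n = 366 shows how sensitive the last digits are to whether February 29 is included.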
Coupon Collector’s Problem We continue selecting k people at random but this time we would like k
to be large enough so that q(n, k), the probability that on every single day of the year someone has a
birthday, is close to one. This problem is known as the coupon collector’s problem as the question can
be reformulated as the problem of collecting n unique coupons hidden in boxes of some brand of
cereal. Clearly, for any k < n we have q(n, k) = 0. If one is extremely lucky, then a group of
k = n = 366 people could have the desired property but the probability is very low; indeed,

$$
q(366, 366) = p(366, 366) \approx 5.36 \cdot 10^{-158}.
$$

The exact values of q(n, k) are extremely difficult to compute (unless n and k are small or, for
example, n = k) as they are related to the Stirling numbers of the second kind, the number of ways to
partition a set of k objects into n non-empty subsets. Using simulations, we determined that k = 2,294
is the smallest value for which q(366, k) > 0.5.
On the other hand, it is possible to show that the expected number of people that need to be selected
in order for the desired property to hold is

$$
n \sum_{i=1}^{n} \frac{1}{i} = n H_n = n \ln n + \gamma n + \frac{1}{2} + o(1),
$$

where $H_n$ is the n-th harmonic number and γ ≈ 0.577216 is the Euler-Mascheroni constant. In
particular, for n = 366 it is approximately 2,372.1245. The function q(n, k) has the following
asymptotic behavior. If $k = n(\ln n + c_n)$ for some sequence $(c_n)_{n \in \mathbb{N}}$ of real numbers, then

$$
q(n, k) \sim
\begin{cases}
0 & \text{if } c_n \to -\infty, \\
e^{-e^{-c}} & \text{if } c_n \to c \in \mathbb{R}, \\
1 & \text{if } c_n \to \infty.
\end{cases}
$$

Note that for n = 366 and k = 2,294 the above estimate gives q(n, k) ≈ 0.4995, which is in line with
the simulation results. So k is approximately 100 times larger than the 23 people needed in the birthday
paradox case. Finally, let us highlight the following weaker but often useful statement: for any ϵ > 0,

$$
\lim_{n \to \infty} q(n, (1 - \epsilon) n \ln n) = 0 \quad \text{and} \quad \lim_{n \to \infty} q(n, (1 + \epsilon) n \ln n) = 1.
$$


n→∞ n→∞

FIGURE 4.7: Plot of q(366, k). Dashed horizontal line is for 50% probability, dotted vertical line denotes expected value that is roughly
equal to 2, 372.1245.
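For moderate n the quantities discussed above can be evaluated directly. The sketch below computes the expected number of draws $n H_n$, the asymptotic estimate $e^{-e^{-c}}$ with $c = k/n - \ln n$, and q(n, k) itself via the Inclusion–Exclusion Principle over the set of days that nobody has as a birthday. The inclusion–exclusion formula is a standard identity rather than something derived in the text, and the function names are ours; exact integer arithmetic is used to avoid cancellation errors.

```python
import math


def expected_draws(n):
    """Expected number of draws needed to see all n coupons: n * H_n."""
    return n * sum(1 / i for i in range(1, n + 1))


def q_exact(n, k):
    """P(all n coupons appear among k draws), by inclusion-exclusion (for k >= 1)."""
    # Count the k-draw sequences that avoid a fixed set of j coupons: (n - j)^k.
    total = sum((-1) ** j * math.comb(n, j) * (n - j) ** k for j in range(n + 1))
    return total / n ** k


def q_asymptotic(n, k):
    """Asymptotic estimate exp(-exp(-c)) with c = k / n - ln n."""
    c = k / n - math.log(n)
    return math.exp(-math.exp(-c))


print(expected_draws(366))
print(q_exact(366, 2294), q_asymptotic(366, 2294))
```

The integers involved have several thousand digits for n = 366, which Python handles natively; for much larger n one would prefer the asymptotic estimate.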

Birthday Attack A birthday attack is a type of cryptographic attack that exploits the mathematics
behind the birthday problem discussed above. It can be formulated as follows. Given a function
f = f(x), the goal of the attack is to find two different values of x, say $x_1$ and $x_2$, such that
$f(x_1) = f(x_2)$. Such a pair $x_1, x_2$ is called a collision. The method used to find a collision is simply to
evaluate function f for many values of x that can be selected randomly until the same result is obtained
more than once. Because of the birthday problem, this method is surprisingly efficient.
In particular, if function f gives any of the n different outputs uniformly at random and n is
sufficiently large, then we expect to obtain a collision after evaluating the function for about $1.25\sqrt{n}$
different arguments, on average. Indeed, the probability that the first collision occurs at time i is equal
to

i−2 t i−1 i−2 t 2 i−1


P(i) = ∏t=1 (1 − ) ⋅ = ∏t=1 exp (− + O((t/n) )) ⋅
n n n n

i−2 t 3 2 i−1
= exp (− ∑t=2 + O(i /n )) ⋅
n n

(i−1)(i−2) 3 2 i−1
= exp (− + O(i /n )) ⋅
2n n

2
x exp(−x /2)
˜ ,
√n

provided i = x√n for some x ∈ R. In order to calculate an asymptotic value for the probability of seeing
a collision by time i = x√n, denoted P(≤ i), we need to use integrals. This part is not considered to
be elementary mathematics, so the less advanced reader can safely skip it; we will not use this
result later on. Moreover, we only provide a sketch, as a formal argument is more delicate and
technical.

$$
P(\le x\sqrt{n}) = \sum_{i=1}^{x\sqrt{n}} P(i) \sim \int_0^x z\exp(-z^2/2)\,dz
= \Big[-\exp(-z^2/2)\Big]_0^x = 1-\exp(-x^2/2).
$$

So one needs to evaluate the function f roughly $\sqrt{2\ln 2}\,\sqrt{n} \approx 1.18\sqrt{n}$ times to get the probability close
to 1/2. Similarly, the expected number of values that need to be evaluated to get the first collision is
equal to

$$
\sum_{i=1}^{\infty} i\cdot P(i) \sim \int_0^{\infty} z^2\exp(-z^2/2)\,dz\cdot\sqrt{n} = \sqrt{\pi/2}\,\sqrt{n}\approx 1.25\sqrt{n}.
$$
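As a sanity check of this asymptotic estimate, one can compute the expected time of the first collision exactly. The following short Julia sketch is our own illustration (the function name `expected_first_collision` is hypothetical) and assumes, as above, that f returns each of the n outputs uniformly at random. It relies on the identity E[T] = ∑_{i≥0} P(T > i), where T is the time of the first collision and P(T > i) = ∏_{t=0}^{i−1} (1 − t/n) is the probability that the first i evaluations give distinct outputs.

```julia
# Exact expected number of evaluations until the first collision,
# assuming n equally likely outputs (an illustrative sketch).
function expected_first_collision(n)
    total = 0.0
    no_collision = 1.0              # P(T > i), starting with i = 0
    for i in 0:n
        total += no_collision       # add P(T > i)
        no_collision *= 1 - i / n   # update to P(T > i + 1)
    end
    return total
end

n = 365
println(expected_first_collision(n))   # exact expectation
println(sqrt(pi * n / 2))              # asymptotic estimate sqrt(pi/2) * sqrt(n)
```

For n = 365 the exact value is roughly 24.6, whereas the asymptotic formula gives roughly 23.9; the two differ by less than one evaluation.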

EXERCISES

4.8.1 Twenty-five boys and twenty-five girls sit around a table. Prove that it is always possible to find a
person both of whose neighbors are girls.
(Source of the problem and solution: Interactive Mathematics Miscellany and Puzzles by Alexander
Bogomolny, https://www.cut-the-knot.org.)

4.8.2 A person takes at least one aspirin a day for 30 days. Show that if the person takes 45 aspirin
altogether, then in some sequence of consecutive days that person takes exactly 14 aspirin.
(Source of the problem and solution: Interactive Mathematics Miscellany and Puzzles by Alexander
Bogomolny, https://www.cut-the-knot.org.)

4.8.3 Prove that if we take n + 1 numbers from the set {1, 2, …, 2n}, then in this subset there exist two
numbers such that one divides the other.

4.9 Generating Functions


SOURCE

Sicherman dice – well-known problem


PROBLEM

Can you design two different dice so that their sums behave just like a pair of ordinary dice? That is,
there must be two ways to roll a 3, six ways to roll a 7, one way to roll a 12, and so forth. Each die must
have six sides, and each side must be labelled with a positive integer.

THEORY

Generating Function The (ordinary) generating function of a sequence $a = (a_i)_{i=0}^{\infty}$ of real numbers is
defined as follows:

$$G(x) = G(a, x) := \sum_{i=0}^{\infty} a_i x^i.$$

Unlike an ordinary series, this formal series is allowed to diverge, meaning that the generating function
is not always a true function and the “variable” x is actually an indeterminate allowing us to perform
useful algebraic manipulations.
Let us start with a simple and standard application of generating functions. The Fibonacci sequence
$a_0, a_1, a_2, \ldots$ is defined recursively as follows:

$$a_0 = 0, \quad a_1 = 1 \quad\text{and}\quad a_{n+1} = a_n + a_{n-1} \ \text{ for } n \ge 1.$$

Our goal is to find an explicit formula for $a_n$. Instead of looking for the sequence, we will look for its
generating function $G(x) = \sum_{j\ge 0} a_j x^j$. Once we get it, we will try to recover the coefficient $a_n$ in front
of $x^n$, the nth Fibonacci number.


In order to get G(x), let us take the recurrence relation $a_{n+1} = a_n + a_{n-1}$, multiply both sides of it
by $x^n$, and sum over all values of n for which the relation is valid. We get

$$\sum_{n\ge 1} a_{n+1} x^n = \sum_{n\ge 1} a_n x^n + \sum_{n\ge 1} a_{n-1} x^n.$$

(4.13)

On the left hand side of (4.13) we have

$$\sum_{n\ge 1} a_{n+1} x^n = a_2 x + a_3 x^2 + a_4 x^3 + \ldots = \frac{G(x) - a_1 x - a_0}{x} = \frac{G(x) - x}{x}.$$

On the other hand, on the right hand side of (4.13) we have

$$\sum_{n\ge 1} a_n x^n + \sum_{n\ge 1} a_{n-1} x^n = G(x) + xG(x).$$

It follows that

$$\frac{G(x) - x}{x} = G(x) + xG(x)$$

and so

$$G(x) = \frac{x}{1 - x - x^2} = \frac{x}{(1 - x r_+)(1 - x r_-)},$$

where $r_+ = (1 + \sqrt{5})/2$ and $r_- = (1 - \sqrt{5})/2$. Our first task is done: we have an explicit formula
for the generating function of the Fibonacci sequence.


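Using the partial fraction method, we get

$$
\begin{aligned}
G(x) &= \frac{1}{r_+ - r_-}\left(\frac{1}{1 - x r_+} - \frac{1}{1 - x r_-}\right)\\
     &= \frac{1}{\sqrt{5}}\left(\sum_{n\ge 0} r_+^n x^n - \sum_{n\ge 0} r_-^n x^n\right)\\
     &= \sum_{n\ge 0} \frac{1}{\sqrt{5}}\left(r_+^n - r_-^n\right) x^n,
\end{aligned}
$$

thanks to the magic of the geometric series (recall that $\sum_{n\ge 0} c^n = 1/(1 - c)$). Therefore, an explicit
formula for the Fibonacci number is

$$a_n = \frac{1}{\sqrt{5}}\left(r_+^n - r_-^n\right) = \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right).$$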

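In fact, note that

$$\left|\frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^n\right| \le \frac{1}{\sqrt{5}} \approx 0.447 < 1/2$$

for any non-negative integer n and so

$$a_n = \left\lfloor \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^n \right\rceil,$$

where ⌊x⌉ is the nearest integer to real number x.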
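The rounding formula is easy to test numerically. The following Julia sketch is our own illustration (the names `fib` and `binet` are hypothetical); it compares the recurrence with the nearest-integer formula for small n. We stop at n = 40 on purpose: the formula is evaluated in floating-point arithmetic, which loses accuracy for larger n.

```julia
# Fibonacci numbers from the recurrence a_{n+1} = a_n + a_{n-1}
function fib(n)
    a, b = 0, 1
    for _ in 1:n
        a, b = b, a + b
    end
    return a
end

# Nearest integer to ((1 + sqrt(5)) / 2)^n / sqrt(5)
binet(n) = round(Int, ((1 + sqrt(5)) / 2)^n / sqrt(5))

println(all(fib(n) == binet(n) for n in 0:40))
```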

Golden Ratio The constant that appeared in the formula for the nth Fibonacci number,

$$\phi = \frac{1 + \sqrt{5}}{2} \approx 1.618,$$

is the Golden Ratio that appears in mathematics surprisingly often. Perhaps even more surprising is the
fact that it appears in some patterns in nature, including the spiral arrangement of leaves and other plant
parts, as well as in music, architecture, and paintings. Two quantities a and b are in the golden ratio if their ratio is the
same as the ratio of their sum to the larger of the two quantities. In other words, ϕ is defined as follows:

$$\phi = \frac{a}{b} = \frac{a + b}{a},$$

where a > b > 0.

SOLUTION

Consider the generating function where $a_k$ represents the number of appearances of the number k on the
die. Thus, an ordinary die would be represented by the polynomial

$$f(x) = x + x^2 + x^3 + x^4 + x^5 + x^6 = x(x + 1)(x^2 + x + 1)(x^2 - x + 1).$$

The key observation is that the result of rolling two (or more, in general) dice is represented by the
product of their generating functions. Therefore, if g(x) and h(x) are the functions associated with the
first die and the second one, respectively, then we get that

$$g(x)h(x) = f(x)^2 = \left(x(x + 1)(x^2 + x + 1)(x^2 - x + 1)\right)^2.$$

There are some constraints we need to consider: we cannot have a non-zero constant term in g(x) or
h(x) (since that would imply that some sides are labelled “0”) or any negative term. It follows that we
need to assign one copy of each “x” factor to g(x) and h(x). Moreover, g(1) = h(1) = 6 (the number
of sides), so we need to assign one copy of each “$(x^2 + x + 1)$” and “$(x + 1)$” factor to g(x) and
h(x) as well (at x = 1 these factors take the values 3 and 2, respectively, whereas the remaining factors take the value 1).
It remains to distribute the two “$(x^2 - x + 1)$” factors. If we give one copy to each of
g(x) and h(x), we get an ordinary pair of dice. Otherwise, we get

$$g(x) = x(x + 1)(x^2 + x + 1) = x + 2x^2 + 2x^3 + x^4$$

$$h(x) = x(x + 1)(x^2 + x + 1)(x^2 - x + 1)^2 = x + x^3 + x^4 + x^5 + x^6 + x^8,$$

which corresponds to two dice: {1, 2, 2, 3, 3, 4} and {1, 3, 4, 5, 6, 8}.
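The identity g(x)h(x) = f(x)² can be double-checked by multiplying the polynomials directly. In the following Julia sketch (our own illustration; `polymul` is a hypothetical helper, not a library function) a polynomial is stored as the vector of its coefficients, starting with the constant term.

```julia
# Multiply two polynomials given by their coefficient vectors
# (entry i holds the coefficient of x^(i-1)).
function polymul(p, q)
    r = zeros(Int, length(p) + length(q) - 1)
    for (i, a) in enumerate(p), (j, b) in enumerate(q)
        r[i + j - 1] += a * b
    end
    return r
end

f = [0, 1, 1, 1, 1, 1, 1]          # x + x^2 + ... + x^6, an ordinary die
g = [0, 1, 2, 2, 1]                # x + 2x^2 + 2x^3 + x^4
h = [0, 1, 0, 1, 1, 1, 1, 0, 1]    # x + x^3 + x^4 + x^5 + x^6 + x^8

println(polymul(g, h) == polymul(f, f))
println(polymul(f, f))             # coefficient of x^s = number of ways to roll s
```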

REMARKS

Alternatively, one could have solved this problem using a computer in the following way. Observe first
that there is only one way to get the sum to be equal to 2 or 12. This means that each die must have
exactly one 1 and both of them must have a unique maximum value (which can be different); moreover, the
two maximum values must add up to 12. Now, observe that this unique maximum value must be greater than 3;
otherwise, one of the dice would be {1, 2, 2, 2, 2, 3} and the other die would have to have 1, 9 and four
numbers that are between 2 and 8, but then there are too many ways to get the sum to be equal to 11.
Since the two maximum values add up to 12 and each of them is at least 4, the maximum
number on any die is at most 8. One can easily enumerate all such dice and check which of them meet
the required criteria.
Here is a basic program written in the Julia language that performs this brute-force check. We did not
optimize it for speed as its run-time is under one second anyway.

using Base.Iterators
function listdies()
    ref = [1,2,3,4,5,6,5,4,3,2,1] # reference distribution
    # traverse all possible die configurations with one 1
    for d1 in product(1, 2:8, 2:8, 2:8, 2:8, 2:8)
        for d2 in product(1, 2:8, 2:8, 2:8, 2:8, 2:8)
            # filter options to avoid reporting duplicates
            if issorted(d1) && issorted(d2) && d1 <= d2
                # x will be a 6x6 matrix storing possible sums
                x = [a1 + a2 for a1 in d1, a2 in d2]
                # check if counts of sums equals what we want
                if [count(v -> v==s, x) for s in 2:12] == ref
                    # print the result on the screen on success
                    println((d1, d2))
                end
            end
        end
    end
end
Running listdies() confirms that there are exactly two solutions of the problem:

julia> listdies()
((1, 2, 2, 3, 3, 4), (1, 3, 4, 5, 6, 8))
((1, 2, 3, 4, 5, 6), (1, 2, 3, 4, 5, 6))
For this problem, some readers would probably find it faster to write and run a program
similar to the one shown above than to write out generating functions and appropriately group their factors.
Note that in many problems it can be useful to generate the solution with the help of a computer, as it
might give an insight into how one can solve the problem, or at least confirm whether an idea is correct.

EXERCISES

4.9.1. Consider the Sicherman dice problem in which the restriction that each side is labelled with a
positive integer is relaxed to any integer, not necessarily positive. Can you design more pairs of dice?

4.9.2. Solve the recurrence $x_{n+1} = x_n + 2x_{n-1}$ for n ∈ N, with $x_0 = 0$ and $x_1 = 1$. Verify your
solution using induction.

4.9.3. Your friend wants to play the following game with you. You toss three 6-sided fair dice and
calculate the sum of outcomes. For every game you have to pay 1 dollar. If the sum is 10 or 11
you get 4 dollars, otherwise you get nothing. Is this game fair?
Chapter 5
Number Theory

5.1 Greatest Common Divisors


5.2 Modular Arithmetic
5.3 Factorization
5.4 Fermat's Little Theorem and Euler's Theorem
5.5 Rules of Divisibility
5.6 Remainders
5.7 Aggregation
5.8 Equations

As usual, we start the chapter with some basic definitions.

THEORY

Divisibility For any two integers a and b, we say that a divides b (or a is a
divisor of b) if and only if b/a ∈ Z; that is, b = ak for some k ∈ Z. If a
divides b, then we write a | b; otherwise, we write a ∤ b.
For example, 5 | 15 since 15/5 = 3 ∈ Z. On the other hand, 5 ∤ 7.
Indeed, for a contradiction suppose that 5 | 7; that is, 7 = 5k for some
k ∈ Z. But this implies that k = 7/5 ∉ Z, which gives the desired
contradiction.
For any b ∈ Z, we have 1 | b and b | b (since b/1 = b ∈ Z and
b/b = 1 ∈ Z). On the other hand, for any integer a > 1, a ∤ 1 (since
1/a ∉ Z). In fact, the following more general property holds:

for all a, b ∈ N, if a > b, then a ∤ b

(5.1)

(again, since 0 < b/a < 1 and so b/a ∉ Z).
Let us note that divisibility is transitive; that is, if a | b and b | c, then
a | c. Indeed, suppose a | b and b | c. It means that there exist k, ℓ ∈ Z
such that b = ak and c = bℓ. Thus, c = bℓ = (ak)ℓ = a(kℓ) and, since
kℓ ∈ Z, we get that a | c.

Prime Numbers A positive integer p is said to be prime if and only if it has
exactly two distinct positive divisors, namely, 1 and p. A positive integer p
is said to be composite if it has more than two distinct positive divisors.
The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, …. Note that 1
has exactly one positive divisor so it is neither prime nor composite. On the
other hand, it is easy to see that every integer greater than 1 is either prime
or composite. Finally, let us mention that the set of prime numbers is
infinite. We provide a proof of this fact in Section 4.1 as an example of a
famous constructive argument due to Euclid.

Fundamental Theorem of Arithmetic The fundamental theorem of arithmetic
(also known as the unique factorization theorem) states that every integer
n ≥ 2 can be represented in exactly one way as a product of prime powers;
that is,

$$n = \prod_{i=1}^{k} p_i^{\alpha_i} = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},$$

where $2 \le p_1 < p_2 < \ldots < p_k$ are prime numbers and $\alpha_i \in N$, $i \in [k]$.
Before we prove the fundamental theorem of arithmetic, let us mention
that it can be alternatively stated as follows: for each integer n ≥ 2, there
exists a unique function $\alpha_n : P \to N \cup \{0\}$ from the set of all prime
numbers P to the set of non-negative integers such that $n = \prod_{p\in P} p^{\alpha_n(p)}$.
This notation is often convenient; for example, it can be used to express the
well-known Legendre’s formula, which gives the
exponent of the largest power of a prime p that divides the factorial n!:

$$\alpha_{n!}(p) = \sum_{i=1}^{\infty} \left\lfloor \frac{n}{p^i} \right\rfloor.$$

(5.2)

Indeed, since n! = 1 ⋅ 2 ⋅ … ⋅ n, we obtain at least one factor of p for each
multiple of p in [n], of which there are ⌊n/p⌋. Each multiple of $p^2$
contributes an additional factor of p, etc. For example,

$$8! = 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8 = 2^7 \cdot 3^2 \cdot 5 \cdot 7.$$

The exponent of 2 can be computed by Legendre’s formula as follows:

$$\alpha_{8!}(2) = \sum_{i=1}^{\infty} \left\lfloor \frac{8}{2^i} \right\rfloor = \left\lfloor \frac{8}{2} \right\rfloor + \left\lfloor \frac{8}{4} \right\rfloor + \left\lfloor \frac{8}{8} \right\rfloor = 4 + 2 + 1 = 7.$$

Now, let us move to the proof. The main ingredient is the following
observation: every integer greater than 1 is divisible by a prime. The claim
is clearly true when n is a prime, as n | n. Let us then concentrate on
composite integers. For a contradiction, suppose that there are composite
numbers not divisible by any prime; let us call them bad. Let N be the
smallest bad number. (As we will see soon, it is convenient to concentrate
on the smallest potential bad number, as it means that no smaller composite
number is bad.) By the definition of composite numbers, N has at least 3
distinct divisors; in particular, there must be some divisor d | N with
d ≠ 1 and d ≠ N. By (5.1), if d | N then d ≤ N and therefore
1 < d < N. By assumption, d is not prime. Since d is a composite number
less than N, d is not bad and so there is a prime p such that p | d (recall that
N is the smallest bad number). Since divisibility is transitive, we get that
p | d together with d | N implies that p | N; that is, N is divisible by a
prime. We get the desired contradiction, which finishes the proof.


With this observation in hand we can easily prove that for any integer
n ≥ 2, there exists a function $\alpha_n : P \to N \cup \{0\}$ such that
$n = \prod_{p\in P} p^{\alpha_n(p)}$ (the proof of its uniqueness is slightly longer and so we
skip it here). For a contradiction, suppose that there is some integer greater
than 1 for which there is no such function; let n be the smallest such
example. Note that n cannot be a prime number, for otherwise $\alpha_n(n) = 1$
and $\alpha_n(p) = 0$ for p ≠ n would be such a function (clearly, $n = n^1$). By
our observation, there is a prime q such that q | n. Since n is not a prime,
q < n. Thus, 1 < n/q < n and so n/q is associated with a function $\alpha_{n/q}$:

$$n/q = \prod_{p\in P} p^{\alpha_{n/q}(p)}.$$

(Recall that n is the smallest example.) But it implies that

$$n = q \prod_{p\in P} p^{\alpha_{n/q}(p)}$$

and so

$$\alpha_n(p) = \begin{cases} \alpha_{n/q}(p) & \text{if } p \ne q\\ \alpha_{n/q}(q) + 1 & \text{if } p = q \end{cases}$$

is the desired function associated with n, which gives us a contradiction.
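The existence argument is effectively an algorithm: find a prime divisor q of n, record it, and continue with n/q. The following Julia sketch is our own illustration (the name `factorize` is hypothetical) and is meant for small inputs only. It uses trial division, relying on the observation above: the smallest divisor of n greater than 1 is always a prime.

```julia
# Exponents α_n(p) of the primes dividing n, found by trial division.
function factorize(n)
    α = Dict{Int,Int}()
    q = 2
    while n > 1
        if n % q == 0
            α[q] = get(α, q, 0) + 1   # q is the smallest divisor > 1, hence prime
            n ÷= q                    # continue with n / q
        else
            q += 1
        end
    end
    return α
end

println(factorize(factorial(8)))      # exponents in 8! = 2^7 * 3^2 * 5 * 7
```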

5.1 Greatest Common Divisors


SOURCE

Problem: PLMO XXXIV – Phase 3 — Problem 4


Solution: our own

PROBLEM

Consider any a, b, c, d ∈ N such that ab = cd. Prove that

gcd(a, c) ⋅ gcd(a, d ) = a ⋅ gcd(a, b, c, d ) .

THEORY

Greatest Common Divisors The greatest common divisor of two integers a
and b that are not both zero, denoted gcd(a, b), is the largest integer d such
that d | a and d | b. For example, gcd(12, 24) = 12, gcd(16, 20) = 4,
gcd(3, 5) = 1, and gcd(7, 0) = 7. Two natural numbers a, b are said to be
relatively prime (or co-prime) if gcd(a, b) = 1. Finally, let us mention that
this definition can be easily generalized: the greatest common divisor of k
integers $a_1, a_2, \ldots, a_k$ with at least one non-zero value, denoted
$\gcd(a_1, a_2, \ldots, a_k)$, is their largest common divisor.

Quotient-Remainder Theorem This useful property, which is often called
the division algorithm, is in fact a simple idea that comes directly from long
division. For any n ∈ Z and d ∈ N, there exist unique integers q and r such
that

n = dq + r and 0 ≤ r < d.
We call q the quotient and r the remainder of n when divided by d.
For example,

if n = 12, d = 4, then q = 3, r = 0 (12 = 4 ⋅ 3 + 0),

if n = 148, d = 3, then q = 49, r = 1 (148 = 3 ⋅ 49 + 1),

if n = −21, d = 4, then q = −6, r = 3 (−21 = 4 ⋅ (−6) + 3).

In general, q = ⌊n/d⌋, r = n − qd = n mod d. (Here, ⌊x⌋ is the floor of a
real number x that is defined as the largest integer not greater than x.)
There are two parts of the proof. First of all, we need to show that for any
n ∈ Z and d ∈ N there exist q and r with the desired properties. In the
second part, we need to show that this pair is unique. We will prove these
two parts independently.
Existence: Let n ∈ Z and d ∈ N. Put q = ⌊n/d⌋ and
r = n − dq = n − d⌊n/d⌋. Clearly, q, r ∈ Z and n = dq + r so it remains
to show that 0 ≤ r < d. From the definition of the floor function ⌊⋅⌋ it
follows that 0 ≤ n/d − ⌊n/d⌋ < 1, so 0 ≤ r/d < 1 and the assertion
follows.
Uniqueness: Let n ∈ Z and d ∈ N and take any integers $q_1, q_2, r_1, r_2$
such that $n = dq_1 + r_1 = dq_2 + r_2$ and $0 \le r_1, r_2 < d$. We will show that
$r_1 = r_2$ and then that also $q_1 = q_2$. For a contradiction, suppose $r_1 \ne r_2$.
Without loss of generality, we may assume $r_1 < r_2$. Then, since
$dq_1 + r_1 = dq_2 + r_2$, we get that $d(q_1 - q_2) = dq_1 - dq_2 = r_2 - r_1$ and so
$d \mid r_2 - r_1$. But $0 \le r_1 < r_2 < d$ and thus $d > r_2 - r_1 > 0$, which yields
a contradiction with (5.1). Therefore, $r_1 = r_2$. To finish the proof, it is enough to
notice that the equation $dq_1 + r_1 = dq_2 + r_2$ implies that $dq_1 = dq_2$, which
in turn implies that $q_1 = q_2$.
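In Julia, the quotient and remainder in the sense of this theorem are given by the built-in functions `fld` (floored division) and `mod`. The short sketch below is our own illustration (`quotient_remainder` is a hypothetical name) and reproduces the three examples given above. Note that `div` and `rem` round towards zero instead, so they give a different answer for negative n.

```julia
# Quotient q = ⌊n/d⌋ and remainder r = n - qd with 0 <= r < d
quotient_remainder(n, d) = (fld(n, d), mod(n, d))

for (n, d) in [(12, 4), (148, 3), (-21, 4)]
    q, r = quotient_remainder(n, d)
    println("n = $n, d = $d: q = $q, r = $r")
end

# Truncated division is not what the theorem describes for negative n:
println((div(-21, 4), rem(-21, 4)))
```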

Euclidean Algorithm The Euclidean algorithm (or Euclid’s algorithm) is
an efficient method for computing the greatest common divisor of two
numbers. It is based on the following observation. Let a, b ∈ N with a > b.
Let r be the remainder of a when divided by b (a = bq + r). Then,

gcd(a, b) = gcd(b, r).

Clearly, b + r < a + b and so one can repeat this procedure until reaching
gcd(d, 0) for some d ∈ N, and then the algorithm stops since gcd(d, 0) = d.
In order to prove the above key observation, let a, b ∈ N with a > b and
let d = gcd(a, b), e = gcd(b, r). Since d = gcd(a, b), d | a, d | b, and
thus also d | bq and d | a − bq = r. It follows that d is a common divisor
of b and r. Since e is the greatest such divisor, d ≤ e. Similarly, since
e = gcd(b, r), e | b, e | r, and thus also e | bq and e | bq + r = a. We
get that e is a common divisor of a and b. Since d is the greatest such
divisor, e ≤ d. To summarize, we have shown that d ≤ e and e ≤ d, so we
get d = e.
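The observation translates directly into a short loop. The following Julia sketch is our own illustration (the name `euclid` is hypothetical; Julia also provides a built-in `gcd`): it repeatedly replaces the pair (a, b) with (b, r) until the second entry becomes 0.

```julia
# Euclidean algorithm: gcd(a, b) = gcd(b, r), where r = a mod b
function euclid(a, b)
    while b != 0
        a, b = b, mod(a, b)
    end
    return a          # gcd(d, 0) = d
end

println(euclid(425, 112))
println(euclid(16, 20) == gcd(16, 20))   # agrees with the built-in function
```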

Extended Euclidean Algorithm Let us mention the following
straightforward but useful implication of the Euclidean algorithm. For any
a, b ∈ N, there exist x, y ∈ Z such that

ax + by = gcd(a, b).

Instead of proving this property, we will show one example which not only
should convince the reader that the property above holds but also
demonstrates how to actually find x, y which satisfy the desired equation.
Let us find integers x, y such that 425x + 112y = 1. The first step is to
find gcd(425, 112) using the Euclidean algorithm:

425 = 3 ⋅ 112 + 89 (⇒ 89 = 425 − 3 ⋅ 112)

112 = 1 ⋅ 89 + 23 (⇒ 23 = 112 − 89)

89 = 3 ⋅ 23 + 20 (⇒ 20 = 89 − 3 ⋅ 23)

23 = 1 ⋅ 20 + 3 (⇒ 3 = 23 − 20)

20 = 6 ⋅ 3 + 2 (⇒ 2 = 20 − 6 ⋅ 3)

3 = 1 ⋅ 2 + 1 (⇒ 1 = 3 − 2)

2 = 2 ⋅ 1 + 0,

so gcd(425, 112) = 1. Now, one can reverse all the operations to get:
1 = 3 − 2

= 3 − (20 − 6 ⋅ 3) = −20 + 7 ⋅ 3

= −20 + 7(23 − 20) = −8 ⋅ 20 + 7 ⋅ 23

= −8(89 − 3 ⋅ 23) + 7 ⋅ 23 = 31 ⋅ 23 − 8 ⋅ 89

= 31(112 − 89) − 8 ⋅ 89 = −39 ⋅ 89 + 31 ⋅ 112

= −39(425 − 3 ⋅ 112) + 31 ⋅ 112 = 148 ⋅ 112 − 39 ⋅ 425.

Hence, x = −39 and y = 148.
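The back-substitution can be avoided by carrying the coefficients along while running the Euclidean algorithm. The following Julia sketch is our own illustration (the name `extended_euclid` is hypothetical) of this standard iterative variant; it is not the reversal procedure shown above, although it arrives at the same pair for our example. Throughout the loop, the current values of a and b are equal to a₀x₀ + b₀y₀ and a₀x₁ + b₀y₁, respectively, where a₀ and b₀ denote the original inputs.

```julia
# Returns (g, x, y) with a*x + b*y == g == gcd(a, b)
function extended_euclid(a, b)
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0
        q = fld(a, b)
        a, b = b, a - q * b            # one step of the Euclidean algorithm
        x0, x1 = x1, x0 - q * x1       # update the coefficients accordingly
        y0, y1 = y1, y0 - q * y1
    end
    return a, x0, y0
end

g, x, y = extended_euclid(425, 112)
println((g, x, y))
println(425 * x + 112 * y == g)
```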


Finally, note that one can use the fundamental theorem of arithmetic to
get that for any a, b ∈ N we have

$$\gcd(a, b) = \prod_{p\in P} p^{\min\{\alpha_a(p),\, \alpha_b(p)\}}.$$

(5.3)

In particular, this observation shows that if one wants to compute gcd(a, b),
then each prime factor can be considered independently. Let us also notice
that gcd(a, a) = a, gcd(a, 1) = 1, gcd(a, b) = gcd(b, a) and
gcd(a, b, c) = gcd(gcd(a, b), c). The last equality follows from (5.3) and
the fact that for any i, k, ℓ ∈ N ∪ {0}, min{i, k, ℓ} = min{min{i, k}, ℓ}.

SOLUTION

As mentioned above, each prime factor can be considered independently.
Hence, without loss of generality, we may assume that
$a = p^{\alpha}$, $b = p^{\beta}$, $c = p^{\gamma}$, and $d = p^{\delta}$
for some prime p and non-negative
integers α, β, γ, and δ. Our task is to show that if α + β = γ + δ, then

min{α, γ} + min{α, δ} = α + min{α, β, γ, δ}.

We will independently consider the following three cases, which will finish
the proof.
Case 1: α = min{α, β, γ, δ}. Both the left hand side and the right hand
side are clearly equal to 2α.
Case 2: β = min{α, β, γ, δ}. Since α + β = γ + δ, we get
α = γ + (δ − β) ≥ γ. Using the same argument, we get that α ≥ δ. It
follows that the left hand side is equal to γ + δ whereas the right hand side
is equal to α + β. The sides are equal by assumption.
Case 3: γ = min{α, β, γ, δ} (the case when δ = min{α, β, γ, δ} can be
dealt with the same way). Arguing as before, note that δ ≥ α and conclude
that both sides are equal to α + γ.
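Before attempting a proof of such an identity, it can be reassuring to test it on small cases. The following Julia sketch is our own illustration (the function name `check_identity` and the bound 30 are arbitrary choices); it goes through all triples a, b, c up to a given limit for which d = ab/c is an integer, and checks the claimed equality using Julia's built-in `gcd`.

```julia
# Check gcd(a, c) * gcd(a, d) == a * gcd(a, b, c, d) whenever ab = cd
function check_identity(limit)
    for a in 1:limit, b in 1:limit, c in 1:limit
        (a * b) % c == 0 || continue      # we need d = ab/c to be a natural number
        d = (a * b) ÷ c
        lhs = gcd(a, c) * gcd(a, d)
        rhs = a * gcd(gcd(a, b), gcd(c, d))
        lhs == rhs || return false
    end
    return true
end

println(check_identity(30))
```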
REMARKS

Alternatively, one could solve this problem by providing a proof by
induction on a. The base case is easy. Suppose that a = 1 and b, c, d ∈ N
satisfy the required assumption, that is, ab = cd. Then, both the left hand
side and the right hand side are equal to 1.
For the inductive step, let us fix any quadruple a, b, c, d ∈ N such that
a ≥ 2 and ab = cd. Let q be a prime number such that q | a. Since
ab = cd, we get that q | cd and so, without loss of generality, we may
assume that q | c. Our inductive hypothesis is that the desired equality
holds for any quadruple a′, b′, c′, d′ ∈ N satisfying a′ < a and a′b′ = c′d′.
In particular, since (a/q)b = (c/q)d and a/q < a, we have

gcd(a/q, c/q) ⋅ gcd(a/q, d) = (a/q) ⋅ gcd(a/q, b, c/q, d).

Clearly, gcd(a, c) = q ⋅ gcd(a/q, c/q). We independently consider the
following two cases:
Case 1: gcd(a, d) = gcd(a/q, d); that is, the power of q in the (unique)
factorization of d is smaller than the corresponding power for a. Hence,
since ab = cd, the power of q in the factorization of b is smaller than the
corresponding power for c; that is, gcd(b, c) = gcd(b, c/q). It follows that

gcd(a, c) ⋅ gcd(a, d) = q ⋅ gcd(a/q, c/q) ⋅ gcd(a/q, d)

= q ⋅ (a/q) ⋅ gcd(a/q, b, c/q, d)

= a ⋅ gcd(gcd(a/q, d), gcd(b, c/q))

= a ⋅ gcd(gcd(a, d), gcd(b, c))

= a ⋅ gcd(a, b, c, d).

Case 2: gcd(a, d) = q ⋅ gcd(a/q, d); that is, the power of q in the
(unique) factorization of d is at least the corresponding power for a. As
before, since ab = cd, the power of q in the factorization of b is at least the
corresponding power for c; that is, gcd(b, c) = q ⋅ gcd(b, c/q). It follows
that

gcd(a, c) ⋅ gcd(a, d) = q² ⋅ gcd(a/q, c/q) ⋅ gcd(a/q, d)

= q² ⋅ (a/q) ⋅ gcd(a/q, b, c/q, d)

= q ⋅ a ⋅ gcd(gcd(a/q, d), gcd(b, c/q))

= q ⋅ a ⋅ gcd(gcd(a, d)/q, gcd(b, c)/q)

= q ⋅ a ⋅ gcd(gcd(a, d), gcd(b, c))/q

= a ⋅ gcd(a, b, c, d).

In both cases, the desired equality holds and so the claim holds by
induction.

EXERCISES

5.1.1. A positive fraction a/b is said to be in lowest terms if gcd(a, b) = 1.
Prove that if a positive fraction a/b is in lowest terms, then the fraction

$$(a + b)/(a^2 + ab + b^2)$$

is also in lowest terms.

5.1.2. You are given two natural numbers a and b. Prove that if $a + b \mid a^2$,
then $a + b \mid b^2$.

5.1.3. Consider a set A of four digit numbers whose decimal representation
uses precisely two digits; moreover, both of them are non-zero. Let
f : A → A be the function such that f(a) flips the digits of a ∈ A (for
example, f(1333) = 3111). Find n > f(n) for which gcd(n, f(n)) is as
large as possible.

5.2 Modular Arithmetic


SOURCE

Problem and solution idea: PLMO LXX – Phase 1 – Problem 10


PROBLEM

Suppose that

$$a^2 + b^2 + c^2 + d^2 = 2018!$$

(5.4)

for some a, b, c, d ∈ N. Show that each of a, b, c, d is greater than $10^{250}$.


THEORY

Congruence Let a, b, n be integers with n ≠ 0. We say that a is congruent
to b modulo n, and write a ≡ b (mod n), if and only if n | (a − b). For
example, 127 ≡ 7 (mod 4), 127 ≡ 3 (mod 4), and
127 ≡ −1 (mod 4).
Clearly, the congruence is reflexive (that is, a ≡ a (mod n) for any
a ∈ Z) and symmetric (that is, if a ≡ b (mod n), then b ≡ a (mod n)).
Moreover, it is also transitive; that is, if a ≡ b (mod n) and
b ≡ c (mod n), then a ≡ c (mod n). Indeed, suppose that
a ≡ b (mod n) and b ≡ c (mod n). Since a ≡ b (mod n),
a − b = nk for some k ∈ Z. Similarly, since b ≡ c (mod n), b − c = nℓ
for some ℓ ∈ Z. Thus, a − c = (a − b) + (b − c) = n(k + ℓ) and
k + ℓ ∈ Z. It follows that a ≡ c (mod n).

The following straightforward observation is quite useful. Fix n ∈ N and
suppose that a ≡ b (mod n) and c ≡ d (mod n). Then,

a ± c ≡ b ± d (mod n),

(5.5)

ac ≡ bd (mod n).

(5.6)

Since a ≡ b (mod n) and c ≡ d (mod n), we have a − b = nk and
c − d = nℓ for some k, ℓ ∈ Z. Since
(a ± c) − (b ± d) = (a − b) ± (c − d) = n(k ± ℓ) and k ± ℓ ∈ Z, we get
that a ± c ≡ b ± d (mod n) and so (5.5) holds. In order to prove (5.6),
we observe that a = nk + b, c = nℓ + d, and so
$ac = n^2 k\ell + nkd + n\ell b + bd$. It follows that
$ac - bd = n^2 k\ell + nkd + n\ell b = n(nk\ell + kd + \ell b)$ and
nkℓ + kd + ℓb ∈ Z, which finishes the proof.
As already mentioned, these properties are incredibly useful in
computations, since when we multiply numbers in modular arithmetic
(modulo n), we do not have to deal with factors larger than n. To see the
power of these properties, let us answer the following question: what is the
ones digit of $173^{2019}$? Reformulating the question in the language of
modular arithmetic, our goal is to find a (unique) digit x ∈ {0, 1, …, 9}
such that $173^{2019} \equiv x \pmod{10}$. Using (5.6) we notice that
$173^{2019} \equiv 3^{2019} \pmod{10}$, which simplifies our task. With one more trick
(and applying (5.6) a few times more), we get

$$
\begin{aligned}
173^{2019} \equiv 3^{2019} &= 3^{2\cdot 1009 + 1} = (3^2)^{1009} \cdot 3\\
&= 9^{1009} \cdot 3 \equiv (-1)^{1009} \cdot 3 = -3 \equiv 7 \pmod{10}.
\end{aligned}
$$

It follows that the ones digit of $173^{2019}$ is 7.


To practice a bit more, let us answer a similar question for which a
different argument is needed: what is the ones digit of $2^{2019}$? This time we
observe that

$$
\begin{aligned}
2^1 &= 2 \equiv 2 \pmod{10}\\
2^2 &= 2 \cdot 2 \equiv 2 \cdot 2 = 4 \pmod{10}\\
2^3 &= 2^2 \cdot 2 \equiv 4 \cdot 2 = 8 \pmod{10}\\
2^4 &= 2^3 \cdot 2 \equiv 8 \cdot 2 = 16 \equiv 6 \pmod{10}\\
2^5 &= 2^4 \cdot 2 \equiv 6 \cdot 2 = 12 \equiv 2 \pmod{10}\\
2^6 &= 2^5 \cdot 2 \equiv 2 \cdot 2 = 4 \pmod{10}\\
2^k &= 2^{k-1} \cdot 2 \equiv \ldots
\end{aligned}
$$

We get the following pattern: 2, 4, 8, 6, 2, 4, 8, 6, 2, …. Thus we have shown
that if x is even then $2^4 \cdot x \equiv x \pmod{10}$. Now, since
2019 ≡ 3 (mod 4), $2^{2019} \equiv 2^3 = 8 \pmod{10}$ and so the ones digit of
$2^{2019}$ is 8.
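Both answers are easy to confirm with a computer. The following Julia sketch is our own illustration (`last_digit_of_power` is a hypothetical name): it multiplies step by step and, thanks to (5.6), reduces modulo 10 after every multiplication, so no large numbers ever appear. Julia's built-in `powermod` computes the same quantity much faster.

```julia
# Ones digit of base^exponent, reducing modulo 10 after each step
function last_digit_of_power(base, exponent)
    result = 1
    b = mod(base, 10)              # by (5.6), only the residue of the base matters
    for _ in 1:exponent
        result = mod(result * b, 10)
    end
    return result
end

println(last_digit_of_power(173, 2019), " ", powermod(173, 2019, 10))
println(last_digit_of_power(2, 2019), " ", powermod(2, 2019, 10))
```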

Multiplicative Inverse We finish this section with a short discussion about
division in modular arithmetic. The key observation is as follows:

if gcd(a, n) = 1, then there exists b ∈ Z such that ab ≡ 1 (mod n).

(5.7)

Indeed, by the Extended Euclidean Algorithm, gcd(a, n) = 1 implies that there
exist b, k ∈ Z such that ab + kn = 1. Reducing both sides mod n yields the
result. Let us note that b is called the multiplicative inverse of a mod n,
denoted by $a^{-1}$.
One can use the Extended Euclidean Algorithm to find inverses mod n.
For example, the multiplicative inverse of 5 mod 22 is 9. Indeed,
22 = 4 ⋅ 5 + 2 and 5 = 2 ⋅ 2 + 1, so
1 = 5 − 2 ⋅ 2 = 5 − 2 ⋅ (22 − 4 ⋅ 5) = 9 ⋅ 5 − 2 ⋅ 22. Thus,
9 ⋅ 5 ≡ 1 (mod 22). Let us also remark that the assumption in (5.7) that
gcd(a, n) = 1 is necessary; if gcd(a, n) > 1, then no such b ∈ Z exists. Indeed,
towards a contradiction, suppose that there exists b ∈ Z such that
ab ≡ 1 (mod n). Then 1 = ab − nk for some k ∈ Z. But
gcd(a, n) = d > 1 divides the right hand side, so d | 1. We get the desired
contradiction.
Here is another useful property: if gcd(a, n) = 1, then

ax ≡ ay (mod n) implies that x ≡ y (mod n).

(5.8)

Indeed, since gcd(a, n) = 1, one can multiply both sides by $a^{-1}$
to get

$$
\begin{aligned}
a^{-1}ax &\equiv a^{-1}ay \pmod{n}\\
1x &\equiv 1y \pmod{n}\\
x &\equiv y \pmod{n}.
\end{aligned}
$$
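For small moduli the inverse can also be found by simply trying all candidates. The following Julia sketch is our own illustration (`inverse_bruteforce` is a hypothetical name); it searches for b ∈ {1, …, n − 1} with ab ≡ 1 (mod n) and compares the answer with Julia's built-in `invmod`. When gcd(a, n) > 1 the search fails, as the argument above predicts.

```julia
# Smallest b in {1, ..., n-1} with a*b ≡ 1 (mod n), or `nothing` if none exists.
# (For the range 1:n-1 the index returned by findfirst equals the value itself.)
inverse_bruteforce(a, n) = findfirst(b -> mod(a * b, n) == 1, 1:n-1)

println(inverse_bruteforce(5, 22))   # inverse of 5 modulo 22
println(invmod(5, 22))               # the built-in modular inverse
println(inverse_bruteforce(4, 22))   # gcd(4, 22) = 2 > 1, so there is no inverse
```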

SOLUTION
For a contradiction, suppose that at least one of a, b, c, d is smaller than or
equal to $10^{250} < 16^{250} = 2^{1000}$. Let α be the largest non-negative integer
such that $2^{\alpha}$ divides each of a, b, c, d; that is, using the unique factorization
theorem,

$$\alpha = \min\{\alpha_a(2), \alpha_b(2), \alpha_c(2), \alpha_d(2)\}.$$

Then $a = 2^{\alpha}a'$, $b = 2^{\alpha}b'$, $c = 2^{\alpha}c'$, $d = 2^{\alpha}d'$, and at least one of a′, b′, c′, d′
is odd. Moreover, it follows from our assumption that α < 1000.
Let us now observe that the power of 2 in the (unique) factorization of
2018! can be calculated from Legendre’s formula (see (5.2)) as follows:

$$
\begin{aligned}
\alpha_{2018!}(2) &= \sum_{i=1}^{\infty} \left\lfloor \frac{2018}{2^i} \right\rfloor\\
&= 1009 + 504 + 252 + 126 + 63 + 31 + 15 + 7 + 3 + 1\\
&= 2011.
\end{aligned}
$$

It is often convenient to check such computations using a computer
program. Here is a simple script that implements Legendre’s formula in
the Julia language and calculates the value of $\alpha_{2018!}(2)$:

julia> function legendre(n, d)
           s = 0                # running sum of the terms ⌊n / d^i⌋
           q = 1                # current power of d
           while true
               q *= d
               a = div(n, q)
               if a == 0        # all further terms are zero
                   return s
               end
               s += a
           end
       end
legendre (generic function with 1 method)
julia> legendre(2018, 2)
2011
Hence, after dividing both sides of equation (5.4) by $(2^{\alpha})^2 = 2^{2\alpha}$, we get
that

$$a'^2 + b'^2 + c'^2 + d'^2 = \frac{2018!}{2^{2\alpha}}.$$

(5.9)

Since 2011 − 2α ≥ 3, the right hand side of equality (5.9) is congruent
to 0 (mod 8). On the other hand, note that if n is odd, then
$n^2 \equiv 1 \pmod{8}$, and if n is even, then $n^2 \equiv 0$ or $4 \pmod{8}$. Indeed,
if n = 2k + 1 for some k ∈ Z, then
$n^2 = (2k + 1)^2 = 4k(k + 1) + 1 \equiv 1 \pmod{8}$. Similarly, if n = 2k for
some k ∈ Z, then $n^2 = 4k^2 \equiv 0$ or $4 \pmod{8}$. It follows that at least
one of $a'^2, b'^2, c'^2, d'^2$ is congruent to 1 (mod 8) and all of them are
congruent to 0, 1, or 4 (mod 8). It is easy to see then that the left hand
side of equality (5.9) is not congruent to 0 (mod 8): it is congruent to 1 + s,
where s is a sum of three numbers from {0, 1, 4}, so that s ≤ 12 and s ≠ 7. We get the
desired contradiction.
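The final claim involves only finitely many cases, so it can be checked exhaustively. The following Julia sketch is our own illustration (the function name `sum_can_vanish_mod8` is hypothetical): it first computes the possible residues of squares modulo 8 and then tests every combination of four residues that contains at least one 1.

```julia
# Can four squares, at least one of them odd, add up to 0 modulo 8?
function sum_can_vanish_mod8()
    residues = sort(unique([mod(n^2, 8) for n in 0:7]))   # squares modulo 8
    for w in residues, x in residues, y in residues, z in residues
        if 1 in (w, x, y, z) && mod(w + x + y + z, 8) == 0
            return true
        end
    end
    return false
end

println(sort(unique([mod(n^2, 8) for n in 0:7])))
println(sum_can_vanish_mod8())
```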
REMARKS

The key idea in our solution is to notice that if the right hand side of
equality (5.4) is divisible by 8, then each of a, b, c, d must be divisible by 2,
as explained above. With this observation in hand, one can divide both sides
of equality (5.4) by 4, and repeat the process recursively as long as the right
hand side is divisible by 8. It shows that each of the four numbers must be
large. The proof presented above uses this idea but avoids the recursive
argument by noticing that in the unique factorization of 2018! prime
number 2 is raised to a large power.
EXERCISES

5.2.1. Find all primes p for which $p^2 + 2$ is also prime.

5.2.2. You are given three consecutive natural numbers (say, a, a + 1, and
a + 2) such that the middle one is a cube (that is, $a + 1 = \ell^3$ for some
ℓ ∈ N). Prove that their product is divisible by 504.
(Source of the problem and solution idea: PLMO IX – Phase 3 – Problem
1.)

5.2.3. Prove that for any natural n ∈ N that is not divisible by 10 there
exists k ∈ N such that nk has in its decimal representation the same digit at
the first and the last position.
(Source of the problem and solution idea: “Delta” monthly – January,
2017.)

5.3 Factorization
SOURCE

Problem and solution idea: PLMO LXVIII – Phase 2 – Problem 1

PROBLEM

Consider any prime number p > 2. Prove that there exists exactly one
n ∈ N such that $n^2 + np$ is a square; that is, $n^2 + np = k^2$ for some
k ∈ N.

THEORY

In this section, we will use a basic but useful fact that follows immediately
from the fundamental theorem of arithmetic:

if p is a prime number and p | ab, then p | a or p | b .

(5.10)

For example, we will use (5.10) to show that √2 is irrational. (Recall that a
real number q is rational if q = a/b for some a, b ∈ Z and b ≠ 0; a real
number r is irrational if it is not rational.) For a contradiction, suppose that
there are integers a, b, b ≠ 0 such that √2 = a/b. We may assume that
gcd(a, b) = 1 (we may write √2 in lowest terms). Now, note that
√2 = a/b implies that a = b√2 which, in turn, implies that $a^2 = 2b^2$; that
is, $2 \mid a^2$. By (5.10), $2 \mid a^2$ implies that 2 | a; that is, a = 2k for some
k ∈ Z. Now, $a^2 = 2b^2$ implies $4k^2 = 2b^2$ that, in turn, implies $2 \mid b^2$.
Using (5.10) again, we get 2 | b. Since 2 | a and 2 | b,
gcd(a, b) ≥ 2 > 1 and we get the desired contradiction.

Another useful consequence of (5.10) is that if p = ab for some prime
number p and two integers a, b, then there are only four possibilities for a
pair (a, b), namely, (p, 1), (−p, −1), (1, p), and (−1, −p). Alternatively, if
one insists that a, b are natural numbers and a ≤ b, then (a, b) = (1, p).
Moreover, this observation can be easily generalized. For example, if
pq = abc for two prime numbers p, q (p ≤ q) and three natural numbers
a, b, c (a ≤ b ≤ c), then the triple (a, b, c) must be either (1, p, q) or
(1, 1, pq).

SOLUTION
Suppose that n^2 + np = k^2 for some k ∈ N; that is,

np = k^2 − n^2 = (k − n)(k + n).

Let x = gcd(k, n), let k′ = k/x ∈ N, and let n′ = n/x ∈ N. Then,

n′p = np/x = x(k′ − n′)(k′ + n′).

Since gcd(n′, k′) = 1, we have gcd(n′, k′ − n′) = 1 and
gcd(n′, k′ + n′) = 1. It follows that n′ has to divide x, that is,
x′ = x/n′ ∈ N and so

p = x′(k′ − n′)(k′ + n′).

(5.11)

After noting that p is a prime and k′ − n′ < k′ + n′, we get that the product
on the right hand side of (5.11) is unique: x′ = 1, k′ − n′ = 1, and
k′ + n′ = p. It follows that k′ = (p + 1)/2, n′ = (p − 1)/2, and so
k = k′x = k′n′ = (p^2 − 1)/4, n = n′x = n′^2 = (p − 1)^2/4 is the unique
solution of the system, provided that both n and k are natural numbers.
In order to finish the proof, we will use the fact that p is a prime greater
than 2; in particular, it is odd: p = 2ℓ + 1 for some ℓ ∈ N. We get that
k = (p^2 − 1)/4 = (4ℓ^2 + 4ℓ)/4 = ℓ^2 + ℓ ∈ N and
n = (p − 1)^2/4 = (2ℓ)^2/4 = ℓ^2 ∈ N, and the proof is finished.
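The conclusion of the solution can be sanity-checked by brute force. The sketch below is only an illustration: the helper names `is_square` and `square_witnesses` and the search bound are our own assumptions, and a finite search does not replace the proof.

```julia
# Hypothetical helper: true when m is a perfect square.
is_square(m) = isqrt(m)^2 == m

# All n in 1:bound for which n^2 + n*p is a perfect square
# (bound is an arbitrary cutoff chosen for this illustration).
square_witnesses(p; bound = 10_000) =
    [n for n in 1:bound if is_square(n^2 + n * p)]

for p in (3, 5, 7, 11, 13)
    println("p = ", p, ": found ", square_witnesses(p),
            ", formula (p - 1)^2 / 4 gives ", ((p - 1) ÷ 2)^2)
end
```

For each odd prime tested, the search should return exactly one value, and it should agree with (p − 1)^2/4.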
REMARKS

Alternatively one could notice that n^2 + np = k^2 is equivalent to

p^2 = (2n + p − 2k)(2n + p + 2k).

But this implies that 2n + p − 2k = 1 and 2n + p + 2k = p^2. From this we
get that

1 + p^2 = (2n + p − 2k) + (2n + p + 2k) = 4n + 2p

and so n = (p − 1)^2/4.
EXERCISES

5.3.1. You are given two integers, a and b, and a prime p > 2. Prove that if
p | a + b and p | a + b , then p | a + b .
2 2 2 2 2

(Source of the problem and solution idea: PLMO LXIX – Phase 1 –


Problem 1.)

5.3.2. You are given four integers a, b, c, and d. Prove that if
a − c | ab + cd, then a − c | ad + bc.

5.3.3. Consider any natural number n ≥ 2. Prove that n^12 + 64 has at least
four different non-trivial natural factors; that is, n^12 + 64 = a ⋅ b ⋅ c ⋅ d for
some a, b, c, d ∈ N such that 1 < a < b < c < d < n^12 + 64.

5.4 Fermat’s Little Theorem and Euler’s Theorem


SOURCE

Problem and solution idea: PLMO LXV – Phase 3 – Problem 5


PROBLEM
Find all x, y ∈ N such that 2^x + 17 = y^4.

THEORY

Fermat’s Little Theorem Fermat’s little theorem is one of the fundamental
results of elementary number theory. We state it as follows: if p is a prime
and n is an integer, then

n^p ≡ n (mod p),

or equivalently p | n^p − n. Before we prove this fact, let us mention that it
is often stated in the following form: if p is a prime and n is an integer not
divisible by p, then

n^{p−1} ≡ 1 (mod p),

or equivalently p | n^{p−1} − 1. It is easy to show that both statements are
equivalent, that is, if one of them is true then the other one must also be
true.
Fix any prime number p. We will prove Fermat’s little theorem by
induction on n. The base case is trivial: 1^p ≡ 1 (mod p). For the
inductive step, suppose that k^p ≡ k (mod p) for some k ∈ N. Our goal
is to show that (k + 1)^p ≡ k + 1 (mod p).
Let us first note that

(k + 1)^p = ∑_{i=0}^{p} \binom{p}{i} k^i = 1 + ∑_{i=1}^{p−1} \binom{p}{i} k^i + k^p.

Then, observe that for any i ∈ Z, 1 ≤ i ≤ p − 1, p | \binom{p}{i} or, equivalently,
\binom{p}{i} ≡ 0 (mod p). Indeed,

\binom{p}{i} = p(p − 1)⋯(p − i + 1) / (i(i − 1)⋯1);

p divides the numerator but not the denominator. It follows that
(k + 1)^p ≡ k^p + 1 (mod p). Finally, by inductive hypothesis, we know
that k^p ≡ k (mod p) and so (k + 1)^p ≡ k + 1 (mod p), as needed.
The proof is finished.
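Both ingredients of the proof are easy to test numerically. The following Julia sketch (helper names are ours, chosen for illustration) checks the congruence n^p ≡ n (mod p) on a range of non-negative n, and the divisibility of the inner binomial coefficients by p.

```julia
# n^p ≡ n (mod p) for every n in the range ns (non-negative n only in this sketch).
fermat_holds(p, ns) = all(powermod(n, p, p) == mod(n, p) for n in ns)

# p divides binomial(p, i) for every 1 ≤ i ≤ p − 1.
inner_binomials_vanish(p) = all(binomial(p, i) % p == 0 for i in 1:p-1)

for p in (2, 3, 5, 7, 11, 13)
    println("p = ", p, ": congruence holds = ", fermat_holds(p, 0:100),
            ", inner binomials divisible = ", inner_binomials_vanish(p))
end
```

For a composite modulus such as 4 the binomial property fails (binomial(4, 2) = 6), which is exactly where the inductive step would break down.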


Let us also mention that the converse of Fermat’s little theorem is not
generally true. Counter-examples, that is, composite numbers q for which
n^q ≡ n (mod q) for all integers n, are called Carmichael numbers. The
smallest Carmichael number is 561. Indeed, 561 = 3 ⋅ 11 ⋅ 17 so 561 is
composite. One can show that a^561 ≡ a (mod 561) for any a by
independently showing that a^561 ≡ a (mod 3), a^561 ≡ a (mod 11),
and a^561 ≡ a (mod 17).
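The claim about 561 can be confirmed by a direct search. This is a minimal sketch with our own helper names; `is_prime` is a naive trial-division test written here for self-containment, not a library function.

```julia
# Naive primality test by trial division (hypothetical helper).
is_prime(q) = q ≥ 2 && all(q % d != 0 for d in 2:isqrt(q))

# a^q ≡ a (mod q) for every residue a modulo q.
fermat_for_all(q) = all(powermod(a, q, q) == a for a in 0:q-1)

# Composite numbers below 2000 that satisfy the congruence for all a.
carmichael = [q for q in 4:2000 if !is_prime(q) && fermat_for_all(q)]
println(carmichael)
```

The first entry of the resulting list is 561, in agreement with the text.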

Euler’s Totient Function Our next task is to show an important
generalization of Fermat’s little theorem. However, before we do it, we need
to introduce a closely related function and discuss a few of its useful
properties. Euler’s totient function ϕ : N → N is defined as follows: for
each q ∈ N, ϕ(q) is equal to the number of integers between 1 and q
(including both 1 and q) that are co-prime with q.
As promised, we discuss next a few interesting properties of ϕ(q).

Property 1. If q is a prime number, then ϕ(q) = q − 1. This fact follows
directly from the definition of prime numbers.

Property 2. If q = p^k for some prime number p and k ∈ N, then

ϕ(q) = p^k − p^{k−1} = q(1 − 1/p).

To see this observe that there are exactly p^{k−1} numbers between 1 and q that
are of the form i ⋅ p, where i ∈ [p^{k−1}]; these are the only numbers in that
interval that are not co-prime to q.

Property 3. Here is another useful fact:

if p and q are co-prime, then ϕ(pq ) = ϕ(p) ⋅ ϕ(q ) .

(5.12)

In order to show (5.12) we need to make one additional observation. Let us
fix any p and q that are co-prime and consider a function
f : [pq] → [p] × [q] defined as follows: for any d ∈ [pq],
f(d) = (r_p(d), r_q(d)), where r_p(d) is the remainder when d is divided by p
and r_q(d) is the remainder when d is divided by q. We claim that f is a
bijection; that is,

a) f is one-to-one: if d_1 ≠ d_2, then f(d_1) ≠ f(d_2), and

b) f is onto: for any (r_p, r_q) ∈ [p] × [q], there exists d ∈ [pq] such
that f(d) = (r_p, r_q).

We will show first that f is one-to-one. To get a contradiction, suppose
that

f(d_1) = (r_p(d_1), r_q(d_1)) = (r_p(d_2), r_q(d_2)) = f(d_2)

for some d_1 ≠ d_2. Due to the symmetry, without loss of generality we may
assume that d_1 > d_2. Since d_1 ≡ r_p(d_1) = r_p(d_2) ≡ d_2 (mod p) and
d_1 ≡ r_q(d_1) = r_q(d_2) ≡ d_2 (mod q), we have p | d_1 − d_2 and
q | d_1 − d_2. Since p and q are co-prime, pq | d_1 − d_2. Finally, since
1 ≤ d_2 < d_1 ≤ pq, we get that 1 ≤ d_1 − d_2 < pq and so pq | d_1 − d_2 is
only possible if d_1 = d_2. We get the desired contradiction.

Now, notice that the codomain of function f is of size |[p]| ⋅ |[q]| = pq,
exactly the same as the size of its domain. Hence, function f is not only a
one-to-one function but it is also onto (and so it is a bijection).
We are finally ready to show (5.12). Notice that d is co-prime to pq if and
only if it is co-prime to p and to q. Moreover, by the Euclidean algorithm, d
is co-prime to p if and only if r_p(d) is co-prime to p. Similarly, d is
co-prime to q if and only if r_q(d) is co-prime to q. The desired conclusion
follows immediately from our observation that function f is a bijection.
Indeed, we have that ϕ(pq), the number of values of d in [pq] that are
co-prime to pq, is equal to the number of pairs (r_p, r_q) where r_p is
co-prime to p and r_q is co-prime to q. The number of such pairs is
ϕ(p) ⋅ ϕ(q) and the proof of this property is finished.

Property 4. Combining Properties 2 and 3 we get:

ϕ(q) = q ∏_{i=1}^{ℓ} (1 − 1/p_i),

where q = ∏_{i=1}^{ℓ} p_i^{k_i} is the unique factorization of q; in particular,
(p_i) is a sequence of prime numbers and k_i ∈ N for all i ∈ [ℓ].
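Properties 1 to 4 can be cross-checked by computing ϕ in two ways. The sketch below is illustrative only: `totient_by_definition` counts co-prime integers directly, while `totient_by_formula` applies the product formula of Property 4 using trial division (both names are ours).

```julia
# ϕ(q) straight from the definition: count d in 1:q with gcd(d, q) == 1.
totient_by_definition(q) = count(d -> gcd(d, q) == 1, 1:q)

# ϕ(q) from Property 4: q * ∏ (1 − 1/p) over the distinct prime divisors p of q.
function totient_by_formula(q)
    result, m, p = q, q, 2
    while p * p <= m
        if m % p == 0
            result -= result ÷ p        # multiply result by (1 − 1/p)
            while m % p == 0
                m ÷= p                  # strip all copies of p from m
            end
        end
        p += 1
    end
    m > 1 && (result -= result ÷ m)     # one leftover prime factor, if any
    return result
end

println(all(totient_by_definition(q) == totient_by_formula(q) for q in 1:500))
println(totient_by_formula(5 * 9) == totient_by_formula(5) * totient_by_formula(9))
```

The second line printed illustrates Property 3 for the co-prime pair 5 and 9.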

Euler’s Theorem Euler’s theorem is a generalization of Fermat’s little
theorem: For any integer q ≥ 2 and any integer n that is co-prime to q, one
has

n^{ϕ(q)} ≡ 1 (mod q).

Note that Fermat’s little theorem is indeed a special case of Euler’s theorem,
because if q is a prime number, then ϕ(q) = q − 1. (See Property 1 above.)
In order to prove Euler’s theorem, let r_1, r_2, …, r_{ϕ(q)} be all the numbers
in [q] that are co-prime with q. Let us concentrate on n ⋅ r_i for some
i ∈ [ϕ(q)]. Because both n and r_i are co-prime with q, n ⋅ r_i is also co-prime
with q. Now, let s_i be the remainder when n ⋅ r_i is divided by q. Observe that
s_i is also co-prime with q. Additionally, if r_i ≠ r_j, then s_i ≠ s_j. To see this,
observe that if s_i = s_j, then n ⋅ r_i ≡ n ⋅ r_j (mod q) and, as n is co-prime
with q, it follows from (5.8) that r_i ≡ r_j (mod q), which is not possible.
This means that there is a bijection between the set S := {s_1, s_2, …, s_{ϕ(q)}}
and the set R := {r_1, r_2, …, r_{ϕ(q)}} and so S = R. Hence, since
n ⋅ r_i ≡ s_i (mod q) for all i, we get

n^{ϕ(q)} ∏_{i=1}^{ϕ(q)} r_i ≡ ∏_{i=1}^{ϕ(q)} s_i (mod q).

Using the fact that all r_i are co-prime with q and that S = R, we can use
(5.8) to get the desired result.
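The two facts used in the proof, namely that multiplication by n permutes the residues co-prime to q and that n^{ϕ(q)} ≡ 1 (mod q), can be observed numerically. This is a small sketch with helper names of our own choosing; ϕ is computed from its definition to keep the block self-contained.

```julia
# ϕ(q) from the definition (kept simple on purpose).
phi(q) = count(d -> gcd(d, q) == 1, 1:q)

# Euler's theorem for a fixed modulus q, tested on all co-prime n in 1:q.
function euler_holds(q)
    e = phi(q)
    return all(powermod(n, e, q) == 1 for n in 1:q if gcd(n, q) == 1)
end

# The map r -> (n * r) mod q permutes the residues co-prime to q (S = R in the proof).
function permutes_units(n, q)
    R = [r for r in 1:q if gcd(r, q) == 1]
    S = [mod(n * r, q) for r in R]
    return sort(S) == R
end

println(all(euler_holds(q) for q in 2:200))
println(permutes_units(7, 30))
```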

Chinese Remainder Theorem The following result is widely used for
computing with large integers, as it allows replacing a computation for
which one knows a bound on the size of the result by several similar
computations on small integers. The Chinese remainder theorem can be
stated as follows in terms of congruences. If the numbers in the set
{n_1, n_2, …, n_k} are pairwise co-prime and a_1, a_2, …, a_k are any integers,
then there exists an integer x such that

x ≡ a_1 (mod n_1)

⋮

x ≡ a_k (mod n_k),

and any two such x are congruent modulo N = ∏_{i=1}^{k} n_i.

For each i ∈ [k], let N_i = N/n_i. Note that gcd(N_i, n_i) = 1 for all
i ∈ [k]. Let N_i^{−1} be an inverse of N_i modulo n_i; that is,
N_i N_i^{−1} ≡ 1 (mod n_i). Let

x̂ = a_1 N_1 N_1^{−1} + a_2 N_2 N_2^{−1} + … + a_k N_k N_k^{−1}.

Clearly, x̂ is a simultaneous solution to all of the congruences. Indeed, for
any j ≠ i, N_j ≡ 0 (mod n_i) and so x̂ ≡ a_i N_i N_i^{−1} ≡ a_i (mod n_i), as
required. Moreover, since the moduli n_1, n_2, …, n_k are pairwise relatively
prime, any two simultaneous solutions to the system must be congruent
modulo N. Hence, the full solution is x ≡ x̂ (mod N).

Let us note that the proof is constructive; that is, it gives us an explicit
formula for the solution. For example, let us solve the following system of
congruences:

x ≡ 1 (mod 5)

x ≡ 2 (mod 7)

x ≡ 3 (mod 9)

x ≡ 4 (mod 11).

Note that the moduli are pairwise relatively prime, as required by the
Chinese remainder theorem. Using the notation from the proof, we have
N = 5 ⋅ 7 ⋅ 9 ⋅ 11 = 3465, N_1 = N/5 = 693, N_2 = N/7 = 495,
N_3 = N/9 = 385, and N_4 = N/11 = 315. A small calculation gives
N_1^{−1} = 2 (693 ⋅ 2 ≡ 3 ⋅ 2 = 6 ≡ 1 (mod 5)),
N_2^{−1} = 3 (495 ⋅ 3 ≡ 5 ⋅ 3 = 15 ≡ 1 (mod 7)),
N_3^{−1} = 4 (385 ⋅ 4 ≡ 7 ⋅ 4 = 28 ≡ 1 (mod 9)), and
N_4^{−1} = 8 (315 ⋅ 8 ≡ 7 ⋅ 8 = 56 ≡ 1 (mod 11)). Hence, one solution is

x̂ = 1 ⋅ 693 ⋅ 2 + 2 ⋅ 495 ⋅ 3 + 3 ⋅ 385 ⋅ 4 + 4 ⋅ 315 ⋅ 8 = 19056.

The full solution is

x ≡ 19056 ≡ 1731 (mod N).

In particular, x = 1731 is the smallest positive integer solution to the
system.
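The constructive proof translates almost line by line into code. The sketch below follows the notation of the proof; the function name `crt` is our own, and it relies on Julia's built-in `invmod` to compute each inverse of N_i modulo n_i. It assumes the moduli are pairwise co-prime and does not check this.

```julia
# Constructive Chinese remainder theorem, following the proof:
# x̂ = Σ a_i * N_i * N_i^{-1}, where N_i = N / n_i.
function crt(as, ns)
    N = prod(ns)
    xhat = 0
    for (a, n) in zip(as, ns)
        Ni = N ÷ n
        xhat += a * Ni * invmod(Ni, n)   # invmod(Ni, n) is the inverse of Ni modulo n
    end
    return xhat, mod(xhat, N), N
end

xhat, x, N = crt([1, 2, 3, 4], [5, 7, 9, 11])
println((xhat, x, N))                    # (19056, 1731, 3465)
```

The three returned values are the raw sum x̂, its reduction modulo N, and N itself, matching the worked example.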
SOLUTION

Suppose that x, y ∈ N are such that 2^x + 17 = y^4. Observe that

y^16 − 16^x = (y^8 − 4^x)(y^8 + 4^x) = (y^4 − 2^x)(y^4 + 2^x)(y^8 + 4^x).

In particular, we get that 17 = y^4 − 2^x | y^16 − 16^x or, alternatively, that
y^16 ≡ 16^x (mod 17). Observe now that 16^x ≡ (−1)^x (mod 17). On the
other hand, Fermat’s little theorem implies that y^16 ≡ 1 (mod 17) if y is
not divisible by 17 and otherwise we get that y^16 ≡ 0 (mod 17). This
implies that y is not divisible by 17 and, more importantly, that x is even.
Therefore, after letting k = x/2 ∈ N, we can re-write the equation as
follows:

17 = y^4 − 2^{2k} = (y^2 − 2^k)(y^2 + 2^k).

Since 17 is a prime number, this equation holds only if y^2 + 2^k = 17 and
y^2 − 2^k = 1. It follows that

2^k = ((y^2 + 2^k) − (y^2 − 2^k))/2 = (17 − 1)/2 = 8

and

y^2 = ((y^2 + 2^k) + (y^2 − 2^k))/2 = (17 + 1)/2 = 9,

which gives us x = 2k = 6 and y = 3. One can easily verify that indeed
2^6 + 17 = 3^4.
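As a sanity check of the final answer, one can search a finite range of exponents directly. The function name and the cutoff 60 (chosen so that 2^x + 17 fits in a 64-bit integer) are our assumptions; the search only confirms the result within that range.

```julia
# All pairs (x, y) with 2^x + 17 == y^4 for 1 ≤ x ≤ max_x (max_x ≤ 60 to avoid overflow).
function fourth_power_solutions(max_x)
    sols = Tuple{Int,Int}[]
    for x in 1:max_x
        m = 2^x + 17
        y = isqrt(isqrt(m))              # floor of the fourth root of m
        y^4 == m && push!(sols, (x, y))
    end
    return sols
end

println(fourth_power_solutions(60))      # [(6, 3)]
```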

REMARKS

It is easy to observe that the decomposition 17 = (y^2 − 2^k)(y^2 + 2^k) would
yield an easy solution to the problem. Hence, the main difficulty in this
problem is to show that x is even. We present one way to show it but,
alternatively, one can use an argument similar to the one from Section 5.2.
We observe that for x = 1, 2, 3, 4, 5, 6, 7, 8

2^x ≡ 2, 4, 8, 16, 15, 13, 9, 1 (mod 17).

We close the cycle of length 8 which implies that for any x ∈ N,
2^{x+8} ≡ 2^x (mod 17)
and so {2, 4, 8, 16, 15, 13, 9, 1} are the only possible remainders when the
left hand side of the equation is divided by 17.
On the other hand, the only possible remainders when y^4, the right hand
side of the equation, is divided by 17 are 0, 1, 4, 13, and 16, which can be
verified by enumerating the remainders of 0^4, 1^4, …, 16^4 when divided by
17. Here is a line of code written in Julia language that performs this
calculation and shows the result:

julia> Set(mod(i^4, 17) for i in 0:16)
Set([4, 16, 0, 13, 1])
Comparing possible remainders on both sides, we conclude that the only
chance that the remainders match is when x is even.
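The comparison of the two sets of remainders can also be carried out in Julia. In this sketch (variable names are ours) we list, for one full cycle x = 1, …, 8, the exponents for which the remainder of 2^x modulo 17 is also a possible remainder of a fourth power.

```julia
# Remainders of 2^x modulo 17 over one full cycle x = 1, …, 8.
pow2 = [powermod(2, x, 17) for x in 1:8]

# Possible remainders of fourth powers modulo 17.
fourth = Set(mod(i^4, 17) for i in 0:16)

# Exponents in the cycle whose remainder can match a fourth power.
matching = [x for x in 1:8 if pow2[x] in fourth]
println(matching)                        # [2, 4, 6, 8]
```

Only even exponents survive, which is the parity statement needed in the solution.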
EXERCISES

5.4.1. Prove that for all a ∈ N, we have 35 | a^64 − a^4.

5.4.2. Prove that for any odd integer n, we have that
n | ∏_{i=1}^{n} ∑_{j=0}^{i} 2^j.

5.4.3. Find the last two digits in the decimal representation of 7^123.

5.5 Rules of Divisibility


SOURCE

Problem and solution: PLMO LXIX – Phase 2 – Problem 2


PROBLEM
Suppose that n ∈ N is such that n ≡ 4 (mod 8). Let

1 = k_1 < k_2 < … < k_m = n

be all positive divisors of n. Prove that if natural number i < m is not
divisible by 3, then k_{i+1} ≤ 2k_i.

THEORY

When discussing rules for divisibility, it is convenient to write an integer n
we are concerned with in base 10 as follows:

n = 10^k a_k + 10^{k−1} a_{k−1} + ⋯ + 10 a_1 + a_0

with a_k ≠ 0 and 0 ≤ a_i ≤ 9 for 0 ≤ i ≤ k. Here are some standard rules of
divisibility:

1. 2 | n if and only if 2 | a_0 (the last digit is even),

2. 5 | n if and only if 5 | a_0 (the last digit is 0 or 5),

3. 4 | n if and only if 4 | 10a_1 + a_0 (the number formed by the two
rightmost digits is divisible by 4),

4. 8 | n if and only if 8 | 100a_2 + 10a_1 + a_0 (the number formed
by the three rightmost digits is divisible by 8),

5. 3 | n if and only if 3 | ∑_{i=0}^{k} a_i (the sum of digits is divisible by
3),

6. 9 | n if and only if 9 | ∑_{i=0}^{k} a_i (the sum of digits is divisible by
9).

The proofs are straightforward so we only show the rule for 3 in order to
illustrate the argument. Note that 10 ≡ 1 (mod 3), so for any i ∈ N we
get 10^i ≡ 1^i ≡ 1 (mod 3). Thus

3 | n ⇔ n ≡ 0 (mod 3)

⇔ 10^k a_k + 10^{k−1} a_{k−1} + ⋯ + 10 a_1 + a_0 ≡ 0 (mod 3)

⇔ a_k + a_{k−1} + ⋯ + a_0 ≡ 0 (mod 3),

that is, the sum of its digits is divisible by 3.
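The rules above can be compared against direct division for many integers at once. The sketch below uses Julia's `digits` function (which returns the base-10 digits, least significant first); the helper names are ours.

```julia
# Sum of decimal digits and the number formed by the k rightmost digits.
digit_sum(n) = sum(digits(n))
last_digits(n, k) = n % 10^k

# Compare the rules for 3, 9, 4, and 8 with direct divisibility tests.
rules_agree(n) =
    (n % 3 == 0) == (digit_sum(n) % 3 == 0) &&
    (n % 9 == 0) == (digit_sum(n) % 9 == 0) &&
    (n % 4 == 0) == (last_digits(n, 2) % 4 == 0) &&
    (n % 8 == 0) == (last_digits(n, 3) % 8 == 0)

println(all(rules_agree(n) for n in 1:100_000))
```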

SOLUTION

Instead of proving the original conditional statement (if 3 ∤ i, then
k_{i+1} ≤ 2k_i) we will prove its contrapositive (if k_{i+1} > 2k_i, then 3 | i), as
both statements are logically equivalent.
Let us first note that since n ≡ 4 (mod 8), n is divisible by 4 but not
by 8. As a result, each divisor of n is of the form 2^s(2ℓ + 1), s ∈ {0, 1, 2},
ℓ ∈ N ∪ {0}. Consider any i ∈ N such that i < m and assume that
k_{i+1} > 2k_i. Our goal is to show that 3 | i; however, in fact, we will show
something stronger, namely, that the set {k_1, k_2, …, k_i} can be partitioned
into i/3 sets, each of the form {d, 2d, 4d} for some odd positive integer d.
We will show that if d ≤ k_i is an odd divisor of n, then both 2d and 4d
are not only divisors of n (which is obvious) but also 2d < 4d ≤ k_i. This
will finish the proof as it implies that the desired partition exists. Consider
any odd divisor d ≤ k_i of n. By our assumption, 2d ≤ 2k_i < k_{i+1} but this
implies that 2d ≤ k_i. Repeating the argument one more time we get
4d = 2(2d) ≤ 2k_i < k_{i+1} which gives 4d ≤ k_i. The desired property holds
and we are done.
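The statement of the problem can be tested on all n ≡ 4 (mod 8) up to a bound. This is a minimal sketch; `divisors` and `gap_property_holds` are illustrative names, and the divisor list is built by naive trial division.

```julia
# Sorted list of positive divisors k_1 < k_2 < … < k_m of n (naive trial division).
divisors(n) = [d for d in 1:n if n % d == 0]

# For every i < m not divisible by 3, check that k_{i+1} ≤ 2 k_i.
function gap_property_holds(n)
    k = divisors(n)
    m = length(k)
    return all(k[i+1] <= 2 * k[i] for i in 1:m-1 if i % 3 != 0)
end

println(all(gap_property_holds(n) for n in 4:8:4000))
println(divisors(44))                    # the jump from 4 to 11 occurs at i = 3
```

For n = 44 the only place where a divisor more than doubles is between k_3 = 4 and k_4 = 11, and indeed 3 | 3.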

REMARKS
The key observation that leads to the solution is to notice that if k_{i+1} > 2k_i,
then k_{i+1} must be odd. Indeed, if k_{i+1} were even, then k_{i+1}/2 > k_i would
be an integer that also divides n but there is no divisor of n between k_i and
k_{i+1}. From this observation it follows that all divisors of the form d, 2d and
4d, where d is an odd divisor of n, must be either less than k_{i+1} or all of
them are greater than or equal to k_{i+1}.

EXERCISES

5.5.1. Decide if there exists k ∈ N with the property that in the decimal
representation of 2^k each of the 10 digits (0, 1, 2, …, 9) is present the same
number of times.
(Source of the problem and solution: LXX – Phase 1 – Problem 1.)

5.5.2. Find the minimum of |20^m − 9^n| over all natural numbers m and n.
(Source of the problem and solution idea: LXIV – Phase 1 – Problem 5.)

5.5.3. Given m, n, d ∈ N, prove that if m^2 n + 1 and mn^2 + 1 are divisible
by d, then m^3 + 1 and n^3 + 1 are also divisible by d.

5.6 Remainders
SOURCE

Problem and solution idea: PLMO XLIX – Phase 3 – Problem 4


PROBLEM

Let the sequence (a_i) be defined recursively as follows: a_1 = 1 and for any
n ∈ N ∖ {1}, a_n = a_{n−1} + a_{⌊n/2⌋}. Prove that there are infinitely many
terms in the sequence (a_i) that are divisible by 7.

THEORY

Let p be any prime number. For given integers a, b, and k, let

S = {a + ib : i ∈ Z, k + 1 ≤ i ≤ k + p } .

If p | b, then clearly all numbers in S give the same remainder when
divided by p, namely, the remainder when a is divided by p. On the other
hand, if p does not divide b, then all numbers in S yield unique remainders
when divided by p. Indeed, for a contradiction, suppose that there exist i, j,
k + 1 ≤ i < j ≤ k + p such that a + ib ≡ a + jb (mod p). It follows
that p | (j − i)b and so p | (j − i) as p does not divide b. But
0 < j − i < p so it is impossible that p | (j − i). We get the desired
contradiction and so the property holds.

SOLUTION
Let us first observe that a_2 = a_1 + a_1 = 2, a_3 = a_2 + a_1 = 3,
a_4 = a_3 + a_2 = 5, a_5 = a_4 + a_2 = 7. So there exists at least one term in
the sequence that is divisible by 7.
For a contradiction, suppose that there are only finitely many terms
that are divisible by 7; let a_k be the last one. Let us concentrate on the
following seven consecutive terms of the sequence:

a_{4k−3}

a_{4k−2} = a_{4k−3} + a_{2k−1}

a_{4k−1} = a_{4k−3} + 2a_{2k−1}

a_{4k} = a_{4k−3} + 2a_{2k−1} + a_{2k} = a_{4k−3} + 3a_{2k−1} + a_k

a_{4k+1} = a_{4k−3} + 3a_{2k−1} + a_k + a_{2k} = a_{4k−3} + 4a_{2k−1} + 2a_k

a_{4k+2} = a_{4k−3} + 4a_{2k−1} + 2a_k + a_{2k+1} = a_{4k−3} + 5a_{2k−1} + 4a_k

a_{4k+3} = a_{4k−3} + 5a_{2k−1} + 4a_k + a_{2k+1} = a_{4k−3} + 6a_{2k−1} + 6a_k.

Since a_k is divisible by 7, the remainders of these consecutive terms when
divided by 7 are the same as the corresponding remainders of numbers in

S = {a_{4k−3} + i a_{2k−1} : i ∈ Z, 0 ≤ i ≤ 6}.

Since prime number 7 does not divide a_{2k−1} (indeed, 2k − 1 > k and a_k is
the last term divisible by 7), it follows from the observation
we made in the theory part that all numbers in S yield unique remainders
when divided by 7. Hence, one of them is equal to 0 and so one of the
consecutive terms we consider, all of which have indices larger than k, is
divisible by 7. We get the desired contradiction and the proof is finished.
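The recurrence is easy to simulate modulo 7, which makes the mechanism of the proof visible. In the sketch below (names and the cutoff 2000 are our assumptions) we compute the remainders, list the indices of terms divisible by 7, and check that each such index k is followed by another one, either at 2k − 1 or within the window 4k − 3, …, 4k + 3.

```julia
# Remainders modulo m of a_1 = 1, a_n = a_{n-1} + a_{⌊n/2⌋}, for n = 1, …, N.
function sequence_mod(N, m)
    a = zeros(Int, N)
    a[1] = 1 % m
    for n in 2:N
        a[n] = (a[n-1] + a[n ÷ 2]) % m
    end
    return a
end

a = sequence_mod(2000, 7)
hits = findall(==(0), a)                 # indices n with 7 | a_n
println(hits[1:min(10, end)])

# Every hit k (with 4k + 3 inside the computed range) produces a later hit.
later_hit(k) = a[2k - 1] == 0 || any(a[j] == 0 for j in 4k-3:4k+3)
println(all(later_hit(k) for k in hits if 4k + 3 <= length(a)))
```

The first index reported is 5, in line with a_5 = 7.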

REMARKS

Since our task was to prove that there are infinitely many values of
n ∈ N for which 7 | a_n, it is clear that one should consider the recurrence
a_n = a_{n−1} + a_{⌊n/2⌋} but concentrate on remainders when a_n is divided by 7.
One natural idea that has a chance to lead to a solution is to consider
seven consecutive numbers and hope that all of them have different
remainders when divided by 7. Since the recursive definition for a_n involves
a_{⌊n/2⌋} and it has to be applied twice in order to reduce the number of terms
to at most three, one should consider indexes of the form n = 4k + r for
some seven consecutive values of r.
The key observation is to notice the pattern that implies that the terms
involving a_{2k−1} yield unique remainders when divided by 7, provided that
a_{2k−1} is not divisible by 7. Hence, the same is true for the seven consecutive
terms of the form n = 4k + r, provided that additionally a_k is divisible by
7. This leads to the proof by contradiction that assumes that a_k is the last
term divisible by 7.

EXERCISES

5.6.1. Find all x, y ∈ N such that 2^x + 5^y is a square.

(Source of the problem and solution idea: General Mathematics Vol. 15, No.
4 (2007), 145–148.)

5.6.2. Prove that for any two sequences, (x_i)_{i=1}^{2011} and (y_i)_{i=1}^{2011}, of natural
numbers, ∏_{i=1}^{2011} (2x_i^2 + 3y_i^2) is not a square.

(Source of the problem and solution idea: PLMO LXII – Phase 3 – Problem
3.)

5.6.3. Consider any integer n ≥ 2 and any subset S of the set
N := {0, 1, 2, …, n − 1} that has more than (3/4)n elements. Prove that there
exist integers a, b, c such that the remainders when numbers a, b, c, a + b,
a + c, b + c, a + b + c are divided by n are all in S.

(Source of the problem and solution idea: PLMO LXI – Phase 3 – Problem
1.)

5.7 Aggregation
SOURCE

Problem and solution idea: PLMO XLII – Phase 1 – Problem 11 (modified)

PROBLEM
Let n > 1,000 be any integer. For i ∈ [n], let r_i be the remainder when 2^n is
divided by i. Prove that ∑_{i=1}^{n} r_i > 3.5n.

THEORY

When a long sequence of numbers is considered, it is often the case that
some patterns naturally emerge, either in the sequence itself or in the sum or
the product of its terms. Here are some simple observations:

∑_{i=0}^{n−1} (a + i ⋅ b) = n ⋅ a + (n(n − 1)/2) ⋅ b

that implies that if n is odd or b is even, then n | ∑_{i=0}^{n−1} (a + i ⋅ b).
Similarly,

∑_{i=0}^{n−2} b^i = (b^{n−1} − 1)/(b − 1)

is divisible by n if n is a prime number and neither b nor b − 1 is divisible
by n. In fact, a more general property holds. For any prime number q and
n = a(q − 1) + 1 for some a ∈ N, the sum ∑_{i=0}^{n−2} b^i is divisible by q,
provided that q divides neither b nor b − 1. Indeed, we recover the previous
property for a = 1. In our problem, we will need to consider a slightly more
complex aggregation scheme of this flavour.
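Both observations can be probed numerically. In the sketch below (helper names are ours) the geometric sum is tested for n = a(q − 1) + 1, which is the form for which a = 1 recovers the prime case n = q, and under the assumption that q divides neither b nor b − 1; the last line shows that the assumption on b itself cannot be dropped.

```julia
# Sum of the arithmetic progression a, a + b, …, a + (n − 1) b.
arith_sum(a, b, n) = sum(a + i * b for i in 0:n-1)

# Remainder modulo q of 1 + b + b^2 + … + b^(n−2).
geom_sum_mod(b, n, q) = mod(sum(powermod(b, i, q) for i in 0:n-2), q)

# n | arithmetic sum whenever n is odd or b is even.
println(all(arith_sum(a, b, n) % n == 0
            for a in 0:5, b in 0:6, n in 1:12 if isodd(n) || iseven(b)))

# q | geometric sum for n = a(q − 1) + 1 when q divides neither b nor b − 1.
println(all(geom_sum_mod(b, a * (q - 1) + 1, q) == 0
            for q in (3, 5, 7, 11), a in 1:3, b in 2:30 if b % q > 1))

# Without the condition on b the claim can fail: 1 + 3 = 4 is not divisible by 3.
println(geom_sum_mod(3, 3, 3))           # 1
```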
SOLUTION

Clearly, for any odd number i ≥ 3, 2^n is not divisible by i and so r_i ≥ 1.
Since there are ⌈n/2⌉ positive odd numbers in [n], we immediately get

∑_{i=1}^{n} r_i ≥ ⌈n/2⌉ − 1.

In order to get the desired lower bound of 3.5n, we need to use a slightly
more delicate argument.
Let us first note that for any number i ∈ [n], there exist unique
a, b ∈ N ∪ {0}, such that

i = 2^a ⋅ (2b + 1).

If b = 0, then i = 2^a divides 2^n and so r_i = 0. As a result, we focus on
numbers for which b ≥ 1. We will say that a number i ∈ [n] is of “type a” if
the corresponding representation is 2^a(2b + 1) and b ≥ 1. Since there are
⌈⌊n/2^a⌋/2⌉ odd numbers between 1 and ⌊n/2^a⌋, there are

⌈⌊n/2^a⌋/2⌉ − 1 ≥ ⌊n/2^a⌋/2 − 1 ≥ n/2^{a+1} − 3/2

numbers of “type a.”
Let us now concentrate on any number of “type a”: i = 2^a(2b + 1) ∈ [n],
b ≥ 1. The key observation is that

r_i = 2^n − ⌊2^n/i⌋ ⋅ i = (2^n/i − ⌊2^n/i⌋) ⋅ i

= (2^{n−a}/(2b + 1) − ⌊2^{n−a}/(2b + 1)⌋) ⋅ 2^a(2b + 1)

≥ (1/(2b + 1)) ⋅ 2^a(2b + 1) = 2^a,

since b ≠ 0. It follows that

∑_{i=1}^{n} r_i ≥ ∑_{a=0}^{a_max} (n/2^{a+1} − 3/2) 2^a = ∑_{a=0}^{a_max} (n/2 − (3/2) ⋅ 2^a),

where a_max = ⌊log_2(n/3)⌋. Since n > 1,000, a_max ≥ 8 and so

∑_{i=1}^{n} r_i ≥ ∑_{a=0}^{8} (n/2 − (3/2) ⋅ 2^a) = (9/2)n − 1533/2 > 3.5n.
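The bound can be compared with the actual value of the sum for a few n just above 1,000. This sketch uses `powermod` so that 2^n never has to be formed explicitly; the function name is ours, and the term for i = 1 is skipped because r_1 = 0.

```julia
# Sum of the remainders of 2^n modulo i for i = 2, …, n (r_1 = 0 is omitted).
remainder_sum(n) = sum(powermod(2, n, i) for i in 2:n)

for n in (1001, 1500, 2000)
    println("n = ", n, ": sum = ", remainder_sum(n), ", 3.5n = ", 3.5 * n)
end
```

In line with the heuristic in the remarks that follow, the computed sums are typically far larger than 3.5n.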

REMARKS

Let us mention that our argument above easily gives a slightly stronger
asymptotic lower bound. Indeed,

∑_{i=1}^{n} r_i ≥ ∑_{a=0}^{a_max} (n/2 − (3/2) ⋅ 2^a) = (a_max + 1) ⋅ n/2 + O(2^{a_max})

= (log_2 n + O(1)) ⋅ n/2 + O(n) = (n log_2 n)/2 + O(n) ∼ (n log_2 n)/2.

A heuristic argument suggests that r_i is expected to be around i/2 and so it
is natural to conjecture that ∑_{i=1}^{n} r_i is asymptotically equal to
∑_{i=1}^{n} i/2 ∼ n^2/4.
EXERCISES
5.7.1. Prove that if the sum of positive divisors of some natural number n is
odd, then either n is a square or n/2 is a square.

5.7.2. Find all natural numbers n for which there exist 2n pairwise different
numbers a_1, a_2, …, a_n, b_1, b_2, …, b_n such that ∑_{i=1}^{n} a_i = ∑_{i=1}^{n} b_i and
∏_{i=1}^{n} a_i = ∏_{i=1}^{n} b_i.

(Source of the problem and solution idea: PLMO LXI – Phase 1 – Problem
10.)

5.7.3. Call a natural number white if it is equal to 1 or is a product of an
even number of prime numbers; otherwise, call it black. Is there an integer
for which the sum of its white divisors is equal to the sum of its black
divisors?
(Source of the problem and solution idea: PLMO LVIII – Phase 3 –
Problem 2.)

5.8 Equations
SOURCE

Problem and solution idea: PLMO XLVII – Phase 2 – Problem 5

PROBLEM

Find all pairs of integers (x, y) that satisfy the following equation:

x^2(y − 1) + y^2(x − 1) = 1.

THEORY

Solving an equation or a system of equations when solutions are restricted
to the set of integers usually involves techniques that we investigated in
earlier problems. Most common applications are rules of divisibility and
factorization that will also be used in the current problem. We dedicate a
separate place for these types of equations as they are well-known,
interesting, and worth knowing about.

Diophantine Equations A Diophantine equation is a polynomial equation
whose solutions are restricted to integers. These types of equations are
named after the ancient Greek mathematician Diophantus. A linear
Diophantine equation is a first-degree equation of this type, that is, an
equation of the form

Ax + By = C,

where A, B, C are given integers. Although the practical applications of
Diophantine analysis have been somewhat limited in the past, this kind of
analysis has become much more important in the digital age. In particular,
such equations play an important role in the theory of public-key cryptography.
In order to warm up, we concentrate on the simplest and best-understood
equations, namely, linear equations. Let us first note that not all linear
Diophantine equations have a solution. For example, 10x + 5y = 3 does
not have a solution as for any pair of two integers x and y, the left hand side
of this equation is divisible by 5 whereas the right hand side is not.
Fortunately, there is a formal process to determine whether the equation has
a solution or not. Indeed, finding all solutions to linear Diophantine
equations involves finding an initial solution, and then altering that solution
in some way to find the remaining solutions. For the first task we are going
to use Bézout’s identity.

Bézout’s Identity Let us recall that in Section 5.1 we used the extended
Euclidean algorithm to prove that for any A, B ∈ Z ∖ {0}, there exist
x, y ∈ Z such that

Ax + By = gcd(A, B).

Bézout’s Identity generalizes this observation as follows. There exist
integers x and y which satisfy Ax + By = C if and only if
gcd(A, B) | C. In other words, the integers of the form Ax + By are
exactly the multiples of gcd(A, B).
Using this observation, one can determine if solutions exist or not by
calculating the greatest common divisor of the coefficients of the variables,
and then determining if the constant term can be divided by that greatest
common divisor. As already mentioned, if solutions do exist, then there is
an efficient method to find an initial solution, namely the extended Euclidean
algorithm. With this one solution at hand, we can find all integer solutions;
there are infinitely many of them.
We will show that if (x, y) = (x̂, ŷ) is an integer solution of the
Diophantine equation Ax + By = C, then all integer solutions to the
equation are of the form

(x, y) = (x̂ + m ⋅ B/gcd(A, B), ŷ − m ⋅ A/gcd(A, B))

for some integer m. Indeed, we have

A(x̂ + m ⋅ B/gcd(A, B)) + B(ŷ − m ⋅ A/gcd(A, B)) = Ax̂ + Bŷ = C,

which shows these are indeed solutions to the equation. On the other hand,
given any solution (x, y), we have

Ax + By = Ax̂ + Bŷ

A(x − x̂) = −B(y − ŷ)

(A/gcd(A, B)) (x − x̂) = −(B/gcd(A, B)) (y − ŷ).

Since A/gcd(A, B) and B/gcd(A, B) are relatively prime, there exists an integer m
such that x − x̂ = m ⋅ B/gcd(A, B) and y − ŷ = −m ⋅ A/gcd(A, B). This shows that
there are no more solutions.
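The procedure described above (test gcd(A, B) | C, find one solution with the extended Euclidean algorithm, then shift it) can be sketched in a few lines. The function name is ours; Julia's built-in `gcdx(A, B)` returns the greatest common divisor d together with Bézout coefficients u, v such that A*u + B*v == d. The sketch assumes A and B are non-zero.

```julia
# Solve A*x + B*y == C over the integers.
# Returns nothing when no solution exists; otherwise (x0, y0, dx, dy), meaning that
# all solutions are (x0 + m*dx, y0 − m*dy) for integer m.
function solve_linear_diophantine(A, B, C)
    d, u, v = gcdx(A, B)                 # A*u + B*v == d == gcd(A, B)
    C % d == 0 || return nothing         # Bézout: solvable iff gcd(A, B) | C
    x0, y0 = u * (C ÷ d), v * (C ÷ d)    # scale the Bézout pair to hit C
    return x0, y0, B ÷ d, A ÷ d
end

println(solve_linear_diophantine(10, 5, 3))   # nothing: 5 does not divide 3

x0, y0, dx, dy = solve_linear_diophantine(6, 15, 9)
println(all(6 * (x0 + m * dx) + 15 * (y0 - m * dy) == 9 for m in -5:5))
```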

SOLUTION

Let us first observe that, due to the symmetry, without loss of generality we
may assume that x ≤ y. In other words, solutions to this equation come in
pairs: if (x, y) = (a, b) is a solution, then so is (x, y) = (b, a).
If x = 1, then y has to satisfy y − 1 = 1 and we get two solutions:
(x, y) = (2, 1) and (x, y) = (1, 2). If x = 2, then 4(y − 1) + y^2 = 1 and
so after solving the quadratic equation we get y = 1 or y = −5. Hence, there is
another pair of solutions: (x, y) = (2, −5) and (x, y) = (−5, 2). We will
prove that there are no other solutions to this equation.
Let us start with expanding the expression:

x^2 y − x^2 + y^2 x − y^2 = 1.

Now, we re-write it as follows:

xy(x + y) = 1 + x^2 + y^2

and so

xy(x + y) + 2xy = 1 + (x + y)^2.

Next,

xy(x + y + 2) = (x + y + 2)(x + y − 2) + 5

and finally

(xy − x − y + 2)(x + y + 2) = 5.

There are 4 cases to consider. We will see that none of them yields new
solutions and so the proof will be finished.

1. x + y + 2 = 1 and xy − x − y + 2 = 5. Adding these two
equations together we get that xy = 2 and x = −1 − y. There is no
pair (x, y) satisfying these two equations as the first equation
implies that x and y have the same sign whereas the second one
implies that they have opposite signs.

2. x + y + 2 = 5 and xy − x − y + 2 = 1. Adding these two
equations together we get xy = 2 and x = 3 − y. This gives us the
equation (3 − y)y = 2 that yields two solutions we are already
aware of, (x, y) = (1, 2) and (x, y) = (2, 1).

3. x + y + 2 = −1 and xy − x − y + 2 = −5. This time we get
xy = −10 and x = −3 − y, so (−3 − y)y = −10. As before, we
re-discover two solutions, (x, y) = (2, −5) and (x, y) = (−5, 2).

4. x + y + 2 = −5 and xy − x − y + 2 = −1. We get xy = −10 and
x = −7 − y, so (−7 − y)y = −10. This quadratic equation does
not have any integer solutions.
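Both the factorization and the final list of solutions can be checked by a short search. This is a sketch only: the window [−50, 50] is an arbitrary choice of ours, and the search confirms the answer only inside that window.

```julia
# Left-hand side of the equation and of its factored form (shifted by 1 and 5).
lhs(x, y) = x^2 * (y - 1) + y^2 * (x - 1)
factored(x, y) = (x * y - x - y + 2) * (x + y + 2)

# The identity lhs − 1 == factored − 5 on a grid of integer points.
println(all(lhs(x, y) - 1 == factored(x, y) - 5 for x in -50:50, y in -50:50))

# All integer solutions of lhs(x, y) == 1 inside the window.
solutions = [(x, y) for x in -50:50 for y in -50:50 if lhs(x, y) == 1]
println(solutions)
```

The search should report exactly the four pairs found in the solution: (1, 2), (2, 1), (2, −5), and (−5, 2).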


REMARKS

The key idea of the proposed solution is to use a factorization that reduces
the solution to 4 simple cases. However, it is not clear how to transform the
equation to get this desired form. In order to see this, it is easier to
substitute a = x − 1 and b = y − 1 to get

(a + 1)^2 b + (b + 1)^2 a = 1.

After expanding, we obtain

a^2 b + 2ab + b + b^2 a + 2ab + a = 1

and so

ab(a + b) + 4ab + a + b = 1.

It follows that

ab(a + b + 4) + a + b = 1,

and now it is much easier to see that it is enough to add 4 to both sides to
reach the desired factorization,

(ab + 1)(a + b + 4) = 5.

EXERCISES

5.8.1. Find all natural solutions of the following equation:


+ y .
4 3 2
x + y = x

5.8.2. Find all pairs of natural numbers that satisfy (x − y)^n = xy.

(Source of the problem and solution idea: PLMO LVI – Phase 3 – Problem
1.)

5.8.3. Find all natural numbers satisfying the following system of equations:

a + b + c = xyz,

x + y + z = abc,

and such that a ≥ b ≥ c ≥ 1 and x ≥ y ≥ z ≥ 1.


(Source of the problem and solution idea: PLMO XLIX – Phase 3 –
Problem 1.)
Chapter 6
Geometry

6.1 Circles
6.2 Congruence
6.3 Similarity
6.4 Menelaus's Theorem
6.5 Parallelograms
6.6 Power of a Point
6.7 Areas
6.8 Thales' Theorem

As usual, we start the chapter with some basic definitions.

THEORY

In geometry, a point is a specific location on a plane or, in general, in any d-dimensional
space; however, in this chapter we are concerned with problems in 2 dimensions. A line is
defined as a line of points that extends infinitely in two directions. Points that are on the
same line are called collinear points. A line is defined by two different points, say A and B,
and is denoted by AB. A part of a line that has defined endpoints is called a line segment.
Usually it is clear from the context whether we deal with a line or a line segment so we will
use the same notation, namely AB, for both the line segment between A and B, and the line
passing through A and B.
Two lines that meet in a point are called intersecting lines. When two lines intersect at a
right angle to each other, they are said to be orthogonal (here a right angle is π/2). Given line
AB and point C not on the line, there is a unique line that is orthogonal to line AB which
goes through point C; this line intersects line AB at a point C′, which is called the orthogonal
projection of C on line AB.
A polygon is a plane figure that is bounded by a finite sequence of straight line segments
closing in a loop to form a closed polygonal circuit. These segments are called its edges (or
sides), and the points where two edges meet are the polygon’s vertices. An n-gon is a
polygon with n sides; for example, a triangle is a 3-gon and a quadrilateral is a 4-gon. Finally, an
equilateral triangle is a triangle in which all three sides are equal, and an isosceles triangle is
a triangle that has two sides of equal length.

6.1 Circles
SOURCE
Problem: “Exercises in geometry” (in Polish) by Waldemar Pompe – Problem 19
Solution: our own
PROBLEM

Points E and F lie on sides AB and BC of square ABCD and |BE| = |BF|. Point S is the
orthogonal projection of B on line CE. Show that angle ∠DSF is a right angle.
THEORY

Circumcircle The circumcircle is a triangle’s circumscribed circle, that is, the unique circle
that passes through each of the triangle’s three vertices. The center of the circumcircle is
called the circumcenter, and the circle’s radius is called the circumradius. The circumcenter’s
position depends on the type of triangle.

The circumcenter lies on one of the triangle’s sides (namely, the hypotenuse, the
longest side of a right triangle, opposite the right angle) if and only if the
triangle is a right triangle (that is, a triangle in which one angle is a right angle).

The circumcenter lies inside the triangle if and only if the triangle is acute
(all angles smaller than a right angle).

The circumcenter lies outside the triangle if and only if the triangle is obtuse
(one angle larger than a right angle).

Central and Inscribed Angles A central angle of a circle is an angle whose vertex is the
center O of the circle and whose sides, called radii, are line segments from O to two points
on the circle. In Figure 6.1, ∠BOC is a central angle and we say that it intercepts the arc
BC. An inscribed angle of a circle is an angle whose vertex is a point A on the circle and
whose sides are line segments, called chords, from A to two other points on the circle. In
Figure 6.1, ∠BAC is an inscribed angle that intercepts the arc BC.
Here is a very useful relation between inscribed and central angles. If an inscribed angle
∠BAC and a central angle ∠BOC intercept the same arc, then

∠BOC = 2 ⋅ ∠BAC.

As a result, inscribed angles which intercept the same arc are equal.
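The relation between central and inscribed angles is easy to test numerically. The sketch below is our own illustration, not part of the original text: it places B and C on the unit circle centered at O, moves A along the arc on the same side of BC as O, and compares the two angles. The helper names `angle_at` and `inscribed_and_central`, as well as the polar-angle parametrization, are assumptions made only for this example.

```python
import math

def angle_at(vertex, p, q):
    """Angle p-vertex-q in radians, a value between 0 and pi."""
    ux, uy = p[0] - vertex[0], p[1] - vertex[1]
    vx, vy = q[0] - vertex[0], q[1] - vertex[1]
    return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)

def inscribed_and_central(t_a, t_b, t_c):
    """Inscribed angle BAC and central angle BOC on the unit circle.

    The points A, B, C sit at polar angles t_a, t_b, t_c (a hypothetical
    parametrization chosen for this sketch).
    """
    O = (0.0, 0.0)
    A, B, C = [(math.cos(t), math.sin(t)) for t in (t_a, t_b, t_c)]
    return angle_at(A, B, C), angle_at(O, B, C)

# B and C are fixed; A moves along the arc that does not contain the
# intercepted arc BC, so A and O stay on the same side of line BC.
for t_a in (2.5, 3.4, 4.8, 6.0):
    inscribed, central = inscribed_and_central(t_a, 0.3, 1.7)
    print(f"inscribed = {inscribed:.6f}, central = {central:.6f}")
```

In every row the central angle should be twice the inscribed one, and the inscribed angle should not depend on the position of A.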
FIGURE 6.1: Relation between central and inscribed angles.

Let us stress that a central angle can be more than π. If A and O lie on the same side of
line BC, then ∠BOC is smaller than π (and so ∠BAC is acute). On the other hand, if
A and O are on different sides of BC, then it is greater than π (and so ∠BAC is obtuse).
Finally, let us mention that often we are not explicitly told that we deal with inscribed
angles. For example, perhaps we are given two triangles ABC and ABC′ where C and C′ lie
on the same side of line AB. We then make a connection and notice that
∠ACB = ∠AC′B if and only if they have the same circumcircle. Similarly, suppose
that C and C′ lie on different sides of line AB. In this case, ∠ACB + ∠AC′B = π if
and only if the two triangles have the same circumcircle.


A commonly used consequence of the above facts is that if quadrilateral ABCD is
inscribed in a circle, then its opposing angles add up to π. The other important consequence
is that if we have a right triangle ABC and ∠ACB is the right angle, then AB
is a diameter of the circumcircle of ABC. Additionally, there are two useful facts relating
circles and triangles.

Three lines bisecting angles of a triangle intersect in one point. This point is
called the incenter of the triangle and is a center of the inscribed circle; that
is, the largest circle contained within the triangle.

The three perpendicular bisectors of the sides of a triangle meet in one point,
the circumcenter of the triangle.

The altitude of a triangle is a line which passes through a vertex of the triangle and is
perpendicular to the opposite side (possibly extended). There are therefore three altitudes in
a triangle. As a potential application of the above observations, let us show that the three
altitudes of the triangle intersect in a single point called the orthocenter. Note that the
orthocenter is not always inside the triangle; if the triangle is obtuse, it will be outside.
Let us first consider any acute triangle ABC. Take the orthogonal projection A′ of A on
BC and the orthogonal projection B′ of B on AC. Denote the intersection point of altitudes AA′
and BB′ as H (see Figure 6.2). First, note that ∠ABB′ < ∠ABC and
∠BAA′ < ∠BAC so H lies inside triangle ABC. Now, observe that C, A′, H and B′
lie on the circumcircle of triangles HB′C and HA′C (both are right triangles with hypotenuse
HC). Hence, ∠B′A′H = ∠B′CH.
Arguing similarly, we get that A, B′, A′, B lie on a circle and so ∠B′A′A = ∠B′BA.
Since H lies on AA′, the angles ∠B′A′H and ∠B′A′A coincide, and so ∠B′CH = ∠B′BA.
But this means that if C′ is the intersection of CH with AB, then triangles ACC′ and ABB′
are similar (they share the angle at A and ∠ACC′ = ∠ABB′), and so
∠AC′C = ∠AB′B = π/2.

FIGURE 6.2: Orthocenter of the triangle.

The reasoning when ∠BCA is not acute is identical, with the only difference that H lies
outside of triangle ABC if the triangle is obtuse, or H lies on the triangle if the triangle is
right.

SOLUTION

Let F′ be the point on line segment AD such that |F′A| = |FB|. First, note that
∠ABF′ = ∠BCE (triangles ABF′ and BCE are congruent). Hence, lines F′B and CE are
orthogonal and so S lies on the intersection of the two. But this means that
∠SF′F = ∠SCF. (In fact, these two
angles are equal to the previously mentioned two but we do not need this observation here.)
It follows that triangles SF′F and SCF have the same circumcircle and so it contains points
S, F, C, F′. Now, since FF′DC is a rectangle with its three vertices (F′, F, and C) lying on
the circle, the fourth vertex, D, must also lie on this circle. It follows that DF is a diameter
of the circle and, consequently, ∠DSF is a right angle.
FIGURE 6.3: Illustration for Problem 6.1.
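
A quick numerical sanity check of the statement can be reassuring before (or after) looking for a proof. The sketch below is our own illustration: it assumes the unit square with A = (0, 0), B = (1, 0), C = (1, 1), D = (0, 1), takes |BE| = |BF| = t, and computes ∠DSF directly from coordinates. The function name `dsf_angle` is hypothetical.

```python
import math

def dsf_angle(t):
    """Angle DSF (in radians) for the unit square with |BE| = |BF| = t, 0 < t < 1."""
    B, C, D = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)
    E, F = (1.0 - t, 0.0), (1.0, t)
    # S is the orthogonal projection of B on line CE.
    dx, dy = E[0] - C[0], E[1] - C[1]
    k = ((B[0] - C[0]) * dx + (B[1] - C[1]) * dy) / (dx * dx + dy * dy)
    S = (C[0] + k * dx, C[1] + k * dy)
    # Angle between vectors SD and SF.
    ux, uy = D[0] - S[0], D[1] - S[1]
    vx, vy = F[0] - S[0], F[1] - S[1]
    return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)

for t in (0.1, 0.5, 0.9):
    print(t, dsf_angle(t), math.pi / 2)
```

Such a check does not replace the proof, but it confirms that the claim does not depend on the particular choice of t.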

REMARKS

In geometric problems, very often drawing an additional point, line, or circle significantly
helps to find a solution. In our case, introducing an auxiliary point F′ turned out to be very
helpful. How can one think of such a point? It is natural to extend line segment BS beyond
point S and then to notice that the point F′ lying on the intersection of this line with AD forms a
rectangle AF′FB whose diagonal F′B contains line segment BS.

EXERCISES

6.1.1. We are given an acute triangle ABC with ∠ACB = π/3. Let A′ be the orthogonal
projection of A on BC, let B′ be the orthogonal projection of B on AC, and let M be the
middle point of line segment AB. Prove that |A′B′| = |A′M| = |B′M|.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 16.
Solution: our own.)

6.1.2. Consider a square ABCD. Choose point P outside of this square such that ∠CPB
is a right angle. Denote by Q the intersection of AC and BD. Prove that
∠QPC = ∠QPB.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 15.
Solution: our own.)

6.1.3. Point O is the center of the circumcircle of a triangle ABC. Point C′ is the orthogonal
projection of C on AB. Prove that ∠ACC′ = ∠OCB.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 17.
Solution: our own.)
6.2 Congruence
SOURCE

Problem: Student Circle – High School of Stanisław Staszic in Warsaw


Solution: our own

PROBLEM

Consider a rectangle ABCD. We choose point F such that triangle ABF is equilateral and
AF lies inside angle ∠BAD. Similarly, we choose point E such that triangle BCE is
equilateral and BE lies inside angle ∠ABC. Prove that triangle DEF is equilateral.

THEORY

Congruence Two figures or objects are congruent if they have the same shape and size, or if
one has the same shape and size as the mirror image of the other. It is worth remembering
that there are the following conditions for determining congruence between two triangles:

1. three sides of two triangles are equal in length (side-side-side condition);

2. one side and two angles are equal (angle-side-angle or angle-angle-side condition);

3. one angle and two sides associated with this angle are equal (side-angle-side
condition);

4. one angle that is not acute and any two sides are equal.

Importantly, the exception here is when we know one acute angle and two sides of the
triangle but only one of them is adjacent to the angle. In this case, there are actually two
possible triangles that meet those conditions but are not congruent.
SOLUTION

Observe that angles ∠FAD, ∠ECD, and ∠EBF are all equal to π/6. Also
|AD| = |EC| = |EB| and |FA| = |CD| = |BF|. By the side-angle-side condition, this means
that triangles ECD, FAD, and EBF are congruent. In particular, it implies that
|ED| = |FD| = |EF| and so DEF is equilateral.
FIGURE 6.4: Illustration for Problem 6.2.
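
The claim can also be checked with coordinates. The following sketch is our own illustration and assumes the rectangle A = (0, 0), B = (a, 0), C = (a, b), D = (0, b); the positions of E and F follow from the requirement that AF lies inside ∠BAD and BE lies inside ∠ABC. The function name `def_side_lengths` is hypothetical.

```python
import math

def def_side_lengths(a, b):
    """Side lengths |DE|, |EF|, |FD| for the rectangle with sides a and b."""
    D = (0.0, b)
    # ABF is equilateral with F on the same side of AB as D.
    F = (a / 2, a * math.sqrt(3) / 2)
    # BCE is equilateral with E on the same side of BC as A.
    E = (a - b * math.sqrt(3) / 2, b / 2)
    return math.dist(D, E), math.dist(E, F), math.dist(F, D)

for a, b in ((2.0, 1.0), (1.0, 3.0), (5.0, 5.0)):
    print(a, b, def_side_lengths(a, b))
```

For every rectangle tried, the three printed lengths should coincide up to rounding, in agreement with the congruence argument above.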

REMARKS

A common strategy when solving geometry problems is to draw a picture and then try to
write down all lengths and angles that one can possibly calculate (or list all relationships
between them). In our solution, we simply marked the corresponding values for angles and
identified sides that have equal length. When one does it, it is often easy to spot congruences.
Congruence allows us to reason about unknown lengths of sides or unknown angles.

EXERCISES

6.2.1. Suppose that points P and Q lie on sides BC and CD of a square ABCD such that
∠PAQ = π/4. Prove that |BP| + |DQ| = |PQ|.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 4.


Solution: our own.)

6.2.2. Point P lies on a diagonal AC of a square ABCD. Points Q and R are the orthogonal
projections of P on lines CD and DA, respectively. Prove that |BP | = |RQ|.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 1.
Solution: our own.)

6.2.3. Consider an acute triangle ABC where ∠ACB = π/4. Point B′ is the orthogonal
projection of B on AC and point A′ is the orthogonal projection of A on BC. Let H be the
intersection point of AA′ and BB′. Prove that |CH| = |AB|.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 2.


Solution: our own.)

6.3 Similarity
SOURCE
Adaptation of a puzzle mentioned by Peter Winkler while visiting one of the authors of this
book.

PROBLEM

You are given a triangle ABC and an n-element set S of non-overlapping disks, all having
radius 1 and centers lying inside or on the triangle. Disks do not need to lie inside the
triangle as long as their centers are. Moreover, by “non-overlapping” we mean that they can
“touch” each other; that is, we allow the intersection of any two disks to be one point.
Suppose that set S is maximal; that is, there is no disk of radius 1 such that its center lies
inside or on triangle ABC and it does not overlap with any disk from S. Prove that you can
completely cover triangle ABC with 4n disks of radius 1 (of course, this time we allow
them to overlap).
THEORY

Similarities Two figures that have the same shape are said to be similar. Formally, given two
figures A and B lying on the same plane, we say that they are similar if we can transform A
into B only using the following two operations:

reflection with respect to a line (notice that, in particular, a composition of two
reflections can yield a translation or a rotation with respect to a point);

scaling with respect to a point.

In particular, if two figures are similar, then the ratios of the lengths of their corresponding
sides are equal.
For instance, any two circles or any two squares are always similar. In fact, in general, for
a fixed n, all regular n-gons are similar. If we are given two triangles, the rules of similarity
are the same as the rules of congruence, with the only difference that the requirement that the
corresponding lengths of sides are equal is replaced by equality of proportions. Finally, let us
mention that if figure A has to be scaled by a factor of α to be congruent to figure B, then the
ratio between the area of figure B and the area of figure A is α².
SOLUTION
First, let us construct an auxiliary set R of n disks, this time all of them of radius 2.
For each disk D from S, we add to R the disk E that has the same center as D (but radius 2, not
1). Note that R completely covers triangle ABC. Indeed, if point X is lying inside or on the
triangle but is not covered by R, then the distance from X to any of the centers is greater than
2. But this implies that a disk of radius 1, centered at X, would not overlap with any of the
disks in S, contradicting the maximality of set S.
Consider now a triangle A′B′C′ and a set R′ that are similar to triangle ABC and,
respectively, set R but both are shrunk by a factor of two in each dimension. Clearly, triangle
A′B′C′ is completely covered by disks of radius 1 from R′. Hence, it is enough to show that
one can cover triangle ABC with four copies of A′B′C′; indeed, if this can be done, then
triangle ABC can be covered by four copies of R′. But this is easy to do. Let D, E, and F be
the midpoints of line segments AB, AC, and BC, respectively (see Figure 6.5). Now, we
observe that triangle ABC can be partitioned into four triangles ADE, BDF, EFC and
DEF, each of them congruent to A′B′C′. This observation finishes the proof.

FIGURE 6.5: Illustration for Problem 6.3.
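
The last step of the argument, namely that the midpoints split a triangle into four congruent copies scaled by 1/2, can be illustrated with a short computation. The sketch below is our own and compares sorted side lengths (the side-side-side condition); the helper names `midpoint`, `sorted_sides`, and `midpoint_partition` are hypothetical.

```python
import math

def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def sorted_sides(p, q, r):
    """Side lengths of triangle pqr in increasing order."""
    return sorted([math.dist(p, q), math.dist(q, r), math.dist(r, p)])

def midpoint_partition(A, B, C):
    """Sorted side lengths of triangles ADE, BDF, EFC and DEF."""
    D, E, F = midpoint(A, B), midpoint(A, C), midpoint(B, C)
    return [sorted_sides(A, D, E), sorted_sides(B, D, F),
            sorted_sides(E, F, C), sorted_sides(D, E, F)]

A, B, C = (0.0, 0.0), (7.0, 1.0), (2.0, 5.0)
half = [s / 2 for s in sorted_sides(A, B, C)]
for piece in midpoint_partition(A, B, C):
    print(piece, half)
```

Each of the four pieces should have exactly the side lengths of ABC divided by two.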

REMARKS

The key observation in our solution is that any triangle can be partitioned into four copies of
itself (each scaled down by a factor of two in each dimension). Clearly, triangles are not the only
figures with this property.
The solution is cute but it feels that it is far from being optimal. Therefore, it is perhaps
surprising that, in fact, the factor 4 is best possible! In other words, if we replace 4n by
⌊(4 − ϵ)n⌋ for some ϵ > 0, then the property is no longer true, regardless of how close to zero ϵ
is. Formally, for any ϵ > 0 there exists a counter-example, a triangle ABC and an n-element
maximal set S of non-overlapping disks such that more than (4 − ϵ)n disks are needed to
cover triangle ABC. To see this, we consider a very large triangle so that the boundary
effects are negligible. Hence, for a moment, let us forget about the triangle and think about
placing disks on the plane.
First, let us consider a tiling with regular hexagonal tiles; each hexagonal tile consists of
six equilateral triangles of side length r. We carefully choose r such that a unit disk
is just a tiny bit larger than a disk inscribed into one hexagon. Since the altitude of any of the
six equilateral triangles making up the hexagon is arbitrarily close to one, the radius of the
disk, we may assume that r is arbitrarily close to 2/√3 (but it must be a tiny bit smaller).
Formally, we set r := 2/(√3 (1 + f(ϵ))) for some function f : R₊ → R₊ such that f(ϵ) → 0
as ϵ → 0, which will be determined soon. We put disks in every third tile such that their
centers coincide with the corresponding centers of hexagons (see Figure 6.6). This
configuration of disks just barely prevents us from adding any more disks without
overlapping and so it is maximal. The fact that this is the most efficient way to prevent the
addition of a non-overlapping disk is challenging to prove and is way beyond the
scope of this book. The (limiting) ratio between the total area of all unit disks used and the
area of the tiling is

((π ⋅ 1²)/3) / (6 (√3 r²/4)) = π√3 (1 + f(ϵ))² / 18.

FIGURE 6.6: Maximal set of unit disks and covering with disks of radii 2.

Now, the next question is: what is the most efficient way to cover the plane by unit disks?
The answer is as follows: by circumscribing the tiles in some other hexagonal tiling, in which
each hexagon has unit radius (also, as for triangles, often called the circumradius).
Proving this is another difficult task that is beyond the scope of this book. In such a tiling, the
ratio between the total area of all unit disks used and the area of the tiling is

π / (6 (√3/4)) = 4π√3 / 18.

It follows that the ratio between the number of disks used in the second scenario and n, the
number of disks used in the first one, is 4/(1 + f(ϵ))². Hence, one needs more than (4 − ϵ)n
disks to cover the triangle, provided that f(ϵ) is sufficiently close to zero and triangle ABC
is sufficiently large so that the (finite) ratio is close to its limiting counterpart.
EXERCISES

6.3.1. You are given a rectangle that can be covered with n disks of radius r. Prove that it can
be also covered by 4n disks of radius r/2.

6.3.2. You are given an acute triangle ABC. Let B′ be the projection of B on AC and C′ be
the projection of C on AB. Show that ABC and AB′C′ are similar.
6.3.3. Consider two circles, o1 and o2, that intersect at two points, A and B. Let P be a point
on o1 such that AP goes through the center of o1 and Q be a point on o2 such that AQ goes
through the center of o2. Prove that if ∠PAQ = π/2, then |PB|/|BQ| = [o1]/[o2],
where [x] denotes the area of figure x.

6.4 Menelaus’s Theorem


SOURCE

Problem: Wojciech Guzicki Workshop


Solution: our own
PROBLEM

Consider an acute triangle ABC where |AC| < |BC|. Point D lies on line segment BC and
|BD| = |AC|. Points E and F are the middle points of line segments CD and AB,

respectively. Lines AC and EF intersect in point G. Prove that |CE| = |CG|.


THEORY

Before we state the first theorem, we need one definition. A transversal is a line that passes
through two lines at two distinct points.

Menelaus’s Theorem Consider a triangle ABC, and a transversal line that crosses BC, AC,
and AB at points D, E, and F respectively, with D, E, and F distinct from A, B, and C (see
Figure 6.7). Menelaus’s theorem then states that the following relation holds:

|AF| ⋅ |BD| ⋅ |CE| = |FB| ⋅ |DC| ⋅ |EA|.    (6.1)

FIGURE 6.7: Menelaus’s Theorem.

The converse is also true. If points D, E, and F are chosen on lines BC, AC, and AB respectively
(with either exactly one or all three of them lying outside the corresponding sides)
so that (6.1) holds, then D, E, and F are collinear.
In order to see that (6.1) holds, draw the line through C parallel to AB and let K be the point
where it meets the transversal. By similarity of the triangles (BDF and CDK, and then AEF
and CEK), we have

|BD| / |DC| = |BF| / |CK|

and

|AE| / |EC| = |AF| / |CK|.

Now, after solving both equations for |CK|, we get that

(|DC| ⋅ |BF|) / |BD| = (|EC| ⋅ |AF|) / |AE|,

which after rearrangement gives us the desired equality (6.1).
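
Equality (6.1) is easy to test on a concrete configuration. The sketch below is our own illustration: it intersects an arbitrary transversal, given by two hypothetical points U and V, with the three lines containing the sides of the triangle and compares both sides of (6.1). The helper names `line_intersection` and `menelaus_sides` are assumptions made for this example.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / denom,
            (a * (y3 - y4) - (y1 - y2) * b) / denom)

def menelaus_sides(A, B, C, U, V):
    """Both sides of (6.1) for the transversal through U and V."""
    D = line_intersection(B, C, U, V)
    E = line_intersection(A, C, U, V)
    F = line_intersection(A, B, U, V)
    lhs = math.dist(A, F) * math.dist(B, D) * math.dist(C, E)
    rhs = math.dist(F, B) * math.dist(D, C) * math.dist(E, A)
    return lhs, rhs

print(menelaus_sides((0, 0), (2, 0), (0, 2), (3, 0), (0, 1)))
print(menelaus_sides((0, 0), (4, 0), (1, 3), (-1, 1), (5, 2)))
```

In both cases the two returned products should agree up to rounding errors.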


We covered the case when two of the three intersection points lie on sides of the triangle. One
can show that the theorem also holds when none of D, E, and F lies on a side of the triangle.

Ceva’s Theorem A direct consequence of Menelaus’s theorem is the following Ceva’s
theorem. Given a triangle ABC and points F, D, E on sides AB, BC and, respectively, AC,
the lines AD, BE and CF intersect at the same point if and only if

|AF| ⋅ |BD| ⋅ |CE| = |FB| ⋅ |DC| ⋅ |EA|.

We obtain it by writing (6.1) for triangle AFC and line BE, and for triangle BCF and line AD,
and then combining the two equalities side by side so that the lengths involving the common
point of intersection cancel.
SOLUTION

From Menelaus’s theorem (applied twice: to triangle BFE with transversal AC, and to
triangle GEC with transversal AB) we get

|EC| ⋅ |BA| ⋅ |FG| = |FA| ⋅ |EG| ⋅ |BC|    (6.2)

and

|EF| ⋅ |AG| ⋅ |BC| = |AC| ⋅ |EB| ⋅ |GF|.    (6.3)

But |BA| = 2|FA|, |BC| = 2|CE| + |AC|, and |FG| = |EG| + |EF|. After substituting it to
equation (6.2) we get

|EC| ⋅ 2|FA| ⋅ (|EG| + |EF|) = |FA| ⋅ |EG| ⋅ (2|CE| + |AC|).

From this we get

2|EC| ⋅ |EG| + 2|EC| ⋅ |EF| = 2|EG| ⋅ |CE| + |EG| ⋅ |AC|,

and so

2|EC| ⋅ |EF| = |EG| ⋅ |AC|.

It follows that

|EG| = 2|EC| ⋅ |EF| / |AC|,

which we substitute to (6.3), together with |AG| = |AC| + |CG| and |EB| = |AC| + |CE|, to get

|EF| (|AC| + |CG|) (|AC| + 2|EC|) = |AC| (|AC| + |CE|) (2|EC| ⋅ |EF| / |AC| + |EF|).

Finally, after simplification we get the desired equality; namely, |CG| = |CE|.

FIGURE 6.8: Illustration for Problem 6.4.
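
As a sanity check of the statement, one can build the configuration from coordinates and measure the two segments. The sketch below is our own illustration; the sample triangle is an arbitrary acute triangle with |AC| < |BC|, and the function names `line_intersection` and `ce_and_cg` are hypothetical.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / denom,
            (a * (y3 - y4) - (y1 - y2) * b) / denom)

def ce_and_cg(A, B, C):
    """Lengths |CE| and |CG| for the construction of Problem 6.4."""
    ratio = math.dist(A, C) / math.dist(B, C)
    # D lies on segment BC with |BD| = |AC|.
    D = (B[0] + ratio * (C[0] - B[0]), B[1] + ratio * (C[1] - B[1]))
    E = ((C[0] + D[0]) / 2, (C[1] + D[1]) / 2)   # midpoint of CD
    F = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)   # midpoint of AB
    G = line_intersection(A, C, E, F)
    return math.dist(C, E), math.dist(C, G)

print(ce_and_cg((0.0, 0.0), (6.0, 0.0), (2.0, 4.0)))
```

The two printed lengths should coincide, as the solution predicts.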

REMARKS

It is useful to remember the “arrow shaped pattern” that is obtained by two overlapping
triangles. For example, coming back to our problem, Figure 6.8 contains point A which can
be viewed as the “head of the arrow” consisting of two overlapping triangles, AFG and ABC.
In such situations, especially when the problem concerns lengths of certain sections,
Menelaus’s theorem very often turns out to be useful.
EXERCISES

6.4.1. Points D, E and F lie on sides BC , CA and AB of a triangle ABC in such a way that
lines AD, BE and CF intersect in a single point P. Prove that
|AF |/|F B| + |AE|/|EC| = |AP |/|P D|.

(Source of the problem and solution idea: “Delta” monthly, March 2011 – DeltaMi –
Problem 2.)

6.4.2. You are given a triangle ABC where ∠ACB = π/2. On side AC build a square
ACGH, externally to the triangle. Similarly, on side BC build a square CBEF, externally
to the triangle. Show that the point of intersection of AE and BH lies on the line orthogonal
to AB that goes through point C.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 105.
Solution: our own.)
6.4.3. You are given a convex quadrilateral ABCD and a line that intersects lines DA, AB,
BC , and CD in points K, L, M, and N, respectively. Prove that
|DK| ⋅ |AL| ⋅ |BM | ⋅ |CN | = |AK| ⋅ |BL| ⋅ |CM | ⋅ |DN |.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 106.


Solution: our own.)

6.5 Parallelograms
SOURCE

Problem: “Exercises in geometry” (in Polish) by Waldemar Pompe – Problem 24


Solution: our own

PROBLEM

Point P lies inside parallelogram ABCD and ∠ABP = ∠ADP. Show that
∠DAP = ∠DCP.

THEORY

Parallelogram A parallelogram is a quadrilateral with two pairs of parallel sides. By
comparison, a quadrilateral with at least one pair of parallel sides is a trapezoid. Hence, all
parallelograms are trapezoids but the converse is not true. On the other hand, there are some
special sub-families of parallelograms: a rectangle is a parallelogram with four angles of equal
size, a rhombus is a parallelogram with four sides of equal length, and a square is a
parallelogram with four sides of equal length and four angles of equal size (right angles).
There are several basic facts about parallelograms that are often useful:

the opposite sides of a parallelogram are of equal length;

the opposite angles of a parallelogram are equal;

the diagonals of a parallelogram bisect each other;

the diagonals of a parallelogram divide it into four triangles of equal area (in
particular, the area of a parallelogram is twice the area of a triangle created
by one of its diagonals);

any line going through the midpoint of a parallelogram bisects the area.

SOLUTION
Draw an auxiliary point P′ such that PP′ is parallel to CD (and so also to AB) and for
which |CD| = |PP′| (and so |AB| = |PP′| as well), chosen so that DPP′C and ABP′P are
parallelograms (see Figure 6.9). Then,
∠ABP = ∠BPP′, as BP is a transversal that passes the two parallel lines, PP′ and
AB. Moreover, ∠ADP = ∠BCP′, as DP is parallel to CP′ and AD is parallel to
BC. Now, using the assumption of the problem that ∠ABP = ∠ADP we get that
∠BPP′ = ∠BCP′. It follows that triangles BPP′ and BCP′ share one of the sides
(the line segment BP′) and the two angles that are opposite to BP′ are equal and lie on the
same side of BP′. Using the connection between central and inscribed angles discussed in
Section 6.1 we deduce that one can draw a circle through points B, P, C, and P′. Therefore,
∠P′PC = ∠P′BC, as they are inscribed angles which intercept the same arc. But,
clearly, ∠P′PC = ∠DCP (as CP is a transversal that passes two parallel lines, CD
and PP′), and ∠P′BC = ∠PAD (as AP is parallel to BP′ and AD is parallel to BC).
It follows that ∠DCP = ∠PAD and the proof is finished.

FIGURE 6.9: Illustration for Problem 6.5.

REMARKS

In this example, we see one more time how useful it is to add some auxiliary object to the
figure; this time, it is point P′. The idea for adding it comes from the fact that creating
another parallelogram introduces many angles that must be preserved and so many useful
conditions must be satisfied.
EXERCISES

6.5.1. Consider a quadrilateral ABCD. Prove that the sum of distances from any point P
inside this quadrilateral to the lines AB, BC , CD, and DA is constant (that is, does not
depend on the choice of P) if and only if ABCD is a parallelogram.

6.5.2. Consider a triangle ABC such that |AB| = |AC| (that is, an isosceles triangle), AD is
the height of this triangle, and E is the midpoint of AD. Let F be the orthogonal projection
of D on BE. Prove that ∠AFC = π/2.

6.5.3. Consider a triangle ABC. Outside of the triangle, on sides AB and AC, we build
squares ABDE and, respectively, ACFG. Let M and N be the middle points of DG and,
respectively, EF. What are the possible values of the ratio |MN|/|BC|?
(Source of the problem: “Wokół obrotów” book by Waldemar Pompe, Problem 4.22.
Solution: our own.)

6.6 Power of a Point


SOURCE

Problem and solution presented in Deltoid 38 (in Polish) by Joanna Jaszuńska.


PROBLEM

Consider two externally disjoint circles A and B; that is, they are not only disjoint but also
neither of them lies inside the other one. There are two lines ℓ1 and ℓ2 tangent to A and B
selected so that they are not separating A and B. Line ℓi (i ∈ {1, 2}) touches circle A in
point Ai and circle B in point Bi. Now consider a line A1B2. It intersects circle A in point A3
and circle B in point B3. Prove that |A1A3| = |B2B3|.
THEORY

Tangent Line The tangent line to a curve at a given point is the straight line that “just
touches” the curve at that point. (A formal definition is outside the scope of this book.) As a
specific example, consider a circle B and a point A outside of it. Then, there are two lines
through A, ℓ1 and ℓ2, that are tangent to the circle; the intersection of ℓ1 (respectively, ℓ2)
and the circle is precisely one point, B1 (respectively, B2).
We will show that ∠AB1O = ∠AB2O = π/2, where O is the center of the circle.
Due to the symmetry, without loss of generality, it is enough to focus on showing that
∠AB1O = π/2. For a contradiction, suppose that ∠AB1O ≠ π/2; that is, B1 is not
the orthogonal projection of O on line AB1. This implies that there exists another point on
line AB1 that is at the same distance from O as B1; that is, both points lie on the circle. We
assumed however that there was only one point of intersection of AB1 and the circle, and so
we get the desired contradiction.

Now, we are ready to move to the main tool of this section.

Power of a Point We will independently consider two cases.

Case 1: Consider a circle and a point A inside it, together with any line going through A. Let
the points of intersection of this line with the circle be B and C. The product |AB| ⋅ |AC|
then does not depend on the choice of the line and is equal to the square of the radius of the
circle minus the square of the distance from A to O, the center of the circle.
To see this, let us first note that if A = O, then the desired property is trivially true so we
may assume that it is not the case. Now, let us introduce an auxiliary line going through A and
O that intersects the circle at points P and Q (see Figure 6.10). Our goal is to show that the
desired property holds for all lines passing through A (including this auxiliary line OP) but,
clearly, it holds for OP. Indeed, observe that

|AP| ⋅ |AQ| = (|OP| − |AO|) ⋅ (|OQ| + |AO|)
= (|OP| − |AO|) ⋅ (|OP| + |AO|) = |OP|² − |AO|².

So it is enough to show that |AB| ⋅ |AC| = |AP| ⋅ |AQ|. Now, observe that triangles AQB
and ACP are similar as ∠QBC = ∠QPC and ∠CAP = ∠QAB (see the
connection between central and inscribed angles discussed in Section 6.1). Therefore,
|AB|/|AQ| = |AP|/|AC|, which yields the desired property.

FIGURE 6.10: Power of a Point, Case 1: |AB| ⋅ |AC| = |OB|² − |AO|².

Case 2: Now, consider a circle and a point A outside of it. Consider any line going through A
that intersects with the circle. Let the points of intersection of this line with the circle be B
and C. Then, the product |AB| ⋅ |AC| does not depend on the choice of the line and is equal
to the square of the distance from A to O, the center of the circle, minus the square of the
radius of the circle.
First, let us note that possibly B = C (that is, the line intersects the circle at one point and
so the line is, in fact, the tangent line) but this case is rather uninteresting and easy to deal
with. Indeed, as argued above, in this case ∠ABO is a right angle and so we get the
desired property immediately.
FIGURE 6.11: Power of a Point, Case 2: |AB| ⋅ |AC| = |AO|² − |OB|².

Now assume B ≠ C and choose a point P on the same side of line AO as points B and C
such that ∠OPA is a right angle and P lies on the circle (see Figure 6.11). Since
|OB| = |OP|, triangle BOP is an isosceles triangle and so
∠BPO = π/2 − ∠BOP/2. Since ∠OPA is a right angle,
∠BPA = π/2 − ∠BPO = ∠BOP/2. But, as ∠BCP is an inscribed angle and
∠BOP is the central angle that intercepts the same arc, ∠BCP = ∠BOP/2, and so
∠BCP = ∠BPA. Since the two triangles also share the angle at A, this means that triangles
ACP and APB are similar. Therefore,
|AP|/|AC| = |AB|/|AP| and so |AB| ⋅ |AC| = |AP|². But, as ∠OPA is a right angle,
we have that |PA|² = |AO|² − |OP|², which finishes the proof.
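
Both cases of the Power of a Point property can be checked numerically by intersecting a line with a circle. The sketch below is our own illustration: a line through A is described by a direction angle theta (a parametrization we introduce only for this example), and the hypothetical function `secant_product` returns |AB| ⋅ |AC|, or None if the line misses the circle.

```python
import math

def secant_product(A, O, R, theta):
    """|AB| * |AC| for the line through A with direction angle theta.

    The circle has center O and radius R. Returns None when the line
    does not intersect the circle.
    """
    ux, uy = math.cos(theta), math.sin(theta)
    dx, dy = A[0] - O[0], A[1] - O[1]
    # Points A + s*(ux, uy) on the circle satisfy s^2 + 2*b*s + c = 0.
    b = dx * ux + dy * uy
    c = dx * dx + dy * dy - R * R
    disc = b * b - c
    if disc < 0:
        return None
    s1 = -b + math.sqrt(disc)
    s2 = -b - math.sqrt(disc)
    return abs(s1) * abs(s2)

O, R = (0.0, 0.0), 2.0
# Case 1: A inside the circle, expected value R^2 - |AO|^2 = 3.
print([secant_product((1.0, 0.0), O, R, t) for t in (0.0, 0.7, 1.9)])
# Case 2: A outside the circle, expected value |AO|^2 - R^2 = 5.
print([secant_product((3.0, 0.0), O, R, t) for t in (2.9, math.pi)])
```

In each case the product should be the same for every direction that actually meets the circle.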

SOLUTION

Using the Power of a Point property (Case 2 above), we get

|A1B3| ⋅ |A1B2| = |A1O_B|² − |B1O_B|² = |A1B1|²,

where O_B is the center of circle B. Similarly,

|B2A3| ⋅ |B2A1| = |B2O_A|² − |A2O_A|² = |A2B2|²,

where O_A is the center of circle A. Clearly, we have |A1B1| = |A2B2| as line O_AO_B is a
line of symmetry of the two circles, A, B, together with the two lines, ℓ1 and ℓ2. Therefore,
|A1B3| = |B2A3|. But this means that |A1A3| = |A1B2| − |B2A3| = |A1B2| − |A1B3| = |B2B3|.
FIGURE 6.12: Illustration for Problem 6.6.

REMARKS

In this example we used the Power of a Point property. It is a natural tool to try in situations
when we have to prove facts about lengths of sections defined by a circle.

EXERCISES

6.6.1. Two circles intersect in points A and B. Point P is selected on line AB outside of the
circles. Points C and D are locations where tangent lines going through point P touch both
circles. Prove that ∠PCD = ∠PDC.

6.6.2. Consider a convex hexagon ABCDEF such that |AB| = |BC|, |CD| = |DE|, and
|EF | = |F A|. Prove that lines containing altitudes of triangles BCD, DEF , and F AB

from vertices C, E, and A, respectively, intersect in one point.

6.6.3. Consider two points A and B. Take two circles o1 and o2 such that o1 is tangent to AB
in point A, o2 is tangent to AB in point B, and o1 and o2 are externally tangent in point X. If
we allow o1 and o2 to vary, then what is the set of points that contains all possible locations
of X?
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 26.
Solution: our own.)

6.7 Areas
SOURCE

Problem: well-known problem


Solution: our own

PROBLEM
Let A1A2A3A4 be any convex quadrilateral that has area equal to 1. To simplify the notation,
let A0 = A4. Let us introduce four new points: for i ∈ {1, 2, 3, 4}, let A′i be the point on line
Ai A_{i−1} such that A_{i−1} is the midpoint of line segment Ai A′i. Calculate the area of
A′1A′2A′3A′4.
THEORY

In order to be able to deal with areas of some figures, it is often the case that one needs to use
a formula for the area of a triangle. Consider a triangle ABC and denote by H the orthogonal
projection of C on line AB (which does not have to be on the line segment AB). The area of
the triangle, denoted by [ABC], is then

(1/2) ⋅ |AB| ⋅ |CH| = (1/2) ⋅ b ⋅ h.    (6.4)

Here b = |AB| is often called the length of the base of the triangle, and h = |CH | is called
the altitude of the triangle. Although simple, this formula is only useful if the height can be
readily found, which is not always the case. Hence, there are other formulas available. For
example, the shape of the triangle is determined by the lengths of the sides. Therefore, the
area can also be derived from the lengths of the sides by Heron’s formula that we already
used in this book (see Problem 1.9).
A direct consequence of (6.4) is that if the altitude of the triangle is fixed and one only
changes the length of the base, then the area of the triangle changes proportionally. This
implies that if one angle of the triangle is fixed but the two adjacent sides are rescaled by
factors a and b, respectively, then the area of the triangle changes by a factor of a ⋅ b. For a
given angle α ∈ (0, π), this constant factor is typically defined as the area of the triangle
whose sides adjacent to angle α have lengths 1 and 2. It is called the sine of angle α and is
denoted by sin(α). It follows that for any triangle ABC we have

[ABC] = (1/2) ⋅ |AB| ⋅ |AC| ⋅ sin(∠BAC)
= (1/2) ⋅ |BA| ⋅ |BC| ⋅ sin(∠ABC)
= (1/2) ⋅ |CA| ⋅ |CB| ⋅ sin(∠ACB).    (6.5)

The Law of Sines The set of equalities in (6.5) immediately gives us the following important
observation known as the law of sines. For any triangle ABC, we have

|BC| / sin(∠BAC) = |AC| / sin(∠ABC) = |AB| / sin(∠ACB).

This ratio is equal to the diameter of the circumscribed circle of the given triangle. Another
interpretation of this observation is that every triangle with angles α, β, and γ is similar to a
triangle with side lengths equal to sin(α), sin(β), and sin(γ).
Let us finish with two more observations. From the discussion above, we get that
sin(0) = sin(π) = 0, sin(π/2) = 1, sin(α) = sin(π − α) for any α ∈ (0, π), and that the
sine function is increasing in the range from 0 to π/2, and decreasing in the range from π/2
to π.
Moreover, consider a triangle ABC such that ∢ABC = α and ∢BAC = π/2; that
is, ∢BAC is the right angle. Then, [ABC] = |AB||AC|/2 = sin(α)|AB||BC|/2, so
sin(α) = |AC|/|BC|.
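The formulas above are easy to check numerically. The following is only a sketch: the triangle coordinates and all helper names (`dist`, `angle_at`, `area_sine`) are illustrative choices, not part of the text. It compares the sine formula (6.5) at each vertex with the coordinate (shoelace) area, and the three law-of-sines ratios with the circumscribed diameter, computed from the standard identity R = abc/(4[ABC]).

```python
import math

def dist(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_at(p, q, r):
    """Interior angle at vertex p of triangle pqr, in radians."""
    v = (q[0] - p[0], q[1] - p[1])
    w = (r[0] - p[0], r[1] - p[1])
    cross = v[0] * w[1] - v[1] * w[0]
    dot = v[0] * w[0] + v[1] * w[1]
    return math.atan2(abs(cross), dot)

def area_shoelace(a, b, c):
    """Area of triangle abc computed directly from coordinates."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / 2

def area_sine(a, b, c):
    """Formula (6.5) applied at vertex a: half the product of the two
    adjacent sides times the sine of the angle between them."""
    return 0.5 * dist(a, b) * dist(a, c) * math.sin(angle_at(a, b, c))

# Hypothetical sample triangle.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)

areas = [area_shoelace(A, B, C), area_sine(A, B, C),
         area_sine(B, A, C), area_sine(C, A, B)]

# Law of sines: each side divided by the sine of the opposite angle.
ratios = [dist(B, C) / math.sin(angle_at(A, B, C)),
          dist(A, C) / math.sin(angle_at(B, A, C)),
          dist(A, B) / math.sin(angle_at(C, A, B))]

# Diameter of the circumscribed circle, 2R = abc / (2 [ABC]).
diameter = dist(A, B) * dist(B, C) * dist(A, C) / (2 * areas[0])
```

All four entries of `areas` should agree, and all three entries of `ratios` should agree with `diameter`, up to floating-point error.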

SOLUTION

We will use $[A_1 A_2 \ldots A_n]$ to denote the area of an n-gon $A_1 A_2 \ldots A_n$. Let us first observe
that

$$[A_1' A_4 A_4'] = 2[A_1 A_3 A_4], \qquad [A_2' A_2 A_3'] = 2[A_1 A_2 A_3].$$

Hence,

$$\begin{aligned}
[A_1' A_4 A_4'] + [A_2' A_2 A_3'] &= 2[A_1 A_3 A_4] + 2[A_1 A_2 A_3] \\
&= 2([A_1 A_3 A_4] + [A_1 A_2 A_3]) \\
&= 2[A_1 A_2 A_3 A_4] = 2.
\end{aligned}$$

Similarly, since

$$[A_1' A_1 A_2'] = 2[A_2 A_4 A_1], \qquad [A_3' A_3 A_4'] = 2[A_2 A_3 A_4],$$

we get $[A_1' A_1 A_2'] + [A_3' A_3 A_4'] = 2$. Combining all of these together, we conclude that

$$[A_1' A_2' A_3' A_4'] = [A_1 A_2 A_3 A_4] + [A_1' A_4 A_4'] + [A_2' A_2 A_3'] + [A_1' A_1 A_2'] + [A_3' A_3 A_4'] = 5.$$
FIGURE 6.13: Illustration for Problem 6.7.
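The conclusion can be sanity-checked with coordinates. The sketch below assumes, as the solution does, that each $A_i'$ is the reflection of $A_{i-1}$ through $A_i$ (so that $A_i$ is the midpoint of $A_{i-1} A_i'$); the sample quadrilateral and the function names are hypothetical. Since the sample quadrilateral does not have area 1, the code compares the ratio of the two areas with 5.

```python
def shoelace(pts):
    """Area of a simple polygon given as a list of (x, y) vertices in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def extend(quad):
    """Return [A_1', ..., A_4'] where A_i' is the reflection of A_{i-1}
    through A_i (indices taken cyclically, so A_0 = A_4)."""
    return [(2 * quad[i][0] - quad[i - 1][0], 2 * quad[i][1] - quad[i - 1][1])
            for i in range(len(quad))]

# Hypothetical convex quadrilateral A_1 A_2 A_3 A_4 (counterclockwise).
quad = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)]
outer = extend(quad)
area_ratio = shoelace(outer) / shoelace(quad)
```

For this sample the inner area is 9.5 and the outer area is 47.5, so `area_ratio` is 5, in agreement with the solution.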

REMARKS

This problem illustrates two facts that are worth remembering.

Suppose that two triangles, say, ABC and DEF, are such that |AB| = |DE|
and |AC| = |DF|; moreover, ∢BAC + ∢EDF = π. (For example,
triangles $A_1 A_2 A_3'$ and $A_2' A_2 A_3'$ in Figure 6.13 satisfy these properties.)
Then, these triangles have the same area.

Consider any triangle ABC and change the length of AB by a
(multiplicative) factor of α, while keeping the length of AC and the angle
between the two sides, namely, ∢BAC. (Of course, the other two angles
will change, as will the length of the third side. For example, triangles
$A_1 A_2 A_3$ and $A_1 A_2 A_3'$ have these properties with α = 2.) Then, the area of
the triangle changes by a factor of α.

EXERCISES

6.7.1. Let P be an interior point of a triangle ABC. Let lines AP, BP, and CP intersect
sides BC, CA, and AB in points A′, B′ and, respectively, C′. Prove that
|PA|/|AA′| + |PB|/|BB′| + |PC|/|CC′| = 2.

6.7.2. Points E and F lie on sides BC and, respectively, DA of a parallelogram ABCD such
that |BE| = |DF |. Select any point K on side CD. Let P and Q be intersection points of line
F E with lines AK and, respectively, BK . Prove that [AP F ] + [BQE] = [KP Q].
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 45.
Solution: our own.)

6.7.3. Consider a convex quadrilateral ABCD. Select points K and L on side AB such that
|AK| = |KL| = |LB| = |AB|/3. Similarly, select points N and M on side DC such that

|DN | = |N M | = |M C| = |CD|/3. Show that [KLM N ] = [ABCD]/3.

(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 50.)

6.8 Thales’ Theorem


SOURCE

Problem: “Exercises in geometry” (in Polish) by Waldemar Pompe – Problem 56 (slightly
modified)
Solution: our own

PROBLEM

Given a quadrilateral ABCD, choose points K, L, M, N lying on sides AB, BC, CD, and DA,
respectively, such that

|AK|/|KB| = |CL|/|LB| = |AN|/|ND| = |CM|/|MD|.

Prove that KLMN is a parallelogram and that its area is at most half of the area
of ABCD.

THEORY

Thales’ Theorem. Let us highlight the following observation, known as Thales’ theorem or
the intercept theorem, about the ratios of the line segments that are created when two
intersecting lines are intercepted by a pair of parallels. In fact, it is equivalent to the theorem
about ratios in similar triangles. Let lines $A_1 B_1$ and $A_2 B_2$ intersect at point C. Lines
$A_1 A_2$ and $B_1 B_2$ are parallel if and only if

$$\frac{|A_1 C|}{|A_2 C|} = \frac{|B_1 C|}{|B_2 C|}.$$

Point C may lie anywhere on the plane except on the lines $A_1 A_2$ or $B_1 B_2$.
FIGURE 6.14: Thales’ Theorem.

SOLUTION

Denote the common ratio by α; in particular, α = |AK|/|KB|. It follows from Thales’
theorem that the three lines NM, AC, and KL are mutually parallel. Similarly, the three
lines NK, DB, and ML are mutually parallel, so KLMN is a parallelogram.
In order to estimate the area of KLMN (in comparison to the area of ABCD), note that

$$[KLMN] = [ABCD] - [AKN] - [BKL] - [CLM] - [DMN].$$

However, since |KB|/|AB| = |LB|/|CB| = 1/(1 + α), the area of BKL is $1/(1+\alpha)^2$ of
the area of ABC. Considering the remaining three triangles in a similar way (DMN is
$1/(1+\alpha)^2$ of ACD, while AKN and CLM are $\alpha^2/(1+\alpha)^2$ of ABD and CBD,
respectively), we get that the total area of the four triangles is

$$\frac{1}{(1+\alpha)^2} + \frac{\alpha^2}{(1+\alpha)^2}$$

of the area of ABCD. Hence,

$$\begin{aligned}
\frac{[KLMN]}{[ABCD]} &= \frac{(1+\alpha)^2 - 1 - \alpha^2}{(1+\alpha)^2} = \frac{2\alpha}{(1+\alpha)^2} \\
&= 2\left(\sqrt{\frac{1}{1+\alpha} \cdot \frac{\alpha}{1+\alpha}}\right)^2 \le 2\left(\frac{1}{2}\left(\frac{1}{1+\alpha} + \frac{\alpha}{1+\alpha}\right)\right)^2 = \frac{1}{2},
\end{aligned}$$

where the inequality is obtained by the geometric-arithmetic mean inequality. Equality
holds if α = 1.
FIGURE 6.15: Illustration for Problem 6.8.
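The two claims of the solution (KLMN is a parallelogram, and its area is a $2\alpha/(1+\alpha)^2$ fraction of [ABCD]) can be checked on a concrete example. The sketch below uses a hypothetical convex quadrilateral and hypothetical helper names; the parallelogram test uses the fact that a quadrilateral is a parallelogram exactly when its diagonals share a midpoint, that is, K + M = L + N as vectors.

```python
def shoelace(pts):
    """Area of a simple polygon given as a list of (x, y) vertices in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def split(p, q, alpha):
    """Point X on segment pq with |pX| / |Xq| = alpha."""
    t = alpha / (1 + alpha)
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def inner_parallelogram(A, B, C, D, alpha):
    """Points K, L, M, N with |AK|/|KB| = |CL|/|LB| = |AN|/|ND| = |CM|/|MD| = alpha."""
    K = split(A, B, alpha)
    L = split(C, B, alpha)
    M = split(C, D, alpha)
    N = split(A, D, alpha)
    return K, L, M, N

# Hypothetical convex quadrilateral and ratio.
A, B, C, D = (0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)
alpha = 2.0

K, L, M, N = inner_parallelogram(A, B, C, D, alpha)
ratio = shoelace([K, L, M, N]) / shoelace([A, B, C, D])
predicted = 2 * alpha / (1 + alpha) ** 2

# Zero vector if and only if the diagonals KM and LN share a midpoint.
diagonal_gap = (K[0] + M[0] - L[0] - N[0], K[1] + M[1] - L[1] - N[1])
```

With α = 2 both `ratio` and `predicted` equal 4/9, which is below the bound 1/2 attained at α = 1.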

REMARKS

In this problem, since the ratios of the lengths of the corresponding line segments are
preserved, it is natural to try to use Thales’ theorem. Interestingly, one could ask
what conclusion can be drawn if, instead, the following property holds:

|AK|/|KB| = |BL|/|LC| = |CM|/|MD| = |DN|/|NA| =: α.

(Note the reversed proportions for one pair of opposing sides.) This time, KLMN is not a
parallelogram (in general). However, we immediately see that the area of KBL is an
$\alpha/(1+\alpha)^2$ fraction of the area of ABC. Similar properties are also satisfied for the three
other triangles. Therefore, the area of KLMN is equal to $1 - 2\alpha/(1+\alpha)^2$ of the area of
ABCD. Arguing as before, we get that it is at least half of the area of ABCD, and
equality holds when α = 1. In fact, we get a slightly stronger property: the area of the figure
obtained this way added to the area of the figure from our original problem is exactly equal
to the area of ABCD.
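The complementary-area observation can also be checked numerically. The sketch below assumes the cyclic reading of the modified condition, |AK|/|KB| = |BL|/|LC| = |CM|/|MD| = |DN|/|NA| = α, which is the assignment under which every corner triangle is an $\alpha/(1+\alpha)^2$ fraction of the corresponding half of ABCD. The quadrilateral and all names are hypothetical.

```python
def shoelace(pts):
    """Area of a simple polygon given as a list of (x, y) vertices in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def split(p, q, alpha):
    """Point X on segment pq with |pX| / |Xq| = alpha."""
    t = alpha / (1 + alpha)
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def original_points(A, B, C, D, alpha):
    """Ratios of the original problem (measured from A and from C)."""
    return [split(A, B, alpha), split(C, B, alpha),
            split(C, D, alpha), split(A, D, alpha)]

def cyclic_points(A, B, C, D, alpha):
    """Assumed ratios of the remark (measured cyclically around ABCD)."""
    return [split(A, B, alpha), split(B, C, alpha),
            split(C, D, alpha), split(D, A, alpha)]

A, B, C, D = (0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)
alpha = 2.0
total = shoelace([A, B, C, D])

ratio_original = shoelace(original_points(A, B, C, D, alpha)) / total
ratio_cyclic = shoelace(cyclic_points(A, B, C, D, alpha)) / total
```

For α = 2 the two ratios are 4/9 and 5/9, so they add up to 1 as the remark states.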

EXERCISES

6.8.1. Given a parallelogram ABCD, consider points M and N that are the midpoints of sides
BC and CD, respectively. Segment BD intersects AN in point Q, and AM in
point P. Prove that 3|QP| = |BD|.


(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 55.
Solution: our own.)

6.8.2. Points K, L, M, and N are the midpoints of sides AB, BC, CD and, respectively,
DA of a parallelogram ABCD whose area is equal to 1. Let P be the intersection point of
KC and NB, Q be the intersection point of LD with KC, R be the intersection point of
MA with LD, and, finally, S be the intersection point of NB with MA. Calculate the area
of PQRS.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 59.
Solution: our own.)
6.8.3. Points E and F are on sides AB and, respectively, AD of rhombus ABCD. Lines CE
and CF intersect line BD in points K and L, respectively. Line EL intersects side CD in
point P. Line F K intersects side BC in point Q. Prove that |CP | = |CQ|.
(Source of the problem: “Exercises in geometry” by Waldemar Pompe – Problem 62.
Solution: our own.)
Chapter 7
Hints

7.1 Inequalities
7.2 Equalities and Sequences
7.3 Functions, Polynomials, and Functional Equations
7.4 Combinatorics
7.5 Number Theory
7.6 Geometry

In this chapter we provide hints for all exercises presented in the book.

7.1 Inequalities
1.1.1. Observe that a + c = b + (a + c − b) and a ≤ a + c − b ≤ c.
1.1.2. Apply Jensen’s inequality to function $f(k) = k^{s-1}$ and weights
proportional to k.
1.1.3. Apply Jensen’s inequality to function f (x) = √x.
1.2.1. Apply the arithmetic-harmonic mean inequality and observe that equality
holds when a = b = c/2 = d/4.
1.2.2. Apply the arithmetic-geometric mean inequality to the right hand side,
rearrange the terms, and finally apply the geometric-harmonic mean
inequality to get the result.
1.2.3. Divide both sides by 2 and apply the arithmetic-geometric mean
inequality to the left hand side.
1.3.1. Take logarithm of both sides of the inequality and directly apply the
rearrangement inequality.
1.3.2. Divide both sides by abc and then apply the rearrangement inequality to
the obtained inequality.
1.3.3. Take a logarithm of both sides and then apply rearrangement inequality
after simplifying the expression.
1.4.1. In both cases, first invert both sides of the inequality and then apply
Bernoulli’s inequality, observing that n ≥ 2.
1.4.2. Raise both sides to the power of n and then apply Bernoulli’s inequality.
1.5.1. Invert both sides of the inequality and note that $n^2 - n = n \cdot (n - 1)$.
1.5.2. Raise both sides of the inequality to the power of n(n + 1) and rearrange
the obtained inequality.
1.5.3. Use the fact that $(1 + a/n)^n$ is increasing for a ≠ 0.

1.5.4. Multiply both sides by (n + 1)/n . n

1.5.5. To prove the first inequality, use Bernoulli’s inequality. For the second
part, note that the right hand side tends to 3 and the middle term is
bounded from above by e.
1.6.1. Use the binomial expansion of $(1 + x)^i$ and observe that
$(1 + x)^i = 1 + ix + O(x^2)$.
1.6.2. Use the fact that for positive x: $(1 + x/n)^n > (x/n)^n$.
1.7.1. Use the Cauchy-Schwarz inequality for the square roots of a, b, and c.
1.7.2. Use the Cauchy-Schwarz inequality for $x_1 = \sqrt{2a + 1}$, $x_2 = \sqrt{2b + 1}$, and
$x_3 = \sqrt{2c + 1}$, where $y_1 = y_2 = y_3 = 1$. You can alternatively use
Jensen’s inequality.
1.7.3. Bound function $x/(x^2 + 1)$ from above by a linear function passing through points
(1/3, 3/10) and (−3/4, −12/25).

1.8.1. Consider flipping a fair coin 2n times and calculate the probability of
obtaining exactly n heads.
1.8.2. Consider the probability that in n coin tosses there are at least k heads,
provided that the probability of getting a head is equal to p and,
respectively, q.
1.9.1. Consider n + 1 points of the form $(i, \sum_{\ell=1}^{i} a_\ell)$, $i \in \{0, 1, \ldots, n\}$.

1.9.2. Consider the area of the pentagon with each side length equal to 1/2 with
the two diagonals adjacent to the same vertex that have the same length,
x.

7.2 Equalities and Sequences


2.1.1 Subtract the first equation from the third one and subtract the second
equation from the fourth one to derive useful relationships between a, b, c,
and d.
2.1.2 Divide the two equations.
2.1.3 Note that the system is equivalent to:

$$\begin{cases}
(x - (y + z))(x - yz) = 0 \\
(y - (z + x))(y - zx) = 0 \\
(z - (x + y))(z - xy) = 0.
\end{cases}$$

2.2.1 Note that $f(x) := x^3$ is an increasing function.
2.2.2 Note that $f(x) := x$, $g(x) := x^3$, and $h(x) := x^5$ are increasing
functions.
2.2.3 Use the fact that at least one of the variables d, a or b attains the
maximum or the minimum value from the set {a, b, c, d}.
2.3.1 Use the arithmetic-geometric mean inequality to bound the term $x^4 + 3y^2$.
2.3.2 Find an upper bound for $(x + y + z)^2$ and a lower bound for
$x^2 y^2 + y^2 z^2 + z^2 x^2$.

2.3.3 Use the substitution a = xy, b = yz, and c = xz.


2.4.1 Note that $x_{i+1} = x_i (4 - x_i)$, and so $x_i \in [0, 4]$ for $i \in [n]$. Use the
substitution $x_1 = 4 \sin^2(\alpha)$ for some $\alpha \in [0, \pi/2]$.

2.4.2 Use the identity cot(2x) = (cot(x) − tan(x))/2 and then use the
substitution y = tan(x).
2.4.3 Use the identity

$$\cot(y - x) = \frac{1 + \cot(x) \cot(y)}{\cot(x) - \cot(y)}.$$
2.5.1 For each pair of the three equations, subtract one from the other to cancel
out a and one of the squares.
2.5.2 Add $x^{2009} y^{2009}$ to both sides of the equation.
2.5.3 Re-write the equations as follows: $(x_{i+1} - 6)^2 + (x_i - 8)^2 = 50$.
2.6.1 Consider the limit of $x_n^3 / n$ instead.
2.6.2 Show that $(a_n)^2 / 2 \le a_{n+1} < a_n^2$.
2.6.3 Show that the sequence must diverge to ∞ or −∞, unless a = b = 0.
2.7.1 Prove by induction that for all ℓ ∈ N ∪ {0}, $a_{3\ell+1} = 1$ and $a_{3\ell+2} = -1$.
2.7.2 Note that either $a_n = 0$ for some n ∈ N or for all n ∈ N,
$a_{n+1}^2 + 1 = (a_{n+1} - a_n)^2$.
2.7.3 Note that $\sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1} - 1)^2 = 0$.

7.3 Functions, Polynomials, and Functional Equations


3.1.1 Using Vieta’s formulas, write down the six equations involving the values
we are looking for. Then, prove that $b_i \ne 0$ and that $a_1 a_2 a_3 = 1$. Using
this, prove that all $a_i$ must be positive. Finally, show that all $a_i$ are equal to
1.
3.1.2 It follows immediately from Vieta’s formulas (see (3.1)) that
$\sum_{i=1}^{n} x_i^2 = 0$, where $x_i$ are roots of the polynomial f(x). This implies
that if all roots are real, then all of them are equal to 0.
3.1.3 Observe that we have 1

x
4
= 9(x + 2)
2
.
3.2.1 Prove that f (0) = 0 and then that f (−a) = −f (a). Then, consider
possible values of f (2) depending on the arbitrarily chosen value of f (1).
3.2.2 Consider the equation for y = 0 and y = f (x) to get the relations needed
to derive the solution.
3.2.3 Consider pairs of (x, y) of the form (0, 0) , (0, f (0)) , (0, y) , and
(x, f (x)).

3.3.1 Prove that for each x ∈ R we have that $f(x) = f(x + \frac{1}{2})$.
3.3.2 Note that for x ≠ 0 we have that f (x) = (f (1/(1 − x) − 1)/x.
3.3.3 Show that
f (a, b, c) = a + f (0, 0, b) − f (0, 0, a) + f (0, b, c) − f (0, a, b) .
3.4.1 Since there exist a, b ∈ Z such that P(a) = 0 and P(b) = 1, we get that
(b − a) | (P(b) − P(a)), and so |b − a| = 1. Then, define
Q(x) := P(a + (b − a)x) and show that $Q(f_i) = f_i$ for all i ∈ N.

3.4.2 Prove that P (0) is a desired integer root.


3.4.3 Prove that in order for the condition in the problem to hold we must have
that a = 4b + 1 by considering the expression
2

(2k + a − 1)P (k + 1) − (2k + a + 3)P (k).

3.5.1. Consider a polynomial P (q + x) + P (q − x) where q is any rational


number and prove that it must be a constant. From this deduce that P (x)
must have the form P (x) = ax + b for some rational numbers a and b.
3.5.2 Prove the statement by contradiction using the fact that P(x) must have
rational coefficients.
3.5.3 Consider polynomials P (x) := G(x) − F (x) and
Q(x) := H (x) − F (x).

3.6.1 Re-write the polynomial in the following form:

$$f(x) = (x - a)(x - b)(2x^2 + cx + d)$$

for some a, b, c, d ∈ R. Solve the resulting system of equations ensuring
additionally that ab = 2.
3.6.2 Prove first that for sufficiently large integers x, we have that

$$P(x) < P(-x + 1) < P(x + 1).$$

Then, inspect directly the remaining cases.
3.6.3 Prove that P(x) cannot have more than one term by considering
$P(x) = ax^k + bx^\ell + Q(x)$, where a, b ∈ R ∖ {0}, ℓ, k ∈ Z such that
0 ≤ ℓ < k, and Q(x) has degree less than ℓ (or Q(x) = 0 everywhere if
ℓ = 0). Then, consider the case when the polynomial P(x) has only one
term.
3.7.1 Prove first that each $P_i(x)$ has degree at most 1. Then, represent each
$P_i(x)$ as follows: $P_i(x) = (a_i x + b_i)/m$, where $a_i$, $b_i$ (i ∈ [4]), and m
are some integers. Finally, prove that the required equality cannot be
satisfied.
3.7.2 Consider the expression
r(f (p) − f (q)) + p(f (q) − f (r)) + q(f (r) − f (p)) and observe that it
is divisible by n.
3.7.3 Show that this is false for $P(x) := x^3 - 2x$.

7.4 Combinatorics
4.1.1. Greedily select pairs of members that know each other. If you stop
prematurely, since there are no more pairs to select from, then select any
two members that are not yet assigned, say, $A_1$ and $A_2$. Now, show that,
knowing that each of them knows at least n members already assigned, it
is possible to find an already selected pair of people, say, $B_1$ and $B_2$, in
such a way that $A_i$ and $B_i$ (i = 1, 2) know each other. After removing the
pair $(B_1, B_2)$ and adding the pairs $(A_1, B_1)$ and $(A_2, B_2)$, we improve
our assignment and continue the argument, if needed.


4.1.2 Start from any player A. Clearly, at least 6 other players played against A
for the same number of rounds. If two players from this group of 6
players played the very same number of rounds, then we are done.
Otherwise, there are 6 players and only two possible number of rounds
for all games between them. We repeat the argument for the reduced
problem.
4.1.3 Start with a person P that has at least six acquaintances. Consider then a
chain of links between people, starting from this person, that oscillates
between links of type like and dislike, starting with like. Notice that if we
keep extending this chain (in any way!) it will eventually come back to P
as once we enter some vertex we can always leave it. Denote this walk by
W1. If such a walk has even length, then we are finished. Otherwise, we
create another walk, W2; this time, starting from dislike. If its length is
even, then removing it solves the problem. Otherwise, we remove both
W1 and W2 to get the desired property.
4.2.1 Label the grid so that the bottom left cell has label (1, 1) and the top right
one has label (25, 25). Put 1 in a cell with label (i, j) if i + j is divisible
by 3; otherwise, put 0. Observe that each block (regardless whether it is
of size 1 × 6 or 2 × 3) covers precisely two 1’s.
4.2.2 Show that there would have to be 9 horizontal blocks or 9 vertical ones.
But this implies that there are at most 4 blocks of the other type which is
not enough to cover 91 cells.
4.2.3 As usual, label the grid so that the bottom left cell has label (1, 1) and the
top right one has label (10, 10). Put 1 in a cell with label (i, j) if i + j is
even, and 0 otherwise. How many 1’s can be covered by each block?
4.3.1 Each of the four groupings can be achieved by evenly grouping some
permutation of people. So it is enough to count how many permutations
yield the same group.
4.3.2 First, select rows in which the rooks are going to be placed, and then
select the columns for them.
4.3.3 Count first the numbers that are created. Then, observe that one can pair
them in such a way that each pair has the property that the sum of digits
in a given decimal position is 10.
4.3.4 Imagine the procedure backward.
4.4.1 Count the number of possible ways to select a team given an arbitrary
degree distribution in the corresponding friendship graph. Then,
minimize this function to get the desired lower bound.
4.4.2 Let $r_t$ and $b_t$ be the length of a longest red and, respectively, blue path
after t rounds of the game. Show (by induction) that Builder has a
strategy that increases the sum of $r_t$ and $b_t$ by 1 in two rounds; that is, for
each t ∈ N, $r_{2t} + b_{2t} \ge t$. It will follow that $\max\{r_{400}, b_{400}\} \ge 100$.
4.4.3 Select an arbitrary member of the club. We will assign him/her 1 if he/she
knows at least half of the members, and 0 otherwise. Now, remove this
member and all the members that do not match the chosen majority,
leaving at least $4^t/2$ members. Repeat the process on the remaining
subset of members. Observe that this process lasts at least 2t rounds (if
there is only one member left at the beginning of round 2t, we may
assign 0 or 1 arbitrarily). But this means that at least t members have 1
assigned or at least t members have 0 assigned. The last step is to observe
that this set of people satisfies the requirements of the problem.
4.5.1 Estimate the number of increasing arithmetic progressions of length k in
X by $\binom{N}{2} = \frac{N(N-1)}{2} < 2^{k-1}$. Consider then a random partition of X
into two subsets A and B and show that the expected number of k-element
sequences in A and in B is less than 1.
4.5.2 Consider a random tournament. For each ordering, compute the
probability that it has the desired property, namely, $t_i$ won against $t_{i+1}$
for each i ∈ [n − 1].


4.5.3 Color edges at random, uniformly and independently. Compute the
expected number of monochromatic triangles.
4.5.4 For each i ∈ [100], let di be the number of acquaintances the ith person
has. Notice that the average value of di is equal to 2 ⋅ 450/100 = 9. Take
a random ordering of people. Using this ordering we investigate people,
one by one, and we select a person if he or she does not know anyone
already selected. Compute the expected number of people selected
(function of di’s) and then minimize it.
4.6.1 Write down the recursion for the probability of seeing exactly k white
balls in an urn having n balls in total, or perform calculations for the first
few rounds to make a natural conjecture that can then be proved by
induction.
4.6.2 Independently consider the probability that the kth participant won the
tournament of n ski jumpers and that the leader changed exactly once
during the whole event.
4.6.3 Use the Inclusion–Exclusion Principle (4.9).
4.7.1 Consider two cases: a) all five points form a convex pentagon, b) one of
the points lies inside the triangle formed by three other points.
4.7.2 Restrict yourself to coloring a finite number of points, namely, color 13
points that form a regular polygon inscribed in the circle. Prove that any 5
of them contain 3 that form an isosceles triangle.
4.7.3 Count first f(n), the number of two element subsets of an n-element set,
and then find g(n), the maximum number of disjoint two element subsets
of such a set. The desired upper bound is equal to f(n)/g(n). Next, find
a construction achieving this bound (you might want to separately
consider the case when n is even and when n is odd).
4.8.1 Consider two subsets of people, those sitting in even and odd positions at
the table. In one of those sets there must be at least 13 girls.
4.8.2 Consider the cumulative number of aspirins taken up to and including day i.
There are 30 such numbers, all distinct. Now consider those numbers
increased by 14. There are also 30 such numbers, again, all distinct.
There are in total 60 numbers, all of them less than 60, so two of
them must be equal.
4.8.3 Represent each number in the selected set in the form $2^p q$, where
p ∈ N ∪ {0} and q ∈ N is an odd number.

4.9.1 Notice that by adding 1 to all sides on one die and subtracting 1 from all
sides on the other die does not affect the distribution for their sum.
4.9.2 You should obtain $x_n = (2^n - (-1)^n)/3$.
4.9.3 Calculate the probability of getting 10 and 11 using $f(x)^3$, where f(x) is
the generating function for a fair die we have introduced in the solution.
You might want to use a computer to expand the resulting polynomial.
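For hint 4.9.3, the expansion of $f(x)^3$ can be done with a few lines of code. The sketch below assumes that f(x) is the usual generating function of a fair six-sided die, $f(x) = (x + x^2 + \cdots + x^6)/6$; the hint does not restate it, so treat this as an assumption. Polynomials are stored as coefficient lists, and `poly_mul` is a hypothetical helper name.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Assumed generating function of one fair die: coefficient 1/6 at x^1, ..., x^6.
die = [Fraction(0)] + [Fraction(1, 6)] * 6

# f(x)^3: the coefficient at x^k is the probability that three dice sum to k.
three_dice = poly_mul(poly_mul(die, die), die)

p10 = three_dice[10]
p11 = three_dice[11]
```

Exact rational arithmetic avoids rounding; under the stated assumption both `p10` and `p11` equal 27/216 = 1/8.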

7.5 Number Theory


5.1.1 Reduce the problem to showing that gcd(a + b, ab) = 1.
5.1.2 Prove that a + b divides the square of gcd(a, b).
5.1.3 Consider the divisors of n + f (n).
5.2.1 Observe that for any prime p > 3 we have $p^2 \equiv 1 \pmod{3}$.
5.2.2 Observe that 504 = 8 ⋅ 9 ⋅ 7. Next, independently show that the product is
divisible by 8, 9, and 7.
5.2.3 For n < 10, the claim clearly holds for k = 1. For n > 10, one can prove
that $n^{4k+1} \equiv n \pmod{10}$ for all k. Then, it is enough to prove that in
the sequence $(n^{4k+1})_{k \in N}$, each digit from 1 to 9 has to occur at the first
position in its decimal representation.


5.3.1 Prove that p | a and p | b by considering $(a + b)^2 - (a^2 + b^2)$.
5.3.2 Factor the expression (ab + cd) − (ad + bc).
5.3.3 Use the following observation (that often turns out to be useful)

$$a^4 + 4b^4 = (a^2 + 2b^2 - 2ab)(a^2 + 2b^2 + 2ab)$$

to get that

$$\begin{aligned}
n^{12} + 64 &= (n^6 - 4n^3 + 8)(n^6 + 4n^3 + 8) \\
&= (n^2 + 2n + 2)(n^4 - 2n^3 + 2n^2 - 4n + 4) \\
&\quad \cdot (n^2 - 2n + 2)(n^4 + 2n^3 + 2n^2 + 4n + 4).
\end{aligned}$$

Then, show that all the factors are different.


5.4.1 Use Fermat’s little theorem to prove that $a^{64} - a^4$ is divisible by 5 and
then that it is also divisible by 7.
5.4.2 First note that $\sum_{j=0}^{i} 2^j = 2^{i+1} - 1$ and then that 2 and n are co-prime.
5.4.3 ϕ(100) = 40.
5.5.1 Observe that 3 does not divide 2k.
5.5.2 Note that $|20^1 - 9^1| = 11$ and that $|20^m - 9^n| \equiv 1$ or $9 \pmod{10}$.
Then, show that $|20^m - 9^n|$ cannot be equal to 1 nor to 9.
5.5.3 Note that d | m − n and that $m^3 + 1 = m^2 (m - n) + (m^2 n + 1)$.
The proof for $n^3 + 1$ holds by symmetry.

5.6.1 Separately consider the case when x is even and when x is odd.
5.6.2 Note first that one may assume that $x_i$ and $y_i$ are co-prime for each
i ∈ [2011]. Next, consider the remainder of each term in the product when
divided by 3.
5.6.3 Select any number a ∈ S and show that it is possible to select b ∈ S so
that the remainder when dividing a + b by n is in S. Then, show that
having a and b fixed, one can select c ∈ S so that a + c, b + c and
a + b + c satisfy the desired conditions.

5.7.1 Consider the unique factorization of n: $n = \prod_{i=1}^{k} p_i^{\ell_i}$ for some sequence
of prime numbers $2 \le p_1 < p_2 < \ldots < p_k$ and $\ell_i \in N$ for i ∈ [k]. Use
the fact that the sum of all positive divisors of n is equal to $\prod_{i=1}^{k} \sum_{j=0}^{\ell_i} p_i^j$
since each divisor of n has the unique representation $\prod_{i=1}^{k} p_i^{j_i}$, where
$j_i \in \{0, 1, \ldots, \ell_i\}$, and two different divisors have different
representations.
5.7.2 Show first that there is no solution for n = 1 nor for n = 2. Next, find
solutions for n ∈ {3, 4, 5}. Finally, show that if there is a solution for n,
then there is one for n + 3.
5.7.3 Denote by D(n) the difference between the sum of white and the sum of
black divisors of n. Prove that D(p ⋅ q) = D(p) ⋅ D(q) when p and q are
co-prime. From this conclude that it is enough to show that $D(q^k)$, where
q is a prime number, is not equal to 0 to show that no such numbers exist.


5.8.1 Note that the equation can be rewritten as follows:

$$(2y - 1)^2 = (2x^2 - x)^2 - (x - 1)^2.$$

5.8.2 Consider cases n = 1, n = 2, and n ≥ 3 separately. For the case n = 2,


use divisibility by 3. For the case n ≥ 3, use substitution z = x − y.
5.8.3 Add the two equations together and rearrange the outcome to get the
following form:

(ab − 1)(c − 1) + (a − 1)(b − 1) + (xy − 1)(z − 1) + (x − 1)(y − 1 ) = 4.

Observe that each term at the left hand side is non-negative.
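The factorization used in hint 5.3.3 is easy to get wrong by a sign, so a quick machine check is worthwhile. The sketch below (function names are illustrative) verifies the identity $a^4 + 4b^4 = (a^2 + 2b^2 - 2ab)(a^2 + 2b^2 + 2ab)$ and the resulting four-factor decomposition of $n^{12} + 64$ on a range of integers. It checks the identities only; it does not address the remaining step of the hint, namely that the factors are different.

```python
import math

def sophie_germain_factors(a, b):
    """The two factors of a^4 + 4 b^4 from the identity in hint 5.3.3."""
    return (a * a + 2 * b * b - 2 * a * b, a * a + 2 * b * b + 2 * a * b)

def four_factors(n):
    """The four integer factors of n^12 + 64 listed in hint 5.3.3."""
    return (n**2 + 2 * n + 2,
            n**4 - 2 * n**3 + 2 * n**2 - 4 * n + 4,
            n**2 - 2 * n + 2,
            n**4 + 2 * n**3 + 2 * n**2 + 4 * n + 4)

# Integer arithmetic is exact, so equality can be tested directly.
identity_ok = all(math.prod(sophie_germain_factors(a, b)) == a**4 + 4 * b**4
                  for a in range(-10, 11) for b in range(-10, 11))
factorization_ok = all(math.prod(four_factors(n)) == n**12 + 64
                       for n in range(-50, 51))
```

Taking a = n³ and b = 2 in the identity gives the first line of the factorization, since then a⁴ + 4b⁴ = n¹² + 64.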

7.6 Geometry
6.1.1 Observe that points A, B′, A′, and B lie on a circle and that M is the center
of this circle.
6.1.2 Observe that P, B, Q, and C lie on a circle.
6.1.3 Compute angles ∢COB and ∢CAB.
6.2.1 Consider point R inside the angle ∢PAQ such that
|AR| = |AB| = |AD| and ∢BAP = ∢PAR. Then, notice that
∢DAQ = ∢QAR.
6.2.2 Since RPQD is a rectangle, |RQ| = |PD|.
6.2.3 Observe that |BB′| = |CB′| and that ∢B′A′H = ∢B′BA.
6.3.1 Adapt the approach that is used for the original problem presented in this
section.
6.3.2 Observe that you can inscribe BCB′C′ in a circle.
6.3.3 First prove that P, B, and Q are collinear. Next notice that APQ, ABQ
and ABP are similar.


6.4.1 Apply Menelaus’s theorem to triangle ABD and line FP , and then to
triangle ACD and line EP .
6.4.2 Note that AC is parallel to BE. Then, calculate the ratio in which
line AE divides side BC using Thales’ theorem (see Section 6.8).
Similarly, calculate the ratio in which side AC is divided by BH.
Let C′ be the orthogonal projection of C on AB. Finally, calculate
|AC′|/|C′B| and get the desired result using Ceva’s theorem.

6.4.3 Add line BD to the plot and apply Menelaus’s theorem twice.
6.5.1 Prove that if two half-lines, $\ell_1$ and $\ell_2$, are not parallel and have a common
origin at point A, then any two points $B \in \ell_1$ and $C \in \ell_2$ such that
|AB| = |AC| have the property that all points lying on the line segment BC
have the same total distance from lines $\ell_1$ and $\ell_2$.
6.5.2 Add point X such that ADCX is a rectangle. Note that B, F, E, and P
are collinear and so ∢DFX = π/2. Finally, observe that points D, F, A,
X, and C lie on the same circle.
6.5.3 Add point P such that EAGP is a parallelogram. Observe then that
CF P E is also a parallelogram. Finally, note that N is the middle point of

the line segment P C , and M is the middle point of the line segment P B.
6.6.1 Prove that |P C| = |P D|.
6.6.2 Consider three circles: circle k1 with center in D and going through points
C and E, circle k2 with center in F and going through points E and A, and
circle k3 with a center in B and going through points A and C. Consider
now three sets of points. The first one, l12, has the same power with
respect to k1 and k2. Prove that l12 is a line that contains the altitude of
FEB going through E. Similarly, define l13 and l23, and prove that they
contain the remaining altitudes.


6.6.3 Draw a line tangent to o1 (and so also to o2) in point X. Denote by Y the
point it intersects line AB. Now, consider power of point Y with respect
to o1 and o2.
6.7.1 Consider the areas of triangles P AB, P BC , and P AC .
6.7.2 Note that [ABCD]/2 = [ABEF ] = [AKB].
6.7.3 Note first that [DAK] = [DAB]/3 and [BMC] = [BDC]/3 so
[KBMD] = 2[ABCD]/3. Observe then that L and N bisect KB and
DM.

6.8.1 Consider triangles ABQ and DQN , and then triangles AP D and BM P .
6.8.2 Denote by C′ the point of intersection of line CK with line AD. Observe
that |AC′| = |DA|, and that NC′ and BC are parallel. Use those facts
and apply Thales’ theorem to triangles C′PN and BCP, and then
calculate [BCP]. Apply the same process to triangles DQC, ARD, and
BSA.

6.8.3 Show that |DP | = |BQ| = |F D| ⋅ |EB|/|BC|. In order to show this


equality for |DP |, use Thales’ theorem for triangles F LD and BCL,
and then apply it for triangles LP D and EBL. The argument for |BQ| is
analogous.
Chapter 8
Solutions

8.1 Inequalities
8.2 Equalities and Sequences
8.3 Functions, Polynomials, and Functional Equations
8.4 Combinatorics
8.5 Number Theory
8.6 Geometry

In this chapter we provide solutions for all exercises presented in the book.

8.1 Inequalities
Problem 1.1.1. Prove that for any a, b, c ∈ R such that 0 < a ≤ b ≤ c,

$$\frac{1}{a} - \frac{1}{b} + \frac{1}{c} \ge \frac{1}{a + c - b}.$$

Illustrate the solution graphically. Does the same inequality hold for any function f : R → R that is
convex on some connected subset of R?
Solution. Let us observe first that a + c = b + (a + c − b) and a ≤ a + c − b ≤ c (see Figure 8.1 for
an illustration of these observations; the length of the dashed line is equal to
(f(a) + f(c))/2 − (f(b) + f(a + c − b))/2). If a = c (and so, in fact, a = b = c), then both sides
are equal to 1/a and we are done. If a < c, then we note that for any convex function f (in particular,
for f(x) = 1/x, x > 0) we have

$$\begin{aligned}
f(a) + f(c) &= \frac{c-b}{c-a} f(a) + \frac{b-a}{c-a} f(a) + \frac{b-a}{c-a} f(c) + \frac{c-b}{c-a} f(c) \\
&\ge f\left(\frac{c-b}{c-a}\, a + \frac{b-a}{c-a}\, c\right) + f\left(\frac{b-a}{c-a}\, a + \frac{c-b}{c-a}\, c\right) \\
&= f(b) + f(a + c - b).
\end{aligned}$$

Hence, the desired inequality holds and, moreover, the analogous inequality
f(a) + f(c) ≥ f(b) + f(a + c − b) is true for any convex function f. For a
graphical illustration see Figure 8.1.
FIGURE 8.1: Illustration for Problem 1.1.1 (case b < a + c − b). We take A = (a, 1/a), B = (b, 1/b), C = (c, 1/c),
D = (a + c − b, 1/(a + c − b)).
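A randomized check is no substitute for the proof, but it is a quick way to catch a misread inequality. The sketch below (the sampling range, the seed, and the name `gap` are arbitrary choices) evaluates f(a) + f(c) − f(b) − f(a + c − b) for f(x) = 1/x on many sorted triples and records the smallest value seen.

```python
import random

def gap(f, a, b, c):
    """f(a) + f(c) - f(b) - f(a + c - b); nonnegative for convex f when a <= b <= c."""
    return f(a) + f(c) - f(b) - f(a + c - b)

def reciprocal(x):
    return 1.0 / x

random.seed(0)  # fixed seed so that the experiment is reproducible
smallest_gap = min(
    gap(reciprocal, *sorted(random.uniform(0.1, 10.0) for _ in range(3)))
    for _ in range(10_000)
)

# Degenerate case a = b = c: both sides coincide.
equal_case = gap(reciprocal, 2.0, 2.0, 2.0)
```

The smallest gap should never be negative beyond floating-point noise, and the degenerate case gives exactly zero.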

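Problem 1.1.2. Prove that for any n ∈ N and any real number s ≥ 2, the following inequality holds:

$$\frac{\sum_{k=1}^{n} k^s}{\sum_{k=1}^{n} k} \ge \left(\frac{2}{3} n + \frac{1}{3}\right)^{s-1}.$$

Solution. It follows immediately from Jensen’s inequality, applied to function $f(k) = k^{s-1}$ and
weights $a_k = k / \sum_{i=1}^{n} i$, that

$$\begin{aligned}
\frac{\sum_{k=1}^{n} k^s}{\sum_{k=1}^{n} k}
&= \sum_{k=1}^{n} \frac{k}{\sum_{i=1}^{n} i}\, k^{s-1}
\ge \left(\sum_{k=1}^{n} \frac{k}{\sum_{i=1}^{n} i}\, k\right)^{s-1}
= \left(\frac{\sum_{k=1}^{n} k^2}{\sum_{i=1}^{n} i}\right)^{s-1} \\
&= \left(\frac{n(n+1)(2n+1)/6}{n(n+1)/2}\right)^{s-1}
= \left(\frac{2}{3} n + \frac{1}{3}\right)^{s-1}.
\end{aligned}$$

(Note that $f(k) = k^{s-1}$ is convex for any s ≥ 2.)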
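As an illustration, the two sides of the inequality can be tabulated for small n and a few values of s. The function names below are hypothetical; the comparison uses a small relative tolerance because for s = 2 the two sides are equal (both are (2n + 1)/3).

```python
import math

def lhs(n, s):
    """Sum of k^s divided by sum of k, for k = 1..n."""
    return sum(k**s for k in range(1, n + 1)) / sum(range(1, n + 1))

def rhs(n, s):
    """Right hand side ((2/3) n + 1/3)^(s - 1)."""
    return (2 * n / 3 + 1 / 3) ** (s - 1)

# The inequality should hold (up to rounding) on the whole grid.
holds = all(lhs(n, s) >= rhs(n, s) * (1 - 1e-12)
            for n in range(1, 31)
            for s in (2, 2.5, 3, 4.75, 7))

# For s = 2 Jensen's inequality is applied to a linear function: equality.
equality_at_s2 = all(math.isclose(lhs(n, 2), rhs(n, 2), rel_tol=1e-12)
                     for n in range(1, 31))
```

The equality at s = 2 reflects the fact that $f(k) = k^{s-1}$ is then linear, so Jensen’s inequality is tight.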
Problem 1.1.3. Prove that for any $x \in \mathbb{R}_+$,

$$\sqrt{x} + \sqrt{x + 2} < 2\sqrt{x + 1}.$$

Solution. Since $f(x) = \sqrt{x}$ is concave, we get from Jensen’s inequality that for any $x \in \mathbb{R}_+$

$$\sqrt{x} + \sqrt{x + 2} = 2\left(\frac{1}{2}\sqrt{x} + \frac{1}{2}\sqrt{x + 2}\right) \le 2\sqrt{\frac{1}{2} x + \frac{1}{2}(x + 2)} = 2\sqrt{x + 1}.$$

Since $f(x) = \sqrt{x}$ is strictly concave (it is not linear on any interval) and we applied the
inequality with $x_1 = x \ne x + 2 = x_2$, the inequality is in fact strict.
Problem 1.2.1. Show that for any $a, b, c, d \in \mathbb{R}_+$, the following inequality holds:

$$(a + b + c + d)\left(\frac{1}{a} + \frac{1}{b} + \frac{4}{c} + \frac{16}{d}\right) \ge 64.$$

When does equality hold?
Solution. Observe that the inequality can be rewritten as follows:

$$\frac{a + b + 2 \cdot \frac{c}{2} + 4 \cdot \frac{d}{4}}{8} \ge \frac{8}{\frac{1}{a} + \frac{1}{b} + 2 \cdot \frac{1}{c/2} + 4 \cdot \frac{1}{d/4}}.$$

The right hand side of the above inequality is the harmonic mean of 8 terms,
H(a, b, c/2, c/2, d/4, d/4, d/4, d/4), and the left hand side is the arithmetic mean
A(a, b, c/2, c/2, d/4, d/4, d/4, d/4). Using the arithmetic-harmonic mean inequality we immediately get that
this inequality holds for any $a, b, c, d \in \mathbb{R}_+$, and that equality is achieved when
a = b = c/2 = d/4.
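The equality case is a convenient test of the rewriting above. The sketch below (hypothetical function name, arbitrary sampling range and seed) evaluates the product at the claimed equality point a = b = c/2 = d/4 and at random positive inputs.

```python
import random

def product(a, b, c, d):
    """(a + b + c + d) * (1/a + 1/b + 4/c + 16/d)."""
    return (a + b + c + d) * (1 / a + 1 / b + 4 / c + 16 / d)

# Claimed equality case a = b = c/2 = d/4, here with a = 1.
at_equality = product(1.0, 1.0, 2.0, 4.0)

random.seed(1)  # fixed seed for reproducibility
smallest = min(product(*(random.uniform(0.1, 10.0) for _ in range(4)))
               for _ in range(10_000))
```

At the equality point the product is 8 ⋅ 8 = 64, and no sampled point should fall below this value.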

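Problem 1.2.2. Show that for any n numbers $a_1, \ldots, a_n \in \mathbb{R}_+$, the following inequality holds:

$$\frac{a_1}{a_2 + 1} + \frac{a_2}{a_3 + 1} + \cdots + \frac{a_{n-1}}{a_n + 1} + \frac{a_n}{a_1 + 1} \ge \frac{n^2}{n + \alpha},$$

where $\alpha = \sum_{i=1}^{n} 1/a_i$.
Solution. In order to simplify the notation, let us set $a_{n+1} = a_1$. Using the arithmetic-geometric mean
inequality, we get that

$$\sum_{i=1}^{n} \frac{a_i}{a_{i+1} + 1} = n \cdot \frac{\sum_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}}{n} \ge n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}}.$$

However,

$$n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_{i+1} + 1}} = n \sqrt[n]{\prod_{i=1}^{n} \frac{a_i}{a_i + 1}} = n \sqrt[n]{\prod_{i=1}^{n} \frac{1}{1/a_i + 1}}.$$

Finally, using the geometric-harmonic mean inequality, we get that

$$\sqrt[n]{\prod_{i=1}^{n} \frac{1}{1/a_i + 1}} \ge \frac{n}{\sum_{i=1}^{n} (1 + 1/a_i)} = \frac{n}{n + \alpha},$$

and so the desired inequality holds.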


Problem 1.2.3. Prove that for any $a, b \in \mathbb{R}_+$, for which $ab = 1$, we have that
\[ a^m + b^m \ge 2, \]
where $m \in \mathbb{R}_+$.

Solution. It follows immediately from the arithmetic-geometric mean inequality that
\[ a^m + b^m = 2 \cdot \frac{a^m + b^m}{2} \ge 2\sqrt{a^m b^m} = 2. \]
Problem 1.3.1. Prove that for any $a, b \in \mathbb{R}_+$,
\[ a^b\, b^a \le a^a\, b^b. \]
Solution. The inequality we aim to prove is equivalent to the following one

b log(a) + a log(b ) ≤ a log(a) + b log(b ) .

Without loss of generality, due to the symmetry, we may assume that a ≤ b and so log(a) ≤ log(b) .
Now, the inequality above follows immediately from the rearrangement inequality.
Problem 1.3.2. Prove that for any $a, b, c \in \mathbb{R}_+$,
\[ \frac{ab}{c} + \frac{bc}{a} + \frac{ca}{b} \ge a + b + c. \]
Solution. After dividing both sides by $abc \in \mathbb{R}_+$ we get the following inequality:
\[ \frac{1}{c}\cdot\frac{1}{c} + \frac{1}{a}\cdot\frac{1}{a} + \frac{1}{b}\cdot\frac{1}{b} \ge \frac{1}{b}\cdot\frac{1}{c} + \frac{1}{c}\cdot\frac{1}{a} + \frac{1}{a}\cdot\frac{1}{b}. \]
Due to the symmetry, without loss of generality, we may assume that $1/a \le 1/b \le 1/c$. As in the previous problem, the above inequality follows immediately from the rearrangement inequality.
Problem 1.3.3. Prove that for any $a, b, c \in \mathbb{R}_+$,
\[ a^a\, b^b\, c^c \ge (abc)^{(a+b+c)/3}. \]

Solution. Without loss of generality, we may assume that $a \le b \le c$. After taking the logarithm of both sides of the inequality we get
\[ a\log(a) + b\log(b) + c\log(c) \ge \frac{a + b + c}{3}\,\bigl(\log(a) + \log(b) + \log(c)\bigr). \]
After multiplying both sides by 3 and rearranging the terms, we get

2a log(a) + 2b log(b) +2c log(c)

≥ (b + c) log(a) + (a + c) log(b) + (a + b) log(c).

By rearrangement inequality, we get that

a log(a) + b log(b) + c log(c ) ≥ c log(a) + a log(b) + b log(c)

and that

a log(a) + b log(b) + c log(c ) ≥ b log(a) + c log(b) + a log(c ) .

These two inequalities imply immediately the desired inequality above.


Problem 1.4.1. Prove that for any integer $n > 1$,

a) $\left( \dfrac{n}{n+2} \right)^{n^2 - n} < \dfrac{1}{2n-1}$,

b) $\left( \dfrac{n-1}{n} \right)^{n^2 - 1} < \dfrac{1}{n+2}$.

Solution. In both parts, we first invert both sides of the inequality and then apply Bernoulli’s inequality. To show part a), that is, to show that
\[ 2n - 1 < \left( 1 + \frac{2}{n} \right)^{n^2 - n}, \]
we apply Bernoulli’s inequality to get that
\[ \left( 1 + \frac{2}{n} \right)^{n^2 - n} > 1 + \frac{2}{n}\,(n^2 - n) = 2n - 1. \]
(Since $n \ge 2$, $n^2 - n = n(n-1) \ge 2 > 1$.) Similarly, in order to show part b), that is, to show that
\[ n + 2 < \left( 1 + \frac{1}{n-1} \right)^{n^2 - 1}, \]
we again apply Bernoulli’s inequality to get that
\[ \left( 1 + \frac{1}{n-1} \right)^{n^2 - 1} > 1 + \frac{1}{n-1}\,(n-1)(n+1) = n + 2. \]
(Since $n \ge 2$, $n^2 - 1 \ge 3 > 1$.)
Problem 1.4.2. Prove that for any real number $x > -1$ and $n \in \mathbb{N}$,
\[ \sqrt[n]{1 + x} \le 1 + \frac{x}{n}. \]
Solution. Raising both sides to the power of $n$ yields an equivalent inequality,
\[ 1 + x \le \left( 1 + \frac{x}{n} \right)^{n}, \]
that follows directly from Bernoulli’s inequality.


Problem 1.5.1. Prove that for any integer $n > 2$,
\[ \left( \frac{n}{n+2} \right)^{n^2 - n} < \frac{1}{4^{\,n-1}}. \]
Can the constant 4 be improved for large $n$?

Solution. It is enough to prove that
\[ 4^{\,n-1} < \left( 1 + \frac{2}{n} \right)^{n^2 - n} = \left( \left( 1 + \frac{2}{n} \right)^{n} \right)^{n-1}. \]
Since for $n > 2$ we have that $(1 + 2/n)^n > (1 + 2/2)^2 = 4$, the above inequality holds. Finally, note that since $\lim_{n \to \infty} (1 + 2/n)^n = e^2$, the constant 4 can be replaced by any number smaller than $e^2 \approx 7.39$. Such an inequality would hold for large enough $n$.


Problem 1.5.2. For which $n \in \mathbb{N}$ do we have that
\[ \sqrt[n]{n} > \sqrt[n+1]{n+1}\,? \]
Solution. Raising both sides of the inequality to the power of $n(n+1)$ gives us $n^{n+1} > (n+1)^n$. After dividing both sides by $n^n$, we see that it is enough to show that $n > (1 + 1/n)^n$. Since $(1 + 1/n)^n \le (\exp(1/n))^n = e < 3$, this inequality holds for any $n \ge 3$. The remaining two cases can be checked by hand: the inequality does not hold for $n = 1$ (as $1 < 2$), and it does not hold for $n = 2$ either (as $2 < (3/2)^2 = 9/4$). We get that the inequality holds precisely for the natural numbers $n \ge 3$.
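The reduction to the integer comparison $n^{n+1} > (n+1)^n$ makes the small cases easy to check exactly. The following is a minimal Python sketch of such a check; the function name is illustrative and not taken from the book.

```python
def root_inequality_holds(n: int) -> bool:
    """Return True when n**(1/n) > (n+1)**(1/(n+1)).

    Raising both sides to the power n(n+1) gives the equivalent
    comparison n**(n+1) > (n+1)**n, which Python evaluates exactly
    with arbitrary-precision integers.
    """
    return n ** (n + 1) > (n + 1) ** n


# List the values of n in a small range for which the inequality holds.
print([n for n in range(1, 11) if root_inequality_holds(n)])
```

Working with integers rather than floating-point roots avoids any rounding issues near the boundary cases.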
Problem 1.5.3. For which $n \in \mathbb{N}$ do we have that
\[ (n-1)^n (n+1)^{n+1} > n^{2n+1}\,? \]

Solution. After rearranging the inequality, we get that
\[ \left( 1 - \frac{1}{n^2} \right)^{n} > \frac{n}{n+1} = \frac{1}{1 + 1/n} \]
or, equivalently,
\[ \left( 1 - \frac{1}{n^2} \right)^{n^2} > \frac{1}{(1 + 1/n)^n}. \]
Since $(1 - 1/n^2)^{n^2} \le (\exp(-1/n^2))^{n^2} = 1/e$, the left hand side is at most $1/e$. Similarly, since $(1 + 1/n)^n \le (\exp(1/n))^n = e$, the right hand side is at least $1/e$. As a result, the original inequality does not hold for any $n \in \mathbb{N}$.


Problem 1.5.4. For which $n \in \mathbb{N}$ do we have that
\[ n^n > (n+1)^{n-1}\,? \]

Solution. After multiplying both sides by $(n+1)/n^n$ we get $n + 1 > (1 + 1/n)^n$. Since $(1 + 1/n)^n \le (\exp(1/n))^n = e < 3$, the inequality holds for any $n \ge 2$. One can directly check that it does not hold for $n = 1$.


Problem 1.5.5. Prove that for any $n \in \mathbb{N}$,
\[ 2 \le \left( 1 + \frac{1}{n} \right)^{n} \le 3 \cdot \frac{n+1}{n+2}. \]

Solution. The first inequality follows immediately from Bernoulli’s inequality:
\[ \left( 1 + \frac{1}{n} \right)^{n} \ge 1 + n \cdot \frac{1}{n} = 2. \]
In order to show the second inequality, observe that $(1 + 1/n)^n \le (\exp(1/n))^n = e < 2.72$. On the other hand, since $3(n+1)/(n+2)$ is increasing and tending to 3 as $n \to \infty$, the desired inequality holds for $n$ large enough. In fact, it certainly holds for $n \ge 9$, as $3(n+1)/(n+2) = 30/11 > 2.72$ if $n = 9$. One can easily inspect the 8 missing cases $n \in [8]$ to show that the inequality holds for all $n \in \mathbb{N}$ (in fact, we get equality for $n \in \{1, 2\}$).
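The eight small cases can be inspected exactly with rational arithmetic. The sketch below is one possible way to do it; the helper name `bounds_hold` is an assumption made for illustration.

```python
from fractions import Fraction


def bounds_hold(n: int) -> bool:
    """Check 2 <= (1 + 1/n)**n <= 3(n+1)/(n+2) exactly for a given n."""
    middle = Fraction(n + 1, n) ** n          # (1 + 1/n)^n as an exact fraction
    upper = Fraction(3 * (n + 1), n + 2)      # 3(n+1)/(n+2)
    return 2 <= middle <= upper


# The cases n = 1, ..., 8 that the analytic argument leaves open.
small_cases = {n: bounds_hold(n) for n in range(1, 9)}
print(small_cases)
```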

Problem 1.6.1. Show that for any $n \in \mathbb{N}$, there exists a positive $x \in \mathbb{R}$ such that
\[ \prod_{i=1}^{n} (1 + x)^{i} < 1 + \frac{n^2 + n + 1}{2}\, x. \]

Solution. Since $n$ is fixed, there is a finite number of elements in the product on the left hand side as well as in the binomial expansion of $(1 + x)^i$ for any $i \in [n]$. As a result,
\[
\prod_{i=1}^{n} (1 + x)^{i} = \prod_{i=1}^{n} \bigl( 1 + ix + O(x^2) \bigr)
= 1 + \sum_{i=1}^{n} ix + O(x^2)
= 1 + \frac{n(n+1)}{2}\, x + O(x^2)
< 1 + \frac{n^2 + n}{2}\, x + \frac{x}{2}
= 1 + \frac{n^2 + n + 1}{2}\, x,
\]
provided that $x \in \mathbb{R}_+$ is sufficiently close to zero.

Problem 1.6.2. Prove that for any polynomial $W(x)$ and sufficiently large $x$ we have that $(1 + x/n)^n > W(x)$, if $n \in \mathbb{N}$ is greater than the degree of $W$. What does it tell us about the function $e^x$?
Solution. Since $W(x) = O(x^{n-1})$, observe that
\[ \left( 1 + \frac{x}{n} \right)^{n} - W(x) = \frac{x^n}{n^n} + O(x^{n-1}) = \frac{x^n}{n^n}\bigl( 1 + O(1/x) \bigr) > \frac{x^n}{2n^n} > 0, \]
provided that $x \in \mathbb{R}$ is sufficiently large.
As $e^x = \lim_{n \to +\infty} (1 + x/n)^n$ and the sequence $(1 + x/n)^n$ is increasing, we see that the function $e^x$ tends to infinity (as $x \to \infty$) faster than any fixed polynomial $W(x)$.


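Problem 1.7.1. Prove that for $a, b, c \in \mathbb{R}_+$,
\[ (a + b + c)\left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \ge 9. \]

Solution. Let us first note that
\[ (a + b + c)\left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) = \left( \sqrt{a}^{\,2} + \sqrt{b}^{\,2} + \sqrt{c}^{\,2} \right)\left( \sqrt{1/a}^{\,2} + \sqrt{1/b}^{\,2} + \sqrt{1/c}^{\,2} \right). \]
By the Cauchy-Schwarz inequality, the right hand side is greater than or equal to
\[ \left( \sqrt{a}\sqrt{1/a} + \sqrt{b}\sqrt{1/b} + \sqrt{c}\sqrt{1/c} \right)^{2} = 3^2 = 9. \]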

As a remark, we note that the problem can be also solved by applying the arithmetic-harmonic
inequality (and this approach is mentioned in the remark to the original problem presented in PLMO II
– Phase 1 – Problem 6, which assumed additionally that a + b + c = 1).
Problem 1.7.2. Prove that for any $a, b, c \in \mathbb{R}_+$ such that $a + b + c = 1$, we have that
\[ \sqrt{2a + 1} + \sqrt{2b + 1} + \sqrt{2c + 1} \le \sqrt{15}. \]

Solution. Using the Cauchy-Schwarz inequality, we get that
\[ \left( \sqrt{2a+1} \cdot 1 + \sqrt{2b+1} \cdot 1 + \sqrt{2c+1} \cdot 1 \right)^{2} \le (2a + 1 + 2b + 1 + 2c + 1)(1 + 1 + 1) = 15, \]

which gives us the desired result. Alternatively, one can use Jensen’s inequality to get that

√ 2a + 1 + √ 2b + 1 + √ 2c + 1 ≤ 3(√ (2a + 1 + 2b + 1 + 2c + 1)/3)

= 3√ 5/3 = √ 15.

Problem 1.7.3. Prove that if $a, b, c \in \mathbb{R}$ are such that $a + b + c = 1$ and $\min\{a, b, c\} \ge -3/4$, then
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{9}{10}. \]
Does this inequality hold without the additional assumption that $\min\{a, b, c\} \ge -3/4$?
Solution. It is natural to try to bound the function $f(x) := x/(x^2 + 1)$ by some linear function, that is, we search for a bound of the form
\[ f(x) = \frac{x}{x^2 + 1} \le Ax + B =: g(x) \]
for some constants $A$ and $B$. Indeed, if this can be done, then the left hand side of our inequality, $f(a) + f(b) + f(c)$, would be bounded by $g(a) + g(b) + g(c) = A(a + b + c) + 3B = A + 3B$.

But how can we find suitable constants $A$ and $B$?

We notice that the equality holds for $a = b = c = 1/3$ and so the linear upper bound that we are searching for has to be tight for $x = 1/3$, that is,
\[ f(1/3) = \frac{3}{10} = g(1/3) = \frac{A}{3} + B. \]
Moreover, the additional assumption that $a$, $b$, and $c$ must be at least $-3/4$ suggests that the same should be true for $x = -3/4$:
\[ f(-3/4) = -\frac{12}{25} = g(-3/4) = \frac{-3A}{4} + B. \]
We get the following system of equations: $A/3 + B = 3/10$ and $-3A/4 + B = -12/25$. It follows that $A = 18/25$ and $B = 3/50$. Since $A + 3B = 9/10$, the result will hold once we prove that for any $x \ge -3/4$
\[ \frac{x}{x^2 + 1} \le \frac{18}{25}\, x + \frac{3}{50} \]
or, equivalently, that
\[ h(x) := 36x^3 + 3x^2 - 14x + 3 \ge 0. \]
(One can see it after multiplying both sides of the previous inequality by $50(x^2 + 1)$.) But we already know that $h(1/3) = h(-3/4) = 0$, so $h(x)$ is divisible by $(3x - 1)(4x + 3)$ (see Section 3.6 for additional explanations of this fact). After dividing these two polynomials we find that the remaining factor is $3x - 1$ and so the inequality is equivalent to $(3x - 1)^2 (4x + 3) \ge 0$, which is clearly true for any $x \ge -3/4$.
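The factorization of $h(x)$ can be double-checked mechanically. Two cubic polynomials that agree at more than three points are identical, so comparing exact values at a handful of rational points suffices. This is only a small sanity-check sketch; the function names are illustrative.

```python
from fractions import Fraction


def h(x):
    """The cubic obtained after clearing denominators."""
    return 36 * x**3 + 3 * x**2 - 14 * x + 3


def h_factored(x):
    """The claimed factorization (3x - 1)^2 (4x + 3)."""
    return (3 * x - 1) ** 2 * (4 * x + 3)


# Seven sample points are more than enough to identify a cubic.
points = [Fraction(k) for k in range(-3, 4)]
print(all(h(x) == h_factored(x) for x in points))

# Both tangent points of the linear bound are roots of h.
print(h(Fraction(1, 3)), h(Fraction(-3, 4)))
```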
We will now show that the assumption that $\min\{a, b, c\} \ge -3/4$ is not necessary. We start by proving a few properties of the function $f(x)$; for a graph of this function see the gray curve in Figure 8.2.

FIGURE 8.2: Function $f(x) = x/(x^2 + 1)$.

Property A: Function $f(x)$ is odd, that is, $f(x) = -f(-x)$ for any $x \in \mathbb{R}$. Indeed,
\[ f(x) = \frac{x}{x^2 + 1} = \frac{-(-x)}{(-x)^2 + 1} = -f(-x). \]

Property B: For any $x \ge 0$,
\[ 0 \le f(x) \le 1/2 = f(1). \]
The lower bound of 0 is trivial. To see the upper bound of $1/2$, note that $x^2 - 2x + 1 = (x - 1)^2 \ge 0$, which implies that $2x \le x^2 + 1$ and so $x/(x^2 + 1) \le 1/2$.

Property C: Function $f(x)$ is increasing on the interval $[0, 1]$, that is, $f(x) < f(y)$ for any $0 \le x < y \le 1$. Indeed, $f(x) < f(y)$ is equivalent to $x(y^2 + 1) < y(x^2 + 1)$ which, in turn, is equivalent to $0 < (1 - yx)(y - x)$. But the last inequality is clearly true for any $0 \le x < y \le 1$ as $xy \in [0, 1)$.

Property D: Function f (x) is decreasing on the interval [1, ∞), that is, f (x) > f (y) for any 1 ≤ x < y
. In order to see this, the same argument as before can be applied with the only difference being that
now xy ∈ (1, ∞).
Let us summarize what we have learned about function f (x): it is decreasing on the interval
(−∞, −1], reaching −1/2 at x = −1, then increasing on the interval [−1, 1], reaching 1/2 at x = 1,

and finally decreasing on the interval [1, ∞).


Now, let us come back to the original problem. Suppose that $c \le b \le a$ are any real numbers such that $a + b + c = 1$. Since we already dealt with the case $\min\{a, b, c\} \ge -3/4$, we may assume that $c < -3/4$. We independently consider 2 cases. If $c \le -3$, then $a \ge 2$ and so
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{2}{2^2 + 1} + \frac{1}{1^2 + 1} + \frac{0}{0^2 + 1} = \frac{9}{10}. \]
On the other hand, if $c \in [-3, -3/4]$, then
\[ \frac{a}{a^2 + 1} + \frac{b}{b^2 + 1} + \frac{c}{c^2 + 1} \le \frac{1}{1^2 + 1} + \frac{1}{1^2 + 1} + \frac{-3}{3^2 + 1} = \frac{7}{10} < \frac{9}{10}, \]
and we are done.
Problem 1.8.1. Prove that for any $n \in \mathbb{N}$,
\[ \frac{1}{2n + 1} \le \frac{\binom{2n}{n}}{2^{2n}} \le 1. \]
Can you improve these bounds for large $n$?
Solution. Consider flipping a fair coin $2n$ times. For any $i \in \{0, 1, \dots, 2n\}$, let $P(i)$ be the probability of getting exactly $i$ heads. Clearly, $\sum_{i=0}^{2n} P(i) = 1$ (the number of heads has to be between 0 and $2n$) and for any $i \in \{0, 1, \dots, 2n\}$
\[ P(i) = \binom{2n}{i} \cdot \left( \frac{1}{2} \right)^{2n} \]
(there are $\binom{2n}{i}$ ways to select $i$ rounds when heads are obtained and each such outcome occurs with probability $(1/2)^{2n}$). Hence, our goal is to estimate $P(n)$. Because of this connection, the upper bound ($P(n) \le 1$) is trivial. (Let us note that we could have obtained the same upper bound without using the above probabilistic argument by observing that $2^{2n} = (1 + 1)^{2n}$ and then using the binomial theorem.) In order to see the lower bound ($P(n) \ge 1/(2n + 1)$) we note that $P(i)$, as a function of $i$, is maximized exactly for $i = n$; that is, for any $i \in \{0, 1, \dots, 2n\}$, $P(i) \le P(n)$. To see this note that $P(i) = P(i - 1) \cdot (2n + 1 - i)/i$ for $i \ge 1$, and the ratio $(2n + 1 - i)/i$ is larger than one for $i \le n$ and smaller than one otherwise.

Using this observation, it follows from the averaging argument that
\[ P(n) \ge \frac{1}{2n + 1} \sum_{i=0}^{2n} P(i) = \frac{1}{2n + 1}, \]
and we are done.

Finally, let us mention that better bounds can be obtained. For example, one can show that for any $n \in \mathbb{N}$,
\[ \frac{1}{2\sqrt{n}} \le \frac{\binom{2n}{n}}{2^{2n}} \le \frac{1}{\sqrt{2n}}. \]
In fact, one can use Stirling’s formula ($n! \sim \sqrt{2\pi n}\,(n/e)^n$) to see that
\[ \frac{\binom{2n}{n}}{2^{2n}} = \frac{(2n)!}{(n!)^2\, 2^{2n}} \sim \frac{\sqrt{2\pi (2n)}\,(2n/e)^{2n}}{2\pi n\,(n/e)^{2n}\, 2^{2n}} = \frac{1}{\sqrt{\pi n}}. \]
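Both pairs of bounds on the central binomial probability can be checked exactly for small $n$. The sketch below avoids floating point by comparing squared integer quantities; the function name and the tested range are assumptions chosen for illustration.

```python
from math import comb


def central_bounds_hold(n: int) -> bool:
    """Check the elementary and the sharper bounds on C(2n, n) / 2^(2n)."""
    c = comb(2 * n, n)   # central binomial coefficient
    p = 4 ** n           # 2^(2n)

    # 1/(2n+1) <= c/p <= 1, rewritten without division.
    basic = p <= (2 * n + 1) * c and c <= p

    # 1/(2 sqrt(n)) <= c/p <= 1/sqrt(2n), squared to stay in the integers:
    # p^2 <= 4 n c^2  and  2 n c^2 <= p^2.
    sharper = p * p <= 4 * n * c * c and 2 * n * c * c <= p * p

    return basic and sharper


print(all(central_bounds_hold(n) for n in range(1, 201)))
```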

Problem 1.8.2. Prove that for $k, n \in \mathbb{N}$ such that $k \le n$, and $p, q \in [0, 1]$ such that $p < q$, we have that
\[ \sum_{i=k}^{n} \binom{n}{i} \left( q^{i} (1 - q)^{n-i} - p^{i} (1 - p)^{n-i} \right) \ge 0. \]

Solution. In order to solve this problem, we are going to use a standard but very useful proof technique in probability theory that allows one to compare two experiments. Consider two biased coins, the first with probability $p$ of turning up heads and the second with probability $q > p$ of turning up heads. For any fixed $k$, the probability $p_k$ that the first coin produces at least $k$ heads should be at most the probability $q_k$ that the second coin produces at least $k$ heads. Clearly,
\[ p_k = \sum_{i=k}^{n} \binom{n}{i} p^{i} (1 - p)^{n-i} \quad \text{and} \quad q_k = \sum_{i=k}^{n} \binom{n}{i} q^{i} (1 - q)^{n-i}. \]
Hence, our problem reduces to showing that, indeed, $q_k \ge p_k$ for any $k$.

However, proving it is rather difficult with a standard counting argument. Coupling easily circumvents this problem. Let $X_1, X_2, \dots, X_n$ be indicator random variables for heads in a sequence of $n$ flips of the first coin. In other words, $X_i = 1$ if the $i$th flip is a head and $X_i = 0$ otherwise. It follows that
\[ p_k = P\left( \sum_{i=1}^{n} X_i \ge k \right). \]
For the second coin, define a new sequence $Y_1, Y_2, \dots, Y_n$ such that if $X_i = 1$, then $Y_i = 1$; if $X_i = 0$, then $Y_i = 1$ with probability $(q - p)/(1 - p)$ and $Y_i = 0$ otherwise. Clearly, the sequence of $Y_i$ has exactly the probability distribution of tosses made with the second coin. Indeed, for any $i \in [n]$,
\[ P(Y_i = 1) = P(X_i = 1) + P(X_i = 0) \cdot \frac{q - p}{1 - p} = p + (1 - p) \cdot \frac{q - p}{1 - p} = q. \]
However, because of the coupling we trivially get that $X := \sum_{i=1}^{n} X_i \le Y := \sum_{i=1}^{n} Y_i$ and so $P(X \ge k) \le P(Y \ge k)$, as expected. We typically say that $X$ is (stochastically) bounded from above by $Y$.
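The monotonicity of the binomial tail in the success probability can also be observed numerically. The following sketch computes the tails exactly with rational arithmetic for a few illustrative values of $n$, $k$, $p$, and $q$; the function name `tail` and the chosen grid are assumptions, not part of the original argument.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def tail(n: int, k: int, p: Fraction) -> Fraction:
    """Probability that a coin with head probability p gives >= k heads in n flips."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# A small grid of head probabilities, compared pairwise with p < q.
grid = [Fraction(1, 10), Fraction(3, 10), Fraction(1, 2), Fraction(9, 10)]
monotone = all(
    tail(n, k, q) >= tail(n, k, p)
    for n in range(1, 7)
    for k in range(1, n + 1)
    for p, q in combinations(grid, 2)  # pairs come out with p < q
)
print(monotone)
```

Such a check does not replace the coupling argument, but it is a quick way to catch a misread index or exponent.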
Problem 1.9.1. Prove that for all sequences of $n$ numbers $a_1, \dots, a_n \in \mathbb{R}$, we have that
\[ \sqrt{n^2 + \left( \sum_{i=1}^{n} a_i \right)^{2}} \le \sum_{i=1}^{n} \sqrt{1 + a_i^2}. \]
When does equality hold?

Solution. Fix any sequence of numbers $a_1, \dots, a_n \in \mathbb{R}$. Let us consider $n + 1$ points, $P_0, P_1, \dots, P_n$, on the 2-dimensional plane. For each $i \in \{0, 1, \dots, n\}$, point $P_i$ is defined as follows:
\[ P_i := \left( i, \sum_{\ell=1}^{i} a_\ell \right). \]
(In particular, $P_0 = (0, 0)$.) For any two points, $P_i$ and $P_j$, let $d(P_i, P_j)$ be the distance between $P_i$ and $P_j$. Clearly,
\[ d(P_0, P_n) = \sqrt{(n - 0)^2 + \left( \sum_{\ell=1}^{n} a_\ell - 0 \right)^{2}} = \sqrt{n^2 + \left( \sum_{\ell=1}^{n} a_\ell \right)^{2}}, \]
the left hand side of our inequality. On the other hand, for each $i \in [n]$,
\[ d(P_{i-1}, P_i) = \sqrt{(i - (i - 1))^2 + \left( \sum_{\ell=1}^{i} a_\ell - \sum_{\ell=1}^{i-1} a_\ell \right)^{2}} = \sqrt{1 + a_i^2}. \]
It follows immediately from the triangle inequality that
\[ d(P_0, P_n) \le \sum_{i=1}^{n} d(P_{i-1}, P_i). \]
But this is exactly the inequality we wanted to prove and so we are done! Finally, let us note that the equality holds if and only if all the $a_i$ are equal.
Problem 1.9.2. Prove that for any $x \in \mathbb{R}$ such that $1/4 \le x \le 1$, we have that
\[ \frac{x}{2}\sqrt{1 - x^2} + \frac{1}{16}\sqrt{16x^2 - 1} < \frac{4}{9}. \]
Solution. Consider a pentagon $ABCDE$ with each side of length $1/2$; that is, $|AB| = |BC| = |CD| = |DE| = |EA| = 1/2$. Moreover, assume that $x := |AD| = |BD|$; that is, $ABD$ is an isosceles triangle. Since $1/2 = |AB| \le |AD| + |BD| = 2x$, $x \ge 1/4$. On the other hand, $x = |AD| \le |AE| + |ED| = 1$. The limiting shape when $x \to 1$ is an isosceles triangle $ABD$ whereas if $x \to 1/4$ we get two isosceles triangles $ADE$ and $BCD$ (see Figure 8.3).

FIGURE 8.3: Illustration for Problem 1.9.2. $ABCDE$ is the regular pentagon, $A'B'C'D'E'$ is the ‘limiting pentagon’ when $x \to 1$ and $A''B''C''D''E''$ (dashed grey) is the ‘limiting pentagon’ when $x \to 1/4$.

Fix any $x \in [1/4, 1]$ and consider the corresponding pentagon $ABCDE$ (including the two limiting scenarios, $x = 1/4$ and $x = 1$). Let us first calculate the area of our pentagon $ABCDE$, which is the sum of the areas of three isosceles triangles. The first one, $ABD$, has one side of length $1/2$ and two sides of length $x$. The remaining two, $BCD$ and $ADE$, have one side of length $x$ and two sides of length $1/2$. Since the area of an isosceles triangle with base of length $a$ and sides of equal length $b$ is $\frac{1}{2} a^2 \sqrt{b^2/a^2 - 1/4}$, we get that the area of our pentagon is equal to
\[ \frac{1}{2}(1/2)^2 \sqrt{x^2/(1/2)^2 - 1/4} + 2 \cdot \frac{1}{2} x^2 \sqrt{(1/2)^2/x^2 - 1/4} = \frac{1}{16}\sqrt{16x^2 - 1} + \frac{x}{2}\sqrt{1 - x^2}, \]
which is exactly the left hand side of our inequality. On the other hand, the area of our pentagon is less than or equal to the area of a regular pentagon of side length $a = 1/2$, which is equal to
\[ \frac{a^2}{4}\sqrt{5(5 + 2\sqrt{5})} = \frac{1}{16}\sqrt{5(5 + 2\sqrt{5})} < 0.44 < \frac{4}{9}. \]
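As a rough numerical illustration of Problem 1.9.2, one can evaluate the area expression on a fine grid of $x \in [1/4, 1]$ and compare its largest value with the area of the regular pentagon of side $1/2$. This is only a floating-point sketch under the assumption that a uniform grid is fine enough; the names below are illustrative.

```python
import math


def pentagon_area(x: float) -> float:
    """Left hand side of the inequality: area of the equilateral pentagon."""
    # max(0.0, ...) guards against tiny negative values caused by rounding.
    first = x / 2 * math.sqrt(max(0.0, 1 - x * x))
    second = math.sqrt(max(0.0, 16 * x * x - 1)) / 16
    return first + second


# Uniform grid on [1/4, 1].
grid = [0.25 + 0.75 * i / 10000 for i in range(10001)]
largest = max(pentagon_area(x) for x in grid)

# Area of the regular pentagon with side length 1/2.
regular = math.sqrt(5 * (5 + 2 * math.sqrt(5))) / 16

print(largest, regular, 4 / 9)
```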

8.2 Equalities and Sequences


Problem 2.1.1. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} a^3 + b = c \\ b^3 + c = d \\ c^3 + d = a \\ d^3 + a = b. \end{cases} \]

Solution. By subtracting the first equation from the third one, we get that
\[ b - d = (c - a)(c^2 + ac + a^2 + 1). \]
Analogously, after subtracting the second equation from the fourth one, we get that
\[ c - a = (d - b)(d^2 + bd + b^2 + 1) \]
and, by combining the two, we get that
\[ b - d = (d - b)(d^2 + bd + b^2 + 1)(c^2 + ac + a^2 + 1). \tag{8.1} \]
Now, by considering the term $f(d) := d^2 + bd + b^2 + 1$ as a function of $d$, we see that $f(d) > 0$ as the discriminant of the polynomial is equal to $\Delta = b^2 - 4(b^2 + 1) = -3b^2 - 4 < 0$. Similarly, we get that $g(c) := c^2 + ac + a^2 + 1 > 0$. Hence, we get from (8.1) that $b = d$. A symmetric argument can be used to show that $c = a$. We can now get back to the original equations to see that $a^3 + b = a$ and $b^3 + a = b$, which implies that $a^3 = -b^3$ and so $a = -b$. As a consequence, we get that $a^3 - a = a$, and so $a \in \{0, -\sqrt{2}, \sqrt{2}\}$. It follows that there are three candidate solutions:
\[ (a, b, c, d) \in \{(0, 0, 0, 0), (-\sqrt{2}, \sqrt{2}, -\sqrt{2}, \sqrt{2}), (\sqrt{2}, -\sqrt{2}, \sqrt{2}, -\sqrt{2})\}. \]
We directly check that all of them satisfy the original system of equations.
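The final verification step can be sketched in a few lines of Python. Because $\sqrt{2}$ is irrational, the check below uses floating point with a small tolerance, which is an assumption of this sketch rather than an exact proof.

```python
import math


def residuals(a, b, c, d):
    """Left minus right hand side of each of the four equations."""
    return (a**3 + b - c, b**3 + c - d, c**3 + d - a, d**3 + a - b)


r = math.sqrt(2)
candidates = [(0, 0, 0, 0), (-r, r, -r, r), (r, -r, r, -r)]

# Largest absolute residual over all candidates; it should be ~0.
worst = max(abs(v) for cand in candidates for v in residuals(*cand))
print(worst)
```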
Problem 2.1.2. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} (x - y)(x^3 + y^3) = 7 \\ (x + y)(x^3 - y^3) = 3. \end{cases} \]
Solution. We see immediately that $x \ne y$ and $x \ne -y$. Therefore, we can divide the two equations to get that
\[ \frac{(x - y)(x + y)(x^2 - xy + y^2)}{(x + y)(x - y)(x^2 + xy + y^2)} = \frac{7}{3}, \]
and so
\[ 0 = 7(x^2 + xy + y^2) - 3(x^2 - xy + y^2) = 4x^2 + 10xy + 4y^2 = 2(2x + y)(2y + x). \]
If $y = -2x$, then we would have to have $3x^4 = -1$, which is impossible. On the other hand, if $x = -2y$, then we get $3y^4 = 1$ and so $y = 1/\sqrt[4]{3}$ or $y = -1/\sqrt[4]{3}$. We conclude that there are two candidate solutions
\[ (x, y) \in \{(2/\sqrt[4]{3}, -1/\sqrt[4]{3}), (-2/\sqrt[4]{3}, 1/\sqrt[4]{3})\}. \]
We directly check that both of them satisfy the original system of equations.
Problem 2.1.3. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} x^2 - (y + z + yz)x + (y + z)yz = 0 \\ y^2 - (z + x + zx)y + (z + x)zx = 0 \\ z^2 - (x + y + xy)z + (x + y)xy = 0. \end{cases} \]

Solution. Let us note that the first equality can be rewritten as follows:
\[ 0 = x^2 - x(y + z) - xyz + (y + z)yz = x(x - (y + z)) + ((y + z) - x)yz = (x - (y + z))(x - yz). \]
The same simplification can be done with the remaining equations to get the following equivalent system:
\[ \begin{cases} (x - (y + z))(x - yz) = 0 \\ (y - (z + x))(y - zx) = 0 \\ (z - (x + y))(z - xy) = 0. \end{cases} \]
It follows that each variable is either the product or the sum of the remaining variables. We will independently consider the following four cases.
Case 1: all variables are the sums. We have $x = y + z$, $y = z + x$, and $z = x + y$. We immediately get that $x = y = z = 0$.
Case 2: two variables are the sums and one is the product. Without loss of generality, we may assume that $x = y + z$, $y = z + x$, and $z = xy$. We immediately get that $z = 0$, and this implies that $x = y = 0$, the solution we already discovered.

Case 3: one variable is the sum and two are the products. Without loss of generality, we may assume that $x = y + z$, $y = zx$, and $z = xy$. Indeed, by symmetry we can recover the whole family of solutions by permuting the solution vector $(x, y, z)$. It follows that $x = y + z = x(z + y) = x^2$ and so $x = 0$ or $x = 1$. The case $x = 0$ leads to the solution we already discovered, namely, $(x, y, z) = (0, 0, 0)$. If $x = 1$, then $y = z = 1/2$. We conclude that there are three additional solutions, namely $(x, y, z) = (1, 1/2, 1/2)$, $(x, y, z) = (1/2, 1, 1/2)$, and $(x, y, z) = (1/2, 1/2, 1)$, by permuting the variables.
Case 4: all numbers are the products. We have $x = yz$, $y = zx$, and $z = xy$. If one variable is equal to zero, then all of them must be zero and we get the particular solution $(x, y, z) = (0, 0, 0)$ one more time. If no variable is equal to zero, then we get that $y = z(yz) = yz^2$, and so $z^2 = 1$. The symmetric arguments give us $y^2 = 1$, $x^2 = 1$, and so all of the variables are either 1 or $-1$. Since the value of $x$ is determined by the value of $y$ and $z$ ($x = yz$), by considering all possibilities for $y$ and $z$, we see that either all of the variables $x, y, z$ are equal to 1 or precisely one of them is equal to 1. We directly check that these potential solutions are feasible, giving us the following four additional solutions: $(x, y, z) = (1, 1, 1)$, $(x, y, z) = (1, -1, -1)$, $(x, y, z) = (-1, 1, -1)$, $(x, y, z) = (-1, -1, 1)$.

Putting all of the observations together, we get the following 8 solutions:
\[ (x, y, z) \in \{(0, 0, 0), (1, 1/2, 1/2), (1/2, 1, 1/2), (1/2, 1/2, 1), (1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)\}. \]
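Since all eight solutions are rational, the original system can be verified exactly. The sketch below does so with Python's `fractions` module; the helper name `satisfies` is hypothetical.

```python
from fractions import Fraction as F


def satisfies(x, y, z) -> bool:
    """Check the three original quadratic equations exactly."""
    return (
        x * x - (y + z + y * z) * x + (y + z) * y * z == 0
        and y * y - (z + x + z * x) * y + (z + x) * z * x == 0
        and z * z - (x + y + x * y) * z + (x + y) * x * y == 0
    )


half = F(1, 2)
solutions = [
    (F(0), F(0), F(0)),
    (F(1), half, half), (half, F(1), half), (half, half, F(1)),
    (F(1), F(1), F(1)),
    (F(1), F(-1), F(-1)), (F(-1), F(1), F(-1)), (F(-1), F(-1), F(1)),
]
print(all(satisfies(*s) for s in solutions))
```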

Problem 2.2.1. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} (x + y)^3 = 8z \\ (y + z)^3 = 8x \\ (z + x)^3 = 8y. \end{cases} \]

Solution. Since $f(x) := x^3$ is an increasing function, $x > z$ if and only if $(x + y)^3 > (y + z)^3$. Using this observation, we get from the first equation and the second one that $x = z$. Similarly, from the second equation and the third one we get that $x = y$. It follows that all the variables are equal. We get that $8x = (2x)^3 = 8x^3$, so $0 = 8x^3 - 8x = 8x(x - 1)(x + 1)$. We conclude that there are three solutions to the given system of equations:
\[ (x, y, z) \in \{(-1, -1, -1), (0, 0, 0), (1, 1, 1)\}. \]

Problem 2.2.2. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} x^5 = 5y^3 - 4z \\ y^5 = 5z^3 - 4x \\ z^5 = 5x^3 - 4y. \end{cases} \]

Solution. Since the system is cyclic, without loss of generality, we may assume that $x \ge y$ and $x \ge z$. We will independently consider the following two cases.
Case 1: $y \le z$. Since $f(x) := x^5$ and $h(x) := x$ are increasing functions, we get that $5x^3 = z^5 + 4y \le x^5 + 4z = 5y^3$, and so $x \le y$ as $g(x) := x^3$ is also an increasing function. It follows that $x = y$ and so also $z = x$ since $y \le z \le x$.

Case 2: $z \le y$. We have $5x^3 = z^5 + 4y \le y^5 + 4x = 5z^3$, so we again get that $x = y = z$.
Since in both cases we get that $x = y = z$, variable $x$ must satisfy the equation $x^5 = 5x^3 - 4x$ that can be rewritten as follows:
\[ 0 = x(x^4 - 5x^2 + 4) = x(x^2 - 1)(x^2 - 4) = (x + 2)(x + 1)x(x - 1)(x - 2). \]
It follows that $x \in \{-2, -1, 0, 1, 2\}$. It is straightforward to check directly that the following triplets are solutions to our system:
\[ (x, y, z) \in \{(-2, -2, -2), (-1, -1, -1), (0, 0, 0), (1, 1, 1), (2, 2, 2)\}. \]
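The direct check of the five diagonal triplets is a one-liner with integer arithmetic. The sketch below also confirms that no other diagonal triple with a small integer entry works; the function name is illustrative.

```python
def satisfies(x: int, y: int, z: int) -> bool:
    """Check the cyclic system x^5 = 5y^3 - 4z, y^5 = 5z^3 - 4x, z^5 = 5x^3 - 4y."""
    return (
        x**5 == 5 * y**3 - 4 * z
        and y**5 == 5 * z**3 - 4 * x
        and z**5 == 5 * x**3 - 4 * y
    )


# Diagonal triples (t, t, t) with small integer t that solve the system.
diagonal_solutions = [t for t in range(-10, 11) if satisfies(t, t, t)]
print(diagonal_solutions)
```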

Problem 2.2.3. Solve the following system of equations, given that all variables involved are positive real numbers:
\[ \begin{cases} a^3 + b^3 + c^3 = 3d^3 \\ b^4 + c^4 + d^4 = 3a^4 \\ c^5 + d^5 + a^5 = 3b^5. \end{cases} \]

Solution. By the pigeonhole principle, at least one of the variables $d$, $a$ or $b$ attains the maximum or the minimum value from the set $\{a, b, c, d\}$. We will independently consider the following two cases.
Case 1: $a$, $b$ or $d$ is the maximum. Let us first assume that $a$ is the maximum. We will show that $a = b = c = d$. Indeed, if this is not the case, then $3a^4 = b^4 + c^4 + d^4 < 3a^4$, which is clearly not possible. (Let us remark that here we used the fact that $f(x) := x^4$ is an increasing function on $\mathbb{R}_+$.) If $b$ or $d$ is the maximum, then the argument is the same but this time we need to respectively use the third or the first equation and the fact that $g(x) := x^5$ and $h(x) := x^3$ are increasing functions.

Case 2: $a$, $b$ or $d$ is the minimum. Again, we will show that $a = b = c = d$. As before, the argument is similar for each sub-case and we will present it assuming that $a$ is the minimum. Indeed, if it is not the case that $a = b = c = d$, then $3a^4 = b^4 + c^4 + d^4 > 3a^4$, which is a contradiction.

In all the cases, we get that all the variables are equal and it is easy to check that any 4-tuple $(a, b, c, d) = (t, t, t, t)$ satisfies all the equations for each $t > 0$.

Problem 2.3.1. Solve the following equation
\[ (x^4 + 3y^2)\sqrt{|x + 2| + |y|} = 4|xy^2|, \tag{8.2} \]
provided that $x, y \in \mathbb{R}$.
Solution. Let us first note that if $(x, y) = (-2, 0)$ or $(x, y) = (0, 0)$, then both sides of (8.2) are equal to 0 and so the desired equality holds. We may then assume that $(x, y) \ne (-2, 0)$ and $(x, y) \ne (0, 0)$. In particular, the left hand side of (8.2) is non-zero.
By the arithmetic-geometric mean inequality, we get that
\[ x^4 + 3y^2 \ge 4\sqrt[4]{x^4 \cdot y^2 \cdot y^2 \cdot y^2} = 4|xy|\sqrt{|y|}, \]
and the equality holds only when $x^2 = |y|$. Since $(x, y) \ne (-2, 0)$, we get that $\sqrt{|x + 2| + |y|} > 0$. It follows that
\[ (x^4 + 3y^2)\sqrt{|x + 2| + |y|} \ge 4|xy|\sqrt{|y|}\sqrt{|x + 2| + |y|} \ge 4|xy^2|; \]
the first equality holds when $x^2 = |y|$, and the second equality holds when $x = -2$. Since by our earlier assumption $y \ne 0$ when $x = -2$, we get that (8.2) holds only when $x = -2$ and $|y| = (-2)^2 = 4$, that is, when $(x, y) = (-2, 4)$ or $(x, y) = (-2, -4)$. We conclude that the solution is
\[ (x, y) \in \{(-2, -4), (-2, 0), (-2, 4), (0, 0)\}. \]
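A quick exact search over integer pairs is consistent with this solution set. The sketch below assumes that the right hand side of (8.2) is $4|xy^2|$ (the absolute value is needed for $(-2, \pm 4)$ to be solutions) and squares both non-negative sides to avoid square roots; the function name and search window are illustrative choices.

```python
def solves_equation(x: int, y: int) -> bool:
    """Exact test of (x^4 + 3y^2) * sqrt(|x+2| + |y|) == 4|x y^2| for integers.

    Both sides are non-negative, so comparing their squares is equivalent
    and keeps the computation in the integers.
    """
    left_squared = (x**4 + 3 * y**2) ** 2 * (abs(x + 2) + abs(y))
    right_squared = (4 * abs(x * y**2)) ** 2
    return left_squared == right_squared


integer_solutions = sorted(
    (x, y) for x in range(-6, 7) for y in range(-6, 7) if solves_equation(x, y)
)
print(integer_solutions)
```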

Problem 2.3.2. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} 3(x^2 + y^2 + z^2) = 1 \\ x^2y^2 + y^2z^2 + z^2x^2 = xyz(x + y + z)^3. \end{cases} \]

Solution. Let us first note that for any $a, b \in \mathbb{R}$ we have $a^2 + b^2 \ge 2ab$, and the equality holds if and only if $a = b$. It follows that
\[ (x + y + z)^2 = x^2 + y^2 + z^2 + 2xy + 2yz + 2zx \le 3(x^2 + y^2 + z^2) = 1, \]
and the equality holds if $x = y = z$. Similarly, note that
\[ (x^2y^2 + x^2z^2)/2 = x^2(y^2 + z^2)/2 \ge x^2yz, \]
and the equality holds if $x = 0$ or $y = z$. Symmetric arguments give us
\[ (y^2z^2 + y^2x^2)/2 = y^2(z^2 + x^2)/2 \ge y^2zx, \]
\[ (z^2x^2 + z^2y^2)/2 = z^2(x^2 + y^2)/2 \ge z^2xy. \]
Summing the three inequalities together, we get that
\[ x^2y^2 + y^2z^2 + z^2x^2 \ge x^2yz + y^2zx + z^2xy = xyz(x + y + z). \]
Moreover, the equality holds if and only if
\[ (x = 0 \vee y = z) \wedge (y = 0 \vee z = x) \wedge (z = 0 \vee x = y). \]
For this condition to hold, clearly, if no variable is equal to 0, then $x = y = z$. It is also easy to see that it is impossible that only one variable is equal to zero. Hence, we get that the equality holds if $x = y = z$ or at least two of the three variables $x, y, z$ are equal to 0. But this means that
\[ x^2y^2 + y^2z^2 + z^2x^2 \ge xyz(x + y + z) \ge xyz(x + y + z)^3, \]
where in the last step we use the fact we showed at the very beginning, namely, that $(x + y + z)^2 \le 1$. More importantly, the condition for the equality remains the same, that is, either $x = y = z$ (since then $(x + y + z)^2 = 1$) or at least two of the three variables $x, y, z$ are equal to 0 (since then both sides of the inequality are equal to 0). We will consider these two cases separately.
Case 1: $x = y = z$. Our system reduces to
\[ \begin{cases} 9x^2 = 1 \\ 3x^4 = 27x^6, \end{cases} \]
which gives the following two solutions: $(x, y, z) = (1/3, 1/3, 1/3)$ and $(x, y, z) = (-1/3, -1/3, -1/3)$.

Case 2: at least two of the three variables $x, y, z$ are equal to 0. Without loss of generality, we may assume that $y = z = 0$ and other solutions will be obtained by permuting the variables. This time our system reduces to
\[ \begin{cases} 3x^2 = 1 \\ 0 = 0. \end{cases} \]
This leads us to another six solutions of the system: $(1/\sqrt{3}, 0, 0)$, $(-1/\sqrt{3}, 0, 0)$, $(0, 1/\sqrt{3}, 0)$, $(0, -1/\sqrt{3}, 0)$, $(0, 0, 1/\sqrt{3})$, and $(0, 0, -1/\sqrt{3})$.

Combining the two cases together we conclude that the solution to the system is
\[ (x, y, z) \in \{(1/3, 1/3, 1/3), (-1/3, -1/3, -1/3), (1/\sqrt{3}, 0, 0), (-1/\sqrt{3}, 0, 0), (0, 1/\sqrt{3}, 0), (0, -1/\sqrt{3}, 0), (0, 0, 1/\sqrt{3}), (0, 0, -1/\sqrt{3})\}. \]

Problem 2.3.3. Solve the following system of equations, given that all variables involved are real numbers:
\[ \begin{cases} x^2y + 2 = x + 2yz \\ y^2z + 2 = y + 2zx \\ z^2x + 2 = z + 2xy. \end{cases} \]

Solution. Let us first note that if $x = 0$, then from the third equation we get that $z = 2$ and from the first one that $y = 1/2$. But this contradicts the second equation, and so $x \ne 0$. Symmetric arguments show that also $y \ne 0$ and $z \ne 0$.
Let us use the following substitution: $a = xy \ne 0$, $b = yz \ne 0$, and $c = xz \ne 0$. After multiplying the first equation by $y$, multiplying the second equation by 2, and adding them together we get $a^2 + 4 = a + 4c$. Symmetric operations give us the following system of equations:
\[ \begin{cases} a^2 + 4 = a + 4c \\ b^2 + 4 = b + 4a \\ c^2 + 4 = c + 4b. \end{cases} \]
Due to the symmetry, without loss of generality, we may assume that $a$ is a largest number from $a, b, c$ and then circularly shift the solution, if needed. We get that $a^2 + 4 = a + 4c \le a + 4a = 5a$, or alternatively that $0 \ge a^2 - 5a + 4 = (a - 1)(a - 4)$. It follows that $a \in [1, 4]$. Since the function $f : [1, 4] \to [1, 4]$, $f(a) := (a^2 + 4 - a)/4$ is a bijection and $c = f(a)$, we get that also $c \in [1, 4]$. Similarly, we get that $b = f(c) \in [1, 4]$. More importantly, function $f(x)$ has the property that $1 < f(x) < x$ unless $x = 1$ or $x = 4$. Suppose that $1 < a < 4$. Then $1 < c = f(a) < a$, and consequently $1 < b = f(c) < c$ and $1 < a = f(b) < b$. This contradicts the fact that $a$ is a largest value. It follows that the only possible solutions are $(a, b, c) = (1, 1, 1)$ and $(a, b, c) = (4, 4, 4)$.
Going back to the original set of equations, we see that $x = y = z$ and so there are only four potential triples $(x, y, z)$ that satisfy the original system: $(1, 1, 1)$, $(-1, -1, -1)$, $(2, 2, 2)$, and $(-2, -2, -2)$. The last triple does not satisfy the original system and so the solution is
\[ (x, y, z) \in \{(1, 1, 1), (-1, -1, -1), (2, 2, 2)\}. \]
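The last step, discarding $(-2, -2, -2)$, is easy to confirm with integer arithmetic. The sketch below checks the four candidate triples and, as an extra sanity check that is not part of the original argument, searches a small integer box for any other integer solutions.

```python
from itertools import product


def satisfies(x: int, y: int, z: int) -> bool:
    """Check the three original equations exactly."""
    return (
        x * x * y + 2 == x + 2 * y * z
        and y * y * z + 2 == y + 2 * z * x
        and z * z * x + 2 == z + 2 * x * y
    )


candidates = [(1, 1, 1), (-1, -1, -1), (2, 2, 2), (-2, -2, -2)]
print({c: satisfies(*c) for c in candidates})

# Exhaustive search over integer triples with entries in [-4, 4].
box_solutions = sorted(t for t in product(range(-4, 5), repeat=3) if satisfies(*t))
print(box_solutions)
```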

Problem 2.4.1. Let $n \ge 2$ be any natural number. Find the number of sequences $(x_1, x_2, \dots, x_n)$ of non-negative real variables that satisfy the following system of equations: for $i \in [n]$
\[ x_{i+1} + x_i^2 = 4x_i, \]
where $x_{n+1} = x_1$.
Solution. Let us first note that for each $i \in [n]$ we have
\[ x_{i+1} = 4x_i - x_i^2 = x_i(4 - x_i). \]
Since $x_{i+1}$ is non-negative, we get that $x_i \in [0, 4]$. As there exists $\alpha_i \in [0, \infty)$ such that $x_i = 4\sin^2(\alpha_i)$, it is natural to use this substitution. In fact, there are many choices for $\alpha_i$ for a given $x_i$. However, $\alpha_1$ has a unique value in $[0, \pi/2]$ that satisfies $x_1 = 4\sin^2(\alpha_1) \in [0, 4]$. Let us then note that
\[ 4\sin^2(\alpha_{i+1}) = x_{i+1} = x_i(4 - x_i) = 4\sin^2(\alpha_i)\bigl(4 - 4\sin^2(\alpha_i)\bigr) = 4\bigl(2\sin(\alpha_i)\cos(\alpha_i)\bigr)^2 = 4\sin^2(2\alpha_i), \]
and so $x_i = 4\sin^2(2^{i-1}\alpha_1)$. In particular, writing $\alpha := \alpha_1$, since $x_1 = x_{n+1}$, we get that $\sin^2(\alpha) = \sin^2(2^n\alpha)$. It follows that $\sin(\alpha) = \sin(2^n\alpha)$ or $\sin(\alpha) = -\sin(2^n\alpha)$.

Let us first deal with a degenerate case, namely, $\alpha = 0$, which yields the following particular solution: $(x_1, x_2, \dots, x_n) = (0, 0, \dots, 0)$. If $\sin(\alpha) = \sin(2^n\alpha)$ for some $\alpha > 0$, then there exists $k \in \mathbb{N}$ such that either $\alpha + 2k\pi = 2^n\alpha$ or $-\alpha + \pi + 2k\pi = 2^n\alpha$. On the other hand, if $\sin(\alpha) = -\sin(2^n\alpha)$ for some $\alpha > 0$, then there exists $k \in \mathbb{N}$ such that either $\alpha + \pi + 2k\pi = 2^n\alpha$ or $-\alpha + 2k\pi = 2^n\alpha$. Combining these two observations together, we get that $(2^n + 1)\alpha = k\pi$ for some $k \in \mathbb{N}$ or $(2^n - 1)\alpha = k\pi$ for some $k \in \mathbb{N}$. Since $\alpha \in [0, \pi/2]$, including the degenerate case, we get that
\[ \alpha \in \left\{ \frac{k\pi}{2^n + 1} : k \in [2^{n-1}] \right\} \cup \left\{ \frac{k\pi}{2^n - 1} : k \in [2^{n-1} - 1] \right\} \cup \{0\}. \]
Finally, note that the first two sets above are disjoint as $2^n + 1$ and $2^n - 1$ are co-prime for $n \ge 2$. It follows that there are $2^{n-1} + (2^{n-1} - 1) + 1 = 2^n$ solutions to the original set of equations.
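The count of $2^n$ solutions can be illustrated numerically for small $n$ by iterating the map $x \mapsto x(4 - x)$ from each candidate starting value $x_1 = 4\sin^2(\alpha)$. This is a floating-point sketch, so it relies on a tolerance and is only meant for small $n$; all names below are hypothetical.

```python
import math


def candidate_angles(n: int) -> list:
    """Angles alpha in [0, pi/2] from the three sets in the solution."""
    first = [k * math.pi / (2**n + 1) for k in range(1, 2 ** (n - 1) + 1)]
    second = [k * math.pi / (2**n - 1) for k in range(1, 2 ** (n - 1))]
    return [0.0] + first + second


def closes_cycle(alpha: float, n: int, tol: float = 1e-9) -> bool:
    """Check that applying x -> x(4 - x) n times returns to x_1."""
    x1 = 4 * math.sin(alpha) ** 2
    x = x1
    for _ in range(n):
        x = x * (4 - x)
    return abs(x - x1) < tol


def count_solutions(n: int) -> int:
    """Number of distinct valid starting values x_1 (rounded to merge duplicates)."""
    starts = {
        round(4 * math.sin(a) ** 2, 9)
        for a in candidate_angles(n)
        if closes_cycle(a, n)
    }
    return len(starts)


print({n: count_solutions(n) for n in range(2, 6)})
```

Each valid starting value $x_1$ determines the whole sequence, so counting starting values counts solutions.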

Problem 2.4.2. Let n ∈ N. Find all solutions of the equation


n n
|tan (x) − cot (x) | = 2n| cot(2x) | .

Solution. Using double-angle identities, we get that
\[ \cot(2x) = \frac{\cos(2x)}{\sin(2x)} = \frac{\cos^2(x) - \sin^2(x)}{2\sin(x)\cos(x)} = \frac{\cot(x) - \tan(x)}{2}. \]
Using the substitution y = tan(x), our equality can be equivalently rewritten as follows:
\[ \left| y^n - \frac{1}{y^n} \right| = n\left| \frac{1}{y} - y \right| = n\left| y - \frac{1}{y} \right|. \tag{8.3} \]

The problem is easy if n = 1, as then (8.3) is always satisfied, provided that y ≠ 0. As a result, any value of x ∈ R that falls into the domain of both the tan(x) and cot(x) functions satisfies the original equation. In other words, the solution is:
\[ x \in \mathbb{R} \setminus \left\{ \frac{k\pi}{2} : k \in \mathbb{Z} \right\}. \]

Suppose then that n ≥ 2. We will independently consider the following two cases.
Case 1: y = 1 or y = −1. Both sides of (8.3) are equal to zero, which yields the following family of
solutions: x = (2k + 1)π/4 for some k ∈ Z.
Case 2: y ≠ 1 and y ≠ −1. This time |y − 1/y| ≠ 0, and so after dividing both sides of (8.3) by |y − 1/y| we get:
\[ n = \frac{\left| \dfrac{y^{2n} - 1}{y^n} \right|}{\left| \dfrac{y^2 - 1}{y} \right|} = \left| y^{1-n} \cdot \frac{(y^2 - 1)(y^{2n-2} + y^{2n-4} + \ldots + y^2 + 1)}{y^2 - 1} \right| = \left| \sum_{i=1}^{n} y^{n+1-2i} \right| = \sum_{i=1}^{n} |y|^{n+1-2i}, \tag{8.4} \]
where the last equality holds because either all the terms $y^{n+1-2i}$ are positive or all are negative. Using the arithmetic-geometric mean inequality, we get that
\[ \frac{1}{n} \sum_{i=1}^{n} |y|^{n+1-2i} \ge \left( \prod_{i=1}^{n} |y|^{n+1-2i} \right)^{1/n} = \sqrt[n]{1} = 1, \]
where equality holds if and only if |y| = 1. It follows that (8.4) holds if and only if |y| = 1, but these values are excluded in Case 2 (we already considered them in Case 1).
Combining the two cases together, we conclude that for n ≥ 2 the solution is:
\[ x \in \left\{ \frac{(2k + 1)\pi}{4} : k \in \mathbb{Z} \right\}. \]

Problem 2.4.3. For a given a ∈ R, let us recursively define the following sequence: $x_0 = \sqrt{3}$ and for all non-negative integers n,
\[ x_{n+1} = \frac{1 + a x_n}{a - x_n}. \]
Find all values of a for which the sequence has period equal to 8.
Solution. Let us first note that for any a ∈ R, there exists α = α(a) ∈ R such that a = cot(α). In particular, for convenience we set $\alpha = \cot^{-1}(a) \in (0, \pi)$, where $\cot^{-1}(\cdot)$ is the inverse of the cotangent function. (Note that none of the six trigonometric functions is one-to-one. They are restricted to their principal branch in order to have inverse functions. For cotangent the principal branch is (0, π).) Let us now recursively define another sequence: $y_0 = \pi/6$ and for all non-negative integers n, $y_{n+1} = y_n - \alpha$. Of course, it means that $y_n = y_0 - n\alpha = \pi/6 - n\alpha$ for n ∈ N. We will prove by induction on n that $x_n = \cot(y_n)$ for all n ∈ N ∪ {0}.
The base case (n = 0) is easy: $\cot(y_0) = \cot(\pi/6) = \sqrt{3} = x_0$. For the inductive step, assume that $x_n = \cot(y_n)$ for some n ∈ N ∪ {0}. Our goal is to show that $x_{n+1} = \cot(y_{n+1})$, but this follows almost immediately from the following identity:
\[ \cot(y - x) = \frac{1 + \cot(x)\cot(y)}{\cot(x) - \cot(y)}. \]
Indeed, note that
\[ \cot(y_{n+1}) = \cot(y_n - \alpha) = \frac{1 + \cot(\alpha)\cot(y_n)}{\cot(\alpha) - \cot(y_n)} = \frac{1 + a x_n}{a - x_n} = x_{n+1}. \]
The proof by induction is finished.


With this convenient representation at hand, let us come back to our problem of finding values of a that yield period equal to 8. (The fundamental period need not be equal to 8.) For the sequence $x_n = \cot(y_n)$ to have period 8, we must have 8α = kπ for some k ∈ N, as the cotangent function has period π. Since α ∈ (0, π), in fact, k ∈ [7]. It follows that all the possible values of a that satisfy the desired condition of the problem are of the form a = cot(kπ/8) for some k ∈ [7].
Suppose that a = cot(kπ/8) for some k ∈ [7]. We get that for each n ∈ N ∪ {0},
\[ x_n = \cot\left( \frac{\pi}{6} - n\alpha(a) \right) = \cot\left( \frac{\pi}{6} - \frac{nk\pi}{8} \right) = \cot\left( \pi \cdot \frac{4 - 3nk}{24} \right). \]
Since 4 − 3nk is not divisible by 3, (4 − 3nk)/24 is never an integer. As a result, π(4 − 3nk)/24 always belongs to the domain of the cotangent function and so for all seven identified values of a the sequence is properly defined.
Note that in the problem we required the sequence to have a period of 8 but its fundamental period can be smaller. In particular, we note that for k ∈ {2, 6} the fundamental period of the sequence is 4 and for k = 4 the fundamental period of the sequence is 2.
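The periods claimed above can be checked with a short floating-point experiment. This is a minimal sketch, not part of the original solution: the function names orbit and fundamental_period and the tolerance are assumptions made for illustration.

```python
import math

def orbit(a, steps):
    """Iterate x_{n+1} = (1 + a*x_n) / (a - x_n) starting from x_0 = sqrt(3)."""
    xs = [math.sqrt(3.0)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((1 + a * x) / (a - x))
    return xs

def fundamental_period(a, max_period=8, tol=1e-6):
    """Smallest p <= max_period with x_{n+p} = x_n on the computed prefix (up to tol)."""
    xs = orbit(a, 3 * max_period)
    for p in range(1, max_period + 1):
        if all(abs(xs[i + p] - xs[i]) < tol for i in range(len(xs) - p)):
            return p
    return None

# a = cot(k*pi/8) for k = 1, ..., 7, as identified in the solution.
periods = {k: fundamental_period(1 / math.tan(k * math.pi / 8)) for k in range(1, 8)}
print(periods)
```

Every k should report a period dividing 8, with the smaller fundamental periods appearing exactly for k ∈ {2, 4, 6}.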
Problem 2.5.1. For a given a ∈ R, consider the following system of equations:
\[ \begin{cases} x + y^2 + z^2 = a \\ x^2 + y + z^2 = a \\ x^2 + y^2 + z = a. \end{cases} \]
Find the number of real solutions (x, y, z) of this system as a function of a.


Solution. By subtracting the first equation from the second one, we get that
\[ 0 = x^2 - x + y - y^2 = (x - y)(x + y - 1), \]
which implies that either x = y or x = 1 − y. Symmetric arguments may be applied to the remaining two pairs of equations. Since 1 − (1 − t) = t, we conclude that all the solutions must have one of the following four forms: (x, y, z) = (t, t, t), (x, y, z) = (t, t, 1 − t), (x, y, z) = (t, 1 − t, t), or (x, y, z) = (1 − t, t, t), where t ∈ R. Clearly, if t = 1/2, then only one solution should be counted, namely, (x, y, z) = (1/2, 1/2, 1/2). It means that we need to be extra careful with the case $a = (1/2) + (1/2)^2 + (1/2)^2 = 1$. On the other hand, if t ≠ 1/2, then all solutions are distinct. More importantly, by symmetry, the last three forms are associated in the following sense: if one of them is a solution, then so are the remaining two. We will independently consider the two cases.
Case 1: there is a solution of the form (t, t, t) for some t ∈ R. It follows that $2t^2 + t - a = 0$. Since the discriminant is equal to Δ = 1 + 8a, we conclude that there are no solutions of this form if a < −1/8 (that is, Δ < 0), precisely one solution if a = −1/8, and two solutions if a > −1/8.
Case 2: there is a solution of the form (1 − t, t, t) for some t ∈ R. This time we get that $2t^2 - t + (1 - a) = 0$ and so Δ = 1 − 8(1 − a) = 8a − 7. It follows that there are no solutions of this form if a < 7/8, precisely one solution if a = 7/8, and two solutions if a > 7/8.
Let us now come back to the special case a = 1 that requires more attention. If a = 1, then we have
two solutions of the form (t, t, t) for some t ∈ R, namely (1/2, 1/2, 1/2) and (−1, −1, −1).
Moreover, there are two solutions of the form (1 − t, t, t) for some t ∈ R, again including
(1/2, 1/2, 1/2) which we do not want to count. The other one, namely (1, 0, 0), yields another two

solutions of the form (t, 1 − t, t) and (t, t, 1 − t). So there are 5 solutions for this special case:

(x, y, z ) ∈ {(−1, −1, −1), (1/2, 1/2, 1/2), (1, 0, 0), (0, 1, 0), (0, 0, 1) } .

Let us summarize our observations. The number of the solutions of our system of equations is equal
to:

0 = 0 + 3 ⋅ 0, provided a < −1/8;
1 = 1 + 3 ⋅ 0, provided a = −1/8;
2 = 2 + 3 ⋅ 0, provided −1/8 < a < 7/8;
5 = 2 + 3 ⋅ 1, provided a = 7/8 or a = 1;
8 = 2 + 3 ⋅ 2, provided a > 7/8 and a ≠ 1.
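The table can be checked by a small computation. The sketch below is an illustration only: count_real_solutions and its helper are hypothetical names, and it relies on the structural fact from the solution that every real solution has one of the four forms, so it solves the two quadratics, verifies each candidate against the system, and counts distinct triples.

```python
import math

def count_real_solutions(a, tol=1e-9):
    """Count real solutions of x+y^2+z^2 = x^2+y+z^2 = x^2+y^2+z = a."""
    def real_roots(A, B, C):
        # Real roots of A*t^2 + B*t + C = 0 (A != 0); a double root is returned twice.
        disc = B * B - 4 * A * C
        if disc < 0:
            return []
        r = math.sqrt(disc)
        return [(-B + r) / (2 * A), (-B - r) / (2 * A)]

    candidates = set()
    for t in real_roots(2, 1, -a):          # form (t, t, t)
        candidates.add((t, t, t))
    for t in real_roots(2, -1, 1 - a):      # forms with one coordinate equal to 1 - t
        candidates.update({(1 - t, t, t), (t, 1 - t, t), (t, t, 1 - t)})

    solutions = set()
    for x, y, z in candidates:
        lhs = (x + y * y + z * z, x * x + y + z * z, x * x + y * y + z)
        if all(abs(v - a) < tol for v in lhs):
            solutions.add((round(x, 9), round(y, 9), round(z, 9)))
    return len(solutions)

for a in (-1.0, -0.125, 0.0, 0.875, 1.0, 2.0):
    print(a, count_real_solutions(a))
```

The sample values of a were chosen so that the discriminants are exactly representable in floating point; other boundary values may need exact arithmetic.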

Problem 2.5.2. Solve the following equation, given that all variables involved are positive real numbers:
\[ (x^{2010} - 1)(y^{2009} - 1) = (x^{2009} - 1)(y^{2010} - 1). \]

Solution. Let us first note that if x = 1 or y = 1, then the equality trivially holds. We will assume then that they are not equal to 1.
Let us first re-write the equation as follows:
\[ x^{2010}(y^{2009} - 1) - y^{2009} = (x^{2009} - 1)y^{2010} - x^{2009}. \]
After adding $x^{2009}y^{2009}$ to both sides of the equation, we get that
\[ x^{2010}(y^{2009} - 1) + (x^{2009} - 1)y^{2009} = (x^{2009} - 1)y^{2010} + x^{2009}(y^{2009} - 1), \]
which can be equivalently written as follows:
\[ x^{2009}(x - 1)(y^{2009} - 1) = (x^{2009} - 1)\,y^{2009}(y - 1). \]
Now, after dividing both sides by (x − 1)(y − 1), we get that
\[ x^{2009} \sum_{i=0}^{2008} y^i = y^{2009} \sum_{i=0}^{2008} x^i. \]
(Recall that we assumed that x ≠ 1 and y ≠ 1.) Since x and y are both non-zero (in fact, they are both positive real numbers), this equation can be equivalently rewritten as follows:
\[ \sum_{i=1}^{2009} \frac{1}{y^i} = \sum_{i=1}^{2009} \frac{1}{x^i}. \]
The final observation is that for each i ∈ [2009], $f_i(t) := 1/t^i$ is a decreasing function on the domain (0, ∞). It follows that $g(t) := \sum_{i=1}^{2009} 1/t^i$ is also a decreasing function on that domain. Since x and y are positive, the equation holds only if x = y.
We conclude that all solutions of the equation are of the form (t, t), (1, t), or (t, 1), where t ∈ (0, ∞). It is straightforward to directly check that they indeed satisfy the original equation.

Problem 2.5.3. Fix an integer n ≥ 2, and consider the following system of n equations: for i ∈ [n],
\[ x_{i+1}^2 + x_i^2 + 50 = 12x_{i+1} + 16x_i. \]
(As usual, we use the convention that $x_{n+1} = x_1$.) Find the number of solutions of this system, given that all variables involved are integers.
Solution. Let us start with rewriting the system as follows: for i ∈ [n],
\[ (x_i - 8)^2 + (x_{i+1} - 6)^2 = 50. \]
Since both $x_i - 8$ and $x_{i+1} - 6$ are integers, we need to decompose 50 into a sum of two squares of integers. The only decompositions involving natural numbers are $1^2 + 7^2 = 50$ and $5^2 + 5^2 = 50$. It follows that for each i ∈ [n],
\[ (x_i, x_{i+1}) \in S := \{(1, 5), (1, 7), (3, 1), (3, 11), (7, -1), (7, 13), (9, -1), (9, 13), (13, 1), (13, 11), (15, 5), (15, 7)\}. \]
We will show now that there is no i ∈ [n] for which $(x_i, x_{i+1}) = (1, 5)$. Indeed, for a contradiction, suppose that $(x_i, x_{i+1}) = (1, 5)$ for some i ∈ [n]. Then we get that $(x_{i+1}, x_{i+2}) = (5, x_{i+2}) \in S$, but there is no pair in S with the first coordinate equal to 5 (here we extended our convention and use $x_{n+2} = x_2$). We get the desired contradiction and so the pair (1, 5) is eliminated from the set of potential pairs. Using similar arguments one can eliminate more pairs to get that for each i ∈ [n],
\[ (x_i, x_{i+1}) \in T := \{(1, 7), (7, 13), (13, 1)\}. \]
Our next observation is that the first pair, $(x_1, x_2)$, uniquely determines the sequence $(x_1, x_2, \ldots, x_n, x_{n+1})$. Moreover, the numbers form a cycle of length three. As a result, since $x_1 = x_{n+1}$, we get that a solution exists if and only if 3 | n. We conclude that the system has 3 solutions if 3 | n, and no solution otherwise.
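The elimination argument can be cross-checked by counting closed walks in the graph whose edges are the admissible pairs. The sketch below is illustrative: allowed_pairs and count_solutions are hypothetical names, and the search window is an assumption that is wide enough because each squared term is at most 50.

```python
def allowed_pairs():
    """All integer pairs (u, v) = (x_i, x_{i+1}) satisfying one equation of the system."""
    window = range(-10, 30)  # assumption: wide enough, since |u - 8| <= 7 and |v - 6| <= 7
    return {(u, v) for u in window for v in window
            if v * v + u * u + 50 == 12 * v + 16 * u}

def count_solutions(n):
    """Count cyclic integer sequences (x_1, ..., x_n) as closed walks of length n."""
    pairs = allowed_pairs()
    total = 0
    for start in sorted({u for u, _ in pairs}):
        walks = {start: 1}  # walks[v] = number of partial sequences currently ending at v
        for _ in range(n):
            nxt = {}
            for u, ways in walks.items():
                for p, q in pairs:
                    if p == u:
                        nxt[q] = nxt.get(q, 0) + ways
            walks = nxt
        total += walks.get(start, 0)  # the walk must close up: x_{n+1} = x_1
    return total

print(len(allowed_pairs()))
for n in range(2, 10):
    print(n, count_solutions(n))
```

The pair set should have the 12 elements of S, and the counts should be nonzero exactly when 3 divides n.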


Problem 2.6.1. Let $x_1$ be any positive real number, and for each n ∈ N let
\[ x_{n+1} = x_n + \frac{1}{x_n^2}. \]
Prove that $x_n/\sqrt[3]{n}$ has a limit and then find it.
Solution. It is clear that $x_n > 0$ for all n ∈ N. (One can formally prove it by induction.) It will be convenient to use the following substitution: for each n ∈ N, $y_n := x_n^3 > 0$. It follows that
\[ y_{n+1}^{1/3} = y_n^{1/3} + \frac{1}{y_n^{2/3}}. \]
By raising both sides to the power of 3, we get
\[ y_{n+1} = y_n + 3 + \frac{3}{y_n} + \frac{1}{y_n^2}. \]
As a result, by unrolling the recursion all the way to $y_1$, we get that
\[ y_n = y_1 + \sum_{i=1}^{n-1}\left( 3 + \frac{3}{y_i} + \frac{1}{y_i^2} \right) = 3(n - 1) + y_1 + \sum_{i=1}^{n-1}\left( \frac{3}{y_i} + \frac{1}{y_i^2} \right). \]
In particular, we get that $y_i > 3(i - 1)$ for all i ∈ N. After switching back to $x_n$, we get the following bounds that hold for all n ∈ N:
\[ a_n := \sqrt[3]{3 - \frac{3}{n}} < \frac{x_n}{\sqrt[3]{n}} = \sqrt[3]{\frac{y_n}{n}} < b_n, \]
where
\[ b_n := \sqrt[3]{3 + \frac{-3 + x_1^3 + 3/x_1^3 + 1/x_1^6}{n} + \frac{1}{n}\sum_{i=2}^{n-1}\left( \frac{3}{3(i-1)} + \frac{1}{(3(i-1))^2} \right)}. \]
It is clear that $a_n \to \sqrt[3]{3}$ as n → ∞. Hence, by sandwiching the sequence $(x_n/\sqrt[3]{n})_{n \in N}$ between $(a_n)_{n \in N}$ and $(b_n)_{n \in N}$, to show that $x_n/\sqrt[3]{n} \to \sqrt[3]{3}$, it is enough to show that $b_n \to \sqrt[3]{3}$ (see the squeeze theorem). To that end, we need to show that
\[ \frac{1}{n}\sum_{i=2}^{n-1}\left( \frac{3}{3(i-1)} + \frac{1}{(3(i-1))^2} \right) = \frac{1}{n}\sum_{i=1}^{n-2}\frac{1}{i} + \frac{1}{9n}\sum_{i=1}^{n-2}\frac{1}{i^2} \to 0. \]

We will independently show that $\sum_{i=1}^{\infty} 1/i^2$ is finite (and so the second term tends to 0) and that $H_n = \sum_{i=1}^{n} 1/i \le 1 + \ln(n)$ (and so the first term tends to 0 as well; clearly, ln(n) tends to infinity much slower than n).
For the first task, let us note that for n ≥ 2 we have
\[ \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \sum_{i=2}^{n} \frac{1}{i(i-1)} = 1 + \sum_{i=2}^{n} \frac{i - (i-1)}{i(i-1)} = 1 + \sum_{i=2}^{n}\left( \frac{1}{i-1} - \frac{1}{i} \right) = 2 - \frac{1}{n}. \]
It follows that $\sum_{i=1}^{\infty} 1/i^2 = \lim_{n \to \infty} \sum_{i=1}^{n} 1/i^2$ is smaller than or equal to 2. In fact, $\sum_{i=1}^{\infty} 1/i^2 = \pi^2/6 \approx 1.6449$.
For the second task, let us recall that in Section 1.5 we showed that $e < (1 + 1/(i - 1))^i$ and so $e^{1/i} < i/(i - 1)$. It follows that 1/i < ln(i) − ln(i − 1). Since ln(1) = 0, we get that for n ≥ 2,
\[ H_n = \sum_{i=1}^{n} \frac{1}{i} < 1 + \sum_{i=2}^{n} (\ln(i) - \ln(i - 1)) = 1 + \ln(n). \]
This bound is quite good as one can show that $H_n > \ln(n + 1)$ (see the solution to Problem 4.6.2) and asymptotically $H_n = \ln(n) + \gamma + o(1)$, where γ ≈ 0.577216 is the Euler-Mascheroni constant.

Let us mention an alternative solution to this problem that uses the Stolz–Cesàro theorem. This theorem is a criterion for proving the convergence of a sequence and can be viewed as a generalization of L’Hôpital’s rule. Suppose that $(a_n)_{n \in N}$ and $(b_n)_{n \in N}$ are sequences of real numbers such that $(a_n)_{n \in N}$ is increasing and tends to ∞. If the limit of $(b_{n+1} - b_n)/(a_{n+1} - a_n)$ exists, then the limit of $b_n/a_n$ also exists and they are equal. (Let us note that the converse of this implication is not true in general.)
Let us now come back to our problem. We observe that $x_n \to \infty$ as n → ∞. Then, we deal with the limit of $x_n^3/n$ using the Stolz–Cesàro theorem:
\[ \lim_{n \to \infty} \frac{x_n^3}{n} = \lim_{n \to \infty} \frac{x_{n+1}^3 - x_n^3}{(n+1) - n} = \lim_{n \to \infty} \frac{(x_n + 1/x_n^2)^3 - x_n^3}{(n+1) - n} = \lim_{n \to \infty}\left( 3 + \frac{3}{x_n^3} + \frac{1}{x_n^6} \right) = 3. \]
This immediately gives us that $x_n/\sqrt[3]{n} \to \sqrt[3]{3}$.
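The limit can be observed numerically by iterating the recursion. This is only an illustration in floating point: the function name ratio, the starting values, and the number of steps are arbitrary choices, and convergence is slow because the error in $y_n/n$ decays roughly like ln(n)/n.

```python
def ratio(x1, n):
    """Return x_n / n^(1/3) for the recursion x_{k+1} = x_k + 1/x_k^2 started at x_1."""
    x = x1
    for _ in range(n - 1):
        x = x + 1.0 / (x * x)
    return x / n ** (1.0 / 3.0)

target = 3 ** (1.0 / 3.0)
for x1 in (0.5, 1.0, 2.0):
    print(x1, ratio(x1, 200_000), target)
```

A very small starting value makes $x_2$ huge, so many more steps would be needed before the ratio settles near the limit.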
Problem 2.6.2. Consider the following sequence defined recursively: $a_1 = 4$ and for each n ∈ N, let $a_{n+1} = a_n(a_n - 1)$. Moreover, for each n ∈ N, let $b_n = \log_2(a_n)$ and $c_n = n - \log_2(b_n)$. Prove that $c_n$ is bounded.
Solution. It is clear that the sequence $(a_n)_{n \in N}$ is increasing and $a_n \ge 4$ for each n ∈ N. We will bound the term $a_{n+1}$ from above and below in terms of $a_n$. An upper bound is easy: $a_{n+1} = a_n(a_n - 1) < a_n^2$. On the other hand, since $a_n \ge 4$, we get that
\[ a_{n+1} = a_n(a_n - 1) = a_n^2/2 + a_n^2/2 - a_n = a_n^2/2 + a_n(a_n/2 - 1) \ge a_n^2/2. \]
It follows that
\[ a_n^2/2 \le a_{n+1} < a_n^2. \]
Since $a_1 = 4 = 2^2$, we get that $2^{2^{n-1}+1} \le a_n \le 2^{2^n}$, where the upper bound is strict for n ≥ 2. Aiming for a simpler argument, we will use slightly weaker bounds, namely, $2^{2^{n-1}} < a_n \le 2^{2^n}$. It follows that $2^{n-1} < b_n \le 2^n$, and so $0 \le c_n < 1$ (with $c_n = 0$ only for n = 1).
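The bound on $c_n$ is easy to observe with exact integer arithmetic for $a_n$. A minimal sketch, in which the function name c_values and the number of terms are illustrative choices; the logarithms themselves are evaluated in floating point, so the printed values are approximate.

```python
import math

def c_values(count):
    """Return [c_1, ..., c_count] where a_1 = 4, a_{n+1} = a_n (a_n - 1),
    b_n = log2(a_n) and c_n = n - log2(b_n)."""
    a = 4                      # exact Python integer; grows doubly exponentially
    values = []
    for n in range(1, count + 1):
        b = math.log2(a)       # math.log2 accepts arbitrarily large integers
        values.append(n - math.log2(b))
        a = a * (a - 1)
    return values

cs = c_values(12)
print(cs)
```

Since $a_n$ has about $2^n$ bits, the number of terms should be kept small.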

Problem 2.6.3. You are given two numbers a, b ∈ R. Let $x_1 = a$, $x_2 = b$, and for each n ∈ N let $x_{n+2} = x_{n+1} + x_n$. Show that there exist a, b ∈ R, a ≠ b, for which there are at least 2,000 distinct pairs (k, ℓ), k < ℓ, such that $x_k = x_\ell$. On the other hand, the number of such pairs is finite even if a = b, unless a = b = 0.

Solution. Consider any pair a, b ∈ R that is different from a = b = 0. In particular, it is allowed that a = b ∈ R ∖ {0}. We will first show that all but possibly a finite number of terms of the sequence are unique. This proves the second part of the problem. We will independently consider the following three cases.
Case 1: $x_i > 0$ and $x_{i+1} > 0$ for some i ∈ N. It is clear that the sequence $(x_{i+n})_{n \in N}$ is strictly increasing and so indeed all but finitely many terms of the sequence $(x_n)_{n \in N}$ are unique.
Case 2: $x_i < 0$ and $x_{i+1} < 0$ for some i ∈ N. We get the same conclusion as before since the sequence $(x_{i+n})_{n \in N}$ is strictly decreasing.
Before we move to the last case, let us suppose that $x_i = 0$ for some i ∈ N. If there are more terms $x_i$ that are equal to 0, then we concentrate on the first one. Since the case a = b = 0 is excluded, we get that $x_{i+1} \neq 0$. Indeed, if $0 = x_{i+1} = x_i + x_{i-1} = 0 + x_{i-1}$ for some i ≥ 2, then $x_{i-1} = 0$, which gives us a contradiction ($x_i$ is the first term equal to 0). It follows that $x_{i+2} = x_{i+1} + x_i = x_{i+1}$ and we arrive in either Case 1 or Case 2. Hence, without loss of generality, we may assume that $x_i \neq 0$ for all i ∈ N and it remains to investigate oscillating sequences.
Case 3: the sequence $(x_n)_{n \in N}$ oscillates between positive and negative values. Suppose that for some i ∈ N we have $x_i > 0$ and $x_{i+1} < 0$. But then $0 < x_{i+2} = x_{i+1} + x_i < x_i$. It follows that the sequence $(x_{i+2n})_{n \in N}$ is a strictly decreasing sequence of positive numbers. Similarly, $0 > x_{i+3} = x_{i+2} + x_{i+1} > x_{i+1}$, and so the sequence $(x_{i+1+2n})_{n \in N}$ is a strictly increasing sequence of negative numbers. As a result, all but finitely many terms of the sequence $(x_n)_{n \in N}$ are unique. (In fact, with a slightly more delicate argument one can argue that all of them are unique.)
Before we move to the proof of the first part of the problem, let us make one remark. One can show (by induction on n) that for each n ∈ N,
\[ x_n = A\left( \frac{1 + \sqrt{5}}{2} \right)^n + B\left( \frac{1 - \sqrt{5}}{2} \right)^n \]
for some carefully chosen A = A(a, b) and B = B(a, b). (Constants A and B can be determined by considering $x_1 = a$ and $x_2 = b$.) Note that A = B = 0 only if a = b = 0, in which case we have infinitely many pairs (k, ℓ). If A ≠ 0, then at some point the sequence $(x_n)_{n \in N}$ must be strictly increasing or strictly decreasing, and so the number of pairs (k, ℓ) we are interested in is finite. If A = 0 and B ≠ 0, then the terms alternate in sign and strictly decrease in absolute value, so they are all distinct. This also shows that Case 3 can occur only when A = 0.
Let us now come back to our problem. We will show that it is possible to select a and b such that there are at least 2,000 pairs (k, ℓ), k < ℓ, with $x_k = x_\ell$. In order to see this, let us consider the classic Fibonacci sequence where $x_1 = x_2 = 1$. (See Section 4.9 for more on that sequence.) However, we will extend it to negative indices. Since we want to preserve that $x_{n+2} = x_{n+1} + x_n$ for each n ∈ Z, we get that
\[ x_n = x_{n+2} - x_{n+1}. \tag{8.5} \]
In particular, $x_0 = x_2 - x_1 = 1 - 1 = 0$, $x_{-1} = x_1 - x_0 = 1 - 0 = 1$, and $x_{-2} = x_0 - x_{-1} = 0 - 1 = -1$.
We will show by (strong) induction on i that for each i ∈ N,
\[ x_{-i} = -(-1)^i x_i. \tag{8.6} \]
The base case (i = 1 and i = 2) clearly holds: $x_{-1} = 1 = -(-1)^1 x_1$ and $x_{-2} = -1 = -(-1)^2 x_2$. For the inductive step, suppose that for some i ∈ N, we have $x_{-i} = -(-1)^i x_i$ and $x_{-(i+1)} = -(-1)^{i+1} x_{i+1}$. Our goal is to show that $x_{-(i+2)} = -(-1)^{i+2} x_{i+2}$. It follows immediately from (8.5) that
\[ x_{-(i+2)} = x_{-i} - x_{-(i+1)} = -(-1)^i x_i - \left( -(-1)^{i+1} x_{i+1} \right) = -(-1)^{i+2} x_i - (-1)^{i+2} x_{i+1} = -(-1)^{i+2}(x_i + x_{i+1}) = -(-1)^{i+2} x_{i+2}, \]
as required. This finishes the proof of (8.6).
The rest of the proof is straightforward. It follows from (8.6) that for each k ∈ N, $x_{2k+1} = x_{-2k-1}$ (by applying it to i = 2k + 1). This means that it is enough to “shift” the Fibonacci sequence, that is, take $a := x_{-3999}$ and $b := x_{-3998}$ to generate the desired sequence.
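The construction can be verified directly with exact integers. The sketch below is illustrative: fibonacci and count_equal_pairs are hypothetical helper names, and the starting values are obtained from (8.6) rather than by running the recursion backwards.

```python
from collections import Counter

def fibonacci(m):
    """Classic Fibonacci number x_m with x_1 = x_2 = 1, for m >= 1."""
    prev, cur = 0, 1  # x_0, x_1
    for _ in range(m - 1):
        prev, cur = cur, prev + cur
    return cur

def count_equal_pairs(a, b, terms):
    """Number of pairs k < l with x_k = x_l among the first `terms` terms."""
    xs = [a, b]
    while len(xs) < terms:
        xs.append(xs[-1] + xs[-2])
    # A value occurring c times contributes c*(c-1)/2 pairs.
    return sum(c * (c - 1) // 2 for c in Counter(xs).values())

# By (8.6): x_{-3999} = x_{3999} and x_{-3998} = -x_{3998}.
a = fibonacci(3999)
b = -fibonacci(3998)
# 7999 terms cover the shifted indices -3999, ..., 3999.
pairs = count_equal_pairs(a, b, 7999)
print(a != b, pairs)
```

Only a finite prefix is inspected here, which is enough to exhibit the required pairs.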

Problem 2.7.1. Find the number of infinite sequences $(a_i)_{i \in N}$, such that $a_i \in \{-1, 1\}$ for all i ∈ N, $a_{mn} = a_m a_n$ for all m, n ∈ N, and each consecutive triple contains at least one 1 and one −1.

Solution. Let us first note that for any n ∈ N we have $a_{n^2} = a_n a_n = (a_n)^2 = 1$. In particular, $a_1 = a_4 = a_9 = 1$. We will show that $a_2 = -1$. For a contradiction, suppose that $a_2 = 1$. Then, $a_3 = -1$ (as the triple $a_1, a_2, a_3$ has to contain at least one −1), $a_6 = a_2 a_3 = -1$, $a_8 = a_2 a_4 = 1$, $a_7 = -1$ (as the triple $a_7, a_8, a_9$ has to contain at least one −1), $a_5 = 1$ (as the triple $a_5, a_6, a_7$ has to contain at least one 1), and $a_{10} = a_2 a_5 = 1$. But then the triple $a_8, a_9, a_{10}$ does not contain any −1, and we get the desired contradiction. It follows that $a_2 = -1$.
Suppose that $a_k = a_{k+1} = x \in \{-1, 1\}$ for some k ∈ N, k ≥ 2. Using the property for consecutive triples, we get that $a_{k-1} = a_{k+2} = -x$. Then, as a result, $a_{2k-2} = a_2 a_{k-1} = -(-x) = x$ and $a_{2k+4} = a_2 a_{k+2} = x$. A similar argument gives us $a_{2k} = a_{2k+2} = -x$. Using the property for consecutive triples one more time, we get that $a_{2k+1} = x$. It follows that $a_{2k-2} = a_{2k+1} = a_{2k+4}$ (= x). If k = 3ℓ + 1 for some ℓ ∈ N, then we would get that $a_{6\ell} = a_{6\ell+3} = a_{6\ell+6}$ but, as a result (since $a_{3m} = a_3 a_m$), also $a_{2\ell} = a_{2\ell+1} = a_{2\ell+2}$, which is not possible (the corresponding triple does not satisfy the desired property). It follows that the following property is satisfied:
\[ a_{3\ell+1} = -a_{3\ell+2} \quad \text{for all } \ell \in N. \tag{8.7} \]

We will prove, by (strong) induction on ℓ, that for all ℓ ∈ N ∪ {0}, $a_{3\ell+1} = 1$ and $a_{3\ell+2} = -1$. The base case (ℓ = 0) holds: $a_1 = 1$ and $a_2 = -1$. For the inductive step, suppose that $a_{3\ell+1} = 1$ and $a_{3\ell+2} = -1$ for all non-negative integers ℓ that are less than $\ell_0 \in N$. Our goal is to show that $a_{3\ell_0+1} = 1$ and $a_{3\ell_0+2} = -1$. We will independently investigate two cases, depending on the parity of $\ell_0$.
Case 1: $\ell_0 = 2m$ for some m ∈ N. We get $a_{3\ell_0+2} = a_{6m+2} = a_2 a_{3m+1} = -a_{3m+1} = -1$ by the inductive hypothesis, as $m = \ell_0/2 < \ell_0$. It follows that $a_{3\ell_0+1} = -a_{3\ell_0+2} = 1$ by (8.7).
Case 2: $\ell_0 = 2m + 1$ for some m ∈ N ∪ {0}. We get $a_{3\ell_0+1} = a_{6m+4} = a_2 a_{3m+2} = -a_{3m+2} = 1$ by the inductive hypothesis, as $m = (\ell_0 - 1)/2 < \ell_0$. Again, by (8.7), it follows that $a_{3\ell_0+2} = -a_{3\ell_0+1} = -1$.

We showed above that the values of the sequence $(a_n)_{n \in N}$ are determined when n ≡ 1 or n ≡ 2 (mod 3). We will now show that the value of $a_3$ uniquely determines the whole sequence. Indeed, note that any natural number n (not necessarily divisible by 3) is uniquely represented as follows: $n = 3^p(3q + r)$, where p, q ∈ N ∪ {0} and r ∈ {1, 2}. It follows that
\[ a_n = a_{3^p(3q+r)} = (a_3)^p\, a_{3q+r} = (a_3)^p (-1)^{r+1}. \]
Therefore, indeed, by fixing $a_3 \in \{-1, 1\}$ we uniquely define two possible sequences.

It remains to show that both sequences satisfy the two desired properties. Property (8.7) guarantees that all consecutive triples contain at least one 1 and one −1. In order to show that $a_{mn} = a_m a_n$ for all m, n ∈ N, let us concentrate on any m, n ∈ N. As mentioned above, we may uniquely represent m and n as follows: $m = 3^{p_1}(3q_1 + r_1)$ and $n = 3^{p_2}(3q_2 + r_2)$, where $p_1, p_2, q_1, q_2 \in N \cup \{0\}$ and $r_1, r_2 \in \{1, 2\}$. It follows that
\[ mn = 3^{p_1+p_2}\left( 9q_1q_2 + 3(q_1r_2 + q_2r_1) + r_1r_2 \right). \]
As a result, if $r_1 = r_2$, then $mn = 3^{p_1+p_2}(3q_3 + 1)$ for some $q_3 \in N \cup \{0\}$ and so
\[ a_{mn} = (a_3)^{p_1+p_2}(-1)^2 = (a_3)^{p_1+p_2}(-1)^2(-1)^{r_1+r_2} = (a_3)^{p_1}(-1)^{r_1+1}(a_3)^{p_2}(-1)^{r_2+1} = a_m a_n, \]
as desired. Similarly, if $r_1 \neq r_2$, then $mn = 3^{p_1+p_2}(3q_3 + 2)$ for some $q_3 \in N \cup \{0\}$ and so
\[ a_{mn} = (a_3)^{p_1+p_2}(-1)^3 = (a_3)^{p_1+p_2}(-1)^3(-1)^{r_1+r_2-1} = (a_3)^{p_1}(-1)^{r_1+1}(a_3)^{p_2}(-1)^{r_2+1} = a_m a_n. \]
The two desired properties are satisfied and the proof is finished.
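Both sequences can be generated from the closed formula and checked on a finite prefix. This is a sketch under stated assumptions: build_sequence and is_valid are hypothetical helper names, the prefix length is arbitrary, and a finite check illustrates the claim without proving it.

```python
def build_sequence(a3, length):
    """Return [None, a_1, ..., a_length] with a_n = a3^p * (-1)^(r+1),
    where n = 3^p * (3q + r) and r in {1, 2}."""
    seq = [None]  # placeholder so that seq[n] is a_n
    for n in range(1, length + 1):
        p, m = 0, n
        while m % 3 == 0:
            m //= 3
            p += 1
        r = m % 3
        seq.append((a3 ** p) * (-1) ** (r + 1))
    return seq

def is_valid(seq):
    """Check multiplicativity and the consecutive-triple condition on the prefix."""
    size = len(seq) - 1
    multiplicative = all(
        seq[m * n] == seq[m] * seq[n]
        for m in range(1, size + 1)
        for n in range(1, size // m + 1)
    )
    triples = all({1, -1} <= set(seq[i:i + 3]) for i in range(1, size - 1))
    return multiplicative and triples

print(is_valid(build_sequence(1, 300)), is_valid(build_sequence(-1, 300)))
```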
Problem 2.7.2. Let us fix any real number a. We recursively define the sequence $(a_n)_{n \in N}$ as follows: let $a_1 = a$ and for each n ∈ N, let $a_{n+1} = (a_n - 1/a_n)/2$ if $a_n \neq 0$ and $a_{n+1} = 0$ if $a_n = 0$. Prove that this sequence has infinitely many non-positive elements and infinitely many non-negative elements.
Solution. Let us first note that if $a_N = 0$ for some N ∈ N, then $a_n = 0$ for all n ≥ N and so the desired property is trivially satisfied. Therefore, we may assume that $a_n \neq 0$ for all n ∈ N. It follows that $a_{n+1} = (a_n - 1/a_n)/2$ for all n ∈ N, which can be rewritten as $2a_{n+1}a_n = a_n^2 - 1$, or equivalently as $a_{n+1}^2 + 1 = (a_{n+1} - a_n)^2$. This implies that for each n ∈ N, $|a_{n+1} - a_n| > 1$.
Suppose that $a_n > 0$ for some n ∈ N. We get that $a_{n+1} = (a_n - 1/a_n)/2 < a_n/2 < a_n$. Combining this with the previous observation we conclude that $a_{n+1} < a_n - 1$. As a result, for some k > n we get that $a_k < 0$ (for example, it is easy to see that $k - n \le \lceil a_n \rceil$); recall that we had assumed that $a_n \neq 0$ for all n ∈ N. Similarly, if $a_n < 0$ for some n ∈ N, then $a_{n+1} = (a_n - 1/a_n)/2 > a_n/2 > a_n$, and so $a_{n+1} > a_n + 1$. It follows that $a_k > 0$ for some k > n.
The conclusion is that, regardless of the choice of a ∈ R, the sequence $(a_n)_{n \in N}$ must either reach zero (and stay zero forever), or oscillate infinitely many times between positive and negative values, as required.
Problem 2.7.3. Let n be any natural number such that n ≥ 3. Find all sequences of real numbers $(x_1, x_2, \ldots, x_n)$ that satisfy the following conditions:
\[ \sum_{i=1}^{n} x_i = n \quad \text{and} \quad \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 = n, \]
where we set $x_0 = x_n$ and $x_{n+1} = x_1$.


Solution. Let us first note that
\[ \begin{aligned} \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1} - 1)^2 &= \sum_{i=1}^{n} \left( (x_{i-1} - x_i + x_{i+1})^2 - 2(x_{i-1} - x_i + x_{i+1}) + 1 \right) \\ &= \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 - 2\sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1}) + n \\ &= \sum_{i=1}^{n} (x_{i-1} - x_i + x_{i+1})^2 - 2\left( \sum_{i=1}^{n} x_{i-1} - \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} x_{i+1} \right) + n \\ &= n - 2(n - n + n) + n = 0. \end{aligned} \]
(Recall our convention: $x_0 = x_n$ and $x_{n+1} = x_1$.) This implies that $x_{i-1} - x_i + x_{i+1} - 1 = 0$ for all i ∈ [n], or equivalently that $x_{i+1} = 1 + x_i - x_{i-1}$. It follows that $x_{i+2} = 1 + x_{i+1} - x_i = 1 + (1 + x_i - x_{i-1}) - x_i = 2 - x_{i-1}$. As a result, $x_{i+5} = 2 - x_{i+2} = 2 - (2 - x_{i-1}) = x_{i-1}$. It follows that $x_{i+6} = x_i$ for all i ∈ [n − 6]. We will independently consider the following two cases depending on whether n is divisible by 6 or not.
Case 1: 6 divides n. We will show that the values of $x_1$ and $x_2$ uniquely determine the whole sequence. Indeed, once $x_1$ and $x_2$ are fixed, we get that $x_3 = 1 + x_2 - x_1$, $x_4 = 1 + x_3 - x_2 = 2 - x_1$, $x_5 = 2 - x_2$, and $x_6 = 1 + x_5 - x_4 = 1 - x_2 + x_1$. Since $x_{i+6} = x_i$ for all i ∈ [n − 6], the remaining values are determined. It is straightforward to check that $\sum_{i=1}^{6} x_i = 6$ and so $\sum_{i=1}^{n} x_i = n$, as desired. The second condition is forced by the fact that $x_{i-1} - x_i + x_{i+1} = 1$ for all i ∈ [n]. We conclude that in this case one can fix any values of $x_1$ and $x_2$, and these two values determine a sequence that satisfies the desired properties. These are the only sequences.
Case 2: 6 does not divide n. As before, we fix the values of $x_1$ and $x_2$. Arguing as before, we determine the remaining values of the sequence. However, since 6 does not divide n, we obtain additional constraints for $x_1$ and $x_2$. Depending on the remainder of n when divided by 6, we get one of the following conditions:

∙ n ≡ 1 (mod 6): $x_1 = x_2$ and $x_2 = 1 + x_2 - x_1$;
∙ n ≡ 2 (mod 6): $x_1 = 1 + x_2 - x_1$ and $x_2 = 2 - x_1$;
∙ n ≡ 3 (mod 6): $x_1 = 2 - x_1$ and $x_2 = 2 - x_2$;
∙ n ≡ 4 (mod 6): $x_1 = 2 - x_2$ and $x_2 = 1 - x_2 + x_1$;
∙ n ≡ 5 (mod 6): $x_1 = 1 - x_2 + x_1$ and $x_2 = x_1$.

In each case, the only solution is $x_1 = x_2 = 1$ and then all other values are also equal to 1. As a result, if 6 does not divide n, the only solution is a constant sequence, namely, $x_i = 1$ for all i ∈ [n].
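The two cases can be illustrated with exact integer arithmetic. In the sketch below, build and satisfies are hypothetical helper names and the sample starting values are arbitrary; the recursion $x_{i+1} = 1 + x_i - x_{i-1}$ is the one derived in the solution.

```python
def build(x1, x2, n):
    """Extend (x1, x2) to length n using x_{i+1} = 1 + x_i - x_{i-1}."""
    xs = [x1, x2]
    while len(xs) < n:
        xs.append(1 + xs[-1] - xs[-2])
    return xs

def satisfies(xs):
    """Check both conditions of the problem with cyclic indices."""
    n = len(xs)
    total = sum(xs)
    # xs[i - 1] wraps around to the last element when i = 0.
    squares = sum((xs[i - 1] - xs[i] + xs[(i + 1) % n]) ** 2 for i in range(n))
    return total == n and squares == n

print(satisfies(build(5, -3, 12)))   # n divisible by 6, arbitrary start
print(satisfies(build(5, -3, 7)))    # n not divisible by 6, non-constant start
print(satisfies(build(1, 1, 7)))     # constant sequence
```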

8.3 Functions, Polynomials, and Functional Equations


Problem 3.1.1. Find all sets of six real numbers $a_1, a_2, a_3, b_1, b_2, b_3$ with the property that for all i ∈ [3], $a_{i+1}$ and $b_{i+1}$ are two different solutions of the equation $x^2 + a_i x + b_i = 0$ (here we let $a_4 = a_1$ and $b_4 = b_1$).

Solution. Let us first observe that if $b_{i+1} = 0$ for some i, then $b_i$ is also equal to 0, as $b_{i+1}$ is a root of $x^2 + a_i x + b_i$. Therefore, we would have that all $b_i$ are equal to 0. But this would mean that $a_1 = -a_3 = a_2 = -a_1$, and so $a_1 = 0$. However, this contradicts the fact that $a_i \neq b_i$. As a result, we may assume that $b_i \neq 0$ for all i. Using Vieta’s formulas, we get that
\[ \begin{aligned} a_1 + b_1 &= -a_3 \\ a_2 + b_2 &= -a_1 \\ a_3 + b_3 &= -a_2 \\ a_1 b_1 &= b_3 \\ a_2 b_2 &= b_1 \\ a_3 b_3 &= b_2. \end{aligned} \]
After multiplying the last three equations and dividing both sides by $b_1 b_2 b_3$, we get that $a_1 a_2 a_3 = 1$. This, in particular, implies that no coefficient is equal to 0. Now, calculate $b_i$ from the first three equations and substitute them into the last three equations to get that
\[ \begin{aligned} a_1(a_3 + a_1) &= a_2 + a_3 \\ a_2(a_1 + a_2) &= a_3 + a_1 \\ a_3(a_2 + a_3) &= a_1 + a_2. \end{aligned} \]

Since $a_1 a_2 a_3 = 1$, either all coefficients $a_i$ are positive or only one of them is positive. We will independently investigate both cases.
Suppose first that all $a_i$ are positive. Due to the symmetry, without loss of generality, we may assume that $a_1$ is a largest coefficient; in particular, $a_1 \ge 1$. From the first equation above, we get that
\[ a_2 + a_3 = a_1(a_3 + a_1) \ge a_3 + a_1, \]
and so $a_2 \ge a_1$. However, because of our assumption that $a_1$ is a largest coefficient and the fact that the inequality above is strict when $a_1 > 1$, this is only possible when $a_1 = a_2 = 1$. But then we get that also $a_3 = 1$, which in turn implies that $b_1 = b_2 = b_3 = -2$. It is straightforward to check that, indeed, these coefficients yield the desired solution, as $x^2 + x - 2 = (x - 1)(x + 2)$.

Suppose now that two coefficients $a_i$ are negative and one of them is positive. Without loss of generality, we may assume that $a_1 > 0$, $a_2 < 0$, and $a_3 < 0$. From the first equation, we have that
\[ a_1^2 + (a_1 - 1)a_3 = a_2. \]
Note that if $a_1 \le 1$, then $a_1^2 = a_2 + (1 - a_1)a_3 < 0$, which is impossible. It follows that $a_1 > 1$. From the same equation we have that $a_1(a_3 + a_1) = a_2 + a_3 < 0$, which implies that $a_3 + a_1 < 0$ and so $a_3 < -a_1 < -1$. Using this inequality and the fact that $a_2 + a_3 < 0$, it follows from the third equation that
\[ a_1 + a_2 = a_3(a_2 + a_3) > -(a_2 + a_3), \]
or equivalently that $a_1 + a_3 > -2a_2$. Since $a_2 < 0$, we get that $a_1 + a_3 > 0$, which is a contradiction, as above we showed that $a_3 + a_1 < 0$.
Combining the two cases together we get that the only solution is $a_1 = a_2 = a_3 = 1$ and $b_1 = b_2 = b_3 = -2$.
Problem 3.1.2. Let n ≥ 3 be an integer. Prove that the polynomial
\[ f(x) = x^n + \sum_{i=0}^{n-3} a_i x^i \]
has n real roots if and only if all $a_i$ are equal to 0.
Solution. Trivially, if all $a_i$ are equal to 0, then $f(x) = x^n$ has n real roots, all of them equal to zero. Suppose now that f(x) has n real roots: $x_i$, i ∈ [n]. It follows from (3.1) that
\[ \sum_{i=1}^{n} x_i^2 = \frac{a_{n-1}^2 - 2 \cdot a_n \cdot a_{n-2}}{a_n^2}. \]
Since $a_{n-1} = a_{n-2} = 0$, we get that $\sum_{i=1}^{n} x_i^2 = 0$. If all $x_i$ are real, then we get that $x_i = 0$ for all i. As a result, the considered polynomial is $f(x) = x^n$ (all $a_i$ are equal to 0), and the proof is finished.
n

Problem 3.1.3. Let $x_1, x_2, x_3$ be the roots of the equation $3x^3 + 6x^2 - 1 = 0$. Find the value of
\[ \sum_{i=1}^{3} \frac{1}{x_i^4}. \]
Solution. Let us first re-write the given equation as follows:
\[ 1 = 3x^3 + 6x^2 = 3x^2(x + 2). \]
After squaring both sides and then dividing both sides by $x^4$, we get that $1/x^4 = 9(x + 2)^2$. It follows that
\[ \sum_{i=1}^{3} \frac{1}{x_i^4} = 9\sum_{i=1}^{3} (x_i + 2)^2 = 9\sum_{i=1}^{3} (x_i^2 + 4x_i + 4) = 108 + 9\sum_{i=1}^{3} x_i^2 + 36\sum_{i=1}^{3} x_i. \]
From Vieta’s formulas and their consequence (3.1), we get that
\[ \sum_{i=1}^{3} x_i = \frac{-6}{3} = -2 \]
and
\[ \sum_{i=1}^{3} x_i^2 = \frac{6^2 - 2 \cdot 3 \cdot 0}{3^2} = 4. \]
Therefore, the sum we are looking for is equal to 108 + 9 ⋅ 4 + 36 ⋅ (−2) = 72.
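The value can be confirmed numerically by locating the three real roots and summing directly. Note that this sketch swaps in plain bisection rather than the Vieta argument used above; the bracketing intervals were found by evaluating the polynomial at integer points, and the helper names are illustrative.

```python
def poly(x):
    """The cubic 3x^3 + 6x^2 - 1 from the problem."""
    return 3 * x**3 + 6 * x**2 - 1

def bisect(lo, hi, iterations=200):
    """Bisection on [lo, hi], assuming poly changes sign on the interval."""
    f_lo = poly(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = poly(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2

# Sign changes: poly(-2) < 0 < poly(-1), poly(0) < 0 < poly(1).
roots = [bisect(-2, -1), bisect(-1, 0), bisect(0, 1)]
total = sum(1 / r**4 for r in roots)
print(roots, total)
```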
Problem 3.2.1. Find all functions f : Z → Z that satisfy the following condition:
\[ f(a + b)^3 - f(a)^3 - f(b)^3 = 3f(a)f(b)f(a + b) \]
for all a, b ∈ Z.
Solution. After setting a = b = 0, we get that $-f(0)^3 = 3f(0)^3$ and so f(0) = 0. Next, after considering any a = −b ∈ Z, we get that $f(a)^3 = -f(-a)^3$. Since the function $g(x) := x^3$ is a bijection, we get that f(a) = −f(−a), and so the function is symmetric about the origin (point (0, 0)). As a result, we may restrict our analysis to arguments that are natural numbers.
Suppose that f(1) = k for some k ∈ Z and consider x := f(2) ∈ Z which may or may not depend on k. By considering a = b = 1, we get that $x^3 - 2k^3 = 3k^2 x$ and so $(x - 2k)(x + k)^2 = 0$. It follows that x = 2k or x = −k. We will consider both cases independently.

Case 1: x = 2k. We will prove by induction that f(m) = km for any m ∈ N. The base case trivially holds: f(1) = k, f(2) = 2k. For the inductive step, suppose that f(m) = km for all $1 \le m \le m_0$ for some natural number $m_0 \ge 2$. Our goal is to show that $f(m_0 + 1) = k(m_0 + 1)$. By considering $a = m_0$ and b = 1, we get that
\[ f(m_0 + 1)^3 - (km_0)^3 - k^3 - 3(km_0)k f(m_0 + 1) = 0. \]
Since our goal is to show that $f(m_0 + 1) = k(m_0 + 1)$, it will be convenient to factor out the term $f(m_0 + 1) - k(m_0 + 1)$ from the left hand side of the above equality. Guided by this, we re-write the equation as follows:
\[ \left( f(m_0 + 1) - k(m_0 + 1) \right)\left( f(m_0 + 1)^2 + k(m_0 + 1)f(m_0 + 1) + k^2(m_0^2 - m_0 + 1) \right) = 0. \]
If k = 0, then we see immediately that $f(m_0 + 1) = 0$. On the other hand, if k ≠ 0, then $f(m_0 + 1)^2 + k(m_0 + 1)f(m_0 + 1) + k^2(m_0^2 - m_0 + 1)$ has no roots, as the discriminant of the corresponding quadratic equation satisfies the following:
\[ (k(m_0 + 1))^2 - 4k^2(m_0^2 - m_0 + 1) = -3k^2(m_0 - 1)^2 < 0. \]
(Recall that $m_0 \ge 2$.) It follows that $f(m_0 + 1) = k(m_0 + 1)$, as required, and so the proof by induction is finished.
Let us summarize our observations in this case. We obtained that one possible family of solutions is f(m) = km for some fixed integer k. It is straightforward to check that indeed this family satisfies our original equation.
Case 2: $x = -k$. We may assume that $k \neq 0$, as this case was already considered above. After taking
$a = 2$ and $b = 1$, we get that $f(3)^3 + k^3 - k^3 = 3(-k) k f(3)$, or equivalently that
$f(3)(f(3)^2 + 3k^2) = 0$. Since the second factor is positive, we get that $f(3) = 0$. Now, by considering
$b = 3$, we get that $f(a + 3)^3 - f(a)^3 = 0$. Again, as $g(x) = x^3$ is a bijection, we get that
$f(a + 3) = f(a)$. It follows that the only family of functions that satisfies these conditions is

$$f(a) = \begin{cases} 0 & \text{if } a \equiv 0 \pmod 3 \\ k & \text{if } a \equiv 1 \pmod 3 \\ -k & \text{if } a \equiv 2 \pmod 3, \end{cases}$$

where $k \in \mathbb{Z}$ is some fixed integer. As usual, we directly check that this family satisfies the original
condition.
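The direct check of both families can also be run numerically. The following is a minimal sketch, not part of the original solution. It assumes that the functional equation has the form $f(a+b)^3 - f(a)^3 - f(b)^3 = 3 f(a) f(b) f(a+b)$, which is what the substitutions above produce; the helper names `satisfies`, `linear`, and `periodic` are illustrative.

```python
def satisfies(f, lo=-12, hi=12):
    """Check the (assumed) equation for all a, b in [lo, hi]."""
    return all(
        f(a + b) ** 3 - f(a) ** 3 - f(b) ** 3 == 3 * f(a) * f(b) * f(a + b)
        for a in range(lo, hi + 1)
        for b in range(lo, hi + 1)
    )


def linear(k):
    """Family from Case 1: f(m) = k * m."""
    return lambda m: k * m


def periodic(k):
    """Family from Case 2: values 0, k, -k according to m mod 3."""
    values = (0, k, -k)
    return lambda m: values[m % 3]  # Python's % is non-negative for a positive modulus


both_families_ok = all(
    satisfies(linear(k)) and satisfies(periodic(k)) for k in range(-4, 5)
)
```

A function outside both families, such as `lambda m: m + 1`, fails the same check already at $a = b = 0$.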
Problem 3.2.2. Find all pairs of functions f : R → R and g : R → R such that

g(f (x) − y ) = f (g(y)) + x

for all x, y ∈ R.
Solution. Let us fix y = 0 to get that for any x ∈ R, we have that

g(f (x) ) = f (g(0)) + x .

On the other hand, for an arbitrary x ∈ R and y = f (x) ∈ R, we get that

g(0 ) = f (g(f (x))) + x = f (f (g(0)) + x) + x .


Since x is arbitrary and f (g(0)), g(0) are constants, we get that f (x) = a − x for some a ∈ R.
Substituting this into g(f (x)) = f (g(0)) + x, we get that g(a − x) = f (g(0)) + x and so
g(x) = b − x for some b ∈ R. Now, we may go back to the original equation to get that

b − (a − x − y) = a − (b − y) + x. It follows that b − a = a − b and so a = b. We directly check

that the family of functions f (x) = g(x) = a − x for some fixed a ∈ R satisfies the original
condition.
Problem 3.2.3. Find all functions f : R → R such that for all x, y ∈ R we have

f (f (x) − y ) = f (x) + f (f (y) − f (−x)) + x .

Solution. Let us first set x = y = 0 to get that f (f (0)) = 2f (0). Now, set x = 0 and y = f (0) to get
that f (0) = f (0) + f (f (f (0)) − f (0)). Using f (f (0)) = 2f (0) (twice!), we get that
0 = f (f (0)) = 2f (0), and so f (0) = 0. After fixing x = 0, we get that for any y ∈ R, we have
f (−y) = f (f (y)). For any x ∈ R, after fixing y = f (x), we get that

0 = f (0 ) = f (x) + f (f (f (x)) − f (−x)) + x .

Using f (−y) = f (f (y)), we reduce it to

0 = f (x) + f (f (−x) − f (−x)) + x = f (x) + f (0) + x = f (x) + x,

as f (0) = 0. It follows that f (x) = −x, and one can directly check that this function satisfies the
original condition:

f (f (x) − y) = −f (x) + y = x + y

= −x − f (y) + f (−x) + x

= f (x) + f (f (y) − f (−x)) + x.

Problem 3.3.1. Prove that if a function f : R → R satisfies the condition f (x) = f (2x) = f (1 − x)
for all x ∈ R, then it is periodic (that is, there exists some $a \in \mathbb{R}_+$ such that f (x + a) = f (x) for all
x ∈ R).

Solution. Let f : R → R be any function that satisfies the condition. We will show that f is periodic
with period a = 1/2. Indeed, note that for any x ∈ R, we have that

f (x + 1/2) = f (1 − (x + 1/2)) = f (1/2 − x)

= f (2(1/2 − x)) = f (1 − 2x)

= f (1 − (1 − 2x)) = f (2x) = f (x).

Problem 3.3.2. Given that the function $f(x)$ satisfies

$$f\left(\frac{1}{1 - x}\right) = x f(x) + 1,$$

find the value of $f(5)$.

Solution. Let us first make an observation that for any $x \neq 0$ we have

$$f(x) = \frac{f(1/(1 - x)) - 1}{x}.$$

Using this formula three times we get $f(5) = (f(-1/4) - 1)/5$, $f(-1/4) = -4(f(4/5) - 1)$, and
$f(4/5) = 5(f(5) - 1)/4$. It follows that

$$f(5) = \bigl(-4(5(f(5) - 1)/4 - 1) - 1\bigr)/5 = -f(5) + 8/5,$$

and so $f(5) = 4/5$.
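The three substitutions form a closed cycle $5 \to -1/4 \to 4/5 \to 5$ under $x \mapsto 1/(1 - x)$, so the computation reduces to one linear equation in the unknown $f(5)$. The sketch below replays it with exact rational arithmetic; the function names are illustrative and are not taken from the book.

```python
from fractions import Fraction


def next_argument(x):
    """The substitution x -> 1/(1 - x) appearing in the functional equation."""
    return 1 / (1 - x)


def solve_f(start):
    """Solve for f(start), assuming the orbit of start closes after three steps."""
    orbit = [Fraction(start)]
    for _ in range(3):
        orbit.append(next_argument(orbit[-1]))
    assert orbit[3] == orbit[0], "the orbit is expected to close after three steps"

    # Track each value as slope * u + offset, where u = f(start) is the unknown.
    slope, offset = Fraction(1), Fraction(0)  # f(orbit[3]) = f(start) = u
    for x in (orbit[2], orbit[1], orbit[0]):
        # f(x) = (f(1/(1 - x)) - 1) / x
        slope, offset = slope / x, (offset - 1) / x
    # At this point u = slope * u + offset.
    return offset / (1 - slope)


f5 = solve_f(5)
print(f5)  # 4/5
```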


Problem 3.3.3. Suppose that a function $f(x, y, z)$ of three real arguments satisfies the following
condition

$$\sum_{i=1}^{5} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{5} x_i,$$

where $x_{i+5} = x_i$. Prove that for all $n \ge 5$ we have

$$\sum_{i=1}^{n} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{n} x_i,$$

where $x_{i+n} = x_i$.
Solution. By considering $(x_1, x_2, x_3, x_4, x_5) = (0, 0, 0, 0, 0)$, we get that $f(0, 0, 0) = 0$. On the other
hand, for $(x_1, x_2, x_3, x_4, x_5) = (a, b, c, 0, 0)$ we get

$$f(a, b, c) + f(b, c, 0) + f(c, 0, 0) + f(0, 0, a) + f(0, a, b) = a + b + c.$$

Finally, for $(x_1, x_2, x_3, x_4, x_5) = (0, b, c, 0, 0)$ we get almost the same equality, namely,

$$f(0, b, c) + f(b, c, 0) + f(c, 0, 0) + f(0, 0, 0) + f(0, 0, b) = b + c,$$

so there is hope that after subtracting the two, many values will cancel out. Indeed, after subtracting the
two equalities and using the fact that $f(0, 0, 0) = 0$ we get

$$f(a, b, c) = a + f(0, 0, b) - f(0, 0, a) + f(0, b, c) - f(0, a, b).$$

It follows that

$$\sum_{i=1}^{n} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bigl(f(0, 0, x_{i+1}) - f(0, 0, x_i)\bigr) + \sum_{i=1}^{n} \bigl(f(0, x_{i+1}, x_{i+2}) - f(0, x_i, x_{i+1})\bigr).$$

Since $x_{i+n} = x_i$, all the terms in the second and the third sum cancel out and we finally get that

$$\sum_{i=1}^{n} f(x_i, x_{i+1}, x_{i+2}) = \sum_{i=1}^{n} x_i,$$

as required.
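The telescoping in the last step can be illustrated numerically. The sketch below picks one illustrative function of the form $f(a, b, c) = a + h(b) - h(a) + g(b, c) - g(a, b)$; the particular `h` and `g` are arbitrary choices made here, not taken from the book. It then checks that the cyclic sums agree with the plain sums for several lengths $n$.

```python
def h(t):
    """Arbitrary one-argument function (illustrative choice)."""
    return t * t - 3 * t


def g(s, t):
    """Arbitrary two-argument function (illustrative choice)."""
    return 2 * s * t + t ** 3


def f(a, b, c):
    """A function whose cyclic sums telescope."""
    return a + h(b) - h(a) + g(b, c) - g(a, b)


def cyclic_sum(xs):
    """Sum of f(x_i, x_{i+1}, x_{i+2}) with indices taken modulo len(xs)."""
    n = len(xs)
    return sum(f(xs[i], xs[(i + 1) % n], xs[(i + 2) % n]) for i in range(n))


samples = [
    [1, 2, 3, 4, 5],
    [7, -2, 0, 11, 3, 3],
    [4, 4, -9, 1, 0, 6, -5, 2],
]
checks = [cyclic_sum(xs) == sum(xs) for xs in samples]
```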
Problem 3.4.1. Let $f_1 = 0$, $f_2 = 1$, and $f_{n+2} = f_{n+1} + f_n$ for all $n \in \mathbb{N}$. Find all polynomials $P(x)$
having only integer coefficients with the property that for each $n \in \mathbb{N}$ there exists $k = k(n) \in \mathbb{Z}$ such
that $P(k) = f_n$.

Solution. Suppose that $P(x)$ is a polynomial with only integer coefficients, that is, $P(x) = \sum_{i=0}^{r} c_i x^i$
for some $r \in \mathbb{N}$ and $c_i \in \mathbb{Z}$ for all $i \in [r] \cup \{0\}$. Let us start with proving the following useful
property that we will use many times. Let $p$ and $q$ be any integers. Note that

$$P(p) - P(q) = \sum_{i=0}^{r} c_i p^i - \sum_{i=0}^{r} c_i q^i = \sum_{i=1}^{r} c_i (p^i - q^i) = (p - q) \sum_{i=1}^{r} c_i \sum_{j=0}^{i-1} p^j q^{i-1-j}, \qquad (8.8)$$

and so $(p - q)$ divides $P(p) - P(q)$.


Suppose that $P(x)$ satisfies the desired property: for each $n \in \mathbb{N}$ there exists $k = k(n) \in \mathbb{Z}$ such
that $P(k) = f_n$. In particular, for $n = 1$ and $n = 2$ we get that there exist $a = k(1) \in \mathbb{Z}$ and
$b = k(2) \in \mathbb{Z}$ such that $P(a) = f_1 = 0$ and $P(b) = f_2 = 1$. It follows from (8.8) that $(b - a)$ divides
$P(b) - P(a) = 1$, and so $|b - a| = 1$.

Let us define the auxiliary polynomial $Q(x) := P(a + (b - a)x)$. Clearly, $Q(0) = P(a) = 0$ and
$Q(1) = P(b) = 1$. We will prove by induction that $Q(f_i) = f_i$ for all $i \in \mathbb{N}$. This will finish the proof
as the only polynomial that satisfies this property is $Q(x) = x$. Indeed, each polynomial $R(x)$ of
degree at least 2 has the property that $|R(x)| > x$ for all $x \ge x_0$, where $x_0$ is a sufficiently large
constant. Since $Q(x) = x$ for infinitely many natural numbers $x$, we get that $Q(x)$ has to be of degree
at most 1. Constant polynomials are clearly ruled out and $Q(x) = x$ is the only linear function that
satisfies the property. Using the fact that $|b - a| = 1$, we get that the only polynomials that satisfy the
original property are polynomials of the form $P(x) = x + c$ or $P(x) = -x + c$ for some $c \in \mathbb{Z}$. It is
straightforward to check that, indeed, they satisfy the desired property.
It remains to show that $Q(f_i) = f_i$ for all $i \in \mathbb{N}$. We already showed that this property holds for
$i = 1$ and $i = 2$. For the base case, we will show that it also holds for $i \in \{3, 4, 5, 6, 7\}$. In fact, we
will prove something stronger, namely, that $f_i$ is the only integer $k$ that satisfies $Q(k) = f_i$.

Case: $i = 3$. Suppose that $Q(k) = f_3 = 2$ for some $k \in \mathbb{Z}$. Using (8.8) we get that $k - 1$ divides
$Q(k) - Q(1) = f_3 - f_2 = 2 - 1 = 1$, and so $k - 1 = 1$ or $k - 1 = -1$. Since $Q(0) = 0$, $k = 0$ is
ruled out and we get that $k = 2$ is the unique solution.
Case: $i = 4$. Suppose that $Q(k) = f_4 = 3$ for some $k \in \mathbb{Z}$. Using the same argument as before, we get
that $k - 2$ divides $Q(k) - Q(2) = f_4 - f_3 = 3 - 2 = 1$, which implies that $k - 2 = 1$ or
$k - 2 = -1$. Since $k = 1$ is ruled out ($Q(1) = 1 \neq 3$), we get that $k = 3 = f_4$ is the unique solution.
Case: $i = 5$. If $Q(k) = f_5 = 5$, then $(k - 0) \mid (5 - 0)$ and so $k \in \{-5, -1, 5\}$, as $k = 1$ is already
ruled out. However, since also $(k - 3) \mid (5 - 3)$, we get that $k = 5$ is the unique solution.
Case: $i = 6$. If $Q(k) = f_6 = 8$, then $(k - 5) \mid (8 - 5)$ and so $k - 5 \in \{-3, -1, 1, 3\}$. It follows
that $k \in \{4, 6, 8\}$, as $k = 2$ is already ruled out. Moreover, $(k - 1) \mid (8 - 1)$, and so $k = 8$ is the
only solution.
Case: $i = 7$. If $Q(k) = f_7 = 13$, then $(k - 0) \mid (13 - 0)$ so $k \in \{-13, -1, 13\}$ as $k = 1$ is ruled
out. But also $(k - 8) \mid (13 - 8)$, and so $k = 13$ is the only solution.
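The five base cases can be replayed mechanically. The sketch below is an illustration, not the authors' procedure. For each target value in $2, 3, 5, 8, 13$ it searches for integers $k$ that are compatible with property (8.8) against every point $t$ already known to satisfy $Q(t) = t$, that is, $(k - t) \mid (\text{target} - t)$. The name `compatible_candidates` and the search bound are assumptions made for this example.

```python
def compatible_candidates(target, fixed_points, bound=100):
    """Integers k in [-bound, bound] that could satisfy Q(k) = target.

    Uses only the divisibility property: (k - t) divides Q(k) - Q(t) = target - t
    for every known fixed point t of Q. Known fixed points are skipped because
    Q(t) = t differs from target.
    """
    candidates = []
    for k in range(-bound, bound + 1):
        if k in fixed_points:
            continue
        if all((target - t) % (k - t) == 0 for t in fixed_points):
            candidates.append(k)
    return candidates


fixed = [0, 1]  # Q(0) = 0 and Q(1) = 1
results = []
for target in (2, 3, 5, 8, 13):
    found = compatible_candidates(target, fixed)
    results.append(found)
    fixed.append(target)  # once k = target is forced, Q(target) = target
```

Since $0$ is a fixed point, any candidate must divide the target, so the bound of 100 loses nothing for these values.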


Let us now move to the inductive step. Suppose that for some integer $n \ge 7$, for each $i \in [n]$ we
have that $k = f_i$ is the unique integer solution to $Q(k) = f_i$. Our goal is to show that $k = f_{n+1}$ is the
unique integer solution to $Q(k) = f_{n+1}$.

Let $k \in \mathbb{Z}$ be such that $Q(k) = f_{n+1}$. Since $Q(0) = 0$, we get from (8.8) that
$k - 0 \mid Q(k) - Q(0)$ and so $k \mid Q(k)$. Moreover, from the same property it follows that $k - f_n$
divides $f_{n+1} - f_n = f_{n-1}$ and so $-f_{n-1} \le k - f_n \le f_{n-1}$. We conclude that
$5 = f_5 \le f_{n-2} < k \le f_{n+1}$ (note that $Q(f_{n-2}) = f_{n-2} \neq f_{n+1}$ so $k = f_{n-2}$ is ruled out). Since
$f_{n+1} < 2f_n < 4f_{n-1} < 8f_{n-2}$, we get that $k = f_{n+1}/x$ for some $x \in [7]$, as $k$ divides $Q(k) = f_{n+1}$.

Recall that our goal is to show that $x = 1$. For a contradiction, suppose that $x > 1$. Applying (8.8)
twice, we get that $(k - 1) \mid (xk - 1)$ and $(k - 2) \mid (xk - 2)$. In other words, there exist $a, b \in \mathbb{N}$
such that $b > a > 1$, $a(k - 1) = xk - 1$, and $b(k - 2) = xk - 2$. It follows that
$a(k - 1) - b(k - 2) = 1$, or equivalently that $(b - a)(k - 1) = b - 1$. It will be convenient to fix
$c := b - a \in \mathbb{N}$. We get that $b = c(k - 1) + 1$ and so $b(k - 2) = (c(k - 1) + 1)(k - 2) = xk - 2$.
This means that $k$ divides $(c(k - 1) + 1)(k - 2) + 2 = k(ck - 3c + 1) + 2c$ and so we get that
$k \mid 2c$.
Let us now summarize what we have learnt. We showed the following three things: $x \le 7$, $k \ge 6$,
and $k \mid 2c$. But using these observations, we get that

$$c = b - a = \frac{xk - 2}{k - 2} - \frac{xk - 1}{k - 1} = \frac{k(x - 1)}{(k - 1)(k - 2)} \le \frac{6}{k - 3 + 2/k} \le \frac{6}{6 - 3 + 2/6} = \frac{9}{5},$$

and so $c = 1$. But this is not possible as $k \ge 6$ and $k \mid 2c$.


Problem 3.4.2. Suppose that a polynomial P (x) has all integer coefficients. Prove that if polynomials
P (P (P (x))) and P (x) have a common real root, then P (x) also has an integer root.

Solution. Let a ∈ R be a common root of P (P (P (x))) and P (x), that is, P (P (P (a))) = P (a) = 0.
Since P (a) = 0, we have P (P (P (a))) = P (P (0)), and so P (P (0)) = 0, that is, P (0) is also a root of P (x).
But P (0) is an integer, as P (x) has all integer coefficients; in particular, the free term is an integer.
Problem 3.4.3. Consider a polynomial $P(x) = x^2 + ax + b$ with $a, b \in \mathbb{Z}$. Suppose that for every
prime number $p$, there exists $k \in \mathbb{Z}$ such that $P(k)$ and $P(k + 1)$ are divisible by $p$. Prove that there
exists $m \in \mathbb{Z}$ such that $P(m) = P(m + 1) = 0$.
Solution. Fix any prime number $p$. By our assumption, there exists $k = k(p) \in \mathbb{Z}$ such that both $P(k)$
and $P(k + 1)$ are divisible by $p$. Our goal is to find a number which does not depend on $k$ that is
divisible by $p$. To that end, note that

$$P(k + 1) - P(k) = ((k + 1)^2 + a(k + 1) + b) - (k^2 + ak + b) = 2k + (a + 1)$$

is divisible by $p$ and so is

$$2P(k) - k(2k + (a + 1)) = 2k^2 + 2ak + 2b - (2k^2 + k(a + 1)) = k(a - 1) + 2b.$$

Finally, observe that

$$2(k(a - 1) + 2b) - (a - 1)(2k + (a + 1)) = 4b - (a - 1)(a + 1) = -a^2 + 1 + 4b$$

is divisible by $p$. Since this property holds for any prime $p$, and the only integer divisible by every
prime is 0, we get that $a^2 = 4b + 1$. In particular, $a$ is odd, that is, $a = 2s + 1$ for some $s \in \mathbb{Z}$.
We get that $a^2 = 4s^2 + 4s + 1 = 4b + 1$, and so $b = s(s + 1)$.
Substituting this back into the quadratic polynomial, we get that

$$P(x) = x^2 + (2s + 1)x + s(s + 1) = (x + s)(x + s + 1).$$

From this it is clear that if $m = -s - 1 \in \mathbb{Z}$, then $P(m) = P(m + 1) = 0$, the desired property.


Problem 3.5.1. Find all polynomials $P(x)$ with real coefficients that satisfy the following property: if
$x + y$ is rational, then $P(x) + P(y)$ is rational.

Solution. Consider any polynomial $P(x)$ with real coefficients that satisfies the desired property. Fix
any rational number $q$ and consider the polynomial $Q(x) := P(q + x) + P(q - x)$ for all $x \in \mathbb{R}$.
Since for each $x \in \mathbb{R}$ we have that $q + x + (q - x) = 2q$ is rational, it follows that $Q(x)$ is rational
for all $x \in \mathbb{R}$. But, since $Q(x)$ is continuous, it is only possible when $Q(x)$ is constant. In particular,

$$P(q + q) + P(q - q) = Q(q) = Q(0) = P(q + 0) + P(q - 0),$$

so $P(2q) + P(0) = 2P(q)$. It will be convenient to represent $P(x)$ as follows:
$P(x) = x^2 R(x) + ax + b$ for some polynomial $R(x)$. It follows that

$$(2q)^2 R(2q) + a(2q) + b + b = 2(q^2 R(q) + aq + b)$$

so, assuming that $q \neq 0$, we get that $2R(2q) = R(q)$. Since this argument holds for all rational
numbers $q \neq 0$, we get that $R(2^n q) = R(q)/2^n$. It follows that $R(2^n q) \to 0$ as $n \to \infty$, whereas a
nonzero polynomial is either a nonzero constant or tends to $+\infty$ or $-\infty$ as its argument tends to $+\infty$.
This is only possible if $R(q) = 0$ for all rational numbers and so $R(x) = 0$ everywhere. As
a result, we get that $P(x) = ax + b$. Now, after letting $x = 0$ we see that $b$ must be rational, and after
letting $x = 1$ we see that $a$ must also be rational. Finally, one can directly check that if $a$ and $b$ are
rational, then the desired condition is satisfied.
Problem 3.5.2. Let $P(x)$ be a polynomial with real coefficients. Prove that if there exists an integer $k$
such that $P(k)$ is not an integer, then there are infinitely many such integers.
Solution. Let $P(x)$ be any polynomial such that $P(k) \notin \mathbb{Z}$ for some $k \in \mathbb{Z}$. It will be more convenient
to work with the polynomial $Q(x) := P(x + k)$ instead of $P(x)$. Indeed, if there are infinitely many
integers $\ell$ such that $Q(\ell) \notin \mathbb{Z}$, then clearly the same property holds for $P(x)$. An advantage of working
with $Q(x)$ is that, by assumption, $Q(0) = P(k) \notin \mathbb{Z}$ and evaluating polynomials at $x = 0$ is easy.
For a contradiction, suppose that the set $A := \{x \in \mathbb{Z} : Q(x) \notin \mathbb{Z}\}$ is finite, which implies that the
set $B := \mathbb{Z} \setminus A = \{x \in \mathbb{Z} : Q(x) \in \mathbb{Z}\}$ is infinite. Suppose that the degree of $Q(x)$ is $n \in \mathbb{N} \cup \{0\}$.
In fact, $n \neq 0$ as $Q(0) \notin \mathbb{Z}$ and $Q(x) \in \mathbb{Z}$ for any $x \in B$, and so $Q(x)$ is not a constant polynomial.
Since $B$ is infinite, we may consider $n + 1$ points $(x_i, y_i)$, where both $x_i$ and $y_i = Q(x_i)$ are integers. From
the Lagrange interpolation formula for these points, we get that all the coefficients of $Q(x)$ are rational.
It follows that $Q(x) = \sum_{i=0}^{n} \frac{n_i}{d_i} \cdot x^i$, where $n_i \in \mathbb{Z}$, $d_i \in \mathbb{N}$, and $Q(0) = n_0/d_0 \notin \mathbb{Z}$; in particular,
$d_0 \ge 2$.

Let us now consider the sequence of natural numbers defined as follows: $y_t := \left(\prod_{i=0}^{n} d_i\right)^t$ for $t \in \mathbb{N}$.
Since $d_0 \ge 2$, we get that $\prod_{i=0}^{n} d_i > 1$ and so the sequence $(y_t)_{t \in \mathbb{N}}$ is increasing. Moreover,

$$Q(y_t) = \sum_{i=0}^{n} \frac{n_i}{d_i} \cdot y_t^i = c_t + \frac{n_0}{d_0},$$

where $c_t$ is some integer. It follows that $Q(y_t) \notin \mathbb{Z}$ for all $t \in \mathbb{N}$, and so we have an infinite sequence
$(y_t, Q(y_t))_{t \in \mathbb{N}}$ of distinct pairs consisting of an integer and a non-integer, which contradicts the fact that $A$
is finite. The conclusion is that $A$ is infinite, and so the proof is complete.
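The key computation, that every power of the product of the denominators is mapped to an integer plus $n_0/d_0$, can be seen on a concrete polynomial. The sketch below uses an illustrative $Q(x) = 1/2 + (2/3)x + (5/4)x^2$ chosen for this example (it is not from the book) and exact rational arithmetic.

```python
from fractions import Fraction
from math import prod

# Coefficients n_i / d_i of the illustrative polynomial, constant term first.
coefficients = [Fraction(1, 2), Fraction(2, 3), Fraction(5, 4)]


def Q(x):
    """Evaluate the polynomial exactly at an integer or rational point."""
    return sum(c * x ** i for i, c in enumerate(coefficients))


D = prod(c.denominator for c in coefficients)  # product of the d_i, here 2 * 3 * 4
values = [Q(D ** t) for t in range(1, 8)]      # Q(y_t) for y_t = D^t

# Each value differs from n_0 / d_0 by an integer c_t, so none of them is an integer.
integer_parts = [v - coefficients[0] for v in values]
```

Every entry of `integer_parts` has denominator 1, while every entry of `values` keeps the denominator 2 of the constant term.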


Problem 3.5.3. Let $F(x)$, $G(x)$, and $H(x)$ be some polynomials of degree at most $2n + 1$ with real
coefficients. Moreover, suppose that the following properties hold:

(1) for all $x \in \mathbb{R}$, $F(x) \le G(x) \le H(x)$,
(2) there exist $n$ different numbers $x_i \in \mathbb{R}$, $i \in [n]$, such that $F(x_i) = H(x_i)$ for all $i \in [n]$,
(3) there exists $x_0 \in \mathbb{R}$, different than $x_i$ for $i \in [n]$, such that $F(x_0) + H(x_0) = 2G(x_0)$.

Prove that for all $x \in \mathbb{R}$, $F(x) + H(x) = 2G(x)$.

Solution. Consider any polynomials $F(x)$, $G(x)$, and $H(x)$ that are of degree at most $2n + 1$ and that
satisfy the properties (1)–(3). It will be convenient to consider two auxiliary polynomials,
$P(x) := G(x) - F(x)$ and $Q(x) := H(x) - F(x)$. It follows from property (1) that for each $x \in \mathbb{R}$,
$Q(x) \ge P(x) \ge 0$. In particular, it means that both polynomials are of degree at most $2n$ as having
degree $2n + 1$ would imply that either $\lim_{x \to \infty} P(x) = -\infty$ or $\lim_{x \to -\infty} P(x) = -\infty$. Moreover,
from property (2) it follows that for $i \in [n]$ we have $P(x_i) = Q(x_i) = 0$. But this means that all of the
$x_i$ are roots of even multiplicity. Since we have $2n$ roots in total (including multiplicities) and $P(x)$ and
$Q(x)$ have degree at most $2n$, we get that there exists $a \in [0, 1]$ such that $P(x) = aQ(x)$, for all $x \in \mathbb{R}$.
Now, using property (3) we get that

$$\frac{Q(x_0)}{2} = \frac{H(x_0)}{2} - \frac{F(x_0)}{2} = \frac{2G(x_0) - F(x_0)}{2} - \frac{F(x_0)}{2} = G(x_0) - F(x_0) = P(x_0),$$

and so $a = 1/2$. It follows that for each $x \in \mathbb{R}$, $2(G(x) - F(x)) = H(x) - F(x)$, and so
$2G(x) = F(x) + H(x)$, as needed.
Problem 3.6.1. Find all real numbers $m$ for which the polynomial
$f(x) = 2x^4 - 7x^3 + mx^2 + 22x - 8$ has two real roots whose product is equal to 2.

Solution. Suppose that the polynomial $f(x)$ has two real roots $a$ and $b$ such that $ab = 2$. It follows that
$f(x) = (x - a)(x - b)(2x^2 + cx + d)$ for some $a, b, c, d \in \mathbb{R}$. After comparing the corresponding
coefficients, we get the following system of equations:

$$\begin{aligned} -2a - 2b + c &= -7 \\ 2ab - c(a + b) + d &= m \\ abc - (a + b)d &= 22 \\ abd &= -8. \end{aligned}$$

Since $ab = 2$, we get from the last equation that $d = -4$. Substituting it into the third equation and
adding twice the first one, we get that $c = 2$. It follows that $a + b = 9/2$, and so $m = -9$ is the only
possible solution. Since

$$2x^4 - 7x^3 - 9x^2 + 22x - 8 = 2(x - 4)(x - 1)(x - 1/2)(x + 2),$$

we get that, indeed, the polynomial $f(x)$ has two roots, namely 4 and 1/2, whose product is equal to 2,
as required.
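The final factorization can be double-checked by multiplying the factors back together. The sketch below is a small illustration with a hand-written coefficient-list multiplication; the helper `poly_mul` is introduced here and is not part of the book. To stay within integers, the factor $2(x - 1/2)$ is written as $(2x - 1)$.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


# (2x - 1), (x - 4), (x - 1), (x + 2), each with the constant term first.
factors = [[-1, 2], [-4, 1], [-1, 1], [2, 1]]

expanded = [1]
for factor in factors:
    expanded = poly_mul(expanded, factor)

print(expanded)  # [-8, 22, -9, -7, 2], i.e. 2x^4 - 7x^3 - 9x^2 + 22x - 8
```

The coefficient of $x^2$ in the expansion is $-9$, in agreement with $m = -9$.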
Problem 3.6.2. Given the polynomial $P(x) = x^4 - 3x^3 + 5x^2 - 9x$, $x \in \mathbb{R}$, find all pairs of integers
$a$ and $b$ such that $a \neq b$ and $P(a) = P(b)$.
Solution. Let us first note that

$$P(-x + 1) - P(x) = (2x - 1)(x^2 - x + 6) > 0,$$

provided $x > 1/2$. Moreover,

$$P(x + 1) - P(-x + 1) = 2(x - 2)x(x + 2) > 0,$$

provided that $x > 2$. It follows that for $x \in \mathbb{N} \setminus \{1, 2\}$, we have that

$$P(x) < P(-x + 1) < P(x + 1).$$

As a result, there are no distinct $a, b \in \mathbb{Z} \setminus \{-1, 0, 1, 2\}$ such that $P(a) = P(b)$. We directly compute that
$P(-1) = 18$, $P(0) = 0$, $P(1) = -6$, $P(2) = -6$, and $P(3) = 18$. Since all the values of the
polynomial $P(x)$ evaluated at integers greater than 3 or smaller than $-1$ are greater than $P(3) = 18$,
we get that there are only four solutions to the problem:

$$(a, b) \in \{(-1, 3), (3, -1), (1, 2), (2, 1)\}.$$
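A brute-force search over a finite window of integers gives the same four pairs. This is only a sanity check under an assumed search bound (the monotonicity argument above is what rules out larger values); the function name `equal_value_pairs` is illustrative.

```python
def P(x):
    return x ** 4 - 3 * x ** 3 + 5 * x ** 2 - 9 * x


def equal_value_pairs(bound):
    """All ordered pairs (a, b) with a != b, |a|, |b| <= bound and P(a) == P(b)."""
    by_value = {}
    for x in range(-bound, bound + 1):
        by_value.setdefault(P(x), []).append(x)
    return {
        (a, b)
        for xs in by_value.values()
        for a in xs
        for b in xs
        if a != b
    }


pairs = equal_value_pairs(200)
print(sorted(pairs))  # [(-1, 3), (1, 2), (2, 1), (3, -1)]
```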

Problem 3.6.3. Find all polynomials $P(x)$ with real coefficients that satisfy the following property: for
all $x \in \mathbb{R}$, $P(x^2) \cdot P(x^3) = (P(x))^5$.
Solution. We will independently consider two cases depending on how many terms the considered
polynomial has.
Let us first assume that $P(x)$ has exactly one term (including the special case $P(x) = 0$ for $x \in \mathbb{R}$),
that is, $P(x) = ax^k$ for some $a \in \mathbb{R}$ and $k \in \mathbb{N} \cup \{0\}$. Substituting this into the equation we want to
hold for all $x \in \mathbb{R}$, we get that $ax^{2k} \cdot ax^{3k} = a^5 x^{5k}$. In particular, by considering $x = 1$, we get that
$a^2 = a^5$ and so $a^2 (a - 1)(a^2 + a + 1) = 0$. It follows that $a = 0$ or $a = 1$, and so the only solutions
in this case are: $P(x) = 0$, $P(x) = 1$, and $P(x) = x^k$ for some fixed $k \in \mathbb{N}$. It is straightforward to
directly check that all of these polynomials satisfy the original equation.
Let us now assume that $P(x)$ has more than one term and has degree $k \in \mathbb{N}$. Then, it can be
represented as follows: $P(x) = ax^k + bx^{\ell} + Q(x)$, where $a, b \in \mathbb{R} \setminus \{0\}$, $\ell \in \mathbb{Z}$ such that $0 \le \ell < k$,
and $Q(x)$ has degree less than $\ell$ (or $Q(x) = 0$ everywhere if $\ell = 0$). Substituting this form into the
original equation we get that for all $x \in \mathbb{R}$,

$$\bigl(ax^{2k} + bx^{2\ell} + Q(x^2)\bigr)\bigl(ax^{3k} + bx^{3\ell} + Q(x^3)\bigr) = \bigl(ax^k + bx^{\ell} + Q(x)\bigr)^5.$$

Let us now compare the coefficients in front of the term $x^{4k+\ell}$ on both the left and the right hand side
of the above equation. The first term on the left hand side is clearly $a^2 x^{5k}$ but, since $\ell < k$, the next
non-zero term is $abx^{3k+2\ell}$. Since $3k + 2\ell < 4k + \ell$, there is no term we are looking for. Alternatively,
we may say that the coefficient in front of $x^{4k+\ell}$ is equal to 0. On the other hand, the right hand side
after expanding is equal to $a^5 x^{5k} + 5a^4 b x^{4k+\ell} + R(x)$, where $R(x)$ has degree less than $4k + \ell$. It
follows that the coefficient in front of the term $x^{4k+\ell}$ is equal to $5a^4 b \neq 0$. This contradiction proves
that $P(x)$ cannot have more than one term. We conclude that the only polynomials that satisfy the
desired property are those that we found in the previous case.
Problem 3.7.1. Prove that there are no polynomials $P_1(x), P_2(x), P_3(x), P_4(x)$ with rational
coefficients that satisfy

$$\sum_{i=1}^{4} (P_i(x))^2 = x^2 + 7 \quad \text{for all } x \in \mathbb{R}. \qquad (8.9)$$

Solution. Due to the symmetry, without loss of generality, we may assume that
$n := n_1 \ge n_2 \ge n_3 \ge n_4 \ge 0$, where $n_i$ is the degree of $P_i(x)$, $i \in [4]$. For $i \in [4]$, let $c_i \in \mathbb{R}$ be the
coefficient in front of the term $x^n$ in $P_i(x)$. Clearly, $c_1 \neq 0$. More importantly, after expanding the left
hand side of (8.9), the coefficient in front of the term $x^{2n}$ is equal to $\sum_{i=1}^{4} c_i^2 \ge c_1^2 > 0$. Since the
degree of the right hand side of (8.9) is 2, we get that $n = 1$. As a result, since all the coefficients are
rational, we may represent each polynomial $P_i(x)$ as follows: $P_i(x) = (a_i x + b_i)/m$, where $a_i$, $b_i$
($i \in [4]$), and $m$ are some integers.

After comparing the coefficients in (8.9) that are in front of the term $x^k$ for $k \in \{2, 1, 0\}$, we get the
following set of equations:

$$\begin{aligned} a_1^2 + a_2^2 + a_3^2 + a_4^2 &= m^2 \\ a_1 b_1 + a_2 b_2 + a_3 b_3 + a_4 b_4 &= 0 \\ b_1^2 + b_2^2 + b_3^2 + b_4^2 &= 7m^2. \end{aligned}$$

For $i \in [4]$, let $p_i := a_i + b_i$ and $q_i := a_i - b_i$. Adding the first, the third, and twice the second
equation we get that $p_1^2 + p_2^2 + p_3^2 + p_4^2 = 8m^2$. Adding the first, the third, and subtracting the second
equation twice we get that $q_1^2 + q_2^2 + q_3^2 + q_4^2 = 8m^2$. Finally, after subtracting the third equation from
the first equation, we get that $p_1 q_1 + p_2 q_2 + p_3 q_3 + p_4 q_4 = -6m^2$. Summarizing, we get the
following system of equations:

$$\begin{aligned} p_1^2 + p_2^2 + p_3^2 + p_4^2 &= 8m^2 \\ q_1^2 + q_2^2 + q_3^2 + q_4^2 &= 8m^2 \\ p_1 q_1 + p_2 q_2 + p_3 q_3 + p_4 q_4 &= -6m^2. \end{aligned}$$

Let us note that, without loss of generality, we may assume that
$\gcd(p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m) = 1$. Indeed, if all the variables involved (namely,
$p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m$) had some common positive divisor $d$, one could divide the three
equations by $d^2$ to get an equivalent system of equations. Note that for any $x \in \mathbb{Z}$, the remainder of $x^2$
when divided by 8 is equal to 0, 1 or 4. Hence, from the first equation we get that all $p_i$ are even,
and from the second one it follows that all $q_i$ are even. But this means that $m$ is odd, since
$\gcd(p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4, m) = 1$. However, if $m$ is odd and all other variables are even, then the
left hand side of the third equation is divisible by 4 whereas the right hand side is not. The conclusion is
that there are no polynomials with rational coefficients that satisfy (8.9), and so the proof is complete.
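The two residue facts used in the last paragraph are finite statements modulo 8, so they can be checked exhaustively. The sketch below is an illustration added here; the names are not from the book.

```python
from itertools import product

# Possible remainders of a perfect square modulo 8.
square_residues = sorted({(x * x) % 8 for x in range(8)})


def four_squares_force_even():
    """If x1^2 + x2^2 + x3^2 + x4^2 is divisible by 8, are all x_i even?

    It suffices to test residues modulo 8, since both the sum of squares
    modulo 8 and the parity of each x_i depend only on x_i modulo 8.
    """
    return all(
        all(v % 2 == 0 for v in values)
        for values in product(range(8), repeat=4)
        if sum(v * v for v in values) % 8 == 0
    )


print(square_residues)            # [0, 1, 4]
print(four_squares_force_even())  # True
```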
Problem 3.7.2. Consider a polynomial $f(x) := x^2 + bx + c$, where $b, c \in \mathbb{Z}$. Prove that if $n \in \mathbb{N}$
divides $f(p)$, $f(q)$, and $f(r)$ for some $p, q, r \in \mathbb{Z}$, then

$$n \mid (p - q)(q - r)(r - p).$$

Solution. Note that if $n \mid f(p)$ and $n \mid f(q)$, then $n$ divides

$$f(p) - f(q) = p^2 + bp + c - (q^2 + bq + c) = (p - q)(p + q) + b(p - q) = (p - q)((p + q) + b).$$

Similarly, we get that $n \mid (q - r)((q + r) + b)$ and $n \mid (r - p)((r + p) + b)$. It follows that $n$ also
divides

$$\begin{aligned} & r(p - q)((p + q) + b) + p(q - r)((q + r) + b) + q(r - p)((r + p) + b) \\ &\quad = r(p - q)(p + q) + p(q - r)(q + r) + q(r - p)(r + p) \\ &\quad = (p - q)(rp + rq) + pq^2 - pr^2 + qr^2 - qp^2 \\ &\quad = (p - q)(rp + rq) - (p - q)(pq + r^2) \\ &\quad = (p - q)(rp + rq - pq - r^2) \\ &\quad = (p - q)(q - r)(r - p), \end{aligned}$$

and so the desired property holds.


Alternatively, one could consider the above expression as a quadratic polynomial in $p$. It
is straightforward to check that $q$ and $r$ are both roots of this polynomial. Moreover, the coefficient in
front of $p^2$ is $(r - q)$. It follows, without having to perform the laborious calculations, that the above
expression is equal to $(p - q)(p - r)(r - q)$.
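Both the algebraic identity and the divisibility claim are easy to test on small integers. The sketch below is an illustration with names chosen here; the sample polynomial $x^2 + x$ and modulus $n = 6$ are arbitrary choices, not taken from the book.

```python
from itertools import product


def combination(p, q, r, b):
    """The linear combination of the three differences used in the solution."""
    return (
        r * (p - q) * (p + q + b)
        + p * (q - r) * (q + r + b)
        + q * (r - p) * (r + p + b)
    )


def target(p, q, r):
    return (p - q) * (q - r) * (r - p)


identity_holds = all(
    combination(p, q, r, b) == target(p, q, r)
    for p, q, r, b in product(range(-4, 5), repeat=4)
)


def divisibility_holds(b, c, n, window=range(-10, 11)):
    """Check n | (p - q)(q - r)(r - p) for all p, q, r in the window with n | f."""
    zeros_mod_n = [x for x in window if (x * x + b * x + c) % n == 0]
    return all(
        target(p, q, r) % n == 0 for p, q, r in product(zeros_mod_n, repeat=3)
    )


sample_ok = divisibility_holds(b=1, c=0, n=6)
```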
Problem 3.7.3. Consider a polynomial P (x) with integer coefficients that satisfies the following
property: if a, b ∈ Q and a ≠ b, then P (a) ≠ P (b). Does it mean that P (a) ≠ P (b) for all a, b ∈ R,
a ≠ b?

Solution. We will show that this is not true for $P(x) := x^3 - 2x$. First, let us observe that
$P(x) = x(x^2 - 2) = x(x - \sqrt{2})(x + \sqrt{2})$ and so, in particular, $P(0) = P(\sqrt{2}) = 0$. Hence, it is
enough to show that $P(x)$ has the desired property, namely, that there are no two different rational
numbers $a$ and $b$ such that $P(a) = P(b)$. For a contradiction, suppose that there are $q_1, q_2 \in \mathbb{Q}$ such
that $q_1 \neq q_2$ and $P(q_1) = P(q_2)$. It follows that

$$0 = P(q_1) - P(q_2) = q_1^3 - 2q_1 - q_2^3 + 2q_2 = (q_1 - q_2)(q_1^2 + q_1 q_2 + q_2^2 - 2).$$

Since $q_1 \neq q_2$, we get that

$$q_1^2 + q_1 q_2 + q_2^2 = 2.$$

Since $q_1$ and $q_2$ are rational numbers, we may express these numbers as follows: $q_1 = a/c$ and
$q_2 = b/c$ for some $a, b, c \in \mathbb{Z}$, and $\gcd(a, b, c) = 1$. It follows that

$$a^2 + ab + b^2 = 2c^2. \qquad (8.10)$$

Note that $a$ and $b$ cannot be both even as then $c$ would be odd (since $\gcd(a, b, c) = 1$) and the left hand
side of (8.10) would be divisible by 4 whereas the right hand side would not. Similarly, if both $a$ and $b$
were odd, then the left hand side would be odd but the right hand side would be even. Finally, if only
one of the two numbers $a$ and $b$ is even and the other one is odd, then the left hand side is odd, which
is again impossible. We get the desired contradiction and so, indeed, the polynomial $P(x) = x^3 - 2x$ is
a counter-example to our problem.
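The counter-example can be probed numerically. The sketch below is a finite sanity check, not a proof: it looks for two distinct rationals with small numerators and denominators on which $x^3 - 2x$ takes the same value, and separately confirms the collision at the irrational point $\sqrt{2}$ in floating point. The bounds and the name `rational_collisions` are assumptions made for this example.

```python
from fractions import Fraction
from math import isclose, sqrt


def P(x):
    return x ** 3 - 2 * x


def rational_collisions(max_numerator=30, max_denominator=12):
    """Pairs of distinct rationals in the search window with equal values of P."""
    first_seen = {}
    collisions = []
    for den in range(1, max_denominator + 1):
        for num in range(-max_numerator, max_numerator + 1):
            q = Fraction(num, den)
            value = P(q)  # exact rational arithmetic
            if value in first_seen and first_seen[value] != q:
                collisions.append((first_seen[value], q))
            first_seen.setdefault(value, q)
    return collisions


collisions = rational_collisions()
real_collision = isclose(P(sqrt(2)), P(0), abs_tol=1e-9)
```

The list `collisions` comes back empty for this window, while `real_collision` is true.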
Let us make a final remark on how one can guess that $P(x) = x^3 - 2x$ is a counter-example to our
problem. Let us first consider polynomials with integer coefficients that are of degree 2. Such
polynomials can be expressed as follows: $P(x) := a(x - p)(x - q)$, where $a, -a(p + q), apq \in \mathbb{Z}$.
Since $P(0) = P(p + q) = apq$ and both 0 and $p + q$ are rational, no polynomial of degree 2 satisfies
the desired property.
Hence, we shift our attention to polynomials of degree 3 by considering polynomials of the form
$P(x) := a(x - p)(x - q)(x - r)$ with integer coefficients. No two of the three roots, say $p$ and $q$,
can be rational as then $P(p) = P(q) = 0$ fails the required assumption. So we have two options: all of
them are irrational or exactly one is rational. The second option seems easier to deal with and, without
loss of generality, we may assume that $p = 0$. Indeed, if $P(x)$ is a counter-example, then so is
$Q(x) := P(x + p)$. Since $P(x)$ has integer coefficients, we get that $a \in \mathbb{Z}$ and again, without loss of
generality, we may assume that $a = 1$. It follows that $P(x) = x(x - q)(x - r)$ with $q, r \in \mathbb{R} \setminus \mathbb{Q}$ but
$qr \in \mathbb{Z}$. A natural choice for $q$ is an irrational square root of some natural number and $r = -q$ so that
$P(x) = x(x^2 - q^2)$. Choosing $q = \sqrt{2}$ is an intuitive first guess, as it is related to a well known proof
that $\sqrt{2}$ is not a rational number.

8.4 Combinatorics
Problem 4.1.1. There are 2n members of a chess club; each member knows at least n other members
(knowing a person is a reciprocal relationship). Prove that it is possible to assign members of the club
into n pairs in such a way that in each pair both members know each other.
Solution. Let us first rephrase the question in the language of graph theory. Suppose that $G = (V, E)$ is
a graph on $|V| = 2n$ vertices with minimum degree $\delta = \delta(G) \ge n$. Our goal is to show that $G$ has
a perfect matching.
We will construct a perfect matching in $n$ rounds, distinguishing two phases. During the first phase,
we apply a trivial, greedy algorithm to construct a maximal matching, that is, a matching that cannot be
extended by adding an edge. We start with an empty matching $M_0 = (\emptyset, \emptyset)$. In each round $i$, we
consider the graph $G[V \setminus V(M_{i-1})]$ induced by unsaturated vertices. If it contains edges, then we
arbitrarily pick one of them (say, edge $a_i b_i$) and add it to the current matching; that is,
$V(M_i) = V(M_{i-1}) \cup \{a_i, b_i\}$ and $E(M_i) = E(M_{i-1}) \cup \{a_i b_i\}$. This phase ends if there are no more
edges to pick from. If all the vertices are saturated, then we are done; otherwise, we move on to the
second phase.
During the second phase, at the beginning of each round $i \le n$, set $V \setminus V(M_{i-1})$ contains at least
two vertices and it induces an independent set. We pick any two vertices, say, $p$ and $q$ from that set. We
will show that there is an edge in $E(M_{i-1})$, say, $rs$ such that $pr \in E$ and $qs \in E$. In other words, we
will show that there exists a path $(p, r, s, q)$ of length 3 (such paths are often called augmenting paths).
The existence of such paths allows us to improve the size of our matching. Indeed, we can simply
remove $rs$ from the matching and add edges $pr$ and $qs$ instead. Formally, $V(M_i) = V(M_{i-1}) \cup \{p, q\}$
and $E(M_i) = (E(M_{i-1}) \setminus \{rs\}) \cup \{pr, qs\}$.


i i−1

To finish the proof, let us note that $p$ has at least $n$ neighbors in $V(M_{i-1})$ (since $\delta \ge n$ and
$V \setminus V(M_{i-1})$ induces an independent set). Let $R := N(p) \subseteq V(M_{i-1})$, and let $S$ be the set of vertices
matched with vertices from $R$, that is, $S = \{s \in V(M_{i-1}) : sr \in E(M_{i-1}) \text{ for some } r \in R\}$. Clearly,
$|S| = |R| \ge \delta \ge n$. Moreover, $S$ and $R$ can overlap (and, in fact, they do) but it causes no problem.
More importantly, since $q$ has at least $n$ neighbors in $V(M_{i-1})$, $|S| \ge n$, and $|V(M_{i-1})| \le 2n - 2$, it
follows that $q$ has at least one neighbor in $S$, which finishes the argument.
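The two-phase construction translates directly into a short procedure. The following is a minimal sketch under the assumptions of the problem (an even number $2n$ of vertices, no loops, minimum degree at least $n$); the function name `perfect_matching`, the adjacency-list input format, and the six-vertex example graph are choices made for this illustration.

```python
def perfect_matching(adj):
    """Return a dict mate[v] describing a perfect matching.

    adj maps each vertex to a list of its neighbours. The graph is assumed to
    have 2n vertices and minimum degree at least n, as in the problem.
    """
    mate = {}
    # Phase 1: greedy maximal matching.
    for u in adj:
        if u in mate:
            continue
        for v in adj[u]:
            if v not in mate:
                mate[u], mate[v] = v, u
                break

    # Phase 2: the unsaturated vertices form an independent set; repair them
    # two at a time along an augmenting path (p, r, s, q).
    free = [v for v in adj if v not in mate]
    while free:
        p, q = free.pop(), free.pop()
        for r in adj[p]:
            s = mate[r]  # every neighbour of p is already matched
            if s in adj[q]:
                mate[p], mate[r] = r, p
                mate[q], mate[s] = s, q
                break
        else:
            raise ValueError("no augmenting path: degree condition violated")
    return mate


# Example with n = 3 in which the greedy phase gets stuck with 4 and 5 unmatched.
example = {
    0: [1, 3, 4], 1: [0, 4, 5], 2: [3, 4, 5],
    3: [2, 0, 5], 4: [0, 1, 2], 5: [1, 2, 3],
}
matching = perfect_matching(example)
```

On this example the greedy phase picks the edges 0-1 and 2-3, and the second phase replaces 0-1 by the edges 1-5 and 0-4.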
Finally, let us mention that a stronger property holds. Graph G not only contains a perfect matching,
but it in fact has to have a Hamilton cycle, that is, cycle of length 2n whose vertex set is precisely
V (G). This is a famous sufficient condition for the existence of a Hamilton cycle due to Dirac. It is

indeed a stronger property since one can take every second edge of a Hamilton cycle to form a perfect
matching.
Problem 4.1.2. There are 17 players in the tournament in which each pair of two players compete
against each other. Every game can last 1, 2, or 3 rounds. Prove that there exist three players who have
played exactly the same number of rounds with one another.
Solution. As before, let us rephrase this problem in the language of graph theory. The tournament can
be represented as coloring of the edges of K17, the complete graph on 17 vertices, with three colors
(say, red, blue, and green). Coloring edge vw red indicates that the game between players
corresponding to vertices v and w lasted one round. Similarly, blue and green indicate that the
corresponding game lasted two and, respectively, three rounds. Our goal is to show that, regardless how
the graph is colored, it must contain a monochromatic triangle (that is, the edges of some K3 are all the
same color).
In order to warm up, let us prove something slightly easier. Suppose that only two colors are
available (it does not matter which ones; without loss of generality, we may assume that we use red and
blue). We claim that, regardless how the edges of K6 are colored with these two selected colors, there is
a monochromatic triangle. Indeed, pick any vertex v and consider the 5 edges incident to v. Clearly, at
least three of them (say va, vb, and vc) must be of the same color, say red. If any one of ab, ac, bc is
red, then we have a red triangle. If none of these edges is red, then we have a blue triangle. This proves
the claim about two colors.
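The two-colour claim is small enough to confirm by exhaustive search: $K_6$ has 15 edges and therefore $2^{15} = 32768$ colourings. The sketch below is an illustration added here, with names chosen for this example; it also confirms that the claim fails for $K_5$.

```python
from itertools import combinations, product


def has_monochromatic_triangle(n, colour):
    """colour maps each edge (a, b) with a < b to its colour."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_colouring_has_triangle(n, colours=2):
    """Exhaustively test all edge colourings of the complete graph K_n."""
    edges = list(combinations(range(n), 2))
    return all(
        has_monochromatic_triangle(n, dict(zip(edges, assignment)))
        for assignment in product(range(colours), repeat=len(edges))
    )


k6_forced = every_colouring_has_triangle(6)  # every colouring of K_6 contains one
k5_forced = every_colouring_has_triangle(5)  # some colouring of K_5 avoids them
```

The same exhaustive approach is not practical for three colours on $K_{17}$, which has 136 edges; there the counting argument in the text is needed.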
Now, let us come back to the original problem with three colors and K17. Pick any vertex v and
consider the 16 edges incident to v. Since, 3 ⋅ 5 < 16, at least 6 of them must be of the same color, say,
green. Let N be the set of neighbors of v that are adjacent to v by a green edge. If any edge of G[N ],
the graph induced by N, is colored green, then we have a green triangle. If none of these edges is green,
then all the edges of G[N ] are colored red and blue. By the previous claim, this also generates a
monochromatic triangle and so we are done.
Let us mention that this result is sharp, namely, one can color the edges of K16 with three colors and
avoid a monochromatic triangle. Finally, let us mention that this is a specific case of the classic and
famous problem of Ramsey numbers (for three colors and triangles). Indeed, this observation can be
generalized to any number of colors and any order of a monochromatic complete graph.
Problem 4.1.3. Consider a group of people with the following property. Some of them know each other,
in which case the corresponding pair of people mutually like each other or dislike each other.
Moreover, there is a person who knows at least six other people. Interestingly, for each person the
number of people he or she likes is equal to the number of people he or she dislikes. Prove that it is
possible to remove some, but not all, like/dislike links such that it is still the case that each person has
the same number of liked and disliked acquaintances.
Solution. As usual, let us rephrase this problem in the language of graph theory. Note rst that
acquaintances can be modelled by a graph G: if v and w know each other, then we put an edge between
v and w; otherwise, v and w are not adjacent. Then, likes and dislikes can be represented by coloring
edges red and, respectively, blue. We assume that the maximum degree is at least 6. More importantly,
we assume that the following property holds: for each vertex v ∈ V (G), the number of red edges
incident to v is equal to the number of blue edges incident to v (in particular, it implies that each vertex
has even degree). Our goal is to show that it is possible to remove some edges (but not all of them) such
that this property is preserved.
It will be convenient to use the notion of a walk on a graph. A walk $W = (v_0, \ldots, v_k)$ of length k is a
sequence of vertices such that $v_{i-1} v_i \in E$ for any $i \in [k]$. Note that walks are allowed to revisit some
vertices and edges but they do not have to. As a result, a path is a walk but not every walk is a path.
In order to show the result, let us select any vertex v0 that has degree at least 6 and start walking
from there, first using a red edge and then alternating colors. Note that, because of the property of our
coloring, each time we enter some vertex $v \ne v_0$ we may continue using some other edge of the other
color. Hence, as the graph is finite, at some point we need to get back to v0; let W1 be the walk we did so far. If W1 has even
length, then we get the desired property after removing edges of this walk (note that each vertex on the
walk is incident to the same number of red and blue edges used in this walk). On the other hand, if the
length of W1 is odd, then the two edges used by W1 that are incident to v0 are red. We continue walking
from v0 starting from a blue edge and alternating colors. However, this time, we are not allowed to use
any edges of W1. As before, we are guaranteed that we will not get stuck and we need to get back to v0;
let W2 be the second walk. If W2 is even, then removing this walk gives us the desired property. If it is
odd, then removing both W1 and W2 does the trick. (Recall that W1 and W2 are edge disjoint.) Finally,
let us mention that the condition that the maximum degree is at least 6 is needed. One can easily
construct a counterexample when Δ(G) = 4.
Problem 4.2.1. Consider a square grid of size 25 × 25 that has a smaller square grid of size 5 × 5 cut
out from its bottom left corner. Can you cover the remaining cells with 100 blocks of size 1 × 6 or
2 × 3?

Solution. Label the grid so that the bottom left cell has label (1, 1) and the top right one has label
(25, 25). Put 1 in a cell with label (i, j) if i + j is divisible by 3; otherwise, put 0—see Figure 8.4.

Observe that each block (regardless of whether it is of size 1 × 6 or 2 × 3) covers precisely two 1’s. Hence,
100 such blocks cover exactly 200 ones, but the number of 1’s to cover is 199. To see this note that in each row we have
either 8 or 9 ones (before removing 5 × 5 square). We have exactly 17 rows with 8 ones and 8 rows
with 9 ones. It follows that the 25 × 25 grid contains 17 ⋅ 8 + 8 ⋅ 9 = 208 ones. After removing 9 of
them from the bottom left 5 × 5 square grid we are left with 199 ones. The conclusion is that no matter
how hard we try, we will not be able to cover the remaining cells with 100 blocks.

FIGURE 8.4: Illustration for Problem 4.2.1. We put ‘x’ in places where 1 should be placed. We also show ‘x’ in the 5 × 5 grid that was
removed.
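The counting in this solution is easy to double-check by machine. Here is a minimal Julia sketch (the variable names are ours) that labels the cells as in the solution and counts the 1’s in the full grid, in the removed corner, and in the remaining region.

```julia
# Cell (i, j) carries a 1 exactly when i + j is divisible by 3.
is_one(i, j) = (i + j) % 3 == 0

ones_full = count(is_one(i, j) for i in 1:25, j in 1:25)   # whole 25 × 25 grid
ones_corner = count(is_one(i, j) for i in 1:5, j in 1:5)   # removed 5 × 5 corner
ones_left = ones_full - ones_corner
cells_left = 25^2 - 5^2

println((ones_full, ones_corner, ones_left, cells_left))   # (208, 9, 199, 600)
```

The 600 remaining cells match the area of 100 blocks, so the area alone gives no obstruction; the obstruction is that 100 blocks would cover 200 ones while only 199 are available.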

Problem 4.2.2. Prove that it is impossible to cover a square grid of size 9 × 9 with tiles of size 1 × 5 or
1 × 6.

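Solution. Let us first observe that any potential covering would have to use at least 14 blocks (as
13 ⋅ 6 = 78 < 81 = 9 ⋅ 9). Hence, there must be at least 7 blocks that are vertical or at least 7 that are
horizontal. Without loss of generality, we may assume that there are at least 7 horizontal blocks.
Each horizontal block has length at least 5 and so it covers a cell of the middle column (column 5);
in particular, no row contains two horizontal blocks. At most 2 cells of the middle column are then left,
which is too few for a vertical block, so no vertical block can intersect this column. It follows that all
9 cells of the middle column are covered by horizontal blocks and there are exactly 9 horizontal blocks,
one in each row. Consider now the middle row (row 5). At least 5 of its cells are covered by the horizontal
block and every vertical block covers a cell of this row, so there are at most 4 vertical blocks.
But 4 + 9 = 13 < 14, a contradiction.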
Problem 4.2.3. Can you cover a square grid of size 10 × 10 with 25 “T-shaped” blocks consisting of 4
small squares?
Solution. As usual, label the grid so that the bottom left cell has label (1, 1) and the top right one has
label (10, 10). Put 1 in a cell with label (i, j) if i + j is even, and 0 otherwise. The number of 1’s
covered by each block is 1 or 3. Since there are 25 blocks they are going to cover an odd number of 1’s.
On the other hand, the 10 × 10 grid contains an even number of 1’s (precisely 50). Hence, our task is
not possible.
Problem 4.3.1. The class consists of 12 people. Count in how many ways one can divide them into 6
pairs, 4 triples, 3 quadruples, or 2 six-tuples. Which option yields the largest number of possibilities?
Solution. Suppose that we have n people and we want to divide them into k groups. Assume that k | n
so that there will be s = n/k people in each group. In order to generate a division, we can first assign
unique numbers from the set [n] to all the people. This can be done in n! ways. Now, people with
numbers from 1 to s form the first group, people with numbers from s + 1 to 2s form the second group,
and so on.
The problem is that a given division is generated multiple times. First of all, we do not care if
{1, 2, …, s} form the first or the second group. That means that we can rearrange the k groups in any
way we want. There are k! ways to do it. Moreover, in any particular group such as {1, 2, …, s}, it
does not matter who has 1 assigned and who has 2. This gives us another factor of s! per group.
Combining these observations together we get that there are $\frac{n!}{k!\,(s!)^k}$ such divisions. (To be slightly more
formal, one can construct a bijection between the family of unique divisions and the partition of the set
of permutations into sets of size $k!\,(s!)^k$.) Alternatively, one can count it as $\prod_{i=0}^{k-1} \binom{n-is}{s} / k!$, because
we can iteratively select s-element sets and then observe that each division is counted k! times.
In our particular situation, we have n = 12 people so we can divide them into pairs (k = 6) in
$\frac{12!}{6!\,2^6} = 10{,}395$ ways, into triples (k = 4) in $\frac{12!}{4!\,6^4} = 15{,}400$ ways, into quadruples (k = 3) in
$\frac{12!}{3!\,24^3} = 5{,}775$ ways, and into six-tuples (k = 2) in $\frac{12!}{2!\,720^2} = 462$ ways.
So triples give us the largest number of possibilities.
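Both counting formulas from this solution are easy to evaluate in Julia. The sketch below (the function names `divisions` and `divisions_iterative` are ours) computes the number of divisions of 12 people for each of the four group sizes in two ways.

```julia
# Number of ways to split n people into k unlabeled groups of equal size s = n/k,
# computed as n! / (k! (s!)^k).
function divisions(n, k)
    s = n ÷ k
    return factorial(n) ÷ (factorial(k) * factorial(s)^k)
end

# The alternative count: select the groups one by one, then forget their order.
function divisions_iterative(n, k)
    s = n ÷ k
    return prod(binomial(n - i * s, s) for i in 0:(k - 1)) ÷ factorial(k)
end

for k in (6, 4, 3, 2)
    println(k, " groups: ", divisions(12, k), " = ", divisions_iterative(12, k))
end
```

For larger n one would switch to `big(n)` to avoid integer overflow in the factorials; for n = 12 ordinary 64-bit integers suffice.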


Problem 4.3.2. Consider an n × n square grid on which we want to place k ≤ n chess rooks in such a
way that none of them attack another rook. Count the number of ways one can do it.
Solution. We first select rows for all rooks. Since no two rooks can be placed on the same row, there are
$\binom{n}{k}$ ways to do it. Then we place rooks, one by one, in the selected rows. Since no two rooks can be placed on
the same column, the ith rook can be placed in n − i + 1 ways. As a result, the number of ways we can
achieve our task is equal to
$$\binom{n}{k} \frac{n!}{(n-k)!} = \left( \frac{n!}{(n-k)!} \right)^2 / k! .$$
Another way to see it is to notice that each rook eliminates precisely one column and one row. Hence,
one can place rooks one by one and observe that there are $(n + 1 - i)^2$ spots available for placing the
i-th rook. Once we finish the process, there are k! duplicates because of k! possible orders of
placing rooks.
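For small boards the formula can be compared with a direct enumeration. The following Julia sketch is our own illustration (the names `count_placements` and `rooks_formula` are not from the text): it goes through the rows one by one and either leaves a row empty or puts a rook in a column that is still free.

```julia
# Brute-force count of placements of k non-attacking rooks on an n × n board.
# `used[col]` records which columns are already occupied.
function count_placements(n, k, row = 1, used = falses(n))
    k == 0 && return 1          # all rooks placed
    row > n && return 0         # ran out of rows
    total = count_placements(n, k, row + 1, used)   # leave this row empty
    for col in 1:n
        if !used[col]
            used[col] = true
            total += count_placements(n, k - 1, row + 1, used)
            used[col] = false
        end
    end
    return total
end

# Closed form from the solution: (n! / (n - k)!)^2 / k!.
rooks_formula(n, k) = (factorial(n) ÷ factorial(n - k))^2 ÷ factorial(k)

println(all(count_placements(n, k) == rooks_formula(n, k) for n in 1:5 for k in 0:n))
```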
Problem 4.3.3. Create all possible 4-digit numbers using digits from set [9] = {1, 2, 3, 4, 5, 6, 7, 8, 9}.
Find the sum of those numbers.
Solution. Let us first notice that there are $9^4$ numbers that satisfy our requirement. Then, notice that
number $c_1 c_2 c_3 c_4$ can be associated with number $d_1 d_2 d_3 d_4$, where $d_i = 10 - c_i$. As a result, we get a
bijection from the set of possible numbers to itself. Additionally, the sum of $c_1 c_2 c_3 c_4$ and $d_1 d_2 d_3 d_4$ is
equal to 11,110, independently of the number used. Therefore, the sum of all numbers is equal to
$9^4 \cdot 11{,}110/2 = 36{,}446{,}355$. (We had to divide the value by 2 because each number was counted
twice.)
It is easy to verify that our result is correct using the following one-line Julia code: sum(x for x in
1000:9999 if !(0 in digits(x))).
Problem 4.3.4. Alice has 20 balls, all different. She first splits them into two piles and then she picks
one of the piles with at least two balls, and splits it into two. She repeats this until each pile has only
one ball. Find the number of ways in which she can carry out this procedure.
Solution. The number of ways this splitting procedure can be carried out is the same as the number of
ways to do it backward; that is, Alice can start with 20 piles, each of them containing only one ball, and
then keep merging piles together. Indeed, to see this let us note that any sequence of splits of one set of
20 balls that results in 20 sets can be uniquely reversed to get a sequence of merges from 20 sets to one
set. In other words, there is a bijection between the two sets corresponding to these two operations and
so it does not matter which one we concentrate on. Working backward is slightly easier. In the i-th
move, Alice has 21 − i sets to choose from so she can do $\binom{21-i}{2} = (21 - i)(20 - i)/2$ different
merges. As she does 19 moves in total, we get that the number of ways is equal to
$$\prod_{i=1}^{19} \frac{(21 - i)(20 - i)}{2} = 20! \cdot 19!/2^{19} .$$
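The product and its closed form can be compared numerically. Below is a small Julia sketch with arbitrary-precision integers; the generalization from 20 balls to n balls and the function names are our own additions.

```julia
# Number of merge sequences turning n singletons into one pile:
# in the i-th move there are n + 1 - i piles, so binomial(n + 1 - i, 2) choices.
ways_product(n) = prod(big(n + 1 - i) * (n - i) ÷ 2 for i in 1:(n - 1))

# Closed form n! (n - 1)! / 2^(n - 1); for n = 20 this is 20! * 19! / 2^19.
ways_closed(n) = factorial(big(n)) * factorial(big(n - 1)) ÷ big(2)^(n - 1)

println(ways_product(20) == ways_closed(20))
println(ways_product(3))   # 3 balls: 3 ways to choose the first split, then 1 way
```

With 3 balls the answer 3 can be verified by hand, which is a useful sanity check of the formula.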

Problem 4.4.1. There is a club with 100 members where there are 1, 000 pairs of friends. We want to
pick a three person team from the club with one team member selected as a team leader. The procedure
is that one club member first becomes a leader. The leader then chooses two followers from his/her
friends and the team is formed. Show that it is possible to pick a team from the club in at least 19, 000
ways.
Solution. Suppose that the ith club member has $d_i$ friends. Since there are 1,000 pairs of friends in the club, we
get that $\sum_{i=1}^{100} d_i = 2{,}000$. If we choose the ith member as a leader, he/she can form
$\binom{d_i}{2} = d_i (d_i - 1)/2$ unique teams. After taking the sum over all club members we get that the number of
possible teams is equal to
$$\sum_{i=1}^{100} \binom{d_i}{2} = \frac{1}{2} \sum_{i=1}^{100} (d_i^2 - d_i) = \frac{1}{2} \sum_{i=1}^{100} d_i^2 - \frac{1}{2} \sum_{i=1}^{100} d_i = \frac{1}{2} \sum_{i=1}^{100} d_i^2 - 1{,}000 .$$
Now we see from Jensen’s inequality applied to $f(x) = x^2$ (see Section 1.1) that
$$\sum_{i=1}^{100} d_i^2 = 100 \sum_{i=1}^{100} \frac{d_i^2}{100} \ge 100 \left( \sum_{i=1}^{100} d_i/100 \right)^2 = 40{,}000 .$$
Hence, the number of possible teams is at least 40,000/2 − 1,000 = 19,000, as required.

Let us note that this problem can be reformulated in the language of graph theory. One can consider
a “friendship graph” consisting of 100 vertices corresponding to the club members and edges that
represent friendship relationships. Our goal is to show that any graph on 100 vertices with the average
degree 20 has at least 19, 000 paths of length 2. Indeed, each path abc of length 2 corresponds to a
team with b being the leader of the team.
Let us also note that the lower bound we just proved is best possible as it is achieved when every
member of the club has precisely 20 friends. The corresponding arrangement exists and an underlying
graph is called a 20-regular graph. In order to see one possible example, imagine all members of the
club sitting in a circle. Each member is a friend with 10 people to the left and with 10 people to the
right.
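The circular arrangement described above can be built explicitly. The following Julia sketch (the function name `circle_club` and its parameters are ours) constructs the friendship graph and counts the pairs of friends and the possible teams.

```julia
# Members sit in a circle; everybody is a friend of the `reach` nearest
# people on each side. Returns the symmetric adjacency matrix.
function circle_club(n = 100, reach = 10)
    adj = falses(n, n)
    for v in 0:(n - 1), s in 1:reach
        w = mod(v + s, n)
        adj[v + 1, w + 1] = true
        adj[w + 1, v + 1] = true
    end
    return adj
end

adj = circle_club()
degrees = vec(sum(adj, dims = 2))                 # number of friends of each member
pairs_of_friends = sum(degrees) ÷ 2               # every pair is counted twice
teams = sum(binomial(d, 2) for d in degrees)      # leader plus two of his/her friends

println((pairs_of_friends, teams))                # (1000, 19000)
```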
Problem 4.4.2. Consider the following combinatorial game between two players, Builder and Painter.
The game starts with the empty graph on 400 vertices. In each round, Builder presents an edge uv
between two non-adjacent vertices u and v which has to be immediately colored red or blue by Painter.
Show that Builder can force Painter to create a monochromatic (that is, either red or blue) path on 100
vertices in 400 rounds.
Solution. Let rt and bt be the number of vertices in a longest red and, respectively, blue path after t
rounds of the game. Clearly, both rt and bt are nondecreasing functions of t. We will show that Builder
has a strategy that in two rounds increases the sum of rt and bt by 1; that is, for each t ∈ N,
$r_{2t} + b_{2t} \ge t$. In particular, it will follow that $r_{400} + b_{400} \ge 200$ and so $\max\{r_{400}, b_{400}\} \ge 100$, as
required.
In order to see this, we will prove a slightly stronger claim and insist that the two paths (red and blue)
are vertex disjoint, that is, have no common vertices. Moreover, we will require that there are two
endpoints, one in each path, that are not adjacent. At time 0, we initiate the process by picking two
different vertices. The desired property as well as the desired lower bound trivially hold:
$r_0 + b_0 = 2 \ge 0$.
Suppose now that at time 2t we have two disjoint paths, a red path $R_t = (v_1, \ldots, v_{r_t})$ and a blue
path $B_t = (w_1, \ldots, w_{b_t})$. Moreover, the desired property (there is no edge between $v_{r_t}$ and $w_{b_t}$) and the
desired condition ($r_{2t} + b_{2t} \ge t$) are met. Builder presents edge $v_{r_t} w_{b_t}$. Without loss of generality, we
may assume that Painter paints it red. Builder now presents an edge from $w_{b_t}$ to a new vertex v. If
Painter paints it blue then we discard the edge $v_{r_t} w_{b_t}$ but keep $w_{b_t} v$ to extend the blue path. We get
$$r_{2t+2} + b_{2t+2} = r_{2t} + (b_{2t} + 1) = r_{2t} + b_{2t} + 1,$$
the desired lower bound holds, the two paths are disjoint, and the corresponding endpoints are not
adjacent. Suppose then that Painter paints it red. This time, we extend the red path by $w_{b_t}$ and v, and
discard vertex $w_{b_t}$ from the blue path, making it shorter. If, as a result, the blue path becomes empty, we choose any unused vertex as
initialization of the blue path. We get that
$$r_{2t+2} + b_{2t+2} = (r_{2t} + 2) + (b_{2t} - 1) = r_{2t} + b_{2t} + 1,$$
and the desired bound holds too.
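Builder's strategy is simple enough to simulate. The Julia sketch below is our own illustration, not the authors' code: Painter is modelled as a zero-argument function returning `:red` or `:blue`, all names (`play`, `valid`, `fresh`) are hypothetical, and the simulation stops as soon as one path reaches the target number of vertices. An assertion checks that Builder only ever joins non-adjacent vertices.

```julia
# Simulate Builder's strategy against a given Painter.
function play(painter; target = 100)
    paths = Dict(:red => [1], :blue => [2])      # two disjoint one-vertex paths
    edges = Dict{Tuple{Int,Int},Symbol}()        # every presented edge and its color
    fresh = 3                                    # smallest vertex not used so far
    rounds = 0
    function present(u, w)
        @assert !haskey(edges, minmax(u, w))     # only non-adjacent vertices may be joined
        rounds += 1
        return edges[minmax(u, w)] = painter()
    end
    while max(length(paths[:red]), length(paths[:blue])) < target
        c = present(paths[:red][end], paths[:blue][end])   # join the two endpoints
        o = c == :red ? :blue : :red                        # the other color
        w = paths[o][end]
        v = fresh
        fresh += 1
        if present(w, v) == o
            push!(paths[o], v)                   # extend the path of color o by v
        else
            pop!(paths[o])                       # w leaves the path of color o ...
            push!(paths[c], w, v)                # ... and extends the path of color c
            if isempty(paths[o])
                push!(paths[o], fresh)           # restart with an unused vertex
                fresh += 1
            end
        end
    end
    return rounds, paths, edges
end

# Consecutive vertices of a path must be joined by edges of its color.
valid(path, color, edges) =
    all(edges[minmax(path[i], path[i + 1])] == color for i in 1:(length(path) - 1))

rounds, paths, edges = play(() -> rand((:red, :blue)))
println(rounds <= 400)
```

Any other Painter, for example `() -> :red`, can be plugged in the same way; the bound from the solution guarantees that the loop ends within 400 rounds.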


Problem 4.4.3. Consider a chess club consisting of $4^t$ members for some t ∈ N; some of the members
know each other. Show that there exist t members that all know each other, or there exist t members
such that no two of them know each other.
Solution. Let us first reformulate the problem in the language of graph theory. Friendships between
members of the club can be represented by coloring the edges of the complete graph on $4^t = 2^{2t}$
vertices using two colors, say, red and blue. Edge uv is colored red if the members corresponding to u
and v know each other; otherwise, uv is colored blue. Our goal is to show that no matter how the edges
of the complete graph on $2^{2t}$ vertices are colored, there exists a set of t vertices that induces a
monochromatic graph; that is, all edges of this induced graph are red or all of them are blue.
Start the process by selecting an arbitrary vertex v. Note that v has an odd number of neighbors,
namely $2^{2t} - 1$. As a result, either at least $2^{2t}/2 = 2^{2t-1}$ of them are adjacent to v via a red edge or at
least $2^{2t-1}$ of them are adjacent to v via a blue edge. If v is incident to more red edges than blue ones,
then we assign label R to v, remove v and all of its neighbors that are adjacent to v via blue edges. For
simplicity, if needed, we additionally and arbitrarily remove some vertices to keep the number of them
to be exactly $2^{2t-1}$. On the other hand, if the majority of edges incident to v are blue, then v gets label B
assigned. This time, we remove v and its neighbors that are adjacent to v via red edges, and remove
some additional vertices so that the number of vertices left is $2^{2t-1}$.

We repeat the process on the remaining subset of vertices until we exhaust all of them. Since the
number of vertices decreases by a factor of 2 each time, the process lasts 2t rounds. The last round,
round 2t, is slightly different as there is only one vertex left. The last vertex can get any label assigned,
say, B. It follows that there are at least t vertices with label R assigned or at least t vertices with label B.
Due to symmetry, we may assume without loss of generality that at least t vertices have label R
assigned. It is straightforward to see that all edges in the complete graph induced by these vertices are
red. The desired property is satisfied.
Finally, let us mention that this is a famous and extremely difficult problem. The
corresponding numbers that we tried to bound in this problem are called the Ramsey numbers. In fact,
with slightly more work, one can replace $4^t$ by $\binom{2t}{t} \le 4^t/\sqrt{2t}$. However, perhaps surprisingly, it is not
known if it can be replaced by $(4 - \epsilon)^t$ for some ϵ > 0.
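The labelling procedure from the solution to Problem 4.4.3 translates directly into code. The Julia sketch below is a slight variant of it, stated here as an assumption: instead of trimming the surviving set to exactly half the size, it keeps the whole majority side, which can only produce more labelled vertices. The function names and the random test coloring are ours.

```julia
using Random

# color[u, v] == true means that edge uv is red; false means blue.
function monochromatic_set(color)
    alive = collect(1:size(color, 1))
    labeled = Tuple{Int,Bool}[]           # (vertex, label); true = R, false = B
    while !isempty(alive)
        v = popfirst!(alive)
        red = [u for u in alive if color[v, u]]
        blue = [u for u in alive if !color[v, u]]
        if length(red) >= length(blue)
            push!(labeled, (v, true))     # keep only the red neighbors of v
            alive = red
        else
            push!(labeled, (v, false))    # keep only the blue neighbors of v
            alive = blue
        end
    end
    rs = [v for (v, is_red) in labeled if is_red]
    bs = [v for (v, is_red) in labeled if !is_red]
    return length(rs) >= length(bs) ? (true, rs) : (false, bs)
end

# A random symmetric red/blue coloring of the complete graph on n vertices.
function random_coloring(n; rng = MersenneTwister(1))
    color = falses(n, n)
    for u in 1:n, v in (u + 1):n
        color[u, v] = color[v, u] = rand(rng, Bool)
    end
    return color
end

t = 3
color = random_coloring(4^t)
is_red, S = monochromatic_set(color)
println((is_red ? "red" : "blue", S))
```

Every vertex labelled R is joined by red edges to all vertices that survive after it, which is why the returned set is monochromatic.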

Problem 4.5.1. Let k ∈ N and $N = N(k) := \lfloor 2^{k/2} \rfloor$. Show that it is possible to partition set
X := [N ] = {1, 2, …, N } into two subsets A and B such that neither A nor B contains an arithmetic
progression of length k.
Solution. Let us first make an obvious observation. If $a_1, a_2, \ldots, a_k$ is an arithmetic progression, then so
is $a_k, a_{k-1}, \ldots, a_1$. Hence, without loss of generality, we may restrict ourselves to increasing
progressions.
Since the first two terms of an increasing arithmetic progression uniquely determine it, the number of
increasing arithmetic progressions of length k in X is at most
$$\binom{N}{2} = \frac{N(N-1)}{2} < N^2/2 \le 2^{k-1} .$$
Consider then a random partition of X into two subsets A and B; that is, each element of X is
independently put into A with probability 1/2 and into B otherwise. Clearly, the probability that a given k-element sequence
is in A is equal to $(1/2)^k$ and the same holds for B. It follows that the expected number of arithmetic
sequences of length k entirely contained in one of the two sets is less than $2 \cdot 2^{k-1} \cdot (1/2)^k = 1$. By the
probabilistic method, we get that the desired partition exists.


Problem 4.5.2. Show that for any n ∈ N, there is a tournament in which n basketball teams
participate and for which there are at least $k = n!/2^{n-1}$ orderings $t_1, \ldots, t_n$ such that team $t_i$ won
against team $t_{i+1}$, for all i ∈ [n − 1].
Solution. We will compute the expected number of desired orderings in a random tournament where for
each pair A, B of teams, team A wins against team B with probability 1/2, independently from all
other games. Let us fix any of the n! permutations of teams: $t_1, \ldots, t_n$. The probability that it satisfies
the desired property is equal to $(1/2)^{n-1}$. Hence, the expected number of desired orderings is equal to
$n!/2^{n-1}$ and, by the probabilistic method, there must exist a tournament for which there are at least
$n!/2^{n-1}$ such orderings. (Surprisingly, this trivial argument gives a result that is almost best
possible. It is known that in any tournament, the number of such orderings is $O(n^{3/2} n!/2^{n-1})$.)

Problem 4.5.3. Consider a graph with T triangles. Show that it is possible to color the edges of this
graph with two colors so that the number of monochromatic triangles is at most T /4.
Solution. Let us color the edges of this graph at random, uniformly and independently. The probability
that a given triangle is monochromatic is equal to $2 \cdot (1/2)^3 = 1/4$. Therefore, the expected number of
monochromatic triangles is T /4. It follows that there must exist a coloring for which the number of
monochromatic triangles is less than or equal to T /4. (Moreover, if T is not divisible by 4, then a strict
inequality holds.)
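For a small graph the averaging argument can be checked exhaustively. As an illustration we take the complete graph on 5 vertices (our choice, with T = 10 triangles) and enumerate all $2^{10}$ edge colorings in Julia; the function name is ours.

```julia
# For every red/blue coloring of the edges of K_n (encoded as a bit mask),
# count the monochromatic triangles.
function mono_triangle_counts(n)
    pairs = [(a, b) for a in 1:n for b in (a + 1):n]
    bit = Dict(p => i - 1 for (i, p) in enumerate(pairs))
    triangles = [(a, b, c) for a in 1:n for b in (a + 1):n for c in (b + 1):n]
    color(mask, a, b) = (mask >> bit[(a, b)]) & 1
    return [count(color(m, a, b) == color(m, a, c) == color(m, b, c)
                  for (a, b, c) in triangles)
            for m in 0:(2^length(pairs) - 1)]
end

counts = mono_triangle_counts(5)
T = binomial(5, 3)

println(sum(counts) // length(counts))   # average number of monochromatic triangles
println(minimum(counts))                 # best coloring found
```

The average equals T/4 exactly, and the minimum is 0 here: coloring a 5-cycle red and the complementary 5-cycle blue leaves no monochromatic triangle.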
Problem 4.5.4. There are 100 people invited to the party; 450 pairs of people know each other. Show
that it is possible to select 10 people so that no two of them know each other.
Solution. Since the acquaintances between people invited to the party can be represented as a graph,
our problem can be reformulated in the language of graph theory. Our goal is to show that any graph G
on n = 100 vertices and m = 450 edges has an independent set of size 10.
For a given permutation π of the vertices, we put vertex v into set S = S(π) if no neighbor of v
follows it in the permutation. Clearly, S forms an independent set. Let π now be a random permutation
of the vertices of G taken with uniform distribution; that is, each permutation occurs with probability
1/n!. For a given vertex v ∈ V , let $d^+(v)$ be the number of neighbors of v that follow it in the
permutation. The random variable $d^+(v)$ attains each of the values 0, 1, …, deg(v) with probability
1/(deg(v) + 1). Indeed, this follows from the fact that the random permutation π induces a uniform
random permutation on the set of deg(v) + 1 vertices consisting of v and its neighbors (to see this one
can fix the positions of non-neighbors of v first, and then fixing one of deg(v) + 1 free positions for the
vertex v will yield desired values of $d^+(v)$ with uniform distribution). Therefore the expected number
of vertices with $d^+(v) = 0$ is equal to $C := \sum_{v \in V} \frac{1}{\deg(v)+1}$. This implies that there exists a specific
permutation with at least C vertices of this type, which form an independent set.
The last part is an optimization problem. Notice that the average degree is equal to
$d := \sum_{v \in V} \deg(v)/n = 2m/n = 2 \cdot 450/100 = 9$. It follows from Jensen’s inequality (see Section 1.1)
applied to function 1/t that
$$C = \sum_{v \in V} \frac{1}{\deg(v)+1} = n \cdot \frac{\sum_{v \in V} \frac{1}{\deg(v)+1}}{n} \ge n \cdot \frac{1}{\frac{\sum_{v \in V} (\deg(v)+1)}{n}} = \frac{n}{\frac{nd+n}{n}} = \frac{n}{d+1} = \frac{100}{9+1} = 10 .$$
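The permutation argument also suggests a simple randomized search. The Julia sketch below is only an illustration under our own assumptions: the 9-regular example graph (everyone knows the 4 nearest people on each side of a circle and the person sitting opposite), the function names, and the fixed random seed are all hypothetical choices, not taken from the text.

```julia
using Random

# A hypothetical party: 100 people, 450 pairs of acquaintances, every degree equal to 9.
function party_graph(n = 100)
    nbrs = [Int[] for _ in 1:n]
    for v in 0:(n - 1), s in (1, 2, 3, 4, n ÷ 2)
        w = mod(v + s, n)
        if !(w + 1 in nbrs[v + 1])        # the opposite person is met twice; add once
            push!(nbrs[v + 1], w + 1)
            push!(nbrs[w + 1], v + 1)
        end
    end
    return nbrs
end

# S(π): the vertices none of whose neighbors follow them in the permutation π.
function last_among_neighbors(nbrs, perm)
    pos = invperm(perm)                   # pos[v] = position of v in the permutation
    return [v for v in eachindex(nbrs) if all(pos[u] < pos[v] for u in nbrs[v])]
end

# Try random permutations until S(π) has at least `goal` vertices.
function find_independent_set(nbrs; goal = 10, tries = 100_000, rng = MersenneTwister(2024))
    for _ in 1:tries
        S = last_among_neighbors(nbrs, randperm(rng, length(nbrs)))
        length(S) >= goal && return S
    end
    error("no suitable permutation found within the given number of tries")
end

nbrs = party_graph()
S = find_independent_set(nbrs)
println(S)
```

Since the expected size of S(π) is at least 10, a permutation with |S(π)| ≥ 10 must exist, and in practice a random search tends to find one quickly.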

Problem 4.6.1. Consider an urn that initially contains one white and one black ball. We repeatedly
perform the following process. In a given round, one ball is drawn randomly from the urn and its color
is observed. The ball is then returned to the urn, and an additional ball of the same color is added to the
urn. We repeat this selection process for 50 rounds so that the urn contains 52 balls. What number of
white balls is the most probable?
Solution. Let $p_{k,n}$ be the probability of seeing exactly k white balls in an urn having n balls in total.
One could write down the recursion for $p_{k,n}$ and then solve it but it seems that it is easier to perform
calculations for the first few rounds to make a natural conjecture that can be then proved by induction.
During the first round, we select a white ball with probability 1/2 so we end up with 2 white balls with
probability 1/2 and, otherwise, we stay with 1 white ball. It follows that $p_{1,3} = p_{2,3} = 1/2$. In order to
see 1 white ball after two rounds, we have to choose black balls during the two rounds and so
$p_{1,4} = (1/2) \cdot (2/3) = 1/3$. Similarly, to see 1 black ball, we have to select white balls twice and so
$p_{3,4} = 1/3$ as well. It follows that $p_{2,4} = 1 - p_{1,4} - p_{3,4} = 1/3$. The pattern occurs naturally and we
conjecture that for any n ≥ 2 and any 1 ≤ k ≤ n − 1, $p_{k,n} = 1/(n - 1)$. We also see that
$p_{0,n} = p_{n,n} = 0$, as there is always at least one black and one white ball in the urn.
We prove the claim by induction. The base case (n = 2) is trivial: $p_{1,2} = 1$ (in fact, we already
checked it for n = 3 and n = 4). For the inductive step, suppose that $p_{k,n} = 1/(n - 1)$ for some
n ≥ 2 and all 1 ≤ k ≤ n − 1. Fix k such that 1 ≤ k ≤ n. Our goal is to show that $p_{k,n+1} = 1/n$. Note
that in order to see k white balls at the end of some round we need to have k white balls in the previous
round and select a black ball, or have k − 1 white balls and select a white one. We get that if k > 1,
then
$$p_{k,n+1} = \frac{n-k}{n} \, p_{k,n} + \frac{k-1}{n} \, p_{k-1,n} = \frac{n-k}{n} \cdot \frac{1}{n-1} + \frac{k-1}{n} \cdot \frac{1}{n-1} = \frac{n-1}{n} \cdot \frac{1}{n-1} = \frac{1}{n} ,$$
while for k = 1 we have
$$p_{1,n+1} = \frac{n-1}{n} \, p_{1,n} + \frac{0}{n} \, p_{0,n} = \frac{n-1}{n} \cdot \frac{1}{n-1} = \frac{1}{n} .$$
The inductive step holds and the proof is finished. In particular, after 50 rounds the urn contains n = 52
balls and each number of white balls from 1 to 51 occurs with the same probability 1/51, so no number
of white balls is more probable than the others.
Finally, let us mention that this problem is an easy, specific case of the famous Pólya urn model. This
endows the urn with a self-reinforcing property sometimes expressed as the rich get richer. In some
sense, the Pólya urn model is the “opposite” of the model of sampling without replacement, where
every time a particular value is observed, it is less likely to be observed again, whereas in the Pólya urn
model, an observed value is more likely to be observed again.
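The formula $p_{k,n} = 1/(n-1)$ can be confirmed with exact rational arithmetic by running the one-step recursion for the urn. The following Julia sketch is ours (including the function name `urn_distribution`).

```julia
# Exact distribution of the number of white balls after `rounds` draws,
# starting from one white and one black ball (n = 2 balls in total).
function urn_distribution(rounds)
    p = Dict(1 => 1 // 1)          # number of white balls => probability
    n = 2                          # current number of balls in the urn
    for _ in 1:rounds
        q = Dict{Int,Rational{Int}}()
        for (k, pk) in p
            # a white ball is drawn with probability k/n, a black one otherwise
            q[k + 1] = get(q, k + 1, 0 // 1) + pk * (k // n)
            q[k] = get(q, k, 0 // 1) + pk * ((n - k) // n)
        end
        p = q
        n += 1
    end
    return p
end

dist = urn_distribution(50)
println(all(dist[k] == 1 // 51 for k in 1:51))
```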
Problem 4.6.2. There are 65 participants competing in a ski jumping tournament. They take turns and
perform their jumps in a given sequence. We assume that no two jumpers obtain the same result and
that each final resulting order of participants is equally probable. At each given round of the
tournament, the person that has obtained the best result thus far is called a leader. Prove that the
probability that the leader changed exactly once during the whole tournament is greater than 1/16.
Solution. Let π : [n] → [n] be the final order/permutation of jumpers. In particular, π(1) is the winner
of the tournament. Our assumption is that π is a random permutation; that is, for a given permutation $\pi_0$
of [n], we have that $\pi = \pi_0$ with probability 1/n!. Let $p_{i,n}$ be the probability that the leader changed
exactly i times during the tournament of n participants. Our goal is to show that $p_{1,65} > 1/16$.
Let $q_n(k)$ be the probability that the kth participant won the tournament of n ski jumpers and that the
leader changed exactly once during the whole event. If the first participant wins, then he is the leader
from the very beginning and no change in the leadership occurs. It follows that $q_n(1) = 0$ and so
$$p_{1,n} = \sum_{k=1}^{n} q_n(k) = \sum_{k=2}^{n} q_n(k) .$$

There are many ways to generate random permutations of [n]. The following one will be very
convenient to compute $q_n(k)$. We start with 1 and then place 2 before 1 with probability 1/2;
otherwise, it will be placed after 1. After that we place 3 in a random place and move on to 4. Formally,
given a partial (random) permutation of elements 1, …, k − 1 (for some integer k ≥ 2), we place k
uniformly at random in one of the k possible places. This point of view has an important implication
for our problem. We immediately get that the kth participant becomes a leader (at least till the next
participant jumps) with probability 1/k. It follows that, for k ≥ 2,
$$q_n(k) = \frac{1}{2} \cdot \frac{2}{3} \cdots \frac{k-2}{k-1} \cdot \frac{1}{k} \cdot \frac{k}{k+1} \cdot \frac{k+1}{k+2} \cdots \frac{n-1}{n} = \frac{1}{n(k-1)} .$$
As a result,
$$p_{1,n} = \sum_{k=2}^{n} \frac{1}{n(k-1)} = \frac{1}{n} \sum_{k=2}^{n} \frac{1}{k-1} = \frac{1}{n} H_{n-1} ,$$
where $H_{n-1}$ is the (n − 1)-st harmonic number.
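One way to prove the desired lower bound for $p_{1,n}$ is to compare the harmonic series with another
divergent series where each denominator is replaced with the next largest power of two:
$$\begin{aligned}
H_{2^i} &= \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \ldots + \frac{1}{2^i} \\
&> \frac{1}{1} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \ldots + \frac{1}{2^i} \\
&= 1 + \frac{1}{2} + 2 \cdot \frac{1}{4} + 4 \cdot \frac{1}{8} + \ldots + 2^{i-1} \cdot \frac{1}{2^i} = 1 + \frac{i}{2} .
\end{aligned} \tag{8.11}$$
It follows that
$$p_{1,65} = \frac{1}{65} H_{2^6} > \frac{1}{65} \left( 1 + \frac{6}{2} \right) = \frac{4}{65} ,$$
which is very close to the desired bound of 1/16 but, unfortunately, slightly smaller than that.
Fortunately, it is easy to improve the bound (8.11) for $H_{2^i}$ to, for example, $H_{2^i} > 1 + i/2 + 1/3 - 1/4$
(for i ≥ 3, keep the term 1/3 instead of replacing it with 1/4).
This time we get the desired bound:
$$p_{1,65} = \frac{1}{65} H_{2^6} > \frac{1}{65} \left( 1 + \frac{6}{2} + \frac{1}{3} - \frac{1}{4} \right) = \frac{49}{780} > 0.0628 > \frac{1}{16} .$$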
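The inequality $p_{1,65} = H_{64}/65 > 1/16$ can also be confirmed directly with exact rational arithmetic. A minimal Julia sketch (the helper `H` is ours):

```julia
# Exact harmonic number H_n as a Rational{BigInt}.
H(n) = sum(1 // big(i) for i in 1:n)

p1 = H(64) // 65                       # p_{1,65} = H_64 / 65

println(p1 > 1 // 16)                  # the desired bound
println(H(64) > 1 + 6 // 2 + 1 // 3 - 1 // 4)   # the improved lower bound on H_{2^6}
println(Float64(p1))
```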

We are done with this problem but let us make two additional comments. Let us first note that
another way to lower bound $H_n$ is to do the following. We know from Section 1.5 that for any i ∈ N,
$$\left( \frac{i+1}{i} \right)^i = \left( 1 + \frac{1}{i} \right)^i < e .$$
After taking the natural logarithm of both sides of this inequality and dividing by i, we get
$$\ln(i+1) - \ln(i) < \frac{1}{i}$$
and so
$$H_n = \sum_{i=1}^{n} \frac{1}{i} > \sum_{i=1}^{n} (\ln(i+1) - \ln(i)) = \ln(n+1) - \ln(1) = \ln(n+1) .$$
This bound is quite good as one can show that $H_n < \ln(n) + 1$ (see solution to Problem 2.6.1) and
asymptotically $H_n = \ln(n) + \gamma + o(1)$, where γ ≈ 0.577216 is the Euler-Mascheroni constant. It
follows that
$$p_{1,65} = \frac{1}{65} H_{64} > \frac{\ln(65)}{65} > 0.0642 .$$
Let us also comment on $p_{i,n}$ for some values of i ≠ 1. Clearly, $p_{0,n} = (n-1)!/n! = 1/n$ as there
are (n − 1)! permutations that correspond to situations when the first participant is the winner. On the
other hand, for i ∈ N such that 2 ≤ i ≤ n − 1, we can get a recursive formula by independently
considering cases when the last change of the leader occurred at round k + 1. We get
$$p_{i,n} = \sum_{k=i}^{n-1} p_{i-1,k} \cdot \frac{1}{k+1} \cdot \frac{k+1}{k+2} \cdot \frac{k+2}{k+3} \cdots \frac{n-1}{n} = \frac{1}{n} \sum_{k=i}^{n-1} p_{i-1,k} .$$
Using this recursion, with computer support, we can easily find $p_{i,n}$ for some given parameters. Here is
a simple program written in Julia that does this.

function probs65()
    # cache of the values p_{j,k} computed so far
    probs = Dict{Tuple{Int, Int}, Float64}()

    # probability of exactly j leader changes in a tournament of k jumpers
    function prob(j,k)
        if !haskey(probs, (j,k))
            if j == 0
                probs[(j,k)] = 1/k
            else
                probs[(j,k)] = 1/k*sum(prob(j-1,i) for i in j:(k-1))
            end
        end
        return probs[(j,k)]
    end
    [prob(j, 65) for j in 0:64]
end
You can run it by writing probs65() in the Julia session.
The first few probabilities are $p_{0,65} \approx 0.0154$, $p_{1,65} \approx 0.073$, $p_{2,65} \approx 0.1606$, $p_{3,65} \approx 0.2204$,
$p_{4,65} \approx 0.2138$, $p_{5,65} \approx 0.157$, and $p_{6,65} \approx 0.0913$. It follows that the most probable case is to see 3
changes of the leader in the tournament, and 4 changes are only slightly less probable.
If one is unsure about our computations, it is relatively easy to check them using a computer
simulation. Here is another Julia program that simulates the tournament.
function sim_tournament()
    # the jump length of first jumper drawn uniformly from [0,1) interval
    best_length = rand()
    best_changes = 0
    # simulate jumps of consecutive jumpers
    # and count the number of leader changes
    for i in 2:65
        jump_length = rand()
        if jump_length > best_length
            best_length = jump_length
            best_changes += 1
        end
    end
    return best_changes
end

function run_simulation()
    # simprobs65 will hold the counts of observed tournament results
    simprobs65 = zeros(65)
    sim_runs = 10_000_000
    for i in 1:sim_runs
        # we have to add 1 to sim_tournament() result as
        # in Julia arrays are 1-based and
        # a minimal number of leader changes in the tournament is 0
        simprobs65[sim_tournament() + 1] += 1
    end
    simprobs65 / sim_runs
end

simprobs65 = run_simulation()
The first few probabilities estimated by the simulation are $p_{0,65} \approx 0.0155$, $p_{1,65} \approx 0.0729$,
$p_{2,65} \approx 0.1604$, $p_{3,65} \approx 0.2207$, $p_{4,65} \approx 0.2138$, $p_{5,65} \approx 0.1569$, and $p_{6,65} \approx 0.0914$ (you might get
slightly different results because this time we use a random simulation to generate the outputs). They are
close to the exact values calculated earlier and so we are quite confident that no mistake was made.
Problem 4.6.3. Three random events satisfy the following three conditions: (a) their probabilities are all
equal to p for some p ∈ [0, 1], (b) they are pairwise independent, and (c) all of them cannot happen at
the same time. What is the maximum value that p may take?
Solution. Denote the events by $A_i$ (for i ∈ [3]). It follows from condition (a) that there exists
p ∈ [0, 1] such that $p = P(A_i)$ for all i. By condition (b) we get that $P(A_i \cap A_j) = p^2$ for i ≠ j.
Finally, condition (c) implies that $P(A_1 \cap A_2 \cap A_3) = 0$. It follows immediately from the inclusion–
exclusion principle (4.9) that
$$\begin{aligned}
P(A_1 \cup A_2 \cup A_3) &= P(A_1) + P(A_2) + P(A_3) \\
&\quad - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) \\
&\quad + P(A_1 \cap A_2 \cap A_3) \\
&= 3p - 3p^2 + 0 .
\end{aligned}$$
On the other hand, clearly $(A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3) \subseteq A_1 \cup A_2 \cup A_3$. Using the
inclusion–exclusion principle the same way as before, we get that
$P((A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)) = 3p^2$. It follows that
$$3p^2 = P((A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)) \le P(A_1 \cup A_2 \cup A_3) = 3p - 3p^2 ,$$
and so 3p(1 − 2p) ≥ 0. This implies that p ≤ 1/2.


It remains to show that p = 1/2 is achievable, that is, that one can design an experiment and
three events $A_1$, $A_2$, and $A_3$ such that $P(A_i) = 1/2$ for all i, and conditions (b) and (c) are met.
Suppose that we roll an 8-sided fair die. A1 represents the event that an even number is rolled, A2
represents the event that the number rolled is less than or equal to 4, and A3 represents the event that a
number from the set {1, 3, 6, 8} is rolled. It is straightforward to check that all conditions are met.
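The three conditions for the die example can be checked mechanically. Here is a short Julia sketch (the names `A`, `P`, and `outcomes` are ours) that represents the events as sets of outcomes of a fair 8-sided die.

```julia
outcomes = 1:8
A = [Set(2:2:8),            # A1: an even number is rolled
     Set(1:4),              # A2: the number rolled is at most 4
     Set([1, 3, 6, 8])]     # A3: a number from {1, 3, 6, 8} is rolled

# Probability of an event on a fair die, as an exact fraction.
P(E) = length(E) // length(outcomes)

equal_probs = all(P(A[i]) == 1 // 2 for i in 1:3)                     # condition (a)
pairwise_indep = all(P(intersect(A[i], A[j])) == P(A[i]) * P(A[j])
                     for i in 1:3 for j in (i + 1):3)                 # condition (b)
never_all = isempty(intersect(A[1], A[2], A[3]))                      # condition (c)

println((equal_probs, pairwise_indep, never_all))   # (true, true, true)
```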
Problem 4.7.1. Let P be a set of five points on a plane with the property that no three of them lie on the
same line. Denote by a(P ) the number of obtuse triangles whose vertices lie in P. Find the minimum
and the maximum value that a(P ) can attain over all possible sets P.
Solution. First note that 5 points, A, B, C, D, and E, create 10 triangles as one can select 3 points from
the set of 5 points in $\binom{5}{3} = 10$ ways and each choice yields a unique triangle. We will first prove that
a(P ) ≥ 2 for any configuration P of 5 points. This bound is best possible as shown in Figure 8.5.

FIGURE 8.5: Configuration of five points forming two obtuse triangles. We take |AB| = |BC| = |CD| and the following angles are
right ∢DEA, ∢ABC , ∢BCD, ∢CDA, ∢DAB, ∢BDE and ∢EAC . Out of the 10 triangles created by points A, B, C, D, and E only
triangles EDC and EAB are obtuse.

We will independently consider two cases. Let us first assume that some point, say point A, lies
inside the convex hull of the remaining four points. As no three points are collinear, it must lie inside a
triangle formed by some other three points, say, B, C, and D. Note that the sum of the three angles
∢BAC , ∢BAD, and ∢CAD is equal to 2π and all of them are less than π. As a result, at least two of
them are obtuse. These two angles form the two obtuse triangles, as required. Suppose now that the five
points form a convex pentagon. Note that the sum of the interior angles of the pentagon is equal to 3π.
As each individual angle is less than π, we get that at least two of them are obtuse and, as before, at
least two obtuse triangles are present.
On the other hand, the maximum number of obtuse triangles formed by five points is 10, that is, it is
possible that all triangles are obtuse. Such a configuration is shown in Figure 8.6.
FIGURE 8.6: Configuration of five points forming ten obtuse triangles.

Problem 4.7.2. Every point on a circle is painted with one of three colors. Prove that there are three
points on the circle that have the same color and form an isosceles triangle.
Solution. In order to warm up, let us consider a much simpler version of this problem when only two
colors are available. Our goal is the same, we want to show that there are three points on the circle that
have the same color and form an isosceles triangle. In order to see this one can take any 5 points that
form a regular pentagon inscribed in the circle. It is enough to concentrate on these 5 points as no
matter how they are colored, the desired triangle has to be created. Indeed, observe that at least three of
these vertices must be painted with the same color. It follows that two of them, say A and B, must be
adjacent. If the third vertex, say C, is adjacent to either A or B, then they form an isosceles triangle
(note that they form the two sides of the pentagon). On the other hand, if C is adjacent to neither A nor B, then |AC| = |BC| as they are diagonals of the pentagon. See Figure 8.7 for an illustration
of both cases.

FIGURE 8.7: Two possible scenarios of two-colorings with at least three gray points.

Let us now come back to the original problem with three colors. Our proof technique is the same as
before. However, instead of a regular pentagon we will use a 13-sided regular polygon inscribed in the circle. Let us label the vertices of this polygon with numbers from 1 to 13, anticlockwise. Clearly, at least 5 of the vertices must be painted with the same color. We will concentrate on them and disregard the remaining vertices of the polygon (and an infinite number of other points from the circle). Let us denote their unique labels as $a_1, a_2, \ldots, a_5 \in [13]$. We will say that two vertices are at distance k if the number of vertices (from our polygon) that separate them is equal to k. Note that the smallest distance
is 0 (corresponding to the situation when the two vertices are adjacent) and the largest distance is 5.
It is easy to see that three vertices form an isosceles triangle if and only if the distance from one of
them to the remaining two is the same. We will do a case analysis to show that such a situation cannot be avoided when 5 vertices need to be selected and so the desired triangle must exist. In order to reduce the number of cases to consider, notice that regardless of which 5 vertices are selected (out of 13 vertices), the minimum distance between them cannot be greater than 1 (since 5 + 2 ⋅ 5 = 15 > 13).
Case 1: The minimum distance is equal to 1. Due to symmetry, without loss of generality, we may
assume that 1 and 3 are selected. Because of the distance constraint, 2, 4, and 13 cannot be chosen. In
order to avoid an isosceles triangle, 5 and 12 are also forbidden. This means that either 6 or 11 has to
be chosen. Again, due to symmetry, without loss of generality we may assume that 6 is chosen. This
disallows 7 (because of the distance constraint), 9 (because of the triangle 6-1-9), 10 (because of 6-10-
1), and 11 (because of 6-11-3). The only vertex left is 8 and so we are not able to select 5 points in
total. See Figure 8.8.

FIGURE 8.8: coloring 13-gon. Case 1.

Case 2: The minimum distance is equal to 0. Without loss of generality, we may assume that 1 and 2
are chosen which eliminates 3, 8 and 13. We consider the following sub-cases.
Case 2a: 4 or 12 is selected. Without loss of generality, we may assume that 4 is chosen which
disallows 6, 7, 9, and 11. We are left with three numbers, 5, 10, and 12, but no two of them can be
selected at the same time. Indeed, 5 and 10 cannot be together because of 1, 5 and 12 cannot be
together because of 2, and finally 10 and 12 cannot be together because of 1. See Figure 8.9.

FIGURE 8.9: coloring 13-gon. Case 2a.

Case 2b: Neither 4 nor 12 is selected but 5 or 11 is. Without loss of generality, we may assume that 5 is
chosen which disallows 9 and 10. As before, we are left with three numbers, 6, 7, and 11, but no two of
them can be selected at the same time. See Figure 8.10.

FIGURE 8.10: coloring 13-gon. Case 2b.


Case 2c: None of 4, 5, 11, 12 is selected. Since there are only four labels left (6, 7, 9, and 10) and we need to select 3 of them, we deduce that 6 or 10 has to be taken. They cannot both be selected because of 1. Without loss of generality, we may assume that 6 is chosen but not 10. We need to take the remaining two labels, 7 and 9, but then 1, 6, and 9 form an isosceles triangle. See Figure 8.11. This finishes the proof.

FIGURE 8.11: coloring 13-gon. Case 2c.

In problems like this one that require investigating a large number of cases, it is often useful to check the proof using a computer. Here is simple code written in the Julia language that verifies that for any selection of 5 vertices from the 13-sided regular polygon, there always exists an isosceles triangle. Running this code by calling test13gon() returns true, and so we have a computational confirmation of our claim.

using Combinatorics

function isisosceles(points)
    # work on a copy sorted in ascending order
    # (sort returns a new vector; it does not modify its argument)
    points = sort(points)
    # calculate the arc lengths between consecutive points
    d1 = points[2] - points[1]
    d2 = points[3] - points[2]
    d3 = 13 - d1 - d2
    # check if any two of the arc lengths are equal
    return d1 == d2 || d2 == d3 || d3 == d1
end

function test13gon()
    # pick all 5 element subsets from the set 1:13
    for p5 in combinations(1:13, 5)
        # check if any 3 element subset of the picked
        # 5 element subset forms an isosceles triangle
        if !any(isisosceles(p3) for p3 in combinations(p5, 3))
            return false
        end
    end
    return true
end
Additionally, we might search for a coloring of the 13-gon that yields a minimum number of
monochromatic isosceles triangles. Below is an additional function, also written in Julia, that calculates
it.

function isosceles_count(i)
    # convert number to its representation in base 3
    c = string(i, base=3, pad=13)
    # count monochromatic triangles that are isosceles
    count(t -> c[t[1]] == c[t[2]] == c[t[3]] && isisosceles(t),
          combinations(1:13, 3)), c
end

function best13gon()
    # initialize the sequence with monochromatic colorings
    # then traverse all non monochromatic colorings
    # we may then assume (without loss of generality)
    # that they start with digits 0 and 2
    mapreduce(isosceles_count, min, 2*3^11:3^12-1)
end
Running this code by calling best13gon() returns (2, "0200011022112"). We also check that init=isosceles_count(0) produces a larger number (a monochromatic coloring). It implies that it is almost possible to avoid monochromatic isosceles triangles when coloring the 13-gon with three colors. The returned coloring creates only two monochromatic isosceles triangles (see Figure 8.12). Let us note that it is only one example of such a coloring; in other words, this example is not unique.

FIGURE 8.12: Optimal coloring of the 13-gon. Only two monochromatic isosceles triangles are created: 3-4-5 and 1-3-5.

Let us also notice that the 13-sided regular polygon is the smallest polygon that can be used in our method. Indeed, one can color the vertices of the 12-sided regular polygon so that it contains no monochromatic isosceles triangle. For example, consider the following coloring: vertices {1, 2, 4, 5} are colored red, {8, 9, 11, 12} are colored green, and {3, 6, 7, 10} are colored blue. Similar patterns can be found for smaller regular polygons.
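The 12-gon coloring can be checked with a few lines of Julia that do not need the Combinatorics package. This is only a sketch: the helper names `isisosceles_ngon` and `mono_isosceles` are hypothetical, and the test for an isosceles triangle is the same arc-length comparison used for the 13-gon, with the number of sides passed as a parameter.

```julia
# Three vertices a < b < c of a regular n-gon form an isosceles triangle
# exactly when two of the three arcs between them have equal length.
function isisosceles_ngon(a, b, c, n)
    d1, d2 = b - a, c - b
    d3 = n - d1 - d2
    return d1 == d2 || d2 == d3 || d3 == d1
end

# Count the isosceles triangles whose vertices all lie in one color class.
function mono_isosceles(class, n)
    v = sort(collect(class))
    return count(isisosceles_ngon(v[i], v[j], v[k], n)
                 for i in 1:length(v) for j in i+1:length(v) for k in j+1:length(v))
end

# red, green, and blue classes of the 12-gon coloring
classes = [[1, 2, 4, 5], [8, 9, 11, 12], [3, 6, 7, 10]]
println(sum(mono_isosceles(c, 12) for c in classes))   # prints 0
```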


Finally, let us mention a natural generalization of this problem to more than 3 colors. It was easy to show that with 2 colors it is impossible to avoid three monochromatic points on the circle that form an isosceles triangle. Generalizing this observation to 3 colors was more delicate and tedious. However, it feels that with, say, $10^{10}$ colors one might be able to actually avoid it. It turns out that this is impossible, regardless of how many colors we have available!
We already made the key observation that we need to deal with an arbitrary number of colors: three
vertices of a regular n-gon form an isosceles triangle if and only if the distance from one of them to the
remaining two is the same. In other words, if the vertices are labelled with consecutive numbers from
[n], then it is enough that the corresponding labels a, b, and c form an arithmetic progression. Clearly,

this is a sufficient condition but not a necessary one. As a result, we get a natural connection between our problem and the famous van der Waerden numbers.
Let us start with a striking observation made by Van der Waerden. For any given natural numbers r
and k, there is some number n = n(r, k) such that if the integers from [n] are colored, each with one of
r available colors, then there are at least k integers, all of the same color, which form an arithmetic
progression. The least such n is the van der Waerden number W (r, k).
In our problem, we do not actually need to know W(r, k); all we need is to make sure it exists. This is guaranteed by the original observation of Van der Waerden and can be proved by induction. Indeed, despite the fact that we have such powerful computers these days, only 6 nontrivial numbers are known: W(2, 3) = 9 (easy exercise), W(2, 4) = 35 (Chvátal (1970)), W(2, 5) = 178 (Stevens and Shantaram (1978)), W(2, 6) = 1,132 (Kouril and Paul (2008)), W(3, 3) = 27 (Chvátal (1970)), and W(4, 3) = 76 (Beeler and O’Neil (1979)).

Let us come back to the original problem. We start with a regular n-gon with n = W(3, 3) = 27, vertices of which are labelled with numbers from [n]. Regardless of how we color the vertices, a monochromatic arithmetic progression of length 3 must be created and the corresponding vertices form an isosceles triangle. This argument is not optimal (a 27-gon is used instead of a 13-gon) but it trivially generalizes to any number of colors. For an arbitrary number r of colors, one needs to start with n = W(r, 3) and the same argument follows.
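The smallest of the values listed above, W(2, 3) = 9, is small enough to confirm by exhaustive search. The following is a minimal sketch under the assumption that a 2-coloring of 1:n is encoded by the bits of an integer `mask`; the function names are illustrative.

```julia
# color of integer i (1-based) in the 2-coloring encoded by the bits of mask
color(mask, i) = (mask >> (i - 1)) & 1

# does the coloring contain a monochromatic progression a, a+d, a+2d in 1:n?
function has_mono_ap3(mask, n)
    for a in 1:n, d in 1:(n - a) ÷ 2
        if color(mask, a) == color(mask, a + d) == color(mask, a + 2d)
            return true
        end
    end
    return false
end

# is there a 2-coloring of 1:n with no monochromatic 3-term progression?
avoidable(n) = any(!has_mono_ap3(mask, n) for mask in 0:2^n - 1)

println(avoidable(8))   # true: the integers 1:8 can still be colored safely
println(avoidable(9))   # false: every 2-coloring of 1:9 fails
```

The same brute force is hopeless for the larger values in the list; it only illustrates the definition.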

Problem 4.7.3. Take a set of n ≥ 2 points with the property that no three of them lie on the same line.
We paint all line segments formed by those points in such a way that no two line segments that have a
common vertex have the same color. Find the minimum number of colors for which such coloring
exists.
Solution. Since no three points lie on the same line, the number of line segments is equal to $f(n)$, the number of two-element subsets of an $n$-element set; $f(n) = \binom{n}{2} = n(n-1)/2$. Let $g(n)$ be the maximum number of disjoint two-element subsets of such a set; $g(n) = \lfloor n/2 \rfloor$ ($g(n) = n/2$ if $n$ is even and $g(n) = (n-1)/2$ if $n$ is odd). Clearly, if the two line segments created by points a, b and, respectively, points c, d are colored with the same color, then all of these points are different. It follows that the number of line segments that have the same color is at most $g(n)$. Combining the two observations we get that the minimum number of colors for which the desired coloring exists is at least

$$f(n)/g(n) = \begin{cases} n - 1 & \text{if } n \text{ is even} \\ n & \text{otherwise.} \end{cases}$$

In order to see that this bound is achievable, let us consider the following simple construction. Suppose first that n is odd. We start with n points on the circle, equally spaced (that is, these points are vertices of a regular n-gon). Clearly, there are n directions defined by the line segments formed by those points, and in each direction there are exactly (n − 1)/2 line segments. All line segments associated with the same direction receive the same color. See Figure 8.13 (left) for an example with n = 5. Since g(n) = (n − 1)/2 line segments have the same direction, only f(n)/g(n) = n colors are used.

For an even value of n, we simply use the previous construction with n − 1 points, which can be dealt with using n − 1 colors. Note that there are n − 1 colors (or, equivalently, directions) but each vertex is part of n − 2 line segments. Observe that, as a result, each vertex is missing one unique color. We add the n-th point in the center of the circle and connect it to the n − 1 points using the missing color. See Figure 8.13 (right) for an example with n = 6.

FIGURE 8.13: coloring the line segments for n = 5 (left) and n = 6 (right).
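The construction can be sketched in code. The version below makes one assumption that is not spelled out in the text: with the points of the regular m-gon labelled 0, 1, …, m − 1, two chords are parallel exactly when the sums of their endpoint labels agree modulo m, so the direction (color) of the segment joining i and j is taken to be (i + j) mod m. The function names are hypothetical.

```julia
# Color of the segment joining points i and j (0-based labels).
# For odd n all n points lie on the circle; for even n the point with
# label n - 1 is the extra point placed in the center.
function segment_color(i, j, n)
    m = isodd(n) ? n : n - 1          # number of points on the circle
    i, j = minmax(i, j)
    # the center (label m, only present for even n) is joined to vertex i
    # with the one direction that vertex i is missing, namely 2i mod m
    return j == m ? mod(2i, m) : mod(i + j, m)
end

# no two segments sharing an endpoint may receive the same color
function is_proper(n)
    for v in 0:n-1
        cols = [segment_color(v, u, n) for u in 0:n-1 if u != v]
        allunique(cols) || return false
    end
    return true
end

# number of distinct colors actually used
ncolors(n) = length(unique(segment_color(i, j, n) for i in 0:n-1 for j in i+1:n-1))

println([(n, is_proper(n), ncolors(n)) for n in 2:8])
```

For each n the number of colors used should match the lower bound derived above: n − 1 for even n and n for odd n.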

Problem 4.8.1. Twenty-five boys and 25 girls sit around a table. Prove that it is always possible to find a person both of whose neighbors are girls.
Solution. Let us label seats with numbers from the set [50]. Consider two subsets of people, those
sitting in even and odd positions at the table. In one of those sets there must be at least 13 girls.
However, this implies that there are two girls that are separated by one person. That person is the one
that we are looking for.
Problem 4.8.2. A person takes at least one aspirin a day for 30 days. Show that if the person takes 45
aspirin altogether, then in some sequence of consecutive days that person takes exactly 14 aspirin.
Solution. For $i \in [30]$, let $a_i$ be the cumulative number of aspirins taken up to and including day $i$. We know that $a_1 > 0$ and, for all $i$, $a_{i+1} > a_i$ (since the person takes at least one aspirin a day). Moreover, $a_{30} = 45$ (the person takes 45 aspirin altogether). Now, for $i \in [30]$, let $b_i = a_i + 14$. The properties that we determined for the $a_i$’s imply that $b_1 > 14$, $b_{30} = 59$ and, for all $i$, $b_{i+1} > b_i$. Putting these two sequences together, we get 60 numbers in total, all of them positive and smaller than 60. By the pigeonhole principle, two of these numbers are equal. Since each of the two sequences is strictly increasing, the equal numbers must come from different sequences; that is, there are $k$ and $\ell$ for which $a_k = b_\ell = a_\ell + 14$. It follows that $a_k - a_\ell = 14$, so the person takes exactly 14 aspirin between day $\ell + 1$ and day $k$.
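The argument is constructive enough to turn into a search over cumulative sums. The sketch below is illustrative only: `find_window` and the sample schedule `doses` are hypothetical, and, unlike the proof, the search also allows a window that starts on day 1 (by treating the empty prefix as a cumulative sum of 0).

```julia
# doses[i] = number of aspirin taken on day i (hypothetical input data).
# Returns (first_day, last_day) of a run of consecutive days whose doses
# sum to target, or nothing if no such run exists.
function find_window(doses, target)
    a = cumsum(doses)                       # a[i] = total up to and including day i
    seen = Dict(0 => 0)                     # cumulative sum => day (0 = before day 1)
    for (k, ak) in enumerate(a)
        l = get(seen, ak - target, nothing)
        l === nothing || return (l + 1, k)  # days l+1, ..., k sum to target
        seen[ak] = k
    end
    return nothing
end

# 30 days, at least one aspirin a day, 45 in total
doses = [16; fill(1, 29)]
println(find_window(doses, 14))
```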

Problem 4.8.3. Prove that, if we take n + 1 numbers from the set {1, 2, …, 2n}, then in this subset there exist two numbers such that one divides the other.
Solution. Each number from the set $[2n]$ can be uniquely represented in the form $2^p q$, where $p \in N \cup \{0\}$ and $q \in N$ is an odd number. We say that a number represented as $2^p q$ is of type $q$. Clearly, two numbers of the same type have the desired property, that is, one divides the other. So our goal is to show that regardless of which n + 1 numbers are selected from [2n], there will be two numbers of the same type. Since the number of types is equal to n (there are n odd numbers in [2n]), this follows immediately from the pigeonhole principle. Finally, let us mention that this result is sharp in the sense that one can select n numbers from [2n] (namely, n + 1, n + 2, …, 2n) and avoid this property: any proper multiple of a number greater than n exceeds 2n.
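For small n both the pigeonhole claim and a set of n numbers without a dividing pair can be checked directly. This is a brute-force sketch with hypothetical helper names; subsets of 1:2n are encoded as bit masks, so it is only practical for very small n.

```julia
# odd part q of m = 2^p * q (the "type" of m)
oddpart(m) = m >> trailing_zeros(m)

# true if some element of s divides a different element of s
has_dividing_pair(s) = any(a != b && b % a == 0 for a in s for b in s)

# check every (n+1)-element subset of 1:2n
function check_all_subsets(n)
    for mask in 0:2^(2n) - 1
        count_ones(mask) == n + 1 || continue
        s = [i for i in 1:2n if (mask >> (i - 1)) & 1 == 1]
        has_dividing_pair(s) || return false
    end
    return true
end

n = 5
println(length(unique(oddpart.(1:2n))))   # number of types in 1:2n
println(check_all_subsets(n))             # every (n+1)-subset has a dividing pair
println(has_dividing_pair(n+1:2n))        # the n numbers n+1, ..., 2n have none
```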
Problem 4.9.1. Consider the Sicherman dice problem in which the restriction that each side is labelled
with a positive integer is relaxed to any integer, not necessarily positive. Can you design more pairs of
dice?
Solution. Notice that adding 1 to all sides of one die and subtracting 1 from all sides of the other die does not affect the distribution of their sum. So there are infinitely many solutions, for example,
((0, 1, 1, 2, 2, 3), (2, 4, 5, 6, 7, 9)) or ((2, 3, 4, 5, 6, 7), (0, 1, 2, 3, 4, 5)).
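A short computation confirms that both example pairs reproduce the sum distribution of two standard dice. The helper `sum_distribution` is a hypothetical name introduced for this sketch.

```julia
# distribution of the sum of two dice, stored as sum => number of ways
function sum_distribution(d1, d2)
    dist = Dict{Int,Int}()
    for x in d1, y in d2
        dist[x + y] = get(dist, x + y, 0) + 1
    end
    return dist
end

standard = sum_distribution(1:6, 1:6)
println(sum_distribution((0, 1, 1, 2, 2, 3), (2, 4, 5, 6, 7, 9)) == standard)
println(sum_distribution((2, 3, 4, 5, 6, 7), (0, 1, 2, 3, 4, 5)) == standard)
```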
Problem 4.9.2. Solve the recurrence xn+1 = xn + 2xn−1 for n ∈ N , with x0 = 0 and x1 = 1 . Verify
your solution using induction.
Solution. In order to find the corresponding generating function we follow the same strategy as for the Fibonacci sequence (see the example above). We get that

$$\frac{G(x) - x}{x} = G(x) + 2xG(x).$$

Therefore

$$G(x) = \frac{1}{3}\left(\frac{1}{-1 - x} + \frac{1}{1 - 2x}\right) = \frac{1}{3}\sum_{n=0}^{\infty}\left(-(-1)^n x^n + 2^n x^n\right),$$

and so $x_n = (2^n - (-1)^n)/3$.
Verifying the solution by induction is straightforward. Since

$$x_0 = (1 - 1)/3 = 0 \quad \text{and} \quad x_1 = (2 - (-1))/3 = 1,$$

the initial conditions hold. Assuming that $x_i = (2^i - (-1)^i)/3$ for all $i \le n$, we get that

$$\begin{aligned} x_{n+1} &= x_n + 2x_{n-1} \\ &= (2^n - (-1)^n)/3 + 2(2^{n-1} - (-1)^{n-1})/3 \\ &= (2^{n+1} - (-1)^{n+1})/3. \end{aligned}$$
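As a sanity check alongside the induction, the closed form can be compared with values produced by the recurrence itself. The function names below are illustrative; `BigInt` is used only to avoid overflow for larger indices.

```julia
# closed form x_n = (2^n - (-1)^n) / 3
closed_form(n) = (big(2)^n - (-1)^n) ÷ 3

# x_0, ..., x_N computed directly from x_{k} = x_{k-1} + 2 x_{k-2}
function by_recurrence(N)
    x = zeros(BigInt, N + 1)      # x[k + 1] stores x_k
    x[2] = 1                      # x_1 = 1 (x_0 = 0 is already set)
    for k in 2:N
        x[k + 1] = x[k] + 2 * x[k - 1]
    end
    return x
end

x = by_recurrence(50)
println(all(x[k + 1] == closed_form(k) for k in 0:50))
```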

Problem 4.9.3. Your friend wants to play the following game with you. You toss three 6-sided fair dice
and calculate the sum of outcomes. For every game you have to pay $1. If the sum is 10 or 11 you get
$4, otherwise you get nothing. Is this game fair?

Solution. We will compute the probability of getting 10 or 11 by investigating $f(x)^3$, where $f(x)$ is the generating function for a fair die we have introduced in the solution. We note that

$$f(x)^3 = \left(\sum_{i=1}^{6} x^i\right)^3 = \sum_{i=3}^{10} a_i\,(x^i + x^{21-i}),$$

where $(a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_{10}) = (1, 3, 6, 10, 15, 21, 25, 27)$. It follows that the probability of winning (that is, earning \$4 − \$1 = \$3) is equal to $p := 2a_{10}/6^3$. On the other hand, the probability of losing \$1 is equal to $q := 2\sum_{i=3}^{9} a_i/6^3$. Let us note that $f(1)^3 = 6^3 = 216 = 4 \cdot 2a_{10}$. Therefore, $4a_{10} = \sum_{i=3}^{10} a_i = \sum_{i=3}^{9} a_i + a_{10}$, and so $q + p = 4p$ or $q = 3p$. The expected number of dollars earned in each game is then equal to

$$3p - q = 0,$$

so the game is fair.


Here is a simple Julia program that allows us to verify this claim:

count(x->10<=x<=11, i+j+k for i in 1:6, j in 1:6, k in 1:6)/6^3


The output it produces is 0.25, as expected.

8.5 Number Theory


Problem 5.1.1. A positive fraction a/b is said to be in lowest terms if gcd(a, b) = 1. Prove that, if a positive fraction a/b is in lowest terms, then the fraction

$$(a + b)/(a^2 + ab + b^2)$$

is also in lowest terms.

Solution. Suppose that a positive fraction a/b is in lowest terms. Since $a^2 + ab + b^2 = (a + b)^2 - ab$, our goal is to prove that

$$\gcd(a + b, a^2 + ab + b^2) = \gcd(a + b, (a + b)^2 - ab) = 1.$$

Since $a + b \mid (a + b)^2$, it follows that

$$\gcd(a + b, (a + b)^2 - ab) = \gcd(a + b, ab).$$

Consider any prime p that divides ab. Since a/b is in lowest terms, p cannot divide both a and b. By symmetry, without loss of generality, we may assume that it divides a but not b. It follows that p does not divide a + b and so gcd(a + b, ab) = 1, and the proof is finished.
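A quick numerical spot check of the statement, in the spirit of the other computer verifications in this chapter, might look as follows. The search range 1:200 is an arbitrary choice and of course proves nothing beyond that range.

```julia
# for every coprime pair (a, b) in the range, the new fraction
# (a + b)/(a^2 + ab + b^2) should again be in lowest terms
ok = all(gcd(a + b, a^2 + a*b + b^2) == 1
         for a in 1:200 for b in 1:200 if gcd(a, b) == 1)
println(ok)
```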
Problem 5.1.2. You are given two natural numbers a and b. Prove that if $a + b \mid a^2$, then $a + b \mid b^2$.
Solution. Let a and b be any two natural numbers such that $a + b \mid a^2$. Let $c = \gcd(a, b)$ and set $a' := a/c$, $b' := b/c$ so that $\gcd(a', b') = 1$. Note that our assumption $a + b \mid a^2$ can be rewritten as $c(a' + b') \mid c^2 a'^2$, or equivalently as $a' + b' \mid c a'^2$. However, since $\gcd(a', b') = 1$, we get that $\gcd(a' + b', a'^2) = 1$. It follows that $a' + b' \mid c$, and so $a + b \mid c^2$, but this implies that $a + b$ also divides $c^2 b'^2 = (cb')^2 = b^2$, as required.

Problem 5.1.3. Consider a set A of four digit numbers whose decimal representation uses precisely two
digits; moreover, both of them are non-zero. Let f : A → A be the function such that f(a) flips the digits of a ∈ A (for example, f(1333) = 3111). Find n > f(n) for which gcd(n, f(n)) is as large as possible.
Solution. Let us first note that gcd(8484, 4848) = 1212. We will show that this is the maximum
possible value of gcd(n, f (n)) and so 8484 is the value of n we are looking for. In fact, we will prove
that it is the unique value of n such that n > f (n) that maximizes gcd(n, f (n)).
Suppose that n > f (n) is such that k = gcd(n, f (n)) ≥ 1212. Note that
k = gcd(n, f (n)) = gcd(n, n + f (n)) and so, in particular, k divides n + f (n). Suppose that the

representation of n uses digits a and b, 1 ≤ a, b ≤ 9 and a ≠ b. It is easy to see that the property of the
function f implies that

n + f (n ) = (a + b)1111 = (a + b)11 ⋅ 101 .

Since $k \ge 1212$ divides $(a + b) \cdot 11 \cdot 101$, 101 is a prime number, and $11(a + b) \le 11(9 + 8) = 187 < 1212$, we conclude that 101 divides k and so it also divides n.

Therefore, n must have the form abab, as numbers of the form baaa, abaa, aaba, aaab, aabb and abba are not divisible by 101. In order to see this, note that

baaa: 1000b + 111a = 10(a − b) + 101(10b + a) and 0 < 10 ≤ |10(a − b)| ≤ 80 < 101,
abaa: 1011a + 100b = (a − b) + 101(10a + b) and 0 < 1 ≤ |a − b| ≤ 8 < 101,
aaba: 1101a + 10b = 10(b − a) + 101 ⋅ 11a and 0 < 10 ≤ |10(b − a)| ≤ 80 < 101,
aaab: 1110a + b = (b − a) + 101 ⋅ 11a and 0 < 1 ≤ |b − a| ≤ 8 < 101,
aabb: 1100a + 11b = 11(b − a) + 101 ⋅ 11a and 0 < 11 ≤ |11(b − a)| ≤ 88 < 101,
abba: 1001a + 110b = 9(b − a) + 101(10a + b) and 0 < 9 ≤ |9(b − a)| ≤ 72 < 101.

The remaining case that is left to deal with is when n is of the form abab which can be written as
101(10a + b). The corresponding value of f (n) (that is of the form baba) can be written as
101(10b + a) and so

k = gcd(n, n + f (n)) = 101 gcd(10a + b, 11(a + b) ) .

Since a ≠ b, 10a + b is not divisible by 11 and we get that

k = 101 gcd(10a + b, a + b) = 101 gcd(9a, a + b ) .

We will independently consider two cases. Suppose first that $9 \mid a + b$. Since $0 < 1 + 2 \le a + b \le 9 + 8 < 9 \cdot 2$, we get that $a + b = 9$ and so $k \le 101 \cdot 9 < 1212$. Suppose then that $a + b$ is not divisible by 9. Let $c := \gcd(a, b)$, $a' = a/c$, and $b' = b/c$. We get that $\gcd(9a, a + b) = c \gcd(9a', a' + b') = c \gcd(9, a' + b')$. It is straightforward to see that $c \le 4$ and $\gcd(9, a' + b') \le 3$. As a result, the maximum value $\gcd(n, f(n))$ can attain is less than or equal to $101 \cdot 4 \cdot 3 = 1212$. Our initial example shows that it is achievable by n = 8484.

Actually, one can show that this is the unique value of n such that n > f(n) that maximizes gcd(n, f(n)). Indeed, in order to achieve the maximum value of 1212, we must have that $c = 4$ and $3 \mid a' + b'$. Since $a + b = c(a' + b') = 4(a' + b') \le 9 + 8 = 17$, we get that $a' + b' \le 4$. It follows that $a' + b' = 3$, or equivalently, that $a + b = 12$. Since $1 \le b < a \le 9$ (as n > f(n)), $4 \mid a$, and $4 \mid b$, we get that a = 8 and b = 4. The proof of uniqueness is finished.


Note that this problem is relatively easy to solve using a computer. Here is a simple program in the Julia language that finds all values of n that satisfy the required condition and maximize gcd(n, f(n)):

julia> function check(n)
           s = unique(digits(n))
           if length(s) != 2 || minimum(s) == 0
               # return a small value as n is invalid
               return 0
           end
           return gcd(1111*sum(s) - n, n)
       end
check (generic function with 1 method)

julia> gcds = [check(n) for n in 1000:9999];

julia> max_gcd = maximum(gcds)
1212

julia> [n for n in 1000:9999 if check(n) == max_gcd]
2-element Array{Int64,1}:
 4848
 8484
And we have a confirmation that the pair (4848, 8484) yields the maximum greatest common divisor that is equal to 1212.
Problem 5.2.1. Find all primes p for which $p^2 + 2$ is also prime.
Solution. We will independently consider the following three cases.
Case 1: Suppose first that p ≡ 0 (mod 3). There is only one prime number that is divisible by 3, namely, 3 itself. If p = 3, then $p^2 + 2 = 11$ is a prime. So p = 3 is a solution to our problem.
Case 2: Suppose now that p ≡ 1 (mod 3). Since $p^2 + 2 > 3$ and $p^2 + 2 \equiv 1^2 + 2 = 3 \equiv 0 \pmod{3}$, $p^2 + 2$ is not a prime.
Case 3: Finally, suppose that p ≡ 2 (mod 3). Arguing as before, since $p^2 + 2 > 3$ and $p^2 + 2 \equiv 2^2 + 2 = 6 \equiv 0 \pmod{3}$, $p^2 + 2$ is not a prime.

Combining all three cases together we conclude that the only solution is p = 3.
Problem 5.2.2. You are given three consecutive natural numbers (say, a, a + 1, and a + 2) such that the middle one is a cube (that is, $a + 1 = \ell^3$ for some $\ell \in N$). Prove that their product is divisible by 504.
Solution. As $504 = 8 \cdot 9 \cdot 7$, it is enough to show that $k := (\ell^3 - 1)\ell^3(\ell^3 + 1)$ is divisible by 8, 9, and 7. We will independently deal with each case.
Case: 8 | k. We will consider two sub-cases. If $\ell$ is even, then $8 \mid \ell^3$ and we immediately get that 8 | k. Suppose then that $\ell$ is odd. It follows that $\ell^3$ is also odd and so $\ell^3 - 1$ and $\ell^3 + 1$ are two consecutive even numbers. One of them must be divisible by 4 and so $8 \mid (\ell^3 - 1)(\ell^3 + 1)$. We conclude that 8 | k in this sub-case too.
Case: 9 | k. It is easy to check that $\ell^3$ is congruent to 0, 1, or 8 modulo 9. So one of the three numbers, $\ell^3$, $\ell^3 - 1$, or $\ell^3 + 1$, is divisible by 9. We get that their product is also divisible by 9.
Case: 7 | k. As before, it is easy to check that $\ell^3$ is congruent to 0, 1, or 6 modulo 7. One of the three numbers, $\ell^3$, $\ell^3 - 1$, or $\ell^3 + 1$, is divisible by 7 and so is k.
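The two "easy to check" facts about cubes modulo 9 and modulo 7, and the final divisibility claim, can be confirmed by a direct computation. The helper name `cube_residues` and the range 1:1000 are choices made for this sketch only.

```julia
# all residues that a cube can leave modulo m
cube_residues(m) = sort(unique(mod(l^3, m) for l in 0:m-1))

println(cube_residues(9))   # [0, 1, 8]
println(cube_residues(7))   # [0, 1, 6]

# the product (l^3 - 1) * l^3 * (l^3 + 1); BigInt avoids overflow
triple_product(l) = (big(l)^3 - 1) * big(l)^3 * (big(l)^3 + 1)
println(all(triple_product(l) % 504 == 0 for l in 1:1000))
```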

Problem 5.2.3. Prove that for any natural $n \in N$ that is not divisible by 10 there exists $k \in N$ such that $n^k$ has in its decimal representation the same digit at the first and the last position.
Solution. The property trivially holds for $n < 10$ as $n^1 = n$ has only one digit. In order to deal with $n > 10$, let us first show the following useful property:

$$n^{4k+1} \equiv n \pmod{10} \quad \text{for all } k \in N.$$

As a result, we will be able to restrict ourselves to the subsequence $(n^{4k+1})_{k \in N}$, which has the property that all terms have the same last digit, and concentrate exclusively on the first digit.
Let

$$\ell := n^{4k+1} - n = n(n^k - 1)(n^k + 1)((n^k)^2 + 1).$$

Our goal is to show that $10 \mid \ell$. We will show independently that $2 \mid \ell$ and that $5 \mid \ell$. The first task is easy: it is clear that $2 \mid n(n^k - 1)$ and so $2 \mid \ell$. Divisibility by 5 requires considering a few cases. For each case, we will show that 5 divides some term in the above representation of $\ell$. If $n^k \equiv 0 \pmod{5}$, then $5 \mid n$. If $n^k \equiv 1 \pmod{5}$, then $5 \mid (n^k - 1)$. If $n^k \equiv 4 \pmod{5}$, then $5 \mid (n^k + 1)$. If $n^k \equiv 2 \pmod{5}$, then $(n^k)^2 + 1 \equiv 4 + 1 = 5 \equiv 0 \pmod{5}$. Finally, if $n^k \equiv 3 \pmod{5}$, then $(n^k)^2 + 1 \equiv 9 + 1 = 10 \equiv 0 \pmod{5}$. This finishes the proof that in the sequence $(n^{4k+1})_{k \in N}$ all numbers have the same last digit.

Let us now consider numbers of the form $n^{4i}$, for $i \in [91]$, and concentrate on their first two digits. Clearly, there are 90 possibilities for the first two digits, from 10 to 99. Hence, by the pigeonhole principle, there exist $1 \le i_1 < i_2 \le 91$ such that $n^{4i_1}$ and $n^{4i_2}$ have the same two first digits; that is, $n^{4i_1} = (d + r_1)10^{p_1}$ and $n^{4i_2} = (d + r_2)10^{p_2}$, for some integer $d$ such that $10 \le d \le 99$, some real numbers $r_1, r_2$ such that $0 \le r_1, r_2 < 1$, and some integers $0 \le p_1 < p_2$. In fact $r_1, r_2 > 0$ as n is not divisible by 10 (which will be important soon). Recall that $i_1 < i_2$, $p_1 < p_2$, and note that

$$n^{4(i_2 - i_1)} = \frac{(d + r_2)10^{p_2}}{(d + r_1)10^{p_1}} = \left(1 + \frac{r_2 - r_1}{d + r_1}\right)10^{p_2 - p_1}.$$

Since $-1/10 < (r_2 - r_1)/(d + r_1) < 1/10$,

$$n^{4t} = x \cdot 10^s \quad \text{for a given } x \in (0.9, 1.1) \setminus \{1\}, \tag{8.12}$$

where $t = i_2 - i_1 \in N$ and $s = p_2 - p_1 \in N$. Let us stress it again that the assumption that n is not divisible by 10 is important and allows us to exclude the case x = 1.


We will now restrict ourselves even further, and concentrate on the subsequence $(n^{4tk+1})_{k \in N}$ of the subsequence $(n^{4k+1})_{k \in N}$. It will be convenient to represent each term in its (normalized) scientific notation, which is a standard way of expressing numbers that are too large or too small to be conveniently written in decimal form. All terms can be written in the form $m \cdot 10^e$, where the exponent $e$ (called the order of magnitude) is an integer, and the coefficient $m$ (called the significand or mantissa) is a real number with absolute value at least one but less than ten. The first digit of the term is equal to the floor of the corresponding mantissa. We start with the original number n (the term corresponding to k = 0) that has the last digit c ≠ 0. To get the next term, we multiply the current term by $n^{4t} = x \cdot 10^s$. Because of property (8.12), the mantissa does not change much after that (unless, of course, it “switches” from a value from the interval [1, 2) to a value from the interval [9, 10), or vice versa). As a result, the first digit never “skips” any digit. Indeed, if 1 < x < 1.1, then the mantissa keeps geometrically increasing (until it eventually “switches”). More importantly, since 9 ⋅ x < 9.9 < 10, it never “skips” any digit (the extreme case is when digit 8 changes to 9). Similarly, if 0.9 < x < 1, then the mantissa keeps geometrically decreasing (again, until it eventually “switches”). Since 10 ⋅ x > 9, it never “skips” any digit (this time the extreme case is when digit 1 changes to 9). Hence, for some k ∈ N, the floor of the mantissa is equal to c and so the first and the last digits of the term $n^{4tk+1}$ are the same. This finishes the proof.
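The proof shows that a suitable exponent exists but does not say how large it is. A direct search usually finds a small one. The sketch below is illustrative: the function name `matching_power` and the cut-off `kmax` are assumptions of this example, and the search looks at all exponents k, not only those of the form used in the proof.

```julia
# smallest k <= kmax such that n^k starts and ends with the same digit,
# or nothing if the search limit is reached
function matching_power(n; kmax = 10_000)
    n % 10 == 0 && throw(ArgumentError("n must not be divisible by 10"))
    p = big(n)                       # p holds n^k as an arbitrary-precision integer
    for k in 1:kmax
        s = string(p)
        first(s) == last(s) && return k
        p *= n
    end
    return nothing
end

println(matching_power(7))    # 1, since 7 has a single digit
println(matching_power(12))   # 5, since 12^5 = 248832
```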


Problem 5.3.1. You are given two integers, a and b, and a prime p > 2. Prove that if $p \mid a + b$ and $p \mid a^2 + b^2$, then $p^2 \mid a^2 + b^2$.
Solution. Suppose that $p \mid a + b$ and $p \mid a^2 + b^2$. Since $p \mid a + b$, we get that $p \mid (a + b)^2$. From this and the fact that $p \mid a^2 + b^2$, we get that p also divides $(a + b)^2 - (a^2 + b^2) = 2ab$. As p > 2, it follows that p | a or p | b. By symmetry, without loss of generality, we may assume that p | a. Since p | a + b and p | a, we get that p also divides (a + b) − a = b. Hence, $p^2 \mid a^2$, $p^2 \mid b^2$ and, as a consequence, $p^2 \mid a^2 + b^2$.

Problem 5.3.2. You are given four integers a, b, c , and d. Prove that if a − c | ab + cd , then
a − c | ad + bc.

Solution. Suppose that a − c | ab + cd for some integers a, b, c, and d. Since

(ab + cd) − (ad + bc ) = a(b − d) + c(d − b ) = (a − c)(b − d)

we get that a − c divides (ab + cd) − (ad + bc). As a − c | ab + cd, we get that a − c divides (ab + cd) − (ad + bc) − (ab + cd) = −(ad + bc). We conclude that a − c | ad + bc.
Problem 5.3.3. Consider any natural number n ≥ 2. Prove that $n^{12} + 64$ has at least four different non-trivial natural factors; that is, $n^{12} + 64 = a \cdot b \cdot c \cdot d$ for some $a, b, c, d \in N$ such that $1 < a < b < c < d < n^{12} + 64$.

Solution. We will deal with the case n = 2 independently. We have that

$$2^{12} + 64 = 64 \cdot (64 + 1) = 64 \cdot 13 \cdot 5 = 2 \cdot 32 \cdot 13 \cdot 5,$$

and so the desired property holds.


From now on, we will assume n ≥ 3. Using the formula

$$a^4 + 4b^4 = (a^2 + 2b^2 - 2ab)(a^2 + 2b^2 + 2ab),$$

we can factor our expression as follows (by setting $a = n^3$ and $b = 2$):

$$n^{12} + 64 = (n^6 - 4n^3 + 8)(n^6 + 4n^3 + 8).$$

Then, we observe that

$$n^6 - 4n^3 + 8 = (n^2 + 2n + 2)(n^4 - 2n^3 + 2n^2 - 4n + 4)$$

and that

$$n^6 + 4n^3 + 8 = (n^2 - 2n + 2)(n^4 + 2n^3 + 2n^2 + 4n + 4).$$

It is obvious that

$$n^2 - 2n + 2 < n^2 + 2n + 2$$

and that

$$n^4 - 2n^3 + 2n^2 - 4n + 4 < n^4 + 2n^3 + 2n^2 + 4n + 4.$$

So, in order to finish the proof, it is enough to show that for any n ≥ 3

$$n^2 + 2n + 2 < n^4 - 2n^3 + 2n^2 - 4n + 4,$$

or equivalently that

$$n^4 - 2n^3 + n^2 - 6n + 2 > 0.$$

The desired inequality thus holds, as for any n ≥ 3, we have that

$$\begin{aligned} n^4 - 2n^3 + n^2 - 6n + 2 &= n^3(n - 3) + n(n^2 + n - 6) + 2 \\ &\ge 0 + n(9 + 3 - 6) + 2 \ge 6n + 2 \ge 20 > 0. \end{aligned}$$

This finishes the proof.
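The factorization and the ordering of the four factors are easy to confirm numerically. In the sketch below the names `f1` to `f4` and `check_factors` are hypothetical, and the tested range 3:1000 is an arbitrary choice.

```julia
# the four factors, listed in increasing order for n >= 3
f1(n) = n^2 - 2n + 2
f2(n) = n^2 + 2n + 2
f3(n) = n^4 - 2n^3 + 2n^2 - 4n + 4
f4(n) = n^4 + 2n^3 + 2n^2 + 4n + 4

function check_factors(n)
    m = big(n)                          # BigInt, since n^12 overflows Int64 quickly
    a, b, c, d = f1(m), f2(m), f3(m), f4(m)
    return a * b * c * d == m^12 + 64 && 1 < a < b < c < d
end

println(all(check_factors, 3:1000))
```

For n = 3 the factors are 5, 17, 37, and 169, whose product is 531505 = 3^12 + 64.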


Let us make a final remark. The reader might wonder how one can find a factorization like the one we used in this problem:

$$n^6 - 4n^3 + 8 = (n^2 + 2n + 2)(n^4 - 2n^3 + 2n^2 - 4n + 4).$$

One first needs to notice that $n^6 - 4n^3 + 8$ has no integer roots (see Section 3.4 for a discussion on the rational root theorem). Therefore, the next step is to try to find a factorization of the form

$$(n^2 + a_1 n + a_0)(n^4 + b_3 n^3 + b_2 n^2 + b_1 n + b_0).$$

By comparing the coefficients associated with a given power of n, we get the following system of equations:

$$\begin{aligned} b_3 + a_1 &= 0 \\ a_1 b_3 + b_2 + a_0 &= 0 \\ a_0 b_3 + a_1 b_2 + b_1 &= -4 \\ a_0 b_2 + a_1 b_1 + b_0 &= 0 \\ a_0 b_1 + a_1 b_0 &= 0 \\ a_0 b_0 &= 8. \end{aligned}$$

One can consecutively remove the $b_i$’s from this system to get a system of only two equations in $a_0$ and $a_1$. Then, as we know that $a_0 \mid 8$, it is enough to check the 8 possible values of $a_0$ (±1, ±2, ±4, ±8) to find out that the only solution is $a_1 = a_0 = 2$.

Problem 5.4.1. Prove that for all $a \in \mathbb{N}$, we have $35 \mid a^{64} - a^4$.
Solution. Let us first note that
$$a^{64} - a^4 = a^4((a^{15})^4 - 1) = a^4((a^{10})^6 - 1).$$
In order to see that $a^{64} - a^4$ is divisible by 5, let us consider two cases. If $5 \mid a$, then we are immediately done. Suppose then that $a$ is not divisible by 5. It follows that $n = a^{15}$ is also not divisible by 5 and so, since 5 is a prime, we get from Fermat's little theorem that $p = 5$ divides $n^{p-1} - 1 = (a^{15})^4 - 1$. Exactly the same argument shows that either $7 \mid a$ or $7 \mid (a^{10})^6 - 1$, and so $a^{64} - a^4$ is also divisible by 7. Since 5 and 7 are co-prime, $a^{64} - a^4$ is divisible by 35.
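
As a sanity check, the divisibility can be tested numerically with modular exponentiation, so that no large powers need to be computed. The helper name `divisible_by_35` is ours and the tested range is arbitrary.

```julia
# Check 35 | a^64 - a^4 by reducing both powers modulo 35.
divisible_by_35(a) = mod(powermod(a, 64, 35) - powermod(a, 4, 35), 35) == 0

println(all(divisible_by_35, 1:10_000))   # prints true
```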

Problem 5.4.2. Prove that for any odd integer $n$, we have that $n \mid \prod_{i=1}^{n} \sum_{j=0}^{i} 2^j$.
Solution. Let us first note that
$$\prod_{i=1}^{n} \sum_{j=0}^{i} 2^j = \prod_{i=1}^{n} (2^{i+1} - 1).$$
If $n$ is prime, then it follows immediately from Fermat's little theorem that $n \mid 2^{n-1} - 1$ and clearly $2 \le n - 1 \le n + 1$. The desired property holds. Let us then assume that $n$ is composite: $n = \prod_{k=1}^{s} p_k^{w_k}$, where the $p_k$'s are distinct prime numbers and the $w_k$'s are natural numbers. Since $n$ is odd, all $p_k$'s are at least 3.

Fix $k \in [s]$. To get the desired property, it is enough to show that at least $w_k$ terms in the product $\prod_{i=1}^{n} (2^{i+1} - 1)$ are divisible by $p_k$. If $w_k = 1$, then we immediately get that $p_k \mid 2^{p_k - 1} - 1$ (by Fermat's little theorem) and clearly $2 \le p_k - 1 \le n + 1$. So we need to concentrate on $w_k > 1$. In order to deal with this case, we will use the fact that for any natural number $x$, $2^{x(p_k - 1)} - 1$ is divisible by $p_k$. Indeed, using Fermat's little theorem one more time we get that
$$2^{x(p_k - 1)} - 1 \equiv 1^x - 1 = 0 \pmod{p_k}.$$
Clearly, for any $x \in \mathbb{N}$, $x(p_k - 1) \ge 2$. Hence, in order to see that the number of terms in the product that are of the form $2^{x(p_k - 1)} - 1$ is at least $w_k$, it is enough to check that $w_k(p_k - 1) \le n + 1$. But this inequality holds as
$$w_k(p_k - 1) \le (p_k - 1)^{w_k} \le p_k^{w_k} \le n \le n + 1.$$
This finishes the proof.
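
The statement is easy to test for small odd $n$. The sketch below (the function name `divides_product` is our own) multiplies the factors $2^{i+1} - 1$ modulo $n$, so all intermediate values stay small.

```julia
# Returns true when n divides the product of (2^(i+1) - 1) for i = 1, ..., n.
function divides_product(n)
    r = mod(1, n)
    for i in 1:n
        # reduce after every multiplication to avoid large numbers
        r = mod(r * (powermod(2, i + 1, n) - 1), n)
    end
    return r == 0
end

println(all(divides_product, 1:2:999))   # prints true
```

Note that the assumption that $n$ is odd cannot be dropped: all factors $2^{i+1} - 1$ are odd, so `divides_product(2)` returns `false`.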


Problem 5.4.3. Find the last two digits in the decimal representation of $7^{123}$.
Solution. Using Property 4 of Euler's totient function, we get that
$$\varphi(100) = 100\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{5}\right) = 40.$$
Hence, since 7 and 100 are co-prime, it follows from Euler's theorem that $100 \mid 7^{40} - 1$ or, equivalently, that $7^{40} \equiv 1 \pmod{100}$. Finally, since
$$7^{123} = (7^{40})^3 \cdot 7^3 \equiv 1^3 \cdot 343 = 343 \equiv 43 \pmod{100},$$
the last two digits of $7^{123}$ are 43.
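
Each step of this computation can be reproduced in Julia with functions from `Base`; the variable names below are ours.

```julia
# Euler's totient of 100, counted directly from the definition.
phi_100 = count(k -> gcd(k, 100) == 1, 1:100)

# Last two digits of 7^123 via modular exponentiation.
last_two = powermod(7, 123, 100)

println((phi_100, powermod(7, 40, 100), last_two))   # prints (40, 1, 43)
```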


Problem 5.5.1. Decide if there exists $k \in \mathbb{N}$ with the property that in the decimal representation of $2^k$ each of the 10 digits ($0, 1, 2, \ldots, 9$) is present the same number of times.
Solution. We will show that no such $k$ exists. For a contradiction, suppose that for some $k \in \mathbb{N}$, the number of times each digit appears in the decimal representation of $2^k$ is $s$ for some $s \in \mathbb{N}$. Note that the sum of these digits is equal to $s \sum_{i=0}^{9} i = 45s$. Since $45s$ is divisible by 3, and every natural number gives the same remainder as the sum of its digits when divided by 3, we get that $2^k$ is divisible by 3. But this is clearly impossible and we get the desired contradiction.
Problem 5.5.2. Find the minimum of $|20^m - 9^n|$ over all natural numbers $m$ and $n$.
Solution. Let us first note that $|20^1 - 9^1| = 11$. We will show that it is impossible to achieve smaller values. Clearly,
$$20^m - 9^n \equiv 0 - (-1)^n = \pm 1 \pmod{10},$$
so the last digit in the decimal representation of $|20^m - 9^n|$ is either 1 or 9. Hence, the only potentially possible values of $|20^m - 9^n|$ that are less than 11 are 9 or 1. We will independently rule them out.

Clearly $|20^m - 9^n| = 9$ is not possible as $20^m$ is not divisible by 9. In order to rule out the case $|20^m - 9^n| = 1$, we need to consider two sub-cases. Since $19 \mid (20^m - 1)$, it is impossible that $20^m - 1 = 9^n$. It remains to deal with the sub-case $20^m + 1 = 9^n$. Since $20^m + 1 \equiv (-1)^m + 1 \pmod{3}$, in order for $20^m + 1$ to be divisible by 3, $m$ would have to be odd. But if $m$ is odd then $21 \mid 20^m + 1$ (since $20^m + 1 \equiv (-1)^m + 1 \equiv -1 + 1 = 0 \pmod{21}$) and so also $7 \mid 20^m + 1$. It follows that it is impossible that $20^m + 1$ is equal to $9^n$ as $9^n$ is not divisible by 7.
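
A brute-force search over a finite window of exponents agrees with this answer. The window size and the function name `min_gap` are our assumptions; `BigInt` arithmetic is used because the powers quickly exceed 64 bits.

```julia
# Smallest value of |20^m - 9^n| over 1 <= m, n <= limit.
min_gap(limit) = minimum(abs(big(20)^m - big(9)^n) for m in 1:limit, n in 1:limit)

println(min_gap(30))   # prints 11
```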
Problem 5.5.3. Given $m, n, d \in \mathbb{N}$, prove that if $m^2 n + 1$ and $m n^2 + 1$ are divisible by $d$, then $m^3 + 1$ and $n^3 + 1$ are also divisible by $d$.
Solution. If $d = 1$, then the desired property trivially holds. Hence, we may assume that $d \ge 2$. Suppose that $m^2 n + 1$ and $m n^2 + 1$ are divisible by $d$. Let us note that, due to the symmetry, it is enough to show that $m^3 + 1$ is divisible by $d$. Since $d \mid m^2 n + 1$ and $d \mid m n^2 + 1$, $d$ also divides $(m^2 n + 1) - (m n^2 + 1) = mn(m - n)$. Note that $d$ is co-prime with both $m$ and $n$: any common divisor of $d$ and $m$ (or of $d$ and $n$) divides $m^2 n$ as well as $m^2 n + 1$, and so it is equal to 1. As such, we must have that $d$ divides $m - n$ and so $m^2(m - n)$ as well. It follows that $d$ divides
$$m^2(m - n) + (m^2 n + 1) = m^3 + 1,$$
as desired.
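
It is enough to test the claim for the largest admissible $d$, namely $d = \gcd(m^2 n + 1, m n^2 + 1)$, since every other common divisor divides it. A small check along these lines (the function name `check_claim` and the bound are ours):

```julia
# For all m, n <= limit, verify that gcd(m^2 n + 1, m n^2 + 1)
# divides both m^3 + 1 and n^3 + 1.
function check_claim(limit)
    for m in 1:limit, n in 1:limit
        d = gcd(m^2 * n + 1, m * n^2 + 1)
        if (m^3 + 1) % d != 0 || (n^3 + 1) % d != 0
            return false
        end
    end
    return true
end

println(check_claim(60))   # prints true
```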
Problem 5.6.1. Find all $x, y \in \mathbb{N}$ such that $2^x + 5^y$ is a square.
Solution. Suppose that $2^x + 5^y = z^2$ for some natural numbers $x$, $y$, and $z$. Let us first note that $z$ is not divisible by 5. We will split the proof into two independent cases depending on the parity of $x$.
The case when $x$ is odd is easy to deal with. If $x = 2k + 1$ for some non-negative integer $k$, then $2^x = 2^{2k+1}$ gives the remainder of 2 or 3 when divided by 5 ($2^1 \equiv 2 \pmod 5$, $2^3 = 8 \equiv 3 \pmod 5$, $2^5 = 32 \equiv 2 \pmod 5$, etc.). On the other hand, $z^2$ gives the remainder of 1 or 4 when divided by 5 as $z$ is not divisible by 5 ($1^2 \equiv 1 \pmod 5$, $2^2 \equiv 4 \pmod 5$, $3^2 = 9 \equiv 4 \pmod 5$, $4^2 = 16 \equiv 1 \pmod 5$, $6^2 \equiv 1^2 = 1 \pmod 5$, etc.). Since $z^2 \equiv 2^x \pmod 5$, it follows that there is no solution in this case.

The case when $x$ is even requires more work. If $x = 2k$ for some $k \in \mathbb{N}$, then
$$5^y = z^2 - 2^{2k} = (z + 2^k)(z - 2^k).$$
It is straightforward to see that it is impossible that both $(z + 2^k)$ and $(z - 2^k)$ are divisible by 5. It follows that the second term, namely, $z - 2^k$, is equal to 1 and so $5^y = 2 \cdot 2^k + 1$. Note that there exists one solution corresponding to $k = 1$: $x = 2$ and $y = 1$. We will show that this is the only solution.
For a contradiction, suppose that for some $y, k \in \mathbb{N} \setminus \{1\}$, we have that
$$5^y - 2 \cdot 2^k = 1 = 5 - 4,$$
or equivalently that
$$5(5^{y-1} - 1) = 4(2^{k-1} - 1).$$
In order for the right hand side to be divisible by 5, we must have that $k = 4t + 1$ for some non-negative integer $t$; that is, $2^{k-1}$ has to be of the form $16^t$. But it means that both sides are also divisible by 3, since $16^t - 1 \equiv 1^t - 1 = 0 \pmod 3$. Now, in order for the left hand side to be divisible by 3, we must have that $y = 2s + 1$ for some $s \in \mathbb{N}$; that is, $5^{y-1}$ has to be of the form $25^s$. Recall that the case $y = 1$ ($s = 0$) corresponds to a feasible solution and is excluded now. But this implies that the left hand side is divisible by 8, as $25^s - 1 \equiv 1^s - 1 = 0 \pmod 8$, whereas the right hand side is clearly not. We get the desired contradiction and the proof is finished.
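
The conclusion can be compared with an exhaustive search over small exponents. The sketch below uses `BigInt` arithmetic; the names `is_square` and `square_pairs` and the bound 40 are our own choices.

```julia
# Exact perfect-square test based on the integer square root.
function is_square(v)
    r = isqrt(v)
    return r * r == v
end

# All pairs (x, y) with 1 <= x, y <= limit for which 2^x + 5^y is a square.
square_pairs(limit) =
    [(x, y) for x in 1:limit for y in 1:limit if is_square(big(2)^x + big(5)^y)]

println(square_pairs(40))   # prints [(2, 1)]
```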
Problem 5.6.2. Prove that for any two sequences, $(x_i)_{i=1}^{2011}$ and $(y_i)_{i=1}^{2011}$, of natural numbers, $\prod_{i=1}^{2011} (2x_i^2 + 3y_i^2)$ is not a square.

Solution. Let us first note that, without loss of generality, we may assume that $\gcd(x_i, y_i) = 1$ for all $i \in [2011]$. Indeed, it is easy to see that one could factor out $\gcd^2(x_i, y_i)$ (that is clearly a square) from the $i$th term and move it in front of the product. As a result, two sequences $(x_i)_{i=1}^{2011}$ and $(y_i)_{i=1}^{2011}$ satisfy the desired property if and only if $(x_i / \gcd(x_i, y_i))_{i=1}^{2011}$ and $(y_i / \gcd(x_i, y_i))_{i=1}^{2011}$ do.

Now, assuming that $\gcd(x_i, y_i) = 1$, we will analyze the remainder of $2x_i^2 + 3y_i^2$ when divided by 3. If $3 \mid x_i$ then, by our assumption, 3 does not divide $y_i$ and so $y_i^2 \equiv 1 \pmod 3$. Indeed, if $y_i \equiv 1 \pmod 3$, then $y_i^2 \equiv 1^2 = 1 \pmod 3$, whereas if $y_i \equiv 2 \pmod 3$, then $y_i^2 \equiv 2^2 = 4 \equiv 1 \pmod 3$. It follows that $2x_i^2 + 3y_i^2 = 3(3t_i + 1)$ for some $t_i \in \mathbb{N}$. On the other hand, if 3 does not divide $x_i$ then $x_i^2 \equiv 1 \pmod 3$ and so $2x_i^2 + 3y_i^2 = 3t_i + 2$ for some $t_i \in \mathbb{N}$.

These two cases naturally define a partition of the 2011 terms of the product. Let $A$ be the subset of $[2011]$ that consists of those indices $i$ for which $3 \mid x_i$, and let $B = [2011] \setminus A$.

For a contradiction, suppose that the product $\prod_{i=1}^{2011} (2x_i^2 + 3y_i^2)$ is a square. Since 3 does not divide the term $2x_i^2 + 3y_i^2$ when $i \in B$ and each term corresponding to $i \in A$ has precisely one 3 in its unique factorization, we get that $|A|$ is even, that is, $|A| = 2s$ for some non-negative integer $s$. Hence, $|B| = 2011 - |A|$ is odd. But this means that
$$\prod_{i \in B} (2x_i^2 + 3y_i^2) \equiv \prod_{i \in B} (3t_i + 2) \equiv 2^{|B|} \equiv (-1)^{|B|} \equiv -1 \equiv 2 \pmod 3$$
gives the remainder of 2 when divided by 3. It follows that
$$\prod_{i=1}^{2011} (2x_i^2 + 3y_i^2) = 3^{2s} \prod_{i \in A} (3t_i + 1) \prod_{i \in B} (3t_i + 2) = 3^{2s}(3p + 2)$$
for some $p \in \mathbb{N}$. But it means that $3p + 2$ is a square, which is impossible as no square gives a remainder of 2 when divided by 3. This finishes the proof.
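
The argument uses only the fact that the number of terms, 2011, is odd, so the same conclusion should hold for products of one or three terms. This observation is ours rather than part of the original solution, and the sketch below (with hypothetical helper names and a small arbitrary bound) merely tests it on small values.

```julia
is_square(v) = isqrt(v)^2 == v

# All values 2x^2 + 3y^2 with 1 <= x, y <= limit.
terms(limit) = [2x^2 + 3y^2 for x in 1:limit for y in 1:limit]

# Checks that no single term and no product of three terms is a square.
function odd_products_avoid_squares(limit)
    t = terms(limit)
    singles = !any(is_square, t)
    triples = !any(is_square(a * b * c) for a in t, b in t, c in t)
    return singles && triples
end

println(odd_products_avoid_squares(5))   # prints true
```

In contrast, a product of an even number of terms can trivially be a square (take the same term twice).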
Problem 5.6.3. Consider any integer $n \ge 2$ and any subset $S$ of the set $N := \{0, 1, 2, \ldots, n - 1\}$ that has more than $\frac{3}{4} n$ elements. Prove that there exist integers $a, b, c$ such that the remainders when numbers $a$, $b$, $c$, $a + b$, $a + c$, $b + c$, $a + b + c$ are divided by $n$ are all in $S$.

Solution. Consider any subset $S$ of the set $N = \{0, 1, 2, \ldots, n - 1\}$ of size $s = |S| > \frac{3}{4} n = \frac{3}{4} |N|$.
Let us start with the following, simple but useful, observation that we will use a few times. Let $x, y, z \in N$ be such that $x < y$. Then, $z + x$ and $z + y$ yield two different remainders when divided by $n$. Indeed, if the remainders of $z + x$ and $z + y$ are equal, then $(z + y) - (z + x) = y - x$ would be divisible by $n$, which is impossible as $0 < y - x < n$.
We will select the three numbers $a, b, c$ from $S$ that satisfy the desired properties in a greedy way. Let us start with selecting any number $a \in S$, arbitrarily chosen. Because of our property (used with $z = a$), the $s$ numbers of the form $a + x$ for $x \in S$ all give distinct remainders when divided by $n$. At most $n - s < n - \frac{3}{4} n = \frac{1}{4} n$ of them are not in $S$. Since $s > \frac{3}{4} n > \frac{1}{4} n$, we can select $b \in S$ such that the remainder of $a + b$ when divided by $n$ is in $S$.

Similarly, having $a$ and $b$ fixed, we observe that there are less than $\frac{1}{4} n$ values of $x \in S$ for which the remainder of $a + x$ when dividing by $n$ is not in $S$ (property used with $z = a$), less than $\frac{1}{4} n$ values that create a problem for $b + x$ (property used with $z = b$), and less than $\frac{1}{4} n$ values not satisfying the condition for $(a + b) + x$ (property used with $z = a + b$). It follows that there are less than $\frac{3}{4} n$ values that do not satisfy some condition but we have more than $\frac{3}{4} n$ values to choose from. Hence, we are guaranteed that there exists $c \in S$ that, together with $a$ and $b$, satisfies the desired conditions.
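
The greedy selection from the solution translates directly into code. The sketch below is one possible rendering; the function name `greedy_triple` and the example set are our own assumptions. It takes the first admissible candidate at every step, which the counting argument guarantees to exist whenever $|S| > \frac{3}{4} n$.

```julia
# Greedy choice of a, b, c from S (a vector of residues modulo n),
# assuming S has more than 3n/4 elements.
function greedy_triple(S::Vector{Int}, n::Int)
    in_S(v) = mod(v, n) in S
    a = S[1]                                    # any element of S
    b = S[findfirst(x -> in_S(a + x), S)]       # a + b must land in S
    c = S[findfirst(x -> in_S(a + x) && in_S(b + x) && in_S(a + b + x), S)]
    return a, b, c
end

n = 20
S = collect(3:18)            # 16 elements, more than 3n/4 = 15
a, b, c = greedy_triple(S, n)
println(all(mod(v, n) in S for v in (a, b, c, a + b, a + c, b + c, a + b + c)))
```

Note that the problem does not require $a$, $b$, and $c$ to be distinct, and the greedy procedure may well return repeated values.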
Problem 5.7.1. Prove that if the sum of positive divisors of some natural number $n$ is odd, then either $n$ is a square or $n/2$ is a square.
Solution. Let us consider the unique factorization of $n$. That is, we write $n = \prod_{i=1}^{k} p_i^{\ell_i}$ for some sequence of prime numbers $2 \le p_1 < p_2 < \ldots < p_k$ and $\ell_i \in \mathbb{N}$ for $i \in [k]$. Note that each positive divisor of $n$ has unique representation $\prod_{i=1}^{k} p_i^{j_i}$, where $j_i \in \{0, 1, \ldots, \ell_i\}$, and two different divisors have different representations. It follows that the sum of all positive divisors of $n$ is equal to
$$S := \sum_{j_1 = 0}^{\ell_1} \cdots \sum_{j_k = 0}^{\ell_k} \prod_{i=1}^{k} p_i^{j_i} = \prod_{i=1}^{k} \sum_{j=0}^{\ell_i} p_i^{j}.$$
Since our assumption is that $S$ is odd, we get that $\sum_{j=0}^{\ell_i} p_i^j$ is odd for each $i \in [k]$.

Consider any $i \in [k]$. Suppose first that $p_i > 2$ and so it is odd. Since $\sum_{j=0}^{\ell_i} p_i^j$ is odd, the sum has $\ell_i + 1$ terms, and each term is odd, it follows that the number of terms is odd, that is, $\ell_i$ is even. In this case we get that $p_i^{\ell_i}$ is a square. On the other hand, if $p_1 = 2$, then the first term in the corresponding sum is odd ($2^0 = 1$) and the remaining terms are even. As a result, the sum is always odd and $\ell_1$ could be even or odd. If $\ell_1$ is even, then $p_1^{\ell_1}$ is a square; otherwise, $p_1^{\ell_1 - 1}$ is a square. Putting these observations together we conclude that if $n$ is odd (that is, $p_1 \ne 2$), then $n$ is a square. If $n$ is even ($p_1 = 2$), then either $n$ is a square or $n/2$ is.
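
The statement can be tested exhaustively for small $n$. The helper names below are ours, and the divisor sum is computed naively from the definition.

```julia
# Sum of positive divisors of n, straight from the definition.
sigma(n) = sum(d for d in 1:n if n % d == 0)

is_square(v) = isqrt(v)^2 == v

# If sigma(n) is odd, then n or n/2 must be a square.
claim_holds(n) = iseven(sigma(n)) || is_square(n) || (iseven(n) && is_square(n ÷ 2))

println(all(claim_holds, 1:3000))   # prints true
```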

Problem 5.7.2. Find all natural numbers $n$ for which there exist $2n$ pairwise different numbers $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n$ such that $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$ and $\prod_{i=1}^{n} a_i = \prod_{i=1}^{n} b_i$.
Solution. It is clear that there is no solution for $n = 1$. For $n = 2$, we have the following two conditions: $a_1 + a_2 = b_1 + b_2$ and $a_1 a_2 = b_1 b_2$. From the first equation we have that $a_2 = b_1 + b_2 - a_1$. After substituting this to the second equation we get that $a_1(b_1 + b_2 - a_1) - b_1 b_2 = 0$ or, equivalently, that $(a_1 - b_2)(b_1 - a_1) = 0$. It follows that $a_1 = b_2$ or $a_1 = b_1$ and so it is impossible to have a solution with pairwise different numbers.

For n = 3, 4, and 5 we have the following solutions:

(a1 , a2 , a3 ) = (2, 8, 9) and (b1 , b2 , b3 ) = (3, 4, 12)

(a1 , a2 , a3 , a4 ) = (1, 6, 7, 10) and (b1 , b2 , b3 , b4 ) = (2, 3, 5, 14)

(a1 , a2 , a3 , a4 , a5 ) = (1, 5, 7, 8, 12) and (b1 , b2 , b3 , b4 , b5 ) = (2, 3, 4, 10, 14).

We will now show that if there is a solution for $n = n_0$, then there is one for $n = n_0 + 3$ (inductive step). Since we already showed that there is a solution for $n \in \{3, 4, 5\}$ (base case), by mathematical induction we will get that there is a solution for any natural number $n \ge 3$.
In order to prove the claim, let us make two simple observations. First of all, let us note that the solution that we have for $n = 3$ ($(a_1, a_2, a_3) = (2, 8, 9)$ and $(b_1, b_2, b_3) = (3, 4, 12)$) can be easily generalized to get an infinite family of solutions. Indeed, it is obvious that for any $x \in \mathbb{N}$, $(a_1, a_2, a_3) = (2x, 8x, 9x)$ and $(b_1, b_2, b_3) = (3x, 4x, 12x)$ is also a solution to our problem. The second ingredient that we need is the fact that any two solutions that consist of non-overlapping values can be merged together to get another solution. Formally, suppose that the pair $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ is the solution for some $n \ge 3$. Then, one can take $x$ large enough such that $2x > m := \max\{a_1, \ldots, a_n, b_1, \ldots, b_n\}$ (for example, $x = m$). It follows that $(a_1, \ldots, a_n, 2x, 8x, 9x)$ and $(b_1, \ldots, b_n, 3x, 4x, 12x)$ is a solution for $n + 3$.
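
The inductive step is constructive, so it can be written down as a short function. The names `extend_solution` and `is_solution` below are hypothetical helpers introduced only for this illustration; the choice $x = m$ follows the text.

```julia
# One inductive step: from a solution for n build a solution for n + 3.
function extend_solution(a::Vector{Int}, b::Vector{Int})
    x = max(maximum(a), maximum(b))     # x = m guarantees 2x > m
    return vcat(a, [2x, 8x, 9x]), vcat(b, [3x, 4x, 12x])
end

# Equal sums, equal products, and all 2n numbers pairwise different.
is_solution(a, b) =
    sum(a) == sum(b) && prod(a) == prod(b) && allunique(vcat(a, b))

a6, b6 = extend_solution([2, 8, 9], [3, 4, 12])
println((a6, b6))   # prints ([2, 8, 9, 24, 96, 108], [3, 4, 12, 36, 48, 144])
println(is_solution(a6, b6))   # prints true
```

The products grow quickly, so repeated application with 64-bit integers would eventually overflow; for larger $n$ one would switch to `BigInt`.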

Let us make some final remarks. Finding solutions for $n \in \{3, 4, 5\}$ by hand can be tedious. However, with access to a computer, one can easily do it. Below is a short Julia script that was used to find the solutions given above.

using Combinatorics

function f(n, k)
    # x runs over n-element subsets of 1:k, y over n-element subsets of the rest
    for x in combinations(1:k, n),
        y in combinations(setdiff(1:k, x), n)
        if x < y # avoid printing duplicates
            if sum(x) == sum(y) && prod(x) == prod(y)
                println((x, y))
            end
        end
    end
end
and now we can run it to get the desired solutions:

julia> f(3, 12)
([2, 8, 9], [3, 4, 12])
([3, 8, 10], [4, 5, 12])
([4, 9, 10], [5, 6, 12])

julia> f(4, 14)
([1, 6, 7, 10], [2, 3, 5, 14])
([1, 7, 8, 9], [2, 3, 6, 14])
([2, 7, 9, 10], [3, 5, 6, 14])
([3, 7, 10, 12], [4, 5, 9, 14])
([4, 7, 10, 12], [5, 6, 8, 14])

julia> f(5, 14)
([1, 5, 7, 8, 12], [2, 3, 4, 10, 14])
Finally, note that the printed solutions show that actually x = 12, x = 14 and x = 14 are minimal
ranges of the sets of the form [x] that produce the solutions for n equal to, respectively, 3, 4 and 5.
Problem 5.7.3. Call a natural number white if it is equal to 1 or is a product of an even number of prime
numbers; otherwise, call it black. Is there an integer for which the sum of its white divisors is equal to
the sum of its black divisors?
Solution. Let $x$ be any natural number. Let $W(x)$ be the set of white divisors of $x$, and let $B(x)$ be the set of black divisors of $x$. Finally, let
$$D(x) := \sum_{w_x \in W(x)} w_x - \sum_{b_x \in B(x)} b_x.$$

We will show that D(x) ≠ 0 which gives a negative answer to the question; that is, there is no integer
for which the sum of its white divisors is equal to the sum of its black divisors.
Let us start with proving the following useful property. For any $p$ and $q$ that are co-prime, we have that
$$D(p \cdot q) = D(p) \cdot D(q). \tag{8.13}$$
Indeed, note that
$$\begin{aligned}
D(p) \cdot D(q) &= \Big( \sum_{w_p \in W(p)} w_p - \sum_{b_p \in B(p)} b_p \Big) \Big( \sum_{w_q \in W(q)} w_q - \sum_{b_q \in B(q)} b_q \Big) \\
&= \Big( \sum_{w_p \in W(p)} w_p \sum_{w_q \in W(q)} w_q + \sum_{b_p \in B(p)} b_p \sum_{b_q \in B(q)} b_q \Big) \\
&\quad - \Big( \sum_{w_p \in W(p)} w_p \sum_{b_q \in B(q)} b_q + \sum_{b_p \in B(p)} b_p \sum_{w_q \in W(q)} w_q \Big).
\end{aligned}$$
Note also that $w_p w_q$ and $b_p b_q$ are white divisors of $p \cdot q$ (as both the sum of two even numbers and the sum of two odd numbers is even), and $w_p b_q$ and $b_p w_q$ are black divisors of $p \cdot q$ (as the sum of an even and an odd number is odd). Also, in the expression above, all divisors of $p \cdot q$ are present exactly once since $p$ and $q$ are co-prime. This shows that, indeed, (8.13) holds.
Let us now come back to our task of showing that $D(x) \ne 0$. Let $x = \prod_{i=1}^{t} p_i^{s_i}$ be the unique prime factorization of $x$: $2 \le p_1 < \ldots < p_t$, $s_i \in \mathbb{N}$ for $i \in [t]$. Using (8.13), we get that
$$D(x) = D\Big( \prod_{i=1}^{t} p_i^{s_i} \Big) = \prod_{i=1}^{t} D(p_i^{s_i}).$$
Finally, note that all positive divisors of $p_i^{s_i}$ have the form $p_i^k$ for $0 \le k \le s_i$. Moreover, the corresponding divisor is white if and only if $k$ is even. It follows that
$$D(p_i^{s_i}) = \sum_{j=0}^{\lfloor s_i/2 \rfloor} p_i^{2j} - \sum_{j=0}^{\lfloor (s_i - 1)/2 \rfloor} p_i^{2j+1} = \sum_{k=0}^{s_i} (-p_i)^k = \frac{1 - (-p_i)^{s_i + 1}}{1 + p_i} \ne 0.$$
As a result, $D(x) \ne 0$ and the proof is complete.
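
Both the multiplicativity property (8.13) and the final conclusion are easy to test numerically. In the sketch below the helper names `bigomega` and `is_white` are ours; prime factors are counted with multiplicity, in line with the observation that $p_i^k$ is white if and only if $k$ is even.

```julia
# Number of prime factors of x counted with multiplicity (trial division).
function bigomega(x)
    k = 0
    p = 2
    while p * p <= x
        while x % p == 0
            x ÷= p
            k += 1
        end
        p += 1
    end
    return x > 1 ? k + 1 : k
end

is_white(d) = iseven(bigomega(d))   # d = 1 has no prime factors, so it is white

# Sum of white divisors minus sum of black divisors.
D(x) = sum(is_white(d) ? d : -d for d in 1:x if x % d == 0)

println((D(4), D(3), D(12)))                 # prints (3, -2, -6)
println(all(D(x) != 0 for x in 1:1500))      # prints true
```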


Problem 5.8.1. Find all natural solutions of the following equation: $x^4 + y = x^3 + y^2$.
Solution. We can re-write the equation as $x^4 - x^3 = y^2 - y$. In order to aggregate similar factors, we multiply both sides by 4 and then add 1 to get that
$$(2y - 1)^2 = (2x^2 - x)^2 - (x^2 - 1).$$
We will show that $x = 1$. For a contradiction, suppose that $x \ge 2$. It follows that
$$(2y - 1)^2 \le (2x^2 - x)^2 - (2^2 - 1) < (2x^2 - x)^2.$$
On the other hand, note that
$$\begin{aligned}
((2x^2 - x) - 1)^2 &= (2x^2 - x)^2 - 2(2x^2 - x) + 1 \\
&= (2x^2 - x)^2 - (x^2 - 1) - x(3x - 2) \\
&< (2x^2 - x)^2 - (x^2 - 1) = (2y - 1)^2.
\end{aligned}$$
Combining the two observations together, we get that
$$((2x^2 - x) - 1)^2 < (2y - 1)^2 < (2x^2 - x)^2.$$
But there is no natural number such that its square is between squares of two consecutive natural numbers and so we get the desired contradiction. It follows that $x = 1$.
If $x = 1$, then we get that $(2y - 1)^2 = 1^2$ which implies that $y = 1$. Therefore, the only solution of our equation in natural numbers is $(x, y) = (1, 1)$.
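
The identity $(2y - 1)^2 = 4x^4 - 4x^3 + 1$ also gives a convenient computational test: for a given $x$, a natural solution $y$ exists only if $4x^4 - 4x^3 + 1$ is a perfect square. The sketch below (with our own function name and bound) scans a range of $x$ in this way.

```julia
# For each x, look for y with x^4 + y = x^3 + y^2 using
# (2y - 1)^2 = 4x^4 - 4x^3 + 1.
function solutions(limit)
    found = Tuple{Int,Int}[]
    for x in 1:limit
        disc = 4x^4 - 4x^3 + 1      # always odd, so its root (if any) is odd
        r = isqrt(disc)
        if r * r == disc
            push!(found, (x, (r + 1) ÷ 2))
        end
    end
    return found
end

println(solutions(10_000))   # prints [(1, 1)]
```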


Problem 5.8.2. Find all pairs of natural numbers that satisfy $(x - y)^n = xy$.
Solution. We will show that there is no solution when $n = 1$ or $n = 2$. For $n = 1$, for any $x, y \in \mathbb{N}$, we have $x - y < x \le xy$ and so there are no natural numbers that satisfy $x - y = xy$.
For $n = 2$, for a contradiction, suppose that $x^2 + y^2 = 3xy$ for some $x, y \in \mathbb{N}$. Moreover, let us assume that this is a smallest example (in terms of variable $x$), that is, there is no other pair $x', y' \in \mathbb{N}$ that satisfies the desired equality with $x' < x$. Suppose first that both $x$ and $y$ are divisible by 3. It is easy to see that $x' = x/3 \in \mathbb{N}$ and $y' = y/3 \in \mathbb{N}$ also form a solution, which contradicts our assumption. Similarly, it is not possible that one of the numbers is divisible by 3 and the other is not, as then the left hand side is not divisible by 3 while the right hand side is. Finally, if both $x$ and $y$ are not divisible by 3, then $x^2 + y^2$ gives the remainder of 2 when divided by 3 whereas the right hand side is clearly divisible by 3. All cases lead to a contradiction and so there is no solution when $n = 2$.

For $n \ge 3$, let $z := x - y$. Note that $(x - y)^n = xy \ge 1$ and so $x > y$ if $n$ is odd. If $n$ is even, due to the symmetry, we may assume that $x > y$ and potential solutions will come in pairs; that is, if $(x, y) = (x_0, y_0)$ is a solution, then so is $(x, y) = (y_0, x_0)$. Either way, we may assume that $z \ge 1$. After substitution, our equation becomes $z^n = y^2 + zy$. Now, multiply both sides by 4 and add $z^2$ to both sides to get that
$$z^2(4z^{n-2} + 1) = (2y + z)^2.$$
Note that the right hand side of this equation is a square. Since $z^2$ is a square, it follows that $4z^{n-2} + 1$ is also a square. As $4z^{n-2} + 1$ is clearly an odd number, it must be a square of an odd natural number, that is, $4z^{n-2} + 1 = (2t + 1)^2$ for some $t \in \mathbb{N}$. It follows that $z^{n-2} = t(t + 1)$. Since $t$ and $t + 1$ are co-prime, $t = a^{n-2}$ and $t + 1 = b^{n-2}$ for some $a, b \in \mathbb{N}$. We get that $b^{n-2} - a^{n-2} = (t + 1) - t = 1$. If $n \ge 4$, we get that $b^{n-2} - a^{n-2} > b - a \ge 1$ and so it must be the case that $n = 3$ and then also $x > y$. Since $z = t(t + 1)$ and $z^3 = y^2 + zy$, after substitution we get that
$$(t^2(t + 1))(t(t + 1)^2) = t^3(t + 1)^3 = y^2 + t(t + 1)y = y^2 + (t(t + 1)^2 - t^2(t + 1))y$$
or, alternatively, that
$$(y - t^2(t + 1))(y + t(t + 1)^2) = 0.$$
Since both $y$ and $t$ are at least 1 (both are natural numbers), we get that $y = t^2(t + 1)$. Then $x = z + y = t(t + 1) + t^2(t + 1) = (t + 1)^2 t$ for some $t \in \mathbb{N}$.

Finally, we have to check that $y = t^2(t + 1)$ and $x = (t + 1)^2 t$ do, in fact, yield a solution. We get $xy = t^3(t + 1)^3$ and $(x - y)^3 = (t(t + 1))^3$ and so both sides are equal.
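
The description of all solutions can be compared with a brute-force search. In the sketch below, `brute` and `family` are hypothetical helper names and the bounds are arbitrary; for $n = 3$ the search should return exactly the pairs $((t + 1)^2 t, t^2(t + 1))$ that fit in the search window.

```julia
# All pairs (x, y) with 1 <= x, y <= limit satisfying (x - y)^n = x*y.
brute(n, limit) = [(x, y) for x in 1:limit for y in 1:limit if (x - y)^n == x * y]

# The family x = t(t+1)^2, y = t^2(t+1) found in the solution (n = 3).
family(tmax) = [(t * (t + 1)^2, t^2 * (t + 1)) for t in 1:tmax]

println(brute(3, 300) == family(6))                     # prints true
println(isempty(brute(2, 100)), isempty(brute(4, 100))) # prints truetrue
```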

Problem 5.8.3. Find all natural numbers satisfying the following system of equations:

a + b + c = xyz,

x + y + z = abc,

and such that a ≥ b ≥ c ≥ 1 and x ≥ y ≥ z ≥ 1.


Solution. Adding the two equations together we get

a + b + c + x + y + z = abc + xyz .

Observe that

abc − (a + b + c) = c(ab − 1) − a − b

= c(ab − 1) − a − b + ab + 1 − (ab − 1) − 2

= (c − 1)(ab − 1) + (a − 1)(b − 1) − 2.

Similarly,

xyz − (x + y + z ) = (z − 1)(xy − 1) + (x − 1)(y − 1) − 2 .

It follows that our equation can be rewritten as follows:

(c − 1)(ab − 1) + (a − 1)(b − 1) + (z − 1)(xy − 1) + (x − 1)(y − 1 ) = 4.

Let us note that all four terms on the left hand side are non-negative.
Now, observe that if $c \ge 2$ (and so $a$ and $b$ are also at least 2), then

(c − 1)(ab − 1) + (a − 1)(b − 1) ≥ 4,

with equality only for $a = b = c = 2$. This implies that $a = b = c = 2$ and that $(z − 1)(xy − 1) + (x − 1)(y − 1) = 0$, and so $y = z = 1$. The second equation of the original system then gives $x + 2 = abc = 8$, that is, $x = 6$, and the first equation is satisfied as well ($2 + 2 + 2 = 6 \cdot 1 \cdot 1$). We get the solution $(a, b, c, x, y, z) = (2, 2, 2, 6, 1, 1)$ and, by symmetry, the solution $(6, 1, 1, 2, 2, 2)$ corresponding to $z \ge 2$. From now on we may assume that $c = z = 1$.
If $c = z = 1$, then our equation reduces to

(a − 1)(b − 1) + (x − 1)(y − 1) = 4.

Now, if $b = 1$, then we have $(x − 1)(y − 1) = 4$ and so $(x, y) = (3, 3)$ or $(x, y) = (5, 2)$. If $(x, y) = (3, 3)$, then $xyz = 9$ and $x + y + z = 7$. This means that $a + 2 = 9$ and $a = abc = 7$, and we get the solution $(7, 1, 1, 3, 3, 1)$. If $(x, y) = (5, 2)$, then $xyz = 10$ and $x + y + z = 8$. This means that $a + 2 = 10$ and $a = abc = 8$, and we get the solution $(8, 1, 1, 5, 2, 1)$. By symmetry, the case $y = 1$ yields the solutions $(3, 3, 1, 7, 1, 1)$ and $(5, 2, 1, 8, 1, 1)$. It remains to consider the case $b \ge 2$ and $y \ge 2$.

Now if $b \ge 3$ and $y \ge 2$, then

(a − 1)(b − 1) + (x − 1)(y − 1) ≥ (3 − 1)(3 − 1) + (2 − 1)(2 − 1) = 5,

which is impossible. By symmetry, it is also not possible that $y \ge 3$ and $b \ge 2$.

We are left with only one possibility: $b = y = 2$. Then, we have $a + x = 6$. Additionally, from the $a + b + c = xyz$ condition we also get that $a + 3 = 2x$, and so $3x = 9$. As a result, we get that $a = x = 3$, which gives the solution $(3, 2, 1, 3, 2, 1)$.

We conclude that there are exactly seven solutions $(a, b, c, x, y, z)$: $(3, 2, 1, 3, 2, 1)$, $(2, 2, 2, 6, 1, 1)$, $(6, 1, 1, 2, 2, 2)$, $(7, 1, 1, 3, 3, 1)$, $(3, 3, 1, 7, 1, 1)$, $(8, 1, 1, 5, 2, 1)$, and $(5, 2, 1, 8, 1, 1)$. We directly check that, indeed, each of them satisfies our system of equations.
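
A case analysis of this kind is easy to get wrong, so it is worth cross-checking it against an exhaustive search over small values. The sketch below (the function name and the bound are our own choices) reports every tuple that satisfies both equations together with the ordering constraints.

```julia
# All (a, b, c, x, y, z) with bound >= a >= b >= c >= 1, bound >= x >= y >= z >= 1,
# a + b + c = x*y*z and x + y + z = a*b*c.
function solutions(bound)
    found = NTuple{6,Int}[]
    for a in 1:bound, b in 1:a, c in 1:b, x in 1:bound, y in 1:x, z in 1:y
        if a + b + c == x * y * z && x + y + z == a * b * c
            push!(found, (a, b, c, x, y, z))
        end
    end
    return found
end

foreach(println, solutions(12))
```

With `bound = 12` the search reports seven tuples, among them (3, 2, 1, 3, 2, 1); since the identity derived above forces every variable to be small, enlarging the bound does not produce any further ones.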

8.6 Geometry
Problem 6.1.1. We are given an acute triangle ABC with ∢ACB = π/3. Let A′ be the orthogonal projection of A on BC, let B′ be the orthogonal projection of B on AC, and let M be the middle point of line segment AB. Prove that |A′B′| = |A′M| = |B′M|.

Solution. Since A′AC is a right triangle, ∢B′AA′ = ∢CAA′ = π/2 − π/3 = π/6. Since AA′B and AB′B are right triangles, points A, B, A′, and B′ lie on a circle whose center is M. But this means that ∢B′MA′ = 2∢B′AA′ = 2(π/6) = π/3. Since |B′M| = |A′M|, we get that the triangle A′B′M is equilateral, which finishes the proof.


Problem 6.1.2. Consider a square ABCD. Choose point P outside of this square such that ∢CPB is the right angle. Denote by Q the intersection of AC and BD. Prove that ∢QPC = ∢QPB.
Solution. Since ∢BQC and ∢BPC are both right angles, points Q, B, C, and P lie on a circle. Since |QB| = |QC|, we get that ∢QPB = ∢QPC, as required.

Problem 6.1.3. Point O is the center of a circumcircle of a triangle ABC. Point C′ is the orthogonal projection of C on AB. Prove that ∢ACC′ = ∢OCB.

Solution. Let us first note that ∢OCB = π/2 − ∢COB/2 = π/2 − ∢CAB. On the other hand, since triangle ACC′ is a right triangle, ∢ACC′ = π/2 − ∢CAC′ = π/2 − ∢CAB. It follows that ∢ACC′ = ∢OCB.

Problem 6.2.1. Suppose that points P and Q lie on sides BC and CD of a square ABCD such that ∢PAQ = π/4. Prove that |BP| + |DQ| = |PQ|.

Solution. Consider point R inside the square such that |AR| = |AB| = |AD| and ∢BAP = ∢PAR. Note that R lies inside the angle ∢PAQ. After considering congruent triangles BAP and PAR, we get that |BP| = |PR|. Now, notice that ∢BAP + ∢DAQ = π/2 − π/4 = π/4. Using this we have ∢QAR = π/4 − ∢PAR = π/4 − ∢BAP = π/4 − (π/4 − ∢DAQ) = ∢DAQ. It follows that triangles DAQ and RAQ are congruent, and so |DQ| = |QR|.

It is left to show that R lies on the line segment PQ, as then we will conclude that |BP| + |DQ| = |PR| + |QR| = |PQ|. But ∢QRA = ∢QDA = π/2. Similarly, ∢PRA = ∢PBA = π/2, and so ∢PRQ = π, as required.
Problem 6.2.2. Point P lies on a diagonal AC of a square ABCD. Points Q and R are the orthogonal projections of P on lines CD and DA, respectively. Prove that |BP| = |RQ|.
Solution. Since RPQD is a rectangle, |RQ| = |PD|. Since ∢DCP = ∢BCP (= (π/2)/2 = π/4) and |DC| = |BC|, triangles PDC and PBC are congruent. It follows that |PB| = |PD| = |RQ|, and the proof is finished.
Problem 6.2.3. Consider an acute triangle ABC where ∢ACB = π/4. Point B′ is the orthogonal projection of B on AC and point A′ is the orthogonal projection of A on BC. Let H be the intersection point of AA′ and BB′. Prove that |CH| = |AB|.

Solution. Since triangle BB′C is a right triangle and ∢B′CB = π/4, we get that |BB′| = |CB′|. Since ∢CB′H = ∢CA′H = π/2, points H, B′, C, and A′ lie on a circle. It follows that ∢B′CH = ∢B′A′H. Similarly, since ∢AB′B = ∢AA′B = π/2, points A, B′, A′, and B lie on a circle. It follows that ∢B′A′H = ∢B′BA. As a result, we get that triangles BB′A and HB′C are congruent, and so |AB| = |HC|, as desired.


Problem 6.3.1. You are given a rectangle that can be covered with n disks of radius r. Prove that it can
be also covered by 4n disks of radius r/2.
Solution. Clearly, if we scale the rectangle down by a factor of 2, then it can be covered by n disks of
radius r/2. Since we can put 4 such rectangles together to recreate the original rectangle, it is enough
to use 4n disks to do the desired covering.
Problem 6.3.2. You are given an acute triangle ABC. Let B′ be the projection of B on AC and C′ be the projection of C on AB. Show that ABC and AB′C′ are similar.

Solution. Since ∢BB′C = ∢BC′C = π/2, points B, C, B′, and C′ lie on a circle. Thus, ∢AB′C′ = π/2 − ∢C′B′B = π/2 − ∢C′CB = ∢C′BC. This means that also ∢AC′B′ = ∢BCB′. It follows that triangles ABC and AB′C′ are similar.
Problem 6.3.3. Consider two circles, $o_1$ and $o_2$, that intersect at two points, A and B. Let P be a point on $o_1$ such that AP goes through the center of $o_1$ and Q be a point on $o_2$ such that AQ goes through the center of $o_2$. Prove that if ∢PAQ = π/2, then $|PB|/|BQ| = [o_1]/[o_2]$, where [x] denotes the area of figure x.
Solution. Let us first note that the centers of $o_1$ and $o_2$ cannot lie inside the other circle as then ∢PAQ could not be equal to π/2. Note then that ∢ABQ = ∢ABP = π/2, and so Q, B and P are collinear. It follows that AQP is a right triangle and B is an orthogonal projection of A on PQ. So |QB|/|AB| = |QA|/|AP|. Similarly |PB|/|AB| = |PA|/|AQ|. We conclude that $|PB|/|BQ| = |PA|^2/|AQ|^2 = [o_1]/[o_2]$, and the proof is finished.

Alternatively, for the last step one could use the power of the point property we introduce in Section 6.6. Using it one gets that $|AQ|^2 = |QB||QP|$ and that $|AP|^2 = |PB||PQ|$, and so $[o_1]/[o_2] = |AP|^2/|AQ|^2 = |PB|/|QB|$.
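
The configuration is easy to reproduce in coordinates, which gives a quick numerical check of the claimed ratio. In the sketch below we place A at the origin, P on the x-axis and Q on the y-axis (so that ∢PAQ = π/2); the coordinates, the radii `r1` and `r2`, and the function name are assumptions made only for this illustration.

```julia
# Euclidean distance between two points given as vectors.
dist(u, v) = sqrt(sum(abs2, u .- v))

function check_ratio(r1, r2)
    A = [0.0, 0.0]
    P = [2 * r1, 0.0]          # AP is a diameter of o1 (center (r1, 0))
    Q = [0.0, 2 * r2]          # AQ is a diameter of o2 (center (0, r2))
    # B is the orthogonal projection of A on the line PQ
    t = sum((A .- P) .* (Q .- P)) / sum(abs2, Q .- P)
    B = P .+ t .* (Q .- P)
    on_o1 = isapprox(dist(B, [r1, 0.0]), r1)   # B lies on o1
    on_o2 = isapprox(dist(B, [0.0, r2]), r2)   # B lies on o2
    ratio_ok = isapprox(dist(P, B) / dist(B, Q), (pi * r1^2) / (pi * r2^2))
    return on_o1 && on_o2 && ratio_ok
end

println(check_ratio(1.0, 2.0) && check_ratio(3.5, 0.7))   # prints true
```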

Problem 6.4.1. Points D, E, and F lie on sides BC, CA, and AB of a triangle ABC in such a way that lines AD, BE, and CF intersect in a single point P. Prove that |AF|/|FB| + |AE|/|EC| = |AP|/|PD|.

Solution. After applying Menelaus's theorem to triangle ABD and line FP, we get that
$$\frac{|BC|}{|DC|} \cdot \frac{|DP|}{|PA|} \cdot \frac{|AF|}{|FB|} = 1,$$
and so
$$\frac{|AF|}{|FB|} = \frac{|DC|}{|BC|} \cdot \frac{|PA|}{|DP|}.$$
Similarly, applying it to triangle ACD and line EP we get
$$\frac{|CB|}{|DB|} \cdot \frac{|DP|}{|PA|} \cdot \frac{|AE|}{|EC|} = 1,$$
and so
$$\frac{|AE|}{|EC|} = \frac{|DB|}{|CB|} \cdot \frac{|PA|}{|DP|}.$$
It follows that
$$\frac{|AF|}{|FB|} + \frac{|AE|}{|EC|} = \frac{|DC|}{|BC|} \cdot \frac{|PA|}{|DP|} + \frac{|DB|}{|CB|} \cdot \frac{|PA|}{|DP|} = \frac{|DC| + |DB|}{|BC|} \cdot \frac{|PA|}{|DP|} = \frac{|PA|}{|DP|},$$
as required.
Problem 6.4.2. You are given a triangle ABC where ∢ACB = π/2. On side AC build a square ACGH, externally to the triangle. Similarly, on side BC build a square CBEF, externally to the triangle. Show that the point of intersection of AE and BH lies on the line orthogonal to AB that goes through point C.
Solution. Let A′ be the intersection point of AE and BC, B′ be the intersection point of BH and AC, and C′ be the orthogonal projection of C on AB. Since AC is parallel to BE, |CA′|/|A′B| = |AC|/|BE| = |AC|/|BC|. Similarly, we argue that |AB′|/|B′C| = |AC|/|BC|. Let us now observe that triangles AC′C and BC′C are similar, and so |AC′|/|CC′| = |CC′|/|BC′|. It follows that $|BC′|/|C′A| = (|BC′|/|CC′|)^2$. But BC′C and ACB are similar, and so $|BC′|/|C′A| = (|CB|/|AC|)^2$. We get that |CA′|/|A′B| ⋅ |BC′|/|C′A| ⋅ |AB′|/|B′C| = 1. Using Ceva's theorem, we conclude that lines AA′, BB′, and CC′ intersect in one point, which finishes the proof.
Problem 6.4.3. You are given a convex quadrilateral ABCD and a line that intersects lines DA, AB, BC, and CD in points K, L, M, and N, respectively. Prove that |DK| ⋅ |AL| ⋅ |BM| ⋅ |CN| = |AK| ⋅ |BL| ⋅ |CM| ⋅ |DN|.

Solution. Let us add an auxiliary line BD to the plot. Let X be the intersection point of line BD with the given line going through K, L, M, and N. Applying Menelaus's theorem twice, the first time to triangle ABD, and the second time to triangle BDC, we get that
$$\frac{|AL|}{|LB|} \cdot \frac{|BX|}{|XD|} \cdot \frac{|DK|}{|AK|} = 1,$$
and, respectively, that
$$\frac{|BM|}{|MC|} \cdot \frac{|CN|}{|ND|} \cdot \frac{|DX|}{|XB|} = 1.$$
We get the desired result after multiplying these two equations together.
Problem 6.5.1. Consider a quadrilateral ABCD. Prove that the sum of distances from any point P
inside this quadrilateral to the lines AB, BC , CD, and DA is constant (that is, does not depend on the
choice of P) if and only if ABCD is a parallelogram.
Solution. Let us consider any two half-lines, ℓ and ℓ , that are not parallel and have a common origin,
1 2

point A. We will show that for any two points B ∈ ℓ and C ∈ ℓ with |AB| = |AC| the following
1 2

property holds: all points lying on the line segment BC have the same total distance to lines ℓ and ℓ .
1 2

To see this, let us consider any point P on the line segment BC . Clearly, [ABC] = [ABP ] + [ACP ],
where [x] denotes the area of gure x. Since |AB| = |AC|, we immediately get that the sum of heights
of the two triangles ABP and ACP , projected from P to AB and, respectively, from P to AC is
constant (namely, equal to 2[ABC]/|AB| = 2[ABC]/|AC|). From this argument, we immediately get
the following important observation. Any point P lies on the unique line segment BC de ned as above;
in particular |AB| = |AC|. More importantly, for any two points Pi ( i ∈ {1, 2}) and the associated
line segments B C , the total distances from Pi to ℓ and ℓ are equal if and only if the two
i i 1 2

corresponding line segments B C and B C are identical. Moreover, if we extend half-lines ℓ and ℓ
1 1 2 2 1 2

to lines, then the set of all points having the same distance from these two lines forms a rectangle with
point A being the intersection of its diagonals. All points inside this rectangle have sums of the
distances from these two lines strictly smaller than the points on this rectangle.
Let us now go back to our problem. Clearly, if ABCD is a parallelogram, then the desired property
holds. Suppose then that ABCD is not a parallelogram. Our goal is to show that there are two points
inside of ABCD with different sums of distances.
Let us first deal with convex quadrilaterals. Without loss of generality, we may assume that AB and
CD are not parallel. Select any point P inside of ABCD. From the observation above it follows that
the set of points that have the same total distance as P from the two lines containing the line segments AB and
CD forms some rectangle R. We will independently consider the following two cases.

Case 1: AD and BC are parallel. Note that the sum of distances from AD and BC for all points inside
of ABCD is the same. Thus, we may select any point P′ not lying on rectangle R but lying inside of
ABCD (note that such a point always exists) to conclude that its total distance from the sides of ABCD
is different from the total distance of P.


Case 2: AD and BC are not parallel. Let S be the rectangle formed by all points that have the same
total distance to AD and BC as point P. Clearly, P ∈ R ∩ S. Note that both R and S are non-
degenerate as ABCD is convex. (Recall that a single point can be viewed as a degenerate rectangle.) It
is easy to see that the intersection of interiors of R and S is non-empty. Let us select any point P′ from
this intersection that also lies inside of ABCD (since P lies inside of ABCD we may always select P′
close enough to P). The sum of distances of P′ from the sides of ABCD is smaller than the
corresponding sum for P, and we are done.
Finally, let us handle the case when ABCD is not convex; that is, one of the vertices lies in the
triangle formed by the remaining ones. Without loss of generality, we may assume that A lies inside a
triangle BCD. Let B′ be the intersection of line AB and line CD. Note that B′ lies on the line segment
DC. Similarly, let D′ be the intersection of lines DA and BC. Again, note that D′ lies on the line
segment CB. Observe now that AB′CD′ is defined by the same lines as ABCD, but is convex; in
particular, AB′CD′ is not a parallelogram as ABCD is not. By the previous argument, we get that
there are points inside of AB′CD′ (and so also inside of ABCD) with different sums of the distances
from the sides. This finishes the proof.


Problem 6.5.2. Consider a triangle ABC such that |AB| = |AC| (that is, an isosceles triangle), AD is
the height of this triangle, and E is in the middle of AD. Let F be the orthogonal projection of D on
BE . Prove that ∢AF C = π/2.

Solution. Let us introduce an auxiliary point X such that ADCX forms a rectangle. Note that
2|DE| = |CX| and 2|BD| = |BC|, so B, F, E, and X lie on the same line; in particular,
∢DFX = ∢DFE = π/2. Since also ∢DAX = π/2, it follows that points D, F, A, and X lie on some circle. But C lies on the
circle on which A, D, and X lie. It follows that they all lie on the same circle and so
∢CFA = ∢CDA = π/2.
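Let us confirm the right angle at F numerically. The sketch below is an illustration under assumed coordinates (D at the origin, BC on the horizontal axis, A on the vertical axis); the function names are hypothetical. It returns the dot product of the vectors FA and FC, which should vanish when ∢AFC = π/2.

```python
def project(p, a, b):
    """Orthogonal projection of point p onto the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def dot_fa_fc(half_base, height):
    """Dot product of FA and FC for the isosceles triangle with |BD| = |DC| = half_base."""
    A = (0.0, height)
    B = (-half_base, 0.0)
    C = (half_base, 0.0)
    D = (0.0, 0.0)                 # foot of the height from A
    E = (0.0, height / 2.0)        # midpoint of AD
    F = project(D, B, E)           # orthogonal projection of D on BE
    return (A[0] - F[0]) * (C[0] - F[0]) + (A[1] - F[1]) * (C[1] - F[1])

# Should be zero up to rounding error.
print(dot_fa_fc(3.0, 2.0))
```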

Problem 6.5.3. Consider a triangle ABC. Outside of the triangle, on sides AB and AC, we build
squares ABDE and, respectively, ACFG. Let M and N be the middle points of DG and, respectively,
EF. What are the possible values of the ratio |MN|/|BC|?

Solution. Let us add an auxiliary point P such that EAGP is a parallelogram; in particular, P E and
GA are parallel and have equal length. On the other hand, since ACF G is a square, F C and GA are

also parallel and have equal length. It follows that CF P E is a parallelogram. But this means that N lies
in the middle of the line segment P C as the diagonals of a parallelogram intersect in their middles.
Using the same argument we get that GP DB is a parallelogram and M lies in the middle of the line
segment BP. It follows that |PN|/|PC| = |PM|/|PB| = 1/2, which means that triangles PBC
and PMN are similar and thus |NM|/|CB| = 1/2. Hence, this is the only possible ratio.
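The ratio 1/2 can also be observed numerically. The following sketch builds the two outer squares for a triangle given by coordinates and returns |MN|/|BC|; the helper names and the way the outward direction is chosen (rotating each side away from the third vertex) are assumptions made for this illustration.

```python
import math

def rot90(v, sign):
    """Rotate v by +90 degrees (sign = 1) or by -90 degrees (sign = -1)."""
    return (-sign * v[1], sign * v[0])

def side(p, a, b):
    """+1 if p lies to the left of the directed line a -> b, otherwise -1."""
    c = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return 1 if c > 0 else -1

def ratio_mn_bc(A, B, C):
    u = (B[0] - A[0], B[1] - A[1])
    w = (C[0] - A[0], C[1] - A[1])
    # Rotate each side away from the third vertex so the squares lie outside.
    r1 = rot90(u, -side(C, A, B))            # offset for square ABDE
    r2 = rot90(w, -side(B, A, C))            # offset for square ACFG
    D = (B[0] + r1[0], B[1] + r1[1])
    E = (A[0] + r1[0], A[1] + r1[1])
    F = (C[0] + r2[0], C[1] + r2[1])
    G = (A[0] + r2[0], A[1] + r2[1])
    M = ((D[0] + G[0]) / 2, (D[1] + G[1]) / 2)   # middle of DG
    N = ((E[0] + F[0]) / 2, (E[1] + F[1]) / 2)   # middle of EF
    return math.dist(M, N) / math.dist(B, C)

print(ratio_mn_bc((0.0, 0.0), (4.0, 0.0), (1.0, 3.0)))
```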
Problem 6.6.1. Two circles intersect in points A and B. Point P is selected on line AB outside of the
circles. Points C and D are locations where tangent lines going through point P touch both circles.
Prove that ∢P CD = ∢P DC .
Solution. Let us first note that points C and D are not uniquely defined (there are two possible
locations). However, regardless of their location, |PC|² = |PA| ⋅ |PB| = |PD|². It follows that PCD
is an isosceles triangle; in particular, ∢PCD = ∢PDC.


Problem 6.6.2. Consider a convex hexagon ABCDEF such that |AB| = |BC|, |CD| = |DE|, and
|EF | = |F A|. Prove that lines containing altitudes of triangles BCD, DEF , and F AB from vertices

C, E, and A, respectively, intersect in one point.


Solution. Consider a circle k1 with center in D and radius |DE| = |DC| and circle k2 with center in F
and radius |FE| = |FA|. These circles intersect in point E and some other point E′ (such a point must
exist as ∢FED ≠ π and so |FD| < |FE| + |ED|). Let us now note that line EE′ coincides with the
altitude of a triangle FED as DEFE′ is a kite (|FE| = |FE′| and |DE| = |DE′|). The crucial
observation now is the fact that all points lying on a line going through E and E′ have the same power
with respect to circles k1 and k2, as can be seen by calculating this power along line EE′.

Let us now consider a circle k1 and a circle k3 with center in B and radius |BA| = |BC|. As before,
we define point C′ that is a second intersection (the first one is C) of k1 and k3. We note that CC′
contains the altitude of BCD, and all points on the line CC′ have the same power with respect to
circles k1 and k3. Let us now observe that it is not possible that EE′ and CC′ are parallel as then
∢FDB would have to be 0, which is not the case. Therefore, lines EE′ and CC′ have a unique
intersection point Z.
It follows that the power of point Z with respect to circles k2 and k3 is the same. Let us draw a line
going through Z and A. Because of the above fact, it must also go through point A′ that is the other
intersection point of circles k2 and k3 (the first one is A). We conclude that AA′ contains the altitude of
ABF, which finishes the proof.


Problem 6.6.3. Consider two points A and B. Take two circles o1 and o2 such that o1 is tangent to AB
in point A, o2 is tangent to AB in point B, and o1 and o2 are externally tangent in point X. If we allow
o1 and o2 to vary, then what is the set of points that contains all possible locations of X?
Solution. Draw a line tangent to o1 (and so also to o2) in point X. Denote by Y the point in which it intersects
line AB. Note that |YA| = |YX| = |YB|. Therefore, Y is in the middle of line segment AB and
|YX| is constant. It follows that the only possible locations of point X are on the semi-circle with a
center in Y and radius |AY| = |BY| (excluding points A and B). It remains to show that for each such
point, it is possible to generate the two circles that satisfy the desired properties. (Let us mention that,
indeed, points A and B are excluded, as for them one of the circles would degenerate to a point.)
Select any point X on such a semi-circle (again, excluding points A and B). It is easy to see that it is
then possible to select a point P such that |PA| = |PX| and ∢YAP = ∢YXP = π/2. Similarly, we
select Q such that |QB| = |QX| and ∢YBQ = ∢YXQ = π/2. It remains to show that the two
circles with centers in P and Q and radii |PA| and, respectively, |QB| are tangent. In order to prove
it, it is enough to show that X lies on the line segment PQ. But this is indeed true as
∢PXY + ∢QXY = π/2 + π/2 = π.
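A short numerical experiment illustrates the locus. The sketch assumes coordinates A = (−h, 0), B = (h, 0), so that Y is the origin, and places both circles above line AB; the function name and the parameter h (called half_ab below) are hypothetical. For a chosen radius of o1 it computes the matching radius of o2 and the tangency point X.

```python
import math

def tangency_point(r1, half_ab=1.0):
    """Tangency point X of o1 and o2 when o1 has radius r1 and touches AB at A."""
    # Centers are P = (-half_ab, r1) and Q = (half_ab, r2). External tangency
    # means |PQ| = r1 + r2, that is (2*half_ab)^2 + (r1 - r2)^2 = (r1 + r2)^2,
    # which simplifies to r1 * r2 = half_ab^2.
    r2 = half_ab ** 2 / r1
    P = (-half_ab, r1)
    Q = (half_ab, r2)
    t = r1 / (r1 + r2)             # X lies on PQ at distance r1 from P
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))

for r1 in (0.2, 1.0, 3.5):
    X = tangency_point(r1)
    # |YX| should equal half_ab = 1 for every choice of r1.
    print(r1, math.hypot(X[0], X[1]))
```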

Problem 6.7.1. Let P be an interior point of a triangle ABC. Let lines AP, BP, and CP intersect sides
BC, CA, and AB in points A′, B′ and, respectively, C′. Prove that
|PA|/|AA′| + |PB|/|BB′| + |PC|/|CC′| = 2.

Solution. Let us consider triangles ABC and BPC. The ratio |PA′|/|AA′| is equal to the
ratio between heights of these triangles projected on BC, and so

\[
\frac{|PA'|}{|AA'|} = \frac{[BPC]}{[ABC]}.
\]

Using the same argument for triangles APC and APB, we get that

\[
\frac{|PA'|}{|AA'|} + \frac{|PB'|}{|BB'|} + \frac{|PC'|}{|CC'|}
= \frac{[BPC]}{[ABC]} + \frac{[APC]}{[ABC]} + \frac{[APB]}{[ABC]} = 1.
\]

It follows that

\[
\frac{|PA|}{|AA'|} + \frac{|PB|}{|BB'|} + \frac{|PC|}{|CC'|}
= \left(1 - \frac{|PA'|}{|AA'|}\right) + \left(1 - \frac{|PB'|}{|BB'|}\right) + \left(1 - \frac{|PC'|}{|CC'|}\right)
= 3 - 1 = 2.
\]
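The identity is easy to test on a concrete triangle. The sketch below is a numerical illustration only; the coordinates and the helper names are assumptions. It finds the points A′, B′, C′ by intersecting lines and sums the three ratios.

```python
import math

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def cevian_sum(A, B, C, P):
    """Return |PA|/|AA'| + |PB|/|BB'| + |PC|/|CC'| for an interior point P."""
    A1 = intersect(A, P, B, C)     # A' on side BC
    B1 = intersect(B, P, C, A)     # B' on side CA
    C1 = intersect(C, P, A, B)     # C' on side AB
    return (math.dist(P, A) / math.dist(A, A1)
            + math.dist(P, B) / math.dist(B, B1)
            + math.dist(P, C) / math.dist(C, C1))

# Should be 2 up to rounding error.
print(cevian_sum((0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (1.5, 1.0)))
```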

Problem 6.7.2. Points E and F lie on sides BC and, respectively, DA of a parallelogram ABCD such
that |BE| = |DF |. Select any point K on side CD. Let P and Q be intersection points of line F E with
lines AK and, respectively, BK . Prove that [AP F ] + [BQE] = [KP Q].
Solution. Since |BE| = |FD|, |BC| = |AD|, and BC and AD are parallel, we get that ABEF and
CDFE are congruent trapezoids. It follows that [ABEF] = [CDFE] = [ABCD]/2. On the other
hand, since triangle AKB has the same base (namely, AB) and the same height projected on this base as the
parallelogram ABCD, [AKB] = [ABCD]/2. It follows that

[APF] + [BEQ] = [ABEF] − [ABQP] = [AKB] − [ABQP] = [KPQ],

and the proof is finished.


Problem 6.7.3. Consider a convex quadrilateral ABCD. Select points K and L on side AB such that
|AK| = |KL| = |LB| = |AB|/3. Similarly, select points N and M on side DC such that
|DN | = |N M | = |M C| = |CD|/3. Show that [KLM N ] = [ABCD]/3.
Solution. Note that [DAK] = [DAB]/3 as the corresponding triangles have the same height projected
on the bases that have proportion 1 to 3. Similarly, note that [BMC] = [BDC]/3. Clearly,
[ABCD] = [DAB] + [BDC]. Combining all of these observations together, we get that

\[
\begin{aligned}
[KBMD] &= [ABCD] - [AKD] - [BMC] \\
&= [ABCD] - [DAB]/3 - [BDC]/3 \\
&= [ABCD] - [ABCD]/3 = 2[ABCD]/3.
\end{aligned}
\]

Let us now note that [KLM] = [LBM] as these triangles have the same height projected on bases of
equal length. Similarly, [MKN] = [KND]. But this implies that

\[
\begin{aligned}
[KLMN] &= [KLM] + [KMN] \\
&= \frac{1}{2}\left([KLM] + [LBM] + [MKN] + [KND]\right) \\
&= \frac{1}{2}[KBMD] = \frac{1}{2} \cdot \frac{2}{3}[ABCD] = [ABCD]/3.
\end{aligned}
\]
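The area identity can be checked with the shoelace formula. The sketch below is a numerical illustration; the sample quadrilateral and the helper names are assumptions. It returns the ratio [KLMN]/[ABCD].

```python
def area(poly):
    """Area of a simple polygon given by its vertices in order (shoelace formula)."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def lerp(p, q, t):
    """Point dividing segment pq so that it lies at fraction t of the way from p."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def middle_strip_ratio(A, B, C, D):
    """Return [KLMN] / [ABCD] for a convex quadrilateral ABCD."""
    K, L = lerp(A, B, 1 / 3), lerp(A, B, 2 / 3)   # |AK| = |KL| = |LB|
    N, M = lerp(D, C, 1 / 3), lerp(D, C, 2 / 3)   # |DN| = |NM| = |MC|
    return area([K, L, M, N]) / area([A, B, C, D])

# Should be 1/3 up to rounding error.
print(middle_strip_ratio((0.0, 0.0), (6.0, 0.0), (7.0, 4.0), (1.0, 5.0)))
```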

Problem 6.8.1. Given a parallelogram ABCD, consider points M and N that are in the middle of sides
BC and CD, respectively. Segment BD intersects with AN in point Q, and with AM in point P. Prove
that 3|QP| = |BD|.


Solution. Since AB is parallel to DN, triangles ABQ and DQN are similar. It follows from Thales’
theorem that |QD|/|QB| = |DN|/|AB| = 1/2, and so 2|QD| = |QB|. As a result, |QD| = |BD|/3.
Similarly, by considering triangles APD and BMP we conclude that |PB| = |BD|/3. Combining
these two things together, we get that |QP| = |BD| − |QD| − |PB| = |BD|/3.
Problem 6.8.2. Points K, L, M, and N are the middle points of sides AB, BC, CD and, respectively,
DA of a parallelogram ABCD whose area is equal to 1. Let P be the intersection point of KC and
NB, Q be the intersection point of LD with KC, R be the intersection point of MA with LD, and,
finally, S be the intersection point of NB with MA. Calculate the area of PQRS.


Solution. Denote by C′ the point of intersection of line CK with line AD. Since 2|AK| = |CD|, we
get from Thales’ theorem that |AC′| = |DA|. But this means, again by Thales’ theorem, that triangles
C′PN and BCP are similar with ratio of 3/2. Consider now heights of these triangles projected from
P onto C′N and BC. Denote their lengths as h1 and, respectively, as h2. We have h1/h2 = 3/2 (by
similarity of the corresponding triangles) and |BC|(h1 + h2) = 1 (the area of ABCD). It follows that
|BC|h2/2 = 1/5, and so [BPC] = 1/5. Similarly, we conclude that
[DQC] = [ARD] = [BSA] = 1/5, and so [PQRS] = 1 − 4/5 = 1/5.
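The value 1/5 can be reproduced numerically. The sketch below is an illustration under assumed coordinates; the helper names are hypothetical. It takes three vertices of a parallelogram, builds the four lines from the problem, and returns [PQRS]/[ABCD], so the parallelogram need not have area exactly 1.

```python
def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def area(poly):
    """Area of a simple polygon given by its vertices in order (shoelace formula)."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def mid(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def inner_ratio(A, B, D):
    """Return [PQRS] / [ABCD] for the parallelogram with C = B + D - A."""
    C = (B[0] + D[0] - A[0], B[1] + D[1] - A[1])
    K, L, M, N = mid(A, B), mid(B, C), mid(C, D), mid(D, A)
    P = intersect(K, C, N, B)
    Q = intersect(L, D, K, C)
    R = intersect(M, A, L, D)
    S = intersect(N, B, M, A)
    return area([P, Q, R, S]) / area([A, B, C, D])

# Should be 1/5 up to rounding error.
print(inner_ratio((0.0, 0.0), (2.0, 0.0), (0.3, 0.5)))
```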

Problem 6.8.3. Points E and F are on sides AB and, respectively, AD of rhombus ABCD. Lines CE
and CF intersect line BD in points K and L, respectively. Line EL intersects side CD in point P. Line
F K intersects side BC in point Q. Prove that |CP | = |CQ|.

Solution. Consider triangles FLD and BCL. By Thales’ theorem we get that
|FD|/|LD| = |BC|/|LB|. Consider now triangles LPD and LEB. Using Thales’ theorem one more
time we get that |DP|/|LD| = |BE|/|LB|. Combining those two facts together, we conclude that
|DP| = |BE| ⋅ |LD|/|LB| = |BE| ⋅ |FD|/|BC|. Analogously, by analyzing triangles EBK and
DKC, and then triangles BKQ and FKD, we get that |BQ| = |BE| ⋅ |FD|/|DC|. We conclude that
|BQ| = |DP| as |BC| = |DC|, and so also |CP| = |CQ|.


Further Reading

We hope that our book has increased the readers' appetite for more problems
to solve and that they will search for more books to expand their knowledge
and skills. Here is a list of books that we have on our shelves and like to
read but, of course, this list is not complete. There are many more books
that are worth reading. Moreover, the mathematical level of these books
varies a lot. In any case, we hope that the readers will enjoy reading some of
them and keep solving interesting problems.

102 Combinatorial Problems by Titu Andreescu, Birkhäuser,
2003.
Are You Smart Enough to Work at Google? by William
Poundstone, Little Brown, 2012.
Asymptopia by Joel Spencer with Laura Florescu, Orient
Blackswan, 2017.
Concrete Mathematics: A Foundation for Computer Science by
Ronald L. Graham, Donald E. Knuth and Oren Patashnik,
Addison-Wesley Professional, 1994.
Discrete Mathematics—Elementary and Beyond by L. Lovasz, J.
Pelikan, K. Vesztergombi, Springer, 2006.
How To Prove It—A Structured Approach by Daniel J. Velleman,
Cambridge University Press, 2006.
How to Read and Do Proofs by Daniel Solow, Wiley, 2014.
Lessons in Play—An Introduction to Combinatorial Game
Theory by Michael H. Albert, Richard J. Nowakowski, David
Wolfe, CRC Press, 2007.
Magical Mathematics—The Mathematical Ideas That Animate
Great Magic Tricks by Persi Diaconis and Ron Graham,
Princeton University Press, 2011.
Mathematical Mind-Benders by Peter Winkler, Routledge, 2007.
Mathematical Puzzles: A Connoisseur’s Collection by Peter
Winkler, Routledge, 2003.
Moscow Mathematical Olympiads, 1993-1999 by Roman
Fedorov, Alexei Belov, Alexander Kovaldzhi, Ivan Yashchenko,
American Mathematical Society, 2011.
Moscow Mathematical Olympiads, 2000-2005 by Roman
Fedorov, Alexei Belov, Alexander Kovaldzhi, Ivan Yashchenko,
American Mathematical Society, 2011.
Pearls of Discrete Mathematics by Martin Erickson, CRC Press,
2009.
Professor Stewart’s Casebook of Mathematical Mysteries by Ian
Stewart, Basic Books, 2014.
Proofs from the book by Martin Aigner, Gunter M. Ziegler,
Springer, 2013.
The Art of Mathematics—Coffee Time in Memphis by Bela
Bollobas, Cambridge University Press, 2006.
The Art of Proof—Basic Training for Deeper Mathematics by
Matthias Beck, Ross Geoghegan, Springer, 2010.
The Math Book—From Pythagoras to the 57th Dimension, 250
Milestones in the History of Mathematics by Clifford A.
Pickover, Sterling, 2012.
The Nikola Tesla Puzzle Collection—An Electrifying Series of
Challenges, Enigmas and Puzzles by Richard Galland, Carlton
Books, 2001.
Thirty-three Miniatures—Mathematical and Algorithmic
Applications of Linear Algebra by Jiri Matousek, American
Mathematical Society, 2010.
One might also consider visiting the website of the International
Mathematical Olympiad at https://www.imo-official.org/problems.aspx
where one can find a collection of challenging problems
to solve. Another website worth mentioning is an on-line resource
maintained by Evan Chen that contains problems and solutions to several
USA contests: https://web.evanchen.cc/problems.html.

Good luck!
Index

Acute Triangle, 182


Additive Functions, 80
Altitude of a Triangle, 184
Angle Addition Identities, 54
Arithmetic Mean, 7
Asymptotic Notation, 19

Bézout’s Identity, 177


Base Case, 10
Bernoulli’s Inequality, 13
Bijection, 114, 164
Binomial Theorem, 19
Bipartite Graphs, 104
Birthday Attack, 141
Birthday Paradox, 140
Bonferroni Inequalities, 122
Boole’s Inequality, 29, 122
Boundary, 67

Carmichael Numbers, 163


Cartesian Coordinate System, 2
Cartesian Product, 135
Cauchy’s Bound, 90
Cauchy’s Functional Equation, 80
Cauchy-Schwarz Inequality, 25
Central Angles, 182
Centre of the Symmetry, 135
Ceva’s Theorem, 194
Chinese Remainder Theorem, 166
Circumcenter, 182
Circumcircle, 182
Circumradius, 182
Clean Tile Problem, 129
Clique, 103
Closed Ball, 66
Closed Set, 67
Collinear Points, 181
Combinations, 114
Common Ratio, 6
Complete Graph, 103
Complex Numbers, 74
Composite Numbers, 149
Concave Function, 2
Congruence, 157, 187
Constant e, 15
Constructive Argument, 105
Continuous Functions, 79
Convex Function, 2
Convex Polygon, 136
Cosecant Function, 51
Cosine Function, 51
Cotangent Function, 51
Coupon Collector’s Problem, 140
Cycle, 104
Cyclic Systems of Equations, 43

Degree of a Polynomial, 73
Diophantine Equations, 176
Discriminant, 35
Disjoint Events, 130
Divisibility, 149
Double Counting, 58, 115
Double-angle Identities, 54

Edge Set, 103


Empty Graph, 104
Equilateral Triangle, 181
Euclidean Algorithm, 153
Euclidean Space, 66
Euler’s Theorem, 165
Euler’s Totient Function, 164
Even Functions, 93
Expectation, 123
Extended Euclidean Algorithm, 153
Extremal Graph Theory, 117

Fano Plane, 118


Fermat’s Little Theorem, 163
Fibonacci Sequence, 144
Floor, 152
Function Composition, 100
Functional Equations, 79
Functional Power, 100
Fundamental Theorem of Algebra, 74
Fundamental Theorem of Arithmetic, 150

Gauss’s Lemma, 88
Generating Function, 144
Geometric Distribution, 123
Geometric Mean, 7
Geometric Sequence, 6
Geometric Series, 6
Geometrical Probability, 129
Golden Ratio, 146
Graphs, 103
Greatest Common Divisors, 152
Greedy Algorithm, 106

Harmonic Mean, 7
Heawood Graph, 118
Heron’s Formula, 32
Hypotenuse, 51, 182

Imaginary Number, 74
Incidence Graph, 117
Inclusion–Exclusion Principle, 123
Independent Set, 104
Induced Subgraph, 104
Induction Step, 10
Inductive Hypothesis, 10
Infimum, 68
Inscribed Angles, 182
Intercept Theorem, 207
Intersecting Lines, 181
Invariant, 63, 109
Isosceles Triangle, 181
Iterated Function, 100

Jensen’s Inequality, 5

k-regular Graph, 103

Lagrange Polynomials, 92
Lagrange’s Bound, 90
Legendre’s Formula, 150
Limit Point, 67
Linearity of Expectation, 123
Linearization, 36

Matchings, 104
Mathematical Induction, 10
Maximal Matching, 104
Maximum Degree, 103
Maximum Matching, 104
Menelaus’s Theorem, 193
Minimum Degree, 103
Multiplicative Inverse, 158
Mutually Exclusive Events, 130
Needle Problem, 129
Neighborhood, 103
Non-constructive Argument, 105

Obtuse Triangle, 182


Odd Functions, 93
One-to-one Function, 164
Onto Function, 164
Open Ball, 66
Open Sets, 67
Orthocenter, 184
Orthogonal Projection, 181

Parallelogram, 197
Partially Ordered Sets, 68
Path, 104
Perfect Matching, 104
Permutations, 114
Pigeonhole Principle, 138
Point Reflection, 135
Point Symmetry, 135
Polygon, 181
Polynomials, 73
Power of a Point, 199
Prime Numbers, 149
Probabilistic Method, 124
Product-to-sum Identities, 54
Projective Planes, 117
Proof by Contradiction, 99
Pythagorean Identity, 53

Quadrilateral, 181
Quotient-Remainder Theorem, 152
Quotients, 152

Radian, 52
Rational Root Theorem, 87
Rearrangement Inequality, 12
Rectangle, 197
Relatively Prime Numbers, 152
Remainders, 152
Rhombus, 197
Right Angle, 181
Right Triangle, 182
Roots, 74

Sandwich Theorem, 63
Scale Factor, 6
Secant Function, 51
Similarities, 189
Sine Function, 51
Square, 197
Squeeze Theorem, 63
Stolz–Cesàro Theorem, 251
Subgraph, 104
Sum-to-product Identities, 54
Supremum, 68
System of Equations, 36
System of Linear Equations, 36

Tangent Function, 51
Tangent Line, 199
Thales’ Theorem, 207
The Law of Sines, 204
Titu’s Lemma, 25
Transversal, 193
Trapezoid, 197
Triangle, 181
Triangle Inequality, 2
Trigonometric Functions, 51
Trigonometric Identities, 53
Turán Graph, 106
Turán Number, 117

Union Bound, 29, 122

Vacuously True Statements, 98


van der Waerden Number, 292
Vertex Set, 103
Vieta’s Formulas, 76
