Muirhead
Muirhead
Muirhead
MELVYN B. NATHANSON
Abstract. Preliminary results from Nathanson [5] are used to prove the Muir-
head and Rado inequalities.
1. A simple inequality
The Hardy-Littlewood-Pólya proof of the Muirhead inequality is based on the
following simple inequality:
Theorem 1. The product of two real numbers is positive if and only if either both
numbers are positive or both numbers are negative.
Thus, for u, v ∈ R, we have uv > 0 if u > 0 and v > 0 or if u < 0 or v < 0.
Corollary 1. Let a and b be positive real numbers. Let x1 and x2 be distinct real
numbers. If x1 6= x2 , then
xa1 xb2 + xb1 xa2 < xa+b
1 + xa+b
2 .
If x1 = x2 , then
xa1 xb2 + xb1 xa2 = xa+b
1 + xa+b
2 .
Proof. If x1 > x2 , then xa1 − xa2 > 0 and xb1 − xb2 > 0. If x1 < x2 , then xa1 − xa2 < 0
and xb1 − xb2 < 0. Applying Theorem 1 with u = xa1 − xa2 and v = xb1 − xb2 gives
xa+b
1 + xa+b
2 − xa1 xb2 − xb1 xa2 = (xa1 − xa2 )(xb1 − xb2 ) > 0.
If x1 = x2 , then
2xa+b
1 = xa1 xb2 + xb1 xa2 = xa+b
1 + xa+b
2 .
This completes the proof.
Corollary 2. Let
ρ>0 and |δ| < ∆.
If x1 and x2 are positive real numbers with x1 6= x2 , then
(1) xρ+δ
1 xρ−δ
2 + xρ−δ
1 xρ+δ
2 < xρ+∆
1 xρ−∆
2 + xρ−∆
1 xρ+∆
2 .
Exercises.
(1) Prove that if x1 x2 > 0 and x1 6= x2 , then
2x51 x52 < x61 x42 + x41 x62 < x71 x32 + x31 x72
< x81 x22 + x21 x82 < x91 x2 + x1 x92 < x10 10
1 + x2 .
xn xσ−1 (n)
Equivalently, σ : Rn → Rn is the linear transformation defined by σei = eσ(i) ,
n
where E = {e1 , . . . , en } is the standard basis for R . The matrix of this linear
(σ)
transformation is the permutation matrix Pσ = pi,j , where
(
(σ) 1 if i = σ(j)
pi,j =
0 if i 6= σ(j) .
Let Ω be a subset of Rn that is closed under the action of Sn , that is, x ∈ Ω
implies σx ∈ Ω for all σ ∈ Sn . Let F (Ω) be the set of real-valued functions defined
on Ω.
For every function f ∈ F (Ω) and every permutation σ in the symmetric group
Sn , define the function σf ∈ F (Ω) by
(4) (σf )(x) = f (σ −1 x).
x1 xσ(1)
If x = ... , then σ −1 x = ... and
xn xσ(n)
= ((τ σ) f ) (x) .
Thus,
(τ σ) f = τ (σf )
and (4) defines an action of the group Sn on F (X).
Let G be a subgroup of Sn of order |G|. The G-symmetric mean of the function
f is the function [f ]G ∈ F (Ω) defined by
1 X
[f ]G = σf.
|G|
σ∈G
and
1 X
[f ]G (x1 , x2 , . . . , xn ) = f xσ(1) , xσ(2), . . . , xσ(n) .
|G|
σ∈G
Let R≥0 denote the set of nonnegative real numbers and let R>0 denote the set
of positive real numbers.
The nonnegative octant in Rn is
a1
..
n n
R≥0 = a = . ∈ R : ai ≥ 0 for all i ∈ {1, . . . , n} .
an
we also have
1 X aσ−1 (1) aσ−1 (2) a −1
[xa ]G = x1 x2 · · · xnσ (n)
|G|
σ∈G
1 X aσ(1) aσ(2) a
= x1 x2 · · · xnσ(n) .
|G|
σ∈G
Here are some examples. If the positive vector x is constant with all coordinates
equal to x0 , then
Pn
ai
(5) [xa ]G = x0 i=1
.
Let n = 2. If a = ( 73 ), a
then x = x71 x32 and the S2 -symmetric mean of xa is
1 7 3
[xa ]S2 = x1 x2 + x31 x72 .
2
If b = ( 63 ), then xb = x61 x42 and the S2 -symmetric mean of xb is
1 6 4
[xb ]S2 = x1 x2 + x41 x62 .
2
For all x = ( xx12 ) ∈ R2>0 with x1 6= x2 , we have
[xb ]S2 < [xa ]S2
by inequality
6(2).
Let a = 5 . The S3 -symmetric mean of the monomial xa = x61 x52 x43 is
4
1 6 5 4
[xa ]S3 = x x x + x61 x53 x42 + x62 x53 x41 + x62 x51 x43 + x63 x51 x42 + x63 x52 x41
6 1 2 3
1 4 5 6
x x x + x41 x62 x53 + x51 x62 x43 + x51 x42 x63 + x61 x42 x53 + x61 x52 x43 .
=
6 1 2 3
For the subgroup G = {e, (1, 2, 3), (1, 3, 2)} of S3 , the G-symmetric mean of xa is
1 6 5 4 1 6 5 4
[xa ]G = x x x + x62 x53 x41 + x63 x51 x42 = x x x + x51 x42 x63 + x41 x62 x53 .
3 1 2 3 3 1 2 3
7
Consider the subgroup G = {e, (1, 2)} of S4 . For the exponent vector a = 32 ,
1
the G-symmetric mean of the monomial xa is
1 7 3 2 1
[xa ]G = x1 x2 x3 x4 + x72 x31 x23 x4 = x71 x32 + x31 x72 x23 x4 .
2 2
6
For the exponent vector b = 42 , the G-symmetric mean of the monomial xb is
1
1 6 4 2 1
[xb ]G = x1 x2 x3 x4 + x62 x41 x23 x4 = x61 x42 + x41 x62 x23 x4 .
2 2
It follows from (2) that
1
[xa ]G − [xb ]G = x71 x32 + x31 x72 − x61 x42 − x41 x62 x23 x4 > 0
2
x1
for all vectors x = xx23 ∈ R>0 with x1 6= x2 .
x4
6 MELVYN B. NATHANSON
1!
0
Here are two classical special cases. Associated to the vector e1 = .. ∈ Rn≥0
.
0
is the monomial function
For every ordered pair of integers (i, j) with i, j ∈ {1, . . . , n}, there are (n − 1)!
permutations σ ∈ Sn with σ(i) = j. In particular, for all j ∈ {1, . . . , n}, there are
(n − 1)! permutations σ ∈ Sn with σ(1) = j. We obtain
1 X 1
[xe1 ]Sn = xσ(1) x0σ(2) · · · x0σ(n)
n!
σ∈Sn
n
1 X 1 X
= xσ(1) = (n − 1)!xj
n! n! j=1
σ∈Sn
n
1X
= xj .
n j=1
1/n 1/n
xjn = x1 x2 · · · x1/n
n = (x1 x2 · · · xn )1/n .
Thus, the symmetric mean of the monomial xjn is the geometric mean. The arith-
metic and geometric mean inequality states that
3. Muirhead’s inequality
↓
a1 b1 a1
! !
Let a = .. and b = .. be vectors in R n
, and let a↓
= .. and
. . .
an bn a↓
n
b↓
1
and
n
X n
X
bi = ai .
i=1 i=1
The vector a strictly majorizes b, denoted b ≺ a, if b a and b 6= a.
a1
!
Lemma 2. Let a = .. be a vector in Rn≥0 . Let G be a subgroup of Sn . For
.
an
every permutation τ ∈ G,
[xa ]G = [xτ a ]G .
Proof. Recall that
aτ −1 (1)
a a
τ a = ...
−1 (1) −1 (n)
and xτ a = x1 τ · · · xnτ
aτ −1 (n)
Because G is a group, for all τ ∈ G we have Gτ = {στ : σ ∈ G} = G and so
1 X aτ −1 (1) aτ −1 (n)
[xτ a ]G = xσ(1) · · · xσ(n)
n!
σ∈G
1 X aτ −1 σ−1 (1) a −1 −1
= x1 · · · xnτ σ (n)
n!
σ∈G
1 X a(στ )−1 (1) a −1
= x1 · · · xn(στ ) (n)
n!
σ∈G
1 X aσ−1 (1) a −1
= x1 · · · xnσ (n)
n!
σ∈G
= [xa ]G .
This completes the proof.
Lemma 3. Let G be a ! subgroup of Sn that contains the transposition τ = (j, k).
a1 x1
!
For all vectors a = .. ∈ Rn and x = .. ∈ Rn ,
. .
an xn
n
1 X aj ak aj
Y
[xa ]G = xσ(j) xσ(k) + xaσ(j)
k
xσ(k) xaσ(i)
i
.
2n! i=1
σ∈G
i6=j,k
8 MELVYN B. NATHANSON
By Lemma 2,
2[xa ]G = [xa ]G + [xτ a ]G
1 X a1 aj
= xσ(1) · · · xσ(j) · · · xaσ(k)
k
· · · xaσ(n)
n
n!
σ∈G
1 X a1 aj
+ xσ(1) · · · xaσ(j)
k
· · · xσ(k) · · · xaσ(n)
n
n!
σ∈G
n
1 X aj ak aj
Y
= xσ(j) xσ(k) + xaσ(j)
k
xσ(k) xaσ(i)
i
.
n! i=1
σ∈G
i6=j,k
aj + ak aj − ak
ρ= and ∆= .
2 2
Then
aj = ρ + ∆ and ak = ρ − ∆
and
0 < ∆ ≤ ρ.
Let
0 ≤ δ < ∆.
b1
!
Define the vector b = .. by
.
bn
bj = ρ + δ and bk = ρ − δ
and
b i = ai if i 6= j, k.
The vector b is!nonnegative.
x1
If x = .. ∈ Rn>0 is a nonconstant positive vector, then
.
xn
2n! i=1
σ∈G
i6=j,k
n
1 X ρ+∆ ρ−∆ Y
= xσ(j) xσ(k) + xρ−∆
σ(j) xρ+∆
σ(k) xaσ(i)
i
.
2n! i=1
σ∈G
i6=j,k
Similarly,
n
1 X bj bk bj
Y
[xb ]G = xσ(j) xσ(k) + xbσ(j)
k
xσ(k) xbσ(i)
i
2n! i=1
σ∈G
i6=j,k
n
1 X ρ+δ ρ−δ ρ−δ ρ+δ
Y
= xσ(j) xσ(k) + xσ(j) xσ(k) xaσ(i)
i
.
2n! i=1
σ∈G
i6=j,k
Therefore,
[xa ]G − [xb ]G
n
1 X ρ+∆ ρ−∆ Y
= xσ(j) xσ(k) + xρ−∆ xρ+∆
σ(j) σ(k) − xρ+δ ρ−δ
x
σ(j) σ(k) − xρ−δ ρ+δ
x
σ(j) σ(k) xaσ(i)
i
2n! i=1
σ∈G
i6=j,k
For every pair of integers (i, j) with i, j ∈ {1, . . . , n}, there are (n−1)! permutations
σ ∈ Sn with σ(i) = j and so there are (n− 1)! permutations σ such that xcσ(i) i
= xcj i .
Therefore,
Y n
Y Y n Y
Y
xcσ(1)
1
xcσ(2)
2
· · · xcσ(n)
n
= xcσ(i)
i
= xcσ(i)
i
Applying the arithmetic and geometric mean inequality to the nonconstant pos-
itive vector x gives
!1/n!
Y
c1 c2 cn 1 X c1 c2
1= xσ(1) xσ(2) · · · xσ(n) < xσ(1) xσ(2) · · · xcσ(n)
n
n!
σ∈Sn σ∈Sn
and so X
n! < xcσ(1)
1
xcσ(2)
2
· · · xcσ(n)
n
.
σ∈Sn
Equivalently,
X
(7) 0< xcσ(1)
1
xcσ(2)
2
· · · xcσ(n)
n
−1 .
σ∈Sn
a1 b1
! !
Let a = .. and b = .. . For i ∈ {1, . . . , n}, let
. .
an bn
ci = ai − b i .
MUIRHEAD-RADO INEQUALITY 11
We have
n
X Y n
X Y n
X Y n
Y
xaσ(i)
i
= xbσ(i)
i +ci
= xbσ(i)
i
xcσ(i)
i
.
σ∈Sn i=1 σ∈Sn i=1 σ∈Sn i=1 i=1
Theorem 5. Let a and b be distinct nonnegative vectors in Rn . If [xb ]Sn ≤ [xa ]Sn
x1
!
for every positive vector .. ∈ Rn , then b ≺ a.
.
xn
a1 b1
! !
Proof. Rearranging the coordinates of the vectors a = .. and b = .. , we
. .
an bn
can assume that a and b are decreasing, that is,
a1 ≥ · · · ≥ an ≥ 0 and b1 ≥ · · · ≥ bn ≥ 0.
w
For every w > 0, the constant vector w = ... is positive and
w
1 X b1 b2 1 X b1 +b2 +···+bn Pn
w w · · · wbn = = w i=1 bi .
b
w S = w
n n! n!
σ∈Sn σ∈Sn
Similarly, Pn
ai
[wa ]Sn = w i=1 .
If [xb ]Sn ≤ [xa ]Sn for every positive vector x ∈ Rn , then
Pn Pn
w i=1 bi = wb Sn ≤ [wa ]Sn = w i=1 ai .
Pn Pn Pn
Choosing
Pn w > 1 givesPn i=1 bi P ≤ i=1 ai . Choosing 0 < w < 1 gives i=1 bi ≥
n
i=1 ai . Therefore, i=1 bi = i=1 ai .
Let w > 1. For k ∈ {1, . . . , n − 1}, we define the scalars
(
w if i ∈ {1, . . . , k}
wi =
1 if i ∈ {k + 1, . . . , n}
12 MELVYN B. NATHANSON
Proof. Let
Qn
i=1 vi
vn+1 = min(vn , un ) and un+1 = min(vn , un ) Qn .
i=1 ui
n+1
We have 0 < vn+1 ≤ vn and so the sequence Qn(vi )i=1 Qis positive and decreasing.
n
From inequality (8) with j = n, we see that i=1 ui ≥ i=1 vi > 0 and so
Qn
vi
0 < un+1 = min(vn , un ) Qni=1 . ≤ min(vn , un ) ≤ un
i=1 ui
and
j
X j
X j
X j
X
λvi ≤ λui if and only if vi ≤ ui .
i=1 i=1 i=1 i=1
Also,
min(λun , λvn ) = λ min(vn , un ).
Choosing λ sufficiently large, we can assume that vi > 1 and ui > 1 for all i ∈
{1, . . . , n, n + 1}, and so
bi = log vi > 0 and ai = log ui > 0.
The sequences (bi )n+1
and
i=1 (ai )n+1
i=1 are decreasing sequences of positive numbers
such that bi0 6= ai0 and
j
X j
X j
Y j
Y j
X j
X
bi = log vi = log vi ≤ log ui = log ui = ai
i=1 i=1 i=1 i=1 i=1 i=1
X n+1
Y X n+1
Y
= xbσ(i)
i
< xaσ(i)
i
We have Qn
vi
un+1 = min(vn , un ) Qni=1 ≤ min(vn , un ) = vn+1
i=1 ui
and so
n
X n+1
X n+1
X n
X
vi = vi − vn+1 < ui − un+1 = ui .
i=1 i=1 i=1 i=1
This completes the proof.
Exercises.
(1) (a) Prove that the symmetric mean of the function
f (x1 , x2 ) = x1
is
[f ](x1 , x2 ) = x1 + x2 .
(b) Prove that the symmetric mean of the function
f (x1 , x2 , x3 ) = x1
is
[f ](x1 , x2 , x3 ) = 2x1 + 2x2 + 2x3 .
(2) Compute the symmetric means of the following functions;
(a)
f (x1 , x2 , x3 , x4 , x5 ) = x4 .
(b)
f (x1 , x2 ) = x51 x72 .
(c)
f (x1 , x2 , x3 ) = x51 x72 .
(d)
f (x1 , x2 , x3 , x4 ) = x51 x72 .
(3) Here are two proofs that if 0 ≤ x < y, then
x3 y 2 + x2 y 3 < x5 + y 5 .
(a) Deduce this from Muirhead’s inequality.
(b) Use the arithmetic and geometric mean inequality to show that
p 3x5 + 2y 5
x3 y 2 = 5 x15 y 10 ≤
5
and
p 2x5 + 3y 5
x2 y 3 = 5 x10 y 15 ≤ .
5
Add these
p inequalities.p p p
Hint: 5 x15 y 10 = 5 x5 x5 x5 y 5 y 5 and 5 x10 y 15 = 5 x5 x5 y 5 y 5 y 5
(4) Let 0 ≤ x < y. Apply Muirhead’s inequality to prove
x3 y 2 + x2 y 3 < x4 y + xy 4 .
(5) Let 0 ≤ x < y.
(a) Use the arithmetic and geometric mean to prove that
p
3
p
x2 y + 3 xy 2 < x + y.
MUIRHEAD-RADO INEQUALITY 15
4. Rado’s inequality
Let V be an n-dimensional vector space with basis B = {e1 , . . . , en }. For every
permutation σ ∈ Sn , let Pσ : V → V be the linear transformation defined by
Pσ (ej ) = eσ(j) .
The (i, j)th coordinate of the matrix of Pσ with respect to the basis B is
(Pσ )i,j = δi,σ(j) .
For all σ, τ ∈ Sn and for all j ∈ {1, . . . , n} we have
Pσ Pτ (ej ) = Pσ (eτ (j) ) = eσ(τ (j))
= e(στ )(j)) = Pστ (ej )
and so
Pσ Pτ = Pστ .
16 MELVYN B. NATHANSON
Pn
For every permutation σ ∈ Sn and vector x = j=1 xj ej in V , we define
Xn n
X
σx = Pσ (x) = Pσ xj ej = xj Pσ (ej )
j=1 j=1
n
X n
X
= xj eσ(j) = xσ−1 (j) ej .
j=1 j=1
Pn
Thus, if x = j=1 xj ej ∈ Rn , then
xσ−1 (1)
(10) σx = ... .
xσ−1 (n)
For example, if σ = (1, 2, 3) ∈ S3 , then σ −1 = (1, 3, 2) and
xσ−1 (1) x3 0 0 1 x1
σx = xσ−1 (2) = x1 = 1 0 0 x2 = Pσ (x).
xσ−1 (3) x2 0 1 0 x3
The permutohedron generated by the vector x ∈ Rn , denoted KSn (x), is the
convex hull of the finite set {σx : σ ∈ Sn }.
x1 y1
.. ..
n
Theorem 7. Let x = . ∈ R≥0 and y = . ∈ Rn≥0 . Then y x if and
xn yn
only if y ∈ KSn (x).
Proof. By the Hardy-Littlewood-Pólya theorem, we have y x if and only if there
is a doubly stochastic matrix S such that y = Sx. By the Birkhoff-von Neumann
theorem, the matrix S is a convex combination of permutation matrices, and so
there is a set of nonnegative real numbers {tσ : σ ∈ Sn } such that
X
tσ = 1
σ∈Sn
and X
tσ Pσ = S.
σ∈Sn
Therefore,
X X
y = S(x) = tσ Pσ (x) = tσ σx ∈ conv{σx : σ ∈ Sn } = KSn (x).
σ∈Sn σ∈Sn
Conversely, if y ∈ KSn (x), then there is a set of nonnegative real numbers {tσ :
σ ∈ Sn } such that
X
tσ = 1
σ∈Sn
and X X
y= tσ σx = tσ Pσ (x) = S(x)
σ∈Sn σ∈Sn
P
where the matrix S = σ∈Sn tσ Pσ is doubly stochastic. This completes the proof.
MUIRHEAD-RADO INEQUALITY 17
|G|
σ∈G
1 X aσ−1 (1) aσ−1 (2) a −1
= x1 x2 · · · xnσ (n)
|G|
σ∈G
1 X aσ(1) aσ(2) a
= x1 x2 · · · xnσ(n) .
|G|
σ∈G
For every vector a ∈ Rn , let KG (a) be the convex hull of the finite set of
vectors {γa : γ ∈ G}. If the vector a is constant, then γa = a for all γ ∈ G and
KG (a) = conv{γa : γ ∈ G} = {a}. Thus, if a ∈ Rn and if b ∈ KG (a) for some
b 6= a, then a is not a constant vector.
x1
!
Let x = .. be a positive vector. For every permutation σ ∈ G, the arithmetic
.
xn
and geometric mean inequality gives
n
!tγ n
!
Y Y aγ −1 (i) X Y aγ −1 (i)
xσ(i) ≤ tγ xσ(i) .
γ∈G i=1 γ∈G i=1
18 MELVYN B. NATHANSON
We have
n n P
γ∈G tγ aγ −1 (i)
XY XY
xbσ(i)
b
x G= i
= xσ(i)
σ∈G i=1 σ∈G i=1
n Y n
!tγ
XY tγ aγ −1 (i) X Y Y aγ −1 (i)
= xσ(i) = xσ(i)
σ∈G i=1 γ∈G σ∈G γ∈G i=1
n n
! !
XX Y aγ −1 (i) X X Y aγ −1 (i)
≤ tγ xσ(i) = tγ xσ(i)
σ∈G γ∈G i=1 γ∈G σ∈G i=1
n n
a a
X X Y X X Y
j j
= tγ xσγ(j) = tγ xσ(j)
γ∈G σ∈G j=1 γ∈G σ∈G j=1
n n
a a
X Y X X Y
j j
= xσ(j) tγ = xσ(j)
σ∈G j=1 γ∈G σ∈G j=1
a
= [x ]G .
This completes the proof.
Theorem 9 (Rado). Let G be a subgroup of the symmetric group Sn . If a and b
are nonnegative vectors in Rn such that
b
x G ≤ [xa ]G
for all positive vectors x ∈ Rn , then
b ∈ KG (a).
a1 b1
! !
Proof. Let a = .. and b = .. . We shall prove that if b ∈
/ KG (a), then
. .
an bn
there is a positive vector x ∈ Rn such that xb G > [xa ]G .
The compact convex polytope KG (a) is generated by the finite set of vectors
{γa : γ ∈ G}. If b ∈ / KG (a), then the set KG (a) and the vector b are strictly
separated by a hyperplane H. This means that there is a nonzero linear functional
Xn
H(x) = ui xi
i=1
and scalars c and δ with δ > 0 such that
H(x) ≤ c for all x ∈ KG (a)
and
H(b) ≥ c + δ.
For all γ ∈ G we have
aγ(1)
γ −1 a = ... ∈ KG (a)
aγ(n)
and so
n
X n
X
ui aγ −1 (i) = H(γa) ≤ c ≤ H(b) − δ = ui bi − δ.
i=1 i=1
MUIRHEAD-RADO INEQUALITY 19
Let
M > |G|1/δ and xi = M ui
x1
!
for i ∈ {1, . . . , n}. The vector x = .. is positive. The subgroup G contains the
.
xn
identity permutation, and so
Pn X Pn n
XY
ui bi ui bγ(i)
M i=1 ≤ M i=1 = M ui bγ(i)
γ∈G γ∈G i=1
n
XY b
xi γ(i) = |G| xb
= G
.
γ∈G i=1
It follows that
n n
1 X Y aγ −1 (i) 1 X Y ui aγ −1 (i)
[xa ]G = xi = M
|G| i=1
|G| i=1
γ∈G γ∈G
1 X Pni=1 ui aγ −1 (i) 1 X H(γa)
= M = M
|G| |G|
γ∈G γ∈G
1 X H(b)−δ Pn
≤ M = M i=1 ui bi −δ
|G|
γ∈G
1 Pn |G|
= δ
M i=1 ui bi ≤ δ xb G
M M
< xb G .
This completes the proof.
References
[1] G. H. Hardy, J. E. Littlewood, and G. Pólya, Some simple inequalities satisfied by convex
functions, Messenger of Math. 58 (1929), 145–152.
[2] , Inequalities, Cambridge Mathematical Library, Cambridge University Press, Cam-
bridge, 1988, reprint of the 1952 edition.
[3] A. W. Marshall, I. Olkin, and B. C. Arnold, Inequalities: Theory of Majorization and its
Applications, Springer Series in Statistics, Springer, New York, 2011.
[4] R. F. Muirhead, Some methods applicable to identities and inequalities of symmetric algebraic
functions of n letters, Proc. Edinburgh Math. Soc. 21 (1903), 144–157.
[5] M. B. Nathanson, The Muirhead-Rado inequality, 1: Vector majorization and the permuto-
hedron, arXiv:2109.01746
[6] R. Rado, An inequality, J. London Math. Soc. 27 (1952), 1–6.