Abstract Algebra

NICHOLAS JACKSON

A COURSE IN
ABSTRACT ALGEBRA

DRAFT: JULY 3, 2017
To Abigail and Emilie
We may always depend upon it that
algebra which cannot be translated
into good English and sound common
sense is bad algebra.
William Kingdon Clifford
(1845–1879),
The Common Sense of the Exact Sciences
(1886) 21

Preface

Mathematics is written for mathematicians
Nicolaus Copernicus (1473–1543),
preface to De Revolutionibus Orbium Cœlestium (1543)

Abstract algebra is a fascinating, versatile and powerful subject
with many important applications not just throughout the wider
world of mathematics, but also in several other disciplines as well.
Group theory, the focus of the first seven chapters of this book, is par-
ticularly good at describing symmetry. It thereby enables molecular
chemists and crystallographers to study and understand the structure
and properties of molecules and crystals, and it gives particle physi-
cists valuable insights into the fundamental particles and forces that
form our universe. Finite fields and sporadic groups have important
applications in cryptography and information theory, in the construc-
tion of strong public-key cryptosystems and error-correcting codes for
secure and reliable transmission or storage of data.
But its abstract nature, the thing that makes it so useful and versatile,
can also sometimes render it opaque, dry and difficult to understand
for students meeting the subject for the first time. I struggled consider-
ably with much of this material when I was a student, and there were
certainly several occasions when I lost sight of why we were learning
a particular topic and how it joined up with everything else. The effort
ultimately paid off but I was left wondering whether it needed to be
so difficult.
In some sense, the answer is yes: mathematics requires a level of
formality and rigour greater than probably any other subject, and
sooner or later we must put in the work needed to fully understand
and internalise the concepts and theorems under discussion. To borrow
Euclid's famous attributed comment to the Egyptian pharaoh Ptolemy
I Soter (c.367–c.283 BC), there is no royal road to algebra.
In the introduction to Hermann Hesse's novel The Glass Bead Game, the
narrator remarks that no textbook of the Game will ever be written,
because nobody who has devoted the long years of study necessary to
master it would have any interest in making it easier for anyone else
to do so. There were times when I almost wondered if something similar
was occasionally the case with mathematics.
But in my final year as an undergraduate, trying to understand tensor
calculus and general relativity, I found Ray d'Inverno's excellent textbook
Introducing Einstein's Relativity,1 which convinced me that it was
indeed possible to explain a difficult subject in a lucid, engaging and
readable manner.
1 R A d'Inverno, Introducing Einstein's Relativity, Clarendon Press, Oxford (1992).
Towards the end of an otherwise positive book review, the algebraic
topologist Frank Adams (1930–1989) eloquently (and constructively)
laments the shortcomings of many algebra textbooks:2
However, I do feel impelled to try to say what needs to be said about
a whole way of writing books on algebra, from Van der Waerden to
Cartan and Eilenberg. Look, you people are writing about a subject I
love. I would echo Hardy and say the subject is so attractive that only
extravagant incompetence could make it dull.3 All the same, we have
books which do that. They are austere and they are arid. Why is this?
2 J F Adams, Review of Algebraic K-Theory, Bulletin of the London Mathematical Society 2.2 (1970) 233–238.
3 G H Hardy and E M Wright, An Introduction to the Theory of Numbers, Clarendon Press, Oxford (1938), preface to first edition.
He discusses possible reasons why some mathematics textbooks are
difficult to read, suggesting in particular that the author is often so
convinced of the fascination of their subject that they take it for granted
that their readers will also see it without further prompting.
Early in his book on the psychology of learning mathematics,4 the
mathematical educationalist Richard Skemp (1919–1995) proposes two
basic principles of mathematics education: that mathematical concepts
can only be properly communicated with illustrative examples,
and that mathematical concepts build and depend on each other to a
greater extent than in almost any other subject. An imperfect understanding
of one concept can endanger those that follow from it.
4 R R Skemp, The Psychology of Learning Mathematics, second edition, Penguin (1986), pages 30–33.
In this book I have done my best to present a comprehensive and
readable treatment of all the topics one might reasonably expect to
meet in a standard undergraduate algebra course. Bearing in mind
both of Skemp's main principles, I've tried to introduce most new
definitions and theorems with one or more illustrative and motivating
examples, and to link them together into a coherent narrative. I've
also included some biographical notes to put the subject into some
sort of historical context and also in the hope that readers will find
this material interesting.
Each chapter ends with a short, cross-referenced summary which I
hope will be useful for revision purposes. In addition, I've drawn
inspiration from (that is, stolen) the layout of d'Inverno's book and
those of the statistician and information designer Edward Tufte.5
5 E R Tufte, The Visual Display of Quantitative Information, Graphics Press USA (2001).
I have also included some additional topics that will hopefully be
useful for those wishing to study the subject to a more advanced level,
although there are a number of omissions, and in some cases this
additional material mostly just points the way.

The main aim of this book is to provide a clear, detailed and com-
prehensive account of the topics covered in a typical undergraduate
course on abstract algebra. I hope that I've mostly succeeded, but if
not, then to echo Hardy and Wright I hope that my incompetence at
least hasn't been too extravagant.

How to read this book

On two occasions I have been asked,
"Pray, Mr. Babbage, if you put into
the machine wrong figures, will the
right answers come out?" . . . I am not
able rightly to apprehend the kind of
confusion of ideas that could provoke
such a question.
Charles Babbage (1791–1871),
Passages from the Life of a Philosopher
(1864) 67

How you make use of this book is entirely up to you: there is
no single right way to do so, and your choices will depend on what
you're trying to learn (and why), and whether you're reading this
book as an accompaniment to a lecture course, or studying the subject
independently on your own or in a group. Any obligation you, the
reader, might owe to me, the author, was discharged when you (or
your library) bought this copy of the book.
But I'll take the opportunity now to make a number of suggestions,
which you are free to follow or ignore as you prefer: some general
advice about reading and learning mathematics, and some specific
recommendations for self-contained courses of study.
To properly learn and understand complicated material, simply browsing
through the book in a casual, half-distracted way doesn't generally
work: you have to actively concentrate and try to develop a deep
understanding of each topic and how it relates to the others. This is
sometimes called "active reading" and there are various ways of doing
it, so you should experiment and find one that works for you.
In his classic 1940 work How to Read a Book,6 Mortimer Adler (1902–2001)
identifies four levels of reading: elementary reading, which is
concerned solely with interpreting and parsing the written or printed
words themselves; inspectional reading or skim-reading, in which
the object is to quickly acquire a structural overview of the subject
matter and how the various subtopics relate to each other; analytical
reading, in which our aim is to obtain a detailed and thorough
understanding of the material; and syntopical reading, which is
concerned with developing a deep, comparative understanding of
the broader subject area, and is usually achieved by detailed critical
reading of a number of related texts.
6 M J Adler and C van Doren, How to Read a Book, revised edition, Simon and Schuster (1972).
Of these, the third level is necessary for a full, working understanding
of a subject, while the fourth is required to make any progress with
serious original research.
One practical approach is as follows. Set aside a suitable period of
time in which you can be reasonably sure not to be interrupted, and
put aside any distractions: it often takes a little while to get into the
right frame of mind for concerted study of complicated concepts. But
schedule regular breaks: studies have shown that comprehension and
retention tail off after about forty minutes or so. The Pomodoro
Technique, which has become popular in recent years, advocates
25-minute blocks of concentrated work interspersed with 5-minute
breaks, and longer breaks every couple of hours.
Compile a realistic list of topics you want to cover in the available time,
but don't be discouraged if you don't complete them on schedule:
mathematics is hard, and any progress is good.
Skim-read first to get an overview of the topic at hand, and feel free
to read the end-of-chapter summaries, because that's one of the things
they're there for. Read the main points of the topics on your list, and
get a sense of how it all works. Write down any questions that come to
mind on this initial read-through, or any questions you want to know
the answer to. Figure 1 illustrates the main connections between the
chapters and sections in this book: in general, to properly understand
the topics in a given chapter, you should aim to have at least a decent
working knowledge of most of the material in the chapters or sections
that connect to them.
Now read through again, paying attention to all the details. Take notes.
Work out examples where appropriate. When trying to understand
a definition of a new concept, try to think of at least one example of
something that satisfies the definition. Ideally, try to find three exam-
ples that are as different as possible from each other, and which test
the boundaries of the definition, to understand why it's phrased how
it is. When studying an example, ask yourself what new behaviour
or concept it illustrates, and try to construct a variation on the same
theme. For the proof of a proposition or theorem, first try to get an
overview of how it works, and how the main steps of the argument
fit together. If it's an "if and only if" proof, find out where each half
starts and ends. If it splits into cases, understand how each fits into
the whole. And if part of the proof relies on a contradiction, see what
it is.
When you've done all this, tick that topic, proof or concept off your list
and move onto the next one. Later on, close the book and your notes,
and see if you can remember enough of the main points to rewrite
them from memory.
If you're using this book to supplement lecture notes or another
textbook, then you might find it helpful to compare details and ap-
proaches.
[Figure 1: Interdependence of chapters]


An assertion whose original source I've been unable to trace, but which
is often, plausibly, attributed to George Pólya (1887–1985) states that
"mathematics is not a spectator sport". To this end I have included
a selection of exercises at the end of each chapter. Some are fairly
routine, slight variations on examples in the text, while others are
more involved and require further thought.

Acknowledgements

Bernard of Chartres used to say that
we were as dwarves standing on the
shoulders of giants, so that we could
see more than them, and further; not
because we have sharper sight or are
taller, but because we are raised up and
held aloft by their great height.
John of Salisbury (c.1120–1180),
Metalogicon III:4

You have added much several ways, &
especially in taking the colours of thin
plates into philosophical consideration.
If I have seen further it is by standing
on the sholders of Giants.
Sir Isaac Newton (1642–1727),
Letter to Robert Hooke (1635–1703),
dated 5 February 1676

I would like to thank Keith Mansfield, Clare Charles, Viki Mortimer
and Daniel Taber of Oxford University Press, for all their help, advice
and patience.
Numerous friends and colleagues offered advice, or read and commented
on drafts of this book, and I thank them all for their generosity
and many helpful suggestions, while at the same time accepting sole
responsibility for any remaining errors. A partial list includes John
Aldis, Andrew Brendon-Penn, Gavin Brown, Inna Capdebosq, Mark
Cummings, San Fryer, James Grime, Derek Holt, Heather McCluskey,
Nick Mills, Colin Rourke, Jochen Voss, Charles Walkden, Bruce Westbury
and Colin Wright, and I apologise to anyone I've inadvertently
missed.
Finally, and most of all, I thank my wife Abigail Davies for her encour-
agement and support, and our daughter Emilie, who was born during
the writing of this book, and without whom I might have finished
slightly sooner, but who has made it all seem much more worthwhile.

Credits

Most of the pictures, photographs and quotations in this
book are either in the public domain or released under Creative Commons
or other open licences, with the following exceptions:
(i) The photographs of H S M Coxeter (page 181), Georg Frobenius
(page 230), Felix Klein (page 15), Ludwig Sylow (page 240) and
Hans Zassenhaus (page 254) are reproduced by permission of
the Oberwolfach Research Institute for Mathematics.
(ii) The photograph of Wolfgang Pauli (page 19) is reproduced by
permission of the Wolfgang Pauli Archive, CERN.
(iii) The cryptic crossword clue on page 24 is copyright The Guardian
2001, and is reproduced by permission.
(iv) The quotation from George Whitelaw Mackey on page 344 is re-
produced by permission of the American Philosophical Society.

Nicholas Jackson,
Coventry, July 2017
Contents

Preface
How to read this book
Acknowledgements
Credits
1 Groups
1.1 Numbers
1.2 Matrices
1.3 Symmetries
1.4 Permutations
2 Subgroups
2.1 Groups within groups
2.2 Cosets and Lagrange's Theorem
2.3 Euler's Theorem and Fermat's Little Theorem
3 Normal subgroups
3.1 Cosets and conjugacy classes
3.2 Quotient groups
3.A Simple groups
4 Homomorphisms
4.1 Structure-preserving maps
4.2 Kernels and images
4.3 The Isomorphism Theorems
5 Presentations
5.1 Free groups
5.2 Generators and relations
5.3 Finitely generated abelian groups
5.A Coset enumeration
5.B Transversals
5.C Triangles, braids and reflections
6 Actions
6.1 Symmetries and transformations
6.2 Orbits and stabilisers
6.3 Counting
7 Finite groups
7.1 Sylow's Theorems
7.2 Series of subgroups
7.3 Soluble and nilpotent groups
7.4 Semidirect products
7.5 Extensions
7.A Classification of small finite groups
8 Rings
8.1 Numbers
8.2 Matrices
8.3 Polynomials
8.4 Fields
8.A Modules and representations
9 Ideals
9.1 Subrings
9.2 Homomorphisms and ideals
9.3 Quotient rings
9.4 Prime and maximal ideals
10 Domains
10.1 Euclidean domains
10.2 Divisors, primes and irreducible elements
10.3 Principal ideal domains
10.4 Unique factorisation domains
10.A Quadratic integer rings
11 Polynomials
11.1 Irreducible polynomials
11.2 Field extensions
11.3 Finite fields
11.4 Field automorphisms and the Galois group
11.5 The Galois Correspondence
11.6 Solving equations by radicals
11.A Geometric constructions
A Background
A.1 Sets
A.2 Functions
A.3 Counting and infinity
A.4 Relations
A.5 Number theory
A.6 Real analysis
A.7 Linear algebra
However, there is a pleasure in recog-
nizing old things from a new point
of view. Also, there are problems for
which the new point of view offers a
distinct advantage.
Richard Feynman (1918–1988),
Space-time approach to non-relativistic
quantum mechanics, Reviews of Modern
Physics 20 (1948) 367–387
1 Groups

Our aim in the study of abstract algebra is to consider familiar
algebraic or numerical systems such as the integers or the
real numbers, and distil from them certain sensible, universal properties.
We then ask the question "what other things satisfy some or all
of these properties?" and see if the answers give us any insight into
either the universe around us, or mathematics itself.
In practice, this has been an astonishingly rich approach, yielding
valuable insights not only into almost all branches of pure and applied
mathematics, but large swathes of physics and chemistry as well.
In this chapter, we begin our study of groups: sets equipped with a
single binary operation, satisfying certain basic criteria (associativity,
existence of a distinguished identity element, existence of inverses).
We will study a few different scenarios in which this structure naturally
arises (number systems, matrices, symmetry operations in plane and
solid geometry, and permutations of finite or infinite sets) and the
links between them.

1.1 Numbers

The Tao begets One,
One begets Two,
Two begets Three,
Three begets all things.
Lao Tzu, Tao Te Ching 42:1–4

There are many different ways we could begin our study of
abstract algebra, but perhaps as sensible a place as any is with the set
$\mathbb{N}$ of natural numbers. These are the counting numbers, with which
we represent and enumerate collections of discrete physical objects.
It is the first number system that we learn about in primary school;
in fact, the first number system that developed historically. More
precisely, $\mathbb{N}$ consists of the positive integers:
$$\mathbb{N} = \{1, 2, 3, \ldots\}$$
(By convention, we do not consider zero to be a natural number.)
So, we begin by studying the set $\mathbb{N}$ of natural numbers, and their
properties under the operation of addition.
Perhaps the first thing we notice is that, given two numbers $a, b \in \mathbb{N}$,
it doesn't matter what order we add them in, since we get the same
answer either way round:
$$a + b = b + a \tag{1.1}$$
This property is called commutativity, and we say that addition of
natural numbers is commutative.
Given three numbers $a, b, c \in \mathbb{N}$ to be added together, we have a choice
of which pair to add first: do we calculate $a + b$ and then add $c$ to the
result, or work out $b + c$ and then add the result to $a$? Of course, as
we all learned at an early age, it doesn't matter. That is,
$$(a + b) + c = a + (b + c). \tag{1.2}$$
This property of the addition of natural numbers is also a pretty
fundamental one, and has a special name: associativity. We say that
addition of natural numbers is associative.
These two properties, commutativity and associativity, are particularly
important ones in the study of abstract algebra and between them
will form two cornerstones of our attempts to construct and study
generalised versions of our familiar number systems.
But first of all, let's look carefully at what we have so far. We have a
set, in this case $\mathbb{N}$, and an operation, in this case ordinary addition,
defined on that set. Addition is one of the best-known examples of a
binary operation, which we now define formally.
Definition 1.1 A binary operation defined on a set $S$ is a function
$f : S \times S \to S$.
Here $S \times S$ is the Cartesian product1 of $S$ with itself: the set consisting
of all ordered pairs $(a, b)$ of elements of $S$. In other words, a binary operation
is a function which takes as input an ordered pair of elements
of the chosen set $S$, and gives us in return a single element of $S$.
1 Definition A.7, page 510.
Casting addition of natural numbers in this new terminology, we can
define a function
$$f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}; \quad (a, b) \mapsto (a + b)$$
which maps two natural numbers $a, b \in \mathbb{N}$ to their sum. The commutativity
condition can then be formulated as
$$f(a, b) = f(b, a) \quad \text{for all } a, b \in \mathbb{N}$$
which is perhaps slightly less intuitive than the original statement. But
associativity fares somewhat worse:
$$f(f(a, b), c) = f(a, f(b, c)) \quad \text{for all } a, b, c \in \mathbb{N}$$
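As a rough illustration, which is not part of the original text, the function form of addition can be spot-checked in Python. The name f mirrors the function above; the finite range called sample is an assumption made purely for this sketch, since a program can only test finitely many cases. A successful run is therefore evidence, not a proof.

```python
from itertools import product

def f(a, b):
    """Addition of natural numbers, written as a function of an ordered pair."""
    return a + b

# Assumption: a small finite sample standing in for the infinite set N.
sample = range(1, 21)

# Commutativity in function form: f(a, b) = f(b, a).
commutative = all(f(a, b) == f(b, a) for a, b in product(sample, repeat=2))

# Associativity in function form: f(f(a, b), c) = f(a, f(b, c)).
associative = all(
    f(f(a, b), c) == f(a, f(b, c)) for a, b, c in product(sample, repeat=3)
)

print(commutative, associative)
```

Both checks succeed on the sample, as expected for ordinary addition. Swapping in a different function, such as subtraction on a sample of integers, makes both checks fail.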
With the original statement of associativity of addition, it was fairly
easy to see what was going on: we can ignore parentheses when
calculating sums of three or more natural numbers. Formulating
addition as a function $f(a, b)$ in this way, we gain certain formal
advantages, but we lose the valuable intuitive advantage that our
original notation $a + b$ gives us.
So, to get the best of both worlds, we adopt the following notational
convention: if our function $f$ is a binary operation (rather than just
some other function defined on $S \times S$) we will usually represent it
by some symbol placed between the function's input values (which
we will often refer to as arguments or operands). That is, instead of
writing $f(a, b)$ we write $a \ast b$. In fact, given a binary operation $\ast$, we
will usually adopt the same notation for the function: $\ast : S \times S \to S$.
Definition 1.2 A set $S$ equipped with a binary operation $\ast : S \times S \to S$
is called a magma.
Wikimedia Commons / Commemorative stamp, USSR (1983)
Muḥammad ibn Mūsā al-Khwārizmī (c.780–c.850) was a Persian astronomer
and mathematician whose work had a profound influence on the development
of western mathematics and science during the later medieval period.
His best-known work, al-Kitāb al-mukhtaṣar fī ḥisāb al-jabr wa'l-muqābala
(The Compendious Book on Calculation by Completion and Balancing) describes
general techniques for solving linear and quadratic equations of various types.
Here, al-jabr ("completion") is an operation whereby negative terms are
eliminated from an equation by adding an appropriate positive quantity to both
sides, while wa'l-muqābala ("balancing") is a method for simplifying equations
by subtracting repeated terms from both sides.
Translated into Latin as Liber algebrae et almucabola in 1145 by the English
writer Robert of Chester, the term al-jabr became our word "algebra", while
Algorizmi, the Latinised form of Al-Khwārizmī's name, is the origin of the
word "algorithm".
His other major works include Zīj al-Sindhind (Astronomical Tables of Sindh
and Hind), a collection of astronomical and trigonometrical tables calculated
by methods developed in India, and the Kitāb Ṣūrat al-Arḍ (Book of the
Description of the Earth), a reworking of the Geographia, an atlas written by
the Alexandrian mathematician and astronomer Claudius Ptolemy (c.100–c.170)
in the middle of the second century.

Formalising the above discussion, we give the following two definitions:
Definition 1.3 A binary operation $\ast : S \times S \to S$ defined on a set $S$
is said to be commutative if
$$a \ast b = b \ast a$$
for all $a, b \in S$.
Definition 1.4 A binary operation $\ast : S \times S \to S$ defined on a set $S$
is said to be associative if
$$(a \ast b) \ast c = a \ast (b \ast c)$$
for all $a, b, c \in S$.
At this point it's worth noting that we already have a sophisticated
enough structure with which we can do some interesting mathematics.
Definition 1.5 A semigroup is a set $S$ equipped with an associative
binary operation $\ast : S \times S \to S$. If $\ast$ is also commutative, then $S$ is a
commutative or abelian semigroup.
Semigroup theory is a particularly rich field of study, although a full
treatment is beyond the scope of this book.
There is another obvious binary operation on $\mathbb{N}$ which we typically
learn about shortly after our primary school teachers introduce us to
addition: multiplication. This operation $\cdot : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is also both
commutative and associative, but is clearly different in certain ways
to our original addition operation. In particular, the number 1 has a
special multiplicative property: for any number $a \in \mathbb{N}$, we find that
$$1 \cdot a = a \cdot 1 = a.$$
That is, multiplication by 1 doesn't have any effect on the other number
involved. No other natural number apart from 1 has this property
with respect to multiplication. Also, there is no natural number which
has this property with respect to addition: there exists no $z \in \mathbb{N}$
such that for any other $a \in \mathbb{N}$ we have $z + a = a + z = a$. Of course,
from our knowledge of elementary arithmetic, we know of an obvious
candidate for such an element, which alas doesn't happen to be an
element of the number system $\mathbb{N}$ under investigation.
This leads us to two observations: firstly, that the multiplicative structure
of $\mathbb{N}$ is fundamentally different in at least one way from the
additive structure of $\mathbb{N}$. And secondly, it might be useful to widen
our horizons slightly to consider number systems which have one (or
possibly more) of these special neutral elements.
We will return to the first of these observations, and investigate the
interplay between additive and multiplicative structures in more detail
later, when we study the theory of rings and fields. But now we will
investigate this concept of a neutral element, and in order to do so we
state the following definition.
Definition 1.6 Let $S$ be a set equipped with a binary operation $\ast$.
Then an element $e \in S$ is said to be an identity or neutral element
with respect to $\ast$ if
$$e \ast a = a \ast e = a$$
for all $a \in S$.
So, let's now extend our number system $\mathbb{N}$ to include the additive
identity element 0. Denote this new set $\mathbb{N} \cup \{0\}$ by $\mathbb{N}_0$.
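The search for neutral elements described above can be sketched in Python. This is an illustration under stated assumptions and not part of the original text: the helper neutral_elements and the names N_sample and N0_sample are hypothetical, and the finite ranges merely stand in for the infinite sets.

```python
def neutral_elements(op, elements):
    """Return every e in `elements` with op(e, a) == a == op(a, e) for all a in `elements`."""
    return [e for e in elements
            if all(op(e, a) == a == op(a, e) for a in elements)]

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

# Assumption: finite initial segments standing in for the infinite sets.
N_sample = range(1, 51)    # part of N = {1, 2, 3, ...}
N0_sample = range(0, 51)   # part of N together with 0

mul_neutral_in_N = neutral_elements(mul, N_sample)    # neutral elements for multiplication
add_neutral_in_N = neutral_elements(add, N_sample)    # neutral elements for addition, without 0
add_neutral_in_N0 = neutral_elements(add, N0_sample)  # neutral elements for addition, with 0 adjoined

print(mul_neutral_in_N, add_neutral_in_N, add_neutral_in_N0)
```

On these samples the search finds 1 for multiplication, nothing for addition until 0 is adjoined, and then 0 alone.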
Definition 1.7 A monoid is a semigroup $(S, \ast)$ which has an identity
element. If the binary operation $\ast$ is commutative then we say $S$ is a
commutative or abelian monoid.
Monoids also yield a rich field of mathematical study, and in partic-
ular are relevant in the study of automata and formal languages in
theoretical computer science.
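To hint at the connection with formal languages, here is a small Python sketch that is not taken from the original text. It spot-checks that strings under concatenation, with the empty string as identity, behave like a monoid on a finite sample. The helper is_monoid_on_sample and the list words are hypothetical names chosen for the illustration.

```python
from itertools import product

def is_monoid_on_sample(op, identity, sample):
    """Spot-check associativity and the identity law on a finite sample."""
    sample = list(sample)
    associative = all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(sample, repeat=3)
    )
    has_identity = all(op(identity, a) == a == op(a, identity) for a in sample)
    return associative and has_identity

def concat(u, v):
    return u + v

# Assumption: a handful of words over the alphabet {a, b}.
words = ["", "a", "b", "ab", "ba", "abba"]

strings_ok = is_monoid_on_sample(concat, "", words)
naturals_ok = is_monoid_on_sample(lambda a, b: a + b, 0, range(0, 10))

# Concatenation is not commutative, so this monoid is not abelian.
concat_commutes = concat("a", "b") == concat("b", "a")

print(strings_ok, naturals_ok, concat_commutes)
```

The same checker accepts addition with identity 0 on a sample of non-negative integers, which matches the number system introduced just before Definition 1.7.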
In this book, however, we are primarily interested in certain specialised
forms of these objects, and so we return to our investigation of number
systems. Historically, this system $\mathbb{N}_0$ represented an important conceptual
leap forward, a paradigm shift from the simple enumeration
of discrete, physical objects, allowing the explicit labelling of "nothing",
the case where there are no things to count.2
The mathematical concept of zero has an interesting history, a full
discussion of which is beyond the scope of this book. However, a
fascinating and readable account may be found in the book The Nothing
That Is3 by Robert and Ellen Kaplan.
2 What's red and invisible? No tomatoes. Anonymous
3 R Kaplan and E Kaplan, The Nothing That Is: A Natural History of Zero, Penguin (2000).
But having got as far as the invention of zero, it's not much further
a step to invent negative numbers.4 With a bit of thought, we can
formulate perfectly reasonable questions that can't be answered in
either $\mathbb{N}$ or $\mathbb{N}_0$, such as "what number, when added to 3, gives the
answer 2?"
4 The earliest known treatment of negative numbers occurs in the ancient Chinese text Jiuzhang suanshu (Nine Chapters on the Mathematical Art), which dates from the Han dynasty (202 BC–220 AD), in which positive numbers are represented by red counting rods and negative numbers by black ones.
Attempts to answer such questions, where the need for a consistent
answer is pitted against the apparent lack of a physical interpretation
for the concept, led in this case to the introduction of negative
numbers. This sort of paradigm shift occurs many times throughout
the history of mathematics: we run up against a question which is
unanswerable within our existing context, and then ask "But what if
this question had an answer after all?" This process is often a slow
and painful one, but ultimately leads to an expanded understanding
of the subject at hand. It took somewhere in excess of a thousand
years for the concept of negative numbers to fully catch on. The Greek
mathematician Diophantus, writing in the third century AD, rejected
negative solutions to linear or quadratic equations as absurd. Even as
late as the 16th century, the Italian mathematician Girolamo Cardano
(1501–1576) referred to such numbers as fictae, or fictitious, although his
Italian predecessor Leonardo of Pisa, better known as Fibonacci, had
interpreted them in a financial context as a loss or debit. Meanwhile,
the Indian mathematicians Brahmagupta (598–668) and Mahavira (9th
century) had made the necessary intuitive leap and developed rules
for multiplying negative numbers (although even Mahavira baulked
at considering their square roots). Adjoining negative numbers to $\mathbb{N}_0$
yields the set of integers
$$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}.$$
Figure 1.1: Page from Nine Chapters on the Mathematical Art (Wikimedia Commons)
Later on, we will examine how $\mathbb{Z}$ was extended to construct more
sophisticated number systems, in particular the rational numbers $\mathbb{Q}$,
the real numbers $\mathbb{R}$ and the complex numbers $\mathbb{C}$, but for the moment
this will suffice.
The operation of addition can be extended to the negative integers
in an obvious and consistent way, and we find that all of those tricky
questions involving subtraction can now be solved. More importantly,
it turns out that for any integer $n \in \mathbb{N}_0$ there is a unique negative
integer $-n \in \mathbb{Z}$ such that
$$n + (-n) = 0 = (-n) + n. \tag{1.3}$$
But this also works for the negative integers themselves, as long as we
define
$$-(-n) = n$$
for any integer $n \in \mathbb{Z}$. So, we now have a set $\mathbb{Z}$ equipped with
an associative and commutative binary operation $+ : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$,
together with a designated identity element 0 and, for any number
$n \in \mathbb{Z}$, an inverse element $-n$ satisfying (1.3). More generally, we
have the following.
Definition 1.8 Let $S$ be a set equipped with a binary operation $\ast$
and an identity element $e$. Then for any $a \in S$, an element $a^{-1}$ is said
to be an inverse of $a$ if
$$a \ast a^{-1} = e = a^{-1} \ast a.$$
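Definition 1.8 can be explored with a short Python sketch. It is offered as an illustration only: the helper inverse_in and the finite windows Z_sample and N0_sample are assumptions introduced here and do not come from the text.

```python
def inverse_in(a, elements, op, e):
    """Return some b in `elements` with op(a, b) == e == op(b, a), or None if there is none."""
    for b in elements:
        if op(a, b) == e and op(b, a) == e:
            return b
    return None

def add(a, b):
    return a + b

# Assumption: finite windows standing in for the infinite sets.
Z_sample = range(-10, 11)   # part of the integers
N0_sample = range(0, 11)    # part of the natural numbers together with 0

inv_3_in_Z = inverse_in(3, Z_sample, add, 0)     # additive inverse of 3 among the integers
inv_3_in_N0 = inverse_in(3, N0_sample, add, 0)   # no additive inverse of 3 without negatives
inv_0_in_N0 = inverse_in(0, N0_sample, add, 0)   # 0 is its own additive inverse

print(inv_3_in_Z, inv_3_in_N0, inv_0_in_N0)
```

In the window of integers the additive inverse of 3 is found, whereas without negative numbers only 0 has an additive inverse. That is the gap the passage from $\mathbb{N}_0$ to $\mathbb{Z}$ fills.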
We are now able to define one of the most important structures in
mathematics, whose properties we will study for much of this book.
Definition 1.9 A group $(G, \ast)$ consists of a set $G$ together with a
binary operation $\ast : G \times G \to G$ satisfying the following three criteria.
G1 The binary operation $\ast$ is associative.
G2 There exists an element $e \in G$ (the identity or neutral element)
such that $e \ast g = g \ast e = g$ for all $g \in G$.
G3 For each $g \in G$ there exists an element $g^{-1}$ (the inverse of $g$)
such that $g \ast g^{-1} = g^{-1} \ast g = e$.
Some texts include a fourth criterion:
G0 The set $G$ is closed under the action of $\ast$. That is, for any
$g, h \in G$, it follows that $g \ast h \in G$ too.
However, in our case this is a direct consequence of the way we defined
a binary operation: it is automatically the case that $G$ is closed under
the action of $\ast : G \times G \to G$.
When the group operation is obvious from context, we will often
omit it, writing $gh$ instead of $g \ast h$ for two elements $g, h \in G$. On other
occasions, it may be notationally or conceptually more convenient to
regard the group operation as a type of addition, rather than multiplication.
In that case (especially if the group operation is commutative)
we may choose to use additive notation, writing $g + h$ instead of $g \ast h$
or $gh$, and denoting the inverse of an element $g$ by $-g$ rather than $g^{-1}$.
Although we've used $e$ to denote the identity element of a group, this
is by no means universal, and often we will use a different symbol,
such as $0$ or $1$, or something else depending on the context.
Since a group is really a set with some extra structure, we can reuse
most of the same concepts that we're used to when dealing with sets,
and in particular it's often useful to consider a group's cardinality:
Definition 1.10 The order of a group $G = (G, \ast)$ is the cardinality
$|G|$ of its underlying set. A group is said to be finite (or infinite) if it
has finite (or infinite) order.

(Portrait: Wikimedia Commons / Unknown medieval artist)
Leonardo of Pisa (c.1170–c.1250), commonly known as Fibonacci, is perhaps
best known for the numerical sequence which bears his name, and which he
discussed in his book Liber Abaci (1202) in the context of a simple model of
population growth in rabbits. This sequence, which can be defined by the
recurrence relation $F_{n+2} = F_{n+1} + F_n$ with $F_0 = 0$ and $F_1 = 1$, or by the formula
\[ F_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n, \]
was known to Indian mathematicians as early as the 6th century, and
manifests surprisingly often in the natural world: in the structure of
artichokes and pinecones, and in the spiral arrangement of seeds in sunflowers.
Comparatively little is known of Leonardo himself, and the portrait
above is believed to be a later invention not based on contemporary
sources. The Liber Abaci notes that Leonardo was the son of a customs
official named Guglielmo Bonaccio (the name Fibonacci is a contraction of
filius Bonacci, or son of Bonaccio) and travelled with him to northern Africa.
They spent some time in Bugia (now Bejaia in modern-day Algeria), where
Leonardo recognised the usefulness of recent Arabic advances in mathematics.
After his return to Italy, Leonardo spent some time at the court of the Holy
Roman Emperor Frederick II (1194–1250), who had a keen appreciation of
mathematics and science. During this period he wrote other books, of which
three survive: Practica Geometriæ (1220), Flos (1225) and Liber Quadratorum (1225).
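For a finite set, the criteria in Definition 1.9 can be checked mechanically by trying every combination of elements. The following is a minimal sketch of such a check; the names `is_group` and `add_mod` are our own illustrative choices rather than anything used elsewhere in the text, and the approach only makes sense for small finite sets.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of criteria G0-G3 for a finite set with operation op."""
    elements = list(elements)
    element_set = set(elements)
    # G0: closure (op must land back inside the set)
    if any(op(g, h) not in element_set for g, h in product(elements, repeat=2)):
        return False
    # G1: associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # G2: existence of a two-sided identity element
    identities = [e for e in elements
                  if all(op(e, g) == g == op(g, e) for g in elements)]
    if not identities:
        return False
    e = identities[0]
    # G3: every element has a two-sided inverse
    return all(any(op(g, h) == e == op(h, g) for h in elements)
               for g in elements)


def add_mod(n):
    """Return the operation 'addition modulo n' (hypothetical helper)."""
    return lambda a, b: (a + b) % n


# Addition modulo 4 on {0, 1, 2, 3} satisfies all the criteria.
z4_is_group = is_group(range(4), add_mod(4))
# Subtraction modulo 4 is closed but not associative, so it fails G1.
subtraction_is_group = is_group(range(4), lambda a, b: (a - b) % 4)
# {1, 2, 3} under multiplication modulo 4 is not even closed: 2 * 2 = 0.
not_closed_is_group = is_group([1, 2, 3], lambda a, b: (a * b) % 4)

print(z4_is_group, subtraction_is_group, not_closed_is_group)
```

The check is exhaustive, so its cost grows like the cube of the order of the set; it is intended as an aid to intuition, not as a practical algorithm.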
Our motivating example $\mathbb{Z}$ is an infinite (more precisely, countably
infinite) group, but shortly we will meet several finite examples.
As we saw earlier, many (but not all) well-known binary operations
are commutative. This is certainly the case with the addition operation
in $\mathbb{Z}$, which was the motivating example leading to our study of
groups. So, on the premise that groups with commutative operations
are important (which they are), we give them a special name:
Definition 1.11 An abelian group is a group $G = (G, \ast)$ whose
operation $\ast$ is commutative. That is, $g \ast h = h \ast g$ for all $g, h \in G$.

The first few groups we will meet are all abelian, although in a short
while we will study some examples of nonabelian groups as well.
Our first abelian example is a slight modification of $\mathbb{Z}$, but instead of
taking the infinite set of integers, we take a finite subset, and instead
of using the usual addition operation, we use modular arithmetic:
Example 1.12 (Cyclic groups) Let
\[ \mathbb{Z}_n = \{0, \ldots, n-1\} \]
be the set consisting of the first $n$ non-negative integers, and let
$+ : \mathbb{Z}_n \times \mathbb{Z}_n \to \mathbb{Z}_n$ be addition modulo $n$. That is, for any two
$a, b \in \mathbb{Z}_n$ we define $a+b$ to be the remainder of the integer $a+b \in \mathbb{Z}$
after division by $n$.
This is the cyclic group of order $n$.
We can regard the elements of this group geometrically as $n$ equally-
spaced points around the circumference of a circle, and obtain $a+b$ by
starting at point $a$ and counting $b$ positions clockwise round the circle
to see which number we end up with. See Figure 1.2 for a geometric
depiction of $5 + 9 = 2$ in $\mathbb{Z}_{12}$.

Figure 1.2: Addition in $\mathbb{Z}_{12}$. Here $5 + 9 = 14 \equiv 2 \pmod{12}$

It's natural, when we meet a new mathematical construct, to ask what
the simplest possible example of that construct is. In the case of
groups, the following example answers this question.
Example 1.13 Let $G = \{0\}$ be the set consisting of a single element.
There is only one possible binary operation that can be defined on
this set, namely the one given by $0 \ast 0 = 0$. A routine verification
shows that this operation satisfies all of the group axioms: $0$ is the
identity element, it's its own inverse, and the operation is trivially
associative and commutative. This is the trivial group.
We could denote the trivial group as $\mathbb{Z}_1$, although nobody usually
does: depending on the context we typically use $1$ or $0$ instead. Note
that although we can define a binary operation of sorts (the empty
operation) on the empty set $\emptyset$, we don't get a group structure because
axiom G2 requires the existence of at least one element: the identity.

(Portrait: Wikimedia Commons / Johan Gørbitz (1782–1853))
Abelian groups are named after the Norwegian mathematician Niels Henrik
Abel (1802–1829), whose brilliant career (as well as his life) was cut
tragically short by tuberculosis at the age of 26. At the age of 19 he proved,
independently of his similarly tragic contemporary Évariste Galois (1811–1832),
that the general quintic equation
\[ ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0 \]
cannot be solved by radicals. His monograph on elliptic functions was only
discovered after his death, which occurred two days before the arrival of a
letter appointing him to an academic post in Berlin.
In 2002, to commemorate his bicentenary (and approximately a century after
the idea had originally been proposed) the Norwegian Academy of Sciences
and Letters founded an annual prize in his honour, to recognise stellar
achievement in mathematical research.
For a finite group, especially one of relatively small order, writing
down the multiplication table is often the most effective way of
determining the group structure. This is, as its name suggests, a table of
all possible products of two elements of the group. Table 1.1 depicts
the multiplication table (or, in this case, the addition table) for $\mathbb{Z}_4$.
\[
\begin{array}{c|cccc}
+ & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 1 & 2 & 3 \\
1 & 1 & 2 & 3 & 0 \\
2 & 2 & 3 & 0 & 1 \\
3 & 3 & 0 & 1 & 2
\end{array}
\]
Table 1.1: Multiplication table for $\mathbb{Z}_4$
In this book, we will adopt the convention that the product $a \ast b$ will be
written in the cell where the $a$th row and $b$th column intersect. In the
case where the group under investigation is abelian, its multiplication
table will be symmetric about its leading diagonal, so this convention
is only necessary when we study nonabelian groups.
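The row-and-column convention just described is easy to mirror in code. The sketch below builds the table for $\mathbb{Z}_4$ with the entry for $a \ast b$ in row $a$ and column $b$, and tests the symmetry about the leading diagonal that we expect of an abelian group. The function names `cayley_table` and `is_symmetric` are illustrative assumptions of ours.

```python
def cayley_table(elements, op):
    """Table whose (row a, column b) entry is op(a, b)."""
    elements = list(elements)
    return [[op(a, b) for b in elements] for a in elements]


def is_symmetric(table):
    """True if the table equals its own transpose."""
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))


z4_table = cayley_table(range(4), lambda a, b: (a + b) % 4)
for row in z4_table:
    print(*row)

# A non-commutative operation on the same set gives an asymmetric table.
subtraction_table = cayley_table(range(4), lambda a, b: (a - b) % 4)
print(is_symmetric(z4_table), is_symmetric(subtraction_table))
```

The printed rows should agree with Table 1.1.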

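Example 1.14 Let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$ be a complex cube-root of unity,
that is, a root of the cubic polynomial $z^3 - 1$. The other two roots
of this polynomial are $-\frac{1}{2} - \frac{\sqrt{3}}{2} i = \overline{\omega} = \omega^2$ and $1 = \omega^0 = \omega^3$ (see
Figure 1.3). The multiplication table for the set $C_3 = \{1, \omega, \omega^2\}$ under
ordinary complex multiplication is
\[
\begin{array}{c|ccc}
\cdot & 1 & \omega & \omega^2 \\ \hline
1 & 1 & \omega & \omega^2 \\
\omega & \omega & \omega^2 & 1 \\
\omega^2 & \omega^2 & 1 & \omega
\end{array}
\]
Figure 1.3: Cube roots of unity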
The following is a straightforward but important fact about group
multiplication, which we will need in this and some later chapters.
Proposition 1.15 Let $G = (G, \ast)$ be a group, and $g, h, k \in G$ be any
three elements of $G$. Then the left and right cancellation laws hold:
\begin{align}
g \ast h = g \ast k &\implies h = k \tag{1.4} \\
h \ast g = k \ast g &\implies h = k \tag{1.5}
\end{align}

Proof Suppose $g \ast h = g \ast k$. Multiplying both sides of this equation
on the left by the inverse $g^{-1}$ yields $g^{-1} \ast g \ast h = g^{-1} \ast g \ast k$, which
gives $e \ast h = e \ast k$, hence $h = k$ as required. The right cancellation law
follows by a very similar argument.

Some more book-keeping: at this point we have required the existence
of an identity element $e \in G$, and an inverse $g^{-1}$ for each element
$g \in G$. The following proposition confirms uniqueness of these elements.
Proposition 1.16 The identity element $e$ of a group $G$ is unique. That is,
for any other element $f \in G$ satisfying condition G2 in Definition 1.9, we
have $f = e$.
Any element $g$ of a group $G$ has a unique inverse $g^{-1}$. That is, for any other
element $\bar{g}$ satisfying condition G3 in Definition 1.9, we have $\bar{g} = g^{-1}$.

Proof Suppose that $f \in G$ also satisfies the identity condition, that
for any element $g \in G$, we have $f \ast g = g$ and $g \ast f = g$. In particular,
$f \ast e = e$. But since $e$ is also an identity, we have $f \ast e = f$ as well. So
$f = e$.
Now suppose $g^{-1}$ and $\bar{g}$ are two inverses for an element $g \in G$. Then
$g^{-1} \ast g = \bar{g} \ast g = e$. But by condition G3 we have $g \ast g^{-1} = e$ as well.
So
\[ g^{-1} = e \ast g^{-1} = (\bar{g} \ast g) \ast g^{-1} = \bar{g} \ast (g \ast g^{-1}) = \bar{g} \ast e = \bar{g}. \]
Hence the identity element and the inverse elements are unique.
While on the subject of inverses, it's illuminating to think about what
the inverse of a product of two elements looks like. Given a group
$G$ and two elements $g, h \in G$, the inverse $(g \ast h)^{-1} = h^{-1} \ast g^{-1}$. We
might be tempted to assume $(g \ast h)^{-1} = g^{-1} \ast h^{-1}$ but this is not
the case in general (unless $G$ happens to be abelian, in which case
we can reorder the products to our heart's content). Remember that
$(g \ast h)^{-1}$ has to be the unique element of $G$ which, when multiplied
by $(g \ast h)$, either on the left or the right, gives the identity element $e$.
The following shows that $h^{-1} \ast g^{-1}$ is precisely the element we want:
\begin{align*}
(h^{-1} \ast g^{-1}) \ast (g \ast h) &= h^{-1} \ast (g^{-1} \ast g) \ast h = h^{-1} \ast h = e \\
(g \ast h) \ast (h^{-1} \ast g^{-1}) &= g \ast (h \ast h^{-1}) \ast g^{-1} = g \ast g^{-1} = e
\end{align*}
Another important observation concerning the group $C_3$ in
Example 1.14 is that its multiplication table has essentially the same structure
as the multiplication table for the cyclic group $\mathbb{Z}_3$.
\[
\begin{array}{c|ccc}
+ & 0 & 1 & 2 \\ \hline
0 & 0 & 1 & 2 \\
1 & 1 & 2 & 0 \\
2 & 2 & 0 & 1
\end{array}
\qquad
\begin{array}{c|ccc}
\cdot & 1 & \omega & \omega^2 \\ \hline
1 & 1 & \omega & \omega^2 \\
\omega & \omega & \omega^2 & 1 \\
\omega^2 & \omega^2 & 1 & \omega
\end{array}
\]
Figure 1.4: Cube roots of unity

In the multiplicative group of complex cube-roots of unity, the identity
element is clearly $1$ (since multiplying any complex number by $1$
leaves it unchanged), while in $\mathbb{Z}_3$ it is $0$ (since adding $0$ to any integer
leaves it unchanged). Similarly, $\omega$ and $\omega^2$ behave analogously, under
multiplication, to the integers $1$ and $2$ in modulo-3 arithmetic.
In some sense, these groups are actually the same: apart from some
fairly superficial relabelling, their elements interact in the same way,
and the structure of their multiplication tables is essentially identical.
More explicitly, we have a bijective correspondence
\[ 1 \leftrightarrow 0, \qquad \omega \leftrightarrow 1, \qquad \omega^2 \leftrightarrow 2 \tag{1.6} \]
between the elements of both groups. Actually, this structural
correspondence between $\mathbb{Z}_3$ and the cube roots of unity is to be expected.
Writing $1$, $\omega$ and $\omega^2$ in polar form, using Euler's formula
\[ e^{i\theta} = \cos\theta + i\sin\theta, \]
(see Figure 1.4) we find that
\[ 1 = e^0, \qquad \omega = e^{2\pi i/3}, \qquad \omega^2 = e^{4\pi i/3}. \]
Multiplying any two of these together and using the usual rule for
products of exponents gives us a further insight into what's going on,
and enables us to write down an explicit function
\[ \phi : \mathbb{Z}_3 \to C_3; \qquad k \mapsto e^{2k\pi i/3}. \]
Although the superficial appearance of a group will often give us some
insight into its fundamental nature, we will be primarily interested in
its underlying structure and properties. It would, therefore, be useful
to have some way of saying whether two given groups are equivalent,
and from the above example, the existence of a suitable bijection seems
to be a good place to start. But will any bijection do? What happens
if, instead of the bijection in (1.6), we use the following one?
\[ 1 \leftrightarrow 1, \qquad \omega \leftrightarrow 2, \qquad \omega^2 \leftrightarrow 0 \tag{1.7} \]
The multiplication table for the relabelled group then looks like
\[
\begin{array}{c|ccc}
 & 1 & 2 & 0 \\ \hline
1 & 1 & 2 & 0 \\
2 & 2 & 0 & 1 \\
0 & 0 & 1 & 2
\end{array}
\quad\text{which we then rearrange to}\quad
\begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
0 & 2 & 0 & 1 \\
1 & 0 & 1 & 2 \\
2 & 1 & 2 & 0
\end{array}
\]
which doesn't represent the modulo-3 addition table of the set $\{0, 1, 2\}$.
So not just any bijection will do: the problem here is that the product
operations don't line up properly. However, with the bijection
\[ 1 \leftrightarrow 0, \qquad \omega \leftrightarrow 2, \qquad \omega^2 \leftrightarrow 1 \tag{1.8} \]
the multiplication table for the relabelled group looks like
\[
\begin{array}{c|ccc}
 & 0 & 2 & 1 \\ \hline
0 & 0 & 2 & 1 \\
2 & 2 & 1 & 0 \\
1 & 1 & 0 & 2
\end{array}
\quad\text{which rearranges to}\quad
\begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
0 & 0 & 1 & 2 \\
1 & 1 & 2 & 0 \\
2 & 2 & 0 & 1
\end{array}
\]
which is the same as that for $\mathbb{Z}_3$.
So, we want a bijection which, like (1.8), respects the structure of the
groups involved. More generally, given two groups $G = (G, \ast)$ and
$H = (H, \circ)$ which are structurally equivalent, we want a bijection
$\phi : G \to H$ such that the product in $H$ of the images of any two
elements of $G$ is the same as the image of their product in $G$. This
leads us to the following definition.

(Portrait: Wikimedia Commons / Jakob Emanuel Handmann (1756))
The Swiss mathematician Leonhard Euler (1707–1783) (his surname is
pronounced, roughly, "oiler" rather than "yuler") was one of the most prolific
mathematicians of all time. His contributions to mathematics and physics
include pioneering work on analysis, number theory, graph theory,
astronomy, logic, engineering and optics.
The son of a Calvinist pastor, Euler was tutored in his youth by Johann
Bernoulli (1667–1748), a family friend and eminent mathematician in his own
right, who encouraged him to study mathematics rather than theology at
the University of Basel.
He graduated in 1726 and moved to the Imperial Academy of Sciences in
St Petersburg. At this time, a suspicious political class had reasserted their
control and cut scientific funding: a problem still regrettably common today.
After a near-fatal fever in 1735, Euler's eyesight began to deteriorate, and
he eventually went almost completely blind. He compensated for this with
a prodigious memory (he could recite the Aeneid in its entirety) and his
mathematical productivity was unaffected.
In 1741 he moved to Berlin at the invitation of Frederick the Great of Prussia,
where he stayed for the next twenty-five years. He returned to St Petersburg
in 1766, during the reign of Catherine the Great, where he lived until his
death from a stroke at the age of 76.
His complete works comprise 866 known books, articles and letters. The
publication of a definitive, annotated, collected edition, the Opera Omnia,
began in 1911 and is not yet complete but has so far yielded 76 separate volumes.
Definition 1.17 Two groups $G = (G, \ast)$ and $H = (H, \circ)$ are
isomorphic (written $G \cong H$) if there exists a bijection (an isomorphism)
$\phi : G \to H$ such that
\[ \phi(g_1 \ast g_2) = \phi(g_1) \circ \phi(g_2) \]
for any $g_1, g_2 \in G$.

Later we will consider the more general case of homomorphisms:
functions which respect group structures, but which are not
necessarily bijections. For the moment, however, we are interested in
isomorphisms as an explicit structural equivalence between groups.
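The three correspondences (1.6), (1.7) and (1.8) can be tested directly against Definition 1.17. In the sketch below the elements of $C_3$ are represented by the strings "1", "w" and "w2" (standing for $1$, $\omega$ and $\omega^2$), with products read off from the table in Example 1.14; these labels and the function `is_isomorphism` are illustrative assumptions of ours, not notation from the text.

```python
C3 = ["1", "w", "w2"]
# Rows of the multiplication table from Example 1.14, columns ordered as C3.
C3_ROWS = {
    "1": ["1", "w", "w2"],
    "w": ["w", "w2", "1"],
    "w2": ["w2", "1", "w"],
}
Z3 = [0, 1, 2]


def c3_mul(a, b):
    """Product a * b in C3, looked up in the table."""
    return C3_ROWS[a][C3.index(b)]


def z3_add(a, b):
    """Sum a + b in Z3."""
    return (a + b) % 3


def is_isomorphism(phi, G, op_G, H, op_H):
    """Check that the dict phi is a bijection G -> H respecting the operations."""
    bijective = (set(phi) == set(G)
                 and set(phi.values()) == set(H)
                 and len(G) == len(H))
    return bijective and all(
        phi[op_G(a, b)] == op_H(phi[a], phi[b]) for a in G for b in G
    )


map_1_6 = {"1": 0, "w": 1, "w2": 2}
map_1_7 = {"1": 1, "w": 2, "w2": 0}
map_1_8 = {"1": 0, "w": 2, "w2": 1}

results = {
    name: is_isomorphism(m, C3, c3_mul, Z3, z3_add)
    for name, m in [("1.6", map_1_6), ("1.7", map_1_7), ("1.8", map_1_8)]
}
print(results)
```

Under these assumptions, (1.6) and (1.8) pass the check while (1.7) fails, for instance because it sends the identity $1$ to $1 \in \mathbb{Z}_3$, and $1 + 1 \neq 1$ there.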
In the group $C_3 \cong \mathbb{Z}_3$ from Example 1.14, both primitive roots
$z = \omega, \omega^2$ have the property that $z^3 = 1$, but this is not true for any smaller
power. That is, $n = 3$ is the smallest positive integer for which $z^n = 1$.
Definition 1.18 Let $g$ be an element of a group $G$ with identity
element $e$. Then the order of $g$, denoted $|g|$, is the smallest positive
integer $n$ such that $g^n = e$. If there is no such integer $n$, then we say
that $g$ has infinite order. An element of finite order is sometimes
called a torsion element.
Here, $g^n$ denotes the $n$th power $g \ast \cdots \ast g$ of $g$. If, as in the case of the
cyclic groups $\mathbb{Z}_n = (\mathbb{Z}_n, +)$, we are using additive notation, then we
would replace $g^n$ with $ng$ in the above definition.
Considering the group $C_3$, we remark that $|\omega| = |\omega^2| = 3$ but $|1| = 1$.
In the case of $\mathbb{Z}_4$, the orders of the four elements are given by
\[ |1| = |3| = 4, \qquad |2| = 2, \qquad |0| = 1. \]
For the order-12 cyclic group $\mathbb{Z}_{12}$, the orders are
\[ |1| = |5| = |7| = |11| = 12, \qquad |2| = |10| = 6, \qquad |3| = |9| = 4, \]
\[ |4| = |8| = 3, \qquad |6| = 2, \qquad |0| = 1. \]
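The orders listed above can be recomputed straight from Definition 1.18, in its additive form: keep adding $k$ to itself modulo $n$ and count how many summands are needed to reach $0$. The function name `order_in_Zn` below is a hypothetical choice of ours.

```python
def order_in_Zn(k, n):
    """Smallest positive m with m*k congruent to 0 modulo n."""
    total, count = k % n, 1
    while total != 0:
        total = (total + k) % n
        count += 1
    return count


orders_z4 = {k: order_in_Zn(k, 4) for k in range(4)}
orders_z12 = {k: order_in_Zn(k, 12) for k in range(12)}
print(orders_z4)
print(orders_z12)
```

The two dictionaries should reproduce the values stated for $\mathbb{Z}_4$ and $\mathbb{Z}_{12}$.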
In all three of these examples we see that the order of the identity
element is 1, and furthermore that no other element apart from the
identity has order 1. This is true in general:
Proposition 1.19 Let $G$ be a group with identity element $e$. Then for any
element $g \in G$ it follows that $|g| = 1$ if and only if $g = e$.

Proof The order of $e$ is always 1, since 1 is the smallest positive integer
$n$ for which $e^n = e$. Conversely, if $g^1 = e$ then $g = e$.
The simplest nontrivial case is that of a group where all the
non-identity elements have order 2:
Proposition 1.20 Let $G$ be a nontrivial group, all of whose elements apart
from the identity have order 2. Then $G$ is abelian.

Proof Let $g, h \in G$. Then by the hypothesis, $g^2 = h^2 = e$, and so
$g = g^{-1}$ and $h = h^{-1}$. It follows that
\[ g \ast h = g^{-1} \ast h^{-1} = (h \ast g)^{-1} = h \ast g \]
as required.
Given that an isomorphism preserves at least some aspects of the
structure of a group, it is reasonable to ask how it affects the order of
a given element. The answer is that it leaves it unchanged:
Proposition 1.21 If $\phi : G \to H$ is an isomorphism, then $|\phi(g)| = |g|$
for any $g \in G$.

Proof Suppose $g$ has order $n$ in $G$. Then $g^n = g \ast \cdots \ast g = e_G$.
Therefore
\[ \phi(g)^n = \phi(g) \circ \cdots \circ \phi(g) = \phi(g \ast \cdots \ast g) = \phi(g^n) = \phi(e_G) = e_H. \]
We must now check that $n$ is the smallest positive integer such that
$\phi(g)^n = e_H$. Suppose that there exists some $1 \leq k < n$ such that
$\phi(g)^k = e_H$. Then
\[ \phi(g)^k = \phi(g) \circ \cdots \circ \phi(g) = \phi(g \ast \cdots \ast g) = \phi(g^k). \]
But $g^k \neq e_G$, so, since $\phi$ is injective, $\phi(g^k) \neq \phi(e_G) = e_H$, which
contradicts the assertion that $\phi(g)^k = e_H$ for some $k < n$. Therefore $n$ is
after all the smallest positive integer such that $\phi(g)^n = e_H$, and hence
$|\phi(g)| = |g| = n$.
As it happens, we can construct the group $\mathbb{Z}_n$ by starting with the
identity element $0$ and repeatedly adding $1$ to it (subject to modulo-$n$
arithmetic). This process yields all elements of $\mathbb{Z}_n$; if we threw away
all of $\mathbb{Z}_n$ except for $0$, $1$ and the addition operation, we could rebuild
it by just adding $1$ to itself enough times.
We say that the element $1$ generates the group $\mathbb{Z}_n$, and we write
$\langle 1 \rangle = \mathbb{Z}_n$. More generally:
Definition 1.22 Suppose that $g_1, g_2, \ldots \in G$ are a (possibly infinite)
collection of elements of some group $G$. Denote by $\langle g_1, g_2, \ldots \rangle$ the
set of elements of $G$ which can be formed by arbitrary finite products
of the elements $g_1, g_2, \ldots$ and their inverses. If $G = \langle g_1, g_2, \ldots \rangle$ then
we say that the elements $g_1, g_2, \ldots$ generate $G$, or are generators for
$G$. If a group $G$ is generated by a finite number of such elements, it
is said to be finitely generated.
The finite cyclic groups $\mathbb{Z}_n$ can be generated by a single element, as
can the (infinite) group $\mathbb{Z}$. This leads to the following definition:
Definition 1.23 A group which can be generated by a single element
is said to be cyclic.
We will sometimes refer to $\mathbb{Z}$ as the infinite cyclic group. Up to
isomorphism, there is only one cyclic group of a given order:
Proposition 1.24 Suppose that $G$ and $H$ are (finite or infinite) cyclic
groups with $|G| = |H|$. Then $G \cong H$.

Proof Consider the infinite case first: $G = \langle g \rangle = \{g^k : k \in \mathbb{Z}\}$ and
$H = \langle h \rangle = \{h^k : k \in \mathbb{Z}\}$. The elements $g^j$ and $g^k$ are distinct (that is,
$g^j \neq g^k$) if $j \neq k$. Hence the function $\phi : G \to H$ defined by $\phi(g^k) = h^k$
is a bijection. It's also an isomorphism of groups, because
\[ \phi(g^j \ast g^k) = \phi(g^{j+k}) = h^{j+k} = h^j \circ h^k = \phi(g^j) \circ \phi(g^k) \]
for any $j, k \in \mathbb{Z}$. Thus $G \cong H$.
The finite case is very similar. Let $G = \langle g \rangle = \{g^k : k = 0, \ldots, n-1\}$
and $H = \langle h \rangle = \{h^k : k = 0, \ldots, n-1\}$, so that $|G| = |H| = n$. Then the
map $\phi : G \to H$ defined by $\phi(g^k) = h^k$ is also a bijection (since as in the
infinite case, the elements $g^k$ of $G$ are distinct for all $k = 0, \ldots, n-1$). It
also satisfies the condition $\phi(g^j \ast g^k) = \phi(g^j) \circ \phi(g^k)$ for all
$0 \leq j, k \leq n-1$, and so $G \cong H$.
In particular, the finite cyclic groups in Example 1.12 are precisely
the finite groups satisfying Definition 1.23. We also can derive a few
general results about the order of generators in cyclic groups:
Proposition 1.25 Let $G = \langle g \rangle$ be a cyclic group. If $G$ is an infinite cyclic
group, $g$ has infinite order; if $G$ is a finite cyclic group with $|G| = n$ then
$|g| = n$.

Proof If the order $|g|$ of the generator $g$ is finite, equal to some
$k \in \mathbb{N}$, say, then $g^m = g^{m+k} = g^{m+tk}$ for all integers $t$ and $m$. So
$G = \langle g \rangle = \{g^i : i \in \mathbb{Z}\}$ contains at most $k$ elements. This proves the
first statement, since if $G$ is an infinite cyclic group, then $g$ can't have
finite order: if it did, $G$ could only have a finite number of elements.
It almost proves the second statement too; all that remains is to show
that a finite cyclic group $G$ whose generator $g$ has order $k$ contains
exactly $k$ elements. This follows from the observation that $g^m = e$ if
and only if $m$ is an integer multiple of $k$, and hence $G$ must contain at
least $k$ elements. Therefore $G$ has exactly $k$ elements.
It is also interesting to ask which elements generate all of $\mathbb{Z}_n$.
Proposition 1.26 Let $k \in \mathbb{Z}_n$. Then $k$ is a generator for $\mathbb{Z}_n$ if and only
if $\gcd(k, n) = 1$; that is, if $k$ and $n$ are coprime.

Proof From Proposition 1.25, we know that $k$ generates all of $\mathbb{Z}_n$
if and only if it has order $n$. In other words, the smallest positive
integer $m$ such that $n \mid mk$ is $n$ itself, which is the same as saying that
$\gcd(k, n) = 1$, that is, $k$ and $n$ are coprime.

This proposition tells us that any integer $k \in \{0, \ldots, n-1\}$ that is
coprime to $n$ can generate the entirety of the finite cyclic group $\mathbb{Z}_n$.
The number of possible generators of $\mathbb{Z}_n$ is sometimes denoted $\varphi(n)$;
this is Euler's totient function.
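Proposition 1.26 lends itself to a quick numerical check. The sketch below finds the generators of $\mathbb{Z}_n$ by brute force, listing the multiples of each $k$ until they repeat, and compares the result with the coprimality criterion and with a direct count for the totient function. The function names are illustrative assumptions of ours; `math.gcd` is the standard-library greatest common divisor.

```python
from math import gcd


def generated_subset(k, n):
    """The set {0, k, 2k, ...} of multiples of k, reduced modulo n."""
    seen, total = set(), 0
    while total not in seen:
        seen.add(total)
        total = (total + k) % n
    return seen


def generators(n):
    """Elements of Z_n whose multiples exhaust the whole group."""
    return [k for k in range(n) if len(generated_subset(k, n)) == n]


def totient(n):
    """Number of integers in 1..n that are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


gens_z12 = generators(12)
print(gens_z12, totient(12))

# Compare brute force with the gcd criterion for a range of small n.
criterion_holds = all(
    generators(n) == [k for k in range(n) if gcd(k, n) == 1]
    for n in range(1, 30)
)
print(criterion_holds)
```

For $n = 12$ the generators found this way are exactly the four elements of order 12 listed earlier.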
Given two sets $X$ and $Y$, we can form their cartesian product $X \times Y$,
which we define to be the set
\[ X \times Y = \{(x, y) : x \in X, y \in Y\} \]
of ordered pairs of elements of $X$ and $Y$. Since a group is,
fundamentally, a set with some additional structure defined on it, can we use
this cartesian product operation to make new groups by combining
two or more smaller ones in a similar way?⁵ The answer is yes:
⁵ There are various other group structures that can be defined on the set
$G \times H$. Two particularly important examples are the semidirect product
$H \rtimes G$ and the wreath product $H \wr G$, both of which we will meet in
Section 7.4.
Definition 1.27 Given two groups $G = (G, \ast)$ and $H = (H, \circ)$, their
direct product is the group $G \times H = (G \times H, \bullet)$ whose underlying
set is the cartesian product $G \times H$ of the sets $G$ and $H$, and group
operation $\bullet$ given by
\[ (g_1, h_1) \bullet (g_2, h_2) = (g_1 \ast g_2, h_1 \circ h_2). \]
If the groups $G$ and $H$ are written in additive notation (especially
if they are abelian) then we call the corresponding group the direct
sum and denote it $G \oplus H$.
Before proceeding any further, we must first check that this operation
does in fact yield a group, rather than just a set with some insufficiently
group-like structure defined on it. Fortunately, it does:
Proposition 1.28 Given two groups $G = (G, \ast)$ and $H = (H, \circ)$, their
direct product $G \times H$ is also a group, and is abelian if and only if $G$ and $H$
both are.
Proof This is fairly straightforward, and really just requires us to
check the group axioms. We begin with associativity:
\begin{align*}
((g_1, h_1) \bullet (g_2, h_2)) \bullet (g_3, h_3) &= (g_1 \ast g_2, h_1 \circ h_2) \bullet (g_3, h_3) \\
&= ((g_1 \ast g_2) \ast g_3, (h_1 \circ h_2) \circ h_3) \\
&= (g_1 \ast (g_2 \ast g_3), h_1 \circ (h_2 \circ h_3)) \\
&= (g_1, h_1) \bullet (g_2 \ast g_3, h_2 \circ h_3) \\
&= (g_1, h_1) \bullet ((g_2, h_2) \bullet (g_3, h_3))
\end{align*}
for all $g_1, g_2, g_3 \in G$ and $h_1, h_2, h_3 \in H$. Next we need to verify the
existence of an identity element. If $e_G$ is the identity element in $G$ and
$e_H$ is the identity element in $H$, then $e_{G \times H} = (e_G, e_H)$ is the identity
element in $G \times H$, since $(e_G, e_H) \bullet (g, h) = (e_G \ast g, e_H \circ h) = (g, h)$ and
$(g, h) \bullet (e_G, e_H) = (g \ast e_G, h \circ e_H) = (g, h)$ for any $g \in G$ and $h \in H$.
We define the inverses in $G \times H$ in a similar way, with $(g, h)^{-1} =
(g^{-1}, h^{-1})$, since $(g^{-1}, h^{-1}) \bullet (g, h) = (g^{-1} \ast g, h^{-1} \circ h) = (e_G, e_H)$ and
$(g, h) \bullet (g^{-1}, h^{-1}) = (g \ast g^{-1}, h \circ h^{-1}) = (e_G, e_H)$. This completes the
proof that $G \times H$ is a group. To prove the second part, we observe that
\begin{align*}
(g_1, h_1) \bullet (g_2, h_2) &= (g_1 \ast g_2, h_1 \circ h_2) \\
\text{and}\quad (g_2, h_2) \bullet (g_1, h_1) &= (g_2 \ast g_1, h_2 \circ h_1).
\end{align*}
These expressions are equal (and hence $G \times H$ is abelian) if and only if
both $G$ and $H$ are abelian.
Example 1.29 The group $\mathbb{Z}_2 \oplus \mathbb{Z}_3$ has the multiplication table
\[
\begin{array}{c|cccccc}
 & (0,0) & (0,1) & (0,2) & (1,0) & (1,1) & (1,2) \\ \hline
(0,0) & (0,0) & (0,1) & (0,2) & (1,0) & (1,1) & (1,2) \\
(0,1) & (0,1) & (0,2) & (0,0) & (1,1) & (1,2) & (1,0) \\
(0,2) & (0,2) & (0,0) & (0,1) & (1,2) & (1,0) & (1,1) \\
(1,0) & (1,0) & (1,1) & (1,2) & (0,0) & (0,1) & (0,2) \\
(1,1) & (1,1) & (1,2) & (1,0) & (0,1) & (0,2) & (0,0) \\
(1,2) & (1,2) & (1,0) & (1,1) & (0,2) & (0,0) & (0,1)
\end{array}
\]
This group is isomorphic to the cyclic group $\mathbb{Z}_6$, by the isomorphism
\begin{align*}
(0,0) &\mapsto 0 & (1,1) &\mapsto 1 & (0,2) &\mapsto 2 \\
(1,0) &\mapsto 3 & (0,1) &\mapsto 4 & (1,2) &\mapsto 5
\end{align*}
which sends the $k$th multiple of $(1,1)$ to $k$.
Alternatively, note that $\mathbb{Z}_2 \oplus \mathbb{Z}_3 = \langle (1,1) \rangle$ and so must be cyclic by
Definition 1.23, and since it has six elements, it must be isomorphic
to $\mathbb{Z}_6$ by Proposition 1.24.

Example 1.30 The multiplication table for the group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is
\[
\begin{array}{c|cccc}
 & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline
(0,0) & (0,0) & (0,1) & (1,0) & (1,1) \\
(0,1) & (0,1) & (0,0) & (1,1) & (1,0) \\
(1,0) & (1,0) & (1,1) & (0,0) & (0,1) \\
(1,1) & (1,1) & (1,0) & (0,1) & (0,0)
\end{array}
\]
This group is not cyclic: every nontrivial element has order 2, so no
single element can generate the entire group. Hence $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \not\cong \mathbb{Z}_4$.
The group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is named the Klein 4-group or Viergruppe after
the German mathematician Felix Klein (1849–1925). We will see in a
little while that it describes the symmetry of a rectangle.

Example 1.31 The group $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ has eight elements, as do the
groups $\mathbb{Z}_8$ and $\mathbb{Z}_4 \oplus \mathbb{Z}_2$. If we write out the multiplication tables,
though, we find that there are important structural differences
between them. Neither $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ nor $\mathbb{Z}_4 \oplus \mathbb{Z}_2$ is cyclic, since no
single element generates the entire group. Nor are they isomorphic
to each other, since $\mathbb{Z}_4 \oplus \mathbb{Z}_2 = \langle (0,1), (1,0) \rangle$ but no two elements of
$\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ generate the entire group.
Alternatively, there is no isomorphism $\phi : \mathbb{Z}_4 \oplus \mathbb{Z}_2 \to \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$,
since by Proposition 1.21, any such isomorphism must preserve the
order of each element; however the element $(1, 0) \in \mathbb{Z}_4 \oplus \mathbb{Z}_2$ has
order 4, but there is no order-4 element in $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ for it to map
to, since every nontrivial element in the latter group has order 2.
A similar argument shows that $\mathbb{Z}_8$ (which has four order-8 elements)
cannot be isomorphic to either of the other two groups.

(Portrait: Oberwolfach Photo Collection)
The German mathematician Felix Klein (1849–1925) began his research
career under the supervision of the physicist and geometer Julius Plücker
(1801–1868), receiving his doctorate from the University of Bonn in 1868,
and completing Plücker's treatise on line geometry after the latter's death.
Appointed professor at the University of Erlangen in 1872 at the age of 23,
Klein instigated his Erlangen Programme to classify geometries via their
underlying symmetry groups. In 1886 he was appointed to a chair at Göttingen,
where he remained until his retirement in 1913, doing much to re-establish it as
the world's leading centre for research in mathematics.
He also did much to further academic prospects for women. In 1895 his
student Grace Chisholm Young (1868–1944) became the first woman to receive
a doctorate by thesis from any German university (the Russian mathematician
Sofya Kovalevskaya (1850–1891) was awarded hers for published work in
1874). Klein was also instrumental, along with David Hilbert (1862–1943),
in securing an academic post for the algebraist Emmy Noether (1882–1935).
The Klein bottle, a non-orientable closed surface which cannot be
embedded in 3-dimensional space, bears his name.

Why is $\mathbb{Z}_2 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_6$ but $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \not\cong \mathbb{Z}_4$, and why are none of the
order-8 groups in the above example isomorphic to each other? The
answer is given by the following proposition.
Proposition 1.32 $\mathbb{Z}_m \oplus \mathbb{Z}_n \cong \mathbb{Z}_{mn}$ if and only if $\gcd(m, n) = 1$; that
is, $m$ and $n$ are coprime.

Proof The key is to examine the subset $\langle (1, 1) \rangle$ of $\mathbb{Z}_m \oplus \mathbb{Z}_n$ generated
by the element $(1, 1)$. This is the set of all ordered pairs from $\mathbb{Z}_m \oplus \mathbb{Z}_n$
that we get by adding together a finite number of copies of $(1, 1)$. Since
both $m$ and $n$ are finite, the first coordinate will cycle back round to $0$
after $m$ additions, while the second will do so after $n$ additions.
Suppose that $m$ and $n$ are coprime. Then after $m$ additions the first
coordinate will be $0$ but the second won't. Similarly, after $n$ additions
the second coordinate will be $0$ but the first won't. To get back to $(0, 0)$
we need to add $(1, 1)$ to itself a number of times that contains both
$m$ and $n$ as factors; the smallest such number is the lowest common
multiple $\operatorname{lcm}(m, n)$. Since $m$ and $n$ are coprime, $\operatorname{lcm}(m, n) = mn$, so
this subset $\langle (1, 1) \rangle$ has $mn$ distinct elements. But $\mathbb{Z}_m \oplus \mathbb{Z}_n$ also has $mn$
elements, thus $(1, 1)$ generates the entirety of $\mathbb{Z}_m \oplus \mathbb{Z}_n$. So $\mathbb{Z}_m \oplus \mathbb{Z}_n$ is
cyclic and is hence isomorphic to $\mathbb{Z}_{mn}$ by Proposition 1.24.
To prove the converse, suppose that $\gcd(m, n) = d > 1$. Then both $m$
and $n$ are divisible by $d$, and so $\operatorname{lcm}(m, n) = \frac{mn}{d}$. This means that the
subset $\langle (1, 1) \rangle$ has $\frac{mn}{d}$ elements and hence can't be isomorphic to $\mathbb{Z}_{mn}$
but instead is isomorphic to $\mathbb{Z}_{mn/d}$.
In fact, no element $(a, b) \in \mathbb{Z}_m \oplus \mathbb{Z}_n$ can generate the whole of the
group, since
\[ \underbrace{(a, b) + \cdots + (a, b)}_{mn/d} = \left(\tfrac{mn}{d} a, \tfrac{mn}{d} b\right) = \left(\tfrac{n}{d} ma, \tfrac{m}{d} nb\right) = (0, 0). \]
So $\mathbb{Z}_m \oplus \mathbb{Z}_n$ isn't cyclic and can't be isomorphic to $\mathbb{Z}_{mn}$.
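Proposition 1.32 can be checked by brute force for small values of $m$ and $n$: the group $\mathbb{Z}_m \oplus \mathbb{Z}_n$ is cyclic exactly when some element has order $mn$. The following sketch, whose function names are our own illustrative assumptions, computes element orders by repeated addition and compares the outcome with the coprimality condition.

```python
from itertools import product
from math import gcd


def element_order(a, b, m, n):
    """Order of (a, b) in Z_m (+) Z_n, found by repeated addition."""
    x, y, count = a % m, b % n, 1
    while (x, y) != (0, 0):
        x, y = (x + a) % m, (y + b) % n
        count += 1
    return count


def is_cyclic(m, n):
    """True if some single element generates all of Z_m (+) Z_n."""
    return any(element_order(a, b, m, n) == m * n
               for a, b in product(range(m), range(n)))


# The examples discussed in the text.
print(is_cyclic(2, 3), is_cyclic(2, 2), is_cyclic(4, 2))

# Multiples of (1, 1) in Z_2 (+) Z_3, giving an isomorphism k -> k(1, 1) from Z_6.
multiples_of_1_1 = [(k % 2, k % 3) for k in range(6)]
print(multiples_of_1_1)

# Compare with the gcd criterion for all 1 <= m, n <= 8.
proposition_holds = all(
    is_cyclic(m, n) == (gcd(m, n) == 1)
    for m in range(1, 9) for n in range(1, 9)
)
print(proposition_holds)
```

A finite search of this kind is no substitute for the proof, but it does confirm the pattern behind Examples 1.29 to 1.31.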


In Section 5.3 we'll use this result as part of the classification theorem
for finitely-generated abelian groups,⁶ ⁷ and in Section 9.1 we will use
it to help prove the Chinese Remainder Theorem for rings.⁸
⁶ Theorem 5.48, page 171.
⁷ Theorem 5.63, page 180.
⁸ Proposition 9.8, page 364.

1.2 Matrices

If only I knew how to get mathematicians interested in transformation
groups and their applications to differential equations. I am certain,
absolutely certain, that these theories will some time in the future be
recognised as fundamental. When I wish such a recognition sooner, it is
partly because then I could accomplish ten times more.
Marius Sophus Lie (1842–1899),
letter written in 1884 to Adolf Mayer (1843–1942)

All of the concrete examples of groups we've met so far have
been numerical in nature, and also abelian. In this section we will see
some examples of nonabelian groups formed from matrices.
Multiplication of $n \times n$ square matrices is well-defined and associative,
and the $n \times n$ identity matrix
\[ I_n = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix} \]
satisfies the identity condition. Not every square matrix has an inverse,
but as long as we restrict our attention to those that do, we should be
able to find some interesting examples of groups.
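Example 1.33 The matrices
\[
I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A = \begin{bmatrix} -\frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix}, \qquad
B = \begin{bmatrix} -\frac{1}{2} & \frac{\sqrt{3}}{2} \\ -\frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix}
\]
form a group with multiplication table
\[
\begin{array}{c|ccc}
\cdot & I & A & B \\ \hline
I & I & A & B \\
A & A & B & I \\
B & B & I & A
\end{array}
\]
which is clearly isomorphic to the table for the group $\mathbb{Z}_3$ that we met
a little while ago, under the isomorphism
\[ I \mapsto 0, \qquad A \mapsto 1, \qquad B \mapsto 2. \]

(Margin note) This matrix group $\{I, A, B\}$ is an example of quite a powerful
and important concept called a representation: broadly speaking, a formulation
of a given group as a collection of matrices which behave in the same way. We
will study this idea in a little more detail in Section 8.A.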
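The multiplication table in Example 1.33 can be confirmed numerically. The sketch below assumes that $A$ is the matrix of an anticlockwise rotation through $2\pi/3$ and that $B$ is its transpose (a rotation through $4\pi/3$), which is our reading of the entries; the helper names `matmul` and `close` are hypothetical. Because $\sqrt{3}/2$ is not exactly representable in floating point, products are compared up to a small tolerance.

```python
from math import isclose, sqrt


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of 2x2 matrices up to an absolute tolerance."""
    return all(isclose(X[i][j], Y[i][j], abs_tol=tol)
               for i in range(2) for j in range(2))


s = sqrt(3) / 2
I = [[1.0, 0.0], [0.0, 1.0]]
A = [[-0.5, -s], [s, -0.5]]   # assumed: rotation through 2*pi/3
B = [[-0.5, s], [-s, -0.5]]   # assumed: rotation through 4*pi/3

names = {"I": I, "A": A, "B": B}


def identify(M):
    """Name of whichever of I, A, B the matrix M agrees with, or None."""
    return next((name for name, N in names.items() if close(M, N)), None)


table = {(x, y): identify(matmul(names[x], names[y]))
         for x in names for y in names}
for x in names:
    print(x, [table[(x, y)] for y in names])
```

Each row printed should match the corresponding row of the table in the example, with $A^2 = B$ and $AB = BA = I$.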

We now use the fact that matrix multiplication is, in general,
noncommutative to give us our first examples of nonabelian groups.
Example 1.34 Let $\mathrm{GL}_n(\mathbb{R})$ denote the set of invertible, $n \times n$ matrices
with real entries. This set forms a group (the general linear group)
under the usual matrix multiplication operation.

The set $\mathrm{GL}_n(\mathbb{R})$ is closed under multiplication (since any two invertible
matrices $A$ and $B$ form a product $AB$ with inverse $B^{-1}A^{-1}$), matrix
multiplication is associative, the $n \times n$ identity matrix $I_n \in \mathrm{GL}_n(\mathbb{R})$,
and each matrix $A \in \mathrm{GL}_n(\mathbb{R})$ has an inverse $A^{-1} \in \mathrm{GL}_n(\mathbb{R})$. We can
specialise this example to give three other (also, in general, nonabelian)
groups which are contained within $\mathrm{GL}_n(\mathbb{R})$:
Example 1.35 The special linear group $\mathrm{SL}_n(\mathbb{R})$ is the group of $n \times n$
real matrices with determinant 1.
18 a course in abstract algebra

Example 1.36 Let $O_n(\mathbb{R})$ denote the set of $n \times n$ orthogonal real
matrices; that is, invertible matrices $A$ whose inverse is equal to
their transpose, $A^{-1} = A^T$. This set forms a group (the orthogonal
group) under matrix multiplication.

Example 1.37 The special orthogonal group $SO_n(\mathbb{R})$ is the group
of $n \times n$ orthogonal real matrices with determinant 1.

There's nothing in particular restricting us to real matrices either.
The following two examples, of matrices with complex entries, have
particular relevance to quantum mechanics.
Example 1.38 Let $U_n$ denote the set of $n \times n$ unitary matrices; that
is, invertible matrices $A$ whose inverse is equal to their conjugate
transpose $A^\dagger$, the matrix formed by taking the transpose $A^T$ and
replacing every entry $a_{ij}$ with its complex conjugate $\overline{a_{ij}}$. This set $U_n$
forms a group (the unitary group) under matrix multiplication.

The unitary group $U_1$ is isomorphic to the circle group: the
multiplicative group of complex numbers $z$ with unit modulus $|z| = 1$.
Example 1.39 Let $SU_n$ denote the group of $n \times n$ unitary matrices
with determinant 1. This is the special unitary group.

In the Standard Model of particle physics, $U_1$ describes the quantum
behaviour of the electromagnetic force, $SU_2$ the weak nuclear force,
and $SU_3$ the strong nuclear force.

[Margin note (photograph: Wikimedia Commons / Ludwik Szaciński (1844–1894)):
After graduating in 1865 from the University of Christiania (now Oslo) with
a general science degree, Marius Sophus Lie (1842–1899) dabbled in various
subjects, including astronomy, zoology and botany, before settling on
mathematics. In 1869, on the strength of his first published paper,
Repräsentation der Imaginären der Plangeometrie, he won a scholarship to
travel to Berlin and Paris, where he met and worked with several important
mathematicians including Felix Klein (1849–1925) and Camille Jordan
(1838–1922).
After the outbreak of the Franco-Prussian War in July 1870, he left for
Italy but was arrested in Fontainebleau on suspicion of being a German spy
and was only released due to the intervention of the French mathematician
Jean Gaston Darboux (1842–1917).
Returning to Norway, he was awarded a doctorate by the University of
Christiania in July 1872, for a thesis entitled On a class of geometric
transformations, and subsequently appointed to a chair.
Over the next few decades he made many important contributions to geometry
and algebra, many in collaboration with Klein and Friedrich Engel
(1861–1941). In 1886 he moved to Leipzig, succeeding Klein (who had just
been appointed to a chair at Göttingen) but suffered a nervous breakdown in
late 1889. This, together with his deteriorating physical health and the
acrimonious disintegration of his professional relationships with Klein and
Engel, overshadowed the rest of his career. He returned to Christiania in
1898, and died from pernicious anaemia in 1899.]

Example 1.40 A symplectic matrix is a $2n \times 2n$ square matrix $M$
satisfying the condition
\[ M^T J_{2n} M = J_{2n}, \]
where $J_{2n}$ is the $2n \times 2n$ block matrix
\[ J_{2n} = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}, \]
with $I_n$ denoting the $n \times n$ identity matrix. So, for example,
\[ J_4 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}. \]
Equivalently, a $2n \times 2n$ block matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$ is symplectic if the $n \times n$
submatrices $A$, $B$, $C$ and $D$ satisfy the conditions
\[ A^T C = C^T A, \qquad A^T D - C^T B = I, \]
\[ B^T D = D^T B, \qquad D^T A - B^T C = I. \]
The symplectic group $Sp_{2n}(\mathbb{R})$ is composed of all $2n \times 2n$ symplectic
matrices with real entries.
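The symplectic condition $M^T J_{2n} M = J_{2n}$ is easy to test mechanically. The sketch below is an illustration only: it uses integer matrices stored as nested lists so that equality is exact, and the test matrices `shear` and `scaled` are examples chosen here rather than taken from the text.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def J(n):
    """The 2n x 2n block matrix [[0, I_n], [-I_n, 0]]."""
    size = 2 * n
    M = [[0] * size for _ in range(size)]
    for i in range(n):
        M[i][n + i] = 1
        M[n + i][i] = -1
    return M

def is_symplectic(M):
    """Check M^T J M == J for a square matrix of even size."""
    Jm = J(len(M) // 2)
    return matmul(matmul(transpose(M), Jm), M) == Jm

J4 = J(2)

# Block form [[I, S], [0, I]] with S symmetric: satisfies all four block conditions.
shear = [
    [1, 0, 2, 3],
    [0, 1, 3, 5],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# Twice the identity: invertible, but M^T J M = 4J, so not symplectic.
scaled = [[2 if i == j else 0 for j in range(4)] for i in range(4)]
```

Here `is_symplectic(J4)` and `is_symplectic(shear)` hold, while `is_symplectic(scaled)` does not, so invertibility alone is not enough.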

These groups $GL_n(\mathbb{R})$, $SL_n(\mathbb{R})$, $O_n(\mathbb{R})$, $SO_n(\mathbb{R})$, $Sp_{2n}(\mathbb{R})$, $U_n$ and $SU_n$
are, for $n > 1$ at least, all infinite and nonabelian. But they also have
a continuous structure inherited from $\mathbb{R}$, which means that we can
regard them as being locally similar to $\mathbb{R}^m$ for some value of $m$. Such
an object is called a manifold, and a group which has a compatible
manifold structure in this way is known as a Lie group, after the
Norwegian mathematician Marius Sophus Lie (1842–1899).
Example 1.41 Let $I$, $A$ and $B$ be as in Example 1.33, and let
\[ C = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
   D = \begin{pmatrix} -\frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & \frac{1}{2} \end{pmatrix}, \quad
   E = \begin{pmatrix} -\frac{1}{2} & -\frac{\sqrt{3}}{2} \\ -\frac{\sqrt{3}}{2} & \frac{1}{2} \end{pmatrix}. \]
Then the set $\{I, A, B, C, D, E\}$ forms a group with multiplication table

       I   A   B   C   D   E
   I   I   A   B   C   D   E
   A   A   B   I   D   E   C
   B   B   I   A   E   C   D
   C   C   E   D   I   B   A
   D   D   C   E   A   I   B
   E   E   D   C   B   A   I

This group can be seen to be nonabelian: its multiplication table is
asymmetric in the leading (top-left to bottom-right) diagonal.

[Margin note (photograph: 1933 CERN / Wolfgang Pauli Archive): The Austrian
theoretical physicist Wolfgang Pauli (1900–1958) was one of the pioneers of
quantum mechanics and particle physics. His contributions include the Pauli
Exclusion Principle, which states that no two electrons (more generally, any
two fermions) can exist in the same quantum state at the same time, for which
he was awarded the Nobel Prize for Physics in 1945. In 1930, while working on
the problem of beta decay, he postulated the existence of a new particle (the
neutrino) whose existence was confirmed experimentally 26 years later.
A notorious perfectionist, he would often dismiss work he considered
substandard as "falsch" ("wrong") or "ganz falsch" ("completely wrong");
famously he once remarked of one research paper "es ist nicht einmal falsch"
("it is not even wrong").]

Example 1.42 The three unitary matrices
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
   \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
   \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
are known as Pauli's matrices, and are particularly relevant in
particle physics (where they represent observables relating to the spin
of spin-$\frac{1}{2}$ particles such as protons, neutrons and electrons) and
quantum computing (where they represent an important class of
single-qubit operations). Now consider the matrices
\[ I = i\sigma_z = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
   J = i\sigma_y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
   K = i\sigma_x = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \]
together with the $2 \times 2$ identity matrix, which in this instance we call
$E$. Let $-E$, $-I$, $-J$ and $-K$ be the negative scalar multiples of these
matrices. The group formed by these eight elements is called the
quaternion group $Q_8$; its multiplication table is shown in Table 1.2.

Table 1.2: Multiplication table for the quaternion group $Q_8$

         E    I    J    K   -E   -I   -J   -K
    E    E    I    J    K   -E   -I   -J   -K
    I    I   -E    K   -J   -I    E   -K    J
    J    J   -K   -E    I   -J    K    E   -I
    K    K    J   -I   -E   -K   -J    I    E
   -E   -E   -I   -J   -K    E    I    J    K
   -I   -I    E   -K    J    I   -E    K   -J
   -J   -J    K    E   -I    J   -K   -E    I
   -K   -K   -J    I    E    K    J   -I   -E

The quaternion group $Q_8$ is not isomorphic to any of the three other
order-8 groups we've met so far, since they were abelian and this isn't.
Later, we will study some structures related to this group, such as the
quaternion ring $\mathbb{H}$ and the Hurwitz integers.[9]

[9] Example 8.12, page 328.
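The quaternion relations can be verified directly from the matrices $I = i\sigma_z$, $J = i\sigma_y$ and $K = i\sigma_x$. This is a small sketch using Python's built-in complex numbers; all entries are $0$, $\pm 1$ or $\pm i$, so the arithmetic is exact. The helper names are illustrative assumptions.

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested tuples (entries may be complex)."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def neg(X):
    """Negative scalar multiple of a matrix."""
    return tuple(tuple(-z for z in row) for row in X)

E = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))   # i * sigma_z
J = ((0, 1), (-1, 0))     # i * sigma_y
K = ((0, 1j), (1j, 0))    # i * sigma_x

Q8 = {"E": E, "I": I, "J": J, "K": K,
      "-E": neg(E), "-I": neg(I), "-J": neg(J), "-K": neg(K)}

def name_of(M):
    """Identify a matrix as one of the eight elements (fails if not closed)."""
    for name, N in Q8.items():
        if M == N:
            return name
    raise ValueError("product left the set")

# table[(x, y)] is the name of the matrix product x y
table = {(x, y): name_of(matmul(Q8[x], Q8[y])) for x in Q8 for y in Q8}

is_abelian = all(table[(x, y)] == table[(y, x)] for x in Q8 for y in Q8)
```

The table has all 64 entries, so the set is closed, and `is_abelian` is false because, for instance, $IJ = K$ while $JI = -K$.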

1.3 Symmetries

Symmetry: That which we see at a glance; based on the fact that there is
no reason to do otherwise; and based also on the human form, from where
it follows that symmetry is necessary only in width, not in height or depth.
Blaise Pascal (1623–1662), Pensées (1669) I:28

We know from basic linear algebra that matrices represent linear
transformations defined on some vector space $V$. The matrices
\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
   A = \begin{pmatrix} -\frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{pmatrix}, \quad
   B = \begin{pmatrix} -\frac{1}{2} & \frac{\sqrt{3}}{2} \\ -\frac{\sqrt{3}}{2} & -\frac{1}{2} \end{pmatrix} \]
from Example 1.33 are $2 \times 2$ real matrices, so we can regard them as
representing linear maps $f : \mathbb{R}^2 \to \mathbb{R}^2$. That is, transformations in the
plane that fix the origin and satisfy the linearity conditions
\[ f(\mathbf{v} + \mathbf{w}) = f(\mathbf{v}) + f(\mathbf{w}) \quad\text{and}\quad f(k\mathbf{v}) = k f(\mathbf{v}) \]
for any vectors $\mathbf{v}, \mathbf{w} \in \mathbb{R}^2$ and any real scalar $k \in \mathbb{R}$.
So what do these three $2 \times 2$ matrices do to points (or position vectors)
in the plane $\mathbb{R}^2$? The identity matrix $I$ leaves everything unchanged,
while the matrices $A$ and $B$ represent rotations through, respectively,
angles of $\frac{2\pi}{3}$ and $\frac{4\pi}{3}$ (or $-\frac{2\pi}{3}$) about the origin. The multiplication
table in Table 1.3 tells us how these rotations interact with each other,
and completely describes the geometry of the system. As noted earlier,
this structure is equivalent to the operation of addition in modulo-3
arithmetic.

Table 1.3: The multiplication table for the group in Example 1.33

       I   A   B
   I   I   A   B
   A   A   B   I
   B   B   I   A

Table 1.4: The multiplication table for the group in Example 1.41

       I   A   B   C   D   E
   I   I   A   B   C   D   E
   A   A   B   I   D   E   C
   B   B   I   A   E   C   D
   C   C   E   D   I   B   A
   D   D   C   E   A   I   B
   E   E   D   C   B   A   I

Now look at the six-element, nonabelian matrix group from Example 1.41
(see Table 1.4). These, again, are $2 \times 2$ real matrices, and can
therefore be viewed as representing linear transformations in $\mathbb{R}^2$. As
before, we have the identity matrix $I$ and the two rotation matrices $A$
and $B$, but we also have three further matrices. A bit of examination
or experimentation shows that these matrices represent reflections in
lines passing through the origin: $C$ is reflection in the $x$-axis, $D$ is
reflection in the line $y = \sqrt{3}x$, and $E$ is reflection in the line $y = -\sqrt{3}x$.

[Figure 1.5: Reflections represented by the matrices $C$, $D$ and $E$: the
lines $y = 0$ ($C$), $y = \sqrt{3}x$ ($D$) and $y = -\sqrt{3}x$ ($E$)]

This matrix group tells us how six specific geometric transformations
interact with each other. For example, the identity $CE = A$ tells us
that reflecting first in the line $y = -\sqrt{3}x$ and then in the $x$-axis is the
same as an anticlockwise rotation through an angle of $\frac{2\pi}{3}$. (Acting on
column vectors, the matrix product $CE$ applies $E$ first and then $C$.)
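The six matrices and their geometric meaning can be explored numerically. The sketch below assumes the matrices act on column vectors, uses floating-point arithmetic with a tolerance, and recomputes the whole multiplication table of Example 1.41; the names `apply`, `name_of` and `EXPECTED` are illustrative.

```python
import math

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(M, v):
    """Apply the 2x2 matrix M to the column vector v."""
    return tuple(M[i][0] * v[0] + M[i][1] * v[1] for i in range(2))

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

s = math.sqrt(3) / 2
G = {
    "I": ((1.0, 0.0), (0.0, 1.0)),
    "A": ((-0.5, -s), (s, -0.5)),   # rotation through 2*pi/3
    "B": ((-0.5, s), (-s, -0.5)),   # rotation through 4*pi/3
    "C": ((1.0, 0.0), (0.0, -1.0)), # reflection in the x-axis
    "D": ((-0.5, s), (s, 0.5)),     # reflection in y = sqrt(3) x
    "E": ((-0.5, -s), (-s, 0.5)),   # reflection in y = -sqrt(3) x
}

def name_of(M):
    for name, N in G.items():
        if close(M, N):
            return name
    raise ValueError("product left the set")

# table[(x, y)] is the name of the matrix product x y
table = {(x, y): name_of(matmul(G[x], G[y])) for x in G for y in G}

# Rows of the multiplication table as printed in Example 1.41.
EXPECTED = {
    "I": "I A B C D E", "A": "A B I D E C", "B": "B I A E C D",
    "C": "C E D I B A", "D": "D C E A I B", "E": "E D C B A I",
}

# Applying E first and then C to the point (1, 0) rotates it through 2*pi/3.
image = apply(G["C"], apply(G["E"], (1.0, 0.0)))
```

The point `image` comes out as $(-\frac{1}{2}, \frac{\sqrt{3}}{2})$, the image of $(1, 0)$ under the rotation $A$, which agrees with $CE = A$.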
This geometric approach is a useful and illuminating way of thinking
about groups. To take this geometric viewpoint a little further, observe
that these six transformations are exactly those which correspond to
the symmetries of an equilateral triangle: order-3 rotational symmetry
and three axes of reflection symmetry (see Figure 1.6). That is, this
group completely describes the symmetry of an equilateral triangle.

[Figure 1.6: Axes of symmetry of an equilateral triangle, labelled $m_1$, $m_2$ and $m_3$]

This group is the dihedral group, and there are two common
conventions for what to call it. Some mathematicians call it $D_6$, because it
comprises six elements, while others call it $D_3$ because it describes the
symmetry of a regular 3-sided polygon. In this book, we'll adopt the
latter convention, but you should be aware that some books follow the
other one. Its multiplication table is shown in Table 1.5.

Table 1.5: The multiplication table for the dihedral group $D_3$

         e    r    r^2  m1   m2   m3
   e     e    r    r^2  m1   m2   m3
   r     r    r^2  e    m3   m1   m2
   r^2   r^2  e    r    m2   m3   m1
   m1    m1   m2   m3   e    r    r^2
   m2    m2   m3   m1   r^2  e    r
   m3    m3   m1   m2   r    r^2  e

More generally, we can do the same thing with a regular $n$-gon. This
has order-$n$ rotational symmetry, and $n$ axes of symmetry. In the case
where $n$ is odd, each of these axes passes through the midpoint of one
side and the opposite vertex. If $n$ is even, then half of these axes pass
through pairs of opposite vertices, and the other half pass through the
midpoints of opposite sides.

Definition 1.43 The dihedral group
\[ D_n = \{e, r, \ldots, r^{n-1}, m_1, \ldots, m_n\} \]
is the symmetry group of the regular plane $n$-gon. Here, $e$ is the
identity, $r$ represents an anticlockwise rotation through an angle of
$\frac{2\pi}{n}$, and $m_k$ denotes a reflection in the line which makes an angle of
$\frac{k\pi}{n}$ with the horizontal.

Here's another example: the dihedral group $D_4$, which describes the
symmetry of a square.

[Figure 1.7: Axes of symmetry of the square, labelled $m_1$, $m_2$, $m_3$ and $m_4$]
Example 1.44 The dihedral group $D_4 = \{e, r, r^2, r^3, m_1, m_2, m_3, m_4\}$
consists of the eight possible symmetry transformations on the square.
Here, $e$ denotes the identity transformation, $r$ is an anticlockwise
rotation through an angle of $\frac{\pi}{2}$, $r^2$ is therefore a rotation through
an angle of $\pi$ and $r^3$ a clockwise rotation through an angle of $\frac{\pi}{2}$,
$m_1$ and $m_3$ are reflections in the square's diagonals, and $m_2$ and $m_4$
are reflections in, respectively, the vertical and horizontal axes. The
multiplication table is shown in Table 1.6.

Table 1.6: The multiplication table for the dihedral group $D_4$

         e    r    r^2  r^3  m1   m2   m3   m4
   e     e    r    r^2  r^3  m1   m2   m3   m4
   r     r    r^2  r^3  e    m4   m1   m2   m3
   r^2   r^2  r^3  e    r    m3   m4   m1   m2
   r^3   r^3  e    r    r^2  m2   m3   m4   m1
   m1    m1   m2   m3   m4   e    r    r^2  r^3
   m2    m2   m3   m4   m1   r^3  e    r    r^2
   m3    m3   m4   m1   m2   r^2  r^3  e    r
   m4    m4   m1   m2   m3   r    r^2  r^3  e
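One way to experiment with the dihedral groups is to record each symmetry by what it does to the vertices of the polygon. The following sketch makes several assumptions not in the text: vertices are labelled $0, \ldots, n-1$ in order around the polygon, a symmetry is stored as a tuple `p` with `p[i]` the image of vertex `i`, and the particular reflection chosen is the one fixing vertex 0.

```python
def compose(p, q):
    """Return the map 'q first, then p' on vertex labels."""
    return tuple(p[q[i]] for i in range(len(p)))

def dihedral(n):
    """Generate the symmetries of a regular n-gon from one rotation and one reflection."""
    identity = tuple(range(n))
    r = tuple((i + 1) % n for i in range(n))   # rotation by one step
    m = tuple((-i) % n for i in range(n))      # reflection fixing vertex 0
    group = {identity}
    frontier = [identity]
    while frontier:                            # close up under the two generators
        g = frontier.pop()
        for s in (r, m):
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group, r, m

D4, r, m = dihedral(4)
r_cubed = compose(r, compose(r, r))
```

For $n = 4$ this produces eight symmetries, the rotation and the reflection do not commute, and conjugating `r` by `m` gives the inverse rotation `r_cubed`.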
A rectangle has fewer symmetries than a square, and as we might
expect, its symmetry group is simpler:
Example 1.45 Let $V_4 = \{e, r, h, v\}$ be the symmetry group of a (non-square)
rectangle, where $e$ is as usual the identity, $r$ denotes a rotation
through an angle of $\pi$ about the centre point, and $h$ and $v$ are,
respectively, reflections in the horizontal and vertical axes. Then
we obtain the group with multiplication table shown in Table 1.7.
This group is isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ from Example 1.30, via the
isomorphism
\[ e \mapsto (0, 0), \quad r \mapsto (1, 1), \quad h \mapsto (1, 0), \quad v \mapsto (0, 1) \]
and is hence the Klein 4-group in a very superficial disguise.

[Figure 1.8: Axes of symmetry of a rectangle, labelled $h$ and $v$]

Table 1.7: Multiplication table for symmetry group of a rectangle

       e   r   h   v
   e   e   r   h   v
   r   r   e   v   h
   h   h   v   e   r
   v   v   h   r   e

We can also think of this group as the dihedral group $D_2$, the
symmetry group of a 2-sided polygon or bigon, the lens-shaped figure
depicted in Figure 1.9.
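That the map $e \mapsto (0,0)$, $r \mapsto (1,1)$, $h \mapsto (1,0)$, $v \mapsto (0,1)$ really is an isomorphism can be checked entry by entry against Table 1.7. A minimal sketch, with the table transcribed as strings and the helper names chosen here for illustration:

```python
NAMES = "erhv"

# Rows of Table 1.7: TABLE[a] lists the products a*e, a*r, a*h, a*v.
TABLE = {"e": "e r h v", "r": "r e v h", "h": "h v e r", "v": "v h r e"}

def product(a, b):
    """Look up the product ab in Table 1.7."""
    return TABLE[a].split()[NAMES.index(b)]

# The proposed isomorphism onto pairs of integers modulo 2.
phi = {"e": (0, 0), "r": (1, 1), "h": (1, 0), "v": (0, 1)}

def add(p, q):
    """Componentwise addition modulo 2."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

is_bijection = len(set(phi.values())) == 4
is_homomorphism = all(
    phi[product(a, b)] == add(phi[a], phi[b]) for a in NAMES for b in NAMES
)
```

Both flags come out true, so the correspondence respects the group operation as well as matching elements one-to-one.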

The dihedral groups $D_n$, the Klein 4-group $V_4$ and the group in
Example 1.33, as well as the orthogonal groups $O_n(\mathbb{R})$ and $SO_n(\mathbb{R})$,
are examples of isometry groups: groups of transformations which
preserve lengths and distances.

[Figure 1.9: Symmetries of a bigon]

Definition 1.46 Given some set $S \subseteq \mathbb{R}^n$, define $\mathrm{Isom}^+(S)$ to be the
group of all direct isometries of $S$: length-preserving transformations
which also preserve the orientation of $S$ and the underlying
space $\mathbb{R}^n$. Define $\mathrm{Isom}(S)$ to be the full isometry group of $S$,
consisting of the direct isometries together with the opposite isometries:
those which reverse the orientation of $S$ and $\mathbb{R}^n$.
In this terminology, if we let $P_n$ denote the regular $n$-sided polygon,
then $\mathrm{Isom}^+(P_n) \cong \mathbb{Z}_n$ and $\mathrm{Isom}(P_n) \cong D_n$.
Definition 1.47 Let $E_2^+ = \mathrm{Isom}^+(\mathbb{R}^2)$ be the group of all
orientation-preserving isometries of the plane $\mathbb{R}^2$. Then $E_2^+$ consists of
translations and rotations. The full isometry group $E_2 = \mathrm{Isom}(\mathbb{R}^2)$ also
includes reflections and glide reflections (a reflection in some line,
followed by a translation parallel to the same line). This group $E_2$ is
called the (two-dimensional) Euclidean group.

If we let $P_3$ be the subset of $\mathbb{R}^2$ consisting of an equilateral triangle
centred on the origin, then the dihedral group $D_3$ consists of precisely
those six elements of $E_2 = \mathrm{Isom}(\mathbb{R}^2)$ which map $P_3$ to itself, and the
group in Example 1.33 consists of exactly those three direct isometries
in $E_2^+ = \mathrm{Isom}^+(\mathbb{R}^2)$ which map $P_3$ to itself. We will return to this
viewpoint in the next chapter when we look at subgroups.
Returning to the isometry groups of Euclidean space $\mathbb{R}^n$, we find that
things become progressively more complicated in higher dimensions:
Definition 1.48 Let $E_3^+ = \mathrm{Isom}^+(\mathbb{R}^3)$ be the group of
orientation-preserving isometries of $\mathbb{R}^3$. This group consists of the identity,
translations, rotations about some axis, and screw operations
(rotations about some axis followed by a translation along that axis).
The full isometry group $E_3 = \mathrm{Isom}(\mathbb{R}^3)$ also includes reflections in
some plane, glide reflections (reflections in a plane followed by a
translation parallel to that plane), roto-reflections (rotation about an
axis followed by reflection in a plane perpendicular to that axis) and
inversion in a point (in which every point in $\mathbb{R}^3$ is mapped to an
equidistant point on the other side of the chosen fixed point). This
group $E_3$ is the three-dimensional Euclidean group.

[Figure 1.10: The five regular polyhedra. Wikimedia Commons / User:Cyp]

Just as we obtained interesting examples of groups by restricting
ourselves to elements of $E_2$ or $E_2^+$ which mapped some subset of $\mathbb{R}^2$
to itself, it turns out that we can obtain further interesting examples
by doing something similar with the groups $E_3$ and $E_3^+$.

There are five regular polyhedra in three dimensions: the tetrahedron,
the cube, the octahedron, the dodecahedron and the icosahedron.
These have, respectively, four, six, eight, twelve and twenty faces, and
are depicted in Figure 1.10. We will study their symmetry groups,
and those of their higher-dimensional analogues, in more detail in
Section 5.C, but for the moment we'll look at the simplest case: the
direct isometry group of the tetrahedron.
Example 1.49 The tetrahedron $\Delta_3$ has twelve direct isometries:
• The identity isometry $e$
• Eight rotations $r_i^{\pm}$ through angles of $\pm\frac{2\pi}{3}$ around one of four
  axes ($i = 1, \ldots, 4$) passing through a vertex and the centre of the
  opposite face
• Three rotations $s_A$, $s_B$ and $s_C$ through an angle of $\pi$ around one of
  three axes passing through the midpoints of opposite edges
The multiplication table for this group $\mathrm{Isom}^+(\Delta_3)$ is as follows:

          e     r1+   r1-   r2+   r2-   r3+   r3-   r4+   r4-   sA    sB    sC
   e      e     r1+   r1-   r2+   r2-   r3+   r3-   r4+   r4-   sA    sB    sC
   r1+    r1+   r1-   e     r4-   sB    r2-   sA    r3-   sC    r4+   r3+   r2+
   r1-    r1-   e     r1+   sC    r3+   sB    r4+   sA    r2+   r3-   r2-   r4-
   r2+    r2+   r3-   sB    r2-   e     r4-   sC    r1-   sA    r3+   r4+   r1+
   r2-    r2-   sC    r4+   e     r2+   sA    r1+   sB    r3+   r4-   r1-   r3-
   r3+    r3+   r4-   sA    r1-   sC    r3-   e     r2-   sB    r2+   r1+   r4+
   r3-    r3-   sB    r2+   sA    r4+   e     r3+   sC    r1+   r1-   r4-   r2-
   r4+    r4+   r2-   sC    r3-   sA    r1-   sB    r4-   e     r1+   r2+   r3+
   r4-    r4-   sA    r3+   sB    r1+   sC    r2+   e     r4+   r2-   r3-   r1-
   sA     sA    r3+   r4-   r4+   r3-   r1+   r2-   r2+   r1-   e     sC    sB
   sB     sB    r2+   r3-   r1+   r4-   r4+   r1-   r3+   r2-   sC    e     sA
   sC     sC    r4+   r2-   r3+   r1-   r2+   r4-   r1+   r3-   sB    sA    e

In this table, the operation in row $a$ and column $b$ is the product $ab$;
that is, the isometry $a$ followed by the isometry $b$.
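A quick way to confirm the count of direct isometries is to record each one by how it permutes the four vertices. The sketch below relies on the standard fact that the rotations of a regular tetrahedron correspond exactly to the even permutations of its vertices; the vertex labels $0, \ldots, 3$ and the helper names are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p' (tuples as maps i -> p[i])."""
    return tuple(p[q[i]] for i in range(len(p)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

def order(p):
    """Smallest k >= 1 with p^k equal to the identity."""
    identity = tuple(range(len(p)))
    q, k = p, 1
    while q != identity:
        q, k = compose(p, q), k + 1
    return k

rotations = [p for p in permutations(range(4)) if is_even(p)]
orders = Counter(order(p) for p in rotations)
```

The tally in `orders` matches the list in Example 1.49: one identity, eight elements of order 3 (the rotations about vertex axes) and three of order 2 (the rotations about edge-midpoint axes).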

The full isometry group $\mathrm{Isom}(\Delta_3)$ also includes twelve opposite
isometries: six order-2 reflections $m_{12}$, $m_{13}$, $m_{14}$, $m_{23}$, $m_{24}$ and $m_{34}$ (in planes
passing along one edge and through the midpoint of the opposite
edge) and six other roto-reflection isometries $t_A^{\pm}$, $t_B^{\pm}$ and $t_C^{\pm}$ formed by
rotating the tetrahedron through an angle of $\pm\frac{\pi}{2}$ about one of the
three axes $s_A$, $s_B$ or $s_C$ and then reflecting in a plane perpendicular to
that axis. (The type-$t$ isometries may also be generated by pairs of
type-$m$ reflections.)

[Figure 1.11: Direct isometries of the tetrahedron $\Delta_3$: axes for order-3
rotations (top) and order-2 rotations (bottom)]

1.4 Permutations

Poetical scene with surprisingly chaste Lord Archer vegetating (3,3,8,12)
Araucaria (John Graham) (1921–2013),
Cryptic Crossword 22 103, The Guardian, 11 January 2001

Stifle is an anagram of itself.
Anonymous

Look again at the rotation group $R_3 = \mathrm{Isom}^+(\Delta_2)$ of the
equilateral triangle, from Example 1.33. Fundamentally, all these
transformations really do is swap the vertices round (or, in the case
of the identity transformation, not). What happens to the rest of the
triangle is determined completely by where the vertices go. So we
could describe the elements of $R_3$, the rotation transformations on the
equilateral triangle, solely as permutations of the set $\{0, 1, 2\}$.
Similarly, the dihedral group $D_3$ can also be regarded as permutations
of the set $\{0, 1, 2\}$. The identity transformation leaves all three points
unchanged, the clockwise and anticlockwise rotations permute them
cyclically, and each of the three reflections leaves one point unchanged
and swaps the other two points round.
Definition 1.50 A permutation of a set $X$ is a bijection $\sigma : X \to X$.

Given a set $X$, we can form the set $\mathrm{Sym}(X)$ of all possible permutations
of $X$. In order to turn this into a group, we need to impose a suitable
binary operation on it, and the one we choose is that of composition.
Viewing two permutations $\sigma, \tau \in \mathrm{Sym}(X)$ as bijections $\sigma, \tau : X \to X$,
their composites $\sigma \circ \tau : X \to X$ and $\tau \circ \sigma : X \to X$ are certainly
well-defined. These composites, happily, are also permutations in their
own right, since the composite of two bijections is itself a bijection. If
we view $\sigma$ and $\tau$ as (possibly different) ways of shuffling or reordering
the elements of $X$, then if we do $\sigma$ and then do $\tau$ to the result, the
combined effect is also a (possibly different) way of shuffling the
elements of $X$.
So, we have a binary operation defined on $\mathrm{Sym}(X)$. This operation
is associative, since composition of functions is associative. There is
an identity element: the identity permutation $\iota : X \to X$ which maps
every element to itself. Also, every permutation $\sigma$ has an inverse $\sigma^{-1}$,
which can be regarded as either the inverse function $\sigma^{-1} : X \to X$ or
as the permutation which puts all of the elements of $X$ back to how
they were before we applied $\sigma$. Hence $\mathrm{Sym}(X)$ forms a group:
Definition 1.51 Let $X$ be a (possibly infinite) set. The group $\mathrm{Sym}(X)$
of all permutations $\sigma : X \to X$ is called the symmetric group on $X$.
If $X = \{1, \ldots, n\}$ is a finite set consisting of $n$ elements, then we call
$\mathrm{Sym}(X)$ the symmetric group on $n$ objects and denote it $S_n$.

[Margin note: The Old Vicarage, Grantchester]

Proposition 1.52 The finite symmetric group $S_n$ has order $n!$.

Proof A permutation of the set $\{1, \ldots, n\}$ is determined completely
by the way it maps the numbers amongst themselves. There are $n$
choices for where 1 maps to, then $n-1$ choices for where 2 goes (since
we can map 2 to any of the remaining numbers except for the one we
mapped 1 to), $n-2$ choices for where 3 maps to, and so on. This gives
us $n(n-1)(n-2) \cdots 1 = n!$ possible permutations of $n$ objects, and so
$|S_n| = n!$.
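The counting argument can be confirmed by brute force for small $n$. This sketch simply enumerates every arrangement of $\{1, \ldots, n\}$ with the standard library; the function name `symmetric_group` is an illustrative choice.

```python
from itertools import permutations
from math import factorial

def symmetric_group(n):
    """All permutations of {1, ..., n}, each stored as the tuple of images of 1, ..., n."""
    return list(permutations(range(1, n + 1)))

sizes = {n: len(symmetric_group(n)) for n in range(1, 7)}
```

For each $n$ from 1 to 6 the number of permutations found equals `factorial(n)`.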
It so happens that $S_2 \cong \mathbb{Z}_2$ and $S_3 \cong D_3$. The symmetric group $S_4$
is isomorphic to the symmetry group of the tetrahedron. Later on
we will meet Cayley's Theorem, which states that any group can be
regarded as a group of permutations (although not, in general, the full
symmetric group $S_n$ for some value of $n$).
To investigate these permutation groups, we need a coherent and
consistent notation, at least for permutations on finite sets.
One method, given that a permutation $\sigma : X \to X$ is determined
completely by its action on the elements of the set $X$, is to represent it
in the form of an array:
\[ \begin{bmatrix} x_1 & x_2 & \ldots & x_n \\ \sigma(x_1) & \sigma(x_2) & \ldots & \sigma(x_n) \end{bmatrix} \]
The first row lists the elements of $X$, and the second lists their images
under the action of $\sigma$. So, suppose that $\sigma \in S_5$ maps $1 \mapsto 1$, $2 \mapsto 3$,
$3 \mapsto 5$, $4 \mapsto 4$ and $5 \mapsto 2$. Then we can represent $\sigma$ as
\[ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 5 & 4 & 2 \end{bmatrix}. \]
The second row of this array shows the effect of applying $\sigma$ to the first
row. Suppose we have another permutation $\tau \in S_5$ such that $1 \mapsto 2$,
$2 \mapsto 4$, $3 \mapsto 5$, $4 \mapsto 1$ and $5 \mapsto 3$. This can be represented as
\[ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{bmatrix}. \]
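Composition of permutations can be represented fairly simply using
this notation:
\[ \tau\sigma = \begin{matrix}
   \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 5 & 4 & 2 \end{bmatrix} \\[4pt]
   \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{bmatrix}
   \end{matrix}
   = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 3 & 1 & 4 \end{bmatrix} \]
Note Because we regard permutations as bijections, and the product
operation as being composition, we write products from right to left,
rather than left to right. So $\tau\sigma$ means $\sigma$ followed by $\tau$. Stacking the
arrays vertically, we read down the page, so $\sigma$ stacked above $\tau$ means
$\sigma$ followed by $\tau$.
In general, composition of permutations isn't commutative, so we have
to be careful of the order. For example,
\[ \sigma\tau = \begin{matrix}
   \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{bmatrix} \\[4pt]
   \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 5 & 4 & 2 \end{bmatrix}
   \end{matrix}
   = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 2 & 1 & 5 \end{bmatrix} \neq \tau\sigma. \]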
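The two products above can be reproduced with a few lines of code. In this sketch each permutation of $\{1, \ldots, 5\}$ is stored as a dictionary from elements to images, and the names `sigma` and `tau` stand for the two example permutations ($1 \mapsto 1, 2 \mapsto 3, 3 \mapsto 5, 4 \mapsto 4, 5 \mapsto 2$ and $1 \mapsto 2, 2 \mapsto 4, 3 \mapsto 5, 4 \mapsto 1, 5 \mapsto 3$ respectively); `compose` and `second_row` are illustrative helpers.

```python
sigma = {1: 1, 2: 3, 3: 5, 4: 4, 5: 2}
tau = {1: 2, 2: 4, 3: 5, 4: 1, 5: 3}

def compose(f, g):
    """Return the product fg, read right to left: apply g first, then f."""
    return {x: f[g[x]] for x in g}

def second_row(perm):
    """The bottom row of the two-row array for a permutation of {1, ..., n}."""
    return [perm[x] for x in sorted(perm)]

tau_sigma = compose(tau, sigma)   # sigma followed by tau
sigma_tau = compose(sigma, tau)   # tau followed by sigma
```

Here `second_row(tau_sigma)` is `[2, 5, 3, 1, 4]` and `second_row(sigma_tau)` is `[3, 4, 2, 1, 5]`, so the two products differ.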

This notation is quite clear, and certainly makes it easy to work out the
composite of two permutations, but unfortunately it becomes somewhat
unwieldy with large numbers of permuting elements. Also, it
doesn't tell us very much about the actual structure of the permutation.
For example: the permutation $\sigma$ maps $2 \mapsto 3$, $3 \mapsto 5$ and $5 \mapsto 2$,
but leaves 1 and 4 unchanged. Repeated applications of $\sigma$ cause the
elements 2, 3 and 5 to cycle amongst themselves, while 1 and 4 stay
where they are. Furthermore, if we apply $\sigma$ three times, we get the
identity permutation $\iota$. Hence $\sigma^3 = \iota$; equivalently, $\sigma$ has order 3 in $S_5$.
The permutation $\tau$, on the other hand, maps $1 \mapsto 2$, $2 \mapsto 4$ and
$4 \mapsto 1$, but additionally swaps $3 \leftrightarrow 5$. So $\tau$ is doing two different,
independent things at the same time: cycling $1 \mapsto 2 \mapsto 4 \mapsto 1$ and
transposing $3 \leftrightarrow 5$. Neither of these two operations interferes with the
other, because they are acting on disjoint subsets of the set $\{1, 2, 3, 4, 5\}$.
If we apply $\tau$ three times, then the subset $\{1, 2, 4\}$ will be back where it
started, but meanwhile 3 and 5 will have swapped places three times,
and will hence be the other way round from where they started. But if
we apply $\tau$ another three times, $\{1, 2, 4\}$ will go round another
period-three cycle, and 3 and 5 will have swapped back to their original
places. More concisely, $\tau^6 = \iota$.
We can depict this situation graphically, as shown in Figure 1.12, but
this also isn't very compact, and tends to become more involved for
larger numbers of permuting objects.

[Figure 1.12: Graphical depictions of permutations $\sigma$, $\tau$, $\sigma\tau$ and $\tau\sigma$ in $S_5$]
Ideally, we would like a new notation which is at the same time
more compact, and also makes it easier to see at a glance the internal
structure of the permutation. The key is to split the permutation into
disjoint cyclic subpermutations, and write them as parenthesised lists.
Fortunately, there is a relatively straightforward procedure for
decomposing a permutation $\sigma$ as a product of disjoint cycles:
Algorithm 1.53
• Open parenthesis (
• Start with the first element 1 and write it down.
• Next write down the image $\sigma(1)$ of 1.
• Next write down the image $\sigma(\sigma(1)) = \sigma^2(1)$.
• ...
• When we get back to 1, close the parentheses ).
Now repeat this process, starting with the smallest number not yet
seen, and so on until you have a list of parenthesised lists of numbers.
In practice, we delete any single-number lists, because they just tell
us that the number in question isn't changed by the permutation.
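The procedure of Algorithm 1.53 translates almost line for line into code. The following is a sketch under the assumption that a permutation is stored as a dictionary from elements to images; the function name `disjoint_cycles` and the variable names are illustrative.

```python
def disjoint_cycles(perm):
    """Decompose a permutation (dict: element -> image) into disjoint cycles.

    Follows Algorithm 1.53: start at the smallest unseen element, follow
    images until we return to the start, and drop cycles of length 1.
    """
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:          # keep following images until the cycle closes
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:         # single-number lists carry no information
            cycles.append(tuple(cycle))
    return cycles

sigma = {1: 1, 2: 3, 3: 5, 4: 4, 5: 2}
tau = {1: 2, 2: 4, 3: 5, 4: 1, 5: 3}
```

Applied to the two example permutations of $S_5$, `disjoint_cycles(sigma)` gives `[(2, 3, 5)]` and `disjoint_cycles(tau)` gives `[(1, 2, 4), (3, 5)]`; the identity permutation yields an empty list.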

This procedure is better illustrated with an example, so let's do it with
our permutation $\sigma$ above.
Starting with 1, we see that $\sigma(1) = 1$, so that gives us our first (trivial)
cycle (1). Next we look at 2, which maps to 3, which maps to 5, which
maps back to 2. This gives us the period-3 cycle (2 3 5). Next we look
at 4, the smallest number not yet seen, which is again left unchanged
by $\sigma$. This gives us another trivial cycle (4). And now we've dealt
with all of the numbers in $\{1, 2, 3, 4, 5\}$, so that's the end, and our
permutation $\sigma$ can therefore be written as (1)(2 3 5)(4). Except that
we're really only interested in the nontrivial bits of the permutation,
which in this case is the 3-cycle (2 3 5); the 1-cycles (1) and (4) don't
tell us anything nontrivial about the permutation, so we discard them,
leaving us with $\sigma = (2\ 3\ 5)$. This compact notation tells us that $\sigma$
causes the numbers 2, 3 and 5 to cycle amongst themselves, and
doesn't do anything to the remaining numbers 1 and 4.
Now let's try this with $\tau$. This is slightly more interesting, and the
procedure described above gives us the cycle notation (1 2 4)(3 5). Just
by looking at this, we can immediately see what the permutation does:
it cycles $1 \mapsto 2 \mapsto 4 \mapsto 1$ and at the same time swaps $3 \leftrightarrow 5$.
The above argument tells us the following fact:
Proposition 1.54 Any permutation $\sigma \in S_n$ can be written as a product
of disjoint cyclic permutations.
Because these cyclic permutations are disjoint, they are independent:
they don't interact with each other, in the sense that since they act on
disjoint subsets of $X = \{1, 2, 3, 4, 5\}$, they don't step on each other's
toes. Therefore, it doesn't matter what order we apply them to the set
$X$, and so the cycles (1 2 4) and (3 5) commute. (Non-disjoint cycles
don't commute, however.)
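A slight disadvantage with cycle notation is that it isn't so obvious
how to multiply permutations together. But with a little practice it's
easier than it might appear, and quite possible to do it in your head.
Using the permutations $\sigma = (2\ 3\ 5)$ and $\tau = (1\ 2\ 4)(3\ 5)$ as before, we
calculate $\sigma\tau$ as follows:
\[ \sigma\tau = (2\ 3\ 5)(1\ 2\ 4)(3\ 5) \]
Start with 1, and read through the list of cycles from right to left,
applying each one in turn until you've done them all:
\[ \begin{array}{ccccccc}
     & (3\ 5) &   & (1\ 2\ 4) &   & (2\ 3\ 5) &   \\
   1 & \mapsto & 1 & \mapsto & 2 & \mapsto & 3
   \end{array} \]
Now do the same process, but starting with the number (in this case 3)
that you ended up with last time:
\[ \begin{array}{ccccccc}
     & (3\ 5) &   & (1\ 2\ 4) &   & (2\ 3\ 5) &   \\
   3 & \mapsto & 5 & \mapsto & 5 & \mapsto & 2
   \end{array} \]
Now do it again, and again, until you end up back at 1:
\[ \begin{array}{ccccccc}
     & (3\ 5) &   & (1\ 2\ 4) &   & (2\ 3\ 5) &   \\
   2 & \mapsto & 2 & \mapsto & 4 & \mapsto & 4 \\
   4 & \mapsto & 4 & \mapsto & 1 & \mapsto & 1
   \end{array} \]
This gives the first disjoint cycle (1 3 2 4). Now repeat this process with
the smallest number (in this case 5) not yet seen:
\[ \begin{array}{ccccccc}
     & (3\ 5) &   & (1\ 2\ 4) &   & (2\ 3\ 5) &   \\
   5 & \mapsto & 3 & \mapsto & 3 & \mapsto & 5
   \end{array} \]
Thus 5 is unchanged by the product $\sigma\tau$, so our next cycle is (5), except
that by convention we omit length-1 cycles. Since all of the numbers
$1, \ldots, 5$ are now accounted for, we are done, and $\sigma\tau = (1\ 3\ 2\ 4)$.
Now try it with $\tau\sigma = (1\ 2\ 4)(3\ 5)(2\ 3\ 5)$.
\[ \begin{array}{cccccccl}
     & (2\ 3\ 5) &   & (3\ 5) &   & (1\ 2\ 4) &   & \\
   1 & \mapsto & 1 & \mapsto & 1 & \mapsto & 2 & \\
   2 & \mapsto & 3 & \mapsto & 5 & \mapsto & 5 & \\
   5 & \mapsto & 2 & \mapsto & 2 & \mapsto & 4 & \\
   4 & \mapsto & 4 & \mapsto & 4 & \mapsto & 1 & \quad\text{giving } (1\ 2\ 5\ 4) \\
   3 & \mapsto & 5 & \mapsto & 3 & \mapsto & 3 & \quad\text{giving } (3)
   \end{array} \]
So $\tau\sigma = (1\ 2\ 5\ 4)$.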
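The right-to-left reading of a product of cycles can be mimicked in code. This is a minimal sketch assuming cycles are stored as tuples and the underlying set is $\{1, \ldots, n\}$; `cycle_to_map` and `product_of_cycles` are names chosen here for illustration.

```python
def cycle_to_map(cycle, n):
    """Turn a cycle such as (2, 3, 5) into a dict on {1, ..., n}."""
    mapping = {x: x for x in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        mapping[a] = b             # each entry maps to the next, the last to the first
    return mapping

def product_of_cycles(cycles, n):
    """Multiply a list of (not necessarily disjoint) cycles, reading right to left."""
    maps = [cycle_to_map(c, n) for c in cycles]
    result = {}
    for x in range(1, n + 1):
        y = x
        for mapping in reversed(maps):   # rightmost cycle is applied first
            y = mapping[y]
        result[x] = y
    return result

sigma_tau = product_of_cycles([(2, 3, 5), (1, 2, 4), (3, 5)], 5)
tau_sigma = product_of_cycles([(1, 2, 4), (3, 5), (2, 3, 5)], 5)
```

The first product sends $1 \mapsto 3 \mapsto 2 \mapsto 4 \mapsto 1$ and fixes 5, which is the cycle (1 3 2 4); the second sends $1 \mapsto 2 \mapsto 5 \mapsto 4 \mapsto 1$ and fixes 3, which is (1 2 5 4).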
Now we introduce a couple of concepts relating to cyclic permutations
which will become useful in the next bit.
Definition 1.55 Let $\sigma = (x_1\ x_2\ \ldots\ x_k)$ be a finite cyclic permutation
in some (possibly infinite) symmetric group $\mathrm{Sym}(X)$. Then $\sigma$ has
length or periodicity $k$, which we denote by $l(\sigma) = k$. This is equal
to the order $|\sigma|$ of $\sigma$ in the group $\mathrm{Sym}(X)$.

Cycles of length 2 will play an important rôle in the next part of the
discussion, so we give them a special name:
Definition 1.56 Let $\tau = (x_1\ x_2)$ be a period-2 cyclic permutation in
some (possibly infinite) symmetric group $\mathrm{Sym}(X)$. Then we call $\tau$ a
transposition.

We noted earlier that $\sigma^3 = \iota$ and $\tau^6 = \iota$. More formally, the order $|\sigma|$
of $\sigma$ in $S_5$ is 3, while the order $|\tau|$ of $\tau$ in $S_5$ is 6. The next proposition
shows that we can easily work out the order of a permutation $\sigma \in S_n$
just by looking at its disjoint cycle representation.
Proposition 1.57 Let $\sigma$ be a permutation in some (possibly infinite)
symmetric group $\mathrm{Sym}(X)$. Then if $\sigma$ can be represented as a finite product
\[ \sigma = \sigma_1 \sigma_2 \ldots \sigma_k \]
of disjoint cycles, each of length $l_i = l(\sigma_i)$, then
\[ |\sigma| = \mathrm{lcm}(l_1, \ldots, l_k). \]
That is, the order of $\sigma$ is the lowest common multiple of the lengths of the
disjoint cycles $\sigma_1, \ldots, \sigma_k$.

Proof First, observe that $\sigma^n = (\sigma_1 \ldots \sigma_k)^n = \sigma_1^n \ldots \sigma_k^n$ since disjoint
cycles commute. So if $\sigma^n = \iota$ for some positive integer $n$, then
$\sigma_1^n \ldots \sigma_k^n = \iota$.
Now suppose that $\sigma_i = (x_1 \ldots x_{l_i})$. Then $\sigma^n = \iota$ implies that $\sigma^n(x_j) =
x_j$ for $1 \leq j \leq l_i$. So $(\sigma_1^n \ldots \sigma_k^n)(x_j) = x_j$, and since the $\sigma$s are disjoint
cycles, we know that no other cycle apart from $\sigma_i$ affects $x_j$ and
therefore $\sigma_i^n(x_j) = x_j$ for $1 \leq j \leq l_i$. Hence $\sigma_i^n = \iota$ and hence $n$ must either
equal $l_i = |\sigma_i|$ or be an integer multiple of it. That is, $l_i \mid n$.
Repeating this argument for all the cycles $\sigma_1, \ldots, \sigma_k$ we see that if
$\sigma^n = \iota$ then $l_i \mid n$ for $1 \leq i \leq k$.
Conversely, suppose that $n$ is such that $l_i \mid n$ for $1 \leq i \leq k$. Then
$\sigma^n = \sigma_1^n \ldots \sigma_k^n = \iota \ldots \iota = \iota$.
Therefore, $\sigma^n = \iota$ if and only if $l_i \mid n$ for $1 \leq i \leq k$. So in particular, the
order $|\sigma|$ must be the smallest positive integer which is divisible by each of
the $l_i$, which is to say that it must be the lowest common multiple
$\mathrm{lcm}(l_1, \ldots, l_k)$.
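Proposition 1.57 can be checked against a direct computation. In the sketch below, permutations are dictionaries, `order_by_repeated_application` composes a permutation with itself until the identity appears, and `order_via_lcm` uses the cycle lengths; all of these names are illustrative.

```python
from functools import reduce
from math import gcd

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of a permutation (dict: element -> image)."""
    seen, lengths = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:       # walk round the cycle containing `start`
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return lengths

def order_via_lcm(perm):
    """Order computed as the lowest common multiple of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(perm), 1)

def order_by_repeated_application(perm):
    """Order computed directly: the smallest n >= 1 with perm^n the identity."""
    power, n = dict(perm), 1
    while any(power[x] != x for x in power):
        power = {x: perm[power[x]] for x in power}
        n += 1
    return n

sigma = {1: 1, 2: 3, 3: 5, 4: 4, 5: 2}   # (2 3 5)
tau = {1: 2, 2: 4, 3: 5, 4: 1, 5: 3}     # (1 2 4)(3 5)
```

Both methods give order 3 for the 3-cycle and order 6 for the product of a 3-cycle and a disjoint transposition.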
So $|\sigma| = l(\sigma) = 3$, but $|\tau| = \mathrm{lcm}(2, 3) = 6$. Now let's look again at
the permutation $\sigma = (2\ 3\ 5)$. This has length 3, but is it possible to
express it as a product of even shorter (but possibly not disjoint) cyclic
permutations? Yes it is, and (2 5)(2 3) is one possible decomposition:
\[ \begin{array}{ccccc}
     & (2\ 3) &   & (2\ 5) &   \\
   2 & \mapsto & 3 & \mapsto & 3 \\
   3 & \mapsto & 2 & \mapsto & 5 \\
   5 & \mapsto & 5 & \mapsto & 2
   \end{array} \]
In general, we have the following proposition:
Proposition 1.58 Any finite permutation $\sigma$ in some (possibly infinite)
symmetric group $\mathrm{Sym}(X)$ can be expressed as a product of (not necessarily
disjoint) transpositions.

Proof Any finite permutation $\sigma \in \mathrm{Sym}(X)$ can be written as a product
of disjoint cyclic permutations. Furthermore, any cyclic permutation
$(x_1\ x_2\ \ldots\ x_k)$ can be written as a product
\[ (x_1\ x_2\ \ldots\ x_k) = (x_1\ x_k)(x_1\ x_{k-1}) \ldots (x_1\ x_3)(x_1\ x_2) \]
of transpositions.
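The formula in the proof is easy to test on examples. This sketch assumes cycles are tuples and products are read right to left, as in the text; `cycle_to_transpositions` and `apply_product` are illustrative names.

```python
def cycle_to_transpositions(cycle):
    """Write (x1 x2 ... xk) as (x1 xk)(x1 x_{k-1}) ... (x1 x3)(x1 x2)."""
    x1 = cycle[0]
    return [(x1, x) for x in reversed(cycle[1:])]

def apply_product(cycles, x):
    """Apply a product of cycles to x, reading the list from right to left."""
    for cycle in reversed(cycles):
        if x in cycle:
            x = cycle[(cycle.index(x) + 1) % len(cycle)]
    return x

three_cycle = (2, 3, 5)
as_transpositions = cycle_to_transpositions(three_cycle)

five_cycle = (1, 4, 2, 5, 3)
five_as_transpositions = cycle_to_transpositions(five_cycle)
```

For the 3-cycle (2 3 5) this returns `[(2, 5), (2, 3)]`, the decomposition used above, and in both examples the product of transpositions agrees with the original cycle on every element.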
Corollary 1.59 The transpositions in Sn generate Sn .
In general, there may be multiple ways to represent a given permu-
tation as a product of transpositions. The proof of Proposition 1.58
30 a course in abstract algebra

gives one valid way, but since any cyclic permutation ( x1 x2 . . . xk ) can
be rewritten as, for example, ( xk x1 x2 . . . xk1 ) (or any other cyclic
permutation of that list), the decomposition will not be unique.
Proposition 1.60 The symmetric group $S_n$ is generated by any of the
following sets of cyclic permutations:
(i) $\{(1\ 2), (1\ 3), \ldots, (1\ n)\}$
(ii) $\{(1\ 2), (2\ 3), \ldots, (n{-}1\ \ n)\}$
(iii) $\{(1\ 2), (1\ 2\ \ldots\ n)\}$

Proof (i) Corollary 1.59 tells us that $S_n$ is generated by transpositions.
But any transposition $(a\ b)$ can be written as a product $(1\ a)(1\ b)(1\ a)$.
(ii) This follows from part (i) together with the observation that
$$(1\ k) = (k{-}1\ \ k) \cdots (3\ 4)(2\ 3)(1\ 2)(2\ 3)(3\ 4) \cdots (k{-}1\ \ k).$$
(iii) We can write any transposition of the form $(k\ \ k{+}1)$ as the product
$(1\ 2\ \ldots\ n)^{k-1}(1\ 2)(1\ 2\ \ldots\ n)^{1-k}$, and hence by part (ii), the transposition
$(1\ 2)$ together with the $n$-cycle $(1\ 2\ \ldots\ n)$ generate all of $S_n$.
This completes the proof.
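The three identities used in this proof can be verified by brute force for a small $n$. This is an illustrative sketch only: the function names are hypothetical, permutations are tuples of images, and products are composed right to left.

```python
def cycle(n, *pts):
    """Return the cycle (pts[0] pts[1] ...) in S_n as a tuple p with p[i-1] = image of i."""
    p = list(range(1, n + 1))
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a - 1] = b
    return tuple(p)


def compose(*perms):
    """Right-to-left product: compose(s, t) applies t first, then s."""
    n = len(perms[0])
    result = tuple(range(1, n + 1))
    for p in reversed(perms):
        result = tuple(p[image - 1] for image in result)
    return result


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def power(p, k):
    """k-th power of p; negative k means powers of the inverse."""
    base = p if k >= 0 else inverse(p)
    result = tuple(range(1, len(p) + 1))
    for _ in range(abs(k)):
        result = compose(base, result)
    return result


def check_generating_identities(n):
    """Check the identities from parts (i), (ii) and (iii) in S_n (n >= 2)."""
    def t(a, b):
        return cycle(n, a, b)

    c = cycle(n, *range(1, n + 1))  # the n-cycle (1 2 ... n)

    # (i)  (a b) = (1 a)(1 b)(1 a) for distinct a, b different from 1
    ok_i = all(t(a, b) == compose(t(1, a), t(1, b), t(1, a))
               for a in range(2, n + 1) for b in range(2, n + 1) if a != b)

    # (ii) (1 k) = (k-1 k)...(2 3)(1 2)(2 3)...(k-1 k)
    def adjacent_word(k):
        up = [t(j, j + 1) for j in range(2, k)]  # (2 3), ..., (k-1 k)
        return compose(*reversed(up), t(1, 2), *up)

    ok_ii = all(t(1, k) == adjacent_word(k) for k in range(2, n + 1))

    # (iii) (k k+1) = c^(k-1) (1 2) c^(1-k)
    ok_iii = all(t(k, k + 1) == compose(power(c, k - 1), t(1, 2), power(c, 1 - k))
                 for k in range(1, n))
    return ok_i and ok_ii and ok_iii


print(check_generating_identities(6))  # True
```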
As remarked above, the decomposition of a permutation into transpositions
will not, in general, be unique, nor need those transpositions
be disjoint. It need not even be the case that two such decompositions
have the same number of transpositions; however we can at least say
that the parity (odd or even) of the number of transpositions will be
the same for any such decompositions of the same permutation.
Proposition 1.61 Let $\sigma$ be a finite permutation in some finite symmetric
group $S_n$. Let $\sigma = (x_1\ x_2)(x_3\ x_4) \cdots (x_{2k-1}\ x_{2k})$ be a decomposition of
$\sigma$ into $k$ transpositions, and let $\sigma = (y_1\ y_2)(y_3\ y_4) \cdots (y_{2h-1}\ y_{2h})$ be another
decomposition of $\sigma$ into $h$ transpositions. Then $h \equiv k \pmod 2$.

Proof Consider the polynomial
$$P(x_1, \ldots, x_n) = \prod_{1 \leq i < j \leq n} (x_i - x_j) = (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)(x_2 - x_3) \cdots (x_{n-1} - x_n).$$
Given a permutation $\sigma \in S_n$, we define
$$(\sigma P)(x_1, \ldots, x_n) = P(x_{\sigma(1)}, \ldots, x_{\sigma(n)}).$$
The permuted polynomial $(\sigma P)(x_1, \ldots, x_n)$ has exactly the same factors
$(x_i - x_j)$ as $P(x_1, \ldots, x_n)$, except that some of the variables $x_i$ have
been permuted, and some of the factors will change sign. We may
therefore define the sign of $\sigma$ to be the quotient
$$\operatorname{sign}(\sigma) = \frac{(\sigma P)(x_1, \ldots, x_n)}{P(x_1, \ldots, x_n)} = \pm 1.$$
In general, for any two permutations $\sigma, \tau \in S_n$,
$$\begin{aligned}
\operatorname{sign}(\sigma\tau) &= \frac{(\sigma\tau P)(x_1, \ldots, x_n)}{P(x_1, \ldots, x_n)} = \frac{P(x_{\sigma\tau(1)}, \ldots, x_{\sigma\tau(n)})}{P(x_1, \ldots, x_n)} \\
&= \frac{P(x_{\sigma\tau(1)}, \ldots, x_{\sigma\tau(n)})}{P(x_1, \ldots, x_n)} \cdot \frac{P(x_{\tau(1)}, \ldots, x_{\tau(n)})}{P(x_{\tau(1)}, \ldots, x_{\tau(n)})} \\
&= \frac{P(x_{\sigma(1)}, \ldots, x_{\sigma(n)})}{P(x_1, \ldots, x_n)} \cdot \frac{P(x_{\tau(1)}, \ldots, x_{\tau(n)})}{P(x_1, \ldots, x_n)} \\
&= \operatorname{sign}(\sigma)\operatorname{sign}(\tau).
\end{aligned}$$
The third equality holds because relabelling the variables does not change
the quotient: $P(x_{\sigma\tau(1)}, \ldots, x_{\sigma\tau(n)}) / P(x_{\tau(1)}, \ldots, x_{\tau(n)}) = P(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) / P(x_1, \ldots, x_n)$.
The transposition $(1\ 2)$ has negative sign, since
$$\operatorname{sign}(1\ 2) = \frac{P(x_2, x_1, \ldots, x_n)}{P(x_1, x_2, \ldots, x_n)} = \frac{x_2 - x_1}{x_1 - x_2} = -1.$$
In fact, all transpositions have negative sign, since part (i) of Proposition 1.60
tells us that a transposition $\tau = (a\ b) \in S_n$ can be written as a
product $(1\ a)(1\ b)(1\ a)$, which also has sign $-1$. Therefore, by the
multiplicativity condition above, if $\sigma \in S_n$ can be decomposed as a product
of an even number of transpositions, then $\operatorname{sign}(\sigma) = +1$, and if it
decomposes as an odd number of transpositions, then $\operatorname{sign}(\sigma) = -1$.
Since the sign of a permutation is independent of any particular
decomposition, a permutation with positive sign must be equal to a
product of an even number of transpositions, while a permutation with
negative sign must be a product of an odd number of transpositions.
So, if $\sigma = (x_1\ x_2) \cdots (x_{2k-1}\ x_{2k}) = (y_1\ y_2) \cdots (y_{2h-1}\ y_{2h})$, then $h$ and $k$
must be either both odd or both even; that is, $h \equiv k \pmod 2$.
For example, let $\sigma = (1\ 3\ 4) = (1\ 4)(1\ 3) \in S_4$. Then
$$P(x_1, x_2, x_3, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4)$$
and hence
$$\begin{aligned}
(\sigma P)(x_1, x_2, x_3, x_4) &= (x_{\sigma(1)} - x_{\sigma(2)})(x_{\sigma(1)} - x_{\sigma(3)})(x_{\sigma(1)} - x_{\sigma(4)})(x_{\sigma(2)} - x_{\sigma(3)})(x_{\sigma(2)} - x_{\sigma(4)})(x_{\sigma(3)} - x_{\sigma(4)}) \\
&= (x_3 - x_2)(x_3 - x_4)(x_3 - x_1)(x_2 - x_4)(x_2 - x_1)(x_4 - x_1) \\
&= P(x_1, x_2, x_3, x_4).
\end{aligned}$$
On the other hand, consider the permutation $\tau = (1\ 2) \in S_4$. Then
$$\begin{aligned}
(\tau P)(x_1, x_2, x_3, x_4) &= (x_{\tau(1)} - x_{\tau(2)})(x_{\tau(1)} - x_{\tau(3)})(x_{\tau(1)} - x_{\tau(4)})(x_{\tau(2)} - x_{\tau(3)})(x_{\tau(2)} - x_{\tau(4)})(x_{\tau(3)} - x_{\tau(4)}) \\
&= (x_2 - x_1)(x_2 - x_3)(x_2 - x_4)(x_1 - x_3)(x_1 - x_4)(x_3 - x_4) \\
&= -P(x_1, x_2, x_3, x_4).
\end{aligned}$$
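The sign defined in this proof is easy to compute: a factor $(x_{\sigma(i)} - x_{\sigma(j)})$ with $i < j$ is the negative of a factor of $P$ exactly when $\sigma(i) > \sigma(j)$, so the sign is $-1$ raised to the number of such pairs. The sketch below (hypothetical helper names; permutations stored as tuples of images) uses this to recompute the two worked examples and to check multiplicativity over all of $S_4$.

```python
from itertools import combinations, permutations


def compose(s, t):
    """Right-to-left product: apply t first, then s."""
    return tuple(s[image - 1] for image in t)


def sign(p):
    """Sign of p: flip once for every pair i < j with p(i) > p(j)."""
    result = 1
    for i, j in combinations(range(len(p)), 2):
        if p[i] > p[j]:
            result = -result
    return result


sigma = (3, 2, 4, 1)  # the 3-cycle (1 3 4): 1 -> 3, 3 -> 4, 4 -> 1
tau = (2, 1, 3, 4)    # the transposition (1 2)
print(sign(sigma), sign(tau))  # 1 -1

S4 = list(permutations(range(1, 5)))
print(all(sign(compose(s, t)) == sign(s) * sign(t) for s in S4 for t in S4))  # True
print(sum(1 for p in S4 if sign(p) == 1))  # 12
```

The last line counts the permutations in $S_4$ with positive sign; it gives exactly half of $4! = 24$.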

Definition 1.62 Let $\sigma \in S_n$ be a finite permutation. If $\sigma$ decomposes
into an even number of transpositions (that is, $\operatorname{sign}(\sigma) = +1$) we say
it is an even permutation. Otherwise we call it an odd permutation.
The product of two odd permutations or two even permutations
is clearly going to be an even permutation, while the product of
an odd permutation with an even one is clearly going to be odd.
Furthermore, the parity (odd or even) of an inverse permutation $\sigma^{-1}$
will be equal to the parity of the original permutation $\sigma$. Finally, the
identity permutation $\iota$ is even (because it decomposes to a product
of zero transpositions, and zero is even). This gives another family of
groups related to the symmetric groups $S_n$:
Example 1.63 The alternating group $A_n$ is the permutation group
that consists of all even permutations of the set $\{1, \ldots, n\}$. This set is
closed under composition of permutations (and hence composition
is a well-defined binary operation on $A_n$), it contains the identity
permutation $\iota$, and every even permutation $\sigma$ has a unique inverse
$\sigma^{-1}$ which is also even, by the remarks above.
For $n \geq 2$ the order of $A_n$ is $n!/2$, half the order of the symmetric group $S_n$,
since exactly half of the permutations on $n$ elements are even.
By analogy with $\operatorname{Sym}(X)$, we may sometimes use $\operatorname{Alt}(X)$ to denote
the alternating group on a (possibly infinite) set $X$.

The groups $A_1$ and $A_2$ are trivial, $A_3 \cong \mathbb{Z}_3$ and $A_4$ is isomorphic to
the rotation group of the tetrahedron. The alternating groups $A_n$ for
$n \geq 5$ are examples of what are known as simple groups, which we
will discuss further in Chapter 3.
Just as the full symmetric group $S_n$ is generated by transpositions,
it turns out that (apart from the two trivial cases $A_1$ and $A_2$) the
alternating groups are generated by the period-3 cyclic permutations:
Proposition 1.64 For $n \geq 3$, any even permutation on the set $\{1, \ldots, n\}$
may be written as a product of 3-cycles.

Proof Given an element $\sigma \in A_n$, by Proposition 1.60 we may write it
as the product of an even number of transpositions of the form
$$\sigma = (1\ x_1)(1\ x_2) \cdots (1\ x_{2k}).$$
Each adjacent pair of transpositions $(1\ x_{2i-1})(1\ x_{2i})$ with $x_{2i-1} \neq x_{2i}$ may now be
rewritten as a 3-cycle $(1\ x_{2i}\ x_{2i-1})$, while a pair with $x_{2i-1} = x_{2i}$ is the identity and can be omitted.
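For small $n$ the statement of Proposition 1.64 can be confirmed by exhaustive search. The sketch below is an illustration under stated assumptions, not the method of the proof: it closes the set of 3-cycles under composition (which suffices in a finite group) and compares the result with the set of even permutations. All function names are hypothetical.

```python
from itertools import combinations, permutations


def cycle(n, *pts):
    """Return the cycle (pts[0] pts[1] ...) in S_n as a tuple p with p[i-1] = image of i."""
    p = list(range(1, n + 1))
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a - 1] = b
    return tuple(p)


def compose(s, t):
    """Right-to-left product: apply t first, then s."""
    return tuple(s[image - 1] for image in t)


def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    result = 1
    for i, j in combinations(range(len(p)), 2):
        if p[i] > p[j]:
            result = -result
    return result


def generated_by(generators):
    """All finite products of the given generators (a subgroup, as the group is finite)."""
    n = len(generators[0])
    identity = tuple(range(1, n + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def three_cycles(n):
    return sorted({cycle(n, a, b, c) for a, b, c in permutations(range(1, n + 1), 3)})


def even_permutations(n):
    return {p for p in permutations(range(1, n + 1)) if sign(p) == 1}


for n in (3, 4, 5):
    print(n, generated_by(three_cycles(n)) == even_permutations(n))
```

Each line of output pairs $n$ with `True` when the 3-cycles generate exactly the even permutations of $\{1, \ldots, n\}$.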
This completes our first venture into group theory, and also our first
look at symmetric groups. In the next chapter we will delve further
into the subject, studying subgroups: groups within groups. The
importance of permutation groups will become clear: by Cayley's
Theorem,¹⁰ every group is isomorphic to a permutation group.
¹⁰ Theorem 2.13, page 47.

Summary

In this chapter we began our foray into abstract algebra by distilling
the basic properties of integer addition, probably the first number
system any of us meet. We found that this number system $\mathbb{Z}$ consisted
of a set $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ equipped with a binary
operation¹¹ satisfying certain properties: commutativity,¹² associativity,¹³
existence of an identity element,¹⁴ and existence of inverse elements.¹⁵
¹¹ Definition 1.1, page 2.
¹² Definition 1.3, page 3.
¹³ Definition 1.4, page 3.
¹⁴ Definition 1.6, page 4.
¹⁵ Definition 1.8, page 6.

Restating this in more general terms we obtain the concept of a
group:¹⁶ a set $G$ equipped with an associative binary operation $*$
such that $G$ contains a special identity element $e$, and each element $g$
of $G$ has an inverse $g^{-1}$. A group whose operation is commutative
as well is said to be an abelian group.¹⁷ The order¹⁸ $|G|$ of a group $G$
is the number of elements it has.
¹⁶ Definition 1.9, page 6.
¹⁷ Definition 1.11, page 7.
¹⁸ Definition 1.10, page 6.
Certain consequences follow more or less immediately from this definition:
the identity element $e$ is unique in a given group $G$ (that is,
there is no other element $f$ satisfying the required properties), as is the
inverse $g^{-1}$ of any given element $g$.¹⁹ The left and right cancellation
laws also follow from the definition.²⁰ The inverse of a product of two
elements is equal to the product of the corresponding inverses, but
written in the opposite order: $(gh)^{-1} = h^{-1} g^{-1}$.
¹⁹ Proposition 1.16, page 8.
²⁰ Proposition 1.15, page 8.

Probably the simplest examples of groups apart from $\mathbb{Z}$ itself and the
trivial group²¹ $\{e\}$ are the finite cyclic groups²² $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$
with modulo-$n$ addition. A simple way of studying the internal
structure of a group is by examining its multiplication table. The table
for $\mathbb{Z}_3$, suitably relabelled, is structurally identical to that for the set
$\{1, \omega, \omega^2\}$ of complex cube roots of unity. This leads us to the notion
of an isomorphism:²³ a bijection from one group to another which
respects the group structures. More precisely, a bijection $f : G \to H$ is
an isomorphism if $f(g_1 g_2) = f(g_1) f(g_2)$ for all $g_1, g_2 \in G$.
²¹ Example 1.13, page 7.
²² Example 1.12, page 7.
²³ Definition 1.17, page 11.

Another tool for understanding the internal structure of a group is the
order²⁴ $|g|$ of an element $g$: the smallest natural number $n$ such that
$g^n = e$ (or $ng = e$ if we're using additive notation). If no such $n$ exists,
we say the element is of infinite order. The order $|e|$ of the identity
element is 1, and indeed $e$ is the only element with order 1.²⁵ If every
element of $G$ apart from $e$ has order 2, then $G$ must be abelian.²⁶ The
order of an element is invariant under isomorphisms: the image of an
element has the same order in its group as the element itself did in
the original group.²⁷
²⁴ Definition 1.18, page 11.
²⁵ Proposition 1.19, page 11.
²⁶ Proposition 1.20, page 11.
²⁷ Proposition 1.21, page 12.
Some groups can be reconstructed from a proper subset of elements of
the group. We call those elements generators and, if there are finitely
many of them, say that the group is finitely generated.²⁸ In the case of
the integers $\mathbb{Z}$ or the finite cyclic groups $\mathbb{Z}_n$, we can recover the whole
of the group by taking finite products (or sums) of a single nontrivial
generator; groups which are generated by a single element are said
to be cyclic.²⁹ There is only one cyclic group (up to isomorphism) of
a given (finite or infinite) order.³⁰ The order of a (finite or infinite)
cyclic group is equal to the order of any of its generators.³¹ For the
finite cyclic groups $\mathbb{Z}_n$, a nonzero integer $k \in \mathbb{Z}_n$ generates the whole
group if and only if $k$ and $n$ are coprime.³²
²⁸ Definition 1.22, page 12.
²⁹ Definition 1.23, page 12.
³⁰ Proposition 1.24, page 13.
³¹ Proposition 1.25, page 13.
³² Proposition 1.26, page 13.

Given two groups $G$ and $H$ we can form a new group by taking the
cartesian product $G \times H$ and defining the obvious combined group
structure on it; we call this the direct product $G \times H$ (or, if we're using
additive notation, the direct sum $G \oplus H$).³³,³⁴ In particular, the Klein
4-group $V_4$ is isomorphic to the direct sum $\mathbb{Z}_2 \oplus \mathbb{Z}_2$,³⁵ and in general
$\mathbb{Z}_m \oplus \mathbb{Z}_n \cong \mathbb{Z}_{mn}$ if and only if $m$ and $n$ are coprime.³⁶
³³ Definition 1.27, page 14.
³⁴ Proposition 1.28, page 14.
³⁵ Example 1.30, page 15.
³⁶ Proposition 1.32, page 16.

Matrices yield an important and rich class of groups, many of which
are nonabelian. Among these are the general linear groups $GL_n(K)$
of invertible matrices,³⁷ the orthogonal groups $O_n(K)$ of matrices
whose inverse is equal to their transpose,³⁸ and the unitary groups of
complex matrices whose inverse is equal to their conjugate transpose.³⁹
(Here $K$ can be either $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$, and later we will consider the case of
a finite field $\mathbb{F}_p$.) Each of these groups has a special version consisting
of just the matrices of the given type with determinant 1,⁴⁰,⁴¹,⁴² and
in addition we have the symplectic groups $Sp_{2n}(K)$ of symplectic
matrices.⁴³ An interesting and important finite example is given
by Pauli's spin matrices; this is closely related to the eight-element
quaternion group $Q_8$.⁴⁴
³⁷ Example 1.34, page 17.
³⁸ Example 1.36, page 18.
³⁹ Example 1.38, page 18.
⁴⁰ Example 1.35, page 17.
⁴¹ Example 1.37, page 18.
⁴² Example 1.39, page 18.
⁴³ Example 1.40, page 18.
⁴⁴ Example 1.42, page 19.
Matrix groups led naturally to a discussion of geometric operations on
vector spaces, and in particular we considered isometries: operations
which preserve distance. Isometries which preserve orientation are
said to be direct, while those which reverse orientation are said to be
opposite.⁴⁵ Considering those isometries which map a particular geometric
object to itself leads to the notion of symmetry groups such as
the dihedral groups $D_n$: these comprise the $2n$ symmetry operations
(reflections and rotations) we can perform on a regular $n$-gon.⁴⁶ The
Klein 4-group $V_4$ can similarly be viewed as the symmetry group of a
rectangle.⁴⁷ More generally, the Euclidean groups $E_n^+$ and $E_n$ are defined
as, respectively, the direct and full isometry groups $\operatorname{Isom}^+(\mathbb{R}^n)$
and $\operatorname{Isom}(\mathbb{R}^n)$ of $\mathbb{R}^n$, and in particular we characterised the various
types of elements in the two- and three-dimensional case.⁴⁸,⁴⁹
⁴⁵ Definition 1.46, page 22.
⁴⁶ Definition 1.43, page 21.
⁴⁷ Example 1.45, page 21.
⁴⁸ Definition 1.47, page 22.
⁴⁹ Definition 1.48, page 22.

In turn, symmetry groups led to a discussion of permutations: bijections
$\sigma : X \to X$ from a set to itself.⁵⁰ Composition of permutations is
associative, there is an obvious identity permutation $\iota$, and bijections
are necessarily invertible. This leads to the notion of the symmetric
group $\operatorname{Sym}(X)$ on a set $X$;⁵¹ in the case where $X$ is a finite set with $n$
elements, we usually denote this $S_n$ (a simple argument⁵² shows that
$|S_n| = n!$). We examined two different notations for finite permutations:
array notation and cycle notation.⁵³ The first of these is perhaps
more intuitive in some respects, and it's a little easier to compose
permutations written in this way, but it doesn't really give a clear
insight into the internal structure of the permutation. The latter is more
compact and does allow us to see how individual objects are shuffled
around by the permutation, and with a little bit of practice it becomes
almost as easy to write down the composition of two permutations.
⁵⁰ Definition 1.50, page 24.
⁵¹ Definition 1.51, page 24.
⁵² Proposition 1.52, page 24.
⁵³ Algorithm 1.53, page 26.
A simple argument demonstrates that any finite permutation can be
written as a product of disjoint cycles; that is, cycles which don't affect
the same objects, and hence don't interact with each other.⁵⁴ The order
of a cycle is (reasonably obviously) equal to its length,⁵⁵ and (less
obviously) the order of a product of disjoint cycles is equal to the lowest
common multiple of the lengths of the individual cycles.⁵⁶ We give a
cycle of length 2 (that is, a permutation which just interchanges two
distinct objects and leaves everything else where it is) a special name:
we call it a transposition.⁵⁷ It transpires that any finite permutation
can be written as a product of (not necessarily disjoint) transpositions;
that is, any finite permutation can be performed by carefully swapping
pairs of elements in some sequence.⁵⁸ This sequence of transpositions
will not, in general, be unique for a given permutation; even worse,
two different such decompositions of a given permutation won't
necessarily even have the same number of transpositions. However, it is
at least the case that all such sequences for a given permutation will
have the same parity: a permutation will always be decomposable
into either an odd or an even number of transpositions, but never both.⁵⁹ This gives
us a neat classification of finite permutations into two types: odd or
even.⁶⁰ The even permutations of $n$ objects form a smaller group
called $A_n$, the alternating group, which is exactly half the size of the
full symmetric group $S_n$ (which contains both the odd and the even
permutations).⁶¹ It is straightforward to show that any even finite
permutation on three or more objects can be written as a product of
length-3 cycles.⁶²
⁵⁴ Proposition 1.54, page 27.
⁵⁵ Definition 1.55, page 28.
⁵⁶ Proposition 1.57, page 28.
⁵⁷ Definition 1.56, page 28.
⁵⁸ Proposition 1.58, page 29.
⁵⁹ Proposition 1.61, page 30.
⁶⁰ Definition 1.62, page 31.
⁶¹ Example 1.63, page 32.
⁶² Proposition 1.64, page 32.

References and further reading


I Kleiner, The evolution of group theory: a brief survey, Mathematics Magazine 59.4 (1986) 195–215
H Wussing, The Genesis of the Abstract Group Concept, Dover (2007)
Two overviews of the history of group theory from the late 18th century to the early 20th century.

Exercises
1.1 Let $*$ be a binary operation defined on $\mathbb{R}$ such that $x * y = x + y + xy$.
(a) Is $*$ associative?
(b) Is $*$ commutative?
(c) Is there an element $e \in \mathbb{R}$ which serves as an identity with respect to $*$?
(d) Does every element $x \in \mathbb{R}$ have an inverse $x^{-1}$ with respect to $*$?
1.2 Let S = { a, b} be a set with two elements. How many possible binary operations can be defined
on this set? How many of these determine semigroups, that is, how many are associative? How
many of those determine monoids? How many determine a group structure?
1.3 Which of the following binary operations (defined on the set $\mathbb{R}$ of real numbers) are associative,
and which are commutative?
(a) $x * y = x + 2y$
(b) $x * y = x + y + xy + 1$
(c) $x * y = x$
1.4 Show that the multiplication table for a finite group G satisfies the Latin square property. That
is, show that each element of the group occurs exactly once in each row and column of the
table.
1.5 Let $* : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ by $x * y = xy + 1$. Is this operation commutative? Does it determine a group
structure on $\mathbb{R}$? If so, prove it by verifying the necessary criteria; if not, which axioms fail?
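1.6 Which of the following are groups? For any that are not, say which of the axioms G0–G3 fail.
(a) $(\mathbb{Z}, \times)$
(b) $(\mathbb{Z}_3, \times_3)$
(c) $(\mathbb{Z}_4, \times_4)$
(d) $(\mathbb{Z}_n, \times_n)$ if $n$ isn't prime
(e) $(\mathbb{Q}, \oplus)$ where $\frac{a}{b} \oplus \frac{c}{d} = \frac{a+c}{b+d}$, for $a, b, c, d \in \mathbb{Z}$ and $b, d > 0$
(f) $(\mathbb{Z}_{\mathrm{even}}, +)$
(g) $(\mathbb{Z}_{\mathrm{odd}}, +)$
(h) $(\mathbb{Z}, -)$
Here, $\times$ represents the usual multiplication operation defined on $\mathbb{Z}$, and $\times_n$ represents
multiplication modulo $n$.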
1.7 Given a group $G$, and some fixed element $g \in G$, show that the function $\phi_g : x \mapsto g^{-1} x g$ is an
isomorphism $G \to G$.
1.8 Write out the group multiplication table for $\mathbb{Z}_2 \oplus \mathbb{Z}_5$.
1.9 Let $G$ be a group. Show that if $(ab)^2 = a^2 b^2$ for all $a, b \in G$, then $G$ must be abelian.
1.10 Let $G$ be a nonempty set and let $* : G \times G \to G$ be an associative binary operation. Suppose
that there is a distinguished element $e \in G$ such that:
(a) $e * x = x$ for all $x \in G$. (That is, $e$ is a left identity.)
(b) For any $x \in G$, there exists $y \in G$ such that $y * x = e$. (That is, $G$ has left inverses.)
Then show that $G = (G, *)$ is a group. (This demonstrates that the conditions in Definition 1.9
are slightly stronger than necessary.)
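1.11 Let $G$ be a group, and suppose that $g \in G$ and $m, n \in \mathbb{Z}$. Prove that $(g^m)^n = g^{mn}$ and
$g^m g^n = g^{m+n}$.
1.12 Show that $|a * b| = |b * a|$ for any elements $a, b$ of some group $G = (G, *)$.
1.13 Show that any finite group of even order has an odd number of elements of order 2.
1.14 Show that $\operatorname{Isom}(\Delta_3)$, the full isometry group of the tetrahedron, is isomorphic to the symmetric
group $S_4$, and that the direct isometry group $\operatorname{Isom}^+(\Delta_3)$ is isomorphic to the alternating group
$A_4$.
1.15 Let $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 5 & 1 & 4 & 6 \end{pmatrix}$ be an element of $S_6$. Write this permutation as a product of disjoint cycles,
and then as a product of transpositions.
1.16 Write the following elements of $S_7$ as (i) a product of disjoint cycles and (ii) as a product of
transpositions, and (iii) calculate their orders.
(a) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 5 & 4 & 1 & 7 & 2 & 6 \end{pmatrix}$
(b) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 4 & 7 & 1 & 6 & 5 & 3 \end{pmatrix}$
(c) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 1 & 4 & 3 & 5 & 7 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 6 & 2 & 1 & 7 & 3 & 5 \end{pmatrix}$
(d) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 7 & 6 & 3 & 5 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 2 & 6 & 3 & 5 & 1 & 7 \end{pmatrix}$
(e) $(3\ 1\ 2\ 4\ 5)(4\ 2\ 1\ 3)(2\ 1)(4\ 1\ 2)$
(f) $(2\ 7)(1\ 3\ 4)(7\ 2)(6\ 5\ 3)$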
1.17 The 3-cycle (1 2 3) has order 3, the highest possible for an element of the symmetric group $S_3$;
similarly (1 2 3 4) has maximal order 4 in S4 , while (1 2 3)(4 5) has order 6, which is the largest
possible in S5 . Find examples of elements of maximal order in the symmetric groups S6 to S12 .
1.18 Show that Sym( A) is nonabelian if | A| > 2.
1.19 What is the highest order of any element of S7 ? How many permutations in S7 have this order?
1.20 For $n \in \mathbb{Z}$ define a function $f_n : \mathbb{Z} \to \mathbb{Z}$ by $f_n(m) = m + n$ for any $m \in \mathbb{Z}$. Show that
$f_n \in \operatorname{Sym}(\mathbb{Z})$, and also that $f_n \circ f_m = f_{m+n} = f_m \circ f_n$ and $f_n^{-1} = f_{-n}$ for any $m, n \in \mathbb{Z}$.
1.21 (a) Show that the function $f : \mathbb{Z}_{13} \to \mathbb{Z}_{13}$ given by $f(x) = x^5$ is a permutation. Write this
permutation in disjoint cycle notation.
(b) More generally, choose some integer $0 < \alpha < 13$, show that the Dickson polynomial
$D_5(x, \alpha) = x^5 - 5\alpha x^3 + 5\alpha^2 x$ defines a permutation of $\mathbb{Z}_{13}$, and write this permutation in
disjoint cycle notation.
(These are examples of permutation polynomials, which have applications in the design of
error-correcting codes for data transmission and storage.)
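1.22 Let $A \in GL_n(\mathbb{R})$ be some invertible $n \times n$ matrix over $\mathbb{R}$. Define a map $f_A : \mathbb{R}^n \to \mathbb{R}^n$ by
$f_A(v) = Av$ and show that $f_A \in \operatorname{Sym}(\mathbb{R}^n)$.
1.23 Let $S = \mathbb{R} \setminus \{-1, 0, 1\}$ and let $f : S \to S$ map $x \mapsto \frac{1+x}{1-x}$. Denote by $f^n$ the $n$-fold composite
$f \circ f \circ \cdots \circ f$. Find the order of $f$; that is, the smallest positive integer $n$ such that $f^n = \mathrm{id}$, and
thereby show that $\langle f \rangle \cong \mathbb{Z}_n$.
1.24 Show that $Sp_{2n}(\mathbb{R})$ is indeed a group.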
1.25 Show that GLn (Z) is abelian if and only if n = 1.
1.26 Show that the rotation group of the cube is isomorphic to S4 .
2 Subgroups

These purely mathematical sciences of algebra and geometry are sciences of
the pure reason deriving no weight and no assistance from experiment, and
isolated, or at least isolable, from all outward and accidental phenomena.
William Rowan Hamilton (1805–1865),
Introductory Lecture on Astronomy, Dublin University Review 1 (1833)

A careful examination of the multiplication tables of many of
the groups we met in the previous chapter reveals a wealth of
internal structure. The dihedral group $D_3$, for example, has an obvious
block consisting of the identity and the clockwise and anticlockwise
rotations. We could throw away the three reflection operations and still
be left with a perfectly respectable group, which in this case happens
to be the rotation group $R_3 = \operatorname{Isom}^+(\Delta_2) \cong \mathbb{Z}_3$. So in some sense, $D_3$
has a copy of $R_3$ (or, for that matter, $\mathbb{Z}_3$) inside it.

$$\begin{array}{c|cccccc}
 & e & r & r^2 & m_1 & m_2 & m_3 \\ \hline
e & e & r & r^2 & m_1 & m_2 & m_3 \\
r & r & r^2 & e & m_3 & m_1 & m_2 \\
r^2 & r^2 & e & r & m_2 & m_3 & m_1 \\
m_1 & m_1 & m_2 & m_3 & e & r & r^2 \\
m_2 & m_2 & m_3 & m_1 & r^2 & e & r \\
m_3 & m_3 & m_1 & m_2 & r & r^2 & e
\end{array}$$
Table 2.1: The multiplication table for the dihedral group $D_3$ with the subgroup $\{e, r, r^2\}$ highlighted

Similarly, we could throw away everything except the identity and
a single reflection ($m_1$ for example) and still be left with an order-2
group isomorphic to $\mathbb{Z}_2$. So, in the same sense, $D_3$ also has a copy
(more accurately, three copies) of $\mathbb{Z}_2$ embedded in it.

Table 2.2: The multiplication table for the dihedral group $D_3$ (entries as in Table 2.1) with the subgroup $\{e, m_1\}$ highlighted

Similarly, looking at the multiplication table for $\mathbb{Z}_4$ reveals an order-2
group $\{0, 2\} \cong \mathbb{Z}_2$ inside it.

$$\begin{array}{c|cccc}
+ & 0 & 1 & 2 & 3 \\ \hline
0 & 0 & 1 & 2 & 3 \\
1 & 1 & 2 & 3 & 0 \\
2 & 2 & 3 & 0 & 1 \\
3 & 3 & 0 & 1 & 2
\end{array}$$
Table 2.3: The multiplication table for the cyclic group $\mathbb{Z}_4$ with the subgroup $\{0, 2\}$ highlighted

In this chapter, we will make this concept more precise, and develop
these ideas to learn more about the internal structure of groups. Along
the way, we will prove Lagrange's Theorem, which imposes an important
necessary (but not sufficient) condition on the order of these
groups-within-groups, and use it to prove Euler's Theorem and Fermat's
Little Theorem, two important and useful number-theoretic
results about the factorisation of integers.

2.1 Groups within groups

On the other side, the general opinion has been and is that it is indeed by
experience that we arrive at the mathematics, but that experience is not their
proper foundation: the mind itself contributes something.
Arthur Cayley (1821–1895),
Presidential Address to the British Association for the Advancement of Science (1883)

The main concept we will explore in this chapter is that of
a subgroup: a smaller group neatly embedded inside a larger one.
More precisely, given some group $G = (G, *)$, we are interested in
finding a subset $H \subseteq G$ which is a group in its own right under the
same binary operation as $G$. Not just any subset will do, however,
so we need to decide what criteria we want a given subset to satisfy.

Given a group $G = (G, *)$ and a subset $H \subseteq G$, we impose on $H$ the
same binary operation as we use in $G$. We can do this, because every
element of $H$ is also contained in $G$, so for any $h_1, h_2 \in H \subseteq G$, the
product $h_1 * h_2$ is well-defined, because $h_1$ and $h_2$ are also in $G$.
But we have to be careful not to gloss over a subtly important, if
slightly pedantic, point here. The binary operation $*$ is a function
$* : G \times G \to G$, but to define a group structure on $H$ we actually need
a function $*_H : H \times H \to H$. The obvious way of dealing with this is
to define a new function by restricting the domain of $*$ to the subset
$H \times H \subseteq G \times G$. This gives us a function $*|_{H \times H} : H \times H \to G$, which
is nearly what we want, but not quite. It will work fine, though, if the
image of $*|_{H \times H}$ is contained in $H$ rather than just $G$. This requirement
is exactly the same as requiring that $H$ is closed under the action of
$*|_{H \times H}$. If it is, then we can just define our binary operation $*_H$ to be
the restriction $*|_{H \times H}$ with the codomain changed to be $H$ instead of $G$.
We call this new operation $*_H$ the induced operation, and from now
on we'll drop the subscript $H$ unless to do so would be ambiguous.
We now have a criterion that a subset $H \subseteq G$ must satisfy in order
for $H$ to be a subgroup of $G$: it must be closed under the action of
the induced operation. In addition, $H = (H, *_H)$ must also satisfy the
usual axioms for a group (see Definition 1.9). We get associativity (G1)
for free: $*_H$ is associative since $*$ is, and merely restricting it to some
closed subset of $G$ isn't going to cause any problems in that regard.
It's also worth noting at this point that if $G$ is abelian, then any
subgroup will be too, for a similar reason: $*_H$ inherits commutativity
from $*$ in the same way that it inherits associativity.
That leaves us with two remaining axioms to check: existence of
an identity (G2) and existence of inverses (G3): we require $e \in H$
and also that $h^{-1} \in H$ for all $h \in H$. Actually, the presence of
the identity element follows from the existence of inverses and the
closure requirement, since if $h \in H$ and $h^{-1} \in H$, closure implies that
$h *_H h^{-1} = e = h^{-1} *_H h$ must also be in $H$. One last thing we need to
remember is that a group must have at least one element, so $H$ can't
be the empty set $\varnothing$. This discussion gives us the following definition.
Definition 2.1 Let $G = (G, *)$ be a group, let $H \subseteq G$ be a nonempty
subset of $G$, and let $*_H$ be the induced operation obtained by
restricting $*$ to $H \times H$. Then $H = (H, *_H)$ is a subgroup of $G$ (written
$H < G$) if and only if:
SG1 $H$ is closed under the action of $*_H$. That is, for all $h_1, h_2 \in H$,
the product $h_1 *_H h_2 \in H$ too.
SG2 For all $h \in H$, the inverse $h^{-1} \in H$ as well.
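For a finite subset the two conditions of this definition can be tested directly. The following is a minimal sketch, assuming that `H` really is a subset of some group `G` and that `op` and `inv` are the operation and inversion map of `G`; the function names are hypothetical, and the examples use the cyclic groups with addition modulo `n`.

```python
def is_subgroup(H, op, inv):
    """Test the subgroup conditions for a finite subset H of a group.

    op(a, b) is the group operation and inv(a) the group inverse; both are
    assumed to come from the ambient group G that contains H.
    """
    H = set(H)
    if not H:
        return False  # a subgroup must be nonempty
    closed = all(op(a, b) in H for a in H for b in H)  # SG1: closure
    has_inverses = all(inv(a) in H for a in H)         # SG2: inverses
    return closed and has_inverses


def add_mod(n):
    return lambda a, b: (a + b) % n


def neg_mod(n):
    return lambda a: (-a) % n


print(is_subgroup({0, 2}, add_mod(4), neg_mod(4)))           # True
print(is_subgroup({0, 1}, add_mod(4), neg_mod(4)))           # False
print(is_subgroup({0, 3, 6, 9}, add_mod(12), neg_mod(12)))   # True
print(is_subgroup(set(), add_mod(4), neg_mod(4)))            # False
```

The subset `{0, 1}` fails because `1 + 1 = 2` lies outside it, so closure is violated.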
The remark about inheritance of commutativity also yields the following
simple but relevant fact:
Proposition 2.2 Let $G = (G, *)$ be abelian, and let $H < G$ be a subgroup
of $G$. Then $H$ is also abelian.
The converse doesn't hold, however, since nonabelian groups can
have abelian subgroups. For example, we noted earlier that $D_3$ has a
subgroup isomorphic to $\mathbb{Z}_3$ which is therefore abelian, but $D_3$ itself
isn't abelian.
This rotation subgroup $\{e, r, r^2\} < D_3$ is, as we noted earlier, exactly
the group $\operatorname{Isom}^+(\Delta_2)$ of direct isometries of an equilateral triangle
$\Delta_2$, embedded neatly inside the full isometry group $D_3 = \operatorname{Isom}(\Delta_2)$.
Generalising this observation gives the following example:
Example 2.3 Let $X$ be some geometric object (such as a subset of
$\mathbb{R}^n$ for some $n$). The group $\operatorname{Isom}^+(X)$ of direct isometries of $X$ is a
subgroup of the full isometry group $\operatorname{Isom}(X)$.

The rotations of a regular $n$-gon form an order-$n$ subgroup of the
dihedral group $D_n$ which is isomorphic to $\mathbb{Z}_n$. Similarly, the group
$\operatorname{Isom}^+(\Delta_3)$ from Example 1.49 is a subgroup of the full isometry group
$\operatorname{Isom}(\Delta_3)$, and the groups $E_2^+ = \operatorname{Isom}^+(\mathbb{R}^2)$ and $E_3^+ = \operatorname{Isom}^+(\mathbb{R}^3)$
(see Definitions 1.47 and 1.48) of, respectively, the orientation-preserving
isometries of $\mathbb{R}^2$ and $\mathbb{R}^3$ are subgroups of, respectively, the full
isometry groups $E_2 = \operatorname{Isom}(\mathbb{R}^2)$ and $E_3 = \operatorname{Isom}(\mathbb{R}^3)$.
Any nontrivial group $G$ has at least two subgroups: the trivial subgroup
$\{e\} < G$, and $G$ itself. In the next section, we will meet
Lagrange's Theorem, which places an elegant necessary (but not sufficient)
condition on what other subgroups a given group might have.
Example 2.4 The multiplicative group $\mathbb{R}^{\times}$ of nonzero real numbers
can be regarded as a subgroup of $\mathbb{C}^{\times}$, the multiplicative group of
nonzero complex numbers: $\mathbb{R}^{\times}$ is clearly a subset of $\mathbb{C}^{\times}$, it's closed
under multiplication (since the product of any two nonzero real numbers is
also a nonzero real number) and every nonzero real number has a unique
multiplicative inverse.

This last example suggests an interesting alternative viewpoint. In
this chapter so far, we've chosen a group $G$ and then looked at what
smaller groups $H$ might be contained within it. But, historically at
least, the invention (or discovery, depending on one's philosophical
perspective) of the various number systems $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ happened
in the reverse direction: we started with one number system
and then extended it in a consistent way to obtain a new number
system that satisfied certain desirable properties that were lacking in
the previous one. What happens if we take a group $G$ and, instead of
examining its subgroups, we try to extend it in some way to obtain
a larger group containing $G$ as a subgroup? This turns out to be an
interesting and rich line of inquiry, and we will return to it later when
we look at group extensions, semidirect products and extension fields.
Example 2.5 The subset $C = \{z \in \mathbb{C} : |z| = 1\} = \{e^{i\theta} : 0 \leq \theta < 2\pi\}$,
which we can regard geometrically as the unit circle in the complex
plane, is a subgroup of $\mathbb{C}^{\times}$. It's closed under multiplication, since
if $0 \leq \theta, \phi < 2\pi$, then $e^{i\theta} e^{i\phi} = e^{i(\theta + \phi)} \in C$, and it contains inverses,
since if $0 \leq \theta < 2\pi$, then $e^{i\theta} e^{i(2\pi - \theta)} = e^{2\pi i} = 1$.
In fact, this subgroup is isomorphic to the unitary group $U_1$. Recall
that $U_1$ is the group of $1 \times 1$ invertible matrices $A$ such that $A^{-1} = A^{\dagger}$,
the conjugate transpose of $A$.
Unpacking this definition further, for a $1 \times 1$ complex, nonzero matrix
$A = [z]$ to satisfy this requirement is equivalent to the condition
$z^{-1} = \bar{z}$, or $z\bar{z} = 1$. But $z\bar{z} = |z|^2$, hence $U_1 = \{[z] \in GL_1(\mathbb{C}) : |z| = 1\}$.
The map $f : U_1 \to C$ given by $[z] \mapsto z$ is clearly an isomorphism.

Figure 2.1: Lattice diagram depicting some subgroups of $GL_n(\mathbb{C})$, namely $O_n(\mathbb{C})$, $SL_n(\mathbb{C})$, $U_n$, $SO_n(\mathbb{C})$ and $SU_n$

The mention of $U_1$ in this example suggests that some of the other
matrix groups we've encountered might be subgroups of each other.
Example 2.6 Let $K$ denote one of the number systems $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$.
Then the special linear group $SL_n(K)$, the orthogonal group $O_n(K)$
and the special orthogonal group $SO_n(K)$ are all subgroups of the
general linear group $GL_n(K)$. Also, $SO_n(K) < O_n(K)$.
Furthermore, the unitary group $U_n$ and the special unitary group
$SU_n$ are both subgroups of $GL_n(\mathbb{C})$, and $SU_n < U_n$.

This example provides us with an interesting collection of groups with
complicated interrelationships, which we can depict graphically by
means of the lattice diagram or Hasse diagram shown in Figure 2.1.
In such a diagram, following a line up the page represents the
relationship "is a subgroup of". We can do this for some of the other
examples we've encountered so far, with variously illuminating
results. Figure 2.2, for example, highlights the differences in internal
structure between the cyclic group $\mathbb{Z}_4$ and the Klein group $V_4$ (see
Example 1.45), while Figure 2.3 depicts the lattice diagram for $D_3$.

Figure 2.2: Lattice diagrams for $\mathbb{Z}_4$ (with subgroups $\{0, 2\}$ and $\{0\}$) and the Klein group $V$ (with subgroups $\{e, r\}$, $\{e, h\}$, $\{e, v\}$ and $\{e\}$)
Figure 2.3: Lattice diagram for the dihedral group $D_3$ (with subgroups $\{e, r, r^2\}$, $\{e, m_1\}$, $\{e, m_2\}$, $\{e, m_3\}$ and $\{e\}$)

Looking at the lattice diagram for $\mathbb{Z}_4$, we see that the subgroup $\{0, 2\}$
can actually be regarded as the cyclic group $\langle 2 \rangle$. More interestingly,
the cyclic group $\mathbb{Z}_{12}$ contains four proper, nontrivial subgroups which
can be generated by a single element, namely
$$\langle 2 \rangle = \{0, 2, 4, 6, 8, 10\} \qquad \langle 3 \rangle = \{0, 3, 6, 9\}$$
$$\langle 4 \rangle = \{0, 4, 8\} \qquad \langle 6 \rangle = \{0, 6\}$$
which fit together in the lattice diagram shown in Figure 2.4.

Figure 2.4: Lattice diagram for the cyclic group $\mathbb{Z}_{12}$ (with subgroups $\langle 2 \rangle$, $\langle 3 \rangle$, $\langle 4 \rangle$, $\langle 6 \rangle$ and $\{0\}$)

More generally, we have the following fact:
Proposition 2.7 Let $G$ be a group. Then for any $g \in G$, the set
$$\langle g \rangle = \{g^n : n \in \mathbb{Z}\} \subseteq G$$
forms a subgroup of $G$, the cyclic subgroup generated by the element $g$.

Proof The subset $\langle g \rangle$ is clearly a subset of $G$. It is closed under the
induced operation, since $g^m * g^n = g^{m+n} \in \langle g \rangle$ for any $m, n \in \mathbb{Z}$. And
for any $g^n \in \langle g \rangle$, the inverse $(g^n)^{-1} = g^{-n}$ is also contained in $\langle g \rangle$.
Hence $\langle g \rangle < G$.
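In a finite cyclic group written additively, the power $g^k$ becomes the multiple $kg$ modulo $n$, so the cyclic subgroup generated by $g$ can be listed by repeated addition. The sketch below (with a hypothetical function name) reproduces the four proper, nontrivial cyclic subgroups of the cyclic group of order 12 listed earlier and counts all distinct cyclic subgroups.

```python
def cyclic_subgroup(g, n):
    """Return <g> inside the integers modulo n, as a sorted list (additive notation)."""
    elements = {0}
    x = g % n
    while x not in elements:
        elements.add(x)
        x = (x + g) % n
    return sorted(elements)


for g in (2, 3, 4, 6):
    print(g, cyclic_subgroup(g, 12))
# 2 [0, 2, 4, 6, 8, 10]
# 3 [0, 3, 6, 9]
# 4 [0, 4, 8]
# 6 [0, 6]

distinct = {tuple(cyclic_subgroup(g, 12)) for g in range(12)}
print(len(distinct))  # 6
```

The count of 6 includes the trivial subgroup and the whole group alongside the four subgroups printed above.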
Looking more carefully at the lattice diagrams for $\mathbb{Z}_4$ and $\mathbb{Z}_{12}$, we
see that all of their subgroups are cyclic; we can even write $\mathbb{Z}_4 = \langle 1 \rangle$,
$\mathbb{Z}_{12} = \langle 1 \rangle$ and $\{0\} = \langle 0 \rangle$ to complete the picture.
The lattice diagram for Z is similarly illuminating, although infinite
in size. An illustrative fragment is shown in Figure 2.5.

[Figure 2.5: Part of the lattice diagram for subgroups of $\mathbb{Z}$, with $\langle 1 \rangle$
at the top and $\langle 900 \rangle$ at the bottom, showing the subgroups $\langle n \rangle$ for
each divisor $n$ of 900]

Again, all of the subgroups are cyclic, generated by some single element
of $\mathbb{Z}$. (The trivial subgroup isn't, and can't be, shown: it appears
as the infinitely distant bottom vertex of the lattice, contained in all of
the other subgroups.) We will often write the set
  $\langle n \rangle = \{\ldots, -3n, -2n, -n, 0, n, 2n, 3n, \ldots\}$
of integer multiples of $n$ as $n\mathbb{Z}$.
These observations lead us to ask whether all subgroups of (finite or
infinite) cyclic groups are themselves cyclic, and it turns out that the
answer is yes:
Proposition 2.8 Let $G = \langle g \rangle$ be a (finite or infinite) cyclic group. If
$H < G$ then $H$ is also cyclic.

Proof If $H$ is trivial then it is cyclic, generated trivially by the identity
$e \in G$. If $H = G$ then it is obviously cyclic, since $G$ is.
If $H$ is a proper and nontrivial subgroup of $G$, then choose $k$ to
be the smallest positive integer such that $g^k \in H$, where $g$ is the
generator of $G$. (Such a $k$ exists: $H$ contains some $g^m \neq e$ together
with its inverse $g^{-m}$, and one of $m$, $-m$ is positive.) Now choose some
other element $g^n$ which also lies in $H$. Dividing $n$ by $k$ we get
$n = kq + r$ for some unique integers $q, r \in \mathbb{Z}$, with $0 \leq r < k$.
(That this is possible for any two integers $n, k \in \mathbb{Z}$ is a consequence of the
Division Algorithm, which we will discuss in more detail later on.)
So $g^n = g^{kq+r} = g^{kq} * g^r$, which implies that $g^r = g^{-kq} * g^n$.
Since $H$ is a subgroup of $G$, it must be closed under the induced
operation, and so $g^r$ must also lie in $H$. So $r = 0$, otherwise $k$ wasn't the
smallest positive integer for which $g^k \in H$. Hence $g^n = g^{kq} = (g^k)^q$
and so any element of $H$ can be written as a power of $g^k$. Therefore $g^k$
generates all of $H$ and so $H = \langle g^k \rangle$ is cyclic.
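The argument above can be checked by brute force for a small cyclic group. The sketch below is illustrative only: it assumes the additive group $\mathbb{Z}_{12}$, enumerates every subset containing 0, keeps those that are subgroups, and confirms that each is generated by its smallest positive element (the analogue of $g^k$ in the proof). The helper names are ours.

```python
from itertools import combinations


def is_subgroup(subset, n):
    """One-step subgroup test in Z_n: contains 0 and is closed under subtraction."""
    s = set(subset)
    return 0 in s and all((a - b) % n in s for a in s for b in s)


def subgroups(n):
    """Brute-force list of all subgroups of Z_n (only sensible for small n)."""
    found = []
    for size in range(0, n):
        for rest in combinations(range(1, n), size):
            candidate = (0,) + rest
            if is_subgroup(candidate, n):
                found.append(sorted(candidate))
    return found


def generated_by_smallest(h, n):
    """Check that the subgroup h equals <k>, where k is its smallest positive element."""
    k = min((x for x in h if x > 0), default=0)
    return sorted({(k * q) % n for q in range(n)}) == h


all_subgroups = subgroups(12)
print(all_subgroups)
print(all(generated_by_smallest(h, 12) for h in all_subgroups))
```

Exhaustive search is exponential in $n$, so this is a sanity check for small cases rather than a practical algorithm.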
A greater wealth of structure can be seen in the lattice diagram for the
dihedral group $D_4$:

  $D_4$
  $\{e, r^2, m_2, m_4\}$   $\{e, r, r^2, r^3\}$   $\{e, r^2, m_1, m_3\}$
  $\{e, m_2\}$   $\{e, m_4\}$   $\{e, r^2\}$   $\{e, m_1\}$   $\{e, m_3\}$
  $\{e\}$

The subgroups on the second level are all of order 4; two of them
($\{e, r^2, m_2, m_4\}$ and $\{e, r^2, m_1, m_3\}$) are isomorphic to the Klein group
$V_4$, while the other (the rotation subgroup $\{e, r, r^2, r^3\}$) is isomorphic
to $\mathbb{Z}_4$. The subgroups on the third level are all isomorphic to $\mathbb{Z}_2$.
For any dihedral group $D_n$, the rotation subgroup $\{e, r, r^2, \ldots, r^{n-1}\}$
is a cyclic subgroup since it can be generated by the single element
$r$. And since it has order $n$, then by Proposition 1.24, it must be
isomorphic to $\mathbb{Z}_n$.
The quaternion group $Q_8$ (see Example 1.42) has the lattice diagram
shown in Figure 2.6.

[Figure 2.6: Lattice diagram for the quaternion group $Q_8$. Top: $Q_8$;
then $\{\pm E, \pm I\}$, $\{\pm E, \pm J\}$, $\{\pm E, \pm K\}$; then $\{\pm E\}$; bottom: $\{E\}$]

This diagram is obviously different in structure to that for $D_4$, and
hence $D_4 \not\cong Q_8$.

Looking at the diagram for $D_4$, we can see that
  $\{e, r^2\} = \{e, r, r^2, r^3\} \cap \{e, r^2, m_2, m_4\}$
  $= \{e, r, r^2, r^3\} \cap \{e, r^2, m_1, m_3\}$
  $= \{e, r^2, m_2, m_4\} \cap \{e, r^2, m_1, m_3\}$.
Something similar happens in the other lattice diagrams we've seen
so far: the intersection of any two subgroups is itself a subgroup,
although in many cases this intersection will be either the trivial
subgroup or one of the two intersecting subgroups. This is true in
general:
Proposition 2.9 Let $H, K < G$ be subgroups of some group $G$. Then
their intersection $H \cap K$ is also a subgroup of $G$, and of $H$ and $K$.

Proof Since both $H$ and $K$ are subgroups of $G$, they both contain the
identity $e$ of $G$, and hence their intersection $H \cap K$ does as well, and
is therefore not empty. Given elements $g_1, g_2 \in H \cap K$, it follows that
both $g_1, g_2 \in H$ and $g_1, g_2 \in K$. Therefore their product $g_1 * g_2 \in H$
and also $g_1 * g_2 \in K$, hence $g_1 * g_2 \in H \cap K$. Thus $H \cap K$ is closed
under the induced multiplication operation. Finally, we need to show
that $H \cap K$ contains all the necessary inverses. But if $g \in H \cap K$ then
$g \in H$ and so $g^{-1} \in H$; also $g \in K$ and so $g^{-1} \in K$ as well. Therefore
$g^{-1} \in H \cap K$ as required, and hence $H \cap K$ is a subgroup of $G$.
Since $H \cap K \subseteq H$, and we have just seen that it forms a group under
the induced multiplication operation, it must therefore be a subgroup
of $H$. By an almost identical argument, $H \cap K$ is a subgroup of $K$.
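As a quick illustration of Proposition 2.9, here is a small sketch assuming the additive group $\mathbb{Z}_{12}$ and its subgroups $\langle 2 \rangle$ and $\langle 3 \rangle$; the helper names `multiples` and `is_subgroup` are ours.

```python
def multiples(k, n):
    """The cyclic subgroup <k> of the additive group Z_n, as a set."""
    return {(k * q) % n for q in range(n)}


def is_subgroup(s, n):
    """One-step subgroup test in Z_n: nonempty and closed under subtraction."""
    return bool(s) and all((a - b) % n in s for a in s for b in s)


H = multiples(2, 12)
K = multiples(3, 12)
meet = H & K  # set intersection of the two subgroups
print(sorted(meet), is_subgroup(meet, 12))
```

Here the intersection turns out to be $\langle 6 \rangle = \{0, 6\}$, which sits below both $\langle 2 \rangle$ and $\langle 3 \rangle$ in the lattice for $\mathbb{Z}_{12}$.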
We now introduce a related concept and proposition that will come in
useful in the next section when we classify groups of order 6.
Definition 2.10 Let $H, K < G$ be two subgroups of a group $G$. Then
denote by $HK$ the set
  $HK = \{h * k : h \in H, k \in K\}$.
If we are using additive notation, we may choose to denote this as
  $H + K = \{h + k : h \in H, k \in K\}$.
Note that if $G$ is nonabelian, it needn't be the case that $HK = KH$.
If, however, the elements of $H$ commute with the elements of $K$ then
$HK = KH$ is a subgroup of $G$:
Proposition 2.11 Let $H$ and $K$ be subgroups of a group $G$. Then $HK$ is
a subgroup of $G$ if and only if $HK = KH$.

Proof For any $h \in H$ and $k \in K$ we have $(h * k)^{-1} = k^{-1} * h^{-1} \in KH$,
so if $HK$ is a subgroup of $G$ then it must contain all its inverses, so
$HK \subseteq KH$. Furthermore, $h = h * e \in HK$ and $k = e * k \in HK$. If $HK$ is a
subgroup, it must be closed under multiplication, so $k * h$ must also lie
in $HK$, and so $KH \subseteq HK$, which means that $HK = KH$ as required.
Conversely, suppose that $HK = KH$. To show that it is a subgroup of $G$,
we need to confirm that it is closed under inverses and multiplication.
For the first of these, suppose that $h \in H$ and $k \in K$, so that $h * k \in HK$.
Then $(h * k)^{-1} = k^{-1} * h^{-1} \in KH = HK$, so $HK$ contains all required
inverses.
Now suppose that $h_1, h_2 \in H$ and $k_1, k_2 \in K$. Then $(h_1 * k_1) * (h_2 * k_2) =
h_1 * (k_1 * h_2) * k_2$. But $k_1 * h_2 \in KH$ and $KH = HK$, so there exist $h_3 \in H$ and
$k_3 \in K$ such that $k_1 * h_2 = h_3 * k_3$, hence
  $(h_1 * k_1) * (h_2 * k_2) = h_1 * (k_1 * h_2) * k_2 = h_1 * (h_3 * k_3) * k_2 = (h_1 * h_3) * (k_3 * k_2) \in HK$
and thus $HK$ is closed under multiplication. It is therefore a subgroup
of $G$, as claimed.
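Both directions of Proposition 2.11 can be seen in a small nonabelian example. The sketch below is an illustration under our own conventions: elements of $S_3$ are stored as tuples `p` with `p[i]` the image of `i`, and the names `compose`, `product_set` and `is_subgroup` are hypothetical helpers, not notation from the text.

```python
def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def product_set(H, K):
    """The set HK = {h * k : h in H, k in K}."""
    return {compose(h, k) for h in H for k in K}


def is_subgroup(S):
    """For a finite subset of a group: nonempty and closed under the operation."""
    return bool(S) and all(compose(a, b) in S for a in S for b in S)


e = (0, 1, 2)
H = {e, (1, 0, 2)}              # generated by the transposition swapping 0 and 1
K = {e, (0, 2, 1)}              # generated by the transposition swapping 1 and 2
R = {e, (1, 2, 0), (2, 0, 1)}   # the cyclic subgroup of order 3

print(product_set(H, K) == product_set(K, H), is_subgroup(product_set(H, K)))
print(product_set(H, R) == product_set(R, H), is_subgroup(product_set(H, R)))
```

The closure-only test is enough here because a nonempty finite subset of a group that is closed under the operation automatically contains the identity and all inverses.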
Proposition 2.12 Suppose $H, K < G$ are two subgroups of $G$ such that
$H \cap K = \{e\}$, every element of $H$ commutes with every element of $K$ (that
is, $h * k = k * h$ for all $h \in H$ and $k \in K$), and that $HK = G$. Then
$G \cong H \times K$.
This is sometimes called the internal direct product of $H$ and $K$.
Proof Let $f : H \times K \to G$ be defined by $(h, k) \mapsto h * k$ for all $h \in H$
and $k \in K$. Then if $h_1, h_2 \in H$ and $k_1, k_2 \in K$, we have
  $f((h_1, k_1)(h_2, k_2)) = f(h_1 * h_2, k_1 * k_2) = h_1 * h_2 * k_1 * k_2$
  $= h_1 * k_1 * h_2 * k_2 = f(h_1, k_1) * f(h_2, k_2)$.
This proves that $f$ satisfies the structure condition for an isomorphism.
It is clearly surjective, since we know that $G = HK$, and so any element
of $G$ can be expressed as a product $h * k = f(h, k)$, which is the image
of some ordered pair $(h, k) \in H \times K$. All that remains is to show that
$f$ is injective. Suppose that $f(h_1, k_1) = f(h_2, k_2)$. Then $h_1 * k_1 = h_2 * k_2$
and so $h_2^{-1} * h_1 = k_2 * k_1^{-1}$. But the terms on the left-hand side of this
expression belong to $H$, while the terms on the right-hand side belong
to $K$. So both $h_2^{-1} * h_1$ and $k_2 * k_1^{-1}$ must belong to the intersection
$H \cap K = \{e\}$, and therefore $h_2^{-1} * h_1 = k_2 * k_1^{-1} = e$, so $h_1 = h_2$
and $k_1 = k_2$, thus proving that $f$ is injective, and hence the required
isomorphism $H \times K \cong G$.
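To see the map $f$ from this proof at work, here is a minimal sketch assuming the additive group $\mathbb{Z}_6$ with $H = \{0, 3\}$ and $K = \{0, 2, 4\}$ (so $f(h, k) = h + k$ modulo 6). The choice of example and the variable names are ours.

```python
from itertools import product

n = 6
H = [0, 3]
K = [0, 2, 4]


def f(pair):
    """The map (h, k) -> h + k from H x K to Z_6."""
    h, k = pair
    return (h + k) % n


pairs = list(product(H, K))
images = [f(p) for p in pairs]

# f hits every element of Z_6 exactly once
is_bijection = sorted(images) == list(range(n))

# f respects the componentwise operation on H x K
is_homomorphism = all(
    f(((h1 + h2) % n, (k1 + k2) % n)) == (f((h1, k1)) + f((h2, k2))) % n
    for (h1, k1) in pairs
    for (h2, k2) in pairs
)

print(is_bijection, is_homomorphism)
```

The hypotheses of the proposition hold here because $H \cap K = \{0\}$, the group is abelian, and $H + K$ is all of $\mathbb{Z}_6$.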
Another class of groups we met in the last chapter were the symmetric
groups $\mathrm{Sym}(X)$ of permutations on some set $X$, and in particular the
finite symmetric groups $S_n$ for $n \in \mathbb{N}$. Recall that we also studied the
even permutations in $S_n$: those which can be expressed as an even
number of transpositions. In the language of this chapter, we can
certainly say that the alternating group $A_n$ is a subgroup of $S_n$. But
$S_n$ has, in general, many other subgroups too: for example, as can be
readily seen from Figure 2.3, $S_3 \cong D_3$ has subgroups isomorphic to $\mathbb{Z}_2$
and $\mathbb{Z}_3$ as well as the trivial group and $S_3$ itself. As remarked in the
last chapter, $\mathrm{Isom}(\Delta_3) \cong S_4$ and $\mathrm{Isom}^+(\Delta_3) \cong A_4$. By examining the
multiplication table in Example 1.49, we can see that $\mathrm{Isom}^+(\Delta_3)$ (and
hence $A_4$ and $S_4$) has subgroups isomorphic to $\mathbb{Z}_2$ (such as $\{e, s_A\}$)
and $\mathbb{Z}_3$ (such as $\{e, r_1^+, r_1^-\}$). It also contains a subgroup ($\{e, s_A, s_B, s_C\}$)
isomorphic to the Klein group. The lattice diagram is as follows:

  $\mathrm{Isom}^+(\Delta_3)$
  $\{e, s_A, s_B, s_C\}$
  $\{e, r_1^{\pm}\}$   $\{e, r_2^{\pm}\}$   $\{e, r_3^{\pm}\}$   $\{e, r_4^{\pm}\}$
  $\{e, s_A\}$   $\{e, s_B\}$   $\{e, s_C\}$
  $\{e\}$

[Margin note: The British mathematician Arthur Cayley (1821–1895) was one
of the pioneers of group theory. At 17 he went to Trinity College,
Cambridge, where he excelled: during his undergraduate career he
published three papers in the Cambridge Mathematical Journal, and in
1842 came top in the examinations for the Mathematical Tripos, thereby
winning the title of Senior Wrangler, the Smith's Prize and a four-year
college fellowship.
Upon leaving Cambridge in 1846 he trained as a barrister at Lincoln's Inn
and practiced law for the next fourteen years, during which he also wrote
between two and three hundred mathematical papers, and formulated the
Cayley–Hamilton Theorem in linear algebra (that a matrix satisfies its own
characteristic equation). In 1863 he returned to Cambridge to become the
first Sadleirian Professor of Pure Mathematics, in the process willingly
giving up a well-paid legal career for a more modest academic salary, and
welcoming the opportunity to devote more of his time to mathematics. During
this phase of his career he continued his research on groups and the theory
of invariants, much of it in collaboration with his near-contemporary James
Sylvester (1814–1897). He also became a keen supporter of university
education for women, and was involved in the foundation and early
development of the University's two women's colleges, Girton and Newnham.]

The larger group $\mathrm{Isom}(\Delta_3) \cong S_4$ contains an even wider collection
of subgroups, including $\mathrm{Isom}^+(\Delta_3)$ and all its subgroups, three
subgroups of order 8, four of order 6, six more subgroups of order 4, and
six more subgroups of order 2.
So far, we've found isomorphic copies of several small groups contained
in symmetric groups, and it's tempting to ask whether this
is going to be true in general: given an arbitrary group $G$, can we
find a set $X$ such that the symmetric group $\mathrm{Sym}(X)$ has a subgroup
which is isomorphic to $G$? The answer is yes, as the following theorem
demonstrates.
Theorem 2.13 (Cayley's Theorem) Every group is isomorphic to a
permutation group. That is, a subgroup of $\mathrm{Sym}(X)$ for some (finite or
infinite) set $X$.

Proof For an element $g \in G$ we can define a function $\lambda_g : G \to G$ by
$\lambda_g(h) = g * h$ for any $h \in G$. In the special case where $g$ is the identity
$e$, the function $\lambda_e$ is just the identity map $\mathrm{id}_G : G \to G$. More generally,
$\lambda_g$ is a bijection $G \to G$, and therefore a permutation of $G$. To show
this, we need to check that it's both injective and surjective.
Injectivity follows from the left cancellation law in Proposition 1.15:
if $\lambda_g(h) = \lambda_g(k)$ for two elements $h, k \in G$, then this means that
$g * h = g * k$ and then (1.4) implies that $h = k$, so $\lambda_g$ is injective.
Surjectivity can be shown as follows. Given some element $h \in G$, we
want to find some element $k \in G$ such that $\lambda_g(k) = h$. But we can
always do this by setting $k = g^{-1} * h$, since $\lambda_g(g^{-1} * h) = g * g^{-1} * h =
h$ as required. So $\lambda_g : G \to G$ is bijective and thus a permutation of $G$.

The key idea of the rest of this proof is to use these permutations $\lambda_g$
to construct a subgroup of $\mathrm{Sym}(G)$ which happens to be isomorphic
to $G$ itself.
Let $S = \{\lambda_g : g \in G\}$ be the set of these left-multiplication
permutations. We now need to show that this subset $S \subseteq \mathrm{Sym}(G)$ is actually a
subgroup of $\mathrm{Sym}(G)$, and furthermore that it's isomorphic to $G$.
The group operation on $S$ is the induced operation from $\mathrm{Sym}(G)$,
which is just composition of permutations. So we need to show that $S$
is closed under composition, and also that $\lambda_g^{-1} \in S$ for all $g \in G$.
To check closure we need to examine how composition works in $S$ and
how this relates to multiplication in $G$. Given $g, h, k \in G$ we have
  $(\lambda_g \circ \lambda_h)(k) = \lambda_g(\lambda_h(k)) = \lambda_g(h * k) = g * h * k = \lambda_{g*h}(k)$.
So $\lambda_g \circ \lambda_h = \lambda_{g*h}$, and since, for any $g, h \in G$, the product $g * h$ is also
in $G$, it follows that $\lambda_{g*h}$ is in $S$. Hence $S$ is closed under composition.
Next we have to check the existence of inverses. As noted earlier,
the function $\lambda_e$ is the identity map $\mathrm{id}_G$, which is just the identity
permutation $\iota \in \mathrm{Sym}(G)$. So the inverse $\lambda_g^{-1}$ of $\lambda_g$ is the permutation
$\lambda_h$ such that $\lambda_h \circ \lambda_g = \lambda_e = \lambda_g \circ \lambda_h$. But we know from the previous
paragraph that $\lambda_g \circ \lambda_h = \lambda_{g*h}$, so what we're really looking for is
$h \in G$ such that $h * g = e$, which role is clearly served by $h = g^{-1}$, and
hence $\lambda_g^{-1} = \lambda_{g^{-1}}$.
So $S$ is closed under composition, and contains both an identity element
and a full set of inverses, and is therefore a subgroup of $\mathrm{Sym}(G)$.
All that's left is to show $S \cong G$, which requires us to find a bijection
$f : G \to S$ satisfying the structural condition in Definition 1.17.
An obvious candidate for this isomorphism $f$ is the map that takes an
element $g$ to its corresponding permutation $\lambda_g \in S$; that is, $f : g \mapsto \lambda_g$.
Then $f$ is injective, because if $f(g) = \lambda_g = \lambda_h = f(h)$ for some
$g, h \in G$, it follows that $\lambda_g(k) = g * k = h * k = \lambda_h(k)$ for every
element $k \in G$, and by the right cancellation law (1.5) it follows that
$g = h$ and hence $f$ is injective. The definition of $S$ immediately
confirms that $f$ is surjective: $\lambda_g \in S$ precisely when $g \in G$.
All that remains is to show that $f$ respects the group structure:
  $f(g) \circ f(h) = \lambda_g \circ \lambda_h = \lambda_{g*h} = f(g * h)$.
This completes the proof.
To make sure we understand this, let's try a concrete example.
Example 2.14 Let $G = \mathbb{Z}_3$. Cayley's Theorem says we can find a
permutation group isomorphic to $\mathbb{Z}_3 = \{0, 1, 2\}$, and by the construction
in the above proof we know that we're actually looking for a
subgroup of $\mathrm{Sym}(\mathbb{Z}_3) \cong S_3$.
The permutation $\lambda_0$ is obviously the identity permutation $\iota$, since
$\lambda_0(n) = 0 + n = n$.
The permutation $\lambda_1$ is defined by $\lambda_1(n) = 1 + n \pmod 3$, which can
be written as $\begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 0 \end{pmatrix}$, or $(0\ 1\ 2)$ in cycle notation.
The permutation $\lambda_2$ is defined by $\lambda_2(n) = 2 + n \pmod 3$, which can
be written as $\begin{pmatrix} 0 & 1 & 2 \\ 2 & 0 & 1 \end{pmatrix}$, or $(0\ 2\ 1)$ in cycle notation.
So $\mathbb{Z}_3 \cong \{\lambda_0, \lambda_1, \lambda_2\} = \{\iota, (0\ 1\ 2), (0\ 2\ 1)\} < S_3$.
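The construction in Example 2.14 can be reproduced mechanically. The following sketch assumes the additive group $\mathbb{Z}_n$ and stores each left-translation permutation as a tuple whose entry at index $m$ is the image of $m$; the names `left_translation` and `compose` are ours.

```python
def left_translation(g, n):
    """The permutation m -> (g + m) mod n of Z_n, stored as a tuple of images."""
    return tuple((g + m) % n for m in range(n))


def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


n = 3
lam = {g: left_translation(g, n) for g in range(n)}
print(lam)

# The map g -> lam[g] is injective and turns addition into composition
injective = len(set(lam.values())) == n
structure_preserved = all(
    compose(lam[g], lam[h]) == lam[(g + h) % n] for g in range(n) for h in range(n)
)
print(injective, structure_preserved)
```

In this representation `lam[1]` is the tuple `(1, 2, 0)`, which is the two-row form of the cycle $(0\ 1\ 2)$.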
This example concerns a finite group, but there's nothing in our proof
of Cayley's Theorem that relies on the groups $G$, $\mathrm{Sym}(G)$ or $S$ being
finite, and in fact the theorem works for infinite groups too:
Example 2.15 Let $G = \mathbb{Z}$, which is countably infinite. Then for any
integer $n \in \mathbb{Z}$, the function $\lambda_n : \mathbb{Z} \to \mathbb{Z}$ is defined by $\lambda_n(m) = n + m$
for all $m \in \mathbb{Z}$. This is clearly a bijection on $\mathbb{Z}$; it permutes the integers
by shifting everything along by $n$ places in the appropriate (positive
or negative) direction. The group $S = \{\lambda_n : n \in \mathbb{Z}\}$ is a subgroup of
the full permutation group $\mathrm{Sym}(\mathbb{Z})$, and the isomorphism $f : \mathbb{Z} \to S$
that we want is precisely the function defined by mapping an integer
$n$ to the "shift everything along by $n$ places" permutation $\lambda_n$.

Cayley's Theorem justifies the effort we spent studying permutations
in Section 1.4: every group can be regarded as a permutation group.

2.2 Cosets and Lagrange's Theorem

The reader will find no figures in this
work. The methods which I set forth
do not require either constructions or
geometrical or mechanical reasonings:
but only algebraic operations, subject
to a regular and uniform rule of procedure.
Joseph-Louis Lagrange (1736–1813),
preface to Mécanique Analytique (1788)

In the proof of Cayley's Theorem we introduced permutations
$\lambda_g : G \to G$ which we used to show that any group can be viewed as
a permutation group. These permutations are defined on the whole
of $G$, but since part of our motivation for introducing the concept of
a subgroup was to investigate the internal structure of groups, it's
natural to ask how the $\lambda_g$ behave when we restrict them to a subgroup.
The action on the trivial subgroup $\{e\}$ is obvious: $\lambda_g(e) = g * e = g$.
To see what happens to a larger subgroup, let's look at the dihedral
group $D_3$ and its rotation subgroup $R_3 = \{e, r, r^2\}$. We can read off
the image $\lambda_g(R_3) = \{g, g * r, g * r^2\}$ from the multiplication table: it's
just the part of the table where the row corresponding to the element
$g$ intersects with the columns corresponding to the elements of the
subgroup $R_3$. For example, $\lambda_{m_1}(R_3)$ is highlighted in Table 2.4.

        e    r    r^2  m_1  m_2  m_3
  e     e    r    r^2  m_1  m_2  m_3
  r     r    r^2  e    m_3  m_1  m_2
  r^2   r^2  e    r    m_2  m_3  m_1
  m_1   m_1  m_2  m_3  e    r    r^2
  m_2   m_2  m_3  m_1  r^2  e    r
  m_3   m_3  m_1  m_2  r    r^2  e
Table 2.4: The multiplication table for the dihedral group $D_3$ with the image
$\lambda_{m_1}(R_3) = \{m_1, m_2, m_3\}$ highlighted

If we do this for all of $D_3$ we get
  $\lambda_e(R_3) = \{e, r, r^2\} = R_3$,      $\lambda_{m_1}(R_3) = \{m_1, m_2, m_3\} = D_3 \setminus R_3$,
  $\lambda_r(R_3) = \{r, r^2, e\} = R_3$,      $\lambda_{m_2}(R_3) = \{m_2, m_3, m_1\} = D_3 \setminus R_3$,
  $\lambda_{r^2}(R_3) = \{r^2, e, r\} = R_3$,  $\lambda_{m_3}(R_3) = \{m_3, m_1, m_2\} = D_3 \setminus R_3$.
Immediately, we notice that (apart from some inconsequential reordering)
the image $\lambda_g(R_3)$ has one of two different forms. If $g \in R_3$ then
$\lambda_g(R_3) = R_3$. (This is exactly what we'd expect, since $R_3$, being a
subgroup, is closed under the induced multiplication operation.) If,
on the other hand, $g \notin R_3$ then $\lambda_g(R_3) = \{m_1, m_2, m_3\} = D_3 \setminus R_3$.
So the action of the various permutations on the subgroup $R_3$ neatly
partitions $D_3$ into two different subsets. These images $\lambda_g(R_3)$ look
like they might be a useful concept, so we give them a special name:
Definition 2.16 Let $G$ be a group, and suppose $H < G$ is a subgroup
of $G$. Then, for a given element $g \in G$, the left coset $gH$ is defined
to be the subset $\lambda_g(H) = \{g * h : h \in H\} \subseteq G$.

If we happen to be using additive notation for our group $G$, then we
will usually denote left cosets as $g + H$ rather than $gH$.
Recall that we could also have proved Cayley's Theorem by means of
the permutations $\rho_g : G \to G$ defined by $\rho_g(h) = h * g$. In the interests
of completeness and achirality, we introduce the following definition:
Definition 2.17 Let $G$ be a group, and suppose $H < G$ is a subgroup
of $G$. Then, for a given element $g \in G$, the right coset $Hg$ is defined
to be the subset $\rho_g(H) = \{h * g : h \in H\} \subseteq G$.

Again, if we're using additive notation for $G$, then we will typically
denote right cosets as $H + g$ rather than $Hg$.
We can easily read off right cosets from the multiplication table too.
Just as the left cosets corresponded to parts of a particular row of
the table, right cosets correspond to parts of a particular column (see
Table 2.5).

        e    r    r^2  m_1  m_2  m_3
  e     e    r    r^2  m_1  m_2  m_3
  r     r    r^2  e    m_3  m_1  m_2
  r^2   r^2  e    r    m_2  m_3  m_1
  m_1   m_1  m_2  m_3  e    r    r^2
  m_2   m_2  m_3  m_1  r^2  e    r
  m_3   m_3  m_1  m_2  r    r^2  e
Table 2.5: The multiplication table for the dihedral group $D_3$ with the image
$\rho_{m_1}(R_3) = \{m_1, m_2, m_3\}$ highlighted

The right cosets of $R_3$ in $D_3$ are
  $R_3 e = \rho_e(R_3) = \{e, r, r^2\} = R_3$,        $R_3 m_1 = \rho_{m_1}(R_3) = \{m_1, m_3, m_2\} = D_3 \setminus R_3$,
  $R_3 r = \rho_r(R_3) = \{r, r^2, e\} = R_3$,        $R_3 m_2 = \rho_{m_2}(R_3) = \{m_2, m_1, m_3\} = D_3 \setminus R_3$,
  $R_3 r^2 = \rho_{r^2}(R_3) = \{r^2, e, r\} = R_3$,  $R_3 m_3 = \rho_{m_3}(R_3) = \{m_3, m_2, m_1\} = D_3 \setminus R_3$.
Comparing this list of right cosets with the left cosets listed earlier, we
find that the two lists are the same. Or, more formally,
  $R_3 g = \rho_g(R_3) = \lambda_g(R_3) = g R_3$
for all $g \in D_3$. We might be tempted to conjecture that this is true in
general, and it clearly is in the case where $G$ is abelian, since
  $\rho_g(h) = h * g = g * h = \lambda_g(h)$
for all $g \in G$ and $h \in H < G$.
This isn't always true if $G$ is nonabelian, as the next example shows.

Example 2.18 Let $M = \{e, m_1\} < D_3$. Then
  $eM = m_1 M = M$,                $Me = M m_1 = M$,
  $rM = m_3 M = \{r, m_3\}$,       $Mr = M m_2 = \{r, m_2\}$,
  $r^2 M = m_2 M = \{r^2, m_2\}$,  $M r^2 = M m_3 = \{r^2, m_3\}$.
So, in particular,
  $rM = m_3 M \neq M m_3 = M r^2$ and $r^2 M = m_2 M \neq M m_2 = Mr$.
It's interesting to ask what special underlying property $R_3$ has, and
$M$ doesn't, that causes its left and right cosets to line up in this way.
We'll leave that question until the next chapter, where it will motivate
our study of normal subgroups and quotient groups. For the moment,
left and right cosets will keep us busy enough.
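The coset calculations for $D_3$ can be reproduced directly from the multiplication table. The sketch below is illustrative: it encodes the rows of the table as a dictionary (writing `r2` for $r^2$), and the helper names `mul`, `left_coset` and `right_coset` are ours.

```python
ELEMENTS = ["e", "r", "r2", "m1", "m2", "m3"]

# ROWS[a][j] is the product a * ELEMENTS[j], copied from the multiplication table for D3
ROWS = {
    "e":  ["e", "r", "r2", "m1", "m2", "m3"],
    "r":  ["r", "r2", "e", "m3", "m1", "m2"],
    "r2": ["r2", "e", "r", "m2", "m3", "m1"],
    "m1": ["m1", "m2", "m3", "e", "r", "r2"],
    "m2": ["m2", "m3", "m1", "r2", "e", "r"],
    "m3": ["m3", "m1", "m2", "r", "r2", "e"],
}


def mul(a, b):
    """Look up the product a * b in the table."""
    return ROWS[a][ELEMENTS.index(b)]


def left_coset(g, H):
    return frozenset(mul(g, h) for h in H)


def right_coset(H, g):
    return frozenset(mul(h, g) for h in H)


R3 = {"e", "r", "r2"}
M = {"e", "m1"}

# Left and right cosets of R3 agree for every g; those of M do not
print(all(left_coset(g, R3) == right_coset(R3, g) for g in ELEMENTS))
print(sorted(left_coset("r", M)), sorted(right_coset(M, "r")))
```

Reading a left coset means scanning along a row of the table, and reading a right coset means scanning down a column, exactly as described in the text.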
Let's look at a few more examples. We'll start with a couple of small
abelian groups we've met several times already.
Example 2.19 The Klein group $V_4 = \{e, a, b, c\}$ has three proper,
nontrivial subgroups
  $A = \{e, a\}$,  $B = \{e, b\}$,  $C = \{e, c\}$.
The group $V_4$ is abelian, so the left and right cosets coincide, and are
as follows:
  $eA = aA = \{e, a\} = Ae = Aa$,  $bA = cA = \{b, c\} = Ab = Ac$,
  $eB = bB = \{e, b\} = Be = Bb$,  $aB = cB = \{a, c\} = Ba = Bc$,
  $eC = cC = \{e, c\} = Ce = Cc$,  $aC = bC = \{a, b\} = Ca = Cb$.

      e  a  b  c
  e   e  a  b  c
  a   a  e  c  b
  b   b  c  e  a
  c   c  b  a  e
Table 2.6: Multiplication table for the Klein group $V_4$

Example 2.20 Consider the cosets of $H = \{0, 3\} < \mathbb{Z}_6$. This is an
abelian group, so the left and right cosets will coincide.
  $0 + H = 3 + H = \{0, 3\}$,  $1 + H = 4 + H = \{1, 4\}$,  $2 + H = 5 + H = \{2, 5\}$.
The subgroup $K = \{0, 2, 4\}$ has cosets
  $0 + K = 2 + K = 4 + K = \{0, 2, 4\}$,  $1 + K = 3 + K = 5 + K = \{1, 3, 5\}$.
The full group $\mathbb{Z}_6$ has just one coset: itself, since $n + \mathbb{Z}_6 = \mathbb{Z}_6$ for
all $n \in \mathbb{Z}_6$. Finally, the trivial subgroup $\{0\}$ has six cosets, since
$n + \{0\} = \{n\}$ for all $n \in \mathbb{Z}_6$.

      0  1  2  3  4  5
  0   0  1  2  3  4  5
  1   1  2  3  4  5  0
  2   2  3  4  5  0  1
  3   3  4  5  0  1  2
  4   4  5  0  1  2  3
  5   5  0  1  2  3  4
Table 2.7: Multiplication table for the cyclic group $\mathbb{Z}_6$
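The cosets in Example 2.20 are easy to enumerate by machine. Here is a minimal sketch assuming the additive group $\mathbb{Z}_n$; the function name `cosets` is ours.

```python
def cosets(H, n):
    """Distinct cosets g + H of a subgroup H of Z_n, each as a sorted list."""
    seen = []
    for g in range(n):
        coset = sorted((g + h) % n for h in H)
        if coset not in seen:  # g + H may repeat a coset already found
            seen.append(coset)
    return seen


print(cosets([0, 3], 6))
print(cosets([0, 2, 4], 6))
print(cosets(range(6), 6))
print(cosets([0], 6))
```

In each case the distinct cosets are pairwise disjoint and together cover all of $\mathbb{Z}_6$.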
Now let's look at a nonabelian group and see what happens.
Example 2.21 The quaternion group $Q_8$ (see Table 2.8) has four
proper, nontrivial subgroups, three of which are isomorphic to $\mathbb{Z}_4$
and one which is isomorphic to $\mathbb{Z}_2$:
  $A = \{\pm E, \pm I\}$,  $B = \{\pm E, \pm J\}$,  $C = \{\pm E, \pm K\}$,  $D = \{\pm E\}$.
The cosets of $A$ are as follows:
  $(E)A = (-E)A = (I)A = (-I)A = \{\pm E, \pm I\}$,
  $(J)A = (-J)A = (K)A = (-K)A = \{\pm J, \pm K\}$,
  $A(E) = A(-E) = A(I) = A(-I) = \{\pm E, \pm I\}$,
  $A(J) = A(-J) = A(K) = A(-K) = \{\pm J, \pm K\}$.
The cosets of $B$ and $C$ follow a similar pattern (check this). We see
from the above list (and the corresponding ones for $B$ and $C$) that
the left and right cosets of these three subgroups do coincide.
Looking at the cosets of $\{\pm E\}$ we find that
  $(E)D = (-E)D = \{\pm E\}$,  $(I)D = (-I)D = \{\pm I\}$,
  $(J)D = (-J)D = \{\pm J\}$,  $(K)D = (-K)D = \{\pm K\}$,
  $D(E) = D(-E) = \{\pm E\}$,  $D(I) = D(-I) = \{\pm I\}$,
  $D(J) = D(-J) = \{\pm J\}$,  $D(K) = D(-K) = \{\pm K\}$.
In this case as well, the corresponding left and right cosets coincide.

        E   I   J   K  -E  -I  -J  -K
   E    E   I   J   K  -E  -I  -J  -K
   I    I  -E   K  -J  -I   E  -K   J
   J    J  -K  -E   I  -J   K   E  -I
   K    K   J  -I  -E  -K  -J   I   E
  -E   -E  -I  -J  -K   E   I   J   K
  -I   -I   E  -K   J   I  -E   K  -J
  -J   -J   K   E  -I   J  -K  -E   I
  -K   -K  -J   I   E   K   J  -I  -E
Table 2.8: Multiplication table for the quaternion group $Q_8$
All the examples we've seen so far are of finite groups. These concepts
carry across to the case where the group under investigation is infinite.
Example 2.22 Consider the (infinite) cyclic subgroup $3\mathbb{Z} = \langle 3 \rangle$ of
the (infinite) cyclic group $\mathbb{Z}$. This has three distinct left cosets (which,
since $\mathbb{Z}$ is abelian, are each identical to the corresponding right coset):
  $0 + 3\mathbb{Z} = \{\ldots, -6, -3, 0, 3, 6, \ldots\} = 3\mathbb{Z} + 0$,
  $1 + 3\mathbb{Z} = \{\ldots, -5, -2, 1, 4, 7, \ldots\} = 3\mathbb{Z} + 1$,
  $2 + 3\mathbb{Z} = \{\ldots, -4, -1, 2, 5, 8, \ldots\} = 3\mathbb{Z} + 2$.

Example 2.23 Recall that the orthogonal group $O_3(\mathbb{R})$ consists of all
$3 \times 3$ real orthogonal matrices; that is, matrices $A$ such that $A^{-1} = A^T$.
These must have determinant $\pm 1$ since if $A^T = A^{-1}$ then $\det A^T =
\det(A^{-1}) = (\det A)^{-1}$. But also $\det A^T = \det A$ in general. By
putting these together we have $\det A = (\det A)^{-1}$, or $(\det A)^2 = 1$,
which implies that $\det A = \pm 1$.
The special orthogonal group $SO_3(\mathbb{R})$ is the subgroup of orthogonal
matrices which have determinant $+1$. It has two cosets, namely
$SO_3(\mathbb{R})$ itself and the complement $O_3(\mathbb{R}) \setminus SO_3(\mathbb{R})$, comprising all
orthogonal matrices with determinant $-1$.
For any special orthogonal matrix $A$, we see that
  $A\,SO_3(\mathbb{R}) = SO_3(\mathbb{R}) = SO_3(\mathbb{R})\,A$
since $SO_3(\mathbb{R})$ is closed under matrix multiplication.
If, on the other hand, $B$ is orthogonal with determinant $-1$, then $BA$
is also orthogonal, because $O_3(\mathbb{R})$ is itself a group and hence closed
under matrix multiplication. The product $BA$ also has determinant
$-1$ since $\det(BA) = (\det B)(\det A) = (-1) \cdot 1 = -1$. So
  $B\,SO_3(\mathbb{R}) = O_3(\mathbb{R}) \setminus SO_3(\mathbb{R}) = SO_3(\mathbb{R})\,B$.

Example 2.24 In Example 2.5 we met the circle subgroup $C < \mathbb{C}^\times$
of the multiplicative group of nonzero complex numbers. Now let's
look at its cosets. Complex multiplication is commutative, so we
know straight away that the left and right cosets will coincide.
An element of $C$ can be written in the form $e^{i\theta}$ where $0 \leq \theta < 2\pi$;
any other nonzero complex number can be written in polar form as
$r e^{i\phi}$ where $r > 0$ and $0 \leq \phi < 2\pi$.
Multiplying these together, we get $r e^{i(\theta + \phi)}$, which can be regarded
geometrically as a point on a circle of radius $r$ centred on the origin.
The subgroup $C$ is obviously closed, so $zC = C$ for any $z \in \mathbb{C}^\times$ where
$|z| = 1$. If $|z| = r \neq 1$ then the coset $zC$ will be the circle of radius $r$
centred on the origin. Thus $C$ has an uncountably infinite number of
cosets in $\mathbb{C}^\times$, each with an uncountably infinite number of elements.
These are shown in Figure 2.7.

[Figure 2.7: Cosets of $C < \mathbb{C}^\times$]

Example 2.25 The multiplicative group $\mathbb{R}^\times$ of nonzero real numbers
can also be regarded as a subgroup of $\mathbb{C}^\times$. For $z \in \mathbb{R}^\times < \mathbb{C}^\times$, the
coset $z\mathbb{R}^\times$ is just $\mathbb{R}^\times$ itself. If a nonzero complex number $z \notin \mathbb{R}^\times$
then the coset $z\mathbb{R}^\times = \{zx : x \in \mathbb{R}^\times\}$, the set of all nonzero real
multiples of the nonzero complex number $z$. Geometrically, this is
the straight line through the origin, passing through the point $z$, with
the origin itself deleted. Again, we have an uncountably infinite
number of cosets, each containing an uncountably infinite number of
elements. These cosets are shown in Figure 2.8.

[Figure 2.8: Cosets of $\mathbb{R}^\times < \mathbb{C}^\times$]

One thing all of these examples have in common is that the cosets
partition the group: their union is the whole group, and no two cosets
intersect, so the group neatly splits into a (possibly infinite) collection
of non-overlapping cosets. The next proposition confirms this is true
in general; more importantly it will help us prove Lagrange's Theorem,
an important result about the internal structure of finite groups.
Proposition 2.26 Let $H < G$ be a subgroup of a (possibly infinite) group
$G$, and let $g, k \in G$ be two arbitrary elements of $G$. Then the following
three statements are equivalent:
(i) $k \in Hg$,
(ii) $Hg = Hk$, and
(iii) $k * g^{-1} \in H$.

(There is a corresponding result for left cosets.)


Proof First we prove that (i) $\Rightarrow$ (ii). Suppose that $k \in Hg$. Then
there exists a unique $h \in H$ such that $h * g = k$. Multiplying on
the left by $h^{-1}$ gives $g = h^{-1} * k$. As usual, to prove $Hg = Hk$ we
need to show that $Hg \subseteq Hk$ and $Hk \subseteq Hg$. So, choose some element
$a \in Hg$. Then $a = b * g$ for some $b \in H$, and so $a = b * h^{-1} * k$, which
means that $a \in Hk$ (since $b * h^{-1} \in H$ as both $b, h^{-1} \in H$). Hence
$Hg \subseteq Hk$. Similarly, choose $c \in Hk$. Then $c = d * k$ for some $d \in H$,
so $c = d * h * g$, which means that $c \in Hg$ (because $d * h \in H$ as both
$d, h \in H$). Hence $Hk \subseteq Hg$ and therefore $Hg = Hk$.
The converse (ii) $\Rightarrow$ (i) holds as well: if $Hg = Hk$ then since $k \in Hk$,
it follows that $k \in Hg$ as well.
Next we show that (i) $\Rightarrow$ (iii). If $k \in Hg$ then, as noted above, there
is a unique $h \in H$ such that $h * g = k$. Multiplying on the right by $g^{-1}$
gives $k * g^{-1} = h \in H$ as required.
Finally, we show the converse (iii) $\Rightarrow$ (i). If $h = k * g^{-1} \in H$ then it
follows almost immediately that $k = h * g \in Hg$.
This proposition has a couple of important corollaries:
Corollary 2.27 Two right cosets $Hg$ and $Hk$ are either equal or disjoint.
Proof Suppose $a \in Hg \cap Hk$. Then, by Proposition 2.26, since $a \in Hg$,
it follows that $Ha = Hg$; also since $a \in Hk$ we have $Ha = Hk$ and
hence $Hg = Hk$. The alternative is that no such element $a$ exists, in
which case $Hg \cap Hk = \emptyset$.
Corollary 2.28 The right cosets of $H$ in $G$ partition $G$.
The next proposition tells us an important fact about the relative sizes
of the right cosets of a finite subgroup $H < G$.
Proposition 2.29 Let $H$ be a subgroup of a (possibly infinite) group $G$.
Then $|Hg| = |H|$ for any element $g \in G$.
Proof By the right cancellation law in Proposition 1.15, if $h_1 * g =
h_2 * g$ for some $h_1, h_2 \in H$ and $g \in G$, then $h_1 = h_2$. This means
that the function $f : H \to Hg$ given by $h \mapsto h * g$ is injective; it is
surjective by the definition of $Hg$, and is therefore a bijection, so
$|H| = |Hg|$ as required.
These last two facts, Corollary 2.28 and Proposition 2.29, give the
following elegant result.
Theorem 2.30 (Lagrange's Theorem) Let $H$ be a subgroup of a finite
group $G$. Then the order $|H|$ of $H$ divides the order $|G|$ of $G$.
We can use this fact, that the order of a subgroup $H$ is a factor of the
order of the whole group $G$, to devise some measure of the relative sizes
of $H$ and $G$, simply by looking at the quotient $|G|/|H|$. Lagrange's Theorem
tells us that this will always be a positive integer.

[Margin note: Joseph-Louis Lagrange (1736–1813), born Giuseppe Lodovico
Lagrangia in Turin, was an Italian and French mathematician and physicist
who made many important contributions to algebra, analysis, number theory
and classical mechanics. His Mécanique Analytique (1788) was the most
comprehensive treatment of classical mechanics since Newton's celebrated
Philosophiæ Naturalis Principia Mathematica (1687).
In 1766 he succeeded Leonhard Euler as director of mathematics at the
Prussian Academy of Sciences in Berlin and remained there for twenty years,
moving to Paris in 1786 at the invitation of Louis XVI. His first few years
in Paris were marked by a prolonged episode of depression and the political
instability of the Revolution; he survived both and was appointed to a chair
in analysis at the École Polytechnique in 1794, and a chair in mathematics at
the École Normale in 1795. Contemporary accounts indicate that he was
considerably less gifted at teaching than at research.
He was elected a Fellow of the Royal Society of London in 1806, appointed
a Grand Officier of the Légion d'Honneur, and a Comte de l'Empire by Napoleon
in 1808, who later awarded him the Grand Croix of the Ordre Impérial de la
Réunion a week before his death in 1813. He was granted the honour of a tomb
in the crypt of the Panthéon, and he is one of 72 eminent French scientists
commemorated on the Eiffel Tower.]
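The chain of results leading to Lagrange's Theorem (cosets partition the group, and all cosets have the same size) can be checked on a small nonabelian example. The sketch below is illustrative: it assumes $S_3$, with permutations stored as tuples of images, and a subgroup $H$ of order 2; the helper names are ours.

```python
from itertools import permutations


def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def right_coset(H, g):
    """The right coset Hg = {h * g : h in H}."""
    return frozenset(compose(h, g) for h in H)


G = list(permutations(range(3)))   # all six elements of S_3
H = [(0, 1, 2), (1, 0, 2)]         # identity and one transposition

cosets = {right_coset(H, g) for g in G}

same_size = all(len(c) == len(H) for c in cosets)          # Proposition 2.29
covers_G = set().union(*cosets) == set(G)                  # Corollary 2.28 (union)
disjoint = sum(len(c) for c in cosets) == len(G)           # Corollary 2.27
lagrange = len(G) == len(cosets) * len(H)                  # Theorem 2.30

print(len(cosets), same_size, covers_G, disjoint, lagrange)
```

The last line of the check is exactly the counting argument behind the theorem: $|G|$ equals the number of cosets multiplied by $|H|$.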

For example, the rotation subgroup $R_3 < D_3$ is clearly half the size of
$D_3$ itself, so this quotient $|D_3|/|R_3| = 2$. Similarly, $|S_n|/|A_n| = 2$ for
any $n \geq 2$. The subgroup $\langle 6 \rangle < \mathbb{Z}_{12}$ is obviously smaller relative to
the whole group $\mathbb{Z}_{12}$, and indeed the quotient $|\mathbb{Z}_{12}|/|\langle 6 \rangle| = 6$.
So, the larger this quotient is, the smaller the subgroup in question
is relative to the full group. We might wonder if this concept can be
extended to the case of infinite groups, and indeed it can. The quotient
$|G|/|H|$ is really just the number of distinct right cosets of $H$ in $G$, so
by looking at it this way we can formulate the following definition:
Definition 2.31 The index $|G : H|$ of a subgroup $H < G$ is the number
of right cosets of $H$ in $G$. If $G$ is finite then $|G : H| = |G|/|H|$.

The index of a subgroup is, in some sense, multiplicative:

Proposition 2.32 Suppose $H$ and $K$ are subgroups of some finite group
$G$, such that $K < H < G$. Then $|G : K| = |G : H| \cdot |H : K|$.

Proof Since $G$ is finite (and hence so are $H$ and $K$) we have
  $|G : K| = |G|/|K| = (|G|/|H|) \cdot (|H|/|K|) = |G : H| \cdot |H : K|$
as required.
We can extend this result to the case where $G$, $H$ and $K$ may be infinite:
as long as $|G : H|$, $|G : K|$ and $|H : K|$ are all finite, the multiplicativity
result still holds, although the proof is more involved and is left as an
exercise to the reader.
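The multiplicativity of the index can be illustrated by counting cosets directly rather than dividing orders. The sketch below assumes the chain $\langle 4 \rangle < \langle 2 \rangle < \mathbb{Z}_{12}$ in additive notation; the function name `index` is ours.

```python
def index(G, H, n):
    """Number of distinct cosets g + H with g in G, for subgroups H of G inside Z_n."""
    return len({frozenset((g + h) % n for h in H) for g in G})


n = 12
G = set(range(12))
H = {0, 2, 4, 6, 8, 10}   # the subgroup <2>
K = {0, 4, 8}             # the subgroup <4>

print(index(G, K, n), index(G, H, n), index(H, K, n))
print(index(G, K, n) == index(G, H, n) * index(H, K, n))
```

Counting cosets in this way follows Definition 2.31 literally, which is the form of the index that still makes sense when the groups involved are infinite.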
The following is an easy application of Lagrange's Theorem.
Corollary 2.33 A group $G$ of prime order $p = |G|$ has no proper
nontrivial subgroups.

Proof The order of any subgroup $H < G$ must divide $p = |G|$ by
Lagrange's Theorem. The only possibilities are $|H| = 1$ (in which case
$H$ is the trivial subgroup) or $|H| = p$ (in which case $H = G$).
With a little bit more effort, we can go further than this and demonstrate
that there is only one group (up to isomorphism) of a given
prime order $p$. First we need the following simple proposition, which
relates the order of an element to the order of the group.
Proposition 2.34 Let $G$ be a finite group. Then the order $|g|$ of any
element $g \in G$ divides the order $|G|$ of the whole group $G$.

Proof Suppose $|g| = n$. Then the powers $\{g^i : i \in \mathbb{Z}\}$ form a finite
subgroup of $G$; this is the cyclic subgroup $\langle g \rangle$ generated by $g$. But
$\langle g \rangle = \{g^i : 0 \leq i < n\}$ and so $|\langle g \rangle| = |g| = n$. So, by Lagrange's
Theorem, $|\langle g \rangle| = |g| = n$ must be a factor of $|G|$.
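Proposition 2.34 is easy to confirm numerically for a small group. Here is a minimal sketch assuming the additive group $\mathbb{Z}_{12}$, where the order of $g$ is the least $k \geq 1$ with $kg \equiv 0 \pmod{12}$; the function name `order` is ours.

```python
def order(g, n):
    """Order of g in the additive group Z_n: the least k >= 1 with k*g = 0 (mod n)."""
    k, x = 1, g % n
    while x != 0:
        x = (x + g) % n
        k += 1
    return k


orders = {g: order(g, 12) for g in range(12)}
print(orders)
print(all(12 % k == 0 for k in orders.values()))
```

Every order that appears is a divisor of 12, as the proposition requires.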
This enables us to prove the following extension of Corollary 2.33:
Proposition 2.35 Let $G$ be a finite group of prime order $p = |G|$. Then
$G \cong \mathbb{Z}_p$, the finite cyclic group of order $p$.

Proof Suppose $g \in G$ is some element of $G$ apart from the identity $e$.
Then $|g| > 1$ by Proposition 1.19. By Proposition 2.34 above, $|g|$ must
be a factor of $|G| = p$; since $p$ is prime and we already know $|g| \neq 1$,
then $|g| = p$. The cyclic subgroup $\langle g \rangle$ generated by $g$ must therefore
have $p$ elements, and be isomorphic to $\mathbb{Z}_p$. It must also be the full
group, hence $G \cong \mathbb{Z}_p$.
This is a particularly useful result from the point of view of compiling
a list of finite groups of small order, since it tells us exactly what
groups there are of a given prime order $p$: there's just one (up to
isomorphism) and it's $\mathbb{Z}_p$. In a little while we'll develop further
techniques to classify all groups of order 8 or less.
Lagrange's Theorem gives a very useful necessary condition on the
order of subgroups of a finite group. But we must be careful not to
read into it more than it actually says. For a start, it only applies
to finite groups, not infinite ones. Also, the condition in Lagrange's
Theorem is necessary but not sufficient: it says that the order of any
subgroup must be a factor of the order of the whole group, but it
doesn't guarantee that there will be a subgroup with order equal to a
given factor. The smallest counterexample is the alternating group $A_4$:
Example 2.36 By Lagrange's Theorem, all subgroups of the alternating
group $A_4$ must have order equal to a factor of $|A_4| = 12$; that is, if
$H < A_4$ then $|H| = 1$, 2, 3, 4, 6 or 12.
The trivial group $\{\iota\}$ consisting of the identity permutation $\iota$ is, of
course, the only subgroup of order 1.
There are three subgroups of order 2:
$$\{\iota, (1\,2)(3\,4)\}, \quad \{\iota, (1\,3)(2\,4)\}, \quad \{\iota, (1\,4)(2\,3)\},$$
each consisting of the identity and an element of order 2. (Individual
transpositions such as $(1\,2)$ or $(2\,3)$ have order 2, but they're odd
permutations and hence not elements of $A_4$.)
There are four subgroups of order 3, each generated by a 3-cycle:
$$\{\iota, (1\,2\,3), (1\,3\,2)\}, \quad \{\iota, (1\,2\,4), (1\,4\,2)\},$$
$$\{\iota, (1\,3\,4), (1\,4\,3)\}, \quad \{\iota, (2\,3\,4), (2\,4\,3)\}.$$
There is a single subgroup of order 4:
$$\{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$$
(This happens to be isomorphic to the Klein 4-group $V_4$; in a short
while we'll see that it couldn't be anything else.)
There are no other order-4 subgroups, because by Proposition 2.34,
such a subgroup must comprise elements of order 1, 2 or 4, and the
only remaining elements of $A_4$ are the eight 3-cycles, which have
order 3.
So far, we've found subgroups of order 1, 2, 3, 4 and ($A_4$ itself) 12.
The only remaining possibility, according to Lagrange's Theorem, is
a subgroup of order 6. Suppose $H < A_4$ is such a subgroup. If a
given 3-cycle belongs to $H$ then its inverse (also a 3-cycle) must as
well, so the number of 3-cycles in $H$ is even. There can't be six of
them, since that would leave no room for the identity permutation $\iota$.
Suppose that $H$ contains four 3-cycles: $\sigma$, $\sigma^{-1}$, $\tau$ and $\tau^{-1}$. Then
$\iota$, $\sigma$, $\sigma^{-1}$, $\tau$, $\tau^{-1}$, $\sigma\tau$ and $\sigma\tau^{-1}$ are all distinct elements of $H$. But there
are seven of them, which contradicts our hypothesis that $H$ has six
elements.
If $H$ contains only two 3-cycles, then the rest of $H$ must consist
of the remaining four elements of $A_4$, namely $\iota$, $(1\,2)(3\,4)$, $(1\,3)(2\,4)$
and $(1\,4)(2\,3)$. But we noted earlier that these four elements form a
subgroup of order 4. Lagrange's Theorem tells us that any subgroup
of $A_4$ contained in $H$ (and which must therefore itself be a subgroup
of $H$) must have order a factor of $|H| = 6$. But clearly 4 isn't a factor
of 6, so this 4-element group can't be a subgroup of $H$.
At this point we have run out of possibilities: if $H$ contains no
3-cycles at all, then there aren't enough other elements in $A_4$ to make
up a group of order 6, so our hypothesised subgroup $H$ can't exist,
and hence $A_4$ has no subgroups of order 6.
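The conclusion of Example 2.36 can also be confirmed by exhaustive search, since $A_4$ has only $2^{11}$ subsets containing the identity. This is a minimal sketch, not part of the argument above: permutations act on $\{0, 1, 2, 3\}$ rather than $\{1, 2, 3, 4\}$, and the helper names are our own.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Return the permutation 'p after q' on {0, ..., n-1}."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0


A4 = [p for p in permutations(range(4)) if is_even(p)]
identity = tuple(range(4))
others = [p for p in A4 if p != identity]


def is_subgroup(subset):
    """A finite nonempty subset closed under the operation is a subgroup."""
    return all(compose(a, b) in subset for a in subset for b in subset)


# Count the subgroups of each size by testing every subset containing the identity.
counts = {}
for size in range(1, len(A4) + 1):
    for rest in combinations(others, size - 1):
        candidate = frozenset(rest) | {identity}
        if is_subgroup(candidate):
            counts[size] = counts.get(size, 0) + 1

print(counts)
```

The dictionary maps each subgroup order to the number of subgroups of that order; the key 6 never appears.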

This example demonstrates an important general fact in mathematics:
just because an object satisfying certain properties might exist, it
doesn't mean that it actually does.
So, the converse of Lagrange's Theorem isn't true in general, which
might seem like a bit of a setback to our attempts to understand the
internal structure of groups. But all is not lost: while the full converse
doesn't hold in all cases, there are a number of other results which
provide a partial converse in certain cases. The Norwegian mathematician
Ludwig Sylow's three celebrated theorems, for example, are a
little sophisticated for this stage in the narrative, and we'll postpone
further discussion until Chapter 7, but the following useful theorem is
within our present means.
Theorem 2.37 (Cauchy's Theorem) If $p$ is a prime factor of the order of
a finite group $G$, then $G$ contains a nontrivial element (and hence a cyclic
subgroup) of order $p$.
This result, due to the French mathematician Augustin-Louis Cauchy
(1789–1857), says that although a subgroup whose order is an arbitrary
factor of $|G|$ might not exist, cyclic subgroups whose order is a prime
factor of $|G|$ always do. In Example 2.36, we
saw that while $A_4$ doesn't have a subgroup of order 6, it does have
subgroups of order 2 and 3, which are prime factors of $|A_4| = 12$.
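Cauchy's Theorem can be spot-checked on a small group. The sketch below assumes $G = S_3$ (order 6, prime factors 2 and 3) and, for each prime $p$, lists the $p$-tuples over $G$ whose product is the identity; the constant tuples $(g, \ldots, g)$ among them are exactly the solutions of $g^p = e$. The function name `tuples_with_trivial_product` is ours.

```python
from functools import reduce
from itertools import permutations, product


def compose(p, q):
    """Return the permutation 'p after q' on {0, ..., n-1}."""
    return tuple(p[q[i]] for i in range(len(q)))


G = list(permutations(range(3)))  # the symmetric group S_3, of order 6
e = tuple(range(3))


def tuples_with_trivial_product(p):
    """All p-tuples over G whose product is the identity."""
    return [t for t in product(G, repeat=p) if reduce(compose, t) == e]


for p in (2, 3):
    X = tuples_with_trivial_product(p)
    constant = [t for t in X if len(set(t)) == 1]
    elements_of_order_p = [t[0] for t in constant if t[0] != e]
    # |X| should equal |G|^(p-1), and the number of constant tuples should be
    # a multiple of p, so some non-identity g must satisfy g^p = e.
    print(p, len(X), len(constant), len(elements_of_order_p))
```

For a prime $p$, a non-identity element with $g^p = e$ necessarily has order exactly $p$, which is why the last count is the number of elements of order $p$.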
Proof Let
$$X = \{(g_1, \ldots, g_p) : g_1, \ldots, g_p \in G \text{ and } g_1 \cdots g_p = e\}$$
be the set of all ordered $p$-tuples of elements of $G$ whose product
is the identity $e$. What we're looking for is such a $p$-tuple where
$g_1 = \cdots = g_p \neq e$; that is, a $p$-tuple of the form $(g, \ldots, g)$ which isn't
$(e, \ldots, e)$ but which does multiply out to give the identity.
We want, first of all, to know how big $X$ is; in fact we claim that
$|X|$ is a multiple of $p$. For any $p$-tuple $(g_1, \ldots, g_p) \in X$ we know
that since $g_1 \cdots g_p = e$ it follows that $(g_1 \cdots g_{p-1}) = g_p^{-1}$. So
we effectively have a free choice for all of the first $(p-1)$ elements
$g_1, \ldots, g_{p-1}$ and then we just have to set $g_p = (g_1 \cdots g_{p-1})^{-1}$ to
ensure that $(g_1, \ldots, g_p) \in X$. So, to write down all the $p$-tuples in $X$,
we have $|G|$ choices for $g_1$, $|G|$ choices for $g_2$, and so on up to $g_{p-1}$.
Hence $|X| = |G|^{p-1}$. But we also know from the hypothesis that $p$ is
a prime factor of $|G|$, so it must also be a factor of $|X| = |G|^{p-1}$; in
other words, $|X|$ is, as claimed, a multiple of $p$.
Now, given a $p$-tuple $(g_1, \ldots, g_p) \in X$, we can obtain another (possibly
different) one by cyclically permuting the terms in the $p$-tuple:
$$(g_1, \ldots, g_p) \mapsto (g_2, \ldots, g_p, g_1).$$
This is still an element of $X$, since if $g_1 \cdots g_p = e$ then
$$g_2 \cdots g_p g_1 = g_1^{-1}(g_1 \cdots g_p)g_1 = g_1^{-1} e g_1 = g_1^{-1} g_1 = e.$$

Margin note: What's actually going on here is that we're defining an action
of the symmetric group $S_p$ on the set $X$: for each permutation $\sigma \in S_p$
we map
$$(g_1, \ldots, g_p) \mapsto (g_{\sigma(1)}, \ldots, g_{\sigma(p)}).$$
The cyclic permutation is obtained by using the full $p$-cycle
$(1\,2\,\ldots\,p) \in S_p$. We'll come back to these ideas in much more detail
in Chapter 6.

Now define an equivalence relation $\sim$ on the set $X$ as follows. Given
two $p$-tuples $x, y \in X$, we say that $x \sim y$ if $y$ can be obtained from $x$
by a finite number of cyclic permutations. For example,
$$(g_1, \ldots, g_p) \sim (g_4, g_5, \ldots, g_p, g_1, g_2, g_3)$$
because the second $p$-tuple can be obtained by cycling the first one 3
times.
This is an equivalence relation (check this yourself to make sure)
and hence partitions $X$ into a finite number of disjoint subsets: the
equivalence classes of the relation $\sim$.
The $p$-tuple $(e, \ldots, e)$ is unchanged by cyclic (or indeed any) permutation,
so it forms a one-element equivalence class on its own.
The equivalence classes partition $X$, so if we sum their respective sizes
we must get the number of elements in $X$. We know that at least
one of the other equivalence classes must have fewer than $p$ elements:
none of them can have more than $p$ elements since there are at most $p$
distinct cyclic permutations of a given $p$-tuple, and if all of them had
$p$ elements, the total number of $p$-tuples in $X$ would be of the form
$mp+1$ for some integer $m$; we know that it can't because we showed a
little earlier that $|X|$ has to be a multiple of $p$.
So there must be at least one equivalence class with fewer than $p$
elements.
If this class consists of just one $p$-tuple then we've found the order-$p$
element we're looking for: any such $p$-tuple must have $g_1 = \cdots = g_p$,
otherwise cyclic permutation would yield at least one other distinct
$p$-tuple; given such an element $(g, \ldots, g)$ we immediately have $g^p = e$
as required.
So, suppose we have an equivalence class comprising fewer than $p$ but
more than one distinct $p$-tuples. Then two of the $p$ cyclic shifts of a
tuple in this class must be equal, say the shifts by $r$ and by $s$ places:
$$(g_{r+1}, \ldots, g_p, g_1, \ldots, g_r) = (g_{s+1}, \ldots, g_p, g_1, \ldots, g_s)$$
Suppose, without loss of generality, that $r < s$ and then cycle back $r$
times (or forward $(p-r)$ times) to get
$$(g_1, \ldots, g_p) = (g_{k+1}, \ldots, g_p, g_1, \ldots, g_k)$$
where $k = s - r$. Then we find that $g_i = g_{k+i}$ for all $1 \leq i \leq p$
(with indices read modulo $p$) and so
$$g_1 = g_{k+1} = g_{2k+1 \pmod{p}} = \cdots = g_{(p-1)k+1 \pmod{p}}. \qquad (2.1)$$
Now suppose $ak + 1 \equiv bk + 1 \pmod{p}$ where $0 \leq a < b \leq p-1$. Then
$p$ divides $(b-a)k$. But this can't happen, because $p$ is prime and both
$(b-a)$ and $k$ are positive and strictly less than $p$.
So the numbers $1, k+1, 2k+1, \ldots, (p-1)k+1$ are all different modulo
$p$. Furthermore, there are $p$ of them, and so (modulo $p$) we have some
permutation of the numbers $1, \ldots, p$. Substituting these into (2.1) and
rearranging, we get
$$g_1 = g_2 = \cdots = g_p.$$
Hence our $p$-tuple $(g_1, \ldots, g_p) = (g_1, \ldots, g_1)$, and multiplying
everything out we get $g_1^p = e$ as required.

Portrait credit: Wikimedia Commons / Zéphirin Belliard (1798–1861) after Jean Roller (1798–1866)
Augustin-Louis Cauchy (1789–1857) made numerous contributions to many
areas of mathematics: during his career he published 789 papers and several
books on mechanics, real and complex analysis, number theory, algebra and
geometry, and has more concepts and theorems named after him than any
other mathematician.
After a short career as a civil engineer in Cherbourg, he returned to Paris in
1812 and undertook research in mathematics, publishing papers on geometry
and algebra. Over the next few years he was appointed to chairs at the École
Polytechnique, the Collège de France and the University of Paris, and elected
to the Académie des Sciences.
Following the July Revolution in 1830 he refused to swear allegiance to the
new king, Louis Philippe I, and was dismissed from his academic posts. He
subsequently became tutor to the exiled Henri d'Artois, nephew and heir to
the previous king Charles X. This was a disaster: Henri acquired a lifelong
hatred of mathematics and Cauchy was unable to do much research, although
he was granted the title of Baron in recognition of his service.
Returning to Paris in 1838, he continued his research but didn't regain his
academic posts until Louis Philippe was deposed ten years later. He remained
a professor at the University until his death at the age of 67. He is one of
72 eminent scientists whose names are inscribed on the Eiffel Tower.

We can use this theorem to help classify (that is, make a complete list
of) groups of order 4, 6 and 8. (Groups of orders 1, 2, 3, 5 and 7 have
already been classified: the trivial group is the only possible group of
order 1, and by Proposition 2.35 there is only a single group, up to
isomorphism, of a given prime order $p$, namely $\mathbb{Z}_p$.)
We start with groups of order 4:
Proposition 2.38 A group of order 4 must be isomorphic either to the
cyclic group $\mathbb{Z}_4$ or the Klein 4-group $V_4$.

Proof Let $G = \{e, a, b, c\}$ be a group with 4 elements. Proposition 2.34
tells us that the order of each of these elements must be a factor of
$|G| = 4$. The identity element $e$ has order 1, and by Proposition 1.19 is
the only element which does. So the remaining elements must have
order either 2 or 4. Cauchy's Theorem asserts the existence of at least
one element of order 2. If at least one of the other elements has order
4, say $a$, then that means $G$ has a cyclic subgroup $\{e, a, a^2, a^3\} \cong \mathbb{Z}_4$,
which accounts for all of the elements in $G$ and hence $G \cong \mathbb{Z}_4$. Here $a^2$
is the element of order 2 whose existence was guaranteed by Cauchy's
Theorem.
If $a$, $b$ and $c$ all have order 2 then the cancellation laws show that
the product of any two must be the third. For example, $a * b = b$
implies that $a = e$, which can't be true since we assumed $a$ has order 2;
similarly $a * b = a$ forces $b = e$ which also can't be true for the same
reason. Neither can $a * b = e$, because that would imply $a = b^{-1}$;
this can't be true since $a$ and $b$ both have order 2 and so $a = a^{-1}$ and
$b = b^{-1}$. The only remaining possibility is that $a * b = c$. This is the
Klein group $V_4$.
A similar argument enables us to classify groups of order 6:
Proposition 2.39 A group of order 6 must be isomorphic to either the
cyclic group $\mathbb{Z}_6$ or the dihedral group $D_3$.

Proof Let $G$ be a group with six elements. Cauchy's Theorem asserts
the existence of an element $g$ of order 2 and an element $h$ of order 3.
These are distinct from each other and the identity element $e$. Each of
them generates a cyclic subgroup:
$$\langle g \rangle = \{e, g\} \qquad \langle h \rangle = \{e, h, h^2\}$$
There are therefore six possible elements:
$$G = \{e, g, h, h^2, gh, gh^2\}$$
Now consider the element $hg$. This element does not belong to the
cyclic subgroup $\langle h \rangle$ and it's not equal to $g$. So we have two possibilities:
either $hg = gh$ or $hg = gh^2$.
If $hg = gh$ then this means that every element of the subgroup $\langle g \rangle$
commutes with every element of the subgroup $\langle h \rangle$; furthermore
$G = \langle g \rangle \langle h \rangle = \{xy : x \in \langle g \rangle, y \in \langle h \rangle\}$ and $\langle g \rangle \cap \langle h \rangle = \{e\}$. So by
Proposition 2.12 we know that $G \cong \langle g \rangle \times \langle h \rangle \cong \mathbb{Z}_2 \oplus \mathbb{Z}_3$. Since 2 and
3 are coprime, Proposition 1.32 tells us that $\mathbb{Z}_2 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_6$, and hence
$G \cong \mathbb{Z}_6$.
The other possibility is that $hg = gh^2$; in this case the function
$f : G \to D_3$ given by
$$g \mapsto m_1, \quad h \mapsto r, \quad gh \mapsto m_2, \quad gh^2 \mapsto m_3,$$
is an isomorphism (check this) $G \cong D_3$.
In Chapter 5 we will prove the more general result that if $p$ is an odd
prime then any group of order $2p$ is either dihedral or cyclic; that is,
isomorphic to $D_p$ or $\mathbb{Z}_{2p}$ (Proposition 5.29, page 155).
We can also, with a bit more effort, classify groups of order 8 up to
isomorphism:
Proposition 2.40 A group of order 8 must be isomorphic to one of the
groups $\mathbb{Z}_8$, $\mathbb{Z}_4 \oplus \mathbb{Z}_2$, $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$, the dihedral group $D_4$ or the
quaternion group $Q_8$.
The proof follows a similar but slightly more involved pattern to the
order-6 case.
Proof Let $G$ be a group with 8 elements. By Proposition 2.34, elements
of $G$ must be of order 1 (the identity $e$), 2, 4 or 8.
If $G$ has an element $g$ of order 8 then $G = \langle g \rangle \cong \mathbb{Z}_8$.
Suppose instead that $G$ has an element $g$ of order 4, but no higher-order
elements. Then $\langle g \rangle = \{e, g, g^2, g^3\}$. Let $h$ be some element of
$G \setminus \langle g \rangle$ and look at the coset $\langle g \rangle h = \{h, gh, g^2h, g^3h\}$. Hence
$G = \langle g \rangle \cup \langle g \rangle h = \{e, g, g^2, g^3, h, gh, g^2h, g^3h\}$.
Now look at the element $hg$. This isn't a power of $g$ (otherwise $h$ would
be one too), and is thus not in $\langle g \rangle$. Also, $hg \neq h$, since the cancellation
law (Proposition 1.15) would then imply $g = e$, which we've already decided
not to be the case. Furthermore, if $hg = g^2h$ then (by multiplying on the left
by $h^{-1}$) it follows that $g = h^{-1}g^2h$, which implies that
$$g^2 = h^{-1}g^2h\,h^{-1}g^2h = h^{-1}g^4h = h^{-1}eh = h^{-1}h = e.$$
But $g$ has order 4, so $g^2 \neq e$.
So we're left with two possibilities: either $hg = gh$ or $hg = g^3h$. Also,
we have two possibilities for the order of $h$: either 2 or 4. Notice that
$h^2 \in \langle g \rangle$: otherwise $h^2 = g^ih$ for some $i$, which would force
$h = g^i \in \langle g \rangle$. Furthermore, $h^2 \neq g$ since this would require
$|h| = 8$; for very similar reasons, $h^2 \neq g^3$. So if $h$ has order 2 then
$h^2 = e$, and if $h$ has order 4, the only possibility is that $h^2 = g^2$.
So if $G$ has an element of order 4, we have four possibilities:
(i) If $hg = gh$ and $|h| = 2$, then $G$ is abelian and the function
$f : G \to \mathbb{Z}_4 \oplus \mathbb{Z}_2$ given by $g \mapsto (1, 0)$ and $h \mapsto (0, 1)$ is an
isomorphism $G \cong \mathbb{Z}_4 \oplus \mathbb{Z}_2$.
(ii) If $hg = g^3h$ and $|h| = 2$, then the function $f : G \to D_4$ given by
$g \mapsto r$ and $h \mapsto m_1$ yields an isomorphism $G \cong D_4$.
(iii) If $hg = gh$ and $|h| = 4$, then $G$ is abelian. Also, $|gh^{-1}| = 2$
since $gh^{-1}gh^{-1} = g^2h^{-2} = g^2g^{-2} = e$. In this case, the function
$f : G \to \mathbb{Z}_4 \oplus \mathbb{Z}_2$ given by $g \mapsto (1, 0)$ and $gh^{-1} \mapsto (0, 1)$ provides
an isomorphism $G \cong \mathbb{Z}_4 \oplus \mathbb{Z}_2$.
(iv) If $hg = g^3h$ and $|h| = 4$, then the function $f : G \to Q_8$ given by
$g \mapsto I$ and $h \mapsto J$ gives an isomorphism $G \cong Q_8$.
That completes the case where $G$ has an element of order 4. All that
remains is to consider the case where all the non-identity elements
in $G$ have order 2. In this case, $G$ is abelian, by Proposition 1.20.
Choose distinct elements $g, h, k \in G \setminus \{e\}$ such that $gh \neq k$. The
subgroup $H = \{e, g, h, gh\}$ is isomorphic to the Klein 4-group
$V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$. Let $K = \{e, k\} = \langle k \rangle \cong \mathbb{Z}_2$. Then $HK = G$, the
intersection $H \cap K = \{e\}$, and every
element of $H$ commutes with both elements of $K$ (since $G$ is abelian).
Therefore, by Proposition 2.12 we have
$$G = HK \cong H \times K \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2.$$
There are no further cases to consider, so the proof is complete.
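The four cases (i) to (iv) can be made concrete by multiplying normal forms $g^a h^b$ mechanically. The following sketch assumes only the relations used in the proof, namely $g^4 = e$, $hg = g^c h$ with $c \in \{1, 3\}$, and $h^2 = g^{2t}$ with $t \in \{0, 1\}$; the encoding of elements as pairs $(a, b)$ and the names `make_group` and `order_profile` are our own.

```python
from collections import Counter
from itertools import product


def make_group(c, t):
    """Multiplication on words g^a h^b (0 <= a < 4, 0 <= b < 2), assuming
    the relations g^4 = e, hg = g^c h and h^2 = g^(2t)."""

    def mul(x, y):
        (a, b), (a2, b2) = x, y
        # Move h^b past g^a2 using hg = g^c h, then reduce h^2 to g^(2t).
        exponent = a + (c ** b) * a2 + (2 * t if b + b2 == 2 else 0)
        return (exponent % 4, (b + b2) % 2)

    return mul


def order_profile(mul):
    """Count how many elements have each order."""
    elements = list(product(range(4), range(2)))
    profile = Counter()
    for x in elements:
        power, n = x, 1
        while power != (0, 0):
            power = mul(power, x)
            n += 1
        profile[n] += 1
    return dict(sorted(profile.items()))


cases = {"(i)": (1, 0), "(ii)": (3, 0), "(iii)": (1, 1), "(iv)": (3, 1)}
for name, (c, t) in cases.items():
    print(name, order_profile(make_group(c, t)))
```

Cases (i) and (iii) give the same profile, consistent with both being $\mathbb{Z}_4 \oplus \mathbb{Z}_2$, while (ii) has five elements of order 2 and (iv) has only one, so $D_4$ and $Q_8$ are distinguished from each other and from the abelian case.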
In Section 7.A we will use similar techniques, as well as some more
sophisticated ones, to classify all groups of order less than 32.

2.3 Euler's Theorem and Fermat's Little Theorem

Mathematics is the queen of sciences and arithmetic the queen of
mathematics. She often condescends to render service to astronomy and
other natural sciences, but in all relations she is entitled to the first rank.
Carl Friedrich Gauss (1777–1855), quoted by Wolfgang Sartorius von
Waltershausen (1809–1876), in Gauss zum Gedächtnis (1856)

In this section, we examine two applications of Lagrange's Theorem
to number theory, and revisit Euler's totient function $\varphi$.
First we introduce some finite multiplicative groups.
Definition 2.41 Let $\mathbb{Z}_n^\times$ consist of those integers $0, \ldots, n-1$ which
form a group under modulo-$n$ multiplication. The identity element
is clearly 1, and all the other elements are those integers $0 \leq k \leq n-1$
for which there exists a modulo-$n$ multiplicative inverse; that is,
an integer $0 \leq h \leq n-1$ where $hk \equiv 1 \pmod{n}$. Clearly 0 is not
invertible, so $0 \notin \mathbb{Z}_n^\times$.
In fact, $\mathbb{Z}_n^\times$ consists of exactly those integers $0 \leq k \leq n-1$ for which
$\gcd(k, n) = 1$.

This is a special case of a construct we'll meet later on in the second
part of the book: the group of units of a ring (Proposition 8.34, page 338).
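Example 2.42 In the following list, $\mathbb{Z}_n^\times$ is the multiplicative group
described above, while $\mathbb{Z}_n$ denotes the additive group of integers
modulo $n$.
$\mathbb{Z}_2^\times = \{1\}$
$\mathbb{Z}_3^\times = \{1, 2\} \cong \mathbb{Z}_2$
$\mathbb{Z}_4^\times = \{1, 3\} \cong \mathbb{Z}_2$
$\mathbb{Z}_5^\times = \{1, 2, 3, 4\} \cong \mathbb{Z}_4$
$\mathbb{Z}_6^\times = \{1, 5\} \cong \mathbb{Z}_2$
$\mathbb{Z}_7^\times = \{1, 2, 3, 4, 5, 6\} \cong \mathbb{Z}_6$
$\mathbb{Z}_8^\times = \{1, 3, 5, 7\} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$
$\mathbb{Z}_9^\times = \{1, 2, 4, 5, 7, 8\} \cong \mathbb{Z}_6$
We can see that $\mathbb{Z}_5^\times$ is isomorphic to $\mathbb{Z}_4$ and not $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ by observing
that the element $2 \in \mathbb{Z}_5^\times$ has multiplicative order 4, since $2^2 = 4 \not\equiv 1$
$\pmod{5}$ but $2^4 = 16 \equiv 1 \pmod{5}$. On the other hand, in $\mathbb{Z}_8^\times$ we
find that $3^2 = 9 \equiv 1 \pmod{8}$, $5^2 = 25 \equiv 1 \pmod{8}$ and $7^2 = 49 \equiv 1$
$\pmod{8}$, so there are no elements of order 4, and hence $\mathbb{Z}_8^\times$ is
isomorphic to the Klein group $V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$. Similar arguments can
be applied to prove the other isomorphisms listed.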
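The list in Example 2.42 can be regenerated, and the cyclic cases picked out, by computing multiplicative orders. This is a minimal sketch; the function names `units`, `mult_order` and `is_cyclic` are ours, and the membership test uses the gcd criterion stated after Definition 2.41.

```python
from math import gcd


def units(n):
    """Elements of the multiplicative group of integers modulo n."""
    return [k for k in range(1, n) if gcd(k, n) == 1]


def mult_order(k, n):
    """Multiplicative order of k modulo n (k must be coprime to n)."""
    power, order = k % n, 1
    while power != 1:
        power = (power * k) % n
        order += 1
    return order


def is_cyclic(n):
    """True when some unit has order equal to the size of the group."""
    group = units(n)
    return any(mult_order(k, n) == len(group) for k in group)


for n in range(2, 10):
    print(n, units(n), "cyclic" if is_cyclic(n) else "not cyclic")
```

Among $n = 2, \ldots, 9$ only $n = 8$ is reported as not cyclic, matching the isomorphism with $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ noted in the example.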

The following lemma tells us useful information about the group $\mathbb{Z}_n^\times$:
Lemma 2.43 Let $k$ be an integer such that $0 < k < n$. Then
$k \in \mathbb{Z}_n^\times$ if and only if $\gcd(k, n) = 1$; that is, if $k$ and $n$ are coprime.

Proof If $k$ and $n$ are not coprime, then $\gcd(k, n) = d > 1$. So $d$ divides
$km$ for any $m \in \{0, \ldots, n-1\}$, and since $d$ also divides $n$ it divides the
remainder of $km$ modulo $n$; hence $km \neq 1$ in $\mathbb{Z}_n$. Thus $k$ has no
inverse modulo $n$, and is hence not an element of $\mathbb{Z}_n^\times$.
If, on the other hand, $k$ and $n$ are coprime, then by the argument
in Proposition 1.26 and the following paragraph, $k$ generates $\mathbb{Z}_n$.
Hence there exists $m \in \mathbb{Z}$ such that $mk = 1$ in $\mathbb{Z}_n$. Or, equivalently,
$mk = an + 1$ in $\mathbb{Z}$ for some $a \in \mathbb{Z}$. Now let $l = [m]_n$, the remainder or
residue of $m$ modulo $n$. Then $lk = bn + 1$ in $\mathbb{Z}$ for some $b \in \mathbb{Z}$, hence
$lk = 1$ in $\mathbb{Z}_n$ and so $k$ is an invertible element with inverse $l$, and is
thus in $\mathbb{Z}_n^\times$.
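The second half of the proof is constructive in spirit: an integer $m$ with $mk = an + 1$ can be found explicitly. The sketch below uses the extended Euclidean algorithm, which is a standard technique swapped in here rather than the argument of Proposition 1.26; the function names are our own.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y = d."""
    if b == 0:
        return a, 1, 0
    d, x, y = extended_gcd(b, a % b)
    return d, y, x - (a // b) * y


def inverse_mod(k, n):
    """Inverse of k modulo n, or None when gcd(k, n) != 1."""
    d, m, _ = extended_gcd(k, n)
    if d != 1:
        return None
    # Reduce m to its residue l = [m]_n, as in the proof of Lemma 2.43.
    return m % n


# 7 is coprime to 9 and so has an inverse; 6 is not, and has none.
print(inverse_mod(7, 9), inverse_mod(6, 9))
```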

This tells us that $\mathbb{Z}_n^\times$ consists of exactly those non-negative integers
$k \in \mathbb{Z}_n = \{0, \ldots, n-1\}$ where $\gcd(k, n) = 1$. Thus $|\mathbb{Z}_n^\times|$ is the number
of these coprime integers, for which concept we already have a name:
Definition 2.44 Euler's totient function $\varphi(n)$ is the number of integers
$k \in \{0, \ldots, n-1\}$ such that $\gcd(k, n) = 1$:
$$\varphi(n) = |\{k \in \mathbb{Z} : 0 \leq k < n \text{ and } \gcd(k, n) = 1\}|$$

We now have enough to state and prove Euler's Theorem:

Theorem 2.45 (Euler's Theorem) Suppose $k$ and $n$ are two coprime,
positive integers. Then
$$k^{\varphi(n)} \equiv 1 \pmod{n}$$
or, equivalently, $n \mid (k^{\varphi(n)} - 1)$.

Proof Lemma 2.43 tells us that $|\mathbb{Z}_n^\times| = \varphi(n)$ since $k \in \mathbb{Z}_n^\times$ if and only
if $\gcd(k, n) = 1$. Let $[k]_n$ denote the remainder of $k$ modulo $n$. Then
clearly $[k]_n \in \mathbb{Z}_n^\times$ and so, by Proposition 2.34, the order $|[k]_n|$ of $[k]_n$
in $\mathbb{Z}_n^\times$ must be a factor of $|\mathbb{Z}_n^\times| = \varphi(n)$. So there exists some positive
integer $l \in \mathbb{Z}$ such that $l\,|[k]_n| = \varphi(n)$. Hence
$$[k]_n^{\varphi(n)} = [k]_n^{\,l\,|[k]_n|} = \left([k]_n^{|[k]_n|}\right)^l = 1^l = 1.$$
So $k^{\varphi(n)} \equiv 1 \pmod{n}$.
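Euler's Theorem lends itself to a direct numerical check. The sketch below computes $\varphi(n)$ straight from Definition 2.44 (which is slow but transparent) and uses Python's built-in three-argument `pow` for modular exponentiation; the names `phi` and `euler_holds` are our own.

```python
from math import gcd


def phi(n):
    """Euler's totient: count 0 <= k < n with gcd(k, n) = 1."""
    return sum(1 for k in range(n) if gcd(k, n) == 1)


def euler_holds(n):
    """Check k^phi(n) = 1 (mod n) for every k in 1..n coprime to n."""
    t = phi(n)
    return all(pow(k, t, n) == 1 for k in range(1, n + 1) if gcd(k, n) == 1)


# Totient values for n = 2, ..., 10, then a check of the theorem for n < 200.
print([phi(n) for n in range(2, 11)])
print(all(euler_holds(n) for n in range(2, 200)))
```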
The following are some useful properties of $\varphi$; we will use the first of
these in a minute and the others in Section 11.A.

Proposition 2.46 Let $m, n, p \in \mathbb{N}$ and suppose $p$ is prime. Then:
(i) $\varphi(p) = p - 1$,
(ii) $\varphi(mn) = \varphi(m)\varphi(n)$ if $m$ and $n$ are coprime, and
(iii) $\varphi(p^n) = p^{n-1}(p - 1)$.

Proof (i) Recall that $\varphi(p)$ is the number of integers $k \in \{1, \ldots, p-1\}$
for which $\gcd(p, k) = 1$. But since $p$ is prime, all such integers $k$ have
this property, and so $\varphi(p) = |\{1, \ldots, p-1\}| = p - 1$.
(ii) By Proposition 1.32, we have $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \oplus \mathbb{Z}_n$. Let $G$ consist
of the elements of $\mathbb{Z}_m \oplus \mathbb{Z}_n$ that form a group under multiplication
modulo $(m, n)$. Then $\mathbb{Z}_{mn}^\times \cong G$, and we claim that $G = \mathbb{Z}_m^\times \times \mathbb{Z}_n^\times$. Let
$f : G \to \mathbb{Z}_m^\times \times \mathbb{Z}_n^\times$ by $f(a, b) = (a, b)$. Then $(a, b) \in G$ if and only if
there exists some $(c, d) \in G$ with $(a, b)(c, d) = (ac, bd) = (1, 1)$. But
this occurs exactly when $ac \equiv 1 \pmod{m}$ and $bd \equiv 1 \pmod{n}$, and
hence $G = \mathbb{Z}_m^\times \times \mathbb{Z}_n^\times$. Therefore
$$\varphi(mn) = |\mathbb{Z}_{mn}^\times| = |G| = |\mathbb{Z}_m^\times \times \mathbb{Z}_n^\times| = |\mathbb{Z}_m^\times|\,|\mathbb{Z}_n^\times| = \varphi(m)\varphi(n).$$
(iii) The only integers between 1 and $p^n$ that aren't coprime to $p^n$ are
the multiples of $p$, namely $p, 2p, 3p, \ldots, p^n$. There are exactly $p^{n-1}$ of
these, and hence $\varphi(p^n) = p^n - p^{n-1} = p^{n-1}(p - 1)$.
The next proposition is a clever application of Lagrange's Theorem.
Proposition 2.47 Suppose $n \geq 3$. Then $\varphi(n)$ is even.

Proof Let $G = \mathbb{Z}_n^\times$ and let $H = \{1, n-1\}$. It is simple to show that $H$
is a subgroup of $G$: it is closed under multiplication modulo $n$, with
the only non-immediate case being
$$(n-1)(n-1) = n^2 - 2n + 1 \equiv 1 \pmod{n}.$$
By Lagrange's Theorem, $|H| = 2$ must be a factor of $|G| = |\mathbb{Z}_n^\times| =
\varphi(n)$, and so $\varphi(n)$ is even. (If $n = 2$ this fails, because $n-1 = 1$ and
hence the group $H = \{1\}$ has only one element; indeed, we saw in
Example 2.42 that $|\mathbb{Z}_2^\times| = \varphi(2) = 1$.)

Portrait credit: Wikimedia Commons / François de Poilly (1623–1693)
The French lawyer and amateur mathematician Pierre de Fermat (1601–1665)
is chiefly remembered for his work in analytic geometry and number theory,
in particular his famous Last Theorem. This conjecture, which states that
there exist no positive integer solutions to the equation
$$x^n + y^n = z^n$$
for $n > 2$, remained unproven until 1994, when Andrew Wiles proved the
Taniyama–Shimura Conjecture on rational elliptic curves and modular forms,
which was known to imply Fermat's conjecture as a corollary.
The original statement of the conjecture occurs in Fermat's hand-annotated
copy of the treatise Arithmetica, written in the third century AD by the Greek
mathematician Diophantus of Alexandria. Fermat notes that he has discovered
a marvellous proof (demonstrationem mirabilem) of the insolubility of
the general case, but that the margin is too small to contain it (hanc marginis
exiguitas non caperet).
Born in Gascony, the son of a merchant named Dominique Fermat, he studied
law at the University of Orléans, and then embarked on a legal career.
In 1631, he was appointed as a councillor and judge at the Parlement de
Toulouse, thereby entitling him to add the honorific de to his surname.
Fluent in six languages (French, Latin, Greek, Occitan, Spanish and Italian), he
was also an accomplished writer and poet. He communicated most of his
mathematical discoveries informally, in letters to friends, usually omitting the
proofs, which were reconstructed by other mathematicians after his death.

Corollary 2.48 (Fermat's Little Theorem) Let $p \in \mathbb{N}$ be prime. Then
for any $k \in \mathbb{Z}$,
$$k^p \equiv k \pmod{p}$$
or, equivalently, $p \mid (k^p - k)$.

Proof If $p$ divides $k$ then both sides are congruent to 0 modulo $p$ and
there is nothing to prove. Otherwise $k$ and $p$ are coprime.
Proposition 2.46(i) tells us that $\varphi(p) = p-1$, and then by Euler's
Theorem we have
$$k^p = k(k^{p-1}) = k(k^{\varphi(p)}) \equiv k \pmod{p}$$
as required.
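The three parts of Proposition 2.46, together with Proposition 2.47 and Fermat's Little Theorem, can all be spot-checked numerically. This is a sketch over small ranges chosen arbitrarily by us, with $\varphi$ computed directly from its definition.

```python
from math import gcd


def phi(n):
    """Euler's totient: count 0 <= k < n with gcd(k, n) = 1."""
    return sum(1 for k in range(n) if gcd(k, n) == 1)


primes = [2, 3, 5, 7, 11, 13]

# Proposition 2.46(i) and (iii): phi(p^n) = p^(n-1) (p - 1).
prime_power_ok = all(
    phi(p ** n) == p ** (n - 1) * (p - 1) for p in primes for n in range(1, 4)
)

# Proposition 2.46(ii): phi is multiplicative on coprime arguments.
multiplicative_ok = all(
    phi(m * n) == phi(m) * phi(n)
    for m in range(1, 40)
    for n in range(1, 40)
    if gcd(m, n) == 1
)

# Proposition 2.47: phi(n) is even for n >= 3.
even_ok = all(phi(n) % 2 == 0 for n in range(3, 500))

# Corollary 2.48: k^p = k (mod p) for every integer k, including multiples of p.
fermat_ok = all(pow(k, p, p) == k % p for p in primes for k in range(-20, 21))

print(prime_power_ok, multiplicative_ok, even_ok, fermat_ok)
```

The coprimality condition in part (ii) matters: for example $\varphi(4) = 2$ whereas $\varphi(2)\varphi(2) = 1$.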

These two results illustrate the important connections between algebra


and number theory, and demonstrate how versatile algebraic tech-
niques can be when applied to other topics. We will meet a number of
other applications over the rest of the book.

Summary

In this chapter, we studied subgroups, smaller groups neatly embedded
inside larger ones. More precisely, a subgroup $H < G$ is a
subset $H \subseteq G$ which forms a group in its own right under the same
operation (or, strictly speaking, the induced operation $*_H$ formed by
restricting $*_G$ to $H \times H$) (Definition 2.1, page 40). In order for this to
happen, we require that $H$ must be closed under $*_H$ and that $H$ contain
inverses of all its elements. A subgroup will often inherit certain properties
from the larger group: in particular, subgroups of abelian groups are also
abelian (Proposition 2.2, page 41), and subgroups of cyclic groups are also
cyclic (Proposition 2.7, page 43).
We saw a number of examples of subgroups. Every nontrivial group
has at least two: itself and the trivial subgroup $\{e\}$ consisting of just
the identity element. The dihedral group $D_n$ contains the rotation
subgroup $R_n$, which is isomorphic to the cyclic group $\mathbb{Z}_n$. The
multiplicative group $\mathbb{R}^\times$ of nonzero real numbers (Example 2.4, page 41)
and the circle group $C = \{e^{i\theta} : 0 \leq \theta < 2\pi\} = U_1$ (Example 2.5,
page 42) both fit neatly inside the multiplicative
group $\mathbb{C}^\times$ of nonzero complex numbers. Most of the matrix groups
we met in Section 1.2 fit together in interesting ways, for example,
the special orthogonal group $SO_n(\mathbb{R})$ is a subgroup of the orthogonal
group $O_n(\mathbb{R})$, which in turn is a subgroup of the general linear group
$GL_n(\mathbb{R})$, as are the special linear group $SL_n(\mathbb{R})$ and the symplectic
group $Sp_{2n}(\mathbb{R})$ (Example 2.6, page 42). For any positive integer $n$, the
group $\langle n \rangle = n\mathbb{Z}$ is a subgroup of $\mathbb{Z}$ itself; more generally for any
element $g$ of a group $G$, the set $\langle g \rangle = \{g^i : i \in \mathbb{Z}\}$ is a subgroup of $G$,
called the cyclic subgroup generated by $g$ (Proposition 2.8, page 44). We can
often describe the relationship
between a group and its various subgroups, and their relationships
to each other, by means of a lattice diagram; in particular it turns out
that the intersection of any two subgroups will also be a subgroup
(Proposition 2.9, page 45).
Permutation groups, that is, subgroups of (finite or infinite) symmetric
groups, are particularly important: Cayley's Theorem tells us that any
group is isomorphic to a group of this type (Theorem 2.13, page 47). This is
true for finite groups like $\mathbb{Z}_3 \cong \{\iota, (1\,2\,3), (1\,3\,2)\}$ (Example 2.14,
page 48) and also for infinite groups like $\mathbb{Z}$ (Example 2.15, page 49).
In the proof of Cayley's Theorem we introduced a class of permutations
$\lambda_g : G \to G$ for all $g \in G$, and shortly afterwards we examined
the effect of these permutations on some subgroup $H$ of $G$. This led
quite naturally to the concept of a left coset (Definition 2.16, page 50)
$$gH = \lambda_g(H) = \{g * h : h \in H\}.$$
Analogously, we can define the right coset (Definition 2.17, page 50)
$$Hg = \rho_g(H) = \{h * g : h \in H\}$$
which, if $G$ is nonabelian, need not be the same as the corresponding
left coset.
We saw that two right cosets $Hg$ and $Hk$ must be either equal or
completely disjoint (Corollary 2.27, page 54), and that they must have the
same cardinality (Proposition 2.29, page 54); both of these statements are
true for left cosets as well. This means
that we can use a subgroup $H$ to neatly partition a group $G$ into
a collection of equal-sized subsets (Corollary 2.28, page 54); from this
observation we get
Lagrange's Theorem, which says that if $H$ is a subgroup of a finite
group $G$, then the number of elements in $H$ must be a factor of the
number of elements in $G$ (Theorem 2.30, page 54). The number of (left or
right) cosets of
$H$ in $G$ gives an indication of the relative size of $G$ and $H$, and is
given a special name: we call it the index of $H$ in $G$ and denote it
$|G : H|$ (Definition 2.31, page 55). The index is multiplicative, in the sense
that if $K < H < G$,
then $|G : K| = |G : H|\,|H : K|$ (Proposition 2.32, page 55).
Lagrange's Theorem yields an important necessary condition for a
subset to be a subgroup; for example we can see straight away that
$\mathbb{Z}_{10}$ can have no three-element subgroups since $3 \nmid 10$. It also tells us,
almost immediately, that a group of prime order (such as the cyclic
group $\mathbb{Z}_p$) can have no subgroups apart from itself and the trivial
subgroup $\{e\}$ (Corollary 2.33, page 55). With a little more thought, we can
show that the
order $|g|$ of an element $g \in G$ must divide the order $|G|$ of the group $G$,
since $g$ generates a cyclic subgroup $\langle g \rangle$ which must have a number of
elements equal to $|g|$ (Proposition 2.34, page 55). From here it's only a short
step to prove that
if $p$ is prime, then there can only be one group (up to isomorphism)
with $p$ elements, namely $\mathbb{Z}_p$ itself (Proposition 2.35, page 56).
The condition we get from Lagrange's Theorem is necessary but not
sufficient: it says nothing to guarantee the existence of subgroups of
a particular order, and indeed such a subgroup might not exist. For
example, we saw that although 6 is a factor of $12 = |A_4|$, it so happens
that $A_4$ has no subgroup of order 6 (Example 2.36, page 56).
We do, however, have a number of theorems which provide partial
converses to Lagrange's Theorem. The three Sylow Theorems are a
bit more advanced, and we'll discuss them at length in Chapter 7, but
Cauchy's Theorem is more straightforward. This states that if $p$ is a
prime factor of the order $|G|$ of a finite group $G$, then $G$ will contain a
subgroup of order $p$ (Theorem 2.37, page 57).
We used Lagrange's Theorem and Cauchy's Theorem to classify (that
is, write down a complete list of) groups with up to eight elements,
up to isomorphism.
There is one group of order 1: the trivial group $\{e\}$. There is one group
each of orders 2, 3, 5 and 7, namely the cyclic groups $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_5$ and
$\mathbb{Z}_7$ (Proposition 2.35, page 56). There are two groups of order 4: the cyclic
group $\mathbb{Z}_4$ and the
Klein 4-group $V_4$ (Proposition 2.38, page 59). There are two groups of order
6: the cyclic group
$\mathbb{Z}_6$ and the dihedral group $D_3$ (which happens to be isomorphic to
the symmetric group $S_3$) (Proposition 2.39, page 60). Finally, there are five
groups of order
8: the cyclic group $\mathbb{Z}_8$, together with the direct sums $\mathbb{Z}_4 \oplus \mathbb{Z}_2$ and
$\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$, the quaternion group $Q_8$ and the dihedral group $D_4$
(Proposition 2.40, page 61).
In Section 2.3 we looked at a few applications of Lagrange's Theorem to
number theory. First we investigated some finite multiplicative groups
of integers: $\mathbb{Z}_n^\times$ consists of those integers $1, \ldots, n-1$ which form a
group under modulo-$n$ multiplication (Definition 2.41, page 62; Example 2.42,
page 62). With a little thought, we
see that these are exactly those integers $0 < k < n$ that are coprime
to $n$, or equivalently for which $\gcd(k, n) = 1$ (Lemma 2.43, page 63). The
number of such
integers for a given $n$, that is, the number of elements in $\mathbb{Z}_n^\times$, is given
by Euler's totient function $\varphi(n)$ (Definition 2.44, page 63). Euler's Theorem
says that if $k$ and
$n$ are coprime then $k^{\varphi(n)} \equiv 1 \pmod{n}$, or equivalently that $n$ divides
$(k^{\varphi(n)} - 1)$ (Theorem 2.45, page 63).
The totient function $\phi(n)$ is quite complicated in general, and we'll
study it in a little more detail later, but we can already state a couple
of useful facts about it: if $p$ is prime, then $\phi(p) = p-1$ and more
generally $\phi(p^n) = p^{n-1}(p-1)$. And if $m$ and $n$ are coprime, then
$\phi(mn) = \phi(m)\phi(n)$ (Proposition 2.46, page 64). Using this together with Euler's Theorem we
obtain Fermat's Little Theorem, which states that for any prime $p$ and any integer $k$
we have $k^p \equiv k \pmod{p}$, or equivalently that $p$ divides $k^p - k$ (Theorem 2.48, page 64).
Secondly, if $n \geq 3$ then $\{1, n-1\}$ forms a 2-element subgroup of $\mathbb{Z}_n^{\times}$.
Lagrange's Theorem then tells us that $|\mathbb{Z}_n^{\times}|$ must have 2 as a factor,
which is equivalent to saying that $\phi(n)$ is even for $n \geq 3$ (Proposition 2.47, page 64).
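The facts about the totient function summarised above are easy to check numerically. The following is a minimal Python sketch rather than anything taken from the text: the helper names `units` and `phi` are our own, and `phi` simply counts units instead of using a cleverer formula, so it is only intended for small $n \geq 2$.

```python
from math import gcd

def units(n):
    """Integers 0 < k < n coprime to n (the elements of the unit group mod n)."""
    return [k for k in range(1, n) if gcd(k, n) == 1]

def phi(n):
    """Euler's totient function for n >= 2, computed by direct counting."""
    return len(units(n))

# Euler's Theorem: k^phi(n) is congruent to 1 modulo n whenever gcd(k, n) = 1.
euler_holds = all(pow(k, phi(n), n) == 1 for n in range(2, 200) for k in units(n))

# phi(p^n) = p^(n-1) (p - 1) for a prime p,
# and phi(mn) = phi(m) phi(n) for coprime m and n.
prime_power_ok = phi(3**4) == 3**3 * (3 - 1)
multiplicative_ok = phi(8 * 15) == phi(8) * phi(15)

# phi(n) is even for every n >= 3.
even_ok = all(phi(n) % 2 == 0 for n in range(3, 200))

print(euler_holds, prime_power_ok, multiplicative_ok, even_ok)
```

Each of the four flags should come out true; changing the ranges or the sample values is a quick way to experiment with other cases.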

References and further reading


Fermat's Last Theorem is one of the most celebrated problems in mathematics, its deceptively simple
statement nevertheless resisting all attempts at a solution for about three and a half centuries. There are
a number of readable accounts of its history and ultimate solution; probably the best are the following
books by Amir Aczel and Simon Singh.
A D Aczel, Fermat's Last Theorem, Penguin (1997)
S Singh, Fermat's Last Theorem, Fourth Estate (1997)

If you can find a copy, Singh's 1996 BBC Horizon documentary Fermat's Last Theorem is also worth
watching: at the time of writing, it is available to watch online via the BBC's website.
An interesting account of the history and subsequent development of Cauchy's Theorem can be found
in the following article:
M Meo, The mathematical life of Cauchy's group theorem, Historia Mathematica 31 (2004) 196–221
Cauchy's original exposition and proof ran to about nine pages at the end of a long article on
permutation groups published in 1845.
A L Cauchy, Mémoire sur les arrangements que l'on peut former avec des lettres données, et sur les
permutations et substitutions à l'aide desquelles on passe d'un arrangement à un autre, Exercices
d'analyse et de physique mathématique 3 (1845) 151–252
Meo also notes that Cauchy's proof contains a subtle error: at one point he assumes that the direct
product of two subgroups is also a subgroup. This is true if both subgroups are normal, as we will see
in Corollary 3.23 in the next chapter, but not in general.
An elegant and concise proof, just ten lines long, was published in 1959 by the American mathematician
James McKay (1928–2012):
J H McKay, Another proof of Cauchy's group theorem, American Mathematical Monthly 66.2 (1959) 119

Exercises
2.1 For each of the following groups $G$ and subsets $H \subseteq G$, say whether or not $H$ is a subgroup of $G$;
if not, explain briefly why not.
(a) $G = \mathbb{Z}$, $H = \mathbb{N}$.
(b) $G = \mathbb{Z}$, $H = \{0\}$.
(c) $G = \mathbb{Z}$, $H = \{-1, 0, 1\}$.
(d) $G = \mathbb{Z}$, $H = 7\mathbb{Z}$.
(e) $G = S_3$, $H = \{(1\,2\,3), (1\,3\,2)\}$.
(f) $G = S_3$, $H = \{\iota, (1\,2)\}$.
(g) $G = D_5$, $H = \{e, r^2\}$.
(h) $G = D_5$, $H = \{e, m_3\}$.
(i) $G = GL_2(\mathbb{R})$, $H = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} : a, b, c \in \mathbb{R},\ ac \neq 0 \right\}$.
(j) $G = GL_2(\mathbb{R})$, $H = GL_2(\mathbb{Q})$.
(k) $G = \mathbb{Z} \oplus \mathbb{Z}$, $H = \{(a, b) : a+b \text{ is even}\}$.

2.2 Repeat Example 2.14 with the cyclic group $\mathbb{Z}_4$ and the Klein group $V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
2.3 Let $m, n \in \mathbb{N}$ such that $n \geq 2$ and $0 \leq m \leq n$. Let $H$ be the subgroup of the symmetric group
$S_n$ consisting of those permutations which permute the first $m$ elements among themselves. Use
Lagrange's Theorem in a manner analogous to Proposition 2.47 to show that the binomial coefficient
$\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$ is an integer.
2.4 Find the orders of $\mathbb{Z}_n^{\times}$ for $n = 10, 12, 21, 60$. In the cases where $|\mathbb{Z}_n^{\times}| = 4$, determine whether $\mathbb{Z}_n^{\times}$ is
isomorphic to $\mathbb{Z}_4$ or the Klein group $V_4$.
2.5 Find the subgroups of $\mathbb{Z}_{14}$.
2.6 Let $p$ be prime, and show that $(p-1)$ is the only element of $\mathbb{Z}_p^{\times}$ of order 2. By considering the
product of all elements of $\mathbb{Z}_p^{\times}$, prove Wilson's Theorem: that $(p-1)! \equiv -1 \pmod{p}$ if and only if
$p$ is prime.
2.7 Let $H$ be a subgroup of a group $G$, and suppose that $g \in G$ is some arbitrary element. Show that
there exists some element $k \in G$ such that $gH = Hk$.
2.8 (a) Show that a group of even order has at least one element of order 2.
(b) Show that a group of odd order has no elements of order 2.
2.9 (a) If $G$ is abelian, show that $H = \{ g \in G : g^2 = e \}$ is a subgroup of $G$.
(b) Find an example of a nonabelian group $G$ such that $H$ is not a subgroup of $G$.
2.10 (a) If $G$ is abelian, show that $H = \{ g^2 : g \in G \}$ is a subgroup of $G$.
(b) Find an example of a nonabelian group $G$ for which $H$ is a subgroup.
(c) Find a nonabelian group $G$ for which $H$ is not a subgroup.
3 Normal subgroups

Up and after ordering some things towards my wife's going into the country, to the office, where I
spent the morning upon my measuring rules very pleasantly till noon, and then comes Creed and he
and I talked about mathematiques, and he tells me of a way found out by Mr. Jonas Moore which he
calls duodecimal arithmetique, which is properly applied to measuring, where all is ordered by
inches, which are 12 in a foot, which I have a mind to learn.
Samuel Pepys (1633–1703),
diary entry for Tuesday 9 June 1663
In the last chapter we introduced the concept of a coset of a subgroup
$H$ of a group $G$. In the case where $G$ is abelian, obviously
each left coset $gH$ is equal to the corresponding right coset $Hg$, but if $G$
is nonabelian this need not be the case, and often isn't. In this chapter
we will examine this topic further, along the way deriving a condition
on $H$ that forces the left cosets to coincide with their right coset twins.
Subgroups $H$ which have this property are called normal subgroups
and have an important rôle to play: given a normal subgroup $H$ of a
group $G$ we can construct the quotient group $G/H$. This method of
decomposing larger groups into products of smaller ones gives us new
techniques for understanding their internal structure. This is (very
loosely) analogous to the number-theoretic study of the multiplicative
properties of integers by factorising them into smaller ones, and just
as we then find a class of integers, namely the prime numbers, which
can't be factorised into smaller ones, we will also investigate the class
of simple groups: those which have no normal subgroups (apart from
themselves and the trivial subgroup) and hence can't be factorised
into smaller ones.

3.1 Cosets and conjugacy classes

The species and genus are always the work of nature; the variety often that of culture; and the
class and order are the work of nature and art.
Carl Linnaeus (1707–1778),
Philosophia Botanica (1751) 162

It is an interesting question to ask when the left and right cosets
of a subgroup $H \leq G$ coincide: is there some property that $H$ or $G$
might have that then forces $gH = Hg$ for any element $g \in G$?
An obvious sufficient condition is that $G$ (and hence, by Proposition 2.2,
$H$) be abelian, but we know already that this isn't a necessary condition:
at the beginning of Section 2.2 we saw that the rotation subgroup
$R_3 \leq D_3$ has this property, but $D_3$ isn't abelian, although $R_3 \cong \mathbb{Z}_3$ is.
Perhaps the subgroup $H$ has to be abelian in order to have this coincidence
property. This is a reasonable suggestion, but the next example
shows that it's wrong.


Example 3.1 The symmetric group $S_4$ contains the alternating group
$A_4$ as a subgroup. The left cosets of $A_4$ in $S_4$ are
\begin{align*}
A_4 &= \iota A_4 = (1\,2\,3)A_4 = (1\,3\,2)A_4 = (1\,2\,4)A_4 = (1\,4\,2)A_4 \\
    &= (1\,3\,4)A_4 = (1\,4\,3)A_4 = (2\,3\,4)A_4 = (2\,4\,3)A_4 \\
    &= (1\,2)(3\,4)A_4 = (1\,3)(2\,4)A_4 = (1\,4)(2\,3)A_4 \\
    &= \{\iota, (1\,2\,3), (1\,3\,2), (1\,2\,4), (1\,4\,2), (1\,3\,4), (1\,4\,3), (2\,3\,4), (2\,4\,3), \\
    &\qquad (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\} \\
S_4 \setminus A_4 &= (1\,2)A_4 = (1\,3)A_4 = (1\,4)A_4 = (2\,3)A_4 = (2\,4)A_4 = (3\,4)A_4 \\
    &= (1\,2\,3\,4)A_4 = (1\,2\,4\,3)A_4 = (1\,3\,2\,4)A_4 \\
    &= (1\,3\,4\,2)A_4 = (1\,4\,2\,3)A_4 = (1\,4\,3\,2)A_4 \\
    &= \{(1\,2), (1\,3), (1\,4), (2\,3), (2\,4), (3\,4), \\
    &\qquad (1\,2\,3\,4), (1\,2\,4\,3), (1\,3\,2\,4), (1\,3\,4\,2), (1\,4\,2\,3), (1\,4\,3\,2)\}
\end{align*}
while the right cosets of $A_4$ in $S_4$ are
\begin{align*}
A_4 &= A_4\iota = A_4(1\,2\,3) = A_4(1\,3\,2) = A_4(1\,2\,4) = A_4(1\,4\,2) \\
    &= A_4(1\,3\,4) = A_4(1\,4\,3) = A_4(2\,3\,4) = A_4(2\,4\,3) \\
    &= A_4(1\,2)(3\,4) = A_4(1\,3)(2\,4) = A_4(1\,4)(2\,3) \\
    &= \{\iota, (1\,2\,3), (1\,3\,2), (1\,2\,4), (1\,4\,2), (1\,3\,4), (1\,4\,3), (2\,3\,4), (2\,4\,3), \\
    &\qquad (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\} \\
S_4 \setminus A_4 &= A_4(1\,2) = A_4(1\,3) = A_4(1\,4) = A_4(2\,3) = A_4(2\,4) = A_4(3\,4) \\
    &= A_4(1\,2\,3\,4) = A_4(1\,2\,4\,3) = A_4(1\,3\,2\,4) \\
    &= A_4(1\,3\,4\,2) = A_4(1\,4\,2\,3) = A_4(1\,4\,3\,2) \\
    &= \{(1\,2), (1\,3), (1\,4), (2\,3), (2\,4), (3\,4), \\
    &\qquad (1\,2\,3\,4), (1\,2\,4\,3), (1\,3\,2\,4), (1\,3\,4\,2), (1\,4\,2\,3), (1\,4\,3\,2)\}
\end{align*}
These are clearly the same, although neither $A_4$ nor $S_4$ is abelian.

It turns out that the answer is a bit more complicated, and in order to
understand it, we need to introduce a new concept: that of conjugacy.
Impatient or alert readers may wonder why it matters whether the left
and right cosets are the same. This question will be answered in the
next section, where we'll find that subgroups having this property are
useful for understanding the internal structure of the larger group.
Definition 3.2 An element $g \in G$ is conjugate to an element $h \in G$
if there exists some element $k \in G$ such that $k * g * k^{-1} = h$. We say
also that $h$ is the result of conjugating $g$ with $k$.
This may seem like a strange criterion to impose on elements of a
group, but if you've studied a bit of linear algebra you should have
met this concept before, although probably under a different name.

Definition 3.3 Two $n \times n$ square matrices $A$ and $B$ are said to be
similar if there exists an invertible $n \times n$ matrix $P$ such that
$B = PAP^{-1}$.
This is exactly the same idea as conjugacy within a matrix group such
as $GL_n(\mathbb{R})$.
We'll stay with similar matrices for a little while, and think carefully
about what this means. An $n \times n$ matrix $A$ with elements in $\mathbb{R}$ can
be regarded as representing a linear map $f : \mathbb{R}^n \to \mathbb{R}^n$. But there are
many (uncountably infinitely many, in fact) other matrices which also
represent that same linear map, so what's going on in this context
is that $A$ represents $f$ relative to some choice of basis or coordinate
system for $\mathbb{R}^n$. The equation $B = PAP^{-1}$ says that both $A$ and $B$
represent the same linear map $f : \mathbb{R}^n \to \mathbb{R}^n$ relative to different bases
for $\mathbb{R}^n$, and $P$ is the change of basis matrix that translates from one
basis (or coordinate system) to the other.
More precisely, let $S = \{u_1, \ldots, u_n\}$ and $T = \{v_1, \ldots, v_n\}$ be two
bases for $\mathbb{R}^n$, let $A$ be the matrix representing $f$ relative to the basis $S$,
let $B$ be the matrix representing $f$ relative to the basis $T$, and let $P$ be
the change of basis matrix from $S$ to $T$.
The point of $P$ is that given any vector $w \in \mathbb{R}^n$, expressed as an
$n$-element column vector relative to the basis $S$, the vector $Pw$ will be
an $n$-element column vector representing the same vector in terms of
the basis $T$.
The equation $B = PAP^{-1}$, then, tells us how $A$ and $B$ are connected:
given a matrix $A$ representing $f$ relative to the basis $S$, in order to
work out the matrix representing $f$ relative to the basis $T$, we have
to first map from $T$ coordinates to $S$ coordinates (using the inverse
$P^{-1}$), apply $A$ (in $S$ coordinates) and then map back into $T$ coordinates
again (using $P$). This procedure gives $PAP^{-1}$, which must therefore
be equal to $B$.
We can regard this matrix $P$ as representing a change of coordinate
system, but any such matrix has to be invertible, and hence an element
of $GL_n(\mathbb{R})$. So another way of looking at all this is to say that we can
rewrite the element $B \in GL_n(\mathbb{R})$ as a product of three other elements of
$GL_n(\mathbb{R})$. In the case of $GL_n(\mathbb{R})$, the element $P$ has a nice interpretation
as a change of coordinate system, but this isn't always the case (even
in analogy) when we look at other groups.
For example, let's consider the dihedral group $D_3$ again. The element
$m_1$ has a geometric interpretation as a reflection in one of the axes
of symmetry of an equilateral triangle. But so does $m_2$, just through
one of the other axes. Thinking about it, we can get the same result
as applying $m_2$ by rotating the whole triangle by a third of a rotation,
doing $m_1$ and then rotating the triangle back again. So $m_1$ and $m_2$ are
conjugate to each other. On the other hand, $m_1$ and $r$ aren't conjugate
to each other, since there exists no $g \in D_3$ such that $m_1 = g * r * g^{-1}$.
Table 3.1 shows this more clearly. The group element at the intersection
of row $g$ with column $h$ is that obtained by conjugating $g$ with $h$, that
is, $h * g * h^{-1}$. Here we see that $e$ is conjugate only to itself, $r$ is conjugate to
$r^2$ and vice versa, and the reflections $m_1$, $m_2$ and $m_3$ are all conjugate
to each other.

\[
\begin{array}{c|cccccc}
    & e   & r   & r^2 & m_1 & m_2 & m_3 \\ \hline
e   & e   & e   & e   & e   & e   & e   \\
r   & r   & r   & r   & r^2 & r^2 & r^2 \\
r^2 & r^2 & r^2 & r^2 & r   & r   & r   \\
m_1 & m_1 & m_2 & m_3 & m_1 & m_3 & m_2 \\
m_2 & m_2 & m_3 & m_1 & m_3 & m_2 & m_1 \\
m_3 & m_3 & m_1 & m_2 & m_2 & m_1 & m_3
\end{array}
\]
Table 3.1: Conjugation in $D_3$

The subtables
\[
\begin{array}{c|c}
  & e \\ \hline
e & e
\end{array}
\qquad
\begin{array}{c|cc}
    & r   & r^2 \\ \hline
r   & r   & r   \\
r^2 & r^2 & r^2
\end{array}
\qquad
\begin{array}{c|ccc}
    & m_1 & m_2 & m_3 \\ \hline
m_1 & m_1 & m_3 & m_2 \\
m_2 & m_3 & m_2 & m_1 \\
m_3 & m_2 & m_1 & m_3
\end{array}
\]
are examples of another interesting algebraic structure: a quandle. These
objects (and a slightly more general structure called a rack) occur naturally
in the study of knot theory; the third, for example, represents the nontrivial
3-colouring of the trefoil knot.

The operation of conjugation (or, in fact, the relation 'is conjugate
to') neatly partitions $D_3$ into three subsets (which, with the exception
of $\{e\}$, are not subgroups). The obvious next question is: does this
happen in general? Does the relation 'is conjugate to' always partition
a group into non-overlapping subsets? The answer is yes, because of
the following fact:

Proposition 3.4 Conjugacy is an equivalence relation.

Proof To show that conjugacy is an equivalence relation we need to
show that it is reflexive, symmetric and transitive. Throughout, we write
$g \sim h$ to mean that $g$ is conjugate to $h$.
Conjugacy is reflexive, since for any element $g \in G$ we have
$e * g * e^{-1} = g$ and hence $g \sim g$.
Conjugacy is symmetric, since if $g \sim h$ there exists some $x \in G$ such
that $h = x * g * x^{-1}$. Multiplying both sides of this equation on the left
by $x^{-1}$ and on the right by $x$ yields
\[ x^{-1} * h * x = x^{-1} * x * g * x^{-1} * x = e * g * e = g. \]
So, since $x^{-1} \in G$ we have $g = (x^{-1}) * h * (x^{-1})^{-1}$ and hence $h \sim g$.
Finally, suppose that $g \sim h$ and $h \sim k$. So there exists $x \in G$ such
that $h = x * g * x^{-1}$, and there exists $y \in G$ such that $k = y * h * y^{-1}$.
Substituting the first of these into the second we get
\[ k = y * h * y^{-1} = y * x * g * x^{-1} * y^{-1} = (y * x) * g * (y * x)^{-1}. \]
So $g$ is conjugate to $k$ via $(y * x) \in G$ and hence $g \sim k$. Therefore
conjugacy is transitive.

We call the equivalence classes of this relation conjugacy classes.


Example 3.5 The conjugacy classes of $D_3$ are
$\{e\}$, $\{r, r^2\}$ and $\{m_1, m_2, m_3\}$.
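Conjugacy classes of a small group can be found by brute force. The sketch below is ours, not the author's: it models $D_3$ by the isomorphic group $S_3$ of permutations of $\{0, 1, 2\}$, stores each permutation as a tuple of images, and uses hypothetical helper names `compose`, `inverse` and `conjugacy_classes`.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group of permutations into its conjugacy classes."""
    group = list(group)
    return {
        frozenset(compose(compose(k, g), inverse(k)) for k in group)
        for g in group
    }

S3 = list(permutations(range(3)))
class_sizes = sorted(len(c) for c in conjugacy_classes(S3))
print(class_sizes)
```

The three classes found this way have sizes 1, 2 and 3, matching the identity, the two rotations and the three reflections. Because each class is built as the set of all conjugates of a single element, the fact that the classes do not overlap is exactly the content of Proposition 3.4.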
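Conjugacy in a matrix group is the same as similarity between matrices.
The next example characterises the conjugacy classes in $GL_n(\mathbb{R})$.

Example 3.6 We know from linear algebra that some, but not all,
matrices in $GL_n(\mathbb{R})$ are similar to (or, in our more general terminology,
conjugate to) a diagonal matrix. More generally, every matrix
$M \in GL_n(\mathbb{R})$ whose eigenvalues are all real is conjugate to one in
Jordan canonical form. (A matrix with non-real eigenvalues, such as a
rotation of the plane through a right angle, has no Jordan form with real
entries and needs separate treatment.) So, the conjugacy classes of such
matrices in $GL_n(\mathbb{R})$ are indexed by the various types of $n \times n$
Jordan matrices.
A Jordan block with eigenvalue $\lambda \in \mathbb{R}$ of degree $k$ is a $k \times k$ matrix
$J_{\lambda,k} = [a_{ij}]$ with diagonal elements $a_{ii} = \lambda$, the supradiagonal
elements $a_{i,i+1} = 1$ for $1 \leq i < k$, and all other elements $a_{ij} = 0$, that is,
whenever $j \neq i$ and $j \neq i+1$.
A matrix is in Jordan canonical form if it can be decomposed as a
block diagonal matrix where the diagonal blocks are Jordan blocks,
and all the off-diagonal blocks are zero.
These conjugacy classes are in bijective correspondence with the
different Jordan matrices (up to reordering of the blocks).
So, considering $GL_2(\mathbb{R})$ we have three different types:
\[
\operatorname{diag}(J_{\lambda,1}, J_{\lambda,1}) = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}, \qquad
\operatorname{diag}(J_{\lambda,1}, J_{\mu,1}) = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}, \qquad
J_{\lambda,2} = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}
\]
where $\lambda, \mu \in \mathbb{R}$ are nonzero and $\lambda \neq \mu$. Thus $GL_2(\mathbb{R})$ has uncountably many
conjugacy classes of this kind, but each is one of these three types. The group
$GL_3(\mathbb{R})$ has uncountably many conjugacy classes too, and those with real
eigenvalues may be divided into six different types. As before, the $3 \times 3$ identity matrix
$I_3$ is in a conjugacy class on its own.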

Example 3.7 The conjugacy classes of $\mathbb{Z}$ are singleton sets, each
consisting of a single integer:
\[ \ldots, \{-2\}, \{-1\}, \{0\}, \{1\}, \{2\}, \ldots \]
To see this, consider an arbitrary integer $n \in \mathbb{Z}$. Conjugating this
with any other integer $m \in \mathbb{Z}$ yields $m + n + (-m) = n$. So $n$ can
only be conjugate to itself.

This is true for any abelian group, as the following proposition shows.
Proposition 3.8 Let $G$ be an abelian group. Then for any element $g \in G$,
the conjugacy class of $g$ is $\{g\}$.

Proof Conjugating $g$ by another element $h \in G$ gives
\[ h * g * h^{-1} = h * h^{-1} * g = e * g = g. \]
So $g$ is only conjugate to itself, and thus its conjugacy class is $\{g\}$.

While we're on the subject of conjugation, it's worth noting that the
conjugation operation can be used to construct an isomorphism from
a group to itself:

Proposition 3.9 Let $g \in G$ be some fixed element of a group $G$. Then the
function $f_g : G \to G$, defined by $h \mapsto g * h * g^{-1}$, is an isomorphism from
$G$ to itself.

Proof This function $f_g$ is bijective: an inverse is given by the function
$f_g^{-1} = f_{g^{-1}} : G \to G$ which maps $h$ to $g^{-1} * h * g$.
Alternatively, to show that $f_g$ is injective, suppose that $f_g(h) = f_g(k)$
for some $h, k \in G$. Then $g * h * g^{-1} = g * k * g^{-1}$, and multiplying on
the left by $g^{-1}$ and on the right by $g$ we obtain $h = k$ as required. To
show surjectivity, consider an element $h \in G$. Then
\[ f_g(g^{-1} * h * g) = g * g^{-1} * h * g * g^{-1} = e * h * e = h. \]
It satisfies the structure condition too, since for any $h, k \in G$ we have
\[ f_g(h * k) = g * h * k * g^{-1} = g * h * e * k * g^{-1} = g * h * g^{-1} * g * k * g^{-1} = f_g(h) * f_g(k) \]
as required.
An isomorphism from a group to itself is called an automorphism,
and automorphisms of the form discussed in Proposition 3.9 are called
inner automorphisms.
We are now in a position to study precisely what conditions a subgroup
$H \leq G$ must satisfy in order to ensure that each of its left cosets is
equal to the corresponding right coset.
Looking again at the dihedral group $D_3$ we recall that the rotation
subgroup $R_3 = \{e, r, r^2\}$ has this property: $gR_3 = R_3g$ for any $g \in D_3$.
The subgroups $\{e, m_1\}$, $\{e, m_2\}$ and $\{e, m_3\}$ don't, however, although
the trivial subgroup $\{e\}$ and the full group $D_3$ do.
The clue lies in the observation that $D_3 = \{e\} \cup \{r, r^2\} \cup \{m_1, m_2, m_3\}$,
$R_3 = \{e\} \cup \{r, r^2\}$ and the trivial subgroup $\{e\}$ are all unions of
conjugacy classes of elements of $D_3$, while the reflection subgroups
$\{e, m_1\}$, $\{e, m_2\}$ and $\{e, m_3\}$ aren't.
Why does this matter? Well, if a subgroup $H < G$ is a union of
conjugacy classes (which not every subgroup need be) then it ensures
that for any element $h \in H$, all elements of the form $g * h * g^{-1}$ are
also in $H$. That is, $H$ is closed under conjugation by elements of $G$.
The following proposition confirms that we're on the right track:
Proposition 3.10 A subgroup $H < G$ is a union of conjugacy classes if
and only if $Hg = gH$ for all $g \in G$.

Proof Suppose $H$ is a union of conjugacy classes. Then for any $g \in G$
and $h \in H$ we know that $g * h * g^{-1}$ and $g^{-1} * h * g$ must belong to
$H$. The element $g * h$ clearly belongs to the left coset $gH$, but since
$g * h = (g * h * g^{-1}) * g$ it must also lie in the right coset $Hg$, and
so $gH \subseteq Hg$. Similarly, $h * g$ lies in the right coset $Hg$, but since
$h * g = g * (g^{-1} * h * g)$, it must also lie in the left coset $gH$, and so
$Hg \subseteq gH$. Therefore $Hg = gH$.
Now suppose that $gH = Hg$ for all $g \in G$. If $h \in H$ then its conjugate
$g * h * g^{-1}$ also belongs to $H$, since $g * h * g^{-1} \in gHg^{-1}$ and $(gH)g^{-1} =
(Hg)g^{-1} = H(g * g^{-1}) = H$. So $H$ is a union of conjugacy classes.

This proposition provides the answer we were looking for: the subgroups
whose corresponding left and right cosets $gH$ and $Hg$ coincide
are exactly those subgroups which happen to be unions of conjugacy
classes. In the case of an abelian group, we saw that the conjugacy
classes are just sets consisting of individual elements. So any subgroup
of an abelian group is a union of conjugacy classes, and by
Proposition 3.10 its left and right cosets must be the same. (In addition
to the more obvious reason that all the elements commute.)
Equivalently, the left and right cosets of $H$ in $G$ coincide if and only if
$g * h * g^{-1} \in H$ for all $g \in G$ and $h \in H$. We give a subgroup of this
type a special name:
Definition 3.11 A subgroup $H < G$ is said to be a normal subgroup
of $G$ or normal in $G$ if its left and right cosets coincide, in the sense
that $gH = Hg$ for all $g \in G$.
Equivalently, by Proposition 3.10, $H$ is normal in $G$ if it is a union of
conjugacy classes of elements of $G$. We denote normal subgroups by
$H \triangleleft G$ or $H \trianglelefteq G$.
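Both descriptions of normality can be tested directly on a small example. The following sketch is an illustration under our own conventions, not the author's code: it again models $D_3$ as the permutation group $S_3$ on $\{0, 1, 2\}$, and the function names `cosets_coincide` and `closed_under_conjugation` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cosets_coincide(G, H):
    """Check the coset condition gH = Hg for every g in G."""
    return all(
        {compose(g, h) for h in H} == {compose(h, g) for h in H}
        for g in G
    )

def closed_under_conjugation(G, H):
    """Check that g h g^{-1} lies in H for every g in G and h in H."""
    members = set(H)
    return all(
        compose(compose(g, h), inverse(g)) in members
        for g in G for h in H
    )

S3 = list(permutations(range(3)))
identity = (0, 1, 2)
rotations = [identity, (1, 2, 0), (2, 0, 1)]   # plays the role of R3
reflection = [identity, (1, 0, 2)]             # plays the role of {e, m1}

for H in (rotations, reflection, [identity], S3):
    print(len(H), cosets_coincide(S3, H), closed_under_conjugation(S3, H))
```

For each subgroup the two tests agree, as Proposition 3.10 says they must: the rotation subgroup, the trivial subgroup and the whole group pass both, while the two-element reflection subgroup fails both.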
In the examples earlier on, we noticed that $R_3 \triangleleft D_3$ and $A_4 \triangleleft S_4$. In
both of these cases, the subgroups have index 2 in the larger group:
$|D_3 : R_3| = |S_4 : A_4| = 2$. It is certainly not the case that a normal
subgroup must have index 2, since $3\mathbb{Z} \triangleleft \mathbb{Z}$ and $|\mathbb{Z} : 3\mathbb{Z}| = 3$. The
converse, however, is true in general, as the following argument shows:
Proposition 3.12 Let $H < G$ be an index-2 subgroup of $G$. Then $H$ is
normal in $G$.

Proof Suppose that $|G : H| = 2$. Then there are only two distinct left
cosets: $H$ itself, and the complement $G \setminus H$. Similarly, there are only
two distinct right cosets: $H$ and $G \setminus H$ again. Clearly, each left coset is
then equal to the corresponding right coset, and hence $H \triangleleft G$.

Example 3.13 For some geometric object $X$, the direct isometry
group $\operatorname{Isom}^{+}(X)$ is a normal subgroup of the full isometry group
$\operatorname{Isom}(X)$. This follows from the observation that
$|\operatorname{Isom}(X) : \operatorname{Isom}^{+}(X)| = 2$.

Example 3.14 $A_n \triangleleft S_n$ since $|S_n : A_n| = 2$ for any $n > 1$. This
generalises Example 3.1.

Example 3.15 $SL_n(\mathbb{R}) \triangleleft GL_n(\mathbb{R})$ since every conjugate of a matrix
with determinant 1 also has determinant 1, by the multiplicative
property of matrix determinants. Suppose $A \in SL_n(\mathbb{R})$ (so $\det A = 1$)
and that $B \in GL_n(\mathbb{R})$ with $\det B = d \in \mathbb{R}$, where $d \neq 0$. Then $\det(B^{-1}) = \frac{1}{d}$ and
so $\det(BAB^{-1}) = \det(B)\det(A)\det(B^{-1}) = d \cdot 1 \cdot \frac{1}{d} = 1$. Hence
$SL_n(\mathbb{R}) \triangleleft GL_n(\mathbb{R})$.
By a similar argument, $SO_n(\mathbb{R}) \triangleleft O_n(\mathbb{R})$ and $SU_n \triangleleft U_n$.

For the next two examples, we consider commutativity of individual
elements. Recall that two group elements $g, h \in G$ commute if $g * h =
h * g$. A group $G$ is abelian if every element commutes with every
other element. For any (abelian or nonabelian) group $G$, the identity
element $e$ commutes with every other element. It's often useful to
consider which elements commute with which other elements.
In the dihedral group $D_3$, for example, we see that every element of
the rotation subgroup $R_3$ commutes with every other element of $R_3$.
But $r * m_1 = m_3 \neq m_2 = m_1 * r$, so $r$ and $m_1$ don't commute with each
other. The identity element $e$ obviously commutes with everything,
but nothing else does.
In the dihedral group $D_4$, the $180^\circ$ rotation $r^2$ commutes with every
other element of the group, as does the identity element $e$, but nothing
else does. We generalise this to obtain the following definition:
Definition 3.16 The centre $Z(G)$ of a group $G$ is the subset of elements
which commute with every element of $G$. That is,
\[ Z(G) = \{ g \in G : g * h = h * g \text{ for all } h \in G \}. \]

The centre of a group isn't just a subset, it's actually a normal subgroup,
as the following proposition shows. (We'll make use of this fact later
on when we study quotient groups.)
Proposition 3.17 The centre $Z(G)$ of a group $G$ is a normal subgroup.

Proof First we show that $Z(G)$ is a subgroup of $G$. Clearly $e \in Z(G)$,
since $e * g = g * e$ for all $g \in G$. The centre is closed under the induced
operation since if $g, h \in Z(G)$, it follows that $g * h \in Z(G)$ too. That is,
\[ (gh)k = g(hk) = g(kh) = (gk)h = (kg)h = k(gh) \]
for any $k \in G$. The second and fourth of these equalities follow since
$g, h \in Z(G)$, and the rest are just consequences of associativity.
Also, if $g \in Z(G)$ then $g^{-1} \in Z(G)$, since for any $h \in G$ we have
\[ g^{-1} * h = (h^{-1} * g)^{-1} = (g * h^{-1})^{-1} = h * g^{-1}. \]
Hence $Z(G)$ is a subgroup of $G$. To show that it's a normal subgroup of
$G$ we need to confirm that for any $g \in Z(G)$ and $h \in G$, the conjugate
$h * g * h^{-1}$ is also in $Z(G)$. Let $g \in G$ and $h \in G$. Then $h * g = g * h$
if and only if $g = h * g * h^{-1}$, so $g \in Z(G)$ exactly when $g$ is conjugate
only to itself, that is, when its conjugacy class is the singleton set $\{g\}$.
Hence $Z(G)$ is a union of (singleton) conjugacy classes, and is thus a
normal subgroup of $G$.
This result is certainly consistent with what we know so far. If $G$ is
abelian then clearly $Z(G) = G$, since every element of $G$ commutes
with every other element of $G$. As noted earlier, the conjugacy classes
of an abelian group $G$ are the singleton sets $\{g\}$ for all $g \in G$. And the
proposition above tells us that $Z(G)$ is the union of all the singleton
conjugacy classes of $G$, which in this case is all of $G$.
Look at the conjugacy classes of $D_3$ again. These are $\{e\}$, $\{r, r^2\}$ and
$\{m_1, m_2, m_3\}$. The only one of these which contains only a single
element is $\{e\}$, so the centre $Z(D_3)$ is trivial, as noted earlier.
The conjugacy classes of $D_4$ are $\{e\}$, $\{r^2\}$, $\{r, r^3\}$ and two classes of
reflections, $\{m_1, m_3\}$ and $\{m_2, m_4\}$ (labelling the reflections so that $m_1$ and $m_3$
have perpendicular axes, as do $m_2$ and $m_4$), and hence $Z(D_4) = \{e\} \cup \{r^2\} = \{e, r^2\}$.
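The centre of $D_4$ and its conjugacy classes can be computed by machine. In the sketch below, which is our own illustration, $D_4$ is realised as the group of permutations of the vertices $0, 1, 2, 3$ of a square generated by a quarter turn and one diagonal reflection; these generators and the helper names `generate`, `centre` and `class_sizes` are assumptions made for the example, not notation from the text.

```python
def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Close a set of permutations under composition (finite groups only)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

r = (1, 2, 3, 0)   # quarter turn: vertex i goes to vertex i + 1 (mod 4)
m = (0, 3, 2, 1)   # reflection in the diagonal through vertices 0 and 2
D4 = generate([r, m])

# The centre: elements commuting with every element of the group.
centre = {g for g in D4 if all(compose(g, h) == compose(h, g) for h in D4)}

# The conjugacy classes, and the elements lying in singleton classes.
classes = {
    frozenset(compose(compose(k, g), inverse(k)) for k in D4) for g in D4
}
class_sizes = sorted(len(c) for c in classes)
singletons = {g for c in classes if len(c) == 1 for g in c}

print(sorted(centre), class_sizes, centre == singletons)
```

The centre consists of the identity and the half turn, the class sizes are 1, 1, 2, 2 and 2, and the elements in singleton classes are exactly the central ones, as the proof of Proposition 3.17 predicts.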
Going back to $D_3$ again, we noticed that the identity $e$ commuted with
everything, and that $r$ and $r^2$ commute with each other, but no other
element commutes with anything apart from itself and the identity.
That is, $e * g = g * e$ and $r * r^2 = r^2 * r$, but $g * h \neq h * g$ for any other
pair of distinct elements $g, h \in D_3$. Another way to look at this is to take $g * h = h * g$ and
rearrange it to give $g * h * g^{-1} * h^{-1} = e$. So $g$ and $h$ commute exactly
when $g * h * g^{-1} * h^{-1} = e$; in other words the product $g * h * g^{-1} * h^{-1}$
measures how badly $g$ and $h$ fail to commute.
Definition 3.18 For any two elements $g, h \in G$ of a group $G$, we
define the commutator $[g, h] := g * h * g^{-1} * h^{-1}$.

Table 3.2 lists all of the commutators in $D_3$: the cell on the $g$th row
and $h$th column contains the commutator $[g, h]$.

\[
\begin{array}{c|cccccc}
    & e & r & r^2 & m_1 & m_2 & m_3 \\ \hline
e   & e & e & e   & e   & e   & e   \\
r   & e & e & e   & r^2 & r^2 & r^2 \\
r^2 & e & e & e   & r   & r   & r   \\
m_1 & e & r & r^2 & e   & r^2 & r   \\
m_2 & e & r & r^2 & r   & e   & r^2 \\
m_3 & e & r & r^2 & r^2 & r   & e
\end{array}
\]
Table 3.2: Commutators in $D_3$

We can see that in $D_3$ the commutators $[g, h]$ are either the identity
$e$ (in which case the elements $g$ and $h$ commute) or a clockwise or
anticlockwise rotation.
This tells us that, as we already knew, not all elements of $D_3$ commute
with each other. But it also gives us some idea of how badly they fail to
commute; in other words, the commutators measure how nonabelian
the group $D_3$ is. In this case, they tell us that, at worst, two elements
of $D_3$ fail to commute by a rotation of some sort.
Another way of looking at this is to say that $D_3$ would be abelian if
we ignored rotations in some way. More precisely, if we defined some
sort of equivalence relation on $D_3$, where we considered two elements
to be equivalent if they differed only by a rotation, then we could form
a new, smaller group consisting of those equivalence classes. And if
we did so, that group would be abelian. We'll come back to this idea
in the next section when we study quotient groups.
Looking at the set of commutators in $D_3$, we find it's exactly equal to
the rotation subgroup $R_3 = \{e, r, r^2\}$. Is this a coincidence? Or is it
always the case that the commutators in a group $G$ form a subgroup?
As it happens, yes, in general it is slightly coincidental: the commutators
of a group $G$ don't, in general, form a subgroup of $G$.$^{1}$

$^{1}$ The smallest groups in which the set of commutators is not closed under the
induced multiplication operation are of order 96. More concretely, Joseph Rotman
implemented a computer search which found exactly two such nonisomorphic groups
of order 96. One is a semidirect product $(\mathbb{Z}_2 \times \mathbb{Z}_2 \times Q_8) \rtimes \mathbb{Z}_3$
and the other is a semidirect product $((\mathbb{Z}_2 \times \mathbb{Z}_4) \rtimes \mathbb{Z}_4) \rtimes \mathbb{Z}_3$.
We'll find out about semidirect products later.

We can fix this by slightly modifying our approach. The subset of
commutators doesn't always form a subgroup, since the product of two
commutators needn't itself be a commutator. However, if we consider
the subgroup generated by the commutators (that is, the subgroup
consisting of all possible commutators, their products, the products of
those products, and so on) then everything works fine.
Definition 3.19 Given a group $G$, let $[G, G]$ denote the commutator
subgroup of $G$, namely the subgroup generated by all possible commutators
$[g, h] = g * h * g^{-1} * h^{-1}$ for $g, h \in G$. This is sometimes
called the derived group of $G$. Some writers use the notation $G'$ or
$G^{(1)}$ instead of $[G, G]$.

We've seen that the commutator subgroup $[D_3, D_3]$ is equal to the
rotation subgroup $R_3 = \{e, r, r^2\}$. Is this true for other dihedral
groups? That is, does $[D_n, D_n] = R_n$ for all $n$? Table 3.3 lists the
commutators in $D_4$.

\[
\begin{array}{c|cccccccc}
    & e & r   & r^2 & r^3 & m_1 & m_2 & m_3 & m_4 \\ \hline
e   & e & e   & e   & e   & e   & e   & e   & e   \\
r   & e & e   & e   & e   & r^2 & r^2 & r^2 & r^2 \\
r^2 & e & e   & e   & e   & e   & e   & e   & e   \\
r^3 & e & e   & e   & e   & r^2 & r^2 & r^2 & r^2 \\
m_1 & e & r^2 & e   & r^2 & e   & r^2 & e   & r^2 \\
m_2 & e & r^2 & e   & r^2 & r^2 & e   & r^2 & e   \\
m_3 & e & r^2 & e   & r^2 & e   & r^2 & e   & r^2 \\
m_4 & e & r^2 & e   & r^2 & r^2 & e   & r^2 & e
\end{array}
\]
Table 3.3: Commutators in $D_4$ (with the reflections labelled so that $m_1$ and $m_3$ have
perpendicular axes, as do $m_2$ and $m_4$)

So $[D_4, D_4] = \{e, r^2\}$, which isn't the rotation subgroup $R_4 = \{e, r, r^2, r^3\}$.
One of the exercises at the end of this chapter asks you to take this
further and examine $[D_n, D_n]$ in greater generality.
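Commutator tables like these are tedious to compute by hand, so a small program is a useful cross-check. The sketch below is ours: it builds $D_n$ as a group of permutations of the vertices $0, \ldots, n-1$ of a regular $n$-gon, and the names `dihedral`, `rotations` and `commutators` are hypothetical helpers rather than anything defined in the text.

```python
def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Close a set of permutations under composition (finite groups only)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def dihedral(n):
    """The dihedral group of the regular n-gon (n >= 3), of order 2n."""
    r = tuple((i + 1) % n for i in range(n))   # rotation by one step
    m = tuple((-i) % n for i in range(n))      # reflection fixing vertex 0
    return generate([r, m])

def rotations(n):
    """The rotation subgroup of the dihedral group of the n-gon."""
    return {tuple((i + k) % n for i in range(n)) for k in range(n)}

def commutators(G):
    """The set of all commutators [g, h] = g h g^{-1} h^{-1} in G."""
    return {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in G for h in G
    }

for n in (3, 4):
    print(n, len(commutators(dihedral(n))), commutators(dihedral(n)) == rotations(n))
```

For $n = 3$ the commutators fill out the whole rotation subgroup, while for $n = 4$ there are only two of them, in line with the two tables. Running the loop over larger values of $n$ is a reasonable way to form a conjecture before attempting the exercise on $[D_n, D_n]$.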
Another important class of nonabelian groups we've met are the
symmetric groups $S_n$. The following proposition characterises their
commutator subgroups:
Proposition 3.20 The commutator subgroup $[S_n, S_n]$ is the alternating
group $A_n$ for $n \geq 3$.

Proof By the discussion following Definition 1.62, the commutator
$[\sigma, \tau]$ of any two permutations $\sigma, \tau \in S_n$ must be even, since $[\sigma, \tau] =
\sigma\tau\sigma^{-1}\tau^{-1}$ is a product of four permutations. If $\sigma$ and $\tau$ are both odd,
then $\sigma\tau$ and $\sigma^{-1}\tau^{-1}$ must both be even and hence $[\sigma, \tau]$ must also be
even; if $\sigma$ and $\tau$ are both even then $[\sigma, \tau]$ must also be even; and if
(without loss of generality) $\sigma$ is odd and $\tau$ is even, then both $\sigma\tau$ and
$\sigma^{-1}\tau^{-1}$ must be odd, and their product $[\sigma, \tau]$ must be even.
Proposition 1.64 tells us that any even permutation (and hence any
element of $A_n$) can be written as a product of 3-cycles. Any 3-cycle
$(a\,b\,c)$ can be decomposed as a product $(a\,b)(a\,c)(a\,b)(a\,c)$, which (since
transpositions are self-inverse) is equal to $(a\,b)(a\,c)(a\,b)^{-1}(a\,c)^{-1} =
[(a\,b), (a\,c)]$.
Hence all 3-cycles in $S_n$ are commutators, and since $A_n$ is generated
by the 3-cycles in $S_n$, it is generated by commutators. Furthermore,
there are no commutators in $S_n$ which are not in $A_n$, since as noted
above, all commutators have even parity. Therefore $[S_n, S_n] = A_n$.
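For small $n$ the proposition can be confirmed by exhaustive search. This is a sketch under our own conventions rather than part of the proof: permutations of $\{0, \ldots, n-1\}$ are stored as tuples of images, parity is read off by counting inversions, and `commutator_subgroup` and `alternating_group` are hypothetical helper names.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(elements, n):
    """The subgroup of S_n generated by the given permutations."""
    identity = tuple(range(n))
    subgroup, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in elements:
            h = compose(s, g)
            if h not in subgroup:
                subgroup.add(h)
                frontier.append(h)
    return subgroup

def commutator_subgroup(n):
    """The subgroup of S_n generated by all commutators g h g^{-1} h^{-1}."""
    Sn = list(permutations(range(n)))
    comms = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in Sn for h in Sn
    }
    return generated_subgroup(comms, n)

def alternating_group(n):
    """The even permutations of {0, ..., n-1}."""
    return {p for p in permutations(range(n)) if is_even(p)}

for n in (3, 4, 5):
    print(n, commutator_subgroup(n) == alternating_group(n))
```

The search is quadratic in $n!$, so it is only practical for very small $n$; it illustrates the statement but is no substitute for the argument above.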
The commutator subgroup $[G, G]$, like the centre $Z(G)$, provides a
measure of how abelian a group $G$ is. We saw earlier that $Z(G) = G$
exactly when $G$ is abelian, and clearly $[G, G]$ is trivial exactly when $G$
is abelian too. The centre turned out to be a normal subgroup of $G$, so
what about the commutator subgroup?
Proposition 3.21 The commutator subgroup $[G, G]$ of a group $G$ is normal
in $G$.
Proof By the remark preceding Definition 3.11, we just need to show
that the conjugate of a commutator (or a product of commutators) is
also a commutator (or a product of commutators).
Suppose that $[g, h] = g * h * g^{-1} * h^{-1} \in [G, G]$. Then if $k \in G$ we have
\begin{align*}
k[g, h]k^{-1} &= k(ghg^{-1}h^{-1})k^{-1} \\
              &= (kgk^{-1})(khk^{-1})(kg^{-1}k^{-1})(kh^{-1}k^{-1}) \\
              &= (kgk^{-1})(khk^{-1})(kgk^{-1})^{-1}(khk^{-1})^{-1} \\
              &= [kgk^{-1}, khk^{-1}] \in [G, G].
\end{align*}
So the conjugate of a commutator is also a commutator. But not
every element of $[G, G]$ need be a commutator, so we have to check
that the conjugate of a product of commutators is also a product of
commutators.
Suppose that $x_1, \ldots, x_n$ are commutators in $[G, G]$. Then $x_1 \cdots x_n$ is
also an element of $[G, G]$, and
\[ k x_1 \cdots x_n k^{-1} = (k x_1 k^{-1}) \cdots (k x_n k^{-1}) \in [G, G]. \]
So the conjugate of a product of commutators is also a product of
commutators, and hence $[G, G]$ is a normal subgroup of $G$.
We will return to the commutator subgroup in the next section, where
we will use it to form the abelianisation of a group $G$. But now we will
use the commutator to prove a useful fact about normal subgroups.
Proposition 3.22 Let $H$ and $K$ be normal subgroups of a group $G$ such
that $H \cap K = \{e\}$. Then the elements of $H$ commute with the elements of
$K$; that is, $hk = kh$ for any $h \in H$ and $k \in K$.

Proof Let $h \in H$ and $k \in K$ be arbitrary elements of the respective
subgroups, and consider the commutator $[h, k] = hkh^{-1}k^{-1}$. Writing
this as $h(kh^{-1}k^{-1})$, the parenthesised expression must lie in $H$ because
$H$ is normal and hence closed under conjugation, and so $[h, k] \in H$.
Similarly, writing $[h, k]$ as $(hkh^{-1})k^{-1}$ we find that $[h, k] \in K$. Thus
$[h, k] \in H \cap K$, and therefore $[h, k] = hkh^{-1}k^{-1} = e$, which gives $hk =
kh$ as claimed.
This gives us a useful variation on Proposition 2.12, concerning the
internal direct product of $H$ and $K$:
Corollary 3.23 Let $H$ and $K$ be normal subgroups of a group $G$ such that
$H \cap K = \{e\}$ and $HK = G$. Then $G \cong H \times K$.
Proof By Proposition 3.22 every element of $H$ commutes with every
element of $K$, so all the hypotheses for Proposition 2.12 are satisfied,
and hence $G \cong H \times K$.

3.2 Quotient groups

All parts should go together without forcing. You must remember that the parts you are
reassembling were disassembled by you. Therefore, if you can't get them together again, there must
be a reason. By all means, do not use a hammer.
IBM maintenance manual (c.1925)

As we saw in Corollary 2.28, given any subgroup $H < G$, we can
neatly split $G$ into a collection of subsets (called cosets), all of which
are the same size. There are two different ways of doing this, depending
on whether we look at the left or right cosets; in the last section
we examined the circumstances in which these two ways might give
the same result. In this section we investigate how these pieces fit
together.
We start by going back to some of the earliest examples we considered.
One way of defining the finite cyclic group $\mathbb{Z}_n$ is by taking the group
$\mathbb{Z}$ of integers and defining the "congruence modulo $n$" equivalence
relation $\equiv_n$ on it: $a \equiv_n b$ if $n \mid (b - a)$. We represented this schematically
as a kind of clock face, but we can also think of it as wrapping the
infinitely-long line of integers round in a circle so that integers with
the same residue or remainder modulo $n$ all line up: see Figure 3.1.

Figure 3.1: Wrapping $\mathbb{Z}$ into $\mathbb{Z}_{12}$.

More formally, the equivalence relation $\equiv_n$ partitions $\mathbb{Z}$ into $n$ equivalence
classes $[0], \ldots, [n-1]$. An element of $[a]$ plus an element of
$[b]$ lies in $[a+b]$, so we can take the set $\{[0], \ldots, [n-1]\}$ and define a
new addition operation on this set such that $[a] + [b] = [a+b]$. This
operation is obviously commutative and associative; furthermore the
class $[0]$ acts as an identity since $[0] + [a] = [a] + [0] = [a]$, and
for each class $[a]$, the class $[n-a]$ acts as an additive inverse since
$[a] + [n-a] = [a+n-a] = [n] = [0]$. So these equivalence classes form
a group, which is isomorphic to $\mathbb{Z}_n$.
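The claim that addition of classes does not depend on the chosen representatives can be spot-checked numerically. The following is a minimal sketch, assuming $n = 12$ as in Figure 3.1; the helper name `cls` and the range of representatives tested are illustrative choices, not part of the text.

```python
n = 12  # modulus, chosen to match the clock face in Figure 3.1

def cls(a):
    """Return the canonical representative in {0, ..., n-1} of the class [a]."""
    return a % n

# Take many different representatives (including negative ones) and check that
# the class of a + b depends only on the classes of a and b.
reps = range(-3 * n, 3 * n)
well_defined = all(
    cls(a + b) == cls(cls(a) + cls(b))
    for a in reps
    for b in reps
)

# Check that [n - a] is an additive inverse of [a] for every class.
inverses_ok = all(cls(a + (n - a)) == 0 for a in range(n))

print(well_defined, inverses_ok)
```

A finite check like this is of course not a proof, but it illustrates what "well-defined" means for the operation $[a] + [b] = [a+b]$.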

Look again at Figure 3.1: all of the integers
$$\ldots, -2n, -n, 0, n, 2n, \ldots$$
are congruent to 0 and hence
$$[0] = \{\ldots, -2n, -n, 0, n, 2n, \ldots\} = n\mathbb{Z}.$$
Similarly,
$$[1] = \{\ldots, 1-2n, 1-n, 1, 1+n, 1+2n, \ldots\}$$
is the coset $1 + n\mathbb{Z}$ in $\mathbb{Z}$, and in fact $[a] = a + n\mathbb{Z}$ for all $0 \leq a < n$.
So it's starting to look like we can construct $\mathbb{Z}_n$ by taking the cosets of
$n\mathbb{Z}$ in $\mathbb{Z}$ and defining a suitable binary operation on them.
What we've done here is to take $\mathbb{Z}$ and factor by (or mod out by)
congruence modulo $n$, considering two elements of $\mathbb{Z}$ to be the same
if they have the same remainder when divided by $n$. In effect, we're
taking all of the integers which are congruent to each other, gluing
them all together into $n$ large clumps labelled $[0], \ldots, [n-1]$, and then
using these clumps as the elements of a new group, which turns out
to be isomorphic to $\mathbb{Z}_n$.
Some questions arise at this point. Firstly, we did this with the sub-
group nZ < Z. Will any subgroup work? Is the new operation,
defined on the cosets of nZ, well-defined? And how do we generalise
this construction?
Let's look at another example. The dihedral group $D_3$, the symmetry
group of the equilateral triangle, has a three-element subgroup $R_3$.
Earlier, we defined a congruence relation $\equiv_n$ on $\mathbb{Z}$ such that $a \equiv_n b$
if $(a - b)$ is an element of the subgroup $n\mathbb{Z} < \mathbb{Z}$. The analogous
construction for $R_3 < D_3$ is to define an equivalence relation $\sim$ such
that $g \sim h$ if $g * h^{-1} \in R_3$. If we do this, $D_3$ splits neatly into two
equivalence classes, which happen to be the left (and, since $R_3$ is
normal in $D_3$, right) cosets of $R_3$:
$$[e] = \{e, r, r^2\} = R_3 \qquad [m_1] = \{m_1, m_2, m_3\} = m_1 R_3 = R_3 m_1$$
We can immediately see from the multiplication table of $D_3$ that any
two elements of $[e]$ combine to give another element of $[e]$, any two
elements of $[m_1]$ combine to give an element of $[e]$, and any element
of $[e]$ combined with any element of $[m_1]$ gives an element of $[m_1]$.
What this means is that the multiplication operation in $D_3$ lines up
correctly with the way $D_3$ is partitioned into cosets of the subgroup
$R_3$. Just as we did with the congruence relation $\equiv_n$ defined on $\mathbb{Z}$, we
can form a new group by treating the equivalence classes $[e]$ and $[m_1]$
as if they were single elements in their own right. The multiplication
operation we get if we do this is shown in Table 3.4 (see also Table 3.5).

    Table 3.4: Multiplication table for cosets of R_3 in D_3

             [e]     [m_1]
    [e]      [e]     [m_1]
    [m_1]    [m_1]   [e]

    Table 3.5: Cosets of R_3 in D_3

           e     r     r^2   m_1   m_2   m_3
    e      e     r     r^2   m_1   m_2   m_3
    r      r     r^2   e     m_3   m_1   m_2
    r^2    r^2   e     r     m_2   m_3   m_1
    m_1    m_1   m_2   m_3   e     r     r^2
    m_2    m_2   m_3   m_1   r^2   e     r
    m_3    m_3   m_1   m_2   r     r^2   e
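The way the cosets of the rotation subgroup multiply as blocks can be checked by brute force. The sketch below models $D_3$ as the six permutations of the triangle's vertices $0, 1, 2$; the particular tuple chosen for the rotation `r` is an assumption made for illustration and need not match the labelling used in the tables.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p (right-to-left composition)."""
    return tuple(p[q[i]] for i in range(len(q)))

D3 = list(permutations(range(3)))   # all six symmetries of the triangle
e = (0, 1, 2)                       # identity
r = (1, 2, 0)                       # a rotation (hypothetical labelling)
R3 = {e, r, compose(r, r)}          # the rotation subgroup

def left_coset(g, H):
    return frozenset(compose(g, h) for h in H)

def set_product(A, B):
    """Multiply two subsets elementwise."""
    return frozenset(compose(a, b) for a in A for b in B)

cosets = {left_coset(g, R3) for g in D3}

# Every elementwise product of two cosets should again be exactly one coset.
products_are_cosets = all(
    set_product(A, B) in cosets for A in cosets for B in cosets
)

print(len(cosets), products_are_cosets)
```

With two cosets whose products are again cosets, the only possible table is that of the two-element group, in agreement with Table 3.4.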

This is obviously just the multiplication table for $\mathbb{Z}_2$ in a very
superficial disguise. So our new group $\{[e], [m_1]\}$ with multiplication
operation derived from the original multiplication operation in $D_3$
is isomorphic to the two-element finite cyclic group $\mathbb{Z}_2$.
Geometrically, the elements of $[e]$ are the orientation-preserving isometries
of the equilateral triangle, while the elements of $[m_1]$ are exactly
those isometries which reverse the orientation. So what this tells
us is that the composite of two direct isometries preserves orientation,
as does the composite of two opposite isometries, but a direct isometry
combined with an opposite isometry reverses orientation. The way direct
and opposite isometries interact is the same as the way the elements
0 and 1 interact in the group $\mathbb{Z}_2$.
Let's introduce the symbol / to denote this (as yet insufficiently
precisely defined) operation on groups and their subgroups. So the
previous two examples tell us that
$$\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}_n \quad\text{and}\quad D_3/R_3 \cong \mathbb{Z}_2.$$
To decide whether this construction works for any subgroup $H$ of
any group $G$, we'll try another example. Let $M$ denote the subgroup
$\{e, m_1\}$ of $D_3$. This has three left cosets
$$M = \{e, m_1\} \qquad rM = \{r, m_3\} \qquad r^2M = \{r^2, m_2\}$$
and three right cosets
$$M = \{e, m_1\} \qquad Mr = \{r, m_2\} \qquad Mr^2 = \{r^2, m_3\}$$
from which we can see immediately that $M$ is, unlike $R_3$, not a normal
subgroup of $D_3$. In particular, this means that whereas with the
previous examples we could cheerfully ignore whether the resulting
groups consisted of left or right cosets, we have to be more careful
in this case. For the moment, let's consider the set of left cosets of
$M$ in $D_3$ and see how they interact. Following on from the previous
examples, given two elements $g, h \in D_3$, we require that the product of
any element of the left coset $gM$ with any element of the left coset $hM$
be an element of the left coset $(g * h)M$. Unfortunately, this doesn't
always work. For example,
$$(eM)(rM) = \{g * h : g \in M, h \in rM\} = \{r, r^2, m_2, m_3\}$$
which not only fails to be the left coset $rM$, but isn't even one of the
other two left cosets either. Looking at the multiplication table for $D_3$
again, this time rearranging the elements into their left coset groups,
we can see (in Table 3.6) that instead of a nice, neat structure like the
one we got for $D_3/R_3$, the whole thing has fragmented in a decidedly
inelegant manner.

    Table 3.6: Left cosets of {e, m_1} in D_3

           e     m_1   r     m_3   r^2   m_2
    e      e     m_1   r     m_3   r^2   m_2
    m_1    m_1   e     m_2   r^2   m_3   r
    r      r     m_3   r^2   m_2   e     m_1
    m_3    m_3   r     m_1   e     m_2   r^2
    r^2    r^2   m_2   e     m_1   r     m_3
    m_2    m_2   r^2   m_3   r     m_1   e

What's going wrong here? What precisely is stopping the left cosets
of $M$ in $D_3$ from behaving like elements of a group?
We want to be able to define a group structure on the set of left cosets
of $M$ such that
$$(gM)(hM) = (g * h)M$$
for all $g, h \in D_3$. But the problem is that the set $\{M, rM, r^2M\}$ isn't
closed under this multiplication operation, because the multiplication
operation in $D_3$ doesn't line up neatly with the left cosets of $M$ in $D_3$.
It works in some cases. For example, $(rM)(eM) = \{r, m_3\} = rM$ as
required. But if we try $(eM)(rM)$ we find that it doesn't. In particular,
$$m_1 * r = (e * m_1) * (r * e) = e * (r * r^{-1}) * m_1 * r * e = (e * r) * (r^{-1} * m_1 * r) * e.$$
In the last expression, the first parenthesised factor $(e * r)$ is equal to
$r$, as we want it to be. But in order for this whole expression to be
an element of $rM$, we also need the second parenthesised factor
$(r^{-1} * m_1 * r)$ to be an element of $M$, which it isn't since
$$r^{-1} * m_1 * r = r^2 * m_1 * r = m_3.$$
Notice that $(r^{-1} * m_1 * r)$ is a conjugate of an element of $M$. In fact,
if we tried any other combination of cosets of $M$ ($rM$ and $r^2M$, for
example) we'd end up with a similar requirement: that a particular
conjugate of an element of $M$ be itself an element of $M$.
In other words, the reason this particular example doesn't work is
because $M$ isn't closed under conjugation. In the last section, we
introduced a special class of subgroups which are closed under conjugation,
namely normal subgroups. So the reason we can't form a group from
the left cosets of $M$ is because $M$ isn't a normal subgroup of $D_3$. The
rotation subgroup is normal in $D_3$, and the subgroup $n\mathbb{Z}$ is normal in
$\mathbb{Z}$, which is why this construction worked properly in those cases.
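The contrast between the rotation subgroup and a two-element reflection subgroup can be reproduced computationally. This is a rough sketch, assuming $D_3$ is modelled as permutations of the vertices $0, 1, 2$; the tuples chosen for `r` and `m` are hypothetical stand-ins for a rotation and a reflection and are not claimed to coincide with the book's $r$ and $m_1$.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p (right-to-left composition)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

D3 = list(permutations(range(3)))
e, r, m = (0, 1, 2), (1, 2, 0), (0, 2, 1)   # identity, a rotation, a reflection
R3 = {e, r, compose(r, r)}
M = {e, m}

def is_closed_under_conjugation(H, G):
    """Check that g h g^{-1} lies in H for every g in G and h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

def left_cosets(H, G):
    return {frozenset(compose(g, h) for h in H) for g in G}

# Multiply the cosets eM and rM elementwise, as in the text.
rM = frozenset(compose(r, h) for h in M)
product = frozenset(compose(a, b) for a in M for b in rM)

print(is_closed_under_conjugation(R3, D3))           # rotation subgroup
print(is_closed_under_conjugation(M, D3))            # reflection subgroup
print(len(product), product in left_cosets(M, D3))   # size of (eM)(rM), and whether it is a coset
```

The elementwise product of the two cosets has four elements, so it cannot be any of the two-element left cosets of the reflection subgroup.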
The following proposition states this more generally:
Proposition 3.24 If $H$ is a normal subgroup of $G$, the set of all left cosets
of $H$ in $G$ forms a group under the multiplication operation
$$(gH)(kH) = (g * k)H$$
for all $g, k \in G$.

Proof Any element of $(gH)(kH)$ has the form $g * h_1 * k * h_2$, for some
$h_1, h_2 \in H$. We can rewrite this as
$$g * h_1 * k * h_2 = g * (k * k^{-1}) * h_1 * k * h_2 = (g * k) * (k^{-1} * h_1 * k) * h_2.$$
This is an element of $(g * k)H$ if and only if the conjugate $(k^{-1} * h_1 * k)$
is an element of $H$. But since $H$ is a normal subgroup of $G$, it's closed
under conjugation (by Proposition 3.10) and hence $(k^{-1} * h_1 * k)$ must
also lie in $H$. So the set of left cosets of $H$ in $G$ is closed under the
multiplication operation defined above, as required.
In order to show that the set of left cosets forms a group, we must
check each of the group axioms. The new multiplication operation is
clearly associative, because the operation $*$ in $G$ is. The coset $eH$ acts as the
identity element since
$$(eH)(gH) = gH = (gH)(eH)$$
for any $g \in G$. Finally, the coset $g^{-1}H$ is the inverse of the coset $gH$
since
$$(g^{-1}H)(gH) = eH = (gH)(g^{-1}H)$$
for any $g \in G$.
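For a finite group, the construction in this proof can be carried out mechanically. The sketch below is one possible way to do so; the function name `quotient_group` and its interface (a collection of elements, a subgroup, and a binary operation) are illustrative assumptions. It is demonstrated on the integers modulo 6 under addition with the subgroup $\{0, 3\}$.

```python
def quotient_group(G, H, op):
    """Return the left cosets of H in G and the induced coset multiplication.

    Raises ValueError if the elementwise product of two cosets is not itself
    a single coset, which is what happens when H is not normal in G.
    """
    cosets = {frozenset(op(g, h) for h in H) for g in G}
    table = {}
    for A in cosets:
        for B in cosets:
            product = frozenset(op(a, b) for a in A for b in B)
            if product not in cosets:
                raise ValueError("coset product is not a coset; H is not normal")
            table[A, B] = product
    return cosets, table

def add6(a, b):
    """Addition modulo 6."""
    return (a + b) % 6

Z6 = range(6)
H = frozenset({0, 3})
cosets, table = quotient_group(Z6, H, add6)

# The coset H itself should act as the identity element of the quotient.
identity_ok = all(table[H, A] == A and table[A, H] == A for A in cosets)

print(len(cosets), identity_ok)
```

The same function can be applied to any finite group supplied as a list of elements together with its operation, and it fails loudly when the subgroup is not normal.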
This proposition and discussion lead to the following definition:
Definition 3.25 Let $H$ be a normal subgroup of a group $G$. Then
the quotient group or factor group $G/H$ is the group consisting of
the left cosets of $H$ in $G$ together with the multiplication operation
$$(gH)(kH) = (g * k)H$$
for any $g, k \in G$.

To better understand this concept, we'll look at a few examples.


Example 3.26 The cyclic group $\mathbb{Z}_4$ has a subgroup $\langle 2 \rangle = \{0, 2\}$
isomorphic to $\mathbb{Z}_2$. Its cosets are
$$\langle 2 \rangle = 0 + \langle 2 \rangle = \{0, 2\} \qquad 1 + \langle 2 \rangle = \{1, 3\}$$
The subgroup $\langle 2 \rangle$ is normal since $\mathbb{Z}_4$ is abelian, so we can go ahead
and construct the quotient group $\mathbb{Z}_4/\langle 2 \rangle = \{\langle 2 \rangle, 1 + \langle 2 \rangle\}$ which has
two elements and must therefore be isomorphic to $\mathbb{Z}_2$.
We can also see this from a rearranged version of the multiplication
table for $\mathbb{Z}_4$, shown in Table 3.7.

    Table 3.7: Z_4/<2> is isomorphic to Z_2

    +   0   2   1   3
    0   0   2   1   3            +   0   1
    2   2   0   3   1     ≅      0   0   1
    1   1   3   2   0            1   1   0
    3   3   1   0   2
Example 3.27 The Klein 4-group $V_4 = \{e, a, b, c\} \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ has
three proper, nontrivial subgroups, each of which is isomorphic to
$\mathbb{Z}_2$. Consider the subgroup $A = \{e, a\}$; again this is normal since $V_4$
is abelian. Its cosets are $A$ and $bA = cA = \{b, c\}$, hence
$$V_4/A = \{A, bA\} \cong \mathbb{Z}_2.$$
This again can be seen from the multiplication table in Table 3.8.

    Table 3.8: V_4/{e, a} is isomorphic to Z_2

        e   a   b   c
    e   e   a   b   c            +   0   1
    a   a   e   c   b     ≅      0   0   1
    b   b   c   e   a            1   1   0
    c   c   b   a   e

The above example could be written, with only a slightly ambiguous
abuse of notation, as $(\mathbb{Z}_2 \times \mathbb{Z}_2)/\mathbb{Z}_2 \cong \mathbb{Z}_2$.
Let's try another example like this.

Example 3.28 Let $G = \mathbb{Z}_6 \cong \mathbb{Z}_2 \times \mathbb{Z}_3$ (by Proposition 1.32). Now
consider the subgroup $H = \langle 3 \rangle \cong \mathbb{Z}_2 \times \{0\}$. It has the following
cosets:
$$0 + H = \{0, 3\} \cong \{(0, 0), (1, 0)\} = (0, 0) + H$$
$$1 + H = \{1, 4\} \cong \{(0, 1), (1, 1)\} = (0, 1) + H$$
$$2 + H = \{2, 5\} \cong \{(0, 2), (1, 2)\} = (0, 2) + H$$
Furthermore, $H$ is normal in $G$, not least because $G$ is abelian. So we
can form the quotient group
$$G/H = \{H, 1 + H, 2 + H\} \cong \{(0, 0) + H, (0, 1) + H, (0, 2) + H\}$$
which must be isomorphic to $\mathbb{Z}_3$, the only group with three elements.
Again, the multiplication table, suitably rearranged, makes this clear:
see Table 3.9.

    Table 3.9: Z_6/<3> is isomorphic to Z_3

    +   0   3   1   4   2   5
    0   0   3   1   4   2   5
    3   3   0   4   1   5   2            +   0   1   2
    1   1   4   2   5   3   0     ≅      0   0   1   2
    4   4   1   5   2   0   3            1   1   2   0
    2   2   5   3   0   4   1            2   2   0   1
    5   5   2   0   3   1   4

Is this true in general? That is, given a direct product $G \times H$ of groups,
is it true that $\{e\} \times H \cong H$ is a normal subgroup of $G \times H$, and if so,
that $(G \times H)/(\{e\} \times H) \cong G$? Yes, as the next proposition confirms.
Proposition 3.29 The direct product $G \times H$ of any two groups $G$ and $H$
has a normal subgroup isomorphic to $H$. Factoring out by this subgroup
yields a group isomorphic to $G$.

Proof The direct product $G \times H$ is the set $\{(g, h) : g \in G, h \in H\}$ of
ordered pairs of elements of $G$ and $H$. The subset $\{e\} \times H = \{(e, h) :
h \in H\}$ may be seen to be a subgroup of $G \times H$; furthermore it is
isomorphic to $H$ via the map $f : \{e\} \times H \to H$ given by $f : (e, h) \mapsto h$.
To see that this subgroup is normal in $G \times H$, choose any $g \in G$ and
$h, k \in H$. Then the conjugate
$$(g, k)(e, h)(g, k)^{-1} = (g e g^{-1}, khk^{-1}) = (e, khk^{-1})$$
is an element of $\{e\} \times H$, and hence $\{e\} \times H$ is closed under conjugation
and thus normal.
The cosets of $\{e\} \times H$ in $G \times H$ are of the form $(g, h)(\{e\} \times H) = \{g\} \times H$
and are thus indexed by elements of $G$. Since
$(\{g\} \times H)(\{g'\} \times H) = \{gg'\} \times H$, the correspondence $\{g\} \times H \mapsto g$
respects the group operations. Therefore the quotient group
$(G \times H)/(\{e\} \times H)$ is isomorphic to $G$.
What this tells us is that any direct product of a finite number of
groups can be divided by any of its factors.
It might, therefore, be tempting to think that any group can be
decomposed in this way. That is, that $(G/H) \times H \cong G$ for any normal
subgroup $H < G$. But Examples 3.27 and 3.26 above show that this
isn't the case: both $\mathbb{Z}_4$ and $V_4$ contain a subgroup isomorphic to $\mathbb{Z}_2$,
with quotient isomorphic to $\mathbb{Z}_2$ in each case, but we know from
Example 1.30 that $\mathbb{Z}_4 \not\cong V_4$. So not every group is
isomorphic to a direct product of its various quotients.

In fact, we can construct $\mathbb{Z}_4$ from two copies of $\mathbb{Z}_2$, but we have to do it in a
slightly more complicated way, using something called an extension. This is
beyond the scope of the current discussion, but we'll come back to it later, in
Section 7.5.

It's worth thinking about what the operation of factoring by a normal


subgroup actually means, in addition to the purely algebraic definition
in terms of cosets. Where the group in question has a particular
geometric or other interpretation, what does the quotient operation
mean in that context?
We've already said that factoring a group G by a normal subgroup
H can be thought of as gluing together the elements of a coset into
a single block and then looking at how they interact with each other
under the group multiplication operation. In particular, though, the
subgroup H itself collapses down to become the identity element in
the quotient group G/H. In doing so, everything that the subgroup H
represented, measured or related to in G gets discarded or ignored in
the passage to G/H.
Looking at our first example $\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}_n$, what's going on here is that
in $\mathbb{Z}$ we can uniquely express any integer $m$ in the form $m = qn + r$,
where $0 \leq r < n$ and $q \in \mathbb{Z}$. That is, we can consider any integer $m$ as
being formed from an element of $n\mathbb{Z}$, a multiple of $n$, together with
an extra bit, the remainder $r$. In passing to the quotient group $\mathbb{Z}/n\mathbb{Z}$,
we're essentially ignoring the "multiple of $n$" bit and only keeping the
remainder $r$.
Similarly, when we factored the dihedral group $D_3$ by the rotation
subgroup $R_3$, we were ignoring any rotational component of an element
in $D_3$, that is, a symmetry of the equilateral triangle, and only looking
at what was left: the reflection component. Geometrically, we can
think of this as taking the original equilateral triangle, gluing together
(or identifying) all the points which differ by a rotation, and thereby
obtaining a new shape (see Figure 3.2). This new shape, similar to a
kite, clearly only has a single line of symmetry, which is consistent
with the quotient group $D_3/R_3$ being isomorphic to $\mathbb{Z}_2$.

Figure 3.2: Identification of points in a triangle that differ by a $\tfrac{1}{3}$ rotation

As noted in the last section, every group has at least two normal
subgroups: itself and the trivial subgroup $\{e\}$. Factoring by these two
subgroups doesn't, however, yield anything particularly illuminating:
$$G/\{e\} \cong G \quad\text{and}\quad G/G \cong \{e\}$$
Convince yourself that both of these statements are true.
Two other subgroups that we met in the last section are the centre
$Z(G)$ and the commutator subgroup $[G, G]$, both of which gave some
measure of the commutativity (or not) of the group operation. These
are normal subgroups, so we can use them to form quotient groups.
Recall that the centre $Z(G)$ consists of those elements of $G$ which
commute with all other elements of $G$. So $Z(G) = G$ if and only if
$G$ is abelian. If $G$ is nonabelian then $Z(G)$ will be a proper normal
subgroup of $G$. Factoring by $Z(G)$ to get the quotient group $G/Z(G)$
can be regarded as ignoring the nicely commuting elements of $G$;
what's left gives some measure of how nonabelian $G$ is. If $G/Z(G)$ is
trivial then $G$ must be abelian, while if $G/Z(G)$ is nontrivial then $G$ is
nonabelian to a greater or lesser extent.
Example 3.30 The centre $Z(D_3)$ is trivial, so $D_3/Z(D_3) \cong D_3$. So in
some vague sense, $D_3$ is about as nonabelian as we can reasonably
expect it to be.
The centre $Z = Z(D_4) = \{e, r^2\}$ so $D_4/Z$ has four elements, and is
hence isomorphic to either $\mathbb{Z}_4$ or $V_4 \cong \mathbb{Z}_2 \times \mathbb{Z}_2$. To see which one,
we need to look at the cosets of $Z$:
$$Z = \{e, r^2\} \qquad rZ = \{r, r^3\}$$
$$m_1 Z = \{m_1, m_3\} \qquad m_2 Z = \{m_2, m_4\}$$
The quotient group $D_4/Z$ is isomorphic to the Klein 4-group $V_4 \cong
\mathbb{Z}_2 \times \mathbb{Z}_2$. This can be seen from the coset multiplication table
(Table 3.10) or by observing that the cosets $rZ$, $m_1 Z$ and $m_2 Z$ all have
order 2 under the coset multiplication operation.

    Table 3.10: Coset multiplication table for D_4/Z(D_4), isomorphic to V_4

            eZ      rZ      m_1Z    m_2Z
    eZ      eZ      rZ      m_1Z    m_2Z
    rZ      rZ      eZ      m_2Z    m_1Z
    m_1Z    m_1Z    m_2Z    eZ      rZ
    m_2Z    m_2Z    m_1Z    rZ      eZ

We'll come back to this later in Section 7.2 when we use the quotient
$G/Z(G)$ to form the upper central series of a group, but for the
moment just think of it as giving an indication of the nonabelianness
of a group.
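The centre of the symmetry group of the square, and the fact that every coset of the centre squares to the identity coset, can be confirmed by brute force. The sketch below assumes that group is modelled as permutations of the square's vertices $0, 1, 2, 3$; the generators `r` and `m` are a hypothetical choice of one rotation and one reflection.

```python
def compose(p, q):
    """Return 'p after q': apply q first, then p (right-to-left composition)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Return the finite group generated by the given permutations."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

r = (1, 2, 3, 0)   # quarter-turn rotation: vertex i goes to i + 1 (mod 4)
m = (0, 3, 2, 1)   # reflection in the diagonal through vertices 0 and 2
D4 = generate([r, m])

# The centre: elements commuting with every element of the group.
centre = frozenset(z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4))

cosets = {frozenset(compose(g, z) for z in centre) for g in D4}

def coset_square(A):
    return frozenset(compose(a, b) for a in A for b in A)

# A four-element group in which every element squares to the identity
# cannot be cyclic, so it must be the Klein 4-group.
all_square_to_identity = all(coset_square(A) == centre for A in cosets)

print(len(D4), sorted(centre), len(cosets), all_square_to_identity)
```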
The commutator subgroup $[G, G]$ of a group $G$ is also a measure
(albeit a slightly different one) of the nonabelianness of $G$: if $[G, G]$ is
trivial then $G$ is abelian, and the more elements $[G, G]$ has, the more
nonabelian $G$ is. Proposition 3.21 confirms that $[G, G]$ is normal in
$G$, so we can form the quotient group $G/[G, G]$. In doing this, we
are killing off all of the noncommutativity in $G$. More precisely, we
are making every commutator $g * h * g^{-1} * h^{-1}$ equal to the identity $e$,
forcing every element of $G$ to commute with every other element of $G$.
Proposition 3.31 For any group $G$, the quotient $G/[G, G]$ is abelian.
Furthermore, if $N \trianglelefteq G$ and $G/N$ is abelian, then $[G, G] \leq N$.

Proof Suppose that $a, b \in G$, and let $G' = [G, G]$. We want to show
that the cosets $aG'$ and $bG'$ commute in the quotient group $G/G'$, and
we do this by considering their commutator:
$$[aG', bG'] = (aG')(bG')(aG')^{-1}(bG')^{-1} = (aba^{-1}b^{-1})G' = [a, b]G' = G'.$$
(The last step follows because $[a, b] \in G'$.) This means that the
commutator $[aG', bG']$ of any two cosets in $G/G'$ is trivial, and hence
$G/G'$ is abelian.
To prove the second statement, we consider the commutator $[aN, bN]$
of any two cosets $aN$ and $bN$ in the quotient $G/N$:
$$[aN, bN] = (aN)(bN)(aN)^{-1}(bN)^{-1} = (aba^{-1}b^{-1})N = [a, b]N.$$
Since $G/N$ is abelian, this commutator must be the identity element in
$G/N$, namely $N$. So $N = [a, b]N$ for any $a, b \in G$. By Proposition 2.26
this means that $[a, b] \in N$ and hence $G' \leq N$.
This process, of factoring a group by its commutator subgroup, is
called abelianisation, and we will often denote the quotient $G/[G, G]$
by $G^{\mathrm{ab}}$; it is the largest abelian quotient of $G$.
Example 3.32 The commutator subgroup $[D_3, D_3]$ is the rotation
subgroup $R_3 = \{e, r, r^2\}$, so $D_3^{\mathrm{ab}} = D_3/R_3 \cong \mathbb{Z}_2$.
The commutator subgroup $[D_4, D_4]$ is the two-element subgroup
$\{e, r^2\}$, so (by the discussion in Example 3.30) we see that
$$D_4^{\mathrm{ab}} = D_4/[D_4, D_4] \cong \mathbb{Z}_2 \times \mathbb{Z}_2.$$
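These two commutator subgroups can be computed directly by generating the subgroup spanned by all commutators. The sketch below models both dihedral groups as permutation groups on the vertices of the polygon; the specific generators are assumptions made for illustration.

```python
def compose(p, q):
    """Return 'p after q': apply q first, then p (right-to-left composition)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """Return the finite group generated by the given permutations."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def commutator_subgroup(G):
    """Generate the subgroup spanned by all commutators g h g^{-1} h^{-1}."""
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in G
        for h in G
    }
    return generate(list(commutators))

D3 = generate([(1, 2, 0), (0, 2, 1)])         # triangle: a rotation and a reflection
D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])   # square: a rotation and a reflection

D3_comm = commutator_subgroup(D3)
D4_comm = commutator_subgroup(D4)

# Orders of the abelianisations, by Lagrange's theorem.
print(len(D3) // len(D3_comm), len(D4) // len(D4_comm))
```

The quotient orders come out as 2 and 4, matching the two abelianisations in the example.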

Example 3.33 The alternating group $A_n$ is a normal subgroup of $S_n$;
as remarked earlier (in Example 3.14) it has index 2 and is hence
normal (by Proposition 3.12) and so $S_n/A_n \cong \mathbb{Z}_2$. In fact, for $n \geq 3$,
Proposition 3.20 tells us that $[S_n, S_n] = A_n$, so we've just shown that
$S_n^{\mathrm{ab}} \cong \mathbb{Z}_2$ for $n \geq 3$.

Example 3.34 Generalising the examples of $D_3$ and $D_4$ above, the
direct isometry group $\operatorname{Isom}^+(X)$ of some geometric object $X$ is an
index-2 normal subgroup of the full isometry group $\operatorname{Isom}(X)$, so
$$\operatorname{Isom}(X)/\operatorname{Isom}^+(X) \cong \mathbb{Z}_2.$$
The next two examples consider quotient groups of infinite groups.
Example 3.35 The multiplicative group $\mathbb{R}^*_+$ of positive real numbers
is a subgroup of the multiplicative group $\mathbb{C}^*$ of nonzero complex
numbers. In particular, $\mathbb{R}^*_+$ is normal in $\mathbb{C}^*$ since $\mathbb{C}^*$ is abelian.
Writing any element of $\mathbb{C}^*$ as $re^{i\theta}$ where $r > 0$ and $\theta \in [0, 2\pi)$ we
can see that the subgroup $\mathbb{R}^*_+$ is exactly those elements of $\mathbb{C}^*$ with
$\theta = 0$. So we can specify any coset $re^{i\theta}\mathbb{R}^*_+$ as $e^{i\theta}\mathbb{R}^*_+$; this is uniquely
determined by just the argument $\theta$.
Furthermore,
$$(e^{i\theta}\mathbb{R}^*_+)(e^{i\phi}\mathbb{R}^*_+) = (e^{i(\theta+\phi)})\mathbb{R}^*_+,$$
so the quotient $\mathbb{C}^*/\mathbb{R}^*_+$ is isomorphic to an additive group of real
numbers. But we're only interested in the principal value of the
argument; that is, the angle it represents in $[0, 2\pi)$. So $\mathbb{C}^*/\mathbb{R}^*_+$ is
isomorphic to the additive group of real numbers modulo $2\pi$.
Geometrically, we can regard $\mathbb{C}^*/\mathbb{R}^*_+$ as the group of rotations in
the plane $\mathbb{R}^2$. That is, the matrix group
$$\left\{ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} : \theta \in [0, 2\pi) \right\}.$$
This is exactly the special orthogonal group $SO_2(\mathbb{R})$. So $\mathbb{C}^*/\mathbb{R}^*_+ \cong
SO_2(\mathbb{R})$. This is also isomorphic to the unitary group $U_1 = \{e^{i\theta} :
\theta \in [0, 2\pi)\}$.
Alternatively, we can view the cosets as infinite radii in the complex
plane; when we factor by $\mathbb{R}^*_+$, we collapse each of these radii onto its
intersection point with the unit circle.

Example 3.36 The multiplicative group $\mathbb{C}^*$ has another obvious
subgroup consisting of the complex numbers of unit modulus:
$$C = \{z \in \mathbb{C}^* : |z| = 1\} = \{e^{i\theta} : \theta \in [0, 2\pi)\} = U_1 \cong SO_2(\mathbb{R})$$

Figure 3.3: Cosets of $\mathbb{R}^*_+$ in $\mathbb{C}^*$

Again, $C$ is normal in $\mathbb{C}^*$ since the larger group is abelian, so we can
form the quotient group $\mathbb{C}^*/C$.
Since every nonzero complex number can be written in the form $re^{i\theta}$
where $r > 0$ and $\theta \in [0, 2\pi)$, we can express any coset $re^{i\theta}C = rC$.
So a given coset of $C$ is determined entirely by the modulus $r$, and
hence we have
$$(rC)(sC) = (rs)C.$$
Thus $\mathbb{C}^*/C \cong \mathbb{R}^*_+$, the multiplicative group of positive real numbers.
Geometrically, the cosets are concentric circles in the complex plane,
and when we factor by $C$ we are collapsing each circle down onto its
intersection point with the positive real axis.

Figure 3.4: Cosets of $C$ in $\mathbb{C}^*$

We can take Example 3.35 a step further by replacing the nonzero
complex numbers $\mathbb{C}^*$ by the nonzero quaternions $\mathbb{H}^*$.

Definition 3.37 A quaternion is a four-dimensional analogue of a
(two-dimensional) complex number, an element of the form
$$z = a + bi + cj + dk$$
where $a, b, c, d \in \mathbb{R}$ and $ijk = i^2 = j^2 = k^2 = -1$. Together, they form
an algebra $\mathbb{H}$, the third in a sequence $\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}$, formed by a process
called the Cayley–Dickson construction. (The fourth algebra $\mathbb{O}$ is
eight-dimensional and consists of objects called octonions.)
The algebra $\mathbb{H}$ is noncommutative, as the three special elements $i$, $j$
and $k$ anticommute:
$$ij = k \qquad jk = i \qquad ki = j$$
$$ji = -k \qquad kj = -i \qquad ik = -j$$
(The set $\{\pm 1, \pm i, \pm j, \pm k\}$, equipped with this multiplication
operation, is the quaternion group $Q_8$ from Example 1.42.)
The subgroup $\mathbb{R}^*_+$ is normal in $\mathbb{H}^*$, and by a similar argument as
in Example 3.35 we can show that $\mathbb{H}^*/\mathbb{R}^*_+ \cong SU_2$. This group is a
double cover of $SO_3(\mathbb{R})$, the group of rotations in three-dimensional
space: factoring $SU_2$ by its two-element centre $\{\pm I\}$ gives a group
isomorphic to $SO_3(\mathbb{R})$. This fact is particularly
useful in computer graphics programming: quaternion multiplication
turns out to be a computationally efficient way of calculating
three-dimensional rotations.
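To make the remark about rotations concrete, here is a minimal sketch of quaternion multiplication and of rotating a vector by conjugating with a unit quaternion. The function names `qmul`, `conj` and `rotate`, and the representation of $a + bi + cj + dk$ as a 4-tuple, are illustrative assumptions; the conjugation formula $v \mapsto q v \bar{q}$ is the standard one and is not taken from the text.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conj(q):
    """Quaternion conjugate: negate the i, j and k components."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(v, axis, angle):
    """Rotate the 3-vector v about the unit vector axis by angle (in radians)."""
    s = math.sin(angle / 2)
    q = (math.cos(angle / 2), s * axis[0], s * axis[1], s * axis[2])
    _, x, y, z = qmul(qmul(q, (0.0, *v)), conj(q))
    return (x, y, z)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j), qmul(j, i))   # ij and ji differ by a sign

# Rotate the x-axis unit vector a quarter turn about the z-axis.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
print(rotated)
```

Note that a unit quaternion and its negative produce the same rotation, since the two signs cancel in the conjugation.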
In the case of the octonions, we not only have to give up commutativity
of multiplication, but associativity as well. It is possible to take this
sequence one step further, obtaining a sixteen-dimensional algebra
variously called hexadecanions or sedenions, but we have to give up
something else along the way: this new algebra is not only
noncommutative and nonassociative, it also fails to be a division algebra, in
the sense that there exist nonzero sedenions that nevertheless multiply
to give zero. We'll learn about division algebras and division rings in
Chapter 10.

The Irish mathematician and physicist Sir William Rowan Hamilton
(1805–1865) displayed a strong mathematical talent from an early age,
and was appointed Professor of Astronomy at the University of Dublin,
while still a 21-year-old undergraduate. During his career he made many
contributions to mechanics, optics and algebra.
He spent some years unsuccessfully trying to construct a three-dimensional
analogue of the complex numbers until, while walking along the banks of
the Royal Canal in Dublin on the afternoon of Monday 16 October 1843, he
realised the solution: to construct instead a four-dimensional,
noncommutative algebra. With his walking stick, he scratched the equations
of quaternion multiplication on one of the stones of nearby Broom Bridge.
In 1958, a commemorative plaque was unveiled by the then Taoiseach Éamon
de Valera (1882–1975), also a mathematics graduate of the University of
Dublin. (Readers interested in visiting the bridge may find it easiest to
catch a local train to Broombridge Station.)

We saw earlier that taking the quotient of a group $G$ by its centre
$Z(G)$ yields another group. If $G$ was abelian to start with, then this
quotient will simply be the trivial group $\{e\}$, but if $G$ was nonabelian,
then we'll get a nontrivial group (which itself might or might not be
abelian). Applying this to the various families of matrix groups we
met in Section 1.2 produces some more interesting examples of groups,
some of which we'll investigate further in the next section.

Example 3.38 The centre $Z(GL_n(\mathbb{R}))$ of the general linear group
$GL_n(\mathbb{R})$ consists of all nonzero scalar multiples $kI$ of the identity
matrix $I$. Factoring out by this group gives the projective general
linear group $PGL_n(\mathbb{R})$. Applying the same process to the special
linear group $SL_n(\mathbb{R})$ we obtain the projective special linear group
$PSL_n(\mathbb{R})$.
In a similar way, the orthogonal group $O_n(\mathbb{R})$ gives rise to the
projective orthogonal group $PO_n(\mathbb{R})$ and the special orthogonal group
$SO_n(\mathbb{R})$ yields the projective special orthogonal group $PSO_n(\mathbb{R})$.
We can also construct the projective unitary group $PU_n$ and the
projective special unitary group $PSU_n$ from $U_n$ and $SU_n$, and the
projective symplectic group $PSp_{2n}(\mathbb{R})$ from $Sp_{2n}(\mathbb{R})$.

In the next chapter we will return to this idea of factoring a group
by its centre. In particular, Corollary 4.58, a consequence of the
Normaliser–Centraliser Theorem (Theorem 4.57, page 126), gives us a
concrete interpretation of this quotient.

3.A Simple groups

    It would be of the greatest interest if a classification could be given
    of all simple groups with finitely many operations.
    Otto Hölder (1859–1937),
    Die einfachen Gruppen in ersten und zweiten Hundert der Ordnungszahlen,
    Mathematische Annalen 40 (1892) 55–88

    Only three people have ever really understood the Schleswig–Holstein
    business: the Prince Consort, who is dead, a German professor, who is
    mad, and I, who have forgotten all about it.
    Henry John Temple, 3rd Viscount Palmerston (1784–1865)

We learned earlier that if $p$ is prime, the cyclic group $\mathbb{Z}_p$ has no
proper subgroups except for the trivial subgroup. This means $\mathbb{Z}_p$ is in
some sense indivisible: it has no proper, nontrivial normal subgroups,
so can't be factored to give a smaller nontrivial group.
Just as many properties of the multiplicative structure of the integers
relate to the study of prime numbers, we can reasonably expect to
understand some aspects of the classification of finite groups and
their internal structure by studying those finite groups which can't be
turned into smaller groups by factoring by some normal subgroup.
Definition 3.39 A group $G$ is simple if it contains no proper,
nontrivial normal subgroups. That is, it contains no normal subgroups
other than $G$ and $\{e\}$.
The classification of finite simple groups represents the end product
of arguably the greatest collaborative project in the history of group
theory. Earlier discussions yield the following two examples.
Example 3.40 The cyclic group Z p is simple if and only if p is prime.

Example 3.41 The dihedral group $D_n$ is not simple, since it contains
a normal subgroup $R_n = \langle r \rangle$ generated by the rotations.

In fact, there are relatively few types of simple groups. The only other
ones we've met so far, apart from the prime order cyclic groups $\mathbb{Z}_p$,
are the alternating groups $A_n$ for $n \neq 4$. The groups $A_1$ and $A_2$ are
trivial (and hence simple). The group $A_3$ is isomorphic to $\mathbb{Z}_3$ and is
hence simple too. The group $A_4$, however, is not simple: it has a single
proper, nontrivial normal subgroup
$$\{\iota, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\} \cong V_4 \cong \mathbb{Z}_2 \times \mathbb{Z}_2.$$
All the other alternating groups $A_n$ for $n \geq 5$ are simple. To prove
this, we first need the following lemma.
Lemma 3.42 Let $N \trianglelefteq A_n$, where $n \geq 5$. If $N$ contains a 3-cycle, then
$N = A_n$.

Proof Suppose $N$ contains a 3-cycle $\sigma = (a_1\ a_2\ a_3)$. Let $\tau = (x\ y\ z)$ be
another 3-cycle in $A_n$. Choose $\pi \in S_n$ such that $\pi(a_1) = x$, $\pi(a_2) = y$
and $\pi(a_3) = z$. Then $\pi\sigma\pi^{-1} = (x\ y\ z) = \tau$, because
$$x \xmapsto{\pi^{-1}} a_1 \xmapsto{\sigma} a_2 \xmapsto{\pi} y,$$
$$y \xmapsto{\pi^{-1}} a_2 \xmapsto{\sigma} a_3 \xmapsto{\pi} z,$$
$$\text{and}\quad z \xmapsto{\pi^{-1}} a_3 \xmapsto{\sigma} a_1 \xmapsto{\pi} x.$$
(Remember that composition of permutations is written right-to-left,
rather than left-to-right.) Since $n \geq 5$ there are at least two elements $a_4$
and $a_5$ not affected by $\sigma$. Hence $\rho = (a_4\ a_5)$ commutes with $\sigma$ and so
$$\tau = \pi\sigma\pi^{-1} = \pi\rho\sigma\rho^{-1}\pi^{-1} = (\pi\rho)\sigma(\pi\rho)^{-1}.$$
Either $\pi$ or $\pi\rho$ belongs to $A_n$: either $\pi$ is an even permutation (and
hence an element of $A_n$) or it's an odd permutation, in which case
$\pi\rho$ is an even permutation, since $\rho$ is a transposition disjoint from $\sigma$.
Hence $\tau$ is conjugate to $\sigma$ in $A_n$, and so $\tau \in N$. Therefore $N$ contains
all 3-cycles in $A_n$ and must therefore be the whole of $A_n$.
We can now prove that the alternating group $A_n$ is simple for $n \geq 5$.
The proof is a little involved, but essentially consists of applying
Lemma 3.42 to five possible cases: an element of $A_n$ (for $n \geq 5$) can
(i) consist of a single 3-cycle,
(ii) contain at least one cycle of length $m > 3$,
(iii) consist of two or more disjoint 3-cycles together with zero or
more disjoint transpositions,
(iv) consist of a single 3-cycle and a product of disjoint transpositions, or
(v) consist of an even number of disjoint transpositions.
Proposition 3.43 The alternating group $A_n$ is simple for $n \geq 5$.

Proof Let $N \trianglelefteq A_n$ be nontrivial. We will show that $N$ must be the whole of $A_n$. Consider an arbitrary non-identity element $\sigma \in N$ and decompose it as a product of disjoint cycles. Then we have five cases to consider.

Case 1 $\sigma = (a_1\ a_2\ a_3)$ is a 3-cycle. If so, then by Lemma 3.42 the subgroup $N$ must contain all the 3-cycles, and therefore be $A_n$ itself.

Case 2 $\sigma = \tau(a_1\ a_2\ \ldots\ a_m)$, where $m > 3$ and $\tau$ is some product of disjoint cycles. In this case, the 3-cycle $\rho = (a_1\ a_2\ a_3)$ commutes with all the factors of $\sigma$ except for the $m$-cycle $(a_1\ a_2\ \ldots\ a_m)$. So
$$\rho\sigma\rho^{-1} = \tau\rho(a_1\ a_2\ \ldots\ a_m)\rho^{-1}$$
belongs to $N$, and so does
$$\sigma^{-1}\rho\sigma\rho^{-1} = (a_1\ a_2\ \ldots\ a_m)^{-1}\tau^{-1}\tau\rho(a_1\ a_2\ \ldots\ a_m)\rho^{-1}$$
$$= (a_m\ \ldots\ a_2\ a_1)(a_1\ a_2\ a_3)(a_1\ a_2\ \ldots\ a_m)(a_3\ a_2\ a_1)$$
$$= (a_1\ a_3\ a_m)$$
which is a 3-cycle, and hence by Lemma 3.42 $N$ must be $A_n$.

Case 3 $\sigma = \tau(a_1\ a_2\ a_3)(a_4\ a_5\ a_6)$, where $\tau$ is a product of disjoint cycles. Let $\rho = (a_2\ a_3\ a_4)$, and note that $\rho$ commutes with $\tau$. Then
$$\sigma^{-1}\rho\sigma\rho^{-1} = (a_6\ a_5\ a_4)(a_3\ a_2\ a_1)(a_2\ a_3\ a_4)(a_1\ a_2\ a_3)(a_4\ a_5\ a_6)(a_4\ a_3\ a_2)$$
$$= (a_1\ a_2\ a_4\ a_3\ a_6).$$
This is a cycle of length 5, and by the previous case $N$ must therefore contain a 3-cycle and hence be the whole of $A_n$.

Case 4 $\sigma = \tau(a_1\ a_2\ a_3)$ where $\tau$ is a product of disjoint transpositions. We see that
$$\sigma^2 = \tau(a_1\ a_2\ a_3)\tau(a_1\ a_2\ a_3) = \tau^2(a_1\ a_2\ a_3)^2 = (a_1\ a_3\ a_2)$$
is a 3-cycle, and hence by Lemma 3.42 the subgroup $N$ must again be the entirety of $A_n$.

Case 5 $\sigma = \tau(a_1\ a_2)(a_3\ a_4)$ where $\tau$ is a product of an even number of disjoint transpositions. The conjugate $(a_1\ a_2\ a_3)\sigma(a_1\ a_2\ a_3)^{-1}$ is in $N$, and hence so is the product
$$\sigma^{-1}(a_1\ a_2\ a_3)\sigma(a_1\ a_2\ a_3)^{-1} = \tau^{-1}(a_1\ a_2)(a_3\ a_4)(a_1\ a_2\ a_3)\tau(a_1\ a_2)(a_3\ a_4)(a_3\ a_2\ a_1)$$
$$= (a_1\ a_3)(a_2\ a_4).$$
Since $n \geq 5$, we can find $a_5 \neq a_1, a_2, a_3, a_4$ and define $\rho = (a_1\ a_3\ a_5)$. Then
$$\rho^{-1}(a_1\ a_3)(a_2\ a_4)\rho(a_1\ a_3)(a_2\ a_4)$$
is an element of $N$, and
$$\rho^{-1}(a_1\ a_3)(a_2\ a_4)\rho(a_1\ a_3)(a_2\ a_4) = (a_5\ a_3\ a_1)(a_1\ a_3)(a_2\ a_4)(a_1\ a_3\ a_5)(a_1\ a_3)(a_2\ a_4)$$
$$= (a_1\ a_3\ a_5)$$
is a 3-cycle. Hence by Lemma 3.42, $N = A_n$ again.

These are the only possible forms for non-identity elements of $N$, and in each case we deduce that $N = A_n$, and therefore $A_n$ is simple.
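The cycle products in Cases 2, 3 and 5 are easy to get wrong by hand, so here is a small sketch that recomputes them. It assumes $a_i = i$, takes $m = 5$ in Case 2, and composes permutations right to left (the rightmost factor acts first); the helper names `cycle`, `compose` and `as_cycles` are ours, not the book's.

```python
def cycle(*pts):
    """The permutation (as a dict) given by the cycle (pts[0] pts[1] ...)."""
    return {p: pts[(i + 1) % len(pts)] for i, p in enumerate(pts)}

def compose(*perms, n=6):
    """Compose permutations of {1..n} right to left: the rightmost acts first."""
    result = {}
    for x in range(1, n + 1):
        y = x
        for p in reversed(perms):
            y = p.get(y, y)  # points not mentioned by a cycle are fixed
        result[x] = y
    return result

def as_cycles(perm):
    """Disjoint-cycle form, omitting fixed points."""
    seen, cycles = set(), []
    for x in sorted(perm):
        if x in seen or perm[x] == x:
            continue
        cyc, y = [], x
        while y not in seen:
            seen.add(y)
            cyc.append(y)
            y = perm[y]
        cycles.append(tuple(cyc))
    return cycles

# Case 2 with m = 5: (5 4 3 2 1)(1 2 3)(1 2 3 4 5)(3 2 1)
case2 = as_cycles(compose(cycle(5, 4, 3, 2, 1), cycle(1, 2, 3),
                          cycle(1, 2, 3, 4, 5), cycle(3, 2, 1)))
# Case 3: (6 5 4)(3 2 1)(2 3 4)(1 2 3)(4 5 6)(4 3 2)
case3 = as_cycles(compose(cycle(6, 5, 4), cycle(3, 2, 1), cycle(2, 3, 4),
                          cycle(1, 2, 3), cycle(4, 5, 6), cycle(4, 3, 2)))
# Case 5, first product: (1 2)(3 4)(1 2 3)(1 2)(3 4)(3 2 1)
case5a = as_cycles(compose(cycle(1, 2), cycle(3, 4), cycle(1, 2, 3),
                           cycle(1, 2), cycle(3, 4), cycle(3, 2, 1)))
# Case 5, second product: (5 3 1)(1 3)(2 4)(1 3 5)(1 3)(2 4)
case5b = as_cycles(compose(cycle(5, 3, 1), cycle(1, 3), cycle(2, 4),
                           cycle(1, 3, 5), cycle(1, 3), cycle(2, 4)))

print(case2, case3, case5a, case5b)
```

With these choices the four products come out as $(1\ 3\ 5)$, $(1\ 2\ 4\ 3\ 6)$, $(1\ 3)(2\ 4)$ and $(1\ 3\ 5)$, matching the proof.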
So far, we've discovered two different classes of finite simple groups: the cyclic groups $\mathbb{Z}_p$ of prime order and the alternating groups $A_n$ for $n \geq 5$. A reasonable question to ask at this point is whether there are any more, and it so happens that there are. In one of the longest-running and most heroic endeavours in the history of group theory, the finite simple groups have been completely classified:
Theorem 3.44 Any finite simple group is one of the following four types:
(i) Cyclic groups Z p of prime order
(ii) Alternating groups An for n > 4
(iii) Finite groups of Lie type
(iv) Sporadic groups

Proof (Omitted.)

The proof of this classification theorem took about a hundred mathematicians approximately five decades to complete, between 1955 and 2004,[3] and the details are spread through somewhere in excess of ten thousand pages of journal articles and research monographs.[4]

[3] The proof was declared complete in 1983, but a subtle gap (concerning quasithin groups) was discovered, which took a further thousand or so pages to patch.
[4] It has been suggested that no single living person fully understands all the details of the proof: the American group theorist Daniel Gorenstein (1923–1992), one of the main driving forces behind the classification, is generally regarded as having had the best overall perspective on the project.

A full discussion of the third and fourth classes is beyond the scope of this book, but we can look at some examples of finite groups of Lie type now. Table 3.11 lists the nonabelian simple groups with fewer than 10 000 elements.

Group                                                                                        Order
$A_5 \cong \mathsf{A}_1(4) = \mathrm{PSL}_2(4) \cong \mathsf{A}_1(5) = \mathrm{PSL}_2(5)$    60
$\mathsf{A}_1(7) = \mathrm{PSL}_2(7) \cong \mathsf{A}_2(2) = \mathrm{PSL}_3(2)$              168
$A_6 \cong \mathsf{A}_1(9) = \mathrm{PSL}_2(9)$                                              360
$\mathsf{A}_1(8) = \mathrm{PSL}_2(8)$                                                        504
$\mathsf{A}_1(11) = \mathrm{PSL}_2(11)$                                                      660
$\mathsf{A}_1(13) = \mathrm{PSL}_2(13)$                                                      1092
$\mathsf{A}_1(17) = \mathrm{PSL}_2(17)$                                                      2448
$A_7$                                                                                        2520
$\mathsf{A}_1(19) = \mathrm{PSL}_2(19)$                                                      3420
$\mathsf{A}_1(16) = \mathrm{PSL}_2(16)$                                                      4080
$\mathsf{A}_2(3) = \mathrm{PSL}_3(3)$                                                        5616
${}^2\mathsf{A}_2(3^2) = \mathrm{PSU}_3(3)$                                                  6048
$\mathsf{A}_1(23) = \mathrm{PSL}_2(23)$                                                      6072
$\mathsf{A}_1(25) = \mathrm{PSL}_2(25)$                                                      7800
$M_{11}$                                                                                     7920
$\mathsf{A}_1(27) = \mathrm{PSL}_2(27)$                                                      9828
Table 3.11: Nonabelian simple groups of order less than 10 000

Finite groups of Lie type are related to the matrix groups we met in Section 1.2, but we replace $\mathbb{R}$ and $\mathbb{C}$ with finite analogues. We'll study these objects, finite fields, in more detail in Chapters 8–11, but the following (incomplete) definition will do for the moment.

Definition 3.45 Let $p$ be prime, and let $\mathbb{F}_p = \{0, \ldots, p-1\}$ be the set consisting of the first $p$ non-negative integers, equipped with two binary operations $+$ and $\cdot$ representing, respectively, addition and multiplication modulo $p$. Then $\mathbb{F}_p$ is the finite field of order $p$.

More generally, given any prime power $q = p^n$ for some $n \in \mathbb{N}$, we can define the finite field $\mathbb{F}_q$, but the general construction is more complicated, and we'll postpone it until Section 11.3.

Here we run into a slight difficulty with notation. The finite groups of Lie type classify neatly into a number of infinite families, many having more than one name. One naming convention, used for some of these groups, intentionally echoes the names used for the corresponding infinite matrix groups. Another widely-used convention (used also for the Coxeter groups we'll meet in Section 5.C) uses the capital letters A–G with numeric subscripts and superscripts; if we're not careful, this could cause confusion with the alternating groups $A_n$ and the dihedral groups $D_n$. To resolve this, we'll use a slightly different (sans-serif) font in this context so that $A_n$ is the alternating group on $n$ objects, $\mathsf{A}_n(q)$ is the finite simple group described next, and $\mathsf{A}_n$ is the finite Coxeter group discussed later (and which, confusingly, is isomorphic to the symmetric group $S_{n+1}$).
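To make Definition 3.45 concrete, the following sketch implements arithmetic modulo $p$ and searches for multiplicative inverses by brute force. The function names are illustrative only. It also shows why $p$ needs to be prime: modulo 6, for instance, the element 2 has no inverse, so $\{0, \ldots, 5\}$ with these operations is not a field.

```python
def add(a, b, p):
    """Addition modulo p."""
    return (a + b) % p

def mul(a, b, p):
    """Multiplication modulo p."""
    return (a * b) % p

def inverse(a, p):
    """Multiplicative inverse of a modulo p, or None if there isn't one."""
    for b in range(1, p):
        if mul(a, b, p) == 1:
            return b
    return None

# Every nonzero element of F_7 has an inverse ...
inverses_mod_7 = {a: inverse(a, 7) for a in range(1, 7)}
# ... but the same construction with the composite modulus 6 fails.
inverses_mod_6 = {a: inverse(a, 6) for a in range(1, 6)}

print(inverses_mod_7)
print(inverses_mod_6)
```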
In the following examples, unless otherwise stated, q will be a positive
power of a prime integer, to ensure that Fq is indeed a finite field.
Example 3.46 For $n \geq 1$, let $\mathsf{A}_n(q)$ or $\mathrm{PSL}_{n+1}(q)$ denote the projective special linear group over $\mathbb{F}_q$. That is, the special linear group $\mathrm{SL}_{n+1}(\mathbb{F}_q)$ of $(n+1)\times(n+1)$ square matrices, with entries from the order-$q$ finite field $\mathbb{F}_q$ and determinant 1 (calculated in $\mathbb{F}_q$), factored by its centre as described in Example 3.38.
In general,
$$|\mathsf{A}_n(q)| = |\mathrm{PSL}_{n+1}(q)| = \frac{1}{\gcd(n+1,\,q-1)}\, q^{n(n+1)/2} \prod_{i=1}^{n} (q^{i+1}-1).$$
So, for example,
$$|\mathsf{A}_1(4)| = |\mathrm{PSL}_2(4)| = \frac{1}{\gcd(2,3)}\, 4^{1\cdot 2/2} \prod_{i=1}^{1} (4^{i+1}-1) = 4\cdot 15 = 60.$$
All of these groups are simple, except for $\mathsf{A}_1(2) = \mathrm{PSL}_2(2)$ which is isomorphic to the symmetric group $S_3$, and $\mathsf{A}_1(3) = \mathrm{PSL}_2(3)$ which is isomorphic to the alternating group $A_4$.
In addition, $\mathsf{A}_1(4) = \mathrm{PSL}_2(4)$ and $\mathsf{A}_1(5) = \mathrm{PSL}_2(5)$ are both isomorphic to each other and to the alternating group $A_5$; while $\mathsf{A}_1(7) = \mathrm{PSL}_2(7)$ and $\mathsf{A}_2(2) = \mathrm{PSL}_3(2)$ are isomorphic to each other. Also, the group $\mathsf{A}_1(9) = \mathrm{PSL}_2(9)$ is isomorphic to the alternating group $A_6$; and the group $\mathsf{A}_3(2) = \mathrm{PSL}_4(2)$ is isomorphic to the alternating group $A_8$.
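As a sanity check on the order formula in Example 3.46, the sketch below evaluates it directly and compares the results with the orders of the isomorphic groups mentioned above. The function name `order_psl` is ours; equal orders are of course only a necessary condition for isomorphism, not a proof of it.

```python
from math import gcd, prod, factorial

def order_psl(n, q):
    """Order of A_n(q) = PSL_{n+1}(q), using the formula from Example 3.46."""
    numerator = q ** (n * (n + 1) // 2) * prod(q ** (i + 1) - 1 for i in range(1, n + 1))
    return numerator // gcd(n + 1, q - 1)

print(order_psl(1, 4), order_psl(1, 5))   # both should match |A_5|
print(order_psl(1, 7), order_psl(2, 2))   # PSL_2(7) and PSL_3(2)
print(order_psl(1, 9), order_psl(3, 2))   # compare with |A_6| and |A_8|
```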

In a similar way, the projective symplectic groups $\mathrm{PSp}_n(\mathbb{R})$ yield another family of simple groups:

Example 3.47 For $n > 1$, let $\mathsf{C}_n(q)$ or $\mathrm{PSp}_{2n}(q)$ denote the projective symplectic group over $\mathbb{F}_q$. By analogy with the previous example, we take the group $\mathrm{Sp}_{2n}(\mathbb{F}_q)$ of symplectic $2n\times 2n$ matrices with entries from $\mathbb{F}_q$ and factor out by the centre as described in Example 3.38.
We ignore the case $n = 1$ since $\mathrm{Sp}_2(\mathbb{F}_q) = \mathrm{SL}_2(\mathbb{F}_q)$ and so this is covered by the previous example. All of the groups $\mathsf{C}_n(q) = \mathrm{PSp}_{2n}(q)$ are simple except for $\mathsf{C}_2(2) = \mathrm{PSp}_4(2)$ which is isomorphic to the symmetric group $S_6$.
In general,
$$|\mathsf{C}_n(q)| = |\mathrm{PSp}_{2n}(q)| = \frac{1}{\gcd(2,\,q-1)}\, q^{n^2} \prod_{i=1}^{n} (q^{2i}-1).$$
So, for example,
$$|\mathsf{C}_2(2)| = |\mathrm{PSp}_4(2)| = \frac{2^4}{\gcd(2,1)} \prod_{i=1}^{2} (2^{2i}-1) = 16\cdot 3\cdot 15 = 720 = |S_6|.$$
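The order formula in Example 3.47 can be evaluated in the same way. This is only a numerical sketch (the name `order_psp` is ours): it reproduces the value 720 computed above, and confirms that for $n = 1$ the formula reduces to $q(q^2-1)/\gcd(2, q-1)$, the order of $\mathrm{PSL}_2(q)$, consistent with $\mathrm{Sp}_2(\mathbb{F}_q) = \mathrm{SL}_2(\mathbb{F}_q)$.

```python
from math import gcd, prod, factorial

def order_psp(n, q):
    """Order of C_n(q) = PSp_{2n}(q), using the formula from Example 3.47."""
    numerator = q ** (n * n) * prod(q ** (2 * i) - 1 for i in range(1, n + 1))
    return numerator // gcd(2, q - 1)

print(order_psp(2, 2))   # compare with |S_6| = 720
print(order_psp(1, 7))   # n = 1 agrees with |PSL_2(7)| = 168
```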

There are three other closely-related families of finite simple groups obtained from the special orthogonal groups $\mathrm{SO}_n(\mathbb{F}_q)$; however, the details are somewhat involved: in addition to factoring out by their centre to obtain projective versions, we have to consider separately the cases where $n$ is even or odd, and restrict our attention to a subgroup defined in terms of something called the spinor norm. This process yields the groups $\mathsf{B}_n(q) = \mathrm{O}_{2n+1}(q)$ (for $n > 1$), $\mathsf{D}_n(q) = \mathrm{O}^+_{2n}(q)$ (for $n > 3$) and ${}^2\mathsf{D}_n(q^2) = \mathrm{O}^-_{2n}(q)$ (for $n > 3$).
More straightforward is the construction of a family of finite simple groups from the projective special unitary groups $\mathrm{PSU}_n$; this process leads to the groups ${}^2\mathsf{A}_n(q^2) = \mathrm{PSU}_{n+1}(q)$.
We won't discuss these groups any further, but a complete list appears in Table 3.12, all of which are simple except for the following:
$\mathsf{A}_1(2) = \mathrm{PSL}_2(2)$, $\mathsf{A}_1(3) = \mathrm{PSL}_2(3)$, $\mathsf{B}_2(2) = \mathrm{O}_5(2)$, $\mathsf{G}_2(2)$, ${}^2\mathsf{A}_2(2^2) = \mathrm{PSU}_3(2)$, ${}^2\mathsf{B}_2(2)$, ${}^2\mathsf{F}_4(2)$, ${}^2\mathsf{G}_2(3)$.

Group                                                          Order
$\mathsf{A}_n(q) = \mathrm{PSL}_{n+1}(q)$                      $\frac{1}{\gcd(n+1,q-1)}\, q^{n(n+1)/2} \prod_{i=1}^{n} (q^{i+1}-1)$
$\mathsf{B}_n(q) = \mathrm{O}_{2n+1}(q)$, $n > 1$              $\frac{1}{\gcd(2,q-1)}\, q^{n^2} \prod_{i=1}^{n} (q^{2i}-1)$
$\mathsf{C}_n(q) = \mathrm{PSp}_{2n}(q)$, $n > 2$              $\frac{1}{\gcd(2,q-1)}\, q^{n^2} \prod_{i=1}^{n} (q^{2i}-1)$
$\mathsf{D}_n(q) = \mathrm{O}^+_{2n}(q)$, $n > 3$              $\frac{1}{\gcd(4,q^n-1)}\, q^{n(n-1)} (q^n-1) \prod_{i=1}^{n-1} (q^{2i}-1)$
$\mathsf{E}_6(q)$                                              $\frac{1}{\gcd(3,q-1)}\, q^{36}(q^{12}-1)(q^9-1)(q^8-1)(q^6-1)(q^5-1)(q^2-1)$
$\mathsf{E}_7(q)$                                              $\frac{1}{\gcd(2,q-1)}\, q^{63}(q^{18}-1)(q^{14}-1)(q^{12}-1)(q^{10}-1)(q^8-1)(q^6-1)(q^2-1)$
$\mathsf{E}_8(q)$                                              $q^{120}(q^{30}-1)(q^{24}-1)(q^{20}-1)(q^{18}-1)(q^{14}-1)(q^{12}-1)(q^8-1)(q^2-1)$
$\mathsf{F}_4(q)$                                              $q^{24}(q^{12}-1)(q^8-1)(q^6-1)(q^2-1)$
$\mathsf{G}_2(q)$                                              $q^6(q^6-1)(q^2-1)$
${}^2\mathsf{A}_n(q^2) = \mathrm{PSU}_{n+1}(q)$, $n > 1$       $\frac{1}{\gcd(n+1,q+1)}\, q^{n(n+1)/2} \prod_{i=1}^{n} (q^{i+1}-(-1)^{i+1})$
${}^2\mathsf{D}_n(q^2) = \mathrm{O}^-_{2n}(q)$, $n > 3$        $\frac{1}{\gcd(4,q^n+1)}\, q^{n(n-1)} (q^n+1) \prod_{i=1}^{n-1} (q^{2i}-1)$
${}^2\mathsf{E}_6(q^2)$                                        $\frac{1}{\gcd(3,q+1)}\, q^{36}(q^{12}-1)(q^9+1)(q^8-1)(q^6-1)(q^5+1)(q^2-1)$
${}^3\mathsf{D}_4(q^3)$                                        $q^{12}(q^8+q^4+1)(q^6-1)(q^2-1)$
${}^2\mathsf{B}_2(2^{2n+1})$                                   $q^2(q^2+1)(q-1)$ (where $q = 2^{2n+1}$)
${}^2\mathsf{F}_4(2^{2n+1})$                                   $q^{12}(q^6+1)(q^4-1)(q^3+1)(q-1)$ (where $q = 2^{2n+1}$)
${}^2\mathsf{F}_4(2)'$                                         $17971200 = 2^{11}\cdot 3^3\cdot 5^2\cdot 13$
${}^2\mathsf{G}_2(3^{2n+1})$                                   $q^3(q^3+1)(q-1)$ (where $q = 3^{2n+1}$)
Table 3.12: Simple groups of Lie type
The Chevalley groups $\mathsf{A}_n(q)$, $\mathsf{B}_n(q)$, $\mathsf{C}_n(q)$, $\mathsf{D}_n(q)$, $\mathsf{E}_6(q)$, $\mathsf{E}_7(q)$, $\mathsf{E}_8(q)$, $\mathsf{F}_4(q)$ and $\mathsf{G}_2(q)$ are named after the French mathematician Claude Chevalley (1909–1984); the groups ${}^2\mathsf{A}_n(q^2)$, ${}^2\mathsf{D}_n(q^2)$, ${}^2\mathsf{E}_6(q^2)$ and ${}^3\mathsf{D}_4(q^3)$ are called Steinberg groups after the Moldovan/American mathematician Robert Steinberg; the groups ${}^2\mathsf{B}_2(2^{2n+1})$ are called Suzuki groups after the Japanese mathematician Michio Suzuki (1926–1998); the groups ${}^2\mathsf{F}_4(2^{2n+1})$ and ${}^2\mathsf{G}_2(3^{2n+1})$ are called Ree groups after the Korean/Canadian mathematician Rimhak Ree (1922–2005), and the group ${}^2\mathsf{F}_4(2)'$ (the commutator subgroup of ${}^2\mathsf{F}_4(2)$) is called the Tits group after the Belgian/French mathematician Jacques Tits.

Group        Order
$M_{11}$     7 920
$M_{12}$     95 040
$M_{22}$     443 520
$M_{23}$     10 200 960
$M_{24}$     244 823 040
$J_1$        175 560
$J_2$        604 800
$J_3$        50 232 960
$J_4$        86 775 571 046 077 562 880
$Co_1$       4 157 776 806 543 360 000
$Co_2$       42 305 421 312 000
$Co_3$       495 766 656 000
$Fi_{22}$    64 561 751 654 400
$Fi_{23}$    4 089 470 473 293 004 800
$Fi_{24}'$   1 255 205 709 190 661 721 292 800
$McL$        898 128 000
$HS$         44 352 000
$Suz$        448 345 497 600
$He$         4 030 387 200
$Ru$         145 926 144 000
$HN$         273 030 912 000 000
$O'N$        460 815 505 920
$Ly$         51 765 179 004 000 000
$Th$         90 745 943 887 872 000
$B$          4 154 781 481 226 426 191 177 580 544 000 000
$M$          808 017 424 794 512 875 886 459 904 961 710 757 005 754 368 000 000 000
Table 3.13: Sporadic simple groups
The fourth class of finite simple groups comprises twenty-six groups which don't fit into the first three classes. These groups are called sporadic groups and all have various intriguing properties. They range in size from the Mathieu group $M_{11}$ of order 7920, up to the aptly-named Monster group $M$ with just over $8\times 10^{53}$ elements. The sporadic groups and their orders are listed in Table 3.13.
The five Mathieu groups $M_{11}$, $M_{12}$, $M_{22}$, $M_{23}$ and $M_{24}$ were discovered in the 1860s and early 1870s by the French mathematician Émile Mathieu (1835–1890), who first studied them as part of his research on multiply transitive permutation groups. All five can be constructed in terms of symmetries of combinatorial objects known as Steiner systems. The groups $M_{23}$ and $M_{24}$ also happen to be the symmetry groups of the 23- and 24-bit binary error-correcting codes discovered by the Swiss mathematician and information-theorist Marcel Golay (1902–1989); these codes were used by NASA to reliably transmit image data back to earth from the Voyager 1 and 2 space probes during their 1979–1981 flyby missions to Jupiter and Saturn.
The existence of the four Janko groups J1 , J2 , J3 and J4 was predicted
by the Croatian mathematician Zvonimir Janko; he constructed J1 in
1965 and the other three were explicitly constructed (and their exis-
tence confirmed) over the next fifteen years by other mathematicians.
The three Conway groups $Co_1$, $Co_2$ and $Co_3$ were discovered by the British mathematician John Conway. They arise from the symmetry group $Co_0$ of a 24-dimensional object called the Leech lattice $\Lambda_{24}$; Conway constructed this group during a single twelve-hour period one Saturday in 1968.
The group $Co_1$ is the quotient of $Co_0$ by its two-element centre, while the groups $Co_2$ and $Co_3$ are the subgroups which leave unchanged certain types of lattice vectors in $\Lambda_{24}$. Four other sporadic groups exist as subgroups or quotients of $Co_1$: the McLaughlin group $McL$, the Higman–Sims group $HS$, the Suzuki group $Suz$ and the second Janko group $J_2$ (sometimes called the Hall–Janko group).
The three Fischer groups $Fi_{22}$, $Fi_{23}$ and $Fi_{24}'$ all arise as groups of 3-transpositions, particular types of order-2 permutations where the product of any two has order at most 3. They were discovered in 1971 by the German mathematician Bernd Fischer. The group $Fi_{24}$ is not itself simple, but its commutator subgroup $Fi_{24}'$ is.
The remaining eight sporadic groups are the Held group $He$, the Rudvalis group $Ru$, the O'Nan group $O'N$, the Harada–Norton group $HN$, the Lyons group $Ly$, the Thompson group $Th$, the Baby Monster group $B$ and the Monster group $M$.
The last of these contains nineteen of the other twenty-five sporadic groups as subgroups or quotients of subgroups; these nineteen are sometimes called the happy family, while the remaining six ($J_1$, $J_3$, $J_4$, $Ly$, $O'N$ and $Ru$) are sometimes referred to as pariahs.
Summary

In the previous chapter, during our discussion of subgroups, we met the concept of cosets.[5, 6] These arose initially from the proof of Cayley's Theorem[7] and led to Lagrange's Theorem.[8] It's interesting to ask whether a given left coset $gH$ necessarily coincides with the corresponding right coset $Hg$. Certainly this will be the case if the larger group $G$ is abelian, but need not be otherwise. We did, however, find a number of cases of nonabelian groups which contained subgroups with this property: the rotation subgroup $R_3$ in $D_3$ for example, or the alternating group $A_4$ in $S_4$.[9]
[5] Definition 2.16, page 50. [6] Definition 2.17, page 50. [7] Theorem 2.13, page 47. [8] Theorem 2.30, page 54. [9] Example 3.1, page 72.
A full study of this situation led us to a relation called conjugacy: an element $g \in G$ is conjugate to an element $h \in G$ if there exists some element $k \in G$ such that $kgk^{-1} = h$.[10] This criterion arises quite naturally in linear algebra: two $n\times n$ matrices are conjugate (or similar) exactly when they represent the same linear map $f\colon \mathbb{R}^n \to \mathbb{R}^n$ relative to a different choice of basis.[11]
[10] Definition 3.2, page 72. [11] Definition 3.3, page 73.
More generally, we find that conjugacy is an equivalence relation,[12] and hence partitions a given group $G$ into a number of equivalence classes, called conjugacy classes. The conjugacy classes of $D_3$, for example, are $\{e\}$, $\{r, r^2\}$ and $\{m_1, m_2, m_3\}$.[13] The group $\mathrm{GL}_n(\mathbb{R})$ has uncountably many conjugacy classes, but each corresponds to a particular type of Jordan block matrix.[14] If $G$ is abelian then its conjugacy classes are singleton sets, each consisting of one element of $G$.[15]
[12] Proposition 3.4, page 74. [13] Example 3.5, page 74. [14] Example 3.6, page 75. [15] Proposition 3.8, page 75.
It transpires that conjugacy classes are the key to understanding when the left and right cosets $gH$ and $Hg$ coincide and when they don't. This correspondence happens exactly when the subgroup $H$ is a union of conjugacy classes.[16] So, for example, any subgroup of an abelian group has this form, since the conjugacy classes all consist of single elements. The rotation subgroup $R_3 = \{e, r, r^2\} = \{e\} \cup \{r, r^2\}$ also has this form, but the reflection subgroup $\{e, m_1\}$ doesn't, so the cosets of the former coincide while those of the latter don't. Equivalently, a subgroup $H < G$ has this property if it's closed under conjugation by any element of $G$.
[16] Proposition 3.10, page 76.
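The statements above about $D_3$ can be checked by brute force. The sketch below assumes we model $D_3$ as the group of all permutations of the triangle's three vertices (labelled 0, 1, 2), so that rotations become 3-cycles and reflections become transpositions; the function names are ours.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p); a permutation is a tuple of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition the group into classes {k g k^-1 : k in group}."""
    classes, seen = [], set()
    for g in group:
        if g in seen:
            continue
        cls = {compose(compose(k, g), inverse(k)) for k in group}
        classes.append(cls)
        seen |= cls
    return classes

def is_normal(subgroup, group):
    """Check closure of the subgroup under conjugation by every group element."""
    return all(compose(compose(k, h), inverse(k)) in subgroup
               for k in group for h in subgroup)

D3 = list(permutations(range(3)))
classes = conjugacy_classes(D3)
rotations = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # e, r, r^2
reflection_subgroup = {(0, 1, 2), (1, 0, 2)}     # e, m_1

print(sorted(len(c) for c in classes))
print(is_normal(rotations, D3), is_normal(reflection_subgroup, D3))
```

The rotation subgroup turns out to be a union of two whole classes and is closed under conjugation, whereas the reflection subgroup contains only part of the class of reflections and is not.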
We call subgroups with this coincidence property normal subgroups and denote them $H \triangleleft G$ or $H \trianglelefteq G$ depending on whether $H$ is a proper subgroup of $G$ or might not be.[17] Subgroups of abelian groups are always normal, and any subgroup of index 2 (such as $R_n < D_n$ or $A_n < S_n$) is also normal.[18] The special linear group $\mathrm{SL}_n(\mathbb{R})$ is closed under conjugation by elements of $\mathrm{GL}_n(\mathbb{R})$, hence $\mathrm{SL}_n(\mathbb{R}) \triangleleft \mathrm{GL}_n(\mathbb{R})$.[19]
[17] Definition 3.11, page 77. [18] Proposition 3.12, page 77. [19] Example 3.15, page 78.
Considering commutativity of individual elements yields some important examples of normal subgroups. The centre $Z(G)$ consists of all elements of $G$ which commute with all other elements of $G$;[20] this is not only a subgroup of $G$ but happens to be normal as well.[21] Amongst other things, it gives us a measure of how abelian a particular group is: $Z(G) = G$ exactly when $G$ is abelian.
[20] Definition 3.16, page 78. [21] Proposition 3.17, page 78.
Similarly, we define the commutator subgroup or derived group $[G, G]$ or $G'$ to be the subgroup generated by all elements (called commutators) of the form $[g, h] = g h g^{-1} h^{-1}$.[22, 23] Again, this is a normal subgroup of $G$[24] and gives an indication of how abelian $G$ is: $[G, G]$ is trivial if and only if $G$ is abelian, and the larger $[G, G]$ is, the less abelian $G$ is. In particular, for $n > 3$, the commutator subgroup $[S_n, S_n]$ is the alternating group $A_n$.[25]
[22] Definition 3.18, page 79. [23] Definition 3.19, page 80. [24] Proposition 3.21, page 81. [25] Proposition 3.20, page 80.
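For small $n$ the claim $[S_n, S_n] = A_n$ can be confirmed by direct enumeration. The following sketch (helper names are ours) forms every commutator $g h g^{-1} h^{-1}$ in $S_n$, generates the subgroup they span by repeated multiplication, and compares it with the set of even permutations. It is a brute-force check for particular $n$, not a substitute for the proof.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p); a permutation is a tuple of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

def commutator_subgroup(group):
    """Subgroup generated by all commutators g h g^-1 h^-1 of a finite group."""
    identity = tuple(range(len(group[0])))
    commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
                   for g in group for h in group}
    elements, frontier = {identity}, [identity]
    while frontier:  # close up under multiplication by the generators
        x = frontier.pop()
        for c in commutators:
            y = compose(x, c)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

def derived_is_alternating(n):
    Sn = list(permutations(range(n)))
    An = {p for p in Sn if is_even(p)}
    return commutator_subgroup(Sn) == An

print([derived_is_alternating(n) for n in (3, 4, 5)])
```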
The reason we care about normal subgroups isn't just because their left and right cosets line up neatly, it's because their cosets interact in a well-defined way and form a group in their own right.[26] A key example of this construction is the formation of the finite cyclic group $\mathbb{Z}_n$ from the cosets of $n\mathbb{Z} = \langle n \rangle$ in $\mathbb{Z}$. We say that $\mathbb{Z}_n$ is the quotient group $\mathbb{Z}/n\mathbb{Z}$. More generally, the (left or right) cosets of a normal subgroup $H \trianglelefteq G$ form a quotient group $G/H$ in which the product $(gH)(kH)$ is defined to be the coset $(gk)H$.[27] We also saw that $D_3/R_3 \cong \mathbb{Z}_2$ and $\mathbb{Z}_6/\langle 3 \rangle \cong \mathbb{Z}_3$,[28] amongst others.
[26] Proposition 3.24, page 85. [27] Definition 3.25, page 86. [28] Example 3.28, page 87.
Factoring a group $G$ by its centre $Z(G)$ often yields useful information about the "nonabelianness" of a group: the smaller the centre, the larger the quotient $G/Z(G)$; if $G$ is abelian then $Z(G) = G$ and hence $G/Z(G)$ is trivial. On the other hand, $D_3/Z(D_3) \cong D_3$, so the dihedral group $D_3$ is in some sense about as nonabelian as we might reasonably expect it to be.[29] Factoring the general linear group $\mathrm{GL}_n(\mathbb{R})$ by its centre yields the projective general linear group $\mathrm{PGL}_n(\mathbb{R})$; analogous projective versions of $\mathrm{O}_n(\mathbb{R})$, $\mathrm{SO}_n(\mathbb{R})$, $\mathrm{U}_n$, $\mathrm{SU}_n$ and $\mathrm{Sp}_{2n}(\mathbb{R})$ exist too.[30]
[29] Example 3.30, page 89. [30] Example 3.38, page 92.
The commutator subgroup is another measure of the nonabelianness of a given group: if it's trivial then the group is abelian. Factoring out by this subgroup (which, as with the centre, we can do because it's a normal subgroup) effectively kills all the noncommutativity in the group, leaving an abelian group, which we'll denote $G^{ab}$. This process is called abelianisation.[31]
[31] Example 3.32, page 90.
Certain groups have no normal subgroups apart from themselves and the trivial subgroup. Another way of looking at this is that they have no proper and nontrivial quotients. We call groups of this type simple,[32] and the finite ones fall into four different classes: cyclic groups $\mathbb{Z}_p$ where $p$ is prime,[33] alternating groups $A_n$ for $n > 4$,[34] finite groups of Lie type,[35, 36, 37] and twenty-six sporadic groups.[38]
[32] Definition 3.39, page 93. [33] Example 3.40, page 93. [34] Proposition 3.43, page 94. [35] Example 3.46, page 96. [36] Example 3.47, page 97. [37] Table 3.12, page 98. [38] Table 3.13, page 99.

References and further reading

An interesting account of the development of the concept of a quotient group, which emerged towards the end of the 19th century, can be found in the following article:
J Nicholson, The development and understanding of the concept of quotient group, Historia Mathematica 20.1 (1993) 68–88
Further details on quaternions, octonions and related topics can be found in the following book and article:
J H Conway and D A Smith, On Quaternions and Octonions, A K Peters (2003)
J C Baez, The octonions, Bulletin of the American Mathematical Society 39 (2002) 145–205
Mark Ronan's book Symmetry and the Monster provides an accessible account of the programme to classify finite simple groups, written for a lay and nonspecialist audience.
M Ronan, Symmetry and the Monster, Oxford University Press (2006)
Technical details on finite simple groups can be found in the following books, amongst others, intended for graduate students and specialist researchers:
R Carter, Simple Groups of Lie Type, Wiley (1989)
R A Wilson, The Finite Simple Groups, Graduate Texts in Mathematics 251, Springer (2009)
J H Conway, R T Curtis, S P Norton, R A Parker, and R A Wilson, Atlas of Finite Groups, Clarendon Press, Oxford (1985)
A very readable account of the fascinating links between error-correcting codes, information theory, higher-dimensional sphere packings and sporadic simple groups can be found in Thomas Thompson's book:
T M Thompson, From Error-Correcting Codes Through Sphere Packings To Simple Groups, Carus Mathematical Monographs 21, Mathematical Association of America (1983)

Exercises
Show that $[D_{2n+1}, D_{2n+1}] = R_{2n+1} \cong \mathbb{Z}_{2n+1}$, and $[D_{2n}, D_{2n}] \cong \mathbb{Z}_n$.
4 Homomorphisms

You have shown me a strange image, and they are strange prisoners.
Like ourselves, I replied; and they see only their own shadows, or the shadows of one another, which the fire throws on the opposite wall of the cave?
True, he said; how could they see anything but the shadows if they were never allowed to move their heads?
And of the objects which are being carried in like manner they would only see the shadows?
Yes, he said.
And if they were able to converse with one another, would they not suppose that they were naming what was actually before them?
Very true.
Plato (c.427–347 BC), The Republic VII (c.380 BC)

So far, we've met many different types of groups: cyclic, dihedral and other symmetry groups, permutation groups, matrix groups, and several types of simple groups, amongst others. We've studied their internal structure in various ways: decomposition of permutations into disjoint cycles, whether two elements are conjugate, which elements commute with each other, what subgroups a group contains, what happens if we factor by a normal subgroup, and so on.
However, there is an obvious thing we haven't really looked at yet, except in one special case. Groups are essentially just sets equipped with some extra structure. For ordinary sets we can cheerfully define all sorts of functions mapping elements of one set to another. We can impose additional conditions (such as injectivity, surjectivity and bijectivity) on these functions, study their images, compose them, and so forth.
So, what happens if we define functions between groups? Can we
learn anything useful about the structure of a particular group by
examining these functions? Will any function do, or must we restrict
our attention to ones satisfying some specific conditions?

4.1 Structure-preserving maps

Structures are the weapons of the mathematician.
attributed to Nicolas Bourbaki

It so happens that we've already met one special sort of function. In Chapter 1 we saw that some groups were structurally identical, apart from some minor renaming of their elements. For example, the cyclic group $\mathbb{Z}_3$, consisting of the numbers $\{0, 1, 2\}$ under addition modulo 3, is structurally the same as the multiplicative group $C_3 = \{1, -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2}i, -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2}i\}$ of complex cube roots of unity. And the group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is effectively the same as the Klein 4-group $V_4$ and the symmetry group of the rectangle.
The device we introduced to formalise this idea of structural similarity was a particular type of bijective function that respected the group structure: an isomorphism. It's now time to generalise this concept, and consider non-bijective functions that preserve the group structure.
Definition 4.1 Let $G = (G, *)$ and $H = (H, \circ)$ be two groups. A function $f\colon G \to H$ is a homomorphism if $f(g_1 * g_2) = f(g_1) \circ f(g_2)$ for all $g_1, g_2 \in G$.
We could also consider functions that don't satisfy this structural condition, but it turns out that these aren't very interesting from a group-theoretic point of view. What distinguishes group theory from set theory is precisely this extra structure, so it doesn't really make sense to ignore it. Let's look at some examples.
Example 4.2 The function $f\colon \mathbb{Z} \to \mathbb{Z}$ given by $f\colon n \mapsto 2n$ is a homomorphism. To see this, observe that $f(m+n) = 2(m+n) = 2m+2n = f(m)+f(n)$ for all $m, n \in \mathbb{Z}$.

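Example 4.3 The function $f\colon S_3 \to S_3$ where $f\colon \sigma \mapsto (1\ 2)\sigma(1\ 2)$ is a homomorphism, because
$$f(\sigma\tau) = (1\ 2)\sigma\tau(1\ 2) = (1\ 2)\sigma(1\ 2)(1\ 2)\tau(1\ 2) = f(\sigma)f(\tau)$$
for any $\sigma, \tau \in S_3$.

Example 4.4 The function $f\colon \mathrm{GL}_n(\mathbb{R}) \to \mathrm{GL}_n(\mathbb{R})$ where $A \mapsto \frac{1}{\det A}A$ is a homomorphism, since
$$f(AB) = \tfrac{1}{\det AB}AB = \tfrac{1}{\det A \det B}AB = \bigl(\tfrac{1}{\det A}A\bigr)\bigl(\tfrac{1}{\det B}B\bigr) = f(A)f(B)$$
for any matrices $A, B \in \mathrm{GL}_n(\mathbb{R})$.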

In these examples, the homomorphism maps from a group to itself: the domain and codomain are the same. We call this type of homomorphism an endomorphism. If, like Example 4.3, it is bijective, an isomorphism from a group to itself, we call it an automorphism.

Example 4.5 The function $f\colon \mathbb{Z} \to \mathbb{Z}_5$ where $n \mapsto [n]_5$, the remainder of $n$ modulo 5, is a homomorphism:
$$f(m+n) = [m+n]_5 = [m]_5 + [n]_5 = f(m) + f(n)$$

Example 4.6 The function $f\colon \mathrm{GL}_n(\mathbb{R}) \to \mathbb{R}^\times$ which maps a nonsingular, real, $n\times n$ matrix $A$ to its determinant $\det A$, is a homomorphism, since $\det(AB) = \det A \det B$ for any matrices $A, B \in \mathrm{GL}_n(\mathbb{R})$.

Parity of permutations gives a homomorphism into $\mathbb{Z}_2$:

Example 4.7 Let $f\colon S_n \to \mathbb{Z}_2$, where
$$f(\sigma) = \begin{cases} 0 & \text{if } \sigma \text{ is even,} \\ 1 & \text{if } \sigma \text{ is odd.} \end{cases}$$
Then $f$ is a homomorphism, by the proof of Proposition 1.61.
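The homomorphism condition of Definition 4.1 lends itself to brute-force testing. The sketch below checks it for the parity map of Example 4.7 on all pairs in $S_4$, and for the remainder map of Example 4.5 on a finite sample of integers (a sample only, since $\mathbb{Z}$ is infinite). For contrast it also tests the map $n \mapsto n+1$, which is our own non-example and fails the condition. Parity is computed here by counting inversions, which is one standard way of deciding whether a permutation is even.

```python
from itertools import permutations, product

def compose(p, q):
    """Return p∘q (apply q first, then p); a permutation is a tuple of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def parity(p):
    """0 for an even permutation, 1 for an odd one (via the inversion count)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2

S4 = list(permutations(range(4)))
sample = range(-20, 21)

# Example 4.7: f(στ) = f(σ) + f(τ) in Z_2
parity_is_hom = all(parity(compose(s, t)) == (parity(s) + parity(t)) % 2
                    for s, t in product(S4, repeat=2))

# Example 4.5: [m + n]_5 = [m]_5 + [n]_5 in Z_5
mod5_is_hom = all((m + n) % 5 == ((m % 5) + (n % 5)) % 5
                  for m, n in product(sample, repeat=2))

# Non-example: n -> n + 1 does not respect addition on Z
shift_is_hom = all((m + n) + 1 == (m + 1) + (n + 1)
                   for m, n in product(sample, repeat=2))

print(parity_is_hom, mod5_is_hom, shift_is_hom)
```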
In these three examples, the homomorphism $f$ is surjective; we call a surjective homomorphism an epimorphism. Sometimes we'll denote this with a double-headed arrow: $f\colon G \twoheadrightarrow H$.

Example 4.8 Let $f\colon 3\mathbb{Z} \to \mathbb{Z}$ be the obvious "divide by 3" function $n \mapsto \frac{n}{3}$. This is a homomorphism, since $f(m+n) = \frac{m+n}{3} = \frac{m}{3} + \frac{n}{3} = f(m) + f(n)$.

Example 4.9 Let $f\colon A_n \to S_n$ be the obvious inclusion function. This is a homomorphism, since $A_n$ is a subgroup of $S_n$.

Example 4.10 Let $f\colon \mathbb{Z}_2 \to \mathbb{Z}_2 \oplus \mathbb{Z}_2$ by $n \mapsto (n, n)$ for either $n \in \mathbb{Z}_2$. This is a homomorphism, because $f(m+n) = (m+n, m+n) = (m, m) + (n, n) = f(m) + f(n)$ for any $m, n \in \mathbb{Z}_2$.

These three examples are of injective homomorphisms, or monomorphisms. Sometimes we'll use an arrow with a tail to denote an injective homomorphism: $f\colon G \rightarrowtail H$. In fact, Example 4.8 happens to be not just injective but surjective as well, and hence an isomorphism.
Given two functions $f\colon A \to B$ and $g\colon B \to C$, we can chain them together by plugging the output of $f$ into the input of $g$ to form the composite function $g \circ f\colon A \to C$ where $(g \circ f)(a) = g(f(a))$ for all $a \in A$. Remember that it's important to get the order right: by convention we compose functions on the left. In fact, in general, it doesn't even make sense to discuss $f \circ g$ unless the image of $g$ lies within the domain of $f$. (Compare this with the strongly analogous case of matrix multiplication: even if the product $AB$ is defined, it might not be the case that $BA$ is defined as well.)
The next proposition is a bit of technical book-keeping to check that
composing two homomorphisms gives another homomorphism rather
than just an ordinary function.
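Proposition 4.11 Let $G = (G, *)$, $H = (H, \circ)$ and $K = (K, \star)$ be groups, and let $\varphi\colon G \to H$ and $\psi\colon H \to K$ be homomorphisms. Then the composite function $(\psi \circ \varphi)\colon G \to K$ is a homomorphism.

Proof For any $g_1, g_2 \in G$, we have
$$(\psi \circ \varphi)(g_1 * g_2) = \psi(\varphi(g_1 * g_2))$$
$$= \psi(\varphi(g_1) \circ \varphi(g_2)) \qquad (\varphi \text{ is a homomorphism})$$
$$= \psi(\varphi(g_1)) \star \psi(\varphi(g_2)) \qquad (\psi \text{ is a homomorphism})$$
$$= (\psi \circ \varphi)(g_1) \star (\psi \circ \varphi)(g_2).$$
Hence $\psi \circ \varphi$ is a homomorphism.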
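Proposition 4.11 can be illustrated with a small finite example of our own choosing: reduction modulo 6 as a map $\mathbb{Z}_{12} \to \mathbb{Z}_6$, followed by reduction modulo 3 as a map $\mathbb{Z}_6 \to \mathbb{Z}_3$. The helper `is_homomorphism` simply tests the condition of Definition 4.1 on every pair of elements; all names here are illustrative.

```python
from itertools import product

def is_homomorphism(f, elements, op_domain, op_codomain):
    """Test f(a * b) == f(a) o f(b) for every pair of elements of the domain."""
    return all(f(op_domain(a, b)) == op_codomain(f(a), f(b))
               for a, b in product(elements, repeat=2))

def add_mod(n):
    """Addition modulo n, as a binary operation."""
    return lambda a, b: (a + b) % n

def phi(g):
    """Z_12 -> Z_6, reduction modulo 6."""
    return g % 6

def psi(h):
    """Z_6 -> Z_3, reduction modulo 3."""
    return h % 3

def composite(g):
    """The composite psi∘phi: Z_12 -> Z_3."""
    return psi(phi(g))

Z12, Z6 = range(12), range(6)

phi_ok = is_homomorphism(phi, Z12, add_mod(12), add_mod(6))
psi_ok = is_homomorphism(psi, Z6, add_mod(6), add_mod(3))
composite_ok = is_homomorphism(composite, Z12, add_mod(12), add_mod(3))

print(phi_ok, psi_ok, composite_ok)
```

Both maps pass the test, and so does their composite, as the proposition predicts.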
There are some special homomorphisms worth mentioning at this
point, which will recur in various contexts later in the book.
Example 4.12 The identity homomorphism $\mathrm{id}_G\colon G \to G$ for a group $G$ is, as its name suggests, the homomorphism which maps every element $g \in G$ to itself. (Often we will omit the subscript $G$ if the context is clear.) In particular, the identity homomorphism has the special property that it has no effect on any other homomorphism it's composed with. More precisely, given a homomorphism $f\colon G \to H$, we have
$$f \circ \mathrm{id}_G = f = \mathrm{id}_H \circ f,$$
and indeed it's possible to define the identity homomorphism as the unique homomorphism which has this property for a given group. The identity homomorphism is bijective, and hence an isomorphism.

Example 4.13 Another special homomorphism we'll meet from time to time is the zero homomorphism $z\colon G \to H$ for any groups $G$ and $H$. This is the unique homomorphism which maps every element in $G$ to the identity element $e \in H$.

Example 4.9 illustrates another important case, which the next example generalises.
Example 4.14 For any subgroup $H < G$ there is a unique inclusion homomorphism $i\colon H \to G$ which maps every element of $H$ to itself in $G$. (Sometimes we use a hooked arrow for inclusions instead: $i\colon H \hookrightarrow G$.) This homomorphism is necessarily injective.

Direct products of groups yield two more types of homomorphisms:

Example 4.15 A direct product $G = G_1 \times \cdots \times G_n$ of groups has a collection of projection homomorphisms $p_1\colon G \to G_1, \ldots, p_n\colon G \to G_n$ which map everything onto the corresponding factor. That is, we define $p_i\colon G \to G_i$ by $p_i(g_1, \ldots, g_n) = g_i \in G_i$. These homomorphisms are necessarily surjective.
For example:
$p_1\colon \mathbb{Z}_2 \oplus \mathbb{Z}_3 \to \mathbb{Z}_2$
  $(0,0), (0,1), (0,2) \mapsto 0$
  $(1,0), (1,1), (1,2) \mapsto 1$
$p_2\colon \mathbb{Z}_2 \oplus \mathbb{Z}_3 \to \mathbb{Z}_3$
  $(0,0), (1,0) \mapsto 0$
  $(0,1), (1,1) \mapsto 1$
  $(0,2), (1,2) \mapsto 2$

Example 4.16 We can also define a sequence of inclusion homomorphisms $i_1\colon G_1 \to G, \ldots, i_n\colon G_n \to G$ where $i_j\colon g \mapsto (e, \ldots, e, g, e, \ldots, e)$ for all $g \in G_j$. That is, we map an element $g \in G_j$ to the element of the direct product $G = G_1 \times \cdots \times G_n$ with $g$ in the $j$th place, and $e$ everywhere else. These homomorphisms are injective.
For example:
$i_1\colon \mathbb{Z}_2 \to \mathbb{Z}_2 \oplus \mathbb{Z}_3$
  $0 \mapsto (0,0)$
  $1 \mapsto (1,0)$
$i_2\colon \mathbb{Z}_3 \to \mathbb{Z}_2 \oplus \mathbb{Z}_3$
  $0 \mapsto (0,0)$
  $1 \mapsto (0,1)$
  $2 \mapsto (0,2)$
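The projection and inclusion maps for $\mathbb{Z}_2 \oplus \mathbb{Z}_3$ tabulated above are simple enough to write out in full. The sketch below (names are ours) models the direct product as pairs with componentwise addition, checks the homomorphism condition for each map, and confirms that projecting after including returns the element we started with.

```python
from itertools import product

Z2, Z3 = range(2), range(3)
G = list(product(Z2, Z3))          # elements of Z_2 ⊕ Z_3 as pairs (a, b)

def add(x, y):
    """Componentwise addition in Z_2 ⊕ Z_3."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

def p1(x):
    """Projection onto the Z_2 factor."""
    return x[0]

def p2(x):
    """Projection onto the Z_3 factor."""
    return x[1]

def i1(a):
    """Inclusion of Z_2: a -> (a, 0)."""
    return (a, 0)

def i2(b):
    """Inclusion of Z_3: b -> (0, b)."""
    return (0, b)

projections_are_homs = all(
    p1(add(x, y)) == (p1(x) + p1(y)) % 2 and p2(add(x, y)) == (p2(x) + p2(y)) % 3
    for x, y in product(G, repeat=2))

inclusions_are_homs = (
    all(i1((a + b) % 2) == add(i1(a), i1(b)) for a, b in product(Z2, repeat=2))
    and all(i2((a + b) % 3) == add(i2(a), i2(b)) for a, b in product(Z3, repeat=2)))

# The elements sent to each value by p_1, matching the table above
fibres_p1 = {a: [x for x in G if p1(x) == a] for a in Z2}

print(projections_are_homs, inclusions_are_homs)
print(fibres_p1)
```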

For isometry groups or other groups with some kind of geometric context, we can often use our geometric intuition to devise interesting homomorphisms. The following example illustrates this approach.

Example 4.17 The dihedral groups $D_3$ and $D_6$ are the symmetry groups of, respectively, an equilateral triangle and a regular hexagon. Just as we can inscribe an equilateral triangle inside a regular hexagon (see Figure 4.1), we can define a homomorphism which maps $D_3$ into $D_6$:
$$e \mapsto e \qquad r \mapsto r^2 \qquad r^2 \mapsto r^4$$
$$m_1 \mapsto m_2 \qquad m_2 \mapsto m_4 \qquad m_3 \mapsto m_6$$
More generally, we can define an inclusion homomorphism $D_n \to D_{2n}$ by mapping
$$e \mapsto e \qquad r^k \mapsto r^{2k} \qquad m_k \mapsto m_{2k}$$
for all $k \in \{1, \ldots, n\}$.

Figure 4.1: Embedding $D_3$ into $D_6$ (a triangle with reflection axes $m_1, m_2, m_3$ inscribed in a hexagon with reflection axes $m_1, \ldots, m_6$)

In the last chapter we learned about quotient groups: the formation of a new group $G/H$ from the cosets of a normal subgroup $H$ of $G$. We found that if $H$ is normal in $G$, then we can use the group structure of $G$ to define a group structure on the set of cosets of $H$ in $G$. Another way of looking at this is that the quotienting process in some way respects the group structure of $G$. The following example uses this insight to define a homomorphism $q\colon G \to G/H$.
Example 4.18 Let $G$ be a group, and $H$ be a normal subgroup of $G$. Define the quotient homomorphism $q\colon G \to G/H$ to be the function mapping a given element $g \in G$ to the coset $gH \in G/H$. This is a homomorphism since
$$q(g_1 g_2) = (g_1 g_2)H = (g_1 H)(g_2 H) = q(g_1)q(g_2)$$
for any $g_1, g_2 \in G$. It is also surjective, since every coset in $G/H$ is mapped to by at least one element of $G$.

Some familiar functions can be described as quotient homomorphisms:

Example 4.19 In Example 3.35 we saw $\mathbb{C}^\times/\mathbb{R}^+ \cong C$, where $C$ is the circle group. The quotient homomorphism is the argument map $\arg\colon \mathbb{C}^\times \to C$ which assigns to each nonzero complex number $z$ its argument $\arg(z) \in [0, 2\pi)$.
In Example 3.36 we saw that $\mathbb{C}^\times/C \cong \mathbb{R}^+$. The quotient homomorphism $\mathbb{C}^\times \to \mathbb{R}^+$ is precisely the modulus map $|\cdot|\colon \mathbb{C}^\times \to \mathbb{R}^+$ which assigns to each nonzero complex number $z$ its modulus $|z|$.

Example 4.20 At the beginning of Section 3.2 we discussed the construction of the cyclic group $\mathbb{Z}_n$ by factoring $\mathbb{Z}$ by its subgroup $n\mathbb{Z}$. The quotient homomorphism $q\colon \mathbb{Z} \to \mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$ in this case is the map $m \mapsto [m]_n$ which maps an integer $m$ to its remainder modulo $n$.
The following example considers the special case where we factor a group $G$ by its commutator subgroup $G' = [G, G]$:
Example 4.21 Let $\alpha\colon G \to G^{ab} = G/[G, G]$ be the quotient homomorphism from $G$ to its abelianisation. More precisely, $\alpha(g) = g[G, G]$ for any $g \in G$; that is, $\alpha$ maps an element $g$ to the corresponding coset of the commutator subgroup. We call this the abelianisation homomorphism; if $G$ is already abelian then $\alpha$ is just the identity map $\mathrm{id}_G$.

The abelianisation homomorphism has another important property: given a homomorphism $f\colon G \to H$, abelianisation not only yields groups $G^{ab}$ and $H^{ab}$, but also a homomorphism $f^{ab}\colon G^{ab} \to H^{ab}$.
Proposition 4.22 For any group homomorphism $f\colon G \to H$, there is a well-defined homomorphism $f^{ab}\colon G^{ab} \to H^{ab}$, called the induced homomorphism, given by the mapping $g[G, G] \mapsto f(g)[H, H]$, which interacts neatly with the abelianisation maps $\alpha_G\colon G \to G^{ab}$ and $\alpha_H\colon H \to H^{ab}$ in the sense that $\alpha_H \circ f = f^{ab} \circ \alpha_G$.

Proof We need to check that f ab is well-defined as a function from


Gab to H ab , and that it satisfies the homomorphism condition.
What we mean by the first of these is that its not enough to know that
for any g G, the coset g[ G, G ] in Gab is mapped to f ( g)[ H, H ], we
need to check that for any other k g[ G, G ], the coset k [ G, G ] = g[ G, G ]
also gets mapped to f ( g)[ H, H ].
Suppose, then, that k g[ G, G ]. This implies that there exists some
h [ G, G ] such that k = g h. Then
f (k )[ H, H ] = f ( g h)[ H, H ]
= ( f ( g) f (h))[ H, H ] ( f is a homomorphism)
= ( f ( g)[ H, H ])( f (h)[ H, H ]) ( H is a homomorphism)
= f ( g)[ H, H ] ( f (h)[ H, H ] = [ H, H ])
and so f ab is well-defined.
That f ab is a homomorphism follows from the fact that f and H are,
and hence so is their composition. H f .

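The homomorphism $f^{ab}$ is called the induced homomorphism, and
is best illustrated by means of another example.
We can represent the induced homomorphism $f^{ab}$ by means of the
following square of groups and homomorphisms:
$$\begin{array}{ccc}
G & \xrightarrow{\ f\ } & H \\
{\scriptstyle \alpha_G}\downarrow & & \downarrow{\scriptstyle \alpha_H} \\
G^{ab} & \xrightarrow{\ f^{ab}\ } & H^{ab}
\end{array}$$
There are two ways of getting from $G$ to $H^{ab}$: do $f$ then $\alpha_H$, or do
$\alpha_G$ then $f^{ab}$. The point is that $f^{ab}$ is the unique homomorphism
$G^{ab} \to H^{ab}$ that ensures these two paths are equivalent, in the
sense that $(\alpha_H \circ f)(g) = (f^{ab} \circ \alpha_G)(g)$ for all $g \in G$.
We call a diagram of this type a commutative diagram.
Example 4.23 We saw in Example 3.32 that $D_3^{ab} \cong \mathbb{Z}_2$, and it so
happens that $D_6^{ab} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$. Example 4.17 gave us an inclusion
homomorphism $i : D_3 \hookrightarrow D_6$, so putting all these together should
yield an induced homomorphism $i^{ab} : \mathbb{Z}_2 \to \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
We can depict this situation with a commutative diagram:
$$\begin{array}{ccc}
D_3 & \xrightarrow{\ i\ } & D_6 \\
{\scriptstyle \alpha_{D_3}}\downarrow & & \downarrow{\scriptstyle \alpha_{D_6}} \\
\mathbb{Z}_2 & \xrightarrow{\ i^{ab}\ } & \mathbb{Z}_2 \oplus \mathbb{Z}_2
\end{array}$$
To properly determine the induced homomorphism $i^{ab}$, we follow
the various elements of $D_3$ around the commutative square. In other
words, we choose an element $g$ of $D_3$ in the top left-hand corner and
then calculate its image $\alpha_{D_3}(g)$ with the left-hand vertical map. Then
we look at what happens to the same element $g$ if we follow it along
the top edge and down the right-hand side of the square, to obtain
the composite $\alpha_{D_6}(i(g))$. We define the induced homomorphism $i^{ab}$
to be the unique homomorphism that maps $\alpha_{D_3}(g) \mapsto \alpha_{D_6}(i(g))$; this
is guaranteed to be well-defined by the above proposition.
$$\begin{array}{ccc}
e & \mapsto & e \\
\downarrow & & \downarrow \\
0 & \mapsto & (0, 0)
\end{array}
\qquad
\begin{array}{ccc}
r^k & \mapsto & r^{2k} \\
\downarrow & & \downarrow \\
0 & \mapsto & (0, 0)
\end{array}
\qquad
\begin{array}{ccc}
m_k & \mapsto & m_{2k} \\
\downarrow & & \downarrow \\
1 & \mapsto & (1, 0)
\end{array}$$
From this, we can see that the induced homomorphism $i^{ab}$ is the
inclusion homomorphism $i_1 : \mathbb{Z}_2 \hookrightarrow \mathbb{Z}_2 \oplus \mathbb{Z}_2$ mapping $D_3^{ab} \cong \mathbb{Z}_2$ into
the first coordinate of $D_6^{ab} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$:
$$i^{ab} : \mathbb{Z}_2 \to \mathbb{Z}_2 \oplus \mathbb{Z}_2, \qquad 0 \mapsto (0, 0), \quad 1 \mapsto (1, 0).$$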
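The claim that the abelianisation of $D_3$ has two elements can be checked by brute force. The sketch below assumes $D_3$ is modelled as the six permutations of the triangle's vertices 0, 1, 2; the helper names (`compose`, `inverse`, `generated_subgroup`) are hypothetical and not from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, identity):
    """Close a generating set under composition (enough in a finite group)."""
    subgroup = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in subgroup:
                subgroup.add(y)
                frontier.append(y)
    return subgroup

# Assumption of this sketch: D3 is all permutations of the vertices 0, 1, 2.
D3 = list(permutations(range(3)))
identity = (0, 1, 2)

# All commutators g h g^{-1} h^{-1}, then the subgroup [G, G] they generate.
commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g in D3 for h in D3}
derived = generated_subgroup(commutators, identity)

# The elements of the abelianisation are the cosets g[G, G].
abelianisation = {frozenset(compose(g, c) for c in derived) for g in D3}
```

Here `derived` turns out to be the three rotations, so there are exactly two cosets, in line with $D_3^{ab} \cong \mathbb{Z}_2$.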

Notice that for all of the homomorphisms $f : G \to H$ we've seen so
far, the identity $e_G$ of $G$ is mapped to the identity $e_H$ in $H$. There may
be other elements in $G$ which are also mapped to $e_H$ depending on
the exact definition of $f$, and we'll look further into that in the next
section, but so far all of the examples we've looked at have been such
that $f(e_G) = e_H$. The following proposition confirms that this isn't
a coincidence, and that all homomorphisms have this property. In
addition, for all of the examples so far, we find that for any element
$g \in G$, the inverse $g^{-1}$ of $g$ is mapped to the inverse of whatever
$g$ is mapped to; more formally, $f(g^{-1}) = f(g)^{-1}$. This is also not
a coincidence: the point of a homomorphism is that it's a function
which in some way preserves the structure of the group in question,
so it makes sense that the inverse of an element would be mapped to
the inverse of whatever the original element is mapped to.
Proposition 4.24 Let $f : G \to H$ be a homomorphism. Then $f(e_G) = e_H$,
and $f(g^{-1}) = f(g)^{-1}$ for any $g \in G$.

Proof Suppose that $f(e_G) = h$ for some element $h \in H$. Then
$$h = f(e_G) = f(e_G * e_G) = f(e_G) * f(e_G) = h * h.$$
Hence, by the cancellation laws (Proposition 1.15), $h = e_H$.
Similarly, for any $g \in G$,
$$f(g) * f(g^{-1}) = f(g * g^{-1}) = f(e_G) = e_H$$
and
$$f(g^{-1}) * f(g) = f(g^{-1} * g) = f(e_G) = e_H.$$
What this tells us is that $f(g^{-1})$ acts as a left and right inverse for $f(g)$,
or in other words $f(g^{-1}) = f(g)^{-1}$.

4.2 Kernels and images
When you want to see if your picture corresponds throughout with the
objects you have drawn from nature, take a mirror and look in that at
the reflection of the real things, and compare the reflected image with
your picture, and consider whether the subject of the two images duly
corresponds in both, particularly studying the mirror.
Leonardo da Vinci (1452–1519), The Practice of Painting
Recall that for any function $f : X \to Y$ we can define the image
$\operatorname{im}(f)$ of $f$. This is a subset of the codomain $Y$ and is defined to consist
of those elements of $Y$ that are mapped to by at least one element of
the domain $X$. That is,
$$\operatorname{im}(f) = \{ y \in Y : y = f(x) \text{ for some } x \in X \}.$$
It's natural to ask whether the image of a homomorphism $f : G \to H$
has any interesting properties. We know from Proposition 4.24
that $\operatorname{im}(f)$ must contain the identity $e_H$ as well as a full complement
of inverses; that is, for any $h \in \operatorname{im}(f)$ we know that $h^{-1} \in \operatorname{im}(f)$
too. We now find ourselves moving inexorably towards the following
proposition.
Proposition 4.25 Let $f : G \to H$ be a group homomorphism. Then the
image $\operatorname{im}(f)$ is a subgroup of $H$.
Proof We know from Proposition 4.24 that $e_H \in \operatorname{im}(f)$ and also that
$h^{-1} \in \operatorname{im}(f)$ for any $h \in \operatorname{im}(f)$. All we need to do now is show that
$\operatorname{im}(f)$ is closed under the group multiplication operation in $H$.
Suppose, therefore, that $h_1 = f(g_1)$ and $h_2 = f(g_2)$ for some elements
$g_1, g_2 \in G$. Then
$$h_1 * h_2 = f(g_1) * f(g_2) = f(g_1 * g_2) \in \operatorname{im}(f).$$
Hence $\operatorname{im}(f)$ is indeed closed under the group multiplication operation
in $H$, and is thus a subgroup of $H$.

So $\operatorname{im}(f)$ is a subgroup of $H$. Is there anything else we can say about
it? In the last chapter we spent some time and effort studying normal
subgroups and using them to construct quotient groups. Is the image
of a homomorphism a normal subgroup of its codomain? It certainly
will be if the codomain is an abelian group (since all subgroups of
abelian groups are normal) but as the following example shows, $\operatorname{im}(f)$
need not be a normal subgroup if the codomain of $f$ is nonabelian.
Example 4.26 Let $f : D_3 \to D_3$ such that
$$e, r, r^2 \mapsto e, \qquad m_1, m_2, m_3 \mapsto m_1.$$
This is a homomorphism (check this) but its image $\operatorname{im}(f) = \{e, m_1\}$
isn't a normal subgroup of $D_3$.
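The "check this" in Example 4.26 can be carried out mechanically. The following sketch assumes $D_3$ is modelled as the permutations of the vertices 0, 1, 2, with rotations as the even permutations and reflections as the odd ones; the choice of which transposition to call `m1` is an arbitrary assumption.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

D3 = list(permutations(range(3)))
e = (0, 1, 2)
m1 = (0, 2, 1)  # assumed label: the reflection fixing vertex 0

def f(g):
    """Rotations (even permutations) go to e, reflections (odd) go to m1."""
    return e if is_even(g) else m1

is_hom = all(f(compose(g, h)) == compose(f(g), f(h)) for g in D3 for h in D3)
image = {f(g) for g in D3}
image_is_normal = all(compose(compose(g, h), inverse(g)) in image
                      for g in D3 for h in image)
```

The homomorphism check passes, while the normality check fails because conjugating `m1` by a rotation produces a different reflection.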

The following example considers the image of a particular injective
homomorphism we met in the previous section.
Example 4.27 In Example 4.17 we constructed an injective homomorphism
$f : D_3 \hookrightarrow D_6$. The image $\operatorname{im}(f)$ is a six-element subgroup of
$D_6$ consisting of the elements $\{e, r^2, r^4, m_2, m_4, m_6\}$ which, on further
inspection, turns out to be isomorphic to $D_3$.
This is true in general: given an injective homomorphism $f : G \hookrightarrow H$,
the image $\operatorname{im}(f)$ is isomorphic to the domain $G$. An explicit
isomorphism can be constructed by restricting the codomain of $f$ to
just the image, thereby making the already injective homomorphism $f$
surjective as well.
Having looked at what happens at one end (the codomain) of a
homomorphism, we now turn our attention to the other end (the
domain) via a short detour into linear algebra.
An important application of linear algebra is the solution of systems
of simultaneous linear equations. For example, we can encode the
system
$$\begin{aligned}
x - 3y + 4z - 2w &= 5 \\
2y + 5z + w &= 2 \\
y - 3z &= 4
\end{aligned} \tag{4.1}$$
as the matrix equation
$$\begin{bmatrix} 1 & -3 & 4 & -2 \\ 0 & 2 & 5 & 1 \\ 0 & 1 & -3 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
= \begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix} \tag{4.2}$$
We can then restate the original problem (4.1) as an attempt to
characterise the vectors in $\mathbb{R}^4$ which get mapped to the vector
$\begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix}$ in $\mathbb{R}^3$ by the linear map represented by the matrix
$$A = \begin{bmatrix} 1 & -3 & 4 & -2 \\ 0 & 2 & 5 & 1 \\ 0 & 1 & -3 & 0 \end{bmatrix}.$$
Suppose that $v$ is a solution of (4.2).$^1$ Then any other solution of (4.2)
can be expressed as a sum $u + v$ where $u$ is a solution of the associated
homogeneous system
$$\begin{bmatrix} 1 & -3 & 4 & -2 \\ 0 & 2 & 5 & 1 \\ 0 & 1 & -3 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{4.3}$$
The solutions of the homogeneous system form a subspace of the vector
space $\mathbb{R}^4$ which we call the null space or kernel of (the linear map
represented by) the matrix $A$. In this case, the general solution of (4.2)
takes the form
$$\begin{bmatrix} 5 \\ 4 \\ 0 \\ -6 \end{bmatrix} + t \begin{bmatrix} -17 \\ 3 \\ 1 \\ -11 \end{bmatrix}$$
where $t \in \mathbb{R}$ is a real parameter. Here we can see that the kernel of $A$
is a one-dimensional subspace of $\mathbb{R}^4$.
$^1$ We are tacitly assuming that (4.2) actually has a solution; in this
case it does. Homogeneous systems are always solvable; the original
inhomogeneous system (4.2) is solvable if the vector
$\begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix}$ belongs to the image of $A$, which it does.
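The general solution can be verified by direct substitution. This is a sketch under the assumption that system (4.1) reads $x - 3y + 4z - 2w = 5$, $2y + 5z + w = 2$, $y - 3z = 4$, so that a particular solution is $v = (5, 4, 0, -6)$ and the kernel is spanned by $u = (-17, 3, 1, -11)$; the helper `apply` is a hypothetical name.

```python
A = [[1, -3, 4, -2],
     [0, 2, 5, 1],
     [0, 1, -3, 0]]
b = [5, 2, 4]

v = [5, 4, 0, -6]       # a particular solution of A x = b (assumed signs)
u = [-17, 3, 1, -11]    # spans the kernel of A (assumed signs)

def apply(M, x):
    """Multiply the matrix M (a list of rows) by the column vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

# u lies in the kernel, v solves the inhomogeneous system,
# and so does v + t*u for every value of the parameter t.
kernel_ok = apply(A, u) == [0, 0, 0]
particular_ok = apply(A, v) == b
family_ok = all(apply(A, [v[i] + t * u[i] for i in range(4)]) == b
                for t in range(-5, 6))
```

The last check is the linear-algebra version of a fact that reappears for groups later in the section: the full solution set is a translate of the kernel.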
So, in linear algebra we can derive some useful information about
a linear map (and thereby, perhaps, solve the actual problem under
investigation) by characterising which elements of the domain get
mapped to the zero vector in the codomain; that is, by studying the
kernel of the map.
We now define the analogous concept for group homomorphisms:
Definition 4.28 Let $f : G \to H$ be a group homomorphism. Then
the kernel $\ker(f)$ of $f$ is defined to be
$$\ker(f) = \{ g \in G : f(g) = e_H \}.$$
That is, the elements of $G$ that are mapped by $f$ to the identity $e_H$ in
$H$.
Looking again at the homomorphism $f : D_3 \to D_3$ from Example 4.26,
we can see that the kernel of $f$ is the set $\{e, r, r^2\}$, which happens to
be the rotation subgroup $R_3$ of $D_3$.
Just as the image of a homomorphism is a subgroup of the codomain,
and the kernel and image of a linear map are vector subspaces of,
respectively, the domain and codomain, it seems reasonable to ask
whether the kernel of a group homomorphism is necessarily a
subgroup of the domain. But the kernel of the homomorphism in
Example 4.26 isn't just a subgroup of the domain, it's a normal subgroup.
Might it be that the kernel is always a normal subgroup of the domain?
The answer to this question is yes, as the following proposition shows.
Proposition 4.29 Suppose $f : G \to H$ is a group homomorphism. Then
$\ker(f)$ is a normal subgroup of $G$. Furthermore, any normal subgroup of
a group $G$ can be expressed as the kernel of some suitable homomorphism.
Proof Clearly the identity $e_G \in \ker(f)$, since $f(e_G) = e_H$ by
Proposition 4.24. Given any $g \in \ker(f)$ we see that
$$f(g^{-1}) = f(g)^{-1} = e_H^{-1} = e_H$$
and so $g^{-1} \in \ker(f)$ as well. Given any $g_1, g_2 \in \ker(f)$ we have
$$f(g_1 * g_2) = f(g_1) * f(g_2) = e_H * e_H = e_H$$
and so $g_1 * g_2 \in \ker(f)$ as well. Hence $\ker(f)$ is a subgroup of $G$: it
contains the identity $e_G$, it contains all necessary inverses, and it's
closed under the group multiplication operation in $G$.
For $\ker(f)$ to be a normal subgroup of $G$ we need it to be closed under
conjugation by any element of $G$. So let $k \in \ker(f)$ be some element
of the kernel, and let $g$ be some arbitrary element of $G$. Then
$$\begin{aligned}
f(g * k * g^{-1}) &= f(g) * f(k) * f(g^{-1}) \\
&= f(g) * e_H * f(g)^{-1} \\
&= f(g) * f(g)^{-1} \\
&= e_H
\end{aligned}$$
and hence the conjugate $g * k * g^{-1} \in \ker(f)$ as required, so $\ker(f)$ is
a normal subgroup of $G$.
To show the second part of the proposition, that any normal subgroup
$N \unlhd G$ can be expressed as the kernel of a suitable homomorphism, we
make use of the quotient homomorphism introduced in the previous
section. Let $q : G \to G/N$ be this homomorphism. Then $N$ is exactly
that part of $G$ which is mapped to the identity in the quotient group
$G/N$, and so $N = \ker(q)$.
We'll make use of this fact in the next section, but for the moment
we'll look at a few examples.
Example 4.30 Let $f : S_n \to \mathbb{Z}_2$ be the homomorphism which maps
even permutations to 0 and odd permutations to 1. Then $\ker(f) = A_n$,
the alternating group of even permutations. This is known to be
normal in $S_n$ (see Example 3.33).
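Example 4.30 can be confirmed for a small case by enumeration. The sketch below takes $n = 4$ (an arbitrary choice) and stores each permutation as a tuple of images; the function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def parity(p):
    """0 for an even permutation, 1 for an odd one (inversions mod 2)."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2

S4 = list(permutations(range(4)))

# parity is a homomorphism S4 -> Z_2 (addition modulo 2 in the codomain).
is_hom = all(parity(compose(p, q)) == (parity(p) + parity(q)) % 2
             for p in S4 for q in S4)

# Its kernel is the set of even permutations, i.e. A4.
kernel = {p for p in S4 if parity(p) == 0}

# Normality: the kernel is closed under conjugation by every element of S4.
kernel_is_normal = all(compose(compose(g, k), inverse(g)) in kernel
                       for g in S4 for k in kernel)
```

The kernel has 12 elements, half of $S_4$, as expected for a surjection onto a group of order 2.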

Example 4.31 Example 4.26 provided a homomorphism $f : D_3 \to D_3$
whose kernel is the rotation subgroup $R_3 \lhd D_3$. By
Proposition 4.29 we can now confidently state that there is no
homomorphism $f : D_3 \to H$, where $H = D_3$ or any other group, such that
$\ker(f) = \{e, m_1\}$, since this isn't a normal subgroup of $D_3$.
In fact, no matter what group $H$ is, there are only three possibilities
for the kernel of a homomorphism $f : D_3 \to H$. Either $\ker(f) = D_3$,
in which case $f$ is the zero homomorphism (see Example 4.13) which
maps everything in $D_3$ to $e_H \in H$, or $\ker(f)$ is the trivial subgroup
$\{e_G\}$, in which case $f$ is an injective homomorphism, or $\ker(f) = R_3$.

Example 4.32 By the same argument used in the previous example,
if $n > 4$ then there are no homomorphisms $f : A_n \to H$ for any group
$H$, other than the zero homomorphism which maps all of $A_n$ to
$e_H \in H$, or an injective homomorphism mapping $A_n$ isomorphically
to some subgroup of $H$. This follows from the simplicity of $A_n$ for
$n > 4$ (see Proposition 3.43).
Example 4.33 We know from Example 4.6 that the determinant
map $\det : GL_n(\mathbb{R}) \to \mathbb{R}^*$ is a homomorphism. The kernel of this
homomorphism is precisely the subgroup of $n \times n$ matrices with
determinant 1, or in other words the special linear group $SL_n(\mathbb{R})$.
We already know that $SL_n(\mathbb{R})$ is a normal subgroup of $GL_n(\mathbb{R})$ (see
Example 3.15) but if we didn't, this would confirm it for us.
Example 4.34 Let $f : \mathbb{Z} \to \mathbb{Z}_n$ be the homomorphism which maps an
integer $m$ to its modulo-$n$ remainder $[m]_n$. Then $\ker(f) = n\mathbb{Z}$.

The injective homomorphism $f : D_3 \hookrightarrow D_6$ in Example 4.17 has a trivial
kernel; that is, $\ker(f) = \{e\}$. In Examples 4.31 and 4.32 we remarked
that the homomorphisms with trivial kernel were those which mapped
the group in question injectively (in fact, isomorphically) to a subgroup
of their codomain. The next proposition confirms that this is a general
phenomenon.
Proposition 4.35 Let $f : G \to H$ be a group homomorphism. Then $f$ is
injective if and only if $\ker(f) = \{e_G\}$.
Proof If $f$ is injective then $\ker(f)$ clearly has to be trivial. For, if there
exist $g_1, g_2 \in G$ with $f(g_1) = f(g_2) = e_H$, then the injectivity of $f$
requires that $g_1 = g_2$, and Proposition 4.24 implies that $g_1 = g_2 = e_G$.
Hence $\ker(f) = \{e_G\}$.
The converse is only very slightly more involved. Suppose that
$\ker(f) = \{e_G\}$ and that $g_1, g_2 \in G$ are such that $f(g_1) = f(g_2)$. Then
$$e_H = f(g_1) * f(g_2)^{-1} = f(g_1) * f(g_2^{-1}) = f(g_1 * g_2^{-1}).$$
So $g_1 * g_2^{-1}$ is in the kernel of $f$. But $\ker(f)$ is trivial, so $g_1 * g_2^{-1} = e_G$
and hence $g_1 = g_2$. Therefore $f$ is injective.
The kernel of a homomorphism $f : G \to H$ is the subset (actually, by
Proposition 4.29, the normal subgroup) of $G$ consisting of all elements
which $f$ maps to the identity $e_H$ in $H$. We can generalise this notion a
little bit by considering the elements of $G$ which get mapped to other
elements of $H$.
Definition 4.36 Let $f : G \to H$ be a group homomorphism, and let
$h$ be some element of $H$. We define the inverse image or preimage
of $h$ to be the set
$$f^{-1}(h) = \{ g \in G : f(g) = h \} \subseteq G$$
of elements of $G$ which $f$ maps to the element $h$.
If $K$ is a subset of $H$, then the inverse image or preimage of $K$ is the
set
$$f^{-1}(K) = \{ g \in G : f(g) \in K \} \subseteq G$$
of elements of $G$ which $f$ maps to some element of $K$.
In particular, $\ker(f) = f^{-1}(e_H)$, and for any single element $h \in H$
we have $f^{-1}(h) = f^{-1}(\{h\})$. Also, if $f$ is injective, then $f^{-1}(h)$ will
consist of at most a single element, and if $f$ is not surjective, then there
will exist at least one element $h \in H$ for which $f^{-1}(h) = \emptyset$.
What happens if we look at the inverse image, not just of a subset, but
of a subgroup of $H$? The following proposition answers this question.
Proposition 4.37 Let $f : G \to H$ be a group homomorphism, and let
$K \leq H$ be a subgroup of $H$. Then the inverse image $f^{-1}(K)$ is a subgroup
of $G$. If $K \unlhd H$ is a normal subgroup of $H$, then the inverse image $f^{-1}(K)$
is a normal subgroup of $G$.
Proof The inverse image $f^{-1}(K)$ certainly contains the identity $e_G$,
since $f(e_G) = e_H \in K$. If $g \in f^{-1}(K)$ then $g^{-1} \in f^{-1}(K)$ too, since
$f(g^{-1}) = f(g)^{-1} \in K$, because $K$ is a subgroup of $H$.
Suppose $g_1, g_2 \in f^{-1}(K)$. This means that $f(g_1)$ and $f(g_2)$ are in $K$,
and since $K$ is a subgroup of $H$, the product $f(g_1) * f(g_2) \in K$ as well.
But $f(g_1) * f(g_2) = f(g_1 * g_2)$ since $f$ is a homomorphism, and hence
$g_1 * g_2 \in f^{-1}(K)$ as well. Thus $f^{-1}(K)$ is a subgroup of $G$.
If $K$ is a normal subgroup of $H$ then for any $g \in G$ and any $k \in f^{-1}(K)$
the conjugate $f(g) * f(k) * f(g)^{-1}$ is also in $K$, and since
$$f(g) * f(k) * f(g)^{-1} = f(g) * f(k) * f(g^{-1}) = f(g * k * g^{-1})$$
we see that the conjugate $g * k * g^{-1}$ is also in the inverse image
$f^{-1}(K)$, and so $f^{-1}(K)$ is normal in $G$.
More generally, for a group homomorphism $f : G \to H$, the preimage
of any single element $h \in H$ is either the empty set (if $h \in H \setminus \operatorname{im}(f)$)
or a coset of $\ker(f)$ in $G$.
Proposition 4.38 Let $f : G \to H$ be a group homomorphism with kernel
$K = \ker(f)$, and let $h \in H$ be such that $f(g) = h$ for some $g \in G$. Then
$f^{-1}(h) = gK$.
Proof To show $f^{-1}(h) = gK$ we follow the standard approach to
proving equality of two sets: we show each is a subset of the other.
Firstly, we show that $f^{-1}(h) \subseteq gK$. Consider some element $x \in f^{-1}(h)$,
so $f(x) = h$. Then $f(x) = f(g)$, so
$$e_H = f(g)^{-1} * f(x) = f(g^{-1}) * f(x) = f(g^{-1} * x).$$
This means that $g^{-1} * x \in \ker(f) = K$, and hence by (the left-handed
version of) Proposition 2.26 we have $x \in gK$ and thus $f^{-1}(h) \subseteq gK$.
Secondly, to show that $gK \subseteq f^{-1}(h)$ let $x \in gK$. Then $x = g * k$ for
some element $k \in K$, and therefore
$$f(x) = f(g * k) = f(g) * f(k) = h * e_H = h.$$
Thus $x \in f^{-1}(h)$, so $gK \subseteq f^{-1}(h)$.
These two inclusions combine to give $f^{-1}(h) = gK$ as required.
From this we get the following neat little corollary, which will come in
useful in a little while.
Corollary 4.39 Let $f : G \to H$ be a group homomorphism with kernel
$K = \ker(f)$. Then if $|K| = n$, the homomorphism $f$ is an $n$-to-1 map
from $G$ onto $\operatorname{im}(f)$.
Proof This follows almost immediately from the previous proposition
and Proposition 2.29. For any $h \in \operatorname{im}(f)$ with $h = f(g)$ for some
$g \in G$, we have
$$|f^{-1}(h)| = |gK| = |K| = n$$
so each element of $\operatorname{im}(f)$ is mapped to by exactly $n$ elements of $G$.
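Proposition 4.38 and Corollary 4.39 can be seen in action on a small additive example. The sketch below assumes the homomorphism $f : \mathbb{Z}_{12} \to \mathbb{Z}_4$, $m \mapsto m \bmod 4$, which is not taken from the text and is chosen only because its kernel has three elements.

```python
n, k = 12, 4
G = range(n)

def f(m):
    """Reduction modulo 4 on Z_12; well defined because 4 divides 12."""
    return m % k

# Kernel: the elements sent to the identity 0 of Z_4.
K = {g for g in G if f(g) == 0}

# Fibres f^{-1}(h) for each h in the codomain.
fibres = {h: {g for g in G if f(g) == h} for h in range(k)}

# Proposition 4.38: the fibre over f(g) is the coset g + K.
cosets_match = all(fibres[f(g)] == {(g + x) % n for x in K} for g in G)

# Corollary 4.39: every fibre has exactly |K| elements.
fibre_sizes = {len(fibre) for fibre in fibres.values()}
```

Every fibre is a translate of the kernel $\{0, 4, 8\}$, so this $f$ is a 3-to-1 map.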

4.3 The Isomorphism Theorems
In the judgment of the most competent living mathematicians, Fräulein
Noether was the most significant creative mathematical genius thus far
produced since the higher education of women began. In the realm of
algebra, in which the most gifted mathematicians have been busy for
centuries, she discovered methods which have proved of enormous
importance in the development of the present-day younger generation of
mathematicians.
Albert Einstein (1879–1955), Letter to the New York Times, 5 May 1935
In the previous chapter we studied normal subgroups and
used them to define quotient groups: new groups in which the normal
subgroup (together with whatever it might happen to signify)
is factored out and effectively annihilated or ignored. We saw how
factoring a group $G$ by its centre $Z(G)$ yielded a new group, and in a
little while we will figure out what this particular quotient group tells
us. We also saw how factoring by the commutator subgroup $[G, G]$
killed all the nonabelian structure in the group, leaving us with an
abelianised version $G^{ab}$ of the group.
Both the centre and the commutator subgroups are in some sense
special: every group has one of each, and they're both well-defined
(although in some cases they're either trivial or otherwise uninteresting).
Proposition 4.29 tells us that for any homomorphism $f : G \to H$, the
kernel $\ker(f)$ is a normal subgroup of $G$. And since it's normal in
$G$, we can go ahead and factor $G$ by it to form the quotient group
$G/\ker(f)$. Proposition 4.29 also tells us that any normal subgroup of
$G$ can be regarded as the kernel of some suitable homomorphism, so
it's natural to ask what these quotient groups look like.
The answer is given by the following elegant theorem:
Theorem 4.40 (First Isomorphism Theorem) Let $f : G \to H$ be a
group homomorphism and let $K = \ker(f)$. Then the function
$\phi : G/K \to \operatorname{im}(f)$ given by $\phi(gK) = f(g)$ is an isomorphism. That is,
$$G/\ker(f) \cong \operatorname{im}(f).$$
Proof We need to show three things: that the function $\phi$ is
well-defined, that it is a homomorphism, and that it is bijective.
Suppose, then, that $g_1, g_2 \in G$ and that the cosets $g_1 K$ and $g_2 K$ coincide;
that is, $g_1 K = g_2 K$. Then by Proposition 2.26 it follows that $g_1 * g_2^{-1}$
lies in the kernel $K$. So $f(g_1 * g_2^{-1})$ lies in the image $f(K) = \{e_H\}$, and
thus $f(g_1 * g_2^{-1}) = e_H$. Since $f$ is a homomorphism,
$$f(g_1 * g_2^{-1}) = f(g_1) * f(g_2^{-1}) = f(g_1) * f(g_2)^{-1} = e_H$$
and therefore $f(g_1) = f(g_2)$. So if $g_1 K = g_2 K$ we have
$$\phi(g_1 K) = f(g_1) = f(g_2) = \phi(g_2 K)$$
and thus $\phi$ is well-defined.
The function $\phi$ is a homomorphism since
$$\begin{aligned}
\phi((g_1 K)(g_2 K)) &= \phi((g_1 * g_2)K) \\
&= f(g_1 * g_2) \\
&= f(g_1) * f(g_2) \\
&= \phi(g_1 K) * \phi(g_2 K).
\end{aligned}$$
It is clear that $\phi$ is surjective: any element of $\operatorname{im}(f)$ is of the form $f(g)$
for some $g \in G$. And by the definition of $\phi$, we have $f(g) = \phi(gK)$.
Hence any element of $\operatorname{im}(f)$ is of the form $\phi(gK)$ for some coset
$gK \in G/K$.
Finally, we need to show that $\phi$ is injective. To see this, consider two
elements $g_1, g_2 \in G$ such that $\phi(g_1 K) = \phi(g_2 K)$. Then $f(g_1) = f(g_2)$,
which then means that
$$e_H = f(g_1) * f(g_2)^{-1} = f(g_1) * f(g_2^{-1}) = f(g_1 * g_2^{-1}).$$
This means that $g_1 * g_2^{-1} \in \ker(f)$, and by Proposition 2.26 this means
that $g_1 K = g_2 K$ as cosets. Therefore $\phi$ is injective.
Hence $\phi : G/\ker(f) \to \operatorname{im}(f)$ is a well-defined group isomorphism.
Let's look at some examples.
Example 4.41 In Example 3.27 we saw that factoring the Klein 4-group
$V_4 = \{e, a, b, c\}$ by the subgroup $A = \{e, a\} \cong \mathbb{Z}_2$ yields a
group which is itself isomorphic to $\mathbb{Z}_2$. We can rephrase this as
a consequence of the First Isomorphism Theorem as follows. Let
$f : V_4 \to \mathbb{Z}_2$ be the homomorphism which maps $e$ and $a$ to 0 and $b$
and $c$ to 1. Then $\ker(f) = A$ and $f$ is surjective so $\operatorname{im}(f) = \mathbb{Z}_2$. By
the First Isomorphism Theorem we have
$$V_4/A = V_4/\ker(f) \cong \operatorname{im}(f) = \mathbb{Z}_2$$
as expected.
Example 4.42 Revisiting Example 3.26 in the same way, we define a
homomorphism $f : \mathbb{Z}_4 \to \mathbb{Z}_2$ such that $0, 2 \mapsto 0$ and $1, 3 \mapsto 1$. Then
$\ker(f) = \{0, 2\}$ and $\operatorname{im}(f) = \mathbb{Z}_2$, and so by the First Isomorphism
Theorem we see that
$$\mathbb{Z}_4/\{0, 2\} = \mathbb{Z}_4/\ker(f) \cong \operatorname{im}(f) = \mathbb{Z}_2$$
as expected.
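The three steps of the proof of the First Isomorphism Theorem can be traced explicitly for Example 4.42. This is a minimal sketch; the map on cosets is called `phi` here, and the dictionary representation of it is an assumption of the sketch rather than anything from the text.

```python
n = 4
G = range(n)

def f(m):
    """The homomorphism Z_4 -> Z_2 of Example 4.42: 0, 2 -> 0 and 1, 3 -> 1."""
    return m % 2

K = frozenset(g for g in G if f(g) == 0)          # the kernel {0, 2}

def coset(g):
    """The coset g + K, stored as a frozenset so it can be a dictionary key."""
    return frozenset((g + k) % n for k in K)

cosets = {coset(g) for g in G}

# Well-definedness: f takes a single value on each coset.
well_defined = all(len({f(g) for g in c}) == 1 for c in cosets)

# phi(g + K) = f(g), computed from an arbitrary representative of each coset.
phi = {c: f(min(c)) for c in cosets}

# Bijectivity onto the image of f.
image = {f(g) for g in G}
bijective = (len(set(phi.values())) == len(cosets)
             and set(phi.values()) == image)

# Homomorphism condition on cosets: phi((a+K) + (b+K)) = phi(a+K) + phi(b+K).
is_hom = all(phi[coset(a + b)] == (phi[coset(a)] + phi[coset(b)]) % 2
             for a in G for b in G)
```

With two cosets mapping bijectively onto $\{0, 1\}$, the computation matches $\mathbb{Z}_4/\{0, 2\} \cong \mathbb{Z}_2$.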

The First Isomorphism Theorem is an important result. In particular, it tells
us we can study homomorphic images of a group $G$ by examining its
various quotient groups, since there is a well-defined correspondence
between them. This correspondence is not always bijective, though, as
different homomorphisms may have the same image:
Example 4.43 Let $V_4 = \{e, a, b, c\}$ be the Klein 4-group as usual. We
define three group homomorphisms $\alpha, \beta, \gamma : V_4 \to \mathbb{Z}_2$ by
$$\alpha : \begin{cases} e, a \mapsto 0 \\ b, c \mapsto 1, \end{cases} \qquad
\beta : \begin{cases} e, b \mapsto 0 \\ a, c \mapsto 1, \end{cases} \qquad
\gamma : \begin{cases} e, c \mapsto 0 \\ a, b \mapsto 1. \end{cases}$$
Let $A = \ker(\alpha)$, $B = \ker(\beta)$ and $C = \ker(\gamma)$. These three subgroups
are obviously normal for a variety of reasons (they're kernels of
group homomorphisms, $V_4$ is abelian, they're index-2 subgroups,
etc) and lead to obviously isomorphic quotient groups
$$V_4/A = \{eA = aA,\ bA = cA\},$$
$$V_4/B = \{eB = bB,\ aB = cB\}$$
and
$$V_4/C = \{eC = cC,\ aC = bC\}$$
which are in turn isomorphic to the images
$$\operatorname{im}(\alpha) = \operatorname{im}(\beta) = \operatorname{im}(\gamma) = \mathbb{Z}_2.$$
We'll look more closely at the details of this correspondence later on.
In particular, it will lead us into a discussion of group extensions
and semidirect products. But for the moment, the moral of the First
Isomorphism Theorem is that we can determine all homomorphic
images of a group $G$ from its various quotients.
This, of course, raises the question of why we might be interested in
homomorphic images of a group when we could just be investigating
the group itself. The answer is that with larger groups it can be quite
time-consuming and difficult to study the group itself, whereas we can
often find out quite a lot about the internal structure of a group from
its homomorphic images, as the following examples demonstrate.
Example 4.44 Suppose $G$ is a group of order 6 which maps surjectively
onto the cyclic group $\mathbb{Z}_3$ via some homomorphism $f : G \to \mathbb{Z}_3$.
Then we know that $G$ must have a normal subgroup of order 2 since
by Proposition 4.29 $K = \ker(f)$ is a normal subgroup of $G$, and
Corollary 4.39 tells us that $|\ker(f)| = |f^{-1}(0)| = 2$. This means that
$K \cong \mathbb{Z}_2$. It also means that $G$ can only be the cyclic group $\mathbb{Z}_6$ since
the only other group of order 6 is the dihedral group $D_3$, which we
already know doesn't have a normal subgroup of order 2.
In this case, merely knowing that such a homomorphism exists,
without knowing anything else about it, was enough to help us figure out
what the group was. This isn't always possible, but we can usually
derive at least some information about the chosen group's internal
structure, as the following example demonstrates.
Table 4.1: The fifteen groups of order 24
 1  $\mathbb{Z}_{24} \cong \mathbb{Z}_8 \oplus \mathbb{Z}_3$
 2  $\mathbb{Z}_{12} \oplus \mathbb{Z}_2 \cong \mathbb{Z}_6 \oplus \mathbb{Z}_4 \cong \mathbb{Z}_4 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_2$
 3  $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3$
 4  $D_{12}$
 5  $D_6 \times \mathbb{Z}_2 \cong D_3 \times \mathbb{Z}_2 \times \mathbb{Z}_2$
 6  $D_4 \times \mathbb{Z}_3$
 7  $D_3 \times \mathbb{Z}_4 \cong S_3 \times \mathbb{Z}_4$
 8  $S_4$
 9  $A_4 \times \mathbb{Z}_2$
10  $Q_8 \times \mathbb{Z}_3$
11  $SL_2(3)$
12  $\mathbb{Z}_3 \rtimes \mathbb{Z}_8$
13  $\mathbb{Z}_3 \rtimes D_4$ with kernel $\mathbb{Z}_2 \oplus \mathbb{Z}_2$
14  $\mathrm{Dic}_6$
15  $\mathrm{Dic}_3 \times \mathbb{Z}_2$
Example 4.45 Suppose $f : G \to \mathbb{Z}_6$ is a surjective homomorphism
from a group $G$ of order 24 onto the cyclic group of order 6.
Then by Proposition 4.29 and Corollary 4.39 we know that $G$ must
have a normal subgroup of order 4 (which must be isomorphic to
either $\mathbb{Z}_4$ or the Klein 4-group $V_4$) since $|\ker(f)| = |f^{-1}(0)| = 4$.
But we can do better than that. The image group $\mathbb{Z}_6$ has a normal
subgroup $\{0, 3\}$ of order 2, so by Proposition 4.37 and Corollary 4.39,
$G$ has a normal subgroup of order $2 \times 4 = 8$.
Also, $\mathbb{Z}_6$ has a normal subgroup $\{0, 2, 4\}$ of order 3, so by the same
argument we know $G$ has a normal subgroup of order $3 \times 4 = 12$.
Up to isomorphism there are fifteen groups of order 24,$^2$ and we're
still a little way from being able to say which one this is,$^3$ but just
knowing the existence of a surjective homomorphism onto $\mathbb{Z}_6$ has told
us a surprising amount about its internal structure.
$^2$ Just for interest, the fifteen groups of order 24 are shown in Table 4.1.
We've met the first eleven of these groups already, to a greater or lesser
extent. (Recall that $SL_2(3)$ is the special linear group of $2 \times 2$ matrices
with determinant 1 over the finite field $\mathbb{F}_3$.) For the remaining four,
$\mathrm{Dic}_n$ denotes the dicyclic group of order $4n$ and $\rtimes$ denotes the
semidirect product. We'll meet both of these in later chapters.
$^3$ It so happens that $SL_2(3)$ has no subgroups of order 12, so we can
cross that one off the list of candidates for $G$.
Lagrange's Theorem tells us that $G$ might also contain subgroups of
orders 2, 3 and 6, but doesn't guarantee their existence. However,
Cauchy's Theorem helps us a little more, guaranteeing the existence
of cyclic (albeit not necessarily normal) subgroups of orders 2 and 3.
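The order counts in Example 4.45 can be checked on one concrete candidate. The sketch below assumes $G = \mathbb{Z}_{24}$ with $f(m) = m \bmod 6$, which is only one of the groups the example allows; the helper names are hypothetical.

```python
G = range(24)

def f(m):
    """A surjection Z_24 -> Z_6 (reduction mod 6; well defined as 6 | 24)."""
    return m % 6

def preimage(K):
    """The inverse image f^{-1}(K) of a subset K of Z_6."""
    return {g for g in G if f(g) in K}

def is_subgroup(S):
    """Subgroup test in Z_24: contains 0 and is closed under subtraction."""
    return 0 in S and all((a - b) % 24 in S for a in S for b in S)

# The trivial subgroup and the two proper nontrivial subgroups of Z_6.
subgroups_of_Z6 = [{0}, {0, 3}, {0, 2, 4}]

orders = [len(preimage(K)) for K in subgroups_of_Z6]
all_subgroups = all(is_subgroup(preimage(K)) for K in subgroups_of_Z6)
```

Each preimage has $4|K|$ elements, giving subgroups of orders 4, 8 and 12; normality is automatic here because $\mathbb{Z}_{24}$ is abelian.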
For the next application of the First Isomorphism Theorem, we
introduce three new constructs: the normaliser $N_G(H)$, the centraliser
$Z_G(H)$ and the inner automorphism group $\operatorname{Inn}(G)$.
Definition 4.46 Let $S$ be a subset of a group $G$. The normaliser of
$S$ in $G$ is the set
$$N_G(S) = \{ g \in G : gSg^{-1} = S \}$$
where $gSg^{-1} = \{ g * s * g^{-1} : s \in S \}$.
What this means is that the set $gSg^{-1}$ consists of all possible conjugates
of elements of $S$ by some fixed element $g \in G$. So the normaliser $N_G(S)$
is the set of all elements $g \in G$ for which conjugation by $g$ either leaves
$S$ unchanged or permutes its elements among themselves.
Example 4.47 Consider the subset $S = \{-1, 1\} \subset \mathbb{Z}$. For some
arbitrary integer $n \in \mathbb{Z}$, the conjugate set is
$$n + S - n = \{n - 1 - n\} \cup \{n + 1 - n\} = \{-1, 1\} = S.$$
In other words, the set $S$ is fixed by conjugation by any integer $n \in \mathbb{Z}$.
What this means is that $N_{\mathbb{Z}}(S) = \mathbb{Z}$: the normaliser of $S$ in $\mathbb{Z}$ is the
whole of $\mathbb{Z}$.
Similarly, considering the subgroup $3\mathbb{Z} \leq \mathbb{Z}$, we see that
$$n + 3\mathbb{Z} - n = \{n + 3m - n : m \in \mathbb{Z}\} = \{3m : m \in \mathbb{Z}\} = 3\mathbb{Z}$$
and hence
$$N_{\mathbb{Z}}(3\mathbb{Z}) = \{n \in \mathbb{Z} : n + 3\mathbb{Z} - n = 3\mathbb{Z}\} = \{n \in \mathbb{Z} : 3\mathbb{Z} = 3\mathbb{Z}\} = \mathbb{Z}.$$
Thus $N_{\mathbb{Z}}(3\mathbb{Z}) = \mathbb{Z}$.

More generally, $N_G(S) = G$ for any abelian group $G$ and any subset
$S \subseteq G$. At first sight the normaliser doesn't look particularly useful:
in the examples above it's just equal to the full group. But as usual,
things become more interesting when we consider nonabelian groups.
Example 4.48 Let $G = D_3$ and consider the four subsets
$$S_1 = \{m_1, m_2, m_3\}, \quad S_2 = \{e, r, r^2\}, \quad S_3 = \{e, m_1\}, \quad S_4 = \{r\}.$$
Clearly the identity $e$ is in $N_{D_3}(S)$ for any of these subsets, since
$e * s * e^{-1} = s$ for any $s \in S$.
The normaliser $N_{D_3}(S_1)$ contains the rotations $r$ and $r^2$ since
$$\begin{aligned}
r * m_1 * r^2 &= m_2, & r^2 * m_1 * r &= m_3, \\
r * m_2 * r^2 &= m_3, & r^2 * m_2 * r &= m_1, \\
r * m_3 * r^2 &= m_1, & r^2 * m_3 * r &= m_2.
\end{aligned}$$
Observe that not only is $g * s * g^{-1}$ in $S_1$ for each of these elements
$g = r, r^2$, but that every element of $S_1$ is represented in this way. That
is, $gS_1g^{-1}$ is actually equal to $S_1$ rather than just being a subset of it.
The reflections $m_1$, $m_2$ and $m_3$ are in $N_{D_3}(S_1)$ as well, since
$$\begin{aligned}
m_1 * m_1 * m_1 &= m_1, & m_2 * m_1 * m_2 &= m_3, & m_3 * m_1 * m_3 &= m_2, \\
m_1 * m_2 * m_1 &= m_3, & m_2 * m_2 * m_2 &= m_2, & m_3 * m_2 * m_3 &= m_1, \\
m_1 * m_3 * m_1 &= m_2, & m_2 * m_3 * m_2 &= m_1, & m_3 * m_3 * m_3 &= m_3.
\end{aligned}$$
Hence $N_{D_3}(S_1) = D_3$.
The subset $S_2$ happens to be equal to the rotation subgroup $R_3$, which
we know is closed under conjugation by elements of $D_3$ since it's a
normal subgroup of $D_3$. A glance at the first three rows of Table 3.1
confirms that the normaliser $N_{D_3}(S_2) = D_3$.
Examining $S_3$, we see that $m_1 \in N_{D_3}(S_3)$ since
$$m_1 * e * m_1 = e, \qquad m_1 * m_1 * m_1 = m_1.$$
However, neither the rotations $r$ and $r^2$ nor the other reflections $m_2$
and $m_3$ fix $S_3$ under conjugation, so $N_{D_3}(S_3) = \{e, m_1\}$.
Finally, the second row of Table 3.1 tells us that $r$ is unchanged when
conjugated with the identity $e$, $r$ itself, or $r^2$, but by no other elements.
Hence $N_{D_3}(S_4) = \{e, r, r^2\}$.
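The four normalisers of Example 4.48 can be recomputed by exhaustive search. The sketch assumes $D_3$ is modelled as permutations of the vertices 0, 1, 2; which permutation is labelled `r`, `m1`, `m2` or `m3` is an arbitrary assumption, although the resulting normalisers do not depend on it.

```python
from itertools import permutations

def compose(p, q):
    """Return 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def normaliser(G, S):
    """All g in G with g S g^{-1} = S as sets."""
    S = set(S)
    return {g for g in G
            if {compose(compose(g, s), inverse(g)) for s in S} == S}

D3 = set(permutations(range(3)))
e = (0, 1, 2)
r, r2 = (1, 2, 0), (2, 0, 1)                   # the two nontrivial rotations
m1, m2, m3 = (0, 2, 1), (2, 1, 0), (1, 0, 2)   # the three reflections

S1, S2, S3, S4 = {m1, m2, m3}, {e, r, r2}, {e, m1}, {r}
normalisers = [normaliser(D3, S) for S in (S1, S2, S3, S4)]
```

The computed sets are $D_3$, $D_3$, $\{e, m_1\}$ and $\{e, r, r^2\}$ respectively, and each one is visibly a subgroup.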

This example displays a range of interesting behaviour. In all cases
the normaliser $N_{D_3}(S)$ was a subgroup of $D_3$, and as the following
proposition shows, this wasn't a coincidence.
Proposition 4.49 Let $S \subseteq G$ be a subset of some group $G$. Then the
normaliser $N_G(S)$ is a subgroup of $G$.
Proof The identity $e$ is in $N_G(S)$ since $e * s * e^{-1} = e * s * e = s$ for all
$s \in S$.
If $g_1, g_2 \in N_G(S)$ then their product $g_1 * g_2 \in N_G(S)$ too:
$$\begin{aligned}
(g_1 * g_2)S(g_1 * g_2)^{-1} &= (g_1 * g_2)S(g_2^{-1} * g_1^{-1}) \\
&= g_1(g_2 S g_2^{-1})g_1^{-1} \\
&= g_1 S g_1^{-1} && \text{since } g_2 \in N_G(S) \\
&= S && \text{since } g_1 \in N_G(S).
\end{aligned}$$
If $g \in N_G(S)$ then $S = gSg^{-1} = \{ g * s * g^{-1} : s \in S \}$. Multiplying
on the right by $g$ gives
$$Sg = \{ s * g : s \in S \} = \{ g * s : s \in S \} = gS$$
and multiplying on the left by $g^{-1}$ then gives
$$g^{-1}Sg = \{ g^{-1} * s * g : s \in S \} = S$$
so $g^{-1} \in N_G(S)$ as required. Thus $N_G(S)$ is a subgroup of $G$.
In Example 4.48 we considered two cases where the subset $S$ happened
to be a subgroup of the larger group $D_3$. In the first of these cases,
the subset $S$ was actually the rotation subgroup $R_3 = \{e, r, r^2\}$, which
happens to be a normal subgroup of its normaliser $N_{D_3}(R_3) = D_3$.
In the second case, the subgroup $\{e, m_1\}$ was its own normaliser. We
call subgroups with this property (that is, subgroups $H \leq G$ such
that $N_G(H) = H$) self-normalising. Clearly any self-normalising
subgroup must be a normal subgroup of its own normaliser.
Example 4.50 Consider the following three subgroups of the
symmetric group $S_4$:
$$H_1 = \{\iota, (1\,2)\},$$
$$H_2 = \{\iota, (1\,2\,3), (1\,3\,2)\},$$
and
$$H_3 = \{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}.$$
These subgroups have normalisers
$$N_{S_4}(H_1) = \{\iota, (1\,2), (3\,4), (1\,2)(3\,4)\} \cong V_4,$$
$$N_{S_4}(H_2) = \{\iota, (1\,2), (2\,3), (1\,3), (1\,2\,3), (1\,3\,2)\} \cong S_3,$$
and
$$N_{S_4}(H_3) = S_4$$
in $S_4$, and normalisers
$$N_{A_4}(H_2) = \{\iota, (1\,2\,3), (1\,3\,2)\} \quad \text{and} \quad N_{A_4}(H_3) = A_4$$
in $A_4$.
Again, we see the same phenomenon occurring:
$H_1$ is not normal in $S_4$, but it is normal in $N_{S_4}(H_1) \cong V_4$
$H_2$ is not normal in $S_4$, but it is normal in $N_{S_4}(H_2) \cong S_3$
$H_3$ is normal in both $S_4$ and its normaliser $N_{S_4}(H_3) = S_4$
$H_2$ is not normal in $A_4$, but it is normal in $N_{A_4}(H_2) = H_2 \cong A_3$
$H_3$ is normal in both $A_4$ and its normaliser $N_{A_4}(H_3) = A_4$
This happens in general: if $H \leq G$ is a subgroup of $G$ then $H$ is a
normal subgroup of $N_G(H)$. Moreover, $N_G(H)$ is the largest subgroup
of $G$ in which $H$ is normal, as the examples above show. When $H$ is
normal in $G$, its normaliser $N_G(H)$ is all of $G$; otherwise $N_G(H)$ is the
largest subgroup with this property, in some cases just $H$ itself.
A related concept is that of the centraliser of a subset of a group:
Definition 4.51 Let $G$ be a group, and let $S$ be a subset of $G$. Then
the centraliser of $S$ in $G$ is the set
$Z_G(S) = \{ g \in G : g s g^{-1} = s \text{ for all } s \in S \}$
$\phantom{Z_G(S)} = \{ g \in G : g s = s g \text{ for all } s \in S \}$

This looks almost identical to the definition of the normaliser $N_G(S)$,
but it's stricter: $N_G(S)$ consists of all elements of $G$ that fix $S$
setwise under conjugation (but might permute the elements of $S$) while
$Z_G(S)$ consists of the elements of $G$ that fix each individual element
of $S$. Equivalently, $Z_G(S)$ comprises those elements of $G$ which
commute with each element of $S$. The centraliser $Z_G(S)$ generalises the
concept of the centre $Z(G)$ of a group (Definition 3.16, page 78), in the
sense that $Z_G(G) = Z(G)$.
Example 4.52 Let $G = D_3$ and let
$S_1 = \{m_1, m_2, m_3\}$, $S_2 = \{e, r, r^2\}$, $S_3 = \{e, m_1\}$, $S_4 = \{r\}$.
Then
$Z_{D_3}(S_1) = \{e\}$, $Z_{D_3}(S_2) = \{e, r, r^2\}$,
$Z_{D_3}(S_3) = \{e, m_1\}$, $Z_{D_3}(S_4) = \{e, r, r^2\}$.

The next example concerns subgroups of $S_4$ and $A_4$.
Example 4.53 Let
$H_1 = \{\iota, (1\,2)\}$,
$H_2 = \{\iota, (1\,2\,3), (1\,3\,2)\}$,
$H_3 = \{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$.
Then these subgroups have centralisers
$Z_{S_4}(H_1) = \{\iota, (1\,2), (3\,4), (1\,2)(3\,4)\} \cong V_4$,
$Z_{S_4}(H_2) = \{\iota, (1\,2\,3), (1\,3\,2)\} \cong \mathbb{Z}_3$
and $Z_{S_4}(H_3) = \{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\} \cong V_4$
in $S_4$, and centralisers
$Z_{A_4}(H_2) = \{\iota, (1\,2\,3), (1\,3\,2)\} \cong \mathbb{Z}_3$
and $Z_{A_4}(H_3) = \{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\} \cong V_4$
in $A_4$.
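The centralisers in Example 4.53 can be confirmed in the same exhaustive way. This is an illustrative sketch only: permutations are assumed to be stored as 0-based tuples of images, and the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def from_cycles(*cycles, n=4):
    """Build a permutation of {1..n} from disjoint cycles (stored 0-based)."""
    images = list(range(n))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            images[a - 1] = b - 1
    return tuple(images)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def centraliser(G, S):
    """All g in G that commute with every element of S."""
    return {g for g in G if all(compose(g, s) == compose(s, g) for s in S)}

S4 = set(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}
e = from_cycles()  # identity permutation
H1 = {e, from_cycles((1, 2))}
H2 = {e, from_cycles((1, 2, 3)), from_cycles((1, 3, 2))}
H3 = {e, from_cycles((1, 2), (3, 4)), from_cycles((1, 3), (2, 4)),
      from_cycles((1, 4), (2, 3))}

for name, H in [("H1", H1), ("H2", H2), ("H3", H3)]:
    print(name, "centraliser in S4 has order", len(centraliser(S4, H)))
for name, H in [("H2", H2), ("H3", H3)]:
    print(name, "centraliser in A4 has order", len(centraliser(A4, H)))
```

Note that `centraliser(S4, S4)` returns the centre of $S_4$, in line with the remark that $Z_G(G) = Z(G)$.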
As these examples suggest, the centraliser of a subgroup is a subgroup:
Proposition 4.54 Let $G$ be a group, and let $H$ be a subgroup of $G$. Then
the centraliser $Z_G(H)$ is a subgroup of $G$.

Proof The identity $e$ lies in $Z_G(H)$ since $e h = h e$ for all $h \in H$.
If $g_1, g_2 \in Z_G(H)$ then $g_1 g_2$ is also in $Z_G(H)$ since
$(g_1 g_2) h = g_1 h g_2 = h (g_1 g_2)$
for all $h \in H$. If $g \in Z_G(H)$ then its inverse $g^{-1}$ is also in
$Z_G(H)$ since
$g^{-1} h = (h^{-1} g)^{-1} = (g h^{-1})^{-1} = h g^{-1}$
for all $h \in H$ (here we use the fact that $h^{-1}$ also lies in $H$).
Hence $Z_G(H)$ is a subgroup of $G$.
The third concept we need to introduce for our next application of the
First Isomorphism Theorem is that of an automorphism group. Recall
that an automorphism of a group $G$ is an isomorphism $f : G \to G$.
The automorphisms of a given group $G$ themselves form a group:
Proposition 4.55 Let $G$ be a group, and let $\operatorname{Aut}(G)$ be the
set of all automorphisms $f : G \to G$. Then $\operatorname{Aut}(G)$ forms a
group under the usual composition operation for homomorphisms.

Proof Composition of automorphisms is clearly associative, since
composition of functions is always associative as long as the composite
is actually defined. For an automorphism, the domain is the same as
the codomain, so the composite will always be defined; it will also
be a bijection (since the composite of two bijections is also a bijection)
and must also be a homomorphism (since the composite of two
homomorphisms is also a homomorphism). Therefore composition is a
suitable associative binary operation
$\circ : \operatorname{Aut}(G) \times \operatorname{Aut}(G) \to \operatorname{Aut}(G)$.
The identity map $\mathrm{id} : G \to G$ is clearly an automorphism of $G$,
and has the property that $f \circ \mathrm{id} = f = \mathrm{id} \circ f$ for
any $f \in \operatorname{Aut}(G)$. It thus acts as a suitable identity element
in $\operatorname{Aut}(G)$.
Every automorphism $f : G \to G$ has a well-defined, unique inverse
$f^{-1} : G \to G$, which is itself a bijective homomorphism and hence an
automorphism of $G$, such that
$f \circ f^{-1} = \mathrm{id} = f^{-1} \circ f$.
Thus $\operatorname{Aut}(G)$ is a group.
As with permutation groups, we write the multiplication right-to-left
rather than left-to-right in $\operatorname{Aut}(G)$. As remarked earlier
(Proposition 3.9, page 76), there is a particular class of automorphisms
that turn up in various circumstances:
Definition 4.56 Let $G$ be a group. An inner automorphism of $G$ is
an automorphism of the form
$\phi_g : G \to G;\ h \mapsto g h g^{-1}$
for some fixed $g \in G$ and all $h \in G$.
The inner automorphisms of a group form a group which we denote
$\operatorname{Inn}(G) = \{\phi_g : g \in G\}$,
and which is a subgroup of the full automorphism group $\operatorname{Aut}(G)$.

We now have the ingredients we need for the following elegant result,
an application of the First Isomorphism Theorem that describes the
connection between normalisers, centralisers and automorphisms.
Theorem 4.57 (The Normaliser–Centraliser Theorem) Let $H$ be a
subgroup of a group $G$. The quotient $N_G(H)/Z_G(H)$ is isomorphic to a
subgroup of the automorphism group $\operatorname{Aut}(H)$.

Proof Let $f : N_G(H) \to \operatorname{Aut}(H)$ be the map given by
$f(g) = \phi_g$, the inner automorphism of $G$ restricted to $H$, so that
$\phi_g : H \to H$ with $\phi_g(h) = g h g^{-1}$. (Since $g \in N_G(H)$,
conjugation by $g$ maps $H$ onto itself, so $\phi_g$ really is an
automorphism of $H$.)
This map $f$ is a homomorphism, since for any $g_1, g_2 \in N_G(H)$ and
$h \in H$ we have
$\phi_{g_1 g_2}(h) = (g_1 g_2) h (g_1 g_2)^{-1}$
$= (g_1 g_2) h (g_2^{-1} g_1^{-1})$
$= g_1 (g_2 h g_2^{-1}) g_1^{-1}$
$= \phi_{g_1}(\phi_{g_2}(h))$
$= (\phi_{g_1} \circ \phi_{g_2})(h)$
and hence $f(g_1 g_2) = f(g_1) \circ f(g_2)$ as required.
We want to apply the First Isomorphism Theorem to this homomorphism
$f$, so we need to figure out what its kernel is. The kernel
$\ker(f)$ will consist of all elements of $N_G(H)$ which map to the
identity element in $\operatorname{Aut}(H)$. This identity element is exactly
the identity homomorphism $\mathrm{id} : H \to H$. So we're looking for the
elements $g$ of $N_G(H)$ whose automorphism $\phi_g$ happens to be the
identity.
In other words, we want the elements $g \in N_G(H)$ for which $\phi_g(h) = h$
for all elements $h \in H$. But these are exactly those $g \in N_G(H)$ for
which $g h g^{-1} = h$ for all $h \in H$, and hence $\ker(f) = Z_G(H)$, the
centraliser of $H$ in $G$.
By the First Isomorphism Theorem, then,
$N_G(H)/\ker(f) = N_G(H)/Z_G(H) \cong \operatorname{im}(f) \leq \operatorname{Aut}(H)$
as required.
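The proof can be illustrated numerically for two subgroups of $S_4$. The sketch below is not from the original text: it assumes permutations are 0-based tuples, uses hypothetical helper names, and records each conjugation map $\phi_g$ restricted to $H$ as a sorted tuple of (element, image) pairs so that different maps can be compared.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conj(g, h):
    """Return g h g^{-1}."""
    return compose(compose(g, h), inverse(g))

def conjugation_on(g, H):
    """The map h -> g h g^{-1} on H, as a sorted tuple of (h, image) pairs."""
    return tuple(sorted((h, conj(g, h)) for h in H))

S4 = set(permutations(range(4)))
e = (0, 1, 2, 3)
H2 = {e, (1, 2, 0, 3), (2, 0, 1, 3)}                # generated by (1 2 3)
H3 = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # double transpositions

for name, H in [("H2", H2), ("H3", H3)]:
    N = {g for g in S4 if {conj(g, h) for h in H} == H}              # normaliser
    Z = {g for g in S4 if all(conj(g, h) == h for h in H)}           # centraliser
    image = {conjugation_on(g, H) for g in N}                        # im(f)
    kernel = {g for g in N if conjugation_on(g, H) == conjugation_on(e, H)}
    print(name, "|N| =", len(N), "|Z| =", len(Z), "|im(f)| =", len(image))
    assert kernel == Z                      # ker(f) is the centraliser
    assert len(image) * len(Z) == len(N)    # |N/Z| = |im(f)|
```

For `H2` the image has two elements, matching the fact that a cyclic group of order 3 has exactly two automorphisms; for `H3` the image has six elements.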
We can now use this result to tell us what it means if we take the
quotient of a group $G$ by its centre $Z(G)$: we get the inner automorphism
group of $G$.
Corollary 4.58 Let $G$ be a group. Then $G/Z(G) \cong \operatorname{Inn}(G)$.

Proof The normaliser $N_G(G)$ is $G$ itself, and the centraliser $Z_G(G)$
is, as remarked earlier, just the centre $Z(G)$ of $G$. The image of the
homomorphism $f : N_G(G) \to \operatorname{Aut}(G)$ is exactly the subgroup
$\operatorname{Inn}(G)$ of inner automorphisms of $G$. Hence, by the
Normaliser–Centraliser Theorem, $G/Z(G) \cong \operatorname{Inn}(G)$.
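As a small check of the corollary, the following sketch counts the distinct inner automorphisms of the dihedral group of order 8 and compares the count with the index of the centre. It is an illustration under stated assumptions rather than anything taken from the text: the group is modelled as the vertex permutations of a square that send edges to edges, and all names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Symmetries of a square with vertices 0, 1, 2, 3 in cyclic order:
# exactly those vertex permutations that map edges to edges.
edges = {frozenset((i, (i + 1) % 4)) for i in range(4)}
D4 = {p for p in permutations(range(4))
      if {frozenset(p[v] for v in edge) for edge in edges} == edges}

# The centre: elements commuting with everything in the group.
centre = {z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4)}

# Each inner automorphism is recorded by its values on a fixed ordering of D4.
ordered = sorted(D4)
inner = {tuple(compose(compose(g, h), inverse(g)) for h in ordered) for g in D4}

print("|G| =", len(D4), "|Z(G)| =", len(centre), "|Inn(G)| =", len(inner))
assert len(inner) == len(D4) // len(centre)
```

Equal orders do not by themselves prove an isomorphism, of course; the check only confirms that distinct cosets of the centre give distinct inner automorphisms in this case.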
Now recall (Definition 2.10, page 45) that if $G$ is a group, and $H$ and
$K$ are both subgroups of $G$, we denote by $HK$ the set
$HK = \{h k : h \in H, k \in K\}$.
Soon we'll study and prove the Second Isomorphism Theorem, but
to provide a little motivation we'll look at a couple of examples:
one involving the quaternion group $Q_8$ and the other involving the
symmetric group $S_4$ and the alternating group $A_4$.
Example 4.59 Let $G = Q_8$, the quaternion group we met in
Example 1.42, and let $H = \{\pm E, \pm I\}$ and $K = \{\pm E, \pm J\}$. Then
$HK = Q_8 = G$ and $H \cap K = \{\pm E\} \cong \mathbb{Z}_2$. (Proposition 2.9
says that $H \cap K$ must be a subgroup of $G$, but here it's fairly obvious.)
All of $H$, $K$ and $H \cap K$ are normal subgroups of $HK = G = Q_8$, so
we can use them to form quotient groups.
From Example 2.21 we can see that $G/H \cong \mathbb{Z}_2$, and similarly
$G/K \cong \mathbb{Z}_2$.
Also, $H \cap K = \{\pm E\} \cong \mathbb{Z}_2$ and $H \cong K \cong \mathbb{Z}_4$,
since $I$ and $J$ each have order 4. So $H/(H \cap K)$ has order 2, and
hence (compare Example 3.27) $H/(H \cap K) \cong \mathbb{Z}_2$.
Hence $HK/K \cong H/(H \cap K)$.

Example 4.60 Let $G = S_4$, let $K = A_4$, and let
$H = \{\iota, (1\,2\,3), (1\,3\,2)\}$.
Then $HK = A_4$ and $H \cap K = H$, so both $HK/K$ and $H/(H \cap K)$ are
trivial, and hence isomorphic to each other.

The behaviour exhibited in these examples, that $HK/K \cong H/(H \cap K)$,
is true in general. More than that, it turns out that we only really need
one of $H$ and $K$ to be a normal subgroup of $G$, but in that case we need
the following lemma, which is a generalisation of Proposition 2.9.
Lemma 4.61 Let $G$ be a group, let $H$ be a subgroup of $G$, and let $N$ be a
normal subgroup of $G$. Then $H \cap N$ is a normal subgroup of $H$.

Proof Proposition 2.9 tells us that the intersection $H \cap N$ is a subgroup
of $G$, and since it's clearly a subset of $H$ it must also be a subgroup of
$H$. So all that remains is to show that it's a normal subgroup of $H$.
Let $h \in H$ and $k \in H \cap N$. We need to show that the conjugate
$h k h^{-1}$ also belongs to $H \cap N$. It certainly belongs to $H$, since
$H$ is closed under the multiplication operation inherited from $G$, and it
must also belong to $N$ since $N$ is a normal subgroup of $G$ and hence
closed under conjugation. Therefore $h k h^{-1}$ must belong to the
intersection $H \cap N$, and thus $H \cap N$ is a normal subgroup of $H$.

We're now ready to state and prove the Second Isomorphism Theorem:
Theorem 4.62 (Second Isomorphism Theorem) Let $G$ be a group,
let $H$ be a (not necessarily normal) subgroup of $G$, and let $N$ be a normal
subgroup of $G$. Then
$HN/N \cong H/(H \cap N)$.

Proof Lemma 4.61 tells us that $H \cap N$ is a normal subgroup of $H$,
so the statement makes sense. We can use the First Isomorphism
Theorem (Theorem 4.40, page 119) to prove this result; the key is to
construct the right surjective homomorphism $f : HN \to H/(H \cap N)$. If we
can find such a homomorphism that happens to have $\ker(f) = N$, then the
Second Isomorphism Theorem will follow immediately.
A typical element of $HN$ is of the form $h n$ where $h \in H$ and $n \in N$,
while a typical element of the quotient $H/(H \cap N)$ is a coset
$k(H \cap N)$ for some $k \in H$.
With this in mind, we define a function $f : HN \to H/(H \cap N)$ mapping
the element $h n \in HN$ to the coset $h(H \cap N)$ in $H/(H \cap N)$.
We have to show four things: that $f$ is well-defined, that it's surjective
onto $H/(H \cap N)$, that it's a group homomorphism, and that its kernel
is the normal subgroup $N$.
Dealing with the first of these, let $h_1, h_2 \in H$ and $n_1, n_2 \in N$
such that
$h_1 n_1 = h_2 n_2$
in $HN$. Rearranging this, we get
$h_2^{-1} h_1 = n_2 n_1^{-1}$.
On the left-hand side, $h_2^{-1} h_1$ is clearly an element of $H$, while on
the right-hand side, $n_2 n_1^{-1}$ is clearly an element of $N$. Since they're
equal, it must be the case that both belong to $H$ and $N$. So both
sides, and specifically $h_2^{-1} h_1$, are in the intersection $H \cap N$.
Hence by Proposition 2.26, since $h_2^{-1} h_1 \in H \cap N$, the cosets
$h_1(H \cap N)$ and $h_2(H \cap N)$ are equal, and so
$f(h_1 n_1) = h_1(H \cap N) = h_2(H \cap N) = f(h_2 n_2)$.
Thus $f$ is well-defined.
To show that $f$ is a group homomorphism, we need the following
observation: given any $h \in H$ and $n \in N$ there exists some element
$k \in N$ such that $h n = k h$. This is because $N$, being a normal
subgroup, is closed under conjugation and so there is some element
$k \in N$ such that $h n h^{-1} = k$. Rearranging this, we get the desired
fact. In the same way, $n h = h k'$ where $k' = h^{-1} n h \in N$.
Then, for any $h_1, h_2 \in H$ and $n_1, n_2 \in N$ we have
$f((h_1 n_1)(h_2 n_2)) = f(h_1 (n_1 h_2) n_2)$
$= f(h_1 h_2 n_3 n_2)$ (for $n_3 = h_2^{-1} n_1 h_2 \in N$)
$= (h_1 h_2)(H \cap N)$
$= h_1(H \cap N)\, h_2(H \cap N)$
$= f(h_1 n_1)\, f(h_2 n_2)$
and hence $f$ is a homomorphism.
To see that $f$ is surjective, consider any $h \in H$ and form the coset
$h(H \cap N)$. We need to find some element $h n$ of $HN$ which maps to
this coset. Setting $n = e \in N$ to get the product $h e \in HN$ suffices
admirably, since $f(h e) = h(H \cap N)$ as required.
Finally, we need to show that $\ker(f) = N$. This kernel consists of all
elements $h n \in HN$ such that $f(h n) = H \cap N$, the identity coset.
Since $f(h n) = h(H \cap N)$, these are the products $h n$ for which
$h(H \cap N) = H \cap N$, and Proposition 2.26 tells us that this holds
exactly when $h$ also belongs to $N$. So $\ker(f) = (H \cap N)N$. But
this is exactly the same as $N$, since $H \cap N \subseteq N$, so
$\ker(f) = N$.
By the First Isomorphism Theorem, then, we have
$HN/N = HN/\ker(f) \cong \operatorname{im}(f) = H/(H \cap N)$
as required.
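To see the theorem at work in a case where $H \cap N$ is neither trivial nor all of $H$, one can take $G = S_4$, $H$ the cyclic subgroup generated by $(1\,2\,3\,4)$ and $N$ the normal subgroup of double transpositions. The sketch below is an illustration only: these particular subgroups, the 0-based tuple representation and the helper names are assumptions made for the example. It checks that the rule $hN \mapsto h(H \cap N)$ is well-defined and bijective.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_coset(x, S):
    """The left coset xS as a frozenset."""
    return frozenset(compose(x, s) for s in S)

e = (0, 1, 2, 3)
c = (1, 2, 3, 0)                                   # the 4-cycle (1 2 3 4)
c2 = compose(c, c)
c3 = compose(c, c2)
H = {e, c, c2, c3}                                 # cyclic subgroup of order 4
N = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # normal subgroup V4 of S4

HN = {compose(h, n) for h in H for n in N}
K = H & N                                          # the intersection H ∩ N

# Record, for each coset hN, every coset h(H ∩ N) it is sent to.
phi = {}
for h in H:
    phi.setdefault(left_coset(h, N), set()).add(left_coset(h, K))

well_defined = all(len(targets) == 1 for targets in phi.values())
images = [next(iter(targets)) for targets in phi.values()]
injective = len(set(images)) == len(images)
covers_domain = {left_coset(x, N) for x in HN} == set(phi)

print("|HN| =", len(HN), "|H ∩ N| =", len(K))
print("well-defined:", well_defined, "injective:", injective)
print("|HN/N| =", len(phi), "|H/(H ∩ N)| =", len({left_coset(h, K) for h in H}))
```

Here $HN$ turns out to have order 8, and both quotients have order 2.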
[Portrait of Emmy Noether. Image credit: Wikimedia Commons]
Emmy Noether (1882–1935) made important contributions to algebra and
physics: as well as the three Isomorphism Theorems as we know them
today, she is also remembered for her work on ring theory and for
Noether's Theorem, which describes the connection between symmetry and
conservation laws in physics.
Born in 1882 to a Jewish family in Erlangen, Bavaria, she was the daughter
of the algebraic geometer Max Noether and his wife Ida Kauffmann. She
displayed a talent for languages and qualified as a teacher of French and
English before deciding to study mathematics at university. At the time,
women were only permitted to study unofficially at German universities,
but she successfully persuaded the University of Erlangen to admit her,
and completed her doctoral thesis Über die Bildung des Formensystems der
ternären biquadratischen Form in 1907. She taught without pay at Erlangen
until David Hilbert and Felix Klein invited her to Göttingen in 1915,
where she taught (initially under Hilbert's name) until the early 1930s.
In 1933 she was dismissed from Göttingen due to laws passed by the
Nazi government forbidding Jewish academics from holding posts at German
universities. She was promptly offered a visiting professorship at Bryn
Mawr College in Pennsylvania, and worked there and at the Institute for
Advanced Study in Princeton, New Jersey, until her death in 1935.

The next example concerns a nested sequence of normal subgroups of
$S_4$, and will lead to the last big theorem in this chapter.
Example 4.63 Let
$G = S_4$, $H = A_4$ and $K = \{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$.
Then $H$ and $K$ are both normal subgroups of $G$, and $K$ is a normal
subgroup of $H$, so we can form the quotient groups
$G/H \cong \mathbb{Z}_2$, $G/K \cong S_3$, $H/K \cong \mathbb{Z}_3$.
The quotient $H/K$ is isomorphic to an index-2 (and hence normal)
subgroup of $G/K$, so we can form the quotient group $(G/K)/(H/K)$,
which happens to be isomorphic to $\mathbb{Z}_2$ and therefore also to $G/H$:
$G/H \cong (G/K)/(H/K)$
It is not immediately obvious that $H/K$ is normal in $G/K$, and
therefore that the quotient $(G/K)/(H/K)$ makes sense in general.
Nevertheless, this is indeed the case, as the next lemma shows.
Lemma 4.64 If $G$ is a group, and $H$ and $K$ are normal subgroups of $G$
such that $K$ is also a normal subgroup of $H$, then the quotient $H/K$ is
also a normal subgroup of the quotient $G/K$.

Proof We need to show that $H/K$ is closed under conjugation by
elements of $G/K$. The elements of $H/K$ are cosets of the form $hK$ for
some element $h \in H$, and the elements of $G/K$ are cosets of the form
$gK$ for some element $g \in G$.
Consider the conjugate $(gK)(hK)(gK)^{-1}$. This is equal to
$(gK)(hK)(g^{-1}K) = ((g h)K)(g^{-1}K) = (g h g^{-1})K$.
But since $H$ is a normal subgroup of $G$, the conjugate $g h g^{-1}$ lies
in $H$ and hence the coset $(g h g^{-1})K$ is in the quotient $H/K$ as
required, and therefore $H/K$ is a normal subgroup of $G/K$.
Having proved this, we can now state and prove the Third Isomorphism
Theorem.
Theorem 4.65 (Third Isomorphism Theorem) Let $G$ be a group, and
let $H$ and $K$ be normal subgroups of $G$ such that $K$ is also a normal
subgroup of $H$. Then
$G/H \cong (G/K)/(H/K)$.

Proof As with the Second Isomorphism Theorem, we apply the First
Isomorphism Theorem to a suitable group homomorphism.
The elements of $(G/K)/(H/K)$ are cosets of $H/K$ determined by
elements of $G/K$, which are themselves cosets of $K$ determined by
elements of $G$. So an element of $G/K$ will be a coset of the form $gK$,
where $g \in G$, and hence an element of $(G/K)/(H/K)$ will be a coset
of the form $(gK)(H/K)$. With this in mind, define
$f : G \to (G/K)/(H/K);\ g \mapsto (gK)(H/K)$.
This map $f$ is well-defined, but we still need to show that it's a
homomorphism, that it's surjective onto $(G/K)/(H/K)$, and that its
kernel is $H$.
Suppose that $g_1, g_2 \in G$. Then
$f(g_1 g_2) = ((g_1 g_2)K)(H/K)$
$= ((g_1 K)(g_2 K))(H/K)$
$= ((g_1 K)(H/K))((g_2 K)(H/K))$
$= f(g_1)\, f(g_2)$
and thus $f$ is a homomorphism.
This homomorphism $f$ is clearly surjective. Let $(gK)(H/K)$ be some
element of $(G/K)/(H/K)$. Then $g \in G$ is mapped to this coset by $f$.
Finally, the kernel $\ker(f)$ consists of those elements of $G$ which are
mapped to the coset $H/K$ in $(G/K)/(H/K)$. These are those elements
$g \in G$ for which $gK \in H/K$. But the elements satisfying this condition
are exactly the elements of $H$. So $\ker(f) = H$, and by the First
Isomorphism Theorem we have
$G/H = G/\ker(f) \cong \operatorname{im}(f) = (G/K)/(H/K)$
as required.
The Third Isomorphism Theorem says that if we have three groups
$K \trianglelefteq H \trianglelefteq G$, with $K$ also normal in $G$, then we
get the same result if we factor $G$ by $H$ as we do if we factor both $G$
and $H$ by $K$ and then factor the resulting quotients: we can either
collapse $G$ to $G/H$ in one step, or go via the intermediate
factor-by-$K$ step, and in both cases we get the same answer.
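The map used in the proof can be built explicitly for $G = S_4$, $H = A_4$ and $K$ the subgroup of double transpositions, as in Example 4.63. This is a rough sketch with hypothetical helper names, assuming permutations stored as 0-based tuples: an element $(gK)(H/K)$ of the double quotient is represented as the set of cosets $(gh)K$ with $h \in H$.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def left_coset(x, S):
    """The left coset xS as a frozenset."""
    return frozenset(compose(x, s) for s in S)

G = set(permutations(range(4)))                    # S4
H = {p for p in G if is_even(p)}                   # A4
e = (0, 1, 2, 3)
K = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # V4

def f(g):
    """The coset (gK)(H/K), stored as the set of cosets (gh)K for h in H."""
    return frozenset(left_coset(compose(g, h), K) for h in H)

kernel = {g for g in G if f(g) == f(e)}
image = {f(g) for g in G}

print("|G/K| =", len({left_coset(g, K) for g in G}))
print("|H/K| =", len({left_coset(h, K) for h in H}))
print("|(G/K)/(H/K)| =", len(image), " |G/H| =", len(G) // len(H))
assert kernel == H                  # ker(f) = H, as in the proof
assert len(image) == len(G) // len(H)
```

Both routes collapse $S_4$ to a group of order 2, in agreement with Example 4.63.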
The three Isomorphism Theorems are generally attributed to the German
mathematician Emmy Noether (1882–1935), who stated them
in a little more generality (and using different notation) than this in
her important 1927 paper Abstrakter Aufbau der Idealtheorie in
algebraischen Zahl- und Funktionenkörpern (Abstract composition of ideal
theory in algebraic number and function fields) [8], although some of the
ideas were present in her earlier work. Hints can also be seen in the work
of Richard Dedekind (1831–1916) and in the classic text Traité des
substitutions et des équations algébriques (Treatise on permutations and
algebraic equations) [9] by Camille Jordan (1838–1922).
[8] E Noether, Abstrakter Aufbau der Idealtheorie in algebraischen Zahl-
und Funktionenkörpern, Mathematische Annalen 96 (1927) 26–61.
[9] C Jordan, Traité des substitutions et des équations algébriques,
Gauthier-Villars (1870).
Summary

In this chapter we studied homomorphisms (Definition 4.1, page 106),
functions $f : G \to H$ from one group to another that preserve some
aspects of the group structure, in the sense that
$f(g_1 g_2) = f(g_1) f(g_2)$ for any elements $g_1, g_2 \in G$. We've already
met the concept of a bijective homomorphism, an isomorphism
(Definition 1.17, page 11), and in this chapter we also saw
several examples of injective homomorphisms or monomorphisms
(sometimes denoted with a tailed arrow $f : G \rightarrowtail H$) and
surjective homomorphisms or epimorphisms (sometimes denoted with a
two-headed arrow $f : G \twoheadrightarrow H$). In particular, the identity
isomorphism $\mathrm{id}_G : G \to G$ is always defined for any group $G$, but
more generally an isomorphism from a group to itself is called an
automorphism, and a (not necessarily bijective) homomorphism from a group
to itself is called an endomorphism. For any groups $G$ and $H$ there is a
unique zero homomorphism $z : G \to H$ which maps every element of $G$ to
the identity element $e_H$ in $H$.
Reassuringly, composing two homomorphisms yields another
homomorphism (Proposition 4.11, page 107).
Important examples of monomorphisms are given by inclusion
homomorphisms $i : H \hookrightarrow G$ mapping a subgroup $H \leq G$ into a
larger group (Examples 4.14 and 4.16, page 108). Important examples of
epimorphisms include projection homomorphisms (Example 4.15, page 108).
Examples of both can often be constructed by considering the particular
context of the groups in question: in the case of dihedral or other
isometry groups, we can sometimes use our geometric intuition to define
homomorphisms (Example 4.17, page 109).
Another class of homomorphisms derives from the quotient groups
we learned about in the last chapter: given any normal subgroup
$H$ of a group $G$ we can define a canonical, surjective quotient
homomorphism $q : G \to G/H$ (Example 4.18, page 109). In particular,
factoring a group $G$ by its commutator subgroup $[G, G]$ yields the
abelianisation homomorphism $G \to G^{ab} = G/[G, G]$ (Example 4.21,
page 110). These quotient homomorphisms, in particular the abelianisation
homomorphism, have an interesting and useful property: given a
homomorphism $f : G \to H$ there is a well-defined induced homomorphism
$f^{ab} : G^{ab} \to H^{ab}$ (Proposition 4.22, page 110; Example 4.23,
page 111).
Homomorphisms map identity elements to identity elements, and
inverses to inverses, in the sense that if $f : G \to H$ is a homomorphism,
then $f(e_G) = e_H$ and $f(g^{-1}) = f(g)^{-1}$ for any $g \in G$
(Proposition 4.24, page 112).
Since homomorphisms are functions, we can discuss their images;
we denote the image of a homomorphism $f : G \to H$ by $\operatorname{im}(f)$.
It turns out that the image of a homomorphism is a subgroup of its
codomain (Proposition 4.25, page 112), but not necessarily a normal
subgroup (Example 4.26, page 113). If a homomorphism $f : G \to H$ is
injective, then its image $\operatorname{im}(f)$ is isomorphic to the domain
$G$ of $f$ (Example 4.27, page 113).
Certain problems in linear algebra reduce to studying the kernel of
some linear map, and with this in mind we introduced the concept
of the kernel $\ker(f)$ of a group homomorphism $f : G \to H$, which
comprises all elements of the domain $G$ which map to the identity $e_H$
in the codomain $H$ (Definition 4.28, page 114). The kernel of $f$ must at
least contain the identity element $e_G$ (Proposition 4.24, page 112). The
kernel is a subgroup of the domain just as the image is a subgroup of the
codomain. More than that, however, the kernel is a normal subgroup, and
all normal subgroups of a given group $G$ can be characterised as the
kernel of some suitable homomorphism defined on $G$ (Proposition 4.29,
page 115). As noted earlier, the kernel must contain the identity
$e_G$; if, however, the kernel only contains the identity, then this is the
same as saying that the homomorphism in question is injective
(Proposition 4.35, page 116).
Generalising the definition of the kernel we arrived at the concept of
the inverse image or preimage (Definition 4.36, page 117). We found that
the inverse image of a subgroup of the codomain is itself a subgroup of
the domain, and the inverse image of a normal subgroup is also a normal
subgroup (Proposition 4.37, page 117). In fact, for any homomorphism
$f : G \to H$, the inverse image $f^{-1}(h)$ of a single element $h$ of the
codomain $H$ will either be the empty set (if $h \notin \operatorname{im}(f)$)
or a coset of the kernel $\ker(f) \trianglelefteq G$ (Proposition 4.38,
page 117). Putting this together with Proposition 2.29 we learned that if
$|\ker(f)| = n$, then $f$ is an $n$-to-1 map from $G$ onto
$\operatorname{im}(f) \leq H$ (Corollary 4.39, page 118).
The remainder of the chapter was devoted to the three Isomorphism
Theorems. The first of these was motivated by the observation
(Proposition 4.29, page 115) that any normal subgroup can be regarded as
the kernel of some appropriate homomorphism, and we wondered what relation
the corresponding quotient group might have to the homomorphism in
question. The answer was given by the First Isomorphism Theorem
(Theorem 4.40, page 119): for any homomorphism $f : G \to H$, the quotient
group $G/\ker(f)$ is isomorphic to the image subgroup $\operatorname{im}(f)$
in $H$.
After a short discussion of the usefulness of studying homomorphic
images of groups, and an investigation of some of the structural
information we can recover from them, we looked at various applications
of the First Isomorphism Theorem. One of the most important of
these was the Normaliser–Centraliser Theorem (Theorem 4.57, page 126),
which as a corollary (Corollary 4.58, page 127) gave us an elegant and
concise interpretation of what we get when we factor a group $G$ by its
centre $Z(G)$. During this discussion we introduced four new concepts: the
normaliser $N_G(S)$ of a subset $S \subseteq G$ of a group $G$
(Definition 4.46, page 122), the centraliser $Z_G(S)$ of a subset
$S \subseteq G$ in $G$, the automorphism group $\operatorname{Aut}(G)$
(Proposition 4.55, page 126) and the subgroup $\operatorname{Inn}(G)$ of
inner automorphisms (Definition 4.56, page 126).
The normaliser $N_G(S)$ is always a subgroup of $G$ (Proposition 4.49,
page 123), and in the case where the subset $S$ is a subgroup of $G$ it so
happens that $S$ is a normal subgroup of $N_G(S)$; in fact, we can regard
$N_G(S)$ as the largest subgroup of $G$ in which $S$ is normal
(Exercise 4.1, page 134). The centraliser $Z_G(S)$ is a generalisation of
the concept of the centre $Z(G)$ of a group $G$ (Definition 3.16, page 78),
and also a slightly stricter object than the normaliser $N_G(S)$. It turns
out that $Z_G(S)$ is always a normal subgroup of $N_G(S)$, and the
Normaliser–Centraliser Theorem tells us that if $H$ is a subgroup of $G$,
then the quotient $N_G(H)/Z_G(H)$ is isomorphic to a subgroup of the
automorphism group $\operatorname{Aut}(H)$. Considering the special case
where $H = G$ we find that the centraliser $Z_G(G)$ is just the centre
$Z(G)$, and the normaliser $N_G(G)$ is $G$ itself; this leads to the
discovery that the quotient $G/Z(G)$ is isomorphic to the inner
automorphism group $\operatorname{Inn}(G)$.
The Second Isomorphism Theorem (Theorem 4.62, page 128) can be proved by
applying the First Isomorphism Theorem to a suitable homomorphism. Suppose
that $G$ is a group, $H$ is a subgroup of $G$ and $N$ is a normal subgroup
of $G$. A short lemma confirms that the intersection $H \cap N$ is not just
a subgroup of $G$ (Proposition 2.9, page 45), but also a normal subgroup of
$H$ (Lemma 4.61, page 128). The Second Isomorphism Theorem states that the
quotient $HN/N$ (where $HN$ denotes the subgroup consisting of all products
of an element of $H$ with an element of $N$) is isomorphic to the quotient
$H/(H \cap N)$.
The Third Isomorphism Theorem (Theorem 4.65, page 130) can also be proved
by applying the First Isomorphism Theorem to an appropriate homomorphism.
It states that if we have three groups $G$, $H$ and $K$ such that $H$ and
$K$ are normal subgroups of $G$, and $K$ is also a normal subgroup of $H$,
then the quotient $(G/K)/(H/K)$ is isomorphic to the quotient $G/H$.

References and further reading


Some details on Emmy Noether's work on the isomorphism theorems (and her
contributions to algebraic topology) can be found in the following article:
C McLarty, Emmy Noether's set-theoretic topology: from Dedekind to the rise
of functors, in: The Architecture of Modern Mathematics: Essays in history
and philosophy, ed. by J Gray and J Ferreirós, Oxford University Press
(2006) 211–235

Exercises
4.1 Show that if $H$ is a subgroup of $G$ then $H$ is a normal subgroup of $N_G(H)$. Show also that
$N_G(H)$ is the largest subgroup of $G$ in which $H$ is normal.
4.2 Let $G$ be a group. Show that the set $\operatorname{Inn}(G)$ of inner automorphisms of $G$ is a group. Show that
it is a subgroup of $\operatorname{Aut}(G)$.
5 Presentations

I am not yet so lost in lexicography, as
to forget that words are the daughters of
the earth, and that things are the sons of
heaven. Language is only the instrument
of science, and words are but the
signs of ideas: I wish, however, that
the instrument might be less apt to decay,
and that signs might be permanent,
like the things which they denote.
Samuel Johnson (1709–1784),
preface to A Dictionary of the English
Language (1755)

In Chapter 1 we met the concept of a generator of a group
(Definition 1.22, page 12). We said that if $\{g_1, g_2, \ldots\}$ is a
(possibly infinite) collection of elements of a group $G$, and if any
element of $G$ can be expressed as a finite product of some or all of these
elements and their inverses, then $G$ is generated by $g_1, g_2, \ldots$,
which we call generators of $G$.
In the case of an infinite cyclic group $\mathbb{Z}$, we can generate the
entire group by just using the element $1 \in \mathbb{Z}$ and its inverse
$-1$. Or, if we're using multiplicative notation, we can consider the group
$\{\ldots, t^{-2}, t^{-1}, 1, t, t^2, \ldots\}$ of integer powers of some
symbol $t$. This is clearly isomorphic to $\mathbb{Z}$ by the mapping
$t^k \mapsto k \in \mathbb{Z}$. So, we can represent $\mathbb{Z}$ by
$\langle 1 \rangle$, the group generated additively by the single element
1, or $\langle t \rangle$, the group generated multiplicatively by the
single element $t$. (In the process we'll have to be careful to say whether
we're using additive or multiplicative notation unless it's obvious from
context.)
For the finite cyclic group $\mathbb{Z}_n$ we can also generate the entire
group by a single element $1 \in \mathbb{Z}_n$. Or, using multiplicative
notation again, we can consider the 0th to $(n-1)$th powers of some single
element $t$, to get $\{1, t, t^2, \ldots, t^{n-1}\}$. This latter group is
obviously isomorphic to $\mathbb{Z}_n$ by the same mapping as before:
$t^k \mapsto k \in \mathbb{Z}_n$. Again, we can denote $\mathbb{Z}_n$ by
$\langle 1 \rangle$ or $\langle t \rangle$. Or, for that matter, by
$\langle k \rangle$ or $\langle t^k \rangle$ for any $k \in \mathbb{Z}_n$
such that $\gcd(k, n) = 1$ (Proposition 1.26, page 13).
We called groups which can be generated by a single element cyclic
groups (Definition 1.23, page 12). But we can construct other groups as
products of small numbers of generators too. The dihedral group $D_n$ can
be reconstructed by just using two of its elements: the basic $2\pi/n$
rotation element $r$ and the reflection $m_1$. The Klein group
$V_4 = \{e, a, b, c\}$ can be generated by two elements as well: $a$ and $b$.
We call groups that can be constructed in this way by a finite number
of their elements finitely generated. Groups which require an infinite
set of elements are said to be infinitely generated; we'll see later that
the additive group $\mathbb{Q}$ of rational numbers is infinitely generated.
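For a finite group of permutations, the subgroup generated by a few elements can be found by repeatedly multiplying by the generators until nothing new appears. The sketch below is illustrative only: it models the rotation $r$ and a reflection $m$ of a regular $n$-gon as permutations of the vertices $0, \ldots, n-1$, and the names `compose` and `generate` are hypothetical.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Return the set of all finite products of the given permutations.

    In a finite group every inverse is a positive power, so closing up
    under multiplication by the generators is enough.
    """
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

n = 5
r = tuple((i + 1) % n for i in range(n))  # rotation through 2*pi/n
m = tuple((-i) % n for i in range(n))     # a reflection fixing vertex 0

a = (1, 0, 3, 2)                          # (1 2)(3 4)
b = (2, 3, 0, 1)                          # (1 3)(2 4)

print("order of <r>:", len(generate([r])))
print("order of <r, m>:", len(generate([r, m])))
print("order of <a, b>:", len(generate([a, b])))
```

The three results have orders $n$, $2n$ and 4, corresponding to the cyclic group $\mathbb{Z}_n$, the dihedral group $D_n$ and the Klein group $V_4$ respectively.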
In this chapter, we're going to study how to construct groups from
(finite or infinite) sets of generators. As is so often the case, the
details are a bit more complicated than the above examples suggest.
In particular, we've glossed over a subtle but important point: it's all
very well giving a set of generating elements for a group, but we also
need to specify how those generators interact with each other, and
what their orders are. It's not enough to say that $\mathbb{Z}_n$ is
generated by a single generator $t$; we need also to specify that
$t^n = 1$. We can write this compactly as
$\mathbb{Z}_n = \langle t : t^n = 1 \rangle$
which we interpret as the group generated by a single generator $t$,
subject to the constraint or relation $t^n = 1$. We call an expression of
this type a presentation for $\mathbb{Z}_n$, and will develop this idea in
more detail over the rest of this chapter.

5.1 Free groups

I hope we shall make everything as
plain and as simple to you as we can.
I would never use a long word, even,
where a short one would answer the
purpose.
Oliver Wendell Holmes Sr (1809–1894),
Medical Essays, 1842–1882 (1891) 302

The infinite cyclic group $\mathbb{Z}$ is the most general possible group
we can construct from a single generator, and as we saw just now,
we can make the finite cyclic group $\mathbb{Z}_n$ by taking $\mathbb{Z}$
and imposing a constraint on it. Or, to put it another way, $\mathbb{Z}_n$
is the same as $\mathbb{Z}$ except that we replace all occurrences of $t^n$
by 1. Amongst other things, this means that $t^{-1} = t^{n-1}$, that
$t^{-2} = t^{n-2}$ and so on. We want to devise a similar way of
constructing non-cyclic groups like $D_n$ and $V_4$, and to
do this we first have to generalise our idea of $\mathbb{Z}$ as the most
general possible group constructible from a single generator: we want to
know how to construct the most general possible group with two or more
generators.
Suppose that we have two symbolic variables $x$ and $y$. At present
these are just symbols: don't worry about whether they're numbers,
symmetry operations, permutations or anything else. For the moment,
we're just going to manipulate them in a purely formal and abstract
way.
We can form finite strings of these two symbols, like $xxxyxyyx$, $yy$,
$xyxy$ and $x$. We call strings like this words in $x$ and $y$; we should
also allow the trivial case of the empty word which we'll usually denote
$e$ (although some books use other symbols, such as a pair of empty
parentheses ( ), 0 or 1).
We can simplify words by collapsing together repeated adjacent symbols
of the same type and using integer exponents: the first two
examples become, respectively, $x^3 y x y^2 x$ and $y^2$. We don't lose any
information by doing this: if we really wanted to, we could re-expand
the collapsed words to get the original forms.
What we're not allowed to do is to arbitrarily reorder the symbols in a
given word. So although $x^2 y^3 = xxyyy$, neither is equal to $yxyxy$ or
$y^3 x^2$, for example.
We can, if we so choose, interpolate, prepend or append any number of
copies of the empty word $e$, so
$x^2 y^3 = xxyyy = xexyyeye = xexy^2eye$.
Typically, we'll reduce words by deleting any occurrences of $e$ (except
in the case of the empty word $e$ itself) and combining repeated adjacent
instances of the same symbols. Any word thus has a unique simplest
reduced form.
Given two words, we can concatenate them to get a new word. For
example, $xyx \cdot y^3 x = xyxy^3x$. This might in some cases result in
repeated adjacent symbols of the same type, as with
$xy^2 \cdot y^4x^5 = xy^2y^4x^5$ (which reduces to $xy^6x^5$), so we
follow such a concatenation with any necessary reduction operations
in order to end up with a word in its simplest form.
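The collapse-and-concatenate procedure is easy to experiment with. In the sketch below, which is only an illustration, words are ordinary Python strings over the symbols `x` and `y`, the empty string stands for the empty word, and the hypothetical helper `collapse` produces the exponent form described above.

```python
from itertools import groupby

def collapse(word):
    """Write a word in exponent form, e.g. 'xxxyxyyx' -> 'x^3 y x y^2 x'."""
    parts = []
    for symbol, run in groupby(word):
        length = len(list(run))
        parts.append(symbol if length == 1 else f"{symbol}^{length}")
    # The empty word has no symbols at all; display it as e.
    return " ".join(parts) if parts else "e"

print(collapse("xxxyxyyx"))           # x^3 y x y^2 x
print(collapse("yy"))                 # y^2
print(collapse("xyy" + "yyyyxxxxx"))  # x y^6 x^5

# Order matters: these two words use the same symbols but differ.
print(collapse("xxyyy") == collapse("yyyxx"))  # False
```

Storing the fully expanded string and collapsing only for display keeps concatenation trivial: it is just string concatenation, and the grouping of repeated symbols happens when the word is printed.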
It's fairly obvious that this concatenation–reduction operation is
associative: given three words $w_1$, $w_2$ and $w_3$, the concatenation
$(w_1 \cdot w_2) \cdot w_3$ gives the same word (when all necessary
reductions have taken place) as the concatenation
$w_1 \cdot (w_2 \cdot w_3)$.
The empty word serves as an identity with respect to this concatenation
operation, in the sense that for any word $w$ formed from the symbols
$x$ and $y$ it's always the case that

$e \cdot w = w = w \cdot e$.

If we let W be the set of all finite, reduced words in the two symbols
x and y, then the concatenationreduction operation is an associative
binary operation defined on this set, and the empty word e behaves
like an identity. What weve just shown is the following:
Proposition 5.1 Let $W$ be the set of finite, reduced words in two symbols
$x$ and $y$, and let $* \colon W \times W \to W$ be the
concatenation-reduction operation. Then $(W, *)$ is a monoid.
In fact, this is the most general possible monoid we can get from two
generating symbols. Monoids are all very well, and as remarked in
Chapter 1 they have numerous applications in parts of mathematics
and computer science, but at the moment we're primarily interested in
groups. Is there some way we can modify or extend this construction
to get the most general possible group generated by two symbols $x$
and $y$? As it happens, yes there is, and the key is to think about what's
missing from a monoid that stops it being a group, namely inverses.
So, back to our symbols $x$ and $y$. Introduce two more symbols $x^{-1}$
and $y^{-1}$. These will function as the inverses of, respectively, $x$
and $y$, but at the moment we're still going to treat them as formal
symbols.
Just as before, we'll consider finite words in these four symbols, but we
need to slightly extend our idea of reduction. In addition to combining
repeated, adjacent symbols of the same type using positive integer
exponents, we'll also combine repetitions of the new symbols $x^{-1}$
and $y^{-1}$ using negative integer exponents. Additionally, we'll cancel
adjacent pairs of $x$ and $x^{-1}$, and adjacent pairs of $y$ and $y^{-1}$.
For example, the word $xxx^{-1}x^{-1}x^{-1}yyxy^{-1}y$ reduces to
$xx^{-1}x^{-1}yyx$, then to $x^{-1}yyx$ and finally to $x^{-1}y^2x$, while
the word $y^{-1}y^{-1}y^{-1}yxxx$ reduces to $y^{-2}x^3$ in a similar way.
Again, we don't allow arbitrary reordering, just cancellation and
grouping of adjacent symbols.
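The reduction procedure just described can be sketched in a few lines of code. This is only an illustration under an assumed encoding that is not part of the text: a word is a Python string, the lower-case letters `x` and `y` stand for the generators, and the upper-case letters `X` and `Y` stand for their formal inverses. The function names are hypothetical.

```python
def reduce_word(word):
    """Cancel adjacent symbol/inverse pairs until none remain."""
    stack = []
    for letter in word:
        # swapcase() turns "x" into "X" (its formal inverse) and vice versa.
        if stack and stack[-1] == letter.swapcase():
            stack.pop()           # an adjacent pair such as xX or Yy cancels
        else:
            stack.append(letter)
    return "".join(stack)


def concat(u, v):
    """The concatenation-reduction operation on two words."""
    return reduce_word(u + v)


print(reduce_word("xxXXXyyxYy"))  # first example above
print(reduce_word("YYYyxxx"))     # second example above
```

The two printed strings are `Xyyx` and `YYxxx`, which are the ungrouped forms of $x^{-1}y^2x$ and $y^{-2}x^3$. Grouping repeated letters into exponents is purely a matter of display, so the sketch leaves words ungrouped.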
So, let $W$ be the set of all finite, reduced words in the symbols $x$, $y$,
$x^{-1}$ and $y^{-1}$, and let $* \colon W \times W \to W$ be the
concatenation-reduction operation; this is a well-defined, associative
binary operation on the set $W$. As before, the empty word $e$ serves as
an identity.
All that remains is to ensure the existence of inverses in $W$. We know
that $x$ and $y$ have inverses, because we introduced $x^{-1}$ and $y^{-1}$
for exactly this purpose. But we need to do a little more work to ensure
that any word in $W$ has a well-defined inverse. The clue is given by
the discussion of inverses of products on page 9, where we deduced
that $(g * h)^{-1} = h^{-1} * g^{-1}$ for any two elements $g$ and $h$ of
some group $G$. For any word
$w = x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k}$ where
$x_{i_1}, \dots, x_{i_k} \in \{x, y\}$ and $n_1, \dots, n_k \in \mathbb{Z}$,
the inverse of $w$ is

$$w^{-1} = x_{i_k}^{-n_k} \dots x_{i_1}^{-n_1}.$$

To see this, concatenate $w$ and $w^{-1}$ either way round, reduce, and see
that you end up with the empty word $e$:

$$w * w^{-1} = x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k} * x_{i_k}^{-n_k} \dots x_{i_1}^{-n_1} = e$$
$$w^{-1} * w = x_{i_k}^{-n_k} \dots x_{i_1}^{-n_1} * x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k} = e$$

We now have all the ingredients to make a group.
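The inverse formula is easy to check by machine. The sketch below assumes the same hypothetical string encoding as before (lower case for $x$ and $y$, upper case for their formal inverses); reversing the string and swapping the case of every letter is then exactly the rule $w^{-1} = x_{i_k}^{-n_k} \dots x_{i_1}^{-n_1}$.

```python
def reduce_word(word):
    """Cancel adjacent symbol/inverse pairs until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


def inverse(word):
    """Reverse the word and replace each symbol by its formal inverse."""
    return word[::-1].swapcase()


w = "xxYxy"                                   # x^2 y^-1 x y
assert inverse(w) == "YXyXX"                  # y^-1 x^-1 y x^-2
assert reduce_word(w + inverse(w)) == ""      # w * w^-1 = e
assert reduce_word(inverse(w) + w) == ""      # w^-1 * w = e
```

The empty string plays the role of the empty word $e$ here.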


Proposition 5.2 Let $F_2$ be the set of finite, reduced words in the
symbols $x$, $y$, $x^{-1}$ and $y^{-1}$. Let
$* \colon F_2 \times F_2 \to F_2$ denote the concatenation-reduction
operation. Then $(F_2, *)$ is a group: the free group of rank 2.
This group, which we've renamed $F_2$ and given a special name, is
the most general possible group we can make from two generators $x$
and $y$. Before we go any further, we state some formal definitions.
Definition 5.3 Let $X = \{x_1, x_2, \dots\}$ be a (possibly infinite) set of
formal symbols, and let $X^{-1} = \{x_1^{-1}, x_2^{-1}, \dots\}$ be a set of
corresponding formal inverses.
Denote by $X^{\pm} = X \cup X^{-1}$ the union of these two sets; we call
this an alphabet.
A word in $X^{\pm}$ is a (possibly empty) string of symbols from $X^{\pm}$;
the empty word is denoted by $e$.
A word is said to be reduced if all possible cancellations of adjacent
symbols and their inverses have been performed; we may choose
to write a reduced word in an even more compact form by using
exponents to group adjacent symbols of the same type.
Denote the set of reduced words in $X^{\pm}$ by $W(X)$.

This set $W(X)$ can be partitioned into a number of pairwise disjoint
subsets in a fairly natural way. Since the reduced form of a given
word is unique, it makes sense to talk about its length: the number
of symbols in the string we get if we ungroup all of the adjacent
repetitions (but don't introduce any empty words or cancellable pairs).
For a given word $w \in W(X)$ we denote its length by $l(w)$.
For example, $l(x^3yx^{-1}) = l(xxxyx^{-1}) = 5$, since when we write the
word out in full we end up with a string of five symbols. We define
the length of the empty word to be zero: $l(e) = 0$.
Using this concept we can partition $W(X)$ into the following subsets:

$$\begin{aligned}
W_0(X) &= \{w \in W(X) : l(w) = 0\} = \{e\} \\
W_1(X) &= \{w \in W(X) : l(w) = 1\} = X^{\pm} \\
W_2(X) &= \{w \in W(X) : l(w) = 2\} \\
&\;\;\vdots \\
W_n(X) &= \{w \in W(X) : l(w) = n\}
\end{aligned}$$
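For a small alphabet the subsets $W_n(X)$ can be listed by brute force. The following sketch assumes the hypothetical string encoding in which upper-case letters are formal inverses, and simply filters out any string containing a cancellable adjacent pair. The count $2r(2r-1)^{n-1}$ quoted in the comments for $n \geq 1$ and $|X| = r$ is a standard fact that is not stated in the text.

```python
from itertools import product


def reduced_words(generators, n):
    """All reduced words of length n in the generators and their inverses."""
    alphabet = generators + generators.swapcase()   # e.g. "xy" -> "xyXY"
    words = []
    for letters in product(alphabet, repeat=n):
        # A word is reduced when no letter is followed by its inverse.
        if all(a != b.swapcase() for a, b in zip(letters, letters[1:])):
            words.append("".join(letters))
    return words


for n in range(4):
    print(n, len(reduced_words("xy", n)))
# With two generators (r = 2) the sizes are 1, 4, 12, 36:
# one empty word, then 2r(2r - 1)^(n - 1) words of each length n >= 1.
```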

We're now ready to define the free group $F(X)$ on a set $X$ of
symbols. There are two different ways of doing this, one concrete and
one abstract. The concrete, constructive definition is essentially a
formalisation and generalisation of the preceding discussion.
Definition 5.4 Let $X = \{x_1, x_2, \dots\}$ be a (possibly infinite) set of
symbols. Then the free group $F(X)$ with basis or generating set $X$
is the set $W(X)$ of reduced words in the symbols $x_1, x_2, \dots$ and their
formal inverses $x_1^{-1}, x_2^{-1}, \dots$, equipped with the binary operation
$* \colon W(X) \times W(X) \to W(X)$ given by the concatenation-reduction
operation described above.
The rank of $F(X)$ is the cardinality of the set $X$; in the finite case
$|X| = n$ we will often denote the free group $F(X)$ by $F_n$.
The other definition is more abstract, and it's not immediately obvious
that it's equivalent (or even that the objects it describes actually exist).
Nevertheless, there are certain technical advantages to this definition as
well, and some books present this as the main definition and relegate
the previous one to a concrete construction.
Definition 5.5 Let $X$ be a set. The free group with basis $X$ is the
group $F(X)$ satisfying the property that for any other group $G$ and any
function $f \colon X \to G$, there is a unique homomorphism
$\hat{f} \colon F(X) \to G$ which extends $f$. That is,
$\hat{f}(x) = f(x) \in G$ for any $x \in X$.
Equivalently, the diagram

$$\begin{array}{ccc}
X & \xrightarrow{\;f\;} & G \\
{\scriptstyle i}\big\downarrow & \nearrow{\scriptstyle \hat{f}} & \\
F(X) & &
\end{array}$$

commutes, in the sense that $(\hat{f} \circ i)(x) = f(x)$ for all $x \in X$.
(Here $i \colon X \hookrightarrow F(X)$ is the obvious inclusion map.)

This is a generalisation of a situation you might be familiar with
from linear algebra. Suppose $S = \{v_1, v_2, \dots\}$ is a linearly
independent set of vectors in some vector space $V$. Then for any other
vector space $W$ and function $f \colon S \to W$ there's a unique linear
map $\hat{f} \colon \operatorname{span}(S) \to W$ which extends $f$ to the
span of $S$ (that is, the subspace of $V$ consisting of all possible linear
combinations of the vectors in $S$). This linear map $\hat{f}$ is determined
completely by what $f$ does to each of the vectors in $S$, and much the
same thing happens in the corresponding scenario for groups. In a sense,
the free group $F(X)$ on a set $X$ is a sort of group-theoretic analogue of
the linear algebraic concept of the span of a set of vectors.
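To make the extension property concrete, here is a minimal sketch in which the target group $G$ is the symmetric group on three points. Everything about the encoding is an assumption made for illustration: permutations are tuples, words are strings with upper-case letters as formal inverses, and `extend` is a hypothetical helper that builds the homomorphism from a choice of images for the generators.

```python
def reduce_word(word):
    """Cancel adjacent symbol/inverse pairs until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[i] for i in q)


def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


IDENTITY = (0, 1, 2)


def extend(f):
    """Extend a function on the generators to every word, letter by letter."""
    def f_hat(word):
        result = IDENTITY
        for letter in word:
            g = f[letter] if letter.islower() else invert(f[letter.lower()])
            result = compose(result, g)
        return result
    return f_hat


f = {"x": (1, 0, 2), "y": (0, 2, 1)}    # images chosen freely: two transpositions
f_hat = extend(f)
u, v = "xyX", "xxYx"
# The homomorphism condition: the image of u * v is the product of the images.
assert f_hat(reduce_word(u + v)) == compose(f_hat(u), f_hat(v))
```

No condition on the chosen images is needed: any assignment of group elements to $x$ and $y$ gives a homomorphism, which is exactly what "free" means.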
Definition 5.5 contains some important subtlety. The functions
$f \colon X \to G$ and $i \colon X \hookrightarrow F(X)$ map an object of one
type (the set $X$) to an object of another type (the groups $G$ and
$F(X)$). Strictly speaking, this isn't really allowed: functions map sets to
sets, and homomorphisms map groups to groups.
Happily, we can resolve this apparent type mismatch error and gain a
deeper understanding of how the free group construction works. The
key is to notice that the function $f \colon X \to G$ treats the group $G$
just as an ordinary set, ignoring any extra structure it might have. The
group structure only becomes relevant when we look at the extended
homomorphism $\hat{f} \colon F(X) \to G$, so to be really precise we need a
way of clarifying exactly when we're talking about the group $G$ and when
we're talking about its underlying set.
Let $U(G)$ be the underlying set of $G$,$^4$ without any group structure.
Then Definition 5.5 says that functions $f \colon X \to U(G)$ are in bijective
correspondence with homomorphisms $\hat{f} \colon F(X) \to G$.
If we denote by $\mathrm{Hom}_{\mathrm{Set}}(X, Y)$ the set of all possible
functions from the set $X$ to the set $Y$, and by
$\mathrm{Hom}_{\mathrm{Group}}(G, H)$ the set of all possible
homomorphisms from the group $G$ to the group $H$, then we can write
this as

$$\mathrm{Hom}_{\mathrm{Set}}(X, U(G)) \cong \mathrm{Hom}_{\mathrm{Group}}(F(X), G).\,^6$$

We'll make use of this fact a little later, but now we can show that
Definitions 5.4 and 5.5 are equivalent.

4 We can regard $U$ as a device that turns groups into sets, and we can
view the free group construction $F$ as a machine that turns sets into
groups. More than that, $U$ turns group homomorphisms into functions,
and $F$ turns functions into group homomorphisms. So $U$ and $F$
translate neatly between the world of sets and the world of groups.
Maps of this type are called functors. The free group construction
$F \colon \mathrm{Set} \to \mathrm{Group}$ is a functor from the category of
sets to the category of groups. The underlying set map
$U \colon \mathrm{Group} \to \mathrm{Set}$ is a functor from the category of
groups to the category of sets; it is an example of something called a
forgetful functor.
A full discussion of categories and functors is beyond the scope of this
book, but the branch of mathematics in which they play a fundamental
part, called category theory (but sometimes flippantly referred to as
"abstract nonsense") is at the same time perhaps the most abstract
subject in modern mathematics and one of the most powerful: it provides
a suitably precise language and framework within which to discuss many
different sub-branches of mathematics in generality.
Much of algebraic topology, for example, one of the highlights of
twentieth- and twenty-first century mathematics, is concerned with the
study of functors from categories of topological objects to categories of
algebraic objects such as groups and rings. Category theory also has
important applications in theoretical computer science.
Interested readers are directed to the classic work Categories for the
Working Mathematician by Saunders Mac Lane (1909-2005),$^5$ one of the
main pioneers of the field. Many years after its original publication in
1971, it remains one of the most comprehensive and lucid treatments of
the subject.
5 S Mac Lane, Categories for the Working Mathematician, second edition,
Graduate Texts in Mathematics 5, Springer (1998)
6 This is an example of something category theorists call an adjunction
or adjoint relationship. The functor $F$ is said to be a left adjoint of $U$,
and $U$ is a right adjoint of $F$.

Proposition 5.6 Let $X = \{x_1, x_2, \dots\}$ be a (possibly infinite) set, and
let $F(X)$ be the free group of reduced words in the alphabet $X^{\pm}$. For
any group $G$ and function $f \colon X \to G$ there is a unique homomorphism
$\hat{f} \colon F(X) \to G$ which extends $f$.
Conversely, let $F$ be a group and let $X \subseteq F$ be some subset of $F$.
If, for any group $G$ and function $f \colon X \to G$ there exists a unique
homomorphism $\hat{f} \colon F \to G$ which extends $f$, then $F$ is
isomorphic to the free group $F(X)$ of reduced words in the alphabet
$X^{\pm}$.

Proof Every element $w$ of $F(X)$ has a unique expression as a reduced
word in the elements of the alphabet $X^{\pm}$:

$$w = x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k}$$

where $n_1, \dots, n_k \in \mathbb{Z}$ and $x_{i_1}, \dots, x_{i_k} \in X$ are a
collection of elements of $X$ (some of which might be the same as each
other).
Then, given a function $f \colon X \to G$, the only way we can extend this to
a homomorphism $\hat{f} \colon F(X) \to G$ is by mapping the word

$$w = x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k}$$

to the product

$$f(x_{i_1})^{n_1} f(x_{i_2})^{n_2} \cdots f(x_{i_k})^{n_k}$$

in $G$, and mapping the empty word $e$ in $F(X)$ to the identity $e_G$ in $G$.
Nothing else satisfies the homomorphism condition. So, the required
homomorphism $\hat{f}$ exists and is unique.
To prove the converse, suppose that for any $f \colon X \to G$ there exists a
unique homomorphism $\hat{f} \colon F \to G$ extending $f$. Let
$\phi \colon F(X) \to F$ be the unique homomorphism which maps any
reduced word in $F(X)$ to the element of $F$ obtained by regarding this
word as a product of the elements of $F$. That is, suppose $(F, \cdot)$ is the
hypothesised group. Then for any
$w = x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k}$ in $F(X)$ we set

$$\phi(x_{i_1}^{n_1} x_{i_2}^{n_2} \dots x_{i_k}^{n_k}) = x_{i_1}^{n_1} \cdot x_{i_2}^{n_2} \cdot \ldots \cdot x_{i_k}^{n_k}.$$
We want to show that $\phi$ is an isomorphism. To do this, let
$i \colon X \hookrightarrow F(X)$ be the obvious inclusion function. Then we
can extend this to a unique homomorphism
$\psi = \hat{i} \colon F \to F(X)$.
The composition $\psi \circ \phi \colon F(X) \to F(X)$ is the identity
homomorphism on any element $x_i$ of $X$, since $\phi$ maps $x_i$ in $X$ to
the corresponding element in $F$, and $\psi$ maps it straight back to the
corresponding element of $X \subseteq F(X)$, since $\psi$ is just the
extension of the inclusion map $i \colon X \hookrightarrow F(X)$ to the whole
of $F$.
Since $\psi \circ \phi$ is the identity on $X$ it must also be the identity on
$F(X)$, since if all the elements of $X$ are mapped to themselves, then so
must all possible reduced words made from them and their inverses.
Also, $\phi \circ \psi$ must be the identity on $F$. Consider the inclusion map
$j \colon X \hookrightarrow F$. By the hypothesis, this extends to the identity
homomorphism $\mathrm{id}_F \colon F \to F$. But $\phi \circ \psi$ also
extends $j$ to all of $F$. The hypothesis said that any such function extends
to a unique homomorphism, so the composite $\phi \circ \psi$ must equal the
identity map $\mathrm{id}_F$.
We've now shown that there exist homomorphisms
$\phi \colon F(X) \to F$ and $\psi \colon F \to F(X)$, with
$\phi \circ \psi = \mathrm{id}_F$ and $\psi \circ \phi = \mathrm{id}_{F(X)}$.
This is equivalent to saying that both $\phi$ and $\psi$ are isomorphisms,
and so $F(X) \cong F$.
In Definitions 5.4 and 5.5 we referred to the free group with a particular
basis, a phrasing that quietly implies that there is just one such group
with that basis. The following proposition justifies this:
Proposition 5.7 Suppose that $X_1$ and $X_2$ are (possibly infinite) sets such
that there exists a bijection $f \colon X_1 \to X_2$. Then
$F(X_1) \cong F(X_2)$.
Conversely, suppose that $F(X_1)$ and $F(X_2)$ are the free groups generated
by two (possibly infinite) sets $X_1$ and $X_2$, and that
$F(X_1) \cong F(X_2)$. Then there exists a bijection $f \colon X_1 \to X_2$.

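Proof Let $i_1 \colon X_1 \hookrightarrow F(X_1)$ and
$i_2 \colon X_2 \hookrightarrow F(X_2)$ be the appropriate inclusion maps.
We can compose $i_1$ with $f^{-1}$ to get a function
$i_1 \circ f^{-1} \colon X_2 \to F(X_1)$, and we can compose $i_2$ with $f$ to
get a function $i_2 \circ f \colon X_1 \to F(X_2)$.
By the fundamental mapping property of free groups, we can extend
$i_1 \circ f^{-1}$ to get a unique homomorphism
$\phi \colon F(X_2) \to F(X_1)$, and we can also extend $i_2 \circ f$ to get a
unique homomorphism $\psi \colon F(X_1) \to F(X_2)$, as depicted by the
following diagrams:

$$\begin{array}{ccccc}
X_1 & \xrightarrow{\;f\;} & X_2 & \xhookrightarrow{\;i_2\;} & F(X_2) \\
{\scriptstyle i_1}\big\downarrow & & & \nearrow{\scriptstyle \psi} & \\
F(X_1) & & & &
\end{array}
\qquad
\begin{array}{ccccc}
X_2 & \xrightarrow{\;f^{-1}\;} & X_1 & \xhookrightarrow{\;i_1\;} & F(X_1) \\
{\scriptstyle i_2}\big\downarrow & & & \nearrow{\scriptstyle \phi} & \\
F(X_2) & & & &
\end{array}$$

(Compare these diagrams with the one in Definition 5.5.)
For any $x \in X_1$ we have
$(\phi \circ \psi)(x) = \phi(f(x)) = f^{-1}(f(x)) = x$. So $\phi \circ \psi$
extends the inclusion map $i_1 \colon X_1 \hookrightarrow F(X_1)$ to the whole
of $F(X_1)$. But so does the identity map
$\mathrm{id}_{F(X_1)} \colon F(X_1) \to F(X_1)$, hence
$\phi \circ \psi = \mathrm{id}_{F(X_1)}$. A similar argument shows that
$\psi \circ \phi = \mathrm{id}_{F(X_2)}$ and hence both $\phi$ and $\psi$ are
the required isomorphisms.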
To prove the converse, suppose that $F(X_1) \cong F(X_2)$. Then for any
group $G$, the set $\mathrm{Hom}_{\mathrm{Group}}(F(X_1), G)$ of
homomorphisms $F(X_1) \to G$ is in bijective correspondence with the set
$\mathrm{Hom}_{\mathrm{Group}}(F(X_2), G)$ of group homomorphisms
$F(X_2) \to G$.
By the fundamental mapping property of free groups, we also know
that $\mathrm{Hom}_{\mathrm{Group}}(F(X_1), G)$ is in bijective
correspondence with the set $\mathrm{Hom}_{\mathrm{Set}}(X_1, U(G))$ of
functions $X_1 \to G$, and $\mathrm{Hom}_{\mathrm{Group}}(F(X_2), G)$ is in
bijective correspondence with the set
$\mathrm{Hom}_{\mathrm{Set}}(X_2, U(G))$ of functions $X_2 \to G$. It must
follow that the sets $\mathrm{Hom}_{\mathrm{Set}}(X_1, U(G))$ and
$\mathrm{Hom}_{\mathrm{Set}}(X_2, U(G))$ are in bijective correspondence
with each other.
In particular, this means that they must have the same cardinality:

$$|\mathrm{Hom}_{\mathrm{Set}}(X_1, U(G))| = |\mathrm{Hom}_{\mathrm{Set}}(X_2, U(G))|$$

In general, the number of possible functions from a set $A$ with
cardinality $a$ to a set $B$ with cardinality $b$ is given by $b^a$. This is
because each of the $a$ elements of $A$ has $b$ possible elements in $B$ that
it could map to, so there are $b \times \dots \times b = b^a$ possible distinct
functions $A \to B$.
Bearing this in mind, if we choose $G$ to be some finite group, say
$\mathbb{Z}_2$, then we can see that

$$|\mathrm{Hom}_{\mathrm{Set}}(X_1, U(\mathbb{Z}_2))| = |\mathbb{Z}_2|^{|X_1|} = 2^{|X_1|}$$

and

$$|\mathrm{Hom}_{\mathrm{Set}}(X_2, U(\mathbb{Z}_2))| = |\mathbb{Z}_2|^{|X_2|} = 2^{|X_2|}$$

so $2^{|X_1|} = 2^{|X_2|}$ and therefore $|X_1| = |X_2|$ as required.
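The counting step can be checked directly for small finite sets. The sketch below enumerates every function from a set $X$ to the underlying set of $\mathbb{Z}_2$ and confirms that there are $2^{|X|}$ of them. The helper name `all_functions` and the labels used for the elements of $X$ are made up for the illustration.

```python
from itertools import product


def all_functions(domain, codomain):
    """Every function domain -> codomain, each represented as a dict."""
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]


Z2 = [0, 1]   # the underlying set U(Z_2)

for X in (["x1"], ["x1", "x2"], ["x1", "x2", "x3"]):
    functions = all_functions(X, Z2)
    # b^a functions from a set of size a to a set of size b
    assert len(functions) == len(Z2) ** len(X)
    print(len(X), len(functions))
```

Sets of different finite sizes therefore give different counts, which is the point of the argument.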
The final step in the second half of this proof works for infinite sets
as well as finite ones, although a detailed discussion would be a bit
too much of a digression here. For finite sets it's obvious: just take
base-2 logarithms of both sides. It would also have worked if we'd
replaced $\mathbb{Z}_2$ with $\mathbb{Z}_3$, $V_4$, a dihedral group $D_n$,
or any other nontrivial finite group.
Corollary 5.8 Let $m, n \in \mathbb{N}$ be two positive integers. Then
$F_m \cong F_n$ if and only if $m = n$.
The really important thing this proposition (and the immediate corollary
which follows it) tells us is that the rank of a free group (the
cardinality of its basis) is well-defined: any other set with the same
cardinality will generate an isomorphic free group. This is analogous
to the fact that the dimension of a vector space is well-defined: given
any finite-dimensional vector space $V$ over some field $K$, any two
bases for $V$ will have the same number of elements.
Next we'll look at the order $|w|$ of a given word $w$ in a free group, but
first we need to introduce a new concept, a slight variation on the idea
of a reduced word:
Definition 5.9 Let $w = x_{i_1} \dots x_{i_k}$ be a reduced word in a free
group $F(X)$ on some set $X$, where $x_{i_1}, \dots, x_{i_k} \in X^{\pm}$. Then
$w$ is cyclically reduced if $x_{i_1} \neq x_{i_k}^{-1}$. That is, if we
cyclically permute $w$ to get a new word
$v = x_{i_k} x_{i_1} \dots x_{i_{k-1}}$ then this new word $v$ is also reduced.

For example, the word

$$x_1^3 x_2^{-1} x_1 x_3^2 x_2^{-1} = x_1 x_1 x_1 x_2^{-1} x_1 x_3 x_3 x_2^{-1}$$

in $F_3$ is cyclically reduced, but the word

$$x_1^3 x_2^{-1} x_1 x_3^2 x_2^{-1} x_1^{-2} = x_1 x_1 x_1 x_2^{-1} x_1 x_3 x_3 x_2^{-1} x_1^{-1} x_1^{-1}$$

isn't, since a single cyclic permutation yields
$x_1^{-1} x_1^3 x_2^{-1} x_1 x_3^2 x_2^{-1} x_1^{-1}$, which reduces to
$x_1^2 x_2^{-1} x_1 x_3^2 x_2^{-1} x_1^{-1}$, which in turn is reduced but not
cyclically reduced.
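The two examples can be tested mechanically. In this sketch the generators $x_1$, $x_2$, $x_3$ are encoded (purely as an assumption for the illustration) by the letters `a`, `b`, `c`, with `A`, `B`, `C` for their formal inverses, so the first word above becomes `aaaBaccB`.

```python
def is_reduced(word):
    """True when no letter is immediately followed by its formal inverse."""
    return all(a != b.swapcase() for a, b in zip(word, word[1:]))


def is_cyclically_reduced(word):
    """Reduced, and the first letter is not the inverse of the last."""
    if not is_reduced(word):
        return False
    return len(word) < 2 or word[0] != word[-1].swapcase()


first = "aaaBaccB"       # x1^3 x2^-1 x1 x3^2 x2^-1
second = "aaaBaccBAA"    # the same word followed by x1^-2

assert is_cyclically_reduced(first)
assert is_reduced(second) and not is_cyclically_reduced(second)

# One cyclic permutation of the second word moves its last letter to the front,
# creating the cancellable pair "Aa" described above.
rotated = second[-1] + second[:-1]
assert not is_reduced(rotated)
```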
Recall that an element $g \neq e$ of some group $G$ is a torsion element
if its order $|g|$ is finite.$^7$ A group $G$ with no elements of finite order
(except for the identity $e$) is said to be torsion free. Free groups have
no torsion elements:
7 Definition 1.18, page 11.
Proposition 5.10 Let $F$ be a free group generated by some set $X$. Then $F$
is torsion free.

Proof Let $w \in F$ be a nontrivial reduced word in the alphabet $X^{\pm}$.
Any such word is conjugate to a cyclically reduced word, in the sense
that $w = uvu^{-1}$ for some $u, v \in F$ where $v$ is cyclically reduced.
If $w$ is already cyclically reduced then $u = e$ is the empty word,
otherwise $u$ consists of the subword at the beginning of $w$ that cancels
with a corresponding inverse subword at the end of $w$, while $v$ is the
non-cancelling middle section of $w$.
If $v$ is cyclically reduced, then so is $v^2$, and for that matter $v^n$ for
any $n \in \mathbb{N}$. So the length $l(v^n)$ of $n$ concatenated copies of a
cyclically reduced word $v$ is equal to $n\,l(v)$.
Now we consider the length of $w$ and its subwords $u$ and $v$. Let
$r = l(w)$ and $s = l(u)$. Then $l(v) = r - 2s$, which is positive if
$w \neq e$. Then $w^n = (uvu^{-1})^n = uv^nu^{-1}$ and hence

$$l(w^n) = n\,l(v) + 2\,l(u) = n(r - 2s) + 2s.$$

This can't be less than $n$, so $l(w^n) \geq n$ for all $n \geq 1$.
In particular, this means that for a nonempty word $w$, any positive
power $w^n$ can't have zero length, which means that it can't be the
empty word. Hence there is no $n \in \mathbb{N}$ for which $w^n = e$ and so
$w$ can't have finite order. Thus $F$ has no torsion elements.
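The length formula in this proof can be checked on the earlier example. The sketch below uses the assumed letter encoding (`a`, `b`, `c` for $x_1$, $x_2$, $x_3$ and upper case for inverses); `cyclic_decompose` is a hypothetical helper that peels matching letters off both ends of a reduced word to find $u$ and $v$ with $w = uvu^{-1}$.

```python
def reduce_word(word):
    """Cancel adjacent symbol/inverse pairs until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


def cyclic_decompose(word):
    """Split a reduced word w as u v u^-1 with v cyclically reduced."""
    u, v = "", word
    while len(v) >= 2 and v[0] == v[-1].swapcase():
        u += v[0]          # the first letter cancels against the last
        v = v[1:-1]
    return u, v


w = "aaaBaccBAA"                     # x1^3 x2^-1 x1 x3^2 x2^-1 x1^-2
u, v = cyclic_decompose(w)           # u = "aa", v = "aBaccB"
r, s = len(w), len(u)

for n in range(1, 6):
    # l(w^n) = n (r - 2s) + 2s, which is never zero for a nonempty word
    assert len(reduce_word(w * n)) == n * (r - 2 * s) + 2 * s
```

For this word $r = 10$ and $s = 2$, so $l(w^n) = 6n + 4$.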
The following important theorem is introduced now, but the proof is
rather technical and we'll postpone the details until Section 5.B.
Theorem 5.11 (The Nielsen-Schreier Theorem) Let $F$ be a free group,
and let $H$ be a subgroup of $F$. Then $H$ is itself a free group.
Furthermore, if the index $|F : H| = s$ of $H$ in $F$ and
$\operatorname{rank}(F) = r$ are both finite, then
$\operatorname{rank}(H) = (r - 1)s + 1$.
The next proposition tells us that except for the case of the infinite
cyclic group $\mathbb{Z} \cong F_1$, free groups are as nonabelian as they
can be.
Proposition 5.12 Let $F$ be a free group generated by some set $X$, and
consider two elements $u, v \in F$. Then $uv = vu$ if and only if there exists
some $w \in F$ and $m, n \in \mathbb{Z}$ such that $u = w^m$ and $v = w^n$.
Proof The "if" direction is obvious: suppose $u = w^m$ and $v = w^n$.
Then

$$uv = w^m w^n = w^{m+n} = w^n w^m = vu.$$

The "only if" direction is more involved. Let $u$ and $v$ be reduced
words in $F$, and let $r = l(u)$ and $s = l(v)$. We proceed by induction on
$r + s$, and without loss of generality we can assume that
$0 \leq r \leq s$.
If $r = 0$ then $u = e = v^0$ so we can set $w = v$ and let $m = 0$ and
$n = 1$. This confirms the induction basis.
Now suppose that $r > 0$. If we concatenate $u$ and $v$ then it may be
the case that part of the end of $u$ cancels with part of the beginning of
$v$. Let $h$ be the length of this cancelling subword at the end of $u$, and
equivalently the length of the cancelling subword at the beginning of
$v$. (It's entirely possible that $h = 0$ if no cancellation occurs.)
Then $l(uv) = l(u) + l(v) - 2h = r + s - 2h$. Since $uv = vu$ we must
also have $2h$ letters cancelling when we concatenate $v$ and $u$. We have
three cases to consider:
Case 1 ($h = 0$) Here there are no cancellations when we concatenate
$u$ and $v$ or when we concatenate $v$ and $u$. So

$$uv = u_1 \dots u_r v_1 \dots v_r \dots v_s = v_1 \dots v_r \dots v_s u_1 \dots u_r = vu$$

tells us that $u_1 = v_1$, $u_2 = v_2$, all the way up to $u_r = v_r$. This
means we can write $v = uz$ where $z = v_{r+1} \dots v_s$. Since $uv = vu$ it
follows that $v$ must be in the centraliser $Z_F(u)$ of the element $u$ in $F$.
But $u \in Z_F(u)$ as well, so $z = u^{-1}v$ must also be in $Z_F(u)$ and
hence $uz = zu$.
The length $l(z) = s - r < l(v)$ and so by the induction hypothesis
applied to $u$ and $z$ it follows that $u = w^p$ and $z = w^q$ for some
$w \in F$ and $p, q \in \mathbb{Z}$. Then $u = w^p$ and $v = w^{p+q}$ as
claimed.
Case 2 ($h = r$) Here, the cancelling portion of $u$ is the entirety of
$u$, which means that $v = u^{-1}z$ for some word $z \in F$, where
$l(z) = s - r < l(v)$. We can apply a similar argument to the one we used in
the previous case, by appealing once more to the inductive hypothesis.
We have $uv = vu$ which means that $uu^{-1}z = u^{-1}zu$, and hence
$z = u^{-1}zu$, so $uz = zu$. Therefore $u = w^p$ and $z = w^q$ for some
$w \in F$ and $p, q \in \mathbb{Z}$, so $u = w^p$ and
$v = u^{-1}z = w^{q-p}$.
Case 3 ($0 < h < r$) Here we have

$$uv = u_1 \dots u_{r-h} v_{h+1} \dots v_s
\quad\text{and}\quad
vu = v_1 \dots v_{s-h} u_{h+1} \dots u_r.$$

By the hypothesis, these are equal, and so in particular we have
$u_1 = v_1$ and $u_r = v_s$. Since $h > 0$ we have at least some
cancellation, so that tells us that $u_r = v_1^{-1}$ and $v_s = u_1^{-1}$.
Putting this together we see that

$$u_r = v_1^{-1} = u_1^{-1} = v_s$$

and thus there exist words $a, b \in F$ such that $u = u_1 a u_1^{-1}$ and
$v = u_1 b u_1^{-1}$, where

$$a = u_2 \dots u_{r-1}, \qquad b = v_2 \dots v_{s-1}, \qquad l(a) = r - 2, \qquad l(b) = s - 2.$$

Then $uv = vu$ means that $u_1 ab u_1^{-1} = u_1 ba u_1^{-1}$, which implies
that $ab = ba$. By the induction hypothesis we then have $a = z^p$ and
$b = z^q$ for some $p, q \in \mathbb{Z}$ and $z \in F$. Hence

$$u = u_1 z^p u_1^{-1} = (u_1 z u_1^{-1})^p,$$
$$v = v_1 z^q v_1^{-1} = u_1 z^q u_1^{-1} = (u_1 z u_1^{-1})^q.$$

Setting $w = u_1 z u_1^{-1}$, $m = p$ and $n = q$ the result follows.
What this proposition tells us is that no two words $u$ and $v$ in a free
group $F$ commute with each other unless they happen to both be
powers of the same element. A free group $F$ of rank 2 or greater is
therefore no more abelian than it absolutely has to be.
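A quick experiment illustrates both directions of the proposition in $F_2$. This is only a spot check on a handful of words, not a proof, and it again assumes the hypothetical string encoding with upper-case letters as formal inverses.

```python
def reduce_word(word):
    """Cancel adjacent symbol/inverse pairs until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


def commute(u, v):
    """True when uv and vu reduce to the same word."""
    return reduce_word(u + v) == reduce_word(v + u)


w = "xyX"                       # the conjugate x y x^-1, as in Case 3
u = reduce_word(w * 2)          # w^2 = x y^2 x^-1
v = reduce_word(w * 3)          # w^3 = x y^3 x^-1
assert commute(u, v)            # powers of a common word commute

assert not commute("x", "y")    # the generators themselves do not
assert not commute("xy", "yx")  # nor do these two words of equal length
```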

5.2 Generators and relations

For the relations of words are in pairs first.
For the relations of words are sometimes in oppositions.
For the relations of words are according to their distances from the pair.
Christopher Smart (1722-1771),
Jubilate Agno (1759-1763), Fragment B, part 4

In this section we will develop the machinery of group presentations.
Our first ingredient is the next proposition, which shows that
any group can be regarded as a quotient of a free group.
Proposition 5.13 Let $G$ be a group, and let $X$ be a set of generators for
$G$. Then $G$ is isomorphic to a quotient of the free group $F(X)$.
Proof There is a well-defined homomorphism $f \colon F(X) \to G$ which
maps any word $w = x_{i_1}^{n_1} \dots x_{i_k}^{n_k}$ in $F(X)$ onto the
corresponding product of generators in $G$.
This homomorphism is surjective, since $G$ is generated by the elements
of $X$. Then

$$F(X)/\ker(f) \cong \operatorname{im}(f) = G$$

by the First Isomorphism Theorem.
We now have all we need to formally define the notion of a group
presentation: we have a concrete definition of the free group on a set of
generators, and Proposition 5.13 tells us that any group is isomorphic
to a quotient of such a group.
Going back to our motivating example of the finite cyclic group
$\mathbb{Z}_n = \{1, t, t^2, \dots, t^{n-1}\}$ (which for the moment we'll
write using multiplicative notation), this is constructed from a single
generator $t$ subject to the relation $t^n = 1$. In the context of
Proposition 5.13 we can think of this as the free group
$F_1 \cong \mathbb{Z}$ on a single generator $t$, modulo some normal
subgroup which we can think of as the kernel of some surjective
homomorphism $F_1 \twoheadrightarrow \mathbb{Z}_n$.
By the fundamental property of free groups, any such homomorphism
is completely determined by what its generators map to. In this case
there's only one generator $t$ to worry about, and as long as we map
it to a generator of $\mathbb{Z}_n$, the corresponding homomorphism will be
surjective.
So let's take the homomorphism $f \colon F_1 \to \mathbb{Z}_n$ which maps
the generator $t \in F_1$ to the generator $t \in \mathbb{Z}_n$. What is the
kernel of this homomorphism? It's everything in $F_1$ that maps to the
identity $1 \in \mathbb{Z}_n$. But these elements are exactly the powers
$t^k$ of $t$ for which $k$ is a multiple of $n$.
By Proposition 5.13 we know that $\mathbb{Z}_n \cong F_1/\ker(f)$ where
$\ker(f) = \{t^k : n \mid k\} \trianglelefteq F_1$.
This says that the kernel $\ker(f)$ consists of all the words in $F_1$ which
are powers of the word $t^n$, in other words $\ker(f)$ is generated by the
word $t^n$.
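The description of this kernel can be checked numerically for a particular $n$. In the sketch below a word in $F_1$ is a string of `t` and `T` (the formal inverse of $t$), and $\mathbb{Z}_n$ is written additively as the integers modulo $n$, so the identity is 0 rather than 1. Both choices are assumptions made for the illustration, as is the value $n = 6$.

```python
def exponent(word):
    """The integer k such that the word reduces to t^k."""
    return word.count("t") - word.count("T")


def f(word, n):
    """Image of a word under F_1 -> Z_n, written additively as k mod n."""
    return exponent(word) % n


def power_of_t(k):
    """The reduced word t^k for any integer k."""
    return "t" * k if k >= 0 else "T" * (-k)


n = 6
kernel_exponents = [k for k in range(-12, 13) if f(power_of_t(k), n) == 0]
print(kernel_exponents)
# Only multiples of n appear, so each kernel element is a power of t^n.
assert all(k % n == 0 for k in kernel_exponents)
```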
We could just as easily write this scenario as

$$\mathbb{Z}_n \cong \langle t \rangle / \langle t^n \rangle$$

but we'll usually adopt the notation we saw briefly at the beginning
of the chapter,

$$\mathbb{Z}_n \cong \langle t : t^n = 1 \rangle$$

or, equivalently,

$$\mathbb{Z}_n \cong \langle t : t^n \rangle$$

when it's understood implicitly that the word or words to the right of
the colon are equal to the identity.
The group $\mathbb{Z} \oplus \mathbb{Z}$ can be constructed by taking a
suitable free group, in this case $F_2 = \langle x, y \rangle$, and
introducing some appropriate relations that force everything to commute.
This is in some senses fairly straightforward, because all we need to do is
to ensure that the generators $x$ and $y$ commute, and since every other
element of our desired group can be regarded as a reduced word in these
generators and their inverses, all such elements will commute.
This can be effected by imposing the relation $xy = yx$, to get the
presentation

$$\mathbb{Z} \oplus \mathbb{Z} \cong \langle x, y : xy = yx \rangle.$$

Alternatively, replacing the relation $xy = yx$ with the relator
$xyx^{-1}y^{-1}$ (which is also the commutator $[x, y]$) gives us

$$\mathbb{Z} \oplus \mathbb{Z} \cong \langle x, y : xyx^{-1}y^{-1} \rangle.$$

Another way of looking at this is to observe that $xyx^{-1}y^{-1}$
generates, as a normal subgroup, the commutator subgroup $[F_2, F_2]$,
which is exactly the kernel of the projection homomorphism
$F_2 \twoheadrightarrow \mathbb{Z} \oplus \mathbb{Z} \cong F_2/[F_2, F_2]$.
We'll come back to this in the next section, when we study
finitely-generated abelian groups.
But now, before looking at any more examples, it's time to introduce
some formal definitions.
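Definition 5.14 Let $G$ be a group and let $S$ be some subset of $G$. The
normal closure $\overline{S}$ or $\langle S^G \rangle$ of $S$ in $G$ is the
intersection of all normal subgroups of $G$ which contain $S$.

The following proposition gives a constructive definition for
$\langle S^G \rangle$:

Proposition 5.15 Let $G$ be a group and let $S$ be some subset of $G$. Then
the normal closure is the subgroup of $G$ generated by the conjugates of
the elements of $S$:

$$\langle S^G \rangle = \langle\, g * s * g^{-1} : g \in G \text{ and } s \in S \,\rangle.$$

Proof Write $N$ for the subgroup generated by the set
$\{g * s * g^{-1} : g \in G \text{ and } s \in S\}$. By Proposition 3.10 we
know that any normal subgroup is closed under conjugation. So any normal
subgroup of $G$ which contains $S$ must also contain all of its conjugates,
and therefore the subgroup they generate, hence
$N \subseteq \langle S^G \rangle$.
But $N$ is itself a normal subgroup of $G$ containing $S$. It contains $S$
because setting $g = e$ gives $g * s * g^{-1} = e * s * e^{-1} = s$ for all
$s \in S$. It is normal because conjugating a product of conjugates of
elements of $S$ (and their inverses) by any $h \in G$ gives another such
product, since $h * (g * s * g^{-1}) * h^{-1} = (h * g) * s * (h * g)^{-1}$.
Hence $\langle S^G \rangle \subseteq N$ and so the two are equal.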
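For a small finite group the normal closure can be computed directly. The sketch below works in the symmetric group on three points, with permutations stored as tuples, and takes the normal closure of $S$ to be the subgroup generated by all conjugates of elements of $S$, which agrees with the intersection in Definition 5.14. The function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[i] for i in q)


def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure under multiplication; enough in a finite group."""
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def normal_closure(S, G, identity):
    """Subgroup generated by every conjugate g s g^-1."""
    conjugates = {compose(compose(g, s), invert(g)) for g in G for s in S}
    return generated_subgroup(conjugates, identity)


S3 = list(permutations(range(3)))
e = (0, 1, 2)
print(len(normal_closure({(1, 0, 2)}, S3, e)))   # a transposition
print(len(normal_closure({(1, 2, 0)}, S3, e)))   # a 3-cycle
```

The transposition has three conjugates, which generate the whole group of order 6, while the 3-cycle has two conjugates, which generate the subgroup of order 3. In both cases the conjugates alone do not form a subgroup, since the identity is not among them.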

We can now define the concept of a group presentation:

Definition 5.16 Let $X = \{x_1, x_2, \dots\}$ be a (finite, infinite or empty)
set of generators, and let $R = \{w_1, w_2, \dots\}$ be a (finite, infinite or
empty) set of reduced words in the free group $F(X)$. Suppose
$G = F(X)/\langle R^{F(X)} \rangle$.
Then the elements of $X$ are called generators for $G$, the words in $R$ are
called relators for $G$, and we write the corresponding presentation
(or free presentation) of $G$ as

$$G \cong \langle X : R \rangle = \langle x_1, x_2, \dots : w_1, w_2, \dots \rangle.$$

(Some books use a vertical bar instead of a colon as the delimiter:
$G \cong \langle X \mid R \rangle$.)
A relation in $G$ is an equation of the form (or equivalent to) $w = e$,
where $w \in R$ and $e$ is the identity element of $G$. We can also write a
presentation for $G$ by using relations instead of relators:

$$G \cong \langle x_1, x_2, \dots : w_1 = w_2 = \dots = e \rangle$$

If $R$ is empty, then the presentation
$\langle X : R \rangle = \langle X : \emptyset \rangle = \langle X : \ \rangle$
gives the free group $F(X)$. If both $X$ and $R$ are finite, then we say that
the group $G = \langle X : R \rangle$ is finitely presented.

Group presentations satisfy a similar universal mapping property to
the one satisfied by free groups:$^8$
8 Proposition 5.6, page 141.
Proposition 5.17 Let $G = \langle X : R \rangle$ be the group determined by
the given presentation, and let $H$ be any group. Suppose that
$f \colon X^{\pm} \to H$ is a function satisfying the property that
$f(x^{-1}) = f(x)^{-1}$ for any $x \in X$, and that for any relator
$w = a_1 \dots a_k$ in $R$, where $a_1, \dots, a_k \in X^{\pm}$, we have
$f(a_1) \dots f(a_k) = e_H \in H$. Then the function $f$ extends uniquely to
a homomorphism $\bar{f} \colon G \to H$.

Proof If the function $f$ extends to a group homomorphism
$\bar{f} \colon G \to H$ at all, then it must be the case that

$$\bar{f}(a_1 \dots a_k) = f(a_1) \dots f(a_k)$$

for all $a_1, \dots, a_k \in X^{\pm}$, which means that $\bar{f}$ must be
unique.
Now we have to show that $\bar{f}$ actually exists. Proposition 5.6 says that
$f$ extends to a unique homomorphism $\hat{f} \colon F(X) \to H$ which
extends $f$ in the sense that $\hat{f}|_X = f$.
The relator set $R$ is contained in the kernel of $\hat{f}$ since, by
hypothesis,

$$\hat{f}(a_1 \dots a_k) = f(a_1) \dots f(a_k) = e_H \in H$$

for any word $a_1 \dots a_k \in R$. Let $N = \langle R^{F(X)} \rangle$ be the
normal closure of $R$ in $F(X)$. This is a subgroup of $\ker(\hat{f})$, which
is in turn a normal subgroup of $F(X)$, and by definition $G = F(X)/N$.
Now define $\bar{f} \colon G = F(X)/N \to H$ by
$\bar{f}((a_1 \dots a_k)N) = \hat{f}(a_1 \dots a_k)$. This is well defined
because $N \subseteq \ker(\hat{f})$, so $\hat{f}$ takes the same value on
every element of a given coset of $N$. Then $\bar{f}$ is a homomorphism,
and $\bar{f}(xN) = \hat{f}(x) = f(x)$ for all $x \in X$, so it extends $f$ as
required.
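The hypothesis of this proposition is straightforward to test for a concrete choice of images. The sketch below takes the presentation $\langle x, y : xyx^{-1}y^{-1} \rangle$ and asks whether a given assignment of permutations in the symmetric group on three points sends the relator to the identity. The encoding (strings with upper-case inverses, permutations as tuples) and the helper names are assumptions made for illustration.

```python
def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[i] for i in q)


def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


IDENTITY = (0, 1, 2)


def evaluate(word, f):
    """Multiply out the images of the letters, with f(x^-1) = f(x)^-1."""
    result = IDENTITY
    for letter in word:
        g = f[letter] if letter.islower() else invert(f[letter.lower()])
        result = compose(result, g)
    return result


def respects_relators(f, relators):
    """True when every relator is sent to the identity."""
    return all(evaluate(w, f) == IDENTITY for w in relators)


relators = ["xyXY"]                                      # x y x^-1 y^-1
commuting = {"x": (1, 2, 0), "y": (2, 0, 1)}             # a 3-cycle and its square
non_commuting = {"x": (1, 0, 2), "y": (0, 2, 1)}         # two transpositions

assert respects_relators(commuting, relators)
assert not respects_relators(non_commuting, relators)
```

Only the first assignment extends to a homomorphism from the presented group; the second sends the relator to a 3-cycle, so no such homomorphism can exist.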
We can write down a presentation for $\mathbb{Z} \oplus \mathbb{Z}$ from
the generators $x$ and $y$ together with the relator $xyx^{-1}y^{-1}$:

$$\langle x, y : xyx^{-1}y^{-1} \rangle$$

or the relation $xyx^{-1}y^{-1} = e$:

$$\langle x, y : xyx^{-1}y^{-1} = e \rangle$$

or the equivalent relation $xy = yx$:

$$\langle x, y : xy = yx \rangle$$

This is all well and good, and we've already seen that the language of
presentations, generators and relations is a useful and compact way
of describing at least some groups, but for it to be really useful we
need to know what groups can be described in this way. The answer,
happily, is that all of them can:
Proposition 5.18 Every group $G$ has a presentation $\langle X : R \rangle$
where $X$ is a (finite, infinite or empty) set of generators, and $R$ is a
(finite, infinite or empty) set of relators. Furthermore, if $G$ is finite,
then there exists a finite presentation for $G$.

Proof Let $X \subseteq G$ be a set of generators for $G$. (Setting $X = G$
will work, for example, but there will typically be a smaller set that
suffices.) Then by Proposition 5.13 there exists a homomorphism
$f \colon F(X) \twoheadrightarrow G$ such that $G \cong F(X)/\ker(f)$, and by
setting $R$ to be a set of generators for $\ker(f)$ we obtain a presentation

$$\langle X : R \rangle \cong G.$$

(Setting $R = \ker(f)$ itself will work, but there will usually be a smaller
set which generates $\ker(f)$.)
If $G$ is finite, with $|G| = s$ for some $s \in \mathbb{N}$, then there exists
a finite generating set $X$, with $|X| = r$ for some $r \in \mathbb{N}$. The
kernel $\ker(f)$ is then a subgroup of index $s$ in the free group $F(X)$ of
rank $r$.
By the Nielsen-Schreier Theorem$^9$ there exists a set $R$ which generates
$\ker(f)$, with $|R| = (r - 1)s + 1$. Hence

$$G \cong \langle X : R \rangle$$

is a finite presentation for $G$.
9 Theorem 5.11, page 145.
Thus every (finite or infinite) group can be expressed as a presentation,
and while the above proposition merely assures the existence of such
a presentation, it's a relatively straightforward process to write one
down, as the following example shows.
Example 5.19 Let $V_4 = \{e, a, b, c\}$ be the Klein group as usual. We
can write down a presentation for this group by using $V_4$ itself as the
set of generators, and reading off the relations from the multiplication
table (Table 5.1).

      |  e   a   b   c
   ---+----------------
    e |  e   a   b   c
    a |  a   e   c   b
    b |  b   c   e   a
    c |  c   b   a   e

Table 5.1: Multiplication table for the Klein group $V_4$
This gives us the presentation


$$V_4 \cong \left\langle e, a, b, c :
\begin{array}{llll}
ee = e & ea = a & eb = b & ec = c \\
ae = a & aa = e & ab = c & ac = b \\
be = b & ba = c & bb = e & bc = a \\
ce = c & ca = b & cb = a & cc = e
\end{array}
\right\rangle.$$
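The table-reading procedure of Example 5.19 is mechanical enough to sketch in code. The following Python fragment is only an illustration and not part of the original text: the names `TABLE` and `table_relations` are made up for this sketch, and the entries are copied from Table 5.1.

```python
# Cayley table for the Klein group V4, copied from Table 5.1.
# TABLE[g][h] is the product gh.
ELEMENTS = ["e", "a", "b", "c"]
TABLE = {
    "e": {"e": "e", "a": "a", "b": "b", "c": "c"},
    "a": {"e": "a", "a": "e", "b": "c", "c": "b"},
    "b": {"e": "b", "a": "c", "b": "e", "c": "a"},
    "c": {"e": "c", "a": "b", "b": "a", "c": "e"},
}


def table_relations(elements, table):
    """Return one relation 'gh = k' for every ordered pair (g, h)."""
    return [f"{g}{h} = {table[g][h]}" for g in elements for h in elements]


relations = table_relations(ELEMENTS, TABLE)
print(len(relations))   # one relation per table entry
print(relations[:4])    # the relations read off the first row
```

A group with $s$ elements yields $s^2$ relations this way, which is why the method scales so poorly.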

The main drawback with this method is that although it does provide
a presentation for a given group, that presentation will usually not
be the simplest one. And for an infinite group, this method becomes
completely impractical for obvious reasons. There will almost always
be a simpler presentation for the groups we're likely to meet. For
example, the above presentation for V4 has four generators and sixteen
relations, but we really only need two generators and three relations:
Example 5.20 The Klein group $V_4$ is isomorphic to the direct sum
$\mathbb{Z}_2 \oplus \mathbb{Z}_2$. We already know how to write down a presentation for
$$\mathbb{Z} \oplus \mathbb{Z} \cong \langle x, y : xy = yx \rangle,$$
and we can use this to make a presentation for $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. We do this
by introducing two more relations $x^2 = e$ and $y^2 = e$, thus forcing
each generator to have order 2. This gives us the presentation
$$V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \cong \langle x, y : x^2 = y^2 = e,\ xy = yx \rangle.$$
In general we will want a suitably compact presentation of a given
group, or at least one that can be written down in a compact way.
Example 5.21 Let $G = \mathbb{Z}_3 \oplus \mathbb{Z}_2$. Then
$$G \cong \langle x, y : x^3 = y^2 = e,\ xy = yx \rangle.$$
Examining the presentations in these two examples, we see that each
group is a direct sum of finite cyclic groups, and each has a pre-
sentation comprising the combined generators and relations for the
cyclic group summands together with another relation that forces the
generator of one to commute with the generator of the other. That is,

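$$\langle x : x^2 = e \rangle \oplus \langle y : y^2 = e \rangle \cong \langle x, y : x^2 = e,\ y^2 = e,\ xy = yx \rangle,$$
$$\langle x : x^3 = e \rangle \oplus \langle y : y^2 = e \rangle \cong \langle x, y : x^3 = e,\ y^2 = e,\ xy = yx \rangle.$$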

This illustrates the following proposition:


Proposition 5.22 Suppose that $\langle X : R \rangle$ is a presentation for a group $G$,
and that $\langle Y : S \rangle$ is a presentation for a group $H$, with $X \cap Y = \emptyset$. Then
$$\langle X \cup Y : R \cup S \cup [X, Y] \rangle$$
is a presentation for $G \times H$ (or $G \oplus H$), where
$$[X, Y] = \{ xyx^{-1}y^{-1} : x \in X,\ y \in Y \}$$
is the set of commutators of elements from $X$ and $Y$.
Proof Suppose
$$K = \langle X \cup Y : R \cup S \cup [X, Y] \rangle.$$
Define a function $f : X \cup Y \to G \times H$ by $f(x) = (x, e_H)$ and $f(y) =
(e_G, y)$ for all $x \in X$ and $y \in Y$.
Let $w = a_1 \ldots a_k \in R \cup S \cup [X, Y]$ be a relator. We need to check that
$f(w) = (e_G, e_H)$, and there are three cases to consider.
If $w \in R$ then $w \in F(X)$ and so $f(w) = (w, e_H) = (e_G, e_H)$ since
$w = e_G$ in $G$.
Similarly, if $w \in S$ then $w \in F(Y)$ and so $f(w) = (e_G, w) = (e_G, e_H)$
since $w = e_H$ in $H$.
If, on the other hand, $w \in [X, Y]$ then $w = xyx^{-1}y^{-1}$ for some $x \in X$
and $y \in Y$. Hence
$$\begin{aligned}
f(w) = f(xyx^{-1}y^{-1}) &= f(x) f(y) f(x^{-1}) f(y^{-1}) \\
&= (x, e_H)(e_G, y)(x^{-1}, e_H)(e_G, y^{-1}) = (e_G, e_H).
\end{aligned}$$
So $f(w) = e_{G \times H}$ for any relator $w \in R \cup S \cup [X, Y]$, and hence by the
fundamental mapping property of group presentations (Proposition 5.17, page 149), $f$ extends
uniquely to a homomorphism $\hat{f} : K \to G \times H$.
The image of the function $f : X \cup Y \to G \times H$ generates $G \times H$, and hence the
homomorphism $\hat{f} : K \to G \times H$ is surjective.
Since $xy = yx$ in $K$ for all $x \in X$ and $y \in Y$, any $k \in K$ can be written as a
word
$$k = a_1 \ldots a_i b_1 \ldots b_j$$
for some $a_1, \ldots, a_i \in X^{\pm 1}$ and $b_1, \ldots, b_j \in Y^{\pm 1}$. Then
$$\hat{f}(k) = \hat{f}(a_1 \ldots a_i b_1 \ldots b_j) = (a_1 \ldots a_i,\ b_1 \ldots b_j)$$
in $G \times H$. If $k \in \ker(\hat{f})$ then
$$a_1 \ldots a_i \in \langle R^{F(X)} \rangle \subseteq \langle R^{F(X \cup Y)} \rangle \subseteq \langle (R \cup S \cup [X, Y])^{F(X \cup Y)} \rangle$$
and
$$b_1 \ldots b_j \in \langle S^{F(Y)} \rangle \subseteq \langle S^{F(X \cup Y)} \rangle \subseteq \langle (R \cup S \cup [X, Y])^{F(X \cup Y)} \rangle.$$
Since, by definition,
$$K = \langle X \cup Y : R \cup S \cup [X, Y] \rangle = F(X \cup Y)/\langle (R \cup S \cup [X, Y])^{F(X \cup Y)} \rangle,$$
both of these words $a_1 \ldots a_i$ and $b_1 \ldots b_j$ are equal to the identity $e$ in
$K$, and hence $k = e$ in $K$ too. So the kernel $\ker(\hat{f})$ is trivial, hence $\hat{f}$ is
injective and therefore an isomorphism.
In general, then, if we have a group $G$ with presentation $\langle X : R \rangle$ and a
group $H$ with presentation $\langle Y : S \rangle$, we can write down a presentation
for $G \times H$ by taking the union $X \cup Y$ of the generators and the union
$R \cup S$ of the relators, and then forcing the generators of $G$ to commute
with the generators of $H$ by means of the commutator $[X, Y]$.
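For direct sums of two finite cyclic groups the recipe can be sanity-checked numerically. The sketch below is an illustration only (the function name `check_direct_sum` is invented here): it models $\mathbb{Z}_m \oplus \mathbb{Z}_n$ additively as pairs of residues, confirms that the images of the generators satisfy $x^m = e$, $y^n = e$ and $xy = yx$, and counts the elements they generate. It does not prove that no further relations are needed; that is the content of Proposition 5.22.

```python
def check_direct_sum(m, n):
    """Check x^m = e, y^n = e, xy = yx in Z_m (+) Z_n, written additively."""

    def add(g, h):
        return ((g[0] + h[0]) % m, (g[1] + h[1]) % n)

    def multiple(k, g):
        # k-fold sum g + g + ... + g (the additive version of g^k)
        total = (0, 0)
        for _ in range(k):
            total = add(total, g)
        return total

    x, y = (1 % m, 0), (0, 1 % n)   # images of the two generators
    zero = (0, 0)
    relations_hold = (
        multiple(m, x) == zero
        and multiple(n, y) == zero
        and add(x, y) == add(y, x)
    )
    # Every element has the form i*x + j*y, so count the distinct ones.
    generated = {
        add(multiple(i, x), multiple(j, y)) for i in range(m) for j in range(n)
    }
    return relations_hold, len(generated)


print(check_direct_sum(2, 2))   # the Klein group of Example 5.20
print(check_direct_sum(3, 2))   # the group of Example 5.21
```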
If we skip this last step and don't require the generators of $G$ to
commute with the generators of $H$, we get a more general construct:
Definition 5.23 Let $G = \langle X : R \rangle$ and $H = \langle Y : S \rangle$. Then the free
product of $G$ and $H$, denoted $G * H$, is the group with presentation
$$G * H = \langle X \cup Y : R \cup S \rangle.$$
The free product $G * H$ is the most general possible group we can
construct from $G$ and $H$ which still preserves $G$ and $H$ as subgroups.
The next fact is fairly obvious, but worth noting.
Proposition 5.24 Let $X$ and $Y$ be sets, and let $F(X)$ and $F(Y)$ be the
corresponding free groups. Then
$$F(X) * F(Y) \cong F(X \cup Y).$$
Similarly, if $F_m$ and $F_n$ are free groups of finite rank, then
$$F_m * F_n \cong F_{m+n}.$$

Proof This is an immediate consequence of Definition 5.23. We have
$F(X) = \langle X : \ \rangle$ and $F(Y) = \langle Y : \ \rangle$, and therefore
$$F(X) * F(Y) = \langle X \cup Y : \ \rangle = F(X \cup Y).$$
The finite case $F_m * F_n \cong F_{m+n}$ follows by observing that $F_m = F(X)$
and $F_n = F(Y)$ for some sets $X$ and $Y$ with $|X| = m$ and $|Y| = n$.

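Example 5.25 Let $G = \langle x : x^4 \rangle \cong \mathbb{Z}_4$ and $H = \langle y : y^6 \rangle \cong \mathbb{Z}_6$.
Then the free product $G * H = \mathbb{Z}_4 * \mathbb{Z}_6$ is given by the presentation
$$\langle x, y : x^4 = y^6 = e \rangle.$$
The groups $G \cong \mathbb{Z}_4$ and $H \cong \mathbb{Z}_6$ embed into $G * H$ by the
homomorphisms $i : G \hookrightarrow G * H$ and $j : H \hookrightarrow G * H$ where $i(x) = x$ and $j(y) = y$.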

Suppose now that both groups $G$ and $H$ have a subgroup which is
isomorphic to some other group $K$. We can identify both copies of this
subgroup and then take the free product, to get something called a
free product with amalgamation or amalgamated free product.
Definition 5.26 Suppose that $G = \langle X : R \rangle$, $H = \langle Y : S \rangle$ and $K$ are
groups, and there exist injective homomorphisms $i : K \hookrightarrow G$ and
$j : K \hookrightarrow H$. Let
$$T = \{ i(k) j(k)^{-1} : k \in K \}.$$
Then the group
$$G *_K H = \langle X \cup Y : R \cup S \cup T \rangle$$
is the free product of $G$ and $H$ amalgamated over $K$.
This is the usual free product $G * H$ with additional relations which
force $i(k) = j(k)$ for all $k \in K$.
Free products with amalgamation have useful applications in algebraic
topology. In particular, an important result called the Seifert-van Kampen
Theorem (or sometimes just van Kampen's Theorem) tells us how
to calculate a presentation for the fundamental group $\pi_1(X, *)$ of a
topological space $X$ in terms of the free product with amalgamation
of the fundamental groups of its path-connected subspaces.
We won't study free products or free products with amalgamation in
detail, but here's an illustrative example.
Example 5.27 Let $G = \mathbb{Z}_4 = \langle x : x^4 = e \rangle$ and $H = \mathbb{Z}_6 = \langle y : y^6 = e \rangle$.
Each of these groups has a subgroup isomorphic to $\mathbb{Z}_2 = \langle t : t^2 = e \rangle$,
which can be regarded as the image of the homomorphisms
$i : \mathbb{Z}_2 \hookrightarrow \mathbb{Z}_4$ and $j : \mathbb{Z}_2 \hookrightarrow \mathbb{Z}_6$ given by
$$i(e) = e, \quad i(t) = x^2, \qquad j(e) = e, \quad j(t) = y^3.$$
From Example 5.25 we know that the free product $G * H$ is given by
the presentation
$$G * H = \langle x, y : x^4 = y^6 = e \rangle.$$
To construct the free product amalgamated over $\mathbb{Z}_2$ via the
homomorphisms $i$ and $j$, we need to add in the extra relations we get
from forcing $i(k) = j(k)$ for all $k \in \mathbb{Z}_2$. Setting $k = e$ we get
$i(e) = e = j(e)$, which doesn't make any difference since it just gives
us a trivial relation. Setting $k = t$, on the other hand, gives us the
new relation $x^2 = y^3$, so
$$G *_{\mathbb{Z}_2} H = \mathbb{Z}_4 *_{\mathbb{Z}_2} \mathbb{Z}_6 = \langle x, y : x^4 = y^6 = e,\ x^2 = y^3 \rangle.$$

One of the first classes of nonabelian groups we met in Chapter 1
was the dihedral groups $D_n$. It turns out that these groups have
presentations which follow a neat pattern.
Proposition 5.28 Let $D_n$ be the dihedral group of order $2n$ comprising
the symmetries of a regular $n$-gon. Then
$$D_n \cong \langle x, y : x^n = e,\ y^2 = e,\ yx = x^{-1}y \rangle.$$

Here, $x$ corresponds geometrically to the $\frac{2\pi}{n}$ rotation, which obviously
has order $n$, and $y$ corresponds to one of the reflections, say $m_1$. The
third relation $yx = x^{-1}y$ says that doing a reflection followed by
a rotation is the same as doing a rotation in the opposite direction
and then reflecting. (Convince yourself that this is true.) What isn't
immediately clear is that these are the only relations we need for $D_n$.

Proof Any element of $G = \langle x, y : x^n = e,\ y^2 = e,\ yx = x^{-1}y \rangle$ can be
written as a product of the generators $x$ and $y$ and their inverses $x^{-1}$
and $y^{-1}$. The relation $x^n = e$ means we can always write $x^{-1}$ as $x^{n-1}$;
also $x^{-k}$ can be rewritten as $x^{n-k}$ for any $1 \le k < n$. Similarly, $y^2 = e$
means we can replace any $y^{-1}$ with $y$. Therefore we can always write
any element of $G$ as a product of positive powers of $x$ and $y$, such that
the maximum exponent of any occurrence of $x$ is $n-1$ (since $x^n = e$)
and the maximum exponent of any occurrence of $y$ is 1 (since $y^2 = e$).
The third relation $yx = x^{-1}y$ can be rewritten as $yx = x^{n-1}y$, so we
can group all occurrences of $y$ together (and simplify them to either
$e$ or a single $y$) and group all occurrences of $x$ together. Thus any
element of $G$ is of the form $x^k$ or $yx^k$ for some $k$ where $0 \le k \le n-1$.
This leaves us with precisely $2n$ words in $G$, namely
$$e, x, \ldots, x^{n-1}, y, yx, \ldots, yx^{n-1}$$
and hence $|G| = 2n = |D_n|$.
We still need to show that $G \cong D_n$ though, and to do this we have to
consider the internal structure of dihedral groups in detail. Observe
that the reflection $m_k$ is equal to the product $m_1 r^{k-1}$ for $1 \le k \le n$.
So, define $f : G \to D_n$ such that $x^k \mapsto r^k$, and $yx^k \mapsto m_1 r^k = m_{k+1}$ for
$0 \le k \le n-1$.
We claim that this is an isomorphism. It is obviously a bijection, so we
need to check it's also a homomorphism, and to do that we have to
verify the homomorphism condition for all four types of products:
$$\begin{aligned}
f(x^k x^l) &= f(x^{k+l}) = r^{k+l} = r^k r^l = f(x^k) f(x^l) \\
f(x^k y x^l) &= f(y x^{n-k} x^l) = f(y x^{n-k+l}) = m_{n-k+l+1} \\
&= m_1 r^{n-k+l} = m_1 r^{n-k} r^l = r^k m_1 r^l \\
&= r^k m_{l+1} = f(x^k) f(y x^l) \\
f(y x^k x^l) &= f(y x^{k+l}) = m_{k+l+1} = m_1 r^{k+l} \\
&= m_1 r^k r^l = m_{k+1} r^l = f(y x^k) f(x^l) \\
f(y x^k y x^l) &= f(y^2 x^{n-k} x^l) = f(x^{n-k+l}) = r^{n-k+l} \\
&= m_1^2 r^{n-k} r^l = m_1 r^k m_1 r^l \\
&= m_{k+1} m_{l+1} = f(y x^k) f(y x^l)
\end{aligned}$$
Thus $f : G \to D_n$ is an isomorphism.
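The relations of Proposition 5.28 can be spot-checked on a concrete model of $D_n$. The following Python sketch is an illustration under stated assumptions rather than part of the proof: the vertices of the $n$-gon are labelled $0, \ldots, n-1$, the rotation is taken to be $i \mapsto i+1 \pmod n$ and the reflection $i \mapsto -i \pmod n$, and all function names are invented for this example. It is only meaningful for $n \ge 3$, where this reflection is not the identity.

```python
def compose(p, q):
    """Return the permutation 'q first, then p' (p[i] is the image of i)."""
    return tuple(p[q[i]] for i in range(len(p)))


def power(p, k):
    """Return the k-fold composite of p with itself."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def generated_group(generators, n):
    """Close the generators under composition; in a finite group this
    already gives the subgroup they generate."""
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def check_dihedral(n):
    identity = tuple(range(n))
    r = tuple((i + 1) % n for i in range(n))   # rotation, plays the role of x
    m = tuple((-i) % n for i in range(n))      # reflection, plays the role of y
    r_inverse = power(r, n - 1)
    relations_hold = (
        power(r, n) == identity
        and compose(m, m) == identity
        and compose(m, r) == compose(r_inverse, m)   # yx = x^{-1} y
    )
    return relations_hold, len(generated_group([r, m], n))


for n in range(3, 7):
    print(n, check_dihedral(n))
```

For each $n$ tried, the relations hold and the two permutations generate $2n$ elements, in line with $|D_n| = 2n$.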
We can use this result to prove the following useful fact (promised in
Chapter 2) about groups of order $2p$ where $p$ is prime.
Proposition 5.29 Let $G$ be a group of order $2p$, where $p$ is prime. Then
$G$ is isomorphic to either the cyclic group $\mathbb{Z}_{2p}$ or the dihedral group $D_p$.

Proof By Proposition 2.34 we know that the order of any element
$g \in G$ must divide $2p$. Hence $|g| = 1$, $2$, $p$ or $2p$. By Proposition 1.19
we know that $|g| = 1$ if and only if $g$ is the identity $e \in G$. Suppose,
then, that $g \neq e$ and hence $|g| = 2$, $p$ or $2p$.
If $|g| = 2p$ then $\langle g \rangle \cong \mathbb{Z}_{2p}$ comprises all of $G$, hence $G \cong \mathbb{Z}_{2p}$. We'll
assume for the rest of the proof that $G$ has no elements of order $2p$.
Suppose that there exists $g \in G$ such that $|g| = p$. Then the cyclic
subgroup $\langle g \rangle = \{e, g, \ldots, g^{p-1}\}$ has $p$ elements and is therefore an
index-2 subgroup of $G$. Choose some element $h \in G \setminus \langle g \rangle$. This
element must have order 2 or $p$.
If $|h| = p$ then the number of elements in $\langle g \rangle \cup \langle h \rangle$ must be $2p-1$ since
$$|\langle g \rangle \cup \langle h \rangle| = |\langle g \rangle| + |\langle h \rangle| - |\langle g \rangle \cap \langle h \rangle| = p + p - 1 = 2p - 1.$$
Thus there must exist an element $k \in G$ which is in neither $\langle g \rangle$ nor $\langle h \rangle$.
The cyclic subgroup $\langle k \rangle$ must therefore consist of $e$ and $k$ only, since
there are no other elements left. Hence $|k| = 2$.
If, on the other hand, $|h| = 2$ then $\langle h \rangle = \{e, h\}$, so set $k = h$. In either
case we have found an element $k$ of order 2 in $G \setminus \langle g \rangle$.
The subgroup $\langle g, k \rangle$ contains $\langle g \rangle$ as a proper subgroup and hence
has index less than 2 in $G$; that is, $\langle g, k \rangle = G$. Since $\langle g \rangle$ has index 2 in $G$, then by
Proposition 3.12 we know it is normal in $G$ and therefore closed under
conjugation. So $kgk^{-1} = g^m$ for some $m$.
Since $k^2 = e$ we can see that
$$g = ege = k^2 g k^{-2} = k(kgk^{-1})k^{-1} = k g^m k^{-1} = (kgk^{-1})^m = (g^m)^m = g^{m^2}$$
and so $g^{m^2 - 1} = e$, and therefore $p$ must divide $m^2 - 1 = (m+1)(m-1)$.
Hence $p$ must divide either $m+1$ or $m-1$.
If $p$ divides $m-1$ then $kgk^{-1} = g^m = g$, which means that $kg =
gk$ and so $G$ is abelian. If $p$ is odd then $|kg| = 2p$, which contradicts
our assumption that $G$ has no elements of order $2p$. The only other
possibility is that $p = 2$, in which case $G \cong V_4 \cong D_2 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
If, however, $p$ divides $m+1$ then $kgk^{-1} = g^m = g^{-1}$, which means
that $kg = g^{-1}k$, and hence $G \cong D_p$ by Proposition 5.28.
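The step from "$p$ divides $(m+1)(m-1)$" to "$p$ divides $m+1$ or $m-1$" relies on $p$ being prime. A brute-force check makes the point concrete; this small Python sketch is illustrative only, and the function name `square_roots_of_one` is made up here.

```python
def square_roots_of_one(n):
    """Return all m in {0, ..., n-1} with m^2 congruent to 1 modulo n."""
    return [m for m in range(n) if (m * m - 1) % n == 0]


for p in [2, 3, 5, 7, 11, 13]:
    print(p, square_roots_of_one(p))   # only m = 1 and m = p - 1 appear

print(8, square_roots_of_one(8))       # composite modulus, for contrast
```

For a prime modulus the only solutions are $m \equiv \pm 1$, which are exactly the two cases treated in the proof; for the composite modulus 8 there are extra solutions, so the argument would not go through unchanged.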
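We can now write presentations for all groups with up to seven
elements:
$$\begin{aligned}
0 &= \langle \ : \ \rangle \\
\mathbb{Z}_2 &\cong \langle x : x^2 = e \rangle \\
\mathbb{Z}_3 &\cong \langle x : x^3 = e \rangle \\
\mathbb{Z}_4 &\cong \langle x : x^4 = e \rangle \\
V_4 &\cong \langle x, y : x^2 = y^2 = e,\ xy = yx \rangle \\
\mathbb{Z}_5 &\cong \langle x : x^5 = e \rangle \\
D_3 &\cong \langle x, y : x^3 = y^2 = e,\ yx = x^{-1}y \rangle \\
\mathbb{Z}_6 &\cong \langle x : x^6 = e \rangle \\
\mathbb{Z}_7 &\cong \langle x : x^7 = e \rangle
\end{aligned}$$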
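There are five groups of order 8 (see Proposition 2.40), and we can
write down four of them straight away:
$$\begin{aligned}
\mathbb{Z}_8 &\cong \langle x : x^8 = e \rangle \\
\mathbb{Z}_2 \oplus \mathbb{Z}_4 &\cong \langle x, y : x^2 = y^4 = e,\ xy = yx \rangle \\
\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 &\cong \langle x, y, z : x^2 = y^2 = z^2 = e,\ xy = yx,\ xz = zx,\ yz = zy \rangle \\
D_4 &\cong \langle x, y : x^4 = y^2 = e,\ yx = x^{-1}y \rangle
\end{aligned}$$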
The remaining case, the quaternion group $Q_8$, requires more work.
Proposition 5.30 The quaternion group $Q_8$ has a presentation
$$Q_8 \cong \langle x, y : x^4 = e,\ x^2 = y^2,\ xyx = y \rangle.$$

Proof Let $G = \langle x, y : x^4 = e,\ x^2 = y^2,\ xyx = y \rangle$. We want to show that
this is indeed isomorphic to the quaternion group $Q_8$, and we'll start
by showing that it at least has the same number of elements.
The first relation $x^4 = e$ tells us that we need only consider positive
powers of $x$, since we can replace any occurrence of $x^{-1}$ with $x^3$. The
first and second relations together imply that $y^4 = e$, since
$$y^4 = (y^2)^2 = (x^2)^2 = x^4 = e.$$
Hence we only have to consider positive powers of $y$, since $y^{-1} = y^3$.
The third relation $xyx = y$ can be rearranged to $yx = x^{-1}y$, and then
we can use the first relation to rewrite this as $yx = x^3 y$.
Putting all this together, any finite word composed from the generators
$x$ and $y$ and their inverses can be rearranged in the form $x^m y^n$ where
$m$ and $n$ are non-negative integers. Using the second relation, we can
rewrite $y^n$ as either $x^n$ (if $n$ is even) or $x^{n-1}y$ (if $n$ is odd) to get a word
of the form $x^k$ or $x^k y$. Since $x$ has order 4 in $G$ we know that $0 \le k < 4$,
and so $G$ has eight elements:
$$G = \{e, x, x^2, x^3, y, xy, x^2 y, x^3 y\}.$$
The multiplication table for this group is shown in Table 5.2 along
with that for $Q_8$, reordered to make the structural similarities clearer.

    G    |  e     x     x^2   x^3   y     xy    x^2y  x^3y
   ------+--------------------------------------------------
    e    |  e     x     x^2   x^3   y     xy    x^2y  x^3y
    x    |  x     x^2   x^3   e     xy    x^2y  x^3y  y
    x^2  |  x^2   x^3   e     x     x^2y  x^3y  y     xy
    x^3  |  x^3   e     x     x^2   x^3y  y     xy    x^2y
    y    |  y     x^3y  x^2y  xy    x^2   x     e     x^3
    xy   |  xy    y     x^3y  x^2y  x^3   x^2   x     e
    x^2y |  x^2y  xy    y     x^3y  e     x^3   x^2   x
    x^3y |  x^3y  x^2y  xy    y     x     e     x^3   x^2

    Q8   |  E     I    -E    -I     J     K    -J    -K
   ------+--------------------------------------------------
    E    |  E     I    -E    -I     J     K    -J    -K
    I    |  I    -E    -I     E     K    -J    -K     J
   -E    | -E    -I     E     I    -J    -K     J     K
   -I    | -I     E     I    -E    -K     J     K    -J
    J    |  J    -K    -J     K    -E     I     E    -I
    K    |  K     J    -K    -J    -I    -E     I     E
   -J    | -J     K     J    -K     E    -I    -E     I
   -K    | -K    -J     K     J     I     E    -I    -E

Table 5.2: The multiplication tables for the group
$G \cong \langle x, y : x^4 = e,\ x^2 = y^2,\ xyx = y \rangle$ and the quaternion group $Q_8$

The function $f : G \to Q_8$ that maps
$$\begin{array}{llll}
e \mapsto E & x \mapsto I & x^2 \mapsto -E & x^3 \mapsto -I \\
y \mapsto J & xy \mapsto K & x^2 y \mapsto -J & x^3 y \mapsto -K
\end{array}$$
is the required isomorphism.
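As a cross-check on Proposition 5.30, the three relations can be verified directly on the unit quaternions. This Python sketch is an illustration, not the author's argument: quaternions are stored as integer 4-tuples $(a, b, c, d)$ standing for $a + bi + cj + dk$, the Hamilton product is written out by hand, and the names `qmul` and `qpow` are invented for this example.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


E, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def qpow(q, k):
    """Return q multiplied by itself k times (q^0 = E)."""
    result = E
    for _ in range(k):
        result = qmul(result, q)
    return result


x, y = I, J   # images of the generators under x -> I, y -> J

relations_hold = (
    qpow(x, 4) == E                      # x^4 = e
    and qpow(x, 2) == qpow(y, 2)         # x^2 = y^2
    and qmul(qmul(x, y), x) == y         # xyx = y
)

# Close {x, y} under multiplication to list the subgroup they generate.
elements = {E}
frontier = [E]
while frontier:
    g = frontier.pop()
    for s in (x, y):
        h = qmul(g, s)
        if h not in elements:
            elements.add(h)
            frontier.append(h)

print(relations_hold, len(elements))
print(qmul(x, y) == K)   # xy corresponds to K
```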
The quaternion group is the first in a sequence of nonabelian groups
of order $4n$ for some positive integer $n$.
Example 5.31 The dicyclic group $\mathrm{Dic}_n$ of order $4n$ is defined by the
presentation
$$\mathrm{Dic}_n \cong \langle x, y : x^{2n} = e,\ x^n = y^2,\ xyx = y \rangle.$$
The case $n = 1$ yields
$$\mathrm{Dic}_1 \cong \langle x, y : x^2 = e,\ x = y^2,\ xyx = y \rangle \cong \mathbb{Z}_4$$
and the group $\mathrm{Dic}_2$ is isomorphic to the quaternion group $Q_8$ as
discussed in Proposition 5.30.

All of the examples we've seen so far have been of finitely-presented
groups. Here is an important example of a group which doesn't have
a finite presentation.
Proposition 5.32 Let $\mathbb{Q}$ denote the additive group of rational numbers.
Then $\mathbb{Q}$ is not finitely presented; in fact
$$\mathbb{Q} \cong \langle x_1, x_2, \ldots : x_n^n = x_{n-1} \text{ for } n > 1 \rangle.$$

Proof To see that $\mathbb{Q}$ is not finitely presented, consider some finite
subset $X \subset \mathbb{Q}$ and let $n$ be the lowest common multiple of the
denominators of all the elements in $X$. Now consider the subgroup
$\langle X \rangle \le \mathbb{Q}$ generated additively by all of the elements of $X$. Since all
of the denominators of elements in $X$ divide $n$, this is also true for
the denominators of elements in the subgroup $\langle X \rangle$, and so $X$ can't
generate all of $\mathbb{Q}$, since we'll always be able to find some element of $\mathbb{Q}$
whose denominator doesn't divide $n$.
It so happens that $\mathbb{Q}$ can be generated by the infinite set $\{ \frac{1}{n!} : n \in \mathbb{N} \}$,
since any rational number $\frac{m}{n}$ can be written as
$$\frac{m}{n} = \frac{m(n-1)!}{n!} = m(n-1)! \cdot \frac{1}{n!}.$$
Let $X = \{x_1, x_2, \ldots\}$ and let $G = \langle x_1, x_2, \ldots : x_n^n = x_{n-1} \text{ for } n > 1 \rangle$.
Define a function $f : X \to \mathbb{Q}$ by $f(x_n) = \frac{1}{n!}$. Then
$f(x_n^n) = \frac{n}{n!} = \frac{1}{(n-1)!} = f(x_{n-1})$, so by Proposition 5.17 we know that $f$ extends to a
homomorphism $\hat{f} : G \to \mathbb{Q}$. This homomorphism is surjective since $\mathbb{Q}$
is generated by the set $\{ \frac{1}{n!} : n \in \mathbb{N} \}$.
The group $G$ is abelian, since the relations $x_{n-1} = x_n^n$ tell us that each
generator $x_i$ can be written as a power of any later generator $x_j$ with
$j > i$, and hence any two generators $x_i$ and $x_j$ commute, which means
any two elements of $G$ commute, so $G$ is abelian.
So, any element of $G$ can be written as a finite product $x_1^{k_1} x_2^{k_2} \ldots x_n^{k_n}$
for some integers $k_1, \ldots, k_n \in \mathbb{Z}$. Since $x_n^n = x_{n-1}$ we can use the
division algorithm to write $k_n = qn + r$ where $0 \le r < n$. So
$$x_n^{k_n} = x_n^{qn+r} = x_n^{qn} x_n^r = x_{n-1}^q x_n^r$$
which tells us that we may assume $0 \le k_n < n$. Repeating this process for the
generators $x_{n-1}, x_{n-2}, \ldots, x_2$ lets us rewrite the word in a canonical form
$$x_1^{m_1} x_2^{m_2} \ldots x_n^{m_n}$$
where $0 \le m_i < i$ for $2 \le i \le n$.
We need to show that $\hat{f}$ is injective, which we can do by showing its
kernel is trivial. Let $g \in \ker \hat{f}$ be some element of this kernel. By
the above discussion we know $g = x_1^{m_1} x_2^{m_2} \ldots x_n^{m_n}$ for some integers
$m_1, \ldots, m_n \in \mathbb{Z}$ such that $0 \le m_i < i$ for all $i > 1$. Looking at the
image $\hat{f}(g)$ of this element, we see
$$\hat{f}(g) = \frac{m_1}{1!} + \frac{m_2}{2!} + \cdots + \frac{m_n}{n!} = 0.$$
Multiplying by $(n-1)!$ we get
$$(n-1)! \left( \frac{m_1}{1!} + \frac{m_2}{2!} + \cdots + \frac{m_n}{n!} \right) = m + \frac{m_n}{n} = 0$$
for some integer $m \in \mathbb{Z}$. This means that $\frac{m_n}{n} = -m$ must also be
an integer, but we know already that $0 \le m_n < n$ and so the only
possibility is that $m_n = 0$. Therefore
$$\hat{f}(g) = \frac{m_1}{1!} + \frac{m_2}{2!} + \cdots + \frac{m_{n-1}}{(n-1)!} + 0 = 0$$
and we can repeat the same process with $(n-2)!$ to show that $m_{n-1} = 0$,
and then repeat $(n-3)$ more times to show that $m_2 = \cdots = m_n = 0$.
Hence $g = x_1^{m_1}$ and since $\hat{f}(g) = \hat{f}(x_1^{m_1}) = m_1 = 0$ this tells us that
$g = x_1^0 = e$ and so $\ker(\hat{f})$ is trivial. Therefore $\hat{f}$ is injective, and since
we've already shown it's surjective that means it's an isomorphism,
and so $\mathbb{Q} \cong \langle x_1, x_2, \ldots : x_n^n = x_{n-1} \text{ for } n > 1 \rangle$ as claimed.
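The canonical form used in this proof amounts to writing a rational number as $\frac{m_1}{1!} + \frac{m_2}{2!} + \cdots + \frac{m_n}{n!}$ with $0 \le m_i < i$ for $i \ge 2$. The Python sketch below illustrates that rewriting under the identification $x_i \leftrightarrow \frac{1}{i!}$; the function names `factorial_digits` and `evaluate` are invented for this example, and the repeated `divmod` plays the role of the division step $k_n = qn + r$.

```python
from fractions import Fraction
from math import factorial


def factorial_digits(q):
    """Write q = m_1/1! + m_2/2! + ... + m_n/n! with 0 <= m_i < i for i >= 2.

    Returns a dict {i: m_i}. The coefficient m_1 may be any integer.
    """
    q = Fraction(q)
    n = 1
    while (q * factorial(n)).denominator != 1:
        n += 1                       # smallest n with q * n! an integer
    numerator = (q * factorial(n)).numerator   # q = numerator / n!
    digits = {}
    for i in range(n, 1, -1):
        # numerator = quotient * i + m_i, and quotient / (i-1)! carries down
        numerator, digits[i] = divmod(numerator, i)
    digits[1] = numerator
    return digits


def evaluate(digits):
    """Recombine the coefficients into a single rational number."""
    return sum(Fraction(m, factorial(i)) for i, m in digits.items())


example = factorial_digits(Fraction(5, 6))
print(example)
print(evaluate(example) == Fraction(5, 6))
```

Because each remainder is forced, the coefficients are unique once $n$ is fixed, which mirrors the injectivity argument: the only way to represent 0 is with every $m_i = 0$.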
Another important class of finite groups we met in Chapter 1 is that of
the symmetric groups $S_n$, and with a little bit of work we can find
presentations for these as well.
Proposition 5.33 The symmetric group $S_n$ may be defined by the
presentation
$$S_n \cong \left\langle x_1, \ldots, x_{n-1} :
\begin{array}{ll}
x_i^2 = e & 1 \le i \le n-1 \\
(x_i x_{i+1})^3 = e & 1 \le i \le n-2 \\
x_i x_j = x_j x_i & 1 \le i < j \le n-1,\ j-i > 1
\end{array}
\right\rangle.$$

Proof Let $G_n$ be the group defined by the above presentation. We aim
to construct an isomorphism from $G_n$ to $S_n$.
This presentation is inspired by Proposition 1.60(ii), which says that $S_n$
is generated by transpositions of adjacent objects; that is, transpositions
of the form $x_i := (i\ \ i{+}1)$ for $1 \le i \le n-1$. We can easily verify that all
three types of relations are satisfied by these transpositions. The first
follows immediately from the fact that transpositions have order 2:
$$x_i^2 = (i\ \ i{+}1)^2 = (i\ \ i{+}1)(i\ \ i{+}1) = \iota.$$
The second follows by a straightforward calculation:
$$(x_i x_{i+1})^3 = ((i\ \ i{+}1)(i{+}1\ \ i{+}2))^3 =
(i\ \ i{+}1)(i{+}1\ \ i{+}2)(i\ \ i{+}1)(i{+}1\ \ i{+}2)(i\ \ i{+}1)(i{+}1\ \ i{+}2) = \iota.$$
The third follows from the fact that if $|i - j| > 1$ then the transpositions
$x_i = (i\ \ i{+}1)$ and $x_j = (j\ \ j{+}1)$ are disjoint, and therefore commute:
$$(i\ \ i{+}1)(j\ \ j{+}1) = (j\ \ j{+}1)(i\ \ i{+}1).$$
We can thus define a surjective homomorphism $f : G_n \to S_n$ by
mapping each generator $x_i$ in $G_n$ to the transposition $(i\ \ i{+}1)$ in $S_n$. If we
can also show this homomorphism is injective, then we've finished.
To do this, we can use the fact that $|S_n| = n!$ and show by induction
that $|G_n| \le n!$, which will then imply that $f$ must be injective.
When $n = 1$ we have the trivial group $G_1 = \langle \ : \ \rangle$, which has order
$|G_1| = 1 \le 1!$ as required. For $n = 2$ we have $G_2 = \langle x_1 : x_1^2 = e \rangle \cong \mathbb{Z}_2$
and hence $|G_2| = 2 \le 2!$ as well.
Now suppose $n \ge 3$ and $|G_{n-1}| \le (n-1)!$ for the induction hypothesis.
Let $H \le G_n$ be the subgroup generated by the first $n-2$ generators
$x_1, \ldots, x_{n-2}$ and define
$$y_0 = e \quad \text{and} \quad y_i = x_{n-1} x_{n-2} \ldots x_{n-i} \quad \text{for } 1 \le i \le n-1.$$
So
$$y_0 = e, \quad y_1 = x_{n-1}, \quad y_2 = x_{n-1} x_{n-2}, \quad \ldots, \quad y_{n-1} = x_{n-1} x_{n-2} \ldots x_1.$$
Now let
$$A = \{ h y_i : h \in H \text{ and } 0 \le i \le n-1 \}.$$
This is certainly a subset of $G_n$, but we want to show that it's equal to
the whole of $G_n$. To do this, we consider products of the form $h y_i x_j$,
where $h$ is some element of $H$. There are six cases to consider:
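Case 1 ($i = 0$, $j < n-1$) $h y_i x_j = h y_0 x_j = h x_j \in H y_0 \subseteq A$.
Case 2 ($i = 0$, $j = n-1$) $h y_i x_j = h y_0 x_{n-1} = h x_{n-1} = h y_1 \in H y_1 \subseteq A$.
Case 3 ($i > 0$, $j > n-i$)
$$\begin{aligned}
h y_i x_j &= h (x_{n-1} \ldots x_j x_{j-1} \ldots x_{n-i}) x_j \\
&= h (x_{n-1} \ldots x_j x_{j-1} x_j x_{j-2} \ldots x_{n-i}) \\
&= h (x_{n-1} \ldots x_{j+1} x_{j-1} x_j x_{j-1} x_{j-2} \ldots x_{n-i}) \\
&= h (x_{j-1} x_{n-1} \ldots x_{j+1} x_j x_{j-1} \ldots x_{n-i}) \\
&= h x_{j-1} y_i \in H y_i \subseteq A.
\end{aligned}$$
Case 4 ($i > 0$, $j = n-i$)
$$\begin{aligned}
h y_i x_j &= h x_{n-1} \ldots x_{n-i} x_{n-i} \\
&= h x_{n-1} \ldots x_{n-i+1} \\
&= h y_{i-1} \in H y_{i-1} \subseteq A.
\end{aligned}$$
Case 5 ($i > 0$, $j = n-i-1$)
$$\begin{aligned}
h y_i x_j &= h x_{n-1} \ldots x_{n-i} x_{n-i-1} \\
&= h y_{i+1} \in H y_{i+1} \subseteq A.
\end{aligned}$$
Case 6 ($i > 0$, $j < n-i-1$) $h y_i x_j = h x_j y_i \in H y_i \subseteq A$.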


Therefore $A x_j \subseteq A$ for all $j$, and since $x_j^2 = e$ we also know that
$A x_j^{-1} = A x_j \subseteq A$.
Not only that, if $w$ is any word in the generators $x_1, \ldots, x_{n-1}$ and their
inverses, then $Aw \subseteq A$. (This can be rigorously proved by induction
on the length $l(w)$ of the word $w$, but we'll skip the details here.)
In fact, any word $w \in G_n$ is equal to some element of $A$, because $e \in A$
and so $w = ew \in Aw \subseteq A$. Hence $A = G_n$.
The relations in $G_n$ involving just the first $n-2$ generators $x_1, \ldots, x_{n-2}$
are exactly those which lie in $G_{n-1}$, so the map $\phi : G_{n-1} \to G_n$ such
that each $x_i \mapsto x_i$ is a homomorphism with $\operatorname{im} \phi = H$. Then
$$|G_n| \le |A| \le n|H| \le n(n-1)! = n!$$
Since $|G_n| \le n!$ and $f : G_n \to S_n$ is surjective, and since we know
from Proposition 1.52 that $|S_n| = n!$, this means that the only possibility
is that $f$ is also injective, and hence an isomorphism.
Thus $G_n \cong S_n$ and the given presentation is a valid one for $S_n$.
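The first half of this proof, that adjacent transpositions satisfy the three families of relations and generate all of $S_n$, can be confirmed by direct computation for small $n$. The Python sketch below is illustrative only: permutations act on $\{0, \ldots, n-1\}$ (a 0-based relabelling chosen for convenience), and the function names are invented here. It does not address the harder half of the proof, namely that no further relations are needed.

```python
from math import factorial


def compose(p, q):
    """Return the permutation 'q first, then p' (p[i] is the image of i)."""
    return tuple(p[q[i]] for i in range(len(p)))


def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def adjacent_transposition(i, n):
    """The transposition swapping i and i+1 (0-based) among n points."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)


def check_symmetric(n):
    identity = tuple(range(n))
    x = [adjacent_transposition(i, n) for i in range(n - 1)]
    relations_hold = (
        all(compose(g, g) == identity for g in x)
        and all(
            power(compose(x[i], x[i + 1]), 3) == identity for i in range(n - 2)
        )
        and all(
            compose(x[i], x[j]) == compose(x[j], x[i])
            for i in range(n - 1)
            for j in range(i + 2, n - 1)
        )
    )
    # Close the generators under composition to count what they generate.
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in x:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return relations_hold, len(seen)


for n in range(1, 6):
    print(n, check_symmetric(n), factorial(n))
```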
The given presentation for $S_n$ is a bit complicated in general, so it's
illuminating to look at a specific example.
Example 5.34 The symmetric group $S_3$ has presentation
$$\langle x, y : x^2 = y^2 = (xy)^3 = e \rangle.$$
The symmetric group $S_4$ has presentation
$$\langle x, y, z : x^2 = y^2 = z^2 = (xy)^3 = (yz)^3 = e,\ xz = zx \rangle.$$

We also happen to know $S_3 \cong D_3$, and Proposition 5.28 tells us that
$$D_3 \cong \langle a, b : a^3 = b^2 = e,\ ba = a^{-1}b \rangle.$$
Since these two presentations yield isomorphic groups, there must
surely be some structural connection between them. Ideally, we want a
precise way of deciding whether two presentations define isomorphic
groups or not. As with many questions in mathematics, the answer
is simple and elegant, but sometimes not as practically useful as we
might prefer. First, the good news.
Proposition 5.35 Let $G = \langle X : R \rangle$ be a group.
(i) If $r \in \langle R^{F(X)} \rangle$ is a relator of $G$ then $G \cong \langle X : R \cup \{r\} \rangle$.
(ii) Let $y \notin X$ be another generator and $w \in W(X)$ be some word in
$F(X)$. Then $G \cong \langle X \cup \{y\} : R \cup \{yw^{-1}\} \rangle$, where the isomorphism
restricts to the identity $\mathrm{id}_X$ on $X$.

Proof To show part (i) we note that if $r$ is a relator of $G$, then it's
in the normal closure $\langle R^{F(X)} \rangle$, and hence $\langle (R \cup \{r\})^{F(X)} \rangle = \langle R^{F(X)} \rangle$.
Therefore
$$\langle X : R \cup \{r\} \rangle = F(X)/\langle (R \cup \{r\})^{F(X)} \rangle = F(X)/\langle R^{F(X)} \rangle = \langle X : R \rangle \cong G.$$
The proof of part (ii) is a little more involved, and uses the fundamental
mapping property of group presentations (Proposition 5.17, page 149). Let $Y = X \cup \{y\}$ and as
usual let $F(Y)$ denote the free group on $Y$. Let $H = \langle X \cup \{y\} :
R \cup \{yw^{-1}\} \rangle$ be the group obtained by adding the new generator $y$
and the new relator $yw^{-1}$.
We want to show that $G \cong H$, so to that end we define a function
$\phi : X \to H$ such that every generator in $X$ gets mapped to itself in
$Y = X \cup \{y\} \subset H$. Every relator in $G$ is also a relator in $H$, so we can
use Proposition 5.17 to extend this to a homomorphism $\hat{\phi} : G \to H$.
We can also define a homomorphism going the other way: let $\psi : Y \to
G$ be such that $\psi(x) = x$ for all $x \in X$ and $\psi(y) = w$. For any relator
$r \in R$ we have $\psi(r) = e_G$, and also $\psi(yw^{-1}) = \psi(y)w^{-1} = ww^{-1} =
e_G$. Hence we can use Proposition 5.17 again to extend this to a
homomorphism $\hat{\psi} : H \to G$.
The composite $\hat{\psi} \circ \hat{\phi} : G \to G$ extends the inclusion map $i : X \hookrightarrow G$,
and $\hat{\phi} \circ \hat{\psi} : H \to H$ extends the inclusion map $j : Y \hookrightarrow H$. The identity
homomorphisms $\mathrm{id}_G : G \to G$ and $\mathrm{id}_H : H \to H$ also do this, and
so by the uniqueness property in Proposition 5.17, $\hat{\psi} \circ \hat{\phi} = \mathrm{id}_G$ and
$\hat{\phi} \circ \hat{\psi} = \mathrm{id}_H$, so $\hat{\phi}$ and $\hat{\psi}$ must be isomorphisms. Hence $G \cong H$.
This tells us that there are two specific things we can do to a group
presentation that won't fundamentally change the group it describes. We
can add a redundant relator, and we can add a new generator together
with a new relator that equates it to an existing element of the group.
We can also do these in reverse: we can delete a redundant relator,
and we can remove a redundant generator and corresponding relator.
More precisely, we have the following definition:
Definition 5.36 Let $G = \langle X : R \rangle$. By Proposition 5.35 the following
operations will give a presentation for a group isomorphic to $G$:
$X^+$ Add a new generator $y \notin X$ together with a new relator $yw^{-1}$
where $w \in W(X)$. Then $G \cong \langle X \cup \{y\} : R \cup \{yw^{-1}\} \rangle$.
$X^-$ Delete a redundant generator $y \in X$ and relator $yw^{-1} \in R$ if
neither $w$ nor any other relator in $R$ contains $y$ or $y^{-1}$. Then
$G \cong \langle X \setminus \{y\} : R \setminus \{yw^{-1}\} \rangle$.
$R^+$ Add a new relator $r \in \langle R^{F(X)} \rangle \setminus R$. Then $G \cong \langle X : R \cup \{r\} \rangle$.
$R^-$ Delete a redundant relator $r \in \langle R^{F(X)} \rangle$ where $\langle (R \setminus \{r\})^{F(X)} \rangle =
\langle R^{F(X)} \rangle$. Then $G \cong \langle X : R \setminus \{r\} \rangle$.
These operations are called Tietze transformations, originally
formulated by the Austrian mathematician Heinrich Tietze (1880-1964).
Tietze transformations are typically applied in combinations, and are
best illustrated by a few concrete examples.
Example 5.37 Let $G_{m,n} = \langle X : R \rangle = \langle x, y : x^m y^{-n} \rangle$. We can apply
an $R^+$ operation to introduce the relator $y^n x^{-m}$. This is certainly
allowed, since if $x^m y^{-n}$ is in the normal closure $\langle R^{F(X)} \rangle$ then so is
$y^n x^{-m} = (x^m y^{-n})^{-1}$, so by Proposition 5.35 we have
$$G_{m,n} \cong \langle X : R \cup \{y^n x^{-m}\} \rangle = \langle x, y : x^m y^{-n},\ y^n x^{-m} \rangle.$$
Having done this, we can use $R^-$ to delete $x^m y^{-n}$, since the normal
closure $\langle (R \setminus \{x^m y^{-n}\})^{F(X)} \rangle = \langle R^{F(X)} \rangle$. This gives us
$$\langle x, y : y^n x^{-m} \rangle$$
and again by Proposition 5.35 this will be the same group $G_{m,n}$. We
will quite often have cause to use a double operation of this type,
where a relator $r$ is replaced with an equivalent relator $s$. We may
denote such an addition-deletion move as $R^{\pm}$.
(This group $G_{m,n}$ happens to be the fundamental group of the
complement of a torus knot of type $(m, n)$.)

We can take this a little further and replace any relator with a conjugate
of itself, in particular a cyclic permutation:
Example 5.38 Let $G_{m,n} = \langle X : R \rangle = \langle x, y : x^m y^{-n} \rangle$. Since $x^m y^{-n} = e$
we can take the conjugate with $x$ to get $x^{-1} x^m y^{-n} x = x^{-1} e x = e$
and hence $x^{m-1} y^{-n} x = e$. So $x^{m-1} y^{-n} x$ is also in the normal closure
$\langle R^{F(X)} \rangle$ and hence we can perform an $R^{\pm}$ operation to get
$$G_{m,n} \cong \langle x, y : x^{m-1} y^{-n} x \rangle.$$

More generally, we can replace any relator by a conjugate or cyclic
permutation by means of the $R^{\pm}$ double operation.
Another consequence of this example relates to the $X^-$ move, in which
a generator $y$ can be deleted if there exists a relator of the form $yw^{-1}$
where $w$ is some word in the remaining generators and their inverses,
and no other relator contains $y$. Example 5.38 tells us it doesn't matter
if the chosen generator $y$ isn't at the beginning of some relator $yw^{-1}$,
because we can use a sequence of $R^{\pm}$ moves to cyclically permute a
relator $u^{-1} y v^{-1}$ into the form $y v^{-1} u^{-1} = y(uv)^{-1}$.
The next example shows that it doesn't even matter if $y$, the generator
we want to eliminate, happens to occur in other relators as well, as
long as there's one that just contains a single occurrence of $y$ or $y^{-1}$.

Example 5.39 Let $G = \langle X : R \rangle = \langle x, y : x^2 y^{-3},\ yx^5 \rangle$. We can take
the relator $yx^5 = e$ and rearrange it to get $y = x^{-5}$, which we can
then substitute into the other relator to get $e = x^2 (x^{-5})^{-3} = x^{17}$.
This means that $x^{17} \in \langle R^{F(X)} \rangle$ and so we can perform an $R^+$ move to
obtain the presentation $G \cong \langle X : R \cup \{x^{17}\} \rangle = \langle x, y : x^2 y^{-3},\ yx^5,\ x^{17} \rangle$.
We can now apply an $R^-$ move to delete the now redundant relator
$x^2 y^{-3}$, obtaining the presentation
$$G \cong \langle x, y : yx^5,\ x^{17} \rangle.$$
Finally, we can perform an $X^-$ transformation to delete the generator
$y$ and the relator $yx^5$, obtaining the much simpler presentation
$$G \cong \langle x : x^{17} \rangle \cong \mathbb{Z}_{17}.$$

We will denote a compound transformation of this type, consisting
of a number of $R^+$ moves followed by a corresponding number of
$R^-$ moves and an $X^-$ move, by the symbol $R^{\pm}X^{-}$.
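The exponent bookkeeping in Example 5.39 is easy to get wrong by hand, so here is a small check. It is a sketch under an explicit assumption: the relators are read as $x^2 y^{-3}$ and $yx^5$, which is the reading consistent with the stated result $x^{17}$. The helper name `exponent_after_substitution` is invented for this illustration.

```python
def exponent_after_substitution(x_exp, y_exp, y_as_power_of_x):
    """Exponent of x in x^x_exp * y^y_exp once y is replaced by
    x^y_as_power_of_x."""
    return x_exp + y_exp * y_as_power_of_x


# Relator x^2 y^-3 with y = x^-5 (from the relator y x^5).
order = exponent_after_substitution(2, -3, -5)

# Sanity check in the cyclic group Z_order, written additively,
# with x = 1 and y = -5: both relators should evaluate to 0.
x = 1
y = (-5) % order
first_relator = (2 * x - 3 * y) % order    # x^2 y^-3
second_relator = (y + 5 * x) % order       # y x^5

print(order, first_relator, second_relator)
```

The same numbers appear as the determinant of the exponent matrix with rows $(2, -3)$ and $(5, 1)$, namely $2 \cdot 1 - (-3) \cdot 5 = 17$.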

The next example leads on from Example 5.34 and Proposition 5.28 to
show that the two presentations for $D_3 \cong S_3$ are indeed equivalent.
Example 5.40 From Proposition 5.28 we know that
$$D_3 \cong \langle a, b : a^3,\ b^2,\ (ab)^2 \rangle,$$
and from Proposition 5.33 and Example 5.34 we know that
$$D_3 \cong S_3 \cong \langle x, y : x^2,\ y^2,\ (xy)^3 \rangle.$$
First we apply an $X^+$ move to the first presentation, introducing the
generator $x$ and the relator $x^{-1}ab$ (equivalent to $x = ab$). This yields
$$\langle a, b, x : a^3,\ b^2,\ (ab)^2,\ x^{-1}ab \rangle.$$
Next, we perform another $X^+$ move to introduce another generator
$y$ and relator $by^{-1}$ (which is equivalent to the relation $y = b$). This
gives
$$\langle a, b, x, y : a^3,\ b^2,\ (ab)^2,\ x^{-1}ab,\ by^{-1} \rangle.$$
We can now use an $R^+$ move to add a new relator $y^2$ and follow it
with an $R^-$ move to delete the relator $b^2$; this results in the
presentation
$$\langle a, b, x, y : a^3,\ (ab)^2,\ x^{-1}ab,\ by^{-1},\ y^2 \rangle.$$
Similarly, we can replace the relator $x^{-1}ab$ with $x^{-1}ay$, and replace
the relator $(ab)^2$ with $(ay)^2$ to obtain the presentation
$$\langle a, b, x, y : a^3,\ (ay)^2,\ x^{-1}ay,\ by^{-1},\ y^2 \rangle.$$
We can now perform an $X^-$ move to delete the generator $b$ and the
relator $by^{-1}$, in order to get the presentation
$$\langle a, x, y : a^3,\ (ay)^2,\ x^{-1}ay,\ y^2 \rangle.$$
Another $R^{\pm}$ transformation replaces the relator $x^{-1}ay$ with its cyclic
conjugate $ayx^{-1}$, and then we perform two more $R^{\pm}$ moves to replace
$a^3$ with $(xy^{-1})^3$ and replace $(ay)^2$ with $x^2$ to get the presentation
$$\langle a, x, y : (xy^{-1})^3,\ x^2,\ ayx^{-1},\ y^2 \rangle.$$
All that remains now is to do another $X^-$ transformation to delete
the generator $a$ and the relator $ayx^{-1}$, and then do another $R^{\pm}$
transformation to replace $(xy^{-1})^3$ with $(xy)^3$ to get the presentation
into the required form
$$\langle x, y : (xy)^3,\ x^2,\ y^2 \rangle.$$
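The substitutions $x = ab$ and $y = b$ used in Example 5.40 can be tested on actual permutations. The Python sketch below is only a consistency check on one concrete copy of $S_3$, not a replacement for the Tietze argument: the choice of `a` as a 3-cycle and `b` as a transposition on $\{0, 1, 2\}$ is an assumption made for illustration.

```python
def compose(p, q):
    """Return the permutation 'q first, then p' (p[i] is the image of i)."""
    return tuple(p[q[i]] for i in range(len(p)))


def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


identity = (0, 1, 2)
a = (1, 2, 0)   # a 3-cycle, playing the role of the rotation
b = (1, 0, 2)   # a transposition, playing the role of the reflection

x = compose(a, b)   # the new generator x = ab
y = b               # the new generator y = b

first_presentation = [power(a, 3), power(b, 2), power(compose(a, b), 2)]
second_presentation = [power(x, 2), power(y, 2), power(compose(x, y), 3)]

print(all(w == identity for w in first_presentation))    # a^3, b^2, (ab)^2
print(all(w == identity for w in second_presentation))   # x^2, y^2, (xy)^3
print(compose(x, y) == a)   # a = x y^{-1} = xy, since y^2 = e
```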
This example is a bit long-winded: we went through the sequence of
Tietze transformations in a bit more detail than would ordinarily be
required, but hopefully it's given you a clearer idea of how the process
works.
In a sense this process is analogous to the study of systems of
simultaneous linear equations. Given a collection of such equations, we want
to find the solution set; that is, the complete set of possible values
each of the variables can take and still satisfy the equations. There are
certain basic operations we can perform on a given system that won't
change the solution set, but which will (if we've done it all correctly)
simplify the system and make it easier to solve. We can multiply both
sides of an equation by a nonzero scalar, and we can add a scalar
multiple of one equation to another. We'll come back to this idea
in the next section when we study presentations of abelian groups,
where the similarities will become more obvious.
Proposition 5.35 tells us that the Tietze transformations $X^+$, $X^-$, $R^+$
and $R^-$ don't change the group in question. That is, if we have
a group $G$ defined by a presentation $\langle X : R \rangle$ and we perform a
finite sequence of Tietze transformations on it, ending up with a new
presentation $\langle Y : S \rangle$, then this new presentation defines a group which
is isomorphic to the original group $G$.
What Proposition 5.35 doesn't tell us is whether, given two
presentations $\langle X : R \rangle$ and $\langle Y : S \rangle$ for the same group, we can transform
one into another by a finite sequence of Tietze transformations. As it
happens, we can do exactly this, as the following proposition confirms.
Proposition 5.41 Let $\langle X : R \rangle$ and $\langle Y : S \rangle$ be two presentations for
the same group $G$. Then there exists a finite sequence of Tietze moves
transforming $\langle X : R \rangle$ into $\langle Y : S \rangle$.

Proof Given that $G = \langle X : R \rangle \cong \langle Y : S \rangle$ it follows that both $X$ and
$Y$ generate $G$, and hence we can write every element of $X$ as a word
in $F(Y)$, and write every element of $Y$ as a word in $F(X)$. Denote
these sets of equations by $X(Y)$ and $Y(X)$ respectively. Starting with
the presentation $\langle X : R \rangle$, we perform $X^+$ and X transformations to
obtain the presentation
\[ \langle X, Y : R(X) \cup Y(X) \rangle \]
(where $R(X)$ denotes the relators $R$ expressed as words in $F(X)$). Next
perform $R^+$ moves to transform this into the presentation
\[ \langle X, Y : R(X) \cup Y(X) \cup X(Y) \rangle. \]
We can now use RX moves to eliminate the generators in $X$ to get
\[ \langle Y : R(X(Y)) \cup Y(X(Y)) \rangle. \]
Here $R(X(Y))$ denotes the relators from $R$ written in terms of the
generators in $Y$ via the equations in $X(Y)$, and $Y(X(Y))$ denotes the
generators in $Y$ written as words in $F(Y)$ via the equations in $X(Y)$. A
sequence of $R^+$ moves allows us to adjoin the relators $S(Y)$ to get
\[ \langle Y : R(X(Y)) \cup Y(X(Y)) \cup S(Y) \rangle \]
and finally we perform a sequence of $R^-$ moves to delete the relators
in $R(X(Y))$ and $Y(X(Y))$ to obtain the presentation
\[ \langle Y : S \rangle \]
as required.
Putting Propositions 5.35 and 5.41 together, we now know that two
presentations $\langle X : R \rangle$ and $\langle Y : S \rangle$ give isomorphic groups if and only
if we can turn one into the other by a finite sequence of the four Tietze
transformations (which in practice we usually employ in compound
forms such as X and RX).
This is the good news. The bad news is that, in general, whether or
not two arbitrary words yield the same element of a given group is a
formally undecidable question: this is known as the word problem.
There are many classes of groups for which the word problem is
solvable, including finite groups, the Euclidean groups we met in
Definitions 1.47 and 1.48, and the Coxeter groups and braid groups
we'll meet in Section 5.C.
But in general the word problem is unsolvable, and one of the consequences
of this is that there is no systematic, algorithmic method for
deciding whether two arbitrary presentations determine isomorphic
groups. (The proof of Proposition 5.41 worked because we'd assumed
from the start that the two presentations defined isomorphic groups.)
Nevertheless, this machinery of group presentations is still very useful
for many purposes, and there are many powerful techniques and
results that can be derived using it. In the next section we will state
and prove a classification theorem for finitely generated abelian groups,
and in Section 5.A we will study the Todd–Coxeter algorithm which
calculates several useful pieces of information about a given group.
5.3 Finitely generated abelian groups

It is the peculiar beauty of this method,
gentlemen, and one which endears it to
the really scientific mind, that under no
circumstance can it be of the smallest
possible utility.
Henry Smith (1826–1883)

In Section 5.1 we studied free groups, the most general possible
groups we can generate from a particular set of generators. It's worth
asking what is the most general possible abelian group we can construct
from a given set of generators. We could go through the same
construction as for free groups, in addition requiring that all the generators
commute with each other. This definitely results in an abelian
group whose elements are commuting words in the generators. Unlike
with the more general case, we allow arbitrary reordering
of generators, which in particular means that we can group together
generators of the same type in a nice, neat and straightforward way.
For example, suppose $X = \{x, y, z\}$ is a set of generators, and that
\[ w_1 = x^2 y^{-1} x z^3 y^{-2} \quad\text{and}\quad w_2 = y^{-1} z^{-1} x^2 z^3 y^{-2}. \]
Allowing arbitrary reordering means we can reduce these two words
to get
\[ w_1 = x^3 y^{-3} z^3 \quad\text{and}\quad w_2 = x^2 y^{-3} z^2. \]
Concatenation, reduction and reordering of these words gives
\[ w_1 w_2 = x^5 y^{-6} z^5 = w_2 w_1. \]
As before, this concatenation–reordering–reduction operation is an
associative binary operation on the set of commuting words in $X$,
the empty word serves as an identity element, and each word has a
unique, well-defined inverse; for example
\[ w_1^{-1} = x^{-3} y^3 z^{-3} \quad\text{and}\quad w_2^{-1} = x^{-2} y^3 z^{-2}. \]
We therefore have all the necessary ingredients for an abelian group:
Definition 5.42 The group just described is the free abelian group
$FA(X)$ generated by the set $X$. If $X$ is finite and $|X| = n$, we will
often denote this as $FA_n$.
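The construction can be tried out on a computer. The following is a minimal sketch in Python, not part of the formal development: it stores a commuting word as a dictionary of exponents, and the function names `reduce_word`, `multiply` and `inverse` are our own. The signs of the exponents in $w_1$ and $w_2$ are our reading of the example above.

```python
from collections import Counter

def reduce_word(letters):
    """Collapse a word, given as (generator, exponent) pairs, into exponent totals."""
    total = Counter()
    for gen, exp in letters:
        total[gen] += exp
    # Generators whose exponents cancel completely drop out of the word.
    return {g: e for g, e in total.items() if e != 0}

def multiply(u, v):
    """Concatenate two commuting words, then reorder and reduce."""
    return reduce_word(list(u.items()) + list(v.items()))

def inverse(u):
    """Negate every exponent."""
    return {g: -e for g, e in u.items()}

# Assumed reading: w1 = x^2 y^-1 x z^3 y^-2 and w2 = y^-1 z^-1 x^2 z^3 y^-2.
w1 = reduce_word([("x", 2), ("y", -1), ("x", 1), ("z", 3), ("y", -2)])
w2 = reduce_word([("y", -1), ("z", -1), ("x", 2), ("z", 3), ("y", -2)])

print(w1, w2, multiply(w1, w2), multiply(w1, inverse(w1)))
```

Because a dictionary records only the total exponent of each generator, the order of the letters is forgotten automatically, which is exactly the commutativity we imposed. The empty dictionary plays the role of the empty word.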
Free abelian groups $FA(X)$ satisfy a fundamental mapping property similar
to that of the free group $F(X)$, except that we restrict our attention to
abelian groups:
Definition 5.43 Let $X \subseteq F$ be a set, and $A$ be some arbitrary abelian
group. Then an abelian group $F$ is free abelian on $X$ if any function
$f : X \to A$ extends uniquely to a homomorphism $\hat{f} : F \to A$.

A slight modification of the proof of Proposition 5.6 confirms that
these two definitions are equivalent. But another way to look at this,
and one that will lead us naturally into the next section, is to say that
in constructing the free abelian group $FA(X)$ essentially what we've
done is to take $F(X)$ and make it abelian: we've made everything
commute with everything else.
This is exactly the abelianisation process we met in Chapter 3. In
constructing $FA(X)$ we've factored $F(X)$ by its commutator subgroup
$[F(X), F(X)]$. That is, $FA(X) = F(X)^{\mathrm{ab}}$.
Depending on whether we've decided to use multiplicative or additive
notation, we can write a typical element $w$ of $FA(X)$ as either
\[ w = x_1^{n_1} x_2^{n_2} \cdots = \prod_{x \in X} x^{n_x} \quad\text{or}\quad w = n_1 x_1 + n_2 x_2 + \cdots = \sum_{x \in X} n_x x \]
where $n_1, n_2, \ldots \in \mathbb{Z}$. Most of the time, we'll use additive notation for
free abelian groups and their quotients.
Proposition 5.44 Let $X = \{x_1, \ldots, x_n\}$ be a finite set of generators.
Then the free abelian group $FA_n = FA(X)$ is isomorphic to the $n$-fold
direct sum $\mathbb{Z}^n = \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}$.

Proof Written additively, an element of $FA_n$ is a word of the form
\[ w = m_1 x_1 + \cdots + m_n x_n \]
where $m_1, \ldots, m_n \in \mathbb{Z}$.
The function $f : FA_n \to \mathbb{Z}^n$ that maps a word $w$ to the $n$-tuple
$(m_1, \ldots, m_n)$ in $\mathbb{Z}^n$ is clearly bijective. It's also a homomorphism,
since for any
\[ u = a_1 x_1 + \cdots + a_n x_n \quad\text{and}\quad v = b_1 x_1 + \cdots + b_n x_n \]
we have $u + v = (a_1 + b_1) x_1 + \cdots + (a_n + b_n) x_n$ and so
\[ f(u+v) = (a_1+b_1, \ldots, a_n+b_n) = (a_1, \ldots, a_n) + (b_1, \ldots, b_n) = f(u) + f(v) \]
as required. Hence $f$ is an isomorphism.

The following proposition is the free abelian analogue of a very similar
fact in linear algebra.

Proposition 5.45 Let $F$ be an abelian group, and let $X = \{x_1, \ldots, x_n\} \subseteq F$.
Then $F$ is free abelian of rank $n$ and the elements of $X$ are $\mathbb{Z}$-linearly
independent if and only if any element of $F$ can be written as a unique
$\mathbb{Z}$-linear combination of the elements $x_1, \ldots, x_n$.

Proof Suppose that any element $g \in F$ can be written uniquely as the
$\mathbb{Z}$-linear combination
\[ g = a_1 x_1 + \cdots + a_n x_n \]
where $a_1, \ldots, a_n \in \mathbb{Z}$. Clearly $X$ generates $F$, but we want to show that
$F$ is actually the free abelian group $FA(X)$ generated by $X$, and which
consists of all possible formal $\mathbb{Z}$-linear combinations of $x_1, \ldots, x_n$.
Let $f : X \hookrightarrow F$ be the inclusion function. Then by the fundamental
mapping property of free abelian groups, there exists a unique homomorphism
$\hat{f} : FA(X) \to F$ which extends $f$, and which maps a given
$\mathbb{Z}$-linear combination in $FA(X)$ to the corresponding sum in $F$. We
want to show that this is an isomorphism. It is clearly surjective, since
we've already hypothesised that any element of $F$ can be expressed
as a $\mathbb{Z}$-linear combination of $x_1, \ldots, x_n$. The uniqueness requirement
ensures that there are no two distinct $\mathbb{Z}$-linear combinations in $FA(X)$
which map to the same element in $F$, and hence $\hat{f}$ is injective as well.
Therefore $F \cong FA(X)$, which is the free abelian group of rank $n$.
Applying the uniqueness requirement to the element $0$ also shows that
the elements of $X$ are $\mathbb{Z}$-linearly independent.
Conversely, suppose that $F = FA(X) = FA_n$ and that the elements of
$X \subseteq F$ are $\mathbb{Z}$-linearly independent.
Since $X$ generates $F$ we know straight away that any $g \in F$ can be
written as a $\mathbb{Z}$-linear combination
\[ g = a_1 x_1 + \cdots + a_n x_n \]
for some $a_1, \ldots, a_n \in \mathbb{Z}$. Suppose there exists another $\mathbb{Z}$-linear
combination
\[ g = b_1 x_1 + \cdots + b_n x_n. \]
Subtracting one from the other yields
\[ (a_1 - b_1) x_1 + \cdots + (a_n - b_n) x_n = 0 \]
and since the generators $x_1, \ldots, x_n$ are $\mathbb{Z}$-linearly independent,
\[ a_1 - b_1 = 0, \quad \ldots, \quad a_n - b_n = 0 \]
and therefore
\[ a_1 = b_1, \quad \ldots, \quad a_n = b_n. \]
Hence any such $\mathbb{Z}$-linear combination for $g$ must be unique.

(Julius Wilhelm) Richard Dedekind (1831–1916) was born in Braunschweig,
Germany, the fourth and youngest child of Julius Dedekind, a professor
at the Collegium Carolinum (now the Technische Universität Braunschweig).
He entered the University of Göttingen in 1850, where he studied with Carl
Friedrich Gauss (1777–1855). Dedekind became Gauss's last student, and in 1852
was awarded a doctorate for a thesis entitled Über die Theorie der Eulerschen
Integrale (On the theory of Eulerian integrals). As is often the case, this thesis
was adequate but did not display the talent evident in his later work. Over
the next two years he studied the latest developments in mathematics, and
in 1854 was awarded his habilitation (a postdoctoral qualification required to
teach at a German university).
During his career he made many important advances in analysis, algebra
and number theory. In particular, he devised the notion of a Dedekind
cut, used in one construction of the real numbers $\mathbb{R}$. He became an early
admirer of the work of Georg Cantor (1845–1918) on transfinite numbers,
and was a staunch ally in his disputes with Leopold Kronecker (1823–1891).
He also devised the concept of an ideal and used it to provide a purely
algebraic proof of the Riemann–Roch Theorem, an important result in algebraic
geometry.
He retired in 1894 and returned to Braunschweig, but continued his research
and taught occasionally for some years. He died in 1916 aged 84.
Some pages ago we stated the Nielsen–Schreier Theorem (Theorem 5.11,
page 145), which says that a subgroup of a free group is itself free. The
proof of this is a little involved and is postponed until later in this
chapter, but the corresponding result for free abelian groups is a little
easier to prove:
Theorem 5.46 (Richard Dedekind (1831–1916)) Let $F = FA_r$ be a free
abelian group of rank $r$, and let $G$ be a subgroup of $F$. Then $G$ is a free
abelian group of rank at most $r$.

Proof We will prove this theorem by induction on $r$. The base case
($r = 1$) yields $FA_1 \cong \mathbb{Z}$ and we know that any subgroup of $\mathbb{Z}$ is either
trivial or of the form $n\mathbb{Z} = \langle n \rangle$ for some nonzero $n \in \mathbb{Z}$. The trivial group
can be regarded as $FA_0 = FA(\varnothing)$, the free abelian group with no
generators. A nontrivial cyclic subgroup $n\mathbb{Z} \leq \mathbb{Z}$ is itself isomorphic to $\mathbb{Z}$ by the
mapping $nk \mapsto k$ for any element $nk \in n\mathbb{Z}$.
Now suppose that $r > 1$ and that the required statement holds for
free abelian groups of rank $r-1$. Let $H \cong FA_{r-1}$ be the subgroup of
$F = FA_r$ generated by the first $r-1$ generators $x_1, \ldots, x_{r-1}$, and let
$K \cong \mathbb{Z} \cong FA_1$ be the subgroup of $F$ generated by the $r$th generator $x_r$:
\[ H = \langle x_1, \ldots, x_{r-1} \rangle \cong FA_{r-1} \quad\text{and}\quad K = \langle x_r \rangle \cong \mathbb{Z} \cong FA_1. \]
We can see that $F = H \oplus K$.


By Proposition 2.9 $G \cap H$ is a subgroup of $H$, and hence by the induction
hypothesis it is also a free abelian group of rank $s \leq r-1$, with
generators $y_1, \ldots, y_s$. The Second Isomorphism Theorem (Theorem 4.62,
page 128) tells us
\[ G/(G \cap H) \cong (G + H)/H \]
where $G + H = \{ g + h : g \in G, h \in H \}$ (Definition 2.10, page 45).
This set $G + H$ is a subgroup of $F$, since it contains the identity $0$, it's
closed under addition, and it contains all the required inverses. Hence
$(G + H)/H$ is a subgroup of $F/H$. But $F/H$ is isomorphic to the subgroup
$K = \langle x_r \rangle \cong FA_1$, and so we can apply the induction base case ($r = 1$)
to see that $G/(G \cap H)$ is either trivial or isomorphic to $\mathbb{Z} \cong FA_1$.

If $G/(G \cap H)$ is trivial, then $G = G \cap H$ which we've already confirmed
is free abelian with rank $s \leq r-1$.
Suppose, then, that $G/(G \cap H) \cong \mathbb{Z}$. This group consists of the cosets
of the intersection $G \cap H$ in $G$, which are of the form $g + (G \cap H)$ for
$g \in G$. Since $G/(G \cap H)$ is isomorphic to the infinite cyclic group $\mathbb{Z}$
there must exist some $g \in G \setminus H$ for which $G/(G \cap H)$ is generated by
$g + (G \cap H)$; that is, $G/(G \cap H) = \langle g + (G \cap H) \rangle$.
Let $g = h + n x_r$ for some $h \in H$ and some nonzero $n \in \mathbb{Z}$ (nonzero
because $g \notin H$). We claim that $G$ is free
abelian with basis $Y = \{y_1, \ldots, y_s, g\}$. This set $Y$ certainly generates
$G$, but we need also to show that it's linearly independent over $\mathbb{Z}$.
Therefore, suppose that
\[ a_1 y_1 + \cdots + a_s y_s + a g = 0 \]
where $a, a_1, \ldots, a_s \in \mathbb{Z}$. Substituting in $g = h + n x_r$ gives
\[ a_1 y_1 + \cdots + a_s y_s + a h + n a x_r = 0 \tag{5.1} \]
and so
\[ n a x_r = -a h - (a_1 y_1 + \cdots + a_s y_s). \]
The left hand side of this equation is in $K = \langle x_r \rangle$, while the right hand
side lies in $H$ (since $y_1, \ldots, y_s \in G \cap H$ and $h \in H$). Therefore both
sides of the equation must belong to both $K$ and $H$, and hence to the
intersection $H \cap K$. But by the definition of $H$ and $K$ we know that
$H \cap K = \{0\}$, hence $n a x_r = 0$. Since $n \neq 0$, the only other possibility is
that $a = 0$. Substituting this back into (5.1) we get
\[ a_1 y_1 + \cdots + a_s y_s = 0. \]
The $y_1, \ldots, y_s$ are free abelian generators of the intersection $G \cap H$, and
hence by the $\mathbb{Z}$-linear independence condition (Proposition 5.45) it
must be the case that $a_1 = \cdots = a_s = 0$. This means that every
element of $G$ can be uniquely expressed as a $\mathbb{Z}$-linear combination of
the elements $y_1, \ldots, y_s, g \in Y$, and hence $G \cong FA(Y)$ and
$\operatorname{rank}(G) = s + 1 \leq r$.
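The base case of this induction is easy to check by machine. The sketch below is only an illustration under our own naming (`subgroup_generator` and `small_combinations` are hypothetical helpers): it computes the single generator $d$ of the subgroup of $\mathbb{Z}$ generated by a list of integers, and compares the $\mathbb{Z}$-linear combinations of 12 and 18 that land in a small window with the multiples of $d$ in that window.

```python
from functools import reduce
from math import gcd

def subgroup_generator(generators):
    """Return d >= 0 such that the subgroup of Z generated by the inputs is dZ."""
    return reduce(gcd, generators, 0)

def small_combinations(a, b, bound=10):
    """All values u*a + v*b with |u|, |v| <= bound."""
    return {u * a + v * b
            for u in range(-bound, bound + 1)
            for v in range(-bound, bound + 1)}

d = subgroup_generator([12, 18])
window = range(-30, 31)
# Combinations of 12 and 18 that fall inside the window ...
reachable = {k for k in small_combinations(12, 18) if k in window}
# ... compared with the multiples of d inside the same window.
multiples = {k for k in window if k % d == 0}
print(d, reachable == multiples)
```

A finite window cannot prove the general statement, of course; it only confirms that nothing unexpected happens for small coefficients. An empty list of generators gives $d = 0$, which corresponds to the trivial subgroup $FA_0$.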
Returning to our discussion of presentations of direct products and
direct sums, obviously if both groups $G$ and $H$ are abelian, then so
is their direct sum $G \oplus H$. We can see this from the presentation for
$G \oplus H$ given in Proposition 5.22. A presentation of the form
\[
\begin{aligned}
\mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_k} \oplus \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}
&= \langle x_1 : x_1^{n_1} \rangle \oplus \cdots \oplus \langle x_k : x_k^{n_k} \rangle \oplus \langle x_{k+1} : \ \rangle \oplus \cdots \oplus \langle x_m : \ \rangle \\
&= \langle x_1, \ldots, x_m : x_1^{n_1} = \cdots = x_k^{n_k} = e,\ [x_i, x_j] = e \text{ for } 1 \leq i, j \leq m \rangle
\end{aligned}
\]
gives an abelian group. Less obviously, every abelian group with a
finite presentation is of this form: all finitely-generated abelian groups
are isomorphic to a direct sum of a finite number of cyclic groups.
The proof of this classification is a little involved, and we will split it
up into a few more manageable pieces. First we introduce a few terms
that will be used in the statement of the theorem and the following
discussion and proof.
Definition 5.47 Suppose that
\[ G \cong \mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k} \oplus \mathbb{Z}^r \]
where $m_1, \ldots, m_k, r \in \mathbb{Z}$, $r \geq 0$ and $m_i \mid m_{i+1}$ for $1 \leq i < k$.
This is called an invariant factor decomposition of $G$, with torsion
coefficients or invariant factors $m_1, \ldots, m_k$ and rank $r$.
The statement of the classification theorem is as follows:
Theorem 5.48 Let $G$ be a finitely-generated abelian group. Then $G$ has a
unique invariant factor decomposition
\[ G \cong \mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k} \oplus \mathbb{Z}^r \]
in the sense that if $H$ is another finitely-generated abelian group with
invariant factor decomposition
\[ H \cong \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_l} \oplus \mathbb{Z}^s, \]
then $G \cong H$ if and only if $k = l$, $r = s$ and $m_i = n_i$ for $1 \leq i \leq k$.
Two important special cases are given by the following corollaries:
Corollary 5.49 Let $G$ be a finite abelian group. Then $G$ has a unique
invariant factor decomposition
\[ G \cong \mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k}. \]

Corollary 5.50 Let $G$ be a finitely-generated torsion-free abelian group.
Then $G \cong \mathbb{Z}^r$ for some non-negative integer $r$.
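Before we get to the proof of this theorem, we'll look at a few examples
to see how it all works.
Example 5.51 Let $G$ be an abelian group of order 12. Then since $G$
must be a direct sum of finite cyclic groups, we have the following
possibilities:
\[ \mathbb{Z}_{12}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_6, \quad \mathbb{Z}_3 \oplus \mathbb{Z}_4, \quad\text{and}\quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3. \]
But by Proposition 1.32 we know that
\[ \mathbb{Z}_3 \oplus \mathbb{Z}_4 \cong \mathbb{Z}_{12} \quad\text{and}\quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_6. \]
Only two of these four satisfy the invariant factor condition, namely
$\mathbb{Z}_{12}$ and $\mathbb{Z}_2 \oplus \mathbb{Z}_6$.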

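Example 5.52 Let's list all the abelian groups of order 48. The integer
48 factorises as $2 \times 2 \times 2 \times 2 \times 3$, so we have the following possibilities:
\[
\begin{gathered}
\mathbb{Z}_{48}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_{24}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_{12}, \\
\mathbb{Z}_4 \oplus \mathbb{Z}_{12}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_6, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6, \\
\mathbb{Z}_8 \oplus \mathbb{Z}_6, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3, \\
\mathbb{Z}_{16} \oplus \mathbb{Z}_3, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_8 \oplus \mathbb{Z}_3, \quad\text{and}\quad \mathbb{Z}_4 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3.
\end{gathered}
\]
From Proposition 1.32 we know that
\[
\begin{aligned}
\mathbb{Z}_{48} &\cong \mathbb{Z}_{16} \oplus \mathbb{Z}_3, \\
\mathbb{Z}_2 \oplus \mathbb{Z}_{24} &\cong \mathbb{Z}_8 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_8 \oplus \mathbb{Z}_3, \\
\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_{12} &\cong \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3, \\
\mathbb{Z}_4 \oplus \mathbb{Z}_{12} &\cong \mathbb{Z}_4 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3, \\
\text{and}\quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6 &\cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3.
\end{aligned}
\]
There are therefore five different abelian groups of order 48, whose
invariant factor decompositions are as follows:
\[ \mathbb{Z}_{48}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_{24}, \quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_{12}, \quad \mathbb{Z}_4 \oplus \mathbb{Z}_{12} \quad\text{and}\quad \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6. \]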
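Counts like these can be checked mechanically. The following Python sketch assumes that every invariant factor is greater than 1, so that trivial summands are left out, and it relies on the classification theorem rather than proving it. The function name `invariant_factor_chains` is our own. It lists every chain $m_1 \mid m_2 \mid \cdots \mid m_k$ whose product is the given order.

```python
def invariant_factor_chains(n, prev=1):
    """Yield lists [m1, ..., mk] with product n, each mi > 1, and
    prev | m1 | m2 | ... | mk."""
    if n == 1:
        yield []
        return
    for m in range(2, n + 1):
        # m must divide what is left of n, and be a multiple of the previous factor.
        if n % m == 0 and m % prev == 0:
            for rest in invariant_factor_chains(n // m, m):
                yield [m] + rest

chains_12 = list(invariant_factor_chains(12))
chains_48 = list(invariant_factor_chains(48))
print(chains_12)
print(chains_48)
```

Each chain corresponds to one isomorphism class, so the lengths of the two lists should agree with the counts found by hand in Examples 5.51 and 5.52.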
We'll split the proof of Theorem 5.48 into two parts: existence (every
finitely generated abelian group $G$ has an invariant factor decomposition)
and uniqueness (isomorphic finitely generated abelian groups
have the same invariant factor decomposition).
In order to prove the first part, we'll need the following lemma, which
is really just a consequence of Proposition 5.35 in an abelian setting.
Lemma 5.53 Let $X = \{x_1, \ldots, x_n\}$, and let
\[ Y = \{x_1, \ldots, x_{j-1}, x_j + m x_i, x_{j+1}, \ldots, x_n\} \]
where $1 \leq i \neq j \leq n$ and $m \in \mathbb{Z}$.
Then if $X$ is a basis for some abelian group $G$, so is $Y$.
So, for an abelian group $G$ we can replace any generator $x_j$ with a new
generator of the form $x_j + m x_i$ where $i \neq j$ and $m \in \mathbb{Z}$.
Proof This is just an application of Tietze transformations. Replacing
the generator $x_j$ with a new generator $y = x_j + m x_i$ amounts to an $X^+$
move introducing a generator $y$ and a relator $y - x_j - m x_i$, followed by
an $X^-$ move removing the generator $x_j$ and the relator $y - x_j - m x_i$.
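For a concrete feel for this lemma, take $G = \mathbb{Z}^2$ with basis $x_1 = (1,0)$, $x_2 = (0,1)$ and replace $x_2$ by $x_2 + m x_1$. The sketch below (the two function names are our own, and the choice of $\mathbb{Z}^2$ is just an illustrative assumption) converts coordinates between the old and new bases and checks that the conversion is reversible, which is what being a basis amounts to here.

```python
def coords_in_new_basis(v, m):
    """Coordinates of v = a*x1 + b*x2 with respect to the basis x1, x2 + m*x1."""
    a, b = v
    # a*x1 + b*x2 = (a - m*b)*x1 + b*(x2 + m*x1)
    return (a - m * b, b)

def from_new_basis(c, m):
    """Inverse conversion: c1*x1 + c2*(x2 + m*x1) written in terms of x1, x2."""
    c1, c2 = c
    return (c1 + m * c2, c2)

print(coords_in_new_basis((7, 2), 3))
```

Both conversions use only integer arithmetic, so every element of $\mathbb{Z}^2$ has exactly one expression in the new basis.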
Proof of Theorem 5.48 (existence) Our abelian group $G$ has generators
$x_1, \ldots, x_s$, and relators $r_1, \ldots, r_t$, each of which is of the form
\[ r_i = n_{i,1} x_1 + \cdots + n_{i,s} x_s. \]
If $t = 0$ (that is, if there are no relators) then we have the free abelian
group $FA_s$ of rank $s$, which is isomorphic to $\mathbb{Z}^s$ by Proposition 5.44.
Suppose instead that $t \geq 1$. Then there exists at least one nontrivial
relator, and amongst those relators there will be a smallest positive
coefficient $n_{i,j}$ attached to the generator $x_j$ in relator $r_i$.
We can relabel the generators and relators so that this coefficient is
$n_{1,1}$ and we're working with the relator
\[ r_1 = n_{1,1} x_1 + \cdots + n_{1,s} x_s. \]
(This won't change the group under investigation, except by an isomorphism,
the details of which don't matter to this discussion.)
Call this coefficient $m_1$, so our relator becomes
\[ m_1 x_1 + n_{1,2} x_2 + \cdots + n_{1,s} x_s. \]
This $m_1$ is intended to be the first torsion coefficient in the statement
of the theorem, and we want to show it's a factor of $n_{1,2}$.
Rewrite $n_{1,2} = a m_1 + b$, where $0 \leq b < m_1$. Then our relator becomes
\[
\begin{aligned}
& m_1 x_1 + (a m_1 + b) x_2 + n_{1,3} x_3 + \cdots + n_{1,s} x_s \\
&\quad = m_1 x_1 + a(m_1 x_2) + b x_2 + n_{1,3} x_3 + \cdots + n_{1,s} x_s \\
&\quad = m_1 (x_1 + a x_2) + b x_2 + n_{1,3} x_3 + \cdots + n_{1,s} x_s.
\end{aligned}
\]
Lemma 5.53 tells us that $(x_1 + a x_2), x_2, \ldots, x_s$ is also a minimal set of
generators for $G$. But if $b > 0$ then $b < m_1$ would contradict our statement
that $m_1$ is the smallest positive coefficient over all generators and relators. Thus
$b = 0$, so $n_{1,2} = a m_1$, and hence $m_1 \mid n_{1,2}$. In a similar way, we can show
that $m_1$ divides each of the other coefficients $n_{1,3}, \ldots, n_{1,s}$, and so in
each case we have $n_{1,j} = a_j m_1$ where $a_j \in \mathbb{Z}$ and $2 \leq j \leq s$.
We can now perform Tietze transformations to replace the generator
$x_1$ with a new generator
\[ z_1 = x_1 + a_2 x_2 + a_3 x_3 + \cdots + a_s x_s \]
and rewrite the relation
\[ r_1 = m_1 x_1 + n_{1,2} x_2 + \cdots + n_{1,s} x_s = 0 \]
as $m_1 z_1 = 0$.
Because of our choice of $m_1$, no smaller multiple of $z_1$ is zero, so $z_1$
has order $m_1$.
Now let $G_1 = \langle z_1 \rangle \cong \mathbb{Z}_{m_1}$ be the subgroup of $G$ generated by $z_1$,
and let $H$ be the subgroup of $G$ generated by the other generators
$x_2, \ldots, x_s$. We can see that $G_1 + H = G$ and that $G_1 \cap H = \{0\}$, so
Proposition 2.12 tells us that $G = G_1 \oplus H$.
We can now apply the same procedure to $H$, with two possible outcomes.
One possibility is that $H$ is a free abelian group of rank $s-1$,
and hence isomorphic to $\mathbb{Z}^{s-1}$, in which case $G \cong \mathbb{Z}_{m_1} \oplus \mathbb{Z}^{s-1}$. If so,
then we've finished: $G$ is isomorphic to a group of the required form.
The other possibility is that $H \cong \mathbb{Z}_{m_2} \oplus K$ where $K$ is generated by
$s-2$ generators, and in this case we have to do a little bit more work,
to show that $m_2$ is a multiple of $m_1$.
So, suppose that $m_2$ is the smallest positive coefficient in all of the
relators $r_2, \ldots, r_t$ in $H = \langle x_2, \ldots, x_s : r_2, \ldots, r_t \rangle$, and that (possibly
after some reordering of generators and relators) it happens to be the
coefficient of the generator $x_2$ in the relator
\[ r_2 = m_2 x_2 + n_{2,3} x_3 + \cdots + n_{2,s} x_s. \]
Since $z_1, x_2, \ldots, x_s$ is a minimal set of generators for $G$, and since
\[ m_1 z_1 + m_2 x_2 + n_{2,3} x_3 + \cdots + n_{2,s} x_s = 0 \]
we know $m_1$ has to be a factor of $m_2$ by the argument we used earlier.
At this point we have all we need to complete the proof. We know
that $m_1$ is a factor of $m_2$, and by applying the same procedure to the
group $H$ we'll eventually end up with the direct sum
\[ G \cong \mathbb{Z}_{m_1} \oplus \mathbb{Z}_{m_2} \oplus \cdots \oplus \mathbb{Z}_{m_k} \oplus \mathbb{Z}^r \]
where $m_1 \mid m_2 \mid \cdots \mid m_k$ as required.
So, we now know that any finitely-generated abelian group can be
decomposed as a direct sum in the specified way. Now we need to
show that this decomposition is unique. First we need a short lemma.
Lemma 5.54 There are $\gcd(n, q)$ elements $k \in \mathbb{Z}_n$ for which $qk = 0$.
If $G = \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_r}$ then there are $\gcd(n_1, q) \cdots \gcd(n_r, q)$
elements $k \in G$ for which $qk = 0$.

Proof Saying that $qk = 0$ in $\mathbb{Z}_n$ is equivalent to saying that $n \mid qk$. Suppose
$d = \gcd(n, q)$ and let $n = md$ and $q = pd$, so that $\gcd(m, p) = 1$. If $n \mid qk$
then $md \mid pdk$, so $m \mid pk$ and hence $m \mid k$ because $\gcd(m, p) = 1$.
Then $k$ must be one of $0, m, 2m, \ldots, (d-1)m$.
There are $d = \gcd(n, q)$ of these, which proves the first part.
For the second part, observe that an arbitrary element of the group
$G = \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_r}$ has the form $k = (k_1, \ldots, k_r)$. Then if $qk =
q(k_1, \ldots, k_r) = (qk_1, \ldots, qk_r) = (0, \ldots, 0)$, we have $\gcd(n_1, q)$ valid
choices for $k_1$, together with $\gcd(n_2, q)$ valid choices for $k_2$, all the way
up to $k_r$, for which there are $\gcd(n_r, q)$ possibilities. Hence the number
of elements $k = (k_1, \ldots, k_r)$ satisfying $qk = 0$ is equal to the product of
the greatest common divisors, namely $\gcd(n_1, q) \cdots \gcd(n_r, q)$.
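The lemma is easy to test by brute force on small groups. In the following sketch (the function names `count_killed` and `predicted` are our own) an element of $\mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_r}$ is a tuple of residues, and we simply count the tuples annihilated by $q$.

```python
from itertools import product
from math import gcd, prod

def count_killed(moduli, q):
    """Count elements k of Z_{n1} + ... + Z_{nr} with q*k = 0, by brute force."""
    return sum(
        all((q * k_i) % n_i == 0 for k_i, n_i in zip(k, moduli))
        for k in product(*(range(n) for n in moduli))
    )

def predicted(moduli, q):
    """The count given by Lemma 5.54: the product of the gcds."""
    return prod(gcd(n, q) for n in moduli)

for moduli in [(12,), (2, 6), (4, 12), (2, 2, 12)]:
    for q in range(1, 13):
        assert count_killed(moduli, q) == predicted(moduli, q)
print(count_killed((12,), 8), predicted((12,), 8))
```

This is a check on a handful of cases rather than a proof, but it is a useful guard against misreading the statement.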
Proof of Theorem 5.48 (uniqueness) We will prove uniqueness in
two parts, first by considering the infinite-order component, and then
by considering the torsion coefficients $m_1, \ldots, m_k$.
Let $G_{\mathrm{fin}} = \mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k}$ and $G_{\mathrm{inf}} = \mathbb{Z}^r$, and let
$H_{\mathrm{fin}} = \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_l}$ and $H_{\mathrm{inf}} = \mathbb{Z}^s$, so that
$G = G_{\mathrm{fin}} \oplus G_{\mathrm{inf}}$ and $H = H_{\mathrm{fin}} \oplus H_{\mathrm{inf}}$.
There is a surjective projection homomorphism $f_1 : G \to G_{\mathrm{inf}}$ which
maps $(g_1, \ldots, g_k, g_{k+1}, \ldots, g_{k+r}) \mapsto (g_{k+1}, \ldots, g_{k+r})$. The kernel $\ker(f_1)$ of
this homomorphism is obviously the finite-order component $G_{\mathrm{fin}}$, and
its image is equally obviously the infinite-order component $G_{\mathrm{inf}} = \mathbb{Z}^r$.
By the First Isomorphism Theorem (Theorem 4.40, page 119) we see that
\[ G/\ker(f_1) = G/G_{\mathrm{fin}} \cong \operatorname{im}(f_1) = G_{\mathrm{inf}} = \mathbb{Z}^r. \]
We can define a similar homomorphism $f_2 : H \to H_{\mathrm{inf}}$ which maps
$(h_1, \ldots, h_l, h_{l+1}, \ldots, h_{l+s}) \mapsto (h_{l+1}, \ldots, h_{l+s})$, and use this to show that
\[ H/\ker(f_2) = H/H_{\mathrm{fin}} \cong \operatorname{im}(f_2) = H_{\mathrm{inf}} = \mathbb{Z}^s. \]
Any isomorphism $\phi : G \to H$ must map the elements of finite order
in $G$ to the elements of finite order in $H$, by Proposition 1.21, so
$G_{\mathrm{fin}} \cong H_{\mathrm{fin}}$, and hence there must be an induced isomorphism
\[ \bar{\phi} : G/G_{\mathrm{fin}} \cong G_{\mathrm{inf}} \to H/H_{\mathrm{fin}} \cong H_{\mathrm{inf}}, \]
which means that $\mathbb{Z}^r \cong \mathbb{Z}^s$ and so $r = s$ as claimed.
All that remains is to prove uniqueness of the torsion coefficients. As
just noted, $G_{\mathrm{fin}} \cong H_{\mathrm{fin}}$. Suppose, without loss of generality, that $k \geq l$.
Then we apply Lemma 5.54 to $G_{\mathrm{fin}}$ and $H_{\mathrm{fin}}$ with $q = m_1$ to see that
\[ \gcd(m_1, m_1) \cdots \gcd(m_1, m_k) = \gcd(m_1, n_1) \cdots \gcd(m_1, n_l) \]
and since $m_1 \mid m_2 \mid \cdots \mid m_k$ we have
\[ \gcd(m_1, m_1) = \gcd(m_1, m_2) = \cdots = \gcd(m_1, m_k) = m_1 \]
and therefore
\[ m_1^k = \gcd(m_1, n_1) \cdots \gcd(m_1, n_l). \]
Each of the factors $\gcd(m_1, n_1), \ldots, \gcd(m_1, n_l)$ on the right hand side
can be at most $m_1$, so the right hand side is at most $m_1^l$, which forces $k = l$.
This in turn means that
$\gcd(m_1, n_1) = m_1$, and hence $m_1$ must be a factor of $n_1$.
Now apply Lemma 5.54 with $q = n_1$ to get
\[ \gcd(m_1, n_1) \cdots \gcd(m_k, n_1) = n_1^k \]
so $\gcd(m_1, n_1) = n_1$ and thus $n_1$ divides $m_1$. Therefore $m_1 = n_1$.
Next we can apply Lemma 5.54 again with $q = m_2$ to get
\[ m_1 m_2^{k-1} = \gcd(m_2, n_1) \cdots \gcd(m_2, n_k) \]
which implies
\[ m_2^{k-1} = \gcd(m_2, n_2) \cdots \gcd(m_2, n_k) \]
from which we see that $\gcd(m_2, n_2) = m_2$, so $m_2 \mid n_2$.
Applying Lemma 5.54 yet again with $q = n_2$ we get
\[ m_1 \gcd(m_2, n_2) \cdots \gcd(m_k, n_2) = m_1 n_2^{k-1} \]
which tells us that $n_2 \mid m_2$, hence $m_2 = n_2$.
Repeating this process another $k-2$ times shows that
\[ n_1 = m_1, \quad \ldots, \quad n_k = m_k, \]
which completes the proof.
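The counting argument in this proof can be seen at work on the five groups of order 48 from Example 5.52. The sketch below is an illustration only: `torsion_profile` is a name of our own, and the choice $q = 2, 4$ is simply one that happens to separate these particular groups. It uses the formula of Lemma 5.54 to count the elements killed by each $q$.

```python
from math import gcd, prod

def torsion_profile(invariant_factors, qs=(2, 4)):
    """Number of elements k with q*k = 0, for each q in qs (Lemma 5.54)."""
    return tuple(prod(gcd(m, q) for m in invariant_factors) for q in qs)

# Invariant factors of the five abelian groups of order 48.
groups = [(48,), (2, 24), (2, 2, 12), (4, 12), (2, 2, 2, 6)]
profiles = {g: torsion_profile(g) for g in groups}
for g, p in profiles.items():
    print(g, p)
```

Since an isomorphism preserves the number of elements killed by any given $q$, groups with different profiles cannot be isomorphic.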
This classification theorem is elegant and useful. It tells us that every
finitely generated abelian group has a unique canonical form as a
direct sum of cyclic groups. What we now want is a general procedure
for finding that decomposition from a given presentation.
The key observation is that the problem reduces to solving a finite
system of linear simultaneous equations with integer coefficients.
Basic linear algebra gives us techniques for solving systems of linear
equations with real coefficients, by transforming an augmented matrix
into reduced row echelon form by a finite sequence of elementary
row operations. We have to be a little careful working with integer
coefficients, but the procedure is similar: we convert the relations into
a suitable matrix, use elementary row and column operations to convert
that matrix into Smith normal form, and then read off the torsion coefficients.

Definition 5.55 An $m \times n$ matrix $A$ of rank $r$ is in Smith normal
form if every element is zero, except possibly for the diagonal elements
$a_{ii}$ for $1 \leq i \leq r$, and further that $a_{ii} \mid a_{i+1\,i+1}$ for $1 \leq i < r$.
That is, $A$ is of the form shown in Figure 5.1, where the last $n-r$
columns and the last $m-r$ rows are all zero.

\[
\begin{pmatrix}
a_{11} & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & \ddots & \ddots & \vdots & \vdots & & \vdots \\
\vdots & \ddots & \ddots & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & a_{rr} & 0 & \cdots & 0 \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0
\end{pmatrix}
\]
Figure 5.1: A matrix in Smith normal form

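Example 5.56 The following matrices are in Smith normal form:
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 12 \\ 0 & 0 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 7 & 0 & 0 & 0 \\ 0 & 14 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 4 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]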

When solving systems of simultaneous equations with real coefficients,
we employ elementary row operations to convert an
augmented matrix to row echelon form, and the same principle applies
here. First we need to define the appropriate elementary operations:
Definition 5.57 Let $A$ be an $m \times n$ matrix with integer entries. Then
we may apply one or more of the following elementary row and
column operations to $A$ in order to obtain an equivalent matrix.
E1 Swap two rows or two columns.
E2 Multiply all entries of a row or column by $-1$.
E3 Add an integer multiple of one row to another row, or an
integer multiple of one column to another column.

The next proposition is the key to the whole problem.


Proposition 5.58 Any $m \times n$ integer matrix $A$ can be transformed into
Smith normal form by a finite sequence of row and column operations of
type E1, E2 and E3.
Equivalently, there exist an invertible $m \times m$ integer matrix $P$ and an
invertible $n \times n$ integer matrix $Q$ such that $PAQ$ is an $m \times n$ matrix in
Smith normal form.

The following algorithm provides a constructive proof of this fact.

Algorithm 5.59 Assume that $A$ is not already in Smith normal form.
1 Use row and column operations of type E1 to arrange the matrix
so that the nonzero entries in the first row and first column are in
ascending order of absolute value, followed by any zero elements,
so that $|a_{11}| \leq |a_{12}| \leq \cdots \leq |a_{1s}|$ and $|a_{11}| \leq |a_{21}| \leq \cdots \leq |a_{t1}|$.
2 Use row and column operations of type E2 to ensure that all of
the entries $a_{11}, \ldots, a_{1s}$ in the first row are positive, and all of the
entries $a_{11}, \ldots, a_{t1}$ in the first column are also positive.
3 If $a_{11}$ divides all other nonzero entries in row 1, then go to step 4.
Otherwise, let $a_{1j}$ be the first nonzero entry in the first row which
isn't an integer multiple of $a_{11}$. Then we can find non-negative
integers $q$ and $p$ such that $a_{1j} = q a_{11} + p$ where $0 \leq p < a_{11}$.
Apply a column operation of type E3, subtracting $q$ times column 1
from column $j$.
Repeat this process for all the other columns for which the first
element is not an integer multiple of $a_{11}$.
Go to step 1.
4 If $a_{11}$ divides all of the other nonzero entries in the first column,
then go to step 5.
Otherwise, let $a_{k1}$ be the first nonzero entry in column 1 that isn't
an integer multiple of $a_{11}$. Then we can find non-negative integers
$q$ and $p$ such that $a_{k1} = q a_{11} + p$ where $0 \leq p < a_{11}$.
Apply an E3 row operation, subtracting $q$ times row 1 from row $k$.
Repeat this process for all the other rows for which the first element
is not an integer multiple of $a_{11}$.
Go to step 1.
5 Every entry in row 1 and column 1 is now a multiple of $a_{11}$.
Now apply column operations of type E3, subtracting multiples
of column 1 from each of the other columns with nonzero first
element, so that $a_{11}$ is the only nonzero element on the first row.
Similarly, apply row operations of type E3, subtracting multiples
of row 1 from all the other rows that have a nonzero first element,
so that $a_{11}$ is left as the only nonzero element in column 1.
The matrix is then in the form
\[
\begin{pmatrix}
a_{11} & 0 & \cdots & 0 \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]
and we can apply steps 1–5 to the $(m-1) \times (n-1)$ submatrix
\[
\begin{pmatrix}
a_{22} & \cdots & a_{2n} \\
\vdots & \ddots & \vdots \\
a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]
to get an $m \times n$ matrix where the only nonzero elements in the first
two rows and columns are on the diagonal. We repeat this until
we get a matrix whose only nonzero entries are on the diagonal.
These diagonal entries won't necessarily satisfy the divisibility
criterion $a_{11} \mid a_{22} \mid \cdots \mid a_{rr}$. If they don't, we proceed to the next step.
6 For $1 \leq i \leq r-1$, compare $a_{ii}$ with each $a_{jj}$ for $i < j \leq r$. Let $i$ and
$j$ be the lowest integers for which $a_{ii}$ doesn't divide $a_{jj}$. Use a row
operation of type E3 to add row $j$ to row $i$, and then reduce this
new $m \times n$ matrix using steps 1–5.

This algorithm will eventually terminate, yielding a matrix in Smith
normal form. (A slightly more efficient step 6 is described in a paper by
V J Rayward-Smith, On computing the Smith normal form of an integer matrix,
ACM Transactions on Mathematical Software 5.4 (1979) 451–456, but this
will suffice for our purposes.)

Henry John Stephen Smith (1826–1883) was born in Dublin in late 1826,
the fourth child of a barrister named John Smith. After his father's death
in 1829 the family moved to England, where he was educated by private
tutors until the age of fifteen. In 1841 he was admitted to Rugby School in
Warwickshire, a prestigious English public (which is to say, private) school
founded in 1567. At the time, the school's headmaster was the renowned
theologian and educator Dr Thomas Arnold (1795–1842). A fictionalised
portrait of Dr Arnold and the school can be found in the classic 1857 novel
Tom Brown's School Days by Smith's fellow pupil Thomas Hughes (1822–1895).
In 1845, Smith won a scholarship to Balliol College, Oxford, where he studied
mathematics and classics, graduating with high honours in 1849 and being
promoted to Fellow shortly afterwards.
In 1861 he was appointed Savilian Professor of Geometry and elected a Fellow
of both the Royal Society of London and the Royal Astronomical Society.
Until his death in 1883, he taught and pursued research in mathematics,
making many important contributions to geometry and number theory.
Smith was well-respected by his colleagues. John Conington (1825–1869),
Professor of Latin, once remarked: "I do not know what Henry Smith may
be at the subjects of which he professes to know something; but I never go to
him about a matter of scholarship, in a line where he professes to know
nothing, without learning more from him than I can get from any one else."

Let's try a couple of examples.

Example 5.60 Let $\mathbb{Z}_2 \oplus \mathbb{Z}_3$ be the abelian group with generators $x$
and $y$, and relations
\[ 2x = 0, \qquad 3y = 0. \]
This yields the coefficient matrix
 
2 0
A= .
0 3
Applying Algorithm 5.59, we see A is already diagonal, so we skip
to step 6. Now a11 = 2, which doesnt divide a22 = 3, so we perform
an operation of type E3, adding row 2 to row 1, to get the matrix
 
2 3
.
0 3

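Starting again from step 1, the matrix evolves as follows:
\[
\begin{pmatrix} 2 & 3 \\ 0 & 3 \end{pmatrix} \mapsto
\begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & 0 \\ 3 & -6 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & 0 \\ 0 & -6 \end{pmatrix}.
\]
We can perform a final E2 move to get
\[ \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix}, \]
and then read off the new relations
\[ x = 0, \qquad 6y = 0, \]
which yields the group $\mathbb{Z}_6$, as expected from Proposition 1.32.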

Now let's try a more complicated example.


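Example 5.61 Let $G$ be the abelian group with four generators $w$, $x$, $y$ and $z$, and three relations
\[
\begin{aligned}
16w + 56x + 4y + 48z &= 0,\\
4w + 16x + 4y - 8z &= 0,\\
10w + 22x - 2y + 70z &= 0.
\end{aligned}
\]
Under the application of Algorithm 5.59, the coefficient matrix evolves as follows:
\[
\begin{aligned}
&\begin{pmatrix} 16 & 56 & 4 & 48 \\ 4 & 16 & 4 & -8 \\ 10 & 22 & -2 & 70 \end{pmatrix}
\overset{\mathrm{E1}}{\longmapsto}
\begin{pmatrix} -2 & 10 & 22 & 70 \\ 4 & 4 & 16 & -8 \\ 4 & 16 & 56 & 48 \end{pmatrix}
\overset{\mathrm{E2}}{\longmapsto}
\begin{pmatrix} 2 & -10 & -22 & -70 \\ 4 & 4 & 16 & -8 \\ 4 & 16 & 56 & 48 \end{pmatrix} \\
\overset{\mathrm{E3}}{\longmapsto}{}
&\begin{pmatrix} 2 & 0 & 0 & 0 \\ 4 & 24 & 60 & 132 \\ 4 & 36 & 100 & 188 \end{pmatrix}
\overset{\mathrm{E3}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 24 & 60 & 132 \\ 0 & 36 & 100 & 188 \end{pmatrix}
\overset{\mathrm{E3}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 24 & 12 & 12 \\ 0 & 36 & 28 & 8 \end{pmatrix} \\
\overset{\mathrm{E1}}{\longmapsto}{}
&\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 8 & 28 & 36 \\ 0 & 12 & 12 & 24 \end{pmatrix}
\overset{\mathrm{E3}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 8 & 4 & 4 \\ 0 & 12 & -24 & -24 \end{pmatrix}
\overset{\mathrm{E1}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 4 & 8 \\ 0 & -24 & -24 & 12 \end{pmatrix} \\
\overset{\mathrm{E2}}{\longmapsto}{}
&\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 4 & 8 \\ 0 & 24 & 24 & -12 \end{pmatrix}
\overset{\mathrm{E3}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 24 & 0 & -60 \end{pmatrix}
\overset{\mathrm{E1}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 24 & -60 & 0 \end{pmatrix} \\
\overset{\mathrm{E3}}{\longmapsto}{}
&\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & -60 & 0 \end{pmatrix}
\overset{\mathrm{E2}}{\longmapsto}
\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 60 & 0 \end{pmatrix}
\end{aligned}
\]
The corresponding relations are therefore
\[ 2w = 0, \qquad 4x = 0, \qquad 60y = 0. \]
The fourth generator $z$ is free, since it has no associated relation, and hence $G \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_{60} \oplus \mathbb{Z}$.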
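The two worked examples can be checked mechanically. The following is a minimal Python sketch of a Smith normal form reduction in the spirit of Algorithm 5.59; it is not a transcription of that algorithm, and the function name `smith_normal_form` and its pivoting strategy (always choose the nonzero entry of least absolute value) are assumptions made for illustration. It returns only the nonzero diagonal entries, made positive.

```python
def smith_normal_form(rows):
    """Return the nonzero diagonal entries d1 | d2 | ... of the Smith normal form."""
    a = [list(r) for r in rows]
    m, n = len(a), len(a[0])
    diag, t = [], 0
    while t < min(m, n):
        # Locate a nonzero entry of least absolute value in the remaining submatrix.
        entries = [(abs(a[i][j]), i, j)
                   for i in range(t, m) for j in range(t, n) if a[i][j]]
        if not entries:
            break
        _, i, j = min(entries)
        a[t], a[i] = a[i], a[t]                    # row swap
        for row in a:                              # column swap
            row[t], row[j] = row[j], row[t]
        pivot = a[t][t]
        for i in range(t + 1, m):                  # row operations clear column t
            q = a[i][t] // pivot
            a[i] = [u - q * v for u, v in zip(a[i], a[t])]
        for j in range(t + 1, n):                  # column operations clear row t
            q = a[t][j] // pivot
            for row in a:
                row[j] -= q * row[t]
        if any(a[i][t] for i in range(t + 1, m)) or any(a[t][j] for j in range(t + 1, n)):
            continue                               # a smaller remainder appeared: start again
        bad = [i for i in range(t + 1, m) for j in range(t + 1, n) if a[i][j] % pivot]
        if bad:
            # Some later entry is not a multiple of the pivot: add its row to row t.
            a[t] = [u + v for u, v in zip(a[t], a[bad[0]])]
            continue
        diag.append(abs(pivot))
        t += 1
    return diag

print(smith_normal_form([[2, 0], [0, 3]]))
print(smith_normal_form([[16, 56, 4, 48], [4, 16, 4, -8], [10, 22, -2, 70]]))
```

The number of columns minus the number of entries returned gives the number of free generators, which is one in Example 5.61.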
There is another classification theorem for finitely-generated abelian groups, in which the torsion subgroups are grouped in a different way. We'll state the theorem, but leave the proof as an exercise.
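Definition 5.62 Suppose that
\[ G \cong \mathbb{Z}_{p_1^{n_1}} \oplus \cdots \oplus \mathbb{Z}_{p_k^{n_k}} \oplus \mathbb{Z}^r \]
where $p_1, \ldots, p_k \in \mathbb{N}$ are (not necessarily distinct) prime integers, and $n_1, \ldots, n_k, r \in \mathbb{Z}$. This is called a primary decomposition of $G$ with torsion coefficients $p_1^{n_1}, \ldots, p_k^{n_k}$ and rank $r$.

Theorem 5.63 A finitely-generated abelian group $G$ has a unique primary decomposition
\[ G \cong \mathbb{Z}_{p_1^{n_1}} \oplus \cdots \oplus \mathbb{Z}_{p_k^{n_k}} \oplus \mathbb{Z}^r \]
in the sense that if $H$ is another finitely-generated abelian group with primary decomposition
\[ H \cong \mathbb{Z}_{q_1^{m_1}} \oplus \cdots \oplus \mathbb{Z}_{q_l^{m_l}} \oplus \mathbb{Z}^s \]
then $G \cong H$ if and only if $k = l$, $r = s$ and the torsion coefficients $p_1^{n_1}, \ldots, p_k^{n_k}$ are equal to the torsion coefficients $q_1^{m_1}, \ldots, q_l^{m_l}$, up to possible rearrangement.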
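To see how the two classifications relate, each cyclic factor $\mathbb{Z}_d$ of a Smith normal form can be split into prime-power factors using the fact that $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \oplus \mathbb{Z}_n$ when $m$ and $n$ are coprime. The short Python sketch below does this by trial division; the function names are illustrative only.

```python
def prime_power_factors(n):
    """Split n > 1 into its prime-power factors, e.g. 60 -> [4, 3, 5]."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            out.append(q)
        p += 1
    if n > 1:
        out.append(n)
    return out

def torsion_coefficients(invariant_factors):
    """Torsion coefficients of Z_d1 + Z_d2 + ... as a sorted list of prime powers."""
    return sorted(q for d in invariant_factors if d > 1
                  for q in prime_power_factors(d))

print(torsion_coefficients([2, 4, 60]))   # the torsion part of Example 5.61
```

For Example 5.61 this suggests the primary decomposition $\mathbb{Z}_2 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_5 \oplus \mathbb{Z}$.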

Unfortunately there isn't a neat classification of finitely-generated nonabelian groups, although there are many partial results and we will see a few of them later, including the Jordan–Hölder Theorem on composition series, which will be covered in Chapter 7.

5.A Coset enumeration

Count what is countable, measure what is measurable, and what is not measurable, make measurable.
attributed to Galileo Galilei (1564–1642)

Presentations provide a neat and compact way of describing many classes of groups, both finite and infinite, but there are technical difficulties that sometimes limit their usefulness. We can't always deduce everything we might want to know about a given group $G$: indeed it's not always possible to calculate the order of $G$, or even say whether it's finite or infinite. In many cases we can obtain an upper bound on $|G|$ by using the technique illustrated in Proposition 5.32.
Example 5.64 Let $G = \langle x, y : x^2, y^5, (xy)^2 \rangle$. Then the relations $x^2 = e$ and $y^5 = e$ tell us that $x^{-1} = x$ and $y^{-1} = y^4$, so we need only consider non-negative powers of $x$ and $y$.
The relation $(xy)^2 = e$ means that $xy = (xy)^{-1} = y^{-1}x^{-1} = y^4x$. So we can rewrite any string of $x$s and $y$s so as to group the $x$ and $y$ terms together. Hence any element of $G$ is of the form $x^m y^n$ where $0 \le m \le 1$ and $0 \le n \le 4$. There are ten possibilities, so $|G| \le 10$.
(As it happens, we can see from examining the presentation that $G \cong D_5$ so we know that actually $|G| = 10$.)
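The upper bound can be complemented by a quick computational check. The sketch below, in Python, realises $x$ and $y$ as permutations of the vertices of a regular pentagon (this particular choice of permutations is an assumption made for illustration), confirms that they satisfy the three relators, and counts the distinct products $x^m y^n$. Since these permutations satisfy the relations, the group they generate is a quotient of $G$, so ten distinct elements here mean $|G| \ge 10$.

```python
def compose(p, q):
    """Return the permutation 'p then q' on {0, ..., 4}."""
    return tuple(q[i] for i in p)

identity = tuple(range(5))
x = tuple((-i) % 5 for i in range(5))       # a reflection of the pentagon
y = tuple((i + 1) % 5 for i in range(5))    # a rotation through one fifth of a turn

def power(p, n):
    out = identity
    for _ in range(n):
        out = compose(out, p)
    return out

# The three relators x^2, y^5 and (xy)^2 all evaluate to the identity.
relators_hold = (power(x, 2) == identity
                 and power(y, 5) == identity
                 and power(compose(x, y), 2) == identity)

# Collect the products x^m y^n with 0 <= m <= 1 and 0 <= n <= 4.
normal_forms = {compose(power(x, m), power(y, n))
                for m in range(2) for n in range(5)}
print(relators_hold, len(normal_forms))
```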

There are more sophisticated methods for deriving structural information from group presentations, and next we'll look at one of them, the Todd–Coxeter Algorithm. This algorithm was discovered by J A Todd (1908–1994) and H S M Coxeter (1907–2003) in 1936, and in some cases can provide structural information about a finitely presented group.
We won't formally state the algorithm; instead we'll illustrate it with worked examples. We start with a simple case: the cyclic group $\mathbb{Z}_3$.

(Margin note. Photograph: Oberwolfach Photo Collection / Konrad Jacobs.) The British mathematician Harold Scott MacDonald Coxeter (1907–2003) was born in London and studied at Cambridge, where he was awarded a PhD in 1931 for a thesis entitled Some Contributions to the Theory of Regular Polytopes. In 1936 he took up a post at the University of Toronto, where he remained for the rest of his long career. Known as Donald to friends, family and colleagues, he developed a keen interest in both music and mathematics and by the age of twelve he was an accomplished pianist and amateur composer. His main contributions were in the field of geometry and algebra. In particular, since his schooldays he had been interested in polyhedra and their higher-dimensional analogues, polytopes, subjects which he often discussed with Alicia Boole Stott (1860–1940), a family friend and the daughter of the mathematicians George Boole (1815–1864) and Mary Everest Boole (1832–1916). Discussions with the Dutch artist Maurits Cornelis Escher (1898–1972) inspired some of the latter's work, in particular the Circle Limit series of woodcuts, based on tessellations of the hyperbolic plane. He was elected to the Royal Society of Canada in 1950, the Royal Society of London in 1950, the American Academy of Arts and Sciences in 1990, and appointed a Companion of the Order of Canada in 1997. He attributed his long and active life to his vegetarian diet, and a regime of regular exercise which he followed into his nineties.

Example 5.65 Let $G = \langle x : x^3 \rangle \cong \mathbb{Z}_3$. We expand the relator $x^3$ as a string of generators and inverses, obtaining $xxx$. We now construct a relator table with this string as the header, and with a 1 at the beginning and end of the first row:

      x   x   x
    1           1

Next pick an empty space next to a 1, write a 2 in it, draw a minus sign (denoting that this is a definition) across the dividing line between the 1 and the 2 and add another row with a 2 at each end.

      x   x   x
    1 − 2       1
    2           2

At this point we also start compiling a coset table:

    definition    deduction
    1x = 2

The expression $1x = 2$ describes how the symbols 1 and 2 are related: it says that 2 (which represents $x$) equals 1 (which represents the identity $e$) multiplied on the right by $x$.
Returning to the relator table, we write a 2 in any empty cell to the right of a 1, and a 1 in any empty cell to the left of a 2 where the dividing line is labeled $x$. We also (although this is not applicable here) write a 2 in any empty cell to the left of a 1, and a 1 in any empty cell to the right of a 2 where the dividing line is labeled $x^{-1}$.

      x   x   x
    1 − 2       1
    2       1   2

    definition    deduction
    1x = 2
Now we write the symbol 3 in an available empty space in the relator table, add another row with a 3 in the first and last place, write the definition $2x = 3$ in the left-hand column of the coset table, and then insert 3s in appropriate places in the relator table:

      x   x   x
    1 − 2 − 3   1
    2   3   1   2
    3           3

    definition    deduction
    1x = 2
    2x = 3

At this point we see that the first row is full. In particular, this yields the expression $3x = 1$ for free: we write this in the deduction column of the coset table, and mark it in the relator table by an equals sign = across the appropriate dividing line:

      x   x   x
    1 − 2 − 3 = 1
    2   3   1   2
    3           3

    definition    deduction
    1x = 2
    2x = 3        3x = 1

Now we go through the relator table again, filling in any empty spaces with appropriate symbols. If we get any extra information then we mark this with an equals sign = across the vertical line and list the new expression in the deduction column of the coset table. If a line completes without yielding any new information then we mark this with an equivalence sign ≡.

      x   x   x
    1 − 2 − 3 = 1
    2   3   1   2
    3   1   2   3

    definition    deduction
    1x = 2
    2x = 3        3x = 1

We can derive some useful information from the tables produced by this process (which is called scanning).
The number of rows in the completed relator table gives the order of the group $G$: in this case there are three rows, so $|G| = 3$ as expected.
We can also use the first two columns of the relator table to obtain a permutation in $S_3$ representing the generator $x$. In this case, we see that $x \mapsto (1\ 2\ 3) \in S_3$.

The data obtained by this algorithm also enables us to construct a diagram showing how the group generators interact with each other.
Definition 5.66 A Cayley graph or Cayley diagram for a group $G$ is a directed graph which has one vertex for each element of the group. Whenever $h = g * x$ for two elements $g, h \in G$ and some generator $x$, we draw a directed edge from the vertex $g$ to the vertex $h$ and label it with the generator $x$.
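As a small illustration of this definition, the Python sketch below lists the labelled edges of a Cayley graph from the right-multiplication action of each generator on the numbered vertices. The data used is the coset table for $\mathbb{Z}_3$ from Example 5.65 ($1x = 2$, $2x = 3$, $3x = 1$); the function names `cayley_edges` and `follow` are hypothetical.

```python
def cayley_edges(action):
    """List the labelled directed edges (source, target, generator)."""
    return sorted((g, h, name)
                  for name, perm in action.items()
                  for g, h in perm.items())

def follow(action, start, word):
    """Walk along the edges labelled by the letters of word, starting at start."""
    vertex = start
    for name in word:
        vertex = action[name][vertex]
    return vertex

# Right multiplication by x in Z3, read off the coset table of Example 5.65.
z3 = {"x": {1: 2, 2: 3, 3: 1}}

print(cayley_edges(z3))
print(follow(z3, 1, "xxx"))   # the relator xxx traces a closed loop based at vertex 1
```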

In particular, each relator can be interpreted as a closed loop in the Cayley graph, based at the vertex representing the identity element.
The relator tables obtained via the Todd–Coxeter algorithm enable us to draw a Cayley graph in a straightforward way. Each numeric symbol in the tables yields a vertex, and by reading along the rows of the relator tables we can see where to draw the directed edges.
For the group $G = \langle x : x^3 \rangle \cong \mathbb{Z}_3$ studied in Example 5.65 we get the diagram shown in Figure 5.2.

[Figure 5.2: Cayley graph for the group $\langle x : x^3 \rangle \cong \mathbb{Z}_3$, constructed from the relator tables produced by the Todd–Coxeter algorithm. Here the node 1 corresponds to the identity element $e$, the node 2 corresponds to the element $x$, and the node 3 corresponds to the element $x^2$.]

This procedure extends to presentations with more than one relator: we have a relator table for each relator, together with a single coset table, and must keep all tables up to date at each step. The following example illustrates this.
Example 5.67 Let $G = \langle x, y : x^2, y^2, xyx^{-1}y^{-1} \rangle$. This is isomorphic to the Klein group $V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.

      x   x          y   y          x   y  x⁻¹ y⁻¹
    1       1      1       1      1               1

We begin with the first relator, writing a 2 in the available space in the relator table, and adding the definition $1x = 2$ to the first column in the coset table. This yields the deduction $2x = 1$. We then add a new row to the other relator tables and fill in 1s and 2s as appropriate.

      x   x          y   y          x   y  x⁻¹ y⁻¹
    1   2 = 1      1       1      1   2           1
    2   1   2      2       2      2               2

    definition    deduction
    1x = 2        2x = 1

Next, add a 3 to the first empty space in the second table, add a new definition $1y = 3$, and fill in the relator tables as far as possible.

      x   x          y   y          x   y  x⁻¹ y⁻¹
    1   2 = 1      1   3 = 1      1   2       3   1
    2   1   2      2       2      2               2
    3       3      3   1   3      3           1   3

    definition    deduction
    1x = 2        2x = 1
    1y = 3        3y = 1

Now we write a 4 in the empty cell in the second row of the second table, and append the definition $2y = 4$ to the coset table. After filling in the relator tables as far as possible, we have the following:

      x   x          y   y          x   y  x⁻¹ y⁻¹
    1   2 = 1      1   3 = 1      1   2   4 = 3   1
    2   1   2      2   4 = 2      2   1   3   4   2
    3   4   3      3   1   3      3   4   2   1   3
    4   3   4      4   2   4      4   3   1   2   4

    definition    deduction
    1x = 2        2x = 1
    1y = 3        3y = 1
    2y = 4        4y = 2
                  3x = 4
                  4x = 3

The relator tables are now complete, and $|G| = 4$ as expected. We also get permutation representatives in $S_4$ for the generators of $G$:
\[ x \mapsto (1\ 2)(3\ 4) \quad\text{and}\quad y \mapsto (1\ 3)(2\ 4). \]
Figure 5.3 shows the Cayley graph obtained from these relator tables.

[Figure 5.3: The Cayley graph for the group $\langle x, y : x^2, y^2, xyx^{-1}y^{-1} \rangle \cong V_4$.]

Sometimes when applying this process, we obtain a new deduction that is inconsistent with what we've already got. That is, we might obtain two expressions of the form $ix = j$ and $ix = k$ where $j \neq k$: this is called a coincidence. At first sight, this may seem disastrous, and we might be tempted to give up on the whole thing as a bad idea. But we can salvage the situation by going through all of the tables replacing the larger of $j$ and $k$ with the smaller one. That is, if $j < k$, we rewrite all the tables, replacing any occurrences of $k$ with $j$, and deleting any rows corresponding to $k$. If we like, we can renumber everything so that we have a consecutive sequence of integers with no gaps in it. This phenomenon is called coset collapse, and after it has happened, we can restart the scanning process from where we left off.

Example 5.68 Let $G = \langle x, y : x^2y, x^3y^2 \rangle$. After a few steps of the Todd–Coxeter algorithm, we get the following situation:

      x   x   y          x   x   x   y   y
    1   2   3 = 1      1   2   3   4 = 3   1
    2   3   4 = 2      2   3   4           2
    3   4       3      3   4           4   3
    4           4      4                   4

    definition    deduction
    1x = 2
    2x = 3        3y = 1
    3x = 4        4y = 3
                  4y = 2

At this point we notice that we have a coincidence: $4y = 3$ and $4y = 2$ in the deduction column of the coset table. So before going any further we rewrite the relator tables, replacing all occurrences of 3 with 2, deleting the third row, and then rewriting all occurrences of 4 as 3. This yields the following situation:

      x   x   y          x   x   x   y   y
    1   2   2 = 1      1   2   2   3 = 2   1
    2   2   3 = 2      2   2   3           2
    3           3      3                   3

    definition    deduction
    1x = 2
    2x = 2        2y = 1
    2x = 3        3y = 2

Again, we find a coincidence: $2x = 2$ and $2x = 3$ in the definition column of the coset table. As before, we replace all 3s with 2s and delete the third row of each relator table:

      x   x   y          x   x   x   y   y
    1   2   2 = 1      1   2   2   2 = 2   1
    2   2   2 = 2      2   2   2           2

    definition    deduction
    1x = 2
    2x = 2        2y = 1
                  2y = 2

This yields another coincidence: $2y = 1$ and $2y = 2$, so we replace all 2s with 1s and delete the second row of each relator table:

      x   x   y          x   x   x   y   y
    1   1   1 = 1      1   1   1   1 = 1   1

    definition    deduction
    1x = 1        1y = 1

This leaves us with just a single row in both relator tables, from which we conclude that $|G| = 1$. Thus $G$ must be the trivial group.

With a little bit of work, we can amend the basic Todd–Coxeter algorithm to work relative to some finitely-generated subgroup $H$ of the group $G$ we're interested in. In this case, instead of obtaining the order $|G|$ of the main group, we get the index $|G : H|$ of $H$ in $G$. And rather than a permutation representation for $G$, we get a representation of the way $G$ permutes the (right) cosets of $H$.
The procedure is very similar to that in Examples 5.65, 5.67 and 5.68, but we also have relator tables for the words which generate $H$. These subgroup tables are similar to relator tables in the basic version of the algorithm, but have a single row beginning and ending with 1.
Example 5.69 Let $G = \langle x : x^4 \rangle$ and let $H = \langle x^2 \rangle$. We can see immediately that $G \cong \mathbb{Z}_4$ and that $H$ is an index-2 subgroup which is isomorphic to $\mathbb{Z}_2$. We write down the relator table, the subgroup table and the coset table:

      x   x   x   x          x   x
    1               1      1       1

    definition    deduction

We make the obvious first move, writing a 2 in all the appropriate empty cells in the relator table and the subgroup table, and adding the definition $1x = 2$ to the coset table. We add a second row to the relator table, but not to the subgroup table. From the subgroup table we deduce that $2x = 1$, and use this to complete all the tables:

      x   x   x   x          x   x
    1   2   1   2   1      1   2 = 1
    2   1   2   1   2

    definition    deduction
    1x = 2        2x = 1

The relator table has two rows, which tells us that $|G : H| = 2$.
Reading down the first column of the relator table gives us a representation of the generator $x$ as a transposition $(1\ 2)$. With the basic version of the algorithm, this would give us a permutation representative of the generator $x$ of $G$. This relative version of the Todd–Coxeter algorithm, however, instead tells us how $G$ permutes the (right) cosets of $H$.

In the next example we apply the Todd–Coxeter algorithm to the dihedral group $D_3$ relative to the subgroup $H$ generated by one reflection.
Example 5.70 Let $G = \langle x, y : x^3, y^2, (xy)^2 \rangle \cong D_3$ and let $H = \langle y \rangle$. Then after applying the Todd–Coxeter algorithm we obtain the following completed relator tables, subgroup table and coset table:

      x   x   x          y   y          x   y   x   y          y
    1   2   3 = 1      1   1   1      1   2   3   1   1      1 = 1
    2   3 = 1   2      2   3 = 2      2   3   2   3   2
    3   1   2   3      3   2 = 3      3   1   1   2 = 3

    definition    deduction
                  1y = 1
    1x = 2
    2x = 3        3x = 1
                  2y = 3
                  3y = 2

This tells us that $|G : H| = 3$, and also that $x$ permutes the right cosets of $H$ by means of a 3-cycle $(1\ 2\ 3)$, which maps $H$ to $Hx$, maps $Hx$ to $Hx^2$ and maps $Hx^2$ back to $H$. Similarly, $y$ permutes the right cosets of $H$ by means of a transposition $(2\ 3)$: this leaves $H$ alone, and transposes $Hx$ and $Hx^2$.

Having looked at some illustrative examples, it's time we considered what the various tables and numbers in the algorithm actually mean.
In general, we start with three finite sets: a set $X$ of generators, a set $R$ of relators (such that $G = \langle X : R \rangle$) and a set $Y$ of words which generate the subgroup $H \le G$. As the algorithm progresses, we gradually fill up the cells in the relator and subgroup tables with positive integers; these integers correspond to the right cosets of $H$ in $G$, with 1 representing $H$ itself. This is why the number of integers we end up using gives us the index $|G : H|$ of $H$ in $G$.
This process is often called coset enumeration, for the obvious reason that it lists the cosets of a given subgroup. If the index $|G : H|$ (or, with the simple version of the algorithm, the order $|G|$) is infinite, then the algorithm will not terminate but will just carry on indefinitely. Since we only have a finite amount of time to devote to any given coset enumeration problem, we can't tell whether $|G : H|$ is actually infinite, or just finite but sufficiently large that the algorithm hasn't terminated yet.
In the simple version of the algorithm, the set $Y$ is empty, so the subgroup $H$ is trivial, and the positive integers in the relator tables represent the cosets of this trivial subgroup; that is, the individual elements of $G$. This, similarly, is why the number of integers we use gives us the order $|G|$ of $G$.
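The bookkeeping in the worked examples can be automated. The following Python sketch is one possible coset enumeration routine, not the tabular procedure exactly as laid out above: it stores only the coset table, scans each relator from both ends, defines new cosets to fill gaps, and handles coincidences with a union-find structure that always keeps the smaller number. The names `encode` and `coset_enumeration`, the symbol encoding and the `limit` safeguard are all assumptions made for illustration, and cosets are numbered from 0 rather than 1.

```python
def encode(word, gens):
    """Turn a string such as 'xyXY' (uppercase = inverse) into a list of symbols."""
    return [2 * gens.index(ch.lower()) + ch.isupper() for ch in word]

def coset_enumeration(ngens, relators, subgens=(), limit=10000):
    """Return (index, perms); perms[k][c] is the coset reached from c by generator k.

    Symbol 2k is generator k and symbol 2k + 1 is its inverse, so s ^ 1 inverts s.
    """
    width = 2 * ngens
    table = [[None] * width]      # row 0 is the coset H itself
    parent = [0]                  # union-find forest recording collapsed cosets

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    def define(c, s):             # a new definition: coset d = c * s
        if len(table) >= limit:
            raise RuntimeError("coset limit reached; the index may be infinite")
        d = len(table)
        table.append([None] * width)
        parent.append(d)
        table[c][s], table[d][s ^ 1] = d, c
        return d

    def collapse(a, b):           # coincidence: cosets a and b are equal
        queue = [(a, b)]
        while queue:
            a, b = (find(c) for c in queue.pop())
            if a == b:
                continue
            a, b = min(a, b), max(a, b)
            parent[b] = a
            for s in range(width):
                t = table[b][s]
                if t is None:
                    continue
                if table[a][s] is None:
                    table[a][s] = t
                else:
                    queue.append((table[a][s], t))

    def scan(c, word):            # one row of a relator (or subgroup) table
        f, i, b, j = c, 0, c, len(word)
        while True:
            while i < j and table[f][word[i]] is not None:
                f, i = find(table[f][word[i]]), i + 1
            while j > i and table[b][word[j - 1] ^ 1] is not None:
                b, j = find(table[b][word[j - 1] ^ 1]), j - 1
            if i == j:
                if f != b:
                    collapse(f, b)
                return
            if j == i + 1:        # the row closes up: a deduction
                table[f][word[i]], table[b][word[i] ^ 1] = b, f
                return
            f, i = define(f, word[i]), i + 1

    for w in subgens:
        scan(0, w)
    c = 0
    while c < len(table):
        for w in relators:
            if find(c) != c:
                break
            scan(c, w)
        if find(c) == c:
            for s in range(width):
                if table[c][s] is None:
                    define(c, s)
        c += 1
    live = [c for c in range(len(table)) if find(c) == c]
    number = {c: n for n, c in enumerate(live)}
    perms = [[number[find(table[c][2 * k])] for c in live] for k in range(ngens)]
    return len(live), perms

# Example 5.70: D3 relative to the subgroup generated by y.
d3 = [encode(w, "xy") for w in ("xxx", "yy", "xyxy")]
print(coset_enumeration(2, d3, [encode("y", "xy")]))
```

If the index is infinite the enumeration would never finish, which is why this sketch gives up once `limit` cosets have been defined.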
We can get even more information from the Todd–Coxeter coset enumeration process. To explore this, we need the following definition.
Definition 5.71 Let $G$ be a group and $H$ be a subgroup of $G$. Then a right transversal of $H$ in $G$ is a set of elements of $G$, one chosen from each right coset of $H$.
A left transversal is defined similarly as a choice of elements from the left cosets of $H$ in $G$.
To illustrate this, let's look at an example.
Example 5.72 Let $G = D_3$ and $H = \{e, m_1\}$. Then
\[ H = Hm_1 = \{e, m_1\}, \qquad Hr = Hm_2 = \{r, m_2\} \qquad\text{and}\qquad Hr^2 = Hm_3 = \{r^2, m_3\} \]
(see Example 2.18) so $\{e, m_2, r^2\}$ is a right transversal of $H$ in $G$.

We know from Proposition 2.29 that the cosets of $H$ in $G$ are all the same size, so one way of picturing a transversal is to take all of the cosets, line them up next to each other, and slice through all of them in such a way that we hit a single element in each one.
The Todd–Coxeter algorithm, in addition to counting the number of cosets of $H$ in $G$, also yields a right transversal for $H$ in $G$.
In Example 5.65 we applied the Todd–Coxeter algorithm to the group $\langle x : x^3 \rangle \cong \mathbb{Z}_3$ and found, amongst other things, that $|G| = 3$ as expected. More generally, what we were doing was making a list of the cosets of the trivial subgroup of $G$, which coincide with the elements of $G$ itself. There is only one possible transversal in this case: $\{e, x, x^2\}$. We get this transversal recursively from the definition column of the coset table, which in this case consists of $1x = 2$ and $2x = 3$. The first element of the transversal is the identity $e$, and in general the $j$th element is given recursively by applying the definitions in the table. So the second element of the transversal is given by the definition $2 = 1x$, which gives $ex = x$, and the third element is given by the definition $3 = 2x$, from which we obtain $xx = x^2$.
In Example 5.67 we constructed a coset table whose definition column consisted of the equations $1x = 2$, $1y = 3$ and $2y = 4$. Following the same procedure, the first element of the transversal is $e$, the second (given by $2 = 1x$) is $x$, the third (given by $3 = 1y$) is $y$, and the fourth (given by $4 = 2y$) is $xy$. So our transversal is $\{e, x, y, xy\}$ as expected.
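The recursive reading of the definition column is easy to mechanise. In the Python sketch below, each definition $ig = j$ is stored as a triple `(i, g, j)`, listed in the order in which it was made; the function name is illustrative, and the empty string stands for the identity $e$.

```python
def transversal_from_definitions(definitions):
    """Build one word per coset number from definitions (i, generator, j)."""
    words = {1: ""}                      # coset 1 is represented by the identity
    for i, gen, j in definitions:
        words[j] = words[i] + gen        # the definition j = i * gen
    return words

# Definition columns from Examples 5.65 and 5.67.
print(transversal_from_definitions([(1, "x", 2), (2, "x", 3)]))
print(transversal_from_definitions([(1, "x", 2), (1, "y", 3), (2, "y", 4)]))
```

Because every new word is an earlier word with one generator appended, each transversal produced this way is automatically closed under taking prefixes.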
More interesting is the relative case where we apply the algorithm relative to a nontrivial subgroup. In Example 5.69 we obtained a single definition $1x = 2$ which we can use to construct the transversal $\{e, x\}$. The first of these is an element of the coset $He = H = \{e, x^2\}$ and the second is an element of the other coset $Hx = \{x, x^3\}$.
The coset table in Example 5.70 contains two definitions, $1x = 2$ and $2x = 3$, which yield the transversal $\{e, x, x^2\}$. Using the isomorphism from the discussion following Proposition 5.28 we can rewrite this as $\{e, r, r^2\}$, and see that it is different from the one given in Example 5.72.

5.B Transversals

The transversals obtained via the Todd–Coxeter algorithm all satisfy an important condition that we're going to rely on during part of our proof of the Nielsen–Schreier Theorem.19
19 Theorem 5.11, page 145.
The recursive way the transversal elements were defined ensures that for any element $x_1 x_2 \ldots x_n$ in some transversal $U$, the elements $x_1$, $x_1 x_2$, all the way up to $x_1 x_2 \ldots x_{n-1}$, are also contained in $U$.
Definition 5.73 Let $G$ be a group (not necessarily free) generated by some set $X$, let $H$ be a subgroup of $G$, and let $U$ be a right transversal of $H$ in $G$. Then $U$ is a Schreier transversal, or has the Schreier property, if it is prefix closed. That is, for any word $x_1 x_2 \ldots x_n$ in $U$, the words $x_1$, $x_1 x_2$, all the way up to $x_1 x_2 \ldots x_{n-1}$, are also in $U$.
In other words, if we delete some substring from the right end of a word in $U$, what we're left with is also in $U$.
It's not obvious that any such subgroup has a Schreier transversal, but we'll see in a little while that this actually is the case. First we need to introduce the concept of an ordering on a set:
Definition 5.74 Let $\le$ be a relation defined on a set $S$ such that:
P1 $a \le a$ (reflexivity),
P2 if $a \le b$ and $b \le a$ then $a = b$ (antisymmetry), and
P3 if $a \le b$ and $b \le c$ then $a \le c$ (transitivity)
for all $a, b, c \in S$. Then $\le$ is a partial order on $S$.
We say that $\le$ is a total order if it also satisfies the following trichotomy law or comparability criterion:
P4 Either $a \le b$ or $b \le a$ for all $a, b \in S$.
For any such partial order $\le$, there is a corresponding strict version $<$, satisfying the following conditions for all $a, b, c \in S$:
S1 $a \not< a$ (irreflexivity),
S2 if $a < b$ then $b \not< a$ (asymmetry), and
S3 if $a < b$ and $b < c$ then $a < c$ (transitivity).
We say $<$ is a strict total order if it also satisfies the trichotomy law:
S4 Exactly one of $a < b$, $a = b$ or $b < a$ holds for all $a, b \in S$.

Perhaps the best-known examples of orders are the usual ones $<$ and $\le$ defined on the real numbers $\mathbb{R}$, and indeed these are the models on which the above definition is based.
For our purposes we need a slight refinement of the above concepts:
Definition 5.75 Let $\le$ be a total ordering defined on a set $S$, and let $<$ be the corresponding strict total ordering. If every nonempty subset of $S$ has a least element with respect to $\le$ and $<$, then we say that $S$ is well-ordered, and that both $\le$ and $<$ are well-orderings.

With a bit of thought it should be obvious that any finite set can be well-ordered: arrange the elements of the set in a list and define a strict ordering corresponding to this arrangement. Then the least element is simply the first element on the list. This method also extends to countably infinite sets, but the assertion that any set can be well-ordered is equivalent to Zorn's Lemma or the Axiom of Choice. A discussion of the details of axiomatic set theory is beyond the scope of this book, but the statements of both the Axiom and the Lemma are given here for completeness, without further comment.

Axiom of Choice Given any collection $X$ of nonempty sets, there exists a function $f$ defined on $X$ (a choice function) such that $f(S) \in S$ for every $S \in X$.

Zorn's Lemma Let $S$ be a partially-ordered set such that every totally-ordered, nonempty subset of $S$ has an upper bound. Then $S$ contains at least one maximal element.

In particular, we'll use the following strict well-ordering, called variously the shortlex, radix, lenlex or length-lexicographical order.
Definition 5.76 Let $X$ be a set of generators, equipped with some total order $\le$, and let $W(X)$ denote the set of reduced words formed from those generators and their formal inverses. Then the order $\le$ on $X$ extends to a well-defined order $\le$ on $W(X)$, called the shortlex or lenlex order, defined as follows.
Let $u = u_1 \ldots u_s$ and $v = v_1 \ldots v_t$ be two words in $W(X)$. Then $u < v$ if and only if:
(i) either $s < t$
(ii) or $s = t$ and there exists some $0 \le i < s$ such that $u_j = v_j$ for $j \le i$ but $u_{i+1} < v_{i+1}$.

To see how this works, consider $F_2 = \langle x, y : \ \rangle$. We first need to impose a strict total ordering on the set $X^{\pm} = \{x, y, x^{-1}, y^{-1}\}$, so let's order the generators $x$ and $y$ alphabetically, and then order their inverses alphabetically too:
\[ x < y < x^{-1} < y^{-1}. \]
Now consider the set
\[ S = \{x^2, x, x^{-1}, xy, x^{-3}, xyx^{-1}, xyx, y^4\} \subset F_2. \]
To put these elements in shortlex order, we first group the words by length, in accordance with condition (i) of Definition 5.76, to get
\[ \{x, x^{-1}\} < \{xy, x^2\} < \{x^{-3}, xyx^{-1}, xyx\} < \{y^4\}. \]
Now we use condition (ii) to order words of the same length:
\[ x < x^{-1} < x^2 < xy < xyx < xyx^{-1} < x^{-3} < y^4. \]
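The shortlex order translates directly into a sort key: compare lengths first, then compare letter by letter using the chosen order on generators and inverses. In the Python sketch below, uppercase letters stand for inverses (so `X` is $x^{-1}$); this encoding and the names `RANK` and `shortlex_key` are conventions adopted for illustration only.

```python
RANK = {"x": 0, "y": 1, "X": 2, "Y": 3}    # x < y < x^-1 < y^-1

def shortlex_key(word):
    """Sort key: length first (condition (i)), then letter ranks (condition (ii))."""
    return (len(word), [RANK[s] for s in word])

# The set S above: x^2, x, x^-1, xy, x^-3, xyx^-1, xyx, y^4.
S = ["xx", "x", "X", "xy", "XXX", "xyX", "xyx", "yyyy"]
print(sorted(S, key=shortlex_key))
```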
We can now use the shortlex order to prove that every subgroup of a free group has a Schreier transversal. This is one of the first ingredients in our proof of the Nielsen–Schreier Theorem.
Proposition 5.77 Let $F$ be a free group with basis $X$, and let $H$ be a subgroup of $F$. Then $H$ has a Schreier transversal $U$ in $F$.

Proof To form a transversal of $H$ in $F$, we choose an element from each right coset $Hw$, where $w \in F$. But to ensure it satisfies the Schreier property, we must be careful about which elements we choose. So, for every coset $Hw$, select the least element of $Hw$ relative to the shortlex ordering; we claim that this gives a Schreier transversal.
Let $a_1 \ldots a_s$ be some element of $U$. If $a_1 \ldots a_{s-1}$ isn't in $U$ then there must exist some word $b_1 \ldots b_t$ in the coset $Ha_1 \ldots a_{s-1}$ such that $b_1 \ldots b_t < a_1 \ldots a_{s-1}$. Then $b_1 \ldots b_t a_s < a_1 \ldots a_{s-1} a_s$ and also $b_1 \ldots b_t a_s$ is in the coset $Ha_1 \ldots a_{s-1} a_s$.
But this is a contradiction: if $a_1 \ldots a_s \in U$ then it must be the least element of the coset $Ha_1 \ldots a_s$ relative to the shortlex ordering. Hence $a_1 \ldots a_{s-1}$ must be in $U$ after all, and so $U$ is a Schreier transversal.
In this proof we used the fact (left as an exercise) that for the shortlex ordering $<$ if $v, w \in F$ and $x \in X^{\pm}$, with $v < w$, then $vx < wx$ and $xv < xw$. In fact, any ordering of $F$ satisfying this property will suffice.
Now that we know every subgroup of a free group has a Schreier transversal, let's look at a couple of examples.
Example 5.78 Let $F = F_2 = \langle x, y : \ \rangle$ and let $H = \langle x^2, y^2, xyx^{-1}y^{-1} \rangle$. In particular, $F/H = \langle x, y : x^2, y^2, xyx^{-1}y^{-1} \rangle \cong V_4$, so $|F : H| = 4$ and thus any transversal of $H$ in $F$ must have four elements. It so happens that
\[ U = \{e, x, y, xy\} \]
is a Schreier transversal for $H$ in $F$. Furthermore, $U$ is the Schreier transversal constructed in Proposition 5.77 since each element of $U$ is the least element in its coset relative to the shortlex ordering.

Sometimes the obvious Schreier transversal isn't the optimal one relative to the shortlex ordering, as the following example shows.
Example 5.79 Let $F = F_2 = \langle x, y : \ \rangle$ and let $H = \langle x^3, y^2, (xy)^2 \rangle$. It so happens that $F/H = \langle x, y : x^3, y^2, (xy)^2 \rangle \cong D_3$, so $|F : H| = 6$ and any transversal of $H$ in $F$ must have six elements.
If $f : X \to D_3$ is defined by $f(x) = r$ and $f(y) = m_1$ then $f$ extends to a unique homomorphism $\hat{f} : F_2 \to D_3$ whose kernel is $H$. We also have
\[ \hat{f}(e) = e, \qquad \hat{f}(x^2) = r^2, \qquad \hat{f}(xy) = m_2, \qquad\text{and}\qquad \hat{f}(x^2y) = m_3. \]
We can use these (see Proposition 4.38) to show each of the elements $e$, $x$, $y$, $x^2$, $xy$ and $x^2y$ lies in a different coset of $H$ in $F$. Therefore
\[ U = \{e, x, y, x^2, xy, x^2y\} \]
is a transversal for $H$ in $F$. It's also a Schreier transversal, but it's not the minimal Schreier transversal constructed by Proposition 5.77.
To find this shortlex-minimal Schreier transversal, we can list words in $F$ in increasing shortlex order until we've found a representative for each coset, discarding subsequent representatives for a given coset. We then use these elements to form a transversal.
\[
\begin{array}{l|ccccccccc}
\text{word} & e & x & y & x^{-1} & y^{-1} & x^2 & xy & xy^{-1} & yx \\ \hline
\text{coset} & H & Hx & Hy & Hx^{-1} & Hy & Hx^{-1} & Hxy & Hxy & Hyx
\end{array}
\]
This process yields the transversal
\[ V = \{e, x, y, x^{-1}, xy, yx\} \]
which clearly has the Schreier property.
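The search described in this example can be sketched in Python as follows. Two cosets $Hv$ and $Hw$ coincide exactly when $v$ and $w$ have the same image in $D_3$, so the sketch identifies cosets by images in $S_3$; the particular permutations chosen for $r$ and $m_1$, the uppercase-for-inverse encoding and all function names are assumptions made for illustration.

```python
from itertools import product

ORDER = "xyXY"                                   # x < y < x^-1 < y^-1
INV = {"x": "X", "X": "x", "y": "Y", "Y": "y"}
PERM = {"x": (1, 2, 0), "y": (0, 2, 1)}          # assumed images of x and y in S3
PERM["X"] = (2, 0, 1)                            # inverse of the 3-cycle
PERM["Y"] = PERM["y"]                            # a reflection is its own inverse

def image(word):
    """Image of a word in S3; equal images mean equal cosets of H."""
    p = (0, 1, 2)
    for s in word:
        p = tuple(PERM[s][i] for i in p)
    return p

def reduced_words(max_len):
    """Yield reduced words in increasing shortlex order."""
    yield ""
    for n in range(1, max_len + 1):
        for letters in product(ORDER, repeat=n):
            if all(INV[a] != b for a, b in zip(letters, letters[1:])):
                yield "".join(letters)

def shortlex_transversal(index, max_len=6):
    """Keep the first word met in each coset until `index` cosets are found."""
    reps = {}
    for w in reduced_words(max_len):
        reps.setdefault(image(w), w)
        if len(reps) == index:
            break
    return list(reps.values())

print(shortlex_transversal(6))
```

Since the first word met in each coset is the shortlex-least one, the result is the transversal that Proposition 5.77 constructs.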

The next step in our journey towards a proof of the Nielsen–Schreier Theorem is to find generators for the subgroup $H$, for which we use the shortlex-minimal Schreier transversal provided by Proposition 5.77.
First, though, we introduce another little piece of notation. Suppose that $w \in F$ is some word in a free group $F$, and that $U$ is a Schreier transversal for a subgroup $H \le F$. We form the coset $Hw$ and take the intersection $Hw \cap U$. This will consist of a single element which might (but won't necessarily) be equal to $w$. Denote this element by $\overline{w}$. This process yields a function $F \to U$, given by $w \mapsto \overline{w}$ for all $w \in F$.
This bar function has a number of properties which we state here without further proof. It's idempotent, in the sense that $\overline{\overline{w}} = \overline{w}$ for all $w \in F$. Also, $H\overline{w} = Hw$ for all $w \in F$, and $\overline{w} = w$ if and only if $w \in U$.
Let $F$ be a free group generated by some basis $X$, let $H$ be a subgroup of $F$, and let $U$ be a Schreier transversal for $H$ in $F$. In the next part of this discussion we'll need the set
\[ Y = \{ ux\,\overline{ux}^{-1} : u \in U,\ x \in X^{\pm} \}. \]
It's worth thinking a little bit about what this set $Y$ actually means. Suppose that $u$ is an element of the Schreier transversal $U$, and that $x$ is either a generator, or the inverse of a generator, in $X^{\pm}$. Then $ux$ is clearly a word in $F$, and $\overline{ux}$ is the corresponding transversal element in $U$, and may or may not be equal to $ux$. If $\overline{ux} = ux$, which will happen exactly when $ux \in U$, then the element $ux\,\overline{ux}^{-1}$ will be equal to the identity $e \in F$. So $ux\,\overline{ux}^{-1}$ tells us whether $ux$ is also in $U$.
Now consider the element $\overline{\overline{ux}\,x^{-1}}$. This is in the same coset as $\overline{ux}\,x^{-1}$, which is in the same coset as $uxx^{-1} = u$. Since $u$ is itself an element of $U$, we have
\[ \overline{\overline{ux}\,x^{-1}} = u. \tag{5.2} \]
We'll use this fact a couple of times later, but first let's construct the set $Y$ in a specific case to see how it works.

Example 5.80 Let $F = F_2 = \langle x, y : \ \rangle$, let $H = \langle x^3, y^2, (xy)^2 \rangle$, and let
\[ U = \{e, x, y, x^{-1}, xy, yx\} \]
be the shortlex-minimal Schreier transversal obtained via Proposition 5.77. We construct the set $Y = \{ ux\,\overline{ux}^{-1} : u \in U,\ x \in X^{\pm} \}$ by means of the following table:
\[
\begin{array}{c|cccc}
 & x & y & x^{-1} & y^{-1} \\ \hline
e & e & e & e & y^{-2} \\
x & x^3 & e & e & xy^{-2}x^{-1} \\
y & e & y^2 & yx^{-1}y^{-1}x^{-1} & e \\
x^{-1} & e & x^{-1}yx^{-1}y^{-1} & x^{-3} & x^{-1}y^{-1}x^{-1}y^{-1} \\
xy & xyxy^{-1} & xy^2x^{-1} & xyx^{-2}y^{-1} & e \\
yx & yx^2y^{-1}x^{-1} & yxyx & e & yxy^{-1}x
\end{array}
\]
Here the column headings are elements $x$ of $X^{\pm}$ while the row headings are elements $u$ of the transversal $U$, and the entries in the table are the corresponding words $ux\,\overline{ux}^{-1}$.
For example, we calculate $(xy)x^{-1}\,\overline{(xy)x^{-1}}^{-1}$ by first concatenating $xy$ and $x^{-1}$ to get $xyx^{-1}$. Then we find the appropriate transversal element $\overline{xyx^{-1}} = yx$, invert it to get $x^{-1}y^{-1}$, append it to $xyx^{-1}$, and perform any necessary cancellations to get $xyx^{-2}y^{-1}$.
The set $Y$, then, comprises all of these elements:
\[
\begin{aligned}
Y = \{ & e,\ y^2,\ y^{-2},\ x^3,\ xyxy^{-1},\ x^{-3},\ xy^2x^{-1},\ xy^{-2}x^{-1},\ yxyx, \\
& yxy^{-1}x,\ yx^{-1}y^{-1}x^{-1},\ x^{-1}yx^{-1}y^{-1},\ x^{-1}y^{-1}x^{-1}y^{-1}, \\
& xyx^{-2}y^{-1},\ yx^2y^{-1}x^{-1} \}.
\end{aligned}
\]

We won't require that $U$ is minimal with respect to the shortlex ordering in the rest of this discussion, but the shortlex ordering is useful because it enables us to construct a Schreier transversal.
The idea is that this process yields a set of generators for the subgroup $H$, but looking at the table constructed in the above example, we see that there are a number of redundant or trivial elements on the list. In particular, there are several instances of the identity $e$: this occurs when $ux \in U$ so that $\overline{ux} = ux$. Also, there are a number of occasions when both an element and its inverse occur. In fact, the nontrivial elements in the right half of the table are exactly the inverses of those in the left half of the table. So, in general, we can halve the amount of work we have to do by only constructing the left half of the table.
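The table in Example 5.80 and the observation about its two halves can be checked with a short computation. The Python sketch below recomputes the entries $ux\,\overline{ux}^{-1}$ for the transversal $U = \{e, x, y, x^{-1}, xy, yx\}$, finding $\overline{w}$ by comparing images in $S_3$. As before, uppercase letters denote inverses, and the permutations chosen for the images of $x$ and $y$ and the helper names are assumptions for illustration.

```python
INV = {"x": "X", "X": "x", "y": "Y", "Y": "y"}
PERM = {"x": (1, 2, 0), "y": (0, 2, 1)}          # assumed images of x and y in S3
PERM["X"] = (2, 0, 1)                            # inverse of the 3-cycle
PERM["Y"] = PERM["y"]                            # a reflection is its own inverse

def image(word):
    p = (0, 1, 2)
    for s in word:
        p = tuple(PERM[s][i] for i in p)
    return p

def reduce(word):
    """Cancel adjacent inverse pairs such as xX or Yy."""
    out = []
    for s in word:
        if out and out[-1] == INV[s]:
            out.pop()
        else:
            out.append(s)
    return "".join(out)

def invert(word):
    return "".join(INV[s] for s in reversed(word))

U = ["", "x", "y", "X", "xy", "yx"]
REP = {image(u): u for u in U}

def bar(word):
    """The transversal element lying in the same coset as word."""
    return REP[image(word)]

def entry(u, s):
    """The table entry u s bar(u s)^-1, freely reduced."""
    w = u + s
    return reduce(w + invert(bar(w)))

left = {entry(u, s) for u in U for s in "xy"} - {""}
right = {entry(u, s) for u in U for s in "XY"} - {""}
print(sorted(left, key=lambda w: (len(w), w)))
print(right == {invert(z) for z in left})
```

In this instance the left half contributes seven nontrivial words, each of which maps to the identity of $D_3$ and therefore lies in $H$.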
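Now define
$$Z = \{ux\overline{ux}^{-1} : u \in U,\ x \in X,\ ux \notin U\}.$$
This will yield only the nontrivial elements we want. To prove that
these are the only redundancies we need the following lemma.
Lemma 5.81 Let
$$Z^{-1} = \{z^{-1} : z \in Z\}$$
and $Z' = \{ux\overline{ux}^{-1} : u \in U,\ x \in X^{-1},\ ux \notin U\}$.
Then $Z^{-1} = Z'$ and $Y = Z \cup Z^{-1} \cup \{e\}$.

Proof Let $w \in Z^{-1}$, so $w = (ux\overline{ux}^{-1})^{-1} = \overline{ux}\,x^{-1}u^{-1}$ for some $u \in U$
and $x \in X$. Then by (5.2) we have $u = \overline{\overline{ux}\,x^{-1}}$ so $w = \overline{ux}\,x^{-1}\,\overline{\overline{ux}\,x^{-1}}^{\,-1}$,
which is an element of $Z'$, since we can rewrite it as $w = vx^{-1}\,\overline{vx^{-1}}^{\,-1}$
where $v = \overline{ux}$. Hence $Z^{-1} \subseteq Z'$.
Now let $w = ux^{-1}\,\overline{ux^{-1}}^{\,-1} \in Z'$, so $w^{-1} = \overline{ux^{-1}}\,xu^{-1}$. Writing $v = \overline{ux^{-1}}$
we see that $\overline{vx} = u$ by (5.2) and hence $w^{-1} = vx\overline{vx}^{-1} \in Z$, so $w \in Z^{-1}$.
Therefore $Z' \subseteq Z^{-1}$ and so $Z' = Z^{-1}$ as required.
The union $Z \cup Z^{-1}$ contains all the nontrivial elements of $Y$, so
including $e$ as well gives us the entirety of $Y$.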
The significance of $Z$ is that its elements generate $H$:
Lemma 5.82 Let $F$ be a free group generated by some basis $X$, let $H$ be
a subgroup of $F$, and let $U$ be a Schreier transversal for $H$ in $F$. The set
$Z = \{ux\overline{ux}^{-1} : u \in U,\ x \in X,\ ux \notin U\}$ generates $H$.

Proof Let $h \in H$, and suppose that $h = x_1 \ldots x_n$ where $x_1, \ldots, x_n \in X^{\pm}$.
We now define a sequence of transversal elements $u_1, \ldots, u_{n+1}$
by setting $u_1 = e$ and $u_{i+1} = \overline{x_1 \ldots x_i}$ for $1 \le i \le n$. Since $h \in H$ we
have $u_{n+1} = \overline{x_1 \ldots x_n} = \overline{h} = e$. Also, the elements $u_i$ satisfy a recursive
relation $u_{i+1} = \overline{u_i x_i}$.
Define a sequence of elements $y_1, \ldots, y_n \in Y$ by setting
$$y_i = u_i x_i \overline{u_i x_i}^{-1} = u_i x_i u_{i+1}^{-1}.$$
Then
$$\begin{aligned}
y_1 \ldots y_n &= (u_1 x_1 u_2^{-1})(u_2 x_2 u_3^{-1}) \ldots (u_n x_n u_{n+1}^{-1}) \\
               &= u_1 x_1 \ldots x_n u_{n+1}^{-1} \\
               &= u_1 h u_{n+1}^{-1} \\
               &= ehe \\
               &= h.
\end{aligned}$$
This tells us that any element $h \in H$ can be written as a product of
elements of $Y$, each of which is in turn a word in elements of $Z$ and
their inverses. Therefore $Z$ generates $H$ as claimed.
At this point, we've found a set $Z$ of generators for $H$ (which we will
sometimes refer to as Schreier generators), but we still need to show
that $H$ is freely generated by those elements. The way we do this is by
means of the following lemma.
Lemma 5.83 Let $u \in U$ and $x \in X^{\pm}$ such that $ux\overline{ux}^{-1} \in Z^{\pm}$. Then $x$
doesn't cancel in the product $ux\overline{ux}^{-1}$. Furthermore, if $v \in U$ and $y \in X^{\pm}$
such that $vy\overline{vy}^{-1} \in Z^{\pm}$ and $vy\overline{vy}^{-1} \ne (ux\overline{ux}^{-1})^{-1}$ then in the product
$ux\overline{ux}^{-1}\,vy\overline{vy}^{-1}$ neither the first $x$ nor the first $y$ cancels.

Proof If $x$ cancels in $ux\overline{ux}^{-1}$, then it either does so in $ux$ or in $x\overline{ux}^{-1}$.
In the former case, let $u = a_1 \ldots a_m$ with $a_m^{-1} = x$. Then $ux =
a_1 \ldots a_{m-1}$, which is in $U$ by the Schreier property, so $\overline{ux} = ux$ and
hence $ux\overline{ux}^{-1} = e$, contradicting our hypothesis that $ux\overline{ux}^{-1} \in Z^{\pm}$.
The other possibility is that $x$ cancels in $x\overline{ux}^{-1}$. For this to happen, we
need $\overline{ux} = b_1 \ldots b_n$ with $x = b_n$ so that $\overline{ux}\,x^{-1} = b_1 \ldots b_{n-1} \in U$ and
hence $\overline{ux} = ux$, which also contradicts the assertion that $ux\overline{ux}^{-1} \in Z^{\pm}$.
Now consider the product $ux\overline{ux}^{-1}\,vy\overline{vy}^{-1}$. By the above discussion, $x$
can't cancel with either $u$ or $\overline{ux}^{-1}$, and $y$ can't cancel with either $v$ or
$\overline{vy}^{-1}$, so if $x$ cancels at all, then either all of $\overline{ux}^{-1}$ cancels with (some
or all of) $v$, or all of $v$ cancels with (some or all of) $\overline{ux}^{-1}$.
Suppose that all of $\overline{ux}^{-1}$ cancels with some of $v$. Then $\overline{ux} = a_1 \ldots a_m$
and $v = b_1 \ldots b_n$ with $m \le n$ and $a_i = b_i$ for $1 \le i \le m$.
If $m < n$ then $x$ cancels with $b_{m+1}$ in the sense that $x^{-1} = b_{m+1}$, and thus
$\overline{ux}\,x^{-1} = b_1 \ldots b_{m+1} \in U$ (by the Schreier property). Hence $\overline{ux}\,x^{-1} = u$
by (5.2), so $\overline{ux} = ux$, contradicting the hypothesis that $ux \notin U$.
If, on the other hand, $m = n$ then $\overline{ux} = v$ and $x = y^{-1}$, so $vy = \overline{ux}\,x^{-1}$,
which means that $\overline{vy} = \overline{\overline{ux}\,x^{-1}} = u$. Hence $ux\overline{ux}^{-1} = \overline{vy}\,y^{-1}v^{-1} =
(vy\overline{vy}^{-1})^{-1}$, which contradicts our hypothesis.
A similar argument shows that all of $v$ can't cancel with (some or all
of) $\overline{ux}^{-1}$, at which point the lemma is proved.

Corollary 5.84 Let $y_i = u_i x_i \overline{u_i x_i}^{-1}$ in $Z^{\pm}$ for $1 \le i \le m$ and suppose
that $y_i \ne y_{i+1}^{-1}$ for $1 \le i < m$. Then in the product $y_1 \ldots y_m$ none of the
$x_i$ terms cancel, so $y_1 \ldots y_m \ne e$.
At long last, we are now ready to prove the Nielsen–Schreier Theorem.

Theorem 5.11 (The Nielsen–Schreier Theorem) Let $F$ be a free group, and let
$H$ be a subgroup of $F$. Then $H$ is itself a free group. Furthermore, if the
index $|F : H| = s$ of $H$ in $F$ and $\operatorname{rank}(F) = r$ are both finite, then
$$\operatorname{rank}(H) = (r - 1)s + 1.$$

Proof of Theorem 5.11 By Proposition 5.77 we can find a Schreier
transversal $U$ for $H$ in $F$ such that $|U| = |F : H| = s$.
By Lemmas 5.81 and 5.82 we can use this transversal to construct a
generating set $Z$ for $H$.
To show that $H$ is free on $Z$, we just have to show that distinct reduced
words in $Z^{\pm}$ represent distinct elements of $H$. Suppose that $b_1 \ldots b_m =
c_1 \ldots c_n$ in $F$ with $b_i, c_i \in Z^{\pm}$, and $b_i \ne b_{i+1}^{-1}$ for $1 \le i < m$ and $c_i \ne c_{i+1}^{-1}$
for $1 \le i < n$. Then $b_1 \ldots b_m c_n^{-1} \ldots c_1^{-1} = 1$, so by Corollary 5.84 we
must have $b_m = c_n$. We can show by induction that $b_i = c_i$ for
$1 \le i \le m$ and that $m = n$, so $H$ is free on $Z$.
To calculate the rank of $H$, we need to work out $|Z|$. To do this, we count
the pairs $u \in U$ and $x \in X$ for which $\overline{ux} = ux$, since these are exactly
the pairs whose word $ux\overline{ux}^{-1}$ is trivial.
Let $a_1 \ldots a_m \in U \setminus \{e\}$. If $a_m = x \in X$ then we have $\overline{ux} = ux$ with
$u = a_1 \ldots a_{m-1}$ since $U$ is a Schreier transversal. On the other hand,
if $a_m = x^{-1} \in X^{-1}$ then $\overline{ux} = ux$ with $u = a_1 \ldots a_m$. So in either case
there is a trivial word $ux\overline{ux}^{-1}$ associated with this transversal element.
Conversely, if $ux\overline{ux}^{-1}$ is trivial then $x$ cancels either with the last
letter of $u$ or the first letter of $\overline{ux}^{-1}$ so it must occur in one of the
two ways just described. Hence the number of trivial words $ux\overline{ux}^{-1}$ with
$u \in U$ and $x \in X$ is $|U \setminus \{e\}| = s - 1$.
So $|Z| = sr - (s - 1) = s(r - 1) + 1$ as claimed.
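The counting argument can be checked numerically in small cases. The sketch below is an illustration under stated assumptions, not part of the proof: a finite-index subgroup of a free group of rank $r$ is modelled as the stabiliser of the point 0 under a transitive permutation action of the generators on $s$ points, a Schreier transversal is built by breadth-first search, and the nontrivial words $ux\overline{ux}^{-1}$ are counted. The function names and the encoding of letters as nonzero integers are hypothetical.

```python
from collections import deque


def schreier_transversal(gens):
    """Breadth-first Schreier transversal for the stabiliser of the point 0.

    gens maps each positive letter a to a permutation (tuple) of 0..s-1;
    the letter -a acts as the inverse permutation. The action is assumed
    to be transitive, so the points correspond to the right cosets.
    """
    act = dict(gens)
    for a, p in gens.items():
        inv = [0] * len(p)
        for i, j in enumerate(p):
            inv[j] = i
        act[-a] = tuple(inv)
    rep = {0: ()}  # point (coset) -> representative word
    queue = deque([0])
    while queue:
        point = queue.popleft()
        # Try positive letters first, then inverse letters.
        for a in sorted(act, key=lambda a: (a < 0, abs(a))):
            target = act[a][point]
            if target not in rep:
                rep[target] = rep[point] + (a,)  # extends a prefix already in the transversal
                queue.append(target)
    return act, rep


def count_schreier_generators(gens):
    """Return (index s, number of nontrivial words u x bar(ux)^{-1})."""
    act, rep = schreier_transversal(gens)
    count = 0
    for point, u in rep.items():
        for a in gens:  # positive letters only, as in the definition of Z
            target = act[a][point]
            ua = u[:-1] if u and u[-1] == -a else u + (a,)  # reduced form of u a
            if ua != rep[target]:
                count += 1
    return len(rep), count


# r = 2, s = 3: x acts as a 3-cycle, y as a transposition.
s3_action = {1: (1, 2, 0), 2: (1, 0, 2)}
# r = 2, s = 5: x acts as a 5-cycle, y acts trivially.
cyclic_action = {1: (1, 2, 3, 4, 0), 2: (0, 1, 2, 3, 4)}

for action in (s3_action, cyclic_action):
    s, size = count_schreier_generators(action)
    print(s, size, (len(action) - 1) * s + 1)
```

In both test actions the count agrees with $(r-1)s+1$. The search produces a prefix-closed transversal because every new representative is obtained by appending one letter to an existing one.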

During this section we've been primarily concerned with subgroups of
free groups, but it turns out that we can use the information provided
by the Todd–Coxeter algorithm to write down a presentation of a
finite-index subgroup $H$ of any finitely-presented group $G$.
Suppose $G = \langle X : R \rangle$ and $H = \langle Y \rangle \le G$. Then the Todd–Coxeter
algorithm provides us with a Schreier transversal $U$ for $H$ in $G$. We
then use the process described earlier (in Lemmas 5.81 and 5.82) to
construct a generating set $Z$ for $H$: they are simply the nontrivial
words $ux\overline{ux}^{-1}$ where $x \in X$ and $u \in U$.
Next, we must form a set of relators for $H$ expressed in terms of the
generators $X$, and finally we have to rewrite those relators in terms of
the generators in $Z$. The following proposition gives us the key to the
first of these tasks.
Proposition 5.85 Let $G = \langle X : R \rangle$, let $H$ be a subgroup of $G$, and let $U$
be a Schreier transversal for $H$ in $G$. Then $H \cong \langle Z : S \rangle$ where
$$Z = \{ux\overline{ux}^{-1} : u \in U,\ x \in X,\ ux \notin U\}$$
and $S$ is the set of elements of
$$T = \{uru^{-1} : u \in U,\ r \in R\}$$
rewritten as words in $Z^{\pm}$.
Proof We prove this by means of a slightly more general result.
Let $F$ be a group, let $K$ be a subgroup of $F$, and let $V$ be a Schreier
transversal of $K$ in $F$. Let $Q$ be a subset of $F$ such that $\langle Q^F \rangle \subseteq K$,
where $\langle Q^F \rangle$ denotes the normal closure of $Q$ in $F$.
We claim that $\langle Q^F \rangle$ is also the normal closure in $K$ of the set
$$P = \{vqv^{-1} : v \in V,\ q \in Q\}.$$
To see this, recall that $\langle Q^F \rangle$ is generated by elements $wqw^{-1}$ where
$w \in F$ and $q \in Q$. Any $w \in F$, however, can be written in the form
$kv$ for some $k \in K$ and $v \in V$. So $\langle Q^F \rangle$ is generated by all elements
$kvqv^{-1}k^{-1} = k(vqv^{-1})k^{-1}$, which are all conjugates of elements of $P$
with elements of $K$, so $\langle Q^F \rangle = \langle P^K \rangle$ as claimed.
We can now apply this to the specific problem of finding relators for
$H$ in terms of the generators in $Z$.
Set $F = F(X)$ and let $K$ be the preimage $f^{-1}(H)$ in $F$, where $f : F \to
G = F/\langle R^F \rangle$ is the quotient homomorphism obtained by factoring $F$
by the relators $R$.
(By Proposition 4.37, $f^{-1}(H)$ must be a subgroup of $F$, and in this
case we also know, by the Nielsen–Schreier Theorem, that it's free.)
We have a Schreier transversal $U$ of $H$ in $G$, but we can also regard
this as a Schreier transversal for $K$ in $F$. To see this, observe that $U$ is
formed by taking a carefully-chosen element from each coset of $H$ in
$G$. Passing to the free group $F$, these same transversal elements can
also be seen as belonging to the cosets of $K$ in $F$.
Therefore $K$ is freely generated by the set $Z$. The relator set $R$ is
obviously a subset of $F$, and its normal closure $\langle R^F \rangle$ lies in $K$, so by
the more general discussion above we know that $\langle R^F \rangle = \langle T^K \rangle$.
Rewriting the elements of $T$ in terms of the generators in $Z$ (which we
can often just do by inspection) we get the set $S$.
All that remains is to note that $\langle Z : S \rangle \cong K/\langle R^F \rangle \cong H$.
So, in practice we write down the set
$$T = \{uru^{-1} : u \in U,\ r \in R\}$$
and then rewrite these in terms of the generators in $Z$, either by
inspection or by some other method such as the recursive procedure
outlined in the proof of Lemma 5.82. This is all best illustrated by a
couple of examples.
Example 5.86 In Example 5.69 we applied the Todd–Coxeter algorithm
to the group $G = \langle x : x^4 \rangle \cong \mathbb{Z}_4$ and the subgroup $H = \langle x^2 \rangle$.
The resulting Schreier transversal is $U = \{e, x\}$, and we can use this
to construct the Schreier generators $ux\overline{ux}^{-1}$ for $H$ in $G$. The only
nontrivial such generator is $x^2$ so we have $Z = \{x^2\}$. Let $z = x^2$ be
the only element of this set.
Next we calculate the elements $uru^{-1}$ of the set $T$, which in this case
gives $T = \{x^4\}$. We can immediately rewrite this in terms of the
Schreier generator $z = x^2$, giving $x^4 = z^2$ and the presentation
$$H = \langle z : z^2 \rangle \cong \mathbb{Z}_2$$
as expected.
It's illuminating to use the recursive method from Lemma 5.82. We
take the relator $x^4 = x_1 x_2 x_3 x_4$, setting $x_1 = x_2 = x_3 = x_4 = x$. Next
we recursively calculate the transversal elements
$$\begin{aligned}
u_1 &= e, & u_2 &= \overline{x_1} = \overline{x} = x, \\
u_3 &= \overline{x_1 x_2} = \overline{x^2} = e, & u_4 &= \overline{x_1 x_2 x_3} = \overline{x^3} = x, \\
u_5 &= \overline{x_1 x_2 x_3 x_4} = \overline{x^4} = e.
\end{aligned}$$
We use these to calculate the elements $y_1, \ldots, y_4$:
$$\begin{aligned}
y_1 &= u_1 x_1 u_2^{-1} = exx^{-1} = e, & y_2 &= u_2 x_2 u_3^{-1} = xxe = x^2 = z, \\
y_3 &= u_3 x_3 u_4^{-1} = exx^{-1} = e, & y_4 &= u_4 x_4 u_5^{-1} = xxe = x^2 = z.
\end{aligned}$$
So $x^4 = ezez = z^2$ as expected, which gives us the presentation
$H = \langle z : z^2 \rangle$ again.

Next we study the subgroup of $D_3$ generated by a single reflection.

Example 5.87 Let $G = \langle X : R \rangle = \langle x, y : x^2, y^2, (xy)^3 \rangle \cong D_3$ and
$H = \langle x \rangle \cong \mathbb{Z}_2$. Applying the Todd–Coxeter algorithm tells us that
$|G : H| = 3$ and gives us a Schreier transversal $U = \{e, y, yx\}$, which
we can use to write down the Schreier generators
$$z_1 = x, \quad z_2 = y^2, \quad z_3 = yxyx^{-1}y^{-1}, \quad z_4 = yx^2y^{-1}.$$
The set $T = \{uru^{-1} : u \in U,\ r \in R\}$ consists of the words
$$x^2, \quad y^2, \quad (xy)^3, \quad yx^2y^{-1}, \quad (yx)^3, \quad yx^2yxyxyx^{-1}y^{-1}.$$
Some of these can be immediately written in terms of the Schreier
generators $z_1, \ldots, z_4$:
$$x^2 = z_1^2, \quad y^2 = z_2, \quad yx^2y^{-1} = z_4.$$
For the rest, we use the procedure outlined in Lemma 5.82. Starting
with $(xy)^3 = xyxyxy$ and setting $x_1 = x_3 = x_5 = x$ and $x_2 = x_4 =
x_6 = y$, we obtain the transversal elements
$$\begin{aligned}
u_1 &= e, & u_2 &= \overline{x_1} = \overline{x} = e, \\
u_3 &= \overline{x_1 x_2} = \overline{xy} = y, & u_4 &= \overline{x_1 x_2 x_3} = \overline{xyx} = yx, \\
u_5 &= \overline{x_1 x_2 x_3 x_4} = \overline{xyxy} = yx, & u_6 &= \overline{x_1 x_2 x_3 x_4 x_5} = \overline{xyxyx} = y, \\
u_7 &= \overline{x_1 x_2 x_3 x_4 x_5 x_6} = e.
\end{aligned}$$
Using these, we can write down the elements $y_1, \ldots, y_6$ as follows:
$$\begin{aligned}
y_1 &= u_1 x_1 u_2^{-1} = exe = x = z_1, & y_2 &= u_2 x_2 u_3^{-1} = eyy^{-1} = e, \\
y_3 &= u_3 x_3 u_4^{-1} = yxx^{-1}y^{-1} = e, & y_4 &= u_4 x_4 u_5^{-1} = yxyx^{-1}y^{-1} = z_3, \\
y_5 &= u_5 x_5 u_6^{-1} = yx^2y^{-1} = z_4, & y_6 &= u_6 x_6 u_7^{-1} = y^2 = z_2.
\end{aligned}$$
Hence $(xy)^3 = (x)(yxyx^{-1}y^{-1})(yx^2y^{-1})(y^2) = z_1 z_3 z_4 z_2$. Following
the same process, we get
$$\begin{aligned}
(yx)^3 &= (yxyx^{-1}y^{-1})(yx^2y^{-1})(y^2)(x) = z_3 z_4 z_2 z_1, \\
yx^2yxyxyx^{-1}y^{-1} &= (yx^2y^{-1})(y^2)(x)(yxyx^{-1}y^{-1}) = z_4 z_2 z_1 z_3.
\end{aligned}$$
This gives us the presentation
$$H = \langle z_1, z_2, z_3, z_4 : z_1^2,\ z_2,\ z_4,\ z_1 z_3 z_4 z_2,\ z_3 z_4 z_2 z_1,\ z_4 z_2 z_1 z_3 \rangle.$$
Immediately we can see that two of the generators, namely $z_2$ and $z_4$,
correspond to trivial relators and can therefore be discarded, yielding
$$H = \langle z_1, z_3 : z_1^2,\ z_1 z_3,\ z_3 z_1,\ z_1 z_3 \rangle.$$
This can be further simplified, either with carefully-chosen Tietze
transformations or just by inspection, to give
$$H = \langle z : z^2 \rangle \cong \mathbb{Z}_2.$$
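The recursive rewriting used in Example 5.87 is easy to automate. What follows is a minimal sketch under stated assumptions rather than a general implementation. The cosets of $H = \langle x \rangle$ in $D_3$ are numbered 0, 1, 2 with representatives $e$, $y$, $yx$, and the generators are assumed to act on these cosets as transpositions (which is consistent with $x$ fixing the coset $H$ and with $xy$ having order 3). The integer encoding of letters and the names `rewrite`, `coset` and `ACT` are illustrative.

```python
# Letters: 1 = x, -1 = x^{-1}, 2 = y, -2 = y^{-1}.
# Assumed action on the right cosets 0 = H, 1 = Hy, 2 = Hyx:
# x swaps cosets 1 and 2, y swaps cosets 0 and 1.
ACT = {1: (0, 2, 1), 2: (1, 0, 2)}

U = {0: (), 1: (2,), 2: (2, 1)}  # coset number -> representative e, y, yx

Z = {  # Schreier generators from the example
    (1,): "z1",               # x
    (2, 2): "z2",             # y^2
    (2, 1, 2, -1, -2): "z3",  # y x y x^{-1} y^{-1}
    (2, 1, 1, -2): "z4",      # y x^2 y^{-1}
}


def free_reduce(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)


def inverse(word):
    return tuple(-a for a in reversed(word))


def coset(word):
    """Coset reached from H by the word; both generators act as involutions."""
    point = 0
    for a in word:
        point = ACT[abs(a)][point]
    return point


def rewrite(word):
    """Rewrite a word representing an element of H in the generators z1..z4."""
    out = []
    u = ()
    for a in word:
        ua = free_reduce(u + (a,))
        nxt = U[coset(ua)]                  # u_{i+1} = bar(u_i x_i)
        y = free_reduce(ua + inverse(nxt))  # y_i = u_i x_i u_{i+1}^{-1}
        if y in Z:
            out.append(Z[y])
        elif inverse(y) in Z:
            out.append(Z[inverse(y)] + "^-1")
        elif y:
            raise ValueError("unexpected Schreier generator")
        u = nxt
    if u != ():
        raise ValueError("word does not lie in the preimage of H")
    return out


print(rewrite((1, 2) * 3))                          # (xy)^3
print(rewrite((2, 1) * 3))                          # (yx)^3
print(rewrite((2, 1, 1, 2, 1, 2, 1, 2, -1, -2)))    # y x^2 y x y x y x^{-1} y^{-1}
```

Trivial factors $y_i = e$ are simply skipped, so the output lists only the generators that actually appear in each rewritten relator.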

5.C Triangles, braids and reflections

À bas Euclide! Mort aux triangles!
(Down with Euclid! Death to triangles!)
Jean Dieudonné (1906–1992)

Although it has been proved that every braid can be deformed into a
similar normal form the writer is convinced that any attempt to carry this
out on a living person would only lead to violent protests and discrimination
against mathematics. He would therefore discourage such an experiment.
Emil Artin (1898–1962), Theory of braids,
Annals of Mathematics 48 (1947) 101–126

In this section we will look briefly at a few classes of groups that
can be defined by presentations of particular forms, but which have
interesting geometric interpretations.
We start with the dihedral group $D_n$, which can be expressed by the
presentation
$$\langle x, y : x^2 = y^2 = (xy)^n = e \rangle.$$
This group has an interpretation as the symmetry group of the regular
$n$-gon. Both generators $x$ and $y$ represent reflections, all of which
have order 2, and the product $xy$ represents a $2\pi/n$ rotation, which has
order $n$. If we choose $x$ and $y$ to be the right reflections, then we can
generate the entirety of $D_n$ just using them. In general there are many
valid choices, but setting $x = m_1$ and $y = m_2$ works.
The symmetric group $S_n$ has a presentation
$$\left\langle x_1, \ldots, x_{n-1} :
\begin{array}{ll}
x_i^2 = e & 1 \le i \le n-1 \\
(x_i x_{i+1})^3 = e & 1 \le i \le n-2 \\
x_i x_j = x_j x_i & 1 \le i < j \le n-1,\ j - i > 1
\end{array}
\right\rangle.$$
(The presentation for $D_3 \cong S_3$ is a special case of this.) Again, we have
a group whose generators all have order 2, and where products of
pairs of generators all have finite order.
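As a sanity check on the relations above, one can take $x_i$ to be the adjacent transposition $(i\ i{+}1)$ and compute the orders of the pairwise products. This sketch only confirms that these permutations satisfy the relations and generate a group of order $n!$. It does not show that the relations suffice to define $S_n$. The function names are illustrative, and positions are numbered from 0 in the code.

```python
from itertools import combinations
from math import factorial


def adjacent_transposition(i, n):
    """The permutation of 0..n-1 swapping positions i and i+1."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[i] for i in p)


def order(p):
    identity = tuple(range(len(p)))
    k, power = 1, p
    while power != identity:
        power = compose(power, p)
        k += 1
    return k


def generated_group(gens):
    """All permutations expressible as products of the given generators."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = compose(g, s)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return seen


n = 5
gens = [adjacent_transposition(i, n) for i in range(n - 1)]
pair_orders = {
    (i, j): order(compose(gens[i], gens[j]))
    for i, j in combinations(range(n - 1), 2)
}
print(pair_orders)
print(len(generated_group(gens)), factorial(n))
```

Neighbouring generators have products of order 3, and all other pairs commute, which matches the three families of relations in the presentation.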
A carefully chosen, compact notation can be tremendously useful in
seeing to the heart of a problem, and to that end we now introduce
the following scheme for describing finitely generated groups whose
generators all have order 2.
Definition 5.88 Given a finitely-generated group
$$G = \langle x_1, \ldots, x_n : x_i^2 = e,\ (x_i x_j)^{k_{ij}} = e \text{ for } 1 \le i < j \le n \rangle,$$
we construct a graph as follows: Represent each generator $x_i$ by a
vertex, and connect two vertices by an edge labelled $k$ if the product
of the corresponding generators has order $k$. To simplify the diagram,
we omit the label if $k = 2$ or 3, and omit the edge as well if $k = 2$.
The resulting graph is called a Coxeter diagram or Coxeter graph;[20] a
group determined by such a presentation is called a Coxeter group.

Given two generators $x$ and $y$ of order 2, if their product $xy$ also has
order 2, this is equivalent to saying that $x$ and $y$ commute. We'll view
this as the default situation, which is why we omit the edges between
vertices representing commuting generators. Also, it turns out that
many of the interesting cases mostly have products of order 3, so we
omit those labels to simplify the notation.

[20] They are named after the British mathematician H S M Coxeter
(1907–2003). A very similar notation was independently devised by the
Russian mathematician Eugenii Dynkin (1924–2014), and is particularly
useful in the study and classification of Lie groups and Lie algebras. For
this reason, these diagrams are also sometimes called Dynkin diagrams or
Coxeter–Dynkin diagrams.

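Example 5.89 The dihedral group $D_m$ is represented by the Coxeter
graph

    o---m---o

or

    o-------o

for $D_3$, and

    o       o

for $D_2 \cong V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.

Example 5.90 The symmetric group $S_n$ is represented by the Coxeter
diagram

    o---o---o-- ... --o

with $n-1$ vertices, one for each generator $x_1, \ldots, x_{n-1}$.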
A natural question to ask is which presentations (and which Coxeter
diagrams) correspond to finite groups. The classification theorem is
reasonably straightforward, but the details would involve us in too
much of a digression here, and so the interested reader is directed to
the book by James Humphreys.[21] The complete list of finite Coxeter
groups is given in Table 5.3. The list comprises four infinite families
($\mathsf{A}_n$, $\mathsf{B}_n$, $\mathsf{D}_n$ and $\mathsf{I}_2(m)$), and seven one-off extra cases: the exceptional
Coxeter groups ($\mathsf{E}_6$, $\mathsf{E}_7$, $\mathsf{E}_8$, $\mathsf{F}_4$, $\mathsf{G}_2$, $\mathsf{H}_3$ and $\mathsf{H}_4$).[22] In this notation, the
subscript denotes the rank (the number of generators) of the group.

[21] J E Humphreys, Reflection Groups and Coxeter Groups, Cambridge
Studies in Advanced Mathematics 29, Cambridge University Press (1990).
[22] As noted before, in the discussion of finite simple groups of Lie type,
we have a slight notational awkwardness to contend with here. We will
again use a sans serif typeface to denote these groups and distinguish them
from other similarly-named ones. In particular, we distinguish between the
alternating group $A_n$ and the Coxeter group $\mathsf{A}_n$ (which is isomorphic to the
symmetric group $S_{n+1}$), and between the dihedral group $D_n$ (which in this
scheme is denoted $\mathsf{I}_2(n)$) and the Coxeter group $\mathsf{D}_n$.

All of these groups arise as symmetry groups of geometric objects
in Euclidean spaces, specifically polygons, polyhedra or their
higher-dimensional analogues polytopes (or polychora). The group $\mathsf{A}_3 \cong S_4$
describes the symmetries of the regular tetrahedron; more generally
$\mathsf{A}_n$ is the symmetry group of the $n$-simplex.
The group $\mathsf{B}_3$ is the symmetry group of the cube and the octahedron;
$\mathsf{B}_4$ of their analogues in 4 dimensions (the hypercube or tesseract and
the hyperoctahedron); and in general $\mathsf{B}_n$ is the symmetry group of
the $n$-cube and $n$-octahedron (or $n$-dimensional cross polytope).
The groups $\mathsf{D}_n$ represent the symmetries of a family of semiregular
polytopes called $n$-demicubes. These are obtained by deleting every
other vertex from the $n$-cube; in particular the 3-demicube is a
tetrahedron and the 4-demicube is the hyperoctahedron.
The exceptional groups $\mathsf{E}_6$, $\mathsf{E}_7$ and $\mathsf{E}_8$ describe the symmetries of three
semiregular Gosset polytopes first discovered by the British barrister
and amateur mathematician Thorold Gosset (1869–1962).
The exceptional group $\mathsf{F}_4$ is the symmetry group of a regular
4-dimensional polytope called the 24-cell, which consists of 24 regular
octahedra glued together along their faces.
The group $\mathsf{H}_3$ describes the symmetries of the dodecahedron and
the icosahedron, while $\mathsf{H}_4$ is the symmetry group of two regular
4-dimensional polytopes: the 120-cell or hyperdodecahedron (which
consists of 120 regular dodecahedra glued together along their faces)
and the 600-cell or hypericosahedron (which has 600 regular tetrahedral
hyperfaces).
The groups $\mathsf{I}_2(m)$ are isomorphic to the dihedral groups $D_m$. For
complicated reasons that need not concern us here (but which relate
to Lie groups and Lie algebras) the case $m = 6$ is sometimes treated
as a special case, and denoted $\mathsf{G}_2$; also $\mathsf{I}_2(3) \cong \mathsf{A}_2$ and $\mathsf{I}_2(4) \cong \mathsf{B}_2$.

Table 5.3: Finite Coxeter groups
(The Coxeter diagrams are not reproduced; only the edge labels shown in them are listed.)

Type      Range    Edge label   Order of group   Polytopes
A_n       n ≥ 1                 (n+1)!           n-simplex
B_n       n ≥ 2    4            2^n n!           n-cube, n-cross
D_n       n ≥ 4                 2^(n-1) n!       n-demicube
E_6                             51 840           Gosset polytope 2_21
E_7                             2 903 040        Gosset polytope 3_21
E_8                             696 729 600      Gosset polytope 4_21
F_4                4            1 152            24-cell
G_2                6            12               regular hexagon
H_3                5            120              dodecahedron, icosahedron
H_4                5            14 400           120-cell, 600-cell
I_2(m)    m ≥ 2    m            2m               regular m-gon

Related to these are the affine Coxeter groups, listed in Table 5.4,
which describe the symmetries of certain regular and semiregular
tilings (or honeycombs) of $\mathbb{R}^n$. They all have infinite order. The
names of each of these Coxeter graphs and their associated groups
indicate that an affine group $\widetilde{\mathsf{X}}_n$ is obtained from the corresponding
finite group $\mathsf{X}_n$ by adjoining an extra vertex; so $\widetilde{\mathsf{X}}_n$ contains $n+1$
vertices, not $n$.

Table 5.4: Affine Coxeter groups
(The Coxeter diagrams are not reproduced; only the edge labels shown in them are listed.)

Type             Range    Edge labels   Tiling
Ã_1                                     apeirogon
Ã_n              n ≥ 2                  n-simplex
B̃_2 = C̃_2                 4, 4          square tiling
B̃_n              n ≥ 3    4             n-demicubic
C̃_n              n ≥ 3    4, 4          n-cubic
D̃_n              n ≥ 4                  n-demicubic
Ẽ_6                                     2_22
Ẽ_7                                     3_31 and 1_33
Ẽ_8                                     5_21, 2_51 and 1_52
F̃_4                       4             16-cell and 24-cell
G̃_2                       6             hexagonal and triangular

Another class of Coxeter groups relates to reflections in hyperbolic
space; these will not concern us here, but a full list can be found in
Humphreys' book.[23]

[23] Humphreys, Reflection Groups and Coxeter Groups, Sections 6.8 and 6.9.
The mathematical study of knots is a vibrant and varied area of
research dating back to the middle of the 19th century. An important
and related concept is that of a braid: an arrangement of finitely
many parallel strings, usually drawn vertically, where adjacent strands
are allowed to cross over each other, but must always point strictly
downwards (we don't allow them to be horizontal or double back on
themselves at any point). See Figure 5.4 for an example. Considered
as topological objects, we say two braids are equivalent if one can be
rearranged to look like the other if the top and bottom end of each
string is kept fixed in place. (Technically, we say that two braids are
equivalent if they are ambient isotopic relative to their endpoints.)

Figure 5.4: A braid

It transpires that any such rearrangement can be decomposed as
a finite sequence of simple moves of three basic types. Type one
allows us to move crossings on non-adjacent strings up and down
(see Figure 5.5). A move of the second type allows the introduction
or removal of a pair of cancelling crossings (see Figure 5.6). And the
third move involves the interaction of three crossings in three adjacent
strands, where a strand is allowed to pass over or under a crossing
in the other two strands (see Figure 5.7). The second and third of
these are essentially braid-theoretic versions of the second and third
Reidemeister moves in knot theory (the first Reidemeister move, which
relates to the introduction or deletion of a loop, is not applicable here).

Figure 5.5: A braid move of type 1
Figure 5.6: A braid move of type 2
Figure 5.7: A braid move of type 3

It so happens that all of this can be encoded in a group-theoretic way:

Definition 5.91 Let $B_n$ denote the set of all $n$-string braids, modulo
the equivalence relation we described above. We can concatenate two
braids $\beta_1$ and $\beta_2$ by joining the bottom ends of $\beta_1$ to the top ends of
the corresponding strings of $\beta_2$ to obtain a new braid $\beta_1\beta_2$. This is
an associative binary operation defined on $B_n$. The identity element
is the trivial braid consisting of $n$ parallel strings with no crossings.
For any braid $\beta$ we have a unique (up to equivalence) inverse braid
$\beta^{-1}$ obtained by reflecting $\beta$ vertically; the concatenations $\beta\beta^{-1}$ and
$\beta^{-1}\beta$ both yield the trivial braid when all the strings have been
pulled taut.
The resulting group is the $n$-string braid group.

Any $n$-string braid can be decomposed as a concatenation of one
or more elementary braids $\sigma_1, \ldots, \sigma_{n-1}$ and their inverses. Here, $\sigma_i$
denotes the braid where string $i+1$ crosses over string $i$ (see Figure 5.8)
while $\sigma_i^{-1}$ is the braid where string $i$ crosses over string $i+1$.

Figure 5.8: Elementary braids $\sigma_i$ and $\sigma_i^{-1}$

We can use this viewpoint to construct the following presentation for
$B_n$, first proved in 1925 by the Austrian mathematician Emil Artin
(1898–1962).
Proposition 5.92 Let $B_n$ be the $n$-string braid group. Then
$$B_n \cong \left\langle \sigma_1, \ldots, \sigma_{n-1} :
\begin{array}{ll}
\sigma_i \sigma_j = \sigma_j \sigma_i & \text{for } |i - j| > 1 \\
\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} & \text{for } 1 \le i \le n-2
\end{array}
\right\rangle.$$

The proof of this is a little complicated, and usually requires reference
to topological concepts that are beyond the scope of the current
discussion, but the details can be found in Artin's original paper[24] or in
various other places, including the books by Joan Birman[25] and Vagn
Lundsgaard Hansen.[26]

[24] E Artin, Theorie der Zöpfe, Abhandlungen aus dem Mathematischen
Seminar der Universität Hamburg 4 (1925) 47–72.
[25] J S Birman, Braids, Links and Mapping Class Groups, Annals of
Mathematics Studies 82, Princeton University Press (1975), Theorem 1.8.
[26] V L Hansen, Braids and Coverings, London Mathematical Society
Student Texts 18, Cambridge University Press (1989), Section 1.4.

The keen-eyed reader might notice similarities between this presentation
for $B_n$, and the presentation for the symmetric group $S_n$ in
Proposition 5.33. All that are missing are relations of the form $\sigma_i^2 = 1$. We
can use this insight to define a surjective homomorphism $\varphi : B_n \to S_n$,
where each elementary braid $\sigma_i$ maps to the transposition $(i\ \ i{+}1)$.
Geometrically, this is equivalent to ignoring the distinction between under-
and over-crossings in the braid. Each braid $\beta \in B_n$ is mapped to a
permutation $\pi \in S_n$, and we can readily determine this by looking at
a picture of the braid: the string that begins in position $i$ at the top
of the braid ends up in position $\pi(i)$ at the bottom. In the example
shown in Figure 5.9, the permutation is
$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 3 & 2 & 1 & 5 \end{pmatrix} = (1\ 4)(2\ 3).$$
The kernel of this homomorphism consists of all braids that map to the
identity permutation. These are exactly the ones in which each string
begins and ends in the same position; they are called pure braids, and
the subgroup $PB_n = \ker \varphi$ is the $n$-string pure braid group.
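Reading off the permutation of a braid amounts to following each string through the crossings while ignoring which strand passes over. The sketch below assumes a braid is given as a word in the elementary braids, with the integer $i$ standing for $\sigma_i$ and $-i$ for $\sigma_i^{-1}$. This encoding and the function names are illustrative. The sample word on five strings is simply one word with the same permutation as the braid in Figure 5.9, not necessarily the braid drawn there.

```python
def braid_permutation(word, n):
    """Return {start position: end position} for a braid word on n strings.

    word is a sequence of nonzero integers: i stands for sigma_i and -i for
    its inverse. Over- and under-crossings are deliberately not distinguished.
    """
    strings = list(range(1, n + 1))  # strings[p - 1] = string currently at position p
    for letter in word:
        i = abs(letter)
        if not 1 <= i <= n - 1:
            raise ValueError(f"sigma_{i} is not defined on {n} strings")
        # A crossing between positions i and i+1 swaps the two strings there.
        strings[i - 1], strings[i] = strings[i], strings[i - 1]
    return {s: p for p, s in enumerate(strings, start=1)}


def is_pure(word, n):
    """A braid is pure when every string ends where it started."""
    return all(s == p for s, p in braid_permutation(word, n).items())


print(braid_permutation([1, 2, 3, 1, 2, 1], 5))
print(braid_permutation([1, 2, 1], 3) == braid_permutation([2, 1, 2], 3))
print(is_pure([1, 1], 3), is_pure([1], 3))
```

Both sides of the relation $\sigma_1\sigma_2\sigma_1 = \sigma_2\sigma_1\sigma_2$ give the same permutation, as they must, and $\sigma_1^2$ is a pure braid even though it is not trivial in $B_n$.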
Figure 5.9: Reading a permutation from a braid

This connection between the braid group $B_n$ and the symmetric group
$S_n$, which is isomorphic to the Coxeter group $\mathsf{A}_{n-1}$, yields an entire
family of groups that are related in a similar way to the other Coxeter
groups. First we introduce some notation. Let
$$\langle x\,y \rangle^m = \underbrace{xyxy\ldots}_{m \text{ terms}}$$
denote the alternating product of $x$ and $y$ with length $m$. So, for
example, $\langle x\,y \rangle^3 = xyx$ and $\langle x\,y \rangle^6 = xyxyxy$. In particular, we can
now rewrite the second type of relation in the presentation in
Proposition 5.92 as
$$\langle \sigma_i\,\sigma_{i+1} \rangle^3 = \langle \sigma_{i+1}\,\sigma_i \rangle^3.$$
The exponent 3 is the same as the label on the edge connecting vertex
$i$ and vertex $i+1$ in the $\mathsf{A}_{n-1}$ Coxeter diagram. We can also write the
other relation in this form too:
$$\sigma_i \sigma_j = \langle \sigma_i\,\sigma_j \rangle^2 = \langle \sigma_j\,\sigma_i \rangle^2 = \sigma_j \sigma_i,$$
and here the exponent 2 indicates that there is no edge connecting
vertices $i$ and $j$ if $|i - j| > 1$. We can thus write all the relations in
the presentation for $B_n$ in this way, and generalise to other Coxeter
diagrams too.
Definition 5.93 Let $D$ be a Coxeter diagram, and let $m_{ij}$ be the label
on the edge connecting distinct vertices $i$ and $j$ (where $m_{ij} = 2$ if
there is no edge). The Artin group of type $D$ is the group defined by
the presentation
$$\langle x_1, \ldots, x_n : \langle x_i\,x_j \rangle^{m_{ij}} = \langle x_j\,x_i \rangle^{m_{ji}} \text{ for } 1 \le i < j \le n \rangle.$$

These groups are not, in general, finite. Inserting additional relations
of the form $x_i^2 = 1$ determines a surjective homomorphism onto
the corresponding Coxeter group; the kernel of this homomorphism
(which is the normal closure of the elements of the form $x_i^2$) is the
pure Artin group of the appropriate type.
Example 5.94 As discussed above, the Artin group of type $\mathsf{A}_{n-1}$ is
the braid group $B_n$.

Example 5.95 The Artin group of type $\mathsf{B}_2$ has presentation
$$\langle x, y : xyxy = yxyx \rangle.$$

Example 5.96 The Artin group of type $\mathsf{D}_4$ has presentation
$$\left\langle w, x, y, z :
\begin{array}{l}
wxw = xwx,\ wyw = ywy,\ wzw = zwz \\
xy = yx,\ xz = zx,\ yz = zy
\end{array}
\right\rangle.$$
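The relations of an Artin group can be generated directly from the edge labels of its Coxeter diagram. The following is a small sketch with hypothetical function names. It assumes that generators are single characters and that the diagram is given as a dictionary from unordered pairs of generators to labels, with missing pairs taken to have label 2 as in Definition 5.93.

```python
from itertools import combinations


def alternating(x, y, m):
    """The alternating product x y x y ... with m letters."""
    return "".join(x if k % 2 == 0 else y for k in range(m))


def artin_relations(generators, labels):
    """Relations <x y>^m = <y x>^m for each pair of generators.

    labels maps frozenset({x, y}) to the edge label m; a missing pair means
    there is no edge, so m = 2 and the two generators commute.
    """
    relations = []
    for x, y in combinations(generators, 2):
        m = labels.get(frozenset((x, y)), 2)
        relations.append(f"{alternating(x, y, m)} = {alternating(y, x, m)}")
    return relations


# Type B_2: two vertices joined by an edge labelled 4.
b2 = artin_relations("xy", {frozenset("xy"): 4})

# Type D_4: a central vertex w joined to x, y and z by unlabelled edges (label 3).
d4 = artin_relations("wxyz", {frozenset(("w", g)): 3 for g in "xyz"})

print(b2)
print(d4)
```

With these inputs the function reproduces the relations listed in Examples 5.95 and 5.96. Appending the relations $x_i^2 = e$ for each generator would give the corresponding Coxeter presentation instead.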

Another generalisation of the braid group $B_n$ comes from viewing it
as the fundamental group of the configuration space of $n$ points in
the plane $\mathbb{R}^2$. Replacing the plane with the sphere $S^2$ or some other
surface yields interesting variants, which are not in general Artin
groups. We will not investigate this avenue further here, but details
can be found in the books by Birman and Hansen.

Summary

In this chapter we developed the notion of a presentation of a group,
as well as some powerful technical machinery to study groups defined
in this way. We began by examining the construction of the
infinite cyclic group $\mathbb{Z}$ from all possible powers of a given generator
$t$, and then making the finite cyclic group $\mathbb{Z}_n$ by imposing a constraint
$t^n = 1$ on this generator. In order to generalise this approach to
larger generating sets, we studied formal reduced strings or words[27]
in these generating symbols together with their formal inverses. By
concatenating words of this type (and performing any necessary
cancellations of generators with adjacent formal inverses) we obtained an
associative binary operation on the set of such finite reduced words.
The empty word $e$ acts as an identity element, and each reduced
word has a formal inverse obtained by generalising the usual result
$(g * h)^{-1} = h^{-1} * g^{-1}$. This enabled us to construct first the free group
$F_2$ with two generators,[28] and then to generalise this to the free group
on an arbitrary (possibly infinite) set of generators; the number of
generators is called the rank of the group.[29] We then discussed a
more abstract definition[30] in terms of an adjointness condition, and
proved that these two definitions are equivalent.[31] Free groups are
determined by the cardinality of their generating sets: if we have two
sets $X_1$ and $X_2$ such that a bijection exists between them, then the free
groups $F(X_1)$ and $F(X_2)$ are isomorphic.[32] In particular, $F_m \cong F_n$ if
and only if $m = n$.[33]

[27] Definition 5.3, page 139.
[28] Proposition 5.2, page 138.
[29] Definition 5.4, page 139.
[30] Definition 5.5, page 140.
[31] Proposition 5.6, page 141.
[32] Proposition 5.7, page 142.
[33] Corollary 5.8, page 143.
Next we studied the order of elements in a free group. First we
introduced the notion of a cyclically reduced word[34] and used this to
show that free groups are torsion free; that is, they have no nontrivial
elements of finite order.[35] Free groups are about as nonabelian as they
can be: two elements commute if and only if they are both powers of
the same element.[36]

[34] Definition 5.9, page 144.
[35] Proposition 5.10, page 144.
[36] Proposition 5.12, page 145.
Having defined free groups and studied some of their properties, we
then introduced the notion of a presentation,[37] which we defined in
terms of the quotient of a free group by the normal closure[38][39] of a
set of carefully-chosen words, which we called relators (equivalently
we can replace relators with defining equations called relations). If a
group can be represented in this way using a finite set of generators
and relators then we say it's finitely presented. Group presentations
satisfy a similar universal mapping property[40] to the one satisfied by
free groups.[41] Usefully, it turns out that every group can be described
by means of a suitable presentation,[42] although not necessarily by a
finite presentation.

[37] Definition 5.16, page 149.
[38] Definition 5.14, page 148.
[39] Proposition 5.15, page 148.
[40] Proposition 5.17, page 149.
[41] Definition 5.5, page 140.
[42] Proposition 5.18, page 150.
Given presentations for two groups $G = \langle X : R \rangle$ and $H = \langle Y : S \rangle$, we
can easily write down a presentation $\langle X \cup Y : R \cup S \cup [X, Y] \rangle$ for their
direct product $G \times H$.[43] Here, $[X, Y]$ denotes the set of commutators
$xyx^{-1}y^{-1}$ of generators $x \in X$ and $y \in Y$. This led us to the notions of
a free product[44] $G * H$ and an amalgamated free product[45] $G *_K H$.
In particular, the free product $F(X) * F(Y)$ is equal to the free group
$F(X \cup Y)$.[46]

[43] Proposition 5.22, page 151.
[44] Definition 5.23, page 153.
[45] Definition 5.26, page 153.
[46] Proposition 5.24, page 153.
We then looked at some presentations for groups we met in earlier
chapters: the dihedral groups $D_n$,[47] the quaternion group $Q_8$,[48] the
additive group $\mathbb{Q}$ of rational numbers,[49] and the symmetric groups
$S_n$.[50] In particular, the presentation obtained for the dihedral groups
allowed us to prove a useful classification result for groups of order
$2p$ with $p$ prime: any such group must be isomorphic to either the
cyclic group $\mathbb{Z}_{2p}$ or the dihedral group $D_p$.[51] We also introduced the
dicyclic groups $\mathrm{Dic}_n$ which in some sense generalise the quaternion
group $Q_8$.[52]

[47] Proposition 5.28, page 154.
[48] Proposition 5.30, page 157.
[49] Proposition 5.32, page 158.
[50] Proposition 5.33, page 159.
[51] Proposition 5.29, page 155.
[52] Example 5.31, page 157.
A discussion of the different presentations obtained for the isomorphic
groups $D_3$ and $S_3$ led us to formulate certain simple operations,
called Tietze transformations.[53] If two presentations are related by
a finite sequence of these transformations, they determine isomorphic
groups;[54] furthermore any presentation for a given group can
be turned into any other presentation for the same group by a finite
sequence of these operations.[55] In practice it is not always easy to find
such a sequence: it reduces to the word problem, which is in general
unsolvable.

[53] Definition 5.36, page 162.
[54] Proposition 5.35, page 161.
[55] Proposition 5.41, page 165.
We then investigated the abelian case, obtaining concrete[56] and
abstract[57] definitions for the notion of a free abelian group. Such
groups (at least in the finite rank case) we then found to be isomorphic
to direct products $\mathbb{Z}^n$ of copies of the group of integers.[58] We then
proved a result of Richard Dedekind (1831–1916) that says that any
subgroup of a free abelian group of rank $r$ must also be a free abelian
group of rank at most $r$.[59]

[56] Definition 5.42, page 167.
[57] Definition 5.43, page 167.
[58] Proposition 5.44, page 168.
[59] Theorem 5.46, page 169.
We found that any finitely generated abelian group can be expressed
in a canonical form, called the invariant factor decomposition, which
consists of a direct sum of finite and infinite cyclic groups.⁶⁰ ⁶¹ In
particular, this means that any finite abelian group can be expressed
as a direct sum of finite cyclic groups, and any finitely generated
torsion-free abelian group must be isomorphic to a free abelian group
$FA_r \cong \mathbb{Z}^r$.
⁶⁰ Definition 5.47, page 171.
⁶¹ Theorem 5.48, page 171.
In practice, we can take any presentation for a finitely generated
abelian group, write down the corresponding coefficient matrix and
use a finite sequence of elementary row and column operations⁶²
to convert it into Smith normal form.⁶³ Having done this, we can
immediately read off the invariant factor decomposition.⁶⁴ The process
is very similar to the usual linear algebraic method for solving systems
of simultaneous linear equations.
⁶² Definition 5.57, page 177.
⁶³ Proposition 5.58, page 177.
⁶⁴ Algorithm 5.59, page 177.
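As a small illustration of this procedure, here is a worked reduction for a hypothetical presentation that does not appear in the text: the abelian group generated by $x$ and $y$ subject to the relations $2x + 4y = 0$ and $6x + 8y = 0$, written additively. The particular row and column operations chosen are just one possible route to the normal form.

\[
\begin{pmatrix} 2 & 4 \\ 6 & 8 \end{pmatrix}
\xrightarrow{R_2 \to R_2 - 3R_1}
\begin{pmatrix} 2 & 4 \\ 0 & -4 \end{pmatrix}
\xrightarrow{C_2 \to C_2 - 2C_1}
\begin{pmatrix} 2 & 0 \\ 0 & -4 \end{pmatrix}
\xrightarrow{R_2 \to -R_2}
\begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}
\]

The diagonal entries satisfy $2 \mid 4$, so the matrix is in Smith normal form and the invariant factor decomposition of this group is $\mathbb{Z}_2 \oplus \mathbb{Z}_4$. As a sanity check, the order $2 \cdot 4 = 8$ agrees with the absolute value of the determinant of the original coefficient matrix.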
Any finitely generated abelian group can also be canonically decomposed
in a different way to give the primary decomposition.⁶⁵ ⁶⁶
⁶⁵ Definition 5.62, page 180.
⁶⁶ Theorem 5.63, page 180.

Presentations are often a useful way of describing a group in a compact
form, but as noted earlier their usefulness is limited by certain
technical constraints (such as the word problem). We can't, for example,
immediately tell whether a given presentation determines an
infinite group, or if finite, what its order is. In the case of a finite
group, however, the Todd–Coxeter algorithm can give us the order
of the group, a permutation representation and a Cayley graph.⁶⁷ If
applied to an infinite group, the algorithm won't terminate. (This
presents us with a potential logistical problem: there is no way of
telling whether the algorithm hasn't terminated because the group
is infinite, or just that it hasn't terminated yet because the group is
finite but very large.) A slightly modified form of the Todd–Coxeter
algorithm can be used to find the index $|G : H|$ of a subgroup $H \leq G$
and also a transversal:⁶⁸ a set of group elements, one from each right
coset of $H$ in $G$.
⁶⁷ Definition 5.66, page 182.
⁶⁸ Definition 5.71, page 187.
The transversals obtained by the Todd–Coxeter algorithm have a special
property: they're prefix-closed. We call transversals satisfying this
property Schreier transversals,⁶⁹ and it turns out that with the aid of
the shortlex well-ordering relation⁷⁰ we can show that any subgroup
$H$ of a free group $F$ has a transversal of this type.⁷¹
⁶⁹ Definition 5.73, page 188.
⁷⁰ Definition 5.76, page 190.
⁷¹ Proposition 5.77, page 190.
We can use this machinery to prove the Nielsen–Schreier Theorem,⁷²
which states that a subgroup $H$ of a free group $F$ is itself free; moreover
if the rank $r = \mathrm{rank}(F)$ and the index $s = |F : H|$ are both finite, then
$\mathrm{rank}(H)$ is also finite and equal to $(r - 1)s + 1$. It is a nonabelian
analogue of Dedekind's Theorem,⁷³ which makes a similar statement
about subgroups of free abelian groups.
⁷² Theorem 5.11, page 145.
⁷³ Theorem 5.46, page 169.
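To see how the rank formula behaves, here is a quick numerical instance; the values $r = 2$ and $s = 3$ are chosen purely for illustration. For a subgroup $H$ of index $3$ in the free group $F_2$ of rank $2$, the formula gives

\[
\mathrm{rank}(H) = (r - 1)s + 1 = (2 - 1) \cdot 3 + 1 = 4 .
\]

So, unlike the abelian case where the rank of a subgroup is at most $r$, a finite-index subgroup of a nonabelian free group can have strictly larger rank than the group containing it.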
The techniques used in the proof of the Nielsen–Schreier Theorem also
yield a general method for calculating presentations of a subgroup
$H$ of a given finitely presented group $G$. We use the Todd–Coxeter
algorithm to obtain a Schreier transversal $U$, which we then use to
construct generators and relators for $H$.

References and further reading


D L Johnson, Presentations of Groups, second edition, London Mathematical Society Student Texts 15,
Cambridge University Press (1997)
A good textbook on combinatorial group theory, aimed primarily at graduate students and final-year
undergraduates, which goes into more detail and covers several additional topics.
S Roberts, King of Infinite Space, Profile Books (2007)
An excellent biography of H S M Coxeter, containing many details of his life and career, as well as
his numerous contributions to algebra and geometry.

J E Humphreys, Reflection Groups and Coxeter Groups, Cambridge Studies in Advanced Mathematics
29, Cambridge University Press (1990)
A readable textbook on Coxeter groups, suitable for graduate students and final-year undergraduates.

Exercises
5.1 As in Exercise 1.23, let $S = \mathbb{R} \setminus \{-1, 0, 1\}$, let $f : S \to S$ such that
$f(x) = \frac{1+x}{1-x}$, and denote by $f^n$ the $n$-fold composite of $f$. Furthermore, let
$g : S \to S$ with $g(x) = -x$. Show that $\langle f, g \rangle \cong D_n$ for some $n$.
5.2 Write down a presentation for the group ZZZ.
5.3 Show that $|\mathrm{Dic}_n| = 4n$ by a similar argument to that in the proof of Proposition 5.30.
5.4 Use a sequence of Tietze transformations to transform the presentation
$\langle x, y : xyx = yxy \rangle$
into the presentation
$\langle a, b : a^3 = b^2 \rangle$.
5.5 List the invariant factor decompositions of all abelian groups of order 60.
5.6 List all the words of length 3 in the free group $F_2$ and put them in shortlex order.
5.7 List the words of length at most 2 in the free group $F_3 = \langle x, y, z : \ \rangle$ and put them in shortlex
order.
5.8 Let $<$ be the shortlex ordering on some free group $F$ with basis $X$. Suppose that $v, w \in F$ and
$x \in X$ such that $v < w$. Show that $vx < wx$ and $xv < xw$.
5.9 Repeat Example 5.80 using the other Schreier transversal $U = \{e, x, y, x^2, xy, x^2y\}$ from
Example 5.79.
6 Actions

The Earth is full of anger,
The seas are dark with wrath,
The Nations in their harness
Go up against our path:
Ere yet we loose the legions
Ere yet we draw the blade,
Jehovah of the Thunders,
Lord God of Battles, aid!
Rudyard Kipling (1865–1936), Hymn Before Action (1896)

Many of the groups we've studied so far have consisted of operations
acting on a set. The dihedral and other symmetry
groups, for example, are concerned with certain transformations
which map a particular geometric shape to itself. Matrix groups
such as $GL_n(\mathbb{R})$ or $SU_n$ can be regarded as groups of invertible linear
transformations defined on some real or complex vector space. And
permutation groups such as $S_n$ and $A_n$ are defined precisely in terms
of particular operations on some set of objects.
In this chapter, we will investigate this approach in more detail, formu-
lating a rigorous definition of what it means for a group to act on a set,
studying what it means for an element of a group to fix one or more
elements of the chosen set in place, and examining some powerful
applications to combinatorics.

6.1 Symmetries and transformations

The chief forms of beauty are order and symmetry and definiteness, which the
mathematical sciences demonstrate in a special degree.
Aristotle (384–322 BC), Metaphysics XIII:3

Recall that in Chapter 1 we met a class of groups consisting of
symmetry operations defined on some geometric object.
The dihedral groups $D_n$, for example, consist of rotations and reflections
of a regular $n$-sided polygon. More precisely, we define a subset
$P_n$ of $\mathbb{R}^2$ consisting of the regular polygon with vertices at the points
$\left( \cos\frac{(2k-1)\pi}{n}, \sin\frac{(2k-1)\pi}{n} \right)$ for $1 \leq k \leq n$.
The reflection $m_1$ determines a particular bijection $P_n \to P_n$ which
leaves every point along the line $y = x \tan\frac{\pi}{n}$ fixed where it is, and
swaps every other point with its mirror image in that line. Similarly,
the rotation $r$ fixes the origin and cycles all other points among themselves
in groups of $n$, and thus also determines a bijection $P_n \to P_n$.

[Figure 6.1: Axes of symmetry of the square $P_4$]
[Figure 6.2: Axes of symmetry of the pentagon $P_5$]

So, what we have here is a well-defined way of combining an element
of $D_n$ together with a point in $P_n$ to get another point in $P_n$; that is, a
function $D_n \times P_n \to P_n$. (Another way of looking at it is that we have a
well-defined way of interpreting elements of $D_n$ as bijections $P_n \to P_n$,
and we'll compare these two viewpoints in a little while.) In particular,
the identity element $e$ leaves every point where it is. Also, for any two
elements $g, h \in D_n$, applying $h$ to a point $x \in P_n$ and then applying $g$
to the result has the same effect as if we'd applied the transformation
corresponding to $g * h$ to $x$.
Writing this down formally, we have a set $P_n$ together with a function
$\alpha : D_n \times P_n \to P_n$ such that
$\alpha(e, x) = x$,
$\alpha(g, \alpha(h, x)) = \alpha(g * h, x)$.
Now let $A$ be an invertible $n \times n$ matrix with real entries. We know
from linear algebra that such a matrix can be regarded as representing
a linear transformation $f_A : \mathbb{R}^n \to \mathbb{R}^n$ (relative to some basis for $\mathbb{R}^n$).
The invertibility of $A$ ensures this linear transformation is bijective.
In particular, the identity matrix leaves every point in $\mathbb{R}^n$ fixed, and we
know that $(AB)v = A(Bv)$ for any two such invertible $n \times n$ matrices
$A$ and $B$, and any vector $v \in \mathbb{R}^n$. What we have, then, is a function
$GL_n(\mathbb{R}) \times \mathbb{R}^n \to \mathbb{R}^n$.
(The details of this function depend on our choice of basis for $\mathbb{R}^n$.)
As a third example, recall that the symmetric group $S_n$ was defined¹ as
the group of permutations acting on a set $X_n = \{1, \ldots, n\}$. Again, we
have an identity permutation which leaves $X_n$ unchanged, and for any
two permutations $\sigma, \tau \in S_n$ and element $x \in X_n$ we have
$(\sigma\tau)(x) = \sigma(\tau(x))$. This, therefore, also determines a function
$S_n \times X_n \to X_n$.
¹ Definition 1.51, page 24.
With this in mind, we present the following definition:
Definition 6.1 Let $G$ be a group, and let $X$ be a set. An action of $G$
on $X$, or a $G$-action on $X$, is a function
$\alpha : G \times X \to X$
satisfying the properties
$\alpha(e, x) = x$, (6.1)
$\alpha(g, \alpha(h, x)) = \alpha(g * h, x)$ (6.2)
for all $x \in X$ and $g, h \in G$. The set $X$ is then said to be a $G$-set.
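The two conditions in this definition can be checked by brute force whenever the group and the set are finite. The following is a minimal sketch, assuming permutations of $\{1, 2, 3\}$ stored as tuples; the helper names `compose`, `act` and `is_left_action` are our own and not part of the text.

```python
from itertools import permutations

# A permutation p of {1, ..., n} is stored as a tuple: p[i - 1] is the image of i.
def compose(p, q):
    """Return the product p * q, meaning 'apply q first, then p'."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def act(p, x):
    """Candidate action: apply the permutation p to the point x."""
    return p[x - 1]

def is_left_action(group, identity, points, mul, alpha):
    """Check conditions (6.1) and (6.2) for every group element and point."""
    identity_ok = all(alpha(identity, x) == x for x in points)
    compatible = all(
        alpha(g, alpha(h, x)) == alpha(mul(g, h), x)
        for g in group for h in group for x in points
    )
    return identity_ok and compatible

S3 = list(permutations((1, 2, 3)))
X3 = (1, 2, 3)
print(is_left_action(S3, (1, 2, 3), X3, compose, act))  # True
```

An exhaustive check like this is only practical for small finite examples, but it makes the role of each condition concrete.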
As with the binary operations we met early in Chapter 1,² in many
cases it will be simpler and clearer if we adopt a slightly different
notation. Instead of writing $\alpha(g, x)$, then, we'll often write $g \cdot x$.
In this form, conditions (6.1) and (6.2) become much clearer:
$e \cdot x = x$, (6.3)
$g \cdot (h \cdot x) = (g * h) \cdot x$. (6.4)
² Definition 1.1, page 2.
The first says that the identity $e$ must act as the identity operation on
$X$, while the second is an associativity condition.
What we've just defined here is more properly called a left action of
$G$ on $X$. There is a corresponding notion of a right action, which is a
function $\beta : X \times G \to X$ satisfying
$\beta(x, e) = x$, (6.5)
$\beta(\beta(x, g), h) = \beta(x, g * h)$, (6.6)
or, alternatively,
$x \cdot e = x$, (6.7)
$(x \cdot g) \cdot h = x \cdot (g * h)$, (6.8)
for all $x \in X$ and $g, h \in G$.
It so happens that there is a bijective correspondence between left and right
$G$-actions on a given set $X$, in the sense that for any left action
$\alpha : G \times X \to X$ there is a unique right action $\beta : X \times G \to X$ and vice versa.
This correspondence doesn't work in quite the way we might expect,
however. The tempting but naïve approach is to take a left action
$\alpha : G \times X \to X$ and define a right action $\beta : X \times G \to X$ by setting
$\beta(x, g) = \alpha(g, x)$ for all $g \in G$ and $x \in X$.
But this doesn't quite work. It certainly satisfies condition (6.7) since
$\beta(x, e) = \alpha(e, x) = x$.
Condition (6.8), however, fails in general when $G$ is nonabelian:
$\beta(\beta(x, g), h) = \alpha(h, \alpha(g, x)) = \alpha(h * g, x) = \beta(x, h * g) \neq \beta(x, g * h)$.
We can make this work, however, if instead of defining $\beta(x, g) = \alpha(g, x)$
we define $\beta(x, g) = \alpha(g^{-1}, x)$ for all $g \in G$ and $x \in X$. Here,
$\beta(x, e) = \alpha(e^{-1}, x) = \alpha(e, x) = x$
as before, and this time
$\beta(\beta(x, g), h) = \alpha(h^{-1}, \alpha(g^{-1}, x)) = \alpha(h^{-1} * g^{-1}, x) = \alpha((g * h)^{-1}, x) = \beta(x, g * h)$.
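The difference between the naive conversion and the one using inverses can be seen concretely in a small nonabelian group. The sketch below is illustrative only: it takes the natural left action of $S_3$ on $\{1, 2, 3\}$, and the names `beta_naive`, `beta` and `is_right_action` are hypothetical helpers introduced here.

```python
from itertools import permutations

def compose(p, q):
    """Product p * q of permutation tuples: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    """Inverse permutation: if p sends i to j, the inverse sends j to i."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

def alpha(g, x):
    """The left action of a permutation g on a point x."""
    return g[x - 1]

def beta_naive(x, g):
    """Naive attempt at a right action: just swap the arguments."""
    return alpha(g, x)

def beta(x, g):
    """Right action obtained by acting with the inverse element."""
    return alpha(inverse(g), x)

def is_right_action(group, identity, points, mul, b):
    """Check the two right-action conditions exhaustively."""
    return all(b(x, identity) == x for x in points) and all(
        b(b(x, g), h) == b(x, mul(g, h))
        for g in group for h in group for x in points
    )

S3 = list(permutations((1, 2, 3)))
E = (1, 2, 3)    # identity permutation
X3 = (1, 2, 3)   # the set being acted on
print(is_right_action(S3, E, X3, compose, beta_naive))  # False
print(is_right_action(S3, E, X3, compose, beta))        # True
```

For an abelian group the naive version would pass as well, which is why a nonabelian example is needed to see the failure.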
In the three motivating examples earlier, we interpreted the action
of $D_n$ on $P_n$, the action of $GL_n(\mathbb{R})$ on $\mathbb{R}^n$ and the action of $S_n$ on
$X_n$ as functions of the form $G \times X \to X$ satisfying the properties in
Definition 6.1. But in all three cases we noted that the action of a
particular group element on the set determined a bijective symmetry
map, invertible linear transformation or a permutation of that set.
So we could also regard those group actions as ways of assigning
a permutation of $X$ to each element of the group $G$. This is true in
general, as the following proposition shows.
Proposition 6.2 Let $\alpha : G \times X \to X$, defined by $(g, x) \mapsto g \cdot x$ for $g \in G$
and $x \in X$, be a left action of a group $G$ on a set $X$.
Then $\alpha$ determines a unique homomorphism $\phi : G \to \mathrm{Sym}(X)$, given by
mapping an element $g \in G$ to the function $f_g : X \to X$ where
$f_g(x) = \alpha(g, x) = g \cdot x$ for all $x \in X$.

Proof Property (6.3) tells us that $f_e(x) = e \cdot x = x$ for all $x \in X$, so $f_e$
is the identity map $\mathrm{id}_X$.
Property (6.4) tells us that
$f_e(x) = f_{g * g^{-1}}(x) = (g * g^{-1}) \cdot x = g \cdot (g^{-1} \cdot x) = f_g(f_{g^{-1}}(x))$
and also that
$f_e(x) = f_{g^{-1} * g}(x) = (g^{-1} * g) \cdot x = g^{-1} \cdot (g \cdot x) = f_{g^{-1}}(f_g(x))$.
This tells us that $f_g \circ f_{g^{-1}} = f_e = f_{g^{-1}} \circ f_g$, so $f_g$ must be a bijection for
all $g \in G$, and hence $f_g \in \mathrm{Sym}(X)$. So the function $\phi : G \to \mathrm{Sym}(X)$
given by $g \mapsto f_g$ is certainly well-defined, and we just need to show
that it's also a homomorphism.
To see this, let $g, h \in G$. Then
$\phi(g * h)(x) = f_{g * h}(x) = (g * h) \cdot x = g \cdot (h \cdot x) = f_g(f_h(x))$
$= (f_g \circ f_h)(x) = (\phi(g) \circ \phi(h))(x)$
as required.
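The construction in this proof can be traced through on a small example. The sketch below uses a hypothetical action not taken from the text, namely $\mathbb{Z}_6$ acting on $\{0, 1, 2\}$ by $g \cdot x = (x + g) \bmod 3$, and records each $f_g$ as a tuple of images; the names `act`, `phi` and `compose` are our own.

```python
G = range(6)    # the cyclic group Z_6 under addition modulo 6
X = (0, 1, 2)   # the set being acted on

def act(g, x):
    """Assumed example action: g . x = (x + g) mod 3."""
    return (x + g) % 3

def phi(g):
    """The map f_g, recorded as the tuple (f_g(0), f_g(1), f_g(2))."""
    return tuple(act(g, x) for x in X)

def compose(p, q):
    """Composite of two maps X -> X stored as tuples: apply q first, then p."""
    return tuple(p[q[x]] for x in X)

# Each f_g is a bijection of X, so phi(g) really lies in Sym(X).
all_bijections = all(sorted(phi(g)) == list(X) for g in G)

# phi(g + h) agrees with phi(g) composed with phi(h) for every pair g, h.
is_homomorphism = all(
    phi((g + h) % 6) == compose(phi(g), phi(h)) for g in G for h in G
)
print(all_bijections, is_homomorphism)  # True True
```

Note that this particular homomorphism is not injective: the elements 0 and 3 of $\mathbb{Z}_6$ both give the identity map on $X$.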
Similarly, any homomorphism $\phi : G \to \mathrm{Sym}(X)$ determines a unique
left action of $G$ on $X$, and since left $G$-actions on $X$ are in bijective
correspondence with right $G$-actions on $X$, any such homomorphism
determines a unique right action too.
So $G$-actions on a set $X$ are essentially the same as homomorphisms
$G \to \mathrm{Sym}(X)$. We studied homomorphisms quite extensively in
Chapter 4, so by applying some of the things we learned, we should
be able to get some important insights into how group actions work.
Example 6.3 By examining how the elements of $D_4$ permute the
vertices of the square $P_4$, as shown in Figure 6.3, we can obtain
an action of $D_4$ on the set $X_4 = \{1, 2, 3, 4\}$. This determines a
homomorphism $\phi : D_4 \to S_4$ such that
$e \mapsto \iota$, $r \mapsto (1\ 2\ 3\ 4)$,
$r^2 \mapsto (1\ 3)(2\ 4)$, $r^3 \mapsto (1\ 4\ 3\ 2)$,
$m_1 \mapsto (2\ 4)$, $m_2 \mapsto (1\ 2)(3\ 4)$,
$m_3 \mapsto (1\ 3)$, $m_4 \mapsto (1\ 4)(2\ 3)$.
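As a consistency check on this list, one can confirm that the eight permutations are exactly the subgroup of $S_4$ generated by the images of $r$ and $m_1$. The following sketch assumes the tuple encoding described in the comments; `generate` is a simple closure routine written for this illustration, not an algorithm from the text.

```python
def compose(p, q):
    """Product p * q of permutation tuples: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def generate(gens):
    """Close a set of permutations under multiplication by the generators."""
    identity = tuple(range(1, len(gens[0]) + 1))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

# The eight images listed in Example 6.3; p[i - 1] is the image of vertex i.
images = {
    "e":  (1, 2, 3, 4),
    "r":  (2, 3, 4, 1),   # (1 2 3 4)
    "r2": (3, 4, 1, 2),   # (1 3)(2 4)
    "r3": (4, 1, 2, 3),   # (1 4 3 2)
    "m1": (1, 4, 3, 2),   # (2 4)
    "m2": (2, 1, 4, 3),   # (1 2)(3 4)
    "m3": (3, 2, 1, 4),   # (1 3)
    "m4": (4, 3, 2, 1),   # (1 4)(2 3)
}

subgroup = generate([images["r"], images["m1"]])
print(len(subgroup))                     # 8
print(subgroup == set(images.values()))  # True
```

The eight images are pairwise distinct, so different elements of $D_4$ permute the vertices in different ways.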

[Figure 6.3: Vertices of the square $P_4$]

The dihedral group $D_n$ acts by rotation and reflection operations on
a regular $n$-sided polygon $P_n$, but as we noted at the beginning of
Section 1.4 we can also view it as acting on the subset consisting of
just the $n$ vertices of $P_n$. We'll come back to this idea of a group acting
on a subset in the next section, when we consider orbits and stabilisers.
But by observing just the behaviour of the $n$ vertices of $P_n$ under the
action of $D_n$, we can define a homomorphism $\phi : D_n \to S_n$.³
³ To be really precise about this, we first have to number the vertices by
defining a bijection between the $n$ vertices and the set $\{1, 2, \ldots, n\}$. There are $n!$
ways of doing this, and the resulting homomorphism $\phi : D_n \to S_n$ will
depend on exactly which choice we make, but not to any really important degree:
any two such homomorphisms $\phi_1$ and $\phi_2$ will differ only by composition with
some automorphism $f : S_n \to S_n$, in the sense that $\phi_2 = f \circ \phi_1$.
In particular, it's worth thinking about what happens when the homomorphism
$\phi : G \to \mathrm{Sym}(X)$ is injective. If this is the case then any two
distinct elements $g_1, g_2 \in G$ map to different permutations in $\mathrm{Sym}(X)$.
This means that there exists at least one $x \in X$ for which $g_1 \cdot x \neq g_2 \cdot x$.
In other words, each $g \in G$ acts differently on the elements of $X$.
If $\phi$ isn't injective, then we'll be able to find two distinct elements
$g_1, g_2 \in G$ which act in the same way on all elements of the set
$X$. Specifically, there will be a nontrivial element $g \in G$ which acts
trivially on $X$; that is, $g \cdot x = x$ for all $x \in X$.
If $\phi : G \to \mathrm{Sym}(X)$ is injective, each element of $G$ permutes the
elements of $X$ in a different way. We give a special name to such actions:
Definition 6.4 Let a group $G$ act on a set $X$ in such a way that the
corresponding homomorphism $\phi : G \to \mathrm{Sym}(X)$ is injective. Then
we say that this action is faithful.
Equivalently, a $G$-action on $X$ is faithful if, for any distinct elements
$g_1, g_2 \in G$, there exists at least one $x \in X$ for which $g_1 \cdot x \neq g_2 \cdot x$.
Proposition 4.35 says that injective homomorphisms are exactly those
that have trivial kernel. So in the context of group actions the kernel
of the homomorphism $\phi : G \to \mathrm{Sym}(X)$ consists of those elements of
$G$ that are mapped to the identity permutation $\iota \in \mathrm{Sym}(X)$ and hence
act trivially on $X$. In other words, a faithful action is one for which
only the identity $e \in G$ acts trivially on the set $X$.
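For a finite group acting on a finite set, the kernel criterion gives a direct way to test faithfulness. Here is a minimal sketch under assumed data: the cyclic groups $\mathbb{Z}_6$ and $\mathbb{Z}_3$ acting on $\{0, 1, 2\}$ by rotation. The function names are hypothetical, and the faithfulness test relies on the identity always lying in the kernel.

```python
def kernel(group, points, act):
    """Elements of the group that fix every point, i.e. act trivially."""
    return [g for g in group if all(act(g, x) == x for x in points)]

def is_faithful(group, points, act):
    """The identity always acts trivially, so faithful means the kernel has size 1."""
    return len(kernel(group, points, act)) == 1

def rotate(g, x):
    """Assumed example action on {0, 1, 2}: g . x = (x + g) mod 3."""
    return (x + g) % 3

print(kernel(range(6), range(3), rotate))       # [0, 3]
print(is_faithful(range(6), range(3), rotate))  # False
print(is_faithful(range(3), range(3), rotate))  # True
```

Here $\mathbb{Z}_6$ acts unfaithfully because the element 3 rotates every point back to itself, while the same formula restricted to $\mathbb{Z}_3$ gives a faithful action.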
Example 6.5 Perhaps the most obvious group action is that of the
full symmetric group $\mathrm{Sym}(X)$ on a set $X$. This action corresponds to
the identity homomorphism $\mathrm{id} : \mathrm{Sym}(X) \to \mathrm{Sym}(X)$, and since this
is injective the action is faithful.

The next most obvious group action is the one where nothing happens:
Example 6.6 For any group $G$ and any set $X$ we can define the
trivial action of $G$ on $X$ by setting $g \cdot x = x$ for all $g \in G$ and $x \in X$.
This certainly satisfies conditions (6.3) and (6.4) and is hence a valid
example of a group action. From the alternative point of view of a
homomorphism $G \to \mathrm{Sym}(X)$ this is exactly the homomorphism that
maps every element $g \in G$ to the identity permutation $\iota \in \mathrm{Sym}(X)$.
This action is not faithful in general, except when the group $G$ is
trivial.
Suppose $G$ is a subgroup of the symmetric group $\mathrm{Sym}(X)$ for some
set $X$. Then each element of $G$ is still a permutation of $X$, so we can
construct an action of $G$ on $X$ by using the inclusion homomorphism
$i : G \hookrightarrow \mathrm{Sym}(X)$. What does the corresponding action look like?
Well, it's exactly what we'd expect it to be: an element $g \in G$ is a
permutation of $X$; that is, a bijection $g : X \to X$. So we can define
$g \cdot x = g(x)$ for all $x \in X$ and get a $G$-action on $X$ that satisfies the
conditions (6.3) and (6.4).
The inclusion homomorphism $i : G \hookrightarrow \mathrm{Sym}(X)$ is injective and hence
this action is faithful.
Example 6.7 The alternating group $A_n$ is a subgroup of the finite
symmetric group $S_n$, and so there is an inclusion homomorphism
$i : A_n \hookrightarrow S_n$ via which $A_n$ acts on the finite set $X_n = \{1, \ldots, n\}$ in the
obvious way.

We can take this viewpoint a little further. Suppose that a group $G$ acts
on some set $X$, and suppose also that $H$ is some subgroup of $G$. Then
we can define an action of $H$ on $X$ in a straightforward way: every
element of $H$ is also an element of $G$ after all, so for $h \in H$ and $x \in X$
we can just define $h \cdot x$ to be whatever it is in the action of $G$ on $X$.
Another way of looking at this is to define the homomorphism
$\psi : H \to \mathrm{Sym}(X)$ by composing the inclusion homomorphism $i : H \hookrightarrow G$ with
the action homomorphism $\phi : G \to \mathrm{Sym}(X)$.
Example 6.8 A concrete example of this is given by the inclusion
of the special orthogonal group $SO_n(\mathbb{R})$ in the general linear group
$GL_n(\mathbb{R})$. Once we've chosen a basis for $\mathbb{R}^n$, that gives us an action
of $GL_n(\mathbb{R})$ on $\mathbb{R}^n$: a well-defined method of combining an invertible
$n \times n$ real matrix with an $n$-component real vector to get another
$n$-component real vector, in such a way that the usual $\mathbb{R}$-linearity
conditions are satisfied. Whatever other properties (unit determinant,
orthogonality) elements of $SO_n(\mathbb{R})$ might have, they are still $n \times n$
invertible real matrices, so we can define an $SO_n(\mathbb{R})$-action on $\mathbb{R}^n$
by reusing the $GL_n(\mathbb{R})$-action we already had.

By Cayley's Theorem,⁴ any group $G$ can be regarded as a subgroup of
the symmetric group $\mathrm{Sym}(G)$ of permutations of its underlying set.
More precisely, we construct an isomorphism $f : G \to H \leq \mathrm{Sym}(G)$
by mapping each element $g \in G$ to the bijection $\lambda_g : G \to G$ defined
by $\lambda_g(h) = g * h$ for all $h \in G$. This yields an isomorphism
$f : G \to H = \{\lambda_g : g \in G\} \leq \mathrm{Sym}(G)$.
⁴ Theorem 2.13, page 47.
We can compose this isomorphism $f$ with the inclusion homomorphism
$i : H \hookrightarrow \mathrm{Sym}(G)$ to get the required action homomorphism
$\phi = i \circ f : G \to \mathrm{Sym}(G)$.
We can do this for any group $G$, finite or infinite, to obtain an action
of $G$ on its own underlying set:
Definition 6.9 For any group $G$ we can define the left regular action
of $G$ on itself by defining
$\lambda(g, h) = g \cdot h = g * h$
for all $g, h \in G$. This action is faithful: $g * h = h$ if and only if $g = e$,
by the left cancellation laws (1.4).
Similarly, we can define the right regular action of $G$ on itself by
$\rho(h, g) = h \cdot g = h * g$
for all $g, h \in G$. This action is also faithful.

We can use this process to write down another $D_4$-action:

Example 6.10 Applying Cayley's Theorem to the dihedral group $D_4$
(see Table 6.1) gives us an isomorphism $f : D_4 \cong G \leq \mathrm{Sym}(D_4) \cong S_8$,
such that
$e \mapsto \iota$, $r \mapsto (e\ r\ r^2\ r^3)(m_1\ m_4\ m_3\ m_2)$,
$r^2 \mapsto (e\ r^2)(r\ r^3)(m_1\ m_3)(m_2\ m_4)$, $r^3 \mapsto (e\ r^3\ r^2\ r)(m_1\ m_2\ m_3\ m_4)$,
$m_1 \mapsto (e\ m_1)(r\ m_2)(r^2\ m_3)(r^3\ m_4)$, $m_2 \mapsto (e\ m_2)(r\ m_3)(r^2\ m_4)(r^3\ m_1)$,
$m_3 \mapsto (e\ m_3)(r\ m_4)(r^2\ m_1)(r^3\ m_2)$, $m_4 \mapsto (e\ m_4)(r\ m_1)(r^2\ m_2)(r^3\ m_3)$.

Table 6.1: The multiplication table for the dihedral group $D_4$
 *  | e   r   r²  r³  m₁  m₂  m₃  m₄
----+--------------------------------
 e  | e   r   r²  r³  m₁  m₂  m₃  m₄
 r  | r   r²  r³  e   m₄  m₁  m₂  m₃
 r² | r²  r³  e   r   m₃  m₄  m₁  m₂
 r³ | r³  e   r   r²  m₂  m₃  m₄  m₁
 m₁ | m₁  m₂  m₃  m₄  e   r   r²  r³
 m₂ | m₂  m₃  m₄  m₁  r³  e   r   r²
 m₃ | m₃  m₄  m₁  m₂  r²  r³  e   r
 m₄ | m₄  m₁  m₂  m₃  r   r²  r³  e

Composing this isomorphism $f$ with the inclusion homomorphism
$i : G \hookrightarrow \mathrm{Sym}(D_4)$ gives us an action of $D_4$ on its underlying set
$D_4 = \{e, r, r^2, r^3, m_1, m_2, m_3, m_4\}$. This action is faithful: $f$ is an
isomorphism and $i$ is injective so their composite $i \circ f : D_4 \to \mathrm{Sym}(D_4)$
must be injective too.
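The cycle decompositions in this example can be read off mechanically from Table 6.1. The sketch below transcribes the table as data, with the entry in row $g$ and column $h$ taken to be $g * h$ as in the left regular action; the string encoding of the elements and the helper `cycles` are our own choices.

```python
names = ["e", "r", "r2", "r3", "m1", "m2", "m3", "m4"]

# Rows of Table 6.1: the row for g lists g * h as h runs over the columns.
rows = {
    "e":  "e r r2 r3 m1 m2 m3 m4",
    "r":  "r r2 r3 e m4 m1 m2 m3",
    "r2": "r2 r3 e r m3 m4 m1 m2",
    "r3": "r3 e r r2 m2 m3 m4 m1",
    "m1": "m1 m2 m3 m4 e r r2 r3",
    "m2": "m2 m3 m4 m1 r3 e r r2",
    "m3": "m3 m4 m1 m2 r2 r3 e r",
    "m4": "m4 m1 m2 m3 r r2 r3 e",
}
table = {g: dict(zip(names, rows[g].split())) for g in names}  # table[g][h] = g * h

def cycles(g):
    """Cycle decomposition of the map h -> g * h, omitting fixed points."""
    seen, result = set(), []
    for start in names:
        cycle, h = [], start
        while h not in seen:
            seen.add(h)
            cycle.append(h)
            h = table[g][h]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

print(cycles("r"))   # [('e', 'r', 'r2', 'r3'), ('m1', 'm4', 'm3', 'm2')]
print(cycles("m2"))  # [('e', 'm2'), ('r', 'm3'), ('r2', 'm4'), ('r3', 'm1')]

# Faithfulness: no two elements give the same permutation of the underlying set.
distinct = len({tuple(table[g][h] for h in names) for g in names}) == len(names)
print(distinct)  # True
```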

Cayley's Theorem gives a faithful action of a group on its underlying
set. The group conjugation operation from Chapter 3 also satisfies (6.3)
and (6.4), and yields another important action of a group on itself.
Definition 6.11 Let $G$ be a (finite or infinite) group. Then the conjugation
action of $G$ on itself is defined by
$h^g = g \cdot h = g * h * g^{-1}$
for all $g, h \in G$. This action is not in general faithful.

Let's look at a simple example first.

Example 6.12 The conjugation action of $\mathbb{Z}_3$ on itself (see Table 6.2)
yields the homomorphism $\mathbb{Z}_3 \to \mathrm{Sym}(\mathbb{Z}_3) \cong S_3$ given by
$0 \mapsto \iota$, $1 \mapsto \iota$, $2 \mapsto \iota$.
This is the trivial action on $\mathbb{Z}_3$, and is obviously not faithful.

Table 6.2: Conjugation in $\mathbb{Z}_3$
   | 0  1  2
---+---------
 0 | 0  0  0
 1 | 1  1  1
 2 | 2  2  2

In general, any abelian group acts trivially on itself by conjugation:

Proposition 6.13 Let $G$ be an abelian group. Then the conjugation action
on $G$ is trivial.

Proof Let $g, h \in G$. Then $h^g = g * h * g^{-1} = g * g^{-1} * h = e * h = h$.
Therefore the conjugation action is trivial.
Example 6.14 For the conjugation action of $D_4$ on itself (see Table 6.3),
writing the corresponding permutations in $\mathrm{Sym}(D_4)$
in cycle notation, as in Example 6.10, we obtain
$e \mapsto \iota$, $r \mapsto (m_1\ m_3)(m_2\ m_4)$,
$r^2 \mapsto \iota$, $r^3 \mapsto (m_1\ m_3)(m_2\ m_4)$,
$m_1 \mapsto (r\ r^3)(m_2\ m_4)$, $m_2 \mapsto (r\ r^3)(m_1\ m_3)$,
$m_3 \mapsto (r\ r^3)(m_2\ m_4)$, $m_4 \mapsto (r\ r^3)(m_1\ m_3)$.
This action is clearly not faithful, since the kernel of the homomorphism
$D_4 \to \mathrm{Sym}(D_4)$ is nontrivial: it consists of $e$ and $r^2$.

Table 6.3: Conjugation in $D_4$
    | e   r   r²  r³  m₁  m₂  m₃  m₄
----+--------------------------------
 e  | e   e   e   e   e   e   e   e
 r  | r   r   r   r   r³  r³  r³  r³
 r² | r²  r²  r²  r²  r²  r²  r²  r²
 r³ | r³  r³  r³  r³  r   r   r   r
 m₁ | m₁  m₃  m₁  m₃  m₁  m₃  m₁  m₃
 m₂ | m₂  m₄  m₂  m₄  m₄  m₂  m₄  m₂
 m₃ | m₃  m₁  m₃  m₁  m₃  m₁  m₃  m₁
 m₄ | m₄  m₂  m₄  m₂  m₂  m₄  m₂  m₄
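The kernel of the conjugation action can also be found by direct computation. The sketch below models $D_4$ by the vertex permutations of Example 6.3 (an encoding chosen here for convenience) and collects the elements that conjugate every element to itself; the helper names are hypothetical.

```python
def compose(p, q):
    """Product p * q of permutation tuples: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    """Inverse permutation: if p sends i to j, the inverse sends j to i."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

# D4 as permutations of the vertices of the square (see Example 6.3).
D4 = {
    "e": (1, 2, 3, 4), "r": (2, 3, 4, 1), "r2": (3, 4, 1, 2), "r3": (4, 1, 2, 3),
    "m1": (1, 4, 3, 2), "m2": (2, 1, 4, 3), "m3": (3, 2, 1, 4), "m4": (4, 3, 2, 1),
}
name_of = {perm: name for name, perm in D4.items()}

def conjugate(g, h):
    """The conjugation action of g on h, namely g * h * g^(-1), on element names."""
    return name_of[compose(compose(D4[g], D4[h]), inverse(D4[g]))]

# Elements acting trivially: those that conjugate every h to itself.
kernel = [g for g in D4 if all(conjugate(g, h) == h for h in D4)]
print(kernel)                # ['e', 'r2']
print(conjugate("r", "m1"))  # m3
print(conjugate("m1", "r"))  # r3
```

The computed kernel consists of exactly the elements that commute with everything, matching the observation that only $e$ and $r^2$ act trivially.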
So far we've met a number of different group actions, some of which
are faithful and some of which aren't. Proposition 4.29 says that
kernels of group homomorphisms correspond exactly to normal subgroups.
So another way to look at this is that a nontrivial unfaithful
$G$-action on a set $X$ determines a nontrivial normal subgroup of $G$.
This, together with the First Isomorphism Theorem,⁵ gives us a method
for turning any unfaithful group action into a faithful one.
⁵ Theorem 4.40, page 119.
An unfaithful $G$-action on a set $X$ gives a noninjective homomorphism
$\phi : G \to \mathrm{Sym}(X)$. Factoring out by the kernel $K = \ker(\phi)$ of this
homomorphism yields an isomorphism $G/K \cong \mathrm{im}(\phi)$, and composing this
with the inclusion homomorphism $i : \mathrm{im}(\phi) \hookrightarrow \mathrm{Sym}(X)$ gives us an
injective homomorphism $\bar{\phi} : G/K \to \mathrm{Sym}(X)$. This homomorphism
determines a faithful action of $G/K$ on $X$.
Example 6.15 The unfaithful conjugation $D_4$-action from Example 6.14
gives a noninjective homomorphism $\phi : D_4 \to \mathrm{Sym}(D_4)$ with
kernel $K = \{e, r^2\}$. From Example 3.30 we know that $D_4/K \cong V$, the
Klein group, so by factoring out by $K$ (which happens to be the centre
$Z(D_4)$) we get a faithful action of $V$ on the underlying set of $D_4$ given
by the following injective homomorphism $\bar{\phi} : V \to \mathrm{Sym}(D_4)$:
$e \mapsto \iota$, $a \mapsto (m_1\ m_3)(m_2\ m_4)$,
$b \mapsto (r\ r^3)(m_2\ m_4)$, $c \mapsto (r\ r^3)(m_1\ m_3)$.

Also, this relationship between group actions and homomorphisms
and normal subgroups means that if $G$ is a simple group, then the
only unfaithful $G$-action is the trivial one.
6.2 Orbits and stabilisers

[The Earth] alone remains immoveable, whilst all things revolve around it,
being connected with every other part, while they all rest upon it.
Pliny the Elder (23–79 AD), Natural History II:4

If M's a complete metric space,
and nonempty, it's always the case,
that if f's a contraction,
then under its action,
exactly one point stays in place.
Anonymous, The Contraction Mapping Limerick

By Proposition 1.54, any finite permutation can be decomposed as a
product of disjoint cyclic permutations. For example, the permutation
$\sigma \in S_5$ given by $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix}$
can be written as the product $(1\ 2\ 4)(3\ 5)$.
When this permutation acts on the set $X_5 = \{1, 2, 3, 4, 5\}$ it cycles 1,
2 and 4 amongst themselves, and swaps 3 and 5. We can use this to
define an action of the cyclic subgroup
$\langle\sigma\rangle = \{\iota, (1\ 2\ 4)(3\ 5), (1\ 4\ 2), (3\ 5), (1\ 2\ 4), (1\ 4\ 2)(3\ 5)\} \leq S_5$
on $X_5$ in the obvious way. Looking at how the six elements of $\langle\sigma\rangle$
permute the elements of $X_5$, we can see that all of them permute
1, 2 and 4 amongst themselves, and also permute 3 and 5 amongst
themselves. No element of $\langle\sigma\rangle$, though, maps an element of the subset
$\{1, 2, 4\}$ to an element of the subset $\{3, 5\}$, or vice versa. (This is
obvious, because since $\sigma = (1\ 2\ 4)(3\ 5)$ doesn't do this, neither can any
power of $\sigma$.) So, the action of $\langle\sigma\rangle$ on $X_5$ splits (or partitions) $X_5$ as a
union of two disjoint subsets: $X_5 = \{1, 2, 4\} \cup \{3, 5\}$.
Where there's a partition, there's an equivalence relation: we can
say two elements $x, y \in S = \bigcup_i S_i$ of a partitioned set $S$ are related,
written $x \sim y$, if they both lie in the same subset $S_i \subseteq S$. This relation
is reflexive ($x$ must obviously lie in the same subset as itself), it's
symmetric (if $x$ lies in the same subset as $y$ then $y$ must also be in the
same subset as $x$) and it's transitive (if $x$ is in the same subset as $y$ and
$y$ is in the same subset as $z$ then $x$ must be in the same subset as $z$).
An arbitrary partition needn't correspond to a particularly interesting
equivalence relation, but since this one is determined by the action of
$\langle\sigma\rangle$ on $X_5$, we might reasonably expect the corresponding equivalence
relation to give us some insight into exactly how $\langle\sigma\rangle$ acts on $X_5$.
The key is the observation a few paragraphs ago: the elements of $\langle\sigma\rangle$
permute the elements of each of the subsets $\{1, 2, 4\}$ and $\{3, 5\}$ amongst
themselves, but they don't map any elements from one subset to the other.
More formally, for two elements $x, y \in X_5$, we say that $x \sim y$ if and only if
there exists a permutation $\tau \in \langle\sigma\rangle$ such that $\tau(x) = y$.
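We need a name for the equivalence classes determined by this action.
Definition 6.16 Let $G$ be a group, acting on a set $X$. The orbit of an
element $x \in X$ is the subset
$\mathrm{Orb}_G(x) = \{y \in X : g \cdot x = y \text{ for some } g \in G\}$.
So, in the example above,
$\mathrm{Orb}_{\langle\sigma\rangle}(1) = \mathrm{Orb}_{\langle\sigma\rangle}(2) = \mathrm{Orb}_{\langle\sigma\rangle}(4) = \{1, 2, 4\}$
and $\mathrm{Orb}_{\langle\sigma\rangle}(3) = \mathrm{Orb}_{\langle\sigma\rangle}(5) = \{3, 5\}$.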
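These orbits can be recomputed directly from the definition. The following sketch, with hypothetical helper names, builds the cyclic subgroup generated by $\sigma = (1\ 2\ 4)(3\ 5)$ as a list of permutation tuples and collects the orbit of each point of $X_5$.

```python
def compose(p, q):
    """Product p * q of permutation tuples: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def cyclic_subgroup(g):
    """List the powers of g, starting from the identity, until they repeat."""
    identity = tuple(range(1, len(g) + 1))
    powers, current = [identity], g
    while current != identity:
        powers.append(current)
        current = compose(g, current)
    return powers

def orbit(group, x):
    """Orb_G(x): every point that some element of the group sends x to."""
    return frozenset(g[x - 1] for g in group)

sigma = (2, 4, 5, 1, 3)   # the permutation (1 2 4)(3 5)
H = cyclic_subgroup(sigma)
orbits = {orbit(H, x) for x in range(1, 6)}

print(len(H))                             # 6
print(sorted(sorted(o) for o in orbits))  # [[1, 2, 4], [3, 5]]
```

Collecting the orbits into a set removes duplicates, which reflects the fact that points in the same orbit have equal orbits.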
The next example involves an infinite group acting on an infinite set.

Example 6.17 The group $SO_3(\mathbb{R})$ consists of all real, orthogonal
$3 \times 3$ matrices with determinant 1. Geometrically, these correspond
to rotations in $\mathbb{R}^3$ around some axis passing through the origin.
What are the orbits of this $SO_3(\mathbb{R})$-action on $\mathbb{R}^3$? Well, given any
two position vectors $v$ and $w$ in $\mathbb{R}^3$, if $\|v\| = \|w\|$ then there will
be a matrix $A \in SO_3(\mathbb{R})$ such that $Av = w$. But if $\|v\| \neq \|w\|$
then no such special orthogonal matrix will exist. So $v \sim w$ if and
only if $\|v\| = \|w\|$, which means that the orbit $\mathrm{Orb}_{SO_3(\mathbb{R})}(v)$ must
consist of all vectors $w \in \mathbb{R}^3$ such that $\|v\| = \|w\|$. Geometrically,
these position vectors determine a sphere of radius $\|v\|$ centred on
the origin. So the action of $SO_3(\mathbb{R})$ partitions $\mathbb{R}^3$ into an infinite
collection of concentric spheres and the origin $\{0\} = \mathrm{Orb}_{SO_3(\mathbb{R})}(0)$.

[Figure 6.4: Orbits of $SO_3(\mathbb{R})$ as concentric spheres]

The following example is of a finite group, the dihedral group $D_3$,
acting in a familiar way on an infinite set of points in the plane.
Example 6.18 Let $P_3$ be the equilateral triangle with vertices at the
points $(1, 0)$, $\left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$ and $\left(-\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)$. Then the dihedral group $D_3$
acts on this subset of the plane $\mathbb{R}^2$ in the usual way, and this action
has uncountably many orbits. These orbits, however, are of three
different types:
The origin $(0, 0)$ comprises an orbit on its own: it's left fixed by every
element in $D_3$.
The three vertices form an orbit: as remarked at the beginning of
Section 1.4 they are permuted among themselves by the elements
of $D_3$. More generally, any point on one of the triangle's axes of
symmetry, apart from the origin itself, forms an orbit with two other
points.
Finally, any point which doesn't lie on an axis of symmetry forms an
orbit with five other points. See Figure 6.5 for an illustration of some
typical orbits.

[Figure 6.5: Axes of symmetry of the equilateral triangle $P_3$, and some typical
orbits of the $D_3$-action]

One of the standard group actions we met in the last section was the
trivial action. Its orbits have a particularly simple form:
Example 6.19 Let a group $G$ act trivially on some set $X$, in the sense
that $g \cdot x = x$ for all $g \in G$ and $x \in X$. Then for some arbitrary $x \in X$,
the orbit $\mathrm{Orb}_G(x) = \{g \cdot x : g \in G\} = \{x\}$. So the trivial action
partitions a set $X$ into singleton subsets consisting of the individual
elements of $X$.
Another important action is the conjugation action of a group $G$ acting
on itself.⁶ We met the orbits of this action in a slightly different context
in Chapter 3.⁷
⁶ Definition 6.11, page 217.
⁷ Proposition 3.4, page 74.
Example 6.20 Let a group $G$ act on its underlying set via the conjugation
action. Then the orbit of an element $h \in G$ is the set
$\mathrm{Orb}_G(h) = \{g \cdot h : g \in G\} = \{g * h * g^{-1} : g \in G\}$.
But this is exactly the conjugacy class of $h$ in $G$. Two elements
$h, k \in G$ are therefore in the same orbit if they are conjugate: if there
exists some element $g \in G$ such that $k = g * h * g^{-1}$.
Sometimes, all elements of a $G$-set $X$ lie in the same orbit. The action
of $S_n$ on $X_n$ (or, more generally, of $\mathrm{Sym}(X)$ on a set $X$) has this property.
That is, for any two elements $x, y \in X$, there is at least one element
$g \in G$ for which $g \cdot x = y$. Another concrete example is the action of
$GL_n(\mathbb{R})$ on $\mathbb{R}^n_* = \mathbb{R}^n \setminus \{0\}$: for any nonzero vectors $v, w \in \mathbb{R}^n_*$ we can
find an invertible $n \times n$ real matrix $A$ such that $Av = w$.
Definition 6.21 An action of a group $G$ on a set $X$ is transitive if it
partitions $X$ into a single orbit. Equivalently, for any $x, y \in X$ there
exists at least one element $g \in G$ such that $g \cdot x = y$.

In Section 3.A we'll extend these ideas to obtain multiply transitive
actions, which we will use to construct the Mathieu groups.
Proposition 6.22 The left regular action of a group on itself is transitive.

Proof This is really just a consequence of the cancellation law for
group multiplication.⁸ To see this, consider any two elements $h, k \in G$.
For $h$ and $k$ to be in the same orbit, there must be an element $g \in G$
such that $g \cdot h = g * h = k$. Setting $g = k * h^{-1}$ works, so the left
regular action is transitive.
⁸ Proposition 1.15, page 8.
The right regular action is transitive by a similar argument.
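The contrast between a transitive and a non-transitive action of the same group on the same set can be checked for a small example. The sketch below, which assumes $S_3$ stored as permutation tuples and uses helper names of our own, counts the orbits of the left regular action and of the conjugation action.

```python
from itertools import permutations

def compose(p, q):
    """Product p * q of permutation tuples: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    """Inverse permutation: if p sends i to j, the inverse sends j to i."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

def orbits(group, points, act):
    """The set of distinct orbits of the given action on the given points."""
    return {frozenset(act(g, x) for g in group) for x in points}

def left_regular(g, h):
    """Left regular action: g . h = g * h."""
    return compose(g, h)

def conjugation(g, h):
    """Conjugation action: g . h = g * h * g^(-1)."""
    return compose(compose(g, h), inverse(g))

S3 = list(permutations((1, 2, 3)))

print(len(orbits(S3, S3, left_regular)))                    # 1
print(sorted(len(o) for o in orbits(S3, S3, conjugation)))  # [1, 2, 3]

# The witness used in the proof: g = k * h^(-1) sends h to k.
witness_ok = all(compose(compose(k, inverse(h)), h) == k for h in S3 for k in S3)
print(witness_ok)  # True
```

The conjugation action splits $S_3$ into three orbits of sizes 1, 2 and 3, so it is not transitive, whereas the left regular action has a single orbit.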
For the moment, though, we'll return to the D_3 action described in
Example 6.18, and look at it from a slightly different perspective.
Instead of asking which points get mapped to which others by arbitrary
elements of the group, we ask which group elements leave a given
point of the triangle P_3 unchanged.
The central point of the triangle P_3 is unmoved by any element of
D_3, but this isn't the case for any other point. The vertex 1 of P_3 is
obviously left fixed in place by the identity e, and also by the reflection
m_1 in the axis which passes through that vertex, but any other element
of D_3 maps it to one of the other two vertices. This is also true for any
non-central point which lies on one of the three axes of symmetry: it's
fixed in place by e and the reflection operation corresponding to its
axis, but the other four elements of D_3 move it to one of the other two
points in the same orbit. And if we choose a point of P_3 that doesn't
lie on an axis of symmetry, then it's even worse: only the identity e
leaves the point where it is.
So, the triangle's central point is fixed by all of D_3, any other point
on an axis of symmetry is fixed by just a subset of the form {e, m_1},
and a point which isn't on an axis of symmetry is fixed only by the
identity element e. We give subsets of this type a special name:
Definition 6.23 Let a group G act on a set X, and suppose x is some
element of X. The stabiliser of x in G is the set
    Stab_G(x) = { g ∈ G : g · x = x }
of elements in G which act trivially on x.

In the D_3 example, all the stabilisers were subgroups of D_3 rather than
just subsets, and it's reasonable to ask whether this is true in general:
if a group G acts on a set X, then is the stabiliser Stab_G(x) of a given
point x ∈ X always a subgroup of G? It so happens that it is.
Proposition 6.24 Let a group G act on a set X. Then for any x ∈ X,
the set Stab_G(x) is a subgroup of G. Furthermore, ⋂_{x∈X} Stab_G(x) is the
kernel of the action homomorphism φ : G → Sym(X).

Proof To show that Stab_G(x) is a subgroup of G, we need to check
that for any two elements g_1, g_2 ∈ Stab_G(x) the product g_1 g_2 is also
in Stab_G(x), and that if g ∈ Stab_G(x) then g⁻¹ ∈ Stab_G(x) as well.
(The stabiliser is certainly nonempty, since e · x = x.)
To see the first of these, observe that
    (g_1 g_2) · x = g_1 · (g_2 · x) = g_1 · x = x
since both g_1 and g_2 act trivially on x. The second condition is also
satisfied, since for any g ∈ Stab_G(x) we have
    g⁻¹ · x = g⁻¹ · (g · x) = (g⁻¹ g) · x = e · x = x.
Hence Stab_G(x) ≤ G.
Now suppose that g ∈ ⋂_{x∈X} Stab_G(x). This means that g · x = x for all
x ∈ X, which tells us that the action homomorphism φ maps g to the
identity permutation ι in Sym(X). Hence ⋂_{x∈X} Stab_G(x) ⊆ ker(φ).
Conversely, suppose g ∈ ker(φ), so that φ(g) = ι ∈ Sym(X). Then
g · x = ι(x) = x for any x ∈ X, so g ∈ ⋂_{x∈X} Stab_G(x), and hence
ker(φ) ⊆ ⋂_{x∈X} Stab_G(x).
Therefore ⋂_{x∈X} Stab_G(x) = ker(φ) as claimed.
In light of the above result, the stabiliser Stab_G(x) is sometimes called
the isotropy subgroup.
Returning yet again to the D_3 example, we notice that the orbit of the
triangle P_3's central point consists of a single element, and that its
stabiliser consists of all six elements of D_3. If we choose some other
point on a symmetry axis of P_3 then we find that the orbit of this point
contains three points and its stabiliser contains two elements of D_3.
The orbit of some other point in P_3, meanwhile, consists of six points
while its stabiliser is just the identity element e.
In each of these three cases, the number of points in the orbit Orb_{D_3}(x)
multiplied by the number of elements in the stabiliser Stab_{D_3}(x) is
equal to the number of elements in the group D_3. That is,

    |Orb_{D_3}(x)| |Stab_{D_3}(x)| = 6 = |D_3|.

The next theorem confirms that this is true for any finite group acting
on a (finite or infinite) set.
Theorem 6.25 (Orbit-Stabiliser Theorem) Let G be a finite group
acting on a (not necessarily finite) set X. Then
    |Orb_G(x)| |Stab_G(x)| = |G|
for all x ∈ X.

Proof Let y be some element of the orbit Orb_G(x). Then there exists
at least one element g ∈ G such that g · x = y. Now suppose that there
exists some other h ∈ G such that h · x = y. Then we have h · x = g · x,
and hence (g⁻¹h) · x = x. This means that g⁻¹h is in the stabiliser
Stab_G(x). Proposition 6.24 tells us that Stab_G(x) is a subgroup of G,
and hence by Proposition 2.26 it follows that if g⁻¹h is in Stab_G(x)
then h must be in the left coset g Stab_G(x).
So all group elements which map x to y lie in the same coset of
Stab_G(x). Conversely, if an element k ∈ G lies in the coset g Stab_G(x)
then k · x = g · x. Thus a coset of Stab_G(x) consists of exactly those
group elements that map x to a given element of the orbit Orb_G(x).
By Proposition 2.29 we know that |g Stab_G(x)| = |Stab_G(x)|. So for
any element y ∈ Orb_G(x) there are exactly |Stab_G(x)| group elements
that map x to y. By Corollary 2.28, these left cosets completely partition
G, and so there must be exactly |G|/|Stab_G(x)| distinct elements in
the orbit Orb_G(x). Therefore |G| = |Orb_G(x)| |Stab_G(x)| for any
x ∈ X, as claimed.
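The Orbit-Stabiliser Theorem is easy to check by brute force on a small example. The Python sketch below is illustrative only: it assumes D_3 is modelled as the six permutations of the vertex labels 0, 1, 2, and the helper names (`orbit`, `stabiliser`, `check_orbit_stabiliser`) are made up for this example rather than taken from the text.

```python
from itertools import permutations

# D_3 modelled as all six permutations of the vertex labels 0, 1, 2.
# A permutation p is a tuple with p[i] the image of vertex i.
GROUP = list(permutations(range(3)))

def act_on_vertex(p, v):
    """Action of p on a single vertex."""
    return p[v]

def act_on_pair(p, pair):
    """Induced action of p on an ordered pair of vertices."""
    return (p[pair[0]], p[pair[1]])

def orbit(group, act, x):
    """All points that x can be moved to."""
    return {act(g, x) for g in group}

def stabiliser(group, act, x):
    """All group elements that leave x where it is."""
    return [g for g in group if act(g, x) == x]

def check_orbit_stabiliser(group, act, x):
    """Return (|Orb(x)|, |Stab(x)|) after checking their product is |G|."""
    orb, stab = orbit(group, act, x), stabiliser(group, act, x)
    assert len(orb) * len(stab) == len(group)
    return len(orb), len(stab)

print(check_orbit_stabiliser(GROUP, act_on_vertex, 0))     # (3, 2)
print(check_orbit_stabiliser(GROUP, act_on_pair, (0, 1)))  # (6, 1)
```

A vertex has an orbit of size 3 and a stabiliser of order 2, while an ordered pair of distinct vertices is moved freely, matching the two non-central cases described for the triangle.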
Here's another simple example.
Example 6.26 Let Z_7 act on the circle C = { e^{iθ} : 0 ≤ θ < 2π }
by n · e^{iθ} = e^{i(θ + 2πn/7)}. Then the stabiliser Stab_{Z_7}(x) of any point
x = e^{iθ} ∈ C is just the trivial subgroup {0}, so |Stab_{Z_7}(x)| = 1. Since
|Z_7| = 7, by the Orbit-Stabiliser Theorem, the orbit Orb_{Z_7}(x) should
consist of seven points.
And this is exactly what happens. The orbit Orb_{Z_7}(x) = Orb_{Z_7}(e^{iθ})
is equal to the set { e^{i(θ + 2πk/7)} : 0 ≤ k < 7 }, which consists of the
point x = e^{iθ} together with six other points spaced at intervals of 2π/7
around the unit circle C.
Figure 6.6: An orbit of the Z_7 action on C = { e^{iθ} : 0 ≤ θ < 2π }

If we apply the Orbit-Stabiliser Theorem to the conjugation action,
then something interesting and useful happens. The orbit Orb_G(g)
of some element g ∈ G is simply the conjugacy class of g. The
stabiliser Stab_G(g) of g, meanwhile, is composed of those elements
h of G that fix g under conjugation; that is, Stab_G(g) = { h ∈ G :
hgh⁻¹ = g }. We've met this before, however: it's the centraliser
Z_G(g) of the element g. So, applying the Orbit-Stabiliser Theorem to
the conjugation action we get the following corollary:
Corollary 6.27 Let g be an element of some group G, and let [g] denote
the conjugacy class of g. Then |[g]| = |G : Z_G(g)|.
If G is abelian then obviously any group element g commutes with
every other element, and so Z_G(g) = G. Therefore |G : Z_G(g)| = 1,
which tells us that the conjugacy class of g consists just of g itself,
something we already know from Proposition 3.8. More generally,
suppose that G isn't necessarily abelian, but that g lies in the centre
Z(G) and thus commutes with every element of G. Then Z_G(g) = G
and so again the conjugacy class of g consists only of g.
Recall from Proposition 3.10 that a subgroup H ≤ G is normal if and
only if it is a union of conjugacy classes. We can use this to construct
a different proof that the alternating group A_5 is simple:
Proposition 6.28 The alternating group A_5 is simple; that is, it has no
proper, nontrivial normal subgroups.

Proof Recall that A_5 consists of all even permutations of the set
X_5 = {1, 2, 3, 4, 5}. With a little thought and experimentation, we can
see that any such permutation must be either the identity ι, a pair
of disjoint transpositions of the form (x_1 x_2)(x_3 x_4), a three-cycle of
the form (x_1 x_2 x_3) or a five-cycle of the form (x_1 x_2 x_3 x_4 x_5), where
x_1, x_2, x_3, x_4, x_5 are distinct elements of X_5.
The identity permutation forms a conjugacy class

    c_1 = {ι}

on its own.
Consider two permutations σ = (x_1 x_2)(x_3 x_4) and τ = (y_1 y_2)(y_3 y_4)
in A_5. Let α, β : X_5 → X_5 be bijections (and hence permutations in S_5)
such that

    α : x_1 ↦ y_1, x_2 ↦ y_2, x_3 ↦ y_3, x_4 ↦ y_4, x_5 ↦ y_5;
    β : x_1 ↦ y_2, x_2 ↦ y_1, x_3 ↦ y_3, x_4 ↦ y_4, x_5 ↦ y_5.

Notice that ασα⁻¹ = τ, and also that βσβ⁻¹ = τ. Since β = α(x_1 x_2),
either α or β must have even parity, with the other one being an odd
permutation. Therefore there exists an even permutation in A_5 which
conjugates σ to τ. Hence the permutations of the form (x_1 x_2)(x_3 x_4)
form a single conjugacy class
c2 = {(1 2)(3 4), (1 2)(3 5), (1 2)(4 5), (1 3)(2 4), (1 3)(2 5),
(1 3)(4 5), (1 4)(2 3), (1 4)(2 5), (1 4)(3 5), (1 5)(2 3),
(1 5)(2 4), (1 5)(3 4), (2 3)(4 5), (2 4)(3 5), (2 5)(3 4)}
with fifteen elements.
Similarly, consider two three-cycles σ = (x_1 x_2 x_3) and τ = (y_1 y_2 y_3).
Let α, β ∈ S_5 be defined by
    α : x_1 ↦ y_1, x_2 ↦ y_2, x_3 ↦ y_3, x_4 ↦ y_4, x_5 ↦ y_5;
    β : x_1 ↦ y_1, x_2 ↦ y_2, x_3 ↦ y_3, x_4 ↦ y_5, x_5 ↦ y_4.
Observe again that ασα⁻¹ = βσβ⁻¹ = τ, and that since β = α(x_4 x_5)
either α or β must be an even permutation and hence in A_5. So again
we see that any three-cycle in A_5 must be conjugate to any other one,
and therefore the three-cycles form a conjugacy class
c3 = {(1 2 3), (1 2 4), (1 2 5), (1 3 2), (1 3 4),
(1 3 5), (1 4 2), (1 4 3), (1 4 5), (1 5 2),
(1 5 3), (1 5 4), (2 3 4), (2 3 5), (2 4 3),
(2 4 5), (2 5 3), (2 5 4), (3 4 5), (3 5 4)}
with twenty elements.
The remaining 24 elements of A5 are cycles of length 5, but it turns
out that these split neatly into two conjugacy classes, each with twelve
elements.
c4 = {(1 2 3 4 5), (1 2 4 5 3), (1 2 5 3 4), (1 3 2 5 4),
(1 3 4 2 5), (1 3 5 4 2), (1 4 2 3 5), (1 4 3 5 2),
(1 4 5 2 3), (1 5 2 4 3), (1 5 3 2 4), (1 5 4 3 2)}
c5 = {(1 2 3 5 4), (1 2 4 3 5), (1 2 5 4 3), (1 3 2 4 5),
(1 3 4 5 2), (1 3 5 2 4), (1 4 2 5 3), (1 4 3 2 5),
(1 4 5 3 2), (1 5 2 3 4), (1 5 3 4 2), (1 5 4 2 3)}
To see that c_4 and c_5 are disjoint, suppose that σ = (1 2 3 4 5) and
τ = (1 2 3 5 4), and let α = (4 5). Then ασα⁻¹ = τ. But α has odd
parity and is therefore not in A_5. Suppose instead that there exists
β ∈ A_5 such that βσβ⁻¹ = τ. Then βσβ⁻¹ = ασα⁻¹, which means
that α⁻¹βσ = σα⁻¹β. Hence α⁻¹β commutes with σ. If β ∈ A_5 then it
must have even parity, so α⁻¹β must have odd parity. So such an even
permutation β can exist if and only if there is an odd permutation
γ ∈ S_5 that commutes with σ. This means that
    γσγ⁻¹ = γ(1 2 3 4 5)γ⁻¹ = (γ(1) γ(2) γ(3) γ(4) γ(5)) = (1 2 3 4 5).
So if γ(1) = i then γ(2) = i+1 (mod 5), γ(3) = i+2 (mod 5) and so
on. Therefore γ must be a power of σ, and hence an even permutation.
Thus σ and τ can't be conjugate in A_5.
It only remains to show that c_4 and c_5 are conjugacy classes of
5-cycles in A_5 rather than unions of smaller classes. There are 24
different 5-cycles of the form (1 x_2 x_3 x_4 x_5), determined by the 24
arrangements (x_2, x_3, x_4, x_5) of the numbers 2, 3, 4, 5. Each arrangement
is a permutation of {2, 3, 4, 5} which conjugates (1 2 3 4 5) to
(1 x_2 x_3 x_4 x_5). Half of these permutations are even, and half of
them are odd. The even ones determine the 12 conjugate 5-cycles in
c_4, and the odd ones determine the 12 conjugate 5-cycles in c_5.
So, we now have the five conjugacy classes in A_5, and we find that
    |c_1| = 1, |c_2| = 15, |c_3| = 20, |c_4| = 12, |c_5| = 12.
Now suppose that N is a normal subgroup of A_5. Then by Lagrange's
Theorem (Theorem 2.30, page 54) |N| must be a factor of |A_5| = 60, and
by Proposition 3.10 we know that N must be a union of some or all of
the conjugacy classes c_1 to c_5. Also, N must contain the identity
permutation and hence the class c_1. But the only factors of 60 that can
be formed as a sum of the number 1 together with some or all of the
numbers 15, 20, 12 and 12 are 1 and 60. So N must be either the trivial
subgroup or A_5 itself, and therefore A_5 is simple.
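Both halves of this argument can be checked by brute force. The following Python sketch is an illustration rather than part of the proof: it assumes permutations of {0, ..., 4} stored as tuples and composed right to left, and the helper names are invented for the example. It computes the conjugacy classes of A_5 directly and then tests every possible union of classes containing the identity.

```python
from itertools import combinations, permutations

def compose(p, q):
    """The composite p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

A5 = [p for p in permutations(range(5)) if is_even(p)]

def conjugacy_classes(group):
    """Partition a group into conjugacy classes by conjugating exhaustively."""
    remaining, classes = set(group), []
    while remaining:
        g = next(iter(remaining))
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        classes.append(cls)
        remaining -= cls
    return classes

sizes = sorted(len(c) for c in conjugacy_classes(A5))
print(sizes)  # [1, 12, 12, 15, 20]

# A normal subgroup is the identity class plus some of the other classes,
# and its order must divide |A_5| = 60.
others = sizes[1:]
orders = sorted({1 + sum(choice)
                 for r in range(len(others) + 1)
                 for choice in combinations(others, r)
                 if 60 % (1 + sum(choice)) == 0})
print(orders)  # [1, 60]
```

Only the orders 1 and 60 survive, which is the numerical fact the proof relies on.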

6.3 Counting

It is pointless to do with more that which can be done with fewer.
William of Ockham (c.1287-1347), Summa Logicae (c.1323)

The machinery of group actions, orbits and stabilisers has
a number of useful applications in combinatorics, and in this section
we will study two particularly powerful results: Burnside's Lemma and
Pólya's Enumeration Theorem. The former is named after the British
group theorist William Burnside (1852-1927), although he attributed it
to the German mathematician Ferdinand Georg Frobenius (1849-1917),
and the French mathematician Augustin-Louis Cauchy (1789-1857)
had discovered it even earlier. Given that Burnside didn't originally
discover the Lemma, and that it's not the only lemma or theorem
named after him, it's been suggested that a more accurate description
is 'Not Burnside's Lemma'.[10] Meanwhile, Pólya's Enumeration Theorem
was originally discovered in 1927 by the American mathematician
John Redfield (1879-1944) and rediscovered independently ten years
later by the Hungarian mathematician George Pólya (1887-1985), after
whom it was subsequently named.[11]

[10] P M Neumann, A lemma that is not Burnside's, The Mathematical
Scientist 4.2 (1979) 133-141
[11] This is a common phenomenon in science, known variously as
Stigler's Law of Eponymy ('No scientific discovery is named after its
original discoverer') or Boyer's Law ('Mathematical formulas and
theorems are usually not named after their discoverers').[12] Naturally,
neither of these laws is named after the people who actually coined
them either.
[12] H C Kennedy, Who discovered Boyer's Law?, American Mathematical
Monthly 79.1 (1972) 66-67

Both Burnside's Lemma and Pólya's Enumeration Theorem are useful
for solving enumeration problems related to counting distinct
colourings (or analogous configurations) of objects modulo some notion
of symmetry. More generally, both are concerned with counting orbits of
a given group action.
For example, suppose that we have a square picture frame and pots of
black and white paint. How many different ways are there of painting
the four edges of this frame either black or white?
At first sight we might be tempted to say 16, because each of the four
edges can be painted one of two colours, so the number of possible
configurations is 2^4 = 16. These are depicted in Figure 6.7.
[Figure 6.7: Coloured frames]
However, some of these are really the same as each other, just rotated
clockwise or anticlockwise by a quarter- or half-turn. If we collect
together those coloured frames that only differ by a rotation, we find
there are really just six different ways of painting the edges of a square
frame:

[Figure: the six rotationally distinct coloured frames]

What we have here is a set X consisting of 16 coloured picture frames,
together with an action of the rotation group R_4 ≅ Z_4. In other words,
given any element of X, we have a well-defined way of making another
(possibly identical) element of X by rotating it through an angle of 0,
π/2, π or 3π/2. The six subsets shown above are exactly the orbits of this
action.
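The six orbits can be listed mechanically. In the Python sketch below, the encoding is an assumption made for illustration: a frame is a 4-tuple of edge colours read anticlockwise, and a quarter-turn shifts every colour along by one edge.

```python
from itertools import product

def rotate(frame):
    """Quarter-turn: each edge colour moves round to the next edge."""
    return frame[-1:] + frame[:-1]

def orbit(frame):
    """The set of frames obtained from `frame` by the four rotations."""
    result, current = set(), frame
    for _ in range(4):
        result.add(current)
        current = rotate(current)
    return frozenset(result)

# Each frame is a 4-tuple of edge colours, "B" (black) or "W" (white).
frames = list(product("BW", repeat=4))
orbits = {orbit(f) for f in frames}

print(len(frames))                     # 16
print(len(orbits))                     # 6
print(sorted(len(o) for o in orbits))  # [1, 1, 2, 4, 4, 4]
```

The orbit sizes add up to 16, as they must, since the orbits partition X.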
So, our question now reduces to one about counting the orbits of a
group action, and the answer is given by the following result.
Theorem 6.29 (Burnside's Lemma) Let G be a finite group acting on
a finite set X. Then the number of distinct orbits of this action is given by
\[ \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_g(X)|, \]
where Fix_g(X) = { x ∈ X : g · x = x } is the subset of elements of X which
are invariant under the action of the element g ∈ G.

Proof First we count the number of pairs (g, x) ∈ G × X for which
g · x = x. There are two obvious ways of doing this. One way is to
sum the number of elements in each set Fix_g(X) for each g ∈ G; this
gives the sum Σ_{g∈G} |Fix_g(X)|. The other way is to sum the number of
elements in the stabiliser Stab_G(x) as x ranges over all the elements
in X; this gives the sum Σ_{x∈X} |Stab_G(x)|. These two sums must be
equal, hence
\[ \sum_{g \in G} |\mathrm{Fix}_g(X)| = \sum_{x \in X} |\mathrm{Stab}_G(x)|. \qquad (6.9) \]
If we let X_1, ..., X_k be the distinct orbits of the G-action on X, this
second sum can be rewritten as Σ_{i=1}^{k} Σ_{x∈X_i} |Stab_G(x)|.
If two elements x_1 and x_2 lie in the same orbit, then Orb_G(x_1) =
Orb_G(x_2) and so, by the Orbit-Stabiliser Theorem, |Stab_G(x_1)| =
|Stab_G(x_2)|. So, we can choose a specific representative element
x_i ∈ X_i for each orbit X_i. Then
\[ \sum_{x \in X_i} |\mathrm{Stab}_G(x)| = |X_i|\,|\mathrm{Stab}_G(x_i)| = |\mathrm{Orb}_G(x_i)|\,|\mathrm{Stab}_G(x_i)|. \]
But by the Orbit-Stabiliser Theorem (Theorem 6.25) we know that this
is equal to |G|. Substituting this into (6.9) we get
\[ \sum_{g \in G} |\mathrm{Fix}_g(X)| = \sum_{x \in X} |\mathrm{Stab}_G(x)| = k|G|, \]
where k is the number of orbits. Hence
\[ k = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_g(X)| \]
as required.

Burnside's Lemma says that the number of orbits of a G-action is the
mean number of elements of X left fixed by elements of G.
Applying this result to our picture frame problem, we need to count
the number of frames which are unchanged by a given element of
R_4 = {e, r, r^2, r^3} ≅ Z_4. The identity e leaves everything fixed, so
|Fix_e(X)| = 16. The anticlockwise rotation r and the clockwise rotation
r^3 both leave the following frames unchanged:

[Figure: the two frames fixed by r and r^3]

The half-turn r^2 leaves four frames unchanged:

[Figure: the four frames fixed by r^2]

So, applying Burnside's Lemma we find that this action has
\[ \tfrac{1}{4}(16 + 2 + 2 + 4) = 6 \]
orbits. This agrees with the answer we found earlier by inspection.

Margin note (photograph: Wikimedia Commons): William Burnside
(1852-1927) was born in London and educated at Christ's Hospital after
his father's death in 1858. At school he excelled in mathematics and
won a scholarship to St John's College, Cambridge in 1871, but
transferred to Pembroke College in 1873, to improve his chances of a
place in the college rowing team. He graduated in 1875 with the second
highest marks in his year, winning the title of Second Wrangler and a
prestigious Smith's Prize. He remained in Cambridge for another ten
years, as a lecturer and Fellow of Pembroke, but in 1885 accepted the
post of Professor of Mathematics at the Royal Naval College in
Greenwich, where he stayed for the rest of his career, despite being
offered the post of Master of Pembroke in 1903. After early work on
elliptic functions, he became interested in hydrodynamics, which led
him to study complex function theory and finite group theory. His work
on the latter resulted in the publication of his influential book The
Theory of Groups of Finite Order in 1897, the first treatise on group
theory in English. His many contributions to the field include
Burnside's p^a q^b Theorem and the Burnside Problem. Later in his career
he became interested in probability, writing several papers on the
subject, as well as a book entitled Theory of Probability, which was
published posthumously in 1928. He was elected a Fellow of the Royal
Society in 1893 and awarded the De Morgan Medal of the London
Mathematical Society in 1899, serving on the latter's council until 1917
and as its president between 1906 and 1908.

Here's another example:

Example 6.30 An ordinary six-sided die is a cube with its faces
numbered from 1 to 6. We'll consider three variants on this theme
and work out how many configurations there are in each case.
Firstly, consider the general case where we number each face
independently, allowing more than one face to have the same number.
There are 6^6 = 46656 possible choices, but some of these will be just
rotated versions of each other.
The direct isometry group Isom⁺(□_3) consists of all the
orientation-preserving isometries of the cube □_3. There are 24 of these:
the identity e, six order-2 rotations t about axes passing through the
midpoints of opposite edges, three order-2 rotations r^2 about axes
passing through the centres of opposite faces, eight order-3 rotations
s about axes passing through opposite corners, and six order-4
rotations r about axes passing through the centres of opposite faces.

The identity e obviously fixes all 46656 possible cubes, each of the six
type-t rotations fixes 6^3 = 216 cubes, each of the three type-r^2
rotations fixes 6^4 = 1296 cubes, each of the eight type-s rotations
fixes 6^2 = 36 cubes, and each of the six type-r rotations fixes
6^3 = 216 cubes.
So, Burnside's Lemma tells us that there are
\[ \tfrac{1}{24}(46656 + 6 \cdot 216 + 3 \cdot 1296 + 8 \cdot 36 + 6 \cdot 216) = \tfrac{53424}{24} = 2226 \]
distinct ways of numbering the faces of the die.
But a conventional die uses each number only once, so how many
distinct dice can we construct with this property? Well, instead of
6^6 = 46656 possible configurations we have only 6! = 720: we have
six possible choices for the top face, five remaining for the front face,
four for the left face and so on. The identity fixes all 720 of these, but
none of the other 23 direct isometries fix any. So Burnside's Lemma
tells us that there are 720/24 = 30 distinct dice of this type.
Actually, most conventional six-sided dice also satisfy the condition
that opposite faces add up to seven: the 6 face is opposite the 1 face,
the 5 opposite the 2 and the 4 opposite the 3. How many different
dice are there with this property too?
We have six choices of number for the top face, and when we've made
that choice we know what the number on the bottom face has to be.
Next we choose one of the four remaining numbers for the front face,
at which point we have no choice about what number goes on the
back face. Finally, we have two remaining numbers to choose from
for the left face, which leaves only one for the right face. Therefore
there are 6 × 4 × 2 = 48 possible configurations satisfying these criteria.
As before, the identity fixes all 48 and no other isometry fixes any of
them. By Burnside's Lemma there are exactly 48/24 = 2 such dice.
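All three counts in this example can be reproduced by machine. The sketch below is a hedged illustration: the face numbering (0 to 5 for up, down, front, back, left, right), the two generating quarter-turns and all function names are assumptions made for the example. It builds the 24 rotations as permutations of the faces and then applies Burnside's Lemma.

```python
from itertools import permutations

# Faces are indexed 0..5 as: up, down, front, back, left, right.
# A rotation is a tuple p where the face in position i moves to position p[i].
QUARTER_TURN_VERTICAL = (0, 1, 5, 4, 2, 3)  # front -> right -> back -> left
QUARTER_TURN_SIDEWAYS = (2, 3, 1, 0, 4, 5)  # up -> front -> down -> back

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(6))

def generate(generators):
    """Close a set of permutations under composition."""
    identity = tuple(range(6))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def cycle_count(p):
    """Number of disjoint cycles of p, counting fixed points."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start in seen:
            continue
        cycles += 1
        point = start
        while point not in seen:
            seen.add(point)
            point = p[point]
    return cycles

ROTATIONS = generate([QUARTER_TURN_VERTICAL, QUARTER_TURN_SIDEWAYS])

def burnside(group, fixed_count):
    """Average number of fixed points, i.e. the number of orbits."""
    total = sum(fixed_count(g) for g in group)
    assert total % len(group) == 0
    return total // len(group)

def relabel(g, labels):
    """Labelling after rotating by g: the label on face i moves to face g[i]."""
    moved = [None] * 6
    for i, label in enumerate(labels):
        moved[g[i]] = label
    return tuple(moved)

def fixed_in(labellings):
    """Build a function counting the labellings left unchanged by g."""
    return lambda g: sum(1 for lab in labellings if relabel(g, lab) == lab)

# Variant 1: repeats allowed, so g fixes 6**(number of cycles) cubes.
print(burnside(ROTATIONS, lambda g: 6 ** cycle_count(g)))  # 2226

# Variants 2 and 3: each number once; then opposite faces summing to 7.
distinct = list(permutations(range(1, 7)))
sevens = [d for d in distinct if d[0] + d[1] == d[2] + d[3] == d[4] + d[5] == 7]
print(burnside(ROTATIONS, fixed_in(distinct)))  # 30
print(burnside(ROTATIONS, fixed_in(sevens)))    # 2
```

The first variant uses the shortcut that a rotation with m cycles on the faces fixes 6^m numberings; the other two count fixed numberings directly.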

Burnside's Lemma enables us to count orbits of a group action, and as
we've seen this can help us count colourings or other configurations
of some geometric object modulo some equivalence relation such as
rotational symmetry. But in general the task of counting the number
of elements in Fix_g(X) can become cumbersome and time-consuming.
For something relatively simple like a square picture frame or a cubic
die, the process is fairly straightforward, but it becomes less so for
more complicated objects. For example, a regular dodecahedron
has thirty edges and a direct isometry group with sixty elements,
so counting its distinct edge-colourings is going to be a lengthier
undertaking than the examples we've looked at so far.

Margin photograph (Oberwolfach Photo Collection / Mathematische
Gesellschaft (Hamburg)): The German mathematician Ferdinand Georg
Frobenius (1849-1917)

In the picture frame example it was fairly straightforward to work out
the orders of Fix_g(X) for two colours:
    |Fix_e(X)| = 2^4 = 16,      |Fix_r(X)| = 2^1 = 2,
    |Fix_{r^2}(X)| = 2^2 = 4,   |Fix_{r^3}(X)| = 2^1 = 2.

What if we have three colours at our disposal: black, white and grey?
There are 3^4 = 81 possible colourings, of which all are fixed by the
identity e, 3 are fixed by each of r and r^3, and 9 are fixed by r^2,
giving the following values for |Fix_g(X)|:
    |Fix_e(X)| = 3^4 = 81,      |Fix_r(X)| = 3^1 = 3,
    |Fix_{r^2}(X)| = 3^2 = 9,   |Fix_{r^3}(X)| = 3^1 = 3.

There is an obvious pattern here, and with a little thought we can see
that for k colours the answer is going to be
    |Fix_e(X)| = k^4,           |Fix_r(X)| = k^1 = k,
    |Fix_{r^2}(X)| = k^2,       |Fix_{r^3}(X)| = k^1 = k.

Therefore there are (1/4)(k^4 + k^2 + 2k) rotationally distinct ways of
colouring our square picture frame with k different colours.
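As a sanity check on the closed form, the Python sketch below compares it with a direct count of orbits for small k. It is an illustration only; the function names and the tuple encoding of a frame are assumptions of the example.

```python
from itertools import product

def count_frames_brute_force(k):
    """Count k-colourings of the four edges up to rotation by listing orbits."""
    seen, orbits = set(), 0
    for frame in product(range(k), repeat=4):
        if frame not in seen:
            orbits += 1
            # Mark every rotation of this frame as already counted.
            for shift in range(4):
                seen.add(frame[shift:] + frame[:shift])
    return orbits

def count_frames_formula(k):
    """The closed form (k^4 + k^2 + 2k) / 4 obtained from Burnside's Lemma."""
    return (k**4 + k**2 + 2 * k) // 4

for k in range(1, 6):
    assert count_frames_brute_force(k) == count_frames_formula(k)

print(count_frames_formula(2))  # 6
print(count_frames_formula(3))  # 24
```

For k = 2 this recovers the six frames found earlier, and for three colours it gives 24.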
The key is to study how the isometry group acts on the object we're
colouring. In this case, the rotation group Isom⁺(□_2) = R_4 ≅ Z_4
permutes the four sides of the square: this yields a homomorphism
φ : R_4 → S_4 which we can write down in cycle form as follows:
    e ↦ ι = (1)(2)(3)(4),     r ↦ (1 2 3 4),
    r^2 ↦ (1 3)(2 4),         r^3 ↦ (1 4 3 2).
Here we've numbered the edges of the frame anticlockwise with the
numbers 1 to 4. For what follows it's helpful to also include cycles of
length 1, which for conciseness we don't usually bother doing.
To each permutation we associate a symbol, its type, as follows.
Definition 6.31 Suppose that a permutation σ ∈ S_n can be decomposed
as a product of disjoint cycles, of which a_1 have length 1, a_2
have length 2, and so on, where a_1, a_2, ... are non-negative integers.
Then the type of the permutation σ is the partition
[1^{a_1}, 2^{a_2}, ..., n^{a_n}] of n, where
\[ \sum_{i=1}^{n} a_i\, i = a_1 + 2a_2 + \cdots + n a_n = n. \]
The cycle symbol cs_σ of σ is the formal monomial
x_1^{a_1} x_2^{a_2} ... x_n^{a_n}.

So, the cycle symbols of the elements of R_4, regarded as permutations
in S_4 via the action homomorphism φ, are
    cs_e = x_1^4,   cs_r = x_4,   cs_{r^2} = x_2^2,   cs_{r^3} = x_4.
The cycle symbol cs_g = cs_g(x_1, ..., x_n) is a monomial in the formal
symbols x_1, ..., x_n. To count the k-colourings fixed by a group element
g, we evaluate cs_g(x_1, ..., x_n) setting x_1 = ... = x_n = k. We can see
why this works by looking at what happens in a particular case.
The group element r^2 has cycle symbol cs_{r^2} = x_2^2, because it
corresponds to the permutation (1 3)(2 4). If sides 1 and 3 are the same
colour as each other, and sides 2 and 4 are also the same colour as each
other, then this permutation will leave the frame unchanged. So the
disjoint cycles group the edges together in identically-coloured cohorts,
each of which can be coloured any of the k available colours. So we
have k choices of colour for each of these groupings; that is, k choices
for each disjoint cycle in the permutation. More generally, for a
permutation that decomposes into m disjoint cycles, there are k^m
colourings of our chosen object that are invariant under that permutation.
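The passage from a permutation to its cycle symbol, and from there to a count of fixed colourings, can be sketched in a few lines of Python. This is an illustrative sketch: permutations are assumed to be stored as dictionaries from each point to its image, and the names `cycle_lengths`, `cycle_symbol` and `fixed_colourings` are invented for the example.

```python
from collections import Counter

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of a permutation.

    `perm` is a dict mapping each point to its image.
    """
    seen, lengths = set(), []
    for start in perm:
        if start in seen:
            continue
        length, point = 0, start
        while point not in seen:
            seen.add(point)
            point = perm[point]
            length += 1
        lengths.append(length)
    return lengths

def cycle_symbol(perm):
    """Exponents of the cycle symbol: {i: a_i} stands for the factor x_i^{a_i}."""
    return dict(Counter(cycle_lengths(perm)))

def fixed_colourings(perm, k):
    """Evaluate the cycle symbol at x_1 = ... = x_n = k."""
    return k ** len(cycle_lengths(perm))

# The four rotations of the frame as permutations of the edges 1..4.
e = {1: 1, 2: 2, 3: 3, 4: 4}
r = {1: 2, 2: 3, 3: 4, 4: 1}
r2 = {1: 3, 2: 4, 3: 1, 4: 2}
r3 = {1: 4, 2: 1, 3: 2, 4: 3}

print([cycle_symbol(g) for g in (e, r, r2, r3)])
# [{1: 4}, {4: 1}, {2: 2}, {4: 1}]
print([fixed_colourings(g, 3) for g in (e, r, r2, r3)])
# [81, 3, 9, 3]
```

The second line of output reproduces the three-colour values of |Fix_g(X)| for the picture frame.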
Definition 6.32 Let a finite group G act on a finite set X via an
action homomorphism φ : G → Sym(X) ≅ S_n. Then the cycle index
of the action is the polynomial
\[ C_G(x_1, \ldots, x_n) = \frac{1}{|G|} \sum_{g \in G} \mathrm{cs}_g(x_1, \ldots, x_n). \]

Margin note: Different books adopt different notation for the cycle
index, such as Z(G) and P_G. Here we've settled on C_G, but you should
be aware that there is no strong consensus. Also, some books use C_G(S)
to denote the centraliser of S in G (Definition 4.51) and C(G) to denote
the centre of G (Definition 3.16).

By the argument above, we get the following corollary to Burnside's
Lemma:
Corollary 6.33 (Cycle Index Theorem) Let X be an object with n
components, each of which is assigned one of k colours. The number of
distinct k-colourings of X modulo an action of G is given by
\[ C_G(k, \ldots, k) = \frac{1}{|G|} \sum_{g \in G} \mathrm{cs}_g(k, \ldots, k). \]

Let's apply this to another colouring problem.

Example 6.34 Consider a circular necklace with n coloured but
otherwise identical and equally-spaced beads. Up to rotation, how
many different necklaces can we make if we have k different colours
to choose from?
The general case is a bit involved so first we'll consider the case
n = 6. We want to count the orbits of the Z_6-action. The permutation
representations and cycle symbols of the elements of Z_6 are
    0 ↦ ι = (1)(2)(3)(4)(5)(6),    cs_0 = x_1^6,
    1 ↦ (1 2 3 4 5 6),             cs_1 = x_6,
    2 ↦ (1 3 5)(2 4 6),            cs_2 = x_3^2,
    3 ↦ (1 4)(2 5)(3 6),           cs_3 = x_2^3,
    4 ↦ (1 5 3)(2 6 4),            cs_4 = x_3^2,
    5 ↦ (1 6 5 4 3 2),             cs_5 = x_6.
The cycle index is therefore
\[ C_{Z_6}(x_1, x_2, x_3, x_4, x_5, x_6) = \tfrac{1}{6}(x_1^6 + x_6 + x_3^2 + x_2^3 + x_3^2 + x_6) \]
and hence the number of necklaces with two colours of beads is
\[ C_{Z_6}(2, 2, 2, 2, 2, 2) = \tfrac{1}{6}(2^6 + 2^3 + 2 \cdot 2^2 + 2 \cdot 2) = \tfrac{84}{6} = 14. \]
These are depicted in Figure 6.8.
[Figure 6.8: The 14 rotationally distinct 2-coloured 6-bead necklaces]

The general case requires the cycle decomposition of the appropriate
permutation representation of Z_n.
Proposition 6.35 There are
\[ \frac{1}{n} \sum_{d \mid n} \varphi(d)\, k^{n/d} \]
rotationally distinct k-coloured n-bead necklaces.

Here, φ denotes Euler's totient function (Definition 2.44, page 63).
Proof Following the method in Example 6.34, we need to work out
the cycle symbols cs_i for each i ∈ Z_n. More precisely, we want the
cycle symbols for the images σ_i of each i in the symmetric group S_n.
The permutation σ_i has a cycle of the form (1 (1+i) (1+2i) ...), with
entries read modulo n. If i is coprime to n then this cycle will be of
length n, and the cycle symbol for σ_i will be x_n. If h = gcd(i, n) ≠ 1
then this cycle will be of length n/h and the permutation σ_i will
decompose as

    σ_i = (1 (1+i) (1+2i) ...)(2 (2+i) (2+2i) ...) ... (h (h+i) (h+2i) ...)

and so the cycle symbol of σ_i will be x_{n/h}^h.
Observe that the length of each cycle must be a factor of n, so the
subscripts and superscripts of the cycle symbols for the permutations
σ_i must be factors of n. Furthermore, all factors of n will occur in this
way.
So the full cycle index for this action will be
\[ \frac{1}{n} \sum_{d \mid n} m_d\, x_d^{n/d} \]
where m_d is the number of permutations σ_i that decompose as a
product of n/d disjoint cycles of length d. To calculate this, we need
to know how many integers m with 1 ≤ m ≤ n (taking m = n to represent
0 ∈ Z_n) satisfy n/gcd(m, n) = d for a given divisor d.
Such an m has gcd(m, n) = n/d, so m = (n/d) t for some t with
1 ≤ t ≤ d and gcd(t, d) = 1. There are φ(d) choices of such a t, so
m_d = φ(d) and hence the cycle index is
\[ \frac{1}{n} \sum_{d \mid n} \varphi(d)\, x_d^{n/d}. \]
Setting all the x_i = k and applying Corollary 6.33 we get
\[ \frac{1}{n} \sum_{d \mid n} \varphi(d)\, k^{n/d} \]
as claimed.
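The necklace formula is short enough to implement directly and compare against a brute-force orbit count. The Python sketch below is illustrative; the function names are invented, and the totient is computed naively from its definition rather than by any optimised method.

```python
from itertools import product
from math import gcd

def totient(d):
    """Euler's totient: how many t in 1..d are coprime to d."""
    return sum(1 for t in range(1, d + 1) if gcd(t, d) == 1)

def necklaces(n, k):
    """Number of k-coloured n-bead necklaces up to rotation (Proposition 6.35)."""
    total = sum(totient(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def necklaces_brute_force(n, k):
    """Count the same necklaces by listing rotation orbits explicitly."""
    seen, count = set(), 0
    for beads in product(range(k), repeat=n):
        if beads not in seen:
            count += 1
            for shift in range(n):
                seen.add(beads[shift:] + beads[:shift])
    return count

print(necklaces(6, 2))              # 14
print(necklaces_brute_force(6, 2))  # 14
```

For n = 6 and k = 2 both functions return the 14 necklaces of Example 6.34.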
We now have some useful machinery for counting colourings or other
configurations up to some form of symmetry equivalence. But the
results we've seen so far don't really tell us anything about what these
colourings actually look like.
Suppose we have a k-coloured n-bead necklace, and let C = {c_1, ..., c_k}
be a set of labels representing the k colours.
Then we represent a given colouring of the necklace with a degree-n
monomial c_1^{m_1} c_2^{m_2} ... c_k^{m_k}, where m_1, ..., m_k are non-negative
integers with m_1 + ... + m_k = n. This denotes a colouring where m_1
beads are coloured with colour c_1, m_2 beads with colour c_2 and so on.
There is often more than one colouring with these parameters, and
we denote this by a non-negative integer coefficient in front of the
monomial in question.
For example, in the case of the 6-bead necklace in Example 6.34, we
had two colours c_1 = white and c_2 = black. Of the 2^6 = 64 possible
necklaces (ignoring rotation for the moment), one has six white beads,
six have five white beads, fifteen have four white beads, twenty have
three white beads, fifteen have two white beads, six have a single white
bead and one has only black beads. We can represent this situation
with the polynomial
    c_1^6 + 6c_1^5 c_2 + 15c_1^4 c_2^2 + 20c_1^3 c_2^3 + 15c_1^2 c_2^4 + 6c_1 c_2^5 + c_2^6.
Or, using more intuitive names for the colours,
    w^6 + 6w^5b + 15w^4b^2 + 20w^3b^3 + 15w^2b^4 + 6wb^5 + b^6.
This is an example of a generating function, a device which has many
applications in combinatorics. In this case, it happens to be equal to
(w + b)^6.
But this isn't quite what we want, because we're really interested in
something which encodes the number of rotationally distinct necklaces
with a given number of beads of each colour. With that in mind, we
modify our generating function accordingly, to get
    w^6 + w^5b + 3w^4b^2 + 4w^3b^3 + 3w^2b^4 + wb^5 + b^6.
From this we can easily read off the details of all the individual
rotationally-distinct colourings of the necklace. What we need now is
a general method for constructing this generating function from the
cycle symbols of the group action.
To see how to do this, we consider each group element individually
and look at the monomials corresponding to the necklaces which are
invariant under that element:

    g   monomials                                                                   cs_g
    0   w^6 + 6w^5b + 15w^4b^2 + 20w^3b^3 + 15w^2b^4 + 6wb^5 + b^6 = (w + b)^6      x_1^6
    1   w^6 + b^6 = (w^6 + b^6)                                                     x_6
    2   w^6 + 2w^3b^3 + b^6 = (w^3 + b^3)^2                                         x_3^2
    3   w^6 + 3w^4b^2 + 3w^2b^4 + b^6 = (w^2 + b^2)^3                               x_2^3
    4   w^6 + 2w^3b^3 + b^6 = (w^3 + b^3)^2                                         x_3^2
    5   w^6 + b^6 = (w^6 + b^6)                                                     x_6

Adding all of these together and dividing by |Z_6| = 6 we get
\[ \tfrac{1}{6}(6w^6 + 6w^5b + 18w^4b^2 + 24w^3b^3 + 18w^2b^4 + 6wb^5 + 6b^6) \]
which is exactly the generating function
    w^6 + w^5b + 3w^4b^2 + 4w^3b^3 + 3w^2b^4 + wb^5 + b^6
that we want. If we look carefully at the table above then a pattern
suggests itself: a group element with cycle symbol x_i^j contributed a
term (w^i + b^i)^j to the generating function. Summing over all group
elements and dividing by the order of the group gives us exactly
the right generating function. What we've done here is substitute
x_i = (w^i + b^i) in the expression for the cycle index C_{Z_6}(x_1, ..., x_6).
This generating function tells us, amongst other things, that there are
four rotationally distinct 2-coloured 6-bead necklaces with three white
beads and three black beads, the relevant term being 4w^3b^3.
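The substitution x_i = w^i + b^i can be carried out mechanically. The Python sketch below is an illustration restricted to two colours and rotations only; the function name and the choice to track just the exponent of w (the exponent of b being n minus that) are assumptions of the example, and `math.comb` requires Python 3.8 or later.

```python
from math import comb, gcd

def necklace_generating_function(n):
    """Coefficients of the two-colour generating function for n-bead necklaces.

    Entry m of the returned list is the number of rotationally distinct
    necklaces with m white beads and n - m black beads.
    """
    totals = [0] * (n + 1)
    for shift in range(n):
        cycles = gcd(shift, n)  # gcd(0, n) == n, so the identity has n cycles
        length = n // cycles
        # (w^length + b^length)^cycles contributes comb(cycles, j) to the
        # coefficient of w^(j * length).
        for j in range(cycles + 1):
            totals[j * length] += comb(cycles, j)
    assert all(t % n == 0 for t in totals)
    return [t // n for t in totals]

print(necklace_generating_function(6))  # [1, 1, 3, 4, 3, 1, 1]
```

Reading the list from the b^6 term up to the w^6 term gives the coefficients 1, 1, 3, 4, 3, 1, 1 of the generating function above, and they sum to the 14 necklaces of Example 6.34.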
It turns out that this procedure works in general:
Theorem 6.36 (Pólya's Enumeration Theorem) Let X be an object
with n components, each of which is assigned one of k colours c_1, ..., c_k.
The generating function describing the distinct k-colourings of X, modulo
an action of some finite group G, is given by
\[ C_G(c_1 + \cdots + c_k,\; c_1^2 + \cdots + c_k^2,\; \ldots,\; c_1^n + \cdots + c_k^n)
   = \frac{1}{|G|} \sum_{g \in G} \mathrm{cs}_g(c_1 + \cdots + c_k,\; c_1^2 + \cdots + c_k^2,\; \ldots,\; c_1^n + \cdots + c_k^n). \]

Proof Let $g \in G$ and suppose that the cycle decomposition of the
permutation representing $g$ in $S_n$ consists of a single cycle $(a_1 \dots a_n)$.
Then $\mathrm{Fix}_g(X)$ consists only of colourings in which each component of
$X$ has the same colour. The generating function in this case is
$$c_1^n + \dots + c_k^n$$
and the cycle symbol is $cs_g = x_n$.
Now suppose that the permutation representative for $g$ decomposes
as more than one disjoint cycle. We can thus regard $X$ as consisting of
several independently-coloured subobjects with respect to this group
element $g$, each corresponding to one of the cycles in the permutation
representing $g$.
A cycle of length $m$ has a generating function of the form
$$c_1^m + \dots + c_k^m$$
by the argument above, and contributes a factor $x_m$ to the cycle symbol
$cs_g$. The cycle symbol $cs_g$ is the product of all of these factors, and the
generating function for $g$ is equal to the product of the corresponding
generating functions.
Therefore the generating function for the element $g$, which describes
the $k$-colourings of $X$ which are invariant under the action of $g$, can
be obtained by substituting $x_i = c_1^i + \dots + c_k^i$ in the cycle symbol
$cs_g(x_1, \dots, x_n)$.
In order to find the generating function that takes account of the whole
$G$-action on $X$, we need to do a little more work.
For a given colour distribution $c_1^{m_1} \dots c_k^{m_k}$, we obtain the number of
inequivalent colourings by summing the appropriate term from the
generating function for each $g \in G$ and then dividing by $|G|$.
To get the generating function for all possible colourings, then, we
have to take the generating functions for all $g \in G$, sum them and
divide by $|G|$. This gives the expression
$$\frac{1}{|G|} \sum_{g \in G} cs_g(c_1 + \dots + c_k,\; c_1^2 + \dots + c_k^2,\; \dots,\; c_1^n + \dots + c_k^n)$$
as claimed.

[Photograph: ETH Bibliothek, Zurich]
George (György) Pólya (1887–1985) was born in Hungary and initially
studied language and literature for two years at the University of Budapest,
before switching to mathematics and physics. He studied abroad in Vienna
and Göttingen, completed his doctorate at Budapest with very little
supervision, and then visited Paris in 1914. There, he met the German
mathematician Adolf Hurwitz (1860–1919) who arranged an academic post for
him in Zürich.
He stayed in Zürich for a number of years, publishing prolifically on a wide
range of mathematical topics. With his fellow Hungarian mathematician
Gábor Szegő (1895–1985) he wrote an influential two-volume analysis
problem book: Aufgaben und Lehrsätze aus der Analysis, published in 1925.
In 1940 he moved to the USA, and after two years at Brown University was
appointed to a post at Stanford, where he remained for the rest of his life.
Retiring in 1953, he was appointed Professor Emeritus, and continued
teaching and researching into his nineties.
His work spanned a wide range of topics: analysis, algebra, combinatorics,
probability, and geometry in particular. His classic book on mathematical
problem solving, How to Solve It, first published in 1945, has sold over
a million copies and been translated into seventeen languages, and remains
recommended reading for any student of mathematics.

Summary

Motivated by several groups discussed in Chapter 1, particularly the
dihedral groups $D_n$,[14] the general linear groups $GL_n(\mathbb{R})$,[15] and the
symmetric groups $S_n$,[16] we formulated the concept of an action of
a group $G$ on a set $X$:[17] a function $G \times X \to X$ satisfying certain
simple criteria. We usually denote the image of a given element $x \in X$
under the action of a specific element $g \in G$ by $g \cdot x$ rather than as
the value of this function at $(g, x)$.
More precisely, this is a left action of $G$ on $X$; there is a corresponding
notion of a right action defined by a function $X \times G \to X$, in which
we denote the images $x \cdot g$. There is a bijective mapping between left
and right actions, so in principle we need only develop the theory for
left actions, and the analogous results should hold for right actions as
well.
[14] Definition 1.43, page 21. [15] Example 1.34, page 17.
[16] Definition 1.51, page 24. [17] Definition 6.1, page 212.
A group action $G \times X \to X$ determines a unique homomorphism
$G \to \mathrm{Sym}(X)$, which maps a given group element $g \in G$ to
the permutation $f_g$ of $X$ defined by $f_g(x) = g \cdot x$ for all $x \in X$.[18] If
this homomorphism is injective, we say that the action is faithful;
equivalently an action is faithful if, for any distinct $g_1, g_2 \in G$ there is
at least one $x \in X$ for which $g_1 \cdot x \neq g_2 \cdot x$.[19] Also, an action is faithful
exactly when the kernel of this homomorphism is trivial.[20]
[18] Proposition 6.2, page 214. [19] Definition 6.4, page 215.
[20] Proposition 4.35, page 116.
The trivial action is defined by $g \cdot x = x$ for all $g \in G$ and $x \in X$;
this corresponds to the homomorphism $G \to \mathrm{Sym}(X)$ where every
element of $G$ maps to the identity permutation in $\mathrm{Sym}(X)$.[21] The
trivial action is not faithful except when $G$ is the trivial group $\{e\}$.
[21] Example 6.6, page 215.
The action of the full symmetric group $\mathrm{Sym}(X)$ on a set $X$ is faithful,
and corresponds to the identity homomorphism
$\mathrm{id} : \mathrm{Sym}(X) \to \mathrm{Sym}(X)$.[22] For any subgroup $G \leq \mathrm{Sym}(X)$ we can
construct a faithful action of $G$ on $X$ by composing the canonical inclusion
map $i : G \hookrightarrow \mathrm{Sym}(X)$ with this symmetric group action. In particular,
this process yields a faithful action of the alternating group
$\mathrm{Alt}(X) < \mathrm{Sym}(X)$ on $X$, of $A_n$ on the finite set $X_n = \{1, \dots, n\}$,[23] and
a faithful action of the special orthogonal group $SO_n(\mathbb{R})$ on $\mathbb{R}^n$.[24]
[22] Example 6.5, page 215. [23] Example 6.7, page 216.
[24] Example 6.8, page 216.
By Cayley's Theorem[25] any group $G$ can be regarded as a subgroup of
$\mathrm{Sym}(G)$, the permutation group of its underlying set. This viewpoint
yields a faithful action of $G$ on itself: the left or right regular action.[26]
The group conjugation operation introduced in Chapter 3 gives another
action of a group on itself: the conjugation action in which
$g \cdot h = h^g = ghg^{-1}$ for any $g, h \in G$.[27] This action is not in general
faithful. If $G$ is abelian then the conjugation action is trivial.[28]
[25] Theorem 2.13, page 47. [26] Definition 6.9, page 217.
[27] Definition 6.11, page 217. [28] Proposition 6.13, page 218.
Unfaithful actions have nontrivial kernels, and thereby determine
nontrivial normal subgroups of the given group $G$. We can take the
quotient of $G$ by this nontrivial kernel $K$ and obtain a faithful
$G/K$-action on our chosen set.[29]
[29] Example 6.15, page 218.
An action of a group $G$ on a set $X$ determines an equivalence relation
$\sim$ on $X$: for any $x, y \in X$ we say that $x \sim y$ exactly when there is an
element $g \in G$ such that $g \cdot x = y$. This relation partitions $X$ into
equivalence classes, which we call orbits.[30] The orbit of an element
$x \in X$ is denoted $\mathrm{Orb}_G(x)$.
[30] Definition 6.16, page 219.
The orbits of the canonical $SO_3(\mathbb{R})$-action on $\mathbb{R}^3$ are concentric spheres
around the origin, and the origin itself.[31] The usual action of $D_3$ on
an equilateral triangle has three different types of orbit: the centre of
the triangle, orbits comprising triples of points lying on the triangle's
axes of symmetry, and orbits consisting of six points off the axes of
symmetry.[32] The trivial action of a group $G$ on a set $X$ partitions $X$
into single-element subsets of $X$.[33] The orbits of the conjugation action
of a group $G$ on itself are exactly the conjugacy classes of $G$.[34]
[31] Example 6.17, page 220. [32] Example 6.18, page 220.
[33] Example 6.19, page 220. [34] Example 6.20, page 221.
An action with only one orbit is said to be transitive.[35] The left (or
right) regular action of a group $G$ on itself is transitive.[36]
[35] Definition 6.21, page 221. [36] Proposition 6.22, page 221.
The stabiliser of an element $x \in X$ under a given $G$-action is the
subset of $G$ consisting of those elements that leave $x$ fixed; that is,
those $g \in G$ for which $g \cdot x = x$.[37] This subset, denoted $\mathrm{Stab}_G(x)$, is
a subgroup of $G$,[38] and is sometimes called the isotropy subgroup
of $x$. If $G$ is a finite group acting on a (not necessarily finite) set $X$,
then $|\mathrm{Orb}_G(x)|\,|\mathrm{Stab}_G(x)| = |G|$ for any $x \in X$; this fact is called
the Orbit–Stabiliser Theorem.[39] In the case of the conjugation action
of $G$ on itself, the orbit $\mathrm{Orb}_G(g)$ of some element $g \in G$ is just the
conjugacy class $[g]$ of $g$, and the stabiliser $\mathrm{Stab}_G(g)$ is the centraliser
$Z_G(g)$ of $g$ in $G$.[40] Hence, by the Orbit–Stabiliser Theorem, we have
$|[g]| = |G : Z_G(g)|$; that is, the number of elements of the conjugacy
class of $g$ is equal to the index of its centraliser in $G$.[41] We can use this
fact to show that the alternating group $A_5$ is simple.[42]
[37] Definition 6.23, page 222. [38] Proposition 6.24, page 222.
[39] Theorem 6.25, page 223. [40] Definition 4.51, page 124.
[41] Corollary 6.27, page 224. [42] Proposition 6.28, page 224.
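The relation between conjugacy classes and centralisers can be explored computationally. The Python sketch below is an illustration under stated assumptions, not the argument used in the text: permutations of $\{0, \dots, 4\}$ are stored as tuples, and all function names are hypothetical. It checks $|[g]|\,|Z_G(g)| = |G|$ for every class of $A_5$ and then uses the class sizes to list the possible orders of a normal subgroup, which must be a union of conjugacy classes containing the identity and must have order dividing 60.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Return the permutation 'q first, then p' (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0


A5 = [p for p in permutations(range(5)) if is_even(p)]

class_sizes = []
seen = set()
for g in A5:
    if g in seen:
        continue
    conjugacy_class = {compose(compose(h, g), inverse(h)) for h in A5}
    centraliser = [h for h in A5 if compose(h, g) == compose(g, h)]
    # Orbit-Stabiliser Theorem for the conjugation action
    assert len(conjugacy_class) * len(centraliser) == len(A5)
    seen |= conjugacy_class
    class_sizes.append(len(conjugacy_class))

class_sizes.sort()
non_identity = class_sizes[1:]   # the identity forms the only class of size 1
candidate_orders = {
    1 + sum(choice)
    for r in range(len(non_identity) + 1)
    for choice in combinations(non_identity, r)
}
normal_subgroup_orders = sorted(n for n in candidate_orders if len(A5) % n == 0)
print(class_sizes, normal_subgroup_orders)
```

Only the orders 1 and 60 survive both conditions, which is consistent with $A_5$ having no proper nontrivial normal subgroup.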
An important result known as Burnside's Lemma enables us to count
the orbits of a finite group action. If we denote by $\mathrm{Fix}_g(X)$ the subset
of $X$ consisting of those elements fixed by an element $g \in G$ (that is,
those $x \in X$ for which $g \cdot x = x$) then the number of distinct orbits is
given by $\frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}_g(X)|$.[43] We can use this result to solve certain
types of problem in enumerative combinatorics: for example, counting
the number of six-sided dice satisfying various properties.[44]
[43] Theorem 6.29, page 227. [44] Example 6.30, page 229.
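Burnside's Lemma is easy to apply by machine when the group and the set are small. The following Python sketch is illustrative only: the face numbering, the two generating quarter-turns, and the function names are assumptions made for this example and do not come from the text. It builds the rotation group of the cube as permutations of the six faces and counts orbits by averaging the number of fixed points.

```python
from itertools import permutations, product

# Assumed face numbering: 0 top, 1 bottom, 2 front, 3 back, 4 left, 5 right.
# A rotation is a tuple g in which face i is carried to position g[i].
QUARTER_TURN_VERTICAL = (0, 1, 5, 4, 2, 3)    # front -> right -> back -> left
QUARTER_TURN_HORIZONTAL = (2, 3, 1, 0, 4, 5)  # top -> front -> bottom -> back


def compose(g, h):
    """Return the rotation 'h first, then g'."""
    return tuple(g[h[i]] for i in range(6))


def generate_group(generators):
    """Close the generators under composition (enough for a finite group)."""
    identity = tuple(range(6))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return sorted(group)


def act(g, labelling):
    """Move the label on face i to face g[i]."""
    moved = [None] * 6
    for face, label in enumerate(labelling):
        moved[g[face]] = label
    return tuple(moved)


def count_orbits(group, points):
    """Burnside's Lemma: the average number of points fixed by a group element."""
    fixed_total = sum(1 for g in group for x in points if act(g, x) == x)
    return fixed_total // len(group)


ROTATIONS = generate_group([QUARTER_TURN_VERTICAL, QUARTER_TURN_HORIZONTAL])
all_dice = list(permutations(range(1, 7)))
# Dice whose opposite faces sum to 7 (opposite pairs are faces 0/1, 2/3, 4/5).
standard_dice = [d for d in all_dice if d[0] + d[1] == d[2] + d[3] == d[4] + d[5] == 7]
two_colourings = list(product("wb", repeat=6))

print(len(ROTATIONS), count_orbits(ROTATIONS, all_dice),
      count_orbits(ROTATIONS, standard_dice), count_orbits(ROTATIONS, two_colourings))
```

Each set passed to `count_orbits` must be closed under the rotations; this holds here because rotations preserve pairs of opposite faces.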
Suppose that a permutation $\sigma \in S_n$ can be decomposed as a product
of disjoint cycles, $a_1$ with length 1, $a_2$ with length 2, and so forth, for
some non-negative integers $a_1, a_2, \dots$. The type of $\sigma$ is the partition
$[1^{a_1}, 2^{a_2}, \dots, n^{a_n}]$, where $a_1 + 2a_2 + \dots + na_n = n$. The cycle symbol
$cs_\sigma$ is the formal monomial $x_1^{a_1} x_2^{a_2} \dots x_n^{a_n}$.[45]
[45] Definition 6.31, page 231.
Given a finite group $G$ acting on a finite set $X$ via an action
homomorphism $G \to \mathrm{Sym}(X) \cong S_n$, the cycle index of the action is
the polynomial $C_G(x_1, \dots, x_n) = \frac{1}{|G|} \sum_{g \in G} cs_g(x_1, \dots, x_n)$.[46] Now
consider an object $X$ with $n$ components, each of which is assigned one of
$k$ colours. Then the Cycle Index Theorem, a corollary to Burnside's
Lemma, states that the number of distinct $k$-colourings of $X$ modulo a
given $G$-action is equal to $C_G(k, \dots, k)$.[47]
[46] Definition 6.32, page 231. [47] Corollary 6.33, page 231.
For example, we can use the Cycle Index Theorem to count the number
of different $k$-colourings of an $n$-bead necklace.[48],[49]
We can obtain more detailed information about these colourings by
constructing a generating function. Pólya's Enumeration Theorem[50]
says that the generating function for the $k$-colouring problem on an
$n$-component object $X$, modulo an action by a finite group $G$, is given
by the cycle index $C_G(c_1 + \dots + c_k,\; c_1^2 + \dots + c_k^2,\; \dots,\; c_1^n + \dots + c_k^n)$.
[48] Example 6.34, page 232. [49] Proposition 6.35, page 232.
[50] Theorem 6.36, page 235.

References and further reading


J H Conway, N D Elkies, and J L Martin, The Mathieu group $M_{12}$ and its pseudogroup extension $M_{13}$,
Experimental Mathematics 15.2 (2006) 223–236
A detailed discussion of the construction of the Mathieu group $M_{12}$.
A S Crans, T M Fiore, and R Satyendra, Musical actions of dihedral groups, The American Mathematical
Monthly 116.6 (2009) 479–495
An investigation of the applications of group theory to music. The dihedral group $D_{12}$ acts musically
by inversion and transposition, and also via three invertible operations on musical triads: the
parallel operation $P$ mapping a major triad to the corresponding minor triad, the leading tone
exchange $L$ which raises the root note by a semitone, and the relative operation $R$ mapping a major
triad to its relative minor.

Exercises
6.1 Show that $C_{G \times H} = C_G C_H$ for any groups $G$ and $H$.
6.2 Suppose that the necklaces in Example 6.34 can also be flipped over. That is, replace the $\mathbb{Z}_6$-action
with an appropriate $D_6$-action. Configurations of this type are sometimes called bracelets. How
many distinct 2-coloured 6-bead bracelets are there?
6.3 Modify or extend the proof of Proposition 6.35, replacing $\mathbb{Z}_n$ with $D_n$, to calculate the number of
$k$-coloured $n$-bead bracelets.
6.4 Convince yourself that $A_4$ is 2-transitive, and that $A_5$ is 3-transitive.
6.5 Try to construct a Steiner system of type $S(2, 3, 4)$ and convince yourself that no such system exists.
7 Finite groups

The known is finite, the unknown infinite; intellectually we stand on an islet
in the midst of an illimitable ocean of inexplicability. Our business in every
generation is to reclaim a little more land, to add something to the extent
and the solidity of our possessions.
Thomas Huxley (1825–1895),
On the Reception of the Origin of Species,
in: Francis Darwin, The Life and Letters of Charles Darwin (1887) II 204

In this chapter we will study a number of important results related
to the classification of finite groups. Recall that Lagrange's Theorem[1]
says that for any finite group $G$, the order $|H|$ of any subgroup
$H \leq G$ must divide $|G|$. But the converse doesn't always hold: for
example, $|A_4| = 12$ but $A_4$ has no subgroup of order 6. Cauchy's
Theorem[2] is a partial converse, ensuring the existence of a (cyclic)
subgroup of order $p$ if $p$ is a prime factor of $|G|$.
[1] Theorem 2.30, page 54. [2] Theorem 2.37, page 57.
First we will investigate some important theorems due to the Norwegian
mathematician Ludwig Sylow (1832–1918),[3] that extend Cauchy's
Theorem to $p$-subgroups: subgroups of order $p^n$.
[3] Pronounced, approximately, "see-loff" rather than "sigh-low".
Next we will study how subgroups can be nested inside each other
(like algebraic matryoshka dolls), and prove some important results
about these series of subgroups, such as Schreier's Refinement Theorem[4]
and the Jordan–Hölder Theorem.[5] Some of this work will lead
into our study of Galois Theory in Chapter 11.
[4] Theorem 7.27, page 254. [5] Theorem 7.34, page 260.
After that, we will introduce the semidirect product $G = H \rtimes K$ of two
groups $H$ and $K$, a generalisation of the direct product $H \times K$, which
will lead us to the study of group extensions: a general method of
constructing larger groups from smaller ones.
Finally, we will use some of these techniques to classify, up to
isomorphism, all the groups with order less than 32.

7.1 Sylow's Theorems

As every educated person knows the Pythagorean Theorem so does every
mathematician speak of Abel's Theorem and of Sylow's Theorem.
Georg Frobenius (1849–1917),
quoted in: G A Miller, Professor Ludvig Sylow,
Science 49:1256 (1919) 85

We want to find a partial converse to Lagrange's Theorem;
that is, for a finite group $G$, we want to find which factors of $|G|$ are
represented as the order of some subgroup of $G$. Cauchy's Theorem
gives one answer to this question: it guarantees that at least one (cyclic)
subgroup of order $p$ exists for any prime factor $p$ of $|G|$. But can we
do better than this?

The answer is yes, due to three theorems originally proved by the
Norwegian mathematician Ludwig Sylow (1832–1918), and which we
will now study.

[Photograph: Oberwolfach Photo Collection / Wolfgang Gaschütz]
Born in Christiania (now Oslo), the eldest of ten siblings, Peter Ludwig
Mejdell Sylow (1832–1918) studied at the University of Christiania,
qualifying as a mathematics and science teacher in 1856. No suitable
university post being available, he taught in secondary schools for the
next forty-two years.
He wasn't a very good schoolteacher: he found it difficult to keep order, and
lacked enthusiasm for the elementary topics he was required to teach.
In 1861 he visited Paris and Berlin, learning of recent advances in
mechanics, geometry and algebra. The next year he lectured at the University
of Christiania, covering for his colleague Ole Jacob Broch (1818–1889) who
had been elected to parliament. Broch was re-elected in 1865, but Sylow's
school refused to grant him further leave.
Nevertheless, Sylow continued to pursue mathematical research: in 1872 he
published his celebrated theorems on orders of subgroups, and between 1873
and 1881 he and Sophus Lie (1842–1899) published the collected works of
their fellow Norwegian mathematician Niels Henrik Abel (1802–1829).
Lie eventually succeeded in getting him appointed to a chair at Christiania
in 1898, at the advanced age of 65. He occupied this post for a further twenty
years until his death in 1918, successfully undertaking further research in
group theory and elliptic functions.

A good place to start is with our first counterexample $A_4$ which, as we
saw in Example 2.36, has no subgroup of order 6, even though
$6 \mid 12 = |A_4|$. Cauchy's Theorem ensures the existence of cyclic subgroups of
order 2 and 3, because these are the prime factors of $|A_4| = 12 = 2^2 \cdot 3$.
In fact, there are three subgroups of order 2:

$\{\iota, (1\,2)(3\,4)\}$, $\{\iota, (1\,3)(2\,4)\}$, $\{\iota, (1\,4)(2\,3)\}$.

There are four subgroups of order 3:

$\{\iota, (1\,2\,3), (1\,3\,2)\}$, $\{\iota, (1\,2\,4), (1\,4\,2)\}$,
$\{\iota, (1\,3\,4), (1\,4\,3)\}$, $\{\iota, (2\,3\,4), (2\,4\,3)\}$.

There is also one subgroup of order 4:

$\{\iota, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$.

What patterns can we find here? The subgroup of order 4 isn't
predicted by Cauchy's Theorem, but it exists nonetheless. Meanwhile, we
know from Example 2.36 that no subgroup of order 6 exists. What
property does 4 have that 6 doesn't? Well, neither is prime, but
$4 = 2^2$ is at least a power of a prime. This, it turns out, is significant.
To explore this idea further, we need to introduce another bit of
terminology.
Definition 7.1 Let $p \in \mathbb{N}$ be prime, and let $G$ be a group of order
$|G| = p^n$ for some integer $n \geq 0$. Then we say $G$ is a $p$-group.
An immediate consequence of this definition and Proposition 2.34 is
that every element of a $p$-group must have order a power of $p$. Indeed,
we could happily have defined $p$-groups as groups for which every
element has order $p^k$ for some $k \geq 0$, and some books do just that.
There's another related concept that we'll use in this section:
Definition 7.2 Let $G$ be a group, and let $H \leq G$ be a subgroup with
$|H| = p^n$ for some prime $p \in \mathbb{N}$ and some integer $n \geq 0$. Then $H$ is
a $p$-subgroup of $G$; that is, a subgroup of $G$ that is itself a $p$-group.
As we've seen, the alternating group $A_4$ doesn't have subgroups of all
possible orders, but it does have $p$-subgroups of all possible orders.
The next smallest counterexample to the converse of Lagrange's Theorem
is $SL_2(3)$, the group of $2 \times 2$ matrices with elements from $\mathbb{F}_3$ and
determinant 1. This has order $24 = 2^3 \cdot 3$, and hence by Cauchy's
Theorem should have at least one cyclic subgroup of order 2 and

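another of order 3. In fact, it has one subgroup
$$\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right\} \cong \mathbb{Z}_2$$
of order 2, three subgroups
$$\left\langle \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 2 & 2 \\ 2 & 1 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 1 & 2 \\ 2 & 2 \end{pmatrix} \right\rangle \cong \mathbb{Z}_4$$
of order 4, one subgroup
$$\left\langle \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}, \begin{pmatrix} 2 & 2 \\ 2 & 1 \end{pmatrix} \right\rangle \cong Q_8$$
of order 8, four subgroups
$$\left\langle \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 0 & 1 \\ 2 & 2 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 0 & 2 \\ 1 & 2 \end{pmatrix} \right\rangle \cong \mathbb{Z}_3$$
of order 3, and four subgroups
$$\left\langle \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 0 & 1 \\ 2 & 2 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right\rangle,\quad
  \left\langle \begin{pmatrix} 0 & 2 \\ 1 & 2 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \right\rangle \cong \mathbb{Z}_6$$
of order 6. By Lagrange's Theorem, it could also have a subgroup
of order 12, but it turns out not to. Again, however, it does have
$p$-subgroups of all possible orders.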
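The counts of cyclic subgroups quoted for $SL_2(3)$ can be recovered from the orders of its elements, since a cyclic subgroup of order $d$ contains exactly $\varphi(d)$ generators. The Python sketch below is an illustration only: it assumes matrices are stored row-major as 4-tuples of residues modulo 3, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import gcd

P = 3
IDENTITY = (1, 0, 0, 1)


def multiply(a, b):
    """Multiply two 2x2 matrices over F_3, each stored row-major as a 4-tuple."""
    return (
        (a[0] * b[0] + a[1] * b[2]) % P, (a[0] * b[1] + a[1] * b[3]) % P,
        (a[2] * b[0] + a[3] * b[2]) % P, (a[2] * b[1] + a[3] * b[3]) % P,
    )


def order(a):
    """Smallest n >= 1 with a^n equal to the identity matrix."""
    n, power = 1, a
    while power != IDENTITY:
        power = multiply(power, a)
        n += 1
    return n


def euler_phi(d):
    return sum(1 for j in range(1, d + 1) if gcd(j, d) == 1)


# All 2x2 matrices over F_3 with determinant 1.
SL2_3 = [m for m in product(range(P), repeat=4) if (m[0] * m[3] - m[1] * m[2]) % P == 1]
order_counts = Counter(order(m) for m in SL2_3)
# Number of cyclic subgroups of each order d > 1.
cyclic_subgroups = {d: order_counts[d] // euler_phi(d) for d in sorted(order_counts) if d > 1}
print(len(SL2_3), dict(order_counts), cyclic_subgroups)
```

This only detects cyclic subgroups, so the subgroup of order 8 isomorphic to $Q_8$ does not appear in the output.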
This happens to be true in general, and to prove it we first need to take
a little detour into the realm of group actions. Let $G$ be a (possibly
infinite) group acting on a finite set $X$. Then there are finitely many
orbits in $X$ under this action, and these orbits partition $X$:
$$|X| = \sum_{i=1}^{r} |\mathrm{Orb}_G(x_i)| \tag{7.1}$$
where $x_1, \dots, x_r$ are representative elements from each orbit. Denote
by $\mathrm{Fix}_G(X)$ the set of elements of $X$ that are fixed by every element
of $G$. Each of these elements must therefore be a single-element orbit
of the $G$-action, so $\mathrm{Fix}_G(X)$ is the union of these singleton orbits.
Suppose there are $s$ of these, with $0 \leq s \leq r$, and order the representatives
so that $x_1, \dots, x_s$ are the fixed points. Then $|\mathrm{Fix}_G(X)| = s$ and
we can rewrite (7.1) as
$$|X| = |\mathrm{Fix}_G(X)| + \sum_{i=s+1}^{r} |\mathrm{Orb}_G(x_i)| \tag{7.2}$$
In the special case where $G$ is a $p$-group, we obtain the following
useful fact.
Proposition 7.3 Let $G$ be a $p$-group of order $p^k$ for some prime $p$ and
positive integer $k$, and suppose that $G$ acts on some finite set $X$. Then
$|X| \equiv |\mathrm{Fix}_G(X)| \pmod{p}$.

Proof By the Orbit–Stabiliser Theorem[6] we know that $|\mathrm{Orb}_G(x)|$ is
a factor of $|G| = p^k$ for any $x \in X$. The orbits in (7.2) with
$s+1 \leq i \leq r$ each have more than one element, so $p$ divides
$|\mathrm{Orb}_G(x_i)|$ for $s+1 \leq i \leq r$. Hence
$$|X| = |\mathrm{Fix}_G(X)| + \sum_{i=s+1}^{r} |\mathrm{Orb}_G(x_i)| \equiv |\mathrm{Fix}_G(X)| \pmod{p}$$
as claimed.
[6] Theorem 6.25, page 223.
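A small numerical check of Proposition 7.3 may help make it concrete. The example below is an assumption chosen for illustration and does not appear in the text: the cyclic group of order $p^k$ acts by rotation on the set of colourings of $p^k$ beads with $c$ colours, so that the fixed points are exactly the $c$ single-colour colourings. The function names are hypothetical.

```python
from itertools import product


def rotate(colouring, r):
    """Rotate a tuple of bead colours by r places."""
    return colouring[r:] + colouring[:r]


def count_points_and_fixed_points(p, k, colours):
    """Return (|X|, |Fix_G(X)|) for the rotation action of the cyclic
    group of order p^k on colourings of p^k beads."""
    n = p ** k
    points = list(product(range(colours), repeat=n))
    fixed = [x for x in points if all(rotate(x, r) == x for r in range(n))]
    return len(points), len(fixed)


for p, k, colours in [(3, 1, 2), (2, 2, 3), (5, 1, 2), (3, 2, 2)]:
    size, fixed = count_points_and_fixed_points(p, k, colours)
    # Proposition 7.3: |X| and |Fix_G(X)| agree modulo p.
    assert (size - fixed) % p == 0
    print(p, k, colours, size, fixed)
```

In this example the congruence reads $c^{p^k} \equiv c \pmod{p}$, which for $k = 1$ is Fermat's Little Theorem.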

What this tells us is that the modulo-$p$ congruence class of the number
of elements in $X$ is determined solely by the single-element orbits; that
is, the points that are fixed by every element of $G$.
We are now ready to prove the fact we mentioned earlier: that a finite
group $G$ has $p$-subgroups of all possible orders for any prime factor $p$
of $|G|$. We will call this Sylow's First Theorem.[7]
[7] This is actually a slightly stronger version of what most books call
Sylow's First Theorem. Also, there isn't a complete consensus on the
numbering of Sylow's Theorems: some books combine them into one large
theorem with several parts, while others divide them into three or sometimes
four separate theorems; neither is there complete agreement on the order in
which they should be presented. Here we will state and prove them as three
separate but closely-related theorems, in what seems these days to be the
most common order.
Theorem 7.4 (Sylow's First Theorem) Let $G$ be a finite group of order
$|G| = p^k m$, where $p \in \mathbb{Z}$ is prime, $k \in \mathbb{Z}$ is non-negative, and $m \in \mathbb{N}$
with $p \nmid m$. Then $G$ contains a subgroup of order $p^i$ for all $0 \leq i \leq k$, and
each subgroup of order $p^i$ is a normal subgroup of some subgroup of order
$p^{i+1}$ for $0 \leq i < k$.
Proof We will proceed by induction on $i$. The case $i = 0$ is trivial, and
the case $i = 1$ holds by Cauchy's Theorem.[8] Suppose, then, that $H$ is
a $p$-subgroup of $G$ of order $|H| = p^i$ for some $i \geq 1$. Let $S$ be the set
of left cosets of $H$ in $G$, and define a left action of $H$ on $S$ by
$$h \cdot (gH) = (hg)H.$$
[8] Theorem 2.37, page 57.
By Proposition 7.3 the number of cosets in $S$ is congruent modulo $p$ to
the number that are fixed by the action of $H$:
$$|S| \equiv |\mathrm{Fix}_H(S)| \pmod{p}.$$
For a coset $gH$ to be fixed by this action means that $h \cdot (gH) = (hg)H = gH$
for all $h \in H$. Proposition 2.26 tells us that this is equivalent to
saying that $hg \in gH$ for all $h \in H$, which is the same as requiring that
$g^{-1}hg \in H$ for all $h \in H$. This, in turn, is equivalent to saying that $g$
lies in the normaliser $N_G(H)$ of $H$ in $G$.
So, the cosets $gH$ fixed by this action are exactly those for which
$g \in N_G(H)$, and hence $\mathrm{Fix}_H(S) = \{ gH : g \in N_G(H) \}$. We find that
$|\mathrm{Fix}_H(S)| = |N_G(H) : H|$.
Looking at this another way, the left cosets of $H$ in $G$ are all the same
size and partition $G$, so $|S| = |G : H|$. Putting all this together we get
$$|G : H| = |S| \equiv |\mathrm{Fix}_H(S)| = |N_G(H) : H| \pmod{p}. \tag{7.3}$$
Recall[9] that the normaliser $N_G(H)$ is the largest subgroup of $G$ in
which $H$ is normal, so $H \trianglelefteq N_G(H)$. Therefore it makes sense to talk
about the quotient group $N_G(H)/H$.
[9] Definition 4.46, page 122.
Also, $H$ is a $p$-subgroup of $G$, so $|H| = p^i$ for some $i \geq 1$. So if
$i < k$ the index $|G : H|$ is a multiple of $p$, and by (7.3) it follows that
$|N_G(H) : H|$ is also a multiple of $p$. Thus $|N_G(H)/H|$ is a multiple of $p$
too, and by Cauchy's Theorem $N_G(H)/H$ has a subgroup of order $p$.
All subgroups of $N_G(H)/H$ are of the form $K/H$ for some subgroup
$K$ such that $H \leq K \leq N_G(H)$. So there exists some subgroup $K/H$
in $N_G(H)/H$ of order $p$, which means that $|K : H| = p$. Hence
$|K| = p|H| = p^{i+1}$. This completes the inductive step, and hence $G$ contains
$p$-subgroups of all possible orders.
Finally, as $H$ is normal in $N_G(H)$ it must be normal in $K \leq N_G(H)$.
In many cases, therefore, Sylow's First Theorem gives us more
information about orders of subgroups than Cauchy's Theorem.
Example 7.5 Suppose that $G$ is a group of order $12 = 2^2 \cdot 3$. Then
Cauchy's Theorem tells us that $G$ has (cyclic) subgroups of orders 2
and 3, because these are the prime factors of 12. But Sylow's First
Theorem tells us that $G$ must have a subgroup of order $4 = 2^2$ as
well. Not only that, each subgroup of order 2 is contained (and is
normal) in a subgroup of order 4.

Example 7.6 Let $G$ be a group of order $8 = 2^3$. We know from
Proposition 2.40 that, up to isomorphism, there are five different
such groups. Cauchy's Theorem tells us that $G$ must have at least
one subgroup of order 2, but Sylow's First Theorem also ensures
the existence of subgroups of order 4. Also, these order-4
subgroups must be normal in $G$.
Of course, we know this last bit of information already because by
Proposition 3.12 any index-2 subgroup must be normal. If, however,
$|G| = 27 = 3^3$ then $G$ must have subgroups of order 3 and 9, and the
order-9 subgroups must all be normal.

We want to know more about these subgroups: how are they related
and how many are there? Sylow's other theorems give answers to
these questions; to motivate them, we'll look carefully at some small
examples.
Example 7.7 The dihedral group $D_3$ has order $6 = 2 \cdot 3$, and hence
by Cauchy's Theorem (and also by Sylow's First Theorem) it must
have at least one subgroup of order 2 and at least one of order 3. In
fact, we know from earlier discussions that it has three subgroups
$$M_1 = \{e, m_1\}, \quad M_2 = \{e, m_2\} \quad\text{and}\quad M_3 = \{e, m_3\}$$
of order 2. These represent the reflections in the three axes of the
equilateral triangle. We might ask how these are related, and the
answer is given by Example 3.5: the three reflections $m_1$, $m_2$ and $m_3$
are all conjugate to each other, and the identity element $e$ forms a
singleton conjugacy class on its own.
Upon further inspection, then, we can reconstruct the subgroup
$M_2 = \{e, m_2\}$ from the subgroup $M_1 = \{e, m_1\}$ by conjugating each
element by the $2\pi/3$ rotation $r$:
$$e = r e r^{-1}, \qquad m_2 = r m_1 r^{-1}.$$
Similarly, we can reconstruct the subgroup $M_3 = \{e, m_3\}$ by
conjugating with the $4\pi/3$ rotation $r^2 = r^{-1}$:
$$e = r^{-1} e r, \qquad m_3 = r^{-1} m_1 r.$$
So, these subgroups are conjugate to each other:
$$M_2 = r M_1 r^{-1} \quad\text{and}\quad M_3 = r^{-1} M_1 r.$$
There is only one subgroup of order 3: the rotation subgroup
$R_3 = \{e, r, r^2\}$. This is a normal subgroup and is hence conjugate only to itself.
Let's look at a slightly larger example: the symmetric group $S_4$.
Example 7.8 The symmetric group $S_4$ has order $24 = 2^3 \cdot 3$, and
hence by Sylow's First Theorem it must have at least one subgroup
each of orders 2, 3, 4 and 8. In fact, it has three subgroups of order 8,
each isomorphic to the dihedral group $D_4$:
$$\langle (1\,2\,3\,4), (1\,3) \rangle, \quad \langle (1\,2\,4\,3), (1\,4) \rangle \quad\text{and}\quad \langle (1\,3\,2\,4), (1\,2) \rangle.$$
These are all conjugate to each other; to see this, we only have to
check the two generators for each subgroup:
$$(3\,4)(1\,2\,3\,4)(3\,4) = (1\,2\,4\,3), \qquad (3\,4)(1\,3)(3\,4) = (1\,4),$$
$$(2\,3)(1\,2\,3\,4)(2\,3) = (1\,3\,2\,4), \qquad (2\,3)(1\,3)(2\,3) = (1\,2).$$
There should also be at least one subgroup of order 3; in fact there
are four:
$$\langle (1\,2\,3) \rangle, \quad \langle (1\,2\,4) \rangle, \quad \langle (1\,3\,4) \rangle \quad\text{and}\quad \langle (2\,3\,4) \rangle.$$
These are also conjugate to each other:
$$(3\,4)(1\,2\,3)(3\,4) = (1\,2\,4), \qquad (2\,3)(1\,2\,4)(2\,3) = (1\,3\,4),$$
$$(1\,2)(1\,3\,4)(1\,2) = (2\,3\,4), \qquad (1\,4)(2\,3\,4)(1\,4) = (1\,2\,3).$$

The pattern we're starting to see here is that if $p$ is a prime factor of
$|G|$, then the maximal $p$-subgroups of $G$ are conjugate to each other.
Does this happen for smaller $p$-subgroups?
Example 7.9 The group $S_4$ has seven subgroups of order $4 = 2^2$:
$$\langle (1\,2\,3\,4) \rangle, \quad \langle (1\,2\,4\,3) \rangle, \quad \langle (1\,3\,2\,4) \rangle,$$
$$\langle (1\,3), (2\,4) \rangle, \quad \langle (1\,4), (2\,3) \rangle, \quad \langle (1\,2), (3\,4) \rangle,$$
$$\langle (1\,2)(3\,4), (1\,3)(2\,4) \rangle.$$
The first three of these are isomorphic to the cyclic group $\mathbb{Z}_4$, while
the remaining four are isomorphic to the Klein group $V \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
But conjugation by an element $g \in G$ yields isomorphisms $f_g : H \to K$,
where $f_g(h) = ghg^{-1}$ for all $h \in H$, from one subgroup $H$ to some
other subgroup $K$. And since not all of the order-4 subgroups are
isomorphic, at least some of them can't be conjugate to each other.

So this pattern only seems to work for maximal $p$-subgroups.
Subgroups of this type are named after Ludwig Sylow himself:
Definition 7.10 Let $G$ be a finite group of order $|G| = p^k m$ for some
prime $p \in \mathbb{N}$, some integer $k \geq 0$ and some $m \in \mathbb{N}$ such that $p \nmid m$.
Then a $p$-subgroup of order $p^k$ is said to be a Sylow $p$-subgroup.

The pattern we see from the above examples is that for any prime factor
$p$ of $|G|$, the Sylow $p$-subgroups are conjugate to each other. This is
true in general, a result that we will call Sylow's Second Theorem:
Theorem 7.11 (Sylow's Second Theorem) Let $G$ be a finite group with
order $|G| = p^k m$ for some prime $p$, non-negative integer $k$ and positive
integer $m$ not divisible by $p$. If $H$ and $K$ are Sylow $p$-subgroups of $G$, then
$H$ and $K$ are conjugate. That is, there exists some element $g \in G$ such that
$gHg^{-1} = K$.

Proof If $|G| = p^k$ then $G$ is itself the unique Sylow $p$-subgroup, is
obviously closed under conjugation, and the result follows immediately.
Suppose otherwise, that $G$ is not a $p$-group, and let $H$ and $K$ be two
Sylow $p$-subgroups of $G$, with $|H| = |K| = p^k$. Let $S$ be the set of left
cosets of $H$ in $G$, and let $K$ act on $S$ by left multiplication:
$$k \cdot (gH) = (kg)H$$
for all $k \in K$ and $g \in G$. We know that $K$ is a finite $p$-group, so by
Proposition 7.3 we have
$$|S| \equiv |\mathrm{Fix}_K(S)| \pmod{p}.$$
Observe that $|S| = |G : H| = |G|/|H| = m$. By the hypothesis, $p \nmid m$
and so $|S| \not\equiv 0 \pmod{p}$. Therefore $|\mathrm{Fix}_K(S)| \neq 0$ and so there is at
least one left coset $gH$ in $S$ that is fixed by the $K$-action. This means
that $(kg)H = gH$ for all $k \in K$. By Proposition 2.26 this is equivalent
to saying that $kg \in gH$ for all $k \in K$, and hence $K \subseteq gHg^{-1}$. But
$|K| = |gHg^{-1}|$ and hence $K = gHg^{-1}$, so $K$ and $H$ are conjugate.
So all the Sylow $p$-subgroups of a finite group $G$ are conjugate to
each other. If $G$ has only one Sylow $p$-subgroup, then it is mapped to
itself by conjugation by every element of $G$, and is therefore normal.
We saw this in Example 7.7: $D_3$ has only one Sylow 3-subgroup, the
rotation subgroup $R_3$, which we already know is normal. But this
happens to be true in general:
Corollary 7.12 Let $G$ be a finite group, and let $p$ be a prime factor of $|G|$.
Then $G$ has exactly one Sylow $p$-subgroup $H$ if and only if $H$ is normal.
This leads us neatly to the second of our questions about Sylow
$p$-subgroups: how many are there?

In order to explore this question, we'll introduce a couple of useful bits
of notation. Suppose that $G$ is a finite group, and $p$ is a prime integer
that divides $|G|$. Let $\mathrm{Syl}_p(G)$ denote the set of Sylow $p$-subgroups
contained in $G$, and let $n_p = |\mathrm{Syl}_p(G)|$.
For example, if $G = D_3$ then $n_2 = 3$ and $n_3 = 1$ since $D_3$ has three
Sylow 2-subgroups and one Sylow 3-subgroup. Table 7.1 lists some
examples of small finite groups.

G          |G|    p    n_p    |G:H|
D_3          6    2     3       3
                  3     1       2
A_4         12    2     1       3
                  3     4       4
Z_12        12    2     1       3
                  3     1       4
S_4         24    2     3       3
                  3     4       8
SL_2(3)     24    2     1       3
                  3     4       8
Z_10        10    2     1       5
                  5     1       2
D_5         10    2     5       5
                  5     1       2
Table 7.1: The number of Sylow p-subgroups H contained in a group G

What patterns can we see in these examples? All of them have an odd
number of Sylow 2-subgroups. The first five examples all have either
one or four Sylow 3-subgroups, and the last two examples both have
a single Sylow 5-subgroup.
The key observation here is to count modulo $p$. Given that crucial
steps in the proofs of Sylow's First and Second Theorems required us
to count cosets modulo $p$, in retrospect this shouldn't be a colossal
surprise. Counting modulo $p$ is apparently a fundamental aspect of
working with $p$-subgroups. Looking at these examples through a
modulo-$p$ lens, we find that in every case the number $n_p$ of Sylow
$p$-subgroups is congruent to 1 modulo $p$.
Also, in each case n p is a factor of the order of the group in question.
But more than that, it is a factor of the index of the given Sylow
psubgroup. Both of these are true in general, by a result we will call
Sylows Third Theorem.
Theorem 7.13 (Sylows Third Theorem) Let G be a finite group of
order | G | = pk m where p is prime, k is a non-negative integer, and m is a
positive integer not divisible by p. Then n p 1 (mod p) and n p |m.

Proof As with Sylows first two Theorems, we will study the action
of a particular group on a suitably chosen set, and then apply (7.2).
To prove the first statement, suppose H is a Sylow psubgroup of G
and let it act on Syl p ( G ) by conjugation. For any Sylow psubgroup K
to be fixed under this action, we require hKh1 = K for all h H; the
subgroup H itself is certainly fixed in this way.
Suppose that K is any Sylow psubgroup fixed under conjugation by
H. Then H is contained in the normaliser NG (K ) of K in G; clearly
also K NG (K ), so both H and K are Sylow psubgroups in NG (K ),
and by Sylows Second Theorem they must be conjugate in NG (K ).
But K is normal in, and hence conjgate to itself in, its own normaliser
NG (K ). So K = H and thus the only Sylow psubgroup in Syl p ( G )
fixed by conjugation with H is H itself. Hence | Fix H (Syl p ( G ))| = 1
and by (7.2) we have, as required,

n p = | Syl p ( G )| | Fix H (Syl p ( G ))| = 1 (mod p).


To show that $n_p \mid m$ we let the full group $G$ act on $\operatorname{Syl}_p(G)$ by conjugation.
By Sylow's Second Theorem, all the Sylow $p$-subgroups of
$G$ are conjugate to each other, so the conjugation action has only a
single orbit, namely the whole of $\operatorname{Syl}_p(G)$. By the Orbit–Stabiliser
Theorem (Theorem 6.25, page 223), the number of elements in an orbit divides the order of
the group, so $n_p = |\operatorname{Syl}_p(G)|$ must divide $|G| = p^k m$. Since we've just
shown that $n_p \equiv 1 \pmod{p}$, and hence $n_p$ is coprime to $p$, the only
remaining possibility is that $n_p \mid m$, as claimed.
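The two conditions in Sylow's Third Theorem are easy to apply mechanically. The following Python sketch lists the values of $n_p$ that the theorem permits for a group of a given order. The function name `possible_sylow_counts` is our own choice, `p` is assumed to be prime, and the code only enumerates candidates: it does not decide which of them actually occur for a particular group.

```python
def possible_sylow_counts(order, p):
    """List the values of n_p allowed by Sylow's Third Theorem.

    Writes order = p^k * m with p not dividing m, then returns every
    divisor d of m that satisfies d = 1 (mod p).  Assumes p is prime.
    """
    m = order
    while m % p == 0:  # strip the full power of p, leaving the index m
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]


print(possible_sylow_counts(15, 3))  # [1]
print(possible_sylow_counts(24, 3))  # [1, 4]
```

Whenever the returned list is just `[1]`, the Sylow $p$-subgroup is unique, which is the situation exploited repeatedly in the examples that follow.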
We can use Sylow's Theorems to help us classify finite groups of a
given order, as the next example shows.
Example 7.14 Let $G$ be a group of order $|G| = 15 = 3 \times 5$. Sylow's
First Theorem tells us that $G$ must contain subgroups of order 3
and 5. (Actually, we know this from Cauchy's Theorem too.) In
particular, these subgroups are Sylow 3- and 5-subgroups.
Sylow's Third Theorem says that $n_3 \equiv 1 \pmod{3}$ and $n_3 \mid 5$; the first
of these means that $n_3$ could be 1, 4, 7, 10 or 13, and the second (or,
alternatively, Lagrange's Theorem) rules out all but $n_3 = 1$. So there
is one Sylow 3-subgroup, isomorphic to $\mathbb{Z}_3$.
Considering the Sylow 5-subgroups in the same way, we find that
$n_5 \equiv 1 \pmod{5}$ and $n_5 \mid 3$, so $n_5$ must also equal 1. Thus $G$ has a
single Sylow 5-subgroup, isomorphic to $\mathbb{Z}_5$.
Sylow's Second Theorem says that Sylow $p$-subgroups are conjugate
to each other, and since $n_3 = n_5 = 1$, both the order-3 and order-5
subgroups are normal in $G$. In particular, $G$ can't be simple.
Let $H$ be the order-3 subgroup and $K$ be the order-5 subgroup.
Their intersection $H \cap K$ must be a subgroup of both $H$ and $K$, and
Lagrange's Theorem says that $|H \cap K|$ must therefore divide both
$|H| = 3$ and $|K| = 5$. Since 3 and 5 are coprime, the only possibility
is that $|H \cap K| = 1$, and so $H \cap K = \{e\}$. The subgroup $HK$ has
order greater than 5, and hence by Lagrange's Theorem, $|HK|$ must
equal 15, so $HK$ must be the full group $G$.
By Corollary 3.23, $G \cong H \times K$, and therefore $G \cong \mathbb{Z}_3 \oplus \mathbb{Z}_5$. Finally,
since 3 and 5 are coprime, Proposition 1.32 tells us that $G \cong \mathbb{Z}_{15}$.
Up to isomorphism, then, there is only one group of order 15: the
cyclic group $\mathbb{Z}_{15}$.
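The last step of this example, passing from $\mathbb{Z}_3 \oplus \mathbb{Z}_5$ to $\mathbb{Z}_{15}$, can be checked by brute force. The sketch below searches a direct sum $\mathbb{Z}_m \oplus \mathbb{Z}_n$ for an element of order $mn$; the helper names `element_order` and `is_cyclic_sum` are ours, and the code is only an illustration of the coprimality condition, not a proof of Proposition 1.32.

```python
from math import gcd


def element_order(a, b, m, n):
    """Order of (a, b) in Z_m (+) Z_n: the lcm of the orders of a and b."""
    order_a = m // gcd(a, m)  # order of a in Z_m (gcd(0, m) = m gives 1)
    order_b = n // gcd(b, n)  # order of b in Z_n
    return order_a * order_b // gcd(order_a, order_b)


def is_cyclic_sum(m, n):
    """True if some element of Z_m (+) Z_n has order mn."""
    return any(element_order(a, b, m, n) == m * n
               for a in range(m) for b in range(n))


print(element_order(1, 1, 3, 5))  # 15
print(is_cyclic_sum(3, 5))        # True
print(is_cyclic_sum(3, 3))        # False
```

Here $(1, 1)$ generates $\mathbb{Z}_3 \oplus \mathbb{Z}_5$, whereas no element of $\mathbb{Z}_3 \oplus \mathbb{Z}_3$ has order 9.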

More generally, we can prove the following:


Proposition 7.15 Let $G$ be a group of order $pq$ where $p$ and $q$ are both
prime, $p < q$ and $q \not\equiv 1 \pmod{p}$. Then $G \cong \mathbb{Z}_{pq}$.

Proof By Lagrange's Theorem, any proper nontrivial subgroup of $G$
must have order either $p$ or $q$. By Sylow's First Theorem, or Cauchy's
Theorem, $G$ must have at least one subgroup isomorphic to $\mathbb{Z}_p$ and at
least one subgroup isomorphic to $\mathbb{Z}_q$.
Sylow's Third Theorem tells us that $n_p \equiv 1 \pmod{p}$ and $n_p \mid q$, so
either $n_p = 1$ or $q$, but the latter case is disallowed by the hypothesis
$q \not\equiv 1 \pmod{p}$, hence $n_p = 1$. Similarly, $n_q \equiv 1 \pmod{q}$ and $n_q \mid p$, so
either $n_q = 1$ or $p$, but the second of these can't happen because $p < q$,
so $p \not\equiv 1 \pmod{q}$, and hence $n_q = 1$.
Let $H$ be the subgroup of order $p$ and $K$ be the subgroup of order $q$.
Then by Corollary 7.12 these are both normal in $G$. Their intersection
$H \cap K$ is a subgroup of $G$ and also of both $H$ and $K$ by Proposition 2.9,
so by Lagrange's Theorem its order divides both $p$ and $q$, and hence
$H \cap K = \{e\}$.
By Proposition 2.11, $HK$ is a subgroup of $G$ and its order must be
greater than $|K| = q$, so again by Lagrange's Theorem the only possibility
is that $|HK| = |G| = pq$, so $HK = G$.
By Corollary 3.23, then, $G \cong H \times K \cong \mathbb{Z}_p \oplus \mathbb{Z}_q$. And since $p$ and $q$ are
coprime, Proposition 1.32 tells us that $G \cong \mathbb{Z}_{pq}$ and hence any group
of order $pq$ is cyclic.
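The hypothesis of Proposition 7.15 is a simple arithmetic test, so we can list the orders it covers. The sketch below is illustrative only: the names `pq_forces_cyclic` and `cyclic_only_orders` are ours, and the list contains just those orders $pq$ to which the proposition applies (it says nothing about other orders whose groups also happen to be cyclic).

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def pq_forces_cyclic(p, q):
    """Hypothesis of Proposition 7.15: primes p < q with q != 1 (mod p)."""
    return is_prime(p) and is_prime(q) and p < q and q % p != 1


def cyclic_only_orders(limit):
    """Orders pq below `limit` to which Proposition 7.15 applies."""
    return sorted(p * q
                  for p in range(2, limit) for q in range(p + 1, limit)
                  if p * q < limit and pq_forces_cyclic(p, q))


print(cyclic_only_orders(40))  # [15, 33, 35]
```

Note that no order $2q$ ever appears: every odd prime $q$ satisfies $q \equiv 1 \pmod{2}$, which is consistent with the existence of the nonabelian dihedral group of order $2q$.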
Corollary 7.16 Let $p$ and $q$ be prime, with $p < q$ and $q \not\equiv 1 \pmod{p}$.
Then any group of order $pq$ is abelian.
We noted earlier (Table 4.1, page 121) that, up to isomorphism, there are fifteen groups of
order 24. We'll leave the details of the classification for Section 7.A,
but we can use Sylow's Theorems to find out what $p$-subgroups exist:
Example 7.17 Since $24 = 2^3 \times 3$, Sylow's First Theorem guarantees
the existence of subgroups of order 2, 3, 4 and 8.
By Sylow's Third Theorem, $n_2 \equiv 1 \pmod{2}$ and $n_2 \mid 3$, so there are
either 1 or 3 Sylow 2-subgroups of order 8. Sylow's Second Theorem
says that these are conjugate to each other.
Similarly, $n_3 \equiv 1 \pmod{3}$ and $n_3 \mid 8$. The first of these allows $n_3$ to be
1, 4, 7, 10, 13, 16, 19 or 22, but the second (and Lagrange's Theorem)
rules out all but $n_3 = 1$ or 4.
So, any group of order 24 must have at least one subgroup of each of
the orders 2 and 4, and either 1 or 3 subgroups of order 8 and 1 or 4
subgroups of order 3.
We can use Corollary 7.12 to show that there are no simple groups of
certain orders:
Example 7.18 Let $G$ be a group of order $|G| = 1246 = 2 \times 7 \times 89$.
Then by Sylow's First Theorem (or Cauchy's Theorem) $G$ has (Sylow)
subgroups of order 2, 7 and 89. Sylow's Third Theorem says that $n_2$
is odd and divides $7 \times 89 = 623$. Therefore $G$ has either 1, 7, 89 or 623
subgroups of order 2.
Similarly, Sylow's Third Theorem tells us that $n_7 \equiv 1 \pmod{7}$ and
$n_7 \mid 178$, whence $n_7 = 1$, and that $n_{89} \equiv 1 \pmod{89}$ and $n_{89} \mid 14$, so
$n_{89} = 1$.
By Corollary 7.12, the subgroup of order 7 and the subgroup of
order 89 must be normal in $G$, and so no group of order 1246 can be
simple.

Corollary 7.16 illustrated another useful application of Sylow's Theorems:
to show that groups of certain orders are abelian. We'll take
this idea a little further now. To start with, we'll use (7.2) to show that
$p$-groups have nontrivial centres, and then we'll use that to classify
groups of order $p^2$ and show that they are all abelian.
Proposition 7.19 Let $G$ be a finite $p$-group of order $p^k$, for some prime
$p$ and integer $k > 0$. Then the centre $Z(G)$ is nontrivial.

Proof Let $G$ act on itself by conjugation. Then the set $\operatorname{Fix}_G(G)$ consists
of all elements $h \in G$ for which $ghg^{-1} = h$ for all $g \in G$. This is
equivalent to requiring $gh = hg$ for all $g \in G$, hence $\operatorname{Fix}_G(G) = Z(G)$.
Applying (7.2), we get

$$|G| \equiv |\operatorname{Fix}_G(G)| = |Z(G)| \pmod{p}.$$

We know that $|G|$ is a power of $p$, so $|Z(G)| \equiv 0 \pmod{p}$ and hence
$|Z(G)|$ is a multiple of $p$. But $Z(G)$ is nonempty: at the very least it
must contain the identity. And since $p > 1$ it follows that $Z(G)$ must
contain at least $(p-1)$ other elements apart from the identity.
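As a concrete check of this proposition, the sketch below computes the centre of the dihedral group $D_4$, a 2-group of order 8. It models $D_4$ as permutations of the vertices 0, 1, 2, 3 of a square; that encoding, the particular reflection chosen as `m1`, and the helper names are assumptions made for this illustration.

```python
def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def generate(generators):
    """Close a set of permutations under composition to get a group."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def centre(group):
    """Elements that commute with every element of the group."""
    return {z for z in group
            if all(compose(z, g) == compose(g, z) for g in group)}


r = (1, 2, 3, 0)   # rotation of the square through a quarter turn
m1 = (0, 3, 2, 1)  # a reflection fixing vertices 0 and 2 (our choice)
D4 = generate([r, m1])
Z = centre(D4)
print(len(D4), sorted(Z))  # 8 [(0, 1, 2, 3), (2, 3, 0, 1)]
```

The centre consists of the identity and the half turn $r^2$, so $|Z(D_4)| = 2$ is indeed a multiple of $p = 2$.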

Now we can prove that groups of order p2 are abelian.


Proposition 7.20 Let $G$ be a group of order $p^2$ for some prime $p$. Then
$G$ is isomorphic either to $\mathbb{Z}_{p^2}$ or $\mathbb{Z}_p \oplus \mathbb{Z}_p$.

Proof By Proposition 2.34, the order of any non-identity element of
$G$ must be either $p$ or $p^2$. If $G$ has an element $g$ of order $p^2$ then
$G = \langle g \rangle \cong \mathbb{Z}_{p^2}$.
Otherwise, every non-identity element $g$ has order $p$. By Proposition 7.19
at least one of these lies in the centre $Z(G)$ and hence commutes
with every element of $G$. Choose one such non-identity element
$g \in Z(G)$. Then $\langle g \rangle$ is a cyclic subgroup of order $p$, isomorphic to $\mathbb{Z}_p$.
Now choose some element $h \in G \setminus \langle g \rangle$; this also has order $p$.
Let $f : \mathbb{Z}_p \oplus \mathbb{Z}_p \to G$; $(m, n) \mapsto g^m h^n$. This is a homomorphism, since

$$f(m_1, n_1) f(m_2, n_2) = (g^{m_1} h^{n_1})(g^{m_2} h^{n_2}) = g^{m_1 + m_2} h^{n_1 + n_2} = f(m_1 + m_2, n_1 + n_2),$$

where the middle step holds because $g$ and all its powers are in $Z(G)$, so they
commute with everything, in particular any power of $h$.
The kernel $\ker(f)$ is trivial: if $f(m, n) = e$ then $g^m h^n = e$ and so
$g^m = h^{-n}$. This element must lie in the intersection $\langle g \rangle \cap \langle h \rangle$, and the
only such element is the identity $e \in G$. So $g^m = h^{-n} = e$ and thus
$m = n = 0$ in $\mathbb{Z}_p$. Since $\ker(f)$ is trivial, $f$ is injective. And since
$|G| = |\mathbb{Z}_p \oplus \mathbb{Z}_p| = p^2$, $f$ is also surjective. Hence $G \cong \mathbb{Z}_p \oplus \mathbb{Z}_p$.
Because cyclic groups and direct sums of cyclic groups are abelian, we
have the following corollary.
Corollary 7.21 Any group $G$ of order $|G| = p^2$ is abelian.
The next proposition takes this idea a little further.
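Proposition 7.22 Let $p$ and $q$ be prime, with $p < q$ and $q \not\equiv 1 \pmod{p}$.
Then any group of order $p^2 q$ is abelian.

Proof A Sylow $p$-subgroup of $G$ has order $p^2$, while a Sylow
$q$-subgroup has order $q$. By Sylow's Third Theorem, $n_p \mid q$ and $n_p \equiv 1 \pmod{p}$,
while $n_q \mid p^2$ and $n_q \equiv 1 \pmod{q}$. The first pair of conditions forces
$n_p = 1$, since $q \not\equiv 1 \pmod{p}$. For the second, $n_q$ must be 1, $p$ or $p^2$;
the case $n_q = p$ is ruled out because $p < q$, and $p^2 \equiv 1 \pmod{q}$ would require
$q$ to divide $(p-1)(p+1)$, which for $q > p$ forces $q = p + 1$ and hence
$q \equiv 1 \pmod{p}$, contrary to hypothesis. The only possibility is that
$n_p = n_q = 1$.
Let $H$ be the Sylow $p$-subgroup and $K$ be the Sylow $q$-subgroup. Both
are normal in $G$ by Corollary 7.12.
By Proposition 7.20, $H$ is isomorphic either to $\mathbb{Z}_{p^2}$ or $\mathbb{Z}_p \oplus \mathbb{Z}_p$, and
by Corollary 7.21 is therefore abelian. The group $K$ is isomorphic to
$\mathbb{Z}_q$ and is hence also abelian. Furthermore, their intersection is trivial
because $p^2$ and $q$ are coprime, and by Propositions 3.22
and 2.11 $HK$ is a subgroup of $G$ of order $|HK| = p^2 q = |G|$, so $HK = G$.
By Corollary 3.23, $G \cong H \times K$ so $G$ must be abelian. Hence either
$G \cong \mathbb{Z}_{p^2} \oplus \mathbb{Z}_q$ or $G \cong \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_q$.
For example, we now know that any group of order $45 = 3^2 \times 5$ must
be isomorphic either to $\mathbb{Z}_9 \oplus \mathbb{Z}_5$ or $\mathbb{Z}_3 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_5$.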

7.2 Series of subgroups

If the material world rests upon a similar ideal world, this ideal world must
rest upon some other; and so on, without end. It were better, therefore, never
to look beyond the present material world.
David Hume (1711–1776),
Dialogues Concerning Natural Religion (1779)

In Chapter 2 we met the concept of a subgroup lattice or Hasse
diagram of a group $G$: a directed graph whose nodes correspond to
the various subgroups of $G$ and whose edges encode the relationship
"is a subgroup of". For example, the Klein group $V \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$ has five
subgroups in total: itself, three subgroups isomorphic to $\mathbb{Z}_2$, and the
trivial subgroup. The corresponding graph is shown in Figure 7.1.
This is helpful for picturing the subgroup structure of a fairly small
group, but it rapidly becomes unwieldy for larger groups.
But we can still learn a lot from studying how subgroups fit together. If
we take a subgroup lattice for a finite group $G$, then any path through
the graph starting at $G$ and ending at the trivial subgroup gives a
nested series of subgroups, each one containing the next.
[Figure 7.1: Subgroup lattice for the Klein group $V$, with $V$ at the top, the subgroups $\{e, r\}$, $\{e, h\}$ and $\{e, v\}$ in the middle row, and $\{e\}$ at the bottom.]
For example, one such series for the Klein group is
$$\{e\} < \{e, a\} < V,$$
while for the dihedral group $D_4$ we have
$$\{e\} < \{e, r^2\} < \{e, r, r^2, r^3\} < D_4$$
and
$$\{e\} < \{e, m_1\} < \{e, r^2, m_1, m_3\} < D_4$$
amongst others. We call these chains of nested subgroups series. (This
conflicts with the usage in analysis, where a sequence is a discrete list
and a series is the sum of an infinite sequence.)
Definition 7.23 A series (or subgroup series) for a group $G$ is a
finite sequence of subgroups
$$\{e\} = H_0 < H_1 < \cdots < H_n = G$$
of $G$ such that $H_i$ is a proper subgroup of $H_{i+1}$ for $0 \leq i < n$. The
length of the series is the number $n$ of inclusions $H_i < H_{i+1}$.
Sometimes it will be convenient to number the subgroup indices in
the reverse order. We will call a series of the form
$$H_0 < H_1 < \cdots < H_n = G$$
an ascending series for $G$, and a series of the form
$$H_n < H_{n-1} < \cdots < H_0 = G$$
a descending series for $G$.
We may also sometimes relax the requirement that the smallest
subgroup ($H_0$ for an ascending series, $H_n$ for a descending series) is
trivial.
We've seen already that normal subgroups have nice and useful properties,
and it turns out that for our purposes it will be most useful
to focus on series consisting of subgroups of this type. But there are
a couple of different ways of doing this: we can either insist that all
the subgroups $H_0, \ldots, H_n$ are normal in $G$, or we can just require that
each subgroup $H_i$ is normal in the next one along, $H_{i+1}$.
Definition 7.24 A series
$$H_0 = \{e\} < H_1 < \cdots < H_{n-1} < H_n = G$$
is normal or invariant if $H_0, \ldots, H_n \trianglelefteq G$; that is, if each subgroup
$H_i$ is normal in $G$.
A series is subnormal or subinvariant if $H_i \triangleleft H_{i+1}$ for $0 \leq i < n$;
that is, if each subgroup is normal in the next one in the list.
Any group $G$ has a normal series of length 1, namely $\{e\} \triangleleft G$. If $G$ is
simple, then this is the only normal series it can have. If $G$ is abelian,
then any series is both normal and subnormal.
Every normal series is subnormal, but not every subnormal series is
normal. For example, the series
$$\{e\} \triangleleft \{e, m_1\} \triangleleft \{e, r^2, m_1, m_3\} \triangleleft D_4 \tag{7.4}$$
is subnormal, but not normal, because $\{e, m_1\} \not\trianglelefteq D_4$.
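The difference between the two conditions can be verified directly for this series. The sketch below models $D_4$ as permutations of the vertices 0, 1, 2, 3 of a square; the encoding, the choice of `m1`, and the labelling of the remaining reflections as $m_k = r^{k-1} m_1$ are assumptions made for illustration (they are chosen so that $\{e, r^2, m_1, m_3\}$ is a subgroup, as in the text).

```python
def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def is_normal(N, G):
    """True if g n g^-1 lies in N for every g in G and n in N."""
    return all(compose(compose(g, n), inverse(g)) in N for g in G for n in N)


def is_subnormal_series(series):
    """Each subgroup is normal in the next one along."""
    return all(is_normal(H, K) for H, K in zip(series, series[1:]))


def is_normal_series(series):
    """Each subgroup is normal in the whole group (the last term)."""
    return all(is_normal(H, series[-1]) for H in series)


e, r, m1 = (0, 1, 2, 3), (1, 2, 3, 0), (0, 3, 2, 1)
r2 = compose(r, r)
r3 = compose(r2, r)
m2, m3, m4 = compose(r, m1), compose(r2, m1), compose(r3, m1)
D4 = {e, r, r2, r3, m1, m2, m3, m4}

series = [{e}, {e, m1}, {e, r2, m1, m3}, D4]
print(is_subnormal_series(series), is_normal_series(series))  # True False
```

The failure of normality comes from conjugating $m_1$ by $r$, which in this encoding gives $m_3$ rather than $m_1$.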
Sylow's First Theorem (Theorem 7.4, page 242) guarantees the existence of subnormal series
of $p$-subgroups for any finite group $G$.
The point of insisting on normal subgroups is that we can then pass to
quotient groups, secure in the knowledge that each quotient $H_{i+1}/H_i$
is well-defined. For example, with the subnormal series for $D_4$ shown
above, we get the quotients
$$H_1/H_0 \cong \mathbb{Z}_2, \quad H_2/H_1 \cong \mathbb{Z}_2 \quad\text{and}\quad H_3/H_2 \cong \mathbb{Z}_2.$$
There are other subnormal series for $D_4$, for example
$$\{e\} \triangleleft \{e, m_2\} \triangleleft \{e, r^2, m_2, m_4\} \triangleleft D_4, \tag{7.5}$$
the quotients for which are
$$H_1/H_0 \cong \mathbb{Z}_2, \quad H_2/H_1 \cong \mathbb{Z}_2 \quad\text{and}\quad H_3/H_2 \cong \mathbb{Z}_2.$$
Although the series (7.4) and (7.5) aren't the same, the corresponding
collections of quotient groups are isomorphic.
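The group $\mathbb{Z}_6$ has subgroups $\langle 2 \rangle = \{0, 2, 4\}$ and $\langle 3 \rangle = \{0, 3\}$, which
both yield subnormal (and, since $\mathbb{Z}_6$ is abelian, normal) series for $\mathbb{Z}_6$:
$$\{0\} \triangleleft \langle 2 \rangle \triangleleft \mathbb{Z}_6 \tag{7.6}$$
$$\{0\} \triangleleft \langle 3 \rangle \triangleleft \mathbb{Z}_6 \tag{7.7}$$
The quotient groups obtained from the first of these series are
$$H_1/H_0 \cong \mathbb{Z}_3 \quad\text{and}\quad H_2/H_1 \cong \mathbb{Z}_2,$$
while those obtained from the second series are
$$H_1/H_0 \cong \mathbb{Z}_2 \quad\text{and}\quad H_2/H_1 \cong \mathbb{Z}_3.$$
Again, these two series are different, but their quotient groups are the
same, at least up to permutation. This turns out to be a useful way of
comparing normal and subnormal series: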
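Definition 7.25 Two normal or subnormal series
$$\{e\} = H_0 \triangleleft \cdots \triangleleft H_n = G \quad\text{and}\quad \{e\} = K_0 \triangleleft \cdots \triangleleft K_n = G$$
are isomorphic or equivalent if there is a bijection between the collections
$\{H_{i+1}/H_i : 0 \leq i < n\}$ and $\{K_{i+1}/K_i : 0 \leq i < n\}$ under which
corresponding quotients are isomorphic.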
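By this definition, the subnormal series (7.4) and (7.5) for $D_4$ are
isomorphic, and the normal series (7.6) and (7.7) for $\mathbb{Z}_6$ are isomorphic
too. We can take this idea a little further. Consider the normal (and
subnormal) series
$$\{0\} \triangleleft 60\mathbb{Z} \triangleleft 12\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z} \tag{7.8}$$
for $\mathbb{Z}$. There is still room to fit more subgroups in, and we can do this
in different ways. For example,
$$\{0\} \triangleleft 120\mathbb{Z} \triangleleft 60\mathbb{Z} \triangleleft 12\mathbb{Z} \triangleleft 6\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z} \tag{7.9}$$
and
$$\{0\} \triangleleft 180\mathbb{Z} \triangleleft 60\mathbb{Z} \triangleleft 12\mathbb{Z} \triangleleft 4\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z} \tag{7.10}$$
are both valid normal series for $\mathbb{Z}$, obtained by fitting extra groups
into the series (7.8). They aren't isomorphic: the quotient groups
of (7.9) are
$$120\mathbb{Z} \cong \mathbb{Z},\ \mathbb{Z}_2,\ \mathbb{Z}_5,\ \mathbb{Z}_2,\ \mathbb{Z}_3 \text{ and } \mathbb{Z}_2,$$
while the quotient groups of (7.10) are
$$180\mathbb{Z} \cong \mathbb{Z},\ \mathbb{Z}_3,\ \mathbb{Z}_5,\ \mathbb{Z}_3,\ \mathbb{Z}_2 \text{ and } \mathbb{Z}_2.$$
[Photograph: Otto Schreier (1901–1929)]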
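For series of subgroups of $\mathbb{Z}$ these quotients are easy to compute, because $b\mathbb{Z}/a\mathbb{Z}$ is cyclic of order $a/b$ whenever $b$ divides $a$. The sketch below represents a series by its list of generators and compares two series by the multiset of orders of their finite quotients. The list encoding and the function names are ours, and the comparison is valid only because cyclic groups of equal order are isomorphic and both series start with an infinite quotient isomorphic to $\mathbb{Z}$.

```python
from collections import Counter


def quotient_orders(generators):
    """Orders of the finite quotients of a series of subgroups of Z.

    `generators` lists a_1, ..., a_n = 1 for the series
    {0} < a_1 Z < ... < a_n Z = Z, where each a_{i+1} divides a_i.
    The quotient a_{i+1} Z / a_i Z is cyclic of order a_i / a_{i+1};
    the first quotient a_1 Z / {0} is infinite and is left out.
    """
    pairs = list(zip(generators, generators[1:]))
    if any(big % small != 0 for big, small in pairs):
        raise ValueError("not a chain of subgroups of Z")
    return [big // small for big, small in pairs]


def same_finite_quotients(series_a, series_b):
    """Compare the finite quotients up to permutation."""
    return Counter(quotient_orders(series_a)) == Counter(quotient_orders(series_b))


series_7_9 = [120, 60, 12, 6, 2, 1]
series_7_10 = [180, 60, 12, 4, 2, 1]
print(quotient_orders(series_7_9))    # [2, 5, 2, 3, 2]
print(quotient_orders(series_7_10))   # [3, 5, 3, 2, 2]
print(same_finite_quotients(series_7_9, series_7_10))  # False
```

The two lists of orders differ as multisets (three 2s against two), which is why (7.9) and (7.10) are not isomorphic.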
These series aren't isomorphic to each other, and neither are they
isomorphic to the original series (7.8), but this is a useful idea that's
worth studying further, and to that end we introduce the following
definition.
Definition 7.26 Let $H = \{H_0, \ldots, H_m\}$ and $K = \{K_0, \ldots, K_n\}$ be two
normal or subnormal series of a group $G$. Then we say that $K$ is a
refinement of $H$ if we can obtain $K$ by inserting additional subgroups
into $H$. More precisely, there exists an injective, strictly increasing
function
$$f : \{0, \ldots, m\} \hookrightarrow \{0, \ldots, n\}$$
such that $H_i = K_{f(i)}$ for $0 \leq i \leq m$.

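Thus the series (7.9) and (7.10) are (non-isomorphic) refinements of
the series (7.8). Now consider the series
$$\{0\} \triangleleft 3\mathbb{Z} \triangleleft \mathbb{Z} \tag{7.11}$$
with quotient groups
$$3\mathbb{Z} \cong \mathbb{Z} \text{ and } \mathbb{Z}_3,$$
and the series
$$\{0\} \triangleleft 8\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z} \tag{7.12}$$
with quotient groups
$$8\mathbb{Z} \cong \mathbb{Z},\ \mathbb{Z}_4 \text{ and } \mathbb{Z}_2.$$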
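These series are clearly not isomorphic, but if we're careful we can find
refinements of each that are isomorphic. Each of these refinements
should, at the very least, have quotient groups $\mathbb{Z}$, $\mathbb{Z}_2$, $\mathbb{Z}_3$ and $\mathbb{Z}_4$. We
can refine (7.11) to get
$$\{0\} \triangleleft 24\mathbb{Z} \triangleleft 6\mathbb{Z} \triangleleft 3\mathbb{Z} \triangleleft \mathbb{Z}$$
which has quotient groups
$$24\mathbb{Z} \cong \mathbb{Z},\ \mathbb{Z}_4,\ \mathbb{Z}_2 \text{ and } \mathbb{Z}_3.$$
Similarly, we can refine (7.12) to get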

$$\{0\} \triangleleft 24\mathbb{Z} \triangleleft 8\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z},$$
which has quotient groups
$$24\mathbb{Z} \cong \mathbb{Z},\ \mathbb{Z}_3,\ \mathbb{Z}_4 \text{ and } \mathbb{Z}_2.$$
Both of these refinements are therefore isomorphic.
The next theorem, due to the Austrian mathematician Otto Schreier
(1901–1929), says that we can always do this: any two subnormal (or
normal) series for a group $G$ have isomorphic refinements.
We will just state the theorem for the moment: the proof is somewhat
involved and will require a bit of preparation.
Theorem 7.27 (Schreier's Refinement Theorem) Let $G$ be a group,
and let $H = \{H_0, \ldots, H_m\}$ and $K = \{K_0, \ldots, K_n\}$ be two subnormal (or
normal) series for $G$. Then there exist refinements $S = \{S_0, \ldots, S_l\}$ of $H$
and $T = \{T_0, \ldots, T_l\}$ of $K$ such that $S \cong T$.
To prove this, we first need a technical result due to the German
mathematician Hans Zassenhaus (1912–1991), and in order to prove
that we need a lemma attributed to the German mathematician Richard
Dedekind (1831–1916).
Lemma 7.28 (Dedekind's Modular Law) Let $G$ be a group, and suppose
that $X$, $Y$ and $Z$ are subgroups of $G$, with $Z \subseteq X$. Then
$$X \cap (YZ) = (X \cap Y)Z.$$

Proof Let $x \in X \cap (YZ)$. Then $x = yz$ for some $y \in Y$ and $z \in Z \subseteq X$,
so $y = xz^{-1} \in X$, and hence $y \in X \cap Y$. Therefore $x \in (X \cap Y)Z$, and
so $X \cap (YZ) \subseteq (X \cap Y)Z$.
Conversely, suppose that $x \in (X \cap Y)Z$. Then $x = ab$ for some
$a \in X \cap Y$ and $b \in Z \subseteq X$, which means that $x \in X$. Also, $X \cap Y \subseteq Y$,
so $(X \cap Y)Z \subseteq YZ$, and hence $x \in YZ$. Thus $x \in X \cap (YZ)$, so
$(X \cap Y)Z \subseteq X \cap (YZ)$, and therefore $X \cap (YZ) = (X \cap Y)Z$.

[Photograph: Hans Julius Zassenhaus (1912–1991). Credit: Oberwolfach Photo Collection / Klaus Wohlfahrt]
Hans Julius Zassenhaus (1912–1991) was born in Koblenz, the eldest son of
Julius Zassenhaus, a historian and follower of the theologian and musician
Albert Schweitzer (1875–1965). Hans entered the University of Hamburg in
1930, initially intending to specialise in theoretical physics, but he was
inspired to pursue research in mathematics by Emil Artin (1898–1962), who
became his doctoral supervisor.
At the age of 21, while a graduate student, he proved the celebrated
Butterfly Lemma and used it to construct a neater proof of the
Jordan–Hölder Theorem. After completing his doctoral thesis in 1934 (on a
class of permutation groups now known as Zassenhaus groups) he taught for
two years at the University of Rostock, before returning to Hamburg as
Artin's assistant, where in 1937 he wrote an influential textbook Lehrbuch
der Gruppentheorie and completed his habilitation thesis on Lie rings in 1938.
He was strongly opposed to the Nazi regime, and his mother Margarete was
involved in a resistance effort to hide endangered people. His refusal to
join the Party resulted in the loss of his academic post in 1940, after
which he worked on weather forecasting for the navy until the end of the war.
In 1949 he was appointed to a chair at McGill University in Montreal,
during which time he worked on applications of computers to algebraic
number theory. In 1959 he moved to Ohio State University, where he was
based for the rest of his life. He retired in 1982 but continued writing
papers and supervising research students until shortly before his death.

The next result we need is Zassenhaus' Lemma, sometimes called the
Butterfly Lemma due to the shape of the subgroup lattice formed by
the various subgroups involved in the proof (see Figure 7.2). Zassenhaus
formulated and proved his Lemma at the age of 21, while a
graduate student at the University of Hamburg, in order to provide a
neater proof of Schreier's Refinement Theorem and the Jordan–Hölder
Theorem.
Theorem 7.29 (Zassenhaus' Lemma) Let $G$ be a group, and let $H$, $K$,
$A$ and $B$ be subgroups of $G$ such that $A \trianglelefteq H$ and $B \trianglelefteq K$. Then
(i) $A(H \cap B) \trianglelefteq A(H \cap K)$,
(ii) $B(A \cap K) \trianglelefteq B(H \cap K)$, and
(iii) $\dfrac{A(H \cap K)}{A(H \cap B)} \cong \dfrac{H \cap K}{(A \cap K)(H \cap B)} \cong \dfrac{B(H \cap K)}{B(A \cap K)}$.
[Figure 7.2: Subgroup lattice for Zassenhaus' Lemma]
Figure 7.2 shows the subgroup lattice formed by the subgroups in-
volved in the Lemma. For aesthetic reasons this lattice is drawn upside
down relative to our usual convention: larger subgroups are towards
the bottom of the diagram, while smaller subgroups are towards the
top. Bold lines indicate that the upper subgroup is normal in the lower
one. In general, a subgroup at the apex of two upward lines is the
intersection of the subgroups at the other ends. Similarly, two down-
ward lines meet at a subgroup that is the product of the subgroups at
their other ends.
Proof To prove part (i), we must show first that $A(H \cap B)$ is a subgroup
of $A(H \cap K)$, and then that it is closed under conjugation by elements
of $A(H \cap K)$. By Proposition 2.11, $A(H \cap B) \leq A(H \cap K)$ if and only if
$A(H \cap B) = (H \cap B)A$. But we know that $A \trianglelefteq H$, so $Ah = hA$ for any
element $h \in H$, and hence this also follows for any $h \in H \cap B \subseteq H$.
Therefore $A(H \cap B) = (H \cap B)A$ and thus $A(H \cap B) \leq A(H \cap K)$.
Now we must show that $A(H \cap B)$ is normal in $A(H \cap K)$, which we
do by showing that it is closed under conjugation by any element
of $A(H \cap K)$. Let $x = ab$ and $y = ck$ where $a, c \in A$, $b \in H \cap B$ and
$k \in H \cap K$, so that $x \in A(H \cap B)$ and $y \in A(H \cap K)$. We want to show
that $yxy^{-1} \in A(H \cap B)$.
First, observe that $yay^{-1} \in A$ since $y \in A(H \cap K) \subseteq AH \subseteq H$,
and $A \trianglelefteq H$. Next, $kbk^{-1} \in H \cap B$ since $k \in H \cap K$ and $H \cap B \trianglelefteq H \cap K$.
Hence $yby^{-1} = (ck)b(ck)^{-1} = c\,kbk^{-1}c^{-1} \in A(H \cap B)A$, and
$A(H \cap B)A = A(H \cap B)$ since $A \trianglelefteq H$. Finally, $yxy^{-1} = y(ab)y^{-1} =
(yay^{-1})(yby^{-1}) \in A(H \cap B)$, and therefore $A(H \cap B) \trianglelefteq A(H \cap K)$ as
claimed.
Part (ii) follows by a very similar argument.
To prove part (iii), we use Dedekind's Modular Law (Lemma 7.28, page 254) and the Second
Isomorphism Theorem (Theorem 4.62, page 128). Recall that the latter says that if a group $G$
contains a subgroup $M$ and a normal subgroup $N$, then $MN/N \cong M/(M \cap N)$.
Now set $G = A(H \cap K)$, and let $M = H \cap K$ and $N = A(H \cap B)$,
which is normal in $G$ by part (i). The Second Isomorphism
Theorem says that
$$\frac{(H \cap K)A(H \cap B)}{A(H \cap B)} \cong \frac{H \cap K}{(H \cap K) \cap A(H \cap B)}. \tag{7.13}$$
According to Figure 7.2, $(H \cap K)A(H \cap B)$ should be equal to $A(H \cap K)$.
To see this, first we observe that $A$ is normal in $H$, and so $hA = Ah$
for any element $h \in H$. Consequently, since $H \cap K \subseteq H$, we also
have $hA = Ah$ for any $h \in H \cap K$, and hence $(H \cap K)A = A(H \cap K)$,
which in turn means that $(H \cap K)A(H \cap B) = A(H \cap K)(H \cap B)$. Next,
observe that $(H \cap K)(H \cap B)$ consists of all products of the form $xy$
where $x \in H \cap K$ and $y \in H \cap B$. But since $H \cap K \supseteq H \cap B$, it follows
that $(H \cap K)(H \cap B) = H \cap K$, and hence $(H \cap K)A(H \cap B) = A(H \cap K)$ as
claimed.
Now we apply Dedekind's Modular Law to $(H \cap K) \cap A(H \cap B)$. Setting
$X = H \cap K$, $Y = A$ and $Z = H \cap B$, by the identity $X \cap (YZ) = (X \cap Y)Z$
we get
$$(H \cap K) \cap A(H \cap B) = (H \cap K \cap A)(H \cap B) = (A \cap K)(H \cap B),$$
and hence (7.13) becomes
$$\frac{A(H \cap K)}{A(H \cap B)} \cong \frac{H \cap K}{(A \cap K)(H \cap B)}. \tag{7.14}$$
By a very similar argument we find also that
$$\frac{B(H \cap K)}{B(A \cap K)} \cong \frac{H \cap K}{(A \cap K)(H \cap B)}. \tag{7.15}$$
Putting (7.14) and (7.15) together yields the required isomorphism.
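Both Dedekind's Modular Law and Zassenhaus' Lemma can be tested exhaustively in a small group. The sketch below does so for $D_4$, modelled as permutations of the vertices of a square (the encoding, the generators and all function names are our assumptions). For part (iii) it only compares the orders of the three quotients, which is a necessary consequence of the isomorphisms rather than a verification of them.

```python
from itertools import combinations


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generate(generators, degree=4):
    """Subgroup generated by the given permutations of 0..degree-1."""
    identity = tuple(range(degree))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return frozenset(group)


def product(X, Y):
    """The set XY of all products xy."""
    return frozenset(compose(x, y) for x in X for y in Y)


def is_normal(N, G):
    return N <= G and all(
        compose(compose(g, n), inverse(g)) in N for g in G for n in N)


D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])
# Every subgroup of D4 needs at most two generators, so this finds them all.
subgroups = {generate(pair) for pair in combinations(D4, 2)}
subgroups.add(generate([]))

modular_law_holds = all(
    X & product(Y, Z) == product(X & Y, Z)
    for X in subgroups for Y in subgroups for Z in subgroups if Z <= X)


def zassenhaus_holds(A, H, B, K):
    """Check (i), (ii) and the equality of quotient orders implied by (iii)."""
    top = H & K
    left, big_left = product(A, H & B), product(A, top)
    right, big_right = product(B, A & K), product(B, top)
    middle = product(A & K, H & B)
    return (is_normal(left, big_left) and is_normal(right, big_right)
            and len(big_left) * len(middle) == len(top) * len(left)
            and len(big_right) * len(middle) == len(top) * len(right))


zassenhaus_lemma_holds = all(
    zassenhaus_holds(A, H, B, K)
    for H in subgroups for A in subgroups if is_normal(A, H)
    for K in subgroups for B in subgroups if is_normal(B, K))

print(len(subgroups), modular_law_holds, zassenhaus_lemma_holds)  # 10 True True
```

The orders are compared by cross-multiplication so that no division is needed.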
We are now ready to prove Schreier's Refinement Theorem. The
key idea of this proof is to insert between each group $H_i$ and $H_{i+1}$
in the series $H$ a chain of groups of the form $H_i(H_{i+1} \cap K_j)$, and
then to perform the analogous construction with the series $K$. We
then apply Zassenhaus' Lemma to show the existence of a bijective
correspondence between the relevant quotient groups.
Proof of Schreier's Refinement Theorem We first prove the theorem
for subnormal series. Suppose that $H$ and $K$ are two subnormal series
for a group $G$, with
$$\{e\} = H_0 \triangleleft H_1 \triangleleft \cdots \triangleleft H_m = G$$
and
$$\{e\} = K_0 \triangleleft K_1 \triangleleft \cdots \triangleleft K_n = G.$$
Now consider the chain
$$H_i(H_{i+1} \cap K_0) \leq H_i(H_{i+1} \cap K_1) \leq \cdots \leq H_i(H_{i+1} \cap K_n).$$
This consists of $(n+1)$ not necessarily distinct subgroups of $G$. Furthermore,
$$H_i(H_{i+1} \cap K_0) = H_i(H_{i+1} \cap \{e\}) = H_i\{e\} = H_i$$
and
$$H_i(H_{i+1} \cap K_n) = H_i(H_{i+1} \cap G) = H_i H_{i+1} = H_{i+1}.$$
So, between each pair of subgroups $H_i$ and $H_{i+1}$ in the first series $H$,
we insert the chain
$$H_i(H_{i+1} \cap K_1) \leq H_i(H_{i+1} \cap K_2) \leq \cdots \leq H_i(H_{i+1} \cap K_{n-1}).$$
This yields a chain of $mn+1$ subgroups of $G$, not all of which need
necessarily be distinct.
Similarly, between each pair of subgroups $K_j$ and $K_{j+1}$ in the second
series $K$, we insert the chain
$$K_j(K_{j+1} \cap H_1) \leq K_j(K_{j+1} \cap H_2) \leq \cdots \leq K_j(K_{j+1} \cap H_{m-1}).$$
This also results in a sequence of $mn+1$ subgroups of $G$, which again
need not all be distinct.
By parts (i) and (ii) of Zassenhaus' Lemma, each subgroup in each of
these chains is normal in the next subgroup along. That is,
$$H_i(H_{i+1} \cap K_j) \trianglelefteq H_i(H_{i+1} \cap K_{j+1})$$
and
$$K_j(K_{j+1} \cap H_i) \trianglelefteq K_j(K_{j+1} \cap H_{i+1})$$
for all $0 \leq i < m$ and $0 \leq j < n$, since each $H_i \triangleleft H_{i+1}$ and $K_j \triangleleft K_{j+1}$.
Also, by part (iii) of Zassenhaus' Lemma, we have isomorphisms
$$\frac{H_i(H_{i+1} \cap K_{j+1})}{H_i(H_{i+1} \cap K_j)} \cong \frac{K_j(K_{j+1} \cap H_{i+1})}{K_j(K_{j+1} \cap H_i)}. \tag{7.16}$$
These isomorphisms give a bijective correspondence between the
$mn$ quotients of consecutive groups in the chain constructed from $H$, and
the $mn$ quotients of consecutive groups in the chain formed from $K$.
But in Definition 7.23 we stipulated that in a series all inclusions
must be strict, so we now have to discard all the repeated subgroups
from these two chains. The quotient of two repeated subgroups is
clearly trivial, so each trivial quotient in the chain obtained from H
corresponds with a trivial quotient in the chain we constructed from
K, and vice versa. So discarding repeated subgroups is the same as
discarding trivial quotients, and there must be the same number in
each chain.
By doing this, the chain constructed from H yields a series S =
{S0 , . . . , Sl } that is a refinement of H, and the chain obtained from K
becomes a series T = { T0 , . . . , Tl } that is a refinement of K. Further-
more, S and T are isomorphic.
All that remains is to confirm this works for normal series as well.
But if $H$ and $K$ are normal series, then each group $H_i(H_{i+1} \cap K_j)$ and
$K_j(K_{j+1} \cap H_i)$ is normal in $G$, and hence the refinements constructed
by this method will also be normal.
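The construction in this proof is particularly easy to trace for series of subgroups of $\mathbb{Z}$, where the product of subgroups is the sum $a\mathbb{Z} + b\mathbb{Z} = \gcd(a, b)\mathbb{Z}$ and the intersection is $a\mathbb{Z} \cap b\mathbb{Z} = \operatorname{lcm}(a, b)\mathbb{Z}$. The sketch below applies it to the series (7.11) and (7.12); representing a series by its list of non-negative generators, with 0 standing for $\{0\}$, is our own convention, as are the function names.

```python
from math import gcd


def lcm(a, b):
    """Generator of the intersection aZ ∩ bZ (with 0Z = {0})."""
    return 0 if a == 0 or b == 0 else a * b // gcd(a, b)


def schreier_refinement(h, k):
    """Refine the series h of Z using the series k, as in the proof.

    A series {0} = H_0 < H_1 < ... < H_m = Z is given by its list of
    generators [0, ..., 1].  The inserted subgroup H_i (H_{i+1} ∩ K_j)
    is generated by gcd(h_i, lcm(h_{i+1}, k_j)).
    """
    chain = []
    for i in range(len(h) - 1):
        for kj in k:
            g = gcd(h[i], lcm(h[i + 1], kj))
            if not chain or chain[-1] != g:  # discard repeated subgroups
                chain.append(g)
    return chain


def finite_quotient_orders(chain):
    """Orders of the quotients bZ/aZ for consecutive a, b with a nonzero."""
    return [a // b for a, b in zip(chain, chain[1:]) if a != 0]


S = schreier_refinement([0, 3, 1], [0, 8, 2, 1])
T = schreier_refinement([0, 8, 2, 1], [0, 3, 1])
print(S)  # [0, 24, 6, 3, 1]
print(T)  # [0, 24, 8, 2, 1]
print(sorted(finite_quotient_orders(S)) == sorted(finite_quotient_orders(T)))  # True
```

The output reproduces the two refinements $\{0\} \triangleleft 24\mathbb{Z} \triangleleft 6\mathbb{Z} \triangleleft 3\mathbb{Z} \triangleleft \mathbb{Z}$ and $\{0\} \triangleleft 24\mathbb{Z} \triangleleft 8\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z}$ found by hand earlier.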

To see an example of this in practice, let's consider two different,
non-equivalent subnormal series of the dihedral group $D_4$.
Example 7.30 Let
$$H_0 = \{e\}, \quad H_1 = \{e, r, r^2, r^3\}, \quad H_2 = D_4$$
and
$$K_0 = \{e\}, \quad K_1 = \{e, m_1\}, \quad K_2 = \{e, r^2, m_1, m_3\}, \quad K_3 = D_4$$
be two subnormal series for $D_4$. We refine $H$ by inserting the groups
$$H_0(H_1 \cap K_1) = H_0 \quad\text{and}\quad H_0(H_1 \cap K_2) = \{e, r^2\}$$
between $H_0$ and $H_1$, and the groups
$$H_1(H_2 \cap K_1) = D_4 \quad\text{and}\quad H_1(H_2 \cap K_2) = D_4$$
between $H_1$ and $H_2$. This gives the chain
$$H_0 \leq H_0 \leq \{e, r^2\} \leq H_1 \leq D_4 \leq D_4 \leq D_4$$
which, after discarding repeated groups, yields the series
$$\{e\} \triangleleft \{e, r^2\} \triangleleft \{e, r, r^2, r^3\} \triangleleft D_4.$$
This series has quotients $\mathbb{Z}_2$, $\mathbb{Z}_2$ and $\mathbb{Z}_2$.
Similarly, we refine $K$ by inserting the group
$$K_0(K_1 \cap H_1) = K_0$$
between the groups $K_0$ and $K_1$, the group
$$K_1(K_2 \cap H_1) = K_2$$
between the groups $K_1$ and $K_2$, and the group
$$K_2(K_3 \cap H_1) = D_4$$
between the groups $K_2$ and $K_3$. This gives the chain
$$K_0 \leq K_0 \leq K_1 \leq K_2 \leq K_2 \leq D_4 \leq D_4,$$
which yields the series
$$\{e\} \triangleleft \{e, m_1\} \triangleleft \{e, r^2, m_1, m_3\} \triangleleft D_4.$$
This series is actually the same as $K$ itself, which indicates that $K$
can't be refined any further than it is: there is no room for any more
normal subgroups between any of the existing groups. Its quotients
are $\mathbb{Z}_2$, $\mathbb{Z}_2$ and $\mathbb{Z}_2$, which are the same as those in the refined series
obtained from $H$, and hence the two refined series are isomorphic,
as predicted by Schreier's Refinement Theorem.
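The computations in this example can be reproduced mechanically. The sketch below again models $D_4$ as permutations of the vertices of a square, with $m_3 = r^2 m_1$; that encoding and the function names are assumptions made for illustration, and the code performs only the insertion and discarding steps of the proof, not a general isomorphism test.

```python
def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def product(X, Y):
    """The set XY of all products xy."""
    return frozenset(compose(x, y) for x in X for y in Y)


def refine(H, K):
    """Insert H_i (H_{i+1} ∩ K_j) between H_i and H_{i+1}, dropping repeats."""
    chain = []
    for i in range(len(H) - 1):
        for Kj in K:
            S = product(H[i], H[i + 1] & Kj)
            if not chain or chain[-1] != S:
                chain.append(S)
    return chain


e, r, m1 = (0, 1, 2, 3), (1, 2, 3, 0), (0, 3, 2, 1)
r2 = compose(r, r)
r3 = compose(r2, r)
m2, m3, m4 = compose(r, m1), compose(r2, m1), compose(r3, m1)
D4 = frozenset({e, r, r2, r3, m1, m2, m3, m4})

H = [frozenset({e}), frozenset({e, r, r2, r3}), D4]
K = [frozenset({e}), frozenset({e, m1}), frozenset({e, r2, m1, m3}), D4]

S = refine(H, K)
T = refine(K, H)
print([len(X) for X in S])  # [1, 2, 4, 8]
print(T == K)               # True
```

Each quotient in either refined series has order 2, and every group of order 2 is isomorphic to $\mathbb{Z}_2$, so the two refinements are isomorphic.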
In this example, the subnormal series $K$, that is,
$$\{e\} \triangleleft \{e, m_1\} \triangleleft \{e, r^2, m_1, m_3\} \triangleleft D_4$$
was already as long as it could be: there is no room to fit any more
normal subgroups in between any of the others on the list. Each group
is in some sense a largest possible proper normal subgroup of the
group containing it.
Definition 7.31 Let $G$ be a group, and suppose that $H < G$ is
a proper subgroup of $G$ that isn't contained in any other proper
subgroup of $G$. That is, there exists no other subgroup $K < G$ such
that $H < K$. Then we say $H$ is a maximal subgroup of $G$.
If $N \triangleleft G$ is a normal subgroup of $G$ that isn't contained in any other
proper normal subgroup of $G$, then we call $N$ a maximal normal
subgroup of $G$.

Maximal normal subgroups have an important property:
Proposition 7.32 Let $G$ be a group and $N \triangleleft G$ be a proper normal
subgroup of $G$. Then $G/N$ is simple exactly when $N$ is a maximal normal
subgroup of $G$.

Proof Let $N$ be a maximal normal subgroup of $G$. The quotient
homomorphism $q : G \to G/N$ maps every element $g \in G$ to the coset
$gN$ in $G/N$. By Proposition 4.37, the preimage $q^{-1}(H)$ of any nontrivial proper
normal subgroup $H$ of $G/N$ would be a proper normal subgroup of $G$
such that $N \triangleleft q^{-1}(H)$. But we said that $N$ was maximal normal in $G$,
so this can't happen and therefore $G/N$ must be simple.
Now suppose that $G/N$ is simple, and that there exists some proper
normal subgroup $M \triangleleft G$ with $N \triangleleft M$. Then $q(M) \triangleleft G/N$ is a nontrivial
proper normal subgroup. But $G/N$ is simple, so no such subgroup $q(M)$ can exist, and
hence $M$ can't exist either, so $N$ is maximal normal in $G$.

The reason we can't fit another subgroup in between, for example,
$\{e, m_1\}$ and $\{e, r^2, m_1, m_3\}$ is because the former is a maximal normal
subgroup of the latter. We want to study series of this type further.
Definition 7.33 A subnormal series
$$\{e\} = H_0 \triangleleft H_1 \triangleleft \cdots \triangleleft H_n = G$$
of a group $G$ is a composition series if it cannot be refined any
further. Equivalently, each group $H_i$ is a maximal normal subgroup of
$H_{i+1}$ for $0 \leq i < n$. By Proposition 7.32, this is the same as requiring
each quotient (or composition factor) $H_{i+1}/H_i$ to be simple.
A normal series with this property is called a principal or chief
series.

[Photograph: Camille Jordan (1838–1922). Credit: Wikimedia Commons]
The French mathematician Marie Ennemond Camille Jordan (1838–1922)
was born to a wealthy family in Lyon, and studied engineering and
mathematics at the École Polytechnique, as his father had before him. He
worked as an engineer for some years, doing mathematical research in his
spare time, and obtaining his doctorate in 1861 for a two-part thesis
entitled Sur le nombre des valeurs des fonctions and Sur des periodes des
fonctions inverses des intégrales des différentielles algebriques.
He was appointed professor at the École Polytechnique in 1876, and also
at the Collège de France in 1883, but continued to work as an engineer
until 1885, during which period he wrote his influential textbook Cours
d'Analyse, published between 1882 and 1887.
He made many contributions to several different areas of mathematics: as
well as the Jordan–Hölder Theorem, he also formulated the Jordan Curve
Theorem in topology and complex analysis, and the Jordan normal form in
linear algebra. He also did much work on Galois theory and studied the
Mathieu groups mentioned in Section 3.A. His 1870 monograph on permutation
groups, Traité des substitutions et des équations algébriques, won the Prix
Poncelet, a prestigious prize awarded by the Académie des Sciences.
Several other members of his family attained distinction in various fields:
his uncle Pierre Puvis de Chavannes (1824–1898) was a renowned artist, a
great uncle also named Camille Jordan (1771–1821) became an influential
politician, and his cousin Alexis Jordan (1814–1897) was an eminent botanist.
Not every group has a composition series or principal series. For
example, consider the subnormal (and normal) series
$$\{0\} \triangleleft 120\mathbb{Z} \triangleleft 60\mathbb{Z} \triangleleft 12\mathbb{Z} \triangleleft 6\mathbb{Z} \triangleleft 2\mathbb{Z} \triangleleft \mathbb{Z}.$$
There is no room for any more subgroups between $120\mathbb{Z}$ and $\mathbb{Z}$: we
can see this by considering the quotients $\mathbb{Z}_2$, $\mathbb{Z}_5$, $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_2$, which
are all cyclic groups of prime order and hence simple. But we can
certainly fit another group in between $\{0\}$ and $120\mathbb{Z}$: for example,
$240\mathbb{Z}$, which yields another simple quotient $\mathbb{Z}_2$. And having done so,
we can insert another group, say $720\mathbb{Z}$, before that, and so on.
The problem, though, is that whatever subgroup we insert here will
have to be of the form $k\mathbb{Z}$ for some $k \in \mathbb{Z}$, and so the first quotient will
always be isomorphic to $k\mathbb{Z} \cong \mathbb{Z}$, which is not simple. A little thought will
hopefully convince you that this problem arises for any subnormal
series of $\mathbb{Z}$, so we're forced to conclude that $\mathbb{Z}$ has no composition
series (and hence it can't have a principal series either).
[Photograph: Otto Ludwig Hölder (1859–1937)]
But those groups that do have composition series satisfy an important
theorem originally proved by the French mathematician Camille
Jordan (1838–1922) and later extended by the German mathematician
Otto Hölder (1859–1937).
Theorem 7.34 (The Jordan–Hölder Theorem) Any two composition
(or principal) series for a group $G$ are isomorphic.

Proof Let
$$\{e\} = H_0 \triangleleft H_1 \triangleleft \cdots \triangleleft H_m = G$$
and
$$\{e\} = K_0 \triangleleft K_1 \triangleleft \cdots \triangleleft K_n = G$$
be two composition (or principal) series for $G$. Then by Schreier's
Refinement Theorem these two series have isomorphic refinements.
But since all the composition factors are already simple, each group
$H_i$ is maximal normal in $H_{i+1}$ and each group $K_j$ is maximal normal
in $K_{j+1}$ for $0 \leq i < m$ and $0 \leq j < n$, and hence neither series
can be refined any further. Therefore both series must already be
isomorphic.
Also in Example 7.30 we found another composition series for $D_4$,
namely
$$\{e\} \triangleleft \{e, r^2\} \triangleleft \{e, r, r^2, r^3\} \triangleleft D_4.$$
This series contains the rotation subgroup $R_4 = \{e, r, r^2, r^3\}$, which is
known to be normal in $D_4$. In fact, if a group $G$ has a composition
series at all, then we can find one that contains any given normal
subgroup $N \triangleleft G$:
finite groups 261

Proposition 7.35 Let $G$ be a group that has a composition (or principal)
series, and let $N \lhd G$ be a proper normal subgroup of $G$. Then there
exists some composition (or principal) series for $G$ that contains $N$.

Proof Suppose that
$$\{e\} = H_0 \lhd H_1 \lhd \cdots \lhd H_n = G$$
is a composition (or principal) series for $G$. The series
$$\{e\} \lhd N \lhd G$$
is both subnormal and normal. By Schreier's Refinement Theorem
there is a refinement of this series that is isomorphic to the given
composition (or principal) series, and hence itself a composition (or
principal) series for $G$. This refinement will necessarily contain $N$.

7.3 Soluble and nilpotent groups

The world is devoted to physical science, because it believes these
discoveries will increase its capacity of luxury and self-indulgence.
But the pursuit of science only leads to the insoluble. When we arrive
at that barren term, the Divine voice summons man, as it summoned
Samuel;
Benjamin Disraeli, 1st Earl of Beaconsfield (1804–1881),
Lothair (1870) 70

Particularly important is the case where the composition factors
are abelian. This will become especially relevant in Chapter 11 when
we study the solubility by radicals of polynomial equations. The only
abelian finite simple groups are the cyclic groups $\mathbb{Z}_p$ where
$p$ is prime, so this condition is equivalent to requiring the composition
factors to be finite cyclic groups of prime order.
Definition 7.36 A group $G$ is soluble or solvable if it has a
composition series whose composition factors are all abelian, or
equivalently are cyclic groups $\mathbb{Z}_p$ of prime order.

From earlier discussion we know that $D_4$ is soluble, since it has a
composition series
$$\{e\} \lhd \{e, m_1\} \lhd \{e, r^2, m_1, m_3\} \lhd D_4$$
whose composition factors are all isomorphic to $\mathbb{Z}_2$. The group
$\mathbb{Z}_{30}$ is soluble since it has a composition series
$$\{0\} \lhd \langle 6 \rangle \lhd \langle 2 \rangle \lhd \mathbb{Z}_{30}$$
with composition factors $\mathbb{Z}_5$, $\mathbb{Z}_3$ and $\mathbb{Z}_2$,
all of which are simple and abelian. The symmetric group $S_5$ has a
composition series
$$\{\iota\} \lhd A_5 \lhd S_5,$$
which has composition factors $A_5$ and $\mathbb{Z}_2$. The alternating
group $A_5$ is simple but not abelian, so $S_5$ is not soluble. An
important consequence of this is the fact, originally discovered
independently in the early 19th century by Évariste Galois (1811–1832)
and Niels Henrik Abel (1802–1829), that the general quintic equation
$$ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0$$
cannot be solved by radicals.
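The composition series for $\mathbb{Z}_{30}$ given above can be checked
directly. The sketch below models $\mathbb{Z}_{30}$ as the integers modulo
30 under addition; the function name `cyclic_subgroup` is an illustrative
choice rather than anything from the text.

```python
def cyclic_subgroup(generator, n):
    """The subgroup <generator> of the additive group Z_n, as a set of residues."""
    return frozenset((generator * k) % n for k in range(n))


n = 30
# {0} < <6> < <2> < Z_30, where Z_30 itself is generated by 1.
series = [cyclic_subgroup(g, n) for g in (0, 6, 2, 1)]

# Each term should be a proper subset of the next one.
nested = all(lower < upper for lower, upper in zip(series, series[1:]))

orders = [len(H) for H in series]
# By Lagrange's Theorem the order of each factor is a ratio of orders.
factor_orders = [b // a for a, b in zip(orders, orders[1:])]
print(nested, orders, factor_orders)
```

Every subgroup of an abelian group is normal, so only the nesting and
the orders need checking; a group of prime order is cyclic, which
identifies the three factors.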
In fact, apart from the first few cases, no finite symmetric group $S_n$
is soluble:
Proposition 7.37 The symmetric group $S_n$ is not soluble for $n \geq 5$.

Proof By Proposition 3.43, the alternating group $A_n$ is simple for
$n \geq 5$. And by the discussion in Example 3.33, $A_n$ is normal in
$S_n$. We thus have a subnormal series
$$\{\iota\} \lhd A_n \lhd S_n$$
with factors $A_n$ and $\mathbb{Z}_2$, both of which are simple, and so
this is a composition series for $S_n$. By the Jordan–Hölder Theorem, any
other composition series for $S_n$ will also have composition factors
isomorphic to $A_n$ and $\mathbb{Z}_2$, and since $A_n$ is nonabelian,
$S_n$ isn't soluble.
However, lots of important classes of groups are soluble. The next few
results show that subgroups, quotients and direct products of soluble
groups are also soluble.
Proposition 7.38 If a group $G$ is soluble, then any subgroup $H \leq G$
is also soluble.
Proof Suppose that
$$\{e\} = G_0 \lhd G_1 \lhd \cdots \lhd G_n = G$$
is a composition series for $G$. Now form another chain of subgroups
$$\{e\} = H_0 < H_1 < \cdots < H_n = H \qquad (7.17)$$
where $H_i = H \cap G_i$ for $0 \leq i \leq n$. By Lemma 4.61 we know that
$H_i \lhd H_{i+1}$ for $0 \leq i < n$ since
$$H_i = H \cap G_i = (H \cap G_i) \cap G_{i+1} \lhd H \cap G_{i+1} = H_{i+1},$$
so (7.17) is a subnormal series for $H$.
Now set $A = \{e\}$, $K = G_{i+1}$ and $B = G_i$ and apply Zassenhaus'
Lemma to see that
$$\frac{H_{i+1}}{H_i} = \frac{H \cap G_{i+1}}{H \cap G_i} = \frac{H \cap K}{(A \cap K)(H \cap B)} \cong \frac{B(H \cap K)}{B(A \cap K)} = \frac{G_i(H \cap G_{i+1})}{G_i} \leq \frac{G_{i+1}}{G_i}.$$
Since $G_{i+1}/G_i$ is abelian, so is $H_{i+1}/H_i$, which is isomorphic
to one of its subgroups, and therefore $H$ is soluble.

Proposition 7.39 If a group $G$ is soluble, and $N \trianglelefteq G$ is
normal in $G$, then the quotient $G/N$ is also soluble.

Proof By Proposition 7.35 there is a composition series
$$\{e\} = H_0 \lhd H_1 \lhd \cdots \lhd H_k = N \lhd \cdots \lhd H_n = G$$
for $G$ that contains $N$. Taking the quotient of each of the groups
$H_k, \ldots, H_n$ by $N$ we obtain a new series
$$\{e\} = N/N = H_k/N < H_{k+1}/N < \cdots < H_n/N = G/N.$$
By the Third Isomorphism Theorem (Theorem 4.65, page 130) this is a
subnormal series, and $(H_{i+1}/N)/(H_i/N) \cong H_{i+1}/H_i$ for
$k \leq i < n$. Each of these quotients must therefore be abelian, and
hence this new series is a composition series for $G/N$ with abelian
composition factors. Therefore $G/N$ is soluble.
Proposition 7.40 Let $N \lhd G$ be a normal subgroup of some group $G$.
If $N$ and $G/N$ are both soluble, then so is $G$.

Proof Since $N$ and $G/N$ are soluble, there exist composition series
$$\{e\} = H_0 \lhd H_1 \lhd \cdots \lhd H_m = N$$
and
$$\{e\} = K_0 \lhd K_1 \lhd \cdots \lhd K_n = G/N.$$
In the latter series, each group $K_i$ can be written as $G_i/N$ for some
subgroup $G_i$, and $N$ corresponds to the identity element in the
quotient group $G/N$, so this can be written as
$$N \lhd G_1/N \lhd \cdots \lhd G_n/N = G/N.$$
All of the composition factors of both of these series are cyclic
groups of prime order, and by the Third Isomorphism Theorem we
have $(G_{i+1}/N)/(G_i/N) \cong G_{i+1}/G_i$, which must also be of prime
order. We can therefore construct a composition series
$$\{e\} = H_0 \lhd H_1 \lhd \cdots \lhd H_m = N \lhd G_1 \lhd \cdots \lhd G_n = G$$
for $G$, all of whose composition factors are cyclic groups of prime
order, and hence abelian. Thus $G$ is soluble.
An important corollary of this last proposition is that direct products
of soluble groups are themselves soluble:
Corollary 7.41 If $H$ and $K$ are soluble groups, then so is $H \times K$.

Proof Let $G = H \times K$. Then $H \lhd G$ and $G/H \cong K$, so by
Proposition 7.40 it follows that $G$ must also be soluble.
We can use Proposition 7.40 to show first that finite abelian groups are
soluble, and then that finite $p$-groups are soluble too.

Proposition 7.42 Let $G$ be a finite abelian group. Then $G$ is soluble.

Proof Suppose that $|G| = p$ for some prime integer $p$. Then
$G \cong \mathbb{Z}_p$, which is soluble, since there is a composition
series
$$\{0\} \lhd \mathbb{Z}_p$$
with composition factor isomorphic to $\mathbb{Z}_p$ itself.
Otherwise, we proceed by induction on $n = |G|$, which we assume
to be composite. Suppose that all abelian groups of order less than $n$
have already been shown to be soluble. If $p$ is a prime factor of $|G|$,
then by Cauchy's Theorem (Theorem 2.37, page 57) $G$ must have a cyclic
subgroup $H \cong \mathbb{Z}_p$. This subgroup $H$ is normal in $G$, and
its quotient $G/H$ is abelian with order $|G/H| = n/p < n$. Therefore both
$H$ and $G/H$ are soluble, and by Proposition 7.40, so is $G$.

Proposition 7.43 Let $G$ be a finite $p$-group of order $|G| = p^k$ for
some prime $p$ and positive integer $k$. Then $G$ is soluble.

Proof If $k = 1$ then $|G| = p$ and hence $G \cong \mathbb{Z}_p$, which is
soluble. We now proceed by induction on $k$. By Proposition 7.19, the
centre $Z(G)$ is nontrivial, and by Proposition 3.17, $Z(G) \lhd G$.
Furthermore, the centre $Z(G)$ is abelian, and hence soluble by
Proposition 7.42. Also $G/Z(G)$ is a $p$-group of order $p^m$, with
$m < k$, and is hence soluble by the inductive hypothesis. Therefore, by
Proposition 7.40 it follows that $G$ must also be soluble.
There are two other important results that we will now state without
proof. The first of these is another important result named after (and
in this case originally proved by) the British group theorist William
Burnside (1852–1927).
Theorem 7.44 (Burnside's Theorem) Any group $G$ of order $p^m q^n$,
where $p$ and $q$ are both prime and $m$ and $n$ are non-negative
integers, is soluble.
The usual proof of this theorem involves some sophisticated techniques
from representation theory. Also worth mentioning is a celebrated
result due to the American mathematicians Walter Feit (1930–2004)
and John Griggs Thompson. The published proof of this theorem is
rather complicated and filled an entire 255-page issue of the Pacific
Journal of Mathematics (W Feit and J G Thompson, Solvability of groups
of odd order, Pacific Journal of Mathematics 13.3 (1963) 775–1029), so
we will omit it.
Theorem 7.45 (Feit–Thompson Theorem) Any group of odd order is
soluble.
One important normal subgroup we met in Chapter 3 was the commutator
subgroup or derived subgroup $[G, G]$ of a group $G$ (Definition 3.19,
page 80). This gives some measure of how nonabelian a group is: $[G, G]$
is trivial exactly when $G$ is abelian, and the larger $[G, G]$, the more
$G$ fails to be abelian.
For any abelian group $G$, then, we have a very short subnormal series
$$[G, G] = \{e\} \lhd G.$$
But what happens if $G$ isn't abelian? If we try this for $S_3$, the
smallest nonabelian group, we find that
$[S_3, S_3] = A_3 = \langle (1\ 2\ 3) \rangle \cong \mathbb{Z}_3$, which
yields the subnormal series
$$\{\iota\} \lhd [S_3, S_3] = A_3 \lhd S_3.$$
Since $A_3$ is abelian, its commutator subgroup is trivial, so what we
have here is a series of subgroups of $S_3$, in which each group is the
commutator subgroup of the next group on the list. More generally, we
can get a descending series of groups in which each is the commutator
subgroup of its predecessor:
Definition 7.46 For a group $G$, recursively define subgroups
$$G^{(0)} = G, \quad G^{(1)} = [G^{(0)}, G^{(0)}], \quad G^{(2)} = [G^{(1)}, G^{(1)}], \quad \ldots$$
such that $G^{(i+1)} = [G^{(i)}, G^{(i)}]$ for $i \geq 0$. This series
$$\cdots \lhd G^{(2)} \lhd G^{(1)} \lhd G^{(0)} = G$$
is called the derived series of $G$.
The derived series for $S_4$ is
$$\{\iota\} \lhd \langle (1\ 2)(3\ 4), (1\ 3)(2\ 4) \rangle \lhd A_4 \lhd S_4. \qquad (7.18)$$
But if we try this for the symmetric group $S_5$, we find that
$[S_5, S_5] = A_5$ (Proposition 3.20, page 80), but then we run into a
problem (or, synonymously, some interesting behaviour) because $A_5$ is
simple (Proposition 3.43, page 94). By Proposition 3.21, commutator
subgroups are normal, so $[A_5, A_5]$ must be either trivial or $A_5$
itself. But $A_5$ isn't abelian, so its commutator subgroup can't be
trivial, which means that $[A_5, A_5] = A_5$. The derived series for
$S_5$, then, doesn't terminate at the trivial subgroup:
$$A_5 \lhd S_5.$$
The other property $S_5$ has that is relevant to the current discussion
is that it isn't soluble. This isn't a coincidence:
Proposition 7.47 A group $G$ is soluble if and only if its derived series
terminates at the trivial subgroup. That is, if $G^{(n)} = \{e\}$ for some
$n \geq 0$.

Proof Suppose that
$$\{e\} = G^{(n)} \lhd G^{(n-1)} \lhd \cdots \lhd G^{(1)} \lhd G^{(0)} = G$$
is the derived series for $G$. Then each quotient $G^{(i)}/G^{(i+1)}$ is
abelian. In particular, $G^{(n-1)}/G^{(n)} \cong G^{(n-1)}$ is abelian and
hence soluble. Next, $G^{(n-2)}/G^{(n-1)}$ is abelian and hence soluble,
so by Proposition 7.40, $G^{(n-2)}$ must also be soluble. We proceed by
downward induction: having shown that $G^{(k)}$ is soluble, and since
$G^{(k-1)}/G^{(k)}$ is abelian and therefore soluble, Proposition 7.40
implies that $G^{(k-1)}$ must also be soluble. Therefore, by induction,
$G = G^{(0)}$ is soluble.
Conversely, suppose $G$ is soluble. Then there is a composition series
$$\{e\} = G_m \lhd G_{m-1} \lhd \cdots \lhd G_0 = G$$
for $G$, with each composition factor $G_i/G_{i+1}$ an abelian simple
group, for $0 \leq i < m$. This composition series will be at least as
long as the derived series, and we claim that $G^{(i)} \leq G_i$ for
$0 \leq i \leq m$.
Meanwhile, the derived groups eventually stabilise, in the sense that
there exists a positive integer $n$ such that $G^{(i+1)} = G^{(i)}$ for
all $i \geq n$.
We proceed by induction on $i$: for $i = 0$ we have $G^{(0)} = G = G_0$,
and since $G_0/G_1$ is abelian, $G^{(1)} = [G, G] \leq G_1$ by
Proposition 3.31. Suppose that $G^{(i)} \leq G_i$ for some $i$. The
quotient $G_i/G_{i+1}$ is abelian, so $[G_i, G_i] \leq G_{i+1}$ by
Proposition 3.31, and hence
$$G^{(i+1)} = [G^{(i)}, G^{(i)}] \leq [G_i, G_i] \leq G_{i+1}$$
as claimed. Therefore $G^{(m)} \leq G_m = \{e\}$ and thus the derived
series terminates at the trivial subgroup.
This gives another way of testing for solubility: construct the
derived series and see if it eventually terminates at the trivial
subgroup. Moreover, the derived series of a group is the shortest
possible subnormal series with abelian quotients, and its length thus
gives us some potentially useful information about the complexity of the
group:
Definition 7.48 Let $G$ be a soluble group. Then the length of the
derived series for $G$ is the derived length of the group.
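The solubility test just described can be tried out on small symmetric
groups. The following is a minimal Python sketch, not an efficient
implementation: permutations are stored as tuples of images of
$0, \ldots, n-1$, and all function names are illustrative assumptions
rather than notation from the text.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure under composition; in a finite group this is the generated subgroup."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def commutator_subgroup(A, B, identity):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a in A, b in B."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in A for b in B
    }
    return generated_subgroup(commutators, identity)


def derived_series(G, identity):
    """The list G^(0), G^(1), ... up to the point where the series stabilises."""
    series = [frozenset(G)]
    while True:
        nxt = commutator_subgroup(series[-1], series[-1], identity)
        if nxt == series[-1]:
            return series
        series.append(nxt)


orders = {}
for n in (3, 4, 5):
    identity = tuple(range(n))
    G = frozenset(permutations(range(n)))   # the symmetric group S_n
    orders[n] = [len(H) for H in derived_series(G, identity)]

print(orders)
```

A group in this sketch is soluble exactly when the last order listed is 1,
and in that case the derived length is one less than the number of terms.
For $S_3$ and $S_4$ the series reaches the trivial subgroup, while for
$S_5$ it stalls at a subgroup of order 60, namely $A_5$.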

The trivial group $\{e\}$ is the only group with derived length 0. The
groups of derived length 1 are exactly the nontrivial abelian groups:
if $A$ is abelian and soluble, then $A^{(0)} = A$, while
$A^{(1)} = [A, A] = \{e\}$; conversely if $A^{(1)} = [A, A] = \{e\}$ then
this means that every element of $A$ commutes with every other element of
$A$, and therefore $A$ is abelian. The groups of derived length 2 are
precisely those with nontrivial abelian commutator groups
$G^{(1)} = [G, G]$; such groups are called metabelian, and we will
briefly return to them later.
Another important normal subgroup we met in Chapter 3 is the centre
$Z(G)$ of a group $G$. We can use it to construct normal or subnormal
series but in a slightly more complicated way than we did with the
commutator subgroup $[G, G]$. The obvious method, where each group
is the centre of its predecessor, doesn't work, or at least not in a very
interesting way: $Z(G)$ is abelian, so its centre is the same; that is,
$Z(Z(G)) = Z(G)$. Hence any such series will stabilise after at most
one step.

So we need a different approach, and the key is to consider the centre
of quotient groups instead. More precisely:
Definition 7.49 A normal series
$$\{e\} = G_0 < G_1 < \cdots < G_n = G$$
is central if $G_{i+1}/G_i \leq Z(G/G_i)$ for $0 \leq i < n$.
That is, each quotient $G_{i+1}/G_i$ of the series is contained in the
centre of the corresponding quotient $G/G_i$ of the full group $G$. For
example:
Example 7.50 Suppose that
$$\{e\} \lhd \{e, r^2\} \lhd \{e, r, r^2, r^3\} \lhd D_4 \qquad (7.19)$$
is a normal series for $D_4$, with
$$G_0 = \{e\}, \quad G_1 = \{e, r^2\}, \quad G_2 = \{e, r, r^2, r^3\}, \quad G = G_3 = D_4.$$
The quotient $G_3/G_2$ consists of two cosets
$$eG_2 = rG_2 = r^2G_2 = r^3G_2 = \{e, r, r^2, r^3\},$$
$$m_1G_2 = m_2G_2 = m_3G_2 = m_4G_2 = \{m_1, m_2, m_3, m_4\}.$$
This is the same as the quotient $G/G_2$, whose centre $Z(G/G_2)$
consists of all cosets that commute with everything else. But since
$G/G_2 \cong \mathbb{Z}_2$ is abelian, we have
$$G_3/G_2 = Z(G/G_2).$$
The quotient $G_2/G_1$ consists of the cosets
$$eG_1 = r^2G_1 = \{e, r^2\}, \quad rG_1 = r^3G_1 = \{r, r^3\}.$$
Meanwhile, the quotient $G/G_1$ consists of the cosets
$$eG_1 = r^2G_1 = \{e, r^2\}, \quad rG_1 = r^3G_1 = \{r, r^3\},$$
$$m_1G_1 = m_3G_1 = \{m_1, m_3\}, \quad m_2G_1 = m_4G_1 = \{m_2, m_4\}.$$
This group is of order 4 and hence abelian, so $Z(G/G_1) = G/G_1$,
and we can also see that $G_2/G_1 \leq Z(G/G_1)$.
Finally, $G_1/G_0$ consists of the cosets
$$eG_0 = \{e\}, \quad r^2G_0 = \{r^2\},$$
while $G/G_0$ consists of the cosets
$$eG_0 = \{e\}, \quad rG_0 = \{r\}, \quad r^2G_0 = \{r^2\}, \quad r^3G_0 = \{r^3\},$$
$$m_1G_0 = \{m_1\}, \quad m_2G_0 = \{m_2\}, \quad m_3G_0 = \{m_3\}, \quad m_4G_0 = \{m_4\}.$$
The centre of this quotient is $\{eG_0, r^2G_0\}$, and thus again we have
$G_1/G_0 \leq Z(G/G_0)$. So (7.19) is a central series.
This construction only works with a normal series, since we need
each quotient $G/G_i$ to be a valid quotient group. It will always be
the case that $G_{i+1}/G_i$ is a subgroup of $G/G_i$, but the requirement
that $G_{i+1}/G_i \leq Z(G/G_i)$ is stronger, and will not always hold. A
central series will always have abelian quotients $G_{i+1}/G_i$ but not
every series with abelian quotients need be central.
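The centrality condition can be tested mechanically. The sketch below
uses the fact (proved as Lemma 7.57 later in this section) that
$G_{i+1}/G_i \leq Z(G/G_i)$ holds exactly when every commutator
$ghg^{-1}h^{-1}$ with $g \in G$ and $h \in G_{i+1}$ lies in $G_i$. It
models $D_4$ as permutations of the vertices $0, 1, 2, 3$ of a square;
this model and all function names are assumptions made for illustration.

```python
def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure under composition; in a finite group this is the generated subgroup."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def commutator(g, h):
    return compose(compose(g, h), compose(inverse(g), inverse(h)))


def is_central(series, G):
    """Check [G, G_{i+1}] <= G_i for each consecutive pair in a normal series."""
    return all(
        commutator(g, h) in lower
        for lower, upper in zip(series, series[1:])
        for g in G
        for h in upper
    )


e = (0, 1, 2, 3)
r = (1, 2, 3, 0)          # rotation through a quarter turn
m = (0, 3, 2, 1)          # a reflection fixing vertex 0
D4 = generated_subgroup({r, m}, e)
trivial = frozenset({e})
centre = generated_subgroup({compose(r, r)}, e)   # {e, r^2}
rotations = generated_subgroup({r}, e)            # {e, r, r^2, r^3}

series_7_19 = [trivial, centre, rotations, D4]
shorter = [trivial, rotations, D4]
print(is_central(series_7_19, D4), is_central(shorter, D4))
```

The second series $\{e\} \lhd \{e, r, r^2, r^3\} \lhd D_4$ is normal and
has abelian quotients of orders 4 and 2, yet it fails the test because
the rotation $r$ does not commute with the reflections.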
There are two ways to form a central series from a group G: start
at {e} and work upwards to construct an ascending series, or start
at G itself and recursively form a descending series. These methods
need not result in the same series; also the ascending series need not
terminate at G, and the descending series might not reach {e}.
We'll try both of these approaches in turn. First we'll build an
ascending series starting at $G_0 = \{e\}$. We want the second group
$G_1$ to have the property that $G_1/G_0 \leq Z(G/G_0)$, and one way of
ensuring this is to choose $G_1$ such that $G_1/G_0 = Z(G/G_0)$. That is,
we want $G_1$ to be the subgroup of $G$ whose quotient by $G_0$ is exactly
the centre of $G/G_0$, and we can achieve this by setting $G_1 = Z(G)$.
Next we want $G_2$ to be a normal subgroup of $G$ such that
$G_2/G_1 \leq Z(G/G_1)$ and again we'll choose $G_2$ so that
$G_2/G_1 = Z(G/G_1)$. To do this, we need to set $G_2$ to be the subgroup
of $G$ whose quotient by $G_1$ is exactly the centre of the quotient
$G/G_1$. Continuing this process we obtain an ascending central series of
groups:
Definition 7.51 For a group $G$, form an ascending normal series
$$G_0 = \{e\} \lhd G_1 \lhd \cdots \lhd G_n$$
such that $G_1 = Z(G)$ and $G_i$ is the higher centre $Z_i(G)$ of $G$
defined such that $Z_0(G) = \{e\}$, and
$$Z_i(G)/Z_{i-1}(G) = Z(G/Z_{i-1}(G))$$
for $0 < i \leq n$. This series is the ascending central series or upper
central series of $G$.
As an example, we'll do this for the dihedral group $D_4$:
Example 7.52 Let $G = D_4$. Then $G_0 = Z_0(G) = \{e\}$ and
$G_1 = Z_1(G) = Z(G) = \{e, r^2\}$. Next we want $G_2 = Z_2(G)$ to be the
subgroup of $G$ such that $G_2/G_1 = Z(G/G_1)$. Now $G/G_1$ consists of
the cosets
$$eG_1 = r^2G_1 = \{e, r^2\}, \quad rG_1 = r^3G_1 = \{r, r^3\},$$
$$m_1G_1 = m_3G_1 = \{m_1, m_3\}, \quad m_2G_1 = m_4G_1 = \{m_2, m_4\}.$$
This quotient group is of order 4 and hence abelian, so
$Z(G/G_1) = G/G_1$, and hence $G_2 = Z_2(G)$ must be $G$ itself. Therefore
the upper central series of $D_4$ is
$$\{e\} \lhd \{e, r^2\} \lhd D_4.$$

As remarked earlier, the upper central series need not terminate at $G$.
If we try this construction with the symmetric group $G = S_3$, for
example, we find that $G_0 = Z_0(G) = \{\iota\}$, and
$G_1 = Z_1(G) = Z(G) = \{\iota\}$ again. So all the higher centres of
$S_3$ are trivial, and thus the upper central series consists of just the
trivial subgroup $\{\iota\}$.
The upper central series for the dihedral group $D_6$ is a little more
interesting, but also doesn't reach $D_6$:
$$\{e\} \lhd \{e, r^3\}.$$
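These three upper central series can be recomputed with a short
program. The sketch below rewrites the defining condition
$Z_{i+1}(G)/Z_i(G) = Z(G/Z_i(G))$ in terms of elements: $h$ lies in
$Z_{i+1}(G)$ exactly when $ghg^{-1}h^{-1} \in Z_i(G)$ for every $g \in G$.
The dihedral group $D_n$ is modelled as permutations of the vertices of a
regular $n$-gon, and $S_3$ is represented by the isomorphic group $D_3$;
the function names are illustrative assumptions.

```python
def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def commutator(g, h):
    return compose(compose(g, h), compose(inverse(g), inverse(h)))


def dihedral(n):
    """D_n as permutations of 0..n-1, generated by a rotation and a reflection."""
    identity = tuple(range(n))
    r = tuple((i + 1) % n for i in range(n))
    m = tuple((-i) % n for i in range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in (r, m):
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements), identity


def upper_central_series(G, identity):
    """The list Z_0(G), Z_1(G), ... up to the point where the series stabilises."""
    series = [frozenset({identity})]
    while True:
        previous = series[-1]
        nxt = frozenset(
            h for h in G
            if all(commutator(g, h) in previous for g in G)
        )
        if nxt == previous:
            return series
        series.append(nxt)


orders = {}
for n in (3, 4, 6):
    G, identity = dihedral(n)
    orders[n] = [len(Z) for Z in upper_central_series(G, identity)]

print(orders)
```

The series reaches the whole group only for $D_4$; for $D_3 \cong S_3$ it
never leaves the trivial subgroup, and for $D_6$ it stops at the centre
of order 2.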
We'll explore this idea further in a little while, but now it's time to
try the other approach and construct a descending normal series by
starting at $G$ and working downwards.
So, we set $G_0 = G$ to begin with. Next we want $G_1$ to have the
property $G_0/G_1 \leq Z(G/G_1)$ and as before the obvious way of
achieving this is to choose $G_1$ such that $G_0/G_1 = Z(G/G_1)$. What
this means is that $G_1$ has to be selected so that $G/G_1$ is abelian,
and the obvious way of doing this is to set $G_1 = [G, G]$. Any other
choice of $G_1$ for which $G/G_1$ is abelian will contain $[G, G]$, by
Proposition 3.31, and so by setting $G_1 = [G, G]$ we get the largest
possible abelian quotient $G/G_1$.
Next, we want to choose $G_2$ so that $G_1/G_2 \leq Z(G/G_2)$, and again
we might as well aim for $G_1/G_2 = Z(G/G_2)$. What we want, therefore, is
a normal subgroup $G_2 \lhd G$ such that the cosets in $G_1/G_2$ are
exactly those that commute with all of the cosets in $G/G_2$. We can
achieve this by setting
$G_2 = [G, G_1] = \langle ghg^{-1}h^{-1} : g \in G,\ h \in G_1 \rangle$,
the subgroup generated by these commutators. More generally, this process
leads to the following descending series for $G$:
Definition 7.53 For a group $G$, let $\gamma_1(G) = G$, and recursively
define
$$\gamma_i(G) = [G, \gamma_{i-1}(G)] = \langle ghg^{-1}h^{-1} : g \in G,\ h \in \gamma_{i-1}(G) \rangle$$
for $i > 1$. The descending series
$$G_n \lhd G_{n-1} \lhd \cdots \lhd G_0 = G,$$
where $G_i = \gamma_{i+1}(G)$ for $0 \leq i \leq n$, is the descending
central series or lower central series of $G$.
Let's try this for the dihedral group $D_4$ again.
Example 7.54 First we set $G_0 = G = D_4$. Next, we set
$G_1 = \gamma_2(G) = [G, G]$, which is $\{e, r^2\}$. The third group
$G_2 = \gamma_3(G) = [G_1, G]$ is the trivial subgroup $\{e\}$, and so the
lower central series for $D_4$ is
$$\{e\} \lhd \{e, r^2\} \lhd D_4.$$
Earlier we saw that the upper central series for the symmetric group
$S_3$ doesn't reach $S_3$, and it turns out that the lower central series
doesn't terminate at the trivial subgroup either. To see this, we first
set $G_0 = \gamma_1(S_3) = S_3$. The next group in the series is
$G_1 = \gamma_2(S_3) = [S_3, S_3] = A_3$. And here everything grinds to a
halt, because $\gamma_3(S_3) = [S_3, A_3] = A_3$ as well, so the lower
central series for $S_3$ is therefore
$$A_3 \lhd S_3.$$
The upper and lower central series for $D_4$ happen to be the same, but
those for $S_3$ aren't even the same length.
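The two lower central series just described can be recomputed as
follows. This is a minimal Python sketch under the same assumptions as
before: groups are finite sets of permutation tuples, $S_3$ is
represented by the isomorphic dihedral group $D_3$, and the function
names are illustrative.

```python
def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure under composition; in a finite group this is the generated subgroup."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def commutator_subgroup(A, B, identity):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a in A, b in B."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in A for b in B
    }
    return generated_subgroup(commutators, identity)


def dihedral(n):
    identity = tuple(range(n))
    r = tuple((i + 1) % n for i in range(n))
    m = tuple((-i) % n for i in range(n))
    return generated_subgroup({r, m}, identity), identity


def lower_central_series(G, identity):
    """The list gamma_1(G), gamma_2(G), ... up to the point where it stabilises."""
    series = [G]
    while True:
        nxt = commutator_subgroup(G, series[-1], identity)
        if nxt == series[-1]:
            return series
        series.append(nxt)


orders = {}
for n in (3, 4):
    G, identity = dihedral(n)
    orders[n] = [len(H) for H in lower_central_series(G, identity)]

print(orders)
```

For $D_4$ the orders fall from 8 to 2 to 1, matching Example 7.54, while
for $D_3 \cong S_3$ the series stalls at the subgroup of order 3.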
The case where a group G has a finite-length central series which
reaches {e} at one end, and G at the other, is particularly interesting:
Definition 7.55 Suppose that a group $G$ has a central series
$$\{e\} = G_0 < G_1 < \cdots < G_n = G$$
connecting $G$ with its trivial subgroup $\{e\}$. Then we say that $G$ is
nilpotent. The smallest possible length $n$ over all such central series
is called the nilpotency class of $G$.
The lower central series is in some sense the optimal descending
central series: if it reaches {e} then it does so faster than any other
descending central series.
Proposition 7.56 Suppose that a group $G$ has a finite descending central
series
$$\{e\} = G_{n+1} < G_n < \cdots < G_1 = G.$$
Then $\gamma_i(G) \leq G_i$ for $0 < i \leq n+1$. Furthermore, if $G$ has
nilpotency class $c$, then $n \geq c$.
To prove this, we first need the following lemma.
Lemma 7.57 Let $G$ be a group, and suppose that $K \trianglelefteq G$ and
$K \trianglelefteq H \leq G$. Then $H/K \leq Z(G/K)$ if and only if
$[G, H] \leq K$.

Proof Suppose that $H/K \subseteq Z(G/K)$. This means that every coset in
$H/K$ commutes with every coset in $G/K$. That is, for every $h \in H$ and
$g \in G$ we have $(hK)(gK) = (gK)(hK)$. Moreover, $(hK)(gK) = (hg)K$
and $(gK)(hK) = (gh)K$, so
$$(gh)K = (gK)(hK) = (hK)(gK) = (hg)K$$
and hence by Proposition 2.26 it follows that $(hg)(gh)^{-1} \in K$. But
$(hg)(gh)^{-1} = hgh^{-1}g^{-1}$ is the commutator $[h, g] \in [H, G]$.
Thus every such commutator lies in $K$ and hence $[H, G] \leq K$.
Each step in this argument is reversible: suppose that $[H, G] \leq K$.
Then for any $hgh^{-1}g^{-1} \in [H, G] \leq K$ we have $(hg)K = (gh)K$
and hence $H/K \leq Z(G/K)$.

Proof of Proposition 7.56 We will prove the inclusion
$\gamma_i(G) \leq G_i$ by induction on $i$. The inclusion obviously holds
for $i = 1$, since $\gamma_1(G) = G = G_1$.
Now suppose that $\gamma_i(G) \leq G_i$ for some $i \geq 1$. Then
$$\gamma_{i+1}(G) = [G, \gamma_i(G)] \leq [G, G_i] \leq G_{i+1}.$$
The second of these inclusions follows from Lemma 7.57 by setting
$H = G_i$ and $K = G_{i+1}$.
If $G$ has nilpotency class $c$, then any central series for $G$ must
have length at least $c$. The series above has length $n$, hence
$n \geq c$.
What this means is that if $G$ is nilpotent, that is, if there exists a
central series of finite length which has $G$ at one end and $\{e\}$ at
the other, then the lower central series is finite and terminates at
$\{e\}$ as well. (This last statement follows because
$\gamma_{n+1}(G) \leq G_{n+1} = \{e\}$.) Not only that, the lower central
series is the shortest descending central series that does this.
An analogous statement holds for the upper central series too:
Proposition 7.58 Suppose that a group $G$ has a finite ascending central
series
$$\{e\} = G_0 < G_1 < \cdots < G_n = G.$$
Then $G_i \leq Z_i(G)$ for $0 \leq i \leq n$. Furthermore, if $G$ has
nilpotency class $c$, then $n \geq c$.

Proof As before, we will prove the inclusion $G_i \leq Z_i(G)$ by
induction on $i$. The base case is clear, since $G_0 = \{e\} = Z_0(G)$.
Now suppose that $G_i \leq Z_i(G)$ for some $i \geq 0$. For any $g \in G$
and $h \in G_{i+1}$ it follows that $(gG_i)(hG_i) = (hG_i)(gG_i)$ in
$G/G_i$, since $G_{i+1}/G_i \leq Z(G/G_i)$. And since $G_i \leq Z_i(G)$ we
also have
$$(gZ_i(G))(hZ_i(G)) = (hZ_i(G))(gZ_i(G)).$$
Since $g \in G$ was arbitrary, this means that every coset of the form
$hZ_i(G)$ commutes with every coset of the form $gZ_i(G)$, and hence
$hZ_i(G) \in Z(G/Z_i(G))$. But $Z(G/Z_i(G)) = Z_{i+1}(G)/Z_i(G)$, and
therefore $h \in Z_{i+1}(G)$. We originally chose $h \in G_{i+1}$, hence
$G_{i+1} \leq Z_{i+1}(G)$ as claimed.
If $G$ has nilpotency class $c$, then any central series for $G$ must
have length at least $c$, hence $n \geq c$.
What all this tells us is that if $G$ is nilpotent, then the upper
central series is finite and terminates at $G$. (This is because
$G = G_n \leq Z_n(G) \leq G$.) Moreover, the upper central series is the
fastest ascending central series that terminates at $G$.
Corollary 7.59 The following statements are equivalent:
(i) $G$ is nilpotent,
(ii) $\gamma_n(G) = \{e\}$ for some $n > 0$, and
(iii) $Z_n(G) = G$ for some $n > 0$.

The groups $\gamma_i(G)$ in the lower central series are defined
recursively in terms of commutator subgroups
$\gamma_{i+1}(G) = [G, \gamma_i(G)]$, and the groups $G^{(i)}$ in the
derived series are defined recursively as commutator subgroups
$G^{(i+1)} = [G^{(i)}, G^{(i)}]$. The next proposition tells us how these
series are related.
Proposition 7.60 For any group $G$, $G^{(i)} \leq \gamma_{i+1}(G)$ for all
$i \geq 0$.

Proof We prove this by induction on $i$. The base case $i = 0$ holds,
since $G^{(0)} = G = \gamma_1(G)$. Now suppose that
$G^{(i)} \leq \gamma_{i+1}(G)$ for some $i \geq 0$. Then
$$G^{(i+1)} = [G^{(i)}, G^{(i)}] \leq [G^{(i)}, G] \leq [\gamma_{i+1}(G), G] = \gamma_{i+2}(G).$$
Hence $G^{(i)} \leq \gamma_{i+1}(G)$ for all $i \geq 0$, as claimed.
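The inclusion in Proposition 7.60 can be strict, which a small
computation makes visible. The sketch below compares the derived series
and the lower central series of the dihedral group of order 16, modelled
as permutations of the vertices of a regular octagon. The choice of this
group, the list indexing (position $i$ of `lower_central` holds
$\gamma_{i+1}(G)$) and all function names are assumptions made for
illustration.

```python
def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure under composition; in a finite group this is the generated subgroup."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)


def commutator_subgroup(A, B, identity):
    """Subgroup generated by all commutators a b a^-1 b^-1 with a in A, b in B."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in A for b in B
    }
    return generated_subgroup(commutators, identity)


def iterate_until_stable(start, step):
    """Apply `step` repeatedly, collecting terms until the series stops changing."""
    series = [start]
    while True:
        nxt = step(series[-1])
        if nxt == series[-1]:
            return series
        series.append(nxt)


def term(series, i):
    """Term i of a series that stays constant after its last listed entry."""
    return series[min(i, len(series) - 1)]


n = 8
e = tuple(range(n))
r = tuple((i + 1) % n for i in range(n))
m = tuple((-i) % n for i in range(n))
G = generated_subgroup({r, m}, e)

derived = iterate_until_stable(G, lambda H: commutator_subgroup(H, H, e))
lower_central = iterate_until_stable(G, lambda H: commutator_subgroup(G, H, e))

derived_orders = [len(H) for H in derived]
lower_central_orders = [len(H) for H in lower_central]
# derived[i] is G^(i) and lower_central[i] is gamma_{i+1}(G).
inclusions = [
    term(derived, i) <= term(lower_central, i)
    for i in range(len(lower_central))
]
print(derived_orders, lower_central_orders, inclusions)
```

Here the derived series reaches the trivial subgroup one step before the
lower central series does, so $G^{(2)}$ is strictly smaller than
$\gamma_3(G)$.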
Corollary 7.61 Nilpotent groups are soluble.

Proof Suppose $G$ is a nilpotent group. Then its lower central series
terminates at $\{e\}$ after finitely many steps. Since
$G^{(i)} \leq \gamma_{i+1}(G)$ for all $i \geq 0$, it follows that the
derived series must also reach $\{e\}$ after finitely many steps, and
hence by Proposition 7.47 $G$ must be soluble.
Not all soluble groups are nilpotent, however. For example, $S_3$ is
soluble, because its derived series
$$\{\iota\} \lhd A_3 \lhd S_3$$
reaches $\{\iota\}$ after finitely many steps. But its lower central
series
$$A_3 \lhd S_3$$
doesn't reach $\{\iota\}$, so it isn't nilpotent.
We saw earlier that subgroups, quotients, extensions and direct
products of soluble groups are soluble. Some analogous results hold for
nilpotent groups. Firstly, we show that subgroups of nilpotent groups
are nilpotent.
Proposition 7.62 Let $G$ be nilpotent, and suppose that $H \leq G$. Then
$H$ is also nilpotent.

Proof We prove by induction that $\gamma_i(H) \leq \gamma_i(G)$. The base
case $i = 1$ holds, since $\gamma_1(H) = H \leq G = \gamma_1(G)$. Now
suppose that $\gamma_i(H) \leq \gamma_i(G)$ for some $i \geq 1$. Then
$$\gamma_{i+1}(H) = [\gamma_i(H), H] \leq [\gamma_i(H), G] \leq [\gamma_i(G), G] = \gamma_{i+1}(G).$$
So $\gamma_i(H) \leq \gamma_i(G)$ for all $i > 0$. If $G$ is nilpotent
then there exists some $n > 0$ such that $\gamma_n(G) = \{e\}$, and thus
$\gamma_n(H) \leq \gamma_n(G) = \{e\}$ must also be trivial, so $H$ is
nilpotent.
Next, we show that finite direct products of nilpotent groups are
nilpotent.

Proposition 7.63 If $G$ and $H$ are nilpotent groups, then their direct
product $G \times H$ is nilpotent as well.

Proof We prove this by showing that
$\gamma_i(G \times H) \leq \gamma_i(G) \times \gamma_i(H)$ for $i > 0$.
Again, we prove this by induction. The base case $i = 1$ holds, since
$\gamma_1(G \times H) = G \times H = \gamma_1(G) \times \gamma_1(H)$.
Now suppose that
$\gamma_i(G \times H) \leq \gamma_i(G) \times \gamma_i(H)$ for some
$i \geq 1$. Then
$$\gamma_{i+1}(G \times H) = [\gamma_i(G \times H), G \times H] \leq [\gamma_i(G) \times \gamma_i(H), G \times H]$$
$$\leq [\gamma_i(G), G] \times [\gamma_i(H), H] = \gamma_{i+1}(G) \times \gamma_{i+1}(H),$$
and so the induction holds for all $i \geq 1$. Since $G$ and $H$ are
nilpotent, there exist $m, n \in \mathbb{N}$ such that
$\gamma_m(G) = \gamma_n(H) = \{e\}$. Let $k = \max(m, n)$. Then
$$\gamma_k(G \times H) \leq \gamma_k(G) \times \gamma_k(H) = \{e\} \times \{e\} = \{(e, e)\}.$$
Hence the lower central series for $G \times H$ terminates at the trivial
subgroup $\{(e, e)\}$ after finitely many steps, so $G \times H$ is
nilpotent.
We can extend this by induction to show that direct products of finitely
many nilpotent groups are nilpotent:
Corollary 7.64 Let $G_1, \ldots, G_n$ be nilpotent. Then the direct
product $G_1 \times \cdots \times G_n$ is nilpotent too.
Finally, we show that images of nilpotent groups are nilpotent, and
consequently so are their quotients.
Proposition 7.65 Let $f : G \to H$ be a homomorphism from a nilpotent
group $G$ to some group $H$. Then the image $f(G)$ is nilpotent too.

Proof Again, we prove this by considering the lower central series,
in this case by showing inductively that
$\gamma_i(f(G)) = f(\gamma_i(G))$ for all $i > 0$.
The base case $i = 1$ holds, since
$\gamma_1(f(G)) = f(G) = f(\gamma_1(G))$. Now suppose that
$f(\gamma_i(G)) = \gamma_i(f(G))$ for some $i \geq 1$, and let $g \in G$
and $h \in \gamma_i(G)$. Then
$$f([g, h]) = [f(g), f(h)] \in [f(G), f(\gamma_i(G))] = [f(G), \gamma_i(f(G))] = \gamma_{i+1}(f(G)).$$
Hence $f(\gamma_{i+1}(G)) \leq \gamma_{i+1}(f(G))$.
Now suppose that $x \in f(G)$ and
$y \in \gamma_i(f(G)) = f(\gamma_i(G))$. Since $f$ maps $G$ surjectively
onto $f(G)$, there exist $g \in G$ and $h \in \gamma_i(G)$ such that
$x = f(g)$ and $y = f(h)$. Then
$$[x, y] = [f(g), f(h)] = f([g, h]) \in f([G, \gamma_i(G)]) = f(\gamma_{i+1}(G)).$$
Hence $\gamma_{i+1}(f(G)) \leq f(\gamma_{i+1}(G))$, so
$\gamma_{i+1}(f(G)) = f(\gamma_{i+1}(G))$, and inductively
$\gamma_i(f(G)) = f(\gamma_i(G))$ for all $i > 0$.
Now, since $G$ is nilpotent, there exists some positive integer $n$ such
that $\gamma_n(G) = \{e\}$, and therefore
$\gamma_n(f(G)) = f(\gamma_n(G)) = f(\{e\}) = \{e\}$, which means that
$f(G)$ is nilpotent.
Corollary 7.66 If $G$ is a nilpotent group, and $N \trianglelefteq G$,
then the quotient $G/N$ is also nilpotent.

Proof This follows from Proposition 7.65 by considering the canonical
surjective homomorphism $q : G \to G/N$.
Extensions of nilpotent groups, however, need not be nilpotent. That is,
given a group $G$ and normal subgroup $N \trianglelefteq G$ such that $N$
and $G/N$ are nilpotent, it isn't true in general that $G$ must also be
nilpotent. (The analogous statement for soluble groups is true, by
Proposition 7.40.) The dihedral group $D_3$ is a simple counterexample:
we know $D_3$ isn't nilpotent, because it's isomorphic to $S_3$, which we
saw earlier isn't nilpotent. However, it has a normal subgroup
$R_3 \cong \mathbb{Z}_3$ which is nilpotent, and the quotient
$D_3/R_3 \cong \mathbb{Z}_2$ is also nilpotent.
The nilpotency class (the minimal length of a central series for the
group) gives us a way of partially classifying groups. The trivial
group $\{e\}$ is the only group with nilpotency class 0. The groups
with nilpotency class 1 are exactly the nontrivial abelian groups: the
lower central series for an abelian group $A$ is given by
$\gamma_1(A) = A$ and $\gamma_2(A) = [A, A] = \{e\}$, so the nilpotency
class of $A$ is 1; conversely if $A$ has nilpotency class 1 then its lower
central series must terminate with $\gamma_2(A) = [A, A] = \{e\}$, which
only happens if $A$ is abelian.
Proposition 7.43 showed that all finite $p$-groups are soluble. An
analogous result holds with respect to nilpotency:
Proposition 7.67 Let $G$ be a finite $p$-group of order $|G| = p^k$ for
some prime $p$ and positive integer $k$. Then $G$ is nilpotent.

Proof By Proposition 7.19 finite $p$-groups have nontrivial centres,
so $Z_1(G) = Z(G)$ has order a nonzero power of $p$ by Lagrange's
Theorem (Theorem 2.30, page 54). Furthermore, $Z_2(G)$ is defined such
that $Z_2(G)/Z_1(G) = Z(G/Z_1(G))$. The quotient $G/Z_1(G)$ must be
either trivial or a $p$-group itself. If the former, $Z_1(G) = G$ and
hence $G$ is nilpotent by Corollary 7.59; if the latter, $Z(G/Z_1(G))$
must be nontrivial by Proposition 7.19, and hence have order a power of
$p$ by Lagrange's Theorem. Thus $Z_2(G)$ must be a nontrivial $p$-group
that strictly contains $Z_1(G)$. Proceeding in this way, we find that
$Z_i(G) < Z_{i+1}(G)$ for all $i \geq 0$ for as long as $Z_i(G) \neq G$,
and since $|G|$ is finite, the upper central series must eventually
terminate at $G$. Hence $G$ is nilpotent by Corollary 7.59.
Putting this together with Proposition 7.63 and Corollary 7.64, it
follows that any direct product of finitely many finite $p$-groups must
also be nilpotent. In particular, if a finite group $G$ happens to be
a direct product of its Sylow subgroups, then it will therefore be
nilpotent. What isn't so obvious is that the converse also holds: all
finite nilpotent groups can be expressed as direct products of their
Sylow subgroups.
Proposition 7.68 A finite group G is nilpotent if and only if it is the
direct product of its Sylow subgroups.
To prove this, we first need the following proposition about normalis-
ers in nilpotent groups.
Proposition 7.69 Let $G$ be a nilpotent group, and $H < G$ a proper
subgroup of $G$. Then $H < N_G(H)$.
What this says is that every proper subgroup of a nilpotent group is
strictly contained in its normaliser.
Proof Since $H < G$, we have $H < \gamma_1(G) = G$ and
$\{e\} = \gamma_{n+1}(G) \leq H$, where $n$ is the nilpotency class of
$G$. Now choose the smallest $i > 1$ such that $\gamma_i(G) \leq H$, so
that $\gamma_{i-1}(G)$ is not contained in $H$. Then
$$[\gamma_{i-1}(G), H] \leq [\gamma_{i-1}(G), G] = \gamma_i(G) \leq H,$$
which means for any $g \in \gamma_{i-1}(G)$ and $h \in H$, we have
$ghg^{-1}h^{-1} = [g, h] \in H$, and thus $ghg^{-1} \in H$. Hence
$gHg^{-1} = H$ for all $g \in \gamma_{i-1}(G)$, which means that
$\gamma_{i-1}(G) \leq N_G(H)$. And since $\gamma_{i-1}(G)$ contains
elements that are not in $H$, it follows that $H < N_G(H)$.
In the proof of Proposition 7.68, we will use the contrapositive form
of this result:
Corollary 7.70 Let $G$ be a nilpotent group, and suppose that $H \leq G$
such that $H = N_G(H)$. Then $H = G$.
One half of the proof of Proposition 7.68 is now straightforward, but
the other half requires a short lemma about normalisers of Sylow
subgroups.
Lemma 7.71 Let $H$ be a Sylow $p$-subgroup of a finite group $G$, for
some prime $p$. Then $N_G(N_G(H)) = N_G(H)$.

Proof By the definition of the normaliser (Definition 4.46, page 122),
$H \trianglelefteq N_G(H)$, and by Corollary 7.12, $H$ must be the unique
Sylow $p$-subgroup of $N = N_G(H)$. For any $g \in N_G(N)$ we have
$gHg^{-1} \leq gNg^{-1} = N$, so $gHg^{-1}$ is also a Sylow $p$-subgroup
of $N$ and therefore $gHg^{-1} = H$. This means that $g$ normalises $H$
in $G$, so $g \in N_G(H) = N$. Hence $N_G(N) \leq N$, and thus
$N_G(N) = N_G(N_G(H)) = N_G(H)$ as claimed.
We now have the necessary ingredients to prove our result about finite
nilpotent groups.
Proof of Proposition 7.68 Suppose that $G$ is the direct product of its
Sylow subgroups. Each Sylow subgroup is a $p$-group for some prime
$p$, and hence nilpotent by Proposition 7.67. The direct product of
finitely many nilpotent groups is nilpotent by Proposition 7.63 and
Corollary 7.64, so $G$ is nilpotent.
To prove the converse, suppose that $G$ is finite and nilpotent, and that
$H$ is a Sylow $p$-subgroup of $G$ for some prime $p$. Let $N = N_G(H)$ be
the normaliser of $H$ in $G$. By Lemma 7.71 we know that $N_G(N) = N$;
that is, $N$ is self-normalising in $G$. By Corollary 7.70, $N = N_G(H) = G$,
and therefore $H \trianglelefteq G$.
Now suppose that $|G| = p_1^{n_1} \cdots p_k^{n_k}$ for some positive integer $k$,
where $p_1, \ldots, p_k$ are distinct primes, and $n_1, \ldots, n_k \in \mathbb{N}$.
By Sylow's First Theorem (Theorem 7.4, page 242) we know that $G$ has a
Sylow $p_i$-subgroup $H_i$ of order $p_i^{n_i}$ for $1 \leq i \leq k$, and we
have just seen that all of these are normal in $G$.
We want to show that $H_1 \cdots H_k = H_1 \times \cdots \times H_k$, and we
prove this by induction, showing that $H_1 \cdots H_i = H_1 \times \cdots \times H_i$
for $1 \leq i \leq k$. The base case $i = 1$ is clearly true; suppose that the
statement is true for some $i \geq 1$.
We now consider $(H_1 \times \cdots \times H_i) \cap H_{i+1}$. The first part of
this has order $|H_1 \times \cdots \times H_i| = p_1^{n_1} \cdots p_i^{n_i}$, while
the second has order $|H_{i+1}| = p_{i+1}^{n_{i+1}}$. By Proposition 2.34, the
order of any element of $H_1 \times \cdots \times H_i$ must divide
$p_1^{n_1} \cdots p_i^{n_i}$, and the order of any element of $H_{i+1}$ must
divide $p_{i+1}^{n_{i+1}}$. These are coprime, so any element in
$(H_1 \times \cdots \times H_i) \cap H_{i+1}$ must have order 1, and therefore by
Proposition 1.19, $(H_1 \times \cdots \times H_i) \cap H_{i+1} = \{e\}$. By
Corollary 3.23, $(H_1 \times \cdots \times H_i)H_{i+1}$ must be the internal
direct product of $H_1, \ldots, H_{i+1}$ and hence the inductive step holds.
In particular, setting $i = k$ we obtain a subgroup
$H_1 \cdots H_k = H_1 \times \cdots \times H_k$ of order
$p_1^{n_1} \cdots p_k^{n_k} = |G|$, so $G = H_1 \cdots H_k = H_1 \times \cdots \times H_k$
as claimed.

In the middle of this proof we also derived the following useful fact:
Corollary 7.72 A finite group G is nilpotent if and only if all of its Sylow
subgroups are normal.
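As a quick illustration of Corollary 7.72, the following Python sketch tests normality of the Sylow subgroups of $S_3$ by brute force. It is only a sanity check under stated assumptions: permutations are stored as tuples acting on the points 0, 1, 2, and the helper names `compose`, `inverse` and `is_normal` are illustrative choices rather than notation from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(subgroup, group):
    """Check that g h g^-1 lies in subgroup for all g in group, h in subgroup."""
    members = set(subgroup)
    return all(
        compose(compose(g, h), inverse(g)) in members
        for g in group
        for h in members
    )

S3 = list(permutations(range(3)))
sylow_2 = [(0, 1, 2), (1, 0, 2)]             # identity and one transposition
sylow_3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # the rotation subgroup A3

print(is_normal(sylow_3, S3))  # True
print(is_normal(sylow_2, S3))  # False, so S3 is not nilpotent
```

The Sylow 2-subgroup fails the test, which matches the fact that $S_3$ is not nilpotent.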
We will now briefly look at a third class of groups defined in terms
of subgroup series. By Definition 7.49, a group $G$ is nilpotent if
there exists a finite-length central series stretching between $\{e\}$ and $G$.
Recall that for a series to be central, we require $G_{i+1}/G_i \leq Z(G/G_i)$,
and also that the series is normal; that is, $G_i \trianglelefteq G$ for all $i$.
More generally, a subnormal series is abelian if each quotient $G_{i+1}/G_i$
is abelian (but not necessarily contained in the centre of $G/G_i$); a
group has a finite-length abelian series if and only if it is soluble.
Between these two, we can consider the case where $G$ has a finite-length
normal series where the quotients $G_{i+1}/G_i$ are cyclic.
Definition 7.73 A group $G$ is supersoluble or supersolvable if
there exists a finite-length normal series
\[ \{e\} = G_0 < G_1 < \cdots < G_n = G \]
such that each quotient $G_{i+1}/G_i$ is cyclic.

Since every normal series is a subnormal series, a supersoluble group
has a finite-length subnormal series with cyclic factors. This can be
refined to a finite-length subnormal series whose factors are cyclic
groups of prime order, and hence every supersoluble group is soluble.
There exist soluble groups that aren't supersoluble, however:
Example 7.74 The symmetric group $S_4$ is soluble by the existence
of the subnormal series
\[ \{\iota\} \triangleleft \langle (1\,2)(3\,4) \rangle \triangleleft \langle (1\,2)(3\,4), (1\,3)(2\,4) \rangle \triangleleft A_4 \triangleleft S_4, \]
the factors of which are isomorphic to $\mathbb{Z}_2$, $\mathbb{Z}_2$, $\mathbb{Z}_3$ and $\mathbb{Z}_2$.
However, this series is not normal since $\langle (1\,2)(3\,4) \rangle$ is not
normal in $S_4$. Moreover, there is no normal series for $S_4$ with cyclic
factors, since the only proper normal subgroups of $S_4$ are $\{\iota\}$,
$\langle (1\,2)(3\,4), (1\,3)(2\,4) \rangle = V_4$ and $A_4$: the first nontrivial
term of a normal series would have to be $V_4$, $A_4$ or $S_4$, none of which
is cyclic.
The alternating group $A_4$ is also soluble but not supersoluble.
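The difference between a subnormal and a normal series in Example 7.74 can be checked by brute force. The sketch below is an illustration only: it relabels the points 1, 2, 3, 4 as 0, 1, 2, 3, stores permutations as tuples, and uses hypothetical helper names (`compose`, `inverse`, `is_normal`).

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(subgroup, group):
    """Check that g h g^-1 lies in subgroup for all g in group, h in subgroup."""
    members = set(subgroup)
    return all(
        compose(compose(g, h), inverse(g)) in members
        for g in group
        for h in members
    )

S4 = list(permutations(range(4)))
e = (0, 1, 2, 3)
a = (1, 0, 3, 2)    # (1 2)(3 4), written on the points 0..3
b = (2, 3, 0, 1)    # (1 3)(2 4)
c = compose(a, b)   # (1 4)(2 3)
V4 = [e, a, b, c]

print(is_normal([e, a], V4))  # True: each step of the series is normal in the next
print(is_normal(V4, S4))      # True: V4 is normal in S4
print(is_normal([e, a], S4))  # False: the series is subnormal but not normal
```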

Results analogous to Propositions 7.38, 7.39, 7.40 and Corollary 7.41
hold for supersoluble groups:
Proposition 7.75 If $G$ is a supersoluble group, and $H < G$ is a subgroup
of $G$, then $H$ is supersoluble.

Proof The proof is very similar to that of Proposition 7.38, together
with the observation that if $H_{i+1}/H_i < G_{i+1}/G_i$ and $G_{i+1}/G_i$ is
cyclic then so is $H_{i+1}/H_i$ by Proposition 2.8.
Proposition 7.76 If $G$ is a supersoluble group and $N \trianglelefteq G$ is
normal in $G$, then the quotient $G/N$ is also supersoluble.

Proof Again, the result follows by an almost identical argument to
that used in the proof of Proposition 7.39.
Proposition 7.77 If $G$ and $H$ are supersoluble groups, then so is their
direct product $G \times H$.

Proof Suppose that $G$ has a cyclic normal series
\[ \{e\} = G_0 < G_1 < \cdots < G_m = G \]
and that $H$ has a cyclic normal series
\[ \{e\} = H_0 < H_1 < \cdots < H_n = H. \]
Then for $0 \leq i < m$ we have $(G_{i+1} \times H_0)/(G_i \times H_0) \cong G_{i+1}/G_i$,
which is cyclic since $G$ is supersoluble. Similarly, for $0 \leq i < n$, it
follows that $(G_m \times H_{i+1})/(G_m \times H_i) \cong H_{i+1}/H_i$ is also cyclic
since $H$ is supersoluble. Each $G_i \times H_j$ is normal in $G \times H$, because
$G_i \trianglelefteq G$ and $H_j \trianglelefteq H$. Therefore the series
\[ \{e\} = G_0 \times H_0 < G_1 \times H_0 < \cdots < G_m \times H_0
   < G_m \times H_1 < \cdots < G_m \times H_n = G \times H \]
is a cyclic normal series for $G \times H$, and hence the direct product
$G \times H$ is supersoluble.
By induction it follows that a direct product of finitely many
supersoluble groups is also supersoluble.
The analogue of Proposition 7.40 doesn't hold in general: the Klein
group $V_4 = \langle (1\,2)(3\,4), (1\,3)(2\,4) \rangle = N$ is supersoluble and the
quotient $A_4/N \cong \mathbb{Z}_3$ is supersoluble as well, but $A_4$ isn't.
However, a slightly weaker result does hold:
Proposition 7.78 Let $N \triangleleft G$ be a normal subgroup of some group $G$. If
$N$ is cyclic and $G/N$ is supersoluble, then $G$ is supersoluble.

Proof If $G/N$ is supersoluble, then there exists a cyclic normal series
\[ \{e\} = H_1 < H_2 < \cdots < H_n = G/N. \]
Each group $H_i$, for $1 \leq i \leq n$, can be expressed as a quotient $G_i/N$
for some subgroup $G_i \trianglelefteq G$ containing $N$, so our cyclic normal
series can be rewritten
\[ N/N = G_1/N < G_2/N < \cdots < G_n/N = G/N, \]
which we can then use to form a new series
\[ \{e\} = G_0 < G_1 < \cdots < G_n = G, \]
by inserting the trivial subgroup $G_0 = \{e\}$ at the beginning. The first
factor $G_1/G_0 = N$ is cyclic by hypothesis. Each remaining factor
$G_{i+1}/G_i$, for $1 \leq i < n$, is cyclic, since each factor $H_{i+1}/H_i$
is cyclic, and $H_{i+1}/H_i = (G_{i+1}/N)/(G_i/N) \cong G_{i+1}/G_i$ by the Third
Isomorphism Theorem (Theorem 4.65, page 130). Therefore $G$ also has a
cyclic normal series and is thus supersoluble.
We now have a hierarchy of classes of finite groups:

\[ \text{trivial} \subset \text{cyclic} \subset \text{abelian} \subset \text{nilpotent} \subset \text{supersoluble} \subset \text{soluble} \]

7.4 Semidirect products

If you have built castles in the air, your
work need not be lost; that is where
they should be. Now put the foundations
under them.
Henry Thoreau (1817–1862),
Walden (1854) 346

The internal direct product enables us to decompose a group as
the direct product of two normal subgroups. More precisely, if $H$ and
$K$ are both normal subgroups of a group $G$ with trivial intersection
$H \cap K = \{e\}$, and if $HK = G$, then $G$ is isomorphic to the direct product
$H \times K$. Also, we have $G/H \cong K$ and $G/K \cong H$. But what happens if
we only require one of $H$ and $K$ to be normal?
Suppose that $H \trianglelefteq G$ and $K \leq G$ with $H \cap K = \{e\}$ and $HK = G$.
What groups $G$ satisfy this set of conditions for two given subgroups $H$
and $K$? Equivalently, given two groups $H$ and $K$, what groups $G$ exist such
that $H \trianglelefteq G$ and $K \leq G$ with $H \cap K = \{e\}$ and $G = HK$?
For example, suppose $H = \mathbb{Z}_3$ and $K = \mathbb{Z}_2$. Up to isomorphism there
are two ways of combining these to form a new group satisfying the
above properties. The obvious one is the direct sum
$\mathbb{Z}_3 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_6$. Here $H = \{0, 2, 4\}$ and $K = \{0, 3\}$.
Then $H \cap K = \{0\}$ and $H + K = \mathbb{Z}_6$. Furthermore, $H \trianglelefteq \mathbb{Z}_6$
and $K \leq \mathbb{Z}_6$ (actually $K \trianglelefteq \mathbb{Z}_6$ as well, but we don't
mind), and $\mathbb{Z}_6/H \cong \mathbb{Z}_2 \cong K$.
The other, slightly less obvious way yields $D_3 \cong S_3$. Set
$H = R_3 = \{e, r, r^2\}$ and $K = \{e, m_1\}$. Then $H \trianglelefteq D_3$ and
$K \leq D_3$ with $H \cap K = \{e\}$ and $HK = D_3$. Furthermore,
$D_3/H \cong \mathbb{Z}_2 \cong K$.
To study this idea in more detail, we introduce the following definition:
Definition 7.79 Let $G$ be a group, and suppose that $H \trianglelefteq G$ is normal
in $G$ and $K \leq G$ is a subgroup of $G$. If $H \cap K = \{e\}$ and $G = HK$ then
we say that $G$ is an (internal) semidirect product of $H$ by $K$.

We'll meet the external semidirect product shortly.


For a direct product $G = H \times K$, the case where both $H$ and $K$ are
normal in $G$, we know by Proposition 3.22 that the elements of $H$
commute with those of $K$; that is, for any $h \in H$ and $k \in K$, we have
$hk = kh$. Or, equivalently, that $khk^{-1} = h$. What this means is that the
conjugation action (Definition 6.11, page 217) of $K$ on $H$ is trivial.
Applying this insight to the more general case of a semidirect product
$G = HK$, we find that $K$ acts by conjugation on $H$. Since $H$ is a normal
subgroup, it is closed under conjugation by any element of $G$, and
therefore also by any element of $K \leq G$. The next proposition confirms
that this is a well-defined group action of $K$ on $H$.
Proposition 7.80 Suppose that a group $G$ can be decomposed as a
semidirect product $G = HK$, with $H \trianglelefteq G$. Then $K$ acts on $H$ by
conjugation.

Proof For any $h \in H$ and $k \in K$ we define $k \cdot h = khk^{-1}$. This lies
in $H$ because $H$ is normal and hence closed under conjugation by any
element of $G$, and in particular by any element of $K \leq G$.
Clearly $e \cdot h = ehe^{-1} = h$, so the identity element $e \in K$ acts trivially
on $H$. Also, for any $k_1, k_2 \in K$ we have
\[ k_1 \cdot (k_2 \cdot h) = k_1 \cdot (k_2 h k_2^{-1}) = k_1 k_2 h k_2^{-1} k_1^{-1}
   = (k_1 k_2) h (k_1 k_2)^{-1} = (k_1 k_2) \cdot h, \]
and hence this is a well-defined action of $K$ on $H$.


Applying this to the case of $D_3$, with $H = R_3 = \{e, r, r^2\}$ and
$K = \{e, m_1\}$, and considering the conjugation action of $K$ on $H$ (see
Table 7.2), we find that the non-identity element $m_1$ of $K$ maps
$h \mapsto h^{-1}$ for any $h \in H$. That is, $m_1 \cdot h = m_1 h m_1^{-1} = h^{-1}$
for all $h \in H$, while $e \cdot h = h$.
We can use this to construct a presentation for $HK$:
\[ \langle r, m_1 : r^3 = e,\ m_1^2 = e,\ m_1 r m_1^{-1} = r^{-1} \rangle \]
This presentation is the $n = 3$ case of the presentation for $D_n$ obtained
in Proposition 5.28.
There are only two ways that $\mathbb{Z}_2$ can act on $\mathbb{Z}_3$: this one, and the
trivial action, which corresponds to the direct sum $\mathbb{Z}_3 \oplus \mathbb{Z}_2$.

Table 7.2: Conjugation in $D_3$ (the entry in row $x$ and column $g$ is $gxg^{-1}$)

          e      r      r^2    m_1    m_2    m_3
   e      e      e      e      e      e      e
   r      r      r      r      r^2    r^2    r^2
   r^2    r^2    r^2    r^2    r      r      r
   m_1    m_1    m_2    m_3    m_1    m_3    m_2
   m_2    m_2    m_3    m_1    m_3    m_2    m_1
   m_3    m_3    m_1    m_2    m_2    m_1    m_3

So, if a group G decomposes as an internal semidirect product G =
HK, then as part of this we automatically get an action of K on H.
Conversely, two groups H and K, together with a specified action of K
on H, can be combined to form a unique group G = HK:
Proposition 7.81 Let $H$ and $K$ be groups, with $K$ acting on $H$ via an
action $\phi : K \to \mathrm{Aut}(H)$. The multiplication operation
\[ (h_1, k_1) \ast (h_2, k_2) = (h_1 \phi_{k_1}(h_2), k_1 k_2) = (h_1 (k_1 \cdot h_2), k_1 k_2) \]
defines a group structure on the Cartesian product $H \times K$. The resulting
group $G = H \rtimes_\phi K$ decomposes as an internal semidirect product of
$H \times \{e_K\}$ by $\{e_H\} \times K$. The $(\{e_H\} \times K)$-action on
$H \times \{e_K\}$ is determined by $\phi$.

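Proof To see that the multiplication operation is associative, suppose
$h_1, h_2, h_3 \in H$ and $k_1, k_2, k_3 \in K$. Then
\begin{align*}
((h_1, k_1) \ast (h_2, k_2)) \ast (h_3, k_3)
  &= (h_1 \phi_{k_1}(h_2), k_1 k_2) \ast (h_3, k_3) \\
  &= (h_1 \phi_{k_1}(h_2) \phi_{k_1 k_2}(h_3), k_1 k_2 k_3) \\
  &= (h_1 \phi_{k_1}(h_2 \phi_{k_2}(h_3)), k_1 k_2 k_3) \\
  &= (h_1, k_1) \ast (h_2 \phi_{k_2}(h_3), k_2 k_3) \\
  &= (h_1, k_1) \ast ((h_2, k_2) \ast (h_3, k_3)).
\end{align*}
The identity element is $(e_H, e_K)$ since, for any $h \in H$ and $k \in K$,
\[ (e_H, e_K) \ast (h, k) = (e_H \phi_{e_K}(h), e_K k) = (h, k) \]
and
\[ (h, k) \ast (e_H, e_K) = (h \phi_k(e_H), k e_K) = (h, k). \]
The inverse of $(h, k)$ is $(\phi_{k^{-1}}(h^{-1}), k^{-1})$, since
\begin{align*}
(h, k) \ast (\phi_{k^{-1}}(h^{-1}), k^{-1})
  &= (h \phi_k(\phi_{k^{-1}}(h^{-1})), k k^{-1}) \\
  &= (h \phi_{k k^{-1}}(h^{-1}), e_K) = (h \phi_{e_K}(h^{-1}), e_K) \\
  &= (h h^{-1}, e_K) = (e_H, e_K)
\end{align*}
and
\begin{align*}
(\phi_{k^{-1}}(h^{-1}), k^{-1}) \ast (h, k)
  &= (\phi_{k^{-1}}(h^{-1}) \phi_{k^{-1}}(h), k^{-1} k) \\
  &= (\phi_{k^{-1}}(h^{-1} h), e_K) = (\phi_{k^{-1}}(e_H), e_K) = (e_H, e_K).
\end{align*}
Therefore the set $H \times K$ forms a group $G$ with this multiplication
operation.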
There are injective maps $f_H : H \hookrightarrow G$ and $f_K : K \hookrightarrow G$
defined by $f_H(h) = (h, e_K)$ and $f_K(k) = (e_H, k)$. These are
homomorphisms, since
\[ f_H(h_1 h_2) = (h_1 h_2, e_K) = (h_1, e_K) \ast (h_2, e_K) = f_H(h_1) \ast f_H(h_2) \]
and
\[ f_K(k_1 k_2) = (e_H, k_1 k_2) = (e_H, k_1) \ast (e_H, k_2) = f_K(k_1) \ast f_K(k_2), \]
so $\mathrm{im}(f_H) = H \times \{e_K\}$ and $\mathrm{im}(f_K) = \{e_H\} \times K$ are
subgroups.
Furthermore, $\mathrm{im}(f_H)$ is normal in $G$, since for any $h_1, h_2 \in H$ and
$k \in K$ we have
\begin{align*}
(h_2, k) \ast (h_1, e_K) \ast (h_2, k)^{-1}
  &= (h_2, k) \ast (h_1, e_K) \ast (\phi_{k^{-1}}(h_2^{-1}), k^{-1}) \\
  &= (h_2 \phi_k(h_1), k) \ast (\phi_{k^{-1}}(h_2^{-1}), k^{-1}) \\
  &= (h_2 \phi_k(h_1) \phi_k(\phi_{k^{-1}}(h_2^{-1})), k k^{-1}) \\
  &= (h_2 \phi_k(h_1) h_2^{-1}, e_K) \in H \times \{e_K\},
\end{align*}
and so $\mathrm{im}(f_H)$ is closed under conjugation by any element of $G$.
Clearly $(H \times \{e_K\}) \cap (\{e_H\} \times K) = \{(e_H, e_K)\}$ and
$(H \times \{e_K\})(\{e_H\} \times K) = G$, so $G$ decomposes as an internal
semidirect product, as claimed.
Finally, given $h \in H$ and $k \in K$,
\begin{align*}
(e_H, k) \ast (h, e_K) \ast (e_H, k)^{-1}
  &= (e_H, k) \ast (h, e_K) \ast (e_H, k^{-1}) \\
  &= (\phi_k(h), k k^{-1}) = (\phi_k(h), e_K),
\end{align*}
and hence the conjugation action of $\{e_H\} \times K$ on $H \times \{e_K\}$ is
determined completely by $\phi$, the $K$-action on $H$.
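The construction in Proposition 7.81 is easy to experiment with on small groups. The following Python sketch builds the multiplication $(h_1, k_1)(h_2, k_2) = (h_1 (k_1 \cdot h_2), k_1 k_2)$ for $\mathbb{Z}_3$ and $\mathbb{Z}_2$ under both the trivial action and the inversion action. The function names (`semidirect_product`, `is_associative`, `is_abelian`) are illustrative assumptions, and the brute-force checks are only practical for very small groups.

```python
from itertools import product

def semidirect_product(H, K, h_mul, k_mul, act):
    """Return (elements, multiply) for the set H x K with the twisted product."""
    elements = list(product(H, K))

    def multiply(x, y):
        (h1, k1), (h2, k2) = x, y
        return (h_mul(h1, act(k1, h2)), k_mul(k1, k2))

    return elements, multiply

def is_associative(elements, multiply):
    return all(
        multiply(multiply(x, y), z) == multiply(x, multiply(y, z))
        for x in elements for y in elements for z in elements
    )

def is_abelian(elements, multiply):
    return all(multiply(x, y) == multiply(y, x) for x in elements for y in elements)

Z3, Z2 = range(3), range(2)
add_mod3 = lambda a, b: (a + b) % 3
add_mod2 = lambda a, b: (a + b) % 2
trivial = lambda k, h: h                          # every k acts as the identity
inversion = lambda k, h: h if k == 0 else (-h) % 3  # 1 acts by h -> -h

direct, mul_direct = semidirect_product(Z3, Z2, add_mod3, add_mod2, trivial)
twisted, mul_twisted = semidirect_product(Z3, Z2, add_mod3, add_mod2, inversion)

print(is_associative(twisted, mul_twisted))  # True
print(is_abelian(direct, mul_direct))        # True: this is Z3 + Z2
print(is_abelian(twisted, mul_twisted))      # False: a non-abelian group of order 6
```

Since the only non-abelian group of order 6 is $D_3 \cong S_3$, the twisted product is a copy of that group.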
Definition 7.82 Let $H$ and $K$ be groups, and $\phi : K \to \mathrm{Aut}(H)$ be
an action of $K$ on $H$. Then the group $G$ from Proposition 7.81 is
called the (external) semidirect product of $H$ by $K$, and is denoted
$H \rtimes_\phi K$, or sometimes $H \rtimes K$. We say that $H \rtimes_\phi K$ realises
the action $\phi$.
(Think of the symbol $\rtimes$ as a hybrid of $\times$ and $\triangleleft$: the notation
$G = H \rtimes K$ tells us that $H$ is the normal subgroup. Similarly, an
expression of the form $G = H \ltimes K$ denotes that $K$ is normal.)
Example 7.83 The dihedral group $D_n$ can be constructed as a
semidirect product $\mathbb{Z}_n \rtimes \mathbb{Z}_2$. Two $\mathbb{Z}_2$-actions we can impose
on $\mathbb{Z}_n$ are the trivial action, and the action $\phi : \mathbb{Z}_2 \to \mathrm{Aut}(\mathbb{Z}_n)$
defined by $\phi_0(i) = i$ and $\phi_1(i) = -i \pmod{n}$ for $0 \leq i < n$. The
trivial action gives
\[ \langle x, y : x^n = 1,\ y^2 = 1,\ yxy^{-1} = x \rangle \]
which is isomorphic to $\mathbb{Z}_n \oplus \mathbb{Z}_2$ by Proposition 5.22. The nontrivial
action yields the presentation
\[ \langle x, y : x^n = 1,\ y^2 = 1,\ yxy^{-1} = x^{-1} \rangle \]
which is isomorphic to $D_n$ by Proposition 5.28.

Example 7.84 The symmetric group $S_n$ can be constructed as a
semidirect product $A_n \rtimes \mathbb{Z}_2$. Recall that $A_n \triangleleft S_n$ is the
normal subgroup of even permutations on $n$ objects. Represent $\mathbb{Z}_2$ as
the order-2 subgroup $K = \langle (1\,2) \rangle = \{\iota, (1\,2)\}$ in $S_n$. Clearly
$A_n K = S_n$, since composing all the even permutations in $A_n$ with
$(1\,2)$ yields all the odd permutations in $S_n \setminus A_n$. Furthermore,
$A_n \cap K = \{\iota\}$, so $S_n$ is the internal semidirect product of $A_n$
with $K$.
We can form $S_n$ as the external semidirect product $A_n \rtimes_\phi \mathbb{Z}_2$ via
the action $\phi : \mathbb{Z}_2 \to \mathrm{Aut}(A_n)$ given by $\phi_0(\sigma) = \sigma$ and
$\phi_1(\sigma) = (1\,2)\sigma(1\,2)$ for any $\sigma \in A_n$.

Example 7.85 Let $\mathbb{R}^n$ denote the additive group $(\mathbb{R}^n, +)$; that is,
the set of ordered $n$-tuples of real numbers, or $n$-component real
vectors. The group $GL_n(\mathbb{R})$ acts on $\mathbb{R}^n$ by matrix multiplication, and
hence we can form the semidirect product $\mathbb{R}^n \rtimes GL_n(\mathbb{R})$.
The underlying set of this group is $\mathbb{R}^n \times GL_n(\mathbb{R})$: the set of ordered
pairs $(u, A)$, with $u$ a vector in $\mathbb{R}^n$ and $A$ a nonsingular $n \times n$ matrix
in $GL_n(\mathbb{R})$. The product of two elements $(u, A)$ and $(v, B)$ is
\[ (u, A) \ast (v, B) = (u + Av, AB). \]
This is isomorphic to the affine general linear group $AGL_n(\mathbb{R})$ of
transformations $w \mapsto Aw + u$ for $A \in GL_n(\mathbb{R})$ and $u, w \in \mathbb{R}^n$.
To see this, suppose that $u, v, w \in \mathbb{R}^n$ and $A, B \in GL_n(\mathbb{R})$. Then
$(v, B)$ acts on $w$ by $w \mapsto Bw + v$ and $(u, A)$ acts on $w$ by
$w \mapsto Aw + u$. By composition, we find that
\[ ((u, A)(v, B))w = (u, A)(Bw + v) = ABw + Av + u = (Av + u, AB)w \]
and hence in $AGL_n(\mathbb{R})$ the multiplication operation is given by
\[ (u, A)(v, B) = (u + Av, AB) \]
which is exactly that given by the semidirect product construction.
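The agreement between composition of affine maps and the semidirect product rule can be spot-checked numerically. The sketch below uses plain tuples for vectors and matrices with integer entries so that the comparison is exact; the helper names (`mat_vec`, `mat_mul`, `vec_add`, `multiply`, `apply`) and the sample values are illustrative assumptions.

```python
def mat_vec(A, v):
    """Matrix-vector product A v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def mat_mul(A, B):
    """Matrix product A B."""
    columns = list(zip(*B))
    return tuple(
        tuple(sum(a * b for a, b in zip(row, col)) for col in columns)
        for row in A
    )

def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def multiply(x, y):
    """Semidirect product rule (u, A)(v, B) = (u + Av, AB)."""
    (u, A), (v, B) = x, y
    return (vec_add(u, mat_vec(A, v)), mat_mul(A, B))

def apply(x, w):
    """Affine action of (u, A) on w, namely w -> Aw + u."""
    u, A = x
    return vec_add(mat_vec(A, w), u)

x = ((3, -1), ((1, 2), (0, 1)))   # u = (3, -1), A has determinant 1
y = ((2, 5), ((0, -1), (1, 0)))   # v = (2, 5),  B has determinant 1
w = (4, 7)

# Acting by y and then by x agrees with acting by the product x * y.
print(apply(x, apply(y, w)) == apply(multiply(x, y), w))  # True
```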

An interesting case arises when we take the semidirect product of a
group with its full automorphism group.
Example 7.86 Let $G$ be a group, and let its automorphism group
$\mathrm{Aut}(G)$ act on it in the obvious way. That is, given an automorphism
$\alpha \in \mathrm{Aut}(G)$, we define $\alpha \cdot g = \alpha(g)$. The semidirect product
$\mathrm{Hol}(G) = G \rtimes \mathrm{Aut}(G)$ is called the holomorph of $G$.
For example, let $G = \mathbb{Z}_3$. The automorphism group
$\mathrm{Aut}(\mathbb{Z}_3) \cong \mathbb{Z}_2$ comprises the identity automorphism and the
inverse map. The holomorph $\mathrm{Hol}(\mathbb{Z}_3) = \mathbb{Z}_3 \rtimes \mathbb{Z}_2$ is
isomorphic to the dihedral group $D_3$, or the symmetric group $S_3$. (This
is the nontrivial case of Example 7.83 where $n = 3$.)

Another important class of semidirect products can be constructed
by applying a permutation action to a direct product of copies of
a chosen group. Let $G$ be a group, and choose $n \in \mathbb{N}$. Then we
can define an action of the symmetric group $S_n$ on the $n$-fold direct
product $G^n = G \times \cdots \times G$, by permuting the coordinates. That is,
given a permutation $\sigma \in S_n$ and an element $(g_1, \ldots, g_n) \in G^n$, define
$\sigma \cdot (g_1, \ldots, g_n)$ to be $(g_{\sigma(1)}, \ldots, g_{\sigma(n)})$.
The semidirect product $G^n \rtimes S_n$ obtained via this action is called the
permutation wreath product of $G$ with $S_n$, and denoted $G \wr S_n$, or
sometimes $G \mathbin{\mathrm{pwr}} S_n$.
More generally, by Cayley's Theorem (Theorem 2.13, page 47) we know
that any group $H$ can be regarded as a group of permutations of its
underlying set. We can use this to define a wreath product $G \wr H$.
Suppose that $|H| = n$ and form the $n$-fold direct product
$G^n = G \times \cdots \times G$. Now define the $H$-action on $G^n$ by using the
permutations obtained from Cayley's Theorem: an element of $G^n$ is an
$n$-tuple of elements of $G$, indexed by elements of $H = \{h_1, \ldots, h_n\}$. For
any element $h \in H$ we define
\[ h \cdot (g_{h_1}, \ldots, g_{h_n}) = (g_{hh_1}, \ldots, g_{hh_n}). \]
The corresponding semidirect product $G^n \rtimes H$ is the regular wreath
product of $G$ by $H$, and denoted $G \mathbin{\mathrm{rwr}} H$, or $G \wr H$ where the
meaning is clear from context.
Example 7.87 The permutation wreath product $\mathbb{Z}_m \wr S_n$ yields the
generalised symmetric group $S_{m,n}$. If $m = 1$ then this is just the
ordinary symmetric group $S_{1,n} = S_n$, while if $m = 2$ then we obtain
the signed symmetric group $S_{2,n}$ or hyperoctahedral group $B_n$, and
if $m = n = 2$ then we get the wreath product $\mathbb{Z}_2 \wr \mathbb{Z}_2$, which is
isomorphic to the dihedral group $D_4$.

Wreath products have a number of applications in the classification of
permutation groups, and in the study of semigroups and automata;
however, these applications would be too much of a digression here.

7.5 Extensions

No power and no treasure can outweigh
the extension of our knowledge.
Democritus (c.460–c.370BC),
in: John Owen (1833–1896),
Evenings with the Skeptics (1881) I:149

We can use the semidirect product to build a wide range of
larger groups from smaller ones (or to describe many larger groups as
internal semidirect products of their subgroups). Is this all we need?
Can we construct or decompose any group in this way?
Unfortunately we can't. For example, $\mathbb{Z}_4$ can't be split as a semidirect
product of two smaller subgroups. We can certainly find a normal
subgroup of order 2, namely $\{0, 2\} \cong \mathbb{Z}_2$, but there isn't another one.
So $\mathbb{Z}_4$ can't be expressed as an internal semidirect product of two
subgroups isomorphic to $\mathbb{Z}_2$. Looking at this as an external semidirect
product $\mathbb{Z}_2 \rtimes \mathbb{Z}_2$, we discover almost immediately that there is only
one $\mathbb{Z}_2$-action on $\mathbb{Z}_2$: the trivial one. Hence
$\mathbb{Z}_2 \rtimes \mathbb{Z}_2 = \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
We therefore need to find some way of further generalising the
semidirect product. The key is to relax the requirement on the second
subgroup even more. With the direct product $G = H \times K$, both $H$
and $K$ embed as normal subgroups of $G$; with a semidirect product
$G = H \rtimes K$, we only require $H$ to embed as a normal subgroup, while
$K$ can be any subgroup satisfying $HK = G$ and $H \cap K = \{e\}$. One
option would be to allow $H$ not to be normal in $G$; alternatively, we
could weaken the requirement that $K$ embed as a subgroup, and only
ask that it form a subset. It is this second approach that we'll adopt.
What we're actually trying to do here is understand group extensions:
Definition 7.88 Suppose that $G$, $H$ and $K$ are groups such that there
exists an inclusion homomorphism $i : H \hookrightarrow G$ with $i(H) \trianglelefteq G$, and
$G/i(H) \cong K$. Then $G$ is an extension of $K$ by $H$. (In some books this
is called an extension of $H$ by $K$.)
Clearly the direct product $G = H \times K$ satisfies this definition, and
the required inclusion is $i_1 : H \hookrightarrow H \times K$, mapping $H$ to the first
coordinate. The (external) semidirect product $H \rtimes K$ also works; again
$H$ is included via $i_1 : H \hookrightarrow H \rtimes K$.
What we need to do is look at how the other group $K$ is included.
With the direct product $H \times K$ and the semidirect product $H \rtimes K$, we
use the inclusion homomorphisms $i_2 : K \hookrightarrow H \times K$ or
$i_2 : K \hookrightarrow H \rtimes K$ mapping $K$ to the second coordinate. If we don't
require the image of $K$ to be a subgroup, then we just need a function
$s : K \hookrightarrow G$ such that $i(H) \cap s(K) = \{e\}$ and $i(H)s(K) = G$.
Definition 7.89 Suppose that $G$ is an extension of a group $K$ by
a group $H$, with inclusion homomorphism $i : H \hookrightarrow G$, and where
$i(H) \trianglelefteq G$ and $G/i(H) \cong K$, with quotient homomorphism
$q : G \twoheadrightarrow K$. Then an inclusion function $s : K \hookrightarrow G$ is called a
lifting or a section if $s(e_K) = e_G$, and if $q \circ s = \mathrm{id}_K$; that is, if
$q(s(k)) = k$ for all $k \in K$.
(Some books use the term section to refer only to the case where $s$ is a
homomorphism rather than just a function defined on the underlying set.)

A section $s : K \hookrightarrow G$ determines an element $s(k)$ of the preimage
$q^{-1}(k)$ for every element $k \in K$. By Proposition 4.38, these preimages
are the cosets of $H = \ker(q)$ in $G$, and so a section $s$ is the same as a
collection of elements of $G$, one from each coset of $H$. We actually met
something just like this earlier: a section is a special sort of
transversal (Definition 5.71, page 187). We can represent this scenario
with the following diagram:
\[ H \xrightarrow{\;i\;} G \underset{s}{\overset{q}{\rightleftarrows}} K \]
Also, a section $s : K \to G$ is injective, and so the image $s(K)$ is a subset
of $G$ whose elements are in bijective correspondence with the elements
of $K$. If $s$ is a homomorphism then $s(K)$ will be a subgroup of $G$
isomorphic to $K$, but if not it will just be some distorted copy of $K$ that
isn't closed under multiplication. Nevertheless, it will still be the case
that $i(H)s(K) = G$ and $i(H) \cap s(K) = \{e_G\}$ as required.


The section $s$ defines a $K$-action on $H$: the images of the elements of $K$
act by conjugation in $G$ on the images of the elements of $H$. That is,
\[ i(k \cdot h) = s(k)\, i(h)\, s(k)^{-1} \tag{7.20} \]
for any $h \in H$ and $k \in K$. We can view this as a commutation rule:
\[ s(k)\, i(h) = i(k \cdot h)\, s(k) \tag{7.21} \]
The semidirect product $G = H \rtimes K$ is the special case where $s$ is a
homomorphism; such extensions are said to be split, and the section $s$
is sometimes called a splitting or splitting homomorphism. Our aim
now is to develop a classification theory for extensions of some group
$K$ by some group $H$. In general, this is quite complicated, so we will
focus on abelian extensions: the case where the kernel $H$ is abelian.
We also want to define a suitable equivalence relation on extensions,
to enable us to classify those extensions that are essentially different.
From now on, suppose $G$ is a group, and $A$ is an abelian group
with a predetermined (and possibly trivial) $G$-action defined on
it. Before we go any further, we introduce the following terminology:
Definition 7.90 Let $G$ be a group. A $G$-module is an abelian group
$A$ equipped with a $G$-action $\phi : G \to \mathrm{Aut}(A)$.

We want to list, up to isomorphism, all groups $E$ such that $E/i(A) \cong G$
for some inclusion homomorphism $i : A \hookrightarrow E$, and where the conjugation
$s(G)$-action on $i(A)$ is determined by the specified $G$-action on
$A$. That is, we want to classify all extensions of the form
\[ A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G \tag{7.22} \]
up to some notion of equivalence.
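The definition of equivalence we need must incorporate isomorphism
of the middle group $E$ in the diagram (7.22), but in such a way that the
inclusion and quotient homomorphisms line up properly:
Definition 7.91 Let $G$ be a group, and let $A$ be a $G$-module. Two
extensions
\[ A \xrightarrow{\;i_1\;} E_1 \xrightarrow{\;q_1\;} G \quad\text{and}\quad A \xrightarrow{\;i_2\;} E_2 \xrightarrow{\;q_2\;} G \]
are equivalent if there exists an isomorphism $f : E_1 \to E_2$ such that
$i_2 = f \circ i_1$ and $q_1 = q_2 \circ f$. That is, if the following diagram
commutes:
\[
\begin{array}{ccccc}
A & \xrightarrow{\;i_1\;} & E_1 & \xrightarrow{\;q_1\;} & G \\
\| & & \downarrow f & & \| \\
A & \xrightarrow{\;i_2\;} & E_2 & \xrightarrow{\;q_2\;} & G
\end{array}
\]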
Suppose, then, that we have an extension
\[ A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G \]
with a section $s : G \to E$, and where $A$ has a $G$-action
$\phi : G \to \mathrm{Aut}(A)$, written $g \cdot a = \phi_g(a)$. Every element $x \in E$
lies in exactly one coset of $i(A)$, so it can be represented as a product
$x = i(a)s(g)$ for a unique $g \in G$ and a unique $a \in A$. Furthermore,
$s(g)$ acts by conjugation on $i(A)$; from now on we will assume that
this $s(G)$-action on $i(A)$ agrees with the predetermined $G$-action $\phi$.
We say that $s$ realises the $G$-action $\phi$.
Now suppose that $g, h \in G$ and consider two arbitrary elements
$x = i(a)s(g)$ and $y = i(b)s(h)$ in $E$. Their product is
\begin{align*}
xy = i(a)s(g)i(b)s(h) &= i(a)s(g)i(b)s(g)^{-1}s(g)s(h) \\
  &= i(a)\,i(g \cdot b)\,s(g)s(h) = i(a + g \cdot b)\,s(g)s(h).
\end{align*}
If $s$ is a homomorphism then $s(g)s(h) = s(gh)$ and this reduces to the
semidirect product $A \rtimes G$; but if not, then it doesn't. We can, however,
find an element $\chi(g, h) \in A$ such that
\[ s(g)s(h) = i(\chi(g, h))\,s(gh), \]
since $s(g)s(h)$ and $s(gh)$ lie in the same coset of $i(A)$.
Doing this for all $g, h \in G$ we get a function $\chi : G \times G \to A$. This
function, called a factor set or cocycle, measures how badly $s$ fails to
be a homomorphism. The individual elements $\chi(g, h)$ can be thought
of as correction factors or error terms, and if $s$ is a homomorphism,
then $\chi(g, h) = 0$ for all $g, h \in G$. This enables us to define a group
structure on the set $A \times G$:
\[ (a, g) \ast (b, h) = (a + g \cdot b + \chi(g, h),\ gh). \tag{7.23} \]
Not just any function $\chi : G \times G \to A$ will do, however: we need (7.23)
to determine a valid group structure on the set $A \times G$.
Proposition 7.92 Let $G$ be a group and $A$ be a $G$-module. Then a
function $\chi : G \times G \to A$ is a factor set for an extension
\[ A \xrightarrow{\;i\;} E \underset{s}{\overset{q}{\rightleftarrows}} G \]
if and only if
\[ \chi(e_G, g) = \chi(g, e_G) = 0 \]
for any $g \in G$, and
\[ \chi(g_1, g_2) + \chi(g_1 g_2, g_3) = g_1 \cdot \chi(g_2, g_3) + \chi(g_1, g_2 g_3) \]
for all $g_1, g_2, g_3 \in G$.

Proof The first of these follows from the fact that $s(e_G) = e_E$. For,
\[ s(e_G g) = s(g) = e_E s(g) = s(e_G)s(g) \]
and
\[ s(g e_G) = s(g) = s(g) e_E = s(g)s(e_G). \]
The second follows from the associativity condition: for $a_1, a_2, a_3 \in A$
and $g_1, g_2, g_3 \in G$ we require that
\[ ((a_1, g_1) \ast (a_2, g_2)) \ast (a_3, g_3) = (a_1, g_1) \ast ((a_2, g_2) \ast (a_3, g_3)). \]
The left hand side yields
\begin{align*}
((a_1, g_1) \ast (a_2, g_2)) \ast (a_3, g_3)
  &= (a_1 + g_1 \cdot a_2 + \chi(g_1, g_2),\ g_1 g_2) \ast (a_3, g_3) \\
  &= (a_1 + g_1 \cdot a_2 + g_1 g_2 \cdot a_3 + \chi(g_1, g_2) + \chi(g_1 g_2, g_3),\ g_1 g_2 g_3)
\end{align*}
while the right hand side gives
\begin{align*}
(a_1, g_1) \ast ((a_2, g_2) \ast (a_3, g_3))
  &= (a_1, g_1) \ast (a_2 + g_2 \cdot a_3 + \chi(g_2, g_3),\ g_2 g_3) \\
  &= (a_1 + g_1 \cdot (a_2 + g_2 \cdot a_3 + \chi(g_2, g_3)) + \chi(g_1, g_2 g_3),\ g_1 g_2 g_3).
\end{align*}
Setting both of these equal, we get
\[ \chi(g_1, g_2) + \chi(g_1 g_2, g_3) = g_1 \cdot \chi(g_2, g_3) + \chi(g_1, g_2 g_3) \]
as claimed.
A factor set $\chi : G \times G \to A$ thus completely determines the group
structure on $E$ for an extension
\[ A \xrightarrow{\;i\;} E \underset{s}{\overset{q}{\rightleftarrows}} G \]
of $G$ by $A$, with a specified section $s : G \to E$. In order to reconstruct
the extension, we therefore need three pieces of information: the group
$G$, the $G$-module $A$ and the factor set $\chi$. From these, we can build
the middle group $E = E[G, A, \chi]$ and recover the homomorphisms $i$
and $q$, by setting $E$ to be the group with underlying set $A \times G$ and the
multiplication operation defined in (7.23). The inclusion homomorphism
$i$ maps any $a \in A$ to the ordered pair $(a, e_G)$ in $E = A \times G$, and
the quotient homomorphism $q$ takes any $(a, g) \in E = A \times G$ to the
corresponding element $g \in G$.
Example 7.93 Let's use this to construct the cyclic group $\mathbb{Z}_4$ as an
extension of $\mathbb{Z}_2$ by $\mathbb{Z}_2$. There is only one $\mathbb{Z}_2$-action we can define
on $\mathbb{Z}_2$, namely the trivial one, so that's the one we'll use.
We want to define a group structure on the set $\mathbb{Z}_2 \times \mathbb{Z}_2$, and to do
this we have to choose an appropriate factor set
$\chi : \mathbb{Z}_2 \times \mathbb{Z}_2 \to \mathbb{Z}_2$.
By the first condition in Proposition 7.92, we know that
\[ \chi(0, 0) = \chi(0, 1) = \chi(1, 0) = 0, \]
and so the only thing left for us to do is to decide the value of $\chi(1, 1)$.
Setting $\chi(1, 1) = 0$ yields a trivial factor set, which corresponds to
the split extension $\mathbb{Z}_2 \oplus \mathbb{Z}_2$.
The other option is to set $\chi(1, 1) = 1$ (check that this satisfies the
second condition in Proposition 7.92). This determines a group
structure isomorphic to $\mathbb{Z}_4$ on the set $\mathbb{Z}_2 \times \mathbb{Z}_2$:

            (0, 0)   (0, 1)   (1, 0)   (1, 1)
   (0, 0)   (0, 0)   (0, 1)   (1, 0)   (1, 1)
   (0, 1)   (0, 1)   (1, 0)   (1, 1)   (0, 0)
   (1, 0)   (1, 0)   (1, 1)   (0, 0)   (0, 1)
   (1, 1)   (1, 1)   (0, 0)   (0, 1)   (1, 0)

This group is cyclic of order 4, with possible generators being either
$(0, 1)$ or $(1, 1)$. The corresponding sections are those for which,
respectively, $1 \mapsto (0, 1)$ or $1 \mapsto (1, 1)$.
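The check suggested in Example 7.93 can be carried out mechanically. The following Python sketch, with illustrative names `chi`, `make_multiply` and `element_order`, verifies the second condition of Proposition 7.92 for the factor set with value 1 at $(1, 1)$ (the action is trivial here), and then computes the order of each element under the resulting multiplication.

```python
from itertools import product

def make_multiply(chi):
    """(a, g) * (b, h) = (a + b + chi(g, h), g + h) on Z2 x Z2, trivial action."""
    def multiply(x, y):
        (a, g), (b, h) = x, y
        return ((a + b + chi(g, h)) % 2, (g + h) % 2)
    return multiply

def element_order(x, multiply, identity=(0, 0)):
    """Smallest n >= 1 with x^n equal to the identity."""
    n, y = 1, x
    while y != identity:
        y = multiply(y, x)
        n += 1
    return n

def chi(g, h):
    """Factor set that is 1 at (1, 1) and 0 elsewhere."""
    return 1 if (g, h) == (1, 1) else 0

multiply = make_multiply(chi)
elements = list(product(range(2), repeat=2))

# Second condition of Proposition 7.92, with trivial action on A = Z2.
cocycle_ok = all(
    (chi(g1, g2) + chi((g1 + g2) % 2, g3)) % 2
    == (chi(g2, g3) + chi(g1, (g2 + g3) % 2)) % 2
    for g1, g2, g3 in product(range(2), repeat=3)
)
orders = {x: element_order(x, multiply) for x in elements}

print(cocycle_ok)  # True
print(orders)      # {(0, 0): 1, (0, 1): 4, (1, 0): 2, (1, 1): 4}
```

The two elements of order 4 are exactly the generators named in the example.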

Another interesting example relates to the carrying operation in
base 10 arithmetic:
Example 7.94 There are two obvious extensions of $\mathbb{Z}_{10}$ by $\mathbb{Z}_{10}$: the
direct sum $\mathbb{Z}_{10} \oplus \mathbb{Z}_{10}$ and the cyclic group $\mathbb{Z}_{100}$. We can regard
both of these as having the same underlying set $\mathbb{Z}_{10} \times \mathbb{Z}_{10}$ with
different addition operations. With the first, the sum $(2, 7) + (3, 4)$
evaluates to $(5, 1)$, while in the second case it evaluates to $(6, 1)$
because $27 + 34 = 61$.
In both cases we set $A = G = \mathbb{Z}_{10}$, and consider $A$ to be a trivial
$\mathbb{Z}_{10}$-module. The semidirect product $A \rtimes G$ thus gives the direct sum
$\mathbb{Z}_{10} \oplus \mathbb{Z}_{10}$ and the split extension
\[ \mathbb{Z}_{10} \xrightarrow{\;i\;} \mathbb{Z}_{10} \oplus \mathbb{Z}_{10} \xrightarrow{\;q\;} \mathbb{Z}_{10}. \]
For the extension
\[ \mathbb{Z}_{10} \xrightarrow{\;i\;} \mathbb{Z}_{100} \xrightarrow{\;q\;} \mathbb{Z}_{10}, \]
however, we need to introduce a nontrivial factor set that encodes
the carrying operation:
\[ \chi(m, n) = \begin{cases} 0 & \text{if } m + n < 10, \\ 1 & \text{if } m + n \geq 10, \end{cases} \]
for all $m, n \in \mathbb{Z}_{10}$. This satisfies the conditions in Proposition 7.92
and is hence a factor set.
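The carrying factor set is simple enough to test exhaustively. In the sketch below a pair (tens, units) stands for an element of $A \times G$, so the first coordinate plays the role of $A$ and the second that of $G$; the names `carry`, `add` and `to_integer` are illustrative assumptions. The code checks the cocycle condition and confirms that the twisted addition agrees with addition modulo 100.

```python
from itertools import product

def carry(m, n):
    """Factor set encoding the carry out of the units digit."""
    return 1 if m + n >= 10 else 0

def add(x, y):
    """(a, g) + (b, h) = (a + b + carry(g, h), g + h), both coordinates mod 10."""
    (a, g), (b, h) = x, y
    return ((a + b + carry(g, h)) % 10, (g + h) % 10)

def to_integer(x):
    """Read the pair (tens, units) as an integer between 0 and 99."""
    tens, units = x
    return 10 * tens + units

digits = range(10)
pairs = list(product(digits, repeat=2))

matches_z100 = all(
    to_integer(add(x, y)) == (to_integer(x) + to_integer(y)) % 100
    for x in pairs for y in pairs
)
cocycle_ok = all(
    (carry(g1, g2) + carry((g1 + g2) % 10, g3)) % 10
    == (carry(g2, g3) + carry(g1, (g2 + g3) % 10)) % 10
    for g1, g2, g3 in product(digits, repeat=3)
)

print(add((2, 7), (3, 4)))       # (6, 1), matching 27 + 34 = 61
print(matches_z100, cocycle_ok)  # True True
```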
What we now need to do is translate the notion of equivalence of
extensions in Definition 7.91 into an equivalence relation on factor sets.
To do this, we'll investigate how the factor set changes if we change
the section $s$. Suppose that
\[ A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G \]
is an extension of a group $G$ by a $G$-module $A$. Let $s_1$ and $s_2$ be
sections of this extension that both realise the specified $G$-action on $A$.
Then $s_1$ determines a factor set $\chi_1$ and $s_2$ yields another factor set
$\chi_2$. For any $g \in G$ there is a unique element $\psi(g) \in A$ such that
\[ s_1(g) = i(\psi(g))\,s_2(g). \]
This determines a function $\psi : G \to A$ with $\psi(e_G) = 0$. Then, for any
$g, h \in G$,
\[ s_1(g)s_1(h) = i(\psi(g))\,s_2(g)\,i(\psi(h))\,s_2(h), \]
and also
\begin{align*}
s_1(g)s_1(h) &= i(\chi_1(g, h))\,s_1(gh) \\
  &= i(\chi_1(g, h))\,i(\psi(gh))\,s_2(gh) \\
  &= i(\psi(gh) + \chi_1(g, h))\,s_2(gh).
\end{align*}
Hence, in terms of the group structure on $A \times G$ determined by $s_2$
and $\chi_2$,
\[ (\psi(g), g) \ast (\psi(h), h) = (\psi(gh) + \chi_1(g, h),\ gh). \]
The left hand side of this gives
\[ (\psi(g), g) \ast (\psi(h), h) = (\psi(g) + g \cdot \psi(h) + \chi_2(g, h),\ gh). \]
Equating these, we get
\[ \psi(gh) + \chi_1(g, h) = \psi(g) + g \cdot \psi(h) + \chi_2(g, h) \]
and hence
\[ \chi_1(g, h) - \chi_2(g, h) = \psi(g) + g \cdot \psi(h) - \psi(gh). \tag{7.24} \]
What this means is that different sections of the same group extension
yield factor sets that differ by an expression of the form given in (7.24),
which depends on a special sort of function:
Definition 7.95 Let $G$ be a group, and $A$ a $G$-module. A coboundary
or inner factor set is a function $\beta : G \times G \to A$ where
\[ \beta(g, h) = \psi(g) - \psi(gh) + g \cdot \psi(h) \]
for all $g, h \in G$, and some function $\psi : G \to A$ satisfying $\psi(e_G) = 0$.
Two factor sets $\chi_1$ and $\chi_2$ are equivalent or cohomologous if their
difference is an inner factor set $\beta$.
It is relatively straightforward to verify that an inner factor set satisfies
the criteria in Proposition 7.92 and is hence a factor set.
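The verification mentioned here can be spot-checked by computer on a small example with a nontrivial action. The sketch below assumes, purely for illustration, that $G = \mathbb{Z}_4$ acts on $A = \mathbb{Z}_5$ by $g \cdot a = 2^g a \pmod 5$ (a genuine action, since $2^4 \equiv 1 \pmod 5$), and tests both conditions of Proposition 7.92 for the coboundary built from every function $\psi$ with $\psi(0) = 0$. The names `act`, `coboundary` and `is_factor_set` are hypothetical.

```python
from itertools import product

N, M = 4, 5            # G = Z_4 (written additively), A = Z_5
G = range(N)

def act(g, a):
    """Action of g on a: multiplication by 2^g modulo 5."""
    return (pow(2, g, M) * a) % M

def coboundary(psi):
    """Return beta(g, h) = psi(g) - psi(g + h) + g . psi(h)."""
    def beta(g, h):
        return (psi[g] - psi[(g + h) % N] + act(g, psi[h])) % M
    return beta

def is_factor_set(chi):
    """Test both conditions of Proposition 7.92 by brute force."""
    normalised = all(chi(0, g) == 0 and chi(g, 0) == 0 for g in G)
    cocycle = all(
        (chi(g1, g2) + chi((g1 + g2) % N, g3)) % M
        == (act(g1, chi(g2, g3)) + chi(g1, (g2 + g3) % N)) % M
        for g1, g2, g3 in product(G, repeat=3)
    )
    return normalised and cocycle

# Every function psi : G -> A with psi(0) = 0, stored as a tuple of values.
all_psi = [(0,) + values for values in product(range(M), repeat=N - 1)]
results = [is_factor_set(coboundary(psi)) for psi in all_psi]

print(len(all_psi), all(results))  # 125 True
```

This is a finite check on one example, not a proof, but it mirrors the routine algebraic verification term by term.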
Now, given a group $G$ and $G$-module $A$, we can form the set $FS(G, A)$
of factor sets $\chi : G \times G \to A$, and the subset $IFS(G, A)$ of inner factor
sets. More interestingly, these sets are closed under a fairly
straightforward addition operation and thereby form abelian groups:
Proposition 7.96 Let $G$ be a group and $A$ a $G$-module. Suppose that
$\chi_1, \chi_2 \in FS(G, A)$ are factor sets. Then the function $\chi_1 + \chi_2$ given by
$(\chi_1 + \chi_2)(g, h) = \chi_1(g, h) + \chi_2(g, h)$ is also a factor set. Furthermore,
if $\beta_1, \beta_2 \in IFS(G, A)$ then $\beta_1 + \beta_2 \in IFS(G, A)$ too.

Proof To see that $FS(G, A)$ and $IFS(G, A)$ are closed under this
pointwise addition operation, we must verify that $\chi_1 + \chi_2$ satisfies the
conditions in Proposition 7.92, and that $\beta_1 + \beta_2$ satisfies Definition 7.95.
Both of these tasks are entirely routine, and we'll omit the details here,
although the interested reader is encouraged to check them.
Let's summarise where we are at the moment. For any extension
\[ A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G \]
of a group $G$ by a $G$-module $A$, we can choose one or more sections
$s : G \to E$ that realise the specified $G$-module structure on $A$. Each
of these sections determines a factor set $\chi : G \times G \to A$, and if we
throw away $s$ and just keep $G$, $A$ and $\chi$, we can reconstruct $E$ in a
canonical way. If $s$ is a homomorphism then $\chi = 0$ and $E = A \rtimes G$,
and in this case we have a split extension. Different sections of the
same extension yield factor sets whose difference is an inner factor
set. Furthermore, the sum of two factor sets is also a factor set, and the
sum of two inner factor sets is also an inner factor set; this enables us
to define groups $FS(G, A)$ and $IFS(G, A)$. The quotient
$\mathrm{Ext}(G, A) = FS(G, A)/IFS(G, A)$ consists of equivalence classes of factor
sets that differ only by an inner factor set.
(Some books, in particular those on homological algebra, call factor sets
cocycles and inner factor sets coboundaries; they form groups $Z^2(G; A)$
and $B^2(G; A)$, respectively, and their quotient $H^2(G; A)$ is the second in
a sequence of cohomology groups, each of which contains important
structural information. Homological algebra is an interesting and
powerful branch of mathematics with strong links to algebraic topology,
but a full discussion is beyond the scope of this book.)
We claim that this group $\mathrm{Ext}(G, A)$ classifies extensions of $G$ by $A$:
there is a bijective correspondence between elements of $\mathrm{Ext}(G, A)$ and
equivalence classes of extensions of $G$ by $A$ in the sense of
Definition 7.91. We're almost there: all we need to do is to show that
equivalent extensions correspond to equivalent (or cohomologous) factor
sets, and vice versa.
Proposition 7.97 Two extensions
\[ A \xrightarrow{\;i_1\;} E_1 \xrightarrow{\;q_1\;} G \quad\text{and}\quad A \xrightarrow{\;i_2\;} E_2 \xrightarrow{\;q_2\;} G \]
of a group $G$ by a $G$-module $A$ are equivalent if and only if they have
equivalent factor sets.

Proof Suppose that the extensions are equivalent; that is, there exists
an isomorphism $\theta : E_1 \to E_2$ such that $\theta \circ i_1 = i_2$ and $q_1 = q_2 \circ \theta$.
Choose a section $s_1 : G \to E_1$ that realises the $G$-action on $A$. Then $s_1$
yields a factor set $\sigma$ for $E_1$.
For any $g \in G$, the isomorphism $\theta$ maps the coset $i_1(A)s_1(g)$ to the
coset
$\theta(i_1(A)s_1(g)) = \theta(i_1(A))\,\theta(s_1(g)) = i_2(A)\,\theta(s_1(g))$
in $E_2$. Define a section $s_2 : G \to E_2$ by $s_2 = \theta \circ s_1$. We need to check
that this section also realises the given $G$-module structure on $A$. The
required $G$-action is realised by $s_1$, so for any $g \in G$ and $a \in A$ we
have
$i_1(g \cdot a) = s_1(g)\, i_1(a)\, s_1(g)^{-1}$.
Applying $\theta$ to both sides of this equation, we get
$i_2(g \cdot a) = \theta(i_1(g \cdot a)) = \theta(s_1(g))\,\theta(i_1(a))\,\theta(s_1(g))^{-1} = s_2(g)\, i_2(a)\, s_2(g)^{-1}$
and so $s_2$ does indeed realise the correct $G$-action.
The factor set $\sigma$ is determined by the section $s_1$ as follows:
$i_1(\sigma(g, h)) = s_1(g)\, s_1(h)\, s_1(gh)^{-1}$
for all $g, h \in G$. Applying $\theta$ to both sides, we get
$i_2(\sigma(g, h)) = s_2(g)\, s_2(h)\, s_2(gh)^{-1}$
so the factor set determined by the section $s_2$ is the same as that
determined by $s_1$. If $\sigma'$ is some other factor set for $E_2$ determined by a
different section $t : G \to E_2$, then by (7.24) $\sigma$ and $\sigma'$ are cohomologous.
Conversely, suppose that $E_1$ and $E_2$ are extensions of $G$ by $A$ relative
to equivalent factor sets $\sigma_1$ and $\sigma_2$ respectively. Then there exists a
function $\varphi : G \to A$ with $\varphi(e_G) = 0$ such that
$\sigma_1(g, h) - \sigma_2(g, h) = \varphi(g) + g \cdot \varphi(h) - \varphi(gh)$
for all $g, h \in G$.
Let $s_1 : G \to E_1$ and $s_2 : G \to E_2$ be sections that realise the $G$-module
structure on $A$ and yield the factor sets $\sigma_1$ and $\sigma_2$ respectively. Then
every element in $E_1$ can be written uniquely as $i_1(a)s_1(g)$ for some
$a \in A$ and $g \in G$, and every element in $E_2$ can be similarly written in
the form $i_2(a)s_2(g)$. Recall that the multiplication operation in $E_1$ is
given by
$\bigl(i_1(a)s_1(g)\bigr)\bigl(i_1(b)s_1(h)\bigr) = i_1(a + g \cdot b + \sigma_1(g, h))\, s_1(gh)$
and that in $E_2$ is given by
$\bigl(i_2(a)s_2(g)\bigr)\bigl(i_2(b)s_2(h)\bigr) = i_2(a + g \cdot b + \sigma_2(g, h))\, s_2(gh)$.
Now define $\theta : E_1 \to E_2$ by $\theta(i_1(a)s_1(g)) = i_2(a + \varphi(g))\, s_2(g)$. This is
clearly a bijection. Furthermore
$\theta\bigl(i_1(a)s_1(g)\, i_1(b)s_1(h)\bigr) = \theta\bigl(i_1(a + g \cdot b + \sigma_1(g, h))\, s_1(gh)\bigr)$
$= i_2(a + g \cdot b + \sigma_1(g, h) + \varphi(gh))\, s_2(gh)$
and
$\theta(i_1(a)s_1(g))\,\theta(i_1(b)s_1(h)) = i_2(a + \varphi(g))\, s_2(g)\, i_2(b + \varphi(h))\, s_2(h)$
$= i_2(a + \varphi(g) + g \cdot b + g \cdot \varphi(h) + \sigma_2(g, h))\, s_2(gh)$.
Since $\sigma_1$ and $\sigma_2$ are equivalent,
$\sigma_1(g, h) + \varphi(gh) = \varphi(g) + g \cdot \varphi(h) + \sigma_2(g, h)$
and hence the two expressions above are equal, which means that $\theta$ is
an isomorphism.
Next we need to show that the diagram in Definition 7.91 commutes.
For any $a \in A$ and $g \in G$,
$\theta(i_1(a)) = \theta(i_1(a)s_1(e_G)) = i_2(a + \varphi(e_G))\, s_2(e_G) = i_2(a)$
and
$q_2(\theta(i_1(a)s_1(g))) = q_2(i_2(a + \varphi(g))\, s_2(g)) = g = q_1(i_1(a)s_1(g))$
as required. Hence $E_1$ and $E_2$ are equivalent.
So, the group $\mathrm{Ext}(G, A)$ classifies extensions of a group $G$ by a
$G$-module $A$. The zero element in $\mathrm{Ext}(G, A)$ corresponds to (the
equivalence class of) the split extension
$A \xrightarrow{\;i\;} A \rtimes G \xrightarrow{\;q\;} G$
and so if $\mathrm{Ext}(G, A)$ is trivial, this is, up to equivalence, the only
extension of $G$ by $A$.
[Portrait: Issai Schur (1875–1941). Oberwolfach Photo Collection / Konrad Jacobs]
The next result is due originally to the German algebraist Issai Schur
(1875–1941) and was later extended to the nonabelian case by Hans
Zassenhaus (1912–1991).
Theorem 7.98 (Schur–Zassenhaus Theorem) Let $G$ be a group of
order $m$ and $A$ a $G$-module of order $n$. If $\gcd(m, n) = 1$ then $\mathrm{Ext}(G, A)$
is trivial, and the only extension of $G$ by $A$ is the semidirect product $A \rtimes G$.

Proof Suppose that $\sigma$ is a factor set in $\mathrm{FS}(G, A)$; recall that this is a
function $\sigma : G \times G \to A$, and hence its values lie in $A$. By Proposition 2.34
these values must all have orders that divide $|A| = n$, and
so $n\sigma(g, h) = 0$ for all $g, h \in G$. Therefore $n\sigma = 0$ in $\mathrm{FS}(G, A)$, and $n\sigma$
also lies in the subgroup $\mathrm{IFS}(G, A)$ of inner factor sets.
Now we want to show that $m\sigma$ is also an inner factor set. To do this,
we define a function $\varphi : G \to A$ by
$\varphi(g) = \sum_{h \in G} \sigma(g, h)$.
In particular,
$\varphi(e_G) = \sum_{h \in G} \sigma(e_G, h) = \sum_{h \in G} 0 = 0$.
Since $\sigma$ is a factor set, we have
$\sigma(g, h) = \sigma(g, hk) + g \cdot \sigma(h, k) - \sigma(gh, k)$
for all $g, h, k \in G$. Sum both sides of this equation as $k$ ranges over the
whole of $G$:
$\sum_{k \in G} \sigma(g, h) = \sum_{k \in G} \sigma(g, hk) + g \cdot \sum_{k \in G} \sigma(h, k) - \sum_{k \in G} \sigma(gh, k)$.
The left-hand side has $|G| = m$ equal terms, and as $k$ ranges over $G$ so does $hk$,
so the first sum on the right is $\varphi(g)$. This gives
$m\sigma(g, h) = \varphi(g) + g \cdot \varphi(h) - \varphi(gh)$
and hence $m\sigma$ is an inner factor set as well.
Since $m$ and $n$ are coprime, we can find integers $p$ and $q$ such that
$pm + qn = 1$. Then
$\sigma = (pm + qn)\sigma = p(m\sigma) + q(n\sigma)$
lies in $\mathrm{IFS}(G, A)$. This means that any factor set $\sigma \in \mathrm{FS}(G, A)$ is
inner, so $\mathrm{FS}(G, A) = \mathrm{IFS}(G, A)$ and hence the quotient
$\mathrm{Ext}(G, A) = \mathrm{FS}(G, A)/\mathrm{IFS}(G, A)$ is trivial.
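The last step of the proof only needs integers $p$ and $q$ with $pm + qn = 1$. As a small illustration, the following Python sketch finds such a pair with the extended Euclidean algorithm. The function name `bezout` and the sample orders 4 and 9 are our own choices for illustration and do not come from the text.

```python
def bezout(m, n):
    """Return (g, p, q) with p*m + q*n == g == gcd(m, n).

    Standard extended Euclidean algorithm; `bezout` is an illustrative name.
    """
    old_r, r = m, n
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q


# Hypothetical sizes: |G| = m = 4 and |A| = n = 9 are coprime.
g, p, q = bezout(4, 9)
```

With these coefficients, any factor set $\sigma$ can be written as $p(m\sigma) + q(n\sigma)$, which is the combination used at the end of the proof.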
We will now look at a few specific types of group extensions.
First, suppose that
$A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G$
is an extension of a group $G$ by a $G$-module $A$, and that $s : G \to E$ is
a section that realises the given $G$-module structure on $A$. Then by
rearranging (7.20) we get the commutation relation (7.21), reproduced
here:
$i(g \cdot a)\, s(g) = s(g)\, i(a)$ (7.25)
What this says is that for each $i(a)$ to commute with all the $s(g)$ (and,
for that matter, every other element of $E$) we require $g \cdot a = a$; that
is, for the $G$-module structure on $A$ to be trivial. In that case
every element of the form $i(a)$ commutes with every element of $E$,
and therefore the image $i(A)$ of $A$ lies in the centre $Z(E)$ of $E$.
Such extensions are called central extensions, and the triviality of
the $G$-action on $A$ is both a necessary and sufficient condition for an
extension to be central.
Example 7.99 The quaternion group $Q_8$ is a central extension of the
Klein group $V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$ by the cyclic group $\mathbb{Z}_2$. The latter embeds
into $Q_8$ as the subgroup $\{\pm E\}$ and one possible quotient map takes
$i(\mathbb{Z}_2) = \{\pm E\}$ to the identity $e \in V_4$, the coset $\{\pm I\}$ to $a \in V_4$, the
coset $\{\pm J\}$ to $b \in V_4$, and the coset $\{\pm K\}$ to $c \in V_4$.
A possible section $s : V_4 \to Q_8$ is given by
$s : e \mapsto E, \quad a \mapsto I, \quad b \mapsto J, \quad c \mapsto K$
and the corresponding factor set $\sigma(g, h)$, with rows indexed by $g$ and
columns by $h$, is:

      e   a   b   c
  e   0   0   0   0
  a   0   1   0   1
  b   0   1   1   0
  c   0   0   1   1
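The table in Example 7.99 can be checked mechanically. The sketch below, in Python, rebuilds a group on the set $\mathbb{Z}_2 \times V_4$ from the factor set, using the multiplication rule $(x, g)(y, h) = (x + y + \sigma(g, h), gh)$ with trivial action, and counts element orders. The encoding of $V_4$ as the integers 0 to 3 under bitwise XOR, and all function names, are our own assumptions for illustration.

```python
from collections import Counter
from itertools import product

# SIGMA[g][h] is the factor set from the table, with e, a, b, c encoded as
# 0, 1, 2, 3 so that multiplication in V4 is bitwise XOR (a*b = c is 1 ^ 2 == 3).
SIGMA = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
V4 = range(4)


def is_cocycle(sigma):
    """Check sigma(g,h) + sigma(gh,k) = sigma(h,k) + sigma(g,hk) in Z_2 (trivial action)."""
    return all(
        (sigma[g][h] + sigma[g ^ h][k]) % 2 == (sigma[h][k] + sigma[g][h ^ k]) % 2
        for g, h, k in product(V4, repeat=3)
    )


def mul(u, w):
    """Multiply two elements (x, g), (y, h) of the reconstructed extension."""
    (x, g), (y, h) = u, w
    return ((x + y + SIGMA[g][h]) % 2, g ^ h)


def order(u):
    """Order of u, found by repeated multiplication until the identity (0, e)."""
    power, n = u, 1
    while power != (0, 0):
        power, n = mul(power, u), n + 1
    return n


elements = list(product(range(2), V4))
order_counts = dict(Counter(order(u) for u in elements))
```

One element of order 1, one of order 2 and six of order 4 is the order profile of $Q_8$, and the two products `mul((0, 1), (0, 2))` and `mul((0, 2), (0, 1))` differ, so the reconstructed group is nonabelian.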

Another interesting case is the one where the quotient group is cyclic. These
are called cyclic extensions, and the following discussion will be
useful when we classify groups of order 16 in Section 7.A.
If we have a group $G$ with a normal subgroup $H \trianglelefteq G$, such that
$G/H \cong \mathbb{Z}_n$, then $G$ is a cyclic extension of $\mathbb{Z}_n$ by the
(possibly nonabelian) group $H$. Choose some element $g \in G \setminus H$ such
that the coset $gH$ generates the quotient $G/H$. Let $v = g^n$. Then
$vH = g^n H = (gH)^n = H$ in $G/H$, and so $v \in H$. As a normal
subgroup, $H$ is closed under conjugation by any element of $G$, so let
$\alpha \in \mathrm{Aut}(H)$ be the automorphism $\alpha : h \mapsto ghg^{-1}$ given by
conjugation by $g$. Furthermore,
$\alpha(v) = gvg^{-1} = g g^n g^{-1} = g^n = v$
so $\alpha$ fixes this distinguished element $v$. Also, for any $h \in H$,
$\alpha^n(h) = g^n h g^{-n} = vhv^{-1}$
so $\alpha^n$ is conjugation by $v$.
We can now discard $G$ and $g$, and just keep $\alpha$ and $v$. We'll now see that the
data $(H, n, \alpha, v)$ is all we need to reconstruct and uniquely determine
(up to equivalence) the extension
$H \xrightarrow{\;i\;} G \xrightarrow{\;q\;} \mathbb{Z}_n$.

Definition 7.100 A cyclic extension type for a group $H$ is a quadruple
$(H, n, \alpha, v)$ where $n \in \mathbb{N}$, $\alpha \in \mathrm{Aut}(H)$ and $v \in H$ such that
$\alpha(v) = v$ and $\alpha^n$ is conjugation by $v$.

Every cyclic extension type determines some group $G$, and all cyclic
extensions arise in this way:
Theorem 7.101 (Cyclic Extension Theorem) Given a cyclic extension
type $(H, n, \alpha, v)$ there is some group $G$ with $H \trianglelefteq G$ and $G/H \cong \mathbb{Z}_n$.
Furthermore, all extensions of $\mathbb{Z}_n$ by $H$ are determined by some cyclic
extension type.

Proof Suppose that $\mathbb{Z}_n = \langle t : t^n = 1 \rangle$. Let $G$ be the set of ordered
pairs $(h, t^i)$ where $h \in H$ and $0 \leq i < n$. We want to define a group
multiplication structure on this set, which we do as follows:
$(h, t^i) * (k, t^j) = \bigl(h\, \alpha^i(k)\, \sigma(i, j),\ t^{i+j}\bigr)$
with (possibly nonabelian) factor set
$\sigma(i, j) = e_H$ if $i + j < n$, and $\sigma(i, j) = v$ if $i + j \geq n$.
This operation is associative, the identity element is the pair $(e_H, t^0)$,
and the inverse of $(h, t^i)$ is $(\alpha^{-i}((vh)^{-1}), t^{n-i})$ for $0 < i < n$ (and
$(h^{-1}, t^0)$ when $i = 0$). These can all be checked by
a straightforward but involved calculation which we will omit.
This group is an extension of $\mathbb{Z}_n$ by $H$, since the quotient map $q : G \to G/H$
maps $(h, t^i)$ to $t^i H \in G/H$, and $H = \{(h, t^0) : h \in H\}$ is normal
in $G$. To see the latter, observe that for $0 < i < n$
$(k, t^i) * (h, t^0) * (k, t^i)^{-1} = (k, t^i) * (h, t^0) * (\alpha^{-i}((vk)^{-1}), t^{n-i})$
$= (k\, \alpha^i(h), t^i) * (\alpha^{-i}((vk)^{-1}), t^{n-i})$
$= (k\, \alpha^i(h)\, \alpha^i(\alpha^{-i}((vk)^{-1}))\, v,\ t^{i+(n-i)-n})$
$= (k\, \alpha^i(h)\, (vk)^{-1} v,\ t^0) = (k\, \alpha^i(h)\, k^{-1},\ t^0) \in H \times \{t^0\} \subseteq G$
(the case $i = 0$ is just conjugation inside $H$), and so $H$ is closed under
conjugation and hence normal in $G$.
Conversely, suppose that some group $G$ has a normal subgroup $H$
and that $G/H \cong \mathbb{Z}_n$ for some $n$. Then by the discussion earlier, we
can choose an element $g \in G \setminus H$ such that $G/H = \langle gH \rangle$, and from
this obtain an automorphism $\alpha \in \mathrm{Aut}(H)$, defined by $\alpha : h \mapsto ghg^{-1}$,
and an element $v = g^n \in H$ with the required properties.
So, given a cyclic group $\mathbb{Z}_n$ and a group $H$ of order $m$, we can construct
a list of groups of order $mn$ by listing all possible cyclic extension
types $(H, n, \alpha, v)$. This list might contain duplicates, and it might not
contain all possible groups of order $mn$, but it's a start. To solve the
duplication problem we need an appropriate notion of equivalence for
cyclic extension types.
Definition 7.102 Let $(H, n, \alpha, v)$ and $(K, m, \beta, w)$ be cyclic extension
types. They are equivalent if $m = n$ and there exists an isomorphism
$\theta : H \to K$ such that $\theta \circ \alpha = \beta \circ \theta$ and $w = \theta(v)$.
This concept of equivalence is exactly the one we want:
Proposition 7.103 Equivalent cyclic extension types yield isomorphic
groups.

Proof Let $(H_1, n, \alpha_1, v_1)$ and $(H_2, n, \alpha_2, v_2)$ be equivalent cyclic
extension types. Then there exists an isomorphism $\theta : H_1 \to H_2$ satisfying
the conditions in Definition 7.102. Let $G_1$ and $G_2$ be groups realising,
respectively, $(H_1, n, \alpha_1, v_1)$ and $(H_2, n, \alpha_2, v_2)$. Then there are elements
$g_1 \in G_1 \setminus H_1$ and $g_2 \in G_2 \setminus H_2$ such that $\alpha_1$ is conjugation by $g_1$ and
$\alpha_2$ is conjugation by $g_2$, and $v_1 = g_1^n$ and $v_2 = g_2^n$.
Let $f : G_1 \to G_2$ be given by $h g_1^i \mapsto \theta(h) g_2^i$ for $h \in H_1$ and $0 \leq i < n$;
this is well defined and bijective. We must show that it is also a
homomorphism. Suppose that $i + j < n$. Then
$f((h g_1^i)(k g_1^j)) = f(h\, \alpha_1^i(k)\, g_1^{i+j}) = \theta(h\, \alpha_1^i(k))\, g_2^{i+j}$
$= \theta(h)\, \alpha_2^i(\theta(k))\, g_2^{i+j} = f(h g_1^i)\, f(k g_1^j)$.
If $i + j \geq n$ then, using $g_1^n = v_1$ and $g_2^n = v_2$,
$f((h g_1^i)(k g_1^j)) = f(h\, \alpha_1^i(k)\, v_1\, g_1^{i+j-n}) = \theta(h\, \alpha_1^i(k)\, v_1)\, g_2^{i+j-n}$
$= \theta(h)\, \alpha_2^i(\theta(k))\, v_2\, g_2^{i+j-n} = f(h g_1^i)\, f(k g_1^j)$.
Hence $f : G_1 \to G_2$ is an isomorphism, as claimed.
Note that the converse doesn't necessarily hold: there may be inequivalent
cyclic extension types that also realise isomorphic groups.
Proposition 7.104 Let $(H, n, \alpha, v)$ and $(H, n, \beta, v)$ be two cyclic extension
types with $\alpha$ and $\beta$ conjugate in $\mathrm{Aut}(H)$. Then any group that realises
$(H, n, \alpha, v)$ is isomorphic to any group that realises $(H, n, \beta, v)$.

Proof If $\alpha$ and $\beta$ are conjugate in $\mathrm{Aut}(H)$ then there exists an
automorphism $\theta \in \mathrm{Aut}(H)$ such that $\beta = \theta \alpha \theta^{-1}$. Then $\theta : H \to H$
is an equivalence of cyclic extension types, since $\theta \circ \alpha = \beta \circ \theta$, and
since $\theta(v) = v$ it follows that $(H, n, \alpha, v)$ and $(H, n, \beta, v)$ are equivalent.
Hence, by Proposition 7.103, they are realised by isomorphic
groups.
We end this part of the discussion with a short example illustrating
the use of cyclic extension types for classification of groups of a given
order.
Example 7.105 We can use these techniques to classify the groups
of order 6 again. By Cauchy's Theorem, a group $G$ of order 6 must
have a subgroup $H \cong \mathbb{Z}_3$, which has index 2 and is therefore normal
by Proposition 3.12. The quotient $G/H \cong \mathbb{Z}_2$ and so the problem
reduces to studying cyclic extension types of the form $(H, 2, \alpha, v)$.
There are two automorphisms of $\mathbb{Z}_3$: the identity, which we will
denote $\alpha_1$ here, and the map $\alpha_2$ that takes every element of $H$ to its
inverse. From these we obtain four cyclic extension types:
$(\mathbb{Z}_3, 2, \alpha_1, 1)$, $(\mathbb{Z}_3, 2, \alpha_1, h)$, $(\mathbb{Z}_3, 2, \alpha_1, h^2)$ and $(\mathbb{Z}_3, 2, \alpha_2, 1)$.
The first three of these are realised by the direct sum $\mathbb{Z}_3 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_6$,
while the last is realised by the dihedral group $D_3$.
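The four extension types in Example 7.105 can be built explicitly from the multiplication rule in the proof of Theorem 7.101. The following Python sketch does this with $H = \mathbb{Z}_3$ written additively as $\{0, 1, 2\}$, so that $v = 1, h, h^2$ become $0, 1, 2$. The helper names `make_mul` and `summary` are hypothetical, introduced only for this illustration.

```python
from itertools import product

M, N = 3, 2  # H = Z_3 (written additively), quotient Z_2


def make_mul(alpha, v):
    """Multiplication (h, i) * (k, j) for the cyclic extension type (Z_3, 2, alpha, v)."""
    def alpha_power(i, k):
        # Apply alpha to k a total of i times, giving alpha^i(k).
        for _ in range(i):
            k = alpha(k)
        return k

    def mul(a, b):
        (h, i), (k, j) = a, b
        carry = v if i + j >= N else 0  # factor set: identity of H, or v
        return ((h + alpha_power(i, k) + carry) % M, (i + j) % N)

    return mul


def summary(mul):
    """Return (is_abelian, largest element order) for the group defined by mul."""
    elems = list(product(range(M), range(N)))

    def order(u):
        power, n = u, 1
        while power != (0, 0):
            power, n = mul(power, u), n + 1
        return n

    abelian = all(mul(a, b) == mul(b, a) for a in elems for b in elems)
    return abelian, max(order(u) for u in elems)


def identity(k):
    return k


def inverse(k):
    return (-k) % M


trivial_types = [summary(make_mul(identity, v)) for v in range(M)]
inverting_type = summary(make_mul(inverse, 0))
```

An abelian group of order 6 containing an element of order 6 is cyclic, which matches the first three types, while the last type gives a nonabelian group of order 6, consistent with $D_3$.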

We'll use cyclic extension types in the next section, when we classify
groups of order 16. Finally, we'll mention a couple of classes of groups
that can be described in terms of extensions.

Definition 7.106 A metabelian group is a group that can be constructed
as an extension of an abelian group by an abelian group;
that is, a group $G$ that has an abelian normal subgroup $N$ such that
the quotient $G/N$ is also abelian.
A metacyclic group is a group that is isomorphic to a cyclic extension
by a cyclic group: a group $G$ with a cyclic normal subgroup $N$ such
that $G/N$ is also cyclic.

Metacyclic groups are metabelian, and include the cyclic groups themselves,
the dicyclic groups $\mathrm{Dic}_n$ and the dihedral groups $D_n$. As
remarked earlier, metabelian groups are exactly those with derived
length at most two.

7.A Classification of small finite groups

I tried to make out the names of plants, and collected all sorts of things,
shells, seals, franks, coins and minerals. The passion for collecting, which
leads a man to be a systematic naturalist, a virtuoso, or a miser, was very
strong in me, and was clearly innate, as none of my sisters or brother ever
had this taste.
Charles Darwin (1809–1882), in: Francis Darwin, The Life and Letters of
Charles Darwin (1887) I 27–28

In this section we will classify, up to isomorphism, groups
of order less than 32. The reason we'll stop at order 31 is that the
classification is relatively manageable up to that point. The classification
of groups of order $n = 2^k$ is, in general, quite complicated
and involves the consideration of a number of cases. As can be seen
from Table 7.3, the only really involved cases are $n = 16 = 2^4$, with 14
non-isomorphic groups, and $n = 24 = 3 \times 2^3$, with 15. Everything
else is comparatively straightforward; however there are 51 groups of
order 32, which is why we'll stop just before then.

   n   total   abelian   nonabelian
   1     1        1          0
   2     1        1          0
   3     1        1          0
   4     2        2          0
   5     1        1          0
   6     2        1          1
   7     1        1          0
   8     5        3          2
   9     2        2          0
  10     2        1          1
  11     1        1          0
  12     5        2          3
  13     1        1          0
  14     2        1          1
  15     1        1          0
  16    14        5          9
  17     1        1          0
  18     5        2          3
  19     1        1          0
  20     5        2          3
  21     2        1          1
  22     2        1          1
  23     1        1          0
  24    15        3         12
  25     2        2          0
  26     2        1          1
  27     5        3          2

Before we start classifying, we'll introduce a few results that will prove
useful. The first relates to groups that decompose as a semidirect
product of two cyclic groups.
Proposition 7.107 Suppose that a finite group $G$ decomposes as an internal
semidirect product $HK$, where $H = \langle x : x^m = 1 \rangle \cong \mathbb{Z}_m$ is normal
in $G$, and $K = \langle y : y^n = 1 \rangle \cong \mathbb{Z}_n$. Then the possible actions of $K$ on $H$
are determined by $yxy^{-1} = x^i$ where $0 < i < m$ and $i^n \equiv 1 \pmod{m}$.

Proof The action of $K$ on $H$ is determined entirely by the behaviour
of the generators $x$ and $y$, specifically by the value of $yxy^{-1} \in H$.
Since $H$ is cyclic, this must be a nontrivial power of the generator $x$,
so $yxy^{-1} = x^i$ with $0 < i < m$. Furthermore, since $K$ is a finite cyclic
group of order $n$, we have $y^n = 1$, so $x = y^n x y^{-n} = x^{i^n}$ and therefore
$i^n \equiv 1 \pmod{m}$.

The next result, which we will use particularly in the classification of
groups of order 16 and 27, relates to factoring a group by its centre.

Proposition 7.108 If $G/Z(G)$ is cyclic then $G$ is abelian.

Proof Let $Z = Z(G)$ and suppose that $G/Z$ is cyclic. Then there
exists some $g \in G$ such that $G/Z = \langle gZ \rangle$, and hence every coset of $Z$
in $G$ is of the form $(gZ)^k = g^k Z$ for some $k \in \mathbb{Z}$.
Consider $x, y \in G$ with $x \in g^m Z$ and $y \in g^n Z$. Then $x = g^m a$ and
$y = g^n b$ for some $a, b \in Z$. Therefore
$xy = g^m a g^n b = g^m g^n ab = g^{m+n} ba = g^n g^m ba = g^n b g^m a = yx$
and hence $G$ is abelian.

The trivial group

There is a single group with one element: the trivial group {e}.

Groups of prime order: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 and 31

By Proposition 2.35, any group $G$ of prime order $|G| = p$ is isomorphic
to the cyclic group $\mathbb{Z}_p$. These groups are abelian, simple and soluble.

Groups of twice prime order: 4, 6, 10, 14, 22 and 26

The groups of order 4 were classified in Proposition 2.38 and those of
order 6 in Proposition 2.39. More generally, by Proposition 5.29, there
are two groups of order $2p$ where $p$ is prime, listed in Table 7.4.

(i) The cyclic group $\mathbb{Z}_{2p}$ and
(ii) the dihedral group $D_p$.
Table 7.4: Groups of order $2p$

The cyclic group $\mathbb{Z}_{2p}$ is abelian, and therefore soluble by Proposition 7.42.
The dihedral group $D_p$ (in the case $p = 2$ this is the Klein
group $V_4$) is soluble by Proposition 7.40 because it is an extension of
two finite abelian (and hence soluble) groups. More precisely, $D_p$ has a
soluble normal subgroup $R_p \cong \mathbb{Z}_p$ and soluble quotient $D_p/R_p \cong \mathbb{Z}_2$.

Groups of prime square order: 4, 9 and 25

By Proposition 7.20, if $p$ is prime, then there are two groups of order $p^2$,
listed in Table 7.5, both of which are abelian. These are soluble by
Propositions 7.42 and 7.43.

(i) The cyclic group $\mathbb{Z}_{p^2}$ and
(ii) $\mathbb{Z}_p \oplus \mathbb{Z}_p$.
Table 7.5: Groups of order $p^2$

Groups of order 8

The groups of order 8 were classified in Proposition 2.40, and there
are five of them, listed in Table 7.6. Of these, (i)–(iii) are abelian, while
(iv) and (v) are nonabelian. They are all soluble by Proposition 7.43.

(i) the cyclic group $\mathbb{Z}_8$,
(ii) the abelian groups $\mathbb{Z}_2 \oplus \mathbb{Z}_4$ and
(iii) $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$,
(iv) the dihedral group $D_4$ and
(v) the quaternion group $Q_8$, which is isomorphic to the dicyclic group $\mathrm{Dic}_2$.

Groups of order 12

This is the first new case that can't be immediately resolved by existing
classification results. Groups of order 12 that we've met already
include the cyclic group $\mathbb{Z}_{12} \cong \mathbb{Z}_3 \oplus \mathbb{Z}_4$, the abelian group
$\mathbb{Z}_2 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3$, the dihedral group $D_6$, the alternating
group $A_4$ and the dicyclic group $\mathrm{Dic}_3$. We now prove that these five are the only ones.
Let $G$ be a group of order $|G| = 12$. First we apply Sylow's Third
Theorem32 to calculate the possible numbers $n_p$ of Sylow $p$-subgroups.
Thus $n_2 \equiv 1 \pmod{2}$ and $n_2 \mid 3$, which means that either $n_2 = 1$ or 3.
Similarly, $n_3 \equiv 1 \pmod{3}$ and $n_3 \mid 4$, so either $n_3 = 1$ or 4.
32 Theorem 7.13, page 246.
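The counting argument above is easy to automate. As a rough sketch in Python, the function below lists the values of $n_p$ permitted by the two conditions used here: $n_p \equiv 1 \pmod{p}$, and $n_p$ divides the part of the group order coprime to $p$. The name `sylow_counts` is our own and is not notation from the text.

```python
def sylow_counts(order, p):
    """Candidate numbers of Sylow p-subgroups for a group of the given order.

    Keeps the divisors d of the p'-part of the order with d congruent to 1 mod p.
    """
    m = order
    while m % p == 0:  # strip the full power of p from the order
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]


n2_candidates = sylow_counts(12, 2)
n3_candidates = sylow_counts(12, 3)
```

The same helper applies to the other orders treated in this section, for example `sylow_counts(18, 3)` and `sylow_counts(15, 5)`.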
Case 1 ($n_3 = 1$) Let $H$ be the (unique) Sylow 3-subgroup; this must
be isomorphic to $\mathbb{Z}_3$. Furthermore, $H \trianglelefteq G$ by Corollary 7.12. Now
let $K$ be a Sylow 2-subgroup: this has order 4 and must therefore be
isomorphic to either $\mathbb{Z}_4$ or $\mathbb{Z}_2 \oplus \mathbb{Z}_2$.
By Proposition 2.34, the order of any element of $H \cap K$ must divide
both $|H| = 3$ and $|K| = 4$, which are coprime, so $H \cap K = \{e\}$. Then
$G = HK$ must be a semidirect product of $H$ and $K$. Let $H = \langle x : x^3 = 1 \rangle$
and consider the two possibilities for $K$ separately:
Case 1a ($K \cong \mathbb{Z}_4$) Let $K = \langle y : y^4 = 1 \rangle$. By Proposition 7.107 the
$K$-action is determined by the value of $yxy^{-1} = x^i$, where $0 < i < 3$
and $i^4 \equiv 1 \pmod{3}$. There are two ways this can go: either $yxy^{-1} = x$
or $yxy^{-1} = x^2 = x^{-1}$.
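The condition from Proposition 7.107 can be enumerated directly. A minimal Python sketch, with the hypothetical helper name `cyclic_actions`, lists the exponents $i$ with $0 < i < m$ and $i^n \equiv 1 \pmod{m}$:

```python
def cyclic_actions(m, n):
    """Exponents i with 0 < i < m and i**n congruent to 1 mod m.

    These are the candidate actions y x y^-1 = x^i of Z_n on Z_m.
    """
    return [i for i in range(1, m) if pow(i, n, m) == 1]


order_12_case = cyclic_actions(3, 4)  # H = Z_3, K = Z_4, as in Case 1a
```

For $m = 3$ and $n = 4$ this returns the two exponents 1 and 2 found above; calling `cyclic_actions(8, 2)` gives the four exponents 1, 3, 5 and 7.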
The first of these gives the presentation
$G = \langle x, y : x^3 = y^4 = 1, xy = yx \rangle \cong \mathbb{Z}_4 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_{12}$.
The other possibility is that $yxy^{-1} = x^2$, which gives the presentation
$G = \mathbb{Z}_3 \rtimes \mathbb{Z}_4 = \langle x, y : x^3 = y^4 = 1, x^2 y = yx \rangle$.
We claim that this group is isomorphic to the dicyclic group
$\mathrm{Dic}_3 = \langle a, b : a^6 = e, a^3 = b^2, ab = ba^{-1} \rangle$.
To see this, we note first that
$y^2 x = y(yx) = y(x^2 y) = (yx)xy = x^2 (yx) y = x^4 y^2 = xy^2$
and then that $(y^2 x)^2 = (y^2 x) y^2 x = x y^4 x = x^2$. Since $x^2$ has order 3,
it follows that $y^2 x$ has order 6. Now let $a = y^2 x$ and $b = y$. Then
$b^2 a = y^4 x = x$, so $a$ and $b$ generate all of the group $G = \mathbb{Z}_3 \rtimes \mathbb{Z}_4$.
Furthermore,
$a^3 = (y^2 x)^3 = x^2 y^2 x = x^3 y^2 = y^2 = b^2$
and $aba = y^2 x y^3 x = x y^5 x = xyx = x^3 y = y = b$.
So $G \cong \mathrm{Dic}_3$.

Case 1b ($K \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$) Let $K = \langle y, z : y^2 = z^2 = 1, yz = zy \rangle$. The
semidirect product $H \rtimes K$ is determined by the conjugation action of $y$
and $z$ on $x$. There are two possibilities for each:
$yxy^{-1} = x$ or $yxy^{-1} = x^2$
and $zxz^{-1} = x$ or $zxz^{-1} = x^2$.
If both $y$ and $z$ act trivially on $x$, we have
$G = \langle x, y, z : x^3 = y^2 = z^2 = 1, xy = yx, xz = zx, yz = zy \rangle$.
This is isomorphic to the direct sum $\mathbb{Z}_3 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_6$.
If $yxy^{-1} = x^2$ and $zxz^{-1} = x$ then we have
$G = \langle x, y, z : x^3 = y^2 = z^2 = 1, x^2 y = yx, xz = zx, yz = zy \rangle$. (7.26)
This is isomorphic to the case $yxy^{-1} = x$ and $zxz^{-1} = x^2$ by swapping
the generators $y$ and $z$.
The other possibility is that both $y$ and $z$ act nontrivially on $x$, in which
case we get
$G = \langle x, y, z : x^3 = y^2 = z^2 = 1, x^2 y = yx, x^2 z = zx, yz = zy \rangle$.
But then $yz$ acts trivially on $x$, so replacing the generator $z$ with $yz$ yields
a presentation isomorphic to (7.26). So as long as either or both of $y$ and $z$
act nontrivially on $x$ we get isomorphic groups.
We now claim that this group is isomorphic to the dihedral group $D_6$.
Setting $a = xz$ and $b = y$ in (7.26), it follows that $a^2 = x^2 z^2 = x^2$ and
$a^3 = x^3 z^3 = z \neq e$, since $x$ and $z$ commute. As $x^2$ has order 3, $a = xz$ must
have order 6. Moreover, $ba = yxz = x^2 yz = x^2 zy = a^{-1} b$, and $a$ and $b$ generate
$G$ because $a^4 = x$ and $a^3 = z$. The presentation (7.26) can
thus be shown to be isomorphic to
$\langle a, b : a^6 = b^2 = 1, ba = a^{-1} b \rangle$
which by Proposition 5.28 is isomorphic to $D_6$. Also (7.26) can be
rewritten as
$\langle x, y : x^3 = y^2 = 1, x^2 y = yx \rangle \times \langle z : z^2 = 1 \rangle$,
which is a presentation for the direct product $D_3 \times \mathbb{Z}_2$ by Proposition 5.22,
and hence $D_6 \cong D_3 \times \mathbb{Z}_2$.
Case 2 ($n_3 = 4$) Suppose instead that $G$ contains four Sylow 3-subgroups,
which we can denote $H_1$, $H_2$, $H_3$ and $H_4$. By Sylow's
Second Theorem they are all conjugate to each other, so we can let
$G$ act on $X = \mathrm{Syl}_3(G) = \{H_1, H_2, H_3, H_4\}$ by conjugation. The
Orbit–Stabiliser Theorem33 says that
$4 = |X| = |\mathrm{Orb}_G(H_i)| = |G|/|\mathrm{Stab}_G(H_i)| = |G|/|N_G(H_i)|$
and hence $|N_G(H_i)| = 3$, so the normaliser $N_G(H_i) = H_i$ itself. (Thus
each of the Sylow 3-subgroups is self-normalising.)
33 Theorem 6.25, page 223.
Now suppose that $H_1 = \langle x : x^3 = 1 \rangle$. In general, $H_i \cap H_j = \{e\}$ for all
$i \neq j$, and so $x \notin N_G(H_i)$ for $2 \leq i \leq 4$. That is, $x H_1 x^{-1} = H_1$ but
$x H_i x^{-1} \neq H_i$ for $2 \leq i \leq 4$. The action of $x$ on $X$ then fixes
$H_1$ and cyclically permutes the other three groups $H_2$, $H_3$ and $H_4$: it
corresponds to the permutation $(H_2\ H_3\ H_4) \in \mathrm{Sym}(X)$.
Similarly, let $H_4 = \langle y : y^3 = 1 \rangle$. Then by a similar argument, the generator
$y$ corresponds to the permutation $(H_1\ H_2\ H_3) \in \mathrm{Sym}(X)$. Therefore
$G$ contains $(H_2\ H_3\ H_4)$ and $(H_1\ H_2\ H_3)$, which between them generate
the alternating group $\mathrm{Alt}(X) \cong A_4$. And since $|A_4| = 12 = |G|$ it
follows that $G \cong A_4$.
Up to isomorphism, then, there are five groups of order 12, listed
in Table 7.7. Of these, the first two are abelian and the other three
aren't. All of them are soluble: the first two by Proposition 7.42
because they're abelian, the next two because they are extensions
of abelian (and hence soluble) groups by Proposition 7.40, and the
last by Proposition 7.38 because it's a subgroup of $S_4$, whose derived
series (7.18) terminates in the trivial subgroup and is therefore soluble.

(i) the cyclic group $\mathbb{Z}_{12} \cong \mathbb{Z}_4 \oplus \mathbb{Z}_3$,
(ii) the abelian group $\mathbb{Z}_2 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3$,
(iii) the dihedral group $D_6$,
(iv) the dicyclic group $\mathrm{Dic}_3$, and
(v) the alternating group $A_4$.
Table 7.7: Groups of order 12

Groups of order 15

By Example 7.14, and more generally by Proposition 7.15, there is only
one group of order 15: the cyclic group $\mathbb{Z}_{15} \cong \mathbb{Z}_3 \oplus \mathbb{Z}_5$. This group is
abelian, and hence soluble by Proposition 7.42.

Groups of order 16

This is the first of the two complicated cases we will consider (the
other being 24). The following exposition is based heavily on the very
readable article by Marcel Wild,34 and makes use of the theory of
cyclic extensions developed in the previous section.
34 M Wild, The groups of order sixteen made easy, The American Mathematical
Monthly 112 (2005) 20–31.
First we prove the following result:
Proposition 7.109 Let $G$ be a group of order 16. If $G$ is not isomorphic
to the direct sum $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$, then $G$ contains a normal subgroup
isomorphic either to $\mathbb{Z}_8$ or $\mathbb{Z}_2 \oplus \mathbb{Z}_4$.

Proof Suppose that $G \not\cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$. If $G$ has an element $g$ of
order 8 then the cyclic subgroup $\langle g \rangle \cong \mathbb{Z}_8$; it has index 2 in $G$ and is
thus normal by Proposition 3.12.
If $G$ has no element of order 8, then it must have at least some elements
of order 4, otherwise $G$ would be isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$.
Since $G$ is a $p$-group (for $p = 2$), by Proposition 7.19 the centre $Z(G)$
must be nontrivial, by Lagrange's Theorem it must have even order,
and by Cauchy's Theorem there is an element $z \in Z(G)$ with order 2.
Let $H = \langle z \rangle \cong \mathbb{Z}_2$ be the subgroup of $G$ generated by this element.
This subgroup is normal in $G$ because it lies in the centre $Z(G)$.
Case 1 Suppose there exists an element $x \in G$ of order 4 such that
$x^2 \neq z$. Then the intersection $\langle x \rangle \cap H = \{e\}$ is trivial, and hence
$\langle x, z \rangle \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4$. This subgroup has index 2 in $G$ and is therefore normal.
Case 2 If, instead, all order-4 elements of $G$ square to give $z$ then
everything in the quotient $G/H$ has order at most 2, and so $G/H$ is
abelian by Proposition 1.20. Let $x \in G$ be an element of order 4. Since
$G/H$ is abelian, every conjugate of $x$ in $G$ must lie in the coset $Hx$.
The conjugacy class $[x]$ of $x$ thus has at most $|Hx| = 2$ elements, and
by Corollary 6.27 the centraliser $Z_G(x)$ has order at least 8. We can
therefore find an element $y \in Z_G(x) \setminus \langle x \rangle$, which commutes with $x$.
We want to use the elements $x$ and $y$ to generate a subgroup isomorphic
to $\mathbb{Z}_2 \oplus \mathbb{Z}_4$. If $y$ has order 2 then $\langle x, y \rangle$ is just such a group. If,
however, $|y| = 4$ then by the above hypothesis $y^2 = z$ and we have
$(xy)^2 = x^2 y^2 = z^2 = e$, so $|xy|$ is at most 2. But $|xy|$ can't equal 1,
because then we would have $xy = e$ and $y = x^{-1} \in \langle x \rangle$, which contradicts
our choice of $y$, so $|xy| = 2$. Furthermore, $xy \notin \langle x \rangle$ because
otherwise $xy$, having order 2, would have to equal $x^2$, and then $y = x \in \langle x \rangle$.
So the subgroup $\langle x, xy \rangle$ must be isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_4$ as claimed; again it
has index 2 and is therefore normal in $G$.
Every group of order 16, then, can be constructed as an extension of
$\mathbb{Z}_2$ by one of three groups of order 8: either $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$, or $\mathbb{Z}_2 \oplus \mathbb{Z}_4$,
or the cyclic group $\mathbb{Z}_8$. By the Cyclic Extension Theorem,35 to obtain
all the groups of order 16 apart from $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$, we just have
to construct a list of cyclic extension types of the form $(\mathbb{Z}_8, 2, \alpha, v)$
or $(\mathbb{Z}_2 \oplus \mathbb{Z}_4, 2, \beta, v)$. The Cyclic Extension Theorem ensures that all
of these are realised by valid groups, and then all we need to do is
discard any duplicates on the resulting list.
35 Theorem 7.101, page 294.
The automorphisms of $\mathbb{Z}_8 = \langle x : x^8 = 1 \rangle$ are completely determined
by their action on the generator $x$; we list these in Table 7.8.

  automorphism          image of x
  $\alpha_1 = \mathrm{id}$      $x$
  $\alpha_2$                    $x^3$
  $\alpha_3$                    $x^5$
  $\alpha_4$                    $x^7 = x^{-1}$
Table 7.8: Automorphisms of $\mathbb{Z}_8$

Suppose that $H = \langle x : x^8 = 1 \rangle \cong \mathbb{Z}_8$ is normal in $G$, and consider an
element $g \in G \setminus H$ of minimal order. By Proposition 2.34
this must have order 2, 4, 8 or 16.
Case 1 ($|g| = 2$) In this case $v = g^2 = e$. All four automorphisms in
Table 7.8 fix $e$ and satisfy the requirement that $\alpha^2$ is conjugation by $e$.
We therefore get four cyclic extension types $(\mathbb{Z}_8, 2, \alpha_i, e)$ for $1 \leq i \leq 4$,
which are realised by the following groups:
$(\mathbb{Z}_8, 2, \alpha_1, e) \cong \mathbb{Z}_8 \oplus \mathbb{Z}_2$, $(\mathbb{Z}_8, 2, \alpha_2, e) \cong \mathbb{Z}_8 \rtimes_2 \mathbb{Z}_2 \cong SD_{16}$,
$(\mathbb{Z}_8, 2, \alpha_3, e) \cong \mathbb{Z}_8 \rtimes_3 \mathbb{Z}_2$, $(\mathbb{Z}_8, 2, \alpha_4, e) \cong \mathbb{Z}_8 \rtimes_4 \mathbb{Z}_2 \cong D_8$.
The second of these is the semidihedral group $SD_{16}$ of order 16.36
36 Some books call this the quasidihedral group $QD_{16}$; it has presentation
$\langle x, y : x^8 = y^2 = 1, yxy^{-1} = x^3 \rangle$.
This is one of a family of groups $SD_{2^n}$ or $QD_{2^n}$ of the form
$\langle x, y : x^{2^{n-1}} = y^2 = 1, yxy^{-1} = x^{2^{n-2}-1} \rangle$
for $n \geq 2$. The case $n = 4$ is the first nonabelian group in the sequence,
and the first that we haven't met in other guises, since $SD_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$ and
$SD_8 \cong \mathbb{Z}_4 \oplus \mathbb{Z}_2$.
Case 2 ($|g| = 4$) Here $v = g^2$ must have order 2; the only possibility is
that $v = x^4$. If $\alpha = \alpha_1$ then $(x^2 g)(x^2 g) = x^2 \alpha_1(x^2) g^2 = x^4 g^2 = x^8 = e$,
so $|x^2 g| = 2$ which contradicts the hypothesis that $g$ has minimal order
in $G \setminus H$. If $\alpha = \alpha_2$ then $xg$ has order 2 and if $\alpha = \alpha_3$ then $x^2 g$ again
has order 2. So the only new cyclic extension type is $(\mathbb{Z}_8, 2, \alpha_4, x^4)$,
which is realised by the dicyclic group $\mathrm{Dic}_4$.
Case 3 ($|g| = 8$) In this case there are two possible choices for $v$,
namely $x^2$ and $x^6$. Both of these elements are fixed by $\alpha_1$ and $\alpha_3$, but
not $\alpha_2$ or $\alpha_4$, and so we have four potentially new cyclic extension
types
$(\mathbb{Z}_8, 2, \alpha_1, x^2)$, $(\mathbb{Z}_8, 2, \alpha_1, x^6)$, $(\mathbb{Z}_8, 2, \alpha_3, x^2)$ and $(\mathbb{Z}_8, 2, \alpha_3, x^6)$.
The automorphism $\alpha_4$ maps $x^2$ to $x^6$ and commutes with both $\alpha_1$ and
$\alpha_3$, so the first and second of the above types are equivalent, as are the
third and fourth.
Considering $(\mathbb{Z}_8, 2, \alpha_1, x^2)$ we note that $(x^3 g)(x^3 g) = x^3 \alpha_1(x^3) g^2 = x^8 = e$.
Therefore $|x^3 g| = 2$ and so this extension type doesn't yield a
group we haven't already seen. We can discard $(\mathbb{Z}_8, 2, \alpha_3, x^2)$ as well,
because $(xg)(xg) = x\, \alpha_3(x)\, g^2 = x^8 = e$ and so $|xg| = 2$.
Case 4 ($|g| = 16$) If $g$ has order 16 then the cyclic subgroup $\langle g \rangle$ must
be the whole of $G$, so $G \cong \mathbb{Z}_{16}$. This group realises, amongst others,
the cyclic extension type $(\mathbb{Z}_8, 2, \alpha_1, x)$.
We have now found six groups of order 16 that have a normal subgroup
isomorphic to $\mathbb{Z}_8$. Next we have to study the cyclic extension types
of the form $(\mathbb{Z}_2 \oplus \mathbb{Z}_4, 2, \beta, v)$. The automorphism group of
$\mathbb{Z}_2 \oplus \mathbb{Z}_4 = \langle x, y : x^4 = y^2 = 1, xy = yx \rangle$ is isomorphic to the dihedral group
$D_4$ of order 8. Table 7.9 lists the eight possible automorphisms and their actions
on the generators $x$ and $y$.

  automorphism          image of x     image of y
  $\beta_1 = \mathrm{id}$       $x$            $y$
  $\beta_2$                     $x^3 y$        $x^2 y$
  $\beta_3$                     $x^3$          $y$
  $\beta_4$                     $xy$           $x^2 y$
  $\beta_5$                     $xy$           $y$
  $\beta_6$                     $x^3$          $x^2 y$
  $\beta_7$                     $x^3 y$        $y$
  $\beta_8$                     $x$            $x^2 y$
Table 7.9: Automorphisms of $\mathbb{Z}_2 \oplus \mathbb{Z}_4$

With Proposition 7.104 in mind, we note
that $\beta_5$ and $\beta_7$ are conjugate in $\mathrm{Aut}(H)$ and so are $\beta_6$ and $\beta_8$. We can
therefore discard $\beta_7$ and $\beta_8$, as they will give rise to groups isomorphic
to those obtained from, respectively, $\beta_5$ and $\beta_6$.
Also, since $H$ is abelian, conjugation by any element of $H$ is the identity
map, so we can also rule out $\beta_2$ and $\beta_4$, neither of which satisfies the
requirement $\beta^2 = \mathrm{id}$.
We assume now that $G$ has no elements of order 8 or higher, otherwise
it would have a normal subgroup isomorphic to $\mathbb{Z}_8$ and have been
covered already in one of the previous four cases. Let
$H = \langle x, y : x^4 = y^2 = 1, xy = yx \rangle \cong \mathbb{Z}_4 \oplus \mathbb{Z}_2$ and suppose that $g \in G \setminus H$ is an
element of minimal order.
Case 5 (| g| = 2) In this case v = g2 = e, which is fixed by any of the
remaining four automorphisms under consideration. We thus obtain
304 a course in abstract algebra

four cyclic extension types

(Z2 Z4 , 2, 1 , e)
= Z2 Z2 Z4 , (Z2 Z4 , 2, 3 , e)
= D4 Z2 ,
(Z2 Z4 , 2, 5 , e)
= V4 oZ4 , (Z2 Z4 , 2, 6 , e)
= Q8 oZ2 .

Case 6 (| g| = 4) Here v can be either x2 , y or x2 y. From now on we


assume that all elements of G \ H have order at least 4.
Case 6a (v = x2 ) If v = x2 then only 3 and 5 determine groups
that we havent already seen. For 1 the element xg has order 2, since
( xg)( xg) = x2 g2 = x4 = e, and for 6 the element xyg has order 2,
since ( xyg)( xyg) = xyx5 yg2 = x2 y2 g2 = x4 y2 = e. Hence this case
gives two new groups:

(Z2 Z4 , 2, 3 , x2 )
= Q8 Z2 , (Z2 Z4 , 2, 5 , x2 )
= Z4 oZ4 .

Case 6b (v = y) Here the condition (v) = v immediately rules out


2 , 4 , 6 and 8 . The identity automorphism 1 = id yields a new
group

(Z2 Z4 , 2, 1 , y)
= Z4 Z4 .

For 3 we observe that gx2 g1 = 3 ( x2 ) = x2 , so x2 and g commute,


and hence h g, x2 i
= Z2 Z4 . Furthermore, ( xg)( xg) = x3 ( x ) g2 =
x4 y = y, so if we replace g with xg 6 h g, x2 i, we have Case 6a again.
For 5 we have ( xg)( xg) = x5 ( x ) g2 = x2 y2 = x2 , and replacing g
with xg also returns us to Case 6a.
(i) The cyclic group Z16 , And for 7 we have ( xg)( xg) = x7 ( x ) g2 = x4 yg2 = x4 y2 = e, so
(ii) the abelian groups Z2 Z8 ,
(iii) Z4 Z4 ,
| xg| = 2, which was already dealt with in Case 5.
(iv) Z2 Z2 Z4 and Case 6c (v = x2 y) Finally, the condition (v) = v enables us to
(v) Z2 Z2 Z2 Z2 ,
(vi) the dihedral group discard 2 , 4 , 6 and 8 straight away. There exists an automorphism
D8 = Z8 o4 Z2 , of H (2 , for example) mapping v = x2 y to y, and the allowed au-
(vii) the dicyclic group Dic4 ,
tomorphisms 1 , 3 , 5 and 7 are closed under conjugation, so by
(viii) the semidihedral group
SD16 = Z8 o2 Z2 , Proposition 7.103 we get the same groups as in Case 6b.
(ix) the direct products D4 Z2
and
We now have fourteen groups of order 16, namely the direct sum
(x) Q8 Z2 , and Z2 Z2 Z2 Z2 , together with six extensions of Z2 by Z8 , and seven
(xi) the semidirect products extensions of Z2 by Z2 Z4 . It is not immediately obvious that some
Z8 o3 Z2 ,
(xii) V4 oZ4 , of these arent isomorphic to each other, although this can be checked
(xiii) Q8 oZ2 and by comparing various fundamental properties such as commutativity
(xiv) Z4 oZ4 .
and the number of elements of a given order. We will omit this,
Table 7.10: Groups of order 16 encouraging the interested reader to check it themselves. The fourteen
groups of order 16 are listed in Table 7.10. (Several of these can also
be constructed as extensions of Z2 by Z2 Z2 Z2 , by D4 , or by Q8 .)

Groups of order 18

Let $G$ be a group of order 18. By Sylow's Theorems $G$ has exactly one Sylow 3-subgroup $H$ of order 9, which is therefore normal in $G$ by Corollary 7.12. Furthermore, there can be either 1, 3 or 9 Sylow 2-subgroups of order 2, each isomorphic to $\mathbb{Z}_2$.
Let $K = \langle z : z^2 = 1 \rangle \cong \mathbb{Z}_2$ be a Sylow 2-subgroup of order 2. Then $G = HK$, an internal semidirect product of $H$ with $K$. By Proposition 7.20, $H$ is isomorphic to either $\mathbb{Z}_9$ or $\mathbb{Z}_3 \oplus \mathbb{Z}_3$.
Case 1 Let $H = \langle x : x^9 = 1 \rangle \cong \mathbb{Z}_9$. By Proposition 7.107, the semidirect product is defined by the action of $K$ on $H$, and hence $zxz^{-1} = x^i$ for some $i$ where $1 \leq i < 9$ and $i^2 \equiv 1 \pmod 9$. There are two possible cases: $i = 1$ or $i = 8$.
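The congruence condition on $i$ can be checked by brute force. The following is a minimal sketch; the function name `action_exponents` is an illustrative choice and does not come from the text.

```python
def action_exponents(n: int, k: int) -> list[int]:
    """Exponents i with 1 <= i < n and i**k ≡ 1 (mod n).

    These are the candidate values for y x y^{-1} = x^i when a cyclic
    group of order k acts on a cyclic group <x> of order n.
    """
    return [i for i in range(1, n) if pow(i, k, n) == 1]


# Order 18, Case 1: Z_2 acting on Z_9.
print(action_exponents(9, 2))  # [1, 8]
```

The same function applies to the other cyclic-by-cyclic cases in this section by changing `n` and `k`.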
Case 1a ($i = 1$) This is the trivial action, so $G \cong \mathbb{Z}_9 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_{18}$.
Case 1b ($i = 8$) This gives $zxz^{-1} = x^8 = x^{-1}$, and hence
$$G = \langle x, z : x^9 = 1,\ z^2 = 1,\ zxz^{-1} = x^{-1} \rangle$$
which is isomorphic to the dihedral group $D_9$ by Proposition 5.28.
Case 2 Let $H = \langle x, y : x^3 = 1,\ y^3 = 1,\ xy = yx \rangle = \langle x : x^3 = 1 \rangle \oplus \langle y : y^3 = 1 \rangle$. To determine the action of $K$ on $H$ we need to define how conjugation by $z$ affects the generators $x$ and $y$.
Suppose that $zxz^{-1} = t$ and that $t \notin \langle x \rangle$. Then $x = z^2 x z^{-2} = ztz^{-1}$, so $zx = tz$ and $xz = zt$. Furthermore, $z(xt)z^{-1} = (zxz^{-1})(ztz^{-1}) = tx = xt$. If $t \notin \langle x \rangle$ then $H = \langle xt, x \rangle = \langle xt \rangle \oplus \langle x \rangle$. So, we can replace $x$ with $xt$ and $y$ with $x$ to get an alternative presentation for $H$ where $zxz^{-1} = x$. We can therefore assume $zxz^{-1} \in \langle x \rangle$, and hence that $zxz^{-1} = x$ or $x^{-1}$.
Now $zyz^{-1} = x^i y^j$ for some $i$ and $j$ with $0 \leq i < 3$ and $0 < j < 3$. Hence $z x^i y^j z^{-1} = y$.
Suppose first that $zxz^{-1} = x$. If $j = 1$ then $zyz^{-1} = x^i y$ and
$$y = z^2 y z^{-2} = z x^i y z^{-1} = (z x^i z^{-1})(z y z^{-1}) = (x^i)(x^i y) = x^{2i} y,$$
so $2i \equiv 0 \pmod 3$, and hence $i = 0$. Thus if $j = 1$ then $zyz^{-1} = y$.
If $j = 2$ then $zyz^{-1} = x^i y^2$ and
$$y = z^2 y z^{-2} = z x^i y^2 z^{-1} = (z x^i z^{-1})(z y^2 z^{-1}) = (x^i)(x^{2i} y^4) = x^{3i} y.$$
Thus $3i \equiv 0 \pmod 3$ and so $i = 0, 1, 2$. If $i = 0$ then $zyz^{-1} = y^2 = y^{-1}$; if $i = 1$ then $z(xy)z^{-1} = (xy)^{-1}$; and if $i = 2$ then $z(x^2 y)z^{-1} = (x^2 y)^{-1}$. In the latter two cases we can carefully redefine $y$ to get $zyz^{-1} = y^{-1}$.
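Otherwise, suppose that $zxz^{-1} = x^{-1}$. Then
$$y = z x^i y^j z^{-1} = x^{-i} x^{ij} y^{j^2} = x^{i(j-1)} y^{j^2}.$$
In this case, if $j = 1$ then by suitably redefining $y$ (replacing it with $x^{2i} y$) we get $zyz^{-1} = y$, which after exchanging the roles of $x$ and $y$ is the situation $zxz^{-1} = x$, $zyz^{-1} = y^{-1}$; and if $j = 2$ then $i = 0$ and $zyz^{-1} = y^2 = y^{-1}$.
We can therefore assume that $zxz^{-1} = x$ or $x^{-1}$, and $zyz^{-1} = y$ or $y^{-1}$.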
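Case 2a ($zxz^{-1} = x$ and $zyz^{-1} = y$) These are both trivial actions, so $xz = zx$ and $yz = zy$ and hence $G \cong \mathbb{Z}_3 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_3 \oplus \mathbb{Z}_6$.
Case 2b ($zxz^{-1} = x$ and $zyz^{-1} = y^{-1}$) In this case, we have
$$G = \langle x : x^3 = 1 \rangle \times \langle y, z : y^3 = z^2 = 1,\ zyz^{-1} = y^{-1} \rangle \cong \mathbb{Z}_3 \times D_3.$$
Case 2c ($zxz^{-1} = x^{-1}$ and $zyz^{-1} = y^{-1}$) This gives the group with presentation
$$G = \langle x, y, z : x^3 = y^3 = z^2 = 1,\ zxz^{-1} = x^{-1},\ zyz^{-1} = y^{-1} \rangle.$$
This is a nontrivial semidirect product $(\mathbb{Z}_3 \oplus \mathbb{Z}_3) \rtimes \mathbb{Z}_2$.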


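All that remains is to show that these five groups are nonisomorphic. The first and third are abelian and distinct by Theorem 5.48. The dihedral group $D_9$ has an element $x$ of order 9, whereas neither $\mathbb{Z}_3 \times D_3$ nor $(\mathbb{Z}_3 \oplus \mathbb{Z}_3) \rtimes \mathbb{Z}_2$ do. Finally, $\mathbb{Z}_3 \times D_3$ has an element $xz$ of order 6, but $(\mathbb{Z}_3 \oplus \mathbb{Z}_3) \rtimes \mathbb{Z}_2$ has no elements of order greater than 3. Therefore, up to isomorphism, there are five groups of order 18, listed in Table 7.11.

(i) the cyclic group $\mathbb{Z}_{18} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_9$,
(ii) the abelian group $\mathbb{Z}_3 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_3 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_2$,
(iii) the dihedral group $D_9$,
(iv) the direct product $\mathbb{Z}_3 \times D_3$, and
(v) the semidirect product $(\mathbb{Z}_3 \oplus \mathbb{Z}_3) \rtimes \mathbb{Z}_2$.
Table 7.11: Groups of order 18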
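The element-order comparison used to separate the groups arising from $H \cong \mathbb{Z}_3 \oplus \mathbb{Z}_3$ can be confirmed by direct enumeration. This is a sketch under the assumption that elements are written in the normal form $x^a y^b z^c$ and stored as triples `(a, b, c)`; the function name `max_element_order` is hypothetical.

```python
from itertools import product


def max_element_order(s: int, t: int) -> int:
    """Largest element order in (Z_3 + Z_3) x| Z_2, where z acts by
    z x z^{-1} = x^s and z y z^{-1} = y^t with s, t in {1, -1}."""

    def mul(g, h):
        (a, b, c), (d, e, f) = g, h
        # Moving z^c past x^d y^e twists the exponents by s^c and t^c.
        return ((a + s**c * d) % 3, (b + t**c * e) % 3, (c + f) % 2)

    def order(g):
        n, power = 1, g
        while power != (0, 0, 0):
            power, n = mul(power, g), n + 1
        return n

    return max(order(g) for g in product(range(3), range(3), range(2)))


print(max_element_order(1, 1))    # Case 2a: 6
print(max_element_order(1, -1))   # Case 2b: 6
print(max_element_order(-1, -1))  # Case 2c: 3
```

Cases 2a and 2b share the same maximal order, so they are told apart by commutativity rather than by this invariant.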

Groups of order 20

Let $G$ be a group of order 20. By Sylow's Theorems there is a single Sylow 5-subgroup $H = \langle x : x^5 = 1 \rangle$ of order 5, which is normal in $G$, and which must be isomorphic to $\mathbb{Z}_5$. There are either 1 or 5 Sylow 2-subgroups of order 4, each of which must be isomorphic to either $\mathbb{Z}_4$ or $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. Let $K$ be one of these Sylow 2-subgroups. Then $H \cap K = \{e\}$ and $HK = G$, so $G$ is an internal semidirect product of $H$ with $K$.
Case 1 Let $K = \langle y : y^4 = 1 \rangle \cong \mathbb{Z}_4$. This acts on $H$ by conjugation, so we need to investigate the possible values of $yxy^{-1} = x^i$, where $1 \leq i \leq 4$. Also, $x = y^4 x y^{-4} = x^{i^4}$, so $i^4 \equiv 1 \pmod 5$. All of $i = 1, 2, 3, 4$ satisfy these conditions, but we can discard $i = 3$, because if $yxy^{-1} = x^2$ then $y^3 x y^{-3} = x^{2^3} = x^8 = x^3$, and since $\langle y \rangle = \langle y^3 \rangle$, the group obtained by setting $i = 2$ is isomorphic to the one we get by setting $i = 3$.
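The identification of $i = 2$ with $i = 3$ comes from replacing the generator $y$ by another generator $y^j$ of the same cyclic group, which replaces $i$ by $i^j$. A small sketch of that bookkeeping follows; the name `exponent_classes` is an assumption made for illustration.

```python
from math import gcd


def exponent_classes(n: int, k: int) -> list[list[int]]:
    """Group the exponents i with i**k ≡ 1 (mod n) into classes, where i
    is identified with i**j (mod n) for every j coprime to k, i.e. for
    every change of generator y -> y**j of the acting group Z_k."""
    valid = [i for i in range(1, n) if pow(i, k, n) == 1]
    units = [j for j in range(1, k) if gcd(j, k) == 1]
    classes, seen = [], set()
    for i in valid:
        if i in seen:
            continue
        cls = sorted({pow(i, j, n) for j in units})
        seen.update(cls)
        classes.append(cls)
    return classes


print(exponent_classes(5, 4))  # [[1], [2, 3], [4]]
```

Each class corresponds to one of Cases 1a, 1b and 1c. Classes computed this way are guaranteed to give isomorphic groups; distinguishing different classes still needs an argument such as the comparison of centres.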
Case 1a ($i = 1$) Here $yxy^{-1} = x$, so
$$G = \langle x, y : x^5 = y^4 = 1,\ yx = xy \rangle \cong \mathbb{Z}_5 \oplus \mathbb{Z}_4 \cong \mathbb{Z}_{20}.$$
Case 1b ($i = 2$) In this case, $yxy^{-1} = x^2$, giving the presentation
$$G = \langle x, y : x^5 = y^4 = 1,\ yx = x^2 y \rangle.$$
This is isomorphic to the affine general linear group $\mathrm{AGL}_1(5)$ of affine linear transformations of the finite field $\mathbb{F}_5$.
Case 1c ($i = 4$) This time, $yxy^{-1} = x^4 = x^{-1}$, which gives us the presentation
$$G = \langle x, y : x^5 = y^4 = 1,\ yxy^{-1} = x^{-1} \rangle.$$
This group is isomorphic to the dicyclic group $\mathrm{Dic}_5$. It is not isomorphic to the group in Case 1b, because that group has trivial centre, whereas here $y^2 x y^{-2} = x$, so $y^2 x = x y^2$, and thus $y^2$ is contained in the centre.
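The claim about centres can be verified by listing every element of $\mathbb{Z}_5 \rtimes \mathbb{Z}_4$ for each action. The sketch below assumes elements are written as $x^a y^b$ and stored as pairs `(a, b)`; `semidirect_centre` is a hypothetical helper name.

```python
from itertools import product


def semidirect_centre(n: int, k: int, i: int) -> list[tuple[int, int]]:
    """Centre of Z_n x| Z_k, where the generator y of Z_k acts on the
    generator x of Z_n by y x y^{-1} = x^i (requires i**k ≡ 1 mod n)."""

    def mul(g, h):
        (a, b), (c, d) = g, h
        # x^a y^b x^c y^d = x^(a + i^b * c) y^(b + d)
        return ((a + pow(i, b, n) * c) % n, (b + d) % k)

    elements = list(product(range(n), range(k)))
    return [g for g in elements if all(mul(g, h) == mul(h, g) for h in elements)]


print(semidirect_centre(5, 4, 2))  # Case 1b: [(0, 0)]
print(semidirect_centre(5, 4, 4))  # Case 1c: [(0, 0), (0, 2)]
```

The pair `(0, 2)` is the element $y^2$, matching the computation in Case 1c.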
Case 2 Let $K = \langle y, z : y^2 = z^2 = 1,\ yz = zy \rangle \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$. The $K$-action on $H$ is determined by the values of $yxy^{-1}$ and $zxz^{-1}$. As before, suppose that $yxy^{-1} = x^i$ for some $i$ with $1 \leq i \leq 4$. Then $x = y^2 x y^{-2} = x^{i^2}$, and so $i^2 \equiv 1 \pmod 5$. The only possibilities are $i = 1, 4$, so either $yxy^{-1} = x$ or $yxy^{-1} = x^{-1}$. Similarly, $zxz^{-1} = x$ or $zxz^{-1} = x^{-1}$. For reasons analogous to those in case 1b of the order 12 classification, we may assume that $yxy^{-1} = x$, and we therefore have two cases to consider.
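Case 2a ($zxz^{-1} = x$) This gives the presentation
$$G = \langle x, y, z : x^5 = y^2 = z^2 = 1,\ xy = yx,\ xz = zx,\ yz = zy \rangle$$
which is isomorphic to the abelian group $\mathbb{Z}_5 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_{10}$.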
Case 2b ($zxz^{-1} = x^{-1}$) This gives the presentation
$$G = \langle x, y, z : x^5 = y^2 = z^2 = 1,\ yx = xy,\ zx = x^{-1}z,\ yz = zy \rangle = \langle x, z : x^5 = z^2 = 1,\ zx = x^{-1}z \rangle \times \langle y : y^2 = 1 \rangle,$$
which is isomorphic to the direct product $D_5 \times \mathbb{Z}_2$, which in turn can be shown to be isomorphic to the dihedral group $D_{10}$.
There are thus five isomorphism classes of groups with 20 elements, listed in Table 7.12.

(i) The cyclic group $\mathbb{Z}_{20} \cong \mathbb{Z}_4 \oplus \mathbb{Z}_5$,
(ii) the abelian group $\mathbb{Z}_2 \oplus \mathbb{Z}_{10} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_5$,
(iii) the dihedral group $D_{10}$,
(iv) the dicyclic group $\mathrm{Dic}_5$, and
(v) the affine general linear group $\mathrm{AGL}_1(5)$.
Table 7.12: Groups of order 20

Groups of order 21

Let $G$ be a group of order 21. Then by Sylow's Theorems there is a unique Sylow 7-subgroup $H = \langle x : x^7 = 1 \rangle \cong \mathbb{Z}_7$, which is normal in $G$. Suppose that $K = \langle y : y^3 = 1 \rangle \cong \mathbb{Z}_3$ is a Sylow 3-subgroup. Then $H \cap K$ is trivial and $HK = G$, so $G$ is an internal semidirect product. Considering $yxy^{-1} = x^i$, with $1 \leq i \leq 6$, then $x = y^3 x y^{-3} = x^{i^3}$, so $i^3 \equiv 1 \pmod 7$. The possible values for $i$ are thus $i = 1$, 2 or 4.
If $yxy^{-1} = x^2$ then $y^2 x y^{-2} = x^4$, and since $\langle y \rangle = \langle y^2 \rangle$ we can replace $y$ with $y^2$ and see that the cases $i = 2$ and $i = 4$ give isomorphic groups.

Case 1 ($i = 1$) This gives the presentation
$$G = \langle x, y : x^7 = y^3 = 1,\ xy = yx \rangle \cong \mathbb{Z}_7 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_{21}.$$
Case 2 ($i = 2$) Here,
$$G = \langle x, y : x^7 = y^3 = 1,\ yx = x^2 y \rangle,$$
which is a nontrivial semidirect product $\mathbb{Z}_7 \rtimes \mathbb{Z}_3$.
There are thus two nonisomorphic groups of order 21, listed in Table 7.13.

(i) The cyclic group $\mathbb{Z}_{21} \cong \mathbb{Z}_3 \oplus \mathbb{Z}_7$ and
(ii) the nontrivial semidirect product $\mathbb{Z}_7 \rtimes \mathbb{Z}_3$.
Table 7.13: Groups of order 21

Groups of order 24

Let $G$ be a group of order 24. By Sylow's Theorems, there exist either 1 or 3 Sylow 2-subgroups of order 8, and either 1 or 4 Sylow 3-subgroups of order 3.
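The admissible Sylow counts follow mechanically from Sylow's Third Theorem ($n_p \equiv 1 \pmod p$ and $n_p \mid m$, where $|G| = p^k m$ with $p \nmid m$). A minimal sketch, with the hypothetical function name `sylow_counts`:

```python
def sylow_counts(order: int, p: int) -> list[int]:
    """Values of n_p allowed by Sylow's Third Theorem for a group of the
    given order, where p is a prime factor of that order."""
    m = order
    while m % p == 0:  # strip the full power of p, leaving m with p not dividing m
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]


print(sylow_counts(24, 2))  # [1, 3]
print(sylow_counts(24, 3))  # [1, 4]
```

These are only necessary conditions: the theorem lists candidates, and further arguments decide which combinations actually occur.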
Suppose that $n_3 = 4$. By Sylow's Second Theorem, these four Sylow 3-subgroups are conjugate to each other. That is, $G$ acts on $\mathrm{Syl}_3(G)$ by conjugation and thereby determines a homomorphism $f : G \to S_4$, where each $g \in G$ maps to the permutation of $\mathrm{Syl}_3(G)$ defined by the conjugation action.
By Sylow's Second Theorem there must exist elements of $G$ that map each of the four Sylow 3-subgroups to any of the others. So $\operatorname{im}(f)$ must have at least 4 elements and hence $|\ker(f)| \leq 6$. By Lagrange's Theorem, $|\ker(f)|$ can be 6, 3, 2 or 1.
If $\ker(f)$ contains one Sylow 3-subgroup then it must contain all of them: $\ker(f)$ is normal in $G$ and hence closed under conjugation by all elements of $G$, and by Sylow's Second Theorem the Sylow 3-subgroups are conjugate, so either all of them must lie in $\ker(f)$ or none of them do. The first of these is ruled out by the order of $\ker(f)$, which has at most six elements and is thus not big enough to contain four distinct 3-element subgroups. So $|\ker(f)|$ can't be 3 or 6, since a subgroup of either order contains a subgroup of order 3, which would be a Sylow 3-subgroup of $G$. If $|\ker(f)| = 1$ then $f$ is an injective homomorphism between two groups of order 24, hence an isomorphism, and so $G \cong S_4$. This has four Sylow 3-subgroups and three Sylow 2-subgroups.
If $|\ker(f)| = 2$ then by the First Isomorphism Theorem, $\operatorname{im}(f) \cong G/\ker(f)$ is a subgroup of $S_4$ of order 12; it can only be the alternating group $A_4$, which has a single Sylow 2-subgroup of order 4. Therefore $G$ itself can only have one Sylow 2-subgroup (of order 8) and hence $n_2 = 1$.
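Case 1 ($n_2 = 1$, $n_3 = 1$) In this case, both the Sylow 2-subgroup $H$ and the Sylow 3-subgroup $K$ are normal in $G$. Their intersection is trivial and $G = HK$, so $G \cong H \times K$. There are five possibilities for $H$ and one for $K$, so this case yields five groups for our list:
$$\mathbb{Z}_8 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_{24}, \qquad D_4 \times \mathbb{Z}_3,$$
$$\mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_{12}, \qquad Q_8 \times \mathbb{Z}_3,$$
$$\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6.$$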
Case 2 ($n_2 = 1$, $n_3 = 4$) Here we have a normal Sylow 2-subgroup $H$ of order 8 and four conjugate Sylow 3-subgroups. Let $K$ be one of these subgroups. Then $H \cap K = \{e\}$ and we have a semidirect product $G = HK$. We have five choices for $H$ and must consider the possible $\mathbb{Z}_3$-actions. The trivial action yields the five direct products already listed in Case 1, so at this point we're only interested in nontrivial $\mathbb{Z}_3$-actions.
Case 2a ($H \cong \mathbb{Z}_8$) Let $H = \langle x : x^8 = 1 \rangle$ and $K = \langle w : w^3 = 1 \rangle$. By Proposition 7.107 we have $wxw^{-1} = x^i$ with $0 < i < 8$ and $i^3 \equiv 1 \pmod 8$. The only value of $i$ satisfying these criteria is $i = 1$, which yields the trivial action.
Case 2b ($H \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4$) We want a nontrivial homomorphism $\phi : K \to \operatorname{Aut}(H)$ mapping the generator of $K$ to a nontrivial element of $\operatorname{Aut}(H)$. The generator of $K$ has order 3, so its image in $\operatorname{Aut}(H)$ must also have order 3; however by Proposition 2.34 the order of any element of $\operatorname{Aut}(H)$ must divide $|\operatorname{Aut}(H)| = 8$, hence no such homomorphism can exist. This case also only gives the trivial action and one of the direct products we've already seen.
Case 2c ($H \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$) Again, we want a nontrivial homomorphism $\phi : K \to \operatorname{Aut}(H)$. This time $|\operatorname{Aut}(H)| = 168$ (it happens to be isomorphic to the groups $\mathrm{GL}_3(2)$ and $\mathrm{PSL}_2(7)$) which is divisible by 3. By Cauchy's Theorem, it therefore has at least one subgroup isomorphic to $\mathbb{Z}_3$, whose generator has order 3. In fact, all such subgroups are conjugate to each other, so we have only a single nontrivial action at our disposal: the map that cyclically permutes the factors of $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$. This action yields a new group, a nontrivial semidirect product $G = (\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2) \rtimes \mathbb{Z}_3$, which is isomorphic to the direct product $\mathbb{Z}_2 \times A_4$.
Case 2d ($H \cong D_4$) There is no nontrivial $\mathbb{Z}_3$-action on $D_4$ since $|\operatorname{Aut}(D_4)| = 8$, so we don't get any new groups in this case.
Case 2e ($H \cong Q_8$) In this case $\operatorname{Aut}(Q_8) \cong S_4$ and thus $|\operatorname{Aut}(Q_8)| = 24$, which is divisible by 3, so we can define a nontrivial $\mathbb{Z}_3$-action on $Q_8$. All of the 3-element subgroups of $S_4$ are conjugate to each other, so we have only one nontrivial action, which yields a new group for our list: the nontrivial semidirect product $G = Q_8 \rtimes \mathbb{Z}_3$. This is isomorphic to the group $\mathrm{SL}_2(3)$ of $2 \times 2$ matrices over $\mathbb{F}_3$ with determinant 1.

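Case 3 ($n_2 = 3$, $n_3 = 1$) If $G$ has a single Sylow 3-subgroup $H \cong \mathbb{Z}_3$ and $K$ is one of the three Sylow 2-subgroups of order 8, then again we have a semidirect product $G = HK$ whose structure is determined by the $K$-action on $H$. As before, there are five choices for $K$, and by this stage we are only interested in nontrivial $K$-actions: nontrivial homomorphisms $\phi : K \to \operatorname{Aut}(\mathbb{Z}_3) \cong \mathbb{Z}_2$.
Case 3a ($K \cong \mathbb{Z}_8$) There is only one nontrivial action of $\mathbb{Z}_8$ on $\mathbb{Z}_3$, given by the homomorphism $\phi : \mathbb{Z}_8 \to \mathbb{Z}_2 \cong \operatorname{Aut}(\mathbb{Z}_3)$ with kernel $\{0, 2, 4, 6\} \cong \mathbb{Z}_4$. This yields a semidirect product $\mathbb{Z}_3 \rtimes \mathbb{Z}_8$.
Case 3b ($K \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4$) Here we need to study actions of $\mathbb{Z}_2 \oplus \mathbb{Z}_4$ on $\mathbb{Z}_3$, which are determined by nonzero homomorphisms $\phi : \mathbb{Z}_2 \oplus \mathbb{Z}_4 \to \mathbb{Z}_2$. There are essentially two such homomorphisms, with kernels isomorphic to $\mathbb{Z}_4$ or the Klein group $V_4$. The first of these yields a semidirect product $\mathbb{Z}_3 \rtimes (\mathbb{Z}_2 \oplus \mathbb{Z}_4) \cong \mathbb{Z}_4 \times D_3$, while the second gives $\mathbb{Z}_3 \rtimes (\mathbb{Z}_2 \oplus \mathbb{Z}_4) \cong \mathrm{Dic}_3 \times \mathbb{Z}_2$.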
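Case 3c ($K \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$) In this case, we want a nonzero homomorphism $\phi : \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \to \mathbb{Z}_2$, of which there is essentially only one (up to an automorphism of $K$), with kernel isomorphic to $V_4$. This action gives a semidirect product $\mathbb{Z}_3 \rtimes (\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2)$, which is isomorphic to the direct products $\mathbb{Z}_2 \times D_6 \cong D_3 \times V_4$.
Case 3d ($K \cong D_4$) There are two nontrivial $D_4$-actions on $\mathbb{Z}_3$, determined by homomorphisms $\phi : D_4 \to \mathbb{Z}_2$. The one with kernel $R_4 = \langle r \rangle \cong \mathbb{Z}_4$ yields a semidirect product $\mathbb{Z}_3 \rtimes D_4$ which is isomorphic to the dihedral group $D_{12}$. The one with kernel isomorphic to $V_4$ yields another group $\mathbb{Z}_3 \rtimes D_4$.
Case 3e ($K \cong Q_8$) There is just one nontrivial $Q_8$-action on $\mathbb{Z}_3$, determined by the homomorphism $\phi : Q_8 \to \mathbb{Z}_2$ with kernel $\langle I \rangle$, $\langle J \rangle$ or $\langle K \rangle$, all of which are cyclic of order 4 and hence isomorphic to $\mathbb{Z}_4$. This yields a semidirect product $\mathbb{Z}_3 \rtimes Q_8$.
Case 4 ($n_2 = 3$, $n_3 = 4$) By the earlier discussion $G \cong S_4$.
There are thus fifteen groups of order 24, listed in Table 7.14.

(i) The cyclic group $\mathbb{Z}_{24} \cong \mathbb{Z}_3 \oplus \mathbb{Z}_8$,
(ii) the abelian groups $\mathbb{Z}_2 \oplus \mathbb{Z}_{12} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3$ and
(iii) $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_6 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_3$,
(iv) the symmetric group $S_4$,
(v) the dihedral group $D_{12}$,
(vi) the direct products $\mathbb{Z}_3 \times D_4$,
(vii) $\mathbb{Z}_3 \times Q_8$,
(viii) $\mathbb{Z}_2 \times A_4$,
(ix) $\mathbb{Z}_4 \times D_3$,
(x) $\mathbb{Z}_2 \times \mathrm{Dic}_3$
(xi) and $\mathbb{Z}_2 \times D_6 \cong V_4 \times D_3$,
(xii) and the semidirect products $Q_8 \rtimes \mathbb{Z}_3 \cong \mathrm{SL}_2(3)$,
(xiii) $\mathbb{Z}_3 \rtimes \mathbb{Z}_8$,
(xiv) $\mathbb{Z}_3 \rtimes D_4$ (with kernel $V_4$)
(xv) and $\mathbb{Z}_3 \rtimes Q_8$.
Table 7.14: Groups of order 24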

Groups of order 27

Let $G$ be a group of order $27 = 3^3$. It is a $p$-group, so by Proposition 7.19 it has nontrivial centre. By Lagrange's Theorem, $|Z(G)|$ can be either 3, 9 or 27. If the latter, then $G$ is equal to its centre and hence abelian, so the classification theorem for finitely generated abelian groups yields three possible cases: $\mathbb{Z}_{27}$, $\mathbb{Z}_3 \oplus \mathbb{Z}_9$ and $\mathbb{Z}_3 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_3$.
Now suppose that $G$ is nonabelian and $|G| = 27$. For convenience let $Z = Z(G)$. Then $|Z| = 3$ or 9, which means that $|G/Z| = 9$ or 3, respectively. The latter case is ruled out by Proposition 7.108: if $|G/Z| = 3$ then $G/Z \cong \mathbb{Z}_3$, which can only happen if $G$ is abelian. So $|G/Z| = 9$ and $|Z| = 3$, which means that $Z \cong \mathbb{Z}_3$, and $G/Z \cong \mathbb{Z}_9$ or $\mathbb{Z}_3 \oplus \mathbb{Z}_3$ by Proposition 7.20. But $G/Z \not\cong \mathbb{Z}_9$ because otherwise $G$ would be abelian by Proposition 7.108. So $G/Z \cong \mathbb{Z}_3 \oplus \mathbb{Z}_3$.
Since $G/Z$ is abelian, the commutator subgroup $[G, G] \leq Z$, and since $Z \cong \mathbb{Z}_3$, a cyclic group of prime order, Lagrange's Theorem implies that $[G, G]$ is either trivial or equal to $Z$. But $G$ is nonabelian, so $[G, G]$ can't be trivial, and therefore $[G, G] = Z \cong \mathbb{Z}_3$.
There are nontrivial elements $x, y \in G$ such that $xZ$ and $yZ$ generate $G/Z$, with neither $xZ$ nor $yZ$ being equal to the centre $Z$ itself. These elements $x$ and $y$ don't commute: if they did, then the centraliser $C_G(x)$ would contain $x$, $y$ and all of $Z$, which would mean that $C_G(x) = G$. This in turn would imply that everything commutes with $x$, so $x \in Z$ and $xZ = Z$, which contradicts our choice of $x$.
So there exists some nontrivial $z = [x, y] \in [G, G] = Z$; this element $z$ has order 3 and thus generates $Z = \langle z \rangle$. The elements $x$ and $y$ must have order either 3 or 9: they're nontrivial so can't have order 1, and if either had order 27 then it would generate the entire group $G$, which would then be cyclic and abelian.
We therefore have three cases to consider:
Case 1 ($|x| = |y| = 3$) This gives the presentation
$$G = \langle x, y, z : x^3 = y^3 = z^3 = 1,\ xz = zx,\ yz = zy,\ [x, y] = z \rangle.$$
This is isomorphic to the Heisenberg group $\mathrm{Heis}(3)$ or $U(3, 3)$ of unitriangular matrices over $\mathbb{F}_3$:
$$\mathrm{Heis}(3) = U(3, 3) = \left\{ \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} : a, b, c \in \mathbb{F}_3 \right\}$$
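The basic properties of this matrix group can be checked by enumeration. The sketch below assumes each unitriangular matrix is stored as the triple `(a, b, c)` of its above-diagonal entries; the helper names `mul` and `order` are illustrative.

```python
from itertools import product

P = 3  # entries are taken in the field F_3


def mul(g, h):
    """Product of [[1,a,b],[0,1,c],[0,0,1]] and [[1,d,e],[0,1,f],[0,0,1]]."""
    (a, b, c), (d, e, f) = g, h
    return ((a + d) % P, (b + e + a * f) % P, (c + f) % P)


def order(g):
    n, power = 1, g
    while power != (0, 0, 0):
        power, n = mul(power, g), n + 1
    return n


elements = list(product(range(P), repeat=3))
centre = [g for g in elements if all(mul(g, h) == mul(h, g) for h in elements)]

print(len(elements))                      # 27
print(sorted({order(g) for g in elements}))  # [1, 3]
print(centre)                             # [(0, 0, 0), (0, 1, 0), (0, 2, 0)]
```

So the group has 27 elements, every non-identity element has order 3, and the centre has order 3, in line with the discussion of $Z \cong \mathbb{Z}_3$ above.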

Case 2 ($|x| = 9$, $|y| = 3$) In this case, observe that $x^3 \in Z$; since $x^3 \neq 1$ then it must generate $Z$ and hence $x^3 = z$ or $z^2$. This gives the presentation
$$G = \langle x, y : x^9 = y^3 = 1,\ x^3 = [x, y] \rangle.$$
This is isomorphic to the semidirect product $\mathbb{Z}_9 \rtimes \mathbb{Z}_3$.
Case 3 ($|x| = |y| = 9$) This time we get the presentations
$$G = \langle x, y : x^9 = y^9 = 1,\ x^3 = y^3 = [x, y] \rangle \cong \langle x, y : x^9 = y^9 = 1,\ x^3 = [x, y],\ y^3 = [x, y]^2 \rangle,$$
both of which are isomorphic to the group obtained in Case 2.
We therefore have five groups of order 27, listed in Table 7.15.

(i) The cyclic group $\mathbb{Z}_{27}$,
(ii) the abelian groups $\mathbb{Z}_3 \oplus \mathbb{Z}_9$ and
(iii) $\mathbb{Z}_3 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_3$,
(iv) the Heisenberg group $\mathrm{Heis}(3) = U(3, 3)$ of $3 \times 3$ unitriangular matrices over $\mathbb{F}_3$ and
(v) the semidirect product $\mathbb{Z}_9 \rtimes \mathbb{Z}_3$.
Table 7.15: Groups of order 27

Groups of order 28

Let $G$ be a group of order $28 = 2^2 \cdot 7$. Then $G$ has a normal Sylow 7-subgroup $H = \langle x : x^7 = 1 \rangle \cong \mathbb{Z}_7$ and either 1 or 7 Sylow 2-subgroups of order 4. Let $K$ be one of these latter subgroups, which can be isomorphic to either $\mathbb{Z}_4$ or $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. The group $G$ decomposes as an internal semidirect product $HK$, with $K$ acting on $H$ via a homomorphism $\phi : K \to \operatorname{Aut}(H) \cong \mathbb{Z}_6$.
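Case 1 ($K \cong \mathbb{Z}_4$) There are two possible actions of $K = \langle y : y^4 = 1 \rangle \cong \mathbb{Z}_4$ on $H$: the identity map given by $yxy^{-1} = x$, and the inversion map given by $yxy^{-1} = x^{-1}$. We therefore get two groups:
$$\langle x, y : x^7 = y^4 = 1,\ yx = xy \rangle \cong \mathbb{Z}_4 \oplus \mathbb{Z}_7 \cong \mathbb{Z}_{28}$$
and
$$\langle x, y : x^7 = y^4 = 1,\ yx = x^{-1}y \rangle \cong \mathrm{Dic}_7.$$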
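Case 2 ($K \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$) Let $K = \langle y, z : y^2 = z^2 = 1,\ yz = zy \rangle$. Any homomorphism $\phi : \mathbb{Z}_2 \oplus \mathbb{Z}_2 \to \operatorname{Aut}(H) \cong \mathbb{Z}_6$ must map each generator to either the identity map or the inversion map $x \mapsto x^{-1}$, and three of these homomorphisms are nontrivial.
If both generators $y$ and $z$ map to the identity then we get the group
$$\langle x, y, z : x^7 = y^2 = z^2 = 1,\ xy = yx,\ xz = zx,\ yz = zy \rangle \cong \mathbb{Z}_7 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_{14}.$$
If $y$ maps to the inversion automorphism and $z$ maps to the identity (or the other way round) then we get
$$\langle x, y, z : x^7 = y^2 = z^2 = 1,\ xy = yx^{-1},\ xz = zx,\ yz = zy \rangle \cong D_7 \times \mathbb{Z}_2 \cong D_{14}.$$
If both $y$ and $z$ map to the inversion automorphism then we can replace either $y$ or $z$ with $yz$, which then maps to the identity, and this reduces to the previous case.
Hence there are four groups of order 28, listed in Table 7.16.

(i) The cyclic group $\mathbb{Z}_{28} \cong \mathbb{Z}_4 \oplus \mathbb{Z}_7$,
(ii) the abelian group $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_7 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_{14}$,
(iii) the dicyclic group $\mathrm{Dic}_7$, and
(iv) the dihedral group $D_{14}$.
Table 7.16: Groups of order 28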

Groups of order 30

Suppose that $|G| = 30 = 2 \cdot 3 \cdot 5$. Then by Sylow's Theorems and Lagrange's Theorem $n_3 = 1$ or 10 and $n_5 = 1$ or 6. At least one of $n_3$ and $n_5$ must be equal to 1: there aren't enough elements in $G$ for there to be 10 distinct subgroups of order 3 and 6 distinct subgroups of order 5 with trivial intersections.
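The counting argument can be made explicit: distinct subgroups of prime order $p$ meet only in the identity, so each contributes $p - 1$ elements of its own. A short sketch, using the hypothetical helper name `elements_needed`:

```python
def elements_needed(counts: dict[int, int]) -> int:
    """Number of distinct elements forced by having counts[p] subgroups
    of prime order p, for each prime p given, plus the identity."""
    return 1 + sum(n * (p - 1) for p, n in counts.items())


# Ten subgroups of order 3 together with six subgroups of order 5:
print(elements_needed({3: 10, 5: 6}))  # 45, which exceeds |G| = 30
```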
We now show that $G$ contains a subgroup isomorphic to $\mathbb{Z}_{15}$. Since at least one of $n_3$ and $n_5$ equals 1, we can find subgroups $H \cong \mathbb{Z}_3$ and $K \cong \mathbb{Z}_5$, at least one of which is normal. These subgroups have trivial intersection and hence $HK$ is a subgroup of order 15. This subgroup is isomorphic to $\mathbb{Z}_{15}$ by Example 7.14.

Now let $H = \langle x : x^{15} = 1 \rangle$ be this subgroup. It has index 2 in $G$ and is hence normal by Proposition 3.12. By Sylow's Theorems and Lagrange's Theorem, $G$ also contains at least one (and possibly 3, 5 or 15) subgroups of order 2. Let $K = \langle y : y^2 = 1 \rangle$ be one of these subgroups; it has trivial intersection with $H$, and $HK = G$, so our group $G$ decomposes as a semidirect product $\mathbb{Z}_{15} \rtimes \mathbb{Z}_2$. By Proposition 7.107 the required action is determined entirely by $yxy^{-1} = x^i$ for $0 < i < 15$ and $i^2 \equiv 1 \pmod{15}$. There are four cases to consider:
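Reducing each admissible exponent modulo 3 and modulo 5 shows how it acts on the $\mathbb{Z}_3$ and $\mathbb{Z}_5$ factors of $\mathbb{Z}_{15}$: residue 1 is the trivial action on that factor, and residue $q - 1$ is inversion. This is a sketch with an illustrative function name, not a construction taken from the text.

```python
def involutive_exponents(n: int = 15, factors: tuple[int, ...] = (3, 5)) -> dict[int, tuple[int, ...]]:
    """Map each i with 0 < i < n and i*i ≡ 1 (mod n) to its residues
    modulo the given coprime factors of n."""
    return {
        i: tuple(i % q for q in factors)
        for i in range(1, n)
        if (i * i) % n == 1
    }


print(involutive_exponents())
# {1: (1, 1), 4: (1, 4), 11: (2, 1), 14: (2, 4)}
```

For example, $i = 4$ is trivial modulo 3 and inverts modulo 5, which is why that case produces a dihedral factor $D_5$ alongside a central $\mathbb{Z}_3$.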
Case 1 ($i = 1$) This is the trivial action and yields the direct sum $\mathbb{Z}_{15} \oplus \mathbb{Z}_2 \cong \mathbb{Z}_{30}$.
Case 2 ($i = 4$) This case yields the presentation
$$G = \langle x, y : x^{15} = y^2 = 1,\ yxy^{-1} = x^4 \rangle.$$
This group is isomorphic to the direct product $D_5 \times \mathbb{Z}_3$.
Case 3 ($i = 11$) Here we obtain the presentation
$$G = \langle x, y : x^{15} = y^2 = 1,\ yxy^{-1} = x^{11} \rangle,$$
which is isomorphic to $D_3 \times \mathbb{Z}_5$.
Case 4 ($i = 14$) This final action gives the presentation
$$G = \langle x, y : x^{15} = y^2 = 1,\ yxy^{-1} = x^{-1} \rangle \cong D_{15}.$$
There are thus four different groups of order 30, listed in Table 7.17.

(i) The cyclic group $\mathbb{Z}_{30} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_5$,
(ii) the dihedral group $D_{15}$ and the direct products
(iii) $D_3 \times \mathbb{Z}_5$ and
(iv) $D_5 \times \mathbb{Z}_3$.
Table 7.17: Groups of order 30

This completes our classification of groups of order less than 32. We stop here because things get considerably more complicated with groups of order $2^n$, and in particular there are 51 groups of order 32. Table 7.18 lists the number of groups of order $2^n$. To put this in perspective, up to isomorphism there are 49 910 529 484 groups with order at most 2000, and 49 487 365 422 of those (just over 99.15%) have order 1024.

order              groups
$2^0 = 1$          1
$2^1 = 2$          1
$2^2 = 4$          2
$2^3 = 8$          5
$2^4 = 16$         14
$2^5 = 32$         51
$2^6 = 64$         267
$2^7 = 128$        2 328
$2^8 = 256$        56 092
$2^9 = 512$        10 494 213
$2^{10} = 1024$    49 487 365 422
Table 7.18: Groups of order $2^n$
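The quoted proportion can be recomputed from the two counts given in the text. The constant names below are illustrative; the figures themselves are those stated above.

```python
from fractions import Fraction

GROUPS_OF_ORDER_AT_MOST_2000 = 49_910_529_484
GROUPS_OF_ORDER_1024 = 49_487_365_422

# Exact ratio first, converted to a float only for display.
share = Fraction(GROUPS_OF_ORDER_1024, GROUPS_OF_ORDER_AT_MOST_2000)
print(f"{float(share):.2%}")
```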

Summary

Lagrange's Theorem³⁷ states that if $G$ is a finite group then the order of any subgroup $H \leq G$ must divide the order of $G$. The converse is not true in general, the smallest counterexample being the alternating group $A_4$, which has order 12 but no subgroup of order 6.³⁸ Cauchy's Theorem³⁹ provides a partial converse, however: if $p$ is a prime factor of $|G|$ then $G$ contains a cyclic subgroup of order $p$.
³⁷ Theorem 2.30, page 54.
³⁸ Example 2.36, page 56.
³⁹ Theorem 2.37, page 57.
A finite group $G$ is said to be a $p$-group if it has order $p^k$ for some prime $p$ and integer $k > 0$.⁴⁰ A subgroup of some (not necessarily finite) group $G$ is a $p$-subgroup if it is itself a $p$-group.⁴¹
⁴⁰ Definition 7.1, page 240.
⁴¹ Definition 7.2, page 240.
Suppose that $G$ is a $p$-group acting on some finite set $X$. Then the number of elements in $X$ is equal to the number of elements fixed by the $G$-action, modulo $p$. That is, $|X| \equiv |\mathrm{Fix}_G(X)| \pmod p$.⁴²
⁴² Proposition 7.3, page 241.
We can use this fact to prove Sylow's First Theorem:⁴³ any finite group $G$ contains at least one $p$-subgroup of each possible order for every prime factor $p$ of $|G|$, and each of these subgroups (other than those of the largest order) is normal in a $p$-subgroup of the next possible order. That is, if $G$ is a finite group of order $p^k m$, where $p$ is prime, $k > 0$ and $p \nmid m$, then it has at least one subgroup of order $p^i$ for all $0 \leq i \leq k$. Furthermore, each subgroup of order $p^i$ with $i < k$ is a normal subgroup of some subgroup of order $p^{i+1}$. A maximal $p$-subgroup is called a Sylow $p$-subgroup,⁴⁴ and Sylow's Second Theorem⁴⁵ says that for any prime factor $p$ of $|G|$, the Sylow $p$-subgroups are conjugate to each other. Therefore, $G$ has exactly one Sylow $p$-subgroup $H$ if and only if $H$ is normal.⁴⁶
⁴³ Theorem 7.4, page 242.
⁴⁴ Definition 7.10, page 245.
⁴⁵ Theorem 7.11, page 245.
⁴⁶ Corollary 7.12, page 245.
We denote by $\mathrm{Syl}_p(G)$ the set of Sylow $p$-subgroups of $G$, and let $n_p = |\mathrm{Syl}_p(G)|$ denote the number of Sylow $p$-subgroups. Sylow's Third Theorem⁴⁷ says that for any finite group $G$ of order $p^k m$, we have $n_p \equiv 1 \pmod p$ and $n_p \mid m$. We can use this to help classify finite groups of a particular order: for example, if $|G| = 15$ we find that $n_3 = 1$ and $n_5 = 1$, so $G$ must have a single normal subgroup of order 3 and another of order 5. From this we can deduce that $G \cong \mathbb{Z}_{15}$.⁴⁸ More generally, if $|G| = pq$ where $p$ and $q$ are distinct primes, $p < q$ and $q \not\equiv 1 \pmod p$ then $G \cong \mathbb{Z}_{pq}$.⁴⁹ Any such group must therefore be abelian.⁵⁰
⁴⁷ Theorem 7.13, page 246.
⁴⁸ Example 7.14, page 247.
⁴⁹ Proposition 7.15, page 247.
⁵⁰ Corollary 7.16, page 248.
We can also use Sylow's Second and Third Theorems to prove the nonexistence of finite simple groups of certain orders, by showing that a group of the given order must have a single Sylow $p$-subgroup for some $p$, and this subgroup must necessarily be normal. For example, there is no simple group of order 1246, since $1246 = 2 \cdot 7 \cdot 89$ and $n_7 = n_{89} = 1$.⁵¹
⁵¹ Example 7.18, page 248.
Another useful fact about $p$-groups is that they have nontrivial centre.⁵² We can use this to show that any group of order $p^2$ is isomorphic to either $\mathbb{Z}_{p^2}$ or $\mathbb{Z}_p \oplus \mathbb{Z}_p$,⁵³ and must in either case be abelian.⁵⁴ More generally, if $|G| = p^2 q$ where $p$ and $q$ are both prime, $p < q$ and $q \not\equiv 1 \pmod p$, then $G$ must be abelian.⁵⁵
⁵² Proposition 7.19, page 249.
⁵³ Proposition 7.20, page 249.
⁵⁴ Corollary 7.21, page 250.
⁵⁵ Proposition 7.22, page 250.
We next discussed the concept of subgroup series: a finite nested sequence $\{e\} = H_0 < H_1 < \cdots < H_n = G$ of subgroups of a group $G$.⁵⁶ We sometimes make a distinction between an ascending series and a descending series depending on what order we number the subgroups in, and we will also sometimes relax the requirement that the smallest subgroup is trivial. A normal series or invariant series is one in which each subgroup $H_i$ is normal in $G$, while a subnormal series satisfies the weaker condition that each subgroup $H_i$ need only be normal in the next largest subgroup $H_{i+1}$.⁵⁷ Every normal series is subnormal, but not every subnormal series is normal.
⁵⁶ Definition 7.23, page 251.
⁵⁷ Definition 7.24, page 251.
The advantage of considering normal and subnormal series is that the quotients $H_{i+1}/H_i$ are all well-defined. This approach provides a useful way of comparing series: two normal or subnormal series are isomorphic or equivalent if there is a bijective correspondence between their quotient groups under which corresponding quotients are isomorphic.⁵⁸
⁵⁸ Definition 7.25, page 252.
Given any normal or subnormal series $\mathcal{H}$ of some group $G$, a refinement of $\mathcal{H}$ is a new series obtained by inserting additional subgroups.⁵⁹ Schreier's Refinement Theorem⁶⁰ says that given any two normal (or subnormal) series $\mathcal{H}$ and $\mathcal{K}$ for the same group $G$, we can find a refinement of $\mathcal{H}$ and a refinement of $\mathcal{K}$ that are isomorphic to each other. The proof of this is somewhat involved, and makes use of a rather technical result, Zassenhaus' Lemma or the Butterfly Lemma.⁶¹ ⁶² Our proof of this required another technical lemma called Dedekind's Modular Law.⁶³
⁵⁹ Definition 7.26, page 253.
⁶⁰ Theorem 7.27, page 254.
⁶¹ Theorem 7.29, page 255.
⁶² Figure 7.2, page 255.
⁶³ Lemma 7.28, page 254.
A maximal subgroup is a proper subgroup not contained in any larger proper subgroup, and a maximal normal subgroup is a proper normal subgroup not contained in any larger proper normal subgroup.⁶⁴ A quotient $G/N$ is simple exactly when $N \lhd G$ is a maximal normal subgroup of $G$.⁶⁵ A composition series for a group $G$ is a subnormal series that cannot be refined any further.⁶⁶ This is the same as requiring each group $H_i$ to be a maximal normal subgroup of the next largest group $H_{i+1}$, or equivalently requiring each quotient (or composition factor) to be simple. A normal (rather than just subnormal) composition series is called a principal series or chief series. Not every group has a composition or principal series: in particular $\mathbb{Z}$ doesn't. The Jordan–Hölder Theorem⁶⁷ says that any two composition series for the same group are isomorphic. If $G$ has a composition (or principal) series, then we can find one that includes any given normal subgroup $N \lhd G$.⁶⁸
⁶⁴ Definition 7.31, page 259.
⁶⁵ Proposition 7.32, page 259.
⁶⁶ Definition 7.33, page 259.
⁶⁷ Theorem 7.34, page 260.
⁶⁸ Proposition 7.35, page 261.
We say that a group is soluble or solvable if its composition factors are all abelian;⁶⁹ this is equivalent to requiring the composition factors to be cyclic groups $\mathbb{Z}_p$ of prime order. The symmetric group $S_n$ is not soluble for $n \geq 5$,⁷⁰ because any composition series for $S_n$ must have $A_n$ as one of its composition factors, and this is simple but nonabelian for $n \geq 5$.
⁶⁹ Definition 7.36, page 261.
⁷⁰ Proposition 7.37, page 262.
If a group $G$ is soluble, then so is any subgroup $H \leq G$,⁷¹ and any quotient $G/H$.⁷² Furthermore, extensions of soluble groups are soluble: if $H \lhd G$ and both $H$ and $G/H$ are soluble, then so is $G$.⁷³ Consequently, finite direct products of soluble groups are also soluble,⁷⁴ as are finite abelian groups⁷⁵ and finite $p$-groups.⁷⁶
⁷¹ Proposition 7.38, page 262.
⁷² Proposition 7.39, page 263.
⁷³ Proposition 7.40, page 263.
⁷⁴ Corollary 7.41, page 263.
⁷⁵ Proposition 7.42, page 264.
⁷⁶ Proposition 7.43, page 264.
Two important results whose proofs are beyond the scope of this book are Burnside's Theorem,⁷⁷ which says that any group of order $p^m q^n$, for $p$ and $q$ both prime and $m, n > 0$, is soluble, and the Feit–Thompson Theorem,⁷⁸ which says that any group of odd order is soluble.
⁷⁷ Theorem 7.44, page 264.
⁷⁸ Theorem 7.45, page 264.
The derived series of a group $G$ is defined recursively by using commutator subgroups. We set $G^{(0)} = G$, and let $G^{(i+1)}$ be the commutator subgroup $[G^{(i)}, G^{(i)}]$ for $i \geq 0$.⁷⁹ A group is soluble if and only if its derived series terminates at the trivial subgroup; that is, if $G^{(n)} = \{e\}$ for some $n \geq 0$.⁸⁰ The derived length of a soluble group $G$ is the length of its derived series.⁸¹ The trivial group is the only group with derived length 0, the groups of derived length 1 are exactly the nontrivial abelian groups, and groups of derived length 2 are called metabelian.
⁷⁹ Definition 7.46, page 265.
⁸⁰ Proposition 7.47, page 265.
⁸¹ Definition 7.48, page 266.
Another important series is the central series, constructed using the centre. A normal series is said to be central if $G_{i+1}/G_i \leq Z(G/G_i)$ for all $0 \leq i < n$;⁸² that is, if each quotient $G_{i+1}/G_i$ is contained in the centre of the corresponding quotient $G/G_i$ of the full group. This construction only works with normal series, not subnormal ones, since we need each subgroup $G_i$ to be normal in the full group $G$ for the quotient $G/G_i$ to be defined.
⁸² Definition 7.49, page 267.
There are two special central series: the upper central series⁸³ and the lower central series.⁸⁴ The first of these is an ascending series, and is defined in terms of the higher centres $Z_i(G)$ of $G$: we set $G_0 = Z_0(G) = \{e\}$ and recursively define $Z_i(G)$ such that $Z_i(G)/Z_{i-1}(G) = Z(G/Z_{i-1}(G))$. The second is a descending series, and is defined by setting $G_0 = \gamma_1(G) = G$ and recursively defining $G_i = \gamma_{i+1}(G) = [G, \gamma_i(G)]$.
⁸³ Definition 7.51, page 268.
⁸⁴ Definition 7.53, page 269.
The upper central series need not terminate at $G$, and the lower central series need not terminate at the trivial subgroup $\{e\}$. The lower central series is the optimal descending central series: if it reaches the trivial group, then it does so in fewer steps than any other descending central series.⁸⁵ Analogously, the upper central series is the optimal ascending central series: if it reaches $G$ then it does so faster than any other ascending central series.⁸⁶
⁸⁵ Proposition 7.56, page 270.
⁸⁶ Proposition 7.58, page 271.
A group $G$ with a finite-length central series connecting $G$ with its trivial subgroup $\{e\}$ is said to be nilpotent, and the length of the shortest possible such central series is called the nilpotency class of $G$.⁸⁷ The nilpotency class of $G$ will therefore be equal to the length of the upper or lower central series, whichever is smaller. A group $G$ is therefore nilpotent if and only if its lower central series terminates at the trivial subgroup, and if and only if its upper central series terminates at $G$.⁸⁸
⁸⁷ Definition 7.55, page 270.
⁸⁸ Corollary 7.59, page 271.
For any group $G$, the derived group $G^{(i)} \le \gamma_{i+1}(G)$ for all
$i \ge 0$ (Proposition 7.60, page 272), which means that all nilpotent groups
are soluble (Corollary 7.61, page 272). Not all soluble groups are nilpotent,
however: for example, the symmetric group $S_3$ is soluble because its
derived series $\{\iota\} \lhd A_3 \lhd S_3$ reaches $\{\iota\}$ after
finitely many steps. But its lower central series is $A_3 \lhd S_3$, which
doesn't reach the trivial subgroup, so it isn't nilpotent.
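The $S_3$ example can be checked by direct computation. The following is a minimal sketch, assuming permutations of $\{0, 1, 2\}$ are stored as tuples; the helper names (`compose`, `commutator_subgroup`, `series`) are illustrative and do not come from the text. It iterates $H \mapsto [H, H]$ for the derived series and $H \mapsto [S_3, H]$ for the lower central series until the subgroup stops changing.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, identity):
    """Closure of a finite set of permutations under composition."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return frozenset(elements)

def commutator_subgroup(A, B, identity):
    """Subgroup [A, B] generated by all commutators a b a^-1 b^-1."""
    comms = {compose(compose(a, b), compose(inverse(a), inverse(b)))
             for a in A for b in B}
    return generated_subgroup(comms, identity)

def series(G, step):
    """Apply `step` repeatedly, starting from G, until the subgroup stabilises."""
    terms = [G]
    while True:
        nxt = step(terms[-1])
        if nxt == terms[-1]:
            return terms
        terms.append(nxt)

identity = (0, 1, 2)
S3 = frozenset(permutations(range(3)))

derived = series(S3, lambda H: commutator_subgroup(H, H, identity))
lower_central = series(S3, lambda H: commutator_subgroup(S3, H, identity))

derived_orders = [len(H) for H in derived]
lower_central_orders = [len(H) for H in lower_central]
print(derived_orders)        # orders of the terms of the derived series
print(lower_central_orders)  # orders of the terms of the lower central series
```

The derived series drops to order 1, while the lower central series stalls at the subgroup of order 3, which is $A_3$.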
Subgroups of nilpotent groups are also nilpotent (Proposition 7.62, page 272),
as are finite direct products of nilpotent groups (Proposition 7.63, page 273;
Corollary 7.64, page 273), and so are homomorphic images and quotients of
nilpotent groups (Proposition 7.65, page 273; Corollary 7.66, page 274), but
extensions of nilpotent groups aren't in general nilpotent.
Finite $p$-groups are nilpotent (Proposition 7.67, page 274), and hence all
finite direct products of $p$-groups are nilpotent. In particular, any group
that happens to be a direct product of its Sylow subgroups will be nilpotent.
The converse also holds: a finite group $G$ is nilpotent if and only if it is
the direct product of its Sylow subgroups (Proposition 7.68, page 275). The
proof of this result uses the fact that every proper subgroup $H$ of a
nilpotent group $G$ is a proper subgroup of its normaliser $N_G(H)$
(Proposition 7.69, page 275; Corollary 7.70, page 275). Also, a finite group
$G$ is nilpotent if and only if all its Sylow subgroups are normal
(Corollary 7.72, page 276).
Between the soluble and nilpotent groups there is another class of
groups called supersoluble or supersolvable: groups which have
normal series with cyclic quotients (Definition 7.73, page 277). Every
supersoluble group is soluble, but not every soluble group is supersoluble:
for example, $S_4$ and $A_4$ are soluble but not supersoluble since neither
has a normal series with cyclic factors (Example 7.74, page 277).
Subgroups of supersoluble groups are supersoluble (Proposition 7.75, page 277),
as are quotients of supersoluble groups (Proposition 7.76, page 277), and
finite direct products (Proposition 7.77, page 277).
Extensions of supersoluble groups by supersoluble groups aren't necessarily
supersoluble, but a slightly weaker result holds: extensions of
supersoluble groups by cyclic groups are supersoluble (Proposition 7.78,
page 278).
We therefore have a hierarchy of classes of finite groups:

$$\text{trivial} \subset \text{cyclic} \subset \text{abelian} \subset \text{nilpotent} \subset \text{supersoluble} \subset \text{soluble}$$

If $H$ and $K$ are normal subgroups of some group $G$ with trivial
intersection $H \cap K = \{e\}$ and $HK = G$, then $G$ is isomorphic to the
direct product $H \times K$ (Proposition 2.12, page 46; Corollary 3.23,
page 82). Requiring only one of the subgroups, say $H$, to be normal, yields a
more general construction: the (internal) semidirect product
(Definition 7.79, page 279). The subgroup $K$ acts on the subgroup $H$ by
conjugation (Proposition 7.80, page 279), so if a group $G$ decomposes as an
internal semidirect product $HK$ then we automatically get a $K$-action on
$H$. Conversely, given two groups $H$ and $K$, together with a specified
action of $K$ on $H$, we can define a group structure on the Cartesian product
$H \times K$, with $(h_1, k_1)(h_2, k_2) = (h_1 (k_1 \cdot h_2), k_1 k_2)$
(Proposition 7.81, page 280). We call this group the (external) semidirect
product of $H$ and $K$, and denote it $H \rtimes K$ (Definition 7.82,
page 281).
Many groups can be constructed as semidirect products of smaller
ones. In particular, the dihedral group $D_n \cong \mathbb{Z}_n \rtimes \mathbb{Z}_2$
(Example 7.83, page 281), and the symmetric group
$S_n \cong A_n \rtimes \mathbb{Z}_2$ (Example 7.84, page 281). The semidirect
product $\mathbb{R}^n \rtimes \mathrm{GL}_n(\mathbb{R})$ yields the affine
general linear group $\mathrm{AGL}_n(\mathbb{R})$ (Example 7.85, page 282).
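The external semidirect product rule can be tried out on the dihedral case. The sketch below assumes the standard action of $\mathbb{Z}_2$ on $\mathbb{Z}_n$ by inversion and writes both groups additively, so the rule becomes $(h_1, k_1)(h_2, k_2) = (h_1 + k_1 \cdot h_2,\ k_1 + k_2)$. The function names are illustrative and are not taken from the text.

```python
from itertools import product

def semidirect_dihedral(n):
    """Elements and multiplication of Z_n ⋊ Z_2, with Z_2 acting by inversion."""
    elements = list(product(range(n), range(2)))

    def act(k, h):
        # k = 0 acts trivially; k = 1 sends h to -h (mod n)
        return h % n if k == 0 else (-h) % n

    def multiply(x, y):
        (h1, k1), (h2, k2) = x, y
        return ((h1 + act(k1, h2)) % n, (k1 + k2) % 2)

    return elements, multiply

def is_group(elements, multiply, identity):
    """Brute-force check of associativity, identity and inverses."""
    assoc = all(multiply(multiply(a, b), c) == multiply(a, multiply(b, c))
                for a in elements for b in elements for c in elements)
    ident = all(multiply(identity, a) == a == multiply(a, identity)
                for a in elements)
    inverses = all(any(multiply(a, b) == identity == multiply(b, a)
                       for b in elements)
                   for a in elements)
    return assoc and ident and inverses

elements, multiply = semidirect_dihedral(4)
r, s = (1, 0), (0, 1)          # a rotation and a reflection
group_ok = is_group(elements, multiply, (0, 0))
# Dihedral relation: s r s = r^{-1}, and r^{-1} = (3, 0) when n = 4
relation_ok = multiply(multiply(s, r), s) == (3, 0)
nonabelian = multiply(r, s) != multiply(s, r)
```

With $n = 4$ this produces a nonabelian group of order 8 satisfying the dihedral relation, consistent with $D_4 \cong \mathbb{Z}_4 \rtimes \mathbb{Z}_2$.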
If we take the semidirect product of a group $G$ with its full
automorphism group $\mathrm{Aut}(G)$ we obtain the holomorph of $G$, denoted
$\mathrm{Hol}(G) = G \rtimes \mathrm{Aut}(G)$ (Example 7.86, page 282). Taking
the semidirect product of the $n$-fold direct product
$G^n = G \times \cdots \times G$ with the symmetric group $S_n$, acting via
permutation of coordinates, gives a group known as the permutation
wreath product of $G$ with $S_n$, denoted
$G \wr S_n = G \operatorname{pwr} S_n = G^n \rtimes S_n$.
More generally, regarding some arbitrary group $H$ as a permutation
group on its underlying set, via Cayley's Theorem (Theorem 2.13, page 47), we
can form the regular wreath product of $G$ by $H$, denoted
$G \operatorname{rwr} H$ (or $G \wr H$ if the meaning is clear from context).
This construction enables us to define the generalised symmetric group
$S_{m,n} = \mathbb{Z}_m \wr S_n$ (Example 7.87, page 283). In particular,
$S_{1,n} = S_n$ is the usual symmetric group, $S_{2,n}$ is the signed symmetric
group or hyperoctahedral group, and $S_{2,2} \cong D_4$.
Although many groups can be decomposed or constructed as semidirect
products of smaller groups, there are also many that can't: the
smallest example being $\mathbb{Z}_4$. A group $G$ is an extension of a group
$K$ by a group $H$ if there exists an inclusion homomorphism
$i : H \hookrightarrow G$ with $i(H) \trianglelefteq G$, and
$G/i(H) \cong K$ (Definition 7.88, page 284). An inclusion function (not
necessarily a homomorphism) $s : K \hookrightarrow G$ is called a lifting or a
section if $s(e_K) = e_G$, and if $q \circ s = \mathrm{id}_K$ where
$q : G \twoheadrightarrow K$ is the quotient homomorphism (Definition 7.89,
page 284). Then $s(K)$ is a copy of $K$ injectively embedded as a
subset, but not necessarily a subgroup, of $G$.
As with semidirect products, this section determines a conjugation
$K$-action on $H$ by $i(k \cdot h) = s(k)\, i(h)\, s(k)^{-1}$.
If $s$ is a homomorphism, $G \cong H \rtimes K$ and we say the extension is
split. Such a section is often called a splitting or splitting homomorphism.
Classifying group extensions is complicated in general, but the abelian
case is relatively straightforward. We define a $G$-module to be an
abelian group $A$ equipped with a $G$-action (Definition 7.90, page 285), and
we say that two extensions
$$A \xrightarrow{\;i_1\;} E_1 \xrightarrow{\;q_1\;} G \qquad\text{and}\qquad A \xrightarrow{\;i_2\;} E_2 \xrightarrow{\;q_2\;} G$$
are equivalent if there is an isomorphism $f : E_1 \to E_2$ such that
$i_2 = f \circ i_1$ and $q_1 = q_2 \circ f$; that is, if the diagram
$$\begin{array}{ccccc}
 & & E_1 & & \\
 & \overset{i_1}{\nearrow} & & \overset{q_1}{\searrow} & \\
A & & \big\downarrow f & & G \\
 & \underset{i_2}{\searrow} & & \underset{q_2}{\nearrow} & \\
 & & E_2 & &
\end{array}$$
commutes (Definition 7.91, page 285). A factor set or cocycle is a function
$\varphi : G \times G \to A$ such that
$$\varphi(e_G, g) = \varphi(g, e_G) = 0$$
for any $g \in G$, and
$$\varphi(g_1, g_2) + \varphi(g_1 g_2, g_3) = g_1 \cdot \varphi(g_2, g_3) + \varphi(g_1, g_2 g_3)$$
for all $g_1, g_2, g_3 \in G$ (Proposition 7.92, page 286). A factor set is
determined by a section of the extension, and different sections of the same
extension yield factor sets that differ by a coboundary or inner factor set:
a function $\psi : G \times G \to A$ such that
$$\psi(g, h) = \theta(g) - \theta(gh) + g \cdot \theta(h)$$
for all $g, h \in G$, and some function $\theta : G \to A$ satisfying
$\theta(e_G) = 0$. Two factor sets related in this way are said to be
equivalent or cohomologous (Definition 7.95, page 289). Let
$\mathrm{FS}(G, A)$ be the set of factor sets of extensions of a group $G$ by
a $G$-module $A$, and let $\mathrm{IFS}(G, A)$ be the corresponding
subset of inner factor sets. These both form abelian groups under
the canonical pointwise addition operation (Proposition 7.96, page 290). We
define the group $\mathrm{Ext}(G, A)$ to be the quotient
$\mathrm{FS}(G, A)/\mathrm{IFS}(G, A)$; the elements of this
group are equivalence classes of factor sets that differ only by an inner
factor set. Furthermore, equivalent factor sets in this sense correspond
to equivalent extensions (Proposition 7.97, page 290), so the group
$\mathrm{Ext}(G, A)$ classifies abelian extensions of $G$ by $A$. The zero
element in $\mathrm{Ext}(G, A)$ corresponds to (the equivalence class of) the
split extension $A \hookrightarrow A \rtimes G \twoheadrightarrow G$. The
Schur–Zassenhaus Theorem says that if $G$ is a group of order $n$ and $A$
is a $G$-module of order $m$, where $m$ and $n$ are coprime, then
$\mathrm{Ext}(G, A)$ is trivial; that is, the only extension of $G$ by $A$ is
the split extension (Theorem 7.98, page 292).
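For very small groups the quotient $\mathrm{FS}(G, A)/\mathrm{IFS}(G, A)$ can be counted by brute force. The sketch below is restricted to an assumed special case, namely $G = \mathbb{Z}_n$ acting trivially on $A = \mathbb{Z}_m$, so that $g \cdot a = a$ in the cocycle and coboundary formulas. The function `ext_group_order` is a hypothetical helper, and the multiplication $(a, g)(b, h) = (a + b + \varphi(g, h),\ g + h)$ used at the end is the usual way of building an extension from a factor set under a trivial action; it is quoted here as background rather than taken from the summary above.

```python
from itertools import product

def ext_group_order(n, m):
    """Return (|FS|, |IFS|, |FS|/|IFS|) for Z_n acting trivially on Z_m."""
    G, A = range(n), range(m)
    pairs = [(g, h) for g in G for h in G]
    # Normalisation forces phi(0, g) = phi(g, 0) = 0, so only these are free:
    free = [(g, h) for (g, h) in pairs if g != 0 and h != 0]

    def is_factor_set(phi):
        # Cocycle identity with trivial action
        return all((phi[g1, g2] + phi[(g1 + g2) % n, g3]) % m ==
                   (phi[g2, g3] + phi[g1, (g2 + g3) % n]) % m
                   for g1 in G for g2 in G for g3 in G)

    factor_sets = set()
    for values in product(A, repeat=len(free)):
        phi = {p: 0 for p in pairs}
        phi.update(zip(free, values))
        if is_factor_set(phi):
            factor_sets.add(tuple(phi[p] for p in pairs))

    inner = set()
    for values in product(A, repeat=n - 1):
        theta = (0,) + values            # theta(e_G) = 0
        inner.add(tuple((theta[g] - theta[(g + h) % n] + theta[h]) % m
                        for (g, h) in pairs))

    return len(factor_sets), len(inner), len(factor_sets) // len(inner)

ext_2_2 = ext_group_order(2, 2)   # orders 2 and 2 are not coprime
ext_2_3 = ext_group_order(2, 3)   # orders 2 and 3 are coprime

# The nonzero factor set on Z_2 x Z_2 -> Z_2, and the extension it defines
phi = {(g, h): int(g == 1 and h == 1) for g in range(2) for h in range(2)}

def ext_mul(x, y):
    (a, g), (b, h) = x, y
    return ((a + b + phi[g, h]) % 2, (g + h) % 2)

x, power, order = (0, 1), (0, 1), 1
while power != (0, 0):
    power = ext_mul(power, x)
    order += 1
```

In the coprime case the count of classes comes out as 1, in line with the Schur–Zassenhaus Theorem. In the case $n = m = 2$ there is a second class, and the element $(0, 1)$ of the corresponding extension has order 4, so that extension is cyclic of order 4 rather than split.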
An extension
$$A \xrightarrow{\;i\;} E \xrightarrow{\;q\;} G$$
is central if $i(A) \subseteq Z(E)$; that is, if the image of $A$ lies in the
centre of $E$, so that every element of $i(A)$ commutes with every element of
$E$. We can, for example, construct the quaternion group $Q_8$ as a central
extension of the Klein group $V_4$ by $\mathbb{Z}_2$ (Example 7.99, page 293).
Another interesting case is that of a cyclic extension: an extension
of a cyclic group $G$. Such an extension is determined entirely by a
cyclic extension type, a quadruple $(H, n, \tau, v)$ where $H$ is a (possibly
nonabelian) group, $n \in \mathbb{N}$, $\tau \in \mathrm{Aut}(H)$ and
$v \in H$ such that $\tau(v) = v$ and $\tau^n$ is conjugation by $v$. The
Cyclic Extension Theorem says that for every cyclic extension type
$(H, n, \tau, v)$ there is a group $G$ with $H \trianglelefteq G$ and
$G/H \cong \mathbb{Z}_n$; furthermore all extensions of $\mathbb{Z}_n$ are
determined by some cyclic extension type (Theorem 7.101, page 294). Two cyclic
extension types $(H, n, \tau, v)$ and $(K, m, \sigma, w)$ are equivalent if
$m = n$ and there exists an isomorphism $\varphi : H \to K$ such that
$\varphi \circ \tau = \sigma \circ \varphi$ and $\varphi(v) = w$
(Definition 7.102, page 295). Equivalent cyclic extension types yield
isomorphic groups (Proposition 7.103, page 295), although the converse doesn't
necessarily hold: inequivalent cyclic extension types might realise
isomorphic groups. Nevertheless, given two cyclic extension types
$(H, n, \tau, v)$ and $(H, n, \sigma, w)$, if $\tau$ and $\sigma$ are
conjugate in $\mathrm{Aut}(H)$ then they realise isomorphic groups
(Proposition 7.104, page 296). We can use cyclic extension types to classify
groups of certain orders (Example 7.105, page 296).
Metabelian groups, those with derived length 2, can also be characterised
as those groups that can be constructed as extensions of abelian
groups by abelian groups (Definition 7.106, page 297).
We can use the results developed in this and earlier chapters, particularly
Sylow's Theorems and the theory of semidirect products and
group extensions, to classify groups of order less than 32. These are
listed in Table 7.19.

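| $\lvert G \rvert$ | number | $G$ |
|---|---|---|
| 1 | 1 | $\{e\}$ |
| 2 | 1 | $\mathbb{Z}_2$ |
| 3 | 1 | $\mathbb{Z}_3$ |
| 4 | 2 | $\mathbb{Z}_4,\ V_4 \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ |
| 5 | 1 | $\mathbb{Z}_5$ |
| 6 | 2 | $\mathbb{Z}_6 \cong \mathbb{Z}_2 \times \mathbb{Z}_3,\ D_3$ |
| 7 | 1 | $\mathbb{Z}_7$ |
| 8 | 5 | $\mathbb{Z}_8,\ \mathbb{Z}_2 \times \mathbb{Z}_4,\ \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2,\ D_4,\ Q_8$ |
| 9 | 2 | $\mathbb{Z}_9,\ \mathbb{Z}_3 \times \mathbb{Z}_3$ |
| 10 | 2 | $\mathbb{Z}_{10},\ D_5$ |
| 11 | 1 | $\mathbb{Z}_{11}$ |
| 12 | 5 | $\mathbb{Z}_{12} \cong \mathbb{Z}_4 \times \mathbb{Z}_3,\ \mathbb{Z}_2 \times \mathbb{Z}_6 \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_3,\ D_6,\ \mathrm{Dic}_3,\ A_4$ |
| 13 | 1 | $\mathbb{Z}_{13}$ |
| 14 | 2 | $\mathbb{Z}_{14},\ D_7$ |
| 15 | 1 | $\mathbb{Z}_{15} \cong \mathbb{Z}_3 \times \mathbb{Z}_5$ |
| 16 | 14 | $\mathbb{Z}_{16},\ \mathbb{Z}_2 \times \mathbb{Z}_8,\ \mathbb{Z}_4 \times \mathbb{Z}_4,\ \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_4,\ \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2,\ D_8,\ \mathrm{Dic}_4,\ SD_{16},\ D_4 \times \mathbb{Z}_2,\ Q_8 \times \mathbb{Z}_2,\ \mathbb{Z}_8 \rtimes_3 \mathbb{Z}_2,\ V_4 \rtimes \mathbb{Z}_4,\ Q_8 \rtimes \mathbb{Z}_2,\ \mathbb{Z}_4 \rtimes \mathbb{Z}_4$ |
| 17 | 1 | $\mathbb{Z}_{17}$ |
| 18 | 5 | $\mathbb{Z}_{18} \cong \mathbb{Z}_9 \times \mathbb{Z}_2,\ \mathbb{Z}_3 \times \mathbb{Z}_6 \cong \mathbb{Z}_3 \times \mathbb{Z}_3 \times \mathbb{Z}_2,\ D_9,\ \mathbb{Z}_3 \times D_3,\ (\mathbb{Z}_3 \times \mathbb{Z}_3) \rtimes \mathbb{Z}_2$ |
| 19 | 1 | $\mathbb{Z}_{19}$ |
| 20 | 5 | $\mathbb{Z}_{20} \cong \mathbb{Z}_4 \times \mathbb{Z}_5,\ \mathbb{Z}_2 \times \mathbb{Z}_{10} \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_5,\ D_{10},\ \mathrm{Dic}_5,\ \mathrm{AGL}_1(5)$ |
| 21 | 2 | $\mathbb{Z}_{21} \cong \mathbb{Z}_7 \times \mathbb{Z}_3,\ \mathbb{Z}_7 \rtimes \mathbb{Z}_3$ |
| 22 | 2 | $\mathbb{Z}_{22},\ D_{11}$ |
| 23 | 1 | $\mathbb{Z}_{23}$ |
| 24 | 15 | $\mathbb{Z}_{24} \cong \mathbb{Z}_3 \times \mathbb{Z}_8,\ \mathbb{Z}_2 \times \mathbb{Z}_{12} \cong \mathbb{Z}_2 \times \mathbb{Z}_4 \times \mathbb{Z}_3,\ \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_6 \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_3,\ S_4,\ D_{12},\ \mathbb{Z}_3 \times D_4,\ \mathbb{Z}_3 \times Q_8,\ \mathbb{Z}_2 \times A_4,\ \mathbb{Z}_4 \times D_3,\ \mathbb{Z}_2 \times \mathrm{Dic}_3,\ \mathbb{Z}_2 \times D_6 \cong V_4 \times D_3,\ Q_8 \rtimes \mathbb{Z}_3 \cong \mathrm{SL}_2(3),\ \mathbb{Z}_3 \rtimes \mathbb{Z}_8,\ \mathbb{Z}_3 \rtimes D_4,\ \mathbb{Z}_3 \rtimes Q_8$ |
| 25 | 2 | $\mathbb{Z}_{25},\ \mathbb{Z}_5 \times \mathbb{Z}_5$ |
| 26 | 2 | $\mathbb{Z}_{26},\ D_{13}$ |
| 27 | 5 | $\mathbb{Z}_{27},\ \mathbb{Z}_3 \times \mathbb{Z}_9,\ \mathbb{Z}_3 \times \mathbb{Z}_3 \times \mathbb{Z}_3,\ \mathrm{Heis}(3) = U(3, 3),\ \mathbb{Z}_9 \rtimes \mathbb{Z}_3$ |
| 28 | 4 | $\mathbb{Z}_{28} \cong \mathbb{Z}_4 \times \mathbb{Z}_7,\ \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_7 \cong \mathbb{Z}_2 \times \mathbb{Z}_{14},\ \mathrm{Dic}_7,\ D_{14}$ |
| 29 | 1 | $\mathbb{Z}_{29}$ |
| 30 | 4 | $\mathbb{Z}_{30} \cong \mathbb{Z}_2 \times \mathbb{Z}_3 \times \mathbb{Z}_5,\ D_{15},\ D_3 \times \mathbb{Z}_5,\ D_5 \times \mathbb{Z}_3$ |
| 31 | 1 | $\mathbb{Z}_{31}$ |

Table 7.19: Groups of order at most 31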

References and further reading


Some interesting details about the history of Sylow's Theorems can be found in the following article:
W C Waterhouse, The early proofs of Sylow's theorem, Archive for History of Exact Sciences 21.3 (1980) 279–290
In the spirit of James McKay's concise proof of Cauchy's Theorem, cited at the end of Chapter 2,
Mícheál Ó Searcóid formulated shorter proofs of Sylow's Theorems:
M Ó Searcóid, A reordering of the Sylow Theorems, The American Mathematical Monthly 94.2 (1987) 165–168
An accessible introduction to semidirect products, motivated by affine transformations (discussed in
Example 7.85), can be found in the following article:
S S Abhyankar and C Christensen, Semidirect products: $x \mapsto ax + b$ as a first example, Mathematics
Magazine 75.4 (2002) 284–289
For more detailed discussion of homological algebra, the following books are good places to start:
K S Brown, Cohomology of Groups, Graduate Texts in Mathematics 87, Springer (1994)
P J Hilton and U Stammbach, A Course in Homological Algebra, second edition, Graduate Texts in
Mathematics 4, Springer (1996)
J J Rotman, An Introduction to Homological Algebra, first edition, Pure and Applied Mathematics 85,
Academic Press (1979)
J J Rotman, An Introduction to Homological Algebra, second edition, Universitext, Springer (2008)
C A Weibel, An Introduction to Homological Algebra, Cambridge Studies in Advanced Mathematics 38,
Cambridge University Press (1994)
All are aimed at graduate or advanced undergraduate students; Weibel and the second edition of
Rotman are pitched at a higher level than the others. Brown's book focuses mostly on applications to
group theory, while the others take a more general viewpoint concerning modules over arbitrary rings.
As noted earlier, the classification of groups of order 16 in Section 7.A closely follows the exposition in
the very clear and readable article by Marcel Wild:
M Wild, The groups of order sixteen made easy, The American Mathematical Monthly 112 (2005) 20–31
Another classification of groups of order less than 32 can be found in the textbook by John Moody:
J A Moody, Groups for Undergraduates, World Scientific (1994)

Exercises
8.1 Construct the dihedral group D4 as a central extension of V4 by Z2 .
8 Rings

Fafner: New mischief will the Nibelung plot against us if the gold gives
him power. You there, Loge! Say without lies: of what great value is the gold
then, that it satisfies the Nibelung?
Loge: It is a toy in the depths of the water, to give pleasure to laughing
children; but if it were fashioned into a round ring it would bestow supreme
power and win its master the world.
Wotan: I have heard talk of the Rhine's gold: its glittering glow hides runes
of riches; a ring would give unbounded power and wealth.
Richard Wagner (1813–1883), Das Rheingold (1869)

In Chapter 1 we carefully studied the set $\mathbb{Z}$ of integers, giving
particular attention to its additive structure. By deconstructing and
abstracting the way addition of integers works, we derived the concept
of a binary operation defined on an arbitrary set, and also a number
of properties (associativity, commutativity, existence of inverses and
an identity element) that such an operation might satisfy. This led us
to formulate the definition of a group, a surprisingly rich and strange
concept that has kept us busy for 323 pages so far.
But we can do other things with integers as well as add them together.
The second arithmetical operation we learn as children is that of
multiplication. In the next section we will study the multiplicative
structure of Z, both on its own and, crucially, how it interacts with
the additive structure. This will lead us to define a new mathematical
object, a ring, consisting of a set equipped with two binary operations.

8.1 Numbers

"Can you do addition?" the White Queen asked. "What's one and one
and one and one and one and one and one and one and one and one?"
"I don't know," said Alice. "I lost count."
Lewis Carroll (Charles Lutwidge Dodgson) (1832–1898),
Through the Looking-Glass (1871)

Considering just the additive structure of the integers
led us fairly naturally to the concept of an abelian group. We found
shortly afterwards that many important examples of the structures
we were interested in didn't satisfy the commutativity condition: in
particular, matrix multiplication and the composition of symmetry
operations or permutations led to interesting nonabelian groups.
However, at present we're primarily interested in algebraic structures
that mimic the usual addition and multiplication operations on the
integers, so for the moment we'll start with an additive abelian group.
So, what properties does integer multiplication satisfy? It's associative:
for any integers $a$, $b$ and $c$ it's always the case that $(ab)c = a(bc)$.
Associativity turns out to be important, so we'll put it on our list.
Integer multiplication is also commutative: $ab = ba$ for any integers $a$
and $b$. However, when we investigated groups, we found that lots of
important and interesting examples didn't satisfy the commutativity
requirement, so although integer multiplication does, we might want
to put this on the "optional" list.
There is also a special integer $1$ that acts as a multiplicative identity
element, in the sense that $a1 = 1a = a$ for any $a \in \mathbb{Z}$. Notice that
$1 \neq 0$; that is, the multiplicative identity isn't the same as the additive
identity. We'll see in a little while that this is (almost) always true.
Something else that's worth mentioning at this stage is that although
$\mathbb{Z}$ and many of the other structures of this type we're going to study
do have multiplicative identities, we'll also meet some examples which
don't. So we'll put this one on the "optional" list too. This also leads us
into a slight ambiguity of terminology that we'll deal with soon.
What about multiplicative inverses? Well, $\mathbb{Z}$ doesn't have them in
general. In fact, the only invertible elements in $\mathbb{Z}$ are $1$ and
$-1$. Later on, we'll consider structures which do have invertible elements,
but for the moment it's not something we're going to require.
That's pretty much it for the multiplicative operation on its own, but
we also need to think about how multiplication and addition interact
with each other. There are really just two questions we need to ask
here: what happens if we add two integers together and multiply on
the left, or the right, by another integer? That is, how do we expand
expressions of the form $a(b + c)$ and $(a + b)c$?
We know how to do this for integers: we all learned it quite early on in
our mathematical education. The expressions we want are, in general,
$$a(b + c) = ab + ac, \qquad (a + b)c = ac + bc.$$
We call these the (left and right) distributive laws.
We're now ready to write down a formal definition of the objects we're
going to be studying for most of the rest of the book.

Definition 8.1 A ring $R = (R, +, \cdot)$ is a set $R$ equipped with two
binary operations $+$ and $\cdot$ such that:
R1 $(R, +)$ is an abelian group;
R2 $\cdot$ is associative, that is, $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in R$;
R3 $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(a + b) \cdot c = a \cdot c + b \cdot c$ for all $a, b, c \in R$;
R4 there exists a unity or multiplicative identity element $1 \in R$ such that $1 \cdot a = a \cdot 1 = a$ for all $a \in R$.
A commutative ring is one that also satisfies the following extra
condition:
R5 $a \cdot b = b \cdot a$ for all $a, b \in R$.

David Hilbert (1862–1943) was one of the most influential mathematicians of
the late 19th and early 20th centuries. Born in Königsberg, in East Prussia,
he enrolled at the Albertina (the University of Königsberg) in 1880, obtaining
his doctorate in 1885 for a thesis on spherical functions and binary forms,
written under the supervision of Ferdinand von Lindemann (1852–1939).
In 1888 he proved the celebrated Hilbert Basis Theorem, an innovative solution
to a problem set twenty years earlier by Paul Gordan (1837–1912), who is
said to have remarked "Das ist nicht Mathematik, das ist Theologie" ("This
is not mathematics, this is theology.") He stayed at Königsberg until 1895,
when he was appointed to a chair at the University of Göttingen. In addition
to his many contributions to geometry, physics, analysis and algebra,
he fostered a vibrant academic environment at Göttingen, which succeeded
in attracting active research mathematicians from around the world.
At the International Congress of Mathematicians in Paris, in 1900, he
presented a famous list of ten important unsolved problems, later extended to
twenty-three, of which four remain unresolved over a century later.
He died in 1943 and was buried in Göttingen. His epitaph consists of a
remark made at the end of his retirement address in 1930 to the Society of
German Scientists and Physicians: "Wir müssen wissen. Wir werden wissen."
("We must know. We will know.")
(Photograph: Oberwolfach Photo Collection / L Reidemeister)

Axiom R2 says that $(R, \cdot)$ is a semigroup, while R2 and R4 together say
that it is a monoid. If R5 is satisfied as well, then $(R, \cdot)$ is a
commutative monoid.
The above definition says that a ring is an abelian group equipped
with another binary operation under which it forms a monoid, and
that both operations are compatible with each other in a specific way.
Now to attend to the terminological issue mentioned earlier. Some
books use the term "ring" to refer to a slightly more general structure
that doesn't necessarily satisfy condition R4, and call the structure in
Definition 8.1 a ring with 1, a unital ring or a ring with unity.
Those books (such as this one) that use the term "ring" for the
corresponding structure that does have a multiplicative identity sometimes
refer to the more general case as a ring without 1, a nonunital ring,
a pseudoring or, most succinctly but least pronounceably, as a rng.
It's time to look at some concrete examples.
Example 8.2 The set $\mathbb{Z}$ of integers forms a commutative ring with
respect to the familiar addition and multiplication operations.
Similarly, the set $\mathbb{Q}$ of rational numbers, the set $\mathbb{R}$ of
real numbers and the set $\mathbb{C}$ of complex numbers all form commutative
rings under the usual addition and multiplication operations.

The rings $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ have additional
properties that the integers don't, and we'll study these in more depth over
the next few chapters.
The next examples of groups we met in Chapter 1 were the finite cyclic
groups $\mathbb{Z}_n$. These are abelian groups, and with a little extra work
we can turn them into rings.
Example 8.3 The finite cyclic group $\mathbb{Z}_n = \{0, \ldots, n-1\}$ forms a
commutative ring when equipped with modulo-$n$ multiplication.
To see this, we check the five conditions from Definition 8.1. The
pair $(\mathbb{Z}_n, +_n)$ is an abelian group, so R1 is satisfied. Modulo-$n$
multiplication is associative, commutative, and also distributive with
respect to modulo-$n$ addition, so conditions R2, R3 and R5 hold. The
element $1 \in \mathbb{Z}_n$ behaves as a multiplicative identity, so condition
R4 is satisfied too, and hence $\mathbb{Z}_n$ is a commutative ring. The
addition and multiplication tables for $\mathbb{Z}_2$, $\mathbb{Z}_3$ and
$\mathbb{Z}_4$ are shown in Table 8.1.

$$\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array}
\qquad
\begin{array}{c|cc} \cdot & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array}$$

$$\begin{array}{c|ccc} + & 0 & 1 & 2 \\ \hline 0 & 0 & 1 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 2 & 0 & 1 \end{array}
\qquad
\begin{array}{c|ccc} \cdot & 0 & 1 & 2 \\ \hline 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 \\ 2 & 0 & 2 & 1 \end{array}$$

$$\begin{array}{c|cccc} + & 0 & 1 & 2 & 3 \\ \hline 0 & 0 & 1 & 2 & 3 \\ 1 & 1 & 2 & 3 & 0 \\ 2 & 2 & 3 & 0 & 1 \\ 3 & 3 & 0 & 1 & 2 \end{array}
\qquad
\begin{array}{c|cccc} \cdot & 0 & 1 & 2 & 3 \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 & 3 \\ 2 & 0 & 2 & 0 & 2 \\ 3 & 0 & 3 & 2 & 1 \end{array}$$

Table 8.1: Addition and multiplication tables for $\mathbb{Z}_2$, $\mathbb{Z}_3$ and $\mathbb{Z}_4$

The simplest example of a ring is the set consisting of a single element:
Example 8.4 Let $R = \{0\}$. This forms a commutative unital ring, the
trivial ring, under the usual addition and multiplication operations
$0 + 0 = 0$ and $0 \cdot 0 = 0$.

The following two examples are formed by taking the ring $\mathbb{Z}$ and
carefully attaching a specific non-integer element. We'll come back to
this idea later on when we study field extensions, but for the moment
it's an interesting and useful way of constructing new rings.

Example 8.5 The set $\mathbb{Z}[\sqrt{2}] = \{a + b\sqrt{2} : a, b \in \mathbb{Z}\}$ is a commutative
unital ring under the usual addition and multiplication operations.
It is an abelian group under addition, so R1 holds. Multiplication is
associative and commutative, and also distributive over addition, so
R2, R3 and R5 hold. Finally, $1 = 1 + 0\sqrt{2} \in \mathbb{Z}[\sqrt{2}]$ so R4 holds too.
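Example 8.5 takes for granted that the product of two elements of $\mathbb{Z}[\sqrt{2}]$ is again of the form $a + b\sqrt{2}$ with integer coefficients. Expanding gives $(a + b\sqrt{2})(c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$. The sketch below stores an element as the integer pair $(a, b)$ and spot-checks a few of the ring axioms on sample values; the class name `ZSqrt2` is illustrative only, and a handful of samples is of course not a proof.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ZSqrt2:
    """The element a + b*sqrt(2) of Z[sqrt(2)], stored as integers (a, b)."""
    a: int
    b: int

    def __add__(self, other):
        return ZSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return ZSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

one = ZSqrt2(1, 0)
x, y, z = ZSqrt2(1, 1), ZSqrt2(3, -2), ZSqrt2(-4, 5)

product_xy = x * y                               # (1 + √2)(3 - 2√2)
associative = (x * y) * z == x * (y * z)         # sample check of R2
distributive = x * (y + z) == x * y + x * z      # sample check of R3
identity_ok = one * x == x == x * one            # sample check of R4
commutative = x * y == y * x                     # sample check of R5
```

Because the coefficients stay in $\mathbb{Z}$ throughout, the same representation works for $\mathbb{Z}[i]$ after replacing the factor $2$ in the first coordinate by $-1$.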

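Example 8.6 A variation on the above example is the ring
$$\mathbb{Z}[i] = \{a + bi : a, b \in \mathbb{Z}\}$$
of Gaussian integers. This is a commutative unital ring.
Similarly, the set of Eisenstein integers
$$\mathbb{Z}[\omega] = \{a + b\omega : a, b \in \mathbb{Z}\},$$
where $\omega = -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i$, and the set of Kleinian integers
$$\mathbb{Z}\bigl[-\tfrac{1}{2} + \tfrac{\sqrt{7}}{2} i\bigr] = \bigl\{a + b\bigl(-\tfrac{1}{2} + \tfrac{\sqrt{7}}{2} i\bigr) : a, b \in \mathbb{Z}\bigr\},$$
also both form commutative rings.

We can modify the above examples by replacing $\mathbb{Z}$ with some other
ring like $\mathbb{Q}$ or $\mathbb{Z}_n$, to get new rings such as
$\mathbb{Q}[i]$ or $\mathbb{Z}_n[\sqrt{2}]$.

Example 8.7 Let $\omega = -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i$. Then
$$\begin{aligned}
\mathbb{Z}_2[i] &= \{a + bi : a, b \in \mathbb{Z}_2\},\\
\mathbb{Z}_2[\sqrt{2}] &= \{a + b\sqrt{2} : a, b \in \mathbb{Z}_2\},\\
\mathbb{Z}_2[\omega] &= \{a + b\omega : a, b \in \mathbb{Z}_2\}.
\end{aligned}$$
The addition and multiplication tables for these rings are shown in
Tables 8.2 and 8.3. (Note that $\omega + 1 = -\omega^2$, but since we're working
over $\mathbb{Z}_2$ we have $-1 = 1$ and hence $\omega + 1 = \omega^2$.)

$$\begin{array}{c|cccc} + & 0 & 1 & 1+i & i \\ \hline 0 & 0 & 1 & 1+i & i \\ 1 & 1 & 0 & i & 1+i \\ 1+i & 1+i & i & 0 & 1 \\ i & i & 1+i & 1 & 0 \end{array}$$

$$\begin{array}{c|cccc} + & 0 & 1 & \sqrt{2} & 1+\sqrt{2} \\ \hline 0 & 0 & 1 & \sqrt{2} & 1+\sqrt{2} \\ 1 & 1 & 0 & 1+\sqrt{2} & \sqrt{2} \\ \sqrt{2} & \sqrt{2} & 1+\sqrt{2} & 0 & 1 \\ 1+\sqrt{2} & 1+\sqrt{2} & \sqrt{2} & 1 & 0 \end{array}$$

$$\begin{array}{c|cccc} + & 0 & 1 & \omega & \omega^2 \\ \hline 0 & 0 & 1 & \omega & \omega^2 \\ 1 & 1 & 0 & \omega^2 & \omega \\ \omega & \omega & \omega^2 & 0 & 1 \\ \omega^2 & \omega^2 & \omega & 1 & 0 \end{array}$$

Table 8.2: Addition tables for the rings $\mathbb{Z}_2[i]$, $\mathbb{Z}_2[\sqrt{2}]$ and $\mathbb{Z}_2[\omega]$

$$\begin{array}{c|cccc} \cdot & 0 & 1 & 1+i & i \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1+i & i \\ 1+i & 0 & 1+i & 0 & 1+i \\ i & 0 & i & 1+i & 1 \end{array}$$

$$\begin{array}{c|cccc} \cdot & 0 & 1 & \sqrt{2} & 1+\sqrt{2} \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & \sqrt{2} & 1+\sqrt{2} \\ \sqrt{2} & 0 & \sqrt{2} & 0 & \sqrt{2} \\ 1+\sqrt{2} & 0 & 1+\sqrt{2} & \sqrt{2} & 1 \end{array}$$

$$\begin{array}{c|cccc} \cdot & 0 & 1 & \omega & \omega^2 \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & \omega & \omega^2 \\ \omega & 0 & \omega & \omega^2 & 1 \\ \omega^2 & 0 & \omega^2 & 1 & \omega \end{array}$$

Table 8.3: Multiplication tables for the rings $\mathbb{Z}_2[i]$, $\mathbb{Z}_2[\sqrt{2}]$ and $\mathbb{Z}_2[\omega]$

Now we've met a few examples of rings, it's time to start formulating
some general results about them. The following basic properties are
almost immediate consequences of Definition 8.1.

Proposition 8.8 Let $R = (R, +, \cdot)$ be a (unital or nonunital) ring. Then,
for any $a, b, c \in R$,
(i) $0 \cdot a = a \cdot 0 = 0$,
(ii) $a \cdot (-b) = (-a) \cdot b = -(a \cdot b)$, and
(iii) $(-a) \cdot (-b) = a \cdot b$.
If $R$ is unital, then
(iv) $(-1) \cdot a = -a = a \cdot (-1)$ for any $a \in R$, and
(v) $(-1) \cdot (-1) = 1$.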

Proof All of these follow from the ring axioms R1, R2 and R3, and R4
in the case of (iv) and (v). For (i),
$$0 + a \cdot 0 = a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0$$
and using the cancellation law for abelian groups we get $0 = a \cdot 0$ as
required. The proof that $0 \cdot a = 0$ is very similar.
We need to think a little carefully about part (ii) and properly
understand what it says. In particular, $-(a \cdot b)$ is the additive inverse
of $a \cdot b$; that is, the unique (by Proposition 1.16) element of $R$ that
yields $0$ when added to $a \cdot b$. So, to show that
$(-a) \cdot b = -(a \cdot b)$ we need to show that $(-a) \cdot b + a \cdot b = 0$.
By the right distributive law,
$$(-a) \cdot b + a \cdot b = (-a + a) \cdot b = 0 \cdot b$$
and by part (i) we know that $0 \cdot b = 0$, hence $(-a) \cdot b = -(a \cdot b)$ as
claimed. A very similar argument, using the left distributive law,
shows that $a \cdot (-b) = -(a \cdot b)$.
We can use part (ii) to prove part (iii): note that
$(-a) \cdot (-b) = -(a \cdot (-b))$, and furthermore that
$-(a \cdot (-b)) = -(-(a \cdot b))$. This element $-(-(a \cdot b))$ is the unique
element of $R$ that yields $0$ when added to $-(a \cdot b)$. But $a \cdot b$
satisfies this property, and by Proposition 1.16 we know that inverses are
unique, so $-(-(a \cdot b)) = a \cdot b$, and hence $(-a) \cdot (-b) = a \cdot b$.
To show part (iv) we use (ii) again: $(-1) \cdot a = -(1 \cdot a) = -a$ and
$a \cdot (-1) = -(a \cdot 1) = -a$.
Finally, part (v) is the special case of (iii) where $a = b = 1$.

The above proof relies in a couple of places on the uniqueness of
additive identities and inverses guaranteed by Proposition 1.16. This
works because a ring is essentially an abelian group with some extra
structure bolted onto it. It's worth asking whether the multiplicative
identity is also unique, and it turns out that it is.

Proposition 8.9 Let $R = (R, +, \cdot)$ be a ring with a multiplicative
identity $1$. This identity element is unique.

Proof Suppose that there is another element $e \in R$ such that
$e \cdot a = a \cdot e = a$ for all $a \in R$. Then $e \cdot 1 = 1$, because $e$ is
a multiplicative identity. But also $e \cdot 1 = e$ since $1$ is a multiplicative
identity too. Equating these gives $1 = e \cdot 1 = e$, so the identity must be
unique.

All of the examples we've seen so far have been commutative and
unital, so it's time we saw some that aren't. The following is a simple
example of a commutative nonunital ring.

Carl Friedrich Gauss (1777–1855) is widely regarded as the greatest
mathematician of the late 18th and early 19th centuries. Born to poor,
working-class parents in Braunschweig (Brunswick), he displayed considerable
mathematical talents from early childhood. Between 1792 and 1799, during his
undergraduate studies at the Collegium Carolinum in Braunschweig and the
University of Göttingen, he independently discovered several important
mathematical and physical results, including Bode's Law, the Binomial
Theorem, the Quadratic Reciprocity Law and the Prime Number Theorem. Also
during this period he wrote Disquisitiones Arithmeticae, his celebrated
treatise on number theory, which was published a few years later in 1801.
Under the supervision of Johann Friedrich Pfaff (1765–1825), he obtained
a doctorate from the University of Helmstedt in 1799 for a thesis on the
Fundamental Theorem of Algebra.
His work in differential geometry includes the Theorema Egregium and the
Gauss–Bonnet Theorem, and his unpublished research on non-Euclidean
geometry anticipated later work by János Bolyai (1802–1860) and Nikolai
Lobachevsky (1792–1856).
In addition to his mathematical achievements, he made important
contributions in astronomy. In particular, in late 1801 he successfully
predicted the location of the dwarf planet Ceres, which had been discovered by
the Italian astronomer Giuseppe Piazzi (1746–1826) earlier that year and then
lost as it passed behind the sun.
(Portrait: Wikimedia Commons / Christian Albrecht Jensen (1792–1870))

Example 8.10 The set $2\mathbb{Z} = \{\ldots, -4, -2, 0, 2, 4, \ldots\}$ of even integers
forms a commutative nonunital ring under the usual addition and
multiplication operations.
The pair $(2\mathbb{Z}, +)$ is an abelian group, so R1 holds. Multiplication
is associative, distributive and commutative so R2, R3 and R5 are
satisfied. The product of two even integers is also an even integer, so
multiplication is well-defined on $2\mathbb{Z}$. But the identity axiom R4 fails:
there is no even integer $n$ such that $a \cdot n = n \cdot a = a$ for all $a \in 2\mathbb{Z}$.

This example generalises: for some fixed $n \in \mathbb{Z}$ let
$$n\mathbb{Z} = \{\ldots, -2n, -n, 0, n, 2n, \ldots\}.$$
Then for $|n| > 1$ the above argument shows that $n\mathbb{Z}$ is a commutative
nonunital ring. If $|n| = 1$ then $n\mathbb{Z} = \mathbb{Z}$, which is a commutative unital
ring as seen in Example 8.2. If $n = 0$ then we get the trivial ring $\{0\}$.
Another source of nonunital rings is shown in the next example, where
we take an abelian group and give it a trivial multiplication operation.
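Example 8.11 Let $A$ be a nontrivial abelian group with identity $e$. Define
$a \cdot b = e$ for all $a, b \in A$. This process forms a nonunital ring.
In the case of the group $\mathbb{Z}_4$, this process yields a ring with the
following addition and multiplication tables:

$$\begin{array}{c|cccc} + & 0 & 1 & 2 & 3 \\ \hline 0 & 0 & 1 & 2 & 3 \\ 1 & 1 & 2 & 3 & 0 \\ 2 & 2 & 3 & 0 & 1 \\ 3 & 3 & 0 & 1 & 2 \end{array}
\qquad
\begin{array}{c|cccc} \cdot & 0 & 1 & 2 & 3 \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 0 \end{array}$$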
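For finite examples such as $\mathbb{Z}_n$ (Example 8.3) and the zero-multiplication ring on $\mathbb{Z}_4$ (Example 8.11), the conditions R1 to R5 of Definition 8.1 can simply be tested exhaustively. The following is a minimal sketch; the function `ring_axioms` and the dictionary keys it returns are illustrative names, not notation from the text.

```python
from itertools import product

def ring_axioms(elements, add, mul):
    """Report which of R1-R5 hold for a finite set with two operations."""
    E = list(elements)
    pairs = list(product(E, repeat=2))
    triples = list(product(E, repeat=3))

    closed = all(add(a, b) in E and mul(a, b) in E for a, b in pairs)
    # Candidates for the additive identity
    zeros = [z for z in E if all(add(z, a) == a == add(a, z) for a in E)]
    r1 = (bool(zeros)
          and all(add(a, b) == add(b, a) for a, b in pairs)
          and all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples)
          and all(any(add(a, b) == zeros[0] for b in E) for a in E))
    r2 = all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in triples)
    r3 = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
             and mul(add(a, b), c) == add(mul(a, c), mul(b, c))
             for a, b, c in triples)
    r4 = any(all(mul(u, a) == a == mul(a, u) for a in E) for u in E)
    r5 = all(mul(a, b) == mul(b, a) for a, b in pairs)
    return {"closed": closed, "R1": r1, "R2": r2, "R3": r3, "R4": r4, "R5": r5}

def add_mod4(a, b):
    return (a + b) % 4

# Z_4 with the usual modulo-4 multiplication
z4 = ring_axioms(range(4), add_mod4, lambda a, b: (a * b) % 4)
# Z_4 with the trivial multiplication a.b = 0
zero_mult = ring_axioms(range(4), add_mod4, lambda a, b: 0)

print(z4)
print(zero_mult)
```

The only difference between the two reports is R4: with the trivial multiplication no element can act as a multiplicative identity.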

We'll meet an important class of noncommutative rings in the next
section, but we can look at two interesting examples now.

Example 8.12 The sets
$$\mathbb{H} = \{a + bi + cj + dk : a, b, c, d \in \mathbb{R}\}$$
$$\text{and}\quad L = \{a + bi + cj + dk : a, b, c, d \in \mathbb{Z}\}$$
of, respectively, quaternions and Lipschitz integers (or Lipschitz
quaternions) both form noncommutative unital rings under the
usual quaternion addition and multiplication operations, where
$i^2 = j^2 = k^2 = -1$ and
$$ij = k, \quad jk = i, \quad ki = j,$$
$$ji = -k, \quad kj = -i, \quad ik = -j.$$
In both cases the conditions R1, R2 and R3 are satisfied and R4 holds
because $1 = 1 + 0i + 0j + 0k$, but R5 fails.
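The multiplication rules in Example 8.12 extend bilinearly to a formula for the product of two arbitrary quaternions. The sketch below represents $a + bi + cj + dk$ as the tuple $(a, b, c, d)$ and uses the standard Hamilton product; the function names are illustrative. With integer entries every product again has integer entries, which is the closure property needed for the Lipschitz integers.

```python
def qmul(p, q):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qneg(q):
    return tuple(-x for x in q)

one = (1, 0, 0, 0)
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

squares_ok = qmul(i, i) == qmul(j, j) == qmul(k, k) == qneg(one)
cyclic_ok = (qmul(i, j), qmul(j, k), qmul(k, i)) == (k, i, j)
reversed_ok = (qmul(j, i), qmul(k, j), qmul(i, k)) == (qneg(k), qneg(i), qneg(j))
commutes = qmul(i, j) == qmul(j, i)    # R5 fails, so this is False
```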
In all of the nontrivial examples of unital rings we've seen so far, the
multiplicative identity element is different to the additive identity. The
following proposition confirms this in general.

Proposition 8.13 Let $R$ be a unital ring in which $1 = 0$; that is, the
additive and multiplicative identities are the same. Then $R$ is trivial.

Proof Suppose that $1 = 0$. Then for any $a \in R$ we have
$$a = 1 \cdot a = 0 \cdot a = 0$$
and hence $R$ is trivial.

The rings in Example 8.7 all have addition tables structurally identical
to that of the Klein group $V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$. This leads us to ask whether
any of them are structurally identical as rings to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. But to answer
this question we first need to decide what it means for two rings to be
isomorphic, and also what we mean by the direct sum of two rings.

There is a popular story that an eminent mathematician once spent several
hundred pages formally proving that $1 + 1 = 2$. Unlike many such apocryphal
legends, this one is true. The proof occurs on page 379 of the first volume of
Principia Mathematica, Bertrand Russell and Alfred North Whitehead's
notationally dense three-volume work on the axiomatic foundations of
mathematics (B A W Russell and A N Whitehead, Principia Mathematica, Cambridge
University Press (1910–1913)). Here we have taken only 329 pages to
confirm that $1 \neq 0$.
isomorphic, and also what we mean by the direct sum of two rings.
For the first one, recall that a group isomorphism is a bijection that
respects the group structure in a precisely-defined way (Definition 1.17,
page 11). Such a
bijection must respect the additive structure of an abelian group, and
since a ring is to some extent just a special kind of abelian group, this
is a good place to start. But a ring also has a multiplicative structure,
so a ring isomorphism should respect that too.
Rather than define ring isomorphisms and then introduce homomorphisms
as the non-bijective case (as we did with groups) we might as
well introduce both concepts at the same time, although we'll leave
the main discussion of ring homomorphisms until Chapter 9.

Definition 8.14 Let $R = (R, +, \cdot)$ and $S = (S, \oplus, \odot)$ be rings, and
suppose that $f : R \to S$ is a function satisfying
    $f(a + b) = f(a) \oplus f(b)$ and $f(a \cdot b) = f(a) \odot f(b)$
for all $a, b \in R$. Then f is a (ring) homomorphism.
If f is injective, we call it a monomorphism; if f is surjective, we call
it an epimorphism; and if f is bijective, we call it an isomorphism. A
homomorphism $f : R \to R$ is called an endomorphism.

The second item on our list is the direct sum of two rings. Again, we
define this by extending the corresponding concept for abelian groups:

Definition 8.15 Let R and S be rings. The direct sum $R \oplus S$ is
    $R \oplus S = \{ (r, s) : r \in R, s \in S \}$
with addition and multiplication operations
    $(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (a \cdot c, b \cdot d)$
for all $a, c \in R$ and $b, d \in S$.

So we're now able to ask whether or not any of the rings in Example 8.7
are isomorphic as rings to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$.

The addition and multiplication tables for $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ are shown in
Table 8.4. This ring and those in Example 8.7 are all small, finite rings,
so in principle we could check for an isomorphism by considering all
possible functions from $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ to the ring in question. But this is not
a particularly illuminating, clever or efficient method, and it rapidly
becomes impractical for larger rings, and impossible for infinite rings.
So we need to use more sophisticated methods.

    +      | (0, 0)  (0, 1)  (1, 0)  (1, 1)
    -------+-------------------------------
    (0, 0) | (0, 0)  (0, 1)  (1, 0)  (1, 1)
    (0, 1) | (0, 1)  (0, 0)  (1, 1)  (1, 0)
    (1, 0) | (1, 0)  (1, 1)  (0, 0)  (0, 1)
    (1, 1) | (1, 1)  (1, 0)  (0, 1)  (0, 0)

    ·      | (0, 0)  (0, 1)  (1, 0)  (1, 1)
    -------+-------------------------------
    (0, 0) | (0, 0)  (0, 0)  (0, 0)  (0, 0)
    (0, 1) | (0, 0)  (0, 1)  (0, 0)  (0, 1)
    (1, 0) | (0, 0)  (0, 0)  (1, 0)  (1, 0)
    (1, 1) | (0, 0)  (0, 1)  (1, 0)  (1, 1)

Table 8.4: Addition and multiplication tables for the ring $\mathbb{Z}_2 \oplus \mathbb{Z}_2$
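However, both $\mathbb{Z}_2[i]$ and $\mathbb{Z}_2[\sqrt{2}]$ have a nontrivial element (respectively,
$1+i$ and $\sqrt{2}$) that squares to give 0, but $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ has no such
element, so neither $\mathbb{Z}_2[i]$ nor $\mathbb{Z}_2[\sqrt{2}]$ can be isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$.
Furthermore, $\mathbb{Z}_2[\omega]$ has a nontrivial element that cubes to give 1,
and $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ doesn't. Nor, for that matter, does either $\mathbb{Z}_2[i]$ or $\mathbb{Z}_2[\sqrt{2}]$.
So, we know now that $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ isn't isomorphic to any of the rings
$\mathbb{Z}_2[i]$, $\mathbb{Z}_2[\sqrt{2}]$ or $\mathbb{Z}_2[\omega]$. We also know that $\mathbb{Z}_2[\omega]$ isn't isomorphic to
either $\mathbb{Z}_2[i]$ or $\mathbb{Z}_2[\sqrt{2}]$.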

Are $\mathbb{Z}_2[i]$ and $\mathbb{Z}_2[\sqrt{2}]$ isomorphic? Well, a careful examination of their
multiplication tables suggests they might be: both have 0 as an additive
identity and 1 as a multiplicative identity, so we might reasonably
expect an isomorphism $f : \mathbb{Z}_2[i] \to \mathbb{Z}_2[\sqrt{2}]$ to give $f(0) = 0$ and
$f(1) = 1$. In $\mathbb{Z}_2[i]$ the element $i$ squares to give 1, and there is only
one remaining element of $\mathbb{Z}_2[\sqrt{2}]$ that squares to $f(1) = 1$, namely
$1+\sqrt{2}$. Similarly, $(1+i)^2 = 0$ in $\mathbb{Z}_2[i]$, and $\sqrt{2}$ is the only remaining
element of $\mathbb{Z}_2[\sqrt{2}]$ that squares to $f(0) = 0$. We have thus explicitly
constructed an isomorphism $f : \mathbb{Z}_2[i] \to \mathbb{Z}_2[\sqrt{2}]$:

    $f(0) = 0, \quad f(1) = 1, \quad f(i) = 1+\sqrt{2}, \quad f(1+i) = \sqrt{2}.$
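The brute-force check dismissed earlier as unilluminating is still a useful sanity test for a map between two four-element rings. The following is a minimal sketch, assuming we encode an element $a + bx$ of either ring as a pair `(a, b)` with entries in {0, 1}; the helper names (`mul_i`, `mul_sqrt2`, `is_ring_isomorphism`) are our own and purely illustrative.

```python
from itertools import product

# (a, b) stands for a + b*i in Z_2[i], or a + b*sqrt(2) in Z_2[sqrt(2)]
ELEMENTS = list(product(range(2), repeat=2))

def add(u, v):
    # Addition is componentwise modulo 2 in both rings
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul_i(u, v):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, reduced modulo 2
    a, b = u
    c, d = v
    return ((a * c - b * d) % 2, (a * d + b * c) % 2)

def mul_sqrt2(u, v):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2, reduced modulo 2
    a, b = u
    c, d = v
    return ((a * c + 2 * b * d) % 2, (a * d + b * c) % 2)

# The map from the text: 0 -> 0, 1 -> 1, i -> 1 + sqrt2, 1 + i -> sqrt2
f = {(0, 0): (0, 0), (1, 0): (1, 0), (0, 1): (1, 1), (1, 1): (0, 1)}

def is_ring_isomorphism(g):
    """Check bijectivity and both homomorphism conditions exhaustively."""
    bijective = sorted(g.values()) == sorted(ELEMENTS)
    additive = all(g[add(u, v)] == add(g[u], g[v])
                   for u in ELEMENTS for v in ELEMENTS)
    multiplicative = all(g[mul_i(u, v)] == mul_sqrt2(g[u], g[v])
                         for u in ELEMENTS for v in ELEMENTS)
    return bijective and additive and multiplicative

print(is_ring_isomorphism(f))  # True
```

The naive identification `(a, b) -> (a, b)` fails the same test, because $i^2 = 1$ in the first ring while $(\sqrt{2})^2 = 0$ in the second.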

Here we made use of the multiplicative structure of the rings in
question, because a ring isomorphism must preserve this structure. In
particular, we used the following facts about ring homomorphisms:

Proposition 8.16 Let R and S be rings, and $f : R \to S$ be a ring
homomorphism. Then $f(0) = 0$. Furthermore, if R and S both have a
multiplicative identity 1, S is nontrivial, and f is surjective, then $f(1) = 1$.
Also, $f(nr) = n f(r)$ and $f(r^n) = f(r)^n$ for any $r \in R$ and $n \in \mathbb{N}$.

Proof The first statement follows from Proposition 4.24, since any
ring R is an abelian group, and any ring homomorphism f satisfies
the usual conditions for an abelian group homomorphism.

For the second statement, we want to show that $f(1) \cdot s = s \cdot f(1) = s$
for any $s \in S$. Since f is surjective, we can find at least one $r \in R$ such
that $f(r) = s$, and hence
    $f(1) \cdot s = f(1) \cdot f(r) = f(1 \cdot r) = f(r) = s.$
(A very similar argument shows that $s \cdot f(1) = s$.)

The last statement follows from the definition of a ring homomorphism,
and can be proved by induction on n. For $n = 1$ we have $f(1r) =
f(r) = 1 f(r)$ and $f(r^1) = f(r) = f(r)^1$.
Suppose that the hypothesis holds for $n = k$. That is, $f(kr) = k f(r)$
and $f(r^k) = f(r)^k$. Then
    $f((k+1)r) = f(kr + r) = f(kr) + f(r) = k f(r) + f(r) = (k+1) f(r)$
and
    $f(r^{k+1}) = f(r^k \cdot r) = f(r^k) \cdot f(r) = f(r)^k \cdot f(r) = f(r)^{k+1}$
as required.
Early on in our study of groups, we found it useful to consider the
order | g| of a group element g. We can extend this idea to elements of
a ring, but since we have two binary operations to work with, we end
up with two related types of order.
Definition 8.17 Let R be a ring, and $r \in R$ be some arbitrary element.
The additive order of r is the smallest natural number n such that
$nr = 0$; if no such n exists, the element r is said to have infinite
additive order.
If R is a unital ring, then the multiplicative order of r is the smallest
natural number n such that $r^n = 1$; again if no such n exists, we say
r has infinite multiplicative order.
Unless otherwise specified, or where the context is ambiguous, we
will typically use the term order to refer to the multiplicative order
rather than the additive order.

As before, the order $|R|$ of a ring R is its cardinality. It will sometimes
be useful to know the maximum additive order of elements in R:

Definition 8.18 The characteristic $\mathrm{char}(R)$ of a ring R is the smallest
natural number n such that $nr = 0$ for all $r \in R$. If no such n exists,
we set $\mathrm{char}(R) = 0$.
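For a finite ring both definitions can be evaluated by direct search. The following is a minimal sketch, assuming the ring is given as a list of elements together with its addition function and zero element; the function names are hypothetical and the loops terminate only for finite rings.

```python
from itertools import product

def times(n, r, add, zero):
    """Compute nr = r + r + ... + r (n summands)."""
    total = zero
    for _ in range(n):
        total = add(total, r)
    return total

def additive_order(r, add, zero):
    """Smallest natural number n with nr = 0 (finite rings only)."""
    n = 1
    while times(n, r, add, zero) != zero:
        n += 1
    return n

def characteristic(elements, add, zero):
    """Smallest natural number n with nr = 0 for every r (finite rings only)."""
    n = 1
    while any(times(n, r, add, zero) != zero for r in elements):
        n += 1
    return n

# Z_6 under addition modulo 6
z6 = list(range(6))
add6 = lambda a, b: (a + b) % 6

# The direct sum of two copies of Z_2 under componentwise addition modulo 2
z2z2 = list(product(range(2), repeat=2))
add22 = lambda u, v: ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

print(additive_order(2, add6, 0))           # 3, since 2 + 2 + 2 = 0 in Z_6
print(characteristic(z6, add6, 0))          # 6
print(characteristic(z2z2, add22, (0, 0)))  # 2
```

In both examples the characteristic coincides with the additive order of the multiplicative identity (1 in the first ring, (1, 1) in the second).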
Over the next couple of chapters we will derive a number of impor-
tant facts about the characteristic of various types of rings, but for
the moment we can state and prove the following result about the
characteristic of unital rings.
Proposition 8.19 Let R be a unital ring. If the multiplicative identity 1
has infinite additive order in R, then char( R) = 0; otherwise char( R) is
equal to the additive order of 1.

Proof Clearly if 1 has infinite additive order, then there is no $n \in \mathbb{N}$
such that $n1 = 0$, so by Definition 8.18, $\mathrm{char}(R) = 0$.
If 1 has additive order n, then $nr = (n1) \cdot r = 0 \cdot r = 0$ for any $r \in R$,
and n is the smallest natural number with this property.

Table 8.5 lists the characteristics of most of the rings we've met so far.

    R              |R|             char(R)
    -------------------------------------
    Z              $\aleph_0$      0
    Q              $\aleph_0$      0
    R              $2^{\aleph_0}$  0
    C              $2^{\aleph_0}$  0
    H              $2^{\aleph_0}$  0
    Z_n            n               n
    2Z             $\aleph_0$      0
    Z_2[i]         4               2
    Z_2[ω]         4               2
    Z_2 ⊕ Z_2      4               2

Table 8.5: Orders and characteristics of some rings

We'll end this section with a result sometimes called the Freshman's,
Schoolboy's or Child's Binomial Theorem.
Proposition 8.20 Let R be a commutative ring of prime characteristic p.
Then for any $a, b \in R$,
    $(a + b)^p = a^p + b^p.$

Proof By the usual binomial theorem, the coefficient of the term
$a^n b^{p-n}$ is $\binom{p}{n} = \frac{p!}{n!\,(p-n)!}$. When $0 < n < p$ this coefficient is divisible
by p, and hence is zero in R. Thus
    $(a + b)^p = a^p + 0a^{p-1}b + \cdots + 0ab^{p-1} + b^p = a^p + b^p$
as claimed.
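Both steps of the proof can be spot-checked numerically in the rings $\mathbb{Z}_n$, which have characteristic n. The following is a minimal sketch with hypothetical helper names; it tests a handful of small moduli and is not a proof.

```python
from math import comb

def middle_coefficients_vanish(p):
    """Check that p divides C(p, n) for every 0 < n < p."""
    return all(comb(p, n) % p == 0 for n in range(1, p))

def freshman_holds(n):
    """Check (a + b)^n == a^n + b^n for all a, b in Z_n."""
    return all(pow(a + b, n, n) == (pow(a, n, n) + pow(b, n, n)) % n
               for a in range(n) for b in range(n))

print([p for p in (2, 3, 5, 7, 11) if freshman_holds(p)])  # [2, 3, 5, 7, 11]
print(freshman_holds(4), freshman_holds(6))                # False False
```

The identity can fail when the characteristic is composite: in $\mathbb{Z}_4$ we have $(1+1)^4 = 0$ but $1^4 + 1^4 = 2$.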

8.2 Matrices

For this purpose we must commence, not with a square, but with an oblong
arrangement of terms consisting, suppose, of m lines and n columns. This
will not in itself represent a determinant, but is, as it were, a Matrix out of
which we may form various systems of determinants by fixing upon a number
p, and selecting at will p lines and p columns, the squares corresponding to
which may be termed determinants of the pth order.
James Joseph Sylvester (1814–1897),
Additions to the articles "On a new Class of Theorems" and "On Pascal's Theorem",
Philosophical Magazine 37 (1850) 363–370

Some of the first examples of nonabelian groups we met were
the matrix groups $GL_n(\mathbb{R})$ and its various subgroups such as $O_n(\mathbb{R})$,
$SO_n(\mathbb{R})$ and $SL_n(\mathbb{R})$, and it seems natural to ask whether this might
be a source of interesting noncommutative rings.

Considering square matrices over some ring R, it's certainly the case
that matrix addition is commutative and matrix multiplication is
associative, so by carefully choosing a set of matrices to ensure closure
under addition and multiplication, existence of additive identity and
inverses, and so forth, we should be able to construct rings of matrices.

Example 8.21 Let R be a ring, and denote by $M_n(R)$ the set of $n \times n$
matrices with entries from R. Then $M_n(R)$ is a ring.
In particular, $M_2(\mathbb{Z}_3)$ is the ring of $2 \times 2$ matrices with entries from
$\mathbb{Z}_3$. It has order $3^4 = 81$ and characteristic 3.
If we allow R to be nonunital, then $M_n(R)$ is also a nonunital ring.
Thus $M_3(2\mathbb{Z})$ is the nonunital ring of $3 \times 3$ matrices with even integer
entries. It has (countably) infinite order and characteristic zero.
The ring $M_n(R)$ is commutative if $n = 1$ and R is itself commutative:
in this case $M_1(R) \cong R$ via the obvious map $[r] \mapsto r$. However,
$M_n(R)$ is in general noncommutative for $n > 1$.
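The claims about $M_2(\mathbb{Z}_3)$ are small enough to confirm by enumeration. The following is a minimal sketch, assuming a $2 \times 2$ matrix is stored row by row as a 4-tuple `(a, b, c, d)`; the names `mat_add` and `mat_mul` are our own.

```python
from itertools import product

P = 3  # entries are taken modulo 3

def mat_add(A, B):
    return tuple((x + y) % P for x, y in zip(A, B))

def mat_mul(A, B):
    # Matrices are stored row-major: (a, b, c, d) is [[a, b], [c, d]]
    a, b, c, d = A
    e, f, g, h = B
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)

M2 = list(product(range(P), repeat=4))
ZERO = (0, 0, 0, 0)

print(len(M2))  # 81

# Characteristic 3: A + A + A = 0 for every A, and the identity has additive order 3
print(all(mat_add(A, mat_add(A, A)) == ZERO for A in M2))  # True

# Noncommutativity: search for a pair with AB != BA
A, B = next((A, B) for A in M2 for B in M2 if mat_mul(A, B) != mat_mul(B, A))
print(A, B)
```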
In addition to the full matrix rings $M_n(R)$, some important subsets
also form rings.

Example 8.22 Let R be a ring, and denote by $UT_n(R)$ the set of $n \times n$
upper triangular matrices; that is, $n \times n$ matrices which have all zero
entries below the leading diagonal. More formally,
    $UT_n(R) = \{ A \in M_n(R) : A_{ij} = 0 \text{ for } i > j \}.$
Similarly,
    $LT_n(R) = \{ A \in M_n(R) : A_{ij} = 0 \text{ for } i < j \}$
is the set of lower triangular matrices, for which all the entries above
the leading diagonal are zero.
Both of these sets form rings under the usual matrix addition and
multiplication operations.
Finally, let
    $D_n(R) = \{ A \in M_n(R) : A_{ij} = 0 \text{ for } i \neq j \}$
be the set of diagonal matrices; this also forms a ring.
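Closure is the part of this example most easily overlooked, so here is a minimal sketch that checks it exhaustively for $3 \times 3$ upper triangular matrices over $\mathbb{Z}_2$. The choice of size and modulus, and all function names, are assumptions made for illustration.

```python
from itertools import product

N, MOD = 3, 2  # 3x3 matrices with entries modulo 2

def mat_add(A, B):
    return tuple(tuple((A[i][j] + B[i][j]) % MOD for j in range(N))
                 for i in range(N))

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(N)) % MOD
                       for j in range(N)) for i in range(N))

def is_upper_triangular(A):
    # Zero below the leading diagonal: row index i greater than column index j
    return all(A[i][j] == 0 for i in range(N) for j in range(N) if i > j)

def all_matrices():
    for entries in product(range(MOD), repeat=N * N):
        yield tuple(tuple(entries[i * N:(i + 1) * N]) for i in range(N))

UT = [A for A in all_matrices() if is_upper_triangular(A)]
closed = all(is_upper_triangular(mat_add(A, B)) and
             is_upper_triangular(mat_mul(A, B))
             for A in UT for B in UT)

print(len(UT), closed)  # 64 True
```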

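In fact, the ring $D_n(R)$ of diagonal $n \times n$ matrices over R is isomorphic
to the n-fold direct sum $R^n = R \oplus \cdots \oplus R$, via the map

    $\begin{pmatrix}
      r_1 & 0 & 0 & \cdots & 0 \\
      0 & r_2 & 0 & \cdots & 0 \\
      0 & 0 & r_3 & \cdots & 0 \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      0 & 0 & 0 & \cdots & r_n
    \end{pmatrix} \mapsto (r_1, r_2, r_3, \ldots, r_n).$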
From linear algebra we're used to using matrices to represent linear
maps from one vector space to another. In particular, an $n \times n$ real
matrix represents a linear map $\mathbb{R}^n \to \mathbb{R}^n$. More generally, an
element of $M_n(R)$ (or, for that matter, $UT_n(R)$, $LT_n(R)$ or $D_n(R)$)
can be regarded as a homomorphism from $R^n$ to $R^n$. Recall from
Definition 8.14 that a ring homomorphism from a ring to itself is
called an endomorphism, so every element of $M_n(R)$ is really just an
endomorphism of $R^n$.
In particular, a given $1 \times 1$ matrix over R corresponds to an endomorphism
of R. The way this works is that a matrix $[r] \in M_1(R)$ acts
on an element $s \in R$ by multiplication on the left: $[r]s = r \cdot s$. This
determines a unique endomorphism $f_r$ of R given by $f_r(s) = r \cdot s$.
More generally, given any two endomorphisms $f_1$ and $f_2$ of R, we can
define two new endomorphisms $(f_1 + f_2)$ and $(f_1 \cdot f_2)$ by
    $(f_1 + f_2)(r) := f_1(r) + f_2(r)$ and $(f_1 \cdot f_2)(r) := f_1(f_2(r)) = (f_1 \circ f_2)(r)$
for all $r \in R$. This leads to the following important example.
Example 8.23 Let $\mathrm{End}(R) = \{ f : R \to R \}$ be the set of all endomorphisms
of a ring R. Then this set forms a ring, the endomorphism
ring of R, under the addition and multiplication operations defined
above.
The additive identity of this ring is the zero endomorphism $z : R \to
R$ mapping everything to the zero element $0 \in R$, and for any
endomorphism f there is a corresponding endomorphism $-f$ given
by $(-f)(r) = -(f(r))$ for all $r \in R$. Addition of endomorphisms
is clearly associative, because addition in R is. Multiplication is
associative too, because composition of endomorphisms is. Finally,
the left and right distributive laws hold because
    $(f_1 \cdot (f_2 + f_3))(r) = f_1(f_2(r) + f_3(r))$
                              $= f_1(f_2(r)) + f_1(f_3(r))$
                              $= (f_1 \cdot f_2)(r) + (f_1 \cdot f_3)(r)$
                              $= ((f_1 \cdot f_2) + (f_1 \cdot f_3))(r)$
for all $r \in R$. The right distributivity law follows by a very similar
argument.
Finally, $\mathrm{End}(R)$ has a multiplicative identity, the identity endomorphism
$\mathrm{id} : R \to R$.

More generally, we can construct endomorphism rings of other objects.
The next example will resurface a little later when we discuss subrings.

Example 8.24 Let A be an abelian group. Then the set $\mathrm{End}(A)$ of
all abelian group endomorphisms $f : A \to A$ forms a ring under the
addition and composition operations discussed earlier.
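For a small abelian group the ring $\mathrm{End}(A)$ can be listed explicitly. The following is a minimal sketch for the cyclic group $A = \mathbb{Z}_4$ under addition, assuming a function $f : A \to A$ is stored as the tuple `(f(0), f(1), f(2), f(3))`; the names are illustrative only.

```python
from itertools import product

N = 4
A = range(N)  # the cyclic group Z_4 under addition modulo 4

def is_endomorphism(f):
    # f is a tuple with f[a] the image of a; require f(a + b) = f(a) + f(b)
    return all(f[(a + b) % N] == (f[a] + f[b]) % N for a in A for b in A)

END = [f for f in product(A, repeat=N) if is_endomorphism(f)]

def add(f, g):
    # Pointwise addition: (f + g)(a) = f(a) + g(a)
    return tuple((f[a] + g[a]) % N for a in A)

def mul(f, g):
    # Composition: (f . g)(a) = f(g(a))
    return tuple(f[g[a]] for a in A)

closed = all(add(f, g) in END and mul(f, g) in END for f in END for g in END)
left_distributive = all(mul(f, add(g, h)) == add(mul(f, g), mul(f, h))
                        for f in END for g in END for h in END)
right_distributive = all(mul(add(f, g), h) == add(mul(f, h), mul(g, h))
                         for f in END for g in END for h in END)

print(len(END), closed, left_distributive, right_distributive)  # 4 True True True
```

Each of the four endomorphisms found is multiplication by a fixed element k of $\mathbb{Z}_4$, since an additive map on a cyclic group is determined by the image of 1.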

8.3 Polynomials

He cited the quadratic equation as an example of the sort of irrelevant topic
that pupils study. I had hoped that the Government would make a robust
rebuttal, but there was no defence either of mathematics in general or the
quadratic equation in particular.
Tony McWalter MP,
House of Commons debate on Quadratic Equations. Hansard HC
Deb 26 June 2003 407:1259

So far, we've met a number of examples of rings, commutative and
noncommutative, unital and nonunital, formed from numbers and
matrices of various types. The next simplest class of mathematical
objects that most of us meet are polynomial expressions. We typically
meet these in secondary school, initially as linear or quadratic expressions
such as $3x - 2$ or $x^2 - 4x + 3$, and quickly learn how to add and
multiply them. We also learn how to find their roots, but we'll leave
that aspect for the moment and concentrate just on their additive and
multiplicative behaviour.

Addition of polynomials is associative and commutative, there is a
zero polynomial $0 = 0 + 0x + 0x^2 + \cdots$ which leaves unchanged any
other polynomial we add it to, and every polynomial $p(x)$ has a
unique negative counterpart $-p(x)$. So polynomials certainly exhibit
an abelian group structure.

We can also multiply polynomials together; this operation is associative
and has an identity element $1 = 1 + 0x + 0x^2 + \cdots$. Furthermore,
addition and multiplication of polynomials obey the left and right
distributive rules. This leads us to the following key example.
Example 8.25 Let x be a formal variable (an unknown or indeterminate).
The set
    $\mathbb{Z}[x] = \{ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n : a_0, \ldots, a_n \in \mathbb{Z}, n \in \mathbb{Z}_{\geq 0} \}$
forms a commutative ring under the usual addition and multiplication
operations.

There is an important detail that we must be careful not to gloss over
at this point. Each polynomial in $\mathbb{Z}[x]$ is a $\mathbb{Z}$-linear combination of
nonnegative powers of the variable x, and must therefore have only a
finite number of nonzero terms. To make this a little more precise, we
introduce the following definition.

Definition 8.26 Let $p(x) = a_0 + a_1 x + \cdots + a_n x^n$ be a polynomial in
$\mathbb{Z}[x]$ such that $a_n \neq 0$. The degree of p, denoted $\deg(p)$, is the highest
power of the variable x with nonzero coefficient; that is, $\deg(p) = n$.
The degree of a quadratic polynomial $a_2 x^2 + a_1 x + a_0$ is therefore 2,
that of a linear polynomial $a_1 x + a_0$ is 1, and the degree of a nonzero
constant element $a_0 \in \mathbb{Z}[x]$ is 0.
The degree of the zero polynomial 0 is usually left undefined, although
some books set it by convention to either $-1$ or $-\infty$.
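The "usual" operations and the degree can be made concrete by storing a polynomial as its list of coefficients. The following is a minimal sketch under that assumption, with `[a0, a1, ..., an]` standing for $a_0 + a_1 x + \cdots + a_n x^n$; the function names are hypothetical, and the degree of the zero polynomial is returned as `None` to reflect that it is left undefined.

```python
def trim(p):
    """Drop trailing zero coefficients so the last entry is the leading one."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a + b for a, b in zip(p, q))

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b  # x^i * x^j contributes to x^(i+j)
    return trim(out)

def degree(p):
    p = trim(p)
    return len(p) - 1 if p else None  # zero polynomial: degree left undefined

linear = [-2, 3]         # 3x - 2
quadratic = [3, -4, 1]   # x^2 - 4x + 3

print(poly_add(linear, quadratic))  # [1, -1, 1], that is x^2 - x + 1
print(poly_mul(linear, quadratic))  # [-6, 17, -14, 3], that is 3x^3 - 14x^2 + 17x - 6
print(degree(poly_mul(linear, quadratic)))  # 3
```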

One generalisation of Example 8.25 is to let R be any ring and let

    $R[x] = \{ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n : a_0, \ldots, a_n \in R, n \in \mathbb{Z}_{\geq 0} \}$

denote the ring of polynomials in x with coefficients in R. This ring
$R[x]$ is commutative if and only if R is commutative, and unital if
and only if R is unital. The rings $\mathbb{Z}[x]$ and $\mathbb{Q}[x]$ of polynomials
with, respectively, integer and rational coefficients, are particularly
interesting from an algebraic point of view: the search for general
strategies for finding roots of polynomials in $\mathbb{Q}[x]$ led to the study of
field extensions and the formulation of the machinery we now know
as Galois Theory, named in memory of one of its principal discoverers.
Polynomial rings of the form $\mathbb{Z}_p[x]$, where p is prime, have important
applications in cryptography and coding theory.
We can allow negative powers of the variables as well; this leads to the
following class of rings:

Example 8.27 Let $\mathbb{Z}[x^{\pm 1}]$ be the ring of polynomials of the form
    $a_{-m} x^{-m} + a_{-m+1} x^{-m+1} + \cdots + a_{-1} x^{-1} + a_0 + a_1 x + \cdots + a_n x^n$
with $a_{-m}, \ldots, a_n \in \mathbb{Z}$. This is the ring of Laurent polynomials in x.

Laurent polynomials are of particular relevance in knot theory: the
Alexander polynomial of a knot or link is an element of $\mathbb{Z}[t^{\pm 1}]$.
Rings of multivariate polynomials (polynomials with more than one
unknown) play a fundamental role in the field of algebraic geometry.
A detailed account is well beyond the scope of this book, but we can
at least define them and study some of their basic properties.

Definition 8.28 Let R be a ring, let $x_1, \ldots, x_k$ be unknowns, and let
$n_1, \ldots, n_k$ be nonnegative integers. Then a formal product of the
form $x_1^{n_1} \cdots x_k^{n_k}$ is a monomial. A (multivariate) polynomial over R
(or with coefficients in R) is an R-linear combination of monomials.
The degree of a monomial is the sum $n_1 + \cdots + n_k$ of its exponents,
and the degree of a given multivariate polynomial is the maximal
degree of all its constituent monomials with nonzero coefficients.

We can now construct rings of multivariate polynomials.

Example 8.29 Let R be a ring, and let $x_1, \ldots, x_k$ be formal variables.
The set $R[x_1, \ldots, x_k]$ of all R-linear combinations of finite-degree
monomials in $x_1, \ldots, x_k$ forms a ring under the usual addition and
multiplication operations.
In particular, we assume that the unknowns $x_1, \ldots, x_k$ commute with
each other, so that
    $(a x_1^{n_1} \cdots x_k^{n_k}) \cdot (b x_1^{m_1} \cdots x_k^{m_k}) = (a \cdot b) x_1^{m_1+n_1} \cdots x_k^{m_k+n_k}.$

Another way of generalising the ring $R[x]$ is to allow infinite-degree
polynomials. This leads to the following example.

Example 8.30 Let x be an unknown and let R be a ring. Then a
power series over R is a formal sum
    $\sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots$
where $a_0, a_1, \ldots \in R$. The set $R[[x]]$ of all power series over R forms a
ring under the following addition and multiplication operations:
    $\left( \sum_{i=0}^{\infty} a_i x^i \right) + \left( \sum_{i=0}^{\infty} b_i x^i \right) = \sum_{i=0}^{\infty} (a_i + b_i) x^i$
    $\left( \sum_{i=0}^{\infty} a_i x^i \right) \cdot \left( \sum_{i=0}^{\infty} b_i x^i \right) = \sum_{i=0}^{\infty} \left( \sum_{j=0}^{i} a_j b_{i-j} \right) x^i$

Margin note: For polynomial rings we use single brackets, while for power
series rings we use double brackets. So, for example, the ring of polynomials
with integer coefficients is denoted $\mathbb{Z}[x]$, while the corresponding ring of
power series is denoted $\mathbb{Z}[[x]]$.
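Although a power series has infinitely many coefficients, each coefficient of a product depends on only finitely many coefficients of the factors, so any initial segment can be computed exactly. The following is a minimal sketch, assuming a power series is represented by a function sending the index i to the coefficient $a_i$; the names are our own.

```python
def series_mul(a, b, terms):
    """First `terms` coefficients of the product: c_i = sum over j of a_j * b_(i-j)."""
    return [sum(a(j) * b(i - j) for j in range(i + 1)) for i in range(terms)]

def geometric(i):
    # 1 + x + x^2 + x^3 + ...
    return 1

def one_minus_x(i):
    # 1 - x, regarded as a power series with all later coefficients zero
    return {0: 1, 1: -1}.get(i, 0)

print(series_mul(one_minus_x, geometric, 6))  # [1, 0, 0, 0, 0, 0]
print(series_mul(geometric, geometric, 6))    # [1, 2, 3, 4, 5, 6]
```

The first line illustrates a purely formal fact: in $\mathbb{Z}[[x]]$ the series $1 + x + x^2 + \cdots$ is a multiplicative inverse of $1 - x$, with no question of convergence arising.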

You have probably met power series before, in the context of Taylor's
Theorem and Taylor–MacLaurin series, most likely in a course on
calculus or analysis. When studying these objects from an analytical
viewpoint, we must be careful to consider questions of convergence,
and indeed there are a number of standard analytical results such
as d'Alembert's Ratio Test and the Alternating Series Test to answer
these questions; now, however, we are viewing them purely as formal
algebraic objects and can cheerfully sidestep such analytical concerns.

Nevertheless, the first time we meet polynomials and power series, we
typically study them as functions: we are interested in what happens
when we set the variable x to a particular value. We will return
to this question in the next chapter, when we meet the evaluation
homomorphism $f_a : R[x] \to R$ determined by a given element $a \in R$
(Example 9.20, page 369).
We end this section with a class of rings that are in some sense a
generalisation of the polynomial rings $R[x]$.

Example 8.31 Let G be a group, which for notational convenience
we write multiplicatively. Then the (integral) group ring is the set
    $\mathbb{Z}G = \left\{ \sum_{g \in G} n_g g : \text{only finitely many } n_g \neq 0 \right\}$
consisting of finite $\mathbb{Z}$-linear combinations of elements of G.
The addition operation in $\mathbb{Z}G$ is the obvious one, namely
    $\sum_{g \in G} m_g g + \sum_{g \in G} n_g g = \sum_{g \in G} (m_g + n_g) g$
and the multiplication operation is
    $\left( \sum_{g \in G} m_g g \right) \cdot \left( \sum_{g \in G} n_g g \right) = \sum_{g \in G} \left( \sum_{hk = g} m_h n_k \right) g.$
That is, we multiply two sums in the usual way, combining terms
by using the ordinary multiplication operation in $\mathbb{Z}$ and the group
multiplication operation accordingly: $n_g g \cdot m_h h = (n_g m_h)(g \cdot h)$.
We can generalise this by replacing the coefficient ring $\mathbb{Z}$ with some
other ring R to get $RG$.

The group ring is important in representation theory and homological
algebra. We will revisit it in Section 8.A, where we will see that
modules over $\mathbb{Z}G$ are exactly the G-modules discussed in Section 7.5
(Definition 7.90, page 285).

Example 8.32 Let $G = \langle t : t^2 = 1 \rangle \cong \mathbb{Z}_2$. Then $\mathbb{R}G$ consists of
elements of the form $a + bt$ where $a, b \in \mathbb{R}$ and
    $(a + bt) + (c + dt) = (a + c) + (b + d)t,$
    $(a + bt) \cdot (c + dt) = (ac + bd) + (ad + bc)t.$
This is the ring of split complex, perplex or hyperbolic numbers.
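The group ring product is a convolution over the group, which is easy to write out for a finite cyclic group. The following is a minimal sketch with integer coefficients, assuming an element is stored as a dictionary mapping the exponent h of the generator t to the coefficient of $t^h$; the function name and representation are hypothetical.

```python
from collections import defaultdict

def group_ring_mul(x, y, n):
    """Multiply two elements of the group ring of the cyclic group of order n.

    Elements are dictionaries {exponent: coefficient}; exponents add modulo n.
    """
    out = defaultdict(int)
    for h, m in x.items():
        for k, c in y.items():
            out[(h + k) % n] += m * c  # (m t^h)(c t^k) = (mc) t^(h+k)
    return {g: coeff for g, coeff in out.items() if coeff != 0}

# With n = 2 this reproduces (a + bt)(c + dt) = (ac + bd) + (ad + bc)t
print(group_ring_mul({0: 2, 1: 3}, {0: 5, 1: 7}, 2))   # {0: 31, 1: 29}

# (1 + t)(1 - t) = 1 - t^2 = 0, since t^2 = 1
print(group_ring_mul({0: 1, 1: 1}, {0: 1, 1: -1}, 2))  # {}
```

The second product shows two nonzero elements multiplying to give zero, so this ring cannot be a field.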
8.4 Fields

Division findeth how oft one Number is contained in another. The Number
divided, (or which is consider'd as the containing number) is called the
Dividend; the Number dividing (or which is considered as contained) is called
the Divisor; and the Number sought, is called the Quotient, or Quote, (from
Quoties, How oft;) because it shews the how oft sought.
Alexander Malcolm (1685–1763),
A New System of Arithmetick, Theorical and Practical (1730) 52

At the beginning of this chapter we compiled a list of properties
a multiplication operation might satisfy. So far, we've looked at the
existence of an identity element, and also considered examples where
the commutativity condition might or might not be satisfied. The
other obvious question to ask is whether some or all elements of a
ring have a multiplicative inverse.

For any nontrivial ring R, at least one element isn't invertible: the
additive identity 0, since there is no element $r \in R$ such that $r \cdot 0 = 1$,
as $r \cdot 0 = 0$ for all $r \in R$ by Proposition 8.8 (i).
In the ring $\mathbb{Z}_2[i]$, the element $1+i$ doesn't have an inverse, but 1 and $i$
both do. In $\mathbb{Z}_2[\omega]$, however, every element apart from 0 is invertible:
    $1^{-1} = 1, \quad \omega^{-1} = \omega^2, \quad (\omega^2)^{-1} = \omega.$
So, in general, 0 isn't invertible (except in the case of the trivial ring), 1
is invertible (it's its own inverse) and some or all of the other elements
might be invertible. We give invertible elements a special name:

Definition 8.33 Let R be a (unital) ring. If $r \in R$ is invertible, in
the sense that there exists some element $s = r^{-1} \in R$ such that
$r \cdot s = s \cdot r = 1$, then we say that r is a unit (and so is s).

The units of a ring have some interesting properties: they are closed
under multiplication and include the identity, leading to the following.

Proposition 8.34 Let $U(R)$ be the set of units of a (unital) ring R. Then
$U(R)$ forms a group, the group of units, under multiplication.

Proof The unity element $1 \in R$ is contained in $U(R)$ since $1 \cdot 1 = 1$.
For any two units $a, b \in U(R)$ the product $a \cdot b$ is also in $U(R)$ since
$a \cdot b$ has inverse $(a \cdot b)^{-1} = b^{-1} \cdot a^{-1}$. If $a \in U(R)$ then its inverse $a^{-1}$
must also be in $U(R)$. Finally, multiplication in $U(R)$ is associative
because it is in R. Hence $U(R)$ is a group.
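For the finite rings $\mathbb{Z}_n$ the group of units can be found by direct search, and the closure properties used in the proof can be checked exhaustively. The following is a minimal sketch with hypothetical function names, intended for $n \geq 2$.

```python
def units(n):
    """Elements of Z_n that have a multiplicative inverse, found by search."""
    return [a for a in range(n) if any(a * b % n == 1 for b in range(n))]

def is_closed_under_multiplication(n):
    u = set(units(n))
    return all(a * b % n in u for a in u for b in u)

def inverses_are_units(n):
    u = set(units(n))
    return all(any(a * b % n == 1 for b in u) for a in u)

print(units(6))  # [1, 5]
print(units(8))  # [1, 3, 5, 7]
print(all(is_closed_under_multiplication(n) and inverses_are_units(n)
          for n in range(2, 40)))  # True
```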
We've met this concept already, back in Section 2.3 (Definition 2.41 and
Example 2.42, page 62), when we studied the multiplicative groups $\mathbb{Z}_n^{\times}$,
which are exactly the groups $U(\mathbb{Z}_n)$. (Some books denote $U(R)$ by $R^{\times}$ for
this reason.)

Example 8.35 The ring $\mathbb{Z}$ of integers has only two invertible elements:
1 and $-1$. Hence $U(\mathbb{Z}) = \{\pm 1\} \cong \mathbb{Z}_2$.

Example 8.36 The polynomial ring $\mathbb{Z}[t]$ also has only two invertible
elements: 1 and $-1$. Thus $U(\mathbb{Z}[t]) = \{\pm 1\} \cong \mathbb{Z}_2$.
The ring $\mathbb{Z}[t^{\pm 1}]$ of Laurent polynomials over $\mathbb{Z}$ has infinitely many
units: any element of the form $\pm t^n$, where $n \in \mathbb{Z}$, is invertible (its
inverse is $\pm t^{-n}$). Hence $U(\mathbb{Z}[t^{\pm 1}]) = \{\pm t^n : n \in \mathbb{Z}\} \cong \mathbb{Z}_2 \oplus \mathbb{Z}$.
Example 8.37 The invertible elements in the ring $M_n(\mathbb{R})$ are exactly
the $n \times n$ real matrices with nonzero determinant. Hence
$U(M_n(\mathbb{R})) = GL_n(\mathbb{R})$.

Example 8.38 The ring $\mathbb{Q}$ of rational numbers consists of all quotients
of the form $\frac{a}{b}$ where $b \neq 0$. The multiplicative inverse of $\frac{a}{b}$ is
obviously $\frac{b}{a}$, which is only defined if $a \neq 0$. Hence $U(\mathbb{Q}) = \mathbb{Q}^*$, the
multiplicative group of nonzero rational numbers.
Similarly, $U(\mathbb{R}) = \mathbb{R}^*$ and $U(\mathbb{C}) = \mathbb{C}^*$.

This last example displays the best case scenario: everything is a unit
except for 0 (which definitely can't be). Rings satisfying this criterion
are particularly versatile to work with: we can divide arbitrary nonzero
elements by each other as well as adding, subtracting and multiplying
them. We can legitimately ask (and answer) a wider range of questions
in a ring with this property, such as "what gives 1 when multiplied by
3?". It therefore makes sense to give this concept a specific name.

Definition 8.39 A division ring is a ring R such that $U(R) = R^* =
R \setminus \{0\}$. That is, every nonzero element $r \in R$ is a unit.

This isn't to say that rings without this property aren't interesting. On
the contrary, such rings often display interestingly awkward behaviour.
Even nicer is the case where such a ring is commutative as well:

Definition 8.40 A field is a commutative division ring. A noncommutative
division ring is called a skew field.

Many of the number systems we're most familiar with are fields. In
fact, in some sense the point of the rational numbers $\mathbb{Q}$ is that they are
the integers $\mathbb{Z}$ made as invertible as possible. We'll come back to this
idea in a little while when we meet the concept of a field of quotients,
but for the moment here are some examples of fields and skew fields.

Example 8.41 The rational numbers $\mathbb{Q}$, the real numbers $\mathbb{R}$ and the
complex numbers $\mathbb{C}$ are all fields.
The quaternions $\mathbb{H}$ form a skew field.

We met another important class of fields in Chapter 3 (Definition 3.45,
page 96):

Example 8.42 Let p be prime. Then the set $\mathbb{F}_p = \{0, \ldots, p-1\}$ forms
a field (the finite field of order p) under addition and multiplication
modulo p.

In a little while we will prove that $\mathbb{F}_p$ is indeed a field. Finite fields
have particularly important applications in cryptography and coding
theory. Some forms of public key encryption rely on the computational
difficulty of problems such as computing discrete logarithms in finite
fields, while
many useful error-correcting codes are defined in terms of vector
subspaces over finite fields. Later, in Chapter 11, we will see that finite
fields of order $q = p^n$ both exist and are unique up to isomorphism,
for any prime p and $n \in \mathbb{N}$.
After learning basic arithmetic, one of the next things we learn how to
do is to factorise simple polynomials and thereby find their roots.
For example, the quadratic polynomial $p(x) = x^2 - 3x + 2$ can be
factorised as $(x - 1)(x - 2)$, at which point we can readily see that its
roots (the values of x for which $p(x) = 0$) are 1 and 2. In doing so
we are making use of a fundamental property of the number system
we're working in, whether that be $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$.

Usually, for $(x - 1)(x - 2)$ to equal zero it must follow that either
$(x - 1) = 0$, in which case $x = 1$, or $(x - 2) = 0$, in which case
$x = 2$. But we have already met several examples of rings in which
this doesn't hold. For example, in the ring $\mathbb{Z}_6$ (see Table 8.6) we have
$2 \cdot 3 = 0$. So although it's always true (from Proposition 8.8(i)) that
$a \cdot b = 0$ if either $a = 0$ or $b = 0$, there are certainly some rings in
which the converse doesn't hold.

    + | 0 1 2 3 4 5
    --+------------
    0 | 0 1 2 3 4 5
    1 | 1 2 3 4 5 0
    2 | 2 3 4 5 0 1
    3 | 3 4 5 0 1 2
    4 | 4 5 0 1 2 3
    5 | 5 0 1 2 3 4

    · | 0 1 2 3 4 5
    --+------------
    0 | 0 0 0 0 0 0
    1 | 0 1 2 3 4 5
    2 | 0 2 4 0 2 4
    3 | 0 3 0 3 0 3
    4 | 0 4 2 0 4 2
    5 | 0 5 4 3 2 1

Table 8.6: Addition and multiplication tables for the ring $\mathbb{Z}_6$

Definition 8.43 Let $a \in R$ be a nonzero element of some ring R.
Then a is a left zero divisor if there exists some nonzero $b \in R$ such that
$a \cdot b = 0$, and a right zero divisor if there exists some nonzero $c \in R$ such
that $c \cdot a = 0$.
If a is both a left and right zero divisor then it is said to be a two-sided
zero divisor, or simply a zero divisor.

Using this terminology, we see from Table 8.6 that $2, 3, 4 \in \mathbb{Z}_6$ are all
zero divisors, but 1 and 5 aren't (and by convention neither is 0).
More generally, there is a simple criterion governing whether a particular
nonzero element of a ring $\mathbb{Z}_n$ is a zero divisor:

Proposition 8.44 A nonzero element $m \in \mathbb{Z}_n$ is a zero divisor exactly
when it is not coprime to n; that is, when $\gcd(m, n) > 1$.

Proof Suppose that $\gcd(m, n) = d > 1$. Then
    $m \cdot \frac{n}{d} = \frac{m}{d} \cdot n \equiv 0 \pmod{n}.$
Hence $m \cdot \frac{n}{d} = 0$ in $\mathbb{Z}_n$, but neither $m = 0$ nor $\frac{n}{d} = 0$, so they are both
zero divisors.
Conversely, suppose that $\gcd(m, n) = 1$, and that there exists some
$k \in \mathbb{Z}_n$ for which $mk = 0$. This means that n divides $mk$, and hence by
Proposition A.40, since m and n are coprime, n must divide k instead.
Thus $k \equiv 0 \pmod{n}$, which means that $k = 0$ in $\mathbb{Z}_n$, and therefore m
is not a zero divisor.
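The criterion can be compared against a direct search for small moduli. The following is a minimal sketch (a numerical check rather than a proof), with function names of our own choosing.

```python
from math import gcd

def zero_divisors(n):
    """Nonzero m in Z_n for which some nonzero k gives m*k = 0."""
    return [m for m in range(1, n)
            if any(m * k % n == 0 for k in range(1, n))]

def not_coprime(n):
    """Nonzero m in Z_n with gcd(m, n) > 1."""
    return [m for m in range(1, n) if gcd(m, n) > 1]

print(zero_divisors(6))  # [2, 3, 4]
print(zero_divisors(7))  # []
print(all(zero_divisors(n) == not_coprime(n) for n in range(2, 60)))  # True
```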
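Matrix rings, the source of so many interesting examples, also yield
instances of zero divisors:

Example 8.45 Let $A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \in M_2(\mathbb{Z})$. Then A is a left zero divisor
via $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ and a right zero divisor via $C = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.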
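The two products in this example are quick to verify. The following is a minimal sketch, assuming our reading of the matrices: A has a single 1 in the lower-left corner, B is the diagonal matrix diag(0, 1) and C is diag(1, 0).

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 0], [1, 0]]
B = [[0, 0], [0, 1]]
C = [[1, 0], [0, 0]]
ZERO = [[0, 0], [0, 0]]

print(mat_mul(A, B) == ZERO)  # True: A is a left zero divisor via B
print(mat_mul(C, A) == ZERO)  # True: A is a right zero divisor via C
print(mat_mul(B, A) == ZERO)  # False: the order of multiplication matters
```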
 

The property that $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ have that simplifies the process of
finding roots of polynomials is that there are no nonzero elements
that multiply to give zero. That is, they have no zero divisors.

Definition 8.46 An integral domain is a commutative (unital) ring
with no zero divisors.

Clearly not every integral domain is a field, since $\mathbb{Z}$ has the required
property too, but it so happens that every field is an integral domain:

Proposition 8.47 Let F be a field. Then F has no zero divisors, and is
therefore an integral domain.

Proof Suppose that $a, b \in F$ such that $a \cdot b = 0$ and $a \neq 0$. Then
    $b = 1 \cdot b = (a^{-1} \cdot a) \cdot b = a^{-1} \cdot (a \cdot b) = a^{-1} \cdot 0 = 0.$
Hence F has no zero divisors.
An important aspect of integral domains is that although they needn't
contain a full complement of multiplicative inverses, they do satisfy
left and right cancellation laws analogous to those in Proposition 1.15.

Proposition 8.48 Let R be an integral domain. Then for any $a, b, c \in R$
such that $a \neq 0$ and $a \cdot b = a \cdot c$ it follows that $b = c$.
Similarly, for any nonzero $d \in R$ such that $b \cdot d = c \cdot d$ it follows that
$b = c$.

Proof Suppose that $a \cdot b = a \cdot c$. Then $a \cdot b - a \cdot c = a \cdot (b - c) = 0$, and
since R, as an integral domain, has no zero divisors, and since $a \neq 0$, it
follows that $(b - c) = 0$ and hence $b = c$. The right-hand cancellation
law follows by a similar argument.

Proposition 8.47 tells us that all fields are integral domains, but obviously
not all integral domains are fields: for example, neither $\mathbb{Z}$
nor the polynomial rings $\mathbb{Z}[t]$ or $\mathbb{Z}[t^{\pm 1}]$ are fields, but are certainly
integral domains. These examples are infinite, though, so can we find
any finite integral domains that aren't fields? The answer is no:
Proposition 8.49 Let R be a finite integral domain. Then R is a field.

Proof Let $R = \{0, r_1, \ldots, r_n\}$ be a finite integral domain. Choose
$1 \leq i \leq n$ and consider the n products $r_i \cdot r_1, \ldots, r_i \cdot r_n$. By the left
cancellation law in Proposition 8.48 these must all be distinct, and
since R is an integral domain none of them can be equal to zero.
They are therefore the n nonzero elements of R in some order, so
exactly one of these products $r_i \cdot r_j$ must be equal to 1, and hence
$r_j = r_i^{-1}$. Repeating this argument $n-1$ further times, we find that
every $r_i \in R$ has a unique inverse, and hence R is a field.

Corollary 8.50 If p is prime, then $\mathbb{Z}_p$ is a field.

Proof By Proposition 8.44, the zero divisors in $\mathbb{Z}_n$ are those nonzero integers
$k \in \mathbb{Z}_n$ which are not coprime to n; that is, for which $\gcd(k, n) > 1$.
Since p is prime, there are no such nonzero integers $k \in \mathbb{Z}_p$ and hence $\mathbb{Z}_p$ is
an integral domain. Furthermore, it has finitely many elements, and
by Proposition 8.49 must be a field.
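The counting argument in the proof of Proposition 8.49 can be watched in action in $\mathbb{Z}_p$: the products of a fixed nonzero element with all the nonzero elements form a rearrangement of those elements, so 1 appears exactly once. The following is a minimal sketch with hypothetical function names.

```python
def products_with(a, n):
    """The products a*1, a*2, ..., a*(n-1) in Z_n."""
    return [a * r % n for r in range(1, n)]

def inverse_by_search(a, n):
    """Return the r with a*r = 1 in Z_n, or None if no such r exists."""
    products = products_with(a, n)
    return products.index(1) + 1 if 1 in products else None

print(products_with(3, 7))      # [3, 6, 2, 5, 1, 4]
print(inverse_by_search(3, 7))  # 5
print([inverse_by_search(a, 7) for a in range(1, 7)])  # [1, 4, 5, 2, 3, 6]
print(products_with(2, 6))      # [2, 4, 0, 2, 4]
print(inverse_by_search(2, 6))  # None
```

In $\mathbb{Z}_6$ the argument breaks down exactly where the proof says it should: the products of 2 are neither distinct nor all nonzero, because $\mathbb{Z}_6$ is not an integral domain.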
The first field most of us encounter is $\mathbb{Q}$, the field of rational numbers, although we don't usually know it by that name to start with. Typically, after learning about addition, subtraction and multiplication of integers, we learn about division. In many cases, one integer will divide exactly by another, leaving no remainder, but often it won't. In order to deal with such an eventuality, at least in circumstances where this might be important to us, we extend our concept of a number by assigning meaning to expressions such as $\frac{1}{7}$ and $\frac{355}{113}$. In effect, we construct a new ring containing $\mathbb{Z}$ as a unital subring by allowing expressions of this type; in particular we get a multiplicative inverse $n^{-1} = \frac{1}{n}$ for any nonzero integer $n \in \mathbb{Z}$.
We now want to generalise this construction to work with any integral domain, and to do so we will look carefully at exactly what's going on with the construction of the rationals from the integers.
To start with, every rational number $q \in \mathbb{Q}$ can be expressed as a fraction $\frac{a}{b}$ where $a, b \in \mathbb{Z}$ and $b \neq 0$. Also, we want $a$ and $b$ to be coprime, so that $\frac{a}{b}$ is in its simplest form. This is the same as requiring $\gcd(a, b) = 1$, which is fine for integers, but we want a construction that works in any integral domain, and we haven't yet decided what, if anything, we mean by a greatest common divisor in general. [9]
[9] In the next chapter we'll return to this question, in the process discovering that although we can certainly define a more general notion of greatest common divisor, such divisors don't necessarily exist in an arbitrary integral domain.
We want to ensure that every element of our new field is well-defined; for example, we regard $\frac{1}{2}$ to be the same as $\frac{2}{4}$. Given that we can't rely on greatest common divisors or coprimality, we need a slightly different approach. The answer is to define a suitable equivalence relation on the larger set of all pairs of numerators and nonzero denominators, and then mod out by that relation.
Suppose that $\frac{a}{b} = \frac{c}{d}$. Then by cross-multiplying, we see that this is equivalent to saying that $ad = bc$. So, we form the field $\mathbb{Q}$ of rational numbers by taking the Cartesian product
$$\mathbb{Z} \times (\mathbb{Z} \setminus \{0\}) = \{(a, b) : a, b \in \mathbb{Z},\ b \neq 0\}$$
and then quotienting by the relation $\sim$ where $(a, b) \sim (c, d)$ if $ad = bc$. The equivalence classes determined by this relation are called fractions, and we usually write them as $\frac{a}{b}$ or $a/b$.
The next thing we need to do is to decide how to add and multiply two fractions together, and also figure out how the ring $\mathbb{Z}$ embeds in this field.
Addition and multiplication are given by
$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} \quad\text{and}\quad \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd},$$
and the integers $\mathbb{Z}$ embed via the homomorphism $i : \mathbb{Z} \hookrightarrow \mathbb{Q}$ given by $i(n) = \frac{n}{1}$.
All of this works fine in any integral domain, leading to the following.
Definition 8.51 Let $R$ be an integral domain, and let
$$Q(R) = \bigl(R \times (R \setminus \{0\})\bigr)/\!\sim$$
be the set of equivalence classes of the relation $\sim$ defined by $(a, b) \sim (c, d)$ when $ad = bc$. We represent the equivalence class $[(a, b)]$ by $a/b$ or $\frac{a}{b}$. Then $Q(R)$ forms a field (the field of fractions or field of quotients) with the addition and multiplication operations
$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} \quad\text{and}\quad \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$$
for any $a, b, c, d \in R$ such that $b, d \neq 0$.
Furthermore, there is an injective ring homomorphism $i : R \hookrightarrow Q(R)$ mapping $r \mapsto r/1$ for all $r \in R$.

So, the field of fractions $Q(\mathbb{Z})$ of the integers $\mathbb{Z}$ is the field $\mathbb{Q}$ of rational numbers. This shouldn't be surprising, because we formulated the definition of $Q(R)$ precisely so that this would be the case.
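To see that the construction really does avoid greatest common divisors, here is a minimal sketch of $Q(\mathbb{Z})$ in Python. The class name `Frac` and the helper `embed` are hypothetical names chosen for this illustration. Fractions are stored as unreduced pairs and compared only through the cross-multiplication test $ad = bc$.

```python
class Frac:
    """An element a/b of Q(Z): a pair (a, b) with b != 0.
    Two fractions are equal when a*d == b*c; no reduction is performed."""

    def __init__(self, a, b=1):
        if b == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # (a, b) ~ (c, d) exactly when ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # a/b + c/d = (ad + bc)/(bd)
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # a/b * c/d = (ac)/(bd)
        return Frac(self.a * other.a, self.b * other.b)

    def __repr__(self):
        return f"{self.a}/{self.b}"


def embed(n):
    """The inclusion i : Z -> Q(Z), n |-> n/1."""
    return Frac(n, 1)


print(Frac(1, 2) == Frac(2, 4))                # True
print(Frac(1, 2) + Frac(1, 3))                 # 5/6
print(Frac(2, 3) * Frac(3, 4) == Frac(1, 2))   # True
print(embed(3) + embed(4) == embed(7))         # True
```

Because equality is defined by cross-multiplication, the unreduced product $6/12$ is recognised as the same fraction as $1/2$ without ever computing a common divisor.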
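Example 8.52 Let's calculate the field of fractions $Q(\mathbb{Z}_3)$ of the ring $\mathbb{Z}_3$. First, we take the set
$$\mathbb{Z}_3 \times (\mathbb{Z}_3 \setminus \{0\}) = \{(0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2)\}.$$
Now we mod out by the equivalence relation $\sim$ to find that
$$(0, 1) \sim (0, 2), \quad (1, 1) \sim (2, 2), \quad (1, 2) \sim (2, 1).$$
Hence
$$Q(\mathbb{Z}_3) = \left\{ \tfrac{0}{1}, \tfrac{1}{1}, \tfrac{2}{1} \right\}$$
which is just the image of $\mathbb{Z}_3$ under the inclusion homomorphism $i : \mathbb{Z}_3 \hookrightarrow Q(\mathbb{Z}_3)$. Hence $Q(\mathbb{Z}_3) \cong \mathbb{Z}_3$.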
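The equivalence classes in this example can be enumerated mechanically. The sketch below assumes a prime modulus $p$, so that $\mathbb{Z}_p$ is an integral domain and the relation is transitive; `fraction_classes` is a hypothetical helper name.

```python
def fraction_classes(p):
    """Group the pairs (a, b) with b != 0 over Z_p (p prime)
    into classes of the relation (a, b) ~ (c, d) iff ad = bc (mod p)."""
    pairs = [(a, b) for b in range(1, p) for a in range(p)]
    classes = []
    for (a, b) in pairs:
        for cls in classes:
            c, d = cls[0]                    # compare with a representative
            if (a * d - b * c) % p == 0:
                cls.append((a, b))
                break
        else:
            classes.append([(a, b)])         # start a new class
    return classes

for cls in fraction_classes(3):
    print(cls)
# [(0, 1), (0, 2)]
# [(1, 1), (2, 2)]
# [(2, 1), (1, 2)]
```

Each class contains exactly one pair with denominator $1$, which is another way of seeing that the inclusion of $\mathbb{Z}_3$ is surjective.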
This makes sense, since $\mathbb{Z}_3$ is already a field under the usual modulo-3 addition and multiplication operations. More generally:
Proposition 8.53 Let $F$ be a field. Then $Q(F) \cong F$.

Proof The required isomorphism is the inclusion homomorphism $i : F \hookrightarrow Q(F)$. We already know this is injective, so we just need to show it's surjective as well. Any fraction $\frac{a}{b} \in Q(F)$ can be expressed as the image $i(ab^{-1})$ of the element $ab^{-1} \in F$, since
$$i(ab^{-1}) = i(a)\, i(b^{-1}) = \frac{a}{1} \cdot \frac{b^{-1}}{1} = \frac{a}{1} \cdot \frac{1}{b} = \frac{a}{b}.$$
Hence $i$ is surjective as well, and therefore an isomorphism.
Example 8.54 What is the field of fractions $Q(\mathbb{Z}[i])$ of the ring $\mathbb{Z}[i]$ of Gaussian integers? It consists of all fractions of the form
$$\frac{a + bi}{c + di}$$
where $a, b, c, d \in \mathbb{Z}$, and $c + di \neq 0$.
Using the usual trick for complex division, we find that
$$\frac{a + bi}{c + di} = \frac{a + bi}{c + di} \cdot \frac{c - di}{c - di} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\, i.$$
Hence, $Q(\mathbb{Z}[i]) = \{ p + qi : p, q \in \mathbb{Q} \} = \mathbb{Q}[i]$.
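As a quick numerical check of the complex division trick, the following sketch computes the rational real and imaginary parts of a quotient of Gaussian integers. The function name `gaussian_quotient` is an illustrative choice; `Fraction` is Python's standard exact rational type.

```python
from fractions import Fraction

def gaussian_quotient(a, b, c, d):
    """Return (p, q) with (a + bi)/(c + di) = p + qi, for integers a, b, c, d
    and c + di nonzero, by multiplying through by the conjugate c - di."""
    norm = c * c + d * d
    if norm == 0:
        raise ZeroDivisionError("c + di must be nonzero")
    return Fraction(a * c + b * d, norm), Fraction(b * c - a * d, norm)

p, q = gaussian_quotient(1, 2, 3, 4)
print(p, q)   # 11/25 2/25

# Multiplying back by 3 + 4i recovers 1 + 2i:
print(p * 3 - q * 4, p * 4 + q * 3)   # 1 2
```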
The following example is one that we'll make use of in Chapter 11.
Example 8.55 The field of fractions $Q(\mathbb{Z}[x])$ of the polynomial ring $\mathbb{Z}[x]$ consists of expressions of the form
$$\frac{a_m x^m + \cdots + a_0}{b_n x^n + \cdots + b_0}$$
where $a_0, \ldots, a_m, b_0, \ldots, b_n \in \mathbb{Z}$ and $b_n x^n + \cdots + b_0 \neq 0$. These are called rational functions, and we denote the resulting field $\mathbb{Q}(x)$. (Note the use of parentheses instead of brackets.)

The main point of the field of fractions is that it's the smallest field containing the given integral domain.

8.A Modules and representations

Our knowledge springs from two fundamental sources of the mind; the first is the capacity of receiving representations (receptivity for impressions), the second is the power of knowing an object through these representations (spontaneity of concepts).
Immanuel Kant (1724–1804),
Critique of Pure Reason (1781)

Nowadays group theoretical methods, especially those involving characters and representations, pervade all branches of quantum mechanics.
George Whitelaw Mackey (1916–2006),
Group Theory and its Significance for Mathematics and Physics,
Proceedings of the American Philosophical Society 117 (1973) 374–380

In the study of linear algebra, we are largely concerned with vector spaces and linear maps between them. Recall that a vector space consists of a set $V$ (of vectors) and a field $F$ (of scalars), together with a binary operation $+ : V \times V \to V$ (called vector addition) and a function $\cdot : F \times V \to V$ (called scalar multiplication). We require $(V, +)$ to form an abelian group, and for the scalar multiplication operation to satisfy some other basic conditions on its own, and when interacting with vector addition.
An obvious generalisation of this arrangement is to allow the scalars to form a weaker structure than a field, say a ring. In doing this, we have to be a little bit careful. There are certain properties that a field has that not all rings have, and which make this more general scenario more complicated (and more interesting).
In particular, rings needn't be commutative, they might contain zero divisors, they might not satisfy unique factorisation, not every nonzero element need be invertible, and so forth. All these are things we implicitly or explicitly assume in ordinary linear algebra.
The first of these, commutativity, is the one that will concern us straight away when trying to formulate the notion of a vector space over a ring. In a vector space over a field, we find that
$$h(kv) = (hk)v = (kh)v = k(hv)$$
for any vector $v$ and scalars $h$ and $k$. That is, multiplying some vector $v$ by a scalar $k$, and then by another scalar $h$, gives the same result as multiplying $v$ by $h$ first and then by $k$. But if our scalars aren't commutative, this won't always be the case.
Another way of looking at this is to define two scalar multiplication operations: one where we do the multiplication on the left (the usual notation) and one where we multiply on the right. The associativity conditions would then look like
$$h(kv) = (hk)v \quad\text{and}\quad (vk)h = v(kh).$$
If our scalars are commutative, then $v(kh) = v(hk)$, so multiplying on the left is structurally the same as multiplying on the right (except for a superficial difference in notation). But if scalars don't commute, this won't be the case in general, and we'll have to distinguish between scalar multiplication that behaves like multiplying on the left, and scalar multiplication that behaves like multiplying on the right.
Another thing to note at this stage is that the scalar multiplication operation is really just an action of the scalar field $F$ on the set (or, actually, abelian group) $V$ of vectors. With all this in mind, it's time for some definitions.
Definition 8.56 Let $R$ be a ring. A (left) $R$-module is an abelian group $A$ with a left $R$-action $\cdot : R \times A \to A$.
Explicitly, a left $R$-module satisfies the following criteria for any $r, s \in R$ and $a, b \in A$:
(LM1) $(r + s) \cdot a = r \cdot a + s \cdot a$,
(LM2) $(rs) \cdot a = r \cdot (s \cdot a)$, and
(LM3) $r \cdot (a + b) = r \cdot a + r \cdot b$.
If $R$ is a unital ring, then a module satisfying the following additional condition for all $a \in A$ is said to be unital.
(LM4) $1 \cdot a = a$.
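For finite rings and groups the conditions (LM1) to (LM4) can simply be checked exhaustively. The following sketch is an illustration under stated assumptions: the function `is_left_module` and its argument names are hypothetical, the ring and group are passed as finite iterables together with their operations, and the two test actions (of $\mathbb{Z}_6$ and of $\mathbb{Z}_4$ on $\mathbb{Z}_3$) are examples chosen here rather than taken from the text.

```python
from itertools import product

def is_left_module(ring, r_add, r_mul, group, g_add, act, one=None):
    """Brute-force check of (LM1)-(LM3), and of (LM4) if a unity `one` is given."""
    for r, s in product(ring, repeat=2):
        for a in group:
            if act(r_add(r, s), a) != g_add(act(r, a), act(s, a)):   # (LM1)
                return False
            if act(r_mul(r, s), a) != act(r, act(s, a)):             # (LM2)
                return False
    for r in ring:
        for a, b in product(group, repeat=2):
            if act(r, g_add(a, b)) != g_add(act(r, a), act(r, b)):   # (LM3)
                return False
    if one is not None and any(act(one, a) != a for a in group):     # (LM4)
        return False
    return True

# Z_3 as a unital Z_6-module via r . a = ra (mod 3): all conditions hold.
good = is_left_module(range(6), lambda r, s: (r + s) % 6, lambda r, s: (r * s) % 6,
                      range(3), lambda a, b: (a + b) % 3,
                      lambda r, a: (r * a) % 3, one=1)

# The same formula does not make Z_3 a Z_4-module: (2 + 2) . 1 = 0 but 2.1 + 2.1 = 1.
bad = is_left_module(range(4), lambda r, s: (r + s) % 4, lambda r, s: (r * s) % 4,
                     range(3), lambda a, b: (a + b) % 3,
                     lambda r, a: (r * a) % 3, one=1)

print(good, bad)   # True False
```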
For completeness, we state the analogous definition for modules with scalar multiplication on the right.
Definition 8.57 If $R$ is a ring, then a right $R$-module is an abelian group $B$ with a right $R$-action $\cdot : B \times R \to B$, satisfying the following conditions for all $r, s \in R$ and $a, b \in B$:
(RM1) $a \cdot (r + s) = a \cdot r + a \cdot s$,
(RM2) $a \cdot (rs) = (a \cdot r) \cdot s$, and
(RM3) $(a + b) \cdot r = a \cdot r + b \cdot r$.
If $R$ is a unital ring, then a right $R$-module $B$ is said to be unital if the following condition is also satisfied for all $a \in B$:
(RM4) $a \cdot 1 = a$.
From now on, we will mostly just consider left $R$-modules. The reason for this is that any left $R$-module can be regarded as a right module over a ring closely related to $R$.
Definition 8.58 Given a ring $R$, we can form the opposite ring $R^{\mathrm{op}}$ as follows: Let $R^{\mathrm{op}}$ have the same underlying set and addition operation as $R$, but define multiplication in the reverse order. So $rs$ in $R^{\mathrm{op}}$ is defined to be equal to the product $sr$ in $R$, for all $r, s \in R$.

If $R$ is commutative, then obviously $R = R^{\mathrm{op}}$, but this won't necessarily be the case if $R$ is noncommutative. Note also that $(R^{\mathrm{op}})^{\mathrm{op}} = R$.
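A small sketch of the opposite ring, using $2 \times 2$ integer matrices as a convenient noncommutative example. The helpers `mat_mul`, `op` and `transpose` are names introduced for this illustration. The last check uses the standard fact that transposition reverses products of matrices over a commutative ring; it is included here as an aside rather than as something established in the text.

```python
def mat_mul(x, y):
    """Product of 2x2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op(mul):
    """Multiplication of the opposite ring: r * s in R^op is s * r in R."""
    return lambda r, s: mul(s, r)

def transpose(x):
    return tuple(zip(*x))

r = ((1, 1), (0, 1))
s = ((1, 0), (1, 1))
op_mul = op(mat_mul)

print(mat_mul(r, s))             # ((2, 1), (1, 1))
print(op_mul(r, s))              # ((1, 1), (1, 2)), so R and R^op multiply differently
print(op(op_mul)(r, s) == mat_mul(r, s))                                   # True: (R^op)^op = R
print(transpose(mat_mul(r, s)) == op_mul(transpose(r), transpose(s)))      # True
```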
Before looking at some concrete examples, we'll introduce a couple more concepts. First, the notion of a submodule: a subset that is a module in its own right.
Definition 8.59 Let $A$ be a (left) $R$-module, and suppose that $B$ is a subgroup of $A$. Then $B$ is a submodule of $A$ if it is an $R$-module with respect to the inherited $R$-action. That is, $r \cdot b \in B$ for all $r \in R$ and $b \in B$.
(If $A$ is a right $R$-module, then a subgroup $B$ is a submodule if $b \cdot r \in B$ for all $r \in R$ and $b \in B$.)
In particular, if $R$ is a field, then any $R$-module $A$ is actually a vector space over $R$, and the submodules of $A$ are vector subspaces.
The second concept we want to introduce now is that of a homomorphism of $R$-modules.
Definition 8.60 Let $A$ and $B$ be (left) $R$-modules and suppose that $f : A \to B$ is an abelian group homomorphism. Then $f$ is an $R$-module homomorphism or $R$-map if $f(r \cdot a) = r \cdot f(a)$ for all $r \in R$ and $a \in A$.
(If $A$ and $B$ are right $R$-modules, then the required condition is $f(a \cdot r) = f(a) \cdot r$ for all $r \in R$ and $a \in A$.)

If $R$ is a field, then $R$-module homomorphisms are exactly the same as linear maps between vector spaces over $R$.
In general, given any $R$-module homomorphism $f : A \to B$, the kernel $\ker f$ is a submodule of $A$, and the image $\operatorname{im} f$ is a submodule of $B$.
We can define quotient modules in the obvious way: given any submodule $C \subseteq A$, the quotient $A/C$ consists of the cosets $[a] = a + C$ for all $a \in A$, subject to the usual addition operation $[a] + [b] = [a + b]$, with $R$-action defined by $r \cdot [a] = [r \cdot a]$ for all $r \in R$ and $a, b \in A$.
Analogues of the three Isomorphism Theorems hold for $R$-modules, but we will merely state without proof the first one:
Theorem 8.61 (First Isomorphism Theorem for Modules) Let $R$ be a ring, and let $f : A \to B$ be a homomorphism of $R$-modules. Then $A/\ker f \cong \operatorname{im} f$.
Now for a few examples.
Example 8.62 A ring $R$ has a canonical (left or right) $R$-module structure, given by $r \cdot s = rs$ for all $r, s \in R$.
Submodules of $R$ considered as a (left or right) $R$-module in this way are exactly the (left or right) ideals of $R$. If $R$ is not commutative, then its left and right ideals need not be the same, which means that $R$ considered as a left $R$-module need not be isomorphic to $R$ considered as a right $R$-module.

Example 8.63 A unital $\mathbb{Z}$-module $A$ is exactly the same as an abelian group. By condition (LM4) we have $1 \cdot a = a$ for all $a \in A$. Condition (LM1) says that $(m + n) \cdot a = m \cdot a + n \cdot a$ for all $m, n \in \mathbb{Z}$ and $a \in A$. Putting these together, we see that
$$n \cdot a = (1 + \cdots + 1) \cdot a = 1 \cdot a + \cdots + 1 \cdot a = a + \cdots + a = na$$
for all positive $n \in \mathbb{Z}$ and $a \in A$; the cases $n = 0$ and $n < 0$ then follow from (LM1) by taking additive inverses. So the only possible unital $\mathbb{Z}$-module structure is that given by $n \cdot a = na$, and there is therefore a unique way of turning an abelian group into a $\mathbb{Z}$-module.
Conversely, given a $\mathbb{Z}$-module $A$, we can turn it into an abelian group by simply discarding the $\mathbb{Z}$-action.
Submodules and homomorphisms of $\mathbb{Z}$-modules are thus the same as subgroups and homomorphisms of the underlying abelian groups.
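The unique $\mathbb{Z}$-action on an abelian group is just repeated addition, which can be written down directly. In this sketch the helper `z_action` is a hypothetical name, and the abelian group is supplied through its addition, negation and zero element; $\mathbb{Z}_{12}$ is used as a test case.

```python
def z_action(n, a, add, neg, zero):
    """n . a in an abelian group: add a to itself |n| times,
    then negate the result if n is negative."""
    total = zero
    for _ in range(abs(n)):
        total = add(total, a)
    return neg(total) if n < 0 else total

# The cyclic group Z_12 under addition modulo 12.
add12 = lambda x, y: (x + y) % 12
neg12 = lambda x: (-x) % 12

print(z_action(5, 7, add12, neg12, 0))    # 11, since 35 = 11 (mod 12)
print(z_action(-2, 7, add12, neg12, 0))   # 10, since -14 = 10 (mod 12)

# (LM1): (3 + (-2)) . 7 equals 3 . 7 + (-2) . 7
lhs = z_action(1, 7, add12, neg12, 0)
rhs = add12(z_action(3, 7, add12, neg12, 0), z_action(-2, 7, add12, neg12, 0))
print(lhs == rhs)                          # True
```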

Example 8.64 Let $G$ be a group, and let $R$ be its integral group ring $\mathbb{Z}G$ as introduced in Example 8.31. Then a $\mathbb{Z}G$-module is exactly the same as a $G$-module in the sense of Definition 7.90.
Given two modules over the same ring, we can combine them to form a new, larger module in a familiar way:
Definition 8.65 Let $A$ and $B$ be modules over a ring $R$. Their direct sum $A \oplus B$ is formed by taking the direct sum $A \oplus B$ of their underlying abelian groups, and defining $r \cdot (a, b) = (r \cdot a, r \cdot b)$ for all $r \in R$, and for all $a \in A$ and $b \in B$.
A module that doesn't decompose as a direct sum of smaller modules is said to be simple or irreducible.
We now pause the discussion of modules to look at what seems initially to be a non-sequitur, but which will shortly turn out to be very closely related.
Back in Chapter 1 we looked at some examples of matrix groups that we found to be isomorphic to groups that could be defined in other ways. In particular, we found matrix groups isomorphic to the cyclic group $\mathbb{Z}_3$, the dihedral group $D_3$ and the quaternion group $Q_8$. In Example 1.33 we effectively found an injective homomorphism $f : \mathbb{Z}_3 \hookrightarrow GL_2(\mathbb{R})$ (although at that point we didn't have the terminology to describe it in that way). This homomorphism is shown again in Table 8.7.

Table 8.7: A representation of $\mathbb{Z}_3$ from Example 1.33
  $g$    $\rho(g)$
  $0$    $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
  $1$    $\frac{1}{2}\begin{bmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{bmatrix}$
  $2$    $\frac{1}{2}\begin{bmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{bmatrix}$
This turns out to be a useful way of studying the structure of groups: since matrices are well-understood, having a way of regarding some arbitrary group as a matrix group can often yield important insights. To that end, we introduce the concept of a representation:
Definition 8.66 Let $F$ be a field, and $G$ be a group. A representation of $G$ over $F$ is a homomorphism $\rho : G \to GL_n(F)$. The dimension or degree of the representation is the value of $n$.

So, Example 1.33 gives a 2-dimensional real representation of $\mathbb{Z}_3$.

Example 8.67 For any group $G$, the representation $\tau : G \to GL_n(F)$ which maps every element of $G$ to the $n \times n$ identity matrix over $F$ is called the ($n$-dimensional) trivial representation of $G$ (over $F$).
In Example 1.33, the representation homomorphism was injective, but this one clearly isn't for any nontrivial group. Injectivity isn't something we will always insist on, but representations satisfying this condition preserve more information about the group structure than those that don't.
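Definition 8.68 Let $\rho : G \to GL_n(F)$ be a representation of a group $G$ over a field $F$. If $\rho$ is injective, we say the representation is faithful; if not, we say it is unfaithful.
The next example is a very slight variation on Example 1.33.
Example 8.69 Let $\sigma : \mathbb{Z}_3 \to GL_2(\mathbb{R})$ be as in Table 8.8. This is also a faithful representation of $\mathbb{Z}_3$.

Table 8.8: Another representation of $\mathbb{Z}_3$
  $g$    $\sigma(g)$
  $0$    $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
  $1$    $\frac{1}{2}\begin{bmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{bmatrix}$
  $2$    $\frac{1}{2}\begin{bmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{bmatrix}$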
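The defining properties of these representations can be checked numerically. The sketch below assumes, as the surrounding discussion indicates, that Table 8.7 sends $1$ to the anticlockwise rotation through $2\pi/3$ and Table 8.8 sends it to the clockwise one. The names `rotation`, `is_homomorphism` and `is_faithful` are illustrative, and floating-point matrices are compared up to a small tolerance.

```python
import math

def rotation(theta):
    """2x2 matrix of the anticlockwise rotation through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def mat_mul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

rho = {g: rotation(2 * math.pi * g / 3) for g in range(3)}      # anticlockwise
sigma = {g: rotation(-2 * math.pi * g / 3) for g in range(3)}   # clockwise
trivial = {g: rotation(0.0) for g in range(3)}                  # identity matrices

def is_homomorphism(rep):
    """Check rep(g + h) = rep(g) rep(h) for all g, h in Z_3."""
    return all(close(rep[(g + h) % 3], mat_mul(rep[g], rep[h]))
               for g in range(3) for h in range(3))

def is_faithful(rep):
    """Check that distinct group elements have distinct matrices."""
    return not any(close(rep[g], rep[h]) for g in range(3) for h in range(g + 1, 3))

for name, rep in [("rho", rho), ("sigma", sigma), ("trivial", trivial)]:
    print(name, is_homomorphism(rep), is_faithful(rep))
# rho True True
# sigma True True
# trivial True False
```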
Example 8.70 As well as the trivial representation $\tau : S_n \to GL_1(F)$, the symmetric group $S_n$ has another one-dimensional representation, the alternating representation $\alpha : S_n \to GL_1(F)$ given by
$$\alpha(\sigma) = \bigl[\operatorname{sign}(\sigma)\bigr],$$
where $\operatorname{sign}(\sigma)$ is the sign of the permutation $\sigma$.
Now let $\{e_1, \ldots, e_n\}$ be a basis for the vector space $F^n$. The permutation representation $\pi : S_n \to GL_n(F)$ of $S_n$ is an $n$-dimensional representation in which a permutation $\sigma$ maps to the $n \times n$ matrix that maps each basis element $e_i \mapsto e_{\sigma(i)}$ for $1 \le i \le n$. The matrix representatives for the permutation representation of $S_3$ are shown in Table 8.9.

Table 8.9: The permutation representation of $S_3$
  $g$          $\pi(g)$
  identity     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
  $(1\ 2)$     $\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
  $(1\ 3)$     $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$
  $(2\ 3)$     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
  $(1\ 2\ 3)$  $\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$
  $(1\ 3\ 2)$  $\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$

Examples 1.33 and 8.69 are essentially the same in all important respects. Considered geometrically, the former maps $1$ to the $2 \times 2$ real matrix representing an anticlockwise rotation through an angle $\frac{2\pi}{3}$, while the other maps $1$ to the corresponding clockwise rotation. But apart from that, neither tells us anything more about the structure of $\mathbb{Z}_3$ that the other one didn't.
The trivial representation $\tau : \mathbb{Z}_3 \to GL_2(\mathbb{R})$, meanwhile, is fundamentally different: amongst other things, it's unfaithful.
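The permutation and alternating representations of $S_3$ can be generated and tested by machine. This is a sketch under explicit conventions that are assumptions of the example, not necessarily those of the book: permutations act on $\{0, 1, 2\}$ rather than $\{1, 2, 3\}$, they compose right-to-left, and column $i$ of `perm_matrix(sigma)` is the basis vector $e_{\sigma(i)}$. Under a left-to-right composition convention the matrices would be the transposes of these.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix sending e_j to e_{sigma(j)}: entry (i, j) is 1 iff sigma(j) = i."""
    n = len(sigma)
    return tuple(tuple(1 if sigma[j] == i else 0 for j in range(n)) for i in range(n))

def compose(sigma, tau):
    """Right-to-left composition: (sigma . tau)(i) = sigma(tau(i))."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))

def sign(sigma):
    """Sign of a permutation, computed from its number of inversions."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def mat_mul(x, y):
    n = len(x)
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

S3 = list(permutations(range(3)))

perm_is_hom = all(perm_matrix(compose(s, t)) == mat_mul(perm_matrix(s), perm_matrix(t))
                  for s in S3 for t in S3)
sign_is_hom = all(sign(compose(s, t)) == sign(s) * sign(t) for s in S3 for t in S3)
perm_is_faithful = len({perm_matrix(s) for s in S3}) == len(S3)

print(perm_is_hom, sign_is_hom, perm_is_faithful)   # True True True
```

The alternating representation is not faithful for $n \ge 3$, since it takes only the two values $\pm 1$ on a group with more than two elements.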
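We can regard the difference between these two rotations as essentially just due to a change of basis. Taking two bases
$$S = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \quad\text{and}\quad T = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \end{bmatrix} \right\}$$
for $\mathbb{R}^2$, from basic linear algebra we can construct the change of basis matrices
$$Q = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = Q^{-1}$$
that convert from $S$ to $T$ and vice versa. Now observe that
$$\frac{1}{2}\begin{bmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{bmatrix} = Q \left( \frac{1}{2}\begin{bmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{bmatrix} \right) Q^{-1}.$$
So our two nontrivial representations are the same except for a change of basis in the vector space (in this case $\mathbb{R}^2$) they act on. This is the notion of equivalence that we need: in general we don't really care about a specific choice of basis.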
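The change of basis can be confirmed for every element of $\mathbb{Z}_3$ at once. In this sketch the matrix `Q` is the reflection $\operatorname{diag}(1, -1)$, which is its own inverse, and the helper names are illustrative. The check is that conjugating the clockwise rotation by `Q` gives the anticlockwise one.

```python
import math

def rotation(theta):
    """2x2 matrix of the anticlockwise rotation through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def mat_mul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

Q = ((1.0, 0.0), (0.0, -1.0))   # Q is its own inverse

# Q * (clockwise rotation) * Q^{-1} should equal the anticlockwise rotation.
equivalent = all(
    close(rotation(2 * math.pi * g / 3),
          mat_mul(mat_mul(Q, rotation(-2 * math.pi * g / 3)), Q))
    for g in range(3)
)
print(close(mat_mul(Q, Q), ((1.0, 0.0), (0.0, 1.0))), equivalent)   # True True
```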
Definition 8.71 Two representations $\rho, \sigma : G \to GL_n(F)$ are equivalent if there exists some invertible matrix $Q \in GL_n(F)$ such that $\sigma(g) = Q \rho(g) Q^{-1}$ for all $g \in G$.
Equivalence of representations is an equivalence relation. The equivalence condition can be rearranged as $\sigma(g) Q = Q \rho(g)$. In this case, $Q$ is an invertible $n \times n$ matrix, which represents a bijective linear map from $F^n$ to $F^n$. If we drop the requirements that $\rho$ and $\sigma$ have the same degree, and that $Q$ is invertible, we obtain the concept of an intertwiner:
Definition 8.72 Let $\rho : G \to GL_m(F)$ and $\sigma : G \to GL_n(F)$ be representations of a group $G$ of degree $m$ and $n$ respectively. Then an intertwiner or intertwining matrix from $\rho$ to $\sigma$ is an $m \times n$ matrix $T$ such that $\rho(g) T = T \sigma(g)$ for all $g \in G$.
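We can combine two representations to form another of higher degree:
Definition 8.73 Given two representations $\rho : G \to GL_m(F)$ and $\sigma : G \to GL_n(F)$ of the same group $G$ over the same field $F$, we can form the direct sum $\rho \oplus \sigma$ in terms of the block matrices
$$(\rho \oplus \sigma)(g) = \begin{bmatrix} \rho(g) & 0 \\ 0 & \sigma(g) \end{bmatrix}$$
for all $g \in G$. This has degree $m + n$, the sum of the degrees of the constituent representations $\rho$ and $\sigma$.
Example 8.74 Relative to the basis
$$\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \right\},$$
the permutation representation for $S_3$ has the form shown in Table 8.10. Each of these matrices comprises a $1 \times 1$ matrix $[1]$ in the top left corner, and a $2 \times 2$ submatrix in the bottom right corner.
The permutation representation thus decomposes into the direct sum of the 1-dimensional trivial representation and another 2-dimensional representation, shown in Table 8.11, called the standard representation of $S_3$. This representation acts on the 2-dimensional subspace $\{(x, y, z) : x + y + z = 0\}$ of $F^3$.

Table 8.10: The permutation representation of $S_3$ relative to a different basis
  $g$          $\pi(g)$
  identity     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
  $(1\ 2)$     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$
  $(1\ 3)$     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}$
  $(2\ 3)$     $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & -1 \end{bmatrix}$
  $(1\ 2\ 3)$  $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & -1 & 0 \end{bmatrix}$
  $(1\ 3\ 2)$  $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & -1 \end{bmatrix}$

Table 8.11: The standard representation of $S_3$
  $g$          $\sigma(g)$
  identity     $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
  $(1\ 2)$     $\begin{bmatrix} -1 & 1 \\ 0 & 1 \end{bmatrix}$
  $(1\ 3)$     $\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$
  $(2\ 3)$     $\begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix}$
  $(1\ 2\ 3)$  $\begin{bmatrix} -1 & 1 \\ -1 & 0 \end{bmatrix}$
  $(1\ 3\ 2)$  $\begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix}$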
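The block decomposition in this example can be reproduced by an explicit change of basis. The sketch below rests on two assumptions that should be read as such: the basis is taken to be $(1,1,1)$, $(1,-1,0)$, $(0,1,-1)$, and the permutation matrices are those listed in Table 8.9. The matrix `B` has the basis vectors as columns, `B_INV` is its inverse (the adjugate divided by $\det B = 3$), and each matrix $B^{-1} P B$ is computed exactly with `Fraction`.

```python
from fractions import Fraction

def mat_mul(x, y):
    n = len(x)
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

# Columns of B are the assumed basis vectors (1,1,1), (1,-1,0), (0,1,-1).
B = ((1, 1, 0),
     (1, -1, 1),
     (1, 0, -1))
ADJ = ((1, 1, 1),
       (2, -1, -1),
       (1, 1, -2))                      # adjugate of B; det(B) = 3
B_INV = tuple(tuple(Fraction(x, 3) for x in row) for row in ADJ)

# Permutation matrices as listed in Table 8.9.
PERM = {
    "id":      ((1, 0, 0), (0, 1, 0), (0, 0, 1)),
    "(1 2)":   ((0, 1, 0), (1, 0, 0), (0, 0, 1)),
    "(1 3)":   ((0, 0, 1), (0, 1, 0), (1, 0, 0)),
    "(2 3)":   ((1, 0, 0), (0, 0, 1), (0, 1, 0)),
    "(1 2 3)": ((0, 1, 0), (0, 0, 1), (1, 0, 0)),
    "(1 3 2)": ((0, 0, 1), (1, 0, 0), (0, 1, 0)),
}

def in_new_basis(p):
    """Matrix of the same linear map relative to the new basis: B^{-1} P B."""
    return mat_mul(mat_mul(B_INV, p), B)

changed = {g: in_new_basis(p) for g, p in PERM.items()}

# Each matrix is block diagonal: [1] in the top left, a 2x2 block bottom right.
block_diagonal = all(m[0] == (1, 0, 0) and m[1][0] == 0 and m[2][0] == 0
                     for m in changed.values())
standard = {g: tuple(row[1:] for row in m[1:]) for g, m in changed.items()}

print(block_diagonal)        # True
print(standard["(1 2)"] == ((-1, 1), (0, 1)))   # True
```

The dictionary `standard` then holds the $2 \times 2$ blocks, which form the standard representation relative to this choice of basis.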

The permutation representation is therefore equivalent (relative to some appropriate choice of basis) to a direct sum of lower-degree representations, neither of which can be further decomposed in this way. This leads to the next definition.
Definition 8.75 Let $\rho : G \to GL_n(F)$ be a representation of some group $G$ over some field $F$. If $\rho$ is equivalent to a direct sum of lower-degree representations, then we say it is reducible; if not, we say it is irreducible.

To properly understand the representations of a given group $G$, we really only need to study its irreducible representations (sometimes called irreps for short) because these are the building blocks from which we construct all the others.
It's now time to draw together the two strands of the discussion so far. A representation of a group $G$ is a homomorphism $\rho : G \to GL_n(F)$, where in practice $F$ will usually be $\mathbb{R}$ or $\mathbb{C}$. The general linear group $GL_n(F)$ is the group of all nonsingular (and hence invertible) $n \times n$ matrices over $F$. These matrices are exactly those that represent bijective linear maps from $F^n$ to $F^n$; that is, automorphisms of the vector space $F^n$, relative to some choice of basis. So we can write $GL_n(F) = \operatorname{Aut}(F^n)$. A representation of $G$ therefore defines an action of the group $G$ on the vector space $F^n$: for each element of $G$ we choose an automorphism of $F^n$ in a way that is compatible with the group structure of $G$.
We claim that this is the same as a module over the group ring $FG$ (which will usually be $\mathbb{R}G$ or $\mathbb{C}G$). Such a module consists of an abelian group $A$ together with an action $FG \times A \to A$. The condition $(ke) \cdot a = ka \in A$ implies that $A$ is closed under scalar multiplication by any element of $F$, thus making $A$ into a vector space over $F$. By a standard result in basic linear algebra, this means that $A$ is isomorphic to $F^n$ for some $n \in \mathbb{N}$, provided $A$ is finite-dimensional. Therefore an $FG$-module is a vector space equipped with an action of the group $G$; the rest of the $FG$-action conditions determine a homomorphism from $G$ to $\operatorname{Aut}(A) = \operatorname{Aut}(F^n) = GL_n(F)$.
By this discussion (the reader is invited to make this argument precise by checking all the details thoroughly) we find that every $FG$-module determines a representation of $G$ over $F$, relative to some choice of basis for $F^n$. Conversely, every representation of $G$ over $F$ determines an $FG$-module.
Table 8.12 lists the correspondence between the world of $FG$-modules, and that of representations of $G$ over $F$.

Table 8.12: A lexicon of module and representation terminology and concepts
  Modules                        Representations
  $FG$-module                    representation of $G$ over $F$
  $FG$-module homomorphism       intertwiner between representations of $G$ over $F$
  direct sum of $FG$-modules     direct sum of representations of $G$ over $F$
  irreducible $FG$-module        irreducible representation of $G$ over $F$

The following important result concerns the reducibility or otherwise of $FG$-modules (and representations).
Theorem 8.76 (Maschke's Theorem) Let $G$ be a finite group, $F$ a field of characteristic zero, and $V$ an $FG$-module. If $U$ is an $FG$-submodule of $V$ then there exists another $FG$-submodule $W$ of $V$ such that $V = U \oplus W$.

Proof Choose a basis $T = \{e_1, \ldots, e_m\}$ for the submodule $U$, and a basis $S = \{e_1, \ldots, e_m, e_{m+1}, \ldots, e_n\}$ for all of $V$, so that $T \subseteq S$. Define a surjective $F$-linear map $p : V \to U$ with
$$p(e_i) = \begin{cases} e_i & \text{if } 1 \le i \le m, \\ 0 & \text{if } m+1 \le i \le n, \end{cases}$$
for $1 \le i \le n$. Then $\operatorname{im} p = U$, and $\ker p = \operatorname{span}_F(S \setminus T)$ comprises the subspace of $V$ generated by the rest of the basis vectors $e_{m+1}, \ldots, e_n$.
This map $p$ is not necessarily an $FG$-homomorphism, but we will use it to construct one. Let $q : V \to U$ by
$$q(v) = \frac{1}{|G|} \sum_{g \in G} g^{-1} \cdot p(g \cdot v)$$
for all $v \in V$. The idea is that in some sense $q$ is the average of $p$ over the group $G$. (This is where we use the hypothesis that $F$ has characteristic zero, so that $|G|$ is invertible in $F$.)
We need $q$ to be a surjective $FG$-homomorphism from $V$ onto $U$. First we check that $q$ really does map into $U$. Consider any element $v \in V$. Then for any $g \in G$ it follows that $g \cdot v$ lies in $V$, because $V$ is an $FG$-module, then $p(g \cdot v)$ lies in $U$, because $\operatorname{im} p = U$, and finally $g^{-1} \cdot p(g \cdot v)$ lies in $U$ as well, because $U$ is an $FG$-submodule of $V$, and hence closed under the action of $G$. To check surjectivity, take any $u \in U$. Then $g \cdot u$ lies in $U$ for every $g \in G$, so $p(g \cdot u) = g \cdot u$ and hence $g^{-1} \cdot p(g \cdot u) = u$. Averaging over $G$ gives $q(u) = u$, so $\operatorname{im} q = U$.
Next we must check that $q$ is an $FG$-homomorphism: an $F$-linear map from $V$ to $U$ that commutes with the $G$-action. The $F$-linearity requirement is satisfied since $q$ is a linear combination of $F$-linear transformations. And for any $h \in G$ and $v \in V$, we have:
$$\begin{aligned}
q(h \cdot v) &= \frac{1}{|G|} \sum_{g \in G} g^{-1} \cdot p(g \cdot (h \cdot v)) \\
&= \frac{1}{|G|} \sum_{g \in G} g^{-1} \cdot p((gh) \cdot v) \\
&= \frac{1}{|G|} \sum_{g \in G} h h^{-1} g^{-1} \cdot p((gh) \cdot v) \\
&= h \cdot \frac{1}{|G|} \sum_{gh \in G} (gh)^{-1} \cdot p((gh) \cdot v) \\
&= h \cdot \frac{1}{|G|} \sum_{g \in G} g^{-1} \cdot p(g \cdot v) \\
&= h \cdot q(v),
\end{aligned}$$
so $q$ is an $FG$-homomorphism.
We now set $W = \ker q$, which is therefore an $FG$-submodule of $V$, and as noted earlier $U = \operatorname{im} q$. Furthermore, since $q(u) = u$ for all $u \in U$, the intersection $U \cap W$ consists only of the zero vector $0$, and any vector $v$ in $V$ can be written uniquely as a sum $v = u + w$, with $u = q(v) \in U$ and $w = v - q(v) \in W$. Hence $V = U \oplus W$ as claimed.

Heinrich Maschke (1853–1908) was born in Breslau, Prussia (now Wrocław, Poland). He studied mathematics at the Universities of Heidelberg and Berlin, graduating with high distinction from the latter in 1878, and subsequently undertook graduate research at the University of Göttingen, where he was awarded his doctorate in 1880 for a thesis entitled On a triple orthogonal surface system formed from surfaces of the third order.
After this he worked as a mathematics teacher at a gymnasium in Berlin. But he found the work increasingly unfulfilling, especially after a year's sabbatical undertaking research with Felix Klein (1849–1925) at Göttingen in 1887. Inspired by two friends who had quickly found academic posts in the USA, he resolved to emigrate. To improve his chances of finding suitable work, he studied electrical engineering. Arriving in New York in April 1891, he soon got a job as an electrician with the Weston Electrical Instrument Company of Newark, New Jersey.
In 1892 he was appointed as an assistant professor at the newly founded University of Chicago. Here, he was able to fully exercise his mathematical abilities: his teaching experience served him and his students well, and he made many contributions to mathematical research, particularly on finite groups and quadratic differential forms. He was an active founder member of the American Mathematical Society, serving as Vice President in 1907. Despite his generally excellent health, he died in March 1908 of complications from emergency surgery.
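The averaging construction in the proof can be carried out explicitly in a small case. This sketch is an illustration chosen here, not an example from the text: $G = S_3$ acts on $\mathbb{Q}^3$ by permutation matrices (with the convention that column $j$ of `perm_matrix(sigma)` is $e_{\sigma(j)}$), $U$ is the line spanned by $(1, 1, 1)$, and `p` is the projection onto $U$ along the span of $e_2$ and $e_3$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(sigma):
    """Matrix sending e_j to e_{sigma(j)}."""
    n = len(sigma)
    return tuple(tuple(1 if sigma[j] == i else 0 for j in range(n)) for i in range(n))

def mat_mul(x, y):
    n = len(x)
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def transpose(x):
    return tuple(zip(*x))

G = [perm_matrix(s) for s in permutations(range(3))]

# p(x, y, z) = x * (1, 1, 1): a projection onto U that is not G-equivariant.
p = ((1, 0, 0),
     (1, 0, 0),
     (1, 0, 0))

def average(p, group):
    """q = (1/|G|) * sum over g of g^{-1} p g; for a permutation matrix,
    the inverse is the transpose."""
    n = len(p)
    total = [[Fraction(0)] * n for _ in range(n)]
    for g in group:
        term = mat_mul(mat_mul(transpose(g), p), g)
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return tuple(tuple(x / len(group) for x in row) for row in total)

q = average(p, G)

p_commutes = all(mat_mul(g, p) == mat_mul(p, g) for g in G)
q_commutes = all(mat_mul(g, q) == mat_mul(q, g) for g in G)
q_is_projection = mat_mul(q, q) == q

print(p_commutes, q_commutes, q_is_projection)   # False True True
print(q[0])   # (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
```

Here every entry of `q` is $\frac{1}{3}$, so $q(x, y, z) = \frac{x + y + z}{3}(1, 1, 1)$ and the complement $W = \ker q$ is the plane $x + y + z = 0$ on which the standard representation acts.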

Summary

We began this chapter with another careful examination of the integers, this time paying attention not just to the additive structure, but the multiplicative structure as well. This led us to formulate the concept of a ring: a set $R = (R, +, \cdot)$ equipped with two binary operations $+$ and $\cdot$. [10] We require that $(R, +)$ is an abelian group, and that the multiplication operation is associative and satisfies a distributivity condition with respect to addition. We also require the existence of a unique [11] unity element which acts as a multiplicative identity; a ring without such an element is called a ring without unity or a nonunital ring. (To avoid ambiguity, we sometimes call a ring with a multiplicative identity a unital ring or a ring with unity, but in this book we will consider this to be the default setting.)
[10] Definition 8.1, page 324.
[11] Proposition 8.9, page 327.
The first examples we looked at were numerical ones: the integer rings $\mathbb{Z}$ and $\mathbb{Z}_n$, the rational numbers $\mathbb{Q}$, the real numbers $\mathbb{R}$ and the complex numbers $\mathbb{C}$. [12], [13] The simplest ring is the trivial ring consisting of a single element, which must be both the additive and multiplicative identity; [14] this is the only ring in which $0 = 1$. [15] We can also make new rings by adjoining a new element to an existing one; this led us to the ring $\mathbb{Z}[i]$ of Gaussian integers, the ring $\mathbb{Z}[\omega]$ of Eisenstein integers and the ring of Kleinian integers, [16] as well as some finite variants. [17] Several basic properties may be quickly deduced from the definition of a ring; in particular multiplication by the zero (additive identity) element and negative (additive inverse) elements behaves in the way we're used to from ordinary arithmetic. [18]
[12] Example 8.2, page 325.
[13] Example 8.3, page 325.
[14] Example 8.4, page 325.
[15] Proposition 8.13, page 329.
[16] Example 8.6, page 326.
[17] Example 8.7, page 326.
[18] Proposition 8.8, page 326.
Although we'll mostly be interested in unital rings, we also looked at a few examples of nonunital rings, such as the even integers $2\mathbb{Z}$ (which generalises to $n\mathbb{Z}$ in the obvious way) [19] and the case where we define multiplication to be trivial. [20] The quaternions $\mathbb{H}$ and the Lipschitz integers $\mathbb{L}$ provided interesting examples of noncommutative rings. [21]
[19] Example 8.10, page 328.
[20] Example 8.11, page 328.
[21] Example 8.12, page 328.
In order to better study structural similarities between rings, we introduced the notions of an isomorphism and a homomorphism, [22] and direct sum, [23] in both cases extending the corresponding concepts from group theory. As with group homomorphisms, any ring homomorphism must map $0$ to $0$ and $1$ to $1$. [24] Next we considered the additive and multiplicative order of ring elements, [25] and introduced the characteristic $\operatorname{char}(R)$ of a ring, [26] which latter may be determined from the additive order of the unity element $1$. [27]
[22] Definition 8.14, page 329.
[23] Definition 8.15, page 329.
[24] Proposition 8.16, page 330.
[25] Definition 8.17, page 331.
[26] Definition 8.18, page 331.
[27] Proposition 8.19, page 331.
Next we examined some examples of matrix rings such as $M_n(R)$, the ring of $n \times n$ matrices with entries from some ring $R$, [28] as well as the rings $UT_n(R)$, $LT_n(R)$ and $D_n(R)$ of triangular and diagonal matrices. [29] The ring $M_n(R)$ is noncommutative except when $n = 1$ and $R$ is commutative, and its elements can be regarded as endomorphisms of $R$; more generally we can form the ring $\operatorname{End}(R)$ of endomorphisms of $R$, [30] and also generalise this to endomorphism rings of other objects such as abelian groups. [31]
[28] Example 8.21, page 332.
[29] Example 8.22, page 333.
[30] Example 8.23, page 334.
[31] Example 8.24, page 334.
The next class of rings we studied were polynomial rings such as $\mathbb{Z}[x]$. [32] These are formed by adjoining a formal variable $x$ to an existing ring, in a similar way to the construction of rings such as $\mathbb{Z}[i]$ and $\mathbb{Z}_2[\sqrt{2}]$. This process can be generalised in various ways: we can use an arbitrary coefficient ring to get $R[x]$, allowing negative powers yields the ring $\mathbb{Z}[x^{\pm 1}]$ of Laurent polynomials, [33] we can define multivariate polynomial rings like $\mathbb{Z}[x, y]$ or $R[x_1, \ldots, x_k]$ by using more than one variable, [34] and by allowing infinitely many terms we obtain formal power series rings like $\mathbb{Z}[[x]]$. [35] A related example is that of the group ring $\mathbb{Z}G$ (or $RG$) of a group $G$, whose elements are formal $\mathbb{Z}$-linear (or $R$-linear) combinations of elements of $G$. [36]
[32] Example 8.25, page 335.
[33] Example 8.27, page 336.
[34] Example 8.29, page 336.
[35] Example 8.30, page 336.
[36] Example 8.31, page 337.
Next we considered the multiplicative structure of rings in more detail, introducing the notion of an invertible element, or unit. [37] The units in a ring $R$ are closed under multiplication, include the unity element $1$ itself, and all have unique inverses. They therefore form a multiplicative group, the group of units $U(R)$. [38] The rings $\mathbb{Z}$ and $\mathbb{Z}[t]$ have only two units, namely $\pm 1$, and hence $U(\mathbb{Z}) = U(\mathbb{Z}[t]) \cong \mathbb{Z}_2$. The ring $\mathbb{Z}[t^{\pm 1}]$ of Laurent polynomials has infinitely many units of the form $\pm t^n$ for $n \in \mathbb{Z}$ and hence $U(\mathbb{Z}[t^{\pm 1}]) \cong \mathbb{Z}_2 \oplus \mathbb{Z}$. The invertible elements in $M_n(\mathbb{R})$ are exactly the $n \times n$ invertible matrices, and thus $U(M_n(\mathbb{R})) = GL_n(\mathbb{R})$.
[37] Definition 8.33, page 338.
[38] Proposition 8.34, page 338.
The best case scenario is that every nonzero element in a ring $R$ is invertible (the zero element $0$ is never invertible unless $R = \{0\}$ is the trivial ring). We call such a ring a division ring. [39] A commutative division ring is a field and a noncommutative division ring is a skew field. [40] Many familiar number systems are fields: the rationals $\mathbb{Q}$, the reals $\mathbb{R}$ and the complex numbers $\mathbb{C}$. The quaternions $\mathbb{H}$ are perhaps the best-known example of a skew field. For a prime $p$, the set $\mathbb{F}_p = \{0, 1, \ldots, p-1\}$ forms a finite field under modulo-$p$ addition and multiplication. [41]
[39] Definition 8.39, page 339.
[40] Definition 8.40, page 339.
[41] Corollary 8.50, page 342.
A nonzero element $a$ in a ring $R$ is a left zero divisor if there exists some nonzero $b \in R$ with $a \cdot b = 0$, and a right zero divisor if there exists some nonzero $c \in R$ such that $c \cdot a = 0$. If $a$ is both a left and right zero divisor then we call it simply a zero divisor. [42] The zero divisors in $\mathbb{Z}_n$ are exactly the nonzero elements $m$ that aren't coprime to $n$; that is, those for which $\gcd(m, n) > 1$. [43] A ring such as $\mathbb{Z}$ or $\mathbb{Z}[t]$ that contains no zero divisors is said to be an integral domain. [44] All fields are integral domains, [45] and all finite integral domains are fields, [46] but not all infinite integral domains are fields. Although integral domains don't necessarily contain a full complement of units, they do satisfy multiplicative cancellation laws. [47]
[42] Definition 8.43, page 340.
[43] Proposition 8.44, page 340.
[44] Definition 8.46, page 341.
[45] Proposition 8.47, page 341.
[46] Proposition 8.49, page 341.
[47] Proposition 8.48, page 341.

References and further reading


The mathematical biographer Constance Reid (1918–2010) wrote an interesting and well-researched
biography of David Hilbert (1862–1943) which contains many details of his life and also of the
mathematical community at Göttingen during the late 19th and early 20th century.
C Reid, Hilbert, Copernicus (1996)
The German geologist Wolfgang Sartorius von Waltershausen (1809–1876) wrote a contemporary
biography of Carl Friedrich Gauss (1777–1855):
W S von Waltershausen, Gauss zum Gedächtnis, S Hirzel (1856)
An English translation by Helen W Gauss, the subject's great-granddaughter, was published in 1966 as
Gauss: A Memorial.
For further details on representation theory, the following two books are good places to start:
G James and M Liebeck, Representations and Characters of Groups, second edition, Cambridge
University Press (2001)
W Fulton and J Harris, Representation Theory: A First Course, Graduate Texts in Mathematics 129,
Springer (2004)
The former is suitable for undergraduate students, and covers group representations and characters,
while the latter is intended for graduate students, and covers the representation theory of finite groups,
and also Lie groups and Lie algebras.

Exercises

8.1 Show that the prime subring ⟨1⟩ of a ring R is equal to the intersection of all other subrings of R.

8.2 Write down the addition and multiplication tables for the rings Z2 [], Z3 [i ], Z3 [ 2], Z3 [ ] and
Z3 [].
8.3 Show that if f : R → S is a homomorphism of unital rings, then there is a unique induced
homomorphism U(f) : U(R) → U(S), between the corresponding groups of units, such that
U(f)(r) = f(r) for all r ∈ U(R).
8.4 A ring element r ∈ R is idempotent if r² = r. Show that if R is an integral domain, then 0 and 1
are the only idempotent elements.
8.5 Let R be an integral domain. Show that the field of fractions Q(R) from Definition 8.51 is indeed a
field.
8.6 Let
H = { a + bi + cj + dk : a, b, c, d ∈ Z or a, b, c, d ∈ Z + ½ }
be the set of Hurwitz integers or Hurwitz quaternions, where
Z + ½ = { . . . , −3/2, −1/2, 1/2, 3/2, . . . }
is the set of half-integers. Show that H is a noncommutative ring.
8.7 Let End(Z) and End(Zn) denote the endomorphism rings of the abelian groups Z and Zn. Show
that End(Z) ≅ Z and End(Zn) ≅ Zn.
8.8 Show that Z[x][y] ≅ Z[x, y]. That is, show that the ring of polynomials in y with coefficients in
Z[x] is isomorphic to the ring of bivariate polynomials in x and y with coefficients in Z.
8.9 Show that the integral group ring ZZ is isomorphic to the Laurent polynomial ring Z[x^{±1}].
8.10 Suppose that Z[i] is the ring of Gaussian integers from Example 8.6. Show that U(Z[i]) ≅ Z4.
8.11 Determine the group of units U(Mn(Z)) of the ring of n×n integer matrices.
8.12 Let R be a ring in which a² = a for all a ∈ R. Show that R is commutative.
8.13 An element a ∈ R is nilpotent if a^n = 0 for some n ∈ N. Find the nilpotent elements in Z8, Z12
and Z.
9 Ideals

In science, nothing that is provable
should be accepted without proof. As
obvious as this requirement seems, yet
I do not think it has been satisfied, even
by the most recent methods in the foundations
of the simplest science, namely
that part of logic which concerns the
theory of numbers.
Richard Dedekind (1831–1916),
Was sind und was sollen die Zahlen?
(1888)

At roughly this point in our study of groups we met the concepts
of subgroups and their left and right cosets, and fairly soon after
we derived the notion of a normal subgroup: a subgroup whose left
and right cosets coincide. We found that we could use these normal
subgroups to form quotient groups, a process which gave us new
structural information about the groups in question. We discovered a
little while later that kernels of group homomorphisms are normal,
and conversely that all normal subgroups arise in this way.
Now we want to study the analogous concepts in ring theory: subrings,
cosets, kernels, homomorphisms, normal subrings and quotient rings.
This will keep us busy for the rest of this chapter.

  +    | (0,0)  (0,1)  (1,0)  (1,1)
 (0,0) | (0,0)  (0,1)  (1,0)  (1,1)
 (0,1) | (0,1)  (0,0)  (1,1)  (1,0)
 (1,0) | (1,0)  (1,1)  (0,0)  (0,1)
 (1,1) | (1,1)  (1,0)  (0,1)  (0,0)

  ·    | (0,0)  (0,1)  (1,0)  (1,1)
 (0,0) | (0,0)  (0,0)  (0,0)  (0,0)
 (0,1) | (0,0)  (0,1)  (0,0)  (0,1)
 (1,0) | (0,0)  (0,0)  (1,0)  (1,0)
 (1,1) | (0,0)  (0,1)  (1,0)  (1,1)
Table 9.1: Addition and multiplication tables for the ring Z2 ⊕ Z2
9.1 Subrings

Thus I descended out of the first circle
Down to the second, that less space begirds,
And so much greater dole, that goads to wailing.
Dante Alighieri (c.1265–1321), Inferno V:1–3,
translated by Henry Wadsworth Longfellow (1807–1882)

In the earlier part of this book, we learned a lot about the
internal structure of groups by studying their subgroups: subsets that
form groups in their own right with the same binary operation (or,
more precisely, the restriction of the same binary operation).
We want now to formulate the corresponding concept for rings: a
subset S of a ring R that forms a ring in its own right with the
(restrictions of) the addition and multiplication operations inherited
from R.
Looking at the addition and multiplication tables for Z2 ⊕ Z2, in
Table 9.1, we can see that this ring contains a subset {(0, 0), (1, 1)} that
forms a ring isomorphic to Z2. Furthermore, this subset contains both
the additive and multiplicative identities.
The ring Z2[i] ≅ Z2[√2] also contains a subring isomorphic to Z2, as
can be seen from Table 9.2.

  +  |  0    1    1+i  i
  0  |  0    1    1+i  i
  1  |  1    0    i    1+i
 1+i |  1+i  i    0    1
  i  |  i    1+i  1    0

  ·  |  0    1    1+i  i
  0  |  0    0    0    0
  1  |  0    1    1+i  i
 1+i |  0    1+i  0    1+i
  i  |  0    i    1+i  1
Table 9.2: Addition and multiplication tables for the ring Z2[i]

Turning our attention to infinite rings, we can see that the ring Q of
rational numbers contains Z as a ring in its own right, and similarly
R contains Q, C contains R, Q and Z, and H contains all of these in
turn.
What about the subset 2Z of even integers in Z? This is certainly
closed under addition and contains both the additive identity 0 and all
the positive and negative even integers, so it certainly forms an abelian
group under addition. Multiplication of even integers is associative
(and commutative) and satisfies the distributive law. But we're missing
the remaining crucial ingredient: the multiplicative identity. So, as
discussed in Example 8.10, it only forms a nonunital ring. We need to
take account of this possibility in our definition of a subring.
At this point we run into some notational awkwardness. Just as differ-
ent books adopt different conventions regarding whether rings have
a unity element or not, there is no universally-observed convention
concerning the definition of a subring.
Books that require rings to have a unity element by default usually
also require subrings to have a unity element unless otherwise specified,
while those that don't necessarily expect rings to have a unity
element naturally tend not to expect subrings to either. Since we've
decided to follow the "with unity" convention, our way should be
clear. Unfortunately, however, things turn out to be more complicated,
as the following example shows.

 + | 0 1 2 3 4 5        · | 0 1 2 3 4 5
 0 | 0 1 2 3 4 5        0 | 0 0 0 0 0 0
 1 | 1 2 3 4 5 0        1 | 0 1 2 3 4 5
 2 | 2 3 4 5 0 1        2 | 0 2 4 0 2 4
 3 | 3 4 5 0 1 2        3 | 0 3 0 3 0 3
 4 | 4 5 0 1 2 3        4 | 0 4 2 0 4 2
 5 | 5 0 1 2 3 4        5 | 0 5 4 3 2 1
Table 9.3: Addition and multiplication tables for Z6

Example 9.1 The ring Z6 has the multiplication and addition tables
shown in Table 9.3. We can see from these that Z6 has a unity
element, namely 1. Now consider the subset {0, 2, 4}. This forms
an abelian group under modulo-6 addition and is closed under
modulo-6 multiplication. Its addition and multiplication tables are

 + | 0 2 4        · | 0 2 4
 0 | 0 2 4        0 | 0 0 0
 2 | 2 4 0        2 | 0 4 2
 4 | 4 0 2        4 | 0 2 4

and from the second of these we can see that 4 acts as a unity element.
Similarly, the subset {0, 3} forms an abelian subgroup under addition,
is closed under multiplication, and the element 3 acts as a
multiplicative identity.
In neither of these cases is the unity element in the subset the same
as the unity element 1 ∈ Z6.
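The unity elements claimed in Example 9.1 can be checked by brute force. The following is a minimal sketch; the helper name `unity_of` is illustrative and does not come from the text.

```python
def unity_of(subset, n):
    """Return the element of `subset` that acts as a multiplicative identity
    on `subset` under multiplication modulo n, or None if there is none."""
    for e in subset:
        if all((e * s) % n == s and (s * e) % n == s for s in subset):
            return e
    return None


print(unity_of({0, 2, 4}, 6))       # 4
print(unity_of({0, 3}, 6))          # 3
print(unity_of(set(range(6)), 6))   # 1, the unity of Z6 itself
```

Both proper subsets have a unity element, but in neither case is it the element 1 of Z6.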

This presents us with a quandary. Do we require subrings to have
the same unity element as their ambient ring, thus ruling out some
interesting and otherwise valid cases, or do we relax this condition
and just require a subring to have a unity element without worrying
too much about exactly which element fulfils this rôle? There is, again,
no universally adopted convention, and after more than a century of
intensive study of ring theory, it seems unlikely that a consensus will
form any time soon.
In this book, we will adopt the conventions in the following defini-
tion, but you should be aware that many books use slightly different
definitions or terms.
Definition 9.2 Let R = (R, +, ·) be a ring, and let S be a subset of R.
Then S is a subring of R if (S, +_S, ·_S) is a ring in its own right, where
+_S and ·_S are the restrictions to S of the addition and multiplication
operations in R.
If S does not contain a unity element, then we say it is a nonunital
subring or a subring without unity.
If the unity element 1S in S is the same as the unity element 1R in R,
then we call S a unital subring or subring with unity.
In other words, unless otherwise specified we require a subring to have
a unity element, but we don't necessarily require that unity element
to be the same as the one from the larger ring. (We could perhaps
refer to this situation as a "subring with different unity", but that's
a bit unwieldy and isn't in common usage.) We reserve the terms
"unital subring" and "subring with unity" for the case where the two
unity elements are the same. This is a somewhat complicated way of
doing things, but seems to be the least worst option available.
The following proposition provides criteria for a subset S ⊆ R to form
a unital subring of R. Essentially, it says that we really just have to
check whether S forms a subgroup of R under addition, is closed
under multiplication, and contains the unity element 1 ∈ R.
Proposition 9.3 Let R = (R, +, ·) be a ring, and let S ⊆ R be a subset
of R. Then S is a subring of R if and only if the following conditions hold:
SR1 (S, +) is a subgroup of (R, +),
SR2 a·b ∈ S for all a, b ∈ S, and
SR3 there exists a unity element 1S ∈ S.
Furthermore, S is a unital subring of R if and only if
SR4 1S = 1R.

Proof This is all just a consequence of Definitions 8.1 and 9.2. Since
(R, +) is an abelian group, condition SR1 is equivalent to requiring
that (S, +) is an abelian group itself, which in turn is equivalent to
condition R1 from Definition 8.1.
The closure condition SR2 is equivalent to requiring the restricted
multiplication operation ·_S to be well-defined on S. If it is, then it
will automatically be associative on S (since the full multiplication
operation in R is associative) and hence condition R2 is satisfied. And
since S is closed under multiplication and the distributive law R3
holds in R it must also hold in S. Next, SR3 ensures that R4 holds in
S, and hence S is a subring of R. Finally, if SR4 is satisfied, then the
unity element in S is the one inherited from R, and so S is a unital
subring of R.
Conversely, if SR1 fails, then R1 also fails; if SR2 fails then the restricted
multiplication operation ·_S will not be well-defined; while if SR3 fails
then S won't contain a unity element, and therefore can't be a subring
of R. Finally, if condition SR4 isn't satisfied then S isn't a unital subring
of R.

If only conditions SR1 and SR2 hold, then S is a nonunital subring of
R.
For a subset S ⊆ R to be a subring, we not only require S to satisfy the
usual additive subgroup conditions (closure, existence of inverses) but
also to satisfy certain multiplicative conditions (closure, existence of 1)
as well. These extra requirements are more demanding, and as a result
many rings which have entirely valid subgroups when considered
just as additive abelian groups, fail to have corresponding subrings
when we consider the ring structure as well. The following example
illustrates this.
Example 9.4 The ring Z4 has no proper unital subrings.
The full ring Z4 and the trivial ring {0} obviously form subrings,
although the latter isn't unital in the sense of Definition 9.2. The only
other candidate is the set {0, 2}, which certainly forms an additive
subgroup of Z4. It's also closed under multiplication, but doesn't
satisfy the last of the three conditions in Proposition 9.3, as it doesn't
contain the unity element 1.
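The case analysis in Example 9.4 can be mechanised by testing conditions SR1 to SR4 of Proposition 9.3 on each additive subgroup of Zn. This is a rough sketch for the rings Zn only; the function names `classify` and `additive_subgroups` are hypothetical, and the labels follow the conventions of Definition 9.2.

```python
def classify(subset, n):
    """Classify a subset of Zn using conditions SR1-SR4 of Proposition 9.3."""
    s = set(subset)
    # SR1: a nonempty finite subset closed under subtraction is an additive subgroup.
    sr1 = bool(s) and all((a - b) % n in s for a in s for b in s)
    # SR2: closure under multiplication.
    sr2 = all((a * b) % n in s for a in s for b in s)
    if not (sr1 and sr2):
        return "not a subring"
    # SR3: look for an element acting as a unity on the subset
    # (Zn is commutative, so a one-sided check is enough).
    unities = [e for e in s if all((e * a) % n == a for a in s)]
    if not unities:
        return "nonunital subring"
    # SR4: compare that unity with the unity 1 of Zn.
    return "unital subring" if unities[0] == 1 % n else "subring"


def additive_subgroups(n):
    """The additive subgroups of Zn: one cyclic subgroup <d> per divisor d of n."""
    return [sorted(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]


for n in (4, 6):
    for h in additive_subgroups(n):
        print(n, h, classify(h, n))
```

For n = 4 only the full ring is reported as a unital subring, and {0, 2} is reported as nonunital, in line with Example 9.4.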

This is a special case of the following more general fact.

Proposition 9.5 The rings Z and Zn have no proper unital subrings.

Proof Suppose that S is a nontrivial unital subring of Zn. Then it
must contain the unity element 1 ∈ Zn (by SR3) and be closed under
addition (by SR1). Together, these imply that S must also contain all
multiples k·1 = k of 1 for all 0 ≤ k < n, and therefore S = Zn.
Similarly, suppose that S is a unital subring of Z. Then S must
contain 1 and all finite multiples k·1 = k for all k ∈ Z, in which case
S = Z.

The reason that Z and Zn have no proper unital subrings is that
any such subring must contain the unity element. If we relax this
requirement then we have more possibilities: Z and Zn have plenty
of nonunital subrings:
Proposition 9.6 For any n ∈ N, the set
nZ = {. . . , −2n, −n, 0, n, 2n, . . .}
is a nonunital subring of Z. Furthermore, all nonunital subrings of Z are
of this form.

Proof For nZ ⊆ Z to be a nonunital subring, we need only check
conditions SR1 and SR2 from Proposition 9.3. The first of these, that
(nZ, +) is an abelian subgroup of (Z, +), is straightforward, and
follows from Proposition 2.7. Condition SR2 is also clear: for any
pn, qn ∈ nZ we have pn·qn = pqn² = (pqn)·n, so nZ is closed
under multiplication. Hence nZ is a nonunital subring of Z.
Now suppose that S ⊆ Z is some nonunital subring of Z. Then by
SR1, (S, +) has to be an abelian subgroup of Z. But by Proposition 2.8
we know that this must be a cyclic subgroup, and hence S = nZ for
some n ∈ N (the case n = 0 is ruled out by the requirement that S be
nonunital).
Recall that the integer rings Z and Zn were our motivating examples
for the material in this and several subsequent chapters: it was by
careful analysis of their essential properties that we arrived at the
definition of a ring, and all of the other examples in this chapter were
derived from them in one way or another. With this in mind, the
following fact should not be so surprising.
Proposition 9.7 Let R be a (unital) ring. Then R contains a subring
isomorphic to Z or to Zn for some n > 1.

Proof The ring R is by definition closed under addition and
multiplication, and contains a unity element 1. The subset
⟨1⟩ = {. . . , −2, −1, 0, 1, 2, . . .} ⊆ R
consists of everything in R generated by finite sums or products of 1.
This is a subring of R: it is an abelian subgroup of R by Proposition 2.7,
it is closed under multiplication, and it contains the unity element 1.
Recall from Proposition 8.19 that char(R) is determined by the additive
order of 1 in R. If char(R) = 0 then 1 has infinite additive order, and
⟨1⟩ ≅ Z; if R has finite characteristic char(R) = n then the additive
order of 1 is also n, and hence ⟨1⟩ ≅ Zn.
The subring ⟨1⟩ is called the prime subring; it is the smallest nontrivial
unital subring of R.
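For a ring of finite characteristic, the prime subring can be listed by repeatedly adding the unity element to itself until the sum returns to zero. The sketch below assumes the ring is supplied as a zero element, a unity element and an addition function; the names `prime_subring` and `direct_sum_add`, and the choice of Z4 ⊕ Z6 as a test case, are illustrative only.

```python
def prime_subring(one, add, zero):
    """List 0, 1, 1+1, 1+1+1, ... until the running sum returns to zero.
    Only suitable for rings of finite characteristic."""
    elements, current = [zero], one
    while current != zero:
        elements.append(current)
        current = add(current, one)
    return elements


def direct_sum_add(m, n):
    """Componentwise addition on Zm (+) Zn, with elements stored as pairs."""
    return lambda x, y: ((x[0] + y[0]) % m, (x[1] + y[1]) % n)


ps = prime_subring((1, 1), direct_sum_add(4, 6), (0, 0))
print(len(ps))   # 12, the additive order of (1, 1) in Z4 (+) Z6
```

The length of the list is the additive order of the unity element, so here the prime subring is a copy of Z12 sitting inside a ring with 24 elements.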
At the beginning of this section we saw that Z2 ⊕ Z2 has a subring
isomorphic to Z2. This agrees with the above proposition, since
Z2 ⊕ Z2 is a unital ring with characteristic 2. In general, the direct
product G × H of two groups has at least one subgroup isomorphic to
G and one isomorphic to H (see Proposition 3.29). Here, G × {e} ≅ G
and {e} × H ≅ H. Is this always the case for a direct sum R ⊕ S?
As we learned in Example 4.16, this works for groups because there
exist canonical inclusion homomorphisms i1 : G ↪ G × H and i2 : H ↪
G × H for any groups G and H. However, the corresponding inclusion
functions i1 : R ↪ R ⊕ S and i2 : S ↪ R ⊕ S fail to be homomorphisms.
The reason for this is that i1 maps R's unity element 1R to (1, 0) in
R ⊕ S while i2 maps 1S to (0, 1) in R ⊕ S. But the unity element in R ⊕ S
is (1, 1) instead, and for i1 and i2 to be ring homomorphisms, we need
them both to map unity elements to unity elements.

A more general, if rather more technical, way of looking at this is to say that
the direct sum operation ⊕ is a product but not a coproduct in the category of
unital rings.

More generally, for R ⊕ S to have a subring isomorphic to R, there
must exist an injective homomorphism f : R → R ⊕ S. If there is an
injective homomorphism g : R → S, then we can use this to construct
the homomorphism f = idR ⊕ g that maps r ↦ (r, g(r)) for all r ∈ R.
For example, Z2 ⊕ Z2 has a subring isomorphic to Z2 via the diagonal
homomorphism Δ = idR ⊕ idR defined by Δ(r) = (r, r) for all r ∈ Z2.
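The difference between the inclusion and the diagonal map can be seen concretely for R = S = Z2. In this sketch, which uses hypothetical helper names throughout, both maps turn out to respect addition and multiplication, but only the diagonal map sends the unity 1 of Z2 to the unity (1, 1) of Z2 ⊕ Z2.

```python
from itertools import product


def respects_operations(f, dom, add_d, mul_d, add_c, mul_c):
    """Check f(a+b) = f(a)+f(b) and f(ab) = f(a)f(b) for all a, b in dom."""
    return all(
        f(add_d(a, b)) == add_c(f(a), f(b)) and f(mul_d(a, b)) == mul_c(f(a), f(b))
        for a, b in product(dom, repeat=2)
    )


Z2 = [0, 1]
add2 = lambda a, b: (a + b) % 2
mul2 = lambda a, b: (a * b) % 2
# Componentwise operations on Z2 (+) Z2, with elements stored as pairs.
add22 = lambda x, y: ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)
mul22 = lambda x, y: ((x[0] * y[0]) % 2, (x[1] * y[1]) % 2)

i1 = lambda r: (r, 0)      # inclusion into the first factor
diag = lambda r: (r, r)    # diagonal map

for name, f in (("i1", i1), ("diag", diag)):
    print(name, respects_operations(f, Z2, add2, mul2, add22, mul22), f(1) == (1, 1))
```

The first flag is True for both maps; the second is True only for the diagonal map, which is the obstruction described above.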
While on the subject of direct sums of rings, it's a good time to state
the following fact, which is an analogue of Proposition 1.32. This
result is a form of the famous Chinese Remainder Theorem, which we
will return to later in Chapter 10.
Proposition 9.8 (Chinese Remainder Theorem) The ring Zm ⊕ Zn is
isomorphic to the ring Zmn if and only if m and n are coprime.

Proof As before, we denote by [a]n the residue or remainder of a
modulo n. Let f : Zmn → Zm ⊕ Zn where f(a) = ([a]m, [a]n). That is,
we map any element of Zmn to the ordered pair in Zm ⊕ Zn consisting
of its residues modulo m and n, respectively.
This is certainly a ring homomorphism, since

f(a + b) = ([a + b]m, [a + b]n) = ([a]m + [b]m, [a]n + [b]n)
         = ([a]m, [a]n) + ([b]m, [b]n) = f(a) + f(b)
and
f(a·b) = ([ab]m, [ab]n) = ([a]m [b]m, [a]n [b]n)
       = ([a]m, [a]n)·([b]m, [b]n) = f(a)·f(b).

We need to show it's a bijection, which we can do by constructing a
well-defined inverse homomorphism f⁻¹ : Zm ⊕ Zn → Zmn.
If m and n are coprime then Euclid's Algorithm enables us to find
integers p, q ∈ Z such that pm + qn = 1. We can use this to define
f⁻¹(a, b) = [pmb + qna]mn
for all a ∈ Zm and b ∈ Zn. To check that this is indeed the inverse of
f, we need to verify that f(f⁻¹(a, b)) = (a, b) and f⁻¹(f(k)) = k for
all a ∈ Zm, b ∈ Zn and k ∈ Zmn. Firstly,

f(f⁻¹(a, b)) = f([pmb + qna]mn)
             = ([[pmb + qna]mn]m, [[pmb + qna]mn]n)
             = ([pmb + qna]m, [pmb + qna]n)
             = ([qna]m, [pmb]n).

But pm + qn = 1 and hence pm = 1 − qn, so pm ≡ 1 (mod n), therefore
pmb ≡ b (mod n). Similarly, qn = 1 − pm, so qn ≡ 1 (mod m),
hence qna ≡ a (mod m). Thus ([qna]m, [pmb]n) = (a, b). Secondly,

f⁻¹(f(k)) = f⁻¹([k]m, [k]n)
          = [pm[k]n + qn[k]m]mn
          = [pmk + qnk]mn
          = [(pm + qn)]mn [k]mn
          = k.

Therefore f is an isomorphism.
If, on the other hand, m and n are not coprime, then by Proposition 1.32
the additive abelian groups (Zmn, +) and (Zm ⊕ Zn, +) can't
be isomorphic since the order of (1, 1) in the latter group is equal to
lcm(m, n) which is strictly less than mn, the order of 1 in Zmn.

[Image: a page from the Sun Tzu Suan Ching. Credit: Wikimedia Commons]
The earliest known statement of the Chinese Remainder Theorem occurs
in the book Sun Tzu Suan Ching (or Sun Zǐ Suàn Jīng), which translates roughly
as The Mathematical Classic of Sun Tzu. This book, a page of which is shown
above, dates from the third century AD, and is one of a series of important
mathematical texts known during the Tang dynasty (618–907 AD) as the
Ten Computational Canons. Its author Sun Tzu (or Sun Zǐ) is believed to be
distinct from the military strategist of the same name, the author
of The Art of War, who is thought to have lived during the sixth century BC.
A more detailed discussion of the problem appears in the book Āryabhaṭīya
(499 AD) by the Indian mathematician Āryabhaṭa (476–550 AD), and also in
the Liber Abaci (1202) of Fibonacci (c.1170–c.1250).

Corollary 9.9 Suppose that n = p1^m1 · · · pk^mk, where p1, . . . , pk are
distinct primes, and m1, . . . , mk ∈ N. Then
Zn ≅ Z_(p1^m1) ⊕ · · · ⊕ Z_(pk^mk).

Proof If p1, . . . , pk are distinct primes, then p1^m1, . . . , pk^mk are pairwise
coprime. The result then follows by induction on k.
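The maps f and f⁻¹ from the proof of Proposition 9.8 translate directly into code. The following sketch uses a recursive extended Euclidean algorithm to find p and q with pm + qn = 1; the function names are illustrative and not taken from the text.

```python
def extended_gcd(a, b):
    """Return (g, p, q) with g = gcd(a, b) and p*a + q*b == g."""
    if b == 0:
        return a, 1, 0
    g, p, q = extended_gcd(b, a % b)
    return g, q, p - (a // b) * q


def crt_maps(m, n):
    """Return the pair (f, f_inv) from the proof of Proposition 9.8.
    Raises ValueError unless m and n are coprime."""
    g, p, q = extended_gcd(m, n)
    if g != 1:
        raise ValueError("m and n must be coprime")

    def f(a):
        # Residues of a modulo m and modulo n.
        return (a % m, a % n)

    def f_inv(a, b):
        # pm = 1 (mod n) and qn = 1 (mod m), so the result has residues (a, b).
        return (p * m * b + q * n * a) % (m * n)

    return f, f_inv


f, f_inv = crt_maps(4, 9)
print(f(29))         # (1, 2)
print(f_inv(1, 2))   # 29
```

Running both compositions over every element of Z36 and of Z4 ⊕ Z9 confirms that the two maps are mutually inverse in this case.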
The number-theoretic version of the theorem is as follows:
Chinese Remainder Theorem Suppose that n1, . . . , nk ∈ N are pairwise
coprime. Then for any a1, . . . , ak ∈ Z there exists x ∈ Z such that
x ≡ a1 (mod n1),
  ⋮
x ≡ ak (mod nk).
Furthermore, any two solutions x and y are congruent modulo n = n1 · · · nk.
The surjectivity of the homomorphism f in Proposition 9.8 ensures the
existence of x, and its injectivity ensures the modulo-n congruence property.

Back to our discussion of subrings. What about the other examples
of rings we've met so far? Do polynomial rings or matrix rings have
interesting subrings? Well, as it happens, we've met some matrix
subrings already:
Example 9.10 For any ring R, the matrix rings UTn(R), LTn(R)
and Dn(R) of, respectively, upper-triangular, lower-triangular and
diagonal matrices, are all subrings of Mn(R).
Furthermore, Dn(R) is a subring of both UTn(R) and LTn(R), and
all of them contain the prime subring ⟨I⟩ ≅ Z.

This example raises another interesting question: the ring Dn(R) of
diagonal matrices is a subring of both UTn(R) and LTn(R); in fact it
is exactly the intersection UTn(R) ∩ LTn(R). Proposition 2.9 tells us
that the intersection of two subgroups is itself a subgroup, so does the
same thing happen for subrings?
Proposition 9.11 Let S, T ⊆ R be unital subrings of some ring R. Then
their intersection S ∩ T is also a unital subring of R.

Proof Condition SR1 follows from Proposition 2.9. To check the
closure requirement SR2, we observe that for any two elements a, b ∈
S ∩ T, their product a·b must lie in S (because S is a subring and thus
closed under multiplication) and also in T (because T is also a subring
and hence also closed under multiplication). Since a·b ∈ S and
a·b ∈ T, it follows that a·b ∈ S ∩ T. Finally, SR3 holds because both
S and T are themselves unital subrings and must therefore contain the
unity element 1 ∈ R, and so their intersection does as well.

The matrix ring Mn(R) has subrings isomorphic to Z, Q, R and,
somewhat unexpectedly, C:
Example 9.12 The matrix ring Mn(R) has characteristic 0, since the
identity matrix I has infinite additive order. By Proposition 9.7, the
prime subring
⟨I⟩ = {nI : n ∈ Z}
must therefore be isomorphic to Z. We can extend this subring to
get the subrings
{qI : q ∈ Q} ≅ Q and {xI : x ∈ R} ≅ R.
The subring
\[ C = \left\{ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} : a, b \in \mathbb{R} \right\} \]
of M2(R) is isomorphic to the ring C via the isomorphism
\[ f : \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \mapsto a + bi. \]
With a little bit of work, we can extend this to a subring of Mn(R)
for n > 2, by considering the subset of n×n matrices of the form
\[ \begin{pmatrix}
a & -b & 0 & \cdots & 0 \\
b & a & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix} \]
for a, b ∈ R.
There are other simple ways of constructing subrings of Mn(R). For
example, if S is a subring of R, then Mn(S) will be a subring of Mn(R).
Or, for 0 < k < n, the ring Mk(R) embeds neatly as a subring of
Mn(R): map a given k×k matrix A to the n×n matrix whose top left
k×k submatrix is a copy of A, whose other diagonal entries are 1, and
has zero everywhere else.
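The 2×2 matrix model of C in Example 9.12 can be tested numerically. This sketch assumes the sign convention a + bi ↔ ((a, −b), (b, a)); the transposed convention works equally well. The helper names are hypothetical, and matrices are stored as nested tuples.

```python
def as_matrix(z):
    """Represent z = a + bi as the real 2x2 matrix ((a, -b), (b, a))."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))


def mat_add(x, y):
    return tuple(tuple(x[i][j] + y[i][j] for j in range(2)) for i in range(2))


def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


z, w = complex(1, 2), complex(3, -1)
# Sums and products of complex numbers match sums and products of matrices.
print(as_matrix(z + w) == mat_add(as_matrix(z), as_matrix(w)))   # True
print(as_matrix(z * w) == mat_mul(as_matrix(z), as_matrix(w)))   # True
```

In particular the matrix of i squares to minus the identity matrix, which is the matrix counterpart of i² = −1.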

9.2 Homomorphisms and ideals

Mrs Erlynne: Ideals are dangerous things. Realities are better. They
wound, but they're better.
Lady Windermere: If I lost my ideals, I should lose everything.
Oscar Wilde (1854–1900),
Lady Windermere's Fan (1893)

Recall that a function f : R → S is a (ring) homomorphism if
f(a + b) = f(a) + f(b) and f(a·b) = f(a)·f(b)
for all a, b ∈ R.1
1 Definition 8.14, page 329.
Example 9.13 For a ring R, the identity homomorphism idR : R →
R maps a ↦ a for all a ∈ R. This homomorphism is an isomorphism.

Example 9.14 For any rings R and S, the zero homomorphism
z : R → S maps a ↦ 0 for all a ∈ R.

Example 9.15 Let f : C → C by z ↦ z̄. Then
\[ f(z + w) = \overline{z + w} = \bar{z} + \bar{w} = f(z) + f(w), \]
\[ f(zw) = \overline{zw} = \bar{z}\,\bar{w} = f(z) f(w), \]
and so complex conjugation is a ring homomorphism. Since it's also
bijective, it's an automorphism too.

Back in Chapter 4 we obtained some basic properties satisfied by
group homomorphisms. In particular, we found that f(eG) = eH for
any group homomorphism f : G → H;2 that is, a group homomorphism
maps the identity of its domain to the identity of its codomain.
2 Proposition 4.24, page 112.
Proposition 8.16 confirmed that ring homomorphisms map additive
identities to additive identities, and also that surjective ring
homomorphisms map unity elements to unity elements (at least for rings that
have such elements).
But Proposition 8.16 has a loophole: what happens to the unity element
when the homomorphism isn't surjective?
Proposition 9.16 Let f : R → S be a ring homomorphism. Then
f(1R)² = f(1R).

Proof Observe that f(1R) = f(1R·1R) = f(1R)·f(1R).


What this means is that a non-surjective ring homomorphism needn't
map the unity element 1R ∈ R to the unity element 1S ∈ S, although
certainly this is permitted and will often be the case with many of
the rings and homomorphisms we'll consider. However, all we really
require is that 1R is mapped to some element s ∈ S satisfying the
condition s² = s.
Such an element is said to be idempotent. It is always the case that
the zero element and (if it exists) the unity element are idempotent.
Sometimes, however, there may be other elements satisfying this
condition as well.
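To see which elements can serve as the image of a unity element, it helps to list the idempotents of a few small rings. This is a minimal sketch for the rings Zn; the function name `idempotents` is illustrative.

```python
def idempotents(n):
    """Return the elements s of Zn satisfying s*s = s (mod n)."""
    return [s for s in range(n) if (s * s) % n == s]


for n in (4, 6, 7, 12):
    print(n, idempotents(n))
# 4 [0, 1]
# 6 [0, 1, 3, 4]
# 7 [0, 1]
# 12 [0, 1, 4, 9]
```

The extra idempotents 3 and 4 in Z6 are exactly the unity elements of the subsets {0, 3} and {0, 2, 4} met in Example 9.1.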
This leads us to the following clarification.
Definition 9.17 Let f : R → S be a ring homomorphism. Then f is
unital if f(1R) = 1S, and nonunital if f(1R) ≠ 1S, or if either R or S
are nonunital rings.

Any nontrivial ring R admits at least one unital homomorphism,
namely the identity map idR : R → R mapping a ↦ a for all a ∈ R,
and at least one nonunital homomorphism, the zero map zR : R → R
given by a ↦ 0 for all a ∈ R.
The integer ring Z has only two idempotent elements, 0 and 1.3
Because of this, the zero homomorphism is the only nonunital
homomorphism from Z to itself.
3 This is a consequence of the fact that Z is an integral domain.
In fact, an endomorphism f : Z → Z is completely determined by
what happens to 1. We know from Proposition 8.16 that f(0) = 0,
and 1 is very nearly as limited in its choice of destination: by
Proposition 9.16 it can map either to 0 or 1. The former gives us the
zero homomorphism while the latter gives the identity homomorphism.
The destination of any other n ∈ Z is determined by
f(n) = f(n·1) = f(1 + · · · + 1) = f(1) + · · · + f(1) = n f(1).
In general, the image of the prime subring ⟨1⟩ will be completely
determined in this way.
We will usually require ring homomorphisms to be unital. Nonunital
homomorphisms are interesting, but unital ones have certain nice
properties that will sometimes come in useful. In particular, as we'll
see in Chapter 10, they map invertible elements to invertible elements.
The following two examples show that nonunital homomorphisms
can often be defined between relatively straightforward unital rings.
Example 9.18 Let f : Z2 → Z6 such that 0 ↦ 0 and 1 ↦ 3, and let
g : Z3 → Z6 such that 0 ↦ 0, 1 ↦ 4 and 2 ↦ 2. Then f and g are
both nonunital homomorphisms.
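The claims in Example 9.18 can be verified exhaustively, since the domains are tiny. In this sketch each map Zm → Zn is stored as a dictionary; the checker `is_ring_hom` and the extra map `h` are hypothetical additions for illustration.

```python
def is_ring_hom(f, m, n):
    """Check that the dict f : Zm -> Zn preserves addition and multiplication."""
    return all(
        f[(a + b) % m] == (f[a] + f[b]) % n and f[(a * b) % m] == (f[a] * f[b]) % n
        for a in range(m)
        for b in range(m)
    )


f = {0: 0, 1: 3}          # Z2 -> Z6
g = {0: 0, 1: 4, 2: 2}    # Z3 -> Z6
h = {0: 0, 1: 1}          # a naive attempt at a unital map Z2 -> Z6

print(is_ring_hom(f, 2, 6), f[1] == 1)   # True False
print(is_ring_hom(g, 3, 6), g[1] == 1)   # True False
print(is_ring_hom(h, 2, 6))              # False
```

The last line shows why the unital choice is not available here: sending 1 to 1 fails to respect addition, because 1 + 1 = 0 in Z2 but not in Z6.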

Example 9.19 Let f : Z M2 (Z) by n 7 2n2n nn . This satisfies


 

the homomorphism conditions, but doesn't map 1 ∈ Z to the 2×2
identity matrix \( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \) ∈ M2(Z).
 

An important class of homomorphisms can be defined on polynomial
rings:
Example 9.20 Let Z[x] denote the ring of polynomials with integer
coefficients. Then we can regard an arbitrary polynomial p = a0 +
a1 x + · · · + an x^n ∈ Z[x] as a function p : Z → Z. (In fact, this is
usually the first context in which we meet polynomials.)
So, for example, the polynomial q = x² − 2x + 1 defines a function
q : Z → Z which amongst other things maps 1 ↦ 1² − 2·1 + 1 = 0,
maps 0 ↦ 0² − 2·0 + 1 = 1, and maps −1 ↦ (−1)² − 2(−1) + 1 = 4,
and so on.
A different way of looking at this is to keep the argument fixed and
vary the polynomial. So, for some other polynomial r = 5x⁴ + 2x − 1
we have r(1) = 5·1⁴ + 2·1 − 1 = 6. This viewpoint enables us to
define a function f1 : Z[x] → Z given by p ↦ p(1) ∈ Z for any
polynomial p ∈ Z[x].
So, in particular, f1(q) = q(1) = 0 while f1(r) = r(1) = 6, and so
forth.
More generally, for any n ∈ Z we can define a function evn : Z[x] →
Z in the analogous way, mapping a given polynomial p to its
evaluation p(n) ∈ Z.
Even more generally, for any ring R and fixed element a ∈ R we
can define the evaluation homomorphism eva : R[x] → R such that
p ↦ p(a) ∈ R.
This can further be extended to multivariate polynomial rings: given
a ring R and an ordered k-tuple (a1, . . . , ak) of elements of R, we can
define the evaluation homomorphism ev(a1,...,ak) : R[x1, . . . , xk] → R
by p ↦ p(a1, . . . , ak) ∈ R.
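The homomorphism property of evaluation can be spot-checked on the two polynomials from Example 9.20. In the sketch below a polynomial in Z[x] is stored as a list of coefficients [a0, a1, ..., an]; this representation and the helper names are assumptions made for illustration.

```python
def evaluate(p, a):
    """Evaluate the polynomial with coefficients p = [a0, a1, ..., an] at a,
    using Horner's rule."""
    result = 0
    for c in reversed(p):
        result = result * a + c
    return result


def poly_add(p, q):
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [x + y for x, y in zip(p, q)]


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


q = [1, -2, 1]          # x^2 - 2x + 1
r = [-1, 2, 0, 0, 5]    # 5x^4 + 2x - 1

print(evaluate(q, 1), evaluate(q, 0), evaluate(q, -1))   # 0 1 4
print(evaluate(r, 1))                                    # 6

# ev_a respects sums and products of polynomials for each sample point a.
for a in range(-3, 4):
    assert evaluate(poly_add(q, r), a) == evaluate(q, a) + evaluate(r, a)
    assert evaluate(poly_mul(q, r), a) == evaluate(q, a) * evaluate(r, a)
```

A finite check like this is evidence rather than proof, but it makes the two homomorphism conditions concrete.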

Now we look at some of the constructions that proved so important
in group theory. In particular we want to define images and kernels
for ring homomorphisms, and also derive the correct notion of a
coset. The first of these is straightforward: if f : R → S is a (possibly
nonunital) homomorphism of (possibly nonunital) rings R and S, then
its image is defined as
im(f) = {s ∈ S : s = f(r) for some r ∈ R}.
Proposition 4.25 tells us that the image of a group homomorphism
is a subgroup of its codomain. Does the same thing happen for ring
homomorphisms? Is the image of a ring homomorphism always a
subring of its codomain? We have to be a little careful here, and
consider the various possibilities.
Proposition 9.21 Let R and S be rings, with or without unity elements,
and let f : R → S be a (possibly nonunital) ring homomorphism. Then:
(i) If R is nonunital, then im f is a (possibly nonunital) subring of S.
(ii) If R is unital then im f is a subring of S, and if f is a unital
homomorphism, im f is a unital subring of S.

Proof Proposition 4.25 ensures that condition SR1 is satisfied: im f
is an abelian subgroup of S. Closure under multiplication, condition
SR2, is satisfied since for any two elements f(a), f(b) ∈ im f the fact
that f is a ring homomorphism ensures that
f(a)·f(b) = f(a·b).
Hence im f is a (possibly nonunital) subring of S.
If R has a unity element 1R then for any f(a) ∈ im f we have
f(1R)·f(a) = f(1R·a) = f(a),
and so f(1R) acts as a unity element in im f. This satisfies condition
SR3, and so im f is a subring of S.
If, in addition, f is a unital homomorphism, we know that f(1R) = 1S,
which is unique by Proposition 8.9, and since we saw just now that
f(1R) is the unity element in im f, condition SR4 is satisfied, and im f
is a unital subring of S.
Example 9.22 Let f : Z2 → Z6 be the homomorphism from
Example 9.18 mapping 0 ↦ 0 and 1 ↦ 3. Then im(f) = {0, 3} ≅ Z2
is a subring of Z6, but not a unital subring since f isn't a unital
homomorphism.
Similarly, let g : Z3 → Z6 be the other homomorphism from
Example 9.18 mapping 0 ↦ 0, 1 ↦ 4 and 2 ↦ 2. Then im(g) = {0, 2, 4} ≅
Z3 is also a subring of Z6 but again not a unital subring.

Example 9.23 Let f : Z M2 (Z) be the nonunital homomorphism


from Example 9.19 mapping n 7 2n2n nn . Then im( f ) is a subring
 

of M2 (Z) isomorphic to Z. It has unity element f (1) = 22 11 but


 

is not a unital subring in the sense of Definition 9.2.

The definition of the image of a ring homomorphism is straightforward:
it's the obvious analogue of the corresponding notion for
functions and group homomorphisms. Less obvious is how we should
go about defining kernels of ring homomorphisms.
For group homomorphisms, we define the kernel as the preimage of
the identity element; that is, everything in the domain that maps to
the identity element in the codomain. But a ring has two elements
that fulfil that rôle, and it's not immediately obvious which one we
should use.
In fact, there are a few reasons why we should use the additive
identity 0 rather than the multiplicative identity 1. Firstly, we want
our definition to extend to nonunital rings as well as unital ones,
and using the multiplicative identity would prevent us from doing
that. Secondly, rings are in some sense abelian groups with extra
structure, and ring homomorphisms are special kinds of abelian group
homomorphisms, so we might reasonably expect ring homomorphism
kernels to be special sorts of abelian group homomorphism kernels. Thirdly,
as we'll see in a little while, using the additive identity enables us to
define the correct analogues of cosets and normal subgroups so that
we can formulate a nice theory of quotient rings.
Definition 9.24 Let $f : R \to S$ be a ring homomorphism. Then the
kernel of $f$ is the set
$$\ker(f) = \{ r \in R : f(r) = 0_S \}.$$
In Chapter 4 we found that kernels of group homomorphisms are
normal subgroups (and also that every normal subgroup is the kernel
of some group homomorphism; see Proposition 4.29, page 115). For kernels
of ring homomorphisms, things are slightly more complicated.
If $f : R \to S$ is a ring homomorphism then certainly $\ker(f)$ is an
additive (normal) subgroup of $(R, +)$ by Proposition 4.29. And $\ker(f)$
is closed under multiplication, since for any $a, b \in \ker(f)$ we have
$f(a) = f(b) = 0$, and so
$$f(ab) = f(a) f(b) = 0 \cdot 0 = 0,$$
hence $ab \in \ker(f)$ as well. So kernels of ring homomorphisms are at
least nonunital subrings. But $\ker(f)$ need not have a unity element,
so it need not qualify for our definition of a subring (Definition 9.2,
page 361). And even if it does, that unity element won't necessarily be
the same as the unity element in the ambient ring. As we'll see in a
little while, $\ker(f)$ will be a unital subring in very specific circumstances.
For the moment, however, we'll look at a few examples.
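Example 9.25 Let $f : \mathbb{Z}_6 \to \mathbb{Z}_2$ and $g : \mathbb{Z}_6 \to \mathbb{Z}_3$ such that
$$f(k) = \begin{cases} 0 & \text{if } k = 0, 2, 4 \\ 1 & \text{if } k = 1, 3, 5 \end{cases}
\qquad\text{and}\qquad
g(k) = \begin{cases} 0 & \text{if } k = 0, 3 \\ 1 & \text{if } k = 1, 4 \\ 2 & \text{if } k = 2, 5 \end{cases}$$
Then $\ker(f) = \{0, 2, 4\}$ and $\ker(g) = \{0, 3\}$ are both subrings of $\mathbb{Z}_6$. Each has a unity
element (4 in the case of $\ker(f)$, and 3 in $\ker(g)$) but neither is equal
to 1, the unity element of the ambient ring $\mathbb{Z}_6$.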
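As a quick check of Example 9.25, the sketch below computes both kernels and also tests whether multiplying a kernel element by an arbitrary element of $\mathbb{Z}_6$ stays inside the kernel. It assumes that $f$ and $g$ are reduction modulo 2 and 3 respectively (which agrees with the case-by-case definitions above); the helper names are illustrative only.

```python
# Sketch: kernels of the homomorphisms Z_6 -> Z_2 and Z_6 -> Z_3 of Example 9.25.
# Helper names (kernel, unity_of, absorbs) are hypothetical.

N = 6

def kernel(f):
    """Elements of Z_6 mapped to 0 by f."""
    return [k for k in range(N) if f(k) == 0]

def unity_of(subset):
    """Element acting as multiplicative identity on subset (mod 6), or None."""
    for e in subset:
        if all((e * a) % N == a for a in subset):
            return e
    return None

def absorbs(subset):
    """True if r*k lies in subset for every r in Z_6 and every k in subset."""
    return all((r * k) % N in subset for r in range(N) for k in subset)

f = lambda k: k % 2
g = lambda k: k % 3

ker_f, ker_g = kernel(f), kernel(g)
print(ker_f, unity_of(ker_f), absorbs(ker_f))   # [0, 2, 4] 4 True
print(ker_g, unity_of(ker_g), absorbs(ker_g))   # [0, 3] 3 True
```

The absorption check anticipates the property that distinguishes kernels from arbitrary subrings.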

Example 9.26 Let $n$ be a positive integer greater than 1. Then we
can define a homomorphism $f : \mathbb{Z} \to \mathbb{Z}_n$ such that $f(k) = [k]_n$ for
all $k \in \mathbb{Z}$. Then $\ker(f) = n\mathbb{Z}$, which we know from Proposition 9.6
is a nonunital subring of $\mathbb{Z}$.

Example 9.27 Let $a \in \mathbb{Z}$ and let $\mathrm{ev}_a : \mathbb{Z}[x] \to \mathbb{Z}$ be the evaluation
homomorphism from Example 9.20. Then $\ker(\mathrm{ev}_a)$ consists of those
polynomials $p$ in $\mathbb{Z}[x]$ for which $p(a) = 0$. In other words, $\ker(\mathrm{ev}_a)$
comprises all polynomials which have $a$ as a root, or equivalently all
polynomials which have $(x - a)$ as a factor.
This kernel is a nonunital subring of $\mathbb{Z}[x]$: it doesn't contain $1 \in \mathbb{Z}[x]$,
and no other element serves as a unity element.

Example 9.26 shows that the kernel of the modulo-$n$ residue map
$f_n = [\,\cdot\,]_n : \mathbb{Z} \to \mathbb{Z}_n$ is the nonunital subring $n\mathbb{Z} \subset \mathbb{Z}$ consisting of all
integer multiples of $n$. Any multiple of $n$ times an arbitrary integer is
also a multiple of $n$, since $kn \cdot m = (km)n \in n\mathbb{Z}$.
Similarly, Example 9.27 shows that the kernel of the evaluation
homomorphism $\mathrm{ev}_a : \mathbb{Z}[x] \to \mathbb{Z}$ consists of all polynomials $p$ that have
$(x - a)$ as a factor (or, equivalently, for which $p(a) = 0$). Any other
polynomial $q \in \mathbb{Z}[x]$ multiplied by one of these polynomials will also
have $(x - a)$ as a factor, and therefore belong to $\ker(\mathrm{ev}_a)$.
More precisely, for any $k \in \ker(f_n)$ and any $m \in \mathbb{Z}$ both $mk$ and
$km$ lie in $\ker(f_n)$. Similarly, for any polynomial $p \in \ker(\mathrm{ev}_a)$ and
any polynomial $q \in \mathbb{Z}[x]$ both $pq$ and $qp$ lie in $\ker(\mathrm{ev}_a)$.
In fact, this is true for the kernel of any ring homomorphism:
Proposition 9.28 Let $f : R \to S$ be a ring homomorphism. Then for any
$k \in \ker(f)$ and $a \in R$ we have $ka \in \ker(f)$ and $ak \in \ker(f)$.

Proof This follows almost immediately from the multiplicative property
of a ring homomorphism, and from Proposition 8.8 (i), since
$$f(ak) = f(a) f(k) = f(a) \cdot 0 = 0,$$
so $ak \in \ker(f)$. A similar argument confirms $ka \in \ker(f)$ too.
Motivated by this discussion and by Proposition 4.29 we are ready to
introduce the following definition.
Definition 9.29 Let $R$ be a ring and $I$ be a subset of $R$ satisfying the
first two subring conditions from Definition 9.2:
I1 $(I, +)$ is a subgroup of $(R, +)$, and
I2 $ab \in I$ for all $a, b \in I$.
Then $I$ is a left ideal of $R$ if
I3L $ai \in I$ for all $i \in I$ and $a \in R$,
a right ideal of $R$ if
I3R $ia \in I$ for all $i \in I$ and $a \in R$,
and an ideal (or sometimes a two-sided ideal) if both I3L and I3R
(the absorption conditions) are satisfied.
Ideals are the ring-theoretic analogues of normal subgroups, and in
the next section we will learn how to use them to form quotient rings.
In particular, from Proposition 9.28 and the discussion following
Definition 9.24 we know that kernels of ring homomorphisms are
ideals. That is, for any ring homomorphism $f : R \to S$, the kernel
$\ker(f)$ is an ideal of $R$. Later on, we'll show the converse, that all
ideals can be regarded as kernels of ring homomorphisms. For the
moment, however, we will look at a few more examples.
Example 9.30 Let $R$ be a commutative unital ring and let $a \in R$.
Then
$$\langle a \rangle = \{ ra : r \in R \}$$
is the principal ideal of $R$ generated by $a$.
If $R$ is a noncommutative unital ring, then for any $a \in R$ we can
define the left and right principal ideals
$$Ra = \{ ra : r \in R \} \quad\text{and}\quad aR = \{ ar : r \in R \}$$
and the (two-sided) principal ideal
$$RaR = \Big\{ \sum_{i=1}^{n} r_i a s_i : r_i, s_i \in R,\ n \in \mathbb{N} \Big\}$$
generated by $a$.

A (left, right or two-sided) principal ideal is the smallest ideal of the
appropriate type that contains a single given element of the ring in question.
Example 9.31 The previous example can be extended to finite sets of
elements of a commutative unital ring $R$. Suppose that $a_1, \ldots, a_k \in R$.
Then
$$\langle a_1, \ldots, a_k \rangle = \{ r_1 a_1 + \cdots + r_k a_k : r_1, \ldots, r_k \in R \}$$
is the ideal of $R$ generated by $a_1, \ldots, a_k$.

The following are two concrete examples of principal ideals.

Example 9.32 The principal ideal $\langle x \rangle$ of $\mathbb{Z}[x]$ generated by $x$ consists
of those polynomials in $\mathbb{Z}[x]$ that have zero constant term.
This is clearly an ideal, since it forms an additive abelian group,
is closed under multiplication, and satisfies both the left and right
absorption conditions. Also, it is equal to the kernel of the evaluation
homomorphism $\mathrm{ev}_0 : \mathbb{Z}[x] \to \mathbb{Z}$, which is known to be an ideal.

Example 9.33 The principal ideal $\langle n \rangle$ of $\mathbb{Z}$ generated by a given
positive integer $n$ consists of all integer multiples of $n$. This is exactly
the ideal discussed in Example 9.26.
Every ring $R$ has at least two ideals:

Example 9.34 For any ring $R$, the trivial subring $\{0\}$ is an ideal: it
satisfies the left and right absorption conditions by Proposition 8.8 (i).

Example 9.35 Any ring $R$ is an ideal of itself, with the left and right
absorption conditions reducing to the requirement that $R$ be closed
under multiplication, which it is.

Definition 9.29 also includes two slightly more general objects: left and
right ideals. If a ring $R$ is commutative, then clearly any left ideal will
also be a right ideal, and vice versa. However, if $R$ isn't commutative
then we have the possibility of left ideals that aren't right ideals, and
right ideals that aren't left ideals. The following example explores this
idea.
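Example 9.36 Let
$$I = \left\{ \begin{pmatrix} 0 & x \\ 0 & y \end{pmatrix} : x, y \in \mathbb{R} \right\}.$$
This is a left ideal but not a right ideal of the ring $M_2(\mathbb{R})$: it
forms an abelian group under addition, is closed under
multiplication, and satisfies the left absorption condition I3L from
Definition 9.29, but it does not satisfy the right absorption condition I3R.
Similarly, the set
$$J = \left\{ \begin{pmatrix} 0 & 0 \\ x & y \end{pmatrix} : x, y \in \mathbb{R} \right\}$$
is a right ideal but not a left ideal in $M_2(\mathbb{R})$.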
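The one-sidedness in Example 9.36 can be seen with a single test matrix. The sketch below uses integer entries (which are in particular real numbers) and plain tuples rather than any matrix library; the names `matmul`, `in_I` and `in_J` and the particular matrices `A`, `X`, `Y` are chosen purely for illustration.

```python
# Sketch: I (first column zero) absorbs multiplication on the left only,
# J (first row zero) absorbs multiplication on the right only.

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def in_I(M):
    """Membership test for I: both entries of the first column are zero."""
    return M[0][0] == 0 and M[1][0] == 0

def in_J(M):
    """Membership test for J: both entries of the first row are zero."""
    return M[0][0] == 0 and M[0][1] == 0

A = ((1, 2), (3, 4))   # an arbitrary element of the ambient ring
X = ((0, 5), (0, 7))   # an element of I
Y = ((0, 0), (5, 7))   # an element of J

print(in_I(matmul(A, X)), in_I(matmul(X, A)))   # True False
print(in_J(matmul(Y, A)), in_J(matmul(A, Y)))   # True False
```

A single failing product is enough to show that an absorption condition fails, whereas the positive statements (that $I$ is a left ideal and $J$ a right ideal) require the general argument.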

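In most of what follows, however, we will focus on two-sided ideals,
because they are exactly what we need to define quotient rings.
Example 9.37 The ring $\mathbb{Z}_{12}$ has six ideals, namely
$$\langle 0 \rangle = \{0\}, \quad \langle 6 \rangle = \{0, 6\}, \quad \langle 4 \rangle = \{0, 4, 8\},$$
$$\langle 3 \rangle = \{0, 3, 6, 9\}, \quad \langle 2 \rangle = \{0, 2, 4, 6, 8, 10\}, \quad \langle 1 \rangle = \mathbb{Z}_{12}.$$
We can arrange them into the lattice diagram shown in Figure 9.1.
Figure 9.1: Lattice diagram for the ring $\mathbb{Z}_{12}$ (from top to bottom: $\mathbb{Z}_{12}$; $\langle 2 \rangle$ and $\langle 3 \rangle$; $\langle 4 \rangle$ and $\langle 6 \rangle$; $\langle 0 \rangle$).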
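Since $\mathbb{Z}_{12}$ has only $2^{12}$ subsets, the list of ideals in Example 9.37 can be confirmed by exhaustive search. The following sketch tests conditions I1 and I3 of Definition 9.29 directly (in a commutative ring the left and right absorption conditions coincide, and I2 follows from absorption); the function name `is_ideal` is illustrative.

```python
# Sketch: brute-force enumeration of the ideals of Z_12.
from itertools import combinations

def is_ideal(subset, n):
    """Check whether subset is an ideal of Z_n."""
    I = set(subset)
    if 0 not in I:
        return False
    # I1: a nonempty subset closed under subtraction is an additive subgroup.
    if any((a - b) % n not in I for a in I for b in I):
        return False
    # I3: absorption (left and right agree because Z_n is commutative).
    return all((r * i) % n in I for r in range(n) for i in I)

n = 12
ideals = [
    set(c)
    for size in range(1, n + 1)
    for c in combinations(range(n), size)
    if is_ideal(c, n)
]

for I in ideals:
    print(sorted(I))
```

The search finds exactly six ideals, matching the list above.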

The only one of these ideals of $\mathbb{Z}_{12}$ that contains the unity element 1
is the full ring $\mathbb{Z}_{12} = \langle 1 \rangle$. Furthermore, the only ideal of $M_2(\mathbb{Z})$ that
contains the identity $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is $M_2(\mathbb{Z})$ itself, and $\mathbb{R}[x]$ is the only ideal
of $\mathbb{R}[x]$ that contains 1. This is true in general:

Proposition 9.38 Let $R$ be a unital ring, and let $I$ be an ideal of $R$ that
contains 1. Then $I = R$.
Proof The left and right absorption conditions imply that $ir \in I$ and
$ri \in I$ for any $i \in I$ and $r \in R$. In particular, setting $i = 1$ it follows
that $1r = r = r1 \in I$ for any $r \in R$, and hence $R \subseteq I$. Since $I \subseteq R$
by definition, we have $I = R$.
We can generalise this result further, to show that if an ideal contains
any invertible element then it must be the entire ring.

Proposition 9.39 Let $R$ be a unital ring, and let $I$ be an ideal of $R$ that
contains a unit $u$. Then $I = R$.

Proof Suppose that $u \in I$ is a unit. Then the left and right absorption
conditions imply that $ur \in I$ and $ru \in I$ for any $r \in R$. Setting
$r = u^{-1}$ we see in particular that $u u^{-1} = 1 = u^{-1} u \in I$ as well, and
hence by Proposition 9.38 it follows that $I = R$.

9.3 Quotient rings

My methods are methods of working and thinking, and have therefore
anonymously invaded everywhere.
Emmy Noether (1882–1935),
letter to Helmut Hasse (1898–1979),
12 November 1931

Now we have what seems to be the correct notion of a "normal
subring", namely an ideal, we have to devise the corresponding
ring-theoretic definition of cosets. Again, we must decide whether to
adopt the additive approach and use cosets of the form $a+I$, or to
use multiplicative cosets of the form $aI$. As before, taking the view
that rings are special types of additive abelian groups, we follow the
additive approach to get the following definition.
Definition 9.40 Let $R$ be a ring, and let $I$ be an ideal of $R$. Then for
any element $a \in R$ we define the coset
$$a + I = \{ a + i : i \in I \}.$$
A little further thought gives us another reason why multiplicative
cosets are unsuitable for our purposes here. We need the cosets to
partition the full ring into subsets of equal size, and if we have a proper
ideal $I \subset R$, the absorption conditions I3L and I3R from Definition 9.29
ensure that for all elements $a \in R$ and $i \in I$, both $ai$ and $ia$ lie in $I$.
So no element of $R \setminus I$ belongs to any multiplicative coset $aI$ or $Ia$.
Furthermore, Proposition 4.38 tells us that for some homomorphism
$f : R \to S$ with $I = \ker(f)$, the additive cosets $a+I$ can be expressed
as preimages of $f$ in the same way as for group homomorphisms.
Now that we have the correct ring-theoretic versions of normal subgroups
and cosets, we want to use them to define quotient rings. The
approach we will use is the obvious analogue of the one we used for
quotient groups. Given an ideal $I \subseteq R$, we define $R/I$ to be the set
of cosets of $I$ in $R$. We can then define a suitable addition operation
in the same way as for quotient groups, at which point it doesn't
seem too unreasonable to ask if we can also define an appropriate
multiplication operation as well. The following proposition confirms
this and makes the construction precise.
Proposition 9.41 Let $R$ be a (possibly nonunital) ring and let $I \subseteq R$ be
an ideal. Then the set
$$R/I = \{ a+I : a \in R \}$$
forms a (possibly nonunital) quotient ring, with addition and multiplication
operations
$$(a+I) + (b+I) = (a+b)+I,$$
$$(a+I)(b+I) = (ab)+I$$
for any $a, b \in R$. Furthermore, if $R$ is unital, so is $R/I$, and if $R$ is
commutative, $R/I$ is as well.

Proof First we must prove that $R/I$ forms an abelian group with the
specified addition operation, and this essentially follows from Proposition 3.24.
The addition operation is well-defined and associative, the
additive identity is the coset $I = 0+I$, and any coset $a+I$ has inverse
$(-a)+I$.
The multiplication operation is also well-defined. If $a+I = a'+I$ and
$b+I = b'+I$ then $a' = a+i$ and $b' = b+j$ for some $i, j \in I$, so
$$a'b' = ab + (aj + ib + ij),$$
and $aj + ib + ij \in I$ by the absorption conditions. Hence $(a'b')+I = (ab)+I$.
The multiplication operation in $R/I$ is associative since the one in $R$
is: for any $a, b, c \in R$ we have
$$((a+I)(b+I))(c+I) = ((ab)+I)(c+I) = ((ab)c)+I = (a(bc))+I = (a+I)((b+I)(c+I)).$$
The distributivity conditions are satisfied too, since
$$(a+I)((b+I) + (c+I)) = (a+I)((b+c)+I) = (a(b+c))+I = ((ab) + (ac))+I$$
$$= ((ab)+I) + ((ac)+I) = (a+I)(b+I) + (a+I)(c+I)$$
and
$$((a+I) + (b+I))(c+I) = ((a+b)+I)(c+I) = ((a+b)c)+I = (ac + bc)+I$$
$$= ((ac)+I) + ((bc)+I) = (a+I)(c+I) + (b+I)(c+I).$$
If $R$ is unital then so is $R/I$: the coset $1+I$ acts as a unity element in
$R/I$ since
$$(1+I)(a+I) = (1a)+I = a+I = (a1)+I = (a+I)(1+I)$$
for any $a \in R$. And if $R$ is commutative, then for any $a, b \in R$ we have
$$(a+I)(b+I) = (ab)+I = (ba)+I = (b+I)(a+I),$$
and hence $R/I$ is commutative.
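The key point in this construction is that the product of two cosets does not depend on the representatives chosen. The following sketch checks this exhaustively for one small case, the ideal $I = \{0, 4, 8\}$ of $\mathbb{Z}_{12}$; the choice of ring and ideal, and the names `coset` and `products`, are assumptions made for illustration.

```python
# Sketch: coset multiplication in Z_12 / <4> is independent of representatives.

n = 12
I = frozenset({0, 4, 8})

def coset(a):
    """The coset a + I, as a frozenset of residues mod 12."""
    return frozenset((a + i) % n for i in I)

def products(A, B):
    """All cosets (a*b) + I obtained from representatives a in A, b in B."""
    return {coset(a * b) for a in A for b in B}

cosets = {coset(a) for a in range(n)}

# Multiplication is well-defined exactly when every pair of cosets
# yields a single product coset, whichever representatives are used.
well_defined = all(len(products(A, B)) == 1 for A in cosets for B in cosets)
print(len(cosets), well_defined)   # 4 True
```

The same check would fail for a subset that is an additive subgroup but does not satisfy the absorption conditions, which is why ideals rather than arbitrary subrings are needed.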
Having now introduced the general construction and proved that it
works, it's time to look at some examples. The first one is very similar
to our motivating example of quotient groups from Section 3.2.

Example 9.42 Let $f_n : \mathbb{Z} \to \mathbb{Z}_n$ be the ring homomorphism that
maps an integer $k$ to its modulo-$n$ equivalence class $[k]_n$. Then the
kernel $K = \ker(f_n)$ is an ideal of $\mathbb{Z}$ and we can therefore form the
quotient ring $\mathbb{Z}/K$.
The kernel $K$ consists of all integers congruent to 0 modulo $n$; that
is, all integers of the form $kn$. A coset $m+K$ therefore consists
of integers of the form $kn+m$; that is, exactly those integers that
are congruent to $m$ modulo $n$. There are thus $n$ of these, namely
$$K = 0+K,\ 1+K,\ \ldots,\ (n-1)+K.$$
The quotient ring $\mathbb{Z}/K$ therefore consists of these $n$ cosets. Its additive
structure is exactly addition modulo $n$, and its multiplicative
structure is multiplication modulo $n$. Hence $\mathbb{Z}/K \cong \mathbb{Z}_n$.
This example should look familiar. We have a surjective homomorphism
$f_n : \mathbb{Z} \to \mathbb{Z}_n$, which means that $\operatorname{im}(f_n) = \mathbb{Z}_n$, and the example
above shows that $\mathbb{Z}/\ker(f_n) \cong \mathbb{Z}_n$. Putting all this together we have
$\mathbb{Z}/\ker(f_n) \cong \operatorname{im}(f_n)$, which is a ring-theoretic example of the First
Isomorphism Theorem for groups (Theorem 4.40, page 119). We'll come back
to this in a little while when we state and prove ring-theoretic versions
of all three Isomorphism Theorems.
Example 9.43 Looking again at the homomorphisms $f : \mathbb{Z}_6 \to \mathbb{Z}_2$
and $g : \mathbb{Z}_6 \to \mathbb{Z}_3$ in Example 9.25, we have
$$K_1 = \ker(f) = \{0, 2, 4\} \quad\text{and}\quad K_2 = \ker(g) = \{0, 3\}.$$
The cosets of $K_1$ in $\mathbb{Z}_6$ are
$$K_1 = 0+K_1 = 2+K_1 = 4+K_1 = \{0, 2, 4\}$$
$$\text{and}\quad 1+K_1 = 3+K_1 = 5+K_1 = \{1, 3, 5\}.$$
The quotient ring $\mathbb{Z}_6/K_1$ thus has two elements and must therefore
be isomorphic to $\mathbb{Z}_2$. By examining the addition and multiplication
of these cosets, we find that this is indeed the case.
The cosets of $K_2$ in $\mathbb{Z}_6$ are
$$K_2 = 0+K_2 = 3+K_2 = \{0, 3\},$$
$$1+K_2 = 4+K_2 = \{1, 4\},$$
$$\text{and}\quad 2+K_2 = 5+K_2 = \{2, 5\}.$$
There are three of these, and hence the quotient ring $\mathbb{Z}_6/K_2$ has three
elements and must be isomorphic to $\mathbb{Z}_3$; examination of the additive
and multiplicative structure confirms this.

The next example concerns evaluation homomorphisms in the polynomial
ring $\mathbb{Z}[x]$.

Example 9.44 Let $\mathrm{ev}_0 : \mathbb{Z}[x] \to \mathbb{Z}$ be the evaluation homomorphism
at $x = 0$. Then $K = \ker(\mathrm{ev}_0)$ consists of all polynomials $p \in \mathbb{Z}[x]$
for which $\mathrm{ev}_0(p) = p(0) = 0$. These are the polynomials with
zero constant term; that is, polynomials of the form
$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x$. This is exactly the principal
ideal $\langle x \rangle$ generated by $x$.
The cosets of $K$ are of the form $q(x)+K$, but after a little thought it
becomes apparent that they are determined entirely by their constant
terms and are thus all of the form $a_0+K$ for some $a_0 \in \mathbb{Z}$.
The additive structure of the quotient ring $\mathbb{Z}[x]/K$ is just ordinary
integer addition, since $(a_0+K) + (b_0+K) = (a_0+b_0)+K$; furthermore
the multiplicative structure is also the same as integer multiplication,
since $(a_0+K)(b_0+K) = (a_0 b_0)+K$. Hence $\mathbb{Z}[x]/K \cong \mathbb{Z}$.
This is also an example of the First Isomorphism Theorem for rings.
Also, recall from Section 3.2 that the process of passing to a quotient
group $G/H$ can be viewed as making everything in the normal
subgroup $H$ equal to each other, and to the identity, with whatever
contextual meaning that might have. For example, factoring the dihedral
group $D_3$ by the rotation subgroup $R_3$ effectively glues together
those points in the equilateral triangle under investigation that differ
by a $\frac{2\pi}{3}$ rotation.
Essentially the same thing happens with quotient rings: we choose an
ideal $I$ in a ring $R$, make all of the elements of $I$ in some sense the
same, and see what happens to $R$ as a result. In Example 9.44 we
took the quotient of $\mathbb{Z}[x]$ by the principal ideal $\langle x \rangle$, which equated $x$
with 0. This killed any occurrences of $x$ or its higher powers, leaving
just the constant terms. The next example explores this idea further.
Example 9.45 Let $\mathbb{Z}[x]$ as usual denote the ring of polynomials in $x$
over $\mathbb{Z}$, and let $\langle 2 \rangle$ be the principal ideal generated by 2; this consists
of all polynomials with even integer coefficients.
The cosets of $\langle 2 \rangle$ in $\mathbb{Z}[x]$ are all of the form $p+\langle 2 \rangle$, where
$p(x) = a_n x^n + \cdots + a_1 x + a_0$, with $a_0, \ldots, a_n \in \{0, 1\}$.
Hence the quotient ring $\mathbb{Z}[x]/\langle 2 \rangle$ consists of the (cosets determined
by) finite-degree polynomials in $x$ with coefficients all equal to either
0 or 1. By considering the additive and multiplicative structure of
these cosets, it turns out that $\mathbb{Z}[x]/\langle 2 \rangle \cong \mathbb{Z}_2[x]$, the ring of polynomials
in $x$ over $\mathbb{Z}_2$.
In this example, factoring by the ideal $\langle 2 \rangle$ has the effect of setting 2
equal to 0, but leaving $x$ and its powers unchanged. This results in a
change of coefficient ring from $\mathbb{Z}$ to $\mathbb{Z}_2$.
Example 9.46 We can factor the ring $\mathbb{Z}[x]$ by more complicated
ideals. In particular, $I = \langle x^2+1 \rangle$ consists of all polynomials over $\mathbb{Z}$
which have $(x^2+1)$ as a factor. In the quotient $\mathbb{Z}[x]/\langle x^2+1 \rangle$, two
polynomials are equivalent if they differ by a multiple of $x^2+1$.
But also, since $x^2 + 1 \equiv 0$, we have $x^2 \equiv -1$. This means that any
$p \in \mathbb{Z}[x]$ maps in the quotient $\mathbb{Z}[x]/\langle x^2+1 \rangle$ to the corresponding
polynomial with every $x^2$ replaced by $-1$. So, for example,
$x^6 + 3x^5 + 2x^2 - 1$ maps to $(-1)^3 + 3x(-1)^2 + 2(-1) - 1 = 3x - 4$.
The cosets in $\mathbb{Z}[x]/\langle x^2+1 \rangle$ are thus all of the form $(a+bx)+I$, where
$a, b \in \mathbb{Z}$. Their additive structure is much as we'd expect, with
$$((a+bx)+I) + ((c+dx)+I) = ((a+c)+(b+d)x)+I$$
for any $a, b, c, d \in \mathbb{Z}$. The multiplicative structure is interesting and
somewhat familiar:
$$((a+bx)+I)((c+dx)+I) = ((a+bx)(c+dx))+I$$
$$= (ac+(ad+bc)x+bdx^2)+I$$
$$= ((ac-bd)+(ad+bc)x)+I.$$
In fact, the quotient ring $\mathbb{Z}[x]/\langle x^2+1 \rangle$ is isomorphic to the ring $\mathbb{Z}[i]$
of Gaussian integers, via the isomorphism $(a+bx)+I \mapsto a+bi$.
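The reduction described in Example 9.46 is mechanical, so it is easy to sketch in code. Below, a polynomial is represented by its list of integer coefficients (constant term first), and a coset $(a+bx)+I$ by the pair `(a, b)`; this representation and the function names are assumptions made for illustration. Python's built-in complex numbers are used only as an independent check on the multiplication rule.

```python
# Sketch: arithmetic in Z[x]/<x^2 + 1> using the rule x^2 = -1.

def reduce_mod_x2_plus_1(coeffs):
    """Return (a, b) with p congruent to a + b*x, where coeffs[k] is the x^k coefficient."""
    a = b = 0
    for k, c in enumerate(coeffs):
        sign = -1 if (k // 2) % 2 else 1   # x^(2m) = (-1)^m and x^(2m+1) = (-1)^m x
        if k % 2 == 0:
            a += sign * c
        else:
            b += sign * c
    return a, b

def multiply(p, q):
    """Product of the cosets (a + b*x) + I and (c + d*x) + I."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

# x^6 + 3x^5 + 2x^2 - 1, constant term first
p = [-1, 0, 2, 0, 0, 3, 1]
print(reduce_mod_x2_plus_1(p))     # (-4, 3), that is, 3x - 4

# Compare with multiplication of Gaussian integers.
print(multiply((1, 2), (3, 4)))    # (-5, 10)
print(complex(1, 2) * complex(3, 4))
```

The agreement between `multiply` and complex multiplication reflects the isomorphism $(a+bx)+I \mapsto a+bi$.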

The following example concerns group rings (Example 8.31, page 337).

Example 9.47 Let $f : D_3 \to \mathbb{Z}_2$ be the group homomorphism that
maps $e, \rho, \rho^2 \mapsto 0$ and $m_1, m_2, m_3 \mapsto 1$. Then $\ker(f) = R_3$, the
rotation subgroup. This group homomorphism $f$ defines an induced
homomorphism $\bar{f} : \mathbb{Z}D_3 \to \mathbb{Z}\mathbb{Z}_2$ with kernel $K = \ker(\bar{f}) = \mathbb{Z}R_3$.
The elements of this kernel are formal sums of the form $ae + b\rho + c\rho^2$,
where $a, b, c \in \mathbb{Z}$, and the quotient ring $\mathbb{Z}D_3/K$ consists of the cosets
$$K = e+K = \rho+K = \rho^2+K = \{ ae + b\rho + c\rho^2 : a, b, c \in \mathbb{Z} \},$$
$$m_1+K = m_2+K = m_3+K = \{ am_1 + bm_2 + cm_3 : a, b, c \in \mathbb{Z} \}.$$
The quotient ring thus has two elements and is isomorphic to $\mathbb{Z}_2$.

By Proposition 4.29 we know that all kernels of group homomorphisms
are normal, and all normal subgroups are kernels of group
homomorphisms. We've already seen that kernels of ring homomorphisms
are ideals, and the next proposition restates this fact and proves the
converse too.
Proposition 9.48 Let $f : R \to S$ be a (possibly nonunital) ring
homomorphism. Then $\ker(f)$ is an ideal of $R$. Furthermore, any ideal $I$ of a
(possibly nonunital) ring $R$ can be expressed as the kernel of some ring
homomorphism $f : R \to S$.

Proof Let $f : R \to S$ be a (possibly nonunital) ring homomorphism.
Then $K = \ker(f)$ is an ideal: we know from the discussion following
Definition 9.24 that $(K, +)$ is an additive subgroup of $(R, +)$, and
that $K$ is closed under multiplication; furthermore Proposition 9.28
confirms the absorption conditions.
To show the converse, as with Proposition 4.29 we use the quotient
homomorphism $q : R \to R/I$ that maps every element $a \in R$ to the
coset $a+I$ in $R/I$. Then $I = \ker(q)$.
Examples 9.42 and 9.44, as noted earlier, suggest an analogue for rings
of the First Isomorphism Theorem for groups (Theorem 4.40, page 119),
which we now state and prove.
Theorem 9.49 (First Isomorphism Theorem) Let $f : R \to S$ be a
(possibly nonunital) ring homomorphism, and let $K = \ker(f)$. Then the
function $\phi : R/K \to \operatorname{im}(f)$ given by $\phi(a+K) = f(a)$ is an isomorphism.
That is,
$$R/\ker(f) \cong \operatorname{im}(f).$$
The following proof is analogous to the group-theoretic case.
Proof As before, we need to show that $\phi$ is a well-defined function, a
ring homomorphism, and a bijection.
Suppose that $a, b \in R$ such that the cosets $a+K$ and $b+K$ coincide.
Then $(a-b)$ must lie in the kernel $K$, and hence $f(a-b)$ lies in the
image $f(K) = \{0\}$. Hence $f(a-b) = 0$, and since $f$ is a ring
homomorphism,
$$0 = f(a-b) = f(a) - f(b),$$
therefore $f(a) = f(b)$. So if $a+K = b+K$ it follows that
$$\phi(a+K) = f(a) = f(b) = \phi(b+K),$$
and thus $\phi$ is well-defined.
Next we have to check that $\phi$ is a ring homomorphism. To do this we
have to check both the addition and multiplication conditions. Firstly,
$$\phi((a+K) + (b+K)) = \phi((a+b)+K) = f(a+b) = f(a) + f(b) = \phi(a+K) + \phi(b+K)$$
and secondly,
$$\phi((a+K)(b+K)) = \phi((ab)+K) = f(ab) = f(a) f(b) = \phi(a+K)\,\phi(b+K).$$
Every element of $\operatorname{im}(f)$ is of the form $f(a) = \phi(a+K)$ for some coset
$(a+K) \in R/K$, so $\phi$ is surjective.
To see that $\phi$ is injective, suppose that $\phi(a+K) = \phi(b+K)$ for some
$a, b \in R$. Then
$$0 = \phi(a+K) - \phi(b+K) = f(a) - f(b) = f(a-b)$$
and hence $(a-b) \in K = \ker(f)$, which means that $a+K = b+K$, and
therefore $\phi$ is injective.
Thus $\phi : R/K \to \operatorname{im}(f)$ is a well-defined ring isomorphism.
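To see the theorem at work in a finite setting, consider the map $k \mapsto 4k$ on $\mathbb{Z}_{12}$. This particular example is not taken from the text: it is a (nonunital) ring homomorphism because $16 \equiv 4 \pmod{12}$, so that $(4a)(4b) \equiv 4ab$. The sketch below, with illustrative variable names, verifies the homomorphism property, builds the cosets of the kernel, and checks that the induced map on cosets is a well-defined bijection onto the image.

```python
# Sketch: First Isomorphism Theorem for f(k) = 4k on Z_12 (an assumed example).

n = 12
f = lambda k: (4 * k) % n

# f respects addition and multiplication.
is_hom = all(
    f((a + b) % n) == (f(a) + f(b)) % n and f((a * b) % n) == (f(a) * f(b)) % n
    for a in range(n) for b in range(n)
)

K = frozenset(k for k in range(n) if f(k) == 0)   # kernel
image = {f(k) for k in range(n)}                  # image

def coset(a):
    """The coset a + K."""
    return frozenset((a + k) % n for k in K)

cosets = {coset(a) for a in range(n)}

# The induced map phi(a + K) = f(a) is well-defined if f is constant on each coset.
phi = {}
for C in cosets:
    values = {f(a) for a in C}
    assert len(values) == 1
    phi[C] = values.pop()

print(is_hom, sorted(K), sorted(image))        # True [0, 3, 6, 9] [0, 4, 8]
print(sorted(phi.values()) == sorted(image))   # True: phi is a bijection onto im(f)
```

Here $R/K$ has three cosets and $\operatorname{im}(f) = \{0, 4, 8\}$ has three elements, in line with $R/\ker(f) \cong \operatorname{im}(f)$.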
As with the group-theoretic version, this is an important and useful
theorem. In Example 9.46 we saw that $\mathbb{Z}[x]/\langle x^2+1 \rangle$ was isomorphic to
the ring $\mathbb{Z}[i]$ of Gaussian integers (Example 8.6, page 326). We can use
the First Isomorphism Theorem to prove this result too.
Example 9.50 The evaluation homomorphism $\mathrm{ev}_i : \mathbb{Z}[x] \to \mathbb{Z}[i]$
maps a polynomial $p \in \mathbb{Z}[x]$ to its evaluation $p(i) \in \mathbb{Z}[i]$. This
is surjective: we can find a polynomial $a+bx \in \mathbb{Z}[x]$ that evaluates to
any given Gaussian integer $a+bi \in \mathbb{Z}[i]$. The kernel $\ker(\mathrm{ev}_i)$ consists
of all polynomials that have $(x^2+1)$ as a factor; this is the principal
ideal $\langle x^2+1 \rangle$. Hence, the First Isomorphism Theorem tells us that
$$\mathbb{Z}[x]/\langle x^2+1 \rangle = \mathbb{Z}[x]/\ker(\mathrm{ev}_i) \cong \operatorname{im}(\mathrm{ev}_i) = \mathbb{Z}[i].$$
There are ring-theoretic versions of the Second (Theorem 4.62, page 128)
and Third (Theorem 4.65, page 130) Isomorphism Theorems, which we now
state and prove.
First we need some preliminary facts about ideals and subrings. Here
and in what follows, for some subring $S$ and ideal $I$ in a ring $R$, we set
$$S+I = \{ s+i : s \in S,\ i \in I \},$$
by analogy with Definition 2.10.
Proposition 9.51 Let $R$ be a (unital or nonunital) ring, let $S$ be a
(possibly nonunital) subring of $R$, and let $I$ be an ideal of $R$. Then
(i) $S+I$ is a (possibly nonunital) subring of $R$,
(ii) $S \cap I$ is an ideal of $S$, and
(iii) $I$ is an ideal of $S+I$.

Proof First we check that $S+I$ is a subring of $R$. To check condition
SR1, that $S+I$ is an additive subgroup of $R$, we observe that $S+I$
contains $0 = 0 + 0$ since both $S$ and $I$ do. Also, for any $s \in S$ and $i \in I$
the element $s+i$ has additive inverse $-(s+i) = (-s)+(-i) \in S+I$.
Furthermore, $S+I$ is closed under addition because both $S$ and $I$ are.
To check SR2, closure under multiplication, observe that for any
$a, b \in S$ and $i, j \in I$, we have
$$(a+i)(b+j) = (a+i)b + (a+i)j = ab + (ib + aj + ij).$$
This is an element of $S+I$ since $ab \in S$ and all of $ib, aj, ij \in I$.
If $S$ has a unity element 1 then $S+I$ has unity element $1+0$, since
$$(1+0)(a+i) = 1a + 0a + 1i + 0i = a + i,$$
$$(a+i)(1+0) = a1 + i1 + a0 + i0 = a + i.$$
Hence $S+I$ is a (possibly nonunital) subring of $R$.
To see that $S \cap I$ is an ideal of $S$, we first check that it is an additive
subgroup of $S$. Both $S$ and $I$ contain 0 so their intersection does. If
$a \in S \cap I$ then $a \in S$ and $a \in I$, so $-a \in S \cap I$ as well. And $S \cap I$ is
closed under addition: if $a, b \in S \cap I$ then $a+b \in S$ and $a+b \in I$, since
both $S$ and $I$ are closed under addition, and therefore $a+b \in S \cap I$.
Finally, we need to check the absorption conditions for $S \cap I$. If $a \in S$
and $s \in S \cap I$, then both $as$ and $sa$ lie in $S$ since $S$ is closed under
multiplication, and both $as$ and $sa$ also lie in $I$ since $I$ is an ideal of
$R$. Hence $as, sa \in S \cap I$, and therefore $S \cap I$ is an ideal of $S$.
The third statement, that $I$ is an ideal of $S+I$, is also just a matter of
checking definitions. Since $I$ is an ideal of $R$ it is already an additive
subgroup of $R$, and since it is a subset of $S+I$ it must therefore also
be an additive subgroup of $S+I$. All that remains is to check the left
and right absorption conditions. For any $a \in S$ and $i, j \in I$ we have
$$(a+i)j = aj + ij \in I, \qquad j(a+i) = ja + ji \in I$$
and hence $I$ is an ideal of $S+I$.

We're now ready to prove the Second Isomorphism Theorem for
rings. As with the group-theoretic version of the Second Isomorphism
Theorem (Theorem 4.62, page 128), we prove this using the First
Isomorphism Theorem (Theorem 9.49, page 380).
Theorem 9.52 (Second Isomorphism Theorem) Let $R$ be a ring, let $S$
be a (possibly nonunital) subring of $R$, and let $I$ be an ideal of $R$. Then
$$(S+I)/I \cong S/(S \cap I).$$

Proof If we can construct a surjective ring homomorphism
$f : S \to (S+I)/I$ with kernel $S \cap I$ then we can simply apply the First
Isomorphism Theorem to get the desired result.
Recall that $S+I$ consists of elements of $R$ which admit a (not necessarily
unique) decomposition as a sum $s+i$ for some elements $s \in S$
and $i \in I$. Then by Proposition 9.51, $I$ is an ideal of $S+I$ and we can
thus form the quotient $(S+I)/I$, which consists of cosets of the form
$(s+i)+I$. But since $i \in I$, each such coset is equal to $s+I$. So, we want
a surjective ring homomorphism that maps elements of $S$ to cosets of
the form $s+I$ in the quotient $(S+I)/I$.
An obvious candidate is the function $f : S \to (S+I)/I$ mapping an
element $s \in S$ to the corresponding coset $s+I$.
This is a ring homomorphism, since for any $a, b \in S$,
$$f(a+b) = (a+b)+I = (a+I) + (b+I) = f(a) + f(b)$$
and
$$f(ab) = (ab)+I = (a+I)(b+I) = f(a) f(b).$$
It is also surjective, since for any coset $s+I$ in $(S+I)/I$ there is an
element $s \in S$ with $f(s) = s+I$.
Finally, for $s \in S$ we have $f(s) = 0+I$ exactly when $s+I = 0+I$, which
is true if and only if $s \in I$. Hence $s \in S \cap I$ and therefore $\ker(f) = S \cap I$.
Thus
$$S/(S \cap I) = S/\ker(f) \cong \operatorname{im}(f) = (S+I)/I$$
by the First Isomorphism Theorem.
Example 9.53 Let $R = \mathbb{Z}[i]$, the ring of Gaussian integers, let
$I = \langle 3i \rangle$, and let $S = \mathbb{Z}$.
Then $I$ is the principal ideal generated by $3i$, consisting of all elements
of the form $3iz$ for $z \in \mathbb{Z}[i]$. This is equal to the subset
$3\mathbb{Z}[i] = \{ 3a + 3bi : a, b \in \mathbb{Z} \}$ of Gaussian integers with real and imaginary
parts a multiple of 3.
Now $S+I$ is the set of elements of the form $n+z$, where $n \in \mathbb{Z}$ and
$z \in I$. This consists of all Gaussian integers with imaginary part
a multiple of 3, so $S+I = \{ a + 3bi : a, b \in \mathbb{Z} \}$. It is a subring of
$R = \mathbb{Z}[i]$ by Proposition 9.51(i). The cosets of $I$ in $S+I$ are therefore
$I$, $1+I$ and $2+I$, so $(S+I)/I \cong \mathbb{Z}_3$.
The intersection $S \cap I$ comprises the elements of $I$ that have zero
imaginary part, namely $\{\ldots, -6, -3, 0, 3, 6, \ldots\} = 3\mathbb{Z}$. The quotient
$S/(S \cap I)$ is thus $\mathbb{Z}/3\mathbb{Z} \cong \mathbb{Z}_3$.
So, in this case we have $(S+I)/I \cong \mathbb{Z}_3 \cong S/(S \cap I)$ as predicted by
the Second Isomorphism Theorem.
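The rings in Example 9.53 are infinite, but the same bookkeeping can be tried in a finite ring. The sketch below uses an assumed example that does not appear in the text: $R = \mathbb{Z}_{12}$, the (nonunital) subring $S = \langle 2 \rangle$ and the ideal $I = \langle 3 \rangle$. It checks that the map $s \mapsto s+I$ from $S$ is surjective onto $(S+I)/I$ and has kernel $S \cap I$, exactly as in the proof of the Second Isomorphism Theorem.

```python
# Sketch: the map s -> s + I from the proof, in Z_12 with S = <2>, I = <3>.

n = 12
S = {0, 2, 4, 6, 8, 10}
I = frozenset({0, 3, 6, 9})

S_plus_I = {(s + i) % n for s in S for i in I}

def coset(a):
    """The coset a + I."""
    return frozenset((a + i) % n for i in I)

quotient = {coset(a) for a in S_plus_I}      # (S + I) / I
f = {s: coset(s) for s in S}                 # the map s -> s + I
kernel = {s for s in S if f[s] == coset(0)}

print(set(f.values()) == quotient)           # True: f is surjective
print(kernel == S & I)                       # True: ker(f) is the intersection
print(len(quotient), len(S) // len(kernel))  # 3 3
```

Both $(S+I)/I$ and $S/(S \cap I)$ have three elements here, consistent with the theorem.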

There is also a Third Isomorphism Theorem for rings, which again we
prove using the corresponding First Isomorphism Theorem.
Theorem 9.54 (Third Isomorphism Theorem) Let $R$ be a ring, and
let $I$ and $J$ be ideals of $R$ such that $J \subseteq I$. Then
$$(R/J)/(I/J) \cong R/I.$$

Proof Let $f : R/J \to R/I$ such that $f(a+J) = a+I$ for any $a \in R$. This
map is well-defined, since if $a+J = b+J$ it follows that $a-b \in J \subseteq I$,
so $a+I = b+I$. It is also a ring homomorphism since
$$f((a+J) + (b+J)) = f((a+b)+J) = (a+b)+I = (a+I) + (b+I) = f(a+J) + f(b+J)$$
and
$$f((a+J)(b+J)) = f((ab)+J) = (ab)+I = (a+I)(b+I) = f(a+J) f(b+J).$$
Now the kernel $\ker(f) = \{ a+J : a+I = I \} = \{ a+J : a \in I \}$, which is
just the quotient $I/J$. Furthermore, the image $\operatorname{im}(f) = \{ a+I : a \in R \}$
is simply the quotient $R/I$. Then
$$(R/J)/(I/J) = (R/J)/\ker(f) \cong \operatorname{im}(f) = R/I$$
by the First Isomorphism Theorem.

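Example 9.55 Let $R = \mathbb{Z}[i]$, $I = \langle 3i \rangle$ and $J = \langle 9i \rangle$, so that $J \subseteq I$.
Then $R/I \cong \mathbb{Z}_3[i]$ and $R/J \cong \mathbb{Z}_9[i]$. Furthermore, $I/J$ corresponds to
the ideal $\langle 3 \rangle = 3\mathbb{Z}_9[i]$ of $\mathbb{Z}_9[i]$, and we have
$$(R/J)/(I/J) \cong \mathbb{Z}_9[i]/\langle 3 \rangle \cong \mathbb{Z}_3[i] \cong R/I$$
as predicted by the Third Isomorphism Theorem.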

9.4 Prime and maximal ideals

The prime ideal is a princess of the world of ideals. Her father is the
prince "Point" in the world of geometry. Her mother is the princess
"Prime Numbers" in the world of numbers. She inherits the purity from
her parents.
Kazuya Kato,
lecture, 11 February 2013

We saw earlier that $n\mathbb{Z} = \langle n \rangle$ is an ideal of $\mathbb{Z}$. Indeed, all ideals of
$\mathbb{Z}$ are of this form. Also, we have a series of inclusions $\langle 9 \rangle \subset \langle 3 \rangle \subset \mathbb{Z}$.
Is there another ideal that fits between $\langle 3 \rangle$ and $\mathbb{Z}$? That is, can we find
an ideal $I$ such that $\langle 3 \rangle \subset I \subset \mathbb{Z}$?
The answer is no. Suppose that $I$ strictly contains $\langle 3 \rangle$, and therefore
that there exists some nonzero integer $a \in I \setminus \langle 3 \rangle$. This means that $a$
is not of the form $3n$ for any integer $n$. Suppose that $a = 3n+1$ for
some $n \in \mathbb{Z}$. Then we need $I$ to satisfy the conditions for an additive
abelian subgroup of $\mathbb{Z}$, and hence $I$ must also contain all elements of
the form $3k+1$ for $k \in \mathbb{Z}$, because it contains every multiple of 3 and
has to be closed under addition. It
must contain all elements of the form $3k+2$ as well, since the additive
inverse of $3n+1$ is $-(3n+1) = -3n-1 = -3(n+1)+2$. Thus $I = \mathbb{Z}$.
If, on the other hand, $a$ is of the form $3n+2$, then $I = \mathbb{Z}$ by a very
similar argument. We give ideals of this type a special name:
Definition 9.56 Let $R$ be a (possibly nonunital) ring, and let $I$ be a
proper ideal of $R$. Then $I$ is a maximal ideal if there exists no other
ideal $J$ such that $I \subset J \subset R$.
Let $a \in R$ be some element of $R$. Then the principal ideal $\langle a \rangle$ is a
maximal principal ideal if there exists no other principal ideal $\langle b \rangle$
such that $\langle a \rangle \subset \langle b \rangle \subset R$.
Here are some examples of maximal and non-maximal ideals. First
let's look at a matrix ring.
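The argument that $\langle 3 \rangle$ is maximal in $\mathbb{Z}$ can also be phrased computationally: any ideal containing 3 and some $a \notin \langle 3 \rangle$ contains every integer combination $3u + av$, and one of these combinations equals $\pm 1$. The sketch below finds such a combination with the extended Euclidean algorithm; this is a standard technique swapped in for the case-by-case argument in the text, and the function name `bezout` is illustrative.

```python
# Sketch: an ideal of Z containing 3 and any a not divisible by 3 contains a unit.

def bezout(a, b):
    """Return (g, u, v) with g = u*a + v*b, where g is a gcd of a and b up to sign."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

for a in (4, 5, -7, 10):
    g, u, v = bezout(3, a)
    # g is +1 or -1, a unit of Z, and lies in any ideal containing 3 and a.
    print(a, g, u * 3 + v * a == g)
```

Since $\pm 1$ is a unit, Proposition 9.39 then gives $I = \mathbb{Z}$, in agreement with the conclusion reached above.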
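Example 9.57 The ideal $\left\langle \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \right\rangle$ in $M_2(\mathbb{Z})$ consists of all matrices of
the form $\begin{pmatrix} 2a & 2b \\ 3c & 3d \end{pmatrix}$, where $a, b, c, d \in \mathbb{Z}$. It is contained in
$\left\langle \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \right\rangle$ and is hence not a maximal ideal of $M_2(\mathbb{Z})$.

The next example considers the ideals of a finite ring, $\mathbb{Z}_{12}$.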
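Example 9.58 The ring $\mathbb{Z}_{12}$ has six ideals, namely
$$\langle 0 \rangle = \{0\}, \quad \langle 6 \rangle = \{0, 6\}, \quad \langle 4 \rangle = \{0, 4, 8\},$$
$$\langle 3 \rangle = \{0, 3, 6, 9\}, \quad \langle 2 \rangle = \{0, 2, 4, 6, 8, 10\}, \quad \langle 1 \rangle = \mathbb{Z}_{12}.$$
The lattice diagram in Figure 9.2 highlights the fact that only $\langle 2 \rangle$ and
$\langle 3 \rangle$ are maximal ideals of $\mathbb{Z}_{12}$.
Figure 9.2: Lattice diagram for the ring $\mathbb{Z}_{12}$ (from top to bottom: $\mathbb{Z}_{12}$; $\langle 2 \rangle$ and $\langle 3 \rangle$; $\langle 4 \rangle$ and $\langle 6 \rangle$; $\langle 0 \rangle$).
It's interesting to ask what the corresponding quotient rings are, and
it turns out that
$$\mathbb{Z}_{12}/\langle 0 \rangle \cong \mathbb{Z}_{12}, \qquad \mathbb{Z}_{12}/\langle 6 \rangle \cong \mathbb{Z}_6,$$
$$\mathbb{Z}_{12}/\langle 4 \rangle \cong \mathbb{Z}_4, \qquad \mathbb{Z}_{12}/\langle 3 \rangle \cong \mathbb{Z}_3,$$
$$\mathbb{Z}_{12}/\langle 2 \rangle \cong \mathbb{Z}_2, \qquad \mathbb{Z}_{12}/\langle 1 \rangle \cong 0.$$
Of these, only $\mathbb{Z}_2$ and $\mathbb{Z}_3$ are fields, and these are the ones
corresponding to maximal ideals.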
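The observation in Example 9.58 can be checked directly. The sketch below takes the six ideals listed in the example as given, tests maximality by inclusion, and tests whether each quotient is a field by looking for a multiplicative inverse of every nonzero coset. The function names are illustrative, and cosets are handled through representatives.

```python
# Sketch: maximal ideals of Z_12 versus quotients that are fields.

n = 12

def ideal(d):
    """The principal ideal <d> of Z_12."""
    return frozenset((d * k) % n for k in range(n))

ideals = {d: ideal(d) for d in (0, 6, 4, 3, 2, 1)}

def is_maximal(d):
    """Proper, and not strictly contained in any other proper ideal."""
    I = ideals[d]
    if len(I) == n:
        return False
    return not any(I < J and len(J) < n for J in ideals.values())

def quotient_is_field(d):
    """Every nonzero coset of <d> has a multiplicative inverse in Z_12/<d>."""
    I = ideals[d]
    reps = sorted({min((a + i) % n for i in I) for a in range(n)})  # one per coset
    if len(reps) < 2:
        return False                      # the zero ring is not a field
    same = lambda a, b: (a - b) % n in I  # a + I == b + I
    return all(
        any(same(a * b, 1) for b in reps)
        for a in reps if not same(a, 0)
    )

for d in ideals:
    print(d, is_maximal(d), quotient_is_field(d))
```

Only $d = 2$ and $d = 3$ give `True` in both columns, which matches the statement that the quotients by maximal ideals are exactly the ones that are fields.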

All of the ideals of Z12 are principal. In the next chapter we will study
a class of rings with this property, called principal ideal domains.
Example 9.59 The ideal $\langle t^2 \rangle$ in $\mathbb{Z}[t]$ consists of all polynomials with
zero constant and linear terms; that is, those of the form
$$a_n t^n + \cdots + a_2 t^2.$$
This is not a maximal ideal, because it is properly contained in $\langle t \rangle$,
which is also not a maximal ideal. However, the ideal
$$\langle t, 2 \rangle = \{ a_n t^n + \cdots + a_1 t + a_0 : a_1, \ldots, a_n \in \mathbb{Z},\ a_0 \in 2\mathbb{Z} \},$$
consisting of the polynomials with even constant term,
is maximal, as is any other ideal of the form $\langle t, p \rangle$ for $p$ prime. The
quotients are
$$\mathbb{Z}[t]/\langle t \rangle \cong \mathbb{Z}, \qquad \mathbb{Z}[t]/\langle t, 2 \rangle \cong \mathbb{Z}_2.$$
Of these, $\mathbb{Z}_2$ is a field, while $\mathbb{Z}$ is just an integral domain. The
quotient $\mathbb{Z}[t]/\langle t^2 \rangle$ is not even that, since $t^2 = 0$ means that $t$ is a zero
divisor.
However, if we consider the ring $\mathbb{R}[t]$ of polynomials with real
coefficients, we find that the ideal $\langle t \rangle$ is maximal: it consists of the
polynomials with zero constant term, and no other proper ideal of
$\mathbb{R}[t]$ contains it. The quotient $\mathbb{R}[t]/\langle t \rangle \cong \mathbb{R}$, which is a field.

In fact, not only is $\langle t^2 \rangle$ not maximal in $\mathbb{Z}[t]$, no ideal of the form $\langle f \rangle$ for $f \in \mathbb{Z}[t]$ is either:
Proposition 9.60 No principal ideal in $\mathbb{Z}[t]$ is maximal.

Proof Let $n \in \mathbb{Z} \subset \mathbb{Z}[t]$ be a nonzero integer. If $n = \pm 1$ then $\langle n \rangle = \mathbb{Z}[t]$ isn't maximal. If $n \neq \pm 1$ then $\langle n \rangle \subset \langle n, t \rangle$ and hence again $\langle n \rangle$ isn't maximal.
Now suppose that $f = a_m t^m + \cdots + a_1 t + a_0 \in \mathbb{Z}[t]$ is a polynomial of degree $m > 0$. Then $\langle f \rangle$ isn't maximal either, since the ideal $\langle f, p \rangle$ strictly contains it, where $p$ is prime and $p \nmid a_m$.
This correlation between maximal ideals and fields holds in general:
Proposition 9.61 Let $R$ be a commutative ring, and $I$ be an ideal of $R$. Then $R/I$ is a field if and only if $I$ is maximal.

Proof Suppose $I$ is maximal in $R$, and let $a \in R \setminus I$. Then $(a+I)$ isn't the additive identity $(0+I) = I$ in the quotient $R/I$. We want to show that $(a+I)$ is a unit in $R/I$, and to do this we need to find some $b \in R \setminus I$ such that
$$(a+I)(b+I) = (1+I) = (b+I)(a+I).$$
Let $J = \{ ra + i : r \in R, i \in I \}$. Then $J$ is an additive abelian group, since
$$(r_1 a + i_1) + (r_2 a + i_2) = (r_1 + r_2) a + (i_1 + i_2) \in J,$$
the additive identity $0 = 0a + 0 \in J$, and for any $(ra + i) \in J$ we have
$$-(ra + i) = (-r) a + (-i) \in J.$$
Furthermore, $J$ satisfies the left and right absorption conditions, since for any $s \in R$,
$$s(ra + i) = (sr) a + si \in J$$
and
$$(ra + i)s = ras + is = (rs) a + is \in J.$$
Finally, $J$ is closed under multiplication, since
$$(r_1 a + i_1)(r_2 a + i_2) = (r_1 r_2 a) a + i_1 r_2 a + r_1 a i_2 + i_1 i_2 = (r_1 r_2 a) a + (i_1 r_2 a + r_1 a i_2 + i_1 i_2) \in J.$$
Hence $J$ is an ideal of $R$. But $J$ contains $I$, since every element $i \in I$ can be written in the form $0a + i$, and the containment is strict because $a = 1a + 0 \in J$ while $a \notin I$. But $I$ is maximal in $R$, and hence $J = R$.
In particular, $1 \in J = R$, so there exist some $b \in R$ and $i \in I$ such that $1 = ba + i$. Hence $ba = 1 - i \in (1+I)$, so $(b+I) = (a+I)^{-1}$ as required, and therefore $R/I$ is a field.
Conversely, suppose that $R/I$ is a field, and $J$ is an ideal of $R$ that strictly contains $I$. Let $a \in J \setminus I$. Then $(a+I)$ is a nonzero element of $R/I$. Since $R/I$ is a field, it follows that $(a+I)$ is invertible, and hence there exists a coset $(b+I)$ such that $(a+I)(b+I) = (1+I)$. Since $a \in J$ and $J$ is an ideal, it follows that $ab \in J$. Moreover, $(a+I)(b+I) = 1+I$ says that $ab + I = 1 + I$, which means that $(1 - ab) \in I \subset J$, so $1 = (1 - ab) + ab \in J$.
Hence $J = R$ by Proposition 9.38, and so $I$ is maximal.

We can now characterise the maximal ideals of $\mathbb{Z}$. Recall that all additive subgroups of $\mathbb{Z}$ are of the form $n\mathbb{Z}$ for $n \in \mathbb{N}$. These are exactly the principal ideals $\langle n \rangle$, and the quotient $\mathbb{Z}/\langle n \rangle \cong \mathbb{Z}_n$ is a field exactly when $n$ is prime. Therefore, by Proposition 9.61, the maximal ideals of $\mathbb{Z}$ are exactly the principal ideals $\langle p \rangle$ where $p$ is prime.
This leads us in an interesting direction. Suppose that $r, s \in \mathbb{Z}$ and their product $rs \in \langle p \rangle = p\mathbb{Z}$. Then if $p$ is prime, $p$ must divide either $r$ or $s$.
So, the maximal ideals of $\mathbb{Z}$ are exactly those nonzero proper principal ideals $\langle n \rangle$ satisfying this property: that if $r, s \in \mathbb{Z}$ and $rs \in \langle n \rangle$, then either $r$ or $s$ must be a multiple of $n$, or equivalently $r$ or $s$ must lie in $\langle n \rangle$.
Is this true in general? Suppose we have a commutative ring $R$ with an ideal $I$ such that for any $r, s \in R$ with $rs \in I$, either $r \in I$ or $s \in I$. Does it necessarily follow that $I$ is maximal?
The answer is no, as the following example shows.
Example 9.62 Let $R = \mathbb{Z}[x]$. Then $\langle x \rangle$ is not maximal, since $\langle x \rangle \subset \langle x, 2 \rangle$. But for any $p, q \in \mathbb{Z}[x]$ such that $pq \in \langle x \rangle$ it must be the case that either $p \in \langle x \rangle$ or $q \in \langle x \rangle$.
The reason for this is that $\langle x \rangle$ consists of polynomials with zero constant term. Suppose that $p = a_m x^m + \cdots + a_1 x + a_0$ and $q = b_n x^n + \cdots + b_1 x + b_0$. Then
$$pq = a_m b_n x^{m+n} + \cdots + (a_1 b_0 + a_0 b_1) x + a_0 b_0.$$
This product lies in $\langle x \rangle$ if and only if $a_0 b_0 = 0$. Since $\mathbb{Z}$ is an integral domain, this can only be the case if either $a_0 = 0$ or $b_0 = 0$, in which case either $p \in \langle x \rangle$ or $q \in \langle x \rangle$.
This seems to be an interesting concept, worth investigating further,
and to that end we introduce the following definition.
Definition 9.63 Let $R$ be a commutative ring, and let $I$ be an ideal of $R$. We say $I$ is prime if, for any $a, b \in R$ with $ab \in I$, it must also be the case that either $a \in I$ or $b \in I$ (or both).
So, the above example demonstrates the existence of ideals that are
prime but not maximal, and we know from the preceding discussion of
Z that some prime ideals are maximal too. The obvious next question
is to ask whether all maximal ideals are prime, but the answer to this
requires a little more work.
By Proposition 9.61 we know that for any commutative ring $R$ and ideal $I$, the quotient $R/I$ is a field exactly when $I$ is maximal. In the above example, the quotient $\mathbb{Z}[x]/\langle x \rangle$ is isomorphic to $\mathbb{Z}$, which isn't a field. But in some sense it's the next best thing: an integral domain. It transpires that this is true in general: quotients of commutative rings by prime ideals are integral domains.
Proposition 9.64 Let $R$ be a commutative ring, and let $I$ be an ideal of $R$. Then $R/I$ is an integral domain if and only if $I$ is prime.

Proof Suppose $R/I$ is an integral domain, and that $a, b \in R$ are such that $ab \in I$. Then $(a+I)(b+I) = (ab+I) = I$, which is the additive identity in $R/I$. Since $R/I$ is an integral domain, this means that either $(a+I) = I$, in which case $a \in I$, or $(b+I) = I$, in which case $b \in I$. Hence $I$ is prime.
Conversely, suppose that $I$ is prime. We must show that $R/I$ has no zero divisors. For $a, b \in R$ such that $(a+I)(b+I) = (0+I) = I$, we have $(ab+I) = I$, and hence $ab \in I$. Then, since $I$ is prime, either $a \in I$, in which case $(a+I) = I$ is the zero element in $R/I$, or $b \in I$, in which case $(b+I) = I$ is the zero element in $R/I$. Thus $R/I$ is an integral domain.
This is all we need to answer the question of whether maximal ideals
are prime: they are.
Corollary 9.65 Maximal ideals are prime.

Proof Let I be a maximal ideal in some commutative ring R. Then


R/I is a field, and hence also an integral domain by Proposition 8.47.
Therefore I is a prime ideal by Proposition 9.64.
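Corollary 9.65 can be sanity-checked in the finite rings $\mathbb{Z}_n$ by testing Definition 9.63 directly. The sketch below is a brute-force illustration with hypothetical helper names, and it assumes that every ideal of $\mathbb{Z}_n$ is principal. In a finite ring prime and maximal ideals coincide, so this check cannot exhibit a prime ideal that fails to be maximal, such as $\langle x \rangle$ in $\mathbb{Z}[x]$.

```python
def principal_ideal(a, n):
    """Return the principal ideal <a> of Z_n as a frozenset of residues."""
    return frozenset((a * r) % n for r in range(n))


def proper_ideals(n):
    """Proper ideals of Z_n (assumption: all ideals of Z_n are principal)."""
    return {I for I in (principal_ideal(a, n) for a in range(n)) if len(I) < n}


def is_prime_ideal(I, n):
    """Definition 9.63: whenever ab lies in I, so does a or b."""
    return all(a in I or b in I
               for a in range(n) for b in range(n)
               if (a * b) % n in I)


def is_maximal_ideal(I, n):
    """A proper ideal contained in no strictly larger proper ideal."""
    return not any(I < J for J in proper_ideals(n))


if __name__ == "__main__":
    for I in sorted(proper_ideals(12), key=len):
        print(sorted(I), "prime:", is_prime_ideal(I, 12),
              "maximal:", is_maximal_ideal(I, 12))
```

In $\mathbb{Z}_{12}$ only $\langle 2 \rangle$ and $\langle 3 \rangle$ pass both tests; for instance $\langle 4 \rangle$ fails to be prime because $2 \cdot 2 \in \langle 4 \rangle$ but $2 \notin \langle 4 \rangle$.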

Summary

In Chapters 2 and 3 we investigated subgroups and quotient groups, in the process learning some useful techniques for understanding the internal structure of groups. In this chapter, we sought to develop analogous concepts and techniques to help us to better understand rings.
We began by introducing the idea of a subring (Definition 9.2, page 361), a subset $S \subseteq R$ that forms a ring in its own right via the addition and multiplication operations inherited from the ambient ring $R$. Almost immediately we ran into some notational technicalities, related to the lack of a universal convention on whether rings necessarily have unity elements. We adopted the convention that by default a subring has a unity element, a nonunital subring doesn't, and a unital subring not only has a unity element, but it must have the same unity element as the ambient ring. (This is not universal: some books adopt a different convention.) For $S$ to be a subring of $R$, we require that $(S, +)$ be a subgroup of $(R, +)$, that $S$ be closed under multiplication, and that $S$ have a unity element (Proposition 9.3, page 361).
For example, the ring $\mathbb{Z}_4$ has no proper unital subrings, although the subset $\{0, 2\}$ forms a nonunital subring, since $2 \cdot 2 = 0$ (Example 9.4, page 362). This is a special case of the more general fact that neither $\mathbb{Z}$ nor $\mathbb{Z}_n$ has any proper unital subrings (Proposition 9.5, page 362), although $\mathbb{Z}$ has countably many nonunital subrings, all of the form $n\mathbb{Z}$ (Proposition 9.6, page 363). However, any unital ring $R$ contains a subring isomorphic to either $\mathbb{Z}$ or $\mathbb{Z}_n$ for some integer $n > 1$ (Proposition 9.7, page 363); this is called the prime subring, and is the smallest nontrivial unital subring of $R$. It is also the intersection of all unital subrings of $R$; indeed any intersection of unital subrings is also a unital subring (Proposition 9.11, page 366).
In general, the direct product $G \times H$ of two groups $G$ and $H$ will have at least one subgroup isomorphic to $G$ and one isomorphic to $H$ (Example 4.16, page 108). The corresponding statement is not always true for rings, because for a direct sum $R \oplus S$, the obvious inclusion functions $i_1 : R \hookrightarrow R \oplus S$ and $i_2 : S \hookrightarrow R \oplus S$ don't necessarily map unity elements to unity elements. More precisely, any injective unital ring homomorphism $f : R \to R \oplus S$ must map $0$ to $(0, 0)$ and $1$ to $(1, 1)$, which the canonical inclusion functions don't do.
However, a useful result about direct products of groups that does work in a ring-theoretic context is the fact that $\mathbb{Z}_m \oplus \mathbb{Z}_n \cong \mathbb{Z}_{mn}$ if and only if $\gcd(m, n) = 1$ (Proposition 1.32, page 16); this is a version of the celebrated Chinese Remainder Theorem (Proposition 9.8, page 364). A corollary of this is that the ring $\mathbb{Z}_n$ has a unique (up to permutation of factors) decomposition as a direct sum of rings of the form $\mathbb{Z}_{p^k}$ where $p$ is prime (Corollary 9.9, page 365).
A matrix ring $M_n(R)$ typically has numerous subrings: the subrings $UT_n(R)$, $LT_n(R)$ and $D_n(R)$ of triangular and diagonal matrices (Example 9.10, page 365), the prime subring $\langle I \rangle$ generated by the identity matrix $I$, and also any ring of the form $M_n(S)$ where $S$ is a subring of $R$.
In order to define and investigate quotient rings, we spent some time devising the correct ring-theoretic analogue of a normal subgroup. Bearing in mind that normal subgroups are equivalent to kernels of group homomorphisms (Proposition 4.29, page 115), we began by studying some basic properties of ring homomorphisms.
For a homomorphism $f : R \to S$, where $R$ is unital, it isn't necessarily the case that the unity element in $R$ maps to the unity element in $S$; indeed, $S$ need not be unital. It must, however, be the case that $f(1)^2 = f(1)$; that is, ring homomorphisms map unity elements to idempotent elements (Proposition 9.16, page 367). This led us to introduce the notions of a unital homomorphism, that maps unity to unity, and a nonunital homomorphism, that either maps unity to some other idempotent element, or whose domain or codomain are nonunital rings (Definition 9.17, page 368). Any nontrivial ring $R$ admits at least one unital homomorphism, the identity homomorphism $\mathrm{id}_R : R \to R$ that maps everything to itself, and one nonunital homomorphism, the zero homomorphism $z_R : R \to R$ that maps everything to $0$.
An interesting class of homomorphisms are the evaluation homomorphisms, defined on polynomial rings (Example 9.20, page 369). For some fixed element $a \in R$, the homomorphism $\mathrm{ev}_a : R[x] \to R$ maps a polynomial $p \in R[x]$ to its evaluation $p(a)$ obtained by setting $x = a$.
Next we examined the properties of images and kernels of ring homomorphisms. Images of group homomorphisms are always subgroups of their codomains, but the situation is slightly more complicated with ring homomorphisms. In general, images of nonunital homomorphisms are (possibly nonunital) subrings, while images of unital homomorphisms are unital subrings (Proposition 9.21, page 369). The kernel of a ring homomorphism $f : R \to S$ is the set of elements of $R$ that are mapped to the zero element in $S$ (Definition 9.24, page 371). Kernels of ring homomorphisms are at least nonunital subrings of their domains: they are additive abelian subgroups and closed under multiplication, but need not have a unity element. They do, however, satisfy absorption conditions (Proposition 9.28, page 372); that is, for any $r \in R$ and $k \in \ker(f)$, the products $rk$ and $kr$ lie in $\ker(f)$.
Motivated by this, we defined an ideal $I$ of a ring $R$ to be a subset of $R$ that satisfies the first two subring conditions and the left and right absorption conditions (Definition 9.29, page 372). Ideals are the ring-theoretic analogues of normal subgroups. In particular, for a commutative unital ring $R$ and some element $a \in R$ we defined the principal ideal $\langle a \rangle$ generated by $a$ to be the set $\{ ra : r \in R \}$ (Example 9.30, page 373). Just as every group has at least two normal subgroups, itself and the trivial group, every ring has at least two ideals: itself and $\{0\}$ (Examples 9.34 and 9.35, page 374). If $R$ is a unital ring, then the only ideal of $R$ that contains $1$ is $R$ itself (Proposition 9.38, page 374); more generally, $R$ is the only ideal containing a unit (Proposition 9.39, page 375). That is, no proper ideals contain any invertible elements or the unity element $1$. As anticipated, not only is every ring homomorphism kernel an ideal, every ideal can be regarded as the kernel of some ring homomorphism (Proposition 9.48, page 379).
Having obtained a definition for the ring-theoretic analogue of a normal subgroup, we then turned our attention to cosets. There were two approaches we could have taken, the additive or multiplicative viewpoint, but for various reasons the multiplicative one was unsuitable for our purposes. So, given a ring $R$, an ideal $I$ and an element $a \in R$, we defined the coset of $I$ corresponding to $a$ to be the set $a + I = \{ a + i : i \in I \}$ (Definition 9.40, page 375). This enabled us to define the quotient ring $R/I$ (Proposition 9.41, page 376), whose elements are the cosets of $I$ in $R$, with addition and multiplication operations defined by
$$(a+I) + (b+I) = (a+b) + I,$$
$$(a+I)(b+I) = (ab) + I.$$
Ring-theoretic versions of the First, Second and Third Isomorphism Theorems hold (Theorem 9.49, page 380; Theorem 9.52, page 382; Theorem 9.54, page 383).
Next we considered a couple of special types of ideal. A maximal ideal $I$ in a ring $R$ is a proper ideal of $R$ that is contained in no larger proper ideal: any other ideal $J$ strictly containing $I$ must be the full ring $R$ itself (Definition 9.56, page 384). There is a similar notion of a maximal principal ideal: a principal ideal contained in no larger proper principal ideal. Note that a maximal principal ideal need not be a maximal ideal, and in particular no principal ideal in $\mathbb{Z}[t]$ is maximal (Proposition 9.60, page 385). A proper ideal $I \subset R$ is maximal if and only if the quotient $R/I$ is a field (Proposition 9.61, page 386). A related concept is that of a prime ideal: an ideal $I \subseteq R$ is prime if for any $a, b \in R$ with $ab \in I$, either $a \in I$ or $b \in I$ (Definition 9.63, page 387). An ideal $I$ is prime if and only if the quotient $R/I$ is an integral domain (Proposition 9.64, page 388). Hence, since all fields are integral domains, all maximal ideals are prime (Corollary 9.65, page 388).

References and further reading

Exercises

9.1 Show that the quotient ring $\mathbb{Z}[x, y]/\langle x^2+1, y^2+1, xy+yx \rangle$ is isomorphic to the ring $L$ of Lipschitz integers described in Example 8.12.
9.2 Suppose that $1 < k < n$ and that $\gcd(k, n) > 1$. Show that $\langle k \rangle$ is a nonunital subring of $\mathbb{Z}_n$, and that all nonunital subrings of $\mathbb{Z}_n$ are of this form.
9.3 Let $I$ denote the $3 \times 3$ identity matrix, and let $A$ be a matrix of the form
$$A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}$$
for some nonzero $a, b, c \in R$. Then $A^3 = 0$ but neither $A$ nor $A^2$ is zero. We say that $A$ is nilpotent; it has finite multiplicative order. Show that $I$ and $A$ generate a commutative subring of $M_3(R)$.
9.4 Show that the evaluation homomorphism eva : R[ x ] R is indeed a ring homomorphism for any
a R.
9.5 Show that, for a commutative unital ring R and element a R, the principal ideal h ai is indeed an
ideal of R.

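9.6 Verify that $\phi : UT_3(\mathbb{R}) \to UT_2(\mathbb{R})$ given by
$$\phi \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & d \end{bmatrix}$$
is a surjective homomorphism. Check that $\ker \phi$ is an ideal of $UT_3(\mathbb{R})$.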
9.7 Let $N(R) = \{ a \in R : a^n = 0 \text{ for some } n \in \mathbb{N} \}$ be the subset of nilpotent elements of $R$. Show that if $R$ is commutative, then $N(R)$ is an ideal of $R$. Show that $N(M_2(\mathbb{R}))$ is not a subring of $M_2(\mathbb{R})$.
9.8 Let $I = \langle x^2+1 \rangle \subset \mathbb{R}[x]$. Show that $\mathbb{R}[x]/I \cong \mathbb{C}$.
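9.9 Let
$$I = \left\{ \begin{bmatrix} 0 & b & c \\ 0 & 0 & e \\ 0 & 0 & 0 \end{bmatrix} : b, c, e \in \mathbb{R} \right\}.$$
Show that $I$ is an ideal of $UT_3(\mathbb{R})$ and that $UT_3(\mathbb{R})/I \cong \mathbb{R} \oplus \mathbb{R} \oplus \mathbb{R}$.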
9.10 Let I = 24Z, J = 60Z and K = 105Z. What are the following ideals of Z?
(a) I J (e) I + J
(b) I K (f) I +K
(c) J K (g) J +K
(d) I J K (h) I + J +K
9.11 Let $X$ be a (possibly infinite) set, and let $R = \{ f : X \to \mathbb{R} \}$ be the set of all possible real-valued functions on $X$, equipped with the following addition and multiplication operations:
$$(f + g)(x) = f(x) + g(x) \quad \text{and} \quad (fg)(x) = f(x) g(x)$$
for all $f, g \in R$ and $x \in X$. Then $R$ is a commutative ring. For any fixed $a \in \mathbb{R}$, we define $c_a \in R$ to be the constant function given by $c_a(x) = a$ for all $x \in X$.
(a) Show that $c_0$ is the zero of $R$, and $c_1$ is the multiplicative identity.
(b) For any $a \in X$ let $\mathrm{ev}_a : R \to \mathbb{R}$ be the evaluation homomorphism $\mathrm{ev}_a(f) = f(a)$. Show that $\mathrm{ev}_a$ is a surjective homomorphism.
(c) Show that $I_a = \{ f \in R : f(a) = 0 \}$ is a maximal ideal of $R$.
9.12 Let $R$ be a unital ring, and let $I$ be an ideal of $R$. Define
$$U_I = \{ a \in U(R) : (a - 1) \in I \}.$$
Show that $U_I \trianglelefteq U(R)$.
9.13 Let $R$ be a commutative ring, and choose some element $a \in R$. Define
$$I_a = \{ r \in R : ar = 0 \}.$$
(a) Show that $I_a$ is an ideal of $R$.
(b) Find $I_4$ and $I_9$ in $\mathbb{Z}_{12}$.
(c) Suppose that $a^2 = a$. Show that $R/I_a \cong \langle a \rangle$.
10 Domains

There are several requirements that must be met to establish a domain. In general it must be responsibly managed. There must be a responsible person to serve as a coordinator for domain related questions, there must be a robust name service, it must be of at least a minimum size, and the domain must be registered with the central domain administrator.
Jon Postel (1943-1998),
The Domain Names Plan and Schedule,
IETF RFC 881, November 1983

In this chapter we will study some important types of integral domains. The ring $\mathbb{Z}$ of integers was our original example of an integral domain, and it has some special properties that we will generalise to other contexts.
In particular, the Division Theorem (Theorem A.37, page 523) holds in $\mathbb{Z}$: given any integers $a$ and $b$ with $b > 0$, we can find integers $q$ and $r$, with $0 \le r < b$, such that $a = qb + r$. We want to extend this idea to other rings and find out in exactly which circumstances it holds.
In the last chapter we devised and studied the concept of an ideal, a (possibly nonunital) subring satisfying absorption conditions (Definition 9.29, page 372). In particular we met a class of ideals of a commutative ring $R$ called principal ideals (Example 9.30, page 373), which are of the form $\langle a \rangle = \{ ra : r \in R \}$ for some $a \in R$. We will see later that every ideal of $\mathbb{Z}$ is principal; that is, of the form $\langle n \rangle$ for some $n \in \mathbb{Z}$. We want to know whether this is significant, and if so whether there are any other rings that have only principal ideals.
Finally, every integer has a unique (up to permutation and multiplication by $\pm 1$) factorisation into prime integers. We will generalise the notion of a prime element to other rings, and investigate what sort of rings have unique prime factorisations for arbitrary elements.

10.1 Euclidean domains

Besides, they relate that Euclid was asked by Ptolemy whether there was any shorter way to the attainment of geometry than by his elementary institution, and that he answered there was no other royal path which led to geometry.
Proclus Lycus (422-485 AD),
A Commentary on the First Book of Euclid's Elements,
translated by Thomas Taylor (1758-1835) (1788)

The ring $\mathbb{Z}$ of integers satisfies the Division Theorem; this useful fact has been known for at least 2400 years, and probably longer. We now want to generalise it to other rings, and to formulate some criteria that will tell us exactly which rings have an analogous property. This is important, because the Division Theorem essentially says we can do the kind of long division with remainders that we learned about in primary school. Sometimes we will want to do this in other rings, such as the rings $\mathbb{Z}[i]$ and $\mathbb{Z}[\omega]$ of Gaussian and Eisenstein integers, or the ring $\mathbb{Q}[x]$ of polynomials with rational coefficients, and we need to know if this is going to be possible.
But straight away we run into a problem. Given two integers $a, b \in \mathbb{Z}$ with $b \neq 0$, the Division Theorem asserts the existence of two other integers $q, r \in \mathbb{Z}$ such that $a = qb + r$ and $0 \le r < |b|$. The first part of this is fine: in any ring it makes sense to talk about some element $a$ being equal to an expression $qb + r$, because all we need is addition and multiplication, both of which are defined in any ring. The difficulty arises with the second part: the requirement that $0 \le r < |b|$. Not all rings, and certainly not all of the ones we're most interested in, have a canonical ordering $<$. For example, it's not immediately obvious what it might mean to say that $3x^2 + 2 < 4x^3 + x - 1$ in $\mathbb{Q}[x]$, or that $(2 - i) < (5 + 3i)$ in $\mathbb{Z}[i]$.
Things become slightly clearer if we restate the Division Theorem for $\mathbb{Z}$ in a slightly different way:
Theorem 10.1 (Division Theorem for $\mathbb{Z}$) Let $a, b \in \mathbb{Z}$ with $b \neq 0$. Then there exist integers $q, r \in \mathbb{Z}$ such that $a = qb + r$ and either $r = 0$ or $|r| < |b|$.
We can do almost all of this in an arbitrary ring $R$; again the complication is the last bit, the need to extend or provide a suitable alternative to the absolute value function $|\cdot|$ so that the whole statement makes sense in our chosen ring. Ultimately, what we want from an absolute value is that it maps every element of the ring to a non-negative real number. However, our proof of the Division Theorem for $\mathbb{Z}$ relies on induction, so really we need our generalised absolute value function to map into $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$ rather than $\mathbb{R}_{\ge 0}$, at least if we're to avoid some complicated technical digressions.
So, we need a function that maps from our ring $R$ into $\mathbb{N}_0$. What else do we need it to do? Well, while studying linear algebra, functional analysis or metric spaces you may have met a similar function called a norm. Typically, we require such a function to satisfy the triangle inequality
$$|u + v| \le |u| + |v|, \qquad (10.1)$$
but in this case we're not going to: it turns out that this is overly strict for our purposes. Instead, we require
$$|ab| \ge |a| \qquad (10.2)$$
for any nonzero $a, b \in R$. The reason for this will become apparent soon.
What we want, then, is a sort of generalised absolute value function, or norm, that maps into $\mathbb{N}_0$, satisfies (10.2), and for which the corresponding Division Theorem holds.

If our ring $R$ has one of these functions that enables us to state and prove an analogue of the Division Theorem, then that means we can happily do long division with remainders in $R$. But the statement of the Division Theorem really depends on our choice of norm function, and there may be more than one of these. (Indeed, the function $n \mapsto n^2$ suffices in $\mathbb{Z}$ as well as the usual $|\cdot|$.) So it makes more sense to fold the Division Theorem condition into the criteria for our norm function, and thus we arrive at the following definition:
Definition 10.2 Let $R$ be an integral domain, and write $R^* = R \setminus \{0\}$. A Euclidean norm or Euclidean valuation is a function $v : R^* \to \mathbb{N}_0$ such that
$$v(ab) \ge v(a) \qquad (10.3)$$
for all $a, b \in R^*$, and for any $a, b \in R$ with $b \neq 0$ there exist $q, r \in R$ such that $a = qb + r$ and either $r = 0$ or $v(r) < v(b)$.
The term Euclidean norm is also often used to refer to the Euclidean distance function in $\mathbb{R}^n$ determined by Pythagoras' Theorem, and so from now on we will use the term Euclidean valuation to avoid confusion.
The integral domains we're interested in at the moment are those for which we can find a Euclidean valuation:
Definition 10.3 An integral domain $R$ that admits a Euclidean valuation is said to be a Euclidean domain.
Note that for a given integral domain $R$ to be a Euclidean domain, we only need to be able to define a Euclidean valuation $v : R^* \to \mathbb{N}_0$: any suitable one will do, and the valuation $v$ isn't itself considered part of the structure for the domain. That is, the mere existence of a Euclidean valuation is enough.
So, the ring $\mathbb{Z}$ of integers is a Euclidean domain via the absolute value function $v(n) = |n|$, and also via $v(n) = n^2$.
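Both conditions of Definition 10.2 can be spot-checked for these two valuations on a finite range of integers. The sketch below is illustrative only: the function name and the `bound` parameter are hypothetical, and it tests just the particular quotient and remainder produced by Python's built-in `divmod`, so a `False` result shows that this choice of $q$ and $r$ fails, not that no valid choice exists.

```python
def is_euclidean_valuation_on_sample(v, bound=20):
    """Check Definition 10.2 for integers in [-bound, bound], using divmod for q, r."""
    nonzero = [n for n in range(-bound, bound + 1) if n != 0]
    # Condition (10.3): v(ab) >= v(a) for all nonzero a, b in the sample.
    if any(v(a * b) < v(a) for a in nonzero for b in nonzero):
        return False
    # Division condition: a = q*b + r with r == 0 or v(r) < v(b).
    for a in range(-bound, bound + 1):
        for b in nonzero:
            q, r = divmod(a, b)
            if a != q * b + r or not (r == 0 or v(r) < v(b)):
                return False
    return True


if __name__ == "__main__":
    print(is_euclidean_valuation_on_sample(abs))               # absolute value
    print(is_euclidean_valuation_on_sample(lambda n: n * n))   # n -> n^2
    print(is_euclidean_valuation_on_sample(lambda n: 1))       # constant map
```

The constant map satisfies (10.3) but fails the division condition whenever the remainder is nonzero, which is why the definition folds both requirements together.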
The significance of the condition (10.3) is given by the next proposition:
Proposition 10.4 Let $R$ be a Euclidean domain, and let $v : R^* \to \mathbb{N}_0$ be a Euclidean valuation for $R$. Then $v(1) \le v(a)$ for any $a \in R^*$, and $v(a) = v(1)$ if and only if $a$ is a unit.

Proof The condition (10.3) tells us that, in particular,
$$v(a) = v(1a) \ge v(1)$$
for any $a \in R^*$.
If $a \in R^*$ is a unit, then
$$v(1) = v(a a^{-1}) \ge v(a),$$
but we've just seen that $v(a) \ge v(1)$, so $v(a) = v(1)$.
Now suppose that $a \in R^*$ is such that $v(a) = v(1)$. Then by the Division Theorem condition, there exist $q, r \in R$ such that
$$1 = qa + r$$
and either $r = 0$ or $v(r) < v(a) = v(1)$. But we already know that $v(r)$ can't be strictly less than $v(1)$, so it must be the case that $r = 0$, and hence $1 = qa$, which means that $q = a^{-1}$, and therefore $a$ is a unit.
What other examples of Euclidean domains can we find? Well, probably the next objects that we learn how to long divide are polynomials. As remarked earlier, there isn't a canonical ordering on the ring $F[x]$ for a given field $F$, nor is there an obvious absolute value function we can use. However, any nonzero polynomial $p(x) \in F[x]$ does have a specific non-negative integer associated with it: its degree $\deg(p)$.
This yields a function $\deg : F[x]^* = F[x] \setminus \{0\} \to \mathbb{N}_0$ that certainly satisfies the condition (10.3), since for any $p, q \in F[x]^*$ with $\deg(p) = m \ge 0$ and $\deg(q) = n \ge 0$, we have $\deg(pq) = m + n \ge m$.
The corresponding Division Theorem takes a little work to prove, but follows a similar pattern to the one for $\mathbb{Z}$:
Theorem 10.5 Let $F$ be a field, and let $f, g \in F[x]$ with $g \neq 0$. Then there exist unique polynomials $q, r \in F[x]$ such that $f = qg + r$, where either $r = 0$ or $\deg(r) < \deg(g)$.

Proof First we prove the existence of $q$ and $r$. If $f = 0$ or $\deg(f) < \deg(g)$ then we set $q = 0$ and $r = f$. Otherwise, we proceed by induction on $\deg(f)$.
If $\deg(f) = \deg(g) = 0$ then both $f$ and $g$ are nonzero constants, and hence they are elements of $F$, and therefore units. We can thus set $q = f g^{-1}$ and $r = 0$.
Suppose that
$$f = a_m x^m + \cdots + a_1 x + a_0 \quad \text{and} \quad g = b_n x^n + \cdots + b_1 x + b_0,$$
so that $\deg(f) = m$ and $\deg(g) = n$. Suppose also that $m > 0$ and that $0 \le n \le m$. Furthermore, we assume that the theorem holds for all $f$ and $g$ such that $\deg(f) < m$.
Now we apply long division to get a polynomial
$$h = f - a_m b_n^{-1} x^{m-n} g$$
$$= (a_m x^m + \cdots + a_1 x + a_0) - (a_m x^m + a_m b_n^{-1} b_{n-1} x^{m-1} + \cdots + a_m b_n^{-1} b_0 x^{m-n})$$
$$= (a_{m-1} - a_m b_n^{-1} b_{n-1}) x^{m-1} + \cdots$$
and we see that either $\deg(h) < m$ or $h = 0$.
If $h = 0$ then set $q = a_m b_n^{-1} x^{m-n}$ and $r = 0$ to get $f = qg$.
Otherwise, if $h \neq 0$, then $\deg(h) < m$ and so by induction we can find $s, t \in F[x]$ where $h = sg + t$, with either $t = 0$ or $\deg(t) < \deg(g)$. Then
$$f = a_m b_n^{-1} x^{m-n} g + sg + t = (a_m b_n^{-1} x^{m-n} + s) g + t$$
and we can thus set $q = a_m b_n^{-1} x^{m-n} + s$ and $r = t$.
To prove uniqueness, suppose that $f = q_1 g + r_1 = q_2 g + r_2$ with either $r_1 = 0$ or $\deg(r_1) < \deg(g)$, and either $r_2 = 0$ or $\deg(r_2) < \deg(g)$. Then
$$(q_1 g + r_1) - (q_2 g + r_2) = 0,$$
which means that
$$(r_1 - r_2) = (q_2 - q_1) g.$$
Either $r_1 - r_2 = 0$, in which case $r_1 = r_2$ and $q_1 = q_2$, or
$$\deg(r_1 - r_2) = \deg(q_2 - q_1) + \deg(g),$$
which means that $\deg(r_1 - r_2) \ge \deg(g)$, which can't happen.
Hence $r_1 = r_2$ and $q_1 = q_2$ as claimed.
So $F[x]$ is a Euclidean domain via the map $\deg : F[x]^* \to \mathbb{N}_0$. This means we can, in principle at least, do long division with polynomials.
Example 10.6 Let $f = x^4 + 5x^3 - 2x - 3$ and $g = x^2 + 1$ in $\mathbb{Q}[x]$, the ring of univariate polynomials with rational coefficients. We want to find $q, r \in \mathbb{Q}[x]$ such that $f = qg + r$. Long division proceeds as follows:

                      x^2 + 5x   - 1
    x^2 + 1 )  x^4 + 5x^3 + 0x^2 - 2x - 3
               x^4 + 0x^3 +  x^2
                     5x^3 -  x^2 - 2x
                     5x^3 + 0x^2 + 5x
                          -  x^2 - 7x - 3
                          -  x^2 + 0x - 1
                                 - 7x - 2

The quotient is the polynomial left on the top line, namely $q = x^2 + 5x - 1$, while the remainder is the polynomial left at the end, in this case $r = -7x - 2$. We can easily check this:
$$qg + r = (x^2 + 5x - 1)(x^2 + 1) + (-7x - 2)$$
$$= x^4 + 5x^3 + 5x - 1 - 7x - 2$$
$$= x^4 + 5x^3 - 2x - 3$$
$$= f$$
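The long division in Example 10.6 follows the same loop as the existence proof of Theorem 10.5: repeatedly subtract $a_m b_n^{-1} x^{m-n} g$ until the degree drops below $\deg(g)$. Here is a minimal Python sketch of that loop over $\mathbb{Q}$, using exact rational arithmetic. The function name and the coefficient-list representation (constant term first) are choices made for this illustration.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g in Q[x]; coefficient lists start with the constant term.

    Returns (q, r) with f = q*g + r and either r == [] (zero) or deg(r) < deg(g).
    """
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while r and r[-1] == 0:            # normalise: drop zero leading coefficients
        r.pop()
    while g and g[-1] == 0:
        g.pop()
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while r and len(r) >= len(g):
        shift = len(r) - len(g)        # m - n
        coeff = r[-1] / g[-1]          # a_m * b_n^{-1}
        q[shift] = coeff
        for i, c in enumerate(g):      # subtract coeff * x^shift * g
            r[shift + i] -= coeff * c
        while r and r[-1] == 0:        # the leading term has cancelled
            r.pop()
    return q, r


if __name__ == "__main__":
    # f = x^4 + 5x^3 - 2x - 3 and g = x^2 + 1 from Example 10.6
    q, r = poly_divmod([-3, -2, 0, 5, 1], [1, 0, 1])
    print(q, r)
```

For the polynomials of Example 10.6 this returns coefficient lists equal to `[-1, 5, 1]` and `[-2, -7]`, that is, $q = x^2 + 5x - 1$ and $r = -7x - 2$.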

We can use this to prove an important fact about polynomial rings over fields.

Theorem 10.7 (Remainder Theorem) Let $F$ be a field, let $f \in F[x]$ be a polynomial over $F$, and let $a \in F$. Then $f(a) = 0$ if and only if $(x - a)$ divides $f$ with no remainder.

Proof By Theorem 10.5 we know that $F[x]$ is a Euclidean domain. Therefore
$$f = (x - a) q + r$$
where $q, r \in F[x]$ and either $r = 0$ or $\deg(r) < \deg(x - a) = 1$. Hence $r$ is a constant polynomial; that is, an element $r \in F$. Substituting $x = a$ into the above Euclidean decomposition, we get
$$f(a) = (a - a) q(a) + r(a) = 0 + r = r.$$
So $f(a) = 0$ exactly when the remainder $r = 0$.
An important corollary of this theorem gives an upper bound on the number of roots a polynomial $f \in F[x]$ can have.
Corollary 10.8 Let $F$ be a field, and let $f \in F[x]$ be nonzero. Then there are at most $\deg(f)$ distinct elements $a \in F$ for which $f(a) = 0$.

Proof We proceed by induction on the degree of $f$. If $\deg(f) = 0$ then $f \neq 0$ is a nonzero constant in $F$, and so $f(a) \neq 0$ for any $a \in F$.
Now suppose that $\deg(f) = k$ and that any polynomial of degree less than $k$ has at most $(k-1)$ distinct roots in $F$. If $f(a) = 0$ then by Theorem 10.7 we can factorise $f = (x - a) q$ where $\deg(q) = k - 1$, and by the inductive hypothesis, $q$ has at most $(k-1)$ roots in $F$. Any root of $f$ other than $a$ must be a root of $q$, since $F$ has no zero divisors. Thus $f$ has at most $k = \deg(f)$ roots in $F$.
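The field hypothesis in Corollary 10.8 matters, and a brute-force root count makes this visible. The sketch below (the function name is hypothetical) lists the roots of a polynomial in $\mathbb{Z}_n$; it compares $x^2 - 1$ over the field $\mathbb{Z}_7$ with the same polynomial over $\mathbb{Z}_8$, which is not a field.

```python
def roots_mod(coeffs, n):
    """Roots in Z_n of the polynomial with the given coefficients (constant term first)."""
    return [a for a in range(n)
            if sum(c * pow(a, k, n) for k, c in enumerate(coeffs)) % n == 0]


if __name__ == "__main__":
    x_squared_minus_one = [-1, 0, 1]
    print(roots_mod(x_squared_minus_one, 7))   # Z_7 is a field
    print(roots_mod(x_squared_minus_one, 8))   # Z_8 has zero divisors
```

Over $\mathbb{Z}_7$ the roots are $1$ and $6$, within the bound $\deg(f) = 2$. Over $\mathbb{Z}_8$ every odd residue is a root, giving four roots of a degree-2 polynomial, so the bound fails once zero divisors are allowed.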
A useful application of this concerns the group of units of a field.
Proposition 10.9 Let $F$ be a field, and let $U(F) = F \setminus \{0\}$ denote its (multiplicative) group of units. Any finite subgroup of $U(F)$ is cyclic.

Proof Let $G \le U(F)$ be some finite subgroup of $U(F)$ with order $|G| = n$, and suppose it isn't cyclic. Since multiplication in $F$ is commutative, $G$ is abelian, and by Corollary 5.49 we know it must be isomorphic to a direct sum $\mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k}$ of finite nontrivial cyclic groups, where $m_1 \mid m_2 \mid \cdots \mid m_k$, and $k \ge 2$ because $G$ isn't cyclic. Let $m = m_k$ and consider some arbitrary element $(g_1, \ldots, g_k) \in \mathbb{Z}_{m_1} \oplus \cdots \oplus \mathbb{Z}_{m_k}$. Writing the group operation multiplicatively, we then have
$$(g_1, \ldots, g_k)^m = (g_1^m, \ldots, g_k^m) = (1, \ldots, 1),$$
and so $g^m = 1$ for any $g \in G$.
But this means that each of the $n$ elements of $G$ is a solution of the equation $g^m = 1$, and hence a root of the polynomial $f = x^m - 1$ in $F[x]$. However, $n = m_1 \cdots m_k > m_k = m = \deg(f)$, which contradicts Corollary 10.8.
Therefore $G$ must be cyclic.

The next corollary is a special case, which we will use in Section 10.A to help characterise the prime elements in the ring $\mathbb{Z}[i]$ of Gaussian integers.
Corollary 10.10 Let $\mathbb{F}_p$ denote the finite field of prime order $p$. Then $U(\mathbb{F}_p) \cong \mathbb{Z}_{p-1}$.

Proof By Lemma 2.43, the group $U(\mathbb{F}_p) = \mathbb{Z}_p^\times$ consists of those integers $k$ such that $0 < k < p$ and $\gcd(k, p) = 1$. The number of such integers is given by Euler's totient function $\phi$ (Definition 2.44, page 63), and Proposition 2.46 tells us that $\phi(p) = p - 1$. Hence $|U(\mathbb{F}_p)| = p - 1$, and Proposition 10.9 says that it must be cyclic, so $U(\mathbb{F}_p) = \mathbb{Z}_p^\times \cong \mathbb{Z}_{p-1}$ as claimed.
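Corollary 10.10 says that $U(\mathbb{F}_p)$ has an element of order $p - 1$. For small primes this is easy to confirm by computing multiplicative orders directly; the following Python sketch does so, with function names invented for this illustration and no attempt at efficiency.

```python
def multiplicative_order(a, p):
    """Order of a in the unit group of Z_p; a must be nonzero modulo the prime p."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k


def generators(p):
    """Elements of order p - 1 in U(F_p), for a prime p."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11, 13):
        print(p, generators(p))
```

For $p = 7$ the generators are $3$ and $5$: for example the powers of $3$ modulo $7$ run through $3, 2, 6, 4, 5, 1$.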

The ring $\mathbb{Z}[i]$ of Gaussian integers is also a Euclidean domain. To show this, we need to find a suitable valuation function $v : \mathbb{Z}[i]^* \to \mathbb{N}_0$, and prove the corresponding Division Theorem.
Our motivating example of a valuation was the absolute value function $|\cdot| : \mathbb{Z} \to \mathbb{N}_0$, which we can view as the restriction to $\mathbb{Z} \subset \mathbb{R}$ of the absolute value function $|\cdot| : \mathbb{R} \to \mathbb{R}_{\ge 0}$ for the real numbers. The Gaussian integers can be viewed as a subset of the complex number field $\mathbb{C}$, so we might be inclined to use the restriction to $\mathbb{Z}[i]$ of the complex modulus function $|\cdot| : \mathbb{C} \to \mathbb{R}_{\ge 0}$. Unfortunately, this doesn't quite work, because the modulus of an arbitrary Gaussian integer need not be a nonnegative integer: for example, $|1+i| = \sqrt{2}$. In some sense, this should be fine, because the set of positive algebraic irrational numbers is countable, and hence there exists a bijection to $\mathbb{N}_0$, which we could compose with the modulus to get the desired function, but it'll be much simpler if we can find an alternative valuation that maps directly into $\mathbb{N}_0$. Happily, we can: $v : \mathbb{Z}[i]^* \to \mathbb{N}_0$ given by $v(a + bi) = |a + bi|^2 = a^2 + b^2$.
Proposition 10.11 The ring $\mathbb{Z}[i]$ of Gaussian integers is a Euclidean domain via the valuation $v(a + bi) = a^2 + b^2$.

Proof The Gaussian integers form an integral domain: multiplication is commutative and there are no zero divisors. So we need to show that the map $v$ is a Euclidean valuation.
Suppose that $z = a + bi$ and $w = c + di$ are nonzero. Then
$$v(zw) = v((a+bi)(c+di)) = v((ac-bd) + (ad+bc)i)$$
$$= (ac-bd)^2 + (ad+bc)^2 = a^2c^2 + b^2d^2 + a^2d^2 + b^2c^2$$
$$= (a^2+b^2)(c^2+d^2) \ge a^2 + b^2 = v(z).$$
Proving the Division Theorem is, as usual, slightly more involved.
As is often the case, it's helpful to think of the problem geometrically.
The Gaussian integers are exactly those points in the complex plane
with integer coordinates. The valuation $v$ is then the square of the
modulus; it gives the square of the distance from the origin to the
point representing the Gaussian integer in question.
The principal ideal $\langle w\rangle = \langle c + di\rangle$ can be regarded as a lattice of
points in $\mathbb{Z}[i] \subset \mathbb{C}$. Converting to polar coordinates, there exist some
$p \in \mathbb{R}_{>0}$ and $\theta \in [0, 2\pi)$ such that $w = pe^{i\theta}$. The lattice corresponding
to $\langle w\rangle$ can then be constructed by taking the full lattice $\mathbb{Z}[i]$, scaling it
by a factor of $p$, and rotating it through an angle $\theta$.
Now, for any other Gaussian integer $z = a + bi$, there is some point in
this lattice $\langle w\rangle$ that is closest to it. More precisely, there exists some
Gaussian integer $u \in \langle w\rangle$ such that $v(z - u) \leq \frac{1}{2}p^2$. But if $u$ is an
element of $\langle w\rangle$, then that means there exists some other Gaussian
integer $q \in \mathbb{Z}[i]$ such that $u = qw$. So $qw$ is within squared distance
$\frac{1}{2}p^2$ of $z$; that is, $v(qw - z) \leq \frac{1}{2}p^2$. To find $q$, consider
$$s = \frac{z}{w} = \frac{a+bi}{c+di} = \frac{(a+bi)(c-di)}{c^2+d^2} = \frac{z\bar{w}}{v(w)}.$$
This will, in general, not be in $\mathbb{Z}[i]$, but it will be at most distance
$\frac{1}{\sqrt{2}}$ from something that is. Pick the nearest such Gaussian integer and
call it $q$, so that $v(s - q) \leq \frac{1}{2}$, where we extend $v$ to all of $\mathbb{C}$ as the
squared modulus.
Hence there exists some other Gaussian integer $r = z - qw$ such that
either $r = 0$ (if $s$ does happen to lie in $\mathbb{Z}[i]$), or, since $z - qw = (s - q)w$
and the squared modulus is multiplicative,
$$v(r) = v(z - qw) \leq \tfrac{1}{2}p^2 < p^2 = v(w).$$

Thus $v$ is a Euclidean valuation on $\mathbb{Z}[i]$, and therefore $\mathbb{Z}[i]$ is a
Euclidean domain.
This is best illustrated by an example.
Example 10.12 Suppose we want to find the quotient $q$ and remainder
$r$ of $14+2i$ when divided by $5+3i$. To find $q$ we want the closest
Gaussian integer to
$$\frac{14+2i}{5+3i} = \frac{(14+2i)(5-3i)}{25+9} = \frac{76-32i}{34} = \frac{38}{17} - \frac{16}{17}i \approx 2.24 - 0.94i.$$
The closest Gaussian integer to this is $2-i$, so we set $q = 2-i$. Then
$$r = (14 + 2i) - (2 - i)(5 + 3i) = (14 + 2i) - (13 + i) = 1 + i.$$
Hence $14+2i = (2-i)(5+3i) + (1+i)$, and also
$$v(r) = v(1 + i) = 2 < 34 = v(5 + 3i) = v(w)$$
as required. This is depicted geometrically in Figure 10.1.
Figure 10.1: The Division Theorem in $\mathbb{Z}[i]$, with $z = 14+2i$ and
$qw = (2-i)(5+3i) = 13+i$ marked. The small dots represent the
Gaussian integers, while the larger dots represent the ideal
$\langle w\rangle = \langle 5+3i\rangle$.
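The rounding procedure used in the proof of Proposition 10.11 translates directly into a short program. The following Python sketch is an illustration under stated assumptions rather than anything from the original text: a Gaussian integer $a + bi$ is stored as the pair `(a, b)`, and the names `v`, `mul`, `nearest_int` and `divmod_gauss` are hypothetical. Exact rational arithmetic is used so that no floating-point rounding can affect the choice of $q$.

```python
from fractions import Fraction


def v(z):
    """Valuation v(a + bi) = a^2 + b^2, with z stored as the pair (a, b)."""
    a, b = z
    return a * a + b * b


def mul(z, w):
    """Product of two Gaussian integers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def nearest_int(x):
    """Nearest integer to the rational number x (exact halves round up)."""
    return (2 * x.numerator + x.denominator) // (2 * x.denominator)


def divmod_gauss(z, w):
    """Return (q, r) with z = q*w + r and v(r) <= v(w)/2, assuming w != 0."""
    (a, b), (c, d) = z, w
    n = v(w)
    # s = z * conj(w) / v(w), computed exactly as a pair of rationals.
    s_re = Fraction(a * c + b * d, n)
    s_im = Fraction(b * c - a * d, n)
    q = (nearest_int(s_re), nearest_int(s_im))
    qw = mul(q, w)
    r = (a - qw[0], b - qw[1])
    return q, r


q, r = divmod_gauss((14, 2), (5, 3))
print(q, r, v(r) < v((5, 3)))  # (2, -1) (1, 1) True
```

This reproduces Example 10.12: the quotient is $2 - i$ and the remainder is $1 + i$.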

The Division Theorem in $\mathbb{Z}$ is the main ingredient for Euclid's Algorithm
for finding the greatest common divisor of two integers. There's
nothing stopping us from doing the same in any other Euclidean
domain, but we're then faced with the following question: what is
the significance of the end result of this process? Ideally, we'd hope
that it's a suitable analogue of a greatest common divisor in whatever
Euclidean domain we happen to be working in.

10.2 Divisors, primes and irreducible elements

Prime numbers are those which have no other factor but one, as three
only has a third, and five only has a fifth, and seven only has a
seventh; that is, they have only one factor. Composite numbers are
those which not only divide by one, but are also produced by another
number, such as nine, twenty-one, fifteen and twenty-five. That is, we
say three times three, and seven times three, three times five, and
five times five.
St Isidore of Seville (c.560–636 AD), Etymologiae III:7

In order to explore this idea, we first need to decide what we
mean by a divisor in some arbitrary integral domain. This is fairly
straightforward, and we arrive quickly at the following definition.
Definition 10.13 Let $R$ be an integral domain, and let $a, b \in R$. Then
$a$ divides $b$, or is a divisor or factor of $b$ if there exists some element
$r \in R$ such that $b = ra$. We denote this by $a|b$.
This can be extended to noncommutative division rings by the
introduction of left and right divisors. We say that $a$ is a left divisor
of $b$ if there exists some $r \in R$ such that $b = ar$, and a right divisor
of $b$ if there exists some $s \in R$ for which $b = sa$. Furthermore, $a$ is a
(two-sided) divisor of $b$ if it is both a left and right divisor.

We will concentrate on the commutative case, in integral domains,
from now on. In $\mathbb{Z}$, for example, 2 divides 6 since there exists some
integer, namely 3, such that $6 = 2\times 3$. Furthermore, every multiple
of 6 is even, and hence a multiple of 2. In the last chapter we met an
elegant way of thinking of this in terms of ideals: the even numbers
and the multiples of 6 both form principal ideals of $\mathbb{Z}$. We'll investigate
principal ideals more in the next section, but for now we can rephrase
the observation that 6 is even as $6 \in \langle 2\rangle$, and the fact that every
multiple of 6 is even as $\langle 6\rangle \subseteq \langle 2\rangle$. This is true in general:
Proposition 10.14 Let $R$ be an integral domain. Then the following
statements are equivalent, for any $a, b \in R$:
(i) $a|b$
(ii) $b \in \langle a\rangle$
(iii) $\langle b\rangle \subseteq \langle a\rangle$

Proof Suppose that $a|b$. Then there exists some $r \in R$ such that $b = ra$.
But the principal ideal $\langle a\rangle$ consists of all elements of $R$ of the form $ra$
for some $r \in R$. Hence $b \in \langle a\rangle$.
Now suppose $b \in \langle a\rangle$. Then $b = ra$ for some $r \in R$, and any other
multiple of $b$ (that is, any element of the form $sb$ for some $s \in R$) can
be written as $sb = s(ra) = (sr)a \in \langle a\rangle$. But $\langle b\rangle$ consists of all elements
of the form $sb$, so any element of $\langle b\rangle$ must also lie in $\langle a\rangle$, so $\langle b\rangle \subseteq \langle a\rangle$.
Finally, suppose $\langle b\rangle \subseteq \langle a\rangle$. This means that every element of the form
$sb$ is also of the form $ra$ for some $r, s \in R$. In particular, setting $s = 1$,
we find $b = 1b = ra$ for some $r \in R$. Therefore $b = ra$ and thus $a|b$.
Is it possible for two elements of an integral domain to divide each
other? In the case of $\mathbb{Z}$ this only happens in very specific circumstances:
for two integers $a, b \in \mathbb{Z}$ we have $a|b$ and $b|a$ only when $a = \pm b$. But
for two polynomials $f, g \in \mathbb{Q}[x]$ we see this more often: $f|g$ and $g|f$
exactly when $f = qg$ for some nonzero $q \in \mathbb{Q}$.
What do these two examples have in common? The answer is that two
elements of the given domains divide each other exactly when one is a
multiple of the other by an invertible element: $\mathbb{Z}$ has two units $\pm 1$,
while the units in $\mathbb{Q}[x]$ are the nonzero constant polynomials, which
are exactly the nonzero rational numbers $\mathbb{Q}^\times$.
Definition 10.15 Let $R$ be an integral domain. Then two elements
$a, b \in R$ are associates if $a|b$ and $b|a$. We denote this by $a \sim b$.

Proposition 10.16 Let $R$ be an integral domain. Then the following
statements are equivalent for any $a, b \in R$:
(i) $a \sim b$,
(ii) $\langle a\rangle = \langle b\rangle$, and
(iii) $a = qb$ for some unit $q \in R$.

Proof The equivalence of (i) and (ii) follows from Proposition 10.14.
If $a \sim b$ then $a|b$, which means that $\langle b\rangle \subseteq \langle a\rangle$. Also $b|a$, which means
that $\langle a\rangle \subseteq \langle b\rangle$. Hence $\langle a\rangle = \langle b\rangle$. Each step in this argument is
reversible, so the converse holds too: if $\langle a\rangle = \langle b\rangle$ then $a \sim b$.
Now suppose $a \sim b$. If $a = 0$ then $b = 0$, so suppose both $a$ and $b$ are
nonzero. Then there exist $r, s \in R$ such that $a = rb$ and $b = sa$. Then
$a = r(sa)$, so $(rs)a - a = 0$ and hence $(rs - 1)a = 0$. Since $R$ is an
integral domain, it has no zero divisors, and therefore $rs = 1$. Thus $r$
and $s$ are units, with $s = r^{-1}$. The converse is straightforward: if there
exists a unit $q \in R$ such that $a = qb$ then $b|a$, and also $b = q^{-1}a$ so $a|b$.
Thus $a \sim b$.
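As a sanity check on the equivalence of (i) and (iii), the following Python sketch tests both conditions for every pair of Gaussian integers in a small box. It is only an illustration: elements $a + bi$ are stored as pairs `(a, b)`, the helper names are hypothetical, and it assumes the standard fact (not proved in this section) that the units of $\mathbb{Z}[i]$ are $1, -1, i, -i$.

```python
# The four units 1, -1, i, -i of Z[i], stored as (real, imaginary) pairs.
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def mul(z, w):
    """Product of two Gaussian integers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def divides(w, z):
    """True if w | z in Z[i], i.e. z = q*w for some Gaussian integer q."""
    (c, d), (a, b) = w, z
    n = c * c + d * d
    if n == 0:
        return z == (0, 0)
    # z / w = z * conj(w) / n must have integer real and imaginary parts.
    return (a * c + b * d) % n == 0 and (b * c - a * d) % n == 0


def associates(z, w):
    """Definition 10.15: z and w divide each other."""
    return divides(z, w) and divides(w, z)


box = [(a, b) for a in range(-4, 5) for b in range(-4, 5)]
agree = all(
    associates(z, w) == any(z == mul(u, w) for u in UNITS)
    for z in box
    for w in box
)
print(agree)  # True
```

A finite search like this proves nothing in general, but it is a quick way to catch a misread definition.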
Proposition 10.17 Association is an equivalence relation.

Proof Association is reflexive: $a \sim a$ for any $a \in R$, since $a = 1a$.
Association is symmetric: if $a \sim b$ then $a = ub$ for some unit $u \in R$.
Then $b = u^{-1}a$, and hence $b \sim a$.
Association is transitive: if $a \sim b$ and $b \sim c$ then $a = ub$ and $b = vc$ for
some units $u, v \in R$. Then $a = (uv)c$, and thus $a \sim c$, since $uv$ is also a
unit of $R$.
We want to extend the familiar notions of greatest common divisors
and least common multiples to arbitrary integral domains, in particular
those which might not have an obvious concept of 'greatest' or
'least'. To start with, we revisit the usual definitions in $\mathbb{Z}$:
Definition 10.18 For any two integers $a, b \in \mathbb{Z}$, an integer $d \in \mathbb{Z}$ is
a common divisor of $a$ and $b$ if $d|a$ and $d|b$. It is a greatest common
divisor if for any other common divisor $e \in \mathbb{Z}$, we have $0 < |e| \leq |d|$.
Furthermore, an integer $m \in \mathbb{Z}$ is a common multiple of $a$ and $b$ if
$a|m$ and $b|m$. It is a least common multiple if for any other common
multiple $n \in \mathbb{Z}$ we have $|n| \geq |m|$.

This certainly makes sense in Z and we can see how to extend it to


any other Euclidean domain:
Definition 10.19 Let $R$ be a Euclidean domain, let $a, b \in R$ and let
$v$ be a Euclidean valuation for $R$. Then $d \in R$ is a common divisor
for $a$ and $b$ if $d|a$ and $d|b$. It is a greatest common divisor if for any
other common divisor $e \in R$, we have $0 < v(e) \leq v(d)$.
Furthermore, $m \in R$ is a common multiple of $a$ and $b$ if $a|m$ and
$b|m$. It is a least common multiple if for any other common multiple
$n \in R$ we have $v(n) \geq v(m)$.
This definition is fine for Euclidean domains, but we want one that
works in an arbitrary integral domain. (However, having a definition
that makes sense doesn't guarantee that such elements will actually
exist: we will later see examples of integral domains where arbitrary
elements don't always have a greatest common divisor.)
The key observation is that if $a, b \in \mathbb{Z}$ and $d \in \mathbb{Z}$ is a greatest common
divisor for $a$ and $b$, then not only does any other common divisor
$e \in \mathbb{Z}$ satisfy $0 < |e| \leq |d|$, but it must also be the case that $e|d$ as well.
This leads to the following more general definition.
Definition 10.20 Let $R$ be an integral domain and let $a, b \in R$. Then
$d \in R$ is a common divisor for $a$ and $b$ if $d|a$ and $d|b$. It is a greatest
common divisor if for any other common divisor $e \in R$ we have $e|d$.
An element $m \in R$ is a common multiple of $a$ and $b$ if $a|m$ and $b|m$,
and a least common multiple if for any other common multiple
$n \in R$ it is the case that $m|n$.
More generally, for any finite collection $a_1, \ldots, a_n$ of elements of
$R$, we say that $d \in R$ is a greatest common divisor of $a_1, \ldots, a_n$
if $d|a_1, \ldots, d|a_n$ and for any other common divisor $e \in R$ with
$e|a_1, \ldots, e|a_n$ we have $e|d$. Similarly, $l$ is a least common multiple of
$a_1, \ldots, a_n$ if $a_1|l, \ldots, a_n|l$ and for any other common multiple $m \in R$
with $a_1|m, \ldots, a_n|m$ we have $l|m$.

Note: Where the meaning is clear from context, we will often just refer to
'the' greatest common divisor, which we will then denote $\gcd(a, b)$ or
$\gcd(a_1, \ldots, a_n)$, and 'the' least common multiple, which we will then
write as $\operatorname{lcm}(a, b)$ or $\operatorname{lcm}(a_1, \ldots, a_n)$. Some books refer to the highest
common factor instead of the greatest common divisor, which is then
denoted $\operatorname{hcf}(a, b)$ or $\operatorname{hcf}(a_1, \ldots, a_n)$. Also, some books denote the
greatest common divisor by $(a, b)$ or $(a_1, \ldots, a_n)$ although this notation
is prone to ambiguity and so we won't use it here.
Definition 10.20 implies Definition 10.19 in a Euclidean domain.
Suppose that $R$ is a Euclidean domain and that $d$ is a greatest common
divisor for two arbitrary elements $a, b \in R$. Then any other common
divisor $e \in R$ must satisfy $e|d$, which means that there exists some
$c \in R$ with $d = ce$. Then for any Euclidean valuation $v$ we have
$v(d) = v(ce) \geq v(e)$ by (10.3). Similarly, suppose that $m \in R$ is a least
common multiple for $a$ and $b$, and that $n \in R$ is some other common
multiple. Then by Definition 10.20 there exists some $l \in R$ such that
$n = lm$, and by (10.3) we have $v(n) = v(lm) \geq v(m)$ as required.
We've also been careful to use the indefinite rather than the definite
article when discussing greatest common divisors and least common
multiples, and the reason for this is that they are not guaranteed to
be unique in an arbitrary integral domain. In fact, they're not unique
even in $\mathbb{Z}$. For example, both 3 and $-3$ are greatest common divisors
of 6 and 9, and both 18 and $-18$ are least common multiples. What
the elements in these pairs have in common is that they differ by a
factor of $-1$, which is a unit in $\mathbb{Z}$. More generally, they are associates
of each other. This is true in any integral domain: any two greatest
common divisors are associates, and so are any two least common
multiples. This is a consequence of Definitions 10.15 and 10.20.
Any integer $n \in \mathbb{Z}$ is either prime or can be expressed as a finite
product of prime factors $n = p_1 \ldots p_k$. Later in this chapter we will
study factorisation in more detail, but for now it would be useful to
have a general notion of primality that works in any integral domain.
The integer 3 is prime. By this, we usually mean that it has no factors
except 1 or itself; the integer 4, on the other hand, is composite because
it factorises as $2\times 2$. That is, 4 can be expressed as a product of integers
which are neither 1 nor 4 itself.
But $3 = (-1)(-3)$ as well, so we have to take account of negative
integers as well. We can do this by saying that an integer $n$ is prime
if any factorisation $n = ab$ requires either $a$ or $b$ to be a unit, and
composite otherwise (if it factorises as a product of non-units). Also,
we don't usually consider 1 or $-1$ to be prime or composite, so we'll
explicitly discount the case where $n$ is a unit.
There's another way of looking at this, not in terms of what integers
divide $n$, but in terms of what composite integers $n$ itself divides. For
example, $3|18$ and 18 can be decomposed as a product in various ways:
as $1\times 18$, as $2\times 9$ or as $3\times 6$. In each of these cases, 3 divides at least
one factor: $3|18$ in the first case, $3|9$ in the second, and $3|3$ and $3|6$ in
the last one.
If we try this with a composite integer, by considering how 4 divides
20, then something different happens. In particular, $20 = 1\times 20 =
2\times 10 = 4\times 5$. In one of these factorisations, namely $2\times 10$, our chosen
composite integer 4 divides neither factor, 2 nor 10. This can happen
because 4 itself factorises into a product $2\times 2$ of non-units.
So this gives a different way of characterising an integer $n$ as prime: it
isn't a unit, and for any product $m = ab$ with $n|m$, either $n|a$ or $n|b$.
These definitions are equivalent for integers, but not necessarily in
arbitrary integral domains. So we'll use different terms for each:
Definition 10.21 Let $R$ be an integral domain. An element $r \in R$ is
irreducible if it is not a unit, and if $r = ab$ for some $a, b \in R$ it must
be the case that either $a$ or $b$ is a unit.
Slightly confusingly, this is the usual definition of what it means for
an integer to be prime, whereas we reserve that term for the following.
Definition 10.22 Let $R$ be an integral domain. An element $r \in R$ is
prime if it is not a unit, and if for any $a, b \in R$ it must be the case
that $r|ab$ implies that either $r|a$ or $r|b$.
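The two definitions can be compared by brute force in $\mathbb{Z}$. The Python sketch below is illustrative only: the function names are hypothetical, $n$ is assumed to be nonzero, and the test for primality only examines $a, b$ with $|a|, |b| \leq$ `bound`. For $|n| \leq$ `bound` this restriction loses nothing, because a composite $n = cd$ is already exposed by the pair $a = c$, $b = d$.

```python
def is_unit(n):
    """The units of Z are 1 and -1."""
    return n in (1, -1)


def is_irreducible(n):
    """Definition 10.21 in Z, for nonzero n: every factorisation has a unit factor."""
    if is_unit(n):
        return False
    for a in range(-abs(n), abs(n) + 1):
        if a != 0 and n % a == 0:
            b = n // a
            if not is_unit(a) and not is_unit(b):
                return False
    return True


def is_prime_element(n, bound=30):
    """Definition 10.22 in Z, for nonzero n, checked for |a|, |b| <= bound."""
    if is_unit(n):
        return False
    window = range(-bound, bound + 1)
    return all(
        a % n == 0 or b % n == 0
        for a in window
        for b in window
        if (a * b) % n == 0
    )


irreducibles = [n for n in range(2, 30) if is_irreducible(n)]
primes = [n for n in range(2, 30) if is_prime_element(n)]
print(irreducibles == primes)  # True
print(is_prime_element(4))     # False: 4 | 2*10 but 4 divides neither 2 nor 10
```

In $\mathbb{Z}$ the two lists agree, as the text asserts; the later example in $\mathbb{Z}[\sqrt{-3}]$ shows that this agreement is a special property of the ring.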

These two definitions give rise to subtly different concepts. The sub-
tlety is exacerbated by the fact that in familiar number systems like
Z, prime elements are irreducible and irreducible elements are prime.
More generally, only the first of these is true in a given integral domain.
Proposition 10.23 Any prime element of an integral domain $R$ is
irreducible.
Proof Let $r \in R$ be prime, and suppose that $r = ab$. Clearly $r|r$, so
$r|ab$, and since $r$ is prime we must have either $r|a$ or $r|b$. Suppose that
$r|a$, without loss of generality. But $a|r$ as well since $r = ab$, so $r \sim a$,
which means that $r = ua$ for some unit $u \in R$. By the cancellation law
for integral domains (Proposition 8.48) we have $u = b$, so $b$ is a unit
and hence $r$ is irreducible.
In a little while, we'll investigate the exact circumstances in which the
converse fails to be true, but for the moment here is an example.

Example 10.24 Let $\mathbb{Z}[\sqrt{-3}] = \{a + b\sqrt{-3} : a, b \in \mathbb{Z}\}$. This is an
integral domain, and we claim that in it, 2 is irreducible but not prime.
To show this, we need to use the norm function $N : \mathbb{Z}[\sqrt{-3}] \to \mathbb{N}$
given by $N(a + b\sqrt{-3}) = a^2 + 3b^2$.
In particular, this function has the following properties:
(i) $N(r) = 1$ if and only if $r$ is a unit; and
(ii) $N(rs) = N(r)N(s)$ for any nonzero $r, s \in \mathbb{Z}[\sqrt{-3}]$.
Now observe that $(1+\sqrt{-3})(1-\sqrt{-3}) = 4 = 2\times 2$. Suppose that
$2 = rs$ where $r = u_1 + v_1\sqrt{-3}$ and $s = u_2 + v_2\sqrt{-3}$. Then
$$4 = N(2) = N(rs) = N(r)N(s) = (u_1^2 + 3v_1^2)(u_2^2 + 3v_2^2).$$
If $N(r) = 1$ then $r$ is a unit and $r^{-1} = u_1 - v_1\sqrt{-3}$.
Similarly, if $N(s) = 1$ then $s$ is a unit, with $s^{-1} = u_2 - v_2\sqrt{-3}$.
The only other option is that $N(r) = N(s) = 2 \neq 1$. This ensures that
neither $r$ nor $s$ are units, but also requires $u_1^2 + 3v_1^2 = 2$, and there
are no integers $u_1$ and $v_1$ satisfying this equation, so $N(r)$ can't be 2.
Thus, the only factorisations $2 = rs$ in $\mathbb{Z}[\sqrt{-3}]$ require either $r$ or $s$ to
be a unit. Also, $N(2) = 4 \neq 1$, so 2 itself isn't a unit, and is therefore
irreducible.
Now recall that $2|4$ and $4 = (1+\sqrt{-3})(1-\sqrt{-3})$. We know that 2
isn't a unit, so for it to be prime we would require either $2|(1+\sqrt{-3})$
or $2|(1-\sqrt{-3})$. For this to be the case, we would need an element
$t \in \mathbb{Z}[\sqrt{-3}]$ for which $2t = 1 \pm \sqrt{-3}$. But this requires $t = \frac{1}{2} \pm \frac{\sqrt{-3}}{2}$,
neither of which are elements of $\mathbb{Z}[\sqrt{-3}]$. So 2 is not prime in
$\mathbb{Z}[\sqrt{-3}]$.
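The finite searches implicit in Example 10.24 are easy to carry out mechanically. In the Python sketch below (an illustration, with hypothetical helper names), an element $a + b\sqrt{-3}$ is stored as the pair `(a, b)`. Because $N(r)N(s) = 4$ forces $|a| \leq 2$ and $|b| \leq 1$ for each factor, searching the box $-2 \leq a, b \leq 2$ is exhaustive.

```python
def norm(r):
    """N(a + b*sqrt(-3)) = a^2 + 3*b^2, with r stored as the pair (a, b)."""
    a, b = r
    return a * a + 3 * b * b


def mul(r, s):
    """Product in Z[sqrt(-3)], using sqrt(-3)^2 = -3."""
    (a, b), (c, d) = r, s
    return (a * c - 3 * b * d, a * d + b * c)


def two_divides(r):
    """2 divides a + b*sqrt(-3) exactly when both a and b are even."""
    a, b = r
    return a % 2 == 0 and b % 2 == 0


box = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]

# No element has norm 2, so 2 = rs forces N(r) = 1 or N(s) = 1.
norm_two = [r for r in box if norm(r) == 2]
factorisations = [(r, s) for r in box for s in box if mul(r, s) == (2, 0)]
irreducible = all(norm(r) == 1 or norm(s) == 1 for r, s in factorisations)

# 2 divides the product (1 + sqrt(-3))(1 - sqrt(-3)) = 4 but neither factor.
x, y = (1, 1), (1, -1)
print(norm_two, irreducible)                                 # [] True
print(mul(x, y))                                             # (4, 0)
print(two_divides(mul(x, y)), two_divides(x), two_divides(y))  # True False False
```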
Something else we notice in this example is that 4 factorises in two
different ways $2\times 2 = (1+\sqrt{-3})(1-\sqrt{-3})$ as a product of irreducible
elements in $\mathbb{Z}[\sqrt{-3}]$. In $\mathbb{Z}$, however, we're used to irreducible
factorisations being unique (at least up to permutation of factors and
multiplication by units $u = \pm 1$). Later we'll return to this idea and
investigate when factorisations are unique in this way.
Recall that we defined an ideal $I \subseteq R$ of a ring $R$ to be prime if, for
any $a, b \in R$ with $ab \in I$, either $a \in I$ or $b \in I$ (Definition 9.63,
page 387). This looks very similar to the definition of a prime element
in an integral domain $R$: $p \in R$ is prime if, whenever $p|ab$ for some
$a, b \in R$, either $p|a$ or $p|b$ (Definition 10.22, page 405).
In one definition we are concerned with whether the given ideal $I$
contains one or other element $a$ or $b$, and in the other we are interested
in whether $a$ or $b$ is a multiple of the given element $p$.
Is this just a coincidental analogy, or is there a deeper connection?
And what can we say about irreducible elements and maximal ideals?
Proposition 10.25 Let $p$ be a nonzero element of an integral domain $R$.
Then
(i) $p$ is irreducible if and only if $\langle p\rangle$ is a maximal principal ideal, and
(ii) $p$ is prime if and only if $\langle p\rangle$ is a prime ideal.

Proof (i) Suppose that $p$ is irreducible, and $\langle p\rangle$ isn't a maximal
principal ideal of $R$; that is, there exists some $a \in R$ for which
$\langle p\rangle \subset \langle a\rangle \subset R$. Then $p \in \langle a\rangle$, so there exists some $b \in R$ for which $p = ab$.
Since $p$ is irreducible, either $a$ or $b$ must be a unit. If $a$ is a unit, then
$\langle a\rangle = R$ by Proposition 9.39. And if $b$ is a unit, then $a = pb^{-1}$, which
must lie in $\langle p\rangle$, so $\langle a\rangle = \langle p\rangle$. In either case, there is no principal ideal
between $\langle p\rangle$ and $R$, so $\langle p\rangle$ is a maximal principal ideal. (There may be a
non-principal ideal between $\langle p\rangle$ and $R$, however.)
Now suppose $\langle p\rangle$ is a maximal principal ideal of $R$, and that $p$ is
not irreducible. Then $p = ab$ for some $a, b \in R$, neither of which are
units. If $a \in \langle p\rangle$ then $a = pc$ for some $c \in R$, and thus $p = ab = pcb$.
The cancellation law for integral domains (Proposition 8.48) tells us
that $1 = cb$, which means $b$ is a unit, contradicting the hypothesis
that it wasn't. Hence $a \notin \langle p\rangle$, so $\langle p\rangle \subset \langle a\rangle$. If, however, $\langle a\rangle = R$, it
would follow that $a$ is a unit, again contradicting the hypothesis that
it is noninvertible. So $\langle p\rangle \subset \langle a\rangle \subset R$, and hence $\langle p\rangle$ isn't a maximal
principal ideal. Alternatively, $p$ could be a unit, but then we would
have $\langle p\rangle = R$, and again $\langle p\rangle$ wouldn't be a maximal principal ideal.
Either way, $\langle p\rangle$ can't be a maximal principal ideal if $p$ isn't irreducible.
(ii) Suppose $p \in R$ is prime and consider some $ab \in \langle p\rangle$. Then there
exists some $c \in R$ for which $ab = cp$, so $p|ab$. Therefore either $p|a$, in
which case $a = rp$ for some $r \in R$ and thus $a \in \langle p\rangle$, or $p|b$, whence
$b = sp$ for some $s \in R$ and thus $b \in \langle p\rangle$. Hence $\langle p\rangle$ is a prime ideal.
Conversely, suppose $\langle p\rangle$ is a prime ideal, and $p|ab$ for some $a, b \in R$.
Then $ab \in \langle p\rangle$, hence either $a \in \langle p\rangle$, in which case $p|a$, or $b \in \langle p\rangle$,
whence $p|b$, both by Proposition 10.14. Thus $p$ is prime in $R$.
Corollary 10.26 Let $R$ be an integral domain, and let $p$ be some nonzero
element of $R$. Then
(i) $R/\langle p\rangle$ is an integral domain if and only if $p$ is prime, and
(ii) if $R/\langle p\rangle$ is a field, then $p$ is irreducible.

Proof (i) The ideal $\langle p\rangle$ is prime exactly when $p$ is prime in $R$, and by
Proposition 9.64 $R/I$ is an integral domain exactly when $I$ is a prime
ideal of $R$.
(ii) By Proposition 9.61, if $R/\langle p\rangle$ is a field, then $\langle p\rangle$ is a maximal ideal.
And since a maximal ideal that happens to be principal is necessarily
a maximal principal ideal, $p$ must be irreducible in $R$.
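In the familiar case $R = \mathbb{Z}$ the corollary can be observed directly. The Python sketch below (illustrative, with hypothetical function names) tests by exhaustive search whether $\mathbb{Z}/\langle n\rangle$ has zero divisors and whether every nonzero class is invertible, and compares both with the primality of $n$ for $2 \leq n \leq 40$.

```python
def is_integral_domain(n):
    """True if Z/<n> has no zero divisors, for an integer n >= 2."""
    return all((a * b) % n != 0 for a in range(1, n) for b in range(1, n))


def is_field(n):
    """True if every nonzero class in Z/<n> has a multiplicative inverse."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))


def is_prime_integer(n):
    """Trial-division primality test for a positive integer n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


for n in range(2, 41):
    assert is_integral_domain(n) == is_prime_integer(n)
    assert is_field(n) == is_prime_integer(n)
print("checked n = 2, ..., 40")
```

In $\mathbb{Z}$ the quotient is a field exactly when it is an integral domain, which is stronger than part (ii) of the corollary.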

10.3 Principal ideal domains

Given two numbers which are relatively prime, find the least multiple of
each such that one multiple exceeds the other by unity.
Claude Gaspard Bachet de Méziriac (1581–1638),
Problèmes plaisans et delectables, qui se font par les nombres (1624) 18

Proposition 10.25 and Corollary 10.26 provide a neat connection
between prime elements, prime ideals and integral domains as quotients,
and a somewhat weaker and untidier link between irreducible
elements, maximal ideals and fields as quotients.
As remarked just now, the problem arises when the integral domain
under investigation has maximal ideals that aren't principal, and as
we saw in Example 9.59 and Proposition 9.60 there are some by now
quite familiar examples of integral domains that fall into this category.
There are, however, some integral domains that have only principal
ideals, and when we're dealing with one of these, the statements about
irreducible elements in both Proposition 10.25 and Corollary 10.26
become much neater.
Definition 10.27 An integral domain $R$ is said to be a principal
ideal domain or PID if all of its ideals are principal; that is, of the
form $\langle a\rangle$ for some $a \in R$.
Our model integral domain so far has been $\mathbb{Z}$, and it turns out that
all of its ideals are principal; that is, of the form $\langle n\rangle$ for some $n \in \mathbb{Z}$.
Proposition 10.28 The ring $\mathbb{Z}$ is a PID.

Proof The trivial ideal $\{0\} = \langle 0\rangle$ is principal, as is $\mathbb{Z} = \langle 1\rangle$ itself.
Suppose that $I \subset \mathbb{Z}$ is a proper, nontrivial ideal, and choose the
smallest nonzero element $b \in I$; that is, select $b \in I \setminus \{0\}$ such that $|b|$
is as small as possible. Then $\langle b\rangle \subseteq I$, since for any $n \in \mathbb{Z}$, the product
$nb$ lies in $I$ by the absorption condition for ideals.
Now take some arbitrary element $a \in I$ and use the Division Theorem
to find $q, r \in \mathbb{Z}$ such that $a = qb + r$, with either $r = 0$ or $|r| < |b|$.
Here $r = a - qb$ lies in $I$, so $r = 0$, because otherwise $0 < |r| < |b|$,
which contradicts the choice of $b$. Hence $a = qb \in \langle b\rangle$, so $I \subseteq \langle b\rangle$ and
therefore $I = \langle b\rangle$.
Thus any ideal of $\mathbb{Z}$ is principal, of the form $\langle b\rangle$ for some $b \in \mathbb{Z}$, and
hence $\mathbb{Z}$ is a PID.
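The idea of the proof, that an ideal of $\mathbb{Z}$ is generated by its smallest nonzero element, can be seen on a finite sample. The Python sketch below is only an illustration under stated assumptions: the hypothetical function `ideal_sample` lists the elements $12r + 18s$ of the ideal $\langle 12\rangle + \langle 18\rangle$ for coefficients with $|r|, |s| \leq 20$, so it sees a finite window of the ideal rather than the whole of it.

```python
from math import gcd


def ideal_sample(a, b, bound=20):
    """Elements r*a + s*b of the ideal <a> + <b> with |r|, |s| <= bound."""
    window = range(-bound, bound + 1)
    return sorted({r * a + s * b for r in window for s in window})


elements = ideal_sample(12, 18)

# The smallest positive element plays the role of b in the proof above.
generator = min(x for x in elements if x > 0)

print(generator)                                # 6
print(all(x % generator == 0 for x in elements))  # True
print(generator == gcd(12, 18))                 # True
```

Every sampled element is a multiple of 6, in line with the conclusion $I = \langle b\rangle$.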
What this says is that every ideal of Z is generated by its smallest
nonzero element, and is hence principal. So we can rewrite Proposi-
tion 10.25 and Corollary 10.26 in a nicer and stronger form in Z. But
actually, the above proof just makes use of basic properties of ideals
and the fact that the Division Theorem works in Z, so we should be
able to extend it to any integral domain in which some version of the
Division Theorem holds. This leads to the following, which is just a
suitably generalised version of Proposition 10.28.
Proposition 10.29 Every Euclidean domain is a PID.

Proof Let $R$ be a Euclidean domain, and let $v$ be a Euclidean valuation
for $R$. Then the trivial ideal $\{0\} = \langle 0\rangle$ and $R = \langle 1\rangle$ itself are principal.
So, suppose that $I$ is a proper, nontrivial ideal of $R$ and choose
$b \in I \setminus \{0\}$ such that $v(b)$ is as small as possible. Then $\langle b\rangle \subseteq I$.
Now let $a \in I$ be some arbitrary element of $I$. Because $R$ is a Euclidean
domain, we can find $q, r \in R$ such that $a = qb + r$, with either $r = 0$
or $v(r) < v(b)$. If $r \neq 0$ then $r = a - qb$, which is an element of $I$
(since $a \in I$ and, by the absorption condition for ideals, $qb \in I$ as well).
Then $v(r) < v(b)$, which can't be the case because we specifically
chose $b$ so that $v(b)$ was minimal over $I$. Therefore $r = 0$ and we have
$a = qb \in \langle b\rangle$. So $I \subseteq \langle b\rangle$ and therefore $I = \langle b\rangle$ as claimed. Hence $I$ is
principal, and $R$ is a PID.
So, every Euclidean domain is a PID. But the converse doesn't necessarily
hold: there are PIDs that aren't Euclidean. In particular, the ring
$\mathbb{Z}\left[\frac{1}{2}(1 + \sqrt{-19})\right]$ is a PID but not a Euclidean domain.
The proof of this is a little involved and would be a bit of a digression
at this point, so we postpone it to Section 10.A.
PIDs don't satisfy all the properties that Euclidean domains do: we're
not guaranteed that some version of the Division Theorem will hold.
Nevertheless, they are still quite civilised rings in which to work. In
particular, we're guaranteed to have well-defined greatest common
divisors and least common multiples:
Proposition 10.30 If $R$ is a PID then for any $a, b \in R$ there exist
elements $d, l \in R$ such that $d = \gcd(a, b)$ and $l = \operatorname{lcm}(a, b)$.
Also, there exist $r, s \in R$ such that $ra + sb = d$.

The second part of this proposition is known as Bézout's Identity,
after the French mathematician Étienne Bézout (1730–1783), who proved
it for polynomials in his 1779 work Théorie générale des équations
algébriques. The corresponding result for integers had been proved
earlier by the French mathematician, poet and classicist Claude Gaspard
Bachet de Méziriac (1581–1638) in his 1612 book Problèmes plaisants et
délectables qui se font par les nombres.

Proof Let $I = \langle a\rangle + \langle b\rangle = \{ra + sb : r, s \in R\}$. This is an ideal of $R$,
and since $R$ is a PID, $I$ must be principal, and hence equal to $\langle d\rangle$ for
some $d \in R$. This element $d$ is a greatest common divisor of $a$ and
$b$: indeed, $\langle a\rangle \subseteq \langle d\rangle$ and $\langle b\rangle \subseteq \langle d\rangle$, so by Proposition 10.14 we have
$d|a$ and $d|b$. Furthermore, for any $e \in R$ with $\langle a\rangle \subseteq \langle e\rangle$ and $\langle b\rangle \subseteq \langle e\rangle$,
it follows that $\langle d\rangle = \langle a\rangle + \langle b\rangle \subseteq \langle e\rangle$, so $e|d$, and hence $d$ is a greatest
common divisor of $a$ and $b$.
Since $\langle a\rangle + \langle b\rangle = \langle d\rangle$, there exists some element of $\langle a\rangle + \langle b\rangle$ equal to
$d$. That is, there exist elements $r, s \in R$ such that $ra + sb = d$.
Similarly, $\langle a\rangle \cap \langle b\rangle$ is an ideal of $R$, and thus equal to $\langle l\rangle$ for some
$l \in R$ since $R$ is a PID. This $l$ is a least common multiple of $a$ and $b$:
indeed, $\langle l\rangle \subseteq \langle a\rangle$ and $\langle l\rangle \subseteq \langle b\rangle$, so by Proposition 10.14 we have $a|l$
and $b|l$. Furthermore, for any $m \in R$ with $\langle m\rangle \subseteq \langle a\rangle$ and $\langle m\rangle \subseteq \langle b\rangle$,
it follows that $\langle m\rangle \subseteq \langle a\rangle \cap \langle b\rangle = \langle l\rangle$ and thus $l|m$. Hence $l$ is a least
common multiple of $a$ and $b$.

Biographical note: Claude Gaspard Bachet, Sieur de Méziriac (1581–1638)
was born to a noble family in Bourg-en-Bresse, at the time part of the
Duchy of Savoy. From early childhood, he lived and studied with the
Jesuit Order, both his parents having died by the time he was six years
old. He joined the order in 1601 but left the following year due to ill
health and returned to his family estate at Bourg-en-Bresse, where he
lived in comparative leisure for most of the rest of his life.
Bachet was an accomplished poet and writer, publishing poems in French,
Italian and Latin, and translations of the psalms and other religious
works, as well as translations of some of the works of Ovid
(43 BC–c.17 AD). He was a member of the literary salon that in 1634 became
the Académie Française, although recurring health problems prevented
him from attending the inauguration ceremony and he was formally
elected to membership the following year.
In addition to his literary works, he also wrote a number of books on
mathematics, in particular Problèmes plaisans et delectables qui se font
par les nombres (1612). This book consisted of a collection of
mathematical puzzles, mostly in the form of arithmetical problems, and
was a forerunner of modern recreational mathematics books. He also
published a Latin translation of the classic number theory text
Arithmetica by the Greek mathematician Diophantus of Alexandria
(c.200–c.285 AD). It was in the margin of this edition that Pierre de
Fermat (1601–1665) wrote his celebrated Last Theorem.

This can be extended to finite collections of elements:
Corollary 10.31 Let $R$ be a PID. Then for any $a_1, \ldots, a_m \in R$ there exist
elements $d, l \in R$ with $d = \gcd(a_1, \ldots, a_m)$ and $l = \operatorname{lcm}(a_1, \ldots, a_m)$.
Also, there exist elements $r_1, \ldots, r_m \in R$ such that $r_1a_1 + \cdots + r_ma_m = d$.
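The proof above shows that the coefficients in Bézout's Identity exist but does not say how to find them. In a Euclidean domain such as $\mathbb{Z}$ they can be computed with the extended form of Euclid's Algorithm. The following Python sketch is one standard way to do this for integers; it is offered as an illustration rather than as the method of the text, and the name `extended_gcd` is a hypothetical choice.

```python
def extended_gcd(a, b):
    """Return (d, r, s) with r*a + s*b == d, where d is a gcd of a and b.

    For negative inputs d may be negative, which is still a greatest
    common divisor in the sense of Definition 10.20 (it is an associate
    of the positive one).
    """
    old_rem, rem = a, b
    old_r, r = 1, 0
    old_s, s = 0, 1
    while rem != 0:
        q = old_rem // rem
        # Each pass keeps the invariants old_rem == old_r*a + old_s*b
        # and rem == r*a + s*b.
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_rem, old_r, old_s


d, r, s = extended_gcd(240, 46)
print(d, r, s)               # 2 -9 47
print(r * 240 + s * 46 == d)  # True
```

The coefficients are not unique: adding a multiple of $46/2$ to $r$ and subtracting the same multiple of $240/2$ from $s$ gives another valid pair.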
What this says is not only that any two (or, actually, any finite collection
of) elements of a PID have a greatest common divisor, but also that
it can be written as a linear combination of those elements. We
call integral domains with greatest common divisors GCD domains,
and those that also satisfy the linearity condition are called Bézout
domains, but we will not study them further here.
In Proposition 10.23 we saw that in an integral domain, every prime
element is irreducible, but Example 10.24 showed that the converse
doesn't necessarily hold. However, it does hold in a PID, as the next
proposition shows.
Proposition 10.32 In a PID $R$, every irreducible element is prime.

Proof Suppose that $r \in R$ is irreducible, and that $r|ab$ for some
$a, b \in R$. Then by Proposition 10.30 there exists an element $d \in R$ that is a
greatest common divisor of $r$ and $a$, and hence $r = ds$ for some $s \in R$.
Since $r$ is irreducible, either $d$ or $s$ is a unit. Suppose that $d$ is a unit.
By Proposition 10.30 we can find $x, y \in R$ such that $d = xa + yr$.
Multiplying this by $b$ we get $db = xab + yrb$. We know that $r|ab$ and
obviously $r|yrb$, so $r|db$ as well, and hence $db = tr$ for some $t \in R$.
But $d$ is a unit, so we can multiply both sides by $d^{-1}$ to get $b = d^{-1}tr$,
which means that $r|b$.
If, on the other hand, $s$ is a unit, we have $r \sim d$ and hence $r|d$. We
know already that $d|a$, since it's a greatest common divisor of $r$ and $a$,
so therefore $r|a$.
Hence, if $r$ is irreducible, either $r|a$ or $r|b$, and thus $r$ is prime.

Biographical note: The son and grandson of lawyers, Étienne Bézout
(1730–1783) instead pursued a career in mathematics, inspired by the
work of the Swiss mathematician Leonhard Euler (1707–1783).
In the late 1750s he published memoirs on dynamics and integration, and
in 1758 was elected an adjoint member of the Académie des Sciences.
Over the next ten years he was appointed examiner to the Gardes de la
Marine and the Corps d'Artillerie, with responsibility for the
mathematical education of naval and army officer cadets.
In addition to his contributions to analysis, mechanics and algebra, he
also wrote a number of influential textbooks, including the four-volume
Cours de mathématiques à l'usage des Gardes du Pavillon et de la Marine
(1764–1767) and the six-volume Cours complet de mathématiques à l'usage
de la marine et de l'artillerie (1770–1782) which were used not only by
officer cadets but also by students at the École Polytechnique, and in
translation by Harvard and other American universities.
Bézout's approach to research, which he called the 'method of
simplifying assumptions', was to attack special cases of difficult
problems, gradually developing greater insight that often enabled him
to find a general solution.

Combining this with Proposition 10.23 we immediately get the following
fact.
Corollary 10.33 In a PID, an element is irreducible if and only if it is
prime.
And putting it together with Proposition 10.25 we find that in PIDs
there is a stronger link between prime and maximal ideals.
Corollary 10.34 A nontrivial ideal of a PID is prime if and only if it is
maximal.
Proof Let $R$ be a PID and let $I$ be some nontrivial ideal of $R$. Then
$I = \langle a\rangle$ for some $a \in R$ because $R$ is a PID.
Then $I = \langle a\rangle$ is prime if and only if $a$ is prime in $R$ (by
Proposition 10.25), $a$ is prime in $R$ exactly when it's irreducible in $R$ (by
Proposition 10.32), and $a$ is irreducible in $R$ if and only if $\langle a\rangle$ is a
maximal principal ideal in $R$ (by Proposition 10.25). But $R$ is a PID, so
maximal principal ideals and maximal ideals are the same thing.

So, in a PID irreducible elements and prime elements are the same,
and prime ideals and maximal ideals are the same as well.

10.4 Unique factorisation domains

I never could do anything with figures, never had any talent for
mathematics, never accomplished anything in my efforts at that rugged
study, and to-day the only mathematics I know is multiplication, and
the minute I get away up in that, as soon as I reach nine times
seven . . . I've got it now. It's eighty-four. Well, I can get that far
all right with a little hesitation. After that I am uncertain, and I
can't manage a statistic.
Mark Twain (Samuel Langhorne Clemens) (1835–1910),
In Aid of the Blind (29 March 1906), Mark Twain's Speeches (1910) 322–332

The Fundamental Theorem of Arithmetic says that any nonzero
integer apart from $\pm 1$ can be factorised uniquely (up to permutation
and multiplication by $\pm 1$) as a product of prime integers. Rephrasing
this using the terminology of the last couple of sections, any nonzero,
nonunit element of $\mathbb{Z}$ can be factorised as a product of irreducible
elements. Also, any such factorisation in $\mathbb{Z}$ is unique up to permutation
and multiplication by units. This leads to the following definition.
Definition 10.35 An integral domain $R$ is a factorisation domain
or atomic domain if any nonzero, nonunit element $r \in R$ can be
expressed as a product $r = a_1 \ldots a_m$ where $a_1, \ldots, a_m$ are irreducible.
Furthermore, $R$ is a unique factorisation domain or UFD if for any
two irreducible factorisations $r = a_1 \ldots a_m = b_1 \ldots b_n$, we have $m = n$,
and there exists some permutation $\sigma \in S_n$ such that $a_i \sim b_{\sigma(i)}$ for
all $1 \leq i \leq n$. That is, irreducible factorisations are unique up to
permutation and multiplication by units.

Corollary 10.33 (which depends on Propositions 10.23 and 10.32) tells
us that in a PID prime and irreducible elements are the same thing.
This is true in UFDs as well:
Proposition 10.36 Let $R$ be a factorisation domain. Then $R$ is a UFD if
and only if every irreducible element is prime.

Proof Let $x \in R$ be irreducible, suppose that $R$ is a UFD, and suppose
that $x \mid ab$ for some $a, b \in R$. We can factorise $a = r_1 \cdots r_k$
and $b = r_{k+1} \cdots r_m$, and hence $ab = r_1 \cdots r_m$. But since
$x \mid ab$ there exists some $y \in R$ such that $xy = ab$. Since $R$ is a UFD
we can factorise $y = s_1 \cdots s_n$, and thereby obtain another factorisation
$ab = xy = x s_1 \cdots s_n$. But since $R$ is a UFD, this factorisation must
be unique up to permutation and association, and therefore $x \sim r_i$ for
some $i$ with $1 \le i \le m$. If $i \le k$ then $x \mid a$, and if $i > k$
then $x \mid b$. Hence $x$ is prime. This proves the first part: in a UFD,
irreducible elements are prime.

Now for the converse. Suppose that $R$ is at least a factorisation domain,
and that every irreducible element in $R$ is prime. Let $x \in R$ be neither
zero nor a unit, and consider two irreducible factorisations
$x = r_1 \cdots r_m = s_1 \cdots s_n$. We want to show that these are the same
up to permutation and association (that is, multiplication by a unit). We
prove this by induction on $m$. If $m = 1$ then $x = r_1$ is irreducible, and
hence $n = 1$ and $s_1 = r_1$.

Now suppose that any element of $R$ that can be expressed as a product of
fewer than $m$ irreducibles can be done so uniquely up to permutation and
association. Since $r_m$ is prime and divides $s_1 \cdots s_n$, it must divide
$s_i$ for some $1 \le i \le n$. By permuting the factors, we can assume that
$r_m \mid s_n$, which means that $s_n = q r_m$ for some $q \in R$. But $q$ must
be a unit, since $s_n$ is irreducible and $r_m$ is not a unit. So
$r_m \sim s_n$, and hence $r_1 \cdots r_m = s_1 \cdots s_{n-1} (q r_m)$, which
means that, by cancellation, $r_1 \cdots r_{m-1} = q s_1 \cdots s_{n-1}$. By
the induction hypothesis we know that these two factorisations
$r_1 \cdots r_{m-1}$ and $q s_1 \cdots s_{n-1}$ are the same up to permutation
and multiplication by units. Hence $R$ is a UFD.
Let's now summarise what we know about the relationship between
primes and irreducibles in integral domains.
In an integral domain, all primes are irreducible (Proposition 10.23, page 405).
In a UFD, all irreducible elements are prime (Proposition 10.36, page 411).
In a PID, all irreducible elements are prime (Proposition 10.32, page 410).
A factorisation domain where all irreducible elements are prime is a UFD
(Proposition 10.36, page 411).
We saw earlier that every Euclidean domain is a PID (Proposition 10.29,
page 408). By the above discussion, every PID that is also a factorisation
domain is also a UFD. So if we could show that every PID is a factorisation
domain, we'd have a nice, neat chain of implications: a Euclidean domain is a
PID, which is a UFD. To settle this question, we'll look at factorisation in
$\mathbb{Z}$.
Proposition 10.37 Any nonzero integer $n$ apart from $\pm 1$ can be
expressed as a finite product $n = p_1 \cdots p_k$ of primes. That is,
$\mathbb{Z}$ is a factorisation domain.

Proof Suppose $n$ can't be factorised as a product of primes. Then $n$
itself can't be prime, and we can express it as a product
$n = n_{1,1} n_{1,2}$ where neither $n_{1,1}$ nor $n_{1,2}$ is $\pm 1$. At
least one (and possibly both) of these can't be factorised as a product of
primes, otherwise combining their factorisations would give a prime
factorisation of $n$. Suppose, without loss of generality, that this integer
is $n_{1,1}$, and let $n_1 = n_{1,1}$. Then $n_1$ is not prime, so we can
factorise it further as $n_1 = n_{1,1} = n_{2,1} n_{2,2}$. At least one of
these factors, say $n_{2,1}$, again can't be factorised as a product of
primes, so let $n_2 = n_{2,1}$ and factorise it as $n_{3,1} n_{3,2}$. We
continue this process to obtain an infinite sequence
$n = n_0, n_1, n_2, \ldots$ of integers such that $n_{i+1} \mid n_i$ but
$n_i \neq \pm n_{i+1}$, so that $|n_{i+1}| < |n_i|$.

But $|n|$ is finite and $|n_0| > |n_1| > |n_2| > \cdots$, so this sequence
can't be infinite and must eventually terminate. Thus $n$ must be
factorisable as a finite product of primes.

The following corollary is known as the Fundamental Theorem of Arithmetic.
(The earliest known proof appears as Proposition 32 of Book VII of Euclid's
Elements; a more modern proof appears in Gauss' 1801 book Disquisitiones
Arithmeticae.)
Corollary 10.38 (Fundamental Theorem of Arithmetic) The ring $\mathbb{Z}$
of integers is a UFD. That is, any nonzero integer apart from $\pm 1$ can
be factorised as a finite product of prime (and hence irreducible) integers.
Furthermore, any such factorisation is unique up to permutation and
multiplication of individual factors by $\pm 1$.

Proof Proposition 10.37 says that $\mathbb{Z}$ is a factorisation domain, and
by Proposition 10.28 we know $\mathbb{Z}$ is a PID. Every irreducible element
of a PID is prime (Proposition 10.32), so by Proposition 10.36 $\mathbb{Z}$
must be a UFD.
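The statement of the Fundamental Theorem of Arithmetic can be explored numerically. The following is a minimal sketch in Python, not part of the text's argument: the function names `smallest_divisor`, `factorise` and `same_up_to_order_and_sign` are illustrative choices, and trial division is used only because it is simple. It splits an integer into a unit $\pm 1$ and a list of primes, and compares two factorisations up to permutation and multiplication of factors by $\pm 1$.

```python
from collections import Counter


def smallest_divisor(n):
    """Smallest divisor d >= 2 of an integer n >= 2 (necessarily prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def factorise(n):
    """Return (unit, primes) with n == unit * product(primes), unit = +1 or -1."""
    if n == 0:
        raise ValueError("0 has no factorisation into primes")
    unit = 1 if n > 0 else -1
    n = abs(n)
    primes = []
    while n > 1:
        p = smallest_divisor(n)
        primes.append(p)
        n //= p
    return unit, primes


def same_up_to_order_and_sign(factors1, factors2):
    """Compare two factor lists up to permutation and multiplication by +-1."""
    return Counter(abs(p) for p in factors1) == Counter(abs(p) for p in factors2)


print(factorise(-84))  # (-1, [2, 2, 3, 7])
# 84 = 2 * (-3) * 7 * (-2) = (-7) * (-2) * 2 * 3: the same factorisation up to units.
print(same_up_to_order_and_sign([2, -3, 7, -2], [-7, -2, 2, 3]))  # True
```

Working with absolute values is one way of choosing a representative from each class of associates; in $\mathbb{Z}$ the only units are $\pm 1$, so this choice is harmless.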
We'd like to generalise the above argument to other integral domains.
Notice that the key ingredient in the proof was showing that for
any nonzero, nonunit element in $\mathbb{Z}$ we can find a finite sequence
$n = n_0, n_1, \ldots, n_k$ of nonunit elements such that $n_{i+1} \mid n_i$
but $n_{i+1} \not\sim n_i$.
By Proposition 10.14 this is equivalent to requiring the existence of a
finite sequence of principal ideals
\[ \langle n \rangle = \langle n_0 \rangle \subset \langle n_1 \rangle \subset \cdots \subset \langle n_{k-1} \rangle \subset \langle n_k \rangle \subset \mathbb{Z} \]
for any nonzero nonunit integer $n \in \mathbb{Z}$.
So, any integral domain $R$ which only has finite chains of strictly-included
principal ideals should be a factorisation domain by some suitably modified
version of the above proof. Let's introduce some new terminology and then
prove this more general result.
Definition 10.39 If a ring $R$ contains no infinite, strictly increasing
chain of principal ideals
\[ \langle a_1 \rangle \subset \langle a_2 \rangle \subset \cdots \subset R \]
then we say it satisfies the ascending chain condition for principal
ideals (ACCP).
In the rest of this section we will be interested mainly in integral
domains that satisfy the ACCP, but it's worth also mentioning the
following stronger condition:
Definition 10.40 A ring $R$ is said to be Noetherian if it satisfies the
ascending chain condition (ACC): that is, if every strictly increasing chain
of (not necessarily principal) ideals
\[ I_1 \subset I_2 \subset \cdots \subset R \]
terminates after finitely many steps.
More generally, $R$ is left Noetherian if it satisfies the ACC for left
ideals, and right Noetherian if it satisfies the ACC for right ideals.

Many of the rings we've met so far are Noetherian. Examples of rings
that fail to satisfy the ACC include $F[x_1, x_2, \ldots]$, the polynomial ring
with infinitely many unknowns, which contains infinite chains of the
form $\langle x_1 \rangle \subset \langle x_1, x_2 \rangle \subset \langle x_1, x_2, x_3 \rangle \subset \cdots$,
and the ring of algebraic integers (that is, those complex numbers that are
roots of polynomials in $\mathbb{Z}[x]$ with leading coefficient 1), which
contains infinite chains of the form
$\langle 2 \rangle \subset \langle 2^{1/2} \rangle \subset \langle 2^{1/3} \rangle \subset \langle 2^{1/4} \rangle \subset \cdots$.
What we want to do now is to generalise the proof of Proposition 10.37
to the case of any integral domain satisfying the ACC for principal
ideals.
Proposition 10.41 Let $R$ be an integral domain that satisfies the ACC for
principal ideals. Then $R$ is a factorisation domain.

Proof For the moment, we'll call a nonzero, nonunit element $r \in R$
unfactorisable if it can't be factorised as a finite product of irreducible
elements of $R$. Suppose that $r$ is unfactorisable. Since $r$ isn't
irreducible, it can be expressed as a product $r = a_1 b_1$, where neither
$a_1$ nor $b_1$ is a unit. At least one of these must be unfactorisable;
suppose without loss of generality that $a_1$ is this element. Then $a_1$
can't be irreducible, so it can be expressed as a product $a_1 = a_2 b_2$ of
nonunits, at least one of which, say $a_2$, must be unfactorisable.
So if $r$ is unfactorisable, this process must continue forever, yielding
an infinite sequence $r = a_0, a_1, a_2, \ldots$ of unfactorisable elements
of $R$. Furthermore, $a_{i+1} \mid a_i$ but $a_i \not\sim a_{i+1}$ for all
$i \ge 0$. So by Proposition 10.14 we get an infinite chain
\[ \langle r \rangle = \langle a_0 \rangle \subset \langle a_1 \rangle \subset \langle a_2 \rangle \subset \cdots \]
of principal ideals in $R$.
But this can't happen, because $R$ satisfies the ACC for principal ideals.
Hence every nonzero, nonunit element of $R$ must be factorisable as a product
of irreducible elements, and therefore $R$ is a factorisation domain.
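In $\mathbb{Z}$ the splitting process used in this proof can be carried out explicitly. The sketch below is an illustration under stated assumptions rather than a general implementation: it works only with positive integers, and the helper names `proper_divisor` and `divisor_chain` are hypothetical. It repeatedly replaces an element by a proper nonunit divisor, producing a sequence $a_0, a_1, \ldots$ with $a_{i+1} \mid a_i$ and $a_{i+1} \not\sim a_i$, which corresponds to a strictly increasing chain $\langle a_0 \rangle \subset \langle a_1 \rangle \subset \cdots$ of principal ideals.

```python
def proper_divisor(n):
    """Return a divisor d of n with 1 < d < n, or None if n >= 2 is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return n // d  # the cofactor of the smallest prime factor
        d += 1
    return None


def divisor_chain(n):
    """Build n = a_0, a_1, ... where each a_{i+1} is a proper divisor of a_i."""
    chain = [n]
    while (d := proper_divisor(chain[-1])) is not None:
        chain.append(d)
    return chain


chain = divisor_chain(360)
print(chain)  # [360, 180, 90, 45, 15, 5]

# Each step is a strict inclusion <a_i> ⊂ <a_{i+1}> of principal ideals of Z.
assert all(a % b == 0 and a != b for a, b in zip(chain, chain[1:]))
```

The chain has to stop because each term is at most half of the previous one; this is the concrete form that the ascending chain condition for principal ideals takes in $\mathbb{Z}$.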
All that remains is to show that PIDs satisfy the ACC for principal
ideals.
Proposition 10.42 Let $R$ be a PID. Then $R$ satisfies the ACC for principal
ideals.
Proof Suppose that $I_1 \subset I_2 \subset \cdots$ is a strictly increasing
chain of ideals in $R$. We want to show that this must be of finite length.
The union $I = \bigcup_i I_i = I_1 \cup I_2 \cup \cdots$ is itself an ideal of
$R$. To confirm this, we need to check that it satisfies the conditions for a
(not necessarily unital) subring, and also obeys the absorption condition.
It certainly contains $0$, since each of the $I_i$ does. For any
$a \in I_k \subseteq I$ there exists $-a \in I_k \subseteq I$. Now suppose that
$a \in I_j$ and $b \in I_k$ for some $j, k \ge 1$. Either $j = k$, in which
case $a, b \in I_j$ and hence both $a+b$ and $ab$ also lie in
$I_j \subseteq I$, or $j \neq k$, in which case we may assume without loss of
generality that $j < k$, and hence $a \in I_j \subset I_k$, so $a+b$ and $ab$
lie in $I_k$. Thus $I$ is a (not necessarily unital) subring of $R$.
Now suppose that $r \in R$ and $a \in I$. Then $a \in I_k \subseteq I$ for
some $k \ge 1$, and since $I_k$ is an ideal of $R$ it follows that both $ar$
and $ra$ lie in $I_k \subseteq I$. So $I$ is an ideal of $R$.
Since $R$ is a PID, $I = \langle a \rangle$ for some $a \in R$. This element
$a$ must belong to some ideal $I_n$ in the chain, so
$\langle a \rangle \subseteq I_n \subseteq I = \langle a \rangle$ and hence
$I_n = I$. For any other ideal $I_k$ we must therefore have
$I_k \subseteq I = \langle a \rangle = I_n$. Therefore $I_n$ must be the last
ideal in the chain, and so $R$ satisfies the ACC for principal ideals.
Putting all this together, we have the following corollary:
Corollary 10.43 Every PID is a UFD.

Proof We've just shown that every PID satisfies the ACC for principal
ideals, and is hence a factorisation domain. By Proposition 10.32, in
a PID all irreducible elements are prime, and by Proposition 10.36
every factorisation domain where all irreducible elements are prime is
a UFD. Therefore every PID is a UFD.
In Proposition 10.30 we saw that greatest common divisors and least
common multiples exist in PIDs. This is also true in UFDs:
Proposition 10.44 Let $R$ be a UFD, and let $a$ and $b$ be two nonzero
elements of $R$. Then there exist $d, l \in R$ such that $d$ is a greatest
common divisor of $a$ and $b$, and $l$ is a least common multiple of $a$ and $b$.

Proof Since $R$ is a UFD, both $a$ and $b$ have finitely many irreducible
factors. Let $p_1, \ldots, p_k$ be the irreducible elements that divide either
$a$ or $b$, such that none is an associate of any of the others. That is,
we make a complete list of all the irreducible factors of $a$, and all the
irreducible factors of $b$, and choose only one representative from each
equivalence class of associate elements, discarding the others. We can
then write
\[ a = u p_1^{m_1} \cdots p_k^{m_k} \quad\text{and}\quad b = v p_1^{n_1} \cdots p_k^{n_k}, \]
where the exponents $m_1, \ldots, m_k, n_1, \ldots, n_k \ge 0$ are nonnegative
integers, and $u, v \in R$ are units.
Then $d = p_1^{e_1} \cdots p_k^{e_k}$ and $l = p_1^{f_1} \cdots p_k^{f_k}$ are
the required elements, where $e_i = \min(m_i, n_i)$ and
$f_i = \max(m_i, n_i)$ for $1 \le i \le k$.
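The construction in this proof can be checked directly in the UFD $\mathbb{Z}$. The following Python sketch is illustrative only: the names `prime_exponents` and `gcd_lcm` are not from the text, the inputs are assumed to be positive integers, and positive primes are used as the chosen representatives of each class of associates. It builds $d$ and $l$ from the exponents $e_i = \min(m_i, n_i)$ and $f_i = \max(m_i, n_i)$.

```python
from collections import Counter
from math import gcd, prod


def prime_exponents(n):
    """Map each prime p to its exponent in the positive integer n."""
    exps = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            exps[d] += 1
            n //= d
        d += 1
    if n > 1:
        exps[n] += 1
    return exps


def gcd_lcm(a, b):
    """Return (d, l) built from e_i = min(m_i, n_i) and f_i = max(m_i, n_i)."""
    ma, mb = prime_exponents(a), prime_exponents(b)
    primes = set(ma) | set(mb)  # all irreducibles dividing a or b
    d = prod(p ** min(ma[p], mb[p]) for p in primes)
    l = prod(p ** max(ma[p], mb[p]) for p in primes)
    return d, l


# 360 = 2^3 * 3^2 * 5 and 84 = 2^2 * 3 * 7
d, l = gcd_lcm(360, 84)
print(d, l)  # 12 2520
assert d == gcd(360, 84)
```

A missing key in a `Counter` counts as exponent 0, which matches the convention in the proof that the exponents $m_i$ and $n_i$ are allowed to be zero.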
This proof can be extended in a straightforward manner to finite
collections of nonzero elements:
Corollary 10.45 Let $R$ be a UFD, and let $a_1, \ldots, a_n \in R$ be nonzero
elements. Then there exist $d, l \in R$ such that $d$ is a greatest common
divisor and $l$ is a least common multiple of $a_1, \ldots, a_n$.
So, greatest common divisors and least common multiples exist in
a UFD. However, we haven't shown that greatest common divisors
satisfy Bézout's identity; that is, that they can be written as linear
combinations of the given elements. Indeed, this is not the case: there
exist UFDs that don't satisfy Bézout's condition, although we won't
study these any further.
For the moment, however, we'll concern ourselves with finding examples
of UFDs that aren't PIDs. Some important examples are given
by univariate polynomial rings $R[x]$ for some carefully chosen ring
$R$. To that end, we'll now spend a little time studying factorisation in
polynomial rings. This will also serve as a prelude to the material in
the next chapter.
First of all, we consider the case where R is an integral domain.
Proposition 10.46 If R is an integral domain, then so is R[ x ].

Proof We know already that $R[x]$ is a ring, and if $R$ is commutative
(which as an integral domain, it must be) then $R[x]$ must be too. Also,
the unity element $1 \in R$ also serves as the unity element in $R[x]$. All
that remains is to show that $R[x]$ contains no zero divisors. To this
end, let $f = a_m x^m + \cdots + a_1 x + a_0$ and
$g = b_n x^n + \cdots + b_1 x + b_0$ be two elements of $R[x]$, with both
$a_m$ and $b_n$ nonzero (so $\deg(f) = m$ and $\deg(g) = n$). Then $fg$ has
leading term $a_m b_n x^{m+n}$, which can only be zero if $a_m b_n = 0$. But we
know this isn't the case, since $R$ is an integral domain, and hence $R[x]$
can have no zero divisors either, and must therefore be an integral domain.
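The role played by the leading term $a_m b_n x^{m+n}$ can be seen by comparing a coefficient ring that is an integral domain with one that is not. The sketch below is an illustration chosen for this purpose, not taken from the text: it multiplies polynomials over $\mathbb{Z}_n$, stored as coefficient lists with the lowest degree first, and the function names are hypothetical. Over $\mathbb{Z}_5$ degrees add, whereas over $\mathbb{Z}_6$ the leading coefficients $2$ and $3$ multiply to zero and the degree drops.

```python
def poly_mul_mod(f, g, n):
    """Multiply coefficient lists (lowest degree first) over Z_n."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] = (product[i + j] + a * b) % n
    return product


def degree(f):
    """Degree of a coefficient list, or None for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return None


f = [1, 2]  # 2x + 1
g = [1, 3]  # 3x + 1

# (2x + 1)(3x + 1) = 6x^2 + 5x + 1
print(degree(poly_mul_mod(f, g, 5)))  # 2: Z_5 is an integral domain
print(degree(poly_mul_mod(f, g, 6)))  # 1: in Z_6 the leading term 6x^2 vanishes
```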
Univariate polynomial rings over fields satisfy stronger conditions:
Proposition 10.47 If $F$ is a field, then $F[x]$ is a PID. Furthermore, a
nontrivial ideal $I$ in $F[x]$ is equal to $\langle g \rangle$ if and only if
$g$ is a nonzero polynomial of minimal degree in $I$.

Proof We know from Proposition 10.46 that $F[x]$ must at least be an
integral domain, so to prove that it's a PID we just need to show that
every ideal of $F[x]$ is principal.
Suppose, then, that $I$ is an ideal of $F[x]$. If $I$ is trivial, then
$I = \{0\} = \langle 0 \rangle$. Otherwise, choose a nonzero polynomial
$g \in I$ whose degree is minimal among the nonzero elements of $I$. Certainly
$\langle g \rangle \subseteq I$. By Theorem 10.5 we can decompose any other
element $f \in I$ as $f = qg + r$, where $r = 0$ or $\deg(r) < \deg(g)$.
Then $r = f - qg$, and since $f, g \in I$ it follows that $r \in I$. If
$r \neq 0$ then $\deg(r) < \deg(g)$, which contradicts our hypothesis that $g$
has minimal degree in $I$. So we must have $r = 0$, which means that
$f = qg \in \langle g \rangle$. So $I \subseteq \langle g \rangle$ and hence
$I = \langle g \rangle$.
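For an ideal generated by two polynomials, a generator of minimal degree can be computed with the division algorithm used in this proof. The following is a hedged sketch over $F = \mathbb{Q}$, using Python's `fractions.Fraction` for exact arithmetic; the names `trim`, `poly_divmod` and `ideal_generator` are illustrative assumptions, and polynomials are stored as coefficient lists with the lowest degree first. Repeated division with remainder (Euclid's algorithm) finds a monic generator of $\langle f, g \rangle$.

```python
from fractions import Fraction


def trim(f):
    """Drop zero leading coefficients (lists store the lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = 0 or deg(r) < deg(g), over Q."""
    f, g = trim(map(Fraction, f)), trim(map(Fraction, g))
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]  # cancels the leading term of r
        q[shift] = coeff
        for i, b in enumerate(g):
            r[i + shift] -= coeff * b
        r = trim(r)
    return trim(q), r


def ideal_generator(f, g):
    """Monic generator of the ideal <f, g> in Q[x], via Euclid's algorithm."""
    f, g = trim(map(Fraction, f)), trim(map(Fraction, g))
    while g:
        f, g = g, poly_divmod(f, g)[1]
    if not f:
        return []  # the trivial ideal <0>
    return [c / f[-1] for c in f]


# <x^2 - 1, x^2 - 3x + 2> = <x - 1>, since both are multiples of x - 1.
assert ideal_generator([-1, 0, 1], [2, -3, 1]) == [-1, 1]
```

Dividing by the leading coefficient at the end is a normalisation choice: any nonzero scalar multiple of the result generates the same ideal, because the nonzero constants are units in $\mathbb{Q}[x]$.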
We've now considered two, in some sense, extreme cases of R[ x ]: one
when R is just an integral domain, and the other when R is a field. We
now want to look at the intermediate case where R is a UFD. To do
this, we have to think carefully about irreducible elements in R[ x ].
Proposition 10.48 Let $R$ be an integral domain. The units in $R[x]$ are
exactly the units in $R$, and an irreducible element in $R$ is also irreducible
in $R[x]$.

Proof Suppose that $f, g \in R[x]$ with $f = a_m x^m + \cdots + a_1 x + a_0$
and $g = b_n x^n + \cdots + b_1 x + b_0$ and that $fg = 1$. Then
\[ fg = a_m b_n x^{m+n} + \cdots + a_0 b_0 = 1, \]
and by comparing coefficients (and using the fact that $R$ has no zero
divisors, so that $\deg(fg) = \deg(f) + \deg(g)$) we find that
\[ a_1 = \cdots = a_m = b_1 = \cdots = b_n = 0 \]
and $a_0 b_0 = 1$, which means that $a_0$ and $b_0$ must be units in both $R$
and $R[x]$. Conversely, if $a, b \in R$ are units, with $ab = 1$, then this
also holds in $R[x]$. Hence the units in $R[x]$ are exactly the units in $R$.
Now suppose that $r \in R$ is irreducible in $R$. Suppose that $r = ab$
where neither $a$ nor $b$ is a unit in $R[x]$. By the argument above, neither
$a$ nor $b$ can be a unit in $R$ either. Since $r$ is irreducible in $R$, at
least one of $a$ and $b$ must lie in $R[x]$ but not in $R$; that is, they must
be non-constant polynomials of strictly positive degree. Suppose, without
loss of generality, that $\deg(a) = m > 0$. Then
\[ \deg(r) = \deg(ab) = \deg(a) + \deg(b) = m + \deg(b) > 0. \]
But $r \in R$ is a constant polynomial in $R[x]$, and so $\deg(r) = 0$, which
contradicts this. Therefore $r$ must be irreducible in $R[x]$.
The second part of this proposition says irreducibles in $R$ must also
be irreducible in $R[x]$, but it doesn't rule out the possibility that
there might be other irreducible elements in $R[x]$ apart from the ones
inherited from $R$. To properly understand the irreducible elements in
$R[x]$ we need to study a particular class of polynomials: those that
can't be expressed as a nonunit scalar multiple of another polynomial.
Definition 10.49 Let $R$ be a UFD. A nonzero polynomial
\[ f = a_m x^m + \cdots + a_1 x + a_0 \in R[x] \]
is primitive if $\gcd(a_0, \ldots, a_m) = 1$; that is, if its coefficients are
coprime. The element $c(f) = \gcd(a_0, \ldots, a_m)$ is the content of $f$.

Somewhat confusingly, the term "primitive polynomial" has another, different
meaning in the context of finite fields. We'll discuss this further in the
next chapter.

In particular, this means that any nonzero polynomial
\[ f = a_m x^m + \cdots + a_1 x + a_0 \in R[x] \]
can be written as $f = a f'$, where $f'$ is a primitive polynomial in $R[x]$,
and $a = c(f) = \gcd(a_0, \ldots, a_m)$.
Example 10.50 The polynomials
\[ 3x^2 + x - 2, \quad 2x + 1 \quad\text{and}\quad x^5 - 2x + 4 \]
are primitive in $\mathbb{Z}[x]$, but
\[ 2x^2 + 4x + 2, \quad 3x - 9 \quad\text{and}\quad 4x^5 - 8x^4 - 2x^3 \]
aren't.
We'll see in a little while that primitive polynomials yield all the other
irreducible elements in $R[x]$. For the moment, however, we'll look
at what happens when we multiply two primitive polynomials. Let
$f = 3x^2 + x - 2$, and let $g = x^5 - 2x + 4$. These are both primitive in
$\mathbb{Z}[x]$. Then
\[ fg = 3x^7 + x^6 - 2x^5 - 6x^3 + 10x^2 + 8x - 8. \]
This is also primitive in $\mathbb{Z}[x]$, since its coefficients are coprime.
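The contents of the polynomials in Example 10.50, and of the product $fg$ just computed, are easy to check mechanically. The following Python sketch is illustrative only (the names `content`, `is_primitive` and `poly_mul` are not from the text); polynomials in $\mathbb{Z}[x]$ are stored as coefficient lists with the lowest degree first, and the content is taken to be the nonnegative greatest common divisor.

```python
from functools import reduce
from math import gcd


def content(f):
    """Content c(f): the gcd of the integer coefficients of f."""
    return reduce(gcd, f, 0)


def is_primitive(f):
    """A polynomial is primitive when its content is 1."""
    return content(f) == 1


def poly_mul(f, g):
    """Multiply two coefficient lists (lowest degree first) over Z."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


f = [-2, 1, 3]            # 3x^2 + x - 2
g = [4, -2, 0, 0, 0, 1]   # x^5 - 2x + 4
h = [0, 0, 0, -2, -8, 4]  # 4x^5 - 8x^4 - 2x^3

print(is_primitive(f), is_primitive(g))  # True True
print(content(h))                        # 2
print(poly_mul(f, g))  # [-8, 8, 10, -6, 0, -2, 1, 3]
print(is_primitive(poly_mul(f, g)))      # True
```

The printed product corresponds to $3x^7 + x^6 - 2x^5 - 6x^3 + 10x^2 + 8x - 8$, read from the highest-degree coefficient down.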


The fact that this holds in general for polynomials in $\mathbb{Z}[x]$ was
originally stated and proved by Carl Friedrich Gauss (1777-1855). (It appears
as Section 42 in Chapter 2 (De Congruentiis Primi Gradus) of his book
Disquisitiones Arithmeticae (1801).) The following is a generalisation of
Gauss' original statement to any UFD.
Proposition 10.51 (Gauss' Lemma) Let $R$ be a UFD, and let
\[ f = a_m x^m + \cdots + a_1 x + a_0 \quad\text{and}\quad g = b_n x^n + \cdots + b_1 x + b_0 \]
be two primitive polynomials in $R[x]$. Then the product $fg$ is also primitive
in $R[x]$.

Proof Suppose that the product $fg = c_{m+n} x^{m+n} + \cdots + c_1 x + c_0$ is
not primitive; that is, there exists some irreducible element $p \in R$ that
divides all the coefficients $c_i$ for $0 \le i \le m+n$.
Both $f$ and $g$ are primitive, so this element $p$ can't divide all of their
coefficients. In that case, there must be an integer $s \ge 0$ such that
$p \mid a_i$ for $i < s$ but $p \nmid a_s$; similarly there must be an integer
$t \ge 0$ such that $p \mid b_i$ for $i < t$ but $p \nmid b_t$. That is, $s$
and $t$ are, respectively, the indices of the lowest-degree coefficients of $f$
and $g$ that are not divisible by $p$.
The coefficients of the product $fg$ are completely determined by the
coefficients of $f$ and $g$, with $c_k = \sum_{i+j=k} a_i b_j$ for
$0 \le k \le m+n$. Since $p$ is irreducible and $R$ is a UFD, by
Proposition 10.36 $p$ must therefore be prime, and thus if $p \nmid a_s$ and
$p \nmid b_t$, it must also be the case that $p \nmid a_s b_t$. Now consider the
coefficient $c_{s+t} = \sum_{i+j=s+t} a_i b_j$. The term $a_s b_t$ in this sum
is not divisible by $p$, but all of the other terms $a_i b_j$ are, because
either $i < s$ (and hence $p \mid a_i$) or $j < t$ (so $p \mid b_j$).
Thus $p$ can't divide $c_{s+t}$, which contradicts our original statement that
$p$ does divide all the coefficients $c_i$ of $fg$. Hence no such irreducible
element $p$ can exist, and $fg$ must also be primitive.

There is a short and elegant proof of the special case where $R = \mathbb{Z}$,
which makes use of the fact that $\mathbb{Z}_p[x]$ is an integral domain for any
prime integer $p$.

Proposition 10.52 Let $f, g \in \mathbb{Z}[x]$ be primitive polynomials. Then
$fg$ is also primitive in $\mathbb{Z}[x]$.

Proof Suppose that the product $fg$ is not primitive, so there exists
some prime integer $p$ that divides all of its coefficients. There is a
well-defined homomorphism $\varphi_p : \mathbb{Z} \to \mathbb{Z}_p$ mapping every
integer $n$ to its residue $[n]_p$ modulo $p$, and we can extend this to a ring
homomorphism $\varphi_p : \mathbb{Z}[x] \to \mathbb{Z}_p[x]$ that maps a given
polynomial in $\mathbb{Z}[x]$ to the corresponding polynomial in
$\mathbb{Z}_p[x]$ with coefficients reduced modulo $p$.
Then $\varphi_p(f) \neq 0$ and $\varphi_p(g) \neq 0$, since $p$ doesn't divide
all the coefficients of either $f$ or $g$. However, $p$ does divide all the
coefficients of their product $fg$, so $\varphi_p(fg) = 0$.
Since $\varphi_p$ is a ring homomorphism, we have
\[ \varphi_p(f) \varphi_p(g) = \varphi_p(fg) = 0. \]
But this can't happen, because $\mathbb{Z}_p[x]$ is an integral domain by
Proposition 10.46, and therefore has no zero divisors. Hence $fg$ must be
primitive in $\mathbb{Z}[x]$.
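The reduction map used in this proof can be tried out on a concrete pair of primitive polynomials. The sketch below is a numerical illustration, not a proof: it reuses $f = 3x^2 + x - 2$ and $g = x^5 - 2x + 4$ from earlier in the section, the function names are hypothetical, and only a handful of small primes are tested. For each prime $p$ it checks that coefficientwise reduction respects multiplication and that the reduced product is nonzero.

```python
def poly_mul(f, g, modulus=None):
    """Multiply coefficient lists (lowest degree first), optionally mod p."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    if modulus is not None:
        product = [c % modulus for c in product]
    return product


def reduce_mod(f, p):
    """Image of f under coefficientwise reduction Z[x] -> Z_p[x]."""
    return [c % p for c in f]


def reduction_respects_product(f, g, p):
    """Check that reducing fg agrees with multiplying the reductions of f and g."""
    return reduce_mod(poly_mul(f, g), p) == poly_mul(reduce_mod(f, p), reduce_mod(g, p), p)


f = [-2, 1, 3]           # 3x^2 + x - 2
g = [4, -2, 0, 0, 0, 1]  # x^5 - 2x + 4

for p in (2, 3, 5, 7):
    assert reduction_respects_product(f, g, p)
    # f and g are primitive, so neither reduces to zero ...
    assert any(reduce_mod(f, p)) and any(reduce_mod(g, p))
    # ... and Z_p[x] has no zero divisors, so neither does their product.
    assert any(reduce_mod(poly_mul(f, g), p))
```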
We know from Proposition 10.48 that the irreducible elements of $R$
must also be irreducible in $R[x]$: these are exactly the irreducible
constant (degree-0) polynomials in $R[x]$. We now want to find all
the irreducible non-constant polynomials. These must be primitive,
because otherwise they could be factorised as a product of a primitive
polynomial and a non-unit constant element. But not all primitive
polynomials are irreducible, because Gauss' Lemma (Proposition 10.51)
says that a product of primitive polynomials is itself primitive. Thus
there must exist some primitive polynomials that can be factorised,
and we'd like to find out, in some sense, exactly which ones these are.
The key, it turns out, involves the field of fractions $Q(R)$ of our chosen
UFD $R$ (Definition 8.51, page 343). The irreducible non-constant polynomials
in $R[x]$ are exactly the primitive ones which are also irreducible in $Q[x]$.
To see this, we'll look at an illustrative example first, and then use those
ideas to prove it in general.
It's relatively straightforward to show that a reducible primitive polynomial
in $\mathbb{Z}[x]$ must also be reducible in $\mathbb{Q}[x]$.
Example 10.53 Let $f = 2x^3 - 6x^2 + x - 3$ in $\mathbb{Z}[x]$. This is
primitive, since $c(f) = 1$, but factorises as $f = (x - 3)(2x^2 + 1)$ and is
thus reducible in $\mathbb{Z}[x]$. It must also be reducible in
$\mathbb{Q}[x]$, since its factors $g = x - 3$ and $h = 2x^2 + 1$ also lie in
$\mathbb{Q}[x]$.
The only thing that would stop a primitive reducible polynomial
$f = gh \in \mathbb{Z}[x]$ from being reducible in $\mathbb{Q}[x]$ as well,
would be if one of its factors $g$ or $h$ happened to be a unit in
$\mathbb{Q}[x]$ but not in $\mathbb{Z}[x]$.
But Proposition 10.48 says that the units in $\mathbb{Q}[x]$ are exactly the
units in $\mathbb{Q}$, namely the nonzero rationals, and the primitivity of $f$
means that neither $g$ nor $h$ can be a constant in
$\mathbb{Z} \subset \mathbb{Q}$: they must both have degree at least 1.
Going the other way is slightly more involved. We'll use the same
polynomial as in the previous example to illustrate the technique we're
going to use in the proof.
Example 10.54 Let $f = 2x^3 - 6x^2 + x - 3$ in $\mathbb{Q}[x]$, and also in
$\mathbb{Z}[x]$. This can be factorised as $f = (x^2 + \tfrac{1}{2})(2x - 6)$
and is hence reducible in $\mathbb{Q}[x]$. We want to find a factorisation into
non-units in $\mathbb{Z}[x]$, but this one won't work, because
$g = (x^2 + \tfrac{1}{2})$ isn't in $\mathbb{Z}[x]$, even though the other
factor $h = (2x - 6)$ is.
We start by getting rid of any awkward fractions. We do this by
looking at the least common multiple of the denominators of the
coefficients of a given factor, and then multiplying through by that
number. In the case of $g$ we get $a_1 = \operatorname{lcm}(1, 2) = 2$, and for
$h$ we get $b_1 = \operatorname{lcm}(1, 1) = 1$. Multiplying through, we get
$a_1 g = 2x^2 + 1$ and $b_1 h = 2x - 6$.
These can now be considered as elements of $\mathbb{Z}[x]$, but the problem
now is that $b_1 h$ isn't primitive. So we fix that by dividing by the
content. Let $a_2 = c(a_1 g) = 1$ and $b_2 = c(b_1 h) = 2$, and then set
$a = \frac{a_1}{a_2} = 2$ and $b = \frac{b_1}{b_2} = \frac{1}{2}$.
Now both $ag = 2x^2 + 1$ and $bh = x - 3$ are primitive elements of
$\mathbb{Z}[x]$, and it so happens that $(ag)(bh) = (2x^2 + 1)(x - 3) = f$ is
the required factorisation of $f$ in $\mathbb{Z}[x]$, hence $f$ is reducible in
$\mathbb{Z}[x]$ as well as in $\mathbb{Q}[x]$.
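The two-step procedure in Example 10.54 (clear the denominators, then divide by the content) can be sketched as a short Python function. This is an illustration under stated assumptions: the name `primitive_multiple` is hypothetical, polynomials are coefficient lists with the lowest degree first, the input is assumed to be nonzero, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm


def primitive_multiple(g):
    """Return (a, a*g) with a in Q chosen so that a*g is primitive in Z[x]."""
    g = [Fraction(c) for c in g]
    a1 = lcm(*(c.denominator for c in g))  # a1 clears all the denominators
    cleared = [int(c * a1) for c in g]     # a1*g now has integer coefficients
    a2 = reduce(gcd, cleared, 0)           # a2 = c(a1*g), the content
    return Fraction(a1, a2), [c // a2 for c in cleared]


def poly_mul(f, g):
    """Multiply two coefficient lists (lowest degree first)."""
    product = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            product[i + j] += x * y
    return product


g = [Fraction(1, 2), 0, 1]  # x^2 + 1/2
h = [-6, 2]                 # 2x - 6

a, ag = primitive_multiple(g)
b, bh = primitive_multiple(h)
print(a, ag)  # 2 [1, 0, 2]    i.e. 2x^2 + 1
print(b, bh)  # 1/2 [-3, 1]    i.e. x - 3
assert a * b == 1
assert poly_mul(ag, bh) == [-3, 1, -6, 2]  # 2x^3 - 6x^2 + x - 3
```

In this example the product $ab$ happens to equal $1$; in general it is only a unit of the coefficient ring, which still has to be absorbed into one of the factors.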
In the proof of the next proposition, we'll see that we can always do
this: for any primitive polynomial in Z[ x ] that factorises into lower-
degree polynomials in Q[ x ], we can find a corresponding factorisation
in Z[ x ] too.
Proposition 10.55 Let $R$ be a UFD, and let $Q = Q(R)$ be its field of
fractions. Then a primitive polynomial $f \in R[x]$ is irreducible in $R[x]$ if
and only if it is irreducible in $Q[x]$.

Proof If $\deg(f) = 0$ then $f$ is a constant element of $R[x]$ and, being
primitive, also a unit in $R$. It is therefore not irreducible in $R[x]$ or in
$Q[x]$.
So, we may assume that $\deg(f) \ge 1$. Suppose that $f = gh$ for some nonunit
polynomials $g, h \in R[x]$. Since $f$ is primitive, neither $g$ nor $h$ can be
constants, so neither $\deg(g)$ nor $\deg(h)$ can be zero. By Proposition 10.48
neither $g$ nor $h$ can be units in $Q[x]$ either, and so $f$ must be reducible
in $Q[x]$. Contrapositively, if $f$ is irreducible in $Q[x]$, it must be
irreducible in $R[x]$ too.
Now suppose that $f$ is primitive in $R[x]$ and reducible in $Q[x]$. Then
$f = gh$ with $g, h \in Q[x]$ not units, so $\deg(g)$ and $\deg(h)$ are positive.
Let $a_1$ be a least common multiple of the denominators of the coefficients
of $g$, so that $a_1 g$ has coefficients in $R$, and thus $a_1 g \in R[x]$.
Next, let $a_2 = c(a_1 g)$, the content of $a_1 g$. Then $a = \frac{a_1}{a_2}$
is an element of $Q$, and $ag$ is a primitive polynomial in $R[x]$.
Similarly, let $b_1$ be a least common multiple of the denominators of
the coefficients of $h$, and let $b_2 = c(b_1 h)$. Then if
$b = \frac{b_1}{b_2}$, we have $b \in Q$ and $bh$ a primitive element of $R[x]$.
By Gauss' Lemma (Proposition 10.51, page 418), $k = (ag)(bh) = abgh = abf$ is
primitive in $R[x]$, so $a_1 b_1 f = a_2 b_2 k$.
Hence $a_1 b_1$ and $a_2 b_2$ are both greatest common divisors of the
coefficients of $a_1 b_1 f$, and therefore $a_1 b_1 \sim a_2 b_2$ in $R$. But
this means that $u = ab = \frac{a_1}{a_2} \frac{b_1}{b_2}$ is a unit in $R$, and
hence $f = (ag)(u^{-1} bh)$ is a factorisation of $f$ in $R[x]$. So $f$ must
also be reducible in $R[x]$.
Contrapositively, if $f$ is irreducible in $R[x]$ it must also be irreducible
in $Q[x]$.

The special case where $R = \mathbb{Z}$ and $Q = Q(\mathbb{Z}) = \mathbb{Q}$ was
originally proved by Gauss, and is also sometimes called Gauss' Lemma.
We are now able to completely characterise irreducible elements of
R[ x ] where R is a UFD:
Corollary 10.56 Let R be a UFD, and let Q = Q( R) be its field of
fractions. Then an irreducible element in R[ x ] is either irreducible in R, or
a primitive polynomial in R[ x ] that is also irreducible in Q[ x ].

Proof We know from Proposition 10.48 that an irreducible element of


R is irreducible in R[ x ]. By Proposition 10.55 any primitive polynomial
in R[ x ] is irreducible in R[ x ] if and only if it is irreducible in Q[ x ].
Finally, any non-primitive polynomial in R[ x ] is reducible in R[ x ].

So, the irreducible elements in R[ x ] comprise the irreducible constants


inherited from R, and the primitive polynomials in R[ x ] that are also
irreducible in Q[ x ].
We now have all the ingredients we need to prove the last major result
of this section:
Proposition 10.57 If R is a UFD, then so is R[ x ].

Proof We must show that every nonzero nonunit in R[ x ] has a fac-


torisation into irreducible elements, and then that this factorisation is
unique up to permutation and association.
The first of these is intuitively obvious, but we will prove it in detail
anyway. Let $f \in R[x]$ be nonzero and not a unit. Then if $\deg(f) = 0$ it
follows that $f \in R$ and hence has a unique factorisation because $R$ is a
UFD.
So, from now on, suppose that $\deg(f) > 0$. Regard $f$ as a polynomial
in $Q[x]$, where $Q = Q(R)$ is the field of fractions of $R$. Then since $Q$ is
a field, $Q[x]$ is a Euclidean domain by Theorem 10.5 and hence a PID
by Proposition 10.29 and thus a UFD by Corollary 10.43. So $f$ has a
unique factorisation
\[ f = g_1 \cdots g_k \]
in $Q[x]$, where $g_1, \ldots, g_k$ are irreducible in $Q[x]$.
The coefficients of all these irreducible polynomials are of the form
$\frac{a}{b}$ with $a, b \in R$ and $b \neq 0$, so next we clear the
denominators in the same way as in the proof of Proposition 10.55. Let $a_i$ be
a least common multiple of the denominators of the coefficients of $g_i$, for
$1 \le i \le k$, and let $h_i = a_i g_i$ and $a = a_1 \cdots a_k$. Then
\[ af = h_1 \cdots h_k \]
where $a \in R$ and $h_1, \ldots, h_k \in R[x]$.
Now $g_1, \ldots, g_k$ were all irreducible in $Q[x]$, so $h_1, \ldots, h_k$
are all irreducible in $Q[x]$ as well, because each $h_i$ is just the
corresponding $g_i$ multiplied by some nonzero element of $R$ that must
therefore be a unit in $Q$.
Considering the content of each of the polynomials $f, h_1, \ldots, h_k$, we
get
\[ f = cr \quad\text{and}\quad h_i = c_i s_i, \]
for $1 \le i \le k$, where $r, s_1, \ldots, s_k \in R[x]$ are primitive,
$c = c(f)$ and $c_i = c(h_i)$. Then
\[ (ac) r = (c_1 s_1) \cdots (c_k s_k) = (c_1 \cdots c_k) s_1 \cdots s_k. \]
The content of a polynomial is a greatest common divisor, and hence is
unique up to association. Since $r$ is primitive, and $s_1 \cdots s_k$ is
primitive by Gauss' Lemma, both $ac$ and $c_1 \cdots c_k$ are contents of this
polynomial, so there must be some unit $u \in R$ such that
\[ acu = c_1 \cdots c_k. \]
Then
\[ (ac) r = (acu) s_1 \cdots s_k \]
and hence
\[ f = cr = (cu) s_1 \cdots s_k. \]
Now $cu$ has a unique factorisation as irreducible elements of
$R \subset R[x]$ since $R$ is a UFD. Furthermore $s_1, \ldots, s_k$ are all
irreducible in $R[x]$ because they were primitive in $R[x]$ and were carefully
selected to be irreducible in $Q[x]$, so by Proposition 10.55 they are
irreducible in $R[x]$.
Therefore $f$ has a factorisation into irreducible elements in $R[x]$ and
hence $R[x]$ is a factorisation domain.
Now we want to show that such a factorisation is unique up to
permutation and association. Suppose that
\[ f = p_1 \cdots p_k g_1 \cdots g_m = q_1 \cdots q_l h_1 \cdots h_n \]
are factorisations of the polynomial $f \in R[x]$, with
$p_1, \ldots, p_k, q_1, \ldots, q_l$ irreducible elements of $R$, and
$g_1, \ldots, g_m, h_1, \ldots, h_n$ irreducible primitive polynomials in $R[x]$.
By Proposition 10.55, $g_1, \ldots, g_m, h_1, \ldots, h_n$ are irreducible in
$Q[x]$, and since $Q[x]$ is a UFD, $m = n$ and, possibly after some judicious
permutation, $g_i \sim h_i$ for $1 \le i \le m$. This means that
$h_i = u_i g_i$ for some units $u_1, \ldots, u_m \in Q$. Since $Q$ is the field
of fractions of $R$, each $u_i = \frac{a_i}{b_i}$ for some $a_i, b_i \in R$, and
thus $b_i h_i = a_i g_i$. Since $h_i$ and $g_i$ are both primitive, $a_i$ and
$b_i$ are both greatest common divisors of the coefficients of $a_i g_i$, so
$a_i \sim b_i$ in $R$, and hence each $u_i$ is a unit in $R$ as well as in $Q$.
We can now cancel all the $g_i$ to get
\[ p_1 \cdots p_k = u q_1 \cdots q_l \]
where $u = u_1 \cdots u_m$ is a unit in $R$. Since $R$ is a UFD, it follows that
$k = l$ and, after a suitable permutation, $p_i \sim q_i$ for $1 \le i \le k$.
Hence the factorisation of $f$ is unique up to permutation and association, and
therefore $R[x]$ is a UFD as claimed.
We can extend this result to multivariate polynomial rings by induc-
tion:
Corollary 10.58 If R is a UFD then so is R[ x1 , . . . , xn ].

Proof This follows by induction on $n$, using the observation that
$R[x][y] \cong R[x, y]$. That is, the ring of polynomials in the formal
variable $y$ with coefficients from $R[x]$ is isomorphic to the ring of
polynomials in the formal variables $x$ and $y$ with coefficients in $R$.
We are now, finally, in a position to show that although every PID is a
UFD, not every UFD is a PID.
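Example 10.59 The polynomial ring $\mathbb{Z}[x]$ is a UFD by
Proposition 10.57, since $\mathbb{Z}$ is a UFD. But $\mathbb{Z}[x]$ is not a
PID because, for example, the ideal $\langle x, 2 \rangle$ is not principal: a
generator would have to divide both $2$ and $x$, and so be $\pm 1$, but
$1 \notin \langle x, 2 \rangle$ because every element of this ideal has an even
constant term.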

This almost completes the verification of the strictness in this chapter's
hierarchy
\[ \{\text{Euclidean domains}\} \subset \{\text{PIDs}\} \subset \{\text{UFDs}\} \subset \{\text{integral domains}\} \]
of integral domains. Not every integral domain is a UFD: the ring
$\mathbb{Z}[\sqrt{-3}]$ in Example 10.24 is an integral domain but doesn't have
unique factorisation because $4 = 2^2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$. We've
just shown that not every UFD is a PID. It also so happens that not
every PID is a Euclidean domain, but probably the simplest counterexample
is the ring $\mathbb{Z}\bigl[\tfrac{1}{2}(1 + \sqrt{-19})\bigr]$, an example of
a quadratic integer ring. The proof that it is a PID but not a Euclidean
domain is a bit involved, and we leave it to Section 10.A.
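The failure of unique factorisation in $\mathbb{Z}[\sqrt{-3}]$ can be checked by direct computation. The sketch below is an illustration rather than the argument of Example 10.24: it represents $a + b\sqrt{-3}$ as the pair `(a, b)`, and it uses the multiplicative function $N(a + b\sqrt{-3}) = a^2 + 3b^2$, which is an assumption introduced here in the same spirit as the valuation on the Gaussian integers. A proper nonunit factor of an element of norm $4$ would need norm $2$, and no such element exists.

```python
from math import isqrt


def mul(x, y):
    """Multiply a + b*sqrt(-3) and c + d*sqrt(-3), stored as pairs (a, b)."""
    a, b = x
    c, d = y
    return (a * c - 3 * b * d, a * d + b * c)


def norm(x):
    """N(a + b*sqrt(-3)) = a^2 + 3b^2, which is multiplicative."""
    a, b = x
    return a * a + 3 * b * b


def elements_of_norm(n):
    """All elements of Z[sqrt(-3)] with norm n, found by bounded search."""
    bound = isqrt(n)
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm((a, b)) == n]


# The two factorisations of 4.
assert mul((2, 0), (2, 0)) == (4, 0)
assert mul((1, 1), (1, -1)) == (4, 0)

print(elements_of_norm(1))  # [(-1, 0), (1, 0)]: the only units are +1 and -1
print(elements_of_norm(2))  # []: so elements of norm 4 have no proper factors
```

Since the only units are $\pm 1$, the element $2$ is not an associate of $1 \pm \sqrt{-3}$, so the two factorisations of $4$ really are different.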
We will return to discussion of irreducible polynomials in the next
chapter, which is a short overview of Galois theory, an important
branch of mathematics which applies powerful techniques in group
and ring theory to the solubility of polynomial equations.

10.A Quadratic integer rings

Therefore, the Divine Spirit discovered an elegant and wonderful escape in that miracle of analysis, that token of the ideal world, that amphibian between being and non-being, that we call the imaginary root.
Gottfried Wilhelm Leibniz (1646–1716),
Specimen novum analyseos pro scientia infiniti, circa summas et quadraturas (1702)

In this section, we will study the Gaussian integers $\mathbb{Z}[i]$ and investigate which of them are prime. We will then extend some of these ideas to a class of rings called quadratic integer rings, and finally construct an example of a PID that isn't a Euclidean domain.
We saw in Proposition 10.11 that $\mathbb{Z}[i]$ is a Euclidean domain, so irreducible and prime elements are the same thing. The Euclidean valuation we used before was defined by

$v(a + bi) = |a + bi|^2 = a^2 + b^2.$

(It's also worth noting that $v(a + bi) = (a + bi)\overline{(a + bi)}$.) In a little while we will develop a general technique for defining suitable valuation functions, but for the moment this will suffice. In particular, we want to see how this valuation behaves with respect to primes in $\mathbb{Z}$ or $\mathbb{Z}[i]$.
The first thing to notice is that this valuation satisfies a multiplicative property:

$v((a+bi)(c+di)) = v((ac-bd) + (bc+ad)i) = a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 = (a^2+b^2)(c^2+d^2) = v(a+bi)\,v(c+di).$
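The multiplicative property can be spot-checked numerically. The following is only a sketch: Gaussian integers are stored as (real, imaginary) pairs, and the helper names `v` and `gauss_mul` are chosen here for illustration rather than taken from the text.

```python
from itertools import product

def v(a, b):
    """Valuation v(a + bi) = a^2 + b^2 of a Gaussian integer."""
    return a * a + b * b

def gauss_mul(x, y):
    """Product of two Gaussian integers given as (real, imaginary) pairs."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, b * c + a * d)

# Compare v(xy) with v(x)v(y) for every pair in a small box around the origin.
box = list(product(range(-4, 5), repeat=2))
multiplicative = all(
    v(*gauss_mul(x, y)) == v(*x) * v(*y) for x in box for y in box
)
print(multiplicative)
```

A finite check like this is no substitute for the algebraic identity above, but it is a quick way to catch a sign error in the product formula.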

By Proposition 10.4, the units in $\mathbb{Z}[i]$ are exactly those elements $a+bi$ such that $v(a+bi) = a^2+b^2 = 1 = v(1)$, namely $\pm 1$ and $\pm i$. This, together with the multiplicativity of $v$, allows us to prove the following fact:
Proposition 10.60 If $r = a + bi \in \mathbb{Z}[i]$ and $v(r)$ is prime in $\mathbb{Z}$, then $r$ is also prime in $\mathbb{Z}[i]$.

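Proof Write $p = v(r)$, and suppose that $s \mid r$ in $\mathbb{Z}[i]$, say $r = st$. Then $v(s)v(t) = v(r) = p$, so $v(s) \mid p$ in $\mathbb{Z}$. But $p$ is prime in $\mathbb{Z}$, so $v(s)$ must equal either 1 or $p$. If $v(s) = 1$ then $s$ is a unit in $\mathbb{Z}[i]$ by Proposition 10.4 and the above discussion. Otherwise, if $v(s) = p$ then $v(t) = 1$, so $t$ is a unit and $s \sim r$. In either case $s$ is a unit or an associate of $r$, so $r$ is irreducible, and hence prime because $\mathbb{Z}[i]$ is a Euclidean domain.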
The ring $\mathbb{Z}[i]$ has a copy of $\mathbb{Z}$ embedded in it as a unital subring, so it's natural to ask what happens to the primes in $\mathbb{Z}$ when we consider them as elements of $\mathbb{Z}[i]$. Are they still prime in $\mathbb{Z}[i]$ or not? The answer is: sometimes, but not always. For example 2 is prime in $\mathbb{Z}$, but we can factorise it as $2 = (1+i)(1-i)$ in $\mathbb{Z}[i]$. Similarly, 5 and 13 are prime in $\mathbb{Z}$, but factorise as $5 = (1+2i)(1-2i)$ and $13 = (2+3i)(2-3i)$ in $\mathbb{Z}[i]$. On the other hand, 3, 7 and 11 are prime in both $\mathbb{Z}$ and $\mathbb{Z}[i]$.
There are a couple of patterns here: one is that the prime integers that fail to be prime in $\mathbb{Z}[i]$ seem to factorise as a product of a Gaussian integer $q$ with its conjugate $\bar{q}$; the other is that (apart from 2, which is often a special case) they are of the form $4n+1$.
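Both patterns can be explored by brute force. The sketch below assumes nothing beyond the identity $q\bar{q} = a^2 + b^2$ for $q = a + bi$; the function names `is_prime` and `gaussian_factor` are illustrative choices, not part of the text.

```python
import math

def is_prime(n):
    """Trial-division primality test, adequate for small integers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def gaussian_factor(p):
    """Return (a, b) with 0 < a <= b and a*a + b*b == p, or None if there is none."""
    for a in range(1, math.isqrt(p // 2) + 1):
        b = math.isqrt(p - a * a)
        if a * a + b * b == p:
            return a, b
    return None

# Report, for each small prime, whether it is a product q * conj(q).
for p in filter(is_prime, range(2, 40)):
    q = gaussian_factor(p)
    if q is None:
        print(f"{p} (residue {p % 4} mod 4): not of the form q * conj(q)")
    else:
        a, b = q
        print(f"{p} (residue {p % 4} mod 4) = ({a}+{b}i)({a}-{b}i)")
```

The search only looks for a factorisation into a conjugate pair, which is the pattern observed above; it does not by itself prove that the remaining primes stay prime in $\mathbb{Z}[i]$.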
We'll leave the second of these observations for a little while, but the first leads to the following fact.
Proposition 10.61 Suppose $p \in \mathbb{Z}$ is prime in $\mathbb{Z}$. Then either $p$ is also prime in $\mathbb{Z}[i]$, or it factorises as $p = q\bar{q}$, where $q$ is prime in $\mathbb{Z}[i]$.

Proof If $p \in \mathbb{Z}$ is not prime in $\mathbb{Z}[i]$, then there must exist two Gaussian integers $q, r \in \mathbb{Z}[i]$ such that $p = qr$, where neither $q$ nor $r$ is a unit and at least one of them is prime in $\mathbb{Z}[i]$.
Suppose that $q$ is a Gaussian prime. Then by the multiplicativity of $v$,
$p^2 = v(p) = v(qr) = v(q)v(r).$
Neither $q$ nor $r$ is a unit, so neither $v(q)$ nor $v(r)$ equals 1, and hence $v(q) = v(r) = p$. By Proposition 10.60 it follows that both $q$ and $r$ must be prime in $\mathbb{Z}[i]$. Also, $v(q) = q\bar{q} = p = qr$, so cancelling $q$ shows that $r = \bar{q}$.
Now let's look at what happens to $v(q)$ when $q$ is prime in $\mathbb{Z}[i]$. We know by Proposition 10.60 that if $v(q)$ is prime in $\mathbb{Z}$, then $q$ must be prime in $\mathbb{Z}[i]$, but the converse doesn't necessarily hold: some Gaussian primes $q$ might yield non-prime values of $v(q)$. Applying $v$ to some small Gaussian primes, we see the following:
$v(1+i) = 2$, $v(1+2i) = 5$, $v(3) = 9$, $v(2+3i) = 13$,
$v(1+4i) = 17$, $v(2+5i) = 29$, $v(7) = 49$, $v(11) = 121$.
Again, various patterns are starting to form: $v(q)$ is either a prime integer, or the square of a prime integer. Also, in the cases when $v(q)$ is the square of a prime integer, $q$ is an integer of the form $4n-1$, and in the cases where $q$ is a Gaussian prime with nonzero imaginary part, $v(q)$ is of the form $4n+1$. Except, that is, for the special case where $v(q) = 2$, but 2 is often a special case when discussing prime numbers.

The first of these is relatively straightforward to prove:

Proposition 10.62 Let $q \in \mathbb{Z}[i]$ be a Gaussian prime. Then $v(q)$ is either a prime integer or the square of a prime integer.

Proof Let $v(q) = n$, and factorise $n = p_1 \cdots p_k$ as a product of primes in $\mathbb{Z}$. Recall that $v(q) = q\bar{q}$, so we have $q\bar{q} = p_1 \cdots p_k$. Then $q \mid p_1 \cdots p_k$, and since $q$ is prime in $\mathbb{Z}[i]$, it must be the case that $q \mid p_i$ for some $i$. Then $v(q) \mid v(p_i)$, and $v(p_i) = p_i^2$, so $v(q) \mid p_i^2$. Since $q$ is not a unit, $v(q) \neq 1$, so either $v(q) = p_i$ or $v(q) = p_i^2$, as claimed.
We are now in a position to classify Gaussian primes:
Proposition 10.63 Let $q \in \mathbb{Z}[i]$ be a Gaussian prime. Then $q$ is one of the following three types:
(i) $q \sim 1 \pm i$, with $v(q) = 2$;
(ii) $q = a+bi$, with $v(q) = a^2+b^2 \equiv 1 \pmod 4$; or
(iii) $q \sim a$, where $a$ is prime in $\mathbb{Z}$ and $a \equiv 3 \pmod 4$.

Proof By Proposition 10.62, $v(q)$ is either a prime integer $p$, or its square $p^2$. If $v(q) = p$ then by Proposition 10.61 we have $v(q) = p = q\bar{q}$.
If, on the other hand, $v(q) = p^2$ then $q \mid p^2$ in $\mathbb{Z}[i]$, and since $q$ is prime in $\mathbb{Z}[i]$ we must also have $q \mid p$ in $\mathbb{Z}[i]$, so there exists some $r \in \mathbb{Z}[i]$ such that $qr = p$. Then $v(r) = v(p)/v(q) = p^2/p^2 = 1$, so $r$ must be a unit in $\mathbb{Z}[i]$, and thus $q = \pm p$ or $\pm ip$. That is, if $v(q)$ is the square of a prime integer $p$, then $q$ must be an associate in $\mathbb{Z}[i]$ of $p$.
We now look at the modulo-4 congruence of this prime $p$. Suppose that $q = a + bi$. Then $v(q) = q\bar{q} = a^2 + b^2$, and with the congruences
$0^2 \equiv 0 \pmod 4$, $1^2 \equiv 1 \pmod 4$,
$2^2 \equiv 0 \pmod 4$, $3^2 \equiv 1 \pmod 4$,
it's clear that we can't have $a^2 + b^2 \equiv 3 \pmod 4$. So if $p \equiv 3 \pmod 4$ then it can't be factorised as $p = q\bar{q}$ in $\mathbb{Z}[i]$, and therefore $p$ must be prime in $\mathbb{Z}[i]$ too.
As noted earlier, $2 = (1+i)(1-i)$, where $1+i$ and $1-i$ form a conjugate pair of Gaussian primes. Also, $(1+i) = i(1-i)$ and $(1-i) = -i(1+i)$, so they are associates.
By Corollary 10.10 the multiplicative group $U(\mathbb{F}_p) = \mathbb{Z}_p \setminus \{0\}$ is cyclic of order $p-1$, and is thus isomorphic to $\mathbb{Z}_{p-1}$. Now suppose that $p \equiv 1 \pmod 4$; this means that $4 \mid p-1$, so this group has an element $a$ of order 4.
The polynomial $x^2 - 1$ has at most two roots in $\mathbb{Z}_p$, namely $1$ and $-1 \equiv p-1 \pmod p$. Then $-1$ is the only nonzero element of order 2 in $\mathbb{Z}_p$. Since $a$ has order 4, it follows that $a^2$ has order 2, and

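hence $a^2 = -1$ in $\mathbb{Z}_p$. This means that $a^2 \equiv p-1 \pmod p$ in $\mathbb{Z}$, and so $a^2 + 1 \equiv 0 \pmod p$. Therefore there exists $k \in \mathbb{Z}$ such that $a^2 + 1 = kp$, and hence $(a+i)(a-i) = kp$ in $\mathbb{Z}[i]$, which means that $p \mid (a+i)(a-i)$ in $\mathbb{Z}[i]$. If $p$ were a prime in $\mathbb{Z}[i]$ then either $p \mid (a+i)$ or $p \mid (a-i)$. But neither of these can happen, because multiples of $p$ in $\mathbb{Z}[i]$ are of the form $p(r+si) = pr + psi$ for some $r, s \in \mathbb{Z}$, so their imaginary parts are divisible by $p$, whereas $a \pm i$ has imaginary part $\pm 1$. Hence $p$ can't be prime in $\mathbb{Z}[i]$ if $p \equiv 1 \pmod 4$.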
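The element of order 4 used in this argument can be found by direct search for small primes. This is a minimal sketch; `order` and `element_of_order_4` are hypothetical helper names, and the search is deliberately naive.

```python
def order(a, p):
    """Multiplicative order of a modulo the prime p (a must not be divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def element_of_order_4(p):
    """Return some a in {2, ..., p-1} of order 4 modulo p, or None if there is none."""
    for a in range(2, p):
        if order(a, p) == 4:
            return a
    return None

for p in (5, 13, 17, 29):
    a = element_of_order_4(p)
    # a has order 4, so a*a is congruent to -1 and p divides a*a + 1 = (a+i)(a-i).
    print(p, a, (a * a + 1) // p)
```

For a prime congruent to 3 modulo 4 the same search returns `None`, since 4 does not divide $p-1$.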
This completes the proof: $v(q)$ is either a prime $p$, in which case $p = q\bar{q}$ and either $q \sim 1 \pm i$ and $p = 2$, or $p \equiv 1 \pmod 4$; or $v(q) = p^2$, in which case $p \equiv 3 \pmod 4$ and $q \sim p$.
We can use this to prove a result originally proved by Pierre de Fermat (1601–1665).
Corollary 10.64 Let $p$ be a prime integer such that either $p = 2$ or $p \equiv 1 \pmod 4$. Then $p$ can be expressed uniquely, up to the order of the summands, as a sum of two positive integer squares.

Proof If $p = 2$ or $p \equiv 1 \pmod 4$ then $p = q\bar{q} = a^2 + b^2$ for some Gaussian prime $q = a + bi \in \mathbb{Z}[i]$, by Proposition 10.63.
To show uniqueness, suppose that $p = a^2 + b^2 = c^2 + d^2$ for some integers $a, b, c, d > 0$. Then $p = (a+bi)(a-bi) = (c+di)(c-di)$ are two prime decompositions of $p$ in $\mathbb{Z}[i]$. But $\mathbb{Z}[i]$ is a UFD, so these factorisations are the same up to permutation and multiplication by a unit $\pm 1$, $\pm i$. So $a+bi$ is an associate of either $c+di$ or $c-di$. Since $a, b, c, d$ are all positive, the only possibilities are $a+bi = c+di$ or $a+bi = i(c-di) = d+ci$. Hence either $a = c$ and $b = d$, or $a = d$ and $b = c$.
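The uniqueness statement, and the way it depends on $p$ being prime, can be illustrated by listing every representation of a given integer. This is a sketch only; `representations` is a name introduced here for illustration.

```python
import math

def representations(n):
    """All pairs (a, b) with 0 < a <= b and a*a + b*b == n."""
    reps = []
    for a in range(1, math.isqrt(n // 2) + 1):
        b = math.isqrt(n - a * a)
        if a * a + b * b == n:
            reps.append((a, b))
    return reps

print(representations(13))  # a prime congruent to 1 mod 4
print(representations(65))  # 65 = 5 * 13 is not prime
print(representations(21))  # 21 = 3 * 7
```

The composite 65 has two essentially different representations, which is consistent with the corollary: its proof uses the fact that $a + bi$ is a Gaussian prime, and that fails once $a^2 + b^2$ is no longer prime.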
The ring $\mathbb{Z}[i]$ of Gaussian integers is one of a class of integral domains called quadratic integer rings. These are of the form $\mathbb{Z}[\alpha]$ where $\alpha$ is a root of a monic quadratic polynomial (that is, a quadratic polynomial whose top coefficient is 1).
Definition 10.65 Let $f = x^2 + bx + c \in \mathbb{Z}[x]$ be a quadratic polynomial with integer coefficients, and let $\alpha$ be a root of this polynomial. Then
$\mathbb{Z}[\alpha] = \{ s + t\alpha : s, t \in \mathbb{Z} \}$
is the quadratic integer ring determined by $f$ and $\alpha$. We say $\mathbb{Z}[\alpha]$ is real if $\alpha \in \mathbb{R}$, and imaginary otherwise.

We've seen several examples of quadratic integer rings already: $\mathbb{Z}[i]$ is determined by the polynomial $x^2 + 1$ and the root $x = i$; we obtain an isomorphic ring if we use the other root $x = -i$, the isomorphism being the usual conjugation map $z \mapsto \bar{z}$.
The real quadratic integer ring $\mathbb{Z}[\sqrt{2}]$ is determined by the polynomial $x^2 - 2$ and the root $x = \sqrt{2}$. Again, this is isomorphic to the ring $\mathbb{Z}[-\sqrt{2}]$ by the map $f : (a + b\sqrt{2}) \mapsto (a - b\sqrt{2})$.

Now let $f = x^2 + x + 1$ and let $\omega = -\frac12 + \frac{\sqrt3}{2} i$ be one of its roots. We then obtain the ring $\mathbb{Z}[\omega]$ of Eisenstein integers. Again, $\mathbb{Z}[\omega] \cong \mathbb{Z}[\bar\omega]$.
What we want to do now is to generalise some of the constructions we used for the Gaussian integers, in particular the concept of a conjugate, and the Euclidean valuation or norm $v : \mathbb{Z}[i] \setminus \{0\} \to \mathbb{N} \cup \{0\}$.
The first step is to generalise the notion of a complex conjugate to an arbitrary quadratic integer ring $\mathbb{Z}[\alpha]$. The key is to look at the conjugation isomorphism $\mathbb{Z}[i] \to \mathbb{Z}[-i]$, and the isomorphism $\mathbb{Z}[\sqrt2] \to \mathbb{Z}[-\sqrt2]$. In both cases, we swapped the chosen root $\alpha$ of the defining polynomial $f$ with the other root $\bar\alpha$. This approach works in general:
Definition 10.66 Suppose that $\alpha$ is a root of $f = x^2 + bx + c$, and $\mathbb{Z}[\alpha]$ is the corresponding quadratic integer ring. Let $\bar\alpha$ be the other root of $f$; by the usual quadratic formula, $\bar\alpha = -b - \alpha$. Then the conjugate of an element $r = (s + t\alpha) \in \mathbb{Z}[\alpha]$ is defined to be
$\bar{r} = s + t\bar\alpha = s - tb - t\alpha.$

For $\mathbb{Z}[i]$ we used the valuation function

$v(a+bi) = (a+bi)\overline{(a+bi)} = (a+bi)(a-bi) = a^2 + b^2.$

This valuation has all the necessary properties to ensure $\mathbb{Z}[i]$ is a Euclidean domain, and was also very useful when we studied the Gaussian primes. And now that we have a general notion of a conjugate in a given quadratic integer ring, the following definition seems the natural next step.
Definition 10.67 Suppose that $\alpha, \bar\alpha$ are the roots of a monic quadratic polynomial $f = x^2 + bx + c \in \mathbb{Z}[x]$, and that $\mathbb{Z}[\alpha]$ is the corresponding quadratic integer ring. Let $r = s + t\alpha$ be an arbitrary element of $\mathbb{Z}[\alpha]$. Then we define the norm of $r$ to be
$N(r) = r\bar{r} = s^2 - bst + ct^2.$
This norm will sometimes satisfy the conditions for a Euclidean valuation, either on its own or by defining $v(r) = |N(r)|$, but sometimes it won't. Rings for which the absolute value $|N(r)|$ of the norm $N$ happens to be a Euclidean valuation are said to be norm Euclidean.

However, not every Euclidean domain is norm Euclidean in this sense: just because the absolute value of the norm $N$ isn't a Euclidean valuation doesn't mean that no such valuation exists.17
Although the norm of a quadratic integer ring needn't give a Euclidean valuation, it does have some useful properties, in particular the following.

17 For example, $\mathbb{Z}[\sqrt{14}]$ is not norm Euclidean, but was shown in 2004 to nevertheless be a Euclidean domain anyway; however, the techniques used are beyond the scope of this book.18
18 M Harper, $\mathbb{Z}[\sqrt{14}]$ is Euclidean, Canadian Journal of Mathematics 56 (2004) 55–70

Proposition 10.68 Let $\mathbb{Z}[\alpha]$ be a quadratic integer ring determined by a root $\alpha$ of a monic polynomial $f = x^2 + bx + c \in \mathbb{Z}[x]$, and let $N(r) = r\bar{r}$ be the corresponding norm. Then $N(rs) = N(r)N(s)$ for all $r, s \in \mathbb{Z}[\alpha]$, and $N(r) = \pm 1$ if and only if $r$ is a unit of $\mathbb{Z}[\alpha]$.

Proof The multiplicative property can be verified by a straightforward but tedious calculation, which we will therefore skip here, but which the reader is invited to do for themselves.
If $rs = 1$ in $\mathbb{Z}[\alpha]$, then taking norms of both sides we have

$N(r)N(s) = N(rs) = N(1) = 1.$

Both norms are integers, so $N(r) = N(s) = \pm 1$. Conversely, if $N(r) = \pm 1$ then $r\bar{r} = \pm 1$, which means that $r$ must be invertible with $r^{-1} = \pm\bar{r}$.
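The skipped calculation can at least be spot-checked by machine. The sketch below stores $s + t\alpha$ as the pair `(s, t)` and multiplies using $\alpha^2 = -b\alpha - c$; the factory functions `make_norm` and `make_mul` are illustrative names, and the four polynomials tested are simply the examples mentioned in this section.

```python
from itertools import product

def make_norm(b, c):
    """Norm N(s + t*alpha) = s^2 - b*s*t + c*t^2, where alpha is a root of x^2 + b*x + c."""
    def norm(s, t):
        return s * s - b * s * t + c * t * t
    return norm

def make_mul(b, c):
    """Multiplication in Z[alpha], reducing alpha^2 to -b*alpha - c."""
    def mul(x, y):
        (s1, t1), (s2, t2) = x, y
        return (s1 * s2 - c * t1 * t2, s1 * t2 + t1 * s2 - b * t1 * t2)
    return mul

# (b, c) for x^2 + 1, x^2 + x + 1, x^2 - 2 and x^2 - x + 5 respectively.
rings = {"Z[i]": (0, 1), "Eisenstein": (1, 1), "Z[sqrt 2]": (0, -2), "root of x^2-x+5": (-1, 5)}
box = list(product(range(-3, 4), repeat=2))
results = {}
for name, (b, c) in rings.items():
    N, mul = make_norm(b, c), make_mul(b, c)
    results[name] = all(N(*mul(x, y)) == N(*x) * N(*y) for x in box for y in box)
print(results)
```

A check over a finite box is evidence rather than proof, but it exercises exactly the identity $N(rs) = N(r)N(s)$ claimed in the proposition.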

What we want to do is to find an example of a quadratic integer ring that is a PID but isn't a Euclidean domain. In order to do this, we need to introduce a little more technology: a necessary condition for something to be a Euclidean domain, and a sufficient condition for something to be a PID. Then if we can find something that satisfies one and not the other, we'll have the example we're looking for.
Taking the first of these, we'll have a look at some examples of quadratic integer rings that are Euclidean.
Our motivating example of a Euclidean domain, as well as a lot of the other algebraic structures we've investigated so far, is the ring $\mathbb{Z}$ of integers. We can partition this into even and odd numbers: every element of $\mathbb{Z}$ is either even or odd, having the form $2n$ or $2n+1$. One way of looking at this is to say, as we did in the last chapter, that $\mathbb{Z}/2\mathbb{Z} \cong \mathbb{Z}_2$. But now we're interested in the existence or nonexistence of a Euclidean valuation $v : \mathbb{Z} \to \mathbb{N} \cup \{0\}$.

Figure 10.2: Even and odd integers
Figure 10.3: Multiples of 3
Figure 10.4: Multiples of 4

Both the multiplicative norm $N$ of a quadratic integer ring $\mathbb{Z}[\alpha]$ and the valuation defined on a Euclidean domain (and also the Dedekind–Hasse norms we'll meet in a little while) can be thought of as distance functions in the domain in question. So another way of looking at the even/odd partition of $\mathbb{Z}$ is to say that every element of $\mathbb{Z}$ is at most distance 1 from an even integer, as depicted in Figure 10.2. The same holds for multiples of 3: every integer is of the form $3n-1$, $3n$ or $3n+1$, as shown in Figure 10.3. However, this doesn't work for multiples of 4: there exist infinitely many integers of the form $4n+2$, which are distance 2 from the nearest multiple of 4, as shown in Figure 10.4.
We can do this with the Gaussian integers too: if we consider the ideal $\langle 1+i \rangle$ generated by the element $1+i$, we get the lattice depicted in Figure 10.5. Every element of $\mathbb{Z}[i]$ is either a point in this lattice (that is, an element of $\langle 1+i \rangle$) or distance 1 from it using the valuation $v(a+bi) = a^2+b^2$.

Figure 10.5: The ideal $\langle 1+i \rangle$ in $\mathbb{Z}[i]$

Alternatively, every element in $\mathbb{Z}[i]$ can be expressed in the form $z + a$ where $a \in \langle 1+i \rangle$ and $z \in \{0, \pm 1, \pm i\}$. Similarly, every element of $\mathbb{Z}$ can be written as $m + a$ where $a \in \langle 2 \rangle$ and $m = 0, 1$; or in the form $m + b$ where $b \in \langle 3 \rangle$ and $m = 0, \pm 1$.
This, we'll see in a moment, is a property shared by every Euclidean domain $R$ (other than a field). Given a Euclidean valuation defined on $R$, we can find a nonzero element $r \in R$ such that the cosets of $R/\langle r \rangle$ are of the form $a + \langle r \rangle$ where $a$ is either 0 or a unit of $R$. Time for some terminology:
Definition 10.69 Let $R$ be an integral domain, and let $r \in R$ be a nonzero nonunit such that every coset in $R/\langle r \rangle$ can be represented as $a + \langle r \rangle$ where either $a = 0$, or $a$ is a unit of $R$. Then $r$ is a universal side divisor of $R$.
This is the key concept we need: it turns out that a Euclidean domain (other than a field) must have at least one universal side divisor.
Proposition 10.70 Let $R$ be a Euclidean domain, and let $v$ be a Euclidean valuation defined on $R$. Then let $a \in R$ be a nonzero nonunit element of $R$ for which $v(a)$ is minimal over all nonzero nonunits in $R$. Then $a$ is a universal side divisor.

Proof Let $x \in R$ be any element of $R$. Then since $R$ is a Euclidean domain, we can find $q, r \in R$ such that $x = qa + r$ and either $r = 0$ or $v(r) < v(a)$.
If $r \neq 0$ then $v(r) < v(a)$, and since $v(a)$ is minimal among the nonzero nonunits, $r$ must be a unit. Since $a \mid (x - r)$, we have $x + \langle a \rangle = r + \langle a \rangle$, so every coset in the quotient $R/\langle a \rangle$ is represented by 0 or a unit in $R$.
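In $\mathbb{Z}$ the units are $\pm 1$, so the definition can be tested directly. The following sketch (the function name `is_universal_side_divisor` is chosen here for illustration) checks whether the residues of $0$, $1$ and $-1$ account for every coset modulo $r$.

```python
def is_universal_side_divisor(r):
    """Test whether the integer r is a universal side divisor of Z.

    r must be a nonzero nonunit, and every residue class modulo r must
    contain 0 or one of the units 1 and -1.
    """
    m = abs(r)
    if m < 2:
        return False  # zero and the units +1, -1 are excluded
    residues = {0, 1 % m, -1 % m}
    return len(residues) == m

print([r for r in range(-6, 7) if is_universal_side_divisor(r)])
```

This agrees with Figures 10.2 to 10.4: multiples of 2 and of 3 come within distance 1 of every integer, while multiples of 4 miss the integers of the form $4n+2$.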

Our current goal is to show that $\mathbb{Z}[\alpha]$, where $\alpha = \frac12(1 + \sqrt{-19})$, is a PID but not a Euclidean domain. If we can show it doesn't have any universal side divisors, then by Proposition 10.70 it can't be a Euclidean domain.
Proposition 10.71 Let $\mathbb{Z}[\alpha]$ be the quadratic integer ring determined by the polynomial $x^2 - x + 5$ and the root $\alpha = \frac12(1 + \sqrt{-19})$. Then $\mathbb{Z}[\alpha]$ has no universal side divisors and hence is not a Euclidean domain.

Proof First we want to find the units in $\mathbb{Z}[\alpha]$, and for this we use the norm $N(a + b\alpha) = a^2 + ab + 5b^2$. This is positive definite; that is, $N(a + b\alpha) \ge 0$ for all $a + b\alpha \in \mathbb{Z}[\alpha]$, and $N(a + b\alpha) = 0$ only when $a = b = 0$. We can show this by completing the square, which yields the two expressions

$\left(a + \tfrac12 b\right)^2 + \tfrac{19}{4} b^2$ and $\tfrac{19}{20} a^2 + \left(\tfrac{1}{2\sqrt5}\, a + \sqrt5\, b\right)^2$

for $N(a + b\alpha)$. The first of these is neater than the second, but both are clearly positive definite. If $b \neq 0$ then

$N(a + b\alpha) = \left(a + \tfrac12 b\right)^2 + \tfrac{19}{4} b^2 \ge \tfrac{19}{4} b^2 \ge \tfrac{19}{4} > 4.$

By Proposition 10.68 the units are exactly those elements $r = (a + b\alpha) \in \mathbb{Z}[\alpha]$ for which $N(r) = 1$, so these must have $b = 0$, and hence we require $N(r) = N(a) = a^2 = 1$. Thus $a = \pm 1$ and so the units in $\mathbb{Z}[\alpha]$ are $\pm 1$.
Now suppose that $\mathbb{Z}[\alpha]$ has a universal side divisor $d$. Then for any $r \in \mathbb{Z}[\alpha]$ there exists some $u \in \{0, \pm 1\}$ such that $r \equiv u \pmod d$ in $\mathbb{Z}[\alpha]$. This means that $r = kd + u$ for some $k \in \mathbb{Z}[\alpha]$, and hence $r - u = kd$. Thus $d \mid (r - u)$.
Suppose $r = 2$. Then $d \mid (2 - u)$ for some $u \in \{0, \pm 1\}$. If $u = 1$ then $d \mid 1$, so $d$ is a unit. But Definition 10.69 specifically rules this out: a universal side divisor may not be a unit. If $u = 0$ then $d \mid 2$, and if $u = -1$ then $d \mid 3$. Both 2 and 3 are irreducible in $\mathbb{Z}[\alpha]$, since $N(2) = 4$ and $N(3) = 9$ while no element of $\mathbb{Z}[\alpha]$ has norm 2 or 3, so the only possibilities open to us are that $d \in \{\pm 2, \pm 3\}$. We will now rule out each of these four possibilities as well.
Consider another element $r \in \mathbb{Z}[\alpha]$: take $r = \alpha = \frac12(1 + \sqrt{-19})$ itself. Again, $d \mid (r - u)$ for some $u \in \{0, \pm 1\}$; that is, either $d \mid (r - 1)$, or $d \mid r$, or $d \mid (r + 1)$.
Suppose that $d \mid (r + 1)$. Then there exists some $s \in \mathbb{Z}[\alpha]$ such that $sd = r + 1 = \frac32 + \frac12\sqrt{-19}$. Applying the norm $N$ to this, we get
$N(s)N(d) = N(sd) = N(1 + \alpha) = 7.$
This can't happen, because $N(d) = 4$ or $9$, and neither of these is a factor of 7.
If $d \mid r$ then $sd = r = \alpha$, and $N(s)N(d) = N(sd) = N(\alpha) = 5$; this is also prime, so there is no suitable $s$ that can be multiplied by any $d \in \{\pm 2, \pm 3\}$ to give $\alpha$.
Finally, if $d \mid (r - 1)$ then there would exist some $s \in \mathbb{Z}[\alpha]$ such that $sd = \alpha - 1 = -\frac12 + \frac12\sqrt{-19}$. Using the norm again, this would require $N(s)N(d) = N(sd) = N(\alpha - 1) = 5$, which can't happen for the same reasons as in the previous two cases.
Therefore there is no universal side divisor $d \in \mathbb{Z}[\alpha]$, and hence $\mathbb{Z}[\alpha]$ can't be a Euclidean domain.
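The norm computations in this proof are easy to reproduce. The sketch below represents $a + b\alpha$ by the pair `(a, b)` and uses only the formula $N(a + b\alpha) = a^2 + ab + 5b^2$; the search box is an assumption justified by the bound $N(a + b\alpha) \ge \frac{19}{4}b^2$, which forces $|b| \le 1$ and $|a| \le 4$ whenever the norm is at most 9.

```python
def N(a, b):
    """Norm of a + b*alpha in Z[alpha], where alpha = (1 + sqrt(-19))/2."""
    return a * a + a * b + 5 * b * b

box = [(a, b) for a in range(-5, 6) for b in range(-2, 3)]

# Units are the elements of norm 1.
units = [x for x in box if N(*x) == 1]

# No element has norm 2 or 3, so 2 and 3 are irreducible.
small_norms = sorted({N(*x) for x in box if 0 < N(*x) <= 9})

# Norms of the candidates d = 2, 3 and of alpha - 1, alpha, alpha + 1.
candidate_norms = [N(2, 0), N(3, 0)]
target_norms = {"alpha - 1": N(-1, 1), "alpha": N(0, 1), "alpha + 1": N(1, 1)}

print(units, small_norms, candidate_norms, target_norms)
```

None of the target norms is divisible by 4 or by 9, which is the obstruction used in each of the three cases.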
Next we need to show that Z[] is a PID, and to do that we have to
introduce another bit of algebraic machinery. Recall that a Euclidean
domain is an integral domain for which we can define a Euclidean
norm or Euclidean valuation. Is there a slightly more general class of
norms whose existence ensures that a given integral domain is a PID?
The answer is yes:
Definition 10.72 Let $R$ be an integral domain, and let $N : R \to \mathbb{Z}$ be a function defined on $R$. If $N$ is positive definite and if, for every $a, b \in R$ with $b \neq 0$, either
(i) $b \mid a$, or
(ii) there exists a nonzero element $c \in \langle a, b \rangle$ such that $0 < N(c) < N(b)$,
then $N$ is a Dedekind–Hasse norm.
The first of these conditions is equivalent to requiring that $a \in \langle b \rangle$, and the second is equivalent to the existence of elements $s, t \in R$ such that
$0 < N(sa + tb) < N(b).$
Every Euclidean valuation is a Dedekind–Hasse norm: the Euclidean condition is the same as saying we can always set $s = 1$, so that
$0 < N(a + tb) < N(b).$
This new type of norm is exactly what we need:
Proposition 10.73 An integral domain $R$ is a PID if and only if there exists a Dedekind–Hasse norm defined on $R$.

Proof Suppose that $N$ is a Dedekind–Hasse norm on $R$. Let $I$ be a nonzero ideal in $R$ and choose a nonzero $b \in I$ such that $N(b)$ is minimal among the norms of the nonzero elements of $I$. Then let $a \in I$ be some arbitrary nonzero element of $I$. Since $I$ is an ideal of $R$, then for any $s, t \in R$ it must be the case that $(sa - tb) \in I$. Then either $(sa - tb) = 0$ or $N(sa - tb) \ge N(b)$ for any $s, t \in R$. Hence, by Definition 10.72, since $N(sa - tb) \not< N(b)$ for any $s, t \in R$, it must be the case that $b \mid a$, and thus $a \in \langle b \rangle$. Thus $I = \langle b \rangle$ is principal, and hence $R$ must be a PID.
Conversely, suppose that $R$ is a PID. We will construct a Dedekind–Hasse norm on $R$. Let $N(0) = 0$, and let $N(u) = 1$ for any unit $u \in R$. Since every PID is a UFD by Corollary 10.43, any nonzero nonunit $r$ in $R$ can be factorised into a finite product of irreducible elements. Any such factorisation is unique up to permutation and multiplication by units, and thus will have the same number $n_r$ of irreducible elements. Define $N(r) = 2^{n_r}$. Then $N$ is positive definite.
It also satisfies the multiplicativity condition, since

$N(rs) = 2^{n_{rs}} = 2^{n_r + n_s} = 2^{n_r} 2^{n_s} = N(r)N(s).$

We claim that this is a Dedekind–Hasse norm on $R$. Suppose $a, b \in R$ with $b \neq 0$. Then $I = \langle a, b \rangle = \{ as + bt : s, t \in R \}$ is an ideal of $R$, and since $R$ is a PID, there exists some nonzero element $d \in R$ such that $I = \langle d \rangle$. This element $d = as + bt$ for some $s, t \in R$.
Then $b = rd \in I = \langle d \rangle$ for some nonzero $r \in R$. If $b \nmid a$ then $\langle b \rangle \neq \langle d \rangle$, so $r$ is not a unit and $N(r) \ge 2$; hence

$N(b) = N(rd) = N(r)N(d) > N(d) = N(as + bt) > 0.$

Thus $N$ is a Dedekind–Hasse norm for $R$.


So, if we can show now that $\mathbb{Z}[\alpha]$ does admit a Dedekind–Hasse norm then we're done: we've found a PID that isn't Euclidean. The most straightforward approach is to confirm that the norm $N(a + b\alpha) = a^2 + ab + 5b^2$ satisfies the Dedekind–Hasse criteria.
Proposition 10.74 Let $\mathbb{Z}[\alpha]$ be the quadratic integer ring determined by the polynomial $f = x^2 - x + 5 \in \mathbb{Z}[x]$ and the root $\alpha = \frac12(1 + \sqrt{-19})$. Then $N(a + b\alpha) = a^2 + ab + 5b^2$ is a Dedekind–Hasse norm, and hence $\mathbb{Z}[\alpha]$ is a PID.

The following proof is based on the version given by Wilson.19

19 J C Wilson, A principal ideal ring that is not a Euclidean ring, Mathematics Magazine 46 (1973) 34–38.

Proof Suppose $u, v \in \mathbb{Z}[\alpha]$ with $v \neq 0$ and $v \nmid u$, and $N(u) \ge N(v)$. We want to find elements $s, t \in \mathbb{Z}[\alpha]$ such that $0 < N(su + tv) < N(v)$. Since $N$ satisfies the multiplicative condition, this is the same (after replacing $t$ by $-t$) as finding $s, t \in \mathbb{Z}[\alpha]$ such that
$0 < N\left(\tfrac{u}{v} s - t\right) < N(1) = 1.$
In general, the quotient $\frac{u}{v}$ doesn't lie in $\mathbb{Z}[\alpha]$; indeed, unless $v$ is a unit or $v \mid u$, this doesn't even make sense in $\mathbb{Z}[\alpha]$. Here, we've specifically chosen $u$ and $v$ so that $v \nmid u$, so we must instead consider $\frac{u}{v}$ as an element of $Q(\mathbb{Z}[\alpha])$, the field of fractions of $\mathbb{Z}[\alpha]$. This is exactly

$\mathbb{Q}[\alpha] = \{ a + b\alpha : a, b \in \mathbb{Q} \} = \{ a + b\sqrt{-19} : a, b \in \mathbb{Q} \} = \mathbb{Q}[\sqrt{-19}].$

The norm $N$ extends happily to $\mathbb{Q}[\alpha]$, but when we switch to $\mathbb{Q}[\sqrt{-19}]$ it takes a slightly different form:

$N(a + b\sqrt{-19}) = a^2 + 19b^2.$

Suppose, then, that $\frac{u}{v} = \frac1c(a + b\sqrt{-19}) \in \mathbb{Q}[\sqrt{-19}]$, where $a, b, c \in \mathbb{Z}$ are coprime integers and $c > 1$. (This latter condition must be true, because we've already decided that $v \nmid u$.) Since $a, b, c$ are coprime, and hence $\gcd(a, b, c) = 1$, there must exist integers $l, m, n$ such that $la + mb + nc = 1$.
Furthermore, there must exist integers $q, r \in \mathbb{Z}$ such that $am - 19bl = cq + r$ with $|r| \le \frac12 c$. We have four cases to consider.

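Case 1 ($c \ge 5$) Set $s = m + l\sqrt{-19}$ and $t = q - n\sqrt{-19}$. Then

$$\begin{aligned}
0 < N\left(\tfrac{u}{v} s - t\right) &= N\left(\tfrac1c (a + b\sqrt{-19})(m + l\sqrt{-19}) - (q - n\sqrt{-19})\right) \\
&= N\left(\tfrac1c \big((am - 19bl) + (bm + al)\sqrt{-19}\big) - q + n\sqrt{-19}\right) \\
&= N\left(\tfrac1c \big((am - 19bl - cq) + (al + bm + cn)\sqrt{-19}\big)\right) \\
&= N\left(\tfrac1c (r + \sqrt{-19})\right) \\
&= \frac{r^2 + 19}{c^2}.
\end{aligned}$$

If $c > 5$ then $\frac{r^2 + 19}{c^2} \le \frac14 + \frac{19}{c^2} \le \frac79 < 1$, so $N$ is a Dedekind–Hasse norm.
If $c = 5$ then $|r| \le 2$ and so $\frac{r^2 + 19}{c^2} \le \frac{23}{25} < 1$, and again $N$ is a Dedekind–Hasse norm.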

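Case 2 ($c = 2$) One of $a$ and $b$ is even, and the other must be odd, otherwise we would have $\frac{u}{v} \in \mathbb{Z}[\alpha]$. Let

$s = 1$ and $t = \frac12(a - 1) + \frac12 b\sqrt{-19} = \frac12(a - b - 1) + b\alpha \in \mathbb{Z}[\alpha]$.

Then

$N\left(\tfrac{u}{v} s - t\right) = N\left(\tfrac12(a + b\sqrt{-19}) - \tfrac12(a - 1) - \tfrac12 b\sqrt{-19}\right) = N\left(\tfrac12\right) < 1.$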
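Case 3 ($c = 3$) If $c = 3$ then 3 doesn't divide both $a$ and $b$, so $a^2 + 19b^2$ isn't divisible by 3 either. Choose $q, r \in \mathbb{Z}$ such that $a^2 + 19b^2 = 3q + r$, with $0 < r < 3$; that is, $r = 1$ or $2$. Set

$s = a - b\sqrt{-19}$ and $t = q \in \mathbb{Z}[\alpha]$.

Then

$$\begin{aligned}
N\left(\tfrac{u}{v} s - t\right) &= N\left(\tfrac13 (a + b\sqrt{-19})(a - b\sqrt{-19}) - q\right) \\
&= N\left(\tfrac13 (a^2 + 19b^2) - q\right) \\
&= N\left(\tfrac13 (3q + r) - q\right) \\
&= N\left(\tfrac13 r\right) < 1.
\end{aligned}$$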

Case 4 ($c = 4$) Finally, suppose that $c = 4$. Then at most one of $a$ and $b$ can be even, otherwise this reduces to Case 2. If only one is odd then $a^2 + 19b^2$ is odd, so we can find $q, r \in \mathbb{Z}$ such that $a^2 + 19b^2 = 4q + r$, with $r = 1$ or $3$. As with Case 3, set

$s = a - b\sqrt{-19}$ and $t = q \in \mathbb{Z}[\alpha]$.

Then

$N\left(\tfrac{u}{v} s - t\right) = N\left(\tfrac14 (a + b\sqrt{-19})(a - b\sqrt{-19}) - q\right) = N\left(\tfrac14 r\right) < 1.$

If both $a$ and $b$ are odd, then $a^2 + 19b^2 \not\equiv 0 \pmod 8$, so choose $q, r \in \mathbb{Z}$ such that $a^2 + 19b^2 = 8q + r$, with $0 < r < 8$. Set

$s = \frac12(a - b\sqrt{-19})$ and $t = q$.

Then

$N\left(\tfrac{u}{v} s - t\right) = N\left(\tfrac18 (a + b\sqrt{-19})(a - b\sqrt{-19}) - q\right) = N\left(\tfrac18 r\right) < 1.$

In each case, then, $N$ is a Dedekind–Hasse norm, and so by Proposition 10.73 $\mathbb{Z}[\alpha]$ is a PID.
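The four cases translate directly into a procedure that produces $s$ and $t$ for a given quotient $\frac{u}{v} = \frac1c(a + b\sqrt{-19})$. The sketch below is an illustration under stated assumptions: elements of $\mathbb{Q}[\sqrt{-19}]$ are stored as pairs `(x, y)` meaning $x + y\sqrt{-19}$, the names `egcd`, `mul`, `norm`, `in_ring`, `witness` and `check` are invented here, and the inputs are assumed to satisfy $\gcd(a, b, c) = 1$, $c > 1$ and $\frac{u}{v} \notin \mathbb{Z}[\alpha]$.

```python
from fractions import Fraction
from math import gcd

def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and |g| == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def mul(x, y):
    """Product of x[0] + x[1]*sqrt(-19) and y[0] + y[1]*sqrt(-19)."""
    return (x[0] * y[0] - 19 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def norm(x):
    """N(x + y*sqrt(-19)) = x^2 + 19*y^2."""
    return x[0] ** 2 + 19 * x[1] ** 2

def in_ring(x):
    """x + y*sqrt(-19) = (x - y) + 2y*alpha is in Z[alpha] iff x - y and 2y are integers."""
    return (x[0] - x[1]).denominator == 1 and (2 * x[1]).denominator == 1

def witness(a, b, c):
    """Return (s, t) as in Cases 1 to 4 for u/v = (a + b*sqrt(-19))/c."""
    F = Fraction
    if c >= 5:                                  # Case 1
        g1, x, y = egcd(a, b)
        g, z, n = egcd(g1, c)                   # a*x*z + b*y*z + c*n == g == +-1
        l, m, n = g * x * z, g * y * z, g * n   # now l*a + m*b + n*c == 1
        val = a * m - 19 * b * l
        q = (2 * val + c) // (2 * c)            # nearest integer, so |val - c*q| <= c/2
        return (F(m), F(l)), (F(q), F(-n))
    if c == 2:                                  # Case 2
        return (F(1), F(0)), (F(a - 1, 2), F(b, 2))
    if c == 4 and a % 2 and b % 2:              # Case 4, a and b both odd
        return (F(a, 2), F(-b, 2)), (F((a * a + 19 * b * b) // 8), F(0))
    # Case 3, or Case 4 with exactly one of a, b odd
    return (F(a), F(-b)), (F((a * a + 19 * b * b) // c), F(0))

def check(a, b, c):
    """Verify that s, t lie in Z[alpha] and that 0 < N((u/v)*s - t) < 1."""
    s, t = witness(a, b, c)
    p = mul((Fraction(a, c), Fraction(b, c)), s)
    diff = (p[0] - t[0], p[1] - t[1])
    return in_ring(s) and in_ring(t) and 0 < norm(diff) < 1

print(check(1, 1, 3), check(3, 1, 4), check(1, 0, 5), check(2, 1, 2))
```

Running `check` over a range of admissible triples $(a, b, c)$ exercises every branch of the case analysis, although of course only the proof covers all of them.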
We have now proved the existence of at least one PID that isn't a Euclidean domain, and thereby completed the construction of the hierarchy
$\{\text{Euclidean domains}\} \subset \{\text{PIDs}\} \subset \{\text{UFDs}\} \subset \{\text{integral domains}\}.$
Each inclusion in this hierarchy is strict.

Summary

In this chapter we studied some important classes of integral domains, each motivated by a property satisfied by $\mathbb{Z}$. The first of these is the Division Theorem,20 which we restated in a slightly different form.21 This says that for any integer $a$ and nonzero integer $b$ we can find integers $q$ (the quotient) and $r$ (the remainder) such that $a = qb + r$, where either $r = 0$ or $|r| < |b|$. In order to generalise this to an arbitrary integral domain $R$, we introduced the concept of a Euclidean valuation or norm,22 a function $v : R \to \mathbb{N}_0$ such that for all nonzero $a, b \in R$, the inequality $v(ab) \ge v(a)$ holds, and there exist $q, r \in R$ such that $a = qb + r$ with either $r = 0$ or $v(r) < v(b)$. An integral domain which admits a Euclidean valuation is called a Euclidean domain.23 A Euclidean domain, then, is an integral domain in which some suitable analogue of the Division Theorem holds. For any Euclidean valuation $v$ defined on a ring $R$, we have $v(a) \ge v(1)$ for any nonzero element $a \in R$, and $v(a) = v(1)$ exactly when $a$ is a unit.24

20 Theorem A.37, page 523.
21 Theorem 10.1, page 394.
22 Definition 10.2, page 395.
23 Definition 10.3, page 395.
24 Proposition 10.4, page 395.
If $F$ is a field, then the polynomial ring $F[x]$ is a Euclidean domain: we can use the degree function $\deg$ as the required Euclidean valuation,25 and thereby perform long division with polynomials.26 We can use this to prove the Remainder Theorem:27 for any polynomial $f \in F[x]$ and element $a \in F$, we have $f(a) = 0$ if and only if $f$ has remainder 0 when divided by $(x - a)$. A consequence of this is that $f$ has at most $\deg(f)$ distinct roots in $F$.28 It also follows that any finite subgroup of $U(F) = F \setminus \{0\}$, the group of units of a field $F$, must be cyclic,29 and that $U(\mathbb{F}_p) \cong \mathbb{Z}_{p-1}$,30 where $\mathbb{F}_p$ is the finite field of order $p$.
The ring $\mathbb{Z}[i]$ of Gaussian integers is also a Euclidean domain, via the valuation $v(a + bi) = a^2 + b^2$.31

25 Theorem 10.5, page 396.
26 Example 10.6, page 397.
27 Theorem 10.7, page 398.
28 Corollary 10.8, page 398.
29 Proposition 10.9, page 398.
30 Corollary 10.10, page 399.
31 Proposition 10.11, page 399.
Next we generalise the notion of a divisor to an arbitrary integral domain $R$:32 for any $a, b \in R$, we say that $a$ divides $b$, or is a factor or divisor of $b$, if there exists some $r \in R$ with $b = ra$. We denote this by $a \mid b$. We can generalise this further to noncommutative rings, where we distinguish between left, right and two-sided divisors. For any elements $a$ and $b$ of an integral domain $R$, the element $a$ divides $b$ if and only if $b \in \langle a \rangle$, if and only if $\langle b \rangle \subseteq \langle a \rangle$.
If two elements $a$ and $b$ divide each other, then we say they are associates,33 denoted $a \sim b$. Two such elements are associates if and only if $\langle a \rangle = \langle b \rangle$, or equivalently when one is equal to the other multiplied by a unit.34 Association is an equivalence relation.35

32 Definition 10.13, page 401.
33 Definition 10.15, page 402.
34 Proposition 10.16, page 402.
35 Proposition 10.17, page 402.

References and further reading

Exercises
10.1 List all the primitive polynomials of degree 3 in $\mathbb{Z}_5[x]$.
10.2 Show that the ring $\mathbb{Z}[\omega]$ of Eisenstein integers is a Euclidean domain via the valuation $v(a + b\omega) = a^2 - ab + b^2$.
10.3 Let $F$ be a field, and suppose that $f \in F[x]$ is irreducible. Show that $F[x]/\langle f \rangle$ is a field.
11 Polynomials

I'm very well acquainted, too, with matters mathematical,
I understand equations, both the simple and quadratical,
About binomial theorem I'm teeming with a lot o' news,
With many cheerful facts about the square of the hypotenuse.
W S Gilbert (1836–1911) and Arthur Sullivan (1842–1900),
The Major General's Song from The Pirates of Penzance (1879)

The problem of finding solutions to polynomial equations is an old one with an interesting and occasionally scandalous or tragic history. In this chapter we will study the factorisation problem for polynomials over a field, by introducing the idea of a field extension, which will lead us to investigate field automorphisms and construct the Galois group of a polynomial. This group, named after and originally introduced by the young French mathematician Évariste Galois (1811–1832), enables us to determine whether the polynomial has an algebraic solution or solution by radicals.

11.1 Irreducible polynomials

Matter, though divisible in an extreme degree, is nevertheless not infinitely divisible. That is, there must be some point beyond which we cannot go in the division of matter. The existence of these ultimate particles of matter can scarcely be doubted, though they are probably much too small ever to be exhibited by microscopic improvements. I have chosen the word atom to signify these ultimate particles in preference to particle, molecule, or any other diminutive term, because I conceive it is much more expressive; it includes in itself the notion of indivisible, which the other terms do not.
John Dalton (1766–1844),
lecture at the Royal Institution, 30 January 1810

In this chapter we will mainly be concerned with polynomials over fields; that is, elements of $F[x]$ where $F$ is a field. We know from Theorem 10.5 that $F[x]$ is a Euclidean domain, and the Remainder Theorem1 tells us that finding the roots of such a polynomial is equivalent to factorising it into linear factors. And since $F[x]$ is a Euclidean domain, it must also be a UFD, so any such irreducible factorisation is unique up to permutation and multiplication by units. So to properly study the solution of polynomial equations, we need to understand the irreducible elements in $F[x]$.

1 Theorem 10.7, page 398.

Gauss' Lemma, as stated in Proposition 10.51, says that the products of primitive polynomials (those whose coefficients are coprime) are also primitive. Proposition 10.55 tells us that if $R$ is a UFD and $Q = Q(R)$ is its field of quotients, then a primitive polynomial $f \in R[x]$ is irreducible in $R[x]$ if and only if it is irreducible in $Q[x]$. (The special case when $R = \mathbb{Z}$ and $Q = \mathbb{Q}$ is also sometimes called Gauss' Lemma.)
So, by Corollary 10.56, an irreducible element in $R[x]$ is either irreducible in $R$ or a primitive polynomial in $R[x]$ that is also irreducible in $Q[x]$. Another important result on irreducible polynomials is due to the German mathematician Gotthold Eisenstein (1823–1852).


Proposition 11.1 (Eisenstein's Criterion) Suppose that
$f = a_n x^n + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$
is a polynomial with integer coefficients, and that there is a prime $p$ such that
(i) $p \nmid a_n$,
(ii) $p \mid a_{n-1}, \ldots, a_0$, and
(iii) $p^2 \nmid a_0$.
Then $f$ is irreducible over $\mathbb{Q}$.
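The three hypotheses are simple divisibility conditions, so they can be tested mechanically. The following sketch lists the primes $p$ for which the criterion applies to a given polynomial; the coefficient list runs from $a_0$ up to $a_n$, and the names `prime_divisors` and `eisenstein_primes` are illustrative. An empty result means only that the criterion is inconclusive, not that the polynomial is reducible.

```python
def prime_divisors(n):
    """Distinct prime divisors of a positive integer n, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def eisenstein_primes(coeffs):
    """Primes p satisfying hypotheses (i)-(iii) for coeffs = [a_0, a_1, ..., a_n]."""
    a0, an = coeffs[0], coeffs[-1]
    found = []
    for p in prime_divisors(abs(a0)):          # any suitable p must divide a_0
        if (an % p != 0                                   # (i)   p does not divide a_n
                and all(a % p == 0 for a in coeffs[:-1])  # (ii)  p divides a_{n-1}, ..., a_0
                and a0 % (p * p) != 0):                   # (iii) p^2 does not divide a_0
            found.append(p)
    return found

print(eisenstein_primes([3, 6, 0, 2]))   # 2x^3 + 6x + 3
print(eisenstein_primes([1, 0, 1]))      # x^2 + 1: criterion does not apply directly
```

Restricting the search to prime divisors of $a_0$ loses nothing, because hypothesis (ii) requires $p \mid a_0$.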

Proof If we can show that $f$ is irreducible over $\mathbb{Z}$, then Gauss' Lemma ensures that it will also be irreducible over $\mathbb{Q}$.

Suppose, then, that $f$ is not irreducible, and that there exists a prime $p$ satisfying the hypotheses. Then $f = gh$ for some polynomials
$g = b_l x^l + \cdots + b_1 x + b_0$ and $h = c_m x^m + \cdots + c_1 x + c_0$
in $\mathbb{Z}[x]$. Since $a_0 = b_0 c_0$ and $p \mid a_0$, we know that either $p \mid b_0$ or $p \mid c_0$. But $p$ can't divide both $b_0$ and $c_0$, since if it did we would have $p^2 \mid a_0$, which the third hypothesis disallows. We can assume without loss of generality that $p \mid b_0$ and $p \nmid c_0$.

Furthermore, the leading coefficient $a_n = b_l c_m$ is not divisible by $p$, so $p \nmid b_l$ and $p \nmid c_m$.

Now let $b_i$ be the first coefficient in $g$ that isn't divisible by $p$, so that $p \mid b_0, \ldots, b_{i-1}$. If $i < n$ then $p \mid a_i$ by the second hypothesis. Also, since
$b_i c_0 = a_i - (b_{i-1} c_1 + \cdots + b_0 c_i)$,
then we must have $p \mid b_i c_0$. But this can't happen, because $p \nmid b_i$ and $p \nmid c_0$. So instead $i = n$, which means that $\deg(g) = n$ and $\deg(h) = 0$. Therefore $h$ is a nonzero constant $c_0 \in \mathbb{Z}$, and so $f = c_0 g$.

If $f$ is irreducible over $\mathbb{Z}$ then by Gauss' Lemma it is irreducible over $\mathbb{Q}$; if not then it has the form $f = c_0 g$ for some nonzero $c_0 \in \mathbb{Z}$, which is a unit in $\mathbb{Q}$, and therefore $f$ is again irreducible over $\mathbb{Q}$.

Wikimedia Commons / Unknown artist
Ferdinand Gotthold Max Eisenstein (1823–1852) was born in Berlin and displayed a strong aptitude in both mathematics and music from an early age. At 17, having exhausted the school mathematical curriculum, he began attending lectures at the University of Berlin, and formally enrolled in 1843.
After graduation, he met the geographer Alexander von Humboldt (1769–1859), who managed to obtain grants to support his career. He also visited Carl Friedrich Gauss (1777–1855), who was impressed: the mathematical historian Moritz Cantor (1829–1920), one of Gauss' last students, later wrote that he once remarked in conversation that there had been only three epoch-making mathematicians: Archimedes, Newton and Eisenstein.
Over the rest of his life, he made many contributions to the study of quadratic and cubic forms, and to reciprocity laws for prime numbers. He was also an accomplished pianist and composer.
Eisenstein's physical health was delicate throughout his life, due to meningitis he contracted during childhood (and which also killed his five siblings); also he suffered from depression, a legacy of an overly harsh school environment. In 1848 he found himself in the middle of a political demonstration and was arrested. Although released the next day, the incident contributed to a further decline in his health, and caused the loss of some of his funding. He died of tuberculosis in 1852, at the age of 29; Humboldt led the funeral procession.

Example 11.2 The polynomial $f = 2x^3 + 6x + 3$ is irreducible over $\mathbb{Q}$: setting $p = 3$ we have $p \nmid a_3$, while $p \mid a_2 = 0$, $p \mid a_1$ and $p \mid a_0$, but $p^2 \nmid a_0$.

Another useful technique for checking irreducibility is reduction modulo $p$, for some carefully chosen prime $p$. For any $n \in \mathbb{N}$ there is a canonical quotient homomorphism $q_n : \mathbb{Z} \to \mathbb{Z}_n$ mapping each integer $k \in \mathbb{Z}$ to its modulo-$n$ residue class $[k]_n$; applying it to each coefficient extends it to polynomials. Now suppose that a polynomial $f \in \mathbb{Z}[x]$ is reducible over $\mathbb{Z}$; that is, there exist polynomials $g, h \in \mathbb{Z}[x]$ of positive degree such that $f = gh$. Then, provided $n$ does not divide the leading coefficient of $f$ (so that no degrees drop), $q_n(f) = q_n(g) q_n(h)$ is also reducible over $\mathbb{Z}_n$. Contrapositively, if $q_n(f)$ isn't reducible over $\mathbb{Z}_n$, then $f$ must be irreducible over $\mathbb{Z}$ (and hence also over $\mathbb{Q}$).


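Example 11.3 Let $f = 2x^2 + 6x + 15$. By Eisenstein's Criterion we can show this is irreducible over $\mathbb{Q}$ by considering $p = 3$.
Alternatively, reducing modulo 13 (which does not divide the leading coefficient) gives $q_{13}(f) = 2x^2 + 6x + 2$, which is either irreducible, or factorises as two linear terms $(ax + b)(cx + d)$ with $a, b, c, d \in \mathbb{Z}_{13}$. Comparing coefficients, we see that $ac = 2$, $bd = 2$, and $ad + bc = 6$. A routine verification confirms that there are no modulo-13 solutions to these three equations, and therefore $f$ is again irreducible over $\mathbb{Z}$ and thereby $\mathbb{Q}$. (The prime must be chosen with care: modulo 7, for instance, $f$ reduces to $2x^2 + 6x + 1 = 2(x - 2)^2$, which is reducible and so tells us nothing.)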
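The two tests used in Examples 11.2 and 11.3 are easy to mechanise. The following is a minimal Python sketch, not part of the original text: the function names `eisenstein_applies` and `roots_mod_p` are illustrative, polynomials are stored as coefficient lists $[a_0, \ldots, a_n]$, and the choice of the prime 13 is an assumption made for the demonstration. It relies on the fact that a quadratic or cubic over $\mathbb{Z}_p$ whose leading coefficient is nonzero modulo $p$ is irreducible exactly when it has no roots in $\mathbb{Z}_p$.

```python
def eisenstein_applies(coeffs, p):
    """coeffs = [a_0, a_1, ..., a_n]; test hypotheses (i)-(iii) for the prime p."""
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0                                 # (i)   p does not divide a_n
            and all(a % p == 0 for a in coeffs[:-1])    # (ii)  p divides a_{n-1}, ..., a_0
            and a0 % (p * p) != 0)                      # (iii) p^2 does not divide a_0


def roots_mod_p(coeffs, p):
    """All residues r in Z_p with f(r) = 0 (mod p), found by brute force."""
    return [r for r in range(p)
            if sum(a * pow(r, k, p) for k, a in enumerate(coeffs)) % p == 0]


f = [15, 6, 2]                       # f = 2x^2 + 6x + 15
print(eisenstein_applies(f, 3))      # True: the hypotheses hold for p = 3
print(roots_mod_p(f, 13))            # []: no roots modulo 13
```

Because the root search is exhaustive, it is only sensible for small primes, and the absence of roots only settles irreducibility for polynomials of degree 2 or 3.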

For completeness, we state the following well-known definition:

Definition 11.4 Suppose that $f \in R[x]$ is a polynomial over some integral domain $R$. An element $\alpha \in R$ is a root or zero of $f$ if $f(\alpha) = 0$.

By the Remainder Theorem,[2] the problem of finding roots of a polynomial is equivalent to that of factorising the polynomial into linear factors. For a quadratic polynomial
$ax^2 + bx + c$ (11.1)
the formula
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ (11.2)
has been well known for centuries.[3] The solutions given by this formula lie in $\mathbb{C}$, so the polynomial (11.1) can always be factorised into linear factors over $\mathbb{C}$, but not always over some subfield of $\mathbb{C}$.
Note, however, that Eisenstein's Criterion is a sufficient, but not a necessary condition for a polynomial $f \in \mathbb{Z}[x]$ to be irreducible. We'll now look at a couple of other useful tests for irreducibility, both of which are discussed in more detail in the article[4] by Ram Murty.

[2] Theorem 10.7, page 398.
[3] The earliest known examples of a general solution of what we would now regard as a quadratic equation occur in cuneiform tablets dated to the Sumerian Ur III period (c.2000BC), and the Berlin Papyrus, dated to the Middle Kingdom period (c.2050–c.1650BC) of Egyptian history. Further developments by Babylonian, Chinese, Indian and Greek mathematicians followed over the next three thousand years, and the Indian mathematician Brahmagupta (598–c.670AD) gave probably the first known explicit solution in a form we would recognise today. The first person to give a fully general formula to solve an arbitrary quadratic equation was probably the Flemish mathematician and engineer Simon Stevin (1548–1620) in 1594.
[4] M Ram Murty, Prime numbers and irreducible polynomials, The American Mathematical Monthly 109.5 (May 2002) 452–458.

Proposition 11.5 Let $f = a_m x^m + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$, and set $N = \max\{|a_i/a_m| : 0 \leq i < m\}$. If $f(n)$ is prime for some integer $n \geq N+2$ then $f$ is irreducible over $\mathbb{Z}$.
To prove this, we need the following lemma.
Lemma 11.6 Let $f = a_m x^m + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$ and set $N = \max\{|a_i/a_m| : 0 \leq i < m\}$. Then $|\alpha| < N+1$ for any root $\alpha \in \mathbb{C}$ of $f$.

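Proof Since $f(\alpha) = a_m \alpha^m + \cdots + a_1 \alpha + a_0 = 0$, we have
$-a_m \alpha^m = a_{m-1} \alpha^{m-1} + \cdots + a_1 \alpha + a_0$
and so, for $|\alpha| \neq 1$,
$|\alpha|^m \leq N\left(|\alpha|^{m-1} + \cdots + |\alpha| + 1\right) = N \dfrac{|\alpha|^m - 1}{|\alpha| - 1}.$
If $|\alpha| \leq 1$ then $|\alpha| < N+1$ and the lemma follows; if $|\alpha| > 1$ then
$(|\alpha| - 1)|\alpha|^m \leq N(|\alpha|^m - 1) < N|\alpha|^m,$
so $|\alpha|^{m+1} < (N+1)|\alpha|^m$, and hence $|\alpha| < N+1$.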
Proof of Proposition 11.5 Suppose that $f$ is reducible in $\mathbb{Z}[x]$; that is, $f = gh$ for some polynomials
$g = b_k x^k + \cdots + b_1 x + b_0$ and $h = c_l x^l + \cdots + c_1 x + c_0$
in $\mathbb{Z}[x]$, both with positive degree. Since $f(n)$ is prime, either $g(n)$ or $h(n)$ must be equal to $\pm 1$. Suppose, without loss of generality, that $g(n) = \pm 1$. Then $g(x) = b_k \prod_i (x - \alpha_i)$, where $i$ ranges over the indices of the complex zeros of $g$, which are themselves a subset of the zeros of $f$. By Lemma 11.6 we have
$1 = |g(n)| \geq \prod_i (n - |\alpha_i|) > \prod_i (n - (N+1)) \geq 1,$
which is a contradiction. Thus $f$ must be irreducible.

Example 11.7 The polynomial $f = x^4 - 4x^2 + 16 \in \mathbb{Z}[x]$ cannot be shown to be irreducible by Eisenstein's Criterion: the only possible prime we can try is $p = 2$, and although conditions (i) and (ii) are satisfied, condition (iii) isn't.
Using Proposition 11.5, however, we just need to find a positive integer $n \geq (16+2) = 18$ such that $f(n)$ is prime. After a few attempts (which can be easily automated with a very short computer program) we find that $f(23) = 277741$, which is prime. Hence $f$ is irreducible over $\mathbb{Z}$ (and therefore also over $\mathbb{Q}$).
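The "very short computer program" mentioned in Example 11.7 might look something like the following Python sketch. The helper names (`is_prime`, `evaluate`, `first_prime_value`) and the search cutoff `limit` are illustrative assumptions rather than anything prescribed by the text; primality is tested by naive trial division, which is adequate for values of this size.

```python
def is_prime(k):
    """Trial division; adequate for the small values used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def evaluate(coeffs, n):
    """Horner evaluation of f at n, with coeffs = [a_0, ..., a_m]."""
    value = 0
    for a in reversed(coeffs):
        value = value * n + a
    return value


def first_prime_value(coeffs, limit=1000):
    """Smallest integer n >= N + 2 (N as in Proposition 11.5) with f(n) prime, or None."""
    am = abs(coeffs[-1])
    # ceiling of max |a_i / a_m|, so that the integer n satisfies n >= N + 2
    bound = max(-(-abs(a) // am) for a in coeffs[:-1])
    for n in range(bound + 2, limit):
        if is_prime(evaluate(coeffs, n)):
            return n
    return None


f = [16, 0, -4, 0, 1]                # f = x^4 - 4x^2 + 16
n = first_prime_value(f)
print(n, evaluate(f, n))             # 23 277741
```

A return value of `None` proves nothing either way: it only means that no prime value was found below the cutoff.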
A related irreducibility test is originally due to Arthur Cohn (1894–1940), a doctoral student of the algebraist Issai Schur (1875–1941) at the University of Berlin.[5] Cohn's original version was stated in base 10, but it was later generalised to arbitrary bases.[6]

[5] G Pólya and G Szegő, Problems and Theorems in Analysis, Springer (1976), volume II, page 133.
[6] J Brillhart, M Filaseta, and A Odlyzko, On an irreducibility theorem of A Cohn, Canadian Journal of Mathematics 33 (1981) 1055–1059.

Proposition 11.8 (Cohn's Irreducibility Criterion) If a prime integer $p$ can be expressed in base-$b$ notation for $b > 1$ as
$p = a_m b^m + a_{m-1} b^{m-1} + \cdots + a_1 b + a_0,$
with $0 \leq a_i < b$ for $0 \leq i \leq m$, then the polynomial $f = a_m x^m + \cdots + a_1 x + a_0$ is irreducible over $\mathbb{Z}$.
The proof is somewhat involved, relying on a couple of technical lemmas, and would be a little too much of a digression here, but the details can be found in the article by Brillhart, Filaseta and Odlyzko. This article also contains a similar test that can be applied to polynomials with some negative coefficients, while Cohn's original test and the generalised version above can't.
Example 11.9 Let $f = x^4 + 6x^2 + 16$. Cohn's original, base-10 test is inapplicable, because the constant term $16 > 10$. But we can apply the generalised test and search for an integer $b > 16$ for which $f(b) = b^4 + 6b^2 + 16$ is prime. Again, this can be done very quickly with a short computer program, and we find that $b = 31$ works, since $f(31) = 929303$ is prime. Hence $f$ is irreducible over $\mathbb{Z}$.

Proposition 11.8 is a partial converse of Bunyakovsky's Conjecture, originally stated in 1857 by the Russian mathematician Viktor Bunyakovsky (1804–1889), and which is still an open question.
Conjecture 11.10 (Bunyakovsky's Conjecture) Let $f \in \mathbb{Z}[x]$ be an irreducible, primitive polynomial of positive degree, with positive leading coefficient, whose values $f(n)$ for $n \in \mathbb{Z}$ have no common factor greater than 1. Then there are infinitely many values of $n \in \mathbb{Z}$ for which $f(n)$ is prime.

We can also use Cohn's Criterion to find irreducible polynomials:

Example 11.11 Expressing the prime $p = 97$ in different numerical bases $b > 1$, we obtain the following list of representations, which correspond to irreducible polynomials in $\mathbb{Z}[x]$.
$1100001_2 \mapsto x^6 + x^5 + 1$,    $10121_3 \mapsto x^4 + x^2 + 2x + 1$,
$1201_4 \mapsto x^3 + 2x^2 + 1$,    $342_5 \mapsto 3x^2 + 4x + 2$,
$241_6 \mapsto 2x^2 + 4x + 1$,    $166_7 \mapsto x^2 + 6x + 6$,
$141_8 \mapsto x^2 + 4x + 1$,    $117_9 \mapsto x^2 + x + 7$,
$97_{10} \mapsto 9x + 7$,    $89_{11} \mapsto 8x + 9$,    ...
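The list in Example 11.11 can be regenerated mechanically: the base-$b$ digits of the prime are exactly the coefficients of the corresponding polynomial. Here is a small Python sketch under that reading; the helper names `digits` and `show` are hypothetical and chosen only for this illustration.

```python
def digits(p, b):
    """Base-b digits of p, least significant first: p = sum(d[k] * b**k)."""
    ds = []
    while p > 0:
        p, r = divmod(p, b)
        ds.append(r)
    return ds


def show(coeffs):
    """Render the coefficient list [a_0, ..., a_m] as a polynomial in x."""
    terms = []
    for k in range(len(coeffs) - 1, -1, -1):
        a = coeffs[k]
        if a == 0:
            continue
        power = "" if k == 0 else "x" if k == 1 else f"x^{k}"
        coeff = "" if (a == 1 and k > 0) else str(a)
        terms.append(coeff + power)
    return " + ".join(terms)


for b in range(2, 12):
    print(b, show(digits(97, b)))    # e.g. base 5 gives 3x^2 + 4x + 2
```

Cohn's Criterion then guarantees that each printed polynomial is irreducible over $\mathbb{Z}$, because 97 is prime and every digit is smaller than the base.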

As with Eisenstein's Criterion, there exist polynomials in $\mathbb{Z}[x]$ that are irreducible but which aren't detected by either Proposition 11.5 or Cohn's Criterion. For example, $x^2 + 3x + 4$ is irreducible over $\mathbb{Z}$, but evaluates to an even integer greater than 2 (and thus non-prime) for any non-negative integer $x$.
Polynomials of the form $x^n - 1$ are not irreducible, but their roots yield an important class of irreducible polynomials that will become useful later on.
Definition 11.12 An $n$th root of unity is a root of the polynomial $x^n - 1$. They form a multiplicative group $\{e^{2k\pi i/n} : 0 \leq k < n\} \cong \mathbb{Z}_n$. By Proposition 1.26 this group is generated by any single element $\zeta = e^{2k\pi i/n}$ where $\gcd(k, n) = 1$; we call such generators primitive $n$th roots of unity. (Equivalently, $\zeta$ is primitive if $\zeta^n = 1$ but $\zeta^k \neq 1$ for $1 \leq k < n$.)

Let $\zeta = e^{2\pi i/n}$. Then
$x^n - 1 = (x - 1)(x - \zeta)(x - \zeta^2) \cdots (x - \zeta^{n-1})$
which is clearly not irreducible over $\mathbb{Q}$ since it has a factor $(x - 1) \in \mathbb{Q}[x]$. We want the largest irreducible factor whose roots are the primitive roots of unity, from which we can recover all of the others.
Definition 11.13 Let $P_n = \{e^{2k\pi i/n} : 1 \leq k \leq n,\ \gcd(k, n) = 1\}$ be the set of primitive $n$th roots of unity. We define the $n$th cyclotomic polynomial
$\Phi_n = \prod_{\zeta \in P_n} (x - \zeta) = \prod_{\gcd(k,n)=1} \left(x - e^{2k\pi i/n}\right).$

We'll look at a few examples.

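Example 11.14 The cube roots of unity are
$\zeta_0 = e^0 = 1$,  $\zeta_1 = e^{2\pi i/3} = -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i$,  $\zeta_2 = e^{4\pi i/3} = -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i$,
of which $\zeta_1$ and $\zeta_2$ are primitive. So
$\Phi_3 = \left(x + \tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i\right)\left(x + \tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i\right) = x^2 + x + 1.$

Example 11.15 The sixth roots of unity are
$\zeta_0 = e^0 = 1$,  $\zeta_3 = e^{\pi i} = -1$,
$\zeta_1 = e^{\pi i/3} = \tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i$,  $\zeta_4 = e^{4\pi i/3} = -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i$,
$\zeta_2 = e^{2\pi i/3} = -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i$,  $\zeta_5 = e^{5\pi i/3} = \tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i$.
Of these, $\zeta_1$ and $\zeta_5$ are primitive, so
$\Phi_6 = \left(x - \tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i\right)\left(x - \tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i\right) = x^2 - x + 1.$

In particular, $\Phi_6 = \Phi_3(-x)$; more generally, $\Phi_{2n} = \Phi_n(-x)$ for any odd integer $n > 1$, as we'll see shortly. The next example is the first which has a coefficient other than 0 or $\pm 1$.
Example 11.16
$\Phi_{105} = x^{48} + x^{47} + x^{46} - x^{43} - x^{42} - 2x^{41} - x^{40} - x^{39} + x^{36}$
$\qquad + x^{35} + x^{34} + x^{33} + x^{32} + x^{31} - x^{28} - x^{26} - x^{24} - x^{22} - x^{20}$
$\qquad + x^{17} + x^{16} + x^{15} + x^{14} + x^{13} + x^{12} - x^9 - x^8 - 2x^7 - x^6$
$\qquad - x^5 + x^2 + x + 1$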

More generally, we have the following:


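Proposition 11.17 Let $n \in \mathbb{N}$.
(i) If $n$ is prime then $\Phi_n = x^{n-1} + x^{n-2} + \cdots + x + 1$;
(ii) if $n > 1$ is odd then $\Phi_{2n} = \Phi_n(-x)$;
(iii) $\Phi_n$ is monic with degree $\deg \Phi_n = \varphi(n)$;[7] and
(iv) $x^n - 1 = \prod_{d \mid n} \Phi_d$.

[7] Here $\varphi(n)$ denotes Euler's totient function (Definition 2.44, page 63).

Proof (i) If $n$ is prime, then $e^{2k\pi i/n}$ is a primitive $n$th root of unity for $1 \leq k < n$. That is, all roots of $x^n - 1$ except for $x = 1 = e^0$ are primitive. So
$\Phi_n = (x^n - 1)/(x - 1) = x^{n-1} + x^{n-2} + \cdots + x + 1.$
(ii) If $\zeta$ is a primitive $n$th root of unity, then $-\zeta$ is a primitive $2n$th root of unity, since $(-\zeta)^{2n} = (\zeta^n)^2 = 1$ and, $n$ being odd, no smaller positive power of $-\zeta$ equals 1. Writing $\zeta_1, \ldots, \zeta_k$ for the primitive $n$th roots of unity, where $k = \varphi(n) = \varphi(2n)$ is even because $n > 1$ is odd, we have
$\Phi_{2n} = (x + \zeta_1) \cdots (x + \zeta_k) = (-x - \zeta_1) \cdots (-x - \zeta_k) = \Phi_n(-x).$
(iii) From Definition 11.13 we have
$\Phi_n = \prod_{\gcd(k,n)=1} \left(x - e^{2k\pi i/n}\right).$
This has $\varphi(n)$ factors, so its highest term is $x^{\varphi(n)}$.
(iv) Every $n$th root of unity is a primitive $d$th root of unity for some unique $d \mid n$. So all the roots of $x^n - 1$ are roots of a cyclotomic polynomial $\Phi_d$ for some $d \mid n$, and hence $x^n - 1 = \prod_{d \mid n} \Phi_d$ as claimed.
Not only is $\Phi_n$ monic, its coefficients are all integers. To prove this we need the following fact.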
Proposition 11.18 Let $E$ and $F$ be fields with $F \subseteq E$, and suppose $f, g \in E[x]$ with $f \neq 0$ and $f, fg \in F[x]$. Then $g \in F[x]$ too.

Proof Suppose that
$f = a_m x^m + \cdots + a_1 x + a_0,$
$g = b_n x^n + \cdots + b_1 x + b_0$
and $fg = c_{m+n} x^{m+n} + \cdots + c_1 x + c_0,$
where $a_0, \ldots, a_m, c_0, \ldots, c_{m+n} \in F$ and $b_0, \ldots, b_n \in E$, with both $a_m$ and $b_n$ nonzero. We want to show that $b_0, \ldots, b_n$ also lie in $F$, and we prove this by induction.
First, observe that $c_{m+n} = a_m b_n$, and hence $b_n = c_{m+n}/a_m \in F$.
Now suppose that $b_i \in F$ for all $i > k$. Then
$c_{m+k} = a_m b_k + a_{m-1} b_{k+1} + \cdots + a_{m-n+k} b_n.$
(Here, let $a_i = 0$ for $i < 0$.) Hence
$b_k = (c_{m+k} - a_{m-1} b_{k+1} - \cdots - a_{m-n+k} b_n)/a_m$
which also lies in $F$. By decreasing induction on $k$, we have $b_i \in F$ for $0 \leq i \leq n$, and hence $g \in F[x]$.
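Proposition 11.19 The cyclotomic polynomial $\Phi_n \in \mathbb{Z}[x]$ for all $n \in \mathbb{N}$.

Proof We prove this by induction on $n$. First, observe that $\Phi_1 = x - 1 \in \mathbb{Z}[x]$.
Now suppose that $\Phi_n \in \mathbb{Z}[x]$ for all $n < k$. From Proposition 11.17(iv) we have
$x^k - 1 = \prod_{i \mid k} \Phi_i = \Phi_k \prod_{i \mid k,\, i < k} \Phi_i.$
Here, $x^k - 1 \in \mathbb{Z}[x]$, and by the induction hypothesis $\prod_{i \mid k,\, i < k} \Phi_i \in \mathbb{Z}[x]$ too. So, by Proposition 11.18 (with $F = \mathbb{Q}$ and $E = \mathbb{C}$) we have $\Phi_k \in \mathbb{Q}[x]$. Moreover, the product $\prod_{i \mid k,\, i < k} \Phi_i$ is monic, so every division in the proof of Proposition 11.18 is a division by $a_m = 1$, and the coefficients of $\Phi_k$ are therefore integers. Thus $\Phi_n \in \mathbb{Z}[x]$ for all $n \in \mathbb{N}$.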
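The induction in this proof doubles as an algorithm: $\Phi_n$ can be obtained by dividing $x^n - 1$ by $\Phi_d$ for each proper divisor $d$ of $n$, and since every divisor is monic the arithmetic never leaves the integers. The following Python sketch illustrates this; the function names `poly_divide_exact` and `cyclotomic` are illustrative, and polynomials are stored as coefficient sequences $(c_0, \ldots, c_d)$.

```python
from functools import lru_cache


def poly_divide_exact(num, den):
    """Quotient of integer polynomials [c_0, ..., c_d]; den must be monic and divide num."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for k in range(len(quotient) - 1, -1, -1):
        q = num[k + len(den) - 1]      # current leading coefficient (den is monic)
        quotient[k] = q
        for j, d in enumerate(den):
            num[k + j] -= q * d        # subtract q * x^k * den
    assert not any(num), "division was not exact"
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n as a tuple of integer coefficients, via x^n - 1 = prod_{d | n} Phi_d."""
    result = [-1] + [0] * (n - 1) + [1]            # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            result = poly_divide_exact(result, cyclotomic(d))
    return tuple(result)


print(cyclotomic(3))              # (1, 1, 1), that is, x^2 + x + 1
print(cyclotomic(6))              # (1, -1, 1), that is, x^2 - x + 1
print(min(cyclotomic(105)))       # -2, the coefficient noted in Example 11.16
```

Caching the results with `lru_cache` avoids recomputing $\Phi_d$ for divisors shared between different values of $n$.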
The cyclotomic polynomials are irreducible. The general case is a little complicated, and requires discussion of splitting fields, which we won't meet until later in Section 11.2, but we can prove the case for $n$ prime now by a nice application of Eisenstein's Criterion.
Proposition 11.20 Let $n$ be prime. Then the cyclotomic polynomial $\Phi_n$ is irreducible over $\mathbb{Z}$.

Proof If $n$ is prime, then by Proposition 11.17(i) we have
$\Phi_n = (x^n - 1)/(x - 1) = x^{n-1} + x^{n-2} + \cdots + x + 1.$
We can't apply Eisenstein's Criterion directly, but $\Phi_n(x)$ is irreducible if and only if $\Phi_n(x+1)$ is irreducible. Then
$\Phi_n(x+1) = ((x+1)^n - 1)/x = \sum_{i=1}^{n} \binom{n}{i} x^{i-1},$
which is a monic polynomial of degree $n - 1$. Each coefficient apart from that of the highest degree term is divisible by $n$, and the constant term $n$ is divisible by $n$ but not $n^2$. By Eisenstein's Criterion, $\Phi_n(x+1)$ is irreducible over $\mathbb{Z}$, and hence so is $\Phi_n(x)$ as claimed.
We will prove the general case later on, but we need to introduce a number of new concepts and results first.
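The divisibility claims in the proof of Proposition 11.20 are easy to inspect for a particular prime. As a brief Python sketch (the function name `shifted_cyclotomic` is an illustrative assumption), the coefficients of $\Phi_p(x+1)$ are just the binomial coefficients $\binom{p}{1}, \ldots, \binom{p}{p}$:

```python
from math import comb


def shifted_cyclotomic(p):
    """Coefficients [c_0, ..., c_{p-1}] of Phi_p(x + 1) = sum_{i=1}^{p} C(p, i) x^(i-1)."""
    return [comb(p, i) for i in range(1, p + 1)]


c = shifted_cyclotomic(7)
print(c)                                      # [7, 21, 35, 35, 21, 7, 1]
print(all(a % 7 == 0 for a in c[:-1]))        # True: hypothesis (ii) of Eisenstein's Criterion
print(c[0] % 49 != 0)                         # True: hypothesis (iii)
```

The same divisibility fails for composite exponents (for instance $\binom{4}{2} = 6$ is not divisible by 4), which is one reason this argument is restricted to prime $n$.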

11.2 Field extensions

The proof of the transcendency of $\pi$ will hardly diminish the number of circle-squarers, however; for this class of people has always shown an absolute distrust of mathematicians and a contempt for mathematics that cannot be overcome by any amount of demonstration.
Felix Klein (1849–1925),
The Evanston Colloquium, Lecture 7 (The transcendency of the numbers $e$ and $\pi$), 4 September 1893

The polynomial $x^2 + 1$ is irreducible over $\mathbb{Z}$ (and hence also $\mathbb{Q}$). If, however, we switch to the larger field $\mathbb{Q}[i] = \{a + bi : a, b \in \mathbb{Q}\}$ then it factorises quite readily as $(x + i)(x - i)$. Similarly, the polynomial $x^2 - 2$ is irreducible over $\mathbb{Q}$, but not over $\mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}$, in which it factorises as $(x + \sqrt{2})(x - \sqrt{2})$.
In some sense, $\mathbb{Q}[i]$ is the smallest possible field over which $x^2 + 1$ is reducible: no smaller subfield will do. Similarly, $\mathbb{Q}[\sqrt{2}]$ is the smallest field over which $x^2 - 2$ is reducible.
What we've done in both of these cases is to attach (or adjoin) a carefully chosen element $\alpha$ to $\mathbb{Q}$ in order to get a new field over which the polynomial is reducible. In general, however, we don't know that $\mathbb{Q}[\alpha]$ will be a field. It will certainly be an integral domain, but we also need it to have all necessary inverses.
The rational field $\mathbb{Q}$ has a privileged position in the theory we're developing in this chapter: it is the smallest nontrivial subfield of $\mathbb{C}$:
Proposition 11.21 Let $F \subseteq \mathbb{C}$ be a nontrivial subfield of $\mathbb{C}$. Then $\mathbb{Q} \subseteq F$.

Proof Since $F$ is a nontrivial field, it must contain 0 and 1 (which must be distinct) and hence we can inductively generate the positive integers $n = 1 + \cdots + 1$. Existence of additive inverses ensures that $F$ also contains the negative integers, so $\mathbb{Z} \subseteq F$. The existence of multiplicative inverses ensures that $F$ also contains rational numbers of the form $n^{-1} = \frac{1}{n}$ for all $n \in \mathbb{Z} \setminus \{0\}$, and together with the integers these multiplicatively generate the rest of the rational numbers, so $\mathbb{Q} \subseteq F$.
In the examples above, we constructed $\mathbb{Q}[i]$ and $\mathbb{Q}[\sqrt{2}]$ by including an extra element $i$ or $\sqrt{2}$. The point of this is that we've enlarged $\mathbb{Q}$ just enough to guarantee that the polynomial in question is reducible. More generally, we've defined a new field that contains $\mathbb{Q}$ as a subfield.
Definition 11.22 Let $K$ and $L$ be fields. Then $L$ is an extension or extension field of $K$ if there exists an injective field homomorphism $i : K \hookrightarrow L$ including $K$ as a subfield of $L$. We denote this by $L : K$.

In particular, we're interested in finding out just how far we have to extend $\mathbb{Q}$ so that a given irreducible polynomial $f \in \mathbb{Q}[x]$ can be reduced. In the case of $x^2 + 1$ the corresponding extension is $\mathbb{Q}[i] : \mathbb{Q}$ and for $x^2 - 2$ it's $\mathbb{Q}[\sqrt{2}] : \mathbb{Q}$. More generally, we need the following definition.
Definition 11.23 Let $F$ be a field. Then
$F(x) = \{p/q : p, q \in F[x],\ q \neq 0\}$
is the field of rational functions over $F$; that is, the field of quotients $Q(F[x])$ of the polynomial ring $F[x]$.
Now let $\alpha \in K$ for some field $K$ such that $F \subseteq K$. Then
$F(\alpha) = \{p(\alpha)/q(\alpha) : p, q \in F[x],\ q(\alpha) \neq 0\}$
is the field obtained by adjoining $\alpha$ to $F$.

It so happens that $\mathbb{Q}[i] = \mathbb{Q}(i)$ and $\mathbb{Q}[\sqrt{2}] = \mathbb{Q}(\sqrt{2})$, but it isn't necessarily the case that $F[\alpha] = F(\alpha)$. We will usually use the first of these to denote
$F[\alpha] = \{a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_n \alpha^n : n \geq 0,\ a_0, \ldots, a_n \in F\}$
which in general is only guaranteed to be a ring.
So, in order to obtain a field in which $x^2 + 1$ is reducible, we only have to adjoin $i$ to $\mathbb{Q}$ to get $\mathbb{Q}(i)$. The field extension we're interested in is therefore $\mathbb{Q}(i) : \mathbb{Q}$. Similarly, with $x^2 - 2$ we want to study the field extension $\mathbb{Q}(\sqrt{2}) : \mathbb{Q}$.
Definition 11.24 Suppose that $L : K$ is a field extension with $L = K(\alpha)$ for some $\alpha \in L$. Then $L : K$ is called a simple extension, and $\alpha$ is said to be a primitive element of the extension.
We can also define $F(\alpha, \beta) = F(\alpha)(\beta)$; that is, the field obtained from $F$ by first adjoining $\alpha$ and then adjoining $\beta$ to the result. This enables us to define $F(S)$ for some finite set $S \subseteq K$.
Some simple field extensions might not look simple at first glance:

Example 11.25 The field extension $\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}$ is simple, despite not initially appearing to be so. To see this, we have to find an element $\alpha \in \mathbb{Q}(\sqrt{3}, i)$ such that $\mathbb{Q}(\sqrt{3}, i) = \mathbb{Q}(\alpha)$. But $\alpha = \sqrt{3} + i$ is one such element. Certainly $\mathbb{Q}(\alpha) \subseteq \mathbb{Q}(\sqrt{3}, i)$, so we only need to show that $\mathbb{Q}(\sqrt{3}, i) \subseteq \mathbb{Q}(\alpha)$.
But it so happens that $i = \frac{1}{8}\alpha^3$ and hence $\sqrt{3} = \alpha - \frac{1}{8}\alpha^3$. Hence any element of $\mathbb{Q}(\sqrt{3}, i)$ can be obtained as a polynomial expression in $\alpha$ with rational coefficients, and so $\mathbb{Q}(\sqrt{3}, i) = \mathbb{Q}(\alpha)$.

Wikimedia Commons / Anonymous Dutch artist
The Dutch mathematician and engineer Simon Stevin (c.1548–1620) is believed to have been born in Bruges in approximately 1548 to an unmarried couple named Anton Stevin and Catelyne van der Poort. Little is known of his early life, but after working as a clerk in Antwerp and Bruges, and travelling throughout Poland, Prussia and Norway during the early 1570s, he entered the University of Leiden in 1583 at the age of 35. During this period he became friends with Prince Maurits, Count of Nassau (1567–1625), whom he also tutored in mathematics. When Maurits' father William I, Prince of Orange, was assassinated in 1584, Maurits was appointed head of the army, and Stevin became one of his trusted advisors.
He wrote eleven books, including contributions to mathematics and engineering. In De Thiende (The tenth), published in 1585, he introduced decimal fractions to a Western audience for the first time. In his 1594 book Arithmetic, he presented the first known general solution for the quadratic equation.
In 1586 he devised an improved method of using windmills to pump water from marshes and flood plains that trebled the efficiency of the pumping process. In about 1600 he invented a land yacht: a sail-propelled chariot that could overtake galloping horses. Partly in honour of Stevin's achievement, mechanical engineering students at Eindhoven University of Technology regularly build, design and race land yachts at speeds of up to 70km/h.

Since $\frac{1}{8}\alpha^3 = i$, it is also the case that $\alpha^6 + 64 = 0$. Equivalently, $\alpha$ is a root of the polynomial $x^6 + 64 \in \mathbb{Q}[x]$. On the other hand, there is no nonzero polynomial $f \in \mathbb{Q}[x]$ such that $f(\pi) = 0$. This leads to a distinction between algebraic real numbers like $\sqrt{2}$ and $\sqrt[3]{5}$ (and, for that matter, 2 and $\frac{3}{7}$) that can be expressed as roots of polynomials in $\mathbb{Q}[x]$, and transcendental real numbers like $\pi$ and $e$, that can't. We can generalise this distinction to arbitrary field extensions:
Definition 11.26 Let $L : K$ be a field extension, and suppose $\alpha \in L$. Then $\alpha$ is algebraic over $K$ if there is a nonzero polynomial $f \in K[x]$ such that $f(\alpha) = 0$; otherwise we say $\alpha$ is transcendental over $K$.
The extension $L : K$ is algebraic if every $\alpha \in L$ is algebraic over $K$, and transcendental otherwise.

The usual notion of algebraicity and transcendentality fits into this definition: numbers such as $\pi$ and $e$ are transcendental over $\mathbb{Q}$, while numbers like $\sqrt{2}$, $\sqrt[3]{5}$ and 2 are algebraic over $\mathbb{Q}$.

Example 11.27 The extensions $\mathbb{Q}(i) : \mathbb{Q}$, $\mathbb{Q}(\sqrt{2}) : \mathbb{Q}$ and $\mathbb{Q}(\sqrt{3} + i) : \mathbb{Q}$ are algebraic. However, the extension $\mathbb{R} : \mathbb{Q}$ is transcendental, because there exist some real numbers (for example, $\pi$ and $e$) which aren't algebraic over $\mathbb{Q}$.

Wikimedia Commons
Charles Hermite (1822–1901) was the sixth of seven children of Ferdinand Hermite, a draper and artist. He studied at the prestigious École Polytechnique but was dismissed after a year, not for academic reasons but due to a disability of his right foot: the École was, and to some extent still is, a military academy. He completed his studies privately and pursued research independently, later returning to the École as a member of academic staff.
In 1858 he discovered a method for solving polynomial equations of degree 5 using elliptic functions, and in 1873 he proved that the number $e$ is transcendental.
Given a field extension L : K, we can view L as a vector space over K in
an obvious way: the elements of L are the vectors, and the elements of
K are the scalars. We can certainly add together any two elements of
L, and L forms an abelian group under this vector addition operation.
Also, since every element of K is contained in L, we can also multiply
any element of L by an element of K, to get another element of L; this
scalar multiplication operation also satisfies the usual associative and
distributive laws, and K contains the multiplicative identity 1.
Definition 11.28 The degree $[L : K]$ of a field extension $L : K$ is the dimension $\dim_K L$ of $L$ considered as a vector space over $K$.
For $L : K$ to have degree 1 requires $L$ to be a 1-dimensional vector space over $K$, which is equivalent to saying $L = K$. Extensions of
degree 1 are called trivial extensions. Extensions of degree 2 and 3
are called quadratic or cubic extensions, respectively. An extension
L : K is said to be finite or infinite, depending on whether its degree
is finite or infinite.
For example, $\mathbb{Q}(i)$ has degree 2 over $\mathbb{Q}$. The reason for this is that, as noted earlier,
$\mathbb{Q}(i) = \{a + bi : a, b \in \mathbb{Q}\}.$
It is not difficult to see that $\mathbb{Q}(i)$ is a 2-dimensional rational vector space: the set $\{1, i\}$ is a suitable basis. Similarly, $\mathbb{Q}(\sqrt{2})$ also has degree 2 over $\mathbb{Q}$.

Example 11.29 The extension $\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}$ has degree 4. Recall that
$\mathbb{Q}(\sqrt{3}, i) = \mathbb{Q}(\sqrt{3})(i) = \{p(i)/q(i) : p, q \in \mathbb{Q}(\sqrt{3})[x],\ q(i) \neq 0\}.$
By a rather tedious calculation it can be shown that
$\mathbb{Q}(\sqrt{3}, i) = \{a + b\sqrt{3} + ci + di\sqrt{3} : a, b, c, d \in \mathbb{Q}\}.$
The obvious basis for this is $\{1, \sqrt{3}, i, i\sqrt{3}\}$, which has four elements, and hence $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = \dim_{\mathbb{Q}} \mathbb{Q}(\sqrt{3}, i) = 4$.
Also, note that $\mathbb{Q}(\sqrt{3}, i) = \{(a + b\sqrt{3}) + (c + d\sqrt{3})i : a, b, c, d \in \mathbb{Q}\}$, and so, considering $\mathbb{Q}(\sqrt{3}, i)$ as a vector space over $\mathbb{Q}(\sqrt{3})$ we see that $[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}(\sqrt{3})] = \dim_{\mathbb{Q}(\sqrt{3})} \mathbb{Q}(\sqrt{3}, i) = 2$.

In the above example it turns out that
$[\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}] = 4 = 2 \times 2 = [\mathbb{Q}(\sqrt{3}, i) : \mathbb{Q}(\sqrt{3})]\,[\mathbb{Q}(\sqrt{3}) : \mathbb{Q}].$
This is true in general: the degree is multiplicative, in much the same way that the index of a subgroup is.[8] The formal statement of this fact is sometimes called the Tower Law.

[8] Proposition 2.32, page 55.

Wikimedia Commons
Carl Louis Ferdinand von Lindemann (1852–1939) was born in Hanover, the son of a schoolteacher. He studied mathematics at Göttingen, Erlangen and Munich, completing his doctorate on non-Euclidean geometry in 1873, under the supervision of Felix Klein (1849–1925).
His main contributions to mathematics were in geometry and analysis, and perhaps his best known achievement is his proof, published in 1882, that $\pi$ is transcendental. This result was based on Charles Hermite's proof that $e$ is transcendental, together with Euler's identity $e^{i\pi} = -1$.
In 1883 he was appointed to a chair at the University of Königsberg, and while there he supervised the doctoral dissertation of David Hilbert (1862–1943).

Proposition 11.30 (The Tower Law) Let $E : K$ and $K : F$ be finite field extensions. Then
$[E : F] = [E : K][K : F].$
Proof Suppose that $[E : K] = n$ and $[K : F] = m$. Then $K$ is an $m$-dimensional vector space over $F$, and so we can find a basis
$B = \{b_1, \ldots, b_m\}.$
Similarly, $E$ is an $n$-dimensional vector space over $K$, and so we can find a basis
$A = \{a_1, \ldots, a_n\}.$
We claim that the $mn$ products $a_i b_j$, where $1 \leq i \leq n$ and $1 \leq j \leq m$, form a basis for $E$ over $F$.
First we need to show that they span $E$ over $F$. Suppose that $c \in E$. Then there exist elements $\lambda_1, \ldots, \lambda_n \in K$ such that
$c = \sum_{i=1}^{n} \lambda_i a_i.$
Furthermore, there exist elements $\mu_{i1}, \ldots, \mu_{im} \in F$ such that
$\lambda_i = \sum_{j=1}^{m} \mu_{ij} b_j$
for $1 \leq i \leq n$. Hence
$c = \sum_{i=1}^{n} \sum_{j=1}^{m} \mu_{ij} (a_i b_j)$
and therefore the products $a_i b_j$ span $E$ over $F$.
To show linear independence, suppose that
$\sum_{i=1}^{n} \sum_{j=1}^{m} \mu_{ij} (a_i b_j) = 0$
where $\mu_{ij} \in F$. Then
$\sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} \mu_{ij} b_j \Bigr) a_i = 0$
and since $a_1, \ldots, a_n$ are linearly independent over $K$, it follows that
$\sum_{j=1}^{m} \mu_{ij} b_j = 0$
for $1 \leq i \leq n$. And since $b_1, \ldots, b_m$ are linearly independent over $F$, it follows that each $\mu_{ij} = 0$, so all the products $a_i b_j$ are linearly independent, and hence they form a basis for $E$ over $F$. Since this basis has $mn$ elements, it follows that $[E : F] = \dim_F E = mn = [E : K][K : F]$.
The following corollary can be proved by a simple inductive argument.
Corollary 11.31 Suppose that $F_0 \subseteq F_1 \subseteq \cdots \subseteq F_n$ are fields, such that $F_{i+1} : F_i$ are finite extensions for $0 \leq i < n$. Then
$[F_n : F_0] = [F_n : F_{n-1}][F_{n-1} : F_{n-2}] \cdots [F_1 : F_0].$
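The degree calculations of Examples 11.25 and 11.29 can be checked by exact arithmetic. In the Python sketch below, an element $a + b\sqrt{3} + ci + di\sqrt{3}$ of $\mathbb{Q}(\sqrt{3}, i)$ is stored as the tuple $(a, b, c, d)$; the representation and the helper names `mul` and `rank` are assumptions made for this illustration. It confirms that $\alpha^3 = 8i$ for $\alpha = \sqrt{3} + i$, and that $1, \alpha, \alpha^2, \alpha^3$ are linearly independent over $\mathbb{Q}$ while $1, \alpha, \ldots, \alpha^4$ are not, in agreement with $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$.

```python
from fractions import Fraction


def mul(u, v):
    """Product in the basis {1, sqrt3, i, i*sqrt3}, using sqrt3^2 = 3 and i^2 = -1."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (a1 * a2 + 3 * b1 * b2 - c1 * c2 - 3 * d1 * d2,
            a1 * b2 + b1 * a2 - c1 * d2 - d1 * c2,
            a1 * c2 + c1 * a2 + 3 * b1 * d2 + 3 * d1 * b2,
            a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2)


def rank(rows):
    """Rank over Q of a list of 4-tuples, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(4):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


alpha = (0, 1, 1, 0)                       # sqrt3 + i
powers = [(1, 0, 0, 0)]                    # alpha^0
for _ in range(4):
    powers.append(mul(powers[-1], alpha))  # alpha^1, ..., alpha^4

print(powers[3])                           # (0, 0, 8, 0), that is, alpha^3 = 8i
print(rank(powers[:4]), rank(powers))      # 4 4
```

Using `Fraction` keeps the elimination exact, so the ranks are not affected by rounding.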
All of the finite extensions we've seen so far have been algebraic. The next proposition shows that all finite extensions are algebraic:
Proposition 11.32 Any finite extension $E : F$ is algebraic.

Proof Let $\alpha \in E$ be some arbitrary element of the larger field $E$, and suppose that $[E : F] = n$. Recall that any $m$ elements in an $n$-dimensional vector space cannot be linearly independent if $m > n$. In particular, the $n+1$ elements
$1, \alpha, \alpha^2, \ldots, \alpha^n$
can't be linearly independent, and hence there exist $a_0, \ldots, a_n \in F$, not all zero, such that
$a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_n \alpha^n = 0.$
Hence $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is a nonzero polynomial in $F[x]$ such that $f(\alpha) = 0$. Therefore $\alpha$ is algebraic over $F$. But $\alpha$ was arbitrarily chosen, so every element of $E$ is algebraic over $F$, and thus $E : F$ is algebraic.

Corollary 11.33 If $E : F$ is a transcendental extension, then it has infinite degree.
In Example 11.25 we saw that some field extensions are simple even though they might not look like it at first glance. The next example describes a finite field extension that isn't simple.
Example 11.34 Let $p$ be prime, and suppose that $\mathbb{F}_p$ is the finite field of order $p$. Then $L = \mathbb{F}_p(x, y)$ is the field of rational functions in two variables, with coefficients in $\mathbb{F}_p$. Now let $K = \mathbb{F}_p(x^p, y^p)$, the field of rational functions in $x^p$ and $y^p$. Then $K$ is a subfield of $L$, so $L : K$ is an extension of $K$.
The extension $\mathbb{F}_p(x^p, y) : \mathbb{F}_p(x^p, y^p)$ has degree $p$, and so does the extension $\mathbb{F}_p(x, y) : \mathbb{F}_p(x^p, y)$. So by Proposition 11.30 the extension $L : K$ has degree $p^2$.
Now suppose that $f \in L = \mathbb{F}_p(x, y)$. Then $f = g/h$, where $g, h \in \mathbb{F}_p[x, y]$ and $h \neq 0$. By the Freshman's Binomial Theorem (Proposition 8.20) both $g^p$ and $h^p$ lie in $\mathbb{F}_p[x^p, y^p]$, and so $f^p = g^p/h^p$ lies in $\mathbb{F}_p(x^p, y^p) = K$. So every element $f$ of $L$ is a root of the degree-$p$ polynomial $t^p - f^p \in K[t]$, and hence there is no single element of degree $p^2$ that generates $L$ over $K$. Therefore $L : K$ is not simple.

The extension obtained from the field of rational functions is simple but transcendental.
Proposition 11.35 For any field $F$, the extension $F(x) : F$ is simple and transcendental.

Proof The extension $F(x) : F$ is simple by definition: the field of rational functions over $F$ is obtained by adjoining a single element $x$ to $F$. For this extension to be algebraic, we require that for every rational function $f \in F(x)$ there exists a nonzero polynomial $g \in F[x]$ such that $g(f) = 0$. But this certainly isn't the case for $f = x$, since any polynomial satisfying $g(x) = 0$ must be zero, and hence $x$ is not algebraic over $F$ in $F(x)$.

It turns out that all simple transcendental field extensions are of this form, although to properly state and prove that fact we first need to formulate a concept of equivalence of field extensions. Recall that an extension of a field $F$ is an embedding of $F$ into some larger field $E$. Given two such inclusions $i_1 : F \hookrightarrow E_1$ and $i_2 : F \hookrightarrow E_2$, what we're looking for is an isomorphism $\varphi : E_1 \to E_2$ that respects the embedded images of $F$.
Definition 11.36 Let $E_1 : F_1$ and $E_2 : F_2$ be two field extensions determined by inclusion homomorphisms $i_1 : F_1 \hookrightarrow E_1$ and $i_2 : F_2 \hookrightarrow E_2$. Then a homomorphism of field extensions is a field homomorphism $\varphi : E_1 \to E_2$ such that $\varphi(i_1(F_1)) \subseteq i_2(F_2)$; that is, $\varphi$ maps the embedded image of $F_1$ in $E_1$ to the embedded image of $F_2$ in $E_2$. If $\varphi$ is bijective and the restriction $\varphi|_{i_1(F_1)}$ maps the image $i_1(F_1)$ isomorphically to the image $i_2(F_2)$ then we call it an isomorphism of field extensions, and say the extensions $E_1 : F_1$ and $E_2 : F_2$ are isomorphic.
Let $E_1 : F$ and $E_2 : F$ be two extensions of the same field $F$, determined by inclusion homomorphisms $i_1 : F \hookrightarrow E_1$ and $i_2 : F \hookrightarrow E_2$. An $F$-homomorphism is a field homomorphism $\varphi : E_1 \to E_2$ such that $\varphi(i_1(a)) = i_2(a)$ for all $a \in F$; that is, $\varphi$ restricts to the identity map $\mathrm{id}_F$ on the embedded images of $F$. An $F$-isomorphism is a bijective $F$-homomorphism, and two extensions $E_1 : F$ and $E_2 : F$ are $F$-isomorphic if there is an $F$-isomorphism $\varphi : E_1 \to E_2$.

In most of what follows, particularly in the classification of simple field extensions, and also when we discuss field automorphisms and the Galois group in Section 11.4, we will be mainly concerned with determining when two extensions of the same field $F$ are $F$-isomorphic, but we will require the more general notion when we classify finite fields in Section 11.3.
The next proposition, together with Proposition 11.35, provides a neat classification of simple transcendental field extensions:
Proposition 11.37 Any simple transcendental extension $F(\alpha) : F$ is isomorphic to $F(x) : F$.

Proof Let $\varphi : F(x) \to F(\alpha)$ be the evaluation homomorphism $\mathrm{ev}_\alpha$;[9] that is, $\varphi(f/g) = f(\alpha)/g(\alpha)$ for any rational function $f/g \in F(x)$. Since $g$ is a nonzero polynomial in $F[x]$ and $\alpha$ is transcendental over $F$, it follows that $g(\alpha) \neq 0$, so $\varphi$ is a well-defined homomorphism. Furthermore, if $f(\alpha)/g(\alpha) = 0$ then we must have $f(\alpha) = 0$, which can only happen if $f = 0$, again because $\alpha$ is transcendental over $F$. Therefore $\ker \varphi = \{0\}$ and so $\varphi$ is injective; it is clearly surjective as well, and is hence an isomorphism $\varphi : F(x) \to F(\alpha)$. The restriction $\varphi|_F$ is the identity, so $\varphi$ is an isomorphism of extensions.

[9] Example 9.20, page 369.

To classify simple algebraic extensions, we need to introduce the
concept of a minimal polynomial. Returning again to Example 11.25,
we found a polynomial $x^6 + 64$ with the required root
$\alpha = \sqrt{3} + i$. But we can factorise this as

$$x^6 + 64 = (x^2 + 4)(x^4 - 4x^2 + 16)$$

where the first factor has roots $\pm 2i$ and the second factor has roots
$\pm\sqrt{3} \pm i$. So if we're only interested in recovering $\alpha$, we
don't need to bother with the first factor $x^2 + 4$. In fact,
$f = x^4 - 4x^2 + 16$ happens to be the lowest-degree polynomial in
$\mathbb{Q}[x]$ with $\alpha$ as a root. We saw in Example 11.7 that this
polynomial is irreducible over $\mathbb{Z}$ and $\mathbb{Q}$.
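The factorisation above, and the claim that $\sqrt{3} + i$ is a root of the quartic factor, can be checked mechanically. The following is a minimal sketch rather than part of the original text: the helper names `poly_mul` and `poly_eval` are hypothetical, and polynomials are assumed to be stored as coefficient lists with the lowest degree first.

```python
import cmath

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    """Evaluate a coefficient list (lowest degree first) at x by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc

quadratic = [4, 0, 1]              # x^2 + 4
quartic = [16, 0, -4, 0, 1]        # x^4 - 4x^2 + 16
product = poly_mul(quadratic, quartic)   # expected to be x^6 + 64

alpha = cmath.sqrt(3) + 1j         # floating-point approximation of sqrt(3) + i
residual = abs(poly_eval(quartic, alpha))
```

The product is computed exactly in integers, while the root check is only numerical, so `residual` should be read as "zero up to rounding error".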
Proposition 11.38 If $\alpha$ is algebraic over a field $F$, then there is a
unique monic, irreducible polynomial $f$ (the minimal polynomial) of minimal
degree in $F[x]$ such that $f(\alpha) = 0$, and $f \mid g$ for any other
$g \in F[x]$ with $g(\alpha) = 0$.
(Recall that a polynomial is monic if its highest-degree coefficient is 1.)

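Proof If $\alpha$ is algebraic over $F$ then by definition there exists a
nonzero polynomial $f \in F[x]$ such that $f(\alpha) = 0$. We may suppose
that $f$ is monic; if not, then we can obtain a new polynomial by dividing
each coefficient by the highest-degree coefficient, and this new polynomial
will also have $\alpha$ as a root. Furthermore, suppose that $f$ has minimal
degree over all monic polynomials in $F[x]$ with $\alpha$ as a root.
This monic polynomial $f$ must be unique: if not, then there is another
monic polynomial $h \in F[x]$ of the same degree with $h(\alpha) = 0$. Then
$f(\alpha) - h(\alpha) = 0$, which means that either $f = h$ or we can find
a monic polynomial $k \in F[x]$ that is a constant multiple of $f - h$, and
$k(\alpha) = 0$. But $\deg(k) < \deg(f)$, since the leading terms of $f$ and
$h$ cancel, and this contradicts the minimality of $\deg(f)$.
If $f = gh$ with $g$ and $h$ of lower degree, then $g(\alpha)h(\alpha) = 0$
and so $\alpha$ is a root of $g$ or of $h$, again contradicting minimality;
hence $f$ is irreducible. Finally, if $g(\alpha) = 0$ then dividing $g$ by
$f$ gives $g = qf + r$ with $\deg(r) < \deg(f)$ and $r(\alpha) = 0$, which
forces $r = 0$, so $f \mid g$.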
The minimal polynomial of an algebraic extension encodes important
information about the extension. In particular, the degree of the
extension is equal to the degree of its minimal polynomial:
Proposition 11.39 If $F(\alpha) : F$ is a simple algebraic extension of $F$,
then $[F(\alpha) : F] = \deg(f)$, where $f$ is the minimal polynomial of
$\alpha$ over $F$.

Proof If $\alpha$ is algebraic over $F$ then by Proposition 11.38 there
exists a nonzero, monic minimal polynomial
$$f = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} + x^n \in F[x]$$
such that $f(\alpha) = 0$. Here $\deg(f) = n$. We claim that any element
$b \in F(\alpha)$ can be expressed uniquely as an $F$-linear combination
$$b = b_0 + b_1 \alpha + b_2 \alpha^2 + \cdots + b_{n-1} \alpha^{n-1}$$
where $b_0, \ldots, b_{n-1} \in F$. That is, the set
$\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ spans $F(\alpha)$ and is
linearly independent over $F$, and hence forms a basis for $F(\alpha)$ over
$F$.
Since $f(\alpha) = 0$, it follows that
$a_0 + a_1 \alpha + \cdots + a_{n-1}\alpha^{n-1} + \alpha^n = 0$, and hence
$$\alpha^n = -a_0 - a_1 \alpha - \cdots - a_{n-1} \alpha^{n-1}.$$
Therefore, any monomial of the form $\alpha^k$ can be expressed as an
$F$-linear combination of powers of $\alpha$ with exponent strictly less
than $n$, and hence $1, \alpha, \ldots, \alpha^{n-1}$ span $F(\alpha)$ over
$F$.

To show linear independence, suppose that
$$b_0 + b_1 \alpha + b_2 \alpha^2 + \cdots + b_{n-1} \alpha^{n-1} = 0$$
with at least one coefficient not equal to zero. If $b_{n-1} \neq 0$ then
dividing through by it gives
$$\frac{b_0}{b_{n-1}} + \frac{b_1}{b_{n-1}} \alpha + \cdots + \frac{b_{n-2}}{b_{n-1}} \alpha^{n-2} + \alpha^{n-1} = 0.$$
More generally, suppose that $b_k$ is the highest nonzero coefficient. Then
$$g(x) = \frac{b_0}{b_k} + \frac{b_1}{b_k} x + \cdots + \frac{b_{k-1}}{b_k} x^{k-1} + x^k$$
is a monic polynomial over $F$ with $g(\alpha) = 0$, with degree $k < n$. But
this contradicts the assertion that the minimal polynomial $f$, with
degree $n$, was the smallest-degree monic polynomial over $F$ with $\alpha$
as a root. Hence $b_0 = \cdots = b_{n-1} = 0$, and therefore
$1, \alpha, \ldots, \alpha^{n-1}$ are linearly independent. Thus they form a
basis for $F(\alpha)$ over $F$ and hence $[F(\alpha) : F] = n$.
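The spanning step in this proof is effectively an algorithm: repeatedly replace $\alpha^n$ by $-(a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1})$. Below is a small sketch of that reduction, under the assumption that the minimal polynomial is monic and stored as a coefficient list with the lowest degree first; the function name `reduce_power` is hypothetical. It is applied to the quartic $x^4 - 4x^2 + 16$ from the discussion of Example 11.25.

```python
from fractions import Fraction

def reduce_power(k, f):
    """Coefficients (lowest degree first) of x^k modulo the monic polynomial f.

    f is a coefficient list, lowest degree first, with leading coefficient 1.
    The result has deg(f) entries: the coordinates of alpha^k in the basis
    1, alpha, ..., alpha^(n-1).
    """
    n = len(f) - 1
    rem = [Fraction(0)] * n
    rem[0] = Fraction(1)                  # start from alpha^0 = 1
    for _ in range(k):
        top = rem[-1]                     # coefficient that is about to become alpha^n
        rem = [Fraction(0)] + rem[:-1]    # multiply by alpha: shift up one place
        for i in range(n):                # substitute alpha^n = -(a_0 + ... + a_{n-1} alpha^{n-1})
            rem[i] -= top * f[i]
    return rem

f = [16, 0, -4, 0, 1]                     # x^4 - 4x^2 + 16
alpha_4 = reduce_power(4, f)              # alpha^4 = -16 + 4 alpha^2
alpha_6 = reduce_power(6, f)              # alpha^6 = -64
```

The value of `alpha_6` agrees with the fact that $\alpha$ is also a root of $x^6 + 64$.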
By Proposition 11.32 we know that finite extensions are algebraic, and
therefore (by Corollary 11.33) that transcendental extensions must be
infinite. However, not every algebraic extension is finite:
Example 11.40 The extension
$\mathbb{Q}(2^{1/2}, 2^{1/4}, 2^{1/8}, \ldots) : \mathbb{Q}$ is algebraic
but infinite. It is algebraic, because each $2^{1/2^n}$ is algebraic over
$\mathbb{Q}$, and therefore over
$\mathbb{Q}(2^{1/2}, 2^{1/4}, 2^{1/8}, \ldots, 2^{1/2^{n-1}})$, since for
each $n \in \mathbb{N}$ we can set $f_n = x^{2^n} - 2 \in \mathbb{Q}[x]$ and
then $f_n(2^{1/2^n}) = 0$.
This extension is infinite, because the infinite set
$\{2^{1/2^n} : n \in \mathbb{N}\}$ is linearly independent over
$\mathbb{Q}$.

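Other examples of infinite algebraic extensions include
$$\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5}, \ldots) : \mathbb{Q},$$
obtained by adjoining all square roots of positive primes,
$$\mathbb{Q}(\zeta_3, \zeta_4, \zeta_5, \ldots) : \mathbb{Q},$$
where $\zeta_n$ is a primitive $n$th root of unity (that is, $\zeta_n^n = 1$
but $\zeta_n^k \neq 1$ for $0 < k < n$) and $\mathbb{A} : \mathbb{Q}$, where
$$\mathbb{A} = \{ x \in \mathbb{C} : x \text{ is algebraic over } \mathbb{Q} \}$$
is the field of algebraic numbers.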
Going back to our key examples $x^2 - 2$ and $x^2 + 1$, the point of all
of this theory on field extensions is that we can factorise
$x^2 - 2 = (x + \sqrt{2})(x - \sqrt{2})$ over $\mathbb{Q}(\sqrt{2})$ and
$x^2 + 1 = (x + i)(x - i)$ over $\mathbb{Q}(i)$.
In both cases, we adjoin a carefully chosen element to $\mathbb{Q}$ to
obtain a field over which the given polynomial is reducible; that is, it
factorises as a product of non-constant polynomials of lower degree. In
these two cases, the polynomials factorise as linear terms, but this is not
always the case for polynomials of cubic or higher degree.
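For example, let $f = x^3 - 5$. This is irreducible over $\mathbb{Q}$ (to
see this, apply Eisenstein's Criterion with $p = 5$). If we adjoin
$\alpha = \sqrt[3]{5}$ then we find that $f$ is reducible over
$\mathbb{Q}(\alpha)$:
$$f = x^3 - 5 = (x - \alpha)(x^2 + \alpha x + \alpha^2).$$
But $x^2 + \alpha x + \alpha^2$ doesn't factorise as linear factors over
$\mathbb{Q}(\alpha)$. For that, we have to adjoin another element:
$\beta = i\sqrt{3}$ will do. Then $f$ factorises over
$\mathbb{Q}(\alpha, \beta)$ as
$$f = (x - \alpha)\left(x - \tfrac{\alpha}{2}(-1 + \beta)\right)\left(x - \tfrac{\alpha}{2}(-1 - \beta)\right).$$
The degree of this extension $\mathbb{Q}(\alpha, \beta) : \mathbb{Q}$ is 6,
since $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 3$ and
$[\mathbb{Q}(\alpha, \beta) : \mathbb{Q}(\alpha)] = 2$.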
Before we proceed any further with this line of inquiry, there's something
we've taken for granted so far, and which we must now deal with. We've
assumed that for a given polynomial $f$ over some field $F$ it's actually
possible to find a suitably adjoinable element. By Definition 11.26,
$\alpha$ is algebraic over $F$ if there exists a nonzero polynomial over $F$
with $\alpha$ as a root, and Proposition 11.38 says that we can always find
a monic, irreducible polynomial (the minimal polynomial) satisfying this
condition, and whose degree, by Proposition 11.39, is equal to the degree
of the simple extension $[F(\alpha) : F]$.
What we haven't yet proved is that this works the other way round: given
some irreducible polynomial $f \in F[x]$ we can always find some appropriate
element $\alpha$ to adjoin to $F$, so that $f$ is reducible over
$F(\alpha)$. The confirmation of this is due to the German mathematician
Leopold Kronecker (1823–1891).

Born in Prussia to a wealthy Jewish family, Leopold Kronecker (1823–1891)
studied mathematics, astronomy and philosophy at the University of Berlin,
graduating in 1845 with a dissertation on algebraic number theory prepared
under the supervision of Peter Gustav Lejeune Dirichlet (1805–1859). He
spent the next ten years managing his family's large farming estate,
studying and researching mathematics in his spare time, and returned to
Berlin in 1855, publishing numerous papers based on his earlier research.
He was elected to the Prussian Academy of Sciences in 1861 and taught at
Berlin from 1862 onwards, where he remained for the rest of his career,
turning down the offer of a chair at Göttingen in 1866.
He made many contributions to number theory and algebra. In 1850 he devised
a general solution to the quintic equation using group-theoretic methods
(although not in terms of radicals, for reasons we'll see in a little
while). He was a leading member of the constructivist movement, which held
that a mathematical object did not exist unless it could actually be
constructed; in particular, proofs by contradiction weren't considered
valid.
His mathematical disputes tended to drift into personal animosity, and some
of his contemporaries found him difficult to work with, notably Karl
Weierstrass (1815–1897) who nearly resigned his chair in 1885. He also had
a bitter and long-running conflict with Georg Cantor (1845–1918) over the
latter's work on transfinite numbers.

Theorem 11.41 (Kronecker's Theorem) Let $F$ be a field, and let
$f \in F[x]$ be an irreducible polynomial of degree $n > 0$. There exists a
simple field extension $F(\alpha) : F$ of degree $[F(\alpha) : F] = n$ such
that $f(\alpha) = 0$.

Proof Let $i \colon F \hookrightarrow F[x]$ be the obvious inclusion
homomorphism, mapping each element of $F$ to the corresponding constant
polynomial in $F[x]$. Now let $K = F[x]/\langle f \rangle$. Since $f$ is
irreducible in $F[x]$, then $\langle f \rangle$ is a maximal principal ideal
by Proposition 10.25 (i). But $F[x]$ is a PID by Proposition 10.47, so
$\langle f \rangle$ is a maximal ideal. Hence the quotient
$F[x]/\langle f \rangle$ is a field by Proposition 9.61.
If $q \colon F[x] \to K$ is the quotient homomorphism, then the composite
$q \circ i \colon F \to K$ is an injective field homomorphism. This is
because any element $a \in F$ maps to $a + \langle f \rangle$ in $K$, and
hence for any $a, b \in F$ with
$(q \circ i)(a) = a + \langle f \rangle = b + \langle f \rangle = (q \circ i)(b)$
the constant $a - b$ lies in $\langle f \rangle$; since $\deg(f) > 0$, the
only constant polynomial in $\langle f \rangle$ is $0$, and so $a = b$.
Therefore $K : F$ is a field extension.
Now let $\alpha = q(x) = x + \langle f \rangle$. Since $x$ generates $F[x]$
over $F$, it follows that $\alpha$ generates $K$ over $F$, and so
$K = F(\alpha)$, so $K : F = F(\alpha) : F$ is a simple field extension.


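Suppose $f(x) = a_n x^n + \cdots + a_1 x + a_0$ for some
$a_0, \ldots, a_n \in F$. Then
$$\begin{aligned}
f(\alpha) &= f(x + \langle f \rangle) \\
&= a_n (x + \langle f \rangle)^n + \cdots + a_1 (x + \langle f \rangle) + a_0 \\
&= a_n (x^n + \langle f \rangle) + \cdots + a_1 (x + \langle f \rangle) + a_0 \\
&= (a_n x^n + \cdots + a_1 x + a_0) + \langle f \rangle \\
&= f + \langle f \rangle \\
&= 0 + \langle f \rangle.
\end{aligned}$$
Hence $f(\alpha) = 0$ in $K$, and so $\alpha$ is algebraic over $F$. By
Proposition 11.38, $f$ must be a constant multiple of the minimal polynomial
of $\alpha$ over $F$. Hence, by Proposition 11.39,
$[K : F] = [F(\alpha) : F] = \deg(f) = n$.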
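Kronecker's construction is concrete enough to compute with. The sketch below is an illustration under stated assumptions, not the author's own example: it takes $F = \mathbb{F}_3$ and the irreducible polynomial $f = x^2 + 1$, stores each coset of $\langle f \rangle$ as the list of coefficients of its remainder (lowest degree first), and uses a hypothetical helper `mul_mod` for multiplication in $K = F[x]/\langle f \rangle$. It then checks that $\alpha = x + \langle f \rangle$ satisfies $f(\alpha) = 0$ and that every nonzero element of $K$ is invertible.

```python
from itertools import product

P = 3                    # work over F_3 (an assumption made for this illustration)
F = [1, 0, 1]            # f = x^2 + 1, lowest degree first; monic and irreducible over F_3

def mul_mod(a, b, f=F, p=P):
    """Multiply two cosets of <f>, each stored as a list of deg(f) coefficients mod p."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    # Reduce modulo the monic polynomial f, clearing the highest degrees first.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            for i in range(n + 1):
                prod[d - n + i] = (prod[d - n + i] - c * f[i]) % p
    return prod[:n]

alpha = [0, 1]                                    # the coset x + <f>
alpha_sq = mul_mod(alpha, alpha)                  # alpha^2
f_alpha = [(alpha_sq[0] + 1) % P, alpha_sq[1]]    # alpha^2 + 1, which should be zero

elements = [list(e) for e in product(range(P), repeat=2)]
one = [1, 0]
every_nonzero_invertible = all(
    any(mul_mod(a, b) == one for b in elements)
    for a in elements if a != [0, 0]
)
```

Here $K$ has $3^2 = 9$ elements, consistent with $[K : F] = \deg(f) = 2$.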
We can extend this to account for reducible polynomials too:
Corollary 11.42 Let $f \in F[x]$ be a nonconstant polynomial of degree
$n > 0$. Then there exists a simple field extension $F(\alpha) : F$ of
degree $[F(\alpha) : F] \leq n$ such that $f(\alpha) = 0$.

Proof If $f$ is irreducible then the result follows immediately from
the main theorem. Suppose instead that $f$ is reducible. By Proposition
10.47, $F[x]$ is a PID and hence a UFD by Corollary 10.43. So $f$
has a unique factorisation as a product of irreducible polynomials,
each with degree strictly less than $n$. Let $g$ be one of these factors,
and suppose that $\deg(g) = m < n$. Then by the main theorem there
exists a simple field extension $F(\alpha) : F$ with
$[F(\alpha) : F] = m < n$ and $f(\alpha) = g(\alpha) = 0$.
Classifying simple algebraic extensions requires a little more work
than transcendental ones, but the end result is still quite neat. We start
with the following proposition, which characterises simple algebraic
extensions in terms of quotients of polynomial rings.
Proposition 11.43 Let $F(\alpha) : F$ be a simple algebraic extension of a
field $F$, and suppose that $f$ is the minimal polynomial of $\alpha$ over
$F$. Then $F(\alpha) \cong F[x]/\langle f \rangle$. The isomorphism can be
chosen so as to map $\alpha$ to $x + \langle f \rangle$, and to be the
identity when restricted to $F \subseteq F(\alpha)$.

Proof Let $\phi \colon F[x] \to F(\alpha)$ by $g \mapsto g(\alpha)$; this is
a surjective homomorphism. Its kernel $\ker\phi$ consists of all polynomials
in $F[x]$ for which $\alpha$ is a zero. By Proposition 11.38 any such
polynomial must have the minimal polynomial $f$ as a factor. Hence
$\langle f \rangle \subseteq \ker\phi$. But by Proposition 10.25 (i),
$\langle f \rangle$ is maximal in $F[x]$ because $f$ is irreducible.
And since $\ker\phi$ doesn't contain the constant polynomial $1 \in F[x]$ we
can't have $\ker\phi = F[x]$, so $\ker\phi = \langle f \rangle$.
By the First Isomorphism Theorem for Rings (Theorem 9.49, page 380), then,
we have
$$F[x]/\langle f \rangle = F[x]/\ker\phi \cong \operatorname{im}\phi = F(\alpha)$$
as claimed. This isomorphism is the induced map
$\hat\phi \colon F[x]/\langle f \rangle \to F(\alpha)$, which maps
$x + \langle f \rangle$ to $\alpha$ and any constant element
$a + \langle f \rangle$ to $a \in F(\alpha)$. Its inverse
$\hat\phi^{-1} \colon F(\alpha) \to F[x]/\langle f \rangle$ is the required
isomorphism.

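The next proposition is slightly more general than we need to classify
simple algebraic field extensions: all we need at this point is the
subsequent corollary, but we'll need the full version later so we might
as well do it here.
Proposition 11.44 Let $\phi \colon F_1 \to F_2$ be a field isomorphism, and
suppose that $f \in F_1[x]$ is irreducible over $F_1$. Let $\alpha$ be a
root of $f$ in some extension of $F_1$, and let $\beta$ be a root of
$\phi(f)$ in some extension of $F_2$. Then there is an isomorphism
$\psi \colon F_1(\alpha) \to F_2(\beta)$ that maps $\alpha$ to $\beta$ and
agrees with $\phi$ on $F_1$; that is, $\psi|_{F_1} = \phi$.

(There is a slight abuse of notation here, in that we're using $\phi$ to
denote both the field isomorphism $\phi \colon F_1 \to F_2$, and also the
corresponding ring isomorphism $\phi \colon F_1[x] \to F_2[x]$. This
shouldn't present a problem in practice, but it's worth being aware of it
anyway.)

Proof By Proposition 11.43, there exist isomorphisms
$$\theta_1 \colon F_1(\alpha) \to F_1[x]/\langle f \rangle \quad\text{and}\quad \theta_2 \colon F_2(\beta) \to F_2[x]/\langle \phi(f) \rangle,$$
such that $\theta_1(\alpha) = x + \langle f \rangle$ and
$\theta_2(\beta) = x + \langle \phi(f) \rangle$.
Write $\bar\phi \colon F_1[x]/\langle f \rangle \to F_2[x]/\langle \phi(f) \rangle$
for the isomorphism induced by $\phi \colon F_1[x] \to F_2[x]$. The composite
$\psi = \theta_2^{-1} \circ \bar\phi \circ \theta_1 \colon F_1(\alpha) \to F_2(\beta)$
thus maps $\alpha$ to $\beta$, and restricts to $\phi$ on $F_1$.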

Corollary 11.45 Let $F(\alpha) : F$ and $F(\beta) : F$ be simple algebraic
extensions of a field $F$ where $\alpha$ and $\beta$ have the same minimal
polynomial $f$ over $F$. Then $F(\alpha) \cong F(\beta)$, and the
isomorphism can be chosen to map $\alpha$ to $\beta$. Furthermore, the
simple algebraic extensions $F(\alpha) : F$ and $F(\beta) : F$ are
$F$-isomorphic.

Proof This follows immediately from Proposition 11.44 by setting
$F_1 = F_2 = F$ and $\phi = \mathrm{id}_F$. The resulting isomorphism
$\psi \colon F(\alpha) \to F(\beta)$ restricts to the identity on $F$, and
is hence an $F$-isomorphism.

This completes our classification. Every nonconstant polynomial
$f \in F[x]$ yields a simple algebraic field extension $F(\alpha) : F$ where
$\alpha$ is a root of $f$. And any two simple algebraic extensions
$F(\alpha) : F$ and $F(\beta) : F$ are isomorphic if $\alpha$ and $\beta$
have the same minimal polynomial over $F$.
For example, both $\sqrt{2}$ and $-\sqrt{2}$ have minimal polynomial
$x^2 - 2$ over $\mathbb{Q}$, so by Proposition 11.44 the extensions
$\mathbb{Q}(\sqrt{2}) : \mathbb{Q}$ and $\mathbb{Q}(-\sqrt{2}) : \mathbb{Q}$
are $\mathbb{Q}$-isomorphic via an isomorphism
$\psi \colon \mathbb{Q}(\sqrt{2}) \to \mathbb{Q}(-\sqrt{2})$ that
switches $\sqrt{2}$ and $-\sqrt{2}$, and restricts to the identity on
$\mathbb{Q}$.
By the Remainder Theorem (Theorem 10.7, page 398), the existence of a root
$\alpha$ of a polynomial $f$ is equivalent to $f$ having a linear factor
$(x - \alpha)$. So, now we'd like to study factorisations of polynomials
into linear factors, and thus we introduce some more new terminology.
Definition 11.46 If a polynomial $f \in F[x]$ can be factorised as a
product of linear factors
$$f = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$$
where $\alpha_1, \ldots, \alpha_n \in K$, where $K$ is some field such that
$F \subseteq K$, then we say that $f$ splits over $K$.

Splittability is a stronger condition than reducibility: if a polynomial is
reducible then it factorises into lower-degree terms, but if it's splittable
then it factorises completely as a product of linear terms. Given some
polynomial $f \in F[x]$, we want to find the simplest possible extension
of $F$ in which $f$ is not only reducible, but splits:
Definition 11.47 Suppose that $f \in F[x]$ splits over some field $K$ for
which $F \subseteq K$, and that $f$ does not split over any proper subfield
of $K$. Then $K$ is a splitting field for $f$ over $F$, and $K : F$ is a
splitting field extension for $f$.
We'll start by looking at two of our key examples.

Example 11.48 The splitting field for $f = x^2 - 2$ over $\mathbb{Q}$ is
$\mathbb{Q}(\sqrt{2})$, in which $f$ factorises as
$(x + \sqrt{2})(x - \sqrt{2})$. There is no smaller field in which $f$
splits. Furthermore, $f$ is the minimal polynomial of this extension, so
$[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = \deg(f) = 2$.

Example 11.49 The splitting field for $f = x^2 + 1$ over $\mathbb{R}$ is the
complex number field $\mathbb{C} = \mathbb{R}(i)$, which has degree 2.

The next example is slightly more complicated, and is one that we'll
return to later when we discuss the Galois correspondence.
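Example 11.50 Let's investigate the splitting field for $f = x^3 - 5$
over $\mathbb{Q}$. Set $\alpha = \sqrt[3]{5}$ and
$\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$. Then the roots of $f$ are
$\alpha$, $\alpha\omega$ and $\alpha\omega^2$. The splitting field of $f$
over $\mathbb{Q}$ is the smallest field extension of $\mathbb{Q}$ in which
$f$ splits. It must therefore contain $\alpha$ and $\omega$, which are
linearly independent over $\mathbb{Q}$, so the smallest such extension field
is $\mathbb{Q}(\alpha, \omega)$.
We can calculate the degree of this extension using the Tower Law:
$[\mathbb{Q}(\alpha, \omega) : \mathbb{Q}] = [\mathbb{Q}(\alpha, \omega) : \mathbb{Q}(\alpha)][\mathbb{Q}(\alpha) : \mathbb{Q}]$.
The minimal polynomial of $\alpha$ over $\mathbb{Q}$ is $x^3 - 5$, which has
degree 3, so $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 3$. The minimal polynomial
of $\omega$ over $\mathbb{Q}$ (and also over $\mathbb{Q}(\alpha)$) is
$x^2 + x + 1$, which has degree 2, so
$[\mathbb{Q}(\alpha, \omega) : \mathbb{Q}(\alpha)] = 2$, and by the Tower Law
$[\mathbb{Q}(\alpha, \omega) : \mathbb{Q}] = 2 \cdot 3 = 6$.
The degree of $\mathbb{Q}(\alpha, \omega) : \mathbb{Q}$ is the dimension of
$\mathbb{Q}(\alpha, \omega)$ as a vector space over $\mathbb{Q}$. A suitable
basis is $\{1, \alpha, \alpha^2, \omega, \alpha\omega, \alpha^2\omega\}$.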
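A quick numerical sanity check of Example 11.50 is sketched below. It works in floating-point complex arithmetic, so it can only confirm the claims up to rounding error; the variable names are illustrative. It checks that $\alpha$, $\alpha\omega$ and $\alpha\omega^2$ are cube roots of 5, that $\omega$ satisfies $x^2 + x + 1$, and that the elementary symmetric functions of the three roots match the coefficients of $x^3 - 5$.

```python
alpha = 5 ** (1 / 3)                       # real cube root of 5 (floating point)
omega = complex(-0.5, 3 ** 0.5 / 2)        # -1/2 + (sqrt(3)/2) i

roots = [alpha, alpha * omega, alpha * omega ** 2]

# Each root r should satisfy r^3 - 5 = 0.
cube_residuals = [abs(r ** 3 - 5) for r in roots]

# omega should satisfy x^2 + x + 1 = 0.
omega_residual = abs(omega ** 2 + omega + 1)

# (x - r0)(x - r1)(x - r2) = x^3 - s1 x^2 + s2 x - s3, to be compared with x^3 - 5.
r0, r1, r2 = roots
s1 = r0 + r1 + r2
s2 = r0 * r1 + r0 * r2 + r1 * r2
s3 = r0 * r1 * r2

# Tower Law: [Q(alpha, omega) : Q] = [Q(alpha, omega) : Q(alpha)] * [Q(alpha) : Q].
degree = 2 * 3
```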

Splitting fields exist and have finite degree:

Proposition 11.51 Given a field $F$ and a polynomial $f \in F[x]$, there
exists a splitting field $K$ for $f$. If $\deg(f) = n$ then
$[K : F] \leq n!$

Proof We prove this by induction on $n = \deg(f)$. If $n = 1$ then $f$ is
linear and therefore splits over $F$, so $K = F$ and we have $[K : F] = 1$.
Now suppose that $n \geq 2$, and that the hypothesis holds for all
polynomials with degree strictly less than $n$, over any field.
If $f$ is reducible over $F$ then we can factorise it as $f = gh$ with
$\deg(g) = p < n$ and $\deg(h) = q < n$. By induction there is a splitting
field $L$ for $g$ over $F$, with $[L : F] \leq p!$ and over $L$, $g$
factorises as
$$g = a(x - \alpha_1) \cdots (x - \alpha_p)$$
where $a \in F$ and $\alpha_1, \ldots, \alpha_p \in L$. Furthermore,
$L = F(\alpha_1, \ldots, \alpha_p)$.
Now $h \in F[x] \subseteq L[x]$ so we can regard $h$ as a polynomial over
$L$. Again, by induction there exists a splitting field $K$ for $h$ over
$L$, with $[K : L] \leq q!$ and we can factorise $h$ as
$$h = b(x - \beta_1) \cdots (x - \beta_q)$$
for some $b \in L$ and $\beta_1, \ldots, \beta_q \in K$. Moreover,
$K = L(\beta_1, \ldots, \beta_q) = F(\alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q)$.
Thus $K$ is a splitting field for $f$ over $F$ and by the Tower Law
(Proposition 11.30, page 447) we have
$$[K : F] = [K : L][L : F] \leq q!\,p! \leq (p + q)! = n!$$
as claimed.
If $f$ is irreducible over $F$ then by Kronecker's Theorem we can find
a simple extension $F(\alpha) : F$ with degree
$[F(\alpha) : F] = \deg(f) = n$ such that $f(\alpha) = 0$. Therefore
$f = (x - \alpha) g$ for some polynomial $g \in F(\alpha)[x]$ with
$\deg(g) = n - 1$.
By the inductive hypothesis there exists a splitting field $K$ for $g$ over
$F(\alpha)$ with degree $[K : F(\alpha)] \leq (n - 1)!$ and $g$ factorises
over $K$ as
$$g = a(x - \beta_1) \cdots (x - \beta_{n-1})$$
for some
$\beta_1, \ldots, \beta_{n-1} \in K = F(\alpha)(\beta_1, \ldots, \beta_{n-1}) = F(\alpha, \beta_1, \ldots, \beta_{n-1})$
and $a \in F$. Thus $K : F$ is a splitting field extension for $f$, and by
the Tower Law
$[K : F] = [K : F(\alpha)][F(\alpha) : F] \leq (n - 1)! \cdot n = n!$.
Furthermore, splitting fields are unique up to isomorphism.
Proposition 11.52 Let $\phi \colon F_1 \to F_2$ be an isomorphism of fields,
and suppose that $f \in F_1[x]$. Suppose also that $E_1$ is a splitting
field for $f$ over $F_1$, and that $E_2$ is a splitting field for $\phi(f)$
over $F_2$. Then there is an isomorphism $\psi \colon E_1 \to E_2$ which
agrees with $\phi$ on $F_1$; that is, $\psi|_{F_1} = \phi$.

Proof We prove this by induction on $\deg(f)$. If $\deg(f) = 1$ then $f$ is
linear and already splits over $F_1$, so $E_1 = F_1$, $E_2 = F_2$ and
$\psi = \phi$.
Now suppose that $\deg(f) = n > 1$, and that the proposition has been
proved for all polynomials of degree less than $n$.
Let $g \in F_1[x]$ be a monic, irreducible factor of $f$. Let $\alpha$ be a
root of $g$ in $E_1$, and let $\beta$ be a root of $\phi(g)$ in $E_2$. Then
$g$ is the minimal polynomial of the simple algebraic extension
$F_1(\alpha) : F_1$.
By Proposition 11.44, there is an isomorphism
$\theta \colon F_1(\alpha) \to F_2(\beta)$ that agrees with $\phi$ when
restricted to $F_1$, and which maps $\alpha$ to $\beta$.
We can now factorise $f = (x - \alpha) h$ for some $h \in F_1(\alpha)[x]$.
Then $E_1$ is a splitting field for $h$ over $F_1(\alpha)$ and $E_2$ is a
splitting field for $\theta(h)$ over $F_2(\beta)$. By the inductive
hypothesis, since $\deg(h) < \deg(f)$, it follows that there is an
isomorphism $\psi \colon E_1 \to E_2$ that agrees with $\theta$ on
$F_1(\alpha)$ and hence with $\phi$ when restricted to $F_1$.
Corollary 11.53 Let $f$ be a nonconstant polynomial in $F[x]$ and suppose
that $E_1$ and $E_2$ are two splitting fields for $f$ over $F$. Then there
is an $F$-isomorphism $\psi \colon E_1 \to E_2$.

Proof This follows immediately from the above proposition by setting
$F_1 = F_2 = F$ and $\phi = \mathrm{id}_F$.

11.3 Finite fields

The philosopher may sometimes love the infinite; the poet always loves the
finite.
G K Chesterton (1874–1936), The Man Who Was Thursday (1908)

We saw earlier that, for any prime $p$, the set
$\mathbb{F}_p = \{0, \ldots, p-1\}$ forms a field under addition and
multiplication modulo $p$. At the time we also claimed that there is a
single field of order $q = p^n$ for any $n \in \mathbb{N}$. We can now prove
this. First let's look at an illustrative example.

Example 11.54 One of the rings we saw in Example 8.7 was a field
of order 4: the ring $\mathbb{Z}_2[\omega]$, where
$\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$ is a primitive cube root of
unity; that is, a nontrivial root of the polynomial $x^3 - 1$.
The addition and multiplication tables for this ring are shown in
Table 11.1. Examination of the multiplication table confirms that
every element apart from 0 has an inverse, and so this is a field.

Table 11.1: Addition and multiplication tables for $\mathbb{Z}_2[\omega]$

    +          0          1          $\omega$   $\omega^2$
    0          0          1          $\omega$   $\omega^2$
    1          1          0          $\omega^2$ $\omega$
    $\omega$   $\omega$   $\omega^2$ 0          1
    $\omega^2$ $\omega^2$ $\omega$   1          0

    $\cdot$    0          1          $\omega$   $\omega^2$
    0          0          0          0          0
    1          0          1          $\omega$   $\omega^2$
    $\omega$   0          $\omega$   $\omega^2$ 1
    $\omega^2$ 0          $\omega^2$ 1          $\omega$
How can we generalise this example to arbitrary powers of primes?
The first thing to notice is that $\mathbb{Z}_2[\omega]$ is the smallest
field in which the polynomial $x^3 - 1$ splits over $\mathbb{F}_2$. This is
how we'll construct finite fields in general: as splitting fields of some
suitably chosen polynomial over a smaller field. The next proposition
identifies this smaller field:
Proposition 11.55 A field $F$ either has characteristic 0 and contains a
subfield isomorphic to $\mathbb{Q}$, or it has prime characteristic $p$ and
contains a subfield isomorphic to $\mathbb{F}_p$.

Proof If $\operatorname{char}(F) = 0$ then by Proposition 9.7 $F$ contains a
subring $R = \langle 1 \rangle$ isomorphic to $\mathbb{Z}$, whose field of
quotients $Q(R) \cong Q(\mathbb{Z}) = \mathbb{Q}$ is also contained in $F$.
If $\operatorname{char}(F) = n$ is nonzero then $F$ contains a subring $R$
isomorphic to $\mathbb{Z}_n$. But in this case if $n$ is composite then
there exist nonzero elements $a, b \in R \subseteq F$ such that $ab = 0$,
which can't happen because $F$ is a field and has no zero divisors.
Therefore $n$ must be equal to some prime $p$, and hence $F$ contains a
subring isomorphic to $\mathbb{Z}_p = \mathbb{F}_p$, which is a field.
The subfield constructed in this proof, either
$Q(\langle 1 \rangle) \cong \mathbb{Q}$ if $\operatorname{char}(F) = 0$, or
$\langle 1 \rangle \cong \mathbb{F}_p$ if $\operatorname{char}(F) = p$, is
called the prime subfield of $F$. At the moment, we're interested in finite
fields, whose prime subfields must therefore be of prime order. The next
proposition shows that finite fields must have prime or prime power order.
Proposition 11.56 Let $F$ be a finite field. Then $|F| = p^n$ for some prime
$p$ and positive integer $n$.

Proof By Proposition 11.55, $\operatorname{char}(F) = p$ for some prime $p$,
and its prime subfield $P \cong \mathbb{F}_p$. Then $F : P$ is a field
extension, and since $F$ is finite this must have finite degree
$[F : P] = n$. Furthermore, $F$ can be considered as an $n$-dimensional
vector space over $P$, so we can find a basis $\{x_1, \ldots, x_n\}$ such
that any element $a \in F$ can be written uniquely as a $P$-linear
combination
$$a = p_1 x_1 + \cdots + p_n x_n$$
where $p_1, \ldots, p_n \in P$. There are $p^n$ possible distinct
$P$-linear combinations of this form, and hence $|F| = p^n$.
So now we know that any finite field has order $p^n$ and a prime subfield
isomorphic to $\mathbb{F}_p$. Our key example $\mathbb{Z}_2[\omega]$ has
characteristic 2 and a prime subfield $\{0, 1\}$ isomorphic to
$\mathbb{F}_2$. Furthermore, it consists exactly of the roots of $x^3 - 1$
together with 0. We can take this slightly further: $\mathbb{Z}_2[\omega]$
consists of all four roots of the polynomial
$x^4 - x = x(x - 1)(x - \omega)(x - \omega^2)$ in $\mathbb{F}_2[x]$. More
generally, we have the following:
Proposition 11.57 Let $F$ be a finite field of prime characteristic $p$ and
order $|F| = q = p^n$ for some $n \in \mathbb{N}$. Then $F$ is a splitting
field of the polynomial $f = x^q - x$ over its prime subfield
$P \cong \mathbb{F}_p$, and every element of $F$ is a root of $f$.

Proof The group of units $U(F)$ consists of all invertible elements
of $F$, and by the definition of a field this is everything but 0. So
$|U(F)| = p^n - 1$. Let $a \in U(F)$. By Proposition 2.34 the order $|a|$
must divide $|U(F)|$, hence $a^{p^n - 1} = 1$, so $a^{p^n} = a$, and
therefore $a^{p^n} - a = 0$. Also, trivially, $0^{p^n} - 0 = 0$, so every
element of $F$ is a root of $x^{p^n} - x$.
This polynomial $f = x^{p^n} - x$ splits over $F$, since
$f = \prod_{a \in F} (x - a)$.
Moreover, $f$ cannot split over any proper subfield of $F$, so $F$ must be
the splitting field of $f$ over $P \cong \mathbb{F}_p$.
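For the key example $\mathbb{Z}_2[\omega]$, the statement that every element is a root of $x^4 - x$ can be verified by brute force. The sketch below assumes a representation chosen purely for illustration: the pair `(c0, c1)` stands for $c_0 + c_1\omega$ with $c_0, c_1 \in \mathbb{F}_2$, and products are reduced using $\omega^2 = \omega + 1$, which follows from $\omega^2 + \omega + 1 = 0$ in characteristic 2. The function names are hypothetical.

```python
from itertools import product

def add(a, b):
    """Add c0 + c1*w and d0 + d1*w, with coefficients in F_2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def mul(a, b):
    """Multiply in the field of order 4, reducing with w^2 = w + 1."""
    c0 = a[0] * b[0]
    c1 = a[0] * b[1] + a[1] * b[0]
    c2 = a[1] * b[1]                       # coefficient of w^2 before reduction
    return ((c0 + c2) % 2, (c1 + c2) % 2)

def power(a, k):
    """Compute a^k by repeated multiplication (k >= 0)."""
    result = (1, 0)
    for _ in range(k):
        result = mul(result, a)
    return result

field = list(product(range(2), repeat=2))  # the four elements 0, w, 1, 1 + w
one, w = (1, 0), (0, 1)

all_roots_of_x4_minus_x = all(power(a, 4) == a for a in field)
w_cubed = power(w, 3)                      # w is a cube root of unity
sum_of_cube_roots = add(add(one, w), mul(w, w))   # 1 + w + w^2
```

Since $x^4 - x$ has degree 4 and all four field elements are roots, it splits as $\prod_{a}(x - a)$ over this field, as in the proof above.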

We're gradually getting closer to a complete classification of finite
fields: they all have prime power order and can be constructed as
splitting fields of the polynomial $x^{p^n} - x$ over $\mathbb{F}_p$. All
that remains is to ask how many different fields there are of a given
order, up to isomorphism, and it turns out the answer is just one. To show
this, we need to introduce the following function, which should look
familiar.
Definition 11.58 Given a polynomial ring $R[x]$, the formal derivative
$d \colon R[x] \to R[x]$ is the function that maps a polynomial
$$f = a_n x^n + \cdots + a_1 x + a_0$$
to the polynomial
$$df = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1.$$
This is not a ring homomorphism, but satisfies the usual linearity
and product rules for derivatives.

We can use this to give the following test for distinct roots:
Proposition 11.59 Let $F$ be a field, and suppose that $f \in F[x]$. Let $L$
be a splitting field for $f$ over $F$. Then the roots of $f$ in $L$ are
distinct if and only if $f$ and $df$ have no non-constant common factor.

Proof Suppose that $f$ has a repeated root $a \in L$, so that
$f = (x - a)^m g$ for some $m \geq 2$ and $g \in L[x]$. Then
$df = m(x - a)^{m-1} g + (x - a)^m \, dg$, so $f$ and $df$ have the common
factor $(x - a)^{m-1}$.
Conversely, suppose that $f$ has no repeated roots. Then for each
root $a$ of $f$ in $L$ we have $f = (x - a) g$ where $g(a) \neq 0$. Then
$df = g + (x - a)\,dg$ and so $df(a) = g(a) \neq 0$. By the Remainder
Theorem (Theorem 10.7, page 398), $(x - a)$ isn't a factor of $df$. This is
true for every root $a$ of $f$ in $L$, and hence for every factor $(x - a)$
of $f$ in $L[x]$, so $f$ and $df$ have no non-constant common factors.
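The criterion of Proposition 11.59 is easy to apply by machine over a prime field, since the greatest common divisor of $f$ and $df$ can be found with the Euclidean algorithm. The following is a sketch under stated assumptions: polynomials over $\mathbb{F}_p$ are coefficient lists with the lowest degree first, the helper names (`trim`, `derivative`, `poly_mod`, `poly_gcd`) are hypothetical, and the modular inverse uses the three-argument `pow`, which needs Python 3.8 or later.

```python
def trim(poly):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while poly and poly[-1] == 0:
        poly = poly[:-1]
    return poly

def derivative(poly, p):
    """Formal derivative of a polynomial over F_p."""
    return trim([(i * c) % p for i, c in enumerate(poly)][1:])

def poly_mod(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    inv_lead = pow(b[-1], -1, p)           # inverse of the leading coefficient of b
    while len(a) >= len(b):
        factor = (a[-1] * inv_lead) % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        a = trim(a)
    return a

def poly_gcd(a, b, p):
    """Monic greatest common divisor over F_p (a is assumed nonzero)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv_lead = pow(a[-1], -1, p)
    return [(c * inv_lead) % p for c in a]

p = 3
f = [0, -1] + [0] * 7 + [1]        # x^9 - x over F_3
g = [1, -2, 1]                     # (x - 1)^2 over F_3

gcd_f = poly_gcd(f, derivative(f, p), p)   # constant, so f has distinct roots
gcd_g = poly_gcd(g, derivative(g, p), p)   # x + 2 = x - 1, so g has a repeated root
```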

Proposition 11.60 Let $p$ be prime, and $n \in \mathbb{N}$. Then there
exists a field $\mathbb{F}_q$ of order $q = p^n$ and characteristic $p$,
and this field is unique up to isomorphism.

Proof Let $L$ be the splitting field of $f = x^{p^n} - x$ over
$\mathbb{F}_p$. This field has characteristic $p$, and hence the formal
derivative $df = p^n x^{p^n - 1} - 1 = -1$, which has no non-constant
factors in common with $f$. So, by Proposition 11.59, $f$ has no repeated
roots.
Let $K$ be the set consisting of these distinct roots; there are
$\deg(f) = p^n$ of them. We claim that $K$ is a subfield of $L$. Certainly
it contains 0 and 1, since $0^{p^n} - 0 = 0$ and $1^{p^n} - 1 = 0$. Also,
for any $a, b \in K$ we have
$$(a + b)^{p^n} = a^{p^n} + b^{p^n} = a + b$$
by the Freshman's Binomial Theorem (Proposition 8.20, page 332), and
$$(ab)^{p^n} = a^{p^n} b^{p^n} = ab$$
and
$$(a^{-1})^{p^n} = (a^{p^n})^{-1} = a^{-1},$$
so $K$ is closed under addition, multiplication and inversion. This
subfield $K$ is itself the splitting field of $f$ over $\mathbb{F}_p$: it
contains all the roots of $f$, but no proper subfield of $K$ does.
Hence, for any prime $p$ and positive integer $n$ there exists a field of
order $p^n$ with characteristic $p$. By Proposition 11.57, this field is the
splitting field of $f = x^{p^n} - x$ over $\mathbb{F}_p$, and by
Proposition 11.52 any other such field is isomorphic to $K$.
Definition 11.61 For any prime $p$ and positive integer $n$, the finite
field of characteristic $p$ and order $q = p^n$ discussed above is the
Galois field or finite field of order $q$. We denote it $\mathbb{F}_q$ or
$\mathrm{GF}(q)$.

We end this section with a proof of the irreducibility of the cyclotomic
polynomials $\Phi_n$ for all $n \in \mathbb{N}$.
Proposition 11.62 The cyclotomic polynomial $\Phi_n$ is irreducible for any
$n \in \mathbb{N}$.

Proof Suppose that $\Phi_n$ isn't irreducible. Let $\zeta = e^{2\pi i/n}$,
and let $f$ be a monic irreducible factor of $\Phi_n$ with $f(\zeta) = 0$.
Since $f$ is a proper factor of $\Phi_n$, not every primitive $n$th root of
unity is a root of $f$. Let $k$ be the smallest positive exponent coprime
to $n$ for which $f(\zeta^k) \neq 0$, and let $p$ be a prime factor of $k$.
Since $\gcd(k, n) = 1$, it follows that $\gcd(p, n) = 1$ too. Set
$\xi = \zeta^{k/p}$. Then $f(\xi) = 0$ but $f(\xi^p) = f(\zeta^k) \neq 0$.
The polynomial $f$ is irreducible and monic, and divides $x^n - 1$ in
$\mathbb{Q}[x]$, so $x^n - 1 = fg$ with $f, g \in \mathbb{Z}[x]$ by
Proposition 11.18. Then $0 = (\xi^p)^n - 1 = f(\xi^p) g(\xi^p)$, and since
$f(\xi^p) = f(\zeta^k) \neq 0$ it must be the case that $g(\xi^p) = 0$.
Let $h(x) = g(x^p)$. Then $h(\xi) = g(\xi^p) = 0$. Since $f(\xi) = 0$ and
$f$ is monic and irreducible, it must follow that $f \mid h$ in
$\mathbb{Q}[x]$ and also in $\mathbb{Z}[x]$.
Let $\rho \colon \mathbb{Z}[x] \to \mathbb{Z}_p[x]$ be the modulo-$p$
reduction map defined by
$$\rho(a_m x^m + \cdots + a_1 x + a_0) = [a_m]_p x^m + \cdots + [a_1]_p x + [a_0]_p,$$
where $[\,\cdot\,]_p$ denotes the residue modulo $p$. Then $\rho(f(x))$
divides $\rho(h(x)) = \rho(g(x^p)) = \rho(g(x))^p$ in $\mathbb{Z}_p[x]$. So
$\rho(f)$ and $\rho(g)$ have an irreducible factor $q$ in common.
However, $x^n - 1 = \rho(x^n - 1) = \rho(f)\rho(g)$, which means that
$q^2 \mid x^n - 1$ in $\mathbb{Z}_p[x]$, and therefore that $x^n - 1$ has a
repeated linear factor in its splitting field $E_p$ over
$\mathbb{F}_p = \mathbb{Z}_p$. But this can't be the case, because $p$
and $n$ are coprime, so $x^n - 1$ doesn't have any repeated roots in $E_p$.
(Alternatively, $x^n - 1$ and $d(x^n - 1) = n x^{n-1}$ have no non-constant
common factor in $\mathbb{F}_p[x]$, and hence by Proposition 11.59,
$x^n - 1$ has no repeated roots in $E_p$.) This is a contradiction, so
$\Phi_n$ must be irreducible after all.
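The cyclotomic polynomials themselves are straightforward to compute. The sketch below relies on the standard identity $x^n - 1 = \prod_{d \mid n} \Phi_d$, which is assumed here rather than quoted from this section, and divides $x^n - 1$ by $\Phi_d$ for each proper divisor $d$ of $n$. The function names are hypothetical, and the unmemoised recursion is only intended for small $n$.

```python
def poly_divide_exact(num, den):
    """Exact division of integer polynomials (lists, lowest degree first); den must be monic."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quotient) - 1, -1, -1):
        coeff = num[shift + len(den) - 1]
        quotient[shift] = coeff
        for i, c in enumerate(den):
            num[shift + i] -= coeff * c
    assert all(c == 0 for c in num), "division was not exact"
    return quotient

def cyclotomic(n):
    """Phi_n, computed by dividing x^n - 1 by Phi_d for every proper divisor d of n."""
    poly = [-1] + [0] * (n - 1) + [1]          # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return poly

phi_5 = cyclotomic(5)         # x^4 + x^3 + x^2 + x + 1
phi_12 = cyclotomic(12)       # x^4 - x^2 + 1
```

The output lists the integer coefficients only; the irreducibility of each $\Phi_n$ is what the proposition above establishes, and the code does not test it.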

11.4 Field automorphisms and the Galois group

I have no doubt that an author never harms his readers more than when he
hides a difficulty.
Évariste Galois (1811–1832), Deux mémoires d'Analyse pure (1831)

It should now be clear that there is a strong connection between
solving polynomial equations and studying field extensions. In this
section we will work out the details of this connection, using
techniques pioneered by the French mathematician Évariste Galois
(1811–1832) in the few years before his death in a duel at the age of 20.
The approach we'll take is one that has proved very fruitful on many
occasions before: we will use group theory to study the symmetry of
the objects under investigation, which in this case are field extensions.
We must first decide what the correct notion of symmetry is.
The answer is given by our earlier discussion of simple algebraic field
extensions. Proposition 11.44 in particular says that if $\alpha$ and
$\beta$ have the same minimal polynomial over some field $F$, then the
extensions $F(\alpha) : F$ and $F(\beta) : F$ are $F$-isomorphic. When
studying the internal structure of a field extension $E : F$, we want to
see how many ways we can get $E$ by extending $F$. Equivalently, we want to
find all the different ways of mapping some field $E$ to itself that both
respect the field structure of $E$ and keep the embedded copy of $F$ fixed.
The concept we're heading inexorably towards is that of an
$F$-automorphism: an $F$-isomorphism from $E$ to itself.
It is straightforward to show that the automorphisms of a field $E$ form
a group $\operatorname{Aut}(E)$, and it is almost as straightforward to show
that the $F$-automorphisms of $E$ also form a group, a subgroup of
$\operatorname{Aut}(E)$.

[Figure: Évariste Galois (1811–1832)]

Definition 11.63 Let $E : F$ be a field extension. The Galois group
$\operatorname{Gal}(E : F)$ is the group of $F$-automorphisms of $E$; it is
a subgroup of the full automorphism group $\operatorname{Aut}(E)$.

Let's look at a few examples.

Example 11.64 Consider the simple algebraic extension
$\mathbb{Q}(\sqrt{2}) : \mathbb{Q}$. A typical element of
$\mathbb{Q}(\sqrt{2})$ is of the form $a + b\sqrt{2}$, for some
$a, b \in \mathbb{Q}$, and a $\mathbb{Q}$-automorphism $\phi$ will map this
to $a + b\phi(\sqrt{2})$. So $\phi$ is determined by what it does to the
adjoined element $\sqrt{2}$.
We have two choices: either $\phi(\sqrt{2}) = \sqrt{2}$, in which case
$\phi$ is the identity map, or $\phi(\sqrt{2}) = -\sqrt{2}$, a kind of
$\mathbb{Q}(\sqrt{2})$ analogue of the complex conjugation map. Hence
$\operatorname{Gal}(\mathbb{Q}(\sqrt{2}) : \mathbb{Q}) \cong \mathbb{Z}_2$.

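Example 11.65 The Galois group
$\operatorname{Gal}(\mathbb{C} : \mathbb{R}) \cong \mathbb{Z}_2$. To see
this, consider an arbitrary $\mathbb{R}$-automorphism $\phi$ of
$\mathbb{C}$. For any element $a + bi \in \mathbb{C}$ we have
$$\phi(a + bi) = \phi(a) + \phi(b)\phi(i) = a + b\phi(i)$$
so $\phi$ is completely determined by where it sends $i$. Furthermore,
$$\phi(i)^2 = \phi(i^2) = \phi(-1) = -1$$
so $\phi(i) = \pm i$ and there are hence just two possible
$\mathbb{R}$-automorphisms of $\mathbb{C}$. If $\phi(i) = i$ then this is
the identity automorphism, while if $\phi(i) = -i$ then $\phi$ is the
conjugation map. Therefore
$\operatorname{Gal}(\mathbb{C} : \mathbb{R}) \cong \mathbb{Z}_2$.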

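Example 11.66 Any $\mathbb{Q}$-automorphism $\phi$ of
$\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}$ is determined by
$\phi(\sqrt{2})$ and $\phi(\sqrt{3})$. The existence of two adjoined
elements raises the possibility that $\phi$ might map one to the other but,
in this case at least, this can't happen. The reason for this is that
$$\phi(\sqrt{2})^2 = \phi\big((\sqrt{2})^2\big) = \phi(2) = 2$$
and
$$\phi(\sqrt{3})^2 = \phi\big((\sqrt{3})^2\big) = \phi(3) = 3$$
so $\phi$ can only map $\sqrt{2}$ to $\pm\sqrt{2}$ and $\sqrt{3}$ to
$\pm\sqrt{3}$. There are therefore four different
$\mathbb{Q}$-automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$, listed in
Table 11.2. Each of these has order 2, so
$\operatorname{Gal}(\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}) \cong V_4 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$.

Table 11.2: The $\mathbb{Q}$-automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$

                              $\phi(\sqrt{2})$   $\phi(\sqrt{3})$
    $\phi_1 = \mathrm{id}$    $\sqrt{2}$         $\sqrt{3}$
    $\phi_2$                  $-\sqrt{2}$        $\sqrt{3}$
    $\phi_3$                  $\sqrt{2}$         $-\sqrt{3}$
    $\phi_4$                  $-\sqrt{2}$        $-\sqrt{3}$

Example 11.67 Let $\alpha = \sqrt[3]{5}$, and consider some
$\mathbb{Q}$-automorphism
$\phi \in \operatorname{Gal}(\mathbb{Q}(\alpha) : \mathbb{Q})$. Then
$\phi(\alpha)^3 = \phi(5) = 5$, so $\phi$ must map $\alpha$ to a
real cube root of 5, of which $\alpha$ is the only one. So
$\phi = \mathrm{id}$ and hence
$\operatorname{Gal}(\mathbb{Q}(\alpha) : \mathbb{Q})$ is trivial.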
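The four sign-change maps of Example 11.66 can be checked computationally. The sketch below uses a representation chosen only for illustration: an element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is stored as the 4-tuple `(a, b, c, d)` of rationals, and the names `mul` and `automorphism` are hypothetical. It confirms, on two sample elements rather than in general, that each map respects multiplication, and that each map is its own inverse, as expected for the Klein four-group.

```python
from fractions import Fraction
from itertools import product

def mul(x, y):
    """Multiply 4-tuples (a, b, c, d) standing for a + b*r2 + c*r3 + d*r6,
    where r2^2 = 2, r3^2 = 3 and r6 = r2*r3."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e + 2 * b * f + 3 * c * g + 6 * d * h,
        a * f + b * e + 3 * (c * h + d * g),
        a * g + c * e + 2 * (b * h + d * f),
        a * h + d * e + b * g + c * f,
    )

def automorphism(s2, s3):
    """The map fixing Q with r2 -> s2*r2 and r3 -> s3*r3, where s2, s3 are 1 or -1."""
    return lambda x: (x[0], s2 * x[1], s3 * x[2], s2 * s3 * x[3])

autos = {signs: automorphism(*signs) for signs in product((1, -1), repeat=2)}

# Two arbitrary sample elements, used for a spot check only.
x = (Fraction(1), Fraction(2), Fraction(-1, 3), Fraction(5))
y = (Fraction(0), Fraction(1, 2), Fraction(4), Fraction(-1))

respects_mul = all(phi(mul(x, y)) == mul(phi(x), phi(y)) for phi in autos.values())
all_order_two = all(phi(phi(x)) == x for phi in autos.values())
```

Additivity and the fact that each map fixes $\mathbb{Q}$ are immediate from the coordinate form, since each map only changes signs of the last three coordinates.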
In Example 11.65 we saw that the only nontrivial $\mathbb{R}$-automorphism
in $\mathbb{C}$ is the complex conjugation map swapping $i$ and $-i$.
Something similar happened in Example 11.64, where there are again two
$\mathbb{Q}$-automorphisms: the identity, and the one that swaps $\sqrt{2}$
and $-\sqrt{2}$. In Example 11.66 things are slightly more complicated,
with four $\mathbb{Q}$-automorphisms. Motivated by this, and the other
examples above, we introduce the following definition:
Definition 11.68 Let $E : F$ be an algebraic extension. Then
$\alpha, \beta \in E$ are $F$-conjugate if there is an $F$-automorphism of
$E$ mapping $\alpha$ to $\beta$. Equivalently, $\alpha$ and $\beta$ are
$F$-conjugate if they have the same minimal polynomial over $F$.

The equivalence of the two criteria in the above definition is a conse-


464 a course in abstract algebra

quence of the following:


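Proposition 11.69 Let $E : F$ be a field extension, let $f \in F[x]$ be a polynomial over $F$, and let $\sigma \in \mathrm{Gal}(E : F)$ be an $F$-automorphism of $E$. Then $\sigma(f(\alpha)) = f(\sigma(\alpha))$ for any $\alpha \in E$, and in particular $\sigma$ permutes the roots of $f$ in $E$.

Proof Let $f = a_n x^n + \cdots + a_1 x + a_0$ for some $a_0, \ldots, a_n \in F$. Then
$$\begin{aligned}
\sigma(f(\alpha)) &= \sigma(a_n \alpha^n + \cdots + a_1 \alpha + a_0) \\
&= \sigma(a_n)\sigma(\alpha)^n + \cdots + \sigma(a_1)\sigma(\alpha) + \sigma(a_0) \\
&= a_n \sigma(\alpha)^n + \cdots + a_1 \sigma(\alpha) + a_0 \\
&= f(\sigma(\alpha))
\end{aligned}$$
for any $\alpha \in E$. In particular, if $\alpha$ is a root of $f$ then $\sigma(\alpha)$ is too, since $f(\sigma(\alpha)) = \sigma(f(\alpha)) = \sigma(0) = 0$. Since $\sigma$ is a field automorphism, it is injective, and a nonzero polynomial $f$ has only finitely many roots in $E$, so $\sigma$ restricts to a bijection of this finite set: it permutes the roots of $f$ amongst themselves.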
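To see Proposition 11.69 in action, here is a minimal sketch in Python for the extension $\mathbb{Q}(\sqrt2) : \mathbb{Q}$. The class `QSqrt2`, the helper `evaluate` and the sample polynomial are illustrative choices, not part of the text; the automorphism `sigma` is the one sending $\sqrt2$ to $-\sqrt2$.

```python
from fractions import Fraction


class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), with rational a and b (illustrative helper)."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"


def sigma(z):
    """The nontrivial Q-automorphism of Q(sqrt 2): sqrt(2) -> -sqrt(2)."""
    return QSqrt2(z.a, -z.b)


def evaluate(coeffs, z):
    """Evaluate a polynomial with rational coefficients (highest degree first) at z."""
    result = QSqrt2(0)
    for c in coeffs:
        result = result * z + QSqrt2(c)  # Horner's rule
    return result


f = [1, -2, -1]                    # f = x^2 - 2x - 1, whose roots are 1 + sqrt2 and 1 - sqrt2
alpha = QSqrt2(1, 1)               # the root 1 + sqrt2
beta = QSqrt2(3, Fraction(1, 2))   # an arbitrary element that is not a root

print(sigma(evaluate(f, beta)) == evaluate(f, sigma(beta)))  # sigma commutes with f
print(evaluate(f, alpha), "|", evaluate(f, sigma(alpha)))    # both values are zero
```

The check on `beta` illustrates the identity $\sigma(f(\alpha)) = f(\sigma(\alpha))$ for a non-root, and the check on `alpha` shows $\sigma$ carrying one root of $f$ to the other.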
Although understanding field extensions is a worthwhile goal in its own right, a major part of our motivation for doing so is to study the solubility of polynomial equations. Proposition 11.69 presents a link between the roots of polynomials in $F[x]$ and the $F$-automorphisms of extensions containing those roots. We can think of the Galois group $\mathrm{Gal}(E : F)$ as a group of symmetries, not just of the field extension $E : F$ but also of the polynomials over $F$ whose roots lie in $E$. To that end, we introduce the following definition, which extends Definition 11.63:
Definition 11.70 Let $f \in F[x]$ be a polynomial, and let $E$ be the splitting field of $f$ over $F$. Then the Galois group $\mathrm{Gal}(f)$ or $\mathrm{Gal}(f, F)$ of $f$ over $F$ is the group $\mathrm{Gal}(E : F)$ of the splitting field extension of $f$ over $F$.
Calculating Galois groups is often tricky, but we will look at some
examples now.
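Example 11.71 What is $\mathrm{Gal}(x^3 - 1, \mathbb{Q})$? We need to find the splitting field of $f = x^3 - 1$ over $\mathbb{Q}$, and first we notice that this polynomial factorises as $(x - 1)(x^2 + x + 1)$. The second factor $g = x^2 + x + 1$ is irreducible over $\mathbb{Q}$, so the splitting field of $f$ must be the smallest field in which $g$ splits. This is $\mathbb{Q}(\omega)$ where $\omega = -\tfrac12 + \tfrac{\sqrt3}{2} i$. Any $\mathbb{Q}$-automorphism $\sigma$ of $\mathbb{Q}(\omega)$ must map $\omega$ to either $\omega$ or $\omega^2$, since $\sigma(\omega)^3 = \sigma(1) = 1$, but we can't have $\sigma(\omega) = 1$ because then $\sigma(\omega) = \sigma(1)$ and $\sigma$ wouldn't be injective. There are thus two possibilities: $\sigma_1 = \mathrm{id}$ and $\sigma_2$ given by $\sigma_2(\omega) = \omega^2$. Hence $|\mathrm{Gal}(x^3 - 1, \mathbb{Q})| = 2$, and the only possibility is $\mathrm{Gal}(x^3 - 1, \mathbb{Q}) \cong \mathbb{Z}_2$.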
The next example is a slight variation with markedly different results.

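Example 11.72 In Example 11.50 we investigated the splitting field of the polynomial $f = x^3 - 5$ over $\mathbb{Q}$. Let's now calculate the Galois group $\mathrm{Gal}(f, \mathbb{Q})$ of this polynomial. Recall from the earlier discussion that $f$ has three roots: $\alpha$, $\alpha\omega$ and $\alpha\omega^2$, where $\alpha = \sqrt[3]{5}$ and $\omega = -\tfrac12 + \tfrac{\sqrt3}{2} i$.
Any $\mathbb{Q}$-automorphism $\sigma$ of the splitting field $\mathbb{Q}(\alpha, \omega)$ is determined by what it does to $\alpha$ and $\omega$. Moreover, we require $\sigma(\alpha)^3 = \sigma(5) = 5$ and $\sigma(\omega)^3 = \sigma(1) = 1$. There are three possibilities for $\sigma(\alpha)$, namely $\sigma(\alpha) = \alpha$, $\alpha\omega$ or $\alpha\omega^2$, and two possibilities for $\sigma(\omega)$, namely $\sigma(\omega) = \omega$ or $\omega^2$. (We can't have $\sigma(\omega) = 1$ because that wouldn't be an automorphism.) So we have six possibilities, listed in Table 11.3. All of these are $\mathbb{Q}$-automorphisms of $\mathbb{Q}(\alpha, \omega)$ so $|\mathrm{Gal}(x^3 - 5, \mathbb{Q})| = 6$. We thus have two possibilities for the Galois group: either $\mathbb{Z}_6$ or $S_3$.

Table 11.3: Elements of $\mathrm{Gal}(x^3 - 5, \mathbb{Q})$

                            $\sigma(\alpha)$     $\sigma(\omega)$
  $\sigma_1 = \mathrm{id}$  $\alpha$             $\omega$
  $\sigma_2$                $\alpha\omega$       $\omega$
  $\sigma_3$                $\alpha\omega^2$     $\omega$
  $\sigma_4$                $\alpha$             $\omega^2$
  $\sigma_5$                $\alpha\omega$       $\omega^2$
  $\sigma_6$                $\alpha\omega^2$     $\omega^2$

We can figure out which one it is in various ways. We could, for example, consider the orders of the elements:
$$|\sigma_1| = 1, \quad |\sigma_2| = |\sigma_3| = 3, \quad |\sigma_4| = |\sigma_5| = |\sigma_6| = 2.$$
There is no element of order 6, so this can't be $\mathbb{Z}_6$ and must therefore be $S_3$ instead. We can also see this from the way the automorphisms permute the roots of $f$, shown in Table 11.4. Each of the six possible permutations is represented, so $\mathrm{Gal}(x^3 - 5, \mathbb{Q}) \cong S_3$.

Table 11.4: Permutations of the roots of $x^3 - 5$

                            $\sigma(\alpha)$     $\sigma(\alpha\omega)$   $\sigma(\alpha\omega^2)$
  $\sigma_1 = \mathrm{id}$  $\alpha$             $\alpha\omega$           $\alpha\omega^2$
  $\sigma_2$                $\alpha\omega$       $\alpha\omega^2$         $\alpha$
  $\sigma_3$                $\alpha\omega^2$     $\alpha$                 $\alpha\omega$
  $\sigma_4$                $\alpha$             $\alpha\omega^2$         $\alpha\omega$
  $\sigma_5$                $\alpha\omega$       $\alpha$                 $\alpha\omega^2$
  $\sigma_6$                $\alpha\omega^2$     $\alpha\omega$           $\alpha$

The following example is rather more complicated.

Example 11.73 Let $f = x^4 - 5x^2 + 5$. We want to calculate $\mathrm{Gal}(f, \mathbb{Q})$, and to do this we start by finding the roots of $f$. In general, finding the roots of a quartic polynomial is somewhat involved (the formula derived by Lodovico Ferrari in the sixteenth century is rather complicated) but in this case we can view it as a quadratic polynomial in $x^2$, and solve it in the usual way to get
$$x^2 = \tfrac52 \pm \tfrac{\sqrt5}{2}.$$
The roots of $f$ are therefore
$$\alpha_1 = \sqrt{\tfrac52 + \tfrac{\sqrt5}{2}}, \qquad \alpha_2 = \sqrt{\tfrac52 - \tfrac{\sqrt5}{2}},$$
$$\alpha_3 = -\sqrt{\tfrac52 + \tfrac{\sqrt5}{2}} = -\alpha_1, \qquad \alpha_4 = -\sqrt{\tfrac52 - \tfrac{\sqrt5}{2}} = -\alpha_2.$$
The Galois group $G = \mathrm{Gal}(f, \mathbb{Q})$ consists of all $\mathbb{Q}$-automorphisms of the splitting field of $f$, which is $\mathbb{Q}(\alpha_1, \alpha_2)$. These permute the roots amongst themselves, and hence $G \leq S_4$.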
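Is $\mathbb{Q}(\alpha_1, \alpha_2)$ the simplest way of writing this splitting field? Actually we can simplify it further. Observe that $\alpha_1 \alpha_2 = \sqrt5$, and also that $2\alpha_1^2 = 5 + \sqrt5$. Hence $\alpha_1 \alpha_2 = 2\alpha_1^2 - 5$, and so $\alpha_2 = 2\alpha_1 - 5/\alpha_1$. Furthermore, $\alpha_3 = -\alpha_1$ and $\alpha_4 = -\alpha_2$, so we can recover all the other roots from $\alpha_1$ and the required splitting field is $\mathbb{Q}(\alpha_1)$. The polynomial $f$ is irreducible (by Eisenstein's Criterion) and monic, and is hence the minimal polynomial for the extension $\mathbb{Q}(\alpha_1) : \mathbb{Q}$, which must therefore have degree 4.
Suppose that $\sigma \in G$, and that $\sigma(\alpha_1) = \alpha_2$. Then
$$\sigma(\sqrt5) = 2\sigma(\alpha_1)^2 - 5 = 2\alpha_2^2 - 5 = -\sqrt5,$$
and hence
$$\sigma(\alpha_2) = \sigma(\sqrt5/\alpha_1) = -\sqrt5/\alpha_2 = -\alpha_1 = \alpha_3.$$
Furthermore,
$$\sigma(\alpha_3) = -\sigma(\alpha_1) = -\alpha_2 = \alpha_4, \qquad \sigma(\alpha_4) = -\sigma(\alpha_2) = -\alpha_3 = \alpha_1.$$
So $\sigma$, and for that matter any other $\mathbb{Q}$-automorphism that maps $\alpha_1$ to $\alpha_2$, cyclically permutes the four roots. Therefore $\langle\sigma\rangle \cong \mathbb{Z}_4$. Any $\mathbb{Q}$-automorphism mapping $\alpha_1$ to $\alpha_4$ must similarly be equal to $\sigma^{-1}$.
Now consider a $\mathbb{Q}$-automorphism $\tau$ that maps $\alpha_1$ to $\alpha_3 = -\alpha_1$. Then
$$\tau(\sqrt5) = 2\tau(\alpha_1)^2 - 5 = 2\alpha_1^2 - 5 = \sqrt5$$
and so
$$\tau(\alpha_2) = \tau(\sqrt5/\alpha_1) = -\sqrt5/\alpha_1 = -\alpha_2 = \alpha_4.$$
Furthermore, $\tau(\alpha_3) = \alpha_1$ and $\tau(\alpha_4) = \alpha_2$, so $\tau = \sigma^2$.
These are the only possibilities, and so $G = \langle\sigma\rangle \cong \mathbb{Z}_4$.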
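The arithmetic in Example 11.73 can be checked numerically. The sketch below is only a floating-point sanity check, not a proof: it assumes that the automorphism $\sigma$ with $\sigma(\alpha_1) = \alpha_2$ acts on each root as the rational expression $g(x) = 2x - 5/x$ (which holds here because every root is $\pm\alpha_1$ or $\pm g(\alpha_1)$ and $g$ is an odd function). The names `g`, `index_of` and `sigma` are illustrative.

```python
import math

s5 = math.sqrt(5)
a1 = math.sqrt((5 + s5) / 2)
a2 = math.sqrt((5 - s5) / 2)
roots = [a1, a2, -a1, -a2]   # alpha_1, alpha_2, alpha_3, alpha_4


def f(x):
    """The quartic x^4 - 5x^2 + 5 from Example 11.73."""
    return x**4 - 5 * x**2 + 5


def g(x):
    """The rational expression 2x - 5/x, which sends alpha_1 to alpha_2."""
    return 2 * x - 5 / x


def index_of(value):
    """0-based index of the root numerically closest to value."""
    return min(range(4), key=lambda i: abs(roots[i] - value))


# sigma[i] is the index of the root that alpha_{i+1} is sent to.
sigma = [index_of(g(r)) for r in roots]

print([abs(f(r)) < 1e-9 for r in roots])   # all four values are roots of f
print(math.isclose(a1 * a2, s5))           # alpha_1 * alpha_2 = sqrt(5)
print(sigma)                               # the 4-cycle 0 -> 1 -> 2 -> 3 -> 0
```

The list `sigma` records the cycle $\alpha_1 \mapsto \alpha_2 \mapsto \alpha_3 \mapsto \alpha_4 \mapsto \alpha_1$, matching the computation in the example.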

11.5 The Galois Correspondence

It's people like that who make you realise how little you've accomplished. It is a sobering thought, for example, that when Mozart was my age, he had been dead for two years.
Tom Lehrer, introduction to Alma, from: That Was The Year That Was (1965)

It's illuminating to look at subgroups of the Galois group of a field extension $E : F$, and see how the various $F$-automorphisms act on the various subfields of $E$. In Example 11.66 we saw that
$$\mathrm{Gal}(\mathbb{Q}(\sqrt2, \sqrt3) : \mathbb{Q}) = \{\sigma_1 = \mathrm{id}, \sigma_2, \sigma_3, \sigma_4\} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2.$$
Every element of this group fixes the base field $\mathbb{Q}$. Every element of the subgroup $\{\sigma_1, \sigma_2\}$ fixes the subfield $\mathbb{Q}(\sqrt2)$, but $\sigma_2$ acts nontrivially on $\mathbb{Q}(\sqrt3)$ so the subgroup doesn't fix this subfield. Similarly, $\{\sigma_1, \sigma_3\}$ leaves $\mathbb{Q}(\sqrt3)$ invariant but not $\mathbb{Q}(\sqrt2)$. Neither $\mathbb{Q}(\sqrt2)$ nor $\mathbb{Q}(\sqrt3)$ is fixed by $\{\sigma_1, \sigma_4\}$, but $\sqrt6 = \sqrt2\sqrt3$ is mapped to itself by both $\sigma_1$ and $\sigma_4$, so the subfield $\mathbb{Q}(\sqrt6)$ is fixed by this subgroup. Finally, the trivial subgroup $\{\sigma_1\}$ leaves $\mathbb{Q}(\sqrt2, \sqrt3)$ invariant.

[Figure 11.1: Subgroup lattice diagram for $\mathrm{Gal}(\mathbb{Q}(\sqrt2, \sqrt3) : \mathbb{Q}) \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2$, with $\{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$ at the top, $\{\sigma_1, \sigma_2\}$, $\{\sigma_1, \sigma_3\}$ and $\{\sigma_1, \sigma_4\}$ in the middle, and $\{\sigma_1\}$ at the bottom.]

[Figure 11.2: Subfield lattice diagram for $\mathbb{Q}(\sqrt2, \sqrt3)$, with $\mathbb{Q}$ at the top, $\mathbb{Q}(\sqrt2)$, $\mathbb{Q}(\sqrt3)$ and $\mathbb{Q}(\sqrt6)$ in the middle, and $\mathbb{Q}(\sqrt2, \sqrt3)$ at the bottom.]

Figure 11.1 shows the subgroup lattice for $\mathrm{Gal}(\mathbb{Q}(\sqrt2, \sqrt3) : \mathbb{Q})$, with inclusions going up the page. Figure 11.2 shows the subfield lattice for $\mathbb{Q}(\sqrt2, \sqrt3)$, this time drawn upside down, with inclusions going down the page. We'll come to the reason for this reversal soon, but for the moment observe the striking and suggestive similarity between the two diagrams. Each subfield in Figure 11.2 is fixed by the corresponding subgroup in Figure 11.1. This is an example of the Galois correspondence, which is the subject of the rest of this section.
Let's try to generalise this construction. Given an extension $E : F$, we want to define a correspondence between groups and fields, so that on one hand we have subgroups of the Galois group $\mathrm{Gal}(E : F)$, and on the other we have subfields $K$ of $E$, such that $F \subseteq K \subseteq E$. The latter is something we'll run into a lot over the rest of this chapter, so it would be a good idea to introduce some appropriate terminology.
Definition 11.74 An intermediate subfield of an extension $E : F$ is a subfield $K$ of $E$ that contains $F$, such that $F \subseteq K \subseteq E$.
Firstly, we want to define a map from subgroups of $\mathrm{Gal}(E : F)$ to intermediate subfields of $E : F$. In our key example above, we identified each subgroup with the subfield that it left invariant. So, for any subgroup $H \leq \mathrm{Gal}(E : F)$ we define the fixed field of $H$ to be
$$\mathcal{F}(H) = \mathrm{Fix}_H(E) = \{a \in E : \sigma(a) = a \text{ for all } \sigma \in H\}.$$
(Footnote 15: This is sometimes denoted $E^H$.)
Secondly, going back the other way, we set
$$\mathcal{G}(K) = \mathrm{Gal}(E : K) = \{\sigma \in \mathrm{Gal}(E : F) : \sigma(k) = k \text{ for all } k \in K\}$$
for any intermediate subfield $K$.
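The fixed fields in the key example can be tabulated mechanically. The following is a small sketch, assuming we represent elements of $\mathbb{Q}(\sqrt2, \sqrt3)$ by their coordinates with respect to the basis $1, \sqrt2, \sqrt3, \sqrt6$ over $\mathbb{Q}$; each of the four automorphisms from Table 11.2 then acts by multiplying each basis element by $\pm1$, so the fixed field of a subgroup is spanned by the basis elements whose sign is $+1$ under every member. The names `BASIS`, `AUTS` and `fixed_basis` are illustrative.

```python
# Basis of Q(sqrt2, sqrt3) over Q, and the sign by which each automorphism
# from Table 11.2 scales each basis element.
BASIS = ["1", "sqrt2", "sqrt3", "sqrt6"]
AUTS = {
    "s1": (1, 1, 1, 1),      # identity
    "s2": (1, 1, -1, -1),    # sqrt2 -> sqrt2,  sqrt3 -> -sqrt3
    "s3": (1, -1, 1, -1),    # sqrt2 -> -sqrt2, sqrt3 -> sqrt3
    "s4": (1, -1, -1, 1),    # sqrt2 -> -sqrt2, sqrt3 -> -sqrt3
}


def fixed_basis(subgroup):
    """Basis elements fixed by every automorphism in the given subgroup."""
    return [b for i, b in enumerate(BASIS)
            if all(AUTS[s][i] == 1 for s in subgroup)]


SUBGROUPS = [("s1",), ("s1", "s2"), ("s1", "s3"), ("s1", "s4"),
             ("s1", "s2", "s3", "s4")]

for H in SUBGROUPS:
    print(H, "fixes the span of", fixed_basis(H))
```

Larger subgroups fix smaller subfields, which is the inclusion reversal visible when Figures 11.1 and 11.2 are compared.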
It would all be rather neat and tidy if $\mathcal{F}$ and $\mathcal{G}$ were inverses of each other. But this isn't always the case, as the next two examples show.
Example 11.75 We saw in Example 11.67 that the Galois group of the extension $\mathbb{Q}(\sqrt[3]{5}) : \mathbb{Q}$ is trivial. Therefore $\mathcal{G}(\mathbb{Q})$ must also be trivial. In particular, $\mathcal{F}(\mathcal{G}(\mathbb{Q})) = \mathbb{Q}(\sqrt[3]{5}) \neq \mathbb{Q}$. So $\mathcal{F}$ is not the inverse of $\mathcal{G}$ in this case.
The problem with this example, or at least the thing that stops $\mathcal{F}$ and $\mathcal{G}$ being inverses of each other, is that the minimal polynomial of $\alpha = \sqrt[3]{5}$ is $x^3 - 5$. But its splitting field isn't $\mathbb{Q}(\alpha)$, it's $\mathbb{Q}(\alpha, \omega)$, where $\omega = -\tfrac12 + \tfrac{\sqrt3}{2} i$. The next example also fails, but for a different reason.
Example 11.76 Let $F$ be a field, and adjoin some element $\alpha \notin F$ to get $F(\alpha)$; for the argument below we take $\alpha$ to be transcendental over $F$, so that $\alpha$ is a prime element of $F[\alpha]$. The polynomial ring $F(\alpha)[x]$ consists, as usual, of all finite-degree polynomials with coefficients in $F(\alpha)$. In particular, $x^n - \alpha$ is irreducible over $F(\alpha)$ by Eisenstein's Criterion. Now let $\beta = \alpha^{1/n}$ be an $n$th root of $\alpha$. Then $F(\alpha, \beta) = F(\alpha, \alpha^{1/n}) = F(\alpha^{1/n}) = F(\beta)$ is a simple extension of $F(\alpha)$ with degree $n$.
Now consider the case where $F = \mathbb{F}_p$ for some prime $p$, and $n = p$. Then
$$x^p - \alpha = (x - \alpha^{1/p})^p = (x - \beta)^p,$$
by the Freshman's Binomial Theorem (Proposition 8.20, page 332), and so $x^p - \alpha$ has only one root in $\mathbb{F}_p(\beta)$, and indeed in any larger extension. Therefore $\mathrm{Gal}(\mathbb{F}_p(\beta) : \mathbb{F}_p(\alpha))$ is trivial, and hence $\mathcal{G}(\mathbb{F}_p(\alpha))$, the group of all $\mathbb{F}_p(\alpha)$-automorphisms of $\mathbb{F}_p(\beta)$, must be trivial. Applying $\mathcal{F}$ to this we get the field consisting of every element of $\mathbb{F}_p(\beta)$ fixed by the identity $\mathbb{F}_p(\alpha)$-automorphism, which is all of $\mathbb{F}_p(\beta)$. So $\mathcal{F}(\mathcal{G}(\mathbb{F}_p(\alpha))) = \mathbb{F}_p(\beta)$ and hence $\mathcal{F}$ and $\mathcal{G}$ aren't inverses of each other.
The problem here is that $x^p - \alpha$ only has a single root in the splitting field $\mathbb{F}_p(\beta)$; moreover we can adjoin as many other elements to $\mathbb{F}_p(\beta)$ as we like, and it will still only have a single root, because $x^p - \alpha = (x - \beta)^p$ over $\mathbb{F}_p(\beta)$. We have a single root, repeated $p$ times.
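The collapse of $(x - \beta)^p$ to $x^p - \beta^p$ in characteristic $p$ rests on every binomial coefficient $\binom{p}{k}$ with $0 < k < p$ being divisible by $p$. A minimal sketch, using integers mod $p$ as a stand-in for field elements (the function name `expand_power_mod_p` is illustrative):

```python
from math import comb


def expand_power_mod_p(b, p):
    """Coefficients of (x - b)^p reduced mod p, lowest degree first.

    The coefficient of x^k in (x - b)^p is C(p, k) * (-b)^(p - k).
    """
    return [comb(p, k) * pow(-b, p - k) % p for k in range(p + 1)]


for p in (2, 3, 5, 7):
    # Every middle binomial coefficient is a multiple of p ...
    print(p, all(comb(p, k) % p == 0 for k in range(1, p)))
    # ... so only the constant term and the leading term survive.
    print(p, expand_power_mod_p(2, p))
```

For each prime tested, the expansion reduces to a constant term congruent to $-b^p$ and a leading coefficient 1, with every middle coefficient vanishing mod $p$.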
We want to find a well-defined characterisation of nice field extensions or polynomials where neither of these awkward situations arises. We want to know what criteria an extension $E : F$ must satisfy to ensure $\mathcal{F}$ and $\mathcal{G}$ are inverses of each other, so that there is a neat and tidy correspondence between intermediate subfields of $E : F$ and subgroups of $\mathrm{Gal}(E : F)$.
Before we get to that, we have a few useful facts about $\mathcal{F}$ and $\mathcal{G}$ to prove. The first of these says that while $\mathcal{F}$ and $\mathcal{G}$ aren't necessarily inverses, we do at least have $K \subseteq \mathcal{F}(\mathcal{G}(K))$ and $H \leq \mathcal{G}(\mathcal{F}(H))$ for any intermediate subfield $K$ and subgroup $H$.
Proposition 11.77 Let $K$ be an intermediate subfield of an extension $E : F$ and suppose $H \leq \mathrm{Gal}(E : F)$. Then $K \subseteq \mathcal{F}(\mathcal{G}(K))$ and $H \leq \mathcal{G}(\mathcal{F}(H))$.

Proof The group $\mathcal{G}(K)$ comprises all of the $F$-automorphisms in $\mathrm{Gal}(E : F)$ that keep $K$ fixed pointwise. Any element $k \in K$ is thus fixed by everything in $\mathcal{G}(K) = \mathrm{Gal}(E : K)$ and hence $k \in \mathrm{Fix}_{\mathcal{G}(K)}(E) = \mathcal{F}(\mathcal{G}(K))$. Thus $K \subseteq \mathcal{F}(\mathcal{G}(K))$.
Now considering a subgroup $H \leq \mathrm{Gal}(E : F)$, the field $\mathcal{F}(H) = \mathrm{Fix}_H(E)$ consists of every element of $E$ fixed by the $F$-automorphisms in $H$. So any $h \in H$ fixes everything in $\mathcal{F}(H)$, hence $h \in \mathcal{G}(\mathcal{F}(H))$ and therefore $H \leq \mathcal{G}(\mathcal{F}(H))$.

We've seen so far that there is at least a partial correspondence between intermediate subfields and subgroups of the Galois group. In ideal circumstances, the exact details of which we will shortly investigate, this correspondence is bijective, but as we saw above there are awkward cases in which it isn't. In general, however, there is at least some link between a given subgroup $H$ of the Galois group $\mathrm{Gal}(E : F)$ and the extension $E : \mathcal{F}(H)$:

Proposition 11.78 Let $E : F$ be a finite field extension, and let $H \leq \mathrm{Aut}(E)$ be a finite subgroup of the automorphism group of $E$. Then $[E : \mathcal{F}(H)] = |H|$.
To prove this we need a technical lemma concerning linear independence of field monomorphisms, due to Richard Dedekind (1831–1916):

Lemma 11.79 (Richard Dedekind) Let $F_1$ and $F_2$ be fields and suppose that $f_1, \ldots, f_n : F_1 \to F_2$ are distinct monomorphisms. Then $f_1, \ldots, f_n$ are linearly independent over $F_2$.

Proof We prove this lemma by induction and contradiction. The result is obviously true for $n = 1$, since the only coefficient $a_1 \in F_2$ for which $a_1 f_1 = 0$ is $a_1 = 0$.
Now suppose that the result is true for all $n < k$, for some arbitrary $k \in \mathbb{N}$. That is, every set of fewer than $k$ distinct monomorphisms is linearly independent. Assume that the result fails for $n = k$; that is, there exist $k$ distinct monomorphisms $f_1, \ldots, f_k : F_1 \to F_2$ and coefficients $a_1, \ldots, a_k \in F_2$, not all of which are zero, such that
$$a_1 f_1(x) + \cdots + a_k f_k(x) = 0 \tag{11.3}$$
for all $x \in F_1$. In fact, we may assume that all of $a_1, \ldots, a_k$ are nonzero, otherwise we have a linearly dependent set of fewer than $k$ distinct monomorphisms, contrary to the inductive hypothesis.
Since the monomorphisms are distinct, and in particular $f_1 \neq f_k$, there exists some nonzero element $y \in F_1$ such that $f_1(y) \neq f_k(y)$. Then
$$a_1 f_1(xy) + \cdots + a_k f_k(xy) = 0$$
and so
$$a_1 f_1(x) f_1(y) + \cdots + a_k f_k(x) f_k(y) = 0. \tag{11.4}$$
Multiplying (11.3) by $f_1(y)$ and subtracting (11.4) gives
$$a_2 (f_1(y) - f_2(y)) f_2(x) + \cdots + a_k (f_1(y) - f_k(y)) f_k(x) = 0. \tag{11.5}$$
This is an $F_2$-linear combination of fewer than $k$ distinct monomorphisms $f_2, \ldots, f_k : F_1 \to F_2$. The coefficient of $f_i(x)$ in this equation is $a_i (f_1(y) - f_i(y))$, for $2 \leq i \leq k$. All of $a_2, \ldots, a_k$ are nonzero, and at the very least the coefficient $a_k (f_1(y) - f_k(y))$ must be nonzero, since we have chosen $y$ such that $f_1(y) \neq f_k(y)$. Thus (11.5) implies that $f_2, \ldots, f_k$ are linearly dependent, contradicting the inductive hypothesis.
Therefore, no equation of the form (11.3) exists, and hence $f_1, \ldots, f_k$ are linearly independent.
By induction, any finite set of distinct monomorphisms $f_1, \ldots, f_n$ is linearly independent, as claimed.

We can now prove Proposition 11.78:

Proof of Proposition 11.78 Let $|H| = n$ and let $[E : \mathcal{F}(H)] = m$. Suppose that $H = \{h_1 = e, \ldots, h_n\}$ and that $S = \{x_1, \ldots, x_m\}$ is a basis for $E$ considered as an $m$-dimensional vector space over $\mathcal{F}(H)$. $E$ is therefore isomorphic to $\mathcal{F}(H)^m$ as a vector space.
We prove the equality by contradiction, showing that $m$ can be neither strictly less than, nor strictly greater than $n$.
First suppose that $[E : \mathcal{F}(H)] = m < n = |H|$, and consider the $m \times n$ matrix
$$A = \begin{pmatrix} h_1(x_1) & \cdots & h_n(x_1) \\ \vdots & & \vdots \\ h_1(x_m) & \cdots & h_n(x_m) \end{pmatrix}.$$
We can view this as representing a linear map $E^n \to E^m$. By the Dimension Theorem we have
$$\mathrm{rank}(A) + \mathrm{nullity}(A) = n.$$
The matrix $A$ has only $m$ rows, so $\mathrm{rank}(A) \leq m$ and hence $\mathrm{nullity}(A) \geq n - m \geq 1$. Hence $\ker(A)$ is nontrivial, which means that there exist $y_1, \ldots, y_n \in E$, not all zero, such that:
$$\begin{aligned}
h_1(x_1) y_1 + \cdots + h_n(x_1) y_n &= 0 \\
&\;\;\vdots \\
h_1(x_m) y_1 + \cdots + h_n(x_m) y_n &= 0
\end{aligned} \tag{11.6}$$
We can write any element $a \in E$ as a unique $\mathcal{F}(H)$-linear combination
$$a = a_1 x_1 + \cdots + a_m x_m. \tag{11.7}$$
Multiplying the equations in (11.6) by the coefficients $a_1, \ldots, a_m$ respectively, we get:
$$\begin{aligned}
a_1 h_1(x_1) y_1 + \cdots + a_1 h_n(x_1) y_n &= 0 \\
&\;\;\vdots \\
a_m h_1(x_m) y_1 + \cdots + a_m h_n(x_m) y_n &= 0
\end{aligned} \tag{11.8}$$
All of the coefficients $a_1, \ldots, a_m$ lie in $\mathcal{F}(H)$ and are hence fixed by all of the $\mathcal{F}(H)$-automorphisms $h_1, \ldots, h_n \in H$. What this means is that $h_i(a_j) = a_j$ for all $i$ and $j$, and so we can rewrite (11.8) as
$$\begin{aligned}
h_1(a_1) h_1(x_1) y_1 + \cdots + h_n(a_1) h_n(x_1) y_n &= 0 \\
&\;\;\vdots \\
h_1(a_m) h_1(x_m) y_1 + \cdots + h_n(a_m) h_n(x_m) y_n &= 0
\end{aligned} \tag{11.9}$$
and then as
$$\begin{aligned}
h_1(a_1 x_1) y_1 + \cdots + h_n(a_1 x_1) y_n &= 0 \\
&\;\;\vdots \\
h_1(a_m x_m) y_1 + \cdots + h_n(a_m x_m) y_n &= 0
\end{aligned} \tag{11.10}$$
Adding all of these equations together and using (11.7) we get
$$y_1 h_1(a) + \cdots + y_n h_n(a) = 0. \tag{11.11}$$
This is true for any $a \in E$, and since the elements $y_1, \ldots, y_n$ aren't all zero, this means that the automorphisms $h_1, \ldots, h_n$ are linearly dependent, contrary to Lemma 11.79. Hence our original assumption, that $|H| > [E : \mathcal{F}(H)]$, can't be true, so instead $|H| \leq [E : \mathcal{F}(H)]$.
Now suppose $|H| = n < m = [E : \mathcal{F}(H)]$. Let $T = \{x_1, \ldots, x_{n+1}\} \subseteq S$ be a set of $n+1$ linearly independent vectors in $E$ regarded as an $m$-dimensional vector space over $\mathcal{F}(H)$. Consider the $n \times (n+1)$ matrix
$$B = \begin{pmatrix} h_1(x_1) & \cdots & h_1(x_{n+1}) \\ \vdots & & \vdots \\ h_n(x_1) & \cdots & h_n(x_{n+1}) \end{pmatrix}.$$
Again, we can view this as representing a linear map $E^{n+1} \to E^n$, and by the Dimension Theorem we have
$$\mathrm{rank}(B) + \mathrm{nullity}(B) = n + 1.$$
This matrix has only $n$ rows, so $\mathrm{rank}(B) \leq n$ and hence $\mathrm{nullity}(B) \geq 1$, which means that $\ker(B)$ is nontrivial. Hence we can find elements $z_1, \ldots, z_{n+1} \in E$, not all zero, such that:
$$\begin{aligned}
h_1(x_1) z_1 + \cdots + h_1(x_{n+1}) z_{n+1} &= 0 \\
&\;\;\vdots \\
h_n(x_1) z_1 + \cdots + h_n(x_{n+1}) z_{n+1} &= 0
\end{aligned} \tag{11.12}$$
Suppose that $z_1, \ldots, z_{n+1}$ are chosen such that as few of them as possible are nonzero. Furthermore, arrange them so that $z_1, \ldots, z_r$ are nonzero, and $z_{r+1} = \cdots = z_{n+1} = 0$. We can thus rewrite (11.12) as
$$\begin{aligned}
h_1(x_1) z_1 + \cdots + h_1(x_r) z_r &= 0 \\
&\;\;\vdots \\
h_n(x_1) z_1 + \cdots + h_n(x_r) z_r &= 0
\end{aligned} \tag{11.13}$$
Choose some element $h \in H$ and apply it to all of the equations in (11.13) to get
$$\begin{aligned}
hh_1(x_1) h(z_1) + \cdots + hh_1(x_r) h(z_r) &= 0 \\
&\;\;\vdots \\
hh_n(x_1) h(z_1) + \cdots + hh_n(x_r) h(z_r) &= 0
\end{aligned} \tag{11.14}$$
Recall that $H = \{h_1, \ldots, h_n\}$ is a finite group, and so multiplying every element on the left by $h$ merely permutes the set, so $hH = \{hh_1, \ldots, hh_n\} = H$. Hence we can reorder the equations in (11.14) as
$$\begin{aligned}
h_1(x_1) h(z_1) + \cdots + h_1(x_r) h(z_r) &= 0 \\
&\;\;\vdots \\
h_n(x_1) h(z_1) + \cdots + h_n(x_r) h(z_r) &= 0
\end{aligned} \tag{11.15}$$
We now multiply (11.13) by $h(z_1)$ and (11.15) by $z_1$, and subtract the latter from the former to get
$$\begin{aligned}
(h(z_1) z_2 - h(z_2) z_1) h_1(x_2) + \cdots + (h(z_1) z_r - h(z_r) z_1) h_1(x_r) &= 0 \\
&\;\;\vdots \\
(h(z_1) z_2 - h(z_2) z_1) h_n(x_2) + \cdots + (h(z_1) z_r - h(z_r) z_1) h_n(x_r) &= 0
\end{aligned}$$
This is a system of $n$ equations with $(r - 1)$ terms. But we carefully chose $z_1, \ldots, z_r$ such that these were the smallest possible number of nonzero elements, and have now constructed a system of equations with fewer nonzero coefficients. This is a contradiction, unless this new system of equations is trivial; that is, all of the coefficients are zero. If this is the case, then we have $h(z_1) z_i - h(z_i) z_1 = 0$ for $2 \leq i \leq r$. Rearranging this, we find that
$$h(z_1 z_i^{-1}) = z_1 z_i^{-1}.$$
But the choice of $h$ was arbitrary, so $z_1 z_i^{-1} \in \mathcal{F}(H)$. Therefore we can find some nonzero element $c \in E$ and nonzero elements $w_1, \ldots, w_r \in \mathcal{F}(H)$ such that $z_i = c w_i$ for all $i$ (for example, $c = z_1$ and $w_i = z_i z_1^{-1}$).
Then the first equation in (11.13), recalling that $h_1$ is the identity, becomes
$$x_1 c w_1 + \cdots + x_r c w_r = 0.$$
Dividing through by $c$ gives
$$x_1 w_1 + \cdots + x_r w_r = 0$$
which implies that $x_1, \ldots, x_r$ are linearly dependent over $\mathcal{F}(H)$, contrary to our initial assumption. Therefore $|H| = n \geq m = [E : \mathcal{F}(H)]$.
Together with the first part, this implies $|H| = [E : \mathcal{F}(H)]$.
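As a quick check of Proposition 11.78 against the earlier key example, take $E = \mathbb{Q}(\sqrt2, \sqrt3)$ and $H = \{\sigma_1, \sigma_4\}$, whose fixed field was found to be $\mathbb{Q}(\sqrt6)$. Assuming the standard degrees $[\mathbb{Q}(\sqrt2, \sqrt3) : \mathbb{Q}] = 4$ and $[\mathbb{Q}(\sqrt6) : \mathbb{Q}] = 2$, the Tower Law gives:

```latex
\[
  [\mathbb{Q}(\sqrt2,\sqrt3) : \mathrm{Fix}_H(E)]
  = \frac{[\mathbb{Q}(\sqrt2,\sqrt3) : \mathbb{Q}]}{[\mathbb{Q}(\sqrt6) : \mathbb{Q}]}
  = \frac{4}{2}
  = 2
  = |H| .
\]
```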
Now we'll try to nail down the exact conditions under which the maps $\mathcal{F}$ and $\mathcal{G}$ are mutually inverse, implying a bijective correspondence between intermediate subfields and subgroups of the Galois group. Our main clues for this investigation come from Examples 11.75 and 11.76, which provide two different ways in which this correspondence can fail to be bijective. In fact these are the only two ways the correspondence can fail, but we need to do a little more work to prove that.
First we'll consider the failure mode seen in Example 11.75. The issue there was that $x^3 - 5$ is the minimal polynomial over $\mathbb{Q}$ for $\alpha = \sqrt[3]{5}$, but it doesn't split in $\mathbb{Q}(\alpha)$. We can avoid this situation if we restrict our attention to extensions in which every such polynomial splits:
Definition 11.80 A field extension $E : F$ is normal if and only if every irreducible polynomial in $F[x]$ that has at least one root in $E$ splits completely over $E$.

In other words, $E : F$ is normal exactly when any irreducible polynomial $f \in F[x]$ with a root in $E$ has all its roots in $E$.
Example 11.81 The extension $\mathbb{C} : \mathbb{R}$ is normal, since every polynomial in $\mathbb{R}[x]$ splits completely over $\mathbb{C}$.

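The next example is slightly more complicated.

Example 11.82 The extension $\mathbb{Q}(\sqrt2) : \mathbb{Q}$ is normal. Suppose that $f \in \mathbb{Q}[x]$ is irreducible over $\mathbb{Q}$ and has a root $\alpha = a + b\sqrt2$ in $\mathbb{Q}(\sqrt2)$. Then $f$ factorises as $(x - \alpha) g$ over $\mathbb{Q}(\alpha)$. But for $f$ to be in $\mathbb{Q}[x]$ the polynomial $g$ must also have some factor $(x - \beta)$, where $\beta = c + d\sqrt2$, such that $(x - \alpha)(x - \beta) = x^2 - (\alpha + \beta) x + \alpha\beta \in \mathbb{Q}[x]$. For $\alpha + \beta \in \mathbb{Q}$ we require $d = -b$, and for $\alpha\beta \in \mathbb{Q}$ we require $a = c$. Hence $\beta = \bar\alpha = a - b\sqrt2$.
So $f$ has $h = x^2 - 2ax + (a^2 - 2b^2) \in \mathbb{Q}[x]$ as a factor. But $f$ is irreducible over $\mathbb{Q}$, so $f = kh$ for some $k \in \mathbb{Q}$. Hence $f$ splits completely over $\mathbb{Q}(\sqrt2)$, and so $\mathbb{Q}(\sqrt2) : \mathbb{Q}$ is a normal extension.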
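For a concrete instance of the argument in Example 11.82, take the illustrative value $\alpha = 3 + \sqrt2$, so that $a = 3$ and $b = 1$. The conjugate root and the resulting rational quadratic are then:

```latex
\[
  \beta = 3 - \sqrt2, \qquad
  (x - \alpha)(x - \beta)
  = x^2 - (\alpha + \beta)\,x + \alpha\beta
  = x^2 - 6x + (9 - 2)
  = x^2 - 6x + 7 \in \mathbb{Q}[x].
\]
```

Both roots of $x^2 - 6x + 7$ lie in $\mathbb{Q}(\sqrt2)$, as normality requires.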

This is a rather cumbersome way of proving normality: we have to show that any irreducible polynomial with a root in the extension field has all its roots in the extension field, which is manageable for something relatively simple like $\mathbb{Q}(\sqrt2) : \mathbb{Q}$ but rapidly becomes problematic for more complicated extensions. The following result simplifies matters considerably.
Proposition 11.83 A finite extension $E : F$ is normal if and only if it is a splitting field for some polynomial in $F[x]$.

Proof First suppose that $E : F$ is normal, and that $[E : F] = n$. Then we can view $E$ as an $n$-dimensional vector space over $F$. Let $S = \{\alpha_1, \ldots, \alpha_n\}$ be a basis for this space, and let $f_i$ be the minimal polynomial of the adjoined element $\alpha_i$, for $i = 1, \ldots, n$. Since $E : F$ is normal, any irreducible polynomial in $F[x]$ with a root in $E$ splits completely over $E$; in particular, each minimal polynomial $f_i$ splits completely over $E$. Let $f = f_1 f_2 \cdots f_n$ be the product of all the minimal polynomials. Then $f$ also splits completely over $E$, but not over any proper subfield, because $E$ is generated by $S$ and not by any proper subset thereof. Hence $E$ is a splitting field for $f$ over $F$.
Conversely, suppose that $E$ is a splitting field for some polynomial $f \in F[x]$. Let $g \in F[x]$ be some arbitrary polynomial of degree at least 2, that is irreducible over $F$ and has a root $\alpha \in E$.
Let $K$ be the splitting field of $fg$ over $F$, and suppose that $\beta$ is some other root of $g$ in $K$. We can construct a lattice of fields as depicted in Figure 11.3. From this diagram, using the Tower Law (Footnote 17: Proposition 11.30, page 447), we can deduce the following identities:
$$[E(\alpha) : E][E : F] = [E(\alpha) : F] = [E(\alpha) : F(\alpha)][F(\alpha) : F] \tag{11.16}$$
$$[E(\beta) : E][E : F] = [E(\beta) : F] = [E(\beta) : F(\beta)][F(\beta) : F] \tag{11.17}$$

[Figure 11.3: Lattice diagram for the fields in Proposition 11.83, with $K$ at the top, $E(\alpha)$ and $E(\beta)$ below it, then $E$, then $F(\alpha)$ and $F(\beta)$, and $F$ at the bottom.]

By Proposition 11.44, since $\alpha$ and $\beta$ are roots of the same irreducible polynomial $g$, there is an $F$-isomorphism $\theta : F(\alpha) \to F(\beta)$. Then $[F(\alpha) : F] = [F(\beta) : F]$.
Since $E$ is a splitting field for $f$ over $F$, it follows that $E(\alpha)$ and $E(\beta)$ are splitting fields for $f$ over $F(\alpha)$ and $F(\beta)$. Then by Proposition 11.52 there is an isomorphism $\hat\theta : E(\alpha) \to E(\beta)$ extending $\theta$. Hence $[E(\alpha) : F(\alpha)] = [E(\beta) : F(\beta)]$.
Furthermore, $[E(\alpha) : E] = 1$ since $\alpha \in E$. Putting all this together with (11.16) and (11.17) we get
$$\begin{aligned}
[E(\beta) : E][E : F] &= [E(\beta) : F(\beta)][F(\beta) : F] \\
&= [E(\alpha) : F(\alpha)][F(\alpha) : F] \\
&= [E(\alpha) : E][E : F] \\
&= [E : F].
\end{aligned}$$
Hence $[E(\beta) : E] = 1$, which can only be the case if $\beta \in E$. So if $E$ is a splitting field for some polynomial $f \in F[x]$, then any other irreducible polynomial $g \in F[x]$ that has a root in $E$ has all of its roots in $E$, and hence $E : F$ is normal.
Proposition 11.84 Let $K$ be an intermediate subfield of a finite normal extension $E : F$. Every $F$-monomorphism of $K$ into $E$ can be extended to an $F$-automorphism of $E$.

Proof Let $\tau : K \to E$ be an $F$-monomorphism. By Proposition 11.83 there is a polynomial $f \in F[x]$ with splitting field $E$. Since $F \subseteq K$ and $F \subseteq \tau(K)$, it follows that $F[x] \subseteq K[x]$ and $F[x] \subseteq \tau(K)[x]$. Hence we can regard $f$ as belonging to both $K[x]$ and $\tau(K)[x]$, and therefore $E$ is a splitting field for $f$ over $K$ and $\tau(K)$. Applying Proposition 11.52 with $F_1 = K$, $F_2 = \tau(K)$ and $E_1 = E_2 = E$ ensures the existence of the desired $F$-automorphism of $E$ which extends $\tau$.

Corollary 11.85 Let $E : F$ be a finite normal extension, and suppose that $\alpha_1, \alpha_2 \in E$ are roots of some irreducible polynomial $f \in F[x]$. Then there is an $F$-automorphism of $E$ mapping $\alpha_1$ to $\alpha_2$.

Proof Corollary 11.45 ensures the existence of an $F$-isomorphism $\tau : F(\alpha_1) \to F(\alpha_2)$, which can be extended to an $F$-automorphism of $E$ by Proposition 11.84.

The extension $\mathbb{Q}(\sqrt[3]{5}) : \mathbb{Q}$ in Example 11.75 is not normal, but we can make it into a normal extension by adjoining another element $\omega = -\tfrac12 + \tfrac{\sqrt3}{2} i$. It so happens that we can do this for any finite extension: we can always adjoin finitely many extra elements to obtain a normal extension containing the original extension field. Not only this, there is effectively a unique choice of minimal normal extension.
Definition 11.86 A normal closure of a finite extension $E : F$ is a field $K$ containing $E$ such that $K : F$ is normal, and $K$ contains no proper subfield $L$ containing $E$ such that $L : F$ is normal.

Normal closures always exist, and are unique up to isomorphism:

Proposition 11.87 Every finite algebraic extension $E : F$ has a normal closure $K$, which is unique up to $F$-isomorphism.

Proof Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis for $E$ as a vector space over $F$, and let $f_i \in F[x]$ be the minimal polynomial for $\alpha_i$ over $F$, for all $1 \leq i \leq n$. Form the product $f = f_1 \cdots f_n \in F[x]$, and let $K$ be its splitting field over $F$. By Proposition 11.83, the extension $K : F$ is normal. This field $K$ contains all the roots of $f$ and thus all the roots of $f_1, \ldots, f_n$. It also contains the basis elements $\alpha_1, \ldots, \alpha_n$, so $K$ contains $E$ as a subfield.
Now suppose that $L$ is a subfield of $K$ containing $E$, such that $L : F$ is normal. Since $L$ contains $E$, it contains all of the adjoined basis elements $\alpha_1, \ldots, \alpha_n$. Each $\alpha_i$ is a root of the corresponding minimal polynomial $f_i$, and since $L : F$ is normal, $L$ must contain all of the roots of $f_i$. But $K$ is generated over $F$ by these roots, and hence $L = K$. Therefore $K$ is a normal closure of $E : F$.
To prove uniqueness, suppose that $K_1$ and $K_2$ are both normal closures of $E : F$. The polynomial $f = f_1 \cdots f_n$ splits in both $K_1$ and $K_2$, so each contains a splitting field for $f$ over $F$. But these splitting fields are each normal closures for $E : F$, and must therefore be equal to $K_1$ and $K_2$ respectively. By Corollary 11.53, $K_1$ and $K_2$ are $F$-isomorphic.
The next proposition will be useful later, when we study the Galois correspondence and the Fundamental Theorem of Galois Theory (Footnote 18: Theorem 11.108, page 482).
Proposition 11.88 Let $K$ be an intermediate subfield of a finite normal extension $E : F$. Then $K : F$ is normal if and only if every $F$-monomorphism $\tau : K \to E$ is an $F$-automorphism of $K$.

Proof Suppose that $K : F$ is normal. Let $\tau : K \to E$ be an $F$-monomorphism, and let $\alpha \in K$. Let $f$ be the minimal polynomial of $\alpha$ over $F$, so $f(\alpha) = 0$ and hence $\tau(f(\alpha)) = \tau(0) = 0$. But $\tau(f(x)) = f(\tau(x))$ since $\tau$ is an $F$-monomorphism, and in particular $f(\tau(\alpha)) = \tau(f(\alpha)) = 0$, so $\tau(\alpha)$ is also a root of $f$. Since $K : F$ is normal, $\tau(\alpha)$ must also lie in $K$. Therefore $\tau(K) \subseteq K$. But $K : F$ is a finite extension, and hence $K$ may be regarded as a finite-dimensional vector space over $F$. Since $\tau$ is an $F$-monomorphism, it follows that $\dim_F \tau(K) = \dim_F K$, so $\tau(K) = K$, and thus $\tau$ (or, strictly speaking, its restriction $\tau|_K$) is an $F$-automorphism of $K$.
Now suppose that every $F$-monomorphism $\tau : K \to E$ is an $F$-automorphism of $K$. Let $f \in F[x]$ be an irreducible polynomial with a root $\alpha \in K$. We want to show that all of the roots of $f$ lie in $K$. We know that $E : F$ is normal, so let $\beta \in E$ be some other root of $f$. Then by Corollary 11.85 there is an $F$-automorphism $\sigma : E \to E$ mapping $\alpha$ to $\beta$. The restriction $\sigma|_K : K \to E$ is an $F$-monomorphism, and by hypothesis is therefore an $F$-automorphism of $K$, so $\beta = \sigma|_K(\alpha)$ also lies in $K$. Hence $K : F$ is normal.

The other situation in which the Galois correspondence can fail is shown in Example 11.76. There, the problem was that the polynomial $x^p - \alpha$ has only a single repeated root in $\mathbb{F}_p(\alpha^{1/p})$. In order to rule out this situation, we introduce the following definitions.
Definition 11.89 An irreducible polynomial $f \in F[x]$ is separable over $F$ if it has no repeated roots in its splitting field. More generally, any polynomial $g \in F[x]$ is separable over $F$ if all of its irreducible factors are.
An algebraic element $\alpha$ in some extension $E : F$ is separable over $F$ if its minimal polynomial is separable over $F$.
An algebraic extension $E : F$ is separable if every element $\alpha \in E$ is separable over $F$.
A polynomial, element or extension is said to be inseparable if it isn't separable.

This highlights one of the recurring themes in this chapter: the close interrelationship between polynomials, algebraic elements and field extensions. We'll now look at some examples illustrating the concept of separability in each of these contexts.
Example 11.90 The polynomial $x^2 + 1 \in \mathbb{Q}[x]$ is separable over $\mathbb{Q}$, since it has no repeated roots in its splitting field $\mathbb{Q}(i)$: it factorises as $(x + i)(x - i)$.


Example 11.91 The element $\sqrt2 \in \mathbb{Q}(\sqrt2)$ is separable over $\mathbb{Q}$, since its minimal polynomial $x^2 - 2 \in \mathbb{Q}[x]$ has no repeated roots in its splitting field $\mathbb{Q}(\sqrt2)$: it factorises as $(x + \sqrt2)(x - \sqrt2)$.

Example 11.92 The extension $\mathbb{C} : \mathbb{R}$ is separable. Suppose it isn't, and there exists some $\alpha \in \mathbb{C} \setminus \mathbb{R}$ that is inseparable over $\mathbb{R}$. Then its minimal polynomial $f \in \mathbb{R}[x]$ has repeated roots in $\mathbb{C}$. Since $[\mathbb{C} : \mathbb{R}] = 2$, it follows that $\deg(f) = 1$ or $2$. If $\deg(f) = 1$ then $f$ can have no repeated roots, so $\deg(f)$ must equal 2. The element $\alpha$ must therefore have multiplicity 2 as a root of $f$. Furthermore, $f$ can have no other roots, and must therefore factorise as
$$f = (x - \alpha)^2 = x^2 - 2\alpha x + \alpha^2.$$
But $f \in \mathbb{R}[x]$, so all its coefficients must be real. In particular, $2\alpha = a$ for some $a \in \mathbb{R}$. Hence $\alpha = a/2 \in \mathbb{R}$. So $f$ is reducible in $\mathbb{R}[x]$, which contradicts our hypothesis that $\alpha \in \mathbb{C} \setminus \mathbb{R}$. Therefore $\mathbb{C} : \mathbb{R}$ is separable.

This last example shows that every algebraic element in $\mathbb{C}$ is separable over $\mathbb{R}$. It also so happens that every polynomial in $\mathbb{R}[x]$ is separable over $\mathbb{R}$. Fields with this property are of particular interest:
Definition 11.93 A field $F$ is perfect if every polynomial $f \in F[x]$ is separable over $F$.

To characterise separable polynomials and perfect fields we need the following proposition.
Proposition 11.94 An irreducible polynomial $f \in F[x]$ is separable over $F$ if either
(i) $\mathrm{char}(F) = 0$, or
(ii) $\mathrm{char}(F) = p > 0$ and $f \neq a_0 + a_1 x^p + a_2 x^{2p} + \cdots + a_n x^{np}$.

Proof If $f$ is inseparable then it has a repeated root in its splitting field. By Proposition 11.59, $f$ and its formal derivative $df$ have some non-constant common factor $g \in F[x]$, so that $g \mid f$ and $g \mid df$. But $f$ is irreducible in $F[x]$, so $f$ and $g$ are associates in the sense of Definition 10.15, and hence $f \mid df$. However, $\deg(df) < \deg(f)$, so $f \mid df$ only when $df = 0$.
This can't happen if $\mathrm{char}(F) = 0$, since we've tacitly assumed that $f$ isn't a constant (if it is, then it has no roots and questions of separability become irrelevant). If $\mathrm{char}(F) = p > 0$ then the only situation in which a nonconstant polynomial $f$ can have $df = 0$ is if all of its nonzero terms have exponents that are multiples of $p$. That is, $f = a_0 + a_1 x^p + \cdots + a_n x^{np}$, in which case $df = p a_1 x^{p-1} + \cdots + np\, a_n x^{np-1} = 0$.
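The vanishing of the formal derivative is easy to see in a small computation. The sketch below works with coefficients in $\mathbb{Z}/p\mathbb{Z}$ purely for illustration; the function name and the sample polynomials are assumptions of this example. Note that over $\mathbb{F}_p$ itself a polynomial in $x^p$ is a $p$th power and so is not irreducible, so this only illustrates the condition $df = 0$, not an inseparable irreducible polynomial.

```python
def formal_derivative_mod_p(coeffs, p):
    """Formal derivative over Z/pZ; coeffs[k] is the coefficient of x^k."""
    return [(k * c) % p for k, c in enumerate(coeffs)][1:]


p = 3
f = [2, 0, 0, 1, 0, 0, 1]   # 2 + x^3 + x^6: every exponent is a multiple of p
g = [2, 1, 0, 1]            # 2 + x + x^3: has a term whose exponent is not a multiple of p

print(formal_derivative_mod_p(f, p))   # every coefficient is 0 mod 3
print(formal_derivative_mod_p(g, p))   # the term x contributes a nonzero constant
```

The derivative of `f` vanishes identically, while the derivative of `g` does not, matching the dichotomy in case (ii) of Proposition 11.94.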

Corollary 11.95 If $\operatorname{char}(F) = 0$ then $F$ is perfect.

Proof If $\operatorname{char}(F) = 0$ then any irreducible polynomial $f \in F[x]$ is separable; furthermore any other polynomial is a product of irreducible polynomials and is hence also separable.

Corollary 11.96 Let $E : F$ be an algebraic extension, and suppose that $\operatorname{char}(F) = 0$. Then $E : F$ is a separable extension.

Proof The minimal polynomial $f \in F[x]$ of any element $\alpha \in E$ is separable by Proposition 11.94, so every $\alpha \in E$ is separable, and therefore $E : F$ is a separable extension.
So, fields of characteristic 0 are perfect: these include $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$. But what about fields of characteristic $p > 0$? To answer this question we need to introduce a special field homomorphism named after the German mathematician Georg Frobenius (1849–1917).

Definition 11.97 Let $F$ be a field with $\operatorname{char}(F) = p > 0$. The Frobenius map or Frobenius monomorphism is the function $\varphi : F \to F$ given by $\varphi(a) = a^p$.

Proposition 11.98 The Frobenius map $\varphi : F \to F$ is a field monomorphism. If $F$ is finite, then $\varphi$ is an automorphism.

Proof Given any $a, b \in F$, we see that $\varphi(ab) = (ab)^p = a^pb^p = \varphi(a)\varphi(b)$. Also, by the Freshman's Binomial Theorem[19] we have $\varphi(a+b) = (a+b)^p = a^p + b^p = \varphi(a) + \varphi(b)$. Therefore $\varphi$ is a field homomorphism.
[19] Proposition 8.20, page 332.

Now suppose that $\varphi(a) = \varphi(b)$. Then $\varphi(a) - \varphi(b) = \varphi(a-b) = 0$. If $a \neq b$ then
$$\varphi(1) = \varphi\bigl((a-b)(a-b)^{-1}\bigr) = \varphi(a-b)\,\varphi\bigl((a-b)^{-1}\bigr) = 0.$$
This is a contradiction, since $\varphi(1) = 1$, so it must instead be the case that $a = b$, and therefore $\varphi$ is injective.

If $F$ is finite, then surjectivity follows by a simple counting argument. Since $\varphi$ is injective, $|\varphi(F)| = |F|$, so $\varphi(F) = F$. Therefore $\varphi$ is surjective and hence an automorphism.
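For a finite field the claims of Proposition 11.98 can be verified exhaustively. The sketch below assumes the model $\mathbb{F}_9 = \mathbb{F}_3[i]$ with $i^2 = -1$ (valid because $x^2 + 1$ has no roots modulo 3); elements are pairs `(a, b)` standing for $a + bi$, and all function names are illustrative.

```python
from itertools import product

P = 3  # characteristic; x^2 + 1 is irreducible mod 3, so F_9 = F_3[i]


def add(u, v):
    """Add two elements a + bi, c + di of F_9."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    """Multiply two elements of F_9 using i^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def frobenius(u):
    """The Frobenius map u -> u^P, computed by repeated multiplication."""
    result = (1, 0)
    for _ in range(P):
        result = mul(result, u)
    return result


ELEMENTS = list(product(range(P), repeat=2))


def is_homomorphism():
    """Check additivity and multiplicativity on every pair of elements."""
    return all(
        frobenius(add(u, v)) == add(frobenius(u), frobenius(v))
        and frobenius(mul(u, v)) == mul(frobenius(u), frobenius(v))
        for u in ELEMENTS
        for v in ELEMENTS
    )


def is_bijective():
    """Injective on a finite set means the image has full size."""
    return len({frobenius(u) for u in ELEMENTS}) == len(ELEMENTS)


print(is_homomorphism(), is_bijective(), frobenius((0, 1)))
```

In this model the Frobenius map sends $i$ to $i^3 = -i$, so it is a non-trivial automorphism that fixes the prime subfield.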
The Frobenius map enables us to settle the question of when fields of prime characteristic are perfect.

Proposition 11.99 If $\operatorname{char}(F) = p > 0$ then $F$ is perfect if and only if $\varphi(F) = F$.

Proof Suppose that $\varphi(F) = F$. If $f \in F[x]$ is irreducible and inseparable over $F$, then by Proposition 11.94 it must be of the form $a_0 + a_1x^p + \cdots + a_nx^{np}$. If $F = \varphi(F)$ then for every coefficient $a_i$ there is an element $b_i \in F$ such that $b_i^p = a_i$. Hence
$$f = b_0^p + b_1^px^p + \cdots + b_n^px^{np} = (b_0 + b_1x + \cdots + b_nx^n)^p,$$
which is reducible, contradicting the irreducibility of $f$. So if $\varphi(F) = F$ then $F[x]$ contains no irreducible inseparable polynomials, and hence $F$ is perfect.

Conversely, suppose that $\varphi(F) \neq F$. Then choose some element $a \in F \setminus \varphi(F)$, and consider the polynomial $f = x^p - a \in F[x]$. Suppose that $\alpha^p = a$. Then $x^p - a = x^p - \alpha^p = (x - \alpha)^p$ by the Freshman's Binomial Theorem again, and so $f$ has only a single repeated root $\alpha$ in any splitting field over $F$. All that remains is to show that this polynomial is irreducible over $F$. If not, then any proper, nontrivial monic factor of $f$ must have the form $(x - \alpha)^m$ for some $m$ such that $0 < m < p$. The coefficient of $x^{m-1}$ in $(x - \alpha)^m$ is $-m\alpha$, so for $x^p - a$ to be reducible over $F$ we require $-m\alpha \in F$. But $m \in \mathbb{F}_p \subseteq F$ is nonzero and hence invertible, so this means that $\alpha \in F$ too, and hence $a = \alpha^p \in \varphi(F)$, which is a contradiction. So if $\varphi(F) \neq F$, there exists at least one irreducible polynomial $x^p - a$ that is inseparable over $F$, and therefore $F$ isn't perfect.
Corollary 11.100 Finite fields are perfect.

Proof Suppose that $F$ is a finite field. Then $\operatorname{char}(F) = p > 0$ by Proposition 11.60, and $\varphi(F) = F$ by Proposition 11.98. Hence by Proposition 11.99 $F$ is perfect.
Putting all this together we get the following characterisation of perfect fields:

Proposition 11.101 A field $F$ is perfect if and only if either
(i) $\operatorname{char}(F) = 0$, or
(ii) $\operatorname{char}(F) = p > 0$ and $\varphi(F) = F$.

As a consequence of this, it is relatively difficult to find good examples of inseparable polynomials and imperfect fields.

These two properties of normality and separability are clearly important, and so we now consider extensions that satisfy them both:

Definition 11.102 A finite field extension $E : F$ is said to be a Galois extension if it is both normal and separable.

Proposition 11.103 Let $K$ be an intermediate subfield of a finite extension $E : F$. Then $E : K$ is normal if $E : F$ is normal, and separable if $E : F$ is separable.

Proof Suppose that $E : F$ is normal. Then by Proposition 11.83 there is a polynomial $f \in F[x]$ with splitting field $E$ over $F$. Since $F \subseteq K$ it follows that $F[x] \subseteq K[x]$, and hence $E$ is also a splitting field for $f$ over $K$, and so $E : K$ is normal.

Now suppose that $E : F$ is separable, and consider some element $\alpha \in E$ with minimal polynomial $f_F$ over $F$ and $f_K$ over $K$. Then $f_F$ is separable, and since $f_F \in K[x]$ has $\alpha$ as a root, $f_K \mid f_F$ in $K[x]$. Suppose that $f_K$ isn't separable. Then by Proposition 11.59 there is a non-constant polynomial $g \in K[x]$ that divides both $f_K$ and $df_K$. Writing $f_F = f_Kh$ with $h \in K[x]$, we have $df_F = (df_K)h + f_K(dh)$, so $g$ divides both $f_F$ and $df_F$. But this can't happen, because $f_F$ is separable. Hence $f_K$ must also be separable, and therefore the extension $E : K$ must be separable.
In several of our key examples it has been the case that the degree $[E : F]$ of an extension is equal to the order $|\operatorname{Gal}(E : F)|$ of its Galois group. This doesn't always happen, as we've seen in our key counterexamples, but it does if we insist on the now obviously nice properties of normality and separability.

Proposition 11.104 Let $N$ be a normal closure of a finite separable extension $E : F$. Then there are exactly $[E : F]$ distinct $F$-monomorphisms from $E$ into $N$.

Proof We prove this by induction on the degree $[E : F]$ of the extension.

If $[E : F] = 1$ then $E = F = N$ and the only $F$-monomorphism $E \to N$ is the identity map $\mathrm{id}_F$.

Now suppose that $[E : F] = k > 1$ and that the proposition holds for all extensions of degree less than $k$. Choose some arbitrary element $\alpha \in E \setminus F$ and let $f$ be its minimal polynomial over $F$. Then by Proposition 11.39 we have $[F(\alpha) : F] = \deg(f) = r > 1$. This polynomial $f$ is irreducible, and has one root $\alpha \in E \subseteq N$, so since $N : F$ is normal, $f$ splits completely in $N$. Its zeros $\alpha_1 = \alpha, \ldots, \alpha_r$ all lie in $N$, and by separability are distinct. By the induction hypothesis, there are exactly $s = k/r = [E : F(\alpha)]$ distinct $F(\alpha)$-monomorphisms $\rho_1, \ldots, \rho_s : E \to N$.

By Corollary 11.85 there are $r$ distinct $F$-automorphisms $\tau_1, \ldots, \tau_r$ of $N$ with $\tau_i(\alpha_1) = \alpha_i$ for $1 \le i \le r$. We now compose these with the maps $\rho_1, \ldots, \rho_s$ to obtain $rs$ $F$-monomorphisms
$$\varphi_{i,j} : E \to N; \qquad \varphi_{i,j}(a) = \tau_i(\rho_j(a))$$
for all $a \in E$, where $1 \le i \le r$ and $1 \le j \le s$.

We now need to show that these $F$-monomorphisms are all distinct, and then that any $F$-monomorphism from $E$ to $N$ is of this form.

Consider two such $F$-monomorphisms $\varphi_{m,n}$ and $\varphi_{p,q}$ for some values of $m$, $n$, $p$ and $q$. For these to be equal, we require that $\varphi_{m,n}(a) = \varphi_{p,q}(a)$ for all $a \in E$. In particular,
$$\varphi_{m,n}(\alpha) = \tau_m(\rho_n(\alpha)) = \tau_m(\alpha) = \alpha_m \quad\text{and}\quad \varphi_{p,q}(\alpha) = \tau_p(\rho_q(\alpha)) = \tau_p(\alpha) = \alpha_p.$$
These are clearly equal only when $m = p$. Furthermore, if $\varphi_{m,n}(a) = \varphi_{m,q}(a)$, then $\tau_m(\rho_n(a)) = \tau_m(\rho_q(a))$. Since $\tau_m$ is injective, it follows that $\rho_n(a) = \rho_q(a)$. Hence $\varphi_{m,n} = \varphi_{m,q}$ only when $\rho_n(a) = \rho_q(a)$ for all $a \in E$. But the $F(\alpha)$-monomorphisms $\rho_1, \ldots, \rho_s$ are all distinct, so this can only happen if $n = q$. The $F$-monomorphisms $\varphi_{i,j}$ are therefore all distinct.

Now let $\sigma$ be any $F$-monomorphism from $E$ to $N$. It must map the root $\alpha$ to some root $\alpha_i$ of $f$ in $N$, so define $\rho(a) = \tau_i^{-1}(\sigma(a))$ for all $a \in E$. In particular, $\rho(\alpha) = \tau_i^{-1}(\sigma(\alpha)) = \tau_i^{-1}(\alpha_i) = \alpha$, so $\rho$ is an $F(\alpha)$-monomorphism. Earlier we deduced from the induction hypothesis that there are exactly $s = k/r$ $F(\alpha)$-monomorphisms $\rho_1, \ldots, \rho_s : E \to N$, and so this new map $\rho$ must be one of them, say $\rho_j$. Hence $\rho_j = \tau_i^{-1}\sigma$, so $\sigma = \tau_i\rho_j = \varphi_{i,j}$. Therefore every $F$-monomorphism is of the required form. This completes the induction step, and thereby the proof.
Corollary 11.105 If $E : F$ is a Galois extension then $|\operatorname{Gal}(E : F)| = [E : F]$.

Proof If $E : F$ is a Galois extension, then it is finite, normal and separable, and its normal closure is $E$ itself. By the above proposition, there are precisely $[E : F]$ $F$-monomorphisms from $E$ to $E$. By Proposition 11.88 these are all actually $F$-automorphisms. The Galois group $\operatorname{Gal}(E : F)$ consists of all of these $F$-automorphisms, and hence $|\operatorname{Gal}(E : F)| = [E : F]$ as claimed.
Proposition 11.106 If $E : F$ is a Galois extension then $\mathcal{F}(\operatorname{Gal}(E : F)) = F$.

Proof Observe that $\operatorname{Gal}(E : F) = \mathcal{G}(F)$. By Proposition 11.77 we have $F \subseteq \mathcal{F}(\mathcal{G}(F))$. It follows from Corollary 11.105 that
$$|\mathcal{G}(F)| = |\operatorname{Gal}(E : F)| = [E : F].$$
Furthermore, by Proposition 11.78
$$[E : \mathcal{F}(\mathcal{G}(F))] = |\operatorname{Gal}(E : F)| = [E : F].$$
By the Tower Law,[20]
$$[E : F] = [E : \mathcal{F}(\mathcal{G}(F))]\,[\mathcal{F}(\mathcal{G}(F)) : F],$$
so $[\mathcal{F}(\mathcal{G}(F)) : F] = 1$, and thus $\mathcal{F}(\mathcal{G}(F)) = F$.
[20] Proposition 11.30, page 447.

The converse of this is also true, but the proof is somewhat involved, and we won't need it for the main aim of this section.
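Proposition 11.107 Let $K$ be an intermediate subfield of a Galois extension $E : F$. If $\sigma \in \operatorname{Gal}(E : F)$ then $\mathcal{G}(\sigma(K)) = \sigma\mathcal{G}(K)\sigma^{-1}$.

Proof Let $\tau \in \mathcal{G}(\sigma(K)) = \operatorname{Gal}(E : \sigma(K))$ and let $a \in K$ be any element of the intermediate field. Then $\sigma(a) \in \sigma(K)$, and $\tau$ fixes $\sigma(a)$; that is, $\tau(\sigma(a)) = \sigma(a)$. Therefore
$$(\sigma^{-1}\tau\sigma)(a) = \sigma^{-1}(\tau(\sigma(a))) = \sigma^{-1}(\sigma(a)) = a$$
and hence $\sigma^{-1}\tau\sigma \in \operatorname{Gal}(E : K) = \mathcal{G}(K)$. So $\sigma^{-1}\mathcal{G}(\sigma(K))\sigma \le \mathcal{G}(K)$, which implies that $\mathcal{G}(\sigma(K)) \le \sigma\mathcal{G}(K)\sigma^{-1}$.

Now let $\tau \in \mathcal{G}(K) = \operatorname{Gal}(E : K)$, let $b \in \sigma(K)$, and let $a = \sigma^{-1}(b) \in K$. Since $\tau$ fixes all the elements of $K$, in particular we have $\tau(a) = a$. Then
$$(\sigma\tau\sigma^{-1})(b) = (\sigma\tau\sigma^{-1})(\sigma(a)) = \sigma(\tau(a)) = \sigma(a) = b,$$
so $\sigma\tau\sigma^{-1}$ fixes any element of $\sigma(K)$, and is hence an element of $\operatorname{Gal}(E : \sigma(K)) = \mathcal{G}(\sigma(K))$. Thus $\sigma\mathcal{G}(K)\sigma^{-1} \le \mathcal{G}(\sigma(K))$, and therefore $\sigma\mathcal{G}(K)\sigma^{-1} = \mathcal{G}(\sigma(K))$ as claimed.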
We are now ready to state and prove one of the main results of Galois theory. This theorem answers the question we asked earlier about when the maps $\mathcal{F}$ and $\mathcal{G}$ are mutually inverse, and also tells us some other useful facts about Galois groups.

Theorem 11.108 (Fundamental Theorem of Galois Theory) Let $E : F$ be a Galois extension, and suppose that $K$ is an intermediate subfield of $E$, and $H$ is a subgroup of the Galois group $\operatorname{Gal}(E : F)$. Then:
(i) $\mathcal{F}(\mathcal{G}(K)) = K$ and $\mathcal{G}(\mathcal{F}(H)) = H$;
(ii) $|\mathcal{G}(K)| = [E : K]$ and $|\operatorname{Gal}(E : F)|/|\mathcal{G}(K)| = [K : F]$;
(iii) $K$ is a normal extension of $F$ if and only if $\mathcal{G}(K) \trianglelefteq \operatorname{Gal}(E : F)$; and
(iv) if $K$ is a normal extension of $F$ then $\operatorname{Gal}(K : F) \cong \operatorname{Gal}(E : F)/\mathcal{G}(K)$.

Proof By Proposition 11.103, $E : K$ is a Galois extension, and hence
$$\mathcal{F}(\mathcal{G}(K)) = \mathcal{F}(\operatorname{Gal}(E : K)) = K \tag{11.18}$$
by Proposition 11.106. And
$$|\mathcal{G}(K)| = |\operatorname{Gal}(E : K)| = [E : K]$$
by Corollary 11.105.

Setting $K = \mathcal{F}(H)$ in (11.18), we get $\mathcal{F}(\mathcal{G}(\mathcal{F}(H))) = \mathcal{F}(H)$. Proposition 11.78 says that $|H| = [E : \mathcal{F}(H)]$, and hence
$$|H| = [E : \mathcal{F}(H)] = [E : \mathcal{F}(\mathcal{G}(\mathcal{F}(H)))] = |\mathcal{G}(\mathcal{F}(H))|.$$
By Proposition 11.77 we have $H \le \mathcal{G}(\mathcal{F}(H))$, and since both $H$ and $\mathcal{G}(\mathcal{F}(H))$ are finite groups of the same order, it must follow that $\mathcal{G}(\mathcal{F}(H)) = H$ as claimed.

By the Tower Law,[21]
$$[E : F] = [E : K][K : F]$$
and hence
$$[K : F] = [E : F]/[E : K].$$
[21] Proposition 11.30, page 447.

Applying Corollary 11.105, we have $[E : F] = |\operatorname{Gal}(E : F)|$ and $[E : K] = |\operatorname{Gal}(E : K)| = |\mathcal{G}(K)|$, since both $E : F$ and $E : K$ are Galois extensions. Thus
$$[K : F] = |\operatorname{Gal}(E : F)|/|\mathcal{G}(K)|.$$
This completes the first two parts of the proof.
Now suppose that $K : F$ is a normal extension, and let $\sigma \in \operatorname{Gal}(E : F)$ be some arbitrary $F$-automorphism of $E$. Then the restriction $\sigma|_K$ to $K$ is an $F$-monomorphism from $K$ to $E$, and hence by Proposition 11.88 is actually an $F$-automorphism of $K$, so $\sigma(K) = K$. Then from Proposition 11.107 we have
$$\sigma\mathcal{G}(K)\sigma^{-1} = \mathcal{G}(\sigma(K)) = \mathcal{G}(K),$$
so $\mathcal{G}(K)$ is closed under conjugation by any element of $\operatorname{Gal}(E : F)$, and therefore $\mathcal{G}(K) \trianglelefteq \operatorname{Gal}(E : F)$.

Conversely, suppose that $\mathcal{G}(K) \trianglelefteq \operatorname{Gal}(E : F)$. Let $\tau : K \to E$ be any $F$-monomorphism from $K$ into $E$. By Proposition 11.84 this can be extended to an $F$-automorphism $\sigma$ of $E$, such that $\sigma|_K = \tau$. Since $\mathcal{G}(K)$ is normal in $\operatorname{Gal}(E : F)$ it is closed under conjugation by $\sigma$, and so $\sigma\mathcal{G}(K)\sigma^{-1} = \mathcal{G}(K)$. From Proposition 11.107 we deduce that $\mathcal{G}(\sigma(K)) = \mathcal{G}(K)$, and by (11.18) it follows that
$$\sigma(K) = \mathcal{F}(\mathcal{G}(\sigma(K))) = \mathcal{F}(\mathcal{G}(K)) = K.$$
Therefore $\tau(K) = \sigma|_K(K) = K$, so $\tau$ is an $F$-automorphism of $K$, and by Proposition 11.88 $K : F$ is normal. This completes part (iii).
To prove part (iv), suppose that $K : F$ is a normal extension. Let $f : \operatorname{Gal}(E : F) \to \operatorname{Gal}(K : F)$ be defined by $f(\sigma) = \sigma|_K$ for any $F$-automorphism $\sigma$ of $E$. We must first confirm that this is a well-defined group homomorphism. Let $\sigma \in \operatorname{Gal}(E : F)$ be any $F$-automorphism of $E$. Then $\sigma|_K$ is an $F$-monomorphism, and by Proposition 11.88 it is an $F$-automorphism of $K$, and thus lies in $\operatorname{Gal}(K : F)$. Furthermore, for any $\sigma, \tau \in \operatorname{Gal}(E : F)$ we have
$$f(\sigma\tau) = (\sigma\tau)|_K = \sigma|_K\,\tau|_K = f(\sigma)f(\tau),$$
so $f$ is a group homomorphism. By Proposition 11.84, since $K : F$ is normal, any $F$-automorphism of $K$ is the restriction of some $F$-automorphism of $E$. The homomorphism $f$ is thus surjective.

The kernel $\ker(f)$ consists of all $F$-automorphisms of $E$ that restrict to the identity on $K$. But this is precisely $\mathcal{G}(K) = \operatorname{Gal}(E : K)$. Applying the First Isomorphism Theorem for groups,[22] we obtain
$$\operatorname{Gal}(K : F) = \operatorname{im}(f) \cong \operatorname{Gal}(E : F)/\ker(f) = \operatorname{Gal}(E : F)/\mathcal{G}(K)$$
as claimed.
[22] Theorem 4.40, page 119.
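As a concrete illustration of the correspondence (a standard example, not taken from the text above), consider $E = \mathbb{Q}(\sqrt2, \sqrt3)$ over $\mathbb{Q}$. This is the splitting field of $(x^2 - 2)(x^2 - 3)$, so it is a Galois extension of degree 4. The automorphism names $\sigma$ and $\tau$ below are labels chosen for this sketch.

```latex
\[
\sigma:\ \sqrt2 \mapsto -\sqrt2,\ \sqrt3 \mapsto \sqrt3,
\qquad
\tau:\ \sqrt2 \mapsto \sqrt2,\ \sqrt3 \mapsto -\sqrt3,
\qquad
\operatorname{Gal}(E:\mathbb{Q}) = \{\mathrm{id}, \sigma, \tau, \sigma\tau\}
  \cong \mathbb{Z}_2 \oplus \mathbb{Z}_2
\]
\[
\begin{array}{c|c|c}
H \le \operatorname{Gal}(E:\mathbb{Q}) & \mathcal{F}(H)
  & [\mathcal{F}(H):\mathbb{Q}] = |\operatorname{Gal}(E:\mathbb{Q})|/|H| \\ \hline
\{\mathrm{id}\} & \mathbb{Q}(\sqrt2,\sqrt3) & 4 \\
\langle\sigma\rangle & \mathbb{Q}(\sqrt3) & 2 \\
\langle\tau\rangle & \mathbb{Q}(\sqrt2) & 2 \\
\langle\sigma\tau\rangle & \mathbb{Q}(\sqrt6) & 2 \\
\operatorname{Gal}(E:\mathbb{Q}) & \mathbb{Q} & 1
\end{array}
\]
```

Every subgroup here is normal because the group is abelian, which matches part (iii): each of the three quadratic intermediate fields is a normal extension of $\mathbb{Q}$.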

11.6 Solving equations by radicals

Radical, n. A miscreant who would forestall the future by discrediting the past and abolishing the present.
Ambrose Bierce (1842–c.1914), The New York American, 29 June 1906, in: The Unabridged Devil's Dictionary (2000)

Methods for solving quadratic equations have been known for thousands of years: the earliest known examples occur in Sumerian cuneiform tablets dating from about 2000 BC.[23] The standard formula that most of us learn at school can be derived by a method called completing the square. Given
$$ax^2 + bx + c = 0$$
[23] J Friberg, A Geometric Algorithm with Solutions to Quadratic Equations in a Sumerian Juridical Document from Ur III Umma, Cuneiform Digital Library Journal 2009.3 (2009).
where $a, b, c \in \mathbb{Q}$ and $a \neq 0$, we first divide through by $a$ to get a monic equation
$$x^2 + \tfrac{b}{a}x + \tfrac{c}{a} = 0.$$
The trick is then to rewrite as much of the left hand side as possible in the form $(x + d)^2$ for some appropriate $d \in \mathbb{Q}$. Setting $d = \tfrac{b}{2a}$, we get
$$(x + d)^2 = \bigl(x + \tfrac{b}{2a}\bigr)^2 = x^2 + \tfrac{b}{a}x + \tfrac{b^2}{4a^2},$$
which is very nearly what we want: it's just the constant term that needs addressing. So
$$x^2 + \tfrac{b}{a}x + \tfrac{c}{a} = \bigl(x + \tfrac{b}{2a}\bigr)^2 - \tfrac{b^2}{4a^2} + \tfrac{c}{a} = 0,$$
which we can then rearrange to get
$$\bigl(x + \tfrac{b}{2a}\bigr)^2 = \frac{b^2 - 4ac}{4a^2}.$$
Taking square roots and solving for $x$ we get the familiar quadratic formula
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
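The derivation translates directly into a short computation. The following is a minimal sketch that follows the completing-the-square steps rather than the closed formula; the function name `solve_quadratic` is an illustrative choice, and floating-point complex arithmetic stands in for exact arithmetic.

```python
import cmath


def solve_quadratic(a, b, c):
    """Return the two roots of a*x**2 + b*x + c = 0 by completing the square."""
    if a == 0:
        raise ValueError("a must be nonzero")
    d = b / (2 * a)                          # (x + d)^2 = x^2 + (b/a)x + b^2/(4a^2)
    rhs = (b * b - 4 * a * c) / (4 * a * a)  # the equation becomes (x + d)^2 = rhs
    root = cmath.sqrt(rhs)                   # complex square root, so rhs < 0 is allowed
    return (-d + root, -d - root)


print(solve_quadratic(1, -3, 2))
print(solve_quadratic(1, 0, 1))
```

Using `cmath.sqrt` rather than `math.sqrt` means a negative discriminant yields the pair of complex conjugate roots instead of an error.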
Explicit formulae for solving cubic and quartic equations were developed in Italy during the 16th century, amid an environment of fiercely counterproductive competition. These formulae are rather cumbersome, but they all only require basic arithmetic operations (addition, subtraction, multiplication and division) together with finite roots. This isn't true for all polynomial equations, and to understand why will involve the algebraic machinery developed in this chapter together with some of the advanced group theory we studied in Chapter 7.

We start by identifying a special class of algebraic field extensions.

Definition 11.109 A field extension $E : F$ is a radical extension if there exists a sequence
$$F = K_0, K_1, \ldots, K_m = E$$
of fields such that $K_{i+1} = K_i(\alpha_i)$ where $\alpha_i$ is a root of a polynomial $f_i$ of the form $x^{n_i} - a_i$ in $K_i[x]$, with $0 \le i < m$ and $n_i \in \mathbb{N}$.

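What this means in practice is that radical extensions are exactly those of the form $F(\alpha) : F$ where $\alpha$ can be constructed from elements of $F$ using just the four basic arithmetical operations and finite roots.

Example 11.110 Let $\alpha = (1 + \sqrt{3})^{1/5} - \sqrt[3]{5}\,(\sqrt{2} + 2)^{1/7}$. Then $\mathbb{Q}(\alpha) : \mathbb{Q}$ is a radical extension, since
$K_0 = \mathbb{Q}$;
$K_1 = K_0(\alpha_0)$, $f_0 = \alpha_0^2 - 3$;
$K_2 = K_1(\alpha_1)$, $f_1 = \alpha_1^5 - (1 + \alpha_0)$;
$K_3 = K_2(\alpha_2)$, $f_2 = \alpha_2^3 - 5$;
$K_4 = K_3(\alpha_3)$, $f_3 = \alpha_3^2 - 2$;
$K_5 = K_4(\alpha_4)$, $f_4 = \alpha_4^7 - (2 + \alpha_3)$.
By the Tower Law, $[\mathbb{Q}(\alpha) : \mathbb{Q}] = \prod_{i=0}^{4} n_i = 2 \cdot 5 \cdot 3 \cdot 2 \cdot 7 = 420$.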

Our aim is to identify a privileged class of polynomials whose roots lie in radical extension fields. The key is to consider splitting fields:

Definition 11.111 A polynomial $f \in F[x]$ is soluble by radicals if there exists a radical extension $E$ of $F$ which contains, or is itself, a splitting field for $f$.

A given polynomial of degree five or greater is not necessarily soluble by radicals.[24]
[24] The proof is originally due to the Italian mathematician and physician Paolo Ruffini (1765–1822) and the Norwegian mathematician Niels Henrik Abel (1802–1829). The more general theory was developed by the French mathematician Évariste Galois (1811–1832).

Recall from Proposition 11.55 that a field must have either zero or prime characteristic. We will only consider the zero-characteristic case here. The essential idea is that the Galois group encodes fundamental information about the field extension or polynomial under investigation. In particular, a given polynomial is soluble by radicals exactly when its Galois group is soluble.[25] We'll prove this in two halves.
[25] Definition 7.36, page 261.
Proposition 11.112 Let $f \in F[x]$ be a polynomial over a field $F$ with characteristic zero. If the Galois group $\operatorname{Gal}(f, F)$ is soluble, then $f$ is soluble by radicals.

In order to prove this, we need a few preliminary results. First we briefly study extensions with cyclic Galois groups.

Definition 11.113 Let $F$ be a field of characteristic zero. A Galois extension $E : F$ is cyclic if its Galois group is cyclic; that is, if $\operatorname{Gal}(E : F) \cong \mathbb{Z}_n$ for some $n \in \mathbb{N}$.

Definition 11.114 Let $E : F$ be a finite extension of a field $F$ of characteristic zero. Let $N$ be a normal closure of $E$. By Proposition 11.104 there are $[E : F]$ distinct $F$-monomorphisms from $E$ into $N$. We define the norm $N_{E:F}$ and trace $\operatorname{Tr}_{E:F}$ of $E : F$ to be
$$N_{E:F}(a) = \prod_{\sigma \in \operatorname{Gal}(E:F)} \sigma(a) \quad\text{and}\quad \operatorname{Tr}_{E:F}(a) = \sum_{\sigma \in \operatorname{Gal}(E:F)} \sigma(a)$$
for all $a \in E$.
The first important result we will prove is due to David Hilbert (1862–1943) and Ernst Kummer (1810–1893).[26]
[26] This result is often referred to as Hilbert's Theorem 90, because it is the ninetieth theorem stated in Zahlbericht, his 1897 report on algebraic number theory; it had, however, been proved in 1855 by Ernst Kummer.

Proposition 11.115 (Hilbert's Theorem 90) Let $E : F$ be a cyclic extension, and $\sigma$ a generator of $\operatorname{Gal}(E : F)$. Then for any $a \in E$, the norm $N_{E:F}(a) = 1$ if and only if there exists some $b \in E$ with $a = b/\sigma(b)$.

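Proof Suppose that $N_{E:F}(a) = a\,\sigma(a)\,\sigma^2(a) \cdots \sigma^{n-1}(a) = 1$, so
$$a^{-1} = \sigma(a)\,\sigma^2(a) \cdots \sigma^{n-1}(a). \tag{11.19}$$
Let
$$d_k = \prod_{i=0}^{k} \sigma^i(a) = a\,\sigma(a)\,\sigma^2(a) \cdots \sigma^k(a)$$
for $0 \le k < n$. In particular, $d_{n-1} = N_{E:F}(a) = 1$ and $d_k = a\,\sigma(d_{k-1})$ for $0 < k < n$. By Lemma 11.79 the automorphisms $\mathrm{id}, \sigma, \sigma^2, \ldots, \sigma^{n-1}$ are linearly independent over $E$, so
$$\psi = d_0\,\mathrm{id} + d_1\sigma + d_2\sigma^2 + \cdots + d_{n-1}\sigma^{n-1}$$
isn't the zero map. In particular, there exists some $c \in E$ such that
$$\psi(c) = d_0c + d_1\sigma(c) + \cdots + d_{n-1}\sigma^{n-1}(c) \neq 0.$$
Let $b = \psi(c)$, and apply $\sigma$ to get
$$\begin{aligned}
\sigma(b) &= \sigma\bigl(d_0c + \cdots + d_{n-1}\sigma^{n-1}(c)\bigr) \\
&= \sigma(d_0)\sigma(c) + \sigma(d_1)\sigma^2(c) + \cdots + \sigma(d_{n-1})\sigma^n(c) \\
&= \sigma(d_{n-1})c + \sigma(d_0)\sigma(c) + \cdots + \sigma(d_{n-2})\sigma^{n-1}(c) \\
&= \bigl(d_0c + d_1\sigma(c) + \cdots + d_{n-1}\sigma^{n-1}(c)\bigr)/a \\
&= b/a,
\end{aligned}$$
using $\sigma^n = \mathrm{id}$, $\sigma(d_{k-1}) = d_k/a$ and $\sigma(d_{n-1}) = 1 = d_0/a$. Hence $a = b/\sigma(b)$.

Conversely, suppose there exists some $b \in E$ with $a = b/\sigma(b)$. Then
$$N_{E:F}(a) = a\,\sigma(a)\,\sigma^2(a) \cdots \sigma^{n-1}(a) = \frac{b}{\sigma(b)} \cdot \frac{\sigma(b)}{\sigma^2(b)} \cdots \frac{\sigma^{n-1}(b)}{\sigma^n(b)} = 1$$
as claimed, since $\sigma^n(b) = b$.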
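Example 11.116 The extension $\mathbb{C} : \mathbb{R}$ is cyclic, with Galois group $\operatorname{Gal}(\mathbb{C} : \mathbb{R}) \cong \mathbb{Z}_2$ generated by the conjugation automorphism $\sigma : z \mapsto \bar{z}$. The norm $N_{\mathbb{C}:\mathbb{R}}$ is defined by $N_{\mathbb{C}:\mathbb{R}}(z) = z\,\sigma(z) = z\bar{z} = |z|^2$ for any $z \in \mathbb{C}$. The complex numbers of unit norm are therefore exactly those with modulus 1.

Suppose $z = \langle 1, \theta\rangle$ in polar form, with $\theta \in (-\pi, \pi]$. We can then choose $w = \langle r, \theta/2\rangle$ for any $r > 0$ and get
$$w/\sigma(w) = w/\bar{w} = \langle r, \theta/2\rangle / \langle r, -\theta/2\rangle = \langle 1, \theta\rangle.$$
Conversely, any nonzero complex number $w$ yields a complex number $z = w/\sigma(w)$ of unit modulus.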
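This instance of Hilbert's Theorem 90 is easy to check numerically. The sketch below builds the witness $w = \langle r, \theta/2\rangle$ for a given unit complex number $z$; the function name `hilbert90_witness` is an illustrative choice, and equality is tested only up to floating-point rounding.

```python
import cmath


def hilbert90_witness(z, r=1.0):
    """Given z with |z| = 1, return w = <r, theta/2> so that w / conj(w) is z."""
    theta = cmath.phase(z)            # argument of z
    return cmath.rect(r, theta / 2)   # polar form <r, theta/2>


z = cmath.rect(1.0, 2.0)              # a unit complex number with argument 2 radians
w = hilbert90_witness(z, r=3.0)
print(abs(w / w.conjugate() - z) < 1e-12)
```

The modulus $r$ cancels in the quotient, which is why any $r > 0$ gives a valid witness.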

Proposition 11.117 Suppose that $f = x^m - 1 \in F[x]$ splits completely over $F$, a field with characteristic zero. Let $E : F$ be a cyclic extension with $[E : F] = m$. Then there exists an element $a \in F$ for which $g = x^m - a$ is irreducible over $F$, and $E$ is a splitting field for $g$ over $F$ generated by a single root of $g$.

Proof Let $\sigma$ be a generator of $\operatorname{Gal}(E : F) \cong \mathbb{Z}_m$. Since $f = x^m - 1$ splits completely over $F$, there exists a primitive $m$th root of unity $\zeta \in F$, which is fixed by every $F$-automorphism of $E$. That is, $\sigma^k(\zeta) = \zeta$ for all $k = 0, \ldots, m-1$, and hence
$$N_{E:F}(\zeta) = \prod_{k=0}^{m-1} \sigma^k(\zeta) = \prod_{k=0}^{m-1} \zeta = \zeta^m = 1.$$
By Hilbert's Theorem 90,[27] there exists some $\alpha \in E$ with $\zeta = \alpha/\sigma(\alpha)$, and hence $\sigma(\alpha) = \zeta^{-1}\alpha$. This means that $\alpha \in E \setminus F$, and also that
$$\sigma^k(\alpha) = \zeta^{-k}\alpha \quad\text{for } k = 0, \ldots, m-1.$$
[27] Proposition 11.115, page 486.

This means that the only $F$-automorphism that fixes $F(\alpha)$ is the identity, so $\mathcal{G}(F(\alpha)) = \{\mathrm{id}\}$.

Since $E : F$ is cyclic, and thus by definition a Galois extension, we can apply the Fundamental Theorem of Galois Theory[28] to show that
$$F(\alpha) = \mathcal{F}(\mathcal{G}(F(\alpha))) = \mathcal{F}(\{\mathrm{id}\}) = E,$$
so $E$ is generated over $F$ by $\alpha$. Observe that
$$\sigma(\alpha^m) = \sigma(\alpha)^m = (\zeta^{-1}\alpha)^m = \zeta^{-m}\alpha^m = \alpha^m,$$
so $\alpha^m$ is fixed by $\sigma$ and all its powers, and therefore by every element of $\operatorname{Gal}(E : F)$, so it lies in $\mathcal{F}(\operatorname{Gal}(E : F)) = F$. Let $a = \alpha^m$. Then $\alpha$ is a root of $g = x^m - a$. We now want to show that $g$ is irreducible over $F$, and that $E$ is its splitting field.
[28] Theorem 11.108, page 482.

Since $\alpha$ is a root of $g$, the minimal polynomial for $\alpha$ over $F$ must be a factor of $g$. But $[E : F] = [F(\alpha) : F] = \deg(g) = m$, so the minimal polynomial of $\alpha$ over $F$ must be $g$ itself. By Proposition 11.38, $g$ must be irreducible over $F$. Finally, since
$$g = x^m - a = (x - \alpha)(x - \zeta\alpha) \cdots (x - \zeta^{m-1}\alpha),$$
which splits completely over $E = F(\alpha)$ and not over any proper subfield of $E$, it follows that $E$ is a splitting field for $g$ over $F$.
The following is a technical result about extensions of splitting fields.

Proposition 11.118 Let $f \in F[x]$ be a polynomial over a field $F$ of characteristic zero. Let $K$ be a splitting field for $f$ over $F$, let $L$ be a field containing $F$, and let $E$ be a splitting field of $f$ over $L$. Then $E$ contains a subfield isomorphic to $K$, and $\operatorname{Gal}(E : L) \cong \operatorname{Gal}(K : L \cap K)$.

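Proof Figure 11.4 depicts the subfield lattice for this situation.

[Figure 11.4: Subfield lattice for Proposition 11.118, with $F$ at the bottom, $L \cap K$ above it, $L$ and $K$ above $L \cap K$, and $E$ at the top.]

Since $F \subseteq L$ and $L \subseteq E$ it follows that $E$ is an extension of $F$. By Definition 11.47, $K$ is the smallest possible field over which $f \in F[x]$ splits completely, so any other field over which $f$ splits completely must contain a subfield isomorphic to $K$. Hence we can regard $E$ as an extension of $K$.

Let $\sigma \in \operatorname{Gal}(E : L)$ be some $L$-automorphism of $E$. The restriction $\tau = \sigma|_K$ is a field monomorphism from $K$ to $E$. As the restriction of an $L$-automorphism, $\tau$ must necessarily fix all the elements of $L$ that lie in $K$, so it is a $(K \cap L)$-monomorphism. We want to show that this is actually a $(K \cap L)$-automorphism of $K$.

By Proposition 11.69, $\sigma$, as an $L$-automorphism of $E$, permutes the roots $\alpha_1, \ldots, \alpha_n$ of $f$ in $E = L(\alpha_1, \ldots, \alpha_n)$. Its restriction $\tau$ also permutes all of these roots in $K = F(\alpha_1, \ldots, \alpha_n)$, while fixing all the elements of $F$. Hence $\tau$ is an automorphism of $K$ that, as previously noted, fixes not just $F$ but $K \cap L$, so $\tau \in \operatorname{Gal}(K : K \cap L)$.

Restriction to $K$ yields a map $\varphi : \operatorname{Gal}(E : L) \to \operatorname{Gal}(K : K \cap L)$ such that $\varphi(\sigma) = \sigma|_K$, and we now want to show that this is an isomorphism. It is a homomorphism, since for any $\sigma_1, \sigma_2 \in \operatorname{Gal}(E : L)$ we have
$$\varphi(\sigma_1\sigma_2) = (\sigma_1\sigma_2)|_K = (\sigma_1|_K)(\sigma_2|_K) = \varphi(\sigma_1)\varphi(\sigma_2).$$
It is injective, since if $\varphi(\sigma_1) = \varphi(\sigma_2)$ then both $\varphi(\sigma_1) = \sigma_1|_K$ and $\varphi(\sigma_2) = \sigma_2|_K$ permute the roots $\alpha_1, \ldots, \alpha_n$ in the same way, so $\sigma_1 = \sigma_2$.

We now just need to show that $\varphi$ is surjective. Suppose that $\sigma \in \operatorname{Gal}(E : L)$. Certainly $\varphi(\sigma)$ fixes $K \cap L$, so $\varphi(\sigma) \in \operatorname{Gal}(K : K \cap L)$ and $\varphi(\operatorname{Gal}(E : L)) \subseteq \operatorname{Gal}(K : K \cap L)$. Now let $a \in K \setminus (K \cap L)$, so $a \notin L$. There is an $L$-automorphism $\sigma \in \operatorname{Gal}(E : L)$ for which $\sigma(a) \neq a$. Hence $\varphi(\sigma)(a) \neq a$, so $a$ is not fixed by all elements of $\varphi(\operatorname{Gal}(E : L))$. Thus $\varphi(\operatorname{Gal}(E : L))$ fixes every element of $K \cap L$ and doesn't fix any element of $K \setminus (K \cap L)$, so $\varphi(\operatorname{Gal}(E : L)) = \operatorname{Gal}(K : K \cap L)$ as claimed.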

Proof of Proposition 11.112 Let $K$ be a splitting field for $f$ over $F$. Suppose $\operatorname{Gal}(f, F) = \operatorname{Gal}(K : F)$ is soluble, and that $|\operatorname{Gal}(K : F)| = m$. There are two cases to consider, depending on whether $F$ contains a primitive $m$th root of unity or not.

Case 1 Suppose that $F$ does contain a primitive $m$th root of unity $\zeta$; it also necessarily contains all $m$th roots of unity, these being powers of $\zeta$. Since $G = \operatorname{Gal}(f, F) = \operatorname{Gal}(K : F)$ is soluble, there exists a series
$$\{\mathrm{id}\} = G_0 \triangleleft G_1 \triangleleft \cdots \triangleleft G_k = G$$
with each quotient $G_i/G_{i-1} \cong \mathbb{Z}_{m_i}$ cyclic, for $0 < i \le k$.

By the Fundamental Theorem of Galois Theory[29] there is a corresponding sequence
$$F = F_k \subset F_{k-1} \subset \cdots \subset F_1 \subset F_0 = K$$
of subfields of $K$, with $\mathcal{G}(F_i) = \operatorname{Gal}(K : F_i) = G_i$ and $\operatorname{Gal}(F_{i-1} : F_i) \cong G_i/G_{i-1} \cong \mathbb{Z}_{m_i}$, so $F_{i-1} : F_i$ is a cyclic extension of degree $m_i$ for $0 < i \le k$.
[29] Theorem 11.108, page 482.

Furthermore, $F_i$ contains $F_k = F$, which contains all $m$th roots of unity. Since $m_i \mid m$, it follows that $F_i$ also contains all $m_i$th roots of unity, since these are also powers of $\zeta$. By Proposition 11.117 there exists some element $\alpha_i \in F_{i-1}$ such that $F_{i-1} = F_i(\alpha_i)$ is the splitting field over $F_i$ of some polynomial $g_i = x^{m_i} - a_i \in F_i[x]$ where $a_i = \alpha_i^{m_i}$. Hence $K : F$ is a radical extension, and therefore $f$ is soluble by radicals.

Case 2 Now suppose that $F$ doesn't contain a primitive $m$th root of unity. The polynomial $x^m - 1 \in F[x]$ is obviously soluble by radicals. Let $\zeta$ be a primitive $m$th root of unity, and adjoin it to $F$ to get $L = F(\zeta)$, the splitting field of $x^m - 1$ over $F$. Now let $E$ be the splitting field of $f$ over $L$. By Proposition 11.118, $E$ is an extension of $K$, and $\operatorname{Gal}(E : L) \cong \operatorname{Gal}(K : L \cap K)$, which is a subgroup of $\operatorname{Gal}(K : F)$, which is soluble. Proposition 7.38 says that subgroups of soluble groups are soluble, and hence $\operatorname{Gal}(E : L)$ is soluble. This now reduces to Case 1, and therefore $f$ is again soluble by radicals.
The converse is also true: polynomials that are soluble by radicals have soluble Galois groups.

Proposition 11.119 Let $f \in F[x]$ be a polynomial over a field $F$ of characteristic zero. If $f$ is soluble by radicals, then its Galois group $\operatorname{Gal}(f, F)$ is soluble.

To prove this we first need the following result.

Proposition 11.120 Let $F$ be a field of characteristic zero, and let $f = x^n - a \in F[x]$. The Galois group $\operatorname{Gal}(f, F)$ is soluble.

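Proof We split the proof into two cases, according to whether $F$ contains a primitive $n$th root of unity. Let $E$ be the splitting field of $f$ over $F$.

Case 1 Suppose that $F$ contains a primitive $n$th root of unity $\zeta$. Since $F$ necessarily contains all powers of $\zeta$, it thereby contains all $n$th roots of unity.

Let $\alpha$ be a root of $f = x^n - a$ in $E$. Then the roots of $f$ are
$$\alpha, \zeta\alpha, \zeta^2\alpha, \ldots, \zeta^{n-1}\alpha$$
and $E = F(\alpha)$. Any $F$-automorphism of $E$ is thus determined entirely by its action on $\alpha$, and must fix $\zeta$.

By Proposition 11.69, any such $F$-automorphism of $E$ must permute the roots of $f$ in $E$, in particular sending $\alpha$ to an element of the form $\zeta^i\alpha$. Let $\sigma, \tau \in \operatorname{Gal}(f, F) = \operatorname{Gal}(E : F)$, with $\sigma(\alpha) = \zeta^i\alpha$ and $\tau(\alpha) = \zeta^j\alpha$. Then
$$\begin{aligned}
(\sigma\tau)(\alpha) &= \sigma(\tau(\alpha)) = \sigma(\zeta^j\alpha) = \sigma(\alpha)\sigma(\zeta)^j = \zeta^i\alpha\,\zeta^j = \zeta^{i+j}\alpha, \\
(\tau\sigma)(\alpha) &= \tau(\sigma(\alpha)) = \tau(\zeta^i\alpha) = \tau(\alpha)\tau(\zeta)^i = \zeta^j\alpha\,\zeta^i = \zeta^{i+j}\alpha,
\end{aligned}$$
so $\sigma\tau = \tau\sigma$, hence $\operatorname{Gal}(E : F)$ is abelian, and therefore soluble by Proposition 7.42.

Case 2 Suppose instead that $F$ doesn't contain a primitive $n$th root of unity $\zeta$, and adjoin one to get $F(\zeta)$, which is the splitting field of $x^n - 1$ over $F$. Since the roots of $f$ are, as noted in Case 1,
$$\alpha, \zeta\alpha, \zeta^2\alpha, \ldots, \zeta^{n-1}\alpha,$$
we know that $E$ contains $F(\zeta)$.

Let $\sigma, \tau \in \operatorname{Gal}(F(\zeta) : F)$ be two $F$-automorphisms of $F(\zeta)$. Each is completely determined by its action on $\zeta$, so suppose that $\sigma(\zeta) = \zeta^i$ and $\tau(\zeta) = \zeta^j$. Then
$$\begin{aligned}
(\sigma\tau)(\zeta) &= \sigma(\tau(\zeta)) = \sigma(\zeta^j) = \sigma(\zeta)^j = (\zeta^i)^j = \zeta^{ij}, \\
(\tau\sigma)(\zeta) &= \tau(\sigma(\zeta)) = \tau(\zeta^i) = \tau(\zeta)^i = (\zeta^j)^i = \zeta^{ij},
\end{aligned}$$
so $\operatorname{Gal}(F(\zeta) : F)$ is abelian and hence soluble, again by Proposition 7.42. By the same argument used in Case 1, with $F(\zeta)$ replacing $F$, we see that $\operatorname{Gal}(E : F(\zeta))$ is also abelian and therefore soluble.

Since $F(\zeta)$ is the splitting field for $x^n - 1$ over $F$, the extension $F(\zeta) : F$ is normal by Proposition 11.83. Hence by part (iv) of the Fundamental Theorem of Galois Theory,[30]
$$\operatorname{Gal}(F(\zeta) : F) \cong \operatorname{Gal}(E : F)/\operatorname{Gal}(E : F(\zeta)).$$
[30] Theorem 11.108, page 482.

By Proposition 7.40, since $\operatorname{Gal}(F(\zeta) : F)$ and $\operatorname{Gal}(E : F(\zeta))$ are both soluble, so is $\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F)$.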
Proof of Proposition 11.119 Let $E$ be the splitting field for $f$ over $F$, and suppose that $E \subseteq F(\alpha_1, \ldots, \alpha_k)$ where $\alpha_1^{n_1} \in F$ and $\alpha_i^{n_i} \in F(\alpha_1, \ldots, \alpha_{i-1})$ for some $n_i \in \mathbb{N}$ and $i = 2, \ldots, k$.

We proceed by induction on $k$. For the base case $k = 1$, the splitting field $E$ is contained in $F(\alpha_1)$. Let $a = \alpha_1^{n_1}$, and suppose that $K$ is the splitting field for the polynomial $g = x^{n_1} - a$ over $F$. Then $K$ contains $F(\alpha_1)$, so $E \subseteq K$.

Both $E$ and $K$ are splitting fields for polynomials over $F$, and hence by Proposition 11.83 the extensions $E : F$ and $K : F$ are normal, and so by part (iv) of the Fundamental Theorem of Galois Theory[31]
$$\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F) \cong \operatorname{Gal}(K : F)/\operatorname{Gal}(K : E).$$
[31] Theorem 11.108, page 482.

By Proposition 11.120, $\operatorname{Gal}(K : F) = \operatorname{Gal}(g, F)$ is soluble, and Proposition 7.39 says that a quotient of a soluble group is itself soluble, so $\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F)$ is soluble as claimed.

Now suppose $k > 1$. Again, let $a = \alpha_1^{n_1}$, and let $K$ be the splitting field of $g = x^{n_1} - a$ over $F$. Let $L$ be the splitting field of $g$ over $E$. Then $F \subseteq E \subseteq L$ and $F \subseteq K \subseteq L$. Furthermore, $L$ is the splitting field of $gf$ over $F$, and of $f$ over $K$. Since $F(\alpha_1) \subseteq K$ it follows that $f$ splits in $K(\alpha_2, \ldots, \alpha_k)$. That is, since $f \in F[x]$ and $F \subseteq K$, we can regard $f$ as an element of $K[x]$. Its splitting field over $K$ is contained in $K(\alpha_2, \ldots, \alpha_k)$, a radical extension of $K$ by $(k-1)$ adjoined elements. By the induction hypothesis, this means that $\operatorname{Gal}(f, K) = \operatorname{Gal}(L : K)$ is soluble.

By Proposition 11.120, $\operatorname{Gal}(g, F) = \operatorname{Gal}(K : F)$ is soluble. By part (iv) of the Fundamental Theorem of Galois Theory,
$$\operatorname{Gal}(g, F) = \operatorname{Gal}(K : F) \cong \operatorname{Gal}(L : F)/\operatorname{Gal}(L : K)$$
and since $\operatorname{Gal}(L : K)$ is soluble, Proposition 7.40 says that $\operatorname{Gal}(L : F)$ is soluble, as an extension of soluble groups.

Then by the Fundamental Theorem of Galois Theory again,
$$\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F) \cong \operatorname{Gal}(L : F)/\operatorname{Gal}(L : E)$$
and by Proposition 7.39 this is soluble as a quotient of a soluble group. This completes the induction step, and thereby the proof.
Combining Propositions 11.112 and 11.119 we get the following:

Theorem 11.121 Let $F$ be a field of characteristic zero. Then a polynomial $f \in F[x]$ is soluble by radicals if and only if $\operatorname{Gal}(f, F)$ is soluble.

The case of nonzero characteristic is rather more complicated, and we will state it without proof:

Theorem 11.122 Let $F$ be a field of prime characteristic $p$, and suppose that $f \in F[x]$ has splitting field $E$ over $F$.
If $\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F)$ is soluble and has order not divisible by $p$, then $f$ is soluble by radicals.
Conversely, if $E : F$ is a radical extension with intermediate subfields
$$F = F_0 \subset F_1 \subset \cdots \subset F_k = E$$
with $F_i = F_{i-1}(\alpha_i)$ for a root $\alpha_i$ of a polynomial $g_i = x^{n_i} - a_i \in F_{i-1}[x]$, and if $n_1n_2 \cdots n_k$ is not divisible by $p$, then $\operatorname{Gal}(f, F) = \operatorname{Gal}(E : F)$ is soluble.

So polynomials with insoluble Galois groups aren't soluble by radicals. The smallest insoluble groups are the alternating group $A_5$ and the symmetric group $S_5$. The next proposition guarantees the existence of polynomials with Galois group isomorphic to $S_p$ for any prime $p$.

Proposition 11.123 Let $f \in \mathbb{Q}[x]$ be a monic, irreducible polynomial of prime degree $p$, with exactly two roots in $\mathbb{C} \setminus \mathbb{R}$. Then the Galois group $\operatorname{Gal}(f, \mathbb{Q}) \cong S_p$.

Proof Let $E$ be the splitting field of $f$ over $\mathbb{Q}$; this is contained in $\mathbb{C}$. By Proposition 11.94, $f$ is separable over $\mathbb{Q}$, and hence has $p$ distinct roots. The Galois group $\operatorname{Gal}(f, \mathbb{Q}) = \operatorname{Gal}(E : \mathbb{Q})$ is a group of permutations of these roots, and is therefore a subgroup of $S_p$.

Let $\alpha$ be a root of $f$. Then $f$ is the minimal polynomial of $\alpha$ over $\mathbb{Q}$, and by Proposition 11.39 we have
$$[\mathbb{Q}(\alpha) : \mathbb{Q}] = \deg(f) = p.$$
Applying part (ii) of the Fundamental Theorem of Galois Theory,[32]
$$|\operatorname{Gal}(E : \mathbb{Q})|/|\operatorname{Gal}(E : \mathbb{Q}(\alpha))| = [\mathbb{Q}(\alpha) : \mathbb{Q}] = p,$$
so the order of $\operatorname{Gal}(f, \mathbb{Q}) = \operatorname{Gal}(E : \mathbb{Q})$ is divisible by $p$. By Cauchy's Theorem[33] $\operatorname{Gal}(E : \mathbb{Q})$ thus has an element of order $p$. But the only elements of order $p$ in $S_p$ are $p$-cycles, so $\operatorname{Gal}(E : \mathbb{Q})$ contains such a permutation. The conjugation map swaps the two non-real roots of $f$ and leaves the real roots fixed, and thereby determines a transposition in $S_p$. By Proposition 1.60 (iii), these two permutations generate all of $S_p$, and hence $\operatorname{Gal}(f, \mathbb{Q}) = \operatorname{Gal}(E : \mathbb{Q}) \cong S_p$.
[32] Theorem 11.108, page 482.
[33] Theorem 2.37, page 57.
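The final group-theoretic step can be confirmed by brute force for $p = 5$. The following sketch closes a 5-cycle and a transposition under composition and counts the result; the representation of permutations as tuples and the function names are illustrative choices.

```python
def compose(s, t):
    """Return the permutation 'apply t first, then s'; permutations are tuples."""
    return tuple(s[t[i]] for i in range(len(t)))


def generated_group(generators):
    """Closure of the generators under composition.

    In a finite group this closure already contains all inverses,
    so it is the subgroup generated by the given permutations.
    """
    identity = tuple(range(len(generators[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = compose(g, h)
            if gh not in seen:
                seen.add(gh)
                frontier.append(gh)
    return seen


five_cycle = (1, 2, 3, 4, 0)     # sends i to i + 1 mod 5
transposition = (1, 0, 2, 3, 4)  # swaps 0 and 1, fixes the rest
print(len(generated_group([five_cycle, transposition])))
```

The count equals $5! = 120$, so the two permutations generate all of $S_5$, as the proposition requires.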
We can illustrate this with a simple example. The argument relies on
two fundamental results from real analysis, the Intermediate Value
Theorem (Theorem A.41, page 526) and Rolle's Theorem (Theorem A.42, page 526).
Example 11.124 Let $f = x^5 - 5x + 2 \in \mathbb{Q}[x]$. This is monic, and
irreducible over $\mathbb{Q}$ by Eisenstein's Criterion, applied with the prime $p = 5$ to the
shifted polynomial $f(x-2) = x^5 - 10x^4 + 40x^3 - 80x^2 + 75x - 20$. By evaluating $f$ at a few
test values, we find that
$f(-2) = -20, \quad f(-1) = 6, \quad f(0) = 2, \quad f(1) = -2, \quad f(2) = 24.$
By the Intermediate Value Theorem, the intervals $(-2, -1)$, $(0, 1)$
and $(1, 2)$ thus contain at least one root each.
We can use Rolle's Theorem to show that each of these intervals
contains exactly one root. The first derivative $f' = 5x^4 - 5$ is zero
at $x = \pm 1$, and nowhere else. If there were more than one root in
$(-1, 1)$ then Rolle's Theorem would imply the existence of a
turning point strictly inside this interval, which can't happen.
Similarly, $f'(x)$ is strictly positive for $|x| > 1$, so again $f$ can have at
most one root in each of the intervals $(-\infty, -1)$ and $(1, \infty)$. Putting
all this together tells us that $f$ has exactly three real roots, one in
each of the intervals $(-2, -1)$, $(0, 1)$ and $(1, 2)$.
The remaining two roots must thus be non-real, and hence by Proposition 11.123
the Galois group $\mathrm{Gal}(f, \mathbb{Q})$ is isomorphic to the symmetric
group $S_5$. This isn't soluble by Proposition 7.37, and so $f$ isn't
soluble by radicals by Proposition 11.119.
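The arithmetic in Example 11.124 can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper names `evaluate`, `shift` and `eisenstein` are hypothetical, and the choice of applying Eisenstein's Criterion to the shifted polynomial $f(x-2)$ with the prime 5 is an assumption made for this illustration.

```python
from math import comb

# Coefficients of f = x^5 - 5x + 2, lowest degree first (illustrative representation).
F = [2, -5, 0, 0, 0, 1]

def evaluate(coeffs, x):
    """Evaluate a polynomial at x using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def shift(coeffs, c):
    """Return the coefficients of f(x + c), lowest degree first."""
    out = [0] * len(coeffs)
    for k, a in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += a * comb(k, j) * c ** (k - j)
    return out

def eisenstein(coeffs, p):
    """Check the hypotheses of Eisenstein's Criterion at the prime p."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(a % p == 0 for a in lower)
            and lower[0] % (p * p) != 0)

# Values of f at the test points used in the example.
values = [evaluate(F, x) for x in (-2, -1, 0, 1, 2)]
# Each sign change between consecutive test points forces a real root.
sign_changes = sum(1 for u, v in zip(values, values[1:]) if u * v < 0)
# Coefficients of f(x - 2), to which the criterion is applied with p = 5.
shifted = shift(F, -2)
print(values, sign_changes, shifted, eisenstein(shifted, 5))
```

The three sign changes correspond to the three real roots located by the Intermediate Value Theorem; the check says nothing about uniqueness, which still needs the Rolle's Theorem argument.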

This doesn't mean that $f$ has no roots, nor that it doesn't have a
complete set of roots in some appropriate algebraic extension field.
All it means is that the roots of $f$ can't be expressed using just the
basic arithmetical operations and the extraction of $n$th roots.
We can certainly use numerical methods (such as the Newton–Raphson
iterative method) to calculate the roots of $f$ to arbitrary accuracy, but
this is algebraically unsatisfying. Another approach is to extend the
class of operations we allow: following an approach pioneered by
the Swedish mathematician Erland Bring (1736–1798) and the British
mathematician George Jerrard (1804–1863), we define the ultraradical
or Bring radical $B(a)$ of $a \in \mathbb{Q}$ to be the unique real root of $x^5 + x + a$. By a
suitable transformation, any quintic polynomial can be reduced to the
form $x^5 + px + q$ (the Bring–Jerrard normal form). Substituting $x = p^{1/4} t$
turns $x^5 + px + q = 0$ into $t^5 + t + p^{-5/4} q = 0$, so the roots are

$p^{1/4} \, B\!\left(p^{-5/4} q\right)$

and its four conjugates.
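As a numerical illustration of the Bring radical, here is a minimal Python sketch using the Newton–Raphson iteration mentioned above. The function names are hypothetical. The reduction used in `quintic_real_root` assumes $p > 0$ and the substitution $x = p^{1/4} t$, which turns $x^5 + px + q = 0$ into $t^5 + t + p^{-5/4} q = 0$; that normalisation is an assumption of this sketch.

```python
def bring_radical(a, tol=1e-14, max_iter=200):
    """Approximate the unique real root of x^5 + x + a by Newton-Raphson."""
    x = 0.0
    for _ in range(max_iter):
        # The derivative 5x^4 + 1 is never zero, so the step is always defined.
        step = (x ** 5 + x + a) / (5 * x ** 4 + 1)
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x

def quintic_real_root(p, q):
    """Real root of x^5 + p*x + q for p > 0, via x = p^(1/4) * t (assumed normalisation)."""
    if p <= 0:
        raise ValueError("this sketch only handles p > 0")
    return p ** 0.25 * bring_radical(q * p ** -1.25)

print(bring_radical(2.0))            # x^5 + x + 2 has the real root x = -1
print(quintic_real_root(16.0, 64.0)) # x^5 + 16x + 64 has the real root x = -2
```

This only produces decimal approximations, which is exactly the sense in which the text calls numerical methods algebraically unsatisfying.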


Charles Hermite (1822–1901) discovered a general solution for the
quintic equation in terms of elliptic functions, and later Felix Klein
(1849–1925) developed a general theory involving elliptic functions
and the rotational symmetry group of the regular icosahedron (which
is isomorphic to the alternating group $A_5$).

11.A Geometric constructions

The Academy has taken, this year, the
decision to no longer examine any solution
of the problems of the doubling of
the cube, of the trisection of the angle,
or of the squaring of the circle, nor any
announcement of a perpetual motion
machine.
Histoire de l'Académie Royale des
Sciences (1775) 61

In this section we will look at some applications of Galois Theory
to classical geometry. In particular, we will identify what geometric
constructions are possible using just an unmarked ruler and a pair
of compasses. As ancient Greek mathematicians discovered, we can
achieve a surprisingly wide range of geometric tasks with just these
tools.
Example 11.125 (Bisecting an angle) Given an arbitrary angle $\theta = AOB$, we
can construct an angle $\theta/2$ with straightedge and compasses as shown
in Figure 11.5.
Set the compasses to some arbitrary distance $d$, place the point at $O$
and sketch out an arc of radius $d$. This arc intersects the lines $OA$
and $OB$ at points $P$ and $Q$ respectively. Now put the compass point
at $P$ and sketch out an arc of some radius $r$ (where $r$ may or may
not equal $d$; it doesn't really matter, as long as $r$ is large enough
for the next two arcs to meet). Then do the same at $Q$. These
two arcs intersect at a point $R$. The line $OR$ bisects the angle $AOB$;
that is, the angles $AOR$ and $ROB$ are both equal to $\theta/2$.
[Figure 11.5: Angle bisection with compass and straightedge]
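The bisection steps can be replayed in coordinates. The sketch below is illustrative only: it places $O$ at the origin and $OA$ along the positive $x$-axis, assumes $0 < \theta < \pi$, and uses a hypothetical function name `bisect_angle`. It relies on the fact that two arcs of equal radius about $P$ and $Q$ meet on the perpendicular bisector of $PQ$, which passes through $O$ because $OP = OQ = d$.

```python
from math import atan2, cos, sin, sqrt, pi

def bisect_angle(theta, d=1.0, r=1.5):
    """Replay the compass steps with O at the origin and OA along the x-axis."""
    P = (d, 0.0)                                  # arc of radius d meets OA
    Q = (d * cos(theta), d * sin(theta))          # arc of radius d meets OB
    mx, my = (P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2 # midpoint M of PQ
    half_chord = sqrt((P[0] - Q[0]) ** 2 + (P[1] - Q[1]) ** 2) / 2
    if r <= half_chord:
        raise ValueError("radius r too small: the arcs about P and Q do not cross")
    h = sqrt(r * r - half_chord ** 2)             # distance from M to R
    norm = sqrt(mx * mx + my * my)                # |OM|, non-zero for 0 < theta < pi
    # R is the intersection point on the far side of PQ from O.
    R = (mx + h * mx / norm, my + h * my / norm)
    return atan2(R[1], R[0])                      # the angle AOR

print(bisect_angle(pi / 3), pi / 6)
```

Changing `d` or `r` moves the point $R$ along the bisector but leaves the returned angle unchanged, which is the sense in which the choice of radii doesn't matter.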

There are three famous, unsolved classical problems in geometry that
between them occupied mathematicians for countless hours over at
least two and a half thousand years and possibly longer. Each of these
must be achieved, for any arbitrary circle, angle or cube, just using a
pair of compasses and an unmarked ruler.
Squaring the circle Construct a square with exactly the same area
as a given circle.
Trisecting an angle Construct an angle exactly equal to one third of
a given angle.
Doubling a cube Construct a cube with volume exactly twice that
of a given cube.

I consider myself as having made my
report, and being discharged from further
attendance on the subject. I will
not, from henceforward, talk to any
squarer of the circle, trisector of the angle,
duplicator of the cube, constructor
of perpetual motion, subverter of gravitation,
stagnator of the earth, builder
of the universe, &c.
Augustus De Morgan (1806–1871),
A Budget of Paradoxes (1872) 7

In the middle of the 19th century, all three of these were finally proved
to be impossible, and we will study the reasons why now. (Impossible with just straightedge
and compasses, that is: the latter two are possible using slightly more sophisticated
techniques. Solutions involving the intersection of conic curves or marked rulers were known to Greek
mathematicians in the fourth century BCE. More recently, solutions have
been devised using origami, linkages and similar mechanisms.)
We'll do this by developing a characterisation of those points in the
plane $\mathbb{R}^2$ that can be obtained just by using an unmarked ruler and a
pair of compasses, and then using this to construct a hierarchy of field
extensions of $\mathbb{Q}$.
First we choose a coordinate system for $\mathbb{R}^2$: an origin $O$ and two linearly
independent vectors $e_1$ and $e_2$, which determine the coordinate
axes. Then any other vector $v$ in $\mathbb{R}^2$ can be expressed uniquely as a
linear combination $v = ae_1 + be_2$, and this corresponds to the point
with coordinates $(a, b)$.
Suppose that $P$ is a set of points in $\mathbb{R}^2$. We consider the following two
operations:
(L) Draw a line through any two distinct points in $P$.
(C) Draw a circle centred on some point in $P$, with radius equal to
the distance between any two distinct points in $P$.
Let $C(P)$ be the set of points in $\mathbb{R}^2$ that can be obtained as the intersection
of two lines, two circles, or a line and a circle, resulting from
applying operations (L) and (C) to the points in $P$. The points in $C(P)$
are said to be constructible in one step from $P$.
Now let $A_0 = \{(0, 0), (1, 0)\}$. We recursively define a sequence of sets
$A_0, A_1, A_2, \ldots$ of points in $\mathbb{R}^2$ such that

$A_i = A_{i-1} \cup C(A_{i-1}).$

That is, each set $A_i$ consists of the points in $A_{i-1}$ together with all
the points which are constructible in one step from $A_{i-1}$. We say any
point in $\mathbb{R}^2$ is constructible if it lies in $A_n$ for some $n$; that is, if it is
constructible in finitely many steps from $A_0$. Similarly, we say that an
angle $\theta$ is constructible if the point $(\cos\theta, \sin\theta)$ is constructible.
The set $A_1$ is constructed from $A_0$ by means of the operations (L) and
(C). The only possible line at this stage is the horizontal axis, passing
through $(0, 0)$ and $(1, 0)$. There are two possible circles that can be
drawn, one centred at $(0, 0)$ and the other at $(1, 0)$, both with radius 1.
These circles intersect each other at the points $(\tfrac12, \pm\tfrac{\sqrt3}{2})$, and meet the
horizontal axis at the further points $(-1, 0)$ and $(2, 0)$, so

$A_1 = \{(-1, 0), (0, 0), (1, 0), (2, 0), (\tfrac12, \tfrac{\sqrt3}{2}), (\tfrac12, -\tfrac{\sqrt3}{2})\}.$

The circles and lines determined by these six points yield several
more, the elements of $A_2$, and by continuing this process indefinitely
we obtain every constructible point.
To translate this into an algebraic context and use the machinery
developed in this and preceding chapters, we define a tower of fields

$F_0 \subseteq F_1 \subseteq \cdots \subseteq F_n \subseteq \cdots \subseteq \mathbb{R}$

by setting each $F_n$ to be the subfield of $\mathbb{R}$ generated by the coordinates
of the points in $A_n$. For each point $(x, y)$ in $A_n \setminus A_{n-1}$ we adjoin both
$x$ and $y$ to the field $F_{n-1}$. So $F_0$ is the field generated by the elements
0 and 1, the coordinates of the points in $A_0 = \{(0, 0), (1, 0)\}$; thus
$F_0 = \mathbb{Q}$. To get $F_1$ we adjoin the coordinates of the points in $A_1 \setminus A_0$;
the only ones not already in $\mathbb{Q}$ are $\pm\tfrac{\sqrt3}{2}$, which yields $F_1 = \mathbb{Q}(\sqrt3)$.
This new field $F_1$ is an algebraic extension of $F_0 = \mathbb{Q}$ with degree
$[\mathbb{Q}(\sqrt3) : \mathbb{Q}] = 2$. In fact, each successive extension $F_n : F_{n-1}$ (and, by
the Tower Law (Proposition 11.30, page 447), any extension $F_n : F_m$) has degree $2^k$ for some non-negative integer $k$:
Proposition 11.126 Let $F_n$ be the subfield of $\mathbb{R}$ generated by the coordinates
of the constructible points in the set $A_n$ discussed above. Then
$[F_n : \mathbb{Q}]$ is a power of 2 for all $n \in \mathbb{N}$.

Proof We proceed by induction on $n$. As discussed earlier, $[F_0 : \mathbb{Q}] = 1 = 2^0$ and $[F_1 : \mathbb{Q}] = 2$.
Now suppose that $[F_{n-1} : \mathbb{Q}] = 2^k$ for some non-negative integer $k$. Each of the points
in $A_n \setminus A_{n-1}$ is an intersection of two lines, a line and a circle, or two
circles determined by points in $A_{n-1}$. We consider each of these cases
in turn.
Case 1 Suppose we have four points $A = (a_1, a_2)$, $B = (b_1, b_2)$,
$C = (c_1, c_2)$ and $D = (d_1, d_2)$. The equations of the lines $AB$ and $CD$
are

$\dfrac{x - a_1}{b_1 - a_1} = \dfrac{y - a_2}{b_2 - a_2}, \qquad \dfrac{x - c_1}{d_1 - c_1} = \dfrac{y - c_2}{d_2 - c_2}.$

Finding the intersection point of $AB$ and $CD$ amounts to solving these
two simultaneous linear equations, a task which requires only addition,
subtraction, multiplication and division. Hence both $x$ and $y$ already
lie in $F_{n-1}$, so $F_{n-1}(x, y) = F_{n-1}$ and thus $[F_{n-1}(x, y) : F_{n-1}] = 1$.
Case 2 Given three points $A = (a_1, a_2)$, $B = (b_1, b_2)$ and $C = (c_1, c_2)$,
we now consider the intersection of the line $AB$ with the circle centred
at $C$ with radius $r$. Since $r$ is the distance between two points of $A_{n-1}$, we have $r^2 \in F_{n-1}$.
The equation of the line $AB$ is

$\dfrac{x - a_1}{b_1 - a_1} = \dfrac{y - a_2}{b_2 - a_2}$

and the equation of the circle centred at $C$ is

$(x - c_1)^2 + (y - c_2)^2 = r^2.$

Solving these we get

$(x - c_1)^2 + \left(\dfrac{b_2 - a_2}{b_1 - a_1}(x - a_1) + a_2 - c_2\right)^2 = r^2$

$\left(\dfrac{b_1 - a_1}{b_2 - a_2}(y - a_2) + a_1 - c_1\right)^2 + (y - c_2)^2 = r^2$

so $x$ and $y$ are roots of quadratic polynomials over $F_{n-1}$. Then
$[F_{n-1}(x, y) : F_{n-1}] = 1$, 2 or 4, depending on whether both, only one or
neither of $x$ and $y$ lie in $F_{n-1}$.
Case 3 The third case to consider is the intersection of two circles.
Given circles centred at $A = (a_1, a_2)$ and $B = (b_1, b_2)$ with radii $r$ and
$s$ respectively, where $r^2, s^2 \in F_{n-1}$, we get equations

$(x - a_1)^2 + (y - a_2)^2 = r^2, \qquad (x - b_1)^2 + (y - b_2)^2 = s^2.$

Expanding these and subtracting one from the other yields

$2(b_1 - a_1)x + 2(b_2 - a_2)y + a_1^2 + a_2^2 - b_1^2 - b_2^2 = r^2 - s^2,$

a linear equation in $x$ and $y$, which determines the chord between the
two intersection points. This case thus reduces to the previous case of
a line intersecting with a circle, and so $[F_{n-1}(x, y) : F_{n-1}] = 1$, 2 or 4
again.
In each case, then, adjoining elements determined by these intersection
points thus multiplies the degree of the extension by 1, 2 or 4. So
adjoining to $F_{n-1}$ elements corresponding to all the points in $A_n \setminus A_{n-1}$
results in the degree $[F_n : F_{n-1}] = 2^j$ for some non-negative integer $j$.
By the Tower Law (Proposition 11.30, page 447), $[F_n : \mathbb{Q}] = [F_n : F_{n-1}][F_{n-1} : \mathbb{Q}] = 2^j 2^k = 2^{j+k}$,
which completes the induction step and thereby the proof.
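Case 3 of the proof can be illustrated numerically. The following minimal Python sketch (the function name `circle_circle` is hypothetical) intersects two circles by first locating the chord through their intersection points and then moving along it, so that the only operations used are the four arithmetic operations and square roots. The example reproduces the two unit circles centred at $(0, 0)$ and $(1, 0)$ from the construction of $A_1$.

```python
from math import sqrt

def circle_circle(c1, r1, c2, r2):
    """Intersect two circles by reducing to the chord through the intersection points."""
    (a1, a2), (b1, b2) = c1, c2
    d = sqrt((b1 - a1) ** 2 + (b2 - a2) ** 2)   # distance between the centres
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                                # concentric circles, or no intersection
    # Signed distance from c1 to the chord, measured along the line of centres.
    t = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    h = sqrt(max(r1 * r1 - t * t, 0.0))          # half the length of the chord
    ux, uy = (b1 - a1) / d, (b2 - a2) / d        # unit vector from c1 towards c2
    mx, my = a1 + t * ux, a2 + t * uy            # point where the chord meets the line of centres
    return [(mx - h * uy, my + h * ux), (mx + h * uy, my - h * ux)]

points = circle_circle((0.0, 0.0), 1.0, (1.0, 0.0), 1.0)
print(points)  # approximately (1/2, sqrt(3)/2) and (1/2, -sqrt(3)/2)
```

Floating-point arithmetic only approximates the field operations, of course; the point of the sketch is that no operation beyond a single square root is needed, matching the degree bound in the proof.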
We can now apply this to the three classical geometric problems
mentioned earlier. The impossibility of two of these was first proved
by the French mathematician Pierre Wantzel (1814–1848).
Corollary 11.127 A cube cannot be doubled in volume using just ruler
and compasses.

Proof Suppose, without loss of generality, that the cube in question
has sides of unit length, and hence unit volume. A cube of twice this
volume must have sides of length $\alpha$ where $\alpha^3 = 2$, whose minimal
polynomial is $x^3 - 2 \in \mathbb{Q}[x]$. By Eisenstein's Criterion (Proposition 11.1, page 438) this is irreducible
over $\mathbb{Q}$, and hence $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 3$. But by Proposition 11.126
constructible lengths and points must lie in field extensions of degree
$2^k$ for some non-negative integer $k$. If $\alpha$ were constructible then, by the Tower Law,
3 would divide such a degree, which is impossible. Hence no such construction exists for $\alpha$.
Corollary 11.128 Not all angles can be trisected using just ruler and
compasses.

Proof As a counterexample, let $\theta = \frac{\pi}{3}$. To trisect this angle requires
us to construct $\varphi = \theta/3 = \frac{\pi}{9}$, or equivalently the length $y = \cos\varphi$.
Using the trigonometric identity

$\cos 3\varphi = 4\cos^3\varphi - 3\cos\varphi$

and the fact that $\cos 3\varphi = \cos\frac{\pi}{3} = \frac12$ this gives

$8y^3 - 6y - 1 = 0.$

Setting $x = 2y$ this yields

$x^3 - 3x - 1 = 0$

and so this particular trisection problem is equivalent to constructing
a length $\alpha$ satisfying this equation. But the polynomial $f = x^3 - 3x - 1$
is irreducible over $\mathbb{Q}$, again by Eisenstein's Criterion (applied with the prime 3
to $f(x+1) = x^3 + 3x^2 - 3$), so the extension
$\mathbb{Q}(\alpha) : \mathbb{Q}$ has degree $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 3$, which is not a power of 2. By
Proposition 11.126 no such construction exists.
Some angles can be trisected in this way, however. If $\theta = \frac{\pi}{2}$ then
the question concerns the polynomial $4x^3 - 3x$, which factorises as
$x(4x^2 - 3)$, and so the corresponding field extension is $\mathbb{Q}(\sqrt3) : \mathbb{Q}$. This
has degree 2, and $\frac{\pi}{2}$ can be trisected in the required manner.

Pierre Laurent Wantzel (1814–1848)
demonstrated a strong aptitude in mathematics from an early age. At
the age of 14, he entered the Collège Charlemagne, having been tutored
in Latin and Greek by a M. Lievyns, whose daughter he later married.
While there, he edited a new edition of the Traité d'Arithmétique of
Antoine-André-Louis Reynaud (1771–1884) and Étienne Bézout (1730–1783), and won
first prize in French and Latin.
In 1832 he won first place in the entrance exams for both the École Polytechnique
and the science section of the École Normale Supérieure, which nobody
had done before. He excelled in his studies and in 1834 began training
as an engineer at the École des Ponts et Chaussées. He concluded that he
would be at best a mediocre engineer, and much preferred teaching mathematics.
In 1838 he was appointed Professor of analysis at the École Polytechnique,
and three years later Professor of applied mechanics at the École des Ponts
et Chaussées. In addition, he oversaw the entrance exams at the École
Polytechnique and taught mathematics and physics in various schools around
Paris.
His mathematical contributions include important work on the solution
of equations by radicals, and the possibility of ruler and compass constructions.
In addition, with his friend and colleague Jean Claude de Saint-Venant
(1797–1886), he published three papers on air flow.
He died at 33, apparently from overwork. In his obituary, Saint-Venant remarked:
"He usually worked in the evenings, not going to bed until late;
then he read, sleeping restlessly for only a few hours, alternately abusing
coffee and opium and, until his marriage, taking his meals at irregular
times. He trusted without measure in his constitution, which was naturally
very strong, and which he subjected with pleasure to all kinds of abuse. He
brought sadness to those who mourn his untimely death."

The other classical problem, that of squaring the circle, requires the
fact that $\pi$ is not algebraic over $\mathbb{Q}$, a fact originally proved in 1882 by
the German mathematician Ferdinand Lindemann (1852–1939).
Corollary 11.129 A circle cannot be squared using just ruler and compasses.

Proof Without loss of generality, consider a circle of unit radius. This
has area $\pi$, and so a square of the same area must have side
$\sqrt\pi$. To construct such a square requires the point $(0, \sqrt\pi)$ to be
constructible from $A_0 = \{(0, 0), (1, 0)\}$ in finitely many steps. By
Proposition 11.126 this requires the extension $\mathbb{Q}(\sqrt\pi) : \mathbb{Q}$ to be algebraic
with degree a power of 2. But $\mathbb{Q}(\pi) \subseteq \mathbb{Q}(\sqrt\pi)$, and $\pi$ is known
not to be algebraic over $\mathbb{Q}$, so no such construction exists.
Proposition 11.126 gives a necessary condition for constructibility. With
a little more work we can show that this condition is also sufficient.
First we need to prove some preliminary results on constructible points.
We begin with some standard ruler and compass constructions.
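Before turning to those constructions, the two cubics behind Corollaries 11.127 and 11.128 can be checked by machine. The sketch below is an illustration, not the argument used in the text: instead of Eisenstein's Criterion it uses the fact that a cubic over $\mathbb{Q}$ with no rational root is irreducible, together with the Rational Root Theorem. The function name `rational_roots` is hypothetical.

```python
from fractions import Fraction
from math import cos, pi

def rational_roots(coeffs):
    """Rational roots of an integer polynomial with non-zero constant term.

    Coefficients are listed lowest degree first. By the Rational Root Theorem
    every rational root is +/- (divisor of constant term) / (divisor of leading term).
    """
    def divisors(n):
        n = abs(n)
        return [d for d in range(1, n + 1) if n % d == 0]

    roots = set()
    for num in divisors(coeffs[0]):
        for den in divisors(coeffs[-1]):
            for cand in (Fraction(num, den), Fraction(-num, den)):
                if sum(c * cand ** k for k, c in enumerate(coeffs)) == 0:
                    roots.add(cand)
    return roots

doubling = [-2, 0, 0, 1]     # x^3 - 2, from doubling the cube
trisection = [-1, -3, 0, 1]  # x^3 - 3x - 1, from trisecting pi/3

x = 2 * cos(pi / 9)          # the length 2*cos(20 degrees)
print(rational_roots(doubling), rational_roots(trisection), x ** 3 - 3 * x - 1)
```

Both cubics have no rational roots, so each is irreducible of degree 3, and the last value printed is zero up to rounding, confirming that $2\cos\frac{\pi}{9}$ satisfies $x^3 - 3x - 1 = 0$.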

Proposition 11.130 Given points $A$ and $B$, we can construct a line
perpendicular to $AB$ that intersects with $AB$ at a point $C$, such that
$AC = CB$.

Proof Draw circles centred at $A$ and $B$ with radius $AB$. These circles
intersect at two points $P$ and $Q$. The line through $P$ and $Q$ is perpendicular
to $AB$, and intersects it at the midpoint $C$. (See Figure 11.6.)
[Figure 11.6: The perpendicular bisector]
Proposition 11.131 Given points $A$, $B$ and $C$, we can construct a line
through $C$ perpendicular to $AB$.

Proof Draw a circle centred at $C$ that intersects $AB$ at points $P$ and $Q$.
Draw the perpendicular bisector of $PQ$ as in Proposition 11.130. (See
Figure 11.7.)
(Although Figure 11.7 shows the point $C$ off the line $AB$, this construction
also works if $C$ happens to lie on $AB$.)
Proposition 11.132 Given points $A$, $B$ and $C$, we can draw a line parallel
to $AB$, passing through $C$.

Proof Draw the line through $A$ and $B$. Drop a perpendicular from $C$
to this line, as in Proposition 11.131, meeting it at a point $D$. Draw
the line through $C$ and $D$, and construct the perpendicular line to $CD$
passing through $C$. This line is parallel to $AB$. (See Figure 11.8.)
Proposition 11.133 Suppose that $a, b \in \mathbb{R}$. Then
(i) the point $(a, 0)$ is constructible if and only if $(0, a)$ is;
(ii) the point $(a, b)$ is constructible if and only if $(a, 0)$ and $(b, 0)$ are;
(iii) if $(a, 0)$ and $(b, 0)$ are constructible, then so are $(a+b, 0)$, $(a-b, 0)$,
$(ab, 0)$ and, if $b \neq 0$, $(a/b, 0)$; and
(iv) if $a, b \in \mathbb{Q}$ then $(a, b)$ is constructible.
[Figure 11.7: Dropping a perpendicular from $C$ to $AB$]
[Figure 11.8: Constructing a parallel]

Proof Given $A_0 = \{(0, 0), (1, 0)\}$, we can construct the $x$-axis by
drawing the line through $(0, 0)$ and $(1, 0)$. By Proposition 11.131 we
can construct the line passing through $(0, 0)$ perpendicular to this line;
that is, the $y$-axis.
(i) Given $A = (a, 0)$, the circle with radius $|a|$ centred on $(0, 0)$ intersects
the $y$-axis at $P = (0, a)$ (and $(0, -a)$). (See Figure 11.9.)
Conversely, given $(0, a)$, the same circle intersects the $x$-axis at $(a, 0)$
and $(-a, 0)$.
[Figure 11.9: Constructing $(0, a)$ from $(a, 0)$ and vice-versa]
(ii) Given $B = (b, 0)$, construct $(0, b)$ using the method in part (i).
Draw the line passing through $A = (a, 0)$ parallel to the $y$-axis and the
line through $P = (0, b)$ parallel to the $x$-axis, as in Proposition 11.132.
These lines intersect at $Q = (a, b)$.
Conversely, given $Q = (a, b)$, draw the line through $(a, b)$ parallel to
the $x$-axis; this intersects the $y$-axis at $P = (0, b)$, and from this we
can construct $B = (b, 0)$ via the method in part (i). Drawing a line
through $(a, b)$ parallel to the $y$-axis yields the point $A = (a, 0)$ as its
intersection with the $x$-axis. (See Figure 11.10.)
(iii) Given $A = (a, 0)$ and $B = (b, 0)$, the circle centred at $(a, 0)$ with
radius $OB = |b|$ intersects the $x$-axis at $P = (a+b, 0)$ and $Q = (a-b, 0)$. (See
Figure 11.11.)
We can construct $(ab, 0)$ by drawing the line through $I = (0, 1)$ and $B = (b, 0)$
and then drawing the line parallel to this passing through $A = (0, a)$
(which is constructible by part (i)).
This line intersects the $x$-axis at $P = (ab, 0)$. (See Figure 11.12.)
Similarly, we can obtain $(a/b, 0)$ by doing this construction in reverse.
Draw the line through $I = (0, 1)$ and $B = (b, 0)$, and then construct
the parallel line passing through $A = (a, 0)$: this intersects the $y$-axis
at $P = (0, a/b)$, and by part (i) we can then construct $Q = (a/b, 0)$.
(See Figure 11.13.)
(iv) By part (iii) we can construct $(n, 0)$ for any $n \in \mathbb{Z}$, and then
$(m/n, 0)$ for any $m, n \in \mathbb{Z}$ with $n \neq 0$. We can thus construct $(a, 0)$
and $(0, b)$ for any $a, b \in \mathbb{Q}$, and hence $(a, b)$ by part (ii).
[Figure 11.10: Constructing $(a, b)$]
[Figure 11.11: Constructing $(a+b, 0)$ and $(a-b, 0)$]
[Figure 11.12: Construction of $(ab, 0)$]
Proposition 11.134 Suppose that $a \in \mathbb{R}$ with $a \geq 0$. If the point $(a, 0)$ is constructible,
then so is $(\sqrt{a}, 0)$.

Proof The geometric construction is shown in Figure 11.14. Let $A = (a, 0)$
and let $B = (a+1, 0)$. Let $C = ((a+1)/2, 0)$ be the midpoint
of the line segment $OB$. Draw the circle with radius $OC = (a+1)/2$
centred on $C$. Now construct the perpendicular to $OB$ through the
point $I = (1, 0)$. This intersects the circle at two points $P$ and $Q$;
without loss of generality suppose that $P$ is the one with positive
$y$-coordinate. The triangles $BIP$ and $PIO$ are similar, and hence the
ratios $OI/IP$ and $IP/IB$ are equal. From this, and the fact that $OI = 1$,
we deduce that $IP^2 = OI \cdot IB = IB = a$, and hence the line segment $IP$ has
length $\sqrt{a}$. We now draw a circle of radius $IP$ centred at $O$: this
intersects the positive $x$-axis at $D = (\sqrt{a}, 0)$ as required.
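The product, quotient and square-root constructions can be checked in coordinates. The following Python sketch is purely illustrative and its helper names are hypothetical: lines are stored as triples $(A, B, C)$ representing $Ax + By = C$, and each function replays the corresponding geometric steps rather than computing $ab$, $a/b$ or $\sqrt{a}$ directly. It assumes $b \neq 0$ and $a \geq 0$ where needed.

```python
from math import sqrt

def line_through(p, q):
    """Return (A, B, C) with A*x + B*y = C for the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    A, B = y2 - y1, x1 - x2
    return A, B, A * x1 + B * y1

def parallel_through(line, p):
    """Return the line parallel to `line` passing through the point p."""
    A, B, _ = line
    return A, B, A * p[0] + B * p[1]

def product(a, b):
    """x-intercept of the parallel to the line through (0,1), (b,0) drawn through (0,a)."""
    A, B, C = parallel_through(line_through((0, 1), (b, 0)), (0, a))
    return C / A

def quotient(a, b):
    """y-intercept of the parallel to the line through (0,1), (b,0) drawn through (a,0)."""
    A, B, C = parallel_through(line_through((0, 1), (b, 0)), (a, 0))
    return C / B

def square_root(a):
    """Height at x = 1 of the circle with diameter from (0,0) to (a+1,0)."""
    centre = radius = (a + 1) / 2
    return sqrt(radius ** 2 - (1 - centre) ** 2)

print(product(3, 4), quotient(3, 4), square_root(2))
```

The call to `sqrt` in `square_root` stands in for reading off the length $IP$ from the figure; the check is that this length really is $\sqrt{a}$.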

[Figure 11.13: Construction of $(a/b, 0)$]
[Figure 11.14: Construction of $(\sqrt{a}, 0)$]
Proposition 11.126 says that the coordinates of constructible points lie
in a field extension of degree $2^n$, for some non-negative integer $n$. The
next two propositions concern the converse.
Proposition 11.135 If $E$ is an extension of $\mathbb{Q}$ such that there exists a
sequence

$\mathbb{Q} = F_0 \subset F_1 \subset \cdots \subset F_n = E$

with $[F_{i+1} : F_i] = 2$ for $0 \leq i < n$, then every point with coordinates in $E$
is constructible.

Proof We proceed by induction on $n$. Suppose first that $n = 1$,
so that $[E : \mathbb{Q}] = 2$. The field $F_1 = E$ has degree 2 over $\mathbb{Q}$, and
is hence of the form $\mathbb{Q}(\alpha)$, where $\alpha$ is a root of some quadratic
polynomial $f = x^2 + bx + c \in \mathbb{Q}[x]$ that is irreducible over $\mathbb{Q}$. The
roots of $f$ are $\tfrac12(-b \pm \sqrt{b^2 - 4c})$, and without loss of generality we
may choose $\alpha = \tfrac12(-b + \sqrt{b^2 - 4c})$. This is constructible from $\mathbb{Q}$ by
Propositions 11.133 and 11.134, and hence the result holds for $n = 1$.
Now suppose that the result holds for $n = k$. The extension $F_{k+1} : F_k$
has degree 2 and therefore $F_{k+1} = F_k(\beta)$ for some $\beta$, a root of
a quadratic polynomial $g = x^2 + px + q \in F_k[x]$ that is irreducible
over $F_k$. By the same argument as above, $\beta = \tfrac12(-p + \sqrt{p^2 - 4q})$ is
constructible from $F_k$. This completes the induction and the proof.


Proposition 11.136 Let $E : \mathbb{Q}$ be a normal extension with degree $2^n$ for
some $n \in \mathbb{N}$. Then every point in $E \times E$ is constructible.

Proof By Corollary 11.105, the Galois group of $E : \mathbb{Q}$ has order

$|\mathrm{Gal}(E : \mathbb{Q})| = [E : \mathbb{Q}] = 2^n.$

By Sylow's First Theorem (Theorem 7.4, page 242) there exists a sequence

$\{e\} = H_0 \lhd H_1 \lhd \cdots \lhd H_n = \mathrm{Gal}(E : \mathbb{Q})$

of subgroups such that $|H_i| = 2^i$ for $0 \leq i < n$. By the Fundamental
Theorem of Galois Theory (Theorem 11.108, page 482) there exist subfields

$\mathbb{Q} = F(H_n) \subset F(H_{n-1}) \subset \cdots \subset F(H_1) \subset F(H_0) = E$

with $[E : F(H_i)] = 2^i$ for $0 \leq i < n$, and so $[F(H_i) : F(H_{i+1})] = 2$. The
result then follows from Proposition 11.135.
We end this section with a discussion of the constructibility of regular
polygons. Let $P_n$ denote the regular $n$-sided polygon with vertices
unit distance from the origin, and let $\theta_n = \frac{2\pi}{n}$.
Proposition 11.137 The polygon $P_n$ is constructible if and only if the
angle $\theta_n$ is constructible.

Proof Let $I = (1, 0)$ be the first vertex of our polygon. Then if $\theta_n$ is
constructible, we can make $(\cos\theta_n, \sin\theta_n)$ the second vertex.
Suppose we can construct the $k$th vertex $(\cos k\theta_n, \sin k\theta_n)$. Then by
Proposition 11.133 we can construct the $(k+1)$st vertex

$(\cos k\theta_n \cos\theta_n - \sin k\theta_n \sin\theta_n, \; \sin k\theta_n \cos\theta_n + \cos k\theta_n \sin\theta_n) = (\cos (k+1)\theta_n, \; \sin (k+1)\theta_n).$

By induction, we can construct all $n$ vertices of $P_n$.
Conversely, given $P_n$, the angle between adjacent vertices is $\theta_n$.
Proposition 11.138 Let $m, n \in \mathbb{N}$.
(i) If $P_n$ is constructible then so is $P_m$ for any $m \mid n$.
(ii) If $P_m$ and $P_n$ are constructible, and $m$ and $n$ are coprime, then $P_{mn}$
is constructible.
(iii) The polygon $P_{2^n}$ is constructible.

Proof (i) Suppose $n = md$ for some $d \in \mathbb{N}$. Joining every $d$th vertex
of $P_n$ yields $P_m$.
(ii) If $m$ and $n$ are coprime, then there exist integers $a$ and $b$ such that
$am + bn = 1$. Hence $\frac{a}{n} + \frac{b}{m} = \frac{1}{mn}$, and so
$\frac{2\pi}{mn} = a\frac{2\pi}{n} + b\frac{2\pi}{m}$. We can
thus construct the angle $\frac{2\pi}{mn}$ and therefore the polygon $P_{mn}$.
(iii) Construction of the polygon $P_{2^n}$ is equivalent to construction of
the angle $\theta_{2^n} = \frac{2\pi}{2^n}$. The angle $\theta_{2^0} = 2\pi$ is obviously constructible. Now
suppose that $\theta_{2^k}$ is constructible, and bisect it to obtain $\tfrac12\theta_{2^k} = \theta_{2^{k+1}}$.
By induction, we can construct $\theta_{2^n}$, and thereby $P_{2^n}$ for any $n \in \mathbb{N}$.
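Part (ii) depends on finding integers $a$ and $b$ with $am + bn = 1$. The minimal Python sketch below (function names hypothetical) finds them with the extended Euclidean algorithm and checks the resulting angle identity for the pentagon and triangle, which combine to give the regular 15-gon.

```python
from math import pi

def bezout(m, n):
    """Extended Euclidean algorithm: return (a, b) with a*m + b*n == gcd(m, n)."""
    old_r, r = m, n
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_a, old_b

def combined_angle(m, n):
    """Angle a*(2*pi/n) + b*(2*pi/m) for coprime m and n, which equals 2*pi/(m*n)."""
    a, b = bezout(m, n)
    return a * (2 * pi / n) + b * (2 * pi / m)

print(bezout(3, 5), combined_angle(3, 5), 2 * pi / 15)
```

For $m = 3$ and $n = 5$ this gives $a = 2$ and $b = -1$, so the angle $\frac{2\pi}{15}$ is obtained by marking off $\frac{2\pi}{5}$ twice and then subtracting $\frac{2\pi}{3}$.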
Corollary 11.139 Let $n = p_1^{m_1} \cdots p_k^{m_k}$ where $p_1, \ldots, p_k$ are distinct
primes. The polygon $P_n$ is constructible if and only if each of the polygons
$P_{p_1^{m_1}}, \ldots, P_{p_k^{m_k}}$ is constructible.

Proof If each of $P_{p_1^{m_1}}, \ldots, P_{p_k^{m_k}}$ is constructible, then so is $P_n$ by
Proposition 11.138(ii). Conversely, if $P_n$ is constructible, then so is $P_{p_i^{m_i}}$ for
$1 \leq i \leq k$, by Proposition 11.138(i), since $p_i^{m_i} \mid n$.
We are now almost ready to state and prove the necessary and sufficient
conditions on the constructibility of regular polygons. First, a
short number-theoretic lemma and an accompanying definition.
Lemma 11.140 Let $n = 2^m + 1$ for some $m \in \mathbb{N}$. Then $n$ is prime only
if $m$ is a power of 2.

Proof Suppose that $m = ab$ where $a > 1$ is odd. Then $n = 2^{ab} + 1$,
which factorises as

$2^{ab} + 1 = (2^b + 1)(2^{(a-1)b} - 2^{(a-2)b} + \cdots + 1).$

Hence $n$ is composite if $m$ has an odd factor greater than 1, so for $n$ to
be prime $m$ must be a power of 2.
Definition 11.141 Integers of the form $F_k = 2^{2^k} + 1$ for $k \in \mathbb{N}_0$ are
called Fermat numbers; if prime they are called Fermat primes.

The only known Fermat primes are:

$F_0 = 3, \quad F_1 = 5, \quad F_2 = 17, \quad F_3 = 257, \quad F_4 = 65\,537.$
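Lemma 11.140 and the list of Fermat primes are easy to test for small cases. The sketch below is a minimal illustration with hypothetical helper names; it uses plain trial division, which is adequate only because the numbers involved are small or have small factors.

```python
def is_prime(n):
    """Primality test by trial division; fine for the small cases used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def fermat(k):
    """The k-th Fermat number 2^(2^k) + 1."""
    return 2 ** (2 ** k) + 1

# Exponents m up to 40 for which 2^m + 1 is prime: all are powers of 2.
prime_exponents = [m for m in range(1, 41) if is_prime(2 ** m + 1)]

# The factorisation used in the proof: 2^b + 1 divides 2^(ab) + 1 when a is odd.
divisibility_holds = all((2 ** (a * b) + 1) % (2 ** b + 1) == 0
                         for a in (3, 5, 7) for b in range(1, 6))

print(prime_exponents, [fermat(k) for k in range(5)], is_prime(fermat(5)))
```

The last value printed is `False`: the next Fermat number $F_5 = 4\,294\,967\,297$ is divisible by 641, so being of the form $2^{2^k} + 1$ is necessary but not sufficient for primality.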

Proposition 11.142 (Gauss) The regular polygon $P_n$ is constructible if
and only if $n = 2^m p_1 \cdots p_k$ where $k, m \in \mathbb{N}_0$ and the factors $p_i = 2^{2^{k_i}} + 1$
are distinct Fermat primes.

Proof The polygon $P_n$ is constructible if and only if $\zeta_n = e^{2\pi i/n}$
is constructible. This in turn is possible exactly when the splitting
extension of the cyclotomic polynomial $\Phi_n$ has degree a power of 2.
Suppose that $n = 2^m p_1^{m_1} \cdots p_k^{m_k}$ for some $k, m \in \mathbb{N}_0$, where
$p_1, \ldots, p_k$ are distinct odd primes and each $m_i \geq 1$.
By Propositions 11.39 and 11.17(iii), the degree of the extension is $\varphi(n)$,
and by Proposition 2.46 we have

$\varphi(n) = 2^{m-1}(p_1 - 1)p_1^{m_1 - 1} \cdots (p_k - 1)p_k^{m_k - 1}$

(where the factor $2^{m-1}$ is omitted if $m = 0$).
We need this to be a power of 2, so $m$ can be any non-negative integer.
Furthermore, each $p_i^{m_i - 1}$ is a power of an odd prime and can therefore
only be equal to a power of 2 in the trivial case where $m_i - 1 = 0$, so
$m_1 = \cdots = m_k = 1$. And for $(p_i - 1)$ to be a power of 2, the prime $p_i$
must be of the form $2^{h_i} + 1$ for some $h_i \in \mathbb{N}_0$; by Lemma 11.140 the
exponent $h_i$ must itself be a power of 2.
Putting all this together, we see that $\varphi(n)$ is a power of 2 (and the
polygon $P_n$ is constructible) if and only if $m_1 = \cdots = m_k = 1$ and each
odd prime factor $p_i = 2^{2^{k_i}} + 1$ is a Fermat prime.
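The criterion can be tested against the totient condition used in the proof. The following Python sketch is illustrative only (all names are hypothetical): it lists the $n$ from 3 to 20 for which $\varphi(n)$ is a power of 2, and compares the totient test with the factorisation into a power of 2 and distinct known Fermat primes.

```python
def totient(n):
    """Euler's totient function, computed by trial-division factorisation."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def is_power_of_two(k):
    return k >= 1 and (k & (k - 1)) == 0

FERMAT_PRIMES = (3, 5, 17, 257, 65537)  # the known Fermat primes

def gauss_wantzel(n):
    """Test whether n is a power of 2 times a product of distinct known Fermat primes."""
    while n % 2 == 0:
        n //= 2
    for p in FERMAT_PRIMES:
        if n % p == 0:
            n //= p          # each Fermat prime may be used at most once
    return n == 1

constructible = [n for n in range(3, 21) if is_power_of_two(totient(n))]
print(constructible)
```

Under these assumptions the regular polygons with 7, 9, 11, 13, 14, 18 and 19 sides are the ones up to 20 sides that cannot be constructed.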
This result is originally due to the great German mathematician Carl
Friedrich Gauss (1777–1855), who proved the sufficiency (the "if" half)
in his 1801 book Disquisitiones Arithmeticae; he stated the necessary
("only if") half without proof. Proof of necessity was given by Pierre
Wantzel (1814–1848) in 1837.

Summary

References and further reading


In this chapter we've only been able to look at the main ideas of Galois theory. The following two
books, both very readable and aimed at undergraduate students, cover the subject more thoroughly:
J M Howie, Fields and Galois Theory, Springer Undergraduate Mathematics Series, Springer (2006)
I N Stewart, Galois Theory, fourth edition, CRC Press (2015)
The latter also includes a couple of chapters of historical and biographical notes. The following article
gives a much briefer overview of the subject:
J Stillwell, Galois Theory for Beginners, The American Mathematical Monthly 101.1 (1994) 22–27
The romantic and tragic story of Galois' brilliant but short life has passed into mathematical folklore
and, as often happens, many of the things we think we know are wrong (particularly the popular image
of the doomed mathematician frantically writing down his discoveries in the hours before his fatal
duel). The following article unpicks many of these myths, giving an account of Galois' life and death
better supported by the available historical sources:
T Rothman, Genius and Biographers: The Fictionalization of Évariste Galois, The American Mathematical
Monthly 89.2 (1982) 84–106
The next two articles present applications of finite fields to mathematical puzzles.
N G de Bruijn, A solitaire game and its relation to a finite field, Journal of Recreational Mathematics 5.2
(1972) 133–137
D A James, Magic circles, Mathematics Magazine 54.3 (1981) 122–125
We saw in Section 11.A that the classical problems of trisecting an arbitrary angle, doubling the cube,
and squaring the circle are impossible with just ruler and compasses. This has been known since the
middle of the nineteenth century. Nevertheless, well-meaning enthusiasts still persist in trying to solve
them, with sometimes exasperating results. The following articles and books give fascinating accounts
of many such attempts.
A De Morgan, A Budget of Paradoxes, Longman, London (1872)
U Dudley, Mathematical Cranks, Mathematical Association of America, Washington, DC (1992)
U Dudley, The Trisectors, Mathematical Association of America, Washington, DC (1994)
A E Hallerberg, Indiana's Squared Circle, Mathematics Magazine 50.3 (1977) 136–140
D Singmaster, The Legal Values of Pi, Mathematical Intelligencer 7.2 (1985) 69–72

Exercises
12.1 Use Proposition 11.5 together with the prime $p = 29$, in a manner analogous to the method used
in Example 11.11, to find a list of quadratic and higher-degree irreducible polynomials in $\mathbb{Z}[x]$.
12.2 Show that the extension $\mathbb{Q}(\sqrt2, \sqrt3)$ is simple.
12.3 Adapt the argument in Example 11.92 to show that if $[E : F] = 2$ and $\mathrm{char}(F) \neq 2$, then $E : F$ is
separable. What happens if $\mathrm{char}(F) = 2$?
We have offended against thy holy
lawes: We have left undone those
thinges whiche we ought to have done,
and we have done those thinges which
we ought not to have done, and there is
no health in us, but thou, O Lorde, have
mercy upon us miserable offendours.
The Book of Common Prayer (1559)

A Background

Reports that say that something hasn't
happened are always interesting to me,
because as we know, there are known
knowns; there are things we know we
know. We also know there are known
unknowns; that is to say we know there
are some things we do not know. But
there are also unknown unknowns: the
ones we don't know we don't know.
Donald Rumsfeld, US Department
of Defense news briefing,
12 February 2002

The rest of this book assumes some background knowledge that
is usually taught early in the first year of a mathematics degree
programme at a British university.
This appendix comprises a quick overview of some of the more relevant
bits of this material, in order to make this book accessible to a
wider audience, to provide a quick reminder for those who might need
one, and also to standardise notation and terminology. In particular,
we cover sets, infinite cardinals, functions, and relations.

A.1 Sets

A set is a many that allows itself to be
thought of as a One.
Georg Cantor (1845–1918)

Much of pure mathematics, and abstract algebra in particular, is
based on set theory in one form or another; indeed, almost all of the
objects we study in this book are essentially sets equipped with some
extra structure.
Much has been written over the years, particularly during the earlier
and middle years of the 20th century, on axiomatic set theory: the
formalisation and underpinnings of much of the foundations of pure
mathematics. This is in many ways an interesting subject, and some
quite surprising results and unexpectedly tricky problems arise in the
process, but it is beyond the scope of the current discussion. We will
adopt a more concrete approach (sometimes called naïve set theory),
and merely use the machinery without worrying too much about how
it works, and in what circumstances it might break down. Although
awkward paradoxes can occur, often when we consider sets of sets
without due care and attention, in practice we won't run into any of
these situations in the rest of this book.
Broadly speaking, a set is an amorphous collection of objects, a bag of
distinct things. Although most of the more familiar examples will tend
to have some additional structure, such as an ordering, an addition or
multiplication operation, or some notion of distance between elements,
by default a set has no preordained or canonical internal structure.
Definition A.1 A set is a (finite or infinite) collection of distinct
objects, which are called the elements or members of the set.
We denote membership by the symbol $\in$, so $x \in S$ means that $x$ is an
element of the set $S$, while $y \notin S$ means that $y$ is not an element of $S$.
We will usually adopt the convention that sets are represented by
capital letters, while elements are denoted by lower-case letters.
There are various ways of defining a given set. One, suitable only for
finite sets (or very occasional instances of particularly well-behaved
infinite sets) is to simply list the elements:
Wikimedia Commons / Sophia De Morgan (18091892)
Augustus De Morgan (18061871) was A = { a, b, c, d},
born in Madras, but when he was ten
his father (also called Augustus) died B = {1, 2, 3, . . .},
and the family returned to England. C = {. . . , 4, 2, 0, 2, 4, . . .},
His mother, a devout Anglican, hoped
he would become a clergyman but D = {Io, Europa, Ganymede, Callisto}.
at sixteen he entered Trinity College,
Cambridge to study mathematics. He Sometimes its more convenient to specify the elements in some other,
played the flute and enjoyed learning well-defined way:
for its own sake, which hampered his
performance in the fiercely competitive S1 = {( x, y) : x, y R, x2 + y2 = 1}
Cambridge mathematics tripos exams,
and he graduated fourth in his class in This set consists of all ordered pairs of real numbers ( x, y) such that
1827, with the title of Fourth Wrangler
x2 + y2 = 1; it can be realised geometrically as the unit circle in the
and the degree of Bachelor of Arts.
His nonconformist beliefs prevented Euclidean plane R2 .
him from taking the theological exam In all of these examples we have used braces { and } to contain the
required for progression to the degree
of Master of Arts and a college fel- elements or their specifications. In the last example, the colon : should
lowship, so he left Cambridge and be interpreted as such that, so we could read the whole thing as
briefly studied law at Lincolns Inn,
but in 1828 was appointed Professor
S1 is the set of all ( x, y) such that x and y are in R and x2 + y2 = 1.
of Mathematics at the newly-founded Some books use a vertical bar | instead of a colon.
London University. He stayed there
until 1831, when he and several col- Example A.2 Some standard sets that we refer to regularly through-
leagues resigned en masse on a point out the rest of the book include the following:
of principle. His successor drowned in
1836 and De Morgan was reappointed, N = {1, 2, 3, . . .} the natural numbers
remaining until 1866, when he resigned
again on a point of principle. Z = {. . . , 2, 1, 0, 1, 2, . . .} the integers
Q = ba : a, b Z, b > 0, gcd( a, b) = 1

In addition to his many contributions the rational numbers
to logic, set theory and algebra, in
1865 De Morgan and his son George R the real numbers
founded the London Mathematical So- 2
ciety, which to this day is the main
C = { a + bi : a, b R, i = 1} the complex numbers
British learned society for pure and ap-
plied research mathematicians. He was One particularly (if unexpectedly) important set is the empty set, the
set with no elements. We denote it = { }.1 Also important, but
also an entertaining writer: his book A
Budget of Paradoxes, published posthu-
mously in 1872, is a fascinating study
generally only in a formal sense, is the universal set E . This can be
of mathematical crackpottery. regarded as the set containing everything (including every other set);
1
This symbol is not to be confused however be warned that one can easily run into awkward paradoxes
with the lowercase Greek letter or .
background 507

if one isnt careful, especially when dealing with sets of sets.2 2


Because of this, many versions of axiomatic set theory (including
Zermelo–Fraenkel set theory) don't permit the existence of a universal
set. To sidestep these problems, we will typically use $\mathcal{E}$ to denote
the largest set necessary in a particular context. This is a somewhat
imprecise way of doing things, but will suffice for our purposes.

2 One such paradox is sometimes known as Russell's Paradox (after
the philosopher and mathematician Bertrand Russell (1872–1970)) or the
Barber's Paradox. The latter form is usually stated as follows:
Suppose there exists a town whose residents are extremely pogonophobic;
that is, they are either afraid of, strongly disapprove of, or are
otherwise prejudiced against men with beards. The town has exactly one
professional barber, who shaves every man who doesn't shave himself. The
seemingly innocuous question "who shaves the barber?" leads to an
irresolvable inconsistency: if the barber (in his capacity as a private
citizen) doesn't shave himself, then he must do so (in his capacity as the
town's barber); if on the other hand he does, then he shouldn't.
There are other ways of formulating this paradox: by considering a book
indexing all books that don't cite themselves, or by contemplating a set that
contains every set that doesn't contain itself.

The sets in Example A.2 can all be nested neatly inside each other: the
natural numbers are the positive integers, the integers are the rational
numbers with denominator 1, the rationals are those real numbers that
can be expressed as a quotient of integers, and the real numbers are the
complex numbers with imaginary part equal to zero.

Definition A.3 Let $A$ and $B$ be two sets. We say that $A$ is a subset
of $B$, written $A \subseteq B$ or $B \supseteq A$, if every element of $A$ is an element of
$B$ as well.
If $B$ also contains at least one extra element that isn't in $A$, then we
say $A$ is a proper subset of $B$, and write $A \subset B$ or $B \supset A$.

In particular, the empty set $\emptyset$ is a subset of every set, including itself:
all of the no elements $\emptyset$ doesn't have are also contained in any other
set we care to consider. Any set contains itself as a subset (although
obviously not a proper subset), and our somewhat nebulously-defined
universal set $\mathcal{E}$ contains itself and every other set under consideration.
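
The notation introduced so far maps quite directly onto the built-in set type of a programming language. The following is a minimal sketch in Python, not part of the original text; the particular sets and the finite window used for the even integers are illustrative assumptions, since a program can only list finitely many elements.

```python
# Listing the elements explicitly, as for the sets A and D above.
A = {"a", "b", "c", "d"}
D = {"Io", "Europa", "Ganymede", "Callisto"}

# Set-builder notation {n : n even}, restricted to a finite window of Z
# (an assumption made so that the set can actually be constructed).
C = {n for n in range(-4, 5) if n % 2 == 0}

empty = set()  # the empty set

print("a" in A)           # membership: a is an element of A
print("z" not in A)       # non-membership
print(empty <= A)         # the empty set is a subset of every set
print(A <= A, A < A)      # A is a subset of itself, but not a proper subset
print({"a", "b"} < A)     # a proper subset of A
```

Here `<=` plays the part of the subset relation and `<` the part of the proper subset relation.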
Many important theorems in real analysis, such as the Intermediate
Value Theorem or Taylor's Theorem, are concerned with the continuity
or differentiability of functions over contiguous subsets of $\mathbb{R}$, and
so it is useful to have a compact notation for this situation.
Definition A.4 Let $a, b \in \mathbb{R}$ such that $a < b$. Then the closed
interval $[a, b]$ is the set
\[ [a, b] = \{x \in \mathbb{R} : a \leq x \leq b\} \]
and the open interval $(a, b)$ is the set
\[ (a, b) = \{x \in \mathbb{R} : a < x < b\}. \]
We may also define the half-open intervals
\[ [a, b) = \{x \in \mathbb{R} : a \leq x < b\}, \qquad (a, b] = \{x \in \mathbb{R} : a < x \leq b\}. \]

A variant notation used by some books replaces parentheses with
reversed square brackets, for example writing the open interval $(a, b)$
as $]a, b[$. We can also extend this notation to infinite endpoints:
\begin{align*}
(a, \infty) &= \{x \in \mathbb{R} : x > a\}, & [a, \infty) &= \{x \in \mathbb{R} : x \geq a\}, \\
(-\infty, b) &= \{x \in \mathbb{R} : x < b\}, & (-\infty, b] &= \{x \in \mathbb{R} : x \leq b\}.
\end{align*}

We will often want to combine or compare two or more sets, and this
leads us to the following two operations.

Definition A.5 The union $A \cup B$ of two sets $A$ and $B$ is defined as
\[ A \cup B = \{x : x \in A \text{ and/or } x \in B\}. \]
It is the set formed by combining the elements of $A$ and $B$ into one
larger collection and discarding any duplicates.
The intersection of $A$ and $B$ is
\[ A \cap B = \{x : x \in A \text{ and } x \in B\}, \]
and comprises those elements that are in both $A$ and $B$.
If $A \cap B = \emptyset$ then we say that $A$ and $B$ are disjoint.

We can extend this to arbitrary finite collections of sets. We need to
check that both the union and intersection operations can be performed
in any order without changing the end result; that is,
\[ (A \cup B) \cup C = A \cup (B \cup C) \quad\text{and}\quad (A \cap B) \cap C = A \cap (B \cap C) \]
for any sets $A$, $B$ and $C$. We say that $\cup$ and $\cap$ are associative operations.
Similarly, with a bit of thought we can see that it doesn't matter whether
we consider $A \cup B$ or $B \cup A$, because we'll get the same result either
way. The same is true for intersections: $A \cap B = B \cap A$. (Operations
satisfying this order-invariance condition are said to be commutative.)
Since $\cup$ and $\cap$ are associative, we can ignore parentheses in unions
and intersections of finite collections of sets. This means that, given a
collection $A_1, \ldots, A_n$ of sets we can define
\[ \bigcup_{i=1}^{n} A_i = A_1 \cup \cdots \cup A_n \quad\text{and}\quad \bigcap_{i=1}^{n} A_i = A_1 \cap \cdots \cap A_n \]
secure in the knowledge that we'll get the same answer whatever order
we do each pairwise union or intersection operation.
Another important operation is the difference of two sets, which is
defined as
\[ A \setminus B = \{x : x \in A \text{ and } x \notin B\}. \]
If $A$ and $B$ are disjoint then $A \setminus B = A$. In general, it is not the case
that $A \setminus B = B \setminus A$, so the difference operation is noncommutative.
The complement of a set $A$ is
\[ A' = \mathcal{E} \setminus A = \{x : x \notin A\}, \]
and consists of all elements that aren't in $A$.

Figure A.1: Venn diagrams depicting standard set operations

But what do we really mean when we say that two sets are equal? We need
some well-defined way of saying when this happens, and the answer is the
obvious one: two sets are equal if they contain exactly the same elements.
Table A.1 contains a standard list of general laws or identities satisfied
by the set operations we've met so far.
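Commutative Laws    $A \cup B = B \cup A$                              $A \cap B = B \cap A$
Associative Laws    $A \cup (B \cup C) = (A \cup B) \cup C$            $A \cap (B \cap C) = (A \cap B) \cap C$
Distributive Laws   $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$   $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
Identity Laws       $A \cup \emptyset = A$                             $A \cap \mathcal{E} = A$
Complement Laws     $A \cup A' = \mathcal{E}$                          $A \cap A' = \emptyset$                          $(A')' = A$
Absorption Laws     $A \cup (A \cap B) = A$                            $A \cap (A \cup B) = A$
Minimisation Laws   $(A \cup B) \cap (A \cup B') = A$                  $(A \cap B) \cup (A \cap B') = A$
De Morgan's Laws    $(A \cup B)' = A' \cap B'$                         $(A \cap B)' = A' \cup B'$

Table A.1: Standard identities in set algebra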
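
The identities in Table A.1 can be checked mechanically for a small universal set by running through every possible choice of $A$, $B$ and $C$. The sketch below does this in Python; the four-element universal set `E` and the helper names `subsets`, `comp` and `laws_hold` are assumptions made for illustration. Such a check covers only the one universal set chosen, so it is evidence for the identities rather than a proof of them.

```python
from itertools import combinations

E = frozenset(range(4))  # a small universal set, chosen only for illustration


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comp(A):
    """Complement of A relative to the universal set E."""
    return E - A


def laws_hold():
    """Check several of the laws in Table A.1 for all subsets of E."""
    S = subsets(E)
    for A in S:
        # Complement laws
        if comp(comp(A)) != A or A | comp(A) != E or A & comp(A):
            return False
        for B in S:
            # De Morgan's laws
            if comp(A | B) != comp(A) & comp(B):
                return False
            if comp(A & B) != comp(A) | comp(B):
                return False
            # Absorption laws
            if A | (A & B) != A or A & (A | B) != A:
                return False
            for C in S:
                # Distributive laws
                if A | (B & C) != (A | B) & (A | C):
                    return False
                if A & (B | C) != (A & B) | (A & C):
                    return False
    return True


print(laws_hold())
```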
We can depict these geometrically by means of Venn diagrams. These
consist of a boundary rectangle, representing $\mathcal{E}$, and inner circles (or
other closed curves) representing other sets. We represent intersections
by shading in the overlapping portions of two or more circles, and
unions by shading in all of the sets in question.
We can check the identities in Table A.1 by drawing their Venn
diagrams; for example, both sides of the distributive law
\[ A \cup (B \cap C) = (A \cup B) \cap (A \cup C) \]
yield the diagram in Figure A.2.

Figure A.2: Venn diagram depicting the distributive law
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

As useful as they are for illustrative purposes, Venn diagrams don't
constitute a formal proof, so we need a better method of proving set
equality. As remarked earlier, two sets are equal if they have exactly
the same elements. Equivalently, for $A$ and $B$ to be equal, every
element in $A$ must be in $B$, and every element in $B$ must also be in $A$.
This is the same as requiring $A$ to be a subset of $B$ and $B$ to be a subset
of $A$ at the same time. So, $A = B$ if and only if $A \subseteq B$ and $B \subseteq A$.
To prove an equality in this way, we consider some arbitrary element of $A$
and show that it must also be in $B$. Then we consider some arbitrary
element of $B$ and show that it must be in $A$. The following proof
illustrates this.
Figure A.3: Detail from a memorial window to John Venn (1834–1923) in
the dining hall of Gonville and Caius College, Cambridge
(image: Wikimedia Commons / User:Schutz)

Proposition A.6 Let $A$, $B$ and $C$ be sets. Then
\[ A \cup (B \cap C) = (A \cup B) \cap (A \cup C) \]

Proof Suppose that $x \in A \cup (B \cap C)$. Then either $x \in A$ or $x \in B \cap C$
(or both). We'll consider both of these separately.
If $x \in A$ then it's certainly the case that $x \in A \cup B$ and $x \in A \cup C$,
because each of these sets is just $A$ with the elements of, respectively,
$B$ or $C$ included. And since $x$ lies in both $A \cup B$ and $A \cup C$, then $x$
must also be in their intersection. Hence $x \in (A \cup B) \cap (A \cup C)$.
Alternatively, if $x \in B \cap C$, then that means that $x \in B$ and $x \in C$.
Hence $x$ is a member of both $A \cup B$ and $A \cup C$, and must therefore be
in their intersection. Thus $x \in (A \cup B) \cap (A \cup C)$, and so
$A \cup (B \cap C) \subseteq (A \cup B) \cap (A \cup C)$.
Now suppose $x \in (A \cup B) \cap (A \cup C)$. Then $x$ is a member of both
$A \cup B$ and $A \cup C$. Again, we have two cases: either $x \in A$ or $x \notin A$.
If $x \in A$ then it must also be in $A \cup (B \cap C)$, since this is just $A$ with
the elements of $B \cap C$ included as well.
If, however, $x \notin A$ then it must lie in the other parts of $A \cup B$ and
$A \cup C$, which means that $x \in B$ and $x \in C$. Hence $x \in B \cap C$, and
therefore $x \in A \cup (B \cap C)$.
Thus $(A \cup B) \cap (A \cup C) \subseteq A \cup (B \cap C)$, and therefore
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ as claimed.

John Venn (1834–1923) was born in Yorkshire, a descendant of a line of
influential Anglican clergymen who campaigned strongly on social issues, such
as prison reform, the abolition of slavery, and banning cruel sports.
John himself studied mathematics at Gonville and Caius College, Cambridge,
graduating as Bachelor of Arts in 1857 with the sixth highest mark
in his year, winning the title of Sixth Wrangler and a college fellowship.
Following in the family tradition he entered the Church of England, being
ordained first as a deacon in 1858 and as a priest a year later. He served as
Curate of Mortlake in Surrey until 1862, when he returned to Cambridge as a
College Lecturer in Moral Sciences.
In addition to his work on logic and set theory, he wrote several books on the
history of the University of Cambridge, in particular the ten-volume Alumni
Cantabrigienses, compiled in collaboration with his son, the economist John
Archibald Venn (1883–1958). This comprises detailed biographical notes on
all known graduates of the University from its foundation in the early 13th
century until the mid 20th century.
He was elected a Fellow of the Royal Society in 1883, granted the degree of
Doctor of Science in 1884, elected a Fellow of the Society of Antiquaries in
1892, and was President of Gonville and Caius College from 1912 until 1923.
He was also a keen amateur engineer, his most successful creation being a
cricket bowling machine that bowled out some members of the Australian
national team during their visit to Cambridge in 1909.

Another important set operation that we use regularly throughout the
rest of the book is the Cartesian product:

Definition A.7 Given two sets $A$ and $B$, their Cartesian product is
the set
\[ A \times B = \{(a, b) : a \in A,\ b \in B\} \]
of ordered pairs of elements from $A$ and $B$.
More generally, for a finite collection of sets $A_1, \ldots, A_n$, their $n$-fold
Cartesian product is the set
\[ \prod_{i=1}^{n} A_i = A_1 \times \cdots \times A_n = \{(a_1, \ldots, a_n) : a_1 \in A_1, \ldots, a_n \in A_n\} \]
of ordered $n$-tuples of elements from $A_1, \ldots, A_n$.

So, the Cartesian product $A \times B$ consists of all possible ordered pairs
$(a, b)$ where the first element is in $A$ and the second element is in $B$.
It's tempting, certainly when considering products of numerical sets
such as $\mathbb{N} \times \mathbb{N}$, to expect the Cartesian product to consist of all
possible products of elements. But this only makes sense when the
elements have a well-defined, canonical product operation. When
this isn't true (as is the case in many of the examples throughout this
book), this approach fails.
The ordering is important: it's not the case in general that $A \times B =
B \times A$, so the Cartesian product isn't commutative. In the next section
we'll see that there is a strong relationship between $A \times B$ and $B \times A$,
and we'll study in detail exactly what that relationship is.
The Cartesian product also isn't, strictly speaking, associative, since
$(A \times B) \times C$ consists of ordered pairs of the form $((a, b), c)$ and
$A \times (B \times C)$ consists of pairs of the form $(a, (b, c))$, whereas $A \times B \times C$
consists of ordered triples of the form $(a, b, c)$. This distinction might
seem pedantic, and it's one that we will often gloss over, but it's
important to be aware of it.
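
The distinctions just drawn between $A \times B$ and $B \times A$, and between the three ways of forming a triple product, can be seen concretely in a short sketch. The Python below uses the standard library function `itertools.product`; the three small sets are illustrative assumptions, not examples from the text.

```python
from itertools import product

# Three small sets chosen only for illustration.
A = {1, 2}
B = {"x", "y", "z"}
C = {"p", "q"}

AxB = set(product(A, B))   # ordered pairs (a, b)
BxA = set(product(B, A))   # ordered pairs (b, a)

left = set(product(product(A, B), C))    # elements of the form ((a, b), c)
right = set(product(A, product(B, C)))   # elements of the form (a, (b, c))
triple = set(product(A, B, C))           # elements of the form (a, b, c)

print(len(AxB) == len(A) * len(B))       # |A x B| = |A| |B| for finite sets
print(AxB == BxA)                        # not commutative
print(left == right, left == triple)     # not strictly associative
print(len(left) == len(right) == len(triple))  # but all the same size
```

The three triple products have the same number of elements, which is why glossing over the distinction is usually harmless.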
From time to time we will find it necessary or useful to think about
the number of elements a set has, and to that end we introduce the
following terminology.
Definition A.8 A set A is said to be finite if it has a finite number
of elements, and infinite otherwise.
The cardinality or order of a finite set $A$ is the number of elements
contained in $A$, and is denoted $|A|$.
In Section A.3 we will extend this definition, discussing the cardinality
of infinite sets and investigating different types of infinity.

A.2 Functions

One of the most important concepts in the entirety of mathematics
is that of a function or map. We first meet these as operations
performed on real numbers, in particular polynomial expressions such
as $x^2 + 3x + 1$, trigonometric functions like sin, cos and tan, or the
exponential function $e^x$ and the natural logarithm ln.
This is all very well, as far as it goes, but we want to extend these
ideas to work with arbitrary sets rather than just those consisting of
real numbers.
To do this, we look carefully at what a function does: it's a device that
takes an element of one set (which in the examples just listed has been
some subset of $\mathbb{R}$) and gives in return an element of some other set
(which might be the same or something else entirely) in a well-defined,
deterministic and consistent way.
Definition A.9 Let $A$ and $B$ be sets. A function or map $f : A \to B$
is a rule that assigns an element of $B$ to every element of $A$.
More precisely, for every $a \in A$ there is an element $f(a) \in B$.
The set $A$ is called the domain of $f$ and $B$ is the codomain of $f$.
Look carefully at what this definition says, and also at what it doesn't
say. In particular, we don't require that every element of the codomain
$B$ is mapped to by some element of the domain $A$, nor do we insist
that distinct elements of the domain $A$ are mapped to distinct elements
of the codomain $B$. We will certainly consider these special cases in a
little while, but the realisation that not every element of the codomain
need be mapped to inspires the following definition.
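Definition A.10 Given a function $f : A \to B$, the image or range of
$f$ is the subset
\[ \operatorname{im}(f) = \{f(a) : a \in A\} \]
of the codomain $B$.

Example A.11 Let $f : \mathbb{R} \to \mathbb{R}$ such that $x \mapsto x^2$ for all $x \in \mathbb{R}$. Then
the image
\[ \operatorname{im}(f) = \mathbb{R}_{\geq 0} = [0, \infty). \]

Example A.12 Consider the trigonometric functions $\sin : \mathbb{R} \to \mathbb{R}$
and $\cos : \mathbb{R} \to \mathbb{R}$. Then
\[ \operatorname{im}(\sin) = \operatorname{im}(\cos) = [-1, 1]. \]

Example A.13 Suppose that $f : \{a, b, c, d, e\} \to \{0, 1, 2\}$ such that
$a, e \mapsto 0$ and $b, c, d \mapsto 1$. Then $\operatorname{im}(f) = \{0, 1\}$.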
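
For a function on a finite domain, the image in Definition A.10 can be computed directly from its defining formula. The following is a small Python sketch under that finiteness assumption; the helper name `image` and the lookup table `f_table` encoding Example A.13 are illustrative choices.

```python
def image(f, domain):
    """Compute im(f) = {f(a) : a in domain} for a finite domain."""
    return {f(a) for a in domain}


# The function of Example A.13, stored as a lookup table.
f_table = {"a": 0, "e": 0, "b": 1, "c": 1, "d": 1}
codomain = {0, 1, 2}

im_f = image(lambda a: f_table[a], f_table.keys())
print(im_f)             # the image is a proper subset of the codomain
print(im_f < codomain)

# Squaring restricted to a finite window of integers.
print(image(lambda x: x * x, range(-3, 4)))
```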
These examples have made use of some common notation, which it's
worth highlighting before we go any further. We use arrows to denote
the direction of the mapping, so for a function $f : A \to B$ we are taking
elements from $A$ and mapping them to elements in $B$. But as the above
examples show, there are two types of arrow in common usage. The
arrow $\to$ acts on the level of sets, while the arrow $\mapsto$ acts on the level
of individual elements. So the definition $f : \mathbb{R} \to \mathbb{R};\ x \mapsto 3x + 2$ says
that the function $f$ maps elements of $\mathbb{R}$ to elements of $\mathbb{R}$, by mapping
a given element $x$ to the element $3x + 2$.
Given functions $f : A \to B$ and $g : B \to C$ we can chain them together
by taking the output of $f$ and feeding it into the input of $g$. We can
represent this situation as
\[ A \xrightarrow{\;f\;} B \xrightarrow{\;g\;} C \]
and notice that what we end up with is a well-defined way of turning
an element of $A$ into an element of $C$. Granted, we have to go through
$B$ to get there, but that's just an interim stage and ultimately what we
have is a function $h : A \to C$. This process is called composition:
Definition A.14 Given functions $f : A \to B$ and $g : B \to C$, we can
define the composite function $g \circ f : A \to C$ such that
\[ (g \circ f)(a) = g(f(a)) \]
for all $a \in A$.

The ordering of the functions is important: $g \circ f$ means we do $f$ and
then do $g$ to the result, rather than the other way round. Applying
the functions in the reverse order need not give the same composite
function; indeed the composite $f \circ g$ need not even be defined.
More generally, for two functions $f : A \to B$ and $g : C \to D$, we can
define $g \circ f$ if and only if $\operatorname{im}(f) \subseteq C$. That is, if the image of $f$ is a
subset of the domain of $g$.
The reason we write $g \circ f$ to denote $f$ followed by $g$ is because it
makes the link with $g(f(a))$ clearer. In a little while we will see
that this agrees with multiplication of matrices representing linear
transformations.
In some respects it's less intuitive: we write the functions in the
reverse order to how they're applied. Some books sidestep this issue
by applying functions on the right, writing $xf$ instead of $f(x)$. This
is a more intuitive ordering for function composition, since $x(f \circ g)$
is the same as $xfg$, but the cost is that this notation is less intuitive
in general. As a result, this convention tends to be rare now, but
sometimes appears in older books.
Sometimes we can use an explicit definition for $f$ and $g$ to obtain an
explicit definition for the composite $g \circ f$.
Example A.15 Let $f : \mathbb{R} \to \mathbb{R}$ such that $f(x) = x^2 + 5x + 1$ and
$g : \mathbb{R} \to \mathbb{R}$ such that $g(x) = 2x + 3$.
Then we can define both composites $g \circ f$ and $f \circ g$. To obtain explicit
expressions for these, we substitute the image of $x$ under the action
of one function into the definition of the other. So
\begin{align*}
(g \circ f)(x) &= g(f(x)) \\
&= g(x^2 + 5x + 1) \\
&= 2(x^2 + 5x + 1) + 3 \\
&= 2x^2 + 10x + 5, \\
(f \circ g)(x) &= f(g(x)) \\
&= f(2x + 3) \\
&= (2x + 3)^2 + 5(2x + 3) + 1 \\
&= 4x^2 + 22x + 25.
\end{align*}
As we can see, $f \circ g \neq g \circ f$.
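
The two composites in Example A.15 can be checked numerically. The sketch below defines a generic `compose` helper (a hypothetical name, not notation from the text) and compares each composite with the expanded polynomial obtained above at a range of integer points; agreement at sample points supports the algebra but is not itself a proof.

```python
def compose(g, f):
    """Return the composite g o f: apply f first, then g."""
    return lambda x: g(f(x))


def f(x):
    return x**2 + 5*x + 1


def g(x):
    return 2*x + 3


g_after_f = compose(g, f)   # g o f
f_after_g = compose(f, g)   # f o g

for x in range(-3, 4):
    # Each composite should agree with its expanded form.
    print(x,
          g_after_f(x) == 2*x**2 + 10*x + 5,
          f_after_g(x) == 4*x**2 + 22*x + 25)
```

Note that the argument order of `compose(g, f)` mirrors the written order $g \circ f$, so the function applied first appears second.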
The last line of this example quietly introduces the idea of two
functions being equal to each other or not: two functions $f : A \to B$ and
$g : C \to D$ are equal if and only if $A = C$, $B = D$ and $f(a) = g(a)$ for
all $a \in A$.
Just as we declared two sets to be equal if they contained exactly the
same elements, we regard two functions as equal if they have the same
domains and codomains, and act exactly the same on all elements of
their domain.
One special function, albeit one that seems trivial when we first meet
it, is the identity function.

Definition A.16 For a given set $A$, the identity map or identity
function $\operatorname{id} : A \to A$ is defined by $\operatorname{id}(a) = a$ for all $a \in A$. To avoid
ambiguity we might sometimes denote this function as $\operatorname{id}_A$.

This might seem a bit pointless, but we should never automatically
dismiss something that appears to do nothing.³ Throughout the rest of
this book we see many examples of identity functions and operations
that perform important rôles. One particularly important property is
shown by the following proposition.

3 In computer programming, the concept of an operation that doesn't do
anything (usually called a NOP or no operation) is often useful for low-level
purposes such as memory alignment or synchronising processor clock cycles.

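Proposition A.17 For any function $f : A \to B$,
\[ \operatorname{id}_B \circ f = f = f \circ \operatorname{id}_A. \]

Proof This is fairly straightforward, and follows almost immediately
from the definitions. For any $a \in A$ we have
\[ (\operatorname{id}_B \circ f)(a) = \operatorname{id}_B(f(a)) = f(a) \quad\text{and}\quad (f \circ \operatorname{id}_A)(a) = f(\operatorname{id}_A(a)) = f(a). \]
Hence $\operatorname{id}_B \circ f = f$ and $f \circ \operatorname{id}_A = f$.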
Some functions $f : A \to B$ can be undone, in the sense that for any
element $a \in A$ we can apply $f$ to get $f(a)$ and then reverse the process to
recover $a$. More precisely, what we want is a function $g : B \to A$ such
that $g(f(a)) = a$ for all $a \in A$.

Definition A.18 A function $f : A \to B$ is invertible if there exists a
function $g : B \to A$ such that
\[ g \circ f = \operatorname{id}_A \quad\text{and}\quad f \circ g = \operatorname{id}_B. \]
We call $g$ the inverse of $f$ and usually denote it $f^{-1}$, so that
\[ f^{-1} \circ f = \operatorname{id}_A \quad\text{and}\quad f \circ f^{-1} = \operatorname{id}_B. \]

Example A.19 Let $\exp : \mathbb{R} \to \mathbb{R}^+$ be the exponential function
mapping $x \mapsto e^x$, and let $\ln : \mathbb{R}^+ \to \mathbb{R}$ be the natural logarithm function.
Each of these functions is the inverse of the other.

Example A.20 Let $g : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$ map $x \mapsto x^2$, and let
$h : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$ map $x \mapsto \sqrt{x}$. Here we follow the usual convention that
$\sqrt{x}$ denotes the non-negative square root of $x$. Then $g$ and $h$ are the
inverse of each other.

Exercise A.1 Show that inverse functions, if they exist, are unique. That is,
for some function $f : A \to B$ with two inverses $g : B \to A$ and $h : B \to A$, it
must be the case that $g = h$. Hence it is appropriate to refer to the inverse
$f^{-1}$.
Show also that if $g : B \to A$ is the inverse of $f : A \to B$, then $f$ is also
the inverse of $g$. Hence $(f^{-1})^{-1} = f$.

It's not always the case that a function is invertible. The functions in
the above example are invertible because we chose the domain and
codomain carefully to ensure that everything worked. The closely-related
function $f : \mathbb{R} \to \mathbb{R}$ that maps $x \mapsto x^2$ fails to be invertible in
two different ways.


One problem becomes apparent when we consider what happens to $2$
and $-2$. Both of these square to $4$, at which point we run into difficulty
with the definition of an inverse function $f^{-1}$: we have to find some
way of mapping $4$ to both $2$ and $-2$ in order to satisfy the condition
$f^{-1} \circ f = \operatorname{id}_{\mathbb{R}}$. That is, we need to define $f^{-1}$ so that the composition
$f^{-1} \circ f$ maps $2 \mapsto 2$ and $-2 \mapsto -2$. The only way we can do this is to
somehow define $f^{-1}$ so that it maps $4$ to both $2$ and $-2$. But this isn't
allowed: a function must map each individual element of its domain
to a single element of its codomain, not to pairs or larger subsets of
elements.
The other problem occurs when we look at the second condition
$f \circ f^{-1} = \operatorname{id}_{\mathbb{R}}$. In particular, look at what happens to $-1$. No matter
how we define $f^{-1}$ and whatever we decide $-1$ should map to, when
we then apply $f$ to the result we get a non-negative number. So $f$ fails
the second invertibility condition as well.
The reason the first condition fails is that $f$ maps both $2$ and $-2$ to the
same element in the codomain, which we said earlier is fine in general,
but here is exactly the thing that stops $f$ being invertible.
The second condition fails because $-1$ lies outside the image of $f$, so
whatever we map it to via a candidate inverse function $f^{-1}$, it's not
going to get mapped back to $-1$ when we apply $f$, because $f$ doesn't
map anything to $-1$. Again, the existence of un-mapped-to elements
in the codomain isn't a problem in general, but is precisely the thing
that causes the second invertibility condition to fail.
This leads to the following two definitions.
Definition A.21 A function $f : A \to B$ is injective or one to one if
every element of the codomain is the image of at most one element
of the domain.
More precisely, $f$ is injective if and only if $f(a_1) = f(a_2)$ implies
$a_1 = a_2$.

Be careful not to get this condition the wrong way round. If $a_1 = a_2$
then $f(a_1)$ has to equal $f(a_2)$, otherwise the function isn't well-defined.

Definition A.22 A function $f : A \to B$ is surjective or onto if every
element of the codomain is the image of at least one element of the
domain.
More precisely, $f$ is surjective if and only if for any $b \in B$ there exists
some (not necessarily unique) element $a \in A$ such that $f(a) = b$.
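
For functions between finite sets, both conditions can be tested directly. The following Python sketch represents a function as a dictionary from domain elements to their images; the helper names and the two restricted squaring maps are illustrative assumptions. The check for surjectivity presumes that every value of the dictionary really does lie in the stated codomain.

```python
def image(f):
    """im(f) for a finite function stored as a dict {a: f(a)}."""
    return set(f.values())


def is_injective(f):
    # Injective: no two domain elements share an image, so the image
    # has exactly as many elements as the domain.
    return len(image(f)) == len(f)


def is_surjective(f, codomain):
    # Surjective: every element of the codomain is hit at least once.
    return image(f) == set(codomain)


def is_bijective(f, codomain):
    # Both injective and surjective.
    return is_injective(f) and is_surjective(f, codomain)


# The function of Example A.13: neither injective nor surjective.
f = {"a": 0, "e": 0, "b": 1, "c": 1, "d": 1}
print(is_injective(f), is_surjective(f, {0, 1, 2}))

# Squaring on {-2, ..., 2} onto {0, 1, 4}: surjective but not injective,
# because 2 and -2 share an image.
square_signed = {x: x * x for x in (-2, -1, 0, 1, 2)}
print(is_injective(square_signed), is_surjective(square_signed, {0, 1, 4}))

# Restricting the domain to {0, 1, 2} repairs injectivity.
square_nonneg = {x: x * x for x in (0, 1, 2)}
print(is_bijective(square_nonneg, {0, 1, 4}))
```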

In addition to their important invertibility properties, bijective
functions (those that are both injective and surjective) are just what we
need to extend our notion of cardinality to
infinite sets. The key observation is that a bijection defines an exact
correspondence between its domain and codomain. We can use this as
a way of counting the elements of a finite set: we construct a bijection
between our chosen set and a subset of natural numbers of the form
$\{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$.
For example, taking the set $A = \{a, b, c, d\}$ we can define a bijection
$f : A \to B$ where $B = \{1, 2, 3, 4\}$. In fact, we can define $4! = 24$
different such bijections, of which
\[ f(a) = 1, \quad f(b) = 3, \quad f(c) = 4, \quad f(d) = 2 \]
is one. If $B$ has any fewer elements then no function $f : A \to B$ can be
injective, and if it has any more elements then no function $f : A \to B$
can be surjective. So the only case in which a bijection can exist is
when $A$ and $B$ have the same number of elements. And since we know
that $B$ has 4 elements, $A$ must too.
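
The counting claims in this example are small enough to confirm by exhaustive enumeration. The Python sketch below lists every bijection from $A$ to $B$ and then searches all functions from $A$ into a three-element and a five-element codomain; the helper `functions` and the variable names are assumptions introduced for illustration.

```python
from itertools import permutations, product

A = ["a", "b", "c", "d"]
B = [1, 2, 3, 4]

# Every bijection A -> B corresponds to an ordering of B.
bijections = [dict(zip(A, p)) for p in permutations(B)]
print(len(bijections))


def functions(domain, codomain):
    """Yield every function domain -> codomain as a dict."""
    for values in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))


# With a smaller codomain, no function can be injective.
injective_into_smaller = [
    f for f in functions(A, [1, 2, 3])
    if len(set(f.values())) == len(A)
]

# With a larger codomain, no function can be surjective.
surjective_onto_larger = [
    f for f in functions(A, [1, 2, 3, 4, 5])
    if set(f.values()) == {1, 2, 3, 4, 5}
]

print(len(injective_into_smaller), len(surjective_onto_larger))
```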

A.3 Counting and infinity

O God, I could be bounded in a nutshell, and count myself a king of
infinite space were it not that I have bad dreams.
William Shakespeare (1564–1616), Hamlet II:2 (c.1600)

We can extend this approach to infinite sets as well. Suppose
that we take some infinite set $A$ and manage to construct a bijection
$f : A \to \mathbb{N}$. Then by the discussion above these two sets must have
the same number of elements, although it's usually unwise to treat
infinity as a conventional number because it behaves in particularly
counterintuitive ways. In fact, as we'll see in a little while, there isn't
even just one infinity, but an entire hierarchy of them.
Our first counterintuitive consequence of this discussion is that there
are as many natural numbers as there are even natural numbers:

Proposition A.23 There is a bijection $f : 2\mathbb{N} \to \mathbb{N}$ defined by
$f(n) = \tfrac{1}{2}n$.

Proof Every even natural number is of the form $n = 2k$ for some
$k \in \mathbb{N}$, so the function $f$ is well-defined in the sense that every
element of $2\mathbb{N}$ maps to an element of $\mathbb{N}$.
The function $f$ is injective, since if $f(n) = f(m)$ we have $\tfrac{1}{2}n = \tfrac{1}{2}m$ and
hence $n = m$. It is also surjective, since for any $m \in \mathbb{N}$ we can find
$2m \in 2\mathbb{N}$ such that $f(2m) = \tfrac{1}{2}(2m) = m$. Therefore $f$ is a bijection.
Equally surprisingly, there are as many integers as natural numbers:

Proposition A.24 There is a bijection $f : \mathbb{Z} \to \mathbb{N}$ defined by
\[ f(n) = \begin{cases} 2n + 1 & \text{if } n \geq 0, \\ -2n & \text{if } n < 0. \end{cases} \]

Proof The function $f$ is well-defined: it maps every integer to a
natural number. It is injective, since it maps distinct negative integers
to distinct even natural numbers and non-negative integers to distinct
odd natural numbers. And it is surjective since for every odd natural
number $m$ we can find an integer $n = \tfrac{1}{2}(m - 1)$ with $f(n) = m$, and for
every even natural number $m$ we can find a negative integer $n = -\tfrac{1}{2}m$
such that $f(n) = m$. Therefore $f$ is a bijection.
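
The bijection of Proposition A.24 and the inverse used in its proof are easy to write down explicitly. The sketch below, in Python, checks them against each other on a finite window of integers; the name `f_inv` for the inverse is an assumption, and a finite check of this kind illustrates the proposition rather than proving it.

```python
def f(n):
    """The map Z -> N of Proposition A.24."""
    return 2 * n + 1 if n >= 0 else -2 * n


def f_inv(m):
    """Inverse N -> Z: odd m come from n >= 0, even m from n < 0."""
    return (m - 1) // 2 if m % 2 == 1 else -(m // 2)


window = range(-50, 50)
values = sorted(f(n) for n in window)

# The 100 integers -50, ..., 49 are sent exactly onto 1, ..., 100.
print(values == list(range(1, 101)))

# f_inv undoes f on the whole window.
print(all(f_inv(f(n)) == n for n in window))
```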

It's useful to have a specific word to describe sets for which there
exists a bijective function to the natural numbers $\mathbb{N}$. Since the natural
numbers are the counting numbers that we meet quite early in
primary school, a bijection to $\mathbb{N}$ is really just a way of counting the
elements of our chosen set.

Definition A.25 Let $A$ be a set, and suppose either that $A$ is finite, or
that there exists a bijection $f : A \to \mathbb{N}$. Then we say $A$ is countable.
If $A$ is finite, then the cardinality or order $|A|$ of $A$ is the number
of elements in $A$. If $A$ is infinite and countable then we say it has
cardinality $|A| = \aleph_0$ (pronounced "aleph null" or "aleph zero").

We will often use the lemniscate or analemma symbol $\infty$ to denote
infinity in general, but $\aleph_0$ specifically denotes the countable infinity,
the cardinality of the natural numbers $\mathbb{N}$.
Even more surprisingly, it turns out that the rational numbers $\mathbb{Q}$ are
countable. The proof of this celebrated result is due to the German
mathematician Georg Cantor (1845–1918).

Georg Ferdinand Ludwig Philipp Cantor (1845–1918) was born in St
Petersburg but in 1860, the family moved to Frankfurt. Here Georg, already
an accomplished violinist, excelled in mathematics and subsequently
attended the Universities of Zürich, Berlin and Göttingen. After completing
a doctoral dissertation in number theory at Berlin in 1867, he moved to the
University of Halle and was promoted to Professor in 1879.
Much of his most important work was achieved between 1874 and 1884, a
period overshadowed by bitter disputes with the influential German
mathematician Leopold Kronecker (1823–1891), a founder of the constructivist
school in mathematics, who had fundamental objections to Cantor's work.
Kronecker, head of department at Berlin, also blocked Cantor's attempts to
secure an academic post there.
The stress of this situation took its toll, and Cantor suffered the first of several
nervous breakdowns, spending time in a sanatorium with severe depression.
He moved away from mathematics for a time, focusing instead on philosophy
and English literature. He researched the authorship of the plays of William
Shakespeare (1564–1616), and argued in favour of Francis Bacon (1561–1626).
He suffered from depression for most of the rest of his life, a condition that
was exacerbated by the death of his youngest son Rudolph in 1899 and
occasionally fierce criticism of his work on transfinite set theory. He died in
a sanatorium in 1918, roughly two months before his 73rd birthday.

Proposition A.26 The set $\mathbb{Q}$ of rational numbers is countable.

Proof We start by proving that the positive rationals $\mathbb{Q}^+$ are countable.
To do this, we display them in a grid as follows:
\[
\begin{array}{ccccccc}
\frac{1}{1} & \frac{2}{1} & \frac{3}{1} & \frac{4}{1} & \frac{5}{1} & \frac{6}{1} & \cdots \\[4pt]
\frac{1}{2} & \frac{2}{2} & \frac{3}{2} & \frac{4}{2} & \frac{5}{2} & \frac{6}{2} & \cdots \\[4pt]
\frac{1}{3} & \frac{2}{3} & \frac{3}{3} & \frac{4}{3} & \frac{5}{3} & \frac{6}{3} & \cdots \\[4pt]
\frac{1}{4} & \frac{2}{4} & \frac{3}{4} & \frac{4}{4} & \frac{5}{4} & \frac{6}{4} & \cdots \\[4pt]
\frac{1}{5} & \frac{2}{5} & \frac{3}{5} & \frac{4}{5} & \frac{5}{5} & \frac{6}{5} & \cdots \\[4pt]
\frac{1}{6} & \frac{2}{6} & \frac{3}{6} & \frac{4}{6} & \frac{5}{6} & \frac{6}{6} & \cdots \\[4pt]
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
\]
518 a course in abstract algebra

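Following the arrows (which zig-zag along the diagonals of the grid, starting
from the top left-hand corner) yields the list
\[
\tfrac{1}{1}, \tfrac{2}{1}, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{2}, \tfrac{3}{1}, \tfrac{4}{1}, \tfrac{3}{2}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{1}{5}, \tfrac{2}{4}, \tfrac{3}{3}, \tfrac{4}{2}, \tfrac{5}{1}, \tfrac{6}{1}, \tfrac{5}{2}, \tfrac{4}{3}, \tfrac{3}{4}, \tfrac{2}{5}, \tfrac{1}{6}, \ldots
\]
However, each rational number will occur infinitely many times (for
example, $\tfrac{1}{2} = \tfrac{2}{4} = \tfrac{3}{6} = \cdots$) so we must delete from this list the second
and subsequent occurrences of every given rational number. This
gives the list
\[
\tfrac{1}{1}, \tfrac{2}{1}, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{3}{1}, \tfrac{4}{1}, \tfrac{3}{2}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{1}{5}, \tfrac{5}{1}, \tfrac{6}{1}, \tfrac{5}{2}, \tfrac{4}{3}, \tfrac{3}{4}, \tfrac{2}{5}, \tfrac{1}{6}, \ldots
\]
which contains every single positive rational number exactly once. We
can then construct a bijection $f : \mathbb{Q}^+ \to \mathbb{N}$ by mapping each rational
number to the positive integer given by its position in the list. So, for
example, $\tfrac{2}{5} \mapsto 16$ and so forth.
This function, while not constructed by means of a neat formula, is
nevertheless well-defined and bijective, and so $|\mathbb{Q}^+| = |\mathbb{N}| = \aleph_0$ as
claimed.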
We can extend this to provide a bijection $g : \mathbb{Q} \to \mathbb{N}$ by using a similar
trick to the one used in Proposition A.24: we define
\[
g(q) = \begin{cases}
2f(q) + 1 & \text{if } q > 0, \\
1 & \text{if } q = 0, \\
2f(-q) & \text{if } q < 0.
\end{cases}
\]
This function $g$ is a bijection, and thus $\mathbb{Q}$ is countable.
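The construction in this proof can be tried out on a computer. The following Python sketch is only an illustration, not part of the proof: the names positive_rationals, f and g are chosen here for convenience, and the code assumes that the zig-zag path runs along the diagonals in the order given by the list above. Deleting repeated values is implemented by skipping fractions that are not in lowest terms, which works because the lowest-terms form of a rational always appears first on the path.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def positive_rationals():
    """Yield each positive rational exactly once, in zig-zag order."""
    for s in count(2):  # s = numerator + denominator labels one diagonal
        # Even diagonals run with the numerator increasing, odd ones decreasing.
        numerators = range(1, s) if s % 2 == 0 else range(s - 1, 0, -1)
        for p in numerators:
            if gcd(p, s - p) == 1:  # keep only the first (lowest-terms) occurrence
                yield Fraction(p, s - p)


def f(q):
    """Position (counting from 1) of the positive rational q in the list."""
    for position, value in enumerate(positive_rationals(), start=1):
        if value == q:
            return position


def g(q):
    """Extend f to all of Q: positives go to odd numbers >= 3, zero to 1,
    negatives to even numbers."""
    if q > 0:
        return 2 * f(q) + 1
    if q == 0:
        return 1
    return 2 * f(-q)


first = list(islice(positive_rationals(), 5))
print(f(Fraction(2, 5)))  # 16, as in the text
print(sorted(g(q) for q in [Fraction(0)] + first + [-x for x in first]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

The last line shows that zero together with the first five positive rationals and their negatives fills the positions 1 to 11 with no gaps or repeats, which is what bijectivity of g predicts.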


The obvious, naïve approach to constructing a bijection $f : \mathbb{Q} \to \mathbb{N}$ is
to start in the top left-hand corner of the grid and proceed horizontally,
row by row, accounting for all the rational numbers. The problem
is that there are (countably) infinitely many rational numbers on the
first row, so we never actually get as far as the second or subsequent
rows. Cantor's method avoids this problem by traversing the grid in a
zig-zag pattern that doesn't miss anything out: eventually we'll reach
any given rational number in a way that doesn't happen if we follow
the obvious row-by-row method.
What about the next set in our standard hierarchy of number systems?
Are the real numbers countable? There are certainly infinitely many
of them, because $\mathbb{R}$ contains $\mathbb{Q}$ as a proper subset, but it turns out
we can't construct a bijection $f : \mathbb{R} \to \mathbb{N}$ no matter how carefully we
arrange them.
The following result is also due to Cantor, and has become known as
Cantor's diagonal argument.
Proposition A.27 The set R of real numbers is not countable.
The following proof uses a technique called proof by contradiction or
reductio ad absurdum, which we'll discuss in more detail later. The
central idea is that we assume the opposite of what we're trying to
prove, and then show that the logical consequences of that are so
obviously incorrect or unthinkable that the original proposition must
have been true all along.
Proof First we show that the interval $(0, 1) = \{ x : 0 < x < 1 \}$ is
uncountable.
Suppose for the moment that $(0, 1)$ is countable after all. That means
that there is a bijection $f : (0, 1) \to \mathbb{N}$, and hence we can (in principle,
if not in practice) write down an infinite list of all the real numbers in
the open interval $(0, 1)$. This yields a sequence $x = (x_1, x_2, \ldots)$ where
$x_i$ is the unique real number satisfying $f(x_i) = i \in \mathbb{N}$.
We can write every number $y \in (0, 1)$ as an infinite decimal expansion
of the form
\[ y = 0.y_1 y_2 y_3 \ldots \]
where each digit $y_i \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$. To avoid ambiguity we
write any decimal expansion of the form
\[ 0.y_1 y_2 \ldots y_k 9999 \ldots \]
(where $y_k \neq 9$) as
\[ 0.y_1 y_2 \ldots (y_k + 1) 0000 \ldots \]
instead, and we pad any finite-length decimal expansion to an infinite
one by appending an infinite sequence of trailing zeros after the last
nonzero digit.
Our sequence $x$ of all real numbers from $(0, 1)$ can be written out as
infinite decimal expansions as follows:
\[
\begin{aligned}
x_1 &= 0\,.\,x_{11}\,x_{12}\,x_{13}\,x_{14} \ldots \\
x_2 &= 0\,.\,x_{21}\,x_{22}\,x_{23}\,x_{24} \ldots \\
x_3 &= 0\,.\,x_{31}\,x_{32}\,x_{33}\,x_{34} \ldots \\
x_4 &= 0\,.\,x_{41}\,x_{42}\,x_{43}\,x_{44} \ldots \\
&\;\;\vdots
\end{aligned}
\]
We now construct a new number $x'$ by taking the diagonal digits
$x_{11}, x_{22}, x_{33}, x_{44}, \ldots$ of this list to get
\[ 0.x_{11}\,x_{22}\,x_{33}\,x_{44} \ldots \]
and then adding 1 to each digit using modulo-10 arithmetic. This
yields
\[ x' = 0.(x_{11}+1)(x_{22}+1)(x_{33}+1)(x_{44}+1) \ldots, \]
a perfectly valid real number that lies in the interval $(0, 1)$ but doesn't
appear anywhere in the existing sequence $x$. The reason for this is that
for any $n \in \mathbb{N}$, the number $x_n$ differs from $x'$ at the $n$th digit.
Therefore the sequence $x$ didn't contain all of $(0, 1)$, and in fact no
such sequence can exist, so $(0, 1)$ is not countable after all.
Since $\mathbb{R}$ contains $(0, 1)$ as a proper subset, $\mathbb{R}$ isn't countable either.
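A computer cannot hold an infinite list of infinite decimal expansions, but the diagonal step itself is easy to illustrate on a finite truncation. The Python sketch below is an illustration under that assumption only: the function name diagonal_escape and the sample digit strings are made up for the example. Each string holds the digits after the decimal point, and the function applies the "add 1 modulo 10" rule to the diagonal digits.

```python
def diagonal_escape(expansions):
    """Given n digit strings, each with at least n digits, return n digits
    whose k-th digit differs from the k-th digit of the k-th string."""
    return "".join(
        str((int(digits[k]) + 1) % 10)  # add 1 to the diagonal digit, modulo 10
        for k, digits in enumerate(expansions)
    )


# Seven arbitrary sample expansions (digits after "0."), chosen for illustration.
rows = ["1415926", "7182818", "4142135", "5772156", "6180339", "3010299", "6931471"]
new_digits = diagonal_escape(rows)

print("0." + new_digits)
# The new number disagrees with row k in position k, so it equals none of them.
print(all(new_digits[k] != rows[k][k] for k in range(len(rows))))  # True
```

However the finite list is extended, the same rule produces a digit string that disagrees with every row on the diagonal, which is the heart of the argument.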
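This proves that $|\mathbb{R}| \neq \aleph_0$, and since $\mathbb{N} \subset \mathbb{R}$ we already know that
$|\mathbb{N}| \leq |\mathbb{R}|$, so it must follow that $|\mathbb{N}| < |\mathbb{R}|$ and hence we need a new
infinite cardinal. We say that $\mathbb{R}$ is uncountable. Its cardinality is
$2^{\aleph_0}$, the cardinality of the continuum; the statement that this is equal
to $\aleph_1$ (pronounced 'aleph one'), the next infinite cardinal after $\aleph_0$, is
the Continuum Hypothesis.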
It is possible to construct hierarchies of successively larger infinite
cardinals, but we won't go into the details. For a very readable
discussion of infinity and related matters see the book Infinity and
the Mind by the mathematician, logician, artist and science fiction
writer Rudy Rucker.⁴ The Argentinian magic realist writer Jorge Luis
Borges (1899–1986) explores ideas of infinity in several of his stories,
in particular The Library of Babel,⁵ The Aleph⁶ and The Book of Sand.⁷

⁴ R Rucker, Infinity and the Mind, Princeton University Press (2005).
⁵ J L Borges, The Library of Babel, in: Labyrinths, Penguin (1970) 78–86.
⁶ J L Borges, The Aleph, in: The Aleph and Other Stories, Penguin (2000) 118–133.
⁷ J L Borges, The Book of Sand, in: The Book of Sand and Shakespeare's Memory,
Penguin (2001) 89–93.

Bijective functions also enable us to properly answer the question
about the commutativity and associativity of Cartesian products. As
noted earlier, in the discussion following Definition A.7, in general
$A \times B \neq B \times A$ for arbitrary sets $A$ and $B$. There is, however, an obvious
bijection $f : A \times B \to B \times A$ given by $f(a, b) = (b, a)$ for all $a \in A$ and
$b \in B$.
Similarly, although $A \times (B \times C) \neq (A \times B) \times C$, we can define a bijection
$g : A \times (B \times C) \to (A \times B) \times C$ by $g(a, (b, c)) = ((a, b), c)$ and therefore it
makes sense in almost all cases to regard these as effectively the same
as each other and the threefold Cartesian product $A \times B \times C$.
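For small finite sets these two bijections can be checked directly. The following Python sketch is only an illustration; the sets A, B, C and the function names swap and reassociate are chosen here for the example and do not appear elsewhere in the book.

```python
from itertools import product

A, B, C = {1, 2}, {"x", "y", "z"}, {"p", "q"}


def swap(pair):
    """The map (a, b) -> (b, a) from A x B to B x A."""
    a, b = pair
    return (b, a)


def reassociate(triple):
    """The map (a, (b, c)) -> ((a, b), c) from A x (B x C) to (A x B) x C."""
    a, (b, c) = triple
    return ((a, b), c)


AxB, BxA = set(product(A, B)), set(product(B, A))
left = set(product(A, product(B, C)))    # elements look like (a, (b, c))
right = set(product(product(A, B), C))   # elements look like ((a, b), c)

print(AxB == BxA)                               # False: the sets differ
print({swap(p) for p in AxB} == BxA)            # True: swap maps onto B x A
print(left == right)                            # False
print({reassociate(t) for t in left} == right)  # True
```

Since each image set has as many elements as the set it came from, both maps are injective as well as surjective on these examples.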

A.4 Relations

Earlier, we noted that a set has, by default, no particular predetermined
structure. Most of the rest of this book is concerned with the
study of some particularly important structures that arise naturally
in various contexts. Now we will examine another class of structures
that can be defined on sets.
We often find it necessary or useful to compare individual elements
of sets in various ways. For example, sometimes we need to know in
what circumstances two arbitrary elements are equal to each other. Or
we might be interested in whether one element is greater than another
(if the set in question actually has a coherent concept of 'greater than'
or 'less than').
More generally, we sometimes find it necessary to study other connections
between elements. In number theory, for example, we are often
interested in whether some integer is a factor of another.
We'd like to formalise this idea and see what important properties arise.
To start with, any two elements of a set are either related in a specific
way or they aren't. So what we want is a way of assigning a value of
either 'true' or 'false' to a given ordered pair of elements, depending
on whether those elements are related in the specified way or not. One
way of looking at this is as a function $f : A \times A \to \{\text{true}, \text{false}\}$.
Alternatively, we can define a relation by listing the ordered pairs of
related elements. This approach views a relation as a subset of the
Cartesian product $A \times A$. Both approaches are valid and equivalent,
and each has its own advantages, so we present both.
Definition A.28 A (binary) relation on a set $A$ is a function
$f : A \times A \to \{\text{true}, \text{false}\}$, and we say that two given elements $a, b \in A$ are
related if $f(a, b) = \text{true}$.
Equivalently, a (binary) relation on a set $A$ is a subset $B \subseteq A \times A$,
and we say that two given elements $a, b \in A$ are related if $(a, b) \in B$.
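The two descriptions of a relation translate directly into code: one as a function returning True or False, the other as a set of ordered pairs. The Python sketch below is an illustration only, using the divisibility relation on the set {1, ..., 6}; the names divides, B and related are chosen here for the example.

```python
from itertools import product

A = range(1, 7)


def divides(m, n):
    """Function view: return True exactly when m is a factor of n."""
    return n % m == 0


# Subset view: the set of all ordered pairs (m, n) in A x A with m | n.
B = {(m, n) for m, n in product(A, repeat=2) if divides(m, n)}


def related(m, n):
    """Recover the function view from the subset B."""
    return (m, n) in B


print(sorted(B))
# The two views agree on every ordered pair of elements of A.
print(all(divides(m, n) == related(m, n) for m, n in product(A, repeat=2)))  # True
```

Passing from the function to the subset and back again loses no information, which is the sense in which the two definitions are equivalent.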

The notation $f(a, b) = \text{true}$ or $(a, b) \in B$ is rather cumbersome, and
obscures the comparative aspect of the relation under discussion, so
in most cases we will represent a given relation with a symbol such
as $\sim$ placed between the two elements. This enables us to write more
intuitive expressions such as $a \sim b$ if $a$ is related to $b$, and $a \not\sim b$ if not.
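Example A.29 For any set $A$ we can define the equality relation $=$ by
\[
f(a, b) = \begin{cases}
\text{true} & \text{if } a = b, \\
\text{false} & \text{if } a \neq b.
\end{cases}
\]

Example A.30 Let $f : \mathbb{N} \times \mathbb{N} \to \{\text{true}, \text{false}\}$ be defined by
\[
f(m, n) = \begin{cases}
\text{true} & \text{if } m \text{ is a factor of } n, \\
\text{false} & \text{otherwise.}
\end{cases}
\]
This determines the relation '$|$' or 'divides': we say that $m \mid n$ if and
only if $f(m, n) = \text{true}$.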

Another important relation is the notion of an ordering on a set:

Example A.31 Suppose $A \subseteq \mathbb{R}$. Then the 'less than' relation $<$ is
given by
\[
f(a, b) = \begin{cases}
\text{true} & \text{if } a < b, \\
\text{false} & \text{if } a \geq b.
\end{cases}
\]
We can define the 'less than or equal' relation $\leq$ in a similar way.

Each of these relations has subtly different properties. Is it always
the case that a given element is related to itself? In the case of $=$ it so
happens that $a = a$ for any element $a \in A$, but it's not the case that
$a < a$ in general (or indeed at all) for subsets $A \subseteq \mathbb{R}$, although clearly
$a \leq a$ in all cases. Furthermore, $n \mid n$ for any $n \in \mathbb{N}$.
Does the order of the elements matter? If $a \sim b$ then is it necessarily
true that $b \sim a$? If $a = b$ then clearly $b = a$ in all cases. But if $a < b$
it's certainly not the case that $b < a$, and if $m \mid n$ it's not true that $n \mid m$
(except when $m = n$).
What about comparisons of three arbitrary elements? If $a = b$ and
$b = c$ then clearly $a = c$. Also, if $a < b$ and $b < c$ then it's necessarily
the case that $a < c$. And if $p \mid q$ and $q \mid r$ then $p \mid r$.
Are there any relations that don't satisfy this third property? Probably
the best known example is the old game of rock–paper–scissors
(sometimes called roshambo or ick–ack–ock).⁸ We can define a relation
'beats' on the set {rock, paper, scissors} and then observe that
although rock beats scissors, and scissors beat paper, rock doesn't beat
paper. This relation also doesn't satisfy the other two conditions.

⁸ Here, rock beats scissors (because the rock blunts the scissors), scissors beat
paper (because the scissors cut the paper) and paper beats rock (because the
paper wraps the rock). Every other outcome is a draw.
This game can be generalised to any odd number of options: there is a popular
five-option variant called rock–paper–scissors–lizard–Spock.

Definition A.32 A relation $\sim$ defined on some set $A$ is said to be
(i) reflexive if $a \sim a$ for all $a \in A$,
(ii) symmetric if $a \sim b$ implies that $b \sim a$ for all $a, b \in A$, and
(iii) transitive if $a \sim b$ and $b \sim c$ together imply that $a \sim c$ for all
$a, b, c \in A$.
So, in this terminology, $=$ is reflexive, symmetric and transitive; $<$
is transitive only; and $\leq$ and $|$ are both reflexive and transitive but
not symmetric. The relation 'beats' in rock–paper–scissors is neither
reflexive, symmetric nor transitive.
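On a finite set, these three properties can be checked by brute force. The Python sketch below is an illustration only: the helper names are chosen here, and the numerical relations are tested on the finite set {1, ..., 12} rather than on all of N or R, so the output is a sanity check and not a proof.

```python
import operator
from itertools import product


def is_reflexive(A, rel):
    return all(rel(a, a) for a in A)


def is_symmetric(A, rel):
    return all(rel(b, a) for a, b in product(A, repeat=2) if rel(a, b))


def is_transitive(A, rel):
    return all(rel(a, c) for a, b, c in product(A, repeat=3)
               if rel(a, b) and rel(b, c))


def properties(A, rel):
    """Return the triple (reflexive, symmetric, transitive)."""
    return (is_reflexive(A, rel), is_symmetric(A, rel), is_transitive(A, rel))


numbers = range(1, 13)
hands = ["rock", "paper", "scissors"]
BEATS = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

print(properties(numbers, operator.eq))                  # (True, True, True)
print(properties(numbers, operator.lt))                  # (False, False, True)
print(properties(numbers, operator.le))                  # (True, False, True)
print(properties(numbers, lambda m, n: n % m == 0))      # (True, False, True)
print(properties(hands, lambda a, b: (a, b) in BEATS))   # (False, False, False)
```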
A relation which satisfies all three of these conditions is in some sense
stronger or more exclusive than one that satisfies only some or none
of them. We can view such a relation as a kind of generalised notion
of equality or equivalence:
Definition A.33 A relation $\sim$ on a set $A$ is said to be an equivalence
relation if it is symmetric, reflexive and transitive.

The motivating example of an equivalence relation is, of course, the
equality relation $=$, but there are others that arise naturally in certain
contexts. In Section 2.2 we formulate a particularly important class of
equivalence relations on groups, and use it to prove Lagrange's Theorem.⁹
Probably the best-known example of an equivalence relation,
apart from $=$ itself, is that of congruence modulo $n$.

⁹ Theorem 2.30, page 54.
Definition A.34 Two integers $a, b \in \mathbb{Z}$ are congruent modulo $n$ for
some positive integer $n \in \mathbb{N}$ if $n \mid (a - b)$. Or, equivalently, if they both
have the same remainder when divided by $n$.
We write $a \equiv_n b$ or $a \equiv b \pmod{n}$.
This is an equivalence relation on the set of integers.
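The two descriptions of congruence, and the three defining properties of an equivalence relation, can be spot-checked numerically. The Python sketch below is an illustration on the finite range −10, ..., 10 with n = 5; the function name congruent and the choice of range are made here for the example. It relies on the fact that Python's % operator returns a remainder in the range 0, ..., n − 1 when n is positive.

```python
from itertools import product


def congruent(a, b, n):
    """True when n divides a - b."""
    return (a - b) % n == 0


n = 5
integers = range(-10, 11)

# 'n divides a - b' agrees with 'same remainder on division by n'.
print(all(congruent(a, b, n) == (a % n == b % n)
          for a, b in product(integers, repeat=2)))  # True

# Reflexive, symmetric and transitive on this sample.
print(all(congruent(a, a, n) for a in integers))  # True
print(all(congruent(b, a, n)
          for a, b in product(integers, repeat=2) if congruent(a, b, n)))  # True
print(all(congruent(a, c, n)
          for a, b, c in product(integers, repeat=3)
          if congruent(a, b, n) and congruent(b, c, n)))  # True

# The sample splits into n classes, one for each possible remainder.
classes = {r: [a for a in integers if a % n == r] for r in range(n)}
print(classes[2])  # [-8, -3, 2, 7]
```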

A.5 Number theory

God created the integers; all else is the work of man.
Leopold Kronecker (1823–1891),
lecture at Göttingen (1886),
quoted by Heinrich Weber (1842–1913),
Jahresbericht der Deutschen Mathematiker-Vereinigung 2 (1892) 19

Much of abstract algebra is concerned with extending and generalising
facts about the integers to wider classes of objects, such as
polynomials, symmetry operations, matrices and so forth. In this
section we review a few elementary number-theoretic results used
elsewhere in the book. First we introduce a couple of basic definitions.
Definition A.35 Let $a, b \in \mathbb{Z}$. Then $a$ is a factor or divisor of, or
divides $b$, if $b = ka$ for some integer $k \in \mathbb{Z}$. We write $a \mid b$ if this is so,
and $a \nmid b$ if not.
The greatest common divisor or highest common factor of two positive
integers $m, n \in \mathbb{N}$ is the largest positive integer $d \in \mathbb{N}$ that
divides both $m$ and $n$. We write this as $\gcd(m, n)$. If $\gcd(m, n) = 1$
then $m$ and $n$ are said to be coprime or relatively prime.
The ancient Greek mathematician Euclid of Alexandria (fl. 300 BC)
gives an algorithm for finding the greatest common divisor of two
positive integers. It appears as Propositions 1 and 2 in Book VII (and
later, in a slightly different form, as Propositions 2 and 3 in Book X) of
his celebrated treatise Στοιχεῖα (Elements), although it almost certainly
dates back even earlier than that.
In order to prove the validity of this algorithm, we need a basic fact
about natural numbers, namely that the process of division with remainders,
which most of us learn at primary school, actually works.¹⁰

¹⁰ This proof makes use of the method of Proof by Induction, which relies on
(indeed, its validity is formally equivalent to) the Well-Ordering Principle:
Axiom A.36 (The Well-Ordering Principle) Let $S \subseteq \mathbb{N}$ such that $S \neq \emptyset$.
That is, let $S$ be a nonempty set of natural numbers. Then $S$ has a least element.
Depending on exactly which version of axiomatic set theory one is working
with, this statement is either a basic axiom or a provable proposition. We
will cheerfully ignore such concerns in the rest of this book.

Theorem A.37 (The Division Theorem) Let $a, b \in \mathbb{N}$. Then there
exist unique integers $q, r \in \mathbb{Z}$ such that $a = bq + r$ and $0 \leq r < b$.

Proof First, we prove the existence of $q$ and $r$. If $a = b$ then obviously
$a = 1b + 0$, so $q = 1$ and $r = 0$ suffice.
If $a \neq b$ then we proceed by induction on $a$. For $a = 1$ (and thereby $b > 1$)
we have
\[ a = 1 = 0b + 1 \]
with $q = 0$ and $r = 1 < b$.
Now suppose that $a = qb + r$ for some $q, r \in \mathbb{Z}$ with $0 \leq r < b$. Then
adding 1 to both sides of this equation we obtain
\[ a + 1 = qb + r + 1. \]
Since $r < b$ it follows that $r + 1 \leq b$. If $r + 1 = b$ then
\[ a + 1 = (q + 1)b + 0, \]
and if $r + 1 < b$ we have
\[ a + 1 = qb + (r + 1). \]
In each case $a + 1$ can be expressed in the desired way, and hence it
follows by induction that any $a \in \mathbb{N}$ can be decomposed in this way
too.
To show uniqueness, suppose that there exist two other integers $s, t \in \mathbb{Z}$
such that $a = sb + t$, with $0 \leq t < b$. Then $sb + t = qb + r$, which
implies that $(q - s)b = t - r$.
This means that $(t - r)$ is divisible by $b$. Since $0 \leq r, t < b$ it follows
that $0 \leq |t - r| < b$, which means that
\[ -(b - 1) \leq t - r \leq b - 1. \]
The only integer in this range divisible by $b$ is 0, so $t - r = 0$ and
hence $t = r$.
Now we have $qb + r = sb + t = sb + r$, and hence $qb = sb$, from which
we find that $q = s$. Hence $q$ and $r$ are unique.

[Image: Wikimedia Commons / Raphael (1483–1520), detail from The
School of Athens (1509–1510)]
Almost nothing is known of the life of the ancient Greek mathematician
Euclid of Alexandria (fl. 300 BC), the few historical references having been
written centuries later. He is believed to have lived during the reign of
Ptolemy I of Egypt (c.367–c.283 BC), and there are six surviving works that
are credibly attributed to him, the best-known of which is the Elements
(Στοιχεῖα).
This work, consisting of thirteen books of definitions, axioms and propositions
on plane geometry, spatial geometry and number theory, is perhaps one of
the most influential mathematical texts of all time: for several centuries it was
part of the standard Western university and school curriculum.
One of the earliest surviving fragments is contained in the Oxyrhynchus
Papyrus, dated to around 100 AD, but probably the earliest surviving complete
texts are an edition prepared in the fourth century AD by Theon
of Alexandria (c.335–c.405 AD) and a Byzantine palimpsest dating from
about 900 AD discovered in 1808 by the French historian François Peyrard
(c.1760–1822).
[Image: Wikimedia Commons / Oxyrhynchus Papyrus, fragment 29]

We can now state Euclid's Algorithm, and prove that it works.

Theorem A.38 (Euclid's Algorithm) Let $a, b \in \mathbb{N}$. Then there exist
unique integers $q_1, \ldots, q_{n+1}$ (the quotients) and $b > r_1 > \cdots > r_{n+1} = 0$
(the remainders) such that
\[
\begin{aligned}
a &= q_1 b + r_1, \\
b &= q_2 r_1 + r_2, \\
r_1 &= q_3 r_2 + r_3, \\
&\;\;\vdots \\
r_{n-2} &= q_n r_{n-1} + r_n, \\
r_{n-1} &= q_{n+1} r_n + r_{n+1},
\end{aligned}
\]
and $\gcd(a, b) = r_n$, the last nonzero remainder.



Proof First we apply the Division Theorem (Theorem A.37) to the
pair $a, b$ to find the unique integers $q_1, r_1$ satisfying $a = q_1 b + r_1$ and
$0 \leq r_1 < b$. If $r_1 = 0$ then we stop. Otherwise, we apply the Division
Theorem again to the pair $b, r_1$ to find the unique integers $q_2, r_2$ such
that $b = q_2 r_1 + r_2$. If $r_2 = 0$ then we stop, otherwise we apply the
Division Theorem yet again to the pair $r_1, r_2$.
Continuing this process, we obtain a strictly decreasing sequence
of remainders $r_1 > r_2 > \cdots$ which are all non-negative and less
than $b$. Hence, after finitely many steps, we reach $r_{n+1} = 0$.
The last step is therefore
\[ r_{n-1} = q_{n+1} r_n + 0 \]
and so $r_{n-1} = q_{n+1} r_n$, which means that $r_n \mid r_{n-1}$.
The penultimate step, similarly, is
\[ r_{n-2} = q_n r_{n-1} + r_n = q_n q_{n+1} r_n + r_n = (q_n q_{n+1} + 1) r_n \]
and so $r_n \mid r_{n-2}$. Continuing backwards, we find that $r_n$ divides all of
the other remainders $r_{n-1}, \ldots, r_1$. Furthermore, since $r_n \mid r_2$ and $r_n \mid r_1$,
it follows that $r_n$ also divides $b = q_2 r_1 + r_2$. And since $r_n \mid b$, it also
follows that $r_n$ divides $a = q_1 b + r_1$. Thus $r_n$ is a factor of both $a$ and $b$.
All that remains is to show that $r_n$ is the largest such integer; that is,
$r_n = \gcd(a, b)$. Suppose $d \in \mathbb{N}$ is another common factor of $a$ and $b$,
so that $d \mid a$ and $d \mid b$.
Rearranging the equations obtained via the Division Theorem, we get
\[
\begin{aligned}
r_1 &= a - q_1 b, \\
r_2 &= b - q_2 r_1, \\
r_3 &= r_1 - q_3 r_2, \\
&\;\;\vdots \\
r_n &= r_{n-2} - q_n r_{n-1}.
\end{aligned}
\]
From the first equation we find that
$d \mid r_1$, from the second that $d \mid r_2$, from the third that $d \mid r_3$, and so on.
Finally, we find that $d \mid r_{n-2}$ and $d \mid r_{n-1}$, so $d \mid r_n$.
Therefore $d \leq r_n$, and hence $r_n$ is the highest such factor. That is,
$r_n = \gcd(a, b)$ as claimed.
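The repeated division in this proof is easy to carry out mechanically. The following Python sketch is an illustration rather than a definitive implementation: the function name euclid and the sample inputs 1071 and 462 are chosen here for the example. It uses Python's built-in divmod to perform each application of the Division Theorem, and records the quotients and remainders along the way.

```python
from math import gcd


def euclid(a, b):
    """Return (d, quotients, remainders) for positive integers a and b,
    where d is the last nonzero remainder (or b itself if b divides a)."""
    quotients, remainders = [], []
    while b != 0:
        q, r = divmod(a, b)  # a = q*b + r with 0 <= r < b
        quotients.append(q)
        remainders.append(r)
        a, b = b, r          # next step divides b by the remainder r
    return a, quotients, remainders


d, qs, rs = euclid(1071, 462)
print(d)   # 21
print(qs)  # [2, 3, 7]
print(rs)  # [147, 21, 0]
```

Here 1071 = 2·462 + 147, 462 = 3·147 + 21 and 147 = 7·21 + 0, so the last nonzero remainder is 21, in agreement with the standard library function math.gcd.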
Corollary A.39 Let $a, b \in \mathbb{N}$, and let $d = \gcd(a, b)$. Then there exist
integers $s, t \in \mathbb{Z}$ such that $d = sa + tb$.

Proof From the latter part of the above proof, we know that Euclid's
Algorithm ensures the existence of integers $q_1, \ldots, q_{n+1}$ and
$b > r_1 > \cdots > r_{n+1} = 0$ such that
\[
\begin{aligned}
r_1 &= a - q_1 b, \\
r_2 &= b - q_2 r_1, \\
r_3 &= r_1 - q_3 r_2, \\
&\;\;\vdots \\
r_{n-1} &= r_{n-3} - q_{n-1} r_{n-2}, \\
d = r_n &= r_{n-2} - q_n r_{n-1}.
\end{aligned}
\]
Substituting the penultimate expression into the last one, we obtain
\[
\begin{aligned}
d = r_n &= r_{n-2} - q_n (r_{n-3} - q_{n-1} r_{n-2}) \\
&= (1 + q_n q_{n-1}) r_{n-2} - q_n r_{n-3}.
\end{aligned}
\]
Hence $d = r_n$ can be expressed as a $\mathbb{Z}$-linear combination of $r_{n-3}$
and $r_{n-2}$. Now we use the antepenultimate expression on the list to
substitute for $r_{n-2}$ and thereby express $d$ as a $\mathbb{Z}$-linear combination of
$r_{n-4}$ and $r_{n-3}$.
Continuing this process, we eventually obtain an expression for $d$ as a
$\mathbb{Z}$-linear combination of $a$ and $b$, as required.
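The back-substitution in this proof can be written as a short recursive function. The Python sketch below is one possible way to organise the calculation, not the only one; the name bezout is chosen here for the example. Each call first divides, then obtains coefficients for the smaller pair, and finally substitutes the remainder r = a − qb, which mirrors the substitution steps above.

```python
from math import gcd


def bezout(a, b):
    """Return (d, s, t) with d = gcd(a, b) and d = s*a + t*b,
    for non-negative integers a and b that are not both zero."""
    if b == 0:
        return a, 1, 0          # gcd(a, 0) = a = 1*a + 0*0
    q, r = divmod(a, b)         # a = q*b + r
    d, s, t = bezout(b, r)      # d = s*b + t*r
    return d, t, s - q * t      # substitute r = a - q*b


d, s, t = bezout(1071, 462)
print(d, s, t)                  # 21 -3 7
print(s * 1071 + t * 462)       # 21
```

The coefficients are not unique: adding any multiple of b/d to s and subtracting the same multiple of a/d from t gives another valid pair.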
The following important fact about coprime integers is easy to prove,
and will be used elsewhere in the book.
Proposition A.40 Let $a$ and $b$ be coprime integers. If $a$ divides $bq$ for
some integer $q$, then $a$ must divide $q$.

Proof If $a$ and $b$ are relatively prime, then by Corollary A.39 there
exist integers $s$ and $t$ such that
\[ sa + tb = 1. \]
Multiplying both sides of this equation by $q$ we get
\[ saq + tbq = q. \]
Since $a$ obviously divides $saq$, and by hypothesis $a$ also divides $bq$, the
entire left hand side of this equation is divisible by $a$. Hence $a$ must
also divide the right hand side, which means that $a$ divides $q$
as claimed.
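A brute-force search over small positive integers gives a quick sanity check of this proposition, and also shows why the coprimality hypothesis is needed. The Python sketch below is an illustration only; the function name counterexamples and the search bound of 30 are arbitrary choices made here, and a finite search is of course no substitute for the proof.

```python
from math import gcd


def counterexamples(limit):
    """List triples (a, b, q) below limit with gcd(a, b) = 1 and a | bq
    but a not dividing q. The proposition predicts that there are none."""
    return [(a, b, q)
            for a in range(1, limit)
            for b in range(1, limit)
            for q in range(1, limit)
            if gcd(a, b) == 1 and (b * q) % a == 0 and q % a != 0]


print(counterexamples(30))  # []

# Without coprimality the conclusion can fail: 4 divides 2*2 but not 2.
a, b, q = 4, 2, 2
print(gcd(a, b), (b * q) % a == 0, q % a == 0)  # 2 True False
```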

A.6 Real analysis

Theorem A.41 (The Intermediate Value Theorem) Let $f : \mathbb{R} \to \mathbb{R}$ be
continuous on the closed interval $[a, b]$ for some $a, b \in \mathbb{R}$. Then if $f(a)$ and
$f(b)$ have opposite signs, there exists some $c \in (a, b)$ for which $f(c) = 0$.
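The theorem only asserts that such a point c exists. A standard numerical technique that relies on it, the bisection method, locates c approximately by repeatedly halving an interval on which the function changes sign. The Python sketch below is an illustration and is not taken from the text: the function name bisect_root, the tolerance parameter and the example function are all assumptions made here.

```python
def bisect_root(f, a, b, tolerance=1e-12):
    """Approximate a zero of a continuous function f on [a, b], with a < b,
    assuming that f(a) and f(b) do not have the same sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    if fa == 0:
        return a
    if fb == 0:
        return b
    while b - a > tolerance:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c
        if fa * fc < 0:   # sign change on [a, c]: keep the left half
            b, fb = c, fc
        else:             # otherwise the sign change is on [c, b]
            a, fa = c, fc
    return (a + b) / 2


root = bisect_root(lambda x: x * x - 2, 1, 2)
print(abs(root - 2 ** 0.5) < 1e-9)  # True
```

At every stage the function takes values of opposite sign at the two endpoints of the current interval, so the theorem guarantees that a zero remains trapped inside it.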

Theorem A.42 (Rolle's Theorem) Let $f : \mathbb{R} \to \mathbb{R}$ be continuous on
the closed interval $[a, b]$ and differentiable on the open interval $(a, b)$, and
suppose that $f(a) = f(b)$. Then there exists some $c \in (a, b)$ for which
$f'(c) = 0$.

A.7 Linear algebra

