Finite Sets and Counting: Chris Preston

arXiv:0809.0105v2 [math.
HO] 19 Jun 2010
Finite Sets and Counting

Chris Preston
June 2010
We start by presenting a theory of finite sets using the approach

which is essentially that taken by Whitehead and Russell in Principia
Mathematica, and which does not involve the natural numbers (or any
other infinite set). This theory is then applied to prove results about
structures which, like the natural numbers, satisfy the principle of
mathematical induction, but do not necessarily satisfy the remaining
Peano axioms.
Contents
1 Introduction 2
2 Finite sets 12
3 Counting systems 19
4 Minimal counting systems 27
5 Addition and multiplication 35
6 Another take on addition and multiplication 45
References 55
1 Introduction
The notion of what it means for a set to be finite is usually presented within
the framework of the natural numbers N = {0, 1, . . . } (and for our purposes it
is convenient to regard 0 as being a natural number, so 0 can be used to count
the number of elements in the empty set). The standard approach is to define
a set A to be finite if there exists n ∈ N and a bijective mapping h : [n] → A,
where [0] = ∅ and [n] = {0, 1, . . . , n − 1} for each n ∈ N with n 6= 0. If this
approach appears to be rather straightforward then it is probably because the
various facts about N which are needed to define the set {0, 1, . . . , m} have been
taken for granted.
We are going to define finite sets within a framework which does not involve the
natural numbers. (The definition will of course meet the obvious requirement of
being equivalent to the usual one.) One reason for doing this is that it seems
somewhat strange to have to define finite sets in terms of the infinite set N,
and in fact we don’t want to assume in general anything about the existence
of such sets. The definition of being finite to be used here is essentially that
employed by Whitehead and Russell in Principia Mathematica [6]. There are
similar definitions which also appeared in the first decades of the previous century
and excellent treatments of this topic can be found in Levy [3] and Suppes [4].
Let us start by discussing a simple situation involving counting. This is done very
informally, taking for granted all the standard properties of the natural numbers.
The aim is to make a plausible transition from the old to the new definition of
being finite.
Suppose there is a box containing table-tennis balls, and we want to know how
many balls there are in the box. Let us count them by taking them out of the
box one at a time and marking them, say with a red felt marker, marking the
first ball taken out with a 1, the second with a 2 and so on. Eventually we take
the last ball out and mark it, and the number we write on this last ball tells us
how many balls there were in the box.
Now repeat the above procedure: Put the balls back in the box, mix them up
and then take them out again one by one, this time marking them with a green
felt marker. Then the last ball taken out will be marked with the same number
(now in green) that appears on the last ball that was taken out when the red
marker was used.
Each ball now has two numbers on it, one written in red and one in green,
and in general there will be no discernible relationship between the red and
green numbers to be found on any ball. However, this is not important; what is
important is that the number written on the last ball in red is the same as the
number written on the last ball in green. This is one of the reasons that counting
makes sense.
2
1 Introduction 3
Consider the following simple example:
red number: 1 2 3 4 5 6 7 8 9
green number: 8 9 3 2 4 6 5 1 7
The number written on the last ball in red is 9, which is the same as the number
written on the last ball in green. Note that the last ball taken out when using
the red marker was the 7th ball taken out using the green marker, and the last
ball taken out when using the green marker was the 2nd ball taken out using
the red marker. For 9 balls there are 9! = 362880 different ways of taking the
balls one at a time out of the box. However, they all give the same result as
far as determining the number of balls is concerned: The last ball taken out and
counted will always be assigned the number 9.
The above discussion illustrates a basic counting principle which says that if a set
of objects can be counted then it doesn’t matter in which order the counting is
carried out. Most people’s experience of counting things leads them at some point
to accept the principle as something which is true about the world. Thereafter
they are usually not even aware of the role which it plays.
Note that the basic counting principle only applies to sets of objects which can
be counted in the sense that the process of removing the objects from the set
one at a time will eventually terminate. Of course, it is not really clear what
‘eventually terminates’ means, but intuitively it characterises the finite sets. Let
us call a set with this rather vague property N-finite in order to distinguish it
from the new definition of being finite to be given below.
We next reformulate the basic counting principle in a form in which the natural
numbers do not occur. Consider the set S whose elements are the balls in the
box, and suppose the box contains N balls. In order to refer to the balls we
assume that the numbers made with the red felt marker are still there, and we
refer to the ball with the number n written on it in red by bn .
Let f : S → S be an injective mapping, so f (s1 ) 6= f (s2 ) whenever s1 6= s2 . We
can write this mapping as a table:
bn b1 b2 b3 b4 ··· bN −1 bN
f (bn ) f (b1 ) f (b2 ) f (b3 ) f (b4 ) ··· f (bN −1 ) f (bN )
An example of such a mapping (with 9 balls) is the following:
bn b1 b2 b3 b4 b5 b6 b7 b8 b9
f (bn ) b8 b9 b3 b2 b4 b6 b5 b1 b7
1 Introduction 4
In this example the ball marked with the red 1 is mapped to the ball marked
with the red 8, the ball marked with the red 2 is mapped to the ball marked with
the red 9, and so on.
Now do the following: Find the ball f (b1 ) and write 1 on it in blue, then find
the ball f (b2 ) and write 2 on it in blue. Repeat this until finally finding the ball
f (bN ) and writing N on it in blue. Then for each number between 1 and N there
is exactly one ball with this number written on it in blue. (No ball can have more
than one blue number written on it because the mapping f is injective.)
For the above example this means that the ball with the red 8 also has the blue
1 on it, the ball with the red 9 also has the blue 2 on it, and so on:
red number: 8 9 3 2 4 6 5 1 7
blue number: 1 2 3 4 5 6 7 8 9
Now count the balls in the box using the the blue numbers: First take the ball
marked with the blue 1 out of the box, then the ball marked with the blue 2
and so on. Eventually the ball marked with the blue N will be taken out, and
then there can be no more balls left in the box: If this were not the case then we
could continue the counting by taking the remaining balls out in any order and
would conclude that there were more than N balls originally in the box, which
by the basic counting principle is not possible. But if there are no more balls left
in the box after the ball marked with the blue N is taken out then every ball is
marked with a blue number. It follows that the mapping f is surjective, since a
ball marked with a blue n is the image under f of the ball marked with a red n.
Thus f , being injective and surjective, is bijective.
Let us say that a set A is tame if every injective mapping f : A → A is bijective.
We have thus seen that the basic counting principle implies that any N-finite set
is tame.
The converse also holds: Suppose that A is an N-finite set which is tame and
consider two ways of counting out the elements of A. The first time we count we
mark the elements in red and suppose that the last element is marked with the
number M; the second time we mark the elements in green and suppose that the
last element is marked with the number N. Without loss of generality we can
assume that M ≤ N and so we can define a mapping f : A → A as follows: For
each number m between 1 and M let the image under f of the element marked
with m in red be the element marked with m in green. Then f is injective and
thus bijective, since A is tame, which implies that M = N. This shows that the
basic counting principle is valid for A.
We have now established that for N-finite sets the basic counting principle is
equivalent to being tame, and note that being tame has nothing to do with the
1 Introduction 5
natural numbers. Let us next consider an important property of tame sets which
will be used to help motivate the new definition of being finite.
Lemma 1.1 Let A be tame and b be some element not in A. Then A′ = A ∪ {b}
is also tame.
Proof Consider an injective mapping f : A′ → A′ ; then there are two cases:

1. f (A) ⊂ A. Then the restriction f|A : A → A of f to A is injective and hence
bijective, since A is tame. If f (b) ∈ A then f (a) = f|A (a) = f (b) for some
a ∈ A, since f|A is surjective, which contradicts the fact that f is injective. Thus
f (b) = b, and it follows that f is bijective.
2. f (A) 6⊂ A. In this case there exists an element c ∈ A with f (c) = b and, since
f is injective, we must have f (a) ∈ A for all a ∈ A \ {c} and f (b) ∈ A. This
means there is an injective mapping g : A → A defined by letting

f (a) if a ∈ A \ {c} ,
g(a) =
f (b) if a = c
and then g is bijective, since A is tame. Therefore f is again bijective.

Thus in both cases f is bijective, and this shows that A′ is tame.
Note that as a very special case the empty set ∅ is tame, since there is only
one mapping ǫ : ∅ → ∅ which is trivially bijective. For the same reason each
singleton set {a} is tame. This also follows from Lemma 1.1, since ∅ is tame and
{a} = ∅ ∪ {a}.
Let A be an N-finite set; then we can apply Lemma 1.1 to show that A is tame
without appealing to the basic counting principle: Start with the set A and the
empty set ∅; next remove an element a1 from A and add it to ∅ to give the set
∅ ∪ {a1 } = {a1 }; then remove an element a2 from A \ {a1 } and add it to {a1 } to
give the set {a1 } ∪{a2 } = {a1 , a2 }. Repeat the procedure, at each stage removing
an element from what is left from A and adding it to the set consisting of the
elements of A which have already been removed. This process will eventually
come to a halt, because at some point all the elements will have been removed
from A. The other set will then consist of all these elements and so will be equal
to A. As noted above ∅ is tame; thus by Lemma 1.1 {a1 } = ∅ ∪ {a1 } is tame
and so again by Lemma 1.1 {a1 , a2 } = {a1 } ∪ {a2 } is tame. Repeating this until
the original set is empty, Lemma 1.1 shows that at each stage the set consisting
of those elements which have been removed from A will be tame, and therefore
A itself must have the property.
The set of all subsets of a set A (the power set of A) will be denoted as usual
by P(A). The new definition of being finite will involve what is known as an
1 Introduction 6
inductive system in P(A) (or just an inductive system if A can be determined

from the context): This is any subset K of P(A) such that ∅ ∈ K and for which
B ∪ {a} ∈ K for all B ∈ K, b ∈ A \ B. In particular, P(A) itself is always
an inductive system. Another example is given by Lemma 1.1 which shows that
K = {B ∈ P(A) : B is tame} is an inductive system.
Now consider an N-finite set A and let K be an inductive system. Let B ∈ P(A);
then B is N-finite, and the argument used to show that an N-finite set is tame
shows that B ∈ K. Thus K = P(A). In other words, if A is N-finite then P(A)
is the only inductive system.
On the other hand, for any set A we have that K = {B ∈ P(A) : B is N-finite}
is an inductive system: The empty set is trivially N-finite, and if B ∈ K and
a ∈ A \ B then B ∪ {a} ∈ K. (First count the elements in B and then count
a.) In particular, if P(A) is the only inductive system then K = P(A) and hence
A ∈ K, i.e., A is N-finite.
The above (rather informal) arguments imply that a set A is N-finite if and only
if P(A) is the only inductive system, which leads to the new definition: A set
A will be called finite if P(A) itself is the only inductive system. As mentioned
above, this definition is essentially that employed by Whitehead and Russell [6].
Note that it has nothing to do with the natural numbers.
Proposition 1.1 Every finite set is tame. Thus if A is finite then every injective
mapping f : A → A is bijective.
Proof Let A be a finite set; as noted above K = {B ⊂ A : B is tame} is an

inductive system and hence K = P(A), since A is finite. In particular A ∈ K,
which shows that A is tame.
Lemma 1.2 Let A be finite and b be some element not in A. Then A′ = A ∪ {b}
is also finite.
Proof Let K′ be an inductive system in P(A′ ); then K = {B ∈ K′ : B ⊂ A}

is clearly an inductive system in P(A) and hence K = P(A). Thus P(A) ⊂ K′ .
Moreover, B ∪ {a} ∈ K′ for all B ∈ P(A), since K′ is an inductive system in
P(A′ ), and therefore K′ = P(A′ ). It follows that A′ is finite.
Lemma 1.2 shows that sets defined by explicitly giving each of their elements
are finite. For example, consider the set A = {a, b, c, d}, where the elements
a, b, c, d are all different. Then ∅ is finite implies {a} = ∅ ∪ {a} is finite
implies {a, b} = {a} ∪ {b} is finite implies {a, b, c} = {a, b} ∪ {c} is finite implies
A = {a, b, c, d} = {a, b, c} ∪ {d} is finite.
1 Introduction 7
In Section 2 we establish the properties of finite sets, using the above definition
of being finite. There is nothing surprising here: Every subset of a finite set is
finite, if A and B are finite sets then so are A ∪ B, A × B and B A (the set of all
mappings from A to B) and the power set of a finite set is finite. If A is a finite
set and f : A → A a mapping then f is injective if and only if it is surjective
(and thus if and only if it is bijective). If A is finite and there exists an injective
mapping f : B → A or a surjective mapping f : A → B then B is also finite. If
A and B are finite sets then either there exists an injective mapping f : A → B
or an injective mapping g : B → A, and if there exists both an injective mapping
f : A → B and an injective mapping g : B → A then A ≈ B, where we write
A ≈ B if there exists a bijective mapping h : A → B. If B is a subset of a finite
set A with B ≈ A then B = A. The proofs are mostly very straightforward and
they all follow the same pattern (and so tend to become somewhat monotonous).
Having developed a theory of finite sets we turn in Section 3 to the question of
what it means to count the elements in such a set. Of course, counting is usually
associated with the natural numbers, and let us start by giving an informal
discussion of this case. Let s : N → N be the successor operation (with s(0) = 1,
s(1) = 2 and so on). For each finite set A there is an element #(A) of N which
tells us how many elements A contains, and which is usually referred to as the
cardinality of A. The assignment A 7→ #(A) has the properties that #(∅) = 0
and #(A ∪ {a}) = s(#(A)) whenever A is a finite set and a is an element not in
A. Moreover, it is uniquely determined by these requirements. The importance
of the assignment # is that if A and B are finite sets then #(B) = #(A) if and
only if B ≈ A.
Note that the existence of # implies that the basic counting principle introduced
above is valid. For example, if A = {a, b, c} then
#(A) = #({a, b} ∪ {c}) = s(#({a, b})) = s(#({a} ∪ {b})) = s(s(#({a})))

= s(s(#(∅ ∪ {a}))) = s(s(s(#(∅)))) = s(s(s(0))) = 3
which corresponds to counting the elements of A by first removing c, then b

and finally a. However, the result 3 does not depend on the order in which the
elements are removed.
The triple (N, s, 0) is a special case of what we call a counting system, which is
defined to be any triple (X, f, x0 ) consisting of a set X, a mapping f : X → X and
an element x0 ∈ X. Consider a counting system (X, f, x0 ) and for each finite set
A let #(A) be an element of X. Then the assignment A 7→ #(A) will be called
an iterator for (X, f, x0 ) if #(∅) = x0 and #(A ∪ {a}) = f (#(A)) whenever
A is a finite set and a ∈ / A. (This means, for example, that #({a}) = f (x0 )
for each element a, #({a, b}) = f (f (x0 )) for distinct elements a and b, and so
on.) In particular, the operation which assigns to each finite set A its cardinality
#(A) defines an iterator for (N, s, 0). In fact, we will see that there exists a
1 Introduction 8
unique iterator # for each counting system (X, f, x0 ) and it has the property
that #(B) = #(A) whenever A and B are finite sets with B ≈ A. (These
statements will be established in Theorem 3.1 and Proposition 3.1 respectively.)
The iterator # for (N, s, 0) (with #(A) the cardinality of A) has the additional
crucial property that B ≈ A whenever A and B are finite sets with #(B) = #(A).
If an iterator # for a counting system (X, f, x0 ) has this property then we will
say that # is complete. In general # will not be complete, as can be seen by
looking at the trivial example in which X consist of the single element x0 , and so
f is the mapping with f (x0 ) = x0 . Then #(A) = x0 for each finite set A and here
#(B) = #(A) for all finite sets A and B. The completeness of the iterator for
(N, s, 0) must therefore depend on some special properties of the natural numbers.
These come about because the counting system (N, s, 0) is assumed to satisfy what
are usually called the Peano axioms (although, as was acknowledged by Peano,
they were introduced earlier by Dedekind in [1]).
One of these axioms is the principle of mathematical induction, which holds for a
counting system (X, f, x0 ) if whenever a proposition P is given about the elements
of X (meaning for each x ∈ X we have a proposition P(x)) such that
(⋄) P(x0 ) holds,
(⋆) P(f (x)) holds for every x ∈ X for which P(x) holds,
then P(x) holds for all x ∈ X. However, it turns out to be more convenient to
work with the following property, which Lemma 1.3 below shows is equivalent to
the principle of mathematical induction holding: A counting system (X, f, x0 ) is
said to be minimal if the only f -invariant subset of X containing x0 is X itself,
where a subset Y ⊂ X is f -invariant if f (Y ) ⊂ Y .
Lemma 1.3 The principle of mathematical induction holds for a counting system
if and only if it is minimal.
Proof Let (X, f, x0 ) be a counting system and let P be a proposition about the
elements of X satisfying (⋄) and (⋆). Then the subset X ′ = {x ∈ X : P(x) holds}
of X is f -invariant and contains x0 . Therefore if (X, f, x0 ) is minimal then
X ′ = X, and which means the principle of mathematical induction holds for
(X, f, x0 ). Suppose conversely that (X, f, x0 ) is not minimal; then there exists
an f -invariant subset X ′ of X containing x0 with X ′ 6= X. For each x ∈ X let
P(x) be the proposition that x ∈ X ′ ; then (⋄) and (⋆) are satisfied by P, but P(x)
does not hold for x ∈ X \ X ′ and so the principle of mathematical induction does
not hold for (X, f, x0 ).
The other two Peano axioms, when stated in terms of a counting system (X, f, x0 ),
require that the mapping f should be injective and that f (x) 6= x0 for all x ∈ X
1 Introduction 9
(i.e., that x0 ∈/ f (X)), and a counting system satisfying these two conditions
will be called standard. The Peano axioms thus require that (N, s, 0) should be a
minimal standard counting system, and such a counting system will be called a
Dedekind system.
It is the second property which implies completeness: Theorem 3.2 states that
if (X, f, x0 ) is standard then the unique iterator for (X, f, x0 ) is complete. This
confirms that the iterator for (N, s, 0) is complete.
Although the requirement that the counting system be minimal is not involved
here, it will be needed in Theorem 3.3, which states that if (X, f, x0 ) is a Dedekind
system then for each counting system (Y, g, y0) there exists a unique mapping
h : X → Y with h(x0 ) = y0 such that h ◦ f = g ◦ h. This result, which is known
as the recursion theorem (at least when applied with (X, f, x0 ) = (N, s, 0)) is of
fundamental importance, since it provides the justification for making recursive
or inductive definitions.
Theorem 3.5 (a result of Lawvere [2]) will show that (X, f, x0 ) being a Dedekind
system is necessary for the statement in the recursion theorem to hold, and so it
is worth noting that the existence of a Dedekind system depends on the existence
of a non-finite set. More precisely, a set Y is said to be Dedekind-infinite if there
exists a mapping g : Y → Y which is injective but not surjective, and thus by
Proposition 1.1 a Dedekind-infinite set cannot be finite.
Proposition 1.2 The existence of a Dedekind system is equivalent to that of a

Dedekind-infinite set.
Proof If (X, f, x0 ) is a standard counting system then the set X is Dedekind-

infinite, since f : X → X is an injective mapping which is not surjective. For
the converse consider a Dedekind-infinite set X, and so there exists an injective
mapping f : X → X which is not surjective. Choose an element x0 ∈ / f (X),
which gives us a counting system (X, f, x0 ). Now let X0 be the least f -invariant
subset of X containing x0 and f0 be the restriction of f to X0 , considered as a
mapping X0 → X0 . Then it is easy to see that the counting system (X0 , f0 , x0 )
is minimal. Moreover, x0 ∈ / f (X) ⊃ f (X0 ) = f0 (X0 ) and f0 , as the restriction of
an injective mapping, is itself injective. Hence (X0 , f0 , x0 ) is also standard i.e.,
(X0 , f0 , x0 ) is a Dedekind system.
As an application of the recursion theorem it is shown in Theorem 3.4 that if

(X, f, x0 ) is a Dedekind system then there exists a unique mapping [ · ] from X
to the set of finite subsets of X with [x0 ] = ∅ and such that [f (x)] = [x] ∪ {x} for
all x ∈ X. Moreover, A ≈ [#(A)] holds for each finite set A, and in particular a
set A is finite if and only if A ≈ [x] for some x ∈ X.
1 Introduction 10
For the Dedekind system (N, s, 0) we have [0] = ∅ and [n] = {0, 1, . . . , n − 1}
for each n ∈ N with n 6= 0. This shows that the definition of a finite set being
employed here is equivalent to the usual one.
Section 4 gives a more detailed account of minimal counting systems, and in
particular of those which are not Dedekind systems. It is shown in Theorem 4.1
that a minimal counting system (X, f, x0 ) is standard (and thus a Dedekind
system) if and only if the iterator is complete and that this is the case if and only
if the set X is not finite. The rest of the section is taken up with an analysis of
minimal counting systems (X, f, x0 ) for which X is finite. There are two cases.
In the first case f is bijective (and so x0 ∈ f (X)):
xℓ
• •
A
A
A
A
A
• x0 =f (xℓ ) A•
A
A
A
A
A
A• •
x1 =f (x0 ) x2 =f (x1 )
In the second f is not injective and x0 ∈

/ f (X):
x̆ℓ
• •
A
A
A
A
A
• • • • x̆0 =f (x̆ℓ )=f (xt ) A•
x0 x1 =f (x0 ) xt A
A
A
A
A
A• •
x̆1 =f (x̆0 )
In Section 5 we show how an addition and a multiplication can be defined for

any minimal counting system (X, f, x0 ). These operations can be specified by the
rules (a0), (a1), (m0) and (m1) below, which are usually employed when defining
the operations on N via the Peano axioms.
1 Introduction 11
Theorem 5.1 deals with the addition and states that there exists a unique binary
operation ⊕ on X such that
#(A) ⊕ #(B) = #(A ∪ B)
whenever A and B are disjoint finite sets, where # is the iterator for (X, f, x0 ).
This operation ⊕ is both associative and commutative, x ⊕ x0 = x for all x ∈ X
and for all x1 , x2 ∈ X there is an x ∈ X such that either x1 = x2 ⊕ x or
x2 = x1 ⊕ x. Moreover, ⊕ is the unique binary operation ⊕ on X such that
(a0) x ⊕ x0 = x for all x ∈ X.
(a1) x ⊕ f (x′ ) = f (x ⊕ x′ ) for all x, x′ ∈ X.
Theorem 5.2 treats the multiplication and states that there exists a unique binary
operation ⊗ on X such that
#(A) ⊗ #(B) = #(A × B)
for all finite sets A and B. This operation ⊗ is both associative and commutative,
x ⊗ x0 = x0 and x ⊗ f (x0 ) = x for all x ∈ X (and so f (x0 ) is a multiplicative
unit) and the distributive law holds for ⊕ and ⊗: For all x, x1 , x2 ∈ X
x ⊗ (x1 ⊕ x2 ) = (x ⊗ x1 ) ⊕ (x ⊗ x2 ) .
Moreover, ⊗ is the unique binary operation on X such that

(m0) x ⊗ x0 = x0 for all x ∈ X.
(m1) x ⊗ f (x′ ) = x ⊕ (x ⊗ x′ ) for all x, x′ ∈ X.
Finally, Section 6 presents alternative proofs for Theorems 5.1 and 5.2.
2 Finite sets
Let us first repeat the definition to be used here of a set being finite.
The set of all subsets of a set A (the power set of A) will be denoted as usual
by P(A). A subset K of P(A) is called an inductive system in P(A) (or just
an inductive system if A can be determined from the context) if ∅ ∈ K and
B ∪ {a} ∈ K for all B ∈ K, b ∈ A \ B. In particular, P(A) itself is always an
inductive system. A set A will be called finite if P(A) itself is the only inductive
system. As was already mentioned, this definition is essentially that employed
by Whitehead and Russell [6].
To keep the section self-contained let us also repeat Lemma 1.2.
Lemma 2.1 Let A be finite and b be some element not in A. Then A′ = A ∪ {b}
is also finite.
Proof Let K′ be an inductive system in P(A′ ); then K = {B ∈ K′ : B ⊂ A}

is clearly an inductive system in P(A) and hence K = P(A). Thus P(A) ⊂ K′ .
Moreover, B ∪ {a} ∈ K′ for all B ∈ P(A), since K′ is an inductive system in
P(A′ ), and therefore K′ = P(A′ ). It follows that A′ is finite.
Most proofs about finite sets take the following form: To show that every finite
set A has a particular property we consider the subset K of P(A) consisting of
those subsets of A which have the property. We then show that K is an inductive
system and conclude that K = P(A). In particular, it then follows that A ∈ K,
which shows that A has the property. This means that what we usually need is
not that P(A) is the only inductive system, but the apparently somewhat weaker
statement that every inductive system in P(A) contains A. However, as the next
result shows, this statement is actually equivalent to A being finite.
Lemma 2.2 The set A is finite if and only if every inductive system in P(A)
contains A.
Proof If A is finite then P(A) is the only inductive system, and P(A) contains
A. Conversely, suppose that every inductive system in P(A) contains A, and
consider the system of subsets K = {B ∈ P(A) : B is finite}. Then ∅ ∈ K and if
B ∈ K and a ∈ A \ B then by Lemma 2.1 B ∪ {a} ∈ K. Thus K is an inductive
system and hence A ∈ K, i.e., A is finite.
We now establish the usual properties of finite sets. The proofs are mostly very
straightforward and they all follow the same pattern (and so tend to become
somewhat monotonous).
12
2 Finite sets 13
Proposition 2.1 Every subset of a finite set is finite.
Proof Let A be finite and put K = {B ∈ P(A) : B is finite}. Then ∅ ∈ K and if

B ∈ K and a ∈ A \ B then by Lemma 2.1 B ∪ {a} ∈ K. Thus K is an inductive
system and hence K = P(A), i.e., every subset of A is finite.
Proposition 2.2 If A and B are finite sets then so is A ∪ B.
Proof Consider the system of subsets K = {C ∈ P(A) : C ∪ B is finite}. Then

∅ ∈ K, since by assumption ∅ ∪ B = B is finite and if C ∈ K (i.e., C ∪ B is
finite) and a ∈ A \ C then by Lemma 2.1 (C ∪ {a}) ∪ B = (C ∪ B) ∪ {a} ∈ K.
Thus K is an inductive system in P(A) and therefore K = P(A). In particular,
A ∈ K, i.e., A ∪ B is finite.
Proposition 2.3 Let A and B be sets with A finite.

(1) If there exists an injective mapping f : B → A then B is also finite.
(2) If there exists a surjective mapping f : A → B then B is again finite.
Proof (1) Let K consist of those subsets C of A such that if D is any set for
which there exists an injective mapping f : D → C then D is finite. Then ∅ ∈ K,
since there can only exist a mapping f : D → ∅ if D = ∅ and the empty set ∅
is finite. Let C ∈ K and a ∈ A \ C. Consider a set D for which there exists an
injective mapping f : D → C ∪ {a}. There are two cases:
(α) f (d) ∈ A for all d ∈ D. Here we can consider f as a mapping from D to C
and as such it is still injective. Thus D is finite since C ∈ K.
(β) There exists an element b ∈ D with f (b) = a. Put D ′ = D \ {b}. Now since
f is injective it follows that f (d) 6= a for all d ∈ D ′ , and thus we can define a
mapping g : D ′ → C by letting g(d) = f (d) for all d ∈ D ′ . Then g : D ′ → C is
also injective (since f : D → C ∪ {a} is) and therefore D ′ is finite since C ∈ K
holds. Hence by Lemma 2.1 D = D ′ ∪ {b} is finite.
This shows that C ∪ {a} ∈ K and therefore K is an inductive system in P(A).
Thus K = P(A) and in particular A ∈ K, which means that if there exists an
injective mapping f : B → A then B is also finite.
(2) Let K consist of those subsets C of A such that if D is any set for which
there exists a surjective mapping f : C → D then D is finite. Then ∅ ∈ K, since
there can only exist a surjective mapping f : ∅ → D if D = ∅ and the empty
set ∅ is finite. Let C ∈ K and a ∈ A \ C. Consider a set D for which there exists
a surjective mapping f : C ∪ {a} → D. There are again two cases:
2 Finite sets 14
(α) The restriction f|C : C → D of f to C is still surjective. Then D is finite

since C ∈ K.
(β) The restriction f|C is not surjective. Put b = f (a) and D ′ = D \ {b}. Then
f (c) 6= b for all c ∈ C (since f|C is not surjective) and therefore we can define a
mapping g : C → D ′ by letting g(c) = f (c) for all c ∈ C. But f : C ∪ {a} → D is
surjective and hence g : C → D ′ is also surjective. Thus D ′ is finite since C ∈ K
holds, and so by Lemma 2.1 D = D ′ ∪ {b} is finite.
This shows that C ∪ {a} ∈ K and therefore K is an inductive system in P(A).
Thus K = P(A) and in particular A ∈ K, which means that if there exists a
surjective mapping f : A → B then B is also finite.
Proposition 2.4 If A is a finite set then so is the power set P(A).
Proof Consider the system of subsets K = {B ∈ P(A) : P(B) is finite}. Then

by Lemma 2.1 ∅ ∈ K, since P(∅) = {∅} = ∅ ∪ {∅}. Let B ∈ K and a ∈ A \ B.
Then P(B ∪ {a}) = P(B) ∪ Pa (B), where Pa (B) = {C ∪ {a} : C ∈ P(B)} and
the mapping C 7→ C ∪ {a} from P(B) to Pa (B) is surjective. It follows from
Proposition 2.3 (2) that Pa (B) is finite and thus by Proposition 2.2 P(B ∪ {a})
is finite, i.e., B ∪ {a} ∈ K. This shows that K is an inductive system in P(A).
Hence K = P(A) and in particular A ∈ K, i.e., the power set P(A) is finite.
Proposition 2.5 If A and B are finite sets then so their product A × B.
Proof Put K = {C ∈ P(A) : C × B is finite}. Then ∅ ∈ K, since ∅ × B = ∅.

Let C ∈ K and a ∈ A \ C. Then (C ∪ {a}) × B = (C × B) ∪ ({a} × B) and by
Proposition 2.3 (2) {a} × B is finite since the mapping f : B → {a} × B with
f (b) = (a, b) for all b ∈ B is surjective. Thus by Proposition 2.1 (C ∪ {a}) × B
is finite, i.e., C ∪ {a} ∈ K. This shows that K is an inductive system in P(A).
Hence K = P(A) and in particular A ∈ K, i.e., A × B is finite.
Proposition 2.6 If A and B are finite sets then so is B A , the set of all mappings
from A to B.
Proof Since B A is a subset of P(A × B) and by Propositions 2.4 and 2.5 the set
P(A × B) is finite it follows from Proposition 2.1 that B A is finite.
Proposition 2.7 Let A be a finite set and f : A → A be a mapping. Then f is

injective if and only if it is surjective (and thus if and only if it is bijective).
2 Finite sets 15
Proof We first show that an injective mapping is bijective. Let K consist of those
subsets B of A having the property that every injective mapping f : B → B is
bijective. Then ∅ ∈ K, since the only mapping f : ∅ → ∅ is bijective. Let
B ∈ K and a ∈ A \ B; consider an injective mapping f : B ∪ {a} → B ∪ {a}.
There are two cases:
(α) f (B) ⊂ B. Then the restriction f|B : B → B of f to B is injective and
hence bijective, since B ∈ K. If f (a) ∈ B then f (b) = f|B (b) = f (a) for some
b ∈ B, since f|B is surjective, which contradicts the fact that f is injective. Thus
f (a) = a, and it follows that f is bijective.
(β) f (B) 6⊂ B. In this case there exists b ∈ B with f (b) = a and, since f is
injective, we must have f (c) ∈ B for all c ∈ B \ {b} and f (a) ∈ B. This means
there is an injective mapping g : B → B defined by letting

f (c) if c ∈ B \ {b} ,
g(c) =
f (a) if c = b
and then g is bijective, since B ∈ K. Therefore f is again bijective.
This shows that B ∪ {a} ∈ K and thus that K is an inductive system. Hence
K = P(A) and in particular A ∈ K, i.e., every injective mapping f : A → A is
bijective.
We now show a surjective mapping is bijective, and here let K consist of those
subsets B of A having the property that every surjective mapping f : B → B is
bijective. Then ∅ ∈ K, again since the only mapping f : ∅ → ∅ is bijective. Let
B ∈ K and a ∈ A \ B; consider a surjective mapping f : B ∪ {a} → B ∪ {a}. Let
D = {b ∈ B : f (b) = a}; there are three cases:
(α) D = ∅. Then f (a) = a, since f is surjective, thus the restriction f|B : B → B
of f to B is surjective and hence bijective (since B ∈ K holds), and this means
f is bijective.
(β) D 6= ∅ and f (a) ∈ B. Here we can define a surjective mapping g : B → B
by letting
f (c) if c ∈ B \ D ,
g(c) =
f (a) if c ∈ D .
Thus g is bijective (since B ∈ K), which implies that D = {b} for some b ∈ C
and in particular f is also injective.
(γ) D 6= ∅ and f (a) = a. This is not possible since then f (B \ D) = B and so,
choosing any b ∈ D, the mapping h : B → B with

f (c) if c ∈ B \ D ,
h(c) =
b if c ∈ D
would be surjective but not injective (since there also exists c ∈ B \ D with
f (c) = b).
2 Finite sets 16
K = P(A) and in particular A ∈ K, i.e., every surjective mapping f : A → A is
bijective.
If A and B are sets then we write A ≈ B if there exists a bijective mapping

f : A → B. Proposition 2.3 implies that if A ≈ B then A is finite if and only if
B is. It is clear that ≈ defines an equivalence relation on the class of all finite
sets.
Proposition 2.8 Let A and B be finite sets. Then either there exists an injective
mapping f : A → B or an injective mapping g : B → A. Moreover, if there exists
both an injective mapping f : A → B and an injective mapping g : B → A then
A ≈ B.
Proof Let K consist of those subsets C of A for which there either there exists an
injective mapping f : C → B or an injective mapping g : B → C. Then ∅ ∈ K,
since the only mapping f : ∅ → B is injective. Let C ∈ K and let a ∈ A \ C.
There are two cases:
(α) There exists an injective mapping g : B → C. Then g is still injective when
considered as a mapping from B to C ∪ {a}.
(β) There exists an injective mapping f : C → B. If f is not surjective then it
can be extended to an injective mapping f ′ : C ∪ {a} → B (with f ′ (a) chosen to
be any element in B \ f (C)). On the other hand, if f is surjective (and hence
a bijection) then the inverse mapping f −1 : B → C is injective and so is still
injective when considered as a mapping from B to C ∪ {a}.
K = P(A) and in particular A ∈ K, i.e., there either exists an injective mapping
f : A → B or an injective mapping g : B → A.
Suppose there exists both an injective mapping f : A → B and an injective
mapping g : B → A. Then f ◦ g : B → B is an injective mapping, which by
Proposition 2.7 is bijective. In particular f is surjective and therefore bijective,
i.e., A ≈ B.
Lemma 2.3 Let A and B be finite sets. Then there exists either a subset B ′ of
A with B ′ ≈ B or a subset A′ of B with A′ ≈ A.
Proof By Proposition 2.8 there either exists an injective mapping f : A → B or

an injective mapping g : B → A. Suppose that the former is the case and put
A′ = f (A); then A′ ⊂ B with A′ ≈ A (since f as a mapping from A to A′ is a
bijection). If there exists an injective mapping g : B → A then in the same way
there exists a subset B ′ of A with B ′ ≈ B.
2 Finite sets 17
Proposition 2.9 If B is a subset of a finite set A with B ≈ A then B = A.
Proof There exists a bijective mapping f : A → B and then the restriction

f|B : B → B of f to B is injective; thus by Proposition 2.7 f|B is bijective. But
this is only possible if B = A, since if a ∈ A \ B then f (a) ∈
/ f|B (B).
If A is a set and S a non-empty subset of P(A) then B ∈ S is said to be minimal

if B ′ ∈
/ S for each proper subset B ′ of B. The statement in the following result
is Tarski’s definition [5] of a set being finite.
Proposition 2.10 Let A be a set; then each non-empty subset of P(A) contains
a minimal element if and only if A is finite.
Proof Let A be a finite set and let K consist of those subsets B of A such that
each non-empty subset of P(B) contains a minimal element. Then ∅ ∈ K, since
the only non-empty subset of P(∅) is {∅} and then ∅ is the required minimal
element. Let B ∈ K and a ∈ A\B, and let S be a non-empty subset of P(B∪{a}).
Put S ′ = S ∩ P(B); there are two cases:
(α) S ′ 6= ∅. Here S ′ is a non-empty subset of P(B) and thus contains a minimal
element C which is then a minimal element of S, since each set in S \ S ′ contains
a and so is not a proper subset of C.
(β) S ′ = ∅ (and so each set in S contains a). Put S ′′ = {C ⊂ B : C ∪ {a} ∈ S};
then S ′′ is a non-empty subset of P(B) and thus contains a minimal element C.
It follows that C ′ = C ∪ {a} is a minimal element of S: A proper subset of C ′ has
either the form D with D ⊂ C, in which case D ∈ / S (since each set in S contains
a), or has the form D ∪ {a} with D a proper subset of C and here D ∪ {a} ∈ / S,
′′
since D ∈ /S .
K = P(A) and in particular A ∈ K, i.e., non-empty subset of P(X) contains a
minimal element.
Conversely, suppose A is not finite and let S = {B ∈ P(A) : B is not finite};
then S is non-empty since it contains A. However S cannot contain a minimal
element: If B is a minimal element of S then B 6= ∅, since ∅ is finite. Choose
b ∈ B; then B \ {b} is a proper subset of B and thus B \ {b} ∈
/ S, i.e., B \ {b} is
finite. But then by Lemma 2.1 B = (B \ {b}) ∪ {b} would be finite.
For what we consider in later sections it is useful to employ another technique for
establishing statements about finite sets. This involves the following induction
principle for finite sets which first appeared in a 1909 paper of Zermelo [7]:
2 Finite sets 18
Theorem 2.1 Let P be a proposition about finite sets (meaning that for each
finite set A we have some proposition P(A)). Suppose that
(⋄) P(∅) holds.
(⋆) If A is a finite set for which P(A) holds then P(A ∪ {a}) holds for each
element a ∈/ A.
Then P is a property of finite sets, i.e., P(A) holds for every finite set A.
Proof Let A be a finite set and put K = {B ∈ P(A) : P(B) holds}. In particular
(⋄) implies ∅ ∈ K. Consider B ∈ K (and so P(B) holds) and let a ∈ A \ B;
then P(B ∪ {a}) holds by (⋆), and hence B ∪ {a} ∈ K. This shows that K is
an inductive system in P(A) and so K = P(A). In particular A ∈ K, i.e., P(A)
holds, and since A is arbitrary P(A) holds for every finite set A.
All the properties of finite sets presented above could have been deduced from
Theorem 2.1 (together with Lemma 2.1, which states that A ∪ {a} is finite for
each finite set A and each element a ∈ / A, and the fact that the empty set ∅
is finite). For example, consider Proposition 2.2 which states that if A and B
are finite sets then so is A ∪ B. To establish this using the induction principle
for finite sets regard B as being fixed and for each finite set A let P(A) be the
proposition that A ∪ B is finite. Then:
(⋄) P(∅) holds because ∅ ∪ B = B.
(⋆) Let A be a finite set for which P(A) holds and let a ∈ / A. Now A ∪ B is
finite, since P(A) holds, and thus by Lemma 2.1 (A ∪ B) ∪ {a} is also finite (since
this holds immediately if a ∈ A ∪ B). But (A ∪ {a}) ∪ B = (A ∪ B) ∪ {a}; i.e.,
(A ∪ {a}) ∪ B is finite. This shows that P(A ∪ {a}) holds.
Therefore by Theorem 2.1 P(A) holds for each finite set A, and thus for all finite
sets A, B the set A ∪ B is finite.
The reader is left to check that the proofs of all the other results about finite sets
can be obtained in this manner.
3 Counting systems
Having introduced a theory of finite sets we now turn to the question of what
it means to count the elements in such a set. Let us first recall some of the
definitions made in Section 1.
A triple (X, f, x0 ) consisting of a set X, a mapping f : X → X and an element
x0 ∈ X will be called a counting system. A counting system (X, f, x0 ) is said to
be minimal if the only f -invariant subset of X containing x0 is X itself, where a
subset Y ⊂ X is f -invariant if f (Y ) ⊂ Y . Moreover, it will be called standard
if the mapping f is injective and f (x) 6= x0 for all x ∈ X (i.e., x0 ∈
/ f (X)). The
Peano axioms thus require that (N, s, 0) should be a minimal standard counting
system, and such a counting system will be called a Dedekind system.
Consider a counting system (X, f, x0 ) and for each finite set A let #(A) be an
element of X. Then the assignment A 7→ #(A) will be called an iterator for
(X, f, x0 ) if #(∅) = x0 and #(A ∪ {a}) = f (#(A)) whenever A is a finite set
and a ∈ / A. (This means, for example, that #({a}) = f (x0 ) for each element
a, #({a, b}) = f (f (x0 )) for distinct elements a and b, and so on.) Theorem 3.1
states that there exists a unique iterator # for each counting system (X, f, x0 )
and by Proposition 3.1 #(B) = #(A) whenever A and B.
If the iterator # for (X, f, x0 ) has the additional property that B ≈ A whenever
A and B are finite sets with #(B) = #(A) then we say that it is complete. In
general # will not have this property, as can be seen by looking at the trivial
example in which X consist of the single element x0 , and so f is the mapping with
f (x0 ) = x0 . Then #(A) = x0 for each finite set A and here #(B) = #(A) for all
finite sets A and B. However, Theorem 3.2 states that if (X, f, x0 ) is standard
then the unique iterator is complete.
In this section most of the statements about finite sets will be established with
the help of the induction principle for finite sets (Theorem 2.1). The counting
system (X, f, x0 ) is considered to be fixed in what follows.
Theorem 3.1 There exists a unique iterator # for (X, f, x0 ).
Proof We first consider a version of the theorem restricted to the subsets of a

finite set. Let A be a finite set; then a mapping #A : P(A) → X will be called
an A-iterator if #A (∅) = x0 and #A (B ∪ {a}) = f (#A (B)) for each B ⊂ A and
each a ∈ A \ B.
Lemma 3.1 For each finite set A there exists a unique A-iterator.
19
Proof For each finite set A let P(A) be the proposition that there exists a unique
A-iterator.
(⋄) P(∅) holds, since the mapping #∅ : P(∅) → X with #∅ (∅) = x0 is clearly
the unique ∅-iterator.
/ A; put A′ = A ∪ {a}.
(⋆) Let A be a finite set for which P(A) holds, and let a ∈
By assumption there exists a unique A-iterator #A . Now P(A′ ) is the disjoint
union of the sets P(A) and {B ∪ {a} : B ⊂ A} and so we can define a mapping
#A′ : P(A′ ) → X by #A′ (B) = #A (B) and #A′ (B ∪ {a}) = f (#A (B)) for each
B ⊂ A. Then #A′ (∅) = #A (∅) = x0 , and for all B ⊂ A and all b ∈ A \ B
#A′ (B ∪ {b}) = #A (B ∪ {b}) = f (#A (B)) = f (#A′ (B)) ,

#A′ (B ∪ {a}) = f (#A (B)) = f (#A′ (B)) ,
#A′ (B ∪ {a} ∪ {b}) = f (#A (B ∪ {b})) = f (f (#A (B))) = f (#A′ (B ∪ {a})) ,
i.e., #A′ (B ′ ∪ {b}) = f (#A′ (B ′ ) for all B ′ ⊂ A′ and b ∈ A′ \ B ′ and this means
that #A′ is an A-iterator in (X, f, x0 ). Let #′A′ be an arbitrary A′ -iterator. Then
in particular #′A′ (B ∪ {b}) = f (#′A′ (B) for all B ⊂ A and all b ∈ A \ B, and from
the uniqueness of the A-iterator #A it follows that #′A′ (B) = #A (B) and thus
also #′A′ (B ∪ {a}) = f (#′A′ (B)) = f (#A′ (B)) = #A′ (B ∪ {a}) for all B ⊂ A, i.e.,
#′A′ = #A′ . This shows that P(A ∪ {a}) holds.
Therefore by the induction principle for finite sets P(A) holds for every finite set
A, i.e., for each finite set A there exists a unique #A -iterator.
Lemma 3.2 Let A and B be finite sets with B ⊂ A; then the unique B-iterator
#B is the restriction of the unique A-iterator #A to P(B), i.e., #B (C) = #A (C)
for all C ∈ P(B).
Proof This follows immediately from the uniqueness of #B .
Now for each finite set A let #A be the unique A-iterator and put #(A) = #A (A).
In particular #(∅) = #∅ (∅) = x0 . For each finite set A and each element a ∈/A
it follows from Lemma 3.2 that
#(A ∪ {a}) = #A∪{a} (A ∪ {a}) = f (#A∪{a} (A) = f (#A (A)) = f (#(A))
and hence # is an iterator for (X, f, x0 ).

Finally, consider an arbitrary iterator #′ for (X, f, x0 ) and for each finite set A
let P(A) be the proposition that #′ (A) = #(A).
(⋄) P(∅) holds since #′ (∅) = x0 = #(∅).
(⋆) Let A be a finite set for which P(A) holds (and so #′ (A) = #(A)) and let
a∈/ A. Then #′ (A∪{a}) = f (#′ (A)) = f (#(A)) = #(A∪{a}) and so P(A∪{a})
holds.
Hence by the induction principle for finite sets P(A) holds for every finite set A,
which means #′ (A) = #(A) for each finite set A, i.e., the iterator A → #(A) for
(X, f, x0 ) is unique. This completes the proof of Theorem 3.1.
In what follows let # be the unique iterator for (X, f, x0 ).
Proposition 3.1 If A and B are finite sets with B ≈ A then #(B) = #(A).
Proof For each finite set A let P(A) be the proposition that #(B) = #(A)
whenever B is a finite set with B ≈ A.
(⋄) P(∅) holds since B ≈ ∅ if and only if B = ∅.
(⋆) Let A be a finite set for which P(A) holds and let a ∈
/ A. Consider a finite
set B with B ≈ A ∪ {a}, put b = h(a) and let B = B \ {b}; then B ′ ≈ A, thus
′
#(B ′ ) = #(A), since P(A) holds, and it follows that
#(B) = #(B ′ ∪ {b}) = f (#(B ′ )) = f (#(A)) = #(A ∪ {a}) .
This shows that P(A ∪ {a}) holds.

A, and hence #(B) = #(A) whenever B is a finite set with B ≈ A.
Theorem 3.2 If (X, f, x0 ) is standard then the iterator # is complete.
Proof For each finite set A let P(A) be the proposition that B ≈ A whenever B
is a finite set with #(B) = #(A).
(⋄) Let B be a finite set with B 6= ∅, let b ∈ B and put B ′ = B \ {b}. Then
#(B) = #(B ′ ∪ {b}) = f (#(B ′ )), and so #(B) 6= x0 , since x0 ∈
/ f (X). Thus
#(B) 6= #(∅), which shows that P(∅) holds, since ∅ ≈ B if and only if B = ∅.
/ A. Consider a finite set
B with #(B) = #(A ∪ {a}); then #(B) = f (#(A)) ∈ f (X), hence #(B) 6= x0
and so B 6= ∅. Let b ∈ B and put B ′ = B \ {b}; then
f (#(B ′ )) = #(B ′ ∪ {b}) = #(B) = f (#(A)) ,
and thus #(B ′ ) = #(A), since f is injective, and it follows that B ′ ≈ A, since
P(A) holds. But B = B ′ ∪ {b} with b ∈ / B′, a ∈/ A and B ′ ≈ A, and therefore
′
B = B ∪ {b} ≈ A ∪ {a}. This shows that P(A ∪ {a}) holds.
Thus by the induction principle for finite sets P(A) holds for every finite set A,
which means that if A and B are finite sets with #(B) = #(A) then B ≈ A.
The next lemma indicates the advantage of having a minimal counting system for
obtaining information using the iterator #. Note that an arbitrary intersection
of f -invariant subsets of X is again f -invariant and X is itself an f -invariant
subset containing x0 . There is thus a least f -invariant subset of X containing
x0 (namely the intersection of all such subsets). If (X, f, x0 ) is minimal then the
least f -invariant subset containing x0 is of course X itself.
Lemma 3.3 Let X0 be the least f -invariant subset of X containing x0 . Then
X0 = {x ∈ X : x = #(A) for some finite set A} .
In particular, if (X, f, x0 ) is minimal then for each x ∈ X there exists a finite set
A such that x = #(A).
Proof Put X0′ = {x ∈ X : x = #(A) for some finite set A}. Now let X ′ be an
f -invariant subset of X containing x0 and for each finite set A let P(A) be the
proposition that #(A) ∈ X ′ .
(⋄) P(∅) holds since #(∅) = x0 ∈ X ′ .
(⋆) Let A be a finite set for which P(A) holds (and so #(A) ∈ X ′ ) and let a ∈
/ A.
′ ′
Then #(A ∪ {a}) = f (#(A)) ∈ X , since X is f -invariant, and hence P(A ∪ {a})
holds.
A, which means that #(A) ∈ X ′ for each finite set A, i.e., X0′ ⊂ X ′ . In particular,
X0′ ⊂ X0 .
It remains to show that X0′ is itself an f -invariant subset of X containing x0 ,
and clearly x0 ∈ X0′ since x0 = #(∅). Thus let x ∈ X0′ , and so there exists a
finite set A with x = #(A). Let a be an element not in A; it then follows that
#(A ∪ {a}) = f (#(A)) = f (x), which implies that f (x) ∈ X0′ . Hence X0′ is
f -invariant.
Remark: In the above proof, and also below, we use the fact that for each set
A there exists an element a not in A. In fact there must exist an element in
P(A) \ A. If this were not the case then P(A) ⊂ A, and we could define a
surjective mapping f : A → P(A) by letting f (x) = x if x ∈ P(A) and f (x) = ∅
otherwise. But by Cantor’s diagonal argument (which states that a mapping
f : X → P(X) cannot be surjective) this is not possible.
Here is the recursion theorem (which first appeared in Dedekind [1]).
Theorem 3.3 If (X, f, x0 ) is a Dedekind system then for each counting system
(Y, g, y0) there exists a unique mapping h : X → Y with h(x0 ) = y0 such that
h ◦ f = g ◦ h.
Proof Here we have the unique iterator # for (X, f, x0 ) as well as the unique
iterator #′ for (Y, g, y0). Let x ∈ X; by Lemma 3.3 there exists a finite set A
with x = #(A) (since (X, f, x0 ) is minimal). Moreover, if B is another finite
set with x = #(B) then #(B) = #(A) and hence by Theorem 3.2 B ≈ A
(since (X, f, x0 ) is standard). Therefore by Proposition 3.1 #′ (B) = #′ (A). This
implies there exists a unique mapping h : X → Y such that h(#(A)) = #′ (A)
for each finite set A. In particular, h(x0 ) = h(#(∅)) = #′ (∅) = y0 . Let x ∈ X;
as above there exists a finite set A with x = #(A), and there exists an element
a not contained in A. Hence
h(f (x)) = h(f (#(A))) = h(#(A ∪ {a}))

= #′ (A ∪ {a}) = g(#′ (A)) = g(h(#(A))) = g(h(x))
and this shows that h ◦ f = g ◦ h. The proof of the uniqueness only uses the fact
that (X, f, x0 ) is minimal: Let h′ : X → Y be a further mapping with h′ (x0 ) = y0
and h′ ◦ f = g ◦ h′ and let X ′ = {x ∈ X : h(x) = h′ (x)}. Then x0 ∈ X ′ , since
h(x0 ) = y0 = h′ (x0 ), and if x ∈ X ′ then h′ (f (x)) = g(h′ (x)) = g(h(x)) = h(f (x)),
i.e., f (x) ∈ X ′ . Thus X ′ is an f -invariant subset of X containing x0 and so
X ′ = X, i.e., h′ = h.
The recursion theorem will now be applied to show that the definition of a finite
set we are working with is equivalent to the usual one. The usual definition of
A being finite is that there exists n ∈ N and a bijective mapping h : [n] → A,
where [0] = ∅ and [n] = {0, 1, . . . , n − 1} for n ∈ N \ {0}. Moreover, if there is a
bijective mapping h : [n] → A, then n is the cardinality of A, i.e., n = #(A), and
so A ≈ [#(A)] for each finite set A. The problem here is to assign a meaning to
the expression {0, 1, . . . , n − 1}, and one way to do this is to make use of the fact
that {0, 1, . . . , n} = {0, 1, . . . , n − 1} ∪ {n} and hence that [s(n)] = [n] ∪ {n} for
all n ∈ N. The corresponding approach works with any Dedekind system.
If X is a set then put F (X) = {A ⊂ X : A is finite}.
Theorem 3.4 Let (X, f, x0 ) be a Dedekind system. Then there exists a unique
mapping [ · ] : X → F (X) with [x0 ] = ∅ such that [f (x)] = [x] ∪ {x} for all
x ∈ X. Moreover, x ∈ / [x] for each x ∈ X and A ≈ [#(A)] holds for each finite
set A. In particular, a set A is finite if and only if A ≈ [x] for some x ∈ X.
Proof If A ∈ F (X) then by Propositions 2.2 and 2.3 (2) {x0 } ∪ f (A) ∈ F (X)
and so there is a mapping F : F (X) → F (X) given by F (A) = {x0 } ∪ f (A) for
all A ∈ F (X). (Note that here F (A) is the result of applying the mapping F to
the argument A while f (A) = {f (a) : a ∈ A}.) Consider the counting system
(F (X), F, ∅). By Theorem 3.3 there exists a unique mapping [ · ] : X → F (X)
with [x0 ] = ∅ and such that [f (x)] = F ([x]) = {x0 } ∪ f ([x]) for all x ∈ X. Let
X0 = {x ∈ X : [f (x)] = [x] ∪ {x}}. Then
[f (x0 )] = {x0 } ∪ f ([x0 ]) = {x0 } ∪ f (∅) = {x0 } = ∅ ∪ {x0 } = [x0 ] ∪ {x0 }
and so x0 ∈ X0 . Let x ∈ X0 ; then
[f (f (x))] = {x0 } ∪ f ([f (x)]) = {x0 } ∪ f ([x] ∪ {x})

= {x0 } ∪ f ([x]) ∪ {f (x)} = [f (x)] ∪ {f (x)}
and so f (x) ∈ X0 . Thus X0 is an f -invariant subset of X containing x0 and hence

X0 = X, i.e., [f (x)] = [x] ∪ {x} for all x ∈ X. The uniqueness of the mapping
[ · ] : X → F (X) (satisfying [x0 ] = ∅ and [f (x)] = [x] ∪ {x} for all x ∈ X) follows
immediately from the fact that (X, f, x0 ) is minimal.
Next consider the set X1 = {x ∈ X : x ∈ / [x]}; in particular x0 ∈ X1 , since
x0 ∈/ ∅ = [x0 ]. If x ∈ X1 then f (x) ∈
/ f ([x]) (since f is injective) and f (x) ∈
/ {x0 }
(since x0 ∈/ f (X)) and so f (x) ∈
/ {x0 } ∪ f ([x]) = [f (x)]. Hence f (x) ∈ X1 . Thus
X1 is an f -invariant subset of X containing x0 and therefore X1 = X, i.e., x ∈ / [x]
for all x ∈ X.
Finally, for each finite set A let P(A) be the proposition that A ≈ [#(A)].
(⋄) P(∅) holds since [#(∅)] = [x0 ] = ∅.
(⋆) Let A be a finite set for which P(A) holds (so A ≈ [#(A)]) and let a ∈ / A.
Then A ≈ f (A), since f is injective, and x0 ∈
/ f (A), and it therefore follows that
{x0 } ∪ f (A) ≈ A ∪ {a}. Moreover,
[#(A ∪ {a})] = [f (#(A))] = {x0 } ∪ f ([#(A)]) = {x0 } ∪ f (A)
and thus A ∪ {a} ≈ [#(A ∪ {a})], i.e., P(A ∪ {a}) holds.

Hence by the induction principle for finite sets P(A) holds for every finite set
A, which means A ≈ [#(A)] for each finite set A. This completes the proof of
Theorem 3.4.
We now consider the converse of the recursion theorem (Theorem 3.3), and for
this it is useful to to be more explicit about the structure preserving mappings
between counting systems. If (X, f, x0 ) and (Y, g, y0) are counting systems then
a mapping π : X → Y is said to be a morphism from (X, f, x0 ) to (Y, g, y0)
if π(x0 ) = y0 and g ◦ π = π ◦ f . This will also be expressed by saying that
π : (X, f, x0 ) → (Y, g, y0) is a morphism.
Lemma 3.4 (1) For each counting system (X, f, x0 ) the identity mapping idX
is a morphism from (X, f, x0 ) to (X, f, x0 ).
(2) If π : (X, f, x0 ) → (Y, g, y0) and σ : (Y, g, y0) → (Z, h, z0 ) are morphisms
then σ ◦ π is a morphism from (X, f, x0 ) to (Z, h, z0).
Proof (1) This is clear, since idX (x0 ) = x0 and f ◦ idX = f = idX ◦ f .
(2) This follows since (σ ◦ π)(x0 ) = σ(π(x0 )) = σ(y0 ) = z0 and
h ◦ (σ ◦ π) = (h ◦ σ) ◦ π = (σ ◦ g) ◦ π = σ ◦ (g ◦ π) = σ ◦ (π ◦ f ) = (σ ◦ π) ◦ f .
If π : (X, f, x0 ) → (Y, g, y0) is a morphism then clearly π ◦ idX = π = idY ◦ π,

and if π, σ and τ are morphisms for which the compositions are defined then
(τ ◦ σ) ◦ π = τ ◦ (σ ◦ π). This means that counting systems are the objects of a
concrete category, whose morphisms are those defined above.
A counting system (X, f, x0 ) is said to be initial if for each counting system
(Y, g, y0) there is a unique morphism from (X, f, x0 ) to (Y, g, y0). Theorem 3.3
thus states that a Dedekind system is initial. The following result of Lawvere [2]
shows that the converse of the recursion theorem holds.
Theorem 3.5 An initial counting system (X, f, x0 ) is a Dedekind system.
Proof We first show that an initial counting system is minimal, and then that it
is standard.
Lemma 3.5 An initial counting system (X, f, x0 ) is minimal.
Proof Let X0 be the least f -invariant subset of X containing x0 and f0 be the

restriction of f to X0 , considered as a mapping X0 → X0 . Then it is easy to
see that the counting system (X0 , f0 , x0 ) is minimal and clearly the inclusion
mapping inc : X0 → X defines a morphism from (X0 , f0 , x0 ) to (X, f, x0 ). Let
σ : (X, f, x0 ) → (X0 , f0 , x0 ) be the unique morphism; then inc ◦ σ = idX , since
by Lemma 3.4 inc ◦ σ and idX are both morphisms from (X, f, x0 ) to (X, f, x0 )
(and there is only one such morphism, since (X, f, x0 ) is initial). In particular,
inc is surjective, which implies that X0 = X, i.e., (X, f, x0 ) is minimal.
Lemma 3.6 An initial counting system (X, f, x0 ) is standard.

Proof Let ⋄ be an element not contained in X, put X⋄ = X ∪ {⋄} and define

f⋄ : X⋄ → X⋄ by letting f⋄ (x) = f (x) for x ∈ X and f⋄ (⋄) = x0 ; thus (X⋄ , f⋄ , ⋄)
is a counting system. Since (X, f, x0 ) is initial there exists a unique morphism
π : (X, f, x0 ) → (X⋄ , f⋄ , ⋄). Consider the set X ′ = {x ∈ X : f⋄ (π(x)) = x}; then
x0 ∈ X ′ , since f⋄ (π(x0 )) = f⋄ (⋄) = x0 and if x ∈ X ′ then f⋄ (π(x)) = x and so
f⋄ (π(f (x))) = f⋄ (f⋄ (π(x))) = f⋄ (x) = f (x) ,
i.e., f (x) ∈ X ′ . Thus X ′ is an f -invariant subset of X containing x0 and hence

X ′ = X, since by Lemma 3.5 (X, f, x0 ) is minimal. Thus π(f (x)) = f⋄ (π(x)) = x
for all x ∈ X, which implies that f is injective. Moreover, x0 ∈ / f (X), since
π(f (x)) = f⋄ (π(x)) 6= ⋄ = π(x0 ) for all x ∈ X. Hence (X, f, x0 ) is standard.
This completes the proof of Theorem 3.5.

4 Minimal counting systems
In this section we give a more detailed analysis of minimal counting systems,
and in particular of those which are not Dedekind systems. In the following let
(X, f, x0 ) be a minimal counting system and let # be the unique iterator for
(X, f, x0 ). For each finite set A define a subset #P (A) of X by
#P (A) = {x ∈ X : x = #(C) for some C ⊂ A} .
Thus #P (∅) = {x0 }, #P ({a}) = {x0 , f (x0 )}, #P ({a, b}) = {x0 , f (x0 ), f (f (x0 ))}
for distinct elements a and b, and so on. If A ⊂ B then clearly #P (A) ⊂ #P (B).
A finite set A will be called #-regular if #(B) 6= #(A) for each proper subset B
of A.
The first result shows in particular that for minimal counting systems the converse
of Theorem 3.2 holds.
Theorem 4.1 The following statements are equivalent:

(1) (X, f, x0 ) is standard (and thus a Dedekind system).
(2) The iterator # is complete.
(3) Each finite set is #-regular.
(4) #P (A) 6= X for each finite set A.
(5) The set X is not finite.
For the proof we need two lemmas:
Lemma 4.1 (1) For each finite set A the set #P (A) is finite.
(2) If A and B are finite sets with A ≈ B then #P (A) = #P (B).
Proof (1) By Proposition 2.4 the set P(A) is finite and # maps P(A) onto
#P (A). Thus by Proposition 2.3 (2) #P (A) is finite.
(2) If h : A → B is a bijection then the mapping h∗ : P(A) → P(B) given by
h∗ (C) = h(C) for each C ∈ P(A) is also a bijection with h∗ (C) ≈ C and hence
by Proposition 3.1 with #(h∗ (C)) = #(C) for each C ∈ P(A). It follows that
#P (A) = #P (B).
Lemma 4.2 Let A be a finite set and a ∈

/ A. Then A ∪ {a} is #-regular if and
only if #P (A) 6= X.
27
Proof If #P (A) = X then #(A ∪ {a}) ∈ #P (A) and so #(A ∪ {a}) = #(B) for
some B ⊂ A. Hence A∪{a} is not #-regular, since B is a proper subset of A∪{a}.
Suppose conversely that A ∪ {a} is not #-regular and so there exists a proper
subset B ′ of A ∪ {a} with #(B ′ ) = #(A ∪ {a}). There then exists B ⊂ A with
B ≈ B ′ and by Proposition 3.1 #(B) = #(B ′ ) = #(A ∪ {a}). Consider the set
X0 = {x ∈ X : x = #(C) for some C ⊂ A} and so in particular x0 = #(∅) ∈ X0 .
Let x = #(C) ∈ X0 with C ⊂ A; if C 6= A then f (x) = f (#(C)) = #(C ∪ {c}),
with c any element in A \ C, and thus f (x) ∈ X0 . On the other hand, if C = A
then f (x) = f (#(A)) = #(A ∪ {a}) = #(B) and again f (x) ∈ X0 , since B ⊂ A.
Thus X0 is an f -invariant subset of X containing x0 and therefore X0 = X, since
(X, f, x0 ) is minimal. But this implies that #P (A) = X.
Proof of Theorem 4.1: (1) ⇒ (5): This follows from Proposition 2.3 (1).
(5) ⇒ (4): This follows from Lemma 4.1 (1).
(4) ⇔ (3): This follows directly from Lemma 4.2, since the empty set is clearly
#-regular.
(3) ⇔ (2): Suppose # is complete, let A be a finite set and B be a subset of A
with #(B) = #(A). Then B ≈ A and hence by Proposition 2.9 B = A. Thus
A is #-regular. Now suppose conversely that each finite set is #-regular and
let A, B be finite sets with #(A) = #(B). Then by Lemma 2.3, and without
loss of generality, there exists C ⊂ A with C ≈ B and so by Proposition 3.1
#(C) = #(B) = #(A). Thus C = A, since A is #-regular, which implies that
A ≈ B. This shows that # is complete.
(3) ⇒ (1): We assume that (X, f, x0 ) is not standard and show there exists a
finite set which is not #-regular. Suppose first that x0 = f (x) for some x ∈ X.
By Lemma 3.3 there exists a finite set A with x = #(A); let a be some element not
in A. Then #(A∪{a}) = f (#(A)) = f (x) = x0 = #(∅) and ∅ is a proper subset
of A ∪ {a}; hence A ∪ {a} is not #-regular. Suppose now that f is not injective
and so there exist x, x′ ∈ X with x 6= x′ and f (x) = f (x′ ). By Lemma 3.3 there
exist finite sets A and B with x = #(A) and x′ = #(B) and by Lemma 2.3 and
Proposition 3.1 we can assume that B ⊂ A. Thus B is a proper subset of A,
since #(A) = x 6= x′ = #(B). Let a ∈ / A; then B ∪ {a} is a proper subset of
A ∪ {a} with #(B ∪ {A}) = f (#(B)) = f (x′ ) = f (x) = f (#(A)) = #(A ∪ {a}),
and hence again A ∪ {a} is not #-regular.
If the minimal counting system (X, f, x0 ) is not a Dedekind system then by

Theorem 4.1 X is finite, and we next look at the structure of such counting
systems.
A counting system (Y, g, y0) will be called z-minimal, where z ∈ Y , if the only
subset Y ′ of Y which contains y0 and is such that g(Y ′ \ {z}) ⊂ Y ′ is Y itself.
Lemma 4.3 If (Y, f, x0 ) is z-minimal then it is also minimal. Moreover, if A

is any finite set with #(A) = z (and such a set exists by Lemma 3.3) then
#P (A) = Y , and in particular Y is finite.
Proof If Y ′ is a g-invariant subset of Y then g(Y ′ \ {z}) ⊂ g(Y ′ ) ⊂ Y ′ . Thus

if also y0 ∈ Y ′ then Y ′ = Y , which shows that (Y, g, y0) is minimal. Now by
Lemma 3.3 there exists a finite set A with z = #(A), and consider the subset
Y ′ = {y ∈ Y : y = #(B) for some B ⊂ A}; in particular y0 = #(∅) ∈ Y ′ . Let
y ∈ Y ′ \ {z}; then y = #(B) for some proper subset B of A. Let a ∈ A \ B; then
B ∪{a} ⊂ A and so g(y) = g(#(B)) = #(B ∪{a}) ∈ Y ′ . Therefore Y ′ = Y , since
(Y, g, y0) is z-minimal, which implies that #P (A) = Y and in particular that Y
is finite.
Theorem 4.2 Let X be finite. Then:

(1) There exists a unique element z ∈ X such that (X, f, x0 ) is z-minimal.
(2) f maps X \ {z} bijectively onto X \ {x0 }.
(3) If A is any #-regular finite set A with #P (A) = X then z = #(A). Moreover,
a finite set A is #-regular and satisfies #P (A) = X if and only if A ≈ X \ {c}
for some c ∈ X (i.e., if and only if A has one less element than X).
Proof (1) We first show there exists a #-regular finite set A with #P (A) = X.
By Theorem 4.1 there exists a finite set A′ such that #P (A′ ) = X. Therefore the
set S = {C ∈ P(A) : #P (C) = X} is non-empty and so by Proposition 2.10 it
contains a minimal element A, which is in fact #-regular: This holds trivially if
A = ∅, and if A is non-empty and B = A\{a} with a ∈ A then #P (B) 6= X, since
B is a proper subset of A, and hence by Lemma 4.2 A = B ∪ {a} is #-regular.
Put z = #(A); we next show that (X, f, x0 ) is z-minimal, so let X ′ ⊂ X with
x0 ∈ X ′ and f (X ′ \ {z}) ⊂ X ′ , and consider K = {B ∈ P(A) : #(B) ∈ X ′ }. In
particular, ∅ ∈ K, since #(∅) = x0 ∈ X ′ . Now let B ∈ K and b ∈ A \ B; then
B is a proper subset of A, which means #(B) 6= #(A) = z, since A is #-regular.
Hence #(B) ∈ X ′ \ {z} and it follows that #(B ∪ {b}) = f (#(B)) ∈ X ′ . Thus K
is an inductive system and so K = P(A), which implies that X = #P (A) ⊂ X ′ ,
i.e., X ′ = X. Therefore (X, f, x0 ) is z-minimal.
Suppose (X, f, x0 ) is also z ′ -minimal for some z ′ ∈ X. Then, since #P (A) = X,
there exists B ⊂ A with z ′ = #(B) and by Lemma 4.3 #P (B) = X. In particular
z ∈ #P (B) and hence #(A) = z = #(C) for some C ⊂ B. But this is only
possible if C = B = A, since A is #-regular and therefore z ′ = #(B) = #(A) = z,
which shows there exists a unique element z ∈ X such that (X, f, x0 ) is z-minimal.
For the other parts we need the following facts:
Lemma 4.4 Let A be #-regular. Then:

(1) A finite set A′ with #(A′ ) = #(A) is #-regular if and only if A′ ≈ A.
(2) B is #-regular for each B ⊂ A.
(3) #(B) = #(C) holds for subsets B and C of A if and only if B ≈ C.
(4) #P (A) ≈ A ∪ {a} for any element a ∈
/ A.
Proof (1) Let A′ be a finite set with A′ ≈ A and h : A → A′ be bijective; then,

as in the proof of Lemma 4.1 (2), the induced mapping h∗ : P(A) → P(A′ ) is
also a bijection with #(h∗ (C)) = #(C) for each C ∈ P(A). It follows that A′
a #-regular. Conversely, let A and A′ be #-regular with #(A′ ) = #(A); by
Lemma 2.3 (and without loss of generality) we can assume there exists B ′ ⊂ A′
with B ′ ≈ A and hence by Proposition 3.1 with #(B ′ ) = #(A) = #(A′ ). Thus
B ′ = A′ , since A′ is #-regular, which shows that A ≈ A′ .
(2) Let B be a non-empty subset of A and let b ∈ B; put A′ = A \ {b} and
B ′ = B \ {b}. Then by Lemma 4.2 #P (A′ ) 6= X, since A′ ∪ {b} = A is #-regular,
and #P (B ′ ) ⊂ #P (A′ ), since B ′ ⊂ A′ . Thus #P (B ′ ) 6= X and so by Lemma 4.2
B = B ′ ∪ {b} is #-regular. Since the empty set is also #-regular it follows that
every subset of A is #-regular.
(3) This follows immediately from (1) and (2).
(4) Let a ∈/ A and put K = {B ∈ P(A) : #P (B) ≈ B ∪{a}}; in particular ∅ ∈ K,
since #P (∅) = {x0 } ≈ {a} = ∅∪{a}. Let B ∈ K (and so #P (B) ≈ B ∪{a}) and
let b ∈ A \ B; then by (2) B ∪ {b} is #-regular and hence #(B ∪ {b}) ∈
/ #P (B),
since every subset of B is a proper subset of B ∪ {b}. Moreover
#P (B ∪ {b}) = {x ∈ X : x = #(C) for some C ⊂ B ∪ {b}}

= {x ∈ X : x = #(C) for some C ⊂ B} ∪ {#(B ∪ {b})}
= #P (B) ∪ {#(B ∪ {b})}
since if C is a proper subset of B ∪ {b} then C ≈ C ′ for some C ′ ⊂ B, and so by

Proposition 3.1 #(C ′ ) = #(C). Therefore
#P (B ∪ {b}) = #P (B) ∪ {#(B ∪ {b}) ≈ (B ∪ {a}) ∪ {b} = (B ∪ {b}) ∪ {a} ,
i.e., B ∪ {b} ∈ K, which shows that K is an inductive system and hence that
K = P(A). In particular A ∈ K, i.e., #P (A) ≈ A ∪ {a}.
We now continue with the proof of Theorem 4.2.
(2) Let A be #-regular with #P (A), and thus z = #(A). Let x ∈ X \ {z}, and
so x = #(B) for some proper subset B of A. If a ∈ A \ B then by Lemma 4.4 (3)
f (x) = f (#(B)) = #(B ∪ {a}) 6= x0 , since B ∪ {a} 6≈ ∅, and it thus follows that
f (X \{z}) ⊂ X \{x0 }. On the other hand, if x ∈ X \{x0 } then by Lemma 4.4 (3)
x = #(B) for some non-empty B ⊂ A. Let b ∈ B and put B ′ = B \ {b}; then
x′ = #(B ′ ) ∈ X \ {z}, since B ′ is a proper subset of the #-regular set A, and
f (x′ ) = f (#(B ′ )) = #(B ′ ∪{b}) = #(B) = x. Hence f (X \{z}) = X \{x0 }. Now
let x1 , x2 ∈ X \ {z} with f (x1 ) = f (x2 ); there then exist proper subsets B1 , B2
of A with x1 = #(B1 ) and x2 = #(B2 ) and by Lemma 2.3 and Proposition 3.1,
and without loss of generality, we can assume B2 ⊂ B1 . Let a ∈ A \ B1 ; then
#(B1 ∪ {a}) = f (#(B1 )) = f (x1 ) = f (x2 ) = f (#(B2 )) = #(B2 ∪ {a})
and so by Lemma 4.4 (3) B1 ∪ {a} ≈ B2 ∪ {a}, which implies that B1 ≈ B2 ,

and then by Proposition 3.1 x1 = #(B1 ) = #(B2 ) = x2 . This shows that the
restriction of f to X \ {z} is injective.
(3) The proof of (1) shows there exists a #-regular finite set A′ with #P (A′ ) = X,
and that z = #(A) for any #-regular finite set A with #P (A) = X. Moreover,
if A is such a set then by Lemma 4.4 (4) X = #P (A) ≈ A ∪ {a} for any element
a∈ / A, and hence A ≈ X \ {c} for any c ∈ X. Finally, if A is a finite set with
A ≈ X \ {c} then A ≈ A′ (since A′ ≈ X \ {c}), hence by Lemma 4.4 (1) and
Proposition 3.1 A is #-regular and by Lemma 4.1 #P (A) = #P (A′ ) = X.
In what follows we assume that X is finite and let z be the unique element given
in Theorem 4.2 such that (X, f, x0 ) is z-minimal; z will be called the end-point
of (X, f, x0 ). Note that z 6= x0 if X 6= {x0 } (since (X, f, x0 ) being x0 -minimal
implies that X = {x0 }). Theorem 4.2 essentially characterises the behaviour of
f on the set X \ {z}; it does not say anything about the value f (z).
Proposition 4.1 (1) If f (z) = x0 then f is bijective. In this case x0 ∈ f (X).

(2) If f (z) 6= x0 then f is not injective and x0 ∈
/ f (X).
Proof This follows immediately from Theorem 4.2 (2).
z
• •
A
A
A
A
A
• x0 =f (z) A•
A
A
A
A
A
A• •
x1 =f (x0 ) x2 =f (x1 )
z
• •
A
A
A
A
A
• • • • x̆0 =f (z)=f (xt ) A•
x0 x1 =f (x0 ) xt A
A
A
A
A
A• •
x̆1 =f (x̆0 )
Let w ∈ X. Then we can modify f to obtain a new mapping fw : X → X by

changing the value at the argument z from f (z) to w (and leaving all other values
unchanged). More precisely, fw (x) = f (x) for x ∈ X \ {z} and fw (z) = w. This
gives us a new counting system (X, fw , x0 ).
Proposition 4.2 (X, fw , x0 ) is minimal with end-point z for each w ∈ X.
Proof Let X ′ ⊂ X with x0 ∈ X and fw (X ′ \ {z}) ⊂ X ′ ; then f (X ′ \ {z}) ⊂ X ′

also holds and thus X ′ = X, since (X, f, x0 ) is z-minimal. Hence (X, fw , x0 ) is
z-minimal, and it follows from Lemma 4.3 that (X, fw , x0 ) is minimal and from
Theorem 4.2 that z is the end-point of (X, fw , x0 ).
Lemma 4.5 f (x) 6= x for all x ∈ X \ {z}.
Proof Let A be a #-regular finite set with #P (A) = X, and so #(A) = z. If

x ∈ X \ {z} then x = #(B) for some proper subset B of A; let a ∈ A \ B. Then
B ∪ {a} ⊂ A and by Proposition 2.9 B 6≈ B ∪ {a}. Hence by Lemma 4.4 (3)
f (x) = f (#(B)) = #(B ∪ {a}) 6= #(B) = x.
An element x ∈ X is a fixed point of f if f (x) = x. By Lemma 4.5 f has a fixed

point if and only if z is a fixed point, and in this case z is the unique fixed point
of f . If z is a fixed point then we say that (X, f, x0 ) is a segment.
If (X, f, x0 ) has end-point z then it follows from Proposition 4.2 that (X, fz , x0 )
is a segment. Moreover, given the segment (X, fz , x0 ) and knowing the value
f (z) we can recover the original counting system (X, f, x0 ). This means that all
the properties of finite minimal counting systems can be deduced from those of
segments.
We typically use (S, σ, s0 ) (and not (X, f, x0 )) to denote a segment. A segment
(S, σ, s0 ) will be called non-trivial if S 6= {s0 }. Proposition 4.3 below shows that
for each non-empty finite set A there is essentially a unique segment (S, σ, s0 )
with S ≈ A.
Lemma 4.6 Let (S, σ, s0 ) be a segment with end-point z. Then:

(1) Let S⋄ = S ∪ {⋄}, with ⋄ an element not in S, and let σ⋄ : S⋄ → S⋄ be the
mapping with σ⋄ (s) = σ(s) for s ∈ S and σ⋄ (⋄) = s0 . Then (S⋄ , σ⋄ , ⋄) is also a
segment with end-point z.
(2) If (S, σ, s0 ) is non-trivial then the set S ′ = S \ {s0 } is σ-invariant, and if σ ′
is the restriction of σ to S ′ then (S ′ , σ ′ , σ(s0 )) is a segment with end-point z.
Proof (1) Let S⋄′ be σ⋄ -invariant and contain ⋄ and let S ′ = S⋄′ \ {⋄}. Then S ′
is an σ-invariant subset of S. (If s ∈ S ′ then σ(s) = σ⋄ (s) ∈ S⋄′ and so σ(s) ∈ S ′ ,
since σ(s) ∈ S.) Therefore S ′ = S, since s0 = σ⋄ (⋄) ∈ S ′ and (S, σ, s0 ) is minimal,
and thus S⋄′ = S⋄ . Hence (S⋄ , σ⋄ , ⋄) is minimal. Moreover, by Lemma 2.1 S⋄ is
finite, and z is a fixed point of σ⋄ . Therefore by Lemma 4.5 (S⋄ , σ⋄ , ⋄) is a segment
with end-point z.
(2) Proposition 4.1 (2) implies that s0 ∈ / σ(S) (since σ(z) = z 6= s0 ), hence
σ(S) ⊂ S \ {s0 } = S and in particular σ(S ′ ) ⊂ S ′ . Now if S0′ is σ ′ -invariant
′
and contains σ(s0 ) then S0′ ∪ {s0 } is an σ-invariant subset of S containing s0 and
so S0′ ∪ {s0 } = S, since (S, σ, s0 ) is minimal. Thus (S ′ , σ ′ , σ(s0 )) is minimal. It
follows from Lemma 4.5 that (S ′ , σ ′ , σ(s0 )) is a segment with end-point z, since
by Proposition 2.1 S ′ is finite, and z is a fixed point of σ ′ .
Proposition 4.3 (1) For each non-empty finite set A there exists a segment
(S, σ, s0 ) with S ≈ A.
(2) If (S, σ, s0 ) and (T, τ, t0 ) are segments with S ≈ T then there exists a unique
mapping p : S → T with p(s0 ) = t0 such that τ ◦ p = p ◦ σ.
Proof (1) For each finite set A let P(A) be the proposition that if A is non-empty
then there exists a segment (S, σ, s0 ) with S ≈ A.
(⋄) P(∅) holds trivially.
(⋆) Let A be a finite set for which P(A) holds and let a ∈ / A. If A = ∅ then
({a}, σ, a) with σ(a) = a is a segment and {a} ≈ A ∪ {a}, and so in this case
P(A ∪ {a}) holds. On the other hand, if A 6= ∅ then there exists a segment
(S, σ, s0 ) with S ≈ A, and if (S⋄ , σ⋄ , ⋄) is the segment given in Lemma 4.6 (1)
then S⋄ = S ∪ {⋄} ≈ A ∪ {a} and so again P(A ∪ {a}) holds.
Therefore by the induction principle for finite sets P(A) holds for every finite
set A, thus for each non-empty finite set A there exists a segment (S, σ, s0 ) with
S ≈ A.
(2) For each finite set A let P(A) be the proposition that if A is non-empty and
(S, σ, s0 ) and (T, τ, t0 ) are segments with S ≈ A ≈ T then there exists a mapping
p : S → T with p(s0 ) = t0 such that τ ◦ p = p ◦ σ.
(⋄) P(∅) holds trivially.

(⋆) Let A be a finite set for which P(A) holds and let a ∈ / A. If A = ∅ then
A ∪ {a} = {a} and it is clear that P(A ∪ {a}) holds here. Assume then that
A 6= ∅ and let (S, σ, s0 ) and (T, τ, t0 ) be segments with S ≈ A ∪ {a} ≈ T .
These segments are non-trivial and so by Lemma 4.6 (2) we have the segments
(S ′ , σ ′ , σ(s0 )) and (T ′ , τ ′ , τ (t0 )), where S ′ = S \ {s0 }, T ′ = T \ {t0 } and σ ′ and
τ ′ are respectively the restrictions of σ to S ′ and τ to T ′ . Now S ′ ≈ A ≈ T ′ and
P(A) holds and there thus exists a mapping p′ : S ′ → T ′ with p′ (σ(s0 )) = τ (t0 )
such that τ ′ ◦ p′ = p′ ◦ σ ′ . Extend p′ to a mapping p : S → T by letting p(s0 ) = t0 .
Then τ ◦ p = p ◦ σ, which implies that again P(A ∪ {a}) holds.
A, which shows that if (S, σ, s0 ) and (T, τ, t0 ) are segments with S ≈ T then
there exists a mapping p : S → T with p(s0 ) = t0 such that τ ◦ p = p ◦ σ. The
uniqueness of p follows as usual from the fact that (S, σ, s0 ) is minimal.
Let (S, σ, s0 ) and (T, τ, t0 ) be segments with S ≈ T . Then by Proposition 4.3 (2)
there exists a unique mapping p : S → T with p(s0 ) = t0 such that τ ◦ p = p ◦ σ,
and the mapping p is a bijection: Reversing the roles of (S, σ, s0 ) and (T, τ, t0 )
gives us a unique mapping q : T → S with q(t0 ) = s0 such that σ ◦ q = q ◦ τ ,
and it easily follows that q ◦ p = idS and p ◦ q = idT . Moreover, if z is the
end-point of (S, σ, s0 ) and z ′ the end-point of (T, τ, t0 ) then p(z) = z ′ , since
τ (p(z)) = p(σ(z)) = p(z) and z ′ is the unique fixed point of τ .
Finally, the following result shows how two segments can be joined together to
make a ‘longer’ segment.
Proposition 4.4 Let (S, σ, s0 ) and (T, τ, t0 ) be segments with end-points z and
z ′ respectively, and such that the sets S and T are disjoint. Put R = S ∪ T and
define a mapping ̺ : R → R by

 σ(r) if r ∈ S \ {z} ,
̺(r) = t0 if r = z ,
τ (r) if r ∈ T .

Then (R, ̺, s0 ) is a segment with end-point z ′ .
Proof Let R0 be an ̺-invariant subset of R containing s0 , and put S0 = R0 ∩ S

and T0 = R0 ∩ T . If s ∈ S0 \ {z} then σ(s) = ̺(s) ∈ R0 , and so σ(s) ∈ S0 ; on
the other hand, if z ∈ S0 then σ(z) = z ∈ S0 . Thus S0 is a σ-invariant subset
of S containing s0 and hence S0 = S, since (S, σ, s0 ) is minimal. In particular,
z ∈ S0 ⊂ R0 , which implies that t0 = ̺(z) ∈ R0 , i.e., t0 ∈ T0 . But T0 is
clearly τ -invariant and therefore T0 = T , since (T, τ, t0 ) is minimal. It follows
that R0 = S0 ∪ T0 = S ∪ T = R, which shows that (R, ̺, s0 ) is minimal. But
by Proposition 2.2 R is finite, and z ′ is a fixed point of ̺, and so by Lemma 4.5
(R, ̺, s0 ) is a segment with end-point z ′ .
5 Addition and multiplication
In this section we show how an addition and a multiplication can be defined for
any minimal counting system. These operations are associative and commutative
and can be specified by the rules (a0), (a1), (m0) and (m1) below, which are
usually employed when defining the operations on N via the Peano axioms.
In the following let (X, f, x0 ) be a minimal counting system and let # be the
unique iterator for (X, f, x0 ). We first state the main results (Theorems 5.1 and
5.2) and then develop the machinery required to prove them. In the following
section we give alternative proofs for these theorems.
Theorem 5.1 There exists a unique binary operation ⊕ on X such that
#(A) ⊕ #(B) = #(A ∪ B)
whenever A and B are disjoint finite sets. This operation ⊕ is both associative
and commutative, x ⊕ x0 = x for all x ∈ X and for all x1 , x2 ∈ X there is an
x ∈ X such that either x1 = x2 ⊕ x or x2 = x1 ⊕ x. Moreover, ⊕ is the unique
binary operation ⊕ on X such that
(a0) x ⊕ x0 = x for all x ∈ X.
(a1) x ⊕ f (x′ ) = f (x ⊕ x′ ) for all x, x′ ∈ X.
Theorem 5.2 There exists a unique binary operation ⊗ on X such that
#(A) ⊗ #(B) = #(A × B)
for all finite sets A and B. This operation ⊗ is both associative and commutative,
x ⊗ x0 = x0 and x ⊗ f (x0 ) = x for all x ∈ X (and so f (x0 ) is a multiplicative
unit) and the distributive law holds for ⊕ and ⊗: For all x, x1 , x2 ∈ X
x ⊗ (x1 ⊕ x2 ) = (x ⊗ x1 ) ⊕ (x ⊗ x2 ) .
Moreover, ⊗ is the unique binary operation on X such that

(m0) x ⊗ x0 = x0 for all x ∈ X.
(m1) x ⊗ f (x′ ) = x ⊕ (x ⊗ x′ ) for all x, x′ ∈ X.
Before beginning with the proofs of Theorems 5.1 and 5.2 we give two results
about the operation ⊕ for special cases of (X, f, x0 ).
Proposition 5.1 If f is injective then the cancellation law holds for ⊕ (meaning
that x1 = x2 whenever x1 ⊕ x = x2 ⊕ x for some x ∈ X).
35
Proof Let x1 , x2 ∈ X with x1 6= x2 and let X0 = {x ∈ X : x1 ⊕ x 6= x2 ⊕ x};

then x0 ∈ X0 , since by (a0) x1 ⊕ x0 = x1 6= x2 = x2 ⊕ x0 . Let x ∈ X0 , then by
(a1), and since f is injective, x1 ⊕ f (x) = f (x1 ⊕ x) 6= f (x2 ⊕ x) = x1 ⊕ f (x), i.e.,
f (x) ∈ X0 . Thus X0 is an f -invariant subset of X containing x0 and so X0 = X,
since (X, f, x0 ) is minimal. Hence if x1 6= x2 then x1 ⊕ x 6= x2 ⊕ x for all x ∈ X,
which shows that the cancellation law holds for X.
If (X, f, x0 ) is a Dedekind system then f is injective and so by Proposition 5.1

the cancellation law holds for ⊕. In particular, x 6= x ⊕ x′ for all x, x′ ∈ X with
x′ 6= x0 (since x = x ⊕ x0 ).
Proposition 5.2 If x0 ∈ f (X) (and so by Theorem 4.2 and Proposition 4.1 X

is finite and f is bijective) then (X, ⊕, x0 ) is an abelian group: For each x ∈ X
there exists x′ ∈ X such that x ⊕ x′ = x0 .
Proof Let z be the end-point of (X, f, x0 ), so by Proposition 4.1 (1) f (z) = x0 and
by Theorem 4.2 z = #(A) where A is any #-regular finite set with #P (A) = X.
Let x ∈ X and so x = #(B) for some B ⊂ A. Put C = (A \ B) ∪ {a}, where a is
some element not in A and let x′ = #(C). Then B and C are disjoint and hence
x ⊕ x′ = #(B) ⊕ #(C) = #(B ∪ C) = #(A ∪ {a}) = f (#(A)) = f (z) = x0 .
It is not difficult to show that the group in Proposition 5.2 is cyclic and generated
by the element f (x0 ).
We now prepare for the proof of Theorem 5.1. Among other things, the theorem
states that there exists a binary operation ⊕ on X such that
(α) #(A) ⊕ #(B) = #(A ∪ B) whenever A and B are disjoint finite sets.
To understand what this implies it is convenient to introduce some notation. For
pairs of finite sets let us write (A, B) ≈ (A′ , B ′ ) if A ≈ A′ and B ≈ B ′ , and employ
#(A, B) as shorthand for the element (#(A), #(B)) of X × X. A pair (A, B) will
be called disjoint if the sets A and B are disjoint. Consider disjoint pairs (A, B)
and (A′ , B ′ ) with #(A, B) = #(A′ , B ′ ). Then #(A) ⊕ #(B) = #(A′ ) ⊕ #(B ′ )
(regardless of how ⊕ is defined) which shows that if ⊕ exists satisfying (α) then
the following must hold:
(β) If (A, B) and (A′ , B ′ ) are disjoint pairs with #(A, B) = #(A′ , B ′ ) then
#(A ∪ B) = #(A′ ∪ B ′ ).
Conversely, (β) is sufficient to ensure the existence of an operation ⊕ satisfying
(α). To see this is the case, first note the following:
Lemma 5.1 For all elements x, x′ ∈ X there exists a disjoint pair (A, B) with
(x, x′ ) = #(A, B).
Proof By Lemma 3.3 there exists a pair (C, D) with #(C, D) = (x, x′ ), put
A = C × {⊳} and B = D × {⊲}, where ⊳ and ⊲ are distinct elements. Then (A, B)
is a disjoint pair with (A, B) ≈ (C, D) and so #(A, B) = #(C, D) = (x, x′ ).
Now suppose that (β) holds and consider disjoint pairs (A, B) and (A′ , B ′ ) with
#(A, B) = (x, x′ ) = #(A′ , B ′ ). Then #(A ∪ B) = #(A′ ∪ B ′ ) and we can thus
define a binary operation ⊕ on X by letting x ⊕ x′ = #(A ∪ B), where (A, B) is
any disjoint pair with #(A, B) = (x, x′ ). In particular, #(A) ⊕ #(B) = #(A ∪ B)
whenever A and B are disjoint, i.e., (α) holds.
The main step in the proof of Theorem 5.1 will be to establish that (β) holds.
For a Dedekind system (X, f, x0 ) this is not a problem, since if (A, B) and
(A′ , B ′ ) are disjoint pairs with #(A, B) = #(A′ , B ′ ) then Theorem 3.2 implies
that (A, B) ≈ (A′ , B ′ ), from which it easily follows that A∪B ≈ A′ ∪B ′ and hence
by Proposition 3.1 that #(A ∪ B) = #(A′ ∪ B ′ ). Establishing (β) for a general
minimal counting system involves more work (which is done in Lemma 5.2).
Once it is known that an operation ⊕ exists satisfying (α) then the remaining
properties of ⊕ listed in Theorem 5.1 follow from the corresponding properties of
the union operation ∪ (for example, that it is associative and commutative).
Theorem 5.2 will be dealt with in a similar manner. Here we are looking for an
operation ⊗ on X such that
(γ) #(A) ⊗ #(B) = #(A × B) whenever A and B are finite sets
and the condition corresponding to (β) is clearly the following:
(δ) If (A, B) and (A′ , B ′ ) are (arbitrary) pairs with #(A, B) = #(A′ , B ′ ) then
#(A × B) = #(A′ × B ′ ).
If (δ) holds and (A, B) and (A′ , B ′ ) are pairs with #(A, B) = (x, x′ ) = #(A′ , B ′ )
then #(A × B) = #(A′ × B ′ ) and so we can define a binary operation ⊗ on X by
letting x ⊗ x′ = #(A × B), where (A, B) is any pair with #(A, B) = (x, x′ ). In
particular, #(A) ⊗ #(B) = #(A × B) for all finite sets A and B, i.e., (γ) holds.
Again, there is no problem to show that (δ) holds for a Dedekind system, since if
(A, B) ≈ (A′ , B ′ ) then A × B ≈ A′ × B ′ . The general minimal counting system
is dealt with in Lemma 5.3.
As with the addition ⊕, once it is known that an operation ⊗ exists satisfying
(γ) then the remaining properties of ⊗ listed in Theorem 5.2 follow from the
corresponding properties of the cartesian product operation × (for example, that
it is associative and commutative) and the relationship between ∪ and ×.
The following shows that condition (β) holds.
Lemma 5.2 If (A, B) and (A′ , B ′ ) are disjoint pairs with #(A, B) = #(A′ , B ′ )
then #(A ∪ B) = #(A′ ∪ B ′ ).
Proof Start by considering a finite sets A and A′ with #(A) = #(A′ ) and for
each finite set B let P(B) be the proposition that if B is disjoint from A and A′
then #(A ∪ B) = #(A′ ∪ B).
(⋄) P(∅) holds, since #(A ∪ ∅) = #(A) = #(A′ ) = #(A′ ∪ ∅).
(⋆) Let B be a finite set for which P(B) holds and b ∈ / B. If B ∪ {b} is not
disjoint from A and A′ then P(B ∪ {b}) holds trivially, and so we can assume
that this is not the case. In particular, B is then disjoint from A and A′ and so
#(A ∪ B) = #(A′ ∪ B), and also b ∈ / A ∪ B and b ∈ / A′ ∪ B. Thus
#(A ∪ (B ∪ {b})) = #((A ∪ B) ∪ {b}) = f (#(A ∪ B))

= f (#(A′ ∪ B)) = #((A′ ∪ B) ∪ {b}) = #(A′ ∪ (B ∪ {b}))
and hence P(B ∪ {b}) holds.

Therefore by the induction principle for finite sets P(B) holds for every finite set
B, thus if A and A′ are finite sets with #(A) = #(A′ ) then #(A∪B) = #(A′ ∪B)
for every finite set B disjoint from A and A′ .
For each set C and each element d put Cd = C × {d} (and so Cd ≈ C). Now
let (A, B) and (A′ , B ′ ) be disjoint pairs with #(A, B) = #(A′ , B ′ ), and choose
distinct elements ⊳ and ⊲; then #(A ∪ B) = #(A⊳ ∪ B⊳ ) (since A ∪ B ≈ A⊳ ∪ B⊳ ),
#(A′⊲ ∪ B⊲′ ) = #(A′ ∪ B ′ ) (since A′⊲ ∪ B⊲′ ≈ A′ ∪ B ′ ), #(A⊳ ) = #(A′⊲ ) (since
A⊳ ≈ A and A′ ≈ A′⊲ )) and #(B⊳ ) = #(B⊲′ ) (since B⊳ ≈ B and B ′ ≈ B⊲′ ), which
gives us the following data:
– #(A ∪ B) = #(A⊳ ∪ B⊳ ),
– #(A⊳ ) = #(A′⊲ ) and B⊳ is disjoint from both A⊳ and A′⊲ ,
– #(B⊳ ) = #(B⊲′ ) and A′⊲ is disjoint from both B⊳ and B⊲′ ,
– #(A′⊲ ∪ B⊲′ ) = #(A′ ∪ B ′ ).
Thus by two applications of the first part of the proof
#(A ∪ B) = #(A⊳ ∪ B⊳ ) = #(A′⊲ ∪ B⊳ )

= #(B⊳ ∪ A′⊲ ) = #(B⊲′ ∪ A′⊲ ) = #(A′⊲ ∪ B⊲′ ) = #(A′ ∪ B ′ ) .
Proof of Theorem 5.1: Let x, x′ ∈ X. By Lemma 5.1 there exists a disjoint pair
(A, B) with (x, x′ ) = #(A, B) and if (A′ , B ′ ) is another disjoint pair (A′ , B ′ ) with
(x, x′ ) = #(A′ , B ′ ) then #(A, B) = #(A′ , B ′ ) and so Lemma 5.2 implies that
#(A ∪ B) = #(A′ ∪ B ′ ). We can therefore define x ⊕ x′ to be #(A ∪ B), where
(A, B) is any disjoint pair with (x, x′ ) = #(A, B). In particular, if A and B
are disjoint finite sets then #(A) ⊕ #(x′ ) = #(A ∪ B), since (A, B) is a disjoint
pair for (#(A), #(B)). Moreover, ⊕ is uniquely determined by this requirement:
Consider any binary operation ⊕′ on X for which #(A) ⊕′ #(B) = #(A ∪ B)
whenever A and B are disjoint finite sets. If C and D are any finite sets then there
exists a disjoint pair (A, B) with (A, B) ≈ (C, D) and hence by Proposition 3.1
#(C) ⊕′ #(D) = #(A) ⊕′ #(B) = #(A ∪ B) = #(A) ⊕ #(B) = #(C) ⊕ #(D)
and so by Lemma 3.3 ⊕′ = ⊕.
We show that ⊕ is associative and commutative: Let x1 , x2 , x3 ∈ X; then by
Lemma 3.3 there exists finite sets A1 , A2 and A3 with x1 = #(A1 ), x2 = #(A2 )
and x3 = #(A3 ). Let ⊳, ⊲ and ⋄ be distinct elements and put B1 = A1 ∪ {⊳},
B2 = A2 ∪ {⊲} and B3 = A3 ∪ {⋄}. Then B1 , B2 , B3 are disjoint with Bj ≈ Aj
and hence with #(Bj ) = #(Aj ) for j = 1, 2, 3. Therefore
(x1 ⊕ x2 ) ⊕ x3 = (#(B1 ) ⊕ #(B2 )) ⊕ #(B3 ) = #(B1 ∪ B2 ) ⊕ #(B3 )
= #((B1 ∪ B2 ) ∪ B3 ) = #(B1 ∪ (B2 ∪ B3 ))
= #(B1 ) ⊕ #(B2 ∪ B3 ) = #(B1 ) ⊕ (#(B2 ) ⊕ #(B3 ))
= x1 ⊕ (x2 ⊕ x3 ) .
In the same way ⊕ is commutative. Let x1 , x2 ∈ X; then there exists a disjoint
pair (A1 , A2 ) with (x1 , x2 ) = #(A1 , A2 ). Thus
x1 ⊕ x2 = #(A1 ) ⊕ #(A2 )
= #(A1 ∪ A2 ) = #(A2 ∪ A1 ) = #(A2 ) ⊕ #(A1 ) = x2 ⊕ x1 .
Moreover, if x ∈ X and A is a finite set with x = #(A) then
x ⊕ x0 = #(A) ⊕ #(∅) = #(A ∪ ∅) = #(A) = x ,
and so x ⊕ x0 = x for all x ∈ X.
Let x1 , x2 ∈ X, and so by Lemma 3.3 there exist finite sets A and B such that
x1 = #(A) and x2 = #(B). By Proposition 2.8 there either exists an injective
mapping g : A → B or an injective mapping h : B → A. Assume the former holds
and put B ′ = g(A) and C = B \ B ′ . Then B ′ and C are disjoint and B = B ′ ∪ C;
moreover, A ≈ B ′ (since g considered as a mapping from A to B ′ is a bijection)
and so by Proposition 3.1 #(A) = #(B ′ ). Thus, putting x = #(C), it follows
that x2 = #(B) = #(B ′ ∪ C) = #(B ′ ) ⊕ #(C) = #(A) ⊕ #(C) = x1 ⊕ x. On
the other hand, if there exists an injective mapping h : B → A then the same
argument shows that x1 = x2 ⊕ x for some x ∈ X.
Now to (a0) and (a1), and we have seen above that (a0) holds. Let x, x′ ∈ X, so
by Lemma 5.1 there exists a disjoint pair (A, B) for (x, x′ ). Let b ∈
/ A ∪ B; then
x ⊕ f (x′ ) = #(A) ⊕ f (#(B)) = #(A) ⊕ #(B ∪ {b}) = #(A ∪ (B ∪ {b}))
= #((A ∪ B) ∪ {b}) = f (#(A ∪ B)) = f (#(A) ⊕ #(B)) = f (x ⊕ x′ )
and hence (a1) holds. If ⊕′ is another binary operation on X satisfying (a0) and
(a1) then it is easy to see that X0 = {x′ ∈ X : x ⊕′ x′ = x ⊕ x′ for all x ∈ X}
is an f -invariant subset of X containing x0 . Hence X0 = X, since (X, f, x0 ) is
minimal, which implies that ⊕′ = ⊕.
We prepare for the proof of Theorem 5.2 by showing that (δ) holds.
Lemma 5.3 If (A, B) and (A′ , B ′ ) are any pairs with #(A, B) = #(A′ , B ′ ) then
#(A × B) = #(A′ × B ′ ).
Proof Start by considering finite sets A and A′ with #(A) = #(A′ ) and for each
finite set B let P(B) be the proposition that #(A × B) = #(A′ × B).
(⋄) P(∅) holds, since A × ∅ = ∅ = A′ × ∅ and so #(A × ∅) = #(A′ × ∅).
(⋆) Let B be a finite set for which P(B) holds and let b ∈
/ B. Then the sets
A × B and A × {b} are disjoint and A × (B ∪ {b}) = (A × B) ∪ (A × {b}). It thus
follows that
#(A × (B ∪ {b})) = #((A × B) ∪ (A × {b}) = #(A × B) ⊕ #(A × {b})
and in the same way #(A′ × (B ∪ {b})) = #(A′ × B) ⊕ #(A′ × {b}). Clearly
A × {b} ≈ A and so by Proposition 3.1 #(A × {b}) = #(A), and in the same way
#(A′ × {b}) = #(A′ ). Therefore
#(A × (B ∪ {b})) = #(A × B) ⊕ #(A × {b}) = #(A′ × B) ⊕ #(A)

= #(A′ × B) ⊕ #(A′ × {b}) = #(A′ × (B ∪ {b}))
and so P(B ∪ {b}) holds.

Therefore by the induction principle for finite sets P(B) holds for every finite set
B, thus if #(A) = #(A′ ) then #(A × B) = #(A′ × B) for every finite set B.
Now let (A, B) and (A′ , B ′ ) be any pairs with #(A, B) = #(A′ , B ′ ). Then clearly
we have A′ × B ≈ B × A′ and A′ × B ′ ≈ B ′ × A′ and therefore by Proposition 3.1
#(A′ × B) = #(B × A′ ) and #(A′ × B ′ ) = #(B ′ × A′ ). Hence by the first part
#(A × B) = #(A′ × B) = #(B × A′ ) = #(B ′ × A′ ) = #(A′ × B ′ ) .
Proof of Theorem 5.2: Let x, x′ ∈ X. By Lemma 3.3 there exist finite sets A
and B with x = #(A) and x′ = #(B) and if x = #(A′ ) and x′ = #(B ′ ) for
some other finite sets A′ and B ′ then #(A′ ) = #(A) and #(B ′ ) = #(B) and
hence by Lemma 5.3 #(A × B) = #(A′ × B ′ ). We can thus define x ⊗ x′ to be
#(A × B), where A and B are any finite sets with x = #(A) and x′ = #(B).
Then #(A) ⊗ #(B) = #(A × B) for all finite sets A and B, a requirement which
clearly determines ⊗ uniquely.
We show that ⊗ is associative and commutative: Let x1 , x2 , x3 ∈ X; then by
Lemma 3.3 there exists finite sets A1 , A2 , A3 with x1 = #(A1 ), x2 = #(A2 ) and
x3 = #(A3 ). Now it is easy to check that (A1 × A2 ) × A3 ≈ A1 × (A2 × A3 ) and
so by Proposition 3.1 #((A1 × A2 ) × A3 ) = #(A1 × (A2 × A3 )). Therefore
(x1 ⊗ x2 ) ⊗ x3 = (#(A1 ) ⊗ #(A2 )) ⊗ #(A3 ) = #(A1 × A2 ) ⊗ #(A3 )

= #((A1 × A2 ) × A3 ) = #(A1 × (A2 × A3 ))
= #(A1 ) ⊗ #(A2 × A3 ) = #(A1 ) ⊗ (#(A2 ) ⊗ #(A3 ))
= x1 ⊗ (x2 ⊗ x3 )
which shows ⊗ is associative. Let x1 , x2 ∈ X; by Lemma 3.3 there exist finite

sets A1 and A2 with x1 = #(A1 ) and x2 = #(A2 ). Then by Proposition 3.1
#(A1 × A2 ) = #(A2 × A1 ), since clearly A1 × A2 ≈ A2 × A1 . Thus
x1 ⊗ x2 = #(A1 ) ⊗ #(A2 )
= #(A1 × A2 ) = #(A2 × A1 ) = #(A2 ) ⊗ #(A1 ) = x2 ⊗ x1
which shows that ⊗ is also commutative.

Let x ∈ X, so by Lemma 3.3 there exists a finite set A with x = #(A). Then
x ⊗ x0 = #(A) ⊗ #(∅) = #(A × ∅) = #(∅) = x0 .
Moreover, if a is any element then by Proposition 3.1 #(A × {a}) = #(A), since
A × {a} ≈ A, and hence
x ⊗ f (x0 ) = #(A) ⊗ f (#(∅)) = #(A) ⊗ #(∅ ∪ {a})

= #(A) ⊗ #({a}) = #(A × {a}) = #(A) = x .
Thus x ⊗ x0 = x0 and x ⊗ f (x0 ) = x for each x ∈ X (and note that the first
statement is (m0)).
Now for the distributive law. Let x, x1 , x2 ∈ X. There exists a finite set A with
x = #(A) and a disjoint pair (B, C) with (x1 , x2 ) = #(B, C). Then A × (B ∪ C)
is the disjoint union of A × B and A × C and thus
(x ⊗ x1 ) ⊕ (x ⊗ x2 ) = (#(A) ⊗ #(B)) ⊕ (#(A) ⊗ #(C))

= #(A × B) ⊕ #(A × C) = #((A × B) ∪ (A × C))
= #(A × (B ∪ C)) = #(A) ⊗ #(B ∪ C)
= #(A) ⊗ (#(B) ⊕ #(C)) = x ⊗ (x1 ⊕ x2 ) .
We have already seen that (m0) holds and, since f (x0 ) is a unit, (m1) is a special
case of the distributive law: Let x, x′ ∈ X; then by (a0) and (a1) and since ⊕ is
commutative it follows that f (x′ ) = f (x′ ⊕ x0 ) = x′ ⊕ f (x0 ) = f (x0 ) ⊕ x′ , and

hence x ⊗ f (x′ ) = x ⊗ (f (x0 ) ⊕ x′ ) = (x ⊗ f (x0 )) ⊕ (x ⊗ x′ ) = x ⊕ (x ⊗ x′ ), which is
(m1). Finally, if ⊗′ is another binary operation satisfying (m0) and (m1) then it
is easy to see that X0 = {x′ ∈ X : x ⊗′ x′ = x ⊗ x′ for all x ∈ X} is an f -invariant
subset of X containing x0 . Hence X0 = X, since (X, f, x0 ) is minimal, which
implies that ⊗′ = ⊗.
We end the section by looking at the operation of exponentiation. Here we have

to be more careful: For example, 2 · 2 · 2 = 2 in Z3 and so 23 is not well-defined
if the exponent 3 is considered as an element of Z3 (since we would also have to
have 20 = 1). However, 23 does make sense if 2 is considered as an element of Z3
and the exponent 3 as an element of N.
In general we will see that if (Y, g, y0) is a Dedekind system then we can define
an element of X which is ‘x to the power of y’ for each x ∈ X and each y ∈ Y
and this operation has the properties which might be expected.
In what follows let (Y, g, y0) be a Dedekind system and let #′ be the unique
iterator for (Y, g, y0). (As before (X, f, x0 ) is assumed to be minimal with unique
iterator #.)
Theorem 5.3 There exists a unique operation ↑ : X × Y → X such that
#(A) ↑ #′ (B) = #(AB )
for all finite sets A and B. This operation ↑ satisfies
x ↑ (y1 ⊕ y2 ) = (x ↑ y1 ) ⊗ (x ↑ y2 )
for all x ∈ X and all y1 , y2 ∈ Y and
(x1 ⊗ x2 ) ↑ y = (x1 ↑ y) ⊗ (x2 ↑ y)
for all x1 , x2 ∈ X and y ∈ Y . Moreover, ↑ is the unique operation such that

(e0) x ↑ y0 = f (x0 ) for all x ∈ X.
(e1) x ↑ g(y) = x ⊗ (x ↑ y) for all x ∈ X, y ∈ Y .
Lemma 5.4 If B, C are finite sets with #(B) = #(C) then for all finite sets A
we have #(B A ) = #(C A ).
Proof Let B, C be finite sets with #(B) = #(C) and for each finite set A let
P(A) be the proposition that #(B A ) = #(C A ).
(⋄) P(∅) holds since #(B ∅ ) = #(C ∅ ). (For any set X the set X ∅ consists of
the single element {∅}.)
/ A. Now #(B A ) = #(C A )

(⋆) Let A be a finite set for which P(A) holds and a ∈
(since P(A) holds) and #(B) = #(C); hence by Proposition 3.1 and Lemma 5.3
#(B A∪{a} ) = #(B A × B) = #(C A × C) = #(C A∪{a} )
(since E A∪{a} ≈ E A × E for each set E), and so P(A ∪ {a}) holds.
A, and so #(B A ) = #(C A ) holds for all finite sets A.
Remark: If B, C are finite sets with #(B) = #(C) then #(AB ) = #(AC ) does
not hold in general for a finite set A.
Proof of Theorem 5.3: Let A1 , A2 , B1 , B2 be finite sets with #(A1 ) = #(A2 ) and
#′ (B1 ) = #′ (B2 ); then by Lemma 5.4 #(AB B1
1 ) = #(A2 ) and by Theorem 3.2
1
B1 ≈ B2 . Since B1 ≈ B2 it follows that AB B2

2 ≈ A2 and then by Proposition 3.1
1
#(A2 ) = #(A2 ). This shows that #(A1 ) = #(AB

B1 B2 B1
2 ). Therefore by Lemma 3.3
2
we can define x ↑ y to be #(AB ), where A and B are any finite sets with x = #(A)
and y = #′ (B). Then
#(A) ↑ #′ (B) = #(AB )
for all finite sets A and B and this requirement clearly determines ↑ uniquely.
Let x ∈ X and y1 , y2 ∈ Y ; then by Lemma 5.1 there exists a disjoint pair
(B1 , B2 ) with (y1 , y2 ) = #′ (B1 , B2 ) and by Lemma 3.3 there exists a finite set A
with x = #(A). Moreover, it is easily checked that AB1 ∪B2 ≈ AB1 × AB2 and thus
by Proposition 3.1
x ↑ (y1 ⊕ y2 ) = #(A) ↑ (#′ (B1 ) ⊕ #(B2 ))
= #(A) ↑ #′ (B1 ∪ B2 ) = #(AB1 ∪B2 ) = #(AB1 × AB2 )
= #(AB1 ) ⊗ #(AB2 ) = (x ↑ y1 ) ⊗ (x ↑ y2 ) .
Now let x1 , x2 ∈ X and y ∈ Y . By Lemma 3.3 there exist finite sets A1 , A2 and
B such that x1 = #(A1 ), x2 = #(A2 ) and y = #′ (B) and (A1 ×A2 )B ≈ AB B
1 ×A2 .
Thus by Proposition 3.1
(x1 ⊗ x2 ) ↑ y = (#(A1 ) ⊗ #(A2 )) ↑ #′ (B)
= #(A1 × A2 ) ↑ #′ (B) = #((A1 × A2 )B ) = #(AB B
1 × A2 )
= #(AB B
1 ) ⊗ #(A2 ) = (x1 ↑ y) ⊗ (x2 ↑ y) .
It remains to consider the properties (e0) and (e1). Now for each finite set A we
have #(A) ↑ #′ (∅) = #(A∅ ) = #({∅}) = f (x0 ) and hence x ↑ y0 = f (x0 ) for
each x ∈ X, i.e., (e0) holds. Let A and B be finite sets and let b ∈ / B. Then,
B∪{b} B
since A ≈ A × A , it follows from Proposition 3.1 that
#(A) ↑ g(#′ (B)) = #(A) ↑ #′ (B ∪ {b}) = #(AB∪{b} ) = #(A × AB )
= #(A) ⊗ #(AB ) = #(A) ⊗ (#(A) ↑ #′ (B))
and this shows x ↑ g(y) = x ⊗ (x ↑ y) for all x ∈ X, y ∈ Y , i.e., (e1) holds.

Finally, if ↑′ is another operation satisfying (e0) and (e1) then
Y0 = {y ∈ Y : x ↑′ y = x ↑ y for all x ∈ X}
is a g-invariant subset of Y containing y0 . Therefore Y0 = Y , since (Y, g, y0) is

minimal, which implies that ↑′ = ↑.
6 Another take on addition and multiplication
In the following again let (X, f, x0 ) be a minimal counting system and let # be
the unique iterator for (X, f, x0 ). In this section we give alternative proofs for
Theorems 5.1 and 5.2.
In Section 5 only the single iterator # was used. Here we make use of a family
of iterators {#x : x ∈ X}, which arise as follows: For each x ∈ X there is the
counting system (X, f, x) (which will usually not be minimal) and by Theorem 3.1
there is then the unique iterator for (X, f, x), which we denote by #x . Thus
#x (∅) = x and #x (A ∪ {a}) = f (#x (A)) whenever A is a finite set and a ∈ / A.
In particular we have # = #x0 . For each finite set A the element #x (A) can be
thought of as the ‘number’ obtained by counting the elements in A, but starting
with x instead of x0 . Now it is more convenient to repackage the information given
by the iterators #A , x ∈ X, by introducing for each finite set A the mapping
fA : X → X with fA (x) = #x (A) for all x ∈ X, and so #(A) = #x0 (A) = fA (x0 ).
Consider disjoint finite sets A and B; then #(A ∪ B) gives the ‘number’ of
elements in A ∪ B. But this ‘number’ can also be determined by first counting
the elements in B, giving the result #(B), and then counting the elements in A,
but starting the counting with #(B) and not with x0 . The result is thus #z (A),
where z = #(B), and #z (A) = fA (z) = fA (#(B)) = fA (fB (x0 )) = (fA ◦ fB )(x0 ),
and so we would expect that #(A ∪ B) = (fA ◦ fB )(x0 ). But if ⊕ is the operation
given by Theorem 5.1 then #(A ∪ B) = #(A) ⊕ #(B) = fA (x0 ) ⊕ fB (x0 ), which
suggests that the following should hold:
(µ) fA (x0 ) ⊕ fB (x0 ) = (fA ◦ fB )(x0 ) whenever A and B are disjoint finite sets.
It will be seen later that (µ) does hold. What is perhaps more important, though,
is that (µ) can actually be used to define ⊕, as we now explain.
Denote by TX the set of all mappings from X to itself. Then (TX , ◦, idX ), where
◦ is functional composition and idX : X → X is the identity mapping, is a
monoid. (A monoid is any triple (M, •, e) consisting of a set M, an associative
operation • on X and a unit element e ∈ M satisfying a • e = e • a = a for all
a ∈ M.) Lemma 6.6 shows that
Mf = {u ∈ TX : u = fA for some finite set A}
is a submonoid of (TX , ◦, idX ), meaning that idX ∈ Mf and u1 ◦ u2 ∈ Mf for all
u1 , u2 ∈ Mf , and that this submonoid is commutative, i.e., u1 ◦ u2 = u2 ◦ u1 for
all u1 , u2 ∈ Mf . (The monoid (TX , ◦, idX ) itself is not commutative except when
X = {x0 }.)
Let Φx0 : Mf → X be the mapping with Φx0 (u) = u(x0 ) for each u ∈ Mf , and
so in particular Φx0 (fA ) = fA (x0 ) = #(A) for each finite set A. Lemma 6.7 will
show that Φx0 is a bijection, and therefore there exists a unique operation ⊕ on
X such that
45
(ν) Φx0 (u) ⊕ Φx0 (v) = Φx0 (u ◦ v) for all u, v ∈ Mf .

This is how ⊕ will be defined below. Note that if A and B are (not necessarily
disjoint) finite sets then by (ν)
fA (x0 ) ⊕ fB (x0 ) = Φx0 (fA ) ⊕ Φx0 (fB ) = Φx0 (fA ◦ fB ) = (fA ◦ fB )(x0 )
and so in particular (µ) holds.
We now give the details of the approach outlined above.
Lemma 6.1 The assignment A 7→ fA satisfies f∅ = idX and fA∪{a} = f ◦ fA

whenever A is a finite set and a ∈
/ A. Moreover, it is uniquely determined by
these requirements.
Proof We have f∅ (x) = #x (∅) = x = idX (x) for all x ∈ X, and so f∅ = idX .
Moreover, if A is a finite set and a ∈
/ A then
fA∪{a} (x) = #x (A ∪ {a}) = f (#x (A)) = f (fA (x)) = (f ◦ fA )(x)
for all x ∈ X and hence fA∪{a} = f ◦ fA . Finally, consider a further assignment
A 7→ fA′ with f∅′ = idX and such that fA∪{a}
′
= f ◦ fA′ whenever A is a finite set
and a ∈ / A. For each finite set A let P(A) be the proposition that fA′ = fA .
(⋄) P(∅) holds since f∅′ = idX = f∅ .
(⋆) Let A be a finite set for which P(A) holds (and so fA′ = fA ) and let a ∈
/ A.
′ ′
Then fA∪{a} = f ◦ fA = f ◦ fA = fA∪{a} and so P(A ∪ {a}) holds.
A, which means that fA′ = fA for each finite set A.
The assignment A 7→ fA will be called the f -iterator. Note that in particular
f{a} = f for each element a, since f{a} = f∅∪{a} = f ◦ f∅ = f ◦ idX = f .
Lemma 6.2 f ◦ fA = fA ◦ f for each finite set A.
Proof For each finite set A let P(A) be the proposition that f ◦ fA = fA ◦ f .
(⋄) P(∅) holds since f ◦ f∅ = f ◦ idX = f = idX ◦ f = f∅ ◦ f .
/ A. Then
f ◦ fA∪{a} = f ◦ f ◦ fA = f ◦ fA ◦ f = fA∪{a} ◦ f
and so P(A ∪ {a}) holds.
A and hence f ◦ fA = fA ◦ f for each finite set A.
The next result establishes an important relationship between the f -iterator and
the iterator # for (X, f, x0 ).
Proposition 6.1 If A and B are finite sets then fA = fB holds if and only if
#(A) = #(B).
Proof By definition #(C) = fC (x0 ) for each finite set C, and so #(A) = #(B)
whenever fA = fB . Suppose conversely that #(A) = #(B) and consider the set
X0 = {x ∈ X : fA (x) = fB (x)}. Then X0 is f -invariant, since if x ∈ X0 then by
Lemma 6.2 fA (f (x)) = f (fA (x)) = f (fB (x)) = fB (f (x)), i.e., f (x) ∈ X0 . Also
x0 ∈ X0 , since fA (x0 ) = #(A) = #(B) = fB (x0 ). Hence X0 = X, since (X, f, x0 )
is minimal. This shows that fA (x) = fB (x) for all x ∈ X, i.e., fA = fB .
There is another way of obtaining the f -iterator: Consider the counting system
(TX , f∗ , idX ), where f∗ : TX → TX is defined by f∗ (h) = f ◦ h for all h ∈ TX .
Lemma 6.3 If #∗ is the unique iterator for (TX , f∗ , idX ) then #∗ (A) = fA for
each finite set A.
Proof By definition #∗ (∅) = idX and #∗ (A ∪ {a}) = f∗ (#∗ (A)) = f ◦ #∗ (A)

whenever A is a finite set and a ∈ / A. Thus by the uniqueness in Lemma 6.1
#∗ (A) = fA for each finite set A.
Proposition 6.2 If A and B are finite sets with A ≈ B then fA = fB .
Proof This follows from Proposition 3.1 (applied to #∗ ) and Lemma 6.3.
Lemma 6.4 If A and B are disjoint finite sets then fA∪B = fA ◦ fB .
Proof For each finite set A let P(A) be the proposition that fA∪B = fA ◦ fB for
each finite set B disjoint from A.
(⋄) P(∅) holds since f∅∪B = fB = idX ◦ fB = f∅ ◦ fB for each finite set B.
/ A. Consider a finite
set B disjoint from A ∪ {a}; then B is disjoint from A and so fA∪B = fA ◦ fB .
Moreover a ∈/ A ∪ B and hence
f(A∪{a})∪B = f(A∪B)∪{a} = f ◦ fA∪B = f ◦ fA ◦ fB = fA∪{a} ◦ fB .
This shows that P(A ∪ {a}) holds.

A, which means fA∪B = fA ◦ fB whenever A and B are disjoint finite sets.
Lemma 6.5 (1) If f is bijective then fA is bijective for each finite set A.
(2) If f is injective then fA is also injective for each finite set A.
Proof (1) For each finite set A let P(A) be the proposition that fA is bijective.
(⋄) P(∅) holds since f∅ = idX is bijective.
/ A. Then fA∪{a} = f ◦fA ,
as the composition of two bijective mappings, is itself bijective, and so P(A∪{a})
holds.
A, which means that fA is bijective for each finite set A.
(2) Just replace ‘bijective’ by ‘injective’ in (1).
As above let Mf = {u ∈ TX : u = fA for some finite set A}. Thus in particular

idX ∈ Mf , since idX = f∅ , and f ∈ Mf , since f = f{a} for each element a.
Moreover, if f is injective (resp. bijective) then by Lemma 6.5 each element in
Mf is injective (resp. bijective).
Lemma 6.6 For all u1 , u2 ∈ Mf we have u1 ◦ u2 ∈ Mf and u1 ◦ u2 = u2 ◦ u1 .

(Since also idX ∈ Mf this means that Mf is a commutative submonoid of the
monoid (TX , ◦, idX ).)
Proof Let u1 , u2 ∈ Mf and so there exist finite sets A and B with u1 = fA and
u2 = fB . There then exists a disjoint pair (A′ , B ′ ) with (A′ , B ′ ) ≈ (A, B) and
hence by Proposition 6.2 and Lemma 6.4
u1 ◦ u2 = fA ◦ fB = fA′ ◦ fB′ = fA′ ∪B′ = fB′ ∪A′ = fB′ ◦ fA′ = fB ◦ fA = u2 ◦ u1 ,
i.e., u1 ◦ u2 = u2 ◦ u1 . Moreover, since u1 ◦ u2 = fA′ ∪B′ and fA′ ∪B′ ∈ Mf , this also
shows that u1 ◦ u2 ∈ Mf .
As above let Φx0 : Mf → X be the mapping with Φx0 (u) = u(x0 ) for all u ∈ Mf .
Then Φx0 (idX ) = x0 and Φx0 (fA ) = fA (x0 ) = #(A) for each finite set A. An
important property of Φx0 is that
(♯) u(Φx0 (v)) = Φx0 (u ◦ v) for all u, v ∈ Mf ,
which holds since u(Φx0 (v)) = u(v(x0 )) = (u ◦ v)(x0 ) = Φx0 (u ◦ v). The special
case of this with u = f gives us f (Φx0 (v)) = Φx0 (f ◦ v) for all v ∈ Mf .
Lemma 6.7 The mapping Φx0 is a bijection.

Proof If x ∈ X then by Lemma 3.3 there exists a finite set A with x = #(A) and
it follows that Φx0 (fA ) = fA (x0 ) = #(A) = x. Thus Φx0 is surjective. Now let
u1 , u2 ∈ Mf with Φx0 (u1 ) = Φx0 (u2). By the definition of Mf there exist finite
sets A and B with u1 = fA and u2 = fB , and hence
#(A) = Φx0 (fA ) = Φx0 (u1 ) = Φx0 (u2 ) = Φx0 (fB ) = #(B) .
Therefore by Proposition 6.1 fA = fB , i.e., u1 = u2 , which shows that Φx0 is also
injective.
Proof of Theorem 5.1: Since Φx0 : Mf → X is a bijection there clearly exists a

unique binary relation ⊕ on X such that
Φx0 (u1 ) ⊕ Φx0 (u2 ) = Φx0 (u1 ◦ u2 )
for all u1 , u2 ∈ Mf . The operation ⊕ is associative since ◦ has this property: If
x1 , x2 , x3 ∈ X and u1 , u2 , u3 ∈ Mf are such that xj = Φx0 (uj ) for each j then
(x1 ⊕ x2 ) ⊕ x3 = (Φx0 (u1 ) ⊕ Φx0 (u2 )) ⊕ Φx0 (u3 )
= Φx0 (u1 ◦ u2 ) ⊕ Φx0 (u3 ) = Φx0 ((u1 ◦ u2 ) ◦ u3 )
= Φx0 (u1 ◦ (u2 ◦ u3 )) = Φx0 (u1 ) ⊕ Φx0 (u2 ◦ u3 )
= Φx0 (u1 ) ⊕ (Φx0 (u2 ) ⊕ Φx0 (u3 )) = x1 ⊕ (x2 ⊕ x3 ) .
In the same way ⊕ is commutative, since by Lemma 6.6 the restriction of ◦ to
Mf has this property: If x1 , x2 ∈ X and u1 , u2 ∈ Mf are such that x1 = Φx0 (u1 )
and x2 = Φx0 (u2 ) then u1 ◦ u2 = u2 ◦ u1 and so
x1 ⊕ x2 = Φx0 (u1 ) ⊕ Φx0 (u2 )
= Φx0 (u1 ◦ u2 ) = Φx0 (u2 ◦ u1 ) = Φx0 (u2 ) ⊕ Φx0 (u1 ) = x2 ⊕ x1 .
Moreover, if x ∈ X and u ∈ Mf is such that x = Φx0 (u) then
x ⊕ x0 = Φx0 (u) ⊕ Φx0 (idX ) = Φx0 (u ◦ idX ) = Φx0 (u) = x ,
and so x ⊕ x0 = x for all x ∈ X.
Let x1 , x2 ∈ X; we next show that for some x ∈ X either x1 = x2 ⊕ x or
x2 = x1 ⊕ x. Let u1 , u2 ∈ Mf be such that x1 = Φx0 (u1) and x2 = Φx0 (u2 ) and
let A and B be finite sets with u1 = fA and u2 = fB . By Proposition 2.8 there
either exists an injective mapping g : A → B or an injective mapping h : B → A.
Assume the former holds and put B ′ = g(A) and C = B \ B ′ . Then B ′ and C
are disjoint and B = B ′ ∪ C; moreover, A ≈ B ′ (since g considered as a mapping
from A to B ′ is a bijection) and so by Proposition 6.2 fA = fB′ . Thus, putting
x = Φx0 (fC ), it follows that
x2 = Φx0 (u2 ) = Φx0 (fB ) = Φx0 (fB′ ∪C ) = Φx0 (fB′ ◦ fC )
= Φx0 (fB′ ) ⊕ Φx0 (fC ) = Φx0 (fA ) ⊕ Φx0 (fC ) = Φx0 (u1 ) ⊕ x = x1 ⊕ x .
On the other hand, if there exists an injective mapping h : B → A then the same
argument shows there exists x ∈ X with x1 = x2 ⊕ x.
Now to (a0) and (a1), and we have seen above that (a0) holds. Let x, x′ ∈ X
and let u, u′ ∈ Mf with x = Φx0 (u) and x′ = Φx0 (u′ ). Then by (♯)
x ⊕ f (x′ ) = Φx0 (u) ⊕ f (Φx0 (u′ )) = Φx0 (u) ⊕ Φx0 (f ◦ u′ ))

= Φx0 (u ◦ f ◦ u′) = Φx0 (f ◦ u ◦ u′ ) = f (Φx0 (u ◦ u′ ))
= f (Φx0 (u) ⊕ Φx0 (u′ )) = f (x ⊕ x′ )
and so (a1) holds. If ⊕′ is another binary operation on X satisfying (a0) and

(a1) then it is easy to see that X0 = {x′ ∈ X : x ⊕′ x′ = x ⊕ x′ for all x ∈ X}
is an f -invariant subset of X containing x0 . Hence X0 = X, since (X, f, x0 ) is
minimal, which implies that ⊕′ = ⊕.
Finally, if A and B are disjoint finite sets then
#(A) ⊕ #(B) = Φx0 (fA ) ⊕ Φx0 (fB ) = Φx0 (fA ◦ fB ) = Φx0 (fA∪B ) = #(A ∪ B) .
Moreover, ⊕ is uniquely determined by this requirement: Consider any binary

operation ⊕′ on X for which #(A) ⊕′ #(B) = #(A ∪ B) whenever A and B are
disjoint finite sets. If C and D are any finite sets then there exists a disjoint pair
(A, B) with (A, B) ≈ (C, D) and hence by Proposition 3.1
#(C) ⊕′ #(D) = #(A) ⊕′ #(B) = #(A ∪ B) = #(A) ⊕ #(B) = #(C) ⊕ #(D)
and so by Lemma 3.3 ⊕′ = ⊕. This completes the proof of Theorem 5.1.
Let ⊕ be the operation given in Theorem 5.1. The theorem shows in particular
that (X, ⊕, x0 ) is a commutative monoid. The next result, which generalises
Propositions 5.1 and 5.2, shows how properties of the mapping f correspond to
properties of the monoid (X, ⊕, x0 ).
Proposition 6.3 (1) (X, ⊕, x0 ) is a group if and only if f is a bijection.

(2) The cancellation law holds in (X, ⊕, x0 ) if and only if f is injective.
Proof We have the commutative monoid (X, ⊕, x0 ), and also the commutative
monoid (Mf , ◦, idX ). Now the operation ⊕ was defined so that
Φx0 (u1 ) ⊕ Φx0 (u2 ) = Φx0 (u1 ◦ u2 )
for all u1 , u2 ∈ Mf and, since Φx0 (idX ) = x0 and Φx0 is a bijection, this means
⊕ was defined to make Φx0 : (Mf , ◦, idX ) → (X, ⊕, x0 ) a monoid isomorphism.
It follows that the cancellation law holds in (X, ⊕, x0 ) if and only if it holds in
(Mf , ◦, idX ) and that (X, ⊕, x0 ) will be a group if an only if (Mf , ◦, idX ) is. It is
thus enough to prove the statements in the proposition with (X, ⊕, x0 ) replaced
by (Mf , ◦, x0 ).
(1) We first show that u−1 ∈ Mf whenever u ∈ Mf is a bijection. This follows
from the fact that u−1 (x0 ) ∈ X and Φx0 is surjective and so there exists v ∈ Mf
with Φx0 (v) = u−1 (x0 ); thus by (♯)
Φx0 (u ◦ v) = u(Φx0 (v)) = u(u−1(x0 )) = x0 = Φx0 (idX )
and therefore u ◦ v = idX , since Φx0 is injective. Hence u−1 = v ∈ Mf . Now

clearly Mf is a group if and only if each mapping u ∈ Mf is a bijection and
u−1 ∈ Mf , and we have just seen that u−1 ∈ Mf holds automatically whenever
u ∈ Mf is a bijection. Moreover, by Lemma 6.5 (1) each element of Mf is a
bijection if and only if f is a bijection.
(2) Suppose the cancellation law holds in (Mf , ◦, idX ), and let x1 , x2 ∈ X with
f (x1 ) = f (x2 ). Then there exist u1 , u2 ∈ Mf with Φx0 (u1 ) = x1 and Φx0 (u2 ) = x2
(since Φx0 is surjective), and hence by (♯)
Φx0 (f ◦ u1 ) = f (Φx0 (u1)) = f (x1 ) = f (x2 ) = f (Φx0 (u2 )) = Φx0 (f ◦ u2 ) .
It follows that f ◦ u1 = f ◦ u2 (since Φx0 is injective) and so u1 = u2 . In particular

x1 = x2 , which shows that f is injective. The converse is immediate, since if f is
injective then by Lemma 6.5 (2) so is each u ∈ Mf and hence u1 = u2 whenever
u ◦ u1 = u ◦ u2 .
We now begin the preparations for the proof of Theorem 5.2.

For each finite set B there is the mapping fB : X → X and thus there exists a
unique fB -iterator. This is the unique assignment A 7→ (fB )A with (fB )∅ = idX
such that (fB )A∪{a} = fB ◦ (fB )A for each finite set A and each element a ∈
/ A.
Lemma 6.8 (1) (fB )A = fB×A for all finite sets A and B.
(2) (fB )A = (fA )B for all finite sets A and B.
Proof (1) Consider the finite set B to be fixed and for each finite set A let P(A)
be the proposition that (fB )A = fB×A .
(⋄) P(∅) holds since (fB )∅ = idX = f∅ = fB×∅ .
(⋆) Let A be a finite set for which P(A) holds (and so (fB )A = fB×A ) and let
a∈/ A. Then B × (A ∪ {a}) is the disjoint union of the sets B × {a} and B × A
and B × {a} ≈ B; thus by Proposition 6.2 and Lemma 6.4
(fB )A∪{a} = fB ◦(fB )A = fB ◦fB×A = fB×{a} ◦fB×A = f(B×{a})∪(B×A) = fB×(A∪{a})
and so P(A ∪ {a}) holds.

A, which means (fB )A = fB×A for all finite sets A and B.
(2) By Proposition 6.2 fB×A = fA×B , since clearly B × A ≈ A × B, and therefore
by (1) (fB )A = fB×A = fA×B = (fA )B .
The next result will not be needed in what follows, but it shows that (δ) in
Section 5 holds, and could be used instead of Lemma 5.3 in the previous proof of
Theorem 5.2.
Lemma 6.9 Let (A, B) and (A′ , B ′ ) be pairs of finite sets.

(1) If fA = fA′ and fB = fB′ then fA×B = fA′ ×B′ .
(2) If #(A, B) = #(A′ , B ′ ) then #(A × B) = #(A′ × B ′ ).
Proof (1) By several applications of Lemma 6.8 (1) and (2) we have
fA×B = (fB )A = (fB′ )A = (fA )B′ = (fA′ )B′ = (fB′ )A′ = fA′ ×B′ .
(2) This follows immediately from (1) and Proposition 6.1.
Lemma 6.10 Let v ∈ Mf and A → vA be the unique v-iterator (and so v∅ = idX

and vA∪{a} = v ◦ vA for each finite set A and each element a ∈
/ A). Then:
(1) vA ∈ Mf for every finite set A.
(2) If A and B are finite sets with fA = fB then vA = vB .
Proof (1) For each finite set A let P(A) be the proposition that vA ∈ Mf .
(⋄) P(∅) holds since v∅ = idX ∈ Mf .
(⋆) Let A be a finite set for which P(A) holds (and so vA ∈ Mf ) and let a ∈
/ A.
Then vA∪{a} = v ◦ vA ∈ Mf , since Mf is a submonoid of (TX , ◦, idX ), and so
P(A ∪ {a}) holds.
A, which means vA ∈ Mf for all finite sets A.
(2) There exists a finite set C with v = fC and thus by Lemma 6.8 (2)
vA = (fC )A = (fA )C = (fB )C = (fC )B = vB .
Let v ∈ Mf ; then by Lemma 6.10 there exists a unique mapping ψv : Mf → Mf

such that ψv (fA ) = vA for each finite set A, and in particular ψv (f ) = v (since
if a is any element then f = f{a} and v = v{a} ). Moreover, if v = fB then by
Lemma 6.8 (1) ψv (fA ) = fA×B .
A mapping ψ : Mf → Mf is an endomorphism (of the monoid (Mf , ◦, idX )) if
ψ(idX ) = idX and ψ(u1 ◦ u2 ) = ψ(u1 ) ◦ ψ(u2 ) for all u1 , u2 ∈ Mf .
Lemma 6.11 (1) ψv is an endomorphism for each v ∈ Mf .

(2) ψv (u) = ψu (v) for all u, v ∈ Mf .
Proof (1) If u1 , u2 ∈ Mf then there exist disjoint finite sets A and B with
u1 = fA and u2 = fB and hence by Lemma 6.4
ψv (u1 ◦ u2 ) = ψv (fA ◦ fB ) = ψv (fA∪B ) = vA∪B = vA ◦ vB = ψv (fA ) ◦ ψv (fB )
(noting that Lemma 6.4 can also be applied to the v-iterator). Moreover, we have
ψv (idX ) = ψv (f∅ ) = v∅ = idX and hence ψv is an endomorphism.
(2) Let A, B be finite sets with u = fA and v = fB . Then by Lemma 6.8 (2)
ψv (u) = ψv (fA ) = vA = (fB )A = (fA )B = uB = ψu (fB ) = ψu (v) .
Proof of Theorem 5.2: Define a binary operation ⋄ : Mf × Mf → Mf by letting
u ⋄ v = ψu (v)
for all u, v ∈ Mf . In particular, if A and B are finite sets then by Lemma 6.8 (1)
fA ⋄ fB = ψfA (fB ) = (fA )B = fA×B and therefore
fA ⋄ fB = fA×B
for all finite sets A and B. Let u, v, w ∈ Mf and A, B and C be finite sets with
u = fA , v = fB and w = fC . Then clearly A × (B × C) ≈ (A × B) × C, so by
Proposition 6.2 fA×(B×C) = f(A×B)×C and thus
u ⋄ (v ⋄ w) = fA ⋄ (fB ⋄ fC ) = fA ⋄ fB×C
= fA×(B×C) = f(A×B)×C = fA×B ⋄ fC = (fA ⋄ fB ) ⋄ fC = (u ⋄ v) ⋄ w .
Hence ⋄ is associative. Moreover, Lemma 6.11 (2) shows that ⋄ is commutative,

since u ⋄ v = ψu (v) = ψv (u) = v ⋄ u for all u, v ∈ Mf . Also (with a any element)
u ⋄ f = u ⋄ f{a} = u{a} = u, i.e., u ⋄ f = u for all u ∈ Mf , and by Lemma 6.11 (1)
u ⋄ idX = idX and u ⋄ (v1 ◦ v2 ) = (u ⋄ v1 ) ◦ (u ⋄ v2 ) for all u, v1 , v2 ∈ Mf .
Now since Φx0 : Mf → X is a bijection there clearly exists a unique binary
relation ⊗ on X such that
Φx0 (u1 ) ⊗ Φx0 (u2 ) = Φx0 (u1 ⋄ u2 )
for all u1 , u2 ∈ Mf , and exactly as in the proof of Theorem 5.1 the operation
⊗ is associative and commutative since ⋄ has these properties. The same holds
true of the distributive law: Let x, x1 , x2 ∈ X, and u, v1 , v2 ∈ Mf be such that

x = Φx0 (u), x1 = Φx0 (v1 ) and x2 = Φx0 (v2 ). Then
x ⊗ (x1 ⊕ x2 ) = Φx0 (u) ⊗ (Φx0 (v1 ) ⊕ Φx0 (v2 ))

= Φx0 (u) ⊗ Φx0 (v1 ◦ v2 ) = Φx0 (u ⋄ (v1 ◦ v2 ))
= Φx0 ((u ⋄ v1 ) ◦ (u ⋄ v2 )) = Φx0 (u ⋄ v1 ) ⊕ Φx0 (u ⋄ v2 )
= (Φx0 (u) ⊗ Φx0 (v1 )) ⊕ (Φx0 (u) ⊗ Φx0 (v2 )) = (x ⊗ x1 ) ⊕ (x ⊗ x2 )
Next, if x ∈ X and u ∈ Mf is such that x = Φx0 (u) then
x ⊗ x0 = Φx0 (u) ⊗ Φx0 (idX ) = Φx0 (u ⋄ idX ) = Φx0 (idX ) = x0 ,

x ⊗ f (x0 ) = Φx0 (u) ⊗ Φx0 (f ) = Φx0 (u ⋄ f ) = Φx0 (u) = x
and so x ⊗ x0 = x0 and x ⊗ f (x0 ) = x for all x ∈ X.

We have already seen (m0) holds and, since f (x0 ) is a unit, (m1) is a special
case of the distributive law: Let x, x′ ∈ X; then by (a0) and (a1) and since ⊕ is
commutative it follows that f (x′ ) = f (x′ ⊕ x0 ) = x′ ⊕ f (x0 ) = f (x0 ) ⊕ x′ , and
hence x ⊗ f (x′ ) = x ⊗ (f (x0 ) ⊕ x′ ) = (x ⊗ f (x0 )) ⊕ (x ⊗ x′ ) = x ⊕ (x ⊗ x′ ), which is
(m1). Finally, if ⊗′ is another binary operation satisfying (m0) and (m1) then it
is easy to see that X0 = {x′ ∈ X : x ⊗′ x′ = x ⊗ x′ for all x ∈ X} is an f -invariant
subset of X containing x0 . Hence X0 = X, since (X, f, x0 ) is minimal, which
implies that ⊗′ = ⊗.
References
[1] Dedekind, R. (1888): Was sind und was sollen die Zahlen? Vieweg.
[2] Lawvere, W.F. (1964): An elementary theory of the category of sets. Proc.
Nat. Acad. Sci., 52, 1506-1511.
[3] Levy, A. (1979): Basic Set Theory. Springer.
[4] Suppes, P. (1960): Axiomatic Set Theory. van Nostrand.
[5] Tarski, A. (1924): Sur les ensembles finis. Fundamenta Mathematicae, 6,

45-95.
[6] Whitehead, A.N., Russell, B. (1912): Principia Mathematica. Cambridge

University Press.
[7] Zermelo, E. (1909): Sur les ensembles finis et le principe de l’induction

complète. Acta Mathematica, 32, 185-193.
Fakultät für Mathematik, Universität Bielefeld

Postfach 100131, 33501 Bielefeld, Germany
E-mail address: preston@math.uni-bielefeld.de
URL: http://www.math.uni-bielefeld.de/~preston
55

Finite Sets and Counting: Chris Preston

Uploaded by

Copyright:

Available Formats

Finite Sets and Counting: Chris Preston

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Finite Sets and Counting: Chris Preston

Uploaded by

Copyright:

Available Formats

arXiv:0809.0105v2 [math.

HO] 19 Jun 2010

Finite Sets and Counting

We start by presenting a theory of finite sets using the approach

4 Minimal counting systems 27

5 Addition and multiplication 35

6 Another take on addition and multiplication 45

Consider the following simple example:

An example of such a mapping (with 9 balls) is the following:

Proof Consider an injective mapping f : A′ → A′ ; then there are two cases:

and then g is bijective, since A is tame. Therefore f is again bijective.

inductive system in P(A) (or just an inductive system if A can be determined

Proof Let A be a finite set; as noted above K = {B ⊂ A : B is tame} is an

Proof Let K′ be an inductive system in P(A′ ); then K = {B ∈ K′ : B ⊂ A}

#(A) = #({a, b} ∪ {c}) = s(#({a, b})) = s(#({a} ∪ {b})) = s(s(#({a})))

which corresponds to counting the elements of A by first removing c, then b

Proposition 1.2 The existence of a Dedekind system is equivalent to that of a

Proof If (X, f, x0 ) is a standard counting system then the set X is Dedekind-

As an application of the recursion theorem it is shown in Theorem 3.4 that if

In the second f is not injective and x0 ∈

In Section 5 we show how an addition and a multiplication can be defined for

#(A) ⊕ #(B) = #(A ∪ B)

#(A) ⊗ #(B) = #(A × B)

Moreover, ⊗ is the unique binary operation on X such that

Proof Let K′ be an inductive system in P(A′ ); then K = {B ∈ K′ : B ⊂ A}

Proposition 2.1 Every subset of a finite set is finite.

Proof Let A be finite and put K = {B ∈ P(A) : B is finite}. Then ∅ ∈ K and if

Proposition 2.2 If A and B are finite sets then so is A ∪ B.

Proof Consider the system of subsets K = {C ∈ P(A) : C ∪ B is finite}. Then

Proposition 2.3 Let A and B be sets with A finite.

(α) The restriction f|C : C → D of f to C is still surjective. Then D is finite

Proposition 2.4 If A is a finite set then so is the power set P(A).

Proof Consider the system of subsets K = {B ∈ P(A) : P(B) is finite}. Then

Proposition 2.5 If A and B are finite sets then so their product A × B.

Proof Put K = {C ∈ P(A) : C × B is finite}. Then ∅ ∈ K, since ∅ × B = ∅.

Proposition 2.7 Let A be a finite set and f : A → A be a mapping. Then f is

If A and B are sets then we write A ≈ B if there exists a bijective mapping

Proof By Proposition 2.8 there either exists an injective mapping f : A → B or

Proposition 2.9 If B is a subset of a finite set A with B ≈ A then B = A.

Proof There exists a bijective mapping f : A → B and then the restriction

If A is a set and S a non-empty subset of P(A) then B ∈ S is said to be minimal

Theorem 3.1 There exists a unique iterator # for (X, f, x0 ).

Proof We first consider a version of the theorem restricted to the subsets of a

#A′ (B ∪ {b}) = #A (B ∪ {b}) = f (#A (B)) = f (#A′ (B)) ,

Proof This follows immediately from the uniqueness of #B .

#(A ∪ {a}) = #A∪{a} (A ∪ {a}) = f (#A∪{a} (A) = f (#A (A)) = f (#(A))

and hence # is an iterator for (X, f, x0 ).

In what follows let # be the unique iterator for (X, f, x0 ).

#(B ′ ) = #(A), since P(A) holds, and it follows that

#(B) = #(B ′ ∪ {b}) = f (#(B ′ )) = f (#(A)) = #(A ∪ {a}) .

This shows that P(A ∪ {a}) holds.

Theorem 3.2 If (X, f, x0 ) is standard then the iterator # is complete.

f (#(B ′ )) = #(B ′ ∪ {b}) = #(B) = f (#(A)) ,

Lemma 3.3 Let X0 be the least f -invariant subset of X containing x0 . Then

X0 = {x ∈ X : x = #(A) for some finite set A} .

h(f (x)) = h(f (#(A))) = h(#(A ∪ {a}))

[f (x0 )] = {x0 } ∪ f ([x0 ]) = {x0 } ∪ f (∅) = {x0 } = ∅ ∪ {x0 } = [x0 ] ∪ {x0 }

and so x0 ∈ X0 . Let x ∈ X0 ; then