To Know About Substitution
David Gale
The current boom in calculus reform programs has been going on now for more
than six years at a cumulative cost of well over five million dollars. A major theme
of the program has been the need to get away from so-called cook book calculus,
to teach concepts rather than techniques, understanding rather than rote memo-
rization. This is of course a worthy goal but just how one goes about achieving it is
not at all obvious. What I want to do in the paragraphs which follow is to look at
this question in the context of a special case by treating a particular calculus
question which has been bothering me on and off for more than 50 years.
To begin at the beginning, I took freshman calculus in 1940 and in the
intervening years I have taught virtually that same course to others dozens of
times. I found the course back then mildly disappointing in that it seemed to
consist for the most part of working hundreds of drill problems, but I managed to
master enough of the tricks to struggle through with a grade of B. There was one
thing in the course, though, that really bothered me, and that was the matter of
integration by substitution. I didn't have any trouble with integrands like $x\sqrt{1-x^2}$
and $\ln(x)/x$. I understood that in general one hopes that the integrand will have
the form $f(g(x))\,g'(x)$, in which case if one happens to know an antiderivative of
$f$, call it $F$, then the antiderivative one is looking for is $F(g(x))$. It was also clear,
as the book explained, that in fact this was nothing but the chain rule in reverse.
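As a worked instance of this pattern, take the integrand $\ln(x)/x$. The particular choice of $f$, $g$ and $F$ below is one natural reading, offered as a sketch and not as the textbook's own presentation.

```latex
\begin{align*}
g(x) &= \ln x, & g'(x) &= \frac{1}{x}, \\
f(u) &= u, & F(u) &= \frac{u^{2}}{2}, \\
\frac{\ln x}{x} &= f(g(x))\,g'(x), &
\int \frac{\ln x}{x}\,dx &= F(g(x)) + C = \frac{(\ln x)^{2}}{2} + C.
\end{align*}
```

Differentiating $(\ln x)^2/2$ by the chain rule returns $\ln(x)/x$, which is the sense in which the method is the chain rule in reverse.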
But then one day we had to integrate $\sqrt{1-x^2}$ without the extra $x$ on the
outside, so the book, "Calculus" by Arnold Dresden, said, well, make the substitution
$x = \sin(t)$. Then $dx = \cos(t)\,dt$ etc. etc. Again, I could go through the
mechanics without difficulty but this time it seemed to me the operations were not
justified by anything we had done up to that time. In the earlier cases we made a
substitution of the form u = g(x) but now we were supposed instead to write
x = g(t), which didn't seem to me to be the same thing.
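For concreteness, the mechanics in question can be sketched as follows for the integrand $\sqrt{1-x^2}$. The restriction of $t$ to $[-\pi/2, \pi/2]$ is an assumption added here so that $\cos t \ge 0$ and $\sin$ has an inverse; the arrow marks the step whose justification is at issue.

```latex
\begin{align*}
x &= \sin t, \qquad dx = \cos t\,dt, \qquad -\tfrac{\pi}{2} \le t \le \tfrac{\pi}{2}, \\
\int \sqrt{1 - x^{2}}\,dx
  &\;\longrightarrow\; \int \cos^{2} t\,dt
   = \frac{t}{2} + \frac{\sin t\,\cos t}{2} + C, \\
t &= \arcsin x: \qquad
\int \sqrt{1 - x^{2}}\,dx = \frac{\arcsin x}{2} + \frac{x\sqrt{1 - x^{2}}}{2} + C.
\end{align*}
```

Note that the final step needs $t$ expressed in terms of $x$, that is, it needs the inverse of $g = \sin$, which never arises in a substitution of the form $u = g(x)$.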
Because of the current interest in calculus instruction I decided now, after more
than half a century, that it would be interesting to see how textbooks these days are
handling the substitution problem that had thrown me off as a student. To this end
I looked in 10 fairly traditional texts, some of which are among the current best
sellers. The authors are Anton (to be abbreviated A), Edwards and Penney (EP),
Ellis and Gulick (EG), Lang (L), Larson and Hostetler (LH), Marsden and
Weinstein (MW), Stein (SN), Stewart (ST), Swokowski (SW), Thomas and Finney
(TF). Also I looked at a draft of an as yet unpublished text by the Harvard
Calculus Consortium (H). Here are some of my findings.
All of these books use and prove the "direct" substitution theorem, devoting an
entire section to the subject. None of them proves what I will call the inverse
substitution theorem although all of them except L and H use it fairly intensively,
devoting a whole section to Trigonometric Substitution. Only ST explicitly recog-
nizes the fact that inverse substitution is not the same as direct substitution, and I
Of course the equation is false. The expression $\int f(x)\,dx$ stands for an antiderivative,
as in a table of integrals, and the variable, be it x, t, u or anything else is a
dummy. Clearly the antiderivatives on the left and right above are not equal. What
the books mean, no doubt, is that if you substitute g(x) for u after taking the
antiderivative on the right you get the antiderivative on the left. I expect some
readers will say I am being pedantic or that there is no need to be so rigorous at
the freshman level, but I think this kind of lapse is symptomatic of a rather strange
set of standards and perhaps it sheds light on why none of the books proves the
inverse substitution theorem. It is because none of them formulates it. Once one
does, the proof becomes more or less mechanical and one sees at once that it is
not a mere application of the chain rule but involves other things, as we will see in
a moment. The book ST is interesting. It uses equation (1) above to describe
substitution while inverse substitution [the book's italics] is described by x = g(t)
and
where g is required to have an inverse. Notice that (1) and (2) are the same (false)
equation. Only the letters for the (dummy) variables are different (of course these
equations become correct when one considers definite integrals and puts in the
appropriate limits).
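A small example may make the dummy-variable point concrete. It assumes that the equation under discussion has the usual textbook form $\int f(g(x))\,g'(x)\,dx = \int f(u)\,du$, which is not reproduced in the text here; the functions $f = \cos$ and $g(x) = x^2$ are chosen only for illustration.

```latex
\begin{align*}
\int 2x\cos(x^{2})\,dx &= \sin(x^{2}) + C, &
\int \cos u\,du &= \sin u + C, \\
\intertext{and $\sin \circ g \neq \sin$ as functions. With limits, however,}
\int_{a}^{b} 2x\cos(x^{2})\,dx &= \sin(b^{2}) - \sin(a^{2}), &
\int_{a^{2}}^{b^{2}} \cos u\,du &= \sin(b^{2}) - \sin(a^{2}).
\end{align*}
```

The two antiderivatives agree only after $g(x)$ is substituted for $u$, while the two definite integrals are equal as numbers.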
Let us now turn to the mathematics which to my surprise turned out to be
rather interesting. There are (at least) two different proofs of the inverse substitu-
tion theorem. The first is direct (brute force), and slightly messy. The second,
suggested to me by Ole Hald, is short and elegant and makes use of some
"theory". For the sake of cleanness I will use the circle notation for composition of
functions. People who prefer the more traditional f(g(x)) notation should have no
trouble translating the argument below, at the cost of having to carry around
masses of parentheses and a lot of, in my view, superfluous x's. (The subject of
notation, which is rather interesting, will be considered in an appendix.) Differentiation
will be denoted by a prime, $'$, and for typographical clarity I will use a dot, $\cdot$,
for multiplication of functions. Finally I will use a block, $\blacksquare$, for "proof" as well as
for "Q.E.D.", a notational reform I have been trying to persuade the mathematical
community to adopt for the past 25 years with no success whatsoever. First,
then, we have the chain rule,
$$(f \circ g)' = (f' \circ g) \cdot g'. \tag{Ch}$$
Now, both the substitution rules described in the preceding paragraphs deal
with the situation where we have three functions h, f and g and
$$h = (f \circ g) \cdot g'. \tag{$*$}$$
In the direct substitution case we know an antiderivative for f and want to find one
for h. The answer is given by,
Let me suggest at this point that the reader take two minutes to work out the
direct proof of (4) in order to see what is involved.
As one might expect, one needs to use not only the chain rule but also the
formula for the derivative of the inverse of a function, which in our notation is
$$(g^{-1})' = 1 / (g' \circ g^{-1}). \tag{Inv}$$
We must now simplify the term on the right hand side and we need several facts.
The first is the general but not so familiar identity that for any functions a, b
and c
$$(a \cdot b) \circ c = (a \circ c) \cdot (b \circ c). \tag{5}$$
The right hand side then becomes
$$((f \circ g) \circ g^{-1}) \cdot (g' \circ g^{-1}) \cdot (g^{-1})',$$
which by the associative law for composition and the fact that $g \circ g^{-1}$ is the
identity function, simplifies to
$$f \cdot (g' \circ g^{-1}) \cdot (g^{-1})'.$$
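Collected into a single chain, the direct computation can be sketched as follows. It assumes, as in the statement of the Inverse Substitution Theorem, that $H' = h$ with $h$ as in $(*)$ and that $g$ has a differentiable inverse; the justifications on the right are a reconstruction from the rules named in the text.

```latex
\begin{align*}
(H \circ g^{-1})'
  &= (H' \circ g^{-1}) \cdot (g^{-1})' && \text{by (Ch)} \\
  &= (h \circ g^{-1}) \cdot (g^{-1})' && \text{since } H' = h \\
  &= \bigl(((f \circ g) \cdot g') \circ g^{-1}\bigr) \cdot (g^{-1})' && \text{by } (*) \\
  &= ((f \circ g) \circ g^{-1}) \cdot (g' \circ g^{-1}) \cdot (g^{-1})' && \text{by (5)} \\
  &= f \cdot (g' \circ g^{-1}) \cdot (g^{-1})' && \text{since } g \circ g^{-1} \text{ is the identity} \\
  &= f && \text{by (Inv)}.
\end{align*}
```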
be given, requires the antidifferentiability of f as a hypothesis. Thus, it assumes
that h has an antiderivative H, and f has an antiderivative F, but it does not need
to make use of (Inv).
Since $F' = f$ we have by (Ch), $(F \circ g)' = (f \circ g) \cdot g' = h$, so $H$ and $F \circ g$
have the same derivative and hence (theory) they differ by a constant, thus,
$$F \circ g = H + c. \tag{6}$$
Now composing both sides of (6) on the right with $g^{-1}$ gives
$$F \circ g \circ g^{-1} = F = H \circ g^{-1} + c \circ g^{-1} = H \circ g^{-1} + c,$$
so $(H \circ g^{-1})' = F' = f$.
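The conclusion can also be checked numerically in a particular case. The sketch below is only an illustration, not part of the proof: it takes $f(x) = \sqrt{1-x^2}$ and $g = \sin$ on $[-\pi/2, \pi/2]$, so that $h = \cos^2$ and $H(t) = t/2 + \sin(2t)/4$, and compares a central-difference derivative of $H \circ g^{-1}$ with $f$. The helper names (`compose`, `times`, `derivative`, `max_error`) and the sample points are hypothetical choices made here.

```python
import math


def compose(f, g):
    """Return the composition f o g, i.e. x -> f(g(x))."""
    return lambda x: f(g(x))


def times(f, g):
    """Return the pointwise product f . g, i.e. x -> f(x) * g(x)."""
    return lambda x: f(x) * g(x)


def derivative(f, x, step=1e-6):
    """Approximate f'(x) by a central difference (an approximation only)."""
    return (f(x + step) - f(x - step)) / (2 * step)


# Illustrative choice: f(x) = sqrt(1 - x^2), g = sin restricted to [-pi/2, pi/2].
f = lambda x: math.sqrt(1 - x * x)
g = math.sin
g_prime = math.cos
g_inv = math.asin

# h = (f o g) . g', which equals cos^2 on the chosen interval.
h = times(compose(f, g), g_prime)

# A known antiderivative of cos^2.
H = lambda t: t / 2 + math.sin(2 * t) / 4


def max_error(points):
    """Largest gap between (H o g^{-1})' and f over the sample points."""
    candidate = compose(H, g_inv)
    return max(abs(derivative(candidate, x) - f(x)) for x in points)


# Sample points kept away from the endpoints -1 and 1, where asin is not differentiable.
points = [-0.9, -0.5, 0.0, 0.3, 0.8]

if __name__ == "__main__":
    print(max_error(points))
```

A small value printed here is consistent with $H \circ g^{-1}$ being an antiderivative of $f$, although a numerical check of this kind can only support the theorem at the sampled points.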
Exercises
1. Let $f(x) = 2x + 1$, $g(x) = x - 2$.
(a) Calculate $f \circ g$ and $g \circ f$.
[Not so routine but students will learn something by figuring these out.]
5. Let $f(x) = 2x^2 + 1$, $g(x) = x^2 - 2$.
(a) Find $k$ such that $k \circ f = g$.
[The answers are $10x - 19$, $10x - 15$, $6x + 3$, $33$. This is to illustrate the
importance of where you put the parentheses.]
6. Let f, g and h be any functions. Answer True or False and give reasons.
(a) $(f \cdot g) \circ h = (f \circ h) \cdot (g \circ h)$.
[It is perhaps too much to ask for a precise written argument here but the
instructor could give the details in class.]
(b) $(f \circ g) \cdot h = (f \cdot h) \circ (g \cdot h)$.
[A counter example is needed. Some of the better students will perhaps find
one. Again a good problem for classroom discussion.]
7. Prove the Inverse Substitution Theorem: If $h = (f \circ g) \cdot g'$ and $g$ has an
inverse and $H$ is an antiderivative of $h$ then $H \circ g^{-1}$ is an antiderivative of
$f$. You will need to use Problem 3 and Problem 6 (a).
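Composition and pointwise product are easy to experiment with on a computer, which may help in classroom discussion of Exercises 1 and 6. The following is a minimal sketch: the helper names `compose` and `times`, the test function `c`, and the sample points are all hypothetical choices made here, and a pointwise check at a few integers is evidence for an identity, not a proof of it.

```python
def compose(f, g):
    """Return the composition f o g, i.e. x -> f(g(x))."""
    return lambda x: f(g(x))


def times(f, g):
    """Return the pointwise product f . g, i.e. x -> f(x) * g(x)."""
    return lambda x: f(x) * g(x)


# Exercise 1(a): f(x) = 2x + 1, g(x) = x - 2.
f = lambda x: 2 * x + 1
g = lambda x: x - 2
f_after_g = compose(f, g)  # x -> 2(x - 2) + 1 = 2x - 3
g_after_f = compose(g, f)  # x -> (2x + 1) - 2 = 2x - 1

# Exercise 6(a): compare (f . g) o c with (f o c) . (g o c) at sample points.
c = lambda x: x * x + 1  # an arbitrary third function, chosen for illustration
lhs_6a = compose(times(f, g), c)
rhs_6a = times(compose(f, c), compose(g, c))
agrees_6a = all(lhs_6a(x) == rhs_6a(x) for x in range(-3, 4))

# Exercise 6(b): try f = g = h = identity as a candidate counterexample.
identity = lambda x: x
lhs_6b = times(compose(identity, identity), identity)  # x -> x * x
rhs_6b = compose(times(identity, identity), times(identity, identity))  # x -> (x * x) ** 2

if __name__ == "__main__":
    print(f_after_g(5), g_after_f(5))
    print(agrees_6a)
    print(lhs_6b(2), rhs_6b(2))
```

With all three functions equal to the identity, the two sides of 6(b) are $x^2$ and $x^4$, which differ at $x = 2$.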
Department of Mathematics
University of California
Berkeley, CA 94720
gale@math.berkeley.edu