Tensors Made Easy
Tensors Made Easy
Tensors Made Easy
TENSORS
made easy
An informal introduction
to Maths of General Relativity
2018
6th edition – September 2018
Giancarlo Bernacchi
ISBN 978-1-326-23097-5 (printed book)
ISBN 978-1-326-23104-0 (e-book)
All rights reserved
Giancarlo Bernacchi
Rho - Milano - IT
giancarlobernacchi@libero.it
2 Tensors 19
2.1 Outer product between vectors and covectors 21
2.2 Matrix representation of tensors 23
2.3 Sum of tensors and product by a number 26
2.4 Symmetry and skew-symmetry 26
2.5 Representing tensors in T-mosaic 27
2.6 Tensors in T-mosaic model: definitions 28
2.7 Tensor inner product 30
2.8 Outer product in T-mosaic 39
2.9 Contraction 40
2.10 Inner product as outer product + contraction 42
2.11 Multiple connection of tensors 42
2.12 “Scalar product” or “identity” tensor 43
2.13 Inverse tensor 44
2.14 Vector-covector “dual switch” tensor 47
2.15 Vectors / covectors homogeneous scalar product 48
2.16 G applied to basis-vectors 51
2.17 G applied to a tensor 51
2.18 Relations between I, G, δ 53
3 Change of basis 56
3.1 Basis change in T-mosaic 59
3.2 Invariance of the null tensor 62
3.3 Invariance of tensor equations 63
4 Tensors in manifolds 64
4.1 Coordinate systems 66
4.2 Coordinate lines and surfaces 66
4.3 Coordinate bases 67
4.4 Coordinate bases and non-coordinate bases 70
4.5 Change of the coordinate system 72
4.6 Contravariant and covariant tensors 74
4.7 Affine tensors 75
4.8 Cartesian tensors 76
4.9 Magnitude of vectors 76
4.10 Distance and metric tensor 76
4.11 Euclidean distance 78
4.12 Generalized distances 79
4.13 Tensors and not – 80
4.14 Covariant derivative 81
4.15 The gradient ∇ ̃ at work 85
4.16 Gradient of some fundamental tensors 91
4.17 Covariant derivative and index raising / lowering 92
4.18 Christoffel symbols 92
4.19 Covariant derivative and invariance of tensor equations 96
4.20 T-mosaic representation of gradient, divergence 97
and covariant derivative
4.21 Derivative of a scalar or vector along a line 99
4.22 T-mosaic representation of derivatives along a line 103
Appendix 155
3
embarrassment tensor equations, and even a little obvious, has ever been presented
in a text. Which can even be surprising. But the magicians never reveal their tricks;
and mathematicians sometimes resemble them. If it will be appreciated, we will be
happy to have been just us to uncover the trick.
The comprehension of the text requires the operational knowledge of the differential
calculus, up to the Taylor series and to partial derivatives. Some notions of Linear
Algebra can help; about matrices, it will be enough to know that they are tables with
rows and columns (and that swapping them a great confusion can be created), or
little more.
We will constantly use the Einstein sum convention for repeated indexes which
greatly simplifies the writing of tensor equations.
The author's hope for these notes is that they can be useful for those starting to study
the topic.
G.B. May 2018
4
Notations and conventions
In these notes we'll use the standard notation for components of tensors, namely
upper (apexes) and lower (subscript) indexes in Greek letters, for example T , as
usual in Relativity. The coordinates will be marked by upper indexes, such as x ,
basis-vectors will be represented by e , basis-covectors by e .
In a tensor formula such as P = g V the index α that appears once at left side
member and once at right side member is a “free” index, while β that occurs twice in
the right member only is a repeated or “dummy” index.
We will constantly use the Einstein sum convention, whereby a repeated index
means an implicit summation over that index. For example, P = g V is for
n
1 2 n
P = g 1 V g 2 V ...g n V . Further examples are: A B = ∑ A B ,
=1
n n
1 2 n
A B = ∑∑ A
=1 =1
B
and A = A A ...A
1 2 n .
We note that, due to the sum convention, the “chain rule” for partial derivatives can
f f x
be simply written = and the sum over μ comes automatically.
x x x
A dummy index, unlike a free index, does not survive the summation and thus it
does not appear in the result. Its name can be freely changed as long as it does not
collide with other homonymous indexes in the same term.
In all equations the dummy indexes must always be balanced up-down and they
cannot occur more than twice in each term. The free indexes appear only once in
each term. In an equation both left and right members must have the same free
indexes.
These conventions make it much easier writing correct relationships.
The little needed of matrix algebra will be said when necessary. We only note that
the multiplication of two matrices requires us to “devise” a dummy index. Thus, for
instance, the product of matrices [ A ] and [B ] becomes [ A ]⋅[ B ] (also
for matrices we locate indexes up or down depending on the tensors they represent,
without a particular meaning for the matrix itself).
The mark will be used indiscriminately for scalar products between vectors and
covectors, both heterogeneous and homogeneous, as well as for tensor inner
products.
Other notations will be explained when introduced: there is some redundancy in
notations and we will use them with easiness according to convenience. In fact, we
had better getting familiar with all various alternative notations that can be found in
the literature.
The indented paragraphs marked ▫ are “inserts” in the thread of the speech and they
can, if desired, be skipped at first reading (they are usually justifications or proofs).
“Mnemo” boxes suggest simple rules to remind complex formulas.
5
“Make things as simple as possible, but not simpler”
A. Einstein
6
1 Vectors and covectors
1.1 Vectors
A set where we can define operations of addition between any two
elements and multiplication of an element by a number (such that the
result is still an element of the set) is called a vector space and its
elements vectors.**
The usual vectors of physics are oriented quantities that fall under this
definition; we will denote them generically by V .
In particular, we deal with the vector space formed by the set of
vectors defined at some point P.
7
The expansion eq.1.1 allows us to represent a vector as an n-tuple:
V 1 , V 2 , ...V n
provided there is a basis e1 , e2 ,... en fixed in advance.
comp
As an alternative to eq.1.2 we will also write V V with the
same meaning.
▪ As an example we graphically represent the vector-space of plane
vectors (n = 2) that branch off from P , drawing only some vectors
among the infinite ones:
C
A ≡ e1
B ≡ e2
8
1.3 Covectors
We define covector P (or dual vector or “one-form”) a linear scalar
function**of the vector V . Roughly speaking: P applied to a
vector results in a number:
P V
= number ∈ ℝ 1.3
P can thus be seen as an “operator” (or “functional”) that for any
vector taken in input gives a number as output.
By what rule? One of the possible rules has a particular interest
because it establishes a sort of reciprocity or duality between the
vectors V and covectors P , and it is what we aim to formalize.
▪ To begin with, we apply the covector P to a basis-vector (instead
of a generic vector). By definition, let's call the result the α-th
component of P :
P e = P 1.4
We can get in this way n numbers P (α = 1, 2, ... n), the components
of the covector P̃ (as well as the V were the components of the
vector V ).
In operational terms we can state a rule (which will be generalized
later on):
9
By itself, the choice of the basis of covectors is arbitrary, but at this
point, having already fixed both the covector P and (by means of
eq.1.4) its components, the basis of covectors {e } follows, in order
the last equation (eq.1.5) be true. In short, once used the vector basis
{e } to define the components of the covector, the choice of the
covector basis {e } is obliged.
▪ Before giving a mathematical form to the link between the two
bases, we observe that, by using the definition eq.1.4 given above,
the rule according to which P̃ acts on the generic vector V can be
specified as:
P V = P e V = V P e = V P 1.6
P V = e P V e = P V e e
but, on the other hand, (eq.1.6):
P V = P V
Both expansions are identical only if:
10
After defined the Kronecker symbol as : β
δ = 1 for β= 1.7
0 otherwise
we can write the duality condition:
e e = 1.8
A vector space of vectors and a vector space of covectors are dual if
and only if their bases are related in that way.
▪ We observe now that P V = V P (eq.16) lends itself to an
alternative interpretation.
Because of its symmetry, the product V P can be interpreted not
only as P V but also as V P , interchanging operator and
operand:
V P
= V P = number ∈ ℝ 1.9
In this “reversed” interpretation a vector can be seen as a linear scalar
function of a covector P̃ . * *
As a final step, if we want the duality to be complete, it would be
possible to express the components V of a vector as the result of the
application of V to the basis-covectors e (by symmetry with the
definition of component of covector eq.1.4):
V = V e 1.10
▫ It 's easy to see that this is the case, because:
V P
= V e P = P V e
but since V P
= P V (eq.1.9), equaling the right members
the assumption follows.
Eq.1.10 is the dual of eq.1.4 and together they express a general rule:
To get the components of a vector or covector apply the vector or
covector to its dual basis.
11
notation makes use of parentheses 〈... 〉 emphasizing the symmetry
of the two operands.
The linearity of the operation is already known.
All writings:
P V = V P
= 〈 P , V
〉 = 〈 V , P 〉 = V P 1.11
are equivalent and represent the heterogeneous scalar product between
a vector and a covector.
By the new introduced notation the duality condition between bases
eq.1.8 is currently expressed as:
〈 e , e 〉 = 1.12
The homogeneous scalar product between vectors or covectors of the
same kind requires a different definition that will be given further on.
0 otherwise
A remarkable property of the Kronecker δ often used in practical
calculations is that acts as an operator that identifies the two
indexes α, β turning one into the other.
For example: δβ V = V β ; δβ Pβ = P 1.13
Note that the summation that is implicit in the first member collapses
to the single value of the second member. This happens because
removes from the sum all terms whose indexes α, β are different,
making them equal to zero.
Roughly speaking: hooks by its first index the index of the
operand and changes it into its own second index: what survives is the
free index (note the “balance” of the indexes in both eq.1.13).
▫ We prove the first of eq.1.13:
Multiplying = e e (eq.1.8) by V gives:
V = e e V = e V = V
(the last equality of the chain is the rule eq.1.10).
Similarly for the other equation eq.1.13.
In practice, this property of Kronecker δ turns useful each time we can
12
make a product like e e to appear in an expression: we can
replace it by , that soon produces a change of the index in one of
the factors, as shown in eq.1.13.
So far we have considered the Kronecker symbol as a number; we
will see later that is to be considered as a component of a tensor.
P1
P ≡ P2
P3
1
e 2
e 3
e
e3
e2
e1
V3
V ≡ V2
V1
13
Vectors are cubes with pins upward; covectors are cubes with holes
downward. We'll refer to pins and holes as “connectors”.
Pins and holes are in number of n, the dimension of space.
Each pin represents a e (α = 1, 2, ... n); each hole is for a e .
In correspondence with the various pins or holes we have to imagine
the respective components written sideways on the body of the cube,
as shown in the picture.
The example above refers to a 3D space (for n = 4 the cubes would
have 4 pins or holes, and so on).
The n pins and the n holes must connect together simultaneously.
Their connection emulates the heterogeneous scalar product between
vectors and covectors and creates an object with no exposed
connectors (a scalar). The heterogeneous scalar product
e P
scalar
product P V
e V
may be indeed put in a pattern that may be interpreted as the metaphor
of interlocking cubes:
1 2 3 n
e× e× e× .... e×
P1 P2 P3 Pn P1 P2 P3 Pn
scalar × × × .... ×
product V1 V2 V 3 V n
e 1 e 2 e 3 e n
× × × .... ×
1 2 3
V V V Vn
14
it, while the generic name of the component array ( V for vectors
or P for covectors) is written in the body of the piece. This
representation is the basis of the model that we call “T-mosaic” for its
ability to be generalized to the case of tensors.
▪ So, from now on we will use the two-dimensional representation:
e
Vector V ≡ Covector P ≡ P
V
e
Blocks like those in the figure give a synthetic representation of the
expansion by components on the given basis (the “recipe” eq.1.2,
eq.1.5).
▪ The connection of the blocks, i.e. the connection of a block to
another, represents the heterogeneous scalar product:
P
P
e
P V = V P
= 〈 P , V
〉 ≡ → = P V =
e
V
V
e
15
For simplicity, nothing will be written in the body of the tessera which
represents a basis-vector or basis-covector, but we have to remember
that, according to the perspective representation, a series of 0 together
with a single 1 are inscribed in the side of the block.
Example in 5D:
0
0
0
e2 ≡ 1
0
This means that, in the scalar product, all the products that enter in the
sum go to zero, except one.
▪ The application of a covector to a basis-vector to get the covector
components (eq.1.4) is rendered by the connection:
P
P
P e = P : e → = P
e
e
V e = V : →
= V
e V
V
16
A “smooth” block, i.e. a block without free connectors, is a scalar ( = a
number).
▪ Note that:
In T-mosaic representation the connection always occurs between
connectors with the same index (same name), in contrast to what
happens in algebraic formulas (where it is necessary to diversify
them in order to avoid the summations interfere with each other).
▪ However, it is still possible to perform blockwise the connection
between different indexes, in which case we have to insert a block of
Kronecker δ as a “plug adapter”:
P P e A e = P A e
A = P A =
e = P A
P P P
e e e P
e e
e
→ → → e → P A
e e A
e A
A A
17
It is worth to note that in the usual T-mosaic representation the
connection occurs directly between P e and A e by means of
the homonymic α-connectors, skipping the first 3 steps.
▪ The T-mosaic representation of the duality relation (eq.1.8 or
eq.1.12) is a significant example of the use of Kronecker δ as a “plug
adapter”:
e e
e
〈 e , e 〉 ≡ → → =
e
e e
18
2 Tensors
The concept of tensor T is an extension of those of vector and
covector.
A tensor is a linear scalar function of h covectors and k vectors (h, k =
0, 1, 2, ...).
We may see T as an operator that takes h covectors and k vectors in
input to give a number as a result:
T
A ,
B , ... P , Q , ... = number ∈ ℝ 2.1
P
h
number
T ℝ
k
V
19
In general:
. ..
T e , e , ... e , e , ... = T . .. 2.3
▪ This equation allows us to specify the calculation rule left
undetermined by eq.2.1 (which number does come out by applying the
1
tensor to an input list?). For example, given a tensor S of rank 1 for
which eq.2.3 becomes S e , e = S , we get:
S V , P
= S e V , e P = V P S e , e = V P S 2.4
In general eq.2.1 works as follows:
TA ,
B , ... P , Q , ... = A B ⋅⋅⋅P Q⋅⋅⋅T .. .
.. .
2.5
and its result is a number (one for each set of values μ, ν, α, β, ...).
An expression like A B P Q T that contains only balanced
dummy indexes is a scalar because no index survives in the result
of the implicit multiple summation.
20
2.1 Outer product between vectors and covectors
Given the vectors A and the covectors P , Q
,B we define a vector
outer product between the vectors
A and B:
B such that:
A⊗ A⊗ = A P
B P , Q
B Q 2.7
Namely: A⊗ B is an operator acting on a couple of covectors (i.e.
vectors of the opposite kind) in terms of two scalar products, as stated
in the right member. The result is here again a number ∈ ℝ .
It can be immediately seen that the outer product ⊗ is non-
commutative: A⊗
B ≠ B ⊗ A (indeed, note that B⊗
A against the
same operand would give as a different result
B P ).
A Q
2
Also note that A⊗ B is a rank 0 tensor because it matches the
definition given for a tensor: it takes 2 covectors as input and gives a
number as result.
Similarly we can define the outer products between covectors or
0 1
between vectors and covectors, ranked 2 and 1 rispectively:
P ⊗ Q such that: P ⊗Q A ,B = P
A
Q B and also
P ⊗ A such that: P ⊗ A = P B
B , Q A Q
etc.
▪ Starting from vectors and covectors and making outer products
between them we can build tensors of gradually increasing rank.
In general, the inverse is not true: not all tensors can be expressed as
outer product of tensors of lower rank.
▪ We can now characterize the tensor-basis in terms of outer product of
basis-vectors and / or covectors. To fix on the case ( 02) , it is:
e = e ⊗ e 2.8
▫ In fact, ** from the definition of component T β = T( e⃗ , e⃗β )
(eq.2.2), using the expansion T = T e (eq.2.6) we get:
T = T e e , e , which is true only if:
* The most direct demonstration, based on a comparison between the two forms
Q = P e ⊗ Q e = P Q e ⊗ e = T e ⊗ e and T = T e
T = P⊗
holds only in the case of tensors decomposable as tensor outer product.
21
e e , e = because in this case:
T = T .
Since = e e and = e e the second-last equation
becomes:
e e , e = e e e e
which, by definition of ⊗ (eq.2.7), means:
e = e ⊗ e , q.e.d.
Hence, the basis of tensors has been reduced to the basis of
0
(co)vectors. The 2 tensor under consideration can then be
expanded on the basis of covectors:
T = T e ⊗ e 2.9
It's again the “recipe” of the tensor, as well as eq.2.6, but this time it
uses basis-vectors and basis-covectors as “ingredients”.
▪ In general, a tensor can be written as a linear combination of (or as
an expansion over the basis of) elementary outer products
e ⊗ e ⊗... e ⊗ e ⊗ ... whose coefficients are the components.
3
For instance, a 1 tensor can be expanded as:
T = T e⊗ e ⊗ e ⊗ e 2.10
which is usually simply written:
T = T e e e e 2.11
Note the “balance” of upper / lower indexes.
The symbol ⊗ can be normally omitted without ambiguity.
▫ In fact, products ⃗
A⃗ ⃗ P̃ can be unambiguously interpreted
B or V
as ⃗A⊗ ⃗B or V⃗ ⊗ P̃ because for other products other explicit
symbols are used, such as V⃗ ( P)
̃ , 〈 V⃗ , P̃ 〉 or V⃗ W
⃗ for scalar
products, T(...) or even for tensor inner products.
▪ The order of the indexes is important and is stated by eq.2.10 or
eq.2.11 which represent the tensor in terms of an outer product: it is
understood that changing the order of the indexes means to change the
order of factors in the outer product, in general non-commutative.
22
Thus, in general T ≠ T ≠ T .. . .
To preserve the order of the indexes a notation with a double sequence
for upper and lower indexes like Y is often enough. However, this
notation is ambiguous and turns out to be improper when indexes are
raised / lowered. Actually, it would be convenient to use a scanning
with reserved columns like Y ∣⋅∣⋅∣⋅∣ ; it aligns in a single sequence
upper and lower indexes and assigns to each index a specific place
wherein it may move up and down without colliding with other
indexes. To avoid any ambiguity we ought to use a notation such as
⋅ ⋅
Y ⋅ ⋅ ⋅ , where the dot ⋅ is used to keep busy the column, or
simply Y replacing the dot with a blank.
23
e1 ⊗e1 e1⊗ e2 ⋯ e1 ⊗ en
e2 ⊗ e1 e2⊗ e2 e2 ⊗ en
⋮ ⋮
en ⊗ e1 en ⊗e2 ⋯ en ⊗ en
[ ]
T 11 T 12 ⋯ T 1 n
21 22 2n
T ≡ T T T = [T ] 2.12
⋮ ⋮
T n1 T n2 ⋯ T nn
Similarly:
for 1 tensor: T ≡ [ T ] on grid e ⊗ e
1
▪ The outer product between vectors and tensors has similarities with
the Cartesian product (= set of pairs).
For example V ⊗ W generates a basis grid e ⊗ e that includes all
the pairs that can be formed by e and e and a related matrix of
components with all possible pairs such as V W β ( = T β), all
ordered by row and column.
2 produces a
If X is a rank 0 tensor, the outer product X ⊗ V
3-dimensional cubic grid e ⊗ e ⊗ e of all the triplets built by
e , e , e and a similar cubic structure of all possible pairs
X T .
▪ It should be emphasized that the dimension of the space on which
the tensor extends is the rank r of the tensor and has nothing to do
with n, the dimension of geometric space. For instance, a rank 2 tensor
is a matrix (2-dimensional) in a space of any dimension; what varies is
the number n of rows and columns.
Also the T-mosaic blockwise representation is invariant with the
dimension of space: the number of connectors does not change since
they are not individually represented, but as arrays.
24
▪ A physical example useful to make the notion of tensor more
concrete can be the "stress tensor". It is a particular double tensor in
the usual 3D space that plays a role in mechanics.
In a material body under stress we isolate a small tetrahedron OABC
1 2 3
and look at the forces ⃗s , ⃗s , ⃗s , ⃗s acting on each of its faces,
respectively ABC, OBC, OCA, OAB, transmitted from the surrounding
material (in the drawing the x1 axis is to be seen enter into the sheet).
x3
C
⃗s 1
s 13
s32
s23 s12
⃗s 2
s11
s22 ⃗s
ss1221 x2
O s32s
23
B
ss 13
31
A s33
⃗s 3
1
x
25
2.3 Sum of tensors and product by a number
As in the case of vectors and covectors, the set of all tensors of a given
h
rank k defined at a point P has the structure of a vector space, once
defined operations of sum between tensors and multiplication of
tensors by numbers.
The sum of tensors (of the same rank!) gives a tensor whose
components are the sums of the components:
... ... ...
AB = C C ... = A ... B ... 2.13
The product of a number a by a tensor has the effect of multiplying by
a all the components:
comp ...
a A a A ... 2.14
Note that the symmetry has to do with the order of the arguments in
0
the input list of the tensor. For example, taken a 2 tensor:
T A ,
B = A B T e , e = A B T
T
B,
A = B A T e , e = A B T
⇒ the order of the arguments in the list is not relevant if and only if
the tensor is symmetric, since:
T =T T =T
A ,B B , A 2.15
0 2
▪ For tensors of rank 2 or 0 , represented by a matrix, symmetry /
skew-symmetry reflect into their matrices:
• symmetric tensor ⇒ symmetric matrix: [ T ] = [ T ]
26
0 2
▪ Any tensor T ranked 2 or 0 can always be decomposed into a
symmetric part S and a skew-symmetric part A :
T = SA
that is T = S A where: 2.16
S β = 1 (T β +T β ) and A = 1 T −T 2.17
2 2
Tensors of rank greater than 2 can have more complex symmetries
with respect to exchanges of 3 or more indexes, or even groups of
indexes.
▪ The symmetries are intrinsic properties of the tensor and do not
depend on the choice of bases: if a tensor has a symmetry in a basis, it
keeps the same in other bases (as it will become clear later).
T
for T = T e e e e
e
27
The exposed connectors correspond to free indexes; they state the
h
number of pins e
rank, given by k = number of holes e or even by r = hk .
▪ Blocks representing tensors can connect to each other with the usual
rule: pins or holes of a tensor can connect with holes or pins (with
equal generic index) of vectors / covectors, or other tensors. The
meaning of each single connection is similar to that of the
heterogeneous scalar product of vectors and covectors: connectors
disappear and bodies merge multiplying to each other.
▪ When (at least) one of the factors is a tensor (r > 1) we properly refer
to the product as tensor inner product.
▪ The shapes of the blocks can be deformed for graphic reasons
without changing the order of the connectors. It is convenient to keep
fixed the orientation of the connectors (pins “on”, holes “down”).
= A P Q T = X
P Q
e e P Q
e e
T T X
→ =
e
A
e
set:
A μν
T A Pμ Q ν = X
28
The result is a single number X (all the indexes are dummy).
▪ Components of tensor T : produced by saturating with basis-vectors
and basis-covectors all connectors of the “naked” tensor. For example:
e e
e e
→ = T
T T
e
e
T e , e , e = T
P
e e
P
e e
T T Y
→ =
e
set:
e T
P = Y
29
ν
The result Y are n2 numbers (survived indexes are 2): they are the
ν
components of a double tensor Y = Y e⃗ν ẽ ; but it is improper to say
that the result is a tensor!
Remarks:
▪ The order of connection, stated by the input list, is important. Note
that a different result T e , e , P = Z ≠ T e , P , e = Y
would be given by connecting P with e instead of e .
▪ In general a “coated” or “saturated” tensor, that is a tensor with all
connectors plugged (by vector, covectors or other) so as to become a
“smooth object” is a number or a multiplicity of numbers.
▪ It is worth to emphasize the different meaning of notations which
are similar in appearance:
30
A particular case of tensor inner product is the already known
heterogeneous scalar (or dot) product between a vector and a covector.
Let's define the total rank R of the inner product as the sum of ranks of
the two tensors involved: R = r1 + r2 (R is the total number of
connectors of the tensors involved in the product). For the
heterogeneous scalar product is R =1 + 1=2 .
In a less trivial or strict sense the tensor inner product occurs between
tensors of which at least one of rank r > 1, that is, R > 2. In any case,
the tensor inner product lowers by 2 the rank R of the result.
We will examine examples of tensor inner products of total rank R
gradually increasing, detecting their properties and peculiarities.
P
P
e e e
e
T ( P̃ , ) ≡ e e → T μν → T μ ν Pμ =
V
T μν
set:
μν
T Pμ = V ν
and:
31
P
P
e e e
e
̃ ≡
T ( , P) → T μν → T μν Pν = W
e e
T μν set:
T μ ν Pν = W μ
Only one connector of the tensor is plugged and then the rank r of the
tensor decrease by 1 (in the examples the result is a vector ( 10) ).
Algebraic expressions corresponding to the two cases are:
T • P̃ = T
μν μν
e⃗μ e⃗ν ( P ẽ ) = T P e⃗μ e⃗ν ( ẽ ) =
μν μν μν ν
T P e⃗μ e⃗ν ( ẽ ) = T P e⃗ν δμ = T Pμ e⃗ν = V e⃗ν
=
32
The results are different depending on the element (the index)
“hooked” to form a δ , that must be known from the context (no matter
the position of e or e in the chain).
Likewise, when the inner product is made with a vector, as in
e e e e e we have two chances:
e e e e e = e e e or e e e e e = e e e
depending on whether the dot product goes to hook ẽ or ẽ β .
κ
Formally, the inner product of a vector or covector e⃗κ or ẽ
addressed to some element (different in kind) inside a tensorial chain
e e e e⋅⋅⋅ removes the “hooked” element from the chain and
insert a δ with the vanished indexes. The chain welds itself with a ring
less, without changing the order of the remaining ones.
Note that product P̃ T on fixed indexes cannot give a result different
from T P̃ . Hence P̃ T = T P̃ and the tensor inner product is
commutattive in this case.
▪ Similar is the case of tensor inner product tensor • basis-covector.
Here as well, the product can be performed in two ways:
μ
T( ẽ ) or T( ẽ ). We linger on the latter case only, using the same T
of previous examples:
T ( , ẽ ) = T ẽ = T μ ν e⃗μ e⃗ν ( ẽ ) = T μ ν e⃗μ δν = T μ e⃗μ
which we draw as:
e e
e
e e → = μ
T
μ T
μ
T
33
vector. The index says which basis-covector enters the product. For
example, if the product is made with the covector-base ẽ 1 , then
=1 and the result is the vector T μ1 ⃗eμ ).
34
▪ The total number of different products doable with two tensors A , B
(both as A • B and as B • A) is also reduced by the presence of
commutative products: we have to consider that commutative pairs
should be counted as a single product instead of two.
Aμ ν Aμ ν Aμ ν Aμ ν
e e e e e e e e
e e⃗ν e e⃗ν
δμβ δβ
ν
e e ẽ β ẽ β
e e e e e e⃗β e e
β β
B B
β
B B
β
* We use here the trick of δ as “plug adapter” to keep the correspondence of in
dexes between equations and blocks.
35
▪ The 4 results of A • B are in general different. But it may happen that:
* Of course, the 4 possible pairs of indexes (connectors) are the same as the
previous case, but their order is inverted. Note that the multiplicities of A•B are
counted separately from those of B • A .
** This rule is fully consistent with the usual interpretation of the tensor inner
product as "outer tensor product + contraction" which will be given later on.
36
▪ It is easy to realize that in general:
inner tensor products are commutative only when one of the two
factors is a vector or a covector; if both factors are tensors of rank
r ≥ 2 the products are non-commutative.
Indeed, in the inner tensor product vectors or covectors "disappear"
without leaving residual indexes, so that the result can be read in one
way only. Rank ≥ 2 tensors leave residual indexes; if both factors are
such, the reading order is twofold.
▪ Possible products grow very rapidly in number with the rank of the
tensors involved; only the presence of symmetries in the tensors can
drastically reduce the number of different products.
The counting of the multiplicity of the inner tensor product A • B will
be dealt with later, after introduced the contraction operation.
37
P
P
Y ν
T , P , e ≡
e e → =
e e T
e
T e
set:
e T P = Y
A
⊗ B → A B = A B
T
A⊗
B = A B e⊗ e = A B e e = T e e = T
38
or else:
e e e
P ⊗ A → P A = P A
e e
e
Y
P ⊗
A = P A e ⊗ e = P A e e = Y e e = Y
The outer product makes the bodies to blend together and components
multiply (without the sum convention comes in operation).
It is clear from the T-mosaic representation that the symbol ⊗ has a
meaning similar to that of the conjunction “and” in a list of items and
can usually be omitted without ambiguity.
▪ The same logic applies to tensors of rank r >1, and in such cases we
speak of outer tensor product.
The outer tensor product operates on tensors of any rank, and merges
them into one “by side gluing” The result is a composite tensor of rank
equal to the sum of the ranks.
For example, in a case A( 11) ⊗ B( 12) = C( 23) :
A B A B
C
⊗ → \ =
β
e e e e e e ẽ e e
set:
A B = C
39
▪ Note the non-commutativity: A ⊗ B ≠ B ⊗ A .
2.9 Contraction
Identifying two indexes of different kind (one upper and one lower), a
tensor undergoes a contraction (the repeated index becomes dummy
and appears no longer in the result). Contraction is an “unary”
operation.
The tensor rank lowers from hk to h−1
k −1 .
In T-mosaic metaphor both a pin and a hole belonging to the same
tensor disappear.
Example: contraction for α = ζ :
μ contraction μ
C βν ζ =
C βν C
=
e e e e e e e e
40
e e
e
e
▪ For subsequent repeated contractions a tensor (h) can be reduced
k
until h = 0 or k = 0. A tensor of rank (h) can be reduced to a scalar.
h
1
▪ The contraction of a 1 tensor gives a scalar, the sum of the main
diagonal elements of its matrix, and is called the trace:
A = A11 A22... Ann 2.18
The trace of I is = 11... = n , dimension of the manifold.**
41
e e e e e⃗β e e e
Aβ ν B βγ = C β. ν. βγ = . γ
Cν
ẽ β e β
e ẽ ν
ẽ
The 1st step (outer product) is univocal; the 2 nd step (contraction)
implies the choice of indexes (connectors) on which contraction must
take place. Gradually choosing a connector after the other exhausts the
choice of possibilities (multiplicity) of the inner product A B. The
same considerations apply to the mirror product B A.
▪ This 2-steps procedure is equivalent to the rules stated to write down
the result of the tensor inner product. Its usefulness lies especially in
complex cases such as:
A μ⋅⋅ν⋅β ⋅ζ B⋅β ⋅ μ ν⋅ζ ⋅ β⋅ μ ν ζ ⋅⋅
⋅γ = C ⋅⋅β⋅ ⋅ γ = C ⋅ ⋅⋅ γ 2.19
▪ Once interpreted as outer product + contraction, it is easy to compute
the multiplicity of the inner tensor product A B in terms of number
of contractions:
Given A ( h 1) and B ( h 2) the number of possible contractions of the
k1 k2
outer tensor product of the two tensors equals the pairs that can be
formed between a ⃗ e connector of A and a ẽ connector of B, and
between a ẽ connector of A and a ⃗e connector of B, that is:
multiplicity of AB = h1⋅k 2 + h2⋅ k 1
Identical multiplicity for the mirror-product B A .
42
2.12 “Scalar product” or “identity” tensor
The operation of (heterogeneous) scalar product takes in input a vector
1
and a covector to give a number ⇒ it is a tensor of rank 1 . We
denote it by I :
I P , V
≡ P , V
〉 = PV 2.20
and expand it as: I = I e ⊗ e , that is I = I e e .
How is it I ? Let us find its components giving to it the dual basis-
vectors as input list: then
I = I e , e = e , e 〉 = 2.21
[ ][
〈 e 1 , e1 〉 〈 e 1 , e2 〉 ⋯ 〈 e 1 , en 〉
〈 e 2 , en 〉 = 1 0 ...
]
2 2
δ ≡ I ≡ [ ] = 〈 e , e1 〉 〈 e , e2 〉 0 1 ...
⋮ ⋮ ... ... 1
〈 e n , e1 〉 〈 e n , e2 〉 ⋯ 〈 e n , en 〉
2.23
comp
namely: δ ≡ I 2.24
43
β
(note: δβ δ γ = δγ )
The T-mosaic blockwise representation of I( V⃗ ) is:
e
e
I ≡ e
e
→ → V ≡ V
e V
V ≡
V
44
For tensors of rank r ≠ 2 the inverse is not defined.
▪The inverse T -1 of a symmetric tensor T of rank 02 or 2
0
has the
following properties:
*
• the indexes position interchanges upper ↔ lower *
• it is represented by the inverse matrix
• it is symmetric
45
Currently we say that tensors T and T are inverse to each
other: usually we call the components of both tensors with the same
symbol T and distinguish them only by the position of the indexes.
However, it is worth realizing that we are dealing with different
tensors, and they cannot run both under the same symbol T (if we call
T one of the two, we must use another name for the other: in this
instance T -1 ) .
▪ It should also be reminded that (only) if T β is diagonal (i.e.
T ≠ 0 only for = ) its inverse will be diagonal as well, with
β
components T β = 1 T , and viceversa.
▪ The property of a tensor to have an inverse is intrinsic to the tensor
itself and does not depend on the choice of bases: if a tensor has
inverse in a basis, it has inverse in any other basis, too (as will become
clear later on).
▪ An obvious property belongs to the mixed tensor T of rank 1
1
“related” with both T and T defined by their inner product:
T = T T
Comparing this relation with eq.2.28 (written as T T = with
β in place of ν) we see that:
T β = δβ 2.30
Indeed, the mixed fundamental tensor ( 11) δ , or tensor I , is the
β
β
“common relative” of all couples of inverse tensors T β , T ).**
comp
▪ The mixed fundamental tensor I δ β has (with few others **)**
the property to be the inverse of itself.
▫ Indeed, an already seen ******property of Kronecker's δ:
δ β δβγ = δγ
is the condition of inverse (similar to eq.2.28) for δ β .
(T-mosaic icastically shows the meaning of this relation).
* That does not mean, of course, that it is the only existing mixed double tensor
(think to C = A B when A e B are not related to each other).
** Also auto-inverse are the tensors (1 ) , whose matrices are mirror images of I.
1
̃ .
***Already met while calculating I( P)
46
2.14 Vector-covector “dual switch” tensor
0
A tensor of rank 2 needs two vectors as input to give a scalar. If the
input list is incomplete and consists in one vector only, the result is a
covector:
G( V⃗ )= G β ẽ ẽ β (V γ e⃗γ ) = G β V γ ẽ ẽ β e⃗γ = G β V ẽ β = P β ẽ β = P̃
δ αγ
after set G V = P .
The T-mosaic representation is:
G G P
G ≡ → =
e e
V e e
e
set:
V ≡ V
G V = P
to be read: GV = G V e = P e = P
0
By means of a 2 tensor we can therefore transform a vector V
into a covector P̃ that belongs to the dual space.
0
Let us pick out a 2 tensor G as a “dual switch” to be used from
now on to transform any vector V into its “dual” covector V :
GV = V 2.29
G sets up a correspondence G : V V between the two dual vector
spaces (we use here the same name V with different marks above in
order to emphasize the relationship and the term “dual” as “related
through G in the dual space”).
The choice of G is arbitrary, nevertheless we must choose a tensor
which has inverse G-1 in order to perform the “switching” in the
opposite sense as well, from V to V . In addition, if we want the
inverse G-1 to be unique, we must pick out a G which is symmetric, as
we know. Note that, at this point, using one or the other of the two ẽ
connectors of G becomes indifferent.
47
Applying G-1 to the switch definition eq.2.31:
G-1 G V
= G-1 V
but G -1 G = I and IV = V , then:
-1
V = G V 2.32
.
G-1 thus sets up an inverse correspondence G-1 : V V
In T-mosaic terms:
V ≡ V
e
V
e e e e
G-1 ≡ → = V
G G
set:
G V = V
G V = G V e = V e = V
-1
to be read
48
⃗
A⃗ B ≝ ⃗ A , B̃ 〉 or, likewise, ≝ Ã , ⃗B〉 2.35
From the first equality, expanding on the bases and using the switch G
we get:
⃗
A⃗
B = ⃗A , B̃ 〉 = A e⃗ , Bγ ẽ γ 〉 = A e⃗ , G Bβ ẽ γ 〉 =
⏟ βγ
Bγ
= G A B e , e 〉 = G A B = G A B = G A ,
B
In short: A B
= G
A ,B 2.36
Hence, the dual switch tensor G is also the “scalar product between
two vectors” tensor.
The same result follows from the second equality eq.2.35.
The symmetry of G guarantees the commutative property of the scalar
product:
A B
=
B A
or G = G
A ,B B,
A 2.37
(note this is just the condition of symmetry for the tensor G eq.2.15).
▪ The homogeneous scalar product between two covectors is then
defined by means of G -1 :
B = G-1 A , B
A 2.38
G -1 , the inverse dual switch, is thus the “scalar product between
two covector” tensor.
▪ In the T-mosaic metaphor the scalar (or inner) product between two
vectors takes the form:
G G
→ = G A B
e e
A B
e e
A B
A
B = G A B
49
while the scalar (or inner) product between two covectors is:
A B A B = G A B
e e A B
e e
→ = G A B
G G
[ ]
e⃗1 e⃗1 e⃗1 e⃗2 ⋯ e⃗1 e⃗n
e e e e e⃗2 e⃗n 2.39
⇒ G ≡ [ G β ] = ⃗2 ⃗1 ⃗2 ⃗2
⋮ ⋮
e⃗n e⃗1 e⃗n e⃗2 ⋯ e⃗n e⃗n
[ ]
ẽ 1 ẽ 1 ẽ 1 ẽ 2 ⋯ ẽ 1 ẽ n
2 1 2 2 2.40
ẽ 2 ẽ n
≡ [ G β] = ẽ ẽ ẽ ẽ
-1
⇒ G
⋮ ⋮
ẽ ẽ ẽ ẽ ⋯ ẽ ẽ n
n 1 n 2 n
50
Notation
Various equivalent writings for the homogeneous scalar product are:
▪ between vectors:
A
B =
B A = ,B
A , B 〉 = A 〉 = G
A,
B = G
B ,
A
▪ between covectors:
B
A = B A ,
= A A , B 〉 = G-1 A
B〉 = = G-1 B , A
, B
The notation 〈 , 〉 is reserved to heterogeneous scalar product
vector-covector
has, in this example, really the effect of lowering the index α involved
in the product without altering the order of the remaining others.
However, this is not the case for any index. What we need is to
examine a series of cases more extensive than a single example,
having in mind that the application of G to a tensor according to usual
rules of the tensor inner product G (T) = G T gives rise to a
51
multiplicity of different products (the same applies to G-1 ). To do so,
let's think of the inner product as of the various possible contractions
of the outer product: in this interpretation it is up to the different
contractions to produce multiplicity.
▫ Referring to the case, let's examine the other inner products we
can do on the various indices, passing through the outer product
βγ ⋅⋅ βγ
Gμν X = Xμν and then performing the contractions.
⋅⋅ β γ ⋅β γ
The μ-α contraction leads to the known result X ν =X ν ; in
addition:
⋅⋅ β γ ⋅ γ
• contraction μ-β gives X βν = X ν
⋅⋅ β γ ⋅ β
• contraction μ-γ gives X γ ν = X ν
(since G is symmetric, contractions of ν with α, β or γ give the
same results as the contractions of μ).
Other results rise by the application of G in post-multiplication:
βγ βγ
from X G μ ν = X ⋅⋅⋅ μ ν we get X β⋅⋅γ ν , X ⋅⋅γν , X ⋅⋅β ν and other
similar contractions for ν .
We observe that, among the results, there are lowerings of indices
⋅β γ
(e.g. X ν , lowering α(ν , and X ⋅⋅ ν , lowering γ(ν ), together
β
with other results that are not simple lowerings of indices (for
⋅γ ⋅γ
example X ν : β is lowered β(ν but also shifted). X ν does
not appear among the results: it is therefore not possible to get by
means of G the transformation:
e e e e e
→
X
αβγ → X αν⋅ γ
*
e
as if e⃗β were isolated. *
* Why not to use the tool of the incomplete input lists? In the present case
X (... , G , ...) would seem able to lower the central index. However, inserting
tensors (instead of vectors or covectors) into the input list creates problems,
especially if the inserted tensor is of high rank (how to position in the result the
surviving indexes of that tensor?). In the present instance, too, even admitting that
G placed in the input list lowers the central index, the definition of inner product
as outer product + contraction would be lost.
52
It's easy to see that, given the usual rules of the inner tensor product,
only indices placed at the beginning or at the end of the string can be
raised / lowered by G (a similar conclusion applies to G-1 ).
It is clear, however, that this limitation does not manifest itself until
the indexes' string includes only two, that is, for tensors of rank r = 2 .
▪ For tensors of rank r =2 and, restricted to extreme indexes for tensors
of higher rank, we can enunciate the rules:
G applied to a tensor lowers an index (the one by which it connects).
μ ⋅
The writing by components, e.g. Gμ ν T ⋅ ⋅ γ = T ν ⋅ γ , clarifies which
indexes are involved.
G-1 applied to a tensor raises an index (the one by which it connects).
β μν β⋅ ν
For example T ⋅⋅ γμ G = T ⋅⋅ γ .
Mnemo
G κ ν T ⋅⋅ β ν⋅β
κ = T ⋅ : hook κ , raise up it, rename it ν
All equivalent to 〈 A ,
B〉 = 〈 A , B〉
=
A B and to one another
B=A
are the following expressions:
I A B = I
, = G
A , B B = G-1 A
A, , B
2.44
53
that is easy to ascertain using T-mosaic.
Gβ Bγ
β γ
ẽ ẽ ẽ
e⃗ e⃗β e⃗γ
βγ
A G
▫ That does not mean that G and I are the same tensor. We observe
that the name G is reserved to the tensor whose components are
G ; the other two tensors, whose components are
G and G , are different tensors and cannot be labeled by
54
the same name. In fact, the tensor with components G is G-1,
while that one whose components are Gβ coincides with I.
It is thus matter of three distinct tensors:**
comp
I G = I =
comp
G G = I =
comp
G -1 G = I =
even if the components' names are somewhat misleading.
Note that only is the Kronecker delta, represented by the
matrix diag (+1).
β β
Besides, neither I β nor I (and not even δ β and δ ) deal
with the identity tensor I (but rather with G which determine
their form; their matrix may be diag (+1) or not).
The conclusion is that we can label by the same name tensors
with indexes moved up / down when using a componentwise
notation, but we must be careful, when switching to the tensor
notation, to avoid identifying different tensors as one.
β
Just to avoid any confusion, in practice the notations I β , I ,
δ β , δ β are almost never used. Usually it is meant:
comp
I → δ β
comp
G → Gβ
-1 comp
G → Gβ
G β is sometimes used as a synonym for δβ .
To signify the metric tensor diag (+1) it is sometimes used δβ
(see later on).
* The rank is also different: 11, 02 , 02 respectively. Their matrix representation may
be formally the same if G ≡ diag (+1), but on a different “basis grid”!
55
3 Change of basis
The vector V has expansion V e on the vector basis {e}.
Its components will change as the basis changes. In which way?
By the way, it's worth noting that the components change, but the
vector V⃗ does not!
Let us denote by V β' the components on the new basis { e ' } ; it is
now V⃗ = V ⃗e β' , hence:
β'
V⃗ = V ⃗e β' = V ⃗e
β'
3.1
In other words, the same vector can be expanded on the new basis as
well as it was expanded on the old one (from now on the upper ' will
denote the new basis).
▪ Like all vectors, each basis-vector of the new basis ⃗e β' can be
expanded on the old basis-vectors e :
⃗e β' = Λβ' e⃗ ( β ' =1, 2,...n) 3.2
Λ β' are the coefficients of the expansion that describes the “recipe”
of the new ⃗e β' on the old basis { e } (i.e. on the basis of the old
“ingredients”).
Taken as a whole, these coefficients express the new basis in terms of
the old one; they can be arranged in a matrix n × n [ Λ β' ]
V⃗ = V e⃗ = V Λ β'
⃗e β'
⇒ V β' = Λ β'
V
3.4
V⃗ = V ⃗e β'
β'
and since:
56
▪ What about covectors? First let us deduce the transformation law for
components. From P = P e eq.1.4, which also holds in the new
basis, by means of the transformation of basis-vectors eq.3.2 that we
already know, we get:
̃ e β' ) = P(
P β' = P(⃗ ̃ Λ β' e⃗) = Λ β' P̃ (⃗e ) = Λ β' P 3.5
Then, from the definition of component of P and using eq.3.5 above,
we deduce the inverse transformation for basis-covectors (from new to
old ones):
P = P e β'
⇒ ẽ = Λ β' ẽ
̃P = P β' ẽ β' = Λ β' P ẽ β'
{ } [ ]
e⃗1 = Λ11 ' ⃗e 1 ' + Λ 21 ' ⃗e 2 ' +... Λ 1'1 Λ 2'1 ⋯
β'
⇒ [ Λ ]= Λ 1'2 Λ 2'2 ⋯
1' 2' β'
eq.3.3: e⃗ = Λ ⃗
e β' ⇒ e⃗2 = Λ2 ⃗e 1 ' + Λ 2 ⃗e 2 ' +...
............ ⋯
{ } [ ]
e 1 ' = 11 ' e 1 12 ' e 2 ... Λ 1'1 Λ12 ' ⋯
β' β'
⇒ [ Λ ]= Λ 2'1 Λ 22 ' ⋯
2' 2' 1 2' 2 β'
eq.3.6: ẽ =Λ ẽ ⇒
e = 1 e 2 e ...
............ ⋯
57
On the contrary, the inverse transformation (from old to new basis) is
ruled by two matrices that are the inverse of the previous two.
So, just one matrix (with its inverse and transposed inverse) is enough
to describe all the transformations of (components of) vectors and
covectors, basis-vectors and basis-covectors under change of basis.
▪ Let us agree to denote (without further specification) the matrix
that transforms vector components and basis-covectors from the old
bases system with index to the new system indexed β ' :
[ ]
Λ 11 ' Λ12' ⋯
Λ ≝ [Λ β'
]= Λ 12 ' Λ 2'2 ⋯ 3.7
⋯
58
δμ = e⃗ , ẽ μ 〉 = e⃗ , Λμν ' ẽ ν ' 〉 = Λμν ' e⃗ , ẽ ν ' 〉 ,
possible only if e , e ' 〉 = ' because then δμ = Λ μν ' Λ ν ' .
Hence, the element of the transformation matrix Λ is:
Λ ν ' = e⃗ , ẽ ν ' 〉 3.9
built up crossing together old and new bases.
▪ In practice, it's not convenient trying to remember the transformation
matrices to use in each different case, nor reasoning in terms of matrix
calculus: the balance of the indexes inherent in the sum convention is
an automatic mechanism that leads to write the right formulas in every
case.**
Mnemo
The transformation of a given object is correctly written by simply
taking care to balance the indexes. For example, the transformation
of components of covector from new to old basis, provisionally
ν'
written P μ = Λ P ν' , can only be completed as P μ = Λ μ P ν' ⇒
ν'
the matrix element to use here is Λμ
e '
e
β' '
* We use indifferently e.g. Λ or Λ entrusting to the apex the distinction.
59
The connection is made “wearing” the basis-converters blocks as
“shoes” on the connectors of the tensor, docking them on the pin-side
or the hole-side as needed, "apex on apex" or "non-apex on non-apex".
The body of the converter blocks will be marked with the element of
the transformation matrix Λ ' or Λ ' with apexes ' up or down
turned in the same way as those of connectors (this implicitly leads to
the correct choice of Λ ' or Λ ' ).
For example, to basis-transform the components of the vector V e
we must put on the e connector a converter block that plug up and
replaces it with ⃗e ' :
⃗e '
e '
' e '
Λ '
e → = V
'
e
V
' '
since: Λ V = V
V
Likewise, to basis-transform the components of the covector P e ,
the appropriate converter block applies to the old connector e in
order to leave the new connector ẽ ' exposed:
P
P
P'
e → =
e '
' ẽ
Λ ' ẽ
'
since: Λ ' P = P '
'
ẽ
60
▪ This rule doesn't apply to basis-vectors (the converter block
would contain in that case the inverse matrix, if one); however the
block representation of these instances has no practical interest.
▪ Subject to a basis transformation, a tensor needs to convert all its
connectors; an appropriate conversion block must be applied to each
connector.
For example:
e '
'
e '
e
e ' e '
T μ ν T μ'' ν'
→ T μ ν =
ẽ μ ẽ ν ẽ
μ'
ẽ
ν'
μ ν
e⃗μ e Λ μ' Λ ν'
61
▪ A very special case is that of tensor I = e e whose components
never change: = diag 1 is true in any basis:
e '
e '
'
'
e
e '
e
β
Λ β' β'
ẽ
β'
ẽ
' β ' β '
since: Λ Λ β' δ β = Λ β Λ β' = δ β'
62
3.3 Invariance of tensor equations
Vectors, covectors and tensors are the invariants of a given
“landscape”: changing the basis, only the way by which they are
represented by their components varies. Then, also the relationships
between them, or tensor equations, are invariant. For tensor equations
we mean equations in which only tensors (vectors, covectors and
scalars included) are involved, no matter whether written in tensor
notation T , V , ecc. or componentwise.
Reduced to its essential terms, a tensor equation is an equality
between two tensors like:
... .. .
A =B or A. .. = B . .. 3.13
Now, it's enough to put A - B = C to reduce to the equivalent form
eq.3.12 C = 0 or C μ.. .... = 0 that, once valid in a basis, is valid in all
bases. It follows that also eq.3.13, once valid in a basis, is valid in all
bases. Hence the equations in tensor form are not affected by the
particular basis chosen, but they hold in all bases without changes.
In practice
an equation derived in a system of bases, once expressed in tensor
form, applies as well in any system of bases.
Also some properties of tensors, if expressed by tensor relations, are
valid regardless of the basis. It is the case, inter alia, of the properties
of symmetry / skew-symmetry and invertibility. For example, eq.2.15
that expresses the symmetry properties of a double tensor is a tensor
equation, and that ensures that the symmetries of a tensor are the same
in any basis.
The invertibility condition eq.2.27, eq.2.28, is a tensor equation too,
and therefore a tensor which has inverse in a basis has inverse in any
other basis.
As will be seen in the following paragraphs, the change of bases can
be induced by a transformation of coordinates of the space in which
tensors are set. Tensor equations, inasmuch invariant under change of
basis regardless of the reason that this is due, will be invariant under
coordinate transformation (that is, valid in any reference accessible
via coordinate transformation).
In these features lies the strength and the greatest interest of the tensor
formulation.
63
4 Tensors in manifolds
We have so far considered vectors and tensors defined at a single point
P . This does not mean that P should be an isolated point: vectors and
tensors are usually given as vector fields or tensor fields defined in
some domain or continuum of points.
Henceforth we will not restrict to ℝ n space, but we shall consider a
wider class of spaces that retain some basic analytical properties of
ℝ n , such as the differentiability (of functions therein defined).
Belongs to this larger class any n-dimensional space M whose points
may be put in a one-to-one (= bijective) and continuous correspond
ence with the points of ℝ n (or its subsets). Continuity of correspond
ence means that points close in space M have as image points also
close in ℝ n , that is a requisite for the differentiability in M .
Under these conditions we refer to M as a differentiable manifold.
(Here we'll also use, instead of the term “manifold”, the less technical
“space” with the same meaning).
Roughly speaking, a differentiable manifold of dimension n is a space
that can be continuously “mapped” in ℝ n (with the possible excep
tion of some points).
64
is required that the correspondence φ(Q) ↔ ψ(Q) is itself
continuous and differentiable.
65
correspondence φ .
66
shown in the figure.
x3
x 2 = const x 1 = const
P
x2
x 3 = const
x1
x2
P'
1
d x x
67
increases from s to s + ds and the coordinates of the point move from
x1 , x 2 , ... x n to x1dx 1 , x 2dx 2 , ... x ndx n .
comp
The displacement from P to P' is then d ⃗x → dxμ , or:
d ⃗x = dx μ e⃗μ . 4.1
This equation works as a definition of a vector basis {e⃗μ } in P , in
such a way that each basis-vector is tangent to a coordinate line.
Note that d ⃗x is a vector (inasmuch it is independent of the
coordinate system which we refer to).
A basis of vectors defined in this way, tied to the coordinates, is called
a coordinate vector basis. Of course it is a basis among all the
comp
possible ones, but the easiest to use because only here d ⃗x → dx μ **
It is worth to note that this basis is in general P-depending: ⃗eμ =⃗e μ ( P) .
μ
▪ A further relation between d x and its components dx is:
dxμ = ẽ μ , d ⃗x 〉 4.2
(it is nothing but the rule eq.1.10, according to which we get the
components by applying the vector to the dual basis); its blockwise
representation in T-mosaic metaphor is:
ẽ μ ≡
e
e⃗μ → dx μ
d x ≡ dx μ
μ
The basis { ẽ } is the coordinate covector basis, dual to the previ
ously introduced coordinate vector basis.
▪ We aim now to link even this covector basis to the coordinates.
Let us consider a scalar function of the point f defined on the manifold
(at least along l) as a function of the coordinates. Its variation from the
initial point P along a path d ⃗x is given by its total differential:
f μ
df = μ dx 4.3
x
* If the basis is not that defined by eq.4.1, we can still decompose d ⃗x along the
μ μ
coordinate lines x , but d x is no longer a component along a basis-vector.
68
f
We note that the derivatives are the components of a covector
xμ
μ
because their product by components d x of the vector d ⃗x gives
the scalar df (in other words, eq.4.3 can be interpreted as an
heterogeneous scalar product).**
̃ f comp f
Let us denote by ∇ → μ this covector, whose components
x
in a coordinate basis are the partial derivatives of the function f , so
that it can be identified with the gradient of f itself.
In symbols, in a coordinate basis:
comp
̃ ̃ →
grad ≡ ∇ 4.4
xμ
or, in short notation μ ≡ μ for partial derivatives:
x
∇̃ comp
→ μ 4.5
Note that the gradient is a covector, not a vector.
The total differential (eq.4.3) can now be written in vector form:
̃ f ( d ⃗x ) = ∇
df =∇ ̃ f , d ⃗x 〉
ν
If as scalar function f we take one of the coordinates x ****we get:
x , d x 〉
dx =
By comparison with the already known dx =e , d x 〉 (eq.4.2) ⇒
⇒ x in a coordinate basis
e = 4.6
This means that in coordinate bases the basis-covector e coincides
with the gradient of the coordinate x and is therefore oriented in
direction of the faster variation of the coordinate itself, which is the
ν
direction normal to the coordinate surface x = const .
69
as shown in the following figures that illustrate the 3D case:
x3
e3
3
x 1 = const
e
P e2
t P
ns
co
2 = e
2
x e
1
e1
x 2
x 1 x 3 = const
70
(ρ, θ) d ⃗x is given by:
d ⃗x = êρ d ρ+êθ ρd θ 4.8
μ
that is compatible with the coordinate basis condition d ⃗x = e⃗μ dx
provided we take as a basis:
e = e , e = e 4.9
Here the basis-vector e “includes” ρ , it's no longer a unit vector and
also varies from point to point. Conversely, the basis of unit vectors
{ e , e } which is currently used in Vector Analysis for polar
coordinates is not a coordinate basis.
▫ An example of a non-coordinate basis in 3D rectangular
coordinates is obtained by applying a 45° rotation to unit vectors
⃗i , ⃗j in the horizontal plane, leaving ⃗k unchanged.
Writing ⃗i , ⃗j in terms of the new rotated vectors ⃗i ' , ⃗j' we get
for d x :
d ⃗x = √ 2 ( ⃗i '−⃗j' )dx + √ 2 ( ⃗i ' + ⃗j') dy + ⃗
k dz , 4.10
2 2
and this relation matches the coordinate bases condition (eq.4.1)
only by taking as basis vectors:
√ 2 ( ⃗i '−⃗j') , √ 2 ( ⃗i ' +⃗j') , ⃗
k
2 2
that is nothing but the old coordinate basis {⃗i , ⃗j , ⃗
k } . It goes
without saying that the rotated basis {⃗i ' ,⃗j' , ⃗k } is not a
coordinate basis (as we clearly see from eq.4.10).
This example shows that, as already said, only in a coordinate
comp
basis d ⃗x → dx μ is true. Not otherwise!
μ
* ⃗i and ̃i (in general ⃗e ν and ẽ ) remain distinct entities, although super-
imposable. In Cartesian they have the same expansion by components and their
"arrows" coincide, but under change of coordinates they return to differ.
71
▫ Indeed, it is already known that in rectangular the coordinate
basis is made by the unit vectors ⃗i , ⃗j , ⃗
k . They can be written in
terms of components:
⃗i = (1 , 0 , 0) , ⃗j =(0 ,1 , 0) , ⃗ k = (0 , 0 ,1)
Then the coordinate covector basis will necessarily be:
ĩ = (1 ,0 , 0) , ̃j = (0 , 1 , 0) , k̃ = (0 , 0 ,1)
because in this way only you get:
〈 ̃i , ⃗i 〉 = 1 , 〈 ̃i , ⃗j〉 = 0 , 〈 ̃i , ⃗
k〉=0 ,
〈 ̃j , ⃗i 〉 = 0 , 〈 ̃j , ⃗j〉 = 1 , 〈 ̃j , ⃗k 〉 = 0 ,
〈 k̃ , ⃗i 〉 = 0 , 〈 k̃ , ⃗j 〉 = 0 , 〈 k̃ , ⃗k 〉 = 1 ,
as required by the condition of duality.
v In polar coordinates, on the contrary, the vector coordinate basis
{e⃗ν } ≡ { ⃗e ρ , e⃗θ } does not match with the covector basis deduced from
ν ρ θ
duality condition {ẽ } ≡ { ẽ , ẽ } .
▫iIndeed: unit vectors in polar coordinates are by definition:
ê ρ =(1 , 0) , ê θ = (0 , 1) . Thus, from eq.4.9:
e⃗ρ = (1 , 0) , e⃗θ = (0 , ρ )
ρ μ μ
Taking ẽ = (a ,b) , ẽ θ = (c , d ) , from 〈 ẽ , e⃗ν 〉 = δ ν ⇒
1 = 〈 ẽ ρ , e⃗ρ 〉 = ( a ,b) (1 ,0)= a
ρ
0 = 〈 ẽ , e⃗θ 〉 = (a ,b) (0 ,ρ)= b ρ ⇒ b=0 ⇒ ẽ ρ=(1 , 0)
⇒ θ
0 = 〈 ẽ , e⃗ρ 〉 = (c , d ) (1 , 0) = c
1
1 = 〈 ẽ θ , e⃗θ 〉 = ( c ,d ) ( 0 , ρ ) = d ρ ⇒ d = ρ1 ⇒ ẽ θ=(0 , ρ )
Note that the dual switch tensor G , even if previously defined, has no
role in the calculation of a basis from its dual.
72
apex ′ marks the new one). But which Λ must we use for a given transformation of coordinates?
To find the actual form of Λ we impose a twofold condition on the basis-vectors: that the starting basis is a coordinate basis (first equality) and that the arrival basis is a coordinate basis too (second equality):
    d⃗x = e⃗_μ dx^μ = e⃗_ν′ dx^ν′        (4.11)
Each new coordinate x^ν′ will be related to all the old coordinates x^1, x^2, ... x^n by a function x^ν′ = x^ν′(x^1, x^2, ... x^n), i.e. x^ν′ = x^ν′(..., x^μ, ...), whose total differential is:
    dx^ν′ = (∂x^ν′/∂x^μ) dx^μ
Substituting into the double equality eq.4.11:
    d⃗x = e⃗_μ dx^μ = e⃗_ν′ (∂x^ν′/∂x^μ) dx^μ  ⇒  e⃗_μ = (∂x^ν′/∂x^μ) e⃗_ν′        (4.12)
A comparison with the change of basis-vectors eq.3.3 (inverse, from new to old):
    e⃗_μ = Λ^ν′_μ e⃗_ν′
leads to the conclusion that:
    Λ^ν′_μ = ∂x^ν′/∂x^μ        (4.13)
⇒ Λ^ν′_μ is the matrix element of Λ complying with the definition eq.3.7 that we were looking for.
Since the Jacobian matrix of the transformation is defined as:
    J ≝ [∂x^ν′/∂x^μ] =
        ⎡ ∂x^1′/∂x^1  ∂x^1′/∂x^2  ⋯  ∂x^1′/∂x^n ⎤
        ⎢ ∂x^2′/∂x^1  ∂x^2′/∂x^2  ⋯  ∂x^2′/∂x^n ⎥        (4.14)
        ⎢      ⋮           ⋮               ⋮     ⎥
        ⎣ ∂x^n′/∂x^1  ∂x^n′/∂x^2  ⋯  ∂x^n′/∂x^n ⎦
we can identify:
    Λ ≡ J        (4.15)
▪ The coordinate transformation thus has as its related matrix the Jacobian matrix J (whose elements are the partial derivatives of the new coordinates with respect to the old ones). It states, together with its inverse and transpose, how the old coordinate bases transform into the new coordinate bases, and consequently how the components of vectors and tensors transform when coordinates change.
After a coordinate transformation we have to transform bases and components by means of the related matrix Λ ≡ J (and its inverse / transpose) to continue working in a coordinate basis.
▪ As always, to avoid confusion, it is more practical to start from the transformation formulas and take care to balance the indexes; the indexes of Λ will suggest the proper partial derivatives to be taken. For example, P_ν′ = Λ^μ_ν′ P_μ ⇒ use the matrix Λ^μ_ν′ = ∂x^μ/∂x^ν′.
Note that the upper / lower position of the index marked ′ in the partial derivatives agrees with its upper / lower position on the symbol Λ.
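As a concrete sketch of eq.4.13 and eq.4.14 (assuming sympy; the choice of the polar example is mine), the related matrix Λ and its inverse can be generated automatically:

```python
# Lambda^{nu'}_mu = dx^{nu'}/dx^mu for old coordinates (x, y) and new
# coordinates (rho, theta); the inverse matrix gives dx^mu/dx^{nu'}.
import sympy as sp

x, y = sp.symbols('x y', positive=True)
new = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])   # rho(x,y), theta(x,y)

J = new.jacobian([x, y])          # the related matrix Lambda = J (eq.4.15)
print(sp.simplify(J))
print(sp.simplify(J.inv()))       # used to transform covariant indexes, e.g. P_nu'
```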
    T^μ′_ν′ = (∂x^μ′/∂x^μ)(∂x^ν/∂x^ν′) T^μ_ν        (4.16)
which is only a different way to write eq.3.10, in coordinate basis. The generalization of eq.3.10 to tensors of any rank is obvious.
In general, a tensor of rank (h k) is said to be contravariant of order (= rank) h and covariant of order k.
Traditional texts begin here, using the transformation laws of components under coordinate transformation* as a definition: a tensor is defined as a multidimensional entity whose components transform according to the scheme exemplified by eq.4.16.
* Although the traditional “by components” approach avoids speaking about bases, it is implied that there one always works in a coordinate basis (both old and new).
4.8 Cartesian tensors
If we further restrict to linear orthogonal transformations, we identify the class of Cartesian tensors (wider than that of affine tensors). A linear orthogonal transformation is represented by an orthogonal matrix.* Also for Cartesian tensors the same matrix [a^μ_ν′] rules both
* A matrix is orthogonal when its rows are orthogonal vectors (i.e. scalar product in pairs = 0), and so are its columns too.
follows that G must be chosen appropriately.
We will denote by g the (0 2) tensor we pick out among the possible switches G to get the desired definition of distance. The distance ds given by the tensor equation:
    ds² = g(d⃗x, d⃗x)        (4.19)
is an invariant scalar and does not depend on the coordinate system. At this point the tensor g defines the geometry of the manifold, i.e. its metric properties, based on the definition of distance between two points. For this reason g is called metric tensor, or “metric”.
▪ If (and only if) we use coordinate bases, the distance can be written:
    ds² = g_αβ dx^α dx^β        (4.20)
▫ Indeed, (only) in a coordinate basis d⃗x = e⃗_α dx^α holds, hence:
    ds² = g(d⃗x, d⃗x) = g(e⃗_α dx^α, e⃗_β dx^β) = g(e⃗_α, e⃗_β) dx^α dx^β = g_αβ dx^α dx^β
Remark:
Other definitions of scalars depend on g. The choice made in favor of some g conditions the result of the vector scalar product and, through this, the value of scalars such as the vector magnitude, the arc length or distance, and other scalars defined by means of g itself. On the other hand, the definitions of scalar product between vectors, vector magnitude, arc length and so on are given by tensor equations, whose results are invariant.
To overcome the apparent contradiction, let us imagine the steps that lead to the construction of a metric space:
• given a “landscape”, i.e. a space with a coordinate system and containing “objects” like scalars, vectors, tensors,
• let's drop upon this space some g as a “native” metric;
• by means of it we calculate scalar products, vector magnitudes, distances, etc. The value that results for these scalars obviously depends on the imposed g;
• from this point on, under all the possible changes of coordinates, the result of the scalar product between vectors no longer changes. In fact, when a coordinate change is made, both the components of the vectors ⃗A, ⃗B and the components of g change in such a way that g(⃗A, ⃗B) remains unchanged; then the vector magnitude, the distance, etc. remain unchanged too.
The vector scalar product and all the resulting scalars are therefore invariant under coordinate transformations, although they are affected by the “native” metric imposed on the space in the first instance.
    g_αβ = { 1 for α = β ; 0 for α ≠ β }   i.e.   [g_αβ] = diag(1, 1, 1) = diag(+1)        (4.24)
Note that in this case g⁻¹ = g, namely g^αβ = g_αβ.
4.12 Generalized distances
The definition of distance given by eq.4.20 is more general than the Euclidean case and includes any bilinear form in dx^1, dx^2, ... dx^n such as:
    g₁₁dx^1dx^1 + g₁₂dx^1dx^2 + g₁₃dx^1dx^3 + ... + g₂₁dx^2dx^1 + g₂₂dx^2dx^2 + ... + g_nn dx^n dx^n        (4.25)
where the g_αβ are subject to few restrictions. Inasmuch as they build up the matrix that represents the metric tensor g, we must at least require that their matrix is symmetric and that det[g_αβ] ≠ 0, so that g is invertible and g⁻¹ exists.
▫ Note, however, that in any case g is symmetrizable: for example, if it were g₁₂ ≠ g₂₁ in the bilinear form eq.4.25, just replacing both by g′₁₂ = g′₂₁ = (g₁₂ + g₂₁)/2 would change nothing in the definition of distance.
Metric spaces where the distance is defined by eq.4.19 with any g, provided it is symmetric and invertible, are called Riemann spaces.
▪ Given a manifold, its metric properties are fully described by the metric tensor g associated to it. As a tensor, g does not depend on the coordinate system imposed onto the manifold: g does not change if the coordinate system changes. However, we do not know how to represent g by itself: its representation is only possible in terms of components g_αβ, and they do depend upon the specific coordinate system (and therefore upon the coordinate basis). For a given g we then have different matrices [g_αβ], one for each coordinate system, all of which mean the same g, i.e. the same metric properties for the manifold.
For instance, for the 2D Euclidean plane, the metric tensor g which expresses the usual Euclidean metric properties is represented by [g_αβ] = diag(1, 1) in Cartesian coordinates and by [g_α′β′] = diag(1, ρ²) in polar coordinates. Of course, there are countless other [g_μ′ν′] which can be obtained from the previous ones by coordinate transformation and the consequent change of basis by means of the related matrix Λ (remind that g_μ′ν′ = Λ^α_μ′ Λ^β_ν′ g_αβ).
In other words, the same Euclidean metric can be expressed by all the [g_μ′ν′] attainable by coordinate transformation from the Cartesian rectangular [g_αβ] = diag(1, 1).
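This statement is easy to verify symbolically. A sketch with sympy (my construction; here Λ is the Jacobian of the old Cartesian coordinates with respect to the new polar ones):

```python
# Check that g_{mu'nu'} = Lambda^al_{mu'} Lambda^be_{nu'} g_{al be} turns
# diag(1,1) (Cartesian) into diag(1, rho^2) (polar).
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
old = sp.Matrix([rho*sp.cos(theta), rho*sp.sin(theta)])  # x, y as functions of rho, theta

L = old.jacobian([rho, theta])     # Lambda^alpha_{mu'} = dx^alpha/dx^{mu'}
g = sp.eye(2)                      # Euclidean metric in Cartesian coordinates
print(sp.simplify(L.T * g * L))    # Matrix([[1, 0], [0, rho**2]])
```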
we get, taking its derivative, the transformation rule for the derivatives of the components of V⃗:*
    ∂V^ν′/∂x^μ′ = (∂x^ν′/∂x^ν)(∂x^μ/∂x^μ′) ∂V^ν/∂x^μ + V^ν (∂x^μ/∂x^μ′)(∂²x^ν′/∂x^μ∂x^ν)        (4.27)
which is not the tensor components' transformation scheme eq.4.16, because of the presence of an additional term in ∂². It follows that the n² partial derivatives ∂_μV^ν do not form a tensor. However, against the first appearance, they are not the components of the derivative of the vector V⃗ either: in the next paragraph a true vector “derivative of vector” ∂_μV⃗ will be built, and it will turn out to be a tensor.
* Leibniz rule for the derivative of a product: (f·g)′ = f′·g + f·g′, and then the “chain rule”.
components Γ^λ_μν each (for λ = 1, 2, ... n).
The coefficients Γ^λ_μν are called Christoffel symbols or connection coefficients; they are n³ coefficients in total. Eq.4.30 is equivalent to:
    ∂_μ e⃗_ν →(comp) Γ^λ_μν        (4.31)
The Christoffel symbols are therefore the components of the vectors “partial derivative of a basis-vector” on the basis of the vectors themselves, or:
The “recipe” of the partial derivative of a basis-vector uses as “ingredients” the same basis-vectors, while the Christoffel symbols represent their amounts.
Inserting eq.4.30 into eq.4.28, the latter becomes:
    ∂_μV⃗ = e⃗_ν ∂_μV^ν + Γ^λ_μν e⃗_λ V^ν =
(in the last right term λ, ν are dummy summation indexes: they can therefore be freely changed and interchanged; interchanging λ, ν:)
    = e⃗_ν ∂_μV^ν + Γ^ν_μλ e⃗_ν V^λ = e⃗_ν (∂_μV^ν + Γ^ν_μλ V^λ)
In short:
    ∂_μV⃗ = e⃗_ν (∂_μV^ν + Γ^ν_μλ V^λ)        (4.32)
                 ⏟ V^ν_;μ
We define the covariant derivative of a vector with respect to x^μ:
    V^ν_;μ ≝ ∂_μV^ν + Γ^ν_μλ V^λ        (4.33)
(we introduce here the subscript ; to denote the covariant derivative).
The covariant derivative V^ν_;μ is a scalar which differs from the respective partial derivative ∂_μV^ν by a corrective term in Γ.
So far we did nothing but write the derivative of a vector in a new form, introducing along the way the Christoffel symbol Γ and the covariant derivative.
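Eq.4.33 is a recipe one can execute mechanically. A minimal sketch with sympy, assuming the nonzero Christoffel symbols of the plane in polar coordinates (Γ^ρ_θθ = −ρ, Γ^θ_ρθ = Γ^θ_θρ = 1/ρ, obtainable later from eq.4.73) and a sample field of my own choosing:

```python
# V^nu_{;mu} = d_mu V^nu + Gamma^nu_{mu lam} V^lam   (eq.4.33), polar coordinates.
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
coords = [rho, theta]

Gamma = [[[0]*2 for _ in range(2)] for _ in range(2)]   # Gamma[n][m][l] = Gamma^n_{ml}
Gamma[0][1][1] = -rho                                   # Gamma^rho_{theta theta}
Gamma[1][0][1] = Gamma[1][1][0] = 1/rho                 # Gamma^theta_{rho theta}

V = [rho**2, sp.sin(theta)]                             # sample components V^rho, V^theta

cov = [[sp.simplify(sp.diff(V[n], coords[m])
                    + sum(Gamma[n][m][l]*V[l] for l in range(2)))
        for m in range(2)] for n in range(2)]
print(sp.Matrix(cov))        # rows: index nu ; columns: index mu
```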
▪ The important fact is that, unlike the ordinary derivative, the covariant derivative has tensorial features, in the sense that the n² covariant derivatives eq.4.33 transform according to eq.4.16 (see in Appendix a check in extenso) and can therefore be considered components of tensors.
▪ From eq.4.32 we see that the derivative of a vector is a vector whose components are the covariant derivatives of the vector, that is, a tensor:
    ∂_μV⃗ = e⃗_ν V^ν_;μ        (4.34)
The partial derivative of a vector is a vector (tensor) whose scalar components are the n covariant derivatives of the vector.
▪ When do covariant derivative and partial derivative coincide? From eq.4.33 we see that, for this to happen, it must be ∀Γ = 0. Since eq.4.30 means the implication
    ∀e⃗_ν = constant  ⇒  ∀Γ = 0
we deduce that:
when all basis-vectors e⃗_ν are constant the covariant derivative equals the partial derivative. This is the case of Euclidean-Cartesian spaces (and of flat-metric spaces in general).
It is in these same cases that the derivative of a vector takes the simple form:
    ∂V⃗/∂x^μ = e⃗_ν ∂V^ν/∂x^μ
    L̃ = L_λ ẽ^λ
Really, L̃ is labeled with an upper ν and a lower μ index, then:
    L̃^ν_μ = L^ν_μλ ẽ^λ
hence:
    ∂_μ ẽ^ν = L^ν_μλ ẽ^λ        (4.36)
which, placed into eq.4.35, gives:
    ∂_μÃ = ẽ^ν ∂_μA_ν + L^ν_μλ ẽ^λ A_ν =
interchanging the dummy indexes λ, ν in the last term:
    = ẽ^ν ∂_μA_ν + L^λ_μν ẽ^ν A_λ = ẽ^ν (∂_μA_ν + L^λ_μν A_λ)        (4.37)
▪ How are Γ and L related? They simply differ by a sign:
    L^λ_μν = −Γ^λ_μν        (4.38)
We define the covariant derivative of a covector with respect to x^μ:
    A_ν;μ ≝ ∂_μA_ν − Γ^λ_μν A_λ        (4.41)
analogous to eq.4.33. It is likewise a scalar.
4.15 The gradient ∇̃ at work
We have already stated that the gradient ∇̃ is a covector whose components in coordinate bases are the partial derivatives with respect to the coordinates (eq.4.4, eq.4.5):
    grad̃ ≡ ∇̃ →(comp) ∂_μ    namely    ∇̃ = ẽ^μ ∂_μ        (4.43)
▪ Tensor outer product ∇̃ ⊗ vector
The result is the (1 1) tensor gradient of vector. In coordinate bases:
    ∇̃ ⊗ V⃗ = ẽ^μ ∂_μ ⊗ V⃗
that is to say:
    ∇̃V⃗ = ẽ^μ ∂_μV⃗        (4.44)
∂_μV⃗ comes from eq.4.32; substituting:
    ∇̃V⃗ = ẽ^μ e⃗_ν (∂_μV^ν + Γ^ν_μλ V^λ)        (4.45)
On the other hand, ∇̃V⃗ may be formally written as an expansion on its components, which we'll denote by ∇_μV^ν:
    ∇̃V⃗ = ∇_μV^ν ẽ^μ e⃗_ν        (4.46)
and by comparison between the last two we see that:
    ∇_μV^ν = ∂_μV^ν + Γ^ν_μλ V^λ        (4.47)
or, in the alternative notation suggested by eq.4.33:
    ∇_μV^ν ≡ V^ν_;μ        (4.48)
Eq.4.47 qualifies the covariant derivative of a vector as a component of the tensor ∇̃V⃗ (≡ ∇̃ ⊗ V⃗) gradient of vector, while introducing a further notation ∇_μ for the covariant derivative.
The covariant derivatives of a vector ∇_μV^ν are the n² components of the tensor “gradient of vector” ∇̃ ⊗ V⃗ (usually written ∇̃V⃗).
▫ In Appendix it is shown that the covariant derivatives V^ν_;μ transform according to the tensor contra / covariant scheme exemplified by eq.4.16, and this confirms that the gradient ∇̃V⃗ of the vector of which they are components is indeed a tensor.
    ∂_μV⃗ = e⃗_ν ∇_μV^ν        (4.49)
that is:
    ∂_μV⃗ →(comp) ∇_μV^ν        (4.50)
[Sketch: the partial derivative ∂_μV⃗ ≡ ∂V⃗/∂x^μ and the gradient ∇̃V⃗, with the covariant derivatives ∇_μV^ν as scalar components.]
▪ In particular, when the vector is a basis-vector e⃗_ν the sketch above reduces to:
[Sketch: ∇̃e⃗_ν has the n² components Γ^λ_μν; the n partial derivatives ∂_μe⃗_ν are vectors.]
Notation
The covariant derivative of V^ν with respect to x^μ is denoted by ∇_μV^ν or V^ν_;μ.
Not to be confused with the partial derivative ∂V^ν/∂x^μ, denoted by ∂_μV^ν or V^ν_,μ.
The lower index after the , means partial derivative;
the lower index after the ; means covariant derivative.
▪ Tensor outer product ∇̃ ⊗ covector
The result is the (0 2) tensor gradient of covector. In coordinate bases:
    ∇̃ ⊗ Ã = ẽ^μ ∂_μ ⊗ Ã    or    ∇̃Ã = ẽ^μ ∂_μÃ        (4.52)
We go on as in the case of the gradient of a vector: ∂_μÃ comes from eq.4.40; replacing it:
    ∇̃Ã = ẽ^μ ẽ^ν (∂_μA_ν − Γ^λ_μν A_λ)        (4.53)
and by comparison with the generic definition
    ∇̃Ã = ∇_μA_ν ẽ^μ ẽ^ν        (4.54)
we get:
    ∇_μA_ν = ∂_μA_ν − Γ^λ_μν A_λ        (4.55)
which is equivalent to eq.4.48 in the alternative notation:
    ∇_μA_ν ≡ A_ν;μ        (4.56)
Eq.4.55 represents the covariant derivative of a covector and qualifies it as a component of the tensor gradient of covector ∇̃Ã (≡ ∇̃ ⊗ Ã). Eq.4.55 is the analogue of eq.4.47, which was stated for vectors.
▪ The partial derivative of a covector (eq.4.42) may be written in the new notation as:
    ∂_μÃ = ẽ^ν ∇_μA_ν        (4.57)
which means:
    ∂_μÃ →(comp) ∇_μA_ν        (4.58)
▪ For the gradient and the derivative of a covector a sketch like that drawn for vectors applies, according to eq.4.52, eq.4.57:
    ∇̃Ã = ẽ^μ ∂_μÃ = ẽ^μ ẽ^ν ∇_μA_ν
Mnemo
To write the covariant derivative of vectors and covectors:
▪ we had better write the Γ before the component, not vice versa
▪ 1st step:
    ∇_μA^ν = ∂_μA^ν + Γ^ν_μ·    or    ∇_μA_ν = ∂_μA_ν − Γ·_μν
(adjust the indexes of Γ according to those of the 1st member)
▪ 2nd step: paste a dummy index κ to Γ and to the component A that follows, in order to balance the upper / lower indexes:
    . . . . . + Γ^ν_μκ A^κ    or    . . . . . − Γ^κ_μν A_κ
    ∇̃T = ∇_μT^αβ_γ ẽ^μ e⃗_α e⃗_β ẽ^γ    namely:
    ∇̃T →(comp) ∇_μT^αβ_γ = ∂_μT^αβ_γ + Γ^α_μκ T^κβ_γ + Γ^β_μκ T^ακ_γ − Γ^κ_μγ T^αβ_κ        (4.59)
In general, the covariant derivative of a tensor (h k) is obtained by attaching to the partial derivative h terms in Γ with positive sign and k terms in Γ with negative sign. Treat the indexes individually, one after the other in succession.
Covariant derivatives are n^(r+1) in number when the tensor has rank r. Each covariant derivative consists of r terms in Γ, since each one takes into account the variability of a single index of basis-vector (or covector), whose total number is r.
Mnemo
To write the covariant derivative of tensors, for example ∇_μT^αβ_γ (a programmable sketch of this recipe follows below):
▪ after the term in ∂_μ consider the indexes of T one by one:
▪ write the first term in Γ as if it were ∇_μT^α, then complete T with its remaining indexes: + Γ^α_μκ T^κβ_γ
▪ write the second term in Γ as if it were ∇_μT^β, then complete T with its remaining indexes: + Γ^β_μκ T^ακ_γ
▪ write the third term in Γ as if it were ∇_μT_γ, then complete T with its remaining indexes: − Γ^κ_μγ T^αβ_κ
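The recipe translates directly into code. A generic sketch for a (1 1) tensor (sympy; the function name and conventions are mine):

```python
# nabla_m T^a_b = d_m T^a_b + Gamma^a_{mk} T^k_b - Gamma^k_{mb} T^a_k
import sympy as sp

def cov_derivative_11(T, Gamma, coords):
    """Covariant derivative of a (1,1) tensor; Gamma[l][m][n] = Gamma^l_{mn}."""
    n = len(coords)
    return [[[sp.simplify(
                sp.diff(T[a][b], coords[m])
                + sum(Gamma[a][m][k]*T[k][b] for k in range(n))   # upper index: +
                - sum(Gamma[k][m][b]*T[a][k] for k in range(n)))  # lower index: -
              for b in range(n)]
             for a in range(n)]
            for m in range(n)]
```

Each further upper index of a bigger tensor would add another + term, each further lower index another − term, exactly as eq.4.59 prescribes.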
▪ Comments on the form of the components of the tensor ∇̃T
    ∇̃ →(comp) ∂_μ holds in general for tensors of any rank r:
    ∇̃T →(comp) ∂_μT
The ∂_μT are in any case the n components of ∇̃T; however, except in the case r = 0, they are not the scalar components but just the first-level components, n in number. Instead, the scalar components of ∇̃T are the n^(r+1) covariant derivatives ∇_μ... (r+1 being the rank of ∇̃T when T has rank r). For r = 0, i.e. T ≡ f scalar, first-level components ∂_μT and scalar components coincide. For r = 1 the ∂_μT are vectors, and for r > 1 they are tensors.
We write in general:
    ∇̃ →(comp) ∇_μ        (4.60)
where the ∇_μ are the covariant derivatives (scalar components). For the very way in which they are defined, the covariant derivatives ∇_μ automatically assume a different form depending on the object to which the gradient ∇̃ is applied.
▪ Inner product ∇̃ · vector
    ∇̃(V⃗) = ∇_μV^μ = V^μ_;μ = div V⃗        (4.61)
It is the scalar divergence of a vector. Note that ∇̃ applied as an inner product to a vector looks like a covariant derivative:
    ∇_μV^μ = ∂_μV^μ + Γ^μ_μκ V^κ        (4.62)
but with equal (dummy) indexes, and it is actually a number.
The divergence of a vector div V⃗ looks like a covariant derivative, except for the single repeated index: ∇_μV^μ or V^μ_;μ.
To justify the presence of the additional terms in Γ it is appropriate to think of the divergence ∇_μV^μ as a covariant derivative ∇_μV^ν contracted on the indexes μ, ν.
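In polar coordinates this contraction reproduces the familiar divergence of Vector Analysis. A sketch with sympy (assuming, as before, that Γ^θ_θρ = 1/ρ is the only symbol surviving the contraction; the field names are mine):

```python
# div V = d_mu V^mu + Gamma^mu_{mu k} V^k in polar coordinates, where the only
# nonzero contracted symbol is Gamma^mu_{mu rho} = 1/rho.
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
Vr = sp.Function('Vr')(rho, theta)      # V^rho
Vt = sp.Function('Vt')(rho, theta)      # V^theta

div = sp.diff(Vr, rho) + sp.diff(Vt, theta) + Vr/rho
familiar = sp.diff(rho*Vr, rho)/rho + sp.diff(Vt, theta)
print(sp.simplify(div - familiar))      # 0: the two expressions coincide
```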
▪ Inner product ∇̃ · tensor (h k)
Divergence(s) of tensors of higher order can similarly be defined, provided at least one upper index exists. The result is a tensor of rank (h−1 k), the tensor divergence of a tensor. If there is more than one upper index, various divergences can be defined, one for each upper index. For each index of the tensor, both upper and lower, including the one involved, an adequate term in Γ must be added. For example, the divergence with respect to the β index of the rank (2 1) tensor T = T^αβ_γ e⃗_α e⃗_β ẽ^γ is the (1 1) tensor:
    ∇̃·(T) = ∇_βT^αβ_γ e⃗_α ẽ^γ    namely:
    ∇̃·(T) →(comp) ∇_βT^αβ_γ = ∂_βT^αβ_γ + Γ^α_βκ T^κβ_γ + Γ^β_βκ T^ακ_γ − Γ^κ_βγ T^αβ_κ        (4.64)
Note that it is the contraction of ∇̃T (eq.4.59) for μ = β.
▫ It can be obtained like eq.4.66, passing through ∇_μg^αβ and ∂_μg^αβ and then considering that ∂_μẽ^ν = −Γ^ν_μλ ẽ^λ (eq.4.39).
The last eq.4.67, eq.4.68 are often respectively written as:
    ∇_κ g_αβ = 0  or  g_αβ;κ = 0        (4.69)
and:
    ∇_κ g^μν = 0  or  g^μν_;κ = 0
would be null in Cartesian (since all its components are zero) and not null in spherical coordinates, against the invariance of a tensor under change of coordinates.
The Christoffel symbols are rather a set of n³ coefficients depending on 3 indexes, but they do not form any tensor. However, observing the way it was introduced (∂_μe⃗_ν = Γ⃗_μν = Γ^λ_μν e⃗_λ, eq.4.29), each Γ⃗_μν related to a fixed pair of lower indexes is a vector whose components are marked by the upper index λ. It is therefore valid (only for this index) to raise / lower by means of g:
    g_κλ Γ^λ_μν = Γ_μνκ        (4.71)
▪ An important property of the Christoffel symbols is that they are symmetric, in coordinate bases, with respect to the exchange of the lower indexes:
    Γ^λ_μν = Γ^λ_νμ        (4.72)
and take into consideration, for instance, the coefficient Γ³₂₁ from the last equation. By observing the role it plays in the equation where it appears, it can be understood as the component along e⃗₃ of the derivative ∂₂ of the basis-vector e⃗₁. In general:
    Γ^λ_μν can be understood as “the component along e⃗_λ of the derivative ∂_μ of the basis-vector e⃗_ν”.
The whole story is: moving a small step d⃗x from an initial point, all basis-vectors change; among the others, the ν-th basis-vector e⃗_ν will change; n partial derivatives will describe its variation along each of the n coordinate lines; in particular, ∂_μe⃗_ν will stand for its variation rate along the coordinate line x^μ; ∂_μe⃗_ν is a vector that can be decomposed into n components in the direction of each basis-vector. We denote by Γ^λ_μν its component in the direction of e⃗_λ.
By summing member to member (1st − 2nd + 3rd equation) and using the symmetry property of the lower indexes eq.4.72 (which applies in coordinate bases) we get:
    g_αβ,μ − g_βμ,α + g_μα,β = 2 g_αλ Γ^λ_μβ
Multiplying both members by g^κα and since g^κα g_αλ = δ^κ_λ:
    Γ^κ_μβ = ½ g^κα (g_αβ,μ − g_βμ,α + g_μα,β)        (4.73)   q.e.d.
in the sense that, for a given space, either all of them are valid or none.
▫ In fact all the propositions make a single indivisible block, because they imply each other, as can easily be seen from eq.2.39, eq.4.31, eq.4.47, eq.4.73.
In general we will call “flat” a metric for which all these properties hold (particularly ∀Γ = 0).
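Eq.4.73 just obtained lends itself to automation. A sketch with sympy (my code, not the book's) that feeds it the polar metric diag(1, ρ²) and recovers the symbols used in the earlier examples:

```python
# Gamma^k_{mb} = 1/2 g^{ka} (g_{ab,m} - g_{bm,a} + g_{ma,b})   (eq.4.73)
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
coords = [rho, theta]
g = sp.Matrix([[1, 0], [0, rho**2]])
gi = g.inv()
n = len(coords)

Gamma = [[[sp.simplify(sp.Rational(1, 2)*sum(
              gi[k, a]*(sp.diff(g[a, b], coords[m])
                        - sp.diff(g[b, m], coords[a])
                        + sp.diff(g[m, a], coords[b])) for a in range(n)))
           for b in range(n)] for m in range(n)] for k in range(n)]

for k in range(n):
    for m in range(n):
        for b in range(n):
            if Gamma[k][m][b] != 0:
                print(f"Gamma^{k}_{{{m}{b}}} =", Gamma[k][m][b])
# Gamma^0_{11} = -rho ; Gamma^1_{01} = Gamma^1_{10} = 1/rho
```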
4.19 Covariant derivative and invariance of tensor equations
A tensor equation contains only tensors, thus it may contain covariant derivatives but not ordinary derivatives (which are not tensors). For instance, ∇̃g = 0, or its componentwise equivalent g_αβ;μ = 0 (which actually stands for n³ equations), are tensor equations; that is not the case for eq.4.73.
A strategy often used to obtain tensor equations is called “comma goes to semicolon” and takes advantage of the fact that in a “flat” coordinate system the ordinary derivative and the covariant derivative coincide, since all Γ are zero. In practice:
Working in a “flat” coordinate system (e.g. Cartesian), a partial derivatives equation that proves to be valid in this system can be turned into a tensor equation simply by replacing partial derivatives with covariant derivatives, i.e. by replacing commas with semicolons. The tensor equation obtained in this way is invariant and applies in any reference accessible via coordinate transformation.
Of course this is possible in flat spaces only. However, if the space is curved, the strategy can still be applied point by point, because any curved space is locally flat at every (non-“pathological”) point, as we will see later.
4.20 T-mosaic representation of gradient, divergence and covariant derivative
∇̃ is the covector that applies to scalars, vectors or tensors.
[T-mosaic block diagrams, summarized:
▪ Gradient of scalar: ∇̃f →(comp) ∇_μf = ∂_μf
▪ Gradient of vector: ∇̃ ⊗ V⃗ = ∇̃V⃗ →(comp) ∇_μV^ν (covariant derivative)
▪ Gradient of covector: ∇̃ ⊗ P̃ = ∇̃P̃ →(comp) ∇_μP_ν (covariant derivative)
▪ Gradient of tensor, e.g. g: ∇̃ ⊗ g = ∇̃g →(comp) ∇_μg_νλ
▪ Divergence of vector: ∇̃(A⃗) = div A⃗ = ∇_μA^μ (a scalar)
▪ Divergence of tensor: ∇̃(T) →(comp) ∇_μT^νλμ]
4.21 Derivative of a scalar or vector along a line
So far we have considered derivatives and covariant derivatives along coordinate lines. Now we want to extend the concept to lines of any kind.
▪ Given a scalar or a vector defined (at least at the points of a parametric line) within the manifold, let us express the derivative of the scalar or vector along the line as the incremental ratio of the scalar or vector with respect to the parameter.
The line l is defined by n parametric equations x^μ(τ). The parameter τ may or may not be the arc length. At each point of the line let a tangent vector be defined (a unit vector only if τ is the arc):
    U⃗ = d⃗x/dτ        (4.77)
⇒ U^μ = dx^μ/dτ, i.e. U⃗ →(comp) dx^μ/dτ in coordinate bases.
▫ Indeed: along a coordinate line x^μ̄ it is U⃗ ∥ e⃗_μ̄ ⇒ U^μ̄ is the only nonzero component;* the summation on μ implicit in U^μ∇_μf (eq.4.79) thus collapses to the single μ̄-th term. Moreover, if τ is the arc, then dτ ≡ dx^μ̄ ⇒ U^μ̄ = dx^μ̄/dτ = 1. For these reasons eq.4.79 reduces to:
    df/dτ = U^μ̄ ∇_μ̄f [no sum] = ∇_μ̄f **        (4.80)
q.e.d.
▪ Inasmuch as it is the projection on U⃗ of the gradient ∇̃f, the derivative of a scalar along a line is also denoted:
    df/dτ ≡ ∇_U⃗ f        (4.81)
▪ Eq.4.79 and eq.4.81 enable us to write symbolically (for a scalar):
    d/dτ = ∇_U⃗ = U^μ∂_μ = U^μ∇_μ        (4.82)
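A quick sanity check of eq.4.82 for a scalar (a sketch with sympy; the field f and the curve are arbitrary choices of mine):

```python
# Along the curve x(t) = cos t, y(t) = sin t, the derivative of a scalar f
# computed as U^mu d_mu f agrees with differentiating f(x(t), y(t)) directly.
import sympy as sp

t, x, y = sp.symbols('t x y', real=True)
f = x**2 + y                                    # a sample scalar field
curve = {x: sp.cos(t), y: sp.sin(t)}

U = [sp.diff(sp.cos(t), t), sp.diff(sp.sin(t), t)]   # tangent U^mu = dx^mu/dtau
lhs = sum(u * sp.diff(f, v).subs(curve) for u, v in zip(U, [x, y]))
rhs = sp.diff(f.subs(curve), t)
print(sp.simplify(lhs - rhs))                   # 0
```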
▪ Derivative of a vector V⃗ along a line
In analogy with the result just found for the scalar f (eq.4.79), we define the derivative of a vector V⃗ along a line l whose tangent vector is U⃗ as the “projection” or “component” of the gradient ∇̃V⃗ on the tangent vector:
    dV⃗/dτ ≝ ⟨∇̃V⃗, U⃗⟩        (4.83)
The derivative of a vector along a line is a vector, since ∇̃V⃗ is a (1 1) tensor. Expanding on the bases:
    dV⃗/dτ = ⟨∇̃V⃗, U⃗⟩ = ⟨∇_μV^ν ẽ^μ e⃗_ν, U^κ e⃗_κ⟩ = ∇_μV^ν U^κ ⟨ẽ^μ, e⃗_κ⟩ e⃗_ν =
                                                        ⏟ δ^μ_κ
* Note that along the coordinate line x^μ̄, provided τ is the arc:
    U⃗ ≝ d⃗x/dτ = e⃗_μ dx^μ/dτ = e⃗_μ̄ dx^μ̄/dx^μ̄ [no sum] = e⃗_μ̄
** The indication [no sum] switches off the summation convention for repeated indexes in that case.
    = U^μ ∇_μV^ν e⃗_ν = (DV^ν/dτ) e⃗_ν        (4.84)
Hence:
    dV⃗/dτ →(comp) DV^ν/dτ        (4.85)
after setting:
    DV^ν/dτ ≝ U^μ ∇_μV^ν        (4.86)
DV^ν/dτ is the ν-component of the derivative of vector dV⃗/dτ and is referred to as covariant derivative along the line, or directional covariant derivative, or absolute derivative; it can be expressed as:
    DV^ν/dτ ≝ U^μ∇_μV^ν = U^μ(∂V^ν/∂x^μ + Γ^ν_μλ V^λ) = (∂V^ν/∂x^μ)(dx^μ/dτ) + Γ^ν_μλ V^λ U^μ =
             = dV^ν/dτ + Γ^ν_μλ V^λ U^μ        (4.87)
A new derivative symbol D has been introduced, since the components of dV⃗/dτ are not simply the dV^ν/dτ, but take the more complex form of eq.4.87 (only when ∀Γ = 0 are the expressions DV^ν/dτ and dV^ν/dτ equivalent).
or, symbolically:
    d/dτ ≡ ∇_U⃗ →(comp) D/dτ ≡ U^μ∇_μ        (4.90)
▪ The relationship between the derivative of a vector along a line and the covariant derivative becomes apparent when the line l is a coordinate line x^μ̄.
▫ In this case U^μ̄ = 1, provided τ is the arc, and the summation on μ̄ collapses, as already seen (eq.4.80) for the derivative of a scalar; then the chain eq.4.84 changes into:
    dV⃗/dτ = U^μ̄ ∇_μ̄V^ν e⃗_ν = ∇_μ̄V^ν e⃗_ν        (4.91)
namely:
    dV⃗/dτ →(comp) ∇_μ̄V^ν
The derivative of a vector along a coordinate line x^μ̄ has as components the n covariant derivatives corresponding to the blocked index μ̄.
Notation
Derivatives along a line of:
• a scalar f:   df/dτ = ⟨∇̃f, U⃗⟩ = ∇_U⃗f = U^μ∇_μf  (= U^μ∂_μf)
• a vector V⃗:  dV⃗/dτ = ⟨∇̃V⃗, U⃗⟩ = ∇_U⃗V⃗  →(comp)  DV^ν/dτ = U^μ∇_μV^ν  (= U^μV^ν_;μ)
4.22 T-mosaic representation of derivatives along a line
[T-mosaic block diagrams, summarized:
▪ Derivative along a line of a scalar: df/dτ = ⟨∇̃f, U⃗⟩ = U^μ∇_μf = ∇_U⃗f (a scalar)
▪ Derivative along a line of a vector: dV⃗/dτ = ⟨∇̃V⃗, U⃗⟩ = ∇_U⃗V⃗ →(comp) DV^ν/dτ (a vector)]
Mnemo
Derivative along a line of a vector: the previous block diagram can help in recalling the right formulas. In particular, the writing ∇_U⃗V⃗ recalls the disposition of the symbols in the block: ∇V above, U below.
5 Curved manifolds
The notion of curvature of lines (1D) and surfaces (2D) is intuitive (curved is every line that is not a straight line and every surface that is not a plane). Ours is a judgment by external observers: in both cases we see objects that we can consider respectively 1D and 2D spaces embedded in our 3D space; we have an extrinsic vision, from outside or “from an extra dimension”. Our judgment on the curvature is related to this extrinsic vision: we see the curvature being realized in a dimension that is extraneous and additional to the dimensionality of the spaces themselves.
▪ Taking as an example 2D manifolds consisting of surfaces, we can establish a first typology to distinguish various kinds of curvature: positive (as for a spherical or ellipsoidal surface), negative (hyperbolic surface), in addition to zero curvature (plane). In a positive curvature manifold the sum of the internal angles of a triangle is > 180°, the ratio of circumference to diameter is < π, and two initially parallel straight lines end up intersecting; in a negative curvature manifold the sum of the internal angles of a triangle is < 180°, the ratio of circumference to diameter is > π, and two initially parallel lines end up diverging. In both cases Euclidean geometry is no longer valid.
A spherical / ellipsoidal curvature is that of a ball; a negative or hyperbolic curvature is less easy to imagine.
An intuitive model that accounts for both positive and negative curvature manifolds consists of a flat metal plate (a) that is unevenly heated. The thermal dilation will be greater at the points where the temperature is higher. If the plate is heated more in the central part, the dilation will be greater at the center and the plate will sag into a dome shape (b). If the plate is heated more at the periphery, the dilation will be greater in the outer band and the plate will warp, assuming a wavy shape (c). The simplest case of negative curvature (c) is shown, in which the curved surface takes the shape of a hyperbolic paraboloid (the form often assumed by fried potato chips).
The graphic representation that can be given for the three cases can only be an external view, of extrinsic kind.
[Figure: a) flat space; b) space with positive curvature; c) space with negative curvature]
"intrinsic", and we will refer to it as a rule when speaking of curved
spaces (it will be seen later on that the concepts of extrinsic and
intrinsic curvature are not always superimposable).
So, curved is a space with intrinsic non-zero curvature.
A quantitative measure of the intrinsic curvature needs first a
definition of the Riemann tensor (and even earlier the notion of
parallel transport of a vector along a line); for a qualitative evaluation
“curved space or flat space”, an examination of the metric tensor g is
enough.
What is crucial is the existence of a reference system wherein all the g_μν are independent of the point P, that is, all the elements of the matrix [g_μν] are numerical constants. If such a system exists, the manifold is flat; otherwise it is curved.
▪ Note however that flatness or otherwise are intrinsic features of the manifold, independent of the reference system.
▪ In a flat manifold, in the reference system where ∀g_μν = const, it is consequently:
    ∀ ∂g_μν/∂x^κ = 0        (5.3)
everywhere, and likewise for the second derivatives and the subsequent ones. In a curved manifold not all the derivatives of the metric are null (always null are the covariant derivatives, due to the property ∇̃g = 0 (eq.4.67), which is valid in any Riemann manifold: it means that the derivatives of the metric g_μν are equal and opposite in sign to the terms in Γ that appear in the (null) covariant derivative).
Recall that in a flat space ∀Γ = 0 at each point.
▪ Because of a theorem of matrix calculus, any invertible symmetric matrix like [g_μν] can be diagonalized and set into the canonical form [η_μ′ν′] = diag(±1) by means of a matrix Λ, in the following way:
    Λ^μ_μ′ Λ^ν_ν′ g_μν = η_μ′ν′        (5.4)
(we use the symbol η for the metric in canonical form diag(±1), a diagonal matrix with only +1 or −1 on the main diagonal).
For a flat manifold, the metric being independent of the point, the matrix Λ holds globally for the whole manifold. It is easy to recognize in Λ the Jacobian matrix J related to a coordinate transformation; it performs the change of bases required to maintain coordinate bases (it is not the transformation itself, but descends from it). Eq.5.4 expresses the resulting change of metric.
Hence, a flat space admits a coordinate system in which the metric tensor assumes the canonical form
    [g_μν] = [η_μν] = diag(±1)        (5.5)
due to the matrix Λ related to the transformation. This property can be assumed as a definition of flatness:
A manifold is flat when g_μν assumes everywhere the canonical diag(±1) form in some (appropriate) coordinate system.
In particular, when [g_μν] = diag(+1) the manifold is a Euclidean space; if [g_μν] = diag(±1) with a single −1, it is a “Minkowski-like” space. The occurrences of +1 and −1 on the main diagonal, or their sum called “signature”, is a characteristic of the given space: under coordinate transformation the +1 and −1 can exchange their places on the diagonal, but they do not change in number (Sylvester's theorem).
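Numerically the reduction to canonical form can be sketched as follows (numpy; the sample matrix and the eigendecomposition route are mine — eq.5.4 only requires that some Λ exists):

```python
# Bring a symmetric invertible matrix to diag(+/-1) form: Lambda^T g Lambda = eta.
import numpy as np

g = np.array([[2.0, 1.0], [1.0, -3.0]])   # a sample symmetric "metric"
w, v = np.linalg.eigh(g)                  # g = v @ diag(w) @ v.T
L = v / np.sqrt(np.abs(w))                # rescale each eigenvector column
eta = L.T @ g @ L                         # cf. eq.5.4
print(np.round(eta, 12))                  # diag(-1, +1): signature is preserved
```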
transformation whose related matrix Λ is good for P does not work for all the other points of the manifold, where the metric remains non-stationary and non-canonical.
▪ Asking for null first derivatives means asking for g_μν to be “stationary” in P, so that its value remains almost unchanged in the neighborhood of the point, except for infinitesimals of higher order with respect to the displacements Δx^κ.* This is enough to ensure that the metric is flat in P. The demand that g_μν be stationary is the “weak” analogue of the demand that g_μν be constant, made for a flat manifold as a whole. The variability of the metric in the neighborhood of P is only due to the second derivatives which, unlike what happens in a flat space, cannot all be null. In the same way, in the frame locally flat in P all the Γ are null, but not all their first derivatives.
▪ It is important not to confuse the locally flat coordinate system with the tangent plane (or space): the first is in any case “drawn” with its coordinate lines on the given space and is part of it; the second one is external to it.
▪ Of the two conditions eq.5.6 the crucial one is the first; the second is accessory and can follow. We can proceed in two steps: first transform from the current reference to a new one in which the derivatives of all the g_μν are null (at this point the g_μν are stationary coefficients); then apply a linear transformation of coordinates of matrix Λ and bring the new g_μν to canonical form by applying the same Λ twice, as in eq.5.4. In itself, Λ expresses a linear transformation that, as such, preserves the nullity of the derivatives obtained from the first transformation (the derivatives of the new g_μν are linear combinations of the previous ones, all null).
▪ It happens that, given a curved manifold and some reference system, there are locally flat points, that is, points where the metric “spontaneously” fulfills the local flatness conditions. Which points these are depends on the reference system: if you change the reference, the points of “spontaneous” flatness change. Ultimately, local flatness,
* An analogy is given by a function f(x) which is stationary at its maximum or minimum points. In those points f′ = 0, and for small displacements to the right or to the left f(x) varies only by infinitesimals of higher order with respect to the displacement.
while being a property of each point of all Riemann manifolds, is
manifested as a feature of the reference system.
All this is formalized in the following theorem.
While at P it is g_μ′ν′(P) = η_μ′ν′(P), at any other point P′ of the P-neighborhood the transformation gives:
    g_μ′ν′(P′) = Λ^α_μ′ Λ^β_ν′ g_αβ(P′)        (5.8)
(having denoted by Λ^α_μ′ the related matrix (see eq.3.7, eq.3.8): to transform g we have to use its transposed inverse twice). In the last formula both g and Λ are meant to be calculated in P′.
In a differentiable manifold, what happens in P′ can be approximated by what happens in P by means of a Taylor series. We expand in Taylor series around P both members of eq.5.8, first the left one:
    g_μ′ν′(P′) = g_μ′ν′(P) + (x^γ′ − x₀^γ′) g_μ′ν′,γ′(P) + ...
Now let's equate, order by order, the right-hand terms of eq.5.9 and eq.5.10:
    I)   g_μ′ν′(P) = Λ^α_μ′ Λ^β_ν′ g_αβ (P)        (5.11)
    II)  g_μ′ν′,γ′(P) = ∂/∂x^γ′ (Λ^α_μ′ Λ^β_ν′ g_αβ)(P)        (5.12)
    III) g_μ′ν′,γ′λ′(P) = ∂²/∂x^γ′∂x^λ′ (Λ^α_μ′ Λ^β_ν′ g_αβ)(P)        (5.13)
Hereafter all terms are calculated in P, and for that we omit this specification below.
Recalling that Λ^α_μ′ = ∂x^α/∂x^μ′ and carrying out the derivatives of the products we get:
    I)   g_μ′ν′ = (∂x^α/∂x^μ′)(∂x^β/∂x^ν′) g_αβ        (5.14)
    II)  g_μ′ν′,γ′ = (∂²x^α/∂x^γ′∂x^μ′)(∂x^β/∂x^ν′) g_αβ + (∂x^α/∂x^μ′)(∂²x^β/∂x^γ′∂x^ν′) g_αβ + (∂x^α/∂x^μ′)(∂x^β/∂x^ν′) g_αβ,γ′        (5.15)
    III) g_μ′ν′,γ′λ′ = (∂³x^α/∂x^γ′∂x^λ′∂x^μ′)(∂x^β/∂x^ν′) g_αβ + ...        (5.16)
The right side members of these equations contain terms such as g_αβ,γ′ that can be rewritten, using the chain rule, as g_αβ,γ′ = g_αβ,σ (∂x^σ/∂x^γ′). The typology of the right side member terms then reduces to:
• known terms: the g_αβ and their derivatives with respect to the old coordinates, such as g_αβ,σ;
• unknown terms to be determined: the derivatives of the old coordinates with respect to the new ones.
(We'll see that just the derivatives of various order of the old coordinates with respect to the new ones allow us to rebuild the matrix related to the transformation, and the transformation itself.)
▫ Now let's perform the counting of equations and unknowns for the case n = 4, the most interesting for General Relativity. The equations are counted by the left side members of I, II, III; the unknowns by the right side members.
I) There are 10 equations, as many as the independent elements of the 4×4 symmetric tensor g_μ′ν′. The g_αβ are known. To get g_μ′ν′ = η_μ′ν′ ⇒ we have to set each of the 10 independent elements of g_μ′ν′ (in the right side member) equal to 0 or ±1, in order to get the desired metric diag(±1). Consequently, among the 16 unknown elements of the matrix ∂x^α/∂x^μ′, 6 can be arbitrarily assigned, while the other 10 are determined by equations I (eq.5.14). (The presence of six degrees of freedom means that there is a multiplicity of transformations, and hence of matrices, able to make the metric canonical.) In this way values have been assigned or calculated for all 16 first derivatives.
II) There are 40 equations (g_μ′ν′ has 10 independent elements; γ′ can take 4 values). The independent second derivatives like ∂²x^α/∂x^γ′∂x^μ′ are also 40 (4 values for α at the numerator; 10 different pairs of γ′, μ′ at the denominator). All the g_αβ and their first derivatives are already known. Hence, we can set all the 40 g_μ′ν′,γ′ at first member equal to 0 (as was our intent) and determine all the 40 second derivatives as a consequence.
III) There are 100 equations (10 independent elements g_μ′ν′ to derive with respect to the 10 different pairs generated by 4 numbers). The other factors being known, 80 third derivatives like ∂³x^α/∂x^λ′∂x^γ′∂x^μ′ are to be determined (the 3 indexes at the denominator give 20 combinations,* the index at the numerator 4 more choices). Now, it is not possible to find a set of values for the 80 third derivatives such that all the 100 g_μ′ν′,γ′λ′ vanish; 20 among them remain nonzero.
* It is a matter of combinations with repetition (in the case of 4 items), differing by at least one element.
We have thus shown that, for any point P, by assigning appropriate values to the derivatives of various order, it is possible to get (more than) one metric g_μ′ν′ such that at P:
• g_μ′ν′ = η_μ′ν′, the flat-manifold metric diag(±1)
• all its first derivatives are null: ∀ g_μ′ν′,γ′ = 0
• some second derivatives are nonzero: ∃ some g_μ′ν′,γ′λ′ ≠ 0
▪ This metric η_μ′ν′ with zero first derivatives and second derivatives not all zero qualifies the locally flat system in P (even though the manifold itself is curved, because of the nonzero second derivatives). A metric with zero first and second derivatives is conversely the metric of the tangent space in P (which is an everywhere-flat space). The flat local metric of the manifold and that of the tangent space coincide in P and (the metric being stationary) even in its neighborhood, except for a difference of infinitesimals of higher order.
▫ Finally, we note that, once appropriate values have been calculated or assigned to the derivatives of various order of the old coordinates with respect to the new ones, we can reconstruct the series expansions for the (inverse) transformation of coordinates x^α(x^μ′) and for the elements of the matrix Λ^α_μ′ (inverse of Λ). Indeed, in their series expansions in terms of the new coordinates x^γ′ around P:
    x^α(x^μ′) → x^α(P′) = x^α(P) + (x^γ′ − x₀^γ′)(∂x^α/∂x^γ′)(P) + ½ (x^γ′ − x₀^γ′)(x^λ′ − x₀^λ′)(∂²x^α/∂x^γ′∂x^λ′)(P) + ...        (5.17)
    (∂x^α/∂x^μ′)(P′) = (∂x^α/∂x^μ′)(P) + (x^γ′ − x₀^γ′)(∂²x^α/∂x^γ′∂x^μ′)(P) + ½ (x^γ′ − x₀^γ′)(x^λ′ − x₀^λ′)(∂³x^α/∂x^γ′∂x^λ′∂x^μ′)(P) + ...        (5.18)
only the already known derivatives of the old coordinates with respect to the new ones appear as coefficients.
Once its inverse is known, both the direct transformation x^μ′(x^α) and the related matrix Λ = [Λ^μ′_α] that induce the canonical metric in P are in principle implicitly determined, as desired.
* If M represents a certain class of spaces only, the validity of the obtained tensor
equation will be limited to that class.
▪ ∇̃g = 0, or the “covariant constancy” property (eq.4.67)
In the system locally flat at P of a given space it is by definition g_αβ,μ(P) = 0; the commutation , → ; leads to the tensor equation g_αβ;μ ≡ ∇_μg_αβ = 0 (i.e. ∇̃g = 0), valid at P in any other permissible coordinate system for the same space. But since the chosen point P can be any, and likewise the chosen space, its validity is quite general.
▪ Leibniz rule for the covariant derivative of a product
We show how the Leibniz rule for the derivative of a product extends to the covariant derivative in the particular case of the outer product between a vector and a covector (but the same considerations apply to any outer or inner tensor products).
In the locally flat system at P of a given space the Leibniz rule applies (as in any flat space):
    (A^α B_β)_,κ = A^α_,κ · B_β + A^α · B_β,κ
The equation becomes a tensor equation by switching , → ; :
    (A^α B_β)_;κ = A^α_;κ · B_β + A^α · B_β;κ
so as to make it valid at P in any other reference; but since the space and its point P are chosen arbitrarily, the equation has universal validity.
example, he will find that the ratio of the circumference to the diameter is a constant = π for all circles and that the sum of the inner angles of any triangle is 180°. In fact, in his environment, i.e. in the small portion of the spherical surface on which his life takes place, his two-dimensional space is locally flat, as for us the surface of a water pool on the Earth's surface is flat (the observer O3D sees this neighborhood as practically lying on the plane tangent to the spherical surface at the point where O2D is located).
However, when O2D comes to consider very large circles, he realizes that the circumference / diameter ratio is no longer a constant, becoming smaller and smaller than π as the circle enlarges, and that the sum of the inner angles of a triangle is variable but always > 180°.
O2D can deduce from these facts that his space is curved, though he is not capable of conceiving a third dimension. In this particular case he will observe an intrinsic positive curvature.
The curvature of a space is thus an intrinsic property of the space itself, and there is no need of an outside view to describe it. In other words, it is not necessary for an n-dimensional space to be considered embedded in another (n+1)-dimensional space to reveal the curvature (as it is not necessary to introduce a fifth dimension to affirm the curvature of our 4D space-time).
▪ Another circumstance that allows O2D to discover the curvature of his space is that, carrying a vector parallel to itself along a large enough closed loop, the returned vector is no longer parallel to (= does not overlap) the initial vector.
In a flat space such as Euclidean space, vectors can be transported parallel to themselves without difficulty, so that they can often be considered delocalized. For example, to measure the relative velocity of two particles far apart, we may imagine transporting the vector v⃗₂ parallel to itself from the second particle to the first, so as to make their origins coincide in order to carry out the subtraction v⃗₂ − v⃗₁.
But what does a parallel transport mean in a curved space? In a curved space this expression has no immediately clear meaning. However, O2D can think to perform a parallel transport of a vector by a step-by-step strategy, taking advantage of the fact that in the neighborhood of each point his space looks substantially flat.
Let l be the curve along which he wants to parallel-transport the (foot
of the) vector V . At first the observer O2D and the vector are in P1 ;
the vector is transported parallel to itself to P2 , a point of the
(infinitesimal) neighborhood of P1 . Then O2D goes to P2 and parallel
transports the vector to P3 that belongs to the neighborhood of P2 ,
and so on. In this way the transport is always within a neighborhood in
which the space looks flat and the notion of parallel transport is not
ambiguous: vector in P1 // vector in P2 ; vector in P2 // vector in P3 ,
and so on.
But the question is: is vector in P1 // vector in Pn ? That is: does the
parallelism which works locally step-by-step by infinitesimal amounts
hold as well globally?
To answer this question we ought to transport the vector along a
closed path: only if the vector that returns from parallel transport is
superimposable onto the initial vector we can conclude for global
parallelism to have been preserved.
In fact it is not like that, at least for large enough circuits: just take for
example the transport A-B-C-A along the meridians and the equator on
a spherical surface:
[Figure: parallel transport along the closed path A-B-C-A on a spherical surface — the vector comes back rotated with respect to the initial one.]
The mismatch between the initial vector and the one that returns after
the parallel transport can give a measure of the degree of curvature of
the space (this measure is fully accessible to the two-dimensional
observer O2D).
To do that quantitatively we'll give further on a more precise
mathematical definition of parallel transport of a vector along a line.
▪ A different explanation of how parallel transport of a vector works in a curved space can be given in terms of the tangent plane; for that it is necessary to think of the space in question as embedded in a space with an extra dimension. Let's describe here the parallel transport on a 2D curved surface as seen by an O3D observer.
Let a vector V⃗ be defined at a point P of the surface, to be parallel-transported along a line l defined on the surface. The transport of the vector works as follows:
● Let us take the tangent plane to the surface at P.
● If you move the plane keeping it in contact with the curved surface without slipping, the tangency point will move too. We'll move the plane so that the tangency point goes forward continuously along the curve l.
● Along this route, from each instantaneous point of tangency, we draw on the plane a vector parallel (in the ordinary sense of the plane) to the previous ones, while the contact point traces on the plane a line (a different curve, because flattened). In each point of this line we'll find drawn a vector pointing outward; all these vectors are parallel to each other and to the initial vector V⃗.
● Also on the curved surface we will find a vector coming out from each point of the line l, all of them equal to V⃗ in magnitude but not in direction: all the vectors drawn on the curved surface will have a different orientation from one another and from V⃗. In particular, for a closed path on the surface: V⃗_fin ≠ V⃗_init.
The reason is that, along the route, the tangent plane has continuously changed its spatial orientation in order to adhere to the surface all along the line, describing complex rotations, so that the vector, always parallel to itself in the plane, progressively changes its
orientation on the curved surface.
5.7 Parallel transport of a vector along a line
Let V⃗ be a vector defined in each point of a parametric line x^μ(τ). Moving from the point P(τ) to a point P(τ+Δτ) the vector undergoes an increase:
    V⃗(τ+Δτ) = V⃗(τ) + (dV⃗/dτ)(τ) Δτ + O[(Δτ)²] *
If the first degree term is missing, namely dV⃗/dτ = 0:
    V⃗(τ+Δτ) = V⃗(τ) + O[(Δτ)²]        (5.19)
we say that the vector V⃗ is parallel transported. In other words, transporting its origin along the line for an amount Δτ, the vector undergoes only a variation of the 2nd order, infinitesimal with respect to Δτ (i.e. it keeps itself almost unchanged).
The parallel transport of a vector V⃗ is defined locally, in its own flat neighborhood: over a finite length the parallelism between initial and transported vector is no longer maintained.
▪ In each point of the line let a tangent vector U⃗ = d⃗x/dτ be defined. The parallel transport condition of V⃗ along U⃗ can be expressed by one of the following forms:
    ∇_U⃗V⃗ = 0 ,  ⟨∇̃V⃗, U⃗⟩ = 0 ,  U^μ∇_μV^ν = 0        (5.20)
all equivalent to the definition eq.5.19 because (eq.4.83, eq.4.89):
    ∇_U⃗V⃗ = dV⃗/dτ = ⟨∇̃V⃗, U⃗⟩ →(comp) U^μ∇_μV^ν
Given a vector V⃗ defined at a point of a regular line x^μ(τ), it is always possible to parallel transport it along the line (which does not mean that V⃗_init and V⃗_fin will overlap). The vector to be transported can have any orientation with respect to U⃗.
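Once a line is fixed, the condition U^μ∇_μV^ν = 0 becomes a system of ordinary differential equations for the components V^ν. The following numerical sketch (scipy/numpy; the unit-sphere metric dθ² + sin²θ dφ² with its standard Christoffel symbols Γ^θ_φφ = −sinθ cosθ, Γ^φ_θφ = cot θ; everything else is my choice) transports a vector once around the circle of colatitude θ₀ and measures the angle between initial and final vector:

```python
# Parallel transport around the circle theta = theta0 on the unit sphere:
# dV^theta/dphi = sin(theta0)cos(theta0) V^phi ; dV^phi/dphi = -cot(theta0) V^theta
import numpy as np
from scipy.integrate import solve_ivp

theta0 = np.pi/3                                   # colatitude of the circle

def rhs(phi, V):
    return [np.sin(theta0)*np.cos(theta0)*V[1], -V[0]/np.tan(theta0)]

sol = solve_ivp(rhs, [0, 2*np.pi], [1.0, 0.0], rtol=1e-10, atol=1e-12)
Vth, Vph = sol.y[0, -1], sol.y[1, -1]

# metric angle between the final vector and the initial one (1, 0)
cos_a = Vth / np.sqrt(Vth**2 + (np.sin(theta0)*Vph)**2)
print(np.degrees(np.arccos(cos_a)))   # ~180 deg = 2*pi*cos(theta0) for theta0 = pi/3
```

The mismatch 2π cos θ₀ (i.e. 2π sin λ in terms of the latitude λ) is exactly the holonomy angle found for the Earth's parallel in the flattened-ribbon example further on; it vanishes only on the equator, which is a geodesic.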
5.8 Geodesics
When it happens that the tangent vector U is transported parallel to
itself along a line, the line is a geodesic.
* We denote by O(x) a quantity of the same order as x. In this case it is a quantity of the same order as (Δτ)², i.e. of the second order with respect to Δτ.
The geodesic condition for a line is then:
    dU⃗/dτ = 0  along the line        (5.21)
Equivalent to eq.5.21 are the following statements:
    ∇_U⃗U⃗ = 0 ,  ⟨∇̃U⃗, U⃗⟩ = 0        (5.22)
or componentwise:
    DU^ν/dτ = 0
    U^μ∇_μU^ν = 0        (5.23)
    dU^ν/dτ + Γ^ν_μλ U^λU^μ = 0
    d²x^ν/dτ² + Γ^ν_μλ (dx^λ/dτ)(dx^μ/dτ) = 0
which come respectively from eq.4.85, eq.4.86, eq.4.87 (and eq.4.77). All of eq.5.21, eq.5.22, eq.5.23 are equivalent and represent the equations of the geodesic.
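The last form of eq.5.23 can be integrated numerically. A sketch (scipy/numpy; unit-sphere Christoffel symbols as above, initial data mine): starting on the equator with U⃗ tangent to it, the solution must remain on the equator, a great circle:

```python
# Geodesic equations on the unit sphere:
# theta'' = sin(theta)cos(theta) phi'^2 ; phi'' = -2 cot(theta) theta' phi'
import numpy as np
from scipy.integrate import solve_ivp

def geodesic(tau, s):
    th, ph, dth, dph = s
    return [dth, dph, np.sin(th)*np.cos(th)*dph**2, -2*dth*dph/np.tan(th)]

s0 = [np.pi/2, 0.0, 0.0, 1.0]        # start on the equator, tangent along phi
sol = solve_ivp(geodesic, [0, 2*np.pi], s0, rtol=1e-10, atol=1e-12)
print(sol.y[0, -1], sol.y[1, -1])    # theta stays ~pi/2, phi reaches ~2*pi
```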
Some properties of geodesics are the following:
▪ Along a geodesic the tangent vector is constant in magnitude.*
▫ Indeed: |U⃗|² = g_αβ U^αU^β is a scalar, hence (eq.4.79):
    d/dτ (g_αβ U^αU^β) = U^μ∇_μ(g_αβ U^αU^β) =
    = U^μ (g_αβ U^α ∇_μU^β + g_αβ U^β ∇_μU^α + U^αU^β ∇_μg_αβ) = 0
because U^μ∇_μU^α = U^μ∇_μU^β = 0 is the parallel transport condition of U⃗ along the geodesic, and ∇_μg_αβ = 0 (eq.4.67).
Hence: g_αβ U^αU^β = const ⇒ |U⃗|² = const, q.e.d.
* The converse is not true: the constancy in magnitude of the tangent vector is not a sufficient condition for a path to be a geodesic. See further on.
** A line may be parametrized in more than one way. For example, a parabola can be parametrized by x = t, y = t², or by x = t³, y = t⁶.
(two parameters are called affine if linked by a linear relationship).**
▪ In a flat space the geodesics are straight lines. In a curved space the geodesics take their place, preserving some of their properties: they are “the straightest possible lines”, the minimum-distance lines (see Appendix). Given two points of the space, there is only one geodesic that connects them via the shortest possible route; its length measures the distance between the two points.
▪ The geodesics of a curved space can be thought of as inertial trajectories of particles not subject to external forces; along them a particle transports its velocity vector parallel to itself, constant in magnitude and tangent to the path, without undergoing acceleration.
In a generic curved space let a geodesic be parametrized with the time t as parameter. The tangent vector is then the velocity vector: U⃗ = d⃗x/dt = v⃗. Since by the definition of geodesic dv⃗/dt = 0, along the geodesic the acceleration is null:
    a⃗ = dv⃗/dt = 0    and    a^ν = Dv^ν/dt = dv^ν/dt + Γ^ν_μλ v^λv^μ = 0    (eq.5.23)
(that happens because the two terms, the one in d/dt and the one in Γ, in general ≠ 0 and variable with the point, compensate each other, summing to zero). Hence geodesics are inertial paths, as mentioned before.
We note by the way that the acceleration turns out to be the covariant derivative of the velocity vector v⃗ along the line.
We also note that the vector v⃗ is transported along the geodesic constant in magnitude: |v⃗| = v = const. Hence, if τ is the arc length, then τ = vt and the two parameters τ and t are affine. Therefore taking the geodesic parametrized by t is not limiting, because the re-parameterization preserves geodesicity.
A different description of the same motion is obtained by thinking of the curved space as embedded inside a flat space (e.g. Euclidean) with an extra dimension: there the path is no longer a geodesic (there the geodesics are straight lines) and the same motion is described as accelerated. In fact, in the new context ∀Γ = 0 holds due to flatness, and therefore:
    a^ν = dv^ν/dt ≠ 0    and    a⃗ ≠ 0.
The two different points of view suggest that the same motion can be described either in terms of space curvature or as acceleration. The equivalence between space curvature and acceleration is one of the cornerstones of General Relativity.
▪ We state, without further deepening its intuitive meaning, a property that is an exclusive feature of the geodesics of a 2D space (surface): in each point of the geodesic line the normal vector to the line coincides with the normal vector to the surface.
▪ Along a geodesic the tangent vector (it may be, as seen, the velocity vector) is transported tangent to the curve and constant in magnitude. Conversely, it is worth noting that it is not enough for the tangent vector to keep tangent to a curve and constant in magnitude for the curve to be a geodesic. For a curve to be a geodesic the parallel transport is required, which is a more stringent request.
For instance, on a 2D spherical surface geodesics are the great circles, such as the equator and the meridians, but not the parallels: along a parallel the velocity vector can be transported tangent to the path and constant in magnitude, but parallel transport does not occur. As they are not geodesic paths, parallels are not inertial trajectories and can be traveled only if a constant acceleration directed along the meridian is acting.
Earth's axis and is tangent to the spherical surface along the parallel); the length of the arc will be that of the Earth's parallel.
Let's now parallel-transport on the plane the vector E⃗₀, initially tangent to the circumference, until the end of the flattened arc (working on a plane, there is no ambiguity about the meaning of parallel transport). Let E⃗_fin be the vector at the end of the transport. Now we imagine bringing the ribbon back to its original place on the spherical surface, with the various subsequent positions taken by the vector E⃗ drawn attached to it. In that way we achieve a graphical representation of the evolution of the vector E⃗ parallel-transported along the Earth's parallel.
The steps of the procedure are explained in the figure below.
[Figure: the flattened-ribbon construction — the band containing the Earth's parallel of latitude λ is flattened onto a plane, the vector is parallel-transported along the flattened arc, and the ribbon is then brought back onto the sphere.]
After having traveled a full circle of parallel transport on the Earth's parallel of latitude λ, the vector E⃗ has undergone a clockwise rotation from E⃗₀ to E⃗_fin by an angle
    α = 2πr cosλ / (r / tanλ) = 2π sinλ
Only for λ = 0 (along the equator) do we get α = 0 after a turn.
The trick of flattening onto the plane the tape containing the trajectory works whatever the initial orientation of the vector to be parallel-transported.
5.9 Riemann tensor
The Riemann tensor is the tool that provides a decisive criterion to state whether a space (or a neighborhood of a point) is flat or curved, and it also provides a quantitative measure of the curvature at any point (both in the cases of definite and indefinite metric).
▪ After parallel-transporting a vector along a closed line in a curved manifold, the final vector differs from the initial one by an amount ΔV⃗ due to the curvature. This amount ΔV⃗ depends on the path, but it can be used as a measure of the curvature at a point if calculated along a closed infinitesimal loop around that point.
Given a point A of the n-dimensional manifold, we build a “parallelogram loop” ABCD leaning on the coordinate lines of two generic coordinates x^α, x^β picked among the n, and we parallel-transport the vector V⃗ along the circuit ABCDA:
[Figure: the loop ABCD, with sides Δx^α and Δx^β lying along the coordinate lines x^α and x^β.]
(Of course, the construction and all that follows should be thought of as repeated for all possible pairs of coordinates, with α, β = 1, 2, ... n.)
Tangent to the coordinate lines are the basis-vectors e⃗_α, e⃗_β, coordinated to x^α, x^β.
The parallel transport of V⃗ requires dV⃗/dτ = 0 or, componentwise, U^μ∇_μV^ν = 0 (eq.4.89), a condition which along a segment of coordinate line x^α reduces to ∇_αV^ν = 0 (eq.4.91). But:
    ∇_αV^ν = 0  ⇒  ∂_αV^ν + Γ^ν_αλ V^λ = 0  ⇒  ∂V^ν/∂x^α = −Γ^ν_αλ V^λ        (5.24)
and a similar relation comes from the transport along x^β.
Due to the transport along the line segment AB, the components of the vector V⃗ undergo an increase:
    V^ν(B) = V^ν(A) + (∂V^ν/∂x^α)_(AB) Δx^α = V^ν(A) − (Γ^ν_αλ V^λ)_(AB) Δx^α        (5.25)
Summing the increases accumulated along the four sides of the circuit, the total variation after a complete turn is:
    ΔV^ν = [(Γ^ν_βλ V^λ)_(DA) − (Γ^ν_βλ V^λ)_(BC)] Δx^β + [(Γ^ν_αλ V^λ)_(CD) − (Γ^ν_αλ V^λ)_(AB)] Δx^α
Using again the finite-increments theorem:
    (Γ^ν_βλ V^λ)_(BC) = (Γ^ν_βλ V^λ)_(DA) + ∂_α(Γ^ν_βλ V^λ) Δx^α
    (Γ^ν_αλ V^λ)_(AB) = (Γ^ν_αλ V^λ)_(CD) − ∂_β(Γ^ν_αλ V^λ) Δx^β
where the derivatives are now calculated at intermediate points inside the loop (omitting to indicate it). We get:
    ΔV^ν = −∂_α(Γ^ν_βλ V^λ) Δx^α Δx^β + ∂_β(Γ^ν_αλ V^λ) Δx^α Δx^β =
    = −(Γ^ν_βλ,α V^λ + Γ^ν_βλ ∂_αV^λ) Δx^α Δx^β + (Γ^ν_αλ,β V^λ + Γ^ν_αλ ∂_βV^λ) Δx^β Δx^α
and, using the already known result ∂_αV^λ = −Γ^λ_αμ V^μ (eq.5.24):
    = −(Γ^ν_βλ,α V^λ − Γ^ν_βλ Γ^λ_αμ V^μ) Δx^α Δx^β + (Γ^ν_αλ,β V^λ − Γ^ν_αλ Γ^λ_βμ V^μ) Δx^β Δx^α
We conclude:
    ΔV^ν = V^μ Δx^α Δx^β (−Γ^ν_βμ,α + Γ^ν_αμ,β + Γ^ν_βλ Γ^λ_αμ − Γ^ν_αλ Γ^λ_βμ)        (5.26)
                          ⏟ R^ν_μαβ
R^ν_μαβ must be a tensor, because the other factors ΔV^ν, V^μ, Δx^α, Δx^β are tensors; its rank (1 3) follows from the rank balancing of eq.5.26.
The position made in eq.5.26 is reordered and rewritten, changing the names of the indexes α ↔ ν, β ↔ μ and using the symmetries of Γ, for compliance with the commonly adopted notation:
    R^α_βμν = Γ^α_βν,μ − Γ^α_βμ,ν + Γ^α_λμ Γ^λ_βν − Γ^α_λν Γ^λ_βμ        (5.27)
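Eq.5.27 is mechanical enough to hand to a computer algebra system. A sketch with sympy (code mine) for the unit sphere, metric dθ² + sin²θ dφ², whose single independent component comes out as R^θ_φθφ = sin²θ:

```python
# Riemann tensor from eq.5.27, with the Christoffel symbols built from the metric.
import sympy as sp

th, ph = sp.symbols('theta phi', positive=True)
coords = [th, ph]
g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])
gi = g.inv()
n = 2

Gamma = [[[sp.simplify(sp.Rational(1, 2)*sum(
              gi[a, s]*(sp.diff(g[s, m], coords[b]) + sp.diff(g[s, b], coords[m])
                        - sp.diff(g[m, b], coords[s])) for s in range(n)))
           for b in range(n)] for m in range(n)] for a in range(n)]

def R(a, b, m, v):   # R^a_{bmv}, eq.5.27
    return sp.simplify(sp.diff(Gamma[a][b][v], coords[m])
                       - sp.diff(Gamma[a][b][m], coords[v])
                       + sum(Gamma[a][l][m]*Gamma[l][b][v]
                             - Gamma[a][l][v]*Gamma[l][b][m] for l in range(n)))

print(R(0, 1, 0, 1))   # sin(theta)**2
```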
▪ Eq.5.26, rewritten with the new indexes, is:
    ΔV^α = V^β Δx^ν Δx^μ R^α_βμν        (5.28)
If P̃ is any covector:
    P̃(ΔV⃗) = P_α ΔV^α = P_α V^β Δx^ν Δx^μ R^α_βμν = R(P̃, V⃗, Δ⃗x, Δ⃗x)        (5.29)
the conformation and meaning of R.
[T-mosaic block: the tensor R^α_βμν, plugged with V^β, Δx^μ, Δx^ν on its connectors ẽ^β, ẽ^μ, ẽ^ν, returns the vector ΔV⃗ = ΔV^α e⃗_α.]
▪ The Riemann tensor is related to the fact that the order of derivation in covariant second derivatives is not indifferent:
    V_β;μν ≠ V_β;νμ        (5.31)
so that:
    V_β;μν − V_β;νμ = V_α R^α_βμν        (5.32)
▫ On the model of the covariant derivative with respect to x^ν of a (0 2) tensor V_βμ, given by V_βμ;ν = ∂_νV_βμ − Γ^λ_νβ V_λμ − Γ^λ_νμ V_βλ, we can write the covariant derivative of V_β;μ with respect to x^ν (i.e. the double covariant derivative of the (co)vector V_β with respect to x^μ and x^ν) as:
    V_β;μν = ∂_νV_β;μ − Γ^λ_νβ V_λ;μ − Γ^λ_νμ V_β;λ =
    = ∂_ν(∂_μV_β − Γ^λ_μβ V_λ) − Γ^λ_νβ(∂_μV_λ − Γ^α_μλ V_α) − Γ^λ_νμ(∂_λV_β − Γ^α_λβ V_α) =
    = ∂²_νμ V_β − V_λ ∂_νΓ^λ_μβ − Γ^λ_μβ ∂_νV_λ −
      − Γ^λ_νβ ∂_μV_λ + Γ^λ_νβ Γ^α_μλ V_α −
      − Γ^λ_νμ ∂_λV_β + Γ^λ_νμ Γ^α_λβ V_α
By exchanging μ, ν with each other we get the analogue for V_β;νμ:
Therefore the switching property of the derivation order, ∂²_μν = ∂²_νμ, which is valid for mixed second partial derivatives, does not apply to covariant derivatives, unless R^α_βμν = 0, which means flat space.
The difference between the mixed second covariant derivatives is a curvature effect and can be used to give a formal alternative definition of R.
Since at any point of any manifold, in its own locally flat system, it is ∀Γ = 0, at that point the definition eq.5.27 given for R^α_βμν reduces to:
    R^α_βμν = Γ^α_νβ,μ − Γ^α_μβ,ν        (5.37)
From eq.4.73, which gives the Γ as functions of g and its derivatives:
    Γ^α_μβ = ½ g^αρ (−g_μβ,ρ + g_βρ,μ + g_ρμ,β)
and since ∂g^αρ/∂x^ν = 0 (the first derivatives of the metric vanish in the locally flat system):
    R^α_βμν = −½ g^αρ (−g_μβ,ρν + g_βρ,μν + g_ρμ,βν) + ½ g^αρ (−g_νβ,ρμ + g_βρ,νμ + g_ρν,βμ)
▪ Let us clarify a matter that seems to be a paradox: how is it possible to calculate the curvature tensor R at P just by placing ourselves, as eq.5.38 does, in a system locally flat in P? The explanation is that R, as a tensor, does not depend on the coordinate system: to calculate it at some point P it makes no difference to take as coordinate system the one locally flat in P, with its own metric g_{αβ}. Even in the locally flat system the result will be R ≠ 0 if the space is not globally flat, because at P the first derivatives of Γ and the second derivatives of g_{αβ} are not null.
Mnemo
To write down R in the local flat system (eq.5.38) the “Pascal snail” can be used as a mnemonic aid to suggest the pairs of indexes before the comma:
+g_{αν,..}  +g_{βμ,..}
−g_{αμ,..}  −g_{βν,..}
Use the remaining indexes for the pairs after the comma.
Let us now exchange α and β in eq.5.38:
R_{βαμν} = ½ (g_{βν,αμ} + g_{αμ,βν} − g_{βμ,αν} − g_{αν,βμ})
whose right side member is the same as in the equation we started from, but changed in sign, and thus ⇒ R_{βαμν} = −R_{αβμν}.
It turns out from eq.5.38 that R is: i) skew-symmetric with respect to the exchange of indexes within a pair; ii) symmetric with respect to the exchange of one pair with the other; it also enjoys a property such that iii) the sum with cyclicity in the last 3 indexes is null:
i) R_{βαμν} = −R_{αβμν} ;  R_{αβνμ} = −R_{αβμν}   5.40
ii) R_{μναβ} = R_{αβμν}   5.41
iii) R_{αβμν} + R_{αμνβ} + R_{ανβμ} = 0   5.42
The four previous relations, deduced in a generic local flat system (eq.5.38 applies only there), are tensor equations and thus valid in any reference frame, at any point, in any space.
From i) it follows that the components with repeated indexes within the same pair are null (for example R_{11μν} = R_{αβ33} = R_{221ν} = R_{1111} = 0).
Indeed, from i): R_{ααμν} = −R_{ααμν} ⇒ R_{ααμν} = 0; and so on.
▪ On balance, because of its symmetries, among the n⁴ components of R only
n²(n²−1)/12   5.43
are independent and significant (possibly ≠ 0), namely 1 for n = 2; 6 for n = 3; 20 for n = 4; ...
▫ The count is performed on the basis of the number of repeated indexes in R_{αβμν}, taking into account the symmetries i, ii, iii (eq.5.40, eq.5.41, eq.5.42):
• 4 repeated indexes ⇒ ∀ component = 0 ⇒ no significant component
• 3 repeated indexes ⇒ ∀ component = 0 ⇒ no significant component (since in these cases there are pairs of equal indexes)
• 2 repeated indexes, such as R_{αβαβ}: for the 1st index, n possible choices; for the 2nd index, n−1 possible choices ⇒ n(n−1) pairs, to be halved by the symmetry i) within the 1st pair, in order to count as one the pairs αβ, βα ⇒ n(n−1)/2 first independent pairs which, repeated, constitute as many independent components.
• 1 repeated index, such as R_{αβαν} or similar: for the 1st pair, n(n−1)/2 independent choices are possible (as above); for the 3rd index, 2 possible choices (one among the 2 indexes of the 1st pair); for the 4th index, (n−2) possible choices ⇒ 2(n−2) second pairs, to be halved by the symmetry i) on the second pair ⇒ n(n−1)(n−2)/2 independent components
• no repeated index, such as R_{αβμν} with all indexes different: the possible alignments with 4 different indexes are n(n−1)(n−2)(n−3) (n choices for the 1st index, n−1 choices for the 2nd index, n−2 for the 3rd index, n−3 for the 4th index), to be halved:
a first time for the symmetry i) within the 1st pair,
a second time for the symmetry i) within the 2nd pair,
a third time for the symmetry ii) of pairs' exchange
⇒ n(n−1)(n−2)(n−3)/8. For symmetry iii), only 2 components out of 3 are independent, so the result must be multiplied by 2/3 ⇒ n(n−1)(n−2)(n−3)/12 independent components
Summing up the three addends:
n(n−1)/2 + n(n−1)(n−2)/2 + n(n−1)(n−2)(n−3)/12 = n²(n²−1)/12 , q.e.d.
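The counting can also be verified by brute force. The sketch below (ours, assuming Python with numpy) enumerates the components left free by symmetries i) and ii) and subtracts the number of independent linear constraints imposed by the cyclic identity iii):

```python
import numpy as np
from itertools import combinations, product

def independent_riemann_components(n):
    pairs = list(combinations(range(n), 2))   # antisymmetry i): only a < b survives
    pidx = {p: i for i, p in enumerate(pairs)}
    free = [(i, j) for i in range(len(pairs)) for j in range(i, len(pairs))]  # ii)
    fidx = {f: k for k, f in enumerate(free)}

    def component(a, b, c, d):
        """Position and sign of R_abcd among the free components (None if = 0)."""
        if a == b or c == d:
            return None, 0
        s = 1
        if a > b: a, b, s = b, a, -s
        if c > d: c, d, s = d, c, -s
        p, q = pidx[(a, b)], pidx[(c, d)]
        if p > q: p, q = q, p
        return fidx[(p, q)], s

    # cyclic identity iii): R_abcd + R_acdb + R_adbc = 0, one row per index choice
    rows = []
    for a, b, c, d in product(range(n), repeat=4):
        row = np.zeros(len(free))
        for i, j, k, l in ((a, b, c, d), (a, c, d, b), (a, d, b, c)):
            pos, s = component(i, j, k, l)
            if pos is not None:
                row[pos] += s
        rows.append(row)
    return len(free) - np.linalg.matrix_rank(np.array(rows))

for n in (2, 3, 4, 5):
    print(n, independent_riemann_components(n), n**2 * (n**2 - 1) // 12)
# prints: 2 1 1 / 3 6 6 / 4 20 20 / 5 50 50
```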
▫ To get this result (the Bianchi identity R_{αβμν;λ} + R_{αβνλ;μ} + R_{αβλμ;ν} = 0) we place ourselves again in the (generic) local flat system of a point P and calculate the derivatives of R_{αβμν} by eq.5.38:
R_{αβμν,λ} = ∂/∂x^λ · ½ (g_{αν,βμ} + g_{βμ,αν} − g_{αμ,βν} − g_{βν,αμ}) =
= ½ (g_{αν,βμλ} + g_{βμ,ανλ} − g_{αμ,βνλ} − g_{βν,αμλ})
and similarly for R_{αβνλ,μ} and R_{αβλμ,ν}.
Adding up member to member the three equations and taking into account that in g_{αβ,γδκ} the indexes αβ and γδκ can be permuted at will within the two groups, the right member vanishes, so that:
R_{αβμν,λ} + R_{αβνλ,μ} + R_{αβλμ,ν} = 0
Since in the local flat system there is no difference between (first) derivative and covariant derivative, namely:
R_{αβμν;λ} = R_{αβμν,λ} ;  R_{αβνλ;μ} = R_{αβνλ,μ} ;  R_{αβλμ;ν} = R_{αβλμ,ν}
we can write R_{αβμν;λ} + R_{αβνλ;μ} + R_{αβλμ;ν} = 0, which is a tensor relationship and thus holds in any coordinate system, at any point, in any manifold, q.e.d.
We define as Ricci tensor the (0,2) tensor with components:
R_{βν} ≝ R^μ_{βμν} = contraction of R^α_{βμν} with respect to the 1st and 3rd index   5.46
The Ricci tensor is normally denoted by Ric to avoid confusion with the Riemann tensor.
▪ Caution should be paid when contracting the Riemann tensor on any two indexes: before contracting we must move the two indexes to contract into 1st and 3rd position, using the symmetries of R^α_{βμν}, in order to perform in any case the contraction 1-3, whose result is known and positive by definition.
In this manner, due to the symmetries of R^α_{βμν} and using schemes like eq.5.45, we see that:
• the results of contraction on indexes 1-3 or 2-4 have sign +
• the results of contraction on indexes 1-4 or 2-3 have sign –
(while contractions on indexes 1-2 or 3-4 give 0 as result)
▪ The tensor Rβ ν is symmetric (see eq.5.45).
▪ R_{βν} is a double tensor, represented by an n × n matrix. As for all symmetric double tensors, the number of its independent (that is, significant) components amounts to n(n+1)/2.
▫ The count includes the n elements of the main diagonal, the n−1 lying under the diagonal, the n−2 still below, and so on until reaching the single element in the lower left corner. In total:
n + (n−1) + (n−2) + . . . + 1 = n(n+1)/2
(remind that the sum of the first n numbers is n(n+1)/2).
* Remind that the trace is defined for a mixed tensor T^μ_ν as the sum T^μ_μ of the elements of the main diagonal of its matrix. Instead, the trace of T_{μν} or T^{μν} is by definition the trace of g^{κμ} T_{μν} or, respectively, of g_{κμ} T^{μν}.
▪ The Ricci tensor Ric and the Ricci scalar R, being contractions of the Riemann tensor R, retain only part of the information on the curvature of the space carried by R, because the number of significant components reduces when contracting from R to Ric to R.
While the complete information on the curvature at a point requires n²(n²−1)/12 numbers (the components of the tensor R), in passing to its contraction Ric, which counts only n(n+1)/2 components, or even worse to R, which is a single number, part of the information is lost.
▪ The number of significant components of R, Ric, R depending on the dimension n of the space is:
n    R    Ric    R
2    1    1      1
3    6    6      1
4    20   10     1
• for n = 2 : R contains as much information as Ric and R and it's
enough to define completely the curvature;
• for n = 3 : Ric contains as much information as R and it's enough to
define completely the curvature;
• for n ≥ 4 : only R completely defines the curvature.
▫ In general, if B = contraction of A and the number of significant components reduces in going from A to B, the following implications are valid:
A = 0 ⇒ B = 0 and its contrapositive B ≠ 0 ⇒ A ≠ 0,
but not the reverse ones. Hence the result B = 0 is ambiguous and doesn't allow one to conclude whether A is null or not.
In particular:
Riemann R = 0 ⇒ Ric = 0 ⇒ scalar R = 0, but not vice versa; as well as:
scalar R ≠ 0 ⇒ Ric ≠ 0 ⇒ Riemann R ≠ 0, but not vice versa.
R = 0 as well as Ric = 0 are (for n > 3) ambiguous results and do not allow one to say that the space is flat. In fact it may be that the scalar R = 0 without Ric = 0, and Ric = 0 without the Riemann R = 0.
In the case R = 0 it is necessary to go back to Ric; if Ric ≠ 0 the space is curved; if Ric = 0 too, it's necessary to go back to the Riemann R to conclude.
5.14 Measure of the curvature
Non-zero R, Ric or R allow us not only to detect the curvature but also to give it a measure.
For that we need a precise definition of curvature which is generally valid for spaces of any dimension.
▪ For an observer confined in a one-dimensional space, or a line, it's nonsense to talk about curvature: there is no intrinsic curvature in a 1D space. The curvature of the line can only be detected extrinsically by an observer who sees the curve embedded in a higher-dimensional space.
▪ It is reasonable to define (extrinsically) the curvature of the line l at a point P on the basis of the radius r of the osculating circle* at that point. The smaller the radius of the osculating circle, the greater the curvature. As a measure of the curvature at point P we assume:
k = 1/r   5.48
▪ If the space is a 2D surface, for an observer in 3D it will be reasonable to define the curvature of the surface at a point P in terms of the curvatures of the lines obtained by cutting the surface with planes passing through P and normal to the surface itself. In practice, once the normal to the surface at P is identified, consider the bundle of planes containing this straight line; each plane will intersect the surface along some line, whose curvature at P is given by the definition above.
In general the curvatures will differ line by line, ranging from a minimum k_min to a maximum k_max. The two lines which correspond to minimum and maximum are called main sections and the two planes that generate them are called main directions. It can be shown that the main directions are orthogonal to each other. The sectional curvature or Gauss curvature of the surface at point P is then defined as:
k_s = k_min · k_max   5.49
* The osculating circle at P is that circle which has 3 points infinitely close to each other in common with the line at P (3 are in fact the points that identify a circumference). Note that the circumferences tangent to the line at P are infinite in number, each with only two points in common with the line. This makes the difference between tangent circumferences and the osculating circumference.
k_s has positive sign if the osculating circles of the two curves at P lie on the same side with respect to the surface, negative otherwise. If one of the two osculating circles or both have infinite radius (i.e. they are straight lines) the curvature of the surface is null.
The set of cases is as follows:
[Table of cases omitted: the signs of k_min, k_max and the resulting sign of k_s]
From the Riemann tensor it is possible to give a definition of generalized curvature K which applies to spaces of any dimension, such that it reduces to the Gauss curvature in the 2D case. This generalized curvature, however, is a function not only of the point P but also of the direction: for each given point P of the space, the curvature will be different according to the direction. Note that a point P and a direction univocally define a plane 𝒫 passing through P and whose normal is oriented in the chosen direction.
▪ From the Riemann curvature tensor we get a generalized curvature definition step by step as follows.
❶ Apply the Riemann tensor R to an input list composed of two alternately repeated vectors A⃗, B⃗:*
R(A⃗, B⃗, A⃗, B⃗) = R_{αβμν} A^α B^β A^μ B^ν
It is meant that R is calculated at the point P and that the two vectors A⃗, B⃗ depart from this point: they then identify a plane 𝒫. The result R_{αβμν} A^α B^β A^μ B^ν is an invariant scalar.
❷ One wonders how R_{αβμν} A^α B^β A^μ B^ν changes when changing the selected vectors, limiting ourselves however to vectors of the same plane 𝒫 (which can be expressed as linear combinations of A⃗ and B⃗). Take for example:
X⃗ = aA⃗ + bB⃗ ,  Y⃗ = cA⃗ + dB⃗
and apply R to them in place of A⃗ and B⃗; the result is:
R(X⃗, Y⃗, X⃗, Y⃗) = R_{αβμν} A^α B^β A^μ B^ν (ad − bc)²
▫ Indeed:
R(X⃗, Y⃗, X⃗, Y⃗) = R_{αβμν} X^α Y^β X^μ Y^ν =
* The reason for the alternation A⃗, B⃗, A⃗, B⃗ lies in the fact that R_{αβμν} is antisymmetric for exchanges within the pairs αβ and μν. The sequence A⃗, A⃗, B⃗, B⃗ would give zero: R_{αβμν} A^α A^β ... = −R_{βαμν} A^α A^β ..., but α, β are dummy indexes and can be exchanged, so that R_{αβμν} A^α A^β ... = −R_{αβμν} A^α A^β ... = 0.
= R_{αβμν} ·
· (a²c² A^αA^βA^μA^ν + a²cd A^αA^βA^μB^ν + abc² A^αA^βB^μA^ν + abcd A^αA^βB^μB^ν +
+ a²cd A^αB^βA^μA^ν + a²d² A^αB^βA^μB^ν + abcd A^αB^βB^μA^ν + abd² A^αB^βB^μB^ν +
+ abc² B^αA^βA^μA^ν + abcd B^αA^βA^μB^ν + b²c² B^αA^βB^μA^ν + b²cd B^αA^βB^μB^ν +
+ abcd B^αB^βA^μA^ν + abd² B^αB^βA^μB^ν + b²cd B^αB^βB^μA^ν + b²d² B^αB^βB^μB^ν) =
only terms of the kind ABAB, ABBA, BAAB, BABA are ≠ 0; the symmetries of R_{αβμν} state their signs and make null the terms like R_{αβμν} A^α A^β .. , R_{αβμν} .. A^μ A^ν , R_{αβμν} B^α B^β ...
= R_{αβμν} A^α B^β A^μ B^ν (a²d² − abcd − abcd + b²c²) =
= R_{αβμν} A^α B^β A^μ B^ν (ad − bc)² , q.e.d.
Thus, switching from A⃗, B⃗ to coplanar vectors, the result changes only by a constant scale factor (ad − bc)².
❸ Instead, we need an invariant attached to the plane 𝒫 regardless of the chosen vectors. For this purpose we must “normalize” the result just obtained by eliminating the scale factor (ad − bc)².
We note that another invariant built on the same vectors, precisely (g_{αμ}g_{βν} − g_{αν}g_{βμ}) X^α Y^β X^μ Y^ν, serves the purpose if used as denominator of the previous result, since:
(g_{αμ}g_{βν} − g_{αν}g_{βμ}) X^α Y^β X^μ Y^ν = (ad − bc)² (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^α B^β A^μ B^ν **
Indeed the ratio between the two invariants:
K = R_{αβμν} X^α Y^β X^μ Y^ν / ( (g_{αμ}g_{βν} − g_{αν}g_{βμ}) X^α Y^β X^μ Y^ν ) =
= R_{αβμν} A^α B^β A^μ B^ν (ad − bc)² / ( (ad − bc)² (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^α B^β A^μ B^ν ) =
= R_{αβμν} A^α B^β A^μ B^ν / ( (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^α B^β A^μ B^ν )   5.50
is in turn an invariant scalar that does not change in switching from the vectors A⃗, B⃗ to the coplanar X⃗, Y⃗. This means that the ratio K does not depend on the particular pair of vectors taken, but only on the plane 𝒫 they identify; in other words K = K(P, 𝒫).
K is called Riemann curvature, in spaces of any dimension n ≥ 2.
For n = 2 the Riemann curvature K reduces to the Gauss curvature k_s, being K = k_s.*
It is K = 0 when R_{αβμν} = 0 and vice versa: in these cases spaces are intrinsically flat.**
When K ≠ 0 the space is intrinsically curved; if K > 0 the space has a spherical curvature, if K < 0 the curvature is of hyperbolic type.
** Indeed, (g_{αμ}g_{βν} − g_{αν}g_{βμ}) X^α Y^β X^μ Y^ν = X²Y² − (X⃗·Y⃗)²; expanding X⃗ = aA⃗ + bB⃗, Y⃗ = cA⃗ + dB⃗ and collecting, the terms cancel pairwise, leaving:
= a²d² A²B² + b²c² A²B² − 2abcd A²B² + 2abcd (A⃗·B⃗)² − a²d² (A⃗·B⃗)² − b²c² (A⃗·B⃗)² =
= (ad − bc)² A²B² − (ad − bc)² (A⃗·B⃗)² = (ad − bc)² (A²B² − (A⃗·B⃗)²) =
= (ad − bc)² (A^μB^ν A_μB_ν − A^νB^μ A_μB_ν) =
= (ad − bc)² (g_{αμ}g_{βν} A^αB^β A^μB^ν − g_{αν}g_{βμ} A^αB^β A^μB^ν) =
= (ad − bc)² (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^αB^β A^μB^ν
* For the particular case of the spherical surface see problem 33.
** Note that even spaces whose metric is not positive definite can be flat, such as the Minkowski spaces of Special Relativity, which have a diag(±1) metric.
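That the ratio eq.5.50 depends on the plane only can be tried out numerically. The sketch below (ours, assuming Python with numpy, and borrowing the sphere component R_{θφθφ} = r²sin²θ found earlier) evaluates K on random pairs of vectors and always obtains 1/r²:

```python
import numpy as np

r, theta = 2.0, 0.8
g = np.diag([r**2, r**2 * np.sin(theta)**2])       # metric of a sphere of radius r

# covariant Riemann components: R_θφθφ = g_θθ · R^θ_φθφ = r²sin²θ, plus symmetries
R = np.zeros((2, 2, 2, 2))
R[0, 1, 0, 1] = R[1, 0, 1, 0] = r**2 * np.sin(theta)**2
R[0, 1, 1, 0] = R[1, 0, 0, 1] = -(r**2) * np.sin(theta)**2

def K(A, B):                                        # eq.5.50 (A, B not parallel)
    num = np.einsum('abmn,a,b,m,n->', R, A, B, A, B)
    G = np.einsum('am,bn->abmn', g, g) - np.einsum('an,bm->abmn', g, g)
    den = np.einsum('abmn,a,b,m,n->', G, A, B, A, B)
    return num / den

rng = np.random.default_rng(0)
for _ in range(3):
    print(K(rng.normal(size=2), rng.normal(size=2)))   # always 0.25 = 1/r²
```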
5.16 Isotropic spaces, spaces with constant curvature
▪ It can happen that the curvature K does not depend on the plane 𝒫 but only on the point P, so that it is the same in any direction. In these cases the space is called isotropic.
▪ If the curvature K is independent of the point P, the space is said to be a space with constant curvature.
▪ 2D spaces are necessarily isotropic, since only one (tangent) plane passes through each point P; the curvature K may, however, be different from point to point, so they do not necessarily have constant curvature.
▪ Spaces of dimension n > 2, if isotropic, are spaces with constant curvature K (Schur's theorem).
▫ Demonstration of Schur's theorem: the definition eq.5.50
K = R_{αβμν} A^α B^β A^μ B^ν / ( (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^α B^β A^μ B^ν )
is equivalent to:
[ K (g_{αμ}g_{βν} − g_{αν}g_{βμ}) − R_{αβμν} ] A^α B^β A^μ B^ν = 0
In isotropic spaces, where K = K(P), i.e. K = K(x¹, x², ... xⁿ) regardless of orientation, this must hold ∀ A⃗, B⃗, that is:
K (g_{αμ}g_{βν} − g_{αν}g_{βμ}) − R_{αβμν} = 0
R_{αβμν} = K (g_{αμ}g_{βν} − g_{αν}g_{βμ})   5.51
Taking the covariant derivative:
R_{αβμν;λ} = K_{;λ} (g_{αμ}g_{βν} − g_{αν}g_{βμ}) + K (g_{αμ}g_{βν} − g_{αν}g_{βμ})_{;λ}
and since all g_{αβ;λ} = 0:
R_{αβμν;λ} = K_{;λ} (g_{αμ}g_{βν} − g_{αν}g_{βμ})
Writing the Bianchi identity R_{αβμν;λ} + R_{αβνλ;μ} + R_{αβλμ;ν} = 0 with the derivatives in this form and applying g^{αμ} g^{βν} to both members:
K_{;λ} (g^{αμ}g^{βν}g_{αμ}g_{βν} − g^{αμ}g^{βν}g_{αν}g_{βμ}) +
+ K_{;μ} (g^{αμ}g^{βν}g_{αν}g_{βλ} − g^{αμ}g^{βν}g_{αλ}g_{βν}) +
+ K_{;ν} (g^{αμ}g^{βν}g_{αλ}g_{βμ} − g^{αμ}g^{βν}g_{αμ}g_{βλ}) = 0
= K_{;λ} (n·n − δ^μ_ν δ^ν_μ) + K_{;μ} (δ^μ_ν δ^ν_λ − δ^μ_λ n) + K_{;ν} (δ^μ_λ δ^ν_μ − n δ^ν_λ) = 0
= K_{;λ} (n² − n) + K_{;μ} δ^μ_λ (1 − n) + K_{;ν} δ^ν_λ (1 − n) = 0
= K_{;λ} (n² − n) + K_{;λ} (1 − n) + K_{;λ} (1 − n) = 0
= K_{;λ} (n² − 3n + 2) = 0
= K_{;λ} (n − 1)(n − 2) = 0
which is always fulfilled for n = 2, ∀ K_{;λ}; for n > 2 it must be K_{;λ} = 0, i.e. (K being an invariant scalar) K_{,λ} = 0, that is K = const, q.e.d.
▪ Vice versa, spaces of dimension n > 2 with constant curvature K are isotropic: in a curved space of dimension n > 2 one cannot define “the same direction” at different points, because direction-vectors outgoing from different points are not comparable. Hence a constant curvature can only be realized independently of the direction.
▪ Summary: for spaces of dimension n the following implications hold:
n = 2 ⇒ isotropy
n > 2 : isotropy ⇔ constant curvature
▪ In spaces with constant curvature, further relations descend from eq.5.51 by contraction:
R_{βν} = K (n−1) g_{βν}   5.53
R = K n(n−1)   5.52
▫ Indeed, applying g^{αμ} to eq.5.51:
g^{αμ} R_{αβμν} = K (g^{αμ}g_{αμ} g_{βν} − g^{αμ}g_{αν} g_{βμ})
R_{βν} = K (n g_{βν} − δ^μ_ν g_{βμ}) = K (n g_{βν} − g_{βν}) = K (n−1) g_{βν} , q.e.d.
Contracting once more with g^{βν}: R = K (n−1) g^{βν} g_{βν} = K n(n−1) , q.e.d.
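Eq.5.52 and eq.5.53 can be checked on a concrete constant-curvature space. A sketch of ours (assuming Python with sympy) does it for the sphere of radius r, where K = 1/r² and n = 2:

```python
import sympy as sp

theta, phi, r = sp.symbols('theta phi r', positive=True)
x = [theta, phi]
g = sp.Matrix([[r**2, 0], [0, r**2 * sp.sin(theta)**2]])
gi = g.inv(); n = 2

def Gamma(a, b, c):                         # eq.4.73
    return sum(gi[a, d] * (sp.diff(g[d, b], x[c]) + sp.diff(g[d, c], x[b])
                           - sp.diff(g[b, c], x[d])) for d in range(n)) / 2

def Riem(a, b, m, nu):                      # eq.5.27
    e = sp.diff(Gamma(a, b, nu), x[m]) - sp.diff(Gamma(a, b, m), x[nu])
    e += sum(Gamma(a, l, m) * Gamma(l, b, nu)
             - Gamma(a, l, nu) * Gamma(l, b, m) for l in range(n))
    return e

# Ricci tensor as contraction 1-3 (eq.5.46) and Ricci scalar
Ric = sp.Matrix(n, n, lambda b, nu: sum(Riem(m, b, m, nu) for m in range(n)))
Rscal = sp.simplify(sum(gi[b, nu] * Ric[b, nu] for b in range(n) for nu in range(n)))
K = 1 / r**2
print(sp.simplify(Ric - K * (n - 1) * g))    # zero matrix  (eq.5.53)
print(sp.simplify(Rscal - K * n * (n - 1)))  # 0            (eq.5.52)
```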
▪ From eq.5.52 we deduce that for spaces with constant curvature:
R = 2K for n = 2
R = 6K for n = 3
R = 12K for n = 4
In particular, for 2D spaces the Riemann curvature K is one half of the Ricci scalar R.
g^{βν} R_{μβνλ} = R_{μλ}
[T-mosaic diagram omitted: g^{βν} plugged into the connectors ẽ^β, ẽ^ν of R_{μβνλ}]
The result is a Ricci tensor as produced by a contraction 2-3.
Contracting the Bianchi identities twice leads to the relation:
(2 R^μ_λ − δ^μ_λ R)_{;μ} = 0   5.54
that, thanks to the identity (2 R^μ_λ − δ^μ_λ R)_{;μ} ≡ 2 R^μ_{λ;μ} − R_{;λ}, may be written as:
2 R^μ_{λ;μ} − R_{;λ} = 0   5.55
This expression or its equivalent eq.5.54 are sometimes called twice contracted Bianchi identities.
Operating a further raising of the indexes:
g^{νλ} (2 R^μ_λ − δ^μ_λ R)_{;μ} = 0
(2 R^{νμ} − δ^{νμ} R)_{;μ} = 0
that, since δ^{νμ} ≡ g^{νμ} (eq.2.46), yields:
(2 R^{νμ} − g^{νμ} R)_{;μ} = 0
(R^{νμ} − ½ g^{νμ} R)_{;μ} = 0   5.56
The tensor within parentheses is defined as the Einstein tensor*:
G^{νμ} ≝ R^{νμ} − ½ g^{νμ} R
Other forms of the Einstein tensor are obtained by lowering the indexes once or twice by means of g_{αβ}**:
G^μ_ν = R^μ_ν − ½ δ^μ_ν R ;  G_{μν} = R_{μν} − ½ g_{μν} R   5.59
* No reference, of course, to the “dual switch” earlier denoted by the same symbol!
** Note that in equations that contain tensors of the same rank, reducible to the form V^α_β = W^α_β or V_{αβ} = W_{αβ}, it is possible to lower / raise indexes simultaneously and concurrently, even in groups.
In the first it is written δ^μ_ν in place of g^μ_ν (remind that δ^μ_ν = g^μ_ν).
▪ The first form (eq.5.59) is already ready for the contraction μ = ν:
G = G^μ_μ = R^μ_μ − ½ δ^μ_μ R = R − ½ n R
that is:
G = R (1 − n/2)   5.60
5.18 Einstein spaces
An Einstein space is by definition a space where:
G^{νμ} = λ g^{νμ}  with λ = const   5.62
Contracting eq.5.62 gives G = λn and hence, by eq.5.60, λ = R (1 − n/2)/n. Substituting back and writing G^{νμ} explicitly:
(R/n)(1 − n/2) g^{νμ} = R^{νμ} − ½ g^{νμ} R
R^{νμ} = R (1/n − 1/2) g^{νμ} + ½ g^{νμ} R
R^{νμ} = (R/n) g^{νμ}   5.64
This relationship between the Ricci tensor and the metric tensor characterizes Einstein spaces.
Note that in an Einstein space from R we can go back to Ric. In particular, if R = 0 then also R^{νμ} = 0.
▪ In an Einstein space the Ricci scalar R is constant. Indeed, inserting eq.5.64 in mixed form (R^μ_ν = (R/n) δ^μ_ν) into the twice contracted Bianchi identity eq.5.54:
(2 (R/n) δ^μ_ν − δ^μ_ν R)_{;μ} = 0
(δ^μ_ν R (1/n − 1/2))_{;μ} = 0
(1/n − 1/2) R_{;ν} = 0
(1/n − 1/2) R_{,ν} = 0
hence, for n ≠ 2, R = const.
This was also true for spaces with constant curvature. In fact, there is an inclusion relationship between the two classes of spaces:
▪ Each space with constant curvature is an Einstein space.
▫ Indeed: the Einstein tensor G_{βν} ≝ R_{βν} − ½ g_{βν} R, recalling eq.5.53 valid in spaces with constant curvature, can be written:
G_{βν} = K (n − 1) g_{βν} − ½ g_{βν} R =
= g_{βν} (K (n − 1) − ½ R) = and, since by eq.5.52 K (n − 1) = R/n:
= g_{βν} R (1/n − 1/2)   5.65
which is precisely of the form eq.5.62 with:
λ = R (1/n − 1/2)
(a constant, R being constant), q.e.d.
Einstein spaces therefore include:
• all spaces with n = 2
• spaces with n > 2 with constant curvature.
______________
G is our last stop. But just G is the beginning of a new story. More than a hundred years have passed since the day when, mixing physics and mathematics with an overdose of intuition, Einstein wrote:
G = κ T
anticipating to the twentieth century, according to some, a theory of the Third Millennium, capable of reforming our understanding of the Universe on a large scale. This equation states a direct proportionality between the content of matter-energy, represented by the “stress-energy-momentum” tensor T — a generalization to four-dimensional space-time of the stress tensor — and the curvature of space expressed by G. In a sense, T belongs to the realm of physics, while G is a matter of mathematics: that maths we have toyed with so far. Compatibility with Newton's gravitation law allows giving a value to the constant κ and setting the fundamental equation of General Relativity in the form:
G = (8πG/c⁴) T
which, beside the two glorious tensors, grants a place of honor to fundamental constants such as the universal gravitational constant G and the speed of light c. We note, dismayed, the serious and unjustified absence of Planck's constant h. Not missing, instead, is the indefectible π, which seems to claim its fundamental role in the architecture of the world, although no one seems to be much concerned about that (those who think this observation trivial should try to imagine a Universe in which π is, yes, a constant, but different from 3.14...).
______________
Appendix
1 - Transformation of Γ under coordinate change
We make Γ explicit from eq.4.30, ∂_μ e⃗_ν = Γ^α_{μν} e⃗_α, by (scalar) multiplying both members by the basis-covector ẽ^κ:
⟨ẽ^κ, ∂_μ e⃗_ν⟩ = Γ^α_{μν} ⟨ẽ^κ, e⃗_α⟩ = Γ^α_{μν} δ^κ_α = Γ^κ_{μν}
Now let us operate a coordinate change x → x' (and of the coordinate bases, too) and express the right member in the new frame:
Γ^κ_{μν} = ⟨ẽ^κ, ∂_μ e⃗_ν⟩ = ⟨Λ^κ_{α'} ẽ^{α'}, Λ^{β'}_μ ∂_{β'} (Λ^{γ'}_ν e⃗_{γ'})⟩   (chain rule on ∂_μ)
= Λ^κ_{α'} Λ^{β'}_μ ⟨ẽ^{α'}, Λ^{γ'}_ν ∂_{β'} e⃗_{γ'} + (∂_{β'} Λ^{γ'}_ν) e⃗_{γ'}⟩
= Λ^κ_{α'} Λ^{β'}_μ Λ^{γ'}_ν ⟨ẽ^{α'}, ∂_{β'} e⃗_{γ'}⟩ + Λ^κ_{α'} Λ^{β'}_μ (∂_{β'} Λ^{γ'}_ν) ⟨ẽ^{α'}, e⃗_{γ'}⟩
= Λ^κ_{α'} Λ^{β'}_μ Λ^{γ'}_ν Γ^{α'}_{β'γ'} + Λ^κ_{α'} Λ^{β'}_μ ∂_{β'} Λ^{α'}_ν
In the last term Λ^{β'}_μ ∂_{β'} = ∂_μ and Λ^{α'}_ν = ∂x^{α'}/∂x^ν, so that, renaming the dummy index α' → β':
Γ^κ_{μν} = Λ^κ_{α'} Λ^{β'}_μ Λ^{γ'}_ν Γ^{α'}_{β'γ'} + (∂x^κ/∂x^{β'}) (∂²x^{β'}/∂x^μ ∂x^ν)
The first term would describe a tensor transformation, but the additional term leads to a different law and confirms that Γ is not a tensor.
2 - Transformation of covariant derivative under coordinate change
To V^α_{;β} = V^α_{,β} + Γ^α_{βγ} V^γ
let's apply a coordinate change x → x' and express the right member in the new coordinate frame.
For the first term, by the chain rule and V^α = Λ^α_{α'} V^{α'}:
V^α_{,β} = Λ^{β'}_β ∂_{β'} (Λ^α_{α'} V^{α'}) = Λ^α_{α'} Λ^{β'}_β V^{α'}_{,β'} + Λ^{β'}_β (∂²x^α/∂x^{β'}∂x^{α'}) V^{α'}
For the second term, using the transformation of Γ deduced in Appendix 1 and V^γ = Λ^γ_{γ'} V^{γ'}:
Γ^α_{βγ} V^γ = [ Λ^α_{α'} Λ^{β'}_β Λ^{γ''}_γ Γ^{α'}_{β'γ''} + (∂x^α/∂x^{α'}) (∂²x^{α'}/∂x^β ∂x^γ) ] Λ^γ_{γ'} V^{γ'} =
= Λ^α_{α'} Λ^{β'}_β Γ^{α'}_{β'γ'} V^{γ'} + (∂x^α/∂x^{α'}) (∂²x^{α'}/∂x^β ∂x^γ) Λ^γ_{γ'} V^{γ'}
(we used Λ^{γ''}_γ Λ^γ_{γ'} = ∂x^{γ''}/∂x^γ · ∂x^γ/∂x^{γ'} = δ^{γ''}_{γ'}).
The two terms in ∂² cancel each other because opposite in sign. To show that, let's compute:
∂/∂x^β (∂x^α/∂x^{α'} · ∂x^{α'}/∂x^γ) = ∂x^{α'}/∂x^γ · ∂²x^α/∂x^β ∂x^{α'} + ∂x^α/∂x^{α'} · ∂²x^{α'}/∂x^β ∂x^γ
On the other hand ∂x^α/∂x^{α'} · ∂x^{α'}/∂x^γ = ∂x^α/∂x^γ = δ^α_γ, whose derivative is null, hence:
∂x^{α'}/∂x^γ · ∂²x^α/∂x^β ∂x^{α'} = − ∂x^α/∂x^{α'} · ∂²x^{α'}/∂x^β ∂x^γ , q.e.d.
(contracting both members with Λ^γ_{γ'} V^{γ'} turns the left member into the extra term of V^α_{,β} and the right member into minus the extra term of Γ^α_{βγ} V^γ).
The transformation for V^α_{;β} is then:
V^α_{;β} = V^α_{,β} + Γ^α_{βγ} V^γ = Λ^α_{α'} Λ^{β'}_β (V^{α'}_{,β'} + Γ^{α'}_{β'γ'} V^{γ'}) = Λ^α_{α'} Λ^{β'}_β V^{α'}_{;β'}
⇒ V^α_{;β} transforms as (a component of) a (1,1) tensor.
3 - Non-tensoriality of basis-vectors, their derivatives and gradients
The following are not tensors: a) the basis-vectors e⃗_ν; b) their derivatives ∂_μ e⃗_ν; c) their gradients ∇̃e⃗_ν.
a) "Basis-vector" is a role that is given to certain vectors: within a
vector space n vectors are chosen to wear the "jacket" of basis-vectors.
Under change of basis these vectors remain unchanged, tensorially
transforming their components (according to the usual contravariant
scheme eq.3.4), while their role of basis-vectors is transferred to other
vectors of the vector space. More than a law of transformation of
basis-vectors, eq.3.2 is the law ruling the role or jacket transfer.
For instance, under transformation from Cartesian coordinates to
spherical, the vector V⃗ ≡(1 ,0 ,0) that in Cartesian plays the role of
basis-vector ⃗i transforms its components by eq.3.4 and remains
unchanged, but loses the role of basis-vector which is transferred to
new vectors according to eq.3.2 (which looks like a covariant scheme).
Vectors underlying basis-vectors have then a tensor character that
does not belong to basis-vectors as such.
b) The scalar components of ∂_μ e⃗_ν are the Christoffel symbols (eq.4.30, eq.4.31) which, as shown in Appendix 1, do not behave as tensors: that excludes that ∂_μ e⃗_ν is a tensor.
The same conclusion is reached by seeing that the derivatives of basis-vectors ∂_μ e⃗_ν are null in Cartesian coordinates but not in spherical ones. Since a tensor which is null in one reference must be null in all, the derivatives of basis-vectors ∂_μ e⃗_ν are not tensors.
c) Also the gradients of basis-vectors ∇̃e⃗_ν have the Christoffel symbols as scalar components (eq.4.51); since these have no tensor character (see Appendix 1), it follows that the gradients of basis-vectors ∇̃e⃗_ν are not tensors.
As above, the same conclusion is reached by seeing that the ∇̃e⃗_ν are zero in Cartesian but not in spherical coordinates.
We note that this does not conflict with the fact that V⃗ and ∇̃V⃗ have tensor character: the scalar components of ∇̃V⃗ are the covariant derivatives, which transform as tensors.
Similar considerations apply to basis-covectors.
4 – Equation of geodesic
The equation of the geodesic is obtained as the curve of minimal length between two points A and B.
[Figure omitted: a curve l from A to B and a varied curve l̄ obtained by displacing each point x^μ by a small vector δx^μ]
The length of the arc l is:
s = ∫_{t_A}^{t_B} √( g_{μν} (dx^μ/dt)(dx^ν/dt) ) dt
where in general g_{μν} is a function of the point, i.e. g_{μν} = g_{μν}(x^μ).
Let's define at each point x^μ of the curve l a small arbitrary vector δx⃗, variable with continuity, with components δx^μ, that turns to zero in A and B.
In this way a curve l̄ is defined by the equations:
x̄^μ = x^μ + δx^μ
whose arc-length is:
s̄ = ∫_{t_A}^{t_B} √( ḡ_{μν} (dx̄^μ/dt)(dx̄^ν/dt) ) dt
ḡ_{μν} being now calculated in the varied points x̄^μ: ḡ_{μν} = ḡ_{μν}(x̄^μ). The value of ḡ_{μν} at the point x̄^μ can be obtained from its value at x^μ by a first-order Taylor expansion:
ḡ_{μν} = g_{μν} + (∂g_{μν}/∂x^ξ) δx^ξ
while:
dx̄^ν/dt = d(x^ν + δx^ν)/dt = dx^ν/dt + d(δx^ν)/dt
The radicand quantity is thus:
ḡ_{μν} (dx̄^μ/dt)(dx̄^ν/dt) = ( g_{μν} + (∂g_{μν}/∂x^ξ) δx^ξ ) ( dx^μ/dt + d(δx^μ)/dt ) ( dx^ν/dt + d(δx^ν)/dt )
a product of the kind (A+a)(B+b)(C+c), where a capital letter means a finite term and a lowercase letter an infinitesimal one. Carrying out the product:
(A+a)(B+b)(C+c) = ABC + ABc + AbC + Abc + aBC + aBc + abC + abc
Only the finite term ABC and the first-order infinitesimals ABc, AbC, aBC matter; the others are infinitesimal of higher order, therefore negligible. So we get:
ḡ_{μν} (dx̄^μ/dt)(dx̄^ν/dt) = g_{μν} (dx^μ/dt)(dx^ν/dt) + g_{μν} (dx^μ/dt)(d(δx^ν)/dt) + g_{μν} (d(δx^μ)/dt)(dx^ν/dt) + (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/dt)(dx^ν/dt)
and, the 2nd and 3rd terms being the same one (just swap the dummies μ, ν in the 3rd, which is possible here because g_{μν} is symmetric):
ḡ_{μν} (dx̄^μ/dt)(dx̄^ν/dt) = g_{μν} (dx^μ/dt)(dx^ν/dt) + 2 g_{μν} (dx^μ/dt)(d(δx^ν)/dt) + (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/dt)(dx^ν/dt)
The radicand has thus the form a + ε, with a the preponderant term and ε a corrective term; since √(a+ε) − √a ≅ ε/(2√a):
√( ḡ_{μν} (dx̄^μ/dt)(dx̄^ν/dt) ) − √( g_{μν} (dx^μ/dt)(dx^ν/dt) ) =
= [ g_{μν} (dx^μ/dt)(d(δx^ν)/dt) + ½ (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/dt)(dx^ν/dt) ] / √( g_{μν} (dx^μ/dt)(dx^ν/dt) )
By integrating:
δs = s̄ − s = ∫_{t_A}^{t_B} √( ḡ_{μν} ... ) dt − ∫_{t_A}^{t_B} √( g_{μν} ... ) dt = ∫_{t_A}^{t_B} [ g_{μν} (dx^μ/dt)(d(δx^ν)/dt) + ½ (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/dt)(dx^ν/dt) ] / √( g_{μν} ... ) dt
If the arc s is chosen as parameter t, the radicand at denominator is = 1 because g_{μν} dx^μ dx^ν = ds², and thus:
δs = ∫_{s_A}^{s_B} ( g_{μν} (dx^μ/ds)(d(δx^ν)/ds) + ½ (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/ds)(dx^ν/ds) ) ds =
= ∫_{s_A}^{s_B} g_{μν} (dx^μ/ds) (d(δx^ν)/ds) ds + ∫_{s_A}^{s_B} ½ (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/ds)(dx^ν/ds) ds
In the first integral take u = g_{μν} dx^μ/ds and dv = d(δx^ν); integration by parts ∫ u dv = u v − ∫ v du yields:
δs = [ g_{μν} (dx^μ/ds) δx^ν ]_{s_A}^{s_B} − ∫_{s_A}^{s_B} δx^ν d/ds ( g_{μν} dx^μ/ds ) ds + ∫_{s_A}^{s_B} ½ (∂g_{μν}/∂x^ξ) δx^ξ (dx^μ/ds)(dx^ν/ds) ds
The term within brackets is = 0 because δx^ν = 0 in A and B; by swapping the dummies ν, ξ in the last integral:
δs = − ∫_{s_A}^{s_B} δx^ν d/ds ( g_{μν} dx^μ/ds ) ds + ∫_{s_A}^{s_B} ½ (∂g_{μξ}/∂x^ν) δx^ν (dx^μ/ds)(dx^ξ/ds) ds =
= − ∫_{s_A}^{s_B} δx^ν [ d/ds ( g_{μν} dx^μ/ds ) − ½ (∂g_{μξ}/∂x^ν) (dx^μ/ds)(dx^ξ/ds) ] ds
We calculate separately the term:
d/ds ( g_{μν} dx^μ/ds ) = g_{μν} d²x^μ/ds² + (dg_{μν}/ds)(dx^μ/ds) = g_{μν} d²x^μ/ds² + (∂g_{μν}/∂x^ξ)(dx^ξ/ds)(dx^μ/ds) =
= g_{μν} d²x^μ/ds² + ½ (∂g_{μν}/∂x^ξ)(dx^μ/ds)(dx^ξ/ds) + ½ (∂g_{ξν}/∂x^μ)(dx^ξ/ds)(dx^μ/ds)
having halved the last term and swapped the dummies μ, ξ in one of the two halves.
By replacing what was calculated above:
δs = − ∫_{s_A}^{s_B} δx^ν [ g_{μν} d²x^μ/ds² + ½ (∂g_{μν}/∂x^ξ)(dx^μ/ds)(dx^ξ/ds) + ½ (∂g_{ξν}/∂x^μ)(dx^ξ/ds)(dx^μ/ds) − ½ (∂g_{μξ}/∂x^ν)(dx^μ/ds)(dx^ξ/ds) ] ds =
= − ∫_{s_A}^{s_B} δx^ν [ g_{μν} d²x^μ/ds² + ½ ( g_{μν,ξ} + g_{ξν,μ} − g_{μξ,ν} ) (dx^μ/ds)(dx^ξ/ds) ] ds
For δs = s̄ − s = 0 (geodesic condition) the curve l̄ overlaps the geodesic l and the integral vanishes. Since δx^ν is arbitrary, to satisfy the geodesic condition δs = 0 the square bracket must vanish too:
g_{μν} d²x^μ/ds² + ½ ( g_{μν,ξ} + g_{ξν,μ} − g_{μξ,ν} ) (dx^μ/ds)(dx^ξ/ds) = 0
Multiplying by g^{νλ}:
g^{νλ} g_{μν} d²x^μ/ds² + ½ g^{νλ} ( g_{μν,ξ} + g_{ξν,μ} − g_{μξ,ν} ) (dx^μ/ds)(dx^ξ/ds) = 0
where the coefficient of the last term is Γ^λ_{μξ}. Since g^{νλ} g_{μν} = δ^λ_μ ⇒ δ^λ_μ d²x^μ/ds² = d²x^λ/ds², we get the differential equation(s) of the geodesic:
d²x^λ/ds² + Γ^λ_{μξ} (dx^μ/ds)(dx^ξ/ds) = 0
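The geodesic equation lends itself to a quick numerical test. The sketch below (ours, not from the text; assuming Python with numpy) integrates it on the unit sphere — whose only non-zero Christoffel symbols are Γ^θ_{φφ} = −sinθcosθ and Γ^φ_{θφ} = Γ^φ_{φθ} = cotθ — starting on the equator with unit velocity along φ: the solution keeps θ = π/2, i.e. it runs along a great circle, as expected:

```python
import numpy as np

# State y = (θ, φ, dθ/ds, dφ/ds); geodesic equation on the unit sphere:
# θ'' = -Γ^θ_φφ (φ')² = sinθ cosθ (φ')²
# φ'' = -2 Γ^φ_θφ θ' φ'  = -2 cotθ θ' φ'
def rhs(y):
    th, ph, dth, dph = y
    return np.array([dth, dph,
                     np.sin(th) * np.cos(th) * dph**2,
                     -2.0 * dth * dph / np.tan(th)])

def rk4_step(y, h):
    k1 = rhs(y); k2 = rhs(y + h/2 * k1)
    k3 = rhs(y + h/2 * k2); k4 = rhs(y + h * k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

y = np.array([np.pi / 2, 0.0, 0.0, 1.0])   # equator, unit speed along φ
for _ in range(1000):
    y = rk4_step(y, 0.01)
print(y)   # θ ≈ π/2 unchanged, φ ≈ 10 = arc length travelled
```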
5 – Riemannian metrics / spaces in general
A space is not associated with a particular metric (the metric changes with the coordinate system you choose); however, some characteristics of the metric can imply precise properties of the space. This is the case for flat metrics (i.e. matrices with constant elements, or coefficients), which imply a flat space (but not vice versa), as well as for positive definite / indefinite metrics, which imply respectively spaces with positive distances (ds² > 0) or with distances of variable sign.
Here we classify Riemannian spaces based on some properties of the spaces themselves or of their metrics, referring to 4 attributes (0 = NO, 1 = YES). Forbidden combinations are barred.
[Classification table omitted]
• Reasons for exclusion:
a: Euclidean space ⇒ flat space
b: Euclidean space ⇒ g positive definite
c: g with constant coefficients ⇒ flat space
d: g positive definite together with flat space ⇒ Euclidean space
• Examples (as verified in the sketch below):
example 1: [g_{μν}] = ( 1  0 ; 0  x¹ )  ⇒ ∃ Γ ≠ 0 ; R ≠ 0
example 2: [g_{μν}] = ( 1  0 ; 0  −(x²)² )  ⇒ ∃ Γ ≠ 0 ; R = 0
example 3: [g_{μν}] = ( 1  0 ; 0  (x¹)²+1 )  ⇒ ∃ Γ ≠ 0 ; R ≠ 0
example 4: [g_{μν}] = ( 1  0 ; 0  (x¹)² )  (Euclidean metric in polar coordinates)  ⇒ ∃ Γ ≠ 0 ; R = 0
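A sketch of ours (assuming Python with sympy; recall from Appendix 6 below that for a 2D metric R = 0 amounts to R_{1212} = 0) verifies the four claims:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
x = [x1, x2]
examples = [sp.diag(1, x1),          # example 1
            sp.diag(1, -x2**2),      # example 2
            sp.diag(1, x1**2 + 1),   # example 3
            sp.diag(1, x1**2)]       # example 4: polar coordinates

for g in examples:
    gi = g.inv()
    def Gamma(a, b, c):              # eq.4.73
        return sum(gi[a, d] * (sp.diff(g[d, b], x[c]) + sp.diff(g[d, c], x[b])
                               - sp.diff(g[b, c], x[d])) for d in range(2)) / 2
    def Riem(a, b, m, nu):           # eq.5.27
        e = sp.diff(Gamma(a, b, nu), x[m]) - sp.diff(Gamma(a, b, m), x[nu])
        e += sum(Gamma(a, l, m) * Gamma(l, b, nu)
                 - Gamma(a, l, nu) * Gamma(l, b, m) for l in range(2))
        return e
    R1212 = sp.simplify(sum(g[0, a] * Riem(a, 1, 0, 1) for a in range(2)))
    has_gamma = any(sp.simplify(Gamma(a, b, c)) != 0
                    for a in range(2) for b in range(2) for c in range(2))
    print(g.tolist(), ' Γ≠0:', has_gamma, ' R_1212 =', R1212)
```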
6 – Curvature in 2D spaces
A 2D space (surface) can have a variable curvature but is necessarily isotropic, so its Riemann curvature
K = R_{αβμν} A^α B^β A^μ B^ν / ( (g_{αμ}g_{βν} − g_{αν}g_{βμ}) A^α B^β A^μ B^ν )
does not depend on the direction, that is, it is independent of the vectors A⃗, B⃗.
The summations at numerator and denominator contain addends with any combination of the values 1, 2 in the indexes α, β, μ, ν, but at numerator only the terms containing R_{1212}, R_{1221}, R_{2112}, R_{2121} are ≠ 0; therefore in the formula for K only the terms corresponding to the alignments αβμν = 1212, 1221, 2112, 2121 must be considered:
• at numerator:
R_{1212} A¹B²A¹B² + R_{1221} A¹B²A²B¹ + R_{2112} A²B¹A¹B² + R_{2121} A²B¹A²B¹
• at denominator:
(g₁₁g₂₂ − g₁₂g₂₁) A¹B²A¹B² + (g₁₂g₂₁ − g₁₁g₂₂) A¹B²A²B¹ +
+ (g₂₁g₁₂ − g₂₂g₁₁) A²B¹A¹B² + (g₂₂g₁₁ − g₂₁g₁₂) A²B¹A²B¹
Since A⃗, B⃗ are arbitrary vectors, we can choose:
A⃗ = (1, 0), that is A¹ = 1, A² = 0
B⃗ = (0, 1), that is B¹ = 0, B² = 1
In this way all the terms that contain A² or B¹ at numerator and denominator vanish, giving:
K = R_{1212} A¹B²A¹B² / ( (g₁₁g₂₂ − g₁₂g₂₁) A¹B²A¹B² ) = R_{1212} / det[g_{αβ}]
Bibliographic references
▪ Among the texts specifically devoted to Tensor Analysis, the following keep a relatively soft profile:
Fleisch, D.A. 2012, A Student's Guide to Vectors and Tensors, Cambridge University Press, pag. 133
A very “friendly” introduction to Vector and Tensor Analysis, understandable even without special prerequisites, nevertheless with a good completeness (up to the introduction of the Riemann tensor, but without going into curved spaces). The approach to tensors is traditional. Beautiful illustrations, many examples from physics, many calculations carried out in full, together with an exposition that gives the impression of proceeding methodically and safely, without jumps and without leaving dark spots behind, make this book an excellent tool for self-study, also accessible to a good high school student.
Spain, B. 2003, Tensor Calculus - A Concise Course, Dover Publications, pag. 125
A dense booklet, not so easy and quite old-style, but comprehensive and interesting for the clever treatment of some topics (including the deduction of the equation of the geodesic without resorting to the Euler-Lagrange equation, which we referred to in Appendix 4).
168
Carroll, S. M. 1997 Lecture Notes on General Relativity, University of California
Santa Barbara, pag. 231
Download: http://xxx.lanl.gov/PS_cache/gr-qc/pdf/9712/9712019v1.pdf
These are the original readings, in a even more conversational tone, from which the
text quoted above has been developed. All that is important is located here, too,
except for some advanced topics of GR.
169