
Intrinsic Gaussian Vector Fields on Manifolds

Daniel Robert-Nicoud Andreas Krause Viacheslav Borovitskiy

ETH Zürich, Switzerland


arXiv:2310.18824v2 [stat.ML] 31 Mar 2024

Abstract

Various applications ranging from robotics to climate science require modeling signals on non-Euclidean domains, such as the sphere. Gaussian process models on manifolds have recently been proposed for such tasks, in particular when uncertainty quantification is needed. In the manifold setting, vector-valued signals can behave very differently from scalar-valued ones, with much of the progress so far focused on modeling the latter. The former, however, are crucial for many applications, such as modeling wind speeds or force fields of unknown dynamical systems. In this paper, we propose novel Gaussian process models for vector-valued signals on manifolds that are intrinsically defined and account for the geometry of the space in consideration. We provide computational primitives needed to deploy the resulting Hodge–Matérn Gaussian vector fields on the two-dimensional sphere and the hypertori. Further, we highlight two generalization directions: discrete two-dimensional meshes and "ideal" manifolds like hyperspheres, Lie groups, and homogeneous spaces. Finally, we show that our Gaussian vector fields constitute considerably more refined inductive biases than the extrinsic fields proposed before.

1 INTRODUCTION

Gaussian processes (Rasmussen and Williams, 2006) are a widely used class of Bayesian models in machine learning. They are known to perform well in small data scenarios and to provide well-calibrated uncertainty. Their notable applications include modeling spatial data (Chilès and Delfiner, 2012) and automated decision-making, e.g., optimization (Noskova and Borovitskiy, 2023; Shields et al., 2021; Snoek et al., 2012), or sensor placement (Krause et al., 2008).

Gaussian processes can be scalar- or vector-valued (Alvarez et al., 2012). An important special case of the latter is Gaussian vector fields. These can, for example, be used to model velocities or accelerations, either as a target in itself or as a means for exploring an unknown dynamical system. When the input domain is Euclidean, vector fields are just vector-valued functions. However, when the domain is a submanifold of a Euclidean space, such as when modeling wind speeds or ocean currents on the surface of Earth, the situation can be quite different: geometry places additional constraints on vector fields that need to be accounted for. As illustrated in Figure 1, the values of a vector field ought to be tangential to the manifold, while those of a vector function can be arbitrary vectors.

Figure 1: Vector fields on manifolds are not just vector functions: for them, the vectors are always tangential. (a) Vector Function; (b) Vector Field.

In recent years, two different formalisms were proposed for defining Gaussian vector fields on manifolds. Lange-Hegermann (2018) approached the problem by considering linear constraints on vector-valued Gaussian processes, constraining them to lie in the tangent space of a submanifold of R^d.

Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024, Valencia, Spain. PMLR: Volume 238. Copyright 2024 by the author(s). Code: https://github.com/DanielRobertNicoud/imv-gps. Also, see https://github.com/GPflow/GeometricKernels. Correspondence to: daniel.robertnicoud@gmail.com and viacheslav.borovitskiy@gmail.com.
Additionally, one can impose further linear constraints on the resulting fields, such as making them divergence-free. Hutchinson et al. (2021)—which is closer in spirit to this work—instead considered projecting vector-valued Gaussian processes to a submanifold of R^d. While both these procedures can in principle produce any valid Gaussian vector field, they are fundamentally extrinsic, and we will show that the fields one gets in practice introduce undesirable inductive biases.

To remedy this, we propose a new approach: fully intrinsic Gaussian vector fields based on the Hodge Laplacian¹ that we name Hodge–Matérn Gaussian vector fields. For some simple manifolds, namely the two-dimensional sphere S² and the hypertori T^d, we develop computational techniques that allow effortless use of these intrinsic fields in downstream applications.

The aforementioned computational techniques hinge on knowing the eigenvalues and eigenfields of the Hodge Laplacian. To this end, we describe how to:

a) derive these from the eigenpairs of the Laplace–Beltrami operator when the manifold is two-dimensional, using automatic differentiation only;

b) compute these on product manifolds in terms of the eigenvalues and eigenfields on the factors; and

c) get these explicitly in the cases of the circle S¹, the hypertori T^d, and the sphere S².

We conjecture that (a) can also be used to define Gaussian vector fields on meshes, by changing the analytic notions into their appropriate discretizations. Furthermore, we conjecture that (c) can be done for more general parallelizable manifolds, e.g. Lie groups, which may then facilitate the generalization to the very general class of homogeneous spaces. The latter includes many manifolds of interest that are poorly amenable to discretization because of their higher dimension.

By showing how our intrinsic Hodge–Matérn Gaussian vector fields improve over their naïve extrinsic counterparts on the two-dimensional sphere, we hope to motivate further research: first, into the development of practical intrinsic Gaussian vector fields on other domains; second, into applying the proposed models in areas like climate/weather modeling and robotics.

¹The concurrent work by Peach et al. (2024) studies Gaussian vector fields induced by the connection Laplacian rather than the Hodge Laplacian (see Appendix A.7 on the difference between these two notions of Laplacian). Importantly, they consider a very different setting: a priori unknown manifolds that are estimated from finite data, showcasing an impressive neuroscience application.

1.1 Gaussian Processes

Let X be a set. A random function f on X is called a Gaussian process (GP) with mean µ : X → R and covariance (or kernel) k : X × X → R—denoted f ∼ GP(µ, k)—if for any finite set of points x in X we have f(x) ∼ N(µ(x), K_{xx}), where K_{••′} = k(•, •′). Without loss of generality, we usually assume µ(·) = 0. Assuming a GP prior f ∼ GP(0, k) and a Gaussian likelihood y | f ∼ N(y | f(x), σ_ε²) with a fixed noise variance σ_ε², the posterior f | y is a GP (Rasmussen and Williams, 2006) with mean and covariance

µ_{f|y}(·) = K_{·x} (K_{xx} + σ_ε² I)^{−1} y,   (1)

k_{f|y}(·, ·′) = k(·, ·′) − K_{·x} (K_{xx} + σ_ε² I)^{−1} K_{x·′}.   (2)

Here, the function µ_{f|y} can be used to draw predictions, and the function k_{f|y} is used to quantify uncertainty.

When X = R^d, Matérn Gaussian processes (Rasmussen and Williams, 2006; Stein, 1999) are most often used. Their respective kernels k_{ν,κ,σ²} form the three-parameter family of Matérn kernels, whose limiting case k_{∞,κ,σ²} for ν → ∞ is known as the heat (a.k.a. squared exponential, RBF, Gaussian, diffusion) kernel, which is arguably the most popular.

1.2 Gaussian Processes on Manifolds

Now consider X = M, where M is a compact Riemannian manifold. Throughout this paper, manifolds are always assumed to be connected. Using the SPDE-based characterization of Matérn Gaussian processes of Lindgren et al. (2011) and Whittle (1963), Borovitskiy et al. (2020) showed how to compute Matérn kernels on M in terms of the spectrum of the Laplace–Beltrami operator ∆:

k_{ν,κ,σ²}(x, x′) = (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) f_n(x) f_n(x′),   (3)

where {f_n}_{n=0}^∞ is an orthonormal basis of eigenfunctions of ∆ such that ∆f_n = −λ_n f_n,

Φ_{ν,κ}(λ) = (2ν/κ² + λ)^{−ν−d/2}   if ν < ∞,
Φ_{ν,κ}(λ) = e^{−(κ²/2)λ}   if ν = ∞,   (4)

d = dim(M), and C_{ν,κ} is a normalization constant that ensures (1/vol(M)) ∫_M k_{ν,κ,σ²}(x, x) dx = σ². There exist analytical (Azangulov et al., 2022; 2023) and numerical (Borovitskiy et al., 2020; Coveney et al., 2020) techniques for computing the eigenpairs λ_n, f_n, or for bypassing the computation thereof. In the end, a truncated series from Equation (3) yields tractable Gaussian processes that respect the intrinsic geometry of the manifold (Rosa et al., 2023), as illustrated in Figure 2.
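To make Equations (1)-(4) concrete, the following is a minimal illustrative sketch, in Python, of the truncated spectral Matérn kernel and GP regression on the simplest manifold, the circle S¹, where the Laplace–Beltrami eigenpairs are the Fourier basis. The truncation level, the helper names, and the toy data are editorial choices for illustration, not part of the paper's codebase.

```python
import numpy as np

def circle_eigenpairs(L):
    """First L Laplace-Beltrami eigenpairs on S^1 (Fourier basis):
    eigenvalue m^2 with cos/sin eigenfunctions, orthonormal in L2(S^1)."""
    funcs = [lambda x: np.full_like(x, 1 / np.sqrt(2 * np.pi))]
    lams, m = [0.0], 1
    while len(funcs) < L:
        funcs += [lambda x, m=m: np.cos(m * x) / np.sqrt(np.pi),
                  lambda x, m=m: np.sin(m * x) / np.sqrt(np.pi)]
        lams += [float(m**2)] * 2
        m += 1
    return funcs[:L], np.array(lams[:L])

def matern_kernel(x, y, nu=1.5, kappa=1.0, sigma2=1.0, L=51):
    """Truncated Equation (3) with Phi from Equation (4), d = dim(S^1) = 1.
    With odd L the cos/sin pairs are complete, so k(x, x) is constant and
    dividing by phi.sum() / (2 pi) normalizes the variance to sigma2."""
    funcs, lams = circle_eigenpairs(L)
    phi = (2 * nu / kappa**2 + lams) ** (-nu - 1 / 2)
    Fx = np.stack([f(np.asarray(x)) for f in funcs])  # (L, |x|)
    Fy = np.stack([f(np.asarray(y)) for f in funcs])  # (L, |y|)
    K = (phi[:, None, None] * Fx[:, :, None] * Fy[:, None, :]).sum(0)
    return sigma2 * K / (phi.sum() / (2 * np.pi))

# GP regression, Equations (1)-(2), on noisy observations of a toy function.
x_train = np.array([0.3, 1.1, 2.5, 4.0]); y_train = np.sin(2 * x_train)
x_test = np.linspace(0, 2 * np.pi, 200)
Kxx = matern_kernel(x_train, x_train) + 1e-2 * np.eye(len(x_train))
Ksx = matern_kernel(x_test, x_train)
post_mean = Ksx @ np.linalg.solve(Kxx, y_train)
post_cov = matern_kernel(x_test, x_test) - Ksx @ np.linalg.solve(Kxx, Ksx.T)
```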
Figure 2: Comparing an intrinsic Matérn kernel (ν = ∞) of Equation (3) to an extrinsic one, the restriction of a Euclidean Matérn kernel to the manifold. (a) Intrinsic kernel; (b) Extrinsic kernel. Note that the latter induces high correlation between points across the minor axis of the ellipse, despite them being far from each other in terms of the intrinsic distance.

Importantly, same as in the Euclidean case, we have k_{∞,κ,σ²}(x, x′) ∝ P(κ²/2, x, x′) (Azangulov et al., 2022), where P(t, x, x′) is the heat kernel: the solution of

∂P/∂t = ∆_x P,   lim_{t→0} P(t, x, x′) = δ(x − x′),   (5)

where ∆_x denotes the Laplace–Beltrami operator acting on the variable x and δ is the Dirac delta function; convergence takes place in the sense of distributions.

1.3 Gaussian Vector Fields on Manifolds

Following Hutchinson et al. (2021), we introduce the notions of Gaussian vector fields and their kernels.

Definition 1. A random vector field f is a function mapping each x ∈ M to a random variable f(x) with values in the tangent space T_x M of M at x. It is Gaussian if (f(x_1), ..., f(x_n)) ∈ T_{x_1} M ⊕ ... ⊕ T_{x_n} M is jointly Gaussian for all x_1, ..., x_n ∈ M and n ∈ N.

When M = R^d, this reduces to vector-valued GPs with output dimension equal to the input dimension d, e.g. a concatenation of d independent scalar-valued GPs. If instead M ⊆ R^d is an embedded manifold, concatenating d scalar-valued GPs does not lead to a vector field: in this case, f(x) ∈ R^d rather than f(x) ∈ T_x M, as illustrated by Figure 1.

If g is a vector-valued GP on R^d, then its values g(x) are Gaussian vectors. Thus, the kernel of g is a matrix-valued function k(x, x′) = Cov(g(x), g(x′)). Alternatively, it can be defined as a scalar-valued function k((x, u), (x′, v)) = Cov(u^⊤ g(x), v^⊤ g(x′)), where x, x′, u, v ∈ R^d. Reinterpreting the inner product a^⊤ b as the linear functional • ↦ a^⊤ • applied to the vector b leads to a particularly elegant generalization. For a Gaussian vector field f on a manifold M we put

k(x, x′)(u, v) = k((x, u), (x′, v))   (6)
              = Cov(u(f(x)), v(f(x′))),   (7)

where x, x′ ∈ M, u ∈ T_x^* M, v ∈ T_{x′}^* M, and T_x^* M denotes the cotangent space of covectors, i.e. linear functionals on T_x M. Together with a deterministic mean vector field µ, such a kernel k determines the distribution of f in a coordinate-free way, justifying the use of the standard notation f ∼ GP(µ, k).

In order to define Gaussian vector fields that can be used in practice, Hutchinson et al. (2021) introduced the notion of projected Gaussian processes. These are constructed by picking an isometric embedding ϕ : M → R^D into some Euclidean space.² Then, a tangent space T_x M can be identified with a subspace of T_{ϕ(x)} R^D ≅ R^D. Thus, there exists a projection P_x from R^D to this subspace, and f(x) = P_x g(x) defines a valid Gaussian vector field for any vector-valued GP g.

²Such an embedding always exists by the Nash embedding theorem; it is not unique.

2 CHALLENGES

The class of projected GPs we reviewed in Section 1.3 is very large. In fact, any Gaussian vector field f can be represented as a projected GP, simply because f = P_x g with g = f, as proved in Hutchinson et al. (2021). However, in order to obtain f using this trick we need to know it in the first place.

To avoid this "chicken and egg" problem, we should construct g using the tools we already possess. Practically, this means stacking together scalar-valued GPs. The first challenge is to find scalar GPs that respect the geometry of the manifold. It can be solved by using the manifold Matérn GPs reviewed in Section 1.2. To obtain an expressive family, we can define g = Ah, where h_j ∼ GP(0, k_{ν,κ_j,σ²}) are independent and A is an arbitrary matrix. In fact, this is exactly what Hutchinson et al. (2021) propose to do in practice. However, as can be seen in Figure 3, this construction can produce undesirable artifacts. Upon closer examination, the reasons for this can be formalized.

Proposition 2. With notation as above, if rk A > 1, there are x, x′ ∈ S² with ∠(x, x′) = 90° but

∥Cov(f(x), f(x′))∥ < ∥Cov(f(x), f(x̃))∥   as κ_j → ∞,   (8)

where x̃ = −x is antipodal to x (i.e. ∠(x, x̃) = 180°) and ∥•∥ denotes the Frobenius norm.

For a proof, see Appendix B.1. Proposition 2 implies that, for large length scales κ, the covariance of a projected GP is non-monotonic in the intrinsic distance on the sphere—usually an undesirable trait.

Figure 3: GP regression from a single observation (the red vector) for a very large length scale κ. (a) Projected Matérn; (b) Projected Matérn, rotated; (c) Hodge–Matérn; (d) Hodge–Matérn, rotated. Black vectors represent the prediction µ_{f|y}(·); the background color shows the uncertainty k_{f|y}(·, ·): yellow for high, blue for low. On (a) and (b) we use a projected Matérn GP of Hutchinson et al. (2021); on (c) and (d) we use our Hodge–Matérn Gaussian vector field, with ν = ∞ in both cases. Unnaturally, uncertainty in (a) and (b) is non-monotonous with respect to the distance on the sphere: it is considerably lower at the antipode than halfway to it.

Another issue is that the projected GP construction does not provide a way to force divergence-free or curl-free inductive biases. These can be quite important for certain types of vector field data (Berlinghieri et al., 2023), as we will clearly observe in Section 5.
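For concreteness, here is a small sketch of the projected construction on the unit sphere embedded in R³, where the projection is P_x = I − xx^⊤. As a stand-in for g = Ah, it samples three independent scalar GPs with a squared exponential kernel in the ambient space; the helper names and the toy setup are illustrative assumptions, not the exact pipeline of Hutchinson et al. (2021).

```python
import numpy as np

rng = np.random.default_rng(0)

def tangent_projection(x):
    """P_x = I - x x^T projects ambient vectors onto T_x S^2 for unit x."""
    return np.eye(3) - np.outer(x, x)

def ambient_gp_sample(points, kappa=0.5):
    """Stand-in for the ambient GP g = A h: three independent scalar GPs with
    a squared exponential kernel (i.e. A = I), sampled jointly at `points`."""
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq_dists / (2 * kappa**2)) + 1e-10 * np.eye(len(points))
    return np.linalg.cholesky(K) @ rng.standard_normal((len(points), 3))

points = rng.standard_normal((200, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)
g = ambient_gp_sample(points)                     # values in R^3, not tangent
f = np.stack([tangent_projection(x) @ gx for x, gx in zip(points, g)])
assert np.allclose((f * points).sum(-1), 0.0, atol=1e-8)  # f(x) in T_x S^2
```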
To overcome these challenges, we introduce a fully intrinsic class of Gaussian vector fields on manifolds.

3 INTRINSIC GAUSSIAN VECTOR FIELDS

In this section, we present the main ideas behind the construction of the intrinsic Gaussian vector fields we propose. The mathematical formalism for this section is detailed in Appendix A.

3.1 The Hodge Heat Kernel

We start by generalizing the heat kernel, i.e. the Matérn kernel with ν = ∞.

Let M be a compact, oriented Riemannian manifold. Then, Hodge theory—see e.g. Rosenberg (1997) for an approachable introduction—defines a generalization ∆ of the Laplace–Beltrami operator that acts on vector fields on M instead of scalar functions, called the Hodge Laplacian. We consider the associated heat equation

∂u/∂t (t, x) = ∆_x u(t, x),   (9)

where u is smooth in both variables and u(t, x) ∈ T_x M. This equation admits a fundamental solution 𝒫: the Hodge (heat) kernel, which for any choice of t ∈ R_{>0} and x, x′ ∈ M gives a function

𝒫(t, x, x′) : T_x^* M ⊗ T_{x′}^* M → R.   (10)

Considering t as a hyperparameter, we obtain a function 𝒫_t(x, x′) with the exact signature that a kernel of a Gaussian vector field should possess, as per Section 1.3. We prove the following.

Theorem 3. For any t > 0 there exists a Gaussian vector field whose kernel is 𝒫_t.

Having found a suitable adaptation of the heat kernel to the vector field case, we turn to making it explicit. Similarly to the scalar case, a Hilbert space L²(M; TM) of square-integrable vector fields can be defined. There is always an orthonormal basis {s_n}_{n=0}^∞ of L²(M; TM) such that ∆s_n = −λ_n s_n, i.e. the s_n are eigenfields of the Hodge Laplacian. Moreover, we have λ_n ≥ 0 for all n ≥ 0. The kernel 𝒫_t can be computed in terms of the eigenfields, just like its scalar counterpart (cf. (3)). Namely, as discussed in Appendix A.4, we have

𝒫_t(x, x′) = Σ_{n=0}^∞ e^{−tλ_n} s_n(x) ⊗ s_n(x′).   (11)

Here, the notation means

(s_n(x) ⊗ s_n(x′))(u, v) = u(s_n(x)) v(s_n(x′))   (12)

for u ∈ T_x^* M and v ∈ T_{x′}^* M.

3.2 Hodge–Matérn Kernels

Following Azangulov et al. (2022, 2023), we define Matérn kernels k_{ν,κ,σ²} as integrals of the heat kernel:

k_{ν,κ,σ²}(x, x′) = (σ²/C_{ν,κ}) ∫_0^∞ t^{ν−1+d/2} e^{−(2ν/κ²) t} 𝒫_t(x, x′) dt.   (13)

Fubini's theorem then readily implies the key formula

k_{ν,κ,σ²}(x, x′) = (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) s_n(x) ⊗ s_n(x′),   (14)

where Φ_{ν,κ} is as in (4) and C_{ν,κ} is a normalizing constant that ensures (1/vol(M)) ∫_M tr(k_{ν,κ,σ²}(x, x)) dx = σ². Same as for the Hodge heat kernel, these Hodge–Matérn kernels determine Gaussian vector fields.

Theorem 4. For any ν, κ, σ² > 0 there exists a Gaussian vector field whose kernel is k_{ν,κ,σ²}.
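Given tabulated eigenfield evaluations, the truncated Equation (14) is a finite weighted sum of outer products. Below is a sketch of how one might assemble it in coordinates; the array layout, the placeholder random "eigenfields" in the smoke test, and the crude normalization (the true C_{ν,κ} requires the trace integral above) are editorial assumptions.

```python
import numpy as np

def phi(lam, nu, kappa, d):
    """Equation (4) for finite nu."""
    return (2 * nu / kappa**2 + lam) ** (-nu - d / 2)

def hodge_matern_kernel(S_x, S_y, lams, nu=1.5, kappa=1.0, sigma2=1.0, d=2):
    """Truncated Equation (14). S_x: (L, N, d) eigenfields s_n evaluated at N
    points, in coordinates; S_y: (L, M, d) the same at M points; lams: (L,)
    eigenvalues. Returns K of shape (N, M, d, d), K[i, j] = k(x_i, y_j)."""
    w = phi(lams, nu, kappa, d)                       # (L,)
    K = np.einsum("l,lna,lmb->nmab", w, S_x, S_y)     # sum_n w_n s_n (x) s_n
    return sigma2 * K / w.sum()                       # crude normalization

# Smoke test with random placeholder "eigenfields" (not true eigenfields):
# the resulting block matrix must be symmetric positive semi-definite.
rng = np.random.default_rng(0)
L_trunc, N, d = 10, 5, 2
S = rng.standard_normal((L_trunc, N, d))
lams = np.sort(rng.uniform(0, 10, L_trunc))
K = hodge_matern_kernel(S, S, lams)
Kmat = K.transpose(0, 2, 1, 3).reshape(N * d, N * d)
assert np.allclose(Kmat, Kmat.T)
assert np.linalg.eigvalsh(Kmat).min() >= -1e-8
```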
In practice, the series in Equation (11) should be truncated, with only a few terms corresponding to the smallest eigenvalues λ_n used to approximately compute the kernel, just like in the scalar case. Notice that the functions Φ_{ν,κ}(λ) are all decreasing in λ, so that the most significant terms are the ones corresponding to the smallest eigenvalues.

3.3 Divergence-Free and Curl-Free Kernels

The celebrated Helmholtz decomposition (also known as the fundamental theorem of vector calculus) states that any vector field in R^d decomposes into the sum of its divergence-free and curl-free parts. The former, intuitively, has no sinks and sources; the latter has no vortices. Many vector fields in physics are known to have only one of these parts. This suggests that divergence-free and curl-free Gaussian vector fields can be useful inductive biases (Berlinghieri et al., 2023).

For manifolds, the analog of the Helmholtz decomposition is the Hodge decomposition, see e.g. Rosenberg (1997, Theorem 1.37). It states that any vector field u on M can be represented as a sum of three fields:

u = u₁ + u₂ + u₃,   (15)

where u₁ = ∇f₁ for some function f₁, and thus is pure divergence—meaning that div u₁ ≠ 0 and curl u₁ = 0—and in particular curl-free; u₂ = ⋆∇f₂, and thus is pure curl and divergence-free; and u₃ is a harmonic form, ∆u₃ = 0, both curl- and divergence-free. The symbol ⋆ denotes the Hodge star operator, which we recall in Appendix A.1. For intuition on divergence and curl, see Appendix A.2.

Importantly, the orthonormal basis of eigenfields {s_n}_{n=0}^∞ may be chosen in such a way that each s_n is in exactly one of the three classes above. Let I_div, I_curl, and I_harm denote the index sets of the respective classes of eigenfields. Using a single class, we can define versions of Matérn Gaussian vector fields on the manifold M with the associated inductive bias.

Theorem 5. There exists a Gaussian vector field f^• — where • ∈ {div, curl, harm} — with kernel

k^•_{ν,κ,σ²}(x, x′) = (σ²/C_{ν,κ}) Σ_{n∈I_•} Φ_{ν,κ}(λ_n) s_n(x) ⊗ s_n(x′).   (16)

What is more, div f^curl = 0, curl f^div = 0, and ∆f^harm = 0 almost surely, as long as f^• is smooth enough that div and curl are well-defined.

3.4 Hodge-compositional Matérn Kernels

Combining the pure divergence, pure curl, and harmonic kernels, each with a separate set of hyperparameters, gives a more flexible family of kernels:

σ₁² k^div_{ν,κ₁,1} + σ₂² k^curl_{ν,κ₂,1} + σ₃² k^harm_{ν,κ₃,1}.   (17)

By analogy with the concurrent paper by Yang et al. (2024), we call this family Hodge-compositional Matérn kernels. Unless prior knowledge suggests a more specialized choice, this is the family we recommend for practical applications, inferring all the hyperparameters κ_i, σ_i from data. Our experimental results in Section 5 support this recommendation.

3.5 Gaussian Process Regression

All kernels defined above fall under the umbrella framework of Gaussian vector fields described in Hutchinson et al. (2021). In practice, there are two ways to perform Gaussian process regression in this setting. The first one—if the manifold is embedded in R^D—is to treat Gaussian vector fields as special cases of Gaussian vector functions in R^D. The second is to introduce a frame, i.e. a coordinate choice (not necessarily smooth) in all of the tangent spaces, and describe all quantities in these coordinates. Depending on whether an embedding or a frame is available, either can be used. Both are merely modes of computation, not affecting the inductive biases of the Gaussian vector fields and not introducing any error per se.

3.6 Kernel Evaluation and Sampling

As mentioned at the end of Section 3.2, given that eigenfields and eigenvalues are known, we can approximately evaluate the kernels by truncating the series in Equation (14). Such a truncation is a well-defined kernel, i.e. it corresponds to a Gaussian vector field.

Proposition 6. The Gaussian vector field

f(x) = (σ/√C_{ν,κ}) Σ_{n=0}^L w_n √(Φ_{ν,κ}(λ_n)) s_n(x),   (18)

where w_n ∼ N(0, 1) i.i.d., corresponds to the kernel given by the truncation of Equation (14), with the sum Σ_{n=0}^∞ therein substituted by the sum Σ_{n=0}^L.

Importantly, Equation (18) allows one to approximately sample Hodge–Matérn Gaussian vector fields in an extremely computationally efficient way, by simply drawing random w_n ∼ N(0, 1). Efficient sampling of their respective posteriors can be performed using pathwise conditioning for Gaussian vector fields, as described in Hutchinson et al. (2021).

Remark 7. Of course, direct analogs of Equation (18) also hold for the kernels of Theorem 5.
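In code, Equation (18) is one line of linear algebra once the eigenfields are tabulated. A sketch under the same (assumed) array layout as before; restricting the index set as in Remark 7 yields the pure divergence, pure curl, or harmonic samples of Theorem 5.

```python
import numpy as np

def sample_hodge_matern(S, lams, nu=1.5, kappa=1.0, sigma2=1.0, d=2, rng=None):
    """Equation (18): S is (L, N, d), eigenfields s_n evaluated at N points in
    coordinates; returns one (N, d) prior sample. The normalization is left
    crude, as in the kernel sketch above. Passing only the rows of S whose
    indices lie in I_div (or I_curl) gives pure divergence (or curl) samples."""
    rng = rng or np.random.default_rng()
    spec = (2 * nu / kappa**2 + lams) ** (-nu - d / 2)   # Phi of Equation (4)
    w = rng.standard_normal(len(lams)) * np.sqrt(spec)   # w_n sqrt(Phi(lam_n))
    return np.sqrt(sigma2 / spec.sum()) * np.einsum("l,lnd->nd", w, S)
```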
Figure 4: Eigenfunctions on the sphere S² (represented by color) and the respective eigenfields. (a) Eigenfield (1/√λ₃)∇Y_{3,2}; (b) Eigenfield (1/√λ₃)⋆∇Y_{3,2}; (c) Eigenfield (1/√λ₇)∇Y_{7,3}; (d) Eigenfield (1/√λ₇)⋆∇Y_{7,3}.

In summary, (approximate) sampling, kernel evaluation, and differentiation reduce to knowing the eigenvalues and eigenfields of the Hodge Laplacian on M. Thus, we proceed to discuss how to obtain those in practice.

4 EXPLICIT EIGEN-VALUES AND -FIELDS

The above allows defining intrinsic kernels on general compact oriented Riemannian manifolds. However, actually computing these kernels requires solving for the eigenfields and eigenvalues of the Hodge Laplacian. Luckily, in some important cases this turns out to be tractable. We present these cases in this section.

4.1 Surfaces and the Sphere

The main case of interest here is the sphere S². However, we start by considering the more general case of manifolds of dimension 2 (surfaces). We explain how to obtain the eigenfields and eigenvalues granted that their scalar counterparts and a basis of harmonic vector fields are known.

Surfaces Suppose M is a compact, oriented Riemannian surface. We consider two intrinsic operators for this case. The first is the gradient of a scalar function, giving us a vector field. The second is the Hodge star operator ⋆ acting on vector fields, which in the case of surfaces is just a 90° rotation of the vector field in the positive direction, as shown in Appendix A.2.

Suppose we know all eigenfunctions {f_n}_{n≥0} and their respective eigenvalues 0 = λ₀ < λ₁ ≤ λ₂ ≤ ··· of the Laplace–Beltrami operator on M. Further, assume that we have an orthonormal basis {g_j}_{0≤j≤J} of the 0-eigenspace of the Hodge Laplacian.³

³By the Hodge decomposition theorem, these form a basis of the first de Rham cohomology group (a real vector space) of M, which is finite dimensional since M is compact by assumption.

Theorem 8. For each n ≥ 1, both ∇f_n and ⋆∇f_n are eigenfields of the Hodge Laplacian, and the set

{ ∇f_n/√λ_n, ⋆∇f_n/√λ_n, g_j : n ≥ 1 and 0 ≤ j ≤ J }   (19)

forms an orthonormal basis of L²(M; TM).

All of these operators can easily be computed numerically, e.g. via automatic differentiation, which makes pointwise evaluation and differentiation of the kernels an easy endeavor with modern computing systems.

The Sphere It is well known that the eigenfunctions of the Laplace–Beltrami operator on the sphere S² are given by the spherical harmonics Y_{ℓ,m}, for ℓ ≥ 0 and −ℓ ≤ m ≤ ℓ, with eigenvalues λ_ℓ = λ_{ℓ,m} = ℓ(ℓ + 1). Additionally, the 0-eigenspace of the Hodge Laplacian on the sphere is trivial, see Appendix A.3. This, together with Theorem 8, allows us to compute the eigenfields. We visualize some of them in Figure 4.

The approximation of the full kernel can be made even more efficient via the addition theorem, e.g. De Vito et al. (2021, Section 7.3), which states that

Σ_{−ℓ≤m≤ℓ} Y_{ℓ,m}(x) Y_{ℓ,m}(x′) = ((2ℓ + 1)/4π) P_ℓ(x · x′),   (20)

where P_ℓ is the ℓ-th Legendre polynomial and the scalar product is taken in R³ after embedding S² as the standard unit sphere. This reduces the computations to a single simple function for each eigenvalue.
As a result, we have the following.

Proposition 9. Writing

P̃_{ℓ,ν,κ}(z) = ((2ℓ + 1)/(4πλ_ℓ)) Φ_{ν,κ}(λ_ℓ) P_ℓ(z),   (21)

the pure divergence and pure curl Hodge–Matérn kernels on S² are given by

k^div_{ν,κ,σ²}(x, x′) = (σ²/C^div_{ν,κ}) Σ_{ℓ=1}^∞ (∇_x ⊗ ∇_{x′}) P̃_{ℓ,ν,κ}(x · x′),

k^curl_{ν,κ,σ²}(x, x′) = (σ²/C^curl_{ν,κ}) Σ_{ℓ=1}^∞ (⋆∇_x ⊗ ⋆∇_{x′}) P̃_{ℓ,ν,κ}(x · x′).

The full Hodge–Matérn kernel is the mean of the two:

k_{ν,κ,σ²} = (1/2) (k^div_{ν,κ,σ²} + k^curl_{ν,κ,σ²}).   (22)

Combining this with Proposition 6, we can sample Hodge–Matérn Gaussian vector fields. Thanks to Remark 7, we can also sample their pure divergence and pure curl counterparts, which we illustrate in Figure 5.

Figure 5: Intrinsic Gaussian vector field samples on S². (a) Divergence-free field f^curl; (b) Curl-free field f^div.

4.2 Product Manifolds and Hypertori

Another tractable setting is product manifolds: if we know the eigenfields of the Hodge Laplacian for the factors, we can construct those of the product. We give an overview of this in general before deriving the spectrum of the circle from the scalar case and using it to resolve the case of the hypertori.

Product Manifolds Let M, N be two compact, oriented Riemannian manifolds with scalar manifold heat kernels P^M, P^N and Hodge heat kernels 𝒫^M, 𝒫^N. The vector kernel on the product M × N is given by

𝒫^{M×N}_t(x, x′) = P^M_t(x₁, x′₁) 𝒫^N_t(x₂, x′₂) + P^N_t(x₂, x′₂) 𝒫^M_t(x₁, x′₁)   (23)

for x = (x₁, x₂) and x′ = (x′₁, x′₂), with x₁, x′₁ ∈ M and x₂, x′₂ ∈ N. The details are explained in Appendix A.6.

Knowing the Hodge heat kernels, the other Hodge–Matérn kernels can be derived using Equation (13).

Circle The circle S¹ is the only⁴ compact Riemannian manifold of dimension 1. The Hodge star operator in this case gives an identification of scalar functions and vector fields on S¹: there is a canonical global vector field v such that ∥v(x)∥ ≡ 1—in fact, there are exactly two of them, and a choice of orientation selects one—and a function f(x) on S¹ is identified with f(x)v(x). The Hodge Laplacian reduces to the Laplace–Beltrami operator:

∆(f(x)v(x)) = ∆(f(x))v(x).   (24)

Thus, the spectrum of the Hodge Laplacian is the same as the spectrum of the Laplace–Beltrami operator under this identification. Similarly, the vector kernels coincide with their scalar counterparts, for which explicit formulas can be found in Borovitskiy et al. (2020).

⁴To be precise: any 1-dimensional compact Riemannian manifold is isometric to a circle of the same length.

Hypertori The d-dimensional (flat⁵) hypertorus is defined as the product of d circles, T^d := S¹ × ··· × S¹. It has a global basis of tangent vector fields by the canonical identification T T^d ≅ T S¹ ⊗ ··· ⊗ T S¹, so that we can write tangent fields on T^d as a vector of vector fields on S¹. Putting together what was said above, we obtain the following, cf. Appendix A.6.

⁵Not to be confused with the torus defined as a "donut" in R³, which has a different intrinsic geometry.

Proposition 10. The eigenfields on T^d are

(Π_{i=1}^d f_{n_i}(x_i)) e_j ∈ T S¹ ⊗ ··· ⊗ T S¹ ≅ T T^d,   (25)

for 1 ≤ j ≤ d, with eigenvalue Σ_{i=1}^d λ_{n_i}, where x_i ∈ S¹ and (f_n, λ_n) are eigenpairs of the Laplace–Beltrami operator on S¹. In particular, with k^{T^d} the scalar Matérn kernel on T^d, the vector Hodge–Matérn kernel is

k_{ν,κ,σ²}(x, x′) = (1/d) k^{T^d}_{ν,κ,σ²}(x, x′) I_d.   (26)
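As a sketch of Equation (26) on T², one can evaluate the scalar Matérn kernel by a truncated eigenexpansion (the cosine form below follows from combining the cos/sin pairs on each circle) and multiply by I_d/d. The truncation level and helper names are editorial assumptions.

```python
import numpy as np

def scalar_matern_torus(x, y, nu=1.5, kappa=1.0, sigma2=1.0, M=20):
    """Truncated spectral Matérn kernel on T^2 = S^1 x S^1, Equation (3) with
    product eigenvalues m1^2 + m2^2 (cf. Proposition 10); x, y: angle pairs."""
    d = 2
    diff = np.asarray(x) - np.asarray(y)
    m1, m2 = np.meshgrid(np.arange(-M, M + 1), np.arange(-M, M + 1))
    w = (2 * nu / kappa**2 + m1**2 + m2**2) ** (-nu - d / 2)
    # On this homogeneous space k(x, x) is constant, so this is exact:
    return sigma2 * (w * np.cos(m1 * diff[0] + m2 * diff[1])).sum() / w.sum()

def vector_hodge_matern_torus(x, y, **kwargs):
    """Equation (26): the vector kernel is the scaled scalar kernel times I_d."""
    return scalar_matern_torus(x, y, **kwargs) / 2 * np.eye(2)

K = vector_hodge_matern_torus([0.0, 0.0], [1.0, 2.0], nu=0.5)
```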
4.3 Possible Extensions

We propose two prospective directions in which our results could be extended.

Meshes Neither Hodge–Matérn kernels nor the eigenfields can be analytically computed on a general two-dimensional manifold. This is true even for their scalar counterparts (Borovitskiy et al., 2020). However, we expect that Hodge–Matérn kernels can be numerically approximated on surfaces discretized into meshes. To do this, one needs to apply suitable discrete counterparts of ⋆ and ∇ to the numerically approximated scalar eigenfunctions of the Laplace–Beltrami operator, and also take care of the harmonic forms.

Lie Groups and Related Manifolds It is possible to obtain the scalar manifold Matérn kernels on homogeneous spaces via the representation theory of their symmetry groups (Azangulov et al., 2022). We conjecture that the vector case can be treated similarly—in particular, in view of the work of Ikeda and Taniguchi (1978). This could result in explicit formulas for eigenfields and eigenvalues on homogeneous spaces, given in terms of algebraic quantities only.

5 EXPERIMENTS

We complement the theoretical motivations for the use of intrinsic kernels on manifolds with a practical experiment on weather data from the ERA5 dataset (Hersbach et al., 2023). Further experiments on synthetically generated data are available in Appendix C.

5.1 Setup

The dataset comes from the fifth generation ECMWF atmospheric reanalysis of the global climate (ERA5) (Hersbach et al., 2023). We took the monthly averaged reanalysis data for wind (u- and v-components) at the fixed 500hPa pressure level (corresponding to approximately 5.5km altitude) from January to December 2010. As the wind at the 500hPa pressure level at mid and high latitudes is approximately geostrophic, and therefore approximately divergence-free (Holton, 2004), we expect the family of divergence-free (i.e. pure curl) Hodge–Matérn kernels to provide good results.

The training and testing points are distributed along a great circle, as displayed in Figure 6b. An experiment was run for each month of data, totaling 12 experiments for each kernel.

Figure 6: Interpolation of wind speed on the surface of Earth. (a) Projected Matérn (ν = 1/2); (b) Ground truth (January 2010); (c) Divergence-free Hodge–Matérn (ν = 1/2). The observations are the red vectors along a meridian. Figures (a) and (c) report the predictive mean (black vectors) and uncertainty (color: yellow is high, blue is low). Note that in figure (a) sinks and sources are present, while the inductive bias of (c) prohibits that. We advise the reader to examine the global Figures 8 and 9, located in the appendix because of space limitations.

We applied GP regression using pure noise, projected Matérn (P. M.), Hodge–Matérn (H.–M.), and divergence-free Hodge–Matérn (div-free H.–M.) kernels, all with ν = 1/2, ∞. The hyperparameters are the noise variance for the first kernel, and the length scale κ, the variance σ², and the noise variance σ_ε² for the others. They were optimized by maximizing the marginal log-likelihood. After visual inspection of the ground truth, we selected the best performing kernel—the divergence-free Hodge–Matérn kernel with ν = 1/2—and the projected Matérn-1/2 kernel, and ran four further experiments by fixing their length scales to the high values κ = 0.5 and 1. Importantly, we also experimented with the Hodge-compositional Matérn (H.-C. M.) kernels of Section 3.4 to understand if these could detect the correct inductive bias.⁶

⁶On the sphere S², these are linear combinations of pure-divergence and pure-curl Hodge–Matérn kernels, without the harmonic part, which vanishes on S².

Figure 7: Posterior samples of the models featured in Figure 6: (a) Projected; (b) Hodge. Note the different scaling of vectors.

We report the mean and standard deviation of the mean squared error (MSE) of the predicted mean and of the predictive negative log-likelihood (PNLL) of the ground truth against the predictive distribution, where each test point is considered independently of the others.
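For clarity, the two scores can be computed as follows, with each test point evaluated under its d-dimensional Gaussian marginal as described above; the function and argument names are editorial, not the paper's code.

```python
import numpy as np

def mse(mean_pred, y_true):
    """Mean over test points of the squared error of the predictive mean.
    mean_pred, y_true: (N, d) tangent vectors in coordinates."""
    return np.mean(np.sum((mean_pred - y_true) ** 2, axis=-1))

def pnll(mean_pred, cov_pred, y_true):
    """Average Gaussian negative log-likelihood over independent test points.
    cov_pred: (N, d, d) pointwise predictive covariances (including noise)."""
    d = y_true.shape[-1]
    resid = y_true - mean_pred
    sol = np.linalg.solve(cov_pred, resid[..., None])[..., 0]
    quad = np.sum(resid * sol, axis=-1)
    _, logdet = np.linalg.slogdet(cov_pred)
    return np.mean(0.5 * (quad + logdet + d * np.log(2 * np.pi)))
```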
Table 1: Mean squared error and predictive negative log-likelihood in the ERA5 wind data experiment.

Kernel               | MSE Mean | MSE Std | PNLL Mean | PNLL Std
Pure noise           | 2.07     | 0.18    | 2.89      | 0.10
P. M.–1/2            | 1.39     | 0.15    | 2.33      | 0.11
P. M.–∞              | 1.53     | 0.20    | 2.43      | 0.15
H.–M.–1/2            | 1.67     | 0.16    | 2.58      | 0.12
H.–M.–∞              | 1.76     | 0.17    | 2.63      | 0.14
div-free H.–M.–1/2   | 1.10     | 0.12    | 2.16      | 0.13
div-free H.–M.–∞     | 1.34     | 0.20    | 2.33      | 0.18
H.-C. M.–1/2         | 1.09     | 0.10    | 2.16      | 0.12
H.-C. M.–∞           | 1.34     | 0.20    | 2.33      | 0.18

Table 2: Mean squared error and predictive negative log-likelihood in the ERA5 wind data experiment for divergence-free Hodge–Matérn and projected Matérn kernels with ν = 1/2 and fixed length scale κ.

Kernel               | MSE Mean | MSE Std | PNLL Mean | PNLL Std
div-free H.–M.–1/2   | 1.10     | 0.12    | 2.16      | 0.13
div-free H.–M.–∞     | 1.34     | 0.20    | 2.33      | 0.18
H.–M.–1/2, κ = 0.5   | 1.06     | 0.10    | 2.15      | 0.13
H.–M.–1/2, κ = 1.0   | 1.05     | 0.10    | 2.17      | 0.13
P. M.–1/2, κ = 0.5   | 1.36     | 0.13    | 2.32      | 0.10
P. M.–1/2, κ = 1.0   | 1.36     | 0.13    | 2.34      | 0.10

5.2 Performance

Results are displayed in Tables 1 and 2. The data was scaled so that the training observations have unit mean norm, cf. Appendix C. Baselines are provided by fitting a pure-noise kernel to the data and the projected kernels of Hutchinson et al. (2021). The best scores by far were obtained by fitting either of the divergence-free Hodge–Matérn or Hodge-compositional Matérn kernels with ν = 1/2, which outperformed the other kernels we considered on all metrics. Next to them were the divergence-free Hodge–Matérn kernel with ν = ∞ and the projected Matérn kernel with ν = 1/2.

Two factors were of fundamental importance in obtaining good results: having a divergence-free inductive bias and allowing for lower degrees of smoothness by setting ν = 1/2. In particular, considering the divergence of the various kernels explains why the projected kernels tend to do better than the full Hodge–Matérn kernels in this situation: they have lower absolute divergence in expectation, cf. Appendix B.3, Figure 10.

The additional experiments where length scales were fixed to high values are reported in Table 2. We notice further test score improvements, but not significant ones. Figures 6 and 7 display the posterior means, standard deviations, and samples from GP regressions fitted using projected Matérn and divergence-free Hodge–Matérn kernels, both with fixed κ = 1.

Analysis of the fitted hyperparameters of Hodge-compositional Matérn kernels shows that these were able to automatically detect the correct inductive bias. Specifically, the resulting Hodge-compositional kernel was virtually divergence-free, which enabled it to reach the same scores as the actual divergence-free Hodge–Matérn kernels. See Appendix C for more details.

6 CONCLUSION

In this work, we introduced a novel class of models—the Hodge–Matérn vector fields—for learning tangential vector fields on manifolds. These principled models solve shortcomings present in the preexisting literature, providing improved inductive biases. We described the computational techniques—kernel evaluation and differentiation, sampling—required for running Gaussian process regression and for downstream applications on important manifolds such as the two-dimensional sphere S² and the hypertori T^d, and indicated further extension directions. We applied our methods both to synthetic and real-world data on S², demonstrating that they can lead to improved performance. We hope our results will inspire new extensions and applications in areas like climate modeling and robotics.

ACKNOWLEDGMENTS

We are grateful to Gergana Gyuleva for her help with retrieving meaningful weather data for our experiments. We thank Maosheng Yang and Dr. So Takao for their insightful input on drafts of the paper. We also thank Dr. Alexander Terenin for his guidance in using the Blender rendering scripts from his Ph.D. thesis repository.⁷ Finally, we thank Dr. Alan Pinoy for providing the proof of Lemma 32. VB acknowledges support by an ETH Zürich Postdoctoral Fellowship.

⁷https://github.com/aterenin/phdthesis (Terenin, 2022).
REFERENCES

M. A. Alvarez, L. Rosasco, N. D. Lawrence, et al. Kernels for vector-valued functions: A review. Foundations and Trends® in Machine Learning, 4(3):195–266, 2012. Cited on page 1.

I. Azangulov, A. Smolensky, A. Terenin, and V. Borovitskiy. Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case. arXiv preprint arXiv:2208.14960, 2022. Cited on pages 2–4, 8.

I. Azangulov, A. Smolensky, A. Terenin, and V. Borovitskiy. Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces. arXiv preprint arXiv:2301.13088, 2023. Cited on pages 2, 4.

N. Berline, E. Getzler, M. Vergne, and B. Orsted. Heat kernels and Dirac operators. Jahresbericht der Deutschen Mathematiker Vereinigung, 97(3):63–65, 1995. Cited on page 18.

R. Berlinghieri, B. L. Trippe, D. R. Burt, R. Giordano, K. Srinivasan, T. Özgökmen, J. Xia, and T. Broderick. Gaussian processes at the Helm(holtz): A more fluid model for ocean currents. arXiv preprint arXiv:2302.10364, 2023. Cited on pages 4, 5.

V. Borovitskiy, A. Terenin, P. Mostowsky, and M. P. Deisenroth. Matérn Gaussian processes on Riemannian manifolds. In Advances in Neural Information Processing Systems, 2020. Cited on pages 2, 7, 18.

R. Bryant. Vector-Valued Stone–Weierstrass Theorem? MathOverflow, 2020. https://mathoverflow.net/q/371618 (version: 2020-09-14). Cited on page 16.

Y. Canzani. Analysis on manifolds via the Laplacian. Lecture notes available at: http://www.math.harvard.edu/canzani/docs/Laplacian.pdf, 2013. Cited on page 16.

J.-P. Chilès and P. Delfiner. Geostatistics: Modeling Spatial Uncertainty. Wiley Series in Probability and Statistics. Wiley & Sons, 2nd edition, 2012. Cited on page 1.

S. Coveney, C. Corrado, C. H. Roney, D. O'Hare, S. E. Williams, M. D. O'Neill, S. A. Niederer, R. H. Clayton, J. E. Oakley, and R. D. Wilkinson. Gaussian process manifold interpolation for probabilistic atrial activation maps and uncertain conduction velocity. Philosophical Transactions of the Royal Society A, 378(2173), 2020. Cited on page 2.

E. De Vito, N. Mücke, and L. Rosasco. Reproducing kernel Hilbert spaces on manifolds: Sobolev and diffusion spaces. Analysis and Applications, 19(03):363–396, 2021. Cited on page 6.

H. Hersbach, B. Bell, P. Berrisford, G. Biavati, A. Horányi, J. Muñoz Sabater, J. Nicolas, C. Peubey, R. Radu, I. Rozum, D. Schepers, A. Simmons, C. Soci, D. Dee, and J.-N. Thépaut. ERA5 monthly averaged data on pressure levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), 2023. DOI: 10.24381/cds.6860a573 (accessed on 10-Oct-2023). Cited on page 8.

J. R. Holton. An Introduction to Dynamic Meteorology, volume 88 of International Geophysics. Elsevier Academic Press, 4th edition, 2004. Cited on page 8.

M. Hutchinson, A. Terenin, V. Borovitskiy, S. Takao, Y. Teh, and M. Deisenroth. Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels. In Advances in Neural Information Processing Systems, volume 34, 2021. Cited on pages 2–5, 9, 15.

A. Ikeda and Y. Taniguchi. Spectra and eigenforms of the Laplacian on Sⁿ and Pⁿ(C). Osaka Journal of Mathematics, 15(3):515–546, 1978. Cited on page 8.

A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research, 9(2), 2008. Cited on page 1.

M. Lange-Hegermann. Algorithmic linearly constrained Gaussian processes. Advances in Neural Information Processing Systems, 31, 2018. Cited on page 1.

J. M. Lee. Manifolds and Differential Geometry, volume 107 of Graduate Studies in Mathematics. American Mathematical Society, 2009. Cited on pages 13, 14, 18, 20.

F. Lindgren, H. Rue, and J. Lindström. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B: Statistical Methodology, 73(4):423–498, 2011. Cited on page 2.

E. Noskova and V. Borovitskiy. Bayesian optimization for demographic inference. G3: Genes, Genomes, Genetics, 13(7), 2023. Cited on page 1.
V. K. Patodi. Curvature and the eigenforms of the Laplace operator. Journal of Differential Geometry, 5(1-2):233–249, 1971. Cited on page 15.

R. L. Peach, M. Vinao-Carl, N. Grossman, M. David, E. Mallas, D. Sharp, P. A. Malhotra, P. Vandergheynst, and A. Gosztolai. Implicit Gaussian process representation of vector fields over arbitrary latent manifolds. In International Conference on Learning Representations, 2024. Cited on page 2.

J. B. Prolla. On the Weierstrass–Stone Theorem. Journal of Approximation Theory, 78(3):299–313, 1994. Cited on page 16.

C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006. Cited on pages 1, 2.

P. Rosa, V. Borovitskiy, A. Terenin, and J. Rousseau. Posterior Contraction Rates for Matérn Gaussian Processes on Riemannian Manifolds. arXiv preprint arXiv:2309.10918, 2023. Cited on page 2.

S. Rosenberg. The Laplacian on a Riemannian manifold: an introduction to analysis on manifolds, number 31 in London Mathematical Society Student Texts. Cambridge University Press, 1997. Cited on pages 4, 5, 12–15, 17, 18.

N. Sharp, Y. Soliman, and K. Crane. The vector heat method. ACM Transactions on Graphics, 38(3):1–19, 2019. Cited on page 18.

B. J. Shields, J. Stevens, J. Li, M. Parasram, F. Damani, J. I. M. Alvarado, J. M. Janey, R. P. Adams, and A. G. Doyle. Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590(7844):89–96, 2021. Cited on page 1.

J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems, volume 25, 2012. Cited on page 1.

M. L. Stein. Interpolation of Spatial Data: Some Theory for Kriging. Springer, 1999. Cited on page 2.

A. Terenin. Gaussian processes and statistical decision-making in non-Euclidean spaces. arXiv preprint arXiv:2202.10613, 2022. Cited on page 9.

P. Whittle. Stochastic processes in several dimensions. Bulletin of the International Statistical Institute, 40(2):974–994, 1963. Cited on page 2.

M. Yang, V. Borovitskiy, and E. Isufi. Hodge-Compositional Edge Gaussian Processes. In International Conference on Artificial Intelligence and Statistics, 2024. Cited on pages 5, 22.
A THEORY

In this section, we present the mathematical theory behind the Hodge–Matérn Gaussian vector fields introduced in the main body of the paper. Although we previously almost exclusively talked about vector fields on manifolds, this section is written in the more technically appropriate language of differential forms. On a Riemannian manifold, the metric provides an equivalence between 1-forms and vector fields, which immediately recovers the results of the main part of the article. The general theory provides more than just intrinsic kernels for 1-forms, though: we obtain kernels for differential forms of all degrees—hence intrinsic Gaussian differential forms—which can be refined to kernels for pure gradient, pure curl, and harmonic processes.

We will begin by introducing the Hodge star operator and link the abstract differential operators to the classical notions of gradients, curls, and divergences. The Hodge star operator will allow us to define the Hodge Laplacian and the heat equation on differential forms. The fundamental solution of the latter is the heat kernel (up to a normalization constant). We will then explore the cases of surfaces and products, and conclude by giving an overview of another possible generalization of the Laplacian to sections of a vector bundle that could be considered in future research.

We assume the reader has some basic familiarity with concepts of differential geometry, such as differential forms and the exterior derivative. All manifolds are assumed to be compact, connected, and oriented. Most of the material found in this appendix can be found in the book by Rosenberg (1997). We only give proofs of original results.

A.1 The Hodge Star Operator

Let V be a finite-dimensional oriented vector space, say of dimension n, endowed with a non-degenerate inner product ⟨·, ·⟩.

The k-th exterior product Λ^k V of V is the quotient of the tensor product of k copies of V by the sign representation of the symmetric group:

v_{σ(1)} ⊗ ··· ⊗ v_{σ(k)} ∼ sgn(σ) v_1 ⊗ ··· ⊗ v_k   (27)

for v_1, ..., v_k ∈ V and σ ∈ Σ_k a permutation. We write v_1 ∧ ··· ∧ v_k for the equivalence class of v_1 ⊗ ··· ⊗ v_k, and we call ∧ the wedge product.

The inner product on V induces an inner product on the exterior product Λ^k V for 0 ≤ k ≤ n by

⟨a, b⟩ = det(⟨a_i, b_j⟩_{i,j})   (28)

for a = a_1 ∧ ··· ∧ a_k and b = b_1 ∧ ··· ∧ b_k.

It is easily checked that Λ^n V is 1-dimensional and has a canonical unit n-vector

ω = e_1 ∧ ··· ∧ e_n ∈ Λ^n V,   (29)

where e_1, ..., e_n ∈ V is an oriented orthonormal basis.

The Hodge star operator is the linear operator

⋆ : Λ^k V → Λ^{n−k} V   (30)

mapping b ∈ Λ^k V to the unique ⋆b ∈ Λ^{n−k} V such that

a ∧ (⋆b) = ⟨a, b⟩ ω   (31)

for any a ∈ Λ^k V.

From now on, let M be a compact oriented Riemannian manifold of dimension d_M. The cotangent bundle T*M is the dual of the tangent bundle TM. In other words, at each x ∈ M the fibre T_x*M is the linear dual of the tangent space T_x M at x. One can then take exterior products of the cotangent bundle, whose fibres are the exterior products Λ^k T_x*M at each x ∈ M. Sections of this bundle—smooth functions that to each point x ∈ M associate an element of Λ^k T_x*M—are called k-differential forms on M. The space of k-differential forms is denoted by Ω^k(M).

The exterior derivative is an intrinsic differential operator

d : Ω^k(M) → Ω^{k+1}(M)   (32)

defined in local coordinates by linear extension of

d(f(x) dx_I) = Σ_{i=1}^{d_M} (∂f(x)/∂x_i) dx_i ∧ dx_I,   (33)

where I = {i_1, ..., i_k} ⊆ {1, ..., d_M} is a set of k indices and dx_I = dx_{i_1} ∧ ··· ∧ dx_{i_k}. To build intuition, one can notice that if X is a vector field on M and f ∈ Ω^0(M), then df(X) is the directional derivative of f in the direction X.

Applying the Hodge star operator to the cotangent bundle at each point of M (the inner product being given by the Riemannian metric), one obtains a Hodge star operator on differential forms

⋆ : Ω^k(M) → Ω^{d_M−k}(M).   (34)

The canonical unit d_M-vector in Λ^{d_M} T*M is nothing else than the Riemannian volume form ω_g, and by definition we have ⋆1 = ω_g and ⋆ω_g = 1.

If α, β ∈ Ω^k(M) are two k-forms, their (Hodge) inner product is

⟨α, β⟩_{L²(M)} = ∫_M ⟨α, β⟩ ω_g = ∫_M α ∧ ⋆β.   (35)
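The definition in Equation (31) pins down ⋆ on basis k-vectors: ⋆e_I = sign(I, I^c) e_{I^c} for an oriented orthonormal basis. Below is a small sketch (editorial helper names; 0-based indices) that reproduces, e.g., the R³ formulas (37)-(39) of the next subsection.

```python
def perm_sign(p):
    """Sign of the permutation given as a tuple of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign

def hodge_star_basis(I, n):
    """Star of the basis k-vector e_I in oriented n-space:
    star(e_I) = sign(I, J) e_J, where J is the ordered complement of I."""
    J = tuple(j for j in range(n) if j not in I)
    return perm_sign(I + J), J

# Reproduces Equations (37)-(39) in R^3 (printed with 1-based labels):
for i in range(3):
    s, J = hodge_star_basis((i,), 3)
    print(f"star(e{i + 1}) = {'+' if s > 0 else '-'} e{J[0] + 1}^e{J[1] + 1}")
```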
An easy computation shows that the dual of the exterior derivative d : Ω^k(M) → Ω^{k+1}(M) with respect to this inner product is given by

d⋆ = (−1)^{d_M(k+1)+1} ⋆ d ⋆ : Ω^{k+1}(M) → Ω^k(M).   (36)

A.2 Gradient, Divergence, and Curl on Surfaces

In this section, we link the abstract operators ⋆, d, and d⋆ to quantities which are in some sense more concrete. Before doing that, we start with a quick recap on vector fields and their curls and divergences in R³.

In R³, a basis for 1-forms is given by dx₁, dx₂, dx₃, and a basis for 2-forms by dx₁ ∧ dx₂, dx₁ ∧ dx₃, dx₂ ∧ dx₃. The Hodge star operator sends the constant function 1 to the volume form dx₁ ∧ dx₂ ∧ dx₃, and it maps

⋆ dx₁ = dx₂ ∧ dx₃,   (37)
⋆ dx₂ = −dx₁ ∧ dx₃,   (38)
⋆ dx₃ = dx₁ ∧ dx₂,   (39)

and vice versa. Vector fields are identified with 1-forms by mapping dx_i ↦ e_i, and also with 2-forms via the Hodge star operator.

With this in mind, we immediately see that the exterior derivative of a function,

df = ∂₁f dx₁ + ∂₂f dx₂ + ∂₃f dx₃,   (40)

is identified with taking the gradient ∇f. A straightforward calculation also shows that taking the exterior derivative of a 1-form corresponds to taking the curl of the corresponding vector field, and the exterior derivative of a 2-form gives the divergence of the associated vector field. This and the fact that d² = 0 recover the well-known relations between ∇, curl, and div. Via the Hodge star operator, similar statements can be made about the d⋆ operator.

On surfaces, the situation is a bit different, since we only have two dimensions in which to move. The fact that our manifold is oriented gives us an orientation on each cotangent space T_x*M. Picking an oriented orthonormal local basis dx₁, dx₂ ∈ T_x*M, the volume form is given locally by dx₁ ∧ dx₂, and it follows that ⋆ dx₁ = dx₂ and ⋆ dx₂ = −dx₁. This corresponds to a rotation by 90° in the cotangent space.

In the case where our surface is embedded in R³, the choice of an orientation is equivalent to the choice of a global unit normal field for the manifold, i.e. a smooth choice of a unit normal vector at each point. Then, for tangent vectors, the Hodge star is given extrinsically by a rotation by 90° around the unit normal at each point. This can also be written as the cross product of a tangent vector with the unit normal.

Building on the case of R³, we call

∇ = d : Ω⁰(M) → Ω¹(M),   (41)
div = d⋆ : Ω¹(M) → Ω⁰(M),   (42)
curl = ⋆ d : Ω¹(M) → Ω⁰(M),   (43)

under the identification of 1-forms with vector fields and of 2-forms with functions. This also corresponds—potentially up to a sign, depending on conventions—to the definition of divergence in Riemannian geometry using the Levi–Civita connection (Lee, 2009, Equation 13.11). Working in a local coordinate system on M, the explicit expressions for these operators are

∇f(x) = (∂₁f(x), ∂₂f(x)),   (44)
div(v₁(x), v₂(x)) = ∂₁v₁(x) + ∂₂v₂(x),   (45)
curl(v₁(x), v₂(x)) = ∂₁v₂(x) − ∂₂v₁(x).   (46)
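The formulas (44)-(46) and the identities behind d² = 0 are easy to check numerically; a small JAX sketch with a toy function f of our choosing verifies that gradients are curl-free and that rotated gradients are divergence-free on R².

```python
import jax
import jax.numpy as jnp

def f(x):  # a toy scalar function on R^2
    return jnp.sin(x[0]) * jnp.exp(x[1])

grad_f = jax.grad(f)                                              # Equation (44)
star_grad_f = lambda x: jnp.array([-grad_f(x)[1], grad_f(x)[0]])  # 90° rotation

def div(v):   # Equation (45)
    return lambda x: jnp.trace(jax.jacfwd(v)(x))

def curl(v):  # Equation (46)
    return lambda x: jax.jacfwd(v)(x)[1, 0] - jax.jacfwd(v)(x)[0, 1]

x = jnp.array([0.3, -0.7])
print(curl(grad_f)(x))      # ~0: curl grad f = 0
print(div(star_grad_f)(x))  # ~0: div star grad f = 0
```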
A.3 The Hodge Laplacian

Let M be a d-dimensional manifold. The Hodge Laplacian on differential forms is then defined as

∆ := −(d⋆ d + d d⋆).   (47)

Remark 11. In some texts, including Rosenberg (1997), the Hodge Laplacian has the opposite sign.

Remark 12. For k = 0 (i.e. on functions), this recovers the Laplace–Beltrami operator

∆ = −d⋆ d = −∇*∇,   (48)

where ∇ is the Levi–Civita connection; cf. also Appendix A.7. In the dual context of vector fields, we obtain the classical divergence of the gradient,

∆ = div ∇.   (49)

The following deep results—found in Rosenberg (1997, Theorems 1.30, 1.37, and 1.45)—give us all we need to know about the spectrum of the Hodge Laplacian.

Theorem 13 (Hodge). All the eigenvalues of the Hodge Laplacian ∆ on Ω^k(M) are non-negative, they have finite multiplicity, and they accumulate only at infinity. The eigenforms span a dense subset of Ω^k_{L²}(M). In particular, there exists an orthonormal basis of Ω^k_{L²}(M) consisting of smooth eigenforms of ∆.

Remark 14. The convention on eigenpairs is that ϕ is an eigenform of (minus) eigenvalue λ if ∆ϕ = −λϕ.

Theorem 15 (Hodge decomposition). The space of smooth k-forms decomposes as

Ω^k(M) = ker ∆ ⊕ im d ⊕ im d⋆.   (50)

The next result links the kernel of the Hodge Laplacian with a purely topological property of the manifold: de Rham cohomology. An accessible introduction is given in Rosenberg (1997, Section 1.4).

Theorem 16 (Hodge). The kernel of the Hodge Laplacian on k-forms is naturally isomorphic to the k-th de Rham cohomology group, which is a real vector space:

ker ∆ ≅ H^k_dR(M).   (51)

Remark 17. The following facts about de Rham cohomology are often useful. Assume M is compact and connected.

1. H⁰_dR(M) ≅ R, spanned by the constant function f(x) = 1.

2. H^{d_M}_dR(M) ≅ R, spanned by the volume form.

3. All of the H^k_dR(M) are finite dimensional, see e.g. Lee (2009, Theorem 10.17).

An example that is exploited in the main body of this paper is the well-known fact that

H¹_dR(S²) = 0.   (52)

This can be computed using the Mayer–Vietoris sequence, see e.g. Lee (2009, Section 10.1).

Proposition 18. The various spaces of eigenforms have the following relations.

1. The Hodge star ⋆ : Ω^k(M) → Ω^{d_M−k}(M) sends eigenforms of the Hodge Laplacian to eigenforms with the same eigenvalue, and it preserves their orthogonality and norm.

2. The exterior derivative d : Ω^k(M) → Ω^{k+1}(M) sends eigenforms in im d⋆ to eigenforms with the same eigenvalue (and is zero on the other eigenforms), it preserves orthogonality, and

∥dϕ∥_{L²(M)} = √λ ∥ϕ∥_{L²(M)}   (53)

for ϕ ∈ im d⋆ ⊆ Ω^k(M) an eigenform of eigenvalue −λ.

3. The operator d⋆ : Ω^k(M) → Ω^{k−1}(M) sends eigenforms in im d to eigenforms with the same eigenvalue (and is zero on the other eigenforms), it preserves orthogonality, and

∥d⋆ϕ∥_{L²(M)} = √λ ∥ϕ∥_{L²(M)}   (54)

for ϕ ∈ im d ⊆ Ω^k(M) an eigenform of eigenvalue −λ.

Proof. For (1), it is immediate to see that ∆⋆ = ⋆∆, so that ⋆ sends eigenforms to eigenforms with the same eigenvalue. If α, β are any two k-forms, then

⟨⋆α, ⋆β⟩_{L²(M)} = ∫_M ⋆α ∧ ⋆⋆β   (55)
= ∫_M ⋆α ∧ β   (56)
= ∫_M β ∧ ⋆α = ⟨β, α⟩_{L²(M)}   (57)
= ⟨α, β⟩_{L²(M)},   (58)

showing that ⋆ is an isometry and concluding the point.

For (2), one easily checks that ∆d = d∆, so that d sends eigenforms that are in im d⋆, i.e. the complement of ker d, to eigenforms with the same eigenvalue. Let ϕ, ψ ∈ im d⋆ be eigenforms with eigenvalues λ_ϕ, λ_ψ respectively; then

⟨dϕ, dψ⟩_{L²(M)} = ⟨ϕ, d⋆ dψ⟩_{L²(M)}   (59)
= ⟨ϕ, −∆ψ⟩_{L²(M)}   (60)
= λ_ψ ⟨ϕ, ψ⟩_{L²(M)},   (61)

where in the middle equality we used the fact that ψ ∈ im d⋆ and that d⋆ d⋆ = 0. This concludes the proof of the point, and (3) is analogous.

A.4 The Heat Equation and its Kernel

The heat equation for k-forms can now be defined as

∂_t α(t, x) = ∆_x α(t, x)   (62)

with a given initial condition α(0, x) = β(x) ∈ Ω^k(M).

A double (k-)form over M is a smooth section of the bundle R ⊗ Λ^k T*M ⊗ Λ^k T*M over M × M, where the fibre above (x, x′) ∈ M × M is R ⊗ Λ^k T_x*M ⊗ Λ^k T_{x′}*M.

A heat kernel for k-forms is a double form P_t(x, x′) such that

1. (∂_t − ∆_x) P_t(x, x′) = 0 and

2. lim_{t→0} ∫_M ⟨P_t(x, x′), α(x′)⟩_{x′} dx′ = α(x) for any α ∈ Ω^k(M), where the pointwise inner product is taken with respect to x′, and integration is against the volume form of M.
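These two conditions are easy to verify numerically in the simplest case, k = 0 on the circle, where the eigenexpansion of Theorem 19 below reduces to the classical Fourier series; the truncation level and the test function are editorial choices.

```python
import numpy as np

def circle_heat_kernel(t, x, y, M=50):
    """Truncated eigenexpansion of the scalar heat kernel on S^1:
    P_t(x, y) = (1 / 2 pi) * sum_{|m| <= M} exp(-m^2 t) cos(m (x - y))."""
    m = np.arange(-M, M + 1)
    return np.exp(-(m**2) * t) @ np.cos(np.outer(m, x - y)) / (2 * np.pi)

xs = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
alpha = np.cos(xs) + 0.3 * np.sin(3 * xs)    # a test 0-form alpha
target = np.cos(1.0) + 0.3 * np.sin(3.0)     # alpha evaluated at x = 1
for t in [1.0, 0.1, 1e-3]:
    smoothed = circle_heat_kernel(t, 1.0, xs) @ alpha * (2 * np.pi / len(xs))
    print(t, smoothed, target)  # condition 2: smoothed -> alpha(x) as t -> 0
```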
D. Robert-Nicoud, A. Krause, V. Borovitskiy

Proof. See Rosenberg (1997, Proposition 3.1) for func- Fibrewise bilinearity is obvious, we are left to prove
tions, the discussion at the end of Rosenberg (1997, that our kernel is positive semi-definite in the sense of
Section 3.2) and Patodi (1971) for the general state- Hutchinson et al. (2021, Definition 3). In order to do
ment on forms. so, for xi ∈ M and αxi ∈ Tx∗i M, 1 ≤ i ≤ n, consider
Given the heat kernel, the solution of the heat equation

  (∂_t + ∆_x) α(t, x) = 0   (64)

with initial condition α(0, x) = β(x) is given by

  α(t, x) = (e^{−t∆} β)(x)   (65)
         := ∫_M ⟨P_t(x, x′), β(x′)⟩_{x′} dx′   (66)
          = ⟨P_t(x, ·), β(·)⟩_{L²(M)}   (67)

for t > 0. We now show that the heat kernel is a valid kernel for Gaussian processes.

Proposition 20. The heat kernel satisfies the following properties.

1. The kernel P_t(x, x′) is symmetric: for x, x′ ∈ M we have P_t(x, x′) = P_t(x′, x) under the canonical identification

  Λ^k T*_x M ⊗ Λ^k T*_{x′} M ≅ Λ^k T*_{x′} M ⊗ Λ^k T*_x M.   (68)

2. The propagator e^{−t∆} satisfies the semigroup property e^{−(t+s)∆} = e^{−t∆} e^{−s∆}.

3. The kernel P_t(x, x′) is positive semi-definite: for all α ∈ Ω^k_{L²}(M) we have

  ⟨α(x), ⟨P_t(x, x′), α(x′)⟩_{x′}⟩_x ≥ 0.   (69)

Proof. Statement (1) is immediate from Theorem 19. Alternatively, it can be proved from the fact that the Hodge Laplacian is self-adjoint. Statement (2) follows in the same way as in Rosenberg (1997, pp. 28–29), as does statement (3).

As an immediate corollary in the vector setting, we obtain the following.

Theorem 3. For any t > 0 there exists a Gaussian vector field whose kernel is P_t.

Proof. The statement follows from Hutchinson et al. (2021, Theorem 4), provided we prove that our kernel satisfies Definition 3 in the cited work. The link between the two definitions is given by

  k(α_x, β_{x′}) = ⟨α_x, ⟨P_t(x, x′), β_{x′}⟩^{pt}_{x′}⟩^{pt}_x,   (70)

where the scalar products on the right-hand side are taken pointwise, not integrating over the manifold. Fibrewise bilinearity is obvious, we are left to prove that our kernel is positive semi-definite in the sense of Hutchinson et al. (2021, Definition 3). In order to do so, for x_i ∈ M and α_{x_i} ∈ T*_{x_i} M, 1 ≤ i ≤ n, consider the sequence

  α_i^m(x) = ⟨P_{1/m}(x, x_i), α_{x_i}⟩^{pt}_{x_i}.   (71)

Then we have

  ⟨P_t(x, x′), α_i^m(x′)⟩_{x′} =   (72)
  = ⟨P_t(x, x′), ⟨P_{1/m}(x′, x_i), α_{x_i}⟩^{pt}_{x_i}⟩_{x′}   (73)
  = ⟨⟨P_t(x, x′), P_{1/m}(x′, x_i)⟩_{x′}, α_{x_i}⟩^{pt}_{x_i}   (74)
  → ⟨P_t(x, x_i), α_{x_i}⟩^{pt}_{x_i}  as m → ∞,   (75)

where the limit follows from the semigroup property, and similarly

  k(α_{x_i}, α_{x_j}) = lim_{m→∞} ⟨α_i^m(x), ⟨P_t(x, x′), α_j^m(x′)⟩_{x′}⟩_x.   (76)

Therefore, letting α^m(x) = Σ_{i=1}^n α_i^m(x) we obtain

  Σ_{i=1}^n Σ_{j=1}^n k(α_{x_i}, α_{x_j}) =   (77)
  = lim_{m→∞} ⟨α^m(x), ⟨P_t(x, x′), α^m(x′)⟩_{x′}⟩_x   (78)
  ≥ 0,   (79)

since each term in the sequence is non-negative.

Proposition 6. The Gaussian vector field

  f(x) = (σ/√(C_{ν,κ})) Σ_{n=0}^L w_n √(Φ_{ν,κ}(λ_n)) s_n(x),   (18)

where w_n ∼ N(0, 1) i.i.d., corresponds to the kernel given by the truncation of Equation (14) with the sum Σ_{n=0}^∞ therein substituted by the sum Σ_{n=0}^L.

Proof. The mean of the random variable f(x) is obviously 0. The covariance is easily computed to be

  Cov(f(x), f(x′)) = (σ²/C_{ν,κ}) Σ_{n=0}^L Φ_{ν,κ}(λ_n) s_n(x) ⊗ s_n(x′).   (80)

This concludes the proof.
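Proposition 6 doubles as a sampling recipe: draw i.i.d. standard normal weights and combine them with the square root of the spectrum. Below is a minimal sketch of this on the circle with the Matérn spectral density Φ_{ν,κ}(λ) = (2ν/κ² + λ)^{−ν−d/2}; the normalization mirrors Proposition 28 below, but the concrete function names are our own illustration and not the paper's implementation.

```python
import numpy as np

def matern_spectrum(lam, nu, kappa, d=1):
    # Phi_{nu,kappa}(lambda) = (2 nu / kappa^2 + lambda)^(-nu - d/2)
    return (2 * nu / kappa**2 + lam) ** (-nu - d / 2)

def sample_truncated_gp_circle(x, nu=1.5, kappa=0.5, sigma=1.0, L=50, seed=0):
    """One draw of the truncated expansion of Proposition 6 on S^1."""
    rng = np.random.default_rng(seed)
    n = np.arange(1, L + 1)
    phi = matern_spectrum((n**2).astype(float), nu, kappa)
    # Orthonormal eigenfunctions: the constant plus cos/sin pairs.
    basis = [np.full_like(x, 1 / np.sqrt(2 * np.pi))]
    basis += [np.cos(k * x) / np.sqrt(np.pi) for k in n]
    basis += [np.sin(k * x) / np.sqrt(np.pi) for k in n]
    spec = np.concatenate([[matern_spectrum(0.0, nu, kappa)], phi, phi])
    C = spec.sum() / (2 * np.pi)        # normalization, cf. Proposition 28
    w = rng.standard_normal(len(spec))  # w_n ~ N(0, 1) i.i.d.
    return sigma / np.sqrt(C) * (w * np.sqrt(spec)) @ np.array(basis)

print(sample_truncated_gp_circle(np.linspace(0, 2 * np.pi, 5)))
```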
As a direct consequence of Theorem 19, Proposition 20, and of the Hodge decomposition theorem, we also have the following.
Theorem 5. There exists a Gaussian vector field f^• — where • ∈ {div, curl, harm} — with kernel

  k_{ν,κ,σ²}(x, x′) = (σ²/C_{ν,κ}) Σ_{n∈I_•} Φ_{ν,κ}(λ_n) s_n(x) ⊗ s_n(x′).   (16)

What is more, div f^{curl} = 0, curl f^{div} = 0 and ∆f^{harm} = 0 almost surely as long as f^• is smooth enough that div and curl are well-defined.

There are some situations where it is only necessary to have partial information on the eigenforms of the Hodge Laplacian in order to gain full knowledge of the spectrum. We will now give the examples of surfaces—where it is enough to know the spectrum in the scalar case and the harmonic 1-forms—and product manifolds—where we only need to know the spectrum on the factors separately.

A.5 Surfaces

Suppose now that dim M = 2 and assume that we have an orthonormal basis {f_n}_n of eigenfunctions of the Laplace–Beltrami operator on M with eigenvalues {−λ_n}_n where 0 = λ_1 < λ_2 ≤ λ_3 ≤ · · · (where we notice that the only eigenfunction with eigenvalue 0 is the constant), as well as an orthonormal basis of 1-eigenforms {α_j}_{0≤j≤J} of the first de Rham cohomology H¹_dR(M). Then, thanks to Hodge decomposition (Theorem 15) and Proposition 18, we have that an orthonormal basis of 1-eigenforms of Ω¹(M) is given by

  { df_n/√λ_n , ⋆df_n/√λ_n , α_j | n ≥ 1 and 0 ≤ j ≤ J }.   (81)

An orthonormal basis of 2-forms is simply given by ⋆f_i = f_i ω, where ω ∈ Ω²(M) is the canonical volume form. Similarly, we can also recover all the 0- and 2-eigenforms from knowledge of the 1-eigenforms.

In the dual case we have the following result from the main body of the paper.

Theorem 8. For each n ≥ 1, both ∇f_n and ⋆∇f_n are eigenfields of the Hodge Laplacian, and the set

  { ∇f_n/√λ_n , ⋆∇f_n/√λ_n , g_j | n ≥ 1 and 0 ≤ j ≤ J }   (19)

forms an orthonormal basis of L²(M; T M).

A.6 Products

In this section, let M, N be two oriented, connected, compact Riemannian manifolds. We recall the Stone–Weierstrass theorem, see e.g. Prolla (1994).

Theorem 21 (Stone–Weierstrass). Let A ⊆ C^∞(M) be a subalgebra of the algebra of smooth functions on M which contains all constant functions and which separates points, i.e. such that for each x, y ∈ M there is a function f ∈ A with f(x) ≠ f(y). Then A is dense in C^∞(M).

With this, we get the following, e.g. Canzani (2013, Section 4.6).

Proposition 22. For X ∈ {M, N}, let {f_i^X}_i ⊆ C^∞(X) be an orthonormal basis of eigenfunctions of the Laplace–Beltrami operator on X with eigenvalues −λ_i^X. Then

  { h_{i,j}(x_M, x_N) := f_i^M(x_M) f_j^N(x_N) | i, j },   (82)

where (x_M, x_N) ∈ M × N, is an orthonormal basis of eigenfunctions of the Laplace–Beltrami operator on M × N with eigenvalues −(λ_i^M + λ_j^N).

Proof. It is straightforward to check that the functions h_{i,j} are eigenfunctions with the prescribed eigenvalues, and that they are orthonormal. We need to show that these functions are dense in L²(M × N). The sets {f_i^X}_i ⊆ C^∞(X) are both dense in the spaces of smooth functions on the respective manifolds, so that the h_{i,j} are dense in the subalgebra spanned by the products C^∞(M) · C^∞(N). By the Stone–Weierstrass theorem, this subalgebra is dense in C^∞(M × N), and thus, also in L²(M × N), which concludes the proof.

Corollary 23. For the heat kernel on M × N we have

  P_t^{M×N}(x, x′) = P_t^M(x_M, x′_M) P_t^N(x_N, x′_N),   (83)

for x = (x_M, x_N) ∈ M × N and similarly for x′.
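Corollary 23 can be checked numerically on the torus T² = S¹ × S¹: the truncated heat kernel of the product is the product of the truncated circle heat kernels. The following self-contained sketch, with our own naming, uses the closed-form Fourier eigenpairs of S¹.

```python
import numpy as np

def heat_kernel_circle(x, y, t, L=50):
    n = np.arange(1, L + 1)
    return 1 / (2 * np.pi) + (np.exp(-(n**2) * t) * np.cos(n * (x - y))).sum() / np.pi

def heat_kernel_torus(x, y, t, L=50):
    """Proposition 22 / Corollary 23: eigenfunctions on T^2 are products of
    circle eigenfunctions and eigenvalues add, so the heat kernel factorizes."""
    return heat_kernel_circle(x[0], y[0], t, L) * heat_kernel_circle(x[1], y[1], t, L)

x, y = np.array([0.3, 1.2]), np.array([2.0, 0.7])
print(heat_kernel_torus(x, y, t=0.5))
```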
The more general case of differential forms is similar. We need the following result, which will play the role of the Stone–Weierstrass theorem (Bryant, 2020). Recall that if A is an algebra and M is an A-module, an A-submodule of M is a linear subspace M′ of M such that for each a ∈ A and m′ ∈ M′ we have a · m′ ∈ M′.

Proposition 24. Let A ⊆ C^∞(M) be as in Theorem 21 and let E ⊆ Ω^k(M) be an A-submodule which is such that for each point x ∈ M we have

  span{e(x) | e ∈ E} = Λ^k T*_x M.   (84)

Then E is dense in Ω^k(M) with respect to the supremum norm.

Proof. First notice that we can choose a finite subset e_1, . . . , e_n ∈ E that spans Λ^k T*M everywhere. Indeed, consider the collection of open sets given by

  {x ∈ M | e′_1(x), . . . , e′_{n′}(x) spans Λ^k T*_x M}   (85)
for all possible choices of e′_1, . . . , e′_{n′} ∈ E. By our assumptions, this covers M. Thus, by compactness we can choose a finite sub-collection that also covers M. Taking all the elements of E appearing in this finite sub-collection gives us the desired set. Every form α ∈ Ω^k(M) can then be written as

  α(x) = Σ_{i=1}^n f_i(x) e_i(x)   (86)

for some smooth f_i (e.g. using bump functions). Each of these functions f_i can be approximated by elements of A, and since all sums of sections of the form a · e for a ∈ A and e ∈ E are in E (as it is an A-module), it follows that we can approximate α as well.

Proposition 25. For X ∈ {M, N}, let {ϕ^{X,k}_i}_i ⊆ Ω^k(X) be an orthonormal basis of the eigenfields of the Hodge Laplacian with eigenvalues λ^{X,k}_i. The set

  { ϕ^{M,k_M}_i ∧ ϕ^{N,k_N}_j | k_M + k_N = k, i, j }   (87)

is an orthonormal basis of Ω^k(M × N) with eigenvalues λ^{M,k_M}_i + λ^{N,k_N}_j.

Proof. Similar to Proposition 22. Checking that these forms are eigenforms of the Hodge Laplacian with the prescribed eigenvalues and that they are orthonormal is straightforward. To show that their span is dense in Ω^k(M × N), we notice that for x = (x_M, x_N) ∈ M × N we have

  Λ^k T*_x(M × N) ≅ ⊕_{k_M + k_N = k} Λ^{k_M} T*_{x_M} M ∧ Λ^{k_N} T*_{x_N} N.   (88)

It follows that the span of our set of forms can play the role of E in Proposition 24, with the role of A being played by the products of eigenfunctions

  A = { ϕ^{M,0}_i ∧ ϕ^{N,0}_j | i, j }.   (89)

The statement follows.

Corollary 26. The heat kernel on Ω^k(M × N) is given by

  P_t^{Ω^k(M×N)}(x, x′) = Σ_{i+j=k} P_t^{Ω^i(M)}(x_M, x′_M) P_t^{Ω^j(N)}(x_N, x′_N)   (90)

for x = (x_M, x_N) ∈ M × N and similarly for x′.

A direct application of this is Hodge–Matérn kernels on the torus, which we give in the context of vector fields.

Proposition 10. The eigenfields on T^d are

  ( Π_{i=1}^d f_{n_i}(x_i) ) e_j ∈ T S¹ ⊗ · · · ⊗ T S¹ ≅ T T^d,   (25)

1 ≤ j ≤ d, with eigenvalue Σ_{i=1}^d λ_{n_i}, where x_i ∈ S¹, and (f_n, λ_n) are eigenpairs of the Laplace–Beltrami operator on S¹. In particular, with k^{T^d} the scalar Matérn kernel on T^d, the vector Hodge–Matérn kernel is

  k_{ν,κ,σ²}(x, x′) = (1/d) k^{T^d}_{ν,κ,σ²}(x, x′) I_d.   (26)

Proof. We have a canonical identification

  T S¹ ≅ S¹ × ℝ   (91)

of the tangent bundle of the circle with a trivial bundle. It follows that we also have

  T T^d ≅ T^d × ℝ^d   (92)

so that we can work in global coordinates as if we were working in ℝ^d.

The Hodge star operator gives us an identification of vector fields on the circle with functions on the circle. By Proposition 25 we obtain the desired form for the eigenfields on T^d.

The unnormalized form of the Hodge–Matérn kernel is given by

  Σ_{j=1}^d Σ_{n∈ℕ^d} e^{−tλ_n} ( Π_{i=1}^d f_{n_i}(x_i) f_{n_i}(x′_i) ) e_j ⊗ e_j = ( Σ_{n∈ℕ^d} e^{−tλ_n} Π_{i=1}^d f_{n_i}(x_i) f_{n_i}(x′_i) ) I_d,   (93)

where n = (n_1, . . . , n_d) and λ_n = Σ_{i=1}^d λ_{n_i}. In the first part of the expression, we recognize the unnormalized form of the scalar manifold Matérn kernel on T^d. We then notice that the normalization factors of the scalar kernel and the vector one will differ by a factor d by taking the trace of the identity matrix, which concludes the proof.
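The following sketch implements the (unnormalized) identity (93) on T^d: in the global trivialization T T^d ≅ T^d × ℝ^d, the Hodge heat kernel on vector fields is the scalar product kernel times the identity matrix. Normalizing as in Equation (26) would further divide by d; the names and the truncation level are our own, not the paper's implementation.

```python
import numpy as np

def scalar_heat_torus(x, y, t, L=30):
    """Unnormalized scalar heat kernel on T^d as a product over the circle
    factors, cf. Corollary 23."""
    n = np.arange(1, L + 1)
    k = 1.0
    for xi, yi in zip(x, y):
        k *= 1 / (2 * np.pi) + (np.exp(-(n**2) * t) * np.cos(n * (xi - yi))).sum() / np.pi
    return k

def vector_hodge_heat_torus(x, y, t, L=30):
    """Equation (93): the vector kernel is the scalar kernel times I_d."""
    d = len(x)
    return scalar_heat_torus(x, y, t, L) * np.eye(d)

x, y = np.array([0.1, 2.0]), np.array([0.4, 1.0])
print(vector_hodge_heat_torus(x, y, t=0.5))
```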
A.7 An Alternative: Connection Laplacian

There is another possible natural extension of the Laplace–Beltrami operator to differential forms, called the connection (or Bochner) Laplacian. In fact, it can be extended to much more than just differential forms: it exists as soon as we have a vector bundle on a manifold with a nicely behaved connection and inner product. An accessible introduction to this notion for the case of differential forms can be found in Rosenberg
(1997, Section 2.2), while an extensive treatise on it, including the existence of a heat kernel, in Berline et al. (1995).

There is a formal relationship between the Hodge Laplacian and the connection Laplacian, called the Weitzenböck formula, see Rosenberg (1997, Section 2.2.2) or Lee (2009, Section 13.12): their difference is given by a term depending only on the curvature of the manifold. The Hodge theorems make it easier to work with the Hodge Laplacian by exploiting the relationships between the exterior derivative and the Hodge star operator, which is the reason why we focused on that operator in the present work. However, one gets valid kernels for GPs using the connection Laplacian as well, and this latter operator has some very nice properties that could be fruitful to use. For example, the heat kernel of the connection Laplacian asymptotically acts on vectors by parallel transport, see Berline et al. (1995, Theorem 2.30). This fact was exploited in Sharp et al. (2019) to provide efficient computational methods for parallel transport on manifolds.

B FURTHER RESULTS

This section contains the proofs of some additional results on Gaussian vector fields that were stated in the main body of the paper or that we found of general interest but could not include because of space limitations.

The first result formally shows that projected GPs built from stacking copies of known intrinsic scalar kernels will generally lead to undesirable uncertainty patterns. The second subsection studies normalization constants for the kernels of Hodge–Matérn Gaussian vector fields. The third and last subsection aims to quantify the divergence of samples from different Gaussian vector fields.

B.1 Limitations of Projected GPs

We work on the standard unit sphere S² ⊂ ℝ³. Let g = Ah where h_j ∼ GP(0, k_{ν,κ_j,σ²}) for j ∈ {1, 2, 3} are independent and A ∈ ℝ^{3×3} is an arbitrary matrix. Let f be the associated projected vector GP.

Proposition 2. With notation as above, if rk A > 1, there are x, x′ ∈ S² so that ∠(x, x′) = 90° but

  ∥Cov(f(x), f(x′))∥ < ∥Cov(f(x), f(x̃))∥ as κ_j → ∞,   (8)

where x̃ = −x is antipodal to x (i.e. ∠(x, x̃) = 180°) and ∥•∥ denotes the Frobenius norm.

Proof. For simplicity of notation, we assume the κ_j are all equal. Otherwise, we can formally regard κ_j → ∞ as min_j κ_j → ∞. Take arbitrary x, x′ ∈ S². Then

  Cov(f(x), f(x′)) = Cov(P_x g(x), P_{x′} g(x′))   (94)
  = P_x Cov(g(x), g(x′)) P_{x′}^⊤   (95)
  = P_x A Cov(h(x), h(x′)) A^⊤ P_{x′}^⊤.   (96)

Since Cov(h(x), h(x′)) = k_{ν,κ,σ²}(x, x′) I and for arbitrary x, x′ we have k_{ν,κ,σ²}(x, x′) → σ² when κ → ∞ (Borovitskiy et al., 2020), we obtain

  lim_{κ→∞} Cov(f(x), f(x′)) = σ² P_x A A^⊤ P_{x′}^⊤ =: C_{xx′}.   (97)

Without loss of generality, we can assume σ² = 1. To analyze C_{xx′} we construct the singular value decomposition (SVD) A = U Σ V^⊤ of the matrix A. Here U and V are orthogonal matrices and Σ is a diagonal matrix with non-negative entries. Then

  A A^⊤ = U Σ (V^⊤ V) Σ U^⊤ = U Σ² U^⊤ = U Λ U^⊤,   (98)

where Λ := Σ² and the right-hand side is the eigendecomposition of the matrix A A^⊤. We denote λ_i = Λ_{ii} and, without any loss of generality, assume that λ_1 ≤ λ_2 ≤ λ_3. By assumption, λ_2, λ_3 > 0.

The columns U_{·j} of U, j = 1, . . . , 3, form an orthonormal basis of ℝ³. We choose x = U_{·1} and x′ = U_{·2}. Then

  P_x U_{·j} = P_{x̃} U_{·j} = 0 for j = 1 and U_{·j} for j = 2, 3,   (99)
  P_{x′} U_{·j} = 0 for j = 2 and U_{·j} for j = 1, 3.   (100)

It follows that

  C_{xx′} = (0, U_{·2}, U_{·3}) Λ (U_{·1}, 0, U_{·3})^⊤   (101)
  = λ_1 · 0 · U_{·1}^⊤ + λ_2 U_{·2} · 0^⊤ + λ_3 U_{·3} U_{·3}^⊤   (102)
  = λ_3 U_{·3} U_{·3}^⊤.   (103)

Analogously, C_{xx̃} = λ_2 U_{·2} U_{·2}^⊤ + λ_3 U_{·3} U_{·3}^⊤. Thus, we have

  ∥C_{xx′}∥_F = √(tr C_{xx′} C_{xx′}^⊤) = λ_3   (104)

and similarly ∥C_{xx̃}∥_F = √(λ_2² + λ_3²), proving the claim.
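The singular value computation in the proof above is easy to verify numerically. The sketch below, our own illustration, draws a random A, forms the limiting covariances (97), and confirms ∥C_{xx′}∥_F = λ₃ and ∥C_{xx̃}∥_F = √(λ₂² + λ₃²).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))        # arbitrary matrix with rk A > 1
U, S, _ = np.linalg.svd(A)             # singular values in decreasing order
lam1, lam2, lam3 = np.sort(S**2)       # eigenvalues of A A^T, ascending

def P(x):
    # Orthogonal projection onto the tangent plane of S^2 at the unit vector x.
    return np.eye(3) - np.outer(x, x)

x, x_prime, x_tilde = U[:, 2], U[:, 1], -U[:, 2]

def C(a, b):
    # The kappa -> infinity limit (97), with sigma^2 = 1.
    return P(a) @ A @ A.T @ P(b).T

print(np.linalg.norm(C(x, x_prime), "fro"), lam3)                  # equal
print(np.linalg.norm(C(x, x_tilde), "fro"), np.hypot(lam2, lam3))  # equal
```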
B.2 Kernel Normalization Constants

In Section 3.2, we defined the normalization constant for the kernels of Hodge–Matérn Gaussian vector fields implicitly by requiring that

  (1/vol M) ∫_M tr k(x, x) dx = σ².   (105)
We will now explain what this normalization means in practice and make these constants explicit for the kernels of Hodge–Matérn Gaussian vector fields and projected Matérn Gaussian vector fields.

Proposition 27. Assume (105) holds, then

  (1/vol M) E_{f∼GP(0,k)} [ ∥f∥²_{L²(M)} ] = σ².   (106)

Proof. We have

  (1/vol M) E_{f∼GP(0,k)} [ ∥f∥²_{L²(M)} ] =   (107)
  = (1/vol M) E_{f∼GP(0,k)} [ ∫_M ∥f(x)∥₂² dx ]   (108)
  = (1/vol M) ∫_M E_{f∼GP(0,k)} [ ∥f(x)∥₂² ] dx   (109)
  = (1/vol M) ∫_M E_{f(x)∼N(0,k(x,x))} [ ∥f(x)∥₂² ] dx   (110)
  = (1/vol M) ∫_M tr k(x, x) dx   (111)
  = σ²,   (112)

where the last equality holds by (105).

Proposition 28. The constant C_{ν,κ} for the Hodge–Matérn kernel k_{ν,κ,σ²} is given by

  C_{ν,κ} = (1/vol M) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n),   (113)

where the sum runs over all of the eigenfields of the Hodge Laplacian. Similar formulas are valid for the pure divergence, pure curl, and harmonic kernels by restricting the sum to the appearing eigenfields.

Proof. We have

  ∫_M tr(k_{ν,κ,σ²}(x, x)) dx =   (114)
  = ∫_M tr( (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) s_n(x) ⊗ s_n(x) ) dx   (115)
  = (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) ∫_M tr(s_n(x) ⊗ s_n(x)) dx   (116)
  = (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) ∫_M ∥s_n(x)∥₂² dx   (117)
  = (σ²/C_{ν,κ}) Σ_{n=0}^∞ Φ_{ν,κ}(λ_n),   (118)

since ∥s_n∥_{L²(M)} = 1. Notice that here the trace is taken with respect to the metric on M. The result follows immediately by requiring the left-hand side of the equation to equal σ² vol M.

Proposition 29. The appropriately normalized projected Matérn kernel (with trivial coregionalization matrix) on a manifold M of dimension d embedded in ℝ^N is given by

  k^π_{ν,κ,σ²}(x, x′) = (1/d) k_{ν,κ,σ²}(x, x′) P_x^⊤ I_N P_{x′},   (119)

where k_{ν,κ,σ²} is the scalar manifold Matérn kernel.

Proof. Let x = x′ and pick a coordinate system such that T_x M = span{e_1, . . . , e_d}. Then we have

  P_x^⊤ I_N P_{x′} = diag(1, . . . , 1, 0, . . . , 0)   (120)

with d ones. It follows that

  (1/vol M) ∫_M tr( k^π_{ν,κ,σ²}(x, x) ) dx =   (121)
  = (1/vol M) ∫_M k_{ν,κ,σ²}(x, x) dx   (122)
  = σ²,   (123)

as desired.

B.3 Divergence of Gaussian Vector Fields

We will now study the distribution of the (pointwise) divergence of the Hodge–Matérn Gaussian vector fields and projected Matérn GPs. Similar techniques can be used to compute the full distribution for dα and d⋆α for α ∼ GP(0, k_{ν,κ,σ²}), which turns out to be another Gaussian process.

We fix a compact, oriented Riemannian manifold M of dimension d_M ≥ 1 and we look at degree k = 1 differential forms, although the computations straightforwardly generalize to all other k. We write f_n ∈ C^∞(M) = Ω^0(M) for a basis of eigenfunctions with eigenvalues −λ_n, for n ≥ 0, where 0 = λ_0 < λ_1 ≤ · · ·. We also fix a basis ϕ_n, n ∈ ℕ, of 1-eigenforms, which includes the eigenforms

  (1/√λ_n) df_n   (124)

for n ≥ 1. We use C^k_{ν,κ} to denote the normalization constant for the Hodge–Matérn kernel on k-forms (with k = 0 being the case of functions). For α ∼ GP(0, k) a Gaussian differential form, we write div α(x) for the random variable d⋆α(x), where x ∈ M. We assume div α(x) is well-defined, i.e. the Gaussian process is smooth enough, which places restrictions on the parameter ν. We leave the precise nature of these out of the scope of this paper.
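As a concrete instance of Proposition 28, the sketch below approximates the scalar normalization constant C⁰_{ν,κ} on S² (vol = 4π), whose Laplace–Beltrami eigenvalues are ℓ(ℓ + 1) with multiplicity 2ℓ + 1; this spectrum is a standard fact we are assuming here, and the names are our own.

```python
import numpy as np

def matern_spectrum(lam, nu, kappa, d=2):
    return (2 * nu / kappa**2 + lam) ** (-nu - d / 2)

def C0_sphere(nu, kappa, L=500):
    """Truncation of (113) on S^2: sum Phi over lam_l = l (l + 1),
    each counted with multiplicity 2 l + 1, divided by vol S^2 = 4 pi."""
    ell = np.arange(0, L + 1)
    lam = ell * (ell + 1.0)
    return ((2 * ell + 1) * matern_spectrum(lam, nu, kappa)).sum() / (4 * np.pi)

print(C0_sphere(nu=1.5, kappa=0.5))
```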
Proposition 30. For the Hodge–Matérn Gaussian form α_{ν,κ,σ} ∼ GP(0, k_{ν,κ,σ²}) on 1-forms we have

  Var(div α_{ν,κ,σ}(x)) = (σ²/C¹_{ν,κ}) Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) f_n(x)²   (125)

whenever α_{ν,κ,σ} is smooth enough for the divergence to be well-defined.

Proof. By Proposition 6 we have

  div α_{ν,κ,σ}(x) = d⋆ α_{ν,κ,σ}(x)   (126)
  = d⋆ ( (σ/√(C¹_{ν,κ})) Σ_{n=0}^∞ √(Φ_{ν,κ}(λ_n)) w_n ϕ_n(x) )   (127)
  = (σ/√(C¹_{ν,κ})) Σ_{n=0}^∞ √(Φ_{ν,κ}(λ_n)) w_n d⋆ϕ_n(x)   (128)
  = (σ/√(C¹_{ν,κ})) Σ_{n=1}^∞ √(Φ_{ν,κ}(λ_n)/λ_n) w_n d⋆df_n(x)   (129)
  = (σ/√(C¹_{ν,κ})) Σ_{n=1}^∞ √(λ_n Φ_{ν,κ}(λ_n)) w_n f_n(x),   (130)

where in the third line we used that d⋆ϕ_n(x) = 0 unless ϕ_n is of the form (124) and in the last line we used the fact that d⋆df_n(x) = −∆f_n(x) = λ_n f_n(x). This gives the full distribution for div α_{ν,κ,σ}(x), and in particular the desired formula for the variance. Note that Equation (130) can fail to converge when ν is not large enough (i.e. α_{ν,κ,σ} is not smooth enough).

Corollary 31. On the sphere, in terms of vector fields, we have

  Var(div f_{ν,κ,σ}(x)) = σ² Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) / ( 2 (4π C⁰_{ν,κ} − Φ_{ν,κ}(0)) )   (131)

for any x ∈ S².

Proof. By symmetry, Var(div f_{ν,κ,σ}(x)) does not depend on x, i.e. it is a constant. Therefore, we have

  4π Var(div f_{ν,κ,σ}(x)) =   (132)
  = ∫_M Var(div f_{ν,κ,σ}(x′)) dx′   (133)
  = ∫_M (σ²/C¹_{ν,κ}) Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) f_n(x′)² dx′   (134)
  = (σ²/C¹_{ν,κ}) Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n)   (135)

since ∥f_n∥_{L²(M)} = 1. The statement follows by noticing that each eigenvalue in the spectrum on functions appears twice in the spectrum on vector fields—except for 0, which does not appear.
We now look at the projected kernel. Suppose ϕ : M → ℝ^N is an isometric embedding and write k^π_{ν,κ,σ²} for the projected kernel obtained by projecting the vector kernel in ℝ^N where each component is a scalar manifold Matérn kernel with the same hyperparameters ν, κ, σ². In this case, we will talk about vectors and gradients instead of differential forms (although a formulation in terms of 1-forms is also possible).

We write P : ϕ*(T ℝ^N) → T M for the orthogonal projection to the tangent bundle of M. The proof of the following helpful lemma was provided by Alan Pinoy in a private communication.

Lemma 32. Let f : M → ℝ be a smooth function and let w ∈ ℝ^N be a fixed vector. Then

  div(f(x) P_x w) = (∇f(x) + f(x) H(x)) · w,   (136)

where H is the mean curvature vector of the embedding defined in Equation (138).

Proof. We denote by ∇ the Levi–Civita connection of M and by ∇̄ that of ℝ^N. Recall that the (vector) second fundamental form of the embedding is given by

  II(u, v) = ∇̄_u v − ∇_u v ∈ ϕ*(T ℝ^N),   (137)

cf. Lee (2009, Section 4.2) (for the scalar version). Intuitively, it measures the infinitesimal curvature of M inside of ℝ^N. Notice that it is always orthogonal to the tangent space of M. The mean curvature vector is the trace of the second fundamental form: if e_1, . . . , e_{d_M} is a local orthonormal frame of M, then

  H(x) = Σ_{i=1}^{d_M} II(e_i, e_i).   (138)

With these definitions at hand, we have

  div(f(x) P_x w) = ∇f(x) · w + f(x) div(P_x w).   (139)

We use the standard differential geometric notation v(f) for the derivative of f in direction v, where f is a function and v is a vector field. For this last term, with g denoting the metric on M, we compute

  div(P_x w) = tr(∇(P_x w))   (140)
  = Σ_{i=1}^{d_M} g(∇_{e_i}(P_x w), e_i)   (141)
  = Σ_{i=1}^{d_M} ( e_i(g(P_x w, e_i)) − g(P_x w, ∇_{e_i} e_i) )   (142)
  = Σ_{i=1}^{d_M} ( e_i(P_x w · e_i) − P_x w · ∇_{e_i} e_i )   (143)
  = Σ_{i=1}^{d_M} ( e_i(w · e_i) − w · ∇_{e_i} e_i )   (144)
  = Σ_{i=1}^{d_M} ( ∇̄_{e_i} w · e_i + w · ∇̄_{e_i} e_i − w · ∇_{e_i} e_i )   (145)
  = Σ_{i=1}^{d_M} w · ( ∇̄_{e_i} e_i − ∇_{e_i} e_i )   (146)
  = w · ( Σ_{i=1}^{d_M} II(e_i, e_i) )   (147)
  = w · H(x),   (148)

where (142) comes from the fact that ∇ is the Levi–Civita connection, (143) from the fact that ϕ is an isometric embedding, and (145) comes from the fact that ∇̄ is the Levi–Civita connection of ℝ^N (the metric being given by the dot product). Inserting this in (139) concludes the proof.

Proposition 33. For the projected Matérn GP f^π_{ν,κ,σ²} ∼ GP(0, k^π_{ν,κ,σ²}) we have

  Var(div f^π_{ν,κ,σ²}(x)) =   (149)
  = (σ²/(d_M C⁰_{ν,κ})) ( Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) ∥∇f_n(x)/√λ_n∥₂² +   (150)
  + Σ_{n=0}^∞ Φ_{ν,κ}(λ_n) f_n(x)² ∥H(x)∥₂² )   (151)

whenever f^π_{ν,κ,σ²} is smooth enough for the divergence to be well-defined.

Proof. Once again, by Proposition 6 we have

  f^π_{ν,κ,σ²}(x) = (σ/√(d_M C⁰_{ν,κ})) Σ_{n=0}^∞ √(Φ_{ν,κ}(λ_n)) f_n(x) P_x w_n   (152)

where w_n ∼ N(0, I_N) is a sequence of i.i.d. multivariate normal vectors. Taking the divergence by applying Lemma 32 to each summand, we obtain

  div f^π_{ν,κ,σ²}(x) = (σ/√(d_M C⁰_{ν,κ})) Σ_{n=1}^∞ √(Φ_{ν,κ}(λ_n)) ∇f_n(x) · w_n + (σ/√(d_M C⁰_{ν,κ})) Σ_{n=0}^∞ √(Φ_{ν,κ}(λ_n)) f_n(x) H(x) · w_n.   (153)

Since the w_n are independent and

  ∥∇f_n(x) + f_n(x) H(x)∥² = ∥∇f_n(x)∥² + f_n(x)² ∥H(x)∥²   (154)

as ∇f_n(x) and H(x) are orthogonal, the statement follows.

Corollary 34. On the sphere we have

  Var(div f^π_{ν,κ,σ²}) = (σ²/2) ( Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) / (4π C⁰_{ν,κ}) + 1 )   (155)

for any x ∈ S².

Proof. By the same symmetry argument as in Corollary 31 and using the fact that for the standard unit sphere embedding we have H(x) = n(x) the unit normal vector, we obtain

  4π Var(div f^π_{ν,κ,σ²}) = (σ²/(2 C⁰_{ν,κ})) ( Σ_{n=1}^∞ λ_n Φ_{ν,κ}(λ_n) + 4π C⁰_{ν,κ} ),   (156)

which is the stated result.
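Corollary 34 is also straightforward to evaluate numerically; compare with the sketch for (131) above. The additive 1 inside the parentheses of (155) is the curvature contribution f_n(x) H(x) · w_n, which has no analogue for the intrinsic field. As before, the truncation level, names, and the spherical spectrum λ_ℓ = ℓ(ℓ + 1) with multiplicity 2ℓ + 1 are our own choices.

```python
import numpy as np

def var_div_projected_sphere(nu, kappa, sigma2=1.0, L=500):
    """Truncated evaluation of (155) on S^2."""
    ell = np.arange(0, L + 1)
    lam = ell * (ell + 1.0)
    phi = (2 * nu / kappa**2 + lam) ** (-nu - 1.0)  # Matern spectrum for d = 2
    C0 = ((2 * ell + 1) * phi).sum() / (4 * np.pi)  # Proposition 28, k = 0
    num = ((2 * ell + 1) * lam * phi).sum()         # the lam_0 = 0 term drops out
    return sigma2 / 2 * (num / (4 * np.pi * C0) + 1)

print(var_div_projected_sphere(nu=2.5, kappa=1.0))
```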
C ADDITIONAL EXPERIMENTAL DETAILS

C.1 Weather Modeling

In the weather modeling experiments, the signal was normalized before fitting by scaling it by a constant scalar value so that the mean norm of the training observations is 1.

The training points were selected by taking the observations at longitudes 90°E and 90°W, and then picking one every 180 of them, spaced regularly. This gives a final training set of 34 points. The 1220 testing points were generated randomly.⁸

Using the results of Appendix B.3, it was possible to quantify the variance of the divergence of the Gaussian vector fields arising from the (prior) kernels that were fitted in the experiments. The results are displayed in Figure 10, confirming that the absolute divergence was higher for the Hodge heat and Hodge–Matérn kernels, which explains the worse performance of these intrinsic kernels against the projected kernels for this experiment.

An examination of the fitted hyperparameters revealed that the typical length scale for the divergence-free Hodge–Matérn kernels is between κ = 0.1 and κ = 0.3. In some rare exceptions, hyperparameter fitting converged to a local optimum with large length scale, but performance was not overly affected. After visual inspection of the ground truth, we fitted divergence-free Hodge–Matérn kernels with fixed length scales of κ = 0.5 and κ = 1. The results are reported in Table 2. We notice potential small improvements in performance, but not statistically significant ones.

⁸Specifically, we used the Poisson disk sampling routine create poisson disk samples on the sphere from https://github.com/aterenin/phdthesis.
In the last experiment on this data, we fitted the Hodge-compositional Matérn kernels, i.e. linear combinations of pure-divergence and pure-curl Hodge–
Matérn kernels, as reported also in Table 1. We
see that these kernels recover almost exactly the re-
sults of fitting a divergence-free Hodge–Matérn ker-
nel. A detailed analysis of the fitted hyperparameters
shows that the resulting length scales and variances
put full weight on the divergence-free part of the ker-
nel with an almost exact match of length scales. The
only exceptions are when the length scale converges
to a local optimum, which fully explains the minimal
advantage in performance of the linear combination
kernel over the divergence-free Hodge–Matérn kernel
with ν = 1/2. This supports the fact that Hodge-
compositional Matérn kernels are able to automati-
cally recover appropriate inductive biases in such sit-
uations, mirroring their discrete counterparts in Yang
et al. (2024).

C.2 Synthetic Experiments

In addition to the weather modeling experiment de-


tailed in the main body of the paper, we ran various
experiments on synthetically generated data.
Namely, we considered samples drawn from each of
the following GPs on the sphere: Hodge–Matérn (ν =
∞, κ = 0.5), projected Matérn (ν = 1/2, κ = 0.5),
Hodge–Matérn (ν = 1/2, κ = 0.5), and curl-free Hodge–
Matérn (ν = 1/2, κ = 0.5). In addition to this, we also
used the “rotation” vector field of Figure 1b, which is
given by

  (x, y, z)ᵀ ↦ (y, −x, 0)ᵀ.   (157)

This vector field is pure curl, as it can be obtained as


the curl of the function f (x, y, z) = z on the sphere.
For each experiment, 30 training points were selected
uniformly at random from the northern hemisphere,
and 100 testing points were selected—also uniformly
at random—from the southern hemisphere. Each ex-
periment was repeated 10 times: for the rotation vec-
tor field the training and testing points were resampled
at each experiment, while for the others a new sample
was drawn for each experiment.
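A minimal sketch of the synthetic data generation just described, the rotation field (157) together with uniform sampling on a hemisphere, is given below; the GP fitting itself is omitted, and all names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotation_field(p):
    """The pure-curl 'rotation' vector field (157): (x, y, z) -> (y, -x, 0)."""
    x, y, z = p.T
    return np.stack([y, -x, np.zeros_like(z)], axis=-1)

def sample_hemisphere(n, northern=True):
    """Uniform samples on a hemisphere of S^2: normalize Gaussians, fold z."""
    p = rng.standard_normal((n, 3))
    p /= np.linalg.norm(p, axis=1, keepdims=True)
    p[:, 2] = np.abs(p[:, 2]) * (1 if northern else -1)
    return p

X_train = sample_hemisphere(30, northern=True)        # 30 training points
X_test = sample_hemisphere(100, northern=False)       # 100 testing points
y_train = rotation_field(X_train)                     # tangential by construction
print(np.abs((y_train * X_train).sum(axis=1)).max())  # ~0: tangent to the sphere
```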
In each experiment, we fitted the following GPs: pure
noise, projected Matérn (P. M.), Hodge–Matérn (H.–
M.), divergence-free Hodge–Matérn (div-free H.–M.),
and curl-free Hodge–Matérn (curl-free H.–M.), all with
ν = 1/2, ∞. All kernels also fitted variance and an ad-
ditive noise.
The results are reported in Tables 3 and 4. We see
that for each experiment based on samples, the respective kernel performed best. On the rotation vector field, Hodge–Matérn and divergence-free Hodge–Matérn vastly outperformed all other kernels.
Kernel               | H.–M.–1/2 sample | H.–M.–∞ sample | P. M.–1/2 sample | Rotation field | curl-free H.–M.–1/2 sample
                     | Mean   Std       | Mean   Std     | Mean   Std       | Mean   Std     | Mean   Std
Pure noise           | 0.17   0.04      | 1.18   0.32    | 0.22   0.06      | 0.68   0.02    | 0.08   0.02
P. M.–1/2            | 0.14   0.03      | 0.87   0.42    | 0.16   0.05      | 0.06   0.02    | 0.07   0.02
P. M.–∞              | 0.19   0.04      | 0.71   0.26    | 0.20   0.07      | 0.34   0.36    | 0.07   0.03
H.–M.–1/2            | 0.14   0.03      | 0.84   0.38    | 0.16   0.05      | 0.02   0.01    | 0.07   0.02
H.–M.–∞              | 0.17   0.04      | 0.65   0.25    | 0.20   0.06      | 0.00   0.00    | 0.08   0.02
curl-free H.–M.–1/2  | 0.20   0.06      | 1.15   0.47    | 0.22   0.07      | 0.68   0.02    | 0.05   0.01
curl-free H.–M.–∞    | 0.16   0.04      | 1.11   0.37    | 0.23   0.06      | 0.69   0.03    | 0.08   0.02
div-free H.–M.–1/2   | 0.15   0.05      | 1.00   0.60    | 0.19   0.05      | 0.01   0.00    | 0.08   0.02
div-free H.–M.–∞     | 0.16   0.04      | 0.81   0.46    | 0.18   0.04      | 0.00   0.00    | 0.08   0.02
Table 3: MSE for synthetic experiments. The columns are datasets, the rows are models.

Kernel               | H.–M.–1/2 sample | H.–M.–∞ sample | P. M.–1/2 sample | Rotation field | curl-free H.–M.–1/2 sample
                     | Mean   Std       | Mean   Std     | Mean   Std       | Mean   Std     | Mean    Std
Pure noise           | 0.41   0.25      | 2.39   0.34    | 0.67   0.29      | 1.76   0.04    | -0.31   0.25
P. M.–1/2            | 0.15   0.26      | 2.18   0.78    | 0.31   0.27      | -0.65  0.20    | -0.51   0.34
P. M.–∞              | 0.61   0.39      | 1.47   0.63    | 0.58   0.40      | -3.43  5.47    | -0.38   0.73
H.–M.–1/2            | 0.13   0.24      | 2.12   0.71    | 0.33   0.28      | -1.41  0.15    | -0.58   0.28
H.–M.–∞              | 0.41   0.25      | 1.27   0.48    | 0.53   0.31      | -9.42  0.04    | -0.31   0.25
curl-free H.–M.–1/2  | 0.66   0.59      | 2.66   0.71    | 0.67   0.37      | 1.77   0.03    | -0.79   0.21
curl-free H.–M.–∞    | 0.38   0.30      | 2.55   0.58    | 0.73   0.32      | 1.77   0.05    | -0.31   0.25
div-free H.–M.–1/2   | 0.25   0.31      | 2.37   1.16    | 0.48   0.31      | -2.20  0.17    | -0.37   0.31
div-free H.–M.–∞     | 0.33   0.25      | 1.52   0.66    | 0.46   0.18      | -9.63  0.00    | -0.31   0.25
Table 4: Predictive NLL for synthetic experiments. The columns are datasets, the rows are models.
(a) Ground truth (January 2010) and observations.

(b) Predictive mean and uncertainty (blue is low and yellow is high).

(c) Posterior sample.

Figure 8: Robinson projection of Figures 6a, 6b and 7a displaying the ground truth, observations, predictive mean, uncertainty, and a posterior sample of the GP with projected Matérn kernel with ν = 1/2 and length scale κ = 0.5. The vectors in the sample are scaled independently from the ground truth and predictive mean. Visually, the sample does not have structures reminiscent of those of the ground truth.
(a) Ground truth (January 2010) and observations.

(b) Predictive mean and uncertainty (blue is low and yellow is high).

(c) Posterior sample.

Figure 9: Robinson projection of Figures 6b, 6c and 7b displaying the ground truth, observations, predictive mean, uncertainty, and a posterior sample of the GP with divergence-free Hodge–Matérn kernel with ν = 1/2 and length scale κ = 0.5. The vectors in the sample are scaled independently from the ground truth and predictive mean. Visually, and contrary to Figure 8, the sample appears to have structures reminiscent of those of the ground truth, such as a strong west-to-east current in the southern hemisphere.
[Figure 10: a plot of Var(div f) (vertical axis, 0.0 to 3.5) against experiment number (0 to 11), with one series per kernel: P. M.–∞, P. M.–1/2, H.–M.–∞, and H.–M.–1/2.]

Figure 10: Variance of divergence for the prior kernels with fitted hyperparameters in the weather modeling
experiments. Note that the variance of the divergence does not depend on the input location in this case. The
divergence-free kernels and the kernels with fixed length scale are not represented here.
