Towards Geometric Deep Learning I - On The Shoulders of Giants

2022/7/7 下午1:18 Towards Geometric Deep Learning I: On the Shoulders of Giants | by Michael Bronstein | Jul, 2022 | Towards Data
owards Data Science
Open in app Get started
Published in Towards Data Science
Michael Bronstein Follow
Jul 4 · 14 min read · Listen
Save
ORIGINS OF GEOMETRIC DEEP LEARNING
Towards Geometric Deep Learning I: On the

Shoulders of Giants
Geometric Deep Learning approaches a broad class of ML problems
from the perspectives of symmetry and invariance, providing a
common blueprint for neural network architectures as diverse as
CNNs, GNNs, and Transformers. In a new series of posts, we study
how these ideas have emerged through history from ancient Greek
geometry to Graph Neural Networks.
56
Image: based on Shutterstock.
https://towardsdatascience.com/towards-geometric-deep-learning-i-on-the-shoulders-of-giants-726c205860f5 1/13
2022/7/7 下午1:18 Towards Geometric Deep Learning I: On the Shoulders of Giants | by Michael Bronstein | Jul, 2022 | Towards Data Science
What is in common between snowflakes and the Standard Model? Symmetry. In the first
post from the “Towards Geometric Deep Learning series,” we discuss how the concept of
symmetry has helped organise the zoo of nineteenth-century geometries and revolutionise
theoretical physics. This post is based on the introduction chapter of the book M. M.
Bronstein, J. Bruna, T. Cohen, and P. Veličković, Geometric Deep Learning (to appear with
MIT Press upon completion) and accompanies our course in the African Masters in
Machine Intelligence (AMMI). See our previous post summarising the concept of Geometric
Deep Learning.
T he last decade has witnessed an experimental revolution in data science and

machine learning, epitomised by deep learning methods. Indeed, many high-
dimensional learning tasks previously thought to be beyond reach — computer vision,
playing Go, or protein folding — are in fact feasible with appropriate computational
scale. Remarkably, the essence of deep learning is built from two simple algorithmic
principles: first, the notion of representation or feature learning, whereby adapted,
often hierarchical, features capture the appropriate notion of regularity for each task,
and second, learning by gradient descent-type optimisation, typically implemented as
backpropagation.
While learning generic functions in high dimensions is a cursed estimation problem,

most tasks of interest are not generic and come with essential predefined regularities
arising from the underlying low-dimensionality and structure of the physical world.
Geometric Deep Learning is concerned with exposing these regularities through
unified geometric principles that can be applied throughout a broad spectrum of
applications.
Exploiting the known symmetries of a large system is a powerful and classical remedy
against the curse of dimensionality, and forms the basis of most physical theories. Deep
learning systems are no exception, and since the early days, researchers have adapted
neural networks to exploit the low-dimensional geometry arising from physical
measurements, e.g. grids in images, sequences in time-series, or position and
momentum in molecules, and their associated symmetries, such as translation or
rotation.
Since these ideas have deep roots in science, we will attempt to see how they have
evolved throughout history, culminating in a common blueprint that can be applied to
most of today’s popular neural network architectures.
Order, Beauty, and Perfection
“Symmetry, as wide or as narrow as you may define

its meaning, is one idea by which man through the
ages has tried to comprehend and create order,
beauty, and perfection.” — Hermann Weyl (1952)
This somewhat poetic definition of symmetry is given in the eponymous book of the
great mathematician Hermann Weyl [1], his Schwanengesang on the eve of retirement
from the Institute for Advanced Study in Princeton. Weyl traces the special place
symmetry has occupied in science and art to the ancient times, from Sumerian
symmetric designs to the Pythagoreans, who believed the circle to be perfect due to its
rotational symmetry. Plato considered the five regular polyhedra bearing his name
today so fundamental that they must be the basic building blocks shaping the material
world.
Yet, though Plato is credited with coining the term συμμετρία, which literally
translates as ‘same measure’, he used it only vaguely to convey the beauty of proportion
in art and harmony in music. It was the German astronomer and mathematician
Johannes Kepler to attempt the first rigorous analysis of the symmetric shape of water
crystals. In his treatise ‘On the Six-Cornered Snowflake’ [2], he attributed the six-fold
dihedral structure of snowflakes to
hexagonal packing of particles — an idea that though preceded the clear
understanding of how matter is formed, still holds today as the basis of crystallography
[3].
Plato (left) believed that symmetric polyhedra (“Platonic solids”) were the fundamental building blocks of
Nature. Johannes Kepler (right) attributed for the first time the six-fold symmetry of water crystals to the
hexagonal packing of particles, antedating modern crystallography.
In modern mathematics, symmetry is almost univocally expressed in the language of

group theory. The origins of this theory are usually attributed to Évariste Galois, who
coined the term and used it to study the solvability of polynomial equations in the
1830s [4]. Two other names associated with group theory are those of Sophus Lie and
Felix Klein, who met and worked fruitfully together for a period of time [5]. The
former would develop the theory of continuous symmetries that today bears his name
(Lie groups); the latter proclaimed group theory to be the organising principle of
geometry in his Erlangen Programme. Given that Klein’s Programme is the inspiration
for Geometric Deep Learning, it is worthwhile to spend more time on its historical
context and revolutionary impact.
Évariste Galois (left) and his letter to a friend describing group theory on the night before his fatal duel. Felix
Klein (right) and the frontispiece of the research prospectus prepared for his professorial appointment,
which has entered the history of mathematics as the “Erlangen Programme.” Klein’s portrait: Ihor Gorskiy.
A Strange New World
T he foundations of modern geometry were formalised in ancient Greece nearly

2300 years ago by Euclid in a treatise named the Elements. Euclidean geometry
(which is still taught at school as ‘the geometry’) was a set of results built upon five
intuitive axioms or postulates. The Fifth Postulate — stating that it is possible to pass
only one line parallel to a given line through a point outside it — appeared less obvious
and an illustrious row of mathematicians broke their teeth trying to prove it since
antiquity, to no avail.
An early approach to the problem of the parallels appears in the eleventh-century

An early approach to the problem of the parallels appears in the eleventh century
Persian treatise ‘A commentary on the difficulties concerningOpen
the in
postulates
app ofstarted
Get Euclid’s
Elements’ by Omar Khayyam [6]. The eighteenth-century Italian Jesuit priest Giovanni
Saccheri was likely aware of this previous work judging by the title of his own work
Euclides ab omni nævo vindicatus (‘Euclid cleared of every stain’).
Like Khayyam, he considered the summit angles of a quadrilateral with sides

perpendicular to the base. The conclusion that acute angles lead to infinitely many
non-intersecting lines that can be passed through a point not on a straight line seemed
so counter-intuitive that he rejected it as ‘repugnatis naturæ linæ rectæ’ (‘repugnant to
the nature of straight lines’) [7].
Frontispiece of Giovanni Saccheri’s ‘Euclides vindicatus’ and the passage rejecting hyperbolic geometry as
‘repugnatis naturæ linæ rectæ.’
The nineteenth century has brought the realisation that the Fifth Postulate is not
essential and one can construct alternative geometries based on different notions of
parallelism. One such early example is projective geometry, arising, as the name
suggests, in perspective drawing and architecture. In this geometry, points and lines
are interchangeable, and there are no parallel lines in the usual sense: any lines meet
in a ‘point at infinity.’ While results in projective geometry had been known since
antiquity, it was first systematically studied by Jean-Victor Poncelet in 1812 [8].
T he credit for the first construction of a true non-Euclidean geometry is disputed.

The princeps mathematicorum Carl Friedrich Gauss worked on it around 1813
but never published any results [9]. The first publication on the subject of non-
Euclidean geometry was ‘On the Origins of Geometry’ by the Russian mathematician
Nikolai Lobachevsky [10]. In this work, he considered the Fifth Postulate an arbitrary
limitation and proposed an alternative one, that more than one line can pass through a
point that is parallel to a given one. Such construction requires a space with negative
curvature — what we now call a hyperbolic space — a notion that was still not fully
mastered at that time [11].

János Bolyai’s letter to his father in Hungarian dated 3 November 1823 (left), announcing his discovery of
hyperbolic geometry. Nikolai Lobachevsky (right) and the first page of his work “On the Origins of Geometry,”
published in 1829.
Lobachevsky’s idea appeared heretical and he was openly derided by colleagues [12].
A similar construction was independently discovered by the Hungarian János Bolyai,
who published it in 1832 under the name ‘absolute geometry.’ In an earlier letter to his
father dated 1823, he wrote enthusiastically about this new development:
“I have discovered such wonderful things that I was

amazed… out of nothing I have created a strange
new world.” — János Bolyai (1823)
In the meantime, new geometries continued to come out like from a cornucopia.
August Möbius [13], of the eponymous surface fame, studied affine
geometry. Gauss’ student Bernhardt Riemann introduced a very broad class of
geometries — called today Riemannian is his honour — in his habilitation lecture,
subsequently published under the title Über die Hypothesen, welche der Geometrie zu
Grunde liegen (‘On the Hypotheses on which Geometry is Based’) [14]. A special case of
Riemannian geometry is the ‘elliptic’ geometry of the sphere, another construction
violating Euclid’s Fifth Postulate, as there is no point on the sphere through which a
line can be drawn that never intersects a given line.
Towards the second half of the nineteenth century, Euclid’s monopoly over geometry
was completely shuttered. New types of geometry (Euclidean, affine, projective,
hyperbolic, spherical) emerged and became independent fields of study. However, the
relationships of these geometries and their hierarchy were not understood
relationships of these geometries and their hierarchy were not understood.
was in this exciting but messy situation that Felix Klein came forth, with a genius
It insight to use group theory as an algebraic abstraction of symmetry to organise

the ‘geometric zoo.’ Only 23 years old at the time of his appointment as a
professor in Erlangen, Klein, as it was customary in German universities, was requested
to deliver an inaugural research programme — titled Vergleichende Betrachtungen über
neuere geometrische Forschungen (‘A comparative review of recent researches in
geometry’), it has entered the annals of mathematics as the “Erlangen Programme”
[15].
The breakthrough insight of Klein was to approach the definition of geometry as the
study of invariants, or in other words, structures that are preserved under a certain
type of transformations (symmetries). Klein used the formalism of group theory to
define such transformations and use the hierarchy of groups and their subgroups in
order to classify different geometries arising from them.
Klein’s Erlangen Programme defined geometry as the space with a group of transformations. This allowed
classifying different types of geometry.
It appeared that Euclidean geometry is a special case of affine geometry, which is in

turn a special case of projective geometry (or, in terms of group theory, the Euclidean
group is a subgroup of the projective group). Klein’s Erlangen Programme was in a
sense the ‘second algebraisation’ of geometry (the first one being the analytic geometry
of René Descartes and the method of coordinates bearing his Latinised name Cartesius)
that allowed to produce results impossible by previous methods.
More general Riemannian geometry was explicitly excluded from Klein’s unified
geometric picture, and it took another fifty years before it was integrated, largely
thanks to the work of Élie Cartan in the 1920s. Furthermore, Category Theory, now
thanks to the work of Élie Cartan in the 1920s. Furthermore, Category Theory, now
pervasive in pure mathematics, can be “regarded as a continuation of the Klein
Erlangen Programme, in the sense that a geometrical space with its group of
transformations is generalized to a category with its algebra of mappings”, in the words
of its creators Samuel Eilenberg and Saunders Mac Lane [16].
The Theory of Everything
C onsidering projective geometry the most general of all, Klein complained in his
Vergleichende Betrachtungen [17]:
“How persistently the mathematical physicist

disregards the advantages afforded him in many
cases by only a moderate cultivation of the
projective view.” — Felix Klein (1872)
His advocating for the exploitation of geometry and the principles of symmetry in
physics has foretold the following century that was truly revolutionary for the field.
In Göttingen [18], Klein’s colleague Emmy Noether [19] proved that every
differentiable symmetry of the action of a physical system has a corresponding
conservation law [20]. It was, by all means, a stunning result: beforehand, meticulous
experimental observation was required to discover fundamental laws such as the
conservation of energy, and even then, it was an empirical result not coming from
anywhere. Noether’s Theorem — “a guiding star to 20th and 21st century physics,” in
the words of the Nobel laureate Frank Wilczek — allowed for example to show that the
conservation of energy emerges from the translational symmetry of time, a rather
intuitive idea that the results of an experiment should not depend on whether
it is conducted today or tomorrow.
Hermann Weyl (left) and a 1918 postcard from Einstein arguing with his initially proposed gauge theory.
Get started
Open in app
Emmy Noether (right) and the publication from the same year containing her celebrated theorem. Noether’s
portrait: Ihor Gorskiy.
Another symmetry associated with charge conservation, the global gauge invariance of
the electromagnetic field, first appeared in Maxwell’s formulation of electrodynamics
[21]; however, its importance initially remained unnoticed. The same Hermann Weyl,
who wrote so dithyrambically about symmetry, is the one who first introduced the
concept of gauge invariance in physics in the early 20th century [22], emphasizing its
role as a principle from which electromagnetism can be derived. It took several
decades until this fundamental principle — in its generalised form developed by Yang
and Mills [23] — proved successful in providing a unified framework to describe the
quantum-mechanical behaviour of electromagnetism and the weak and strong forces,
finally culminating in the Standard Model that captures all the fundamental forces of
nature but gravity. As succinctly put by another Nobel-winning physicist, Philip
Anderson [24],
“it is only slightly overstating the case to say that

physics is the study of symmetry” — Philip Anderson
(1972)
An impatient reader might wonder at this point, what does all this excursion
into the history of geometry and physics, however exciting it might be,
have to do with deep learning? As we will see in the next part, the geometric notions of
symmetry and invariance have been recognised as crucial even in early attempts to do
‘pattern recognition,’ and it is fair to say that geometry has accompanied the nascent
field of artificial intelligence from its very beginning.
[1] H. Weyl, Symmetry (1952), Princeton University Press.
[2] Fully titled Strena, Seu De Nive Sexangula (’New Year’s gift, or on the Six-Cornered
Snowflake’) was, as suggested by the title, a small booklet sent by Kepler in 1611 as a
Christmas gift to his patron and friend Johannes Matthäus Wackher von Wackenfels.
Christmas gift to his patron and friend Johannes Matthäus Wackher von Wackenfels.
[3] P. Ball, In retrospect: On the six-cornered snowflake (2011), Nature 480

(7378):455–455.
[4] Galois famously described the ideas of group theory (which he considered in the
context of finding solutions to polynomial equations) and coined the term “group”
(groupe in French) in a letter to a friend written on the eve of his fatal duel. He asked to
communicate his ideas to prominent mathematicians of the time, expressing the hope
that they would be able to “‘decipher all this mess’” (“‘déchiffrer tout ce gâchis”). Galois
died two days later from wounds suffered in the duel aged only 20, but his work has
been transformational in mathematics.
[5] See biographic notes in R. Tobies, Felix Klein — Mathematician, Academic

Organizer, Educational Reformer (2019), The Legacy of Felix Klein 5–21, Springer.
[6] Omar Khayyam is nowadays mainly remembered as a poet and author of the
immortal line “‘a flask of wine, a book of verse, and thou beside me.”
[7] The publication of Euclides vindicatus required the approval of the Inquisition,
which came in 1733 just a few months before the author’s death. Rediscovered by the
Italian differential geometer Eugenio Beltrami in the nineteenth century, Saccheri’s
work is now considered an early almost-successful attempt to construct hyperbolic
geometry.
[8] Poncelet was a military engineer and participant in Napoleon’s Russian campaign,
where he was captured and held as a prisoner until the end of the war. It was during
this captivity period that he wrote the Traité des propriétés
projectives des figures (‘Treatise on the projective properties of figures,’ 1822) that
revived the interest in projective geometry. Earlier foundation work on this subject was
done by his compatriot Gérard Desargues in 1643.
[9] In the 1832 letter to Farkas Bolyai following the publication of his son’s results,
Gauss famously wrote: “To praise it would amount to praising myself. For the entire
content of the work coincides almost exactly with my own meditations which have
occupied my mind for the past thirty or thirty-five years.” Gauss was also the first to
use the term ‘non-Euclidean geometry,’ referring strictu sensu to his own construction
of hyperbolic geometry. See
R. L. Faber, Foundations of Euclidean and non-Euclidean geometry (1983), Dekker and
the blog post in Cantor’s Paradise.

[10] Н. И. Лобачевский, О началах геометрии (1829).
[11] A model for hyperbolic geometry known as the pseudosphere, a surface with
constant negative curvature, was shown by Eugenio Beltrami, who also proved that
hyperbolic geometry was logically consistent. The term ‘hyperbolic geometry’ was
introduced by Felix Klein.
[12] For example, an 1834 pamphlet signed only with the initials “S.S.” (believed by
some to belong to Lobachevsky’s long-time opponent Ostrogradsky) claimed that
Lobachevsky made “an obscure and heavy theory” out of “the lightest and clearest
chapter of mathematics, geometry,” wondered why one would print such “ridiculous
fantasies,” and suggested that the book was a “joke or satire.”
[13] A. F. Möbius, Der barycentrische Calcul (1827).
[14] B. Riemann, Über die Hypothesen, welche der Geometrie zu Grunde liegen (1854).
See English translation.
[15] According to a popular belief, repeated in many sources including Wikipedia, the
Erlangen Programme was delivered in Klein’s inaugural address in October 1872. Klein
indeed gave such a talk (though on December 7, 1872), but it was for a non-
mathematical audience and concerned primarily his ideas of mathematical education;
see[4]. The name “Programme” comes from the subtitle of the published brochure
[17], Programm zum Eintritt in die philosophische Fakultät und den Senat der k.
Friedrich-Alexanders-Universität zu Erlangen (‘Programme for entry into the
Philosophical Faculty and the Senate of the Emperor Friedrich-Alexander University of
Erlangen’).
[16] S. Eilenberg and S. MacLane, General theory of natural equivalences (1945),

Trans. AMS 58(2):231–294. See also J.-P. Marquis, Category Theory and Klein’s
Erlangen Program (2009), From a Geometrical Point of View 9–40, Springer.
[17] F. Klein, Vergleichende Betrachtungen über neuere geometrische Forschungen

(1872). See English translation.
[18] At the time, Göttingen was Germany’s and the world’s leading centre of
mathematics. Though Erlangen is proud of its association with Klein, he stayed there
for only three years, moving in 1875 to the Technical University of Munich (then called
for only three years, moving in 1875 to the Technical University of Munich (then called
Technische Hochschule), followed by Leipzig (1880), and finally settling down in

Göttingen from 1886 until his retirement.
[19] Emmy Noether is rightfully regarded as one of the most important women in
mathematics and one of the greatest mathematicians of the twentieth century. She was
unlucky to be born and live in an epoch when the academic world was still entrenched
in the medieval beliefs of the unsuitability of women for science. Her career as one of
the few women in mathematics having to overcome prejudice and contempt was a
truly trailblazing one. It should be said to the credit of her male colleagues that some of
them tried to break the rules. When Klein and David Hilbert first unsuccessfully
attempted to secure a teaching position for Noether at Göttingen, they met fierce
opposition from the academic hierarchs. Hilbert reportedly retorted sarcastically to
concerns brought up in one such discussion: “I do not see that the sex of the candidate
is an argument against her admission as a Privatdozent. After all, the Senate is not a
bathhouse”(see C. Reid, Courant in Göttingen and New York: The Story of an Improbable
Mathematician (1976), Springer). Nevertheless, Noether enjoyed great esteem among
her close collaborators and students, and her male peers in Göttingen affectionately
referred to her as “Der Noether,” in the masculine (see C. Quigg, Colloquium: A Century
of Noether’s Theorem (2019), arXiv:1902.01989).
[20] E. Noether, Invariante Variationsprobleme (1918), König Gesellsch. d. Wiss. zu

Göttingen, Math-Phys. 235–257. See English translation.
[21] J. C. Maxwell, A dynamical theory of the electromagnetic field (1865),

Philosophical Transactions of the Royal Society of London 155:459–512.
[22] Weyl first conjectured (incorrectly) in 1919 that invariance under the change of
scale or “gauge” was a local symmetry of electromagnetism. The term gauge, or Eich in
German, was chosen by analogy to the various track gauges of railroads. After the
development of quantum mechanics, he modified the gauge choice by replacing the
scale factor with a change of wave phase in iH. Weyl, Elektron und gravitation (1929),
Zeitschrift für Physik 56 (5–6): 330–352. See N. Straumann, Early history of gauge
theories and weak interactions (1996), arXiv:hep-ph/9609230.
[23] C.-N. Yang and R. L. Mills, Conservation of isotopic spin and isotopic gauge
invariance (1954), Physical Review 96 (1):191.
[24] P. W. Anderson, More is different (1972), Science 177 (4047): 393–396.

The portraits of Klein and Noether were hand-drawn by Ihor Gorskiy. Detailed lecture
materials on Geometric Deep Learning are available on the project webpage. See Michael’s
other posts in Towards Data Science, subscribe to his posts, get Medium membership, or
follow Michael, Joan, Taco, and Petar on Twitter.
Sign up for The Variable

By Towards Data Science
Every Thursday, the Variable delivers the very best of Towards Data Science: from hands-on tutorials
and cutting-edge research to original features you don't want to miss. Take a look.
Get this newsletter
About Help Terms Privacy
Get the Medium app

Towards Geometric Deep Learning I - On The Shoulders of Giants

Uploaded by

Copyright:

Available Formats

Towards Geometric Deep Learning I - On The Shoulders of Giants

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Towards Geometric Deep Learning I - On The Shoulders of Giants

Uploaded by

Copyright:

Available Formats

2022/7/7 下午1:18 Towards Geometric Deep Learning I: On the Shoulders of Giants | by Michael Bronstein | Jul, 2022 | Towards Data

owards Data Science

Open in app Get started

Published in Towards Data Science

Michael Bronstein Follow

Jul 4 · 14 min read · Listen

ORIGINS OF GEOMETRIC DEEP LEARNING

Towards Geometric Deep Learning I: On the

Open in app Get started

T he last decade has witnessed an experimental revolution in data science and

While learning generic functions in high dimensions is a cursed estimation problem,

Order, Beauty, and Perfection

“Symmetry, as wide or as narrow as you may define

Open in app Get started

In modern mathematics, symmetry is almost univocally expressed in the language of

A Strange New World

T he foundations of modern geometry were formalised in ancient Greece nearly

An early approach to the problem of the parallels appears in the eleventh-century

Like Khayyam, he considered the summit angles of a quadrilateral with sides

T he credit for the first construction of a true non-Euclidean geometry is disputed.

mastered at that time [11].

“I have discovered such wonderful things that I was

It insight to use group theory as an algebraic abstraction of symmetry to organise

It appeared that Euclidean geometry is a special case of affine geometry, which is in

The Theory of Everything

“How persistently the mathematical physicist

“it is only slightly overstating the case to say that

[1] H. Weyl, Symmetry (1952), Princeton University Press.

[3] P. Ball, In retrospect: On the six-cornered snowflake (2011), Nature 480

[5] See biographic notes in R. Tobies, Felix Klein — Mathematician, Academic

the blog post in Cantor’s Paradise.

[10] Н. И. Лобачевский, О началах геометрии (1829).

[13] A. F. Möbius, Der barycentrische Calcul (1827).

[16] S. Eilenberg and S. MacLane, General theory of natural equivalences (1945),

[17] F. Klein, Vergleichende Betrachtungen über neuere geometrische Forschungen

Technische Hochschule), followed by Leipzig (1880), and finally settling down in

[20] E. Noether, Invariante Variationsprobleme (1918), König Gesellsch. d. Wiss. zu

[21] J. C. Maxwell, A dynamical theory of the electromagnetic field (1865),

[24] P. W. Anderson, More is different (1972), Science 177 (4047): 393–396.

Sign up for The Variable

Get this newsletter

About Help Terms Privacy

Get the Medium app

You might also like