J Biol Inorg Chem (2008) 13:157–170
DOI 10.1007/s00775-007-0318-7
MINIREVIEW
Iron–sulfur protein folds, iron–sulfur chemistry, and evolution
Jacques Meyer
Received: 5 September 2007 / Accepted: 25 October 2007 / Published online: 9 November 2007
© SBIC 2007
Abstract An inventory of unique local protein folds
around Fe–S clusters has been derived from the analysis of
protein structure databases. Nearly 50 such folds have been
identified, and over 90% of them harbor low-potential
[2Fe–2S]2+,+ or [4Fe–4S]2+,+ clusters. In contrast, high-potential Fe–S clusters, notwithstanding their structural
diversity, occur in only three different protein folds. These
observations suggest that the extant population of Fe–S
protein folds has to a large extent been shaped in the
reducing iron- and sulfur-rich environment that is believed
to have predominated on this planet until approximately
two billion years ago. High-potential active sites are then
surmised to be rarer because they emerged later, in a more
oxidizing biosphere, in conditions where iron and sulfide
had become poorly available, Fe–S clusters were less stable, and in addition faced competition from heme iron and
copper active sites. Among the low-potential Fe–S active
sites, protein folds hosting [4Fe–4S]2+,+ clusters outnumber
those with [2Fe–2S]2+,+ ones by a factor of 3 at least. This
is in keeping with the higher chemical stability and versatility of the tetranuclear clusters, compared with the
binuclear ones. It is therefore suggested that, at least while
novel Fe–S sites are evolving within proteins, the intrinsic
chemical stability of the inorganic moiety may be more
important than the stabilizing effect of the polypeptide
chain. The discovery rate of novel Fe–S-containing protein
folds underwent a sharp increase around 1995, and has
remained stable to this day. The current trend suggests that
the mapping of the Fe–S fold space is not near completion,
in agreement with predictions made for protein folds in
general. Altogether, the data collected and analyzed here
suggest that the extant structural landscape of Fe–S proteins has been shaped to a large extent by primeval
geochemical conditions on one hand, and iron–sulfur
chemistry on the other.
Keywords Ferredoxin Rubredoxin Hydrogenase
Iron–sulfur Bioenergetics Evolution
Abbreviations
CoA coenzyme A
EPR electron paramagnetic resonance
Fd ferredoxin
FNR fumarate nitrate regulator
GABA γ-aminobutyric acid
HiPIP high potential iron protein
PDB Protein Data Bank
PRPP phosphoribosylpyrophosphate
Rd rubredoxin
tRNA transfer RNA
Introduction
J. Meyer (&)
Laboratoire de Chimie et Biologie des Métaux,
IRTSV, Commissariat à l’Energie
Atomique/CNRS/Université Joseph Fourier,
UMR5249, CEA-Grenoble,
38054 Grenoble, France
e-mail: jacques.meyer@cea.fr; jccfmeyer@numericable.fr
Fe–S clusters are ubiquitous and essential components of
living cells. Proteins containing Fe–S active sites were first
detected as electron paramagnetic resonance signatures in
mitochondrial membranes [1], and shortly thereafter small
soluble ferredoxins (Fd) were isolated [2, 3]. Within only a
few years, a variety of other small Fe–S proteins were
characterized, and were soon found to contain iron and
inorganic sulfide [4]. Over the following decades, studies
implementing X-ray crystallography, chemical synthesis of
structural analogs, and spectroscopy revealed the structural
frameworks and chemical and magnetic properties of a
variety of Fe–S clusters [5–7]. While most of the latter
clusters consist of one to four iron atoms, some larger ones
contain up to eight irons [8]. Metals other than iron (e.g.,
nickel or molybdenum) may be part of or bound to Fe–S
clusters [8, 9]. Fe–S clusters have a marked preference for
thiolate ligation [6], and accordingly cysteinyl sulfur is by
far the most frequently implemented ligand of Fe–S active
sites. Nonetheless, histidine and to a lesser extent glutamine, serine, or arginine ligation has been evidenced in
several cases [10] (Table 1). Presently known Fe–S proteins range in size from 6 kDa to over 500 kDa, contain up
to nine Fe–S clusters [11], are present in all kinds of cells
and cellular compartments, and are involved in all sorts of
cellular functions [7, 12–14]. While Fe–S clusters are
intrinsically oxygen-sensitive, their stability in proteins is
vastly dependent on the polypeptide matrix: some are stable in air for weeks (e.g., thermophilic Fd [15]), while
others are destroyed within tens of seconds (e.g., nitrogenase [16]).
Biomimetic Fe–S chemistry has been an extremely
fruitful approach in the research on Fe–S proteins [6].
Thorough investigations over more than three decades have
brought forth detailed models for nearly all Fe–S active
sites in proteins, and insights into most aspects of their
structure and function [5, 6]. These studies have demonstrated that Fe–S active-site analogs can exist in the
absence of polypeptide chains, unveiled the plasticity and
dynamics of Fe–S clusters, and produced a host of structures interconnecting solid-state, inorganic, and biological
Fe–S chemistry [6, 17, 18].
Another facet of Fe–S chemistry is its proposed
involvement in the production of prebiotic organic molecules [19], and possibly in the evolution of protocellular
systems [20]. These hypotheses are not unrealistic in view
of the geochemical conditions that most likely prevailed at
that time [21]; such a chemical environment would have
favored the spontaneous assembly of Fe–S active sites
within primitive macromolecules (early proteins or their
precursors) possessing adequately positioned thiolate
ligands. Some simple Fe–S proteins [2] displaying
remarkably primitive features in their sequence [22] could
then be regarded as ‘‘fossils’’ endowed with the potential to
report on these early events. Likewise, the overwhelming
diversity of extant Fe–S proteins is suggestive of an early
and close association of Fe–S chemistry with the development of life on this planet.
Insights into these questions may be sought by analyzing
the structural diversity of Fe–S proteins. The latter is
probably best represented, even though incompletely, in
the several hundred Fe–S protein entries deposited with the
Protein Data Bank (PDB; http://www.rcsb.org). However,
rather than whole structures which often consist of several
domains or subunits and may contain up to nine Fe–S
clusters [11], we have considered here folds around individual Fe–S clusters. These folds, consisting of either
entire small proteins or parts of larger ones, may be
regarded as the basic units of biological Fe–S structural
diversity: indeed, while Fe–S clusters assume but a few
different structural frameworks, the variety of polypeptide
folds that host them is considerably larger. It is shown here
that the number of distinct Fe–S folds is close to 50: 35
with [4Fe–4S] clusters, 11 with [2Fe–2S] clusters, and only
one with a [1Fe] site. The discussion will focus on how
prebiotic and Fe–S chemistry, as well as protein evolution,
may have affected the nature and distribution of protein
folds around extant Fe–S clusters.
Classic Fe–S proteins and folds
This section includes small and relatively simple Fe–S
proteins that were discovered in the early and mid 1960s,
mostly thanks to their high stability and widespread distribution. For these very reasons they yielded to structural
analysis at an early stage and revealed the frameworks of
the now classic [4Fe–4S], [2Fe–2S], and [1Fe] active sites
(Fig. 1), as well as the most common Fe–S-containing
protein folds.
2[4Fe–4S] ferredoxin
The first isolated Fe–S protein was a small (55 residues)
clostridial Fd [2] that was subsequently shown to contain
two [4Fe–4S] clusters [23], hence the designation 2[4Fe–
4S] Fd. These proteins are low-potential (-150 to
-700 mV) electron carriers implementing the [4Fe-4S]2+,+
redox transition [7]. They are therefore mostly involved in
anaerobic metabolic pathways and in the more reducing
parts of photosynthetic and aerobic electron transfer chains.
Clostridial 2[4Fe–4S] Fd is a compact ellipsoid where the
polypeptide chain is tightly wrapped around the two [4Fe–
4S] clusters. Its iron content (eight atoms for 55 amino
acids) is unusually high; thus, the inorganic moiety makes
up a large part of the structure [23, 24], and much of the
stability is provided by the interactions between the polypeptide chain and the inorganic core. The structure displays
a conspicuous twofold symmetry resulting from an ancient
sequence duplication (see later). Each half of the sequence
contributes cysteine ligands to both clusters; thus, the two
sets of cysteine ligands are not consecutive in the sequence,
but overlapping, and both clusters are accommodated in a
single domain. The 2[4Fe–4S] Fd fold is in fact the only
one, among those identified and listed in Table 1, that
contains a pair of Fe–S clusters, rather than a single one.
2[4Fe–4S] Fd assume a quite common βαββαβ
topology; hence, the latter has been dubbed ‘‘ferredoxin
fold’’ [25]. However, except for the shared topology, most
of the so-called Fd folds differ structurally from the genuine 2[4Fe–4S] Fd fold and are unlikely to bear any
phylogenetic relationship with it.

Fig. 1 a Structures of Fe–S active sites. Tetrahedral FeS4 coordination and diversity of iron nuclearity are prominent characteristics of these active sites. b Positions of the Sγ atoms of the cysteine ligands. The [3Fe–4S] cluster (not shown here) is derived from the [4Fe–4S] framework by removal of one iron and its cysteine ligand [5, 6]

Table 1 Unique protein folds around [4Fe–4S], [2Fe–2S], and [1Fe] active sites

| Fold^a | Date of discovery^b | PDB entries^c | Protein | Quaternary structure^d | Fe–S ligands | Other Fe–S folds in same protein |
|---|---|---|---|---|---|---|
| [4Fe–4S] | | | | | | |
| 1 | 1972 | 1dur 2fdn | 2[4Fe–4S] Fd | α (55) | (C8 C11 C14 C47) (C18 C37 C40 C43)^e | |
| 2^f | 1972 | 1hip 1iua | HiPIP | α (83) | C41 C46 C61 C75 | |
| 3 | 1986 | 2tmd 1o94 | Trimethylamine dehydrogenase | α2 (729) | C345 C348 C351 C364 | |
| 4 | 1989 | 5acn 1c96 | Aconitase | α (754) | C358 C421 C424 citrate^g | |
| 5 | 1992 | 1nip 1cp2 | Nitrogenase Fe protein | α2 (273) | C94 C129 (subunit-bridging cluster) | |
| 6 | 1994 | 1gph 1ao0 | Glutamine PRPP amidotransferase | α4 (465) | C236 C382 C437 C440 | |
| 7 | 1995 | 1aor | Aldehyde-Fd oxidoreductase | α2 (605) | C288 C291 C295 C494 | |
| 8 | 1995 | 2abk 1kg2 | Endonuclease III | α (211) | C187 C194 C197 C203 | |
| 9^h | 1995 | 1frv 1wui | NiFe hydrogenase | α (551) β (254) | C17 C20 C112 C148 | 10, 11 |
| 10 | 1995 | 1frv 1wui | NiFe hydrogenase | α (551) β (254) | H185(Nδ) C188 C213 C219 | 9, 11 |
| 11^i | 1995 | 1frv 1wui | NiFe hydrogenase | α (551) β (254) | C228 C246 C249 | 9, 10 |
| 12 | 1996 | 1lrv | Leucine-rich repeat | α (244) | C14 C17 C29 C35 | |
| 13 | 1996 | 2pps 1jb0 | Photosystem I | Heterododecamer (PsaA–F, PsaI–M, PsaX) | C578(PsaA) C587(PsaA) C565(PsaB) C574(PsaB) (PsaA–PsaB bridging cluster) | 1 |
| 14 | 1997 | 1aa6 | Formate dehydrogenase H | α (715) | C8 C11 C15 C42 | |
| 15 | 1997 | 1aop | Sulfite reductase (SiRHP) | α (570) | C434 C440 C479 C483(br)^j | |
| 16 | 1998 | 1feh | FeFe hydrogenase | α (574) | H94(Nε) C98 C101 C107 | 1, 17, 101 |
| 17 | 1998 | 1feh 1hfe | FeFe hydrogenase | α (574) | C300 C355 C499 C503(br)^k | 1, 16, 101 |
| 18 | 1998 | 1e1d 1gnt | Hybrid cluster protein^l | α (553) | C3 C6 C15 C21 | |
| 19 | 1999 | 1b0p 2c42 | Pyruvate-Fd oxidoreductase | α2 (1,232) | C812 C815 C840 C1071 | 1 |
| 20^f | 2000 | 1dj7 2pvo | Fd-thioredoxin reductase | α (75) β (117) | C55 C74 C76 C85 | |
| 21^m | 2000 | 1ea0 | Glutamate synthase | α2 (1,479) | C1102 C1108 C1118 | |
| 22 | 2001 | 1hux | HO-glutaryl-CoA dehydratase (CompA) | α2 (260) | C127 C166 (subunit-bridging cluster) | |
| 23 | 2001 | 1su8 | CO dehydrogenase | α2 (636) | C48 C51 C56 C70 | 24 |
| 24 | 2001 | 1su8 | CO dehydrogenase | α2 (636) | C39 C47 (subunit-bridging cluster) | 23 |
| 25^n | 2001 | 1h7w | Dihydropyrimidine dehydrogenase | α2 (1,025) | C91 C130 C136 Q156(Oε) | |
| 26 | 2002 | 1mjg 1oao | Acetyl-CoA synthase | α2 (729) β2 (674)^o | C506 C509(br)^p C518 C528 | 23, 24 |
| 27 | 2003 | 1olt | HemN | α (457) | C62 C66 C69 S-adenosylmethionine (Met N,O) | |
| 28 | 2004 | 1u8v | 4-Hydroxybutyryl-CoA dehydratase | α4 (490) | C99 C103 H292(Nε) C299 | |
| 29 | 2005 | [71]^q | Fe–S flavoprotein | α4 (203) | C47 C50 C53 C59 | |
| 30 | 2006 | 2fug | Complex I (hydrophilic domain, hetero-octamer) | Nqo1 (438) | C354 C356 C359 C400 | 1, 16, 31, 101, 104 |
| 31^r | 2006 | 2fug | Complex I (hydrophilic domain, hetero-octamer) | Nqo6 (181) | C45 C46 C111 C140 | 1, 16, 30, 101, 104 |
| 32 | 2006 | 2goy | Adenosine 5′-phosphosulfate reductase | α4 (267) | C139 C140 C228 C231 | |
| 33 | 2007 | 2g36 | Tryptophanyl-tRNA synthetase | α2 (340) | C236 C259 C266 C269 | |
| 34 | 2007 | 2jh3 | Cobalt chelatase | α4 (472) | C420 C423 C448 C452 | |
| 35 | 2007 | 2z1d | HypD | α (372) | C323 C338 C345 C362 | |
| [2Fe–2S] | | | | | | |
| 101 | 1981 | 1fxi 1czp | Plant- and vertebrate-type Fd | α (98) | C41 C46 C49 C79 | |
| 102 | 1995 | 1dgj 1vlb | Aldehyde oxidoreductase | α (907) | C100 C103 C137 C139 | 101 |
| 103 | 1996 | 1rie 1jm1 | Rieske-type protein^s | α (250) | C140 H142(Nδ) C170 H173(Nδ) | |
| 104 | 2000 | 1f37 1m2d | Thioredoxin-like Fd | α2 (110) | C9 C22 C55 C59 | |
| 105^n | 2001 | 1hrk 2hrc | Ferrochelatase^t | α2 (423) | C196 C403 C406 C411 | |
| 106 | 2004 | 1r30 | Biotin synthase | α2 (369) | C97 C128 C188 R290(Nη) | 27 |
| 107^n | 2004 | 1ohv | GABA aminotransferase | α2 (472) | C135 C138 (subunit-bridging cluster) | |
| 108 | 2006 | 1x0g | IscA | αα′ (112)^u | C37 C101 C103 C103′ | |
| 109 | 2006 | 2ht9 | Glutaredoxin | α2 (146) | C37 glutathione (subunit-bridging cluster) | |
| 110 | 2007 | 2hu9 | CopZ^v | α (204) | C75 C77 C109 C119 | |
| 111^n | 2007 | 2qh7 | MitoNEET^w | α2 (108) | C72 C74 C83 H87(Nδ) | |
| [1Fe] | | | | | | |
| 201 | 1970 | 4rxn 2dsx | Rubredoxin | α (54) | C6 C9 C39 C42 | |

PDB Protein Data Bank, HiPIP high-potential iron protein, PRPP phosphoribosylpyrophosphate, Fd ferredoxin, CoA coenzyme A, tRNA transfer RNA, GABA γ-aminobutyric acid
^a Numbered in the order of discovery, starting with ‘‘1’’ for [4Fe–4S] active sites, ‘‘101’’ for [2Fe–2S], and ‘‘201’’ for [1Fe]
^b First coordinate file deposition or publication; the latter may precede the former by several years for some earlier structures
^c A second entry has been included for subsequent structures with significantly higher resolution. The underscored entry is the one to which the numbering of the Fe–S cluster ligands refers
^d The length (amino acid residues) of each subunit is indicated. The underscored subunit contains the Fe–S fold of interest
^e Each parenthesis contains the ligands of one of the [4Fe–4S] clusters. The numerous varieties and presumed evolution of this fold are summarized in Fig. 2
^f HiPIP and Fd-thioredoxin reductase (fold 20) are the only protein folds accommodating [4Fe–4S] clusters in their 3+/2+ high-potential transition; all other folds listed here contain [4Fe–4S]2+/1+ clusters. However, while the HiPIP cluster is genuinely high potential (see text), the Fd-thioredoxin reductase cluster has a low redox potential owing to its chemical interaction with a dithiol/disulfide cysteine pair [72]
^g Three iron atoms have normal ligation to cysteine; the fourth one has three oxygen ligands (H2O, Cβ carboxyl and Cβ hydroxyl from citrate)
^h Flavodoxin-like domain
^i [3Fe–4S] cluster in this structure. In some [NiFe] hydrogenases (e.g., 1cc1) a [4Fe–4S] cluster is present thanks to the occurrence of a fourth cysteine ligand in the counterpart position of P239
^j C483 bridges one of the [4Fe–4S] irons and the heme iron
^k C503 bridges one of the [4Fe–4S] irons and one of the two [FeFe] active-site irons
^l Contains another 4Fe cluster that is not of the [4Fe–4S] type: it is highly asymmetric, involves several Fe–O bonds, and undergoes redox-linked structural changes
^m [3Fe–4S] cluster
^n Folds 25, 105, 107, and 111 are the only ones that have so far been found only in eukaryotes
^o Bifunctional (acetyl-CoA synthase/CO dehydrogenase) enzyme, in which the acetyl-CoA synthase function is assumed by the α2 moiety
^p C509 bridges one of the [4Fe–4S] irons and one of the [NiNi] active-site nickels. All ligands belong to the α subunits
^q Coordinates not deposited in the PDB [71]
^r Flavodoxin-like subunit, but the positions of the cysteine ligands, and the polypeptide fold around the [4Fe–4S] cluster differ from those in fold 9 [11]
^s Soluble fragment (residues 47–250)
^t N-terminus (residues 1–62) removed
^u Homodimer, but the two subunits assume different conformations
^v N-terminal (1–131) domain
^w Water-soluble C-terminal (33–108) domain
The primary structure of clostridial 2[4Fe–4S] Fd
revealed a putatively primitive amino acid composition and
a clear trace of ancestral gene duplication. It was therefore
inferred that this protein fold is very ancient [22]. The
demonstration that the apoprotein would spontaneously
refold into native Fd in the presence of iron ions and sulfide
[4] suggested the possibility of a spontaneous assembly in
conditions believed to be those required for the emergence
of life on this planet [20]. It also suggested the feasibility of
the thereafter successful synthesis of active-site analogs by
autoassembly [26].
In keeping with its surmised antiquity, the 2[4Fe–4S] Fd
fold is probably the most widespread Fe–S protein fold,
and the one that has undergone the most extensive modifications [27]. Some of the latter are outlined in Fig. 2.
They include insertions of polypeptide segments, loss of
one of the clusters and occasional incorporation of a
disulfide bond, as well as formation of [3Fe–4S] clusters by
abstraction of one iron, a recurrent theme in Fe–S chemistry and biochemistry [6, 28, 29]. The 2[4Fe–4S] Fd fold is
also remarkable for its frequent occurrence, not merely in
small soluble Fd, but also in subunits and domains of redox
enzymes, often in combination with other Fe–S protein
folds (see later).
It should be pointed out that the clearest imprint of the
primordial gene duplication is found in mesophilic clostridial 2[4Fe–4S] Fd, e.g., that of Clostridium acidurici [30],
which are therefore presumably the most primitive. In
contrast, all thermophilic [4Fe–4S] Fd are less symmetric
forms sporting various sorts of polypeptide chain extensions and occasional loss of one [4Fe–4S] cluster (Fig. 2).
Thermophilic Fd are therefore more likely to be derived
forms than primitive ones, in contradiction with a previous
proposal [31]. The phylogeny of 2[4Fe–4S] Fd would thus
be consistent with a mesophilic or moderately thermophilic
root of the tree of life [32, 33].

Fig. 2 Evolutionary scheme of the 2[4Fe–4S] Fd fold. Only Fd are shown here. Even greater variations are displayed by homologous domains in redox enzymes. The generic names of the microorganisms are indicated (thermophiles in red), followed by the relevant Protein Data Bank (PDB) entries. The ancestral clostridial Fd is shown at the top, and the most significant variations are indicated on the arrows. Iron atoms are shown as cyan spheres, while sulfide atoms are not shown for clarity. Clusters labeled 3 are of the [3Fe–4S] type. The core polypeptide fold common to all proteins is shown as black strands, insertions are shown as red ribbons, disulfide bridges are shown in yellow, and the zinc atom is shown in green. The structures were drawn using RASMOL [70]
High-potential [4Fe–4S] protein
[4Fe–4S] high-potential iron proteins (HiPIP) are small
(55–85 residues) proteins isolated mostly, but not exclusively, from photosynthetic bacteria [34]. The role of
HiPIP as electron donors to the tetraheme cytochrome in
photosynthetic bacteria is now well established [35]. In
contrast, their function in nonphotosynthetic organisms
remains unknown.
HiPIP are globular proteins nearly devoid of secondary
structure. The [4Fe–4S] cluster is bound to four conserved
cysteines and is buried within the protein interior [36]. As a
result of its hydrophobic environment and hydrogen-bonding network, the cluster implements the [4Fe–4S]3+,2+
transition; hence, the high redox potential (+100 to
+400 mV) [37]. HiPIP-like folds have not been found in
subunits or domains of larger proteins. While the name
‘‘HiPIP’’ has been conserved by tradition, these proteins
are in fact high-potential [4Fe–4S] Fd, as opposed to the
low-potential ones described in the previous section.
[2Fe–2S] plant- and vertebrate-type ferredoxin
Proteins of this type constitute a large family composed of
several subgroups [38]. Plant-type proteins function as
electron carriers between photosystem I and several
enzymes [3, 39], thus linking the ‘‘light’’ and ‘‘dark’’
reactions. Vertebrate (e.g., adrenodoxin) and bacterial (e.g.,
putidaredoxin) [2Fe–2S] Fd transfer electrons to hydroxylating enzymes, usually P450 cytochromes [40]. Other
groups include the [2Fe–2S] IscFd involved in the biosynthesis of Fe–S clusters [14, 41], and the XylT [2Fe–2S]
Fd committed to the activation of some oxygenases [42].
Yet other [2Fe–2S] Fd are found in halobacteria [43] and in
the hyperthermophile Aquifex aeolicus [15]. All these low-potential proteins (-150 to -450 mV) implement the
[2Fe–2S]2+,+ redox transition of the cluster.
Plant- and vertebrate-type [2Fe–2S] Fd are globular
proteins (approximately 100 residues) wherein the [2Fe–
2S] cluster is located near the surface, and is protected by a
long loop including three of the four cysteine ligands [39–
41, 43–45]. The opposite side of the molecule consists of a
four-stranded β-sheet covered by an α-helix, which
together form a ubiquitin-like motif known as β-grasp [25].
Additional lateral α-helices connect the cluster-binding
region and the β-grasp [39–41, 43–45].
Notwithstanding their functional diversity, plant- and
vertebrate-type Fd are structurally much less diverse than
the 2[4Fe–4S] Fd. Variations include an approximately 20
residue N-terminal extension in halobacterial Fd [43], and
differences in the redox-partner interaction surfaces
between the plant-type and the vertebrate-type Fd [40].
While this [2Fe–2S] Fd fold is best known for its widespread occurrence in small soluble Fd, it is also present in
domains of many redox enzymes, often in combination
with other Fe–S domains and clusters (see later).
Rieske [2Fe–2S] protein
Rieske proteins were first characterized as subunits of
respiratory (cytochrome bc1) and photosynthetic (cytochrome b6f) electron transfer complexes, but were
subsequently found in oxygenases, either as subunits or
domains, or as small electron carrier Fd [46, 47].
The basic structural framework (approximately 120
residues) of Rieske proteins consists of three stacked β-sheets, of which the upper one includes the two ligand
loops holding the [2Fe–2S] cluster [46, 48]. These two
loops contain one cysteine and one histidine ligand each,
and are interconnected by a disulfide bond which contributes significantly to the stability of the protein. The
iron atom closer to the surface has two solvent-exposed
histidine ligands; the other iron is bound to two buried
cysteine ligands [46, 48]. In membrane-bound complexes
(e.g., cytochrome bc1), the Rieske-type subunits possess
N-terminal hydrophobic anchors [46]. The fold of the
cluster-containing subdomain of Rieske proteins is strikingly similar to that of rubredoxin (Rd) [48] (see
‘‘Rubredoxin’’).
The redox transition of the cluster in Rieske proteins is
[2Fe–2S]2+,+, as in plant-type Fd, but histidine ligation
causes upshifts of the redox potential. As a result, many
Rieske proteins have positive potentials (+100 to
+400 mV); nevertheless, some dioxygenase-associated
Rieske proteins (approximately -100 mV) display redox
potentials closer to those of plant-type Fd [49].
Thioredoxin-like [2Fe–2S] ferredoxin
Thioredoxin-like [2Fe–2S] Fd were among the first Fe–S
proteins discovered [50], but their structure has only been
elucidated recently [51]. Their function is largely
unknown, although some data suggest an involvement in
nitrogen metabolism [52]. The redox transition of the
cluster is [2Fe–2S]2+,+, and the redox potentials are around
-300 mV [52].
The two identical subunits of approximately 100 residues contain one [2Fe–2S] cluster each and assume a
thioredoxin-like fold. Unlike in thioredoxin, however, the
β-sheet is covered by α-helices only on one side, allowing
the other side to serve as the subunit interface
[51]. These Fd are found only in bacteria [52], and are thus
less common than the [2Fe–2S] plant-type or the bacterial
2[4Fe–4S] Fd. The thioredoxin-like [2Fe–2S] Fd fold is
nevertheless encountered in a number of multiprotein
complexes or redox enzymes [52]. Well-known examples
are hydrogenases and NADH ubiquinone oxidoreductase
[53]. In the latter multisubunit protein, the thioredoxin-like
Fd fold is present as a single copy but carries an N-terminal
extension of approximately 75 residues that folds as a helix
bundle and contributes a tight interaction with a neighboring subunit [11].
Rubredoxin
The active site of the small (approximately 55 residues)
soluble Rd consists of a single iron atom coordinated to
four cysteinyl sulfurs occurring on two CysXXCys segments and belonging to two symmetry-related loops [54,
55]. The FeS4 site of Rd is unique among Fe–S active sites
in being devoid of inorganic sulfur. The FeS4 structural
unit also stands out as the basic building block of all Fe–S
clusters. The redox potential of the [1Fe]3+,2+ couple in Rd
is within the -100 to +200 mV range. Rd function as
electron carriers in various redox chains, many of which
are linked to oxygenation reactions or protection against
oxidative damage [55].
While the distribution of Rd is rather scanty, Rd-type
proteins and domains are very diverse. Some alkane-oxidizing bacteria contain a Rd consisting of approximately
130 residues, and a yet larger one (approximately 160
residues) containing two Rd modules [55]. A few algae
contain a Rd with an approximately 30 residue C-terminal
membrane anchor [55]. Desulforedoxin is a homodimer of
which each subunit (36 residues) assumes a shortened Rd-like fold with one CysXXCys and one CysCys ligand
loop [56]. Rubrerythrin [57] and superoxide reductase [58]
contain Rd-like domains in addition to their active-site
domains. Rd-like domains of undetermined functions,
perhaps regulatory, have been observed in the structures of
larger proteins [59, 60] (see later).
The Rd fold is remarkably similar to the Rieske-protein
fold in the active-site region [48] (see ‘‘Rieske [2Fe–2S]
protein’’), even though different ligand sets are implemented in the two cases. Indeed, mutating a single cysteine
ligand of the iron results in the installation of a [2Fe–2S]
cluster in Rd [61], and conversely, the Rieske site can be
converted into a [1Fe] Rd-like site by a triple mutation
[62].
Rd otherwise displays striking similarities with Zn-containing protein folds belonging to zinc-finger families.
Rd itself has similar affinities for iron and zinc, and the
metal content of Rd in vivo appears to be determined, at
least in Escherichia coli, by the relative concentrations of
the two metals in the culture medium [55, 63]. Heavier
metals of the same series (cadmium, mercury) can also
bind to the Rd active site [64]. Zinc-containing and cadmium-containing Rd-like domains have been found in a
type III ribonucleotide reductase [59], and in a protein
kinase G [60], respectively.
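
The redox potential ranges quoted in the preceding subsections can be gathered in one place for comparison. The following is a minimal sketch, not part of the original analysis: the ranges are transcribed from the text, whereas the dictionary keys, the function names, and the choice to label each range by its sign alone are assumptions made for illustration.

```python
# Redox potential ranges in mV, transcribed from the text above.
# The keys are labels chosen for this sketch only.
REDOX_RANGES_MV = {
    "2[4Fe-4S] Fd": (-700, -150),
    "HiPIP": (100, 400),
    "plant- and vertebrate-type [2Fe-2S] Fd": (-450, -150),
    "Rieske protein (many)": (100, 400),
    "rubredoxin": (-100, 200),
}


def potential_class(low_mv, high_mv):
    """Label a range by sign only (an illustrative assumption)."""
    if low_mv > high_mv:
        raise ValueError("low_mv must not exceed high_mv")
    if high_mv < 0:
        return "negative"
    if low_mv > 0:
        return "positive"
    return "spans zero"


def ranges_overlap(a, b):
    """Return True if two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    for name, (low, high) in REDOX_RANGES_MV.items():
        print(f"{name}: {low} to {high} mV, {potential_class(low, high)}")
```

With these figures, the ranges quoted for the 2[4Fe–4S] Fd and for HiPIP do not overlap, whereas the Rd range straddles zero and overlaps the lower end of the HiPIP range.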
Proteins containing large Fe–S clusters
Besides the classic Fe–S clusters discussed already, larger
and more complex Fe–S clusters are present in a few
enzymes [8]. Prominent among them are the 8Fe7S
(P cluster) and Mo7Fe9S (FeMoco) clusters of nitrogenase,
the 2Fe–[4Fe–4S] active site of [FeFe] hydrogenases, and
the 2Ni–[4Fe–4S] (A cluster) and Ni4Fe5S (C cluster)
clusters of CO dehydrogenase/acetyl-coenzyme A synthase
[8, 9]. These clusters are highly specialized catalysts and
require specific chaperone systems for their assembly [8].
While they display no clear evidence of structural or
functional diversification, such events may have occurred
in the distant past. The inventory set up and discussed
hereafter includes only the [4Fe–4S] moieties present in
some of these active sites.
Towards an inventory of Fe–S protein folds
Over the decades since their discovery, Fe–S proteins
have been found in all kinds of organisms and cell
compartments, and assume a large range of functions [5,
7, 12–14]. Genome and metagenome sequences have
confirmed the pervasiveness of Fe–S proteins in the living
world and may be used for an assessment of the distribution and diversity of Fe–S proteins [65, 66]. However,
while known Fe–S-binding patterns are easily identified in
sequence databases, novel ones are difficult to infer from
cysteine sequence patterns alone. In contrast, protein
structure databases provide unequivocal information on
the structure of the Fe–S active sites and their protein
environment. They also yield the structures of individual
Fe–S active sites in multicluster proteins. While three-dimensional structures are far less numerous than
sequences, they are now counted in hundreds and thus
afford a sizeable sample of extant structures. Also,
structures of Fe–S active sites are, to some extent, sampled randomly, since novel Fe–S folds are discovered not
only in studies specifically aimed at Fe–S proteins [11],
but also serendipitously in proteins not previously known
to contain Fe–S clusters ([67], PDB entry 2g36). For these
reasons, the set of unique protein folds around Fe–S
active sites may be regarded as an adequate representation
of the structural diversity of Fe–S proteins.
Fe–S protein structures were retrieved from the PDB
[68] using the keywords ‘‘FeS,’’ ‘‘iron–sulfur,’’ ‘‘ferredoxin,’’ ‘‘rubredoxin,’’ ‘‘4Fe–4S,’’ and ‘‘2Fe–2S,’’ and the
pooled data were cross-checked with literature searches.
Altogether over 500 structure entries were collected. Some
of these structures (approximately 10%), mostly small
proteins, were obtained by using NMR. However, none of
the latter include folds that had not been previously characterized by X-ray crystallography. Indeed, while NMR is
very powerful for the investigation of dynamics or electromagnetic properties, it still has some limitations for the
determination of polypeptide chain conformation in the
vicinity of paramagnetic metal sites [69]. Thus, all fold
structures listed in Table 1 were determined by X-ray
crystallography.
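
The pooling step described above amounts to taking the union of the hits returned for each keyword, so that an entry found under several keywords is counted once. A minimal sketch follows, assuming the hits are already available as lists of PDB identifiers. The keywords are those given in the text. The function name and the example hit lists are hypothetical, and no actual database query is performed.

```python
# Keywords used for the PDB search, as listed in the text.
KEYWORDS = ["FeS", "iron-sulfur", "ferredoxin", "rubredoxin", "4Fe-4S", "2Fe-2S"]


def pool_hits(hits_by_keyword):
    """Merge per-keyword hit lists into one sorted list of unique PDB IDs.

    IDs are lowercased so that '2FDN' and '2fdn' count as the same entry.
    """
    pooled = set()
    for ids in hits_by_keyword.values():
        pooled.update(pdb_id.lower() for pdb_id in ids)
    return sorted(pooled)


# Hypothetical hit lists for illustration; the IDs appear in Table 1,
# but their assignment to keywords here is invented.
example_hits = {
    "ferredoxin": ["2FDN", "1FXI"],
    "4Fe-4S": ["2fdn", "1hip"],
    "rubredoxin": ["4RXN"],
    "2Fe-2S": ["1fxi"],
}

if __name__ == "__main__":
    print(pool_hits(example_hits))
```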
The Fe–S protein crystal structures in the PDB are
redundant to a significant extent, mostly owing to multiple entries from classic Fe–S proteins that have been
extensively investigated over several decades. The PDB
includes more than ten HiPIP entries, more than 20 Rd,
more than 30 plant and vertebrate [2Fe–2S] Fd, and over
50 structures of the 2[4Fe–4S] Fd fold (taking into
account all variations pictured in Fig. 2). Even for some
of the large and complex proteins, e.g., nitrogenase [8],
entries are counted in dozens. The database nevertheless
contains many structures represented by just a few entries
or even a single entry. The body of data was first reduced
by eliminating redundant structures. Then, proteins containing more than one Fe–S cluster (excepting the 2[4Fe–
4S] Fd, see above) were inspected to discriminate
between the novel Fe–S folds and those that had previously been observed in other proteins (as in hydrogenases
[73] or complex I [11]). The uniqueness of domains was
also assessed by structural searches implementing DALI
[74]. The analysis eventually revealed nearly 50 unique
folds around individual [1Fe], [2Fe–2S], or [4Fe–4S]
clusters, as listed in Table 1. It should be pointed out that
some important Fe–S enzymes are not included here,
because they only contain previously identified Fe–S
folds. A case in point is the respiratory complex II/
fumarate reductase family [75], which accommodates
three Fe–S clusters harbored within fold types 1 and 101.
The main issues raised by the data in Table 1 are discussed in the following sections.
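
Some of the figures discussed in the following sections can be recomputed directly from Table 1. The sketch below is illustrative only: the fold numbers and discovery years are transcribed from Table 1, the count of three high-potential folds is the figure quoted in the abstract, and all variable and function names are assumptions introduced here.

```python
from collections import Counter

# Fold number -> year of discovery, transcribed from Table 1.
FOLD_YEARS = {
    1: 1972, 2: 1972, 3: 1986, 4: 1989, 5: 1992, 6: 1994, 7: 1995,
    8: 1995, 9: 1995, 10: 1995, 11: 1995, 12: 1996, 13: 1996,
    14: 1997, 15: 1997, 16: 1998, 17: 1998, 18: 1998, 19: 1999,
    20: 2000, 21: 2000, 22: 2001, 23: 2001, 24: 2001, 25: 2001,
    26: 2002, 27: 2003, 28: 2004, 29: 2005, 30: 2006, 31: 2006,
    32: 2006, 33: 2007, 34: 2007, 35: 2007,
    101: 1981, 102: 1995, 103: 1996, 104: 2000, 105: 2001, 106: 2004,
    107: 2004, 108: 2006, 109: 2006, 110: 2007, 111: 2007,
    201: 1970,
}

HIGH_POTENTIAL_FOLDS = 3  # number of high-potential folds quoted in the abstract


def cluster_type(fold):
    """Map a fold number to its cluster type using the Table 1 numbering."""
    if fold >= 201:
        return "[1Fe]"
    if fold >= 101:
        return "[2Fe-2S]"
    return "[4Fe-4S]"


counts = Counter(cluster_type(fold) for fold in FOLD_YEARS)
total = len(FOLD_YEARS)
ratio_4fe_to_2fe = counts["[4Fe-4S]"] / counts["[2Fe-2S]"]
low_potential_fraction = (total - HIGH_POTENTIAL_FOLDS) / total

# Discovery rate before and from 1995, the year singled out in the abstract.
before_1995 = sum(1 for year in FOLD_YEARS.values() if year < 1995)
from_1995 = total - before_1995
folds_per_year_from_1995 = from_1995 / (2007 - 1995 + 1)

if __name__ == "__main__":
    print(dict(counts), total)
    print(round(ratio_4fe_to_2fe, 2), round(low_potential_fraction, 3))
    print(before_1995, from_1995, folds_per_year_from_1995)
```

On these numbers, eight folds were described before 1995 and 39 from 1995 to 2007, which is the change in discovery rate referred to in the abstract.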
Fe–S protein folds versus protein folds at large
Protein fold groups are defined by the numbers of
secondary structure elements and their topological connections [25]. In addition to their phylogenetic
significance, protein folds and families are important for
structural predictions: indeed, structures of one or a few
members of a fold allow structural predictions for
most members of that group [76]. Novel folds are therefore
primary targets for structural genomics projects [76], and in
fact the discovery rate of novel folds and families is
exponential, paralleling the growth rate of the PDB [76,
77]. There is nevertheless some controversy as to whether
the exploration of the protein structure universe is nearing
completion [78] or not [76, 77]. The enormous current
output of genomic and metagenomic sequences, and the
huge number of novel sequences that are produced [79],
together with the exponential growth of the number of
novel structures, suggest that a complete mapping of the
protein structural landscape is still a remote goal [76, 77].
Fe–S protein folds are defined here by considering the
polypeptide fold around and in the close vicinity of the
metal site, rather than complete domains. In some cases,
e.g., folds 1, 2, or 201, the fold around the Fe–S cluster
encompasses the complete protein fold. In many other
cases, what is regarded here as the Fe–S fold may consist of
only a small part of the whole protein or domain (e.g., folds
3–6). There are a few protein folds that are specifically
Fe–S folds (e.g., folds 1, 2, and 201), while many others
also occur in other contexts (flavodoxin, zinc finger). It
may also happen that a given protein fold is implemented
in more than one instance to harbor Fe–S clusters (folds 9
and 31, or 103 and 201).
The possibility for Fe–S clusters to be hosted in various
ways in a given protein fold would in principle offer nearly
unlimited opportunities for novel Fe–S folds. This should
be tempered by the observation that Fe–S proteins, notwithstanding their ubiquity and diversity, are only a small
minority of all extant proteins. They account for approximately 1% of the entries in the PDB, which may
nevertheless be somewhat of an underestimation of their
actual number in view of the higher than average difficulty
of their structural analysis.
The rate of discovery of Fe–S protein folds as a function
of time is shown in Fig. 3. While the growth of the PDB,
or even subsets of the PDB, can be fitted with exponentials
[76, 77] (Fig. 3), the rate of acquisition of Fe–S fold
structures seems to depart significantly from such a trend
(Fig. 3). After a slow start, the rate underwent a sudden
fivefold to tenfold increase around 1995, and has changed
little on average since then. This discontinuity suggests that
specific breakthroughs have been made in the structural
investigation of Fe–S proteins. A tentative list may include
mastering of the often mandatory anoxic techniques in an
increasing number of laboratories, progress in the implementation of Fe–S clusters for phase determination, and
widespread use of synchrotron facilities. The fact that the
number of unique Fe–S folds increases about linearly
rather than exponentially may indicate that Fe–S proteins
still present obstacles that require specific skills to overcome. On the other hand, the very constancy of the
rate, with however an apparent increase in the last couple
of years (Table 1, Fig. 3), suggests that the mapping of Fe–S
protein folds, like protein folds altogether [76, 77], is not
nearing completion.
Fig. 3 Cumulated number of unique Fe–S protein folds as a function
of time (thick line). The dotted line shows the number of protein
crystal structure entries in the PDB. The vertical scale is the same as
for the Fe–S folds, but the numbers are thousands
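The distinction between linear and exponential growth can be made concrete with a small sketch. The series below is invented for illustration and is not taken from Fig. 3 or from the PDB; the helper names `fit_line` and `sse` are likewise assumptions. A straight line and an exponential (fitted as a straight line through the logarithms of the counts) are compared through their sums of squared residuals.

```python
import math
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x.
    Returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(observed: List[float], predicted: List[float]) -> float:
    """Sum of squared residuals between observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Invented cumulative counts that grow by a constant amount per year.
years = [float(t) for t in range(10)]
counts = [3.0 * t + 5.0 for t in years]

# Linear model: counts = a + b * t
b, a = fit_line(years, counts)
linear_sse = sse(counts, [a + b * t for t in years])

# Exponential model: counts = exp(log_c + k * t), fitted on log(counts)
k, log_c = fit_line(years, [math.log(y) for y in counts])
exp_sse = sse(counts, [math.exp(log_c + k * t) for t in years])

better_model = "linear" if linear_sse < exp_sse else "exponential"
```

On real data both fits would leave residuals, and a comparison of this kind would only be indicative: fitting the exponential on a logarithmic scale weights early years more heavily than a direct nonlinear fit would.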
Fe–S folds and their functions
Fe–S proteins as first isolated were small electron carrier
proteins (folds 1, 2, 101, and 201), and electron transfer
remains the predominant function among the much larger
number of presently known Fe–S folds. Catalysis of
chemical reactions is a well-established function of large
Fe–S clusters [8] (see above), but only very few [2Fe–2S]
or [4Fe–4S] active sites have been proven to possess such
functions. Aconitase (fold 4 [29]) is a dehydratase/hydratase, the [4Fe–4S] cluster in Fd-thioredoxin reductase
donates electrons to a redox active pair of cysteines and
undergoes transient bonding to one of them (fold 20 [72]),
the [4Fe–4S] cluster in radical S-adenosylmethionine proteins cleaves S-adenosylmethionine (fold 27 [80]), and the
[2Fe–2S] cluster in IscA (fold 108 [81]) is probably a
transient cluster being transferred to a target protein. Some
other functions, yet to be confirmed, would include oxygen
or redox sensing (folds 6 and 34), sulfur donation (fold 106
[82]), transfer of redox equivalents across bacterial membranes (fold 109 [83]), and signalization or Fe–S transfer
across the outer mitochondrial membrane (fold 111 [84]).
One should be reminded, however, that Fe–S proteins
having well-established regulatory roles and putative novel
folds have not yet yielded to structural analysis. Prominent
among these are the [2Fe–2S] SoxR superoxide and NO
sensor [85], the [4Fe–4S] fumarate nitrate regulator [86],
and the IscR regulator of the Isc operon [14]. The functions
of these proteins require them to undergo conformational
changes. They may thus be difficult to crystallize, a reason
why regulatory proteins are probably underrepresented
among the known Fe–S protein structures. In that respect,
the recently reported structures of the iron regulatory protein 1 (IRP1) are remarkable, even though no novel Fe–S
fold has been brought forth. Indeed, both functional
structures of this bifunctional protein have been elucidated:
the [4Fe–4S] cytosolic aconitase [87], and the Fe–S cluster-free RNA binding protein [88].
Fe–S cluster ligation and general features of Fe–S folds
It has long been known from studies on many Fe–S proteins as well as synthetic analogs of their active sites [5–7,
13] that thiolates (cysteines) are by far the preferred
organic ligands of Fe–S clusters. This is largely confirmed
by the compilation of Fe–S clusters in Table 1, even
though other residues, primarily histidine (folds 10, 16, 28,
103, and 111), but also glutamine (fold 25) or arginine
(fold 106), are implemented in a few cases. In contrast,
serine, which is nearly isostructural with cysteine and has
often been implemented as a ligand in engineered Fe–S
proteins, is involved as a natural ligand only in a particular
form of the nitrogenase P cluster [8, 10, 89]. Substrate
binding to one of the iron atoms has been evidenced in two
[4Fe–4S] enzymes (folds 4 and 27, with citrate [29] and
S-adenosylmethionine [80], respectively).
Altogether the Fe–S-binding folds exhibit a considerable
variety, resulting from diverse combinations of loops and
secondary structure elements. Fe–S clusters are generally
hosted within a single domain which provides them with a
relatively rigid and protective ligand framework. Fe–S sites
occurring at domain interfaces are generally buried within
the protein interior (e.g., fold 13). This is in keeping with
the documented instability of Fe–S clusters when exposed
to aqueous solvents and dioxygen [6]. Fe–S clusters located
at exposed domain or subunit interfaces (folds 5, 22, 107,
and 109) are rather uncommon, notoriously unstable, and
assume functions requiring conformational changes (folds
5 and 22).
Protein fold and Fe–S cluster structure
It has long been known, and is further confirmed in
Table 1, that there are many ways (folds) for a protein to
accommodate a given Fe–S cluster. But is it possible for a
given protein fold to host more than one cluster type?
There seems to be one structurally characterized case: folds
103 (Rieske protein) and 201 (Rd) are superimposable and
yet bind a [2Fe–2S] or a [1Fe] site, respectively. However,
this exceptional occurrence is only made possible through
the replacement (histidine for cysteine) and reorientation of
two of the active-site ligands. Hence, this fold has been
split into two distinct ones (103 and 201). In fact, going
from mononuclear to binuclear to tetranuclear clusters
results in significant displacements of the ligands of the
Fe–S active sites (Fig. 1b). Such displacements would
require large structural rearrangements of the polypeptide
chain, which makes it very unlikely for clusters of different
nuclearities to be hosted in similar protein folds. This may
be illustrated by observations made on the nitrogenase iron
protein (fold 5), which normally contains a [4Fe–4S]
cluster [8], but can under some circumstances host a [2Fe–
2S] cluster. The latter, however, happens only in conditions
where the protein is known to undergo structural changes:
either on the way to irreversible denaturation [90], or as a
result of ATP binding in the presence of glycerol [91].
Other proteins having the potential to bind either binuclear
or tetranuclear clusters include the fumarate nitrate regulator [86] and IscU [14], both of which are predicted from
their function to be flexible. Structures of these proteins
have not yet been forthcoming. The common [4Fe–4S]/
[3Fe–4S] conversion requires but a small rearrangement of
the cysteine ligand set [5, 6] and is therefore not relevant to
this discussion.
[2Fe–2S]-containing compared
with [4Fe–4S]-containing folds
Protein folds hosting low-potential [2Fe–2S]2+,+ and [4Fe–
4S]2+,+ active sites are an overwhelming majority in the list
in Table 1. These active sites operate in the same redox
potential range [7], and occur in the same biochemical
pathways and often side by side in the same enzymes [11,
73] (Table 1). It is therefore difficult to rationalize on
biochemical grounds why the [4Fe–4S] folds should outnumber the [2Fe–2S] ones by a factor of more than 3
(Table 1). Instead, an explanation based on the chemical
properties of the binuclear and tetranuclear Fe–S clusters is
proposed hereafter.
Biomimetic Fe–S chemistry has produced models for
most biological Fe–S active sites, as well as a considerable
body of information pertaining to their structure and
reactivity [6]. In its earliest developments, biomimetic Fe–
S chemistry revealed that simple reaction systems involving iron ions, sulfide, and thiols yield [4Fe–4S]2+ synthetic
analogs in most cases [6, 26], while specific thiolate
ligands or reaction conditions are required for the production of [2Fe–2S]2+ clusters [92]. Accordingly, over the
years, scores of tetranuclear clusters have been isolated and
structurally characterized, while merely a few binuclear
ones have yielded to similar investigations [6]. A [4Fe–
4S]2+ cluster with HS- as a thiolate ligand has even been
synthesized by sparging H2S through an Fe3+/2+ solution
[93]. While the reasons for the higher stability of the tetranuclear clusters have not been specifically addressed,
they may include the oxidation level of the iron, which is
lower in tetranuclear clusters (Fe2.5+) than in binuclear ones
(Fe3+). This might have further favored the formation of
tetranuclear clusters in the reducing conditions of the
primitive Earth [94].
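The formal oxidation levels quoted above follow from simple charge bookkeeping. As a worked sketch, assume that each bridging sulfide is counted as S(2-) and let x denote the average formal oxidation state of iron in the cluster core (the symbol x is introduced here for illustration; the block assumes the amsmath package):

```latex
\begin{align*}
\mathrm{[4Fe{-}4S]^{2+}}:&\quad 4x + 4(-2) = +2 \;\Rightarrow\; x = +2.5 \\
\mathrm{[2Fe{-}2S]^{2+}}:&\quad 2x + 2(-2) = +2 \;\Rightarrow\; x = +3 \\
\mathrm{[4Fe{-}4S]^{+}}:&\quad 4x + 4(-2) = +1 \;\Rightarrow\; x = +2.25 \\
\mathrm{[2Fe{-}2S]^{+}}:&\quad 2x + 2(-2) = +1 \;\Rightarrow\; x = +2.5
\end{align*}
```

At either core charge, the tetranuclear cluster thus carries the lower average iron oxidation level.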
The higher stability of the tetranuclear clusters becomes
even more conspicuous when one considers both redox
levels of the [4Fe–4S]2+,+ and [2Fe–2S]2+,+ couples that
are relevant at low potential (approximately -100 to
-700 mV). Indeed, while synthetic analogs for both [4Fe–
4S]2+,+ redox levels have been isolated and crystallized in
numbers, binuclear synthetic analogs have only been isolated and crystallized for the [2Fe–2S]2+ level [6].
Structural data on the [2Fe–2S]+ level are limited to a
single Fd [45]. An added difficulty with the binuclear
synthetic analogs in solution is their tendency to dimerize
into [4Fe–4S]2+ in some solvents or upon reduction to the
[2Fe–2S]+ level [95]. Thus, [4Fe–4S]2+,+ clusters are
structurally autonomous with respect to redox activity,
while the reversible functioning of the [2Fe–2S]2+,+ transition mandates the assistance of an exogenous structural
framework.
Tetranuclear clusters are also superior in terms of versatility and plasticity. They can be held and stabilized by
merely three thiolate ligands, in proteins as well as in
synthetic analogs, provided the ligands in the latter are
tridentate [5, 6, 96, 97]. A variety of ligands are acceptable
in the fourth position, including H2O or HO-. The unique
iron can also be reversibly abstracted in site-differentiated
synthetic analogs [98] as well as in a number of proteins
[28], in particular aconitase [29]. These data make a strong
point that three cysteine ligands suffice to stabilize either
[4Fe–4S] clusters or their immediate [3Fe–4S] precursors
in various protein contexts. In contrast, [2Fe–2S] clusters
held by only three ligands (cysteines) are documented in
merely two cases, both of them proteins mutated in vitro
[99, 100].
In keeping with these chemical properties, it is of note
that even in the simplest [2Fe–2S] proteins, e.g., plant-type
Fd, the smaller part of the structure is dedicated to
accommodating the Fe–S cluster, while the larger part
consists of super-secondary structures having a major stabilizing role for the overall structure and thereby for the
[2Fe–2S] cluster as well [45]. In contrast, in clostridial-type [4Fe–4S] Fd, the short polypeptide chain is entirely
wrapped closely around the two Fe–S clusters [24], and
distal regions of the protein contributing additional stabilizing forces are not mandatory. In that case, the protein
fold appears to be fully determined by its role as a ligand of
the Fe–S clusters.
Collectively these data indicate that [4Fe–4S] clusters
are more versatile and robust than the [2Fe–2S] clusters,
and are by far the predominant products in autoassembly
reactions involving iron, sulfide, and thiolates. They are
therefore likely to have been the most widespread biologically relevant Fe–S clusters throughout periods
encompassing prebiotic chemistry and the emergence of
life on this planet. They were also favored in the nascent
protein world, through their higher stability and ability to
nest in protein sites, including those having incomplete
ligand sets. This advantage was certainly decisive in
early periods of protein evolution, when most of the
basic folds presumably emerged [77, 101]. Likewise,
throughout the evolution of life, [4Fe–4S] clusters have
probably been the best candidates for the appearance of
new Fe–S sites in cysteine-rich regions of proteins (see
fold 33).
Low-potential compared with high-potential Fe–S folds
An overwhelming majority (all but folds 2, 103, and 201)
of Fe–S protein folds host low-potential [2Fe–2S]2+,+ or
[4Fe–4S]2+,+ clusters. This can be assigned to events that
took place in the geochemical conditions prevalent during
the emergence of life and probably until approximately
2 billion years ago [21], when dioxygen appeared in the
biosphere. In such reducing conditions with abundant iron,
sulfide, and thiolates, it is known from contemporary
chemistry [6] that [2Fe–2S]2+,+ and [4Fe–4S]2+,+ clusters
are the major products of self-assembly reactions. They
were thus the best candidates to occupy available sites in
primitive proteins or other organic biomolecules. They
would also have been best suited to function as catalysts in
the prevalent low-potential metabolic pathways. This,
added to the likelihood that many of the protein folds
emerged early in evolution [77, 101], provides an explanation for the prominent position of [2Fe–2S]2+,+ and
[4Fe–4S]2+,+ clusters in extant low-potential proteins.
Conversely, the appearance of oxidizing conditions
drastically decreased the availability of iron (through precipitation as ferric oxides and hydroxides), and thwarted
spontaneous assembly of Fe–S clusters. These conditions
also led to the oxidation of metallic copper and its mobilization as cuprous and cupric ions; hence, copper (e.g., in
copper oxidases) gained a significant edge over Fe–S as a
biocatalyst in the upper range of redox potentials [102].
This combination of copper mobilization and iron seclusion, added to an increasing implementation of iron in
other protein sites (e.g., hemes or binuclear oxo), could not
but strongly disfavor the use of Fe–S clusters as high-potential redox catalysts; hence, the small number of protein folds that turned out to harbor high-potential Fe–S
active sites, which are restricted to HiPIP (fold 2 [36, 37]),
Rieske-type proteins (fold 103 [46, 48]), and Rd (fold 201
[55]). And even then, the last two of these active sites are
accommodated in nearly superimposable zinc-finger-like
folds [48]. It should be mentioned here that the [4Fe–
4S]3+,2+ active site of Fd-thioredoxin reductase (fold 20
[72]) functions with the same redox couple as HiPIP, and
might therefore be listed among the high-potential Fe–S
clusters. However, owing to the proximity of a redox-active disulfide with which it undergoes chemical interaction, that [4Fe–4S]3+,2+ cluster has in effect a low potential
(-400 mV [72]), which relates it functionally to the [4Fe–
4S]2+,+ active sites.
Evolution of Fe–S protein folds
Fe–S chemistry is surmised to have contributed significantly to prebiotic chemistry and to the emergence of life
on this planet [19, 20]. This, added to the structural and
functional versatility of Fe–S clusters, suggests that Fe–S
proteins were among the very first catalysts of biochemical
reactions. The primitive 2[4Fe–4S] Fd fold, which is taken
to be one of the most ancient protein folds altogether [22],
is considered as a remnant of, and as indirect evidence for,
these early events. Furthermore, the considerable diversification of this fold, a small part of which is illustrated in
Fig. 2, points to a protracted evolutionary course. Other
Fe–S folds, though less clearly ancient, may also have
appeared early in the history of life; a case in point is that
of the plant- and vertebrate-type Fd. The basic structural
scaffold of these ubiquitous proteins, the b-grasp [25], is
not unique to Fe–S proteins, but its Fe–S version (fold 101)
has been suggested to feature among the most ancient of its
varieties [103].
Other Fe–S folds are probably adaptations of preexisting
folds to the hosting of Fe–S active sites. Examples of the
latter are 103 and 201 (zinc finger) as well as 104 (thioredoxin). Indeed, while these Fe–S folds are undoubtedly
widespread and important, the general folds they are
derived from are really ubiquitous and assume essential
functions that probably predated the emergence of these
particular Fe–S protein families.
There are also classes of proteins where the presence of
an Fe–S cluster is an exceptional occurrence of unknown
functional relevance. A significant case is the tryptophanyl
transfer RNA synthetase from Thermotoga maritima (fold
33), which is at least rare, if not unique, as an Fe–S protein
among aminoacyl transfer RNA synthetases. The Fe–S
cluster has no known function in this particular case, and
its presence may simply result from the happenstance of at
least three (see ‘‘[2Fe–2S]-containing compared with [4Fe–
4S]-containing folds’’) appropriately positioned cysteine
residues. This occurrence points to a possible way for Fe–S
clusters to appear in novel sites where they may subsequently be stabilized, acquire new functions, gain a
selective advantage, and eventually lead to the emergence
of a new family of Fe–S proteins. Mechanisms of this sort
may have been operative in the past for the building of Fe–
S active sites in preexisting folds (flavodoxin, thioredoxin,
zinc finger, etc.).
Most of the Fe–S folds are represented among prokaryotes. Only folds 25, 105, 107, and 111 may be unique
to eukaryotes, as they have so far no counterparts in prokaryotes (Table 1). This would suggest that most Fe–S
folds, like protein folds at large, emerged prior to the
appearance of eukaryotes, while proteins in the latter
organisms evolved mainly by reshuffling and aggregation
of preexisting folds [76, 77].
Conclusions
Folds hosting low-potential Fe–S clusters vastly outnumber
the high-potential ones (44 and three, respectively;
Table 1), even though the latter include a variety of active-site frameworks (mononuclear, binuclear, and tetranuclear), and occur in very diverse biochemical contexts
(photosynthesis, respiration, response to oxidative stress).
This can be rationalized by taking into account the prevalence
of anoxic/reducing conditions in the prebiotic era, as well
as during the early steps of life [19–21, 98]. The very active
Fe–S chemistry surmised to have taken place would then
have favored the assembly of structures that are known to
be the most stable in those conditions, i.e., [4Fe–4S]2+,+
and [2Fe–2S]2+,+ clusters [6]. The requirement for high-potential catalysts, which may have included Fe–S clusters,
was presumably marginal in those conditions. In contrast,
the expansion of oxidative metabolic pathways resulting
from the appearance of dioxygen required the evolution of
high-potential redox proteins. By then, however, the
emergence of pertinent Fe–S active sites was certainly
limited by the more oxidizing environment, by competition
with new metal catalysts, copper in particular [102], as well
as by the decreasing availability of novel protein folds [76,
77].
Among low-potential Fe–S folds, the [4Fe–4S] ones
largely outnumber the [2Fe–2S] ones (Table 1). This is
best rationalized by invoking the higher stability of the
tetranuclear clusters [6], which favored their pervasion of
the developing protein-fold space during early evolution. In
this case, and likewise for the predominance of low-potential over high-potential Fe–S folds, chemistry and
geochemistry appear to have played key roles in biological
evolution.
These clear imprints of the inorganic realm in extant
biological Fe–S protein folds suggest that major features of
the latter were determined early in the course of evolution.
The subsequent rise of atmospheric dioxygen, with the
resulting collapse of dissolved iron and sulfide, probably
brought an end to the direct bearing of Fe–S chemistry on
Fe–S protein evolution. Indeed, the new geochemical
conditions forbade spontaneous chemical assembly of
biological Fe–S active sites, and mandated the evolution of
sophisticated biochemical pathways for their synthesis
[14].
In summary, the main features of the extant population
of Fe–S protein folds appear to be the outcome, on one
hand, of the reducing conditions prevalent during the first
half of Earth’s existence and, on the other hand, of fundamental chemical properties of the Fe–S clusters that are
now most widespread among living cells. Thus, Fe–S
proteins and their evolution powerfully illustrate the tight
links between the inorganic realm and life, not merely from
a structural and functional viewpoint, but also from a historical perspective.
References
1. Beinert H, Sands RH (1960) Biochem Biophys Res Commun
3:41–46
2. Mortenson LE, Valentine RC, Carnahan JE (1962) Biochem
Biophys Res Commun 7:448–452
3. Tagawa K, Arnon DI (1962) Nature 195:537–543
4. Malkin R, Rabinowitz JC (1966) Biochem Biophys Res Commun 23:822–827
5. Beinert H, Holm RH, Münck E (1997) Science 277:653–659
6. Rao PV, Holm RH (2004) Chem Rev 104:527–559
7. Beinert H, Meyer J, Lill R (2004) In: Lennarz WJ, Lane MD
(eds) Encyclopedia of biological chemistry, vol 2. Elsevier,
Amsterdam, pp 482–489
8. Rees DC (2002) Annu Rev Biochem 71:221–246
9. Volbeda A, Fontecilla-Camps JC (2005) Dalton Trans 3443–
3450
10. Moulis JM, Davasse V, Golinelli MP, Meyer J, Quinkal I (1996)
J Biol Inorg Chem 1:2–14
11. Sazanov LA, Hinchliffe P (2006) Science 311:1430–1436
12. Johnson MK (1998) Curr Opin Chem Biol 2:173–181
13. Johnson MK, Smith AD (2005) In: King RB (ed) Encyclopaedia of inorganic chemistry, vol 4. Wiley, Chichester,
pp 2589–2619
14. Johnson DC, Dean DR, Smith AD, Johnson MK (2005) Annu
Rev Biochem 74:247–281
15. Mitou G, Higgins C, Wittung-Stafshede P, Conover RC, Smith
AD, Johnson MK, Gaillard J, Stubna A, Münck E, Meyer J
(2003) Biochemistry 42:1354–1364
16. Eady RR, Smith BE, Cook KA, Postgate JR (1972) Biochem J
128:655–675
17. You JF, Papaefthymiou GC, Holm RH (1992) J Am Chem Soc
114:2697–2710
18. Long JR, Holm RH (1994) J Am Chem Soc 116:9987–10002
19. Wächtershäuser G (2006) Philos Trans R Soc Lond Ser B
361:1787–1808
20. Russell MJ (2007) Acta Biotheor (in press). doi:
10.1007/s10441-007-9018-5
21. Kirschvink JL (2005) Engineering & Science 4:10–20
22. Eck RV, Dayhoff MO (1966) Science 152:363–366
23. Adman ET, Sieker LC, Jensen LH (1973) J Biol Chem
248:3987–3996
24. Sieker LC, Adman ET (2001) In: Messerschmidt A, Huber R,
Poulos T, Wieghardt K (eds) Handbook of metalloproteins.
Wiley, Chichester, pp 574–592
25. Orengo CA, Thornton JM (2005) Annu Rev Biochem 74:867–
900
26. Herskovitz T, Averill BA, Holm RH, Ibers JA, Phillips WD,
Weiher JF (1972) Proc Natl Acad Sci USA 69:2437–2441
27. Moulis JM, Sieker LC, Wilson KS, Dauter Z (1996) Protein Sci
5:1765–1775
28. Johnson MK, Duderstadt RE, Duin EC (1999) Adv Inorg Chem
47:1–82
29. Beinert H, Kennedy MC, Stout CD (1996) Chem Rev 96:2335–
2373
30. Dauter Z, Wilson KS, Sieker LC, Meyer J, Moulis J-M (1997)
Biochemistry 36:16065–16073
31. Darimont B, Sterner R (1994) EMBO J 13:1772–1781
32. Brochier C, Philippe H (2002) Nature 417:244
33. Skophammer RG, Servin JA, Herbold CW, Lake JA (2007) Mol
Biol Evol 24:1761–1768
34. Bartsch RG (1978) Methods Enzymol 53:329–340
35. Ciurli S, Musiani F (2005) Photosynth Res 85:115–131
36. Liu L, Nogi T, Kobayashi M, Nozawa T, Miki K (2002) Acta
Cryst D58:1085–1091
37. Carter CW Jr (2001) In: Messerschmidt A, Huber R, Poulos T,
Wieghardt K (eds) Handbook of metalloproteins. Wiley,
Chichester, pp 602–609
38. Bertini I, Luchinat C, Provenzani A, Rosato A, Vasos PR (2002)
Proteins 46:110–127
39. Tsukihara T, Fukuyama K, Nakamura M, Katsube Y, Tanaka N,
Kakudo M, Wada K, Hase T, Matsubara H (1981) J Biochem
(Tokyo) 90:1763–1773
40. Grinberg AV, Hannemann F, Schiffler B, Müller J, Heinemann
U, Bernhardt R (2000) Proteins 40:590–612
41. Kakuta Y, Horio T, Takahashi Y, Fukuyama K (2001) Biochemistry 40:11007–11012
42. Hugo N, Meyer C, Armengaud J, Gaillard J, Timmis KN,
Jouanneau Y (2000) J Bacteriol 182:5580–5585
43. Frolow F, Harel M, Sussman JL, Mevarech M, Shoham M
(1996) Nat Struct Biol 3:452–458
44. Zanetti G, Binda C, Aliverti A (2001) In: Messerschmidt A,
Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 532–542
45. Morales R, Charon MH, Hudry-Clergeon G, Pétillot Y, Nørager
S, Medina M, Frey M (1999) Biochemistry 38:15764–15773
46. Link TA (2001) In: Messerschmidt A, Huber R, Poulos T,
Wieghardt K (eds) Handbook of metalloproteins. Wiley,
Chichester, pp 518–531
47. Lebrun E, Santini JM, Brugna M, Ducluzeau AL, Ouchane S,
Schoepp-Cothenet B, Baymann F, Nitschke W (2006) Mol Biol
Evol 23:1180–1191
48. Iwata S, Saynovits M, Link TA, Michel H (1996) Structure
4:567–579
49. Colbert CL, Couture MMJ, Eltis LD, Bolin JT (2000) Structure
8:1267–1278
50. Shethna YI, Wilson PW, Hansen RE, Beinert H (1964) Proc
Natl Acad Sci USA 52:1263–1271
51. Yeh AP, Chatelet C, Soltis SM, Kuhn P, Meyer J, Rees DC
(2000) J Mol Biol 300:587–595
52. Meyer J (2001) FEBS Lett 509:1–5
53. Vignais PM, Billoud B, Meyer J (2001) FEMS Microbiol Rev
25:455–501
54. Herriott JR, Sieker LC, Jensen LH (1970) J Mol Biol 50:391–
406
55. Meyer J, Moulis JM (2001) In: Messerschmidt A, Huber R,
Poulos T, Wieghardt K (eds) Handbook of metalloproteins.
Wiley, Chichester, pp 505–517
56. Archer M, Huber R, Tavares P, Moura I, Moura JJG, Carrondo
MA, Sieker LC, LeGall J, Romão MJ (1995) J Mol Biol
251:690–702
57. deMaré F, Kurtz DM Jr, Nordlund P (1996) Nat Struct Biol
3:539–546
58. Yeh AP, Hu Y, Jenney FE Jr, Adams MWW, Rees DC (2000)
Biochemistry 39:2499–2508
59. Logan DT, Mulliez E, Larsson KM, Bodevin S, Atta M,
Garnaud PE, Sjöberg BM, Fontecave M (2003) Proc Natl Acad
Sci USA 100:3826–3831
60. Scherr N, Honnappa S, Kunz G, Mueller P, Jayachandran R,
Winkler F, Pieters J, Steinmetz MO (2007) Proc Natl Acad Sci
USA 104:12151–12156
61. Meyer J, Gagnon J, Gaillard J, Lutz M, Achim C, Münck E,
Pétillot Y, Colangelo CM, Scott RA (1997) Biochemistry
36:13374–13380
62. Iwasaki T, Kounosu A, Tao Y, Li Z, Shokes JE, Cosper NJ, Imai
T, Urushiyama A, Scott RA (2005) J Biol Chem 280:9129–9134
63. Dauter Z, Wilson KS, Sieker LC, Moulis JM, Meyer J (1996)
Proc Natl Acad Sci USA 93:8836–8840
64. Maher M, Cross M, Wilce MCJ, Guss JM, Wedd AG (2004)
Acta Crystallogr Sect D 60:298–303
65. Meyer J (2004) FEBS Lett 570:1–6
66. Meyer J (2007) Cell Mol Life Sci 64:1063–1084
67. Leiros HKS, McSweeney SM (2007) J Struct Biol 159:92–102
68. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN,
Weissig H, Shindyalov IN, Bourne PE (2000) Nucleic Acids Res
28:235–242
69. Bertini I, Luchinat C, Parigi G, Pierattelli R (2005) Chembiochem 6:1536–1549
70. Sayle RA, Milner-White EJ (1995) Trends Biochem Sci 20:374–
376
71. Andrade SLA, Cruz F, Drennan CL, Ramakrishnan V, Rees DC,
Ferry JG, Einsle O (2005) J Bacteriol 187:3848–3854
72. Dai S, Friemann R, Glauser DA, Bourquin F, Manieri W,
Schürmann P, Eklund H (2007) Nature 448:92–98
73. Peters JW, Lanzilotta WN, Lemon BJ, Seefeldt LC (1998)
Science 282:1853–1858
74. Holm L, Sander C (1993) J Mol Biol 233:123–138
75. Lancaster CRD, Kröger A, Auer M, Michel H (1999) Nature
402:377–385
76. Grabowski M, Joachimiak A, Otwinowski Z, Minor W (2007)
Curr Opin Struct Biol 17:347–353
77. Caetano-Anolles G, Kim HS, Mittenthal JE (2007) Proc Natl
Acad Sci USA 104:9358–9363
78. Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J
(2006) Proc Natl Acad Sci USA 103:2605–2610
79. Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ,
Remington K, Eisen JA, Heidelberg KB, Manning G, Li W,
Jaroszewski L, Cieplak P, Miller CS, Li H, Mashiyama ST,
Joachimiak MP, van Belle C, Chandonia JM, Soergel DA, Zhai
Y, Natarajan K, Lee S, Raphael BJ, Bafna V, Friedman R,
Brenner SE, Godzik A, Eisenberg D, Dixon JE, Taylor SS,
Strausberg RL, Frazier M, Venter JC (2007) PLoS Biol 5:e16
80. Layer G, Heinz DW, Jahn D, Schubert W-D (2004) Curr Opin
Chem Biol 8:468–476
81. Morimoto K, Yamashita E, Kondou Y, Lee SJ, Arisaka F,
Tsukihara T, Nakai M (2006) J Mol Biol 360:117–132
82. Berkovitch F, Nicolet Y, Wan JT, Jarrett JT, Drennan CL (2004)
Science 303:76–79
83. Collet JF, Peisach D, Bardwell JC, Xu Z (2005) Protein Sci
14:1863–1869
84. Paddock ML, Wiley SE, Axelrod HL, Cohen AE, Roy M, Abresch EC, Capraro D, Murphy AN, Nechushtai R, Dixon JE,
Jennings PA (2007) Proc Natl Acad Sci USA 104:14342–14347
85. Demple B (2002) Mol Cell Biochem 234/235:11–18
86. Kiley PJ, Beinert H (2003) Curr Opin Microbiol 6:181–185
87. Dupuy J, Volbeda A, Carpentier P, Darnault C, Moulis J-M,
Fontecilla-Camps JC (2006) Structure 14:129–139
88. Walden WE, Selezneva AI, Dupuy J, Volbeda A, Fontecilla-Camps JC, Theil EC, Volz C (2006) Science 314:1903–1908
89. Yeh AP, Ambroggio XI, Andrade SLA, Einsle O, Chatelet C,
Meyer J, Rees DC (2002) J Biol Chem 277:34499–34507
90. Anderson GL, Howard JB (1984) Biochemistry 23:2118–2122
91. Sen S, Igarashi R, Smith A, Johnson MK, Seefeldt LC, Peters
JW (2004) Biochemistry 43:1787–1797
92. Mayerle JJ, Frankel RB, Holm RH, Ibers JA, Phillips WD,
Weiher JF (1973) Proc Natl Acad Sci USA 70:2429–2433
93. Müller A, Schladerbeck NH (1986) Naturwissenschaften
73:S669
94. Müller A, Schladerbeck NH (1985) Chimia 39:23–24
95. Hagen KS, Reynolds JG, Holm RH (1981) J Am Chem Soc
103:4054–4063
96. Stack TDP, Holm RH (1988) J Am Chem Soc 110:2484–2494
97. Weigel JA, Holm RH (1991) J Am Chem Soc 113:4184–4191
98. Zhou J, Hu Z, Münck E, Holm RH (1996) J Am Chem Soc
118:1966–1980
99. Meyer J, Fujinaga J, Gaillard J, Lutz M (1994) Biochemistry
33:13642–13650
100. Broach RB, Jarrett JT (2006) Biochemistry 45:14166–14174
101. Delaye L, Becerra A, Lazcano A (2005) Orig Life Evol Biosph
35:537–554
102. Williams RJP (2007) Dalton Trans 991–1001
103. Burroughs AM, Balaji S, Iyer LM, Aravind L (2007) Biol Direct
2:18