Iron–sulfur protein folds, iron–sulfur chemistry, and evolution

J Biol Inorg Chem (2008) 13:157–170
DOI 10.1007/s00775-007-0318-7

MINIREVIEW

Jacques Meyer

Received: 5 September 2007 / Accepted: 25 October 2007 / Published online: 9 November 2007
© SBIC 2007

Abstract An inventory of unique local protein folds around Fe–S clusters has been derived from the analysis of protein structure databases. Nearly 50 such folds have been identified, and over 90% of them harbor low-potential [2Fe–2S]2+,+ or [4Fe–4S]2+,+ clusters. In contrast, high-potential Fe–S clusters, notwithstanding their structural diversity, occur in only three different protein folds. These observations suggest that the extant population of Fe–S protein folds has to a large extent been shaped in the reducing iron- and sulfur-rich environment that is believed to have predominated on this planet until approximately two billion years ago. High-potential active sites are then surmised to be rarer because they emerged later, in a more oxidizing biosphere, in conditions where iron and sulfide had become poorly available, Fe–S clusters were less stable, and in addition faced competition from heme iron and copper active sites. Among the low-potential Fe–S active sites, protein folds hosting [4Fe–4S]2+,+ clusters outnumber those with [2Fe–2S]2+,+ ones by a factor of 3 at least. This is in keeping with the higher chemical stability and versatility of the tetranuclear clusters, compared with the binuclear ones. It is therefore suggested that, at least while novel Fe–S sites are evolving within proteins, the intrinsic chemical stability of the inorganic moiety may be more important than the stabilizing effect of the polypeptide chain. The discovery rate of novel Fe–S-containing protein folds underwent a sharp increase around 1995, and has remained stable to this day.
The current trend suggests that the mapping of the Fe–S fold space is not near completion, in agreement with predictions made for protein folds in general. Altogether, the data collected and analyzed here suggest that the extant structural landscape of Fe–S proteins has been shaped to a large extent by primeval geochemical conditions on one hand, and iron–sulfur chemistry on the other.

Keywords Ferredoxin · Rubredoxin · Hydrogenase · Iron–sulfur · Bioenergetics · Evolution

Abbreviations
CoA: coenzyme A
EPR: electron paramagnetic resonance
Fd: ferredoxin
FNR: fumarate nitrate regulator
GABA: γ-aminobutyric acid
HiPIP: high potential iron protein
PDB: Protein Data Bank
PRPP: phosphoribosylpyrophosphate
Rd: rubredoxin
tRNA: transfer RNA

J. Meyer (✉)
Laboratoire de Chimie et Biologie des Métaux, IRTSV, Commissariat à l’Energie Atomique/CNRS/Université Joseph Fourier, UMR5249, CEA-Grenoble, 38054 Grenoble, France
e-mail: jacques.meyer@cea.fr; jccfmeyer@numericable.fr

Introduction

Fe–S clusters are ubiquitous and essential components of living cells. Proteins containing Fe–S active sites were first detected as electron paramagnetic resonance signatures in mitochondrial membranes [1], and shortly thereafter small soluble ferredoxins (Fd) were isolated [2, 3]. Within only a few years, a variety of other small Fe–S proteins were characterized, and were soon found to contain iron and inorganic sulfide [4]. Over the following decades, studies implementing X-ray crystallography, chemical synthesis of structural analogs, and spectroscopy revealed the structural frameworks and chemical and magnetic properties of a variety of Fe–S clusters [5–7]. While most of the latter clusters consist of one to four iron atoms, some larger ones contain up to eight irons [8]. Metals other than iron (e.g., nickel or molybdenum) may be part of or bound to Fe–S clusters [8, 9].
Fe–S clusters have a marked preference for thiolate ligation [6], and accordingly cysteinyl sulfur is by far the most frequently implemented ligand of Fe–S active sites. Nonetheless, histidine and to a lesser extent glutamine, serine, or arginine ligation has been evidenced in several cases [10] (Table 1). Presently known Fe–S proteins range in size from 6 kDa to over 500 kDa, contain up to nine Fe–S clusters [11], are present in all kinds of cells and cellular compartments, and are involved in all sorts of cellular functions [7, 12–14]. While Fe–S clusters are intrinsically oxygen-sensitive, their stability in proteins is vastly dependent on the polypeptide matrix: some are stable in air for weeks (e.g., thermophilic Fd [15]), while others are destroyed within tens of seconds (e.g., nitrogenase [16]). Biomimetic Fe–S chemistry has been an extremely fruitful approach in the research on Fe–S proteins [6]. Thorough investigations over more than three decades have brought forth detailed models for nearly all Fe–S active sites in proteins, and insights into most aspects of their structure and function [5, 6]. These studies have demonstrated that Fe–S active-site analogs can exist in the absence of polypeptide chains, unveiled the plasticity and dynamics of Fe–S clusters, and produced a host of structures interconnecting solid-state, inorganic, and biological Fe–S chemistry [6, 17, 18]. Another facet of Fe–S chemistry is its proposed involvement in the production of prebiotic organic molecules [19], and possibly in the evolution of protocellular systems [20]. These hypotheses are not unrealistic in view of the geochemical conditions that most likely prevailed at that time [21]; such a chemical environment would have favored the spontaneous assembly of Fe–S active sites within primitive macromolecules (early proteins or their precursors) possessing adequately positioned thiolate ligands. 
Some simple Fe–S proteins [2] displaying remarkably primitive features in their sequence [22] could then be regarded as “fossils” endowed with the potential to report on these early events. Likewise, the overwhelming diversity of extant Fe–S proteins is suggestive of an early and close association of Fe–S chemistry with the development of life on this planet.

Insights into these questions may be sought by analyzing the structural diversity of Fe–S proteins. The latter is probably best represented, even though incompletely, in the several hundred Fe–S protein entries deposited with the Protein Data Bank (PDB; http://www.rcsb.org). However, rather than whole structures which often consist of several domains or subunits and may contain up to nine Fe–S clusters [11], we have considered here folds around individual Fe–S clusters. These folds, consisting of either entire small proteins or parts of larger ones, may be regarded as the basic units of biological Fe–S structural diversity: indeed, while Fe–S clusters assume but a few different structural frameworks, the variety of polypeptide folds that host them is considerably larger. It is shown here that the number of distinct Fe–S folds is close to 50, 34 with [4Fe–4S] clusters, 11 with [2Fe–2S] clusters, and only one with a [1Fe] site. The discussion will focus on how prebiotic and Fe–S chemistry, as well as protein evolution, may have affected the nature and distribution of protein folds around extant Fe–S clusters.

Classic Fe–S proteins and folds

This section includes small and relatively simple Fe–S proteins that were discovered in the early and mid 1960s, mostly thanks to their high stability and widespread distribution. For these very reasons they yielded to structural analysis at an early stage and revealed the frameworks of the now classic [4Fe–4S], [2Fe–2S], and [1Fe] active sites (Fig. 1), as well as the most common Fe–S-containing protein folds.
Fig. 1 a Structures of Fe–S active sites. Tetrahedral FeS4 coordination and diversity of iron nuclearity are prominent characteristics of these active sites. b Positions of the Sγ atoms of the cysteine ligands. The [3Fe–4S] cluster (not shown here) is derived from the [4Fe–4S] framework by removal of one iron and its cysteine ligand [5, 6]

Table 1 Unique protein folds around [4Fe–4S], [2Fe–2S], and [1Fe] active sites

Fold(a) | Date of discovery(b) | PDB entries(c) | Protein | Quaternary structure(d) | Fe–S ligands | Other Fe–S folds in same protein
[4Fe–4S]
1 | 1972 | 1dur, 2fdn | 2[4Fe–4S] Fd | α (55) | (C8 C11 C14 C47) (C18 C37 C40 C43)(e) |
2(f) | 1972 | 1hip, 1iua | HiPIP | α (83) | C41 C46 C61 C75 |
3 | 1986 | 2tmd, 1o94 | Trimethylamine dehydrogenase | α2 (729) | C345 C348 C351 C364 |
4 | 1989 | 5acn, 1c96 | Aconitase | α (754) | C358 C421 C424 citrate(g) |
5 | 1992 | 1nip, 1cp2 | Nitrogenase Fe protein | α2 (273) | C94 C129 (subunit-bridging cluster) |
6 | 1994 | 1gph, 1ao0 | Glutamine PRPP amidotransferase | α4 (465) | C236 C382 C437 C440 |
7 | 1995 | 1aor | Aldehyde-Fd oxidoreductase | α2 (605) | C288 C291 C295 C494 |
8 | 1995 | 2abk, 1kg2 | Endonuclease III | α (211) | C187 C194 C197 C203 |
9(h) | 1995 | 1frv, 1wui | NiFe hydrogenase | α (551) β (254) | C17 C20 C112 C148 | 10, 11
10 | 1995 | 1frv, 1wui | NiFe hydrogenase | α (551) β (254) | H185(Nδ) C188 C213 C219 | 9, 11
11(i) | 1995 | 1frv, 1wui | NiFe hydrogenase | α (551) β (254) | C228 C246 C249 | 9, 10
12 | 1996 | 1lrv | Leucine-rich repeat | α (244) | C14 C17 C29 C35 |
13 | 1996 | 2pps, 1jb0 | Photosystem I | Heterododecamer (PsaA–F, PsaI–M, PsaX) | C578(PsaA) C587(PsaA) C565(PsaB) C574(PsaB) (PsaA–PsaB bridging cluster) | 1
14 | 1997 | 1aa6 | Formate dehydrogenase H | α (715) | C8 C11 C15 C42 |
15 | 1997 | 1aa6 | Sulfite reductase (SiRHP) | α (570) | C434 C440 C479 C483(br)(j) |
16 | 1998 | 1feh | FeFe hydrogenase | α (574) | H94(Nε) C98 C101 C107 | 1, 17, 101
17 | 1998 | 1feh, 1hfe | FeFe hydrogenase | α (574) | C300 C355 C499 C503(br)(k) | 1, 16, 101
18 | 1998 | 1e1d, 1gnt | Hybrid cluster protein(l) | α (553) | C3 C6 C15 C21 |
19 | 1999 | 1b0p, 2c42 | Pyruvate-Fd oxidoreductase | α2 (1,232) | C812 C815 C840 C1071 | 1
20(f) | 2000 | 1dj7, 2pvo | Fd-thioredoxin reductase | α (75) β (117) | C55 C74 C76 C85 |
21(m) | 2000 | 1ea0 | Glutamate synthase | α2 (1,479) | C1102 C1108 C1118 |
22 | 2001 | 1hux | HO-glutaryl-CoA dehydratase (CompA) | α2 (260) | C127 C166 (subunit-bridging cluster) |
23 | 2001 | 1su8 | CO dehydrogenase | α2 (636) | C48 C51 C56 C70 | 24
24 | 2001 | 1su8 | CO dehydrogenase | α2 (636) | C39 C47 (subunit-bridging cluster) | 23
25(n) | 2001 | 1h7w | Dihydropyrimidine dehydrogenase | α2 (1,025) | C91 C130 C136 Q156(Oε) |
26 | 2002 | 1mjg, 1oao | Acetyl-CoA synthase | α2 (729) β2 (674)(o) | C506 C509(br)(p) C518 C528 | 23, 24
27 | 2003 | 1olt | HemN | α (457) | C62 C66 C69 S-adenosylmethionine (Met N,O) |
28 | 2004 | 1u8v | 4-Hydroxybutyryl-CoA dehydratase | α4 (490) | C99 C103 H292(Nε) C299 |
29 | 2005 | [71](q) | Fe–S flavoprotein | α4 (203) | C47 C50 C53 C59 |
30 | 2006 | 2fug | Complex I (hydrophilic domain, hetero-octamer) | Nqo1 (438) | C354 C356 C359 C400 | 1, 16, 31, 101, 104
31(r) | 2006 | 2fug | Complex I (hydrophilic domain, hetero-octamer) | Nqo6 (181) | C45 C46 C111 C140 | 1, 16, 30, 101, 104
32 | 2006 | 2goy | Adenosine 5′-phosphosulfate reductase | α4 (267) | C139 C140 C228 C231 |
33 | 2007 | 2g36 | Tryptophanyl-tRNA synthetase | α2 (340) | C236 C259 C266 C269 |
34 | 2007 | 2jh3 | Cobalt chelatase | α4 (472) | C420 C423 C448 C452 |
35 | 2007 | 2z1d | HypD | α (372) | C323 C338 C345 C362 |
[2Fe–2S]
101 | 1981 | 1fxi, 1czp | Plant- and vertebrate-type Fd | α (98) | C41 C46 C49 C79 |
102 | 1995 | 1dgj, 1vlb | Aldehyde oxidoreductase | α (907) | C100 C103 C137 C139 | 101
103 | 1996 | 1rie, 1jm1 | Rieske-type proteins | α (250) | C140 H142(Nδ) C170 H173(Nδ) |
104 | 2000 | 1f37, 1m2d | Thioredoxin-like Fd | α2 (110) | C9 C22 C55 C59 |
105(n) | 2001 | 1hrk, 2hrc | Ferrochelatase(t) | α2 (423) | C196 C403 C406 C411 |
106 | 2004 | 1r30 | Biotin synthase | α2 (369) | C97 C128 C188 R290(Nη) | 27
107(n) | 2004 | 1ohv | GABA aminotransferase | α2 (472) | C135 C138 (subunit-bridging cluster) |
108 | 2006 | 1x0g | IscA | αα′ (112)(u) | C37 C101 C103 C103′ |
109 | 2006 | 2ht9 | Glutaredoxin | α2 (146) | C37 glutathione (subunit-bridging cluster) |
110 | 2007 | 2hu9 | CopZ(v) | α (204) | C75 C77 C109 C119 |
111(n) | 2007 | 2qh7 | MitoNEET(w) | α2 (108) | C72 C74 C83 H87(Nδ) |
[1Fe]
201 | 1970 | 4rxn, 2dsx | Rubredoxin | α (54) | C6 C9 C39 C42 |

PDB Protein Data Bank, HiPIP high-potential iron protein, PRPP phosphoribosylpyrophosphate, Fd ferredoxin, CoA coenzyme A, tRNA transfer RNA, GABA γ-aminobutyric acid
a Numbered in the order of discovery, starting with “1” for [4Fe–4S] active sites, “101” for [2Fe–2S], and “201” for [1Fe]
b First coordinate file deposition or publication; the latter may precede the former by several years for some earlier structures
c A second entry has been included for subsequent structures with significantly higher resolution. The underscored entry is the one to which the numbering of the Fe–S cluster ligands refers
d The length (amino acid residues) of each subunit is indicated. The underscored subunit contains the Fe–S fold of interest
e Each parenthesis contains the ligands of one of the [4Fe–4S] clusters. The numerous varieties and presumed evolution of this fold are summarized in Fig. 2
f HiPIP and Fd-thioredoxin reductase (fold 20) are the only protein folds accommodating [4Fe–4S] clusters in their 3+/2+ high-potential transition; all other folds listed here contain [4Fe–4S]2+/1+ clusters. However, while the HiPIP cluster is genuinely high potential (see text), the Fd-thioredoxin reductase cluster has a low redox potential owing to its chemical interaction with a dithiol/disulfide cysteine pair [72]
g Three iron atoms have normal ligation to cysteine; the fourth one has three oxygen ligands (H2O, Cβ carboxyl and Cβ hydroxyl from citrate)
h Flavodoxin-like domain
i [3Fe–4S] cluster in this structure. In some [NiFe] hydrogenases (e.g., 1cc1) a [4Fe–4S] cluster is present thanks to the occurrence of a fourth cysteine ligand in the counterpart position of P239
j C483 bridges one of the [4Fe–4S] irons and the heme iron
k C503 bridges one of the [4Fe–4S] irons and one of the two [FeFe] active-site irons
l Contains another 4Fe cluster that is not of the [4Fe–4S] type: it is highly asymmetric, involves several Fe–O bonds, and undergoes redox-linked structural changes
m [3Fe–4S] cluster
n Folds 25, 105, 107, and 111 are the only ones that have so far been found only in eukaryotes
o Bifunctional (acetyl-CoA synthase/CO dehydrogenase) enzyme, in which the acetyl-CoA synthase function is assumed by the α2 moiety
p C509 bridges one of the [4Fe–4S] irons and one of the [NiNi] active-site nickels. All ligands belong to the α subunits
q Coordinates not deposited in the PDB [71]
r Flavodoxin-like subunit, but the positions of the cysteine ligands, and the polypeptide fold around the [4Fe–4S] cluster differ from those in fold 9 [11]
s Soluble fragment (residues 47–250)
t N-terminus (residues 1–62) removed
u Homodimer, but the two subunits assume different conformations
v N-terminal (1–131) domain
w Water-soluble C-terminal (33–108) domain

2[4Fe–4S] ferredoxin

The first isolated Fe–S protein was a small (55 residues) clostridial Fd [2] that was subsequently shown to contain two [4Fe–4S] clusters [23], hence the designation 2[4Fe–4S] Fd. These proteins are low-potential (-150 to -700 mV) electron carriers implementing the [4Fe–4S]2+,+ redox transition [7]. They are therefore mostly involved in anaerobic metabolic pathways and in the more reducing parts of photosynthetic and aerobic electron transfer chains.

Clostridial 2[4Fe–4S] Fd is a compact ellipsoid where the polypeptide chain is tightly wrapped around the two [4Fe–4S] clusters. Its iron content (eight atoms for 55 amino acids) is unusually high; thus, the inorganic moiety makes up a large part of the structure [23, 24], and much of the stability is provided by the interactions between the polypeptide chain and the inorganic core. The structure displays a conspicuous twofold symmetry resulting from an ancient sequence duplication (see later). Each half of the sequence contributes cysteine ligands to both clusters; thus, the two sets of cysteine ligands are not consecutive in the sequence, but overlapping, and both clusters are accommodated in a single domain. The 2[4Fe–4S] Fd fold is in fact the only one, among those identified and listed in Table 1, that contains a pair of Fe–S clusters, rather than a single one. 2[4Fe–4S] Fd clusters assume a quite common βαββαβ topology; hence, the latter has been dubbed “ferredoxin fold” [25]. However, except for the shared topology, most of the so-called Fd folds differ structurally from the genuine 2[4Fe–4S] Fd fold and are unlikely to bear any phylogenetic relationship with it.

The primary structure of clostridial 2[4Fe–4S] Fd revealed a putatively primitive amino acid composition and a clear trace of ancestral gene duplication. It was therefore inferred that this protein fold is very ancient [22]. The demonstration that the apoprotein would spontaneously refold into native Fd in the presence of iron ions and sulfide [4] suggested the possibility of a spontaneous assembly in conditions believed to be those required for the emergence of life on this planet [20]. It also suggested the feasibility of the thereafter successful synthesis of active-site analogs by autoassembly [26].

In keeping with its surmised antiquity, the 2[4Fe–4S] Fd fold is probably the most widespread Fe–S protein fold, and the one that has undergone the most extensive modifications [27]. Some of the latter are outlined in Fig. 2. They include insertions of polypeptide segments, loss of one of the clusters and occasional incorporation of a disulfide bond, as well as formation of [3Fe–4S] clusters by abstraction of one iron, a recurrent theme in Fe–S chemistry and biochemistry [6, 28, 29].
The 2[4Fe–4S] Fd fold is also remarkable by its frequent occurrence, not merely in small soluble Fd, but also in subunits and domains of redox enzymes, often in combination with other Fe–S protein folds (see later).

It should be pointed out that the clearest imprint of the primordial gene duplication is found in mesophilic clostridial 2[4Fe–4S] Fd, e.g., Clostridium acidurici [30], which are therefore presumably the most primitive. In contrast, all thermophilic [4Fe–4S] Fd are less symmetric forms sporting various sorts of polypeptide chain extensions and occasional loss of one [4Fe–4S] cluster (Fig. 2). Thermophilic Fd are therefore more likely to be derived forms than primitive ones, in contradiction with a previous proposal [31]. The phylogeny of 2[4Fe–4S] Fd would thus be consistent with a mesophilic or moderately thermophilic root of the tree of life [32, 33].

Fig. 2 Evolutionary scheme of the 2[4Fe–4S] Fd fold. Only Fd are shown here. Even greater variations are displayed by homologous domains in redox enzymes. The generic names of the microorganisms are indicated (thermophiles in red), followed by the relevant Protein Data Bank (PDB) entries. The ancestral clostridial Fd is shown at the top, and the most significant variations are indicated on the arrows. Iron atoms are shown as cyan spheres, while sulfide atoms are not shown for clarity. Clusters labeled 3 are of the [3Fe–4S] type. The core polypeptide fold common to all proteins is shown as black strands, insertions are shown as red ribbons, disulfide bridges are shown in yellow, and the zinc atom is shown in green. The structures were drawn using RASMOL [70]

High-potential [4Fe–4S] protein

[4Fe–4S] high-potential iron proteins (HiPIP) are small (55–85 residues) proteins isolated mostly, but not exclusively, from photosynthetic bacteria [34]. The role of HiPIP as electron donors to the tetraheme cytochrome in photosynthetic bacteria is now well established [35].
In contrast, their function in nonphotosynthetic organisms remains unknown. HiPIP are globular proteins nearly devoid of secondary structure. The [4Fe–4S] cluster is bound to four conserved cysteines and is buried within the protein interior [36]. As a result of its hydrophobic environment and hydrogen-bonding network, the cluster implements the [4Fe–4S]3+,2+ transition; hence, the high redox potential (+100 to +400 mV) [37]. HiPIP-like folds have not been found in subunits or domains of larger proteins. While the name “HiPIP” has been conserved by tradition, these proteins are in fact high-potential [4Fe–4S] Fd, as opposed to the low-potential ones described in the previous section.

[2Fe–2S] plant- and vertebrate-type ferredoxin

Proteins of this type constitute a large family composed of several subgroups [38]. Plant-type proteins function as electron carriers between photosystem I and several enzymes [3, 39], thus linking the “light” and “dark” reactions. Vertebrate (e.g., adrenodoxin) and bacterial (e.g., putidaredoxin) [2Fe–2S] Fd transfer electrons to hydroxylating enzymes, usually P450 cytochromes [40]. Other groups include the [2Fe–2S] IscFd involved in the biosynthesis of Fe–S clusters [14, 41], and the XylT [2Fe–2S] Fd committed to the activation of some oxygenases [42]. Yet other [2Fe–2S] Fd are found in halobacteria [43] and in the hyperthermophile Aquifex aeolicus [15]. All these low-potential proteins (-150 to -450 mV) implement the [2Fe–2S]2+,+ redox transition of the cluster.

Plant- and vertebrate-type [2Fe–2S] Fd are globular proteins (approximately 100 residues) wherein the [2Fe–2S] cluster is located near the surface, and is protected by a long loop including three of the four cysteine ligands [39–41, 43–45]. The opposite side of the molecule consists of a four-stranded β-sheet covered by an α-helix, which together form a ubiquitin-like motif known as β-grasp [25].
Additional lateral α-helices connect the cluster-binding region and the β-grasp [39–41, 43–45]. Notwithstanding their functional diversity, plant- and vertebrate-type Fd are structurally much less diverse than the 2[4Fe–4S] Fd. Variations include an approximately 20 residue N-terminal extension in halobacterial Fd [43], and differences in the redox-partner interaction surfaces between the plant-type and the vertebrate-type Fd [40]. While this [2Fe–2S] Fd fold is best known for its widespread occurrence in small soluble Fd, it is also present in domains of many redox enzymes, often in combination with other Fe–S domains and clusters (see later).

Rieske [2Fe–2S] protein

Rieske proteins were first characterized as subunits of respiratory (cytochrome bc1) and photosynthetic (cytochrome b6f) electron transfer complexes, but were subsequently found in oxygenases, either as subunits or domains, or as small electron carrier Fd [46, 47]. The basic structural framework (approximately 120 residues) of Rieske proteins consists of three stacked β-sheets, of which the upper one includes the two ligand loops holding the [2Fe–2S] cluster [46, 48]. These two loops contain one cysteine and one histidine ligand each, and are interconnected by a disulfide bond which contributes significantly to the stability of the protein. The iron atom closer to the surface has two solvent-exposed histidine ligands; the other iron is bound to two buried cysteine ligands [46, 48]. In membrane-bound complexes (e.g., cytochrome bc1), the Rieske-type subunits possess N-terminal hydrophobic anchors [46]. The fold of the cluster-containing subdomain of Rieske proteins is strikingly similar to that of rubredoxin (Rd) [48] (see “Rubredoxin”). The redox transition of the cluster in Rieske proteins is [2Fe–2S]2+,+, as in plant-type Fd, but histidine ligation causes upshifts of the redox potential.
As a result, many Rieske proteins have positive potentials (+100 to +400 mV); nevertheless, some dioxygenase-associated Rieske proteins (approximately -100 mV) display redox potentials closer to those of plant-type Fd [49].

Thioredoxin-like [2Fe–2S] ferredoxin

Thioredoxin-like [2Fe–2S] Fd were among the first Fe–S proteins discovered [50], but their structure has only been elucidated recently [51]. Their function is largely unknown, although some data suggest an involvement in nitrogen metabolism [52]. The redox transition of the cluster is [2Fe–2S]2+,+, and the redox potentials are around -300 mV [52]. The two identical subunits of approximately 100 residues contain one [2Fe–2S] cluster each and assume a thioredoxin-like fold. Unlike in thioredoxin, however, the β-sheet is covered by α-helices only on one side, allowing the other side to be implemented as the subunit interface [51]. These Fd are found only in bacteria [52], and are thus less common than the [2Fe–2S] plant-type or the bacterial 2[4Fe–4S] Fd. The thioredoxin-like [2Fe–2S] Fd fold is nevertheless encountered in a number of multiprotein complexes or redox enzymes [52]. Well-known examples are hydrogenases and NADH ubiquinone oxidoreductase [53]. In the latter multisubunit protein, the thioredoxin-like Fd fold is present as a single copy but carries an N-terminal extension of approximately 75 residues that folds as a helix bundle and contributes a tight interaction with a neighboring subunit [11].

Rubredoxin

The active site of the small (approximately 55 residues) soluble Rd consists of a single iron atom coordinated to four cysteinyl sulfurs occurring on two CysXXCys segments and belonging to two symmetry-related loops [54, 55]. The FeS4 site of Rd is unique among Fe–S active sites in being devoid of inorganic sulfur. The FeS4 structural unit also stands out as the basic building block of all Fe–S clusters.
The redox potential of the [1Fe]3+,2+ couple in Rd is within the -100 to +200 mV range. Rd function as electron carriers in various redox chains, many of which are linked to oxygenation reactions or protection against oxidative damage [55]. While the distribution of Rd is rather scanty, Rd-type proteins and domains are very diverse. Some alkane-oxidizing bacteria contain a Rd consisting of approximately 130 residues, and a yet larger one (approximately 160 residues) containing two Rd modules [55]. A few algae contain a Rd with an approximately 30 residue C-terminal membrane anchor [55]. Desulforedoxin is a homodimer of which each subunit (36 residues) assumes a shortened Rd-like fold with one CysXXCys and one CysCys [56] ligand loop. Rubrerythrin [57] and superoxide reductase [58] contain Rd-like domains in addition to their active-site domains. Rd-like domains of undetermined functions, perhaps regulatory, have been observed in the structures of larger proteins [59, 60] (see later).

The Rd fold is remarkably similar to the Rieske-protein fold in the active-site region [48] (see “Rieske [2Fe–2S] protein”), even though different ligand sets are implemented in the two cases. Indeed, mutating a single cysteine ligand of the iron results in the installation of a [2Fe–2S] cluster in Rd [61], and conversely, the Rieske site can be converted into a [1Fe] Rd-like site by a triple mutation [62]. Rd otherwise displays striking similarities with Zn-containing protein folds belonging to zinc-finger families. Rd itself has similar affinities for iron and zinc, and the metal content of Rd in vivo appears to be determined, at least in Escherichia coli, by the relative concentrations of the two metals in the culture medium [55, 63]. Heavier metals of the same series (cadmium, mercury) can also bind to the Rd active site [64].
Zinc-containing and cadmium-containing Rd-like domains have been found in a type III ribonucleotide reductase [59], and in a protein kinase G [60], respectively.

Proteins containing large Fe–S clusters

Besides the classic Fe–S clusters discussed already, larger and more complex Fe–S clusters are present in a few enzymes [8]. Prominent among them are the 8Fe7S (P cluster) and Mo7Fe9S (FeMoco) clusters of nitrogenase, the 2Fe–[4Fe–4S] active site of [FeFe] hydrogenases, and the 2Ni–[4Fe–4S] (A cluster) and Ni4Fe5S (C cluster) clusters of CO dehydrogenase/acetyl-coenzyme A synthase [8, 9]. These clusters are highly specialized catalysts and require specific chaperone systems for their assembly [8]. While they display no clear evidence of structural or functional diversification, such events may have occurred in the distant past. The inventory set up and discussed hereafter includes only the [4Fe–4S] moieties present in some of these active sites.

Towards an inventory of Fe–S protein folds

Over the decades since their discovery, Fe–S proteins have been found in all kinds of organisms and cell compartments, and assume a large range of functions [5, 7, 12–14]. Genome and metagenome sequences have confirmed the pervasiveness of Fe–S proteins in the living world and may be used for an assessment of the distribution and diversity of Fe–S proteins [65, 66]. However, while known Fe–S-binding patterns are easily identified in sequence databases, novel ones are difficult to infer from cysteine sequence patterns alone. In contrast, protein structure databases provide unequivocal information on the structure of the Fe–S active sites and their protein environment. They also yield the structures of individual Fe–S active sites in multicluster proteins. While three-dimensional structures are far less numerous than sequences, they are now counted in hundreds and thus afford a sizeable sample of extant structures.
Also, structures of Fe–S active sites are, to some extent, sampled randomly, since novel Fe–S folds are discovered not only in studies specifically aimed at Fe–S proteins [11], but also serendipitously in proteins not previously known to contain Fe–S clusters ([67], PDB entry 2g36). For these reasons, the set of unique protein folds around Fe–S active sites may be regarded as an adequate representation of the structural diversity of Fe–S proteins.

Fe–S protein structures were retrieved from the PDB [68] using the keywords “FeS,” “iron–sulfur,” “ferredoxin,” “rubredoxin,” “4Fe–4S,” and “2Fe–2S,” and the pooled data were cross-checked with literature searches. Altogether over 500 structure entries were collected. Some of these structures (approximately 10%), mostly small proteins, were obtained by using NMR. However, none of the latter include folds that had not been previously characterized by X-ray crystallography. Indeed, while NMR is very powerful for the investigation of dynamics or electromagnetic properties, it still has some limitations for the determination of polypeptide chain conformation in the vicinity of paramagnetic metal sites [69]. Thus, all fold structures listed in Table 1 were determined by X-ray crystallography.

The Fe–S protein crystal structures in the PDB are redundant to a significant extent, mostly owing to multiple entries from classic Fe–S proteins that have been extensively investigated over several decades. The PDB includes more than ten HiPIP entries, more than 20 Rd, more than 30 plant and vertebrate [2Fe–2S] Fd, and over 50 structures of the 2[4Fe–4S] Fd fold (taking into account all variations pictured in Fig. 2). Even for some of the large and complex proteins, e.g., nitrogenase [8], entries are counted in dozens. The database nevertheless contains many structures represented by just a few entries or even a single entry. The body of data was first reduced by eliminating redundant structures.
Then, proteins containing more than one Fe–S cluster (excepting the 2[4Fe–4S] Fd, see above) were inspected to discriminate between the novel Fe–S folds and those that had previously been observed in other proteins (as in hydrogenases [73] or complex I [11]). The uniqueness of domains was also assessed by structural searches implementing DALI [74]. The analysis eventually revealed nearly 50 unique folds around individual [1Fe], [2Fe–2S], or [4Fe–4S] clusters, as listed in Table 1. It should be pointed out that some important Fe–S enzymes are not included here, because they only contain previously identified Fe–S folds. A case in point is the respiratory complex II/fumarate reductase family [75], which accommodates three Fe–S clusters harbored within fold types 1 and 101. The main issues raised by the data in Table 1 are discussed in the following sections.

Fe–S protein folds versus protein folds at large

Protein fold groups are defined by the numbers of secondary structure elements and their topological connections [25]. In addition to their phylogenetic significance, protein folds and families are important for structural predictions: indeed, structures of one or a few members of a fold give way to structural predictions for most members of that group [76]. Novel folds are therefore primary targets for structural genomics projects [76], and in fact the discovery rate of novel folds and families is exponential, paralleling the growth rate of the PDB [76, 77]. There is nevertheless some controversy as to whether the exploration of the protein structure universe is nearing completion [78] or not [76, 77]. The enormous current output of genomic and metagenomic sequences, and the huge number of novel sequences that are produced [79], together with the exponential growth of the number of novel structures, suggest that a complete mapping of the protein structural landscape is still a remote goal [76, 77].
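The two-step reduction described for the inventory (discarding redundant entries, then checking each multicluster protein for folds not seen before) can be illustrated with a minimal Python sketch. It is not the procedure actually used for Table 1, which relied on visual inspection and DALI searches. The function name `screen_for_novel_folds` is hypothetical, the handful of PDB entries and their fold numbers are taken from Table 1, and the input order is assumed to follow the order of discovery given there.

```python
def screen_for_novel_folds(entries):
    """Report which entries contribute Fe-S folds not seen in earlier entries.

    entries: iterable of (pdb_id, fold_numbers), assumed to be in
    chronological order (an assumption of this sketch).
    Returns a list of (pdb_id, sorted list of new fold numbers).
    """
    seen = set()
    report = []
    for pdb_id, folds in entries:
        new = sorted(set(folds) - seen)
        if new:  # redundant entries contribute nothing and are skipped
            report.append((pdb_id, new))
        seen.update(folds)
    return report


# Small subset of Table 1 (fold numbering as in the table).
ENTRIES = [
    ("1dur", [1]),                        # 2[4Fe-4S] Fd
    ("1fxi", [101]),                      # plant-type [2Fe-2S] Fd
    ("2fdn", [1]),                        # later 2[4Fe-4S] Fd entry: redundant
    ("1feh", [1, 16, 17, 101]),           # FeFe hydrogenase
    ("1f37", [104]),                      # thioredoxin-like Fd
    ("2fug", [1, 16, 30, 31, 101, 104]),  # complex I hydrophilic domain
]

report = screen_for_novel_folds(ENTRIES)
for pdb_id, new_folds in report:
    print(pdb_id, new_folds)
```

In this toy run the hydrogenase entry contributes only folds 16 and 17 and the complex I entry only folds 30 and 31, because their other clusters sit in folds that were already known. This mirrors why enzymes such as complex II/fumarate reductase, which contain only fold types 1 and 101, add no rows to Table 1.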
Fe–S protein folds are defined here by considering the polypeptide fold around and in the close vicinity of the metal site, rather than complete domains. In some cases, e.g., folds 1, 2, or 201, the fold around the Fe–S cluster encompasses the complete protein fold. In many other cases, what is regarded here as the Fe–S fold may consist of only a small part of the whole protein or domain (e.g., folds 3–6). There are a few protein folds that are specifically Fe–S folds (e.g., folds 1, 2, and 201), while many others also occur in other contexts (flavodoxin, zinc finger). It may also happen that a given protein fold is implemented in more than one instance to harbor Fe–S clusters (folds 9 and 31, or 103 and 201). The possibility for Fe–S clusters to be hosted in various ways in a given protein fold would in principle offer nearly unlimited opportunities for novel Fe–S folds. This should be tempered by the observation that Fe–S proteins, notwithstanding their ubiquity and diversity, are only a small minority of all extant proteins. They account for approximately 1% of the entries in the PDB, which may nevertheless be somewhat of an underestimation of their actual number in view of the higher than average difficulty of their structural analysis.

The rate of discovery of Fe–S protein folds as a function of time is shown in Fig. 3. While the growth of the PDB, or even subsets of the PDB, can be fitted with exponentials [76, 77] (Fig. 3), the rate of acquisition of Fe–S fold structures seems to depart significantly from such a trend (Fig. 3). After a slow start, the rate underwent a sudden fivefold to tenfold increase around 1995, and has changed little on average since then. This discontinuity suggests that specific breakthroughs have been made in the structural investigation of Fe–S proteins. A tentative list may include mastering of the often mandatory anoxic techniques in an increasing number of laboratories, progress in the implementation of Fe–S clusters for phase determination, and widespread use of synchrotron facilities. The fact that the number of unique Fe–S folds increases about linearly rather than exponentially may indicate that Fe–S proteins still present obstacles that require specific skills to overcome them. On the other hand, the very constancy of the rate, with, however, an apparent increase in the last couple of years (Table 1, Fig. 3), suggests that the mapping of Fe–S protein folds, like protein folds altogether [76, 77], is not nearing completion.

Fig. 3 Cumulated number of unique Fe–S protein folds as a function of time (thick line). The dotted line shows the number of protein crystal structure entries in the PDB. The vertical scale is the same as for the Fe–S folds, but the numbers are in thousands

Fe–S folds and their functions

Fe–S proteins as first isolated were small electron carrier proteins (folds 1, 2, 101, and 201), and electron transfer remains the predominant function among the much larger number of presently known Fe–S folds. Catalysis of chemical reactions is a well-established function of large Fe–S clusters [8] (see above), but only very few [2Fe–2S] or [4Fe–4S] active sites have been proven to possess such functions. Aconitase (fold 4 [29]) is a dehydratase/hydratase, the [4Fe–4S] cluster in Fd-thioredoxin reductase donates electrons to a redox-active pair of cysteines and undergoes transient bonding to one of them (fold 20 [72]), the [4Fe–4S] cluster in radical S-adenosylmethionine proteins cleaves S-adenosylmethionine (fold 27 [80]), and the [2Fe–2S] cluster in IscA (fold 108 [81]) is probably a transient cluster being transferred to a target protein. Some other functions, yet to be confirmed, would include oxygen or redox sensing (folds 6 and 34), sulfur donation (fold 106 [82]), transfer of redox equivalents across bacterial membranes (fold 109 [83]), and signaling or Fe–S transfer across the outer mitochondrial membrane (fold 111 [84]).
One should be reminded, however, that Fe–S proteins having well-established regulatory roles and putative novel folds have not yet yielded to structural analysis. Prominent among these are the [2Fe–2S] SoxR superoxide and NO sensor [85], the [4Fe–4S] fumarate nitrate regulator [86], and the IscR regulator of the Isc operon [14]. The functions of these proteins require them to undergo conformational changes. They may thus be difficult to crystallize, a reason why regulatory proteins are probably underrepresented among the known Fe–S protein structures. In that respect, the recently reported structures of the iron regulatory protein 1 (IRP1) are remarkable, even though no novel Fe–S fold has been brought forth. Indeed, both functional structures of this bifunctional protein have been elucidated: the [4Fe–4S] cytosolic aconitase [87], and the Fe–S cluster-free RNA binding protein [88].

Fe–S cluster ligation and general features of Fe–S folds

It has long been known from studies on many Fe–S proteins as well as synthetic analogs of their active sites [5–7, 13] that thiolates (cysteines) are by far the preferred organic ligands of Fe–S clusters. This is largely confirmed by the compilation of Fe–S clusters in Table 1, even though other residues, primarily histidine (folds 10, 16, 28, 103, and 111), but also glutamine (fold 25) or arginine (fold 106), are implemented in a few cases. In contrast, serine, which is nearly isostructural with cysteine and has often been implemented as a ligand in engineered Fe–S proteins, is involved as a natural ligand only in a particular form of the nitrogenase P cluster [8, 10, 89]. Substrate binding to one of the iron atoms has been evidenced in two [4Fe–4S] enzymes (folds 4 and 27, with citrate [29] and S-adenosylmethionine [80], respectively). Altogether the Fe–S-binding folds exhibit a considerable variety, resulting from diverse combinations of loops and secondary structure elements.
Fe–S clusters are generally hosted within a single domain which provides them with a relatively rigid and protective ligand framework. Fe–S sites occurring at domain interfaces are generally buried within the protein interior (e.g., fold 13). This is in keeping with the documented instability of Fe–S clusters when exposed to aqueous solvents and dioxygen [6]. Fe–S clusters located at exposed domain or subunit interfaces (folds 5, 22, 107, and 109) are rather uncommon, notoriously unstable, and assume functions requiring conformational changes (folds 5 and 22).

Protein fold and Fe–S cluster structure

It has long been known, and is further confirmed in Table 1, that there are many ways (folds) for a protein to accommodate a given Fe–S cluster. But is it possible for a given protein fold to host more than one cluster type? There seems to be one structurally characterized case: folds 103 (Rieske protein) and 201 (Rd) are superimposable and yet bind a [2Fe–2S] or a [1Fe] site, respectively. However, this exceptional occurrence is only made possible through the replacement (histidine for cysteine) and reorientation of two of the active-site ligands. Hence, this fold has been split into two distinct ones (103 and 201). In fact, going from mononuclear to binuclear to tetranuclear clusters results in significant displacements of the ligands of the Fe–S active sites (Fig. 1b). Such displacements would require large structural rearrangements of the polypeptide chain, which makes it very unlikely for clusters of different nuclearities to be hosted in similar protein folds. This may be illustrated by observations made on the nitrogenase iron protein (fold 5), which normally contains a [4Fe–4S] cluster [8], but can under some circumstances host a [2Fe–2S] cluster.
The latter, however, happens only in conditions where the protein is known to undergo structural changes: either on the way to irreversible denaturation [90], or as a result of ATP binding in the presence of glycerol [91]. Other proteins having the potential to bind either binuclear or tetranuclear clusters include the fumarate nitrate regulator [86] and IscU [14], both of which are predicted from their function to be flexible. Structures of these proteins have not yet been forthcoming. The common [4Fe–4S]/[3Fe–4S] conversion requires but a small rearrangement of the cysteine ligand set [5, 6] and is therefore not relevant to this discussion.

[2Fe–2S]-containing compared with [4Fe–4S]-containing folds

Protein folds hosting low-potential [2Fe–2S]2+,+ and [4Fe–4S]2+,+ active sites are an overwhelming majority in the list in Table 1. These active sites operate in the same redox potential range [7], and occur in the same biochemical pathways and often side by side in the same enzymes [11, 73] (Table 1). It is therefore difficult to rationalize on biochemical grounds why the [4Fe–4S] folds should outnumber the [2Fe–2S] ones by a factor of more than 3 (Table 1). Instead, an explanation based on the chemical properties of the binuclear and tetranuclear Fe–S clusters is proposed hereafter.

Biomimetic Fe–S chemistry has produced models for most biological Fe–S active sites, as well as a considerable body of information pertaining to their structure and reactivity [6]. In its earliest developments, biomimetic Fe–S chemistry revealed that simple reaction systems involving iron ions, sulfide, and thiols yield [4Fe–4S]2+ synthetic analogs in most cases [6, 26], while specific thiolate ligands or reaction conditions are required for the production of [2Fe–2S]2+ clusters [92].
Accordingly, over the years, scores of tetranuclear clusters have been isolated and structurally characterized, while merely a few binuclear ones have yielded to similar investigations [6]. A [4Fe–4S]2+ cluster with HS- as a thiolate ligand has even been synthesized by sparging H2S through an Fe3+/2+ solution [93]. While the reasons for the higher stability of the tetranuclear clusters have not been specifically addressed, they may include the oxidation level of the iron, which is lower in tetranuclear clusters (Fe2.5+) than in binuclear ones (Fe3+). This might have further favored the formation of tetranuclear clusters in the reducing conditions of the primitive Earth [94].

The higher stability of the tetranuclear clusters becomes even more conspicuous when one considers both redox levels of the [4Fe–4S]2+,+ and [2Fe–2S]2+,+ couples that are relevant at low potential (approximately -100 to -700 mV). Indeed, while synthetic analogs for both [4Fe–4S]2+,+ redox levels have been isolated and crystallized in numbers, binuclear synthetic analogs have only been isolated and crystallized for the [2Fe–2S]2+ level [6]. Structural data on the [2Fe–2S]+ level are limited to a single Fd [45]. An added difficulty with the binuclear synthetic analogs in solution is their tendency to dimerize into [4Fe–4S]2+ in some solvents or upon reduction to the [2Fe–2S]+ level [95]. Thus, [4Fe–4S]2+,+ clusters are structurally autonomous with respect to redox activity, while the reversible functioning of the [2Fe–2S]2+,+ transition mandates the assistance of an exogenous structural framework.

Tetranuclear clusters are also superior in terms of versatility and plasticity. They can be held and stabilized by merely three thiolate ligands, in proteins as well as in synthetic analogs, provided the ligands in the latter are tridentate [5, 6, 96, 97]. A variety of ligands are acceptable in the fourth position, including H2O or HO-.
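The mean formal oxidation levels of iron quoted earlier in this section (Fe2.5+ in the tetranuclear and Fe3+ in the binuclear oxidized cores) follow from simple charge bookkeeping. The short Python sketch below is given only as an illustration of that arithmetic; it assumes that every inorganic sulfur is counted as sulfide (S2-) and that the charge is that of the bare cluster core, without the thiolate ligands. The function name and the table of cores are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction


def mean_fe_oxidation_state(n_fe: int, n_sulfide: int, core_charge: int) -> Fraction:
    """Mean formal oxidation state of iron in an [nFe-mS] core.

    Assumption: each inorganic sulfur is a sulfide (S2-), so that
    core_charge = n_fe * mean_state - 2 * n_sulfide.
    """
    return Fraction(core_charge + 2 * n_sulfide, n_fe)


# (label, number of Fe atoms, number of sulfides, core charge)
CORES = [
    ("[4Fe-4S]2+", 4, 4, 2),
    ("[4Fe-4S]+", 4, 4, 1),
    ("[2Fe-2S]2+", 2, 2, 2),
    ("[2Fe-2S]+", 2, 2, 1),
]

for label, n_fe, n_s, charge in CORES:
    state = mean_fe_oxidation_state(n_fe, n_s, charge)
    print(f"{label}: mean Fe oxidation state = +{float(state):g}")
```

For the oxidized cores this reproduces the values given in the text (+2.5 and +3); for the reduced [4Fe–4S]+ and [2Fe–2S]+ cores the same bookkeeping gives +2.25 and +2.5, respectively. These are formal averages only and say nothing about how the charge is actually distributed over the individual iron sites.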
The unique iron can also be reversibly abstracted in site-differentiated synthetic analogs [98] as well as in a number of proteins [28], in particular aconitase [29]. These data make a strong point that three cysteine ligands suffice to stabilize either [4Fe–4S] clusters or their immediate [3Fe–4S] precursors in various protein contexts. In contrast, [2Fe–2S] clusters held by only three ligands (cysteines) are documented in merely two cases, both of them proteins mutated in vitro [99, 100]. In keeping with these chemical properties, it is of note that even in the simplest [2Fe–2S] proteins, e.g., plant-type Fd, the smaller part of the structure is dedicated to accommodating the Fe–S cluster, while the larger part consists of super-secondary structures having a major stabilizing role for the overall structure and thereby for the [2Fe–2S] cluster as well [45]. In contrast, in clostridial-type [4Fe–4S] Fd, the short polypeptide chain is entirely wrapped closely around the two Fe–S clusters [24], and distal regions of the protein contributing additional stabilizing forces are not mandatory. In that case, the protein fold appears to be fully determined by its role as a ligand of the Fe–S clusters.

Collectively these data indicate that [4Fe–4S] clusters are more versatile and robust than the [2Fe–2S] clusters, and are by far the predominant products in autoassembly reactions involving iron, sulfide, and thiolates. They are therefore likely to have been the most widespread biologically relevant Fe–S clusters throughout periods encompassing prebiotic chemistry and the emergence of life on this planet. They were also favored in the nascent protein world, through their higher stability and ability to nest in protein sites, including those having incomplete ligand sets. This advantage was certainly decisive in early periods of protein evolution, when most of the basic folds presumably emerged [77, 101].
Likewise, throughout the evolution of life, [4Fe–4S] clusters have probably been the best candidates for the appearance of new Fe–S sites in cysteine-rich regions of proteins (see fold 33).

Low-potential compared with high-potential Fe–S folds

An overwhelming majority (all but folds 2, 103, and 201) of Fe–S protein folds host low-potential [2Fe–2S]2+,+ or [4Fe–4S]2+,+ clusters. This can be assigned to events that took place in the geochemical conditions prevalent during the emergence of life and probably until approximately 2.5 billion years ago [21], when dioxygen appeared in the biosphere. In such reducing conditions with abundant iron, sulfide, and thiolates, it is known from contemporary chemistry [6] that [2Fe–2S]2+,+ and [4Fe–4S]2+,+ clusters are the major products of self-assembly reactions. They were thus the best candidates to occupy available sites in primitive proteins or other organic biomolecules. They would also have been best suited to function as catalysts in the prevalent low-potential metabolic pathways. This, added to the likelihood that many of the protein folds emerged early in evolution [77, 101], provides an explanation for the prominent position of [2Fe–2S]2+,+ and [4Fe–4S]2+,+ clusters in extant low-potential proteins.

Conversely, the appearance of oxidizing conditions drastically decreased the availability of iron (through precipitation as ferric oxides and hydroxides), and thwarted spontaneous assembly of Fe–S clusters. These conditions also led to the oxidation of metallic copper and its mobilization as cuprous and cupric ions; hence, copper (e.g., in copper oxidases) gained a significant edge over Fe–S as a biocatalyst in the upper range of redox potentials [102].
This combination of copper mobilization and iron seclusion, added to an increasing implementation of iron in other protein sites (e.g., hemes or binuclear oxo), could not but strongly disfavor the use of Fe–S clusters as high-potential redox catalysts; hence, the small number of protein folds that turned out to harbor high-potential Fe–S active sites, which are restricted to HiPIP (fold 2 [36, 37]), Rieske-type proteins (fold 103 [46, 48]), and Rd (fold 201 [55]). And even then, the last two of these active sites are accommodated in nearly superimposable zinc-finger-like folds [48]. It should be mentioned here that the [4Fe–4S]3+,2+ active site of Fd-thioredoxin reductase (fold 20 [72]) functions with the same redox couple as HiPIP, and might therefore be listed among the high-potential Fe–S clusters. However, owing to the proximity of a redox-active disulfide with which it undergoes chemical interaction, that [4Fe–4S]3+,2+ cluster has in effect a low potential (-400 mV [72]), which relates it functionally to the [4Fe–4S]2+,+ active sites.

Evolution of Fe–S protein folds

Fe–S chemistry is surmised to have contributed significantly to prebiotic chemistry and to the emergence of life on this planet [19, 20]. This, added to the structural and functional versatility of Fe–S clusters, suggests that Fe–S proteins were among the very first catalysts of biochemical reactions. The primitive 2[4Fe–4S] Fd fold, which is taken to be one of the most ancient protein folds altogether [22], is considered as a remnant of, and as indirect evidence for, these early events. Furthermore, the considerable diversification of this fold, a small part of which is illustrated in Fig. 2, points to a protracted evolutionary course. Other Fe–S folds, though less clearly ancient, may also have appeared early in the history of life; a case in point is that of the plant- and vertebrate-type Fd.
The basic structural scaffold of these ubiquitous proteins, the β-grasp [25], is not unique to Fe–S proteins, but its Fe–S version (fold 101) has been suggested to feature among the most ancient of its varieties [103]. Other Fe–S folds are probably adaptations of preexisting folds to the hosting of Fe–S active sites. Examples of the latter are 103 and 201 (zinc finger) as well as 104 (thioredoxin). Indeed, while these Fe–S folds are undoubtedly widespread and important, the general folds they are derived from are really ubiquitous and assume essential functions that probably predated the emergence of these particular Fe–S protein families.

There are also classes of proteins where the presence of an Fe–S cluster is an exceptional occurrence of unknown functional relevance. A significant case is the tryptophanyl transfer RNA synthetase from Thermotoga maritima (fold 33), which is at least rare, if not unique, as an Fe–S protein among aminoacyl transfer RNA synthetases. The Fe–S cluster has no known function in this particular case, and its presence may simply result from the happenstance of at least three (see “[2Fe–2S]-containing compared with [4Fe–4S]-containing folds”) appropriately positioned cysteine residues. This occurrence points to a possible way for Fe–S clusters to appear in novel sites where they may subsequently be stabilized, acquire new functions, gain a selective advantage, and eventually lead to the emergence of a new family of Fe–S proteins. Mechanisms of this sort may have been operative in the past for the building of Fe–S active sites in preexisting folds (flavodoxin, thioredoxin, zinc finger, etc.).

Most of the Fe–S folds are represented among prokaryotes. Only folds 25, 105, 107, and 111 may be unique to eukaryotes, as they have so far no counterparts in prokaryotes (Table 1).
This would suggest that most Fe–S folds, like protein folds at large, emerged prior to the appearance of eukaryotes, while proteins in the latter organisms evolved mainly by reshuffling and aggregation of preexisting folds [76, 77].

Conclusions

Folds hosting low-potential Fe–S clusters vastly outnumber the high-potential ones (44 and three, respectively; Table 1), even though the latter include a variety of active-site frameworks (mononuclear, binuclear, and tetranuclear), and occur in very diverse biochemical contexts (photosynthesis, respiration, response to oxidative stress). This can be rationalized by taking into account the prevalence of anoxic/reducing conditions in the prebiotic era, as well as during the early steps of life [19–21, 98]. The very active Fe–S chemistry surmised to have taken place would then have favored the assembly of structures that are known to be the most stable in those conditions, i.e., [4Fe–4S]2+,+ and [2Fe–2S]2+,+ clusters [6]. The requirement for high-potential catalysts, which may have included Fe–S clusters, was presumably marginal in those conditions. In contrast, the expansion of oxidative metabolic pathways resulting from the appearance of dioxygen required the evolution of high-potential redox proteins. By then, however, the emergence of pertinent Fe–S active sites was certainly limited by the more oxidizing environment, by competition with new metal catalysts, copper in particular [102], as well as by the decreasing availability of novel protein folds [76, 77].

Among low-potential Fe–S folds, the [4Fe–4S] ones largely outnumber the [2Fe–2S] ones (Table 1). This is best rationalized by invoking the higher stability of the tetranuclear clusters [6], which favored their pervasion of the developing protein-fold space during early evolution.
In this case, and likewise for the predominance of low-potential over high-potential Fe–S folds, chemistry and geochemistry appear to have played key roles in biological evolution. These clear imprints of the inorganic realm in extant biological Fe–S protein folds suggest that major features of the latter were determined early in the course of evolution. The subsequent rise of atmospheric dioxygen, with the resulting collapse of dissolved iron and sulfide, probably brought an end to the direct bearing of Fe–S chemistry on Fe–S protein evolution. Indeed, the new geochemical conditions forbade spontaneous chemical assembly of biological Fe–S active sites, and mandated the evolution of sophisticated biochemical pathways for their synthesis [14].

In summary, the main features of the extant population of Fe–S protein folds appear to be the outcome, on one hand, of the reducing conditions prevalent during the first half of Earth’s existence and, on the other hand, of fundamental chemical properties of the Fe–S clusters that are now most widespread among living cells. Thus, Fe–S proteins and their evolution powerfully illustrate the tight links between the inorganic realm and life, not merely from a structural and functional viewpoint, but also from a historical perspective.

References

1. Beinert H, Sands RH (1960) Biochem Biophys Res Commun 3:41–46
2. Mortenson LE, Valentine RC, Carnahan JE (1962) Biochem Biophys Res Commun 7:448–452
3. Tagawa K, Arnon DI (1962) Nature 195:537–543
4. Malkin R, Rabinowitz JC (1966) Biochem Biophys Res Commun 23:822–827
5. Beinert H, Holm RH, Münck E (1997) Science 277:653–659
6. Rao PV, Holm RH (2004) Chem Rev 104:527–559
7. Beinert H, Meyer J, Lill R (2004) In: Lennarz WJ, Lane MD (eds) Encyclopedia of biological chemistry, vol 2. Elsevier, Amsterdam, pp 482–489
8. Rees DC (2002) Annu Rev Biochem 71:221–246
9. Volbeda A, Fontecilla-Camps JC (2005) Dalton Trans 3443–3450
10. Moulis JM, Davasse V, Golinelli MP, Meyer J, Quinkal I (1996) J Biol Inorg Chem 1:2–14
11. Sazanov LA, Hinchliffe P (2006) Science 311:1430–1436
12. Johnson MK (1998) Curr Opin Chem Biol 2:173–181
13. Johnson MK, Smith AD (2005) In: King RB (ed) Encyclopaedia of inorganic chemistry, vol 4. Wiley, Chichester, pp 2589–2619
14. Johnson DC, Dean DR, Smith AD, Johnson MK (2005) Annu Rev Biochem 74:247–281
15. Mitou G, Higgins C, Wittung-Stafshede P, Conover RC, Smith AD, Johnson MK, Gaillard J, Stubna A, Münck E, Meyer J (2003) Biochemistry 42:1354–1364
16. Eady RR, Smith BE, Cook KA, Postgate JR (1972) Biochem J 128:655–675
17. You JF, Papaefthymiou GC, Holm RH (1992) J Am Chem Soc 114:2697–2710
18. Long JR, Holm RH (1994) J Am Chem Soc 116:9987–10002
19. Wächtershäuser G (2006) Philos Trans R Soc Lond Ser B 361:1787–1808
20. Russell MJ (2007) Acta Biotheor (in press). doi: 10.1007/s10441-007-9018-5
21. Kirschvink JL (2005) Engineering & Science 4:10–20
22. Eck RV, Dayhoff MO (1966) Science 152:363–366
23. Adman ET, Sieker LC, Jensen LH (1973) J Biol Chem 248:3987–3996
24. Sieker LC, Adman ET (2001) In: Messerschmidt A, Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 574–592
25. Orengo CA, Thornton JM (2005) Annu Rev Biochem 74:867–900
26. Herskovitz T, Averill BA, Holm RH, Ibers JA, Phillips WD, Weiher JF (1972) Proc Natl Acad Sci USA 69:2437–2441
27. Moulis JM, Sieker LC, Wilson KS, Dauter Z (1996) Protein Sci 5:1765–1775
28. Johnson MK, Duderstadt RE, Duin EC (1999) Adv Inorg Chem 47:1–82
29. Beinert H, Kennedy MC, Stout CD (1996) Chem Rev 96:2335–2373
30. Dauter Z, Wilson KS, Sieker LC, Meyer J, Moulis J-M (1997) Biochemistry 36:16065–16073
31. Darimont B, Sterner R (1994) EMBO J 13:1772–1781
32. Brochier C, Philippe H (2002) Nature 417:244
33. Skophammer RG, Servin JA, Herbold CW, Lake JA (2007) Mol Biol Evol 24:1761–1768
34. Bartsch RG (1978) Methods Enzymol 53:329–340
35. Ciurli S, Musiani F (2005) Photosynth Res 85:115–131
36. Liu L, Nogi T, Kobayashi M, Nozawa T, Miki K (2002) Acta Cryst D58:1085–1091
37. Carter CW Jr (2001) In: Messerschmidt A, Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 602–609
38. Bertini I, Luchinat C, Provenzani A, Rosato A, Vasos PR (2002) Proteins 46:110–127
39. Tsukihara T, Fukuyama K, Nakamura M, Katsube Y, Tanaka N, Kakudo M, Wada K, Hase T, Matsubara H (1981) J Biochem (Tokyo) 90:1763–1773
40. Grinberg AV, Hannemann F, Schiffler B, Müller J, Heinemann U, Bernhardt R (2000) Proteins 40:590–612
41. Kakuta Y, Horio T, Takahashi Y, Fukuyama K (2001) Biochemistry 40:11007–11012
42. Hugo N, Meyer C, Armengaud J, Gaillard J, Timmis KN, Jouanneau Y (2000) J Bacteriol 182:5580–5585
43. Frolow F, Harel M, Sussman JL, Mevarech M, Shoham M (1996) Nat Struct Biol 3:452–458
44. Zanetti G, Binda C, Aliverti A (2001) In: Messerschmidt A, Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 532–542
45. Morales R, Charon MH, Hudry-Clergeon G, Pétillot Y, Nørager S, Medina M, Frey M (1999) Biochemistry 38:15764–15773
46. Link TA (2001) In: Messerschmidt A, Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 518–531
47. Lebrun E, Santini JM, Brugna M, Ducluzeau AL, Ouchane S, Schoepp-Cothenet B, Baymann F, Nitschke W (2006) Mol Biol Evol 23:1180–1191
48. Iwata S, Saynovits M, Link TA, Michel H (1996) Structure 4:567–579
49. Colbert CL, Couture MMJ, Eltis LD, Bolin JT (2000) Structure 8:1267–1278
50. Shethna YI, Wilson PW, Hansen RE, Beinert H (1964) Proc Natl Acad Sci USA 52:1263–1271
51. Yeh AP, Chatelet C, Soltis SM, Kuhn P, Meyer J, Rees DC (2000) J Mol Biol 300:587–595
52. Meyer J (2001) FEBS Lett 509:1–5
53. Vignais PM, Billoud B, Meyer J (2001) FEMS Microbiol Rev 25:455–501
54. Herriott JR, Sieker LC, Jensen LH (1970) J Mol Biol 50:391–406
55. Meyer J, Moulis JM (2001) In: Messerschmidt A, Huber R, Poulos T, Wieghardt K (eds) Handbook of metalloproteins. Wiley, Chichester, pp 505–517
56. Archer M, Huber R, Tavares P, Moura I, Moura JJG, Carrondo MA, Sieker LC, LeGall J, Romão MJ (1995) J Mol Biol 251:690–702
57. deMaré F, Kurtz DM Jr, Nordlund P (1996) Nat Struct Biol 3:539–546
58. Yeh AP, Hu Y, Jenney FE Jr, Adams MWW, Rees DC (2000) Biochemistry 39:2499–2508
59. Logan DT, Mulliez E, Larsson KM, Bodevin S, Atta M, Garnaud PE, Sjöberg BM, Fontecave M (2003) Proc Natl Acad Sci USA 100:3826–3831
60. Scherr N, Honnappa S, Kunz G, Mueller P, Jayachandran R, Winkler F, Pieters J, Steinmetz MO (2007) Proc Natl Acad Sci USA 104:12151–12156
61. Meyer J, Gagnon J, Gaillard J, Lutz M, Achim C, Münck E, Pétillot Y, Colangelo CM, Scott RA (1997) Biochemistry 36:13374–13380
62. Iwasaki T, Kounosu A, Tao Y, Li Z, Shokes JE, Cosper NJ, Imai T, Urushiyama A, Scott RA (2005) J Biol Chem 280:9129–9134
63. Dauter Z, Wilson KS, Sieker LC, Moulis JM, Meyer J (1996) Proc Natl Acad Sci USA 93:8836–8840
64. Maher M, Cross M, Wilce MCJ, Guss JM, Wedd AG (2004) Acta Crystallogr Sect D 60:298–303
65. Meyer J (2004) FEBS Lett 570:1–6
66. Meyer J (2007) Cell Mol Life Sci 64:1063–1084
67. Leiros HKS, McSweeney SM (2007) J Struct Biol 159:92–102
68. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) Nucleic Acids Res 28:235–242
69. Bertini I, Luchinat C, Parigi G, Pierattelli R (2005) Chembiochem 6:1536–1549
70. Sayle RA, Milner-White EJ (1995) Trends Biochem Sci 20:374–376
71. Andrade SLA, Cruz F, Drennan CL, Ramakrishnan V, Rees DC, Ferry JG, Einsle O (2005) J Bacteriol 187:3848–3854
72. Dai S, Friemann R, Glauser DA, Bourquin F, Manieri W, Schürmann P, Eklund H (2007) Nature 448:92–98
73. Peters JW, Lanzilotta WN, Lemon BJ, Seefeldt LC (1998) Science 282:1853–1858
74. Holm L, Sander C (1993) J Mol Biol 233:123–138
75. Lancaster CRD, Kröger A, Auer M, Michel H (1999) Nature 402:377–385
76. Grabowski M, Joachimiak A, Otwinowski Z, Minor W (2007) Curr Opin Struct Biol 17:347–353
77. Caetano-Anolles G, Kim HS, Mittenthal JE (2007) Proc Natl Acad Sci USA 104:9358–9363
78. Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J (2006) Proc Natl Acad Sci USA 103:2605–2610
79. Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ, Remington K, Eisen JA, Heidelberg KB, Manning G, Li W, Jaroszewski L, Cieplak P, Miller CS, Li H, Mashiyama ST, Joachimiak MP, van Belle C, Chandonia JM, Soergel DA, Zhai Y, Natarajan K, Lee S, Raphael BJ, Bafna V, Friedman R, Brenner SE, Godzik A, Eisenberg D, Dixon JE, Taylor SS, Strausberg RL, Frazier M, Venter JC (2007) PLoS Biol 5:e16
80. Layer G, Heinz DW, Jahn D, Schubert W-D (2004) Curr Opin Chem Biol 8:468–476
81. Morimoto K, Yamashita E, Kondou Y, Lee SJ, Arisaka F, Tsukihara T, Nakai M (2006) J Mol Biol 360:117–132
82. Berkovitch F, Nicolet Y, Wan JT, Jarrett JT, Drennan CL (2004) Science 303:76–79
83. Collet JF, Peisach D, Bardwell JC, Xu Z (2005) Protein Sci 14:1863–1869
84. Paddock ML, Wiley SE, Axelrod HL, Cohen AE, Roy M, Abresch EC, Capraro D, Murphy AN, Nechushtai R, Dixon JE, Jennings PA (2007) Proc Natl Acad Sci USA 104:14342–14347
85. Demple B (2002) Mol Cell Biochem 234/235:11–18
86. Kiley PJ, Beinert H (2003) Curr Opin Microbiol 6:181–185
87. Dupuy J, Volbeda A, Carpentier P, Darnault C, Moulis J-M, Fontecilla-Camps JC (2006) Structure 14:129–139
88. Walden WE, Selezneva AI, Dupuy J, Volbeda A, Fontecilla-Camps JC, Theil EC, Volz C (2006) Science 314:1903–1908
89. Yeh AP, Ambroggio XI, Andrade SLA, Einsle O, Chatelet C, Meyer J, Rees DC (2002) J Biol Chem 277:34499–34507
90. Anderson GL, Howard JB (1984) Biochemistry 23:2118–2122
91. Sen S, Igarashi R, Smith A, Johnson MK, Seefeldt LC, Peters JW (2004) Biochemistry 43:1787–1797
92. Mayerle JJ, Frankel RB, Holm RH, Ibers JA, Phillips WD, Weiher JF (1973) Proc Natl Acad Sci USA 70:2429–2433
93. Müller A, Schladerbeck NH (1986) Naturwissenschaften 73:S669
94. Müller A, Schladerbeck NH (1985) Chimia 39:23–24
95. Hagen KS, Reynolds JG, Holm RH (1981) J Am Chem Soc 103:4054–4063
96. Stack TDP, Holm RH (1988) J Am Chem Soc 110:2484–2494
97. Weigel JA, Holm RH (1991) J Am Chem Soc 113:4184–4191
98. Zhou J, Hu Z, Münck E, Holm RH (1996) J Am Chem Soc 118:1966–1980
99. Meyer J, Fujinaga J, Gaillard J, Lutz M (1994) Biochemistry 33:13642–13650
100. Broach RB, Jarrett JT (2006) Biochemistry 45:14166–14174
101. Delaye L, Becerra A, Lazcano A (2005) Orig Life Evol Biosph 35:537–554
102. Williams RJP (2007) Dalton Trans 991–1001
103. Burroughs AM, Balaji S, Iyer LM, Aravind L (2007) Biol Direct 2:18