Abstract
Free full text
A digital twin reproducing gene regulatory network dynamics of early Ciona embryos indicates robust buffers in the network
Abstract
How gene regulatory networks (GRNs) encode gene expression dynamics and how GRNs evolve are not well understood, although these problems have been studied extensively. We created a digital twin that accurately reproduces expression dynamics of 13 genes that initiate expression in 32-cell ascidian embryos. We first showed that gene expression patterns can be manipulated according to predictions by this digital model. Next, to simulate GRN rewiring, we changed regulatory functions that represented their regulatory mechanisms in the digital twin, and found that in 55 of 100 cases, removal of a single regulator from a conjunctive clause of Boolean functions did not theoretically alter qualitative expression patterns of these genes. In other words, we found that more than half the regulators gave theoretically redundant temporal or spatial information to target genes. We experimentally substantiated that the expression pattern of Nodal was maintained without one of these factors, Zfpm, by changing the upstream regulatory sequence of Nodal. Such robust buffers of regulatory mechanisms may provide a basis of enabling developmental system drift, or rewiring of GRNs without changing expression patterns of downstream genes, during evolution.
Author summary
Although regulatory relationships in gene regulatory networks have been studied extensively during the past two decades, it is not fully understood how gene regulatory networks produce gene expression dynamics. One major reason is the difficulty of experimentally determining “regulatory functions” that mathematically describe how individual genes are regulated. Because of simplicity of embryonic structure and compactness of the genome, ascidian embryos provide an ideal system for analyzing dynamics of gene regulatory networks. In 32-cell ascidian embryos, there are only 13 regulatory genes that initiate expression. Regulatory functions for these genes, which are represented as Boolean functions, collectively constitute a digital twin that reproduces gene expression dynamics at single-cell resolution throughout the whole embryo. The digital twin is a powerful tool for exploring mechanisms that are not readily accessible in experiments using real embryos. We found that over half the regulators in regulatory functions are potentially redundant for specifying temporal and spatial gene expression patterns. It is generally believed that regulatory functions have changed frequently during evolution, and our results show that a gene regulatory network in early ascidian embryos has robust buffers, which constitute a basis for developmental system drift.
Introduction
Gene regulatory networks control gene expression dynamics during animal development. Although the structure of such networks, which represent connections among regulatory genes, has been studied extensively, it is not fully understood how these networks control gene expression dynamics. This is partly because it is laborious and challenging to mathematically represent how individual genes are regulated.
Ascidians are ideal chordates for studying gene regulatory networks because of their simple genome structure and simple embryonic structure. Indeed, the network structure of early embryonic stages has been analyzed extensively [1,2]. In particular, regulatory mechanisms in 32-cell embryos, in which germ layers are largely specified (Fig 1A), have been studied in detail [3–6]. A comprehensive expression assay [7] revealed that 13 regulatory genes begin to be expressed in 32-cell embryos. Regulatory mechanisms for 12 of these 13 genes have been described mathematically as Boolean functions, which represent necessary and sufficient conditions for their expression [5] (see an example for Boolean functions in Fig 2A). In other words, combinations of upstream regulatory factors that induce expression of these genes are represented as Boolean formulae, which we call regulatory functions (RFs). Expression patterns of these 12 genes can be predicted by calculations using these functions and activity of upstream factors. The RF for the remaining gene, Nodal, has also been determined provisionally. Necessary factors for Nodal expression were identified by our exhaustive examination of functions of regulatory genes expressed before the 32-cell stage. However, we did not always succeed in inducing Nodal expression in real embryos in conditions in which the tentative RF predicted Nodal activation [5]. This suggests that an additional modulator for transcription factors or signaling pathways may be involved in Nodal regulation. Determination of the Nodal RF will enable us to calculate expression patterns of all genes that begin to be expressed at the 32-cell stage. In other words, these RFs collectively constitute a digital twin, and we will be able to simulate gene expression dynamics in early ascidian embryos with this digital tool. In the present study, we first devised such a digital twin and then showed that gene expression patterns can indeed be manipulated in real embryos according to predictions from the digital twin. Using the digital twin, we also examined how changes in RFs potentially affected gene expression, because changes in RFs are thought to have occurred frequently during evolution but are difficult to assess experimentally using real embryos. We found robust buffers of regulatory mechanisms, which potentially permits rewiring of GRNs without changing expression patterns.
In the present study, RFs are represented as Boolean functions, as was successfully done in other systems [8–11]. There are several forms to represent a Boolean function, and disjunctive normal forms (DNFs) were used to represent RFs in a previous study [5] and in the present study, because this form is more easily interpreted from a biological viewpoint. DNFs are represented as the sum (disjunction; OR; ) of products (conjunction; AND; ) like ABA¬C (¬: negation, NOT). In this example, “AB” and “A¬C” are conjunctive clauses. If the RF for gene X is represented ABA¬C, this means that X is expressed when the upstream factors, A and B, are both present or when the upstream factor A is present and the upstream factor C is absent (or both). RFs can also be expressed as a form of truth tables. A truth table comprehensively represents expression of a target gene in various combinations of upstream factors. In the present study, we consider 18 upstream factors (see S1 Table); therefore, there are 262,144 (= 218) possible combinations. It is not realistic to comprehensively examine all these conditions experimentally. For this reason, in our previous study [5], among DNFs that are compatible with gene expression patterns in normal embryos and a limited number of experimental embryos (i.e. partially filled truth tables), DNFs with the smallest number of conjunctive clauses and the smallest number of literals (upstream regulators) were considered as primary candidates; if multiple candidates were obtained, we repeated experiments until we obtained a unique candidate DNF. In other words, candidate DNFs were determined under the assumption that the simplest DNF (DNF with the smallest number of conjunctive clauses and the smallest number of upstream regulators) that explains all observations is most likely. In experiments for determining DNFs, we used embryos injected with morpholino antisense oligonucleotides (MO) for knockdown and/or treated with an inhibitor for the MAPK signaling pathway. Then we experimentally verified whether candidate RFs correctly predicted gene expression under conditions that were not previously examined. In verification experiments, we used MOs, synthetic mRNAs, the MAPK signaling inhibitor, and a recombinant FGF protein. In this way, we determined RFs using experiments and theoretical analyses. The finding that RFs are successfully represented as Boolean functions indicates that qualitative, but not quantitative, control is important for gene expression in early ascidian embryos [5].
To predict gene expression patterns with these RFs, distribution patterns of the 18 upstream factors are necessary. Distribution of these upstream factors in normal embryos is based on observations in previous studies [6,12–24]. We assumed that descendants of cells expressing a transcription factor gene at the 16-cell stage express the encoded protein at the 32-cell stage because of a delay between gene expression and protein translation [5]. In the present study, we employed the same approach to determine the function of Nodal.
Results and discussion
Regulatory function of Nodal
Expression patterns predicted by the presumptive Nodal RF in various conditions are reproduced experimentally at the 32-cell stage [5]. If this Nodal RF represents sufficient conditions, Nodal expression could be induced in experimentally manipulated 16-cell embryos. However, we failed to induce Nodal expression in 16-cell experimental embryos, in which we reproduced conditions that were predicted to be sufficient to induce Nodal expression [5]. Therefore, it is likely that there is a missing activator that acts at the 32-cell stage, but not at the 16-cell stage. Because there is a delay between mRNA transcription and protein activity, we hypothesized that the mRNA encoding this missing factor should be expressed at the 16-cell stage. Since we have comprehensively identified genes encoding transcription factors and signaling molecules and have examined their functions [5,7], it is unlikely that this missing factor is a transcription factor or a signaling molecule.
Zfpm, a gene for a FOG family zinc finger protein, satisfies most of these criteria. It begins to be expressed in 16-cell embryos [21], and the encoded protein acts as a co-factor of Gata transcription factors [25]. Because Nodal is expressed in both the animal and vegetal hemispheres, the missing factor should also be expressed in both hemispheres. However, a previous study reported that expression of Zfpm was observed only in the animal hemisphere [21]. Therefore, we re-examined the expression pattern of Zfpm and found that it is, in fact, expressed in both hemispheres, except for the posterior-most germ-line cells at the 16-cell stage (S1A–S1C Fig) [Note that because transcription is repressed in germ-line cells [15,22], we did not consider germ-line cells in the present study]. Indeed, knockdown of Zfpm using a MO and CRISPR-based knockout of Zfpm abolished Nodal expression at the 32-cell stage (Figs 1B and 1C, and S2). Because Zfpm is a co-factor of Gata [25], we confirmed that Gata.a is necessary for activation of Nodal by knockdown of Gata.a, using a specific MO (Fig 1D) [this MO has been used repeatedly in studies by different groups [14,17,21,26,27]; therefore, we did not further evaluate specificity of this MO in the present study]. Thus, Zfpm and Gata.a positively regulate Nodal at the 32-cell stage.
The tentative Boolean function for Nodal, which was determined previously [5], is represented in a DNF, and consists of four conjunctive clauses (S3 Fig). Because the first three conjunctive clauses, respectively, represent expression in A6.1/A6.3/B6.1, B6.1, and b6.5 cells of normal embryos, and because expression in these cells was abolished in Zfpm morphants (Fig 1C), it is likely that Zfpm is involved in expression represented by these three conjunctive clauses. On the other hand, the fourth conjunctive clause does not represent expression in normal embryos but represents expression in experimental conditions, e.g. expression in embryos in which Gdf1/3-r and Admp are simultaneously knocked down [5], and involvement of Zfpm in the fourth conjunctive clause was unclear from the above experiments. To test the fourth function, we used 16-cell embryos, which do not normally express Nodal. In 16-cell embryos, we created an experimental condition that satisfied the fourth conjunctive clause by incubating embryos in sea water containing FGF2, which mimicked overexpression of Fgf9/16/20. In such embryos, Nodal was expressed in the animal hemisphere (S4A Fig). The observation that FGF treatment induced Nodal expression in 16-cell embryos suggested that Zfpm may be unnecessary for the fourth conjunctive clause, because Zfpm begins to be expressed at the 16-cell stage; therefore it is unlikely that sufficient Zfpm protein is available at this stage. Indeed, injection of the Zfpm MO did not affect Nodal expression in 16-cell embryos incubated in sea water containing FGF2 (S4B Fig).
There was some ambiguity in the fourth conjunctive clause of the previous tentative RF for Nodal. Because the fourth conjunctive clause does not represent expression in normal embryos, this uncertainty was not resolved in the previous study [5]. In the present study, to resolve this enigma, we performed additional experiments. First, we injected an MO against Gata.a or β-catenin in addition to FGF2 treatment, and found that Gata.a and β-catenin regulate Nodal positively and negatively, respectively, in 16-cell embryos treated with FGF2 (S4C and S4D Fig). Second, because it has been suggested that Prdm1-r may negatively regulate Nodal in the fourth conjunctive clause [5], we injected Prdm1-r mRNA and treated injected embryos with FGF2, confirming that Nodal expression was lost (S4E Fig). Similarly, a previous study showed that embryos injected with Foxa.a mRNA and treated with FGF2 do not express Nodal at the 16-cell stage [5] (S4F Fig). These data indicate that both Prdm1-r and Foxa.a act as negative regulators of Nodal.
Using a method that we previously developed [5] and experimental results for Nodal expression in various conditions, which we described above and in our previous study [5] (S1 Table), we obtained the simplest DNF of the Boolean function for Nodal (Fig 2A). For this calculation, we assumed that descendants of cells expressing a transcription factor gene at the 16-cell stage would express the encoded protein at the 32-cell stage (S1 Table), and used the distribution pattern of upstream factors inferred from observations in previous studies [6,13–24].
We confirmed that Nodal expression can be induced in 16-cell embryos by various experimental manipulations in patterns that each of the conjunctive clauses of the Nodal RF predicted (Fig 2B–2H). That is, the RF predicted no expression of Nodal in normal 16-cell embryos (Fig 2B), and indeed Nodal is not expressed [7] (see also Fig 3D). Similarly, as we described above, 16-cell embryos incubated in sea water containing FGF2 expressed Nodal in the animal hemisphere, and this condition satisfies the fourth conjunctive clause of the RF only in the animal hemisphere (Fig 2C). As we have already shown in S4 Fig, if we inject Foxa.a mRNA into unfertilized eggs and incubate them in sea water containing FGF2 after fertilization, Nodal is not expressed (this condition does not satisfy any conjunctive clauses). Then, we injected Foxa.a and Zfpm mRNAs into unfertilized eggs and incubated them in sea water containing FGF2. In this condition, the first conjunctive clause was satisfied in the vegetal hemisphere. That is, the RF predicted Nodal expression in the vegetal hemisphere, except for germ line cells, and such embryos indeed expressed Nodal in the vegetal hemisphere (Fig 2D). Similarly, we induced Nodal expression in 16-cell embryos according to the second and third conjunctive clauses by injecting mRNAs for Tbx6-r.b and Zfpm, and injecting mRNAs for Sox1/2/3, Prdm1-r, and Zfpm followed by FGF treatment, respectively (Fig 2E–2H). Thus, Nodal was expressed or not expressed in these experimental conditions, as the Nodal RF predicted.
As in the previous study [5], we assumed that the Nodal RF can be represented as a Boolean function. Our success suggests that this assumption is appropriate, and that qualitative controls are important for regulation in early Ciona embryos. The first three conjunctive clauses of the Nodal RF represent expression in A6.1/A6.3/B6.1, B6.1, and b6.5 of normal 32-cell embryos, respectively. On the other hand, the fourth conjunctive clause is not necessary for expression in normal embryos at the 32-cell stage.
Reproduction of the pattern of a cell with neural fate of a 32-cell embryo at the 16-cell stage
The Nodal RF and RFs for the other 12 genes, which have been determined previously [5], collectively constitute a digital twin to model gene expression dynamics of the whole 32-cell ascidian embryo. An HTML-based program that implements these RFs enables us to calculate at single-cell resolution how the 13 regulatory genes are expressed under various conditions (http://ghost.zool.kyoto-u.ac.jp/sim32v2/). Because these RFs represent conditions necessary and sufficient for gene expression, the RFs allow us to predict expression patterns of these 13 genes even at the 16-cell stage. Therefore, at the 16-cell stage, we tried to reproduce the pattern that is normally seen in a cell pair (b6.5) of 32-cell embryos using the RFs, as b6.5 cells are important for patterning the neural plate at later stages [1,28,29]. In other words, because only Nodal and Otx are expressed in b6.5 cells of normal 32-cell embryos among the above 13 genes [7], we computationally tested various conditions and found a condition to specifically induce Nodal and Otx expression at the 16-cell stage. The digital twin predicted that Nodal and Otx, but not the other 11 genes, would be expressed in the animal hemisphere of 16-cell embryos with overexpression of Sox1/2/3 and Fgf9/16/20 (Fig 3A and 3B). Otx and Nodal were indeed expressed in 16-cell embryos injected with Sox1/2/3 mRNA and treated with FGF2 (Fig 3C and 3D). We also confirmed that the remaining genes were not expressed or upregulated at the 16-cell stage in such experimental embryos. Expression levels of nine genes (excluding Wnt5 and Dlx.b), as well as expression levels of Nodal and Otx, were measured by reverse transcription and quantitative PCR (RT-qPCR) (Fig 3E). Meanwhile, Wnt5 and Dlx.b are maternally expressed, and these maternal mRNAs disturbed accurate measurement of zygotic expression by RT-qPCR. Therefore, these two genes were examined by in situ hybridization. Because their maternal mRNAs are localized at the posterior pole, these maternal mRNAs are easily discernible and no zygotic expression was observed in experimental embryos (Fig 3F and 3G). These analyses showed that Otx and Nodal, but not the other 11 genes, are activated in 16-cell embryos when injected with Sox1/2/3 mRNA and treated with FGF2. Thus, the expression pattern of b6.5 of 32-cell embryos was reproduced in the animal hemisphere of 16-cell embryos, as the digital twin predicted. In other words, the digital twin successfully reproduces gene expression dynamics of real embryos even under conditions that were not used for constructing the RFs.
In this way, we succeeded in reproducing the pattern of b6.5 of 32-cell embryos in 16-cell embryos with the aid of the digital twin. However, our success may not necessarily mean that these manipulated cells differentiate in the same way as normal b6.5 cells. We injected Sox1/2/3 mRNA into eggs, and it probably persists after the stage at which endogenous Sox1/2/3 mRNA diminishes. In addition, we did not examine how such manipulation changed expression of genes that begin to be expressed at the 16-cell stage. Therefore, it is not easy to predict the cell types into which these manipulated cells finally differentiate. Producing digital twins for 16-cell, 64-cell, and later embryos will resolve this problem, and a technique that degrades introduced mRNAs in a timely manner will also be useful for manipulating terminally differentiated cell types.
Robust buffers of regulatory mechanisms revealed by the digital twin
Because RFs are primarily encoded in cis-regulatory regions, often in a complex and redundant manner [30–34], it is generally difficult to change regulatory mechanisms or RFs in vivo. Therefore, the digital twin provides a unique opportunity to examine effects of changes in RFs. In other words, the digital twin can predict how gene expression patterns are altered by changing RFs. In RFs for the 13 genes (S2 Table), 16 upstream regulators appear 100 times in total. Each of these 16 regulators appears multiple times in different conjunctive clauses (Fig 4). We found 55 cases in which removal of a single regulator from a single conjunctive clause of RFs did not qualitatively change expression patterns in normal 16-cell or 32-cell embryos (Fig 4). For example, Lhx3/4, Neurogenin, and Dickkopf are expressed in the same cells, and their RFs are commonly represented as FoxdFgf9/16/20β-catenin (Fig 4). Our analysis using the digital twin shows that Fgf9/16/20β-catenin and Foxdβ-catenin can establish the same expression pattern (S5A Fig). That is, even if regulatory circuits are rewired in the future and RFs for Lhx3/4, Neurogenin, and Dickkopf are changed to Fgf9/16/20β-catenin or Foxdβ-catenin, the original gene expression pattern could be retained theoretically. It is noteworthy that this rewiring may include not only loss of binding sites for Foxd or Ets1/2 (an effector of the FGF pathway) but also reorganization of their cis-regulatory regions. This finding suggests that the observed theoretical redundancy may facilitate developmental system drift, or rewiring of GRNs without changing expression patterns of downstream genes. Note that the word “redundancy” here does not mean that redundant factors are not necessary for gene expression. Instead, it means that gene expression will not be changed without a redundant factor if regulatory regions are properly re-designed. In other words, information sufficient for specific gene expression is given to a target gene without a redundant factor. The digital twin also indicated that such rewiring would enable these genes to behave differently if expression patterns of upstream factors were changed (S5B, S5C, and S5D Fig). Conversely, it is theoretically possible that developmental system drift created this redundancy. However, this latter possibility is unlikely at least in cases of Lhx3/4, Neurogenin, and Dickkopf. If so, it would indicate that developmental system drift changed regulatory mechanisms of these genes independently three times, so that these three genes became controlled by the same combination of regulatory factors. Thus, the above results indicate that more than half the regulators give theoretically redundant temporal or spatial information to target genes, or that some do not furnish such information.
It is unlikely that the observed redundancy is an artifact derived from our method to determine RFs. First, we identified upstream regulatory factors in an unbiased way as we mentioned in the Introduction section. Second, we started with considering all theoretically possible conjunctive clauses to determine RFs, as we explained in detail in the Introduction section; therefore, it is unlikely that this method favors or disfavors redundancy of the network. Third, RFs represent necessary and sufficient conditions for expression of individual target genes, and their sufficiency was experimentally verified [5] (see also Fig 2).
Although more than half the regulators give theoretically redundant temporal or spatial information to target genes, they are necessary for expression of their target genes in real embryos. The above computational analysis showed that Foxd and Fgf9/16/20 theoretically gave redundant cues for regulation of Lhx3/4, whereas Lhx3 expression is lost in embryos injected with an MO against Foxd or Fgf9/16/20 [5]. Similarly, the present and previous studies have experimentally shown that each of the observed redundant factors act to establish specific expression patterns of the 13 target genes [4,5,14,19,20,26,35]. Exceptions include involvement of β-catenin and Gata.a in regulation of Dmrt.a and Dlx.b. Therefore, we confirmed that these factors indeed regulate expression of Dmrt.a and Dlx.b (S6 Fig). Thus, redundant temporal or spatial information is given to the 13 target genes.
In this way, the above hypothetical experiment using the digital twin indicates that wide tolerance in the gene regulatory network may constitute evolutionary potential to rewire networks, and may have contributed to evolution of gene regulatory networks. Thus, the digital twin makes it possible to analyze mechanisms that are difficult to access with experiments using real embryos.
Rewiring of the regulatory mechanism for Nodal
As shown in Fig 4, Zfpm is a factor that does not furnish spatial information for specific expression of Nodal. Even if Zfpm is removed simultaneously from all three conjunctive clauses that contain Zfpm, Nodal expression is theoretically unchanged in the digital twin (Fig 5A and 5B). That is, the digital twin predicted that the Nodal expression pattern can be maintained without Zfpm, if the Nodal cis-regulatory region is properly re-designed. To substantiate this prediction, we tried to manipulate the upstream region of Nodal.
Zfpm is a co-factor of Gata.a, but it has not completely been understood how Zfpm-dependency is encoded in cis-regulatory regions. Therefore, we first tried to find Gata.a binding sites that act independently of Zfpm. Zic-r.b is one of Gata.a targets (see Fig 4). When we hypothetically added regulation by Zfpm to all conjunctive clauses that included Gata.a in the Zic-r.b RF, this hypothetical RF failed to induce expression in B6.4 cells (Fig 5B). Because expression of Zic-r.b in B6.4 is important for muscle cell specification [36,37], this simulation indicated that the muscle specification program in the B6.4 lineage would be disrupted if mutations placed Zic-r.b under control of Zfpm. This observation suggested that Zic-r.b is constrained to be independent of Zfpm. Indeed, knockdown of Zfpm did not affect Zic-r.b expression in real embryos (Fig 5C). Therefore, it is highly likely that Gata.a binding sites responsible for Zic-r.b expression in 32-cell embryos function independently of Zfpm. Specifically, because two Gata.a-binding sites responsible for Zic-r.b expression in B6.4 have been identified [14], it is highly likely that these two Gata.a binding sites act independently of Zfpm.
The cis-regulatory region responsible for expression of Nodal in b6.5 of 32-cell embryos has been identified and contains four Gata-binding sites [19,38] (Fig 5D). A reporter construct that contained this region and a minimal promoter was expressed in b6.5 (Fig 5E) [19,38]. Because co-injection of this wild-type reporter and the Zfpm MO greatly reduced the percentage of embryos with reporter expression (Fig 5E and 5E’), this reporter reproduced the dependency of Nodal on Zfpm.
Next, we deleted the two upstream Gata.a binding sites of the Nodal cis-regulatory region, and instead inserted the two Gata.a binding sites responsible for Zic-r.b expression. This construct was expressed in b6.5, but co-injection of the Zfpm MO did not reduce the percentage of embryos with reporter expression (Fig 5F and 5F’). Thus, by this replacement, the reporter became independent of Zfpm, and the mutated upstream sequence retained the ability to direct expression in b6.5. Because Zic-r.b is not expressed in b6.5, it is not likely that the Gata.a binding sites of Zic-r.b give temporal and spatial information sufficient for driving the reporter in b6.5. That is, we substantiated the prediction that rewiring to make Nodal independent of Zfpm does not necessarily alter the Nodal expression pattern, and demonstrated that the observed redundancy of Zfpm for Nodal regulation is not an artifact.
Conclusions
We created a digital twin that accurately reproduces expression dynamics of 13 genes that initiate expression in ascidian 32-cell embryos. These RFs are deduced using data from a comprehensive in situ hybridization assay for regulatory genes [7] and exhaustive knockdown assays in both a previous study [5] and the present study. Therefore, we expect that we succeeded in mathematically representing the whole regulatory system of regulatory genes that initiate expression at the 32-cell stage, although there is a small possibility that additional hypothetical redundant factors are found in future. On the other hand, many regulatory genes analyzed in the present study are also expressed in later stages of the life cycle [7]. Such expression will be regulated by different mechanisms, and these mechanisms may not be included in the RFs we used in the present study. Nevertheless, we showed that this digital twin is useful for predicting expression patterns in various experimental conditions in early embryos, and indeed created the pattern that is normally seen in a neural cell pair (b6.5) of the 32-cell embryo in the 16-cell embryo. Thus, the RFs predict gene expression patterns under untested conditions; therefore, the RFs are not mere restatements of experimental results.
We showed that simulators that digitally reproduce gene expression patterns are useful for analyzing mechanisms that are largely inaccessible by experiments using real embryos. Our simulator revealed that regulation that is theoretically redundant or dispensable to give specific expression information occurred in more than half the cases involving the 13 RFs. This indicates robust buffers in the gene regulatory network. Robust buffers may have created a basis for GRN rewiring that changes expression patterns. Although it is unknown whether only the Ciona GRN in early embryos has such wide tolerance or whether this is a general feature of GRNs, we propose that such rewiring of GRNs without changing expression patterns, i.e. developmental system drift, may have occurred frequently during evolution. Indeed, examples of developmental system drift have been reported in ascidians [31,39,40]. If so, it is possible that GRNs in ancestral animals may have had more redundancy and that such redundancy may have decreased during evolution.
Materials and methods
Animals and gene identifiers
Adult specimens of Ciona intestinalis (type A; also called Ciona robusta) were obtained from the National BioResource Project for Ciona. cDNA clones were obtained from our EST clone collection [41]. Identifiers for genes examined in this study are as follows: Admp, KY21.Chr2.381/KH.C2.421; Bmp3, KY21.Chr12.897/KH.C12.491; β-catenin (Ctnnb), KY21.Chr9.48/KH.C9.53; Dickkopf, KY21.Chr6.647/KH.L20.29; Dlx.b, KY21.Chr7.361/KH.C7.243; Dmrt.a, KY21.Chr5.707/KH.S544.3; Efna.d, KY21.Chr3.881/KH.C3.716; Ets1/2 (Ets1/2.b), KY21.Chr10.346/KH.C10.113; Fgf9/16/20, KY21.Chr2.824/KH.C2.125; Foxa.a, KY21.Chr11.1129/KH.C11.313; Foxd, KY21.Chr8.654/655/KH.C8.396/890; Foxtun2, KY21.Chr14.884/KH.L150.2; Gata.a, KY21.Chr6.630/KH.L20.1; Gdf1/3-r, KY21.Chr4.347/KH.C4.547; Hes.a, KY21.Chr1.29/KH.C1.159; Hes.b, KY21.Chr3.564/KH.C3.312; Lhx3/4, KY21.Chr13.457/KH.S215.4; Macho-1 (Zic-r.a), KY21.Chr1.1337/KH.C1.727; Neurogenin, KY21.Chr6.434/KH.C6.129; Nodal, KY21.Chr14.864/KH.L106.16; Otx, KY21.Chr4.720/KH.C4.84; Pem1, KY21.Chr1.618/KH.C1.755; Prdm1-r, KY21.Chr12.994/997/KH.C12.105/493; Raf, KY21.Chr1.1417/KH.L18.20; Snail (Snai), KY21.Chr3.1356/KH.C3.751; Sox1/2/3, KY21.Chr1.254/KH.C1.99; Tbx6-r.a, KY21.Chr11.458/KH.L8.11; Tbx6-r.b, KY21.Chr11.465/466/467/KH.S654.1/2/3; Tcf7, KY21.Chr6.59/KH.C6.71; Tfap2-r.b, KY21.Chr7.1145/KH.C7.43; Wnt3, KY21.Chr9.971/KH.C9.27; Wnt5, KY21.Chr4.1174/KH.L152.45; Wnttun5, KY21.Chr9.822/KH.C9.257; Zfpm (Fog), KY21.Chr10.450/KH.C10.574; Zic-r.b, KY21.Chr6.26/27/28/29/30/31/KH.S816.1/2/4/KH.L59.1/12. Identifiers for the latest KY21 set [42] and a more commonly used KH set [43] are shown.
Functional assays
The sequence of the MO for Zfpm, which blocks translation, was 5’- GGACATTGTGTGTGTTATTTTTGTA -3’. Gata.a, β-catenin, and Efna.d MOs that were used in previous studies [14,17,21,26] were used here. These MOs were microinjected under a microscope. For overexpression, coding sequences of Zfpm, Sox1/2/3, Tbx6-r.b, Foxa.a, and Prdm1-r were cloned into pBluescript RN3 [44]. While there are two copies of Prdm1-r, we used Prdm1-r.a (KY21.Chr12.997) for overexpression. Injected RNAs were transcribed using a mMESSAGE mMACHINE T3 Transcription Kit (Thermo Fisher Scientific, #AM1348). To stabilize β-catenin in all cells, embryos were incubated in sea water containing BIO, which is an inhibitor of Gsk3 (Merck, #361550; 2.5 μM) [13], from the late 16-cell stage.
To mutate Zfpm, we used CRISPR technology. A guide RNA (5’-CCACGTCTCACCTGAAAGATATC-3’) was synthesized in vitro using a precision gRNA synthesis kit (Thermo Fisher Scientific, #A29377), and the guide RNA (200 ng/μL) and Cas9 protein (50 ng/μL; Thermo Fisher Scientific, #A36496) were co-injected. For genotyping, injected embryos were lysed and genomic DNA was extracted. Then, we amplified a genomic region that contained the target site by PCR (primer sequences are: 5’-CCATTCCGAACTTTCGTCGC-3’ and 5’-CGCTGCTTTTGTCATGTGGT-3’), and amplified DNA fragments were subjected to sequencing analysis. CRISPR efficiency was estimated with obtained sequence chromatograms and the TIDE program [45].
To mimic overexpression of Fgf9/16/20, we added human recombinant basic FGF (FGF2; Merck, #662005) to sea water with 0.1% BSA at a concentration of 10 ng/mL. All functional assays were performed at least twice with different batches of embryos.
Reporter assays
We used a reporter construct from a previous study [19] as a starting construct. This construct contained an upstream region of Nodal [genomic coordinates are Chr14:6168259–6168410 in the HT version assembly [46]], a basal promoter region of Zfpm, and lacZ; therefore, the overall configuration is the same as that of the construct used in another study [38]. Reporter constructs were introduced into unfertilized eggs by microinjection. Expression of lacZ was examined by in situ hybridization. Reporter experiments were performed at least twice with different batches of embryos.
Gene expression assays
Whole-mount in situ hybridization was performed, as described previously [5]. For fluorescent detection, we used Tyramide SuperBoost Kits (Thermo Fisher, #B40922). For RT-qPCR, we extracted RNA and converted RNA to cDNA using a Cells-to-Ct kit (Thermo Fisher Scientific, #4402954). cDNA samples were analyzed by quantitative PCR with the SYBR-Green method. Used primers were: Nodal, 5’-GGAATTGTACCGAGCCAAAA-3’ and 5’- ACGACGACCAACTTTGAACC-3’; Lhx3/4, 5’-GGTTGGCAAATGGAAGTCGAA-3’ and 5’-GCCAAGGTTTGTCCTGTACTTTGAG-3’; Neurogenin, 5’-GGCCTCACAAGACGTAATGG-3’ and 5’-AAGACCATGCATTCGGTTTC-3’; Dickkopf, 5’-ACACCTACTATAATACCTAAACGCGAAA-3’ and 5’-TTGTGCGCAACAGAAACCAT-3’; Snail, 5’-TGGTAAAGCGTTCTCACGTACCT-3’ and 5’-CACAGTGCATTGGTATGGTTTCTC-3’; Wnt3, 5’-ATTGACCAATGCAAGCATCA-3’ and 5’-TCCAATACAGGCCCGAATAC-3’; Wnt5, 5’-ATCGGGAACGTAAAGTAATGAACAT-3’ and 5’-CGAGCCGATCTCACAACGA-3’; Bmp3, 5’-GTCCGTAGCTTCTTCTCTGTAGCA-3’ and 5’-GCGGGTACGATTAGAATAGGTTTC-3’; Hes.b, 5’-CTTCGACTGTGCAAATTGTATCTTC-3’ and 5’-CGCGGCGTCGTTTTTC-3’; Zic-r.b, 5’-CGTTTGGAAGAAGCGAGAATTTAA-3’ and 5’-TTCAGTGTTGTGCATGTAACTATGCTT-3’; Dmrt.a, 5’-TCTGATCGCTGAACGACAAC-3’ and 5’-GTGGCGACTGTCGGTTATTT-3’; Otx, 5’-GGCTTAGGCCACGATATGAA-3’ and 5’-TAGCTCCTTGGTGCATTCCT-3’; Dlx.b, 5’-TTACAAACTGCACCCCCTTC-3’ and 5’-TCTCCTGGATCGGAATCAAC-3’; Macho-1, 5’- CCCAGTATGCACCAAATTCAGA-3’ and 5’- TGGTGTGAAAACGGGTGAAAC-3’; Pou2, 5’-AAGATGGTTGCTGGATGCTAATAAT-3’ and 5’-TTGGATTGGAGTGGGAATAACAA-3’. No amplification was observed in control samples that included water instead of reverse-transcriptase.
Boolean representation of Nodal regulatory function and the digital twin
To identify the simplest disjunctive normal form compatible with all experimental results in the present study and our previous study [5], we used a computer program, mindnf, that we developed previously [5].
The digital twin, in which RFs for the 13 genes are implemented, is provided as an HTML-based program (http://ghost.zool.kyoto-u.ac.jp/sim32v2/). The computer code is deposited in Zenodo ( 10.5281/zenodo.7604201).
Supporting information
S1 Fig
The expression pattern of Zfpm.Fluorescence in situ hybridization was used to examine expression patterns of Zfpm in (A) 8-cell, (B) 16-cell, and (C) 32-cell embryos. Note that expression is detected as green dots in nuclei of cells in the animal and vegetal hemispheres of a 16-cell embryo and in the animal hemisphere of a 32-cell embryo, but not in the 8-cell embryo or in the vegetal hemisphere of the 32-cell embryo. Nuclei are stained with DAPI (blue).
(PDF)
S2 Fig
CRISPR-based knockout of Zfpm to confirm the specificity of the phenotype in Zfpm morphants.(A) First, we examined genomic DNA of larvae developed from eggs injected with Cas9 protein and a guide RNA designed to bind to the region encoding the fifth and sixth zinc fingers of Zfpm. Sequencing followed by TIDE analysis [45] indicated that 69.6 and 74.4% of DNA fragments amplified with PCR from two batches of larvae contained mutations, suggesting that this guide RNA was effective. Next, we performed the same experiments using 16-cell embryos. Mutagenic efficiency varied from 1.1 to 69.5% among nine batches of embryos we examined. Therefore, we used three batches of embryos that showed high mutagenic efficiencies (69.5, 52.9, and 45.6%) for the following analyses. TIDE further indicated that these three batches contained 3/6/9 base insertions or deletions in 11.2%, 12.9%, and 6.1% of the amplified DNA fragments. Because these mutations did not cause frame-shifts, and may not have severely impaired Zfpm function, we conservatively estimated that 58.3 (= 69.5–11.2), 40.0 (= 52.9–12.9), and 39.5 (= 45.6–6.1) % of these embryos contained effective mutations. As all cells are diploid, 34.0, 16.0, and 15.6% of cells were estimated to contain effective mutations in both maternal and paternal alleles. By in situ hybridization, we found that 20.7, 18.9, and 10.0% of embryos in these batches lost Nodal expression. These percentages were close to the expected percentages, indicating that Zfpm is required for Nodal expression and that the Zfpm MO acted specifically. (B) An embryo that lost Nodal expression by CRISPR knockout of Zfpm. Expression was examined with in situ hybridization. An uninjected embryo (n = 43) and embryos injected with either Cas9 (n = 45) or Zfpm sgRNA (n = 39) are shown as controls. Nodal expression was not changed in these controls.
(PDF)
S3 Fig
The tentative Nodal regulatory function.This tentative function was determined previously [5], and explains the Nodal expression pattern in normal conditions at the 16-cell and 32-cell stages and in a variety of experimental conditions at the 32-cell stage, but cannot accurately predict expression patterns in experimental conditions at the 16-cell stage. Note that expression patterns of Prdm1-r and Foxa.a indicated that either of them or both are involved in the fourth conjunctive clause, but their involvement was not strictly tested [5].
(PDF)
S4 Fig
Examination of the fourth conjunctive clause of the RF for Nodal.Nodal expression was examined with in situ hybridization at the 16-cell stage in (A) an embryo treated with FGF2, (B) an embryo injected with the Zfpm MO and treated with FGF2, (C) an embryo injected with the Gata.a MO and treated with FGF2, (D) an embryo injected with the β-catenin MO and treated with FGF2, (E) an embryo injected with Prdm1-r mRNA and treated with FGF2, (F) an embryo injected with Foxa.a mRNA and treated with FGF2. Nodal expression in unperturbed control embryos is shown in Fig 3D. Photographs in (A) are the same as those in Fig 2C. Total numbers of embryos examined and numbers of embryos that photographs represent are shown within the panels.
(PDF)
S5 Fig
Rewiring that does not change expression patterns in normal embryos can induce different expression patterns if distribution patterns of upstream regulators are changed.(A) RFs for Lhx3/4, Neurogenin, and Dickkopf are commonly represented as FoxdFgf9/16/20β-catenin, and these genes are expressed in A6.1, A6.3, and B6.1 of normal embryos. Regulatory functions represented as Fgf9/16/20β-catenin and Foxdβ-catenin can induce the same expression patterns. (B-D) In cases in which (B) Foxd, (C) Fgf9/16/20, or (D) β-catenin acts in all cells of 16-and 32-cell embryos, three RFs induce different expression patterns. This observation indicates that Lhx3/4, Neurogenin, and Dickkopf could be expressed differently upon changes of distribution patterns of their upstream factor through rewiring that does not change expression patterns in normal embryos.
(PDF)
S6 Fig
β-catenin and Gata.a are involved in regulating Dmrt.a and Dlx.b expression.(A) Dmrt.a expression in normal unperturbed embryos. (B) Dmrt.a expression is abolished in embryos incubated in sea water containing BIO, an inhibitor of Gsk3; therefore, β-catenin is expected to be stabilized in all cells. (C) Dmrt.a expression is also abolished in embryos injected with the Gata.a MO. (D) Dlx.b expression is detected in anterior cells of the animal hemisphere of embryos injected with a MO against Efna.d, as we reported before [5]. (E) Dlx.b expression is not detected in embryos injected with the Efna.d MO and incubated in sea water containing BIO. (F) Dlx.b expression was not detected in embryos injected with the Efna.d MO and the Gata.a MO. Arrowheads indicate maternal Dlx.b mRNA localized to the posterior pole. Total numbers of embryos examined and numbers of embryos that photographs represent are shown below.
(PDF)
S1 Table
All experimental conditions and experimental results used to determine the Nodal regulatory function.(XLSX)
S2 Table
Regulatory functions for genes that initiate expression at the 32-cell stage.(XLSX)
Acknowledgments
We thank Reiji Masuda, Shinichi Tokuhiro, Chikako Imaizumi (Kyoto University), Manabu Yoshida (University of Tokyo), and other members working under the National BioResource Project for Ciona (MEXT, Japan) at Kyoto University and the University of Tokyo for providing experimental animals. We also thank Steven D. Aird for editing the manuscript.
Funding Statement
This research was supported by grants from the Japan Society for the Promotion of Science and CREST, Japan Science and Technology Agency (21H02486 and JPMJCR22N6 to YS, and 20J40280 and 22K06250 to MT). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Data Availability
All data are included in the main document and supplemental materials. The computer code is deposited in Zenodo ( 10.5281/zenodo.7604201).
References
Decision Letter 0
6 Jun 2023
Dear Dr Satou,
Thank you very much for submitting your Research Article entitled 'A digital twin reproducing the gene regulatory network of early Ciona embryos indicates robust buffering in the network' to PLOS Genetics.
The manuscript was fully evaluated at the editorial level and by three independent peer reviewers. We apologize for the length of this process. The reviewers appreciated your work. Referee 2 supports its publication, but referees 1 and 3 raised substantial concerns about the current manuscript. Referee 3, in particular, is worried that the way you constructed the networks in this and in your previous study (Tokuoka et al., 2021) may artificially favour the selection of redundant networks, which would undermine a major message of your study. Referee 1, besides encouraging you to revise/precise many sentences, is worried that the use of boolean networks may provide little more insight than providing a mere restatement of experimental results.
Based on these significant concerns, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version that in particular addresses these two major concerns. We cannot, of course, promise publication at that time.
Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.
If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to gro.solp@scitenegsolp.
If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.
To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future.
Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at gro.solp@serugif.
PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.
To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.
We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.
Yours sincerely,
Patrick Lemaire
Guest Editor
PLOS Genetics
Gregory Barsh
Editor-in-Chief
PLOS Genetics
Reviewer's Responses to Questions
Comments to the Authors:
Please note here if the review is uploaded as an attachment.
Reviewer #1: I have now carefully read this manuscript. My major objection is that the English is, at times, rather hard to read. The article provides an improvement on a previous related article. The improvement consist in the inclusion of one gene, Nodal, and an exploration of the redundancy in the regulation of the patterns of gene expression in the 32 cell stage. There are really not major take-home message rather than a slightly more detailed description of the underlying network (e.g. no actual cell signaling occurring). I am a theoretical biologist and I do not see much benefit in the boolean model. In practice it is just a re-statement of experimental findings. This is certainly not useless and the virtual twin can be helpful as a kind of summary for Ciona researchers. I am uncertain as to whether this manuscript provides enough incremental insight as to deserve publication in PLOS Genetics, I leave that decision to reviewers that work experimentally with Ciona.
I have a number of minor comments. In truth, these are only the major comments among the minor ones because there are quite many grammar mistakes and obscure sentences over the article. The article needs to revised by a native English speaker.
Minor:
-”Although the structure of such networks, which represent connections among
regulatory genes, has been studied extensively, it is unclear how these networks encode gene
expression dynamics.”. It is unclear what “encode” is supposed to mean in this sentence, I think encode is not the right word in this context.
-In line 48, It is just incorrect to label a phylogenetic group as “basal”. Please correct and check: https://pubmed.ncbi.nlm.nih.gov/16733736/
-I think the reading of the article would be facilitated by a better explanation or more detailed discussion of this statement: “Regulatory
mechanisms for 12 of these 13 genes have been described mathematically as Boolean functions,
56which represent necessary and sufficient conditions for their expression (Tokuoka et al., 2021)
(see an example for Boolean functions in Figure 2A).”
-This sentence is not clear enough: “The regulatory function (RF) for the remaining gene, Nodal, was also tentatively determined; it represents necessary conditions, but
does not represent sufficient conditions (Tokuoka et al., 2021). “. What is not necessary or sufficient? And for what?
-This sentence is also unclear: “In the present
study, we first devised such a digital twin and then showed that gene expression patterns can
indeed be manipulated according to predictions from the digital twin.” Are the authors meaning that they are going to perform experimental manipulations? This is clearer later but it should be clear already at this point.
-The following sentence is also unclear: “We
found robust buffering of regulatory mechanisms, which potentially permits rewiring of GRNs
without changing expression patterns.” What is a regulatory mechanism for the authors? How it relates to Rfs. What do they mean by “buffering”, this word is written as a verb form. This suggests that buffering is an active thing in this case but this sound a bit confusing and it is anyway unclear what, if anything, is doing the buffering.
-It is not totally clear what the authors mean by a gene’s RF. The authors provide no explicit definition.
-I cannot understand this statement: “The finding that RFs are
successfully represented as Boolean functions indicates that qualitative, but not quantitative,
control are important for regulation in early ascidian embryos (Tokuoka et al., 2021).” The last part after “control” is objectively not-understandable. This is probably because the authors use control and regulation in non-standard ways, that need to be clarified.
-The grammar in this statement is improvable: “We assumed that cells descended from cells expressing a transcription factor gene at the 16-cell stage express the encoded protein at the 32-cell stage
because of a delay between gene expression and protein translation (Tokuoka et al., 2021).”
-Is “artifactitiously” an actual word? How about just “artificially”
-The one-paragraph sentence between lines 154 and 164 needs to be broken into several different sentences, otherwise it is completely unredeable.
-The title of this section “Reproduction of the pattern of a cell of a 32-cell-embryo in 16-cell embryos” is written in a rather strange manner.
-In line 231: “Because RFs are encoded in cis-regulatory regions,…” Is that the case? Rfs can also be regulated by protein-protein interactions occurring between transcriptional factors and co-factors, as the authors results seem to indicate for the regulation of Nodal.
-”For RFs for the 13 genes, 17 regulators appear 99 times in total.” This sentence is not understandable.
-In line 244: “This finding suggests a possibility that the observed redundancy provides
245a basis for developmental system drifts.” This is statement is true but the reverse can also be true: this redundancy may be the result of developmental systems drift, not just a possible cause.
-In line 246: ”The digital twin also indicated that such rewiring enabled
these genes to behave differently from one another if expression patterns of upstream factors
were changed (S5B-D Fig)” The verb tenses seem to be off in this sentence. From the writing is seems that that is what happened in nature in the past, while that happened only in the model.
-In line 248: ”Although combinations of regulators may not be the only
evolutionary constraints of Rfs,” It is unclear what the authors mean by “combinations of regulators” and “evolutionary constraints” in the context of this sentence.
-I wonder to which extend the robustness results are a mere consequence of not getting the RF right for all genes. Ideally, the reader would not have to read all previous articles by the authors to figure out whether this is the case or not.
Reviewer #2: In this manuscript, Tokuoka and Satou present an updated version of their 16-32-cell stage digital twin embryo. Previously, they unraveled the simplest combinations of “regulatory functions” (RF) for 12 out of 13 regulatory genes/zones, activated at the 32-cell stage, using Boolean logic (Tokuoka et al, 2021). Here, they extend this analysis to cover the 13th gene, Nodal, completing its RF, with the addition of the regulator Zfpm (fog). The authors convincingly tested their predicted Nodal RF (combination of regulatory factors/conditions sufficient for Nodal expression) using the digital twin embryo and real embryo experiments. Next, the authors modified the theoretical RFs for all 13 genes and found that in 55 cases in which a single regulator was removed, the digital twin still predicted the correct gene expression output. In other words, in terms of computation, many factors had ‘redundant’ coding functions. The authors propose the hypothesis that this type of redundant coding function could be a mechanism that would allow developmental systems drift, since changes in regulatory sequences, when coupled with cis-regulatory changes, should still allow the same phenotypic output. Furthermore, the authors then test in embryos the hypothesis that removal of redundant factors will lead to the same phenotypic output if the cis-regulatory sequences are also adapted. They did this by swapping the Gata.a binding sites in the Nodal b6.5 enhancer (dependence on Zfpm) with those of Zic-r.b (independence on Zfpm), which confirmed their hypothesis. The updated HTML-based program is improved and easy to use and will be very useful for the community.
The manuscript is carefully conducted and interpretation is clever and interesting. I think that this thought-provoking manuscript will be of great interest for both ascidian developmental biologists and the broad audience of evo-devo researchers.
Minor comments.
Line 23: Sentence is not clear, I would replace ‘or developmental system drift’ with ‘enabling (or allowing) developmental systems drift’.
line 136: I would remove the word “artifactitiously” (or change it).
Reviewer #3: This is an interesting paper where the authors present a boolean model that is able to recapitulate the WT expression of dynamics of 13 genes in the 32-cell ascidian embryo as well as predict expression in certain experimental conditions. They then do in-silico knockout experiments to conclude that their network is highly redundant, in order to then speculate that it is this redundancy in the network that underlies the evolution of early ascidian patterning via developmental systems drift.
Some clarifications are required throughout, but importantly, in order to say anything about redundancy in the network the authors should first show that the way in which they construct their networks to begin with doesn't inadvertently favour redundant networks. This is particularly important because these networks are constructed taking a bottom-up approach rather than an unbiased parameter fitting approach.
Abstract
L13 (also L27 and L45): How GRNs encode dynamics and evolve are one of the most-studied and best-understood problems in biology, although I do agree that there is still a lot that we don’t know. I would encourage the authors to rephrase.
L14 (and throughout the paper): I find the terminology confusing. Why digital twin? Why not Boolean model of gene regulation?
L21: Playing Devil’s advocate, it might also reflect that the GRN was initially formulated to include many factors that aren’t necessary, and that might not actually be there in the embryo. How do we tease the two possibilities apart?
Introduction
L54-L59: More information on how RF were constructed in Tokuoka 2021 might be beneficial for the reader, particularly since the authors mention explicitly that for Nodal, this represented necessary but not sufficient conditions. The reader needs to know how these functions have been constructed, because the paper’s main claim is one of robustness of the network to evolutionary changes in the regulation. However, is these networks (RFs) are initially built assuming necessity, but with no, or little consideration, for the sufficiency, then it surely comes as no surprise that some of the components were redundant in the formulation to begin with
L80: I think it is important to mention how these RFs were found in thee 2021 paper, in addition to how they are formulated. Why did the algorithms used in Tokuoka 2021 not reveal more minimal RFs, if the simplest RFs were selected?
L97 (also L55): Please clarify why the RF for Nodal is the only one out of 13 that wasn’t calculated or inferred in the previous study (or did I understand this wrong?)
Results
L102: Reading this sentence has made me realise that I don’t quite understand how the simulations are being run. What is being simulated and how are these models initialised. From the introduction, I thought that the models are initialised using the expression at the 16-cell stage to simulate the expression at the 32-cell stage, but the sentence in L102 makes me think I have misunderstood this. Please clarify somewhere in the introduction.
L104: I would have thought that if the model recapitulates the pattern at the 32-cell stage but not at the 16-cell stage, the conclusion would be that the formulation was missing something present at the 16-cell stage, not at the 32-cell stage as stated.
L105: I thin that when the authors say inferred here, they really mean assumed.
L133: The fourth conjunction represents experimental conditions. Which specifically?
L134-L138: I’m not sure what the perturbation is here. Is it FGF up-regulation?
L153: Is the 4th conjunction an aggregate function representing every experimental perturbation then? Please clarify the difference between the 1st to 3rd conjunctions and the 4th somewhere in the introduction, and how these are inferred.
L166-L180: Verry useful recap, thank you!
L180: Except for Foxa.a injections + FGF.
L186: for wild type expression?
L187-L190: I am very confused by these statements. I am also not sure that adding a conjunction to capture experimental perturbations separately from the other 3 is rigorous. Surely in the embryo, the result of the perturbations is a consequence of the normal WT wiring of the GRN; it doesn’t have a separate “perturbation/experimental” module. If the authors want to include this for the sake of the model, OK, but to then from then assume that this conjunction actually exists separately from the main, or WT GRN, and that it can evolve separately and/or be useful later in development, is a stretch.
L196: his statement is incredibly confusing. Is it the case, that the all the simulating is being done at the 16-cell stage, and that they are simulating the presence of mRNAs of different genes in different cells according to whether the proteins of these are expressed in the daughter cells at the 32-cell stage?
L199-L202 clarify the previous point. Please, try to clarify this throughout.
L218: I wouldn’t call this a model prediction. The model has been constructed and selected to reproduce these patterns. Therefore, there is agreement between model and data, but this is not a model prediction.
L222: transformed cells?
L230: Although I agree with the results presented in the section beginning in L230, I think a more detailed discussion regarding how the original construction/inference of the networks might have favoured redundancy in the first place, is required. As I mentioned elsewhere, if the way in which these networks were constructed focused on necessity rather than sufficiency, it is possible that the algorithm had favoured networks with a high degree of redundancy to begin with. This does not mean that the GRNs in the Ascidian early embryo are redundant, in fact, they might not be, as the minimal GRN in this section shows. I don’t think this is the case, and think that the authors are probably right, but a comment on whether their initial model construction favoured, or at least did not discard, redundancy the networks is needed in order for the results presented in this section to be more rigorous.
L233: perturb?
L260: But you have shown above that Zfpm positively regulates Nodal (L126). Please clarify.
L256: I wonder whether the the flow of the paper might be improved if this section was to be moved to follow the section on the regulatory function of nodal?
Conclusions
L298: I think that here the authors are over-claiming slightly: Having all the data does not reveal the underlying the regulatory structure, hence the motivation for the modelling. I would rephrase to reflect this.
L306: The conclusion needs an in-depth exploration of whether the inferred networks might have had landed on more redundant formulations by default, or by design of the algorithms used to infer them. This is my biggest criticism of this paper: that the main claim – namely that redundancy within the GRN might have led to developmental systems drift in ascidians – is due to the networks in the early embryos of one extant species of ascidian being very redundant, where these networks have been inferred by constructing Boolean models that reproduce the data from the bottom up (not using unbiased parameter fitting or reverse-engineering for example).
**********
Have all data underlying the figures and results presented in the manuscript been provided?
Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.
Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes
**********
PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
Reviewer #3: No
Author response to Decision Letter 0
15 Jun 2023
Decision Letter 1
16 Aug 2023
Dear Dr Satou,
Thank you very much for submitting your Research Article entitled 'A simulator reproducing the gene regulatory network of early Ciona embryos indicates robust buffers in the network' to PLOS Genetics.
The manuscript was fully evaluated at the editorial level and by independent peer reviewers. Following your revisions, Reviewer 3 joined Reviewer 1 in supporting publication. As Reviewer 2 was not available to assess the revisions, we asked Reviewer 4, who was not involved in the first round of refereeing, to comment on your work.
Reviewer 4 appreciated the attention to an important topic and identified some concerns with the model that we ask you address in a revised manuscript. Upon resubmission, the manuscript will not be sent back out for review.
We suggest that you give particular attention to the following two comments:
1: “In any case, for Lhx3/4, Neurogenin, and Dickkopf, it would be interesting to check if the function (Foxd & β-catenin)v(Fgf9/16/20 & β-catenin) was discarded because of the choice decided by the authors, or because it didn't comply with some experiments.”
2: “ how can we ensure, if a DNF with n literals is choosen, that we are not missing an additional regulator? e.g. if (AvB) as well as (AvBv¬C) were compatible with the experiments disclosing expression of X, it could be the case that in the future, a new experiment shows that indeed absence of C is required for the expression of X.”
Additionally, we encourage you to replace "conjunctions" with "conjunctive clauses" when applicable, to provide the model (the set of functions) in the form of an excel sheet or, better, an SBML file and to revise the title of your study, possibly following Reviewer 3's suggestion to include "digital twin".
Finally, we ask that you:
1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.
2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.
We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to gro.solp@scitenegsolp.
If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at gro.solp@serugif.
Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.
To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.
To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.
Please let us know if you have any questions while making these revisions.
Yours sincerely,
Patrick Lemaire
Guest Editor
PLOS Genetics
Gregory Barsh
Editor-in-Chief
PLOS Genetics
Reviewer's Responses to Questions
Comments to the Authors:
Please note here if the review is uploaded as an attachment.
Reviewer #3: I thank the authors for addressing my main concern, which was that their networks might be biased towards redundancy by construction and am happy with their response and amendments to the text. I also thank their responses to my other comments. Since writing my review, it has come to my attention that the notion of a “digital twin” is gaining traction. The authors should use it instead of simulator if they wish to. I am now supportive of the publication of this paper.
As an aside, I emphatically agree with the authors that the fact that the model is Boolean, does not make it “a mere re-statement of experimental results”. Boolean models have a lot of explanatory and predictive power, often more so than continuous models, and are perfectly suitable for this system where expression is on/off as opposed to graded.
Ps. The amendment in L356-357 should read “in ancestral animals”.
Reviewer #4: This manuscript presents a program enabling the determination of expression patterns of 13 genes in 32-cell ascidian embryos from that of 18 upstream factors, which start to be expressed in the 16-cell embryo. Regulatory mechanisms are represented as Boolean functions determined in a previous work [Tokuoka et al., 2021] for 12 of these genes. The case of Nodal is revisited as the function determined in [Tokuoka et al., 2021] could not account for Nodal expression in specific situations.
Contributions of this work are the implementation of the computational tool (called simulator), a regulatory function of Nodal, and an in silico experiment to supposedly show that some regulators are redundant. Overall, the paper is interesting, but in my opinion the model definition is not convincing.
- Concerning the implemented program, it is indeed well designed with its HTML version, and might be useful for the community. However, it is a bit excessive to call it a "simulator", as it merely performs the evaluation of Boolean functions (i.e. one step, from an input pattern to a resulting pattern).
- I am not qualified to evaluate the results concerrning the regulatory function of Nodal.
- My main concern relates to the RFs definition and the take home message about a supposed redundancy of many regulatory genes. The authors recognize that while "although more than half the regulators give theoretically redundant temporal or spatial information to target genes, they are necessary for expression of their target genes in real embryos." This statement sounds a bit odd to me, as I would understand "redundancy" differently.
But I am more concerned about the following claim "It is unlikely that the observed redundancy is an artifact derived from our method to determine RFs."
The problem to determine the regulatory functions is largely underdetermined as there are 2^18 (=262,144) potential truth tables for each target gene. As rightly put by the authors, it is impossible to perform so many experiments to determine the right function. When they say "we considered all theoretically possible conjunctions to determine RFs", it might be the case (note that the term "conjunctions" here is not appropriate), but they finally choose a single one, and the redundancy analysis on this sole function is therefore not convincing.
First, criteria for this choice are unclear: "DNFs with the smallest number of conjunctions and the smallest number of upstream regulators were considered as primary candidates." I suppose that the authors by "conjunctions" mean "conjunctive clauses", which should be corrected. Otherwise, it would be unclear which function would be choosen between A&B&C (A and B and C) and (A&B) v (B&C) ((A and B)or(B and C), as they have the same number of literals (A,B,C) and the same number of conjunctions. The notion of simplest function here is debatable, one could say that the simplest function is the least stringent (higher number of disjunctions, i.e. clauses), as AvB is true for 3 value configurations of A and B, whereas A&B is true for only one.
In any case, for Lhx3/4, Neurogenin, and Dickkopf, it would be interesting to check if the function (Foxd & β-catenin)v(Fgf9/16/20 & β-catenin) was discarded because of the choice decided by the authors, or beacuse it didn't comply with some experiments.
Second, how can we ensure, if a DNF with n literals is choosen, that we are not missing an additional regulator? e.g. if (AvB) as well as (AvBv¬C) were compatible with the experiments disclosing expression of X, it could be the case that in the future, a new experiment shows that indeed absence of C is required for the expression of X.
Further comments:
In Nodal's function, I am surprised to see that β-catenin appears as an activator and an inhibitor. Such cases are not so common. It would be good to check again an explain why β-catenin acts as a dual regulator.
I suppose that there exist data on binding sites in ascidian, and it would be interesting to use these data to help choosing the RFs (choice which is so far based on expression data).
In Table S1, it appears clearly that Foxd, Fgf9/16/20 and β-catenin do not have the same patterns of expression in the 16-cell and the 32-cell embryo. This would imply that they are regulated, but I am not sure the authors provide any comment on this point, neither on potential cell-cell signaling. I also wonder if there are some cross regulations, or circuits, as the proposed network is free of loops.
Finally, Boolean functions to model regulatory networks driving development have been used for several decades now. There are many studies, for example L. Sánchez & D. Thieffry as well as R. Albert & HG Othmer modelling work on the regulatory modules controlling drosophila segmentation, ER Álvarez-Buylla papers on Arabidopsis development.
There are also software tools devoted to Boolean models (GINsim, BoolNet,...).
I would advise to provide the model (the set of functions) in the form of an excel sheet or, better, as an SBML file.
Overall, the manuscript would deserve a carefull re-reading as there are some ill-constructed sentences.
The title should be changed. A simulator does not reproduce a network, rather its dynamics. Moreover, as previously mentionned the term "simulator" does not seem appropriate here.
**********
Have all data underlying the figures and results presented in the manuscript been provided?
Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.
Reviewer #3: Yes
Reviewer #4: Yes
**********
PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #3: No
Reviewer #4: No
Author response to Decision Letter 1
21 Aug 2023
Decision Letter 2
1 Sep 2023
Dear Dr Satou,
We are pleased to inform you that your manuscript entitled "A digital twin reproducing gene regulatory network dynamics of early Ciona embryos indicates robust buffers in the network" has been editorially accepted for publication in PLOS Genetics. Congratulations!
Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.
Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at gro.solp@scitenegsolp.
In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.
If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.
Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!
Yours sincerely,
Patrick Lemaire
Guest Editor
PLOS Genetics
Gregory Barsh
Editor-in-Chief
PLOS Genetics
Twitter: @PLOSGenetics
----------------------------------------------------
Comments from the reviewers (if applicable):
----------------------------------------------------
Data Deposition
If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.
The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly:
http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-23-00299R2
More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact gro.dayrdatad@pleh for support.
Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.
----------------------------------------------------
Press Queries
If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via gro.solp@scitenegsolp.
Acceptance letter
15 Sep 2023
PGENETICS-D-23-00299R2
A digital twin reproducing gene regulatory network dynamics of early Ciona embryos indicates robust buffers in the network
Dear Dr Satou,
We are pleased to inform you that your manuscript entitled "A digital twin reproducing gene regulatory network dynamics of early Ciona embryos indicates robust buffers in the network" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.
The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.
Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.
Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!
With kind regards,
Livia Kovacs
PLOS Genetics
On behalf of:
The PLOS Genetics Team
Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
gro.solp@scitenegsolp | +44 (0) 1223-442823
plosgenetics.org | Twitter: @PLOSGenetics
Articles from PLOS Genetics are provided here courtesy of PLOS
Full text links
Read article at publisher's site: https://doi.org/10.1371/journal.pgen.1010953
Read article for free, from open access legal sources, via Unpaywall: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1010953&type=printable
Citations & impact
This article has not been cited yet.
Impact metrics
Alternative metrics
Discover the attention surrounding your research
https://www.altmetric.com/details/154706886
Similar Articles
To arrive at the top five similar articles we use a word-weighted algorithm to compare words from the Title and Abstract of each citation.
A gene regulatory network for cell fate specification in Ciona embryos.
Curr Top Dev Biol, 139:1-33, 05 May 2020
Cited by: 6 articles | PMID: 32450958
Review
Gene regulatory systems that control gene expression in the Ciona embryo.
Proc Jpn Acad Ser B Phys Biol Sci, 91(2):33-51, 01 Jan 2015
Cited by: 16 articles | PMID: 25748582 | PMCID: PMC4406867
Review Free full text in Europe PMC
Identification of genes downstream of nodal in the Ciona intestinalis embryo.
Zoolog Sci, 27(2):69-75, 01 Feb 2010
Cited by: 5 articles | PMID: 20141410
Dissection of a Ciona regulatory element reveals complexity of cross-species enhancer activity.
Dev Biol, 390(2):261-272, 28 Mar 2014
Cited by: 7 articles | PMID: 24680932 | PMCID: PMC4010673
Funding
Funders who supported this work.
Japan Science and Technology Corporation (1)
Grant ID: JPMJCR22N6
Japan Society for the Promotion of Science (3)
Grant ID: 22K06250
Grant ID: 20J40280
Grant ID: 21H02486