Regulatory sequences involved in pancreas-specific gene expression
The present invention relates to regulatory sequences of the promoter of the Pax6 gene or a gene homologous to the Pax6 gene, being capable of conferring expression in pancreatic cells. The present invention also relates to recombinant DNA molecules and vectors comprising said regulatory sequences as well as to host cells transformed therewith. The present invention additionally relates to pharmaceutical and diagnostic compositions comprising such regulatory sequences, recombinant DNA molecules and vectors. Furthermore, the present invention relates to transgenic non-human animals comprising the aforementioned recombinant DNA molecules or vectors stably integrated into their genome. The present invention also relates to the use of the above-described recombinant DNA molecules and vectors for the preparation of pharmaceutical compositions for treating, preventing and/or delaying a disease related to the pancreas in a subject. Furthermore, the regulatory sequences, the recombinant DNA molecules and vectors of the invention can be used for the preparation of pharmaceutical compositions for inducing a pancreatic disease in a non-human animal. In addition, the present invention relates to a method for identifying agonists/activators or antagonists/inhibitors of genes or gene products involved in pancreatic diseases employing the above-mentioned regulatory sequences, recombinant DNA molecules, vectors, cells, and transgenic non-human animals, to compounds identifiable by said methods, to antibodies directed to said compounds as well as to pharmaceutical and diagnostic compositions comprising said agonists/activators, antagonists/inhibitors and/or antibodies.
The pancreas is a particularly important organ from the point of view of human medicine because it suffers from two important diseases: diabetes mellitus and pancreatic cancer. Diabetes affects at least 30 million people worldwide and, despite the availability of insulin, remains a major problem. Pancreatic cancer causes about 6,500 deaths per annum in the UK and is virtually incurable. Despite this medical importance, research in the developmental biology of the pancreas has produced only a small amount of clinically usable data in recent years; see Slack, Development 121 (1995), 1569-1580. In general, all patients suffering from type I diabetes and a large proportion of patients suffering from type II diabetes require daily insulin injections. This tedious and unpleasant treatment requires a high degree of discipline. However, although much effort has been spent on developing approaches to cure pancreatic diseases, there is as yet no established treatment of such or related diseases.
The vertebrate Pax6 gene is related to the Drosophila pair-rule gene paired (Walther and Gruss, Development 113 (1991), 1435-1449) and encodes two DNA binding domains, a paired domain (Bopp et al., Cell 47 (1986), 1033-1040; Treisman et al., Genes Dev. 5 (1991), 594-604) and a paired-like homeodomain (Frigerio et al., Cell 47 (1986), 735-746). In different species including man, the Pax6 gene shows a complex spatio-temporal expression, exclusively confined to the developing eye, the central nervous system and the pancreas (Macdonald and Wilson, Curr. Opin. Neurobiol. 6 (1996), 49-56; Callaerts et al., Ann. Rev. Neurosci. 20 (1997), 483-532). Pax6 plays, inter alia, a key role in eye morphogenesis. The most striking consequence of homozygosity for mutations of Pax6 homologues in Drosophila (eyeless, Quiring et al., Science 265 (1994), 785-789), mice (Small eye mouse, Sey, Hogan et al., J. Embryol. Exp. Morph. 97 (1988), 95-110; Hill et al., Nature 354 (1991), 522-525), rat (rSey, Fujisawa et al., Differentiation 57 (1994), 31-38) and human (aniridia, Jordan et al., Nature Genet. 1 (1992), 328-332; Glaser et al., Nature Genet. 2 (1992), 232-239; Glaser et al., Nature Genet. 7 (1994), 463-471) is the lack of eyes, whereas heterozygous conditions show a variety of ocular abnormalities with a strong gene dosage effect (Glaser et al., loc. cit. 1994).
In the differentiating eye lens, Pax6 is involved in the lens-specific transcription of the αA-crystallin gene, δ1-crystallin gene and ζ-crystallin gene (reviewed in Cvekl and Piatigorsky, Bioessays 18 (1996), 621-630).
Furthermore, Pax6 has an important function in the development of the brain and spinal cord. Recent evidence indicates that Pax6 loss of function causes distortion of the cortical plate (Schmahl et al., Acta Neuropathol. (Berl.) 86 (1993), 126-135) and migration defects of the cortical neurons (Caric et al., Development 124 (1997), 5087-5096) that are most probably due to a failure of radial glia cell differentiation. Pax6 is required for correct forebrain patterning, as indicated by the defects in the establishment of morphological and expression boundaries, axonal pathfinding and differentiation of the diencephalon in the Small eye mutant (Stoykova et al., Development 122 (1996), 3453-3465; Stoykova et al., Development 124 (1997), 3765-3777; Mastick et al., Development 124 (1997), 1985-1997; Grindley et al., Mech. Dev. 64 (1997), 111-126; Warren and Price, Development 124 (1997), 1573-1582) and in human probands with aniridia (Glaser et al., loc. cit. 1994). In addition, a restricted expression of Pax6 has been detected in the developing pancreas (Walther and Gruss, loc. cit. 1991). In later embryonic stages and after birth the expression becomes localized to the endocrine α, β, δ and γ (PP) cells of the islets (Turque et al., Mol. Endocrinol. 8 (1994), 929-938), which produce the pancreatic hormones glucagon, insulin, somatostatin and pancreatic polypeptide (PP), respectively (Slack, Development 121 (1995), 1569-1580). Mice with a targeted disruption of Pax6 lack glucagon-producing α cells, demonstrating an essential role of Pax6 in the differentiation of this cell type (St. Onge, Nature 387 (1997), 406-409; Sander, Genes Dev. 11 (1997), 1662-1667). In support of this genetic evidence, biochemical studies demonstrate that human Pax6 binds to a common element in the glucagon, insulin and somatostatin promoters (Sander, loc. cit.).
Little is known about the molecular mechanisms that control the expression of the Pax6 gene. Results from studies on the primary structure of Pax6 in quail and C. elegans suggest that the expression of Pax6 is under the control of different regulators through alternate promoters (Dozier et al., Cell Growth Diff. 4 (1993), 281-289; Plaza et al., Mol. Cell. Biol. 15 (1995), 3344-3353; Zhang and Emmons, Nature 377 (1995), 55-59). Recent analysis of the human Pax6 promoter in transient transfection assays identified multiple cis-regulatory elements with distinct functions in different cell lines (Xu and Saunders, J. Biol. Chem. 272 (1997), 3430-3436).
Thus, the technical problem of the present invention is to provide means and methods for the treatment or modulation of pancreatic and related diseases as well as methods for the identification of substances suitable for such purposes. The solution to this technical problem is achieved by providing the embodiments characterized in the claims.
Accordingly, the invention relates to a regulatory sequence of the promoter of the Pax6 gene or of a promoter of a gene homologous to the Pax6 gene being capable of conferring expression in pancreatic cells.
In the context of the present invention, the term "regulatory sequence" refers to sequences which influence the specificity and/or level of expression, for example in the sense that they confer cell and/or tissue specificity. Such regions can be located upstream of the transcription initiation site, but can also be located downstream of it, e.g., in transcribed but not translated leader sequences.
The term "promoter", within the meaning of the present invention, refers to nucleotide sequences necessary for transcription initiation, i.e. RNA polymerase binding, and may also include, for example, the TATA box.
The term "promoter region of a gene homologous to the Pax6 gene", as used herein, includes promoter regions and regulatory sequences of genes from other species, for example human, which are homologous to the murine Pax6 gene and which display substantially the same expression pattern. Such promoters are characterized by their capability of conferring expression of a heterologous DNA sequence in substantially all pancreatic cells. Thus, according to the present invention, regulatory sequences from other species can be used that are functionally homologous to the regulatory sequences of the promoter of the murine Pax6 gene, or promoters of genes that display an identical or similar pattern of expression, in the sense of being expressed in pancreatic cells.
In accordance with the present invention, it was surprisingly found that (a) pancreas specific element(s) is/are located 5' of exon 0. This result is particularly interesting since the prior art (Xu and Saunders, The Journal of Biological Chemistry 272 (1997), 3430-3436; Plaza et al., Molecular and Cellular Biology 15 (1995), 3344-3353) only identified exon 1 related regulatory sequences that direct expression of Pax6, in particular in neuronal differentiation and retina development. Although genomic sequences of Pax genes and putative promoters have been reported, the regulatory sequences which are capable of expressing a heterologous DNA sequence in specific tissues, i.e. the pancreas, have not been described in the art. For example, a 1.5 kb 5' UTR genomic fragment of exon 0 of the quail Pax6 gene published by Plaza et al. (Cell Growth Differ. 4 (1993), 1041-1050) does not comprise any pancreas specific sequences.
Therefore, regulatory sequences of the Pax6 promoter that are capable of and/or required for directing pancreas specific gene expression were not known in the art. The nucleotide sequence as published by Xu and Saunders (loc. cit.) contains the promoter region of exon 1 of Pax6. However, several transcription start sites exist (P0, P1, α). For example, an alternative 5' UTR of exon 0 is found 6.2 kb further upstream of exon 1, which has not been described in the prior art.
The pancreas is made up of two different tissues: the exocrine cells, which are responsible for the production of the digestive enzymes, and the endocrine cells, which form the islets of Langerhans that lie as cell clusters in the exocrine pancreas tissue. At least four different cell types can be distinguished in the islets: the glucagon producing α-cells, the insulin producing β-cells, the somatostatin producing δ-cells and cells that produce the pancreatic polypeptide, referred to as PP-cells. Both exocrine and endocrine tissue are derived from an endodermal primordium, see Slack, Development 121 (1995), 1569-1580. Pax6 expression was not only detected at the beginning of pancreas development in the endocrine progenitor cells but also in the major α-, β- and δ-cells (Walther, Development 113 (1991), 1435-1449; St. Onge, Nature 387 (1997), 406-409).
In accordance with the present invention, it has been surprisingly found that the regulatory region of the Pax6 gene depicted in Fig. 4 contains a pancreas specific element that alone is sufficient and necessary for conferring expression in pancreatic cells. Therefore, several regulatory sequences are identified which are useful to specifically express heterologous DNA sequences in pancreatic cells and/or tissue. In order to identify the regulatory element of the Pax6 gene, 5' upstream genomic fragments were cloned in front of a β-galactosidase reporter gene and the resulting chimeric genes were introduced by means of microinjection into the pronucleus of fertilized mouse egg cells. Five out of 25 embryos showed β-galactosidase activity in lens, cornea and pancreas, which corresponded to the endogenous Pax6 expression. The experiments performed in accordance with the present invention revealed that the pancreas element lies 5 kb upstream of the transcription start point of exon 0 (see Figure 2) and is restricted to a region of 1,100 base pairs. Moreover, it could be shown that the expression of the chimeric gene containing said pancreas element corresponds to the endogenous Pax6 expression in the pancreas. Thus, the regulatory sequence comprising the pancreas element of the promoter of the Pax6 gene of mouse or the promoter of a gene homologous to the Pax6 gene can be used to drive the expression of heterologous DNA sequences specifically in pancreatic cells.
It is possible for the person skilled in the art to isolate, with the help of the known murine Pax6 gene, corresponding genes from other species, for example, from human or fugu (pufferfish). This can be done by conventional techniques known in the art, for example, by using the Pax6 gene as a hybridization probe or by designing appropriate PCR primers. It is then possible to isolate the corresponding promoter region or elements thereof by conventional techniques and test it for its expression pattern. For this purpose, it is, for instance, possible to fuse the promoter to a reporter gene, such as luciferase, green fluorescent protein (GFP), or β-galactosidase (β-Gal/lacZ) and assess the expression of the reporter gene in transgenic animals, such as mice; see the appended examples. The present invention also relates to the use of regulatory sequences of promoter regions which are substantially identical to that of the murine Pax6 promoter or to a promoter of a homologous gene or to parts thereof and which are able to confer specific expression in pancreatic cells.
Such regulatory sequences differ at one or more positions from the above-mentioned regulatory sequences but still have the same specificity, namely they comprise the same or similar sequence motifs responsible for the above-described expression pattern to which the one underlined in Fig. 4 is expected to belong. Preferably such regulatory sequences hybridize to one of the above-mentioned regulatory sequences, most preferably under stringent conditions. Particularly preferred are regulatory sequences which share at least 85%, more preferably 90-95%, and most preferably 96-99% sequence identity with one of the above-mentioned regulatory sequences and have the same specificity. Such regulatory sequences also comprise those which are altered, for example by nucleotide deletion(s), insertion(s), substitution(s), addition(s), and/or recombination(s) and/or any other modification(s) known in the art either alone or in combination in comparison with the above-described nucleotide sequence. Methods for introducing such modifications in the nucleotide sequence of the promoter of the invention are well known to the person skilled in the art and are described, e.g., in the examples. Furthermore, the nucleotide sequences of the invention can be compared using appropriate computer programs known in the art such as BLAST, which stands for Basic Local Alignment Search Tool (Altschul, 1997; Altschul, J. Mol. Evol. 36 (1993), 290-300; Altschul, J. Mol. Biol. 215 (1990), 403-410) and can be used to search for local sequence alignments. BLAST produces alignments of nucleotide sequences to determine sequence similarity. Because of the local nature of the alignments, BLAST is especially useful in determining exact matches or in identifying homologues. With such means it is possible to identify conserved nucleotide sequences that may play a role in pancreas specific expression (see also the appended examples). Such conserved sequences will be described in more detail below. It is also immediately evident to the person skilled in the art that further regulatory sequences may be added to the promoter of the invention. For example, transcriptional enhancers and/or sequences which allow for induced expression of the regulatory sequences of the invention may be employed. A suitable inducible system is for example tetracycline-regulated gene expression as described, e.g., by Gossen and Bujard (Proc. Natl. Acad. Sci. USA 89 (1992), 5547-5551) and Gossen et al. (Trends Biotech. 12 (1994), 58-62).
In one embodiment the regulatory sequence of the invention comprises a nucleotide sequence selected from the group consisting of
(a) the nucleotide sequence as depicted in Fig. 4 or (a) part(s) thereof;
(b) nucleotide sequences comprising nucleotides 1 to 1150 or nucleotides 42 to 1151 of the nucleotide sequence as depicted in Fig. 4;
(c) nucleotide sequences hybridizing with a nucleotide sequence as defined in (a) or (b) under stringent conditions; and
(d) nucleotide sequences comprising nucleotide sequences which are conserved in (a), (b) and (c).
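The sequence identity thresholds named above (at least 85%, more preferably 90-95%, most preferably 96-99%) can be illustrated on a pairwise alignment. The following is a minimal sketch only: the function names, the choice of the alignment length as denominator and the way the three ranges are turned into tiers are assumptions made for illustration and are not taken from the description.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two already aligned sequences ('-' marks a gap).

    Assumption: identity = matching columns / aligned columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the preference ranges of the text.

    Assumption: each tier is treated as a lower bound (85, 90, 96).
    """
    if pct >= 96.0:
        return "most preferred"
    if pct >= 90.0:
        return "more preferred"
    if pct >= 85.0:
        return "preferred"
    return "below 85%"


# Hypothetical 10-nucleotide alignment with a single mismatch.
example_pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(example_pct, identity_tier(example_pct))
```

Other definitions of identity (for example, excluding gapped columns from the denominator) give different values for the same alignment, so the definition used should be stated alongside any reported percentage.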
The nucleotide sequence as depicted in Fig. 4 is part of a construct, 406/Sal, containing 8 kb of the promoter region of the murine Pax6 gene. From deletion experiments of the construct the position of the regulatory sequence was deduced and found to be restricted to an area of 2 kb as shown in Fig. 1, corresponding to nucleotides 1 to 2443 of the nucleotide sequence of Fig. 4. Preferably, the regulatory sequence used to confer pancreas specific gene expression comprises a region of 1,100 bp between the restriction enzyme sites SpeI and HincII (see Fig. 2), corresponding to nucleotides 1 to 1100 of Fig. 4. In particular, a preferred embodiment of the pancreas specific regulatory element is located in a region of exactly 1109 bp between the SpeI site (position 42 bp) and the HincII site (1151 bp) as shown in Fig. 4. The potential exists to modify the regulatory sequence as depicted in Fig. 4 or sequence motifs thereof by, e.g., nucleotide replacements which do not affect the overall structure or binding motif of the regulatory sequence so that it remains capable of conferring expression in pancreatic cells.
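The nucleotide coordinates given above can be checked with a short sketch. It assumes that positions are 1-based and inclusive, which the text does not state explicitly, and it uses a synthetic placeholder of 2443 nucleotides because the sequence of Fig. 4 is not reproduced here; the function name `fragment` is likewise hypothetical.

```python
def fragment(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq (assumed 1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates outside the sequence")
    return seq[start - 1:end]


# Placeholder only: NOT the Fig. 4 sequence, just a string of the same length.
placeholder = ("ACGT" * 611)[:2443]

region_1_1100 = fragment(placeholder, 1, 1100)
region_42_1151 = fragment(placeholder, 42, 1151)

print(len(region_1_1100))   # nucleotides 1 to 1100
print(len(region_42_1151))  # nucleotides 42 to 1151, both ends counted
print(1151 - 42)            # distance between the two positions
```

Under this convention, nucleotides 42 to 1151 span 1110 positions when both ends are counted, while the difference between the two positions is 1109, the figure quoted in the text.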
In a preferred embodiment of the invention said regulatory sequence comprises a nucleotide sequence comprising nucleotides 969 to 1086 of Fig. 4 (underlined sequence) or a corresponding nucleotide sequence of a Pax promoter or a Pax6 promoter other than the mouse promoter corresponding to said fragment. Sources of Pax promoters are known to the person skilled in the art and originate from different species like, for example, human, quail or fugu (pufferfish) and can be obtained by hybridization experiments. In accordance with the invention, a particular importance for pancreas tissue specificity is assigned to the above referenced nucleotide sequence.
Preferably, the nucleotide sequence comprising the element which confers pancreas specific gene expression comprises any one of the sequence motifs designated as motif C (CATTATTGT), motif D (TTTAATCCAATTATA) or Pbx1 (ATCAATCA) either alone or in combination. If all of these motifs are present, they are preferably in the order as shown in Fig. 11. Motifs C and D hereby represent homeodomain DNA binding sites and the Pbx1 consensus sequence might regulate Pax6 expression by direct binding of the transcription factor Pbx1 (Lu, Mol. Cell Biol. 15 (1995), 3786-3795).
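A candidate sequence can be screened for the three motifs with a simple exact-match search. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, only the forward strand is searched, and the arrangement of motifs in the toy sequence is arbitrary and does not reproduce the order shown in Fig. 11.

```python
# Motif sequences as given in the text.
MOTIFS = {
    "motif C": "CATTATTGT",
    "motif D": "TTTAATCCAATTATA",
    "Pbx1": "ATCAATCA",
}


def find_motifs(seq: str) -> dict:
    """Return 1-based start positions of every exact motif occurrence."""
    seq = seq.upper()
    hits = {}
    for name, motif in MOTIFS.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)          # convert to 1-based
            start = seq.find(motif, start + 1)   # allow overlapping hits
        hits[name] = positions
    return hits


# Hypothetical toy sequence built from the motifs plus arbitrary spacers.
toy = "GG" + "CATTATTGT" + "AAAC" + "TTTAATCCAATTATA" + "GC" + "ATCAATCA" + "TT"
print(find_motifs(toy))
```

Exact matching does not detect variants that differ at one or more positions; for those, an alignment-based comparison would be needed.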
In yet another embodiment of the invention, the nucleotide sequence comprising nucleotides 1354 to 1460 of the nucleotide sequence shown in Fig. 4 (the BglII/AccI sequence as shown in Fig. 10) is deleted from said regulatory sequences of the invention. The BglII/AccI element of the promoter region of Pax6 confers eye lens and cornea specific gene expression and deletion of this element is expected to enhance pancreas specific gene expression.
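The deletion of nucleotides 1354 to 1460 can be expressed as a simple string operation. As before, this is a sketch under the assumption of 1-based inclusive coordinates; the function name is hypothetical and the sequence is a synthetic placeholder of 2443 nucleotides, not the sequence of Fig. 4.

```python
def delete_region(seq: str, start: int, end: int) -> str:
    """Remove positions start..end (assumed 1-based, inclusive) from seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates outside the sequence")
    return seq[:start - 1] + seq[end:]


# Placeholder only: NOT the Fig. 4 sequence.
placeholder_seq = ("ACGT" * 611)[:2443]

deleted = delete_region(placeholder_seq, 1354, 1460)
removed = len(placeholder_seq) - len(deleted)
print(removed)  # number of nucleotides removed by the deletion
```

With both end positions counted, the deleted element comprises 107 nucleotides.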
In a preferred embodiment of the invention said regulatory sequence is derived from the Pax6 gene of mouse, human or fugu (pufferfish). The genomic sequence of the mouse Pax6 gene (Fig. 4) is derived from a λEMBL3A genomic library (phage clone gp52) generated from the mouse strain C57BL/6. Construct 1 was produced by subcloning a 3.7 kb EcoRI fragment and linearising it by a BglII partial digest, thereby allowing the insertion of a β-galactosidase reporter gene within the first exon (E0) of Pax6. Construct 1 produced little or no β-galactosidase in transgenic mice and was therefore elongated at the 5' end by introducing a 7 kb SalI fragment containing further upstream regulatory sequences (construct 2, 406/SalI, deposited with the DSMZ under accession No. DSM 11998). Construct 2 was then digested with the restriction enzymes KpnI/NotI to remove the prokaryotic vector sequence and microinjected into oocytes to generate transgenic mice.
As described above, the regulatory sequence of the present invention was originally obtained from mouse strain C57BL/6 by screening a genomic Lambda phage library (λEMBL3A) using a Pax6 cDNA probe described by Walther, Development 113 (1991), 1435. Phage clone gp52 contained the most upstream genomic sequence and was used for the promoter analysis experiments.
In another preferred embodiment said regulatory sequence is part of a recombinant DNA molecule. Such DNA molecules may further comprise additional genes such as insulin, glucagon, Pax4, Pax6, Isl1, Pdx1 or other pancreatic genes, suitable for example for expression or co-expression of the aforementioned genes in a specific and restricted manner in the pancreas.
In a further preferred embodiment the recombinant DNA molecule of the invention comprises a minimal promoter. A minimal promoter is for example:
(a) a TATA- or CAAT-box, preferably in conjunction with an Sp-1 dependent activator, or
(b) an initiator element (Ir) in conjunction with an Sp-1 dependent activator.
Preferably, said minimal promoter is a Pax6 derived minimal promoter, for example, comprising the motifs C, D and Pbx1 as shown in Figure 11 and/or a TATA- or CAAT-box.
In a preferred embodiment of the present invention, the regulatory sequence is operatively linked to a heterologous DNA sequence.
The term "heterologous" with respect to the DNA sequence being operatively linked to the regulatory sequence of the invention means that said DNA sequence is not naturally linked to the regulatory sequence of the invention. Expression of said heterologous DNA sequence comprises transcription of the DNA sequence, preferably into a translatable mRNA. Regulatory elements additionally required for ensuring expression in eukaryotic cells, preferably mammalian cells, are well known to those skilled in the art. They optionally comprise poly-A signals ensuring termination of transcription and stabilization of the transcript. Additional regulatory elements may include transcriptional as well as translational enhancers. Preferably said promoter is a minimal promoter, as indicated above. These promoters can be combined with the regulatory sequences of the invention in order to confer expression in pancreatic cells.
In a further preferred embodiment, the heterologous DNA sequence of the above-described recombinant DNA molecules encodes a peptide, protein, antisense RNA, sense RNA and/or ribozyme.
The recombinant DNA molecule of the invention can be used alone or as a part of a vector to express heterologous DNA sequences, which, e.g., encode proteins other than Pax6, in pancreatic cells for, e.g., gene therapy or as diagnostics of diseases related to the pancreas. The recombinant DNA molecule or vector containing the DNA sequence encoding a protein of interest is introduced into the cells which in turn produce the protein of interest. In this respect, it is also to be understood that the recombinant DNA molecule of the invention can be used for "gene targeting" and/or "gene replacement", for restoring a mutant gene or for creating a mutant gene via homologous recombination; see for example Mouellic, Proc. Natl. Acad. Sci. USA, 87 (1990), 4712-4716; Joyner, Gene Targeting, A Practical Approach, Oxford University Press.
Gene therapy, which is based on introducing therapeutic genes into cells by ex-vivo or in-vivo techniques is one of the most important applications of gene transfer. Suitable vectors and methods for in-vitro or in-vivo gene therapy are described in the literature and are known to the person skilled in the art; see, e.g., Giordano, Nature Medicine 2 (1996), 534-539; Schaper, Circ. Res. 79 (1996), 911-919; Anderson, Science 256 (1992), 808-813; Isner, Lancet 348 (1996), 370-374; Muhlhauser, Circ. Res. 77 (1995), 1077-1086; Wang, Nature Medicine 2 (1996), 714-716; Anderson, Nature 392 Supp. (1998), 25-30; Verma, Nature 389 (1997), 239-242; WO94/29469; WO 97/00957 or Schaper, Current Opinion in Biotechnology 7 (1996), 635-640, and references cited therein. Delivery of nucleic acids to a specific site in the body for gene therapy or antisense therapy may also be accomplished using a biolistic delivery system, such as that described by Williams (Proc. Natl. Acad. Sci. USA 88 (1991), 2726-2729).
Standard methods for transfecting cells with recombinant DNA are well known to those skilled in the art of molecular biology, see, e.g., WO 94/29469. Gene therapy and antisense therapy of pancreatic diseases may be carried out by directly administering the recombinant DNA molecule or vector of the invention to a patient or by transfecting pancreatic cells with the recombinant DNA molecule or vector of the invention ex vivo and infusing the transfected cells into the patient. Furthermore, research pertaining to gene transfer into cells of the germ line is one of the fastest growing fields in reproductive biology. Suitable vectors and methods for in-vitro or in-vivo gene therapy are described in the literature and are known to the person skilled in the art; see, e.g., WO94/29469, WO 97/00957, Anderson (1998), loc. cit. or Schaper (1996), loc. cit. and references cited therein. It is to be understood that the introduced recombinant DNA molecules and vectors of the invention express the heterologous DNA sequence after introduction into said cell and preferably remain in this status during the lifetime of said cell. For example, cell lines which stably express the heterologous DNA under the control of the regulatory sequence of the invention may be engineered according to methods well known to those skilled in the art. Rather than using expression vectors which contain viral origins of replication, host cells can be transformed with the recombinant DNA molecule or vector of the invention and a selectable marker, either on the same or separate vectors. Following the introduction of foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched medium and are then switched to a selective medium. The selectable marker in the recombinant plasmid confers resistance to the selection and allows for the selection of cells that have stably integrated the plasmid into their chromosomes and grow to form foci, which in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines which express the heterologous DNA sequence under the control of the regulatory sequence of the invention. Such engineered cell lines are particularly useful in screening compounds capable of modulating gene expression in pancreatic cells. Such compounds can be, for example, small molecules as described in Gottesfeld, Nature 387 (1997), 202-205.
A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler, Cell 11 (1977), 223), hypoxanthine-guanine phosphoribosyltransferase (Szybalska, Proc. Natl. Acad. Sci. USA 48 (1962), 2026), and adenine phosphoribosyltransferase (Lowy, Cell 22 (1980), 817) in tk⁻, hgprt⁻ or aprt⁻ cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler, Proc. Natl. Acad. Sci. USA 77 (1980), 3567; O'Hare, Proc. Natl. Acad. Sci. USA 78 (1981), 1527); gpt, which confers resistance to mycophenolic acid (Mulligan, Proc. Natl. Acad. Sci. USA 78 (1981), 2072); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, J. Mol. Biol. 150 (1981), 1); hygro, which confers resistance to hygromycin (Santerre, Gene 30 (1984), 147); or puromycin (pat, puromycin N-acetyl transferase). Additional selectable genes have been described, for example, trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman, Proc. Natl. Acad. Sci. USA 85 (1988), 8047); and ODC (ornithine decarboxylase), which confers resistance to the ornithine decarboxylase inhibitor, 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue, 1987, In: Current Communications in Molecular Biology, Cold Spring Harbor Laboratory ed.). On the other hand, the person skilled in the art may also use the regulatory sequences of the invention to "knock out" an endogenous gene comprising identical or similar regulatory sequences, for example, by gene targeting, cosuppression, triple helix, antisense or ribozyme technology. The regulatory sequences, recombinant DNA molecules and vectors of the invention may be designed for direct introduction or for introduction via liposomes or viral vectors (e.g. adenoviral, retroviral) into the cell. Preferably, said cell is a germ line cell, embryonic cell, or egg cell or derived therefrom.
The regulatory sequences of the invention may also be used in gene therapy to treat diseases such as for example type I and II diabetes, insulinomas and glucagonomas.
The regulatory sequences of the invention can be operatively linked to sequences encoding cellular growth factors capable of stimulating or inducing differentiation and proliferation of normal β-cells in the case of diabetes or cell death initiating factors capable of inhibiting the growth of tumorous cells present in pancreatic cancers.
In a particularly preferred embodiment of the present invention, said protein is selected from the group consisting of insulin, glucagon, Pax4, Pax6, Pdx1, Isl1 and other proteins present in the pancreas.
The sequences encoding the human insulin gene, Pax4, Pax6, Pdx1, Isl1 and other genes are advantageously operatively linked to the regulatory sequences of the invention and expressed in the cells. Other heterologous proteins indicated above, e.g., proteins which may be specifically expressed in pancreatic cells to ensure the delivery of therapeutic peptides to the insulin producing β-cells, may also be operatively linked to the regulatory sequences of the invention.
In another particularly preferred embodiment of the invention, said protein is a scorable marker, preferably luciferase, green fluorescent protein or β-galactosidase (lacZ). This embodiment is particularly useful for simple and rapid screening methods for compounds and substances described herein below capable or expected to be capable of modulating the pancreas specific gene expression. For example, pancreatic cells can be cultured in the presence and absence of a candidate compound in order to determine whether the compound affects the expression of genes which are under the control of regulatory sequences of the invention, which can be measured, e.g., by monitoring the expression of the above-mentioned marker. It is also immediately evident to those skilled in the art that other marker genes may be employed as well, encoding, for example, a selectable marker which provides for the direct selection of compounds which induce or inhibit the expression of said marker.
The regulatory sequences of the invention may also be used in methods of antisense therapy. Antisense therapy may be carried out by administering to an animal or a human patient, a recombinant DNA containing the regulatory sequences of the invention operably linked to a DNA sequence, i.e., an antisense template which is transcribed into an antisense RNA. The antisense RNA may be a short (generally at least 10, preferably at least 14 nucleotides, and optionally up to 100 or more nucleotides) nucleotide sequence formulated to be complementary to a portion of a specific mRNA sequence and/or DNA sequence of the gene of interest. Standard methods relating to antisense technology have been described (Melani, Cancer Res. 51 (1991), 2897-2901). Following transcription of the DNA sequence into antisense RNA, the antisense RNA binds to its target sequence within a cell, thereby inhibiting translation of the mRNA and down-regulating expression of the protein encoded by the mRNA. Such antisense therapy may be used to treat pancreatic diseases that are, for example, the result of Pax6 overexpression.
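The complementarity requirement for an antisense RNA can be sketched as a reverse-complement operation on a portion of the target mRNA. The function name and the 14-nucleotide target fragment below are hypothetical and chosen only for illustration; the minimum length of 10 nucleotides is the lower bound mentioned in the text.

```python
# Watson-Crick pairing in the RNA alphabet.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

MIN_LENGTH = 10  # "generally at least 10" nucleotides, per the text


def antisense_rna(target: str) -> str:
    """Return the RNA complementary to target, written 5' to 3'.

    The target may be given as RNA or as DNA (T is read as U).
    """
    rna = target.upper().replace("T", "U")
    if len(rna) < MIN_LENGTH:
        raise ValueError("target portion shorter than 10 nucleotides")
    try:
        return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))
    except KeyError as err:
        raise ValueError(f"unexpected base: {err.args[0]}") from None


# Hypothetical 14-nucleotide portion of a target mRNA.
target_fragment = "AUGGCUAGCAAGGU"
print(antisense_rna(target_fragment))
```

In a recombinant DNA molecule the antisense template would be the DNA from which this RNA is transcribed; the sketch only illustrates the base-pairing relationship between the antisense RNA and its target.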
In a most preferred embodiment of the present invention, said antisense RNA or said ribozyme is directed against a gene involved in the development of the pancreas.
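The complementarity requirement for an antisense RNA described above can be sketched as follows. The target segment used here is a hypothetical placeholder, the function names are assumptions made for illustration, and the length limits simply restate the figures given above (at least 10, preferably at least 14 nucleotides).

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def antisense_rna(mrna_segment):
    """Return the reverse complement (5'->3') of an mRNA segment as RNA."""
    seq = mrna_segment.upper().replace("T", "U")  # accept DNA-style input
    unknown = set(seq) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def meets_length(seq, minimum=10, preferred=14):
    """Report whether a sequence reaches the minimum and the preferred length."""
    return len(seq) >= minimum, len(seq) >= preferred


# Hypothetical 15-nucleotide target segment, used only as an example.
target = "AUGCAGAACAGUCAC"
antisense = antisense_rna(target)
print(antisense, meets_length(antisense))
```

The sketch covers sequence complementarity only; target accessibility, specificity and stability of the transcript are not modelled.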
In a further embodiment, the invention relates to nucleic acid molecules of at least 15 nucleotides in length hybridizing specifically with a regulatory sequence as described above or with a complementary strand thereof. Specific hybridization occurs preferably under stringent conditions and implies no or very little cross-hybridization with nucleotide sequences having no or substantially different regulatory properties. The detection of only specifically hybridizing sequences will usually require stringent hybridization and washing conditions such as 0.1xSSC, 0.1% SDS at 65°C. Said nucleic acid molecules may be used as probes and/or for the control of gene expression. Nucleic acid probe technology is well known to those skilled in the art who will readily appreciate that such probes may vary in length. Preferred are nucleic acid probes of 17 to 35 nucleotides in length. Of course, it may also be appropriate to use nucleic acids of up to 100 and more nucleotides in length. The nucleic acid probes of the invention are useful for various applications. On the one hand, they may be used as PCR primers for amplification of regulatory sequences according to the invention. Another application is the use as a hybridization probe to identify
regulatory sequences hybridizing to the regulatory sequences of the invention by homology screening of genomic DNA libraries. Nucleic acid molecules according to this preferred embodiment of the invention which are complementary to a regulatory sequence as described above may also be used for repression of expression of a gene comprising such regulatory sequences, for example due to an antisense or triple helix effect or for the construction of appropriate ribozymes (see, e.g., EP-B1 0 291 533, EP-A1 0 321 201, EP-A2 0 360 257) which specifically cleave the (pre)-mRNA of a gene comprising a regulatory sequence of the invention. Selection of appropriate target sites and corresponding ribozymes can be done as described for example in Steinecke, Ribozymes, Methods in Cell Biology 50, Galbraith et al. eds Academic Press, Inc. (1995), 449-460. Furthermore, the person skilled in the art is well aware that it is also possible to label such a nucleic acid probe with an appropriate marker for specific applications, such as for the detection of the presence of a regulatory sequence or recombinant DNA molecule of the invention in a sample derived from an organism.
The above described nucleic acid molecules may either be DNA or RNA or a hybrid thereof. Furthermore, said nucleic acid molecule may contain, for example, thioester bonds and/or nucleotide analogues, commonly used in oligonucleotide anti-sense approaches. Said modifications may be useful for the stabilization of the nucleic acid molecule against endo- and/or exonucleases in the cell. Said nucleic acid molecules may be transcribed by an appropriate vector containing a chimeric gene which allows for the transcription of said nucleic acid molecule in the cell. Such nucleic acid molecules may further contain ribozyme sequences which specifically cleave the (pre)-mRNA comprising the regulatory sequence of the invention.
The present invention also relates to vectors, particularly plasmids, cosmids, viruses and bacteriophages used conventionally in genetic engineering that comprise a regulatory sequence or a recombinant DNA molecule of the invention. Preferably, said vector is an expression vector and/or a targeting vector. Expression vectors derived from viruses such as retroviruses, vaccinia virus, adeno-associated virus, herpes viruses, or bovine papilloma virus, may be used for delivery of the recombinant DNA molecule or vector of the invention into targeted cell populations. Methods which are well known to those skilled in the art can be used to construct recombinant viral vectors; see, for example, the techniques described in Sambrook,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory (1989) N.Y. and Ausubel, Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.Y. (1989). Alternatively, the recombinant DNA molecules and vectors of the invention can be reconstituted into liposomes for delivery to target cells.
The present invention furthermore relates to host cells transformed with a regulatory sequence, a DNA molecule or vector of the invention. Said host cell may be a prokaryotic or eukaryotic cell. The regulatory sequence, vector or recombinant DNA molecule of the invention which is present in the host cell may either be integrated into the genome of the host cell or it may be maintained extrachromosomally. The host cell can be any prokaryotic or eukaryotic cell, such as a bacterial, insect, fungal, plant, animal or human cell. Preferred fungal cells are, for example, those of the genus Saccharomyces, in particular those of the species S. cerevisiae. Suitable mammalian cell lines comprise Saos-2 human osteosarcoma cells (ATCC HTB-85), HeLa human epidermoid carcinoma cells (ATCC CRL-7923), HepG2 human hepatoma cells (ATCC HB-8065), human fibroblasts (ATCC CRL-1634), U937 human histiocytic lymphoma cells (ATCC CRL-7939), RD human embryonic rhabdomyosarcoma cells (ATCC CCL-136), MCF7 human breast adenocarcinoma cells (ATCC HTB-22), JEG-3 human choriocarcinoma cells (ATCC HB36), A7r5 fetal rat aortic smooth muscle cells (ATCC CRL-1444), NIH 3T3 mouse fibroblasts (ATCC CRL-1658), HEP 3B (ATCC HB 8064), C6 (ATCC CCL 107) and GS 9L, obtainable from the American Type Culture Collection. Primary-culture HUVEC may be obtained from Clonetics Corp. (San Diego, CA) and can be grown in EGM medium containing 2% fetal calf serum (Clonetics). Said host cell can also be a pancreatic cell or a cell derived therefrom, or a primary cell, tumor cell, spheroid cell, aggregate cell, stem cell or a differentiated cell, although any other animal cell, preferably a mammalian cell, may be appropriate as well.
Moreover, the present invention relates to a composition, preferably a pharmaceutical composition comprising at least one of the aforementioned regulatory sequences, recombinant DNA molecules or vectors of the invention, either alone or in combination, and optionally a pharmaceutically acceptable carrier or excipient. Examples of suitable pharmaceutical carriers are well known in the art and include
phosphate buffered saline solutions, water, emulsions, such as oil/water emulsions, various types of wetting agents, sterile solutions etc. Compositions comprising such carriers can be formulated by well known conventional methods. Suitable excipients are equally well known and include water, gelatin, starch, magnesium stearate, talc, vegetable oils and the like. These pharmaceutical compositions can be administered to the subject at a suitable dose. Administration of the suitable compositions may be effected by different ways, e.g., by intravenous, intraperitoneal, subcutaneous, intramuscular, topical or intradermal administration. The dosage regimen will be determined by the attending physician and other clinical factors. As is well known in the medical arts, dosages for any one patient depend upon many factors, including the patient's size, body surface area, age, the particular compound to be administered, sex, time and route of administration, general health, and other drugs being administered concurrently. Generally, the regimen as a regular administration of the pharmaceutical composition should be in the range of 1 μg to 10 mg units per day. If the regimen is a continuous infusion, it should also be in the range of 1 μg to 10 mg units per kilogram of body weight per minute, respectively. Progress can be monitored by periodic assessment. Dosages will vary but a preferred dosage for intravenous administration of DNA is from approximately 10^6 to 10^22 copies of the DNA molecule. The compositions of the invention may be administered locally or systemically. Administration will generally be parenteral, e.g., intravenous; DNA may also be administered directly to the target site, e.g., by biolistic delivery to an internal or external target site or by catheter to a site in an artery.
It is envisaged by the present invention that the various recombinant DNA molecules and vectors of the invention are administered either alone or in any combination using e.g. appropriate gene delivery systems, and optionally together with an appropriate compound, for example insulin, and/or together with a pharmaceutically acceptable carrier or excipient. Subsequent to administration, said recombinant DNA molecules may be stably integrated into the genome of the mammal. On the other hand, viral vectors may be used which are specific for certain cells or tissues, preferably for pancreatic cells and persist in said cells. The pharmaceutical compositions prepared according to the invention can be used for the prevention or treatment or delaying of different kinds of diseases, which are related to the expression or overexpression of genes in pancreatic cells.
Furthermore, it is possible to use a pharmaceutical composition of the invention which comprises a regulatory sequence, recombinant DNA molecule or vector of the invention in gene therapy. Gene therapy approaches in connection with the recombinant DNA molecule of the invention have already been discussed herein above. In addition, in connection with the pharmaceutical composition of the invention, the following should be noted: suitable pharmaceutical compositions may include liposomes, receptor-mediated delivery systems, naked DNA, and viral vectors such as herpes viruses, retroviruses, adenoviruses, and adeno-associated viruses, among others. The pharmaceutical compositions according to the invention can be used for the treatment of diseases hitherto unknown as being related to pancreas related gene expression. An embryonic cell can be for example an embryonic stem cell as described in, e.g., Nagy, Proc. Natl. Acad. Sci. 90 (1993) 8424-8428. Further applications of the pharmaceutical composition of the invention as well as general and specific methods of producing the active ingredients thereof, in particular for gene therapy purposes, have been described herein above.
The present invention also relates to diagnostic compositions comprising at least one of the aforementioned nucleic acid molecules, recombinant DNA molecules or vectors, and, optionally, suitable means for detection. Said compositions may further contain compounds such as further plasmids, antibiotics and the like for screening transgenic animals and/or animal cells useful for the genetic engineering of non-human animals, preferably mammals and most preferably mice. The diagnostic compositions of the invention may be used for methods of detecting and isolating regulatory sequences which are functionally equivalent to regulatory sequences of the invention capable of modulating gene expression in pancreatic cells.
The present invention also relates to a method for the production of a transgenic non-human animal, preferably a transgenic mouse, comprising the introduction of a recombinant DNA molecule or vector of the invention into a germ cell, an embryonic cell or an egg or a cell derived therefrom. The non-human animal to be used as a source for the production of the transgenic animal in the method of the invention may be a non-transgenic healthy animal, or may have a disease or disorder, preferably a pancreatic disease, such as types I and II diabetes, insulinomas and glucagonomas.
Said disease or disorder may be an inborn insufficiency or naturally developed or caused by genetic engineering, for instance by the expression of a DNA sequence encoding a protein involved in a pancreatic disease, preferably under the control of the regulatory sequence of the invention.
The invention also relates to transgenic non-human animals such as transgenic mice, rats, hamsters, dogs, monkeys, rabbits or pigs comprising a recombinant DNA molecule or vector of the invention or obtained by the method described above, preferably wherein said recombinant DNA molecule is stably integrated into the genome of said non-human animal, preferably such that the presence of said recombinant DNA molecule or vector leads to the transcription and/or expression of the heterologous DNA sequence by the regulatory sequence of the invention.
With the regulatory sequences of the invention, it is now possible to study in vivo pancreas specific gene expression. Furthermore, since pancreatic cell specific gene expression has different patterns in different stages of physiological and pathological conditions, it is now possible to determine further regulatory sequences which may be important for the up- or down-regulation of pancreatic cell gene expression, for example in specific tumors. In addition, it is now possible to study in vivo mutations which affect different functional or regulatory aspects of specific gene expression in pancreatic cells.
The in vivo studies referred to above will be suitable to further broaden the knowledge on the mechanisms involved in pancreatic diseases. To date, it is known that type I diabetes is the result of partial or total destruction of the insulin-producing β-cells by immunological, chemical or viral factors. Nonetheless, a small population of cells within the pancreas is thought to be capable of regenerating the β-cell population. Certain genes have been shown to play a major role in β-cell neogenesis (Jonsson, Nature 371 (1994), 606; Ahlgren, Nature 385 (1997), 257; Sosa-Pineda, Nature 386 (1997), 399; St-Onge, Nature 387 (1997), 406). Expression of these genes under the control of the regulatory sequences of the present invention in diabetic pancreata will allow or contribute to the understanding of the function of each of the mentioned genes.
The present invention further relates to a method for the identification of an agonist/activator and/or an antagonist/inhibitor of genes or gene products involved in pancreatic diseases comprising the steps of:
(a) providing an animal or human cell or tissue, or a non-human animal comprising a recombinant DNA molecule comprising a readout system operatively linked to at least one regulatory sequence capable of mediating or regulating pancreas specific expression of said readout system, wherein said regulatory sequence is preferably a regulatory sequence of the invention;
(b) culturing said animal or human cell, or tissue or maintaining said non-human animal in the presence of a compound or a sample comprising a plurality of compounds under conditions which permit expression of said readout system; and
(c) identifying or verifying a sample and compound, respectively, which leads to suppression or activation and/or enhancement of expression of said readout system in said animal or human cell, or tissue, or non-human animal.
The term "read out system" in context with the present invention means a DNA sequence which upon transcription and/or expression in a cell, tissue or organism provides for a scorable and/or selectable phenotype. Such read out systems are well known to those skilled in the art and comprise, for example, recombinant DNA molecules as described above.
The term "plurality of compounds" as used to describe the method of the invention is to be understood as a plurality of substances which may or may not be identical. Said plurality of compounds may be comprised in, for example, samples, e.g., cell extracts from, e.g., plants, animals or microorganisms. Furthermore, said compounds may be known in the art but hitherto not known to be capable of suppressing or activating and/or enhancing the transcription of a pancreas-specific expressed gene. The plurality of compounds may be, e.g., added to the culture medium or injected into or fed to the animals.
In a preferred embodiment the method of the invention further comprises the step of
(d) identifying and/or isolating from the identified sample the compound responsible for said suppression or activation and/or enhancement of expression of said readout system in said animal or human cell, or tissue, or non-human animal.
In a more preferred embodiment the method of the invention further comprises the step of
(e) determining whether said sample or compound mimics, enhances or suppresses the cellular effects of the Pax6 protein.
In an even more preferred embodiment, the method of the invention further comprises the step of
(f) subdividing the samples identified in step (c) and repeating steps (a) to (c) one or more times.
If a sample containing a plurality of compounds is identified in the method of the invention, then it is possible, in a further step (d) as described herein above, to isolate the compound from the original sample identified as containing the compound capable of suppressing or activating and/or enhancing the transcription of the read out system, e.g. a pancreatic cell-specific expressed gene, in an animal or human cell, or tissue or non-human animal. It can then be determined, in step (e) as described above, whether said sample or compound mimics or suppresses the cellular effects of the Pax6 protein, for example the differentiation of glucagon-producing α-cells in the pancreas of the mouse as described in St-Onge, Nature 387 (1997), 406-409. Depending on the complexity of the samples, the steps described above can be performed several times, preferably until the sample identified according to the method of the invention comprises only a limited number of substances or only one substance. Accordingly, one can further, as step (f), subdivide the samples identified in step (c) and repeat steps (a) to (c) one or more times. Therefore, one can further subdivide the original sample, for example, if it consists of a plurality of different compounds, so as to reduce the number of different substances per sample and repeat the method with the subdivisions of the original sample. Preferably said sample comprises substances of similar chemical and/or physical properties, and most preferably said substances are identical.
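The repeated subdivision of an identified sample can be sketched as a simple recursive procedure. The assay function, the compound names and the halving strategy are hypothetical assumptions made for illustration; the sketch further assumes that a sample scores positive only if it contains at least one compound that is active on its own, i.e. it ignores synergistic or masking effects between compounds.

```python
def deconvolve(sample, is_active):
    """Subdivide an active sample until only single active compounds remain.

    sample    -- list of compound identifiers pooled in one sample
    is_active -- callable standing in for steps (a) to (c); returns True if
                 the sample changes expression of the readout system
    """
    if not is_active(sample):
        return []                      # discard samples without effect
    if len(sample) == 1:
        return list(sample)            # a single active compound is left
    mid = len(sample) // 2             # subdivide the sample, cf. step (f)
    return deconvolve(sample[:mid], is_active) + deconvolve(sample[mid:], is_active)


# Mock assay: two hypothetical compounds are defined as active.
ACTIVE = {"c3", "c7"}


def mock_assay(sample):
    return any(compound in ACTIVE for compound in sample)


library = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
hits = deconvolve(library, mock_assay)
print(hits)
```

Halving is only one possible way of subdividing; samples could equally be split by chemical or physical properties, as preferred above.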
The method of the invention further comprises, optionally, the step of
(g) improving the desired characteristics of the compound identified in any one of steps (c), (d), (e) or (f) described herein above.
Methods for improving the characteristics of a compound are well known in the art and comprise, inter alia, computer-aided design, peptidomimetics and random mutagenesis. Compounds isolated by the above methods may therefore also serve as lead compounds for the development of e.g. analog compounds. The analogs should have a stabilized electronic configuration and molecular conformation. Identification of analog compounds can be performed through use of techniques such as self-consistent field (SCF) analysis, configuration interaction (CI) analysis and normal mode dynamics analysis. Computer programs for implementing these techniques are available; e.g., Rein, Computer-Assisted Modeling of Receptor-Ligand Interactions (Alan Liss, New York, 1989). Methods for the preparation of chemical derivatives and analogues are well known to those skilled in the art and are described in, for example, Beilstein, Handbook of Organic Chemistry, Springer edition New York Inc., 175 Fifth Avenue, New York, N.Y. 10010 U.S.A. and Organic Synthesis, Wiley, New York, USA. Furthermore, said derivatives and analogues can be tested for their effects according to methods known in the art.
Compounds which can be tested for in accordance with the present invention include peptides, proteins, nucleic acids, antibodies, small organic compounds, ligands, hormones, compounds produced by peptidomimetics, PNAs and the like. Said compounds can also be functional derivatives or analogues of known inhibitors or activators. Methods for the preparation of chemical derivatives and analogues are well known to those skilled in the art and are described in, for example, Beilstein, Handbook of Organic Chemistry, Springer edition New York Inc., 175 Fifth Avenue, New York, N.Y. 10010 U.S.A. and Organic Synthesis, Wiley, New York, USA. Said derivatives and analogues can be tested for their effects according to methods known in the art or as described, for example, in the appended examples. Furthermore, peptidomimetics and/or computer aided design of appropriate derivatives and analogues can be used, for example, according to the methods described below and said derivatives or analogues employed in the method of the invention to identify the desired compounds.
Nucleic acids useful in the method of the invention comprise DNA or RNA or hybrids thereof. Furthermore, said nucleic acid may contain, for example, thioester bonds and/or nucleotide analogues, commonly used in oligonucleotide anti-sense
approaches. These modifications may be useful for the stabilization of the nucleic acid molecule against endo- and/or exonucleases in the cell. Furthermore, the so-called "peptide nucleic acid" (PNA) technique can be used for the detection or inhibition of the expression of genes or gene products involved in pancreatic diseases by referring to the above identified method of the invention. For example, the binding of PNAs to complementary as well as various single stranded RNA and DNA nucleic acid molecules can be systematically investigated using, e.g., thermal denaturation and BIAcore surface-interaction techniques (Jensen, Biochemistry 36 (1997), 5072-5077). The synthesis of PNAs can be performed according to methods known in the art, for example, as described in Koch, J. Pept. Res. 49 (1997), 80-88; Finn, Nucleic Acids Research 24 (1996), 3357-3363. Furthermore, folding simulations and computer redesign of structural motifs of target proteins involved in pancreatic diseases can be performed using appropriate computer programs (Olszewski, Proteins 25 (1996), 286-299; Hoffman, Comput. Appl. Biosci. 11 (1995), 675-679). Computers can be used for the conformational and energetic analysis of detailed protein models (Monge, J. Mol. Biol. 247 (1995), 995-1012; Renouf, Adv. Exp. Med. Biol. 376 (1995), 37-45). In particular, the appropriate programs can be used for the identification of interactive sites of a putative inhibitor and the target protein involved in pancreatic diseases by computer-assisted searches for complementary structural motifs (Fassina, Immunomethods 5 (1994), 114-120). Further appropriate computer systems for the computer aided design of proteins and peptides are described in the prior art, for example in Berry, Biochem. Soc. Trans. 22 (1994), 1033-1036; Wodak, Ann. N. Y. Acad. Sci. 501 (1987), 1-13; Pabo, Biochemistry 25 (1986), 5987-5991.
The results obtained from the above-described computer analysis can be used in combination with the method of the invention for, e.g., optimizing known inhibitors of pancreatic diseases. Such pseudopeptide analogues of the natural amino acid sequence of the peptide may very efficiently mimic the parent molecule (Benkirane, J. Biol. Chem. 271 (1996), 33218-33224). Superactive peptidomimetic analogues of small peptide hormones in other systems are described in the prior art (Zhang, Biochem. Biophys. Res. Commun. 224 (1996), 327-331). Appropriate agonists/activators or inhibitors/antagonists can also be identified by the synthesis of peptidomimetic combinatorial libraries through successive amide alkylation and testing the resulting compounds, e.g., according to the methods described herein and in the appended examples. Methods for the generation and use of peptidomimetic combinatorial libraries are described in the prior art, for
example in Ostresh, Methods in Enzymology 267 (1996), 220-234 and Dorner, Bioorg. Med. Chem. 4 (1996), 709-715. Furthermore, a three-dimensional and/or crystallographic structure of inhibitors of the target proteins involved in pancreatic diseases can be used for the design of peptidomimetic inhibitors of the target proteins (Rose, Biochemistry 35 (1996), 12933-12944; Rutenber, Bioorg. Med. Chem. 4 (1996), 1545-1558).
Furthermore, antibodies specifically recognizing target proteins involved in pancreatic diseases or parts, i.e. specific fragments or epitopes, of such target proteins and thereby inactivating said protein may be employed. These antibodies can be monoclonal antibodies, polyclonal antibodies or synthetic antibodies as well as fragments of antibodies, such as Fab, Fv or scFv fragments etc. Antibodies or fragments thereof can be obtained by using methods which are described, e.g., in Harlow and Lane "Antibodies, A Laboratory Manual", CSH Press, Cold Spring Harbor, 1988 or EP-B1 0 451 216 and references cited therein. For example, surface plasmon resonance as employed in the BIAcore system can be used to increase the efficiency of phage antibodies which bind to an epitope of the target protein involved in pancreatic diseases (Schier, Human Antibodies Hybridomas 7 (1996), 97-105; Malmborg, J. Immunol. Methods 183 (1995), 7-13). In conclusion, the method of the invention (a) allows the identification (or verification) of compounds having the above referenced activating/enhancing or suppressing/inhibitory activities, and (b) optionally comprises the further step of improving the referenced feature(s) by, for example, peptidomimetics or computer programs.
In a preferred embodiment of the method of the invention said recombinant DNA molecule comprising said read out system is a recombinant DNA molecule of the invention or a vector of the invention as described in the embodiments hereinbefore.
In a further preferred embodiment of the method of the invention said animal or human cell, tissue or non-human animal is a cell, tissue or transgenic non-human animal of the invention described in the embodiments hereinbefore.
In a particularly preferred embodiment of the method of the invention said recombinant DNA molecule comprised in said animal or human cell, tissue or non-
human transgenic animal is introduced into the genome by transfection, transformation, electroporation, infection or particle bombardment.
Determining whether a compound is capable of suppressing or activating and/or enhancing the transcription of a pancreas-specific regulated gene can be done, for example, in mice by monitoring the reporter gene. It can further be done by monitoring the behavior or the health status of the transgenic non-human animals of the invention contacted with the compounds and comparing it to that of wild-type animals. In an additional embodiment, said behavior or health status may be compared to that of a transgenic non-human animal contacted with a compound which is either known to be capable or incapable of suppressing or activating and/or enhancing Pax6 gene expression and/or function of said transgenic non-human animal of the invention. The compounds identified according to the method of the invention are expected to be very beneficial since treatment of pancreatic diseases such as diabetes is restricted to daily injections of insulin and, most importantly, gene therapy as used so far is limited by the lack of tissue specificity of the regulatory sequences used in the targeting vectors available to date.
In summary, the present invention provides methods for identifying compounds which modulate pancreas specific gene expression. For example, activators or compounds found to enhance Pax6 expression may be used in the processes of regenerating, increasing or enhancing the insulin producing β-cell population in patients suffering from diabetes.
In contrast, antagonists or compounds found to downregulate Pax6 expression may be used in the treatment of insulinomas or glucagonomas where overexpression of endogenous Pax6 may induce normal cells to become tumorigenic. The above-mentioned compounds can also be used to treat patients who have, or have had, acute pancreatitis or surgical pancreatectomy where the β-cell population has been destroyed.
The invention also comprises a method wherein the compound identified in the method described hereinbefore or a compound whose desired features have been further improved as described above is formulated in a pharmaceutical composition. Therefore, the present invention furthermore relates to a method for the preparation of an agonist/activator and/or an antagonist/inhibitor of genes or gene products
involved in pancreatic diseases, comprising the steps (a), (b) and (c) as described herein above, wherein the method further optionally comprises any of the steps of (d), (e), (f) and/or (g) as described herein above and formulating the compound identified in any one of these later steps into a pharmaceutical composition. The compounds identified, obtained or improved according to the method of the present invention are expected to be very useful in diagnostic and therapeutic applications. Thus, in a further embodiment the invention relates to a compound obtained, identified or improved according to the method of the invention, said compound being an agonist/activator of Pax6 gene expression and/or function or an antagonist/inhibitor of Pax6 gene expression and/or function.
The therapeutically useful compounds identified or improved according to the method of the invention may be administered to a patient by any appropriate method for the particular compound, e.g., orally, intravenously, parenterally, transdermally, transmucosally, or by surgery or implantation (e.g., with the compound being in the form of a solid or semi-solid biologically compatible and resorbable matrix) at or near the site where the effect of the compound is desired. This method would apply most generally to patients suffering from diabetes where Pax6 activation by the compound would lead to a regeneration of insulin producing cells. Therapeutic doses are determined to be appropriate by one skilled in the art.
Such therapeutically useful compounds can be for example transacting factors which bind to the regulatory sequence of the invention. Identification of transacting factors is carried out using standard methods in the art (see, e.g., Sambrook, supra, and Ausubel, supra) or methods as described in the appended examples. To determine whether a protein binds to the regulatory sequences of the invention, standard DNA footprinting and/or native gel-shift analyses can be carried out. In order to identify a transacting factor which binds to the regulatory sequence of the invention, the regulatory sequence can be used as an affinity reagent in standard protein purification methods, or as a probe for screening an expression library. Once the transacting factor is identified, modulation of its binding to the regulatory sequences of the invention can be pursued, beginning with, for example, screening for inhibitors against the binding of the transacting factor to the regulatory sequences of the present invention or by applying mutagenesis techniques that would also affect the
active site or a site involved in the regulation of the factor. Activation or repression in connection with the goals of the present invention could then be achieved in a patient by administration of the transacting factor (or its inhibitor) or the gene encoding it, e.g. in a vector for gene therapy. In addition, if the active form of the transacting factor is a dimer, dominant-negative mutants of the transacting factor could be made in order to inhibit its activity. Furthermore, upon identification of the transacting factor, further components in the pathway leading to activation (e.g. signal transduction) or repression of a gene under the control of the regulatory sequences of the present invention can then be identified. Modulation of the activities of these components can then be pursued, in order to develop additional drugs and methods for modulating the expression of a gene under the control of the regulatory sequences of the present invention.
Besides the identification of transacting factors it is also immediately evident to the person skilled in the art that antibodies can be raised against the regulatory sequence of the invention or against the compounds identified according to the method of the present invention. Thus, the present invention also relates to an antibody specifically recognizing the compound of the present invention. Monoclonal antibodies can be prepared, for example, based on the techniques as originally described in Köhler and Milstein, Nature 256 (1975), 495, and Galfre, Meth. Enzymol. 73 (1981), 3, which comprise the fusion of mouse myeloma cells to spleen cells derived from immunized mammals. Furthermore, antibodies or fragments thereof to the aforementioned pancreas-specific expressed gene products can be obtained by using methods which are described, e.g., in Harlow and Lane "Antibodies, A Laboratory Manual", CSH Press, Cold Spring Harbor, 1988. These antibodies may be monoclonal antibodies, polyclonal antibodies or synthetic antibodies as well as fragments of antibodies, such as Fab, Fv, or scFv fragments etc.
Moreover the present invention relates to pharmaceutical and diagnostic compositions comprising the above-described compounds which are agonists/activators or antagonists/inhibitors and/or antibodies and optionally a pharmaceutically acceptable carrier or suitable means for detection, respectively; see supra.
Further, the present invention relates to the use of the regulatory sequence, the recombinant DNA molecule, vector, cell, pharmaceutical compositions, diagnostic compositions or a transgenic non-human animal of the invention for the identification of a chemical and/or biological substance capable of suppressing or activating and/or enhancing the transcription, expression and/or activity of pancreas-specific genes and/or its expression products.
In a preferred embodiment, the chemical or biological substance used in the methods and uses of the present invention is selected from the group consisting of peptides, proteins, nucleic acids, antibodies, small organic compounds, antibiotics, hormones, neural transmitters, compounds obtained by peptidomimetics, and PNAs (Milner, Nature Medicine 1 (1995), 879-880; Hupp, Cell 83 (1995), 237-245; Gibbs, Cell 79 (1994), 193-198 and references cited, supra).
In a further embodiment the present invention relates to the use of a regulatory sequence, a recombinant DNA molecule, vector, nucleic acid molecule of the invention, compound and/or antibody of the invention for the preparation of a composition for directing and/or preventing expression of genes specifically in the pancreas and/or for the preparation of a pharmaceutical composition for treating, preventing and/or delaying a pancreatic disease in a subject.
In a further embodiment, the present invention relates to the use of a recombinant DNA molecule, vector, nucleic acid molecule, compound and/or antibody of the invention for the preparation of a composition for inducing a pancreatic disease in a non-human animal or in a transgenic non-human animal. As mentioned before, the regulatory sequences of the invention can be used for generating transgenic animals that display a pancreatic disease such as types I and II diabetes, insulinomas and glucagonomas. For example, anti-sense RNA directed against Pax6 mRNA can be expressed under the control of said sequence, or knock-out animals can be generated via homologous recombination with a promoter fragment of the invention. The above described means can be formulated in compositions suitable for introduction into animal cells, preferably stem cells. Said compositions may comprise pharmaceutically acceptable carriers such as described before.
These and other embodiments are disclosed and encompassed by the description and examples of the present invention. For example, further literature concerning any one of the methods, uses and compounds to be employed in accordance with the present invention may be retrieved from public libraries, using for example electronic devices. For example the public database "Medline" may be utilized which is available on the Internet, for example under http://www.ncbi.nlm.nih.gov/PubMed/medline.html. Further databases and addresses, such as http://www.ncbi.nlm.nih.gov/, http://www.infobiogen.fr/, http://www.fmi.ch/biology/research_tools.html, http://www.tigr.org/, are known to the person skilled in the art and can also be obtained using, e.g., http://www.lycos.com. An overview of patent information in biotechnology and a survey of relevant sources of patent information useful for retrospective searching and for current awareness is given in Berks, TIBTECH 12 (1994), 352-364.
The pharmaceutical compositions, uses, methods of the invention can be used for the treatment of all kinds of diseases hitherto unknown as being related to or dependent on the modulation of the pancreas. The pharmaceutical compositions, methods and uses of the present invention may be desirably employed in humans, although animal treatment is also encompassed by the methods and uses described herein.
The figures show:
Figure 1: Pax6 promoter/reporter gene constructs for the identification of the control elements for the transgene expression in lens, cornea and the pancreas. The exact location of exon 0 (transcription start) on the restriction map is marked. The arrow indicates the transcription start site in exon 0 (E0). The cloned regions have been marked by horizontal lines. The identified 1100 nt cis-element for the pancreas is indicated by a box (A), while the 120 nt element for the lens and cornea is indicated by a box (B). Abbreviations: A, AccI; B, BglII; E, EcoRI; H, HindIII; Hc, HincII; N, NsiI; P, PstI; S, SalI; Sp, SpeI; X, XbaI. With the exception of construct 4 (TK1) all constructs are controlled by promoter E0.
Figure 2: Schematic summary of the regulatory region of the exon 0 promoter. The region marked "pancreas" surrounds the pancreas element restricted to 520 bp; the area marked "lens and cornea" shows the cornea and lens element restricted to 129 bp. The following restriction enzymes were used: A, AccI; B, BglII; E, EcoRI; H, HindIII; Hc, HincII; N, NsiI; P, PstI; S, SalI; Sp, SpeI; X, XbaI.
Figure 3: Histological section of pancreas from line 406/Sal32. A, C, Transversal sections. B, Sagittal section. D, Transversal section stained with an antibody specific for glucagon. E, Transversal section stained with an antibody specific for insulin. The earliest expression of the β-galactosidase reporter gene was detected in E9.0 embryos. This expression corresponded to the endogenous expression pattern of Pax6 (A). β-Galactosidase activity was also detected in E12.5 embryos (B) and in newborns/adult animals (C), where β-galactosidase expression was restricted to cells expressing either glucagon (D) or insulin (E).
Figure 4: Nucleotide sequence of the Pax6 promoter. Underlined sequence: this region has a high sequence homology to the genomic sequence of the Pax6 phage-DNA of Fugu rubripes (pufferfish). The location of the pancreas element is between the 5' SpeI restriction site and the 3' HincII restriction site.
Figure 5: Structure of the murine Pax6 transcripts. Arrows indicate the transcriptional start sites of the identified three transcripts, a, b, c. The translational start site (ATG, thin arrow) is located in exon 4. Exon 5, 6 and part of exon 7 contain the paired box, marked in black.
Figure 6: Sequence comparison between the mouse (M) and the quail (Q) exon sequences. (A, B) represent mouse RT-PCR products for transcript a and transcript b; (C) represents mouse genomic DNA sequence corresponding to exon α.
Figure 7: Developmental expression analysis of the lacZ reporter gene in a transgenic line carrying the construct 2 (406/Sal). (A-C, H-K) and (D-G, L-N) are views of embryos after whole-mount β-gal staining and after sectioning at the indicated stages, respectively.
(A-K) Expression in the ectodermal derivatives of the developing eye. The arrowhead in C points to a stream of lacZ positive cells that extend from the head mesenchyme over the anterior edge of the first branchial arch. The curved arrow in H points to the anlage of the duct of the lacrimal gland. Abbreviations used: C- cornea; Ec- surface head ectoderm; L- lens; LC- cavity of the lens vesicle; LF- lens fibres; LGI- lacrimal gland; LP- lens pit; PI- lens placode; Oco- and Oci- outer and inner layer of the optic cup, respectively; OS- optic stalk; V- vessels. (C, L-M) Expression in the developing pancreas. (M) and (N) show sections after β-gal whole mount staining and immunohistochemistry for detection of glucagon (M) or insulin (N). The arrows in M and N point to colocalization of the lacZ reporter expression with endocrine cells of the islets producing glucagon or insulin, respectively.
Figure 8: Transgene lacZ expression driven by the construct 12 (containing Fugu (pufferfish) control sequences) in mouse telencephalon (T), lens (L) and pancreas (P) at stage E12.5.
Transgene lacZ expression driven by construct 14 (containing mouse control sequences) in dorsal telencephalon (T), hindbrain (Hb) and spinal cord (SC) at stage E11.5.
Figure 9: Upper panel: Identification of the Pax6 cis-elements responsible for the transgene expression in neural retina. The arrow indicates the transcriptional start site in exon 0, exon 1 and exon α. The reporter constructs carry the transcriptional start point from the Pax6 gene of exon α (construct 15), of exon 0 (constructs 17, 8, 19, 20) or from the TK-gene (construct 16). The identified 530 nt cis-element controlling the transgene expression in the retina is indicated by a yellow box. Abbreviations used: A, AccI; B, BglII; E, EcoRI; D, DraIII; X, XbaI. Lower panel: Transgene lacZ expression in developing neural retina. (A-E) and (F) are views after whole mount lacZ staining in transgenic mice carrying the construct 17 (406/8) or the construct 18 (406/Fugu (pufferfish)), respectively. The arrows in (A-C) and (F) point to a region within the dorsal neural retina that appears negative, observed in mice carrying the mouse or Fugu (pufferfish) regulatory sequence, respectively. In (C) note the strong Pax6/lacZ expression in the iris (tr) of the eye at postnatal stage (P1) and the lack of signal in the dorsal domain (arrowheads). The curved arrow in (D, stage E13.5) points to a thin layer of lacZ positive cells, connecting the strongly positive nasal and temporal retinal domains, observed in few transgenic embryos. The arrowhead in (E) points to a strong β-gal staining in the inner nuclear layer of the neural retina. Abbreviations used: L- lens; NR- neural retina; PL- pigmental layer of retina; ON- optic nerve.
Figure 10: Sequence comparison of the conserved elements in mouse, human and fugu (pufferfish) Pax6 genomic DNA that control the expression in the eye tissues of head surface ectodermal origin. The minimal element which is necessary for driving the lacZ expression of the reporter gene in the lens and cornea is a 107 nt BglII/AccI fragment (boxed). Two potential homeobox binding sites are shown as DNA motif A and B.
Figure 11: Sequence comparison of the mouse element controlling the expression in the pancreas with the human and the Fugu (pufferfish) Pax6 genomic DNA reveals a fragment of 124 nt with high percentage identity. This sequence contains potential binding sites for Pbx1. Two possible motifs (C and D) for homeobox binding sites are boxed.
Figure 12: Sequence comparison of the retina element between mouse, human and Fugu (pufferfish) genomic region. This sequence contains potential binding sites for Msx1 and Pax2 and possible motifs for homeobox binding sites (motif E and motif I).
Figure 13: Scheme illustrating the localization of the identified control elements in the mouse Pax6 locus that control the expression of the gene in the pancreas (box A); lens, cornea, lacrimal gland, conjunctiva (box B); telencephalon, spinal cord and hindbrain (box C) and neural retina (box D).
Figure 14: Table showing transgene expression in transient and founder embryos.
The present invention is further illustrated by reference to the following non-limiting examples.
Unless stated otherwise in the examples, all recombinant DNA techniques are performed according to protocols as described in Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols.
The disclosure contents of the various documents cited in this specification are incorporated herein by reference.
The examples illustrate the invention.
Example 1 : Cloning of the Pax6-promoter
The regulatory sequence of the present invention was originally obtained from mouse strain C57BL/6 (commercially available at the Zentralinstitut für Versuchstierzucht, Hannover) by screening of a genomic Lambda phage library (λEMBL3A, obtainable from the laboratory of Prof. Sidney Brenner, MRC, Cambridge, UK), using a Pax6 cDNA probe described by Walther, Development 113 (1991). Phage clone gp52 contained the most upstream genomic sequence and was used for the promoter analysis experiments. A composite restriction map of part of the genomic region isolated is shown in Fig. 1.
Example 2: Characterization of the Pax6-promoter
To determine the elements which are responsible for the transcriptional activity of the 8 kb promoter fragment, a series of deletions was introduced into construct 2, 406/Sal. This construct (406/Sal) was deposited with DSMZ (accession number DSM 11998; Deutsche Sammlung von Mikroorganismen und Zellkulturen, Mascheroder Weg 1 b, D-38124 Braunschweig) on February 12th, 1998 in accordance with the Budapest Treaty. Construct 2 (406/Sal) originated from a 3.7 kb genomic EcoRI subclone which contains the first untranslated exon (exon 0, E0). This fragment was partially digested with BglII, which cuts in exon 0. A lacZ-polyA fragment (lacZ gene with its own ATG and SV40 polyadenylation signal) was inserted into the BglII site of this construct via blunt-end reaction (construct 1, 406). This construct did not show any Pax6-specific expression patterns. For this reason a 7 kb SalI subclone from the murine genomic Pax6 gene with further 5' upstream genomic sequence was digested with NsiI and KpnI and ligated to a 6.7 kb fragment of construct 406, obtained by linearisation with KpnI and partial digestion with NsiI, containing the minimal promoter of Pax6 and the lacZ fragment. The deletions were produced by removing 5' and 3' genomic sequences of construct 2 (406/Sal). Subsequently, transgenic mice were produced with this construct (Fig. 1). To this end, construct 2 was linearised with NotI and KpnI and the fragment was injected into the pronucleus of oocytes of FVB mice. Production of transgenic embryos and whole-mount β-galactosidase staining were performed as described by A. L. Joyner Ed., Gene Targeting, A Practical Approach (1993), Oxford University Press. The DNA of the embryonal membranes of the embryos was analyzed using Southern blots with a lacZ probe. After β-galactosidase staining the embryos were fixed overnight with 4% paraformaldehyde and, after washing with PBS, dehydrated with ethanol and embedded in paraffin. The cross and sagittal sections were counterstained with neutral red and the transgene expression pattern was analyzed by light microscopy.
Removal of 3 kb genomic sequence (construct 3, 406/SpeI) still leads to expression of the transgene in the lens, cornea and pancreas. A further deletion of 1151 bp at the 5' end (construct 5, 406/HincII) showed lacZ staining only in the lens and cornea, but no longer any expression in the pancreas. This leads to the conclusion that the regulatory element required for Pax6 expression in the pancreas is restricted to the region between the SpeI site and the HincII site of construct 2 (406/Sal).
A deletion of construct 2 at 3 kb in the 5' area (Fig. 1; construct 3, 406/Spe) did not change the transgenic expression pattern. From the results of this deletion experiment the position of the regulation elements was deduced and found to be restricted to an area of 2 kb (Fig.1).
The earliest transgenic activity in the pancreas could be determined on day 9.0 p.c. (Fig. 3) with the promoter fragment (Fig. 1; construct 2, 406/Sal), which mimics the endogenous Pax6 expression. β-Galactosidase activity could be observed in the pancreas throughout embryo development (Fig. 3). Strong expression of the transgene can also be demonstrated in newborn mice and in adult transgenic mice (6 weeks; Fig. 3).
The pancreas element, which lies at 5 kb upstream of the transcription start point of exon 0, is restricted to a region of 1100 bp defined by the deletion analysis described above.
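The logic of the 5' deletion analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinate system (base pairs 3' of the SpeI site, with the 5' end of construct 2 placed 3000 bp further upstream) and the names `Construct` and `bracket_element` are hypothetical and are not part of the original experiments; only the construct names, the 3 kb and 1151 bp deletion sizes and the observed pancreas expression are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    five_prime_end: int  # bp 3' of the SpeI site (hypothetical coordinates)
    pancreas: bool       # lacZ staining observed in the pancreas


def bracket_element(constructs):
    """Return (start, end) of the interval that a 5' deletion series
    shows to be required: it runs from the most 3' deletion end point
    that still expresses to the most 5' end point that no longer does."""
    last_positive = max(c.five_prime_end for c in constructs if c.pancreas)
    first_negative = min(c.five_prime_end for c in constructs if not c.pancreas)
    if first_negative <= last_positive:
        raise ValueError("deletion series is not consistent with a single element")
    return last_positive, first_negative


CONSTRUCTS = [
    Construct("construct 2 (406/Sal)", -3000, True),    # full 8 kb fragment
    Construct("construct 3 (406/SpeI)", 0, True),       # 3 kb removed
    Construct("construct 5 (406/HincII)", 1151, False), # further 1151 bp removed
]

start, end = bracket_element(CONSTRUCTS)
print(f"pancreas element required within {end - start} bp (SpeI to HincII)")
```

The interval obtained in this way corresponds to the SpeI/HincII region, which the description refers to as the approximately 1100 bp pancreas element.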
As evidence that the transgenic activity corresponds to the endogenous Pax6 expression in the pancreas, endocrine hormone production in the transgenic lines was examined. By using specific antibodies against either insulin or glucagon, β-galactosidase positive cells were shown to express either insulin or glucagon (Fig. 3). No ectopic expression could be detected in the exocrine tissue of the pancreas. The primary antibodies used in the analysis, mouse anti-insulin (Sigma) and mouse anti-glucagon (Sigma), were applied on paraffin sections after β-galactosidase staining and detected with a secondary horseradish peroxidase antibody as described by Sosa-Pineda, (1997) Nature 386, 399-402.
The regulatory elements and constructs are further described in the following examples:
Example 3: Further transgene construction for the identification of mouse Pax6 regulatory elements
These further constructs comprise:
Construct 4 (TK-1) was generated by cloning a 2.4 kb NotI-Asp718 fragment from the 7 kb SalI subclone mentioned above (see construct 2) into the vector pax-L680, which contains a minimal TK promoter and a lacZ-gene SV40-polyA cassette. Construct 5 (406/HincII) was generated by removing the SalI/HincII fragment from construct 2.
Constructs 6 - 9 were generated by ligating the following blunt-ended genomic DNA fragments into construct 11 (construct 11 as described below): HincII - EcoRI (construct 6), EcoRI - EcoRI (construct 7), AccI - AccI (construct 8), BglII - EcoRI (construct 9). Construct 8 contains two copies of the fragment in 5' - 3' orientation, while construct 9 has three copies in 3' - 5' orientation.
As a negative control, construct 10 containing the Pax6 minimal promoter P0 was generated by deleting the SalI-XbaI fragment from construct 2. This construct only contains the Pax6 minimal promoter P0 and the lacZ-polyA cassette. In construct 11 the promoter P0 was further shortened by deleting the upstream XbaI - BamHI fragment and subsequently the HincII - EcoRI fragment from construct 6 was subcloned into it in 5' to 3' orientation.
The construct 12/Fugu (pufferfish) (Fig. 8), carrying pufferfish control elements, was generated as follows. A 12 kb genomic SalI fragment was subcloned into pBSKS+. A blunt-ended IRES/lacZ/polyA fragment was ligated into the blunt-ended KpnI restriction site in the polylinker of the 12 kb genomic subclone. The IRES (internal ribosome entry site) was added upstream of the lacZ gene to facilitate its cap-independent translation.
For construct 13, the lacZ-poly A cassette was inserted into the BamHI site of exon 4. A total of 13 kb of upstream DNA sequence was added in multiple steps to generate the final construct 2118/14P (Fig. 8).
For construct 14, the lacZ-polyA cassette was inserted into the NarI site of exon 1. Subsequently, the 5 kb upstream genomic sequence was added (Fig. 8). Construct 15 was generated by inserting the lacZ-polyA cassette into the XbaI site of the exon α present in the 1.2 kb EcoRI - XbaI clone (Fig. 9).
Construct 16 (TK-2, Fig. 9) was generated by cloning a 1.8 kb genomic AccI fragment containing exon α into the vector pax-L680, which contains a minimal TK promoter and a lacZ-gene SV40-polyA cassette. To test further DNA sequences for the retina-specific element we used the minimal promoter P0 of Pax6 with the lacZ gene (see constructs 7-11). DNA sequences tested in the reporter transgenes (constructs 17, 19 and 20, Fig. 9) were isolated from the 1.8 kb AccI genomic fragment, end-filled using the Klenow fragment of DNA polymerase I and subcloned into the minimal promoter, oriented 5' to 3' with respect to the lacZ. The upstream DNA sequences contained in the constructs 17, 19 and 20 are AccI - AccI (1.4 kb), BglII - XbaI (0.6 kb) and DraIII - XbaI (0.29 kb), respectively.
To generate construct 18 (406/Fugu (pufferfish), Fig. 9) we used a 600 bp EcoRV/SalI fragment with genomic sequences from the pufferfish Pax6 locus, which was cloned 5' to the minimal promoter P0.
Transgenic sequences were always purified from vector sequences by appropriate restriction enzymes prior to microinjection.
Example 4: Cloning the pufferfish Pax6 locus
For the isolation of the fugu (pufferfish) Pax6 homologue a genomic lambda-DASHII library of the pufferfish (Fugu rubripes) was screened with a 320 bp EcoRI fragment of the murine Pax6 cDNA. Three Pax6 phage clones were isolated and subcloned for sequence analysis. The fugu (pufferfish) and mouse sequences were aligned with the programs BESTFIT and FASTA of the GCG package.
The following examples illustrate further the pancreas specificity of the regulatory sequences of the invention:
Example 5: Localization of three transcription start sites and sequence analysis of the Pax6 promoter regions in the mouse
In order to delineate the cis-essential elements required for the spatial and temporal activity of the Pax6 gene, it was attempted to localize the transcriptional start sites, assuming that at least some control elements are located 5' of these sites. The mouse Pax6 promoter region was identified using a combination of primer extension, RT-PCR and genomic DNA sequencing. To localize the transcription start sites, an 18 base primer (nt 113-91) complementary to the 5' end of the Pax6 cDNA (Walther et al., Genomics 11 (1991), 424-434) was used and two primer extension products of 400 nt and 600 nt in length were obtained, suggesting that the 5' end of the published Pax6 cDNA does not contain the initiation site for mRNA transcription. This is compatible with the Pax6 mRNA size of 3 kb (Walther et al., loc. cit.). Recently, two transcripts with alternative 5' UTRs of the Pax6 gene in quail (Pax-QNR), which are under the control of two promoters (P0, P1), have been described (Dozier et al., Cell Growth Diff. 4 (1993), 281-289; Plaza et al., Mol. Cell. Biol. 15 (1995), 3344-3353). By alignment of the murine genomic sequences with the corresponding areas of quail cDNA clones (Pax-QNR-I, Martin et al., Oncogene 7 (1992), 1721-1728; Pax-QNR-2 and QNR-B1, Dozier et al., loc. cit.) and a human cDNA clone (IHPx-2, Glaser et al., Nature Genetics 2 (1992), 232-239) three sequences with high homology at the 5' UTR regions were identified. The estimated nucleotide identity was about 70% for exon 0 (transcript a), 93% for exon 1 (transcript b) and 86% for exon α (transcript c, Fig. 5). The mouse exon 1 is located only 100 bp 5' of exon 2, the alternative 5' UTR (exon 0) is found 6.2 kb further upstream of exon 1, while exon α is located between exons 4 and 5 (Fig. 5). To determine whether the identified homologous regions are contained in authentic transcripts of the mouse Pax6 gene and whether alternative splicing occurs at the 5' end, RT-PCR experiments were performed.
The sequences of the RT-PCR products for transcript a and transcript b (Fig. 6) in mouse match the 5' UTR regions of the quail transcripts, indicating a conservation of the transcription start sites in the two species. Similarly, high homology has been detected for the mouse and the quail exon α after genomic DNA sequencing.
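For orientation, a percentage of nucleotide identity between two aligned sequences can be computed as sketched below. This is a minimal illustration and not the procedure used to obtain the values reported above: the function name `percent_identity`, the convention of ignoring gapped columns and the two short sequences are assumptions chosen for the example, and alignment programs may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical positions between two pre-aligned sequences.

    Columns in which either sequence carries a gap are ignored
    (an assumed convention; other definitions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up toy alignment, not Pax6 sequence data.
toy_mouse = "ATGCCGTA-A"
toy_quail = "ATGACGTAGA"
identity = percent_identity(toy_mouse, toy_quail)
print(f"{identity:.1f}% identity over ungapped columns")
```

In this toy alignment nine columns are compared and eight of them match.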
Sequence analysis of the upstream promoter region of exon 0 in the mouse and the quail (Plaza et al., Cell Growth and Diff. 4 (1993), 1041-1050) revealed a conserved TATA-like sequence (ATATTAA), a conserved CCAAT box as well as several putative transcriptional consensus sequences including a binding site for cAMP-response elements and a binding site for c-Myb. Furthermore, the upstream promoter region of exon 1 in mouse contains consensus sequences for various basal promoter elements, such as a conserved TATA-like sequence (AATATTT), three CCAAT boxes and consensus binding sites for Sp1 and Ap-2, which are also highly conserved in the Pax6 gene of the quail (Plaza et al., loc. cit.) and human (Xu and Saunders, J. Biol. Chem. 272 (1997), 3430-3436). However, no conserved TATA-like sequences and CCAAT boxes were found in the 5' UTR of exon α.
Example 6: Expression of Pax6 in ectodermal derivatives of the developing eye, in the pancreas and in the olfactory bulb is directed by a regulatory region located 5' from exon 0.
For the identification of regulatory elements which control the complex spatio-temporal expression of the mouse Pax6 gene in vivo, we generated transgenic mice using lacZ as a reporter gene. The first fusion construct 406 (construct 1, Fig. 1) contains 3 kb of sequence located upstream of exon 0. Injected embryos of generation 0 (F0) were examined for the presence of the transgene by the expression of β-galactosidase (β-gal) from embryonic day E10.5 to E12.5. Since the transgenic embryos showed either no or ectopic β-gal activity (Fig. 14, Table 1), this construct was elongated with a further 5 kb upstream fragment (construct 2, 406/SalI; Fig. 1). Of the 25 transgenic embryos analyzed at E12.5, five showed restricted reporter β-gal staining in the lens, the cornea and the pancreas. It was concluded that the regulatory element localized on this additional 5 kb fragment 5' to the first promoter P0 controls the Pax6 expression in these tissues. Knowing the complex expression of Pax6 in all tissues of the developing eye (Walther and Gruss, Development 113 (1991), 1435-1449), it was examined whether the reporter gene expression would remain restricted during embryogenesis only to the ectodermal (lens and cornea) eye derivatives, or whether it would extend to the retina and pigmental retinal layer or the neuroectodermal eye derivatives, where Pax6 is also expressed strongly. Therefore, 6 stable transgenic lines were established with construct 2 and the expression of the transgene in two of them was examined in detail from E8.0 until adult stage (Fig. 7; Fig. 14: Table 1).
Endogenous Pax6 mRNA is initially detected (E8.0-8.5) in a broad region of the head surface ectoderm, including the region from which the lens placode will develop (Walther and Gruss, loc. cit.; Li et al., Dev. Biol. 162 (1994), 181-194; Grindley, Development 121 (1995), 1433-1442). Later on, the expression is confined to the lens pit, the lens vesicle, the differentiating lens and also to the surface ectoderm forming the cornea. The expression of the reporter gene driven by construct 2 in the developing eye is illustrated in Fig. 7. The first expression of the transgene is detected at E9.0 in the surface ectoderm (Ec) over the presumptive eye region (Fig. 7A,D). At E9.5-E9.75 β-gal activity increases within the area of the presumptive lens placode (c, Fig. 7B) and at E10.5 the expression becomes confined to the forming lens pit (LP, Fig. 7C,E), presumptive corneal ectoderm (Fig. 7F) and a stream of cells that populates the anterior edge of the maxillary domain of the first branchial arch (arrowhead in Fig. 7C). Remarkably, no transgenic expression was found in the inner (Oci) and outer (Oco) layer of the invaginating optic cup (Fig. 7E-G). Similarly to the endogenous Pax6 expression (Macdonald and Wilson, Curr. Opin. Neurobiol. 6 (1996), 49-56), a strong reporter lacZ signal is observed at stage E12.5 in proliferating cells of the lens (L, Fig. 7F). One day later, when the differentiation of the lens fibres (LF, Fig. 7G) starts, the transgene activity starts to decline in these areas. At stage E13.5 the β-gal staining in the developing cornea (C, Fig. 7F-I) becomes more prominent. A further domain of transgenic activity was detected within the temporal orbita in a duct that will later form the lacrimal gland (LGI, Fig. 7I) and also has ectodermal origin. The transgenic expression in the lacrimal gland (Fig. 7J) and in the cornea (data not shown) was maintained one day after birth (P1). Consistent with the endogenous expression of Pax6 in conjunctiva (Koroma et al., Investigative Ophthalmology and Visual Science 38 (1997), 108-120), the conjunctival epithelium of the adult eye was also β-gal positive (data not shown).
Construct 2 (406/SalI) is also able to direct the Pax6 reporter gene expression in the pancreas (Fig. 7L-M). As previously reported, endogenous Pax6 is expressed in the developing pancreas (Walther and Gruss, loc. cit.) and Pax6 transcripts are detected in all four cell types (α, β, γ, δ) of pancreatic islet cells, but not in the exocrine cell lines (Turque et al., Mol. Endocrinol. 8 (1994), 929-938). At stage E9.5 β-gal staining appears in all transgenic lines in a subset of fore- and midgut cells (the pancreatic bud, P, Fig. 7L), similar to the endogenous Pax6 (Sander et al., Genes Dev. 11 (1997), 1662-1673) and at E10.5 the expression is clearly seen in the pancreas (P, Fig. 7C). At later stages and one day after birth (P1), the Pax6 reporter transgene is expressed throughout the entire endocrine pancreas (Fig. 7M,N). Double histostaining for lacZ and immunostaining for insulin (Fig. 7M) or glucagon (Fig. 7N) confirms the colocalization of the Pax6 promoter/lacZ expression with insulin-producing (β) and glucagon-producing (α) cells of the pancreatic islets. It is noteworthy that, apart from the expression in the developing lens and pancreas, two out of six transgenic lines that were analyzed in detail showed at stages E12.5 and E13.5 a very restricted lacZ expression within the anlage of the olfactory bulb (Fig. 7K), a region where Pax6 is also specifically expressed (Stoykova and Gruss, J. Neurosci. 14 (1994), 1395-1412).
Example 7: Distinct regulatory elements are necessary for the expression of Pax6 in eye ectodermal tissues and in the pancreas
To further delineate the cis-acting regulatory elements that specifically control the expression of Pax6 either in eye ectodermal tissues or in the pancreas, a detailed functional analysis of the positive 8 kb region contained in construct 2 was performed. Reporter transgenes containing various subfragments from the 8 kb fragment, the Pax6 promoter P0, lacZ and the SV40 polyA sequences were used to generate transgenic embryos (see Fig. 1). Truncation at the 5' end of construct 2 (406/SalI) resulted in construct 3 (406/Spe), which was still able to drive the transgenic lacZ expression in the lens, the cornea and the pancreas. As already mentioned, transgenic mice carrying the 3 kb Pax6 promoter/lacZ fusion construct 1 lack any β-gal activity, indicating that the regulatory regions for the lens, cornea and pancreas are located within a 2.4 kb fragment (Fig. 1). However, insertion of this 2.4 kb fragment upstream of lacZ driven by the minimal TK promoter failed to support any lacZ expression, most probably indicating that these regulatory elements are non-functional with the minimal TK promoter. Therefore, various overlapping fragments of the 2.4 kb regulatory region (constructs 5 - 9, Fig. 1, Fig. 14: Table 1) were placed upstream of the Pax6 promoter P0. The negative control, construct 10 (406/XbaI), which contains only the minimal Pax6 promoter P0, provides no specific transgene activity. Interestingly, while construct 3 (406/Spe) was still sufficient to direct the reporter gene expression in both the surface ectoderm derivatives and the pancreas, the 1.29 kb fragment in construct 5 (406/HincII) directs the lacZ expression only in the lens/cornea. These results indicate that the pancreas-specific regulatory element (box A, Fig. 1) is located on a 1100 bp Spe/HincII fragment 4.6 kb upstream of exon 0. Furthermore, the sequence comparison performed among corresponding mouse, human and fugu (pufferfish) DNAs revealed a 124 bp sequence of 74% homology (Fig. 11) in the Pax6 regulatory region which is suggested to be responsible for controlling the expression of the gene in the pancreas. To determine the regulatory region that is sufficient to control the expression of reporter genes in the lens and the cornea, transgenic mice were created carrying construct 6 and construct 7 (see Fig. 1). In transient assays, the 340 bp fragment of construct 6 (406/H) directed expression only in the developing lens and the cornea, while construct 7 (406/E), containing a 280 bp fragment, gave no β-gal staining. Further trimming of construct 6 to a 120 bp fragment (construct 8, 406/A) resulted in lacZ expression in the lens and additional ectopic patchy staining in the retina (in 6 out of 11 transient assays, Fig. 14: Table 1). Furthermore, transgenic mice were created carrying construct 9 (406/B), which contains a 130 bp BglII/EcoRI fragment overlapping with construct 8, but in 3'-5' orientation. The reporter lacZ expression was detected in the lens and the cornea, indicating that this regulatory element can act as an independent enhancer. However, similar to construct 8, this construct also gave, in addition to the correct lens and cornea specificity, ectopic expression in the retina, suggesting that a negative regulatory element might be missing from these two constructs.
By comparing constructs 8 and 9, it was assumed that an overlapping sequence of 107 bp (BglII-AccI, Fig. 10) located 3.6 kb upstream of exon 0 (Fig. 1) is the minimal sequence sufficient to direct lacZ activity in the ectodermal derivatives of the developing eye, lens and cornea, while a sequence beyond this element appears necessary to strictly limit this expression. A high homology was also found for the 107 bp regulatory element in mouse, human and fugu (pufferfish) (see below and Fig. 10).
Taken together, these results demonstrate that two different regulatory elements are located within a 4.6 kb region 5' of exon 0: a 107 bp fragment, which is sufficient to drive reporter lacZ expression in the lens, cornea and lacrimal gland, and a 1100 bp fragment, which is responsible for lacZ expression in the pancreas.
Example 8: Identification of fugu (pufferfish) Pax6 regulatory elements directing lacZ expression in the mouse telencephalon, lens and pancreas.
The tetraodontoid fish, Fugu rubripes, has a compact genome of approximately 400 Mb, which is nine times smaller than the mouse genome (Brenner et al., Nature 366 (1993), 265-268), thus making the analysis of regulatory sequences less time-consuming. As Pax6 is strongly conserved both structurally and functionally through evolution, information on the Pax6 cis-regulatory elements in the fish would facilitate the identification of further regulatory elements within the large mouse Pax6 locus. The identification of enhancer regions by cross-species comparison has already been applied successfully (Marshall et al., Nature 370 (1994), 567-571; Aparicio et al., Proc. Natl. Acad. Sci. USA 92 (1995), 1684-1688; Kimura et al., Development 124 (1997), 3929-3941).
A 12 kb fugu (pufferfish) genomic phage clone from the Pax6 locus containing the paired box and the 5' untranslated region (including exons 1, 2, 3, 4 and α; see Fig. 8) was isolated. Comparison of the intron/exon structure revealed that the fugu (pufferfish) Pax6 locus is one third smaller than the corresponding human (Glaser et al., loc. cit.) and mouse sequences. Notably, exon 0 could not be detected on the fugu (pufferfish) genomic clone either by DNA hybridisation or by sequence comparison.
The functional activity of this sequence was further tested using the mouse in vivo reporter assay. In transgenic mice, the 12 kb fugu (pufferfish) genomic sequence (construct 12, Fig. 8) directs lacZ expression in the lens and in the pancreas, thus demonstrating functional conservation of cis-regulatory elements between the fish and the mouse. In addition, very intensive β-gal staining was detected in the dorso-lateral domain of the telencephalon (Fig. 8A), which is one of the most prominent characteristics of endogenous Pax6 expression. Unfortunately, of the 79 embryos (F0) analyzed, only 3 were transgenic and only 1 expressed lacZ (Fig. 14: Table 1).
To identify corresponding mouse DNA sequences that may regulate Pax6 expression in the telencephalon, a 13 kb fragment encompassing the region from exon 0 to exon 4 (thus lacking the lens, cornea and pancreas elements) was used to make construct 13 (2118/14P, Fig. 8). The transgenic embryos exhibited β-gal staining similar to the expression of the endogenous Pax6 gene, thus including the regions of dorsal telencephalon, diencephalon, pretectum, hindbrain, spinal cord and nasal epithelium. Additional ectopic expression in the vertebrae and the kidney was also seen. In several other embryos, in addition to the activity in telencephalon and spinal cord, ectopic expression was evident in the mesencephalic roof. However, a further truncation of this large fragment to a 5 kb fragment located upstream of exon 1 (construct 14, ENN1, Fig. 8) showed a more restricted expression. In 4 out of 11 transgenic mice carrying construct 14, reporter lacZ expression was detected within the dorsal telencephalic cortex, hindbrain and spinal cord (Fig. 8B). However, some ectopic expression in the midbrain was also detected in 2 out of the 4 lacZ-positive embryos.
Example 9: Localization of conserved mouse and fugu (pufferfish) regulatory elements directing transgenic expression in the neural retina
Results from in vitro experiments with the quail Pax6 gene revealed a region 7.5 kb downstream of the quail P0 promoter acting as an enhancer in neural retina cells (Plaza et al., Mol. Cell. Biol. 15 (1995), 892-903). To identify regulatory elements specific for Pax6 expression in the mouse neural retina, several constructs carrying different regions between exon 1 and exon 5 were used to generate transgenic mice (see Fig. 9). Construct 17 (406/8, Fig. 9) contains a 1.8 kb AccI fragment upstream of the minimal promoter P0. Of the 14 transgenic embryos carrying this construct, 13 exhibited β-gal staining only in the retina. In order to analyse in detail the spatio-temporal expression controlled by this regulatory element, a stable transgenic line was established. The initial transgenic expression was detected at E9.0 in the nasal and temporal region of the developing neural retina (Fig. 9A). At the midgestation stage (Fig. 9B), intensive β-gal staining still appears confined mostly to the nasal and temporal domains, missing the dorsal aspect known from the endogenous expression of Pax6 (Walther and Gruss, loc. cit.; Grindley et al., loc. cit.).
It should be mentioned, however, that in several transgenic embryos the β-gal-negative domain within the dorsal retina is smaller, and in a few cases even a thin layer of lacZ-positive cells connected the two strongly positive retinal domains (curved arrow in Fig. 9D). As illustrated in Fig. 5E, the transgenic β-gal activity is very strong in the inner layer (arrowhead) and in the pigmental layer of the retina (PL). On sections of E 18 d.p.c. embryonic eyes, staining was observed in the ganglionic and amacrine cells (data not shown). As at early developmental stages, regionalized lacZ expression in the retina (at intermediate level) and in the iris (at very high level) was detected after birth (Fig. 9C).
Construct 15 (Fig. 9), harboring a 1.2 kb EcoRI/XbaI fragment and a lacZ-polyA cassette as an insertion into exon α, directed a similar expression pattern in the neural retina of transgenic embryos, indicating the location of the regulatory sequences in a 900 bp region (Fig. 9). Trimming the fragment to the 530 bp included in construct 19 (406/BX, Fig. 9) resulted in similar lacZ activity in the retina, while a still smaller 290 bp fragment in construct 20 (406/DX) failed to show transgenic expression (Fig. 9). No β-gal staining could be detected when using a heterologous minimal TK promoter, indicating that this specific minimal promoter is not sufficient to activate the Pax6 regulatory sequences (construct 16, Fig. 9).
Sequence comparison revealed that this mouse genomic area is highly conserved (87% in 403 bp, Fig. 12) with the identified neuroretina-specific enhancer element of the quail Pax6 gene (Plaza et al., loc. cit.). Additionally, the same genomic area exhibits a high sequence identity (81% in 414 bp) to the Pax6 gene of the pufferfish. A 600 bp genomic fragment of fugu (pufferfish) carrying this conserved region was inserted upstream of the mouse promoter P0 and the lacZ gene (construct 18, 406/Fugu (pufferfish), Fig. 9). This construct (carrying the fugu (pufferfish) conserved sequence) was able to reproduce the restricted expression pattern in the nasal and temporal part of the retina seen in the transgenic embryos (compare Fig. 9A, B/F). These results demonstrate the functional conservation of cis-regulatory sequences in the Pax6 gene during eye evolution.
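For illustration only, identity figures of the kind quoted here (for example, 87% over 403 bp corresponds to roughly 350 matching positions) can be computed from a pairwise alignment as in the following minimal sketch. The function name `percent_identity`, the convention of skipping gap columns, and the toy sequences are assumptions made for this example; the text does not state how the reported values were calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap ('-') in either sequence are
    excluded from the comparison. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences (hypothetical, NOT taken from Fig. 12): 9 of 10 positions match.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```

The sequences must already be aligned; producing the alignment itself is outside the scope of this sketch.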
Example 10: Conservation of putative regulatory regions in the pufferfish Pax6 locus
The nucleotide sequences of the identified regulatory elements reveal several DNA binding motifs of transcription factors which are highly conserved among mouse, human and fugu (pufferfish), suggesting that these factors may act as upstream regulators of the Pax6 gene. The 340 bp Hindl/EcoRI murine fragment (construct 6) responsible for the surface ectoderm expression shows a high sequence homology within 245 bp of human and fugu (pufferfish) genomic Pax6 sequences (Fig. 10). The 245 bp sequences contain two conserved TAAT-core motifs, critical components of many homeodomain DNA binding sites. Motif A "CTTAATG" is located in position nt 56 - nt 62, while motif B "GCTAATGTCT" is located in position nt 210 - nt 220. The 1100 bp fragment for the pancreas-specific element revealed a sequence of 120 nt with high sequence identity to human and fugu (pufferfish) genomic Pax6 DNA, containing two motifs for homeodomain DNA binding sites: motif C "CATTATTGT" in position nt 60 - nt 68 and motif D "TTTAATCCAATTATA" in position nt 156 - nt 170 (Fig. 11). Furthermore, a PBX-1 consensus binding site "AATCAATCA" is located in position nt 97 (Lu et al., Mol. Cell. Biol. 15 (1995), 3786-3795), which may regulate Pax6 expression by direct binding.
Additionally, the sequence of the retina-specific fragment shows a high conservation among mouse, human, fugu (pufferfish) and quail (Fig. 12). Position nt 185 reveals a homeodomain binding site for the transcription factor MSX-1 "CAATTAG" (Catron et al., Mol. Cell. Biol. 13 (1993), 2357-2365). Two further putative homeodomain binding sites "AAATTAAG" and "GTTTTATT" are located at positions nt 233 and nt 262, respectively. The sequence at nt 199 reveals a binding motif for the transcription factor Pax2 (Czerny et al., Genes Dev. 7 (1993), 2048-2061; Epstein et al., J. Biol. Chem. 269 (1994), 8344-8361).
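As a hedged illustration of how motif positions such as those listed in this Example might be located in a nucleotide sequence, the following minimal sketch scans both strands for exact matches and reports 1-based, inclusive coordinates. The helper names `reverse_complement` and `find_motif` and the toy sequence are hypothetical; the toy sequence is not the sequence of Fig. 10, 11 or 12, and the text does not describe the software actually used.

```python
from typing import List, Tuple


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, strand) for every exact match of `motif`.

    Coordinates are 1-based and inclusive, given on the forward strand.
    Strand '-' means the reverse complement of the motif was found.
    """
    seq = seq.upper()
    motif = motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, start + len(pattern), strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


# Toy sequence (hypothetical) built around motif A and the MSX-1 site.
toy = "GGG" + "CTTAATG" + "CCC" + "CAATTAG" + "GG"
print(find_motif(toy, "CTTAATG"))  # [(4, 10, '+')]
print(find_motif(toy, "CAATTAG"))  # [(14, 20, '+')]
```

Exact string matching is the simplest possible choice here; degenerate consensus sites would require position weight matrices or pattern matching with ambiguity codes, which this sketch does not attempt.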