
WO2007027583A2 - Method of isolating stem cells and identifying differentially expressed genes in sonic hedgehog expressing cells and descendent cells - Google Patents

Publication number
WO2007027583A2
WO2007027583A2 PCT/US2006/033491 US2006033491W WO2007027583A2 WO 2007027583 A2 WO2007027583 A2 WO 2007027583A2 US 2006033491 W US2006033491 W US 2006033491W WO 2007027583 A2 WO2007027583 A2 WO 2007027583A2
Authority
WO
WIPO (PCT)
Prior art keywords
cells
shh
gene
expressed
tissue
Prior art date
Application number
PCT/US2006/033491
Other languages
French (fr)
Other versions
WO2007027583A3 (en)
Inventor
Brian D. Harfe
Martin J. Cohn
Original Assignee
University Of Florida Research Foundation, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Florida Research Foundation, Inc.
Publication of WO2007027583A2
Publication of WO2007027583A3

Classifications

    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K: ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00: Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
    • A01K67/027: New or modified breeds of vertebrates
    • A01K67/0275: Genetically modified vertebrates, e.g. transgenic
    • A01K2217/00: Genetically modified animals
    • A01K2217/05: Animals comprising random inserted nucleic acids (transgenic)
    • A01K2227/00: Animals characterised by species
    • A01K2227/10: Mammal
    • A01K2227/105: Murine
    • A01K2267/00: Animals characterised by purpose
    • A01K2267/03: Animal model, e.g. for test or diseases
    • A01K2267/0393: Animal model comprising a reporter system for screening tests

Definitions

  • the invention provides for methods of isolating stem cells and identifying the genes that are differentially expressed in stem cells. More particularly, the present invention relates to methods for discovering genes involved in tissue engineering, regeneration, reconstruction and/or repair.
  • Stem cells are undifferentiated or immature cells that have the capacity to self renew and to give rise to various specialized cell types. Once differentiated or induced to differentiate, stem cells can be used to repair damaged and malfunctioning organs. Stem cells can be of embryonic, fetal or adult origin.
  • Embryonic stem cells can be isolated from the inner cell mass of pre-implantation embryos (ES cells) or from the primordial germ cells found in the genital ridges of post-implanted embryos (EG cells), or they can be generated by nucleus transfer into enucleated oocytes and blastocyst development. When grown in special culture conditions such as spinner culture or hanging drops, both ES and EG cells aggregate to form embryoid bodies (EB). EBs are composed of various cell types similar to those present during embryogenesis.
  • When cultured in appropriate media, EBs can be used to generate in vitro differentiated phenotypes, such as extraembryonic endoderm, hematopoietic cells, neurons and glia, cardiomyocytes, skeletal muscle, pancreatic, liver, endothelial, adipose, cartilage, bone, and vascular muscle cells.
  • the invention provides for methods of isolating stem cells and identifying genes that are differentially expressed in Sonic Hedgehog gene expressing stem cells and all their descendent cells (i.e., cells into which they differentiate). More particularly, the present invention relates to methods for discovering genes and proteins which may be useful in tissue engineering, regeneration, reconstruction and/or repair of tissues expressing the Sonic Hedgehog gene.
  • the present invention provides for methods of isolating stem cells and their descendant cells in selected tissues co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a transgenic subject in which a marker gene has been inserted into the Shh locus of the subject's genome; and b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue.
  • the present invention provides for methods of identifying differentially expressed genes in selected tissues (i.e., stem cells and their descendant cells) co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a non-human transgenic subject in which the marker gene has been inserted into the Shh locus of the subject's genome; b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue; c) analyzing complementary RNAs from Shh/marker gene expressing cells and Shh/marker gene non-expressing cells using a microarray screen of the subject's transcriptome; and d) identifying genes that are expressed at higher and lower levels in Shh/marker gene expressing cells relative to Shh/marker gene non-expressing cells.
  • tissue or regions of an embryo such as the zone of polarizing activity (ZPA) of a developing limb bud, the genital tubercle and floor plate of the neural tube.
  • the methods can be practiced using selected tissue from an adult, such as populations of progenitor or stem cells, tissue derived from enamel knot tissue, tissue derived from follicle tissue of hair, tissue derived from the nervous system, tissue derived from the retina, tissue derived from endoderm of the gastrointestinal tract, tissue derived from an intervertebral disk, tissue derived from genitourinary epithelial tissue, or any tissue expressing or which has expressed the Shh gene or protein at any time during the life of the animal.
  • Embodiments of the present invention include proteins translated from genes identified as expressed at relatively higher or lower levels according to the methods of the present invention and used as therapeutic agents for regenerating, reconstructing or repairing a number of tissue systems.
  • Specific embodiments of the present invention include the genes and translated proteins identified using the methods of the present invention in the patterning of the vertebrate limb, and the development of the genital tubercle, including TM-1, TM-2, EST 1437418, Mmu-miR-135a-2, and AP-2 beta.
  • the present invention provides for methods of identifying genes and proteins that are important in tissue engineering, regeneration, reconstruction and/or repair of tissues that express the Shh gene. Other aspects of the invention are described infra.
  • Figures 1A-C are two diagrams (1A, 1C) and a photograph (1B) relating to construction of a Shhgfpcre allele and a corresponding transgenic mouse, in accordance with an embodiment of the invention.
  • Figure 1A shows the site of insertion of a gfpcre fusion cassette at the ATG of Shh.
  • Figure 1B is a wholemount of an embryo formed by crossing mice carrying the Shhgfpcre allele and the R26R reporter, resulting in the production of β-galactosidase-positive cells in all locations in which Shh is normally expressed, including the limbs (black arrowheads) and notochord/neural tube (gray arrowheads). Staining in the notochord/neural tube is present throughout the length of the embryo.
  • Figure 1C illustrates schematically that the R26R reporter is normally turned off in all tissues, but can be activated by the expression of CRE protein.
  • Figures 2A-F are six photographs illustrating whole mount RNA in situ hybridization of mouse forelimb at embryonic day 10.5 (E10.5) showing expression of genes as indicated in each panel.
  • the genes were identified by screening a gene chip with probes made from mRNA of stem cells expressing sonic hedgehog (Shh) that were isolated from the limbs of E10.5 Shhgfpcre transgenic mice in which cells that express Shh, and their descendants, express a green fluorescent marker (GFP), facilitating their isolation and purification, in accordance with an embodiment of the invention.
  • Figure 3 is a series of ten photographs showing a comparison of expression patterns of
  • TM1 and Shh expression patterns are almost identical; the slight differences in expression are most likely due to small differences in stages of the limbs.
  • TM1 is not expressed outside the Zone of Polarizing Activity (ZPA).
  • Figures 4A-E are five photographs illustrating patterns of TM1 expression during limb formation in mutant mouse strains and in the chick embryo.
  • Figures 4A and 4B illustrate expression of TM1 in hindlimbs of E10 wild type and Fgf4/Fgf8 null mice, respectively; no expression of TM1 is observed in limbs that lack Fgf4 and Fgf8 (arrow in 4B).
  • Figures 4C and 4D respectively show expression of TM1 in E10 wild type and Shh null forelimbs. TM1 expression is observed in a wild type domain in limbs that lack Shh (arrow in 4D).
  • Figure 4E shows that TM1 is expressed in the ZPA of stage 20 chick forelimbs (arrow).
  • Figure 5 is six photographs illustrating expression of Shh during development of mouse external genitalia from E10-E15.5. Dark staining (β-galactosidase activity) persists in the urethral epithelium (arrows) of the genital tubercle, as seen in ventral views of wholemounts.
  • Figure 6 is a photograph of a transverse section through a genital tubercle of a transgenic Shhgfpcre mouse embryo at E12.5 showing the distribution of Shh mRNA. Shh transcripts (dark staining) are restricted to the urethral epithelium.
  • Figure 7 is two photographs showing a genital tubercle dissected from an E12.5 ShhGfpCre/+; eYFP floxed mouse embryo. YFP is detected throughout the entire urethral epithelium.
  • Figure 8 is a graph showing results of FACS separation of eYFP-labelled urethral epithelial cells from genital tubercles of ShhGfpCre; eYFP floxed mice; YFP cells are gated within box R2.
  • Figures 9A-D are three fluorescence photographs (A-C) and a scatter plot (D) pertaining to spinal column formation in the mouse embryo as visualized in transgenic mice expressing Shh and a reporter gene, in accordance with an embodiment of the invention.
  • Figure 9A is a bright field view of a developing spinal column at postnatal day 5;
  • Fig. 9B shows yellow fluorescent protein (YFP) localization in the nucleus pulposus of intervertebral disks of the same spinal column shown in Fig. 9A in a Shhgfpcre/+; eYFP/+ mouse.
  • Downward facing arrows in Figs. 9A and 9B mark the positions of intervertebral disks; upward facing arrows mark the positions of vertebral bodies.
  • Figure 9C shows LacZ staining of a single nucleus pulposus in a transverse section through a vertebral body in a newborn Shhgfpcre/+; R26R/+ P0 mouse.
  • R26R is a LacZ reporter; the vertebra has been stained to reveal LacZ label in the nucleus pulposus (arrows).
  • Figure 9D is a scatter plot showing results of a fluorescence activated cell sorting (FACS) experiment in which cells expressing CRE from the Shhgfpcre allele turn on the eYFP reporter. Purified YFP+ cells are shown within the gate R2; unlabeled cells lie to the left of gate R2. Cells present in the R2 gate are over 94% YFP positive. Cells present in the R3 gate are ⁇ 0.5% YFP positive.
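  • As an illustration only, the gate-based purity figures quoted for Figure 9D can be thought of as the fraction of events inside a rectangular gate that are reporter positive. The following Python sketch is not part of the disclosed methods; the `Gate` class, the `gate_purity` function, the channel values and the per-event "truly positive" flag are all hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """Rectangular gate on two measured channels (arbitrary, hypothetical units)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def gate_purity(events, gate):
    """Return (events inside the gate, fraction of those that are reporter positive).

    Each event is a tuple (channel_x, channel_y, is_reporter_positive).
    """
    in_gate = [e for e in events if gate.contains(e[0], e[1])]
    if not in_gate:
        return 0, 0.0
    positives = sum(1 for e in in_gate if e[2])
    return len(in_gate), positives / len(in_gate)


# Hypothetical events: (fluorescence, side scatter, truly YFP positive)
events = [
    (900, 50, True),
    (850, 60, True),
    (800, 40, True),
    (820, 55, False),   # a contaminating unlabeled cell inside the gate
    (100, 50, False),
    (120, 45, False),
]
r2 = Gate(x_min=700, x_max=1000, y_min=0, y_max=100)  # assumed gate bounds
count, purity = gate_purity(events, r2)
print(f"{count} events in gate, purity {purity:.0%}")
```

In practice purity is usually estimated by re-analyzing the sorted fraction rather than from a known per-event label; the label is used here only to keep the example self-contained.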
  • Figures 10A-E are five photographs showing expression of Shh in the notochord and subsequently in the nucleus pulposus, but not the vertebrae (A-C), in a transgenic mouse model in which Shh expression can be conditionally controlled by administering tamoxifen.
  • Figures 10D and 10E show Shh expression in the urethra (D, E) and preputial glands (E).
  • Figures 11A-C are three photographs showing that Shh is expressed in a subset of nucleus pulposus cells (arrows) in the intervertebral disks of postnatal mice.
  • the invention provides methods of isolating stem cells and their descendant cells, as well as methods of identifying genes that are differentially expressed in cells that express the Sonic Hedgehog (Shh) gene.
  • the present invention further provides genes and proteins that may be useful in tissue engineering, regeneration, reconstruction and/or repair of tissues that are actively, or have at any time during the life of the organism, expressed the Sonic Hedgehog gene, including various tissues of the limbs, genitourinary tract and vertebral column, including intervertebral disks.
  • the present invention provides for methods of isolating stem cells and their descendant cells, as well as identifying the genes that are differentially expressed in stem cells and their descendant cells. More particularly, the present invention relates to methods for discovering genes involved in tissue engineering, regeneration, reconstruction and/or repair.
  • a "marker gene,” as the term is used herein, can be a detectable genetic trait or segment of DNA that can be identified and tracked.
  • a marker gene can serve as a flag for another gene, sometimes called the target gene.
  • a marker gene must be on the same chromosome as the target gene and near enough to it so that the two genes (the marker gene and the target gene) are genetically linked and are usually inherited together.
  • marker genes useful in the present invention include the gene for green fluorescent protein (gfp), dsRed, eYFP, eGFP, eCFP, LacZ and other marker genes currently known and as will be known in the future to those people of ordinary skill in the art.
  • Embryos containing this transgene expressed the marker gene in all cells that normally express Shh, including zones of developing tissues.
  • any cell that at one time expressed Shh can be purified and used to screen arrays, or for cell culture.
  • a gene that encodes a gfpcre protein fusion was knocked into the Shh locus (Fig. 1A).
  • the gfpcre cassette contains a nuclear localization signal and was inserted at the translational start ATG codon of Shh. Insertion of the gfpcre cassette at the start ATG of Shh results in the production of GFP in cells that normally express Shh mRNA.
  • the gfpcre cassette is a translational fusion between gfp and the site-specific recombinase cre 1 .
  • CRE protein recognizes loxP DNA sequences and can instigate recombination between two loxP sites present as direct repeats in the genome (Fig. 1B). In mice containing the Shhgfpcre allele, CRE should be expressed in all cells in which GFP (and Shh mRNA) are observed.
  • R26R and eYFP (enhanced Yellow Fluorescent Protein) alleles ' 39 were used to advantage. In these alleles, reporter genes have been inserted into the ROSA locus. Mice containing the R26R or eYFP alleles transcribe LacZ or YFP, respectively, but no protein is made because these alleles contain a stop cassette flanked by loxP sites upstream of the reporters. Cells expressing the recombinase CRE undergo a recombination event between the two loxP sites, resulting in the removal of the stop cassette and expression of the reporters.
  • reporter protein will continue to be produced in the cell in which the recombination event occurred and in all descendants of that cell (Fig. 1C). Expression will continue irrespective of whether CRE protein is present during later development. Accordingly, using this system, cells and their descendants are irreversibly marked and can be followed throughout development.
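  • The irreversible marking logic described above can be sketched as a toy model, offered only as an aid to reading and not as part of the disclosed methods. The `Cell` class and its attribute names are hypothetical. The only behavior modeled is that excision of the stop cassette happens while the Shh locus is active and is then inherited by every descendant, whether or not CRE is still present.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    expresses_shh: bool = False   # Shh locus active now, so GFP-CRE is produced
    stop_removed: bool = False    # floxed stop cassette already excised
    children: list = field(default_factory=list)

    def update(self):
        # CRE is present only while the Shh locus is active; excision is permanent.
        if self.expresses_shh:
            self.stop_removed = True

    @property
    def reporter_on(self):
        # The reporter is made whenever the stop cassette is gone.
        return self.stop_removed

    def divide(self, name, expresses_shh=False):
        """Create a daughter cell that inherits the recombined (or intact) allele."""
        self.update()
        child = Cell(name, expresses_shh=expresses_shh, stop_removed=self.stop_removed)
        child.update()
        self.children.append(child)
        return child


# A founder that expresses Shh, whose descendants no longer do
founder = Cell("founder", expresses_shh=True)
daughter = founder.divide("daughter")
granddaughter = daughter.divide("granddaughter")

# An unrelated lineage that never expressed Shh
other = Cell("other")
other_daughter = other.divide("other_daughter")

print(granddaughter.reporter_on, other_daughter.reporter_on)
```

The model deliberately ignores recombination efficiency and timing; it only shows why descendants stay labeled after Shh expression has ceased.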
  • the vertebrate limb is a highly intricate structure containing many different types of cells and tissues.
  • the limb is first visible as a small bud of tissue protruding from the flank of an E9.5 mouse embryo 1 (as shown, e.g., in Fig. 2).
  • this mostly undifferentiated mass of cells will divide and differentiate to form all cell types associated with the limb, including muscle, tendon and bone.
  • Uncovering the mechanisms responsible for the patterning of the limb has been an area of active investigation for decades. In spite of intense effort, the molecular pathways responsible for patterning of most cell types in the vertebrate limb are not yet known.
  • the limb is patterned along three distinct axes: dorsal/ventral, proximal/distal and anterior/posterior.
  • Saunders and Gasseling 2 reported key findings into the mechanism responsible for patterning the anterior/posterior axis in 1968. They grafted tissue from the posterior of one chick limb into the anterior distal margin of a second limb bud. This resulted in mirror image duplication of the digits and in some cases the ulna. The extra digits produced near the graft were of reversed polarity such that the most anterior ectopic digits in position were the most posterior in character.
  • Shh secreted factor
  • Shh mRNA is first observed in the posterior of the limb bud at E9.75 6 . Over the course of the next 2 days, Shh mRNA remains in the posterior distal portion of the limb as it expands.
  • Shh protein is observed in a posterior to anterior concentration gradient, with highest levels of protein found in cells expressing Shh mRNA 6 .
  • Numerous studies have identified an important role for Shh in patterning the anterior/posterior (A/P) axis of the vertebrate limb 3, 6-8 .
  • the correct concentration of SHH protein is essential for the patterning of the anterior/posterior axis ' 9 ' 10 .
  • digit one in the hindlimb does not appear to require SHH for its formation.
  • Experiments in which the concentration of SHH was varied suggested that the highest concentration of SHH results in the production of digit five, with progressively lower SHH concentrations specifying digits 4-2.
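  • The concentration-dependent relationship described above can be pictured as a simple threshold mapping. The Python sketch below is illustrative only: the function name `digit_identity`, the relative concentration scale from 0 to 1 and the threshold values are assumptions made for the example and are not taken from the disclosure.

```python
def digit_identity(shh_concentration,
                   thresholds=((0.8, 5), (0.6, 4), (0.4, 3), (0.2, 2))):
    """Map a relative SHH concentration to a digit number.

    thresholds is a sequence of (minimum concentration, digit) pairs ordered
    from highest to lowest; the values are hypothetical. Returns None when the
    concentration is below every threshold.
    """
    for minimum, digit in thresholds:
        if shh_concentration >= minimum:
            return digit
    return None


# Cells sampled from posterior (high SHH) to anterior (low SHH)
for concentration in (0.9, 0.7, 0.5, 0.3, 0.1):
    print(concentration, digit_identity(concentration))
```

A value of None here simply means the cell is below the lowest assumed threshold and is not assigned a digit by this mapping.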
  • cells other than SHH-expressing cells sense different concentrations of SHH and respond by modulating their gene expression profiles. The cells that respond to SHH then form different digits depending on the concentration of SHH protein to which they have been exposed.
  • Role of cells formerly expressing Shh in the vertebrate limb.
  • Cells that formerly expressed Shh have at least two roles in the vertebrate limb.
  • the first is to terminate the Shh-Fgf feedback loop that is responsible for controlling the size of the limb 11 .
  • the outgrowth of the limb is controlled by two distinct signaling centers, the ZPA located in the posterior/distal mesenchyme and the AER (apical ectodermal ridge) located at the tip of the developing limb 12 .
  • a positive feedback loop involving Shh and Gremlin expression in the mesenchyme and Fgf4 expression in the AER is required for normal limb outgrowth to occur 11, 13, 14 .
  • Shh maintains Fgf expression from the AER by upregulating Gremlin in the mesenchyme adjacent to Shh-expressing cells 15, 16 .
  • Gremlin is a bone morphogenetic protein (Bmp) antagonist and prevents Bmp proteins from downregulating Fgf expression in the AER 17, 18 .
  • the signaling loop between Shh and Fgfs must be halted at the appropriate point during development to produce a normal-sized limb. Prolonged expression of Shh in chick limbs after normal Shh expression has ceased results in prolonged Fgf expression in the AER and the continued outgrowth of the digit rays 19, 20 .
  • the end result of increasing the amount of time the Shh-Fgf feedback loop is operating is the production of additional phalanges and an overall longer limb.
  • Shh is expressed in an invariant, small number of posterior/distal cells along the margin of the limb. A few factors are known to be involved in activating Shh in this specific region of the vertebrate limb. However, none of them are expressed exclusively in Shh-positive cells.
  • dHAND is a basic helix-loop-helix transcription factor that is expressed prior to Shh in the early limb 23 . During later development, dHAND is restricted to the posterior of the limb and overlaps with Shh. Misexpression of dHAND in the anterior of the limb results in ectopic anterior expression of Shh and the duplication of skeletal elements. Consistent with the proposed role for dHAND in activating Shh, in dHAND null mice Shh is not expressed.
  • Hoxb8 has also been implicated in regulating Shh expression in the vertebrate limb.
  • Hoxb8 is expressed in the posterior of E9.5 mouse limbs and overexpression throughout the limb results in ectopic Shh expression and forelimb duplications 2 .
  • chick Hoxb8 is also expressed in the posterior of forelimbs and can be ectopically induced by anterior expression of retinoic acid 25, 26 . Inhibition of retinoic acid results in the downregulation of Hoxb8 and a loss of Shh 25 .
  • Hoxb8 is not expressed in the posterior hindlimb in either mouse or chick and overexpression of Hoxb8 in the mouse hindlimb does not result in the production of any visible defects 24 .
  • genes in the HoxD cluster are important for expression of Shh 27 . Ectopic Hox expression resulted in misexpression of Shh in the anterior of the limb and double posterior limbs. It has been proposed that dHAND may be activated by posterior HoxD gene expression 27 . Both HoxD expression and dHAND are thought to be repressed by anterior expression of the Gli3 repressor and paired homeodomain gene Alx-4.
  • the "classic" ZPA has been redefined through molecular biology as the region of the limb in which Shh is expressed. Based on the essential roles for Shh-expressing cells in normal limb development we reasoned that there might be additional factors present in the ZPA, performing Shh-dependent and/or independent functions.
  • the inventors employed a transgenic mouse allele in which Gfp is inserted into the Shh locus as described above. Embryos containing this transgene express GFP in all cells that normally express Shh, including the ZPA. Cells expressing Shh were cell-sorted from E10.5 limbs and then complementary RNAs from GFP-positive (the ZPA) and GFP-negative (rest of the limb bud) cell populations were used to screen Affymetrix mouse GeneChips.
  • TM1 that contains eight transmembrane domains
  • Whole mount RNA in situ hybridization revealed that in the limb, TM1 was temporally and spatially expressed in a pattern identical to Shh. Outside the limb, TM1 also appears to be expressed in many of the same areas as Shh.
  • TM1 is a member of a highly conserved family of uncharacterized transmembrane proteins that share no similarity to any of the known transmembrane proteins functioning in the Shh-signaling pathway (Patched, Smoothened, Hip or Dispatched).
  • There are eight TM1 members in most vertebrates, 4 in Drosophila and 1 in C. elegans. In the chick limb, we have found that the TM1 homolog is also expressed exclusively in the ZPA.
  • RNA in situ hybridization analysis was also performed using several other genes listed in Table 1, including: EST 1460299 (HB9); EST 1420784 (TM2); EST 1437418; and EST 1435670 (AP-2) (shown in Figs. 2C-F, respectively). Like Shh and TM1, these genes were found to be highly expressed in the ZPA, and not in other areas of the developing limb bud. The results of the in situ hybridization analysis thus confirmed the differential expression of these genes in the ZPA of the limb bud, as revealed by the gene chip screening technique, demonstrating the power of this approach for identifying new genes associated with limb formation.
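  • As a rough illustration of the comparison between GFP-positive and GFP-negative cell populations described above, the following Python sketch ranks genes by log2 fold change between two sets of signal intensities. It is a simplified stand-in, not the analysis actually used with the GeneChips: the function name `differential_genes`, the pseudocount, the fold-change cutoff and the gene names and intensities are all hypothetical.

```python
import math


def differential_genes(positive, negative, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Split genes into those higher or lower in the marker-positive population.

    positive and negative map gene name -> signal intensity for the
    marker-expressing and non-expressing populations. Only genes measured in
    both are compared. Returns two lists of (gene, log2 fold change).
    """
    higher, lower = [], []
    for gene in sorted(positive.keys() & negative.keys()):
        ratio = (positive[gene] + pseudocount) / (negative[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if log2_fc >= min_abs_log2_fc:
            higher.append((gene, log2_fc))
        elif log2_fc <= -min_abs_log2_fc:
            lower.append((gene, log2_fc))
    higher.sort(key=lambda item: -item[1])  # strongest enrichment first
    lower.sort(key=lambda item: item[1])    # strongest depletion first
    return higher, lower


# Hypothetical intensities for three placeholder genes
gfp_positive = {"geneA": 799.0, "geneB": 99.0, "geneC": 49.0}
gfp_negative = {"geneA": 99.0, "geneB": 99.0, "geneC": 399.0}

higher, lower = differential_genes(gfp_positive, gfp_negative)
print("higher in GFP-positive cells:", higher)
print("lower in GFP-positive cells:", lower)
```

A real screen would also use replicates and a statistical test; the fold-change cutoff alone is used here only to keep the example short.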
  • Genitourinary disease is a major public health concern, with cancer of the prostate being the second most lethal malignancy in men.
  • Most cancers of the genitourinary tract involve transformation of epithelial cells that line urologic organs, known as urothelial or uroepithelial cells.
  • Hedgehog signaling has been shown to be associated with cancers of the stomach, esophagus, pancreas, prostate and biliary tract.
  • Birth defects affecting the genitourinary system are among the most common congenital anomalies in humans, with hypospadias, an ectopic ventral opening of the urethra, affecting one in every 250 live births. The molecular control of early genitourinary development is not well understood.
  • the penis and clitoris develop from the genital tubercle, an outgrowth of the ventral margin of the cloaca that consists of endoderm, mesoderm and ectoderm (Fig. 5).
  • the genital tubercles of males and females are morphologically indistinct.
  • the early tubercle in both sexes has the potential to differentiate into either male or female genitalia, and exposure to androgens masculinizes the genitalia. Disruption of androgen signaling can result in feminization of the genitalia, which frequently includes hypospadias ' ls 92 '
  • Fgfs Fibroblast Growth Factors
  • FgfR2-IIIb a receptor for Fgf10 and Fgf7
  • Sonic hedgehog signaling has been implicated in urologic cancers of the prostate and bladder.
  • Shh is expressed during normal development, where it regulates differentiation of prostate epithelium and growth of the prostate. Inhibition of Shh signaling during prostate development leads to precocious and abnormal differentiation of epithelial cells, causing prostatic lesions similar to those found in prostate intraepithelial neoplasia.
  • Hedgehog activity is required for regeneration of prostatic epithelium, and continuous pathway activation results in transformation of progenitor cells to a tumorigenic phenotype. Activation of the hedgehog pathway requires expression of the Smoothened gene, which is triggered by expression of endogenous hedgehog proteins.
  • prostate cancer cell lines have been shown to contain elevated levels of Patched mRNA, and growth of these cancer cells can be inhibited by administration of compounds that block hedgehog activity. Tumor growth can be blocked in vivo by administering hedgehog pathway antagonists to adult mice. Regression of these tumors is likely to be a consequence of failed renewal of tumor stem cells. Thus, the biological role of hedgehog signaling in the maintenance of uroepithelial progenitor or stem cells seems to be common to normal development and to cancer. Whereas normal development involves appropriate withdrawal of hedgehog signaling at the end of organ formation, tumorigenesis involves unregulated hedgehog signaling, which leads to continuous regeneration of prostate epithelium from transformed endogenous progenitors.
  • hedgehog pathway activation is strongly correlated with metastasis of prostate cancer.
  • Superficial bladder cancer is frequently associated with chromosomal deletion at position 9q22.3, the same position as the Patched locus. While it is not yet known whether loss of Ptc activity in bladder cancer is a causal factor, its involvement in basal cell carcinoma, and the importance of hedgehog signaling in basal cell renewal elsewhere in the urogenital tract, raises the possibility that hedgehog may also mediate tumor progression in the bladder.
  • Shh, like other members of the hedgehog family, activates signal transduction by binding to the Patched (Ptc) transmembrane receptors 94, 102, 103, 107 .
  • the binding of Hedgehog to Ptc alleviates repression of Smoothened (Smo), which, in turn, results in activation of Gli and subsequent expression of hedgehog target genes, including Ptc 106 .
  • members of the Bone morphogenetic protein (Bmp) family are expressed in the limb bud, and it has been suggested that Bmps may be involved in relaying the hedgehog signal over long ranges 114 .
  • Hoxd13 and Hoxa13 play an essential role in external genital and limb development 98, 104, 109 . Loss of function of both genes results in agenesis of the genital tubercle, and heterozygosity for either gene causes patterning defects of the phallus. A recent study reports that Hoxa13 null mice exhibit hypospadias 104 .
  • Mutations in HOXA13 are responsible for the range of phenotypes seen in Hand-Foot-Genital Syndrome, which affects the distal limbs and genitourinary system.
  • the secreted signaling molecule Wnt5a is expressed in a number of embryonic outgrowths, where it acts as a positive regulator of cell
  • Reconstruction of urologic tissues is usually performed with non-urologic tissues from the gastrointestinal tract, skin, placenta, dura, peritoneum or secretory mucosa from other tissues.
  • Introduction of non-urologic cells into the genitourinary tract is generally problematic and can lead to abnormal physiological function of urologic organs, scarring, and fibrosis.
  • Prosthetic replacements have been engineered from natural and synthetic materials such as silicone, collagen matrix, gelatin and polyvinyl, but these offer poor solutions and lead to failure due to bioincompatibility or structural/mechanical problems.
  • Cellular transplantation offers great promise for biological reconstruction of urologic organs, and progress has been made in the area of scaffold development onto which cells can be seeded for in vitro and in vivo growth. After seeding, cells spread over the scaffold and reach a stable shape. Morphology of the cellular component of engineered organs depends on (1) adhesion of cells to the substrate, (2) adhesion of cells to one another, (3) rigidity of the substrate, and (4) availability of appropriate nutrients, including oxygen. An important factor in the reconstruction of genitourinary organs is identification of a donor cell population that can be directed down an appropriate developmental pathway, then survive over a long term, differentiate and grow.
  • Urothelial stem cells provide great opportunities for successful regeneration, repair and reconstruction of urologic tissues.
  • the inventors isolated these cells from the ShhGfpCre mouse described herein, and purified the cells by flow cytometry using Fluorescence Activated Cell Sorting (FACS). More specifically, a population of Shh-expressing, fluorescent cells was obtained from the urethral plate of mouse genital tubercle at embryonic day 12.5 (Fig. 7), and 94% enrichment of labeled cells was achieved by FACS (Fig. 8).
  • the data indicate that the methods of gene isolation described herein have resulted in the identification of the transcriptional profile of urethral progenitor cells in the E12.5 mouse embryo. It is contemplated that the cells of the invention, or the genes/RNA or proteins discovered from the transcriptome of these cells will provide useful therapeutics for regenerating, reconstructing or repairing any tissue needing such therapy that has at one time expressed Shh, such as therapeutic agents for regenerating, reconstructing or repairing the genitourinary system.
  • Table 5 lists genes that are upregulated in non-Shh-expressing cells of the genital tubercle.
  • The most common cause of back pain is thought to be degeneration of the intervertebral disk 118-120 . This usually occurs in two ways: either through herniation of disk material into the vertebral column or through the reduction of disk height. The reduction in the
  • Each disk is composed of an outer annulus fibrosus consisting of concentric lamellae of collagen fibers; superior and inferior cartilaginous end plates (these mark the regions between the intervertebral disk and the vertebrae); and an inner layer called the nucleus pulposus 122 .
  • The primary clinical treatment is the surgical removal of the disk in an attempt to relieve pressure on spinal nerve roots 124 . If removal of the affected disk results in a mechanically unstable spine, fusion of adjacent vertebrae can be performed in an attempt to stabilize the spine 12, 125 . This type of operation is usually beneficial for the alleviation of pain caused by disk herniation. However, the overall success rate for spinal fusions is only 50-75% 125 . Spinal fusions have also been shown to accelerate the degeneration of adjacent discs 126, 127 . The relatively low rate of success and the potential for serious side effects from this treatment suggest that alternative treatments are needed to cure or alleviate back pain.
Intervertebral disk repair or replacement as a treatment for back pain
  • Tissue engineering of a nucleus pulposus has the potential to be used for either complete disk replacement or the repair of damaged tissue.
  • the first step in the engineering of a nucleus pulposus is to identify, characterize and purify a population of cells or proteins that can form or repair this structure.
  • the nucleus pulposus is composed predominantly of type II collagen, aggrecan and water 120 .
  • Cell types found in the nucleus pulposus are primarily small chondrocyte-like cells.
  • a second population of cells is found in the nucleus pulposus. These cells are much larger than the chondrocyte-like cells and have been proposed to be "notochordal" in origin 1 °' 131 .
  • Notochordal cells in the nucleus pulposus express a subset of genes that may be involved in organizing this structure. The gradual loss of this population of cells during the life of many mammalian species prior to the onset of disk degeneration suggests that this population of cells is involved in the maintenance and/or repair of this structure.
  • the notochordal cells in the nucleus pulposus have been proposed to arise from the notochord, a structure that has long been known to play an essential role in patterning the spinal column and intervertebral disks 120 ' 133 .
  • the notochord is a mesodermal rod, ventral to the neural tube, running from the head through the tail of the mouse embryo.
  • During embryonic development, the notochord is thought to pattern parts of the overlying vertebral column 133 . It has been difficult to identify the final fates of cells comprising the notochord since this structure begins to recede during mid- to late embryonic development. In mice (and humans) the secreted factor Shh is expressed in the notochord 134 ' 137 . The notochord expresses a number of genes, including the secreted factor sonic hedgehog (Shh) 13 ' 1 5 . Prior to the studies described herein, the role of the notochord in forming the nucleus pulposus was not known.
Characterization of the stem cell population of the nucleus pulposus.
  • the inventors have created a mouse in which a gfpcre fusion cassette 13 is inserted into the Shh locus (Shhgfpcre allele).
  • With the Shhgfpcre allele, all Shh-expressing cells that arise from the notochord are detectable by expression of a marker protein.
  • the nucleus pulposus is comprised entirely of cells that have previously expressed Shh.
  • Mice containing both Shhgfpcre and either of the reporter alleles had reporter expression in the nucleus pulposus (Fig. 9). Expression was observed in the nucleus pulposus from mid-embryogenesis, when this structure first forms, through adulthood. These data suggest that the nucleus pulposus is comprised of cells from the notochord that at some time expressed Shh.
Purification of stem cells of the nucleus pulposus.
  • the Shhgfpcre mouse provides a unique tool to investigate whether a stem cell-like population exists in the nucleus pulposus.
  • Using the Shhgfpcre allele it is possible to purify cells from both the notochord and the nucleus pulposus. Characterization of these cell populations is an essential first step toward the goal of curing back pain associated with damaged intervertebral disks.
  • the nucleus pulposus is believed to contain a stem cell-like population of notochordal cells that is essential for the maintenance and repair of this structure 120 .
  • Mice containing both the Shhgfpcre and eYFP alleles have intervertebral disks in which the nucleus pulposus glows bright green (Figure 9B).
  • This mouse strain is ideal for dissecting, purifying and culturing the stem cell population in the nucleus pulposus.
  • Disks can be dissected from newborn mice using a fluorescent stereo dissecting microscope. The use of a fluorescent microscope during the dissection of the intervertebral disks greatly enhances the ability to quickly and easily identify this tissue in newborn animals.
  • placing the disks in a trypsin/BSA solution creates a single-cell suspension of this tissue.
  • The single-cell suspension of intervertebral disk cells is then sorted into YFP-positive (nucleus pulposus cells) and YFP-negative (rest of the disk) cell populations using a flow cytometer.
  • YFP-positive cells from the notochord and neural tube have been isolated (Fig. 9D). In all cases, greater than 90% enrichment in YFP-positive cells can be obtained.
  • notochordal cells have been proposed to act either as stem cells or as an organizer of this structure 20 ' 130 ' " 2 . Notochordal cells can synthesize new extracellular matrix and can regulate the production of proteoglycan by other cells in the nucleus pulposus.
  • Our purified population of cells should contain notochordal cells and chondrocyte-like cells.
  • Purified nucleus pulposus cells are initially plated using standard cell culture conditions, since it is not yet clear what factors are necessary for the propagation of stem cells from the nucleus pulposus.
  • Notochordal cells in the nucleus pulposus have been reported to express a distinct subset of molecular markers 120 .
  • The presence of notochordal cells in our purified cell population is confirmed using antibodies against proteins known to be expressed in this population (e.g., CD44s 132 , galectin 3 143 , vimentin 144 , cytokeratins 8 and 19 144 , CSPG 145 , and collagen type IIA 146, 147 ).
  • The Shhgfpcre allele can be used to identify all cells that are actively expressing Shh.
  • Shh is expressed in the notochord prior to the formation of the nucleus pulposus 134 . It is not known whether Shh continues to be expressed in the nucleus pulposus. In other tissues, Shh has been shown to be essential for the maintenance of a stem cell-like fate 148 .
  • Shh RNA in situ hybridizations can be performed to detect Shh mRNA.
  • the published Shh antibody 149 can also be used to detect the location of SHH protein in the nucleus pulposus.
  • The Shhgfpcre allele provides a useful alternative to overcome these problems. Because, as discussed, the Shhgfpcre allele expresses GFP in all cells in which Shh is expressed, it is possible to use a commercially available and highly sensitive anti-GFP antibody to detect low amounts of Shh protein. Once Shh expression ceases, GFP expression disappears.
  • If nucleus pulposus cells express Shh, then these cells are candidates for being stem cells or an organizer of this structure.
  • The inventors have discovered a small population of Shh-expressing cells at the edge of the nucleus pulposus of postnatal mice (Fig. 11). It is also possible that by culturing the entire purified nucleus pulposus one can identify a stem cell-like population using known notochordal markers. In either case, the identification and characterization of a cell population that is capable of repairing a damaged nucleus pulposus, or of constructing a new one, is an essential first step in the treatment of disk-related back pain.
Transcriptome of the nucleus pulposus
  • Proteins expressed in the nucleus pulposus are responsible for the maintenance of this structure 120 . The ability to replace proteins in the nucleus pulposus that have degraded due to age or been damaged as a result of injury or disease may have potential therapeutic benefits.
  • Cells from the nucleus pulposus can be isolated and purified and the cells can be characterized in cell culture. In addition to growing and expanding populations of these cells, RNA can be extracted from purified nucleus pulposus cells and amplified mRNA can be labeled and hybridized to Affymetrix DNA microarrays to determine the identities of (all) genes expressed in the nucleus pulposus (the transcriptome), as described above. The identification of the nucleus pulposus transcriptome will provide information leading to recognition of candidate genes that are involved in maintaining the integrity of the nucleus pulposus or are responsible for specifying distinct types of cells in this tissue.
  • extracellular matrix proteins have been demonstrated to provide a number of functions in the nucleus pulposus (for example, providing compressive stiffness) 120 .
  • Secreted factors, such as Shh and Bmp molecules, have been demonstrated in other tissues to be able to pattern undifferentiated cells into defined tissues and organs 120, 132 .
  • Putative extracellular matrix proteins and secreted factors from the nucleus pulposus transcriptome are identified using existing mouse genomic databases and the NCBI Blast program. RNA in situ hybridizations and rtPCR can be performed to confirm that identified factors are expressed in the nucleus pulposus.
  • RNA in situ hybridizations are also useful to determine whether identified factors are expressed in subsets of nucleus pulposus cells. Expression of a gene(s) in a subset of nucleus pulposus cells would be suggestive of important cell-type-specific functions for these factors. In addition, they would provide useful markers for the identification of specific cell types. RNA in situ hybridization is not ideal for determining the locations of factors expressed at low levels or in a small number of cells. Primers designed to amplify genes identified in an array screen that were not detected using RNA in situ hybridization can be used for PCR on cDNA from nucleus pulposus cells.
Treatment of Intervertebral Discs
  • Treatment or repair of disk defects can be accomplished by injecting purified regenerative cells or proteins into damaged disks.
  • Putative stem cells in the nucleus pulposus have been shown to disappear prior to disk degeneration.
  • the re-introduction of a stem cell population may be able to halt or even heal disk degeneration by directly repopulating disks or by providing a factor(s) that is capable of correctly repairing disks.
  • The identification of genes whose products are required for the integrity of the disk, or that pattern disk cells, may also aid in the repair of this structure.
  • The ability to inject purified protein(s) and/or cells directly into a damaged or diseased disk in order to effect disk repair would be a powerful tool in the treatment of back pain.
DETECTION OF NUCLEIC ACID MOLECULES
  • Analysis of gene expression is not limited to any one specific method but can include any method known in the art. All of these principles may be applied independently, in combination, or in combination with other known methods of sequence identification. Examples of methods of gene expression analysis known in the art include DNA arrays or microarrays (Brazma and Vilo, FEBS Lett., 2000, 480, 17-24; Celis, et al., FEBS Lett., 2000, 480, 2-16), SAGE (serial analysis of gene expression) (Madden, et al., Drug Discov.
  • Expressed Sequence Tags (ESTs) can also be used to identify nucleic acid molecules which are overexpressed in a cancer cell.
  • ESTs from a variety of databases can be identified.
  • preferred databases include, for example, Online Mendelian Inheritance in Man (OMIM), the Cancer Genome Anatomy Project (CGAP), GenBank, EMBL, PIR, SWISS-PROT, and the like.
  • OMIM, which is a database of genetic mutations associated with disease, was developed, in part, for the National Center for Biotechnology Information (NCBI).
  • CGAP is an interdisciplinary program to establish the information and technological tools required to decipher the molecular anatomy of a cancer cell.
  • CGAP can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/ncicgap/. Some of these databases may contain complete or partial nucleotide sequences.
  • alternative transcript forms can also be selected from private genetic databases.
  • nucleic acid molecules can be selected from available publications or can be determined especially for use in connection with the present invention.
  • Alternative transcript forms can be generated from individual ESTs which are within each of the databases by computer software which generates contiguous sequences.
  • the nucleotide sequence of the nucleic acid molecule is determined by assembling a plurality of overlapping ESTs.
  • the EST database (dbEST), which is known and available to those skilled in the art, comprises approximately one million different human mRNA sequences comprising from about 500 to 1000 nucleotides, and various numbers of ESTs from a number of different organisms.
  • dbEST can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/dbEST/index.html. These sequences are derived from a cloning strategy that uses cDNA expression clones for genome sequencing.
  • ESTs have applications in the discovery of new genes, mapping of genomes, and identification of coding regions in genomic sequences. Another important feature of EST sequence information that is becoming rapidly available is tissue-specific gene expression data. This can be extremely useful in targeting selective gene(s) for therapeutic intervention. Since EST sequences are relatively short, they must be assembled in order to provide a complete sequence. Because every available clone is sequenced, it results in a number of overlapping regions being reported in the database. The end result is the elicitation of alternative transcript forms from, for example, normal cells and cancer cells.
  • the resultant virtual transcript may represent an already characterized nucleic acid or may be a novel nucleic acid with no known biological function.
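  • By way of non-limiting illustration, the assembly of overlapping ESTs into a contiguous virtual transcript can be sketched as a greedy suffix-prefix merge. This is a simplified sketch and is not the algorithm used by any particular assembler named herein; the function names, the minimum overlap of 4 and the short example reads are assumptions chosen for illustration only. Reads wholly contained in the interior of another read and sequencing errors are not handled.

```python
def overlap(left: str, right: str, min_len: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(reads: list[str], min_len: int = 4) -> list[str]:
    """Greedily merge the pair of sequences with the largest overlap.

    Merging repeats until no pair overlaps by `min_len` or more bases;
    whatever remains is returned as the list of contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical short reads standing in for overlapping ESTs.
example_reads = ["ATGGCGTAC", "CGTACGGA", "CGGATTT"]
print(assemble(example_reads))
```

  • Because every merge removes one sequence from the working list, the loop terminates after at most one pass per input read.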
  • The Institute for Genomic Research (TIGR) Human Genome Index (HGI) database, which is known and available to those skilled in the art, contains a list of human transcripts.
  • TIGR can be accessed through the world wide web of the Internet, at, for example, tigr.org. Transcripts can be generated in this manner using TIGR Assembler, an engine to build virtual transcripts and which is known and available to those skilled in the art.
  • TIGR Assembler is a tool for assembling large sets of overlapping sequence data such as ESTs, BACs, or small genomes, and can be used to assemble eukaryotic or prokaryotic sequences.
  • TIGR Assembler is described in, for example, Sutton, et al., Genome Science & Tech., 1995, 1, 9-19, which is incorporated herein by reference in its entirety, and can be accessed through the file transfer program of the Internet, at, for example, tigr.org/pub/software/TIGR.assembler.
  • GLAXO-MRC, which is known and available to those skilled in the art, is another protocol for constructing virtual transcripts.
  • An "allele" or "variant" is an alternative form of a gene.
  • variants of the genes encoding any potential Shh-related genes identified by the methods of this invention may result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in polypeptides whose structure or function may or may not be altered. Any given natural or recombinant gene may have none, one, or many allelic forms.
  • Common mutational changes that give rise to variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.
  • nucleic acid molecules can be grouped into sets depending on the homology, for example.
  • the members of a set of nucleic acid molecules are compared.
  • the set of nucleic acid molecules is a set of alternative transcript forms of nucleic acid.
  • the members of the set of alternative transcript forms of nucleic acids include at least one member which is associated, or whose encoded protein is associated, with a disease state or biological condition such as a stage of development.
  • comparison of the members of the set of nucleic acid molecules results in the identification of at least one alternative transcript form of nucleic acid molecule which is associated, or whose encoded protein is associated, with a disease state or biological condition.
  • the members of the set of nucleic acid molecules are from a common gene. In another embodiment of the invention, the members of the set of nucleic acid molecules are from a plurality of genes. In another embodiment of the invention, the members of the set of nucleic acid molecules are from different taxonomic species. Nucleotide sequences of a plurality of nucleic acids from different taxonomic species can be identified by performing a sequence similarity search, an ortholog search, or both, such searches being known to persons of ordinary skill in the art.
  • Sequence similarity searches can be performed manually or by using several available computer programs known to those skilled in the art.
  • Blast and Smith-Waterman algorithms, which are available and known to those skilled in the art, and the like can be used.
  • Blast is NCBI's sequence similarity search tool designed to support analysis of nucleotide and protein sequence databases. Blast can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/BLAST/.
  • the GCG Package provides a local version of Blast that can be used either with public domain databases or with any locally available searchable database.
  • GCG Package v9.0 is a commercially available software package that contains over 100 interrelated software programs that enable analysis of sequences by editing, mapping, comparing and aligning them.
  • Other programs included in the GCG Package include, for example, programs which facilitate RNA secondary structure predictions, nucleic acid fragment assembly, and evolutionary analysis.
  • The most prominent genetic databases include, for example, GenBank, EMBL, PIR, and SWISS-PROT.
  • GCG can be accessed through the Internet at, for example, http://www.gcg.com/.
  • Fetch is a tool available in GCG that can get annotated GenBank records based on accession numbers and is similar to Entrez.
  • Another sequence similarity search can be performed with Gene World and GeneThesaurus from Pangea.
  • Gene World 2.5 is an automated, flexible, high-throughput application for analysis of polynucleotide and protein sequences. Gene World allows for automatic analysis and annotations of sequences.
  • Gene World incorporates several tools for homology searching, gene finding, multiple sequence alignment, secondary structure prediction, and motif identification.
  • GeneThesaurus 1.0™ is a sequence and annotation data subscription service providing information from multiple sources, providing a relational data model for public and local data.
  • Another alternative sequence similarity search can be performed, for example, by BlastParse.
  • BlastParse is a PERL script running on a UNIX platform that automates the strategy described above. BlastParse takes a list of target accession numbers of interest and parses all the GenBank fields into "tab-delimited" text that can then be saved in a "relational database" format for easier search and analysis, which provides flexibility. The end result is a series of completely parsed GenBank records that can be easily sorted, filtered, and queried against, as well as an annotations-relational database.
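  • By way of non-limiting illustration, the conversion of flat-file records into tab-delimited text can be sketched as follows. This sketch is not BlastParse itself, which is a PERL script whose internals are not described herein; it assumes the conventional GenBank flat-file layout (keyword within the first 12 columns, indented continuation lines, "//" as record terminator), extracts only a handful of header fields, and uses an invented record named EXAMPLE01 that does not correspond to any real accession.

```python
import csv
import io

# Header fields kept in this sketch; a fuller parser would keep many more.
FIELDS = ["LOCUS", "DEFINITION", "ACCESSION", "VERSION", "ORGANISM"]


def parse_flat_file(text: str) -> list[dict[str, str]]:
    """Collect selected header fields from each record of a flat file."""
    records, current, key = [], {}, None
    for line in text.splitlines():
        if line.startswith("//"):            # record terminator
            records.append(current)
            current, key = {}, None
            continue
        keyword = line[:12].strip()          # keyword column
        value = line[12:].strip()            # data column
        if keyword:                          # new keyword or sub-keyword
            key = keyword if keyword in FIELDS else None
            if key:
                current[key] = value
        elif key and value:                  # continuation of previous field
            current[key] += " " + value
    if current:                              # tolerate a missing final "//"
        records.append(current)
    return records


def to_tab_delimited(records: list[dict[str, str]]) -> str:
    """Render parsed records as tab-delimited text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({name: record.get(name, "") for name in FIELDS})
    return out.getvalue()


# Invented record for illustration only; not a real database entry.
SAMPLE = """\
LOCUS       EXAMPLE01                120 bp    mRNA    linear   ROD 01-JAN-2000
DEFINITION  Hypothetical example transcript, used only to
            illustrate parsing.
ACCESSION   EXAMPLE01
VERSION     EXAMPLE01.1
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
FEATURES             Location/Qualifiers
     source          1..120
ORIGIN
        1 atggcgtacg
//
"""

print(to_tab_delimited(parse_flat_file(SAMPLE)))
```

  • The tab-delimited output can be loaded into a spreadsheet or relational table, where sorting and filtering by field become straightforward.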
  • the plurality of nucleic acids from different taxonomic species which have homology to the target nucleic acid, as described above in the sequence similarity search are further delineated so as to find orthologs of the target nucleic acid therein.
  • An "ortholog” is a term defined in gene classification to refer to two genes in widely divergent organisms that have sequence similarity, and perform similar functions within the context of the organism.
  • paralogs are genes within a species that occur due to gene duplication, but have evolved new functions, and are also referred to as "isotypes.”
  • paralog searches can also be performed. By performing an ortholog search, an exhaustive list of homologous sequences from as diverse organisms as possible is obtained.
  • an ortholog search can be performed by programs available to those skilled in the art including, for example, Compare.
  • an ortholog search is performed with access to complete and parsed GenBank annotations for each of the sequences.
  • the records obtained from GenBank are "flat-files", and are not ideally suited for automated analysis.
  • the ortholog search is performed using a Q-Compare program. Preferred steps of the Q-Compare protocol are described in the flowchart set forth in U.S. Pat. No. 6,221,587, incorporated herein by reference.
  • interspecies sequence comparison is performed using Compare, which is available and known to those skilled in the art.
  • Compare is a GCG tool that allows pair-wise comparisons of sequences using a window/stringency criterion. Compare produces an output file containing points where matches of specified quality are found, which can be plotted with another GCG tool, DotPlot.
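  • By way of non-limiting illustration, a window/stringency comparison can be sketched as follows. This is a minimal sketch of the general dot-matrix technique and is not the GCG Compare implementation; the function name, the parameter names and the example sequences are assumptions chosen for illustration.

```python
def dot_matches(seq_a: str, seq_b: str, window: int,
                stringency: int) -> list[tuple[int, int]]:
    """Return the (i, j) window start positions that pass the criterion.

    A point is reported when at least `stringency` of the `window`
    residues in seq_a[i:i+window] and seq_b[j:j+window] are identical.
    """
    points = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            same = sum(x == y for x, y in
                       zip(seq_a[i:i + window], seq_b[j:j + window]))
            if same >= stringency:
                points.append((i, j))
    return points


# Hypothetical sequences; each reported point would be one dot on the plot.
print(dot_matches("ACGTACGT", "TTACGTAA", window=4, stringency=4))
```

  • Lowering the stringency relative to the window admits imperfect matches, so runs of points along a diagonal indicate regions of similarity between the two sequences.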
  • the nucleic acid molecules of this invention can be isolated using the technique described in the experimental section or replicated using PCR.
  • the PCR technology is the subject matter of U.S. Pat. Nos. 4,683,195, 4,800,159, 4,754,065, and 4,683,202 and described in PCR: The
  • this invention also provides a process for obtaining the polynucleotides of this invention by providing the linear sequence of the polynucleotide, nucleotides, appropriate primer molecules, chemicals such as enzymes and instructions for their replication and chemically replicating or linking the nucleotides in the proper orientation to obtain the polynucleotides.
  • these polynucleotides are further isolated. Still further, one of skill in the art can insert the polynucleotide into a suitable replication vector and insert the vector into a suitable host cell (procaryotic or eucaryotic) for replication and amplification. The DNA so amplified can be isolated from the cell by methods well known to those of skill in the art. A process for obtaining polynucleotides by this method is further provided herein as well as the polynucleotides so obtained.
  • The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably throughout the specification, unless otherwise specified.
  • nucleic acid molecule refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules”), or any phosphoester analogues thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA--DNA, DNA-RNA and RNA-RNA helices are possible.
  • nucleic acid molecule refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms.
  • this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes.
  • sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).
  • a "recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.
  • Percent identity and similarity between two sequences can be determined using a mathematical algorithm (see, e.g., Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M.
  • the sequences are aligned for optimal comparison purposes (e.g., gaps are introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap which need to be introduced for optimal alignment of the two sequences.
  • the amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions, respectively, are then compared.
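  • By way of non-limiting illustration, percent identity for two sequences that have already been aligned can be computed as sketched below. The denominator used here (all alignment columns, gap columns included) is one common convention and is an assumption of this sketch; other conventions divide by the length of the shorter sequence or by the number of ungapped columns. The function name and the gap character "-" are likewise illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned and padded with `gap` to equal
    length. Columns where both sequences have a gap are ignored; columns
    where only one has a gap count toward the length but not toward
    identity, so each introduced gap position lowers the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    identical = sum(x == y and x != gap for x, y in columns)
    return 100.0 * identical / len(columns)


# Four identical columns out of six, one of which is a gap column.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```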
  • amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”
  • a “comparison window” refers to a segment of any one of the number of contiguous positions selected from the group consisting of from 25 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
  • Methods of alignment of sequences for comparison are well-known in the art. For example, the percent identity between two amino acid sequences can be determined using the Needleman and Wunsch algorithm (J. Mol. Biol. (48): 444-453, 1970) which is part of the GAP program in the GCG software package (available at http://www.gcg.com), by the local homology algorithm of Smith & Waterman (Adv. Appl. Math. 2: 482, 1981), by the search for similarity methods of Pearson & Lipman (Proc. Natl. Acad. Sci. USA 85: 2444, 1988) and
  • Gap parameters can be modified to suit a user's needs. For example, when employing the GCG software package, a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6 can be used.
  • Exemplary gap weights using a BLOSUM62 matrix or a PAM250 matrix are 16, 14, 12, 10, 8, 6, or 4, while exemplary length weights are 1, 2, 3, 4, 5, or 6.
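  • By way of non-limiting illustration, the effect of a gap weight and a length weight on a global alignment score can be sketched with a three-state dynamic program. This is a simplified sketch and is not the GCG GAP program; it assumes that a gap of k positions costs gap_weight + k × length_weight, it uses a plain match/mismatch score in place of a substitution matrix such as BLOSUM62, and it returns only the optimal score rather than the alignment itself. All names and default values are illustrative.

```python
NEG_INF = float("-inf")


def global_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap_weight: int = 2, length_weight: int = 1) -> float:
    """Optimal global alignment score with a gap-opening and a per-position cost."""
    n, m = len(a), len(b)
    # M: last column pairs two residues; X: last column is a residue of
    # `a` against a gap; Y: last column is a residue of `b` against a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_weight + length_weight * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_weight + length_weight * j)
    open_cost = gap_weight + length_weight   # cost of the first gap position
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - open_cost,
                          Y[i - 1][j] - open_cost,
                          X[i - 1][j] - length_weight)   # extend existing gap
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          X[i][j - 1] - open_cost,
                          Y[i][j - 1] - length_weight)   # extend existing gap
    return max(M[n][m], X[n][m], Y[n][m])


# Three matches minus one gap of length 1 (cost 2 + 1) gives a score of 0.
print(global_score("ACGT", "AGT"))
```

  • Raising the gap weight relative to the length weight favors a few long gaps over many short ones, which is why the two parameters are usually tuned together with the scoring matrix.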
  • The GCG software package can be used to determine percent identity between nucleic acid sequences. The percent identity between two amino acid or nucleotide sequences also can be determined using the algorithm of E. Myers and W. Miller (CABIOS 4: 11-17, 1989) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
  • gapped BLAST can be used as described in Altschul et al. (Nucleic Acids Res. 25(17): 3389-3402, 1997).
  • The default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
  • fragment or segment as applied to a nucleic acid sequence, or gene, will ordinarily be at least about 5 contiguous nucleic acid bases (for nucleic acid sequence or gene) or amino acids (for polypeptides), typically at least about 10 contiguous nucleic acid bases or amino acids, more typically at least about 20 contiguous nucleic acid bases or amino acids, usually at least about 30 contiguous nucleic acid bases or amino acids, preferably at least about 40 contiguous nucleic acid bases or amino acids, more preferably at least about 50 contiguous nucleic acid bases or amino acids, and even more preferably at least about 60 to 80 or more contiguous nucleic acid bases or amino acids in length.
  • “Overlapping fragments” as used herein, refer to contiguous nucleic acid fragments which begin at the amino terminal end of a nucleic acid and end at the carboxy terminal end of the nucleic acid or protein. Each nucleic acid or fragment has at least about one contiguous nucleic acid position in common with the next nucleic acid fragment, more preferably at least about three contiguous nucleic acid bases in common, most preferably at least about ten contiguous nucleic acid bases in common.
  • a significant "fragment" in a nucleic acid context is a contiguous segment of at least about 17 nucleotides, generally at least 20 nucleotides, more generally at least 23 nucleotides, ordinarily at least 26 nucleotides, more ordinarily at least 29 nucleotides, often at least 32 nucleotides, more often at least 35 nucleotides, typically at least 38 nucleotides, more typically at least 41 nucleotides, usually at least 44 nucleotides, more usually at least 47 nucleotides, preferably at least 50 nucleotides, more preferably at least 53 nucleotides, and in particularly preferred embodiments will be at least 56 or more nucleotides.
  • Additional preferred embodiments will include lengths in excess of those numbers, e.g., 63, 72, 87, 96, 105, 117, etc.
  • Said fragments may have termini at any pairs of locations, but especially at boundaries between structural domains, e.g., membrane spanning portions.
  • homologous nucleic acid sequences when compared, exhibit significant sequence identity or similarity.
  • the standards for homology in nucleic acids are either measures for homology generally used in the art by sequence comparison or based upon hybridization conditions. The hybridization conditions are described in greater detail below.
  • Substantial homology in the nucleic acid sequence comparison context means either that the segments, or their complementary strands, when compared, are identical when optimally aligned, with appropriate nucleotide insertions or deletions, in at least about 50% of the nucleotides, generally at least 56%, more generally at least 59%, ordinarily at least 62%, more ordinarily at least 65%, often at least 68%, more often at least 71%, typically at least 74%, more typically at least 77%, usually at least 80%, more usually at least about 85%, preferably at least about 90%, more preferably at least about 95 to 98% or more, and in particular embodiments, as high as about 99% or more of the nucleotides.
  • substantial homology exists when the segments will hybridize under selective hybridization conditions, to a strand, or its complement.
  • selective hybridization will occur when there is at least about 55% homology over a stretch of at least about 14 nucleotides, preferably at least about 65%, more preferably at least about 75%, and most preferably at least about 90%. See, Kanehisa (1984) Nuc. Acids Res. 12:203-213.
  • the length of homology comparison may be over longer stretches, and in certain embodiments will be over a stretch of at least about 17 nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 40 nucleotides, preferably at least about 50 nucleotides, and more preferably at least about 75 to 100 or more nucleotides.
  • the endpoints of the segments may be at many different pair combinations.
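The identity thresholds above can be illustrated with a short sketch. It assumes the two segments have already been optimally aligned (gaps written as "-") and that identity is counted over all alignment columns; the function names and the default cutoffs (55% over at least 14 positions, taken from the selective hybridization passage) are illustrative choices, not a method prescribed by the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps,
    and gap columns count against identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str,
                    min_identity: float = 55.0, min_length: int = 14) -> bool:
    """True if the aligned stretch is long enough and identical enough."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    a = "ACGT" * 3 + "AC"   # 14 nucleotides
    b = "ACGT" * 3 + "TT"   # differs at the last two positions
    print(round(percent_identity(a, b), 1), meets_threshold(a, b))
```

Raising `min_identity` and `min_length` corresponds to the progressively stricter "preferably" and "more preferably" tiers listed in the text.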
  • Stringent conditions, in referring to homology in the hybridization context, will be stringent combined conditions of salt, temperature, organic solvents, and other parameters, typically those controlled in hybridization reactions.
  • Stringent temperature conditions will usually include temperatures in excess of about 30°C, more usually in excess of about 37°C, typically in excess of about 45°C, more typically in excess of about 55°C, preferably in excess of about 65°C, and more preferably in excess of about 70°C.
  • Stringent salt conditions will ordinarily be less than about 1000 mM, usually less than about 500 mM, more usually less than about 400 mM, typically less than about 300 mM, preferably less than about 200 mM, and more preferably less than about 150 mM. However, the combination of parameters is much more important than the measure of any single parameter. See, e.g., Wetmur and Davidson (1968) J. Mol. Biol. 31:349-370.
  • the primers of the invention embrace oligonucleotides of sufficient length and appropriate sequence so as to provide specific initiation of polymerization on a significant number of nucleic acids in the polymorphic locus.
  • the term "primer" as used herein refers to a sequence comprising two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and most preferably more than eight, which sequence is capable of initiating synthesis of a primer extension product, which is substantially complementary to a polymorphic locus strand.
  • Environmental conditions conducive to synthesis include the presence of nucleoside triphosphates and an agent for polymerization, such as DNA polymerase, and a suitable temperature and pH.
  • the primer is preferably single stranded for maximum efficiency in amplification, but may be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition.
  • An oligonucleotide primer typically contains 12-20 or more nucleotides, although it may contain fewer nucleotides.
  • Primers of the invention are designed to be “substantially” complementary to each strand of the genomic locus to be amplified and include the appropriate G or C nucleotides as discussed above. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions which allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' flanking sequences to hybridize therewith and permit amplification of the genomic locus.
  • Oligonucleotide primers of the invention are employed in the amplification process which is an enzymatic chain reaction that produces exponential quantities of target locus relative to the number of reaction steps involved.
  • one primer is complementary to the negative (-) strand of the locus and the other is complementary to the positive (+) strand.
  • Annealing the primers to denatured nucleic acid, followed by extension with an enzyme such as the large fragment of DNA Polymerase I and nucleotides, results in newly synthesized + and - strands containing the target locus sequence. Because these newly synthesized sequences are also templates, repeated cycles of denaturing, primer annealing, and extension result in exponential production of the region (i.e., the target locus sequence) defined by the primer.
  • the product of the chain reaction is a discrete nucleic acid duplex with termini corresponding to the ends of the specific primers employed.
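The exponential character of the chain reaction described above can be made concrete with a small calculation. This is a minimal sketch: the `efficiency` parameter (fraction of templates copied per cycle) is an assumption added for illustration, and with perfect efficiency the copy number simply doubles each cycle.

```python
def copies_after_cycles(initial_copies: int, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized copy number of the target locus after a number of cycles.

    Each cycle multiplies the template count by (1 + efficiency);
    efficiency = 1.0 is perfect doubling (hypothetical parameter).
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, copies_after_cycles(1, n))
```

Real reactions plateau as primers and nucleotides are consumed, so the formula is only a reasonable guide for the early cycles.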
  • the oligonucleotide primers of the invention may be prepared using any suitable method, such as conventional phosphotriester and phosphodiester methods or automated embodiments thereof. In one such automated embodiment, diethylphosphoramidites are used as starting materials and may be synthesized as described by Beaucage, et al. (Tetrahedron Letters, 22:1859-1862, 1981). One method for synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066.
  • nucleic acid specimen, in purified or nonpurified form, can be utilized as the starting nucleic acid or acids, provided it contains, or is suspected of containing, the specific nucleic acid sequence containing the target locus (e.g., CpG).
  • the process may employ, for example, DNA or RNA, including messenger RNA, wherein DNA or RNA may be single stranded or double stranded.
  • RNA is to be used as a template
  • enzymes, and/or conditions optimal for reverse transcribing the template to DNA would be utilized.
  • a DNA-RNA hybrid which contains one strand of each may be utilized.
  • a mixture of nucleic acids may also be employed, or the nucleic acids produced in a previous amplification reaction herein, using the same or different primers may be so utilized.
  • the specific nucleic acid sequence to be amplified i.e., the target locus, may be a fraction of a larger molecule or can be present initially as a discrete molecule, so that the specific sequence constitutes the entire nucleic acid. It is not necessary that the sequence to be amplified be present initially in a pure form; it may be a minor fraction of a complex mixture, such as contained in whole human DNA.
  • the nucleic acid-containing specimen used for detection may be from any source including a developing limb bud, enamel knot tissue, follicle of a hair, retinal tissue, brain, colon, urogenital tissue, hematopoietic tissue, thymus, testis, ovarian, uterine, prostate, breast, gastrointestinal, colon, lung and renal tissue and may be extracted by a variety of techniques such as that described by Maniatis, et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., pp 280, 281, 1982).
  • the extracted sample is impure (such as plasma, serum, or blood or a sample embedded in paraffin)
  • it may be treated before amplification with an amount of a reagent effective to open the cells, fluids, tissues, or animal cell membranes of the sample, and to expose and/or separate the strand(s) of the nucleic acid(s).
  • This lysing and nucleic acid denaturing step to expose and separate the strands will allow amplification to occur much more readily.
  • Strand separation can be effected either as a separate step or simultaneously with the synthesis of the primer extension products. This strand separation can be accomplished using various suitable denaturing conditions, including physical, chemical, or enzymatic means; the term "denaturing" includes all such means.
  • One physical method of separating nucleic acid strands involves heating the nucleic acid until it is denatured. Typical heat denaturation may involve temperatures ranging from about 80 degrees to 105 degrees C for times ranging from about 1 to 10 minutes.
  • Strand separation may also be induced by an enzyme from the class of enzymes known as helicases or by the enzyme RecA, which has helicase activity, and in the presence of riboATP, is known to denature DNA.
  • the reaction conditions suitable for strand separation of nucleic acids with helicases are described by Kuhn Hoffmann-Berling (CSH-Quantitative Biology, 43:63, 1978) and techniques for using RecA are reviewed in C. Radding (Ann. Rev. Genetics, 16:405-437, 1982).
  • the separated strands are ready to be used as a template for the synthesis of additional nucleic acid strands.
  • This synthesis is performed under conditions allowing hybridization of primers to templates to occur. Generally synthesis occurs in a buffered aqueous solution, preferably at a pH of 7-9, most preferably about 8.
  • a molar excess for genomic nucleic acid, usually about 10 :1 primer template
  • the amount of complementary strand may not be known if the process of the invention is used for diagnostic applications, so that the amount of primer relative to the amount of complementary strand cannot be determined with certainty.
  • the amount of primer added will generally be in molar excess over the amount of complementary strand (template) when the sequence to be amplified is contained in a mixture of complicated long-chain nucleic acid strands. A large molar excess is preferred to improve the efficiency of the process.
  • the deoxyribonucleoside triphosphates dATP, dCTP, dGTP, and dTTP are added to the synthesis mixture, either separately or together with the primers, in adequate amounts and the resulting solution is heated to about 90 degrees to 100 degrees C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period, the solution is allowed to cool to room temperature, which is preferable for the primer hybridization. To the cooled mixture is added an appropriate agent for effecting the primer extension reaction (called herein "agent for polymerization"), and the reaction is allowed to occur under conditions known in the art.
  • agent for polymerization may also be added together with the other reagents if it is heat stable.
  • This synthesis (or amplification) reaction may occur at room temperature up to a temperature above which the agent for polymerization no longer functions.
  • the temperature is generally not greater than about 40 degrees C. Most conveniently, the reaction occurs at room temperature.
  • the agent for polymerization may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes.
  • Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase, other available DNA polymerases, polymerase muteins, reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e., those enzymes which perform primer extension after being subjected to temperatures sufficiently elevated to cause denaturation).
  • Suitable enzymes will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each locus nucleic acid strand.
  • the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths.
  • There may be agents for polymerization, however, which initiate synthesis at the 5' end and proceed in the other direction, using the same process as described above.
  • the method of amplifying is by PCR, as described herein and as is commonly used by those of ordinary skill in the art.
  • Alternative methods of amplification have been described and can also be employed as long as the methylated and non-methylated loci amplified by PCR using the primers of the invention are similarly amplified by the alternative means.
  • the amplified products are preferably identified as by sequencing. Sequences amplified by the methods of the invention can be further evaluated, detected, cloned, sequenced, and the like, either in solution or after binding to a solid support, by any method usually applied to the detection of a specific DNA sequence such as PCR, oligomer restriction (Saiki, et al., Bio/Technology, 3:1008-1012, 1985), allele-specific oligonucleotide (ASO) probe analysis (Conner, et al., Proc. Natl. Acad. Sci.
  • ASO allele-specific oligonucleotide
  • OLAs oligonucleotide ligation assays
  • the genes that are co-expressed with the Shh gene are identified by a biochip, such as for example, an Affymetrix GeneChip.
  • a corresponding SAGE tag can be identified, and the total number of SAGE tags present in the SAGEmap database (http://www.ncbi.nlm.nih.gov/SAGE/) can be determined.
  • any gene having at least about five tags in about one of these two SAGE libraries was then excluded from further analysis.
  • Serial Analysis of Gene Expression is based on the identification of and characterization of partial, defined sequences of transcripts corresponding to gene segments. These defined transcript sequence "tags" are markers for genes which are expressed in a cell, a tissue, or an extract, for example.
  • a short nucleotide sequence tag (9 to 10 bp) contains sufficient information content to uniquely identify a transcript provided it is isolated from a defined position within the transcript. For example, a sequence as short as 9 bp can distinguish 262,144 transcripts (4^9) given a random nucleotide distribution at the tag site, whereas estimates suggest that the human genome encodes about 80,000 to 200,000 transcripts (Fields, et al., Nature Genetics, 7:345 1994). The size of the tag can be shorter for lower eukaryotes or prokaryotes, for example, where the number of transcripts encoded by the genome is lower. For example, a tag as short as 6-7 bp may be sufficient for distinguishing transcripts in yeast.
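The tag-capacity arithmetic in the passage above can be checked directly. The sketch below assumes, as the text does, a random nucleotide distribution at the tag site; the function names are illustrative.

```python
def distinct_tags(tag_length: int) -> int:
    """Number of distinct sequences of a given length over the alphabet A, C, G, T."""
    if tag_length < 0:
        raise ValueError("tag_length must be non-negative")
    return 4 ** tag_length


def min_tag_length(num_transcripts: int) -> int:
    """Shortest tag length whose sequence space covers num_transcripts."""
    if num_transcripts < 1:
        raise ValueError("num_transcripts must be at least 1")
    length = 0
    while distinct_tags(length) < num_transcripts:
        length += 1
    return length


if __name__ == "__main__":
    print(distinct_tags(9))          # 262144
    print(min_tag_length(80_000))    # 9
    print(min_tag_length(200_000))   # 9
```

A 9 bp tag therefore covers both ends of the 80,000 to 200,000 transcript estimate, although real transcripts are not random sequences, so some collisions should still be expected.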
  • serial analysis of the sequence tags requires a means to establish the register and boundaries of each tag.
  • the concept of deriving a defined tag from a sequence in accordance with the present invention is useful in matching tags of samples to a sequence database.
  • a computer method is used to match a sample sequence with known sequences.
  • the tags used herein uniquely identify genes.
  • genes can be identified by matching the tag to a gene database member, or by using the tag sequences as probes to physically isolate previously unidentified genes from cDNA libraries.
  • the methods by which genes are isolated from libraries using DNA probes are well known in the art. See, for example, Velculescu et al., Science 270: 484 (1995), and Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, 2nd ed. (Cold Spring Harbor Press, Cold Spring Harbor, N.Y.).
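Matching a tag to a gene database member, as described above, can be sketched as a dictionary lookup. The text only requires that the tag come from a defined position within the transcript; the choice made here (the 10 bases following the 3'-most CATG anchoring site) is an assumption for illustration, and the function names and example sequences are hypothetical.

```python
from typing import Dict, List, Optional


def extract_tag(transcript: str, anchor: str = "CATG",
                tag_length: int = 10) -> Optional[str]:
    """Return the tag that follows the 3'-most anchor site with a full-length tag.

    Assumption: the anchor sequence and tag length are illustrative defaults.
    """
    seq = transcript.upper()
    pos = seq.rfind(anchor)
    while pos != -1:
        start = pos + len(anchor)
        tag = seq[start:start + tag_length]
        if len(tag) == tag_length:
            return tag
        # Site is too close to the 3' end; look for the next site upstream.
        pos = seq.rfind(anchor, 0, pos + len(anchor) - 1)
    return None


def build_tag_index(transcripts: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each tag to the names of the database transcripts that yield it."""
    index: Dict[str, List[str]] = {}
    for name, seq in transcripts.items():
        tag = extract_tag(seq)
        if tag is not None:
            index.setdefault(tag, []).append(name)
    return index


if __name__ == "__main__":
    # Hypothetical database entries, not real gene sequences.
    database = {
        "geneA": "AAA" + "CATG" + "ACGTACGTAC" + "TTT",
        "geneB": "GGG" + "CATG" + "T" * 10 + "CC",
    }
    index = build_tag_index(database)
    print(index.get("ACGTACGTAC", []))
```

A tag that maps to more than one name, or to none, flags an ambiguous or previously unidentified transcript, which is the case the text handles by using the tag as a probe against cDNA libraries.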
  • the Shhgfpcre ES cell targeting construct was made in pBR322, a low copy number vector, because it had previously been reported that the presence of sequences 5' of the mouse Shh locus resulted in bacterial cell death.
  • the 5' targeting arm was 1.2 kb followed by a gfpcre cassette containing an in-frame fusion between gfp and cre.
  • the gfpcre cassette was placed at the ATG of Shh. Base pairs located at -1 and -5, relative to the Shh ATG, were changed from a G/C to a C/G and A/T to a T/A, respectively, to create a SalI site used to clone the 5' targeting arm.
  • the 3' targeting arm was 8 kb and began 35 bp downstream of the Shh ATG.
  • the only genomic sequences lacking in correctly targeted ES cells were the first 35 base pairs after the ATG of Shh. These base pairs were excised to create a Shh null allele.
  • All genomic sequences involved in regulating expression from the Shh locus are present in correctly targeted ES cells.
  • Correctly targeted ES cells were identified by Southern blot analysis. ES cells were injected into the blastocoele of a recipient blastocyst, which was then implanted into a pseudo-pregnant recipient mouse. Male chimeric mice were then produced in which some of the animals' cells were derived from the ES cells and the remainder were from the recipient blastocyst.
  • Chimeric male mice were used for breeding to determine if the injected ES cells contributed to the germline.
  • Successful passage of the Shhgfpcre allele through the germline resulted in the establishment of a stable mouse line in which the gfpcre cassette had been inserted at the ATG of the Shh gene.
  • Embryos and adults were genotyped by one of the following methods: PCR, staining for β-galactosidase, or direct visualization of GFP expression.
  • Example 1 Purifying Shh-Expressing Cells from the Embryonic Vertebrate Limb.
  • Sonic hedgehog (Shh) is the only gene known to be expressed exclusively in the Zone of Polarizing Activity (ZPA), an important signaling center during limb development.
  • ZPA Zone of Polarizing Activity
  • the Shhgfpcre allele was constructed by placing a GfpCre fusion cassette into one of the Shh alleles. Insertion of this cassette creates a null Shh allele. However, the remaining Shh allele is unaffected and mice that are heterozygous for the Shhgfpcre allele are phenotypically wild type. As a consequence of containing the Shhgfpcre allele, mice express GFP and CRE in all cells that normally express Shh 21. In the limb, GFP expression is indistinguishable from that of endogenous Shh and no ectopic GFP or CRE expression is observed outside of the wild type Shh expression domains.
  • GFP-positive limbs from 2-4 embryos were harvested and placed in trypsin for 5 minutes at 37°C. After 5 minutes, the trypsin was replaced with fresh trypsin and the limbs were allowed to incubate for an additional 5 minutes at 37°C. The limbs were then washed twice in PBS and dissociated into a single-cell suspension by pipetting vigorously for 5 minutes. Control, GFP-negative limbs were processed in an identical manner.
  • the single-cell suspension of limb cells was then sorted using FACS.
  • Limbs that lacked GFP were used to set a control GFP-negative baseline.
  • Dissociated cells from limbs that expressed GFP in the ZPA that is, contained the Shhgfpcre allele
  • Two easily distinguishable populations of cells were observed in the GFP-positive sample. The first overlapped the trace obtained when GFP-negative limbs had been sorted.
  • the second population contained GFP-positive cells.
  • Shhgfpcre limb cells that were GFP-positive and GFP-negative were collected from the same sample.
  • the GFP-positive population should contain cells that express Shh while the GFP-negative population is composed of cells present in the rest of the limb.
  • From each limb, ~700 GFP-positive cells were obtained. To determine the purity of each population of cells, an aliquot of GFP-negative and GFP-positive sorted cells was resorted using identical parameters as the initial sort. Of the GFP-negative resorted cells, >99.7% remained GFP negative. Of the GFP-positive resorted cells, >82% remained GFP positive. These data demonstrate that we can successfully purify GFP-negative and GFP-positive (Shh expressing) cells from limbs of embryonic mice.
  • Example 2 Hybridizing Affymetrix Arrays with cRNA from Shh-positive and Shh-Negative E10.5 Limbs.
  • Affymetrix 2-step cDNA amplification kit since the starting amount of RNA we obtained from the GFP-positive population was insufficient for single step amplification. A total of eight chips was used, 4 with cRNA from the GFP-positive cells and 4 with cRNA from the GFP-negative population. Each chip was hybridized with a biologically independent sample. Equal amounts of cRNA were used on each chip and processed using standard Affymetrix procedures. Comparison of the transcriptomes of Shh-expressing and Shh-non-expressing cells should yield genes that are exclusively expressed in Shh cells and nowhere else in the limb. Affymetrix GeneChips function by detecting transcript abundance, which can be compared between two (or more) different cell populations.
  • Affymetrix GeneChips allow for the identification of the transcripts present in a defined population of cells.
  • Affymetrix GeneChips were probed with cRNA made from either Shh-positive cells or cells that do not express Shh in the E10.5 mouse limb. After hybridizing the chips, ~40% of the probe sets were scored as "present" on chips hybridized with samples from either the Shh-positive or Shh-negative cells.
  • To determine which genes are differentially expressed in the two cell populations we used two array analysis tools, BRB array tools and D-chip 30. The two programs produced overlapping but not identical sets of genes that are expressed at higher levels in either the Shh-positive or negative populations of cells, as described below.
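The comparison of four Shh-positive chips against four Shh-negative chips can be illustrated with a generic two-group calculation. This sketch does not reproduce the algorithms of BRB array tools or D-chip; it assumes a simple log2 fold change with a Welch t statistic, and the gene names, signal values, and cutoff are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Dict, List, Sequence, Tuple


def log2_fold_change(pos: Sequence[float], neg: Sequence[float]) -> float:
    """log2 of mean signal in Shh-positive chips over Shh-negative chips."""
    return math.log2(mean(pos) / mean(neg))


def welch_t(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Welch t statistic; requires nonzero variance in at least one group."""
    se = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return (mean(pos) - mean(neg)) / se


def rank_genes(expr: Dict[str, Tuple[Sequence[float], Sequence[float]]],
               min_abs_log2fc: float = 1.0) -> List[Tuple[str, float, float]]:
    """Keep genes passing the fold-change cutoff, sorted from most up to most down."""
    hits = []
    for gene, (pos, neg) in expr.items():
        lfc = log2_fold_change(pos, neg)
        if abs(lfc) >= min_abs_log2fc:
            hits.append((gene, lfc, welch_t(pos, neg)))
    return sorted(hits, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical signal intensities: four replicate chips per population.
    chips = {
        "geneX": ([800, 820, 780, 800], [100, 110, 90, 100]),
        "geneY": ([100, 105, 95, 100], [100, 95, 105, 100]),
        "geneZ": ([50, 55, 45, 50], [200, 210, 190, 200]),
    }
    for gene, lfc, t in rank_genes(chips):
        print(gene, round(lfc, 2), round(t, 1))
```

Running two different tools and comparing their gene lists, as the text describes, guards against hits that depend on one program's normalization or model.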
  • Hb9 has been demonstrated to play an important role in pancreas development but no function in limb development has been reported 33, 34.
  • Cholecystokinin, a gastrointestinal hormone 35 and arachidonate 12-lipoxygenase, an enzyme that introduces a molecular oxygen at carbon 12 of arachidonic acid to generate a 12-hydroperoxy 36 derivative were also uncovered (Table 1). The role that these genes may play in limb development is unknown.
  • TM1 and TM2 are not similar to the transmembrane genes patched, smoothened, Hip or dispatched, which are known to have essential roles in Shh signaling.
  • the other eight genes are ESTs of unknown function.
  • Cells that express Shh in the E10.5 mouse limb comprise a small subset of the total number of cells present in the limb.
  • Shh-expressing and non-expressing cells we were able to identify a pool of 120 genes that appeared to have elevated expression in cells that did not express Shh.
  • the number of genes elevated in cells that were not expressing Shh was larger than the 15 genes elevated in Shh-positive cells. This is not unexpected, since precursor cells for a number of tissues including bone, muscle and connective tissue are known to be present outside of the Shh-positive regions of the E10.5 limb.
  • the list of genes that show upregulated expression, relative to that observed in Shh-expressing cells, is listed in Table 2, supra. Example 5. In vivo Validation of Target Genes Identified by Microarray Screening.
  • RNA in situ hybridizations were performed. Of the thirteen unknown genes found in our screen to be up-regulated in Shh-expressing cells, five were detected in the E10.5 limb using whole mount RNA in situ hybridization. Four of these genes, i.e., the two transmembrane factors TM1 and TM2 and two ESTs, were expressed in Shh-positive cells in the E10.5 limb (Fig. 2). These data demonstrate that the array experiment can identify previously unidentified genes that are co-expressed with Shh.
  • From the pool of 120 genes identified as being down-regulated in Shh-expressing cells, four were characterized using RNA whole mount in situ hybridization. One gene, i.e., netrin, had previously been reported as being expressed in the vertebrate limb in cells that do not normally express Shh 37. We confirmed that this gene is not present in Shh-positive cells by performing double RNA whole mount in situ hybridization with netrin and Shh.
  • the transmembrane gene TM1 was identified as being enriched in Shh cells.
  • RNA in situ hybridizations using TM1 as a probe confirmed that this gene is expressed exclusively in Shh-expressing cells at E10.5 (the stage that limbs were harvested for screening; Fig. 2B).
  • TM1 expression in the limb was indistinguishable from Shh (Figure 3). TM1 expression was first observed in E9.75 limbs in the posterior/distal portion of the limb and overlapped with Shh expression.
  • TM1 was observed in a number of locations in which Shh is expressed including the brain, branchial arches and gut. In all experiments, a clone containing 2 kb of TM1 was used as an in situ probe. This probe has no sequence identity to Shh at the DNA or amino acid level.
  • TM1 is predicted to contain eight membrane-spanning domains. There are four transmembrane proteins, Patched (Ptc), Smoothened (Smo), Hedgehog-interacting protein (Hip) and Dispatched (Disp1) that are known to function in the Shh-signaling pathway (reviewed in 10). Ptc is thought to bind to Smo in the absence of Shh protein. Upon binding Shh, Ptc releases Smo and Shh-signaling commences in a cell. Both Ptc and Smo are expressed in the ZPA and in cells surrounding the ZPA. Disp1 is important for removing Shh from cells that are actively producing Shh protein; however Disp1 does not appear to be expressed in the E10.5 limb ZPA. Hip has been proposed to be involved in fine-tuning the Shh protein gradient and is not expressed in the ZPA. Amino acid alignments indicate that TM1 shares no protein similarity to Ptc, Smo, Hip or
  • TM1 may have a novel role in the Shh-signaling pathway.
  • TM1 also contains a DUF590 domain, which is a conserved protein domain of unknown function.
  • TM1 is a member of a highly conserved gene family consisting of eight members. All eight members contain eight transmembrane domains and the DUF590 domain. The furthest removed from TM1 is 36% identical at the amino acid level while the most similar family member is 59% identical (homology is observed throughout the entire coding regions of family members). At least eight members of this protein family are found in zebrafish, humans, rats and chicks. There are four family members present in Drosophila and one in C. elegans.
  • DOG1 the human TM1 homolog
  • GIST gastrointestinal stromal tumors
  • DOG1 was present in 136 of 139 (97.8%) of GIST tumors 38.
  • the authors suggest that DOG1 may be a very useful marker for testing if GIST tumors can be treated with imatinib mesylate.
  • no other functions for TM1 are known in vertebrates and the functions of all invertebrate TM1 members remain unknown. In addition, the expression pattern of TM1 has not previously been described in any organism.
  • TM1 is Not Expressed in Fgf4/Fgf8 Double Mutant Null Limbs.
  • the fibroblast growth factors (Fgfs) Fgf4 and Fgf8 are expressed in the Apical Ectodermal
  • the Msx2-cre transgene causes complete inactivation of Fgf4 and Fgf8 before they are expressed (a genetic "null") 44.
  • expression of both Fgf4 and Fgf8 commences before Cre is expressed; thus there is a burst of Fgf activity during the early development of the forelimb before these genes are inactivated 42.
  • the genes Shh and Fgf4 are part of a feedback loop that is responsible for controlling the final size of the limb (see Developing Limb Bud, supra). Examination of the hindlimbs of Fgf4;Fgf8 double knockout embryos demonstrated that TM1 expression requires Fgf expression in the AER (compare Fig. 4a and 4b). No TM1 expression was observed in the hindlimbs of embryos that lacked both Fgf4 and Fgf8. In forelimbs, a significant decrease in TM1 expression occurred. The presence of a small amount of TM1 in Fgf4/8 double knockout forelimbs may be the result of an early burst of Fgf expression in this tissue.
  • TM1 expression in Fgf4/8 forelimbs was similar to the expression pattern reported for Shh in these double mutants 42.
  • These data indicate that TM1 expression, like Shh, requires Fgf expression from the AER. From these experiments it is not clear whether TM1 is absent in Fgf4;Fgf8 double mutants because TM1 expression is directly dependent on the expression of Fgf factors present in the AER, or whether the loss of TM1 is an indirect result of loss of multiple independent signaling pathways.
  • mice that lack Shh have been created previously 7. These mice lack most distal limb structures but still form relatively normal proximal limbs. Since TM1 and Shh expression overlap in the limb, it was possible that Shh expression is required for expression of TM1. This Example describes results of a study to test this hypothesis.
  • TM1 expression was detected in its normal pattern in the posterior of E10 Shh null limbs (compare Fig. 3c and 3d). After E10, TM1 expression was not detected. This experiment demonstrates that Shh is not required for the initial expression of TM1 in the mouse limb, but it may be required for maintaining TM1 expression.
  • Gremlin in the limb has also been reported to be independent of Shh expression.
  • this gene is not expressed in Shh-positive cells but in cells adjacent to cells expressing Shh. It has been proposed that the initial burst of Gremlin expression in the posterior of Shh-null limbs is due to an unknown posterior signal 13. It is possible that TM1 is involved in the pathway responsible for propagating this signal in the absence of Shh. Based on these observations, it is believed that genes, RNA or protein as shown herein to be differentially expressed in the developing limb bud will provide useful therapeutic agents for regenerating, reconstructing or repairing the limb of a subject in need thereof, such as a mature adult.
  • Example 10 Chick TM1 is Expressed in the ZPA.
  • Chicks are an excellent model system in which to perform embryological experiments. Unlike the mouse embryo, the chick embryo is easily accessible for tissue manipulations and an individual chick embryo can be observed at several time points during the course of an experiment. This Example demonstrates that the chick system can be used to study the role of TM1 in limb development.
  • cTM1 that shares 80% amino acid identity with mouse TM1.
  • cTM1 like its mouse counterpart, contains eight transmembrane domains and a DUF590 domain.
  • cTM1 is expressed in the ZPA of chick forelimbs until stage 21 of development (approximately equivalent to mouse stage E10.5) but is not expressed in hindlimbs (Fig. 4E). At later stages of development, cTM1 was not expressed in either the forelimbs or hindlimbs. Expression was also observed in the brain, gut and branchial arches, consistent with mouse TM1 expression.
  • This Example describes the construction and use of a mouse model that conditionally expresses a reporter protein in Shh-expressing cells only upon induction of expression of the gene construct by tamoxifen.
  • the model is useful in studies of cell fate mapping.
  • this model was used to demonstrate that the notochord is the progenitor of the entire nucleus pulposus of the intervertebral disks.
  • a tamoxifen-inducible ShhcreERT2 allele was created.
  • the ShhcreERT2 allele can be used to activate the R26R reporter allele in Shh-expressing cells at discrete developmental stages. Briefly, this allele was created as follows.
  • the cre gene in the ShhcreERT2 allele is a fusion between cre and an estrogen binding domain. This fusion protein can be activated in tissue-specific locations in the mouse embryo upon injection of tamoxifen (as described in ' ).
  • the creERT2 gene was knocked into the Shh locus in ES cells. Correctly targeted ES cells were injected into mouse blastocysts to create chimeric animals carrying the ShhcreERT2 allele. These mice were then bred to obtain germline transmission of the ShhcreERT2 allele.
  • the ShhcreERT2 mouse model was used to determine the origin of the cells that form the nucleus pulposus of the intervertebral disks.
  • the notochord begins to form at E7.0.
  • the nucleus pulposus forms at E14.5.
  • the fate-mapping experiment described herein required that injected tamoxifen be cleared from the embryo prior to formation of the nucleus pulposus. Since it has been reported that CRE activity is undetectable 48 hours after tamoxifen injection, we first injected tamoxifen between E6.0 and E7.75. However, we have found that tamoxifen injections into pregnant mice carrying E8.0 or younger embryos result in death of the embryos.
  • Figures 10B and 10C both illustrate 10 μm transverse sections.
  • reporter activity was found in the limb, as expected, since the limb forms less than 48 hours after the E8.5 tamoxifen injection.
  • the preputial glands of the external genitalia express Shh at E13.5.
  • mice pregnant with E19.5 embryos carrying the ShhcreERT2 and R26R reporter alleles were injected with tamoxifen and the pups were harvested one day after birth. In these animals, ~8-15 cells in each nucleus pulposus were stained (Fig. 11A-C). More particularly, Figs. 11A-C illustrate results of β-galactosidase staining of 10 μm sagittal sections of the intervertebral disks of P1 mice that were exposed to tamoxifen at E19.5. The arrows point to cells positive for β-galactosidase (i.e., they express CRE from the ShhcreERT2 allele).
  • Each panel (A-C) is from a different intervertebral disk.
  • Drossopoulou, G. et al. A model for anteroposterior patterning of the vertebrate limb based on sequential long- and short-range Shh signalling and Bmp signalling. Development 127,
  • DOG1 novel marker
  • Drosophila segment polarity gene hh is expressed in tissues with polarizing activity in zebrafish embryos. Cell 75, 1431-44 (1993).
  • DOG1 The novel marker, DOG1, is expressed ubiquitously in gastrointestinal stromal tumors irrespective of KIT or PDGFRA mutation status. Am J Pathol. 2004

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Environmental Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Veterinary Medicine (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Animal Husbandry (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention provides for methods of isolating stem cells and their descendant cells, as well as identifying genes that are differentially expressed in Sonic Hedgehog (Shh) gene expressing cells. More particularly, the present invention relates to methods for discovering genes and proteins that may be useful in tissue engineering, regeneration, reconstruction and/or repair of tissues that actively express, or have at any time during the life of the organism expressed, the Sonic Hedgehog gene.

Description

Docket No: 60943WO (49163)
METHOD OF ISOLATING STEM CELLS AND IDENTIFYING DIFFERENTIALLY EXPRESSED GENES IN SONIC HEDGEHOG-EXPRESSING CELLS AND
DESCENDENT CELLS
RELATED APPLICATIONS
The present application claims priority under 35 U.S.C. 119(e) to U.S. Provisional Application No. 60/713,400, entitled Method of Isolating Stem Cells and Identifying Differentially Expressed Genes in Sonic Hedgehog Expressing Cells and Descendant Cells, filed August 31, 2005, the disclosure of which is hereby incorporated by reference in its entirety.
FIELD OF THE INVENTION
The invention provides for methods of isolating stem cells and identifying the genes that are differentially expressed in stem cells. More particularly, the present invention relates to methods for discovering genes involved in tissue engineering, regeneration, reconstruction and/or repair.
BACKGROUND OF THE INVENTION
Stem cells are undifferentiated or immature cells that have the capacity to self renew and to give rise to various specialized cell types. Once differentiated or induced to differentiate, stem cells can be used to repair damaged and malfunctioning organs. Stem cells can be of embryonic, fetal or adult origin.
Embryonic stem cells can be isolated from the inner cell mass of pre-implantation embryos (ES cells) or from the primordial germ cells found in the genital ridges of post- implanted embryos (EG cells), or they can be generated by nucleus transfer into enucleated oocytes and blastocyst development. When grown in special culture conditions such as spinner culture or hanging drops, both ES and EG cells aggregate to form embryoid bodies (EB). EBs are composed of various cell types similar to those present during embryogenesis. When cultured in appropriate media, EBs can be used to generate in vitro differentiated phenotypes, such as extraembryonic endoderm, hematopoietic cells, neurons and glia, cardiomyocytes, skeletal muscle, pancreatic, liver, endothelial, adipose, cartilage, bone, and vascular muscle cells.
It would be highly desirable to develop methods which would enable the isolation of stem cells and allow for the analysis of the genes that are differentially expressed in stem cells, particularly at various stages of development. Several publications and patent documents are cited throughout the specification in order to describe the state of the art to which this invention pertains. Full citations for those references that are numbered can be found at the end of the specification. Each citation is incorporated herein as though set forth in full.
SUMMARY OF THE INVENTION
The invention provides for methods of isolating stem cells and identifying genes that are differentially expressed in Sonic Hedgehog gene expressing stem cells and all their descendent cells (i.e., cells into which they differentiate). More particularly, the present invention relates to methods for discovering genes and proteins which may be useful in tissue engineering, regeneration, reconstruction and/or repair of tissues expressing the Sonic Hedgehog gene. Specifically, the present invention provides for methods of isolating stem cells and their descendant cells in selected tissues co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a transgenic subject in which a marker gene has been inserted into the Shh locus of the subject's genome; and b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue.
In addition, the present invention provides for methods of identifying differentially expressed genes in selected tissues (i.e., stem cells and their descendant cells) co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a non-human transgenic subject in which the marker gene has been inserted into the Shh locus of the subject's genome; b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue; c) analyzing complementary RNAs from Shh/marker gene expressing cells and Shh/marker gene non-expressing cells using a microarray screen of the subject's transcriptome; and d) identifying genes that are expressed at higher and lower levels in Shh/marker gene expressing cells relative to Shh/marker gene non-expressing cells.
These methods can be practiced in selected tissues or regions of an embryo, such as the zone of polarizing activity (ZPA) of a developing limb bud, the genital tubercle and floor plate of the neural tube. Alternatively, the methods can be practiced using selected tissue from an adult, such as populations of progenitor or stem cells, tissue derived from enamel knot tissue, tissue derived from follicle tissue of hair, tissue derived from the nervous system, tissue derived from the retina, tissue derived from endoderm of the gastrointestinal tract, tissue derived from an intervertebral disk, tissue derived from genitourinary epithelial tissue, or any tissue expressing or which has expressed the Shh gene or protein at any time during the life of the animal. Other embodiments of the present invention include proteins translated from genes identified as expressed at relatively higher or lower levels according to the methods of the present invention and used as therapeutic agents for regenerating, reconstructing or repairing a number of tissue systems.
Specific embodiments of the present invention include the genes and translated proteins identified using the methods of the present invention in the patterning of the vertebrate limb, and the development of the genital tubercle, including TM-1, TM-2, EST 1437418, Mmu-miR-135a-2, and AP-2 beta.
As such, the present invention provides for methods of identifying genes and proteins that are important in tissue engineering, regeneration, reconstruction and/or repair of tissues that express the Shh gene. Other aspects of the invention are described infra.
BRIEF DESCRIPTION OF THE DRAWINGS
Figures 1A-C are two diagrams (1A, 1C) and a photograph (1B) relating to construction of a Shhgfpcre allele and a corresponding transgenic mouse, in accordance with an embodiment of the invention. Figure 1A shows the site of insertion of a gfpcre fusion cassette at the ATG of Shh. Figure 1B is a wholemount of an embryo formed by crossing mice carrying the Shhgfpcre allele and the R26R reporter, resulting in the production of β-galactosidase-positive cells in all locations in which Shh is normally expressed, including the limbs (black arrowheads) and notochord/neural tube (gray arrowheads). Staining in the notochord/neural tube is present throughout the length of the embryo. Figure 1C illustrates schematically that the R26R reporter is normally turned off in all tissues, but can be activated by the expression of CRE protein.
Figures 2A-F are six photographs illustrating whole mount RNA in situ hybridization of mouse forelimb at embryonic day 10.5 (E10.5) showing expression of genes as indicated in each panel. The genes were identified by screening a gene chip with probes made from mRNA of stem cells expressing sonic hedgehog (Shh) that were isolated from the limbs of E10.5 Shhgfpcre transgenic mice in which cells that express Shh, and their descendants, express a green fluorescent marker (GFP), facilitating their isolation and purification, in accordance with an embodiment of the invention.
Figure 3 is a series of ten photographs showing a comparison of expression patterns of Shh (upper panels) and TM1 (lower panels) in the developing mouse forelimb on the indicated embryonic days. TM1 and Shh expression patterns are almost identical; the slight differences in expression are most likely due to small differences in stages of the limbs. In the limb, TM1 is not expressed outside the Zone of Polarizing Activity (ZPA). Prior to E9.5 and after E11.75, neither Shh nor TM1 is expressed in the mouse limb. Identical results were obtained in the hindlimb.
Figures 4A-E are five photographs illustrating patterns of TM1 expression during limb formation in mutant mouse strains and in the chick embryo. Figures 4A and 4B illustrate expression of TM1 in hindlimbs of E10 wild type and Fgf4/Fgf8 null mice, respectively; no expression of TM1 is observed in limbs that lack Fgf4 and Fgf8 (arrow in 4B). Figures 4C and 4D respectively show expression of TM1 in E10 wild type and Shh null forelimbs. TM1 expression is observed in a wild type domain in limbs that lack Shh (arrow in 4D). Figure 4E shows that TM1 is expressed in the ZPA of stage 20 chick forelimbs (arrow).
Figure 5 is six photographs illustrating expression of Shh during development of mouse external genitalia from E10-E15.5. Dark staining (β-galactosidase activity) persists in the urethral epithelium (arrows) of the genital tubercle, as seen in ventral views of wholemounts.
Figure 6 is a photograph of a transverse section through a genital tubercle of a transgenic Shhgfpcre mouse embryo at E12.5 showing the distribution of Shh mRNA. Shh transcripts (dark staining) are restricted to the urethral epithelium.
Figure 7 is two photographs showing a genital tubercle dissected from an E12.5 ShhGfpCre/+; eYFP floxed mouse embryo. YFP is detected throughout the entire urethral epithelium.
Figure 8 is a graph showing results of FACS separation of eYFP-labelled urethral epithelial cells from genital tubercles of ShhGfpCre; eYFP floxed mice; YFP cells are gated within box R2.
Figures 9A-D are three fluorescence photographs (A-C) and a scatter plot (D) pertaining to spinal column formation in the mouse embryo as visualized in transgenic mice expressing Shh and a reporter gene, in accordance with an embodiment of the invention. Figure 9A is a bright field view of a developing spinal column at postnatal day 5; Fig. 9B shows yellow fluorescent protein (YFP) localization in the nucleus pulposus of intervertebral disks of the same spinal column shown in Fig. 9A in a Shhgfpcre/+; eYFP/+ mouse. Downward facing arrows in Figs. 9A and 9B mark the positions of intervertebral disks; upward facing arrows mark the positions of vertebral bodies. In addition to YFP in the disks (Fig. 9B), a low autofluorescence is observed in the vertebral bodies (arrows). Figure 9C shows LacZ staining of a single nucleus pulposus in a transverse section through a vertebral body in a newborn Shhgfpcre/+; R26R/+ P0 mouse. R26R is a LacZ reporter; the vertebra has been stained to reveal LacZ label in the nucleus pulposus (arrows). Figure 9D is a scatter plot showing results of a fluorescence activated cell sorting (FACS) experiment in which cells expressing CRE from the Shhgfpcre allele turn on the eYFP reporter. Purified YFP+ cells are shown within the gate R2; unlabeled cells lie to the left of gate R2. Cells present in the R2 gate are over 94% YFP positive. Cells present in the R3 gate are <0.5% YFP positive.
Figures 10A-E are five photographs showing expression of Shh in the notochord and subsequently in the nucleus pulposus, but not the vertebrae (A-C), in a transgenic mouse model in which Shh expression can be conditionally controlled by administering tamoxifen. Figures 10D and 10E show Shh expression in the urethra (D, E) and preputial glands (E).
Figures 11A-C are three photographs showing that Shh is expressed in a subset of nucleus pulposus cells (arrows) in the intervertebral disks of postnatal mice.
DETAILED DESCRIPTION OF THE INVENTION
The invention provides methods of isolating stem cells and their descendant cells, as well as methods of identifying genes that are differentially expressed in cells that express the Sonic Hedgehog (Shh) gene. The present invention further provides genes and proteins that may be useful in tissue engineering, regeneration, reconstruction and/or repair of tissues that actively express, or have at any time during the life of the organism expressed, the Sonic Hedgehog gene, including various tissues of the limbs, genitourinary tract and vertebral column, including intervertebral disks.
It is understood that this invention is not limited to the particular materials and methods described herein. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments and is not intended to limit the scope of the present invention, which will be limited only by the appended claims. As used herein, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991). As used herein, all terms have the meanings ascribed to them, unless specified otherwise.
All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines, protocols, reagents and vectors which are reported in the publications and which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.
The present invention provides for methods of isolating stem cells and their descendant cells, as well as identifying the genes that are differentially expressed in stem cells and their descendant cells. More particularly, the present invention relates to methods for discovering genes involved in tissue engineering, regeneration, reconstruction and/or repair.
Mouse models engineered for detecting Shh-expressing cells and their descendants.
To uncover genes differentially expressed in zones of developing tissues, transgenic mouse alleles were employed in which a marker gene (encoding a marker protein) was inserted into the sonic hedgehog (Shh) locus. A "marker gene," as the term is used herein, can be a detectable genetic trait or segment of DNA that can be identified and tracked. A marker gene can serve as a flag for another gene, sometimes called the target gene. A marker gene must be on the same chromosome as the target gene and near enough to it so that the two genes (the marker gene and the target gene) are genetically linked and are usually inherited together. Examples of marker genes useful in the present invention include the gene for green fluorescent protein (gfp), dsRed, eYFP, eGFP, eCFP, LacZ and other marker genes currently known and as will be known in the future to those people of ordinary skill in the art.
Embryos containing this transgene expressed the marker gene in all cells that normally express Shh, including zones of developing tissues. In addition, using this mouse allele, any cell that at one time expressed Shh can be purified and used to screen arrays, or for cell culture.
In a specific embodiment, using ES cells, a gene that encodes a gfpcre protein fusion was knocked into the Shh locus (Fig. 1A). The gfpcre cassette contains a nuclear localization signal and was inserted at the translational start ATG codon of Shh. Insertion of the gfpcre cassette at the start ATG of Shh results in the production of GFP in cells that normally express Shh mRNA. The gfpcre cassette is a translational fusion between gfp and the site-specific recombinase cre [1]. CRE protein recognizes loxP DNA sequences and can instigate recombination between two loxP sites present as direct repeats in the genome (Fig. 1B). In mice containing the Shhgfpcre allele, CRE should be expressed in all cells in which GFP (and Shh mRNA) are observed.
To determine the expression pattern of CRE in Shhgfpcre mice, R26R and eYFP (enhanced Yellow Fluorescent Protein) alleles [39] were used to advantage. In these alleles, reporter genes have been inserted into the ROSA locus. Mice containing the R26R or eYFP alleles transcribe LacZ or YFP, respectively, but no protein is made because these alleles contain a stop cassette flanked by loxP sites upstream of the reporters. Cells expressing the recombinase CRE undergo a recombination event between the two loxP sites, resulting in the removal of the stop cassette and expression of the reporters. Importantly, once the stop cassette is removed, reporter protein will continue to be produced in the cell in which the recombination event occurred and in all descendants of that cell (Fig. 1C). Expression will continue irrespective of whether CRE protein is present during later development. Accordingly, using this system, cells and their descendants are irreversibly marked and can be followed throughout development.
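By way of non-limiting illustration only, the irreversible marking described above can be sketched as a toy model in Python. The `Cell` class, the list representation of the reporter allele and the element names are hypothetical conveniences introduced for this sketch; they are assumptions and do not describe any construct or software used in the studies reported herein.

```python
from dataclasses import dataclass, field


def default_reporter():
    # Hypothetical layout of a floxed-stop reporter allele (illustrative only).
    return ["promoter", "loxP", "STOP", "loxP", "reporter"]


@dataclass
class Cell:
    allele: list = field(default_factory=default_reporter)
    expresses_cre: bool = False

    def recombine(self):
        """If CRE is present, excise the DNA between the two loxP direct repeats."""
        if self.expresses_cre and self.allele.count("loxP") == 2:
            first = self.allele.index("loxP")
            second = self.allele.index("loxP", first + 1)
            # Excision removes the stop cassette and leaves one loxP site behind.
            self.allele = self.allele[: first + 1] + self.allele[second + 1 :]

    def reporter_on(self):
        """Reporter protein is made only once the stop cassette is gone."""
        return "STOP" not in self.allele

    def divide(self, cre_still_on=False):
        """Daughters inherit the allele as it is; CRE need not persist."""
        return [Cell(list(self.allele), cre_still_on) for _ in range(2)]


founder = Cell(expresses_cre=True)   # a cell expressing CRE from the Shh locus
founder.recombine()
daughters = founder.divide()         # CRE switched off in the descendants
```

In this sketch the daughters remain reporter-positive even though they no longer express CRE, which is the property that allows descendants of Shh-expressing cells to be followed.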
In studies described herein, cells expressing Shh were isolated from various developing tissues of Shhgfpcre mice and cell-sorted from non-Shh expressing cells by virtue of expression of the marker. Complementary RNAs (cRNAs) were prepared from populations of marker gene-positive (developing zones) and marker gene-negative (non-developing zones) cells and the cRNAs were then used to screen gene chips such as Affymetrix mouse GeneChips in order to identify genes showing differential expression during development of specific tissues. Such use of microarrays to compare expression profiles of Shh-expressing and Shh-non-expressing cells allowed for the identification of up-regulated and down-regulated genes in the stem cells of developing tissue zones.
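For illustration only, a comparison of two expression profiles of this kind can be sketched as a log2 fold-change calculation. The sketch below is not the analysis performed by the inventors: the probe names, the intensity values, the pseudocount and the two-fold threshold are all hypothetical assumptions, and a real GeneChip analysis would additionally involve normalization and statistics across replicates.

```python
import math


def log2_fold_changes(positive, negative, pseudocount=1.0):
    """log2(positive / negative) per probe; the pseudocount avoids division by zero."""
    shared = positive.keys() & negative.keys()
    return {
        probe: math.log2((positive[probe] + pseudocount) / (negative[probe] + pseudocount))
        for probe in shared
    }


def classify(fold_changes, threshold=1.0):
    """Split probes into up- and down-regulated sets at |log2 FC| >= threshold."""
    up = sorted(p for p, fc in fold_changes.items() if fc >= threshold)
    down = sorted(p for p, fc in fold_changes.items() if fc <= -threshold)
    return up, down


# Hypothetical signal intensities (arbitrary units), not data from the specification.
marker_positive = {"probe_A": 799.0, "probe_B": 99.0, "probe_C": 49.0}
marker_negative = {"probe_A": 99.0, "probe_B": 99.0, "probe_C": 399.0}

fold_changes = log2_fold_changes(marker_positive, marker_negative)
up_regulated, down_regulated = classify(fold_changes)
```

With these made-up numbers, `probe_A` falls in the up-regulated list, `probe_C` in the down-regulated list, and `probe_B` in neither.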
Knowledge of differentially expressed genes in stem cells provides valuable information regarding which genes and proteins are important in the control of organ development and will greatly increase the potential for organ and tissue restoration, reconstruction, engineering and regeneration. As shown herein, the inventors have demonstrated the ability to isolate stem cells and identify differentially expressed genes during discrete developmental stages of tissue morphogenesis, including that which occurs in the developing limb bud, the developing genitourinary tract and the developing intervertebral disc.
1) DEVELOPING LIMB BUD
Morphological studies of limb development
The improper patterning of the vertebrate limb is one of the most common types of birth defects. Identification of the genes responsible for patterning a limb is a vital step in the treatment of patients with limb abnormalities. Classical embryology experiments performed by John Saunders and colleagues in the 1950s identified a small region of the distal/posterior limb (the zone of polarizing activity or ZPA) as being responsible for patterning the anterior/posterior limb axis. Subsequent experiments determined that the secreted factor Sonic Hedgehog (Shh) is expressed in the classically defined ZPA and is responsible for the polarizing activity associated with this region of the developing limb. To date, Shh is the only gene known to be expressed exclusively in the ZPA and only Shh, or its two related family members Dhh and Ihh, can completely polarize a limb.
The vertebrate limb is a highly intricate structure containing many different types of cells and tissues. During development, the limb is first visible as a small bud of tissue protruding from the flank of an E9.5 mouse embryo [1] (as shown, e.g., in Fig. 2). Over the next four days of development, this mostly undifferentiated mass of cells will divide and differentiate to form all cell types associated with the limb, including muscle, tendon and bone. Uncovering the mechanisms responsible for the patterning of the limb has been an area of active investigation for decades. In spite of intense effort, the molecular pathways responsible for patterning of most cell types in the vertebrate limb are not yet known.
The limb is patterned along three distinct axes: dorsal/ventral, proximal/distal and anterior/posterior. Saunders and Gasseling [2] reported key findings into the mechanism responsible for patterning the anterior/posterior axis in 1968. They grafted tissue from the posterior of one chick limb into the anterior distal margin of a second limb bud. This resulted in mirror image duplication of the digits and in some cases the ulna. The extra digits produced near the graft were of reversed polarity such that the most anterior ectopic digits in position were the most posterior in character. The production of ectopic digits only occurred when tissue from a small region of the distal/posterior region of the limb was grafted onto a host limb. This region of the limb was given the name the "Zone of Polarizing Activity", or ZPA.
Role of Shh in limb development
Subsequent studies identified the secreted factor Shh as a factor that is expressed precisely in those cells identified as the ZPA [3-5]. Shh mRNA is first observed in the posterior of the limb bud at E9.75 [6]. Over the course of the next 2 days, Shh mRNA remains in the posterior distal portion of the limb as it expands. Shh protein is observed in a posterior to anterior concentration gradient, with highest levels of protein found in cells expressing Shh mRNA [6]. Numerous studies have identified an important role for Shh in patterning the anterior/posterior (A/P) axis of the vertebrate limb [3,6-8]. Ectopic expression of Shh in the anterior of the chick or mouse limb results in a mirror image duplication of the digits, suggesting that Shh is the factor present in the ZPA that is responsible for patterning the anterior/posterior axis. Further evidence that Shh specifies the anterior/posterior axis comes from experiments in which Shh was genetically ablated from the mouse limb [7,8]. Removal of Shh in the limb results in the absence of all digits in the forelimb and four of the five digits in the hindlimb. The presence in Shh null embryos of a hindlimb digit that resembles digit one suggests that this digit does not require SHH protein to be correctly specified.
The correct concentration of SHH protein is essential for the patterning of the anterior/posterior axis [9,10]. As mentioned above, digit one in the hindlimb does not appear to require SHH for its formation. Experiments in which the concentration of SHH was varied suggested that the highest concentration of SHH results in the production of digit five with progressively lower SHH concentrations specifying digits 4-2. In this model, cells other than SHH-expressing cells sense different concentrations of SHH and respond by modulating their gene expression profiles. The cells that respond to SHH then form different digits depending on the concentration of SHH protein to which they have been exposed.
Role of cells formerly expressing Shh in the vertebrate limb.
Cells that formerly expressed Shh have at least two roles in the vertebrate limb. The first is to terminate the Shh-Fgf feedback loop that is responsible for controlling the size of the limb [11]. The outgrowth of the limb is controlled by two distinct signaling centers, the ZPA located in the posterior/distal mesenchyme and the AER (apical ectodermal ridge) located at the tip of the developing limb [12]. A positive feedback loop involving Shh and Gremlin expression in the mesenchyme and Fgf4 expression in the AER is required for normal limb outgrowth to occur [11,13,14]. Shh maintains Fgf expression from the AER by upregulating Gremlin in the mesenchyme adjacent to Shh-expressing cells [15,16]. Gremlin is a bone morphogenetic protein (Bmp) antagonist and prevents Bmp proteins from downregulating Fgf expression in the AER [17,18]. The signaling loop between Shh and Fgfs must be halted at the appropriate point during development to produce a normal-sized limb. Prolonged expression of Shh in chick limbs after normal Shh expression has ceased results in prolonged Fgf expression in the AER and the continued outgrowth of the digit rays [19,20]. The end result of increasing the amount of time the Shh-Fgf feedback loop is operating is the production of additional phalanges and an overall longer limb.
The mechanism responsible for abolishing the Shh-Fgf feedback loop has recently been investigated. Fate mapping experiments in which we followed the fates of ZPA cells after they ceased to express Shh demonstrated that these cells expanded anteriorly to encompass a domain that is much larger than the ZPA [11,21]. This wedge-like domain corresponds to the Gremlin-minus domain of cells that appears prior to the breakdown of the Shh-Fgf feedback loop. These data suggest a model to explain the breakdown of the feedback loop. Cells that express Shh and their descendants cannot express Gremlin. During the initial stages of limb development this is not a problem because Shh diffuses from the ZPA. As limb growth proceeds, the proliferation of Shh descendants serves as a physical barrier between the source of Shh and the cells that produce Gremlin. Once the cells become too distant to receive Shh signal, Gremlin is not expressed and the feedback loop breaks down. Consistent with this model, removal of the forming wedge of cells that cannot express Shh maintains the feedback loop for an extended period [11]. It is not known why cells that have at one time expressed Shh cannot express Gremlin.
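Purely as an illustrative sketch, the barrier model proposed above can be expressed as a one-dimensional toy calculation. All names and values below (the wedge width, its growth per step and the signaling range, each in arbitrary cell diameters) are hypothetical assumptions chosen for the sketch; none was measured or reported in the studies described herein.

```python
def feedback_loop_active(wedge_width, shh_range):
    """The loop persists while a Gremlin-competent cell lies within Shh range.

    Descendants of Shh-expressing cells occupy positions [0, wedge_width) from
    the Shh source and cannot express Gremlin, so the nearest Gremlin-competent
    cell sits at distance wedge_width.
    """
    return wedge_width <= shh_range


def simulate(initial_width, growth_per_step, shh_range, steps):
    """Track the widening wedge of descendants and the state of the loop."""
    timeline = []
    width = initial_width
    for step in range(steps):
        timeline.append((step, width, feedback_loop_active(width, shh_range)))
        width += growth_per_step  # proliferation of Shh descendants
    return timeline


# Hypothetical parameters in arbitrary units.
timeline = simulate(initial_width=1, growth_per_step=2, shh_range=6, steps=5)
```

In this toy run the loop is active while the wedge is narrower than the assumed signaling range and switches off once the wedge outgrows it; resetting the wedge width to zero, by analogy with removal of the wedge, restores the loop.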
During later limb development, after the Shh-Fgf feedback loop has broken down, former Shh-expressing cells play a second role in limb development. Our fate map studies of the descendants of Shh-producing cells demonstrate that these cells encompass all of the two most posterior digits (digits four and five) and part of the middle digit (digit three) [21]. Previous work by a number of laboratories clearly demonstrated that digit identity is specified, at least in part, by the concentration of Shh to which a forming digit is exposed [6,9,22]. Since digits four and five are comprised completely of cells that have at one time expressed Shh, they were exposed to the same concentration of Shh. However, our fate mapping experiments using a tamoxifen-inducible CRE cassette indicate that the posterior digits form from cells that have expressed Shh for longer periods of time [21]. These data suggest that while the specification of the anterior digits depends upon differential concentrations of Shh, the length of time of exposure to Shh is critical in the specification of the differences between the most posterior digits. The mechanisms responsible for cells sensing differences in the length of time they are exposed to Shh are not known.
Regulation of Shh transcription in the vertebrate limb.
Shh is expressed in an invariant, small number of posterior/distal cells along the margin of the limb. A few factors are known to be involved in activating Shh in this specific region of the vertebrate limb. However, none of them are expressed exclusively in Shh-positive cells. One, dHAND, is a basic helix-loop-helix transcription factor that is expressed prior to Shh in the early limb [23]. During later development, dHAND is restricted to the posterior of the limb and overlaps with Shh. Misexpression of dHAND in the anterior of the limb results in ectopic anterior expression of Shh and the duplication of skeletal elements. Consistent with the proposed role for dHAND in activating Shh, in dHAND null mice Shh is not expressed.
The homeodomain gene Hoxb8 has also been implicated in regulating Shh expression in the vertebrate limb. Hoxb8 is expressed in the posterior of E9.5 mouse limbs and overexpression throughout the limb results in ectopic Shh expression and forelimb duplications2 . In chick, Hoxb8 is also expressed in the posterior of forelimbs and can be ectopically induced by anterior expression of retinoic acid [25,26]. Inhibition of retinoic acid results in the downregulation of Hoxb8 and a loss of Shh [25]. Hoxb8 is not expressed in the posterior hindlimb in either mouse or chick and overexpression of Hoxb8 in the mouse hindlimb does not result in the production of any visible defects [24].
Recently it has been shown that genes in the HoxD cluster are important for expression of Shh [27]. Ectopic Hox expression resulted in misexpression of Shh in the anterior of the limb and double posterior limbs. It has been proposed that dHAND may be activated by posterior HoxD gene expression [27]. Both HoxD expression and dHAND are thought to be repressed by anterior expression of the Gli3 repressor and paired homeodomain gene Alx-4.
The "classic" ZPA has been redefined through molecular biology as the region of the limb in which Shh is expressed. Based on the essential roles for Shh-expressing cells in normal limb development we reasoned that there might be additional factors present in the ZPA, performing Shh-dependent and/or independent functions. These genes might be involved in regulating the expression of Shh in the ZPA by turning on Shh in a specific region of the limb mesenchyme (upstream factors). Additional genes expressed in the ZPA might function downstream of Shh in the Shh-signaling pathway to modify cell fate decisions depending on the concentration and/or amount of time a ZPA cell is exposed to Shh. Factors that fall into this class may be essential for specifying the temporal part of the posterior digit specification pathway. It is also possible that additional ZPA-specific genes may not function at all with Shh but instead pattern other, Shh-independent aspects of limb development.
Discovery of genes associated with limb formation
To uncover additional genes expressed in the ZPA, the inventors employed a transgenic mouse allele in which Gfp is inserted into the Shh locus as described above. Embryos containing this transgene express GFP in all cells that normally express Shh, including the ZPA. Cells expressing Shh were cell-sorted from E10.5 limbs and then complementary RNAs from GFP-positive (the ZPA) and GFP-negative (rest of the limb bud) cell populations were used to screen Affymetrix mouse GeneChips.
Analysis of the data revealed that Shh and ~15 additional genes were expressed at higher levels in the ZPA than in the rest of the limb (see Table 1).
Table 1. Genes upregulated in GFP-positive (ZPA) cells of E10.5 mouse limbs.
Figure imgf000013_0001
One of these genes, an uncharacterized EST provisionally named TM1 that contains eight transmembrane domains, was analyzed further. Whole mount RNA in situ hybridization revealed that in the limb, TM1 was temporally and spatially expressed in a pattern identical to Shh. Outside the limb, TM1 also appears to be expressed in many of the same areas as Shh.
TM1 is a member of a highly conserved family of uncharacterized transmembrane proteins that share no similarity to any of the known transmembrane proteins functioning in the Shh-signaling pathway (Patched, Smoothened, Hip or Dispatched). There are eight TM1 members in most vertebrates, four in Drosophila and one in C. elegans. In the chick limb, we have found that the TM1 homolog is also expressed exclusively in the ZPA.
Whole mount RNA in situ hybridization analysis was also performed using several other genes listed in Table 1, including EST 1460299 (HB9), EST 1420784 (TM2), EST 1437418, and EST 1435670 (AP-2) (shown in Figs. 2C-F, respectively). Like Shh and TM1, these genes were found to be highly expressed in the ZPA and not in other areas of the developing limb bud. The results of the in situ hybridization analysis thus confirmed the differential expression of these genes in the ZPA of the limb bud, as revealed by the gene chip screening technique, demonstrating the power of this approach for identifying new genes associated with limb formation.
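The comparison of GFP-positive and GFP-negative populations described above can be illustrated with a minimal sketch. The code below assumes that each population has already been reduced to one normalized intensity per probe; the probe names, intensities, pseudocount, and two-fold threshold are hypothetical values chosen for illustration and are not taken from the screen or from Affymetrix software.

```python
import math

def log2_fold_changes(positive, negative, pseudocount=1.0):
    """Return {probe: log2((pos + c) / (neg + c))} for probes present in both samples."""
    shared = positive.keys() & negative.keys()
    return {
        probe: math.log2((positive[probe] + pseudocount) / (negative[probe] + pseudocount))
        for probe in shared
    }

def enriched_probes(positive, negative, min_log2_fc=1.0):
    """Probes at or above the log2 fold-change threshold, highest first."""
    changes = log2_fold_changes(positive, negative)
    hits = [(probe, fc) for probe, fc in changes.items() if fc >= min_log2_fc]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Invented intensities for illustration only; these are not values from the screen.
gfp_positive = {"probe_A": 1599.0, "probe_B": 399.0, "probe_C": 210.0}
gfp_negative = {"probe_A": 99.0, "probe_B": 99.0, "probe_C": 205.0}

hits = enriched_probes(gfp_positive, gfp_negative)
for probe, fc in hits:
    print(f"{probe}\tlog2 fold change = {fc:.2f}")
```

Ranking by log2 fold change is only one way to flag candidates; a real analysis would normally use replicate arrays and a statistical test before candidates are carried forward to in situ hybridization.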
Nucleic acid and amino acid sequence identifiers of exemplary genes/products discovered by this approach are listed in Table 2.
Table 2. Sequence Information for Genes Expressed in ZPA of Limb Bud
Figure imgf000014_0002
*microRNA
A number of other genes were found to be down-regulated in Shh-expressing cells as compared with non-Shh-expressing cells (Table 3).
Table 3. Genes Down-regulated in GFP-positive (ZPA) Cells of E10.5 Mouse Limbs.
Figure imgf000014_0001
Figure imgf000015_0001
2) DEVELOPING GENITOURINARY TRACT
Genitourinary disease is a major public health concern, with cancer of the prostate being the second most lethal malignancy in men. Most cancers of the genitourinary tract involve transformation of the epithelial cells that line urologic organs, known as urothelial or uroepithelial cells. Hedgehog signaling has been shown to be associated with cancers of the stomach, esophagus, pancreas, prostate and biliary tract. Birth defects affecting the genitourinary system are among the most common congenital anomalies in humans, with hypospadias, an ectopic ventral opening of the urethra, affecting one in every 250 live births. The molecular control of early genitourinary development is not well understood. During embryogenesis, the penis and clitoris develop from the genital tubercle, an outgrowth of the ventral margin of the cloaca that consists of endoderm, mesoderm and ectoderm (Fig. 5). Initially, the genital tubercles of males and females are morphologically indistinct. The early tubercle in both sexes has the potential to differentiate into either male or female genitalia, and exposure to androgens masculinizes the genitalia. Disruption of androgen signaling can result in feminization of the genitalia, which frequently includes hypospadias 92.
Although androgens play key roles in external genital development, outgrowth and patterning of the genitalia are well underway before the onset of androgen production. For example, human genital development begins almost two weeks before the onset of testosterone synthesis by the testis97. Early patterning of the genital tubercle, therefore, must be controlled independently of androgens. Whereas the effects of androgens on mammalian genital development have been extensively studied 91, 93, 100, 105, 112, the genetic mechanisms that regulate early development of the genital tubercle remain largely unknown.
Molecular genetics of external genital development
To date there have been few studies of the genetic control of external genital development 109. In an earlier study, we carried out an in situ hybridization screen of mouse genital tubercles between E10.5 and E16.5 108. This initial study of external genital development was designed to test the hypothesis that outgrowth and patterning of the genital tubercle and the limb bud are controlled by a common set of genes. Riboprobes from a suite of genes known to be involved in limb development were utilized to test whether these genes were also expressed in the posterior genitourinary tract. This approach resulted in the identification of over thirty genes that are expressed in both limbs and genitalia; however, only two of those genes, i.e., Shh and Fgf8, were expressed exclusively in the urethral plate, the endodermally-derived epithelial structure that gives rise to the urothelial cells of the urethra. Sonic hedgehog is expressed in endodermal cells of the urogenital tract and the embryonic gut tube, from which the entire genitourinary tract (with the exception of the kidney) develops (Fig. 5). These endodermal cells differentiate into the epithelial lining of the genitourinary tract from the bladder to the distal urethra.
After identifying the urethral plate epithelium as a key signaling region in the external genitalia, and demonstrating that cells of the urethral epithelium express Shh, the inventors showed that a null mutation in the mouse Shh gene results in a complete failure of external genital development, suggesting that Shh mediates the organizing activity of the urethral epithelium 101.
Expression of Shh in the limb bud is maintained by Fibroblast Growth Factors (Fgfs) from the apical ridge, and Shh, in turn, feeds back to maintain ectodermal expression of these Fgfs 110. A similar relationship between Shh and Fgfs may exist in the genital tubercle. Fgf8 is expressed at the distal tip of the urethral epithelium, and application of FGF8-soaked beads to genital tubercles that have had their epithelium removed (a manipulation that normally leads to downregulation of Shh and arrest of outgrowth) can restore distal outgrowth 96, 108. It has been shown that Fgfr2-IIIb, a receptor for Fgf10 and Fgf7, is required for maturation of urethral epithelial cells. A loss-of-function mutation in Fgfr2-IIIb or its major ligand, Fgf10, results in severe disorganization of the urethral epithelium and, consequently, failure of urethral tube development (hypospadias) in mice151.
Shh in genitourinary disease
Sonic hedgehog signaling has been implicated in urologic cancers of the prostate and bladder. In the prostate, Shh is expressed during normal development, where it regulates differentiation of prostate epithelium and growth of the prostate. Inhibition of Shh signaling during prostate development leads to precocious and abnormal differentiation of epithelial cells, causing prostatic lesions similar to those found in prostate intraepithelial neoplasia. Hedgehog activity is required for regeneration of prostatic epithelium, and continuous pathway activation results in transformation of progenitor cells to a tumorigenic phenotype. Activation of the hedgehog pathway requires expression of the Smoothened gene, which is triggered by expression of endogenous hedgehog proteins. Several prostate cancer cell lines have been shown to contain elevated levels of Patched mRNA, and growth of these cancer cells can be inhibited by administration of compounds that block hedgehog activity. Tumor growth can be blocked in vivo by administering hedgehog pathway antagonists to adult mice. Regression of these tumors is likely to be a consequence of failed renewal of tumor stem cells. Thus, the biological role of hedgehog signaling in the maintenance of uroepithelial progenitor or stem cells seems to be common to normal development and to cancer. Whereas normal development involves appropriate withdrawal of hedgehog signaling at the end of organ formation, tumorigenesis involves unregulated hedgehog signaling, which leads to continuous regeneration of prostate epithelium from transformed endogenous progenitors. Thus, hedgehog pathway activation is strongly correlated with metastasis of prostate cancer. Superficial bladder cancer is frequently associated with chromosomal deletion at position 9q22.3, the same position as the Patched locus. 
While it is not yet known whether loss of Ptc activity in bladder cancer is a causal factor, its involvement in basal cell carcinoma, and the importance of hedgehog signaling in basal cell renewal elsewhere in the urogenital tract, raise the possibility that hedgehog may also mediate tumor progression in the bladder.
Shh signaling in genital tubercle
Shh, like other members of the hedgehog family, activates signal transduction by binding to the Patched (Ptc) transmembrane receptors 94, 102, 103, 107. The binding of Hedgehog to Ptc alleviates repression of Smoothened (Smo), which, in turn, results in activation of Gli and subsequent expression of hedgehog target genes, including Ptc106. As discussed above, members of the Bone morphogenetic protein (Bmp) family are expressed in the limb bud, and it has been suggested that Bmps may be involved in relaying the hedgehog signal over long ranges 114. More recent work, however, indicates that Shh itself can act over long- and short-range distances 101, 115, and Bmp signaling may be involved in modulating Ptc expression 107. Expression of all of the aforementioned genes has been detected in the genital tubercle 108, 111, but few direct tests of their function in external genital development have been conducted. Members of the Hox and Wnt gene families were among the first factors implicated in early development of the genital tubercle. Hoxd13 and Hoxa13 play an essential role in external genital and limb development 98, 104, 109. Loss of function of both genes results in agenesis of the genital tubercle, and heterozygosity for either gene causes patterning defects of the phallus. A recent study reports that Hoxa13 null mice exhibit hypospadias 104. In humans, mutations in HOXA13 are responsible for the range of phenotypes seen in Hand-Foot-Genital Syndrome, which affects the distal limbs and genitourinary system. The secreted signaling molecule Wnt5a is expressed in a number of embryonic outgrowths, where it acts as a positive regulator of cell proliferation. Expression of Wnt5a is graded from distal to proximal in both the limb bud and genital tubercle, and loss of Wnt5a function impairs distal outgrowth of these structures 113.
Tissue engineering of urologic organs
Reconstruction of urologic tissues is usually performed with non-urologic tissues from the gastrointestinal tract, skin, placenta, dura, peritoneum or secretory mucosa from other tissues. Introduction of non-urologic cells into the genitourinary tract is generally problematic and can lead to abnormal physiological function of urologic organs, scarring, and fibrosis. Prosthetic replacements have been engineered from natural and synthetic materials such as silicone, collagen matrix, gelatin and polyvinyl, but these offer poor solutions and lead to failure due to bioincompatibility or structural/mechanical problems.
Cellular transplantation offers great promise for biological reconstruction of urologic organs, and progress has been made in the development of scaffolds onto which cells can be seeded for in vitro and in vivo growth. After seeding, cells spread over the scaffold and reach a stable shape. Morphology of the cellular component of engineered organs depends on (1) adhesion of cells to the substrate, (2) adhesion of cells to one another, (3) rigidity of the substrate, and (4) availability of appropriate nutrients, including oxygen. An important factor in the reconstruction of genitourinary organs is identification of a donor cell population that can be directed down an appropriate developmental pathway, then survive over a long term, differentiate and grow. Thus far, most reconstructions of bladder, urethral and ureteric tissues have been engineered with smooth muscle and/or uroepithelial cells; however, smooth muscle cells frequently lose their contractile properties in culture and differentiate into myofibroblasts that secrete collagen, which, in turn, results in scar formation and excess salt formation when exposed to urine.
Experiments with uroepithelial cells have yielded more promising results, showing that seeded cells can proliferate and form a multilayered lining within the lumen of the scaffold. Urothelial cells have been shown recently to facilitate the recruitment and transdifferentiation of fibroblasts into smooth muscle when co-cultured on an acellular matrix.
Isolation of urothelial stem cells from Shh-expressing transgenic mice
Urothelial stem cells provide great opportunities for successful regeneration, repair and reconstruction of urologic tissues. Building upon the finding that the urothelial lining of the genitourinary tract is derived from Shh-expressing endodermal cells, the inventors isolated these cells from the ShhGfpCre mouse described herein, and purified the cells by flow cytometry using Fluorescence Activated Cell Sorting (FACS). More specifically, a population of Shh-expressing, fluorescent cells was obtained from the urethral plate of mouse genital tubercle at embryonic day 12.5 (Fig. 7), and 94% enrichment of labeled cells was achieved by FACS (Fig. 8). In order to characterize the transcriptional profile of these urethral progenitor cells relative to unlabeled stromal (mesenchymal) cells, RNA was extracted from positively and negatively labeled cells and used to screen an Affymetrix mouse cDNA microarray chip. Analysis of the microarray data resulted in the identification of forty-one genes that were upregulated significantly in urethral plate cells relative to the adjacent mesenchyme cells. The genes identified by this analysis are listed in Table 4.
The data indicate that the methods of gene isolation described herein have resulted in the identification of the transcriptional profile of urethral progenitor cells in the E12.5 mouse embryo. It is contemplated that the cells of the invention, or the genes/RNA or proteins discovered from the transcriptome of these cells will provide useful therapeutics for regenerating, reconstructing or repairing any tissue needing such therapy that has at one time expressed Shh, such as therapeutic agents for regenerating, reconstructing or repairing the genitourinary system.
Table 4. Genes upregulated in ShhGfpCre-expressing cells of urethral epithelium of E12.5 mouse genitalia
Figure imgf000020_0001
Figure imgf000021_0001
Figure imgf000022_0001
Table 5 lists genes that are upregulated in non-Shh-expressing cells of the genital tubercle.
Table 5. Genes downregulated in ShhGfpCre-expressing cells of urethral epithelium of E12.5 mouse genitalia (these genes are upregulated in genital mesenchyme cells and downregulated in epithelium).
    Probe ID        Gene                                                  Gene symbol
1)  1450621_a_at    hemoglobin Y, beta-like embryonic chain               Hbb-y
2)  1437810_a_at    hemoglobin Z, beta-like embryonic chain               Hbb-bh1
3)  1434502_x_at    solute carrier family 4 (anion exchanger), member 1   Slc4a1
4)  1437990_x_at    hemoglobin Z, beta-like embryonic chain               Hbb-bh1
5)  1448021_at      Transcribed sequences                                 http://www.ncbi.nlm.nih.gov/UniGene/query.cgi?TEXT=
6)  1417184_s_at    hemoglobin, beta adult major chain                    Hbb-b1
In order to confirm the validity of the microarray results, endogenous mRNA was examined by in situ hybridization in whole mount genital tubercles from mice at embryonic day (E) 12.5. Of the first twenty-five genes analyzed using this method, all twenty-five were confirmed as being expressed in the urethral plate cells during normal development, demonstrating that the microarray analysis led to few, if any, false positives being identified. In further studies, the inventors have carried out a time-course experiment to map the spatial and temporal dynamics of gene expression in the genital tubercle. The data indicate that the methods of gene isolation described herein have resulted in the identification of the transcriptional profile of urethral progenitor cells in the E12.5 mouse embryo.
3) DEVELOPING INTERVERTEBRAL DISC
The most common cause of back pain is thought to be degeneration of the intervertebral disk118-120. This usually occurs in one of two ways: either through herniation of disk material into the vertebral column or through the reduction of disk height. The reduction in the height of disks results in compression of the vertebral facets, which then exert pressure on the nerve roots, leading to back pain. In a normal vertebral column the intervertebral disks join adjacent vertebral bodies so that the spine can bend120, 121. Each disk is composed of an outer annulus fibrosus consisting of concentric lamellae of collagen fibers; superior and inferior cartilaginous end plates (these mark the regions between the intervertebral disk and the vertebrae); and an inner layer called the nucleus pulposus122. Cells located within the nucleus pulposus are thought to be required for synthesizing and maintaining the intervertebral disk119. Damage to or loss of the nucleus pulposus in the intervertebral disk often leads to disease and back pain120.
Once an intervertebral disk degrades, the primary clinical treatment is surgical removal of the disk in an attempt to relieve pressure on spinal nerve roots124. If removal of the affected disk results in a mechanically unstable spine, fusion of adjacent vertebrae can be performed in an attempt to stabilize the spine125. This type of operation is usually beneficial for the alleviation of pain caused by disk herniation. However, the overall success rate for spinal fusions is only 50-75%125. Spinal fusions have also been shown to accelerate the degeneration of adjacent disks126, 127. The relatively low rate of success and the potential for serious side effects from this treatment suggest that alternative treatments are needed to cure or alleviate back pain.
Intervertebral disk repair or replacement as a treatment for back pain
One attractive and promising area of research for the treatment of back pain is the field of disk replacement and/or repair. The replacement or repair of damaged disks before the occurrence of spinal nerve damage could significantly reduce the need for spinal fusions and other types of spinal surgeries. Two primary areas of research are focused on the repair or healing of damaged disks: replacement of these disks with synthetic implants, and biological repair or replacement of disks (tissue engineering). The replacement of the nucleus pulposus with synthetic implants (e.g., artificial disks composed of blends of hydrogels) is a promising but immature field of research128, 129. One significant drawback of this technology is that the placement of synthetic implants in the spine requires the surgical removal of a damaged disk.
Tissue engineering of a nucleus pulposus has the potential to be used for either complete disk replacement or the repair of damaged tissue. The first step in the engineering of a nucleus pulposus is to identify, characterize and purify a population of cells or proteins that can form or repair this structure. The nucleus pulposus is composed predominantly of type II collagen, aggrecan and water120. Cell types found in the nucleus pulposus are primarily small chondrocyte-like cells. Interestingly, in young animals and in adults of some species a second population of cells is found in the nucleus pulposus. These cells are much larger than the chondrocyte-like cells and have been proposed to be "notochordal" in origin130, 131. In addition to their unique morphology, notochordal cells in the nucleus pulposus express a subset of genes that may be involved in organizing this structure. The gradual loss of this population of cells during the life of many mammalian species prior to the onset of disk degeneration suggests that this population of cells is involved in the maintenance and/or repair of this structure. The notochordal cells in the nucleus pulposus have been proposed to arise from the notochord, a structure that has long been known to play an essential role in patterning the spinal column and intervertebral disks120, 133. The notochord is a mesodermal rod, ventral to the neural tube, running from the head through the tail of the mouse embryo. During embryonic development, the notochord is thought to pattern parts of the overlying vertebral column133. It has been difficult to identify the final fates of cells comprising the notochord since this structure begins to recede during mid- to late embryonic development. In mice (and humans) the secreted factor Shh is expressed in the notochord134, 137. The notochord expresses a number of genes, including the secreted factor sonic hedgehog (Shh).
Prior to the studies described herein, the role of the notochord in forming the nucleus pulposus had not been known.
Characterization of the stem cell population of the nucleus pulposus.
As described above, the inventors have created a mouse in which a gfpcre fusion cassette13 is inserted into the Shh locus (Shhgfpcre allele). In this model, all Shh-expressing cells that arise from the notochord are detectable by expression of a marker protein. Using this model it has been demonstrated that the nucleus pulposus is composed entirely of cells that have previously expressed Shh. Mice containing both Shhgfpcre and either of the reporter alleles had reporter expression in the nucleus pulposus (Fig. 9). Expression was observed in the nucleus pulposus from mid-embryogenesis, when this structure first forms, through adulthood. These data suggest that the nucleus pulposus is composed of cells from the notochord that at some time expressed Shh.
Purification of stem cells of the nucleus pulposus.
The Shhgfpcre mouse provides a unique tool to investigate whether a stem cell-like population exists in the nucleus pulposus. Using the Shhgfpcre allele it is possible to purify cells from both the notochord and the nucleus pulposus. Characterization of these cell populations is an essential first step toward the goal of curing back pain associated with damaged intervertebral disks. The nucleus pulposus is believed to contain a stem cell-like population of notochordal cells that is essential for the maintenance and repair of this structure120. Mice containing both the Shhgfpcre and eYFP alleles have intervertebral disks in which the nucleus pulposus glows bright green (Figure 9B). This mouse strain is ideal for dissecting, purifying and culturing the stem cell population in the nucleus pulposus. Disks can be dissected from newborn mice using a fluorescent stereo dissecting microscope. The use of a fluorescent microscope during the dissection greatly enhances the ability to identify the intervertebral disks quickly and easily in newborn animals. Once the intervertebral disks are dissected, placing them in a trypsin/BSA solution creates a single-cell suspension of this tissue. The single-cell suspension of intervertebral disk cells is then sorted into YFP-positive (nucleus pulposus cells) and YFP-negative (rest of the disk) cell populations using a flow cytometer. Using the above procedure, YFP-positive cells from the notochord and neural tube have been isolated (Fig. 9D). In all cases, greater than 90% enrichment in YFP-positive cells can be obtained. Within the nucleus pulposus, notochordal cells have been proposed to act either as stem cells or as an organizer of this structure 120, 130. Notochordal cells can synthesize new extracellular matrix and can regulate the production of proteoglycan by other cells in the nucleus pulposus. Our purified population of cells should contain notochordal cells and chondrocyte-like cells.
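As a rough illustration of how the purity of a sorted fraction might be quantified, the sketch below gates events on a fluorescence threshold and reports the fraction of truly labeled cells in the positive gate and of unlabeled cells in the negative gate. The event values, the threshold, and the reading of "enrichment" as post-sort purity are all assumptions made for this example; they do not describe the cytometer settings or data used by the inventors.

```python
def sort_purity(events, threshold):
    """Gate events on fluorescence and report the purity of each sorted fraction.

    events: list of (fluorescence, is_labeled) pairs, where is_labeled is the
    ground truth that a re-analysis of the sorted fraction would provide.
    Returns (purity of the positive gate, purity of the negative gate).
    """
    positive_gate = [labeled for signal, labeled in events if signal >= threshold]
    negative_gate = [labeled for signal, labeled in events if signal < threshold]
    purity_pos = sum(positive_gate) / len(positive_gate) if positive_gate else 0.0
    purity_neg = (
        sum(not labeled for labeled in negative_gate) / len(negative_gate)
        if negative_gate
        else 0.0
    )
    return purity_pos, purity_neg

# Hypothetical events: (fluorescence in arbitrary units, cell truly YFP-labeled?)
events = [
    (950, True), (870, True), (910, True), (640, False),
    (120, False), (90, False), (300, True), (150, False),
]

purity_pos, purity_neg = sort_purity(events, threshold=500)
print(f"positive gate purity: {purity_pos:.0%}, negative gate purity: {purity_neg:.0%}")
```

Raising the threshold in this toy model trades yield for purity, which is the same trade-off an operator faces when setting a sort gate.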
We initially plate purified nucleus pulposus cells using standard cell culture conditions since it is not yet clear what factors are necessary for the propagation of stem cells from the nucleus pulposus. Notochordal cells in the nucleus pulposus have been reported to express a distinct subset of molecular markers120. The presence of notochordal cells in our purified cell population is confirmed using antibodies against proteins known to be expressed in this population (e.g., CD44s132, galectin 3143, vimentin144, cytokeratins 8 and 19144, CSPG145, and collagen type IIA146, 147). In addition to fate mapping Shh-expressing cells, the Shhgfpcre allele can be used to identify all cells that are actively expressing Shh. Shh is expressed in the notochord prior to the formation of the nucleus pulposus134. It is not known whether Shh continues to be expressed in the nucleus pulposus. In other tissues, Shh has been shown to be essential for the maintenance of a stem cell-like fate148. To determine if Shh is expressed in the putative notochordal stem cell population located in the nucleus pulposus, Shh RNA in situ hybridizations can be performed to detect Shh mRNA. The published Shh antibody149 can also be used to detect the location of SHH protein in the nucleus pulposus. In situations in which the Shh antibody works poorly, or in which Shh RNA in situ hybridization is difficult to perform due to small numbers of cells, the Shhgfpcre allele provides a useful alternative to overcome these problems. Because, as discussed, the Shhgfpcre allele expresses GFP in all cells in which Shh is expressed, it is possible to use a commercially available and highly sensitive anti-GFP antibody to detect low levels of Shh expression. Once Shh expression ceases, GFP expression disappears.
If a subpopulation of nucleus pulposus cells expresses Shh, then these cells are candidates for being stem cells or an organizer of this structure. As shown below, the inventors have discovered a small population of Shh-expressing cells at the edge of the nucleus pulposus of postnatal mice (Fig. 11). It is also possible that by culturing the entire purified nucleus pulposus one can identify a stem cell-like population using known notochordal markers. In either case, the identification and characterization of a cell population that is capable of repairing a damaged nucleus pulposus, or of constructing a new one, is an essential first step in the treatment of disk-related back pain.
Transcriptome of the nucleus pulposus:
Proteins expressed in the nucleus pulposus are responsible for the maintenance of this structure120. The ability to replace proteins in the nucleus pulposus that have degraded due to age or been damaged as a result of injury or disease may have potential therapeutic benefits. Cells from the nucleus pulposus can be isolated and purified, and the cells can be characterized in cell culture. In addition to growing and expanding populations of these cells, RNA can be extracted from purified nucleus pulposus cells, and amplified mRNA can be labeled and hybridized to Affymetrix DNA microarrays to determine the identities of (all) genes expressed in the nucleus pulposus (the transcriptome), as described above. The identification of the nucleus pulposus transcriptome will provide information leading to recognition of candidate genes that are involved in maintaining the integrity of the nucleus pulposus or are responsible for specifying distinct types of cells in this tissue.
One preferred set of genes of interest in the nucleus pulposus transcriptome comprises those encoding extracellular matrix proteins and secreted factors. Extracellular matrix proteins have been demonstrated to provide a number of functions in the nucleus pulposus (for example, providing compressive stiffness)120. Secreted factors, such as Shh and Bmp molecules, have been demonstrated in other tissues to be able to pattern undifferentiated cells into defined tissues and organs120, 132. Putative extracellular matrix proteins and secreted factors from the nucleus pulposus transcriptome are identified using existing mouse genomic databases and the NCBI BLAST program. RNA in situ hybridizations and RT-PCR can be performed to confirm that identified factors are expressed in the nucleus pulposus. RNA in situ hybridizations are also useful to determine whether identified factors are expressed in subsets of nucleus pulposus cells. Expression of a gene(s) in a subset of nucleus pulposus cells would be suggestive of important cell-type-specific functions for these factors. In addition, such genes would provide useful markers for the identification of specific cell types. RNA in situ hybridization is not ideal for determining the locations of factors expressed at low levels or in a small number of cells. For genes that were identified in an array screen but not detected by RNA in situ hybridization, primers designed to amplify those genes can be used for PCR on cDNA from nucleus pulposus cells.
Treatment of Intervertebral Discs
Treatment or repair of disk defects can be accomplished by injecting purified regenerative cells or proteins into damaged disks. Putative stem cells in the nucleus pulposus have been shown to disappear prior to disk degeneration. The re-introduction of a stem cell population may be able to halt or even reverse disk degeneration by directly repopulating disks or by providing a factor(s) that is capable of correctly repairing disks. The identification of genes whose products are required for the integrity of the disk, or that pattern disk cells, may also aid in the repair of this structure. The ability to inject purified protein(s) and/or cells directly into a damaged or diseased disk in order to effect disk repair would be a powerful tool in the treatment of back pain.
DETECTION OF NUCLEIC ACID MOLECULES
Analysis of gene expression is not limited to any one specific method but can include any method known in the art. All of these principles may be applied independently, in combination, or in combination with other known methods of sequence identification. Examples of methods of gene expression analysis known in the art include DNA arrays or microarrays (Brazma and Vilo, FEBS Lett., 2000, 480, 17-24; Celis, et al., FEBS Lett., 2000, 480, 2-16), SAGE (serial analysis of gene expression) (Madden, et al., Drug Discov. Today, 2000, 5, 415-425), READS (restriction enzyme amplification of digested cDNAs) (Prashar and Weissman, Methods Enzymol., 1999, 303, 258-72), TOGA (total gene expression analysis) (Sutcliffe, et al., Proc. Natl. Acad. Sci. U. S. A., 2000, 97, 1976-81), protein arrays and proteomics (Celis, et al., FEBS Lett., 2000, 480, 2-16; Jungblut, et al., Electrophoresis, 1999, 20, 2100-10), expressed sequence tag (EST) sequencing (Celis, et al., FEBS Lett., 2000, 480, 2-16; Larsson, et al., J. Biotechnol., 2000, 80, 143-57), subtractive RNA fingerprinting (SuRF) (Fuchs, et al., Anal. Biochem., 2000, 286, 91-98; Larson, et al., Cytometry, 2000, 41, 203-208), subtractive cloning, differential display (DD) (Jurecic and Belmont, Curr. Opin. Microbiol., 2000, 3, 316-21), comparative genomic hybridization (Carulli, et al., J. Cell Biochem. Suppl., 1998, 31, 286-96), FISH (fluorescent in situ hybridization) techniques (Going and Gusterson, Eur. J. Cancer, 1999, 35, 1895-904) and mass spectrometry methods (reviewed in To, Comb. Chem. High Throughput Screen, 2000, 3, 235-41).
In a preferred embodiment, Expressed Sequence Tags (ESTs) can also be used to identify nucleic acid molecules that are overexpressed in a cancer cell. ESTs can be identified from a variety of databases. Preferred databases include, for example, Online Mendelian Inheritance in Man (OMIM), the Cancer Genome Anatomy Project (CGAP), GenBank, EMBL, PIR, SWISS-PROT, and the like. OMIM, which is a database of genetic mutations associated with disease, was developed, in part, for the National Center for Biotechnology Information (NCBI). OMIM can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/Omim/. CGAP is an interdisciplinary program to establish the information and technological tools required to decipher the molecular anatomy of a cancer cell. CGAP can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/ncicgap/. Some of these databases may contain complete or partial nucleotide sequences. In addition, alternative transcript forms can also be selected from private genetic databases. Alternatively, nucleic acid molecules can be selected from available publications or can be determined especially for use in connection with the present invention. Alternative transcript forms can be generated from individual ESTs within each of the databases by computer software that generates contiguous sequences. In another embodiment of the present invention, the nucleotide sequence of the nucleic acid molecule is determined by assembling a plurality of overlapping ESTs. The EST database (dbEST), which is known and available to those skilled in the art, comprises approximately one million different human mRNA sequences, each comprising from about 500 to 1000 nucleotides, and various numbers of ESTs from a number of different organisms. dbEST can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/dbEST/index.html.
These sequences are derived from a cloning strategy that uses cDNA expression clones for genome sequencing. ESTs have applications in the discovery of new genes, mapping of genomes, and identification of coding regions in genomic sequences. Another important feature of EST sequence information that is becoming rapidly available is tissue-specific gene expression data. This can be extremely useful in targeting selective gene(s) for therapeutic intervention. Since EST sequences are relatively short, they must be assembled in order to provide a complete sequence. Because every available clone is sequenced, it results in a number of overlapping regions being reported in the database. The end result is the elicitation of alternative transcript forms from, for example, normal cells and cancer cells.
Assembly of overlapping ESTs extended along both the 5' and 3' directions results in a full-length "virtual transcript." The resultant virtual transcript may represent an already characterized nucleic acid or may be a novel nucleic acid with no known biological function. The Institute for Genomic Research (TIGR) Human Genome Index (HGI) database, which is known and available to those skilled in the art, contains a list of human transcripts. TIGR can be accessed through the world wide web of the Internet, at, for example, tigr.org. Transcripts can be generated in this manner using TIGR-Assembler, an engine for building virtual transcripts which is known and available to those skilled in the art. TIGR-Assembler is a tool for assembling large sets of overlapping sequence data such as ESTs, BACs, or small genomes, and can be used to assemble eukaryotic or prokaryotic sequences. TIGR-Assembler is described in, for example, Sutton, et al., Genome Science & Tech., 1995, 1, 9-19, which is incorporated herein by reference in its entirety, and can be accessed through the file transfer program of the Internet, at, for example, tigr.org/pub/software/TIGR.assembler. In addition, GLAXO-MRC, which is known and available to those skilled in the art, is another protocol for constructing virtual transcripts. In addition, the "Find Neighbors and Assemble EST Blast" protocol, which runs on a UNIX platform, has been developed to construct virtual transcripts. PHRAP is used for sequence assembly within Find Neighbors and Assemble EST Blast. PHRAP can be accessed through the world wide web of the Internet, at, for example, chimera.biotech.washington.edu/uwgc/tools/phrap.htm. Identification of ESTs and generation of contiguous ESTs to form full length RNA molecules is described in detail in U.S. application Ser. No. 09/076,440, which is incorporated herein by reference in its entirety.
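By way of illustration only, the idea of assembling overlapping ESTs into a longer contiguous sequence can be sketched with a simple greedy suffix-prefix merge. This sketch is not the algorithm used by TIGR-Assembler or PHRAP; the function names, the min_len parameter, and the toy reads are hypothetical, and real assemblers additionally handle sequencing errors, reverse complements, and reads contained within other reads.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads, min_len=3):
    """Greedily merge the pair of sequences with the longest exact overlap.

    Hypothetical toy assembler: exact matches only, forward strand only.
    Returns the list of contigs left when no overlaps remain.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    toy_ests = ["ATGGCGT", "GCGTACC", "TACCTGA"]
    print(assemble(toy_ests))
```

In this toy case the three short reads overlap end to end, so the merge collapses them into a single contig extended in both directions, which is the sense in which a "virtual transcript" is built from ESTs.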
An "allele" or "variant" is an alternative form of a gene. Of particular utility in the invention are variants of any potential Shh-related genes identified by the methods of this invention. Variants may result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in polypeptides whose structure or function may or may not be altered. Any given natural or recombinant gene may have none, one, or many allelic forms. Common mutational changes that give rise to variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.
To further identify variant nucleic acid molecules, nucleic acid molecules can be grouped into sets depending on the homology, for example. The members of a set of nucleic acid molecules are compared. Preferably, the set of nucleic acid molecules is a set of alternative transcript forms of nucleic acid. Preferably, the members of the set of alternative transcript forms of nucleic acids include at least one member which is associated, or whose encoded protein is associated, with a disease state or biological condition such as a stage of development. Thus, comparison of the members of the set of nucleic acid molecules results in the identification of at least one alternative transcript form of nucleic acid molecule which is associated, or whose encoded protein is associated, with a disease state or biological condition. In a preferred embodiment of the invention, the members of the set of nucleic acid molecules are from a common gene. In another embodiment of the invention, the members of the set of nucleic acid molecules are from a plurality of genes. In another embodiment of the invention, the members of the set of nucleic acid molecules are from different taxonomic species. Nucleotide sequences of a plurality of nucleic acids from different taxonomic species can be identified by performing a sequence similarity search, an ortholog search, or both, such searches being known to persons of ordinary skill in the art.
Sequence similarity searches can be performed manually or by using several available computer programs known to those skilled in the art. Preferably, Blast and Smith-Waterman algorithms, which are available and known to those skilled in the art, and the like can be used. Blast is NCBI's sequence similarity search tool designed to support analysis of nucleotide and protein sequence databases. Blast can be accessed through the world wide web of the Internet, at, for example, ncbi.nlm.nih.gov/BLAST/. The GCG Package provides a local version of Blast that can be used either with public domain databases or with any locally available searchable database. GCG Package v9.0 is a commercially available software package that contains over 100 interrelated software programs that enable analysis of sequences by editing, mapping, comparing and aligning them. Other programs included in the GCG Package include, for example, programs which facilitate RNA secondary structure predictions, nucleic acid fragment assembly, and evolutionary analysis. In addition, the most prominent genetic databases (GenBank, EMBL, PIR, and SWISS-PROT) are distributed along with the GCG Package and are fully accessible with the database searching and manipulation programs. GCG can be accessed through the Internet at, for example, http://www.gcg.com/. Fetch is a tool available in GCG that can get annotated GenBank records based on accession numbers and is similar to Entrez. Another sequence similarity search can be performed with Gene World and GeneThesaurus from Pangea. Gene World 2.5 is an automated, flexible, high-throughput application for analysis of polynucleotide and protein sequences. Gene World allows for automatic analysis and annotations of sequences. Like GCG, Gene World incorporates several tools for homology searching, gene finding, multiple sequence alignment, secondary structure prediction, and motif identification. GeneThesaurus 1.0™ is a sequence and annotation data subscription service providing information from multiple sources, providing a relational data model for public and local data. Another alternative sequence similarity search can be performed, for example, by BlastParse.
BlastParse is a PERL script running on a UNIX platform that automates the strategy described above. BlastParse takes a list of target accession numbers of interest and parses all the GenBank fields into "tab-delimited" text that can then be saved in a "relational database" format for easier search and analysis, which provides flexibility. The end result is a series of completely parsed GenBank records that can be easily sorted, filtered, and queried against, as well as an annotations-relational database.
Preferably, the plurality of nucleic acids from different taxonomic species which have homology to the target nucleic acid, as described above in the sequence similarity search, are further delineated so as to find orthologs of the target nucleic acid therein. An "ortholog" is a term defined in gene classification to refer to two genes in widely divergent organisms that have sequence similarity, and perform similar functions within the context of the organism. In contrast, "paralogs" are genes within a species that occur due to gene duplication, but have evolved new functions, and are also referred to as "isotypes." Optionally, paralog searches can also be performed. By performing an ortholog search, an exhaustive list of homologous sequences from as diverse organisms as possible is obtained. Subsequently, these sequences are analyzed to select the best representative sequence that fits the criteria for being an ortholog. An ortholog search can be performed by programs available to those skilled in the art including, for example, Compare. Preferably, an ortholog search is performed with access to complete and parsed GenBank annotations for each of the sequences. Currently, the records obtained from GenBank are "flat-files", and are not ideally suited for automated analysis. Preferably, the ortholog search is performed using a Q-Compare program. Preferred steps of the Q-Compare protocol are described in the flowchart set forth in U.S. Pat. No. 6,221,587, incorporated herein by reference.
Preferably, interspecies sequence comparison is performed using Compare, which is available and known to those skilled in the art. Compare is a GCG tool that allows pair-wise comparisons of sequences using a window/stringency criterion. Compare produces an output file containing the points where matches of specified quality are found; these points can be plotted with another GCG tool, DotPlot.
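As a non-limiting illustration of a window/stringency criterion, the following sketch reports every pair of start positions at which a window of fixed length contains at least a given number of identical residues. It is not the GCG Compare implementation; the function name, parameter names, and default values are assumptions chosen for the example.

```python
def window_matches(seq1, seq2, window=5, stringency=4):
    """Return (i, j) start positions whose windows share enough identities.

    A point is reported when at least `stringency` of the `window` aligned
    positions starting at seq1[i] and seq2[j] are identical. The resulting
    points are what a dot-plot tool would draw.
    """
    points = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            same = sum(a == b for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
            if same >= stringency:
                points.append((i, j))
    return points


if __name__ == "__main__":
    for point in window_matches("ACGTACGT", "ACGTTCGT", window=4, stringency=4):
        print(point)
```

Lowering the stringency relative to the window admits imperfect matches and therefore yields more points, while raising it toward the window length restricts the output to near-exact matches.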
The nucleic acid molecules of this invention can be isolated using the technique described in the experimental section or replicated using PCR. The PCR technology is the subject matter of U.S. Pat. Nos. 4,683,195, 4,800,159, 4,754,065, and 4,683,202 and described in PCR: The Polymerase Chain Reaction (Mullis et al. eds, Birkhauser Press, Boston (1994)) or MacPherson et al. (1991) and (1994), and references cited therein (see Methylation Specific PCR below). Alternatively, one of skill in the art can use the sequences provided herein and a commercial DNA synthesizer to replicate the DNA. Accordingly, this invention also provides a process for obtaining the polynucleotides of this invention by providing the linear sequence of the polynucleotide, nucleotides, appropriate primer molecules, chemicals such as enzymes, and instructions for their replication, and chemically replicating or linking the nucleotides in the proper orientation to obtain the polynucleotides. In a separate embodiment, these polynucleotides are further isolated. Still further, one of skill in the art can insert the polynucleotide into a suitable replication vector and insert the vector into a suitable host cell (procaryotic or eucaryotic) for replication and amplification. The DNA so amplified can be isolated from the cell by methods well known to those of skill in the art. A process for obtaining polynucleotides by this method is further provided herein as well as the polynucleotides so obtained. The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably throughout the specification, unless otherwise specified. As used herein, "nucleic acid molecule" refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"), or any phosphoester analogues thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant DNA molecule" is a DNA molecule that has undergone a molecular biological manipulation.
Percent identity and similarity between two sequences (nucleic acid or polypeptide) can be determined using a mathematical algorithm (see, e.g., Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps are introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap which need to be introduced for optimal alignment of the two sequences. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions, respectively, are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid "identity" is equivalent to amino acid or nucleic acid "homology").
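The position-by-position comparison described above can be sketched as follows for two sequences that have already been aligned. The sketch assumes one common convention, in which identical positions are divided by the total number of alignment columns so that each gap column lowers the result; other conventions (for example, dividing by the length of the shorter sequence) exist, and the function name and gap character are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical, non-gap columns divided by the total
    number of alignment columns, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Four identical columns out of six (one mismatch, one gap column).
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```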
A "comparison window" refers to a segment of any one of the number of contiguous positions selected from the group consisting of from 25 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. For example, the percent identity between two amino acid sequences can be determined using the Needleman and Wunsch algorithm (J. Mol. Biol. (48): 444-453, 1970) which is part of the GAP program in the GCG software package (available at http://www.gcg.com), by the local homology algorithm of Smith & Waterman (Adv. Appl. Math. 2: 482, 1981), by the search for similarity methods of Pearson & Lipman (Proc. Natl. Acad. Sci. USA 85: 2444, 1988) and Altschul, et al. (Nucleic Acids Res. 25(17): 3389-3402, 1997), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and BLAST in the Wisconsin Genetics Software Package (available from Genetics Computer Group, 575 Science Dr., Madison, Wis.)), or by manual alignment and visual inspection (see, e.g., Ausubel et al., supra). Gap parameters can be modified to suit a user's needs. For example, when employing the GCG software package, a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6 can be used. Exemplary gap weights using a BLOSUM62 matrix or a PAM250 matrix are 16, 14, 12, 10, 8, 6, or 4, while exemplary length weights are 1, 2, 3, 4, 5, or 6. The GCG software package can be used to determine percent identity between nucleic acid sequences. The percent identity between two amino acid or nucleotide sequences also can be determined using the algorithm of E. Myers and W. Miller (CABIOS 4: 11-17, 1989) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
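To illustrate how a gap weight and a length weight enter a global alignment, the following sketch computes a Needleman-Wunsch style score with affine gap penalties. It is a simplified stand-in and not the GCG GAP program: the match and mismatch scores replace a full scoring matrix, the function and parameter names are hypothetical, and the assumption that a gap of length n costs gap_weight + n * length_weight is made for the example only.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap_weight=4, length_weight=1):
    """Best global alignment score of `a` and `b` with affine gap penalties.

    Assumed gap cost: gap_weight + n * length_weight for a gap of n positions.
    Three tables are kept: M ends in a residue pair, X ends in a gap in `b`,
    Y ends in a gap in `a`.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_weight + length_weight * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_weight + length_weight * j)
    open_cost = gap_weight + length_weight
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + pair
            X[i][j] = max(M[i - 1][j] - open_cost,
                          X[i - 1][j] - length_weight,
                          Y[i - 1][j] - open_cost)
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          Y[i][j - 1] - length_weight,
                          X[i][j - 1] - open_cost)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))                # default gap weight
    print(global_alignment_score("ACGT", "AGT", gap_weight=0))  # cheaper gaps
```

Raising the gap weight discourages opening new gaps, while raising the length weight discourages long gaps, which is why these two parameters are adjusted to suit the sequences being compared.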
The nucleic acid sequences of the present invention can further be used as query sequences to perform a search against sequence databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. Mol. Biol. 215: 403-10, 1990). BLAST nucleotide searches can be performed with the NBLAST program, with exemplary scores=100, and wordlengths=12 to obtain nucleotide sequences homologous to or with sufficient percent identity to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, with exemplary scores=50 and wordlengths=3 to obtain amino acid sequences sufficiently homologous to or with sufficient % identity to the proteins of the invention. To obtain gapped alignments for comparison purposes, gapped BLAST can be used as described in Altschul et al. (Nucleic Acids Res. 25(17): 3389-3402, 1997). When using BLAST and gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (herein "Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed. 1984); Nucleic Acid Hybridization [B. D. Hames & S. J. Higgins eds. (1985)]; Transcription And Translation [B. D. Hames & S. J. Higgins, eds. (1984)]; Animal Cell Culture [R. I. Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal, A Practical Guide To Molecular Cloning (1984); F. M. Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).
As used herein, the term "fragment or segment,"as applied to a nucleic acid sequence, or gene, will ordinarily be at least about 5 contiguous nucleic acid bases (for nucleic acid sequence or gene) or amino acids (for polypeptides), typically at least about 10 contiguous nucleic acid bases or amino acids, more typically at least about 20 contiguous nucleic acid bases or amino acids, usually at least about 30 contiguous nucleic acid bases or amino acids, preferably at least about 40 contiguous nucleic acid bases or amino acids, more preferably at least about 50 contiguous nucleic acid bases or amino acids, and even more preferably at least about 60 to 80 or more contiguous nucleic acid bases or amino acids in length. "Overlapping fragments" as used herein, refer to contiguous nucleic acid fragments which begin at the amino terminal end of a nucleic acid and end at the carboxy terminal end of the nucleic acid or protein. Each nucleic acid or fragment has at least about one contiguous nucleic acid position in common with the next nucleic acid fragment, more preferably at least about three contiguous nucleic acid bases in common, most preferably at least about ten contiguous nucleic acid bases in common. A significant "fragment" in a nucleic acid context is a contiguous segment of at least about 17 nucleotides, generally at least 20 nucleotides, more generally at least 23 nucleotides, ordinarily at least 26 nucleotides, more ordinarily at least 29 nucleotides, often at least 32 nucleotides, more often at least 35 nucleotides, typically at least 38 nucleotides, more typically at least 41 nucleotides, usually at least 44 nucleotides, more usually at least 47 nucleotides, preferably at least 50 nucleotides, more preferably at least 53 nucleotides, and in particularly preferred embodiments will be at least 56 or more nucleotides. Additional preferred embodiments will include lengths in excess of those numbers, e.g., 63, 72, 87, 96, 105, 117, etc. 
Said fragments may have termini at any pairs of locations, but especially at boundaries between structural domains, e.g., membrane spanning portions.
Homologous nucleic acid sequences, when compared, exhibit significant sequence identity or similarity. The standards for homology in nucleic acids are either measures for homology generally used in the art by sequence comparison or based upon hybridization conditions. The hybridization conditions are described in greater detail below.
As used herein, "substantial homology" in the nucleic acid sequence comparison context means either that the segments, or their complementary strands, when compared, are identical when optimally aligned, with appropriate nucleotide insertions or deletions, in at least about 50% of the nucleotides, generally at least 56%, more generally at least 59%, ordinarily at least 62%, more ordinarily at least 65%, often at least 68%, more often at least 71%, typically at least 74%, more typically at least 77%, usually at least 80%, more usually at least about 85%, preferably at least about 90%, more preferably at least about 95 to 98% or more, and in particular embodiments, as high as about 99% or more of the nucleotides. Alternatively, substantial homology exists when the segments will hybridize under selective hybridization conditions, to a strand, or its complement. Typically, selective hybridization will occur when there is at least about 55% homology over a stretch of at least about 14 nucleotides, preferably at least about 65%, more preferably at least about 75%, and most preferably at least about 90%. See, Kanehisa (1984) Nuc. Acids Res. 12:203-213. The length of homology comparison, as described, may be over longer stretches, and in certain embodiments will be over a stretch of at least about 17 nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 40 nucleotides, preferably at least about 50 nucleotides, and more preferably at least about 75 to 100 or more nucleotides. The endpoints of the segments may be at many different pair combinations. Stringent conditions, in referring to homology in the hybridization context, will be stringent combined conditions of salt, temperature, organic solvents, and other parameters, typically those controlled in hybridization reactions.
Stringent temperature conditions will usually include temperatures in excess of about 30°C, more usually in excess of about 37°C, typically in excess of about 45°C, more typically in excess of about 55°C, preferably in excess of about 65°C, and more preferably in excess of about 70°C. Stringent salt conditions will ordinarily be less than about 1000 mM, usually less than about 500 mM, more usually less than about 400 mM, typically less than about 300 mM, preferably less than about 200 mM, and more preferably less than about 150 mM. However, the combination of parameters is much more important than the measure of any single parameter. See, e.g., Wetmur and Davidson (1968) J. Mol. Biol. 31:349-370.
The primers of the invention embrace oligonucleotides of sufficient length and appropriate sequence so as to provide specific initiation of polymerization on a significant number of nucleic acids in the polymorphic locus. Specifically, the term "primer" as used herein refers to a sequence comprising two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and most preferably more than eight, which sequence is capable of initiating synthesis of a primer extension product, which is substantially complementary to a polymorphic locus strand. Environmental conditions conducive to synthesis include the presence of nucleoside triphosphates and an agent for polymerization, such as DNA polymerase, and a suitable temperature and pH. The primer is preferably single stranded for maximum efficiency in amplification, but may be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition. An oligonucleotide primer typically contains 12-20 or more nucleotides, although it may contain fewer nucleotides. Primers of the invention are designed to be "substantially" complementary to each strand of the genomic locus to be amplified and include the appropriate G or C nucleotides as discussed above. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions which allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' flanking sequences to hybridize therewith and permit amplification of the genomic locus.
Oligonucleotide primers of the invention are employed in the amplification process which is an enzymatic chain reaction that produces exponential quantities of target locus relative to the number of reaction steps involved. Typically, one primer is complementary to the negative (-) strand of the locus and the other is complementary to the positive (+) strand. Annealing the primers to denatured nucleic acid followed by extension with an enzyme, such as the large fragment of DNA Polymerase I and nucleotides, results in newly synthesized + and - strands containing the target locus sequence. Because these newly synthesized sequences are also templates, repeated cycles of denaturing, primer annealing, and extension result in exponential production of the region (i.e., the target locus sequence) defined by the primers. The product of the chain reaction is a discrete nucleic acid duplex with termini corresponding to the ends of the specific primers employed. The oligonucleotide primers of the invention may be prepared using any suitable method, such as conventional phosphotriester and phosphodiester methods or automated embodiments thereof. In one such automated embodiment, diethylphosphoramidites are used as starting materials and may be synthesized as described by Beaucage, et al. (Tetrahedron Letters, 22:1859-1862, 1981). One method for synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066.
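The exponential relationship between the number of cycles and the quantity of target locus can be illustrated with a short calculation. The sketch below assumes an idealized model in which every template is copied with a fixed per-cycle efficiency (1.0 meaning perfect doubling); the function name and the efficiency parameter are hypothetical, and real reactions plateau as reagents are consumed.

```python
def amplified_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after a number of PCR cycles.

    Assumed model: copies = initial_copies * (1 + efficiency) ** cycles,
    with efficiency between 0 (no amplification) and 1 (perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, amplified_copies(1, n))
```

Under perfect doubling, a single template molecule yields roughly a thousand copies after 10 cycles and roughly a billion after 30, which is the sense in which the product accumulates exponentially relative to the number of reaction steps.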
Any nucleic acid specimen, in purified or nonpurified form, can be utilized as the starting nucleic acid or acids, provided it contains, or is suspected of containing, the specific nucleic acid sequence containing the target locus (e.g., CpG). Thus, the process may employ, for example, DNA or RNA, including messenger RNA, wherein DNA or RNA may be single stranded or double stranded. In the event that RNA is to be used as a template, enzymes, and/or conditions optimal for reverse transcribing the template to DNA would be utilized. In addition, a DNA-RNA hybrid which contains one strand of each may be utilized. A mixture of nucleic acids may also be employed, or the nucleic acids produced in a previous amplification reaction herein, using the same or different primers may be so utilized. The specific nucleic acid sequence to be amplified, i.e., the target locus, may be a fraction of a larger molecule or can be present initially as a discrete molecule, so that the specific sequence constitutes the entire nucleic acid. It is not necessary that the sequence to be amplified be present initially in a pure form; it may be a minor fraction of a complex mixture, such as contained in whole human DNA.
The nucleic acid-containing specimen used for detection may be from any source including a developing limb bud, enamel knot tissue, follicle of a hair, retinal tissue, brain, colon, urogenital tissue, hematopoietic tissue, thymus, testis, ovarian, uterine, prostate, breast, gastrointestinal, lung and renal tissue and may be extracted by a variety of techniques such as that described by Maniatis, et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., pp 280, 281, 1982). If the extracted sample is impure (such as plasma, serum, or blood or a sample embedded in paraffin), it may be treated before amplification with an amount of a reagent effective to open the cells, fluids, tissues, or animal cell membranes of the sample, and to expose and/or separate the strand(s) of the nucleic acid(s). This lysing and nucleic acid denaturing step to expose and separate the strands will allow amplification to occur much more readily.
Where the target nucleic acid sequence of the sample contains two strands, it is necessary to separate the strands of the nucleic acid before it can be used as the template. Strand separation can be effected either as a separate step or simultaneously with the synthesis of the primer extension products. This strand separation can be accomplished using various suitable denaturing conditions, including physical, chemical, or enzymatic means; the term "denaturing" includes all such means. One physical method of separating nucleic acid strands involves heating the nucleic acid until it is denatured. Typical heat denaturation may involve temperatures ranging from about 80 degrees to 105 degrees C for times ranging from about 1 to 10 minutes. Strand separation may also be induced by an enzyme from the class of enzymes known as helicases or by the enzyme RecA, which has helicase activity, and in the presence of riboATP, is known to denature DNA. The reaction conditions suitable for strand separation of nucleic acids with helicases are described by Kuhn Hoffmann-Berling (CSH-Quantitative Biology, 43:63, 1978) and techniques for using RecA are reviewed in C. Radding (Ann. Rev. Genetics, 16:405-437, 1982).
When complementary strands of nucleic acid or acids are separated, regardless of whether the nucleic acid was originally double or single stranded, the separated strands are ready to be used as a template for the synthesis of additional nucleic acid strands. This synthesis is performed under conditions allowing hybridization of primers to templates to occur. Generally synthesis occurs in a buffered aqueous solution, preferably at a pH of 7-9, most preferably about 8. Preferably, a molar excess (for genomic nucleic acid, usually about 10 :1 primer template) of the two oligonucleotide primers is added to the buffer containing the separated template strands. It is understood, however, that the amount of complementary strand may not be known if the process of the invention is used for diagnostic applications, so that the amount of primer relative to the amount of complementary strand cannot be determined with certainty. As a practical matter, however, the amount of primer added will generally be in molar excess over the amount of complementary strand (template) when the sequence to be amplified is contained in a mixture of complicated long-chain nucleic acid strands. A large molar excess is preferred to improve the efficiency of the process.
The deoxyribonucleoside triphosphates dATP, dCTP, dGTP, and dTTP are added to the synthesis mixture, either separately or together with the primers, in adequate amounts and the resulting solution is heated to about 90 degrees to 100 degrees C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period, the solution is allowed to cool to room temperature, which is preferable for the primer hybridization. To the cooled mixture is added an appropriate agent for effecting the primer extension reaction (called herein "agent for polymerization"), and the reaction is allowed to occur under conditions known in the art. The agent for polymerization may also be added together with the other reagents if it is heat stable. This synthesis (or amplification) reaction may occur at room temperature up to a temperature above which the agent for polymerization no longer functions. Thus, for example, if DNA polymerase is used as the agent, the temperature is generally not greater than about 40 degrees C. Most conveniently, the reaction occurs at room temperature.
The agent for polymerization may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes. Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4 DNA polymerase, other available DNA polymerases, polymerase muteins, reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e., those enzymes which perform primer extension after being subjected to temperatures sufficiently elevated to cause denaturation). Suitable enzymes will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each locus nucleic acid strand. Generally, the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths. There may be agents for polymerization, however, which initiate synthesis at the 5' end and proceed in the other direction, using the same process as described above.
Preferably, the method of amplifying is by PCR, as described herein and as is commonly used by those of ordinary skill in the art. Alternative methods of amplification have been described and can also be employed as long as the methylated and non-methylated loci amplified by PCR using the primers of the invention are similarly amplified by the alternative means.
The amplified products are preferably identified by sequencing. Sequences amplified by the methods of the invention can be further evaluated, detected, cloned, sequenced, and the like, either in solution or after binding to a solid support, by any method usually applied to the detection of a specific DNA sequence such as PCR, oligomer restriction (Saiki, et al., Bio/Technology, 3:1008-1012, 1985), allele-specific oligonucleotide (ASO) probe analysis (Conner, et al., Proc. Natl. Acad. Sci. USA, 80:278, 1983), oligonucleotide ligation assays (OLAs) (Landegren, et al., Science, 241:1077, 1988), and the like. Molecular techniques for DNA analysis have been reviewed (Landegren, et al., Science, 242:229-237, 1988).
In another embodiment of the invention, the genes that are co-expressed with the Shh gene are identified by a biochip, such as, for example, an Affymetrix GeneChip. For each gene identified as differentially expressed by Affymetrix GeneChip, a corresponding SAGE tag can be identified, and the total number of SAGE tags present in the SAGEmap database (http://www.ncbi.nlm.nih.gov/SAGE/) can be determined. Preferably, any gene having at least about five tags in about one of these two SAGE libraries was then excluded from further analysis. Serial Analysis of Gene Expression (SAGE) is based on the identification and characterization of partial, defined sequences of transcripts corresponding to gene segments. These defined transcript sequence "tags" are markers for genes which are expressed in a cell, a tissue, or an extract, for example.
SAGE is based on several principles. First, a short nucleotide sequence tag (9 to 10 bp) contains sufficient information content to uniquely identify a transcript provided it is isolated from a defined position within the transcript. For example, a sequence as short as 9 bp can distinguish 262,144 transcripts (4^9) given a random nucleotide distribution at the tag site, whereas estimates suggest that the human genome encodes about 80,000 to 200,000 transcripts (Fields, et al., Nature Genetics, 7:345, 1994). The size of the tag can be shorter for lower eukaryotes or prokaryotes, for example, where the number of transcripts encoded by the genome is lower. For example, a tag as short as 6-7 bp may be sufficient for distinguishing transcripts in yeast.
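The tag-space arithmetic can be checked with a short sketch. The function name `tag_space` is an illustrative assumption, and the calculation assumes, as the text does, a random nucleotide distribution over the four bases at the tag site.

```python
def tag_space(tag_length_bp: int, alphabet_size: int = 4) -> int:
    """Number of distinct tags of the given length over a 4-letter nucleotide alphabet."""
    if tag_length_bp < 0:
        raise ValueError("tag length must be non-negative")
    return alphabet_size ** tag_length_bp


# Tag lengths mentioned in the text: 6-7 bp (yeast) and 9-10 bp.
for length in (6, 7, 9, 10):
    print(f"{length} bp -> {tag_space(length):,} possible tags")
```

A 9 bp tag gives 4^9 = 262,144 possible sequences, which exceeds the upper transcript estimate quoted above (200,000), whereas an 8 bp tag (65,536) would not.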
Second, random dimerization of tags allows a procedure for reducing bias (caused by amplification and/or cloning). Third, concatenation of these short sequence tags allows the efficient analysis of transcripts in a serial manner by sequencing multiple tags within a single vector or clone. As with serial communication by computers, wherein information is transmitted as a continuous string of data, serial analysis of the sequence tags requires a means to establish the register and boundaries of each tag. The concept of deriving a defined tag from a sequence in accordance with the present invention is useful in matching tags of samples to a sequence database. In the preferred embodiment, a computer method is used to match a sample sequence with known sequences. The tags used herein uniquely identify genes. This is due to their length, and their specific location (3') in a gene from which they are drawn. The full length genes can be identified by matching the tag to a gene database member, or by using the tag sequences as probes to physically isolate previously unidentified genes from cDNA libraries. The methods by which genes are isolated from libraries using DNA probes are well known in the art. See, for example, Velculescu et al., Science 270: 484 (1995), and Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, 2nd ed. (Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). Once a gene or transcript has been identified, either by matching to a database entry, or by physically hybridizing to a cDNA molecule, the position of the hybridizing or matching region in the transcript can be determined. If the tag sequence is not in the 3' end, immediately adjacent to the restriction enzyme site used to generate the SAGE tags, then a spurious match may have been made. Confirmation of the identity of a SAGE tag can be made by comparing transcription levels of the tag to that of the identified gene in certain cell types.
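The computer matching step described above might be sketched as follows. This is a minimal illustration, not the method of the invention: the recognition site `CATG`, the 10 bp tag length, the function names, and the toy sequences are all assumptions introduced here for the example.

```python
ANCHOR_SITE = "CATG"  # assumption: recognition site of the anchoring restriction enzyme
TAG_LENGTH = 10       # assumption: tag length in base pairs


def extract_tag(transcript, site=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag immediately 3' of the 3'-most anchoring site, or None."""
    sequence = transcript.upper()
    position = sequence.rfind(site)  # 3'-most occurrence of the site
    if position == -1:
        return None
    start = position + len(site)
    tag = sequence[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def build_tag_index(database):
    """Map each expected tag to the database entries that would produce it."""
    index = {}
    for name, sequence in database.items():
        tag = extract_tag(sequence)
        if tag is not None:
            index.setdefault(tag, []).append(name)
    return index


def match_tag(tag, index):
    """Return the names of database entries whose 3'-most tag equals the query."""
    return index.get(tag.upper(), [])


# Hypothetical toy database (sequences invented for illustration).
toy_database = {
    "gene_a": "AAACATGTTTTCATGACGTACGTAAGGCC",
    "gene_b": "GGGCATGCCCCCCCCCCTT",
}
toy_index = build_tag_index(toy_database)
print(match_tag("ACGTACGTAA", toy_index))
```

Because the index is built only from the 3'-most site, a query tag that occurs elsewhere in a transcript does not match, which mirrors the spurious-match check described in the text.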
EXEMPLIFICATION: The following examples are offered by way of illustration, not by way of limitation. While specific examples have been provided, the above description is illustrative and not restrictive. Any one or more of the features of the previously described embodiments can be combined in any manner with one or more features of any other embodiments in the present invention. Furthermore, many variations of the invention will become apparent to those skilled in the art upon review of the specification. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.
All publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent document were so individually denoted. By their citation of various references in this document, Applicants do not admit that any particular reference is "prior art" to their invention.
Materials and Methods
Mouse Construction and Genotyping. The Shhgfpcre ES cell targeting construct was made in pBR322, a low copy number vector, because it had previously been reported that the presence of sequences 5' of the mouse Shh locus resulted in bacterial cell death. The 5' targeting arm was 1.2 kb followed by a gfpcre cassette containing an in-frame fusion between gfp and cre. The gfpcre cassette was placed at the ATG of Shh. Base pairs located at -1 and -5, relative to the Shh ATG, were changed from a G/C to a C/G and A/T to a T/A, respectively, to create a SalI site used to clone the 5' targeting arm. The 3' targeting arm was 8 kb and began 35 bp downstream of the Shh ATG. The only genomic sequences lacking in correctly targeted ES cells were the first 35 base pairs after the ATG of Shh. These base pairs were excised to create a Shh null allele. All genomic sequences involved in regulating expression from the Shh locus are present in correctly targeted ES cells. Correctly targeted ES cells were identified by Southern blot analysis. ES cells were injected into the blastocoele of a recipient blastocyst, which was then implanted into a pseudo-pregnant recipient mouse. Male chimeric mice were then produced in which some of the animals' cells were derived from the ES cells and the remainder were from the recipient blastocyst. Chimeric male mice were used for breeding to determine if the injected ES cells contributed to the germline. Successful passage of the Shhgfpcre allele through the germline resulted in the establishment of a stable mouse line in which the gfpcre cassette had been inserted at the ATG of the Shh gene. Embryos and adults were genotyped by one of the following methods: PCR, staining for β-galactosidase, or direct visualization of GFP expression.
Example 1. Purifying Shh-Expressing Cells from the Embryonic Vertebrate Limb.
Sonic hedgehog (Shh) is the only gene known to be expressed exclusively in the Zone of Polarizing Activity (ZPA), an important signaling center during limb development. This Example describes use of Shhgfpcre mice as described above to isolate and purify a population of stem cells from the developing limb buds of embryos and to discover genes involved in limb development.
As discussed, the Shhgfpcre allele was constructed by placing a GfpCre fusion cassette into one of the Shh alleles. Insertion of this cassette creates a null Shh allele. However, the remaining Shh allele is unaffected and mice that are heterozygous for the Shhgfpcre allele are phenotypically wild type. As a consequence of containing the Shhgfpcre allele, mice express GFP and CRE in all cells that normally express Shh (ref. 21). In the limb, GFP expression is indistinguishable from that of endogenous Shh and no ectopic GFP or CRE expression is observed outside of the wild type Shh expression domains. The ability to mark cells that express Shh with GFP allowed us to purify this population of cells using fluorescence-activated cell sorting (FACS). Limbs were harvested from embryos at embryonic day 10.5 (E10.5) containing the Shhgfpcre allele because at this stage Shh (and therefore GFP) is strongly expressed, and the limb is in the process of being patterned. Both forelimbs and hindlimbs were combined from the Shhgfpcre-positive embryos. Hindlimb development is approximately a half-day behind forelimb development in E10.5 limbs. Combining all four limbs allowed us to assay two slightly different stages of development on the same array. It is possible that, by combining forelimbs and hindlimbs in our analysis, we missed genes that are either expressed transiently at a specific stage of development or that are forelimb- or hindlimb-specific.
GFP-positive limbs from 2-4 embryos were harvested and placed in trypsin for 5 minutes at 37°C. After 5 minutes, the trypsin was replaced with fresh trypsin and the limbs were allowed to incubate for an additional 5 minutes at 37°C. The limbs were then washed twice in PBS and dissociated into a single-cell suspension by pipetting vigorously for 5 minutes. Control, GFP-negative limbs were processed in an identical manner.
The single-cell suspension of limb cells was then sorted using FACS. Limbs that lacked GFP were used to set a control GFP-negative baseline. Dissociated cells from limbs that expressed GFP in the ZPA (that is, contained the Shhgfpcre allele) were then sorted. Two easily distinguishable populations of cells were observed in the GFP-positive sample. The first overlapped the trace obtained when GFP-negative limbs had been sorted. The second population contained GFP-positive cells. In Shhgfpcre limbs, cells that were GFP-positive and GFP-negative were collected from the same sample. The GFP-positive population should contain cells that express Shh while the GFP-negative population is composed of cells present in the rest of the limb. From each limb, ~700 GFP-positive cells were obtained. To determine the purity of each population of cells, an aliquot of GFP-negative and GFP-positive sorted cells was resorted using the same parameters as the initial sort. Of the GFP-negative resorted cells, >99.7% remained GFP negative. Of the GFP-positive resorted cells, >82% remained GFP positive. These data demonstrate that we can successfully purify GFP-negative and GFP-positive (Shh expressing) cells from limbs of embryonic mice.
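The purity figures above are simple ratios of resorted events. A minimal sketch of that arithmetic follows; the function name and the event counts are hypothetical, chosen only so that the outputs fall above the reported thresholds, and are not data from the experiment.

```python
def resort_purity(events_in_gate: int, total_events: int) -> float:
    """Percentage of resorted events that fall back into the original gate."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= events_in_gate <= total_events:
        raise ValueError("events_in_gate must lie between 0 and total_events")
    return 100.0 * events_in_gate / total_events


# Hypothetical resort counts (illustrative only, not measured values).
gfp_negative_purity = resort_purity(9980, 10000)
gfp_positive_purity = resort_purity(830, 1000)

print(f"GFP-negative purity: {gfp_negative_purity:.1f}%")
print(f"GFP-positive purity: {gfp_positive_purity:.1f}%")
```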
Example 2. Hybridizing Affymetrix Arrays with cRNA from Shh-positive and Shh-Negative E10.5 Limbs.
To identify the genes present in these two different populations of cells we selected the Affymetrix Gene Chip 430 version 2.0 since this chip has the most complete coverage of the mouse transcriptome (~39,000 probe sets per chip). Probes (cRNAs) were made using the Affymetrix 2-step cDNA amplification kit since the starting amount of RNA we obtained from the GFP-positive population was insufficient for single step amplification. A total of eight chips were used, 4 with cRNA from the GFP-positive cells and 4 with cRNA from the GFP-negative population. Each chip was hybridized with a biologically independent sample. Equal amounts of cRNA were used on each chip and processed using standard Affymetrix procedures. Comparison of the transcriptomes of Shh-expressing and Shh-non-expressing cells should yield genes that are exclusively expressed in Shh cells and nowhere else in the limb. Affymetrix GeneChips function by detecting transcript abundance, which can be compared between two (or more) different cell populations. In this experiment, any gene that is more highly expressed in Shh-positive cells compared to the rest of the limb should be uncovered. If a gene is expressed in Shh cells and in a population of cells outside the ZPA, it is possible that this gene would still be detected in our experiment. In addition, genes that are transcribed at much higher levels in Shh-positive cells but are also present at lower levels throughout the limb may be detected. Genes expressed in a posterior-to-anterior concentration gradient can also be identified in this experiment.
Example 3. Genes Upregulated in Shh-expressing Cells Identified by Array Screening
Affymetrix GeneChips allow for the identification of the transcripts present in a defined population of cells. Affymetrix GeneChips were probed with cRNA made from either Shh-positive cells or cells that do not express Shh in the E10.5 mouse limb. After hybridizing the chips, ~40% of the probe sets were scored as "present" on chips hybridized with samples from either the Shh-positive or Shh-negative cells. To determine which genes are differentially expressed in the two cell populations we used two array analysis tools, BRB array tools and D-chip (ref. 30). The two programs produced overlapping but not identical sets of genes that are expressed at higher levels in either the Shh-positive or negative populations of cells, as described below. In the GFP-positive (Shh-expressing) population, 15 genes were identified as being expressed at higher levels relative to the GFP-negative populations. These genes are listed in Table 1, supra. As expected, one of these genes was Shh. Obtaining Shh in this data set indicates that the arrays can identify genes expressed differentially in our two populations of cells and suggests that the remaining genes identified may be upregulated or exclusively expressed in the Shh-positive population. The transcription factor Tbx2 was also identified as being enriched in Shh-positive cells. Other laboratories have characterized Tbx2 and its expression is known to overlap with Shh in the E10.5 limb (refs. 31, 32). Tbx2 is also expressed in the posterior of the limb in non-Shh expressing cells and in a small region of the anterior limb. One other known transcription factor was identified, i.e., Hb9. Hb9 has been demonstrated to play an important role in pancreas development but no function in limb development has been reported (refs. 33, 34). Cholecystokinin, a gastrointestinal hormone (ref. 35), and arachidonate 12-lipoxygenase, an enzyme that introduces molecular oxygen at carbon 12 of arachidonic acid to generate a 12-hydroperoxy derivative (ref. 36), were also uncovered (Table 1). The role that these genes may play in limb development is unknown.
Of the remaining ten genes uncovered in the screen, two transcripts with homology to two different families of transmembrane factors were identified. We have named these genes provisionally as TM1 and TM2. The function of neither of these transmembrane families has been described. TM1 and TM2 are not similar to the transmembrane genes patched, smoothened, Hip or dispatched, which are known to have essential roles in Shh signaling. The other eight genes are ESTs of unknown function.
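The kind of two-population comparison described in Examples 2 and 3 can be illustrated with a minimal sketch. It does not reproduce the algorithms of the analysis tools named above; the log2 fold-change criterion, the threshold of 1.0, the function names, the gene names, and all signal values are assumptions made for illustration.

```python
from math import log2
from statistics import mean


def log2_fold_change(positive, negative):
    """Mean log2 signal on GFP-positive chips minus mean log2 signal on GFP-negative chips."""
    return mean(log2(v) for v in positive) - mean(log2(v) for v in negative)


def enriched_genes(signals, min_log2_fc=1.0):
    """Genes whose log2 fold change meets the (assumed) threshold."""
    return {
        gene
        for gene, (positive, negative) in signals.items()
        if log2_fold_change(positive, negative) >= min_log2_fc
    }


# Hypothetical signals: four GFP-positive and four GFP-negative chips per gene.
toy_signals = {
    "gene_up": ([800, 820, 790, 810], [100, 110, 95, 105]),
    "gene_flat": ([200, 210, 190, 205], [195, 205, 200, 198]),
}

calls_tool_a = enriched_genes(toy_signals)
calls_tool_b = {"gene_up", "gene_other"}  # hypothetical calls from a second tool

# Two tools can give overlapping but not identical sets; the intersection
# is one conservative way to combine them.
consensus = calls_tool_a & calls_tool_b
print(sorted(consensus))
```

A gene expressed at similar levels inside and outside the sorted population (here `gene_flat`) is not called as enriched even though it would be scored as present in both samples.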
There are a number of known genes that have been shown to overlap with Shh that were not identified as being upregulated in Shh-positive cells in the array screen. It is possible that even though these genes are expressed in Shh cells, an equal or greater number of transcripts are expressed outside the Shh expression domain. This would result in this family of genes being scored as "present" in both GFP-positive and GFP-negative cells, but not as upregulated in the Shh-positive cell population.
Example 4. Genes Downregulated in Shh-expressing Cells Identified by Array Screening
Cells that express Shh in the E10.5 mouse limb comprise a small subset of the total number of cells present in the limb. By comparing two populations of cells, i.e., Shh-expressing and non-expressing cells, we were able to identify a pool of 120 genes that appeared to have elevated expression in cells that did not express Shh. The number of genes elevated in cells that were not expressing Shh was larger than the 15 genes elevated in Shh-positive cells. This is not unexpected, since precursor cells for a number of tissues including bone, muscle and connective tissue are known to be present outside of the Shh-positive regions of the E10.5 limb. The genes that show upregulated expression, relative to that observed in Shh-expressing cells, are listed in Table 2, supra.
Example 5. In vivo Validation of Target Genes Identified by Microarray Screening.
The array experiments described above provide the relative expression levels of specific genes in the two populations of cells assayed. To determine where in the E10.5 limb a specific gene is expressed, RNA in situ hybridizations were performed. Of the thirteen unknown genes found in our screen to be up-regulated in Shh-expressing cells, five were detected in the E10.5 limb using whole mount RNA in situ hybridization. Four of these genes, i.e., the two transmembrane factors TM1 and TM2 and two ESTs, were expressed in Shh-positive cells in the E10.5 limb (Fig. 2). These data demonstrate that the array experiment can identify previously unidentified genes that are co-expressed with Shh.
The fifth EST appeared to be expressed throughout the E10.5 limb. It is not clear why this gene was detected in the screen. It is possible that this transcript is elevated in Shh-positive cells while lower levels are expressed throughout the limb. Whole mount RNA in situ hybridizations cannot be used to distinguish between populations of cells expressing a gene at slightly different levels. The eight remaining genes did not produce a visible in situ signal in limbs. They may be expressed at levels below the sensitivity of our in situ protocol.
From the pool of 120 genes identified as being down-regulated in Shh-expressing cells, four were characterized using RNA whole mount in situ hybridization. One gene, i.e., netrin, had previously been reported as being expressed in the vertebrate limb in cells that do not normally express Shh (ref. 37). We confirmed that this gene is not present in Shh-positive cells by performing double RNA whole mount in situ hybridization with netrin and Shh.
We further examined three ESTs randomly chosen from our data set. The expression of two of these genes was found not to overlap with Shh in the limb. The third was not detected in the limb. Of the 120 genes identified as being up-regulated outside of Shh-expressing cells, 12 have been previously characterized in the mouse limb. All of these genes have been reported not to overlap with Shh. These data demonstrate that the Shhgfpcre mouse can be used to identify novel genes expressed in Shh-expressing and non-expressing cell populations in the vertebrate limb.
Example 6. In situ Characterization of TM1 in Mice.
As shown above, the transmembrane gene TM1 was identified as being enriched in Shh cells. RNA in situ hybridizations using TM1 as a probe confirmed that this gene is expressed exclusively in Shh-expressing cells at E10.5 (the stage at which limbs were harvested for screening; Fig. 2B). To determine if TM1 is expressed in a manner similar to Shh throughout limb development we performed RNA in situ hybridizations on E9.75-E12 limbs. At all stages examined, TM1 expression in the limb was indistinguishable from Shh (Figure 3). TM1 expression was first observed in E9.75 limbs in the posterior/distal portion of the limb and overlapped with Shh expression. Over the course of the next two days, expression in the limb was only observed in cells that expressed Shh. TM1 expression ceased at E11.75, the same stage at which Shh expression disappears in the mouse limb. No TM1 expression was observed in the limb outside the ZPA at any stage examined. This is, to our knowledge, the first report of any gene besides Shh that is exclusively expressed in the ZPA throughout limb development.
Outside the limb, TM1 was observed in a number of locations in which Shh is expressed including the brain, branchial arches and gut. In all experiments, a clone containing 2 kb of TM1 was used as an in situ probe. This probe has no sequence identity to Shh at the DNA or amino acid level.
Example 7. Characterization of TM1 by Bioinformatics.
TM1 is predicted to contain eight membrane-spanning domains. There are four transmembrane proteins, Patched (Ptc), Smoothened (Smo), Hedgehog-interacting protein (Hip) and Dispatched (Disp1) that are known to function in the Shh-signaling pathway (reviewed in 10). Ptc is thought to bind to Smo in the absence of Shh protein. Upon binding Shh, Ptc releases Smo and Shh-signaling commences in a cell. Both Ptc and Smo are expressed in the ZPA and in cells surrounding the ZPA. Disp1 is important for removing Shh from cells that are actively producing Shh protein; however, Disp1 does not appear to be expressed in the E10.5 limb ZPA. Hip has been proposed to be involved in fine-tuning the Shh protein gradient and is not expressed in the ZPA. Amino acid alignments indicate that TM1 shares no protein similarity to Ptc, Smo, Hip or Disp1, suggesting that TM1 may have a novel role in the Shh-signaling pathway. TM1 also contains a DUF590 domain, which is a conserved protein domain of unknown function.
In mice, TM1 is a member of a highly conserved gene family consisting of eight members. All eight members contain eight transmembrane domains and the DUF590 domain. The furthest removed from TM1 is 36% identical at the amino acid level while the most similar family member is 59% identical (homology is observed throughout the entire coding regions of family members). At least eight members of this protein family are found in zebrafish, humans, rats and chicks. There are four family members present in Drosophila and one in C. elegans.
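Percent identity figures such as those above are computed from pairwise amino acid alignments. The sketch below shows one common convention (identical residues divided by aligned, non-gap columns); the text does not state which convention was used, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == gap or residue_b == gap:
            continue  # skip gap columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned peptide fragments (not real TM1 family sequences).
print(percent_identity("MKTAYIAK", "MKTGYLAK"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported identities are only comparable when the same denominator is used.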
In humans, DOG1 (the human TM1 homolog) has been shown to be expressed ubiquitously in gastrointestinal stromal tumors (GIST), while other types of soft tissue tumor do not express DOG1 (ref. 38). Gastrointestinal stromal tumors occur in the wall of the bowel and can be treated by the drug imatinib mesylate, which may work through the inhibition of the KIT tyrosine kinase receptor (ref. 39). In the majority of GISTs, corresponding mutations in KIT have been identified, supporting this hypothesis. However, a subset of GIST tumors that respond to imatinib mesylate do not express high levels of KIT (refs. 38, 39). Using an array-based approach it was found that DOG1 was present in 136 of 139 (97.8%) GIST tumors (ref. 38). The authors suggest that DOG1 may be a very useful marker for testing if GIST tumors can be treated with imatinib mesylate.
No other functions for TM1 are known in vertebrates and the functions of all invertebrate TM1 members remain unknown. In addition, the expression pattern of TM1 has not previously been described in any organism.
Example 8. TM1 is Not Expressed in Fgf4/Fgf8 Double Mutant Null Limbs.
The fibroblast growth factors (Fgfs) Fgf4 and Fgf8 are expressed in the Apical Ectodermal Ridge (AER) and are required for the proper proximal/distal patterning of the vertebrate limb (refs. 40, 41). Removal of both of these factors results in loss of all limb elements (refs. 42, 43). These Fgf factors are also required for Shh expression in the limb mesenchyme. This Example describes studies performed to determine if TM1 expression is dependent on expression of Fgfs. Whole mount RNA in situ hybridizations were carried out using Fgf4;Fgf8;Msx2-cre double knockout embryos (kindly provided by Xin Sun, University of Wisconsin, Madison). In the hindlimb, the Msx2-cre transgene causes complete inactivation of Fgf4 and Fgf8 before they are expressed (a genetic "null") (ref. 44). In the forelimb, expression of both Fgf4 and Fgf8 commences before Cre is expressed; thus there is a burst of Fgf activity during the early development of the forelimb before these genes are inactivated (ref. 42).
The genes Shh and Fgf4 are part of a feedback loop that is responsible for controlling the final size of the limb (see Developing Limb Bud, supra). Examination of the hindlimbs of Fgf4;Fgf8 double knockout embryos demonstrated that TM1 expression requires Fgf expression in the AER (compare Fig. 4a and 4b). No TM1 expression was observed in the hindlimbs of embryos that lacked both Fgf4 and Fgf8. In forelimbs, a significant decrease in TM1 expression occurred. The presence of a small amount of TM1 in Fgf4/8 double knockout forelimbs may be the result of an early burst of Fgf expression in this tissue.
TM1 expression in Fgf4/8 forelimbs was similar to the expression pattern reported for Shh in these double mutants (ref. 42). These data indicate that TM1 expression, like Shh, requires Fgf expression from the AER. From these experiments it is not clear whether TM1 is absent in Fgf4;Fgf8 double mutants because TM1 expression is directly dependent on the expression of Fgf factors present in the AER, or whether the loss of TM1 is an indirect result of loss of multiple independent signaling pathways.
Example 9. TM1 is Expressed in Shh Null Limbs.
Mice that lack Shh have been created previously (ref. 7). These mice lack most distal limb structures but still form relatively normal proximal limbs. Since TM1 and Shh expression overlap in the limb, it was possible that Shh expression is required for expression of TM1. This Example describes results of a study to test this hypothesis.
Whole mount RNA in situ hybridizations with TM1 were performed on Shh null embryos. In these experiments, TM1 expression was detected in its normal pattern in the posterior of E10 Shh null limbs (compare Fig. 3c and 3d). After E10, TM1 expression was not detected. This experiment demonstrates that Shh is not required for the initial expression of TM1 in the mouse limb, but it may be required for maintaining TM1 expression.
The initial expression of Gremlin in the limb has also been reported to be independent of Shh expression. However, unlike TM1, this gene is not expressed in Shh-positive cells but in cells adjacent to cells expressing Shh. It has been proposed that the initial burst of Gremlin expression in the posterior of Shh-null limbs is due to an unknown posterior signal (ref. 13). It is possible that TM1 is involved in the pathway responsible for propagating this signal in the absence of Shh. Based on these observations, it is believed that genes, RNA or protein as shown herein to be differentially expressed in the developing limb bud will provide useful therapeutic agents for regenerating, reconstructing or repairing the limb of a subject in need thereof, such as a mature adult.
Example 10. Chick TM1 is Expressed in the ZPA.
Chicks are an excellent model system in which to perform embryological experiments. Unlike the mouse embryo, the chick embryo is easily accessible for tissue manipulations and an individual chick embryo can be observed at several time points during the course of an experiment. This Example demonstrates that the chick system can be used to study the role of TM1 in limb development.
Searching the chick EST database, we identified a gene we provisionally named cTM1 that shares 80% amino acid identity with mouse TM1. cTM1, like its mouse counterpart, contains eight transmembrane domains and a DUF590 domain.
Whole mount RNA in situ hybridizations demonstrated that cTM1 is expressed in the ZPA of chick forelimbs until stage 21 of development (approximately equivalent to mouse stage E10.5) but is not expressed in hindlimbs (Fig. 4E). At later stages of development, cTM1 was not expressed in either the forelimbs or hindlimbs. Expression was also observed in the brain, gut and branchial arches, consistent with mouse TM1 expression.
Analysis of related chick TM1 family members did not uncover any additional genes expressed in either the chick forelimbs or hindlimbs. It is not clear why cTM1 is not expressed in both forelimbs and hindlimbs. However, the presence of cTM1 in the ZPA of forelimbs allows for testing the role this factor may play in Shh-signaling and/or limb patterning using the chick developmental system in addition to the mouse system.
Example 11. ShhcreERT2 Mouse Model Conditionally Expressing a Reporter in Shh Cells: Evidence that Nucleus Pulposus is Derived from Notochord.
This Example describes the construction and use of a mouse model that conditionally expresses a reporter protein in Shh-expressing cells only upon induction of expression of the gene construct by tamoxifen. The model is useful in studies of cell fate mapping. In this study, this model was used to demonstrate that the notochord is the progenitor of the entire nucleus pulposus of the intervertebral disks.
A tamoxifen-inducible ShhcreERT2 allele was created. The ShhcreERT2 allele can be used to activate the R26R reporter allele in Shh-expressing cells at discrete developmental stages. Briefly, this allele was created as follows. The cre gene in the ShhcreERT2 allele is a fusion between cre and an estrogen binding domain. This fusion protein can be activated in tissue-specific locations in the mouse embryo upon injection of tamoxifen (as described in ' ). The creERT2 gene was knocked into the Shh locus in ES cells. Correctly targeted ES cells were injected into mouse blastocysts to create chimeric animals carrying the ShhcreERT2 allele. These mice were then bred to obtain germline transmission of the ShhcreERT2 allele.
The ShhcreERT2 mouse model was used to determine the origin of the cells that form the nucleus pulposus of the intervertebral disks. During embryological development, the notochord begins to form at E7.0. We have preliminarily shown that the nucleus pulposus forms at E14.5. The fate-mapping experiment described herein required that injected tamoxifen be cleared from the embryo prior to formation of the nucleus pulposus. Since it has been reported that CRE activity is undetectable 48 hours after tamoxifen injection, we first injected tamoxifen between E6.0 and E7.75. However, we have found that tamoxifen injection into pregnant mice carrying E8.0 or younger embryos results in death of the embryos. To bypass this lethality, we injected tamoxifen into pregnant mice carrying E8.5 embryos. The embryos were then harvested at E14.5 or E16.5 and stained for β-galactosidase activity. In E14.5 embryos, expression was observed in the notochord and forming disks (Figure 10A). By E16.5, staining was observed only throughout the nucleus pulposus (Figure 10B). Referring to Figs. 10A-C, the notochord is the progenitor of the intervertebral disks. Tamoxifen was injected into mice carrying E8.5 embryos. Embryos were harvested and stained for β-galactosidase activity at either E14.5 (Fig. 10A) or E16.5 (Figs. 10B, C). At E14.5, the nucleus pulposus of the intervertebral disks is forming. The insert in Fig. 10A shows three intervertebral disks (ventral view). Note that remnants of the notochord are still present between the forming disks (indicated by arrows). By E16.5, the disks have formed and the entire nucleus pulposus is stained blue (Fig. 10B; NP = nucleus pulposus), demonstrating that this structure is derived from the notochord. There are no blue-stained cells between the disks in the vertebrae (Fig. 10C; V = vertebrae), indicating that the notochord does not contribute to vertebrae. Figures 10B and 10C both illustrate 10 μm transverse sections.
We initially aimed to examine the limb to show that the injected tamoxifen had been cleared from the embryo prior to disk formation at E14.5. However, reporter activity was found in the limb, as expected, since the limb forms less than 48 hours after the E8.5 tamoxifen injection. We therefore examined expression of the reporter in the preputial glands of the external genitalia, which express Shh from E13.5 onward. If tamoxifen were present in the embryo at E13.5 or later, reporter expression would be seen in these glands. The absence of β-galactosidase in these glands in embryos exposed to tamoxifen at E8.5 (Fig. 10D) indicates that the tamoxifen has been cleared from these embryos prior to E13.5 (before the intervertebral disks form). In contrast, the preputial glands are labeled in embryos constitutively expressing CRE in all Shh-producing cells (Fig. 10E; preputial glands labeled with "*"). Thus, β-galactosidase expression was not observed in the preputial glands of tamoxifen-treated embryos, although reporter expression was still observed in the urethra. This result demonstrates that tamoxifen injected at E8.5 was cleared from the embryo prior to Shh expression in the preputial glands at E13.5, and therefore prior to intervertebral disk formation at E14.5. From this demonstration it may be concluded that the entire nucleus pulposus is derived from the notochord. Accordingly, it is specifically envisioned that the cells of the invention, or the genes, RNA or a protein translated from a gene identified as expressed at relatively higher or lower levels according to the methods of the invention, will provide useful therapeutics for treatment of intervertebral disk rupture, degeneration, disease or injury.
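The clearance argument above reduces to simple interval arithmetic, which the following Python sketch illustrates. It is not part of the described method. It assumes that CRE can label a tissue only if that tissue expresses Shh at some point within 48 hours of the tamoxifen injection. The names `ShhDomain` and `is_labeled` are hypothetical, the notochord onset uses the E7.0 formation date as a stand-in, and the limb bud onset of E9.5 is an illustrative value chosen only to satisfy the statement that the limb forms less than 48 hours after the E8.5 injection.

```python
import math
from dataclasses import dataclass

# 48 hours, the reported time after which CRE activity is undetectable.
CRE_WINDOW_DAYS = 2.0


@dataclass
class ShhDomain:
    """A tissue and the embryonic days over which it expresses Shh."""
    name: str
    onset: float            # embryonic day on which Shh expression begins
    end: float = math.inf   # assumed open-ended unless stated otherwise


def is_labeled(domain: ShhDomain, injection_day: float,
               window_days: float = CRE_WINDOW_DAYS) -> bool:
    """Return True if the domain expresses Shh while CRE is still active."""
    window_end = injection_day + window_days
    return domain.onset < window_end and domain.end >= injection_day


domains = [
    ShhDomain("notochord", onset=7.0),          # formation date used as proxy
    ShhDomain("limb bud", onset=9.5),           # illustrative assumption
    ShhDomain("preputial glands", onset=13.5),  # from the text
]

# Tamoxifen injected at E8.5, so the labeling window closes at E10.5.
labels = {d.name: is_labeled(d, injection_day=8.5) for d in domains}
print(labels)
```

Under these assumptions the notochord and limb bud fall inside the labeling window and the preputial glands fall outside it, which matches the staining pattern reported above. Any cell that first expresses Shh at E13.5 or later should therefore be unlabeled, so labeled cells in the E14.5 and E16.5 disks are expected to descend from cells marked earlier.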
Example 12. ShhcreERT2 Mouse Model Conditionally Expressing a Reporter in Shh Cells: Demonstration that Shh is Expressed in a Small Subset of Cells in the Nucleus Pulposus.
This Example describes the discovery that during late development, Shh is only expressed in a small subset of cells in the nucleus pulposus.
Mice pregnant with E19.5 embryos carrying the ShhcreERT2 and R26R reporter alleles were injected with tamoxifen and the pups were harvested one day after birth. In these animals, approximately 8-15 cells in each nucleus pulposus were stained (Figs. 11A-C). More particularly, Figs. 11A-C illustrate results of β-galactosidase staining of 10 μm sagittal sections of the intervertebral disks of P1 mice that were exposed to tamoxifen at E19.5. The arrows point to cells positive for β-galactosidase (i.e., they express CRE from the ShhcreERT2 allele). Each panel (A-C) is from a different intervertebral disk. These results indicate that during late embryogenesis, Shh is expressed in a discrete subset of cells in the nucleus pulposus. It is possible that these Shh-expressing cells are the "stem cell-like" cells previously described in the literature.
In other tissues, Shh signaling has been shown to be required for cell proliferation, while loss of Shh signaling results in cell differentiation. In the nucleus pulposus, only two cell types have been described: notochord-like cells and chondrocyte-like cells, which have been proposed to arise from the notochord-like cells. The Shh-positive population of cells we have identified in late embryonic development is either a population of notochord-like stem cells or a new, previously undetected cell population, because there are too few labeled cells for them to be chondrocyte-like cells.
REFERENCES
1. Martin, P. Tissue patterning in the developing mouse limb. Int J Dev Biol 34, 323-36 (1990).
2. Saunders, J. W. & Gasseling, M. T. Ectodermal-mesenchymal interactions in the origin of limb symmetry. Epithelial-Mesenchymal Interactions. R. Fleischmajer and R. E. Billingham. Baltimore, Williams and Wilkins, 78-97 (1968).
3. Riddle, R. D., Johnson, R. L., Laufer, E. & Tabin, C. Sonic hedgehog mediates the polarizing activity of the ZPA. Cell 75, 1401-16 (1993).
4. Echelard, Y. et al. Sonic hedgehog, a member of a family of putative signaling molecules, is implicated in the regulation of CNS polarity. Cell 75, 1417-30 (1993).
5. Krauss, S., Concordet, J. P. & Ingham, P. W. A functionally conserved homolog of the Drosophila segment polarity gene hh is expressed in tissues with polarizing activity in zebrafish embryos. Cell 75, 1431-44 (1993).
6. Lewis, P. M. et al. Cholesterol modification of sonic hedgehog is required for long-range signaling activity and effective modulation of signaling by Ptc1. Cell 105, 599-612 (2001).
7. Chiang, C. et al. Cyclopia and defective axial patterning in mice lacking Sonic hedgehog gene function. Nature 383, 407-13 (1996).
8. Chiang, C. et al. Manifestation of the limb prepattern: limb development in the absence of sonic hedgehog function. Dev Biol 236, 421-35 (2001).
9. Drossopoulou, G. et al. A model for anteroposterior patterning of the vertebrate limb based on sequential long- and short-range Shh signalling and Bmp signalling. Development 127, 1337-48 (2000).
10. Ingham, P. W. & McMahon, A. P. Hedgehog signaling in animal development: paradigms and principles. Genes Dev 15, 3059-87 (2001).
11. Scherz, P. J., Harfe, B. D., McMahon, A. P. & Tabin, C. J. The limb bud Shh-Fgf feedback loop is terminated by expansion of former ZPA cells. Science 305, 396-9 (2004).
12. Panman, L. & Zeller, R. Patterning the limb before and after SHH signalling. J Anat 202, 3-12 (2003).
13. Zuniga, A., Haramis, A. P., McMahon, A. P. & Zeller, R. Signal relay by BMP antagonism controls the SHH/FGF4 feedback loop in vertebrate limb buds. Nature 401, 598-602 (1999).
14. Capdevila, J., Tsukui, T., Rodriquez Esteban, C., Zappavigna, V. & Izpisua Belmonte, J. C. Control of vertebrate limb outgrowth by the proximal factor Meis2 and distal antagonism of BMPs by Gremlin. Mol Cell 4, 839-49 (1999).
15. Laufer, E., Nelson, C. E., Johnson, R. L., Morgan, B. A. & Tabin, C. Sonic hedgehog and Fgf-4 act through a signaling cascade and feedback loop to integrate growth and patterning of the developing limb bud. Cell 79, 993-1003 (1994).
16. Niswander, L., Jeffrey, S., Martin, G. R. & Tickle, C. A positive feedback loop coordinates growth and patterning in the vertebrate limb. Nature 371, 609-12 (1994).
17. Khokha, M. K., Hsu, D., Brunet, L. J., Dionne, M. S. & Harland, R. M. Gremlin is the BMP antagonist required for maintenance of Shh and Fgf signals during limb patterning. Nat Genet 34, 303-7 (2003).
18. Merino, R. et al. The BMP antagonist Gremlin regulates outgrowth, chondrogenesis and programmed cell death in the developing limb. Development 126, 5515-22 (1999).
19. Dahn, R. D. & Fallon, J. F. Interdigital regulation of digit identity and homeotic transformation by modulated BMP signaling. Science 289, 438-41 (2000).
20. Sanz-Ezquerro, J. J. & Tickle, C. Fgf signaling controls the number of phalanges and tip formation in developing digits. Curr Biol 13, 1830-6 (2003).
21. Harfe, B. D. et al. Evidence for an expansion-based temporal shh gradient in specifying vertebrate digit identities. Cell 118, 517-28 (2004).
22. Yang, Y. et al. Relationship between dose, distance and time in Sonic Hedgehog-mediated regulation of anteroposterior polarity in the chick limb. Development 124, 4393-404 (1997).
23. Charite, J., McFadden, D. G. & Olson, E. N. The bHLH transcription factor dHAND controls Sonic hedgehog expression and establishment of the zone of polarizing activity during limb development. Development 127, 2461-70 (2000).
24. Charite, J., de Graaff, W., Shen, S. & Deschamps, J. Ectopic expression of Hoxb-8 causes duplication of the ZPA in the forelimb and homeotic transformation of axial structures. Cell 78, 589-601 (1994).
25. Stratford, T. H., Kostakopoulou, K. & Maden, M. Hoxb-8 has a role in establishing early anterior-posterior polarity in chick forelimb but not hindlimb. Development 124, 4225-34 (1997).
26. Lu, H. C., Revelli, J. P., Goering, L., Thaller, C. & Eichele, G. Retinoid signaling is required for the establishment of a ZPA and for the expression of Hoxb-8, a mediator of ZPA formation. Development 124, 1643-51 (1997).
27. Zakany, J., Kmita, M. & Duboule, D. A dual role for Hox genes in limb anterior-posterior asymmetry. Science 304, 1669-72 (2004).
28. Qu, S. et al. Polydactyly and ectopic ZPA formation in Alx-4 mutant mice. Development 124, 3999-4008 (1997).
29. Chen, Y. et al. Direct interaction with Hoxd proteins reverses Gli3-repressor function to promote digit formation downstream of Shh. Development 131, 2339-47 (2004).
30. Analyses were performed using BRB ArrayTools developed by Richard Simon and Amy Peng Lam. (2004).
31. Yamada, M., Revelli, J. P., Eichele, G., Barron, M. & Schwartz, R. J. Expression of chick Tbx-2, Tbx-3, and Tbx-5 genes during early heart development: evidence for BMP2 induction of Tbx2. Dev Biol 228, 95-105 (2000).
32. Suzuki, T., Takeuchi, J., Koshiba-Takeuchi, K. & Ogura, T. Tbx Genes Specify Posterior Digit Identity through Shh and BMP Signaling. Dev Cell 6, 43-53 (2004).
33. Li, H., Arber, S., Jessell, T. M. & Edlund, H. Selective agenesis of the dorsal pancreas in mice lacking homeobox gene Hlxb9. Nat Genet 23, 67-70 (1999).
34. Harrison, K. A., Thaler, J., Pfaff, S. L., Gu, H. & Kehrl, J. H. Pancreas dorsal lobe agenesis and abnormal islets of Langerhans in Hlxb9-deficient mice. Nat Genet 23, 71-5 (1999).
35. Aly, A., Shulkes, A. & Baldwin, G. S. Gastrins, cholecystokinins and gastrointestinal cancer. Biochim Biophys Acta 1704, 1-10 (2004).
36. Yoshimoto, T. & Takahashi, Y. Arachidonate 12-lipoxygenases. Prostaglandins Other Lipid Mediat 68-69, 245-62 (2002).
37. Puschel, A. W. Divergent properties of mouse netrins. Mech Dev 83, 65-75 (1999).
38. West, R. B. et al. The novel marker, DOG1, is expressed ubiquitously in gastrointestinal stromal tumors irrespective of KIT or PDGFRA mutation status. Am J Pathol 165, 107-13 (2004).
39. Subramanian, S. et al. Gastrointestinal stromal tumors (GISTs) with KIT and PDGFRA mutations have distinct gene expression profiles. Oncogene 23, 7780-90 (2004).
40. Crossley, P. H., Minowada, G., MacArthur, C. A. & Martin, G. R. Roles for FGF8 in the induction, initiation, and maintenance of chick limb development. Cell 84, 127-36 (1996).
41. Niswander, L., Tickle, C., Vogel, A., Booth, I. & Martin, G. R. FGF-4 replaces the apical ectodermal ridge and directs outgrowth and patterning of the limb. Cell 75, 579-87 (1993).
42. Sun, X., Mariani, F. V. & Martin, G. R. Functions of FGF signalling from the apical ectodermal ridge in limb development. Nature 418, 501-8 (2002).
43. Boulet, A. M., Moon, A. M., Arenkiel, B. R. & Capecchi, M. R. The roles of Fgf4 and Fgf8 in limb bud initiation and outgrowth. Dev Biol 273, 361-72 (2004).
44. Bellahcene A and Castronovo V. (1995). Am. J. Pathol, 146, 95-100.
45. Bradshaw AD, Francki A, Motamed K, Howe C and Sage EH. (1999). Mol Biol. Cell, 10, 1569-1579.
46. Bradshaw AD and Sage EH. (2001). J Clin. Invest, 107, 1049-1054.
47. Brekken RA, Puolakkainen P, Graves DC, Workman G, Lubkin SR and Sage EH. (2003). J Clin. Invest, 111, 487-495.
48. Brekken RA and Sage EH. (2001). Matrix Biol, 19, 816-827.
49. Briggs J, Chamboredon S, Castellazzi M, Kerry JA and Bos TJ. (2002). Oncogene, 21, 7077-7091.
50. Chambers RC, Leoni P, Kaminski N, Laurent GJ and Heller RA. (2003). Am. J. Pathol, 162, 533-546.
51. Dhanesuan N, Sharp JA, Blick T, Price JT and Thompson EW. (2002). Breast Cancer Res. Treat, 75, 73-85.
52. Francki A, Bradshaw AD, Bassuk JA, Howe CC, Couser WG and Sage EH. (1999). J. Biol Chem., 274, 32145-32152.
53. Francki A, Motamed K, McClure TD, Kaya M, Murri C, Blake DJ, Carbon JG and Sage EH. (2003). J. Cell. Biochem., 88, 802-811.
54. Friess H, Yamanaka Y, Buchler M, Ebert M, Beger HG, Gold LI and Korc M. (1993). Gastroenterology, 105, 1846-1856.
55. Fukushima N, Sato N, Ueki T, Rosty C, Walter KM, Wilentz RE, Yeo CJ, Hruban RH and Goggins M. (2002). Am J Pathol, 160, 1573-1581.
56. Funk SE and Sage EH. (1991). Proc. Natl. Acad. Sci. USA, 88, 2648-2652.
57. Funk SE and Sage EH. (1993). J. Cell Physiol, 154, 53-63.
58. Hahn SA, Seymour AB, Hoque AT, Schutte M, da Costa LT, Redston MS, Caldas C, Weinstein CL, Fischer A, Yeo CJ, Hruban RH and Kern SE. (1995). Cancer Res., 55, 4670-4675.
59. Herman JG, Graff JR, Myohanen S, Nelkin BD and Baylin SB. (1996). Proc. Natl. Acad. Sci. USA, 93, 9821-9826.
60. Iacobuzio-Donahue CA, Argani P, Hempen PM, Jones J and Kern SE. (2002a). Cancer Res., 62, 5351-5357.
61. Iacobuzio-Donahue CA, Ryu B, Hruban RH and Kern SE. (2002b). Am. J. Pathol., 160, 91-99.
62. Jacob K, Webber M, Benayahu D and Kleinman HK. (1999). Cancer Res., 59, 4453-4457.
63. Jansen M, Fukushima N, Rosty C, Walter K, Altink R, Heek TV, Hruban R, Offerhaus JG and Goggins M. (2002). Cancer Biol Ther, 1, 293-296.
64. Jendraschak E and Sage EH. (1996). Semin Cancer Biol, 7, 139-146.
65. Jones PA and Baylin SB. (2002). Nat. Rev. Genet, 3, 415-428.
66. Le Bail B, Faouzi S, Boussarie L, Guirouilh J, Blanc JF, Carles J, Bioulac-Sage P, Balabaud C and Rosenbaum J. (1999). J. Pathol, 189, 46-52.
67. Ledda F, Bravo AI, Adris S, Bover L, Mordoh J and Podhajcer OL. (1997). J. Invest. Dermatol., 108, 210-214.
68. Maehara N, Matsumoto K, Kuba K, Mizumoto K, Tanaka M and Nakamura T. (2001). Br. J. Cancer., 84, 864-873.
69. Massi D, Franchi A, Borgognoni L, Reali UM and Santucci M. (1999). Hum. Pathol, 30, 339-344.
70. Mok SC, Chan WY, Wong KK, Muto MG and Berkowitz RS. (1996). Oncogene, 12, 1895-1901.
71. Paley PJ, Goff BA, Gown AM, Greer BE and Sage EH. (2000). Gynecol. Oncol, 78, 336-341.
72. Porte H, Chastre E, Prevot S, Nordlinger B, Empereur S, Basset P, Chambon P and Gespach C. (1995). Int. J. Cancer, 64, 70-75.
73. Porte H, Triboulet JP, Kotelevets L, Carrat F, Prevot S, Nordlinger B, DiGioia Y, Wurtz A, Comoglio P, Gespach C and Chastre E. (1998). Clin. Cancer Res., 4, 1375-1382.
74. Porter PL, Sage EH, Lane TF, Funk SE and Gown AM. (1995). J. Histochem. Cytochem., 43, 791-800.
75. Reed MJ, Vernon RB, Abrass IB and Sage EH. (1994). J. Cell Physiol, 158, 169-179.
76. Rempel SA, Ge S and Gutierrez JA. (1999). Clin. Cancer Res., 5, 237-241.
77. Rosty C, Christa L, Kuzdzal S, Baldwin WM, Zahurak ML, Carnot F, Chan DW, Canto M, Lillemoe KD, Cameron JL, Yeo CJ, Hruban RH and Goggins M. (2002). Cancer Res., 62, 1868-1875.
78. Ryu B, Jones J, Hollingsworth MA, Hruban RH and Kern SE. (2001). Cancer Res., 61, 1833-1838.
79. Sato N, Ueki T, Fukushima N, Iacobuzio-Donahue CA, Yeo CJ, Cameron JL, Hruban RH and Goggins M. (2002). Gastroenterology, 123, 365-372.
80. Schultz C, Lemke N, Ge S, Golembieski WA and Rempel SA. (2002). Cancer Res., 62, 6270-6277.
81. Thomas R, True LD, Bassuk JA, Lange PH and Vessella RL. (2000). Clin. Cancer Res., 6, 1140-1149.
82. Ueki T, Toyota M, Skinner H, Walter KM, Yeo CJ, Issa JP, Hruban RH and Goggins M. (2001). Cancer Res., 61, 8540-8546.
83. Ueki T, Toyota M, Sohn T, Yeo CJ, Issa JP, Hruban RH and Goggins M. (2000). Cancer Res., 60, 1835-1839.
84. Ueki T, Walter KM, Skinner H, Jaffee E, Hruban RH and Goggins M. (2002). Oncogene, 21, 2114-2117.
85. Wewer UM, Albrechtsen R, Fisher LW, Young MF and Termine JD. (1988). Am. J. Pathol, 132, 345-355.
86. Wrana JL, Overall CM and Sodek J. (1991). Eur. J. Biochem., 197, 519-528.
87. Yamanaka M, Kanda K, Li NC, Fukumori T, Oka N, Kanayama HO and Kagawa S. (2001). J. Urol, 166, 2495-2499.
88. Yan Q and Sage EH. (1999). J. Histochem. Cytochem., 47, 1495-1506.
89. Yiu GK, Chan WY, Ng SW, Chan PS, Cheung KK, Berkowitz RS and Mok SC. (2001). Am. J. Pathol., 159, 609-622.
90. Anderson, C. A. and Clark, R. L. (1990). External genitalia of the rat: normal development and the histogenesis of 5 alpha-reductase inhibitor-induced abnormalities. Teratology 42, 483-96.
91. Baskin, L. S., Sutherland, R. S., DiSandro, M. J., Hayward, S. W., Lipschutz, J. and Cunha, G. R. (1997). The effect of testosterone on androgen receptors and human penile growth. J Urol 158, 1113-8.
92. Brinkmann, A. O. (2001). Molecular basis of androgen insensitivity. Mol Cell Endocrinol 179, 105-9.
93. Glucksmann, A., Ooka-Souda, S., Miura-Yasugi, E. and Mizuno, T. (1976). The effect of neonatal treatment of male mice with antiandrogens and of females with androgens on the development of the os penis and os clitoridis. J Anat 121, 363-70.
94. Goodrich, L. V., Johnson, R. L., Milenkovic, L., McMahon, J. A. and Scott, M. P. (1996). Conservation of the hedgehog/patched signaling pathway from flies to mice: induction of a mouse patched gene by Hedgehog. Genes Dev 10, 301-12.
95. Haraguchi, R., Mo, R., Hui, C., Motoyama, J., Makino, S., Shiroishi, T., Gaffield, W. and Yamada, G. (2001). Unique functions of Sonic hedgehog signaling during external genitalia development. Development 128, 4241-50.
96. Haraguchi, R., Suzuki, K., Murakami, R., Sakai, M., Kamikawa, M., Kengaku, M., Sekine, K., Kawano, H., Kato, S., Ueno, N. et al. (2000). Molecular analysis of external genitalia formation: the role of fibroblast growth factor (Fgf) genes during genital tubercle formation. Development 127, 2471-9.
97. Kalloo, N. B., Gearhart, J. P. and Barrack, E. R. (1993). Sexually dimorphic expression of estrogen receptors, but not of androgen receptors in human fetal external genitalia. J Clin Endocrinol Metab 77, 692-8.
98. Kondo, T., Zakany, J., Innis, J. W. and Duboule, D. (1997). Of fingers, toes and penises. Nature 390, 29.
99. Kurzrock, E. A., Baskin, L. S. and Cunha, G. R. (1999). Ontogeny of the male urethra: theory of endodermal differentiation. Differentiation 64, 115-22.
100. Kurzrock, E. A., Jegatheesan, P., Cunha, G. R. and Baskin, L. S. (2000). Urethral development in the fetal rabbit and induction of hypospadias: a model for human development. J Urol 164, 1786-92.
101. Lewis, P. M., Dunn, M. P., McMahon, J. A., Logan, M., Martin, J. F., St-Jacques, B. and McMahon, A. P. (2001). Cholesterol modification of sonic hedgehog is required for long-range signaling activity and effective modulation of signaling by Ptc1. Cell 105, 599-612.
102. Marigo, V., Davey, R. A., Zuo, Y., Cunningham, J. M. and Tabin, C. J. (1996a). Biochemical evidence that patched is the Hedgehog receptor. Nature 384, 176-9.
103. Marigo, V., Scott, M. P., Johnson, R. L., Goodrich, L. V. and Tabin, C. J. (1996b). Conservation in hedgehog signaling: induction of a chicken patched homolog by Sonic hedgehog in the developing limb. Development 122, 1225-33.
104. Morgan, E. A., Nguyen, S. B., Scott, V. and Stadler, H. S. (2003). Loss of Bmp7 and Fgf8 signaling in Hoxa13-mutant mice causes hypospadia. Development 130, 3095-109.
105. Murakami, R. (1987). A histological study of the development of the penis of wild-type and androgen-insensitive mice. J Anat 153, 223-31.
106. Pearse, R. V. and Tabin, C. J. (1998). The molecular ZPA. J Exp Zool 282, 677-90.
107. Pearse, R. V., Vogan, K. J. and Tabin, C. J. (2001). Ptc1 and Ptc2 transcripts provide distinct readouts of hedgehog signaling activity during chick embryogenesis. Developmental Biology 239, 15-29.
108. Perriton, C. L., Powles, N., Chiang, C., Maconochie, M. K. and Cohn, M. J. (2002). Sonic hedgehog signaling from the urethral epithelium controls external genital development. Dev Biol 247, 26-46.
109. Stadler, H. S. (2003). Modelling genitourinary defects in mice: an emerging genetic and developmental system. Nat Rev Genet 4, 478-82.
110. Sun, X., Lewandoski, M., Meyers, E. N., Liu, Y. H., Maxson, R. E., Jr. and Martin, G. R. (2000). Conditional inactivation of Fgf4 reveals complexity of signalling during limb bud development. Nat Genet 25, 83-6.
111. Suzuki, K., Ogino, Y., Murakami, R., Satoh, Y., Bachiller, D. and Yamada, G. (2002). Embryonic development of mouse external genitalia: insights into a unique mode of organogenesis. Evol Dev 4, 133-41.
112. Williams-Ashman, H. G. and Reddi, A. H. (1991). Differentiation of mesenchymal tissues during phallic morphogenesis with emphasis on the os penis: roles of androgens and other regulatory agents. J Steroid Biochem Mol Biol 39, 873-81.
113. Yamaguchi, T. P., Bradley, A., McMahon, A. P. and Jones, S. (1999). A Wnt5a pathway underlies outgrowth of multiple structures in the vertebrate embryo. Development 126, 1211-23.
114. Yang, Y., Drossopoulou, G., Chuang, P. T., Duprez, D., Marti, E., Bumcrot, D., Vargesson, N., Clarke, J., Niswander, L., McMahon, A. et al. (1997). Relationship between dose, distance and time in Sonic Hedgehog-mediated regulation of anteroposterior polarity in the chick limb. Development 124, 4393-404.
115. Zeng, X., Goetz, J. A., Suber, L. M., Scott, W. J., Jr., Schreiner, C. M. and Robbins, D. J. (2001). A freely diffusible form of Sonic hedgehog mediates long-range signalling. Nature 411, 716-20.
116. Frymoyer, J. W. in The lumbar spine (ed. Wiesel, S. W., Weinstein, J.N., Herkowitz, H.N., Dvorak, J., and Bell, G.R.) 8-15 (W.B. Saunders, Philadelphia, PA, 1996).
117. Praemer, A. P., Furner, S., and Rice, D.P. Musculoskeletal conditions in the United States. (American Academy of Orthopaedic Surgeons, 1992).
118. DePalma, A. F. & Rothman, R. H. The intervertebral disc (W. B. Saunders, Philadelphia, PA, 1970).
119. Dwyer, A. P. in The lumbar spine (ed. Wiesel, S. W., Weinstein, J.N., Herkowitz, H.N., Dvorak, J., and Bell, G.R.) (Philadelphia, PA, 1996).
120. Hunter, C. J., Matyas, J. R. & Duncan, N. A. The notochordal cell in the nucleus pulposus: a review in the context of tissue engineering. Tissue Eng 9, 667-77 (2003).
121. Urban, J. P. & McMullin, J. F. Swelling pressure of the lumbar intervertebral discs: influence of age, spinal level, composition, and degeneration. Spine 13, 179-87 (1988).
122. Humzah, M. D. & Soames, R. W. Human intervertebral disc: structure and function. Anat Rec 220, 337-56 (1988).
123. Holm, S. H. in The lumbar spine (ed. Wiesel, S. W., Weinstein, J.N., Herkowitz, H.N., Dvorak, J., and Bell, G.R.) 285-310 (W.B. Saunders, Philadelphia, PA, 1996).
124. Hanley, E. N., Jr., Delamarter, R.B., McCullouch, J.A., and Takahashi, K. in The lumbar spine (ed. Wiesel, S. W., Weinstein, J.N., Herkowitz, H.N., Dvorak, J., and Bell, G.R.) (W.B. Saunders, Philadelphia, PA, 1996).
125. Herkowitz, H. N. in The lumbar spine (ed. Wiesel, S. W., Weinstein, J.N., Herkowitz, H.N., Dvorak, J., and Bell, G.R.) (W.B. Saunders, Philadelphia, PA, 1996).
126. Bao, Q. B., McCullen, G. M., Higham, P. A., Dumbleton, J. H. & Yuan, H. A. The artificial disc: theory, design and materials. Biomaterials 17, 1157-67 (1996).
127. Bao, Q. B. & Yuan, H. A. Prosthetic disc replacement: the future? Clin Orthop, 139-45 (2002).
128. Bao, Q. B. & Yuan, H. A. New technologies in spine: nucleus replacement. Spine 27, 1245-7 (2002).
129. Thomas, J., Lowman, A. & Marcolongo, M. Novel associated hydrogels for nucleus pulposus replacement. J Biomed Mater Res 67A, 1329-37 (2003).
130. Trout, J. J., Buckwalter, J. A. & Moore, K. C. Ultrastructure of the human intervertebral disc: II. Cells of the nucleus pulposus. Anat Rec 204, 307-14 (1982).
131. Trout, J. J., Buckwalter, J. A., Moore, K. C. & Landas, S. K. Ultrastructure of the human intervertebral disc. I. Changes in notochordal cells with age. Tissue Cell 14, 359-69 (1982).
132. Stevens, J. W., Kurriger, G. L., Carter, A. S. & Maynard, J. A. CD44 expression in the developing and growing rat intervertebral disc. Dev Dyn 219, 381-90 (2000).
133. Cleaver, O. & Krieg, P. A. Notochord patterning of the endoderm. Dev Biol 234, 1-12 (2001).
134. Echelard, Y. et al. Sonic hedgehog, a member of a family of putative signaling molecules, is implicated in the regulation of CNS polarity. Cell 75, 1417-30 (1993).
135. Krauss, S., Concordet, J. P. & Ingham, P. W. A functionally conserved homolog of the Drosophila segment polarity gene hh is expressed in tissues with polarizing activity in zebrafish embryos. Cell 75, 1431-44 (1993).
136. Le, Y., Miller, J. L. & Sauer, B. GFPcre fusion vectors with enhanced expression. Anal Biochem 270, 334-6 (1999).
137. Odent, S. et al. Expression of the Sonic hedgehog (SHH) gene during early human development and phenotypic expression of new mutations causing holoprosencephaly. Hum Mol Genet 8, 1683-9 (1999).
138. Srinivas, S. et al. Cre reporter strains produced by targeted insertion of EYFP and ECFP into the ROSA26 locus. BMC Dev Biol 1, 4 (2001).
139. Soriano, P. Generalized lacZ expression with the ROSA26 Cre reporter strain. Nat Genet 21, 70-1 (1999).
140. Maldonado, B. A. & Oegema, T. R., Jr. Initial characterization of the metabolism of intervertebral disc cells encapsulated in microspheres. J Orthop Res 10, 677-90 (1992).
141. Souter, W. A. & Taylor, T. K. Sulphated acid mucopolysaccharide metabolism in the rabbit intervertebral disc. J Bone Joint Surg Br 52, 371-84 (1970).
142. Aguiar, D. J., Johnson, S. L. & Oegema, T. R. Notochordal cells interact with nucleus pulposus cells: regulation of proteoglycan synthesis. Exp Cell Res 246, 129-37 (1999).
143. Dagher, S. F., Wang, J. L. & Patterson, R. J. Identification of galectin-3 as a factor in pre-mRNA splicing. Proc Natl Acad Sci USA 92, 1213-7 (1995).
144. Stosiek, P., Kasper, M. & Karsten, U. Expression of cytokeratin and vimentin in nucleus pulposus cells. Differentiation 39, 78-81 (1988).
145. Domowicz, M. et al. The biochemically and immunologically distinct CSPG of notochord is a product of the aggrecan gene. Dev Biol 171, 655-64 (1995).
146. Takaishi, H., Yamada, H. & Yabe, Y. Preferential expression of alternatively spliced transcript of type II procollagen in the rabbit notochordal remnant and developing fibrocartilages. Biochim Biophys Acta 1350, 253-8 (1997).
147. Sandell, L. J. In situ expression of collagen and proteoglycan genes in notochord and during skeletal development and growth. Microsc Res Tech 28, 470-82 (1994).
148. McMahon, A. P., Ingham, P. W. & Tabin, C. J. Developmental roles and clinical significance of hedgehog signaling. Curr Top Dev Biol 53, 1-114 (2003).
149. Lewis, P. M. et al. Cholesterol modification of sonic hedgehog is required for long-range signaling activity and effective modulation of signaling by Ptc1. Cell 105, 599-612 (2001). (Molecular Probes, A-6455).
150. Hogan, B. L. Bone morphogenetic proteins: multifunctional regulators of vertebrate development. Genes Dev 10, 1580-94 (1996).
151. Petiot A. et al., Development of the mammalian urethra is controlled by Fgfr2-IIIb. Development 132(10):2441-50 (May 2005).
152. Guo Q, Loomis C, Joyner AL, Fate map of mouse ventral limb ectoderm and the apical ectodermal ridge. Dev Biol. Dec 1;264(1):166-78 (2003).
153. West RB et al., The novel marker, DOG1, is expressed ubiquitously in gastrointestinal stromal tumors irrespective of KIT or PDGFRA mutation status. Am J Pathol. 2004 Jul;165(1):107-13.

Claims

We claim:
1. A method of isolating cells in selected tissues co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a non-human transgenic subject in which a marker gene has been inserted into the subject's genome; and b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue.
2. A method of isolating cells in selected tissues that have at one time co-expressed the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a transgenic subject in which a marker gene has been inserted into the subject's genome; and b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue.
3. The method according to claims 1 or 2 wherein the marker gene has been inserted into the Shh locus of the subject's genome.
4. A method of identifying differentially expressed genes in selected tissues co-expressing the sonic hedgehog (Shh) gene and a marker gene comprising: a) obtaining a transgenic subject in which a marker gene has been inserted into the Shh locus of the subject's genome; b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue; c) analyzing the complementary RNAs from Shh/marker gene expressing cells and Shh/marker gene non-expressing cells using a microarray screen of the subject's transcriptome; and d) identifying genes that are expressed at higher or lower levels in Shh/marker gene expressing cells relative to Shh/marker gene non-expressing cells.
5. A method of identifying differentially expressed genes in selected descendants of cells that have at one time expressed the sonic hedgehog (Shh) gene: a) obtaining a transgenic subject in which a marker gene has been inserted into the Shh locus of the subject's genome; b) isolating Shh/marker gene expressing cells and Shh/marker gene non-expressing cells from the selected tissue; c) analyzing the complementary RNAs from Shh/marker gene expressing cells and Shh/marker gene non-expressing cells using a microarray screen of the subject's transcriptome; and d) identifying genes that are expressed at higher or lower levels in Shh/marker gene expressing cells relative to Shh/marker gene non-expressing cells.
6. A method according to any one of claims 1 through 5, wherein the selected tissue is the zone of polarizing activity of a developing limb bud, the genital tubercle, or the nucleus pulposus of an intervertebral disc.
7. A method according to any one of claims 1 through 5, wherein the transgenic subject is a mammal.
8. A method according to claim 7, wherein the mammal is a mouse.
9. A method according to any one of claims 1 through 5, wherein the transgenic subject is an embryo.
10. A method according to any one of claims 1 through 5, wherein the transgenic subject is a fully matured adult.
11. A method according to any one of claims 1 through 5, wherein the selected tissue is embryonic tissue.
12. A method according to any one of claims 1 through 5, wherein the selected tissue is a population of progenitor or stem cells.
13. A method according to claim 11, wherein the selected tissue is derived from the floor plate of the neural tube in the transgenic subject.
14. A method according to claims 11 or 12, wherein the selected tissue is derived from enamel knot tissue of the transgenic subject.
15. A method according to claims 11 or 12, wherein the selected tissue is derived from follicle tissue of hair from the transgenic subject.
16. A method according to claims 11 or 12, wherein the selected tissue is derived from nervous tissue of the transgenic subject.
17. A method according to claims 11 or 12, wherein the selected tissue is derived from retinal tissue of the transgenic subject.
18. A method according to claims 11 or 12, wherein the selected tissue is derived from endoderm of the gastrointestinal tract of the transgenic subject.
19. A method according to claims 11 or 12, wherein the selected tissue is derived from genitourinary epithelial tissue of the transgenic subject.
20. A method according to claims 11 or 12, wherein the selected tissue is derived from tissue expressing the Shh gene or protein, or from cells that have at one time expressed said gene or protein.
21. A protein translated from a gene identified as expressed at relatively higher or lower levels according to the method of claims 4 or 5, used as a therapeutic agent in the treatment of intervertebral disk rupture, degeneration, disease or injury.
22. A protein translated from a gene identified as expressed at relatively higher or lower levels according to the method of claims 4 or 5, used as a therapeutic agent for regenerating, reconstructing or repairing the genitourinary system.
23. A protein translated from a gene identified as expressed at relatively higher or lower levels according to the method of claims 4 or 5, used as a therapeutic agent for regenerating, reconstructing or repairing the limb of a subject.
24. A protein translated from a gene identified as expressed at relatively higher or lower levels according to the method of claims 4 or 5, used as a therapeutic agent for regenerating, reconstructing or repairing any tissue needing such therapy that has at one time expressed Shh.
25. A method according to claims 4 or 5, wherein the microarray screens are the Affymetrix mouse GeneChips®.
26. A nucleic acid sequence identified as expressed at relatively higher or lower levels according to the method of claims 4 or 5, selected from the group consisting of TM-1, TM-2, EST 1437418, Mmu-miR-135a-2, and AP-2 beta.
PCT/US2006/033491 2005-08-31 2006-08-28 Method of isolating stem cells and identifying differentially expressed genes in sonic hedgehog expressing cells and descendent cells WO2007027583A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US71340005P 2005-08-31 2005-08-31
US60/713,400 2005-08-31

Publications (2)

Publication Number Publication Date
WO2007027583A2 (en) 2007-03-08
WO2007027583A3 (en) 2007-06-07

Family

ID=37809405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/033491 WO2007027583A2 (en) 2005-08-31 2006-08-28 Method of isolating stem cells and identifying differentially expressed genes in sonic hedgehog expressing cells and descendent cells

Country Status (1)

Country Link
WO (1) WO2007027583A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11554195B2 (en) 2015-06-22 2023-01-17 Cedars-Sinai Medical Center Method for regenerating the interverterbral disc with notochordal cells

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ORO A.E. ET AL.: 'Basal cell carcinomas in mice overexpressing sonic hedgehog' SCIENCE vol. 276, 02 May 1997, pages 817 - 821, XP003013724 *

Also Published As

Publication number Publication date
WO2007027583A3 (en) 2007-06-07

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase in: Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 06813846; Country of ref document: EP; Kind code of ref document: A2